Introduction
Welcome to the Gencove API docs!
The Gencove REST API makes it easy to:
- upload low-pass sequencing FASTQ files to the Gencove analysis pipeline
- download analysis results
- track sample status
- automate data delivery.
Read on to get started and try out the examples on the side along the way.
Also, the full API reference for publicly available endpoints is available here: API reference
Gencove data
Genomic data is organized into “projects”. Each project contains “samples”. Each sample has a sample_id (generated by Gencove) and an external_id (provided to Gencove by customers).
In most cases, a user account and project will be created for you by our team.
If you would like to explore the Gencove data delivery dashboard yourself, you can set up an account and a project as follows:
- Create a free Gencove account using the dashboard
- Create a project by clicking “Create a new project” or going to My Project -> Create a new project
API Key
The API key is used for programmatically accessing Gencove.
You can find your API key by logging into the dashboard, clicking on your avatar in the top right corner of the page, and navigating to Settings. Once in Settings, scroll to the bottom of the screen, click “Show/hide API Key”, and a 36-character string will reveal itself.
The Gencove CLI
The Gencove command-line interface (CLI) can be used to easily access the API.
It is mostly used for:
- Uploading FASTQ files for analysis
- Downloading analysis results
Quickstart
$ pip install gencove==1.0.1
$ export GENCOVE_API_KEY='<your-api-key>'
$ gencove upload sync <local-directory-path>
Install the Gencove CLI using the Python package manager pip and upload files to your Gencove account.
Setup
Installation
$ pip install gencove==1.0.1
The Gencove CLI can be installed using the Python package manager pip. The source code is available on GitLab.
Python and pip are preinstalled on most Mac and Linux systems. If you do need to install Python, commonly used instructions are available here.
Configuration
$ export GENCOVE_API_KEY='<your-api-key>'
After you acquire your API key from the dashboard as described above, provide it to the Gencove CLI via the $GENCOVE_API_KEY environment variable or inline via the --api-key option. We recommend using the $GENCOVE_API_KEY environment variable to avoid exposing the API key to others.
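If you script against the API yourself, the same environment variable can be reused. The following is a minimal sketch, assuming the key is stored in GENCOVE_API_KEY as recommended above; load_api_key is a hypothetical helper name and is not part of the Gencove CLI.

```python
import os


def load_api_key(env=os.environ):
    """Return the API key stored in GENCOVE_API_KEY.

    Hypothetical helper: raises instead of silently continuing
    when the variable is missing or empty.
    """
    key = env.get("GENCOVE_API_KEY", "").strip()
    if not key:
        raise RuntimeError("GENCOVE_API_KEY is not set")
    return key
```

Reading the key from the environment keeps it out of source files and shell history.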
Uploading FASTQ files
To enable FASTQ uploads for your account, log in and press the Request access button in the dashboard. You can expect a response from Gencove support within 24 hours.
Once uploads are enabled, users can upload files to the Gencove upload area using the Gencove CLI and assign the files to projects using the Gencove Dashboard. Once files are assigned to a project, they will be processed by the Gencove analysis pipeline. Analysis results will be available via the Gencove API and Dashboard once analysis is complete.
Naming files
We highly recommend using the standard Illumina naming convention for FASTQ files. If files are named in this manner, Gencove systems will automatically detect:
- the sample identifier (and use it as the sample’s external_id)
- R1/R2 designations of files
A summary of the naming convention is:
SAMPLE ID + _ + … + _ + (R1 or R2) + _ + … + .fastq.gz
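To check file names before uploading, the convention can be approximated locally. This is a rough sketch and not Gencove's actual detection logic: parse_fastq_name is a hypothetical helper, and it assumes the sample ID is everything before the first underscore and that R1 or R2 appears as its own underscore-separated token.

```python
def parse_fastq_name(filename):
    """Split a FASTQ file name into (sample_id, read), or return None.

    Assumption: the sample ID contains no underscores.
    """
    suffix = ".fastq.gz"
    if not filename.endswith(suffix):
        return None
    tokens = filename[: -len(suffix)].split("_")
    sample_id, rest = tokens[0], tokens[1:]
    # The read designation is the first R1/R2 token after the sample ID.
    read = next((t for t in rest if t in ("R1", "R2")), None)
    if not sample_id or read is None:
        return None
    return sample_id, read
```

Files for which the function returns None are worth renaming or reviewing before upload.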
Commands
$ gencove upload --help
$ gencove upload <command> --help
The Gencove CLI has 4 subcommands relevant to uploads:
- sync - upload files in bulk
- cp - upload files one by one
- ls - list uploaded files
- rm - delete uploaded files
Help and instructions for each command can be viewed using the --help flag.
sync <source-path> [<destination-path>]
Syncs local directories to directories in your Gencove upload area. Recursively copies new and updated files from the source directory to the destination. Only creates folders in the destination if they contain one or more files.
$ gencove upload sync my-fastq-files/
The example command will recursively copy all files in the my-fastq-files/ directory on your host system to a directory with an automatically generated name in the Gencove upload area.
$ gencove upload sync my-fastq-files/ gncv://my-fastq/batch-1/
If more control is needed over the upload destination, a destination path prefixed with gncv:// may be provided. This pattern is commonly used for separating upload batches when continuously uploading data to your Gencove account and is useful for easily filtering files in the Gencove Dashboard. A common directory structure for batching uploads is:
gncv://<project-name>/<batch-name>/
Details of sync behavior:
- If a file in the local directory already exists in the destination, it will be overwritten
- If a file exists in the destination but not in the local directory, it will not be deleted in the destination unless the --delete flag is provided
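When batches are uploaded from a script, the destination path and the --delete flag can be assembled programmatically. The sketch below only builds the argument list; build_sync_command and its parameters are hypothetical names, and placing --delete directly after sync is an assumption about option ordering.

```python
def build_sync_command(source_dir, project_name=None, batch_name=None, delete=False):
    """Return the argv list for a `gencove upload sync` invocation."""
    argv = ["gencove", "upload", "sync"]
    if delete:
        # Remove destination files that are absent from the source directory.
        argv.append("--delete")
    argv.append(source_dir)
    if project_name and batch_name:
        # Follows the gncv://<project-name>/<batch-name>/ layout described above.
        argv.append(f"gncv://{project_name}/{batch_name}/")
    return argv
```

The resulting list can be passed to subprocess.run(argv, check=True) on a machine where the Gencove CLI is installed.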
cp <source-path> [<destination-path>]
$ gencove upload cp my-fastq-files/sampleid-1_R1.fastq.gz gncv://my-fastq/batch-1/
Copies a local file to the Gencove upload area.
ls [<destination-path>]
$ gencove upload ls --recursive
Lists files and directories in your Gencove upload area.
rm [<destination-path>]
$ gencove upload rm gncv://my-fastq/batch-1/sampleid-1_R1.fastq.gz
Removes a file from the Gencove upload area.
Downloading analysis results
# Standard invocation:
$ gencove project raw_data <project-id> <external-ids-file> <csv-file>
# Using stdin and stdout:
$ cat <sample-ids-file> | gencove project raw_data --output-format json --id-type sample <project-id> - - > <json-file>
# Example to get all available VCF files, together with sample ids and external ids
$ echo | gencove project raw_data --output-format csv <project-id> - - | cut -d , -f 1,2,5 > output.csv
The Gencove CLI enables exporting user data in bulk to csv or json format.
Input and output can each be a file path or “-” for stdin and stdout, respectively.
The input stream is expected to be one id per line.
Users’ genomic data can be accessed by providing a list of sample_ids or external_ids. sample_ids are Gencove ids for samples, while external_ids are ids provided to us by customers.
The tool will return a list of objects containing the following data for each sample_id or external_id:
- sample_id - Gencove sample id (unique)
- external_id - the external id provided in JWT when requesting access to user data
- project_id - the project id
- vcf_url_s3 - a presigned URL for downloading the imputed .vcf file
- snp_url_s3 - a presigned URL for downloading the original AncestryDNA, 23andMe, etc. data file
- bam_url_s3 - a presigned URL for downloading the raw .bam file (if the user got their genomic data through Gencove)
- bai_url_s3 - a presigned URL for downloading the .bam file index
- fastq_nongrch37_url_s3 - a presigned URL for downloading the .fastq file with non-human reads
- ancestry_url_s3 - a presigned URL for downloading a .json file with the output of Gencove’s ancestry analysis
- local_ancestry_url_s3 - a presigned URL for downloading the output of Gencove’s local ancestry analysis
- ancestry_url_suffix - a suffix for displaying ancestry using the Gencove map. To test, append it to https://ancestry-staging.gencove.com/
- microbiome_url_s3 - a presigned URL for downloading a .json file with the output of Gencove’s microbiome analysis
- traits_url_s3 - a presigned URL for downloading a .json file with the output of Gencove’s polygenic traits analyses
The presigned download URLs expire after 2 days, but there is no limit on the number of generated presigned URLs.
If using the default “external” id type, multiple entries for the same external id may be returned, since it is not guaranteed to be unique. Sample (i.e., Gencove) ids are guaranteed to be unique.
If data is not available for a given id or the id is not found, it is skipped.
The order of ids in the input is not guaranteed to be preserved in the output.
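Because external ids may repeat and output order is not preserved, it can help to index the exported records before using them. The sketch below assumes the json output is a top-level list of objects with the field names listed above, and that a missing file shows up as an empty or null URL; vcf_urls_by_external_id and the sample records are illustrative only.

```python
import json
from collections import defaultdict


def vcf_urls_by_external_id(records):
    """Group (sample_id, vcf_url_s3) pairs under each external_id."""
    grouped = defaultdict(list)
    for record in records:
        url = record.get("vcf_url_s3")
        if url:  # skip records without a VCF download URL
            grouped[record["external_id"]].append((record["sample_id"], url))
    return dict(grouped)


# Made-up records standing in for the contents of <json-file>.
example_records = json.loads("""
[
  {"sample_id": 2, "external_id": "sample_100003", "vcf_url_s3": "https://example.invalid/a.vcf"},
  {"sample_id": 7, "external_id": "sample_100003", "vcf_url_s3": "https://example.invalid/b.vcf"},
  {"sample_id": 9, "external_id": "sample_100004", "vcf_url_s3": null}
]
""")
index = vcf_urls_by_external_id(example_records)
```

Here both records for sample_100003 are kept, while sample_100004 is dropped because it has no VCF URL.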
# Get help
$ gencove project raw_data --help
For detailed usage instructions, use the --help flag.
Automated data delivery
{
  "event": "project_sample_join",
  "data": {
    "project_id": 1,
    "sample_id": 2,
    "external_id": "sample_100003",
    "subsidy_in_cents": 0,
    "participating": true,
    "jwt_log": "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJhcHBfaWQiOjEsImV4dGVybmFsX2lkIjoibWVtYmVyXzEwMDAwMSIsInN1YnNpZHlfaW5fY2VudHMiOjB9.TR4FVXDxdD0EL-LaHOaoOKQhv1N8UN0eYdvlt3nQ5ao"
  }
}
Data delivery can be automated using webhooks.
A webhook URL may be defined for a project; events will be submitted to it via HTTP POST requests. The content of the webhook contains:
- event: the type of event
- data: data describing the event in more detail
Currently, there are 2 types of events:
- project_sample_results_ready: generated when data becomes available when it was not available before.
- test: generated when validating the webhook URL. If a webhook_url is defined when creating or updating a project, the Gencove backend will attempt to validate the URL by generating an HTTP POST request with this event type.
An example of an event can be found in the sidebar.
Webhooks are only active when the project is in the enabled state.
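On the receiving side, a webhook consumer mainly needs to decode the POST body and branch on the event type. The following is a minimal sketch of that dispatch step, assuming the body is JSON with event and data keys as described above; handle_webhook and the action names it returns are hypothetical, and the HTTP server around it is left out.

```python
import json


def handle_webhook(body):
    """Decode a webhook POST body and decide what to do with it."""
    payload = json.loads(body)
    event = payload.get("event")
    data = payload.get("data") or {}
    if event == "test":
        # Sent when Gencove validates the webhook URL; nothing to fetch.
        return ("acknowledge", None)
    if event == "project_sample_results_ready":
        # New results are available; hand the event data to the caller.
        return ("fetch_results", data)
    # Unknown event types are ignored rather than treated as errors.
    return ("ignore", None)
```

Ignoring unrecognized events keeps the consumer working if further event types are introduced later.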
Using the API
Authentication
# API key
$ curl -H 'Authorization: GENCOVE-API-KEY <your_api_key>' \
    https://rest.gencove.com/welcome-api-key
# API key
import requests
response = requests.get(
    'https://rest.gencove.com/welcome-api-key',
    headers={
        'Authorization': 'GENCOVE-API-KEY <your_api_key>'
    }
)
The API is accessed by placing your API key in the Authorization header of HTTP requests: Authorization: GENCOVE-API-KEY <api_key>
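Since every call carries the same header, it can be convenient to build requests in one place. This sketch uses only the Python standard library and constructs the request without sending it; build_request and API_BASE_URL are hypothetical names, while the header format and base URL come from the text above.

```python
import urllib.request

API_BASE_URL = "https://rest.gencove.com"


def build_request(path, api_key, base_url=API_BASE_URL):
    """Return an unsent GET request carrying the Gencove Authorization header."""
    return urllib.request.Request(
        base_url + path,
        headers={"Authorization": f"GENCOVE-API-KEY {api_key}"},
    )


# Placeholder key for illustration only; nothing is sent over the network here.
request = build_request("/projects", "<your_api_key>")
```

Passing the request to urllib.request.urlopen would perform the call, and base_url can be pointed at the staging API URL during development.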
Listing your projects
$ curl -H "Authorization: GENCOVE-API-KEY <your_api_key>" \
    https://rest.gencove.com/projects
import requests
response = requests.get(
    'https://rest.gencove.com/projects',
    headers={
        'Authorization': 'GENCOVE-API-KEY <your_api_key>'
    }
)
Your projects can be listed with a GET HTTP request to /projects.
Info about a single project
$ curl -H "Authorization: GENCOVE-API-KEY <your_api_key>" \
    https://rest.gencove.com/projects/<project_id>
$ curl -H "Authorization: GENCOVE-API-KEY <your_api_key>" \
    https://rest.gencove.com/projects/<project_id>/stats
import requests
response = requests.get(
    'https://rest.gencove.com/projects/<project_id>',
    headers={
        'Authorization': 'GENCOVE-API-KEY <your_api_key>'
    }
)
response = requests.get(
    'https://rest.gencove.com/projects/<project_id>/stats',
    headers={
        'Authorization': 'GENCOVE-API-KEY <your_api_key>'
    }
)
Data and stats for a specific project can be listed with GET HTTP requests to /projects/<project_id> and /projects/<project_id>/stats, respectively.
Listing project samples
$ curl -H "Authorization: GENCOVE-API-KEY <your_api_key>" \
    https://rest.gencove.com/projects/<project_id>/samples
import requests
response = requests.get(
    'https://rest.gencove.com/projects/<project_id>/samples',
    headers={
        'Authorization': 'GENCOVE-API-KEY <your_api_key>'
    }
)
Project samples can be listed with a GET HTTP request to /projects/<project_id>/samples.
Info about a specific sample
$ curl -H "Authorization: GENCOVE-API-KEY <your_api_key>" \
    https://rest.gencove.com/projects/<project_id>/samples/<sample_id>
import requests
response = requests.get(
    'https://rest.gencove.com/projects/<project_id>/samples/<sample_id>',
    headers={
        'Authorization': 'GENCOVE-API-KEY <your_api_key>'
    }
)
Info about a single sample can be retrieved with a GET HTTP request to /projects/<project_id>/samples/<sample_id>.
Accessing genomic data
$ curl -H "Authorization: GENCOVE-API-KEY <your_api_key>" \
    -H "Content-Type: application/json" \
    -X POST \
    -d '{"sample_ids": [<sample_id_1>, <sample_id_2>, ..., <sample_id_N>]}' \
    https://rest.gencove.com/projects/<project_id>/raw-data
$ curl -H "Authorization: GENCOVE-API-KEY <your_api_key>" \
    -H "Content-Type: application/json" \
    -X POST \
    -d '{"external_ids": [<external_id_1>, <external_id_2>, ..., <external_id_N>]}' \
    https://rest.gencove.com/projects/<project_id>/raw-data
import requests
response = requests.post(
    'https://rest.gencove.com/projects/<project_id>/raw-data',
    headers={
        'Authorization': 'GENCOVE-API-KEY <your_api_key>'
    },
    json={
        'sample_ids': [<sample_id_1>, <sample_id_2>, ..., <sample_id_N>]
    }
)
response = requests.post(
    'https://rest.gencove.com/projects/<project_id>/raw-data',
    headers={
        'Authorization': 'GENCOVE-API-KEY <your_api_key>'
    },
    json={
        'external_ids': [<external_id_1>, <external_id_2>, ..., <external_id_N>]
    }
)
Genomic data can be accessed by POSTing a list of sample_ids or external_ids to /projects/<project_id>/raw-data.
The endpoint provides the same data as outlined above for the Gencove CLI.
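The request body differs only in which key holds the list of ids. A small sketch of building that body follows; raw_data_body is a hypothetical helper, and its id_type values mirror the "sample" and "external" id types used by the CLI rather than anything defined by the endpoint itself.

```python
import json


def raw_data_body(ids, id_type="external"):
    """Return the JSON request body for the raw-data endpoint."""
    if id_type not in ("sample", "external"):
        raise ValueError("id_type must be 'sample' or 'external'")
    key = "sample_ids" if id_type == "sample" else "external_ids"
    return json.dumps({key: list(ids)})
```

The returned string can be sent as the POST body together with the Content-Type: application/json header.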
Testing environment
Developers may use the Gencove staging environment for development and testing.
The staging developer website URL is: https://app-staging.gencove.com
The staging API URL is: https://rest-staging.gencove.com
API Reference
The full API reference for publicly available endpoints is available here: API reference
Terms
We reserve the right to remove your access to our API for any reason at our sole discretion.
FAQ
Support
Contact us at support@gencove.com