

Welcome to the Gencove API docs!

The Gencove REST API makes it easy to:

  1. upload low-pass sequencing FASTQ files to the Gencove analysis pipeline
  2. download analysis results
  3. track sample status
  4. automate data delivery.

Read on to get started and try out the examples on the side along the way.

Additional documentation is available here:

  - API reference for publicly available endpoints: API reference
  - Command-line interface (CLI) tool reference: CLI reference

Gencove data

Genomic data is organized into “projects”. Each project contains “samples”. Each sample has an id (generated by Gencove) and a client_id (provided to Gencove by clients).

In most cases, a user account and project will be created for you by our team.

If you would like to explore the Gencove data delivery dashboard on your own, you can set up an account as follows:

  1. Create a free Gencove account using the dashboard
  2. Create a project by going to My Projects -> Add new project

The Gencove CLI

The Gencove command-line interface (CLI) can be used to easily access the API.

It is mostly used for:

  1. Uploading FASTQ files for analysis
  2. Downloading analysis results


$ pip install gencove
$ gencove upload <local-directory-path>

Install the Gencove CLI using the Python package manager pip and upload files to your Gencove account.

Hint: for the newest pre-release versions, check: PyPI




$ pip install gencove

The Gencove CLI can be installed using the Python package manager pip. The source code is available on GitLab.

Python and pip come preinstalled on most Mac and Linux systems. If you do need to install Python, commonly used instructions are available here.

In production environments, we highly recommend using virtualenv and/or virtualenvwrapper for installing the Gencove CLI in a dedicated Python environment.

Mac OS notes

Due to a known issue with Python that ships with Mac OS, the Gencove CLI should be installed in the user’s home directory (not system-wide) as follows: pip install --user gencove. Make sure to have ~/bin present in your $PATH environment variable.

For advanced users, we highly recommend virtualenvwrapper and installing the Gencove CLI within a dedicated virtualenv.

If you absolutely must install the Gencove CLI system-wide using sudo, the following command can be used as a last resort: sudo pip install gencove --ignore-installed six.


$ export GENCOVE_EMAIL='<your-email>'
$ export GENCOVE_PASSWORD='<your-password>'
$ export GENCOVE_API_KEY='<your-api-key>'

Your credentials can be provided to the Gencove CLI via environment variables:

Please note that you cannot use $GENCOVE_EMAIL+$GENCOVE_PASSWORD and $GENCOVE_API_KEY at the same time.

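$ curl -H "Authorization: Api-Key <your-api-key>" <api-endpoint-url>
import requests

# <api-endpoint-url> is a placeholder for the endpoint you want to query
r = requests.get(
  "<api-endpoint-url>",
  headers={"Authorization": "Api-Key <your-api-key>"},
)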

API keys can also be used to authenticate with the API directly by setting the Authorization HTTP header to Api-Key <your-api-key>.

Uploading FASTQ files

To enable FASTQ uploads for your account, log into your account and go to My FASTQs, where instructions will be provided if you do not already have access. You can expect a response from Gencove support within 24 hours.

Once uploads are enabled, users can upload files to the Gencove upload area using the Gencove CLI and assign the files to projects using the Gencove Dashboard. Once files are assigned to a project, they will be processed by the Gencove analysis pipeline. Analysis results will be available via the Gencove API and Dashboard once analysis is complete.

File naming convention

We highly recommend using the standard Illumina naming convention for FASTQ files. If files are named in this manner, Gencove systems will automatically detect:

  1. the sample identifier (and use it as the sample’s client_id)
  2. R1/R2 designations of files

A summary of the naming convention is:

SAMPLE ID + _ + … + _ + (R1 or R2) + _ + … + .fastq.gz

For example, the table below shows file names that follow this convention, along with the detected sample identifiers and read designations:

File name                   Sample ID   Read pair
SAMPLE1_R1.fastq.gz         SAMPLE1     R1
SAMPLE1_R2.fastq.gz         SAMPLE1     R2
SAMPLE3_R1_L001.fastq.gz    SAMPLE3     R1
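
As a rough illustration of the convention, the Python sketch below extracts the sample identifier and read designation from a file name. The regular expression and the helper name parse_fastq_name are assumptions made for this example only, as is the rule that the sample identifier contains no underscores. This is not Gencove's actual detection logic, which may handle more cases.

```python
import re

# Hypothetical pattern for illustration: the sample ID is the text before the
# first underscore, R1/R2 is an underscore-delimited token, and the file name
# ends in .fastq.gz.
FASTQ_NAME = re.compile(
    r"^(?P<sample_id>[^_]+)_(?:.+_)?(?P<read>R[12])(?:_.+)?\.fastq\.gz$"
)


def parse_fastq_name(file_name):
    """Return (sample_id, read) or None if the name does not match."""
    match = FASTQ_NAME.match(file_name)
    if match is None:
        return None
    return match.group("sample_id"), match.group("read")


for name in ["SAMPLE1_R1.fastq.gz", "SAMPLE1_R2.fastq.gz", "SAMPLE3_R1_L001.fastq.gz"]:
    print(name, parse_fastq_name(name))
```

Running the loop over the three file names from the table yields the same sample identifiers and read designations as listed there.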

Custom file names


To bypass the default convention outlined above and explicitly specify sample identifiers and R1/R2 designations for FASTQ files, a file ending with .fastq-map.csv can be provided as the SOURCE to the gencove upload command. The format of the file is outlined in the code snippet on the right.

The following validation is performed on the .fastq-map.csv file:

Grouping files

By default, Gencove systems expect one pair of FASTQ files per sample.


If sequencing reads for a single sample are spread across multiple FASTQ files, they need to be merged into one R1 file and one R2 file. This can be accomplished in several ways:

  1. List multiple files for the same client_id and r_notation in the .fastq-map.csv file (described in the previous section). The Gencove CLI then concatenates the files on the fly during upload.
  2. Concatenate the files manually. Gzip-compressed files can be merged without decompressing, so it is simply a matter of concatenating the compressed files.
  3. Pass the --no-lane-splitting flag to bcl2fastq to avoid splitting reads into multiple FASTQ files upstream, in the demultiplexing phase.
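
Manual concatenation (option 2) could look like the following Python sketch. The function name concatenate_gzip_files and the file names in the usage note are hypothetical. The sketch relies only on the fact that a gzip file may consist of several concatenated members, so a raw byte copy is enough.

```python
import shutil


def concatenate_gzip_files(part_paths, merged_path):
    """Append gzip-compressed parts, in the given order, into one gzip file."""
    with open(merged_path, "wb") as merged:
        for part_path in part_paths:
            with open(part_path, "rb") as part:
                # Raw byte copy: nothing is decompressed or recompressed.
                shutil.copyfileobj(part, merged)
```

For example, concatenate_gzip_files(["SAMPLE1_L001_R1.fastq.gz", "SAMPLE1_L002_R1.fastq.gz"], "SAMPLE1_R1.fastq.gz") would produce a single R1 file. The parts should be listed in the same order for R1 and R2 so that read pairs stay aligned.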

Uploading using the CLI

gencove upload <source-path> [<destination-path>]

Syncs local directories to directories in your Gencove upload area. Recursively copies new and updated files from the source directory to the destination. Only creates folders in the destination if they contain one or more files.

$ gencove upload my-fastq-files/

The example command will recursively copy all files in the my-fastq-files/ directory on your host system to a directory with an automatically generated name in the Gencove upload area.

$ gencove upload input.fastq-map.csv

If there are multiple input FASTQ files per sample, or the file names do not follow the conventions described above, a manifest describing the relationship between the sample identifiers and the input FASTQ files must be provided in a CSV file in the format described above.

$ gencove upload my-fastq-files/ gncv://my-fastq/batch-1/

In case more control is needed over the upload destination, a destination path prefixed with gncv:// may be provided. This pattern is commonly used for separating upload batches when continuously uploading data to your Gencove account and is useful for easily filtering files in the Gencove Dashboard. A common directory structure for batching uploads is:


Details of upload behavior:

Automatically starting analysis

$ gencove upload my-fastq-files/ gncv://my-fastq/batch-1/ --run-project-id b1edbb20-ee77-4be0-9944-e8e3a593cc83

To automatically assign uploads to a project and run analysis, provide the --run-project-id flag and destination project id to the Gencove CLI.

When this feature is used, the Gencove CLI will check to make sure that contents of SOURCE and DESTINATION are identical in order to avoid analysis of unwanted samples. This will always be the case if DESTINATION is omitted, i.e., autogenerated by the Gencove CLI.

It is also important to ensure uploaded files follow naming conventions outlined above to avoid sample identifier detection issues.

Downloading deliverables

Gencove provides a number of deliverables for each sample that is processed as part of a project. In case a sample fails processing due to quality control, only the original input files are provided as deliverables.

Downloading using the CLI

$ gencove download . --project-id my-project-id

gencove download <local-destination-path> --project-id <project-id>

Downloads all deliverables for all samples in the specified project, with the following default naming scheme:


This naming scheme reflects the fact that uniqueness of client-ids is not enforced, while uniqueness of gencove-id is enforced.

Customizing download naming scheme

$ gencove download . --project-id my-project-id --download-template '{client_id}.{file_extension}'

The default naming scheme outlined above can be customized by providing the --download-template flag and a custom file naming template that may contain {client_id}, {gencove_id}, {file_type}, and {file_extension} tokens.

When using this feature, make sure to specify download templates that result in unique filenames across all samples.
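
One way to check a template before downloading is to render it for every expected deliverable and look for repeated names. The sketch below is an illustration under stated assumptions: it uses Python's str.format to mimic the token substitution, the helper find_filename_collisions is hypothetical, and the sample values are made up.

```python
from collections import Counter


def find_filename_collisions(template, files):
    """Return rendered file names that more than one deliverable maps to."""
    rendered = Counter(template.format(**tokens) for tokens in files)
    return sorted(name for name, count in rendered.items() if count > 1)


# Hypothetical deliverables: two samples that share the same client_id.
files = [
    {"client_id": "A", "gencove_id": "id-1", "file_type": "impute-vcf", "file_extension": "vcf.gz"},
    {"client_id": "A", "gencove_id": "id-2", "file_type": "impute-vcf", "file_extension": "vcf.gz"},
]

print(find_filename_collisions("{client_id}.{file_extension}", files))
print(find_filename_collisions("{client_id}_{gencove_id}.{file_extension}", files))
```

The first template maps both deliverables to A.vcf.gz, so it is reported as a collision. Adding {gencove_id} makes the names unique, and the second call returns an empty list.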

Continuing previous downloads

When downloading, an existing file on the local filesystem is not overwritten if it has the same size in bytes as the file that would be downloaded. This behavior can be changed with the --no-skip-existing flag.

Downloading subsets of deliverables

$ gencove download . --sample-ids sample-id-1,sample-id-2,sample-id-3 --file-types impute-vcf,impute-tbi

Behavior of the download can also be tweaked in the following manner:

  1. Download only a specific set of sample ids by providing the --sample-ids flag instead of the --project-id flag
  2. Download only a specific set of file types by providing the --file-types flag. Currently available file types are listed below (not all file types may be available for every project).

Automated data delivery

Data delivery can be automated using webhooks.

Users can specify a webhook URL for a project. Once a webhook is specified, events relating to that project will be submitted to the webhook URL in JSON format via HTTP POST requests. The content of the webhook contains the following keys:

Together, object_id and event should be considered unique and duplicates should be handled by the receiver.

{
  "event": "analysis_complete",
  "object_id": "99573a16-98a8-48fc-8caf-e3b4dcdf34e6",
  "timestamp": "2018-11-18T14:09:59.741183",
  "payload": {
    "project_id": "1d6daca6-475a-4961-9841-57aac36cbd0f",
    "sample_ids": [
      ...
    ]
  }
}

Currently, the following events are available:

Once a webhook is received, the receiver is responsible for querying the Gencove API for more details on each object that is referenced. For example, upon receiving an analysis_complete webhook for a project, the receiver should query the Gencove API for sample details, status, and fresh download URLs for deliverables related to those samples.
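
Because the combination of object_id and event should be treated as unique, a receiver may want to drop repeated deliveries before doing any further work. The sketch below shows one possible approach. The names make_deduplicating_handler and process_event are hypothetical, and the in-memory set is an assumption for illustration; a production receiver would more likely persist the seen keys.

```python
def make_deduplicating_handler(process_event):
    """Wrap process_event so each (object_id, event) pair is handled once."""
    seen = set()  # in-memory only; lost when the process restarts

    def handle(webhook):
        key = (webhook["object_id"], webhook["event"])
        if key in seen:
            return False  # duplicate delivery, ignored
        seen.add(key)
        process_event(webhook)
        return True

    return handle
```

The wrapped handler returns True the first time a given pair is seen and False for any repeat, so the caller can log or count duplicates if desired.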

Webhook signatures

Gencove can optionally sign webhook events it sends to endpoints. This is done by including a signature in each event’s Gencove-Signature header, allowing you to verify that the events were sent by Gencove.

Before verifying signatures, webhooks need to be enabled and the secret needs to be retrieved for each project via the Gencove API (API reference). Note that each project has a separate unique secret.

After this setup, Gencove automatically starts signing each webhook event it sends to the endpoint of the related project.

Verifying webhook signatures

The Gencove-Signature header contains a timestamp and a signature:

Example signature: Gencove-Signature: t=1492774577,v1=5257a869e7ecebeda32affa62cdca3fa51cad7e77a0e56ff536d0ce8e108d8bd

Gencove generates signatures using a hash-based message authentication code (HMAC) with SHA-512. To prevent downgrade attacks, you should ignore all schemes that are not v1.

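export SECRET='super-secret'
export TIMESTAMP='123456'
export PAYLOAD='{"k":"v"}'
python3 -c \
    'import hmac, hashlib, os; print(hmac.new(os.environ["SECRET"].encode("utf-8"), "{}.{}".format(os.environ["TIMESTAMP"], os.environ["PAYLOAD"]).encode("utf-8"), hashlib.sha512).hexdigest())'

import hmac, hashlib

def calculate_signature(secret, timestamp, payload):
    # The signed message is "<timestamp>.<payload>"
    signature_message = "{}.{}".format(timestamp, payload).encode("utf-8")
    # HMAC-SHA512 keyed with the project's webhook secret, as a hex string
    return hmac.new(secret.encode("utf-8"), signature_message, hashlib.sha512).hexdigest()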

Step 1: Extract the timestamp and signatures from the header

Split the header, using the , character as the separator, to get a list of elements. Then split each element, using the = character as the separator, to get a prefix and value pair.

The value for the prefix t corresponds to the timestamp, and v1 corresponds to the signature.

Step 2: Prepare signature_message

This is achieved by concatenating:

  1. The timestamp (as a string)
  2. The character .
  3. The actual JSON payload (i.e., the request’s body)

Step 3: Determine the expected signature

Compute an HMAC with the SHA-512 hash function. Use the endpoint’s signing secret as the key and the signature_message string as the message.

Step 4: Compare signatures

Compare the signature in the header to the expected signature. If the signatures match, compute the difference between the current timestamp and the received timestamp, then decide if the difference is within your tolerance.
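
The four steps above can be combined into a single check, sketched below in Python. The function name verify_gencove_signature, the 300-second default tolerance, and the now parameter (included to make the check testable) are assumptions for illustration and not part of the Gencove API. The constant-time comparison via hmac.compare_digest is a common precaution rather than a documented requirement.

```python
import hashlib
import hmac
import time


def verify_gencove_signature(header, body, secret, tolerance_seconds=300, now=None):
    """Return True if header carries a valid, recent v1 signature for body."""
    # Step 1: split "t=...,v1=..." into prefix/value pairs.
    timestamp = None
    signatures = []
    for element in header.split(","):
        prefix, _, value = element.strip().partition("=")
        if prefix == "t":
            timestamp = value
        elif prefix == "v1":  # every scheme other than v1 is ignored
            signatures.append(value)
    try:
        timestamp_seconds = int(timestamp)
    except (TypeError, ValueError):
        return False

    # Step 2: timestamp, ".", and the request body as received.
    signature_message = "{}.{}".format(timestamp, body).encode("utf-8")

    # Step 3: HMAC-SHA512 keyed with the project's secret.
    expected = hmac.new(
        secret.encode("utf-8"), signature_message, hashlib.sha512
    ).hexdigest()

    # Step 4: constant-time comparison, then the timestamp tolerance check.
    if not any(
        hmac.compare_digest(expected.encode("utf-8"), s.encode("utf-8"))
        for s in signatures
    ):
        return False
    current = time.time() if now is None else now
    return abs(current - timestamp_seconds) <= tolerance_seconds
```

The body passed in should be the raw request body exactly as received, since re-serializing the JSON may change whitespace or key order and invalidate the signature.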

Project deliverables

Merged VCF file

Gencove supports generating a merged VCF file containing variant calls from all successful samples in a project.

Generating a merged VCF file is initiated from the Gencove Dashboard, by opening a project and clicking the “Merge VCFs” button. Once the merge operation is complete, a download button will appear on the project page.

Please keep in mind:

Testing environment

Developers may use the Gencove staging environment for development and testing.

The staging developer website URL is:

The staging API URL is:

Data analysis configurations

Each Gencove project is pinned to a ‘configuration’ that specifies the species, reference datasets (e.g. a reference genome and haplotype reference panel), and specific deliverables. These configurations can be private to a specific set of individuals, or public. The datasets underlying the public configurations are as follows:

API Reference

The full API reference for publicly available endpoints is available here: API reference

CLI Reference

The full CLI reference is available here: CLI reference


We reserve the right to remove your access to our API for any reason at our sole discretion.


User FAQ


Contact us at