
Introduction

Welcome to the Gencove API docs!

The Gencove REST API makes it easy to:

  1. upload low-pass sequencing FASTQ files to the Gencove analysis pipeline
  2. download analysis results
  3. track sample status
  4. automate data delivery.

Read on to get started and try out the examples on the side along the way.

Also, the full API reference for publicly available endpoints is available here: API reference

Gencove data

Genomic data is organized into “projects”. Each project contains “samples”. Each sample has a sample_id (generated by Gencove) and an external_id (provided to Gencove by customers).

In most cases, a user account and project will be created for you by our team.

If you would like to explore the Gencove data delivery dashboard on your own, you can set up an account as follows:

  1. Create a free Gencove account using the dashboard
  2. Create a project by clicking “Create a new project” or going to My Project -> Create a new project

API Key

The API key is used for programmatically accessing Gencove.

You can find your API key by logging into the dashboard, clicking on your avatar in the top right corner of the page, and navigating to Settings. Once in Settings, scroll to the bottom of the screen, click “Show/hide API Key”, and a 36-character string will reveal itself.

The Gencove CLI

The Gencove command-line interface (CLI) can be used to easily access the API.

It is mostly used for:

  1. Uploading FASTQ files for analysis
  2. Downloading analysis results

Quickstart

$ pip install gencove==1.0.1
$ export GENCOVE_API_KEY='<your-api-key>'
$ gencove upload sync <local-directory-path>

Install the Gencove CLI using the Python package manager pip and upload files to your Gencove account.

Setup

Installation

$ pip install gencove==1.0.1

The Gencove CLI can be installed using the Python package manager pip. The source code is available on GitLab.

Python and pip are preinstalled on most Mac and Linux systems. If you do need to install Python, commonly used instructions are available here.

Configuration

$ export GENCOVE_API_KEY='<your-api-key>'

After acquiring your API key from the dashboard as described above, you can provide it to the Gencove CLI via the environment variable $GENCOVE_API_KEY or inline via the --api-key option. We recommend using the $GENCOVE_API_KEY environment variable to avoid exposing the API key to others.

Uploading FASTQ files

To enable FASTQ uploads for your account, log in and press the Request access button in the dashboard. You can expect a response from Gencove support within 24 hours.

Once uploads are enabled, users can upload files to the Gencove upload area using the Gencove CLI and assign the files to projects using the Gencove Dashboard. Once files are assigned to a project, they will be processed by the Gencove analysis pipeline. Analysis results will be available via the Gencove API and Dashboard once analysis is complete.

Naming files

We highly recommend using the standard Illumina naming convention for FASTQ files. If files are named in this manner, Gencove systems will automatically detect:

  1. the sample identifier (and use it as the sample’s external_id)
  2. R1/R2 designations of files

A summary of the naming convention is:

SAMPLE ID + _ + … + _ + (R1 or R2) + _ + … + .fastq.gz
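As a rough illustration of the pattern above, the following Python sketch splits a file name into a sample identifier and a read designation. The regular expression and the function name parse_fastq_name are assumptions made for this example; the detection logic used by Gencove systems is not documented here and may differ.

```python
import re
from typing import Optional, Tuple

# Assumed pattern: <sample id>_<...>_(R1|R2)_<...>.fastq.gz
# The sample id is taken to be everything before the first underscore.
FASTQ_NAME = re.compile(
    r"^(?P<sample>[^_]+)_(?:.*_)?(?P<read>R[12])(?:_.*)?\.fastq\.gz$"
)


def parse_fastq_name(filename: str) -> Optional[Tuple[str, str]]:
    """Return (sample_id, read) or None if the name does not match."""
    match = FASTQ_NAME.match(filename)
    if match is None:
        return None
    return match.group("sample"), match.group("read")


print(parse_fastq_name("sampleid-1_R1.fastq.gz"))
```

Checking file names locally in this way before uploading can catch names from which no sample identifier would be detected.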

Commands

$ gencove upload --help
$ gencove upload <command> --help

The Gencove CLI has 4 subcommands relevant to uploads:

  1. sync - upload files in bulk
  2. cp - upload files one by one
  3. ls - list uploaded files
  4. rm - delete uploaded files

Help and instructions for each command can be viewed using the --help flag.

sync <source-path> [<destination-path>]

Syncs local directories to directories in your Gencove upload area. Recursively copies new and updated files from the source directory to the destination. Only creates folders in the destination if they contain one or more files.

$ gencove upload sync my-fastq-files/

The example command will recursively copy all files in the my-fastq-files/ directory on your host system to a directory with an automatically generated name in the Gencove upload area.

$ gencove upload sync my-fastq-files/ gncv://my-fastq/batch-1/

In case more control is needed over the upload destination, a destination path prefixed with gncv:// may be provided. This pattern is commonly used for separating upload batches when continuously uploading data to your Gencove account and is useful for easily filtering files in the Gencove Dashboard. A common directory structure for batching uploads is:

gncv://<project-name>/<batch-name>/
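When uploads are scripted, the batch destination can be assembled programmatically. The sketch below only builds the argument list for the sync command shown above; the helper name build_sync_command and the example project and batch names are hypothetical.

```python
import shlex
from typing import List


def build_sync_command(source_dir: str, project_name: str, batch_name: str) -> List[str]:
    """Build the argument list for `gencove upload sync` with a batch destination."""
    destination = f"gncv://{project_name}/{batch_name}/"
    return ["gencove", "upload", "sync", source_dir, destination]


command = build_sync_command("my-fastq-files/", "my-fastq", "batch-1")
print(shlex.join(command))
# To run the upload, pass the list to subprocess.run(command, check=True).
```

Passing the command as a list rather than a single string avoids shell quoting problems if a directory name contains spaces.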

Details of sync behavior:

cp <source-path> [<destination-path>]

$ gencove upload cp my-fastq-files/sampleid-1_R1.fastq.gz gncv://my-fastq/batch-1/

Copies a local file to the Gencove upload area.

ls [<destination-path>]

$ gencove upload ls --recursive

Lists files and directories in your Gencove upload area.

rm [<destination-path>]

$ gencove upload rm gncv://my-fastq/batch-1/sampleid-1_R1.fastq.gz

Removes a file from the Gencove upload area.

Downloading analysis results

# Standard invocation:
$ gencove project raw_data <project-id> <external-ids-file> <csv-file>

# Using stdin and stdout:
$ cat <sample-ids-file> | gencove project raw_data --output-format json --id-type sample <project-id> - - > <json-file>

# Example to get all available VCF files, together with sample ids and external ids
$ echo | gencove project raw_data --output-format csv <project-id>  - - | cut -d , -f 1,2,5 > output.csv

The Gencove CLI enables exporting user data in bulk to csv or json format.

Input and output can be a file path or “-” for stdin and stdout.

The input stream is expected to be one id per line.

Users’ genomic data can be accessed by providing a list of sample_ids or external_ids. sample_ids are Gencove ids for samples, while external_ids are ids provided to us by customers.

The tool will return a list of objects containing the following data for each sample_id or external_id:

  1. sample_id - Gencove sample id (unique)
  2. external_id - the external id provided in JWT when requesting access to user data
  3. project_id - the project id
  4. vcf_url_s3 - a presigned URL for downloading the imputed .vcf file
  5. snp_url_s3 - a presigned URL for downloading the original AncestryDNA, 23andMe, etc. data file
  6. bam_url_s3 - a presigned URL for downloading the raw .bam file (if the user got their genomic data through Gencove)
  7. bai_url_s3 - a presigned URL for downloading the .bam file index
  8. fastq_nongrch37_url_s3 - a presigned URL for downloading the .fastq file with non-human reads
  9. ancestry_url_s3 - a presigned URL for downloading a .json file with the output of Gencove’s ancestry analysis
  10. local_ancestry_url_s3 - a presigned URL for downloading the output of Gencove’s local ancestry analysis
  11. ancestry_url_suffix - a suffix for displaying ancestry using the Gencove map. To test, append it to https://ancestry-staging.gencove.com/
  12. microbiome_url_s3 - a presigned URL for downloading a .json file with the output of Gencove’s microbiome analysis
  13. traits_url_s3 - a presigned URL for downloading a .json file with the output of Gencove’s polygenic traits analyses

The presigned download URLs expire after 2 days, but there is no limit on the number of generated presigned URLs.

If using the default “external” id type, multiple entries for the same external id may be returned, since it is not guaranteed to be unique. Sample (i.e., Gencove) ids are guaranteed to be unique.

In case data is not available for a given id or the id is not found, it is skipped.

Order of ids from input is not guaranteed to be preserved in output.
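Because external ids may repeat and output order is not guaranteed, it can help to index the results after exporting them. The following Python sketch groups records by external_id. It assumes the json output is a list of objects keyed by the field names listed above; the example records and URLs are made up.

```python
import json
from collections import defaultdict
from typing import Dict, List


def group_by_external_id(raw_json: str) -> Dict[str, List[dict]]:
    """Group raw_data records by external_id, which may map to several samples."""
    grouped: Dict[str, List[dict]] = defaultdict(list)
    for record in json.loads(raw_json):
        grouped[record["external_id"]].append(record)
    return dict(grouped)


# Hypothetical output with two samples sharing one external id.
example = json.dumps([
    {"sample_id": 1, "external_id": "sample_100003", "vcf_url_s3": "https://example.com/1.vcf"},
    {"sample_id": 2, "external_id": "sample_100003", "vcf_url_s3": "https://example.com/2.vcf"},
])
for external_id, records in group_by_external_id(example).items():
    print(external_id, [r["sample_id"] for r in records])
```

Ids missing from the grouped result correspond to ids that were skipped, either because no data was available or because the id was not found.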

# Get help
$ gencove project raw_data --help

For detailed usage instructions, use the --help flag.

Automated data delivery

{
    "event": "project_sample_join",
    "data": {
          "project_id": 1,
          "sample_id": 2,
          "external_id": "sample_100003",
          "subsidy_in_cents": 0,
          "participating": true,
          "jwt_log": "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJhcHBfaWQiOjEsImV4dGVybmFsX2lkIjoibWVtYmVyXzEwMDAwMSIsInN1YnNpZHlfaW5fY2VudHMiOjB9.TR4FVXDxdD0EL-LaHOaoOKQhv1N8UN0eYdvlt3nQ5ao"
    }
}

Data delivery can be automated using webhooks.

A webhook URL may be defined for a project, where events will be submitted via HTTP POST requests. The content of the webhook contains:

  1. Type of event, event
  2. Data describing the event in more detail, data

Currently, there are 2 types of events:

  1. project_sample_results_ready: generated when data that was previously unavailable becomes available.
  2. test: generated when validating the webhook URL. If a webhook_url is defined when creating or updating a project, the Gencove backend will attempt to validate the URL by generating an HTTP POST request with this event type.

An example of an event can be found in the sidebar.

Webhooks are only active when the project is in the enabled state.
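A webhook receiver can be sketched with the Python standard library as follows. This is a minimal illustration, not a reference implementation: the names handle_event and WebhookHandler are hypothetical, and replying with status 200 is an assumption, since the response expected by the Gencove backend is not described here.

```python
import json
from http.server import BaseHTTPRequestHandler
from typing import Optional


def handle_event(body: bytes) -> Optional[dict]:
    """Return the event's data if results are ready, otherwise None."""
    payload = json.loads(body)
    event, data = payload.get("event"), payload.get("data", {})
    if event == "project_sample_results_ready":
        return data  # e.g. queue a download for the sample described in data
    return None  # "test" and unrecognized events need no further action


class WebhookHandler(BaseHTTPRequestHandler):
    def do_POST(self) -> None:
        length = int(self.headers.get("Content-Length", 0))
        handle_event(self.rfile.read(length))
        # Assumption: a plain 200 response acknowledges the event.
        self.send_response(200)
        self.end_headers()
```

To try it locally, the handler could be served with http.server.HTTPServer(("", 8000), WebhookHandler).serve_forever(); the port is arbitrary.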

Using the API

Authentication

# API key
$ curl -H 'Authorization: GENCOVE-API-KEY <your_api_key>'\
    https://rest.gencove.com/welcome-api-key
# API key
import requests

response = requests.get(
    'https://rest.gencove.com/welcome-api-key',
    headers={
        'Authorization': 'GENCOVE-API-KEY <your_api_key>'
    }
)

The API is accessed by placing your API key in the Authorization header of HTTP requests: Authorization: GENCOVE-API-KEY <api_key>
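To keep the key out of source code, the header can be built from an environment variable. The helper below is a sketch; reusing the CLI's $GENCOVE_API_KEY variable for direct API calls is a convention chosen for this example, and the function name auth_headers is hypothetical.

```python
import os
from typing import Dict, Optional


def auth_headers(api_key: Optional[str] = None) -> Dict[str, str]:
    """Build the Authorization header, falling back to $GENCOVE_API_KEY."""
    key = api_key or os.environ.get("GENCOVE_API_KEY")
    if not key:
        raise ValueError("No API key given and GENCOVE_API_KEY is not set")
    return {"Authorization": f"GENCOVE-API-KEY {key}"}
```

The returned dictionary can be passed as the headers argument of requests.get or requests.post in the examples.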

Listing your projects

$ curl -H "Authorization: GENCOVE-API-KEY <your_api_key>"\
    https://rest.gencove.com/projects
import requests

response = requests.get(
    'https://rest.gencove.com/projects',
    headers={
        'Authorization': 'GENCOVE-API-KEY <your_api_key>'
    }
)

Your projects can be listed with a GET HTTP request to /projects

Info about a single project

$ curl -H "Authorization: GENCOVE-API-KEY <your_api_key>"\
    https://rest.gencove.com/projects/<project_id>

$ curl -H "Authorization: GENCOVE-API-KEY <your_api_key>"\
    https://rest.gencove.com/projects/<project_id>/stats
import requests

response = requests.get(
    'https://rest.gencove.com/projects/<project_id>',
    headers={
        'Authorization': 'GENCOVE-API-KEY <your_api_key>'
    }
)

response = requests.get(
    'https://rest.gencove.com/projects/<project_id>/stats',
    headers={
        'Authorization': 'GENCOVE-API-KEY <your_api_key>'
    }
)

Data and stats for a specific project can be listed with GET HTTP requests to /projects/<project_id> and /projects/<project_id>/stats, respectively.

Listing project samples

$ curl -H "Authorization: GENCOVE-API-KEY <your_api_key>"\
    https://rest.gencove.com/projects/<project_id>/samples
import requests

response = requests.get(
    'https://rest.gencove.com/projects/<project_id>/samples',
    headers={
        'Authorization': 'GENCOVE-API-KEY <your_api_key>'
    }
)

Project samples can be listed with a GET HTTP request to /projects/<project_id>/samples

Info about a specific sample

$ curl -H "Authorization: GENCOVE-API-KEY <your_api_key>"\
    https://rest.gencove.com/projects/<project_id>/samples/<sample_id>
import requests

response = requests.get(
    'https://rest.gencove.com/projects/<project_id>/samples/<sample_id>',
    headers={
        'Authorization': 'GENCOVE-API-KEY <your_api_key>'
    }
)

Info about a single sample can be listed with a GET HTTP request to /projects/<project_id>/samples/<sample_id>
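The read-only endpoints above share one shape, so a small helper can build any of them. The sketch below uses only the standard library and constructs the request without sending it; the ENDPOINTS table keys and the build_get name are hypothetical, and the integer ids are placeholders.

```python
from urllib.request import Request

BASE_URL = "https://rest.gencove.com"

# Path templates for the GET endpoints described in this section.
ENDPOINTS = {
    "projects": "/projects",
    "project": "/projects/{project_id}",
    "project_stats": "/projects/{project_id}/stats",
    "samples": "/projects/{project_id}/samples",
    "sample": "/projects/{project_id}/samples/{sample_id}",
}


def build_get(name: str, api_key: str, **ids) -> Request:
    """Build (but do not send) an authenticated GET request for an endpoint."""
    url = BASE_URL + ENDPOINTS[name].format(**ids)
    headers = {"Authorization": f"GENCOVE-API-KEY {api_key}"}
    return Request(url, headers=headers, method="GET")


request = build_get("sample", "<your_api_key>", project_id=1, sample_id=2)
print(request.get_method(), request.full_url)
```

Passing the request to urllib.request.urlopen would send it.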

Accessing genomic data

$ curl -H "Authorization: GENCOVE-API-KEY <your_api_key>"\
    -H "Content-Type: application/json"\
    -X POST\
    -d '{"sample_ids": [<sample_id_1>, <sample_id_2>, ..., <sample_id_N>]}'\
    https://rest.gencove.com/projects/<int:project_id>/raw-data

$ curl -H "Authorization: GENCOVE-API-KEY <your_api_key>"\
    -H "Content-Type: application/json"\
    -X POST\
    -d '{"external_ids": [<external_id_1>, <external_id_2>, ..., <external_id_N>]}'\
    https://rest.gencove.com/projects/<int:project_id>/raw-data
import requests

response = requests.post(
    'https://rest.gencove.com/projects/<int:project_id>/raw-data',
    headers={
        'Authorization': 'GENCOVE-API-KEY <your_api_key>'
    },
    json={
        'sample_ids': [<sample_id_1>, <sample_id_2>, ..., <sample_id_N>]
    }
)

response = requests.post(
    'https://rest.gencove.com/projects/<int:project_id>/raw-data',
    headers={
        'Authorization': 'GENCOVE-API-KEY <your_api_key>'
    },
    json={
        'external_ids': [<external_id_1>, <external_id_2>, ..., <external_id_N>]
    }
)

Genomic data can be accessed by POSTing a list of sample_ids or external_ids to /projects/<int:project_id>/raw-data.

The endpoint provides the same data as outlined above for the Gencove CLI.
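For scripts that switch between the two id types, the request body can be assembled by a helper. The sketch below builds the POST request with the standard library without sending it. The function name build_raw_data_request and its id_type parameter are hypothetical, loosely mirroring the CLI's --id-type option.

```python
import json
from typing import Iterable
from urllib.request import Request


def build_raw_data_request(api_key: str, project_id: int, ids: Iterable,
                           id_type: str = "sample") -> Request:
    """Build (but do not send) the POST request for /projects/<project_id>/raw-data."""
    if id_type not in ("sample", "external"):
        raise ValueError("id_type must be 'sample' or 'external'")
    # The body key is "sample_ids" or "external_ids", as in the examples.
    body = json.dumps({f"{id_type}_ids": list(ids)}).encode("utf-8")
    return Request(
        f"https://rest.gencove.com/projects/{project_id}/raw-data",
        data=body,
        headers={
            "Authorization": f"GENCOVE-API-KEY {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Sending the request with urllib.request.urlopen and decoding the response with json.load would yield the records described for the CLI.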

Testing environment

Developers may use the Gencove staging environment for development and testing.

The staging developer website URL is: https://app-staging.gencove.com

The staging API URL is: https://rest-staging.gencove.com
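One way to switch a script between production and staging is to select the base URL from configuration. In the sketch below, the GENCOVE_ENV environment variable and the api_base_url function are hypothetical names invented for this example; only the two URLs come from this documentation.

```python
import os

BASE_URLS = {
    "production": "https://rest.gencove.com",
    "staging": "https://rest-staging.gencove.com",
}


def api_base_url(environment: str = "") -> str:
    """Pick the API base URL; GENCOVE_ENV is a hypothetical variable for this sketch."""
    name = environment or os.environ.get("GENCOVE_ENV", "production")
    return BASE_URLS[name]


print(api_base_url("staging"))
```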

API Reference

The full API reference for publicly available endpoints is available here: API reference

Terms

We reserve the right to remove your access to our API for any reason at our sole discretion.

FAQ

User FAQ

Support

Contact us at support@gencove.com