Uploading using the CLI

$ gencove upload <source-path> [<destination-path>]

Syncs local directories to directories in your Gencove upload area. Recursively copies new and updated files from the source directory to the destination.

Alternatively, the command can import FASTQ files from URLs using a map file.

Only creates folders in the destination if they contain one or more files.

$ gencove upload my-fastq-files/

This example command recursively copies all files in the my-fastq-files/ directory on your host system to a directory with an automatically generated name in the Gencove upload area.

$ gencove upload input.fastq-map.csv

If there are multiple input FASTQ files per sample, or the file names do not follow the conventions described above, a manifest describing the relationship between the sample identifiers and the input FASTQ files must be provided in a CSV file in the format described above.

$ gencove upload my-fastq-files/ gncv://my-fastq/batch-1/

If more control over the upload destination is needed, a destination path prefixed with gncv:// may be provided. This pattern is commonly used to separate upload batches when continuously uploading data to your Gencove account, and it makes files easier to filter in the Gencove Dashboard. A common directory structure for batching uploads is:

gncv://<project-name>/<batch-name>/

If you specify a destination path, it is recommended to use at least one level of directories to separate batches of uploaded data. In other words, avoid placing all files directly in the gncv:// root directory.
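As a minimal sketch of the batching pattern above, the following Python helper assembles a destination of the form gncv://<project-name>/<batch-name>/. The function name build_destination is hypothetical and is not part of the Gencove CLI; it only builds the string that would be passed as the destination path.

```python
def build_destination(project_name: str, batch_name: str) -> str:
    """Return a destination of the form gncv://<project-name>/<batch-name>/.

    Hypothetical helper for illustration; not part of the Gencove CLI.
    """
    # Drop stray slashes so the result has exactly one separator per level.
    parts = [project_name.strip("/"), batch_name.strip("/")]
    if not all(parts):
        # Refuse empty names so files never land directly in the gncv:// root.
        raise ValueError("both project_name and batch_name must be non-empty")
    return "gncv://" + "/".join(parts) + "/"


print(build_destination("my-fastq", "batch-1"))
```

Rejecting empty names keeps every upload at least one directory level below the root, in line with the recommendation above.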

Details of upload behavior:

  • In case a file in the local directory already exists in the destination, it will not be overwritten
  • In case a file exists in the destination, but not the local directory, it will not be deleted
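The two rules above can be modeled with plain set operations. This is a sketch of the described behavior only, not the CLI's implementation; in particular, it assumes that "already exists" means "has the same relative path", which the text does not state. The function name plan_upload is hypothetical.

```python
def plan_upload(local_files, destination_files):
    """Model which files an upload would copy, skip, or leave alone.

    Both arguments are iterables of relative paths. Matching files by
    relative path is an assumption made for this illustration.
    """
    local = set(local_files)
    remote = set(destination_files)
    return {
        # Present locally but not in the destination: copied.
        "upload": sorted(local - remote),
        # Present in both places: not overwritten.
        "skip": sorted(local & remote),
        # Present only in the destination: not deleted.
        "keep": sorted(remote - local),
    }


plan = plan_upload(
    ["a_R1.fastq.gz", "a_R2.fastq.gz", "b_R1.fastq.gz"],
    ["a_R1.fastq.gz", "old_R1.fastq.gz"],
)
print(plan)
```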

Importing files from URLs with a map file

Using the map file described above, it is also possible to import FASTQ files from URLs. When constructing the map CSV file, include the URL for each file under the path column.

Here is an example of the contents of a CSV map file that uses URLs:

client_id,r_notation,path
sample1,r1,https://example-bucket.storage.googleapis.com/sample_R1.fastq.gz
sample1,r2,https://example-bucket.storage.googleapis.com/sample_R2.fastq.gz
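A map file like the one above can be generated with Python's standard csv module. The sketch below uses the column names from the example (client_id, r_notation, path); the function name build_map_csv is hypothetical.

```python
import csv
import io


def build_map_csv(rows):
    """Return map-file text for an iterable of (client_id, r_notation, path)."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    # Header row, as in the example map file.
    writer.writerow(["client_id", "r_notation", "path"])
    writer.writerows(rows)
    return buffer.getvalue()


map_text = build_map_csv([
    ("sample1", "r1", "https://example-bucket.storage.googleapis.com/sample_R1.fastq.gz"),
    ("sample1", "r2", "https://example-bucket.storage.googleapis.com/sample_R2.fastq.gz"),
])
print(map_text, end="")
```

Writing map_text to a file such as input.fastq-map.csv produces a file that can then be passed to the upload command.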

Note

Only the following URL domains are supported:

  • amazonaws.com (AWS)

  • blob.core.windows.net (Azure)

  • googleapis.com (Google Cloud)
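Before building a map file, it can help to confirm that each URL falls under one of the domains listed above. The sketch below does this with the standard library. How Gencove itself validates domains is not documented here, so matching on the hostname suffix is an assumption, and is_supported_url is a hypothetical name.

```python
from urllib.parse import urlparse

# Domains listed in the note above.
SUPPORTED_DOMAINS = ("amazonaws.com", "blob.core.windows.net", "googleapis.com")


def is_supported_url(url: str) -> bool:
    """Return True if the URL's hostname is, or is a subdomain of, a supported domain."""
    host = urlparse(url).hostname or ""
    # Require an exact match or a "." boundary so that a host such as
    # "notgoogleapis.com" is not accepted by accident.
    return any(host == d or host.endswith("." + d) for d in SUPPORTED_DOMAINS)


print(is_supported_url("https://example-bucket.storage.googleapis.com/sample_R1.fastq.gz"))
print(is_supported_url("https://example.com/sample_R1.fastq.gz"))
```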

Once the map file has been built, the upload command can be used:

$ gencove upload input.fastq-map.csv

Warning

When generating URLs from the above cloud providers, we suggest setting a generous expiration time so that the URLs do not expire before the files are assigned to a project and retrieved by the corresponding pipeline.
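One way to check the expiration of a URL before uploading a map file is to read it from the URL itself. The sketch below handles only AWS Signature Version 4 presigned URLs, which carry the signing time in the X-Amz-Date query parameter and the lifetime in seconds in X-Amz-Expires. Other providers encode expiry differently and are not handled here. The function name sigv4_expiry and the example URL are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import parse_qs, urlparse


def sigv4_expiry(url: str):
    """Return the UTC expiry of an AWS SigV4 presigned URL, or None if unknown."""
    query = parse_qs(urlparse(url).query)
    try:
        # X-Amz-Date is the signing time, formatted like 20240101T000000Z.
        signed_at = datetime.strptime(
            query["X-Amz-Date"][0], "%Y%m%dT%H%M%SZ"
        ).replace(tzinfo=timezone.utc)
        # X-Amz-Expires is the lifetime of the URL in seconds.
        lifetime = timedelta(seconds=int(query["X-Amz-Expires"][0]))
    except (KeyError, ValueError):
        # Parameters missing or malformed: not a SigV4 presigned URL.
        return None
    return signed_at + lifetime


# Hypothetical URL, signed at midnight UTC on 2024-01-01 with a 7-day lifetime.
example = (
    "https://example-bucket.s3.amazonaws.com/sample_R1.fastq.gz"
    "?X-Amz-Date=20240101T000000Z&X-Amz-Expires=604800"
)
print(sigv4_expiry(example))
```

Comparing the returned value with datetime.now(timezone.utc) shows how much lifetime a URL has left.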

Automatically starting analysis

To automatically assign uploads to a project and run analysis, pass the --run-project-id flag with the ID of the destination project to the Gencove CLI.

$ gencove upload my-fastq-files/ gncv://my-fastq/batch-1/ --run-project-id b1edbb20-ee77-4be0-9944-e8e3a593cc83

When this feature is used, the Gencove CLI checks that the contents of SOURCE and DESTINATION are identical, in order to avoid analyzing unwanted samples. This will always be the case if DESTINATION is omitted, i.e., autogenerated by the Gencove CLI.

It is also important to ensure that uploaded files follow the naming conventions outlined above to avoid sample identifier detection issues.
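When uploads are scripted, the --run-project-id invocation shown above can be assembled as an argument list. The sketch below uses only the arguments that appear in this section. It assumes project IDs are UUIDs, as the example ID suggests, and the function name upload_command is hypothetical.

```python
import uuid


def upload_command(source, project_id, destination=None):
    """Build the argument list for `gencove upload ... --run-project-id <id>`."""
    # Assumption: project IDs are UUIDs. uuid.UUID raises ValueError
    # if the string cannot be parsed as one.
    uuid.UUID(project_id)
    command = ["gencove", "upload", source]
    if destination is not None:
        # Omitting the destination lets the CLI autogenerate it.
        command.append(destination)
    command += ["--run-project-id", project_id]
    return command


cmd = upload_command(
    "my-fastq-files/",
    "b1edbb20-ee77-4be0-9944-e8e3a593cc83",
    destination="gncv://my-fastq/batch-1/",
)
print(" ".join(cmd))
```

The resulting list can be passed to subprocess.run(cmd, check=True). Validating the ID first catches a malformed value before any files are transferred.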