Reference Genome

Gencove offers access to the reference genome files utilized in generating project deliverables.

These files can be downloaded using the gencove projects get-reference-genome command. Alternatively, they are accessible through the web application, where you can download the genome.fasta file from the project detail page.

$ gencove projects get-reference-genome <project-id> <destination-dir>

Downloading subsets of deliverables

To download only specific file types, pass a comma-separated list to the --file-types option:

$ gencove projects get-reference-genome <project-id> <destination-dir> --file-types genome-fasta,genome-dict
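When the list of file types is assembled in a script, it can help to build the comma-separated value programmatically. The following is a minimal sketch, not part of the Gencove CLI: the function names join_file_types and download_reference_subset are hypothetical helpers, and the only CLI pieces used are the get-reference-genome command and the --file-types option shown above.

```shell
# Join all arguments with commas,
# e.g. "genome-fasta genome-dict" -> "genome-fasta,genome-dict".
# The parenthesized body runs in a subshell, so the IFS change does not leak.
join_file_types() (
  IFS=,
  printf '%s\n' "$*"
)

# Hypothetical wrapper (not a Gencove command):
#   download_reference_subset <project-id> <destination-dir> <file-type>...
download_reference_subset() {
  project_id=$1
  destination_dir=$2
  shift 2
  if [ "$#" -eq 0 ]; then
    echo "usage: download_reference_subset <project-id> <destination-dir> <file-type>..." >&2
    return 2
  fi
  # Creating the directory first is an assumption made for convenience;
  # it is harmless if the directory already exists.
  mkdir -p "$destination_dir" || return 1
  gencove projects get-reference-genome "$project_id" "$destination_dir" \
    --file-types "$(join_file_types "$@")"
}
```

With these helpers defined, download_reference_subset <project-id> <destination-dir> genome-fasta genome-dict would run the same command as the example above.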

The currently available file types are listed below. Not all file types may be available for every project; for an accurate list, run gencove file-types --object reference-genome --project-id <project-id>.

genome-fasta
Reference genome sequence in FASTA format, compressed using gzip
genome-dict
Picard sequence dictionary corresponding to the reference genome sequence
genome-fasta_amb
BWA index file recording the positions of ambiguous (non-ACGT) bases in the reference sequence
genome-fasta_ann
BWA index file recording the names and lengths of the reference sequences
genome-fasta_bwt
Burrows-Wheeler Transform (BWT) index file, used for efficient sequence alignment
genome-fasta_fai
FASTA index file, providing quick access to sequences within the compressed FASTA file
genome-fasta_gzi
Index file for the compressed FASTA file, facilitating quick retrieval of specific regions
genome-fasta_pac
Packed (2-bit encoded) reference sequence file used by BWA for indexing
genome-fasta_sa
Suffix array file, a data structure used for pattern matching and genome alignment
genome-fasta_vcf_header
Header file for a Variant Call Format (VCF) file, containing information about the reference genome and other metadata
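
Because not every project provides every file type listed above, a script may want to confirm that a type is available before requesting it. The sketch below is illustrative only: has_file_type is a hypothetical helper, and it assumes that the output of gencove file-types contains each file type key as a whitespace-separated field on some line. The actual output format is not described here, so adjust the parsing if it differs.

```shell
# Hypothetical helper: reads a file type listing on stdin and succeeds
# (exit status 0) only if some whitespace-separated field equals $1 exactly.
# Assumption: each available key appears as its own field in the listing.
has_file_type() {
  awk -v want="$1" '
    { for (i = 1; i <= NF; i++) if ($i == want) found = 1 }
    END { exit (found ? 0 : 1) }
  '
}

# Example use (replace <project-id> with a real project ID):
#   gencove file-types --object reference-genome --project-id <project-id> \
#     | has_file_type genome-fasta_gzi && echo "genome-fasta_gzi is available"
```

The comparison is an exact match on the whole field rather than a substring search, since genome-fasta is a prefix of most of the other keys (genome-fasta_fai, genome-fasta_gzi, and so on).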