Reference Genome
Gencove provides access to the reference genome files used to generate project deliverables.
These files can be downloaded using the gencove projects get-reference-genome
command. Alternatively, they are available through the web application, where you can download the genome.fasta
file from the project detail page.
Downloading subsets of deliverables
$ gencove projects get-reference-genome <project-id> <destination-dir> --file-types genome-fasta,genome-dict
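The comma-separated value passed to --file-types can also be assembled in a script rather than typed by hand. The following is a minimal sketch for a POSIX shell, not part of the Gencove CLI: join_file_types is a hypothetical helper, and the project ID and destination directory are placeholder values. The command is printed for review instead of being executed.

```shell
#!/bin/sh
# Hypothetical helper: join its arguments with commas, the format that
# --file-types expects. The subshell body keeps the IFS change local.
join_file_types() (
  IFS=,
  printf '%s\n' "$*"
)

# Placeholder values (assumptions); replace with a real project ID and path.
PROJECT_ID="${PROJECT_ID:-your-project-id}"
DEST_DIR="${DEST_DIR:-./reference-genome}"

# Build the comma-separated list for the two file types used in the example above.
FILE_TYPES="$(join_file_types genome-fasta genome-dict)"

# Print the command for review rather than running it.
printf '%s\n' "gencove projects get-reference-genome $PROJECT_ID $DEST_DIR --file-types $FILE_TYPES"
```

Once the printed command looks right, it can be copied and run as-is.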
Currently available file types are listed below. Not all file types may be available
for every project; run gencove file-types --object reference-genome --project-id <project-id>
for an accurate list of the file types available for a given project.
genome-fasta
- Reference genome sequence in FASTA format, compressed using gzip
genome-dict
- Picard sequence dictionary corresponding to the reference genome sequence
genome-fasta_amb
- BWA index file recording the positions of ambiguous (non-ACGT) bases, such as N, in the reference sequence
genome-fasta_ann
- BWA index file listing the names and lengths of the reference sequences
genome-fasta_bwt
- Burrows-Wheeler Transform (BWT) index file, used by BWA for efficient sequence alignment
genome-fasta_fai
- FASTA index file, providing quick access to sequences within the compressed FASTA file
genome-fasta_gzi
- Index file for the compressed FASTA file, facilitating quick retrieval of specific regions
genome-fasta_pac
- Packed (2-bit encoded) reference sequence file used by the BWA index
genome-fasta_sa
- Suffix array file, a data structure used by BWA for pattern matching during alignment
genome-fasta_vcf_header
- Header file for a Variant Call Format (VCF) file, containing information about the reference genome and other metadata
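A typo in a file type name is easy to make, so a script may want to check a --file-types value before using it. The sketch below is a hypothetical POSIX shell helper, not part of the Gencove CLI. It checks a comma-separated value against the names documented in the list above. Because availability varies by project, the gencove file-types command remains the authoritative source, and this check only catches misspelled names.

```shell
#!/bin/sh
# File type names copied from the list above (space-separated).
KNOWN_FILE_TYPES="genome-fasta genome-dict genome-fasta_amb genome-fasta_ann genome-fasta_bwt genome-fasta_fai genome-fasta_gzi genome-fasta_pac genome-fasta_sa genome-fasta_vcf_header"

# Hypothetical helper: succeed if $1 is one of the documented names.
is_known_file_type() {
  case " $KNOWN_FILE_TYPES " in
    *" $1 "*) return 0 ;;
    *) return 1 ;;
  esac
}

# Hypothetical helper: check every name in a comma-separated list.
# Reports unknown names on stderr and returns non-zero if any are found.
check_file_types() (
  set -f   # disable globbing while splitting the list
  IFS=,
  status=0
  for name in $1; do
    if ! is_known_file_type "$name"; then
      printf 'unknown file type: %s\n' "$name" >&2
      status=1
    fi
  done
  exit "$status"
)
```

For example, check_file_types "genome-fasta,genome-dict" succeeds, while a value containing a misspelled name such as genome-fastq fails and names the offending entry.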