Reference Genome
Gencove provides access to the reference genome files used to generate project deliverables.
These files can be downloaded with the gencove projects get-reference-genome command. Alternatively, the genome.fasta file can be downloaded from the project detail page in the web application.
Downloading subsets of deliverables
$ gencove projects get-reference-genome <project-id> <destination-dir> --file-types genome-fasta,genome-dict
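When the list of file types is assembled in a script, it can help to build the comma-separated --file-types value from separate arguments. The following is a minimal sketch, not part of the Gencove CLI: the function names join_file_types and fetch_reference are hypothetical helpers, and only the command and flag shown above are used.

```shell
# Join the given file-type keys with commas, the form --file-types expects.
# The body runs in a subshell so the IFS change does not leak to the caller.
join_file_types() (
  IFS=,
  printf '%s\n' "$*"
)

# Hypothetical wrapper: download only the listed file types for one project.
# Usage: fetch_reference <project-id> <destination-dir> <file-type>...
fetch_reference() (
  project_id=$1
  dest_dir=$2
  shift 2
  gencove projects get-reference-genome "$project_id" "$dest_dir" \
    --file-types "$(join_file_types "$@")"
)
```

For example, fetch_reference <project-id> <destination-dir> genome-fasta genome-dict would run the same command as the one shown above.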
The currently available file types are listed below. Not every file type is available
for every project; run gencove file-types --object reference-genome --project-id <project-id>
for an accurate list of file types.
genome-fasta - Reference genome sequence in FASTA format, compressed using gzip
genome-dict - Picard sequence dictionary corresponding to the reference genome sequence
genome-fasta_amb - BWA index file recording the positions of ambiguous bases (such as N) in the reference
genome-fasta_ann - BWA index file listing the reference sequence names and lengths
genome-fasta_bwt - Burrows-Wheeler Transform (BWT) index file, used by BWA for efficient sequence alignment
genome-fasta_fai - FASTA index file, providing quick access to sequences within the compressed FASTA file
genome-fasta_gzi - Index file for the compressed FASTA file, facilitating quick retrieval of specific regions
genome-fasta_pac - Packed (2-bit encoded) reference sequence file used by BWA
genome-fasta_sa - Suffix array file, a data structure used by BWA for pattern matching and genome alignment
genome-fasta_vcf_header - Header file for a Variant Call Format (VCF) file, containing information about the reference genome and other metadata
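The file types above fall into groups that are typically needed together. The sketch below illustrates one way to request a group at once. The function name file_types_for and the group labels bwa and faidx are invented for this example, and the groupings reflect the usual requirements of those tools rather than anything stated by Gencove.

```shell
# Print the comma-separated file types that one tool typically needs.
# The labels "bwa" and "faidx" are hypothetical names used only in this sketch.
file_types_for() {
  case "$1" in
    bwa)
      # A BWA index is made up of the amb, ann, bwt, pac and sa files.
      echo "genome-fasta_amb,genome-fasta_ann,genome-fasta_bwt,genome-fasta_pac,genome-fasta_sa"
      ;;
    faidx)
      # Region lookups in the compressed FASTA need the sequence plus its fai and gzi indexes.
      echo "genome-fasta,genome-fasta_fai,genome-fasta_gzi"
      ;;
    *)
      echo "unknown tool: $1" >&2
      return 1
      ;;
  esac
}
```

The result can be passed straight to the download command, for example: gencove projects get-reference-genome <project-id> <destination-dir> --file-types "$(file_types_for bwa)". As noted above, a project may not offer every file type, so it is worth checking the output of gencove file-types first.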