
Shortcut Library

The Gencove Explorer Library is a Python package that contains a collection of pre-made “shortcuts” that represent commonly used genomic analysis workflows like subsetting and annotating VCF files.

There are two main types of shortcuts:

  • Local: execute locally
  • Remote: execute on a cluster

Local shortcuts

These shortcuts execute locally and typically have modest resource requirements. They usually provide visualizations and summaries of various statistics.

Local shortcuts provide:

  • run() method for running the shortcut
  • result() method for accessing shortcut results (if applicable)
  • save() and load() methods for saving and reloading shortcut state from local file storage
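
The run, result, save, and load lifecycle listed above can be sketched with a toy class. This is a minimal illustration, not the library's implementation: `ToySummaryShortcut`, its JSON state file, and the choice to have `run()` return the object itself are all assumptions made for the example.

```python
import json
import tempfile
from pathlib import Path
from statistics import mean


class ToySummaryShortcut:
    """Hypothetical stand-in for a local shortcut; not part of the library."""

    def __init__(self, values):
        self.values = list(values)
        self._result = None

    def run(self):
        # Execute in the current process; return self so calls can be chained.
        self._result = {"n": len(self.values), "mean": mean(self.values)}
        return self

    def result(self):
        if self._result is None:
            raise RuntimeError("call run() before result()")
        return self._result

    def save(self, path):
        # Persist both the inputs and the computed result to local storage.
        state = {"values": self.values, "result": self._result}
        Path(path).write_text(json.dumps(state))

    @classmethod
    def load(cls, path):
        # Rebuild the shortcut from saved state without re-running it.
        state = json.loads(Path(path).read_text())
        shortcut = cls(state["values"])
        shortcut._result = state["result"]
        return shortcut


shortcut = ToySummaryShortcut([30, 32, 34]).run()
state_file = Path(tempfile.mkdtemp()) / "summary_shortcut.json"
shortcut.save(state_file)
restored = ToySummaryShortcut.load(state_file)
```

Saving state alongside the result means a reloaded shortcut can serve `result()` immediately, which is the point of having `save()` and `load()` on the shortcut at all.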

Remote shortcuts

These shortcuts execute remotely on the cluster and commonly represent workloads with large resource requirements that cannot reasonably complete in a local environment.

Remote shortcuts provide:

  • input_helper() method to generate input for the shortcut in a simple and user-friendly manner
    • this is a static method, so it can be called without instantiating the shortcut
  • run() method for scheduling execution of the shortcut onto the cluster
  • status() method for checking shortcut execution status
  • result() method for accessing shortcut results
  • analyses() method for returning the Analysis objects on which downstream shortcuts must depend
  • save() and load() methods for saving and reloading shortcut state from local file storage
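
The difference from a local shortcut is that `run()` only schedules work, so the status must be checked before the result is available. The sketch below mimics that behavior with a background thread standing in for the cluster. `ToyRemoteShortcut`, the status strings, and the shape of the values returned by `input_helper()` and `analyses()` are hypothetical and chosen only for illustration; they are not the library's API.

```python
from concurrent.futures import ThreadPoolExecutor

# Stand-in for the cluster: a single background worker thread.
_cluster = ThreadPoolExecutor(max_workers=1)


class ToyRemoteShortcut:
    """Hypothetical stand-in for a remote shortcut; not part of the library."""

    def __init__(self, samples):
        self.samples = list(samples)
        self._job = None

    @staticmethod
    def input_helper(project_id):
        # Static: callable on the class itself, before any instance exists.
        return {"samples": [f"{project_id}-sample-{i}" for i in range(3)]}

    def run(self):
        # Schedule the work and return immediately instead of blocking.
        self._job = _cluster.submit(lambda: {s: len(s) for s in self.samples})
        return self

    def status(self):
        if self._job is None:
            return "not scheduled"
        return "done" if self._job.done() else "running"

    def analyses(self):
        # Handles that a downstream step could wait on before starting.
        return [] if self._job is None else [self._job]

    def result(self):
        if self._job is None:
            raise RuntimeError("call run() before result()")
        return self._job.result()  # blocks until the job finishes


inputs = ToyRemoteShortcut.input_helper("demo")  # no instance needed
shortcut = ToyRemoteShortcut(**inputs)
status_before = shortcut.status()
output = shortcut.run().result()
status_after = shortcut.status()
```

Unpacking the helper's return value with `**inputs` mirrors the pattern used with `input_helper()` in the composition example in this document.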

Composing remote shortcuts

One important aspect of these shortcuts is that they can be easily composed, assuming the respective inputs and outputs are compatible.

The example below subsets a collection of VCF files to a genomic region and annotates the resulting VCF files with ClinVar annotations.

from gencove_explorer_library.shortcuts.annotate import AnnotateVCFs, AnnotationClinVar
from gencove_explorer_library.shortcuts.subset import SubsetVCFs
from gencove_explorer.helpers import GenomicRegion

# Generate the SubsetVCFs inputs for the given ID using the static helper
input_parameters = SubsetVCFs.input_helper("aa3a46e0-c390-4943-b613-26f9908367d5")

# Subset the VCF files to a single region on contig 1
subset = SubsetVCFs(
    regions=[GenomicRegion(contig=1, start=860000, stop=880000)],
    **input_parameters,
)

# Pass the subset shortcut as the input of the annotation shortcut, then schedule it
annotated_subset = AnnotateVCFs(
    vcfs=subset,
    annotation=AnnotationClinVar(genome="GRCh37"),
).run()
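
The composition pattern, where one shortcut is passed as the input of the next, can be illustrated without a cluster. In the toy sketch below, `ToyStep` is a hypothetical class, the records are made-up `(contig, position)` pairs rather than real VCF data, and the region bounds are treated as inclusive purely for the example. Running the downstream step also runs an upstream step that has not run yet; that is a choice made in this sketch and is not a statement about how the library schedules dependencies.

```python
class ToyStep:
    """Hypothetical composable step; not part of the library."""

    def __init__(self, func, source):
        self.func = func
        self.source = source  # raw data, or an upstream ToyStep
        self._result = None

    def run(self):
        upstream = self.source
        if isinstance(upstream, ToyStep):
            # Resolve the dependency first, running it if it has not run yet.
            if upstream._result is None:
                upstream.run()
            upstream = upstream.result()
        self._result = self.func(upstream)
        return self

    def result(self):
        return self._result


# (contig, position) pairs standing in for VCF records
records = [(1, 850_000), (1, 861_000), (1, 879_500), (2, 870_000)]

# Keep only records on contig 1 between 860,000 and 880,000
subset = ToyStep(
    lambda recs: [r for r in recs if r[0] == 1 and 860_000 <= r[1] <= 880_000],
    records,
)

# Tag each surviving record; the input is the upstream step, not its data
annotated = ToyStep(
    lambda recs: [(contig, pos, "annotated") for contig, pos in recs],
    subset,
).run()
```

Composition works here only because the output of the first step has the shape the second step expects, which is the compatibility condition described at the start of this section.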