Shortcut Library

The Gencove Explorer Library is a Python package that contains a collection of pre-made “shortcuts” that represent commonly used genomic analysis workflows like subsetting and annotating VCF files.

There are two main types of shortcuts:

  • Local: execute locally
  • Remote: execute on a cluster

Local shortcuts

These shortcuts execute locally and typically have modest resource requirements. They usually provide visualizations and summaries of various statistics.

Local shortcuts provide:

  • run() method for running the shortcut
  • result() method for accessing shortcut results (if applicable)

Remote shortcuts

These shortcuts execute remotely on the cluster and commonly represent workloads with large resource requirements that cannot reasonably complete in a local environment.

They are automatically serialized and uploaded to EOS after run() is executed, and can later be retrieved using ShortcutManager.

Remote shortcuts provide:

  • input_helper() method to generate input for the shortcut in a simple and user-friendly manner
    • this is a static method, so it can be called without instantiating an object
  • run() method for scheduling execution of the shortcut onto the cluster
  • status() method for checking shortcut execution status
  • result() method for accessing shortcut results
  • analyses() method for returning the Analysis objects on which downstream shortcuts must depend
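
Because run() only schedules the work, results are normally read after status() reports completion. The following is a minimal polling sketch, not part of the library: wait_for is a hypothetical helper, and the state names are made up because the values returned by status() are not listed here. A stand-in callable is used in place of a real shortcut so the snippet runs on its own.

```python
import time
from typing import Callable, Collection


def wait_for(
    get_status: Callable[[], str],
    terminal_states: Collection[str],
    poll_seconds: float = 30.0,
    timeout_seconds: float = 3600.0,
) -> str:
    """Call get_status() until it returns a terminal state or the timeout elapses."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        state = get_status()
        if state in terminal_states:
            return state
        if time.monotonic() >= deadline:
            raise TimeoutError(f"still in state {state!r} after {timeout_seconds} s")
        time.sleep(poll_seconds)


# Stand-in for a real shortcut: returns made-up states in order.
fake_states = iter(["pending", "running", "done"])
final = wait_for(
    lambda: next(fake_states),
    terminal_states={"done", "failed"},  # hypothetical state names
    poll_seconds=0.0,
)
print(final)
```

With a real remote shortcut, get_status would wrap its status() method, and terminal_states would need to match whatever values that method actually returns.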

Composing remote shortcuts

One important aspect of these shortcuts is that they can be easily composed, assuming the respective inputs and outputs are compatible.

The example below subsets a collection of VCF files to a genomic region and annotates the resulting VCF files with ClinVar annotations.

from gencove_explorer_library.shortcuts.annotate import AnnotateVCFs, AnnotationClinVar
from gencove_explorer_library.shortcuts.subset import SubsetVCFs
from gencove_explorer.helpers import GenomicRegion

input_parameters = SubsetVCFs.input_helper("aa3a46e0-c390-4943-b613-26f9908367d5")

subset = SubsetVCFs(
    regions=[GenomicRegion(contig=1, start=860000, stop=880000)],
    **input_parameters,
)

annotated_subset = AnnotateVCFs(
    vcfs=subset,
    annotation=AnnotationClinVar(genome="GRCh37"),
).run()

Shortcut Manager

Similar to AnalysisManager, the ShortcutManager makes it possible to retrieve and "rehydrate" previous Shortcut objects. This allows revisiting previous shortcut executions to retrieve their outputs and logs.

Listing previous shortcuts

The ShortcutManager provides methods to list and search across previously run shortcuts.

  • Listing all shortcuts

    from gencove_explorer.shortcut_manager import ShortcutManager
    mgr = ShortcutManager()
    print(mgr.list_shortcuts())
    
  • Listing shortcuts from relative time points

    print(mgr.list_shortcuts(since="1h")) # <-- shortcuts from the last hour
    print(mgr.list_shortcuts(since="1d")) # <-- shortcuts from the last day
    print(mgr.list_shortcuts(since="2w")) # <-- shortcuts from the last two weeks
    
  • Listing shortcuts from an absolute date

    print(mgr.list_shortcuts(date="2025-01-16")) # <-- specific date
    print(mgr.list_shortcuts(date="2025-01")) # <-- from a single month
    
  • Listing shortcuts matching a name filter

    print(mgr.list_shortcuts(name="SubsetVCFs")) # <-- case-insensitive search across all shortcuts for this substring
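
To make the relative time strings in the list above concrete, here is a small sketch of how values such as "1h", "1d", and "2w" could map to a cutoff timestamp. This is an illustration only, not the library's implementation: parse_since and cutoff are hypothetical helpers, and only the three unit letters shown in the examples are assumed to exist.

```python
import re
from datetime import datetime, timedelta, timezone

# Unit letters seen in the examples above; other units are not assumed.
_UNITS = {"h": "hours", "d": "days", "w": "weeks"}


def parse_since(value: str) -> timedelta:
    """Turn a string like '2w' into a timedelta."""
    match = re.fullmatch(r"(\d+)([hdw])", value.strip())
    if match is None:
        raise ValueError(f"unsupported relative time: {value!r}")
    amount, unit = match.groups()
    return timedelta(**{_UNITS[unit]: int(amount)})


def cutoff(value: str, now: datetime) -> datetime:
    """Earliest timestamp that still falls inside the relative window."""
    return now - parse_since(value)


now = datetime(2025, 1, 16, 12, 0, tzinfo=timezone.utc)
print(cutoff("1h", now))
print(cutoff("2w", now))
```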
    

Retrieving by ID

When you submit a shortcut to the AWS Batch cluster with .run(), a Shortcut ID is created and the shortcut information is saved in EOS. You can see these IDs using the list methods described above.

You can retrieve your results using this Shortcut ID with the ShortcutManager object.

from gencove_explorer.shortcut_manager import ShortcutManager

mgr = ShortcutManager()
shortcut = mgr.get_shortcut("2025-01-16T000000_SubsetVCFs_63a0b2eb68ee442f8bf16c753896d875")
shortcut.result(0)  # <-- this Shortcut object can now be used as a regular Shortcut object
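
The Shortcut ID in the example above appears to combine a timestamp, the shortcut name, and a hexadecimal suffix, separated by underscores. The sketch below splits an ID into those parts. The layout is inferred from that single example rather than from a documented format, and parse_shortcut_id and ParsedShortcutID are hypothetical names, so treat this as an assumption.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class ParsedShortcutID:
    created: datetime
    name: str
    suffix: str


def parse_shortcut_id(shortcut_id: str) -> ParsedShortcutID:
    """Split '<timestamp>_<name>_<suffix>' (assumed layout) into its parts."""
    # Split from both ends so a shortcut name containing "_" stays intact.
    timestamp, _, rest = shortcut_id.partition("_")
    name, _, suffix = rest.rpartition("_")
    if not (timestamp and name and suffix):
        raise ValueError(f"unexpected Shortcut ID layout: {shortcut_id!r}")
    created = datetime.strptime(timestamp, "%Y-%m-%dT%H%M%S")
    return ParsedShortcutID(created=created, name=name, suffix=suffix)


parsed = parse_shortcut_id(
    "2025-01-16T000000_SubsetVCFs_63a0b2eb68ee442f8bf16c753896d875"
)
print(parsed.created.date(), parsed.name)
```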