Installing packages and software in Gencove Explorer

Conceptually, it is useful to think about the environments you have access to via the Explorer platform in two parts:

  • Local environment, or the environment you interact with via your JupyterLab instance.
  • Cluster environment, or the environment to which you can submit analysis jobs.

These are separate environments, and they come with the same pre-installed packages.

New packages installed by the user in one environment do not propagate to the other automatically.

This is an important distinction: just because you installed a Python module in your working environment via JupyterLab does not mean it is installed in the cluster to which you submit a job. In the following sections, we will take an example case where you have an analysis to run which requires a command line tool, vcftools, and a Python module, cyvcf2, neither of which comes pre-installed on Explorer instances or on the cluster. We provide details on how you would install both on your Explorer instance and for your cluster analysis.

Installing packages in your Gencove Explorer working environment

Your Gencove Explorer instance is a Linux virtual machine running Ubuntu 22.04, and the JupyterLab instance runs on this machine. You have superuser access to the VM and can therefore install any programs or packages as you normally would on a Linux box via the shell. In our example, to install vcftools via the system package manager APT, you would simply run this in the normal way:

$ sudo apt-get update
$ sudo apt-get install vcftools

Similarly, you can install Python modules using pip; in our example, to install the cyvcf2 module, which also does not come preinstalled on Explorer instances, you would simply run

$ pip install cyvcf2

After this command successfully runs (and after you have restarted the IPython kernel), cyvcf2 will be installed in your local environment, and you will be able to use it in your Python notebooks on your Explorer instance.
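To confirm that both installs took effect, a quick check from a notebook cell can help. The following is a minimal sketch using only the Python standard library; the helper name `missing_requirements` is hypothetical and not part of Gencove Explorer. It assumes top-level module names (such as cyvcf2) and tools that are expected to be on the PATH (such as vcftools).

```python
import importlib.util
import shutil


def missing_requirements(modules, tools):
    """Return (missing_modules, missing_tools) for the current environment.

    `modules` are top-level Python module names; `tools` are executable
    names expected to be found on the PATH.
    """
    # find_spec returns None when a top-level module cannot be located.
    missing_modules = [m for m in modules if importlib.util.find_spec(m) is None]
    # shutil.which returns None when no matching executable is on the PATH.
    missing_tools = [t for t in tools if shutil.which(t) is None]
    return missing_modules, missing_tools


missing_modules, missing_tools = missing_requirements(["cyvcf2"], ["vcftools"])
print("Missing Python modules:", missing_modules)
print("Missing command line tools:", missing_tools)
```

If both lists print as empty, the module and the tool are visible to the kernel you are currently running. This says nothing about the cluster environment, which has to be prepared separately.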

Installing packages for use in jobs submitted to the cluster

The cluster environment in which your submitted jobs run is entirely separate from your Explorer working environment, though it is also based on Ubuntu 22.04. As such, you must install software and modules explicitly in your work function when submitting jobs for analysis on the cluster (see documentation in Analysis).

An example of a work function that installs these two packages as the first step is as follows:

from gencove_explorer.analysis import Analysis, AnalysisContext

def work(ac: AnalysisContext):
    # install necessary packages
    ! pip install cyvcf2
    ! apt-get update
    ! apt-get install -y vcftools
    # do actual analysis

Note that at the moment, you are required to install these packages in each work function you define that uses them.
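Because the install steps are repeated in every work function, it may be convenient to express them with the standard library instead of notebook shell escapes. The sketch below is one possible approach, not an Explorer API: the helper names `install_commands` and `ensure_installed` are hypothetical. It assumes that the pip package name matches the importable module name and that the APT package name matches the executable name (both hold for cyvcf2 and vcftools), and that apt-get can be run without sudo in the job, as in the example above.

```python
import importlib.util
import shutil
import subprocess
import sys


def install_commands(modules, apt_packages):
    """Build the install commands for anything not already available.

    Assumption: pip package names equal module names, and APT package
    names equal the names of the executables they provide.
    """
    commands = []
    missing_modules = [m for m in modules if importlib.util.find_spec(m) is None]
    if missing_modules:
        # Use the running interpreter's pip so the module lands in its environment.
        commands.append([sys.executable, "-m", "pip", "install", *missing_modules])
    missing_tools = [p for p in apt_packages if shutil.which(p) is None]
    if missing_tools:
        commands.append(["apt-get", "update"])
        commands.append(["apt-get", "install", "-y", *missing_tools])
    return commands


def ensure_installed(modules, apt_packages):
    """Run each install command, raising CalledProcessError if one fails."""
    for command in install_commands(modules, apt_packages):
        subprocess.run(command, check=True)
```

Inside a work function, this could be called as `ensure_installed(["cyvcf2"], ["vcftools"])` before the analysis code. Whether the helper must be defined inside the work function itself depends on how Explorer ships the function to the cluster, which is not covered here; defining it inside `work` is the conservative choice.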