SDK Reference
This section covers class and function implementations within the Gencove Explorer SDK.
gencove_explorer.analysis.Analysis
dataclass
Primary object for defining, running, and monitoring an analysis with the Explorer SDK.
For additional details, please see the Analysis docs.
analysis_prefix
property
Unique S3 prefix for user analysis job
log_group: str
property
Internal property to get log group
get_name()
Returns a string name for the Analysis object
Returns:
Type | Description |
---|---|
str | |
get_output(job_index=0)
Get the output for a specific Analysis job.
Defaults to job_index=0
logs(job_index=0, since=None, live=False)
Prints the latest logs of the job to standard output.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
job_index | int | If this is an array job, pass the Job index here. Defaults to 0. | 0 |
since | str | From what time to begin displaying logs. By default, logs will be displayed starting from ten minutes in the past. The value provided can be an ISO 8601 timestamp or a relative time. For relative times, provide a number and a single unit. Supported units include: s (seconds), m (minutes), h (hours), d (days), w (weeks). For example, a value of '5m' would display logs starting five minutes in the past. Note that multiple units are not supported (e.g. '5h30m'). | None |
live | bool | True to see a live tail of a running job. Defaults to False. | False |
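The relative-time rules for since can be illustrated with a small parser. This is only a sketch of the format described above, not the SDK's own parsing code: parse_since and _UNIT_SECONDS are hypothetical names, and falling back to datetime.fromisoformat for absolute timestamps is an assumption.

```python
import re
from datetime import datetime, timedelta, timezone

# Seconds per supported unit, as listed in the `since` description.
_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400, "w": 604800}


def parse_since(value, now=None):
    """Hypothetical helper: convert a `since` value to an absolute datetime."""
    now = now or datetime.now(timezone.utc)
    # A relative time is a number followed by exactly one unit, e.g. '5m'.
    match = re.fullmatch(r"(\d+)([smhdw])", value)
    if match:
        amount, unit = int(match.group(1)), match.group(2)
        return now - timedelta(seconds=amount * _UNIT_SECONDS[unit])
    # Anything else is treated as an ISO 8601 timestamp (assumption).
    # Multi-unit values such as '5h30m' end up here and raise ValueError.
    return datetime.fromisoformat(value)
```

With this sketch, '5m' resolves to five minutes before now, while '5h30m' is rejected because it combines two units.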
run(sdk_version=None, sdk_branch=None, library_version=None, library_branch=None, dry_run=False, debug_serialized_objects=False)
Submit a job to batch cluster
Parameters:
Name | Type | Description | Default |
---|---|---|---|
sdk_version | Optional[str] | SDK version to use for batch job | None |
sdk_branch | Optional[str] | SDK branch to use for batch job | None |
library_version | Optional[str] | Library version to use for batch job | None |
library_branch | Optional[str] | Library branch to use for batch job | None |
dry_run | bool | If flag is set, do not execute any AWS calls | False |
debug_serialized_objects | bool | If flag is set, write serialized objects to working dir | False |
run_local(job_index=0, env_name=None, debug_serialized_objects=False, dry_run=False, sdk_branch=None, sdk_version=None, library_branch=None, library_version=None)
Runs analysis on local machine
Note
- Can only run on a single input at a time
- By default, only runs against first input item
- Does not support jobs with dependencies
- Does not support logs or job status
Parameters:
Name | Type | Description | Default |
---|---|---|---|
job_index | int | The index from supplied inputs to use for processing. Defaults to 0. | 0 |
env_name | Optional[str] | If supplied, will create a virtual env and use Analysis.pip_packages to install dependencies | None |
debug_serialized_objects | bool | Set to store serialized objects locally | False |
dry_run | bool | Set to prepare job without executing it | False |
status(job_index=None, full=False)
Returns the status of the Job.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
job_index | Optional[int] | If this is an array job, pass the Job index here. Pass None for array jobs to get the array job status instead of a child job status. Defaults to None. | None |
full | Optional[bool] | Set as True if you want the full response from AWS. Otherwise, returns a simple dict that contains the status of the job, and its children in "status" and "status_summary" respectively. | False |
Returns:
Type | Description |
---|---|
dict | Status of the Job |
store_analysis_history(run_type)
Internal method used to write analysis history
terminate(job_index=None)
Terminate a specific analysis job.
The job_index parameter is required if the Analysis is an array job (e.g. len(input) > 1).
Note
- Any dependent jobs will automatically fail
- Termination requests take some time to propagate to the job cluster
- Termination requests are idempotent
Parameters:
Name | Type | Description | Default |
---|---|---|---|
job_index | Optional[int] | Index of analysis to terminate | None |
terminate_all()
Terminate all jobs for this analysis
Note
- All array jobs will be terminated
- Termination requests take some time to propagate to the job cluster
- Any dependent jobs will automatically fail
wait(job_statuses, job_index=0, spinner_text='Waiting', spinner_complete_text='Complete')
Main method for waiting for the current Analysis to reach an AWS Batch status listed in job_statuses.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
job_statuses | List[str] | List of AWS Batch job statuses to wait for | required |
job_index | int | Analysis job index to wait for | 0 |
spinner_text | str | Spinner text to display while waiting | 'Waiting' |
spinner_complete_text | str | Spinner text to display when target status has been detected | 'Complete' |
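The waiting behaviour can be pictured as a polling loop over job statuses. The sketch below is not the SDK's implementation: wait_for_status, poll_seconds, and max_polls are hypothetical names, and the status lookup is replaced by a stand-in that replays a fixed sequence of AWS Batch statuses.

```python
import time


def wait_for_status(get_status, job_statuses, poll_seconds=5.0, max_polls=None):
    """Hypothetical helper: poll get_status() until it returns a target status."""
    polls = 0
    while True:
        status = get_status()
        if status in job_statuses:
            return status
        polls += 1
        # Optional safety limit so the loop cannot run forever.
        if max_polls is not None and polls >= max_polls:
            raise TimeoutError(f"last status was {status!r}")
        time.sleep(poll_seconds)


# Stand-in for a real status lookup: replays a fixed status sequence.
fake_statuses = iter(["SUBMITTED", "RUNNABLE", "RUNNING", "SUCCEEDED"])
result = wait_for_status(
    lambda: next(fake_statuses), ["SUCCEEDED", "FAILED"], poll_seconds=0
)
print(result)
```

Under this reading, waiting for a terminal state corresponds to the target list ["SUCCEEDED", "FAILED"], and waiting for a running or terminal state adds "RUNNING" to that list.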
wait_done(job_index=0)
Wait for Analysis job with index job_index to reach a terminal state (succeeded or failed).
Parameters:
Name | Type | Description | Default |
---|---|---|---|
job_index | int | Analysis job index, defaults to 0 | 0 |
wait_running_or_done(job_index=0)
Wait for Analysis job with index job_index to reach a running or terminal state (succeeded or failed).
Parameters:
Name | Type | Description | Default |
---|---|---|---|
job_index | int | Analysis job index, defaults to 0 | 0 |
gencove_explorer.analysis.GlobalConfig
gencove_explorer.analysis.JobDefinition
dataclass
Class for defining job resources and configuration
Parameters:
Name | Type | Description | Default |
---|---|---|---|
cpu | int | Number of vCPUs to allocate to job | required |
memory_mb | int | Amount of memory in megabytes to allocate to job | required |
timeout_seconds | int | Amount of time in seconds before job times out | 3600 |
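To make the required and defaulted fields concrete, here is a stand-in dataclass that mirrors the three documented parameters. JobDefinitionSketch is a hypothetical name used purely for illustration; it is not the SDK class and only reproduces the field names, types, and default listed in the table above.

```python
from dataclasses import dataclass


@dataclass
class JobDefinitionSketch:
    """Stand-in mirroring the documented fields; not the SDK class itself."""

    cpu: int  # number of vCPUs (required)
    memory_mb: int  # memory in megabytes (required)
    timeout_seconds: int = 3600  # documented default: one hour


# Omitting timeout_seconds falls back to the one-hour default.
small = JobDefinitionSketch(cpu=2, memory_mb=4096)

# A longer-running job overrides the default timeout explicitly.
long_running = JobDefinitionSketch(cpu=8, memory_mb=32768, timeout_seconds=6 * 3600)
```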
gencove_explorer.analysis.AnalysisContext
dataclass
Object for accessing data related to an Analysis job.
Examples:
child_prefix
property
Returns prefix to write analysis context results. Uses the batch index to determine prefix within jobs prefix.
e.g. s3://gencove-explorer-1111/users/2222/jobs/3333/outputs/0
outputs_prefix
property
Returns prefix to write outputs of the group results.
e.g. s3://gencove-explorer-1111/users/2222/jobs/3333/outputs
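The relationship between the two prefixes can be sketched as plain string construction. The path layout below is inferred only from the two example paths above, and the function and parameter names (outputs_prefix, child_prefix, bucket, user_id, job_id, batch_index) are hypothetical, so treat this as an illustration rather than the SDK's logic.

```python
def outputs_prefix(bucket, user_id, job_id):
    """Hypothetical helper: group-level outputs prefix for a job."""
    return f"s3://{bucket}/users/{user_id}/jobs/{job_id}/outputs"


def child_prefix(bucket, user_id, job_id, batch_index):
    """Hypothetical helper: per-index prefix nested under the outputs prefix."""
    return f"{outputs_prefix(bucket, user_id, job_id)}/{batch_index}"


# Reproduces the example paths shown for the two properties.
print(outputs_prefix("gencove-explorer-1111", "2222", "3333"))
print(child_prefix("gencove-explorer-1111", "2222", "3333", 0))
```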
dependency(dep_name)
Returns a dependency of the Job, looked up by name.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
dep_name | str | Name of the function or Analysis Job. | required |
Returns:
Type | Description |
---|---|
Optional[DependencyContainer] | Optional[DependencyContainer]: A single dependency object. If not found returns |
gencove_explorer.job_manager.JobManager
dataclass
Class for browsing and retrieving previously run Analysis jobs
Examples:
get_analysis(job_id)
Restore a previously run Analysis object by ID
Parameters:
Name | Type | Description | Default |
---|---|---|---|
job_id | str | Analysis job ID, can be retrieved with list_jobs() method | required |
list_jobs(since=None, date=None, name=None)
List user analysis job IDs. Can optionally be filtered by relative time, date, or analysis name.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
since | Optional[str] | From what time to begin displaying jobs. By default, all jobs will be displayed. The value provided can be a relative time. To filter, provide a number and a single unit. Supported units include: m (minutes), h (hours), d (days), w (weeks). | None |
date | Optional[str] | Filter jobs to this date, e.g. '2023-04-28' or '2023-05' | None |
name | Optional[str] | Filter jobs by analysis name value (case-insensitive) | None |
Returns:
Type | Description |
---|---|
List[str] | List of analysis job IDs |
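The two example values for date ('2023-04-28' and '2023-05') suggest that a partial date selects a whole month. One way to picture this, assuming the filter behaves like a prefix match on the ISO-formatted job date, is the sketch below. matches_date_filter is a hypothetical helper and the prefix-match behaviour is an assumption, not a documented guarantee.

```python
from datetime import date


def matches_date_filter(job_date, date_filter):
    """Hypothetical helper: prefix match against the ISO date (assumption)."""
    return job_date.isoformat().startswith(date_filter)


# Illustrative job dates only; these are not real job records.
job_dates = [date(2023, 4, 28), date(2023, 5, 2), date(2023, 5, 17)]

# A full date selects a single day, a year-month selects the whole month.
on_day = [d for d in job_dates if matches_date_filter(d, "2023-04-28")]
in_month = [d for d in job_dates if matches_date_filter(d, "2023-05")]
```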
gencove_explorer.models.File
dataclass
Object for interacting with local or remote files.
Please see the File docs for more information.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
path_s3 | InitVar[str] | S3 path to related file | None |
name | InitVar[str] | Unique name for file | None |
url | InitVar[str] | Remote URL for file | None |
path_local | InitVar[Path | str] | Local path for file | None |
org_shared | InitVar[bool] | Share file across the organization | False |
upload(force=False)
Method to upload file to user S3 storage.
Notes
- Requires that self.path_local is set
- Optionally, self.name can be set
By default, will raise an exception if the supplied name is already present in the user prefix. This behaviour can be overridden with the force parameter.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
force | bool | If True, will ignore name check and can overwrite an existing file | False |
Returns:
Type | Description |
---|---|
File | self File object |
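The name check and the force override described for upload can be sketched as follows. This is an illustration under stated assumptions, not the SDK's code: check_upload_name and existing_names are hypothetical, a plain set stands in for the names already present in the user prefix, and FileExistsError is an assumed exception type since the documentation does not say which exception is raised.

```python
def check_upload_name(name, existing_names, force=False):
    """Hypothetical helper: reject duplicate names unless force is set."""
    if name in existing_names and not force:
        # Exception type is an assumption; the SDK may raise something else.
        raise FileExistsError(f"{name!r} already exists; pass force=True to overwrite")
    return name


# Stand-in for names already present in the user prefix.
existing = {"sample.vcf.gz", "report.html"}

check_upload_name("new-file.txt", existing)  # new name: allowed
check_upload_name("report.html", existing, force=True)  # overwrite allowed
```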
as_local(path_local=None, force=False)
Either retrieves a file from remote storage and downloads to local storage, or generates an empty path that can be written to.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
path_local | Optional[Path | str] | Path to download file to | None |
force | bool | Overwrite file at path_local if it already exists | False |
Returns:
Type | Description |
---|---|
Optional[Path] | Path to file |
as_url()
If file is available remotely, generates a URL to access the file contents
Returns:
Type | Description |
---|---|
str | URL as string |
gencove_explorer
Gencove Explorer package.
gencove_explorer.s3_path_user()
Returns user's private S3 path.
Data in this path is not accessible to other members of the organization or to anyone outside it.
Returns:
Type | Description |
---|---|
str | S3 path of the User |
gencove_explorer.s3_path_shared_org()
Returns organization's shared S3 path.
Data that is here can be accessed by all members of the organization.
Returns:
Type | Description |
---|---|
str | S3 path of the Organization |