Biowulf High Performance Computing at the NIH
gen3-client on Helix

The gen3-client provides an easy-to-use command-line interface for uploading and downloading data files to and from a Gen3 data commons. One example of a Gen3 data commons portal is NCI's Proteomic Data Commons.

Running Interactively on Helix

Helix is a dedicated interactive data transfer node, so if you wish to use the gen3-client interactively, we recommend doing so on Helix. Sample session (user input follows the `$` prompt):

[user@helix ~]$ module load gen3

[user@helix ~]$ gen3-client upload --profile=demo --upload-path=test.txt 
2020/05/18 12:45:41 Finish parsing all file paths for "~/test.txt"

The following file(s) has been found in path "~/test.txt" and will be uploaded:
	~/test.txt

2020/05/18 12:45:41 Uploading data ...
test.txt  25 B / 25 B [===================================================================================] 100.00% 0s
2020/05/18 12:45:41 Successfully uploaded file "~/test.txt" to GUID 1a82043e-02ec-4974-a803-7c0fd33ecfd7.
2020/05/18 12:45:41 Local succeeded log file updated


Submission Results
Finished with 0 retries | 1
Finished with 1 retry   | 0
Finished with 2 retries | 0
Finished with 3 retries | 0
Finished with 4 retries | 0
Finished with 5 retries | 0
Failed                  | 0
TOTAL                   | 1
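
When uploads are scripted rather than typed interactively, it can help to check the "Failed" row of the summary automatically. The following is a minimal sketch, assuming the summary text has already been captured (for example, to a file or variable) in the same layout as the sample session above. The helper name `failed_count` is hypothetical and is not part of gen3-client.

```shell
#!/bin/bash
# Hypothetical helper (not part of gen3-client): read a submission summary
# on standard input and print the number from its "Failed" row.
failed_count() {
    awk -F'|' '
        /^Failed *\|/ { gsub(/[ \t]/, "", $2); print $2; found = 1 }
        END { if (!found) exit 1 }   # non-zero status if no "Failed" row was seen
    '
}

# Demonstration using rows copied from the sample session above.
summary='Finished with 0 retries | 1
Failed                  | 0
TOTAL                   | 1'

n=$(printf '%s\n' "$summary" | failed_count)
echo "failed uploads: $n"
```

A script could then stop with an error when `$n` is not 0. Whether the client writes this summary to standard output or standard error is not stated here, so verify that before relying on a capture.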

Batch job

You can put your gen3-client commands in a batch script and submit it from Biowulf. For example, create a batch input file (e.g. gen3.sh) like so:

#!/bin/bash
set -e
module load gen3
gen3-client download-single --profile=demo --guid=00149bcf-e057-4ecc-b22d-53648ae0b35f --no-prompt --skip-completed

Submit this job using the Slurm sbatch command.

sbatch [--cpus-per-task=#] [--mem=#] gen3.sh
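
As a concrete illustration of the optional flags, the sketch below fills in the placeholders and prints the resulting command line instead of running it. The values 2 CPUs and 4 GB are assumptions chosen for illustration only, not recommendations for gen3-client transfers.

```shell
#!/bin/bash
# Illustrative resource values (assumptions); adjust them to the job's needs.
cpus=2
mem=4g

# Build the submission command and print it for review.
# To actually submit on Biowulf, run the printed line.
cmd="sbatch --cpus-per-task=$cpus --mem=$mem gen3.sh"
echo "$cmd"
```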

Swarm of Jobs

A swarm of jobs is an easy way to submit a set of independent commands requiring identical resources.

Create a swarmfile (e.g. gen3.swarm). For example:

gen3-client download-single --profile=demo --guid=8f4481b1-c8c9-4db2-a48d-ab0c1097a950 --no-prompt --skip-completed
gen3-client download-single --profile=demo --guid=494a69be-4295-46bd-aa48-4dfedbb6f545 --no-prompt --skip-completed
gen3-client download-single --profile=demo --guid=5760f226-cf57-4025-9995-84f4d827c082 --no-prompt --skip-completed
gen3-client download-single --profile=demo --guid=8398df6a-63eb-4946-acb7-cc6f31b97915 --no-prompt --skip-completed
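
Because the lines of the swarmfile above differ only in their GUID, the file can be generated from a plain list of GUIDs. The following is a minimal sketch; the input file name `guids.txt` and the helper name `make_swarmfile` are hypothetical, and the command it writes is the same `download-single` line shown above.

```shell
#!/bin/bash
# Hypothetical helper: print one download command per GUID listed in a file.
#   $1 = file with one GUID per line (blank lines and '#' comments are skipped)
#   $2 = gen3-client profile name
make_swarmfile() {
    while IFS= read -r guid || [ -n "$guid" ]; do
        case $guid in
            ''|'#'*) continue ;;
        esac
        printf 'gen3-client download-single --profile=%s --guid=%s --no-prompt --skip-completed\n' \
            "$2" "$guid"
    done < "$1"
}

# Example input: two of the GUIDs from the swarmfile above.
printf '%s\n' \
    8f4481b1-c8c9-4db2-a48d-ab0c1097a950 \
    494a69be-4295-46bd-aa48-4dfedbb6f545 > guids.txt

make_swarmfile guids.txt demo > gen3.swarm
cat gen3.swarm
```

The generated gen3.swarm can then be submitted in the same way as a hand-written swarmfile.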

Submit this job using the swarm command.

swarm -f gen3.swarm [-g #] [-t #] --module gen3

where

-g #            Number of gigabytes of memory required for each process (1 line in the swarm command file)
-t #            Number of threads/CPUs required for each process (1 line in the swarm command file)
--module gen3   Loads the gen3 module for each subjob in the swarm