The gen3-client provides an easy-to-use command-line interface for uploading data files to, and downloading them from, a Gen3 data commons. One example of a Gen3 data commons portal is NCI's Proteomic Data Commons.
- Module Name: gen3 (see the modules page for more information)
- gen3-client cannot be used on Biowulf
- The client requires an API key, which the user must set up with the target Gen3 data commons portal before uploading or downloading data. Running gen3-client configure stores this key as a named profile (such as "demo", used in the examples below). See this documentation for more information.
Helix is a dedicated interactive data transfer node, so if you wish to use the gen3-client interactively, it is recommended that you do so on Helix. Sample session (user input in bold):
[user@helix ~]$ module load gen3
[user@helix ~]$ gen3-client upload --profile=demo --upload-path=test.txt
2020/05/18 12:45:41 Finish parsing all file paths for "~/test.txt"
The following file(s) has been found in path "~/test.txt" and will be uploaded:
~test.txt
2020/05/18 12:45:41 Uploading data ...
test.txt 25 B / 25 B [===================================================================================] 100.00% 0s
2020/05/18 12:45:41 Successfully uploaded file "~/test.txt" to GUID 1a82043e-02ec-4974-a803-7c0fd33ecfd7.
2020/05/18 12:45:41 Local succeeded log file updated
Submission Results
Finished with 0 retries | 1
Finished with 1 retry | 0
Finished with 2 retries | 0
Finished with 3 retries | 0
Finished with 4 retries | 0
Finished with 5 retries | 0
Failed | 0
TOTAL | 1
You can include your gen3-client commands in a batch script and submit it from Biowulf. For example, create a batch input file (e.g. gen3.sh) like this:
#!/bin/bash
set -e
module load gen3
gen3-client download-single --profile=demo --guid=00149bcf-e057-4ecc-b22d-53648ae0b35f --no-prompt --skip-completed
Submit this job using the Slurm sbatch command.
sbatch [--mem=#] gen3.sh
You can also download multiple files using a manifest file generated from a Gen3 portal. The download can use multiple CPUs in parallel through the --numparallel option. For example:
#!/bin/bash
set -e
module load gen3
gen3-client download-multiple \
--profile=demo \
--manifest=manifest.json \
--no-prompt \
--skip-completed \
--numparallel $SLURM_CPUS_PER_TASK
Submit this job using the Slurm sbatch command.
sbatch [--cpus-per-task=#] [--mem=#] gen3.sh
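The batch script above passes $SLURM_CPUS_PER_TASK straight to --numparallel. In stock Slurm that variable is only set when --cpus-per-task is given, so a job submitted without it would hand --numparallel an empty value. The following is a minimal sketch of one way to guard against that, assuming a fallback of 1 is acceptable. The variable names numparallel and cmd are illustrative, and the command is printed rather than executed so the sketch can be tried outside a job.

```shell
#!/bin/bash
set -e

# Assumption: fall back to a single parallel download when Slurm did not
# export SLURM_CPUS_PER_TASK (i.e. --cpus-per-task was not given to sbatch).
numparallel="${SLURM_CPUS_PER_TASK:-1}"

# Build the command as an array so every argument stays a single word.
# The flags are the same ones used in the batch script above.
cmd=(gen3-client download-multiple
     --profile=demo
     --manifest=manifest.json
     --no-prompt
     --skip-completed
     --numparallel "$numparallel")

# Dry run: print the command. In a real job, run it with "${cmd[@]}" instead.
printf '%s\n' "${cmd[*]}"
```

In gen3.sh itself, the same effect can be had by replacing $SLURM_CPUS_PER_TASK with "${SLURM_CPUS_PER_TASK:-1}".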
Create a swarmfile (e.g. gen3.swarm). For example:
gen3-client download-single --profile=demo --guid=8f4481b1-c8c9-4db2-a48d-ab0c1097a950 --no-prompt --skip-completed
gen3-client download-single --profile=demo --guid=494a69be-4295-46bd-aa48-4dfedbb6f545 --no-prompt --skip-completed
gen3-client download-single --profile=demo --guid=5760f226-cf57-4025-9995-84f4d827c082 --no-prompt --skip-completed
gen3-client download-single --profile=demo --guid=8398df6a-63eb-4946-acb7-cc6f31b97915 --no-prompt --skip-completed
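For more than a handful of files, a swarm file like the one above can be generated rather than typed by hand. The sketch below assumes a plain text file with one GUID per line. The file name guids.txt and the helper make_swarmfile are hypothetical and not part of gen3-client. Each output line is the same download-single command shown above.

```shell
#!/bin/bash
set -e

# make_swarmfile GUID_LIST SWARMFILE [PROFILE]
# Hypothetical helper: writes one download-single command per GUID.
make_swarmfile() {
    local guid_list=$1 swarmfile=$2 profile=${3:-demo}
    local guid
    : > "$swarmfile"    # start from an empty swarm file
    while IFS= read -r guid || [ -n "$guid" ]; do
        # Skip blank lines and lines starting with '#'
        case "$guid" in ''|\#*) continue ;; esac
        printf 'gen3-client download-single --profile=%s --guid=%s --no-prompt --skip-completed\n' \
            "$profile" "$guid" >> "$swarmfile"
    done < "$guid_list"
}

# Example input: two of the GUIDs from the swarm file above.
printf '%s\n' \
    8f4481b1-c8c9-4db2-a48d-ab0c1097a950 \
    494a69be-4295-46bd-aa48-4dfedbb6f545 > guids.txt

make_swarmfile guids.txt gen3.swarm demo
cat gen3.swarm
```

The resulting gen3.swarm can be submitted exactly like a hand-written one.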
Submit this job using the swarm command.
swarm -f gen3.swarm [-g #] [-t #] --module gen3
where
-g # | Number of gigabytes of memory required for each process (1 line in the swarm command file)
-t # | Number of threads/CPUs required for each process (1 line in the swarm command file)
--module gen3 | Loads the gen3 module for each subjob in the swarm