Biowulf High Performance Computing at the NIH
ICGC Score Client on Helix & Biowulf

The Score client replaces the ICGC storage client. It is required to access data in cloud repositories such as AWS and Collaboratory.

Important Notes

The Score client's mount command cannot be used on the HPC systems.

Interactive use
Interactive data transfers are best run on Helix.

Helix has a direct connection to the internet and does not go through one of the HPC proxy servers.
Sample session (user input follows the [user@helix ~]$ prompt):

[user@helix ~]$ module load score-client
[+] Loading java 1.8.0_181  ... 
[+] Loading score-client, version 1.6.1... 
[user@helix ~]$ score-client --profile collab download --object-id 6d89e978-34f6-5074-b30e-01b7203fcbb3 --output-dir /data/$USER/collab-data
Downloading...
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
100% [##################################################]  Parts: 208/208, Checksum: 100%, Write/sec: 47.7M/s, Read/sec: 48.1M/s
Finalizing...
Total execution time:         1.301 h

Total bytes read    : 225,224,593,891
Total bytes written : 223,331,450,672
[user@helix ~]$

Batch job
Most jobs should be run as batch jobs.

Create a batch input file (e.g. score-client.sh). For example:

#!/bin/bash
set -e
module load score-client

score-client --profile collab download --object-id 6d89e978-34f6-5074-b30e-01b7203fcbb3 --output-dir /data/$USER/collab

Submit this job using the Slurm sbatch command.

sbatch [--cpus-per-task=#] [--mem=#] score-client.sh
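
The resource flags can also be kept in the script itself as #SBATCH directives, so that sbatch can be called with no options. The following is a minimal sketch, not a recommended configuration: write_batch_script is a hypothetical helper, and the 2 CPUs and 4 GB of memory are placeholder values, not measured requirements for score-client.

```shell
#!/bin/bash
# Sketch: write a Slurm batch script with resources embedded as #SBATCH lines.
# write_batch_script and the resource values below are illustrative assumptions.

write_batch_script() {
    local object_id=$1   # ICGC object ID to download
    local out_dir=$2     # directory passed to --output-dir
    local script=$3      # path of the batch script to create

    # Unquoted heredoc delimiter: $object_id and $out_dir are expanded now,
    # so the generated script contains the literal values.
    cat > "$script" <<EOF
#!/bin/bash
#SBATCH --cpus-per-task=2
#SBATCH --mem=4g
set -e
module load score-client

score-client --profile collab download --object-id $object_id --output-dir "$out_dir"
EOF
}

# Example usage (commented out so the sketch has no side effects):
# write_batch_script 6d89e978-34f6-5074-b30e-01b7203fcbb3 /data/$USER/collab score-client.sh
# sbatch score-client.sh
```

Options given on the sbatch command line take precedence over #SBATCH directives in the script, so the embedded values act as defaults.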

Swarm of Jobs
A swarm of jobs is an easy way to submit a set of independent commands requiring identical resources.

Create a swarmfile (e.g. score-client.swarm). For example:

score-client --profile collab view --object-id ddcdd044-adda-5f09-8849-27d6038f8ccd --query 1:1-10000
score-client --profile collab download --object-id ddcdd044-adda-5f09-8849-27d6038f8ccd 5cc35183-9291-5711-967d-30afcf20e71f --output-dir data

Submit this job using the swarm command.

swarm -f score-client.swarm [-g #] [-t #] --module score-client
where

-g #                   Number of gigabytes of memory required for each process (1 line in the swarm command file)
-t #                   Number of threads/CPUs required for each process (1 line in the swarm command file)
--module score-client  Loads the ICGC score-client module for each subjob in the swarm
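
When many objects need to be fetched, the swarmfile can be generated from a list of object IDs instead of being written by hand. This is a minimal sketch, assuming a plain-text file with one object ID per line; make_swarmfile, ids.txt, and the output directory are illustrative names and are not part of score-client or swarm.

```shell
#!/bin/bash
# Sketch: print one score-client download command per object ID.
# make_swarmfile and its arguments are hypothetical names for illustration.

make_swarmfile() {
    local id_file=$1   # text file with one object ID per line
    local out_dir=$2   # directory passed to --output-dir
    local id

    # "|| [[ -n $id ]]" keeps a final line that lacks a trailing newline.
    while IFS= read -r id || [[ -n $id ]]; do
        # Skip blank lines and lines starting with '#'.
        if [[ -z $id || $id == \#* ]]; then
            continue
        fi
        printf 'score-client --profile collab download --object-id %s --output-dir %s\n' \
            "$id" "$out_dir"
    done < "$id_file"
}

# Example usage (commented out so the sketch has no side effects):
# make_swarmfile ids.txt /data/$USER/collab > score-client.swarm
# swarm -f score-client.swarm --module score-client
```

Each generated line becomes an independent subjob, so one failed download does not stop the others.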