Aslprep on Biowulf
Aslprep is an application for preprocessing arterial spin labeling (ASL) data and computing cerebral blood flow (CBF). It is a pipeline that uses AFNI, FSL, ANTs, and FreeSurfer.
References:
- Adebimpe A, et al. ASLPrep: A Generalizable Platform for Processing of Arterial Spin Labeled MRI and Quantification of Regional Brain Perfusion. bioRxiv 2021.05.20.444998
Web site
Documentation
Important Notes
- aslprep on Biowulf is a Singularity container built directly from docker://pennlink/aslprep. However, users do not need to execute singularity directly or bind directories manually; a wrapper shell script is provided that takes care of both.
- Module Name: aslprep (see the modules page for more information)
- aslprep is a scientific pipeline with the potential to overload Biowulf's central filesystem. To avoid filesystem issues we recommend the following:
- Limit I/O to Biowulf's central filesystems (e.g., /data, /home) by using local disk (/lscratch). You can keep temporary working directories on local disk by pointing aslprep's work directory (-w option) to /lscratch/$SLURM_JOB_ID (remember to allocate enough space in /lscratch).
- Limit memory usage. Once you have determined how much memory your jobs need (by benchmarking; see the profiling/benchmarking note at the end of this list), it is still a good idea to cap the memory aslprep uses with the --mem option. When allocating memory for your jobs, request some extra headroom beyond what benchmarking suggests.
- Limit multi-threading. Biowulf's aslprep is installed as a container, and some of the applications inside it will use as many threads as there are CPUs on the node unless you explicitly limit them. You can cap the total number of threads aslprep uses across all processes with the --nthreads flag.
- Opt out of uploading aslprep metrics to the aslprep website. You can disable this submission with the --notrack flag.
- Use the --stop-on-first-crash flag to force a stop if issues occur.
- Profile/benchmark aslprep jobs: We recommend making sure a given aslprep job can scale before launching a large number of jobs. Do this by profiling/benchmarking small jobs (e.g., a swarm of 3 aslprep commands), then monitoring them with the user dashboard or with the commands jobhist, sjobs, squeue, and jobload (see Biowulf utilities and the sketch after this list). The HPC staff have prepared resources on how to monitor your jobs for benchmarking and profiling (video and slides). Once you have profiled a swarm of a few jobs and refined the memory, CPU, and walltime requirements, it should be safe to (gradually) increase the number of aslprep jobs. For many analysis pipelines there is no way to know in advance how much memory or how many CPUs will actually be required in an HPC environment, which is why profiling/benchmarking is so important.
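As a minimal sketch of this benchmarking step (the job ID 12345678 is a placeholder for one of your own test jobs, and exact option syntax may differ slightly; see the Biowulf utilities documentation for details):

squeue -u $USER      # check whether the test jobs are still queued or running
jobload -u $USER     # CPU and memory load of your currently running jobs
jobhist 12345678     # after completion: walltime, CPU usage, and peak memory of the job

Compare the peak memory and CPU usage reported by these commands with what you requested, then adjust --mem, --nthreads, and the sbatch/swarm allocations before scaling up.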
Interactive job
Interactive jobs should be used for debugging, graphics, or applications that cannot be run as batch jobs.
Allocate an interactive session and run the program.
Sample session (user input in bold):
[user@biowulf]$ sinteractive
salloc.exe: Pending job allocation 46116226
salloc.exe: job 46116226 queued and waiting for resources
salloc.exe: job 46116226 has been allocated resources
salloc.exe: Granted job allocation 46116226
salloc.exe: Waiting for resource configuration
salloc.exe: Nodes cn3144 are ready for job

[user@cn3144 ~]$ module load aslprep

[user@cn3144 ~]$ aslprep --help
usage: aslprep [-h] [--version] [--skip_bids_validation]
               [--participant-label PARTICIPANT_LABEL [PARTICIPANT_LABEL ...]]
               [--echo-idx ECHO_IDX] [--bids-filter-file FILE]
               [--anat-derivatives PATH] [--nprocs NPROCS]
               [--omp-nthreads OMP_NTHREADS] [--mem MEMORY_GB] [--low-mem]
               [--use-plugin FILE] [--anat-only] [--boilerplate_only]
               [--md-only-boilerplate] [-v]
               [--ignore {fieldmaps,slicetiming,sbref} [{fieldmaps,slicetiming,sbref} ...]]
               [--longitudinal]
               [--output-spaces [OUTPUT_SPACES [OUTPUT_SPACES ...]]]
               [--asl2t1w-init {register,header}] [--asl2t1w-dof {6,9,12}]
               [--force-bbr] [--force-no-bbr] [--m0_scale M0_SCALE]
               [--random-seed RANDOM_SEED] [--dummy-vols DUMMY_VOLS]
               [--smooth_kernel SMOOTH_KERNEL] [--scorescrub] [--basil]
               [--skull-strip-template SKULL_STRIP_TEMPLATE]
               [--skull-strip-fixed-seed]
               [--skull-strip-t1w {auto,skip,force}] [--fmap-bspline]
               [--fmap-no-demean] [--use-syn-sdc] [--force-syn]
               [--fs-license-file FILE] [-w WORK_DIR] [--clean-workdir]
               [--resource-monitor] [--reports-only] [--run-uuid RUN_UUID]
               [--write-graph] [--stop-on-first-crash] [--notrack] [--sloppy]
               bids_dir output_dir {participant}

ASLPrep: ASL PREProcessing workflows v0.2.7

[user@cn3144 ~]$ exit
salloc.exe: Relinquishing job allocation 46116226
[user@biowulf ~]$
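To go beyond printing the help text and run aslprep interactively on a single subject, request local scratch space, memory, and CPUs when allocating the session. A minimal sketch, reusing the dataset path and subject label assumed in the batch example below (adjust both to your own data):

[user@biowulf]$ sinteractive --gres=lscratch:100 --mem=32g --cpus-per-task=16
[user@cn3144 ~]$ module load aslprep
[user@cn3144 ~]$ aslprep /data/${USER}/BIDS-dataset/asl001 /data/${USER}/BIDS-dataset/aslprep.out.ds001 \
                     participant --participant_label sub-01 -w /lscratch/$SLURM_JOB_ID \
                     --notrack --nthreads $SLURM_CPUS_PER_TASK --stop-on-first-crash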
Batch job
Most jobs should be run as batch jobs.
Create a batch input file (e.g. aslprep.sh). For example:
#!/bin/bash
# sbatch --gres=lscratch:100 --mem=32g --cpus-per-task=48 --time=72:00:00 aslprep.sh

set -o pipefail
set -e

function fail {
    echo "FAIL: $@" >&2
    exit 1  # signal failure
}

module load aslprep

aslprep /data/${USER}/BIDS-dataset/asl001 /data/${USER}/BIDS-dataset/aslprep.out.ds001 \
    participant --participant_label sub-01 -w /lscratch/$SLURM_JOB_ID \
    --notrack --nthreads $SLURM_CPUS_PER_TASK --mem_mb $SLURM_MEM_PER_NODE \
    --stop-on-first-crash
Submit this job using the Slurm sbatch command.
sbatch [--gres=lscratch:#] [--cpus-per-task=#] [--mem=#] aslprep.sh
Swarm of Jobs
A swarm of jobs is an easy way to submit a set of independent commands requiring identical resources.
Create a swarmfile (e.g. aslprep.swarm). For example:
export TMPDIR=/lscratch/$SLURM_JOB_ID; \
    mkdir -p $TMPDIR/out; \
    mkdir -p $TMPDIR/wrk; \
    aslprep /data/${USER}/BIDS-dataset/asl001/ $TMPDIR/out \
        participant --participant_label sub-01 -w $TMPDIR/wrk \
        --notrack --nthreads $SLURM_CPUS_PER_TASK \
        --mem_mb $SLURM_MEM_PER_NODE \
        --stop-on-first-crash; \
    mv $TMPDIR/out /data/${USER}/BIDS-dataset/aslprep.out.s001
export TMPDIR=/lscratch/$SLURM_JOB_ID; \
    mkdir -p $TMPDIR/out; \
    mkdir -p $TMPDIR/wrk; \
    aslprep /data/${USER}/BIDS-dataset/asl001/ $TMPDIR/out \
        participant --participant_label sub-02 -w $TMPDIR/wrk \
        --notrack --nthreads $SLURM_CPUS_PER_TASK \
        --mem_mb $SLURM_MEM_PER_NODE \
        --stop-on-first-crash; \
    mv $TMPDIR/out /data/${USER}/BIDS-dataset/aslprep.out.s002
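For datasets with many subjects, the swarm file does not need to be written by hand. The following is a sketch of a small generator script; the dataset path and the subject directory naming (sub-01, sub-02, ...) are assumptions carried over from the example above and should be adjusted to your own data:

#!/bin/bash
# generate one aslprep swarm command per subject directory in the BIDS dataset
bids=/data/${USER}/BIDS-dataset/asl001      # assumed dataset location
> aslprep.swarm                              # start with an empty swarm file
for subdir in ${bids}/sub-*/; do
    sub=$(basename ${subdir})                # e.g. sub-01
    id=${sub#sub-}                           # e.g. 01
    echo "export TMPDIR=/lscratch/\$SLURM_JOB_ID; \\
    mkdir -p \$TMPDIR/out; \\
    mkdir -p \$TMPDIR/wrk; \\
    aslprep ${bids}/ \$TMPDIR/out \\
        participant --participant_label ${sub} -w \$TMPDIR/wrk \\
        --notrack --nthreads \$SLURM_CPUS_PER_TASK \\
        --mem_mb \$SLURM_MEM_PER_NODE \\
        --stop-on-first-crash; \\
    mv \$TMPDIR/out /data/${USER}/BIDS-dataset/aslprep.out.s${id}" >> aslprep.swarm
done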
Submit this job using the swarm command.
swarm -f aslprep.swarm [--gres=lscratch:#] [-g #] [-t #] --module aslprep
where
--gres=lscratch:# | Number of Gigabytes of local disk space allocated per process (1 line in the swarm command file) |
-g # | Number of Gigabytes of memory required for each process (1 line in the swarm command file) |
-t # | Number of threads/CPUs required for each process (1 line in the swarm command file). |
--module aslprep | Loads the aslprep module for each subjob in the swarm |