High-Performance Computing at the NIH
TORTOISE on Biowulf

TORTOISE (Tolerably Obsessive Registration and Tensor Optimization Indolent Software Ensemble) is a package for processing diffusion MRI data. It contains three main modules: DIFF_PREP (preprocessing, motion and eddy-current distortion correction), DR_BUDDI (EPI distortion correction using blip-up/blip-down acquisitions), and DIFF_CALC (tensor estimation and derived maps such as FA, TR, and DEC).


Process a single subject with TORTOISE

The following example shows one method of processing a single subject. The example below is courtesy of M. Okan Irfanoglu (NIBIB).

The data in this example is from the Human Connectome Project. A copy of this data is maintained on Biowulf; to get access, please contact Adam Thomas, NIMH (adamt@nih.gov).

The read-only Connectome data needs to be copied to the user's data directory, gunzipped, and processed with TORTOISE. In the example commands below, the following variables are assumed to be set:

dir=/data/$USER/TORTOISE    (top level directory for data)
subject=######              (subject, e.g. 100206)

Copy the data and gunzip it:

cp /data/HCP/HCP_900/s3/hcp/${subject}/unprocessed/3T/Diffusion/${subject}_3T_DWI_dir95_LR.nii.gz ${dir}/${subject}/
cp /data/HCP/HCP_900/s3/hcp/${subject}/unprocessed/3T/Diffusion/${subject}_3T_DWI_dir95_LR.bvec  ${dir}/${subject}/
cp /data/HCP/HCP_900/s3/hcp/${subject}/unprocessed/3T/Diffusion/${subject}_3T_DWI_dir95_LR.bval  ${dir}/${subject}/
gunzip ${dir}/${subject}/${subject}_3T_DWI_dir95_LR.nii.gz
cp /data/HCP/HCP_900/s3/hcp/${subject}/unprocessed/3T/Diffusion/${subject}_3T_DWI_dir95_RL.nii.gz ${dir}/${subject}/
cp /data/HCP/HCP_900/s3/hcp/${subject}/unprocessed/3T/Diffusion/${subject}_3T_DWI_dir95_RL.bvec  ${dir}/${subject}/
cp /data/HCP/HCP_900/s3/hcp/${subject}/unprocessed/3T/Diffusion/${subject}_3T_DWI_dir95_RL.bval  ${dir}/${subject}/
gunzip ${dir}/${subject}/${subject}_3T_DWI_dir95_RL.nii.gz
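The repeated cp/gunzip commands above can also be written as a loop over the two phase-encoding directions. The sketch below uses temporary stand-in directories (SRC, DST) populated with dummy files so it is safe to run anywhere; in real use, SRC would be the read-only HCP Diffusion directory and DST would be ${dir}/${subject}.

```shell
# Stand-in directories with dummy inputs; substitute the real HCP path
# and ${dir}/${subject} in practice.
SRC=$(mktemp -d); DST=$(mktemp -d); subject=100206
for pe in LR RL; do
    printf 'dwi' | gzip > ${SRC}/${subject}_3T_DWI_dir95_${pe}.nii.gz
    touch ${SRC}/${subject}_3T_DWI_dir95_${pe}.bvec \
          ${SRC}/${subject}_3T_DWI_dir95_${pe}.bval
done

# The actual copy-and-gunzip step, one iteration per phase-encoding direction:
for pe in LR RL; do
    for ext in nii.gz bvec bval; do
        cp ${SRC}/${subject}_3T_DWI_dir95_${pe}.${ext} ${DST}/
    done
    gunzip ${DST}/${subject}_3T_DWI_dir95_${pe}.nii.gz
done
ls ${DST}
```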
Import the data:

ImportNIFTI -i ${dir}/${subject}/${subject}_3T_DWI_dir95_RL.nii -p horizontal \
      --bvals ${dir}/${subject}/${subject}_3T_DWI_dir95_RL.bval \
      --bvecs ${dir}/${subject}/${subject}_3T_DWI_dir95_RL.bvec --small_delta 10.6 --big_delta 43.1
ImportNIFTI -i ${dir}/${subject}/${subject}_3T_DWI_dir95_LR.nii -p horizontal \
      --bvals ${dir}/${subject}/${subject}_3T_DWI_dir95_LR.bval \
      --bvecs ${dir}/${subject}/${subject}_3T_DWI_dir95_LR.bvec --small_delta 10.6 --big_delta 43.1
       

Remove temp files:

rm ${dir}/${subject}/${subject}_3T_DWI_dir95_LR.nii
rm ${dir}/${subject}/${subject}_3T_DWI_dir95_LR.bval
rm ${dir}/${subject}/${subject}_3T_DWI_dir95_LR.bvec
rm ${dir}/${subject}/${subject}_3T_DWI_dir95_RL.nii
rm ${dir}/${subject}/${subject}_3T_DWI_dir95_RL.bval
rm ${dir}/${subject}/${subject}_3T_DWI_dir95_RL.bvec

Run DIFF_PREP:

DIFFPREP -i ${dir}/${subject}/${subject}_3T_DWI_dir95_RL_proc/${subject}_3T_DWI_dir95_RL.list \
      -s ${dir}/${subject}/${subject}_T2W_structural.nii.gz --will_be_drbuddied 1 --do_QC 0
DIFFPREP -i ${dir}/${subject}/${subject}_3T_DWI_dir95_LR_proc/${subject}_3T_DWI_dir95_LR.list \
      -s ${dir}/${subject}/${subject}_T2W_structural.nii.gz --will_be_drbuddied 1 --do_QC 0
      

Run DR_BUDDI:

DR_BUDDI_withoutGUI \
      --up_data ${dir}/${subject}/${subject}_3T_DWI_dir95_RL_proc/${subject}_3T_DWI_dir95_RL_proc.list \
      --down_data ${dir}/${subject}/${subject}_3T_DWI_dir95_LR_proc/${subject}_3T_DWI_dir95_LR_proc.list \
      ${dir}/${subject}/${subject}_T2W_structural.nii.gz -g 1

Run DIFF_CALC:

EstimateTensorWLLS \
      -i ${dir}/${subject}/${subject}_3T_DWI_dir95_RL_proc_DRBUDDI_proc/${subject}_RL_proc_DRBUDDI_final.list
ComputeFAMap \
      ${dir}/${subject}/${subject}_3T_DWI_dir95_RL_proc_DRBUDDI_proc/${subject}_RL_proc_DRBUDDI_final_L0_DT.nii
ComputeTRMap \
      ${dir}/${subject}/${subject}_3T_DWI_dir95_RL_proc_DRBUDDI_proc/${subject}_RL_proc_DRBUDDI_final_L0_DT.nii
ComputeDECMap \
      ${dir}/${subject}/${subject}_3T_DWI_dir95_RL_proc_DRBUDDI_proc/${subject}_RL_proc_DRBUDDI_final_L0_DT.nii

All these commands can be put into a single script that runs them sequentially. A more efficient way to use Biowulf's resources is to parallelize the independent commands so that they run simultaneously. In addition, all of the subjects can be processed simultaneously in separate pipeline runs, as in the next example.

Process several subjects with TORTOISE

The following script takes the subject ID as an input argument, sets up all the dependent subjobs, and submits them. The script can be used to process a large number of subjects:

TORTOISE pipeline script

#!/bin/bash

dir=/data/$USER/TORTOISE
cd $dir

if [ $# -ne 1 ]; then
     echo "*** ERROR *** No subject provided"
     echo "Usage: process.sh subject  (e.g. process.sh 100206)"
     exit 1;
fi

subject=$1
echo "Processing subject ${subject}"
mkdir -p ${dir}/${subject}
module load TORTOISE

#------------- copy the data -----------------------------
echo "Copying data..."
cp /data/HCP/HCP_900/s3/hcp/${subject}/unprocessed/3T/Diffusion/${subject}_3T_DWI_dir95_LR.nii.gz ${dir}/${subject}/
cp /data/HCP/HCP_900/s3/hcp/${subject}/unprocessed/3T/Diffusion/${subject}_3T_DWI_dir95_LR.bvec  ${dir}/${subject}/
cp /data/HCP/HCP_900/s3/hcp/${subject}/unprocessed/3T/Diffusion/${subject}_3T_DWI_dir95_LR.bval  ${dir}/${subject}/
gunzip ${dir}/${subject}/${subject}_3T_DWI_dir95_LR.nii.gz
cp /data/HCP/HCP_900/s3/hcp/${subject}/unprocessed/3T/Diffusion/${subject}_3T_DWI_dir95_RL.nii.gz ${dir}/${subject}/
cp /data/HCP/HCP_900/s3/hcp/${subject}/unprocessed/3T/Diffusion/${subject}_3T_DWI_dir95_RL.bvec  ${dir}/${subject}/
cp /data/HCP/HCP_900/s3/hcp/${subject}/unprocessed/3T/Diffusion/${subject}_3T_DWI_dir95_RL.bval  ${dir}/${subject}/
gunzip ${dir}/${subject}/${subject}_3T_DWI_dir95_RL.nii.gz
      
#------------- import the data -----------------------------
echo "Importing...."
# import
ImportNIFTI -i ${dir}/${subject}/${subject}_3T_DWI_dir95_RL.nii -p horizontal \
      --bvals ${dir}/${subject}/${subject}_3T_DWI_dir95_RL.bval \
      --bvecs ${dir}/${subject}/${subject}_3T_DWI_dir95_RL.bvec --small_delta 10.6 --big_delta 43.1
ImportNIFTI -i ${dir}/${subject}/${subject}_3T_DWI_dir95_LR.nii -p horizontal \
      --bvals ${dir}/${subject}/${subject}_3T_DWI_dir95_LR.bval \
      --bvecs ${dir}/${subject}/${subject}_3T_DWI_dir95_LR.bvec --small_delta 10.6 --big_delta 43.1

#------------- remove temp files -----------------------------
rm ${dir}/${subject}/${subject}_3T_DWI_dir95_LR.nii
rm ${dir}/${subject}/${subject}_3T_DWI_dir95_LR.bval
rm ${dir}/${subject}/${subject}_3T_DWI_dir95_LR.bvec
rm ${dir}/${subject}/${subject}_3T_DWI_dir95_RL.nii
rm ${dir}/${subject}/${subject}_3T_DWI_dir95_RL.bval
rm ${dir}/${subject}/${subject}_3T_DWI_dir95_RL.bvec

#------------- create and submit a DIFF_PREP swarm file -----------------------------
cat > diffprep_$subject.swarm << EOF
DIFFPREP -i ${dir}/${subject}/${subject}_3T_DWI_dir95_RL_proc/${subject}_3T_DWI_dir95_RL.list \
      -s ${dir}/${subject}/${subject}_T2W_structural.nii.gz --will_be_drbuddied 1 --do_QC 0
DIFFPREP -i ${dir}/${subject}/${subject}_3T_DWI_dir95_LR_proc/${subject}_3T_DWI_dir95_LR.list \
      -s ${dir}/${subject}/${subject}_T2W_structural.nii.gz --will_be_drbuddied 1 --do_QC 0
EOF

# submit the diffprep job and capture its jobid. Each DIFF_PREP subjob can multithread to use 32 CPUs, 
#   so the swarm is submitted with '-t 32'
diffprep_jobid=$( swarm -f diffprep_$subject.swarm -t 32 -g 20 --time=24:00:00 )
echo "Submitted swarm of DIFFPREP jobs: $diffprep_jobid"

#------------- create and submit the dependent DR_BUDDI job -----------------------------
#create a DR_BUDDI job script 
cat > drbuddi_${subject}.sh << EOF
#!/bin/bash

cd $dir
module load TORTOISE
DR_BUDDI_withoutGUI \
      --up_data ${dir}/${subject}/${subject}_3T_DWI_dir95_RL_proc/${subject}_3T_DWI_dir95_RL_proc.list \
      --down_data ${dir}/${subject}/${subject}_3T_DWI_dir95_LR_proc/${subject}_3T_DWI_dir95_LR_proc.list \
      ${dir}/${subject}/${subject}_T2W_structural.nii.gz -g 1
EOF

# submit the DR_BUDDI job to run after the diffprep jobs complete. DR_BUDDI requires lots of memory
drbuddi_jobid=$( sbatch --depend=afterany:${diffprep_jobid} --exclusive --cpus-per-task=32 --mem=128g --time=18:00:00 drbuddi_${subject}.sh )
echo "Submitted drbuddi job: $drbuddi_jobid"

#------------- create and submit the dependent DIFF_CALC job -----------------------------
#set up a diffcalc job
cat > diffcalc_${subject}.sh <<EOF
#!/bin/bash

cd $dir
module load TORTOISE
EstimateTensorWLLS \
      -i ${dir}/${subject}/${subject}_3T_DWI_dir95_RL_proc_DRBUDDI_proc/${subject}_RL_proc_DRBUDDI_final.list
ComputeFAMap \
      ${dir}/${subject}/${subject}_3T_DWI_dir95_RL_proc_DRBUDDI_proc/${subject}_RL_proc_DRBUDDI_final_L0_DT.nii
ComputeTRMap \
      ${dir}/${subject}/${subject}_3T_DWI_dir95_RL_proc_DRBUDDI_proc/${subject}_RL_proc_DRBUDDI_final_L0_DT.nii
ComputeDECMap \
      ${dir}/${subject}/${subject}_3T_DWI_dir95_RL_proc_DRBUDDI_proc/${subject}_RL_proc_DRBUDDI_final_L0_DT.nii
EOF

#submit the diffcalc job to run after the DR_BUDDI jobs complete
diffcalc_jobid=$( sbatch --depend=afterany:${drbuddi_jobid} --exclusive --cpus-per-task=32 --mem=20g --time=12:00:00 diffcalc_${subject}.sh )
echo "Submitted diffcalc job: $diffcalc_jobid"

Next, a swarm command file is set up with one line for each subject:

# this file is called tortoise.swarm
cd /data/$USER/TORTOISE; ./process.sh 100206
cd /data/$USER/TORTOISE; ./process.sh 162228
cd /data/$USER/TORTOISE; ./process.sh 175540
[...]
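Rather than typing one line per subject, the swarm file can be generated from a list of subject IDs. In this sketch, subjects.txt is a hypothetical file with one ID per line (created here with the three example IDs above):

```shell
# subjects.txt is hypothetical: one subject ID per line
printf '100206\n162228\n175540\n' > subjects.txt

# Emit one pipeline invocation per subject into the swarm file
while read -r s; do
    echo "cd /data/$USER/TORTOISE; ./process.sh $s"
done < subjects.txt > tortoise.swarm
cat tortoise.swarm
```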
This swarm of independent jobs is submitted with:

swarm -f tortoise.swarm

The jobs in this swarm only copy and import the data and set up the DIFFPREP, DR_BUDDI, and DIFF_CALC jobs, so the default 2 CPUs and default memory allocation are sufficient. The DIFFPREP, DR_BUDDI, and DIFF_CALC jobs themselves require more CPU and memory resources, and these are requested appropriately by the swarm and sbatch commands in the 'TORTOISE pipeline script' above.