Online class: Introduction to Biowulf

Hands-On: Using local disk on a node

In the previous Data Storage hands-on section, you should have copied the class scripts to your /data area. If you skipped or missed that section, type

hpc-classes biowulf
now. This command will copy the scripts and input files used in this online class to your /data area, and will take about 5 minutes.

In the following session, you will run Freebayes, a Bayesian genetic variant detector, in an interactive session, reading and writing from local disk on the node. If you're not familiar with this program, don't worry; this is just an example. The basic principles shown here are not specific to Freebayes.

# allocate an interactive session requesting 5 GB of local disk on the node
sinteractive --gres=lscratch:5

# once you are logged into a node, cd to the local scratch directory for this job
cd /lscratch/$SLURM_JOBID

# see what files it contains 
ls -l

# copy the input files required for this job to local scratch
cp -r /data/$USER/hpc-classes/biowulf/freebayes/ .

# load the freebayes module
module load freebayes

# run freebayes, writing the output into a file (by default it prints to the terminal)
cd freebayes
freebayes -f genome.fasta input.bam > freebayes.out

# see what files have been created
ls -l 

# exit the interactive session
exit

Quiz

Once you exit the job, can you access the files on /lscratch on the node?

Answer

No. The session above would have left all the output files in /lscratch/$SLURM_JOBID on the allocated node, and that directory would have been deleted when you exited the job.

To save the output files, you would need to copy them to your /data area before the 'exit' command.
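
As a rough sketch of that copy step, the helper below creates a destination directory and copies the named files into it. The function name save_results and the destination path in the usage comment are hypothetical and not part of the class scripts; adjust them to your own /data layout.

```shell
# Hypothetical helper (not part of the class scripts): copy result files
# to a permanent directory, creating that directory if it does not exist.
save_results() {
    dest_dir="$1"   # first argument: destination directory
    shift           # remaining arguments: the files to copy
    mkdir -p "$dest_dir" || return 1
    cp -- "$@" "$dest_dir"/
}

# Inside the interactive session, before typing 'exit'
# (the destination path below is an assumption):
# save_results "/data/$USER/freebayes-results" freebayes.out
```

The function returns a non-zero status if the directory cannot be created or the copy fails, so it is worth checking for errors before you exit the session.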

Can you write to /lscratch on the node?

Answer

No. Try it in the interactive session and see. You can only write to /lscratch/$SLURM_JOBID for your own jobs. If multiple users have jobs on the same node, each user only has access to the /lscratch directories corresponding to their own jobs.
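
One way to try this without creating any files is the shell's built-in write-permission test. The sketch below is illustrative only; the function name check_writable is hypothetical.

```shell
# Hypothetical helper: report whether the current user can write to a directory.
check_writable() {
    if [ -d "$1" ] && [ -w "$1" ]; then
        echo "writable: $1"
    else
        echo "not writable: $1"
    fi
}

# In the interactive session, compare the top-level directory
# with the directory belonging to your own job:
# check_writable /lscratch
# check_writable "/lscratch/$SLURM_JOBID"
```

Based on the answer above, the first call should report that /lscratch is not writable, while the second should succeed for your own job.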

How would you run this same freebayes job as a batch job?

Answer

The batch script would look very similar to the commands above, for example:
#!/bin/bash
# this file is freebayes.sh

cd /lscratch/$SLURM_JOBID

# copy the input files required for this job to local scratch
cp -r /data/$USER/hpc-classes/biowulf/freebayes/ .

# load the freebayes module
module load freebayes

# run freebayes, writing the output into a file 
cd freebayes
freebayes -f genome.fasta input.bam > freebayes.out

# copy the output files back to your /data area
cp freebayes.out /data/$USER/hpc-classes/biowulf/freebayes/

This job would be submitted with

sbatch --gres=lscratch:10 freebayes.sh
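
If the --gres=lscratch option is left off the sbatch command, the job's /lscratch directory is presumably not created, so the cd at the top of the script would fail and the remaining commands would run in the wrong directory. A minimal sketch of a guard for this case follows; the function name job_scratch_dir is hypothetical, and the behavior without an allocation is an assumption.

```shell
# Hypothetical helper: print the job's local scratch directory,
# or fail with a message if none appears to have been allocated.
job_scratch_dir() {
    if [ -n "${SLURM_JOBID:-}" ] && [ -d "/lscratch/$SLURM_JOBID" ]; then
        echo "/lscratch/$SLURM_JOBID"
    else
        echo "no local scratch found; was --gres=lscratch:N requested?" >&2
        return 1
    fi
}

# At the top of a batch script, in place of the plain 'cd':
# workdir=$(job_scratch_dir) || exit 1
# cd "$workdir"
```

With this guard the script stops immediately instead of copying input files into an unexpected location.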