Biowulf High Performance Computing at the NIH
Online class: Introduction to Biowulf

Hands-On: Submitting a swarm of jobs

In the previous Data Storage hands-on section, you should have copied the class scripts to your /data area. If you skipped or missed that section, run the following command now:

hpc-classes biowulf

This command copies the scripts and input files used in this online class to your /data area, and takes about 5 minutes.

In the following session, you will submit a swarm of jobs that run Blat, a sequence alignment program. If you're not familiar with this program, don't worry -- this is just an example. The basic principles of job submission are not specific to Blat.


cd /data/$USER/hpc-classes/biowulf/swarm

# look at the blat.swarm file
more blat.swarm

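# see how many lines (commands) are in the blat.swarm file, not including the comment lines
# ('^#' excludes only lines that start with '#', so a command that merely contains '#' is still counted)
grep -v '^#' blat.swarm | wc -l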

# submit the swarm
swarm -f blat.swarm --module blat

Quiz

How many jobs were created?

Answer

1 job array with 30 subjobs. Each subjob corresponds to a single line in the blat.swarm file.
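
To see why the line count predicts the number of subjobs, the counting step can be tried on a small stand-in file. This is only a sketch: the file name demo.swarm, its contents, and the scratch directory are made up for illustration and are not the class's blat.swarm.

```shell
# Build a tiny, hypothetical swarm file in a scratch directory.
workdir=$(mktemp -d)
cat > "$workdir/demo.swarm" <<'EOF'
# hypothetical comment line
blat db.fa query1.fa out/query1.psl
blat db.fa query2.fa out/query2.psl
# another hypothetical comment
blat db.fa query3.fa out/query3.psl
EOF

# Count the lines that do not start with '#'. As in the answer above,
# each such command line corresponds to one subjob.
ncommands=$(grep -vc '^#' "$workdir/demo.swarm")
echo "commands: $ncommands"
```

Here the stand-in file has three command lines, so a swarm built from it would be expected to have three subjobs.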

Where did the standard output/error files go?

Answer

In the directory from which you submitted the swarm. They are named swarm-####.o (standard output) and swarm-####.e (standard error), with one pair for each subjob.

How do you check if one of the swarm subjobs had an error?

Answer

Look at one of the .e files, then compare the sizes of all of the .e files ('ls -l *.e'). If all of them are the same size, and the one you examined contains only
[+] Loading BLAT 3.5
then they all ran successfully. (Unfortunately, the 'module load blat' command writes its message to standard error, so every .e file will contain this line.) If one file is larger than the others, take a look at that file.
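
The same check can be done by content instead of by comparing sizes by eye. The following is a minimal sketch, assuming that a clean .e file holds exactly the one module-load line shown above. The file names and the error text are invented for the demonstration; on Biowulf you would run the loop over the real .e files in your submission directory.

```shell
# Create three hypothetical .e files in a scratch directory:
# two are clean, one has an extra (made-up) error line.
workdir=$(mktemp -d)
expected='[+] Loading BLAT 3.5'
printf '%s\n' "$expected" > "$workdir/demo_0.e"
printf '%s\n' "$expected" > "$workdir/demo_1.e"
printf '%s\n%s\n' "$expected" 'hypothetical error message' > "$workdir/demo_2.e"

# Flag every .e file that has at least one line other than the expected one.
# -F: fixed string, -x: match whole line, -v: invert, -q: quiet (status only)
suspect=""
for f in "$workdir"/*.e; do
    if grep -qvxF -- "$expected" "$f"; then
        suspect="$suspect $(basename "$f")"
    fi
done
echo "files to inspect:$suspect"
```

Only the file with the extra line is reported, which is the one you would then open and read.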

Most of the files in the 'out' directory will contain only the Blat header. This means that no match was found for that query sequence against human genome chromosome 10. 'ls -l out' will show you which output files are larger. Those correspond to query sequences that matched chromosome 10.
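
Picking out the larger files can also be scripted. The sketch below assumes that the smallest file in the directory is a header-only file, so anything larger contains at least one match. The directory, file names, and file contents are stand-ins created for the example, not real Blat output.

```shell
# Create a hypothetical 'out' directory: two header-only files
# and one file with an extra (made-up) match line.
workdir=$(mktemp -d)
mkdir "$workdir/out"
printf 'header\n' > "$workdir/out/q1.psl"
printf 'header\n' > "$workdir/out/q2.psl"
printf 'header\nhypothetical match line\n' > "$workdir/out/q3.psl"

# Find the smallest file size in bytes (assumed to be header-only).
min=""
for f in "$workdir"/out/*; do
    size=$(wc -c < "$f" | tr -d ' ')
    if [ -z "$min" ] || [ "$size" -lt "$min" ]; then
        min=$size
    fi
done

# List every file that is larger than the smallest one.
hits=""
for f in "$workdir"/out/*; do
    size=$(wc -c < "$f" | tr -d ' ')
    if [ "$size" -gt "$min" ]; then
        hits="$hits $(basename "$f")"
    fi
done
echo "files with matches:$hits"
```

If every query happened to find a match, the smallest file would not be header-only and this shortcut would miss it, so it is worth opening one of the smallest files to confirm the assumption.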