Online class: Introduction to Biowulf
Advanced Quiz
- What's the problem with this sbatch submission command?
biowulf% sbatch myscript.sh --time=10:00:00
-
Answer
All sbatch options must come before the batch script name on the command line. Any parameters after the script name are treated by the batch system as input parameters for the script. Thus, the batch system will ignore the --time=10:00:00 and schedule this job with the default walltime. The correct form is sbatch --time=10:00:00 myscript.sh.
- What's the difference between these three commands:
swarm -b 2 -f swarmfile
swarm -p 2 -f swarmfile
swarm -t 2 -f swarmfile
-
Answer
The -b 2 flag 'bundles' the commands so that 2 processes (2 lines in your swarm command file) run sequentially on the allocated cores. This flag is valid for both single-threaded and multi-threaded processes. If the processes are single-threaded, one CPU on the allocated core will be idle.
The -p 2 flag 'packs' the commands so that two commands (2 lines in your swarm command file) run simultaneously on the 2 CPUs of an allocated core. This flag is set to '1' by default, can only be set to '1' or '2', and can only be set to '2' for single-threaded processes.
The -t 2 flag allocates 2 CPUs to each process (1 line in your swarm command file). The default is '-t 1', and it should only be increased if the swarm is running multi-threaded processes.
- How many subjobs would you expect if you submitted the following script via swarm?
#!/bin/bash
cd /data/$USER
module load bam2mpg
bam2mpg --region chr1 --mpg aln.chr1.mpg.out ref.fasta aln.sort.bam
-
Answer
That's a batch script, not a swarm command file. If you submit it with swarm, you will get 3 subjobs, one for each non-comment line in the script. The results will probably not be what you wanted. A swarm file to run bam2mpg should look like this:
cd /data/$USER; bam2mpg --region chr1 --mpg aln1.chr1.mpg.out ref.fasta aln1.sort.bam
cd /data/$USER; bam2mpg --region chr1 --mpg aln2.chr1.mpg.out ref.fasta aln2.sort.bam
[..other such commands, one line for each bamfile...]
and be submitted with
swarm -f swarmfile --module bam2mpg
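Writing one swarm line per BAM file by hand gets tedious, so the swarm file can be generated with a short shell loop. The following is only a sketch: make_swarmfile is a hypothetical helper (not part of swarm), and it assumes the input files follow the NAME.sort.bam naming used in the answer above.

```shell
#!/bin/bash
# Sketch: write one swarm command line per BAM file.
# make_swarmfile is a hypothetical helper, not part of swarm itself.
# Assumes input files are named NAME.sort.bam, as in the example above.
make_swarmfile() {
    local outfile="$1"
    shift
    local bam prefix
    : > "$outfile"                      # start with an empty swarm file
    for bam in "$@"; do
        prefix="${bam%.sort.bam}"       # aln1.sort.bam -> aln1
        # $USER is deliberately left unexpanded (single quotes) so that it
        # appears literally in the swarm file, as in the hand-written version.
        printf 'cd /data/$USER; bam2mpg --region chr1 --mpg %s.chr1.mpg.out ref.fasta %s\n' \
            "$prefix" "$bam" >> "$outfile"
    done
}
```

For example, make_swarmfile swarmfile aln1.sort.bam aln2.sort.bam writes the two lines shown in the answer, and the result is then submitted with swarm -f swarmfile --module bam2mpg as before.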
- What's the problem with this swarm command file?
sbatch --time=10:00:00 job1
sbatch --time=10:00:00 job2
[...]
-
Answer
To submit a series of sbatch commands, you do not need to use swarm at all. The file is already a valid shell script, so you can simply run it on the Biowulf login node command line. The login node is intended for job submission. Submitting it as a swarm simply adds a layer of overhead to the process and needlessly loads the batch system.
In general, you should use either swarm or sbatch to submit a collection of jobs, not one wrapped inside the other.
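The sbatch-only route can be sketched as a plain shell loop run on the login node. The job names job1 and job2 and the --time value come from the question; the submit_all function and the DRY_RUN variable are illustrative assumptions, added so the commands can be previewed before anything is submitted.

```shell
#!/bin/bash
# Sketch: submit several batch scripts directly with sbatch, without swarm.
# submit_all and DRY_RUN are hypothetical names used for illustration only.
# DRY_RUN=1 (the default here) prints the commands instead of running them.
DRY_RUN="${DRY_RUN:-1}"

submit_all() {
    local job
    for job in "$@"; do
        if [ "$DRY_RUN" = "1" ]; then
            echo "sbatch --time=10:00:00 $job"
        else
            # sbatch options go before the script name
            sbatch --time=10:00:00 "$job"
        fi
    done
}
```

Calling submit_all job1 job2 prints the two sbatch lines; setting DRY_RUN=0 in the environment before running the script would submit them for real.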