Hands-On: Submit a simple batch job
In the previous Data Storage hands-on section, you should have copied the class scripts to your /data area. If you skipped or missed that section, type 'hpc-classes biowulf' now. This command copies the scripts and input files used in this online class to your /data area, and takes about 5 minutes.
In the following session, you will submit a batch job for Plink, a whole-genome association analysis program. If you're not familiar with whole-genome analysis or Plink, don't worry -- this is just an example. The basic principles of job submission are not specific to Plink.
cd /data/$USER/hpc-classes/biowulf/plink

# look at the batch script -- these are the same commands you would type on a command line
# (use Ctrl-C to end the 'more' process)
more plink.bat

# submit the job
sbatch plink.bat

# check if it's in the queue
squeue -u $USER

# try the 'sjobs' command to see the status of your job
sjobs
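If you would rather not re-run 'squeue' by hand, the check can be wrapped in a small polling loop. The following is only a sketch: the function name wait_for_queue_empty and the 30-second default interval are made up for illustration, and the loop waits for all of your jobs, not just the Plink one.

```shell
# wait_for_queue_empty [INTERVAL]
# Poll squeue until the current user has no queued or running jobs.
# (Hypothetical helper, not part of the class scripts.)
wait_for_queue_empty() {
    local interval="${1:-30}"   # seconds between polls; 30 is an arbitrary default

    # 'squeue -h' suppresses the header line, so empty output means
    # that no jobs belonging to $USER remain in the queue.
    while [ -n "$(squeue -h -u "$USER")" ]; do
        sleep "$interval"
    done
    echo "No jobs left in the queue for $USER"
}
```

After submitting the job, you could call wait_for_queue_empty 10 to poll every 10 seconds. Keep the interval reasonably long so that the loop does not put unnecessary load on the batch system.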
- What is the job number (aka jobid) for this Plink job?
  Answer: The 'sbatch' command printed the job number to your screen when you submitted the job. (If you contact the HPC staff about a job problem, it's very helpful to include the job number.)
- Which node is your job running on?
  Answer: 'sjobs' will show you this information, but you generally don't need to know which node your jobs run on. This is the power of the batch system: once you submit your job, the batch system finds the appropriate resources and starts and ends your job without you needing to know which nodes are available or which node your job ran on.
- Where is the output file?
  Answer: The output file is called slurm-#####.out, where ##### is the job number. It is written to the directory from which you submitted the job.
- How many cores/CPUs were allocated? How much memory?
  Answer: You did not specify any cores, CPUs, or memory in the sbatch command, so the batch system allocated the default of 1 core = 2 CPUs, and 4 GB of memory.
- How many cores/CPUs were used?
  Answer: Plink is a single-threaded program, so it used only a single CPU. One of the 2 allocated CPUs was idle. In most cases, the Biowulf application webpages indicate whether an application is single-threaded or multi-threaded. You may also need to read the documentation for the application, as some applications like R have a mix of single-threaded and multi-threaded packages.
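The answers above can be tied together in a short sketch. It assumes a bash shell; the function names submit_and_report and show_job_resources are hypothetical helpers, not Biowulf commands. The resource flags simply restate the defaults described above (2 CPUs, 4 GB) explicitly, and the 'sacct' field list is one reasonable choice rather than a recommended set.

```shell
# submit_and_report SCRIPT
# Submit a batch script with explicit resources, then print the job number
# and the name of the output file the batch system will write.
# (Hypothetical helper for illustration.)
submit_and_report() {
    local script="$1"
    local out jobid

    # --cpus-per-task and --mem restate the defaults described in the text
    out=$(sbatch --cpus-per-task=2 --mem=4g "$script") || return 1

    # Stock Slurm prints "Submitted batch job 12345"; some sites print only
    # the number. Keeping the last space-separated word handles both cases.
    jobid="${out##* }"

    echo "jobid=$jobid"
    echo "output=slurm-${jobid}.out"
}

# show_job_resources JOBID
# Ask the Slurm accounting database what was allocated and used.
# (Assumes job accounting is enabled on the cluster.)
show_job_resources() {
    sacct -j "$1" --format=JobID,AllocCPUS,ReqMem,MaxRSS,Elapsed,State
}
```

For example, submit_and_report plink.bat prints the job number and the expected output file name, and show_job_resources followed by that job number lists the allocated CPUs and requested memory once the job has started.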