Building pipelines using Slurm dependencies

Differences between stock sbatch and sbatch on Biowulf

Before discussing job dependencies, note that sbatch on Biowulf, and therefore in the examples below, is a wrapper script that returns just the job id. This differs from stock sbatch, which returns Submitted batch job 123456. You can think of the wrapper as doing something equivalent to:

#! /bin/bash

# call the real sbatch and capture its output
sbr="$(/path/to/real/sbatch "$@")"

# on success, print only the job id; otherwise report the failure on stderr
if [[ "$sbr" =~ Submitted\ batch\ job\ ([0-9]+) ]]; then
    echo "${BASH_REMATCH[1]}"
    exit 0
else
    echo "sbatch failed" >&2
    exit 1
fi
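
On clusters running stock sbatch, the same single-token job id can be obtained with sbatch's --parsable option, which prints just the job id (followed by the cluster name, where applicable) instead of the usual message. A minimal sketch:

# capture the job id directly; works with stock sbatch
jid=$(sbatch --parsable job1.sh)
sbatch --dependency=afterok:$jid job2.sh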
    
Introduction

Job dependencies are used to defer the start of a job until the specified dependencies have been satisfied. They are specified with the --dependency option to sbatch or swarm in the format

sbatch --dependency=<type:job_id[:job_id][,type:job_id[:job_id]]> ...

Dependency types:

after:jobid[:jobid...]       job can begin after the specified jobs have started
afterany:jobid[:jobid...]    job can begin after the specified jobs have terminated
afternotok:jobid[:jobid...]  job can begin after the specified jobs have failed
afterok:jobid[:jobid...]     job can begin after the specified jobs have run to completion with an exit code of zero (see the user guide for caveats)
singleton                    job can begin after all previously launched jobs with the same name and user have ended; this is useful to collate the results of a swarm or to send a notification at the end of a swarm (see the sketch below)
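
For instance, a collation job can be made to wait for an entire swarm by giving the swarm subjobs and the collation job the same name. A minimal sketch, assuming a (hypothetical) swarm command file process.swarm and collation script collate.sh, and that swarm's --job-name option is used to name all subjobs:

# all swarm subjobs and the collation job share the name 'myanalysis'
swarm -f process.swarm --job-name=myanalysis
sbatch --job-name=myanalysis --dependency=singleton collate.sh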

See also the Job Dependencies section of the User Guide.

To set up pipelines using job dependencies, the most useful types are afterany, afterok, and singleton. The simplest approach is to use the afterok dependency for single consecutive jobs. For example:

b2$ sbatch job1.sh
11254323
b2$ sbatch --dependency=afterok:11254323 job2.sh

Now, when job1 ends with an exit code of zero, job2 will become eligible for scheduling. However, if job1 fails (ends with a non-zero exit code), job2 will not be scheduled but will remain in the queue with reason DependencyNeverSatisfied and has to be canceled manually with scancel.
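
Alternatively, Slurm can clean up such jobs automatically: sbatch's --kill-on-invalid-dep=yes option cancels a job once its dependency can never be satisfied. A minimal sketch:

jid1=$(sbatch job1.sh)
# job2 is cancelled automatically if job1 fails
sbatch --kill-on-invalid-dep=yes --dependency=afterok:$jid1 job2.sh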

As an alternative, the afterany dependency can be used, and the check for successful execution of the prerequisites can be done in the job script itself.
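
A minimal sketch of this pattern, assuming job1.sh creates a (hypothetical) sentinel file job1.done as its final step, so that job2.sh, submitted with an afterany dependency, can verify that its prerequisite actually succeeded:

# last line of job1.sh - only reached if everything above succeeded
touch job1.done

# top of job2.sh, submitted with --dependency=afterany:<jobid of job1>
if [[ ! -e job1.done ]]; then
    echo "prerequisite job1 did not finish successfully" >&2
    exit 1
fi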

The sections below give more complicated examples of using job dependencies for pipelines in bash, Perl, and Python.

Bash

The following bash script is a stylized example of some useful patterns for using job dependencies:

#! /bin/bash

# first job - no dependencies
jid1=$(sbatch --mem=12g --cpus-per-task=4 job1.sh)

# multiple jobs can depend on a single job
jid2=$(sbatch --dependency=afterany:$jid1 --mem=20g job2.sh)
jid3=$(sbatch --dependency=afterany:$jid1 --mem=20g job3.sh)

# a single job can depend on multiple jobs
jid4=$(sbatch --dependency=afterany:$jid2:$jid3 job4.sh)

# swarm can use dependencies as well
jid5=$(swarm --dependency=afterany:$jid4 -t 4 -g 4 -f job5.sh)

# a single job can depend on an array job (such as a swarm);
# it will start executing when all array tasks have finished
jid6=$(sbatch --dependency=afterany:$jid5 job6.sh)

# a singleton job depends on all jobs by the same user with the same name
jid7=$(sbatch --dependency=afterany:$jid6 --job-name=dtest job7.sh)
jid8=$(sbatch --dependency=afterany:$jid6 --job-name=dtest job8.sh)
sbatch --dependency=singleton --job-name=dtest job9.sh

# show dependencies in squeue output
squeue -u $USER -o "%.8A %.4C %.10m %.20E"

A more complete example of a mock ChIP-seq pipeline can be found here.

And here is a simple bash script that submits a series of jobs for a benchmark test. It submits the same job with 1, 2, 4, ... 128 MPI processes. The Slurm batch script 'jobscript' uses the environment variable $SLURM_NTASKS to set the number of MPI processes the program should start. Job dependencies are used here because all the jobs write temporary files with the same names and would clobber each other if run at the same time.

#!/bin/sh

id=$(sbatch --job-name=factor9-1 --ntasks=1 --ntasks-per-core=1 \
      --output=${PWD}/results/x2650-1.slurmout jobscript)
echo "ntasks 1 jobid $id"

# each run depends on the previous one so that the jobs
# do not clobber each other's temporary files
for n in 2 4 8 16 32 64 128; do
    id=$(sbatch --depend=afterany:$id --job-name=factor9-$n --ntasks=$n \
          --ntasks-per-core=1 --output=${PWD}/results/x2650-$n.slurmout jobscript)
    echo "ntasks $n jobid $id"
done

The batch script corresponding to this example:

#!/bin/bash

module load amber/14
module list

echo "Using $SLURM_NTASKS cores"

cd /data/user/amber/factor_ix.amber10

$(which mpirun) -np $SLURM_NTASKS $(which sander.MPI) -O -i mdin -c inpcrd -p prmtop

Perl

A sample Perl script that submits 3 jobs, each dependent on the completion (in any state) of the previous job:

#!/usr/local/bin/perl

$num = 8;

# submit the first job and capture its job id
$jobnum = `sbatch --cpus-per-task=$num myjobscript`;
chomp $jobnum;                  # strip the trailing newline
print "Job number $jobnum submitted\n\n";

# each subsequent job depends on the previous one
$jobnum = `sbatch --depend=afterany:${jobnum} --cpus-per-task=8 --mem=2g mysecondjobscript`;
chomp $jobnum;
print "Job number $jobnum submitted\n\n";

$jobnum = `sbatch --depend=afterany:${jobnum} --cpus-per-task=8 --mem=2g mythirdjobscript`;
chomp $jobnum;
print "Job number $jobnum submitted\n\n";

# show the current status with 'sjobs'
system("sjobs");

Python

The sample Python script below submits 3 jobs, each dependent on the previous one, and shows the status of those jobs.

#!/usr/bin/env python3

import os
import subprocess

# submit the first job
cmd = "sbatch Job1.bat"
print("Submitting Job1 with command: %s" % cmd)
status, jobnum = subprocess.getstatusoutput(cmd)
if status == 0:
    print("Job1 is %s" % jobnum)
else:
    print("Error submitting Job1")

# submit the second job to be dependent on the first
cmd = "sbatch --depend=afterany:%s Job2.bat" % jobnum
print("Submitting Job2 with command: %s" % cmd)
status, jobnum = subprocess.getstatusoutput(cmd)
if status == 0:
    print("Job2 is %s" % jobnum)
else:
    print("Error submitting Job2")

# submit the third job (a swarm) to be dependent on the second
cmd = "swarm -f swarm.cmd --module blast --depend=afterany:%s" % jobnum
print("Submitting swarm job with command: %s" % cmd)
status, jobnum = subprocess.getstatusoutput(cmd)
if status == 0:
    print("Swarm job is %s" % jobnum)
else:
    print("Error submitting swarm job")

print("\nCurrent status:\n")
# show the current status with 'sjobs'
os.system("sjobs")

Running this script:

[user@biowulf ~]$ submit_jobs.py
Submitting Job1 with command: sbatch Job1.bat
Job1 is 25452702
Submitting Job2 with command: sbatch --depend=afterany:25452702 Job2.bat
Job2 is 25452703
Submitting swarm job with command: swarm -f swarm.cmd --module blast --depend=afterany:25452703
Swarm job is 25452706

Current status:

User    JobId            JobName   Part  St  Reason      Runtime  Walltime  Nodes  CPUs  Memory  Dependency      
==============================================================================================================
user    25452702         Job1.bat  norm  PD  ---            0:00   4:00:00      1   1   2GB/cpu
user    25452703         Job2.bat  norm  PD  Dependency     0:00   4:00:00      1   1   2GB/cpu  afterany:25452702
user    25452706_[0-11]  swarm     norm  PD  Dependency     0:00   4:00:00      1  12   1GB/node afterany:25452703
==============================================================================================================
cpus running = 0
cpus queued = 14
jobs running = 0
jobs queued = 14