From the ENCODE documentation:
This pipeline is designed for automated end-to-end quality control and processing of ATAC-seq or DNase-seq data. The pipeline can be run on compute clusters with job submission engines or on stand-alone machines. It inherently makes use of parallelized/distributed computing.
Environment variables set by the module:
$EASP_BACKEND_CONF: configuration for the local backend
$EASP_WFOPTS: singularity backend options (versions > 1.0 only)
$EASP_WDL: WDL file defining the workflow
$EASP_TEST_DATA: input data for the example below

A note about resource allocation:
- Set "atac.bowtie2_cpu" in the input json to the number of CPUs you want bowtie2 to use. Usually 8 or so.
- Allocate NUM_CONCURRENT_TASKS * atac.bowtie2_cpu CPUs.
- Allocate 20GB * NUM_CONCURRENT_TASKS of memory for big samples and 10GB * NUM_CONCURRENT_TASKS for small samples.
For example, the interactive allocation below (8 CPUs, 20GB of memory) corresponds to a single concurrent task with atac.bowtie2_cpu = 8 on a big sample.

WDL based workflows need a json file to define input and settings for a workflow run. In this example, we will use the 76nt data from ENCODE sample ENCSR356KRQ (keratinocyte). This includes 2 and 6 pairs of fastq files for replicates 1 and 2, respectively.

Version 2.2.0 continues to use the v3 annotation. However, caper apparently changed significantly, so you should back up your old caper configuration and create fresh config files for this version.

Allocate an interactive session and run the program. Sample session:
[user@biowulf]$ sinteractive --cpus-per-task=8 --mem=20g --gres=lscratch:30
salloc.exe: Pending job allocation 46116226
salloc.exe: job 46116226 queued and waiting for resources
salloc.exe: job 46116226 has been allocated resources
salloc.exe: Granted job allocation 46116226
salloc.exe: Waiting for resource configuration
salloc.exe: Nodes cn3144 are ready for job
[user@cn3144]$ wd=$PWD  # so we can copy results back later
[user@cn3144]$ cd /lscratch/$SLURM_JOB_ID
[user@cn3144]$ module load encode-atac-seq-pipeline/2.2.0
[user@cn3144]$ cp -Lr ${EASP_TEST_DATA:-none}/* .
[user@cn3144]$ tree
.
├── ENCSR356KRQ_subsampled.json.2.2.0
└── input
    └── ENCSR356KRQ
        ├── ENCFF007USV.subsampled.400.fastq.gz
        ├── ENCFF031ARQ.subsampled.400.fastq.gz
        ├── ENCFF106QGY.subsampled.400.fastq.gz
        ├── ENCFF193RRC.subsampled.400.fastq.gz
        ├── ENCFF248EJF.subsampled.400.fastq.gz
        ├── ENCFF341MYG.subsampled.400.fastq.gz
        ├── ENCFF366DFI.subsampled.400.fastq.gz
        ├── ENCFF368TYI.subsampled.400.fastq.gz
        ├── ENCFF573UXK.subsampled.400.fastq.gz
        ├── ENCFF590SYZ.subsampled.400.fastq.gz
        ├── ENCFF641SFZ.subsampled.400.fastq.gz
        ├── ENCFF734PEQ.subsampled.400.fastq.gz
        ├── ENCFF751XTV.subsampled.400.fastq.gz
        ├── ENCFF859BDM.subsampled.400.fastq.gz
        ├── ENCFF886FSC.subsampled.400.fastq.gz
        ├── ENCFF927LSG.subsampled.400.fastq.gz
        └── hg38.tsv
[user@cn3144]$ cat ENCSR356KRQ_subsampled.json.2.2.0
{
    "atac.pipeline_type" : "atac",
    "atac.genome_tsv" : "/fdb/encode-atac-seq-pipeline/v3/hg38/hg38.tsv",
    "atac.fastqs_rep1_R1" : [
        "input/ENCSR356KRQ/ENCFF341MYG.subsampled.400.fastq.gz",
        "input/ENCSR356KRQ/ENCFF106QGY.subsampled.400.fastq.gz"
    ],
    "atac.fastqs_rep1_R2" : [
        "input/ENCSR356KRQ/ENCFF248EJF.subsampled.400.fastq.gz",
        "input/ENCSR356KRQ/ENCFF368TYI.subsampled.400.fastq.gz"
    ],
    "atac.fastqs_rep2_R1" : [
        "input/ENCSR356KRQ/ENCFF641SFZ.subsampled.400.fastq.gz",
        "input/ENCSR356KRQ/ENCFF751XTV.subsampled.400.fastq.gz",
        "input/ENCSR356KRQ/ENCFF927LSG.subsampled.400.fastq.gz",
        "input/ENCSR356KRQ/ENCFF859BDM.subsampled.400.fastq.gz",
        "input/ENCSR356KRQ/ENCFF193RRC.subsampled.400.fastq.gz",
        "input/ENCSR356KRQ/ENCFF366DFI.subsampled.400.fastq.gz"
    ],
    "atac.fastqs_rep2_R2" : [
         "input/ENCSR356KRQ/ENCFF031ARQ.subsampled.400.fastq.gz",
         "input/ENCSR356KRQ/ENCFF590SYZ.subsampled.400.fastq.gz",
         "input/ENCSR356KRQ/ENCFF734PEQ.subsampled.400.fastq.gz",
         "input/ENCSR356KRQ/ENCFF007USV.subsampled.400.fastq.gz",
         "input/ENCSR356KRQ/ENCFF886FSC.subsampled.400.fastq.gz",
         "input/ENCSR356KRQ/ENCFF573UXK.subsampled.400.fastq.gz"
    ],
    "atac.paired_end" : true,
    "atac.auto_detect_adapter" : true,
    "atac.enable_xcor" : true,
    "atac.title" : "ENCSR356KRQ (subsampled 1/400)",
    "atac.description" : "ATAC-seq on primary keratinocytes in day 0.0 of differentiation"
}
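If you adapt this json for your own samples, it can be worth checking that the edited file is still valid JSON before starting a run. One generic way to do this (not specific to this pipeline) is:

[user@cn3144]$ python -m json.tool ENCSR356KRQ_subsampled.json.2.2.0 > /dev/null && echo "json OK"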
        
In this example the pipeline will only be run locally, i.e. it will not submit tasks as Slurm jobs. Follow the caper docs to set up a config file for Slurm submission; this has to be done only once.
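For reference only (this example uses the local backend), the one-time Slurm setup sketched in the caper docs looks roughly like the following; the generated config file has to be reviewed and filled in by hand:

[user@cn3144]$ caper init slurm              # writes a template config to ~/.caper/default.conf
[user@cn3144]$ nano ~/.caper/default.conf    # fill in Slurm details (e.g. partition) per the caper docs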
[user@cn3144]$ [[ -d ~/.caper ]] && mv ~/.caper ~/caper.$(date +%F).bak # back up old caper config
[user@cn3144]$ mkdir -p ~/.caper && caper init local
[user@cn3144]$ # note the need for --singularity in this version
[user@cn3144]$ caper run $EASP_WDL -i ENCSR356KRQ_subsampled.json.2.2.0 --singularity
[...much output...]

This workflow ran successfully. There is nothing to troubleshoot.
This version of the pipeline comes with a tool to copy and organize pipeline output.
[user@cn3144]$ ls atac
a0fb9f58-ede3-4c02-9bcc-26d21ab5ccbb
[user@cn3144]$ croo --method copy --out-dir=${wd}/ENCSR356KRQ \
    atac/a0fb9f58-ede3-4c02-9bcc-26d21ab5ccbb/metadata.json
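After croo finishes, the organized output is under the directory given to --out-dir (here, back in the original working directory saved in $wd) and can be inspected like any other directory:

[user@cn3144]$ ls ${wd}/ENCSR356KRQ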
Create a batch input file (e.g. encode-atac-seq-pipeline.sh). For example, the following batch script runs the pipeline locally (assuming the caper config file is set up correctly):
#! /bin/bash
# run the ENCODE ATAC-seq pipeline on the subsampled test data in lscratch
# and copy the organized results back to the submission directory
wd=$PWD
module load encode-atac-seq-pipeline/2.2.0 || exit 1
cd /lscratch/$SLURM_JOB_ID || exit 1
mkdir input
cp -rL $EASP_TEST_DATA/* .
# --singularity is required in this version (see the interactive example above)
caper run $EASP_WDL -i ENCSR356KRQ_subsampled.json.2.2.0 --singularity
rc=$?
# organize and copy the pipeline output back with croo
croo --method copy --out-dir=${wd}/ENCSR356KRQ \
    atac/*/metadata.json
exit $rc
Submit this job using the Slurm sbatch command.
sbatch --time=4:00:00 --cpus-per-task=8 --mem=20g --gres=lscratch:50 encode-atac-seq-pipeline.sh
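The job can then be monitored with the usual Slurm tools, for example:

[user@biowulf]$ squeue -u $USER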