SMRT® Analysis is a bioinformatics software suite available for analysis of DNA sequencing data from Pacific Biosciences’ SMRT technology. Users can choose from a variety of analysis protocols that utilize PacBio® and third-party tools. Analysis protocols include de novo genome assembly, cDNA mapping, DNA base-modification detection, and long-amplicon analysis to determine phased consensus sequences.
The following sample interactive session runs the lambda phage site acceptance test on the local node using pbcromwell (user input follows the $ prompt):
[teacher@biowulf ~]$ sinteractive --cpus-per-task=12
salloc.exe: Pending job allocation 43027948
salloc.exe: job 43027948 queued and waiting for resources
salloc.exe: job 43027948 has been allocated resources
salloc.exe: Granted job allocation 43027948
salloc.exe: Waiting for resource configuration
salloc.exe: Nodes cn3109 are ready for job
srun: error: x11: no local DISPLAY defined, skipping
[teacher@cn3109 smrtanalysis]$ module load smrtanalysis
[+] Loading smrtanalysis 8.0.0.79519
[teacher@cn3109 ~]$ mkdir /data/$USER/smrtanalysis
[teacher@cn3109 ~]$ cd !$
[teacher@cn3109 smrtanalysis]$ pbcromwell configure
[WARNING] 2024-04-14 21:24:09,447Z [pbcromwell.cli] No database port specified - will run with in-memory DB
[teacher@cn3109 smrtanalysis]$ ls
cromwell.conf
[teacher@cn3109 smrtanalysis]$ pbcromwell show-workflows
cromwell.workflows.pb_detect_methyl: 5mC CpG Detection
cromwell.workflows.pb_ccs: Circular Consensus Sequencing (CCS)
cromwell.workflows.pb_demux_ccs: Demultiplex Barcodes
cromwell.workflows.pb_export_ccs: Export Reads
cromwell.workflows.pb_assembly_hifi: Genome Assembly
cromwell.workflows.pb_align_ccs: HiFi Mapping
cromwell.workflows.pb_target_enrichment: HiFi Target Enrichment
cromwell.workflows.pb_sars_cov2_kit: HiFiViral SARS-CoV-2 Analysis
cromwell.workflows.pb_isoseq: Iso-Seq Analysis
cromwell.workflows.pb_mark_duplicates: Mark PCR Duplicates
cromwell.workflows.pb_microbial_analysis: Microbial Genome Analysis
cromwell.workflows.pb_puretarget_re_panel: PureTarget repeat expansion
cromwell.workflows.pb_segment_reads: Read Segmentation
cromwell.workflows.pb_segment_reads_and_isoseq: Read Segmentation and Iso-Seq
cromwell.workflows.pb_segment_reads_and_sc_isoseq: Read Segmentation and Single-Cell Iso-Seq
cromwell.workflows.pb_sc_isoseq: Single-Cell Iso-Seq
cromwell.workflows.pb_sv_ccs: Structural Variant Calling
cromwell.workflows.pb_trim_adapters: Trim Ultra-Low Adapters
cromwell.workflows.pb_undo_demux: Undo Demultiplexing
cromwell.workflows.pb_variant_calling: Variant Calling

Run 'pbcromwell show-workflow-details' to display further information about a workflow. Note that the cromwell.workflows. prefix is optional.

The full SMRT Tools documentation for this command and PacBio analysis workflows is available online: https://www.pacb.com/support/documentation

[teacher@cn3109 smrtanalysis]$ pbcromwell run pb_align_subreads \
    --entry /fdb/smrtanalysis/canneddata/lambdaTINY/m54026_181219_010936_tiny.subreadset.xml \
    --entry /fdb/smrtanalysis/canneddata/referenceset/lambdaNEB/referenceset.xml \
    --config $PWD/cromwell.conf \
    --nproc $SLURM_CPUS_PER_TASK \
    --output-dir sat
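The run command in the session passes $SLURM_CPUS_PER_TASK to --nproc, so the workflow uses the cores requested with sinteractive --cpus-per-task. Slurm sets this variable only when --cpus-per-task was given. The following is a minimal sketch, assuming that falling back to a single core is acceptable when the variable is unset; the variable name nproc is illustrative and not part of pbcromwell:

```shell
# Use the core count of the current Slurm allocation, or 1 if the variable is unset.
nproc="${SLURM_CPUS_PER_TASK:-1}"

# Show the option that would be handed to pbcromwell run (nothing is executed here).
echo "pbcromwell run would receive: --nproc ${nproc}"
```

Inside the 12-CPU allocation shown above, this would resolve to 12.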
Although pbcromwell can use the slurm backend to submit jobs from a supervisor process running in an interactive session, we recommend submitting the supervisor process itself as a batch job. Create a batch input file (e.g. smrtanalysis.sh) as follows:
#!/bin/bash
set -e

module load smrtanalysis

### cromwell configuration
pbcromwell configure
# The generated cromwell.conf cannot be used as is. The following command makes the necessary adjustments.
sed -i -r \
    -e 's%(sbatch|scancel|squeue)%/usr/local/slurm/bin/\1%' `# use absolute paths for slurm commands (slurm directory is purged from $PATH by PacBio wrapper)` \
    -e '/job-id-regex/s/(job-id-regex\s+=\s+).*/\1 "(\\\\d+)"/' `# NIH HPC's sbatch has non-standard output` \
    cromwell.conf

### run workflow
pbcromwell run pb_align_subreads \
    --entry /fdb/smrtanalysis/canneddata/lambdaTINY/m54026_181219_010936_tiny.subreadset.xml \
    --entry /fdb/smrtanalysis/canneddata/referenceset/lambdaNEB/referenceset.xml \
    --config $PWD/cromwell.conf \
    --backend slurm \
    --output-dir sat
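To see what the sed adjustments in the batch script do, they can be tried on a small stand-in file. The sketch below is illustrative only. The four configuration lines are a hypothetical fragment modeled on the key names of Cromwell's Slurm backend, not the actual contents of the cromwell.conf that pbcromwell configure generates. It assumes GNU sed (for -r and \s) and leaves out -i so that the result is printed instead of written back:

```shell
#!/bin/bash
# Hypothetical stand-in for the Slurm section of a generated cromwell.conf.
# Key names follow Cromwell's backend options; the values are made up for this demo.
demo_conf=$(mktemp)
cat > "$demo_conf" <<'EOF'
submit = "sbatch -J ${job_name} --wrap ${script}"
kill = "scancel ${job_id}"
check-alive = "squeue -j ${job_id}"
job-id-regex = "Submitted batch job (\\d+).*"
EOF

# The same two expressions as in the batch script:
#  1. prefix the first sbatch/scancel/squeue on each line with /usr/local/slurm/bin/
#  2. replace the value of job-id-regex with "(\\d+)"
patched=$(sed -r \
    -e 's%(sbatch|scancel|squeue)%/usr/local/slurm/bin/\1%' \
    -e '/job-id-regex/s/(job-id-regex\s+=\s+).*/\1 "(\\\\d+)"/' \
    "$demo_conf")

printf '%s\n' "$patched"
rm -f "$demo_conf"
```

In the printed result, the three Slurm commands carry absolute paths and the job-id-regex value is reduced to a bare digit-capturing group, so the job ID can be extracted regardless of the text sbatch prints around it.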
Submit this job using the Slurm sbatch command, requesting a walltime that covers the expected run time of the whole pipeline.
sbatch [--time=#] smrtanalysis.sh