Sequenza-utils is the supporting Python library for the sequenza R package.
Allocate an interactive session and run the program. Sample session:
[user@biowulf]$ sinteractive
salloc.exe: Pending job allocation 46116226
salloc.exe: job 46116226 queued and waiting for resources
salloc.exe: job 46116226 has been allocated resources
salloc.exe: Granted job allocation 46116226
salloc.exe: Waiting for resource configuration
salloc.exe: Nodes cn3144 are ready for job

[user@cn3144 ~]$ module load sequenza-utils

[user@cn3144 ~]$ sequenza-utils -h
usage: sequenza-utils [-h] [-v]
                      {bam2seqz,gc_wiggle,pileup2acgt,seqz_binning,snp2seqz}
                      ...

Sequenza Utils is an ensemble of tools capable of perform various tasks,
primarily aimed to convert bam/pileup files to a format usable by the sequenza
R package

positional arguments:
    bam2seqz            Process a paired set of BAM/pileup files (tumor and
                        matching normal), and GC-content genome-wide
                        information, to extract the common positions with
                        A and B alleles frequencies
    gc_wiggle           Given a fasta file and a window size it computes the
                        GC percentage across the sequences, and returns a file
                        in the UCSC wiggle format.
    pileup2acgt         Parse the format from the samtools mpileup command,
                        and report the occurrence of the 4 nucleotides in each
                        position.
    seqz_binning        Perform the binning of the seqz file to reduce file
                        size and memory requirement for the analysis.
    snp2seqz            Parse VCFs and other variant and coverage formats to
                        produce seqz files

optional arguments:
  -h, --help            show this help message and exit
  -v, --verbose         Show all logging information

[user@cn3144 ~]$ sequenza-utils gc_wiggle -f /fdb/app_testdata/fasta/R64-1-1.cdna_nc.fa -o sequenza.out

[user@cn3144 ~]$ exit
salloc.exe: Relinquishing job allocation 46116226
[user@biowulf ~]$
Create a batch input file (e.g. sequenza-utils.sh). For example:
#!/bin/bash
module load sequenza-utils
sequenza-utils gc_wiggle -f /fdb/app_testdata/fasta/R64-1-1.cdna_nc.fa -o sequenza.out
Submit this job using the Slurm sbatch command.
sbatch sequenza-utils.sh
Create a swarmfile (e.g. sequenza-utils.swarm). For example:
sequenza-utils gc_wiggle -f s1.fa -o s1.out
sequenza-utils gc_wiggle -f s2.fa -o s2.out
sequenza-utils gc_wiggle -f s3.fa -o s3.out
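With many input files, the swarmfile need not be typed by hand. The following is a minimal sketch, assuming the FASTA files end in .fa, sit in one directory, and have no whitespace in their names. The helper name make_swarmfile is hypothetical and is not part of sequenza-utils or swarm.

```shell
#!/bin/bash
# Sketch: write one gc_wiggle command per FASTA file into a swarmfile.
# make_swarmfile is a hypothetical helper, not part of sequenza-utils or swarm.
# Assumes file names contain no whitespace (each swarmfile line is one command).
make_swarmfile() {
    local fasta_dir="$1"
    local swarmfile="$2"
    local fa base
    : > "$swarmfile"                     # start with an empty swarmfile
    for fa in "$fasta_dir"/*.fa; do
        [ -e "$fa" ] || continue         # skip the literal pattern when nothing matches
        base=$(basename "$fa" .fa)
        # Same form as the example swarmfile: -f input, -o output
        printf 'sequenza-utils gc_wiggle -f %s -o %s.out\n' "$fa" "$base" >> "$swarmfile"
    done
}

# Build the swarmfile from the .fa files in the current directory.
make_swarmfile . sequenza-utils.swarm
```

Each output name is derived from the input name (s1.fa gives s1.out), so the subjobs do not overwrite one another's results.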
Submit this job using the swarm command.
swarm -f sequenza-utils.swarm [-g #] [-t #] --module sequenza-utils
where
-g #                    | Number of gigabytes of memory required for each process (1 line in the swarm command file)
-t #                    | Number of threads/CPUs required for each process (1 line in the swarm command file)
--module sequenza-utils | Loads the sequenza-utils module for each subjob in the swarm
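As an illustration of how these options combine, the sketch below assembles a swarm command with explicit resource requests and prints it for inspection. The values of 4 GB and 2 threads are placeholders chosen for the example, not measured requirements of gc_wiggle; adjust them to what your jobs actually use.

```shell
#!/bin/bash
# Illustrative values only; they are assumptions, not recommendations.
mem_gb=4      # memory per process (per swarmfile line), in gigabytes
threads=2     # threads/CPUs per process (per swarmfile line)

# Assemble the submission command from the options described above.
swarm_cmd="swarm -f sequenza-utils.swarm -g $mem_gb -t $threads --module sequenza-utils"

# Print the command so it can be checked before submitting.
echo "$swarm_cmd"
```

Running the printed command on Biowulf submits the swarm with those per-process resources.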