Percolator is a software package for postprocessing of shotgun proteomics data.
Allocate an interactive session and run the program. This example runs through the test data provided by the developer.
Sample session (user input follows each $ prompt):
[user@biowulf percolator]$ sinteractive -c2 --mem=4g --gres=lscratch:10
salloc.exe: Pending job allocation 11290667
salloc.exe: job 11290667 queued and waiting for resources
salloc.exe: job 11290667 has been allocated resources
salloc.exe: Granted job allocation 11290667
salloc.exe: Waiting for resource configuration
salloc.exe: Nodes cn0863 are ready for job
srun: error: x11: no local DISPLAY defined, skipping
error: unable to open file /tmp/slurm-spank-x11.11290667.0
slurmstepd: error: x11: unable to read DISPLAY value
[user@cn0863 percolator]$ cd /lscratch/$SLURM_JOB_ID
[user@cn0863 11290667]$ module load percolator
[+] Loading percolator 3.6.5 on cn4268
[user@cn0863 11290667]$ mkdir test; cd test
[user@cn0863 11290667]$ tar xf $PERCOLATOR_DATA/yeast-01.sqt.tar.gz
[user@cn0863 11290667]$ sqt2pin -o pin.tab yeast-01.sqt yeast-01.shuffled.sqt
Written by Lukas Käll (lukas.kall@scilifelab.se) in the School of Biotechnology at KTH - Royal Institute of Technology, Stockholm.
Issued command:
sqt2pin -o pin.tab yeast-01.sqt yeast-01.shuffled.sqt
on biowulf.nih.gov
Reading yeast-01.sqt
Reading yeast-01.shuffled.sqt
[user@cn0863 11290667]$ percolator -v 1 -X pout.xml pin.tab > yeast-01.psms
Percolator version 3.06.05, Build Date Feb 15 2024 13:55:38
Copyright (c) 2006-9 University of Washington. All rights reserved.
Written by Lukas Käll (lukall@u.washington.edu) in the
Department of Genome Sciences at the University of Washington.
Issued command:
percolator -v 1 -X pout.xml pin.tab
Started Thu Feb 15 09:56:45 2024 on biowulf.nih.gov
Hyperparameters: selectionFdr=0.01, Cpos=0, Cneg=0, maxNiter=10
Finding protein decoy prefix for pin.tab
Using protein decoy prefix "random_"
Separate target and decoy search inputs detected, using mix-max method.
Selecting Cpos by cross-validation.
Selecting Cneg by cross-validation.
Found 7004 test set positives with q<0.01 in initial direction
---Training with Cpos selected by cross validation, Cneg selected by cross validation, initial_fdr=0.01, fdr=0.01
Found 11446 test set PSMs with q<0.01.
Tossing out "redundant" PSMs keeping only the best scoring PSM for each unique peptide.
Selecting pi_0=0.86749
Calculating q values.
New pi_0 estimate on final list yields 7371 target peptides with q<0.01.
Calculating posterior error probabilities (PEPs).
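Once the run finishes, a quick check of the tab-delimited results redirected from standard output can confirm how many rows pass a q-value cutoff. The following is a minimal sketch, not part of the percolator package: the function name count_below_q is hypothetical, and it assumes the file starts with a header line that contains a column named "q-value". If your output uses a different header, adjust the column name accordingly.

```shell
#!/bin/bash
# Count rows in a tab-delimited result file whose q-value is below a cutoff.
# Assumption: line 1 is a header that contains a column named "q-value".
# count_below_q is a hypothetical helper, not a percolator command.
count_below_q() {
    local file=$1 cutoff=${2:-0.01}
    awk -F'\t' -v cutoff="$cutoff" '
        NR == 1 {
            # Locate the q-value column by name instead of hard-coding its position
            for (i = 1; i <= NF; i++) if ($i == "q-value") col = i
            if (!col) { print "no q-value column found" > "/dev/stderr"; exit 1 }
            next
        }
        ($col + 0) < (cutoff + 0) { n++ }
        END { if (col) print n + 0 }
    ' "$file"
}

# Usage (file name taken from the session above):
#   count_below_q yeast-01.psms 0.01
```

The column is looked up by name so the sketch keeps working if the column order differs from what is assumed here.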
Create a batch input file (e.g. percolator.sh). For example:
#!/bin/bash
set -e
module load percolator
percolator -X pout.xml pin.tab > out.psms
Submit this job using the Slurm sbatch command.
sbatch [--cpus-per-task=#] [--mem=#] percolator.sh
Create a swarmfile (e.g. percolator.swarm). For example:
percolator -X pout1.xml pin1.tab > out1.psms
percolator -X pout2.xml pin2.tab > out2.psms
percolator -X pout3.xml pin3.tab > out3.psms
percolator -X pout4.xml pin4.tab > out4.psms
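With many input files, the swarmfile can be generated rather than typed by hand. The sketch below is one possible approach, assuming inputs follow the pinN.tab naming used in the example above; the function name make_swarm is hypothetical. It writes one percolator command per input file, using the same relative file names as the example.

```shell
#!/bin/bash
# Emit one percolator command line per pin*.tab file found in a directory.
# Assumption: inputs are named pin<N>.tab, as in the example swarmfile.
# make_swarm is a hypothetical helper, not part of percolator or swarm.
make_swarm() {
    local dir=${1:-.} pin base n
    for pin in "$dir"/pin*.tab; do
        [ -e "$pin" ] || continue      # glob matched nothing: emit no lines
        base=$(basename "$pin" .tab)   # e.g. pin1
        n=${base#pin}                  # e.g. 1
        echo "percolator -X pout${n}.xml ${base}.tab > out${n}.psms"
    done
}

# Usage, run from the directory that holds the input files:
#   make_swarm . > percolator.swarm
```

Because the generated lines use relative file names, the resulting swarmfile is meant to be used from the directory that contains the pin files.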
Submit this job using the swarm command.
swarm -f percolator.swarm [-g #] [-t #] --module percolator
where
-g # | Number of gigabytes of memory required for each process (1 line in the swarm command file)
-t # | Number of threads/CPUs required for each process (1 line in the swarm command file)
--module percolator | Loads the percolator module for each subjob in the swarm