PICRUSt on Biowulf

PICRUSt (pronounced “pie crust”) is a bioinformatics software package designed to predict metagenome functional content from marker gene (e.g., 16S rRNA) surveys and full genomes.

Example files can be copied from /usr/local/apps/picrust/tutorials.

Interactive Job on Biowulf

[user@cn3144 ~]$ module load picrust

[user@cn3144 ~]$ cp -r /usr/local/apps/picrust/tutorials /data/$USER/

[user@cn3144 ~]$ cd /data/$USER/tutorials

[user@cn3144 ~]$ unzip

[user@cn3144 ~]$ cd picrust_starting_files

[user@cn3144 ~]$ -t GG_tree.nwk -i -m GG_to_IMGv350.txt -o format/16S/
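Before starting a run, it can help to confirm that the input files are actually present in the working directory. The following is a minimal sketch of such a check, not part of PICRUSt itself. The helper name missing_inputs is hypothetical, and only the two file names that appear in the command above are checked; the file passed with -i can be appended to the list.

```shell
#!/bin/bash
# Pre-flight check: print the name of every expected input file that is
# missing from the current directory; return non-zero if any is absent.
# NOTE: missing_inputs is a hypothetical helper, not a PICRUSt command.
missing_inputs() {
    local status=0 f
    for f in "$@"; do
        if [ ! -f "$f" ]; then
            echo "$f"
            status=1
        fi
    done
    return "$status"
}

# File names taken from the command above; run inside picrust_starting_files.
if missing_inputs GG_tree.nwk GG_to_IMGv350.txt > /dev/null; then
    echo "all inputs present"
else
    echo "some inputs are missing" >&2
fi
```

Calling missing_inputs without the redirection lists the missing files one per line.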

Running a single batch job on Biowulf

1. Create a script file similar to the lines below.


#!/bin/bash

module load picrust
cd /data/$USER/picrust_starting_files
-t GG_tree.nwk -i -m GG_to_IMGv350.txt -o format/16S/

2. Submit the script on Biowulf:

$ sbatch jobscript

To request more memory than the default (4 GB), use the --mem flag:

$ sbatch --mem=10g jobscript
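As an alternative to passing --mem on the command line, Slurm also reads #SBATCH directives placed at the top of the batch script. The sketch below writes such a script. It rests on two assumptions: the PICRUSt script name format_tree_and_trait_table.py (its -t, -i, -m and -o options match the flags used on this page, but the page does not name the script) and the file name trait_table.tab, which is a hypothetical stand-in for the -i argument.

```shell
#!/bin/bash
# Write a batch script that carries its own memory request.
# ASSUMPTIONS: the script name format_tree_and_trait_table.py and the
# input file trait_table.tab are illustrative; substitute the real ones.
cat > jobscript <<'EOF'
#!/bin/bash
#SBATCH --mem=10g

module load picrust
cd /data/$USER/picrust_starting_files
format_tree_and_trait_table.py -t GG_tree.nwk -i trait_table.tab \
    -m GG_to_IMGv350.txt -o format/16S/
EOF

echo "wrote jobscript; submit with: sbatch jobscript"
```

With the directive in the script, a plain "sbatch jobscript" requests 10 GB. A --mem value given on the sbatch command line still takes precedence over the directive.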

Running a swarm of jobs on Biowulf

Set up a swarm command file, with one command line per job:

  cd /data/$USER/dir1; -t GG_tree.nwk -i -m GG_to_IMGv350.txt -o format/16S/
  cd /data/$USER/dir2; -t GG_tree.nwk -i -m GG_to_IMGv350.txt -o format/16S/
  cd /data/$USER/dir3; -t GG_tree.nwk -i -m GG_to_IMGv350.txt -o format/16S/
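When there are many directories, the swarm command file can be generated with a short loop instead of being written by hand. This is a minimal sketch under stated assumptions: the helper name make_swarmfile is hypothetical, the script name format_tree_and_trait_table.py is assumed from the flags used on this page, and trait_table.tab is a hypothetical stand-in for the -i argument.

```shell
#!/bin/bash
# Build a swarm command file with one line per data directory.
# ASSUMPTIONS: make_swarmfile is a hypothetical helper; the PICRUSt script
# name and trait_table.tab are illustrative and should be replaced.
make_swarmfile() {
    local out=$1; shift
    local cmd='format_tree_and_trait_table.py -t GG_tree.nwk -i trait_table.tab -m GG_to_IMGv350.txt -o format/16S/'
    local d
    : > "$out"    # start from an empty file
    for d in "$@"; do
        # The single-quoted format keeps $USER literal, so it is expanded
        # when the swarm job runs rather than when the file is written.
        printf 'cd /data/$USER/%s; %s\n' "$d" "$cmd" >> "$out"
    done
}

make_swarmfile swarmfile dir1 dir2 dir3
```

The resulting swarmfile has three lines, one for each of dir1, dir2 and dir3, in the same form as the example above.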

Submit the swarm file:

  $ swarm -f swarmfile --module picrust

-f: specify the swarmfile name
--module: load the required module for each command line in the file

To request more memory than the default (1.5 GB per command line), use the -g flag, which takes the amount in GB for each command line:

  $ swarm -f swarmfile -g 10 --module picrust

For more information regarding running swarm, see swarm.html