From the Rockhopper manual:
Rockhopper is a comprehensive and user-friendly system for computational analysis of bacterial RNA-seq data. As input, Rockhopper takes RNA sequencing reads output by high-throughput sequencing technology (FASTQ, QSEQ, FASTA, SAM, or BAM files).
Rockhopper has a graphical interface which is described in the Rockhopper manual. For the cluster, the most relevant mode of usage is the command line interface.
Rockhopper can run multithreaded (-p). Please match the number of allocated CPUs with the number of threads.
The $ROCKHOPPER_JAR variable is set to the path of the jar file.
Example data is available in $ROCKHOPPER_TEST_DATA.
Allocate an interactive session and run the program. If using the GUI, make sure that you have a graphical connection to biowulf (NX or X11 forwarding through ssh). Sample session:
[user@biowulf]$ sinteractive --mem=5g --cpus-per-task=4 --gres=lscratch:20
salloc.exe: Pending job allocation 46116226
salloc.exe: job 46116226 queued and waiting for resources
salloc.exe: job 46116226 has been allocated resources
salloc.exe: Granted job allocation 46116226
salloc.exe: Waiting for resource configuration
salloc.exe: Nodes cn3144 are ready for job
[user@cn3144]$ module load rockhopper
[user@cn3144]$ # start up the GUI. Need to know the http proxy host for this.
[user@cn3144]$ # Note that the proxy may change from session to session
[user@cn3144]$ echo $http_proxy
http://dtn03-e0:3128
[user@cn3144]$ java -Dhttp.proxyHost=dtn03-e0 -Dhttp.proxyPort=3128 -jar $ROCKHOPPER_JAR
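Because the proxy host can differ between sessions, the host and port can be derived from $http_proxy rather than typed by hand. The following is a minimal sketch, assuming $http_proxy has the form http://host:port as in the session above. The function names proxy_host, proxy_port, and rockhopper_gui are hypothetical helpers, not part of Rockhopper or the module.

```shell
# Hypothetical helpers: split a proxy URL such as http://dtn03-e0:3128.

# Print the host part of the proxy URL given as $1.
proxy_host() {
    local hostport=${1#*://}     # drop the scheme, if present
    hostport=${hostport%%/*}     # drop any trailing path
    printf '%s\n' "${hostport%%:*}"
}

# Print the port part of the proxy URL given as $1 (empty if there is none).
proxy_port() {
    local hostport=${1#*://}
    hostport=${hostport%%/*}
    case $hostport in
        *:*) printf '%s\n' "${hostport##*:}" ;;
        *)   printf '\n' ;;
    esac
}

# Start the GUI with the proxy reported by the current session.
# Assumes http_proxy and ROCKHOPPER_JAR are set (module load rockhopper).
rockhopper_gui() {
    java -Dhttp.proxyHost="$(proxy_host "$http_proxy")" \
         -Dhttp.proxyPort="$(proxy_port "$http_proxy")" \
         -jar "$ROCKHOPPER_JAR"
}
```

After `module load rockhopper`, calling `rockhopper_gui` in the interactive session would start the GUI with whatever proxy that session reports.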
For more details on using the GUI, please see the Rockhopper manual. Now let's use the command line interface to do a reference-based analysis of a small Mycoplasma genitalium data set:
[user@cn3144]$ cd /lscratch/$SLURM_JOB_ID
[user@cn3144]$ cp -r ${ROCKHOPPER_TEST_DATA:-none}/* .
[user@cn3144]$ tree
.
|-- [user 16M] Example_Condition1.fastq
|-- [user 16M] Example_Condition2.fastq
`-- [user 4.0K] Mycoplasma_genitalium_G37
    |-- [user 575K] NC_000908.fna
    |-- [user 46] NC_000908.fna.fai
    |-- [user 10K] NC_000908.genome
    |-- [user 59K] NC_000908.gff
    |-- [user 37K] NC_000908.ptt
    `-- [user 2.3K] NC_000908.rnt
[user@cn3144]$ mkdir tmp
[user@cn3144]$ java -Xmx4000m -Djava.io.tmpdir=$PWD/tmp -cp $ROCKHOPPER_JAR Rockhopper \
-g $PWD/Mycoplasma_genitalium_G37 \
-p $SLURM_CPUS_PER_TASK \
-L cond1,cond2 \
-o $PWD/results \
Example_Condition1.fastq Example_Condition2.fastq
Aligning sequencing reads from file: Example_Condition1.fastq
Total reads: 137473
Successfully aligned reads: 125676 91% (Mycoplasma genitalium G37 chromosome)
Aligning (sense) to protein-coding genes: 96%
Aligning (antisense) to protein-coding genes: 0%
Aligning (sense) to ribosomal RNAs: 0%
Aligning (antisense) to ribosomal RNAs: 0%
Aligning (sense) to transfer RNAs: 1%
[...snip...]
[user@cn3144]$ tree results
results/
|-- [user 4.0K] genomeBrowserFiles
|   |-- [user 1.4M] Example_Condition1_NC_000908_v1_m15_aT_d500_l33_fr_cF.minus.wig
|   |-- [user 1.5M] Example_Condition1_NC_000908_v1_m15_aT_d500_l33_fr_cF.plus.wig
|   |-- [user 1.1M] NC_000908_diffExpressedGenes.wig
|   |-- [user 1.1M] NC_000908_ncRNAs.wig
|   |-- [user 1.3M] NC_000908_operons.wig
|   `-- [user 1.1M] NC_000908_UTRs.wig
|-- [user 4.0K] intermediary
|   `-- [user 186K] Example_Condition1_NC_000908_v1_m15_aT_d500_l33_fr_cF.gz
|-- [user 4.4K] NC_000908_operons.txt
|-- [user 37K] NC_000908_transcripts.txt
`-- [user 2.0K] summary.txt
[user@cn3144]$ cp -r results /data/$USER
[user@cn3144]$ exit
salloc.exe: Relinquishing job allocation 46116226
[user@biowulf]$
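The session above hard-codes -Xmx4000m for a 5 GB allocation. A minimal sketch of deriving the Java heap size from the allocation instead, assuming Slurm exports SLURM_MEM_PER_NODE in megabytes (as it typically does when --mem is given). The 80% heap fraction is a rule of thumb chosen for illustration, not a Rockhopper requirement, and heap_mb and rockhopper_java_opts are hypothetical helper names.

```shell
# Hypothetical helpers for sizing the JVM to the Slurm allocation.

# Print 80% of the memory given in MB as $1, leaving headroom for JVM overhead.
heap_mb() {
    echo $(( $1 * 80 / 100 ))
}

# Print the JVM options for Rockhopper.
# $1 is the temporary directory (defaults to $PWD/tmp, as in the session above).
# Falls back to 5120 MB if SLURM_MEM_PER_NODE is unset (an assumption).
rockhopper_java_opts() {
    local mb
    mb=$(heap_mb "${SLURM_MEM_PER_NODE:-5120}")
    printf '%s\n' "-Xmx${mb}m -Djava.io.tmpdir=${1:-$PWD/tmp}"
}
```

With --mem=5g this prints -Xmx4096m, close to the value used above. The output is meant to be used unquoted so that it splits into two arguments, for example `java $(rockhopper_java_opts) -cp $ROCKHOPPER_JAR Rockhopper ...`.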
Create a batch script (e.g. rockhopper.sh). For example:
#!/bin/bash
wd=$PWD
module load rockhopper/2.0.3
cd /lscratch/$SLURM_JOB_ID || exit 1
mkdir tmp
cp -r ${ROCKHOPPER_TEST_DATA:-none}/* .
java -Xmx4000m -Djava.io.tmpdir=$PWD/tmp -cp $ROCKHOPPER_JAR Rockhopper \
-g $PWD/Mycoplasma_genitalium_G37 \
-p $SLURM_CPUS_PER_TASK \
-L cond1,cond2 \
-o $PWD/results \
Example_Condition1.fastq Example_Condition2.fastq
cp -r results $wd
Submit this job using the Slurm sbatch command.
sbatch --cpus-per-task=4 --mem=5g --gres=lscratch:20 rockhopper.sh
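Since the batch script copies results out of /lscratch only in its last step, it can help to confirm that the run produced output before copying. A minimal sketch, assuming a successful run writes a non-empty summary.txt and at least one *_transcripts.txt file, as in the results listing from the interactive session. check_results is a hypothetical helper, not part of Rockhopper.

```shell
# Hypothetical helper: return 0 if the Rockhopper output directory $1 looks
# complete, non-zero otherwise.
check_results() {
    local dir=$1 f
    # -s is true only for a file that exists and is non-empty
    if [ ! -s "$dir/summary.txt" ]; then
        echo "missing or empty: $dir/summary.txt" >&2
        return 1
    fi
    # The transcripts file is named after the replicon, so match by suffix
    for f in "$dir"/*_transcripts.txt; do
        if [ -e "$f" ]; then
            return 0
        fi
    done
    echo "no *_transcripts.txt file in $dir" >&2
    return 1
}
```

In the batch script, the final copy could then be written as `check_results results && cp -r results $wd`, so that an incomplete run leaves nothing behind in the submission directory.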