Cufflinks assembles transcripts, estimates their abundances, and tests for differential expression and regulation in RNA-Seq samples. It accepts aligned RNA-Seq reads and assembles the alignments into a parsimonious set of transcripts. Cufflinks then estimates the relative abundances of these transcripts based on how many reads support each one.
Cufflinks is a collaborative effort between the Laboratory for Mathematical and Computational Biology, led by Lior Pachter at UC Berkeley, Steven Salzberg's group at the University of Maryland Center for Bioinformatics and Computational Biology, and Barbara Wold's lab at Caltech.
Cufflinks is provided under the OSI-approved Boost License.
Illumina has provided the RNA-Seq user community with iGenomes, a collection of genome sequence indexes (including Bowtie, Bowtie2, and BWA indexes) and GTF transcript annotation files. These files can be used with TopHat and Cufflinks to quickly perform expression analysis and gene discovery. The annotation files are augmented with the tss_id and p_id GTF attributes that Cufflinks needs to perform differential splicing, CDS output, and promoter use analysis.
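Because the downstream analyses depend on the tss_id and p_id attributes, it can be worth checking that a given GTF file actually carries them before starting a run. The following is a minimal sketch of such a check. The has_attr helper and the one-line sample GTF are made up for illustration and are not part of Cufflinks or iGenomes; point the helper at your own annotation file instead.

```shell
#!/bin/bash
# Sketch: report whether a GTF file carries the tss_id and p_id attributes.
# has_attr is a hypothetical helper, not part of Cufflinks.
has_attr() {
    local attr="$1" gtf="$2"
    # GTF attributes live in column 9 and look like: key "value";
    grep -v '^#' "$gtf" | cut -f9 | grep -Eq "(^|[ ;])${attr} \""
}

# Tiny made-up GTF record, used only to demonstrate the check.
sample_gtf="$(mktemp)"
printf 'chr1\tdemo\texon\t100\t200\t.\t+\t.\tgene_id "G1"; transcript_id "T1"; tss_id "TSS1"; p_id "P1";\n' > "$sample_gtf"

for attr in tss_id p_id; do
    if has_attr "$attr" "$sample_gtf"; then
        echo "$attr: present"
    else
        echo "$attr: missing"
    fi
done
```

If either attribute is reported as missing, the analyses that rely on it would likely not be available for that annotation.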
Please note that Cufflinks has entered a low-maintenance, low-support stage, as it is now largely superseded by StringTie, which provides the same core functionality (i.e. transcript assembly and quantification) in a much more efficient way.
There is a patched version of cufflinks available:
module load cufflinks/2.2.1_patched
The patch significantly accelerates progress at positions where thousands of mate pairs have the same location. The patched version seems to help when working with the Ensembl human annotation.
Allocate an interactive session and run the program. Sample session:
[user@biowulf]$ sinteractive
salloc.exe: Pending job allocation 46116226
salloc.exe: job 46116226 queued and waiting for resources
salloc.exe: job 46116226 has been allocated resources
salloc.exe: Granted job allocation 46116226
salloc.exe: Waiting for resource configuration
salloc.exe: Nodes cn3144 are ready for job

[user@cn3144 ~]$ module load cufflinks
[user@cn3144 ~]$ cufflinks file.sam

[user@cn3144 ~]$ exit
salloc.exe: Relinquishing job allocation 46116226
[user@biowulf ~]$
Create a batch input file (e.g. cufflinks.sh). For example:
#!/bin/bash
cd /data/$USER/mydir
module load cufflinks
cufflinks -p $SLURM_CPUS_PER_TASK inputFile
Submit this job using the Slurm sbatch command.
sbatch [--cpus-per-task=#] [--mem=#] cufflinks.sh
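As a concrete illustration, the sketch below writes the batch script from this section to disk, checks its syntax, and then submits it with an explicit resource request. The values of 8 CPUs and 16 GB are assumptions chosen only for the example; adjust them to your data. Slurm sets SLURM_CPUS_PER_TASK from --cpus-per-task, so the thread count passed to cufflinks -p follows the request.

```shell
#!/bin/bash
# Sketch: write the batch script, check its syntax, then submit if Slurm is available.
# The quoted 'EOF' keeps $USER and $SLURM_CPUS_PER_TASK literal in the script.
cat > cufflinks.sh <<'EOF'
#!/bin/bash
cd /data/$USER/mydir
module load cufflinks
cufflinks -p $SLURM_CPUS_PER_TASK inputFile
EOF

# bash -n parses the script without executing it.
bash -n cufflinks.sh && echo "syntax OK"

# Example resource request (8 CPUs and 16 GB are assumed values).
submit_cmd=(sbatch --cpus-per-task=8 --mem=16g cufflinks.sh)
if command -v sbatch >/dev/null 2>&1; then
    "${submit_cmd[@]}"
else
    echo "sbatch not found; would run: ${submit_cmd[*]}"
fi
```

The guard around sbatch only exists so the sketch can be tried on a machine without Slurm; on the cluster the sbatch line alone is enough.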
Create a swarmfile (e.g. cufflinks.swarm). For example:
cd /data/$USER/mydir1; cufflinks -p $SLURM_CPUS_PER_TASK inputFile
cd /data/$USER/mydir2; cufflinks -p $SLURM_CPUS_PER_TASK inputFile
cd /data/$USER/mydir3; cufflinks -p $SLURM_CPUS_PER_TASK inputFile
[...]
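With many samples, a swarmfile like this is easier to generate than to type. The loop below is one possible sketch; mydir1 to mydir3 and inputFile are the placeholder names used in this document, not real paths.

```shell
#!/bin/bash
# Sketch: build cufflinks.swarm with one command line per sample directory.
swarmfile="cufflinks.swarm"
: > "$swarmfile"   # create or truncate the swarmfile

for dir in mydir1 mydir2 mydir3; do
    # The single-quoted format keeps $USER and $SLURM_CPUS_PER_TASK literal,
    # so they are expanded when each subjob runs, not when the file is written.
    printf 'cd /data/$USER/%s; cufflinks -p $SLURM_CPUS_PER_TASK inputFile\n' "$dir" >> "$swarmfile"
done

cat "$swarmfile"
```

Each line of the resulting file is an independent command, which is the unit that the per-process memory and thread settings apply to.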
Submit this job using the swarm command.
swarm -f cufflinks.swarm [-g #] [-t #] --module cufflinks
where
-g # | Number of gigabytes of memory required for each process (1 line in the swarm command file). |
-t # | Number of threads/CPUs required for each process (1 line in the swarm command file). |
--module cufflinks | Loads the cufflinks module for each subjob in the swarm. |
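For example, a submission with the options filled in might look like the sketch below. The values of 16 GB and 8 threads per process are assumptions for illustration only. The -t value is the one that reaches cufflinks through -p $SLURM_CPUS_PER_TASK on each swarmfile line.

```shell
#!/bin/bash
# Sketch: submit the swarmfile with explicit per-process resources.
mem_gb=16    # assumed memory per process, in gigabytes
threads=8    # assumed threads/CPUs per process

swarm_cmd=(swarm -f cufflinks.swarm -g "$mem_gb" -t "$threads" --module cufflinks)
if command -v swarm >/dev/null 2>&1; then
    "${swarm_cmd[@]}"
else
    echo "swarm not found; would run: ${swarm_cmd[*]}"
fi
```

The guard only makes the sketch harmless on a machine without swarm; on the cluster the swarm line alone is enough.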