Cufflinks assembles transcripts, estimates their abundances, and tests for differential expression and regulation in RNA-Seq samples. It accepts aligned RNA-Seq reads and assembles the alignments into a parsimonious set of transcripts. Cufflinks then estimates the relative abundances of these transcripts based on how many reads support each one.
Cufflinks is a collaborative effort between the Laboratory for Mathematical and Computational Biology, led by Lior Pachter at UC Berkeley, Steven Salzberg's group at the University of Maryland Center for Bioinformatics and Computational Biology, and Barbara Wold's lab at Caltech.
Cufflinks is provided under the OSI-approved Boost License.
Illumina has provided the RNA-Seq user community with iGenomes, a set of genome sequence indexes (including Bowtie, Bowtie2, and BWA indexes) and GTF transcript annotation files. These files can be used with TopHat and Cufflinks to quickly perform expression analysis and gene discovery. The annotation files are augmented with the tss_id and p_id GTF attributes that Cufflinks needs to perform differential splicing, CDS output, and promoter usage analysis.
Please note that Cufflinks has entered a low-maintenance, low-support stage, as it is now largely superseded by StringTie, which provides the same core functionality (i.e. transcript assembly and quantification) much more efficiently.
References:
- Cufflinks: Cole Trapnell, Brian Williams, Geo Pertea, Ali Mortazavi, Gordon Kwan, Jeltje van Baren, Steven Salzberg, Barbara Wold, Lior Pachter. Transcript assembly and quantification by RNA-Seq reveals unannotated transcripts and isoform switching during cell differentiation. Nature Biotechnology, 2010.
- Cufflinks -b: Adam Roberts, Cole Trapnell, Julie Donaghey, John L. Rinn, Lior Pachter. Improving RNA-Seq expression estimates by correcting for fragment bias. Genome Biology, 2011.
- Cufflinks [RABT mode]: Adam Roberts, Harold Pimentel, Cole Trapnell, Lior Pachter. Identification of novel transcripts in annotated genomes using RNA-Seq. Bioinformatics, 2011.
- Cuffdiff: Cole Trapnell, David Hendrickson, Martin Sauvageau, Loyal Goff, John L. Rinn, Lior Pachter. Differential analysis of gene regulation at transcript resolution with RNA-seq. Nature Biotechnology, 2012.
- Module Name: cufflinks (see the modules page for more information)
- Multithreaded
- Reference data in /fdb/igenomes/, downloaded from iGenomes
There is a patched version of cufflinks available:
module load cufflinks/2.2.1_patched
The patch significantly accelerates progress at positions where thousands of mate pairs have the same location. The patched version seems to help when working with the Ensembl human annotation.
Allocate an interactive session and run the program. Sample session:
[user@biowulf]$ sinteractive
salloc.exe: Pending job allocation 46116226
salloc.exe: job 46116226 queued and waiting for resources
salloc.exe: job 46116226 has been allocated resources
salloc.exe: Granted job allocation 46116226
salloc.exe: Waiting for resource configuration
salloc.exe: Nodes cn3144 are ready for job
[user@cn3144 ~]$ module load cufflinks
[user@cn3144 ~]$ cufflinks file.sam
[user@cn3144 ~]$ exit
salloc.exe: Relinquishing job allocation 46116226
[user@biowulf ~]$
Create a batch input file (e.g. cufflinks.sh). For example:
#!/bin/bash
cd /data/$USER/mydir
module load cufflinks
cufflinks -p $SLURM_CPUS_PER_TASK inputFile
Submit this job using the Slurm sbatch command.
sbatch [--cpus-per-task=#] [--mem=#] cufflinks.sh
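The batch script above passes $SLURM_CPUS_PER_TASK straight to -p. Slurm sets that variable only when --cpus-per-task is given, so a script submitted without it would call cufflinks with an empty thread count. The following is a minimal sketch of a more defensive variant, not a site-recommended script: the function name cufflinks_cmd, the input file name accepted_hits.bam, and the output directory cufflinks_out are all hypothetical placeholders. It only assembles and prints the command line, using the cufflinks options -p (threads) and -o (output directory), so the result can be inspected before it is run.

```shell
#!/bin/bash
# Sketch: build a cufflinks command line with a safe thread-count fallback.
# Assumptions (not from the original text): the function name, the input
# file name, and the output directory name are illustrative placeholders.

cufflinks_cmd() {
    local input="$1"                      # aligned reads (SAM or BAM)
    local outdir="${2:-cufflinks_out}"    # hypothetical default output directory
    # Slurm sets SLURM_CPUS_PER_TASK only when --cpus-per-task is given,
    # so fall back to a single thread when it is unset or empty.
    local threads="${SLURM_CPUS_PER_TASK:-1}"
    # Note: paths containing spaces are not quoted by this simple sketch.
    printf 'cufflinks -p %s -o %s %s\n' "$threads" "$outdir" "$input"
}

# Print the command for inspection; it is not executed here.
cufflinks_cmd accepted_hits.bam
```

With SLURM_CPUS_PER_TASK unset, this prints a command using -p 1; inside a job submitted with --cpus-per-task=8, the same call would print -p 8.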
Create a swarmfile (e.g. cufflinks.swarm). For example:
cd /data/$USER/mydir1; cufflinks -p $SLURM_CPUS_PER_TASK inputFile
cd /data/$USER/mydir2; cufflinks -p $SLURM_CPUS_PER_TASK inputFile
cd /data/$USER/mydir3; cufflinks -p $SLURM_CPUS_PER_TASK inputFile
[...]
Submit this job using the swarm command.
swarm -f cufflinks.swarm [-g #] [-t #] --module cufflinks
where
-g # | Number of Gigabytes of memory required for each process (1 line in the swarm command file) |
-t # | Number of threads/CPUs required for each process (1 line in the swarm command file). |
--module cufflinks | Loads the cufflinks module for each subjob in the swarm |
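When there are many sample directories, the swarmfile can be generated instead of typed by hand. The sketch below assumes that every directory holds an alignment file with the same name; the function name make_swarmfile and the directory names are hypothetical, and the output follows the one-command-per-line layout of the swarmfile example above.

```shell
#!/bin/bash
# Sketch: write one swarm command line per sample directory.
# Assumptions (not from the original text): the function name is
# illustrative, and each directory contains a file named like the
# first argument (here the placeholder "inputFile").

make_swarmfile() {
    local input_name="$1"
    shift
    local dir
    for dir in "$@"; do
        # The single-quoted format string keeps $SLURM_CPUS_PER_TASK
        # literal, so it is expanded when each subjob runs rather than
        # when the swarmfile is generated.
        printf 'cd %s; cufflinks -p $SLURM_CPUS_PER_TASK %s\n' "$dir" "$input_name"
    done
}

# Example: three hypothetical sample directories under /data/$USER.
make_swarmfile inputFile \
    "/data/$USER/mydir1" "/data/$USER/mydir2" "/data/$USER/mydir3" \
    > cufflinks.swarm
```

The resulting cufflinks.swarm can then be submitted with the swarm command shown above.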