Biowulf High Performance Computing at the NIH
eggNOG-mapper: fast genome-wide functional annotation through orthology assignment.

eggNOG-mapper is a tool for functional annotation of large sets of sequences based on fast orthology assignments using precomputed clusters and phylogenies from the eggNOG database. Orthology assignment is ideally suited for functional inference. However, predicting orthology is computationally intensive at large scale, and most other pipelines are relatively inaccessible (e.g., new assignments are only available through database updates), so less precise homology-based functional transfer was previously the default for (meta-)genome annotation.


Important Notes

Interactive job
Interactive jobs should be used for debugging, graphics, or applications that cannot be run as batch jobs.

Allocate an interactive session and run the program. Sample session:

[user@biowulf]$ sinteractive --mem=16g --gres=lscratch:10
[user@cn3200 ~]$ module load eggnog-mapper  
[+] Loading python 3.7  ...
[+] Loading eggnog-mapper 2.1.2  ...
[user@cn3200 ~]$ ls $EGGNOG_BIN
diamond       python       shell
Copy the test data files:
[user@cn3200 ~]$ mkdir /data/$USER/eggnog-mapper && cd /data/$USER/eggnog-mapper
[user@cn3200 ~]$ cp $EGGNOG_TEST/* .
Run on the test data using the bacteria.dmnd diamond database:
[user@cn3200 ~]$ emapper.py --dmnd_db $EGGNOG_DATA_DIR/bacteria.dmnd -i test_queries.fa -o test
 /opt/conda/envs/eggNOGmapper/lib/python3.7/site-packages/eggnog_mapper-2.1.6-py3.7.egg/eggnogmapper/bin/diamond blastp -d /usr/local/apps/eggnog-mapper/2.1.6/data/bacteria.dmnd -q /gs7/users/user/eggnog-mapper/test_queries.fa --threads 1 -o /gs7/users/user/eggnog-mapper/test.emapper.hits  --sensitive --iterate -e 0.001 --top 3  --outfmt 6 qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore qcovhsp scovhsp
Functional annotation of hits...
0 3.5762786865234375e-06 0.00 q/s (% mem usage: 3.90, % mem avail: 96.09)
2 2.5320394039154053 0.79 q/s (% mem usage: 3.90, % mem avail: 96.08)
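The diamond step in the log above writes its raw hits to test.emapper.hits as a tab-separated table, with the columns listed after --outfmt 6. A minimal sketch for counting hits per query, assuming that column layout. The helper name count_hits and the sample rows are made up for illustration and are not real output:

```shell
#!/bin/bash
set -euo pipefail

# Hypothetical stand-in for test.emapper.hits: 14 tab-separated columns
# (qseqid sseqid pident length mismatch gapopen qstart qend sstart send
#  evalue bitscore qcovhsp scovhsp). The values are invented.
hits=$(mktemp)
printf 'q1\ts1\t90.0\t100\t10\t0\t1\t100\t1\t100\t1e-50\t200\t100\t100\n' >  "$hits"
printf 'q1\ts2\t80.0\t100\t20\t0\t1\t100\t1\t100\t1e-40\t150\t100\t100\n' >> "$hits"
printf 'q2\ts3\t95.0\t120\t6\t0\t1\t120\t1\t120\t1e-60\t250\t100\t100\n'  >> "$hits"

# count_hits FILE: print "query<TAB>number_of_hits", sorted by query name.
# Lines starting with '#' and empty lines are skipped.
count_hits() {
    awk -F'\t' '!/^#/ && NF { n[$1]++ } END { for (q in n) print q "\t" n[q] }' "$1" | sort
}

summary=$(count_hits "$hits")
echo "$summary"

rm -f "$hits"
```

For a real run, drop the sample-data lines and call count_hits test.emapper.hits in the output directory.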
Alternatively, you can create a diamond database on your own:
[user@cn3200 ~]$ mkdir data 
[user@cn3200 ~]$ export  EGGNOG_DATA_DIR=./data 
[user@cn3200 ~]$ create_dbs.py -m diamond --dbname bacteria --taxa Bacteria
This will create a bacteria.dmnd diamond database in the directory specified by the EGGNOG_DATA_DIR environment variable.
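Before starting a long annotation run, it can help to confirm that the database file is where EGGNOG_DATA_DIR points. A minimal sketch: check_dmnd is a hypothetical helper, not part of eggnog-mapper, and a temporary directory with an empty placeholder file stands in for a real data directory:

```shell
#!/bin/bash
set -euo pipefail

# check_dmnd NAME: succeed if $EGGNOG_DATA_DIR/NAME.dmnd is a readable file.
# (Hypothetical helper; it only tests for the file, not its contents.)
check_dmnd() {
    local db="${EGGNOG_DATA_DIR:?EGGNOG_DATA_DIR is not set}/$1.dmnd"
    if [ -f "$db" ] && [ -r "$db" ]; then
        echo "found: $db"
    else
        echo "missing: $db" >&2
        return 1
    fi
}

# Demo setup only: a throwaway directory with an empty placeholder file.
demo_dir=$(mktemp -d)
export EGGNOG_DATA_DIR="$demo_dir"
touch "$EGGNOG_DATA_DIR/bacteria.dmnd"

found_msg=$(check_dmnd bacteria)
echo "$found_msg"

# A database that was never created is reported as missing.
if check_dmnd archaea 2>/dev/null; then archaea_status=0; else archaea_status=1; fi
echo "archaea check exit status: $archaea_status"

rm -rf "$demo_dir"
```

In a real session, skip the demo setup and call check_dmnd bacteria after setting EGGNOG_DATA_DIR as shown above.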