Microbial eukaryotes are integral components of natural microbial communities, and their inclusion is critical for many ecosystem studies, yet the majority of published metagenome analyses ignore eukaryotes. To include eukaryotes in environmental studies, eukaryotic genomes should be recovered from complex metagenomic samples. A key step in genome recovery is the separation of eukaryotic and prokaryotic fragments. EukRep is a k-mer- and SVM-based strategy for identifying eukaryotic sequences in environmental samples.
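To make the k-mer idea concrete, the following is a minimal sketch of how a sequence can be turned into a k-mer frequency vector, the kind of numeric feature a classifier such as an SVM could consume. It is not EukRep's actual implementation: the function name kmer_frequencies, the default k=5, and the plain normalized counts are all assumptions made for illustration.

```python
from collections import Counter
from itertools import product


def kmer_frequencies(sequence, k=5):
    """Return normalized k-mer frequencies for a DNA sequence.

    Hypothetical helper for illustration only; windows containing
    characters other than A, C, G, T (e.g. N) are skipped.
    """
    sequence = sequence.upper()
    valid = set("ACGT")
    counts = Counter()
    # Slide a window of width k across the sequence, one base at a time.
    for i in range(len(sequence) - k + 1):
        kmer = sequence[i:i + k]
        if set(kmer) <= valid:
            counts[kmer] += 1
    total = sum(counts.values())
    # Enumerate all 4**k possible k-mers in a fixed order so that every
    # sequence maps to a feature vector of the same length.
    all_kmers = ("".join(p) for p in product("ACGT", repeat=k))
    return {kmer: (counts[kmer] / total if total else 0.0) for kmer in all_kmers}


if __name__ == "__main__":
    freqs = kmer_frequencies("ACGTACGTNACG", k=2)
    for kmer, freq in freqs.items():
        if freq > 0:
            print(kmer, round(freq, 3))
```

Vectors of this kind, computed for sequences of known origin, could in principle be used to train a classifier that separates eukaryotic from prokaryotic fragments. EukRep ships its own trained models, so users do not need to do this themselves.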
Allocate an interactive session and run the program. Sample session:
[user@biowulf]$ sinteractive --mem=4g
[user@cn3316 ~]$ module load EukRep
From this point, the current installation provides several options. For example, just typing
[user@cn3316 ~]$ EukRep
Singularity EukRep_cpu.sqsh:>
will bring the user to the EukRep container environment. Typing
[user@cn3316 ~]$ EukRep python -h
usage: python [option] ... [-c cmd | -m mod | file | -] [arg] ...
Options and arguments (and corresponding environment variables):
-b     : issue warnings about str(bytes_instance), str(bytearray_instance)
         and comparing bytes/bytearray with str. (-bb: issue errors)
-B     : don't write .pyc files on import; also PYTHONDONTWRITEBYTECODE=x
-c cmd : program passed in as string (terminates option list)
-d     : debug output from parser; also PYTHONDEBUG=x
-E     : ignore PYTHON* environment variables (such as PYTHONPATH)
-h     : print this help message and exit (also --help)
-i     : inspect interactively after running script; forces a prompt even
         if stdin does not appear to be a terminal; also PYTHONINSPECT=x
...
will display the usage message, together with a list of available options. Finally, the following command will run an EukRep test:
[user@cn3316 ~]$ EukRep python $EUKREP_TESTS/EukRep_tests.py
...
------------------------------------------------------------
Prediction tests to be performed on 13 total sequences
Test sequences located in /usr/local/apps/EukRep/20180308/tests/test_sequences/test_scaffolds.fa
------------------------------------------------------------
Running test predictions with 3mers...
Expect the following sequences to be predicted to be eukaryotic:
test_seqeunce_1 test_sequence_2 test_sequence_4 test_sequence_5 test_sequence_6 test_sequence_7 test_sequence_8 test_sequence_9 test_sequence_10 test_sequence_11 test_sequence_12 test_sequence_13
The following sequences were predicted to be eukaryotic:
test_seqeunce_1 test_sequence_2 test_sequence_4 test_sequence_5 test_sequence_6 test_sequence_7 test_sequence_8 test_sequence_9 test_sequence_10 test_sequence_11 test_sequence_12 test_sequence_13
Running test predictions with 4mers...
Expect the following sequences to be predicted to be eukaryotic:
test_sequence_2 test_sequence_5 test_sequence_12
The following sequences were predicted to be eukaryotic:
test_sequence_2 test_sequence_5 test_sequence_12
Running test predictions with 5mers...
Expect the following sequences to be predicted to be eukaryotic:
test_sequence_3 test_sequence_5
The following sequences were predicted to be eukaryotic:
test_sequence_3 test_sequence_5
Running test predictions with 6mers...
Expect the following sequences to be predicted to be eukaryotic:
test_sequence_3 test_sequence_5
The following sequences were predicted to be eukaryotic:
test_sequence_3 test_sequence_5
EukRep prediction appears to be in working order
[user@cn3144 ~]$ exit
salloc.exe: Relinquishing job allocation 46116226
[user@biowulf ~]$
Create a batch input file (e.g. EukRep.sh). For example:
#!/bin/bash
module load EukRep
EukRep python $EUKREP_TESTS/EukRep_tests.py
Submit this job using the Slurm sbatch command.
sbatch [--cpus-per-task=#] [--mem=#] EukRep.sh