High-Performance Computing at the NIH
Azimuth

Azimuth: Machine Learning-Based Predictive Modelling of CRISPR/Cas9 guide efficiency

References:

Multiple versions of Azimuth are available. The easiest way to select a version is with environment modules. To list the available modules, type

module avail Azimuth

To select a module, type

module load Azimuth/[ver]

where [ver] is the version of choice.

Environment variables set:

On Helix

Sample session:

module load Azimuth
python $AZIMUTH_HOME/examples/test.py

Batch job on Biowulf

Create a batch script (e.g. Azimuth.sh) that loads the module and runs your Python script (here my_Azimuth.py). For example:

#!/bin/bash
module load Azimuth
python my_Azimuth.py

Submit this job using the Slurm sbatch command.

sbatch --cpus-per-task=1 Azimuth.sh
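
The contents of my_Azimuth.py are not shown on this page. As a rough sketch only, such a script might read guides from a text file and score them with the same azimuth.model_comparison.predict call used in the interactive session on this page. The input layout (whitespace-separated columns: 30-mer, amino acid cut position, percent peptide), the helper names read_guides and main, and the command-line argument are all assumptions made for illustration, not part of Azimuth.

```python
import sys


def read_guides(path):
    """Parse a whitespace-separated file: 30-mer, amino acid cut position, percent peptide.

    Hypothetical input layout chosen for this sketch; blank lines and
    lines starting with '#' are skipped.
    """
    sequences, cut_positions, percent_peptides = [], [], []
    with open(path) as handle:
        for line in handle:
            fields = line.split()
            if not fields or fields[0].startswith("#"):
                continue
            sequences.append(fields[0].upper())
            cut_positions.append(int(fields[1]))
            percent_peptides.append(float(fields[2]))
    return sequences, cut_positions, percent_peptides


def main(path):
    # Imported here so read_guides can be used without the Azimuth module loaded.
    import numpy as np
    import azimuth.model_comparison

    sequences, cuts, peptides = read_guides(path)
    # Same three-argument call as in the interactive session on this page.
    predictions = azimuth.model_comparison.predict(
        np.array(sequences), np.array(cuts), np.array(peptides))
    for sequence, prediction in zip(sequences, predictions):
        sys.stdout.write("%s\t%s\n" % (sequence, prediction))


if __name__ == "__main__":
    if len(sys.argv) == 2:
        main(sys.argv[1])
    else:
        sys.stderr.write("usage: python my_Azimuth.py GUIDES_FILE\n")
```

With a script like this, the line in Azimuth.sh would become something like python my_Azimuth.py guides.txt, where guides.txt is a placeholder file name.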

Swarm of Jobs on Biowulf

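Create a swarmfile (e.g. Azimuth.swarm) with one command per line; swarm runs each line as an independent subjob. In practice each line would run a different script or input. For example: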

python my_Azimuth.py
python my_Azimuth.py
python my_Azimuth.py
python my_Azimuth.py

Submit this job using the swarm command.

swarm -f Azimuth.swarm --module Azimuth
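
When there are many inputs, the swarmfile can be generated rather than typed by hand. The sketch below is one way to do that; it assumes my_Azimuth.py accepts an input file path as its only argument, and the file names guides_1.txt through guides_4.txt and the function name build_swarm_lines are hypothetical.

```python
def build_swarm_lines(input_files, script="my_Azimuth.py"):
    """Return one 'python <script> <input>' command per input file."""
    return ["python %s %s" % (script, name) for name in input_files]


if __name__ == "__main__":
    # Hypothetical input files; replace with your own list.
    inputs = ["guides_%d.txt" % i for i in range(1, 5)]
    print("\n".join(build_swarm_lines(inputs)))
```

Redirecting the output to a file (for example python make_swarm.py > Azimuth.swarm, where make_swarm.py is whatever name you save the sketch under) produces a swarmfile with one command per input.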
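
Interactive job on Biowulf

Allocate an interactive session with the sinteractive command, then load the module and start Python. Sample session:
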
$ module load Azimuth
[+] Loading Azimuth 2.0 ...
$ python
Python 2.7.13 |Continuum Analytics, Inc.| (default, Dec 20 2016, 23:09:15)
[GCC 4.4.7 20120313 (Red Hat 4.4.7-1)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
Anaconda is brought to you by Continuum Analytics.
Please check out: http://continuum.io/thanks and https://anaconda.org
>>> import azimuth.model_comparison
>>> import numpy as np
>>> sequences = np.array(['ACAGCTGATCTCCAGATATGACCATGGGTT', 'CAGCTGATCTCCAGATATGACCATGGGTTT', 'CCAGAAGTTTGAGCCACAAACCCATGGTCA'])
>>> amino_acid_cut_positions = np.array([2, 2, 4])
>>> percent_peptides = np.array([0.18, 0.18, 0.35])
>>> predictions = azimuth.model_comparison.predict(sequences, amino_acid_cut_positions, percent_peptides)
No model file specified, using V3_model_full
>>> for i, prediction in enumerate(predictions):
...     print sequences[i], prediction
...
ACAGCTGATCTCCAGATATGACCATGGGTT 0.672298196907
CAGCTGATCTCCAGATATGACCATGGGTTT 0.687944237021
CCAGAAGTTTGAGCCACAAACCCATGGTCA 0.659245390401
>>> quit()
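
The three example sequences above are each 30 nucleotides long with GG at positions 26-27. As far as we understand, Azimuth expects this layout (a 30-mer containing the guide, an NGG PAM, and flanking context), but treat that as an assumption and confirm it against the Azimuth documentation. A small pre-check like the following, where check_30mer is an illustrative helper and not part of Azimuth, can catch malformed input before a job is submitted:

```python
def check_30mer(sequence):
    """Return a list of problems found in one candidate sequence (empty if none).

    Assumes the expected layout is a 30-nt sequence with 'GG' at
    0-based positions 25 and 26 (the GG of an NGG PAM).
    """
    problems = []
    seq = sequence.upper()
    if len(seq) != 30:
        problems.append("length is %d, expected 30" % len(seq))
    if set(seq) - set("ACGT"):
        problems.append("contains characters other than A, C, G, T")
    if len(seq) >= 27 and seq[25:27] != "GG":
        problems.append("positions 26-27 are %s, expected GG" % seq[25:27])
    return problems


if __name__ == "__main__":
    # The three sequences from the session above.
    examples = [
        "ACAGCTGATCTCCAGATATGACCATGGGTT",
        "CAGCTGATCTCCAGATATGACCATGGGTTT",
        "CCAGAAGTTTGAGCCACAAACCCATGGTCA",
    ]
    for example in examples:
        print("%s %s" % (example, check_30mer(example) or "ok"))
```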
Documentation