DeepM6A: Detection and Elucidation of DNA Methylation on N6-Adenine
The DeepM6A application implements a deep-learning-based algorithm for predicting potential DNA 6mA sites de novo from sequence at single-nucleotide resolution. Applying this tool to three representative model organisms, Arabidopsis thaliana, Drosophila melanogaster and Escherichia coli, demonstrated that it outperforms conventional k-mer-based approaches in both accuracy and performance.
References:
- Fei Tan, Tian Tian, Xiurui Hou, Xiang Yu, Lei Gu, Fernanda Mafra, Brian D. Gregory, Zhi Wei and Hakon Hakonarson,
Elucidation of DNA methylation on N6-adenine with deep learning
Nature Machine Intelligence (2020), Volume 2, Pages 466–475. https://doi.org/10.1038/s42256-020-0211-4.
Important Notes
- Module Name: DeepM6A (see the modules page for more information)
- Unusual environment variables set
  - DM6A_HOME: installation directory
  - DM6A_BIN: executable directory
  - DM6A_DIR: source code directory
  - DM6A_DATA: sample data directory
  - DM6A_MODELS: pre-trained models directory
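
As a quick check that the module is loaded, the environment variables listed above can be read from Python. The following is a minimal sketch rather than part of DeepM6A: the helper names `deepm6a_paths` and `list_hdf5` are hypothetical, and the assumption that the sample data directory holds files with the `.hdf5` extension is based only on the file names used in the sample session.

```python
import os
from pathlib import Path

# Variable names as listed above; they are set when the DeepM6A module is loaded.
DM6A_VARS = ("DM6A_HOME", "DM6A_BIN", "DM6A_DIR", "DM6A_DATA", "DM6A_MODELS")


def deepm6a_paths(environ=None):
    """Map each DM6A_* variable to a Path, or to None if it is unset or empty."""
    env = os.environ if environ is None else environ
    return {name: (Path(env[name]) if env.get(name) else None) for name in DM6A_VARS}


def list_hdf5(directory):
    """Return the sorted *.hdf5 files below a directory (hypothetical helper)."""
    if directory is None or not Path(directory).is_dir():
        return []
    return sorted(Path(directory).rglob("*.hdf5"))


if __name__ == "__main__":
    paths = deepm6a_paths()
    for name, path in paths.items():
        print(f"{name:12s} {path if path else '(not set; is the module loaded?)'}")
    for data_file in list_hdf5(paths["DM6A_DATA"]):
        print("data file:", data_file)
```

Run without the module loaded, the script reports every variable as not set instead of raising an error.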
Interactive job
Interactive jobs should be used for debugging, graphics, or applications that cannot be run as batch jobs.
Allocate an interactive session and run the program. Sample session:
[user@biowulf]$ sinteractive --mem=16g --gres=gpu:p100:1,lscratch:10 -c4
[user@cn3104 ~]$ module load DeepM6A/0.1
[+] Loading CUDA Toolkit 9.0.176 ...
[+] Loading cuDNN 7.0 libraries...
[+] Loading DeepM6A 0.1
[user@cn3104 ~]$ cp -r $DM6A_DATA Data
[user@cn3104 ~]$ cp -r $DM6A_MODELS Models

The training, validation and testing data in the Data folder come as HDF5 files. To learn more about the format of these data, execute the commands:
[user@cn3104 ~]$ python
>>> import h5py
>>> trainmat = h5py.File('Data/m6A_30.train_1.hdf5', 'r')
>>> import numpy as np
>>> X_train = np.transpose(np.array(trainmat['x_train']), axes=(0, 2, 1))
>>> X_train.shape
(31406, 61, 4)
>>> y_train = np.array(trainmat['y_train'])
>>> y_train.shape
(31406,)

Here is how the data can be used for training a model:
>>> from DEM6A import DeepM6A
>>> DA = DeepM6A('./', epochs=5)
>>> DA.fit(train_name='Data/m6A_30.train_1.hdf5', valid_name='Data/m6A_30.valid_1.hdf5')
train_label: count (array([0, 1]), array([15703, 15703]))
valid_label: count (array([0, 1]), array([1962, 1962]))
building model...............
/usr/local/apps/DeepM6A/0.1/lib/python3.6/site-packages/DEM6A/deepm6a.py:68: UserWarning: Update your `Conv1D` call to the Keras 2 API: `Conv1D(activation="linear", name="conv1", input_shape=(61, 4), filters=80, kernel_size=4, strides=1, padding="valid", kernel_initializer="he_normal")`
  name = "conv1"))
/usr/local/apps/DeepM6A/0.1/lib/python3.6/site-packages/DEM6A/deepm6a.py:79: UserWarning: Update your `Conv1D` call to the Keras 2 API: `Conv1D(activation="linear", name="conv2", filters=80, kernel_size=2, strides=1, padding="valid", kernel_initializer="he_normal")`
  name = "conv2"))
/usr/local/apps/DeepM6A/0.1/lib/python3.6/site-packages/DEM6A/deepm6a.py:90: UserWarning: Update your `Conv1D` call to the Keras 2 API: `Conv1D(activation="linear", name="conv3", filters=80, kernel_size=4, strides=1, padding="valid", kernel_initializer="he_normal")`
  name="conv3"))
/usr/local/apps/DeepM6A/0.1/lib/python3.6/site-packages/DEM6A/deepm6a.py:101: UserWarning: Update your `Conv1D` call to the Keras 2 API: `Conv1D(activation="linear", name="conv4", filters=80, kernel_size=4, strides=1, padding="valid", kernel_initializer="he_normal")`
  name = "conv4"))
/usr/local/apps/DeepM6A/0.1/lib/python3.6/site-packages/DEM6A/deepm6a.py:113: UserWarning: Update your `Conv1D` call to the Keras 2 API: `Conv1D(padding="valid", activation="linear", name="conv5", filters=80, kernel_size=4, strides=1, kernel_initializer="he_normal")`
  name = "conv5"))
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
conv1 (Conv1D)               (None, 58, 80)            1360
_________________________________________________________________
leaky_re_lu_1 (LeakyReLU)    (None, 58, 80)            0
_________________________________________________________________
dropout_1 (Dropout)          (None, 58, 80)            0
_________________________________________________________________
conv2 (Conv1D)               (None, 57, 80)            12880
_________________________________________________________________
leaky_re_lu_2 (LeakyReLU)    (None, 57, 80)            0
_________________________________________________________________
dropout_2 (Dropout)          (None, 57, 80)            0
_________________________________________________________________
conv3 (Conv1D)               (None, 54, 80)            25680
_________________________________________________________________
leaky_re_lu_3 (LeakyReLU)    (None, 54, 80)            0
_________________________________________________________________
dropout_3 (Dropout)          (None, 54, 80)            0
_________________________________________________________________
conv4 (Conv1D)               (None, 51, 80)            25680
_________________________________________________________________
leaky_re_lu_4 (LeakyReLU)    (None, 51, 80)            0
_________________________________________________________________
dropout_4 (Dropout)          (None, 51, 80)            0
_________________________________________________________________
conv5 (Conv1D)               (None, 48, 80)            25680
_________________________________________________________________
leaky_re_lu_5 (LeakyReLU)    (None, 48, 80)            0
_________________________________________________________________
dropout_5 (Dropout)          (None, 48, 80)            0
_________________________________________________________________
flatten_1 (Flatten)          (None, 3840)              0
_________________________________________________________________
dense_1 (Dense)              (None, 100)               384100
_________________________________________________________________
leaky_re_lu_6 (LeakyReLU)    (None, 100)               0
_________________________________________________________________
dropout_6 (Dropout)          (None, 100)               0
_________________________________________________________________
dense_2 (Dense)              (None, 1)                 101
_________________________________________________________________
activation_1 (Activation)    (None, 1)                 0
=================================================================
Total params: 475,481
Trainable params: 475,481
Non-trainable params: 0
_________________________________________________________________
compiling and fitting model...........
Train on 31406 samples, validate on 3924 samples
Epoch 1/5
2020-08-18 16:33:51.512423: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2020-08-18 16:33:51.695354: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1432] Found device 0 with properties:
name: Tesla P100-PCIE-16GB major: 6 minor: 0 memoryClockRate(GHz): 1.3285
pciBusID: 0000:91:00.0
totalMemory: 15.90GiB freeMemory: 15.64GiB
2020-08-18 16:33:51.695432: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1511] Adding visible gpu devices: 0
2020-08-18 16:33:52.111563: I tensorflow/core/common_runtime/gpu/gpu_device.cc:982] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-08-18 16:33:52.111639: I tensorflow/core/common_runtime/gpu/gpu_device.cc:988]      0
2020-08-18 16:33:52.111655: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1001] 0:   N
2020-08-18 16:33:52.111800: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 15155 MB memory) -> physical GPU (device: 0, name: Tesla P100-PCIE-16GB, pci bus id: 0000:91:00.0, compute capability: 6.0)
 - 4s - loss: 0.6410 - acc: 0.6241 - val_loss: 0.4876 - val_acc: 0.7757

Epoch 00001: val_loss improved from inf to 0.48762, saving model to ./DeepM6A.hdf5
Epoch 2/5
 - 1s - loss: 0.4618 - acc: 0.7918 - val_loss: 0.3504 - val_acc: 0.8626

Epoch 00002: val_loss improved from 0.48762 to 0.35043, saving model to ./DeepM6A.hdf5
Epoch 3/5
 - 1s - loss: 0.4064 - acc: 0.8249 - val_loss: 0.3194 - val_acc: 0.8726

Epoch 00003: val_loss improved from 0.35043 to 0.31939, saving model to ./DeepM6A.hdf5
Epoch 4/5
 - 1s - loss: 0.3818 - acc: 0.8396 - val_loss: 0.3052 - val_acc: 0.8807

Epoch 00004: val_loss improved from 0.31939 to 0.30522, saving model to ./DeepM6A.hdf5
Epoch 5/5
 - 1s - loss: 0.3675 - acc: 0.8476 - val_loss: 0.3004 - val_acc: 0.8907

Epoch 00005: val_loss improved from 0.30522 to 0.30042, saving model to ./DeepM6A.hdf5
training done!

To perform prediction, use a pre-trained model from the Models folder. For example:
>>> DA.predict('Data/m6A_30.test_1.hdf5', 'results.csv', 'Models/Ecoli/bestmodel.hdf5')
/usr/local/apps/DeepM6A/0.1/lib/python3.6/site-packages/keras/engine/base_layer.py:1109: UserWarning: Update your `Conv1D` call to the Keras 2 API: `Conv1D(batch_input_shape=[None, 61,..., name="cov1", activity_regularizer=None, trainable=True, input_dtype="float32", activation="linear", input_shape=(61, 4), filters=80, kernel_size=4, strides=1, padding="valid", kernel_initializer="he_normal", kernel_regularizer=None, bias_regularizer=None, kernel_constraint=None, bias_constraint=None, use_bias=True)`
  return cls(**config)
/usr/local/apps/DeepM6A/0.1/lib/python3.6/site-packages/keras/engine/base_layer.py:1109: UserWarning: Update your `Dropout` call to the Keras 2 API: `Dropout(trainable=True, name="dropout_1", rate=0.2)`
  return cls(**config)
/usr/local/apps/DeepM6A/0.1/lib/python3.6/site-packages/keras/engine/base_layer.py:1109: UserWarning: Update your `Conv1D` call to the Keras 2 API: `Conv1D(name="cov2", activity_regularizer=None, trainable=True, activation="linear", input_shape=(None, Non..., filters=80, kernel_size=2, strides=1, padding="valid", kernel_initializer="he_normal", kernel_regularizer=None, bias_regularizer=None, kernel_constraint=None, bias_constraint=None, use_bias=True)`
  return cls(**config)
/usr/local/apps/DeepM6A/0.1/lib/python3.6/site-packages/keras/engine/base_layer.py:1109: UserWarning: Update your `Dropout` call to the Keras 2 API: `Dropout(trainable=True, name="dropout_2", rate=0.2)`
  return cls(**config)
/usr/local/apps/DeepM6A/0.1/lib/python3.6/site-packages/keras/engine/base_layer.py:1109: UserWarning: Update your `Conv1D` call to the Keras 2 API: `Conv1D(name="cov3", activity_regularizer=None, trainable=True, activation="linear", input_shape=(None, Non..., filters=80, kernel_size=4, strides=1, padding="valid", kernel_initializer="he_normal", kernel_regularizer=None, bias_regularizer=None, kernel_constraint=None, bias_constraint=None, use_bias=True)`
  return cls(**config)
/usr/local/apps/DeepM6A/0.1/lib/python3.6/site-packages/keras/engine/base_layer.py:1109: UserWarning: Update your `Dropout` call to the Keras 2 API: `Dropout(trainable=True, name="dropout_3", rate=0.2)`
  return cls(**config)
/usr/local/apps/DeepM6A/0.1/lib/python3.6/site-packages/keras/engine/base_layer.py:1109: UserWarning: Update your `Conv1D` call to the Keras 2 API: `Conv1D(name="cov4", activity_regularizer=None, trainable=True, activation="linear", input_shape=(None, Non..., filters=80, kernel_size=4, strides=1, padding="valid", kernel_initializer="he_normal", kernel_regularizer=None, bias_regularizer=None, kernel_constraint=None, bias_constraint=None, use_bias=True)`
  return cls(**config)
/usr/local/apps/DeepM6A/0.1/lib/python3.6/site-packages/keras/engine/base_layer.py:1109: UserWarning: Update your `Dropout` call to the Keras 2 API: `Dropout(trainable=True, name="dropout_4", rate=0.2)`
  return cls(**config)
/usr/local/apps/DeepM6A/0.1/lib/python3.6/site-packages/keras/engine/base_layer.py:1109: UserWarning: Update your `Conv1D` call to the Keras 2 API: `Conv1D(name="cov5", activity_regularizer=None, trainable=True, activation="linear", input_shape=(None, Non..., filters=80, kernel_size=4, strides=1, padding="valid", kernel_initializer="he_normal", kernel_regularizer=None, bias_regularizer=None, kernel_constraint=None, bias_constraint=None, use_bias=True)`
  return cls(**config)
/usr/local/apps/DeepM6A/0.1/lib/python3.6/site-packages/keras/engine/base_layer.py:1109: UserWarning: Update your `Dropout` call to the Keras 2 API: `Dropout(trainable=True, name="dropout_5", rate=0.5)`
  return cls(**config)
/usr/local/apps/DeepM6A/0.1/lib/python3.6/site-packages/keras/engine/base_layer.py:1109: UserWarning: Update your `Dense` call to the Keras 2 API: `Dense(name="dense_1", activity_regularizer=None, trainable=True, input_dim=None, activation="linear", units=100, kernel_initializer="he_normal", kernel_regularizer=None, bias_regularizer=None, kernel_constraint=None, bias_constraint=None, use_bias=True)`
  return cls(**config)
/usr/local/apps/DeepM6A/0.1/lib/python3.6/site-packages/keras/engine/base_layer.py:1109: UserWarning: Update your `Dropout` call to the Keras 2 API: `Dropout(trainable=True, name="dropout_6", rate=0.5)`
  return cls(**config)
/usr/local/apps/DeepM6A/0.1/lib/python3.6/site-packages/keras/engine/base_layer.py:1109: UserWarning: Update your `Dense` call to the Keras 2 API: `Dense(name="dense_2", activity_regularizer=None, trainable=True, input_dim=None, activation="linear", units=1, kernel_initializer="glorot_uniform", kernel_regularizer=None, bias_regularizer=None, kernel_constraint=None, bias_constraint=None, use_bias=True)`
  return cls(**config)
/usr/local/apps/DeepM6A/0.1/lib/python3.6/site-packages/keras/engine/saving.py:327: UserWarning: Error in loading the saved optimizer state. As a result, your model is starting with a freshly initialized optimizer.
  warnings.warn('Error in loading the saved optimizer '
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
cov1 (Conv1D)                (None, 58, 80)            1360
_________________________________________________________________
leakyrelu_1 (LeakyReLU)      (None, 58, 80)            0
_________________________________________________________________
dropout_1 (Dropout)          (None, 58, 80)            0
_________________________________________________________________
cov2 (Conv1D)                (None, 57, 80)            12880
_________________________________________________________________
leakyrelu_2 (LeakyReLU)      (None, 57, 80)            0
_________________________________________________________________
dropout_2 (Dropout)          (None, 57, 80)            0
_________________________________________________________________
cov3 (Conv1D)                (None, 54, 80)            25680
_________________________________________________________________
leakyrelu_3 (LeakyReLU)      (None, 54, 80)            0
_________________________________________________________________
dropout_3 (Dropout)          (None, 54, 80)            0
_________________________________________________________________
cov4 (Conv1D)                (None, 51, 80)            25680
_________________________________________________________________
leakyrelu_4 (LeakyReLU)      (None, 51, 80)            0
_________________________________________________________________
dropout_4 (Dropout)          (None, 51, 80)            0
_________________________________________________________________
cov5 (Conv1D)                (None, 48, 80)            25680
_________________________________________________________________
leakyrelu_5 (LeakyReLU)      (None, 48, 80)            0
_________________________________________________________________
dropout_5 (Dropout)          (None, 48, 80)            0
_________________________________________________________________
flatten_1 (Flatten)          (None, 3840)              0
_________________________________________________________________
dense_1 (Dense)              (None, 100)               384100
_________________________________________________________________
leakyrelu_6 (LeakyReLU)      (None, 100)               0
_________________________________________________________________
dropout_6 (Dropout)          (None, 100)               0
_________________________________________________________________
dense_2 (Dense)              (None, 1)                 101
_________________________________________________________________
activation_1 (Activation)    (None, 1)                 0
=================================================================
Total params: 475,481
Trainable params: 475,481
Non-trainable params: 0
_________________________________________________________________
test_label: count (array([0, 1]), array([1967, 1967]))
**************prediction results on test dataset************
3934/3934 [==============================] - 0s 93us/step
[3.8706200020307278, 0.4974580579562786]
3934/3934 [==============================] - 0s 76us/step
3934/3934 [==============================] - 0s 40us/step
************************
auc: 0.4892284721287104
mcc: -0.0222572017594063
precision:0.40384615384615385 recall:0.010676156583629894 f1score:0.020802377414561663 support:1967
************************

End the interactive session:
>>> quit()
[user@cn3104 ~]$ exit
salloc.exe: Relinquishing job allocation 46116226
[user@biowulf ~]$
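
The output shapes and parameter counts in the model summaries above follow from standard convolution arithmetic, so they can be re-derived by hand. The sketch below uses only the kernel sizes, filter count and input shape printed in the session (input of 61 x 4, 80 filters, "valid" padding, stride 1). The function names are illustrative and are not part of DeepM6A or Keras.

```python
def conv1d_valid(length, in_channels, filters, kernel_size, stride=1):
    """Output length and parameter count of a 1D convolution with 'valid' padding."""
    out_length = (length - kernel_size) // stride + 1
    params = kernel_size * in_channels * filters + filters  # weights + biases
    return out_length, params


def dense_params(in_units, units):
    """Parameter count of a fully connected layer (weights + biases)."""
    return in_units * units + units


# (layer name, kernel_size) as printed in the Keras warnings of the training run.
CONV_LAYERS = [("conv1", 4), ("conv2", 2), ("conv3", 4), ("conv4", 4), ("conv5", 4)]


def summarize(seq_len=61, channels=4, filters=80):
    """Rebuild (name, output shape, params) for the layers that carry parameters."""
    rows = []
    length, in_channels = seq_len, channels
    for name, kernel_size in CONV_LAYERS:
        length, params = conv1d_valid(length, in_channels, filters, kernel_size)
        rows.append((name, (length, filters), params))
        in_channels = filters
    flat = length * filters  # size after the Flatten layer
    rows.append(("dense_1", (100,), dense_params(flat, 100)))
    rows.append(("dense_2", (1,), dense_params(100, 1)))
    return rows


rows = summarize()
for name, shape, params in rows:
    print(f"{name:<10}{str(shape):<12}{params:>8}")
total = sum(params for _, _, params in rows)
print("Total params:", total)  # 475481, matching "Total params: 475,481"
```

The LeakyReLU, Dropout, Flatten and Activation layers add no parameters, which is why they are omitted from the total.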
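
To interpret the prediction metrics printed above, it can help to recompute them from a confusion matrix. DeepM6A does not print the confusion matrix; the counts below are inferred from the reported recall, precision and support (1967 positives and 1967 negatives) and should be treated as an assumption. The helper `binary_metrics` is a hypothetical name, written with the standard textbook formulas.

```python
import math

# Inferred, NOT printed by the tool: recall * 1967 gives 21 true positives,
# and 21 / precision gives 52 predicted positives, hence 31 false positives.
TP, FP = 21, 31
FN = 1967 - TP  # positives that were missed
TN = 1967 - FP  # negatives that were correctly rejected

# Values copied from the prediction output in the sample session.
REPORTED = {
    "accuracy": 0.4974580579562786,
    "precision": 0.40384615384615385,
    "recall": 0.010676156583629894,
    "f1": 0.020802377414561663,
    "mcc": -0.0222572017594063,
}


def binary_metrics(tp, fp, fn, tn):
    """Standard binary classification metrics from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "precision": tp / (tp + fp),
        "recall": tp / (tp + fn),
        "f1": 2 * tp / (2 * tp + fp + fn),
        "mcc": (tp * tn - fp * fn) / denom if denom else 0.0,
    }


computed = binary_metrics(TP, FP, FN, TN)
for key, value in computed.items():
    agrees = math.isclose(value, REPORTED[key], rel_tol=1e-6)
    print(f"{key:<10}{value: .6f}  agrees with log: {agrees}")
```

Under these inferred counts, only 52 of the 3934 test windows are called positive, and the accuracy and AUC sit close to the 0.5 expected by chance on a balanced test set.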