cryoDRGN: Deep Reconstructing Generative Networks for cryo-EM heterogeneous reconstruction
CryoDRGN is an algorithm that leverages the representation power of deep neural networks to directly reconstruct continuous distributions of 3D density maps and map per-particle heterogeneity of single-particle cryo-EM datasets. It contains interactive tools to visualize a dataset’s distribution of per-particle variability, generate density maps for exploratory analysis, extract particle subsets for use with other tools and generate trajectories to visualize molecular motions. CryoDRGN is open-source software freely available at http://cryodrgn.csail.mit.edu.
References:
- Ellen D. Zhong, Tristan Bepler, Bonnie Berger, and Joseph H. Davis,
CryoDRGN: reconstruction of heterogeneous cryo-EM structures using neural networks.
Nature Methods, VOL 18 | FebruarY 2021 | 176–185
-
Ellen D. Zhong, Tristan Bepler, Joseph H. Davis, and Bonnie Berger,
Reconstructing continuous distributions of 3D protein structure from cryo-EM images.
ICLR 2020, Spotlight presentation.
Documentation
Important Notes
- Module Name: cryoDRGN (see the modules page for more information)
- Singlethreaded
- Unusual environment variables set
- CRYODRGN_HOME cryoDRGN installation directory
- CRYODRGN_BIN cryoDRGN executable directory
- CRYODRGN_DATA cryoDRGN data directory
- CRYODRGN_TEST cryoDRGN examples directory
- Example files in $CRYODRGN_TEST
Interactive job
Interactive jobs should be used for debugging, graphics, or applications that cannot be run as batch jobs.
Allocate an interactive session and run the program. Sample session:
[user@biowulf]$ sinteractive --gres=gpu:k80:1 [user@cn4199 ~]$ module load cryoDRGN +] Loading cuDNN/7.5/CUDA-10.0 libraries... [+] Loading CUDA Toolkit 10.0.130 ... [+] Loading singularity 3.8.5-1 on cn4199 [+] Loading cryoDRGN 0.3.4 ...Copy test data to your current folder:
[user@user@cn4199 ~]$ cp -r $CRYODRGN_TEST/* .Downsample test data:
[user@cn4199 ~]$ cryodrgn -h usage: cryodrgn [-h] [--version] {downsample,preprocess,parse_pose_csparc,parse_pose_star,parse_ctf_csparc,parse_ctf_star,train_nn,backproject_voxel,train_vae,eval_vol,eval_images,analyze,pc_traversal,graph_traversal,view_config} ... CryoDRGN neural network reconstruction optional arguments: -h, --help show this help message and exit --version show program's version number and exit Choose a command: {downsample,preprocess,parse_pose_csparc,parse_pose_star,parse_ctf_csparc,parse_ctf_star,train_nn,backproject_voxel,train_vae,eval_vol,eval_images,analyze,pc_traversal,graph_traversal,view_config} [user@cn4199 ~]$ cryodrgn downsample -h usage: cryodrgn downsample [-h] -D D -o MRCS [-b B] [--is-vol] [--chunk CHUNK] [--relion31] [--datadir DATADIR] [--max-threads MAX_THREADS] mrcs Downsample an image stack or volume by clipping fourier frequencies positional arguments: mrcs Input particles or volume (.mrc, .mrcs, .star, or .txt) optional arguments: -h, --help show this help message and exit -D D New box size in pixels, must be even -o MRCS Output projection stack (.mrcs) -b B Batch size for processing images (default: 5000) --is-vol Flag if input .mrc is a volume --chunk CHUNK Chunksize (in # of images) to split particle stack when saving --relion31 Flag for relion3.1 star format --datadir DATADIR Optionally provide path to input .mrcs if loading from a .star or .cs file --max-threads MAX_THREADS Maximum number of CPU cores for parallelization (default: 16) [user@cn4199 ~]$ cryodrgn downsample data/toy_projections.mrcs -D 24 -o particles.24.mrcs 2021-02-24 10:09:00 Processing image 0 of 1000 2021-02-24 10:09:01 (1000, 24, 24) 2021-02-24 10:09:01 Saving particles.24.mrcsParse image poses from a consensus homogeneous reconstruction:
[user@cn4199 ~]$ cryodrgn parse_pose_star -h usage: cryodrgn parse_pose_star [-h] -D D [--relion31] [--Apix APIX] -o PKL input Parse image poses from RELION .star file positional arguments: input RELION .star file optional arguments: -h, --help show this help message and exit -D D Box size of reconstruction (pixels) --relion31 Flag for relion3.1 star format --Apix APIX Pixel size (A); Required if translations are specified in Angstroms -o PKL Output pose.pkl [user@cn4199 ~]$ cryodrgn parse_pose_star data/relion31.star -o test.pkl -D 300 --relion31 --Apix 1.03 2021-02-24 10:19:42 5 particles 2021-02-24 10:19:42 Euler angles (Rot, Tilt, Psi): 2021-02-24 10:19:42 [-102.30296 82.318041 124.706463] 2021-02-24 10:19:42 Converting to rotation matrix: 2021-02-24 10:19:42 [[ 0.81941806 -0.10080704 0.56426234] [-0.53288075 0.22868946 0.81470193] [-0.21116853 -0.96826601 0.13367414]] 2021-02-24 10:19:42 Translations (pixels): 2021-02-24 10:19:42 [ 0.01876117 -0.62801942] 2021-02-24 10:19:42 Writing test.pklParse CTF parameters from a .star/.cs file:
[user@cn4199 ~]$ cryodrgn parse_ctf_star -h usage: cryodrgn parse_ctf_star [-h] --Apix APIX -D D [--relion31] -o O [--png PNG] [--kv KV] [--cs CS] [-w W] [--ps PS] star Parse CTF parameters from a RELION .star file positional arguments: star Input optional arguments: -h, --help show this help message and exit --Apix APIX Angstroms per pixel -D D Image size in pixels --relion31 Flag for relion3.1 star format -o O Output pkl of CTF parameters --png PNG Optionally plot the CTF Overwrite CTF parameters: --kv KV Accelerating voltage (kV) --cs CS Spherical abberation (mm) -w W Amplitude contrast ratio --ps PS Phase shift (deg) [user@cn4199 ~]$ cryodrgn parse_ctf_star data/relion31.star -D 300 --Apix 1.03 -o ctf.pkl --relio n31 --kv 10 --cs 1 -w 0.5 2021-02-24 10:24:39 5 particles 2021-02-24 10:24:39 Overriding accerlating voltage with 10.0 kV 2021-02-24 10:24:39 Overriding spherical abberation with 1.0 mm 2021-02-24 10:24:39 Overriding amplitude contrast ratio with 0.5 2021-02-24 10:24:39 CTF parameters for first particle: 2021-02-24 10:24:39 Image size (pix) : 300 2021-02-24 10:24:39 A/pix : 1.03 2021-02-24 10:24:39 DefocusU (A) : 13108.082418 2021-02-24 10:24:39 DefocusV (A) : 12845.582418 2021-02-24 10:24:39 Dfang (deg) : -160.39 2021-02-24 10:24:39 voltage (kV) : 10.0 2021-02-24 10:24:39 cs (mm) : 1.0 2021-02-24 10:24:39 w : 0.5 2021-02-24 10:24:39 Phase shift (deg) : 0.0 2021-02-24 10:24:39 Saving ctf.pkl [user@cn4199 ~]$ cryodrgn downsample data/relion31.star -D 24 -o relion31.24.mrcs --relion31 `fused_weight_gradient_mlp_cuda` module not found. gradient accumulation fusion with weight gradient computation disabled. 2022-03-15 18:02:03 Warning: /gpfs/gsfs7/users/user/cryoDRGN/relion31.24.mrcs already exists. Overwriting. 2022-03-15 18:02:03 Processing batch 0 2022-03-15 18:02:03 (5, 24, 24) 2022-03-15 18:02:03 Saving /gpfs/gsfs7/users/user/cryoDRGN/relion31.24.mrcs [user@cn4199 ~]$ exit salloc.exe: Relinquishing job allocation 59748321 [user@biowulf ~]$
Batch job
Most jobs should be run as batch jobs.
Create a batch input file (e.g. CryoDRGN.sh). For example:
#!/bin/bash module load CryoDRGN cp -r $CRYODRGN_TEST/* . cryodrgn parse_pose_star data/relion31.star -o test.pkl -D 300 --relion31 --Apix 1.03 cryodrgn parse_ctf_star data/relion31.star -D 300 --Apix 1.03 -o ctf.pkl --relio n31 --kv 10 --cs 1 -w 0.5
Submit this job using the Slurm sbatch command.
sbatch [--cpus-per-task=#] [--mem=#] CryoDRGN.sh