RoseTTAFold diffusion (RFdiffusion) is an open-source method for protein structure generation, with or without conditional information (a motif, a target, etc.). It can perform motif scaffolding, unconditional protein generation, and other tasks.
Allocate an interactive session and run the program. Sample session:
[user@biowulf]$ sinteractive --mem=20g -c8 --gres=gpu:p100:1,lscratch:10
[user@cn3335 ~]$ module load RFdiffusion
[+] Loading singularity 3.10.5 on cn4338
[+] Loading RFdiffusion 1.1.0
[user@cn3335 ~]$ git clone https://github.com/RosettaCommons/RFdiffusion
[user@cn3335 ~]$ cd RFdiffusion
[user@cn3335 ~]$ python-rfd ./scripts/run_inference.py -h
run_inference is powered by Hydra.

== Configuration groups ==
Compose your configuration from those groups (group=option)

== Config ==
Override anything in the config (foo.bar=value)

inference:
  input_pdb: null
  num_designs: 10
  design_startnum: 0
  ckpt_override_path: null
  symmetry: null
  recenter: true
  radius: 10.0
  model_only_neighbors: false
  output_prefix: samples/design
  write_trajectory: true
  scaffold_guided: false
  model_runner: SelfConditioning
  cautious: true
  align_motif: true
  symmetric_self_cond: true
  final_step: 1
  deterministic: false
  trb_save_ckpt_path: null
  schedule_directory_path: null
  model_directory_path: null
contigmap:
  contigs: null
  inpaint_seq: null
  provide_seq: null
  length: null
model:
  n_extra_block: 4
  n_main_block: 32
  n_ref_block: 4
  d_msa: 256
  d_msa_full: 64
  d_pair: 128
  d_templ: 64
  n_head_msa: 8
  n_head_pair: 4
  n_head_templ: 4
  d_hidden: 32
  d_hidden_templ: 32
  p_drop: 0.15
  SE3_param_full:
    num_layers: 1
    num_channels: 32
    num_degrees: 2
    n_heads: 4
    div: 4
    l0_in_features: 8
    l0_out_features: 8
    l1_in_features: 3
    l1_out_features: 2
    num_edge_features: 32
  SE3_param_topk:
    num_layers: 1
    num_channels: 32
    num_degrees: 2
    n_heads: 4
    div: 4
    l0_in_features: 64
    l0_out_features: 64
    l1_in_features: 3
    l1_out_features: 2
    num_edge_features: 64
  d_time_emb: null
  d_time_emb_proj: null
  freeze_track_motif: false
  use_motif_timestep: false
diffuser:
  T: 50
  b_0: 0.01
  b_T: 0.07
  schedule_type: linear
  so3_type: igso3
  crd_scale: 0.25
  partial_T: null
  so3_schedule_type: linear
  min_b: 1.5
  max_b: 2.5
  min_sigma: 0.02
  max_sigma: 1.5
denoiser:
  noise_scale_ca: 1
  final_noise_scale_ca: 1
  ca_noise_schedule_type: constant
  noise_scale_frame: 1
  final_noise_scale_frame: 1
  frame_noise_schedule_type: constant
ppi:
  hotspot_res: null
potentials:
  guiding_potentials: null
  guide_scale: 10
  guide_decay: constant
  olig_inter_all: null
  olig_intra_all: null
  olig_custom_contact: null
  substrate: null
contig_settings:
  ref_idx: null
  hal_idx: null
  idx_rf: null
  inpaint_seq_tensor: null
preprocess:
  sidechain_input: false
  motif_sidechain_input: true
  d_t1d: 22
  d_t2d: 44
  prob_self_cond: 0.0
  str_self_cond: false
  predict_previous: false
logging:
  inputs: false
scaffoldguided:
  scaffoldguided: false
  target_pdb: false
  target_path: null
  scaffold_list: null
  scaffold_dir: null
  sampled_insertion: 0
  sampled_N: 0
  sampled_C: 0
  ss_mask: 0
  systematic: false
  target_ss: null
  target_adj: null
  mask_loops: true
  contig_crop: null

Powered by Hydra (https://hydra.cc)
Use --hydra-help to view Hydra specific help

Download pretrained models:
[user@cn3335 ~]$ bash scripts/download_models.sh models
...

Download sample data:
[user@cn3335 ~]$ cp $RFDIFFUSION_DATA/* .

If needed, edit the configuration file:
config/inference/base.yaml

Run RFdiffusion on the data, using the settings from the configuration file:
[user@cn3335 ~]$ python-rfd ./scripts/run_inference.py inference.output_prefix=./ inference.input_pdb=./sample.pdb 'contigmap.contigs=[10-40/a394-408/10-40]' +schedule_directory_path=./ &
[2023-06-05 10:23:41,241][__main__][INFO] - Found GPU with device_name Tesla K80. Will run RFdiffusion on Tesla K80
Reading models from /vf/users/user/RFdiffusion/RFdiffusion/rfdiffusion/inference/../../models
[2023-06-05 10:23:41,242][rfdiffusion.inference.model_runners][INFO] - Reading checkpoint from /vf/users/user/RFdiffusion/RFdiffusion/rfdiffusion/inference/../../models/Base_ckpt.pt
This is inf_conf.ckpt_path
/vf/users/user/RFdiffusion/RFdiffusion/rfdiffusion/inference/../../models/Base_ckpt.pt
[user@cn3335 ~]$ nvidia-smi
Tue Apr 23 16:34:35 2024
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.129.03             Driver Version: 535.129.03   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA A100-SXM4-80GB          On  | 00000000:46:00.0 Off |                    0 |
| N/A   32C    P0              66W / 400W |    886MiB / 81920MiB |      0%      Default |
|                                         |                      |             Disabled |
+-----------------------------------------+----------------------+----------------------+
|   1  NVIDIA A100-SXM4-80GB          On  | 00000000:C7:00.0 Off |                    0 |
| N/A   32C    P0              61W / 400W |      8MiB / 81920MiB |      0%      Default |
|                                         |                      |             Disabled |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|    0   N/A  N/A   3988441      C   /opt/conda/envs/SE3nv/bin/python            872MiB |
+---------------------------------------------------------------------------------------+
Assembling -model, -diffuser and -preprocess configs from checkpoint
USING MODEL CONFIG: self._conf[model][n_extra_block] = 4
USING MODEL CONFIG: self._conf[model][n_main_block] = 32
USING MODEL CONFIG: self._conf[model][n_ref_block] = 4
USING MODEL CONFIG: self._conf[model][d_msa] = 256
USING MODEL CONFIG: self._conf[model][d_msa_full] = 64
USING MODEL CONFIG: self._conf[model][d_pair] = 128
USING MODEL CONFIG: self._conf[model][d_templ] = 64
USING MODEL CONFIG: self._conf[model][n_head_msa] = 8
USING MODEL CONFIG: self._conf[model][n_head_pair] = 4
USING MODEL CONFIG: self._conf[model][n_head_templ] = 4
USING MODEL CONFIG: self._conf[model][d_hidden] = 32
USING MODEL CONFIG: self._conf[model][d_hidden_templ] = 32
USING MODEL CONFIG: self._conf[model][p_drop] = 0.15
USING MODEL CONFIG: self._conf[model][SE3_param_full] = {'num_layers': 1, 'num_channels': 32, 'num_degrees': 2, 'n_heads': 4, 'div': 4, 'l0_in_features': 8, 'l0_out_features': 8, 'l1_in_features': 3, 'l1_out_features': 2, 'num_edge_features': 32}
USING MODEL CONFIG: self._conf[model][SE3_param_topk] = {'num_layers': 1, 'num_channels': 32, 'num_degrees': 2, 'n_heads': 4, 'div': 4, 'l0_in_features': 64, 'l0_out_features': 64, 'l1_in_features': 3, 'l1_out_features': 2, 'num_edge_features': 64}
USING MODEL CONFIG: self._conf[model][freeze_track_motif] = False
USING MODEL CONFIG: self._conf[model][use_motif_timestep] = True
USING MODEL CONFIG: self._conf[diffuser][T] = 50
USING MODEL CONFIG: self._conf[diffuser][b_0] = 0.01
USING MODEL CONFIG: self._conf[diffuser][b_T] = 0.07
USING MODEL CONFIG: self._conf[diffuser][schedule_type] = linear
USING MODEL CONFIG: self._conf[diffuser][so3_type] = igso3
USING MODEL CONFIG: self._conf[diffuser][crd_scale] = 0.25
USING MODEL CONFIG: self._conf[diffuser][so3_schedule_type] = linear
USING MODEL CONFIG: self._conf[diffuser][min_b] = 1.5
USING MODEL CONFIG: self._conf[diffuser][max_b] = 2.5
USING MODEL CONFIG: self._conf[diffuser][min_sigma] = 0.02
USING MODEL CONFIG: self._conf[diffuser][max_sigma] = 1.5
USING MODEL CONFIG: self._conf[preprocess][sidechain_input] = False
USING MODEL CONFIG: self._conf[preprocess][motif_sidechain_input] = True
USING MODEL CONFIG: self._conf[preprocess][d_t1d] = 22
USING MODEL CONFIG: self._conf[preprocess][d_t2d] = 44
USING MODEL CONFIG: self._conf[preprocess][prob_self_cond] = 0.5
USING MODEL CONFIG: self._conf[preprocess][str_self_cond] = True
USING MODEL CONFIG: self._conf[preprocess][predict_previous] = False
[2023-06-05 10:23:52,919][rfdiffusion.inference.model_runners][INFO] - Loading checkpoint.
[2023-06-05 10:23:58,119][rfdiffusion.diffusion][INFO] - Using cached IGSO3.
Successful diffuser __init__
[2023-06-05 10:23:58,199][__main__][INFO] - Making design ./_0
[2023-06-05 10:23:58,411][rfdiffusion.inference.model_runners][INFO] - Using contig: ['10-40/a394-408/10-40']
With this beta schedule (linear schedule, beta_0 = 0.04, beta_T = 0.28), alpha_bar_T = 0.00013696048699785024
[2023-06-05 10:23:58,462][rfdiffusion.inference.model_runners][INFO] - Sequence init: -----------------------LNETHFSDDIEQQAD-----------------------------------
[2023-06-05 10:24:04,076][rfdiffusion.inference.utils][INFO] - Sampled motif RMSD: 0.21
[2023-06-05 10:24:04,094][rfdiffusion.inference.model_runners][INFO] - Timestep 50, input to next step: -----------------------LNETHFSDDIEQQAD-----------------------------------
[2023-06-05 10:24:05,277][rfdiffusion.inference.utils][INFO] - Sampled motif RMSD: 0.19
[2023-06-05 10:24:05,281][rfdiffusion.inference.model_runners][INFO] - Timestep 49, input to next step: -----------------------LNETHFSDDIEQQAD-----------------------------------
[2023-06-05 10:24:06,866][rfdiffusion.inference.utils][INFO] - Sampled motif RMSD: 0.15
[2023-06-05 10:24:06,870][rfdiffusion.inference.model_runners][INFO] - Timestep 48, input to next step: -----------------------LNETHFSDDIEQQAD-----------------------------------
[2023-06-05 10:24:08,469][rfdiffusion.inference.utils][INFO] - Sampled motif RMSD: 0.14
[2023-06-05 10:24:08,473][rfdiffusion.inference.model_runners][INFO] - Timestep 47, input to next step: -----------------------LNETHFSDDIEQQAD-----------------------------------
[2023-06-05 10:24:10,060][rfdiffusion.inference.utils][INFO] - Sampled motif RMSD: 0.12
[2023-06-05 10:24:10,064][rfdiffusion.inference.model_runners][INFO] - Timestep 46, input to next step: -----------------------LNETHFSDDIEQQAD-----------------------------------
...
[2023-06-05 10:25:03,386][rfdiffusion.inference.utils][INFO] - Sampled motif RMSD: 0.18
[2023-06-05 10:25:03,390][rfdiffusion.inference.model_runners][INFO] - Timestep 2, input to next step: -----------------------LNETHFSDDIEQQAD-----------------------------------
[2023-06-05 10:25:05,602][__main__][INFO] - Finished design in 1.12 minutes
[2023-06-05 10:25:05,602][__main__][INFO] - Making design ./_1
[2023-06-05 10:25:05,664][rfdiffusion.inference.model_runners][INFO] - Using contig: ['10-40/a394-408/10-40']
With this beta schedule (linear schedule, beta_0 = 0.04, beta_T = 0.28), alpha_bar_T = 0.00013696048699785024
[2023-06-05 10:25:05,694][rfdiffusion.inference.model_runners][INFO] - Sequence init: ----------------------LNETHFSDDIEQQAD--------------------
[2023-06-05 10:25:06,601][rfdiffusion.inference.utils][INFO] - Sampled motif RMSD: 0.22
[2023-06-05 10:25:06,604][rfdiffusion.inference.model_runners][INFO] - Timestep 50, input to next step: ----------------------LNETHFSDDIEQQAD--------------------
...
[2023-06-05 10:25:51,862][__main__][INFO] - Finished design in 0.77 minutes
[2023-06-05 10:25:51,863][__main__][INFO] - Making design ./_2
[2023-06-05 10:25:51,925][rfdiffusion.inference.model_runners][INFO] - Using contig: ['10-40/a394-408/10-40']
With this beta schedule (linear schedule, beta_0 = 0.04, beta_T = 0.28), alpha_bar_T = 0.00013696048699785024
[2023-06-05 10:25:51,955][rfdiffusion.inference.model_runners][INFO] - Sequence init: -------------------------------LNETHFSDDIEQQAD-------------
[2023-06-05 10:25:52,886][rfdiffusion.inference.utils][INFO] - Sampled motif RMSD: 0.20
[2023-06-05 10:25:52,889][rfdiffusion.inference.model_runners][INFO] - Timestep 50, input to next step: -------------------------------LNETHFSDDIEQQAD-------------
...
[2023-06-05 10:32:34,898][rfdiffusion.inference.utils][INFO] - Sampled motif RMSD: 0.13
[2023-06-05 10:32:34,902][rfdiffusion.inference.model_runners][INFO] - Timestep 2, input to next step: -------------------LNETHFSDDIEQQAD------------------------
[2023-06-05 10:32:36,721][__main__][INFO] - Finished design in 0.79 minutes

End the interactive session:
[user@cn3335 ~]$ exit
salloc.exe: Relinquishing job allocation 46116226
[user@biowulf ~]$
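
The contig string passed to run_inference.py in the session above, 10-40/a394-408/10-40, determines how long each design can be. The following is a minimal sketch of that arithmetic, not part of RFdiffusion itself. The helper name contig_length_range is hypothetical, and the parsing rests on an assumption taken from the upstream README: segments are separated by "/", a segment that starts with a chain letter (a394-408) is a motif copied from the input PDB, and a purely numeric segment (10-40) is a stretch of newly generated residues whose length is sampled from that range. Chain breaks and other contig syntax are not handled.

```shell
# contig_length_range CONTIG   (hypothetical helper, not an RFdiffusion command)
# Prints "MIN MAX": the shortest and longest total length the contig allows.
# The body runs in a subshell, so the IFS and globbing changes stay local.
contig_length_range() (
    IFS=/          # split the contig on "/"
    set -f         # disable filename globbing while splitting
    min=0
    max=0
    for seg in $1; do
        case $seg in
            [A-Za-z]*)
                # Assumed motif segment: drop the chain letter, count inclusively.
                range=${seg#?}
                start=${range%-*}
                end=${range#*-}
                n=$(( $end - $start + 1 ))
                min=$(( $min + $n ))
                max=$(( $max + $n ))
                ;;
            *)
                # Assumed generated stretch: "lo-hi", or one fixed length.
                lo=${seg%-*}
                hi=${seg#*-}
                min=$(( $min + $lo ))
                max=$(( $max + $hi ))
                ;;
        esac
    done
    echo "$min $max"
)

contig_length_range '10-40/a394-408/10-40'   # prints: 35 95
```

Under these assumptions the 15-residue motif (LNETHFSDDIEQQAD in the log) is flanked by 10 to 40 generated residues on each side, which is why the "Sequence init" lines differ in length from one design to the next.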
Create a batch input file, e.g. rfdiffusion.sh:
#!/bin/sh
#SBATCH --job-name mb256
#SBATCH --cpus-per-task=8
#SBATCH --mem=20g
#SBATCH --gres=gpu:p100:1,lscratch:20
#SBATCH -p gpu

module load RFdiffusion
git clone https://github.com/RosettaCommons/RFdiffusion
cd RFdiffusion
bash scripts/download_models.sh models
cp $RFDIFFUSION_DATA/* .
python-rfd ./scripts/run_inference.py inference.output_prefix=./ inference.input_pdb=./sample.pdb 'contigmap.contigs=[10-40/a394-408/10-40]' +schedule_directory_path=./
Submit this job using the Slurm sbatch command.
sbatch rfdiffusion.sh
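
To spread designs over several GPUs, one option is a Slurm job array in which each task generates its own block of design numbers. The sketch below only builds the Hydra overrides for one task. It uses the inference.num_designs and inference.design_startnum keys listed in the configuration dump from run_inference.py -h. The helper name rfd_design_overrides is hypothetical, and the sketch assumes that the repository was cloned and the models downloaded once before submission, rather than inside every array task.

```shell
# rfd_design_overrides TASK_ID DESIGNS_PER_TASK   (hypothetical helper)
# Prints Hydra overrides giving 0-based array task TASK_ID its own block of
# design numbers, so that parallel tasks write differently numbered outputs.
rfd_design_overrides() {
    echo "inference.num_designs=$2 inference.design_startnum=$(( $1 * $2 ))"
}

rfd_design_overrides 2 5
# prints: inference.num_designs=5 inference.design_startnum=10

# Possible use inside a batch script submitted with "sbatch --array=0-3 rfdiffusion.sh".
# The command substitution is left unquoted on purpose so that it expands to
# two separate arguments:
#
#   python-rfd ./scripts/run_inference.py \
#       $(rfd_design_overrides "$SLURM_ARRAY_TASK_ID" 5) \
#       inference.output_prefix=./ inference.input_pdb=./sample.pdb \
#       'contigmap.contigs=[10-40/a394-408/10-40]'
```

With four tasks of five designs each, the tasks would cover design numbers 0-4, 5-9, 10-14 and 15-19.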