Protein Folding with Pyrosetta

In this tutorial, we will fold a protein structure using a very simple algorithm in PyRosetta, and compare the folded structure with the solved crystal structure of the protein.

Importing relevant libraries

We begin by importing the relevant libraries from Python. If running the following cell produces any errors or warnings, make sure you have followed all the steps in the "Setting up Pyrosetta" section.

In [3]:
import os
import glob
import shutil
import pandas as pd
import nglview as ngl
import pyrosetta as prs
prs.init()
from pyrosetta import rosetta
core.init: Checking for fconfig files in pwd and ./rosetta/flags
core.init: Rosetta version: PyRosetta4.Release.python36.mac r213 2019.10+release.fd1bdffb01b fd1bdffb01b7866da84942b9bf1b06e96270656e http://www.pyrosetta.org 2019-03-05T15:28:05
core.init: command: PyRosetta -ex1 -ex2aro -database /Users/rosello/anaconda3/envs/protein_design/lib/python3.6/site-packages/pyrosetta-2019.10+release.fd1bdffb01b-py3.6-macosx-10.7-x86_64.egg/pyrosetta/database
core.init: 'RNG device' seed mode, using '/dev/urandom', seed=-1838702617 seed_offset=0 real_seed=-1838702617
core.init.random: RandomGenerator:init: Normal mode, seed=-1838702617 RG_type=mt19937
In [ ]:
 
In [ ]:
 
In [ ]:
 

Setting up score functions that will be used across parts

In [4]:
scorefxn_low = prs.create_score_function('score3')
scorefxn_high = prs.get_fa_scorefxn()
basic.io.database: Database file opened: scoring/score_functions/EnvPairPotential/env_log.txt
basic.io.database: Database file opened: scoring/score_functions/EnvPairPotential/cbeta_den.txt
basic.io.database: Database file opened: scoring/score_functions/EnvPairPotential/pair_log.txt
basic.io.database: Database file opened: scoring/score_functions/EnvPairPotential/cenpack_log.txt
basic.io.database: Database file opened: scoring/score_functions/SecondaryStructurePotential/phi.theta.36.HS.resmooth
basic.io.database: Database file opened: scoring/score_functions/SecondaryStructurePotential/phi.theta.36.SS.resmooth
core.scoring.ScoreFunctionFactory: SCOREFUNCTION: ref2015
core.scoring.etable: Starting energy table calculation
core.scoring.etable: smooth_etable: changing atr/rep split to bottom of energy well
core.scoring.etable: smooth_etable: spline smoothing lj etables (maxdis = 6)
core.scoring.etable: smooth_etable: spline smoothing solvation etables (max_dis = 6)
core.scoring.etable: Finished calculating energy tables.
basic.io.database: Database file opened: scoring/score_functions/hbonds/ref2015_params/HBPoly1D.csv
basic.io.database: Database file opened: scoring/score_functions/hbonds/ref2015_params/HBFadeIntervals.csv
basic.io.database: Database file opened: scoring/score_functions/hbonds/ref2015_params/HBEval.csv
basic.io.database: Database file opened: scoring/score_functions/hbonds/ref2015_params/DonStrength.csv
basic.io.database: Database file opened: scoring/score_functions/hbonds/ref2015_params/AccStrength.csv
core.chemical.GlobalResidueTypeSet: Finished initializing fa_standard residue type set.  Created 696 residue types
core.chemical.GlobalResidueTypeSet: Total time to initialize 0.604515 seconds.
basic.io.database: Database file opened: scoring/score_functions/rama/fd/all.ramaProb
basic.io.database: Database file opened: scoring/score_functions/rama/fd/prepro.ramaProb
basic.io.database: Database file opened: scoring/score_functions/omega/omega_ppdep.all.txt
basic.io.database: Database file opened: scoring/score_functions/omega/omega_ppdep.gly.txt
basic.io.database: Database file opened: scoring/score_functions/omega/omega_ppdep.pro.txt
basic.io.database: Database file opened: scoring/score_functions/omega/omega_ppdep.valile.txt
basic.io.database: Database file opened: scoring/score_functions/P_AA_pp/P_AA
basic.io.database: Database file opened: scoring/score_functions/P_AA_pp/P_AA_n
core.scoring.P_AA: shapovalov_lib::shap_p_aa_pp_smooth_level of 1( aka low_smooth ) got activated.
basic.io.database: Database file opened: scoring/score_functions/P_AA_pp/shapovalov/10deg/kappa131/a20.prop

Loading the native (solved crystal) structure

In [5]:
native_pose = prs.pose_from_pdb('data/1BL0/1BL0_chainA.pdb')
core.import_pose.import_pose: File 'data/1BL0/1BL0_chainA.pdb' automatically determined to be of type PDB

We can check the amino acid sequence of the structure with a very simple command.

In [6]:
native_pose.sequence()
Out[6]:
'DAITIHSILDWIEDNLESPLSLEKVSERSGYSKWHLQRMFKKETGHSLGQYIRSRKMTEIAQKLKESNEPILYLAERYGFESQQTLTRTFKNYFDVPPHKYRMTNMQGESRFLHPL'

We can also assign the correct secondary structure.

In [7]:
DSSP = prs.rosetta.protocols.moves.DsspMover()
DSSP.apply(native_pose)    # populates the pose's Pose.secstruct
protocols.DsspMover: LHHHHHHHHHHHHLLLLLLLLLHHHHHHLLLLHHHHHHHHHHHHLLLHHHHHHHHHHHHHHHHHHHLLLLHHHHHHHHLLLLHHHHHHHHHHHHLLLHHHHHLLLLLLLLLLLLLL

Hint: The amino acids are also called residues!

Let's view more information about the first residue of the protein.

In [8]:
print(native_pose.residue(1))
Residue 1: ASP:NtermProteinFull (ASP, D):
Base: ASP
 Properties: POLYMER PROTEIN CANONICAL_AA LOWER_TERMINUS SC_ORBITALS POLAR CHARGED NEGATIVE_CHARGE METALBINDING ALPHA_AA L_AA
 Variant types: LOWER_TERMINUS_VARIANT
 Main-chain atoms:  N    CA   C  
 Backbone atoms:    N    CA   C    O   1H   2H   3H    HA 
 Side-chain atoms:  CB   CG   OD1  OD2 1HB  2HB 
Atom Coordinates:
   N  : 0.229, 36.012, 74.172
   CA : 0.041, 35.606, 75.594
   C  : -0.096, 36.849, 76.498
   O  : -0.951, 36.895, 77.382
   CB : 1.225, 34.718, 76.092
   CG : 2.159, 34.156, 74.999
   OD1: 1.688, 33.361, 74.151
   OD2: 3.378, 34.497, 75.007
  1H  : 1.056, 35.74, 73.68
  2H  : -0.43, 35.723, 73.478
  3H  : 0.251, 36.981, 73.928
   HA : -0.884, 35.037, 75.696
  1HB : 1.839, 35.199, 76.854
  2HB : 0.67, 33.892, 76.539
Mirrored relative to coordinates in ResidueType: FALSE

In [9]:
pose = prs.pose_from_sequence(native_pose.sequence())
test_pose = prs.Pose()
test_pose.assign(pose)
test_pose.pdb_info().name('Linearized Pose')
In [10]:
view = ngl.show_rosetta(test_pose)
view.add_ball_and_stick()
view  # zoom in to see the atoms!

Defining movers to switch from full-atom to centroid representation

We will be using the centroid representation to perform rough and fast scoring in the initial stages of the folding algorithm. Later on, we will switch to the full atom represenation to do accurate minimization and get the final structures.

In [11]:
to_centroid = prs.SwitchResidueTypeSetMover('centroid')
to_full_atom = prs.SwitchResidueTypeSetMover('fa_standard')
In [12]:
to_full_atom.apply(test_pose)
print('Full Atom Score:', scorefxn_high(test_pose))
to_centroid.apply(test_pose)
print('Centroid Score:', scorefxn_low(test_pose))
core.util.switchresiduetypeset: [ WARNING ] When switching to a fa_standard ResidueTypeSet:  Pose already contains fa_standard ResidueTypes.
basic.io.database: Database file opened: scoring/score_functions/elec_cp_reps.dat
core.scoring.elec.util: Read 40 countpair representative atoms
core.pack.dunbrack.RotamerLibrary: shapovalov_lib_fixes_enable option is true.
core.pack.dunbrack.RotamerLibrary: shapovalov_lib::shap_dun10_smooth_level of 1( aka lowest_smooth ) got activated.
core.pack.dunbrack.RotamerLibrary: Binary rotamer library selected: /Users/rosello/anaconda3/envs/protein_design/lib/python3.6/site-packages/pyrosetta-2019.10+release.fd1bdffb01b-py3.6-macosx-10.7-x86_64.egg/pyrosetta/database/rotamer/shapovalov/StpDwn_0-0-0/Dunbrack10.lib.bin
core.pack.dunbrack.RotamerLibrary: Using Dunbrack library binary file '/Users/rosello/anaconda3/envs/protein_design/lib/python3.6/site-packages/pyrosetta-2019.10+release.fd1bdffb01b-py3.6-macosx-10.7-x86_64.egg/pyrosetta/database/rotamer/shapovalov/StpDwn_0-0-0/Dunbrack10.lib.bin'.
core.pack.dunbrack.RotamerLibrary: Dunbrack 2010 library took 0.227604 seconds to load from binary
Full Atom Score: 46617.13925908482
core.chemical.GlobalResidueTypeSet: Finished initializing centroid residue type set.  Created 62 residue types
core.chemical.GlobalResidueTypeSet: Total time to initialize 0.023001 seconds.
Centroid Score: 454.12286307835507

Q: Visualize the centroid-only structure and see the difference with the full atom that we visualized above? Print again the information for the first residue and compare.

In [13]:
# here write the code to visualize the centroid only structure and print the information of the 1st residue

print(native_pose.residue(1))
view = ngl.show_rosetta(test_pose)
view.add_ball_and_stick()
view
Residue 1: ASP:NtermProteinFull (ASP, D):
Base: ASP
 Properties: POLYMER PROTEIN CANONICAL_AA LOWER_TERMINUS SC_ORBITALS POLAR CHARGED NEGATIVE_CHARGE METALBINDING ALPHA_AA L_AA
 Variant types: LOWER_TERMINUS_VARIANT
 Main-chain atoms:  N    CA   C  
 Backbone atoms:    N    CA   C    O   1H   2H   3H    HA 
 Side-chain atoms:  CB   CG   OD1  OD2 1HB  2HB 
Atom Coordinates:
   N  : 0.229, 36.012, 74.172
   CA : 0.041, 35.606, 75.594
   C  : -0.096, 36.849, 76.498
   O  : -0.951, 36.895, 77.382
   CB : 1.225, 34.718, 76.092
   CG : 2.159, 34.156, 74.999
   OD1: 1.688, 33.361, 74.151
   OD2: 3.378, 34.497, 75.007
  1H  : 1.056, 35.74, 73.68
  2H  : -0.43, 35.723, 73.478
  3H  : 0.251, 36.981, 73.928
   HA : -0.884, 35.037, 75.696
  1HB : 1.839, 35.199, 76.854
  2HB : 0.67, 33.892, 76.539
Mirrored relative to coordinates in ResidueType: FALSE

Setting up the folding algorithm

In [14]:
# Loading the files with the pre-computed fragmets
long_frag_filename = 'data/1BL0/9_fragments.txt'
long_frag_length = 9
short_frag_filename = 'data/1BL0/3_fragments.txt'
short_frag_length = 3

# Defining parameters of the folding algorithm
long_inserts=5  # How many 9-fragment pieces to insest during the search
short_inserts=10 # How many 3-fragment pieces to insest during the search

kT = 3.0 # Simulated Annealing temperature
cycles = 2 # How many cycles of Monte Carlo search to run
jobs = 50 # How many trajectories in parallel to compute.
job_output = 'outputs/1BL0/decoy' # The prefix of the filenames to store the results

Loading the fragmets

In [15]:
movemap = prs.MoveMap()
movemap.set_bb(True)

fragset_long = rosetta.core.fragment.ConstantLengthFragSet(long_frag_length, long_frag_filename)
long_frag_mover = rosetta.protocols.simple_moves.ClassicFragmentMover(fragset_long, movemap)

fragset_short = rosetta.core.fragment.ConstantLengthFragSet(short_frag_length, short_frag_filename)
short_frag_mover = rosetta.protocols.simple_moves.ClassicFragmentMover(fragset_short, movemap)

insert_long_frag = prs.RepeatMover(long_frag_mover, long_inserts)
insert_short_frag = prs.RepeatMover(short_frag_mover, short_inserts)
core.fragments.ConstantLengthFragSet: finished reading top 200 9mer fragments from file data/1BL0/9_fragments.txt
core.fragments.ConstantLengthFragSet: finished reading top 200 3mer fragments from file data/1BL0/3_fragments.txt

Q: How many 9-mer and 3-mer fragmets do we have in our database?

200

In [16]:
# Making sure the structure is in centroid-only mode for the search
test_pose.assign(pose)
to_centroid.apply(test_pose)
In [17]:
# Defining what sequence of actions to do between each scoring step
folding_mover = prs.SequenceMover()
folding_mover.add_mover(insert_long_frag)
folding_mover.add_mover(insert_short_frag)
In [18]:
mc = prs.MonteCarlo(test_pose, scorefxn_low, kT)
trial = prs.TrialMover(folding_mover, mc)
In [19]:
# Setting up how many cycles of search to do in each trajectory
folding = prs.RepeatMover(trial, cycles)

Setting up the relax mover for the final stage

In [20]:
fast_relax_mover = prs.rosetta.protocols.relax.FastRelax(scorefxn_high)
protocols.relax.FastRelax: Looking for default.txt
protocols.relax.RelaxScriptManager: ================== Reading script file: /Users/rosello/anaconda3/envs/protein_design/lib/python3.6/site-packages/pyrosetta-2019.10+release.fd1bdffb01b-py3.6-macosx-10.7-x86_64.egg/pyrosetta/database/sampling/relax_scripts/default.txt ==================
protocols.relax.RelaxScriptManager: repeat %%nrepeats%%
protocols.relax.RelaxScriptManager: ramp_repack_min 0.02  0.01     1.0
protocols.relax.RelaxScriptManager: ramp_repack_min 0.250 0.01     0.5
protocols.relax.RelaxScriptManager: ramp_repack_min 0.550 0.01     0.0
protocols.relax.RelaxScriptManager: ramp_repack_min 1     0.00001  0.0
protocols.relax.RelaxScriptManager: accept_to_best
protocols.relax.RelaxScriptManager: endrepeat

Running the folding algorithm!

In [21]:
scores = [0] * (jobs + 1)
scores[0] = scorefxn_low(test_pose)
In [22]:
if os.path.isdir(os.path.dirname(job_output)):
    shutil.rmtree(os.path.dirname(job_output), ignore_errors=True)
os.makedirs(os.path.dirname(job_output))
jd = prs.PyJobDistributor(job_output, nstruct=jobs, scorefxn=scorefxn_high)
Working on decoy: outputs/1BL0/decoy_23.pdb
In [24]:
counter = 0 
while not jd.job_complete:
    # a. set necessary variables for the new trajectory
    # -reload the starting pose
    test_pose.assign(pose)
    to_centroid.apply(test_pose)
    # -change the pose's PDBInfo.name, for the PyMOL_Observer
    counter += 1
    test_pose.pdb_info().name(job_output + '_' + str(counter))
    # -reset the MonteCarlo object (sets lowest_score to that of test_pose)
    mc.reset(test_pose)

    #### if you create a custom protocol, you may have additional
    ####    variables to reset, such as kT

    #### if you create a custom protocol, this section will most likely
    ####    change, many protocols exist as single Movers or can be
    ####    chained together in a sequence (see above) so you need
    ####    only apply the final Mover
    # b. apply the refinement protocol
    folding.apply(test_pose)

    ####
    # c. export the lowest scoring decoy structure for this trajectory
    # -recover the lowest scoring decoy structure
    mc.recover_low(test_pose)
    # -store the final score for this trajectory
    # -convert the decoy to fullatom
    # the sidechain conformations will all be default,
    #    normally, the decoys would NOT be converted to fullatom before
    #    writing them to PDB (since a large number of trajectories would
    #    be considered and their fullatom score are unnecessary)
    # here the fullatom mode is reproduced to make the output easier to
    #    understand and manipulate, PyRosetta can load in PDB files of
    #    centroid structures, however you must convert to fullatom for
    #    nearly any other application
    to_full_atom.apply(test_pose)
    fast_relax_mover.apply(test_pose)
    scores[counter] = scorefxn_high(test_pose)
    # -output the fullatom decoy structure into a PDB file
    jd.output_decoy(test_pose)
    # -export the final structure to PyMOL
    test_pose.pdb_info().name(job_output + '_' + str(counter) + '_fa')

Evaluating the folding results by getting the energies and RMSDs (distnace from the native pose) of each one of the predicted structures

In [25]:
decoy_poses = [prs.pose_from_pdb(f) for f in glob.glob(job_output + '*.pdb')]
core.import_pose.import_pose: File 'outputs/1BL0/decoy_9.pdb' automatically determined to be of type PDB
core.import_pose.import_pose: File 'outputs/1BL0/decoy_43.pdb' automatically determined to be of type PDB
core.import_pose.import_pose: File 'outputs/1BL0/decoy_42.pdb' automatically determined to be of type PDB
core.import_pose.import_pose: File 'outputs/1BL0/decoy_8.pdb' automatically determined to be of type PDB
core.import_pose.import_pose: File 'outputs/1BL0/decoy_40.pdb' automatically determined to be of type PDB
core.import_pose.import_pose: File 'outputs/1BL0/decoy_41.pdb' automatically determined to be of type PDB
core.import_pose.import_pose: File 'outputs/1BL0/decoy_45.pdb' automatically determined to be of type PDB
core.import_pose.import_pose: File 'outputs/1BL0/decoy_44.pdb' automatically determined to be of type PDB
core.import_pose.import_pose: File 'outputs/1BL0/decoy_46.pdb' automatically determined to be of type PDB
core.import_pose.import_pose: File 'outputs/1BL0/decoy_47.pdb' automatically determined to be of type PDB
core.import_pose.import_pose: File 'outputs/1BL0/decoy_20.pdb' automatically determined to be of type PDB
core.import_pose.import_pose: File 'outputs/1BL0/decoy_34.pdb' automatically determined to be of type PDB
core.import_pose.import_pose: File 'outputs/1BL0/decoy_35.pdb' automatically determined to be of type PDB
core.import_pose.import_pose: File 'outputs/1BL0/decoy_21.pdb' automatically determined to be of type PDB
core.import_pose.import_pose: File 'outputs/1BL0/decoy_37.pdb' automatically determined to be of type PDB
core.import_pose.import_pose: File 'outputs/1BL0/decoy_23.pdb' automatically determined to be of type PDB
core.import_pose.import_pose: File 'outputs/1BL0/decoy_22.pdb' automatically determined to be of type PDB
core.import_pose.import_pose: File 'outputs/1BL0/decoy_36.pdb' automatically determined to be of type PDB
core.import_pose.import_pose: File 'outputs/1BL0/decoy_32.pdb' automatically determined to be of type PDB
core.import_pose.import_pose: File 'outputs/1BL0/decoy_26.pdb' automatically determined to be of type PDB
core.import_pose.import_pose: File 'outputs/1BL0/decoy_27.pdb' automatically determined to be of type PDB
core.import_pose.import_pose: File 'outputs/1BL0/decoy_33.pdb' automatically determined to be of type PDB
core.import_pose.import_pose: File 'outputs/1BL0/decoy_19.pdb' automatically determined to be of type PDB
core.import_pose.import_pose: File 'outputs/1BL0/decoy_25.pdb' automatically determined to be of type PDB
core.import_pose.import_pose: File 'outputs/1BL0/decoy_31.pdb' automatically determined to be of type PDB
core.import_pose.import_pose: File 'outputs/1BL0/decoy_30.pdb' automatically determined to be of type PDB
core.import_pose.import_pose: File 'outputs/1BL0/decoy_24.pdb' automatically determined to be of type PDB
core.import_pose.import_pose: File 'outputs/1BL0/decoy_18.pdb' automatically determined to be of type PDB
core.import_pose.import_pose: File 'outputs/1BL0/decoy_15.pdb' automatically determined to be of type PDB
core.import_pose.import_pose: File 'outputs/1BL0/decoy_29.pdb' automatically determined to be of type PDB
core.import_pose.import_pose: File 'outputs/1BL0/decoy_28.pdb' automatically determined to be of type PDB
core.import_pose.import_pose: File 'outputs/1BL0/decoy_14.pdb' automatically determined to be of type PDB
core.import_pose.import_pose: File 'outputs/1BL0/decoy_16.pdb' automatically determined to be of type PDB
core.import_pose.import_pose: File 'outputs/1BL0/decoy_17.pdb' automatically determined to be of type PDB
core.import_pose.import_pose: File 'outputs/1BL0/decoy_13.pdb' automatically determined to be of type PDB
core.import_pose.import_pose: File 'outputs/1BL0/decoy_12.pdb' automatically determined to be of type PDB
core.import_pose.import_pose: File 'outputs/1BL0/decoy_38.pdb' automatically determined to be of type PDB
core.import_pose.import_pose: File 'outputs/1BL0/decoy_10.pdb' automatically determined to be of type PDB
core.import_pose.import_pose: File 'outputs/1BL0/decoy_11.pdb' automatically determined to be of type PDB
core.import_pose.import_pose: File 'outputs/1BL0/decoy_39.pdb' automatically determined to be of type PDB
core.import_pose.import_pose: File 'outputs/1BL0/decoy_0.pdb' automatically determined to be of type PDB
core.import_pose.import_pose: File 'outputs/1BL0/decoy_1.pdb' automatically determined to be of type PDB
core.import_pose.import_pose: File 'outputs/1BL0/decoy_3.pdb' automatically determined to be of type PDB
core.import_pose.import_pose: File 'outputs/1BL0/decoy_49.pdb' automatically determined to be of type PDB
core.import_pose.import_pose: File 'outputs/1BL0/decoy_48.pdb' automatically determined to be of type PDB
core.import_pose.import_pose: File 'outputs/1BL0/decoy_2.pdb' automatically determined to be of type PDB
core.import_pose.import_pose: File 'outputs/1BL0/decoy_6.pdb' automatically determined to be of type PDB
core.import_pose.import_pose: File 'outputs/1BL0/decoy_7.pdb' automatically determined to be of type PDB
core.import_pose.import_pose: File 'outputs/1BL0/decoy_5.pdb' automatically determined to be of type PDB
core.import_pose.import_pose: File 'outputs/1BL0/decoy_4.pdb' automatically determined to be of type PDB
In [26]:
def align_and_get_rmsds(native_pose, decoy_poses):
    prs.rosetta.core.pose.full_model_info.make_sure_full_model_info_is_setup(native_pose)
    rmsds = []
    for p in decoy_poses:
        prs.rosetta.core.pose.full_model_info.make_sure_full_model_info_is_setup(p)
        rmsds += [prs.rosetta.protocols.stepwise.modeler.align.superimpose_with_stepwise_aligner(native_pose, p)]
    return rmsds
In [27]:
rmsds = align_and_get_rmsds(native_pose, decoy_poses)
protocols.stepwise.modeler.align.StepWisePoseAligner: RMSD 0.000 (0 atoms in ), superimposed on 349 atoms in 1-116 (RMSD 48.7332221)
protocols.stepwise.modeler.align.StepWisePoseAligner: RMSD 0.000 (0 atoms in ), superimposed on 349 atoms in 1-116 (RMSD 18.0133396)
protocols.stepwise.modeler.align.StepWisePoseAligner: RMSD 0.000 (0 atoms in ), superimposed on 349 atoms in 1-116 (RMSD 34.5499798)
protocols.stepwise.modeler.align.StepWisePoseAligner: RMSD 0.000 (0 atoms in ), superimposed on 349 atoms in 1-116 (RMSD 41.4914632)
protocols.stepwise.modeler.align.StepWisePoseAligner: RMSD 0.000 (0 atoms in ), superimposed on 349 atoms in 1-116 (RMSD 27.1346392)
protocols.stepwise.modeler.align.StepWisePoseAligner: RMSD 0.000 (0 atoms in ), superimposed on 349 atoms in 1-116 (RMSD 29.7684244)
protocols.stepwise.modeler.align.StepWisePoseAligner: RMSD 0.000 (0 atoms in ), superimposed on 349 atoms in 1-116 (RMSD 31.6141794)
protocols.stepwise.modeler.align.StepWisePoseAligner: RMSD 0.000 (0 atoms in ), superimposed on 349 atoms in 1-116 (RMSD 37.2484125)
protocols.stepwise.modeler.align.StepWisePoseAligner: RMSD 0.000 (0 atoms in ), superimposed on 349 atoms in 1-116 (RMSD 38.1648029)
protocols.stepwise.modeler.align.StepWisePoseAligner: RMSD 0.000 (0 atoms in ), superimposed on 349 atoms in 1-116 (RMSD 32.7368518)
protocols.stepwise.modeler.align.StepWisePoseAligner: RMSD 0.000 (0 atoms in ), superimposed on 349 atoms in 1-116 (RMSD 16.4529961)
protocols.stepwise.modeler.align.StepWisePoseAligner: RMSD 0.000 (0 atoms in ), superimposed on 349 atoms in 1-116 (RMSD 25.9146214)
protocols.stepwise.modeler.align.StepWisePoseAligner: RMSD 0.000 (0 atoms in ), superimposed on 349 atoms in 1-116 (RMSD 30.1659624)
protocols.stepwise.modeler.align.StepWisePoseAligner: RMSD 0.000 (0 atoms in ), superimposed on 349 atoms in 1-116 (RMSD 32.0274536)
protocols.stepwise.modeler.align.StepWisePoseAligner: RMSD 0.000 (0 atoms in ), superimposed on 349 atoms in 1-116 (RMSD 27.0826427)
protocols.stepwise.modeler.align.StepWisePoseAligner: RMSD 0.000 (0 atoms in ), superimposed on 349 atoms in 1-116 (RMSD 25.7921177)
protocols.stepwise.modeler.align.StepWisePoseAligner: RMSD 0.000 (0 atoms in ), superimposed on 349 atoms in 1-116 (RMSD 45.4306412)
protocols.stepwise.modeler.align.StepWisePoseAligner: RMSD 0.000 (0 atoms in ), superimposed on 349 atoms in 1-116 (RMSD 34.9893092)
protocols.stepwise.modeler.align.StepWisePoseAligner: RMSD 0.000 (0 atoms in ), superimposed on 349 atoms in 1-116 (RMSD 21.4845615)
protocols.stepwise.modeler.align.StepWisePoseAligner: RMSD 0.000 (0 atoms in ), superimposed on 349 atoms in 1-116 (RMSD 35.4140907)
protocols.stepwise.modeler.align.StepWisePoseAligner: RMSD 0.000 (0 atoms in ), superimposed on 349 atoms in 1-116 (RMSD 51.3957981)
protocols.stepwise.modeler.align.StepWisePoseAligner: RMSD 0.000 (0 atoms in ), superimposed on 349 atoms in 1-116 (RMSD 39.5087405)
protocols.stepwise.modeler.align.StepWisePoseAligner: RMSD 0.000 (0 atoms in ), superimposed on 349 atoms in 1-116 (RMSD 28.5222972)
protocols.stepwise.modeler.align.StepWisePoseAligner: RMSD 0.000 (0 atoms in ), superimposed on 349 atoms in 1-116 (RMSD 25.5126421)
protocols.stepwise.modeler.align.StepWisePoseAligner: RMSD 0.000 (0 atoms in ), superimposed on 349 atoms in 1-116 (RMSD 38.9159325)
protocols.stepwise.modeler.align.StepWisePoseAligner: RMSD 0.000 (0 atoms in ), superimposed on 349 atoms in 1-116 (RMSD 18.4821149)
protocols.stepwise.modeler.align.StepWisePoseAligner: RMSD 0.000 (0 atoms in ), superimposed on 349 atoms in 1-116 (RMSD 29.5676482)
protocols.stepwise.modeler.align.StepWisePoseAligner: RMSD 0.000 (0 atoms in ), superimposed on 349 atoms in 1-116 (RMSD 40.2815603)
protocols.stepwise.modeler.align.StepWisePoseAligner: RMSD 0.000 (0 atoms in ), superimposed on 349 atoms in 1-116 (RMSD 34.4794450)
protocols.stepwise.modeler.align.StepWisePoseAligner: RMSD 0.000 (0 atoms in ), superimposed on 349 atoms in 1-116 (RMSD 54.6555152)
protocols.stepwise.modeler.align.StepWisePoseAligner: RMSD 0.000 (0 atoms in ), superimposed on 349 atoms in 1-116 (RMSD 34.1426189)
protocols.stepwise.modeler.align.StepWisePoseAligner: RMSD 0.000 (0 atoms in ), superimposed on 349 atoms in 1-116 (RMSD 24.8010408)
protocols.stepwise.modeler.align.StepWisePoseAligner: RMSD 0.000 (0 atoms in ), superimposed on 349 atoms in 1-116 (RMSD 30.1508356)
protocols.stepwise.modeler.align.StepWisePoseAligner: RMSD 0.000 (0 atoms in ), superimposed on 349 atoms in 1-116 (RMSD 52.2026684)
protocols.stepwise.modeler.align.StepWisePoseAligner: RMSD 0.000 (0 atoms in ), superimposed on 349 atoms in 1-116 (RMSD 43.9437017)
protocols.stepwise.modeler.align.StepWisePoseAligner: RMSD 0.000 (0 atoms in ), superimposed on 349 atoms in 1-116 (RMSD 27.0980363)
protocols.stepwise.modeler.align.StepWisePoseAligner: RMSD 0.000 (0 atoms in ), superimposed on 349 atoms in 1-116 (RMSD 28.6219641)
protocols.stepwise.modeler.align.StepWisePoseAligner: RMSD 0.000 (0 atoms in ), superimposed on 349 atoms in 1-116 (RMSD 41.3428510)
protocols.stepwise.modeler.align.StepWisePoseAligner: RMSD 0.000 (0 atoms in ), superimposed on 349 atoms in 1-116 (RMSD 20.4752403)
protocols.stepwise.modeler.align.StepWisePoseAligner: RMSD 0.000 (0 atoms in ), superimposed on 349 atoms in 1-116 (RMSD 37.4833826)
protocols.stepwise.modeler.align.StepWisePoseAligner: RMSD 0.000 (0 atoms in ), superimposed on 349 atoms in 1-116 (RMSD 44.3845731)
protocols.stepwise.modeler.align.StepWisePoseAligner: RMSD 0.000 (0 atoms in ), superimposed on 349 atoms in 1-116 (RMSD 36.4211375)
protocols.stepwise.modeler.align.StepWisePoseAligner: RMSD 0.000 (0 atoms in ), superimposed on 349 atoms in 1-116 (RMSD 44.9346909)
protocols.stepwise.modeler.align.StepWisePoseAligner: RMSD 0.000 (0 atoms in ), superimposed on 349 atoms in 1-116 (RMSD 29.6693023)
protocols.stepwise.modeler.align.StepWisePoseAligner: RMSD 0.000 (0 atoms in ), superimposed on 349 atoms in 1-116 (RMSD 44.9451462)
protocols.stepwise.modeler.align.StepWisePoseAligner: RMSD 0.000 (0 atoms in ), superimposed on 349 atoms in 1-116 (RMSD 30.9391692)
protocols.stepwise.modeler.align.StepWisePoseAligner: RMSD 0.000 (0 atoms in ), superimposed on 349 atoms in 1-116 (RMSD 30.5982517)
protocols.stepwise.modeler.align.StepWisePoseAligner: RMSD 0.000 (0 atoms in ), superimposed on 349 atoms in 1-116 (RMSD 34.7575798)
protocols.stepwise.modeler.align.StepWisePoseAligner: RMSD 0.000 (0 atoms in ), superimposed on 349 atoms in 1-116 (RMSD 19.2822913)
protocols.stepwise.modeler.align.StepWisePoseAligner: RMSD 0.000 (0 atoms in ), superimposed on 349 atoms in 1-116 (RMSD 61.6119451)
In [28]:
rmsd_data = []
for i in range(1, len(decoy_poses)):  # print out the job scores
    rmsd_data.append({'structure': decoy_poses[i].pdb_info().name(), 
                      'rmsd': rmsds[i],
                      'energy_score': scores[i]})
In [29]:
rmsd_df = pd.DataFrame(rmsd_data)
In [30]:
rmsd_df.sort_values('rmsd')
Out[30]:
energy_score rmsd structure
9 -89.824188 16.452996 outputs/1BL0/decoy_20.pdb
0 -107.422915 18.013340 outputs/1BL0/decoy_43.pdb
24 -88.357777 18.482115 outputs/1BL0/decoy_30.pdb
47 -81.296888 19.282291 outputs/1BL0/decoy_5.pdb
37 -83.360279 20.475240 outputs/1BL0/decoy_11.pdb
17 -88.664856 21.484561 outputs/1BL0/decoy_32.pdb
30 -118.831297 24.801041 outputs/1BL0/decoy_14.pdb
22 -74.382793 25.512642 outputs/1BL0/decoy_25.pdb
14 -58.302871 25.792118 outputs/1BL0/decoy_23.pdb
10 -125.920325 25.914621 outputs/1BL0/decoy_34.pdb
13 -73.975754 27.082643 outputs/1BL0/decoy_37.pdb
34 -70.600580 27.098036 outputs/1BL0/decoy_12.pdb
3 -131.450065 27.134639 outputs/1BL0/decoy_40.pdb
21 -58.987192 28.522297 outputs/1BL0/decoy_19.pdb
35 -75.885237 28.621964 outputs/1BL0/decoy_38.pdb
25 -142.872359 29.567648 outputs/1BL0/decoy_24.pdb
42 -123.387849 29.669302 outputs/1BL0/decoy_49.pdb
4 -13.824738 29.768424 outputs/1BL0/decoy_41.pdb
31 -119.743916 30.150836 outputs/1BL0/decoy_16.pdb
11 -95.668657 30.165962 outputs/1BL0/decoy_35.pdb
45 -72.788680 30.598252 outputs/1BL0/decoy_6.pdb
44 -137.173539 30.939169 outputs/1BL0/decoy_2.pdb
5 -173.288540 31.614179 outputs/1BL0/decoy_45.pdb
12 -81.825719 32.027454 outputs/1BL0/decoy_21.pdb
8 -80.201626 32.736852 outputs/1BL0/decoy_47.pdb
29 -116.408203 34.142619 outputs/1BL0/decoy_28.pdb
27 -150.457759 34.479445 outputs/1BL0/decoy_15.pdb
1 -65.682705 34.549980 outputs/1BL0/decoy_42.pdb
46 -85.175699 34.757580 outputs/1BL0/decoy_7.pdb
16 -104.232360 34.989309 outputs/1BL0/decoy_36.pdb
18 -45.776827 35.414091 outputs/1BL0/decoy_26.pdb
40 -81.558735 36.421137 outputs/1BL0/decoy_1.pdb
6 -47.928226 37.248413 outputs/1BL0/decoy_44.pdb
38 -48.892985 37.483383 outputs/1BL0/decoy_39.pdb
7 -62.655395 38.164803 outputs/1BL0/decoy_46.pdb
23 -121.304724 38.915933 outputs/1BL0/decoy_31.pdb
20 -108.992438 39.508740 outputs/1BL0/decoy_33.pdb
26 -71.246997 40.281560 outputs/1BL0/decoy_18.pdb
36 -86.206073 41.342851 outputs/1BL0/decoy_10.pdb
2 -85.128829 41.491463 outputs/1BL0/decoy_8.pdb
33 -119.169558 43.943702 outputs/1BL0/decoy_13.pdb
39 -48.741971 44.384573 outputs/1BL0/decoy_0.pdb
41 -80.620042 44.934691 outputs/1BL0/decoy_3.pdb
43 -94.074137 44.945146 outputs/1BL0/decoy_48.pdb
15 -53.512554 45.430641 outputs/1BL0/decoy_22.pdb
19 -104.359724 51.395798 outputs/1BL0/decoy_27.pdb
32 -112.609870 52.202668 outputs/1BL0/decoy_17.pdb
28 -42.243816 54.655515 outputs/1BL0/decoy_29.pdb
48 -122.845255 61.611945 outputs/1BL0/decoy_4.pdb

Q: Plot the energy_score VS rmsd. Are the lowest energy structures the closest to the native one?

In [ ]:
 
In [ ]: