ACS Publications. Most Trusted. Most Cited. Most Read
My Activity
CONTENT TYPES

Figure 1Loading Img
RETURN TO ISSUEPREVComputational Bioche...Computational BiochemistryNEXT

Free Energy Calculations Using the Movable Type Method with Molecular Dynamics Driven Protein–Ligand Sampling

  • Wenlang Liu
    Wenlang Liu
    School of Chemistry, Chemical Engineering and Life Science, Wuhan University of Technology, 122 Luoshi Road, Wuhan430070, PR China
    More by Wenlang Liu
  • Zhenhao Liu
    Zhenhao Liu
    School of Chemistry, Chemical Engineering and Life Science, Wuhan University of Technology, 122 Luoshi Road, Wuhan430070, PR China
    More by Zhenhao Liu
  • Hao Liu
    Hao Liu
    School of Mechanical and Electronic Engineering, Wuhan University of Technology, 122 Luoshi Road, Wuhan430070, PR China
    More by Hao Liu
  • Lance M. Westerhoff
    Lance M. Westerhoff
    QuantumBio Inc., State College, Pennsylvania16801, United States
  • , and 
  • Zheng Zheng*
    Zheng Zheng
    School of Chemistry, Chemical Engineering and Life Science, Wuhan University of Technology, 122 Luoshi Road, Wuhan430070, PR China
    QuantumBio Inc., State College, Pennsylvania16801, United States
    *Email: [email protected]
    More by Zheng Zheng
Cite this: J. Chem. Inf. Model. 2022, 62, 22, 5645–5665
Publication Date (Web):October 25, 2022
https://doi.org/10.1021/acs.jcim.2c00278

Copyright © 2022 The Authors. Published by American Chemical Society. This publication is licensed under

CC-BY-NC-ND 4.0.
  • Open Access

Article Views

3162

Altmetric

-

Citations

3
LEARN ABOUT THESE METRICS
PDF (12 MB)
Supporting Info (2)»

Abstract

Fast and accurate biomolecular free energy estimation has been a significant interest for decades, and with recent advances in computer hardware, interest in new method development in this field has even grown. Thorough configurational state sampling using molecular dynamics (MD) simulations has long been applied to the estimation of the free energy change corresponding to the receptor–ligand complexing process. However, performing large-scale simulation is still a computational burden for the high-throughput hit screening. Among molecular modeling tools, docking and scoring methods are widely used during the early stages of the drug discovery process in that they can rapidly generate discrete receptor–ligand binding modes and their individual binding affinities. Unfortunately, the lack of thorough conformational sampling in docking and scoring protocols leads to difficulty discovering global minimum binding modes on a complicated energy landscape. The Movable Type (MT) method is a novel absolute binding free energy approach which has demonstrated itself to be robust across a wide range of targets and ligands. Traditionally, the MT method is used with protein–ligand binding modes generated with rigid-receptor or flexible-receptor (induced fit) docking protocols; however, these protocols are by their nature less likely to be effective with more highly flexible targets or with those situations in which binding involves multiple step pathways. In these situations, more thorough samplings are required to better explain the free energy of binding. Therefore, to explore the prediction capability and computational efficiency of the MT method when using more thorough protein–ligand conformational sampling protocols, in the present work, we introduced a series of binding mode modeling protocols ranging from conventional docking routines to single-trajectory conventional molecular dynamics (cMD) and parallel Monte Carlo molecular dynamics (MCMD). Through validation against several structurally and mechanistically diverse protein–ligand test sets, we explore the performance of the MT method as a virtual screening tool to work with the docking protocols and as an MD simulation-based binding free energy tool.

This publication is licensed under

CC-BY-NC-ND 4.0.
  • cc licence
  • by licence
  • nc licence
  • nd licence

Introduction

ARTICLE SECTIONS
Jump To

Successful drug discovery generally relies on receptor binding active site determination, ligand virtual screening, receptor–ligand binding mode analysis, and binding free energy estimation. (1−5) Structure based drug design (SBDD) supports this effort with atomic scale knowledge of one or more three-dimensional (3D) molecular structures coupled with receptor–ligand recognition mechanisms discovered from using computational tools. (6−8) Computational (in silico) molecular modeling protocols such as docking and scoring and Monte Carlo (MC) and molecular dynamics (MD) simulations support the needs of large-scale virtual screening during the hit-to-lead stage of drug discovery in which these methods have long been applied to help control the trial-and-error cost of the design and synthesis of drug leads. (9−12)
Docking and scoring methods are widely used during the early stages of the drug discovery process by rapidly generating discrete receptor–ligand binding modes and quickly calculating their individual binding affinities. (13−15) Such methods provide high-throughput virtual screening filters to reduce the number of compounds for synthesis and subsequent in vitro or in vivo evaluation. The goal of docking and scoring is therefore often to capture global minimum binding modes using a “one shot” pose-generation or a restricted local sampling protocol followed by estimation of receptor–ligand binding free energies using the calculated potential energies and conformational entropy-related features according to these binding modes. (16−20) However, the lack of thorough conformational sampling in the docking and scoring computations brings difficulty in exploration of the complicated energy landscape of the protein–ligand binding process. On one hand, correlated motif movements, especially at the receptor–ligand recognition interface, during the binding process compromise the accuracy of the global minimum binding mode prediction from using only local pose-sampling protocols. (21) On the other hand, binding affinity prediction based on a single binding mode conformation usually leads to ligand-size-dependence, in which larger molecules outperform smaller ones simply because there are more intermolecular pairwise contacts. (22) The binding free energy estimation is therefore contaminated by these “one shot” pose-generation strategies due to lack of knowledge of the converged conformational distributions for both bound-state and free-state protein–ligand pairs. However, notwithstanding the inherent disadvantages of the docking and scoring methods, they are still among the most popular SBDD tools because of their computational speed and reasonable binding mode prediction capabilities for many cases. (23,24)
Thorough configurational state sampling using MD simulations has been applied for decades to the estimation of the free energy change corresponding to the receptor–ligand complexing process. These methods are often quite reliable for revealing molecular movement trajectories and generating binding mode probability distributions. (25) However, even today with software optimized for fast central processing units (CPUs) and graphical processing units (GPUs), performing large-scale simulations at scale is still a significant computational burden for high-throughput hit screening. There are two primary binding free energy estimation regimes: absolute and relative, with the former term suggesting the calculation procedures directly generate the binding free energy change (ΔG) involving conformational sampling of end-point states or along the pathway collective variables, and the latter term indicating the protocols first estimating the relative binding free energy change (ΔΔG) of a new molecule referencing a known protein–ligand complex and then generating the binding free energy change (ΔG) using the thermodynamic cycle.
Regarding applicable scenarios and prediction accuracy, relative binding free energy methods benefit from using the thermodynamic cycle that common interaction energy errors among similar molecules can be canceled. Given the binding free energy of a reference ligand, relative binding free energy protocols usually have good accuracy performance from using the free energy perturbation involving alchemical structural transformations. (26,27) FEP and TI based relative binding free energy methods have been recently reported to achieve good simulation with errors around 1 kcal/mol on average against large-scale validations with test sets including congeneric ligands targeting common protein targets. However, relative binding free energy methods usually have limited application in the cases where no known ligands were discovered for the protein target, or structural differences were huge among the molecules. On the other hand, without depending on the reference molecule and structural modifications, absolute binding free energy methods are more robust against wider protein and ligand space but require more comprehensive conformational sampling for both free-state and bound-state receptor and ligand systems.
Fast, effective free energy methods are in high demand to fulfill a growing need in the fast-paced drug discovery efforts. The MovableType (MT) method is a recently introduced absolute binding free energy method which has shown itself to be competitive with other, much more computationally expensive methods by utilizing ensembles of protein–ligand structures generated using docking methods to mimic protein–ligand binding; (28,29,33) but docking is often limited to rigid or semirigid target protein structures, and therefore relying on docking alone can limit the applicability of MT to fairly rigid protein systems. In computational chemistry, there is a number of binding mode sampling methods available ranging from docking to large-scale molecular dynamics protocols which are generally distributed on a spectrum of performance between high efficiency and high accuracy. In this study, we evaluate the impact of the chosen sampling protocol on protein–ligand binding free energy estimation by challenging the MT method. Two validation benchmarks are used in this study including the well-known CASF-2016 set which covers a wide variety of protein and ligand structures (15) and the recently published Merck KGaA benchmark which consists of a set of 264 ligands with minor structural differences targeting eight pharmaceutically valuable receptors. (44) Since the MT free energy method has been developed to use bound-state and free-state conformations of protein ligand molecular systems for the evaluation of absolute binding free energies, we use the CASF and Merck KGaA sets to explore the compatibility of the MT free energy method working with different sampling methods and show that the method is applicable not only as a virtual screening algorithm by using docking generated poses but also as a flexible free energy evaluation algorithm by using protein–ligand complex trajectories from molecular dynamics simulations.

Method

ARTICLE SECTIONS
Jump To

The MT Method for Protein–Ligand Absolute Binding Free Energy Estimation

The MT method is a fast absolute binding free energy approach showing its robustness across a wide range of targets and ligands. It combines global sampling methods for generating ensembles of free-state and bound-state protein–ligand molecular conformations with a novel local sampling method which is used to numerically estimate local partition functions with respect to these sampled conformations. The current version of the MovableType software (available at http://www.quantumbioinc.com/) contains two protocols for the protein–ligand binding free energy estimation. The full protocol, denoted as MTScoreE, performs partition function estimations for the bound-state complex (holo) and free-state protein (apo) and ligand and then calculates the protein–ligand binding free energy with the ratio of the approximate partition functions. On the other hand, the simplified protocol, denoted as MTScoreES, predicts the protein–ligand binding affinity only using one single accurate complex conformation as the representative energy state. This simplified protocol is much faster than the full protocol and is positioned for higher-throughput virtual screening tasks. In this project, we applied the full MT protocol, MTScoreE, as the free energy evaluation tool coupled with different conformational sampling approaches for the estimation of the protein–ligand binding free energies. Its computational procedure is explained in the following paragraphs. The simplified protocol, MTScoreES, is also briefly summarized at the end of this subsection. The MTScoreES protocol is applied to the linear scaling parameter set generation with respect to the results rescale during the Merck KGaA set evaluation. This protocol is also applied to score the energy-minimal snapshots selected from the simulation trajectories to compare with the conformational ensemble-based binding free energy estimation results.
The full MT protocol first employs built-in or external molecular energy-state sampling methods (e.g., docking or dynamics) to generate significant free-state and bound-state protein–ligand conformations. Then, for each sampled molecular conformation, local numerical integrations are performed to generate an atomic pairwise potential ensemble which samples the local space of the atomic coordinates of the globally sampled molecular conformation. The local pairwise potential integrations are multiplied through all the atom pairs in the molecular conformer (e.g., protein, ligand, and complex) to estimate the local molecular energy states in the vicinity centered at each sampled conformation. These local energy states approximate local molecular partition functions, and when the MT method is supplied with multiple sampled conformations of the protein (apo), ligand, and complex (holo), the method sums over all the local partition functions regarding the free-state and bound-state NVT ensembles. This calculation yields an estimate of the approximated total partition functions of the protein–ligand system before and after complexation. The ratio of the partition function approximations is then used to estimate the gas-phase binding free energy as represented in eq 1:
Δ G b i n d i n g g a s R T log ( Z P L Z P Z L )
(1)
In this equation, each partition function Z P , Z L , and Z P L for protein (apo), ligand, and complex (holo) respectively is calculated using eq 2
Z M = α N poses Z M α = Z M 1 + Z M 2 + ... + Z M N
(2)
where M indicates the sampled molecule at different states, and every Z M i is estimated using numerical integration against all the atom pairwise interactions of each sampled conformational state i. For this study, we employed DivCon Discovery Suite v.815 in which potential energies for every sampled conformation along with the partition functions represented in eqs 1 and 2 are calculated using two sets of energy functions implemented in the DivCon Discovery Suite: (1) GARF energy function developed by Zheng et al. (GARF set) (30) and (2) Amber ff19SB and General AMBER Force Field (GAFF) force field optimized for the MT method (Amber set). (31,32) When Amber ff19SB was employed, the ligand atomic charges were suppled as Mullikan charges calculated using the built-in PM6 Hamiltonian.
This MT method, referred to in the balance of this work as MTScoreE or ensemble-score, employs conformational sampling protocols to generate energy-state ensembles for a protein–ligand pair in both unbound state and bound state, and it approximates the unbound-state and bound-state NVT partition functions using the generated energy-state ensembles to derive the Helmholtz free energy using a ratio of partition functions.
In this work, the partition function approximation for the bound-state complex is calculated as
Z P L N P L = i N P L Z P L i = Z P L 1 + Z P L 2 + ... + Z P L N P L
(3)
where NPL denotes the number of sampled complex-state conformations. The conformational sampling protocols introduced in this work for the bound-state and free-state partition function approximations will be visited in the following subsection.
Similarly, the partition function approximations for the free-state protein and ligand are calculated respectively as
Z P N P = j N P Z P j = Z P 1 + Z P 2 + ... + Z P N P
(4)
and
Z L N L = k N L Z L k = Z L 1 + Z L 2 + ... + Z L N L
(5)
NP and NL in eqs 4 and 5 denote the numbers of sampled free-state protein and ligand conformations, respectively. Applying the approximated partition functions Z P L N P L , Z P N P , and Z L N L to eq 6, the gas-phase protein–ligand binding free energy is then calculated using the logarithm of the ratio of bound-state and free-state partition functions.
Δ G b i n d i n g g a s R T log ( Z P L N P L Z P N P Z L N L )
(6)
The KMTISM solvation model is then employed for the solvation free energy change during the binding procedure. (56) KMTISM is an implicit solvation model that generates multiple solvent layers around each sampled solute conformation and calculates the solvent–solute interaction potentials by adding up the layer-based solvent–solute contact energies. Ensembles of solvent–solute interaction potentials are then generated for the protein–ligand complex and the free-state protein and ligand. Hence the solvation free energy change is calculated using the following equation
Δ G s o l = 1 β ln [ i exp ( β ( E s o l P L ) i ) ] 1 β ln [ j exp ( β ( E s o l P ) j ) ] 1 β ln [ k exp ( β ( E s o l L ) k ) ]
(7)
where (EsolPL)i, (EsolP)j, and (EsolL)k are the solvent–solute interaction potentials for the sampled complex, protein, and ligand conformations, respectively. i, j, and k are the index numbers for the corresponding conformations. The protein–ligand binding free energy is then calculated as
Δ G b i n d i n g = Δ G b i n d i n g g a s + Δ G s o l
(8)
With more careful and thorough sampling protocols, the MT method has a better chance to obtain the most significant conformations in the molecular conformational ensemble. On the other hand, we can also introduce fast and restricted sampling methods to the MT method for higher-throughput virtual screening tasks. In addition to MTScoreE, a simplified MT protocol named MTScoreES, where “ES” denotes “end-state”, has also been developed which utilizes a single binding pose as the representative energy state for the binding free energy prediction. (33) This approach only calculates the intermolecular atom pairwise potentials between the protein and ligand pair referencing the input protein–ligand binding pose. The binding free energy is then approximated in eq 9
Δ G b i n d i n g R T log ( Z I n t e r P L 0 )
(9)
where Z I n t e r P L 0 is the protein–ligand binding pose’s local partition function considering only the intermolecular atomic pairwise interaction potential.

Conformation Sampling Protocols

When provided proper ensembles of molecular conformations for unbound-state and bound-state protein ligand systems, MT is theoretically compatible with many different conformational sampling protocols. In our previous studies, we mainly focused on introducing docking protocols to work with MT for virtual screening purposes. (33) In the present work, we further explore the performance of the MT method when it is challenged with conformers generated with different simulation-based conformational sampling protocols. While docking+MT is robust and flexible for many cases, without thorough configurational space coverage or reliable energy gradient searching, discrete conformational sampling protocols have inherent limitations locating significant molecular configurations. One would expect that introducing more globally significant conformations or landscape minima should improve the free energy prediction accuracy of the MT method with the addition of more converged energy states. Therefore, in addition to docking, this work utilizes conventional MD (cMD) and Monte Carlo MD (MCMD) simulations. Protein–ligand complex “snapshots” from these sampling simulations are provided to MT to improve binding free energy estimation accuracy. With this support in place, we evaluate the MT method as a flexible free energy tool for its compatibility with different energy functions and sampling strategies. This work open further opportunities for using this method to analyze MD trajectories and receptor–ligand complexing mechanisms.
The MTCS and MTFlex protocols are built-in modules in the DivCon Discovery Suite to generate the free-state ligand and protein conformations based on corresponding prebuilt small-molecule and amino-acid rotamer libraries, respectively, followed by local structural minimization. (34,35) Both modules contain corresponding molecular fragment libraries, in which every fragment is collected from structural fragmentation against the ligands and proteins from Protein Data Bank. Every molecular fragment contains one single rotatable bond, along which the rotational energy curve for every fragment is precalculated using the built-in energy functions, i.e., Amber ff19SB and General AMBER Force Field (GAFF) force field, or GARF energy function. When using the MTCS protocol for the conformational sampling of ligands, the energy-minimum rotamers for necessary structural fragments from the libraries are selected to construct the conformations of the corresponding molecule. Local energy minimization is then performed to generate reasonable conformations according to their energy-based scores. On the other hand, due to the vast conformational space of proteins, MTFlex performs local conformational search based on the input protein configuration, instead of constructing the protein conformations totally based on fragment rotamer assembly in a “de novo” manner. For every amino-acid conformation from the input protein conformation, the MTFlex protocol selects the close-by energy-minimum rotamers and hence assembles the protein conformation, followed by local structural minimization. In such a case, the MTFlex protocol generates a cluster of conformations, resembling the configurational features of the input protein structures.
In this project, we select the same number of free-state protein conformations as the number of bound-state complex conformations for the corresponding partition function approximations (NP = NPL), mostly because the protein usually dominates conformational space in a protein–ligand complex. To explore the convergence of the approximated partition function estimation, and hence the reliability of the ΔGbinding prediction, we validate the ΔGbinding estimation convergence by introducing different numbers of sampled conformations, to decide the proper NP and NPL numbers for sampling the bound-state complex and the free-state protein. See the detailed validation procedure and result analysis in the Supporting Information.
On the other hand, following the default setting introduced in our previous work, (34) the top five lowest-energy conformers for the free-state ligand are selected from the MTCS protocol. Compared to the number of protein conformations, much fewer free-state ligand conformations are chosen because the conformational energies for most ligands increase significantly soon after the top 3 to 5 conformations.
We employed three different protocols to generate binding modes for use in this project: high-throughput with induced-fit docking as implemented in the Molecular Operating Environment (MOE) v2019 package; (36)exhaustive with molecular dynamics simulations using the Amber18 suite; and balanced (sampling efficiency and exhaustiveness) with MCMD as captured with the parallel Consecutive Histograms Monte Carlo (CHMC) method combined with MD simulation. The following conformational sampling methods are employed for the protein–ligand binding modes generation:

MT with MOE Induced Fit Docking

The coordinate files utilized in the present study structures were downloaded from the RCSB/PDB, and preparation was performed using the Molecular Operating Environment (MOE) v2019 package (36) using the Structure Prepare application followed by the Protonate3D application to add and optimize protons. After structure preparation was complete, docking for this protocol was performed using the same version of MOE. For this calculation, the ligand parameters were calculated using the Extended Hückel Theory (Amber10:EHT) as implemented in MOE, and the “Induced Fit” docking (IFD) protocol was chosen for each protein–ligand model. Under the IFD protocol, initial placement was performed using the Triangle Matcher approach, and the London ΔG score was used as the initial score followed by the GBVI/WSA ΔG score used as the final filter. (37) In each case and in order to maximize the number of initial poses, 250 poses were provided by the Triangle Matcher, and the active site was optimized using the Amber10:EHT force field as implemented in MOE. Finally, 25 poses were passed to MTScoreE as landscape minima for ensemble scoring.

MT with the Conventional Molecular Dynamics (cMD) Simulation

Molecular dynamics simulations for all the protein–ligand complexes were performed using Amber18 with the ff99SB force field. (38,39) The Generalized AMBER Force Field (GAFF) was used for ligands and for nonstandard residues as characterized using the Antechamber program. Parameters for any metal ions (e.g., Zn2+ Mg2+, and Ca2+) were introduced from the work of Li et al., (40) and the partial charges of ligands and nonstandard residues were calculated using the Antechamber program with the AM1-BCC method. (54,55) For each molecular simulation, the initial conformation for each complex was the best MOE docking pose (determined using the GBVI/WSA ΔG score) generated from the above-noted induced-fit docking protocol. The complex was solvated in a rectangular TIP3P water box with periodic boundary conditions and neutralized with Na+ or Cl- (as required) using the tleap program. The SHAKE algorithm (41) was applied to constrain bonds containing hydrogen atoms with a 2 fs time step, and the electrostatic interactions were calculated using the PME (particle mesh Ewald) (42) with a cutoff of 10.0 Å for long-range interactions. Prior to beginning the simulation, each complex was treated with an energy minimization consisting of 10000 steps of steepest descent algorithm followed by 10000 steps conjugate gradient algorithm. Next, each molecular model was gradually heated from 0 to 300 K in the isothermal-isovolumetric (NVT) ensemble for 200 ps and then equilibrated for 1 ns in the isothermal–isobaric (NPT) ensemble (300 K and 1.01325 MPa). Finally, the molecular dynamics simulations were carried out in the NPT ensemble (300 K).

MT with the Consecutive Histograms Monte Carlo (CHMC) Protocol Followed by Parallel cMD Simulation

Recently, our laboratory developed the CHMC method employing a mixed Monte Carlo/Molecular Dynamics protocol to estimate the potential of mean force (PMF) regarding the protein–ligand binding or dissociation process. (43) CHMC first generates a series of consecutive Monte Carlo sampling units with equal volume representing the receptor–ligand association/dissociation space. These sampling units are stacked from within the receptor’s binding active site (which is designated as the ligand’s initial position), along a dissociation path vector extending to the ligand’s final location which is usually in the open space representing the fully dissociated state for the protein–ligand system.
In this study, we divided each protein model’s active site into five sampling units with equal volume. Each sampling unit was then used for modeling equivalent and comparable NVT ensembles for the estimation of the partition functions in each sampling unit. CHMC generates a set of initial binding modes within each sampling unit using a docking-like protocol, and in the present project, the top 50 binding modes (based on the protein–ligand interaction energies) per sampling unit were selected for the next calculation stage. Geometric minimization was performed against these binding modes using the Newton–Raphson method to obtain an appropriate receptor–ligand complex ensemble for each sampling unit. The top-five separate minimum-energy protein–ligand complex geometries at each of the sampling units were chosen based on the protein–ligand binding mode energies. According to the free energy estimation protocols introduced in the following subsection, Amber and GARF energy functions were both introduced in the geometric minimization and protein–ligand binding mode energy calculations. The selected five protein–ligand complex geometries were thus used as initial conformations for 50 ns cMD simulations using the same setup, structure preparation, and protocol described in the “MT with Conventional Molecular Dynamics (cMD) Simulation” subsection above.

General Setting for the MT Free Energy Method

In this work, we utilized the three sampling protocols mentioned above coupled with two pairwise potential functions available in the MT package for protein–ligand binding free energy estimation. In the following paragraphs, we use “MOE-MT-amber” to designate the MT ensemble scoring protocol challenged with MOE induced-fit docking poses in which the partition function is calculated using the Amber energy function; “MOE-MT-garf” to designate the MT protocol with MOE docking poses and the GARF energy function; “cMD-MT-amber” to designate the MT protocol performing 250 ns conventional MD simulation and calculating the MT partition functions using the Amber energy function; the term “cMD-MT-garf” denoting the MT protocol performing 250 ns conventional MD simulation and calculating the MT partition functions using the GARF energy function; the term “CHMC-cMD-MT-amber” denoting the MT protocol performing CHMC parallel sampling using the Amber energy function followed by 50 ns conventional MD simulation and calculating the MT partition functions using the Amber energy function; the term “CHMC-cMD-MT-garf” denoting the MT protocol performing CHMC parallel sampling using the GARF energy function followed by 50 ns conventional MD simulation and calculating the MT partition functions using the GARF energy function.
From the MOE docking process, 25 top ranked binding poses are selected as energy minima for the bound-state partition function estimation in the MT method. From the MD simulations, including the cMD and CHMC-cMD calculations, 500 equally spaced snapshots of every protein–ligand complex (every 10 ps) are selected from the converged region of the production MD trajectories. For each protein–ligand complex, 5 different sets of the 500 snapshots from the converged MD trajectories are selected for the estimation of error bars (standard deviations) and 90% confidence intervals.

Results and Discussion

ARTICLE SECTIONS
Jump To

We applied the MT free energy method and different conformational sampling protocols through the treatment of two validation benchmarks: (1) the industry-standard Comparative Assessment of Scoring Functions v2016 set (the CASF-2016 benchmark) containing 57 protein targets and 285 ligands (15) and (2) the validation benchmark published by Merck KGaA originally for the evaluation of Schrödinger FEP+ including 8 protein targets and 264 ligands (the Merck benchmark). (44) The CASF-2016 benchmark covers various protein and ligand structures for validating the robustness of different sampling protocols across a broad range of protein and ligand classes. On the other hand, the Merck benchmark focuses on pharmaceutically relevant targets and ligand series containing ligand structures with subtle structural differences. (45−53)Figure 1 summarizes the calculation procedures applied to the CASF-2016 benchmark and the Merck benchmark.

Figure 1

Figure 1. Illustration of the free energy estimation protocols applied to (A) the CASF-2016 benchmark and (B) the Merck benchmark.

For statistical analysis of the free energy calculation results compared to the experimental binding free energies, we utilized Pearson’s R and Kendall’s τ for the measure of linear and rank correlation between the predicted and experimental binding free energies and Mean Absolute Error (MAE) for measuring the calculation errors. They are calculated using the following equations respectively
R = i = 1 n ( x i x ¯ ) ( y i y ¯ ) i = 1 n ( x i x ¯ ) 2 i = 1 n ( y i y ¯ ) 2
(10)
τ = 2 i < j s g n ( x i x j ) s g n ( y i y j ) n ( n 1 )
(11)
where sgn (xixj) = −1 if xi < xj, and sgn (xixj) = 1 if xi > xj, and analogously for y.
M A E = i = 1 n | ( y i x i ) | n
(12)
To compare the error performance of the MT protocols with other published free energy methods, for all the target sets from the Merck benchmark, we also utilized Root-Mean-Square Error (RMSE) evaluated using the following equation:
R M S E = i = 1 n ( y i x i ) 2 n
(13)
From eqs 10 to 13, xi represents a predicted binding free energy, and yi is the corresponding experimental value.

The CASF-2016 benchmark Results Analysis

The statistical results of the CASF-2016 benchmark for four MT free energy protocols (cMD-MT-amber, cMD-MT-garf, CHMC-cMD-MT-amber, and CHMC-cMD-MT-garf) are summarized in Table 1, and these results are plotted for each protocol in Figure 2.
Table 1. Results for Protein-Ligand Binding Free Energy Estimation Using Different MT-Based Protocols
MT protocols Pearson’s R Kendall’s τ MAE (kcal/mol)
cMD-MT-amber 0.66 0.49 2.10
cMD-MT-garf 0.70 0.52 1.68
CHMC-cMD-MT-amber 0.66 0.49 2.10
CHMC-cMD-MT-garf 0.71 0.52 1.67

Figure 2

Figure 2. Experimental binding free energies versus predicted binding free energies for the 285 protein–ligand complexes from the CASF-2016 test benchmark. Standard deviations for the test cases were shown as error bars.

As noted in the Methods section, the top-scored docked pose from MOE was chosen as the initial complex conformation for both the cMD-MT-amber and the cMD-MT-garf protocols, and the CHMC method was used to generate five different initial complex conformations to initiate the MD simulations in the CHMC-cMD-MT-amber and CHMC-cMD-MT-garf protocols. Using the MT-AMBERff19SB force field, both the cMD-MT-amber and the CHMC-cMD-MT-amber protocols achieve Pearson’s R values of 0.66, MAE values of 2.10 kcal/mol, and Kendall’s τ values of 0.49. Using the GARF energy function, the cMD-MT-garf and CHMC-cMD-MT-garf protocols generated improved results, with Pearson’s R values of 0.70 and 0.71, MAE values of 1.68 and 1.67 kcal/mol, and Kendall’s τ values of 0.52. Introducing simulation-based sampling methods in this study provided us with converged energy ensembles of each protein–ligand complex. Therefore, the initial conformations from different sampling methods were converged through the MD simulation into similar conformational ensembles, resulting in similar free energy estimation values. Figure 3 illustrated the free energies obtained from both cMD-MT and CHMC-cMD-MT against the CASF-2016 benchmark and discovered very high correlations between these two protocols when using the same energy functions.

Figure 3

Figure 3. Correlations between predicted binding free energies using cMD-MT and CHMC-cMD-MT protocols against the CASF-2016 benchmark. The correlation on the left illustrates the predicted binding free energies using cMD-MT-amber (x-axis) vs CHMC-cMD-MT-amber (y-axis). The correlation on the right illustrates the predicted binding free energies using cMD-MT-garf (x-axis) vs CHMC-cMD-MT-garf (y-axis).

In 2020, we reported free energy estimation results using MOE-MT-amber and MOE-MT-garf against the CASF-2016 benchmark. (33) In this prior work, based on 25 docking poses for each ligand, the MOE-MT-amber and MOE-MT-garf protocols generated Pearson’s R values of 0.63 and 0.64, respectively. In another publication, we reported the binding free energy reproduction against the CASF-2016 benchmark using the CHMC method with the GARF energy function, achieving the Pearson’s R as 0.61 and the MAE as 2.06 kcal/mol. (42) The present project shows that MD simulation-based sampling methods improve MT binding free energy prediction accuracy compared to the docking methods. Furthermore, the CHMC-cMD-MT-garf protocol shows that the free energy estimation accuracy with the parallel sampling strategy in CHMC is improved by introducing multiple short MD simulations and pairing this increased sampling with numerical integration and atom pairwise local sampling in MT.
While the present work represents the most exhaustive exploration of the impact of improved sampling on MT-based free energy predictions performed to date, there are still potential sources of error in these receptor–ligand binding free energy calculations. Specifically, the presented results utilized bound-state conformational sampling and therefore calculation of the bound, ZPL partition function; however, accurate binding free energy estimation also depends on conformational sampling for the free-state receptor and ligand molecules along with the change in solvation free energy during the binding process. Therefore, against test cases with flexible receptor and ligand structures, the lack of exhaustive conformational sampling for the free-state protein and ligand may introduce significant error in the predicted binding free energies. In the present effort, we focused on comparing the different binding mode sampling protocols by looking at the calculation results against the subsets targeting the identical protein receptor. In subsequent work, we will explore the question of whether we can further improve our predictions if we use apo as well as holo protein sampling.
The CASF-2016 benchmark contains 57 subsets, each of which has five different ligands targeting the same protein receptor. In our previous publication, we reported the binding free energy estimation results of 52 out of the 57 subsets from the CASF-2016 benchmark when the MT method was provided with 25 MOE dock-based poses for each of the protein:ligand complexes in those 52 subsets. To compare the performance of these earlier MOE-MT calculations with the present cMD-MT and CHMC-cMD-MT calculations on a case-by-case basis, Figure 4 shows the Pearson’s R values ranked according to each protocols’ performance and represented in pairs. For the cMD-MT-amber protocol, 36 subsets had Pearson’s R values higher than 0.7, and 40 subsets had Pearson’s R values higher than 0.6. Using the parallel, CHMC-cMD-MT-amber simulation protocol, 35 subsets had Pearson’s R values higher than 0.7, and 40 subsets had Pearson’s R values higher than 0.6. These results compared to the earlier MOE-MT-amber protocol in which 31 subsets had Pearson’s R values higher than 0.7, and 33 subsets had Pearson’s R values higher than 0.6. When performing the cMD-MT-garf protocol, 35 subsets had Pearson’s R values higher than 0.7 and 41 subsets had Pearson’s R values higher than 0.6. Using the parallel CHMC-cMD-MT-garf simulation protocol, 36 subsets had Pearson’s R values higher than 0.7, and 41 subsets had Pearson’s R values higher than 0.6. This compares favorably to the original MOE-MT-garf protocol for which 29 subsets had Pearson’s R values higher than 0.7, and 32 subsets had Pearson’s R values higher than 0.6.

Figure 4

Figure 4. Pearson’s R coefficients comparisons for simulation-based and docking-based protocols against 52 subsets from the CASF-2016 benchmark. Pearson’s R values for each protocol were ranked individually in their own descending orders.

From the calculation results comparison regarding the CASF-2016 subsets, we observe that the free energy protocols with simulation-based sampling methods generally obtain better agreement with the experimental data. When using the GARF energy function, the advantage for performing simulation-based sampling methods is even more pronounced. Moreover, the single-trajectory, exhaustive (250 ns) MD simulations (cMD-MT-amber and cMD-MT-garf) and the multitrajectory, short-time (50 ns) MD simulations (CHMC-cMD-MT-amber and CHMC-cMD-MT-garf) achieved similar free energy estimation accuracies suggesting that the free energy estimation scheme using multiple local samplings followed by parallel short-time MD simulations could obtain comparable accuracy to the free energy estimation scheme using the normal, longer running single-trajectory simulation strategy.
In the CASF-2016 benchmark, each subset contains only 5 test cases making the Pearson’s R values sensitive to protein–ligand complex selection and individual outliers, but with a sufficiently large test set, we can still obtain meaningful statistical results from the subsets study. The simulation-based methods achieved Pearson’s R values higher than 0.7 for 67% to 69% of all the 52 subsets and Pearson’s R values higher than 0.6 for 77% to 79% of the 52 subsets. The simulation-based MT free energy protocols show significant correlations with the experimental binding free energies in most of the homologous protein–ligand test sets in the CASF-2016 benchmark, and these methods show improved prediction profiles compared to the previous docking-based MT methods. To further explore the performance of the MT free energy method with different binding-mode sampling protocols, we employed another public validation benchmark, first introduced by Merck KGaA for FEP+ evaluation, which includes 8 protein targets and their corresponding congeneric ligand series.

The Merck KGaA Benchmark Results Analysis

The Merck KGaA benchmark encompasses 264 ligand structures divided between 8 protein targets including the following: 33 ligands for cyclin dependent kinase 8 (CDK8), 24 ligands for tyrosine-protein kinase met (c-MET), 28 ligands for kinesin-5 (EG5), 42 ligands for hypoxia-inducible factor 2α (HIF2α), 40 ligands for 6-Phosphofructo-2-Kinase/Fructose-2,6-Biphosphatase 3 (PFKFB3), 26 ligands for tyrosine phosphatase (SHP2), 44 ligands for tyrosine-protein kinase (SYK), and 27 ligands for tankyrase 2 (TNKS2). The ligands for each congeneric series have high structural similarity but significantly different experimental protein–ligand binding free energies. For the most part, in preparation for this benchmark, Schindler et al. used published PDB structures for each target and coupled these models with the associated ligands. In the case of EG5, an alternative loop conformation at the binding site was also considered. This structure used different initial conformations for this loop which significantly impacted the binding affinity prediction accuracy for other free energy methods considered in the original paper. (44) We likewise applied our MT protocols to both Eg5 target sets and evaluated the results separately.
The sampling protocols and the free energy calculations using the MT method were performed following the same protocols as discussed in the Method section. To better illustrate and compare the performance of different sampling protocols applied to the receptor–ligand complex state and to compensate for the lack of apo-state conformational sampling in the absolute binding free energy protocols, the raw MT results were rescaled. We selected 8 sets of protein–ligand cocrystal structures from the PDBBind database v2019 having the same 8 corresponding protein targets as the Merck benchmark but with no overlap in the ligand series selection. This curation yielded 191 cocrystal structures with known binding affinities which were passed to the MTScoreES protocol for fast binding free energy estimation (see details in the Supporting Information). The resulting regression lines for each target-ligand series were obtained by scoring against these cocrystal structures with both the MT-AMBERFF19SB and the GARF energy functions, and the MT free energy calculation results (ΔGpred) were refitted to the corresponding regression lines as the rescaled binding free energy values (ΔGpred*) to reproduce the experimentally determined free energies magnitudes.
Δ G p r e d * = a × Δ G p r e d + b
(14)
Figure 1B illustrates the calculation procedure for all the target sets from the Merck benchmark, and this procedure yields results for the 9 target sets (including the 8 primary targets along with the EG5 target with the alternative loop) which are summarized in Tables 2, 3, and 4 (Detailed plots for each set can be found in Figure 5.). Generally, all our free energy protocols achieved acceptable to good correlations and error performance against the whole benchmark. The two MOEdock-based protocols using MT-AMBERFF19SB and GARF MT energy functions exhibit very similar correlation coefficients against the whole benchmark, with Pearson’s R as 0.551 and 0.554 and Kendall’s τ as 0.433 and 0.434 for the MOE-MT-amber and MOE-MT-garf protocols, respectively. The MD simulations bring significant improvement to the MT binding free energy estimation accuracy with Pearson’s R range from 0.620 to 0.658, and Kendall’s τ ranged from 0.459 to 0.510. Meanwhile, except for the MOE-MT-amber protocol, the rescaled results exhibited overall errors with the RMSE values which range between 1.2 and 1.5 kcal/mol.
Table 2. Pearson’s R Values with 90% Confidence Intervals for All the Target Sets Using Different Free Energy Protocols
Table 3. Kendall’s τ Values with 90% Confidence Intervals for All the Target Sets Using Different Free Energy Protocols
Table 4. RMSE Values in kcal/mol with 90% Confidence Intervals for All the Target Sets Using Different Free Energy Protocols

Figure 5

Figure 5. Correlation between predicted binding free energies and experimental values for all eight target sets using the 6 different free energy protocols (For simplicity, the results for the Eg5 target set using the alternative loop are not included.).

When considering each individual target subset, the MOEdock-based and simulation-based MT protocols generate medium to good correlations for the CDK8, c-Met, Eg5, and TNKS2 sets with Pearson’s R values ranging from 0.394 to 0.852. The additional sampling provided by MD leads to significant improvement in the binding free energy estimation for all target subsets. In particular, PFKFB3 benefits most from this additional sampling. For example, when considering the GARF potential, the Pearson’s R values improve from 0.056 for MOE-MT-garf to 0.840 for cMD-MT-garf and 0.741 for CHMC-cMD-MT-garf. However, HIF-2α shows uneven performance in which the cMD-MT-garf protocol obtains Pearson’s R = 0.507, while all of the other protocols’ Pearson’s R values are less than 0.3. Against the 9 target sets, the cMD-MT-amber protocol generated high correlations (Pearson’s R ≥ 0.6) for 3 sets and medium correlations (0.4 ≤ Pearson’s R ≤ 0.6) for 3 sets, and the cMD-MT-garf protocol generated high correlations (Pearson’s R ≥ 0.6) for 3 sets and medium correlations (0.4 ≤ Pearson’s R ≤ 0.6) for 6 sets. When adding the CHMC Monte Carlo protocol, the CHMC-cMD-MT-amber protocol generated high correlations (Pearson’s R ≥ 0.6) for 5 sets and medium correlations (0.4 ≤ Pearson’s R ≤ 0.6) for 1 set, and the CHMC-cMD-MT-garf protocol generated high correlations (Pearson’s R ≥ 0.6) for 4 sets and medium correlations (0.4 ≤ Pearson’s R ≤ 0.6) for 3 set. This was in stark contrast to the dock-only based versions for this benchmark in that without thorough binding mode sampling, the docking-based protocols did not achieve as good of a ranking performance. The MOE-MT-amber protocol generated high correlations (Pearson’s R ≥ 0.6) for 1 set and medium correlations (0.4 ≤ Pearson’s R ≤ 0.6) for 4 sets, and the MOE-MT-garf protocol generated high correlations (Pearson’s R ≥ 0.6) for 1 set and medium correlations (0.4 ≤ Pearson’s R ≤ 0.6) for 3 sets.
As shown in Table 5, the rescaling procedure - which was performed against 8 separately curated sets of protein–ligand complexes - made significant impact to the protein–ligand binding free energy ranking performance when evaluating the whole benchmark including all the target sets. This improvement is observed both in the correlation coefficients and in error reduction. This is especially true for ranking across different target sets. Against the whole benchmark, utilizing rescaling increased the Pearson’s R and Kendall’s τ values for all protocols except for the two MOE docking-based protocols, and all the simulation-based protocols had reduced RMSE values after introducing rescaling.
Table 5. Statistical Results for All 6 Free Energy Protocols against the Merck Benchmark before and after Introducing the Rescaling Procedure
Without rescaling, the raw estimation results showed reasonable binding free energy ranking performance for the test cases within most target sets, while struggling to rank among different target sets. Relative binding free energy estimation between protein–ligand complexes has always been a challenge for both absolute and relative binding free energy methods (especially when there were major structural differences between complex molecules). Reaction-pathway dependent free energy methods rely heavily on the free energy map simulation of the sampling windows along purposely designed alchemical structural mutation pathways or protein–ligand dissociation/binding pathways. Significant structural differences between protein–ligand molecular systems require simulations with respect to high-dimensional collective variables (CVs) which heavily burden the convergence of molecular simulations. On the other hand, end-point free energy protocols usually depend on the generation of conformational probabilities for representative energy states or numerical energy characterization of significant conformations. A major structural difference often introduces large, distinct conformational changing features among different molecules which negatively impacts the conformational state sampling thoroughness for bound-state protein, free-state protein, ligand molecules, and ultimately the conformational energy calculation accuracy.
The MT method improves accuracy by using a ratio of the bound-state and free-state partition functions for the estimation of the binding free energies. This way common computational errors within both bound-state and free-state partition functions cancel out during the free energy calculation procedure. To better capture the significant binding modes, we can utilize docking procedures (which are often limited to rigid-protein structures) or - as in the case of the present work - different MD simulation methods performed on the bound-state protein–ligand complexes to cover more sampling space. For each individual target set, we clearly observe improvement using the simulation-based MT protocols compared to the docking-based MT protocols as summarized in Tables 2, 3 and 4. However, the new MT protocols applied in this study still lack the ranking capability for the protein–ligand complexes with distinct protein targets which suggests that extra simulation procedures are needed to capture the free energy baselines introduced by different proteins. The rescaling procedure is a useful tool for the absolute binding free energy protocols to improve the binding free energies ranking capabilities across different protein targets. While simulation-based protocols all had their prediction errors reduced, the two MOEdock-based protocols had increased errors after introducing the rescaling procedure. This suggests that with the same protein and different ligand structures, compared to dock-based protocols, simulation-based protocols yield similar conformational and interaction features within and between binding site protein residues versus the corresponding protein target found in the cocrystal structures from rescaling data sets. Therefore, the energy baseline generated by the MOE IFD protocol for each protein target significantly deviates from the energy baseline from the corresponding target sets used in the rescaling procedure.
When considering the computational time for performing all the free energy protocols against every protein–ligand complex from the Merck benchmark, the MD simulation was performed on a computer with 4 × NVIDIA Tesla V100 16GB (900 GB/s) GPUs and 128GB Memory, and the MOEdock was performed on a computer with an E5-2640 20-Core v4 (2.4 GHz) and 128GB Memory. The average computing time required for every protein–ligand complex per target set in the Merck benchmark is provided in Table 6. As the fastest protocol, the IFD MOEdock procedure spent 5 to 29 min on average (per protein–ligand complex) for all the target sets. This is contrasted with the cMD-MT protocol which required roughly 1190 min or 19.83 h on average to simulate a 250 ns trajectory for each of the protein–ligand complexes from this benchmark. Finally, for CHMC sampling which also included 50 ns short-term cMD simulations, these calculations cost 560.44 min or 9.34 h on average for sampling one trajectory for each protein–ligand complex from the benchmark. Because the multithread simulations in the CHMC-cMD protocol were performed parallelly, we presented the averaged time for every trajectory as the total time for the sampling workflow for each test case in the corresponding target set, noting that it was not an apples-to-apples computational time comparison because the parallel computing required more computational resources; but, employing multiple GPU cards was a simple way to support the parallel computing efficiency, while it was more expensive to improve the computational speed for the single-trajectory simulations. Overall, the CHMC-cMD protocol achieved close or even better free energy estimation accuracy compared to the 250 ns cMD protocol but at less simulation cost.
Table 6. Averaged Computational Time Monitored for All Target Sets from the Merck Benchmark
  sampling workflow averaged time (min)
protein MOE-MT cMD-MT CHMC-cMD-MT
CDK8 10 1409 855
c-Met 9 1082 810
Eg5 12 1174 249
Eg5 with an alternative loop 29 1166 233
HIF-2α 5 742 310
PFKFB3 9 1452 786
SHP-2 23 1684 907
SYK 16 1143 532
TNKS2 6 860 362
When looking at the statistical analysis results, the simulation-based MT protocols showed obvious advantages in ranking the binding affinities for the CDK8, c-Met, HIF-2α, PFKFB3 sets and achieved better overall binding affinities ranking against the whole benchmark. On the other hand, the MOE-MT protocols using both energy functions showed acceptable ranking capabilities, with even competitive performance for a few target sets. When also comparing the computational efficiency, the MOE-MT protocols achieved a fairly good “labor productivity ratio” by generating acceptable free energy estimation accuracy with much less computing time. However, when comparing the docking poses and the corresponding conformational snapshots from MD simulations, we also discovered that particular binding modes from docking could be significantly altered through the MD simulations leading to significant differences in the binding free energy estimation accuracy for some target sets, e.g., the PFKFB2 set from this benchmark.
Binding mode generation using docking methods is largely dependent on the input structure’s conformation along with the binding cavity on the protein surface. The IFD method only modestly refines the conformation of the residues in the binding site which contains the fitted ligand structure. When trying to evaluate the binding free energy with a single complex pose, important contacts could be missing, and unstable pairwise contacts may be considered important for maintaining the protein–ligand binding modes. This computational trial showed that the simulation-based free energy protocols sometimes significantly altered the conformation of the protein–ligand binding modes compared to the original docking poses, and some significant atom-pairwise interactions recognized by the docking protocol soon disappeared in the MD trajectories. Herein, we illustrate two example cases which reveal that the MD procedure often develops more reasonable binding modes leading to better binding free energy predictions compared to using docking procedures alone (target: Eg5–ligand: CHEMBL1084431 and target: PFKFB3–ligand: ligand_41).
For the Eg5 – CHEMBL1084431 protein–ligand molecular model, Figure 6 depicts the geometric difference between a representative binding mode selected from the converged cMD trajectory (Figures 6A and 6C) and the best scored docking pose from the MOE IFD (Figures 6B and 6D). Comparing Figures 6A and 6B, we see that the binding site residues wrap tighter around the ligand molecule during the MD simulation versus the IFD protocol. Figures 6C and 6D show that both IFD and MD recognize the hydrogen bond between the ligand’s N1 amine and the Gly103 backbone carbonyl group along with the hydrogen bond between the ligand’s N2 amine and the Glu102 backbone carbonyl group. However, only the MD trajectory shows that the side chain carboxyl group of Glu104 swings down to form another hydrogen bond with the ligand’s N1 amine. By reviewing the fraction of hydrogen bonds over the last 200 ns MD trajectory as reported in Table 7, we further discover that the N1 amine group moves back and forth to form several alternating hydrogen bonds with the surrounding residues. Specifically, when the N1 group moves to the left, it forms two hydrogen bonds with the Gly103 backbone and Glu104 side chain; but when the N1 group moves to the other side of the pocket, it forms one hydrogen bond with the Glu102 side chain. As a result, though the N2 group forms a stable hydrogen bond with the Glu102 residue (89.3% of the time through the last 200 ns), the N1 group was quite “bouncy” at the entrance of the binding pocket. This suggests that the ligand molecule, CHEMBL1084431, might not bind as tightly at the Eg5 binding site as shown from the MOE docking result. Against this complex, the MOE-MT-Amber protocol generated the binding free energy as −15.02 kcal/mol. Given the unstable hydrogen bonds formed between the ligand’s N1 group and the receptor, the cMD-MT-Amber free energy protocol generated a much lower binding affinity as −11.14 kcal/mol (which agrees better with the experimental value as −9.8 kcal/mol).

Figure 6

Figure 6. Illustration of the protein–ligand binding modes for the Eg5 – CHEMBL1084431 complex. (A) CHEMBL1084431 molecule at the Eg5 binding site using a representative binding mode selected from the converged cMD trajectory. (B) CHEMBL1084431 molecule at the Eg5 binding site using the best scored docking pose from the MOE IFD protocol. The area on the protein surface covered with yellow color shows the location of the residues having close contact (<4 Å) with the ligand. (C) Important Eg5 residues interacting with CHEMBL1084431 from the cMD representative binding mode. (C) Important Eg5 residues interacting with CHEMBL1084431 from the MOE IFD protocol.

Table 7. Hydrogen Bonding Percentages for the Eg5 – CHEMBL1084431 and PFKFB3–ligand_41 Complexes over Their Last 200 ns MD Trajectories
acceptor donor frac/%
Eg5 – CHEMBL1084431 complex
GLU_102@O LIG@N2 89.3
GLY_103@O LIG@N1 26.7
GLU_102@OE2 LIG@N1 14.5
GLU_102@OE1 LIG@N1 3.8
GLU_104@OE2 LIG@N1 9.6
GLU_104@OE1 LIG@N1 7.8
PFKFB3–ligand_41 Complex
LIG@N1 ASN133@ND2 18.3
LIG@N3 ASN133@ND2 6.6
For the PFKFB3–ligand_41 complex, there were two major differences in the binding modes generated by the two sampling protocols. In the MOE IFD best-scored pose, we observe that two residues at the PFKFB3 binding site act like a pair of “clips” clamping the quinoxaline group of the ligand in the middle: the Arg15-Gly16 backbone amide approaches the ligand’s quinoxaline by forming a π–π stacking interaction, while Val187 provides the side-chain isopropyl group to form a C–H−π interaction against the other side of the quinoxaline group. Notably, through the 250 ns MD simulation, the ligand’s quinoxaline group still exhibits close contacts with the Arg15-Gly16 backbone amide. However, the Val187 residue moves away from the ligand binding region as the beta-sheet containing it twists during the simulation. Meanwhile, Val213 moves to contact the ligand’s quinoxaline group and form a new C–H−π interaction which replaces the Val187 residue. Another binding-mode difference is observed at the ligand’s quinoxaline binding region. For this interaction, the MOE IFD protocol places the ligand’s pyrazine ring in a location to form two hydrogen bonds with both the Ser122 and the Asn133 side chains of the receptor, while the hydrogen bond between Ser122 and the ligand’s pyrazine nitrogen is not observed in the converged MD trajectory. Instead, MD simulation shows that the Ser122 and Lys17 side chains gradually drew each other closer and form a hydrogen bond network including the Ser122 and Lys17 side chains and the Arg15 backbone. Moreover, by studying the hydrogen bond’s fraction over the last 200 ns MD trajectory, we find that the hydrogen bond between the Asn133 side chain and the ligand’s pyrazine nitrogen is not stable through the simulation, and only 18.3% of the snapshots in the last 200 ns MD trajectory had the hydrogen bond formed between Asn133 and the ligand’s pyrazine nitrogen. For binding affinity estimation, the MOE-MT-Amber protocol determined a binding free energy of −11.36 kcal/mol, and the cMD-MT-Amber free energy protocol recognized the binding affinity as −7.72 kcal/mol. This latter estimation agrees better with the experimental value of −7.39 kcal/mol (see Figure 7).

Figure 7

Figure 7. Illustration of protein–ligand binding modes for the PFKFB3–ligand_41 complex. (A) Representative binding mode selected from the converged cMD trajectory. During the MD simulation, Val213 approached the quinoxaline group of the ligand to form a C–H−π interaction, while Val187 moved away from the ligand. (B) During the MD simulation, no hydrogen bonding interaction between SER122 and the ligand is observed, while the hydrogen bonding interaction between ASN133 and the ligand was unstable. (C) The MOE IFD protocol placed the ligand at the binding site with the quinoxaline group close to the Val187 side chain, while Val213 was far from the ligand. (D) The best scored docking pose recognized two hydrogen bonding interactions between the ligand and the SER122 and ASN133 side chains.

The IFD protocol is focused on optimizing the intermolecular interaction potential energy by locally altering the conformations of the ligand and the receptor residues at the interface, but it is generally incapable of relaxing the intramolecular potential stress given the conformational complexity of most protein receptors. During the protein–ligand binding mechanism, certain intraprotein interactions are preferred over some possible significant intermolecular interactions to achieve lower overall potential energy for the whole complex. Moreover, sometimes a stable binding mode could only be achieved with large protein conformational changes which can scarcely be obtained from a docking protocol without the support from large-scale simulation. These two examples clearly illustrate that some significant protein–ligand interactions formed from the docking protocol are unstable or not even recognized during the MD simulation, and some significant protein–ligand interactions can only be formed through significant protein motif movement.

Free Energy Estimation Using Single Conformation vs Conformation Ensembles from Simulation Trajectories

With a more comprehensive evaluation of the intra- and intermolecular interactions, force fields usually lead to different binding modes than the docking generations alone. Moreover, simulations against many test cases in this work generated multiple converged conformational clusters, suggesting multiple energy minima during the protein–ligand recognition. From the conformational energy point of view, significant energy minima, especially the global minimal conformation, usually dominated the partition function of a canonical ensemble, and thus, we further explored the binding free energy estimation accuracy by only using the energy-minimum snapshot from the simulation trajectories of the bound-state protein–ligand complexes.
We proposed two rules for selecting the energy-minimum snapshot. From the 250 ns conventional MD trajectories of every protein–ligand complex, the first protocol directly selected the energy-minimum snapshot according to the force field computation. The logic for choosing the energy-minimum snapshot was straightforward, but the wrong snapshot might be picked due to errors in energy functions. The second protocol first chose the largest conformational cluster from the multitrajectory CHMC sampling with 50 ns MD simulations of a protein–ligand complex and then selected the energy-minimum snapshot with respect to this cluster. The parallel simulations using CHMC with MD simulation protocols were expected to cover more extensive conformational space. Assuming the multitrajectory simulation converged after 50 ns local sampling, the largest conformational cluster had a higher opportunity to contain the “global” energy minimum among all the sampled conformations. The snapshot selected from the simulations of every protein–ligand complex was then treated with the MTScoreES protocol for binding free energy estimation.
In the following analysis, we used the term “snapshot-cMD-MT-amber” denoting the binding free energy estimation using the snapshot selected from the 250 ns MD simulation and treated with the MTScoreES protocol using the Amber energy function; the term “snapshot-cMD-MT-garf” denoting the binding free energy estimation using the snapshot selected from the 250 ns MD simulation and treated with the MTScoreES protocol using the GARF energy function; the term “snapshot-CHMC-cMD-MT-amber” denoting the binding free energy estimation using the snapshot selected from the CHMC parallel sampling followed by 50 ns conventional MD simulation and treated with the MTScoreES protocol using the Amber energy function; and the term “snapshot-CHMC-cMD-MT-garf” denoting the binding free energy estimation using the snapshot selected from the CHMC parallel sampling followed by 50 ns conventional MD simulation and treated with the MTScoreES protocol using the GARF energy function.
Against the CASF benchmark, free energy estimation results using single snapshots were concluded in Table 8. It was not surprising that the prediction accuracy and ranking performance were not as good as using the conformational ensemble-based protocols (See the corresponding results in Table 1.). Moreover, the results illustrated that binding free energy estimation using more thorough sampling and converged conformational ensembles from the simulation outperformed the binding affinity prediction using single conformations or docking poses. It was also worth noting that snapshots selected from the largest conformational cluster generated better binding free energy prediction results compared to using the energy-minimum snapshot from the 250 ns single-trajectory simulation. This suggested that the high-probable conformation generated from the MD simulation may be a better representative conformational state than the energy minimum selected from the simulation trajectory.
Table 8. Statistical Results for Protein-Ligand Binding Free Energy Estimation Using Single Snapshots Treated with Different MT-Based Protocols
MT protocols Pearson’s R Kendall’s τ MAE (kcal/mol)
snapshot-cMD-MT-amber 0.59 0.43 2.22
snapshot-cMD-MT-garf 0.63 0.45 1.86
snapshot-CHMC-cMD-MT-amber 0.63 0.46 2.32
snapshot-CHMC-cMD-MT-garf 0.65 0.47 1.78
Table 9 concluded the statistical results for the free energy estimation using single snapshots for the Merck benchmark. For the eight target sets, both the global minimal snapshots and the minimal snapshots from the largest conformational clusters had comparable binding affinity prediction accuracy. This suggested that both the single-trajectory and parallel simulations converged at similar conformational states.
Table 9. Statistical Results for All the Target Sets from the Merck Benchmark Using Single Snapshots Treated with Different Free Energy Protocols
    snapshot-cMD-MT-amber snapshot-cMD-MT-garf snapshot-CHMC-cMD-MT-amber snapshot-CHMC-cMD-MT-garf
CDK8 Pearson’s R 0.448 0.552 0.391 0.452
  RMSE (kcal/mol) 1.269 1.201 1.487 1.242
  Kendall’s τ 0.379 0.456 0.340 0.404
c-MET Pearson’s R 0.715 0.724 0.719 0.757
  RMSE (kcal/mol) 1.685 1.962 1.845 1.379
  Kendall’s τ 0.667 0.539 0.596 0.554
Eg5 Pearson’s R 0.405 0.505 0.464 0.506
  RMSE (kcal/mol) 1.928 2.494 1.587 1.678
  Kendall’s τ 0.072 0.169 0.207 0.220
Eg5 with an alternative loop Pearson’s R 0.367 0.403 0.360 0.361
  RMSE (kcal/mol) 2.024 2.180 1.816 1.789
  Kendall’s τ 0.121 0.183 0.113 0.118
HIF-2α Pearson’s R 0.270 0.354 0.212 0.345
  RMSE (kcal/mol) 1.184 1.409 1.274 1.263
  Kendall’s τ 0.196 0.270 0.140 0.295
PFKFB3 Pearson’s R 0.414 0.596 0.395 0.637
  RMSE (kcal/mol) 1.176 1.624 1.054 1.249
  Kendall’s τ 0.307 0.429 0.312 0.473
SHP-2 Pearson’s R 0.150 0.357 0.056 0.334
  RMSE (kcal/mol) 1.379 1.607 1.467 1.462
  Kendall’s τ 0.102 0.290 0.022 0.203
SYK Pearson’s R 0.250 0.252 0.180 0.219
  RMSE (kcal/mol) 2.535 3.163 2.550 2.523
  Kendall’s τ 0.156 0.122 0.072 0.057
TNKS2 Pearson’s R 0.567 0.350 0.617 0.440
  RMSE (kcal/mol) 0.994 2.943 0.855 2.593
  Kendall’s τ 0.335 0.167 0.372 0.227
total Pearson’s R 0.509 0.470 0.537 0.512
  RMSE (kcal/mol) 1.626 2.171 1.628 1.765
  Kendall’s τ 0.384 0.341 0.418 0.380
When comparing to the conformational ensemble-based free energy estimation protocols, we discovered that the all protocols had comparable ranking capability against the CDK8, c-MET, Eg5, and TNKS2 sets. On the other hand, for the HIF-2α, PFKFB3, SHP-2, and SYK sets, the single-snapshot-based protocols generally had higher correlation coefficients (Pearson’s R and Kendall’s τ values) compared to the docking-based protocols but were outperformed by the conformational ensemble-based MD protocols (cMD-MT and CHMC-cMD-MT protocols). It was likely that the protein–ligand complexes in the CDK8, c-MET, Eg5, and TNKS2 sets had dominantly significant global minimal conformational states and that binding free energy estimators using single conformations or conformational ensembles from docking and MD simulations all had generally acceptable accuracy. Against the HIF-2α, PFKFB3, SHP-2, and SYK sets, the MD simulation significantly altered the protein–ligand binding modes from the docking poses. Single snapshots from the MD simulations generated more reasonable binding affinities compared to the docking predictions, and the conformational ensembles from simulation further improved the binding affinity prediction accuracy.

Conclusion

ARTICLE SECTIONS
Jump To

In this study, we introduced two simulation-based conformational sampling protocols including (1) 250 ns single-trajectory Amber molecular dynamics and (2) CHMC Monte Carlo sampling followed by 50 ns multireplica Amber MD simulation and contrasted these with the previously published MOE induced-fit docking protocol. Each ensemble of protein–ligand complexes was generated and used to explore the free energy estimation accuracy and efficiency of the Movable Type absolute binding free energy method using different conformational sampling protocols. Compared to dock-based MT protocols, simulation-based MT protocols achieve significant improvement in binding free energy estimation accuracy against both the CASF-2016 and the Merck KGaA benchmarks. The CASF-2016 benchmark contains a wide variety of test cases with different structures for both ligands and proteins. The two simulation-based protocols using the GARF energy function obtained good prediction accuracy with Pearson’s R ≥ 0.7, Kendall’s τ = 0.52, and an MAE < 2 kcal/mol, and the two simulation-based protocols using the MT-AMBERFF19SB force field had lower correlation coefficients with Pearson’s R = 0.66, Kendall’s τ = 0.49 and higher errors with an MAE = 2.10 kcal/mol. Against protein–ligand complexes with identical protein targets from the Merck benchmark, our simulation-based protocols obtained acceptable (0.4 ≤ Pearson’s R ≤ 0.6) to good (Pearson’s R ≥ 0.6) ranking performances for more than 2/3rd of all the target sets. With the help of free energy prediction rescaling using a separate set of protein–ligand cocrystal structures, Pearson’s R values for all the protocols against the whole Merck benchmark were improved from 0.195–0.372 to 0.551–0.658 after rescaling, and RMSE values for all the simulation-based protocols were within 2 kcal/mol. Conversely, the RMSE values for dock-based protocols increased when introducing rescaling suggesting that the binding-site protein residues captured by the docking protocols deviated from the cocrystal structures in the corresponding rescaling set. As an absolute binding free energy method which relies on numerical energy performed on ensembles of conformational states, the MT protocol is heavily dependent on conformational sampling for protein–ligand complex systems. The present work clearly shows that utilizing MD protocols greatly improves binding free energy estimation accuracy. However, extensive simulation did not always result in better performance, and the MT-CHMC method, which employs multiple reasonable initial poses performed with parallel short-term simulations, reached similar estimation accuracy using less than half of the computational time. Since the method is compatible with many different conformational sampling methods, the present work shows that MT is a flexible free energy tool for protein–ligand binding free energy estimation tasks.

Supporting Information

ARTICLE SECTIONS
Jump To

The Supporting Information is available free of charge at https://pubs.acs.org/doi/10.1021/acs.jcim.2c00278.

  • Detailed explanation of Movable Type binding free energy estimation protocol (PDF)

  • Free energy calculation results for CASF-2016 benchmark, rescale set, and Merck KGaA benchmark using cMD-MT-amber, cMD-MT-garf, CHMC-cMD-MT-amber, and CHMC-cMD-MT-garf protocols (PDF)

Terms & Conditions

Most electronic Supporting Information files are available without a subscription to ACS Web Editions. Such files may be downloaded by article for research use (if there is a public use license linked to the relevant article, that license may permit other uses). Permission may be obtained from ACS for other uses through requests via the RightsLink permission system: http://pubs.acs.org/page/copyright/permissions.html.

Author Information

ARTICLE SECTIONS
Jump To

  • Corresponding Author
    • Zheng Zheng - School of Chemistry, Chemical Engineering and Life Science, Wuhan University of Technology, 122 Luoshi Road, Wuhan430070, PR ChinaQuantumBio Inc., State College, Pennsylvania16801, United StatesOrcidhttps://orcid.org/0000-0001-5221-3209 Email: [email protected]
  • Authors
    • Wenlang Liu - School of Chemistry, Chemical Engineering and Life Science, Wuhan University of Technology, 122 Luoshi Road, Wuhan430070, PR China
    • Zhenhao Liu - School of Chemistry, Chemical Engineering and Life Science, Wuhan University of Technology, 122 Luoshi Road, Wuhan430070, PR China
    • Hao Liu - School of Mechanical and Electronic Engineering, Wuhan University of Technology, 122 Luoshi Road, Wuhan430070, PR China
    • Lance M. Westerhoff - QuantumBio Inc., State College, Pennsylvania16801, United States
  • Notes
    The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.
    The authors declare no competing financial interest.

Acknowledgments

ARTICLE SECTIONS
Jump To

Z.Z. acknowledges support from the National Natural Science Foundation of China (Grant No. 21903061). L.M.W. acknowledges being supported in part by the National Institute of General Medical Sciences of the National Institutes of Health under Small Business Innovative Research (SBIR) Award R44GM134781. The authors also acknowledge the continued support of Michigan State University and MSU Technologies for licensing the MovableType technology to QuantumBio, Inc. We also thank Drs. Roger I. Martin, Oleg Borbulevych, Nupur Bansal, and Kenneth M. Merz, Jr. for their work and support in the prior study and the transfer and implementation of the MT technology in DivCon.

References

ARTICLE SECTIONS
Jump To

This article references 56 other publications.

  1. 1
    Schneider, G. Virtual screening: an endless staircase?. Nat. Rev. Drug Discovery 2010, 9 (4), 273276,  DOI: 10.1038/nrd3139
  2. 2
    Sliwoski, G.; Kothiwale, S.; Meiler, J.; Lowe Jr, E. W. Computational methods in drug discovery. Pharmacol. Rev. 2014, 66 (1), 334395,  DOI: 10.1124/pr.112.007336
  3. 3
    Dror, O.; Schneidman-Duhovny, D.; Inbar, Y.; Nussinov, R.; Wolfson, H. J. Novel approach for efficient pharmacophore-based virtual screening: method and applications. J. Chem. Inf. Model 2009, 49 (10), 23332343,  DOI: 10.1021/ci900263d
  4. 4
    Malmstrom, R. D.; Watowich, S. J. Using free energy of binding calculations to improve the accuracy of virtual screening predictions. J. Chem. Inf. Model 2011, 51 (7), 16481655,  DOI: 10.1021/ci200126v
  5. 5
    Klebe, G. Applying thermodynamic profiling in lead finding and optimization. Nat. Rev. Drug Discovery 2015, 14 (2), 95110,  DOI: 10.1038/nrd4486
  6. 6
    Albanese, S. K.; Chodera, J. D.; Volkamer, A.; Keng, S.; Abel, R.; Wang, L. Is Structure-Based Drug Design Ready for Selectivity Optimization?. J. Chem. Inf. Model 2020, 60 (12), 62116227,  DOI: 10.1021/acs.jcim.0c00815
  7. 7
    Raha, K.; Peters, M. B.; Wang, B.; Yu, N.; Wollacott, A. M.; Westerhoff, L. M.; Merz Jr, K. M. The role of quantum mechanics in structure-based drug design. Drug Discovery Today 2007, 12 (17–18), 725731,  DOI: 10.1016/j.drudis.2007.07.006
  8. 8
    Brooijmans, N.; Kuntz, I. D. Molecular recognition and docking algorithms. Annu. Rev. Biophys. Biomol. Struct. 2003, 32, 335373,  DOI: 10.1146/annurev.biophys.32.110601.142532
  9. 9
    Kuntz, I. D.; Blaney, J. M.; Oatley, S. J.; Langridge, R.; Ferrin, T. E. A geometric approach to macromolecule-ligand interactions. J. Mol. Biol. 1982, 161 (2), 269288,  DOI: 10.1016/0022-2836(82)90153-X
  10. 10
    Kitchen, D. B.; Decornez, H.; Furr, J. R.; Bajorath, J. Docking and scoring in virtual screening for drug discovery: methods and applications. Nat. Rev. Drug Discovery 2004, 3 (11), 935949,  DOI: 10.1038/nrd1549
  11. 11
    Sledz, P.; Caflisch, A. Protein structure-based drug design: from docking to molecular dynamics. Curr. Opin. Struct. Biol. 2018, 48, 93102,  DOI: 10.1016/j.sbi.2017.10.010
  12. 12
    De Vivo, M.; Masetti, M.; Bottegoni, G.; Cavalli, A. Role of Molecular Dynamics and Related Methods in Drug Discovery. J. Med. Chem. 2016, 59 (9), 40354061,  DOI: 10.1021/acs.jmedchem.5b01684
  13. 13
    Cheng, T.; Li, X.; Li, Y.; Liu, Z.; Wang, R. Comparative assessment of scoring functions on a diverse test set. J. Chem. Inf. Model 2009, 49 (4), 10791093,  DOI: 10.1021/ci9000053
  14. 14
    Li, Y.; Han, L.; Liu, Z.; Wang, R. Comparative assessment of scoring functions on an updated benchmark: 2. Evaluation methods and general results. J. Chem. Inf. Model 2014, 54 (6), 17171736,  DOI: 10.1021/ci500081m
  15. 15
    Su, M.; Yang, Q.; Du, Y.; Feng, G.; Liu, Z.; Li, Y.; Wang, R. Comparative Assessment of Scoring Functions: The CASF-2016 Update. J. Chem. Inf. Model 2019, 59 (2), 895913,  DOI: 10.1021/acs.jcim.8b00545
  16. 16
    Forli, S.; Huey, R.; Pique, M. E.; Sanner, M. F.; Goodsell, D. S.; Olson, A. J. Computational protein-ligand docking and virtual drug screening with the AutoDock suite. Nat. Protoc. 2016, 11 (5), 905919,  DOI: 10.1038/nprot.2016.051
  17. 17
    Trott, O.; Olson, A. J. AutoDock Vina: improving the speed and accuracy of docking with a new scoring function, efficient optimization, and multithreading. J. Comput. Chem. 2009, 31 (2), 455461,  DOI: 10.1002/jcc.21334
  18. 18
    Verdonk, M. L.; Cole, J. C.; Hartshorn, M. J.; Murray, C. W.; Taylor, R. D. Improved protein-ligand docking using GOLD. Proteins 2003, 52 (4), 609623,  DOI: 10.1002/prot.10465
  19. 19
    Friesner, R. A.; Murphy, R. B.; Repasky, M. P.; Frye, L. L.; Greenwood, J. R.; Halgren, T. A.; Sanschagrin, P. C.; Mainz, D. T. Extra precision glide: docking and scoring incorporating a model of hydrophobic enclosure for protein-ligand complexes. J. Med. Chem. 2006, 49 (21), 61776196,  DOI: 10.1021/jm051256o
  20. 20
    Jain, A. N. Surflex: fully automatic flexible molecular docking using a molecular similarity-based search engine. J. Med. Chem. 2003, 46 (4), 499511,  DOI: 10.1021/jm020406h
  21. 21
    Clark, A. J.; Tiwary, P.; Borrelli, K.; Feng, S.; Miller, E. B.; Abel, R.; Friesner, R. A.; Berne, B. J. Prediction of Protein-Ligand Binding Poses via a Combination of Induced Fit Docking and Metadynamics Simulations. J. Chem. Theory Comput. 2016, 12 (6), 29902998,  DOI: 10.1021/acs.jctc.6b00201
  22. 22
    Merz, K. M., Jr. Limits of Free Energy Computation for Protein-Ligand Interactions. J. Chem. Theory Comput. 2010, 6, 17691776,  DOI: 10.1021/ct100102q
  23. 23
    Li, J.; Fu, A.; Zhang, L. An Overview of Scoring Functions Used for Protein-Ligand Interactions in Molecular Docking. Interdiscip. Sci. 2019, 11 (2), 320328,  DOI: 10.1007/s12539-019-00327-w
  24. 24
    Fischer, A.; Smiesko, M.; Sellner, M.; Lill, M. A. Decision Making in Structure-Based Drug Discovery: Visual Inspection of Docking Results. J. Med. Chem. 2021, 64 (5), 24892500,  DOI: 10.1021/acs.jmedchem.0c02227
  25. 25
    Shao, J.; Tanner, S. W.; Thompson, N.; Cheatham, T. E. Clustering Molecular Dynamics Trajectories: 1. Characterizing the Performance of Different Clustering Algorithms. J. Chem. Theory Comput. 2007, 3 (6), 23122334,  DOI: 10.1021/ct700119m
  26. 26
    Wang, L.; Wu, Y.; Deng, Y.; Kim, B.; Pierce, L.; Krilov, G.; Lupyan, D.; Robinson, S.; Dahlgren, M. K.; Greenwood, J.; Romero, D. L.; Masse, C.; Knight, J. L.; Steinbrecher, T.; Beuming, T.; Damm, W.; Harder, E.; Sherman, W.; Brewer, M.; Wester, R.; Murcko, M.; Frye, L.; Farid, R.; Lin, T.; Mobley, D. L.; Jorgensen, W. L.; Berne, B. J.; Friesner, R. A.; Abel, R. Accurate and reliable prediction of relative ligand binding potency in prospective drug discovery by way of a modern free-energy calculation protocol and force field. J. Am. Chem. Soc. 2015, 137 (7), 26952703,  DOI: 10.1021/ja512751q
  27. 27
    Lee, T. S.; Allen, B. K.; Giese, T. J.; Guo, Z.; Li, P.; Lin, C.; McGee, T. D., Jr.; Pearlman, D. A.; Radak, B. K.; Tao, Y.; Tsai, H. C.; Xu, H.; Sherman, W.; York, D. M. Alchemical Binding Free Energy Calculations in AMBER20: Advances and Best Practices for Drug Discovery. J. Chem. Inf. Model. 2020, 60 (11), 55955623,  DOI: 10.1021/acs.jcim.0c00613
  28. 28
    Zheng, Z.; Ucisik, M. N.; Merz Jr, K. M. The Movable Type Method Applied to Protein-Ligand Binding. J. Chem. Theory Comput. 2013, 9 (12), 55265538,  DOI: 10.1021/ct4005992
  29. 29
    Stewart, J. J. Optimization of parameters for semiempirical methods V: modification of NDDO approximations and application to 70 elements. J. Mol. Model 2007, 13 (12), 11731213,  DOI: 10.1007/s00894-007-0233-4
  30. 30
    Zheng, Z.; Pei, J.; Bansal, N.; Liu, H.; Song, L. F.; Merz Jr, K. M. Generation of Pairwise Potentials Using Multidimensional Data Mining. J. Chem. Theory Comput. 2018, 14 (10), 50455067,  DOI: 10.1021/acs.jctc.8b00516
  31. 31
    Tian, C.; Kasavajhala, K.; Belfon, K. A. A.; Raguette, L.; Huang, H.; Migues, A. N.; Bickel, J.; Wang, Y.; Pincay, J.; Wu, Q.; Simmerling, C. ff19SB: Amino-Acid-Specific Protein Backbone Parameters Trained against Quantum Mechanics Energy Surfaces in Solution. J. Chem. Theory Comput. 2020, 16 (1), 528552,  DOI: 10.1021/acs.jctc.9b00591
  32. 32
    Wang, J.; Wolf, R. M.; Caldwell, J. W.; Kollman, P. A.; Case, D. A. Development and testing of a general amber force field. J. Comput. Chem. 2004, 25 (9), 11571174,  DOI: 10.1002/jcc.20035
  33. 33
    Zheng, Z.; Borbulevych, O. Y.; Liu, H.; Deng, J.; Martin, R. I.; Westerhoff, L. M. MovableType Software for Fast Free Energy-Based Virtual Screening: Protocol Development, Deployment, Validation, and Assessment. J. Chem. Inf. Model 2020, 60 (11), 54375456,  DOI: 10.1021/acs.jcim.0c00618
  34. 34
    Pan, L. L.; Zheng, Z.; Wang, T.; Merz Jr, K. M. Free Energy-Based Conformational Search Algorithm Using the Movable Type Sampling Method. J. Chem. Theory Comput. 2015, 11 (12), 58535864,  DOI: 10.1021/acs.jctc.5b00930
  35. 35
    Bansal, N.; Zheng, Z.; Merz Jr, K. M. Incorporation of side chain flexibility into protein binding pockets using MTflex. Bioorg. Med. Chem. 2016, 24 (20), 49784987,  DOI: 10.1016/j.bmc.2016.08.030
  36. 36
    Molecular Operating Environment (MOE); 2019.0102 Chemical Computing Group ULC: Montreal, QC, Canada, 2020.
  37. 37
    Corbeil, C. R.; Williams, C. I.; Labute, P. Variability in docking success rates due to dataset preparation. J. Comput. Aided Mol. Des. 2012, 26 (6), 775786,  DOI: 10.1007/s10822-012-9570-1
  38. 38
    Case, D. A.; Ben-Shalom, I. Y.; Brozell, S. R.; Cerutti, D. S.; Cheatham, T. E.; Cruzeiro, V. W. D., III; Darden, T. A.; Duke, R. E.; Ghoreishi, D.; Gilson, M. K.; Gohlke, H.; Goetz, A. W.; Greene, D.; Harris, R.; Homeyer, N.; Huang, Y.; Izadi, S.; Kovalenko, A.; Kurtzman, T.; Lee, T. S.; LeGrand, S.; Li, P.; Lin, C.; Liu, J.; Luchko, T.; Luo, R.; Mermelstein, D. J.; Merz, K. M.; Miao, Y.; Monard, G.; Nguyen, C.; Nguyen, H.; Omelyan, I.; Onufriev, A.; Pan, F.; Qi, R.; Roe, D. R.; Roitberg, A.; Sagui, C.; Schott-Verdugo, S.; Shen, J.; Simmerling, C. L.; Smith, J.; Salomon-Ferrer, R.; Swails, J.; Walker, R. C.; Wang, J.; Wei, H.; Wolf, R. M.; Wu, X.; Xiao, L.; York, D. M.; Kollman, P. A. AMBER 2018; University of California: San Francisco, 2018.
  39. 39
    Hornak, V.; Abel, R.; Okur, A.; Strockbine, B.; Roitberg, A.; Simmerling, C. Comparison of multiple Amber force fields and development of improved protein backbone parameters. Proteins 2006, 65 (3), 712725,  DOI: 10.1002/prot.21123
  40. 40
    Li, P.; Roberts, B. P.; Chakravorty, D. K.; Merz Jr, K. M. Rational Design of Particle Mesh Ewald Compatible Lennard-Jones Parameters for + 2 Metal Cations in Explicit Solvent. J. Chem. Theory Comput. 2013, 9 (6), 27332748,  DOI: 10.1021/ct400146w
  41. 41
    Essmann, U.; Perera, L.; Berkowitz, M. L.; Darden, T.; Lee, H.; Pedersen, L. G. A smooth particle mesh Ewald method. J. Chem. Phys. 1995, 103 (19), 85778593,  DOI: 10.1063/1.470117
  42. 42
    Ryckaert, J.-P.; Ciccotti, G.; Berendsen, H. J. Numerical integration of the cartesian equations of motion of a system with constraints: molecular dynamics of n-alkanes. J. Comput. Phys. 1977, 23 (3), 327341,  DOI: 10.1016/0021-9991(77)90098-5
  43. 43
    Liu, H.; Deng, J.; Luo, Z.; Lin, Y.; Merz Jr, K. M.; Zheng, Z. Receptor-Ligand Binding Free Energies from a Consecutive Histograms Monte Carlo Sampling Method. J. Chem. Theory Comput. 2020, 16 (10), 66456655,  DOI: 10.1021/acs.jctc.0c00457
  44. 44
    Schindler, C. E. M.; Baumann, H.; Blum, A.; Bose, D.; Buchstaller, H. P.; Burgdorf, L.; Cappel, D.; Chekler, E.; Czodrowski, P.; Dorsch, D.; Eguida, M. K. I.; Follows, B.; Fuchss, T.; Gradler, U.; Gunera, J.; Johnson, T.; Jorand Lebrun, C.; Karra, S.; Klein, M.; Knehans, T.; Koetzner, L.; Krier, M.; Leiendecker, M.; Leuthner, B.; Li, L.; Mochalkin, I.; Musil, D.; Neagu, C.; Rippmann, F.; Schiemann, K.; Schulz, R.; Steinbrecher, T.; Tanzer, E. M.; Unzue Lopez, A.; Viacava Follis, A.; Wegener, A.; Kuhn, D. Large-Scale Assessment of Binding Free Energy Calculations in Active Drug Discovery Projects. J. Chem. Inf. Model 2020, 60 (11), 54575474,  DOI: 10.1021/acs.jcim.0c00900
  45. 45
    Schiemann, K.; Mallinger, A.; Wienke, D.; Esdar, C.; Poeschke, O.; Busch, M.; Rohdich, F.; Eccles, S. A.; Schneider, R.; Raynaud, F. I.; Czodrowski, P.; Musil, D.; Schwarz, D.; Urbahns, K.; Blagg, J. Discovery of potent and selective CDK8 inhibitors from an HSP90 pharmacophore. Bioorg. Med. Chem. Lett. 2016, 26 (5), 14431451,  DOI: 10.1016/j.bmcl.2016.01.062
  46. 46
    Dorsch, D.; Schadt, O.; Stieber, F.; Meyring, M.; Gradler, U.; Bladt, F.; Friese-Hamim, M.; Knuhl, C.; Pehl, U.; Blaukat, A. Identification and optimization of pyridazinones as potent and selective c-Met kinase inhibitors. Bioorg. Med. Chem. Lett. 2015, 25 (7), 15971602,  DOI: 10.1016/j.bmcl.2015.02.002
  47. 47
    Schiemann, K.; Finsinger, D.; Zenke, F.; Amendt, C.; Knochel, T.; Bruge, D.; Buchstaller, H. P.; Emde, U.; Stahle, W.; Anzali, S. The discovery and optimization of hexahydro-2H-pyrano[3,2-c]quinolines (HHPQs) as potent and selective inhibitors of the mitotic kinesin-5. Bioorg. Med. Chem. Lett. 2010, 20 (5), 14911495,  DOI: 10.1016/j.bmcl.2010.01.110
  48. 48
    Wehn, P. M.; Rizzi, J. P.; Dixon, D. D.; Grina, J. A.; Schlachter, S. T.; Wang, B.; Xu, R.; Yang, H.; Du, X.; Han, G.; Wang, K.; Cao, Z.; Cheng, T.; Czerwinski, R. M.; Goggin, B. S.; Huang, H.; Halfmann, M. M.; Maddie, M. A.; Morton, E. L.; Olive, S. R.; Tan, H.; Xie, S.; Wong, T.; Josey, J. A.; Wallace, E. M. Design and Activity of Specific Hypoxia-Inducible Factor-2alpha (HIF-2alpha) Inhibitors for the Treatment of Clear Cell Renal Cell Carcinoma: Discovery of Clinical Candidate (S)-3-((2,2-Difluoro-1-hydroxy-7-(methylsulfonyl)-2,3-dihydro-1 H-inden-4-yl)oxy)-5-fluorobenzonitrile (PT2385). J. Med. Chem. 2018, 61 (21), 96919721,  DOI: 10.1021/acs.jmedchem.8b01196
  49. 49
    Boutard, N.; Bialas, A.; Sabiniarz, A.; Guzik, P.; Banaszak, K.; Biela, A.; Bien, M.; Buda, A.; Bugaj, B.; Cieluch, E.; Cierpich, A.; Dudek, L.; Eggenweiler, H. M.; Fogt, J.; Gaik, M.; Gondela, A.; Jakubiec, K.; Jurzak, M.; Kitlinska, A.; Kowalczyk, P.; Kujawa, M.; Kwiecinska, K.; Les, M.; Lindemann, R.; Maciuszek, M.; Mikulski, M.; Niedziejko, P.; Obara, A.; Pawlik, H.; Rzymski, T.; Sieprawska-Lupa, M.; Sowinska, M.; Szeremeta-Spisak, J.; Stachowicz, A.; Tomczyk, M. M.; Wiklik, K.; Wloszczak, L.; Ziemianska, S.; Zarebski, A.; Brzozka, K.; Nowak, M.; Fabritius, C. H. Discovery and Structure-Activity Relationships of N-Aryl 6-Aminoquinoxalines as Potent PFKFB3 Kinase Inhibitors. ChemMedChem. 2019, 14 (1), 169181,  DOI: 10.1002/cmdc.201800569
  50. 50
    Garcia Fortanet, J.; Chen, C. H.; Chen, Y. N.; Chen, Z.; Deng, Z.; Firestone, B.; Fekkes, P.; Fodor, M.; Fortin, P. D.; Fridrich, C.; Grunenfelder, D.; Ho, S.; Kang, Z. B.; Karki, R.; Kato, M.; Keen, N.; LaBonte, L. R.; Larrow, J.; Lenoir, F.; Liu, G.; Liu, S.; Lombardo, F.; Majumdar, D.; Meyer, M. J.; Palermo, M.; Perez, L.; Pu, M.; Ramsey, T.; Sellers, W. R.; Shultz, M. D.; Stams, T.; Towler, C.; Wang, P.; Williams, S. L.; Zhang, J. H.; LaMarche, M. J. Allosteric Inhibition of SHP2: Identification of a Potent, Selective, and Orally Efficacious Phosphatase Inhibitor. J. Med. Chem. 2016, 59 (17), 77737782,  DOI: 10.1021/acs.jmedchem.6b00680
  51. 51
    Chen, Y. N.; LaMarche, M. J.; Chan, H. M.; Fekkes, P.; Garcia-Fortanet, J.; Acker, M. G.; Antonakos, B.; Chen, C. H.; Chen, Z.; Cooke, V. G.; Dobson, J. R.; Deng, Z.; Fei, F.; Firestone, B.; Fodor, M.; Fridrich, C.; Gao, H.; Grunenfelder, D.; Hao, H. X.; Jacob, J.; Ho, S.; Hsiao, K.; Kang, Z. B.; Karki, R.; Kato, M.; Larrow, J.; La Bonte, L. R.; Lenoir, F.; Liu, G.; Liu, S.; Majumdar, D.; Meyer, M. J.; Palermo, M.; Perez, L.; Pu, M.; Price, E.; Quinn, C.; Shakya, S.; Shultz, M. D.; Slisz, J.; Venkatesan, K.; Wang, P.; Warmuth, M.; Williams, S.; Yang, G.; Yuan, J.; Zhang, J. H.; Zhu, P.; Ramsey, T.; Keen, N. J.; Sellers, W. R.; Stams, T.; Fortin, P. D. Allosteric inhibition of SHP2 phosphatase inhibits cancers driven by receptor tyrosine kinases. Nature 2016, 535 (7610), 148152,  DOI: 10.1038/nature18621
  52. 52
    Currie, K. S.; Kropf, J. E.; Lee, T.; Blomgren, P.; Xu, J.; Zhao, Z.; Gallion, S.; Whitney, J. A.; Maclin, D.; Lansdon, E. B.; Maciejewski, P.; Rossi, A. M.; Rong, H.; Macaluso, J.; Barbosa, J.; Di Paolo, J. A.; Mitchell, S. A. Discovery of GS-9973, a selective and orally efficacious inhibitor of spleen tyrosine kinase. J. Med. Chem. 2014, 57 (9), 38563873,  DOI: 10.1021/jm500228a
  53. 53
    Buchstaller, H. P.; Anlauf, U.; Dorsch, D.; Kuhn, D.; Lehmann, M.; Leuthner, B.; Musil, D.; Radtki, D.; Ritzert, C.; Rohdich, F.; Schneider, R.; Esdar, C. Discovery and Optimization of 2-Arylquinazolin-4-ones into a Potent and Selective Tankyrase Inhibitor Modulating Wnt Pathway Activity. J. Med. Chem. 2019, 62 (17), 78977909,  DOI: 10.1021/acs.jmedchem.9b00656
  54. 54
    Jakalian, A.; Bush, B. L.; Jack, D. B.; Bayly, C. I. Fast, efficient generation of high-quality atomic charges. AM1-BCC model: I. Method. J. Comput. Chem. 2000, 21 (2), 132146,  DOI: 10.1002/(SICI)1096-987X(20000130)21:2<132::AID-JCC5>3.0.CO;2-P
  55. 55
    Jakalian, A.; Jack, D. B.; Bayly, C. I. Fast, efficient generation of high-quality atomic charges. AM1-BCC model: II. Parameterization and validation. J. Comput. Chem. 2002, 23 (16), 16231641,  DOI: 10.1002/jcc.10128
  56. 56
    Zheng, Z.; Wang, T.; Li, P.; Merz, K. M., Jr. KECSA-Movable Type Implicit Solvation Model (KMTISM). J. Chem. Theory Comput. 2015, 11 (2), 667682,  DOI: 10.1021/ct5007828

Cited By

ARTICLE SECTIONS
Jump To

This article is cited by 3 publications.

  1. Xiang Wang, Yejun Deng, Yang Zhang, Caihong Zhang, Lujie Liu, Yong Liu, Jianxin Jiang, Pujun Xie, Lixin Huang. Screening and Evaluation of Novel α-Glucosidase Inhibitory Peptides from Ginkgo biloba Seed Cake Based on Molecular Docking Combined with Molecular Dynamics Simulation. Journal of Agricultural and Food Chemistry 2023, 71 (27) , 10326-10337. https://doi.org/10.1021/acs.jafc.3c00826
  2. Ze-jun Jia, Xiao-Wei Lan, Kui Lu, Xuan Meng, Wen-Jie Jing, Shi-Ru Jia, Kai Zhao, Yu-Jie Dai. Synthesis, molecular docking, and binding Gibbs free energy calculation of β-nitrostyrene derivatives: Potential inhibitors of SARS-CoV-2 3CL protease. Journal of Molecular Structure 2023, 18 , 135409. https://doi.org/10.1016/j.molstruc.2023.135409
  3. Xiao Liu, Lei Zheng, Chu Qin, Yalong Cong, John Z. H. Zhang, Zhaoxi Sun. Comprehensive Evaluation of End-Point Free Energy Techniques in Carboxylated-Pillar[6]arene Host–Guest Binding: III. Force-Field Comparison, Three-Trajectory Realization and Further Dielectric Augmentation. Molecules 2023, 28 (6) , 2767. https://doi.org/10.3390/molecules28062767
  • Abstract

    Figure 1

    Figure 1. Illustration of the free energy estimation protocols applied to (A) the CASF-2016 benchmark and (B) the Merck benchmark.

    Figure 2

    Figure 2. Experimental binding free energies versus predicted binding free energies for the 285 protein–ligand complexes from the CASF-2016 test benchmark. Standard deviations for the test cases were shown as error bars.

    Figure 3

    Figure 3. Correlations between predicted binding free energies using cMD-MT and CHMC-cMD-MT protocols against the CASF-2016 benchmark. The correlation on the left illustrates the predicted binding free energies using cMD-MT-amber (x-axis) vs CHMC-cMD-MT-amber (y-axis). The correlation on the right illustrates the predicted binding free energies using cMD-MT-garf (x-axis) vs CHMC-cMD-MT-garf (y-axis).

    Figure 4

    Figure 4. Pearson’s R coefficients comparisons for simulation-based and docking-based protocols against 52 subsets from the CASF-2016 benchmark. Pearson’s R values for each protocol were ranked individually in their own descending orders.

    Figure 5

    Figure 5. Correlation between predicted binding free energies and experimental values for all eight target sets using the 6 different free energy protocols (For simplicity, the results for the Eg5 target set using the alternative loop are not included.).

    Figure 6

    Figure 6. Illustration of the protein–ligand binding modes for the Eg5 – CHEMBL1084431 complex. (A) CHEMBL1084431 molecule at the Eg5 binding site using a representative binding mode selected from the converged cMD trajectory. (B) CHEMBL1084431 molecule at the Eg5 binding site using the best scored docking pose from the MOE IFD protocol. The area on the protein surface covered with yellow color shows the location of the residues having close contact (<4 Å) with the ligand. (C) Important Eg5 residues interacting with CHEMBL1084431 from the cMD representative binding mode. (C) Important Eg5 residues interacting with CHEMBL1084431 from the MOE IFD protocol.

    Figure 7

    Figure 7. Illustration of protein–ligand binding modes for the PFKFB3–ligand_41 complex. (A) Representative binding mode selected from the converged cMD trajectory. During the MD simulation, Val213 approached the quinoxaline group of the ligand to form a C–H−π interaction, while Val187 moved away from the ligand. (B) During the MD simulation, no hydrogen bonding interaction between SER122 and the ligand is observed, while the hydrogen bonding interaction between ASN133 and the ligand was unstable. (C) The MOE IFD protocol placed the ligand at the binding site with the quinoxaline group close to the Val187 side chain, while Val213 was far from the ligand. (D) The best scored docking pose recognized two hydrogen bonding interactions between the ligand and the SER122 and ASN133 side chains.

  • References

    ARTICLE SECTIONS
    Jump To

    This article references 56 other publications.

    1. 1
      Schneider, G. Virtual screening: an endless staircase?. Nat. Rev. Drug Discovery 2010, 9 (4), 273276,  DOI: 10.1038/nrd3139
    2. 2
      Sliwoski, G.; Kothiwale, S.; Meiler, J.; Lowe Jr, E. W. Computational methods in drug discovery. Pharmacol. Rev. 2014, 66 (1), 334395,  DOI: 10.1124/pr.112.007336
    3. 3
      Dror, O.; Schneidman-Duhovny, D.; Inbar, Y.; Nussinov, R.; Wolfson, H. J. Novel approach for efficient pharmacophore-based virtual screening: method and applications. J. Chem. Inf. Model 2009, 49 (10), 23332343,  DOI: 10.1021/ci900263d
    4. 4
      Malmstrom, R. D.; Watowich, S. J. Using free energy of binding calculations to improve the accuracy of virtual screening predictions. J. Chem. Inf. Model 2011, 51 (7), 16481655,  DOI: 10.1021/ci200126v
    5. 5
      Klebe, G. Applying thermodynamic profiling in lead finding and optimization. Nat. Rev. Drug Discovery 2015, 14 (2), 95110,  DOI: 10.1038/nrd4486
    6. 6
      Albanese, S. K.; Chodera, J. D.; Volkamer, A.; Keng, S.; Abel, R.; Wang, L. Is Structure-Based Drug Design Ready for Selectivity Optimization?. J. Chem. Inf. Model 2020, 60 (12), 62116227,  DOI: 10.1021/acs.jcim.0c00815
    7. 7
      Raha, K.; Peters, M. B.; Wang, B.; Yu, N.; Wollacott, A. M.; Westerhoff, L. M.; Merz Jr, K. M. The role of quantum mechanics in structure-based drug design. Drug Discovery Today 2007, 12 (17–18), 725731,  DOI: 10.1016/j.drudis.2007.07.006
    8. 8
      Brooijmans, N.; Kuntz, I. D. Molecular recognition and docking algorithms. Annu. Rev. Biophys. Biomol. Struct. 2003, 32, 335373,  DOI: 10.1146/annurev.biophys.32.110601.142532
    9. 9
      Kuntz, I. D.; Blaney, J. M.; Oatley, S. J.; Langridge, R.; Ferrin, T. E. A geometric approach to macromolecule-ligand interactions. J. Mol. Biol. 1982, 161 (2), 269288,  DOI: 10.1016/0022-2836(82)90153-X
    10. 10
      Kitchen, D. B.; Decornez, H.; Furr, J. R.; Bajorath, J. Docking and scoring in virtual screening for drug discovery: methods and applications. Nat. Rev. Drug Discovery 2004, 3 (11), 935949,  DOI: 10.1038/nrd1549
    11. 11
      Sledz, P.; Caflisch, A. Protein structure-based drug design: from docking to molecular dynamics. Curr. Opin. Struct. Biol. 2018, 48, 93102,  DOI: 10.1016/j.sbi.2017.10.010
    12. 12
      De Vivo, M.; Masetti, M.; Bottegoni, G.; Cavalli, A. Role of Molecular Dynamics and Related Methods in Drug Discovery. J. Med. Chem. 2016, 59 (9), 40354061,  DOI: 10.1021/acs.jmedchem.5b01684
    13. 13
      Cheng, T.; Li, X.; Li, Y.; Liu, Z.; Wang, R. Comparative assessment of scoring functions on a diverse test set. J. Chem. Inf. Model 2009, 49 (4), 10791093,  DOI: 10.1021/ci9000053
    14. 14
      Li, Y.; Han, L.; Liu, Z.; Wang, R. Comparative assessment of scoring functions on an updated benchmark: 2. Evaluation methods and general results. J. Chem. Inf. Model 2014, 54 (6), 17171736,  DOI: 10.1021/ci500081m
    15. 15
      Su, M.; Yang, Q.; Du, Y.; Feng, G.; Liu, Z.; Li, Y.; Wang, R. Comparative Assessment of Scoring Functions: The CASF-2016 Update. J. Chem. Inf. Model 2019, 59 (2), 895913,  DOI: 10.1021/acs.jcim.8b00545
    16. 16
      Forli, S.; Huey, R.; Pique, M. E.; Sanner, M. F.; Goodsell, D. S.; Olson, A. J. Computational protein-ligand docking and virtual drug screening with the AutoDock suite. Nat. Protoc. 2016, 11 (5), 905919,  DOI: 10.1038/nprot.2016.051
    17. 17
      Trott, O.; Olson, A. J. AutoDock Vina: improving the speed and accuracy of docking with a new scoring function, efficient optimization, and multithreading. J. Comput. Chem. 2009, 31 (2), 455461,  DOI: 10.1002/jcc.21334
    18. 18
      Verdonk, M. L.; Cole, J. C.; Hartshorn, M. J.; Murray, C. W.; Taylor, R. D. Improved protein-ligand docking using GOLD. Proteins 2003, 52 (4), 609623,  DOI: 10.1002/prot.10465
    19. 19
      Friesner, R. A.; Murphy, R. B.; Repasky, M. P.; Frye, L. L.; Greenwood, J. R.; Halgren, T. A.; Sanschagrin, P. C.; Mainz, D. T. Extra precision glide: docking and scoring incorporating a model of hydrophobic enclosure for protein-ligand complexes. J. Med. Chem. 2006, 49 (21), 61776196,  DOI: 10.1021/jm051256o
    20. 20
      Jain, A. N. Surflex: fully automatic flexible molecular docking using a molecular similarity-based search engine. J. Med. Chem. 2003, 46 (4), 499511,  DOI: 10.1021/jm020406h
    21. 21
      Clark, A. J.; Tiwary, P.; Borrelli, K.; Feng, S.; Miller, E. B.; Abel, R.; Friesner, R. A.; Berne, B. J. Prediction of Protein-Ligand Binding Poses via a Combination of Induced Fit Docking and Metadynamics Simulations. J. Chem. Theory Comput. 2016, 12 (6), 29902998,  DOI: 10.1021/acs.jctc.6b00201
    22. 22
      Merz, K. M., Jr. Limits of Free Energy Computation for Protein-Ligand Interactions. J. Chem. Theory Comput. 2010, 6, 17691776,  DOI: 10.1021/ct100102q
    23. 23
      Li, J.; Fu, A.; Zhang, L. An Overview of Scoring Functions Used for Protein-Ligand Interactions in Molecular Docking. Interdiscip. Sci. 2019, 11 (2), 320328,  DOI: 10.1007/s12539-019-00327-w
    24. 24
      Fischer, A.; Smiesko, M.; Sellner, M.; Lill, M. A. Decision Making in Structure-Based Drug Discovery: Visual Inspection of Docking Results. J. Med. Chem. 2021, 64 (5), 24892500,  DOI: 10.1021/acs.jmedchem.0c02227
    25. 25
      Shao, J.; Tanner, S. W.; Thompson, N.; Cheatham, T. E. Clustering Molecular Dynamics Trajectories: 1. Characterizing the Performance of Different Clustering Algorithms. J. Chem. Theory Comput. 2007, 3 (6), 23122334,  DOI: 10.1021/ct700119m
    26. 26
      Wang, L.; Wu, Y.; Deng, Y.; Kim, B.; Pierce, L.; Krilov, G.; Lupyan, D.; Robinson, S.; Dahlgren, M. K.; Greenwood, J.; Romero, D. L.; Masse, C.; Knight, J. L.; Steinbrecher, T.; Beuming, T.; Damm, W.; Harder, E.; Sherman, W.; Brewer, M.; Wester, R.; Murcko, M.; Frye, L.; Farid, R.; Lin, T.; Mobley, D. L.; Jorgensen, W. L.; Berne, B. J.; Friesner, R. A.; Abel, R. Accurate and reliable prediction of relative ligand binding potency in prospective drug discovery by way of a modern free-energy calculation protocol and force field. J. Am. Chem. Soc. 2015, 137 (7), 26952703,  DOI: 10.1021/ja512751q
    27. 27
      Lee, T. S.; Allen, B. K.; Giese, T. J.; Guo, Z.; Li, P.; Lin, C.; McGee, T. D., Jr.; Pearlman, D. A.; Radak, B. K.; Tao, Y.; Tsai, H. C.; Xu, H.; Sherman, W.; York, D. M. Alchemical Binding Free Energy Calculations in AMBER20: Advances and Best Practices for Drug Discovery. J. Chem. Inf. Model. 2020, 60 (11), 55955623,  DOI: 10.1021/acs.jcim.0c00613
    28. 28
      Zheng, Z.; Ucisik, M. N.; Merz Jr, K. M. The Movable Type Method Applied to Protein-Ligand Binding. J. Chem. Theory Comput. 2013, 9 (12), 55265538,  DOI: 10.1021/ct4005992
    29. 29
      Stewart, J. J. Optimization of parameters for semiempirical methods V: modification of NDDO approximations and application to 70 elements. J. Mol. Model 2007, 13 (12), 11731213,  DOI: 10.1007/s00894-007-0233-4
    30. 30
      Zheng, Z.; Pei, J.; Bansal, N.; Liu, H.; Song, L. F.; Merz Jr, K. M. Generation of Pairwise Potentials Using Multidimensional Data Mining. J. Chem. Theory Comput. 2018, 14 (10), 50455067,  DOI: 10.1021/acs.jctc.8b00516
    31. 31
      Tian, C.; Kasavajhala, K.; Belfon, K. A. A.; Raguette, L.; Huang, H.; Migues, A. N.; Bickel, J.; Wang, Y.; Pincay, J.; Wu, Q.; Simmerling, C. ff19SB: Amino-Acid-Specific Protein Backbone Parameters Trained against Quantum Mechanics Energy Surfaces in Solution. J. Chem. Theory Comput. 2020, 16 (1), 528552,  DOI: 10.1021/acs.jctc.9b00591
    32. 32
      Wang, J.; Wolf, R. M.; Caldwell, J. W.; Kollman, P. A.; Case, D. A. Development and testing of a general amber force field. J. Comput. Chem. 2004, 25 (9), 11571174,  DOI: 10.1002/jcc.20035
    33. 33
      Zheng, Z.; Borbulevych, O. Y.; Liu, H.; Deng, J.; Martin, R. I.; Westerhoff, L. M. MovableType Software for Fast Free Energy-Based Virtual Screening: Protocol Development, Deployment, Validation, and Assessment. J. Chem. Inf. Model 2020, 60 (11), 54375456,  DOI: 10.1021/acs.jcim.0c00618
    34. 34
      Pan, L. L.; Zheng, Z.; Wang, T.; Merz Jr, K. M. Free Energy-Based Conformational Search Algorithm Using the Movable Type Sampling Method. J. Chem. Theory Comput. 2015, 11 (12), 58535864,  DOI: 10.1021/acs.jctc.5b00930
    35. 35
      Bansal, N.; Zheng, Z.; Merz Jr, K. M. Incorporation of side chain flexibility into protein binding pockets using MTflex. Bioorg. Med. Chem. 2016, 24 (20), 49784987,  DOI: 10.1016/j.bmc.2016.08.030
    36. 36
      Molecular Operating Environment (MOE); 2019.0102 Chemical Computing Group ULC: Montreal, QC, Canada, 2020.
    37. 37
      Corbeil, C. R.; Williams, C. I.; Labute, P. Variability in docking success rates due to dataset preparation. J. Comput. Aided Mol. Des. 2012, 26 (6), 775786,  DOI: 10.1007/s10822-012-9570-1
    38. 38
      Case, D. A.; Ben-Shalom, I. Y.; Brozell, S. R.; Cerutti, D. S.; Cheatham, T. E.; Cruzeiro, V. W. D., III; Darden, T. A.; Duke, R. E.; Ghoreishi, D.; Gilson, M. K.; Gohlke, H.; Goetz, A. W.; Greene, D.; Harris, R.; Homeyer, N.; Huang, Y.; Izadi, S.; Kovalenko, A.; Kurtzman, T.; Lee, T. S.; LeGrand, S.; Li, P.; Lin, C.; Liu, J.; Luchko, T.; Luo, R.; Mermelstein, D. J.; Merz, K. M.; Miao, Y.; Monard, G.; Nguyen, C.; Nguyen, H.; Omelyan, I.; Onufriev, A.; Pan, F.; Qi, R.; Roe, D. R.; Roitberg, A.; Sagui, C.; Schott-Verdugo, S.; Shen, J.; Simmerling, C. L.; Smith, J.; Salomon-Ferrer, R.; Swails, J.; Walker, R. C.; Wang, J.; Wei, H.; Wolf, R. M.; Wu, X.; Xiao, L.; York, D. M.; Kollman, P. A. AMBER 2018; University of California: San Francisco, 2018.
    39. 39
      Hornak, V.; Abel, R.; Okur, A.; Strockbine, B.; Roitberg, A.; Simmerling, C. Comparison of multiple Amber force fields and development of improved protein backbone parameters. Proteins 2006, 65 (3), 712725,  DOI: 10.1002/prot.21123
    40. 40
      Li, P.; Roberts, B. P.; Chakravorty, D. K.; Merz Jr, K. M. Rational Design of Particle Mesh Ewald Compatible Lennard-Jones Parameters for + 2 Metal Cations in Explicit Solvent. J. Chem. Theory Comput. 2013, 9 (6), 27332748,  DOI: 10.1021/ct400146w
    41. 41
      Essmann, U.; Perera, L.; Berkowitz, M. L.; Darden, T.; Lee, H.; Pedersen, L. G. A smooth particle mesh Ewald method. J. Chem. Phys. 1995, 103 (19), 85778593,  DOI: 10.1063/1.470117
    42. 42
      Ryckaert, J.-P.; Ciccotti, G.; Berendsen, H. J. Numerical integration of the cartesian equations of motion of a system with constraints: molecular dynamics of n-alkanes. J. Comput. Phys. 1977, 23 (3), 327341,  DOI: 10.1016/0021-9991(77)90098-5
    43. 43
      Liu, H.; Deng, J.; Luo, Z.; Lin, Y.; Merz Jr, K. M.; Zheng, Z. Receptor-Ligand Binding Free Energies from a Consecutive Histograms Monte Carlo Sampling Method. J. Chem. Theory Comput. 2020, 16 (10), 66456655,  DOI: 10.1021/acs.jctc.0c00457
    44. 44
      Schindler, C. E. M.; Baumann, H.; Blum, A.; Bose, D.; Buchstaller, H. P.; Burgdorf, L.; Cappel, D.; Chekler, E.; Czodrowski, P.; Dorsch, D.; Eguida, M. K. I.; Follows, B.; Fuchss, T.; Gradler, U.; Gunera, J.; Johnson, T.; Jorand Lebrun, C.; Karra, S.; Klein, M.; Knehans, T.; Koetzner, L.; Krier, M.; Leiendecker, M.; Leuthner, B.; Li, L.; Mochalkin, I.; Musil, D.; Neagu, C.; Rippmann, F.; Schiemann, K.; Schulz, R.; Steinbrecher, T.; Tanzer, E. M.; Unzue Lopez, A.; Viacava Follis, A.; Wegener, A.; Kuhn, D. Large-Scale Assessment of Binding Free Energy Calculations in Active Drug Discovery Projects. J. Chem. Inf. Model 2020, 60 (11), 54575474,  DOI: 10.1021/acs.jcim.0c00900
    45. 45
      Schiemann, K.; Mallinger, A.; Wienke, D.; Esdar, C.; Poeschke, O.; Busch, M.; Rohdich, F.; Eccles, S. A.; Schneider, R.; Raynaud, F. I.; Czodrowski, P.; Musil, D.; Schwarz, D.; Urbahns, K.; Blagg, J. Discovery of potent and selective CDK8 inhibitors from an HSP90 pharmacophore. Bioorg. Med. Chem. Lett. 2016, 26 (5), 14431451,  DOI: 10.1016/j.bmcl.2016.01.062
    46. 46
      Dorsch, D.; Schadt, O.; Stieber, F.; Meyring, M.; Gradler, U.; Bladt, F.; Friese-Hamim, M.; Knuhl, C.; Pehl, U.; Blaukat, A. Identification and optimization of pyridazinones as potent and selective c-Met kinase inhibitors. Bioorg. Med. Chem. Lett. 2015, 25 (7), 15971602,  DOI: 10.1016/j.bmcl.2015.02.002
    47. 47
      Schiemann, K.; Finsinger, D.; Zenke, F.; Amendt, C.; Knochel, T.; Bruge, D.; Buchstaller, H. P.; Emde, U.; Stahle, W.; Anzali, S. The discovery and optimization of hexahydro-2H-pyrano[3,2-c]quinolines (HHPQs) as potent and selective inhibitors of the mitotic kinesin-5. Bioorg. Med. Chem. Lett. 2010, 20 (5), 14911495,  DOI: 10.1016/j.bmcl.2010.01.110
    48. 48
      Wehn, P. M.; Rizzi, J. P.; Dixon, D. D.; Grina, J. A.; Schlachter, S. T.; Wang, B.; Xu, R.; Yang, H.; Du, X.; Han, G.; Wang, K.; Cao, Z.; Cheng, T.; Czerwinski, R. M.; Goggin, B. S.; Huang, H.; Halfmann, M. M.; Maddie, M. A.; Morton, E. L.; Olive, S. R.; Tan, H.; Xie, S.; Wong, T.; Josey, J. A.; Wallace, E. M. Design and Activity of Specific Hypoxia-Inducible Factor-2alpha (HIF-2alpha) Inhibitors for the Treatment of Clear Cell Renal Cell Carcinoma: Discovery of Clinical Candidate (S)-3-((2,2-Difluoro-1-hydroxy-7-(methylsulfonyl)-2,3-dihydro-1 H-inden-4-yl)oxy)-5-fluorobenzonitrile (PT2385). J. Med. Chem. 2018, 61 (21), 96919721,  DOI: 10.1021/acs.jmedchem.8b01196
    49. 49
      Boutard, N.; Bialas, A.; Sabiniarz, A.; Guzik, P.; Banaszak, K.; Biela, A.; Bien, M.; Buda, A.; Bugaj, B.; Cieluch, E.; Cierpich, A.; Dudek, L.; Eggenweiler, H. M.; Fogt, J.; Gaik, M.; Gondela, A.; Jakubiec, K.; Jurzak, M.; Kitlinska, A.; Kowalczyk, P.; Kujawa, M.; Kwiecinska, K.; Les, M.; Lindemann, R.; Maciuszek, M.; Mikulski, M.; Niedziejko, P.; Obara, A.; Pawlik, H.; Rzymski, T.; Sieprawska-Lupa, M.; Sowinska, M.; Szeremeta-Spisak, J.; Stachowicz, A.; Tomczyk, M. M.; Wiklik, K.; Wloszczak, L.; Ziemianska, S.; Zarebski, A.; Brzozka, K.; Nowak, M.; Fabritius, C. H. Discovery and Structure-Activity Relationships of N-Aryl 6-Aminoquinoxalines as Potent PFKFB3 Kinase Inhibitors. ChemMedChem. 2019, 14 (1), 169181,  DOI: 10.1002/cmdc.201800569
    50. 50
      Garcia Fortanet, J.; Chen, C. H.; Chen, Y. N.; Chen, Z.; Deng, Z.; Firestone, B.; Fekkes, P.; Fodor, M.; Fortin, P. D.; Fridrich, C.; Grunenfelder, D.; Ho, S.; Kang, Z. B.; Karki, R.; Kato, M.; Keen, N.; LaBonte, L. R.; Larrow, J.; Lenoir, F.; Liu, G.; Liu, S.; Lombardo, F.; Majumdar, D.; Meyer, M. J.; Palermo, M.; Perez, L.; Pu, M.; Ramsey, T.; Sellers, W. R.; Shultz, M. D.; Stams, T.; Towler, C.; Wang, P.; Williams, S. L.; Zhang, J. H.; LaMarche, M. J. Allosteric Inhibition of SHP2: Identification of a Potent, Selective, and Orally Efficacious Phosphatase Inhibitor. J. Med. Chem. 2016, 59 (17), 77737782,  DOI: 10.1021/acs.jmedchem.6b00680
    51. 51
      Chen, Y. N.; LaMarche, M. J.; Chan, H. M.; Fekkes, P.; Garcia-Fortanet, J.; Acker, M. G.; Antonakos, B.; Chen, C. H.; Chen, Z.; Cooke, V. G.; Dobson, J. R.; Deng, Z.; Fei, F.; Firestone, B.; Fodor, M.; Fridrich, C.; Gao, H.; Grunenfelder, D.; Hao, H. X.; Jacob, J.; Ho, S.; Hsiao, K.; Kang, Z. B.; Karki, R.; Kato, M.; Larrow, J.; La Bonte, L. R.; Lenoir, F.; Liu, G.; Liu, S.; Majumdar, D.; Meyer, M. J.; Palermo, M.; Perez, L.; Pu, M.; Price, E.; Quinn, C.; Shakya, S.; Shultz, M. D.; Slisz, J.; Venkatesan, K.; Wang, P.; Warmuth, M.; Williams, S.; Yang, G.; Yuan, J.; Zhang, J. H.; Zhu, P.; Ramsey, T.; Keen, N. J.; Sellers, W. R.; Stams, T.; Fortin, P. D. Allosteric inhibition of SHP2 phosphatase inhibits cancers driven by receptor tyrosine kinases. Nature 2016, 535 (7610), 148152,  DOI: 10.1038/nature18621
    52. 52
      Currie, K. S.; Kropf, J. E.; Lee, T.; Blomgren, P.; Xu, J.; Zhao, Z.; Gallion, S.; Whitney, J. A.; Maclin, D.; Lansdon, E. B.; Maciejewski, P.; Rossi, A. M.; Rong, H.; Macaluso, J.; Barbosa, J.; Di Paolo, J. A.; Mitchell, S. A. Discovery of GS-9973, a selective and orally efficacious inhibitor of spleen tyrosine kinase. J. Med. Chem. 2014, 57 (9), 38563873,  DOI: 10.1021/jm500228a
    53. 53
      Buchstaller, H. P.; Anlauf, U.; Dorsch, D.; Kuhn, D.; Lehmann, M.; Leuthner, B.; Musil, D.; Radtki, D.; Ritzert, C.; Rohdich, F.; Schneider, R.; Esdar, C. Discovery and Optimization of 2-Arylquinazolin-4-ones into a Potent and Selective Tankyrase Inhibitor Modulating Wnt Pathway Activity. J. Med. Chem. 2019, 62 (17), 78977909,  DOI: 10.1021/acs.jmedchem.9b00656
    54. 54
      Jakalian, A.; Bush, B. L.; Jack, D. B.; Bayly, C. I. Fast, efficient generation of high-quality atomic charges. AM1-BCC model: I. Method. J. Comput. Chem. 2000, 21 (2), 132146,  DOI: 10.1002/(SICI)1096-987X(20000130)21:2<132::AID-JCC5>3.0.CO;2-P
    55. 55
      Jakalian, A.; Jack, D. B.; Bayly, C. I. Fast, efficient generation of high-quality atomic charges. AM1-BCC model: II. Parameterization and validation. J. Comput. Chem. 2002, 23 (16), 16231641,  DOI: 10.1002/jcc.10128
    56. 56
      Zheng, Z.; Wang, T.; Li, P.; Merz, K. M., Jr. KECSA-Movable Type Implicit Solvation Model (KMTISM). J. Chem. Theory Comput. 2015, 11 (2), 667682,  DOI: 10.1021/ct5007828
  • Supporting Information

    Supporting Information

    ARTICLE SECTIONS
    Jump To

    The Supporting Information is available free of charge at https://pubs.acs.org/doi/10.1021/acs.jcim.2c00278.

    • Detailed explanation of Movable Type binding free energy estimation protocol (PDF)

    • Free energy calculation results for CASF-2016 benchmark, rescale set, and Merck KGaA benchmark using cMD-MT-amber, cMD-MT-garf, CHMC-cMD-MT-amber, and CHMC-cMD-MT-garf protocols (PDF)


    Terms & Conditions

    Most electronic Supporting Information files are available without a subscription to ACS Web Editions. Such files may be downloaded by article for research use (if there is a public use license linked to the relevant article, that license may permit other uses). Permission may be obtained from ACS for other uses through requests via the RightsLink permission system: http://pubs.acs.org/page/copyright/permissions.html.

Pair your accounts.

Export articles to Mendeley

Get article recommendations from ACS based on references in your Mendeley library.

Pair your accounts.

Export articles to Mendeley

Get article recommendations from ACS based on references in your Mendeley library.

You’ve supercharged your research process with ACS and Mendeley!

STEP 1:
Click to create an ACS ID

Please note: If you switch to a different device, you may be asked to login again with only your ACS ID.

Please note: If you switch to a different device, you may be asked to login again with only your ACS ID.

Please note: If you switch to a different device, you may be asked to login again with only your ACS ID.

MENDELEY PAIRING EXPIRED
Your Mendeley pairing has expired. Please reconnect