Rational Drug Discovery, Design, and Development

CONTENTS

20.2.2 Molecular Dynamics Simulations

20.2.3 Computational Quantum Mechanics

20.3 Pharmacophore Development and Database Searching

20.3.1 Importance of Pharmacophore Modeling

20.3.2 Pharmacophore for Modeling Drug Transporters

20.5.2 QSAR for Modeling Drug Transporters

20.6 Structure-Based Drug Design

20.6.1 Structure-Based Approaches for Modeling Drug Transporters

20.7 Proteomics and Drug Target Identification

20.8 Solubility and Computational Prediction

The widespread use of computer-based methods has had a dramatic impact on research in the physical, biological, and medical sciences. Applications of computational chemistry to drug design have clearly advanced the discovery of bioactive compounds. Computer speeds have allowed scientists in all fields to tackle problems that were not possible 20 years ago. Using highly specialized software, drug discovery scientists can apply energy-based calculations to help understand drug–receptor interactions at the molecular level. The molecular structure of drug candidates can be inserted into binding sites in silico to determine optimal interactions. The information can be used to design novel compounds with improved fits for the binding sites under examination.

When the 3D structures of macromolecular drug targets are not known, homology-modeling methods may be used to construct reasonable 3D models of the putative binding site based on similar protein structures. Where there are scant experimental structural data regarding the shape of macromolecular drug targets, medicinal chemists can examine the common structural, electronic, and conformational features of known biologically active compounds to construct pharmacophore models for database searching. Complex correlations between molecular structure and predicted physical properties may be used to suggest the best potential drug leads for animal model studies and clinical trials, providing investigators with candidates that are more likely to be successful. Over the last decade, there has been increasing focus on the prediction of toxicity.

With advances in technology, computers and laboratory devices have become faster and the information obtained is more accurate, but despite these advances, the difficulty of trying to design and develop a drug with all the required physical and biological properties necessary for FDA approval remains an incredibly expensive and time-consuming challenge. The latest estimates on the cost of getting a drug approved now exceed $2 billion (DiMasi and Grabowski 2012). Nevertheless, in the last 50 years, there have been great intellectual strides made in the development and use of in silico methods. This chapter will provide a brief history of the major advances in the field, as well as a description of some of the more robust and successful methods currently used in drug design and discovery, with particular emphasis on drug delivery considerations such as optimizing the solubility of lead compounds and studying drug–transporter interactions.

Computational chemistry is a generic term that describes a broadly based set of theoretical methodologies that can trace their roots to the development of mathematical physics. One of the common themes found in computational chemistry is the extensive use of computers to solve complex problems that range from polymer chemistry and nanotechnology to biochemistry and pharmacology. The methods of classical physics routinely used today in molecular modeling have foundations based on Newton’s equations and/or later formulations developed by Hamilton, Lagrange, and others. They include molecular mechanics and molecular dynamics calculations. These methods work reasonably well for large molecular systems but only when the mathematical models (equations and equation parameters) have been carefully developed. For many areas of molecular modeling, the use of quantum physics is critical, particularly when the explicit treatment of electrons is essential. In 1998, the Nobel Prize in Chemistry was awarded equally to John Pople and Walter Kohn for their independent development of methods in computational quantum chemistry and the development of density functional theory (DFT), respectively. For enzyme–substrate interactions, a combination of classical and quantum physics is necessary if the goal is to examine the bond scission and/or bond formation. A general strategy is to treat the active site with quantum mechanics to account for the shifts in electron density involved in enzyme–substrate interactions (binding, bond formation, bond breakage, etc.), while the rest of the molecule is subjected to classical methods. The marriage of molecular mechanics and quantum mechanics is called QM/MM, and the importance of this approach was underscored with the 2013 Nobel Prize in Chemistry for Martin Karplus, Michael Levitt, and Arieh Warshel. 
The term quantum pharmacology has been coined and is applicable when computational quantum chemistry is used to calculate molecular structures of pharmacologic interest (Richards 1984). Many medicinal chemists have used classical and/or quantum mechanical calculations to determine preferred conformations, molecular shapes, electron distributions, enzyme–substrate reactions, and drug–receptor interactions. A brief survey of some computational methods is presented in the following sections.

Molecular mechanics is widely used in computational schemes (Bowen 2004). In this method, the atoms within a molecule are treated as soft spheres that may be viewed as being held together by springlike forces. Earlier names for the approach include the Westheimer method and the force field method. Molecular mechanics emerged from spectroscopy, and the basic approach was outlined by D.H. Andrews in the early twentieth century. Three key papers that applied classical physics to chemical problems appeared in 1946: one by Westheimer and Mayer, a second by Hill, and a third by Dostrovsky, Hughes, and Ingold. These papers represent the first examples of what today is recognized as molecular mechanics. The name Westheimer method reflects Westheimer's demonstration of how the approach could be used to understand a specific chemical problem.

Molecular mechanics divides the total potential energy (*U _{total}*) among various component potential energy terms with which chemists can readily identify, including but not limited to stretching and compression (*U _{stretch}*), bending (*U _{bend}*), torsions (*U _{torsion}*), and nonbonded potentials (*U _{non-bonded}*). These nonbonded potentials account for the electrostatic and van der Waals repulsive and attractive (London dispersion forces) interactions (Bowen and Zhong 2013):

${U}_{total}={U}_{stretch}+{U}_{bend}+{U}_{torsion}+{U}_{non-bonded}$ (20.1)

Simple mathematical expressions are commonly used (e.g., Hooke’s law for the stretching and bending potential energies and Coulomb’s law for electrostatic interactions). The advantage in using less rigorous equations is translated into faster computational time, but the big disadvantage is the loss of accuracy and predictability when simple models are used. Is it better to get less accurate information faster or the right answer in due course?
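The sum of energy terms described above can be sketched in a few lines of Python. This is a minimal illustration, not the functional form of any particular published force field: the Hooke's law stretch and bend terms and the Coulomb term follow the text, while the cosine torsion term and the 12-6 Lennard-Jones term are common choices assumed here. All function names and all numerical parameters are hypothetical and chosen only for demonstration (energies in kcal/mol, distances in angstroms, angles in radians).

```python
import math

# Coulomb conversion factor in kcal*Angstrom/(mol*e^2)
COULOMB_CONSTANT = 332.0637


def stretch_energy(r, r0, k_stretch):
    """Hooke's law bond stretching and compression."""
    return 0.5 * k_stretch * (r - r0) ** 2


def bend_energy(theta, theta0, k_bend):
    """Hooke's law angle bending (angles in radians)."""
    return 0.5 * k_bend * (theta - theta0) ** 2


def torsion_energy(phi, barrier, periodicity, phase=0.0):
    """Cosine torsional term (an assumed, commonly used form)."""
    return 0.5 * barrier * (1.0 + math.cos(periodicity * phi - phase))


def lennard_jones_energy(r, epsilon, sigma):
    """12-6 van der Waals term: repulsion plus London dispersion attraction."""
    ratio6 = (sigma / r) ** 6
    return 4.0 * epsilon * (ratio6 * ratio6 - ratio6)


def coulomb_energy(r, q_i, q_j, dielectric=1.0):
    """Coulomb's law for two point charges given in units of e."""
    return COULOMB_CONSTANT * q_i * q_j / (dielectric * r)


def total_energy(stretches, bends, torsions, nonbonded_pairs):
    """Return each component and the total, U_total = sum of the four terms."""
    components = {
        "stretch": sum(stretch_energy(*term) for term in stretches),
        "bend": sum(bend_energy(*term) for term in bends),
        "torsion": sum(torsion_energy(*term) for term in torsions),
        "non-bonded": sum(
            lennard_jones_energy(r, eps, sigma) + coulomb_energy(r, q_i, q_j)
            for (r, eps, sigma, q_i, q_j) in nonbonded_pairs
        ),
    }
    components["total"] = sum(components.values())
    return components


# Hypothetical geometry and parameters, for illustration only.
energies = total_energy(
    stretches=[(1.55, 1.53, 620.0)],                       # (r, r0, k)
    bends=[(math.radians(112.0), math.radians(109.5), 100.0)],
    torsions=[(math.radians(60.0), 2.9, 3)],               # (phi, barrier, n)
    nonbonded_pairs=[(3.8, 0.1, 3.4, 0.1, -0.1)],          # (r, eps, sigma, q_i, q_j)
)
for name, value in energies.items():
    print(f"U_{name} = {value:.4f} kcal/mol")
```

The trade-off raised in the text is visible here: every term is a cheap closed-form expression, so the cost per evaluation is small, but the quality of the result depends entirely on how well the parameters were developed.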

One of the major difficulties initially encountered with molecular mechanics was the simultaneous minimization of the potential energy functions with respect to the coordinates. Westheimer was able to demonstrate the utility of the force field method by doing an energy calculation by hand. Hendrickson is credited with doing the first computer-based molecular mechanics calculation in 1961. In 1965, Snyder and Schachtschneider demonstrated that force constants were essentially transferable from molecule to molecule if key off-diagonal terms among neighboring atom pairs were included. There was no general energy minimization algorithm until one was developed in the labs of Kenneth Wiberg. In the late 1970s, the MM2 method was introduced and became widely used. Clark Still and others modified and improved the original MM2 method and incorporated the computational scheme into molecular modeling graphical programs such as MacroModel. Subsequent versions of MM2 were not popular with organic and medicinal chemists because of their limited functional group parameterization. Today, the standard molecular mechanics method found in most software programs is the Merck molecular force field (MMFF) developed by Thomas Halgren, which has replaced MM2 and its successors (MM3 and MM4); these are no longer being actively developed. The Jorgensen optimized potential for liquid simulations (OPLS) molecular mechanics method is also used by many scientists. OPLS is distinguished from other methods because it is parameterized to reproduce solution phase data; other molecular mechanics schemes are fit to gas phase experimental data and/or high-level quantum mechanics calculations.
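To make the idea of minimizing the energy simultaneously with respect to all coordinates concrete, the following sketch applies steepest descent to a toy system: three atoms on a line joined by two Hooke's law bonds. It is an assumption-laden illustration, not the algorithm developed in the Wiberg laboratory or the minimizer of any particular program; the function names, force constant, equilibrium bond length, and step size are all hypothetical.

```python
def chain_energy_and_gradient(x, k=1.0, r0=1.0):
    """Energy and gradient of a 1D chain of atoms joined by harmonic bonds."""
    energy = 0.0
    gradient = [0.0] * len(x)
    for i in range(len(x) - 1):
        displacement = (x[i + 1] - x[i]) - r0
        energy += 0.5 * k * displacement ** 2
        gradient[i] -= k * displacement      # dU/dx_i
        gradient[i + 1] += k * displacement  # dU/dx_(i+1)
    return energy, gradient


def steepest_descent(x, step=0.1, tol=1e-10, max_iter=10000):
    """Move every coordinate downhill at once until the gradient vanishes."""
    x = list(x)
    for _ in range(max_iter):
        _, gradient = chain_energy_and_gradient(x)
        if max(abs(g) for g in gradient) < tol:
            break
        x = [xi - step * gi for xi, gi in zip(x, gradient)]
    energy, _ = chain_energy_and_gradient(x)
    return x, energy


# Start from a distorted geometry: one bond compressed, one stretched.
start = [0.0, 0.6, 2.1]
minimized, final_energy = steepest_descent(start)
bonds = [minimized[i + 1] - minimized[i] for i in range(len(minimized) - 1)]
print("bond lengths:", bonds)
print("final energy:", final_energy)
```

Steepest descent is the simplest possible choice and converges slowly on realistic energy surfaces; production codes typically use more sophisticated schemes, but the principle of updating all coordinates from the gradient is the same.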

Understanding drug–receptor, substrate–enzyme, and inhibitor–enzyme interactions is critical for many drug design approaches. Molecular mechanics methods designed for the accurate calculation of small molecules are not necessarily the ones used to calculate macromolecular structures. For macromolecules, the Assisted Model Building with Energy Refinement (AMBER) and Chemistry at HARvard Macromolecular Mechanics (CHARMM) programs reign supreme. They use simple potential energy functions, which work remarkably well for large, unstrained macromolecules.

**20.2.2 MOLECULAR DYNAMICS SIMULATIONS**

Molecular dynamics (MD) simulations are popular ways to understand macromolecular behavior as a function of time (Bowen 2004; Bowen and Zhong 2013). The trajectories of the atoms of a macromolecule can be determined using classical physics (equations of motion and molecular-kinetic theory). Molecular dynamics is based on molecular mechanics energy equations and the fact that the force is equal to the negative of the derivative of the potential energy with respect to the coordinates (Equation 20.2), where *U* is the potential energy as a function of the generalized coordinates *r*. Given the molecular mechanics potential energy equation and the configuration of the atoms, Equation 20.2 allows the calculation of the forces and, in turn, through Newton's equations of motion, the calculation of velocities and trajectories. This approach is useful for treating large molecular systems with water solvation models or with explicit water molecules:

$F=-\frac{\partial U(r)}{\partial r}$ (20.2)
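The chain from potential energy to force to trajectory can be sketched for a single particle in one dimension. The example below is a minimal illustration under stated assumptions: the potential is a Hooke's law spring, the force is obtained from Equation 20.2 by a central finite difference, and Newton's equations are integrated with the velocity Verlet scheme, a widely used integrator chosen here for illustration (the text does not specify one). All names, the unit mass, the unit force constant, and the time step are hypothetical.

```python
import math


def potential(x, k=1.0):
    """Hooke's law potential energy U(x)."""
    return 0.5 * k * x * x


def force(x, h=1e-5):
    """F = -dU/dx, estimated with a central finite difference."""
    return -(potential(x + h) - potential(x - h)) / (2.0 * h)


def total_energy(x, v, mass=1.0):
    """Kinetic plus potential energy."""
    return 0.5 * mass * v * v + potential(x)


def velocity_verlet(x, v, mass=1.0, dt=0.01, n_steps=1000):
    """Integrate Newton's equation of motion, returning (time, x, v) tuples."""
    trajectory = [(0.0, x, v)]
    f = force(x)
    for step in range(1, n_steps + 1):
        v_half = v + 0.5 * dt * f / mass   # half-step velocity update
        x = x + dt * v_half                # full-step position update
        f = force(x)                       # force at the new position
        v = v_half + 0.5 * dt * f / mass   # complete the velocity update
        trajectory.append((step * dt, x, v))
    return trajectory


trajectory = velocity_verlet(x=1.0, v=0.0)
t_end, x_end, v_end = trajectory[-1]
print(f"t = {t_end:.2f}: x = {x_end:.5f} (analytic cos(t) = {math.cos(t_end):.5f})")
print(f"energy drift = {total_energy(x_end, v_end) - total_energy(1.0, 0.0):.2e}")
```

For this simple potential the trajectory can be compared with the analytic solution, which is not possible for a macromolecule; there, conservation of the total energy is the usual sanity check on the integration.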

Pharmaceutical companies rarely use molecular dynamics simulations due to lengthy time requirements for MD simulation studies, but they remain of interest to many academic scientists. This approach has yielded important insights.

**20.2.3 COMPUTATIONAL QUANTUM MECHANICS**

Without question, computational quantum mechanics has emerged as an extremely useful method for examining and determining the predicted physical properties of molecular structures in silico. Unlike molecular mechanics, which can be viewed as an *ad hoc* collection of potential functions, quantum mechanics is a rigorously based theory that emerged in the mid-1920s with further developments in later years. Although several of the early pioneers of quantum mechanics did not receive Nobel Prizes for their contributions, numerous scientists (e.g., Bohr, Planck, de Broglie, Einstein, Heisenberg, Schrödinger, Pauli, Dirac, Born, and others) were recognized. In 1998, the importance of this field and two of its major contributors, John Pople and Walter Kohn, were recognized as outlined earlier. (It should be noted that independent work by Roald Hoffmann and Kenichi Fukui led to the 1981 Nobel Prize in Chemistry, and there are many seminal advances in chemistry and physics based on quantum mechanics that are not discussed in this chapter.)

Quantum mechanics has fundamentally changed the way physicists and chemists view the subatomic nature of the universe. For the most part, medicinal chemists are more involved in applying quantum mechanics to problems of pharmacological interest, rather than trying to understand the consequences of Bell’s inequality and the fundamental debate of subatomic reality. Nevertheless, the field of drug design benefits directly from the latest achievements. One interesting example is the growing recognition of the importance of what are termed halogen bonds. In a halogen bond, electron-rich donor groups form stable interactions with the electron-deficient region on the surface of the halogen. This electron-deficient area is aligned with the carbon–halogen bond and is known as the sigma hole. Figure 20.1 shows the sigma hole for a very simple molecule, chlorotrifluoromethane, calculated at the HF/6-31G(d,p) level of theory.

It should be noted that the vast majority of molecular problems of particular interest in quantum pharmacology are systems with potential functions that are not changing with time. These systems may be described with the time-independent Schrödinger equation (Equation 20.3), which is an eigenvalue equation where *Ĥ* is the Hamiltonian operator and Ψ (*r*) is the molecular wave function that depends on the position vectors, ${\overrightarrow{r}}_{i}$ (Hehre et al. 1986). The following equation is written in a compact form, where the symbolism masks the complexity of the equation for each atom:

$\widehat{H}\Psi (r)=E\Psi (r)$ (20.3)

The Hamiltonian operator, named in honor of the accomplished nineteenth-century Irish physicist Sir William Rowan Hamilton, is the quantum mechanical equivalent of the summation of the kinetic and potential energy of a system in classical physics. The first two terms of the Hamiltonian operator *Ĥ* (Equation 20.4) are the kinetic energy operators for the electrons and nuclei, respectively, with the summation over all electrons *i* and all nuclei *A*. The Laplacian ${\nabla}_{i}^{2}$ is defined later in this section (Equation 20.8). The symbol *h* is Planck’s constant, *m* is the mass of the electron, and *M _{A}* is the mass of nucleus *A*. The last three terms of Equation 20.4 are based on Coulomb’s law and represent the nucleus–nucleus repulsion, electron–nucleus attraction, and electron–electron repulsion electrostatic terms, respectively. The symbol *e* is the absolute value of the charge of the electron and proton, and *Z _{A}* is the atomic number of nucleus *A*; the negative sign in front of the summation signs indicates an attractive potential energy, whereas the positive sign indicates a repulsive potential energy. The signs follow from the charges of the particles involved: the charge of an electron is (−*e*), and the charge of a nucleus with atomic number *Z _{A}* is (+*Z _{A}e*):

$\begin{array}{l}\displaystyle \widehat{H}=-\frac{h^{2}}{8\pi^{2}m}\sum_{i}^{\text{Electrons}}\nabla_{i}^{2}-\frac{h^{2}}{8\pi^{2}}\sum_{A}^{\text{Nuclei}}\frac{1}{M_{A}}\nabla_{A}^{2}+\frac{e^{2}}{4\pi\epsilon_{0}}\sum_{A}^{\text{Nuclei}}\sum_{B<A}^{\text{Nuclei}}\frac{Z_{A}Z_{B}}{\left|R_{AB}\right|}\\ \displaystyle \quad-\frac{e^{2}}{4\pi\epsilon_{0}}\sum_{i}^{\text{Electrons}}\sum_{A}^{\text{Nuclei}}\frac{Z_{A}}{\left|r_{iA}\right|}+\frac{e^{2}}{4\pi\epsilon_{0}}\sum_{i}^{\text{Electrons}}\sum_{j<i}^{\text{Electrons}}\frac{1}{\left|r_{ij}\right|}\end{array}$ (20.4)

Equation 20.4 can be simplified by recognizing that on the timescale of the electronic motion, the nuclei can be considered at rest (Born–Oppenheimer approximation) on a relative basis. Thus, the second term in Equation 20.4 vanishes, as shown in Equation 20.5. The third term in Equation 20.4 is a constant for each fixed nuclear configuration and may be removed (Equation 20.6). This term may be added later to the final electronic total energy, *E _{el}*:

$\displaystyle -\frac{h^{2}}{8\pi^{2}}\sum_{A}^{\text{Nuclei}}\frac{1}{M_{A}}\nabla_{A}^{2}=0$ (20.5)

$\displaystyle \frac{e^{2}}{4\pi\epsilon_{0}}\sum_{A}^{\text{Nuclei}}\sum_{B<A}^{\text{Nuclei}}\frac{Z_{A}Z_{B}}{\left|R_{AB}\right|}=\text{Constant}$ (20.6)

Therefore, by making the appropriate adjustments according to Equations 20.5 and 20.6, Equation 20.4 reduces to the following formulation:

$\displaystyle \widehat{H}_{electronic}=-\frac{h^{2}}{8\pi^{2}m}\sum_{i}^{\text{Electrons}}\nabla_{i}^{2}-\frac{e^{2}}{4\pi\epsilon_{0}}\sum_{i}^{\text{Electrons}}\sum_{A}^{\text{Nuclei}}\frac{Z_{A}}{\left|r_{iA}\right|}+\frac{e^{2}}{4\pi\epsilon_{0}}\sum_{i}^{\text{Electrons}}\sum_{j<i}^{\text{Electrons}}\frac{1}{\left|r_{ij}\right|}$ (20.7)
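As a worked illustration (an example added here, not one taken from the chapter), consider the helium atom: one nucleus with *Z* = 2 and two electrons, labeled 1 and 2. Writing out each sum of the electronic Hamiltonian explicitly gives two kinetic energy terms, two electron–nucleus attraction terms, and a single electron–electron repulsion term, where *r* _{1} and *r* _{2} are the electron–nucleus distances and *r* _{12} is the distance between the two electrons:

$\displaystyle \widehat{H}_{electronic}^{\text{He}}=-\frac{h^{2}}{8\pi^{2}m}\left(\nabla_{1}^{2}+\nabla_{2}^{2}\right)-\frac{e^{2}}{4\pi\epsilon_{0}}\left(\frac{2}{\left|r_{1}\right|}+\frac{2}{\left|r_{2}\right|}\right)+\frac{e^{2}}{4\pi\epsilon_{0}}\frac{1}{\left|r_{12}\right|}$

With only one nucleus there is no nucleus–nucleus repulsion, so the constant set aside in the Born–Oppenheimer treatment is zero for an isolated atom. The last term couples the coordinates of the two electrons, and it is this coupling that prevents an exact analytical solution even for such a small system.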

The Laplacian operator, commonly found in mathematical physics, is defined in Equation 20.8. It is the dot product of the gradient operator, ∇, with itself, which yields the scalar operator ∇^{2} (Bowen 2004):

$\displaystyle \nabla^{2}=\left(\widehat{i}\frac{\partial}{\partial x}+\widehat{j}\frac{\partial}{\partial y}+\widehat{k}\frac{\partial}{\partial z}\right)\cdot\left(\widehat{i}\frac{\partial}{\partial x}+\widehat{j}\frac{\partial}{\partial y}+\widehat{k}\frac{\partial}{\partial z}\right)=\frac{\partial^{2}}{\partial x^{2}}+\frac{\partial^{2}}{\partial y^{2}}+\frac{\partial^{2}}{\partial z^{2}}$ (20.8)

For most molecular problems, it is easier to identify the position of a particle with a vector in spherical polar coordinates (*r*, θ, φ) rather than the Cartesian coordinate system (*x, y, z*). The Laplacian (Equation 20.8) can be transformed into spherical polar coordinates (Equation 20.9).

$\displaystyle \nabla^{2}=\frac{1}{r^{2}}\frac{\partial}{\partial r}\left(r^{2}\frac{\partial}{\partial r}\right)+\frac{1}{r^{2}\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\frac{\partial}{\partial\theta}\right)+\frac{1}{r^{2}\sin^{2}\theta}\frac{\partial^{2}}{\partial\phi^{2}}$ (20.9)

Removing the nuclear terms, as described earlier, produces the electronic Schrödinger equation. The Schrödinger equation can be solved exactly for only a few simple and special cases, which provide instructive examples of the mathematics involved. We have not discussed spin, which is a purely quantum mechanical property. For molecular systems, the approximations are too numerous and involved to be reviewed here, but the interested reader is encouraged to consult the References at the end of the chapter. Two important approximations must be mentioned: (1) the Hartree–Fock method, in which each electron moves independently in the average field of the others, an approximation that leads to energies higher than the exact value, and (2) the so-called linear combination of atomic orbitals–molecular orbitals (LCAO–MO) approach, in which molecular orbitals are viewed as combinations of atomic orbitals. In turn, the atomic orbitals can be represented by a judiciously selected sum of Gaussian functions. The approach described earlier represents what is termed computational ab initio theory (Hehre et al. 1986). Electron correlation methods have been devised for greater accuracy. In these methods, higher-energy electronic configurations are included along with the ground-state configuration, and the resulting calculations are referred to as post-Hartree–Fock calculations. Some of the problems associated with Hartree–Fock theory have been overcome with density functional theory (DFT).
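Two points from the preceding paragraph, that approximate wave functions give energies above the exact value and that Gaussian functions can stand in for atomic orbitals, can be illustrated with the hydrogen atom. The sketch below is an example added here, not a calculation from the chapter. It assumes atomic units (energies in hartree) and a single normalized Gaussian trial function exp(−αr²), for which the energy expectation value has the standard closed form E(α) = 3α/2 − 2√(2α/π). The exact ground-state energy is −0.5 hartree. The function names and search bounds are hypothetical.

```python
import math


def gaussian_trial_energy(alpha):
    """Hydrogen atom energy (hartree) for the trial function exp(-alpha*r^2)."""
    kinetic = 1.5 * alpha
    nuclear_attraction = -2.0 * math.sqrt(2.0 * alpha / math.pi)
    return kinetic + nuclear_attraction


def golden_section_minimize(func, low, high, tol=1e-10):
    """Locate the minimum of a single-minimum function on [low, high]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    while high - low > tol:
        left = high - inv_phi * (high - low)
        right = low + inv_phi * (high - low)
        if func(left) < func(right):
            high = right   # minimum lies in [low, right]
        else:
            low = left     # minimum lies in [left, high]
    return 0.5 * (low + high)


EXACT_ENERGY = -0.5  # exact hydrogen ground state, hartree

best_alpha = golden_section_minimize(gaussian_trial_energy, 0.01, 2.0)
best_energy = gaussian_trial_energy(best_alpha)
print(f"optimal exponent alpha = {best_alpha:.5f}")
print(f"variational energy     = {best_energy:.5f} hartree")
print(f"error vs exact         = {best_energy - EXACT_ENERGY:.5f} hartree")
```

The optimized single Gaussian recovers the analytic result α = 8/(9π) and E = −4/(3π), which lies above the exact energy as the variational principle requires. Adding more Gaussians with different exponents narrows the gap, which is the rationale for representing each atomic orbital by a sum of Gaussian functions.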

The DFT method involves the direct use of electron densities (Parr and Yang 1989). It competes with and may surpass post-self-consistent field (SCF) calculations in terms of accuracy because the basic formulation, the Kohn–Sham equations, explicitly includes electron correlation. DFT, compared to ab initio methods, avoids working with the many-electron wave function, which reduces the computer time of a calculation. Computer time is an important consideration when the systems are moderate to large. It has been estimated that the cost of ab initio calculations scales as *z*^{4}, where *z* represents the number of electrons. The cost of DFT calculations also depends on the number of electrons but scales as *z*^{3}. The accuracy and reduction in time have made DFT a popular option (Bowen and Zhong 2013).
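The practical meaning of the two scaling estimates can be seen with a short calculation. The sketch below is a rough illustration that assumes pure power-law scaling with the exponents quoted in the text and ignores prefactors, memory limits, and implementation details, so it shows relative growth only and does not predict actual run times. The function and constant names are hypothetical.

```python
AB_INITIO_EXPONENT = 4  # cost estimated to grow as z**4
DFT_EXPONENT = 3        # cost estimated to grow as z**3


def relative_cost(electron_ratio, exponent):
    """Factor by which cost grows when the electron count grows by electron_ratio."""
    return electron_ratio ** exponent


for ratio in (2, 5, 10):
    ab_initio_growth = relative_cost(ratio, AB_INITIO_EXPONENT)
    dft_growth = relative_cost(ratio, DFT_EXPONENT)
    print(
        f"{ratio}x more electrons: ab initio cost x{ab_initio_growth}, "
        f"DFT cost x{dft_growth}, gap widens x{ab_initio_growth // dft_growth}"
    )
```

Under these assumptions, the advantage of the lower exponent grows in direct proportion to the size of the system, which is why the difference matters most for moderate to large molecules.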

Computational quantum mechanics calculations can easily be carried out with software that runs readily on laptops. Modern quantum mechanics–based software allows one to calculate the molecular structure, energy, thermodynamic values, and physical properties of small and large molecular systems. For example, the structure of the anticancer drug ixabepilone can readily be calculated on an inexpensive laptop at the 3-21G level (or higher) and displayed with its electrostatic energy surface (Figure 20.2). Although a deeper knowledge of the underlying theory is always an advantage when doing calculations, it is not necessary to be a theoretician. For example, most experimental organic and medicinal chemists use NMR spectroscopy on a regular basis, but not many consider all of the underlying quantum physics when interpreting spectra.