Other Related Techniques

With the advances in computational resources, there is an increasing urge among the computational researchers to make the in silico approaches fast, convenient, reproducible, acceptable, and sensible ones. Along with the typical two-dimensional (2D) and three-dimensional (3D) quantitative structure–activity relationship (QSAR) methods, approaches like pharmacophore, structure-based docking studies, and combinations of ligand- and structure-based approaches like comparative residue interaction analysis (CoRIA) and comparative binding energy analysis (COMBINE) have gained a significant popularity in the computational drug design process. A pharmacophore can be developed either in a ligand-based method, by superposing a set of active molecules and extracting common chemical features which are vital for their bioactivity; or in a structure-based manner, by probing probable interaction points between the macromolecular target and ligands. The interaction of protein and ligand molecules with each other is one of the interesting studies in modern molecular biology and molecular recognition. This interaction can well be explained with the conceptof a docking study to show how a molecule can bind to another molecule to exert the bioactivity. Docking and pharmacophore are non-QSAR approaches in in silico drug design that can support the QSAR findings. Approaches like CoRIA and COMBINE can use information generated from the ligand–receptor complexes to extract the critical clue concerning the types of significant interaction at the level of both the receptor and the ligand. Employing the abovementioned ligand- and structure-based methodologies and chemical libraries, virtual screening (VS) emerged as an important tool in the quest to develop novel drug compounds. VS serves as an efficient computational tool that integrates structural data with lead optimization as a cost-effective approach to drug discovery.

Keywords

COMBINE; CoRIA; database mining; docking; pharmacophore; virtual screening

Contents

10.1 Introduction 358

10.2 Pharmacophore 359

10.2.1 Concept and definition 359

10.2.2 Background and early days of pharmacophore 360

10.2.3 Methodology of pharmacophore mapping 361

10.2.3.1 Diverse conformation generation 361

10.2.3.2 Generation of 3D pharmacophore 362

10.2.3.3 Assessment of the quality of pharmacophore hypotheses 363

10.2.3.4 Validation of the pharmacophore model 365

10.2.4 Types of pharmacophore 366

10.2.4.1 Ligand-based pharmacophore modeling 366

10.2.4.2 Structure-based pharmacophore modeling 368

10.2.5 Application of pharmacophore models 372

10.2.5.1 Pharmacophore model–based VS 372

10.2.5.2 Pharmacophore-based de novo design 373

10.2.6 Advantages and limitations of pharmacophore 374

10.2.7 Software tools for pharmacophore analysis 375

10.3 Structure-Based Design–Docking 375

10.3.1 Concept and definition of docking 375

10.3.2 Definition of fundamental terms of docking 379

10.3.3 Essential requirements of docking 381

10.3.4 Categorization of docking 382

10.3.4.1 Receptor/protein flexibility 383

10.3.4.2 Ligand sampling and flexibility 385

10.3.4.3 Docking scoring functions 387

10.3.5 Basic steps of docking 388

10.3.6 Challenges and required improvements in docking studies 389

10.3.7 Applications of docking 392

10.3.8 Docking software tools 393

10.4 Combination of Structure- and Ligand-Based Design Tools 396

10.4.1 Comparative binding energy analysis 396

10.4.1.1 The concept of comparative binding energy 396

10.4.1.2 The methodology of COMBINE 397

10.4.1.3 Importance and advantages of COMBINE 399

10.4.1.4 Drawbacks and required improvements 399

10.4.1.5 Applications of COMBINE 399

10.4.1.6 Software for COMBINE 400

10.4.2 Comparative residue interaction analysis 400

10.4.2.1 Concept of CoRIA 400

10.4.2.2 Methodology of CoRIA 401

10.4.2.3 Variants of CoRIA 401

10.4.2.4 Importance and application of CoRIA 402

10.4.2.5 Drawback of CoRIA 402

10.4.2.6 Future perspective of CoRIA 403

10.5 In silico Screening of Chemical Libraries: VS 403

10.5.1 Concept 403

10.5.2 Workflow and types of VS 403

10.5.2.1 Selection of chemical libraries/databases 404

10.5.2.2 Preprocessing of chemical libraries 404

10.5.2.3 Filtering of druglike molecules 404

10.5.2.4 Screening 404

10.5.2.5 Hit selection to new chemical entity generation 405

10.5.3 Successful application of VS: A few case studies 406

10.5.4 Advantages of VS 406

10.5.4.1 Cost-effective 413

10.5.4.2 Time-saving 413

10.5.4.3 Labor-efficient 413

10.5.4.4 Sensible alternative 413

10.5.5 Pitfalls 413

10.5.6 Databases for the VS 416

10.6 Overview and Conclusions 417

References 421

10.1 Introduction

Computer-assisted tools in drug design and discovery, with an incredible modernization of computational resources, are very much appreciated throughout the world. Both ligand- and structure-based approaches are increasingly being used for the design of small lead and druglike molecules with anticipated multitarget activities [1]. Various ligand-based methods have been developed for effective and comprehensive application in virtual screening (VS), de novo design, and lead optimization. Pharmacophore has become one of the major ligand-based tools in computational chemistry for the drug research and development process [2]. Again, molecular recognitions, including enzyme–substrate, drug–protein, drug–nucleic acid, protein–nucleic acid, and protein–protein interactions, play significant roles in many biological responses. As a consequence, identification of the binding mode and affinity of the drug molecule is crucial to understanding the underlying mechanism of action in the respective therapeutic response. In this perspective, structure-based drug design is always a front-runner among all the available drug design approaches. Molecular docking is one of the largely acclaimed structure-based approaches, widely used for the study of molecular recognition, which aims to predict the binding mode and binding affinity of a complex formed by two or more constituent molecules with known structures [3].

There are a handful of novel techniques invented in the last decade employing the combined information computed from receptors and ligands. These tools can be defined as a combination of structure- and ligand-based design tools in the evolution of drug discovery techniques. Undoubtedly, methods like comparative binding energy analysis (COMBINE) [4] and comparative residue interaction analysis (CoRIA) [5] are the front-runners in the abovementioned approach with encouraging successful applications in drug discovery.

In silico screening is generally defined as VS, which is used rationally to select compounds for biological in vitro/in vivo testing from chemical libraries and databases of hundreds of thousands of compounds [6]. The VS approach is used for computationally prioritizing drug candidate molecules for future synthesis by using certain filters. The filters may be created by employing knowledge about the protein target (in structure-based VS) or known bioactive ligands (in ligand-based VS). These computational methods are powerful tools, as they supply a straightforward way to estimate the properties of the molecules and establish them as probable drug candidates from a huge number of compounds in no time in a cost-effective way. A combination of bioinformatics and chemoinformatics is crucial to the success of VS of chemical libraries, which is an alternative and complementary approach to high-throughput screening (HTS) in the lead discovery process [7]. Simply stated, the VS attempts to improve the probability of identifying bioactive molecules by maximizing the true positive rate—that is, by ranking the truly active molecules as high as possible.

10.2 Pharmacophore

10.2.1 Concept and definition

One of the most promising in silico concepts of computer-aided drug design (CADD) is that of the pharmacophore. The term pharmacophore was first coined by Paul Ehrlich in the early 1900s, but it was Monty Kier [8,9] who introduced the physical chemical concept of pharmacophore in a series of papers published between 1967 and 1971. The pharmacophore technique in modern drug discovery is extremely useful as an interface between the medicinal chemistry and computational chemistry, both in VS and library design for efficient hit discovery, as well as in the optimization of lead compounds to final drug candidates. Recent research has focused on the practice of parallel screening using pharmacophore models for bioactivity profiling and early-stage risk assessment of probable adverse effects and toxicity due to interaction of drug candidates with antitargets.

The hypothesis of pharmacophore is based on that the molecular recognition of a biological target by a class of compounds can be explained by a set of common features that interact with a set of complementary sites on the biological target [10]. Along with the features, their three-dimensional (3D) relationship with each of the features is another crucial component of the pharmacophore concept. It is closely linked to the widely used principle of bioisosterism, which can be adopted by medicinal chemists while designing bioactive compound series.

The pharmacophore can be simply defined by the following, as stated in the International Union of Pure and Applied Chemistry (IUPAC) definition of the term given in Wermuth et al. [11]:

A pharmacophore is the ensemble of steric and electronic features that is necessary to ensure the optimal supramolecular interactions with a specific biological target structure and to trigger (or to block) its biological response.

A pharmacophore does not represent a real molecule or a real association of functional groups, but a purely abstract concept that accounts for the common molecular interaction capacities of a group of compounds toward their target structure.

A pharmacophore can be considered as the largest common denominator shared by a set of active molecules. This definition discards a misuse often found in the medicinal chemistry literature, which consists of naming as pharmacophores simple chemical functionalities such as guanidines, sulfonamides, or dihydroimidazoles (formerly imidazolines), or typical structural skeletons such as flavones, phenothiazines, prostaglandins, or steroids.

A pharmacophore is defined by pharmacophoric descriptors, including H-bonding, hydrophobic, and electrostatic interaction sites, defined by atoms, ring centers, and virtual points.

The pharmacophore describes the essential steric and electronic, function-determining points necessary for an optimal interaction with a relevant pharmacological target. It can also be thought of as a template, a partial description of a molecule where certain blanks need to be filled. The types of ligand molecules and the size and diversity of the data set have a great impact on the resulting pharmacophore model. Although a pharmacophore model signifies the key interactions between a ligand and its biological target, neither the structure of the target nor its identity is required to construct a handy pharmacophore model. As a consequence, pharmacophore approaches are often considered to be vital when the accessible information is very restricted. For example, when one knows nothing more than the structures of active ligands, a pharmacophore is the answer.

A simple hypothetical example is illustrated to define the common pharmacophores of three well-known compounds (namely, epinephrine, norepinephrine, and isoprenaline) in Figure 10.1.

Figure 10.1 Depiction of common pharmacophoric features of three well-known compounds: epinephrine, norepinephrine, and isoprenaline.

10.2.2 Background and early days of pharmacophore

Introducing the term pharmacophore in the year 1909, Ehrlich [12], nicknamed the “father of drug discovery,” defined it as “a molecular framework that carries (phoros) the essential features responsible for a drug’s (pharmacon) biological activity.” Although the first definition of the term was credited to Ehrlich, it was Kier who introduced the physical chemical concept in the late 1960s and early 1970s when describing common molecular features of ligands of important central nervous system receptors. This was labeled as “muscarinic pharmacophore” by Kier [8,9].

In the past, pharmacophore models were mainly worked out manually, assisted through the use of simple interactive molecular graphics visualization programs. Later, the growing complexities of molecular structures required refined computer programs for the determination and use of pharmacophore models. In the evolution of computational chemistry, the fundamental perception of a pharmacophore model as a simple geometric depiction of the key molecular interactions remains unchanged. With the advances in computational chemistry in the past 20 years, a variety of automated tools for pharmacophore modeling and applications emerged. A considerable number of studies have been carried out since the development of the pharmacophore approach [13]. Pharmacophore approaches have been used comprehensively in VS, de novo design, as well as in lead optimization and multitarget drug design [14].

10.2.3 Methodology of pharmacophore mapping

10.2.3.1 Diverse conformation generation

Conformational expansion is the most critical step, since the goal is not only to have the most representative coverage of the conformational space of a molecule, but also to have either the bioactive conformation as part of the set of generated conformations or at least a cluster of conformations that are close enough to the bioactive conformation. This conformational search can be divided into four categories: (i) systematic search in the torsional space, (ii) clustering (if wanted or needed), (iii) stochastic methods, such as Monte Carlo (MC), sampling, and Poling, and (iv) molecular dynamics [15]. Commonly employed conformational search methods are BEST, FAST, and conformer algorithms based on energy screening and recursive buildup (CAESAR) [16], all of which generate conformations that provide broad coverage of the accessible conformational space. The FAST conformation generation method searches conformations only in the torsion space and takes less time. The BEST method provides a complete and improved coverage of conformational space by performing a rigorous energy minimization and optimizing the conformations in both torsional and Cartesian space using the Poling algorithm. CAESAR is based on a divide-and-conquer and recursive conformation approach. This approach is also combined in cases of local rotational symmetry so that conformation duplicates due to topological symmetry in a systematic search can be efficiently eliminated.

10.2.3.2 Generation of 3D pharmacophore

The next step is three-dimensional (3D) pharmacophore generation, where Hypogen and HipHop are the two most commonly used algorithms [17,18]. Predictive 3D pharmacophores are generated in three phases: a constructive, a subtractive, and an optimization phase, as follows:

Constructive phase: HipHop is intended to derive common feature hypothesis-based pharmacophore models using information from a set of active compounds. HipHop does not require the selection of a template; rather, each molecule is treated as a template in turn. Different configurations of chemical features are identified in the template molecule using a pruned exhaustive search, which starts with small sets of features and then extends until no larger configuration is found. Next, each configuration is compared with the remaining molecules to identify configurations that are common to all molecules. The resulting pharmacophores are ranked using a combination of how well the molecules in the training set map onto the pharmacophore model. In HipHop, the user can define how many molecules must map completely or partially to a pharmacophore configuration. Again, HypoGen [18] is an algorithm that uses the activity values of the small compounds in the training set to generate hypotheses to build 3D pharmacophore models. HypoGen identifies all allowable pharmacophores consisting of up to five features among the two most active compounds and investigates the remaining active compounds in the list.

Subtractive phase: This phase deals with pharmacophores that were created in the constructive phase and removes pharmacophores from the data structure that are not likely to be useful.

Optimization phase: The optimization phase is performed using the simulated annealing algorithm. A maximum of 10 hypotheses are generated for each run. HypoGen develops models with different pharmacophore features: (i) hydrogen-bond acceptor (HBA); (ii) hydrogen-bond donor (HBD); (iii) hydrophobic (HYD), HYDROPHOBIC (aliphatic) and HYDROPHOBIC (aromatic); (iv) negative charge (NEG CHARGE); (v) negative ionizable (NI); (vi) positive charge (POS CHARGE); (vii) positive ionizable (PI); and (viii) ring aromatic (RA). The hypotheses generated are analyzed in terms of their correlation coefficients and the cost function values.

The basic pharmacophore features are illustrated in Figure 10.2. Pharmacophore models are usually labeled based on the number of features. For example, pharmacophore models consisting of three and four features are termed as three-point pharmacophore and four-point pharmacophore, respectively. A simple graphical representation is shown in Figure 10.3.

Figure 10.2 Basic pharmacophore features and their definitions.

10.2.3.3 Assessment of the quality of pharmacophore hypotheses

The HypoGen module performs a fixed cost calculation that represents the simple model that fits all the data, and a null cost calculation that assumes that there is no relationship in the data set and that the experimental activities are normally distributed about their average value. A small range of the total hypothesis cost obtained for each of the hypotheses indicates homogeneity of the corresponding hypothesis, and the training set selected for the purpose of pharmacophore generation is adequate. Again, values of total cost close to those of fixed cost indicate the fact that the hypotheses generated are statistically robust [19,20]. The total cost of a hypothesis is calculated as per Eq. (10.1):

Cost=eE+wW+cC (10.1)

(10.1)

where e, w, and c are the coefficients associated with the error (E), weight (W), and configuration (C) components, respectively. The other two important costs involved are the fixed cost and null cost. The fixed cost represents the simplest model that perfectly fits the data and is calculated by Eq. (10.2):

Fixedcost=eE(x=0)+wW(x=0)+cC (10.2)

(10.2)

where x is the deviation from the expected values of weight and error. The null cost is the cost of a pharmacophore when the activity data of every molecule in the training set is the average value of all activities in the set and the pharmacophore has no features. Therefore, the contribution from the weight or configuration component does not apply. The null cost is calculated as per Eq. (10.3):

Nullcost=eE(χest=χ¯) (10.3)

(10.3)

where χ_est is the averaged scaled activity of the training set molecules. It has been suggested that the differences between cost of the generated hypothesis and the null hypothesis should be as large as possible; a value of 40–60 bits difference may indicate that it has a 75–90% chance of representing a true correlation in the data set used. The total cost of any hypothesis should be toward the value of fixed cost to represent any meaningful model. Two other very important output parameters are the configuration cost and the error cost. Any value of configuration cost higher than 17 may indicate that the correlation from any generated pharmacophore is most likely due to chance. The error cost increases as the value of the root mean square (RMS) increases. The RMS deviations (RMSDs) represent the quality of the correlation between the estimated and the actual activity data.

10.2.3.4 Validation of the pharmacophore model

The pharmacophore models selected based on the acceptable correlation coefficient (R) and cost analysis, should be validated in three subsequent steps: (i) Fischer’s randomization test, (ii) test set prediction, and (iii) Güner–Henry (GH) scoring method.

Fischer’s randomization test: First, cross-validation is performed and statistical significance of the structure–activity correlation is estimated by randomizing the data using the Fischer’s randomization test [20]. This is done by scrambling the activity data of the training set molecules and assigning them new values, followed by the generation of pharmacophore hypotheses using the same features and parameters as those used to develop the original pharmacophore hypothesis. The original hypothesis is considered to be generated by mere chance if the randomized data set results in the generation of a pharmacophore with better or nearly equal correlation compared to the original one.

Test set prediction: The purpose of the pharmacophore hypothesis generation is not only to predict the activity of the training set compounds [21], but also to predict the activities of external molecules. With the objective of verifying whether the pharmacophore is able to predict the activity of test set molecules in agreement with the experimentally determined value, the activities of the test set molecules are estimated based on the mapping of the test set molecules to the developed pharmacophore model. The conformers are generated for the test set molecules based on the method that is used during the conformer generation of the training set, and they are mapped using the corresponding pharmacophore models. Thus, the predictive capacity of the models is judged based on the predictive R² values (Rpred2 with a threshold value of 0.5) or classification-based methods (such as sensitivity, specificity, precision, and accuracy). The test set should cover similar structural diversity as the training set in order to establish the broadness of the pharmacophore predictability.

GH scoring: The GH scoring method is employed following test set validation to evaluate the quality of the pharmacophore models [22–24]. The GH score can be successfully applied to quantify model selectivity precision of hits and the recall of actives from a directory of useful decoys (DUD) data set [25] consisting of known actives and inactives. The DUD is a publicly available database for free use, generated based on the observation that physical characteristics of the decoy background can be used for the classification of different compounds. The DUD can be downloaded from http://dud.docking.org.

The method involves evaluation of the following: the percent yield of actives in a database (%Y, recall), the percent ratio of actives in the hit list (%A, precision), the enrichment factor E, and the GH score. The GH score ranges from 0 to 1, where a value of 1 signifies the ideal model. The following are the metrics used for analyzing hit lists by a pharmacophore model–based database search:

%A=HaA×100 (10.4)

(10.4)

%Y=HaHt×100 (10.5)

(10.5)

E=Ha/HtA/D (10.6)

(10.6)

GH=[Ha(3A+Ht)4HtA](1−Ht−HaD−A) (10.7)

(10.7)

In these equations, %A is the percentage of known active compounds retrieved from the database (precision); Ha is the number of actives in the hit list (true positives); A is the number of active compounds in the database; %Y is the percentage of known actives in the hit list (recall); Ht is the number of hits retrieved; D is the number of compounds in the database; and E is the enrichment of the concentration of actives by the model relative to random screening without any pharmacophoric approach.

The basic steps of pharmacophore formalism are represented in Figure 10.4.

Figure 10.4 Fundamental steps of pharmacophore formalism.

10.2.4 Types of pharmacophore

A pharmacophore model can be generated in two ways. The first method is ligand-based modeling, where a set of active molecules are superimposed and common chemical features are extracted that are necessary for their bioactivity; the second is structure-based modeling performed by probing possible interaction points between the macromolecular target and ligands.

10.2.4.1 Ligand-based pharmacophore modeling

Ligand-based pharmacophore (LBP) modeling has become an important computational tool for assisting drug discovery in the case of nonavailability of a macromolecular target structure [26,27]. The LBP is usually carried out by extracting common chemical features from the 3D structures of a known set of ligands representative of fundamental interactions between the ligands and a specific macromolecular target. In the case of LBP modeling, pharmacophore generation from multiple ligands involves two major steps: First, creation of the conformational space for each ligand in the training set to represent conformational flexibility of the ligands and to align the multiple ligands in the training set, and second, determination of the essential common chemical features to build the pharmacophore model. The conformational analysis of ligands and performing molecular alignment are the key techniques as well as the main complexities in any LBP modeling.

A few challenges still exist in spite of the great advances of LBP modeling:

a. The first problem, and one of the most serious, is the modeling of ligand flexibility. Presently, two strategies are utilized to deal with this problem. The first is the preenumerating method, in which multiple conformations for each ligand are precomputed and saved in a database [28]. The advantage of this approach is lower computing cost for conducting molecular alignment at the expense of a possible need for a mass storage capacity. The second approach is the on-the-fly method, in which the conformation analysis is carried out in the pharmacophore modeling process [28]. This approach does not need mass storage but might need higher central processing unit (CPU) time for conducting meticulous optimization. It has been demonstrated that the preenumerating method outperforms the on-the-fly calculation approach [29]. Recently, a considerable number of advanced algorithms [14] have been established to sample the conformational spaces of small molecules, which are listed in Table 10.1.
Most importantly, a good conformation generator should ensure the following conditions: (i) proficiently generating all the putative bound conformations that small molecules adopt when they interact with macromolecules, (ii) keeping the list of low-energy conformations as short as possible to avoid the combinational explosion problem, and (iii) being less time consuming for the conformational calculations.

b. Molecular alignment is the second issue of concern in LBP modeling. The alignment methods can be classified into two approaches in terms of their elementary nature: point-based and property-based [29]. The points of the point-based method can be further discriminated as atoms, fragments, or chemical features [30]. In the point-based algorithms, pairs of atoms, fragments, or chemical feature points are usually superimposed employing a least-squares fitting. The major disadvantage of this approach is the requirement for predefined anchor points because the generation of these points can become problematic in the case of different ligands. Consequently, the property-based algorithms utilize molecular field descriptors, generally represented by sets of Gaussian functions, to generate alignments. Recently, new alignment methods have been developed, including stochastic proximity embedding [31], atomic property fields [32], fuzzy pattern recognition [33], and grid-based interaction energies [34].

c. The third challenge lies in the appropriate selection of training set compounds. Although this problem is simple and nontechnical, but it often puzzles researchers nonetheless. The type of ligand molecules, the size of the data set, and its chemical diversity largely affect considerably the final generated pharmacophore model [28].

Table 10.1

Various conformational sampling methods

Conformational sampling method	Characteristics
3DGEN	1. An algorithm for exhaustive generation of 3D isomers proceeding from molecular topology 2. A systematic approach 3. Based on a combinatorial process
Balloon	1. A stochastic algorithm 2. A multiobjective GA is employed 3. Removing conformational duplicates 4. Can effectively produce low-energy conformers, which are geometrically distinct from each other
CAESAR	1. A systematic search method 2. Based on a divide-and-conquer and recursive conformer buildup approach 3. Avoids conformer duplicates due to topological symmetry 4. Capable of reproducing the receptor-bound conformation
CONAN	1. A fragment-based, buildup approach combined with the rule-based method 2. Intersection strategy is used for conformational analysis
ConfGen	1. A systematic approach based on divide-and-conquer strategy 2. A rule-based approach is incorporated 3. Intended for the high-throughput generation of 3D databases
Conformation import workflow in the molecular operating environment (MOE)	1. A systematic approach with the use of divide-and-conquer strategy and combined with rule-based method 2. Each fragment is subject to a stochastic search algorithm, expected to locate most its low-energy conformers 3. A high-throughput conformer generator for library preparation
Corina	1. Fast approach 2. Straightforward performance, reasonable execution time, simplicity, and applicability to building large, 3D chemical inventories 3. Often gives a larger average RMSD to the bioactive conformation
Cyndi	1. Nondeterministic method 2. Based on a multiobjective genetic algorithm (MOGA) 3. Efficient, particularly when reproducing a bioactive conformation
Directed tweak	1. Originally for 3D query searches 2. A torsional space minimizer 3. Involving the use of analytical derivatives, and is fast as a result 4. Allowing 3D flexible searching on an interactive time scale
Genetic algorithm	1. Nondeterministic method 2. Suitable for the superimposition of sets of flexible molecules
MED-3DMC	1. Nondeterministic method 2. A Metropolis MC algorithm based on a SMARTS mapping of the rotational bonds and the MMFF94 VDW energy term is used 3. Capable of sampling the conformational space with a small average RMSD to the bioactive conformation.
MIMUMBA	1. A rule-based method 2. The revised version is in conjunction with the OMEGA approach 3. The rules can be extracted from statistical observations from a training portion of the CSD
Molecular dynamics simulation method	1. Can create reasonable conformers 2. Time consuming 3. Depends on temperature and simulation time 4. Is of special interest when dealing with molecular conformations in solution
Monte Carlo	1. A stochastic method 2. Solvation effects can be studied in an explicit solvent 3. Do not guarantee that the conformational space has been explored exhaustively; in particular, the output of a search may depend on the starting conformation 4. Efficient at the beginning of a search, but efficiency degrades as the search proceeds
OMEGA	1. A rule-based method (heuristic) combined with divide-and-conquer strategy 2. A high-throughput conformer generator for library preparation
Poling restraints	1. A stochastic method 2. Promoting conformational variation 3. Avoiding analogous conformers 4. Covering most of the pharmacophore space with significantly fewer conformers
Self-organization method	1. A distance geometry (DG) approach 2. Conformations generated are consistent with a set of geometric constraints, which include interatomic distance bounds and chiral volumes derived from the molecular connectivity table 3. Tending to produce relatively compact conformers
Systematic torsional grid method	1. A systematic search method 2. Uses a recursive tree search algorithm 3. Can generate all conformations of polypeptides which satisfy experimental NMR restraints 4. Time consuming
WIZARD	1. A rule-based method (heuristic) 2. Expert system techniques are adopted

10.2.4.2 Structure-based pharmacophore modeling

Structure-based pharmacophore (SBP) modeling is directly dependent on the 3D structures of macromolecular targets or macromolecule–ligand complexes. As the number of experimentally determined 3D structure of targets has grown to a very large number, SBP methods have attracted significant interest in the last decade. The approach is considered as the complementary one to the docking procedures, providing the same level of information as well as less demanding with respect to required computational resources. The protocol of SBP modeling involves analyzing the complementary chemical features of the active site and their spatial relationships, and developing a pharmacophore model assembly with selected features [2].

SBP modeling can be further classified into two subclasses: macromolecule–ligand–complex based and macromolecule (without ligand) based. The macromolecule–ligand–complex-based approach is suitable in identifying the ligand binding site of the macromolecular target and determining the key interaction points between ligands and the target protein. LigandScout [35] is an excellent software program that incorporates the macromolecule–ligand–complex-based scheme. Programs like Pocket v.2 [36] and GBPM [37] are based on the same approach. The major limitation of this process is the requirement for the 3D structure of the macromolecule–ligand complex. As a consequence, it cannot be applied to cases when no ligands targeting the binding site of interest are known. This can be solved by the macromolecule-based approach. The SBP method implemented in the Discovery Studio software [18] is a typical example of a macromolecule-based approach [38].

The most commonly encountered difficulty for SBP modeling is the identification of too many chemical features for a specific binding site of the macromolecular target. A pharmacophore model consisting of too many chemical features (e.g., more than seven) is not appropriate for practical applications (e.g., 3D database screening). Therefore, it is always important to pick a restricted number of chemical features (usually three to seven) to create a reliable pharmacophore hypothesis. One more significant drawback is that the obtained pharmacophore hypothesis cannot replicate the quantitative structure–activity relationship (QSAR) because the model is generated based just on a single macromolecule–ligand complex or a single macromolecule.

10.2.5 Application of pharmacophore models

Enrichment in the pharmacophore techniques in the last two decades has made the approach one of the most significant tools in drug discovery. In spite of the advances in key techniques of pharmacophore modeling, there is space for additional improvement to derive more precise and best possible pharmacophore models, which include better handling of ligand flexibility, proficient molecular alignment algorithms, and more precise model optimization. Along with the pharmacophore-based VS and de novo design, the applications of pharmacophore have been extended to lead optimization [39], multitarget drug design [40], activity profiling [41], and target identification [42]. Application of the pharmacophore technique is demonstrated in a schematic way in Figure 10.5.

Figure 10.5 Diverse applications of the pharmacophore technique.

10.2.5.1 Pharmacophore model–based VS

Pharmacophore models can be used for querying the 3D chemical database to search for potential ligands; this process is termed pharmacophore-based VS. In the case of the pharmacophore-based VS approach, a pharmacophore hypothesis is taken as a template. The intention behind the screening is actually to discover such hits that have chemical features similar to those of the template. Sometimes these hits might be related to known active compounds, but few have completely novel scaffolds. The screening process involves two major difficulties: handling the conformational flexibility of small molecules and pharmacophore pattern identification.

The flexibility of small molecules is handled either by preenumerating multiple conformations for each molecule or conformational sampling at search time. Pharmacophore pattern identification, usually known as substructure searching, is performed to check whether a query pharmacophore is present in a given conformer of a molecule. The commonly used approaches for substructure searching are Ullmann [43], the backtracking algorithm [44], and the Generic Match Algorithm (GMA) [45].

The most challenging problem for pharmacophore-based VS is that few percentages of the virtual hits are really bioactive. In simpler words, the screening results produce a higher false-positive rate, a higher false-negative rate, or both. Many factors like the quality and composition of the pharmacophore model and the macromolecular target information can contribute to this problem. The most probable factors are as follows:

a. The most critical one is the development of a robust and reliable pharmacophore hypothesis. Addressing this issue requires an inclusive validation and optimization of the pharmacophore model.

b. Different molecules can be retrieved in VS from different hypotheses of a single pharmacophore model, which is probably an important reason for the higher false-positive/false-negative rates in some studies.

c. The flexibility of target macromolecule in pharmacophore approaches is handled by introducing a tolerance radius for each pharmacophoric feature, which is unlikely to entirely account for macromolecular flexibility in some cases. Recent attempts [46] to integrate molecular dynamics simulation (MDS) into pharmacophore modeling have recommended that the pharmacophore models generated from MDS trajectories explain the considerably enhanced representation of the flexibility of pharmacophores.

d. The steric restriction by the macromolecular target which is not adequately considered in pharmacophore models, although it is partially accounted for by the consideration of excluded volumes. In most of the cases, interactions between a ligand and a protein are distance-sensitive, particularly the short-range interactions, such as the electrostatic interaction, which a pharmacophore model is tricky to account for. As a consequence, the combination of pharmacophore-based and docking-based VS can be considered as an efficient approach for VS.

10.2.5.2 Pharmacophore-based de novo design

Another vital application of pharmacophore is de novo design of ligands. In the case of pharmacophore-based VS, the obtained compounds are generally existing chemicals that might be patent protected. On the contrary, the de novo design approach can be used to generate entirely novel candidate structures that match to the requirements of a given pharmacophore. The first pharmacophore-based de novo design program is NEWLEAD [47]. It uses a set of disconnected molecular fragments that are consistent with a pharmacophore model as input. The selected sets of disconnected pharmacophore fragments are subsequently connected by using various linkers (such as atoms, chains, or ring moieties).

The limitation with NEWLEAD is that it can only handle cases in which the pharmacophore features are functional groups (not typical chemical features). The additional inadequacy of the NEWLEAD program is that the sterically illicit region of the binding site is not considered. As a result, the compounds created by the NEWLEAD program might be tricky to chemically synthesize. There are programs like LUDI [10] and BUILDER [48] that can also be used to amalgamate identification of SBP with de novo design. Both programs require knowledge of the 3D structures of the macromolecular targets.

More recently, a program called PhDD (a pharmacophore-based de novo design method of druglike molecules) has been designed by Huang et al. [49], to overcome the limitations of the present pharmacophore-based, de novo design software tools. PhDD can involuntarily create druglike compounds that satisfy the necessities of an input pharmacophore hypothesis. The pharmacophore used in PhDD can be consisted of a set of abstract chemical features and excluded volumes which are the sterically forbidden region of the binding site. In the case of PhDD, it first generates a set of new molecules that entirely conform to the requirements of the given pharmacophore model. Thereafter, a series of evaluation to the generated molecules are carried out, including the assessments of drug-likeness, bioactivity, and synthetic convenience.

10.2.6 Advantages and limitations of pharmacophore

Like any other approach, pharmacophore has both advantages and disadvantages. The major advantages and limitations are as follows:

Advantages

• Pharmacophore models can be used for VS on a large database.

• There is no need to know the binding site of the ligands in the macromolecular target protein, although this is true only for LBP modeling.

• It can be used for the design, optimization of drugs, and scaffolds hopping.

• It can conceptually be obtained even for 2D structural representation.

• This approach is comprehensive and editable. By adding or omitting chemical feature constraints, information can be easily traced to its source.

Limitations

• 2D pharmacophore is faster but less accurate than 3D pharmacophore.

• A pharmacophore is based only on the ligand structure and conformation. No interactions with the proteins are integrated. It is interesting to point out that in this case, SBP modeling can be used to solve the problem.

• It is sensitive to physicochemical features.

10.2.7 Software tools for pharmacophore analysis

Pharmacophore modeling is extensively used because of its immense accessibility through commercial software packages. Also, there is a freely available web server called PharmaGist (http://bioinfo3d.cs.tau.ac.il/PharmaGist/) for detecting a pharmacophore from a group of ligands known to bind to a particular target. A complete list of different commercialized and freely available software and program modules [19,35–38]) used for pharmacophore modeling is given in Table 10.2.

Table 10.2

Software and programs for pharmacophore modeling

Ligand-based methods
Software	Conformational analysis algorithm	Molecular alignment	Significant characteristics	Remarks
ALADDIN	N/A*	N/A*	Design and pharmacophore generation from geometric, steric, and substructure searching of 3D structures	Not commercialized
Apex-3D	Preenumerating method	Feature-based method	An expert system developed to represent, elucidate, and utilize knowledge on structure–activity relationships	Catalyst (Biovia, http://accelrys.com/)
APOLLO	Preenumerating method	Feature-based method	Identifying from a set of ligands their interaction points belonging to the receptor site and creating a pseudoreceptor	Not commercialized
CLEW	N/A*	Feature-based method	Utilizing the machine-learning method and geometrical fitting to develop the pharmacophore	Not commercialized
DANTE	N/A*	N/A*	Inferring pharmacophores automatically from structure–activity data, which include information about the shape of the binding cavity	Not commercialized
DISCO	Preenumerating method by Concord and Confort via the Sybyl interface	Bron–Kerbosh clique-detection algorithm	Considering 3D conformations of compounds as sets of interpoint distances	Integrated into the Sybyl interface, which is available from Tripos Inc. (www.tripos.com)
GALAHAD	Both preenumerating method and on-the-fly	Atom-based method	A more sophisticated GA is used for pharmacophore modeling	Integrated into the Sybyl interface, which is available from Tripos Inc. (www.tripos.com)
GAMMA	On-the-fly	Atom-based method	The conformational search and the pattern identification are performed simultaneously by utilizing the GA technique	Not commercialized
GASP	On-the-fly	Atom-based method	A flexible GA is used for pharmacophore identification	Integrated into the Sybyl interface, which is available from Tripos Inc. (www.tripos.com)
HipHop	Preenumerating method by the Poling algorithm	Feature-based method	Identifying common features by a pruned exhaustive search (qualitative model)	Discovery Studio (Biovia, http://accelrys.com/)
HypoGen	Preenumerating method by the Poling algorithm	Feature-based method	Designed to correlate structure and activity (quantitative model)	Discovery Studio (Biovia, http://accelrys.com/)
HypoRefine	Preenumerating method by the Poling algorithm	Feature-based method	An extension to the HypoGen	Discovery Studio (Biovia, http://accelrys.com/)
HypoRefine	Preenumerating method by the Poling algorithm	Feature-based method	Exclusion volumes are involved	Discovery Studio (Biovia, http://accelrys.com/)
MOE	Preenumerating method ranging from molecular dynamics to stochastic methods and systematic search	Property-based algorithm	A pharmacophore is defined manually by applying schemes using a Pharmacophore Query Editor	Chemical Computing Group, Inc. (www.chemcomp.com)
MPHIL	On-the-fly	Atom-based method (rigid)	Based on clique detection and GA	Not commercialized
PharmaGist	On-the-fly	Feature-based method	A webserver for LBP detection	http://bioinfo3d.cs.tau.ac.il/PharmaGist
PHASE	Preenumerating method by Schrödinger’s ConfGen technology	Feature-based method (called sites)	Very flexible and user friendly. SMARTS pattern matching is used for feature location. Excluded volumes are included.	Schrödinger Inc. (www.schrodinger.com)
RAPID	Preenumerating method	Atom-based method	A rigid alignment based on mapping triangles of 3D atom coordinates	Not commercialized
SCAMPI	On the fly	Feature-based method	Can handle large heterogeneous data sets	Not commercialized
XED	Preenumerating method	Molecular field-based method	Using field points to describe the VDW and electrostatic potential that surround molecules	Marketed by Cresset Biomolecular (http://www.cresset-group.com/)

Structure-based methods
Software	Molecular alignment	Significant characteristics	Remarks
GBPM	Complex-based	Based on logical and clustering operations with 3D maps computed by the GRID program on structurally known molecular complexes. Particularly suitable for identifying protein–protein interaction areas.	Not commercialized
LigandScout	Complex-based	Incorporating a complete definition of 3D chemical features. Pharmacophoric feature points-based pattern-matching alignment algorithm is used. Intuitive and easy to use.	Marketed by Inte:Ligand (www.inteligand.com/ligandscout/)
Pocket v.2	Complex-based	Capable to generate a pharmacophore model with a rational number of features when one complex structure is available.	Not commercialized
SBP	Apoprotein-based	Directly converting LUDI interaction maps within the protein binding site into Catalyst pharmacophoric features.	Discovery Studio (Biovia, http://accelrys.com/)

N/A*: Not applicable or the exact information is not available.

10.3 Structure-Based Design–Docking

10.3.1 Concept and definition of docking

Molecular docking is the study of how two or more molecular structures (e.g., drug and enzyme or protein) fit together [50]. In a simple definition, docking is a molecular modeling technique that is used to predict how a protein (enzyme) interacts with small molecules (ligands). The ability of a protein (enzyme) and nucleic acid to interact with small molecules to form a supramolecular complex plays a major role in the dynamics of the protein, which may enhance or inhibit its biological function. The behavior of small molecules in the binding pockets of target proteins can be described by molecular docking. The method aims to identify correct poses of ligands in the binding pocket of a protein and to predict the affinity between the ligand and the protein. Based on the types of ligand, docking can be classified as

• Protein–small molecule (ligand) docking

• Protein–nucleic acid docking

• Protein–protein docking

Protein–small molecule (ligand) docking represents a simpler end of the complexity spectrum, and there are many available programs that perform particularly well in predicting molecules that may potentially inhibit proteins. Protein–protein docking is typically much more complex. The reason is that proteins are flexible and their conformational space is quite vast.

Docking can be performed by placing the rigid molecules or fragments into the protein’s active site using different approaches like clique-searching, geometric hashing, or pose clustering. The performance of docking depends on the search algorithm [e.g., MC methods, genetic algorithms (GAs), fragment-based methods, Tabu searches, distance geometry methods, and the scoring functions like force field (FF) methods and empirical free energy scoring functions]. The first step of docking is the generation of composition of all possible conformations and orientations of the protein paired with the ligand. The second step is that the scoring function takes input and returns a number indicating favorable interaction [51].

To identify the active site of the protein, first, selection of the required X-ray cocrystallized structure from the protein data bank (PDB) is performed, and then extracting the bound ligand, one can optimize the protein active site of interest. But the process of identification of the active site in a protein is critical when the bound ligand is absent in the crystal structure. In that case, one has to do the following procedures:

a. One can perform comprehensive literature review of the source papers (from which the X-ray crystal structure has been included in PDB) to identify the active site of residues.

b. If any established drug giving the same pharmacological action of interest is available for the protein, then the active sites for this drug should be identified. In the initial phase of analysis, one can try these residues as active binding sites for the test ligands.

c. Every docking software program usually has a particular algorithm to identify the active site of the protein by allowing binding of the ligand in different parts of the protein and exploring the best possible binding position of the ligands with the protein.

< div class='tao-gold-member'>

Only gold members can continue reading. Log In or Register to continue

Stay updated, free articles. Join our Telegram channel

Tags: Understanding the Basics of QSAR for Applications in Pharmaceutic

Jul 18, 2016 | Posted by admin in PHARMACY | Comments Off

Basicmedical Key

Fastest Basicmedical Insight Engine

Other Related Techniques

Other Related Techniques

Keywords

10.1 Introduction

10.2 Pharmacophore

10.2.1 Concept and definition

10.2.2 Background and early days of pharmacophore

10.2.3 Methodology of pharmacophore mapping

10.2.3.1 Diverse conformation generation

10.2.3.2 Generation of 3D pharmacophore

10.2.3.3 Assessment of the quality of pharmacophore hypotheses

10.2.3.4 Validation of the pharmacophore model

10.2.4 Types of pharmacophore

10.2.4.1 Ligand-based pharmacophore modeling

10.2.4.2 Structure-based pharmacophore modeling

10.2.5 Application of pharmacophore models

10.2.5.1 Pharmacophore model–based VS

10.2.5.2 Pharmacophore-based de novo design

10.2.6 Advantages and limitations of pharmacophore

10.2.7 Software tools for pharmacophore analysis

10.3 Structure-Based Design–Docking

10.3.1 Concept and definition of docking

Like this:

Related

Stay updated, free articles. Join our Telegram channel

Full access? Get Clinical Tree

Basicmedical Key

Fastest Basicmedical Insight Engine

Other Related Techniques

Other Related Techniques

Keywords

10.1 Introduction

10.2 Pharmacophore

10.2.1 Concept and definition

10.2.2 Background and early days of pharmacophore

10.2.3 Methodology of pharmacophore mapping

10.2.3.1 Diverse conformation generation

10.2.3.2 Generation of 3D pharmacophore

10.2.3.3 Assessment of the quality of pharmacophore hypotheses

10.2.3.4 Validation of the pharmacophore model

10.2.4 Types of pharmacophore

10.2.4.1 Ligand-based pharmacophore modeling

10.2.4.2 Structure-based pharmacophore modeling

10.2.5 Application of pharmacophore models

10.2.5.1 Pharmacophore model–based VS

10.2.5.2 Pharmacophore-based de novo design

10.2.6 Advantages and limitations of pharmacophore

10.2.7 Software tools for pharmacophore analysis

10.3 Structure-Based Design–Docking

10.3.1 Concept and definition of docking

Share this:

Like this:

Related

Related posts:

Stay updated, free articles. Join our Telegram channel

Full access? Get Clinical Tree