Parameterization and conformational sampling effects in pharmacophore multiplet searching

J Chem Inf Model. 2008 Dec;48(12):2326-34. doi: 10.1021/ci800234q.

Abstract

Pharmacophore patterns in ligands can be effectively characterized in terms of their constituent pharmacophore multiplets. Bitsets (fingerprints) encoding which particular multiplets are found in a given ligand have been and continue to be used as molecular descriptors in a range of molecular modeling applications, from ligand alignment and diversity analysis to pharmacophore-based flexible searching. Being able to create, store, and manipulate multiplets in compressed form - as bitmaps - has made it possible to integrate them into high-throughput technologies. A number of key parameters affect how well multiplets perform, including the granularity of edge length binning; how different multiplets are weighted in creating hypotheses from multiple ligands; and the number of bits that should be included in a pharmacophore hypothesis. The similarity metric employed for bitmap comparisons also affects search performance, as does the conformational sampling regime used for characterizing flexible molecules. In this report we explore the effect of parameter variation on within- and between-class similarity across seven different pharmacological classes and introduce a new measure of molecular similarity - the asymmetric stochastic cosine - uniquely suited to searching a database for matches to query hypotheses deduced from multiple ligands. Surprisingly, it turns out that the most discriminating bitmaps are obtained using relatively few conformers. The extreme discrimination power seen for single conformers, however, seems to reflect consistent effects of 2D connectivity on the 3D structure obtained. Conformational sampling by systematic search reinforces such circumstantial discrimination and should be avoided. The potential for systematic bias becomes clear when the behavior of otherwise similar conformational ensembles created by local energy minimization or by random sampling is considered. Consolidating information from multiple known actives or establishing single "bioactive" conformations a priori are safer ways to improve discrimination in pharmacophoric multiplet searching.
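
As a rough illustration of the ideas named in the abstract (multiplets, edge length binning, bitmap comparison), the sketch below encodes pharmacophore triplets as a sparse bitset and compares two bitsets. It is not the authors' implementation. The feature labels, the 1.0 Å bin width, the bin count, and the function names `multiplet_bits`, `tanimoto`, and `cosine` are all assumptions made for this example. The similarity functions are the ordinary symmetric Tanimoto and cosine coefficients; the asymmetric stochastic cosine introduced in the paper is not defined in the abstract and is not reproduced here.

```python
from itertools import combinations, permutations
from math import dist, sqrt


def bin_index(length, bin_width, n_bins):
    """Map an edge length to a bin; lengths past the last bin share the top bin."""
    return min(int(length // bin_width), n_bins - 1)


def triplet_key(trio, bin_width, n_bins):
    """Canonical key for one triplet: the smallest tuple over all vertex orderings."""
    keys = []
    for (ta, pa), (tb, pb), (tc, pc) in permutations(trio):
        keys.append((
            ta, tb, tc,
            bin_index(dist(pa, pb), bin_width, n_bins),
            bin_index(dist(pb, pc), bin_width, n_bins),
            bin_index(dist(pa, pc), bin_width, n_bins),
        ))
    return min(keys)


def multiplet_bits(features, bin_width=1.0, n_bins=8):
    """Set of triplet keys for one conformer (a sparse stand-in for a bitmap).

    features: list of (feature_type, (x, y, z)) pairs; hypothetical input format.
    """
    return {triplet_key(trio, bin_width, n_bins)
            for trio in combinations(features, 3)}


def tanimoto(a, b):
    """Standard Tanimoto coefficient on two bit sets."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def cosine(a, b):
    """Standard (symmetric) cosine coefficient on two bit sets."""
    if not a or not b:
        return 0.0
    return len(a & b) / sqrt(len(a) * len(b))
```

With coarse bins, two slightly different geometries of the same donor/acceptor/hydrophobe arrangement fall into the same bins and set the same bit, while narrower bins would separate them. This is the granularity trade-off the abstract refers to. A multi-conformer bitmap could be approximated by taking the union of `multiplet_bits` over conformers, though how the paper pools conformers is not stated in the abstract.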

MeSH terms

  • Binding Sites
  • Computer Simulation
  • Databases, Factual
  • Drug Design*
  • Ligands
  • Molecular Conformation
  • Molecular Structure

Substances

  • Ligands