Structure-reactivity modeling using mixture-based representation of chemical reactions

J Comput Aided Mol Des. 2017 Sep;31(9):829-839. doi: 10.1007/s10822-017-0044-3. Epub 2017 Jul 27.

Abstract

We describe a novel approach to reaction representation in which a reaction is treated as a combination of two mixtures: a mixture of reactants and a mixture of products. Each mixture, in turn, can be encoded using a previously reported approach based on simplex descriptors (SiRMS). The feature vector representing the two mixtures is obtained either by concatenating the product and reactant descriptors or by taking the difference between the descriptors of products and reactants. This reaction representation does not require explicit labeling of a reaction center. A rigorous "product-out" cross-validation (CV) strategy is proposed. Unlike the naïve "reaction-out" CV approach, which is based on a random selection of items, the proposed strategy provides a more realistic estimate of prediction accuracy for reactions resulting in novel products. The new methodology has been applied to model rate constants of E2 reactions. It has been demonstrated that the use of the fragment control applicability domain approach significantly increases the prediction accuracy of the models. The models obtained with the new "mixture" approach performed better than those requiring either explicit (Condensed Graph of Reaction) or implicit (reaction fingerprints) reaction center labeling.
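The two ideas above, building a reaction feature vector from a reactant mixture and a product mixture, and splitting data so that test reactions yield products unseen in training, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the per-molecule descriptor vectors are taken as given, the additive `pool` function is a simplified stand-in for SiRMS mixture descriptors, and the names `pool`, `reaction_features`, and `product_out_folds` are hypothetical. The sketch also assumes each reaction is labeled by a single product identifier.

```python
from collections import defaultdict


def pool(vectors):
    """Sum per-molecule descriptor vectors into one mixture vector.

    Assumption: simple additive pooling, used here only as a stand-in
    for the SiRMS mixture descriptors mentioned in the abstract.
    """
    return [sum(column) for column in zip(*vectors)]


def reaction_features(reactants, products, mode="difference"):
    """Build a reaction feature vector from two mixtures.

    mode="concat":     product descriptors followed by reactant descriptors
    mode="difference": product descriptors minus reactant descriptors
    """
    r = pool(reactants)
    p = pool(products)
    if mode == "concat":
        return p + r
    if mode == "difference":
        return [pi - ri for pi, ri in zip(p, r)]
    raise ValueError(f"unknown mode: {mode!r}")


def product_out_folds(product_keys, n_folds):
    """Assign reaction indices to folds so that all reactions sharing a
    product land in the same fold ("product-out" splitting).

    Assumption: each reaction is identified by one product key (a string).
    Groups are placed greedily, largest first, into the smallest fold.
    """
    groups = defaultdict(list)
    for index, key in enumerate(product_keys):
        groups[key].append(index)

    folds = [[] for _ in range(n_folds)]
    for key in sorted(groups, key=lambda k: (-len(groups[k]), k)):
        smallest = min(range(n_folds), key=lambda i: len(folds[i]))
        folds[smallest].extend(groups[key])
    return folds


# Toy descriptor vectors (hypothetical values, not real SiRMS output).
reactants = [[1, 0, 2], [0, 1, 0]]
products = [[2, 1, 0], [0, 0, 1]]
difference_vector = reaction_features(reactants, products, "difference")
concat_vector = reaction_features(reactants, products, "concat")

# Six reactions labeled by their (hypothetical) product identifiers.
folds = product_out_folds(["A", "A", "B", "C", "B", "A"], n_folds=2)
```

In a "reaction-out" split the six reactions would be shuffled individually, so a reaction yielding product "A" could appear in both training and test folds. Grouping by product before assigning folds prevents that overlap, which is why this kind of split is expected to give a less optimistic accuracy estimate for novel products.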

Keywords: Chemical reactions; Condensed graph of reaction; Mixtures; Rate constant prediction; Reaction fingerprints; Simplex representation of molecular structure.

MeSH terms

  • Kinetics
  • Models, Molecular*
  • Molecular Structure
  • Organic Chemicals / chemistry*
  • Quantitative Structure-Activity Relationship

Substances

  • Organic Chemicals