The quality of QSAR models: problems and solutions

SAR QSAR Environ Res. 2007 Jan-Mar;18(1-2):89-100. doi: 10.1080/10629360601053984.

Abstract

The statistical quality of a QSAR model is usually described in terms of goodness-of-fit and confidence in predictivity (prediction power). Its assessment can be divided into three parts: (1) measurement of goodness-of-fit, (2) validation of model stability, and (3) predictivity analysis. Currently there are no mandatory requirements for the validation methods to be used, nor rules for quantitative confidence estimates. To compare the statistical quality of QSAR models, an overall statistical quality index is needed that depends jointly on the goodness-of-fit, validation and predictivity results. This requires defining a set of mandatory parameters for all three parts of the assessment and developing an approach for estimating overall quality from these parameters. The overall index must also include a penalty mechanism for absent parameters. The goal of the present study is to analyse the parameters for all three parts of the statistical quality assessment of QSAR models and to investigate a flexible weighting approach for developing the overall statistical quality index. Because different statistical parameters are traditionally used to assess goodness-of-fit, a mechanism is needed that allows a flexible set of parameters to be used in the overall index. The final set of mandatory parameters can be selected only after approval by the scientific community and regulatory boards.
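
The abstract does not give the form of the index, so the following is only an illustrative sketch of how a weighted overall index with a penalty for absent parameters might look. Everything in it is an assumption made for illustration and is not taken from the paper: the function name `overall_quality_index`, the parameter names `r2`, `q2_loo` and `r2_ext`, the weights, the requirement that each parameter is already scaled to the range 0 to 1, and the choice to penalise an absent parameter by counting it as zero while keeping its weight in the denominator.

```python
from typing import Mapping, Optional


def overall_quality_index(
    scores: Mapping[str, Optional[float]],
    weights: Mapping[str, float],
) -> float:
    """Weighted mean of quality parameters, each assumed to lie in [0, 1].

    Hypothetical penalty rule: a parameter that is listed in `weights` but
    missing (or None) in `scores` contributes 0 to the numerator, while its
    weight still counts in the denominator, so omissions lower the index.
    """
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")

    weighted_sum = 0.0
    for name, weight in weights.items():
        value = scores.get(name)
        if value is None:
            # Absent parameter: nothing added, weight stays in the denominator.
            continue
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"score for {name!r} must be within [0, 1]")
        weighted_sum += weight * value

    return weighted_sum / total_weight


# Hypothetical weights for one parameter from each part of the assessment.
example_weights = {"r2": 1.0, "q2_loo": 1.0, "r2_ext": 2.0}

# A model reporting all three parameters, and one lacking external validation.
complete_model = {"r2": 0.9, "q2_loo": 0.8, "r2_ext": 0.7}
incomplete_model = {"r2": 0.9, "q2_loo": 0.8}

print(overall_quality_index(complete_model, example_weights))
print(overall_quality_index(incomplete_model, example_weights))
```

Because the weights are passed in rather than fixed, the same function accepts any set of parameters, which is one simple way to picture the flexibility the abstract calls for. Under this assumed rule, the model that omits the external predictivity parameter scores lower than the one that reports it, even though the parameters they share are identical.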

Publication types

  • Evaluation Study

MeSH terms

  • Models, Chemical*
  • Quantitative Structure-Activity Relationship*
  • Reproducibility of Results