Probabilistic Graphical Modeling for Estimating Risk of Coronary Artery Disease: Applications of a Flexible Machine-Learning Method

Med Decis Making. 2019 Nov;39(8):1032-1044. doi: 10.1177/0272989X19879095. Epub 2019 Oct 16.

Abstract

Objectives. Coronary artery disease (CAD) is the leading cause of death and disease burden worldwide, causing 1 in 7 deaths in the United States alone. Risk prediction models that can learn the complex causal relationships that give rise to CAD from data, instead of merely predicting the risk of disease, have the potential to improve transparency and efficacy of personalized CAD diagnosis and therapy selection for physicians, patients, and other decision makers.

Methods. We use Bayesian networks (BNs) to model the risk of CAD using the Z-Alizadehsani data set, a published real-world observational data set of 303 Iranian patients at risk for CAD. We also describe how BNs can be used for incorporation of background knowledge, individual risk prediction, handling missing observations, and adaptive decision making under uncertainty.

Results. BNs performed on par with machine-learning classifiers at predicting CAD and showed better probability calibration. They achieved a mean 10-fold area under the receiver-operating characteristic curve (AUC) of 0.93 ± 0.04, which was comparable with the performance of logistic regression with L1 or L2 regularization (AUC: 0.92 ± 0.06), support vector machine (AUC: 0.92 ± 0.06), and artificial neural network (AUC: 0.91 ± 0.05). We describe the use of BNs to predict with missing data and to adaptively calculate prognostic values of individual variables under uncertainty.

Conclusion. BNs are powerful and versatile tools for risk prediction and health outcomes research that can complement traditional statistical techniques and are particularly useful in domains in which information is uncertain or incomplete and in which interpretability is important, such as medicine.
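To make the idea of individual risk prediction with missing observations concrete, the following is a minimal sketch of inference by enumeration in a hand-built toy Bayesian network. It is not the network learned in the study: the structure (smoker -> CAD -> chest pain, abnormal ECG), the variable names, and every probability are hypothetical placeholders chosen only for illustration. Unobserved variables are handled by summing them out of the joint distribution, so no imputation step is needed.

```python
from itertools import product

# Toy network (an assumption, not the paper's model):
#   smoker -> cad -> pain
#                 -> ecg
# All numbers below are illustrative placeholders, NOT estimates
# from the Z-Alizadehsani data set.
P_SMOKER = 0.3                                  # P(smoker = True)
P_CAD_GIVEN_SMOKER = {True: 0.4, False: 0.15}   # P(cad = True | smoker)
P_PAIN_GIVEN_CAD = {True: 0.7, False: 0.2}      # P(pain = True | cad)
P_ECG_GIVEN_CAD = {True: 0.6, False: 0.1}       # P(ecg = True | cad)

VARIABLES = ("smoker", "cad", "pain", "ecg")


def bernoulli(p_true, value):
    """Probability of a binary value given P(value = True)."""
    return p_true if value else 1.0 - p_true


def joint(smoker, cad, pain, ecg):
    """Joint probability, factorized along the network structure."""
    return (
        bernoulli(P_SMOKER, smoker)
        * bernoulli(P_CAD_GIVEN_SMOKER[smoker], cad)
        * bernoulli(P_PAIN_GIVEN_CAD[cad], pain)
        * bernoulli(P_ECG_GIVEN_CAD[cad], ecg)
    )


def posterior_cad(evidence):
    """Return P(cad = True | evidence).

    `evidence` maps observed variable names to booleans; any variable
    left out is treated as missing and marginalized over.
    """
    totals = {True: 0.0, False: 0.0}
    for values in product([True, False], repeat=len(VARIABLES)):
        assignment = dict(zip(VARIABLES, values))
        # Skip assignments that contradict what was observed.
        if any(assignment[name] != val for name, val in evidence.items()):
            continue
        totals[assignment["cad"]] += joint(**assignment)
    return totals[True] / (totals[True] + totals[False])


if __name__ == "__main__":
    print("No observations:      ", posterior_cad({}))
    print("Chest pain only:      ", posterior_cad({"pain": True}))
    print("Chest pain + abn. ECG:", posterior_cad({"pain": True, "ecg": True}))
```

Each call supplies a different subset of observations, which mirrors how a risk estimate could be updated as test results arrive. Enumeration is exponential in the number of variables, so a network of realistic size would need a dedicated inference algorithm such as variable elimination.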

Keywords: Bayesian networks; Bayesian statistics; artificial intelligence; cardiology; coronary artery disease; graphical models; health economics and outcomes research (HEOR); machine learning; risk modeling; risk prediction; statistical models.

MeSH terms

  • Bayes Theorem*
  • Computer Graphics
  • Coronary Artery Disease / epidemiology*
  • Humans
  • Iran / epidemiology
  • Logistic Models
  • Machine Learning
  • Probability*
  • ROC Curve
  • Risk Assessment / methods*