Clinical prediction of HBV and HCV related hepatic fibrosis using machine learning

EBioMedicine. 2018 Sep:35:124-132. doi: 10.1016/j.ebiom.2018.07.041. Epub 2018 Aug 10.

Abstract

Clinical prediction of advanced hepatic fibrosis (HF) and cirrhosis has long been challenging because the gold standard, liver biopsy, is an invasive approach with certain limitations. Less invasive blood tests in tandem with cutting-edge machine learning algorithms show promising diagnostic potential. In this study, we constructed machine learning models and compared them with the FIB-4 score in a discovery dataset (n = 490) of hepatitis B virus (HBV) patients. The models were validated in an independent HBV dataset (n = 86). We further applied these models to two independent hepatitis C virus (HCV) datasets (n = 254 and n = 230) to examine their applicability. In the discovery data, gradient boosting (GB) consistently outperformed the other methods as well as the FIB-4 score (p < .001) in the prediction of advanced HF and cirrhosis. In the HBV validation dataset, for classification between early and advanced HF, the area under the receiver operating characteristic curve (AUROC) of the GB model was 0.918, compared with 0.841 for FIB-4; for classification between non-cirrhosis and cirrhosis, the AUROC was 0.871 for GB and 0.830 for FIB-4. Additionally, GB-based prediction demonstrated good classification capacity on the two HCV datasets, although higher cutoffs for both GB and FIB-4 scores were required to achieve comparable specificity and sensitivity. Using the same parameters as FIB-4, the GB-based prediction system demonstrated consistent improvements over FIB-4 in the HBV and HCV cohorts, with different cutoff values required for different etiological groups. A user-friendly web tool, LiveBoost, makes our prediction models freely accessible for further clinical studies and applications.
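For readers unfamiliar with the two quantities the abstract relies on, the following minimal Python sketch shows the standard published FIB-4 formula, (age × AST) / (platelet count × √ALT), and a rank-based AUROC computation. It is an illustration only and is not the authors' code or the LiveBoost implementation. The function names `fib4` and `auroc` and the units noted in the comments are assumptions made for this example.

```python
import math


def fib4(age_years, ast_u_per_l, alt_u_per_l, platelets_1e9_per_l):
    """Standard FIB-4 index: (age * AST) / (platelets * sqrt(ALT)).

    Assumed units: age in years, AST and ALT in U/L,
    platelet count in 10^9/L.
    """
    if alt_u_per_l <= 0 or platelets_1e9_per_l <= 0:
        raise ValueError("ALT and platelet count must be positive")
    return (age_years * ast_u_per_l) / (
        platelets_1e9_per_l * math.sqrt(alt_u_per_l)
    )


def auroc(scores, labels):
    """AUROC as the probability that a randomly chosen positive case
    scores higher than a randomly chosen negative case (ties count 0.5).

    labels: 1 for the positive class (e.g. advanced HF), 0 otherwise.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Hypothetical input values, chosen only to make the arithmetic easy to follow.
    print(fib4(age_years=50, ast_u_per_l=40, alt_u_per_l=25, platelets_1e9_per_l=200))
    # Hypothetical scores and labels; not data from the study.
    print(auroc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1]))
```

Because the GB models use the same four inputs as FIB-4 (age, AST, ALT, platelet count), the same `auroc` function could in principle compare either score against biopsy-derived labels. The pairwise loop is quadratic in sample size, which is acceptable for cohorts of the sizes reported here.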

Keywords: FIB-4; Gradient boosting; Hepatic fibrosis; Hepatitis B; Hepatitis C; Machine learning.

MeSH terms

  • Adult
  • Area Under Curve
  • Cohort Studies
  • Female
  • Hepacivirus / physiology*
  • Hepatitis B virus / physiology*
  • Humans
  • Internet
  • Liver Cirrhosis / diagnosis*
  • Liver Cirrhosis / virology*
  • Machine Learning*
  • Male
  • Middle Aged
  • Models, Biological
  • ROC Curve
  • Reproducibility of Results