Prediction of low Apgar score at five minutes following labor induction intervention in vaginal deliveries: machine learning approach for imbalanced data at a tertiary hospital in North Tanzania

Clifford Silver Tarimo; Soumitra S Bhuyan; Yizhen Zhao; Weicun Ren; Akram Mohammed; Quanman Li; Marilyn Gardner; Michael Johnson Mahande; Yuhui Wang; Jian Wu

doi:10.1186/s12884-022-04534-0

BMC Pregnancy Childbirth. 2022 Apr 1;22(1):275. doi: 10.1186/s12884-022-04534-0.

Authors

Clifford Silver Tarimo¹,², Soumitra S Bhuyan³, Yizhen Zhao⁴, Weicun Ren¹,⁵, Akram Mohammed⁶, Quanman Li¹, Marilyn Gardner⁷, Michael Johnson Mahande⁸, Yuhui Wang⁹, Jian Wu¹⁰,¹¹

Affiliations

¹ Department of Epidemiology and Health Statistics, College of Public Health, Zhengzhou University, 100 Kexue Avenue, Zhengzhou, 450001, Henan, China.
² Department of Science and Laboratory Technology, Dar es Salaam Institute of Technology, P.O. Box 2958, Dar es Salaam, Tanzania.
³ Rutgers University-New Brunswick, Edward J. Bloustein School of Planning and Public Policy, New Brunswick, USA.
⁴ Luoyang Orthopedic Traumatological Hospital of Henan Province, Luoyang, China.
⁵ College of Sanquan, Xinxiang Medical University, Xinxiang, People's Republic of China.
⁶ Center for Biomedical Informatics, University of Tennessee Health Science Center, Memphis, TN, USA.
⁷ Department of Public Health, Western Kentucky University, 1906 College Heights Blvd, Bowling Green, KY, 42101, USA.
⁸ Institute of Public Health, Kilimanjaro Christian Medical University College, P.O. Box 2240, Moshi, Tanzania.
⁹ Centre for Financial and Corporate Integrity, Coventry University, Coventry, UK.
¹⁰ Department of Epidemiology and Health Statistics, College of Public Health, Zhengzhou University, 100 Kexue Avenue, Zhengzhou, 450001, Henan, China. wujian@zzu.edu.cn.
¹¹ Henan Province Engineering Research Center of Health Economics & Health Technology Assessment, Henan Province, China. wujian@zzu.edu.cn.

Abstract

Background: Prediction of a low Apgar score for vaginal deliveries following labor induction is critical for improving neonatal health outcomes. We set out to identify important attributes and to train popular machine learning (ML) algorithms to correctly classify neonates with low Apgar scores from an imbalanced learning perspective.

Methods: We analyzed 7716 induced vaginal deliveries from the electronic birth registry of the Kilimanjaro Christian Medical Centre (KCMC), 733 (9.5%) of which involved neonates with a low (< 7) Apgar score. The 'extra-tree classifier' was used to assess feature importance. We used Area Under the Curve (AUC), recall, precision, F-score, Matthews Correlation Coefficient (MCC), balanced accuracy (BA), bookmaker informedness (BM), and markedness (MK) to evaluate the performance of the six selected machine learning classifiers. To address class imbalance, we examined three widely used resampling techniques: the Synthetic Minority Oversampling Technique (SMOTE), Random Oversampling Examples (ROS), and Random Undersampling (RUS). We applied Decision Curve Analysis (DCA) to evaluate the net benefit of the selected classifiers.
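The threshold-based metrics named in the Methods can all be derived from a single confusion matrix. The sketch below is illustrative only and is not the authors' code: the function name `imbalance_metrics` and the example counts are hypothetical, and it assumes every denominator is non-zero.

```python
import math

def imbalance_metrics(tp, fp, fn, tn):
    """Compute threshold-based metrics from confusion-matrix counts.

    Assumes the positive class is 'low Apgar score' and that no
    denominator is zero (hypothetical helper, for illustration only).
    """
    recall = tp / (tp + fn)            # sensitivity, true positive rate
    specificity = tn / (tn + fp)       # true negative rate
    precision = tp / (tp + fp)         # positive predictive value
    npv = tn / (tn + fn)               # negative predictive value
    f1 = 2 * precision * recall / (precision + recall)
    ba = (recall + specificity) / 2    # balanced accuracy
    bm = recall + specificity - 1      # bookmaker informedness
    mk = precision + npv - 1           # markedness
    mcc = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    return {
        "recall": recall, "precision": precision, "f1": f1,
        "ba": ba, "bm": bm, "mk": mk, "mcc": mcc,
    }

# Made-up counts with roughly 10% positives; not taken from the study.
metrics = imbalance_metrics(tp=60, fp=180, fn=40, tn=720)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

One useful check is that MCC squared equals BM times MK, so MCC summarizes both the recall-side and the precision-side behavior of a classifier in one number.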

Results: Birth weight, maternal age, and gestational age were found to be important predictors of a low Apgar score following induced vaginal delivery. The SMOTE, ROS, and RUS techniques improved recall more than the other metrics in all the models under investigation. A slight improvement was observed in the F1 score, BA, and BM. DCA revealed a potential benefit of applying the boosting method for predicting low Apgar scores among the tested models.
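For readers unfamiliar with DCA, the net benefit at a threshold probability pt is conventionally defined as TP/n minus FP/n weighted by pt/(1 - pt). The sketch below applies that standard definition to made-up labels and predicted probabilities. The function names and data are hypothetical and do not reproduce the study's analysis.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of flagging cases with predicted risk >= threshold."""
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    # False positives are weighted by the odds of the threshold probability.
    return tp / n - (fp / n) * threshold / (1 - threshold)

def net_benefit_treat_all(y_true, threshold):
    """Reference strategy: flag every case regardless of predicted risk."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)

# Made-up example: 1 = low Apgar score, 0 = normal; not study data.
y_true = [1, 0, 0, 1, 0, 0, 0, 0, 0, 0]
y_prob = [0.80, 0.10, 0.30, 0.60, 0.05, 0.20, 0.40, 0.10, 0.15, 0.02]

for pt in (0.10, 0.25, 0.50):
    model_nb = net_benefit(y_true, y_prob, pt)
    all_nb = net_benefit_treat_all(y_true, pt)
    print(f"pt={pt:.2f}  model={model_nb:.3f}  treat-all={all_nb:.3f}")
```

A model is considered useful at a given threshold when its net benefit exceeds both the treat-all strategy and the treat-none strategy, whose net benefit is zero by definition.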

Conclusion: More algorithms should be tested to develop theoretical guidance on which rebalancing techniques are most effective for this particular imbalance ratio. Future research should prioritize a debate on which performance indicators to rely on when dealing with imbalanced or skewed data.

Keywords: Imbalanced data; Low five-minute Apgar score; Machine learning; North-Tanzania; Successful labor induction.

MeSH terms

Apgar Score
Delivery, Obstetric*
Female
Humans
Infant, Newborn
Labor, Induced
Machine Learning*
Pregnancy
Tanzania
Tertiary Care Centers