SurvBenchmark: comprehensive benchmarking study of survival analysis methods using both omics data and clinical data

Yunwei Zhang; Germaine Wong; Graham Mann; Samuel Muller; Jean Y H Yang

doi:10.1093/gigascience/giac071

Gigascience. 2022 Jul 30:11:giac071. doi: 10.1093/gigascience/giac071.

Authors

Yunwei Zhang^{1,2}, Germaine Wong^{3,4,5}, Graham Mann^{6,7}, Samuel Muller^{1,8}, Jean Y H Yang^{1,2,9}

Affiliations

¹ School of Mathematics and Statistics, The University of Sydney, Sydney 2006, Australia.
² Charles Perkins Centre, The University of Sydney, Sydney 2006, Australia.
³ Sydney School of Public Health, The University of Sydney, NSW, Sydney 2006, Australia.
⁴ Centre for Kidney Research, Kids Research Institute, The Children's Hospital at Westmead, NSW, 2145, Sydney, Australia.
⁵ Centre for Transplant and Renal Research, Westmead Hospital, NSW, 2145, Sydney, Australia.
⁶ John Curtin School of Medical Research, Australian National University, Canberra 2601, Australia.
⁷ Melanoma Institute Australia, North Sydney, NSW 2065, Australia.
⁸ Department of Mathematics and Statistics, Macquarie University, Sydney 2109, Australia.
⁹ Laboratory of Data Discovery for Health Limited (D24H), Science Park, Hong Kong SAR, China.

Abstract

Survival analysis is a branch of statistics in which the time to an event and the survival status are modeled together as the dependent response. Current comparisons of survival model performance mostly center on clinical data and classic statistical survival models, with prediction accuracy often serving as the sole metric of model performance. Moreover, survival analysis approaches for censored omics data have not been thoroughly investigated; the common approach is to binarize the survival time and perform a classification analysis. Here, we develop a benchmarking design, SurvBenchmark, that evaluates a diverse collection of survival models on both clinical and omics data sets. SurvBenchmark covers not only classical approaches such as the Cox model but also state-of-the-art machine learning survival models. All approaches were assessed using multiple performance metrics covering model predictability, stability, flexibility, and computational issues. Our systematic comparison design with 320 comparisons (20 methods over 16 data sets) shows that the performance of survival models varies in practice across real-world data sets and with the choice of evaluation metric. In particular, we highlight that using multiple performance metrics is critical to providing a balanced assessment of the various models. The results of our study will provide practical guidelines for translational scientists and clinicians and will help define possible areas of investigation in both survival techniques and benchmarking strategies.
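To make the binarization approach mentioned above concrete, the following is a minimal Python sketch of turning censored survival data into class labels at a fixed cutoff. It is an illustration only, not the procedure used in the study: the function name `binarize_survival`, the 24-month cutoff, and the toy cohort are all hypothetical. It shows that subjects censored before the cutoff have no well-defined label, which is one reason a classification analysis can discard information that a survival model would use.

```python
from typing import List, Optional, Tuple


def binarize_survival(time: float, event: bool, cutoff: float) -> Optional[int]:
    """Label a subject for a 'event by cutoff?' classification task.

    Returns 1 if the event was observed at or before `cutoff`,
    0 if the subject is known to be event-free through `cutoff`,
    and None if the subject was censored before `cutoff` (label unknown).
    """
    if event:
        # Event time is observed exactly, so the label is always known.
        return 1 if time <= cutoff else 0
    # Censored: we only know the subject was event-free up to `time`.
    return 0 if time >= cutoff else None


# Hypothetical toy cohort: (follow-up time in months, event observed?)
cohort: List[Tuple[float, bool]] = [
    (12.0, True),   # event before the cutoff
    (30.0, True),   # event after the cutoff
    (8.0, False),   # censored early: status at the cutoff is unknown
    (40.0, False),  # censored late: known event-free at the cutoff
    (20.0, False),  # censored early: status at the cutoff is unknown
]

labels = [binarize_survival(t, e, cutoff=24.0) for t, e in cohort]
n_unlabeled = labels.count(None)

print(labels)       # class labels, with None where the label is unknowable
print(n_unlabeled)  # subjects a classifier would have to drop or impute
```

In this toy cohort, two of the five subjects cannot be labeled at the chosen cutoff, whereas a survival model such as the Cox model would still use their partial follow-up.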

Keywords: machine learning; survival analysis; survival prediction.

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Algorithms
Benchmarking*
Machine Learning*
Proportional Hazards Models
Survival Analysis