Differential variability improves the identification of cancer risk markers in DNA methylation studies profiling precursor cancer lesions

Bioinformatics. 2012 Jun 1;28(11):1487-94. doi: 10.1093/bioinformatics/bts170. Epub 2012 Apr 6.

Abstract

Motivation: The standard paradigm in omic disciplines has been to identify biologically relevant biomarkers using statistics that reflect differences in mean levels of a molecular quantity such as mRNA expression or DNA methylation. Recently, however, it has been proposed that differential epigenetic variability may mark genes that contribute to the risk of complex genetic diseases like cancer and that identification of risk and early detection markers may therefore benefit from statistics based on differential variability.

Results: Using four genome-wide DNA methylation datasets totalling 311 epithelial samples and encompassing all stages of cervical carcinogenesis, we here formally demonstrate that differential variability, as a criterion for selecting DNA methylation features, can identify cancer risk markers more reliably than statistics based on differences in mean methylation. We show that differential variability selects features with heterogeneous outlier methylation profiles and that these play a key role in the early stages of carcinogenesis. Moreover, differentially variable features identified in precursor non-invasive lesions exhibit significantly increased enrichment for developmental genes compared with differentially methylated sites. Conversely, differential variability does not add predictive value in cancer studies profiling invasive tumours or whole-blood tissue. Finally, we incorporate the differential variability feature selection step into a novel adaptive index prediction algorithm called EVORA (epigenetic variable outliers for risk prediction analysis), and demonstrate that EVORA compares favourably to powerful prediction algorithms based on differential methylation statistics.

Conclusions: Statistics based on differential variability improve the detection of cancer risk markers in the context of DNA methylation studies profiling epithelial preinvasive neoplasias. We present a novel algorithm (EVORA) which could be used for prediction and diagnosis of precursor epithelial cancer lesions.

Availability: R-scripts implementing EVORA are available from CRAN (www.r-project.org).
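The contrast drawn in the Results, between statistics based on differences in mean methylation and statistics based on differential variability, can be illustrated with a minimal sketch. This is not the authors' EVORA implementation. The abstract does not say which variability test or outlier rule EVORA uses, so the variance ratio, the three-standard-deviation outlier threshold, the function names and the toy beta values below are all assumptions chosen for illustration only.

```python
import statistics


def welch_t(controls, cases):
    """Welch t-statistic: large when the mean methylation differs between groups."""
    m0, m1 = statistics.fmean(controls), statistics.fmean(cases)
    v0, v1 = statistics.variance(controls), statistics.variance(cases)
    return (m1 - m0) / ((v0 / len(controls) + v1 / len(cases)) ** 0.5)


def variance_ratio(controls, cases):
    """Ratio of sample variances (cases / controls): large when cases are more variable.

    Assumption: a simple stand-in for a formal differential variability test.
    """
    return statistics.variance(cases) / statistics.variance(controls)


def outlier_fraction(sample, control_matrix, n_sd=3.0):
    """Fraction of features where the sample exceeds control mean + n_sd * control SD.

    Assumption: a hypothetical outlier-counting score, not the published EVORA index.
    """
    hits = 0
    for value, control_values in zip(sample, control_matrix):
        threshold = statistics.fmean(control_values) + n_sd * statistics.stdev(control_values)
        if value > threshold:
            hits += 1
    return hits / len(sample)


# Toy methylation beta values (made up for illustration).
controls = [0.10, 0.12, 0.11, 0.09, 0.10, 0.11, 0.12, 0.10]

# Feature A: every case is shifted upwards by the same small amount.
cases_shift = [0.20, 0.22, 0.21, 0.19, 0.20, 0.21, 0.22, 0.20]

# Feature B: most cases look like controls, but two are strong outliers.
cases_outlier = [0.10, 0.12, 0.11, 0.09, 0.10, 0.11, 0.80, 0.75]

t_shift = welch_t(controls, cases_shift)
t_outlier = welch_t(controls, cases_outlier)
vr_shift = variance_ratio(controls, cases_shift)
vr_outlier = variance_ratio(controls, cases_outlier)

# Score one hypothetical sample across both features, using the same controls for each.
score = outlier_fraction([0.11, 0.80], [controls, controls])

print(f"t-statistic    shift: {t_shift:.2f}   outlier: {t_outlier:.2f}")
print(f"variance ratio shift: {vr_shift:.2f}   outlier: {vr_outlier:.2f}")
print(f"outlier fraction for the example sample: {score:.2f}")
```

In this toy data the t-statistic ranks the uniformly shifted feature far above the outlier-driven one, while the variance ratio ranks them the other way round. That is the kind of heterogeneous outlier profile the abstract says differential variability selects.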

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Breast Neoplasms / genetics
  • DNA Methylation*
  • Female
  • Genetic Markers*
  • Humans
  • Papillomavirus Infections / genetics
  • Precancerous Conditions / genetics*
  • Randomized Controlled Trials as Topic
  • Risk
  • Transcriptome*
  • Uterine Cervical Dysplasia / diagnosis
  • Uterine Cervical Dysplasia / genetics*
  • Uterine Cervical Neoplasms / diagnosis
  • Uterine Cervical Neoplasms / genetics*

Substances

  • Genetic Markers

Associated data

  • GEO/GSE30760