Quantitative comparison of within-sample heterogeneity scores for DNA methylation data

Nucleic Acids Res. 2020 May 7;48(8):e46. doi: 10.1093/nar/gkaa120.

Abstract

DNA methylation is an epigenetic mark with important regulatory roles in cellular identity and can be quantified at base resolution using bisulfite sequencing. Most studies are limited to the average DNA methylation levels of individual CpGs and thus neglect heterogeneity within the profiled cell populations. To assess this within-sample heterogeneity (WSH) several window-based scores that quantify variability in DNA methylation in sequencing reads have been proposed. We performed the first systematic comparison of four published WSH scores based on simulated and publicly available datasets. Moreover, we propose two new scores and provide guidelines for selecting appropriate scores to address cell-type heterogeneity, cellular contamination and allele-specific methylation. Most of the measures were sensitive in detecting DNA methylation heterogeneity in these scenarios, while we detected differences in susceptibility to technical bias. Using recently published DNA methylation profiles of Ewing sarcoma samples, we show that DNA methylation heterogeneity provides information complementary to the DNA methylation level. WSH scores are powerful tools for estimating variance in DNA methylation patterns and have the potential for detecting novel disease-associated genomic loci not captured by established statistics. We provide an R-package implementing the WSH scores for integration into analysis workflows.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • DNA Methylation*
  • Humans
  • Sarcoma, Ewing / genetics
  • Sequence Analysis, DNA*