Series GSE51170
Status Public on Sep 30, 2014
Title Identification of building principles of methylation states at CG rich regions by high-throughput editing of a mammalian genome
Organisms Escherichia coli; Mus musculus; synthetic construct
Experiment type Methylation profiling by high throughput sequencing
Summary Methylation is a repressive modification of DNA prevalent throughout mammalian genomes yet mostly absent at CG-rich stretches referred to as CpG islands (CGI). Here we identify their building principles by parallel genomic targeting of sequence libraries. Iterative insertions generated over 3,000 variants of genome-derived and artificial sequences at the same genomic site. Single-molecule profiling of the methylation status of this collection allowed modeling the contribution of CG content and DNA binding factors towards the unmethylated state. It made the surprising prediction that the majority of CGs within endogenous islands are susceptible to methylation changes modulated by the presence of transcription factors (TFs), which is indeed confirmed by genome-wide methylation dynamics during multiple cellular differentiations. Our model further predicts blocks of constitutively unmethylated CGs independent of TF binding, which have a median size of ~300 bp but are only present in half of all islands. Their constitutively unmethylated state is a hallmark of untransformed cells, but their increased methylation is a specific and predictive feature of cancer. This study quantifies the two principal mechanisms governing methylation patterns in mammalian genomes. It provides a framework to interpret methylation data across normal and cancer samples and refines the concept of CpG islands.
 
Overall design Libraries of DNA sequences were constructed by subrepresentation of the mouse genome (129S6) or the E. coli genome (NC_010473.1), or by custom synthesis. DNA fragments were inserted into the genome of mouse embryonic stem cells by recombinase-mediated cassette exchange (RMCE) at the β-globin locus. The methylation status of the inserted DNA sequences was profiled by bisulfite sequencing using a pair of universal primers flanking the fragments.
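As a rough illustration of how bisulfite reads translate into per-CG methylation values, the sketch below scores a single fragment. It is not the pipeline used for this series: it assumes top-strand reads that are already aligned to the fragment without gaps, and the function names `cpg_calls` and `methylation_fraction` are hypothetical. Bisulfite treatment converts unmethylated C to T while methylated C stays C, so a C read at a reference CG counts as methylated and a T as unmethylated.

```python
from collections import defaultdict


def cpg_calls(reference, read):
    """Return (position, is_methylated) for every CG in `reference`.

    Assumes `read` is a top-strand bisulfite read aligned to `reference`
    base by base (same length, no gaps). This is an illustrative
    simplification, not the processing used for GSE51170.
    """
    reference = reference.upper()
    read = read.upper()
    if len(reference) != len(read):
        raise ValueError("reference and read must have the same length")
    calls = []
    for i in range(len(reference) - 1):
        if reference[i:i + 2] == "CG":
            if read[i] == "C":      # protected from conversion: methylated
                calls.append((i, True))
            elif read[i] == "T":    # converted by bisulfite: unmethylated
                calls.append((i, False))
            # any other base (e.g. N or a sequencing error) is skipped
    return calls


def methylation_fraction(reference, reads):
    """Return {CG position: fraction of informative reads methylated}."""
    methylated = defaultdict(int)
    covered = defaultdict(int)
    for read in reads:
        for pos, is_meth in cpg_calls(reference, read):
            covered[pos] += 1
            methylated[pos] += int(is_meth)
    return {pos: methylated[pos] / covered[pos] for pos in sorted(covered)}
```

Because each read comes from one DNA molecule, keeping the per-read output of `cpg_calls` rather than only the averaged fractions preserves the single-molecule view mentioned in the summary.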
 
Contributor(s) Krebs A, Schübeler D
Citation(s) 25259795
Submission date Sep 25, 2013
Last update date May 15, 2019
Contact name Dirk Schuebeler
Organization name Friedrich Miescher Institute for Biomedical Research
Street address Maulbeerstrasse 66
City Basel
ZIP/Postal code 4058
Country Switzerland
 
Platforms (5)
GPL9250 Illumina Genome Analyzer II (Mus musculus)
GPL10328 Illumina Genome Analyzer II (Escherichia coli)
GPL16085 Illumina MiSeq (Escherichia coli)
Samples (19)
GSM1240003 extreme E.coli - bisSeq 1
GSM1240004 extreme E.coli - bisSeq 2
GSM1240005 extreme E.coli - bisSeq 3
Relations
BioProject PRJNA221381
SRA SRP030456

Download family Format
SOFT formatted family file(s) SOFT
MINiML formatted family file(s) MINiML
Series Matrix File(s) TXT

Supplementary file Size Download File type/resource
GSE51170_e.coli_MspI_summary.tab.txt.gz 4.6 Kb (ftp)(http) TXT
GSE51170_e.coli_extColi_summary.tab.txt.gz 2.0 Kb (ftp)(http) TXT
GSE51170_e.coli_fragments_definitions.tab.txt.gz 148.0 Kb (ftp)(http) TXT
GSE51170_mouse_BssHII_summary.tab.txt.gz 700 b (ftp)(http) TXT
GSE51170_mouse_BstU_summary.tab.txt.gz 2.1 Kb (ftp)(http) TXT
GSE51170_mouse_NarI_summary.tab.txt.gz 1.4 Kb (ftp)(http) TXT
GSE51170_mouse_fragments_definitions.tab.txt.gz 214.1 Kb (ftp)(http) TXT
SRA Run Selector
Raw data are available in SRA
Processed data are available on Series record
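
The supplementary files listed above are gzip-compressed, tab-delimited text. A minimal sketch for loading one of them from a local copy follows; the column layout is not described on this record, so the helper (a hypothetical name, `read_tab_gz`) returns raw rows of strings and leaves interpretation of the columns to the reader.

```python
import csv
import gzip


def read_tab_gz(path):
    """Read a gzip-compressed, tab-delimited text file into a list of rows.

    Each row is a list of strings. No header handling or type conversion
    is attempted because the column layout is an unknown here.
    """
    with gzip.open(path, "rt", newline="") as handle:
        return [row for row in csv.reader(handle, delimiter="\t")]
```

For example, after downloading a file into the working directory, `read_tab_gz("GSE51170_mouse_NarI_summary.tab.txt.gz")` would return its rows; inspecting the first few rows is a reasonable way to check whether the file carries a header line.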
