GEO Accession viewer

Sample GSM385512

Status

Public on Mar 23, 2009

Title

HeLa S3 STAT1 ChIP unstimulated_4

Sample type

SRA

Source name

HeLa S3

Organism

Homo sapiens

Characteristics

ChIP Antibody: STAT1
Cell Line: HeLa S3
Treatment: unstimulated

Extracted molecule

genomic DNA

Extraction protocol

To prepare immunoprecipitated DNA for 1G sequencing, we size-fractionated 5–50 ng of immunoprecipitate by 12% PAGE and excised a gel slice containing the 100–300 bp fragments. We eluted DNA from the gel slice overnight at 4 °C in 300 µl of elution buffer (5:1, LoTE buffer (3 mM Tris-HCl (pH 7.5), 0.2 mM EDTA)–7.5 M ammonium acetate) and recovered the DNA using a QIAquick PCR purification kit (Qiagen). Then we repaired the DNA ends using a 1:5 mixture of T4 and Klenow DNA polymerases (Illumina) following the manufacturer's instructions. After a 30-min incubation at 20 °C, we subjected the reaction to phenol–chloroform–isoamyl alcohol (pH 8.0; 100 µl; Fisher) extraction in 0.5-ml phase-lock gel tubes (heavy; Eppendorf) and precipitated the reactions by adding 250 µl of 100% EtOH, 3 µl of mussel glycogen (Invitrogen) and 10 µl of 7.5 M NH4OAc, and incubating at -20 °C for 20 min. We recovered the precipitate by centrifugation at 20,200g for 15 min at 4 °C in a benchtop refrigerated centrifuge (Eppendorf model 5417R). We added a single adenine base to the DNA using Klenow exo- (3' and 5' exo minus; Illumina) following the manufacturer's instructions.

Library strategy

ChIP-Seq

Library source

genomic

Library selection

ChIP

Instrument model

Illumina Genome Analyzer

Description

HS0151

Data processing

We extracted sequences from the resulting image files using the open source Firecrest and Bustard applications on a 32-CPU cluster running Red Hat Enterprise Linux 4 and Sun Microsystems Grid Engine 6. Reads were aligned (mapped) to the unmasked human reference genome (NCBI v36, hg18) using the Eland application (Illumina). Eland achieves high throughput by permitting no more than two mismatches per read sequence. We maximized the number of mapped reads by iteratively discarding two bases from the end of a rejected read and resubmitting the truncated read to Eland, until either all reads had been aligned to the genome or the lengths of all unmapped reads were less than 20 bp. Only uniquely mapped reads were retained.
DNA fragments were represented by a mapped sequence read of length 27 bp. Because a relatively small number of PCR cycles was used in library preparation and DNA-fragment end locations were weakly clustered rather than random (data not shown), we allowed reads with identical start coordinates to be present in a profile. We combined uniquely aligned reads from the three biological replicates into a single set of reads.
We transformed mapped sequence reads into profiles of the number of overlapped DNA fragments at each nucleotide in the reference genome. Because a 27-bp read directionally represents one end of a DNA fragment (SET), we approximated the fragment that produced a read sequence by extending the read to generate an XSET. We chose the XSET length to be the mean fragment length of the size-selected DNA. From distances between mapped reads, we estimated this length to be 174 bp. A peak's height was the maximum number of overlapped XSETs for that peak.
A random expectation for the probability of observing peaks with a particular height was generated from a numerical background model that generated peaks by randomly placing onto a hypothetical genome a number of fragments equal to the number of uniquely mapped reads. Each fragment's length was the estimated mean fragment length, that is, the XSET length. Because 27-bp reads can be mapped uniquely to approximately 90% of the human genome (data not shown), the background simulations used a mappable genome length that was 90% of 3.08 Gb. For a peak height, we estimated the FDR as the ratio of the number of peaks that the background model indicated should occur by chance, to the number observed.
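The XSET extension, peak height and FDR ratio described above can be illustrated with a minimal Python sketch. This is not the submitters' pipeline: the function names, the (start, end, strand) read tuples, the 0-based half-open coordinates and the rule that minus-strand reads are extended upstream from their alignment end are all assumptions made for illustration. Only the 174-bp XSET length and the 90% of 3.08 Gb mappable genome length come from the record.

```python
from collections import Counter

XSET_LENGTH = 174  # bp, mean fragment length stated in the record
# Mappable genome length used by the background model: 90% of 3.08 Gb.
MAPPABLE_GENOME_BP = round(0.90 * 3.08e9)


def xset_interval(start, end, strand, xset_len=XSET_LENGTH):
    """Extend a mapped read to an XSET (assumed 0-based, half-open)."""
    if strand == "+":
        return start, start + xset_len
    # Assumption: a minus-strand read marks the right-hand fragment end.
    return max(0, end - xset_len), end


def depth_segments(reads, xset_len=XSET_LENGTH):
    """Return (start, end, depth) runs of overlapped XSETs on one chromosome."""
    delta = Counter()
    for start, end, strand in reads:
        s, e = xset_interval(start, end, strand, xset_len)
        delta[s] += 1   # an XSET begins covering here
        delta[e] -= 1   # and stops covering here
    segments, depth = [], 0
    positions = sorted(delta)
    for pos, nxt in zip(positions, positions[1:]):
        depth += delta[pos]
        if depth > 0:
            segments.append((pos, nxt, depth))
    return segments


def peak_height(reads, xset_len=XSET_LENGTH):
    """Maximum number of overlapped XSETs across the given reads."""
    return max((d for _, _, d in depth_segments(reads, xset_len)), default=0)


def estimate_fdr(expected_by_chance, observed):
    """FDR as the ratio of background-model peaks to observed peaks."""
    if observed <= 0:
        raise ValueError("observed peak count must be positive")
    return expected_by_chance / observed


if __name__ == "__main__":
    # Three hypothetical 27-bp reads on one chromosome.
    reads = [(1000, 1027, "+"), (1050, 1077, "+"), (1200, 1227, "-")]
    print(depth_segments(reads))
    print(peak_height(reads))
    print(estimate_fdr(5, 100))
```

The sweep over interval endpoints avoids allocating a per-nucleotide array for a whole chromosome. Whether the peak counts passed to estimate_fdr refer to peaks at exactly a given height or at that height and above is not specified in the record, so the function simply takes the two counts as inputs.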

Submission date

Mar 23, 2009

Last update date

May 15, 2019

Contact name

Eric Chuah

E-mail(s)

echuah@bcgsc.ca

Phone

604-707-5900 3231

Organization name

BC Genome Sciences Centre