GEO Accession viewer

Sample GSM883783

Status

Public on Aug 29, 2012

Title

Dmel21_E21h

Sample type

SRA

Source name

Whole organism, embryos

Organism

Drosophila melanogaster

Characteristics

strain: y; cn bw sp
tissue: whole body
developmental stage: embryos 20-21h after egg laying

Treatment protocol

None.

Growth protocol

Stocks of the y; cn bw sp strain were maintained in standard cornmeal medium bottles in a 24°C incubator. Embryo collections were performed in population cages (Flystuff, #59-116). 2- to 7-day-old flies were left to acclimatize to the cage for at least 48h and regularly fed with grape juice-agar plates (Flystuff, #47-102) generously loaded with yeast paste. After two 2-hour pre-lays, embryos were collected in 1-hour windows and aged appropriately (24 timepoints, 0-24h). Embryos were washed with deionized water, dechorionated for 90 sec with 50% bleach, rinsed abundantly with water, and snap-frozen in liquid nitrogen. Larvae and pupae were collected as described previously (Graveley, Brooks et al. 2010). For L1 and L2 stages, 2-hour embryo collections were aged for 42 or 66 hours, larvae were briefly rinsed with deionized water and snap-frozen. For L3 stages, embryos were transferred to bottles containing cornmeal medium supplemented with 0.05% bromophenol blue, and wandering L3 larvae were staged based on gut staining (dark, light or clear gut) and snap-frozen. For pupae, 2-hour embryo collections were transferred to standard cornmeal medium bottles, the positions of new white prepupae on the walls of the bottle were marked, and pupae were collected and snap-frozen at the desired age. For adults, 0- to 12-hour-old flies were sexed and kept in vials with cornmeal medium for 5 days, and then snap-frozen.

Extracted molecule

total RNA

Extraction protocol

RNA extraction: Total RNA was extracted from adult flies using Trizol (Invitrogen) according to the manufacturer's instructions and treated with DNaseI (Roche). Extraction from embryos, larvae and pupae was performed using the RNAdvance Tissue kit (Agencourt #A32649) according to the manufacturer's instructions, including DNaseI treatment. We systematically checked on a Bioanalyzer (Agilent) that the RNA was of very high quality. 5'-monophosphate species (mainly ribosomal RNAs) were depleted by TEX digest.

Library preparation & sequencing: Three multiplexed libraries were prepared: one for embryos (24 barcoded samples), one for larvae and pupae (10 samples), and one for adults (2 samples). The reverse-transcription was run in parallel for all samples destined to the same library, and the samples were pooled right after reverse-transcription. Our 5'-complete cDNA selection strategy relies on the combination of two orthogonal enrichment methods: reverse-transcriptase template-switching, and cap-trapping. The template-switching approach is based on the ability of reverse-transcriptase to add linker sequences to the ends of 5'-complete cDNAs, preferentially if they are made from capped transcripts. Cap-trapping relies on the biotinylation of capped RNA molecules and specific pulldown of their associated 5'-complete cDNAs. The libraries were run on a DNA HS Bioanalyzer chip for quality control, quantified by quantitative PCR, and sequenced on one lane each on an Illumina GAIIx (adults, 2x76bp) or HiSeq (embryos, larvae and pupae, 2x101bp). Please see the Supplementary Material of the original publication for a detailed protocol.

Library strategy

RNA-Seq

Library source

transcriptomic

Library selection

cDNA

Instrument model

Illumina HiSeq 2000

Description

RNA (5'-complete cDNAs selected by TEX digest, template-switching and cap-trapping).
RNA-Seq (RAMPAGE)

Data processing

Sequencing reads alignment: The sequences corresponding to the library identification barcode (first 6 bases of read 1) and the reverse-transcription primer (first 15 bases of read 2) were trimmed prior to mapping. Trimmed reads were mapped with STAR, with parameters described in Tables S2-3 of the original publication. All uniquely mapping reads were kept. As a rescue strategy for multiply mapping reads, if all alignments for those reads started within an annotated transposon and overlapped the same gene annotation, the alignment starting in the closest transposon insertion was selected. All non-rescued multi-mappers were discarded.
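The trimming step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the submitters' code: the function name `trim_pair` and the tuple layout are hypothetical, and only the two lengths (6 and 15 bases) come from this record.

```python
# Lengths taken from the record; everything else here is illustrative.
BARCODE_LEN = 6      # library identification barcode: first 6 bases of read 1
RT_PRIMER_LEN = 15   # reverse-transcription primer: first 15 bases of read 2


def trim_pair(read1_seq, read1_qual, read2_seq, read2_qual):
    """Split off the barcode and RT primer, return them with the trimmed mates.

    Returns (barcode, rt_primer, (seq1, qual1), (seq2, qual2)).
    The barcode and primer are kept because later steps can use them
    (sample demultiplexing, duplicate detection).
    """
    barcode = read1_seq[:BARCODE_LEN]
    rt_primer = read2_seq[:RT_PRIMER_LEN]
    trimmed1 = (read1_seq[BARCODE_LEN:], read1_qual[BARCODE_LEN:])
    trimmed2 = (read2_seq[RT_PRIMER_LEN:], read2_qual[RT_PRIMER_LEN:])
    return barcode, rt_primer, trimmed1, trimmed2
```

For a 2x101bp pair, this leaves 95 bases of read 1 and 86 bases of read 2 for mapping.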

Data analysis pipeline: PCR duplicates, defined as reads sharing the same alignment coordinates (start, end and splice sites), were removed from the individual datasets. To avoid over-collapsing, we took advantage of the fact that the long random sequence (15-mer) of our reverse-transcription primer often primes with mismatches. We used this sequence as a pseudo-random barcode allowing us to distinguish between true duplicates (same barcode) and independent identical inserts. All collapsed datasets were then combined prior to peak calling. The density of cDNA 5' ends across the genome was determined from this combined dataset, as well as the density of coverage by second (i.e., downstream) sequencing reads. Peaks were called by a sliding window algorithm that assesses the significance of local signal enrichment given a null distribution. Downstream read coverage in the same window was used to correct for local transcript abundance, by subtracting from the raw signal a pseudocount proportional to this coverage. After FDR correction, significant windows in close proximity to each other were merged into peaks, and those were trimmed at the edges down to the first base with signal (parameters: window width 15 bases, null distribution negative binomial with k=4, background weight 0.5, FDR 0.01, merging range 150 bases). These peaks were connected to annotated genes based on cDNA structure information. For each peak, if we could find at least 2 inserts with their 5' end in the peak and overlapping an annotated exon of a gene, the peak was functionally linked to that gene. If a peak could potentially be linked to several genes, ties were broken by removing all links that were 5-fold weaker than the strongest one. For quantification, the signal for each peak and each timepoint was derived from the uncollapsed datasets, and normalized to dataset size (defined as the total number of reads attributed to any genic TSS).
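The duplicate-removal rule described above (same alignment coordinates and same primer-derived barcode) might look roughly like the following sketch. The record layout (dictionaries with `chrom`, `strand`, `start`, `end`, `splice_sites` and `rt_primer` keys) and the function name are assumptions made for illustration; the original pipeline's data structures are not given in this record.

```python
def collapse_duplicates(alignments):
    """Keep one alignment per (coordinates, pseudo-random barcode) combination.

    `alignments` is an iterable of dicts with hypothetical keys:
    chrom, strand, start, end, splice_sites (sequence of positions),
    and rt_primer (the 15 bases trimmed from the start of read 2).
    Inserts with identical coordinates but different primer sequences
    are treated as independent molecules and are all kept.
    """
    seen = set()
    kept = []
    for aln in alignments:
        key = (
            aln["chrom"],
            aln["strand"],
            aln["start"],
            aln["end"],
            tuple(aln["splice_sites"]),
            aln["rt_primer"],
        )
        if key not in seen:
            seen.add(key)
            kept.append(aln)
    return kept
```

Including the primer sequence in the key is what prevents over-collapsing of highly expressed transcripts, where many independent inserts can share the same coordinates.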
We built partial transcript models by running Cufflinks separately on the set of reads coming from each peak for each given dataset, and collapsing all transcripts for each peak using Cuffmerge. For a more detailed description of the analysis pipeline, please refer to the original publication.
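The peak-to-gene linking rule from the data analysis pipeline above (at least 2 supporting inserts, then removal of links 5-fold weaker than the strongest) can be illustrated as follows. This sketch assumes that "5-fold weaker" means a supporting-insert count of one fifth of the strongest link or less; the record does not spell out the exact comparison, and the function name and input format are hypothetical.

```python
def link_peak_to_genes(insert_counts, min_inserts=2, fold=5.0):
    """Return the gene links retained for one peak.

    `insert_counts` maps a gene identifier to the number of inserts whose
    5' end lies in the peak and which overlap an annotated exon of that gene.
    Assumption: a link is dropped when count * fold <= strongest count.
    """
    # Require a minimum number of supporting inserts per gene.
    supported = {g: n for g, n in insert_counts.items() if n >= min_inserts}
    if not supported:
        return {}
    strongest = max(supported.values())
    # Drop links that are `fold`-fold (or more) weaker than the strongest one.
    return {g: n for g, n in supported.items() if n * fold > strongest}
```

With counts of 50, 11, 10 and 1 inserts for four candidate genes, only the first two links would survive under this reading of the rule.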

*_+.bw file: bigWig coverage by cDNA 5' ends (+ strand).
*_-.bw file: bigWig coverage by cDNA 5' ends (- strand).
Genome Build:
Dmel21_E21h_-.bw: dm3
Dmel21_E21h_+.bw: dm3

Submission date

Mar 01, 2012

Last update date

May 15, 2019

Contact name

Philippe Batut

E-mail(s)

batut@cshl.edu

Phone

516-422-4122

Organization name

CSHL

Lab

Gingeras

Street address

500 Sunnyside Blvd.

City

Woodbury

State/province

ZIP/Postal code

11797

Country

USA

Platform ID

GPL13304

Series (2)

GSE36212	Promoter activity profiling throughout the Drosophila life cycle reveals role of transposons in regulatory innovation
GSE36213	Profiling of transcription start site expression in Drosophila and the human K562 cell line using RAMPAGE

Relations

Reanalyzed by

GSM3276715

SRA

SRX124973

BioSample

SAMN00794534

Supplementary file	Size	Download	File type/resource
GSM883783_Dmel21_E21h_+.bw	405.6 Kb	(ftp)(http)	BW
GSM883783_Dmel21_E21h_-.bw	394.7 Kb	(ftp)(http)	BW
Raw data are available in SRA
Processed data provided as supplementary file
Processed data are available on Series record