Series GSE6313
Status Public on Jan 01, 2007
Title Comparison of Hybridization-based and Sequencing-based Gene Expression Technologies on Biological Replicates
Organism Mus musculus
Experiment type Expression profiling by array
Summary High-throughput systems for gene expression profiling have been developed and matured rapidly through the past decade. Broadly, these can be divided into two categories: hybridization-based and sequencing-based approaches. With data from different technologies being accumulated, concerns and challenges are raised regarding data comparability and agreement across technologies. Within an ongoing large-scale cross-platform data comparison framework, we report here a comparison based on identical samples between one-dye DNA microarray platforms and MPSS (Massively Parallel Signature Sequencing). The DNA microarray platforms generally provided highly correlated data, while moderate correlations between microarrays and MPSS were obtained. Disagreements between the two types of technologies can be attributed to limitations inherent to both technologies. The variation found between pooled biological replicates underlines the importance of exercising caution in identification of differential expression, especially for the purposes of biomarker discovery. Based on different principles, hybridization-based and sequencing-based technologies should be considered complementary to each other, rather than competitive, and currently, both provide indispensable tools for transcriptome profiling.
Keywords: biological replicates
 
Overall design METHODS
Biological samples
RNA samples were isolated from three sources: two pools of C57/B6 adult mouse retina (MRP1 and MRP2, n=700) and Swiss-Webster post-natal day one (P1) mouse cortex (MC) (n=19). Retinas were dissected, collected and stored in Trizol (one pair of retinas per eppendorf tube) at -80°C prior to pooling. During the RNA extraction process, two pools of adult mouse retina (MRP1, MRP2) were created (700 retinas per pool) and aliquoted. All samples were stored at -80°C prior to conducting the experiments. The animal experiments were approved by the Institutional Animal Care Facility at Harvard University.
Microarray platforms, data processing and consistency assessment
Whole-genome mouse gene expression arrays (one-dye oligonucleotide microarrays) were investigated in this study, including: Affymetrix GeneChip®, Amersham (now GE Healthcare) CodeLink®, Mergen ExpressChip®, Applied Biosystems (ABI) microarrays, and Illumina BeadArray®. Microarray experiments comprise sample preparation, hybridization, scanning and image quantitation, an integrated series of procedures conducted in a laboratory, generally according to the manufacturer’s recommended protocols. To obtain sufficient statistical confidence in the data analysis, five technical replicates were obtained on each platform for each biological replicate (MRP1 and MRP2), with the exception of Illumina, for which a single experiment per sample was performed (included to make the data available to the public). For details of the experimental protocols and laboratories, we refer to Kuo et al.[6], except for Illumina, whose details can be found in Supplementary Material A.
Raw data sets from 63 chips were collected after image scanning and quantification on each platform. For Illumina data, we set the filtering threshold at “Detections” ≥ 0.9. Filtering for the other microarray platforms is described in Kuo et al.[6]. We also performed percentile transformation of intensities, quantile normalization and log2 ratio calculation, as described[6].
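The transformations named above can be illustrated with a minimal Python sketch. It is not the authors' pipeline (that is described in reference [6]); the function names are hypothetical, ties in quantile normalization are handled naively, and the log2 ratio is assumed here to be taken between two matched samples.

```python
import math


def percentile_transform(intensities):
    """Map each intensity on one chip to its percentile rank in (0, 100]."""
    n = len(intensities)
    order = sorted(range(n), key=lambda i: intensities[i])
    pct = [0.0] * n
    start = 0
    while start < n:
        end = start
        # Extend the run over tied values so they share one average rank.
        while end + 1 < n and intensities[order[end + 1]] == intensities[order[start]]:
            end += 1
        avg_rank = (start + end) / 2 + 1  # 1-based average rank
        for k in range(start, end + 1):
            pct[order[k]] = 100.0 * avg_rank / n
        start = end + 1
    return pct


def quantile_normalize(chips):
    """Force equal-length chips (lists of intensities) onto a shared distribution.

    The reference distribution is the mean of the sorted values across chips.
    Ties are broken by position (a simplification).
    """
    n = len(chips[0])
    sorted_chips = [sorted(chip) for chip in chips]
    reference = [sum(s[i] for s in sorted_chips) / len(chips) for i in range(n)]
    normalized = []
    for chip in chips:
        order = sorted(range(n), key=lambda i: chip[i])
        out = [0.0] * n
        for rank, idx in enumerate(order):
            out[idx] = reference[rank]
        normalized.append(out)
    return normalized


def log2_ratios(sample_a, sample_b):
    """Per-gene log2(a / b); assumes strictly positive, matched values."""
    return [math.log2(a / b) for a, b in zip(sample_a, sample_b)]
```

After quantile normalization every chip contains the same set of values, only assigned to different genes, which makes intensities comparable across replicates of one platform.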
Data repeatability and reproducibility[7] are two important aspects of microarray data consistency assessment. The former evaluates the degree of data variation among technical replicates of a platform, and the latter refers to data agreement across different microarray platforms when using the same biological samples. Two widely used metrics were adopted to measure microarray performance: the coefficient of variation (CV) among replicated measurements per gene, and the correlation coefficient (Pearson and Spearman) between any pair of replicated experiments. For intra-platform data consistency, the mean and standard deviation of the CVs or correlation coefficients were used as summaries of each platform’s performance. For inter-platform data agreement, either the mean (for normalized log2 ratios) or the median after percentile transformation (for intensities) of repeated measurements on each platform was used in calculating correlation coefficients.
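As a rough illustration of the two metrics, the sketch below computes a per-gene CV and Pearson and Spearman correlations using only the Python standard library. The function names are hypothetical, and the CV here uses the sample standard deviation, which is an assumption since the record does not say which estimator was used.

```python
import math
from statistics import mean, stdev


def coefficient_of_variation(replicates):
    """CV of one gene across technical replicates (sample SD / mean)."""
    return stdev(replicates) / mean(replicates)


def pearson(x, y):
    """Pearson correlation between two equal-length measurement vectors."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def average_ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    start = 0
    while start < n:
        end = start
        while end + 1 < n and values[order[end + 1]] == values[order[start]]:
            end += 1
        for k in range(start, end + 1):
            ranks[order[k]] = (start + end) / 2 + 1
        start = end + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))
```

Summarizing a platform then amounts to taking the mean and standard deviation of `coefficient_of_variation` over all genes, or of the correlation over all replicate pairs.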
MPSS experiment and data processing
Total RNA from MRP1 and MRP2, identical to that used in the microarray experiments, was sent to Lynx Therapeutics, Inc. (now Solexa, Hayward, CA) for MPSS experiments. Following an RNA quality test on an Agilent 2100 BioAnalyzer (Agilent Technologies, Palo Alto, CA), cDNA libraries were generated according to the Megaclone protocol[5, 21]. Signatures adjacent to poly(A)-proximal DpnII restriction sites (“GATC”) were cloned into a Megaclone vector. The resulting library was amplified and yielded about 1.6 million loaded microbeads, which were loaded onto a flow cell. Thereafter, an iterative series of enzymatic reactions decoded the signatures as 17-bp or 20-bp sequences (including the DpnII recognition site “GATC”)[22]. The abundance of each signature was converted to transcripts per million (tpm), as supplied by Lynx Therapeutics.
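The tpm values in this series were supplied by Lynx Therapeutics, so the exact procedure is not given here. As a generic sketch of the unit itself, a raw signature count can be scaled to transcripts per million as follows; the function name and the example signatures are hypothetical.

```python
def counts_to_tpm(signature_counts):
    """Scale raw signature counts so that they sum to one million."""
    total = sum(signature_counts.values())
    if total == 0:
        raise ValueError("no signatures were counted")
    return {sig: 1_000_000 * count / total
            for sig, count in signature_counts.items()}


# Two made-up 17-bp signatures, each starting with the DpnII site "GATC".
example = counts_to_tpm({
    "GATCAAAAAAAAAAAAA": 3,
    "GATCCCCCCCCCCCCCC": 1,
})
```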
The mapping of signatures to genes was based on the mouse genome sequence (Release #3, Feb 2003, UCSC Golden Path genome browser, http://hgdownload.cse.ucsc.edu/downloads.html) and the mouse UniGene sequences (ftp://ftp.ncbi.nih.gov/repository/UniGene/, Build #122). The mapping procedure includes: extraction of ‘virtual’ signatures from genomic sequence, classification of ‘virtual’ signatures from genomic sequence, and matching of MPSS expressed signatures to genomic signatures[22]. In this study, we included only the reliable signatures located close to a polyadenylation signal (“A[A/T]TAAA” within the 155 nt at the 3’-end of the cDNA sequence) or a poly(A) tail (15-base sequences containing 12 or more “A”s that occur within the 114 3’-most bases of the sequence) on mRNA sequences with known orientation information.
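The reliability criteria above translate fairly directly into string checks. The following Python sketch is illustrative only and is not the published mapping procedure[22]: the function names are hypothetical, the virtual signature is assumed to be the 3'-most “GATC” site that still has a full 17 bases available, and sequences are assumed to be uppercase and in sense orientation.

```python
import re

POLYA_SIGNAL = re.compile(r"A[AT]TAAA")


def virtual_signature(cdna, length=17):
    """Return the signature at the 3'-most GATC site, or None if absent."""
    if len(cdna) < length:
        return None
    # Restrict the search so that a full-length signature fits after the site.
    pos = cdna.rfind("GATC", 0, len(cdna) - length + 4)
    if pos == -1:
        return None
    return cdna[pos:pos + length]


def has_polya_signal(cdna, window=155):
    """True if A[A/T]TAAA occurs within the last `window` bases."""
    return POLYA_SIGNAL.search(cdna[-window:]) is not None


def has_polya_tail(cdna, window=114, span=15, min_a=12):
    """True if any `span`-base stretch in the last `window` bases has >= `min_a` A's."""
    tail = cdna[-window:]
    return any(tail[i:i + span].count("A") >= min_a
               for i in range(len(tail) - span + 1))


def is_reliable(cdna):
    """A sequence qualifies if it shows either 3'-end feature."""
    return has_polya_signal(cdna) or has_polya_tail(cdna)
```

For 20-bp signatures the same function can be called with `length=20`.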
Gene mapping among microarray platforms and between microarray and MPSS
Two approaches were used to match probes across different microarray chips: annotation-based and sequence-based probe matching[6]. Briefly, the annotation-based approach yielded UniGene (UG) and LocusLink (LL) based matching, whereas the sequence-based approach used actual sequence information to match probes at the RefSeq (RS) and RefSeq-exon (RSEXON) levels.
MPSS signatures were mapped to UniGene clusters, using an in silico constructed “virtual tags” library, as described above. Thus, the gene expression data measured by microarrays and by MPSS were paired up for comparisons via UniGene clusters.
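Pairing the two data types via UniGene clusters is essentially a join on the cluster identifier. The sketch below assumes that several signatures mapping to the same cluster are summed, which the record does not state; the function name, the cluster IDs and all values are hypothetical.

```python
from collections import defaultdict


def pair_by_unigene(array_expr, mpss_tpm, signature_to_unigene):
    """Return {cluster: (array value, summed MPSS tpm)} for shared clusters.

    array_expr: microarray value per UniGene cluster
    mpss_tpm: tpm per MPSS signature
    signature_to_unigene: signature -> UniGene cluster (unmapped ones omitted)
    """
    mpss_by_cluster = defaultdict(float)
    for signature, tpm in mpss_tpm.items():
        cluster = signature_to_unigene.get(signature)
        if cluster is not None:  # skip signatures with no cluster assignment
            mpss_by_cluster[cluster] += tpm
    shared = sorted(array_expr.keys() & mpss_by_cluster.keys())
    return {c: (array_expr[c], mpss_by_cluster[c]) for c in shared}


# Made-up inputs: two signatures hit cluster Mm.1, one is unmapped.
pairs = pair_by_unigene(
    array_expr={"Mm.1": 8.5, "Mm.2": 6.0},
    mpss_tpm={"GATCAAAAAAAAAAAAA": 120.0,
              "GATCCCCCCCCCCCCCC": 30.0,
              "GATCGGGGGGGGGGGGG": 5.0},
    signature_to_unigene={"GATCAAAAAAAAAAAAA": "Mm.1",
                          "GATCCCCCCCCCCCCCC": "Mm.1"},
)
```

Only clusters measured by both technologies survive the join, so the paired values can be passed straight to a correlation calculation.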
 
Contributor(s) Liu F, Jenssen T, Trimarchi J, Cepko CL, Ohno-Machado L, Hovig E, Kuo WP
Citation(s) 17555589
Submission date Nov 18, 2006
Last update date Feb 18, 2018
Contact name Winston Patrick Kuo
E-mail(s) wkuo@genetics.med.harvard.edu
Organization name Harvard Medical School
Department Genetics
Lab Cepko
Street address 188 Longwood Avenue
City Boston
State/province MA
ZIP/Postal code 02115
Country USA
 
Platforms (7)
GPL81 [MG_U74Av2] Affymetrix Murine Genome U74A Version 2 Array
GPL2995 ABI Mouse Genome Survey Microarray
GPL3734 Mergen Expresschip Mouse MO3
Samples (45)
GSM108124 ABI-MRP1-1
GSM108125 ABI-MRP1-2
GSM108128 ABI-MRP1-3
Relations
BioProject PRJNA100523

Download family / Format
SOFT formatted family file(s): SOFT
MINiML formatted family file(s): MINiML
Series Matrix File(s): TXT

Supplementary file / Size / File type
GSE6313_RAW.tar / 379.5 Mb / TAR (of CEL, TIFF, TXT)
