Series GSE18156
Status Public on Oct 22, 2009
Title Effect of read-mapping biases on detecting allele-specific expression from RNA-sequencing data.
Organism Homo sapiens
Experiment type Expression profiling by high throughput sequencing
Summary Next-generation sequencing has become an important tool for genome-wide quantification of DNA and RNA. However, a major technical hurdle lies in the need to map short sequence reads back to their correct locations in a reference genome. Here we investigate the impact of SNP variation on the reliability of read-mapping in the context of detecting allele-specific expression (ASE). We generated sixteen million 35 bp reads from mRNA of each of two HapMap Yoruba individuals. When we mapped these reads to the human genome we found that, at heterozygous SNPs, there was a significant bias towards higher mapping rates of the allele in the reference sequence, compared to the alternative allele. Masking known SNP positions in the genome sequence eliminated the reference bias but, surprisingly, did not lead to more reliable results overall. We find that even after masking, ~5-10% of SNPs still have an inherent bias towards more effective mapping of one allele. Filtering out inherently biased SNPs removes 40% of the top signals of ASE. The remaining SNPs showing ASE are enriched in genes previously known to harbor cis-regulatory variation or known to show uniparental imprinting. Our results have implications for a variety of applications involving detection of alternate alleles from short-read sequence data. Scripts, written in Perl and R, for simulating short reads, masking SNP variation in a reference genome, and analyzing the simulation output are available upon request from JFD.
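
The masking step and the reference-bias measure described in the summary can be illustrated with a minimal sketch. This is not the authors' Perl/R code, which is available only on request; the function names `mask_snps` and `reference_fraction`, the use of 0-based positions, and the choice of `N` as the masking character are all assumptions made for illustration.

```python
def mask_snps(reference, snp_positions, mask_char="N"):
    """Return a copy of `reference` with each SNP position replaced by `mask_char`.

    Positions are assumed to be 0-based offsets into the sequence (hypothetical
    convention; real SNP catalogues often use 1-based coordinates).
    """
    seq = list(reference)
    for pos in snp_positions:
        if not 0 <= pos < len(seq):
            raise ValueError(f"SNP position {pos} is outside the sequence")
        seq[pos] = mask_char
    return "".join(seq)


def reference_fraction(ref_reads, alt_reads):
    """Fraction of reads at a heterozygous SNP that carry the reference allele.

    With no mapping bias and no allele-specific expression, this is expected
    to be close to 0.5; values consistently above 0.5 across many SNPs would
    point to a bias towards the reference allele.
    """
    total = ref_reads + alt_reads
    if total == 0:
        raise ValueError("no reads cover this SNP")
    return ref_reads / total


if __name__ == "__main__":
    # Toy sequence and read counts, invented purely for demonstration.
    print(mask_snps("ACGTACGT", [1, 5]))
    print(reference_fraction(30, 10))
```

Masking with a character that matches neither allele means reads carrying either allele incur the same mismatch at the SNP, which is the intuition behind removing the reference bias; as the summary notes, this alone did not make results more reliable overall.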
 
Overall design RNA-Seq on two YRI HapMap cell lines. Each individual was sequenced on two lanes of the Illumina Genome Analyzer.
 
Contributor(s) Degner JF, Marioni JC, Pai AA, Pickrell JK, Nkadori E, Gilad Y, Pritchard JK
Citation(s) 19808877
Submission date Sep 17, 2009
Last update date May 15, 2019
Contact name Jacob F Degner
E-mail(s) jdegner@uchicago.edu
Organization name University of Chicago
Department Human Genetics
Lab Pritchard
Street address 920 E. 58th St
City Chicago
State/province IL
ZIP/Postal code 60615
Country USA
 
Platforms (1)
GPL9115 Illumina Genome Analyzer II (Homo sapiens)
Samples (2)
GSM453868 YRI NA19238 HapMap RNA Seq
GSM453869 YRI NA19239 HapMap RNA Seq
Relations
SRA SRP001462
BioProject PRJNA119495

Download family Format
SOFT formatted family file(s) SOFT
MINiML formatted family file(s) MINiML
Series Matrix File(s) TXT

Supplementary file Size Download File type/resource
GSE18156_RAW.tar 2.4 Gb (http)(custom) TAR (of MAP)
SRA Run Selector
Processed data provided as supplementary file
Raw data are available in SRA
