Sample GSM1808042
Status Public on Jan 18, 2016
Title Adipose_1
Sample type SRA
 
Source name Adipose (Subcutaneous)
Organism Homo sapiens
Characteristics tissue: Adipose
Extracted molecule total RNA
Extraction protocol In brief, 0.5-1.0 μg of total RNA was twice selected for mRNA by oligo (dT) and then fragmented by heating. First strand cDNA was synthesized using Superscript III reverse transcriptase and random hexamer primers. After second strand synthesis by DNA polymerase I and with dUTP in place of dTTP, double stranded cDNA was end-repaired and A-tailed prior to ligation of Illumina adaptors including DNA indices. Libraries were made strand-specific by digestion with Uracil-DNA Glycosylase prior to PCR amplification. Bead-based clean-up was incorporated after each enzymatic reaction and libraries were checked by flash gel and Bioanalyzer analysis.
Total RNA was isolated from tissues using TRIzol Reagent.
 
Library strategy RNA-Seq
Library source transcriptomic
Library selection cDNA
Instrument model Illumina HiSeq 2000
 
Data processing Insert sizes were estimated with the Picard Tools (v1.79) programs SortSam.jar and CollectInsertSizeMetrics.jar, run on SAM files generated by Bowtie2 (v2.0.0-beta7) from a subset of 500,000 paired-end reads per sample (bowtie2 options -q --very-fast --phred33).
Raw reads were mapped to the human genome sequence using TopHat v2.0.6, allowing four mismatches per 100 bp read, a maximum edit distance of 6 per read, and one mismatch in the splice anchor region.
Using Picard Tools, the alignment files were sorted by genomic coordinates, read-group data was added, and duplicate reads were marked and removed.
Transcript structure assembly was performed with Cufflinks (v2.0.2) on each sample for each tissue type. The Gencode v12 annotation was used as a reference to guide assembly (--GTF-guide); additional parameters included upper-quartile normalization (--upper-quartile-norm), library type (--library-type fr-firststrand), and maximum bundle length (--max-bundle-length 7500000).
Cuffmerge (v2.0.2) was then used to merge all the Cufflinks assemblies and the reference annotation into one large set of transcript structures. Splice junctions not present in the reference annotation were required to pass a Shannon entropy score threshold of 2.
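The record does not define how the Shannon entropy score is computed. The sketch below assumes one common formulation, in which the score is the entropy (in bits) of the distribution of read start offsets supporting a junction, so that junctions covered by reads from many distinct positions score higher than junctions supported by a single stacked position. The function names and the strict "greater than" comparison (taken from the later statement that scores must be greater than 2) are illustrative assumptions, not the submitters' code.

```python
import math
from collections import Counter


def junction_entropy(offsets):
    """Shannon entropy (bits) of the read start offsets supporting one junction.

    `offsets` holds one integer per supporting read: the position at which
    that read starts relative to the junction (hypothetical input format).
    """
    counts = Counter(offsets)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    entropy = 0.0
    for n in counts.values():
        p = n / total
        entropy -= p * math.log2(p)
    return entropy


def passes_threshold(offsets, threshold=2.0):
    """Assumed filter: keep the junction only if its entropy exceeds the threshold."""
    return junction_entropy(offsets) > threshold
```

Under this definition, a junction needs reads at more than four distinct, evenly used offsets to exceed a score of 2, which is why the filter penalizes junctions supported only by duplicated or stacked reads.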
The mapped reads for each sample were subsampled down to 20 million using multiple runs of the Picard tool DownsampleSam, and 18 samples per tissue type were selected to create a subsampled dataset. Subsequent steps were performed separately on both subsampled and non-subsampled datasets.
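DownsampleSam retains each read with a given probability rather than selecting an exact count. The record does not say how that probability was chosen or why multiple runs were needed; a minimal sketch, assuming the retention probability is simply the target divided by the number of mapped reads, looks like this. The function name and the default target are illustrative.

```python
def keep_probability(mapped_reads, target=20_000_000):
    """Fraction of reads to retain so the expected count is about `target`.

    Assumption: a single random pass with this probability; the result is
    capped at 1.0 for samples that already have fewer reads than the target.
    """
    if mapped_reads <= 0:
        raise ValueError("mapped_reads must be positive")
    return min(1.0, target / mapped_reads)
```

Because retention is random, a single pass yields only approximately the target number of reads, which may be one reason repeated runs were used.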
To calculate the expression level of each gene in each tissue type, Cuffdiff (v2.2.1) was run with default parameters and the option --library-type fr-firststrand on the subsampled read files with 18 samples per tissue type used as ‘replicates’ and the merged set of transcript structures used as the reference annotation. Gene expression for each tissue was calculated by summing all isoforms for a given gene in a given tissue from the isoforms.fpkm_tracking file generated by Cuffdiff. Genes with any isoform with a status of “HIDATA” do not have a calculated FPKM value. Gene expression for each individual was calculated by summing all isoforms for a given gene for each individual from the isoforms.read_group_tracking file generated by Cuffdiff. This file has, for each individual, one value per isoform reported for each gene.
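The per-tissue summation described above can be sketched as follows. This is an illustration under stated assumptions, not the submitters' script: it assumes a tab-delimited tracking table with a gene_id column and per-sample columns named <sample>_FPKM and <sample>_status, and the example rows and values are invented. A gene is left without a value (None) if any of its isoforms has the status HIDATA.

```python
import csv
import io
from collections import defaultdict

# Invented example rows; a real isoforms.fpkm_tracking file has more columns.
EXAMPLE = (
    "tracking_id\tgene_id\tAdipose_FPKM\tAdipose_status\n"
    "TCONS_1\tXLOC_1\t1.5\tOK\n"
    "TCONS_2\tXLOC_1\t2.5\tOK\n"
    "TCONS_3\tXLOC_2\t0\tHIDATA\n"
    "TCONS_4\tXLOC_2\t7.0\tOK\n"
)


def gene_fpkm(tracking_text, sample):
    """Sum isoform FPKM values per gene for one sample label.

    Returns {gene_id: total FPKM, or None if any isoform is HIDATA}.
    """
    fpkm_col = f"{sample}_FPKM"
    status_col = f"{sample}_status"
    totals = defaultdict(float)
    flagged = set()
    reader = csv.DictReader(io.StringIO(tracking_text), delimiter="\t")
    for row in reader:
        gene = row["gene_id"]
        if row[status_col] == "HIDATA":
            flagged.add(gene)
        else:
            totals[gene] += float(row[fpkm_col])
    genes = set(totals) | flagged
    return {g: (None if g in flagged else totals[g]) for g in genes}


if __name__ == "__main__":
    print(gene_fpkm(EXAMPLE, "Adipose"))
```

The same grouping would apply to the per-individual values, with the isoform rows taken from isoforms.read_group_tracking instead.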
Splice junctions were identified using the JuncBASE package on reads that overlapped protein-coding genes. Only splice junctions with a Shannon entropy score greater than 2 using all subsampled reads were used. Junctions were called non-annotated if they were present in neither the Gencode v12 annotation nor the Ensembl annotation. Reported JuncBASE results were transformed into groups of mutually exclusive junction sets, each with defined length-normalized read counts (reads/100bp) and ‘percent spliced in’ (PSI) values. Complex events (e.g. cassette exon + alternate 3’ splice site) with ambiguously mapped reads and intron retention events were filtered out.
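For orientation, the two quantities named above can be written out using their general definitions: reads per 100 bp as a length normalization, and PSI as the percentage of normalized reads supporting inclusion out of all reads supporting inclusion or exclusion. This is a minimal sketch with illustrative function names; it is not JuncBASE's implementation, and how JuncBASE defines the region length for each event type is not stated in this record.

```python
def length_normalized(reads, region_length_bp):
    """Read count scaled to reads per 100 bp of the counted region."""
    if region_length_bp <= 0:
        raise ValueError("region_length_bp must be positive")
    return reads * 100.0 / region_length_bp


def percent_spliced_in(inclusion, exclusion):
    """PSI as a percentage; None when no reads support either isoform."""
    total = inclusion + exclusion
    if total == 0:
        return None
    return 100.0 * inclusion / total
```

For example, 50 reads over a 200 bp inclusion region normalize to 25 reads/100bp; with 10 normalized reads supporting exclusion and 30 supporting inclusion, the PSI would be 75.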
Genome_build: hg19
Supplementary_files_format_and_content: FPKM values by individual and by tissue are reported for 389 pharmacogenes in subsampled and non-subsampled datasets; Percent Spliced In (PSI) values are reported for 389 pharmacogenes by individual in subsampled and non-subsampled datasets.
 
Submission date Jul 03, 2015
Last update date May 15, 2019
Contact name Kathleen M Giacomini
Organization name University of California, San Francisco
Street address 1550 4th St, Mission Bay, RH 581
City San Francisco
State/province CA
ZIP/Postal code 94143
Country USA
 
Platform ID GPL11154
Series (1)
GSE70503 Transcriptomic variation of pharmacogenes in multiple human tissues and lymphoblastoid cell lines
Relations
Reanalyzed by GSE81474
BioSample SAMN03840207
SRA SRX1081346

Supplementary data files not provided
Raw data are available in SRA
Processed data are available on Series record
