|
|
|
Status |
Public on Jan 18, 2016 |
Title |
Adipose_1 |
Sample type |
SRA |
|
|
Source name |
Adipose (Subcutaneous)
|
Organism |
Homo sapiens |
Characteristics |
tissue: Adipose
|
Extracted molecule |
total RNA |
Extraction protocol |
In brief, 0.5-1.0 μg of total RNA was twice selected for mRNA by oligo(dT) and then fragmented by heating. First-strand cDNA was synthesized using Superscript III reverse transcriptase and random hexamer primers. After second-strand synthesis by DNA polymerase I with dUTP in place of dTTP, double-stranded cDNA was end-repaired and A-tailed prior to ligation of Illumina adaptors including DNA indices. Libraries were made strand-specific by digestion with uracil-DNA glycosylase prior to PCR amplification. Bead-based clean-up was incorporated after each enzymatic reaction, and libraries were checked by flash gel and Bioanalyzer analysis. Total RNA was isolated from tissues using TRIzol Reagent.
|
|
|
Library strategy |
RNA-Seq |
Library source |
transcriptomic |
Library selection |
cDNA |
Instrument model |
Illumina HiSeq 2000 |
|
|
Data processing |
Insert sizes were estimated using the Picard Tools (v1.79) programs SortSam.jar and CollectInsertSizeMetrics.jar on SAM files generated by Bowtie2 (v2.0.0-beta7) from a subset of 500,000 paired-end reads per sample (bowtie2 options -q --very-fast --phred33). Raw reads were mapped to the human genome sequence using TopHat v2.0.6, allowing for four mismatches per 100 bp read, a maximum edit distance of 6 per read, and one mismatch in the splice anchor region. Using Picard Tools, the alignment files were sorted by genomic coordinates, read-group data was added, and duplicate reads were marked and removed. Transcript structure assembly was performed with Cufflinks (v2.0.2) on each sample for each tissue type. The Gencode v12 annotation was used as a reference to guide assembly (--GTF-guide); additional parameters included upper-quartile normalization (--upper-quartile-norm), library type (--library-type fr-firststrand), and maximum bundle length (--max-bundle-length 7500000). Cuffmerge (v2.0.2) was then used to merge all the Cufflinks assemblies and the reference annotation into one large set of transcript structures. Splice junctions not present in the reference annotation were required to pass a Shannon entropy score threshold of 2. The mapped reads for each sample were subsampled down to 20 million using multiple runs of the Picard tool DownsampleSam, and 18 samples per tissue type were selected to create a subsampled dataset. Subsequent steps were performed separately on both subsampled and non-subsampled datasets. To calculate the expression level of each gene in each tissue type, Cuffdiff (v2.2.1) was run with default parameters and the option --library-type fr-firststrand on the subsampled read files, with the 18 samples per tissue type used as ‘replicates’ and the merged set of transcript structures used as the reference annotation.
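The Shannon entropy filter on splice junctions mentioned above can be illustrated with a minimal sketch. The record does not say how the score was computed; the code below assumes it is the base-2 Shannon entropy of the distribution of read offsets supporting a junction, and the function names and the strict "greater than" comparison are illustrative choices, not the submitters' implementation.

```python
import math
from collections import Counter


def junction_entropy(offsets):
    """Base-2 Shannon entropy of the read offsets supporting one junction.

    `offsets` is an iterable of integer positions (one per supporting read).
    Assumption: reads piling up at a single offset give entropy 0, while
    reads spread evenly over many offsets give a high score.
    """
    counts = Counter(offsets)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def passes_entropy_filter(offsets, threshold=2.0):
    """Keep a junction only if its entropy is strictly above the threshold."""
    return junction_entropy(offsets) > threshold
```

Under this assumption, a junction needs supporting reads at more than four distinct offsets to exceed a threshold of 2, since four equally used offsets give exactly log2(4) = 2.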
Gene expression for each tissue was calculated by summing all isoforms for a given gene in a given tissue from the isoforms.fpkm_tracking file generated by Cuffdiff. Genes with any isoform with a status of “HIDATA” do not have a calculated FPKM value. Gene expression for each individual was calculated by summing all isoforms for a given gene for each individual from the isoforms.read_group_tracking file generated by Cuffdiff. This file has, for each individual, one value per isoform reported for each gene. Splice junctions were identified using the JuncBASE package on reads that overlapped protein-coding genes. Only splice junctions with a Shannon entropy score greater than 2 using all subsampled reads were used. Junctions were called non-annotated if they were present in neither the Gencode v12 annotation nor the Ensembl annotation. Reported JuncBASE results were transformed into groups of mutually exclusive junction sets, each with defined length-normalized read counts (reads/100 bp) and ‘percent spliced in’ (PSI) values. Complex events (e.g. cassette exon + alternate 3’ splice site) with ambiguously mapped reads and intron retention events were filtered out.
Genome_build: hg19
Supplementary_files_format_and_content: FPKM values by individual and by tissue are reported for 389 pharmacogenes in subsampled and non-subsampled datasets; Percent Spliced In (PSI) values are reported for 389 pharmacogenes by individual in subsampled and non-subsampled datasets.
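The per-tissue gene-level summation described in the data processing notes can be sketched as follows. This is an illustration, not the submitters' script: it assumes a tab-delimited isoforms.fpkm_tracking table with a `gene_id` column and per-condition `<name>_FPKM` and `<name>_status` columns, and the condition name `Adipose` and the rows in the usage example are made up.

```python
import csv
import io


def gene_fpkm(handle, condition):
    """Sum isoform FPKM per gene for one condition.

    Returns {gene_id: total FPKM}, with None for any gene that has at
    least one isoform whose status is HIDATA (no FPKM is reported).
    """
    totals = {}
    flagged = set()
    for row in csv.DictReader(handle, delimiter="\t"):
        gene = row["gene_id"]
        if row[f"{condition}_status"] == "HIDATA":
            flagged.add(gene)
        totals[gene] = totals.get(gene, 0.0) + float(row[f"{condition}_FPKM"])
    return {g: (None if g in flagged else v) for g, v in totals.items()}


# Hypothetical three-isoform table, for illustration only.
EXAMPLE = (
    "tracking_id\tgene_id\tAdipose_FPKM\tAdipose_status\n"
    "iso1\tG1\t1.5\tOK\n"
    "iso2\tG1\t2.5\tOK\n"
    "iso3\tG2\t0\tHIDATA\n"
)

example_result = gene_fpkm(io.StringIO(EXAMPLE), "Adipose")
```

Here `example_result` maps `G1` to the sum of its two isoforms and `G2` to `None`, mirroring the rule that a gene with any HIDATA isoform gets no FPKM value.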
|
|
|
Submission date |
Jul 03, 2015 |
Last update date |
May 15, 2019 |
Contact name |
Kathleen M Giacomini |
Organization name |
University of California, San Francisco
|
Street address |
1550 4th St, Mission Bay, RH 581
|
City |
San Francisco |
State/province |
CA |
ZIP/Postal code |
94143 |
Country |
USA |
|
|
Platform ID |
GPL11154 |
Series (1) |
GSE70503 |
Transcriptomic variation of pharmacogenes in multiple human tissues and lymphoblastoid cell lines |
|
Relations |
Reanalyzed by |
GSE81474 |
BioSample |
SAMN03840207 |
SRA |
SRX1081346 |
Supplementary data files not provided |
Raw data are available in SRA |
Processed data are available on Series record |
|
|
|
|
|