GEO Accession viewer

Sample GSM1252754

Status

Public on Feb 02, 2015

Title

Carrier 16

Sample type

SRA

Source name

Whole blood, carrier

Organism

Homo sapiens

Characteristics

tissue: peripheral whole blood
age: 53.89
gender [0=male, 1=female]: 0
cag_repeats: 40
mutation_carrier_status: carrier
presymptomatic hd patient (yes, if motor score is 5 or less): no
motor_score [higher score indicates greater disease severity]: 7
hb_percentage [a proxy for the reticulocyte content of each sample measured as the ratio of hemoglobin tags versus total aligned tags per sample (0-100%)]: 73.88
tfc_score: 13
tfc_disease_stage: 1
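
Two of the characteristics above are defined by simple rules: hb_percentage is the ratio of hemoglobin tags to total aligned tags, and a carrier is flagged presymptomatic when the motor score is 5 or less. A minimal Python sketch of both rules follows. The function and parameter names are illustrative assumptions and do not come from the submitters' pipeline.

```python
def hb_percentage(hemoglobin_tags: int, total_aligned_tags: int) -> float:
    """Hemoglobin tags as a percentage (0-100) of all aligned tags in a sample."""
    if total_aligned_tags <= 0:
        raise ValueError("total_aligned_tags must be positive")
    return 100.0 * hemoglobin_tags / total_aligned_tags


def is_presymptomatic(motor_score: int, threshold: int = 5) -> bool:
    """Rule stated in the record: presymptomatic if the motor score is 5 or less."""
    return motor_score <= threshold


# This sample has motor_score 7, so the flag is False ("no" in the record).
print(is_presymptomatic(7))
```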

Extracted molecule

total RNA

Extraction protocol

Whole blood was drawn into PAXgene tubes, and total RNA was isolated using the PAX RNA isolation kit following the manufacturer's instructions, including DNase treatment.
SAGE libraries were produced according to the Illumina 3' Digital Gene Expression NlaIII protocol.

Library strategy

RNA-Seq

Library source

transcriptomic

Library selection

cDNA

Instrument model

Illumina HiSeq 2000

Description

Sample.49

Data processing

Illumina GA Pipeline Software (version 1.5.1) was used for data sequence processing.
The FASTQ files were analyzed using the open source GAPSS_B pipeline.
All sequences were trimmed to 17 base pairs, and the NlaIII recognition site (CATG) was added to the 5’ end of each sequence to create the complete 21-mer tag.
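
The tag reconstruction step can be sketched in Python as below. This is an illustration only, not the GAPSS_B implementation. It assumes that trimming keeps the first 17 bases of each read, and the names build_tag, NLAIII_SITE and TAG_BODY_LENGTH are hypothetical.

```python
NLAIII_SITE = "CATG"      # NlaIII recognition site named in the record
TAG_BODY_LENGTH = 17      # trimmed read length named in the record


def build_tag(read: str) -> str:
    """Trim a read to 17 bases and prepend CATG, giving a 21-base tag.

    Assumption: trimming keeps the first 17 bases of the read.
    """
    read = read.strip().upper()
    if len(read) < TAG_BODY_LENGTH:
        raise ValueError("read is shorter than 17 bases")
    return NLAIII_SITE + read[:TAG_BODY_LENGTH]


# A made-up 20-base read, used only to show the resulting tag length.
tag = build_tag("ACGTACGTACGTACGTACGT")
print(tag, len(tag))
```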
Sequences were aligned using the Bowtie short read aligner (version 0.12.7) against the UCSC hg19 reference genome.
A custom Perl script was used to obtain gene annotations from Biomart.
A custom Python script was used to count the tags in each Ensembl gene using the SAM output files from Bowtie.
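
The submitters' counting script is not part of this record. As a rough sketch of what per-gene tag counting from SAM output can look like, the Python below assigns each mapped alignment to every gene whose interval contains the alignment's leftmost position. The gene table layout, the function name count_tags_per_gene, and the choice to ignore strand are all assumptions made for illustration.

```python
from collections import Counter


def count_tags_per_gene(sam_lines, genes):
    """Count mapped alignments per gene.

    sam_lines: iterable of SAM text lines (header lines start with '@').
    genes: dict of gene_id -> (chrom, start, end), 1-based inclusive
           (a hypothetical annotation layout).
    """
    counts = Counter()
    for line in sam_lines:
        if line.startswith("@"):
            continue  # skip SAM header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # not a complete alignment record
        flag, chrom, pos = int(fields[1]), fields[2], int(fields[3])
        if flag & 4:
            continue  # flag bit 0x4 marks an unmapped read
        # Linear scan for clarity; a real script would use an interval index.
        for gene_id, (g_chrom, start, end) in genes.items():
            if g_chrom == chrom and start <= pos <= end:
                counts[gene_id] += 1
    return counts
```

The linear scan over genes keeps the sketch short. With tens of thousands of Ensembl genes and millions of tags, a sorted or indexed interval lookup would be the practical choice.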
Normalization steps:
library(limma)
library(edgeR)
library("org.Hs.eg.db")
library(biomaRt)
#Remove HBA1,HBA2,HBB genes
sumrow=apply(Data,1,sum)
sumroworder=order(sumrow,decreasing=T)
head(sumroworder)
Data=Data[-c(26806,12722,10749),]
#Remove low abundance genes
Data=Data[rowSums(Data)>123,]
#Get HGNC symbols from ensembl
mart = useMart("ensembl", dataset="hsapiens_gene_ensembl")
ann <- getGene(id = rownames(Data), type = "ensembl_gene_id", mart = mart)
m <- match(rownames(Data),ann[,9])
genes <- ann[m,1:8]
#Make the limma design matrix (two alternative models; the second assignment overwrites the first)
design=model.matrix(~Metadata$Motor_Score+Metadata$HB_Percentage+Metadata$Gender+Metadata$Age)
design=model.matrix(~Metadata$TFC_Disease_Stage+Metadata$HB_Percentage+Metadata$Gender+Metadata$Age)
#Do Limma
nf=calcNormFactors(Data)
y=voom(Data,design,plot=TRUE,lib.size=colSums(Data)*nf)
y$genes <- genes
fit=lmFit(y,design)
fit=eBayes(fit)
summary(decideTests(fit))
#Make a table of the genes and export it
Topmodel=topTable(fit,coef=2,n=20,sort.by="p")[,c(1,2,3,9,13)]
write.table(Topmodel, file="Topmodel.txt")
Genome_build: hg19
Supplementary_files_format_and_content: Data.txt: Tab-delimited text file containing tag counts per Ensembl gene for all 124 samples, derived from the Bowtie SAM output files.
Supplementary_files_format_and_content: Normalised_data.txt: Tab-delimited text file containing scaled data (log transformed and normalized), taken from the E component (y$E) of the voom output in the limma package for RNA-seq data analysis.
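
To make the relationship between Data.txt and Normalised_data.txt more concrete, the Python sketch below mirrors two steps of the R log above: the row-sum filter (rowSums(Data)>123) and the log2 counts-per-million transform that voom stores in y$E, computed as log2((count + 0.5) / (lib.size + 1) * 1e6). It is not the submitters' code. The normalization factors from calcNormFactors are taken as an input rather than recomputed, and the function names and the dict-of-lists layout are assumptions.

```python
import math


def filter_low_abundance(counts, min_total=123):
    """Keep genes whose summed count across samples exceeds min_total."""
    return {gene: row for gene, row in counts.items() if sum(row) > min_total}


def log2_cpm(counts, norm_factors=None):
    """voom-style log2-CPM: log2((count + 0.5) / (lib_size + 1) * 1e6).

    counts: dict of gene -> list of per-sample counts (hypothetical layout).
    norm_factors: per-sample scaling factors (for example TMM factors
                  computed elsewhere); defaults to 1.0 for every sample.
    """
    if not counts:
        return {}
    n_samples = len(next(iter(counts.values())))
    if norm_factors is None:
        norm_factors = [1.0] * n_samples
    # Effective library size = column sum times normalization factor.
    lib_sizes = [
        sum(row[j] for row in counts.values()) * norm_factors[j]
        for j in range(n_samples)
    ]
    return {
        gene: [
            math.log2((row[j] + 0.5) / (lib_sizes[j] + 1.0) * 1e6)
            for j in range(n_samples)
        ]
        for gene, row in counts.items()
    }
```

The filter is applied before the transform, as in the R log, so library sizes are computed from the retained genes only.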

Submission date

Oct 28, 2013

Last update date

May 15, 2019

Contact name

Anastasios Mastrokolias

Phone

0031715269425

Organization name

Leiden University Medical Center

Street address

LUMC, Building 2 Einthovenweg 20 2333 ZC Leiden