Status |
Public on Jul 07, 2013 |
Title |
Small RNA-seq of undiseased human brain |
Organism |
Homo sapiens |
Experiment type |
Non-coding RNA profiling by high throughput sequencing
|
Summary |
The surprising observation that virtually the entire human genome is transcribed means we know very little about the function of many emerging classes of RNAs, except their astounding diversity. Traditional RNA function prediction methods rely on sequence or alignment information, which is limited in its ability to classify classes of non-coding RNAs (ncRNAs). To address this, we developed CoRAL, a machine learning-based approach for classification of RNA molecules. CoRAL uses biologically interpretable features including fragment length, cleavage specificity, and antisense transcription to distinguish between different ncRNA classes. We evaluated CoRAL using genome-wide small RNA sequencing (smRNA-seq) datasets from two human tissue types (brain and skin [GSE31037]), and were able to classify six different types of RNA transcripts with 79-80% accuracy in cross-validation experiments, and with 71-73% accuracy when CoRAL uses one tissue type for training and the other for validation. Analysis by CoRAL revealed that long intergenic ncRNAs, small cytoplasmic RNAs, and small nuclear RNAs show more tissue specificity, while microRNAs, small nucleolar RNAs, and transposon-derived RNAs are highly discernible and consistent across the two tissue types. The ability to consistently annotate loci across tissue types demonstrates the potential of CoRAL to characterize ncRNAs using smRNA-seq data in less characterized organisms.
|
|
|
Overall design |
Four samples were sequenced, each one coming from frozen brain tissue (frontal cortex) of a deceased female human patient with no remarkable pathology.
|
|
|
Contributor(s) |
Wang L, Gregory BD |
Citation(s) |
24149843 |
Submission date |
Jan 08, 2013 |
Last update date |
May 15, 2019 |
Contact name |
Paul Ryvkin |
Organization name |
University of Pennsylvania
|
Department |
Pathology
|
Lab |
Li-San Wang's Lab
|
Street address |
423 Guardian Dr, 1404 Blockley Hall
|
City |
Philadelphia |
State/province |
PA |
ZIP/Postal code |
19104 |
Country |
USA |
|
|
Platforms (1) |
GPL9115 |
Illumina Genome Analyzer II (Homo sapiens) |
|
Samples (4)
|
|
Relations |
BioProject |
PRJNA185476 |
SRA |
SRP017809 |
Supplementary file | Size | Download | File type/resource |
GSE43335_RAW.tar | 34.8 Mb | (http)(custom) | TAR (of TXT) |
GSE43335_class_pri.txt.gz | 275 b | (ftp)(http) | TXT |
GSE43335_smrna_locus.txt.gz | 1.4 Mb | (ftp)(http) | TXT |
Raw data are available in SRA |
Processed data provided as supplementary file |
Processed data are available on Series record |