Series GSE214825
Status Public on Oct 08, 2022
Title Deciphering the Impact of Genetic Variation on Human Polyadenylation using APARENT2
Organism Homo sapiens
Experiment type Other
Summary Genetic variants that disrupt polyadenylation can cause or contribute to genetic disorders. Yet, due to the complex cis-regulation of polyadenylation, variant interpretation remains challenging. Here, we introduce a residual neural network model, APARENT2, that can infer 3’-cleavage and polyadenylation from DNA sequence more accurately than any previous model. This model generalizes to the case of alternative polyadenylation (APA) for a variable number of polyadenylation signals. We demonstrate APARENT2’s performance on several variant datasets, including functional reporter data and human 3’ aQTLs from GTEx. We apply neural network interpretation methods to gain insights into disrupted or protective higher-order features of polyadenylation. We fine-tune APARENT2 on human tissue-resolved transcriptomic data to elucidate tissue-specific variant effects. By combining APARENT2 with models of mRNA stability, we extend aQTL effect size predictions to the entire 3’ untranslated region. Finally, we perform in-silico saturation mutagenesis of all human polyadenylation signals and compare the predicted effects of >44 million variants against gnomAD. While loss-of-function variants were generally selected against, we also find specific clinical conditions linked to gain-of-function mutations. For example, we detect an association between gain-of-function mutations in the 3’-end and Autism Spectrum Disorder. To experimentally validate APARENT2’s predictions, we assayed clinically relevant variants in multiple cell lines, including microglia-derived cells.
Reference and variant APA libraries were cloned into a reporter. APA profiling data were obtained from RNA-seq of 3 different cell lines (HEK293T, SK-N-SH, and HMC3) with 2 replicates of each library. The polyadenylation cleavage position was determined for mapped reads, and UMIs were then collapsed to determine the measured log odds ratio of cleavage for each variant.
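The UMI collapsing and log odds ratio steps described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the read layout `(member, umi, cut_pos)`, the function names, and the pseudocount are all assumptions made for the example, and the record does not state how the ratio was actually computed.

```python
import math
from collections import Counter


def collapse_umis(reads):
    """Collapse duplicate reads so each UMI is counted once.

    `reads` is an iterable of (member, umi, cut_pos) tuples. This layout is
    hypothetical and is not taken from the GEO record.
    Returns a Counter mapping (member, cut_pos) -> number of distinct UMIs.
    """
    unique = set(reads)  # identical (member, umi, cut_pos) reads count once
    return Counter((member, cut_pos) for member, _umi, cut_pos in unique)


def cleavage_log_odds_ratio(var_site, var_other, ref_site, ref_other,
                            pseudocount=0.5):
    """Log odds of cleavage at a site for the variant minus the reference.

    *_site  : UMI count cleaved at the site of interest
    *_other : UMI count cleaved anywhere else
    The pseudocount is an assumption, added only to avoid log(0).
    """
    var_logit = math.log((var_site + pseudocount) / (var_other + pseudocount))
    ref_logit = math.log((ref_site + pseudocount) / (ref_other + pseudocount))
    return var_logit - ref_logit
```

A positive value would indicate that the variant increases cleavage at the site relative to its reference sequence, and a negative value that it decreases it.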
 
Overall design To experimentally validate APARENT2 predictions, we measured the impact of 100 clinically relevant variants (including Autism case variants) in a plasmid reporter MPRA in HEK293T, SK-N-SH and HMC3 cell lines.
 
Contributor(s) Linder J, Koplik SE, Kundaje A, Seelig G
Citation(s) 36335397
Submission date Oct 05, 2022
Last update date Nov 29, 2022
Contact name Samantha Koplik
E-mail(s) skoplik@uw.edu
Organization name University of Washington
Department Electrical and Computer Engineering
Lab Georg Seelig
Street address 185 Stevens Way
City Seattle
State/province Washington
ZIP/Postal code 98195
Country USA
 
Platforms (1)
GPL15520 Illumina MiSeq (Homo sapiens)
Samples (12)
GSM6616355 HEK293T cells, APA Variant library, Replicate 1
GSM6616356 HEK293T cells, APA Variant library, Replicate 2
GSM6616357 HEK293T cells, APA Reference library, Replicate 1
Relations
BioProject PRJNA887334

Download family                      Format
SOFT formatted family file(s)        SOFT
MINiML formatted family file(s)      MINiML
Series Matrix File(s)                TXT

Supplementary file                                                     Size      Download      File type/resource
GSE214825_apa_100_variants_library_metadata.txt.gz                     1.0 Kb    (ftp)(http)   TXT
GSE214825_apa_100_variants_rev2_20220621_hek293_v3_umi_mut_0.csv.gz    47.5 Kb   (ftp)(http)   CSV
GSE214825_apa_100_variants_rev2_20220621_hmc3_v3_umi_mut_0.csv.gz      41.7 Kb   (ftp)(http)   CSV
GSE214825_apa_100_variants_rev2_20220621_sknsh_v3_umi_mut_0.csv.gz     48.0 Kb   (ftp)(http)   CSV
SRA Run Selector
Raw data are available in SRA
Processed data are available on Series record
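
The processed supplementary files are gzip-compressed CSV tables. The record does not describe their column layout, so the sketch below makes no assumption about column names and simply returns the header and the first few rows for inspection. The helper name `read_gzipped_csv` and the UTF-8 encoding are assumptions made for this example.

```python
import csv
import gzip


def read_gzipped_csv(path, max_rows=5):
    """Return (header, first rows) from a gzip-compressed CSV file.

    Only the first `max_rows` data rows are read, which is enough to
    inspect the column layout before doing any real analysis.
    """
    with gzip.open(path, mode="rt", newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle)
        header = next(reader, [])  # empty list if the file has no rows
        rows = [row for _, row in zip(range(max_rows), reader)]
    return header, rows
```

After downloading one of the files listed above, a call such as `read_gzipped_csv("GSE214825_apa_100_variants_rev2_20220621_hek293_v3_umi_mut_0.csv.gz")` would show which columns are present.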
