|
Status |
Public on Sep 24, 2021 |
Title |
sci-RNA-seq on F121-6-CASTx129 differentiated to neuronal stem cell - 4DNEXPGR1CVZ |
Sample type |
SRA |
|
|
Source name |
biosource_summary: F121-6-CASTx129 differentiated to neuronal stem cell
|
Organism |
Mus musculus |
Characteristics |
cell_line: F121-6-CASTx129
mouse_strain: 129/Sv X Cast
tissue: neuronal stem cell
modifications_summary: None
description: Neural precursor cells clonally derived from day 11 F121-6 cells from an embryoid body differentiation time course experiment
Sex: female
treatments_summary: None
biosample_type: in vitro differentiated cells
biosource_vendor: Gilbert Lab
url: https://data.4dnucleome.org/biosamples/4DNBSSCN52FF/
|
Growth protocol |
description: Protocol for culturing F123 CASTx129 hybrid mouse cells
download: https://data.4dnucleome.org/protocols/1d39b581-9200-4494-8b24-3dc77d595bbb/@@download/attachment/4DN_F123_SOP_170425.pdf
description: Differentiation of mouse stem cells to embryoid bodies and subsequent NPC differentiation
download: https://data.4dnucleome.org/protocols/21a7a355-e3cf-43fc-a383-c41aa0952f02/@@download/attachment/Mouse%20NPC%20differentiation%20protocol.pdf
|
Extracted molecule |
polyA RNA |
Extraction protocol |
url: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4836442/
description: Cao et al. Comprehensive Single-Cell Transcriptional Profiling of a Multicellular Organism. PMID: 28818938
|
|
|
Library strategy |
RNA-Seq |
Library source |
transcriptomic |
Library selection |
cDNA |
Instrument model |
Illumina NextSeq 500 |
|
|
Description |
lab: Christine Disteche, UW
award: 1U54DK107979-01
4DN accession: 4DNEXPGR1CVZ
strandedness: forward
submitted_by: Giancarlo Bonora
experiment_type: sci-RNA-seq
library_prep_kit: home-made sci-RNA-seq protocol
contributing_labs: William Noble, UW
biosample_quantity: 10000000 cells
fragment_size_range: 300-1500
fragmentation_method: tagmentation
average_fragment_size: 350
fragment_size_selection_method: SPRI beads
url: https://data.4dnucleome.org/experiments-seq/4DNEXPGR1CVZ/
|
Data processing |
See the file "TREE.sci-RNA-seq.txt" for an overview of the folders and files described herein.
sci-RNA-seq libraries were processed using a pipeline from the Trapnell lab (Cao J et al. Comprehensive single-cell transcriptional profiling of a multicellular organism. Science. 2017;357:661–7), but with reads aligned to an N-masked C57BL/6J reference genome (mm10) in which every SNP locus (129 or cast for the F123, F121, and ES_Tsix-stop cell lines; B6 or spret for the Patski cell line) was substituted with an N to reduce mapping bias. The pipeline produces CellDataSet (CDS) files that contain gene-by-cell count matrices of unique molecular identifiers (UMIs) per gene, which were used for all downstream analysis. The folder "non-allelic_sci_RNA-seq_count_matrices" contains 3-column sparse versions of the resulting CDS count matrices. Because d0 male and female cells were generated separately in their own plates, they had to be processed separately, and a separate d0 CDS file was generated. There are therefore two subfolders ("d0" and "d3-11andNPCs") that contain the CDS count matrices ("UMI.count.matrix.gz") along with associated cell IDs ("cell.annotations") and gene IDs ("gene.annotations").
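As an illustration of how a 3-column sparse count matrix of this kind could be read, here is a minimal Python sketch. The column order (gene index, cell index, UMI count), the 1-based indexing, and whitespace-delimited fields are assumptions, not details confirmed by this record; check them against "TREE.sci-RNA-seq.txt" and the annotation files before use. The function names are hypothetical.

```python
from collections import defaultdict


def triplets_to_counts(lines, one_based=True):
    """Build {cell_index: {gene_index: umi_count}} from 3-column text lines.

    Assumed layout (not confirmed by the record): gene index, cell index, count.
    """
    counts = defaultdict(dict)
    offset = 1 if one_based else 0
    for line in lines:
        fields = line.split()
        if len(fields) != 3:
            continue  # skip blank or malformed lines
        gene = int(fields[0]) - offset
        cell = int(fields[1]) - offset
        umi = int(fields[2])
        # Sum duplicates in case the same (gene, cell) pair appears twice.
        counts[cell][gene] = counts[cell].get(gene, 0) + umi
    return dict(counts)


def umis_per_cell(counts):
    """Total UMI count for each cell."""
    return {cell: sum(genes.values()) for cell, genes in counts.items()}


# Toy input standing in for the decompressed lines of "UMI.count.matrix.gz".
example = ["1\t1\t3", "2\t1\t1", "2\t2\t5"]
matrix = triplets_to_counts(example)
totals = umis_per_cell(matrix)
```

For a real file, the lines could be supplied with `gzip.open(path, "rt")` from the standard library, and the resulting indices mapped to rows of "gene.annotations" and "cell.annotations".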
The CDS was further filtered to include only cells with at least 200 UMIs and then to exclude cells with extreme UMI counts (more than 2 standard deviations from the mean of the log-transformed counts). The resulting matrix of counts can be found in the file "non-allelic_sci_RNA-seq_counts_filetered/non-allelic_sci-RNA-seq_mouseDiff_Nmasked.UMImatrix.filtered.200UMIs.final.tsv.gz" (see manuscript Additional file 1: Table S4).
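The two-step cell filter described above can be sketched as follows. This is an illustration under stated assumptions, not the submitters' code: the log base (log10), the use of the sample standard deviation, and computing the mean and standard deviation only over cells that already passed the 200-UMI threshold are all choices made here for the example. The function name `filter_cells` is hypothetical.

```python
import math
import statistics


def filter_cells(umi_totals, min_umis=200, n_sd=2.0):
    """Keep cells with >= min_umis UMIs whose log10 total lies within
    n_sd standard deviations of the mean log10 total."""
    # Step 1: minimum UMI threshold.
    kept = {cell: n for cell, n in umi_totals.items() if n >= min_umis}
    if len(kept) < 2:
        return kept  # a standard deviation needs at least two cells
    # Step 2: outlier removal on the log scale (log base is an assumption).
    logs = {cell: math.log10(n) for cell, n in kept.items()}
    mean = statistics.mean(logs.values())
    sd = statistics.stdev(logs.values())
    return {cell: kept[cell] for cell, v in logs.items()
            if abs(v - mean) <= n_sd * sd}


# Toy data: nine typical cells, one extreme cell, one low-coverage cell.
example_totals = {f"c{i}": 1000 for i in range(9)}
example_totals["big"] = 1_000_000
example_totals["small"] = 50
kept_cells = filter_cells(example_totals)
```

In this toy input the low-coverage cell fails the 200-UMI threshold and the extreme cell falls outside the 2-standard-deviation window, so only the nine typical cells remain.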
Mapped single-end reads were segregated to their allele of origin if the read contained at least one SNP specific to exactly one of the parental strains. Reads containing no SNPs or containing SNPs belonging to both parental strains were discarded. The sci-RNA-seq preprocessing pipeline described above was adapted to handle allelically segregated reads. The folder "allelic_sci-RNA-seq_count_matrices" contains the resulting 3-column versions of the CDSs ("UMI.count.matrix.gz") within subfolders for each of the two separate libraries ("d0" and "d3-11andNPCs"). Associated cell IDs ("cellIDs.unique.txt") are contained in the folder "allelic_sci-ATAC-seq_count_matrices".
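The allele-assignment rule above can be illustrated with a small sketch. It makes simplifying assumptions that do not come from this record: reads are treated as ungapped alignments (no indels or splicing), SNP positions are 0-based reference coordinates, and a base matching neither parental allele at a SNP position is ignored. The function name and the SNP table layout are hypothetical.

```python
def assign_allele(read_start, read_seq, snps):
    """Return "129", "cast", or None (read discarded).

    snps maps a 0-based reference position to a tuple
    (base on the 129 allele, base on the cast allele).
    """
    hits = set()
    for offset, base in enumerate(read_seq):
        alleles = snps.get(read_start + offset)
        if alleles is None:
            continue  # not a SNP position
        if base == alleles[0]:
            hits.add("129")
        elif base == alleles[1]:
            hits.add("cast")
        # Bases matching neither allele are ignored (assumption).
    # Keep the read only if all informative SNPs agree on one strain.
    if len(hits) == 1:
        return next(iter(hits))
    return None


# Toy SNP table and reads.
snps = {5: ("A", "G"), 8: ("C", "T")}
read_129 = assign_allele(3, "TTATTCAA", snps)       # both SNPs match 129
read_cast = assign_allele(3, "TTGTT", snps)         # one SNP, matches cast
read_conflict = assign_allele(3, "TTGTTCAA", snps)  # SNPs disagree
read_no_snp = assign_allele(20, "ACGT", snps)       # overlaps no SNP
```

Reads with conflicting SNPs and reads overlapping no SNP both return `None`, which corresponds to the discarded categories described above.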
For female (F121) cells, the allelic data were further filtered (as was done for the non-allelic data), and the counts for each allele of each chromosome were concatenated for allelic trajectory analysis and topic modeling. Only genes expressed from one allele in at least 10 cells and cells with at least 10 UMIs in chr1 and in chrX were included. The resulting count matrices are saved to the folder "allelic_sci-RNA-seq_catenated_count_matrices" with one file per chromosome (e.g. "allelic_sci-RNA-seq.UMImatrix.filtered.minExpr0.1.minCells5.F121.chr1.stringentlyFiltered.min_counts_per_chromosome10.tsv.gz").
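A rough sketch of the concatenation and filtering steps follows. The interpretation is an assumption in two places: a gene is kept if at least one allele is detected (count > 0) in the minimum number of cells, and a cell's per-chromosome UMI total is summed over both alleles. The data layout and function names are hypothetical and are not taken from the deposited files.

```python
def concat_alleles(allele_counts):
    """{allele: {cell: {gene: umi}}} -> {cell: {(gene, allele): umi}}."""
    merged = {}
    for allele, cells in allele_counts.items():
        for cell, genes in cells.items():
            row = merged.setdefault(cell, {})
            for gene, umi in genes.items():
                row[(gene, allele)] = umi
    return merged


def genes_passing(allele_counts, min_cells=10):
    """Genes detected from at least one allele in >= min_cells cells."""
    keep = set()
    for cells in allele_counts.values():
        n_cells = {}
        for genes in cells.values():
            for gene, umi in genes.items():
                if umi > 0:
                    n_cells[gene] = n_cells.get(gene, 0) + 1
        keep.update(g for g, n in n_cells.items() if n >= min_cells)
    return keep


def cells_passing(per_chrom, required=("chr1", "chrX"), min_umis=10):
    """Cells with >= min_umis allelic UMIs on every required chromosome."""
    passing = None
    for chrom in required:
        ok = {cell for cell, row in per_chrom.get(chrom, {}).items()
              if sum(row.values()) >= min_umis}
        passing = ok if passing is None else passing & ok
    return passing or set()


# Toy allelic counts for two chromosomes.
chr1 = {"129": {"c1": {"g1": 6}, "c2": {"g1": 1}},
        "cast": {"c1": {"g1": 5, "g2": 2}}}
chrX = {"129": {"c1": {"x1": 12}, "c2": {"x1": 20}}, "cast": {}}
merged = {"chr1": concat_alleles(chr1), "chrX": concat_alleles(chrX)}
good_cells = cells_passing(merged)
good_genes = genes_passing(chr1, min_cells=2)  # small threshold for the toy data
```

Keying each feature by `(gene, allele)` keeps the two alleles as separate columns in the concatenated matrix, which is one simple way to preserve allele identity for downstream analysis.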
|
|
|
Submission date |
Sep 22, 2021 |
Last update date |
Sep 24, 2021 |
Contact name |
4DN DCIC |
E-mail(s) |
support@4dnucleome.org
|
Organization name |
4D Nucleome - Data Coordination and Integration Center
|
Street address |
10 Shattuck St
|
City |
Boston |
State/province |
MA |
ZIP/Postal code |
02115 |
Country |
USA |
|
|
Platform ID |
GPL19057 |
Series (2) |
GSE184554 |
Single-cell landscape of nuclear configuration and gene expression during stem cell differentiation and X inactivation |
GSE184602 |
4DNESCX7WHJ1 - sci-RNA-seq on mESCs differentiated to embryoid body |
|
Relations |
BioSample |
SAMN21435454 |
SRA |
SRX12299951 |