Dendritic cells (DC) are mononuclear phagocytes which exhibit a dendritic morphology and excel at naïve T cell activation. DC encompass several subsets initially identified by their expression of specific cell surface molecules and later shown to possess distinct functions. DC subset differentiation is guided by different transcription factors and cytokines. Identifying DC subsets is challenging because very few cell surface molecules are uniquely expressed on any one of these cell populations, and conventional flow cytometry analysis using a limited number of antigens is biased and potentially misleading. Moreover, the antigens currently used to define mononuclear phagocyte subsets vary depending on the tissue and animal species studied, and even between laboratories. This has led to confusion in defining the identity of myeloid cell subsets across tissues and between species. Here we report a comparative genomics strategy that enables universal definition of DC subsets and other myeloid cell types across species. We have developed BubbleGUM, a novel, simple and user-friendly software tool which generates and integrates gene signatures for high throughput gene set enrichment analysis. We illustrate the use of BubbleGUM by re-analyzing three concatenated public datasets of blood/spleen and skin/cutaneous lymph node myeloid cell subsets in humans and in mice. This analysis demonstrates the equivalence between human and mouse skin XCR1+ DCs, and between mouse and human Langerhans cells.
Overall design: Recent studies have identified multiple DC-like populations in human skin with overlapping phenotypes but with distinct transcriptome profiles, functions, and lineage relationships with other tissue DCs in humans and mice (Haniffa et al. Immunity. 2012. PMID 22795876; Chu et al. JEM. 2012. PMID 22547651; Artyomov et al. JEM. 2015. PMID 25918340), leading to a certain level of confusion in the field. In this study, using comparative genomics, we aimed to clarify these conflicting reports and define murine and human skin mononuclear phagocyte subsets, their intra-species tissue equivalents and inter-species homologs. To achieve this goal, we performed high throughput module (gene set enrichment) meta-analyses with our newly released BubbleGUM software (Spinelli et al. BMC Genomics. 2015. PMID 26481321) on a number of independent public datasets for mononuclear phagocyte populations in the blood or spleen, and in the skin or cutaneous lymph node, of humans and mice. This allowed us to rigorously identify DC subsets, monocytes and macrophages in these tissues and to align them across species.
The Mouse Gene 1.0 ST (GPL6246) CEL files were processed through Bioconductor in the R statistical environment (version 3.0.2). Quality control of the array hybridization (NUSE plot) and normalization of the raw Affymetrix expression data with Robust Multi-array Average (RMA; Irizarry et al.) were performed using the oligo package (Matrix1). PCA was performed using the ade4 package to remove the dataset effect visible on the first principal component (Matrix2). The Illumina HumanHT-12 v4.0 (GPL10558) and Illumina HumanWG-6 v3.0 (GPL6884) raw data files were processed through Bioconductor in the R statistical environment (version 3.0.2). Gene expression signals from GSE60317 and GSE35457 were merged by their common probes. Quantile normalization (Bolstad, 2004) was applied to the merged expression arrays using the lumi package, and expression values were then log2-transformed. Gene expression signals from GSE66355 were already background corrected and quantile normalized. The noise threshold was estimated at 5 based on the density of all gene expression signals, and all values below this threshold were replaced by the threshold. Expression values were then log2-transformed to be comparable to the two other datasets. The three datasets were merged by their common probes and quantile normalization was applied again (Matrix3). PCA was performed to remove the dataset effect visible on the first two principal components (Matrix4). Human and mouse datasets were then merged based on identification of orthologous genes using the Ensembl BioMart software, with selection of “one-to-one” orthology relationships only (Matrix5).
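Several of the processing steps described above (noise flooring followed by log2 transformation, merging datasets by common probes, quantile normalization, and selection of one-to-one orthologs) can be illustrated with a minimal sketch. This is not the authors' code: the original workflow was run in R with the oligo, lumi and ade4 packages, whereas the example below uses plain Python for illustration only. All function names and the toy data layout (a dict mapping probe ID to a list of per-sample values) are assumptions, and the quantile normalization shown here ignores tie handling, which production implementations treat more carefully.

```python
import math
from collections import Counter


def floor_and_log2(values, threshold=5.0):
    """Replace values below the noise threshold by the threshold, then log2-transform."""
    return [math.log2(max(v, threshold)) for v in values]


def merge_on_common_probes(dataset_a, dataset_b):
    """Keep only probes present in both datasets and concatenate their samples.

    Each dataset is a dict: probe ID -> list of expression values (one per sample).
    """
    common = sorted(set(dataset_a) & set(dataset_b))
    return {probe: dataset_a[probe] + dataset_b[probe] for probe in common}


def quantile_normalize(samples):
    """Simplified quantile normalization (no special treatment of ties).

    `samples` is a list of equally long lists, one per sample. Every sample is
    mapped onto the same reference distribution: the mean of the sorted values
    across samples at each rank.
    """
    n = len(samples[0])
    sorted_samples = [sorted(s) for s in samples]
    reference = [sum(s[i] for s in sorted_samples) / len(samples) for i in range(n)]
    normalized = []
    for s in samples:
        order = sorted(range(n), key=lambda i: s[i])  # indices from lowest to highest value
        out = [0.0] * n
        for rank, idx in enumerate(order):
            out[idx] = reference[rank]
        normalized.append(out)
    return normalized


def one_to_one_orthologs(pairs):
    """Keep (human_gene, mouse_gene) pairs in which each gene occurs exactly once."""
    human_counts = Counter(h for h, _ in pairs)
    mouse_counts = Counter(m for _, m in pairs)
    return [(h, m) for h, m in pairs if human_counts[h] == 1 and mouse_counts[m] == 1]
```

In this sketch, a pre-normalized dataset would first pass through `floor_and_log2`, then be combined with the others via `merge_on_common_probes`, and the merged samples would be passed to `quantile_normalize`. The PCA-based removal of dataset effects is not shown, since it depends on choices (centering, number of components removed) that are specific to each matrix.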