This study evaluates gene expression and its regulation in human pancreatic islets, a tissue relevant to the study of genetic risk factors contributing to diabetes. We obtained islets from deceased donors and generated data from genome-wide SNP chips, bulk RNA-seq, microRNA (miRNA)-seq, whole genome sequencing, DNA methylation (methyl)-seq, transcription initiation profiling by cap analysis of gene expression (CAGE)-seq, single-cell RNA-seq, and single-nucleus ATAC-seq. These data include ATAC-seq of two islet subjects, RNA-seq of 31 additional subjects, genome-wide chip genotypes, and imputed genotypes of the 33 subjects released with phs001188.v1.

For genotyping, 500-1000 islet equivalents (IEQ) were cultured as described in Gershengorn (Science, 2004, PMID: 15564314), and genomic DNA was isolated from the islet cultures. Genotyping on the Illumina Omni2.5M array was performed at the NHGRI Genomics Core facility. Genotypes were imputed using the HRC.r1.1.2016 reference panel.

For RNA analyses, 2500-5000 IEQ from each islet source were used for bulk or single-cell RNA isolation. Messenger RNA was isolated with TRIzol extraction, and 12-plex libraries were generated using the Illumina TruSeq directional mRNA-seq library protocol. Bulk RNA sequencing was performed on HiSeq2000/HiSeq2500 sequencers using paired-end reads at the NIH Intramural Sequencing Center (NISC). miRNA libraries were prepared from total RNA from 68 samples, pooled, and sequenced as 50 bp single-end reads on an Illumina HiSeq2500. CAGE libraries were prepared from total RNA samples using the nAnT-iCAGE protocol at DNAFORM, Japan, and sequenced at NISC on the HiSeq2000 sequencer. scRNA-seq libraries were generated using the 10X Genomics platform and sequenced on an Illumina HiSeq3000 at the Genomics Technology Core of the National Institute of Arthritis and Musculoskeletal and Skin Diseases (NIAMS).

To assess regions of open chromatin in islets, we performed bulk ATAC-seq on HiSeq2000 sequencers using paired-end reads at NISC. Single-nucleus ATAC-seq libraries were prepared using the single-cell combinatorial indexing (sci-) ATAC-seq protocol and sequenced on an Illumina NextSeq using paired-end reads.

More than 90% of the loci associated with type 2 diabetes (T2D) through genome-wide association studies occur in non-coding regions, suggesting a strong regulatory component to disease susceptibility. There is therefore a critical need to understand the full spectrum of genetic variation and regulatory element usage in T2D-relevant tissues. To that end, this study contains whole genome sequence and whole genome bisulfite sequence and/or Illumina MethylationEPIC array data, providing a comprehensive survey of both individual genetic variation and DNA methylation across different tissues from multiple individuals. In addition, we sequenced single-cell RNA (two subjects) and single nuclei (one subject) to characterize gene expression and chromatin accessibility in islets.

Study Inclusion/Exclusion Criteria

Islet samples from deceased donors were obtained from the Integrated Islet Distribution Program (IIDP), the National Disease Research Interchange (NDRI), and ProdoLabs.

Molecular Data
Type | Source | Platform | Number of Oligos/SNPs | SNP Batch Id | Comment
Whole Genome Genotyping | Illumina | Infinium Omni2.5 BeadChip | N/A | N/A | Samples from pancreatic islets
Imputed Genotyping | Haplotype Reference Consortium | HRC.r1.1.2016 (GRCh37/hg19) | N/A | N/A | Pre-phased with Eagle v2.3, imputed with Minimac3
RNA Sequencing | Illumina | HiSeq 2000 | N/A | N/A | Samples from pancreatic islets: Poly-A Selected - Paired End
Bulk ATAC Sequencing | Illumina | HiSeq 2500 | N/A | N/A | Samples from pancreatic islets: Assay for Transposase-Accessible Chromatin with high throughput sequencing (ATAC Sequencing) to assess open chromatin regions
Methylation | Illumina | Infinium MethylationEPIC | N/A | N/A | Samples from pancreatic islets
miRNA Sequencing | Illumina | TruSeq Small RNA Sample Prep Kit | N/A | N/A | Samples from pancreatic islets; Sequencing with Illumina HiSeq 2500
CAGE Sequencing | DNAFORM | nAnT-iCAGE | N/A | N/A | Samples from pancreatic islets
scRNA Sequencing | 10x Genomics | Chromium Single Cell 3' v2 | N/A | N/A | Samples from pancreatic islets; Sequencing with Illumina HiSeq 3000
snATAC Sequencing | Illumina | NextSeq 550 | N/A | N/A | Samples from pancreatic islets: Single-cell-combinatorial-indexing ATAC-seq (sci-ATAC-seq)
Whole Genome Sequencing | Illumina | HiSeq X | N/A | N/A | Samples from pancreatic islets
WGBS Sequencing | Illumina | HiSeq 2500 V4 1T | N/A | N/A | Samples from pancreatic islets
