NCBI
dbSNP

Method Detail
Submitter Method Handle: 1000GENOMES
Submitter Method ID: EXOME_CAPTURE
Submitted method description:
Deep coverage exome capture sequencing supports reliable discovery
of rare variants, including those which appear only once or twice across
all individuals sequenced.
Exome capture sequencing was performed for 1128 individuals in
HapMap and 1000 Genomes population samples by four sequencing
centers, Beijing Genome Institute, Baylor College of Medicine
Human Genome Sequencing Center, Broad Institute and Washington
University Genome Sequencing Center, using either the NimbleGen
SeqCap_EZ_Exome_v2 (BGI, BCM) or Agilent SureSelect_All_Exon_V2
(BI, WUGSC) exome capture reagents. All sequencing was done from
lymphoblastoid cell line DNA. Sequencing was considered complete
when at least 70% of the target region showed 20x or greater depth
of coverage in mapped reads. Target regions for analysis are the
intersection of the two capture reagent target regions with CCDS
coding exons, plus 50 bp flanking regions. This totals approximately
47 Mb, of which 62% is coding exon sequence. Exact boundaries in GRCh37
sequence coordinates are given in the file:
/ftp-trace.ncbi.nlm.nih.gov/1000genomes/ftp/technical/reference/exome_pull_down_targets/20110426_exome_add50bp.consensus.bed
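The target definition and the completeness criterion described above can be illustrated with a minimal Python sketch. It is not the pipeline used by the sequencing centers. The function names, the half-open interval convention, and the choice to add the 50 bp flanks after intersecting are all assumptions made for illustration.

```python
from typing import Iterable, List, Tuple

Interval = Tuple[int, int]  # half-open [start, end) on one chromosome, as in BED


def merge(intervals: Iterable[Interval]) -> List[Interval]:
    """Sort intervals and merge any that overlap or abut."""
    merged: List[Interval] = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def intersect(a: Iterable[Interval], b: Iterable[Interval]) -> List[Interval]:
    """Return the regions present in both interval sets."""
    a, b = merge(a), merge(b)
    out: List[Interval] = []
    i = j = 0
    while i < len(a) and j < len(b):
        start = max(a[i][0], b[j][0])
        end = min(a[i][1], b[j][1])
        if start < end:
            out.append((start, end))
        # Advance whichever interval ends first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return out


def add_flanks(intervals: Iterable[Interval], flank: int = 50) -> List[Interval]:
    """Extend each interval by `flank` bp on both sides, then re-merge."""
    return merge((max(0, s - flank), e + flank) for s, e in intervals)


def sequencing_complete(depths: Iterable[int],
                        min_depth: int = 20,
                        min_fraction: float = 0.70) -> bool:
    """True if at least `min_fraction` of target bases reach `min_depth`."""
    depths = list(depths)
    if not depths:
        return False
    covered = sum(1 for d in depths if d >= min_depth)
    return covered / len(depths) >= min_fraction


# Toy coordinates, invented for illustration only.
nimblegen = [(100, 300)]
agilent = [(200, 400)]
ccds_exons = [(150, 250), (500, 600)]
targets = add_flanks(intersect(intersect(nimblegen, agilent), ccds_exons))
```

In practice the per-base depths passed to `sequencing_complete` would be restricted to the target intervals; a real analysis would also handle multiple chromosomes.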
Illumina sequence reads were mapped at Broad Institute using BWA
and at Boston College using Mosaik. AB SOLiD sequence reads were
mapped at Baylor using Bfast and at Boston College using Mosaik.
All reads were mapped to the GRCh37 human genome reference
sequence, without additional decoy sequences. The final site list is the
union of calls from both sequencing technologies, each filtered
separately using a support vector machine (SVM).
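As a rough illustration of how a union site list can be formed from two separately filtered call sets, here is a short Python sketch. The `Call` record and its `passed_filter` flag are hypothetical stand-ins for the per-technology SVM filter outcome; the actual filtering and file formats are not described in this record.

```python
from typing import Iterable, List, NamedTuple, Set, Tuple

Site = Tuple[str, int, str, str]  # (chrom, pos, ref, alt)


class Call(NamedTuple):
    chrom: str
    pos: int
    ref: str
    alt: str
    passed_filter: bool  # hypothetical: result of that technology's own filter


def passing_sites(calls: Iterable[Call]) -> Set[Site]:
    """Keep only the sites that passed the technology-specific filter."""
    return {(c.chrom, c.pos, c.ref, c.alt) for c in calls if c.passed_filter}


def union_site_list(illumina: Iterable[Call], solid: Iterable[Call]) -> List[Site]:
    """Union of filtered sites; chromosome names sort as plain strings."""
    return sorted(passing_sites(illumina) | passing_sites(solid))


# Toy calls, invented for illustration only.
illumina_calls = [Call("1", 1000, "A", "G", True), Call("1", 2000, "C", "T", False)]
solid_calls = [Call("1", 1000, "A", "G", True), Call("1", 3000, "G", "A", True)]
sites = union_site_list(illumina_calls, solid_calls)
```

A site needs to pass the filter in only one technology to appear in the union, which is what distinguishes this from an intersection-based consensus.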
While formatting the data for dbSNP submission, I observed that the
single ALT allele shown in the 1000 Genomes Phase 1 integrated
genotypes sometimes differs from the one found in the contributing call
sets. For consistency, the allele from the integrated genotypes is
shown here.
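The allele rule stated above can be written as a small Python sketch. The function name and return shape are assumptions for illustration: the integrated-genotype ALT allele is always the one reported, and disagreement with any contributing call set is only flagged.

```python
from typing import Iterable, Tuple


def resolve_alt(integrated_alt: str, call_set_alts: Iterable[str]) -> Tuple[str, bool]:
    """Return the ALT allele to report and whether any call set disagreed.

    The integrated allele always wins; the flag merely records a mismatch.
    """
    disagrees = any(alt != integrated_alt for alt in call_set_alts)
    return integrated_alt, disagrees


# Toy alleles, invented for illustration only.
reported, mismatch = resolve_alt("T", ["T", "G"])
```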
Data availability:
The mapped sequence reads are indexed in file:
/ftp-trace.ncbi.nlm.nih.gov/1000genomes/ftp/alignment_indices/20110521.exome.alignment.index
The original SNP calls from individual centers, the filtered union
site lists and integrated genotypes are currently found in directories:
/ftp-trace.ncbi.nlm.nih.gov/1000genomes/ftp/technical/working/
    20110721_exome_call_sets
    20110810_exome_consensus_snps
    20120117_new_phase1_intgrated_genotypes (sic)

This method was used in the following submission:

Submitter Handle  Batch Type  Submitter batch id            Release build id
1000GENOMES       Assay       phase_1_exome_sites_20110521  136