Method Detail
Submitter Method Handle: EVA_DECODE
Submitted method description:
Paired-end libraries for sequencing were prepared according to the manufacturer's instructions (Illumina, TruSeq™).
In short, approximately 1 µg of genomic DNA, isolated from frozen blood samples, was fragmented to a mean target size of 300 bp
using a Covaris E210 instrument. The resulting fragmented DNA was end repaired using T4 and Klenow polymerases and T4 polynucleotide
kinase with 10 mM dNTP followed by addition of an 'A' base at the ends using Klenow exo fragment (3′ to 5′-exo minus) and dATP (1 mM).
Sequencing adaptors containing 'T' overhangs were ligated to the DNA products followed by agarose (2%) gel electrophoresis. Fragments
of about 400-500 bp were isolated from the gels (QIAGEN Gel Extraction Kit), and the adaptor-modified DNA fragments were PCR enriched
for ten cycles using Phusion DNA polymerase (Finnzymes Oy) and a PCR primer cocktail (Illumina). Enriched libraries were further
purified using AMPure XP beads (Beckman-Coulter). The quality and concentration of the libraries were assessed with the Agilent 2100
Bioanalyzer using the DNA 1000 LabChip (Agilent). Barcoded libraries were stored at −20 °C. All steps in the workflow were monitored
using an in-house laboratory information management system with barcode tracking of all samples and reagents. Template DNA fragments were
hybridized to the surface of flow cells (GA PE cluster kit (v2) or HiSeq PE cluster kits (v2.5 or v3)) and amplified to form clusters
using the Illumina cBot. In brief, DNA (2.5–12 pM) was denatured, followed by hybridization to grafted adaptors on the flow cell.
Isothermal bridge amplification using Phusion polymerase was then followed by linearization of the bridged DNA, denaturation, blocking
of 3′ ends and hybridization of the sequencing primer. Sequencing-by-synthesis (SBS) was performed on Illumina GAIIx and/or HiSeq 2000
instruments. Paired-end libraries were sequenced at 2 × 101 (HiSeq) or 2 × 120 (GAIIx) cycles of incorporation and imaging using
the appropriate TruSeq™ SBS kits. Each library or sample was initially run on a single GAIIx lane for QC validation followed by further
sequencing on either GAIIx (≥4 lanes) or HiSeq (≥1 lane) with targeted raw cluster densities of 500–800 k/mm², depending on the version
of the data imaging and analysis packages (SCS2.6-2-9/RTA1.6-1.9, HCS1.3.8-1.4.8/RTA1.10.36- Real-time analysis involved conversion
of image data to base-calling in real-time. Reads were aligned to NCBI Build 36 (hg18) of the human reference sequence using Burrows-Wheeler
Aligner (BWA) 0.5.7-0.5.9 (ref. 16). Alignments were merged into a single BAM file and marked for duplicates using Picard 1.55
( Only non-duplicate reads were used for the downstream analyses. Resulting BAM files were realigned and
recalibrated using GATK version 1.2-29-g0acaf2d (refs. 8, 17). Multi-sample calling was performed with GATK version 2.3.9 using all the 2,636 BAM
files together.
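
The restriction to non-duplicate reads can be illustrated with a minimal sketch. It assumes the duplicates have already been marked in the standard SAM way, by setting the 0x400 bit of the FLAG field, and that records are available as SAM text lines. The function name and the example records are hypothetical and are not part of the submitted pipeline.

```python
DUPLICATE_FLAG = 0x400  # SAM FLAG bit for "PCR or optical duplicate"


def non_duplicate_read_names(sam_lines):
    """Yield the names of reads whose duplicate bit is not set.

    Illustrative filter only; the submitter's actual tooling is not shown here.
    """
    for line in sam_lines:
        if line.startswith("@"):  # skip SAM header lines
            continue
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])  # the second SAM column is the bitwise FLAG
        if not flag & DUPLICATE_FLAG:
            yield fields[0]


# Hypothetical records: read2 carries flag 1123 = 99 + 1024 (duplicate bit set).
example_sam = [
    "@HD\tVN:1.4\tSO:coordinate",
    "read1\t99\tchr1\t100\t60\t101M\t=\t300\t301\t*\t*",
    "read2\t1123\tchr1\t100\t60\t101M\t=\t300\t301\t*\t*",
    "read3\t163\tchr1\t150\t60\t101M\t=\t350\t301\t*\t*",
]
kept = list(non_duplicate_read_names(example_sam))
print(kept)
```

Marked duplicates stay in the BAM file; downstream steps simply skip any record with this bit set.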
Genotype calls made solely on the basis of next generation sequence data yield errors at a rate that decreases as a function of sequencing
depth. Thus, for example, if sequence reads at a heterozygous SNP position carry one copy of the alternative allele and seven copies of
the reference allele, then without further information the genotype would be called homozygous for the reference allele. To minimize the
number of such errors, we used information about haplotype sharing, taking advantage of the fact that all the sequenced individuals had
also been chip-typed and long range phased (Figure 2). Extending the previous example, if the individual shares a haplotype with another
who is heterozygous given his sequence reads, then the ambiguous individual would be called as heterozygous. Conversely, if the individual
shares both his haplotypes with others who are homozygous for the major allele his genotype would be called homozygous. In order to improve
genotype quality and to phase the sequencing genotypes, an iterative algorithm based on the IMPUTE HMM model 2 which uses the LRP haplotypes
was employed. Co-ordinates from genome build 36 (GCF_000001405.12) were lifted over to builds 37 (GCA_000001405.14) and 38 (GCA_000001405.17)
using the liftover tool from UCSC and the default options.
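
The read-count example above (one alternative-allele read and seven reference reads) can be worked through with a simple binomial genotype-likelihood model. This is only a sketch under stated assumptions: the per-read error rate of 1% and the prior weights standing in for haplotype-sharing information are hypothetical values chosen for illustration, and the code is not the IMPUTE-based iterative algorithm described in the text.

```python
from math import comb


def genotype_likelihoods(n_ref, n_alt, error_rate=0.01):
    """Binomial likelihood of the observed read counts under each genotype.

    error_rate is an assumed per-read miscall probability (hypothetical value).
    """
    n = n_ref + n_alt
    # Probability that a single read shows the alternative allele, per genotype.
    p_alt = {"hom_ref": error_rate, "het": 0.5, "hom_alt": 1.0 - error_rate}
    return {
        g: comb(n, n_alt) * p ** n_alt * (1.0 - p) ** n_ref
        for g, p in p_alt.items()
    }


def call_genotype(likelihoods, prior=None):
    """Return the most probable genotype and the normalised posterior."""
    if prior is None:
        prior = {g: 1.0 for g in likelihoods}  # flat prior: reads only
    weighted = {g: likelihoods[g] * prior[g] for g in likelihoods}
    total = sum(weighted.values())
    posterior = {g: w / total for g, w in weighted.items()}
    return max(posterior, key=posterior.get), posterior


likelihoods = genotype_likelihoods(n_ref=7, n_alt=1)

# Reads alone: homozygous reference is the most likely genotype.
reads_only_call, _ = call_genotype(likelihoods)

# Hypothetical prior standing in for "shares a haplotype with a heterozygous carrier".
sharing_prior = {"hom_ref": 0.05, "het": 0.90, "hom_alt": 0.05}
informed_call, _ = call_genotype(likelihoods, sharing_prior)

print(reads_only_call, informed_call)
```

Under these assumptions the reads alone favour the homozygous reference genotype, while the same reads combined with evidence that the individual shares a haplotype with a heterozygous carrier favour the heterozygous call. The outcome is sensitive to the assumed error rate and prior.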

This method was used in the following submission:

Submitter Handle Batch Type Submitter batch id Release build id