| The following method definition consists of one or more |
| protocols used in The SNP Consortium project by the |
| participating laboratory SANGER. |
| See the TSC website (http://snp.cshl.org/cgi-bin/protocol) |
| for a complete list of TSC protocols. |
| --------------------------------------- |
| SNP production protocol TSCM0019 ( |
| http://snp.cshl.org/cgi-bin/protocol?id=TSCM0019) |
| Title: Library preparation and sequencing for PvuII enzyme |
| Lab: Sanger Center, Cambridge |
| Contact: snp@sanger.ac.uk |
| Description: |
| A panel of 24 DNAs was digested with PvuII, size |
| fractionated on an agarose gel, and cloned into pUC18-based vectors. |
| Size fractions were taken at 1.0-1.1kb, 1.1-1.2kb, 1.2-1.4kb, 1.4-1.6kb, |
| 1.6-1.8kb, 1.8-2.1kb, 2.1-2.5kb, 2.5-3.0kb, 3.0-3.5kb and 3.5-4.0kb, |
| from which libraries were constructed. Samples from these libraries were |
| named so that the insert size can be identified. The sample names |
| start with p1_0, p1_1, p1_3, p1_5, p1_7, p1_9, p2_3, p2_7, p3_2 and |
| p3_7, corresponding in order to the size fractions listed above. Sequences were |
| obtained primarily from ABI 3700 capillary sequencers. Base-calling |
| was performed with Phred, which provides the quality values upon which |
| the SNP detection is based. |
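The naming scheme above can be written as a lookup table. This is a minimal sketch, not part of the TSC protocol: the prefixes and size fractions are taken from the description, while the function name `insert_size_range_kb` and the example sample name are hypothetical.

```python
# Sample-name prefixes mapped to their size fractions in kb,
# in the order given in the protocol description.
SIZE_FRACTIONS_KB = {
    "p1_0": (1.0, 1.1),
    "p1_1": (1.1, 1.2),
    "p1_3": (1.2, 1.4),
    "p1_5": (1.4, 1.6),
    "p1_7": (1.6, 1.8),
    "p1_9": (1.8, 2.1),
    "p2_3": (2.1, 2.5),
    "p2_7": (2.5, 3.0),
    "p3_2": (3.0, 3.5),
    "p3_7": (3.5, 4.0),
}


def insert_size_range_kb(sample_name):
    """Return the (min, max) insert size in kb for a sample name, or None.

    Hypothetical helper: it only inspects the leading prefix of the name.
    """
    for prefix, size_range in SIZE_FRACTIONS_KB.items():
        if sample_name.startswith(prefix):
            return size_range
    return None


# "p2_3_example" is an invented sample name used only for illustration.
print(insert_size_range_kb("p2_3_example"))
```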
| --------------------------------------- |
| SNP detection protocol TSCM0024 ( |
| http://snp.cshl.org/cgi-bin/protocol?id=TSCM0024) |
| Title: SNPs from reads detected against finished public human genomic sequence |
| Lab: Sanger Center, Cambridge |
| Contact: snp@sanger.ac.uk |
| Description: |
| All reads were clipped of sequencing vector and low quality ends, |
| which set a usable read length for each read. The clipped reads |
| were screened for repetitive sequence with RepeatMasker, using |
| the default human settings. Only reads with >=80 non-repetitive |
| bases and >=100 bases of Phred quality (Q) >=30 were used in this |
| analysis. The reads were cross-matched against the finished |
| public human genomic sequence available via anonymous |
| ftp at snp.cshl.org in the file pub/SNP/nrClean21-FEB-00. All |
| reads that aligned to genomic sequence within 40bp of the start |
| and 10bp of the end of their usable read length were assembled |
| with the matching region of genomic sequence (Q was set to 40 for |
| finished genomic sequence). High quality base discrepancies |
| (Q>=23) were identified as candidate SNPs. Further restrictions |
| on each candidate SNP were that its 5 neighbouring bases on either |
| side all had Phred quality values of >=15, that at least 9 of the |
| 10 neighbours matched, and that the trace chromatogram at the SNP |
| base location did not have an underlying base signal greater than |
| 50% of the called base. If the number of detected SNPs in one |
| clique was greater than 4 or the depth of the assembly (not |
| including the genomic sequence) was greater than 8, then all SNPs |
| were discarded for that clique. |
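The numeric thresholds in this description can be sketched as a few small predicates. This is an illustrative sketch under stated assumptions, not the Sanger pipeline itself: all function names and data layouts are hypothetical, the alignment is assumed to be gap-free, alignment coordinates are assumed 0-based and end-exclusive, "within" is read as inclusive, the 10 neighbours are taken as 5 bases on each side, and sites too close to a read end to have a full neighbourhood are rejected. The trace chromatogram check (secondary signal no greater than 50% of the called base) needs raw trace data and is not modelled.

```python
def read_passes(quals, repeat_mask):
    """Read-level filter: >=80 non-repetitive bases and >=100 bases with Q>=30.

    quals: Phred quality per base of the clipped read.
    repeat_mask: True where the base was masked as repetitive (assumed layout).
    """
    non_repetitive = sum(1 for masked in repeat_mask if not masked)
    high_quality = sum(1 for q in quals if q >= 30)
    return non_repetitive >= 80 and high_quality >= 100


def alignment_spans_read(aln_start, aln_end, usable_len):
    """Alignment must begin within 40bp of the start and finish within
    10bp of the end of the usable read length (coordinates are assumptions)."""
    return aln_start <= 40 and (usable_len - aln_end) <= 10


def passes_neighbourhood(read, ref, quals, i, flank=5):
    """Candidate-SNP filter at aligned position i.

    read, ref: equal-length, gap-free aligned sequences (assumption).
    quals: Phred quality per read base.
    """
    if i < flank or i + flank >= len(read):
        return False  # assumption: no full neighbourhood, so reject
    if read[i] == ref[i] or quals[i] < 23:
        return False  # not a high quality discrepancy
    neighbours = [j for j in range(i - flank, i + flank + 1) if j != i]
    if any(quals[j] < 15 for j in neighbours):
        return False  # every neighbouring base needs Q>=15
    matches = sum(1 for j in neighbours if read[j] == ref[j])
    return matches >= 9  # at least 9 of the 10 neighbours match


def keep_clique(n_snps, depth):
    """Discard all SNPs in a clique with more than 4 SNPs or depth above 8
    (depth excludes the genomic sequence)."""
    return n_snps <= 4 and depth <= 8


# Toy example: a single mismatch at index 6 with uniformly high quality.
read = "ACGTACGTACGTA"
ref = "ACGTACTTACGTA"
print(passes_neighbourhood(read, ref, [30] * len(read), 6))
```

The predicates are kept separate so that each threshold can be checked in isolation; a real pipeline would apply them in the order the description gives.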