| The TSC reads (available from ftp.snp.cshl.org) were clipped using ssahaCLIP |
| (ftp://ftp.ensembl.org/pub/traces/homo_sapiens/clip/README-clip; reads are also available |
| as ftp://ftp.ensembl.org/pub/traces/homo_sapiens/clip/cshl-human-wgs-*.clip.gz), |
| which set a usable read length for each read. The reads were aligned using SsahaSNP to |
| build 30 of the NCBI Reference Sequences. Quality values were not used for this genomic |
| sequence. Where reads matched the genome more than once, the best quality match was used. |
| High quality base discrepancies (Q>=23) were identified as candidate SNPs. Further |
| restrictions on each candidate SNP were that the 5 neighbouring bases on each side all had |
| Phred quality values of >=15 and that at least 9 of the 10 neighbours matched. |
| When both the Q>=23 threshold and the neighbourhood restrictions are met, the base then |
| meets the Neighbourhood Quality Standard (NQS). |
| If the ratio (number of SNPs detected for a read)/(number of NQS aligned bases for that |
| read) exceeded 15 SNPs per 1 kb of sequence, all SNPs were discarded for that read |
| alignment. |
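
The NQS test and the per-read SNP density filter described above can be sketched as follows. This is a minimal illustration, not the SsahaSNP implementation: the function and constant names are hypothetical, the window is assumed to be 5 bases on each side of the candidate (10 neighbours in total), positions too close to a read end to have a full window are assumed to fail, and "NQS aligned bases" is taken to mean every aligned position that passes the NQS test, whether or not it matches the reference.

```python
from typing import List, Sequence

SNP_MIN_Q = 23           # quality threshold at the candidate base (from the text)
FLANK_MIN_Q = 15         # quality threshold for every neighbouring base (from the text)
FLANK = 5                # assumption: 5 neighbours on each side, 10 in total
MIN_FLANK_MATCHES = 9    # at least 9 of the 10 neighbours must match
MAX_SNPS_PER_KB = 15.0   # reads above this SNP density lose all their SNPs


def meets_nqs(quals: Sequence[int], matches: Sequence[bool], i: int) -> bool:
    """Return True if aligned read position i meets the NQS.

    quals[j]   -- Phred quality of read base j
    matches[j] -- True if read base j agrees with the reference
    """
    lo, hi = i - FLANK, i + FLANK
    if lo < 0 or hi >= len(quals):
        return False  # assumption: an incomplete window fails the test
    if quals[i] < SNP_MIN_Q:
        return False
    neighbours = [j for j in range(lo, hi + 1) if j != i]
    if any(quals[j] < FLANK_MIN_Q for j in neighbours):
        return False
    return sum(1 for j in neighbours if matches[j]) >= MIN_FLANK_MATCHES


def filter_read_snps(quals: Sequence[int], matches: Sequence[bool]) -> List[int]:
    """Return the read positions kept as candidate SNPs for one alignment."""
    nqs_positions = [i for i in range(len(quals)) if meets_nqs(quals, matches, i)]
    snps = [i for i in nqs_positions if not matches[i]]
    if not nqs_positions:
        return []
    snps_per_kb = 1000.0 * len(snps) / len(nqs_positions)
    # Discard every SNP on the read if the density exceeds the threshold.
    return [] if snps_per_kb > MAX_SNPS_PER_KB else snps
```

For example, a 200-base read with uniformly high quality and a single mismatch keeps its SNP, while a 100-base read with two mismatches exceeds 15 SNPs per 1 kb of NQS bases and loses both.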