AliBaba2: context specific identification of transcription factor binding sites

Niels Grabe

In Silico Biol. 2002;2(1):S1-15.

Author

Niels Grabe¹

Affiliation

¹ Institute for Technical and Business Information Systems, Otto-von-Guericke University Magdeburg, Germany. grabe@iti.cs.uni-magdeburg.de

PMID: 11808873

Abstract

Currently, transcription factor binding sites are widely predicted using matrices collected from the literature. This leads to several problems: we cannot actively control the conservation of the matrices; we cannot systematically use all available binding sites; we do not know which sites were used and which were discarded during matrix construction; we cannot easily compare and evaluate matrices; we cannot detect redundancy; and we cannot control sensitivity and specificity. In short, we lack control over the identification process. This paper proposes a method to overcome these problems. It assumes that each binding site has an unknown context that determines its sequence, which leads to the idea of constructing specific matrices for each sequence being analysed. To do so, the identification of binding sites must be regarded as a general process that starts from a dataset of known binding sites and ends with the identification of a potential new binding site. Such a process is presented here. Besides overcoming the problems mentioned, the implementation also reaches a significantly higher accuracy than current approaches. Evaluations were performed by analysing all binding sites of TRANSFAC 3.5 public. The resulting tool, AliBaba2, is available at http://wwwiti.cs.uni-magdeburg.de/grabe/alibaba2.
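
The idea of building a matrix specifically for the sequence under analysis, rather than reusing a fixed matrix from the literature, can be illustrated with a minimal Python sketch. This is not the AliBaba2 algorithm, whose details the abstract does not give. It assumes a simple stand-in: for each window of the query sequence, known sites within a few mismatches (Hamming distance) are selected, a frequency matrix with pseudocounts is built from only those sites, and the window is scored against a uniform background. All function names, parameters, thresholds, and the toy site strings are hypothetical.

```python
import math
from collections import Counter

BASES = "ACGT"


def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def select_context_sites(window, known_sites, max_mismatches):
    """Assumed selection rule: keep known sites close to the query window."""
    return [s for s in known_sites
            if len(s) == len(window) and hamming(s, window) <= max_mismatches]


def build_matrix(sites, pseudocount=1.0):
    """Per-position base frequencies, smoothed with a pseudocount."""
    total = len(sites) + 4 * pseudocount
    matrix = []
    for i in range(len(sites[0])):
        counts = Counter(site[i] for site in sites)
        matrix.append({b: (counts[b] + pseudocount) / total for b in BASES})
    return matrix


def log_odds_score(window, matrix, background=0.25):
    """Sum of log2 ratios of matrix frequency to a uniform background."""
    return sum(math.log2(matrix[i][base] / background)
               for i, base in enumerate(window))


def scan(sequence, known_sites, width, max_mismatches=2, min_sites=2):
    """Build one matrix per window from its own selected sites, then score it."""
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        sites = select_context_sites(window, known_sites, max_mismatches)
        if len(sites) < min_sites:
            continue  # too few supporting sites to build a matrix
        matrix = build_matrix(sites)
        hits.append((start, window, log_odds_score(window, matrix), len(sites)))
    return hits


# Toy data, invented for illustration only (not taken from TRANSFAC).
toy_sites = ["GGGCGG", "GGGCGG", "GGGTGG", "TGACGT"]
hits = scan("AAGGGCGGAA", toy_sites, width=6)
for start, window, score, n_sites in hits:
    print(start, window, round(score, 2), n_sites)
```

Because the set of sites feeding each matrix is chosen per window, the sketch makes explicit which known sites contributed to a prediction, which is one of the kinds of control the abstract says fixed literature matrices lack.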

Publication types

Comparative Study

MeSH terms

Algorithms*
Binding Sites
Databases, Nucleic Acid
Databases, Protein
Software
Transcription Factors / genetics
Transcription Factors / metabolism*

Substances

Transcription Factors