Conotoxin superfamily prediction using diffusion maps dimensionality reduction and subspace classifier

Curr Protein Pept Sci. 2011 Sep;12(6):580-8. doi: 10.2174/138920311796957702.

Abstract

Conotoxins are small, disulfide-rich peptides that target ion channels and neuronal receptors, and they have been demonstrated to be potent pharmaceuticals in the treatment of Alzheimer's disease, Parkinson's disease, and epilepsy. Accurate prediction of the conotoxin superfamily would have many important applications in understanding their biological and pharmacological functions. In this study, a novel method, named dHKNN, is developed to predict the conotoxin superfamily. First, we extract sequence-derived features of each peptide, comprising physicochemical properties, evolutionary information, predicted secondary structure, and amino acid composition. Second, we use diffusion maps for dimensionality reduction; this technique interprets the eigenfunctions of Markov matrices as a system of coordinates on the original data set in order to obtain an efficient representation of the data's geometry. Finally, an improved K-local hyperplane distance nearest neighbor (HKNN) subspace classifier, called dHKNN, is proposed for predicting conotoxin superfamilies by taking into account local density information in the diffusion space. An overall accuracy of 91.90% is obtained in a jackknife cross-validation test on a benchmark dataset, indicating that the proposed dHKNN method is promising.
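
The dimensionality-reduction step can be illustrated with a minimal diffusion-map sketch. This is not the authors' implementation: the abstract does not state the kernel, its bandwidth, or the number of retained coordinates, so the Gaussian kernel, the `epsilon` and `t` parameters, the function name `diffusion_map`, and the toy one-dimensional data are all assumptions made for illustration. The sketch builds a row-stochastic Markov matrix from pairwise affinities and uses its leading non-trivial eigenvectors as coordinates.

```python
import numpy as np

def diffusion_map(X, n_components=2, epsilon=1.0, t=1):
    """Return (coords, eigenvalues) of a basic diffusion map.

    Assumptions (not from the abstract): Gaussian kernel with
    bandwidth `epsilon`, diffusion time `t`.
    """
    X = np.asarray(X, dtype=float)
    # Pairwise squared Euclidean distances, clipped at zero for safety.
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (X @ X.T), 0.0)
    # Gaussian affinities and their row sums (degrees).
    K = np.exp(-d2 / epsilon)
    deg = K.sum(axis=1)
    # The Markov matrix is P = D^-1 K. Its symmetric conjugate
    # S = D^-1/2 K D^-1/2 has the same eigenvalues and is easier to solve.
    S = K / np.sqrt(np.outer(deg, deg))
    vals, vecs = np.linalg.eigh(S)
    order = np.argsort(vals)[::-1]          # sort eigenvalues descending
    vals, vecs = vals[order], vecs[:, order]
    # Right eigenvectors of P are D^-1/2 times the eigenvectors of S.
    psi = vecs / np.sqrt(deg)[:, None]
    # Drop the trivial constant eigenvector (eigenvalue 1) and scale
    # the remaining ones by eigenvalue**t to get diffusion coordinates.
    coords = psi[:, 1:n_components + 1] * vals[1:n_components + 1] ** t
    return coords, vals

# Toy data (hypothetical): two well-separated groups on a line.
X = np.array([[0.0], [0.1], [0.2], [0.3], [0.4],
              [3.0], [3.1], [3.2], [3.3], [3.4]])
coords, vals = diffusion_map(X, n_components=2, epsilon=1.0)
```

On this toy input the first diffusion coordinate takes opposite signs on the two groups, which is the kind of geometry-preserving representation a neighbor-based classifier could then operate on. In practice the rows of `X` would be the per-peptide feature vectors, and `epsilon` would need tuning.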

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Amino Acid Sequence
  • Amino Acids / chemistry*
  • Computational Biology / methods*
  • Conotoxins / chemistry*
  • Conotoxins / classification
  • Cysteine / chemistry
  • Protein Structure, Secondary
  • Reproducibility of Results

Substances

  • Amino Acids
  • Conotoxins
  • Cysteine