Proteome-wide prediction and annotation of mitochondrial and sub-mitochondrial proteins by incorporating domain information

Mitochondrion. 2018 Sep:42:11-22. doi: 10.1016/j.mito.2017.10.004. Epub 2017 Oct 12.

Abstract

The mitochondrion is one of the most important subcellular organelles of eukaryotic cells. It carries out several biochemical functions that are vital for the cell. Defects in mitochondria also play an important role in the development and progression of different types of cancer. Knowledge of the complete mitochondrial protein repertoire is therefore essential for understanding overall mitochondrial function, maintenance, dynamics and metabolism. An automated and reliable approach that can identify mitochondrial proteins and their sub-mitochondrial location would be of great practical significance. In the present study, we report a two-level prediction method, named SubMitoPred, which predicts mitochondrial proteins (at the first level) and their sub-mitochondrial localization (at the second level). Our approach combines Pfam domain information with a support vector machine model. During training, we achieved an overall prediction accuracy of 94.37% at the first level; at the second level, prediction accuracy was 74.91% for inner membrane, 82.98% for outer membrane, 71.23% for inter-membrane space and 81.58% for matrix. Evaluation on independent data also showed better performance for SubMitoPred, and benchmarking showed that it performed better than other existing methods. We also annotated the human proteome using SubMitoPred and developed a freely accessible web server as well as standalone software for the scientific community, available at http://proteininformatics.org/mkumar/submitopred/.
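The two-level scheme described above can be pictured as a simple cascade: a protein reaches the second-level classifier only if the first level calls it mitochondrial. The following is a minimal Python sketch of that control flow only. The function name `predict_two_level` and the two callables are hypothetical and are not SubMitoPred's actual interface; how the Pfam domain information and the support vector machine are combined is not stated in the abstract, so both classifiers are passed in as opaque functions.

```python
from typing import Callable, Optional

# The four sub-mitochondrial compartments named in the abstract.
COMPARTMENTS = (
    "inner membrane",
    "outer membrane",
    "inter-membrane space",
    "matrix",
)


def predict_two_level(
    sequence: str,
    is_mitochondrial: Callable[[str], bool],  # hypothetical first-level classifier
    compartment_of: Callable[[str], str],     # hypothetical second-level classifier
) -> Optional[str]:
    """Return a compartment label, or None if the first level rejects the protein."""
    # Level 1: mitochondrial versus non-mitochondrial.
    if not is_mitochondrial(sequence):
        return None
    # Level 2: runs only for proteins accepted at level 1.
    label = compartment_of(sequence)
    if label not in COMPARTMENTS:
        raise ValueError(f"unexpected compartment label: {label!r}")
    return label
```

With stub classifiers, `predict_two_level("MKT", lambda s: True, lambda s: "matrix")` returns `"matrix"`, and a first-level rejection returns `None` without calling the second level.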

Keywords: Amino acid composition; Dipeptide composition; Pfam domain; Split amino acid composition; Support vector machine.
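The three sequence features named in the keywords can be illustrated with their commonly used definitions. The Python sketch below is an assumption-laden illustration, not the authors' code: the function names are hypothetical, and the terminal segment lengths used for split amino acid composition (`n_term=25`, `c_term=25`) are placeholder values, since the abstract does not give the parameters actually used.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def amino_acid_composition(seq: str) -> list:
    """Fraction of each standard amino acid in the sequence (20 values)."""
    seq = seq.upper()
    n = len(seq)
    return [seq.count(a) / n if n else 0.0 for a in AMINO_ACIDS]


def dipeptide_composition(seq: str) -> list:
    """Fraction of each ordered residue pair among adjacent pairs (400 values)."""
    seq = seq.upper()
    counts = {a + b: 0 for a, b in product(AMINO_ACIDS, repeat=2)}
    total = max(len(seq) - 1, 0)  # number of overlapping adjacent pairs
    for i in range(total):
        pair = seq[i:i + 2]
        if pair in counts:  # pairs containing non-standard residues are skipped
            counts[pair] += 1
    return [c / total if total else 0.0 for c in counts.values()]


def split_amino_acid_composition(seq: str, n_term: int = 25, c_term: int = 25) -> list:
    """Composition of the N-terminal, middle and C-terminal parts (60 values).

    The segment lengths are assumed values for illustration only. For sequences
    shorter than n_term + c_term the middle part is empty and the terminal
    parts overlap.
    """
    end = len(seq)
    c_start = max(end - c_term, 0)
    parts = (seq[:n_term], seq[n_term:c_start], seq[c_start:])
    features = []
    for part in parts:
        features.extend(amino_acid_composition(part))
    return features
```

Concatenating such vectors gives a fixed-length numeric representation of a protein regardless of its sequence length, which is the kind of input a support vector machine requires.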

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Computational Biology / methods*
  • Humans
  • Mitochondrial Proteins / genetics*
  • Molecular Sequence Annotation*

Substances

  • Mitochondrial Proteins