Maximum contact map overlap revisited

J Comput Biol. 2011 Jan;18(1):27-41. doi: 10.1089/cmb.2009.0196.

Abstract

Among the measures for quantifying the similarity between three-dimensional (3D) protein structures, maximum contact map overlap (CMO) has received sustained attention over the past decade. Despite this, the known algorithms exhibit modest performance and are not applicable to large-scale comparison. This article offers a clear advance in this respect. We present a new integer programming model for CMO and propose an exact branch-and-bound algorithm with bounds obtained by a novel Lagrangian relaxation. The efficiency of the approach is demonstrated on a popular small benchmark (the Skolnick set, 40 domains). On this set, our algorithm significantly outperforms the best existing exact algorithms, and many hard CMO instances are solved for the first time. To further assess our approach, we constructed a large-scale set of 300 protein domains. Computing the similarity measure for each of the 44,850 pairs, we obtained a classification in excellent agreement with SCOP. Supplementary Material is available at www.liebertonline.com/cmb.
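
For readers unfamiliar with the measure, the sketch below illustrates what CMO computes, under the usual definition: each protein is reduced to a contact map (a set of residue pairs that are spatially close), and the score is the largest number of contacts that can be matched by an order-preserving alignment of residues. This is an exhaustive search that is only feasible for toy inputs. It is not the integer programming model or the Lagrangian branch-and-bound algorithm of the article, and the function name and example contact maps are hypothetical. The last line checks that 300 domains yield the 44850 unordered pairs quoted in the abstract.

```python
from itertools import combinations
from math import comb


def contact_map_overlap(n1, contacts1, n2, contacts2):
    """Brute-force CMO for tiny inputs (illustrative only, exponential time).

    n1, n2: number of residues in each protein (residues are 0-indexed).
    contacts1, contacts2: iterables of residue index pairs in contact.
    Returns the maximum number of contacts shared under any
    order-preserving one-to-one alignment of residues.
    """
    # Store every contact as a sorted pair so lookups are orientation-free.
    e1 = {tuple(sorted(c)) for c in contacts1}
    e2 = {tuple(sorted(c)) for c in contacts2}
    best = 0
    for k in range(2, min(n1, n2) + 1):
        # Choosing k residues on each side and pairing them in increasing
        # order enumerates every order-preserving alignment of size k.
        for rows in combinations(range(n1), k):
            for cols in combinations(range(n2), k):
                mapping = dict(zip(rows, cols))
                shared = sum(
                    1
                    for (i, j) in e1
                    if i in mapping
                    and j in mapping
                    and (mapping[i], mapping[j]) in e2
                )
                best = max(best, shared)
    return best


# Hypothetical toy contact maps, not taken from the article's benchmarks.
protein_a = [(0, 2), (1, 3)]
protein_b = [(0, 3), (1, 4), (2, 4)]
print(contact_map_overlap(4, protein_a, 5, protein_b))  # prints 2

# Number of unordered pairs among 300 domains.
print(comb(300, 2))  # prints 44850
```

Because the alignment must preserve residue order, a nested pair of contacts such as (0, 3), (1, 2) cannot both be matched to a crossing pair such as (0, 2), (1, 3); at most one contact is shared in that case.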

Publication types

  • Comparative Study
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Computer Simulation
  • Databases, Protein
  • Models, Molecular*
  • Protein Conformation*
  • Proteins / chemistry*
  • Structural Homology, Protein

Substances

  • Proteins