BCL::Score--knowledge based energy potentials for ranking protein models represented by idealized secondary structure elements

PLoS One. 2012;7(11):e49242. doi: 10.1371/journal.pone.0049242. Epub 2012 Nov 16.

Abstract

The topology of most experimentally determined protein domains is defined by the relative arrangement of secondary structure elements, i.e. α-helices and β-strands, which make up 50-70% of the sequence. Pairing of β-strands defines the topology of β-sheets. The packing of side chains between α-helices and β-sheets defines the majority of the protein core. Often, limited experimental datasets restrain the position of secondary structure elements while lacking detail with respect to loop or side chain conformation. At the same time the regular structure and reduced flexibility of secondary structure elements make these interactions more predictable when compared to flexible loops and side chains. To determine the topology of the protein in such settings, we introduce a tailored knowledge-based energy function that evaluates arrangement of secondary structure elements only. Based on the amino acid C(β) atom coordinates within secondary structure elements, potentials for amino acid pair distance, amino acid environment, secondary structure element packing, β-strand pairing, loop length, radius of gyration, contact order and secondary structure prediction agreement are defined. Separate penalty functions exclude conformations with clashes between amino acids or secondary structure elements and loops that cannot be closed. Each individual term discriminates for native-like protein structures. The composite potential significantly enriches for native-like models in three different databases of 10,000-12,000 protein models in 80-94% of the cases. The corresponding application, "BCL::ScoreProtein," is available at www.meilerlab.org.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Algorithms
  • Bayes Theorem
  • Computational Biology / methods*
  • Models, Molecular*
  • Protein Structure, Secondary
  • Protein Structure, Tertiary
  • Proteins / chemistry*
  • Rotation
  • Thermodynamics

Substances

  • Proteins