Evaluating protein structures determined by structural genomics consortia

Proteins. 2007 Mar 1;66(4):778-95. doi: 10.1002/prot.21165.

Abstract

Structural genomics projects are providing large quantities of new 3D structural data for proteins. To monitor the quality of these data, we have developed the protein structure validation software suite (PSVS), for assessment of protein structures generated by NMR or X-ray crystallographic methods. PSVS is broadly applicable for structure quality assessment in structural biology projects. The software integrates under a single interface analyses from several widely-used structure quality evaluation tools, including PROCHECK (Laskowski et al., J Appl Crystallog 1993;26:283-291), MolProbity (Lovell et al., Proteins 2003;50:437-450), Verify3D (Luthy et al., Nature 1992;356:83-85), ProsaII (Sippl, Proteins 1993;17: 355-362), the PDB validation software, and various structure-validation tools developed in our own laboratory. PSVS provides standard constraint analyses, statistics on goodness-of-fit between structures and experimental data, and knowledge-based structure quality scores in standardized format suitable for database integration. The analysis provides both global and site-specific measures of protein structure quality. Global quality measures are reported as Z scores, based on calibration with a set of high-resolution X-ray crystal structures. PSVS is particularly useful in assessing protein structures determined by NMR methods, but is also valuable for assessing X-ray crystal structures or homology models. Using these tools, we assessed protein structures generated by the Northeast Structural Genomics Consortium and other international structural genomics projects, over a 5-year period. Protein structures produced from structural genomics projects exhibit quality score distributions similar to those of structures produced in traditional structural biology projects during the same time period. However, while some NMR structures have structure quality scores similar to those seen in higher-resolution X-ray crystal structures, the majority of NMR structures have lower scores. Potential reasons for this "structure quality score gap" between NMR and X-ray crystal structures are discussed.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Computational Biology
  • Crystallography, X-Ray
  • Databases, Protein
  • Genomics
  • Humans
  • Models, Molecular
  • Nuclear Magnetic Resonance, Biomolecular
  • Protein Structure, Tertiary
  • Proteins / chemistry*
  • Proteins / genetics
  • Proteins / metabolism*
  • Software Design

Substances

  • Proteins