Chemical composition is maintained in poorly conserved intrinsically disordered regions and suggests a means for their classification

Mol Biosyst. 2012 Oct 30;8(12):3262-73. doi: 10.1039/c2mb25202c.

Abstract

Intrinsically disordered regions in proteins are known to evolve rapidly while maintaining their function. However, given their lack of structure and sequence conservation, the means through which they stay functional is not clear. Poor sequence conservation also hampers the classification of these regions into functional groups. We studied the sequence conservation of a large number of predicted and experimentally determined intrinsically disordered regions from the human proteome in 7 other eukaryotes. We determined the chemical composition of disordered regions by calculating the fraction of positive, negative, polar, hydrophobic and special (Pro, Gly) residues, and studied its maintenance in orthologous proteins. A significant number of disordered regions with low sequence conservation showed considerable similarity in their chemical composition between orthologs. Clustering disordered regions based on their chemical composition resulted in functionally distinct groups. Finally, disordered regions showed location preference within the proteins that was dependent on their chemical composition. We conclude that preserving the overall chemical composition is one of the ways through which intrinsically disordered regions maintain their flexibility and function through evolution. We propose that the chemical composition of disordered regions can be used to classify them into functional groups and, together with conservation and location, may be used to define a general classification scheme.
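As a rough illustration of the composition measure described above, the sketch below computes the fraction of positive, negative, polar, hydrophobic and special residues in a sequence and compares two such profiles. Only the "special" class (Pro, Gly) is stated in the abstract. The assignment of the other 18 amino acids to classes, the function names, and the distance measure (half the sum of absolute differences) are assumptions made for illustration, not the authors' published procedure.

```python
from collections import Counter

# Residue classes. Only "special" = Pro, Gly comes from the abstract;
# the remaining assignments are an assumed, commonly used grouping.
RESIDUE_CLASSES = {
    "positive": "KRH",
    "negative": "DE",
    "polar": "STNQCY",
    "hydrophobic": "AVLIMFW",
    "special": "PG",
}

# Reverse lookup: one-letter amino acid code -> class name.
AA_TO_CLASS = {aa: cls for cls, aas in RESIDUE_CLASSES.items() for aa in aas}


def chemical_composition(sequence):
    """Return the fraction of residues in each class.

    Characters outside the 20 standard amino acids (e.g. 'X', '-')
    are skipped and do not count toward the total.
    """
    counts = Counter()
    total = 0
    for aa in sequence.upper():
        cls = AA_TO_CLASS.get(aa)
        if cls is None:
            continue
        counts[cls] += 1
        total += 1
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return {cls: counts[cls] / total for cls in RESIDUE_CLASSES}


def composition_distance(comp_a, comp_b):
    """Half the L1 distance between two composition profiles.

    An assumed similarity measure: 0.0 means identical composition,
    1.0 means the two profiles share no class at all.
    """
    return sum(abs(comp_a[c] - comp_b[c]) for c in RESIDUE_CLASSES) / 2
```

Under these assumptions, two hypothetical regions such as `"KKDD"` and `"DKDK"` differ at the sequence level yet have a composition distance of 0, which is the kind of case the abstract describes: low sequence conservation with maintained chemical composition. The same five-value profiles could also serve as feature vectors for clustering regions into groups.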

MeSH terms

  • Amino Acid Motifs
  • Amino Acid Sequence
  • Animals
  • Conserved Sequence*
  • Evolution, Molecular
  • Humans
  • Hydrophobic and Hydrophilic Interactions
  • Protein Conformation
  • Protein Folding
  • Protein Structure, Tertiary
  • Proteins / chemistry*
  • Proteome
  • Sequence Analysis, Protein
  • Structure-Activity Relationship

Substances

  • Proteins
  • Proteome