A novel hybrid gene prediction method employing protein multiple sequence alignments

Bioinformatics. 2011 Mar 15;27(6):757-63. doi: 10.1093/bioinformatics/btr010. Epub 2011 Jan 6.

Abstract

Motivation: As improved DNA sequencing techniques have enormously increased the speed of producing new eukaryotic genome assemblies, the further development of automated gene prediction methods continues to be essential. While the classification of proteins into families relies heavily on correct gene predictions, it can at the same time provide an additional source of information for the prediction, complementary to the sources presently used.

Results: We extended the gene prediction software AUGUSTUS by a method that employs block profiles generated from multiple sequence alignments as a protein signature to improve the accuracy of the prediction. Equipped with profiles modelling human dynein heavy chain (DHC) proteins and other families, AUGUSTUS was run on the genomic sequences known to contain members of these families. Compared with the ab initio version of AUGUSTUS, the rate of genes predicted with high accuracy increased dramatically.
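To make the idea of a block profile concrete, the following is a minimal sketch and not the AUGUSTUS implementation. It assumes that a block is an ungapped stretch of a multiple sequence alignment, that each column is summarized by log-odds scores against a uniform background with a fixed pseudocount, and that sequences contain only the 20 standard amino acids. The function names (build_block_profile, score_window, best_hit) and the toy block are hypothetical and chosen for illustration only.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_block_profile(block, pseudocount=1.0):
    """Turn an ungapped alignment block into per-column log-odds scores.

    Assumptions for this sketch: uniform background frequencies and a
    constant pseudocount; only the 20 standard residues are allowed.
    """
    width = len(block[0])
    if any(len(seq) != width for seq in block):
        raise ValueError("all sequences in a block must have the same length")
    background = 1.0 / len(AMINO_ACIDS)
    profile = []
    for col in range(width):
        # Start every residue at the pseudocount, then add observed counts.
        counts = {aa: pseudocount for aa in AMINO_ACIDS}
        for seq in block:
            counts[seq[col]] += 1.0
        total = sum(counts.values())
        profile.append(
            {aa: math.log((counts[aa] / total) / background) for aa in AMINO_ACIDS}
        )
    return profile


def score_window(profile, window):
    """Sum the column scores of a peptide window as wide as the profile."""
    return sum(column[aa] for column, aa in zip(profile, window))


def best_hit(profile, protein):
    """Return (start, score) of the best-scoring window, or None if too short."""
    width = len(profile)
    best = None
    for start in range(len(protein) - width + 1):
        score = score_window(profile, protein[start:start + width])
        if best is None or score > best[1]:
            best = (start, score)
    return best


# Toy block of four aligned family members (hypothetical data).
toy_block = ["GKST", "GKST", "GKTT", "GRST"]
toy_profile = build_block_profile(toy_block)
hit = best_hit(toy_profile, "MAAGKSTLLA")
```

In this sketch, a candidate translation that contains a window resembling the family block receives a high score at that position. A gene finder could use such a score as additional evidence when choosing between alternative gene structures; how AUGUSTUS actually integrates the profiles into its model is described in the paper, not here.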

Availability: The AUGUSTUS project web page is located at http://augustus.gobics.de, with the executable program as well as the source code available for download.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Amino Acid Sequence
  • Animals
  • Computational Biology / methods
  • Dyneins / genetics
  • Electronic Data Processing / methods*
  • Exons
  • Humans
  • Models, Genetic
  • Multigene Family
  • Sequence Alignment*
  • Sequence Analysis, Protein / methods*
  • Software*

Substances

  • Dyneins