A syntactic model to design and verify synthetic genetic constructs derived from standard biological parts

Bioinformatics. 2007 Oct 15;23(20):2760-7. doi: 10.1093/bioinformatics/btm446. Epub 2007 Sep 5.

Abstract

Motivation: The sequence of an artificial genetic construct is composed of multiple functional fragments, or genetic parts, involved in different molecular steps of gene expression. Biologists have deciphered structural rules that the design of a genetic construct must follow to ensure successful completion of the gene expression process, but these rules have not been formalized, making it difficult for non-specialists to benefit from recent progress in gene synthesis.

Results: We show that context-free grammars (CFG) can formalize these design principles. This approach provides a path to organizing libraries of genetic parts according to their biological functions, which correspond to the syntactic categories of the CFG. It also provides a framework for the systematic design of new genetic constructs consistent with the design principles expressed in the CFG. Using parsing algorithms, this syntactic model enables the verification of existing constructs. We illustrate these possibilities by describing a CFG that generates the most common architectures of genetic constructs in Escherichia coli.
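As an illustration of the idea, the following minimal Python sketch checks whether an ordered list of parts can be derived from a small grammar. The grammar, the part names, and the function names (`GRAMMAR`, `PARTS`, `library_by_category`, `is_valid`) are hypothetical and chosen only for this example; they are not the grammar or the parser described in the article. The sketch assumes a grammar without left recursion or empty productions.

```python
from functools import lru_cache

# Hypothetical toy grammar (NOT the grammar from the article).
# Keys are non-terminals; symbols that are not keys are part categories.
GRAMMAR = {
    "construct": [["cassette"], ["cassette", "construct"]],
    "cassette": [["promoter", "cistrons", "terminator"]],
    "cistrons": [["cistron"], ["cistron", "cistrons"]],
    "cistron": [["rbs", "gene"]],
}

# Hypothetical parts library: part name -> syntactic category.
PARTS = {
    "pLac": "promoter",
    "pTet": "promoter",
    "rbsA": "rbs",
    "gfp": "gene",
    "lacI": "gene",
    "term1": "terminator",
}


def library_by_category():
    """Group part names by category, mirroring a library organized by function."""
    groups = {}
    for name, category in PARTS.items():
        groups.setdefault(category, []).append(name)
    return groups


def is_valid(part_names, start="construct"):
    """Return True if the ordered parts can be derived from `start`."""
    try:
        tokens = tuple(PARTS[name] for name in part_names)
    except KeyError:
        return False  # a part that is not in the library cannot be categorized

    @lru_cache(maxsize=None)
    def ends(symbol, i):
        """Positions at which `symbol` can end when it starts at position i."""
        if symbol not in GRAMMAR:  # terminal: must match the category at i
            if i < len(tokens) and tokens[i] == symbol:
                return frozenset({i + 1})
            return frozenset()
        reachable = set()
        for production in GRAMMAR[symbol]:
            positions = {i}
            for sym in production:
                positions = {e for p in positions for e in ends(sym, p)}
                if not positions:
                    break  # this production cannot match from position i
            reachable |= positions
        return frozenset(reachable)

    # Valid only if some derivation consumes every part in the list.
    return len(tokens) in ends(start, 0)


if __name__ == "__main__":
    print(is_valid(["pLac", "rbsA", "gfp", "term1"]))  # single-gene cassette
    print(is_valid(["pLac", "gfp", "term1"]))          # no ribosome binding site
```

In this toy setting, a construct is accepted only when its categories follow one of the grammar's productions, so a cassette lacking a ribosome binding site before its gene is rejected, while a cassette with several consecutive `rbs`/`gene` pairs is accepted.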

Availability: A web site allows readers to experiment with the algorithms presented in this article: www.genocad.org.

Supplementary information: Sequences and models are available at Bioinformatics online.

Publication types

  • Evaluation Study
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Chromosome Mapping / methods*
  • Computer Simulation
  • DNA / genetics*
  • Genetic Code / genetics*
  • Genetic Engineering / methods*
  • Genomic Islands / genetics*
  • Models, Genetic*

Substances

  • DNA