The origin of a novel gene through overprinting in Escherichia coli

BMC Evol Biol. 2008 Jan 28;8:31. doi: 10.1186/1471-2148-8-31.

Abstract

Background: Overlapped genes originate by a) loss of a stop codon between contiguous genes coded in different frames; b) a shift to an upstream initiation codon in one of the contiguous genes; or c) overprinting, whereby a novel open reading frame originates through point mutation inside an existing gene. Although overlapped genes are common in viruses, it is not clear whether overprinting has led to new genes in prokaryotes.
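As an illustration of mechanism c), the following is a minimal sketch of how a single point mutation could open a new reading frame inside an existing gene without changing the protein encoded by that gene. The sequences ANCESTRAL and DERIVED and the helpers translate and find_orfs are hypothetical toy constructs made up for this example; they are not the yaaW or htgA sequences. The sketch assumes the standard genetic code, a same-strand +1 frame, and a mutation that creates a start codon, which is only one of the ways a point mutation could give rise to an overprinted frame.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(seq, offset=0):
    """Translate seq codon by codon, starting at the given offset."""
    return "".join(
        CODON_TABLE[seq[i:i + 3]] for i in range(offset, len(seq) - 2, 3)
    )


def find_orfs(seq, offset):
    """Return (start, end, peptide) for each ATG...stop ORF in one frame."""
    orfs = []
    start = None
    for i in range(offset, len(seq) - 2, 3):
        codon = seq[i:i + 3]
        if start is None and codon == "ATG":
            start = i
        elif start is not None and CODON_TABLE[codon] == "*":
            # end is exclusive and includes the stop codon
            orfs.append((start, i + 3, translate(seq[start:i])))
            start = None
    return orfs


# Hypothetical toy gene (NOT real yaaW/htgA sequence data).
ANCESTRAL = "ATGGACGCTAAACTGGGCCTAAACTGA"
# Single C -> T change at index 5: GAC -> GAT, synonymous (Asp) in frame 0,
# but it creates an ATG start codon in the +1 frame.
DERIVED = ANCESTRAL[:5] + "T" + ANCESTRAL[6:]

if __name__ == "__main__":
    print("frame 0, ancestral:", translate(ANCESTRAL))
    print("frame 0, derived:  ", translate(DERIVED))
    print("frame +1 ORFs, ancestral:", find_orfs(ANCESTRAL, 1))
    print("frame +1 ORFs, derived:  ", find_orfs(DERIVED, 1))
```

In this toy case the frame 0 product is identical before and after the mutation, while the +1 frame gains a short ORF that lies entirely inside the original gene. Real analyses would work with aligned homologous sequences and would also need to consider both strands.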

Results: Here we report the origin of a new gene through overprinting in Escherichia coli K12. The htgA gene, which codes for a positive regulator of the sigma 32 heat shock promoter, arose by point mutation in a 123/213 phase within an open reading frame (yaaW) of unknown function, most likely in the lineage leading to E. coli and Shigella sp. Further, we show that yaaW sequences that contain an overlapped htgA gene have a slower evolutionary rate than those lacking one.

Conclusion: While overprinting has been shown to be rather frequent in the evolution of new genes in viruses, our results suggest that this mechanism has also contributed to the origin of a novel gene in a prokaryote. We propose the term janolog (from Jano, or Janus, the two-faced Roman god) to describe the homology relationship that holds between two genes when one originated through overprinting of the other. One cannot dismiss the possibility that at least a small fraction of the large number of novel ORPhan genes detected in pan-genome and metagenomic studies arose by overprinting.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Base Sequence
  • Escherichia coli / classification
  • Escherichia coli / genetics*
  • Escherichia coli Proteins / genetics
  • Evolution, Molecular*
  • Genes, Bacterial*
  • Heat-Shock Proteins / genetics
  • Molecular Sequence Data
  • Open Reading Frames / genetics
  • Phylogeny

Substances

  • Escherichia coli Proteins
  • Heat-Shock Proteins
  • yaaW protein, E coli