Sigma factors in a thousand E. coli genomes

Environ Microbiol. 2013 Dec;15(12):3121-9. doi: 10.1111/1462-2920.12236. Epub 2013 Aug 29.

Abstract

Everyone working with bacterial genomics is familiar with the phrase 'too much data'. In this Genome Update, we discuss two methods for helping to deal with this explosion of genomic information. First, we introduce the concept of calculating a quality score for each sequenced genome, and second, we describe a method to quickly sort through genomes for a particular set of protein families. We apply these two methods to all of the current Escherichia coli genomes available in the National Center for Biotechnology Information database. Out of the 2074 E. coli/Shigella genomes listed (June 2013), fewer than half (983) are of sufficient quality to use in comparative genomic work. Unfortunately, even some of the 'complete' E. coli genomes are in pieces, and a few 'draft' genomes are of good quality. Six of the seven known sigma factors in E. coli strain K-12 are extremely well conserved; the iron-regulating sigma factor FecI (σ19) is missing in most genomes. Surprisingly, the E. coli strain CFT073 genome does not encode a functional RpoD (σ70), which is obviously essential, and this is likely due to poor genome assembly/annotation. We find a possible novel sigma factor present in more than a hundred E. coli genomes.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Base Sequence
  • Escherichia coli / genetics*
  • Escherichia coli / metabolism
  • Escherichia coli Proteins / chemistry
  • Escherichia coli Proteins / genetics*
  • Escherichia coli Proteins / metabolism
  • Genome, Bacterial*
  • Genomics / methods
  • Molecular Sequence Data
  • Molecular Weight
  • Protein Structure, Tertiary
  • Sigma Factor / chemistry
  • Sigma Factor / genetics*
  • Sigma Factor / metabolism

Substances

  • Escherichia coli Proteins
  • Sigma Factor

Associated data

  • GENBANK/ADUO00000000
  • GENBANK/AE014075
  • GENBANK/AEMF00000000
  • GENBANK/AJWU01000000
  • GENBANK/AP012306
  • GENBANK/CM000662
  • GENBANK/CM000960
  • GENBANK/CM001142
  • GENBANK/CM001474
  • GENBANK/CP000970
  • GENBANK/CP001063
  • GENBANK/CP001560
  • GENBANK/CP001671
  • GENBANK/CP001969
  • GENBANK/CP002211
  • GENBANK/CP002291
  • GENBANK/CU928160
  • GENBANK/FM180568
  • GENBANK/U00096