Automated protein (re)sequencing with MS/MS and a homologous database yields almost full coverage and accuracy

Bioinformatics. 2009 Sep 1;25(17):2174-80. doi: 10.1093/bioinformatics/btp366. Epub 2009 Jun 17.

Abstract

Motivation: Bottom-up tandem mass spectrometry (MS/MS) is now routinely used in proteomics to identify proteins from a sequence database. De novo sequencing software is also available for sequencing novel peptides of relatively short length. However, automated sequencing of novel proteins from MS/MS data remains a challenging problem.

Results: Very often, although the target protein is novel, a homologous protein is present in a known database. For this case, we propose a novel algorithm and automated software tool, named Champs, that sequences the complete protein from MS/MS data of a few enzymatic digestions of the purified protein. Validation with two standard proteins showed that our automated method yields >99% sequence coverage and 100% sequence accuracy on both proteins. Our method is useful for sequencing novel proteins or for 're-sequencing' a protein that has mutations compared with the database protein sequence.
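
The abstract reports sequence coverage and sequence accuracy without defining them. The sketch below shows one plausible way to compute both figures, and it is an assumption rather than the definition used by the authors. It assumes that coverage is the fraction of reference positions for which a residue was called, and that accuracy is the fraction of called residues that match the reference. The function name `coverage_and_accuracy`, the `-` gap symbol, and the input strings are hypothetical and are not taken from Champs.

```python
def coverage_and_accuracy(reference: str, assembled: str, gap: str = "-") -> tuple[float, float]:
    """Return (coverage, accuracy) for an assembled sequence.

    Assumption: `assembled` is already aligned to `reference`, has the same
    length, and uses `gap` at every position where no residue was called.
    """
    if len(reference) != len(assembled):
        raise ValueError("reference and assembled must have the same length")

    # Keep only the positions where a residue was actually called.
    called = [(r, a) for r, a in zip(reference, assembled) if a != gap]

    # Coverage: called positions over all reference positions.
    coverage = len(called) / len(reference) if reference else 0.0

    # Accuracy: correctly called positions over all called positions.
    correct = sum(1 for r, a in called if r == a)
    accuracy = correct / len(called) if called else 0.0

    return coverage, accuracy


# Toy strings for illustration only, not data from the paper.
reference = "KVFGRCELAA"
assembled = "KVFGRCEL-A"  # one position left uncalled
coverage, accuracy = coverage_and_accuracy(reference, assembled)
print(f"coverage={coverage:.1%} accuracy={accuracy:.1%}")
```

Under these definitions, an uncalled position lowers coverage but not accuracy, while a wrong residue lowers accuracy but not coverage, which is why the two figures are reported separately.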

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Amino Acid Sequence
  • Animals
  • Automation / methods*
  • Cattle
  • Chickens
  • Databases, Protein*
  • Mass Spectrometry / methods*
  • Molecular Sequence Data
  • Muramidase / chemistry
  • Sequence Alignment
  • Sequence Analysis, Protein / methods*
  • Sequence Homology, Amino Acid*
  • Serum Albumin, Bovine / chemistry

Substances

  • Serum Albumin, Bovine
  • Muramidase