De novo protein sequencing by combining top-down and bottom-up tandem mass spectra

J Proteome Res. 2014 Jul 3;13(7):3241-8. doi: 10.1021/pr401300m. Epub 2014 Jun 18.

Abstract

There are two approaches for de novo protein sequencing: Edman degradation and mass spectrometry (MS). Existing MS-based methods characterize a novel protein by assembling tandem mass spectra of overlapping peptides generated from multiple proteolytic digestions of the protein. Because each tandem mass spectrum covers only a short peptide of the target protein, the key to high-coverage protein sequencing is to find spectral pairs from overlapping peptides so that the tandem mass spectra can be assembled into longer ones. However, the overlapping regions of peptides may be too short to be confidently identified. High-resolution mass spectrometers have become accessible to many laboratories. These instruments can analyze molecules of large mass, which has boosted the development of top-down MS. Top-down tandem mass spectra cover whole proteins, but even when combined they rarely provide full ion fragmentation coverage of a protein. We propose an algorithm, TBNovo, for de novo protein sequencing by combining top-down and bottom-up MS. In TBNovo, a top-down tandem mass spectrum is used as a scaffold, and bottom-up tandem mass spectra are aligned to the scaffold to increase sequence coverage. Experiments on data sets of two proteins showed that TBNovo achieved high sequence coverage and high sequence accuracy.
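The scaffold idea can be illustrated with a toy sketch. This is not the TBNovo implementation, which the abstract does not specify; it only shows one simple way to place a bottom-up spectrum on a top-down scaffold, under strong assumptions: spectra are reduced to noise-free lists of prefix masses, the alignment is a brute-force search for the mass shift that matches the most peaks, and the demo data are simulated from an invented sequence. The names `prefix_masses`, `best_shift`, and `merge`, the tolerance, and the sequences are all hypothetical.

```python
# Toy illustration only: NOT the TBNovo algorithm, just the scaffold idea.
# Monoisotopic residue masses (Da) for a few amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "L": 113.08406, "N": 114.04293,
    "D": 115.02694, "K": 128.09496,
}

def prefix_masses(seq):
    """Cumulative residue masses of seq, starting at 0.0 (simulated data)."""
    masses = [0.0]
    for residue in seq:
        masses.append(masses[-1] + RESIDUE_MASS[residue])
    return masses

def best_shift(scaffold, peptide, tol=0.01):
    """Brute-force search for the shift that matches the most peptide peaks."""
    best = (0.0, 0)
    for s in scaffold:
        for p in peptide:
            shift = s - p  # candidate: peptide peak p lands on scaffold peak s
            matches = sum(
                any(abs(q + shift - t) <= tol for t in scaffold)
                for q in peptide
            )
            if matches > best[1]:
                best = (shift, matches)
    return best

def merge(scaffold, peptide, shift, tol=0.01):
    """Add shifted peptide peaks that the scaffold does not already contain."""
    merged = list(scaffold)
    for q in peptide:
        m = q + shift
        if all(abs(m - t) > tol for t in merged):
            merged.append(m)
    return sorted(merged)

# Hypothetical protein; the "top-down" scaffold is missing several cleavages.
full = prefix_masses("GASPVTLNDK")
scaffold = [full[i] for i in (0, 1, 2, 4, 5, 7, 10)]

# Hypothetical "bottom-up" peptide covering an internal stretch of the protein.
peptide = prefix_masses("SPVTL")

shift, matches = best_shift(scaffold, peptide)
merged = merge(scaffold, peptide, shift)
print(f"shift={shift:.5f} Da, matched peaks={matches}, "
      f"scaffold peaks {len(scaffold)} -> {len(merged)}")
```

In this toy case the peptide fills two cleavage sites that the scaffold lacks, which is the sense in which aligned bottom-up spectra increase coverage. Real spectra contain noise, missing and suffix ions, and modifications, so a practical method needs scoring far beyond peak counting.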

Publication types

  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Alemtuzumab
  • Algorithms
  • Amino Acid Sequence
  • Animals
  • Antibodies, Monoclonal, Humanized / chemistry
  • Carbonic Anhydrase II / chemistry
  • Cattle
  • Molecular Sequence Data
  • Peptide Mapping*
  • Sequence Analysis, Protein*
  • Tandem Mass Spectrometry / methods

Substances

  • Antibodies, Monoclonal, Humanized
  • Alemtuzumab
  • Carbonic Anhydrase II