Introduction of a distance cut-off into structural alignment by the double dynamic programming algorithm

Comput Appl Biosci. 1997 Aug;13(4):387-96. doi: 10.1093/bioinformatics/13.4.387.

Abstract

Two approximations were introduced into the double dynamic programming algorithm to reduce the computational time for structural alignment. The first is the so-called distance cut-off, which approximates the structural environment of each residue by its local environment. In this approximation, a sphere of a given radius is centered on the side chain center of each residue. The local environment of a residue consists only of the residues whose side chain centers lie within the sphere, and it is expressed as the set of center-to-center distances from the side chain of the residue to those of all the other constituent residues. Residues outside the sphere are excluded from the local environment. The second approximation, referred to here as the delta N cut-off, is associated with the distance cut-off. It rests on the idea that if two local environments are similar to each other, the numbers of residues constituting them are expected to be similar. Accordingly, if the difference between the numbers of constituent residues of two local environments is greater than a given threshold value, delta N, the evaluation of the similarity between the local environments is skipped. The introduction of the two approximations dramatically reduced the computational time for structural alignment by the double dynamic programming algorithm. However, the approximations also decreased the accuracy of the alignment. To improve the accuracy while retaining the approximations, a program with a two-step alignment algorithm was constructed. First, a rough alignment was constructed with the approximations, and the epsilon-suboptimal region for that alignment was determined. Then, the double dynamic programming algorithm with full structural environments was applied to the residue pairs within the epsilon-suboptimal region to produce an improved alignment.
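
The two cut-offs described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names, the coordinates, the radius value, and the choice to treat the sphere boundary as inclusive are all assumptions made for illustration, and the similarity evaluation itself is left out because the abstract does not specify it.

```python
import math

def local_environment(centers, i, radius):
    """Distances from residue i to the other side chain centers inside its sphere.

    `centers` is a list of (x, y, z) side chain centers (hypothetical input format).
    The boundary is treated as inclusive, which is an assumption.
    """
    distances = []
    for j, center in enumerate(centers):
        if j == i:
            continue  # a residue is not part of its own environment
        d = math.dist(centers[i], center)
        if d <= radius:
            distances.append(d)
    return sorted(distances)

def passes_delta_n(env_a, env_b, delta_n):
    """Return False when the pair should be skipped under the delta N cut-off.

    The pair is skipped if the two environments differ in residue count
    by more than delta_n.
    """
    return abs(len(env_a) - len(env_b)) <= delta_n

# Toy coordinates, chosen only to make the distances easy to check by hand.
centers = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (0.0, 4.0, 0.0), (10.0, 0.0, 0.0)]
env_first = local_environment(centers, 0, radius=6.0)  # two neighbors in the sphere
env_last = local_environment(centers, 3, radius=6.0)   # no neighbors in the sphere
compare = passes_delta_n(env_first, env_last, delta_n=1)  # counts differ by 2, so skip
```

In this sketch a pair of residues would be scored only when `passes_delta_n` returns True, which is where the saving in computation would come from.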

MeSH terms

  • Algorithms*
  • Amino Acid Sequence
  • Animals
  • Databases, Factual
  • Humans
  • Molecular Sequence Data
  • Molecular Structure
  • Protein Structure, Tertiary
  • Proteins / chemistry*
  • Proteins / genetics
  • Sequence Alignment / methods*
  • Sequence Alignment / statistics & numerical data
  • Sequence Homology, Amino Acid
  • Software

Substances

  • Proteins