Amino acid propensities for secondary structures are influenced by the protein structural class

https://doi.org/10.1016/j.bbrc.2006.01.159

Abstract

Amino acid propensities for secondary structures have been used since the 1970s, when Chou and Fasman evaluated them on datasets of a few tens of proteins and developed a method to predict the secondary structure of proteins. That method is still in use, although prediction methods have since evolved toward very different approaches and higher reliability. The propensity for secondary structures is an intrinsic property of an amino acid and is used in developing new algorithms and prediction methods. Our work therefore aimed to determine which kind of protein dataset is best for evaluating amino acid propensities: a larger but non-homogeneous set, or smaller but homogeneous sets, i.e., all-α, all-β, and α–β proteins. As a first analysis, we evaluated amino acid propensities for helix, β-strand, and coil in more than 2000 proteins from the PDBselect dataset. With these propensities, secondary structure predictions performed with a method very similar to that of Chou and Fasman gave better results than the original method, which was based on propensities derived from the few tens of X-ray protein structures available in the 1970s. In a refined analysis, we subdivided the PDBselect dataset into three secondary structural classes, i.e., all-α, all-β, and α–β proteins. For each class, the amino acid propensities for helix, β-strand, and coil were calculated and used to predict secondary structure elements for proteins belonging to the same class, with the predictions assessed by resubstitution and jackknife tests. This second round of predictions further improved on the results of the first round. Amino acid propensities for secondary structures therefore become more reliable as the homogeneity of the protein dataset used to evaluate them increases. Our results also indicate that all algorithms using propensities for secondary structure can still be improved to obtain better predictive results.

Section snippets

Methods

Database and definition of protein secondary structure. All analyses were performed using PDBselect [48] as a set of experimentally determined, non-redundant protein structures in the Protein Data Bank (see http://homepages.fh-giessen.de/~hg12640/pdbselect ). We used the PDBselect list with <25% sequence homology, released in December 2003, which contained 2216 protein chains.

The secondary structure for every PDB entry was assigned by the DSSP algorithm [49] based on the analysis of backbone

Analysis of PDBselect as a unique set

The PDBselect release of December 2003 included 2216 structures with <25% sequence homology. We assigned the secondary structure for 2168 proteins using the DSSP program (the remaining entries report only α carbons, so DSSP could not assign a secondary structure). We reduced the 8-state secondary structure to a three-state secondary structure, treating H, G, and I as helix, B and E as β structure, and all other states as coil (see Methods for details about the 8 states).
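The three-state reduction described above can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' code: the function names `reduce_to_three_states` and `propensities` are hypothetical, and the propensity formula is the standard Chou–Fasman-style ratio (the fraction of a residue type found in a state divided by the fraction of all residues found in that state), which we assume here because the exact definition used in the paper is not shown in this excerpt.

```python
from collections import Counter

# Mapping taken from the text: H, G, I -> helix; B, E -> beta structure;
# every other DSSP symbol -> coil. The output letters H/E/C are our choice.
HELIX = set("HGI")
STRAND = set("BE")


def reduce_to_three_states(dssp: str) -> str:
    """Collapse a DSSP 8-state string into a three-state H/E/C string."""
    return "".join(
        "H" if s in HELIX else "E" if s in STRAND else "C" for s in dssp
    )


def propensities(pairs):
    """Return {amino_acid: {state: propensity}} for (sequence, dssp) pairs.

    Assumed formula (Chou-Fasman style, not confirmed by the excerpt):
        P(aa, state) = (n(aa, state) / n(aa)) / (N(state) / N)
    """
    pair_counts = Counter()   # n(aa, state)
    aa_counts = Counter()     # n(aa)
    state_counts = Counter()  # N(state)
    for seq, dssp in pairs:
        if len(seq) != len(dssp):
            raise ValueError("sequence and DSSP string must have equal length")
        for aa, st in zip(seq, reduce_to_three_states(dssp)):
            pair_counts[aa, st] += 1
            aa_counts[aa] += 1
            state_counts[st] += 1
    total = sum(aa_counts.values())  # N
    return {
        aa: {
            st: (pair_counts[aa, st] / aa_counts[aa]) / (state_counts[st] / total)
            for st in state_counts  # only states that were actually observed
        }
        for aa in aa_counts
    }


# Toy input for illustration only; these are not data from the paper.
toy = [("AAKG", "HGIT"), ("VVGA", "EBSH")]
table = propensities(toy)
```

A propensity above 1 means the residue occurs in that state more often than expected from the overall composition of the dataset. Computing the table separately for all-α, all-β, and α–β proteins would only require calling `propensities` once per class.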

We calculated the

Discussion

We calculated the amino acid propensities for helix, β-strand, and coil for all proteins in the PDBselect dataset and evaluated their reliability by using them to predict the secondary structure of proteins. The quality of these predictions was examined by resubstitution and jackknife tests. The results of the two tests are in general very similar (differences of 0.1–0.2%), particularly when the number of proteins in the dataset was higher. This may reflect the fact that in the

Acknowledgments

This work was partially supported by the MIUR-FIRB project (Grant RBNE0157EH_003) and by Rete di Spettrometria di Massa (contract FERS n. 94.05.09.103, ARINCO N. 94.IT.16.028). The Ph.D. fellowship of Dr. Susan Costantini is supported by the E.U.

References (64)

  • A.A. Salamov et al. Prediction of protein secondary structure by combining nearest-neighbor algorithms and multiple sequence alignments. J. Mol. Biol. (1995)
  • J.M. Ball et al. SU proteins from virulent and avirulent EIAV demonstrate distinct biological properties. Virology (2005)
  • R. Xu et al. A common sequence-associated physicochemical feature for proteins of beta-trefoil family. Comput. Biol. Chem. (2005)
  • G. Kugler et al. Structural requirements of the dihydropyridine receptor α1S II-III loop for skeletal-type excitation-contraction coupling. J. Biol. Chem. (2004)
  • K. Nishikawa. Assessment of secondary-structure prediction of proteins. Comparison of computerized Chou–Fasman method with others. Biochim. Biophys. Acta (1983)
  • B.R. Starcich et al. Identification and characterization of conserved and variable regions in the envelope gene of HTLV-III/LAV, the retrovirus of AIDS. Cell (1986)
  • M. Motz et al. Expression of the Epstein-Barr virus 138-kDa early protein in Escherichia coli for the use as antigen in diagnostic tests. Gene (1986)
  • M.M. Gromiha et al. Inter-residue interactions in protein folding and stability. Prog. Biophys. Mol. Biol. (2004)
  • C.B. Anfinsen. Principles that govern the folding of protein chains. Science (1973)
  • P.Y. Chou et al. Prediction of protein conformation. Biochemistry (1974)
  • P.Y. Chou. Prediction of Protein Structure and the Principles of Protein Conformation (1989)
  • B. Rost et al. PHD-an automatic mail server for protein secondary structure prediction. Comput. Appl. Biosci. (1994)
  • J.M. Chandonia et al. The importance of larger data sets for protein secondary structure prediction with neural networks. Protein Sci. (1996)
  • J.M. Levin et al. Quantification of secondary structure prediction improvement using multiple alignments. Protein Eng. (1993)
  • C. Geourjon et al. SOPM: a self-optimized method for protein secondary structure prediction. Protein Eng. (1994)
  • B. Rost et al. Combining evolutionary information and neural networks to predict protein secondary structure. Proteins (1994)
  • M.M. Gromiha et al. Prediction of protein secondary structures from their hydrophobic characteristics. Int. J. Pept. Protein Res. (1995)
  • J. Kyngas et al. Unreliability of the Chou–Fasman parameters in predicting protein secondary structure. Protein Eng. (1998)
  • N. Eswar et al. Stranded in isolation: structural role of isolated extended strands in proteins. Protein Eng. (2003)
  • N.K. Dakappagari et al. A chimeric multi-human epidermal growth factor receptor-2 B cell epitope peptide vaccine mediates superior antitumor responses. J. Immunol. (2003)
  • K. Koscielska-Kasprzak et al. Amyloid-forming peptides selected proteolytically from phage display library. Protein Sci. (2003)
  • J.P. Malone et al. Type I collagen N-telopeptides adopt an ordered structure when docked to their helix receptor during fibrillogenesis. Proteins (2004)