Software Note
PreSSAPro: A software for the prediction of secondary structure by amino acid properties

https://doi.org/10.1016/j.compbiolchem.2007.08.010

Abstract

PreSSAPro is software, available to the scientific community as a free web service, designed to predict secondary structure from the amino acid sequence of a given protein. Predictions are based on our recently published work on amino acid propensities for secondary structures, evaluated both in large but non-homogeneous protein data sets and in smaller but homogeneous data sets corresponding to protein structural classes, i.e. all-alpha, all-beta, or alpha–beta proteins. Predictions improve when the propensities evaluated for the correct protein class are used. PreSSAPro predicts the secondary structure according to the correct protein class, if known, or gives multiple predictions, one for each structural class. Comparing these predictions provides a novel tool to evaluate which sequence regions can assume different secondary structures depending on the structural class assignment, with a view to identifying proteins able to fold into different conformations. The service is available at http://bioinformatica.isa.cnr.it/PRESSAPRO/.

Introduction

The propensities for different secondary structures are intrinsic properties of amino acids and have been used over the last three decades to investigate protein structure. In the 1970s Chou and Fasman developed their pioneering prediction method based on the statistical propensity of amino acids for secondary structures, evaluated on the few dozen proteins whose three-dimensional structures had been determined by X-ray diffraction at the time. On the basis of such propensities, it was possible to evaluate the mean propensity for the different secondary structures along a given sequence, and so to predict its secondary structure (Chou and Fasman, 1974a; Chou and Fasman, 1974b; Chou, 1989). Propensities evaluated in the early works, or their re-evaluated versions, are still used for developing new algorithms and predictive methods (Wang and Feng, 2005; Fuchs and Alix, 2005).
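The idea of averaging propensities along a sequence can be illustrated with a minimal sketch. It is a simplified sliding-window average and not the full Chou and Fasman procedure, which also uses nucleation and extension rules. The function name, the window size, the one-letter state codes (H, E, C) and the values in TOY_TABLE are assumptions made only for illustration; the numbers are placeholders, not published propensities.

```python
# Hypothetical placeholder propensities (NOT published values):
# one entry per residue type, one value per state (H = helix, E = strand, C = coil).
TOY_TABLE = {
    "A": {"H": 1.4, "E": 0.8, "C": 0.8},
    "V": {"H": 0.9, "E": 1.6, "C": 0.6},
    "G": {"H": 0.5, "E": 0.7, "C": 1.6},
}

STATES = ("H", "E", "C")


def predict_by_mean_propensity(sequence, table, window=5):
    """Assign to each residue the state with the highest mean propensity
    over a window centred on that residue (truncated at the sequence ends)."""
    half = window // 2
    prediction = []
    for i in range(len(sequence)):
        segment = sequence[max(0, i - half): i + half + 1]
        means = {
            state: sum(table[aa][state] for aa in segment) / len(segment)
            for state in STATES
        }
        # Ties are resolved in the order given by STATES.
        prediction.append(max(STATES, key=lambda state: means[state]))
    return "".join(prediction)
```

With the placeholder table, a run of alanines is assigned to helix and a run of glycines to coil; real use would require a propensity table derived from a protein data set.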

The PreSSAPro service is based on our recent paper (Costantini et al., 2006), which examined amino acid propensities from a new point of view. The main question in that work was which protein dataset is best suited to evaluating amino acid propensities, a larger but non-homogeneous set or smaller but homogeneous sets, and how the composition of the protein dataset affects these propensities. We evaluated the amino acid propensities for three types of secondary structure (i.e. helix, beta-strand and coil) for 2168 proteins reported in the PDBselect dataset. Predictions based on these propensities were more successful than those of the original Chou and Fasman method, which was based on a few dozen proteins. This dataset was then subdivided into three subsets corresponding to the secondary structural classes, i.e. all-alpha, all-beta and alpha–beta proteins, according to the definition of Nakashima et al. (1986), which considers proteins with >15% alpha-helical content and <10% beta-strand content as all-alpha proteins, those with <15% alpha-content and >10% beta-content as all-beta proteins, those with >15% alpha-content and >10% beta-content as mixed proteins, and the remainder as irregular. For each subset, the amino acid propensities were calculated and used to predict the secondary structure of the proteins belonging to that subset. These predictions were more successful still than those obtained using the propensities calculated for the whole dataset. The final consideration from that work concerns the reliability of the Chou and Fasman approach. Its performance can improve drastically as the number of proteins in the initial data set used to evaluate the amino acid propensities grows, but also when smaller data sets of proteins that are homogeneous in their secondary structure content are used. These conclusions allowed us to develop novel software for the prediction of protein secondary structure, named PreSSAPro.
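The class thresholds quoted above from Nakashima et al. (1986) can be written as a short function. This is a sketch of the rule as stated in this text, not the PreSSAPro source code; the function name and the class labels returned are assumptions, and the mixed class is labelled "alpha-beta" here.

```python
def structural_class(helix_pct: float, strand_pct: float) -> str:
    """Assign a structural class from helix and beta-strand content (percent),
    using the strict inequalities quoted in the text."""
    if helix_pct > 15 and strand_pct < 10:
        return "all-alpha"
    if helix_pct < 15 and strand_pct > 10:
        return "all-beta"
    if helix_pct > 15 and strand_pct > 10:
        return "alpha-beta"  # "mixed" proteins in the quoted definition
    # Everything else, including values exactly on a threshold, is irregular.
    return "irregular"
```

Because the quoted inequalities are strict, a protein with exactly 15% helix or exactly 10% strand content falls into the irregular class under this reading.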

Methods

PreSSAPro is a CGI script written in Perl that predicts the secondary structure of proteins using the residue propensity values for the different secondary structural types (Pij), determined from the ratio of the residue's frequency of occurrence in helices, beta-strands and coil to its overall frequency of occurrence, evaluated in four different protein subsets (Costantini et al., 2006). The tables of amino acid propensities calculated for the whole PDBselect dataset and for three subsets
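As an illustration of a propensity of this kind, the sketch below computes Pij from sequences paired with per-residue structure annotations. It assumes the usual Chou and Fasman style normalisation, Pij = (n_ij / n_j) / (n_i / N), where n_ij counts residue i in state j, n_j counts all residues in state j, n_i counts residue i overall and N is the total number of residues; the exact formula used by PreSSAPro is given in Costantini et al. (2006) and may differ. The function name and the input format are hypothetical, and Python is used here rather than Perl.

```python
from collections import Counter


def propensities(sequences, structures):
    """Return table[aa][state] = (n_aa,state / n_state) / (n_aa / N).

    `sequences` and `structures` are parallel lists of equal-length strings,
    e.g. one-letter residues and one-letter states such as H, E, C.
    """
    pair_counts = Counter()
    aa_counts = Counter()
    state_counts = Counter()
    for seq, ss in zip(sequences, structures):
        if len(seq) != len(ss):
            raise ValueError("sequence and annotation lengths differ")
        for aa, state in zip(seq, ss):
            pair_counts[(aa, state)] += 1
            aa_counts[aa] += 1
            state_counts[state] += 1

    total = sum(aa_counts.values())
    table = {}
    for aa, n_aa in aa_counts.items():
        table[aa] = {
            state: (pair_counts[(aa, state)] / n_state) / (n_aa / total)
            for state, n_state in state_counts.items()
        }
    return table
```

A value above 1 means the residue occurs in that state more often than its overall frequency would suggest. Applying the same function to the whole dataset and to each structural class subset would give one table per subset.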

Using PreSSAPro

PreSSAPro has been developed to offer a user-friendly web tool that predicts secondary structure from the amino acid sequence of a given protein. The user has three ways to obtain the prediction, as shown in Fig. 1: (i) by indicating the helix and/or beta-strand content, as percentages, if known from experimental studies; (ii) by indicating the structural class of the input sequence, choosing among “all-alpha”, “all-beta” and “alpha–beta”; or (iii) by

