PIGS: automatic prediction of antibody structures

Marcatili, Paolo; Rosi, Alessandra; Tramontano, Anna

doi:10.1093/bioinformatics/btn341

Abstract

Summary: We describe a web server for the automatic prediction of the structure of immunoglobulin variable domains based on the canonical structure model. The server is user-friendly and flexible. It allows the user to select the templates for the frameworks and the loops using different strategies. The final output is a full-fledged 3D model of the variable domains of the target immunoglobulin.

Availability: The server is openly accessible to academic users at the address: http://arianna.bio.uniroma1.it/pigs. It does not require registration and there is no limit to the number of sequences that can be submitted.

Contact: anna.tramontano@uniroma1.it

1 INTRODUCTION

Immunoglobulins are key players of the immune response and their overall structure is reasonably well conserved. They are composed of two heavy and two light chains that contain four and two domains with a similar fold, respectively (Chothia et al., 1989). Antibodies bind their cognate antigen using the tip of the first domains of each chain (VL and VH). From a structural point of view, the antigen binding site is formed by six loops, three from the light (L1, L2, L3) and three from the heavy chain (H1, H2, H3) named according to their order of appearance in the amino acid sequence.

The structure of the main chain of five of these loops can be predicted quite accurately by taking into account the position and identity of a few specific key amino acids (Chothia and Lesk, 1987; Chothia et al., 1989; Tramontano et al., 1990; Webster and Rees, 1995). For example, either five or six residues form the L3 loop. The five-residue L3 loops all have a similar main chain conformation (within 0.2 Å). The six-residue L3 loops can only take one of two possible conformations, depending upon the position of a proline residue within the loop. The case of the third loop of the heavy chain (H3) is more complex. Extensive analysis of this loop in the many available structures has demonstrated that the conformation of the 10 residues closest to the framework (the torso) can be predicted in a similar fashion as for the other loops, while the remaining residues do not seem to follow identifiable sequence rules (Morea et al., 1998; Shirai et al., 1996).

The advanced understanding of the sequence to structure relationship in this important class of molecules makes it possible to predict their structure quite accurately and automatically. The crucial steps in the prediction are the correct alignment of the target sequence with those of immunoglobulins of known structure, and the identification of the limits of the hypervariable loops (where insertions and deletions occur) and of the key residues determining their conformation. The alignment has to follow immunoglobulin-specific rules and cannot be obtained by classical dynamic programming methods, because insertions and deletions can only occur at very specific positions and some important conserved residues, for example, two disulfide-bonded cysteines and a tryptophan, need to be aligned. Nevertheless, rule-based techniques for the alignment, the identification of the canonical structures and the detection of the appropriate templates for the loops can be implemented and automated.

A server named WAM (Webster and Rees, 1995), mostly based on the rules described earlier, is already available for immunoglobulin structure prediction. However, its usage by the academic community at large is limited by a number of factors. Users, who need to register via fax, are restricted to five sequences per month, which is a rather low threshold in the genomic era. Furthermore, the server is rather rigid: it does not allow any input for the selection of the template structures and it only works if the sequence spans the precise boundaries of the domain. Finally, the alignment often requires manual intervention.

2 THE PIGS SERVER

We have developed PIGS (prediction of immunoglobulin structure), a tool to build the structure of immunoglobulins available to the academic community. The PIGS server is flexible and user-friendly and relies upon a database of known immunoglobulin structures and of their structural alignment that is regularly updated. The user only needs to input the sequence of the variable chains of the antibody of interest and the program will display a list of putative templates for both the loops and the framework for each chain, together with other useful information (Fig. 1).

Fig. 1.

The main page of PIGS, displayed after the user has uploaded the target sequences. The template structures for each chain can be selected manually or according to a predefined strategy. The numbers in the ‘Loop’ column indicate the canonical structure of the loop, with the blue color indicating the same canonical structure as the target immunoglobulin chain. The alignment with each template structure can be viewed and edited in the pop-up window by clicking on the %id figure.

The user can either select the templates manually, or let the server select them automatically according to one of four possible strategies:

Same antibody: select the known structure that can provide a template for both the heavy and light chain, even if a different template with a higher sequence identity exists for one of the chains. If two different antibodies are selected for each chain, the program needs to reconstruct the complete molecule by matching residues known to be conserved at the VL–VH interface and this can introduce more errors than taking the two chains from the same antibody.

Same canonical structure: select the template having loops with the same canonical structure as the target, even if a different template with a higher sequence identity exists for one or both chains. If a chain with different canonical structures is selected, the program needs to reconstruct the complete chain by building a ‘chimera’, taking the loops from an antibody with the matching canonical structures, and this can introduce more errors than taking the loops from the same chain used to model the framework.

Same antibody and canonical structure: select an antibody structure that can be used as a template for both VL and VH and where the canonical structures of the loops are the same as those of the target, even if a different template with a slightly better sequence identity exists for one or both chains.

Best L and H chain: select the two chains with highest sequence identity with the corresponding chains of the target and, if needed, pack the two chains together and take the loops from a different structure.
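The four strategies can be read as two independent filters (same source antibody, same canonical structures) followed by a ranking on sequence identity. The following is a minimal sketch of that reading, not the server's code: the `Template` record, the strategy identifiers, the tie-breaking by summed identity for complete antibodies and all the example data are assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Template:
    antibody: str               # source structure identifier (made up here)
    chain: str                  # "L" or "H"
    identity: float             # % sequence identity with the target chain
    canonical: Tuple[int, ...]  # canonical structure class of each loop


Pair = Tuple[Template, Template]
STRATEGIES = ("same_antibody", "same_canonical",
              "same_antibody_and_canonical", "best_l_and_h")


def select(templates: List[Template],
           target_canonical: Dict[str, Tuple[int, ...]],
           strategy: str) -> Optional[Pair]:
    """Return (light, heavy) templates, or None if no candidate qualifies."""
    if strategy not in STRATEGIES:
        raise ValueError(f"unknown strategy: {strategy}")
    need_canonical = "canonical" in strategy
    need_same_antibody = "antibody" in strategy

    # Optional filter: keep only chains whose loops match the target classes.
    pool = [t for t in templates
            if not need_canonical or t.canonical == target_canonical[t.chain]]
    light = [t for t in pool if t.chain == "L"]
    heavy = [t for t in pool if t.chain == "H"]

    if need_same_antibody:
        pairs = [(l, h) for l in light for h in heavy
                 if l.antibody == h.antibody]
        # Assumption: rank complete antibodies by summed identity.
        return max(pairs, key=lambda p: p[0].identity + p[1].identity,
                   default=None)

    if not light or not heavy:
        return None
    return (max(light, key=lambda t: t.identity),
            max(heavy, key=lambda t: t.identity))


# Hypothetical target classes and candidate templates.
TARGET = {"L": (2, 1, 1), "H": (1, 3)}
TEMPLATES = [
    Template("abA", "L", 90.0, (2, 1, 1)), Template("abA", "H", 70.0, (1, 2)),
    Template("abB", "L", 80.0, (1, 1, 1)), Template("abB", "H", 85.0, (1, 3)),
    Template("abC", "L", 95.0, (2, 1, 1)), Template("abD", "H", 92.0, (1, 2)),
    Template("abE", "L", 75.0, (2, 1, 1)), Template("abE", "H", 60.0, (1, 3)),
]

for name in STRATEGIES:
    light_t, heavy_t = select(TEMPLATES, TARGET, name)
    print(name, light_t.antibody, heavy_t.antibody)
```

In this toy data set each added constraint lowers the sequence identity of the chosen templates, which is the trade-off the strategy descriptions above spell out.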

The user can also choose to obtain just the main chain coordinates (‘Backbone only’), the coordinates of the main chain and of all the side chains conserved with respect to the template (‘Transfer conserved residues’) or the complete structure. In the latter case (‘Transfer conserved+SCWRL’), the conformation of the conserved side chains is retained and the remaining ones are reconstructed using SCWRL (Bower et al., 1997).
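The three output options differ only in what happens to each side chain. As a rough illustration, the sketch below derives a per-residue plan from a pairwise alignment; the function name, the mode identifiers, the action labels and the example alignment are hypothetical, and the ‘rebuild’ action merely marks residues that would be handed to a side-chain placement program such as SCWRL.

```python
from typing import List, Tuple

MODES = ("backbone_only", "transfer_conserved", "transfer_conserved_scwrl")


def side_chain_plan(target_aln: str, template_aln: str,
                    mode: str) -> List[Tuple[str, str]]:
    """Return (residue, action) for every target residue in the alignment.

    Actions: "copy"    = keep the template side-chain conformation,
             "rebuild" = leave to a side-chain placement program,
             "none"    = output main chain atoms only.
    """
    if mode not in MODES:
        raise ValueError(f"unknown mode: {mode}")
    if len(target_aln) != len(template_aln):
        raise ValueError("aligned sequences must have the same length")

    plan = []
    for target_res, template_res in zip(target_aln, template_aln):
        if target_res == "-":
            continue  # gap in the target: no residue to model in this column
        if mode == "backbone_only":
            action = "none"
        elif target_res == template_res:
            action = "copy"      # residue conserved with respect to the template
        elif mode == "transfer_conserved_scwrl":
            action = "rebuild"   # substituted or inserted residue
        else:
            action = "none"
        plan.append((target_res, action))
    return plan


# Made-up five-column alignment: one substitution (D/E), one target gap.
print(side_chain_plan("ACD-F", "ACEWF", "transfer_conserved_scwrl"))
```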

No extensive benchmarking has ever been performed to assess which of the strategies described earlier works best. In our own experience, the ‘Same antibody’ and ‘Transfer conserved+SCWRL’ options are preferable, and these are the default parameters of the server. We are presently performing a large-scale benchmark whose results will be made available via the PIGS server.

Once the choice has been made, the user can select the ‘build’ option and obtain the predicted structure of the antibody in PDB format. The model can be visualized via JMol, where a few options allow the user to identify key parts of the molecule.

The reliability of the canonical structure method has been demonstrated repeatedly and does not need further proof here. As an example, Table 1 reports the r.m.s.d. of the main chain atoms between the models obtained by the completely automatic procedure for four antibody structures recently deposited in the protein structure database and their experimental structures.

Table 1.

Comparison between the predicted and experimental main chain structures of four recently solved immunoglobulin structures

r.m.s.d.	Region	2HWZ	2ADF	2R29	2ZCH
All	All	1.11	1.08	1.42	1.16
FW	FW	0.94	0.88	1.22	0.95
Loops	FW	1.75	1.75	2.12	1.81
L	L	1.17	0.48	1.14	0.6
FW L	FW L	1.03	0.48	0.94	0.47
L loops	FW L	1.90	0.48	1.90	1.07
L1 (length)	FW L	1.63 (6)	0.31 (7)	2.29 (11)	1.24 (11)
L2 (length)	FW L	0.91 (3)	0.32 (3)	1.05 (3)	0.44 (3)
L3 (length)	FW L	2.43 (6)	0.68 (5)	1.39 (6)	0.94 (6)
H	H	0.74	1.01	1.50	1.21
FW H	FW H	0.52	0.61	1.44	0.73
H loops	FW H	1.38	1.97	1.90	2.44
H1 (length)	FW H	0.8 (9)	0.85 (7)	0.81 (7)	0.56 (7)
H2 (length)	FW H	1.17 (3)	0.88 (4)	1.14 (4)	0.49 (4)
H3 (length)	FW H	1.66 (10)	2.52 (9)	2.47 (7)	3.06 (13)

No refinement protocol has been applied. Values are the r.m.s.d. in Å of the region in the first column after optimal superposition of the region indicated in the second. FW, Framework; L, Light chain; H, Heavy chain.
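The values in Table 1 involve two regions: one used to superpose the model on the experimental structure and one over which the deviation is measured. A minimal sketch of such a calculation is given below. It assumes NumPy is available and uses the standard Kabsch least-squares fit; the text does not state which superposition procedure was used, and the function names and coordinates are illustrative only.

```python
import numpy as np


def superpose(mobile, target):
    """Kabsch fit: return (R, t) such that mobile @ R.T + t overlays target."""
    P = np.asarray(mobile, dtype=float)
    Q = np.asarray(target, dtype=float)
    p_center, q_center = P.mean(axis=0), Q.mean(axis=0)
    H = (P - p_center).T @ (Q - q_center)   # 3x3 covariance matrix
    U, _, Vt = np.linalg.svd(H)
    # Guard against an improper rotation (reflection).
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = q_center - R @ p_center
    return R, t


def rmsd_after_fit(model, reference, fit_idx, measure_idx):
    """r.m.s.d. over measure_idx after superposing the atoms in fit_idx."""
    model = np.asarray(model, dtype=float)
    reference = np.asarray(reference, dtype=float)
    R, t = superpose(model[fit_idx], reference[fit_idx])
    moved = model[measure_idx] @ R.T + t
    diff = moved - reference[measure_idx]
    return float(np.sqrt((diff ** 2).sum(axis=1).mean()))


# Toy example: atoms 0-3 play the role of the framework, atoms 4-5 of a loop.
reference = np.array([[0, 0, 0], [1.5, 0, 0], [0, 1.5, 0],
                      [0, 0, 1.5], [2, 2, 0], [2, 0, 2]], dtype=float)
angle = np.radians(40.0)
rot_z = np.array([[np.cos(angle), -np.sin(angle), 0.0],
                  [np.sin(angle), np.cos(angle), 0.0],
                  [0.0, 0.0, 1.0]])
model = reference @ rot_z.T + np.array([3.0, -1.0, 2.0])
model[5] += np.array([0.0, 0.0, 2.0])       # displace one loop atom by 2 Å

framework, loop = [0, 1, 2, 3], [4, 5]
print(rmsd_after_fit(model, reference, framework, framework))
print(rmsd_after_fit(model, reference, framework, loop))
```

Measuring a loop after fitting only the framework, as in the ‘Loops’ rows of the table, captures errors in the position of the loop relative to the framework as well as errors in its internal conformation.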

3 CONCLUSIONS

The prediction of antibody structures is not only important, but also feasible at a level of accuracy much higher than for any other protein type. The level of understanding of the sequence–structure relationship in this class of molecules is sufficiently advanced that automatic, easy-to-use methods can be developed and employed for the molecular analysis of the ‘immunome’. The purpose of the PIGS server described here is to enable scientists to tackle the open problem of the molecular basis of the specificity of antibodies at large and/or to obtain data useful in the context of specific biological problems.

ACKNOWLEDGEMENTS

Funding: PIGS is supported by the EU contract number LHSG-CT-2003-503265, and AIRC BICG Project.

Conflict of Interest: none declared.

REFERENCES

Bower,M. et al. (1997) Prediction of protein side-chain rotamers from a backbone-dependent rotamer library: a new homology modeling tool. J. Mol. Biol., 267, 1268–1282.

Chothia,C. and Lesk,A. (1987) Canonical structures for the hypervariable regions of immunoglobulins. J. Mol. Biol., 196, 901–917.

Chothia,C. et al. (1989) Conformations of immunoglobulin hypervariable regions. Nature, 342, 877–883.

Morea,V. et al. (1998) Conformations of the third hypervariable region in the VH domain of immunoglobulins. J. Mol. Biol., 275, 269–294.

Shirai,H. et al. (1996) Structural classification of CDR-H3 in antibodies. FEBS Lett., 399, 1–8.

Tramontano,A. et al. (1990) Framework residue 71 is a major determinant of the position and conformation of the second hypervariable region in the VH domains of immunoglobulins. J. Mol. Biol., 215, 175–182.

Webster,D.M. and Rees,A.R. (1995) Molecular modeling of antibody-combining sites. Methods Mol. Biol., 51, 17–49.

Author notes

Associate Editor: Alfonso Valencia
