  • Review Article

A tour of structural genomics

Abstract

Structural genomics projects aim to provide an experimental or computational three-dimensional model structure for all of the tractable macromolecules that are encoded by complete genomes. To this end, pilot centres worldwide are now exploring the feasibility of large-scale structure determination. Their experimental structures and computational models are expected to yield insight into the molecular function and mechanism of thousands of proteins. The pervasiveness of this information is likely to change the use of structure in molecular biology and biochemistry.

Key Points

  • Structural genomics aims to produce coordinates for all tractable proteins, by experimental determination of representative protein structures and computational comparative modelling of homologues.

  • There are many approaches to selecting targets for experimental characterization.

  • Structural genomics focuses on domains, rather than whole proteins or complexes.

  • Although enhancements to experimental technologies should allow structural genomics to scale up, most steps require optimization at present. Key experimental steps include cloning, expression and purification. These are followed by either nuclear magnetic resonance (NMR) assignment and structure determination, or by crystallization, diffraction, phasing and structure refinement.

  • Both NMR and X-ray crystallography will have roles in structural genomics.

  • Protein structure is better conserved than sequence, and therefore reveals distant evolutionary relationships that are undetectable from sequence.

  • Many functional inferences from structural genomics have relied on surface charge or bound ligands.

  • Solved structures are available in the usual manner from the Protein Data Bank (PDB); other databases list available targets at present.
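
The experimental steps named in the key points can be written down as a small data structure. The following is only an illustrative sketch: the names `Route`, `COMMON_STEPS`, `ROUTE_STEPS` and `pipeline` are hypothetical and do not come from any structural genomics software; the step names themselves are taken from the key points.

```python
from enum import Enum


class Route(Enum):
    """The two structure-determination routes named in the key points."""
    NMR = "nmr"
    XRAY = "xray"


# Steps shared by both routes (hypothetical constant name).
COMMON_STEPS = ["cloning", "expression", "purification"]

# Steps that are specific to each route.
ROUTE_STEPS = {
    Route.NMR: ["NMR assignment", "structure determination"],
    Route.XRAY: ["crystallization", "diffraction", "phasing",
                 "structure refinement"],
}


def pipeline(route):
    """Return the ordered list of steps for the chosen route."""
    return COMMON_STEPS + ROUTE_STEPS[route]


if __name__ == "__main__":
    for route in Route:
        print(route.name, "->", ", ".join(pipeline(route)))
```

Keeping the shared steps separate from the route-specific ones mirrors the text: both methods depend on the same protein production stages and diverge only afterwards.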


Figure 1: Structure similarity without sequence similarity.
Figure 2: Processes involved in high-throughput structural genomics using X-ray crystallography.
Figure 3: Target selection for structural genomics.



Acknowledgements

This work is supported by NIH grants and a Searle Scholarship. S.E.B. is grateful to J.-M. Chandonia, L. Lo Conte and R. Peters for critical review of the manuscript.


Related links

DATABASE LINKS

InterPro: TIM, TopRim

LocusLink: TBP, tubby

OMIM: retinitis pigmentosa type 14

FURTHER INFORMATION

Airlie Agreement

Airlie Conference

CATH

Dali

ModBase

National Institute of General Medical Sciences (NIGMS)

Pfam

PRESAGE

Protein Data Bank

SCOP

SNP Consortium

Structuralgenomics.org

SWISS-PROT and TrEMBL

Glossary

COORDINATES

A set of numbers that specify the X, Y and Z positions for each atom in a protein. Together, they describe the molecular structure.
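
As an illustration of what coordinates look like in practice, the sketch below reads X, Y and Z values from the fixed-width ATOM records of the Protein Data Bank text format. The function names `format_atom` and `parse_atom_coordinates` are hypothetical helpers written for this example, the atom values are made up for demonstration, and the record layout is simplified (occupancy, temperature factor and element columns are omitted).

```python
def format_atom(serial, name, res, chain, resseq, x, y, z):
    """Build a simplified ATOM record with coordinates in columns 31-54."""
    return (f"ATOM  {serial:5d} {name:<4s} {res:>3s} {chain}{resseq:4d}    "
            f"{x:8.3f}{y:8.3f}{z:8.3f}")


def parse_atom_coordinates(pdb_text):
    """Return (atom_name, residue_name, x, y, z) for each ATOM/HETATM record."""
    atoms = []
    for line in pdb_text.splitlines():
        if line.startswith(("ATOM", "HETATM")):
            atom_name = line[12:16].strip()   # columns 13-16
            res_name = line[17:20].strip()    # columns 18-20
            x = float(line[30:38])            # columns 31-38
            y = float(line[38:46])            # columns 39-46
            z = float(line[46:54])            # columns 47-54
            atoms.append((atom_name, res_name, x, y, z))
    return atoms


if __name__ == "__main__":
    # Demonstration values only, not taken from a real structure.
    text = "\n".join([
        format_atom(1, "N", "MET", "A", 1, 27.340, 24.430, 2.614),
        format_atom(2, "CA", "MET", "A", 1, 26.266, 25.413, 2.842),
    ])
    for atom in parse_atom_coordinates(text):
        print(atom)
```

Each parsed tuple carries one atom's position in ångströms; the full list of tuples is the set of coordinates that the glossary entry describes.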

HIS-TAG

A series of histidine residues fused to a protein that aids protein purification because of its strong binding to nickel columns.

MESOPHILE

An organism that grows at moderate temperature.

DYNAMIC LIGHT SCATTERING

A technique for determining apparent molecular size, in which laser light is shone on a solution. Fluctuations in the scattered light reflect the diffusion rate and, therefore, the size of the molecules in solution.
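
The link between diffusion rate and size is commonly expressed through the Stokes-Einstein relation, R = kT / (6πηD), which treats the molecule as a sphere. The sketch below applies that relation; it is an assumption of this example rather than something stated in the glossary, and the function name, default temperature and default water viscosity are illustrative choices.

```python
import math

BOLTZMANN = 1.380649e-23  # Boltzmann constant in J/K


def hydrodynamic_radius(diffusion_coeff_m2_s,
                        temperature_k=298.15,
                        viscosity_pa_s=8.9e-4):
    """Apparent (hydrodynamic) radius in metres from a diffusion coefficient.

    Assumes a spherical particle and, by default, water at about 25 degrees C.
    """
    return (BOLTZMANN * temperature_k
            / (6 * math.pi * viscosity_pa_s * diffusion_coeff_m2_s))


if __name__ == "__main__":
    # A diffusion coefficient of 1e-10 m^2/s is used purely as an example.
    radius_nm = hydrodynamic_radius(1e-10) * 1e9
    print(f"apparent radius: {radius_nm:.2f} nm")
```

Because the radius is inversely proportional to the diffusion coefficient, slowly diffusing species such as aggregates appear as larger particles.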

SYNCHROTRON

A device that accelerates charged particles, such as electrons, around a closed ring in synchronized packets. The intense X-rays that the circulating particles emit are used for protein crystallography.

BEAMLINE AUTOMATION

Technologies to reduce human intervention on synchrotron beamlines, such as robots for mounting and centring crystals in the X-ray beam.

MAD PHASING

(Multiwavelength anomalous dispersion). An approach to determining the phases of a crystal structure that relies on the anomalous scattering of X-rays near the absorption edge of a heavy atom (such as selenium). It allows phases to be determined from several data sets that are collected at different wavelengths from a single crystal.

TROSY

(Transverse relaxation-optimized spectroscopy). A nuclear magnetic resonance technique that reduces the deterioration of signal from large proteins. It allows large proteins to be studied in high-field magnets.

ISOELECTRIC POINT

The pH at which a protein has zero net charge.
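
A rough estimate of the isoelectric point can be obtained from sequence alone by summing the Henderson-Hasselbalch charge of each ionizable group and searching for the pH at which the total is zero. The sketch below does this by bisection. The pKa values are approximate, illustrative numbers (published scales differ), the function names are hypothetical, and effects of the folded structure on individual pKa values are ignored.

```python
# Approximate side-chain pKa values; illustrative assumptions only.
POSITIVE_PKA = {"K": 10.8, "R": 12.5, "H": 6.5}
NEGATIVE_PKA = {"D": 3.9, "E": 4.1, "C": 8.5, "Y": 10.1}
N_TERM_PKA = 8.6
C_TERM_PKA = 3.6


def net_charge(sequence, ph):
    """Net charge of a peptide at a given pH (Henderson-Hasselbalch)."""
    charge = 1.0 / (1.0 + 10 ** (ph - N_TERM_PKA))    # N-terminus
    charge -= 1.0 / (1.0 + 10 ** (C_TERM_PKA - ph))   # C-terminus
    for residue in sequence:
        if residue in POSITIVE_PKA:
            charge += 1.0 / (1.0 + 10 ** (ph - POSITIVE_PKA[residue]))
        elif residue in NEGATIVE_PKA:
            charge -= 1.0 / (1.0 + 10 ** (NEGATIVE_PKA[residue] - ph))
    return charge


def isoelectric_point(sequence, low=0.0, high=14.0, tol=1e-4):
    """Find the pH of zero net charge by bisection.

    Works because the net charge decreases monotonically as pH rises.
    """
    while high - low > tol:
        mid = (low + high) / 2
        if net_charge(sequence, mid) > 0:
            low = mid
        else:
            high = mid
    return (low + high) / 2


if __name__ == "__main__":
    peptide = "ACDEFGHIK"  # arbitrary example sequence
    print(f"estimated pI: {isoelectric_point(peptide):.2f}")
```

Bisection is used here because it needs no derivative and cannot overshoot; any one-dimensional root finder would serve equally well.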


Brenner, S. A tour of structural genomics. Nat Rev Genet 2, 801–809 (2001). https://doi.org/10.1038/35093574


  • DOI: https://doi.org/10.1038/35093574
