Special Reviews

Comparison of the Complete Protein Sets of Worm and Yeast: Orthology and Divergence

Science
11 Dec 1998
Vol 282, Issue 5396
pp. 2022-2028

Abstract

Comparative analysis of predicted protein sequences encoded by the genomes of Caenorhabditis elegans and Saccharomyces cerevisiae suggests that most of the core biological functions are carried out by orthologous proteins (proteins of different species that can be traced back to a common ancestor) that occur in comparable numbers. The specialized processes of signal transduction and regulatory control that are unique to the multicellular worm appear to use novel proteins, many of which re-use conserved domains. Major expansion of the number of some of these domains seen in the worm may have contributed to the advent of multicellularity. The proteins conserved in yeast and worm are likely to have orthologs throughout eukaryotes; in contrast, the proteins unique to the worm may well define metazoans.

Supplementary Material

File (985134.xhtml)

REFERENCES AND NOTES

1
C. elegans Sequencing Consortium, Science 282, 2012 (1998).
2
A. Goffeau et al., ibid. 274, 546 (1996).
3
Horvitz H. R., Sulston J. E., Genetics 96, 435 (1980);
Sulston J. E., White J. G., Dev. Biol. 78, 577 (1980);
Kenyon C., Science 240, 1448 (1988);
Sternberg P. W., Adv. Genet. 27, 63 (1990);
Sternberg P. W., Felix M. A., Curr. Opin. Genet. Dev. 7, 543 (1997).
4
Nasmyth K., Shore D., Science 237, 1162 (1987);
Herskowitz I., Microbiol. Rev. 52, 536 (1988);
Marsh L., Herskowitz I., Cold Spring Harbor Symp. Quant. Biol. 53, 557 (1988);
Marsh L., Neiman A. M., Herskowitz I., Annu. Rev. Cell Biol. 7, 699 (1991);
Nasmyth K., Trends Genet. 12, 405 (1996).
5
J. Gerhart and M. Kirschner, Cells, Embryos, and Evolution (Blackwell, Malden, MA, 1997).
6
Doolittle R. F., Annu. Rev. Biochem. 64, 287 (1995).
7
Orthology is not necessarily a one-to-one relationship; a unique gene in one species may be the ortholog of a gene family in another species. The issue is further confounded by the fact that many proteins, particularly in eukaryotes, contain multiple domains that have a degree of evolutionary independence and are found in different combinations. Thus, the detection of even very high similarity between protein sequences from two species does not guarantee that the proteins in question are genuine orthologs with a conserved domain architecture (10). To diminish (but not eliminate) the latter problem, we required that the members of each protein pair be aligned through at least 80% of their lengths.
8
Fitch W. M., Syst. Zool. 19, 99 (1970).
9
Koonin E. V., Mushegian A. R., Galperin M. Y., Walker D. R., Mol. Microbiol. 25, 619 (1997).
10
Henikoff S., et al., Science 278, 609 (1997);
R. L. Tatusov, E. V. Koonin, D. J. Lipman, ibid., p. 631; M. Y. Galperin and E. V. Koonin, In Silico Biol. 1, 7 (1998) (www.bioinfo.de/isb/1998/01/0007).
11
Doolittle R. F., et al., Science 271, 470 (1996);
Feng D. F., Cho G., Doolittle R. F., Proc. Natl. Acad. Sci. U.S.A. 94, 13028 (1997).
12
The data set used for these comparisons was the 16 October 1998 worm protein data set from the Sanger Centre and the ORF translations in the 28 October version of the SGD (both are available from the Science Web site as well as SGD). Because the prediction of C. elegans protein sequences had, on 16 October, yet to be corrected by rigorous experimental analysis, our reliance on these predictions may result in the loss of some subset of C. elegans proteins. However, using the subset of yeast proteins for which we had identified no worm homolog, we performed BLAST searches against six-frame translations of the entire worm DNA sequence (finished sequence from the Sanger Centre Web site as of 3 November 1998) and identified no additional homologs at the P < 10^-10 level with the >80% alignment requirement (J. M. Cherry, unpublished data). Supplemental information regarding the analysis is available at www.sciencemag.org/feature/data/c-elegans.shl for a general overview and at www.sciencemag.org/feature/data/985134.shl for information specific to this review.
13
Altschul S. F., Gish W., Miller W., Myers E. W., Lipman D. J., J. Mol. Biol. 215, 403 (1990);
Gish W., States D. J., Nature Genet. 3, 266 (1993);
Karlin S., Altschul S. F., Proc. Natl. Acad. Sci. U.S.A. 90, 5873 (1993);
S. F. Altschul and W. Gish, Methods Enzymol. 266, 460 (1996). Version 2.0a19MP-WashU of BLAST was used, with the XNU and SEG filters, the BLOSUM62 scoring matrix, gapping on, and other parameters set to default values.
14
Thompson J. D., Higgins D. G., Gibson T. J., Nucleic Acids Res. 22, 4673 (1994);
Higgins D. G., Thompson J. D., Gibson T. J., Methods Enzymol. 266, 383 (1996). Version 1.74 of CLUSTALW was used with the BLOSUM substitution matrices.
15
Green P., et al., Science 259, 1711 (1993).
16
Clarke N. D., Berg J. M., ibid. 282, 2018 (1998).
17
Kataoka T., et al., Cell 40, 19 (1985).
18
Chamberlin H. M., Sternberg P. W., Development 120, 2713 (1994);
Church D. L., Guan K. L., Lambie E. J., ibid. 121, 2525 (1995);
Gutch M. J., Flint A. J., Keller J., Tonks N. K., Hengartner M. O., Genes Dev. 12, 571 (1998);
Sundaram M., Yochem J., Han M., Development 122, 2823 (1996);
Yochem J., Sundaram M., Han M., Mol. Cell. Biol. 17, 2716 (1997).
19
Botstein D., Chervitz S. A., Cherry J. M., Science 277, 1259 (1997);
Botstein D., Fink G. R., ibid. 240, 1439 (1988).
20
Mori H., Palmer R. E., Sternberg P. W., Mol. Gen. Genet. 245, 781 (1994).
21
Although there are strong theoretical reasons for preferring the unrooted tree, we show the rooted trees because they are easier to display compactly and more clearly represent the relationships at the tips of the branches, where the assessment of orthology is made. These are, in fact, just representations of unrooted trees with rooting that should be considered arbitrary.
22
Felsenstein J., Methods Enzymol. 266, 418 (1996).
23
Rogalski T. M., Riddle D. L., Genetics 118, 61 (1988);
Archambault J., Friesen J. D., Microbiol. Rev. 57, 703 (1993).
24
Figure 2A shows the CLUSTALW alignment at P < 10^-20 because at P < 10^-10 the yeast protein Spt5p is included, paired with a presumed worm ortholog, even though the similarity of these proteins to RNA polymerases is, upon further study, clearly spurious. This artifact, due to low complexity in the Spt5p amino acid sequence, is avoidable by more aggressive filtering, by applying the >80% alignment requirement, as well as by increasing the stringency; each of these measures exacts a cost in information as well. It illustrates that any alignment result has to be studied for robustness with regard to both stringency and filtering.
25
Mossi R., Hubscher U., Eur. J. Biochem. 254, 209 (1998).
26
Tanaka K., Biochem. Biophys. Res. Commun. 247, 537 (1998).
27
S. A. Chervitz et al., data not shown.
28
Clark S. W., Meyer D. I., Nature 359, 246 (1992); J. Cell Biol. 127, 129 (1994).
29
Herman R. K., Kari C. K., Hartman P. S., Genetics 102, 379 (1982).
30
Nelson R. J., Ziegelhoffer T., Nicolet C., Werner-Washburne M., Craig E. A., Cell 71, 97 (1992).
31
Normington K., Kohno K., Kozutsumi Y., Gething M. J., Sambrook J., ibid. 57, 1223 (1989).
32
Heschl M. F. P., Baillie D. L., Comp. Biochem. Physiol. 96, 633 (1990).
33
Altschul S. F., et al., Nucleic Acids Res. 25, 3389 (1997).
34
The mitochondrial ribosomal protein orthologs have been missed by the automatic comparison procedure primarily because they contain nonconserved NH2-terminal import peptides as well as COOH-terminal tails (28).
35
Okimoto R., Macfarlane J. L., Wolstenholme D. R., J. Mol. Evol. 39, 598 (1994).
36
The domains were primarily from the SMART database [Schultz J., Milpetz F., Bork P., Ponting C. P., Proc. Natl. Acad. Sci. U.S.A. 95, 5857 (1998)], to which several domains were added. We have not attempted to cite the literature for each domain, but refer the reader to the SMART database. See also www.bork.embl-heidelberg.de/Modules_db/special_annotation_page.html
37
Bork P., Schultz J., Ponting C. P., Trends Biochem. Sci. 22, 296 (1997).
38
To obtain robust counts for each of the domains in the yeast and worm protein sets, we compared representative sequences of each domain to the nonredundant protein database (National Center for Biotechnology Information, National Institutes of Health, Bethesda, MD) using the PSI-BLAST program, and the resulting position-dependent weight matrices (profiles) were saved. The number of search iterations and the cutoff for inclusion of sequences in the profile were adjusted individually for each domain. For most widespread domains, several profiles were constructed to ensure complete coverage. The profiles were then compared separately to the yeast and worm protein databases. Typically, the random expectation value of 0.01 was used as the criterion for domain identification, but the search results were additionally scrutinized for the conservation of patterns typical of the respective domain, to ensure the elimination of any false positives. The profiles for each of the domains are available at the Web site. They can be obtained by FTP and used for PSI-BLAST searches.
39
Hall T. M., et al., Cell 91, 85 (1997).
40
Porter J. A., et al., ibid. 86, 21 (1996).
41
Dahl E., Koseki H., Balling R., Bioessays 19, 755 (1997);
Ryan A. K., Rosenfeld M. G., Genes Dev. 11, 1207 (1997).
42
Franz G., Loukeris T. G., Dialektaki G., Thompson C. R., Savakis C., Proc. Natl. Acad. Sci. U.S.A. 91, 4746 (1994).
43
Y. Nagai et al., FEBS Lett. 418, 23 (1997); L. Aravind and E. V. Koonin, J. Mol. Biol., in press.
44
Cherry J. M., et al., Nature 387, 67 (1997).
45
We thank J. Hodgkin, R. Horvitz, J. Kimble, and the editors of Science for the invitation to write this paper, D. Lipman for suggesting the collaborations, and K. Anders for helpful discussions. We are especially grateful to R. Durbin (Sanger Centre) and L. Hillier (Genome Sequencing Center) for providing sequence information and for their cooperation. The SGD is supported by a P41 national resources grant, HG01315, from the National Human Genome Research Institute at the U.S. NIH. S.A.C. is supported by training grant PHS HG 00044. T.S. is supported by grant DE-FG02-98ER62558 from the Department of Energy.
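
Notes 7, 12, and 24 together describe a pair filter: a worm-yeast pair is retained only if its BLAST P value falls below a cutoff (10^-10, or 10^-20 for Figure 2A) and the alignment covers at least 80% of the length of each protein, and results should be checked for robustness across stringencies. The following Python sketch illustrates that logic only; it is not the authors' pipeline. The `Hit` record, its field names, the function names, and the toy values are all hypothetical, and reading "at least 80% of their lengths" as a requirement on both members of the pair is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """One worm-yeast alignment (hypothetical record, not a BLAST output format)."""
    worm_id: str
    yeast_id: str
    p_value: float       # P value reported for the pair
    worm_len: int        # full length of the worm protein
    yeast_len: int       # full length of the yeast protein
    worm_aligned: int    # worm residues covered by the alignment
    yeast_aligned: int   # yeast residues covered by the alignment


def passes_filter(hit, max_p=1e-10, min_coverage=0.80):
    """Keep a pair only if it is significant AND the alignment spans
    at least min_coverage of both proteins (assumed reading of note 7)."""
    worm_cov = hit.worm_aligned / hit.worm_len
    yeast_cov = hit.yeast_aligned / hit.yeast_len
    return hit.p_value < max_p and min(worm_cov, yeast_cov) >= min_coverage


def count_by_stringency(hits, thresholds=(1e-10, 1e-20)):
    """Count retained pairs at each P-value cutoff, as a simple robustness check."""
    return {t: sum(passes_filter(h, max_p=t) for h in hits) for t in thresholds}


# Toy data, invented for illustration only.
example_hits = [
    Hit("worm1", "yeastA", 1e-50, 500, 480, 450, 440),   # significant, well covered
    Hit("worm2", "yeastB", 1e-15, 300, 310, 270, 260),   # lost at the stricter cutoff
    Hit("worm3", "yeastC", 1e-40, 1000, 200, 190, 190),  # shared domain only: low worm coverage
    Hit("worm4", "yeastD", 1e-5, 100, 100, 95, 95),      # not significant
]

counts = count_by_stringency(example_hits)
print(counts)
```

In this toy set, the third pair shows why the coverage requirement matters: a highly significant match confined to one domain of a long multidomain protein is rejected, which is the situation note 7 warns about.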

Authors

Affiliations

Stephen A. Chervitz
L. Aravind
Gavin Sherlock
Catherine A. Ball
Eugene V. Koonin
Selina S. Dwight
Midori A. Harris
Kara Dolinski
Scott Mohr
Temple Smith
Shuai Weng
J. Michael Cherry
David Botstein

S. A. Chervitz, G. Sherlock, C. A. Ball, S. S. Dwight, M. A. Harris, K. Dolinski, S. Weng, J. M. Cherry, and D. Botstein are in the Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305–5120, USA. L. Aravind and E. V. Koonin are at the National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA. S. Mohr and T. Smith are in the Department of Biomedical Engineering, Boston University, Boston, MA 02115, USA.
