Introduction

Biologists have long been interested in a diagrammatic representation of the evolutionary history of Life. The first illustration of this kind was Jean-Baptiste Lamarck’s “Tableau. Servant à montrer l’origine des diffèrens animaux” (Lamarck 1809, vol. 2, p. 463Footnote 1) depicting his ideas about the phylogenetic relationships between major animal groups. The most striking feature of this scheme is not that groups are derived directly from one another, reflecting the old concept of the scala naturae (Great Chain of Being), but the emergence of a branching pattern of change for some pairs of these groups. The theory behind this phenomenon, namely the principle of evolutionary divergence, was elaborated and published 50 years later by Charles Darwin (1859, hereafter “Origin”) and visualized by his book’s single “accompanying diagram” (inserted facing page 117) for a small collection of hypothetical species and genera. He then argued that “a great tree” is sometimes used to represent the affinities of all members of a given class, and then extended this statement to the entirety of life on earth in the following famous passage (Darwin 1859, p. 130):

As buds give rise by growth to fresh buds, and these, if vigorous, branch out and overtop on all sides many a feebler branch, so by generation I believe it has been with the great Tree of Life, which fills with its dead and broken branches the crust of the earth, and covers the surface with its ever branching and beautiful ramifications.

Notably, this is the only occurrence of the term “Tree of Life” (ToL) in Darwin’s work published in his lifetime.Footnote 2 For him and most of his audience, ToL might have appeared to be no more than an idealized goal that can never be achieved, and therefore the term remained almost completely dormant for more than a century in the biological literature.Footnote 3 Terminology, even at the level of metaphors, was much more strongly influenced by Ernst Haeckel, who published a large number of phylogenetic diagrams, with different styles and theoretical backgrounds, for various groups of organisms. Of these, the “Monophyletische Stammbaum der Organismen” (Tafel I in Haeckel 1866) is the first attempt in the history of science to depict phylogenetic relationships for all extant life (Hossfeld and Levit 2016).Footnote 4 Perhaps, this figure facilitated most effectively the rapid spread and wide use of the term “tree”, which has since been routinely used in evolutionary and phylogenetic investigations.

For 100 years after the publication of the Origin, these studies were largely based on observed similarities among groups of organisms and the correlations of their attributes (Mayr 1988, pp. 269–270), which was challenged significantly by the rise of cladistics around the middle of the 20th century. Bifurcating tree graphs, called cladograms, were generated more objectively than any previous phylogenetic diagrams, with the goal of hypothesizing evolutionary relationships within selected taxonomic groups. The tree of all life was still illusory, however, because no phenetic data could cover prokaryotes together with eukaryotes, for obvious reasons. Very few or no morphological characters were available to describe all members of large groups, such as metazoans or plants, and the problem of universal morphology existed at even lower levels, as well. The situation changed radically with the advent of molecular cladistics in the late 1980s, when the analysis of nucleotide sequences of universal (ribosomal) genes made possible the “reconstruction” of the phylogeny of all extant organisms.

A literature search in the Web of Science (WoS) Core Collection reveals that the expression “Tree of Life” re-appeared in the subject matter and/or the title of biological papers in 1989, and after 10 years the number of occurrences started to increase dramatically, reaching 42 per year in the title and more than 190 appearances per year in the topic by 2018 (Supplementary Fig. S1). Another search in the same database shows that papers mentioning “Tree of life” in the title or in the abstract often receive a large number of citations, further illustrating that this expression has taken a central role in the life sciences (Supplementary Table S1). The trend is evident even though the increases may be partly due to the extensions of journal coverage by WoS in the past decade. Moreover, these figures could be even larger, because WoS actually covers only a fraction of the scientific literature. In addition to periodicals, many books include the “Tree of life” (or its translation) in their title, and several internet applications, online resources and databases also bear the term in their names. Thanks to the joint effort of thousands of biologists, we are getting increasingly closer to (although still a long way from) the realization of Darwin’s dream of a historical drawing that shows all buds, branches and ramifications of the living world that exists—and has existed—on our planet.

The obvious correlation between the increased use of ToL and the rapid expansion of molecular systematics suggests that the single candidate for the long-sought historical diagram is indeed “The One True Tree of Life”Footnote 5 (title of Chapter 10 in Dawkins 1986). As Eiserhardt et al. (2018) have recently put it: “It is a fundamental infrastructure of scientific knowledge that is as central to biology as the periodic table is to chemistry”. However, in sharp contrast to Dmitri Mendeleev’s chemical chart, it is surprising and embarrassing that no exact definition has been given for the concept underlying the ToL. Indeed, the methodology for preparing the ToL does not follow any standard, and its use fluctuates freely among “metaphor, model and heuristic device” (Mindell 2013; Morrison 2014, 2016). Some glossaries provide a short and even circular description, for example when the ToL is understood as a “tree-like representation of the history of all living and extinct organisms” (Gogarten and Townsend 2005). A more fortunate formulation is due to O’Malley et al. (2010), who say that the ToL is “intended to represent the pattern of evolutionary processes that result in bifurcating species lineages” thus symbolizing powerfully “the unity of evolutionary process and pattern”.Footnote 6 But bifurcation and pattern may be manifested in different ways in a diagram. Many instances of the ToL are rigorously graph theoretical trees, such as cladograms or their weighted variants (phylograms and chronograms or “timetrees”), whereas others are much more like metaphors and thus resemble “botanical trees” in shape.

Critical reviews and point-of-view articles devoted to this matter are relatively scarce, and they are concerned mostly with incongruent “gene trees” and phenomena that are said to violate the tree condition (HGT and hybridization; see Wolf et al. 2002; Puigbò et al. 2009; Morrison 2014, 2016; Doolittle and Brunet 2016 and most contributions in O’Malley 2010), or with methods of visualization (Page 2012). In some cases, although “Tree of life” appears in the title, its role is marginal in the main text (e.g., Szklarczyk et al. 2015; Wanntorp and Ronse de Craene 2011), with the ToL thus serving merely as a vague metaphor of the living world.

The information content in most diagrams published to date is insufficient to demonstrate appropriately “the grandest earthly enterprise”: details are restricted usually to one or a few aspects of the history of life, mostly to divergence and the associated timing or molecular change. In addition, however, I think that facts from paleontology (e.g., fossil groups, extinctions), estimated dates of major evolutionary events, a temporally meaningful hierarchical classification of species, and the known diversity of life should also be demonstrated properly by one and the same diagram. Although much information has been available to prepare such a figure, the details do not yet form a coherent, comprehensive diagrammatic representation.

In this paper, I suggest a solution that is not a tree in the graph theoretical sense of the word (because a tree cannot fulfill the above requirements, as explained later), but follows Darwin’s early concept by revitalizing the coral metaphor he mentioned only once in his notebooks. A mathematically well-defined coral allows illustrating the history of life by combining cladistic, chronological, paleontological, historical and taxonomic data in a synthesis never before portrayed. A preliminary attempt to compile the first version of the Coral of Life (CoL) is thus provided here, which is admittedly heuristic in many parts, but which can be refined and elaborated continuously when new phylogenetic hypotheses arise and more details are revealed and clarified.

Main Features of ToLs

Published ToLs, as noted above, are extremely diverse regarding their main characteristics. Supplementary Table S2 provides relevant information on 32 selected diagrams that cover all major groups of extant life (plus fossils in some cases) at least at the level of domains. All of them were designated as ToL by the author(s)—with the exception of Haeckel’s Stammbaum and Whittaker’s five-kingdom system, both of which deserve inclusion in the table for their importance in the history of ToL. Criteria for feature evaluation are as follows:

  1. if the diagram may be interpreted as a graph-theoretical tree, with nodes representing taxonomic entities (e.g., species or their groups aggregated into “higher” taxa) and edges representing relationships (see Glossary), or interpreted as an image of another kind;
  2. whether or not the diagram is embedded into the time dimension, and if so whether the timescale is linear or logarithmic;
  3. if the diagram is based on a single gene, allowing measurement of molecular distances along the edges of the graph, or whether there is no genetic distance measured at all;
  4. the proportion of labeled nodes representing prokaryotes and those corresponding to eukaryotes, which indicates bias towards either group (usually towards prokaryotes);
  5. if the diagram includes additional information on macroevolution (e.g., endosymbiotic events, significant genetic novelties, position of Last Universal Common Ancestor, etc.) or major ecological changes on Earth;
  6. if there is a classification (partitions or a hierarchy) superimposed over the phylogeny;
  7. the taxonomic status of entities represented by nodes or branches (e.g., species, genera, higher Linnaean ranks or clades);
  8. if the relations depicted are of sister group (SG) or ancestor–descendant (AD) type;
  9. if extinct organisms (fossils) also appear as labeled nodes; and
  10. typification of the diagram according to Podani (2013, 2017).

Purely stylistic properties are excluded from this list because I do not see any essential difference, for example, between trees arranged in traditional rectangular manner and alternatives such as radial trees, the so-called phylogenetic mandalas (Hasegawa 2017)—if they otherwise agree in all other features from the above list. Orientation, color, line thickness, fonts, presence of photos and other artistic details are also irrelevant here, although such features may be important from an aesthetic viewpoint (as in Lima 2013).

It is apparent that for visualizations of the Tree of Life the various authors have felt absolutely free to combine the fundamental properties listed above. There are hardly any two diagrams that completely agree in all of these features—the world of ToLs appears to be extremely diverse in this respect. Therefore, in the following sections I provide a detailed discussion of published variants of the tree of life, show the disadvantages associated with the use of mathematical trees, and outline the possibility for adapting a different type of drawing, the coral, to summarize the history and diversity of life in a single figure.

The Scale Problem: Levels of Organization

The first and most fundamental question is the choice of the basic taxon entities, the well-distinguishable operational units of visualization or—to put it another way—the level of biological organization at which the history of life is to be conceived and displayed. Biological organization may be viewed as a hierarchical arrangement of entities, either ecological or genealogical (Tëmkin and Eldredge 2015, see their Fig. 5). The ecological hierarchy relates to the functioning of its components and their spatial associations, while the genealogical one focuses on the information flow through time. In a historical context, therefore, we are concerned with the genealogical hierarchy.

In this hierarchy, the lowest relevant level would be that of cells (Doolittle and Brunet 2016): since all cells originate from previously existing ones,Footnote 7 the cellular history of life may be modeled as a mathematical graph (see Glossary) with vertices representing cells and directed edges corresponding to parent–offspring relations, which may therefore be embedded in the time dimension. Parts of this enormous imaginary graph are trees, where each parent cell gives rise to new cells by division (binary fission in prokaryotes and mitosis or meiosis in eukaryotes). However, new cells may also be formed by the merger of two cells, such as cellular fusions in the prokaryotic world (e.g., haloarchaeans)—which may also have created the first eukaryotic cell—and the fusion of gametes in sexually reproducing eukaryotes. Due to these events, which may connect closely related and very distant lineages as well, the tree-condition (no cycles) is violated, and the diagram becomes a network—the cellular Network of Life (NoL).
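The tree condition described above can be illustrated with a minimal sketch. It assumes the history is given as a list of (parent, offspring) pairs with a single common origin; the cell labels and the function names `fusion_nodes` and `is_tree_like` are hypothetical and serve only as illustration. Under that assumption, any node with two parents (a merger) closes a cycle, so the graph is no longer a tree.

```python
from collections import defaultdict


def fusion_nodes(edges):
    """Return nodes that have more than one parent (products of a merger)."""
    parents = defaultdict(set)
    for parent, child in edges:
        parents[child].add(parent)
    return sorted(child for child, ps in parents.items() if len(ps) > 1)


def is_tree_like(edges):
    """True if no node has two parents, i.e. only divisions occurred."""
    return not fusion_nodes(edges)


# Hypothetical cell labels: "a" divides into "b" and "c", "b" into "d" and "e".
division_only = [("a", "b"), ("a", "c"), ("b", "d"), ("b", "e")]
# A merger of "d" and "c" into "z" closes the cycle a-b-d-z-c-a.
with_fusion = division_only + [("d", "z"), ("c", "z")]

print(is_tree_like(division_only))  # True
print(is_tree_like(with_fusion))    # False
print(fusion_nodes(with_fusion))    # ['z']
```

A single fusion event is enough to turn the whole diagram into a network, regardless of how many divisions surround it.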

There is a third type of cellular union, phagocytosis, which normally means that one cell consumes the other as food, and is thus irrelevant in the present context. On rare occasions, however, the engulfed cell survives and is gradually integrated into the host cell, a process known as symbiogenesis,Footnote 8 which may have led to the formation of the mitochondrion and various types of plastids. In every such case, the new organelle performs functions that greatly increase the host’s fitness; and genetic material is carried over into its nucleus, thereby modifying the genome. Therefore, this type of fusion may be conceived as a major mutation event, by which evolutionary changes and diversification along the host’s lineage are greatly enhanced, whereas the line of the engulfed cell as a separate entity is eliminated. Evidently, the cellular NoL has only theoretical importance as a mental model for the unique (one true) cellular history, which involved an astronomical number of units for which I would never try to provide any estimate.

The next hierarchical level pertains to individual organisms (Dennett 1995; Doolittle and Brunet 2016), a large proportion of which are unicellular and are therefore present in the cellular network discussed above. Here, the innovation is multicellularity, which has evolved several times independently, producing the metazoans and different groups of plants, meta-algae and fungi. The NoL of individuals derives (“simplifies”) from the above network by combining nodes that represent cells of the same individual organism. These are considered physically separable for simplicity, although in many cases (e.g., clonal plants, animal colonies) it is hard to make a distinction between individuals; in such cases these “superorganisms” are viewed as units. In this graph, the edges show parent–offspring relationships as in the cellular NoL. It is partly composed of trees for asexually propagating organisms and partly of networks in the case of sexually reproducing species. It also has theoretical significance as a model to explain the one true network of organismal history (called “tokogenetic relationships” by Hennig 1966, see his Fig. 6 as a hypothetical example for speciation), with a size several orders of magnitude smaller than the graph of cells.

The next two levels of organization involve classification of individuals into the basic units of taxonomy, i.e., species. The historical diagrams with species as entities derive theoretically from the graph of individuals, such that nodes assigned to the same species are aggregated. This means that continuity of the lineages of cells and individuals is maintained in the graphs. At these levels arise the uncertainties with the definition of species. There are at least 30 different species concepts (Zachos 2016, pp. 80–96), and none of them apply equally to all groups of organisms and to extant and extinct species alike. Nevertheless, since most species are distinct (genetically, morphologically, ecologically, etc.) from the others at one time point, the species delineation problem is less apparent at the population level—i.e., when a node represents a living population of a species at time t, while an edge connects a parent population to its offspring at time t + 1. Darwin’s single tree-like model diagram in the Origin is the first one of this kind (“tree-like” because dotted links are used to represent many subsequent generations and nodes appear only where a population splits). However, hybridization may occur even between distinct and well-established populations of two related species, mostly in plants and less commonly in animals, giving rise to an offspring population that must show up in our theoretical diagram no matter whether it is fertile or sterile. Hybridization of species is almost the rule rather than the exception in the plant world: according to some estimates a large proportion of extant species are hybridogeneous (percentages vary, up to 80% for dicots, Briggs and Walters 2016). A consequence is that the historical diagram of life can only be a network at the population level. Needless to say, this NoL is no more knowable than are the previous two graphs.

Populations of a given species from all times can be further amalgamated into a single node, thus giving a species diagram. Evolutionary theory suggests that temporal species boundaries are not sharp: when moving from parent to offspring, species identity is unlikely to change unless some dramatic genomic modifications happen (e.g., autopolyploidization or hybridization) that warrant a taxonomic switch as well. In general, speciation is a slow, gradual process, which calls into question the meaningfulness of discrete mathematical tools, such as graphs, for its modeling.Footnote 9 Nevertheless, this is the lowest level at which scientists have ever attempted to draw historical diagrams of life, and thus species are assumed to be discrete spatio-temporal evolutionary entities in most attempts to determine the ToL. Loops are inherited from the previous levels; therefore a complete species graph can only be a network.
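The derivation of a higher-level diagram by aggregating nodes can be sketched as follows. This is only an illustration under simplifying assumptions: every lower-level node is assigned to exactly one unit, edges within a unit are dropped, and the labels, the mapping `species_of` and the function name `aggregate` are hypothetical.

```python
def aggregate(edges, unit_of):
    """Collapse nodes into units and keep only edges that cross units."""
    merged = set()
    for parent, child in edges:
        u, v = unit_of[parent], unit_of[child]
        if u != v:  # edges inside one unit disappear after aggregation
            merged.add((u, v))
    return sorted(merged)


# Hypothetical parent-offspring links among individuals i1..i6;
# i6 has two parents (i3 and i5) that belong to different species.
individual_edges = [
    ("i1", "i2"), ("i2", "i3"), ("i2", "i4"),
    ("i4", "i5"), ("i3", "i6"), ("i5", "i6"),
]
species_of = {"i1": "A", "i2": "A", "i3": "B",
              "i4": "C", "i5": "C", "i6": "H"}

species_edges = aggregate(individual_edges, species_of)
print(species_edges)
# [('A', 'B'), ('A', 'C'), ('B', 'H'), ('C', 'H')]
```

In this toy case the hybrid unit "H" keeps two parents after aggregation, so the loop present among individuals is inherited by the species-level graph.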

In summary, the mere existence of sexual reproduction, i.e., the fusion of gametes, at these four levels of organization is the principal reason explaining the network structure in the history of life, with other processes being rare or affecting mostly prokaryotes. The tree property comes into focus again if we consider that increased biological distances between distinct lineages prevent interbreeding, which means that they can never fuse again. The minimum distance for complete separation is indirectly reflected by traditional systematics: species belonging to different families, orders or higher Linnaean taxa never form hybrids, and the history of life, if viewed at a large taxonomic scale, will be dominated by ramifications.

At first glance, then, one may think that aggregation of species into nodes representing higher taxa resolves the tree versus network dilemma in favor of the former. Historically important examples for this view are Lamarck’s Tableau and some of Haeckel’s trees (e.g., for plant phylogeny, see Fig. 3 in Dayrat 2003). Interestingly, Darwin also played with this idea when he drew a sketch for mammalian evolution in 1857 or 1858 (see Fig. 4.6 in Archibald 2014): this figure is in fact a graph theoretical tree, with a question at the top of the diagram: “Let dots represent genera???” No matter how attractive, the proposal that Linnaean taxa are nodes, and that the edges depict their AD relations in a tree, is untenable for several reasons discussed later. Here, one of them deserves attention: if agglomeration of individuals into species is problematic because the temporal boundaries are not sharp, then the same problem transfers to groups of species as well. Jumping from one family into another when going from parent to offspring is even more unrealistic than the change of species identity (Podani 2009).

Finally, we can go down one step below the cells in the biological hierarchy, and construct “gene-trees” that are often associated with the notion of the ToL. Like cells, individuals, populations and species, each single gene also has a unique history, which is predominantly tree-like (Maddison 1997; Dickerman 1998). This is shaped by various processes of mutation, recombination, gene duplication and extinction, and horizontal gene transfer via cell fusions, viruses and extracellular parasites, as well as by the physical environment. Genes or their encoded proteins, sampled from individuals of selected species, have been generally used for phylogenetic studies, and comparisons suggest that species-level trees based on individual genes may be remarkably different from one another and are rarely identical. Some authors therefore suggest (Puigbò et al. 2009, 2013; the idea originated in Dickerman 1998) that no universal ToL exists, and that the concept should be substituted by the Forest of Life (FoL), consisting of gene trees. Puigbò et al., on the other hand, agree that the congruence among many gene trees is much higher than expected by chance, suggesting the existence of a single statistical ToL. Forterre (2015) concludes that the history of genes is “incredibly complex” and the universal tree should reflect the evolutionary history of cells, as envelopes of the genetic apparatus, rather than the change of their genome composition. Unlike cells or individuals, genes are not living entities, and may only be more or less reliable markers of phylogenetic relationships. Under what circumstances this history can be shown by a mathematical tree is an issue to be examined later in this paper. In the next sections, I shall first assume what most authors do: the ToL (i.e., a tree) is a meaningful summary at the level of species.

Scope, Taxon Sampling and Diversity

Whereas the ToL has obvious connotation to the entirety of life, its use is often confined to a subset of the living world: expressions such as the “prokaryotic tree of life” (McInerney et al. 2008), “plant tree of life” (Soltis et al. 2018), “fungal tree of life”, “tree of life of birds” or “amphibian tree of life” (Supplementary Table S1) are not uncommon. This is unfortunate, because in this way the ToL may end up with the same fate as the infamous “family tree”, for example, which in most publications had nothing to do with actual families (understood either in the everyday sense of the word or as a rank in systematics).Footnote 10 To clarify the issue, I suggest the use of the mathematically well-established term subtree (e.g., “avian subtree of life”) whenever attention is focused on a well-circumscribed set of species, i.e., a higher taxon.

The true ToL is therefore expected to include the widest possible range of cellular organisms, representing both domains of prokaryotes as well as eukaryotic protists, fungi, plants, animals and other groups (viruses excluded, see Moreira and López-Garcia 2009; Forterre 2015, for arguments). For extant species, most published phylogenies are based directly or indirectly on molecular data, and the resulting tree has as many leaves (terminal nodes or vertices) as the number of species examined. In such trees, especially those published before 2010, eukaryotes are usually underrepresented, since a higher proportion of bacterial species had been analyzed for nucleotide sequences than eukaryotes. ToLs with prokaryote-dominance (see column 6 in Supplementary Table S2) provide a strongly biased view on the richness of (extant) life, because the number of known species in this group is two orders of magnitude smaller than that of eukaryotes (ca. 20,000 vs. ca. 2 million).

The one graph theoretical ToL biologists are looking for should theoretically include around 2 million terminals, which is impossible in practice even with the currently popular circular arrangements. Some authors have tried to circumvent the problem by a more balanced taxon sampling: for example, Hillis et al. (Fig. 230 in Pietsch 2012; Supplementary Table S2) reduced the proportion of prokaryotes to less than 2%. Hedges et al. (2015) included more than 50,000 species in a large tree arranged in a spiral format, with a proportion of eukaryotes close to the true value of 99.5%. A very large number of species are found in the ToL diagram compiled by Yifan Hu (reproduced on pp. 120–121 in Lima 2013, see also at http://yifanhu.net/TOL/tol_9_19_2011.jpg), allegedly more than 90,000, but the choice of taxa is completely arbitrary, most of them are unlabeled, and the overall appearance of the tree is chaotic; and thus it does not improve our understanding of the natural order of life.

One might suggest therefore that the ToL need not be shown at the species level, but “higher” taxa should appear as terminal nodes. This solution, although not uncommon (e.g., families or “family-level” taxa in Hedges and Kumar 2009; Supplementary Table S2) is problematic for two main reasons: (1) the choice of higher taxa (either Linnaean or cladistic) can only be subjective, and (2) the species richness of these groups cannot be demonstrated graphically by proper mathematical trees. As a result, taxon sampling is still unbalanced, and some groups receive too much emphasis (vertebrates take ca. one-third of the tree tips in the ToL of Hedges and Kumar 2009). Even the number of higher taxa may be exceedingly large: in this diagram the total number of tips exceeds 1600, so that the terminal edges of the tree do not separate, and large clades cannot be distinguished without the use of different colors. The same is true for the spiral ToL (Hedges et al. 2015), in which tips are actually species but only large clades are labeled and colored differently. Again, this ToL is biased: about twice as many vertebrates as arthropods are included (actual richness values are ca. 67,000 vs. ca. 1,170,000). As well, angiosperms and vertebrates are exposed at the order level but major invertebrate groups are overlooked: names such as Mollusca, Insecta, Coleoptera and Lepidoptera do not even appear in the diagram. Although the spiral arrangement allowed an increased taxon sampling, the tree itself is squeezed into a narrow strip, and the bifurcating pattern is hard—if not impossible—to ascertain. The tree drawn by Hedges et al. is therefore more like a curiously arranged classification than an easily conceivable ToL.
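The size of the sampling bias mentioned above can be made explicit with a rough calculation. The sketch below takes the sampled ratio of 2 (“about twice as many vertebrates as arthropods”) and the approximate richness values from the text; the function name `overrepresentation` is hypothetical, and the result is only as precise as those rounded inputs.

```python
def overrepresentation(sampled_ratio, richness_a, richness_b):
    """Factor by which group A is over-sampled relative to group B."""
    true_ratio = richness_a / richness_b
    return sampled_ratio / true_ratio


# Vertebrates (ca. 67,000 species) vs. arthropods (ca. 1,170,000 species),
# with about twice as many vertebrates as arthropods in the diagram.
factor = overrepresentation(2.0, 67_000, 1_170_000)
print(round(factor, 1))  # 34.9
```

Under these assumptions, vertebrates are over-represented relative to arthropods by a factor of roughly 35.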

In conclusion, a mathematical tree cannot visualize the phylogenetic relationships of known extant life without bias towards particular groups. Drawing a species-level tree is definitely a big data problem: the enormous technical difficulties are demonstrated lucidly by Eiserhardt et al. (2018): a tree even for extant plant species only would require a ca. 1660 m long piece of paper if the labels were given in a nine-point font—twice as much as the height of the tallest man-made construction in the world (Burj Khalifa in Dubai). For a similar graph for 2 million species, the paper length would be around 10 km! Some authors suggest, therefore, that no single drawing of the ToL is appropriate; and the use of several subtrees in a suitable navigational environment is recommended—a practice already followed by some books (e.g., Lecointre and Le Guyader 2007; Hedges and Kumar 2009; Vargas and Zardoya 2012) and internet applications (e.g., Maddison and Schulz 2007; Rosindell and Harmon 2012; de Vienne 2016).
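The paper-length figures scale linearly with the number of labels, which a short calculation makes visible. The vertical space per label used here (5 mm) is an assumption back-calculated from the 10 km quoted for 2 million species; it is not a value given by Eiserhardt et al. (2018), and the function names are hypothetical.

```python
def paper_length_m(n_labels, label_height_mm):
    """Length of paper (in metres) needed to stack n_labels vertically."""
    return n_labels * label_height_mm / 1000.0


def labels_that_fit(length_m, label_height_mm):
    """Number of labels that fit on a strip of the given length."""
    return int(length_m * 1000 / label_height_mm)


# Assumed space per label, derived from 10 km / 2 million species.
LABEL_HEIGHT_MM = 5.0

print(paper_length_m(2_000_000, LABEL_HEIGHT_MM))  # 10000.0
print(labels_that_fit(1660, LABEL_HEIGHT_MM))      # 332000
```

At this assumed spacing, the 1660 m strip quoted for plants would hold about 332,000 labels; that number is a consequence of the assumption, not a species count taken from the source.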

Representing Extinct Species and Their Groups

While the large number of species induces serious problems in tree (and network) visualization, it is generally agreed that, if the ToL is to be about the history of life, then extinct species or their groups should also be included (Donoghue and Cracraft 2004, p. 3; Gogarten and Townsend 2005). This further increases the number of entities to be represented in the graph. Darwin probably agreed with this requirement—the only figure in the Origin may be interpreted as a model tree showing ancestor–descendant relationships (Dayrat 2005) at the population level, as said above. Of course, a ToL determined merely on the basis of molecular information cannot include fossils—it refers to the past of organisms living in a single slice of time and stratigraphic information is only used for calibration (Benton 2001; Benton and Ayala 2003; Benton and Donoghue 2007; Forest 2009; Clarke and Boyd 2015). However, depicting who is sister to whom is only part of the story; such cladograms are therefore by no means “reconstructions” of phylogeny.

Simultaneous representation of extinct and extant species in the same tree requires careful synthesis of molecular and morphological cladograms, which is a great challenge for contemporary biology. This practice is relatively rare at the level of the ToL, and extinct taxa, if included, usually appear only in subtrees of life. For example, in diagrams of the Tree of Life project (Maddison and Schulz 2007),Footnote 11 extinct and extant species or genera appear as sister nodes—that is, all entities are terminals on the tree. The same is true for the internet application developed by Rosindell and Harmon (2012) and de Vienne (2016). Recent approaches to model-based phenetic reconstruction of phylogeny (Puttick et al. 2017) also provide cladograms with all taxa as terminal nodes. This is a problem, because one cannot exclude the possibility that any fossil is an ancestor to another or to an extant species (Foote 1996), especially if the fossil lacks apomorphic characters.Footnote 12 The sister group relationship is thus questionable, so that the branching topology is not necessarily a faithful representation of evolutionary pathways.Footnote 13 Since we can rarely (or never) answer the question as to whether a fossil is ancestor of or an extinct sister to another species, a historical ToL will always be impossible to derive for theoretical reasons.

There is also a practical problem, well known to everyone from Darwin’s time to our molecular era: the fossil record is extremely incomplete and unbalanced. The fact that only 250,000 or so fossil species have been described so far demonstrates our limited knowledge about past life. The various taxonomic groups have very different probabilities of being preserved (Donoghue and Yang 2016), e.g., for soft-bodied organisms it is nearly zero; and the discovery of fossils has always been a matter of luck. Raup (1992) estimates that at least 99% of “all species”Footnote 14 that lived in the past have disappeared forever, and these are not only unicellular organisms. We can safely say, therefore, that the Darwinian, truly historical, ToL together with its subtrees is absolutely unknowable with species as nodes. The same statement holds true for a historical NoL with similar taxonomic resolution.

If the idea of illustrating ancestor–descendant relationships by a tree is unrealistic, then do we have to give up every attempt at visualizing the paleontological past? Can we still compile a diagram that shows at least the known diversity of extinct groups and its change over time as well? To find an answer we have to go back in time: this goal is achieved by an old-style graphical tool that has become almost completely forgotten, the so-called romerogram, or spindle (bubble-, balloon-) diagram formerly so popular in paleontology.

Archetypes of these diagrams were drawn by paleontologists (including the non-evolutionists Louis Agassiz and Edward Hitchcock) in the nineteenth century. Thanks to Osborn (1917), Vialleton (1929) and especially to Romer (1945), after whom this type of diagram is named, such images were extensively used for demonstrating temporal changes in the richness of vertebrate groups in the fossil record. Other uses include paleobotany (Stewart and Rothwell 1993) and echinoderm systematics (Sprinkle 1992), but see Pietsch (2012), for a fuller account.

More apt are the other informal terms for the diagram, because the “spindles” are meant to illustrate the beginning and the end of the existence of a (higher Linnaean) taxon over geological time. The width of the spindle is proportional to the diversity of the represented group at any point of time. Its widest part corresponds to the period when the taxon reached its highest diversity, and a given spindle may have submaxima as well. Within-group relationships of taxa are thus omitted; this problem is set aside by using the completely filled spindle shape.
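As a rough illustration of how a spindle encodes diversity, the following sketch converts a series of diversity counts per time bin into drawn widths, so that the widest part falls at the diversity peak and secondary peaks appear as submaxima. The function name `spindle_widths`, the maximum width and the counts are assumptions made for illustration only; they are not taken from any published romerogram.

```python
def spindle_widths(diversity, max_width=10.0):
    """Scale diversity counts per time bin to drawn spindle widths.

    The widest part of the spindle corresponds to the bin with the
    highest diversity; an empty or all-zero series yields zero widths.
    """
    peak = max(diversity, default=0)
    if peak == 0:
        return [0.0 for _ in diversity]
    return [max_width * d / peak for d in diversity]


# Hypothetical counts for one taxon in successive time bins (oldest
# first); the values are invented and include a secondary peak.
counts = [0, 2, 8, 20, 12, 16, 3, 0]
widths = spindle_widths(counts)
```

The zero widths at both ends mark the beginning and the end of the existence of the group, which is all that the filled spindle shape is meant to convey.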

In Romer’s work, some spindles were directly connected to others, demonstrating “ancestor-descendant” relationships between these groups, but the connecting lines were dotted in most cases, reflecting uncertainty of the relationships. The cladistic component in these diagrams was therefore weak, if present at all.

Viewed through modern eyes, there was another disadvantage: derivation of one group from another leads to obvious paraphyly of the ancestor. For these reasons, romerograms are no longer used in the form originally proposed. More recently, another vertebrate paleontologist, Benton (2005), combined the advantages of romerograms with those of cladograms. In his diagrams, no group was derived directly from another, while the terminal “edges” were drawn like spindles to illustrate the relative diversity of the groups.Footnote 15

Although the author himself, and other sources as well, refer to these diagrams as evolutionary “trees” (e.g., Fig. 4.21 on p. 105 in Benton 2005), these are neither mathematical trees nor do they bear any resemblance to woody plants. But then, what in fact are they?

Is the Tree the Best Metaphor?

As generally accepted, bifurcations during the evolutionary process may be best illustrated by “tree-like” diagrams. Rooted mathematical trees are applicable to depict both SG and AD relationships, but their use as a true ToL necessarily is burdened by visualization problems and epistemological limitations. Diagrams resembling “botanical” trees in shape are less often displayed in scientific publications (but see examples in Supplementary Table S2) while they are more commonly used for educational purposes (e.g., museum displays, MacDonald 2010), and may serve as alternatives. An internet search for biologically relevant images reveals that in fact the “Tree of life” takes the form of a plant almost as often as it does the mathematical construct. In this section I examine whether this botanical simile is indeed useful for scientific purposes.

It was Darwin himself who realized early in his career that the tree (of life) metaphor may not be the best choice. After returning from his voyage around the world, he first formulated his pioneering ideas on the transmutation of species, and drew several branching diagrams for hypothetical taxa in his notebooks. By examining these writings, not published until the 20th century and made available on the internet only fairly recently, we find that he was more hesitant than suggested by the passage cited in the Introduction from the Origin. After “discovering” the ToL (as Eldredge 2005 put it), in Notebook B (Darwin 1837, p. 23) he added a most remarkable comment (Darwin 1837, p. 25), as pointed out by many authors (De Beer 1960, p. 44; Hull 1985; Bredekamp 2005; Voss 2007; Gayon 2011):

The tree of life should perhaps be called the coral of life, base of branches dead; so that passages cannot be seen.

This is a clear reference to the branching true corals,Footnote 16 which he examined at several stops during his travel; and it may also recall some coralline red algae collected in South America (Bredekamp 2005; Maderspacher 2006). The coralline algae had a branched macroscopic structure similar to corals, and were fossilized thanks to a hard crust of lime deposited on their surface. Two small drawings on the next page of the same notebook may be considered as illustrations of the coral concept (Fig. 1a).

Fig. 1

a Darwin’s drawings on page 26 of his Notebook B—which may be considered as the first instances of corals, which he specifically mentions on the previous page (from Darwin Online, http://darwin-online.org.uk/). Heuristic visual models for the Coral of Life, as simplified from b Zhaxybayeva and Gogarten (2007) and c Gaucher et al. (2010). Lineages leading to extant entities are in black

Darwin did not elaborate this idea any further; and later in the same notebook he used “Tree of life” again, just as he had in Notebooks C and D (1838), and later in the manuscript form of the Origin (“Natural Selection”, Stauffer 1975) and its final version. We can only assume that Darwin gave up his early musing on corals under strong influence from Victorian society because: a) the Tree of Life as a metaphor taken from the Bible appeared to him as more convincing and attractive for religious readers of his writings; b) genealogical (or “family”) trees for ruling dynasties and large families were also common and well-known in the 19th century; and c) the heavily and sometimes nicely branching tree plants were more familiar to everyone than corals, which are simple and obscure organisms observable only in marine environments. Thus, choice in favor of trees was almost a necessity, “historically and culturally, not scientifically, predicated” (Hellström 2012).

However, as some authors note today, Darwin’s skepticism was well-grounded—the tree is inferior to (branching) corals, even as a metaphor for populations, which form a continuous stretch in space and time, very often “decorated” by splits (bi- or multifurcations) and broken by extinctions.

First of all, as implied by Darwin’s comment, only the uppermost parts of a coral are alive, with the older branches being dead. The fragments may accumulate over millions of years on the sea floor, thus producing atolls, for example, which Darwin knew very well. A layer of dead material potentially 1000 m thick may be a better allegory of the extinct past than is a tree, since trees are alive from the lowest parts (apical root cells) to the tips of ultimate leaves. Second, a coral starts to grow upwards on the rock (or other) surface as a tiny polyp, and ramifies later. In contrast, a tree grows from seed in two directions, developing underground parts, the roots, which may be just as large and similarly branched as the distal part, the crown. Therefore, the root-trunk-crown system, i.e., the entire plant, cannot serve at all as an appropriate metaphor of evolution or phylogeny.

The third problem has to do with the shape of the parts: coral branches may be fairly even in diameter, sometimes wider above than below, suggesting that the abundance of organisms they may represent diagrammatically did not necessarily change much over time. An average tree, regarding its visible parts, is the thickest at the ground level, and then decreases gradually in diameter along the trunk, towards the limbs, smaller branches and then to twigs. This gives the false impression that richness (in whatever sense) or abundance of populations is continuously diminishing over time, which is obviously not the case.

The fourth serious difficulty with the tree metaphor is the most widely recognized argument against its use. Once bifurcated, tree branches do not fuse again under normal circumstances, whereas separate evolutionary lineages may be merged through cell fusions. At this point I will recall what has been said in the section on biological hierarchy: a complete ToL cannot be a mathematical tree, but is a network at all levels from the cellular to the species. Several authors have argued that Darwin’s almost forgotten suggestion may resolve this problem, as well. In particular: the compact dump of fragmented coral may stand for the effect of lateral relationships that confound the vertical pattern of inheritance (Fournier et al. 2009); corals grow on each other, thus symbolizing competitive struggle (Müller-Ville 2009); and many coral species (fan corals) exhibit a more or less anastomosing structure anyway (Olendzenski and Gogarten 2009), especially near the base of the colony.

Olendzenski and Gogarten (2009) and Müller-Ville (2009) even suggested that the unknown past of lineages may be best represented by an entire coral reef. Zhaxybayeva and Gogarten (2007) presented first a graphical model they called tentatively the “tree of life/coral of life” with several merging extinct lineages (see Fig. 1b).Footnote 17 In another graphical scheme, Gaucher et al. (2010) used corals to distinguish between two major phases of evolution, the first dominated by horizontal inheritance and the second by vertical (Fig. 1c). These early heuristics were not elaborated further, and no other uses of the coral have been published in this context.

The question remains: trees or corals? To resolve the confusion between mathematical trees and botanical tree metaphors, which often causes misinterpretation of phylogenetic relationships, I have suggested (Podani 2017) as a first step the use of the term Branching Silhouette Diagram (BSD). This refers to drawings that convey phylogenetic/evolutionary information but do not qualify as mathematical trees. I also provided definitions for its four main types, according to whether time is considered or disregarded and whether the BSD portrays AD or SG relationships; this elevates these images from the status of metaphors to that of mathematical constructs. By revitalizing Darwin’s early musing and operationalizing all other proposals mentioned above, I defined corals as diachronous BSDs, embedded into the time dimension and thereby showing AD relationships. This is in full agreement with Darwin’s note and its subsequent interpretations.

Each point in the coral may represent an individual, a population, a species or even a higher taxon, depending on the objectives of the researcher. Thus, it contains as special cases the romerograms, the stem “trees” (Wiley and Lieberman 2011) and other tree-like structures within which certain events or processes (e.g., lineage sorting) may be displayed (the latter often referred to as “species trees” in the literature of molecular genetics, e.g., Maddison 1997Footnote 18; Nakhleh 2013). The horizontal time slice of the coral corresponds to a partition of organisms, and the width of the branches may be drawn to be proportional to the relative diversity of the corresponding class in the partition. By zooming deeply into the coral, it becomes a graph with individuals as nodes and links depicting parent/offspring relationships (“tokogenetic” graphs of Hennig 1966; as in Maddison and Maddison 1992; Baum and Smith 2013).

The other type of BSD which shows AD relationships is the achronous cactus (named after Bessey’s famous diagram, see e.g., Pietsch 2012, Fig. 112), in which pads (nopales) represent higher taxonomic categories. In this diagram, a descendant higher taxon may be extinct, while its ancestor taxon is still surviving. Consequently the time dimension is scrambled, whence the name.

BSDs depicting the SG (rather than AD) relationship among entities have two different versions, both reminiscent of the traditional “botanical” or figurative tree diagrams drawn by Haeckel (1866): the synchronous and asynchronous oaks. They are, in practice, cladograms for organisms living at a given point of time or at any point of time, respectively; but they may be artistically designed with a thick trunk and continuously tapered branches and twigs.

Figure 2 summarizes this classification of BSDs, allowing a comparison with the analogous four groups distinguished among mathematical trees. In Supplementary Table S2, there are examples of these categories, represented by icons reduced from the original figures (col. 2) and classified into tree or BSD types (col. 12). This table demonstrates that all eight possible combinations (tree or BSD × time considered or disregarded × AD or SG) of tree-like phylogenetic/evolutionary diagrams are represented in the sample collection of ToLs, the most common type being the synchronous tree (= cladogram), which is the typical result of molecular phylogenetic investigations.
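The eight combinations can be enumerated mechanically, as in the following minimal sketch. The tuple labels are shorthand introduced here, not the terminology of Fig. 2; only the BSD names coral, cactus and oak come from the text, and the sketch deliberately does not encode which oak (synchronous or asynchronous) belongs to which time setting.

```python
from itertools import product

# The three binary criteria named in the text.
KINDS = ("mathematical tree", "BSD")
TIME = ("time considered", "time disregarded")
RELATIONS = ("AD", "SG")

# BSD names mentioned in the text; both SG silhouettes are oaks.
BSD_NAMES = {
    ("time considered", "AD"): "coral",
    ("time disregarded", "AD"): "cactus",
    ("time considered", "SG"): "oak",
    ("time disregarded", "SG"): "oak",
}


def diagram_types():
    """Enumerate every combination of kind, time treatment and relation."""
    rows = []
    for kind, time, relation in product(KINDS, TIME, RELATIONS):
        label = BSD_NAMES[(time, relation)] if kind == "BSD" else "tree"
        rows.append((kind, time, relation, label))
    return rows


types = diagram_types()
for row in types:
    print(" | ".join(row))
```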

Fig. 2

Typification and illustration of trees and tree-like phylogenetic diagrams according to Podani (2013, 2017). Upper four: mathematical trees, lower four: branching silhouette diagrams, ADR ancestor–descendant relationships, SGR sister group relationships. (filled circle) extant entity, (open circle): extinct entity (except for the coral and the oak branches)

To recapitulate what has been said so far, corals are more suitable than trees to express both phylogenetic relationships and the changing diversity of groups simultaneously, i.e., the history of life, as both metaphors and mathematical objects. The use of coral branches solves many problems that a tree cannot: (1) species need not appear as separate nodes, as in trees, and this way the species delineation problem is circumvented; (2) the relative richness of groups and its changes over time may be visualized; (3) AD relationships are assumed to exist along the branches, without the need to explicitly display any particular fossil, known or hypothetical; and (4) different genetic events and processes associated with evolution may be drawn onto the corals. Importantly, the coral has a cladistic component, i.e., the relation corals ⊃ trees exists.

A further advantage is that corals may be modified, if necessary, to accommodate horizontal events as well, leading to fan corals for which the definition is given in Supplementary Fig. S2. Fan corals are thus analogous to networks. We must keep in mind that coral diagrams have a branching structure because the partition of organisms changes over time (due to speciation and extinction), i.e., there must be a temporally interpretable underlying classification.

This leads us to the last topic to be discussed in detail, namely historical taxonomy.

Classification Schemes, Trees and Corals

The classification of organisms is traditionally Linnaean (= rank-based): species are grouped into genera, genera into families, families into orders, and so on, up to phyla and then to kingdoms. In other words, there is an inclusive hierarchy that implies that species in the same genus are more closely related morphologically to each other than to members of other genera. Likewise, genera in a given family are closer to one another than they are to genera in other families. As a consequence, the morphological gap is narrower between two genera in the same family than between two families in the same order; and these relationships hold true all the way up the hierarchy. Also, the system of ranks implies that differences between plant genera in the same family are “commensurable”, for example, with those between genera of a given animal family. This system was developed long before the theory of evolution, with the intention to classify organisms living at the present slice of time. Although Linnaeus did have some knowledge about fossils as well, he deliberately classified them among the rocks. For him, classification of petrified forms was not a real problem to worry about, for reasons I need not elaborate here.

Evolutionary theory has always been a major challenge to this system. Lamarck (1809, p. 20) asserted that nature did not produce classes, orders, families or even species, only individual organisms, which follow one another in time and resemble each other; taxa are merely our inventions, he said. Darwin (1862, pp. 330–331) warned that the system of ranks would not work if we knew more details about past life. It is due to the absence of extinct forms and “to the consequent wide gaps in the series, that we are enabled to divide the existing species into definable groups, such as genera, families, and tribes”. In the Origin (Darwin 1859, p. 330), he commented that groups that are clearly separable at present, based on many characters, differ much less in their ancient members, which are therefore closer to each other in the past than are their descendants in the present.Footnote 19 In other words: (1) ranks are arbitrarily assigned to groups; (2) in the evolutionary continuum taxa can only be separated artificially; and (3) the Linnaean hierarchy is meaningless if the recent grouping is projected into the past, because the morphological gaps diminish when we go back in time.Footnote 20 Thus, the Linnaean system is meaningful, at best, at a given slice of time (as Crowson 1970 has suggested), but the assignment of ranks remains arbitrary anyway.

The conceptual basis for a classification that is fully compatible with evolution is almost as old as the idea of natural selection itself. Darwin, in yet another book (Darwin 1871, vol. 1, p. 188) wrote:

[The natural system,] …it is now generally admitted, must be, as far as possible, genealogical in arrangement,– that is, the co-descendants of the same form must be kept together in one group, apart from the co-descendants of any other form; but if the parent-forms are related, so will be their descendants, and the two groups together will form a larger group…

In this way, Darwin makes clear that there is a single, natural system in the organic world, the one true classification governed by a natural process, evolution.

Darwin’s proposal was made operational by Hennig (1966), who launched the cladistic approach. He made central to his theory the requirement that every group in a classification should consist of an ancestor and all of its descendants; any other groups are non-natural. The natural groups, called clades, can be circumscribed in (mathematical) tree representations differently, depending on the type of the underlying graph (Supplementary Fig. S3ab). In trees showing AD relations, a clade contains an ancestor node and all of its descendant nodes. In cladograms, which depict SG relations, a clade is the set of all and only those terminal nodes that derive from the same ancestor node, i.e., a complete sister group system (Nelson 1971), in which case the ancestor is usually unknown and hypothetical.Footnote 21 The contents of a clade, however, need not be enumerated extensionally; another possibility is to define clades intensionally through membership rules (described in detail in the PhyloCode; Cantino and de Queiroz 2010). For example, an apomorphy-based clade contains a (usually hypothetical) ancestor in which the apomorphic character originated, and all of its descendants. To illustrate this concept, the ancestor is placed somewhere between two nodes in the phylogenetic “tree”, which means that visualization is in fact by corals rather than treesFootnote 22 (Supplementary Fig. S3c). In a coral model, branches are analogous to monophyletic groups, or clades in AD trees: a branch contains an ancestor class (group of organisms) and all of its descendant classes (Supplementary Fig. S3d).
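The two ways of circumscribing a clade in a mathematical tree can be sketched as follows. This is only an illustration under stated assumptions: the tree is stored as a hypothetical parent-to-children mapping, and the names `descendants`, `clade_ad`, `clade_sg` and `toy_tree` are invented here rather than taken from the PhyloCode or any phylogenetics library.

```python
def descendants(tree, node):
    """Return all nodes reachable from `node` by following child links."""
    found = []
    stack = list(tree.get(node, []))
    while stack:
        current = stack.pop()
        found.append(current)
        stack.extend(tree.get(current, []))
    return found


def clade_ad(tree, ancestor):
    """Clade in an AD tree: the ancestor node plus all of its descendants."""
    return {ancestor, *descendants(tree, ancestor)}


def clade_sg(tree, ancestor):
    """Clade in a cladogram: only the terminal nodes under an internal node."""
    return {n for n in descendants(tree, ancestor) if not tree.get(n)}


# Hypothetical rooted tree: R splits into X and C, X splits into A and B.
toy_tree = {"R": ["X", "C"], "X": ["A", "B"]}

ad_clade = clade_ad(toy_tree, "X")   # ancestor X together with A and B
sg_clade = clade_sg(toy_tree, "X")   # terminals A and B only
```

In the cladogram reading the internal node X is left out of the clade, which mirrors the remark that the ancestor is usually unknown and hypothetical.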

Despite his pioneering role in disseminating cladistic thinking, Hennig did not refute the Linnaean system, and supported assignment of ranks to clades. This is not logical, however, because increasing ranks imply the existence of increasing gaps, as noted above, while there are no restrictions regarding minimum (morphological or molecular) differences between clades. The genealogical hierarchy, although it is also inclusive (large clades include smaller ones, etc.) does not require this.

Although there is increasing support in the scientific community to place the classification of the living world onto an evolutionary basis, the Linnaean view persists. Newly discovered fossils are still forced into the straightjacket of the rank-based scheme. Many recent “reconstructions” of the molecular phylogeny of particular groups aim to revise or refine the Linnaean classification, thereby rearranging the contents of families, orders and other higher taxa, and reallocating many species from one genus to the other.

There is no consensus, even among the supporters of the cladistic approach, regarding the implementation of classifications and nomenclature,Footnote 23 for historical, practical, technical and personal reasons. Direct followers of the Hennigian approach usually insist upon ranks even though the classification is derived entirely from a cladistic framework. For example, Benton (2005, Appendix), who was the first to combine clades with romerograms, adopts a completely ranked classification for vertebrates, with an inevitable over-proliferation of categories (for example, superlegion, legion, sublegion, infralegion, cohort, magnorder, grandorder, mirorder, etc. inserted between standard ranks). Through all versions published to date, the APG (Angiosperm Phylogeny Group 2016) insists that angiosperms be divided first into unranked clades (such as eudicots ⊃ superasterids ⊃ campanulids) but then, all of a sudden, arranged into Linnaean orders and families.Footnote 24 Many articles in Wikipedia refer to classifications in which clades and ranked categories are intermingled,Footnote 25 thereby triggering taxonomic chaos rather than clarifying the concept of a natural genealogical order, which confuses the readers of this popular digital encyclopedia.

In published ToLs, as columns 8–9 in Supplementary Table S2 indicate, the use of ranks is a fairly general practice, with some authors explicitly referring to families and orders, for example, as integral parts of the classification projected to, or deduced from, the tree. A few ToLs that show only major groups of life (up to 30–40) apparently rely on clades, but these diagrams do not provide classification details within these groups, so that the authors’ attitude towards ranks remains undisclosed. Parts of ToLs visualized as subtrees contain Linnaean taxa (for instance, those inherited from the APG classification of plants) in all of the books and internet resources listed in the supplement.Footnote 26 To my knowledge, therefore, no high-resolution ToL is available that is completely free from the non-evolutionary idea of Linnaean ranks.

The Coral of Life

After examining major criteria for tree-like diagrams of life in detail, relying largely upon published work and my own evaluation, we may conclude that the graphical scheme showing as many aspects of the history of life as possible should ideally be:

  • Cladistic: evolutionary divergence events, as the most essential elements of the phylogenetic process, are depicted similarly to trees;

  • Chronological: the diagram is embedded in the time dimension, preferably at a linear scale, which demonstrates the depth of time needed for evolution more sensibly than the logarithmic or any other scale;

  • Paleontological: known extinct organisms (fossils) and their groups are also represented;

  • Historical: major events of macroevolution are noted, and other comments (about geological events, mass extinctions, for example) are included in the drawing area;

  • Self-similar: while a single main diagram should convey as much information as possible, its parts should be zoomable to show particulars;

  • Continuous: the diagram is continuous in the mathematical sense and is thus free from constraints imposed by discrete mathematics (graph theory), which is especially critical when species (or higher taxa) are taken as distinct historical entities;

  • Anastomosing: the diagram tolerates reticulation events, which are either hidden within coral segments or are explicitly shown as links between two segments;

  • Rank-free: a classification of past and present life is superimposed on the diagram, and groups are named such that the non-evolutionary taxonomic concept of ranks is neglectedFootnote 27;

  • Proportional to diversity: species richness is visualized at the same scale for all groups, according to the best available information, thus illustrating faithfully how biodiversity is distributed over the different groups.

All of these criteria are satisfied simultaneously by a coral diagram inserted into a two-dimensional coordinate system (Fig. 3, a high resolution poster-size image is supplied as Supplementary Fig. S4), with time as the vertical axis and species richness as the horizontal axis, and with each point representing a population of a species. Timing, and the division of geological time into eons, eras, periods and epochs (the latter appearing only in insets), follow the standards established by the International Commission on Stratigraphy (2018). The entire range of the horizontal axis corresponds to the 2 million species known today. The ordering of groups along this axis is arbitrary, because the diagram may be rotated at every branching point: usually larger or “more developed” groups are placed on the right. Here, prokaryotes are at the left, followed by excavates, plants, various groups of meta-algae and other protists, amoebozoa, fungi and then metazoa. Within the latter group, some “basal” branches, such as sponges, are arranged first, followed by deuterostomia and protostomia. The small red arrow points to the position of our species, Homo sapiens, within the vertebrates. This group is deliberately positioned in the middle, to break with the long (selfish and misleading) tradition of placing humans at the end of a one-dimensional ordering of life. Consistent with this, the most successful group in terms of species richness, namely the insects, and within this group, the beetles, are positioned at the right.Footnote 28

Fig. 3

The prototype of the Coral of Life. Prepared in Microsoft PowerPoint, based on a first version drawn by hand. A high resolution poster variant is given as Supplementary Fig. S4

To the casual eye, the diagram is most reminiscent of romerograms, but in the coral the cladistic component is stronger and the classification is cladistic, rather than rank-based. This is meant to be a prototype of the Coral of Life, prepared using phylogenies and species richness data published in various sources, and divergence dates from the TimeTree of Life (timetree.org).

Since estimated divergence dates are burdened with high uncertainty, and therefore have high variance for a given event (Morrison 2009), I used mean values whenever there was no conflict with the sequence of bifurcations—otherwise the dates had to be shifted a bit. The width of coral segments is proportional to species richness, scaled to the entire axis. The smallest line width is used for branches for which no richness data are available. Also, due to the constraining effect of line width, species richness of small groups cannot be visualized proportionally. This problem may be solved in the future by increasing the resolution of the figure.
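The scaling of segment widths can be illustrated with a minimal sketch. It is not the procedure used to prepare Fig. 3, which was drawn by hand; the function name `segment_widths`, the axis width, the minimum line width and the richness figures are all assumptions chosen for illustration.

```python
def segment_widths(richness, axis_width, min_width=0.5):
    """Map species richness per branch to drawn segment widths.

    Widths are proportional to richness, scaled so that the whole axis
    corresponds to the total richness of all branches with known data.
    Branches without data (None) get the smallest line width, and very
    small groups are clamped to it, so they are not drawn proportionally.
    """
    total = sum(r for r in richness.values() if r is not None)
    widths = {}
    for name, r in richness.items():
        if r is None or total == 0:
            widths[name] = min_width
        else:
            widths[name] = max(min_width, axis_width * r / total)
    return widths


# Invented round numbers that sum to 2 million, for illustration only.
example = {
    "group_a": 1_500_000,
    "group_b": 499_800,
    "group_c": 200,
    "group_d": None,   # no richness data available
}
w = segment_widths(example, axis_width=1000.0)
```

Here `group_c` would need a width of 0.1 units to be proportional, but it is drawn at the minimum of 0.5, which is the constraining effect of line width described above; a higher-resolution figure corresponds to a larger `axis_width` relative to `min_width`.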

For many branches of the coral, where fossils are scarce or unavailable, the shapes were drawn to indicate gradual diversification, starting from the latest divergence event. This means that the coral still has a strong heuristic component—which may be reduced by experts of particular groups in further editions and revisions of the diagram. There is also an artistic element regarding the arched shape of branches, which may also be drawn differently.

Colors are used to make distinctions between major branches of life: magenta for eubacteria, orange for archaea, black for various non-photosynthetic protists, green for photosynthetic eukaryotes and blue for unikonts (Amorphea: amoebozoa, fungi and animals). Classification into large nested branches is shown at the top using curly brackets.

The self-similarity of the diagram is illustrated by four insets, gradually expanding the details for monocots (Monocotyledoneae), and within them for the branch of orchids (Orchidaceae), and then for Lady’s slipper orchids (Cypripedioideae), and finally for species assigned to the genus Cypripedium.Footnote 29 The last inset is the only one explicitly using a fan coral, which demonstrates the ease with which orchids hybridize. Only natural hybrids are shown; in the laboratory, hybrids may be formed for practically every pair of species. Details of the other branches of life may be visualized similarly, presumably using a zooming tool, to be developed and implemented later, and preferably as an internet application.

Major horizontal events in the main diagram are indicated by dotted or dashed arrows. At the current scale, I can show relatively few events involving cell fusion (as hypothesized for the origin of eukaryotes, for example), which means that at this point the diagram is anastomosing, and is in fact a fan coral (Supplementary Fig. S2). Strongly zoomed parts of the diagram would have more horizontal coral segments, demonstrating hybridization, as shown for Cypripedium. Small stickers are used to label important evolutionary innovations (appearance of cell types, large branches) and to show the approximate position of enigmatic groups (Ediacara fauna). The red explosion marks on the right refer to the dates of major mass extinctions, including the one we experience today.

Conclusions

The Tree of Life has its origins outside biology, being linked in a number of ways to religion, spirituality, mythology, ancient and folk art, literature, and the history of civilization. Consequently, it may be—and in practice is—interpreted with considerable freedom in the humanities. This explains, but does not justify, why the meaning of the ToL in biological publications is also extremely ambiguous: it may be understood as a concept, a diagram, a metaphor, or a model, each visualized in many different ways.

Although the ToL is expected to incorporate several features of the history of life on Earth simultaneously, most published ToL diagrams are limited in scope and taxonomic resolution; they have little or no paleontological relevance; they often neglect the time dimension; and the underlying classification contradicts evolution, i.e., is rank-based.

The question as to whether the ToL can be drawn at all is raised mostly by microbiologists, who usually emphasize: (1) the discrepancy between gene trees; (2) the relationships of the three domains and the position of the root; (3) different phenomena leading to loops in the graph; and (4) whether ToL is to be understood at the level of genes, cells, individual organisms or higher up in the hierarchy of life. Due to the strong bias towards prokaryotes, 99% of known species are thus down-weighted in their diagrams. Notwithstanding that the tree condition is easily violated in the microbial world, these are predominantly synchronous cladograms, rooted or unrooted, with some genetic events indicated along edges—while metaphorical forms are rare (e.g. Doolittle 1999; McInerney et al. 2008).

In systematics outside microbiology, the species ToL is the primary target, such that: (1) many diagrams are restricted to depicting relationships within a single “higher” taxon; (2) fossils, if they appear, are displayed as sisters to extant species; (3) the time dimension is ignored except for “timetrees”; and (4) tree-thinking is often metaphorical rather than graph theoretical. The tree property (no loops) is rarely questioned, although it is widely acknowledged that hybridization among closely related species is common.

The fact that the term Tree of Life still dominates biological thinking is due to two main factors: 1) fitting trees to morphological or genetic data is computationally more feasible than calculating networks (cf. Morrison 2013, 2016); and 2) for the actual subset (or sample) of species selected in a study, the tree is indeed an approximately correct summary of evolutionary relationships, in many cases. Tree-generating algorithms have contributed a lot to our understanding of evolution and phylogeny, and will continue to do so in the future. However, if we carefully consider the different levels of biological organization, from cells to species, we can easily see that the history of the entirety of life is not tree-like in most parts. This is not because the histories of genes are incongruent—the predominance of sexual reproduction in eukaryotes alone explains the existence of both vertical and horizontal links in the immense historical graph of life. This might suggest that an appropriate display would rather be a directed graph, and our goal would therefore be to strive for the Network of Life (NoL).

However, neither trees nor networks are feasible for visualizing the history of all of life, the “grandest earthly enterprise”, as a single picture, for several reasons. The use of cells, individuals, populations or species has an obvious epistemological limitation: we do not and never will know sufficient details to prepare, or even to imagine, such a diagram. We could restrict the contents to known extant and extinct species, but their number is too large to be shown in a single mathematical graph. Sampling evenly from the (otherwise arbitrary) higher taxa is not a solution, because in this way the species diversity of these groups cannot be portrayed graphically. Furthermore, graphs are tools of discrete mathematics, in which we assume that the operational units are distinguishable entities. But the living world fails to satisfy the discreteness criterion in both space and time when it comes to categorizing individuals into species, and species into larger groups. To demonstrate continuity and fuzziness in the history of life, some other means of visualization is required.

The use of the tree as a metaphor is not free from difficulties, either. The term tree has two definitions in science, one botanical and the other mathematical (see Glossary); any other usage of the word can only be ambiguous and should therefore be avoided. Even as metaphors of evolution or phylogeny, trees have several limitations that render them useless in practice for scientific communication: (1) a tree is a living entity from the root tips to the topmost leaves; (2) it has a root system comparable in size and complexity to the crown; (3) the trunk and the branches continuously decrease upwards in diameter; and, last but not least, (4) the branches normally do not form anastomoses. There is only one fundamental feature of trees that remains meaningful metaphorically, namely ramification. This property alone explains why, in the history of science, trees have become metaphors of hierarchical classification (starting with Peter Simon Pallas and Augustin Augier) or phylogeny (pioneered by Haeckel), while the inconsistency of the other botanical characteristics of a tree with evolutionary thinking was largely ignored.

We have to consult another important person in the history of biology, Darwin, to find a solution at the level of both metaphors and mathematical objects. He suggested that corals or coral-like branching creatures (such as certain algae) of the sea could be a better metaphor of evolution than a tree. A branching coral is free from problems (1)–(3) mentioned above, and a fan-coral may have horizontally connected branches as well. Furthermore, an entire coral as a continuous object may also sensibly represent the spatio-temporal continuum of populations, species or their groups. Thus, drawing a coral is not burdened with the necessity of specifying every detail in the diagram, unlike drawing trees and directed networks. Although the idea of a coral has long been neglected, many types of figures used in biology in the past 100 years are actually comparable to the concept Darwin had in mind. These include romerograms, “phylogenetic stem trees”, and the gene-tree-enclosing “species trees” of molecular genetics. The coral as a visual model of phylogeny re-appeared only a few years ago (Zhaxybayeva and Gogarten 2007; Gaucher et al. 2010), apparently escaping general attention, and was defined mathematically as the most common instance of branching silhouette diagrams even more recently (Podani 2017).

Raising the coral to the level of mathematically definable constructs allows the examination of its potential for replacing trees and networks to illustrate the history of life as a single diagram. I have presented a prototype of the Coral of Life which, being embedded in the time versus species richness space, depicts simultaneously many meaningful aspects of the history of life, such as evolutionary divergence, chronology, paleontology, and some major evolutionary innovations. As such, this diagram is suitable for educational purposes even in its present form, while its precise structure may be continuously refined in the future. The coral diagram, and its zoomed parts, may not only capture biological history but also provide a solution to the problem raised by Avise (2009, p. 23), namely the need for a “first ever universally standardized scheme of biological classification”. The shift from tree- or network-thinking to coral-thinking is inevitably complemented by a change from the Linnaean system to a genealogy-based classificatory scheme, in which coral branches are distinguished as taxa. This will not be easy, because the tree simile is at least as deeply anchored in our brains as is the rank-based classification of life.