Mammalian phylogeny reveals recent diversification rate shifts

Stadler, Tanja

doi:10.1073/pnas.1016876108

Research Article

Biological Sciences

Edited by David M. Hillis, University of Texas at Austin, Austin, TX, and approved March 2, 2011 (received for review November 9, 2010)

March 28, 2011

108 (15) 6187-6192

https://doi.org/10.1073/pnas.1016876108

Abstract

Phylogenetic trees of present-day species allow investigation of the rate of evolution that led to the present-day diversity. A recent analysis of the mammalian phylogeny challenged the view of explosive mammalian evolution after the Cretaceous–Tertiary (K/T) boundary (65 Mya). However, due to lack of appropriate methods, the diversification (speciation minus extinction) rates in the more recent past of mammalian evolution could not be determined. In this paper, I provide a method that reveals that the tempo of mammalian evolution did not change until ∼33 Mya. This constant period was followed by a peak of diversification rates between 33 and 30 Mya. Thereafter, diversification rates remained high and constant until 8.55 Mya. Diversification rates declined significantly at 8.55 and 3.35 Mya. Investigation of mammalian subgroups (marsupials, placentals, and the six largest placental subgroups) reveals that the diversification rate peak at 33–30 Mya is mainly driven by rodents, cetartiodactyla, and marsupials. The recent diversification rate decrease is significant for all analyzed subgroups but eulipotyphla, cetartiodactyla, and primates. My likelihood approach is not limited to mammalian evolution. It provides a robust framework to infer diversification rate changes and mass extinction events in phylogenies, reconstructed from, e.g., present-day species or virus data. In particular, the method is very robust toward noise and uncertainty in the phylogeny and can account for incomplete taxon sampling.

It has been a long-standing question whether the rise of present-day mammals began following the mass extinction event at the K/T boundary (1, 2). The hypothesis of a significant rise in mammals following the K/T boundary was challenged when the mammalian phylogeny, including 4,510 present-day species, became available (3). This phylogeny is ∼83% complete at the species level (4). It allowed detection of diversification rate (speciation rate minus extinction rate) shifts throughout the evolutionary past of almost all present-day mammals. No shift was detected at the K/T boundary; however, a peak in diversification rates at ∼93 Mya was detected (3).

Estimating the timing and magnitude of diversification rate changes is key to understanding evolutionary patterns (5, 6). For example, reliable diversification rate estimates can help decide whether species richness is typically due to intrinsic causes (the ancestor evolved a new feature), extrinsic causes (the environment changed), or a complex combination of both. Already in 1996, Sanderson and Donoghue (ref. 5, p. 1) wrote “Few issues in evolutionary biology have received as much attention over the years [ … ] as those involving evolutionary rate.” However, until now there has been no general framework available to estimate diversification rates and their changes through time.

Detecting diversification rate changes is typically done by detecting changes in the slope of the lineages-through-time (LTT) plot of the present-day species (3, 7–9, 10). The LTT plot of the mammals and the main mammalian subgroups are displayed in Fig. 1. For constant speciation rate λ and constant extinction rate μ, the slope of the semilog LTT plot is λ − μ in the distant past and λ in the recent past (11), with the change in slope occurring ∼1/(λ − μ) time units ago. This change in slope is also called the “pull of the present” (7). A departure from a straight line in the distant past indicates a diversification rate change.
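The two limiting slopes can be checked numerically. The following is a minimal sketch, assuming constant rates and complete sampling, in which each lineage of the reconstructed tree splits at rate λ times the probability that a lineage alive at time t before the present has at least one present-day descendant (the standard constant-rate birth–death survival probability). The function name and the example rates are hypothetical.

```python
import math

def reconstructed_ltt_slope(lam, mu, t):
    """Expected slope of the semilog LTT plot at time t before the present.

    Assumes constant rates with lam > mu >= 0 and complete sampling.
    """
    r = lam - mu
    # Probability that a lineage alive at time t has a present-day descendant.
    survival = r / (lam - mu * math.exp(-r * t))
    return lam * survival

lam, mu = 0.3, 0.2  # hypothetical rates per million years
for t in (0.0, 1.0 / (lam - mu), 1000.0):
    print(t, reconstructed_ltt_slope(lam, mu, t))
```

At t = 0 the slope equals λ, and far in the past it approaches λ − μ, which is the pull of the present described above.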

Fig. 1.

Lineages-through-time plot of all present-day mammals, placentals, marsupials, the six largest placental subgroups, and 100 simulated trees (black). The maximum-likelihood parameters obtained from the mammalian phylogeny were used for simulations.

I argue that a slope change before the pull of the present slope change is not necessarily caused by a diversification rate change. Suppose a very high speciation rate λ₁ and a very high extinction rate μ₁ in the distant past, until say 25 Mya. The LTT plot of the species extant at 25 Mya has first a constant slope λ₁ − μ₁, followed by slope λ₁ when approaching 25 Mya (pull of the present, where the present is 25 Mya). Suppose more recent than 25 Mya, there was a constant speciation rate λ₀ and no extinction, meaning that all species alive at 25 Mya have present-day species descending. This observation yields an LTT plot of the present-day species with slope λ₁ − μ₁ in the distant past, λ₁ close to 25 Mya, and λ₀ more recent than 25 Mya. The diversification rate analysis based on slope changes in LTT plots for the time period before 25 Mya (as done for the mammalian phylogeny) (3) would have detected a period with diversification rate λ₁ − μ₁ followed by a period with diversification rate λ₁ before 25 Mya, whereas there was actually only one period with diversification rate λ₁ − μ₁ before 25 Mya.

An approach that estimates simultaneously all rates throughout the evolution of a considered phylogeny, instead of looking for local slope changes, is required to avoid such misleading results.

A New Likelihood Approach

I formulate an evolutionary model, the birth–death-shift process, where the speciation and extinction rates can change through time (Methods). When no rate change occurs, my model simplifies to the well-studied constant rate birth–death processes (12–16). By deriving the likelihood function (Methods, Theorem 2.7) of a phylogenetic tree under the birth–death-shift model, maximum-likelihood time intervals together with each interval's diversification rate (speciation rate minus extinction rate) and turnover (extinction rate divided by speciation rate) can be estimated. Likelihood-ratio tests are used to decide how many shifts are best supported. Using simulations (17), I show in SI Text that my analytical likelihood approach is powerful in detecting rate shifts accurately and is robust toward noise in data, such as uncertainty in divergence dates and unresolved polytomies. My likelihood parameter estimation method is available as an R package on CRAN (18).
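Because results are reported as diversification rate (λ − μ) and turnover (μ/λ), it can help to convert between that parameterization and the underlying speciation and extinction rates. The sketch below does only this algebra; the function names and the example values are hypothetical and are not estimates from the paper.

```python
def rates_from_diversification_and_turnover(r, eps):
    """Return (lam, mu) such that lam - mu = r and mu / lam = eps."""
    if r <= 0.0 or not 0.0 <= eps < 1.0:
        raise ValueError("need r > 0 and 0 <= eps < 1")
    lam = r / (1.0 - eps)
    mu = eps * lam
    return lam, mu

def diversification_and_turnover(lam, mu):
    """Inverse map: return (lam - mu, mu / lam)."""
    return lam - mu, mu / lam

# Hypothetical example: diversification rate 0.05 per Myr, turnover 0.5.
lam, mu = rates_from_diversification_and_turnover(0.05, 0.5)
print(lam, mu)
```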

Tempo of Mammalian Diversification

Using this unique likelihood approach, I estimated the maximum-likelihood speciation and extinction rates together with the shift times for the mammalian phylogeny (3) (SI Text S1). I detect four significant diversification rate shifts throughout mammalian evolution (Fig. 2 and Tables S1, S2, and S3). My method reveals that diversification rates were low (0.05 per million years) until ∼33 Mya. Then a diversification rate peak (0.16) lasted for 3 Myr. After this peak, diversification rate dropped to 0.10 and remained constant until ∼8.55 Mya. The rate declined at 8.55 Mya to a value of 0.06 and at 3.35 Mya to a value of 0.02 (Table S3). I further show that there is no support for a rate shift at the K/T boundary (SI Text S1).

Fig. 2.

Maximum-likelihood diversification rate and turnover estimates for the mammalian phylogeny. The dark blue line displays the diversification rate estimates (λ_i − μ_i) per million years and the light blue line displays the turnover estimates (μ_i/λ_i). The dashed lines are 95% confidence intervals.

The turnover of mammals is estimated to be close to 0, but the confidence intervals are wide (Fig. 2). The difficulty of estimating turnover has already been observed for small trees (19–21). However, I verify through simulations that in large trees, turnover close to 0 can be distinguished from high turnover (SI Text S2.1), meaning that the mammals had a low turnover throughout their evolution. The turnover was higher during the diversification peak at 33−30 Mya, meaning that both mammalian speciation and extinction accelerated.

I also estimated diversification rate shifts for the mammalian subgroups placentals and marsupials, as well as for the six largest placental subgroups: rodents, chiroptera (bats), eulipotyphla (shrews, moles, hedgehogs), cetartiodactyla (whales and even-toed ungulates), carnivores, and primates (Fig. 3). At the 99% level, placentals, rodents, and cetartiodactyla show a significant increase in diversification rates ∼33 Mya; however, only rodents and cetartiodactyla show a peak (i.e., the diversification rates drop again significantly within a few million years). When relaxing the likelihood-ratio test to reject at the 95% level, marsupials also show a significant peak ∼33−30 Mya (Fig. 3, dotted line). When allowing for one more shift in placentals (corresponding to an 83% level instead of a 99% level), placentals also show a peak ∼33−30 Mya (Fig. 3, dotted line). Placentals show a diversification rate peak ∼12 Mya that is mainly driven by the diversification rate peak of the eulipotyphla. Cetartiodactyla is the only group with a diversification rate peak ∼8 Mya, and chiroptera is the only group with a diversification rate peak ∼3.35 Mya. All analyzed groups but eulipotyphla, cetartiodactyla, and primates show a significant decrease in rates in the very recent past.

Fig. 3.

Maximum-likelihood diversification rate estimates (per million years) for the mammalian phylogeny and the mammalian subgroups placentals, marsupials, and the six largest placental subgroups. The number of shifts is determined with the likelihood-ratio test at the 99% level. The dotted line for the marsupials corresponds to a 95% level and the dotted line for the placentals corresponds to allowing for one more shift than at the 99% level (which is a P value of 0.83).

I verify that the likelihood method is very robust for inferring diversification rates by performing a variety of simulations (SI Text S2). In particular, all four shifts in the mammalian phylogeny can be recovered accurately (Fig. 4).

Fig. 4.

Maximum-likelihood diversification rate estimates (per million years) under the four-shift model for 100 simulated trees (black) and the mammalian phylogeny (blue). The maximum-likelihood parameters obtained from the mammalian phylogeny were used for simulations.

Discussion

Mammalian Evolution.

The mammalian phylogeny reveals no major diversification rate shifts for mammals or mammalian subgroups before 33 Mya. In particular, unlike previous work (3), I do not find a significant shift in rates at 93 Mya. As there were only 10 lineages present at 93 Mya, I argue that it is hard to distinguish between a stochastic effect and a real change in rates. I confirm the previous finding (3) that there were no major shifts in rates at the K/T boundary.

The previous work (3) based on the mammalian phylogeny did not detect the peak at 33−30 Mya, which my results suggest. I argue that their method based on LTT plot slope changes is not powerful in detecting rate changes when a large number of species are present (which is the case at 33 Mya): The method looks simply at slope changes, ignoring that small slope changes might be significant when many species are present, whereas large slope changes might be stochastic factors when few species are present and might therefore be nonsignificant.

Up to now, the recent past (the last 25 Myr) could not be analyzed on the basis of the mammalian phylogeny, as the available methods could not distinguish between the pull of the present and a real rate change (11, 3). Two studies (22, 23) based on fossil data investigated the diversification rates of mammals between 30 and 5 Mya. Both studies found a major increase in species richness during 17−14 Mya. When I allow for at least six shifts, a diversification rate increase ∼15.85 Mya (Fig. S1) is detected. Note, however, that the additional shifts are nonsignificant (P value = 0.28).

I detect a decline in diversification rates for all mammals and most mammalian subgroups ∼3.35 Mya. To my knowledge, no existing study investigated the diversification rates of mammals during this time.

My study provides maximum-likelihood diversification rate estimates on the basis of the whole mammalian phylogeny. However, it remains controversial why the rates were shifting.

A study on the response of mammals to global warming (ref. 24, p. 1) states “Faunal turnover nears 100% and species diversity may increase when warm temperatures last hundreds of thousands to millions of years, because speciation takes place and faunal changes initiated by a variety of shorter-term processes accumulate.” The two studies discussed above (22, 23) find accelerated diversification during only a single warm period, 17–14 Mya. My study does not suggest higher diversification rates during long warm periods. In particular, during the Early-Eocene climatic optimum at 53–51 Mya, when temperatures reached a long-term maximum, I detect no diversification rate increase. Further, I do not detect a significant rate increase during the warmest part of the Mid-Miocene climatic optimum at 17–14 Mya.

On the other hand, I find an increase in diversification rate and turnover during cooling. Around 34 Mya, Earth's climate changed from warm to cool (25) and has remained cool until today. Antarctica began to be glaciated, grasslands expanded, presumably many herbivores increased considerably in population sizes, and, as I show, mammals underwent rapid diversification at 33−30 Mya, in particular rodents, cetartiodactyla, and marsupials. To disentangle whether increased mammalian diversification rates were caused by climate cooling or expanding grassland or neither, I suggest analyzing nonmammalian herbivore phylogenies with my maximum-likelihood method. Note that the increased diversification rate for cetartiodactyla indicates that grassland expansion was not the only cause of the rate increase in mammals.

The statistically most significant rate shift observed in my study, at 3.35 Mya, occurred in an era of high tectonic activity: At 3.5 Mya, the great American biotic interchange took place as the new land bridge between North and South America was formed. However, I find a decrease in diversification rate, as opposed to an increase in rate suggested during times of high tectonic activity (23).

My results are based on the previously published phylogeny of mammals (3). This phylogeny was inferred using a supertree approach that combines reconstructed mammalian subtrees into one phylogeny of all mammals. The accuracy of such supertree approaches is limited (26), and the analyzed mammalian tree will certainly be refined in the future, possibly having an effect on some of the conclusions I draw. In particular, the mammalian supertree is poorly resolved in the recent past, during which I detect two significant rate shifts (3.35 and 8.55 Mya). However, because I show, using simulations, that my method is robust toward unbiased uncertainties in speciation time estimates and unresolved polytomies, I argue that my observed general rate pattern (including the recent shifts) actually reflects the pattern of mammalian evolution. As my method is available in the R package TreePar (18), my hypotheses can be easily checked and corrected as new mammalian phylogenies are inferred. In particular, new mammalian phylogenies will reveal whether my assumption that the uncertainties in the mammalian phylogeny of Bininda-Emonds et al. (3) are unbiased is appropriate.

My diversification rate estimates do not provide any support for the hypotheses of accelerated diversification during warming or during an era of high tectonic activity. However, with the increasing number of reconstructed phylogenies, my likelihood method can be used to obtain a more complete picture of the evolutionary past of present-day species. Such analyses can reveal common diversification patterns across different groups of species and, together with paleontological and paleoclimatic data, can help link environmental patterns and diversification patterns.

Advantages, Extensions, and Limitations of the Presented Likelihood Method.

My likelihood method for detecting rate shifts opens the possibility for analyzing all kinds of phylogenetic trees more accurately. The method is not tailored specifically to species phylogenies but can also be used for, e.g., virus phylogenies. In addition to rate shifts, the method can also detect mass extinction events (Methods).

My approach, which uses one general model throughout evolutionary time and infers the maximum-likelihood rate estimates, has several advantages over methods using only parts of the phylogeny by looking at local slope changes in the LTT plot:

i) As discussed in the Introduction, slope changes do not necessarily correspond to rate changes; my method accounts for such slope changes without predicting a false rate change.

ii) Rates in the recent and distant past can be inferred, for both complete phylogenies and incomplete phylogenies assuming random sampling.

iii) Simulations reveal that the likelihood method is robust to inaccuracies in speciation time estimates. In particular, unresolved nodes (polytomies) do not pose a problem, whereas the available regression methods require well-dated edges (3).

iv) When half of the lineages speciate in a short time interval, a rate shift is suggested by standard regression methods. However, if the number of lineages is small, the speciation pattern can easily be explained by stochastic effects. Likelihood methods using the whole tree account for such stochastic effects.

v) Regression methods for estimating diversification rates require setting some parameters a priori (3) (e.g., basis dimension, gamma parameter, and time steps) and it is unclear how different settings influence the results. A likelihood framework requires no input besides the dated tree.

Previous studies derived likelihood approaches to infer diversification rates under special cases of the birth–death-shift process. Under a model without rate changes, i.e. constant rates throughout the whole evolutionary time, a likelihood approach is well established (14). The maximum-likelihood parameter estimation for the constant rate birth–death process was generalized to changing birth and death rates (14). However, only integral equations were available, which have to be evaluated numerically. Numerical integration has been done, e.g., on an Enallagma phylogeny (27). However, their 90% complete trees contained only 20 species and they allowed for only one rate shift. For big phylogenetic trees where several rate shifts can be expected, direct analytical methods become crucial to analyze the data.

A previous study (6) assumes the birth–death-shift model with one rate shift (and constraints on the shift time and rates). The parameters are estimated assuming that the tree before the rate shift time is independent of the tree after the rate shift time. However, this assumption is not valid, because extinction in a later interval affects species that appeared in an earlier interval and survived until the later interval.

Further, a mass extinction event produces an LTT plot with a constant slope interrupted by a plateau (11, 28). However, results were purely based on simulations as no likelihood function was available. The likelihood approach developed in this study accounts for mass extinction events.

My simulations revealed that diversification rates can be estimated with high confidence. However, the turnover cannot be estimated accurately, even in the case of no rate shift. For obtaining very accurate extinction rates, fossil data are necessary (29).

Models have been proposed where speciation and extinction rates are more complex than under the birth–death-shift model. For example, extinction rates might be heritable (30) (e.g., if intrinsic factors determine speciation rate and these intrinsic factors are heritable), or speciation rates might be density dependent (31). Up to now, such models can be investigated only with simulations. Maximum-likelihood methods to infer rates under these more complex models will require new analytic techniques. Once such methods become available, statistical tests can be applied to address the validity of models with changing rates at discrete points in time (presented in this paper) or models with continuously changing rates through time.

The presented likelihood equations can of course also be used in a Markov chain Monte Carlo (MCMC) approach to estimate the posterior parameter distribution for a given phylogeny, if one is willing to assume prior distributions for the parameters.

Also, the birth–death-shift likelihood can be used as a prior distribution for reconstructing phylogenies in an MCMC framework. This method will be of particular interest for reconstructing virus phylogenies. When using the birth–death-shift model for viruses, the birth rate corresponds to the transmission rate of a virus. As the transmission rate undergoes major declines when new treatment strategies become available (as treated patients usually cannot transmit), a model with rate shifts is very appropriate for viruses.

Methods

1) Birth–Death-Shift Model.

To detect shifts of diversification rates in a phylogeny, we need to define an evolutionary model accounting for such shifts. I define the model very generally, allowing for rate shifts as well as mass extinction events. In this way, the method is applicable to a wide range of datasets.

For modeling the shift of rates and mass extinction events, we define a birth–death-shift process as follows. The vector t = (t₀, …, t_m), with 0 = t₀ < t₁ < … < t_m, determines the times (before the present) of rate shift and mass extinction events, where t₀ = 0 denotes the present. We set t_{m+1} = ∞. The vector λ = (λ₀, …, λ_m), where λ_i > 0 (i = 0, 1, …, m) defines the speciation rates. λ_i is the speciation rate in time interval (t_i, t_{i+1}]. The vector μ = (μ₀, …, μ_m), where λ_i > μ_i ≥ 0 (i = 0, 1, …, m) defines the extinction rates. μ_i is the extinction rate in time interval (t_i, t_{i+1}]. The vector ρ = (ρ₀, …, ρ_m), where 1 ≥ ρ_i > 0 (i = 0, 1, …, m) defines the survival probabilities: ρ_i for i > 0 is the probability of surviving a mass extinction event at time t_i. Note that ρ_i = 1 corresponds to a rate shift from λ_i, μ_i to λ_{i−1}, μ_{i−1}, but no extinction at time t_i. The probability ρ₀ is the probability of sampling an extant species. ρ₀ = 1 corresponds to complete sampling of extant species.

The birth–death-shift process starts with one species at time x₀ in the past and is stopped at time t₀ = 0. At time point t, where t_{i+1} > t > t_i, each species gives birth to a new species with rate λ_i; the two subtending lineages are denoted l and r such that we can distinguish between them. Each species dies with rate μ_i. At time t_i, each species survives the mass extinction with probability ρ_i. The process is stopped at time 0, the present. Each present-day species is sampled with probability ρ₀.
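The parameter vectors and their constraints can be collected in a small container. This is a minimal sketch and not part of TreePar; the class name, field names, and example values are hypothetical, and the validation only restates the constraints given in the model definition.

```python
from bisect import bisect_left
from dataclasses import dataclass
from typing import Sequence

@dataclass(frozen=True)
class BirthDeathShiftParams:
    times: Sequence[float]  # t_0 = 0 < t_1 < ... < t_m (time before present)
    lam: Sequence[float]    # speciation rate lam_i in (t_i, t_{i+1}]
    mu: Sequence[float]     # extinction rate mu_i in (t_i, t_{i+1}]
    rho: Sequence[float]    # rho_0: sampling; rho_i (i > 0): survival at t_i

    def __post_init__(self):
        k = len(self.times)
        if k == 0 or not (len(self.lam) == len(self.mu) == len(self.rho) == k):
            raise ValueError("all vectors need the same positive length")
        if self.times[0] != 0:
            raise ValueError("t_0 must be 0 (the present)")
        if any(b <= a for a, b in zip(self.times, self.times[1:])):
            raise ValueError("shift times must be strictly increasing")
        for l, m, r in zip(self.lam, self.mu, self.rho):
            if not (l > m >= 0):
                raise ValueError("need lam_i > mu_i >= 0")
            if not (0 < r <= 1):
                raise ValueError("need 0 < rho_i <= 1")

    def interval(self, t):
        """Index i with t_i < t <= t_{i+1} (t = 0 is assigned to i = 0)."""
        return max(bisect_left(self.times, t) - 1, 0)

# Shift times taken from the text; the rates are made up for illustration.
params = BirthDeathShiftParams(
    times=(0.0, 3.35, 8.55),
    lam=(0.05, 0.10, 0.15),
    mu=(0.01, 0.02, 0.03),
    rho=(0.83, 1.0, 1.0),
)
print(params.interval(5.0))
```

Setting ρ_i = 1 for i > 0, as in the example, describes pure rate shifts without mass extinction.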

Such a process induces a tree (Fig. 5, Left). Suppressing all extinct and all nonsampled lineages yields the sampled tree with root edge (Fig. 5, Right). The sampled tree is obtained by deleting the edge subtending the origin. Note that the sampled tree is a binary tree with each leaf having the same distance from the root. Each internal vertex has a lineage l and a lineage r descending. In the sampled tree, we subdivide the edge from t_s to t_e with time t_s > t_i > t_e at time t_i. So at time t_i, we have a degree-two vertex. The n − 1 speciation times in a sampled tree with n present-day species are x₁ > x₂ > … > x_{n−1}.

Fig. 5.

Tree notation. (*Left*) An example of a tree that evolved under the birth–death shift process. The sampled species are denoted with a solid circle. (*Right*) The corresponding sampled tree with root edge. The labels l and r are suppressed on most branches for clarity.

Note that a phylogeny inferred from data does not have the orientation labels l and r but each leaf has a unique label. We consider oriented trees and calculate their probability density for notational convenience. The probability density of a labeled tree (each leaf is assigned a unique label, and the orientations l and r are dropped) follows from the equations for an oriented tree by multiplying with

$$\frac{2^{n-1}}{n!}$$

as each labeling and each orientation is equally likely (32). Note that in maximum-likelihood parameter estimations on a fixed tree, the factor

$$\frac{2^{n-1}}{n!}$$

is constant, and therefore it can be discarded. Therefore, throughout the rest of the paper, we consider only oriented trees.

2) Deriving the Likelihood of a Tree.

The aim of this section is to calculate the probability density of a sampled tree T under the birth–death-shift process. We do so by first calculating the probability of a species having 0 descendants [2.1) Probability of 0 extant descendants]. Using this probability, we calculate the density of a tree conditioned on the time of origin [2.2) Probability density of a tree conditioned on x₀] and the time of the most recent common ancestor [2.3) Probability density of a tree conditioned on the time of the most recent common ancestor].

Having the general expression for the probability density of any sampled tree allows us to calculate the probability density of a data phylogeny (which is a particular sampled tree). This probability density is maximized over the parameter space to obtain the maximum-likelihood parameter estimates for the phylogeny.

2.1) Probability of 0 extant descendants.

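Theorem 2.1. The probability that a species alive at time t before today has no sampled extant descendants, with t_i ≤ t ≤ t_{i+1} (i = 0, …, m with t_{m+1} = ∞), is

$$p_i(t) = 1 - \frac{(1 - c_i)(\lambda_i - \mu_i)}{(1 - c_i)\,\lambda_i + (\lambda_i c_i - \mu_i)\, e^{-(\lambda_i - \mu_i)(t - t_i)}} \tag{1}$$

with

$$c_i = 1 - \rho_i \bigl(1 - p_{i-1}(t_i)\bigr) \qquad (i = 1, \ldots, m)$$

and c₀ = 1 − ρ₀.

Proof. The master equation for the probability

$$p_i(t)$$

is, for t_i ≤ t ≤ t_{i+1},

$$\frac{d}{dt}\, p_i(t) = \mu_i - (\lambda_i + \mu_i)\, p_i(t) + \lambda_i\, p_i(t)^2, \qquad p_i(t_i) = c_i.$$

By plugging Eq. 1 into the master equation, one can verify that it is a solution of the master equation. For showing the uniqueness, define

$$F(t, p) := \mu_i - (\lambda_i + \mu_i)\, p + \lambda_i\, p^2.$$

As

$$F \quad \text{and} \quad \frac{\partial F}{\partial p}$$

are continuous, uniqueness follows by the uniqueness theorem for ordinary differential equations (e.g., ref. 33, p. 211). □

Remark 2.2. Note that for i = 0, we have

$$p_0(t) = 1 - \frac{\rho_0 (\lambda_0 - \mu_0)}{\rho_0 \lambda_0 + \bigl(\lambda_0 (1 - \rho_0) - \mu_0\bigr)\, e^{-(\lambda_0 - \mu_0)\, t}},$$

which was already established (14).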

In the next section, we need Eq. 1 to calculate the density function of a sampled tree T.
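The recursion behind the probability of no sampled descendants can be sketched in a few lines. The code below assumes the standard birth–death master equation dp/dt = μ_i − (λ_i + μ_i)p + λ_i p² within each interval, started at p = 1 − ρ₀ at the present, with the value just older than a shift time t_i given by 1 − ρ_i(1 − p_{i−1}(t_i)). The closed form used is my own solution of that equation, so it should be treated as an assumption; the function names are hypothetical and this is not the TreePar code.

```python
import math

def p_closed(tau, lam, mu, c):
    """Solve dp/dt = mu - (lam + mu) p + lam p^2 with p(0) = c, at time tau."""
    r = lam - mu
    return 1.0 - (1.0 - c) * r / ((1.0 - c) * lam + (lam * c - mu) * math.exp(-r * tau))

def no_descendant_probability(t, times, lam, mu, rho):
    """Probability that a species alive at time t has no sampled descendants.

    times = [0, t_1, ..., t_m]; lam, mu, rho are the per-interval vectors.
    """
    c = 1.0 - rho[0]          # initial value at the present
    i = 0
    while i + 1 < len(times) and times[i + 1] < t:
        # Value just before (more recent than) the event at t_{i+1}.
        p_at_shift = p_closed(times[i + 1] - times[i], lam[i], mu[i], c)
        i += 1
        # Survive the event with probability rho_i, then leave no descendants.
        c = 1.0 - rho[i] * (1.0 - p_at_shift)
    return p_closed(t - times[i], lam[i], mu[i], c)

# Hypothetical parameters: one event 2 time units ago.
print(no_descendant_probability(5.0, [0.0, 2.0], [0.3, 0.5], [0.1, 0.2], [0.8, 0.6]))
```

A useful sanity check is to integrate the master equation numerically across the intervals and compare the result with the closed form.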

2.2) Probability density of a tree conditioned on x₀.

First note that the time of the start of the process, x₀, is a parameter of the birth–death-shift process, and therefore we cannot state a tree probability density independent of x₀ unless we assume a prior distribution for x₀. In the following, we simply condition on the time x₀ being fixed. In the subsequent section, we then calculate the probability density of a tree conditioned on the time of the most recent common ancestor of the extant and sampled species: This tree can be considered as two trees with time of origin at x₀. The time of the most recent common ancestor of a clade is the age of the clade. This value is known for well-studied clades and therefore conditioning on this value is reasonable and generally done when estimating rates (19).

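For t_i ≤ t ≤ t_{i+1}, let g_{i,e}(t) be the probability density that the species corresponding to edge e at time t evolved between t and the present as observed in T.

Theorem 2.3. The unique solution of the ordinary differential equation for g_{i,e}(t) is, with t_e denoting the ending time of edge e,

$$g_{i,e}(t) = \frac{q_i(t)}{q_i(t_e)}\; g_{i,e}(t_e) \tag{2}$$

with

$$q_i(t) = \frac{(\lambda_i - \mu_i)^2\, e^{-(\lambda_i - \mu_i)(t - t_i)}}{\Bigl((1 - c_i)\,\lambda_i + (\lambda_i c_i - \mu_i)\, e^{-(\lambda_i - \mu_i)(t - t_i)}\Bigr)^2},$$

where c_i is the value p_i(t_i) from Theorem 2.1. Note that

$$q_i(t_i) = 1$$

for all i ≥ 0.

Proof. The master equation for g_{i,e}(t) along an edge with starting time t_s and ending time t_e is, with t_s ≥ t ≥ t_e,

$$\frac{d}{dt}\, g_{i,e}(t) = -(\lambda_i + \mu_i)\, g_{i,e}(t) + 2 \lambda_i\, p_i(t)\, g_{i,e}(t),$$

with the initial values at t = t_e,

$$g_{i,e}(t_e) = \begin{cases} \rho_0 & \text{if } e \text{ ends in a leaf } (t_e = 0), \\ \lambda_i\, g_{i,e_1}(t_e)\, g_{i,e_2}(t_e) & \text{if } e \text{ has the two descendant edges } e_1, e_2, \\ \rho_i\, g_{i-1,e'}(t_i) & \text{if } t_e = t_i \text{ and } e \text{ has the one descendant edge } e'. \end{cases}$$

By plugging Eq. 2 into the master equation, one can verify that it is a solution of the master equation. For showing the uniqueness, define

$$F(t, g) := -(\lambda_i + \mu_i)\, g + 2 \lambda_i\, p_i(t)\, g.$$

As

$$F \quad \text{and} \quad \frac{\partial F}{\partial g}$$

are continuous, uniqueness follows by the uniqueness theorem for ordinary differential equations (e.g., ref. 33, p. 211). □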

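Corollary 2.4. Let

$$p_i^{(1)}(t)$$

be the probability that a species alive at time t before today has one sampled extant descendant with t_i ≤ t ≤ t_{i+1} (i = 1, …, m with t_{m+1} = ∞). Then, with q_i the function appearing in Theorem 2.3,

$$p_i^{(1)}(t) = \rho_0\, q_i(t) \prod_{j=1}^{i} \rho_j\, q_{j-1}(t_j).$$

Proof. The proof follows from Theorem 2.3 by noting that

$$p_i^{(1)}(t)$$

is the probability of observing a tree with one sampled individual. □

Remark 2.5. For i = 0, we have

$$p_0^{(1)}(t) = \rho_0\, q_0(t) = \frac{\rho_0 (\lambda_0 - \mu_0)^2\, e^{-(\lambda_0 - \mu_0)\, t}}{\Bigl(\rho_0 \lambda_0 + \bigl(\lambda_0 (1 - \rho_0) - \mu_0\bigr)\, e^{-(\lambda_0 - \mu_0)\, t}\Bigr)^2},$$

which was established in Remark 3.2 in ref. 16.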

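The density of a sampled tree with root edge, T_or, given the time of origin, f [T_or | t_or = x₀], can be calculated by recursively using the formula for g_{i,e}(t), starting with t = x₀.

For obtaining a closed-form solution, note that each x_i with i > 0 is twice a starting point and once an ending point of an edge. x₀ is only a starting point. The leaves are only ending points.

Further, for t_{i+1} > t ≥ t_i, define l(t) := i. Further, let n_i be the number of edges in T_or that go through time t_i. Using this notation, and with q_i the function appearing in Theorem 2.3, we obtain

$$f[T_{or} \mid t_{or} = x_0] = \rho_0^{\,n}\; q_{l(x_0)}(x_0) \prod_{i=1}^{n-1} \lambda_{l(x_i)}\, q_{l(x_i)}(x_i) \prod_{i=1}^{m} \bigl(\rho_i\, q_{i-1}(t_i)\bigr)^{n_i}.$$

Therefore, we established the following theorem.

Theorem 2.6. The density of a sampled tree with root edge, T_or, conditioned on the time of origin being x₀, is

$$f[T_{or} \mid t_{or} = x_0] = \rho_0^{\,n}\; q_{l(x_0)}(x_0) \prod_{i=1}^{n-1} \lambda_{l(x_i)}\, q_{l(x_i)}(x_i) \prod_{i=1}^{m} \bigl(\rho_i\, q_{i-1}(t_i)\bigr)^{n_i}.$$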

2.3) Probability density of a tree conditioned on the time of the most recent common ancestor.

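We next provide the density for an oriented tree conditioned on the time of the most recent common ancestor of the extant sampled species. This formula is used for the maximum-likelihood parameter estimation in phylogenies.

Theorem 2.7. The density of a sampled tree T, conditioned on the time of the most recent common ancestor being x₁, is, with p_i as in Theorem 2.1, q_i as in Theorem 2.3, and n_i the number of edges in T that go through time t_i,

$$f[T \mid t_{mrca} = x_1] = \rho_0^{\,n} \left( \frac{q_{l(x_1)}(x_1)}{1 - p_{l(x_1)}(x_1)} \right)^{2} \prod_{i=2}^{n-1} \lambda_{l(x_i)}\, q_{l(x_i)}(x_i) \prod_{i=1}^{m} \bigl(\rho_i\, q_{i-1}(t_i)\bigr)^{n_i}.$$

Proof. When conditioning on the time of the most recent common ancestor of the sampled species, we implicitly condition that the two descendants at the time of the most recent common ancestor have sampled descendants, and we call S the event that a species has at least one sampled extant descendant. Let

$$T_l$$

be the left subtree of T and

$$T_r$$

be the right subtree of T descending from the most recent common ancestor of T. So

$$f[T \mid t_{mrca} = x_1] = f[T_l \mid t_{or} = x_1, S]\; f[T_r \mid t_{or} = x_1, S] = \frac{f[T_l \mid t_{or} = x_1]\; f[T_r \mid t_{or} = x_1]}{\bigl(1 - p_{l(x_1)}(x_1)\bigr)^{2}}.$$

This establishes the theorem. □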
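As an illustration of how the density in Theorem 2.7 can be evaluated from the speciation times alone, here is a minimal sketch. It assumes closed-form solutions of the two master equations that I derived myself (the helpers `p_closed` and `q_closed`), takes the density to be a product over the root, the later speciation events, the sampled leaves, and the lineages crossing each shift time, and assumes no speciation time coincides with a shift time. All names are hypothetical; this is not the TreePar implementation.

```python
import math
from bisect import bisect_right

def p_closed(tau, lam, mu, c):
    """Probability of no sampled descendants, tau after an interval start."""
    r = lam - mu
    return 1.0 - (1.0 - c) * r / ((1.0 - c) * lam + (lam * c - mu) * math.exp(-r * tau))

def q_closed(tau, lam, mu, c):
    """Edge factor solving dg/dt = (2 lam p - lam - mu) g with value 1 at tau = 0."""
    r = lam - mu
    e = math.exp(-r * tau)
    return r * r * e / ((1.0 - c) * lam + (lam * c - mu) * e) ** 2

def log_density_mrca(spec_times, times, lam, mu, rho):
    """Log density of an oriented sampled tree, conditioned on its MRCA age.

    spec_times: the n - 1 speciation times (MRCA included), any order.
    times: [0, t_1, ..., t_m]; lam, mu, rho: per-interval parameter vectors.
    """
    x = sorted(spec_times, reverse=True)           # x_1 > x_2 > ... > x_{n-1}
    n = len(x) + 1
    c = [1.0 - rho[0]]                             # c_i = p_i(t_i)
    for i in range(1, len(times)):
        p_prev = p_closed(times[i] - times[i - 1], lam[i - 1], mu[i - 1], c[i - 1])
        c.append(1.0 - rho[i] * (1.0 - p_prev))

    def idx(t):                                    # l(t): t_i <= t < t_{i+1}
        return bisect_right(times, t) - 1

    def log_q(t):
        i = idx(t)
        return math.log(q_closed(t - times[i], lam[i], mu[i], c[i]))

    i1 = idx(x[0])
    p_root = p_closed(x[0] - times[i1], lam[i1], mu[i1], c[i1])
    ll = n * math.log(rho[0]) + 2.0 * (log_q(x[0]) - math.log(1.0 - p_root))
    for xj in x[1:]:                               # later speciation events
        ll += math.log(lam[idx(xj)]) + log_q(xj)
    for i in range(1, len(times)):                 # lineages crossing t_i
        if times[i] < x[0]:
            n_i = 1 + sum(1 for xj in x if xj > times[i])
            q_prev = q_closed(times[i] - times[i - 1], lam[i - 1], mu[i - 1], c[i - 1])
            ll += n_i * (math.log(rho[i]) + math.log(q_prev))
    return ll

# Hypothetical five-species tree with one rate shift 2 time units ago.
print(log_density_mrca([5.0, 3.0, 1.5, 0.4], [0.0, 2.0], [0.3, 0.5], [0.1, 0.2], [0.8, 1.0]))
```

Two checks are worth running on any such implementation: with μ = 0 and complete sampling the value must reduce to the pure-birth density, and inserting a shift with identical rates on both sides and ρ_i = 1 must leave the value unchanged.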

3) Implementation of the Maximum-Likelihood Method.

The probability density in Theorem 2.7 is used for maximum-likelihood rate estimation in phylogenetic trees inferred from (species) data. A model with m rate shifts (or mass extinction events) can be tested against a model with m + 1 rate shifts (or mass extinction events) using the likelihood-ratio test. The method is implemented in the R package TreePar (18). For the maximization of the likelihood function, I use the R package subplex (34) available on CRAN. The likelihood estimations were performed on the ETH Brutus cluster.
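The comparison of a model with m shifts against one with m + 1 shifts can be sketched as follows. The sketch assumes the usual chi-square approximation for nested models and that each additional rate shift adds three free parameters (one shift time, one speciation rate, one extinction rate); both are assumptions on my part, and the log-likelihood values in the example are made up.

```python
import math

def chi2_sf(x, df):
    """Upper tail probability of a chi-square variable with integer df >= 1."""
    if x <= 0.0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Closed form for even df: exp(-x/2) * sum_{k < df/2} (x/2)^k / k!
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return math.exp(-half) * total
    # Odd df: erfc term plus a finite sum of (x/2)^(j - 1/2) / Gamma(j + 1/2).
    total = 0.0
    term = math.sqrt(half) / math.gamma(1.5)
    for j in range(1, (df - 1) // 2 + 1):
        total += term
        term *= half / (j + 0.5)
    return math.erfc(math.sqrt(half)) + math.exp(-half) * total

def likelihood_ratio_test(loglik_m, loglik_m_plus_1, extra_params=3):
    """Return (test statistic, P value) for adding one shift."""
    stat = 2.0 * (loglik_m_plus_1 - loglik_m)
    return stat, chi2_sf(stat, extra_params)

# Hypothetical maximized log-likelihoods for m and m + 1 shifts.
stat, p_value = likelihood_ratio_test(-100.0, -94.0)
print(stat, p_value)
```

In this convention, the additional shift is accepted at the 99% level when the P value is below 0.01.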

The maximization performs well for fixed rate shift times, but terminates in local optima instead of the global optimum when estimating simultaneously the rate shift times and the speciation and extinction rates. I therefore use a fine grid on shift times and optimize the speciation and extinction rates for the fixed rate shift times on the grid.

When analyzing the mammalian phylogeny, I first allow for one shift and estimate rates in 0.1-Myr steps between 2.05 and 100.05 Mya. Then I allow for two shifts by fixing the maximum-likelihood time shift from the one-shift estimation and using for the second shift again a grid of 0.1-Myr steps between 2.05 and 100.05 Mya. I proceed in this way to add more shifts. Note that estimating all time shifts simultaneously would be favorable over the greedy approach, but it is computationally not feasible. I show through simulations that the greedy approach recovers the rate shifts reliably (SI Text S2).
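The greedy grid search described above can be sketched generically. The callable `max_loglik` is a hypothetical stand-in for "maximize the likelihood over the rates for these fixed shift times"; here it is replaced by a toy objective so that the sketch runs on its own. Only the grid bounds and step size are taken from the text.

```python
def greedy_shift_search(max_loglik, grid, n_shifts):
    """Add shift times one at a time, keeping earlier choices fixed.

    max_loglik(shift_times) must return the log-likelihood maximized over
    the rates for the given sorted list of shift times.
    Returns a list of (sorted shift times, log-likelihood), one per step.
    """
    chosen, history = [], []
    for _ in range(n_shifts):
        best_time, best_ll = None, float("-inf")
        for t in grid:
            if t in chosen:
                continue
            ll = max_loglik(sorted(chosen + [t]))
            if ll > best_ll:
                best_time, best_ll = t, ll
        chosen.append(best_time)
        history.append((sorted(chosen), best_ll))
    return history

# Grid of 0.1-Myr steps between 2.05 and 100.05 Mya, as in the text.
grid = [round(2.05 + 0.1 * k, 2) for k in range(981)]

def toy_loglik(shift_times):
    """Toy objective (not a real likelihood) peaked at shifts 3.35 and 8.55."""
    targets = ((3.35, 2.0), (8.55, 1.0))           # (time, weight)
    return -sum(w * min(abs(s - c) for s in shift_times) for c, w in targets)

history = greedy_shift_search(toy_loglik, grid, 2)
print(history)
```

Because earlier shift times are never revisited, the search evaluates the objective once per grid point per added shift, instead of once per combination of grid points.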

I showed in a previous paper (17) that it is not possible to distinguish a mass extinction event from constant rates interrupted by a period of stasis (i.e., basically no speciation and extinction), as both events leave a similar footprint on the phylogeny. Distinguishing between mass extinction and rate shift events requires additional paleontological data. If only the phylogenetic tree is available, then for each time interval, we need to know one of the three parameters λ_i, μ_i, and ρ_i or we have to assume dependencies, like λ_i = λ_{i−1}. Note that requiring one of the three parameters λ_i, μ_i, and ρ_i to be known is already necessary if no shift and no mass extinction occur (35).

In the mammal phylogeny, I investigate whether and when rate shift events occurred; i.e., I set ρ_i = 1 for all i. I investigate the accuracy of the maximum-likelihood rate estimates through simulating trees with n species and then estimating their rates. For the simulations, I use the R package TreeSim (36).

Trees with polytomies can be analyzed with my method. Polytomies are considered by the method as a random binary resolution with edges of length 0 (note that each binary resolution of the polytomy has the same likelihood).
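To illustrate the treatment of polytomies, the following minimal Python sketch resolves every polytomy into a ladder of binary nodes joined by zero-length edges and collects the resulting branching times. The nested-tuple tree encoding `(age, children)` and the function names are assumptions made for this example, and the ladder is one fixed resolution rather than a random one. Because the added nodes all share the age of the polytomy, any other binary resolution yields the same multiset of branching times.

```python
def resolve_polytomies(node):
    """Return a binary copy of the tree.

    A node is (age, children); a tip is (age, []).
    A node with k > 2 children becomes a ladder of k - 1 binary nodes
    that all carry the same age, i.e. they are joined by edges of length 0.
    """
    age, children = node
    resolved = [resolve_polytomies(child) for child in children]
    if len(resolved) <= 2:
        return (age, resolved)
    ladder = (age, resolved[-2:])
    for child in reversed(resolved[:-2]):
        ladder = (age, [child, ladder])
    return ladder


def branching_times(node):
    """Ages of all internal nodes, sorted from oldest to youngest."""
    age, children = node
    if not children:
        return []
    times = [age]
    for child in children:
        times.extend(branching_times(child))
    return sorted(times, reverse=True)


def is_binary(node):
    """True if every internal node has exactly two children."""
    _, children = node
    if not children:
        return True
    return len(children) == 2 and all(is_binary(child) for child in children)


if __name__ == "__main__":
    tip = (0.0, [])
    # Root polytomy with four children at age 5.0; one child is a cherry at age 2.0.
    tree = (5.0, [tip, tip, (2.0, [tip, tip]), tip])
    binary_tree = resolve_polytomies(tree)
    print(is_binary(binary_tree), branching_times(binary_tree))
```

A polytomy with k children at age t thus contributes k − 1 branching times equal to t.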

Note

This article is a PNAS Direct Submission.

Acknowledgments

I thank Alexandre Antonelli, Marcelo Sanchez, Helen Alexander, Jan Engelstaedter, Roland Regoes, Olin Silander, the editor, and the two anonymous reviewers for very helpful comments. Funding for this work came from Eidgenössische Technische Hochschule, Zurich.

Supporting Information

Supporting Information (PDF)


References

1. RJ Asher, et al., Stem Lagomorpha and the antiquity of Glires. Science 307, 1091–1094 (2005).
2. JR Wible, GW Rougier, MJ Novacek, RJ Asher, Cretaceous eutherians and Laurasian origin for placental mammals near the K/T boundary. Nature 447, 1003–1006 (2007).
3. OR Bininda-Emonds, et al., The delayed rise of present-day mammals. Nature 446, 507–512 (2007).
4. D Wilson, D Reeder, Mammal Species of the World: A Taxonomic and Geographic Reference (Johns Hopkins Univ Press, Baltimore, 2005).
5. MJ Sanderson, MJ Donoghue, Reconstructing shifts in diversification rates on phylogenetic trees. Trends Ecol Evol 11, 15–20 (1996).
6. DL Rabosky, Likelihood methods for detecting temporal shifts in diversification rates. Evolution 60, 1152–1164 (2006).
7. S Nee, EC Holmes, RM May, PH Harvey, Extinction rates can be estimated from molecular phylogenies. Philos Trans R Soc Lond B Biol Sci 344, 77–82 (1994).
8. E Paradis, Assessing temporal variations in diversification rates from phylogenies: Estimation and hypothesis testing. Proc Biol Sci 264, 1141 (1997).
9. OG Pybus, PH Harvey, Testing macro-evolutionary models using incomplete molecular phylogenies. Proc Biol Sci 267, 2267–2272 (2000).
10. RE Ricklefs, Estimating diversification rates from phylogenetic information. Trends Ecol Evol 22, 601–610 (2007).
11. PH Harvey, RM May, S Nee, Phylogenies without fossils. Evolution 48, 523–529 (1994).
12. DG Kendall, On the generalized “birth-and-death” process. Ann Math Stat 19, 1–15 (1948).
13. N Bailey, The Elements of Stochastic Processes with Applications to the Natural Sciences (Wiley, New York, 1964).
14. SC Nee, RM May, PH Harvey, The reconstructed evolutionary process. Philos Trans R Soc Lond B Biol Sci 344, 305–311 (1994).
15. T Gernhard, The conditioned reconstructed process. J Theor Biol 253, 769–778 (2008).
16. T Stadler, Sampling-through-time in birth-death trees. J Theor Biol 267, 396–404 (2010).
17. T Stadler, Simulating trees on a fixed number of extant species. Syst Biol, in press (2011).
18. T Stadler, TreePar in R - Estimating diversification rates in phylogenies (2011). Available at http://cran.r-project.org/web/packages/TreePar/index.html. Accessed January 25, 2011.
19. SC Nee, Inferring speciation rates from phylogenies. Evolution 55, 661–668 (2001).
20. DL Rabosky, Extinction rates should not be estimated from molecular phylogenies. Evolution 64, 1816–1824 (2010).
21. TB Quental, CR Marshall, Diversity dynamics: Molecular phylogenies need the fossil record. Trends Ecol Evol 25, 434–441 (2010).
22. A Barnosky, M Carrasco, Effects of Oligo-Miocene global climate changes on mammalian species richness in the northwestern quarter of the USA. Evol Ecol Res 4, 811–841 (2002).
23. J Finarelli, C Badgley, Diversity dynamics of Miocene mammals in relation to the history of tectonism and climate. Proc R Soc B 277, 2721–2726 (2010).
24. A Barnosky, E Hadly, C Bell, Mammalian response to global warming on varied temporal scales. J Mammal 84, 354–368 (2003).
25. JC Zachos, GR Dickens, RE Zeebe, An early Cenozoic perspective on greenhouse warming and carbon-cycle dynamics. Nature 451, 279–283 (2008).
26. MS Swenson, F Barbançon, T Warnow, CR Linder, A simulation study comparing supertree and combined analysis methods using SMIDGen. Algorithms Mol Biol 5, 8 (2010).
27. J Turgeon, R Stoks, RA Thum, JM Brown, MA McPeek, Simultaneous Quaternary radiations of three damselfly clades across the Holarctic. Am Nat 165, E78–E107 (2005).
28. MD Crisp, LG Cook, Explosive radiation or cryptic mass extinction? Interpreting signatures in molecular phylogenies. Evolution 63, 2257–2265 (2009).
29. J Alroy, New methods for quantifying macroevolutionary patterns and processes. Paleobiology 26, 707–733 (2000).
30. DL Rabosky, Heritability of extinction rates links diversification patterns in molecular phylogenies and fossils. Syst Biol 58, 629–640 (2009).
31. DL Rabosky, IJ Lovette, Density-dependent diversification in North American wood warblers. Proc Biol Sci 275, 2363–2371 (2008).
32. D Ford, FA Matsen, T Stadler, A method for investigating relative timing information on phylogenetic trees. Syst Biol 58, 167–183 (2009).
33. A King, J Billingham, S Otto, Differential Equations: Linear, Nonlinear, Ordinary, Partial (Cambridge Univ Press, Cambridge, UK, 2003).
34. A King, The subplex algorithm for unconstrained optimization (2008). Available at http://cran.r-project.org/web/packages/subplex/index.html. Accessed July 28, 2008.
35. T Stadler, On incomplete sampling under birth-death models and connections to the sampling-based coalescent. J Theor Biol 261, 58–66 (2009).
36. T Stadler, TreeSim in R - Simulating trees under the birth-death model (2010). Available at http://cran.r-project.org/web/packages/TreeSim/index.html. Accessed February 23, 2010.

Information & Authors

Information

Published in

Proceedings of the National Academy of Sciences

Vol. 108 | No. 15
April 12, 2011

PubMed: 21444816

Submission history

Published online: March 28, 2011

Published in issue: April 12, 2011

Authors

Affiliations

Tanja Stadler¹ [email protected]

Institut für Integrative Biologie, Eidgenössische Technische Hochschule Zurich, 8092 Zurich, Switzerland

Notes

1

E-mail: [email protected].

Author contributions: T.S. designed research, performed research, contributed new reagents/analytic tools, analyzed data, and wrote the paper.

Competing Interests

The author declares no conflict of interest.
