This tutorial is based on the BiSSE and HiSSE tutorial by Luke Harmon: https://lukejharmon.github.io/ilhabela/2015/07/05/BiSSE-and-HiSSE.
For more background on the BiSSE and HiSSE models, please read:
You can find the source files for this tutorial on the class GitHub repository:
https://github.com/EEOB-Macroevolution/EEOB565X-Spring2020/tree/master/practicals/03-BiSSE-HiSSE
The BiSSE model is implemented as part of the diversitree
package, which also contains many other -SSE models, while the HiSSE model is implemented in the standalone package hisse
. So we will start by installing and loading both of these packages.
install.packages("diversitree")
install.packages("hisse")
library(diversitree)
## Loading required package: ape
library(hisse)
## Loading required package: deSolve
## Loading required package: GenSA
## Loading required package: subplex
## Loading required package: nloptr
To test the ability of BiSSE to detect differences in diversification rates, we are going to simulate a tree under a character-driven birth-death process, where the birth rates are very dependent on a trait while the death rates are identical. The following will create a tree with 50 tips, with a birth rate of 1.0 in state 0 and 4.0 in state 1, a death rate in both states of 0.5 and transition rates of 0.2 in the direction 0 -> 1 and 0.8 in the direction 1 -> 0.
sim_parameters = c(1.0, 4.0, 0.5, 0.5, 0.2, 0.8)
tree_bisse = tree.bisse(sim_parameters, max.taxa = 50)
plot(tree_bisse)
We can also show the history of our simulated tree. Here edges in state 0 will be shown in red, and edges in state 1 in blue.
treehistory = history.from.sim.discrete(tree_bisse, 0:1)
plot(treehistory, tree_bisse, cols = c("red", "blue"))
Note that because we are each performing a different replicate simulation, your tree will not look like the one above.
Now we will find out if BiSSE is able to correctly infer whether there are changes in the diversification rates on our simulated tree, by estimating the rates using maximum likelihood.
This simply attaches a BiSSE model to our simulated tree:
bisse_model = make.bisse(tree_bisse, tree_bisse$tip.state)
We are also going to need a starting point for our search:
start_mle = starting.point.bisse(tree_bisse)
start_mle
## lambda0 lambda1 mu0 mu1 q01 q10
## 2.28488500 2.28488500 1.88949285 1.88949285 0.07907843 0.07907843
Finally, we are ready to perform ML estimation. To do this, we will use the find.mle()
function from diversitree to perform a maximum likelihood search to estimate the rate parameters under the BiSSE model.
bisse_mle = find.mle(bisse_model, start_mle)
Let’s view the log-likelihood for the set of parameters estimated from the search (keep in mind because we are working with different simulations, you will have a different value).
logLik(bisse_mle)
## 'log Lik.' -45.64364 (df=6)
Next we can compare the ML estimates of the parameters to our simulation parameters. Here are the simulation parameters, which are in the order: lambda0
, lambda1
, mu0
, mu1
, q01
, q10
.
sim_parameters
## [1] 1.0 4.0 0.5 0.5 0.2 0.8
Now let’s print the estimated parameters. To do this, we will use the function coef()
, which will extract the estimated values from the bisse_mle
object. We’ll also round the estimates to two significant digits to make it easier to compare them to the true (simulated) values.
round(coef(bisse_mle),digits = 2)
## lambda0 lambda1 mu0 mu1 q01 q10
## 0.80 5.34 0.00 4.18 0.27 0.25
How accurate are your estimates? Do the MLE values capture the magnitude of the speciation and extinction rate differences among the two states?
We can also compare the fit of the BiSSE model to a constrained model where the birth and death rates are identical between states
null_bisse_model = constrain(bisse_model, lambda0 ~ lambda1)
null_bisse_model = constrain(null_bisse_model, mu0 ~ mu1)
Similarly to above, we perform ML estimation on this model. For the starting values, we will just remove lambda0
and mu0
from our previous starting values.
start_mle_null = start_mle[c(-1,-3)]
start_mle_null
## lambda1 mu1 q01 q10
## 2.28488500 1.88949285 0.07907843 0.07907843
null_bisse_mle = find.mle(null_bisse_model, start_mle_null)
round(coef(null_bisse_mle),digits = 2)
## lambda1 mu1 q01 q10
## 2.29 1.89 0.12 0.50
Let us compare the fit of both models
anova(bisse_mle, constrained = null_bisse_mle)
## Df lnLik AIC ChiSq Pr(>|Chi|)
## full 6 -45.644 103.29
## constrained 4 -53.358 114.72 15.429 0.0004463 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
The AIC comparison of the two models shows that the full
model is a better fit for our simulated data.
One issue with the BiSSE model is that it has trouble distinguishing between changes in rates that are tied to the character state tested, and changes in rates which are independent of that character. To test this, we make up an artificial character for our tree.
incorrect_tip_states = c(rep(0, 12), rep(1, 13), rep(0, 13), rep(1, 12))
names(incorrect_tip_states) = names(tree_bisse$tip.state)
incorrect_tip_states
## sp5 sp14 sp17 sp22 sp23 sp24 sp29 sp30 sp31 sp32 sp33 sp34 sp35 sp36 sp37
## 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1
## sp38 sp39 sp41 sp43 sp44 sp45 sp46 sp48 sp50 sp51 sp52 sp53 sp54 sp55 sp56
## 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0
## sp57 sp58 sp59 sp60 sp61 sp62 sp63 sp64 sp65 sp66 sp67 sp68 sp69 sp70 sp71
## 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1
## sp72 sp73 sp74 sp75 sp76
## 1 1 1 1 1
Then we run ML estimation using the BiSSE model and this new character.
incorrect_bisse_model = make.bisse(tree_bisse, incorrect_tip_states)
incorrect_bisse_mle = find.mle(incorrect_bisse_model, start_mle)
Similarly to before, we will also try a null model where both birth rates and death rates are identical between states
incorrect_null_bisse_model = constrain(incorrect_bisse_model, lambda0 ~ lambda1)
incorrect_null_bisse_model = constrain(incorrect_null_bisse_model, mu0 ~ mu1)
incorrect_null_bisse_mle = find.mle(incorrect_null_bisse_model, start_mle_null)
Finally, we can again compare the fit of both models
anova(incorrect_bisse_mle, constrained = incorrect_null_bisse_mle)
## Df lnLik AIC ChiSq Pr(>|Chi|)
## full 6 -59.615 131.23
## constrained 4 -63.279 134.56 7.3283 0.02563 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
The HiSSE model was developed to address this issue, by implementing a better null model than the simple constrained one we have been using so far. Note that the hisse
package was developed independently from diversitree
, so its syntax is going to be quite different.
We want to test whether the changes in rates are due to the observed character or to a hidden one, so we will build a null model where the rates only depend on a hidden character. The first step is to build a transition rate matrix.
null_rate_matrix = TransMatMaker(hidden.states = TRUE)
We will use the transition rates recommended by the hisse documentation.
null_rate_matrix = ParDrop(null_rate_matrix, c(3,5,8,10)) # no simultaneous transitions in both characters
null_rate_matrix[cbind(c(1,3,2,4), c(3,1,4,2))] = 2 # rates for change between rate classes
null_rate_matrix[cbind(c(2,4), c(1,3))] = 1 # rates for 1 -> 0 transition in character are equal
null_rate_matrix[cbind(c(1,3), c(2,4))] = 3 # rates for 0 -> 1 transition in character are equal
Then we need to specify which rates are shared between states. The states are ordered 0A, 1A, 0B, 1B, where 0/1 describes the observed character and A/B the hidden character. So we will specify that the rates are identical between 0 and 1 but can differ between A and B - i.e. that states 0A and 1A share one value, and 0B and 1B share another.
null_net_turnover = c(1,1,2,2)
null_extinction_fraction = c(1,1,2,2)
And finally we will transform the tip character we simulated previously into the HiSSE format.
hisse_states = cbind(names(tree_bisse$tip.state), tree_bisse$tip.state)
Now we can run the ML estimation on the HiSSE null model.
null_hisse = hisse(tree_bisse, hisse_states, hidden.states=TRUE, turnover.anc = null_net_turnover, eps.anc = null_extinction_fraction, trans.rate = null_rate_matrix)
## Initializing...
## Finished. Beginning bounded subplex routine...
## Finished. Summarizing results...
null_hisse
##
## Fit
## lnL AIC AICc ntax
## -49.72085 113.4417 116.1084 50
##
## Probability of extinction:
##
## Diversification Rates
## turnover ext.frac
## rate0A 0.9417231 2.492367
## alpha0A 1.0000000 1.000000
## beta0A 1.0000000 1.000000
## timeslice.factor0A 1.0000000 1.000000
##
## turnover ext.frac
## rate1A 0.9417231 2.492367
## alpha1A 1.0000000 1.000000
## beta1A 1.0000000 1.000000
## timeslice.factor1A 1.0000000 1.000000
##
## turnover ext.frac
## rate0B 9.042448 0.8335288
## alpha0B 1.000000 1.0000000
## beta0B 1.000000 1.0000000
## timeslice.factor0B 1.000000 1.0000000
##
## turnover ext.frac
## rate1B 9.042448 0.8335288
## alpha1B 1.000000 1.0000000
## beta1B 1.000000 1.0000000
## timeslice.factor1B 1.000000 1.0000000
##
## Transition Rates
## (0A) (1A) (0B) (1B)
## (0A) NA 0.1237954 2.3831082 0.0000000
## (1A) 0.5016438 NA 0.0000000 2.3831082
## (0B) 2.3831082 0.0000000 NA 0.1237954
## (1B) 0.0000000 2.3831082 0.5016438 NA
We can now compare the AIC obtained from the BiSSE model with the AIC of the HiSSE null model.
null_hisse$AIC
## [1] 113.4417
AIC(null_bisse_mle)
## [1] 114.7162
AIC(bisse_mle)
## [1] 103.2873
The AIC value for the bisse_mle
model indicates that the data fit this model the best.
Let us check whether HiSSE can detect that our artificial character is not the one driving the changes in rates. We build a similar HiSSE null model using this character.
incorrect_hisse_states = cbind(names(incorrect_tip_states), incorrect_tip_states)
incorrect_null_hisse = hisse(tree_bisse, incorrect_hisse_states, hidden.states=TRUE, turnover.anc = null_net_turnover, eps.anc = null_extinction_fraction, trans.rate = null_rate_matrix)
## Initializing...
## Finished. Beginning bounded subplex routine...
## Finished. Summarizing results...
And again we can compare the AIC values obtained.
incorrect_null_hisse$AIC
## [1] 133.2069
AIC(incorrect_null_bisse_mle)
## [1] 134.5577
AIC(incorrect_bisse_mle)
## [1] 131.2294
These model comparisons indicate that the BiSSE model is a better fit for the data. However, given the closeness of the AIC values, it is likely the data are not informative enough to distinguish the model fit correctly.