A bounded continuous character evolution model with absorbing boundaries for phytools

Followers of this blog should know by now that I’m fascinated by applications of the discretized diffusion approximation described in Boucher & Démery (2016), for instance to fit a multi-state threshold model, a semi-threshold model, or a state-dependent or hidden-state model of multi-rate continuous character evolution.

Really one of the simplest extensions of Boucher & Démery’s bounded model, however, might be a bounded model with absorbing rather than reflective bounds.

This is just an evolutionary process in which whenever stochastic diffusion reaches one of the boundary values of the trait, it just sticks there.

As far as I can tell, the only way to simulate data with this property is forward-in-time, but this is the best way to get a good sense of what I’m talking about, so here goes.

First, here’s my simulating function:

sim_absorbing<-function(tree,a=0,sig2=1,
  bounds=c(-Inf,Inf),nsteps=1000,
  ...){
  if(hasArg(plot)) plot<-list(...)$plot
  else plot=FALSE
  if(hasArg(reflective)) reflective<-list(...)$reflective
  else reflective<-FALSE
  tree<-reorder(tree,"cladewise")
  ll<-max(nodeHeights(tree))
  tt<-map.to.singleton(make.era.map(tree,
    limits=seq(0,ll,length.out=nsteps+1)))
  delta_x<-rnorm(n=nrow(tt$edge),sd=sqrt(sig2*tt$edge.length))
  X<-matrix(NA,nrow(tt$edge),2)
  ROOT<-Ntip(tt)+1
  X[which(tt$edge[,1]==ROOT),1]<-a
  for(i in 1:nrow(tt$edge)){
    X[i,2]<-X[i,1]+delta_x[i]
    if(!reflective){
      if(X[i,2]<=bounds[1]) X[i,2]<--Inf
      else if(X[i,2]>=bounds[2]) X[i,2]<-Inf
    } else {
      while(X[i,2]<=bounds[1]||X[i,2]>=bounds[2]){
        if(X[i,2]<=bounds[1]) X[i,2]<-2*bounds[1]-X[i,2]
        if(X[i,2]>=bounds[2]) X[i,2]<-2*bounds[2]-X[i,2]
      }
    }
    jj<-which(tt$edge[,1]==tt$edge[i,2])
    if(length(jj)>0) X[jj,1]<-X[i,2]
  }
  ii<-which(X==-Inf)
  if(length(ii)>0)  X[ii]<-bounds[1]
  ii<-which(X==Inf)
  if(length(ii)>0) X[ii]<-bounds[2]
  if(plot){
    LIMS<-vector()
    LIMS[1]<-if(bounds[1]==-Inf) min(X) else bounds[1]
    LIMS[2]<-if(bounds[2]==Inf) max(X) else bounds[2]
    hh<-hcl.colors(n=201)
    edge_col<-hh[floor(200*((X[,1]-LIMS[1])/diff(LIMS)))+1]
    par(mfrow=c(2,1))
    par(mar=c(0.1,4.1,2.1,1.1))
    plot(tt,edge.color=edge_col,show.tip.label=FALSE,
      edge.width=3)
    par(mar=c(5.1,4.1,0.1,1.1))
    plot(NA,xlim=c(0,ll),ylim=range(X),
      xlab="time",ylab="phenotype",bty="n",
      las=1)
    H<-nodeHeights(tt)
    for(i in 1:nrow(tt$edge))
      lines(H[i,],X[i,],col=edge_col[i],lwd=3)
  }
  setNames(sapply(X=1:Ntip(tt),
    FUN=function(i,x,e) X[which(e[,2]==i),2],
    x=X,e=tt$edge),tt$tip.label)
}

Now let’s load phytools and run it:

library(phytools)

packageVersion("phytools")

## [1] '2.4.13'

tree<-pbtree(n=40,scale=1)
x<-sim_absorbing(tree,plot=TRUE,bounds=c(-1,1),
  nsteps=400)
bcols<-hcl.colors(2)
lines(c(0,max(nodeHeights(tree))),rep(-1,2),lwd=8,
  col=make.transparent(bcols[1],0.2))
text(0.1,-1,"lower bound",pos=3)
lines(c(0,max(nodeHeights(tree))),rep(1,2),lwd=8,
  col=make.transparent(bcols[2],0.2))
text(0.1,1,"upper bound",pos=1)

plot of chunk unnamed-chunk-17

The update to phytools::bounded_bm that allows us to fit this model is already in phytools on GitHub. I won’t explain all the details, because if you understand the discrete diffusion approximation – you’ll understand how this works. (If you don’t understand it, I recommend checking out my blog over the past 16 months, starting with this post.)

fit_absorbing<-bounded_bm(tree,x,absorbing=TRUE,
  lims=c(-1,1))
fit_absorbing

## Object of class "bounded_bm" based on
##  	a discretization with k = 200 levels.
## 
## Absorbing bounded model
## 
## Set or estimated bounds: [ -1 , 1 ]
## 
## Fitted model parameters:
##   sigsq: 1.087513 
##      x0: 0.195 
## 
## Log-likelihood: 61.975761 
## 
## R thinks it has found the ML solution.

(This takes longer than the bounded model due to some technical reasons regarding matrix exponentiation to compute the likelihood which I won’t get into.)

We can compare this to the bounded model and to a standard unbounded Brownian model as follows:

fit_bounded<-bounded_bm(tree,x,lims=c(-1,1))
fit_bounded

## Object of class "bounded_bm" based on
##  	a discretization with k = 200 levels.
## 
## Unwrapped (i.e., bounded) model
## 
## Set or estimated bounds: [ -1 , 1 ]
## 
## Fitted model parameters:
##   sigsq: 0.889727 
##      x0: 0.675 
## 
## Log-likelihood: -10.098431 
## 
## R thinks it has found the ML solution.

fit_bm<-bounded_bm(tree,x,levs=400)
fit_bm

## Object of class "bounded_bm" based on
##  	a discretization with k = 400 levels.
## 
## Unwrapped (i.e., bounded) model
## 
## Set or estimated bounds: [ -2 , 2 ]
## 
## Fitted model parameters:
##   sigsq: 0.68214 
##      x0: 0.335 
## 
## Log-likelihood: -26.840089 
## 
## R thinks it has found the ML solution.

(I increased levs for the standard BM model so that I would have the same bin density in both discretizations. This isn’t technically necessary, but might make model comparison more precise.)

AIC(fit_bm,fit_bounded,fit_absorbing)

##               df        AIC
## fit_bm         2   57.68018
## fit_bounded    4   28.19686
## fit_absorbing  4 -115.95152

This shows that we have way more support for the generating model than for the other two. If we take a look at our fitted models, we’ll also learn that our parameter estimates are much closer to the generating conditions when the absorbing model is fit.

Our standard BM model should be equivalent (in the limit as levs is increased) in both parameter estimates and log-likelihood to what you would obtain using geiger::fitContinuous with the default model. Let’s see.

geiger::fitContinuous(tree,x)

## GEIGER-fitted comparative model of continuous data
##  fitted 'BM' model parameters:
## 	sigsq = 0.683954
## 	z0 = 0.339721
## 
##  model summary:
## 	log-likelihood = -26.937688
## 	AIC = 57.875377
## 	AICc = 58.199701
## 	free parameters = 2
## 
## Convergence diagnostics:
## 	optimization iterations = 100
## 	failed iterations = 0
## 	number of iterations with same best fit = 100
## 	frequency of best fit = 1.000
## 
##  object summary:
## 	'lik' -- likelihood function
## 	'bnd' -- bounds for likelihood search
## 	'res' -- optimization iteration summary
## 	'opt' -- maximum likelihood parameter estimates

Just as a gut-check, let’s repeat the same exercise, even using our forward-in-time simulator, but where our data were simulated under bounded Brownian motion with reflective boundary conditions – rather than with absorbing bounds. (I’m going to use a the same random number seed as I did before so that the first part of my simulation, until the boundary is reached, will be the same.)

tree<-pbtree(n=40,scale=1)
x<-sim_absorbing(tree,plot=TRUE,bounds=c(-1,1),
  nsteps=400,reflective=TRUE)
bcols<-hcl.colors(2)
lines(c(0,max(nodeHeights(tree))),rep(-1,2),lwd=8,
  col=make.transparent(bcols[1],0.2))
text(0.1,-1,"lower bound",pos=3)
lines(c(0,max(nodeHeights(tree))),rep(1,2),lwd=8,
  col=make.transparent(bcols[2],0.2))
text(0.1,1,"upper bound",pos=1)

plot of chunk unnamed-chunk-24

fit_bm.2<-bounded_bm(tree,x,levs=400)
fit_bm.2

## Object of class "bounded_bm" based on
##  	a discretization with k = 400 levels.
## 
## Unwrapped (i.e., bounded) model
## 
## Set or estimated bounds: [ -1.854628 , 1.942412 ]
## 
## Fitted model parameters:
##   sigsq: 0.59925 
##      x0: 0.266968 
## 
## Log-likelihood: -24.279488 
## 
## R thinks it has found the ML solution.

fit_bounded.2<-bounded_bm(tree,x,lims=c(-1,1))
fit_bounded.2

## Object of class "bounded_bm" based on
##  	a discretization with k = 200 levels.
## 
## Unwrapped (i.e., bounded) model
## 
## Set or estimated bounds: [ -1 , 1 ]
## 
## Fitted model parameters:
##   sigsq: 1.131566 
##      x0: 0.995 
## 
## Log-likelihood: -16.018271 
## 
## R thinks it has found the ML solution.

fit_absorbing.2<-bounded_bm(tree,x,lims=c(-1,1),
  absorbing=TRUE)
fit_absorbing.2

## Object of class "bounded_bm" based on
##  	a discretization with k = 200 levels.
## 
## Absorbing bounded model
## 
## Set or estimated bounds: [ -1 , 1 ]
## 
## Fitted model parameters:
##   sigsq: 0.440578 
##      x0: 0.165 
## 
## Log-likelihood: -30.985182 
## 
## R thinks it has found the ML solution.

AIC(fit_bm.2,fit_bounded.2,fit_absorbing.2)

##                 df      AIC
## fit_bm.2         2 52.55898
## fit_bounded.2    4 40.03654
## fit_absorbing.2  4 69.97036

Makes sense.

A bounded continuous character evolution model with absorbing boundaries for phytools

Trending Articles

Scuffham Amps - S-GEAR 2.6.0 VST, AAX, STANDALONE x86 x64 (R2R NO iLok2, +NO...

Practice Sheet of Right form of verbs for HSC Students

VHSE First (1st) Allotment 2025 - vhscap.kerala.gov.in

UNIVERSE LEAGUE – UNIVERSE LEAGUE – WAR (We Are Ready) – EP [iTunes Plus M4A]

City Hunter Teledrama – Episode 18 – 07th May 2016

Comment on Proposed Criteria for Identifying Predatory Conferences by Luke...

Bureau of Internal Revenue: Regional Offices (Directory)

Kendrick Lamar – Not Like Us (2024) [24Bit-88.2kHz] [PMEDIA] ⭐️

Inception 2010 Hindi Dual Audio 650MB BRRip 720p ESubs HEVC

East Hull MD admits sexual assaults after another victim comes forward

Download: Ziba Zako ft Rich Bizzy & General Kanene – Chikwati (Prod by: Bicko...

R. v. Sargeant, 2023 ONSC 6406 (CanLII)

Rajasthan Board 10th Result 2016 Roll No wise & Name Wise

Who’s been sentenced at Northampton Magistrates’ Court

मतलबी दोस्त स्टेट्स | Matlabi Dost Status in Hindi – Selfish Friends Status

Family cries out as traditional ruler allegedly abducts brother, extorts N2.5m

Long-Running Conflict In Springfield (MA) Gangland Sphere Has Manzi Family &...

Wondershare Filmora X v10.1.20.16 x64

Man arrested after fracas in flat

Man charged in ongoing Sexual Assault Investigation Derek Nyilas, 46, Faces...