Parameter estimation under the bounded Brownian model with absorbing bounds

Yesterday on this blog I posted about bounded continuous trait evolution with absorbing bounds.

This is just an evolutionary process in which the trait evolves via a stochastic diffusion process until it reaches the boundary value – and then it simply gets stuck there. (A variant of this model that I could see being pretty straightforward to add, would allow “escape” from the boundary at some other, much lower, rate.)

I don’t show this, but the equilibrium distribution under this model (by the way) will be everything on one boundary or the other – with relative frequencies determined by the position of the starting value relative to the bounds, and independent of \(\sigma^2\).

Today, I thought I’d show that parameter estimation under this model using the discrete diffusion approximation of Boucher & Démery (2016) totally works.

To illustrate this, I need to first be able to simulate under our model.

This is not too hard, though I need to do it using a forward in time simulation. I can using my simulation function, sim_absorbing, from yesterday’s post:

sim_absorbing<-function(tree,a=0,sig2=1,
  bounds=c(-Inf,Inf),nsteps=1000,
  ...){
  if(hasArg(plot)) plot<-list(...)$plot
  else plot=FALSE
  if(hasArg(reflective)) reflective<-list(...)$reflective
  else reflective<-FALSE
  tree<-reorder(tree,"cladewise")
  ll<-max(nodeHeights(tree))
  tt<-map.to.singleton(make.era.map(tree,
    limits=seq(0,ll,length.out=nsteps+1)))
  delta_x<-rnorm(n=nrow(tt$edge),sd=sqrt(sig2*tt$edge.length))
  X<-matrix(NA,nrow(tt$edge),2)
  ROOT<-Ntip(tt)+1
  X[which(tt$edge[,1]==ROOT),1]<-a
  for(i in 1:nrow(tt$edge)){
    X[i,2]<-X[i,1]+delta_x[i]
    if(!reflective){
      if(X[i,2]<=bounds[1]) X[i,2]<--Inf
      else if(X[i,2]>=bounds[2]) X[i,2]<-Inf
    } else {
      while(X[i,2]<=bounds[1]||X[i,2]>=bounds[2]){
        if(X[i,2]<=bounds[1]) X[i,2]<-2*bounds[1]-X[i,2]
        if(X[i,2]>=bounds[2]) X[i,2]<-2*bounds[2]-X[i,2]
      }
    }
    jj<-which(tt$edge[,1]==tt$edge[i,2])
    if(length(jj)>0) X[jj,1]<-X[i,2]
  }
  ii<-which(X==-Inf)
  if(length(ii)>0)  X[ii]<-bounds[1]
  ii<-which(X==Inf)
  if(length(ii)>0) X[ii]<-bounds[2]
  if(plot){
    LIMS<-vector()
    LIMS[1]<-if(bounds[1]==-Inf) min(X) else bounds[1]
    LIMS[2]<-if(bounds[2]==Inf) max(X) else bounds[2]
    hh<-hcl.colors(n=201)
    edge_col<-hh[floor(200*((X[,1]-LIMS[1])/diff(LIMS)))+1]
    par(mfrow=c(2,1))
    par(mar=c(0.1,4.1,1.1,1.1))
    plot(tt,edge.color=edge_col,show.tip.label=FALSE,
      edge.width=3,y.lim=c(-0.15*Ntip(tt),Ntip(tt)))
    add.color.bar(0.5*max(nodeHeights(tree)),
      cols=hcl.colors(n=100),
      title="phenotype",lims=LIMS,digits=2,
      prompt=FALSE,lwd=10,outline=FALSE,
      x=0,y=-0.075*Ntip(tree),fsize=0.8,
      subtitle="")
    par(mar=c(5.1,4.1,1.1,1.1))
    plot(NA,xlim=c(0,ll),ylim=range(X),
      xlab="time",ylab="phenotype",bty="n",
      las=1)
    H<-nodeHeights(tt)
    for(i in 1:nrow(tt$edge))
      lines(H[i,],X[i,],col=edge_col[i],lwd=3)
    bcols<-hcl.colors(2)
    if(is.finite(bounds[1])){
      lines(c(0,max(nodeHeights(tree))),
        rep(bounds[1],2),lwd=8,
        col=make.transparent(bcols[1],0.2))
      text(0.1*max(nodeHeights(tree)),bounds[1],
        "lower bound",pos=3)
    }
    if(is.finite(bounds[2])){
      lines(c(0,max(nodeHeights(tree))),
        rep(bounds[2],2),lwd=8,
        col=make.transparent(bcols[2],0.2))
      text(0.1*max(nodeHeights(tree)),bounds[2],
        "upper bound",pos=1)
    }
  }
  setNames(sapply(X=1:Ntip(tt),
    FUN=function(i,x,e) X[which(e[,2]==i),2],
    x=X,e=tt$edge),tt$tip.label)
}

Let’s load phytools, of course:

library(phytools)
packageVersion("phytools")

## [1] '2.4.13'

(Anyone with an earlier phytools version will need to update from GitHub to follow along.)

Our general workflow will be to simulate data under the absorbing condition, fit an unbounded Brownian model, perhaps our reflective bounded Brownian motion model of Boucher & Démery, and then our new absorbing bounds model – then pull out estimated parameter values from each model.

We can also set \(\sigma^2\) (the rate) and \(x_0\) (the starting value) randomly in each simulation.

Something like this…

bounds<-c(0,1)
tree<-pbtree(n=100,scale=0.1)
sig2<-exp(rnorm(n=1,sd=1))
x0<-runif(n=1,bounds[1],bounds[2])
x<-sim_absorbing(tree,sig2=sig2,a=x0,
  plot=TRUE,bounds=bounds,
  nsteps=500)

plot of chunk unnamed-chunk-19

We can start by fitting unbounded (i.e., “standard”) Brownian motion.

## fit unbounded BM
bm<-bounded_bm(tree,x,levs=200)
bm

## Object of class "bounded_bm" based on
##  	a discretization with k = 200 levels.
## 
## Unwrapped (i.e., bounded) model
## 
## Set or estimated bounds: [ -0.5 , 1.5 ]
## 
## Fitted model parameters:
##   sigsq: 1.018149 
##      x0: 0.375 
## 
## Log-likelihood: 41.079127 
## 
## R thinks it has found the ML solution.

(This model should be very similar to geiger::fitContinuous, and will become identical in the limit as levs is increased towards \(\infty\).)

geiger::fitContinuous(tree,x)

## GEIGER-fitted comparative model of continuous data
##  fitted 'BM' model parameters:
## 	sigsq = 1.025324
## 	z0 = 0.371028
## 
##  model summary:
## 	log-likelihood = 40.791327
## 	AIC = -77.582655
## 	AICc = -77.458943
## 	free parameters = 2
## 
## Convergence diagnostics:
## 	optimization iterations = 100
## 	failed iterations = 0
## 	number of iterations with same best fit = 100
## 	frequency of best fit = 1.000
## 
##  object summary:
## 	'lik' -- likelihood function
## 	'bnd' -- bounds for likelihood search
## 	'res' -- optimization iteration summary
## 	'opt' -- maximum likelihood parameter estimates

Now our Boucher & Démery bounded and new absorbing bounded models.

## fit bounded BM
bbm<-bounded_bm(tree,x,lims=bounds,levs=100)
bbm

## Object of class "bounded_bm" based on
##  	a discretization with k = 100 levels.
## 
## Unwrapped (i.e., bounded) model
## 
## Set or estimated bounds: [ 0 , 1 ]
## 
## Fitted model parameters:
##   sigsq: 1.079083 
##      x0: 0.355 
## 
## Log-likelihood: 81.641556 
## 
## R thinks it has found the ML solution.

## absorbing bounds model
abm<-bounded_bm(tree,x,lims=bounds,absorbing=TRUE,
      lik.func="parallel",levs=100)
abm

## Object of class "bounded_bm" based on
##  	a discretization with k = 100 levels.
## 
## Absorbing bounded model
## 
## Set or estimated bounds: [ 0 , 1 ]
## 
## Fitted model parameters:
##   sigsq: 2.069447 
##      x0: 0.405 
## 
## Log-likelihood: 249.686329 
## 
## R thinks it has found the ML solution.

If we compare this to our generating parameter values, we should see that we got much closer when we fit the absorbing model than for either standard BM or BBM.

sig2

## [1] 2.099244

x0

## [1] 0.3165764

OK, so let’s loop that 100 times & see if the pattern holds true! Here, I’m going to add a check to ensure that not all values are stuck to one bound, which would make the trait invariant (while(var(x)==0)); and I’ll use try syntax to make sure we don’t get stuck on any optimization failures that kill bounded_bm.

nsim<-100
## matrix to store results
RESULTS<-matrix(NA,nrow=nsim,ncol=11,
  dimnames=list(1:nsim,
    c("sig2","BM(sig2)","BBM(sig2)","ABM(sig2)",
      "x0","BM(x0)","BBM(x0)","ABM(x0)",
      "logL(BM)","logL(BBM)","logL(ABM)")))
## set bounds
bounds<-c(0,1)
## loop
for(i in 1:nsim){
  tree<-pbtree(n=100,scale=0.1)
  x<-rep(0,Ntip(tree))
  while(var(x)==0){
    sig2<-exp(rnorm(n=1,sd=1))
    x0<-runif(n=1,bounds[1],bounds[2])
    x<-sim_absorbing(tree,sig2=sig2,a=x0,
      plot=FALSE,bounds=bounds,
      nsteps=500)
  }
  ## fit models
  bm<-bbm<-abm<-list()
  class(bm)<-class(bbm)<-class(abm)<-
    "try-error"
  while(inherits(bm,"try-error"))
    bm<-try(bounded_bm(tree,x,levs=200))
  while(inherits(bbm,"try-error"))
    bbm<-try(bounded_bm(tree,x,lims=bounds,
      levs=100))
  while(inherits(abm,"try-error"))
    abm<-try(bounded_bm(tree,x,lims=bounds,absorbing=TRUE,
      lik.func="parallel",levs=100))
  RESULTS[i,]<-c(
    sig2,
    bm$sigsq,bbm$sigsq,abm$sigsq,
    x0,
    bm$x0,bbm$x0,abm$x0,
    logLik(bm),logLik(bbm),logLik(abm)
  )
  print(round(RESULTS[i,1:8,drop=FALSE],4))
  cat("...\n")
}

Now let’s graph our results. (I had a few weird outlier estimates – possibly due to non-convergence, so I made a custom function to exclude those when I compueted the x and y graph limits for visualization.)

## custom function to get the 
range_no_outliers<-function(x,na.rm=TRUE){
  if(na.rm) x<-x[!is.na(x)]
  q<-quantile(x,probs=c(0.25,0.75))
  iqr<-q[2]-q[1]
  lower<-q[1]-1.5*iqr
  upper<-q[2]+1.5*iqr
  inliers<-x[x>=lower & x<=upper]
  range(inliers,na.rm=na.rm)
}
## multi-panel plot
layout(matrix(1:6,3,2))
par(mar=c(5.1,5.1,3.1,1.1))
xylim<-exp(range_no_outliers(log(RESULTS[,1:4])))
plot(RESULTS[,c(1,2)],xlim=xylim,ylim=xylim,
  log="xy",pch=21,bg="grey",bty="n",las=1,
  cex.axis=0.8,xlab=expression(sigma^2),
  ylab=expression(paste(sigma^2," (BM)")))
grid()
lines(xylim,xylim)
mtext(expression(paste("a) ",sigma^2,
  " standard BM fit")),adj=0,cex=0.8)
plot(RESULTS[,c(1,3)],xlim=xylim,ylim=xylim,
  log="xy",pch=21,bg="grey",bty="n",las=1,
  cex.axis=0.8,xlab=expression(sigma^2),
  ylab=expression(paste(sigma^2," (BBM)")))
grid()
lines(xylim,xylim)
mtext(expression(paste("b) ",sigma^2,
  " bounded BM fit")),adj=0,cex=0.8)
plot(RESULTS[,c(1,4)],xlim=xylim,ylim=xylim,
  log="xy",pch=21,bg="grey",bty="n",las=1,
  cex.axis=0.8,xlab=expression(sigma^2),
  ylab=expression(paste(sigma^2," (ABM)")))
grid()
lines(xylim,xylim)
mtext(expression(paste("c) ",sigma^2,
  " absorbing BM (generating model)")),
  adj=0,cex=0.8)
xylim=c(0,1)
plot(RESULTS[,c(5,6)],xlim=xylim,ylim=xylim,
  pch=21,bg="grey",bty="n",las=1,
  cex.axis=0.8,xlab=expression(x[0]),
  ylab=expression(paste(x[0]," (BM)")))
grid()
lines(xylim,xylim)
mtext(expression(paste("d) ",x[0],
  " standard BM fit")),adj=0,cex=0.8)
plot(RESULTS[,c(5,7)],xlim=xylim,ylim=xylim,
  pch=21,bg="grey",bty="n",las=1,
  cex.axis=0.8,xlab=expression(x[0]),
  ylab=expression(paste(x[0]," (BBM)")))
grid()
lines(xylim,xylim)
mtext(expression(paste("e) ",x[0],
  " bounded BM fit")),adj=0,cex=0.8)
plot(RESULTS[,c(5,8)],xlim=xylim,ylim=xylim,
  pch=21,bg="grey",bty="n",las=1,
  cex.axis=0.8,xlab=expression(x[0]),
  ylab=expression(paste(x[0]," (ABM)")))
grid()
lines(xylim,xylim)
mtext(expression(paste("f) ",x[0],
  " absorbing BM (generating model)")),
  adj=0,cex=0.8)

plot of chunk unnamed-chunk-28

That’s about as good a result as one could hope for, I think. Interestingly, all three models do a pretty good job at estimating the ancestral state – which makes sense, I think, since at equilibrium that’s probably just the weighted mean of the upper and lower bounds, with the weights coming from the fraction stuck at each bound! (That’s an educated guess, I haven’t checked that – but I’d bet it’s correct.)

That’s all for now, folks.

Parameter estimation under the bounded Brownian model with absorbing bounds

Trending Articles

ZARIA CUMMINGS

PURPLE RANGE LIVE AT GAL AMUNA 2013

Moondru Mudichu 07-06-2016 – Polimer tv Serial

MESCOM Junior Assistant Result 2017 Assistant Cut Off/ Merit List, mescom.in

GTA 5 PPSSPP Zip File Download For Android Mediafire 382 MB

Jumping pipe Loan App Customer” Care Helpline Number®️))+91-8452013280 @! 779...

Chris Brown – 2008 – Exclusive (The Forever Edition)

KCPE 2013 Results: County order of ranking

Throw Back: 2Toff “Ye Na Bra” Ft. Castro

OpenID - (Error 500 Redirect Failed)

Practice Sheet of Right form of verbs for HSC Students

Lady Gaga & Bruno Mars – Die With A Smile (Acoustic) – Single [iTunes Plus M4A]

John Torode sticky BBQ pork ribs with corn muffins and an apple and fennel...

मुख मैथुन से उठाएं सेक्स का भरपूर मज़ा, जानें क्या है इसका सही तरीकामुख मैथुन...

AFMG EASERA SysTune Pro v1.3.7 CE-V.R

FD-20-000666 RAYAD NADER; MAJD...

Arrow Flash 2 – Sinhala Dubbed – Episode 23 – 20th March 2016

Error: Acquisition for fixed asset, value model already exists

Nalgonda District Police Office Mobile Numbers List in Telangana State

Roger Blackman spearheads renewable energy drive