Quantcast
Channel: Phylogenetic Tools for Comparative Biology
Viewing all articles
Browse latest Browse all 802

Parallelized multirateBM for variable-rate Brownian evolution model

$
0
0

Last year I published an article describing a penalized likelihood approach to fitting an evolutionary model to quantitative trait data in which the rate of evolution itself, \(\sigma^{2}\), evolved under a geometric Brownian motion process, implemented in the phytools function multirateBM.

Even though I think this is a pretty neat model, one of its limitations is that penalized likelihood optimization is very slow.

To try to help with that, I recently pushed a phytoolsupdate that adds parallelized numerical optimization to multirateBM using the cool CRAN package optimParallel.

Let's see how it works.

Why don't we start by loading a tree and dataset. In this case, I'll use a tree and body size dataset for 49 species of mammals from Garland et al. (1992).

library(phytools)
packageVersion("phytools")
## [1] '1.2.2'
data(mammal.tree)
data(mammal.data)
lnMass<-setNames(log(mammal.data$bodyMass),
rownames(mammal.data))

OK. We're ready to test out the parallelized version of multirateBM.

Unlike, fitMk.parallel, which I blogged about this summer, parallelization is not implemented as a separate function in multirateBM, but as a function option: parallel=TRUE.

Let's see.

First, I'll fit my model using parallel=FALSE, the default. Then I'll repeat with parallel=TRUE.

Even though multiple optimization methods are available in multirateBM, I'll stick to "L-BFGS-B" (a box constrained, quasi-Newton optimization routine), because that's what is implemented in optimParallel.

Finally, I'll wrap each of my optimization chunks in system.time so that we can get a timing from our R system.

system.time(
fit.serial<-multirateBM(mammal.tree,lnMass,
lambda=1,optim="L-BFGS-B",parallel=FALSE)
)
## Beginning optimization....
## Optimization iteration 1. Using "L-BFGS-B" optimization method.
## Best (penalized) log-likelihood so far: -230.967
## Done optimization.
##    user  system elapsed 
## 89.30 0.92 90.28
fit.serial
## Multi-rate Brownian model using multirateBM.
##
## Fitted rates:
## U._maritimus U._arctos U._americanus N._narica P._lotor M._mephitis
## 0.013659 0.013007 0.126694 0.024988 0.017302 0.153454
## M._meles C._lupus
## 0.011781 0.178708 ....
##
## lambda penalty term: 1
## log-likelihood: -57.689982
## AIC: 311.379964
##
## R thinks it has found a solution.
system.time(
fit.parallel<-multirateBM(mammal.tree,lnMass,
lambda=1,optim="L-BFGS-B",parallel=TRUE)
)
## Beginning optimization....
## Using socket cluster with 8 nodes on host 'localhost'.
## Optimization iteration 1. Using "L-BFGS-B" (parallel) optimization method.
## Best (penalized) log-likelihood so far: -230.967
## Done optimization.
##    user  system elapsed 
## 1.19 1.04 48.95
fit.parallel
## Multi-rate Brownian model using multirateBM.
##
## Fitted rates:
## U._maritimus U._arctos U._americanus N._narica P._lotor M._mephitis
## 0.013659 0.013007 0.126694 0.024988 0.017302 0.153454
## M._meles C._lupus
## 0.011781 0.178708 ....
##
## lambda penalty term: 1
## log-likelihood: -57.689982
## AIC: 311.379964
##
## R thinks it has found a solution.

Neat. We should see that we've obtained exactly the same solutions from both parallel=FALSE and parallel=TRUE, but the latter was produced in around half the time on my computer. (Results will vary, obviously!)

The reader also may have noted that I didn't specify the number of cores to use. I could have, using the optional argument ncores, but if I don't, then multirateBM will try to figure out how many cores I have using parallel::detectCores, and then print the results to screen: Using socket cluster with 8 nodes on host ‘localhost’.

Let's plot our fitted model!

## generate plotting object
obj<-plot(fit.parallel,plot=FALSE)
obj
## Object of class "multirateBM_plot" containing:
##
## (1) A phylogenetic tree with 49 tips and 48 internal nodes.
##
## (2) A mapped set of Brownian evolution rates.
## change color (needs viridisLite package)
obj<-setMap(obj,viridisLite::viridis(n=8))
## graph
plot(obj,ftype="i",fsize=0.8,outline=TRUE,lwd=5)

plot of chunk unnamed-chunk-3


Viewing all articles
Browse latest Browse all 802

Trending Articles