Today I received the following question about stochastic character mapped trees in phytools:
I'm working on a data set where I would like to infer the absolute date associated with a transition from one character state to another on a phylogeny (in my case its a habitat transition in a group of fishes).
I am using SIMMAP to map the binary traits onto a subset of trees from a BEAST analysis…. Basically, I would be happy with simply extracting absolute time of each character transition from each SIMMAP tree. For example, on each tree, I just want to be able to extract that transitions happened at 10.1 mya, 12.5 mya, 18.8 mya, etc…. I just cannot figure out a straight-forward way to do this for 100-1000 trees and I was wondering if there was a function in phytools that was designed for just this sort of thing.
Well, there are several functions in phytools, such as contSimmap
and describe.simmap
that, in addition to densityMap
, can be used to summarize the results of stochastic mapping in phytools; however none contain the precise function that is being sought after here. That said, it should not be too hard.
Let's start by seeing how it can be done on a single history. I'll keep it simple, and simulate a tree with three states: "a"
, "b"
, and "c"
:
library(phytools)
Q<-matrix(c(-1,1,0,1,-2,1,0,1,-1),3,3)
colnames(Q)<-rownames(Q)<-letters[1:3]
Q
## a b c
## a -1 1 0
## b 1 -2 1
## c 0 1 -1
tree<-sim.history(pbtree(n=26,tip.label=LETTERS,
scale=1),anc="a",Q)
## Done simulation(s).
This is a simulated history - but this would obviously normally be used with a stochastically mapped discrete character history generated, for example, with the phytools function make.simmap
.
I'm going to plot the tree with it's mapped discrete character history - but I will also use the function markChanges
that I just posted on my blog hereso that we can see the temporal position of the changes a little bit more easily:
plotSimmap(tree,mar=c(4.1,0.1,0.1,0.1))
## no colors provided. using the following legend:
## a b c
## "black" "red" "green3"
axis(1)
## so we can see the changes more clearly:
markChanges<-function(tree,colors=NULL,cex=1,lwd=2){
states<-sort(unique(getStates(tree)))
if(is.null(colors)) colors<-setNames(palette()[1:length(states)],
states)
obj<-get("last_plot.phylo",envir=.PlotPhyloEnv)
nc<-sapply(tree$maps,length)-1
ii<-which(nc>0)
nc<-nc[ii]
h<-vector()
for(i in 1:length(ii)){
for(j in 1:nc[i]){
ss<-names(tree$maps[[ii[i]]])[j+1]
mm<-tree$edge[ii[i],1]
dd<-tree$edge[ii[i],2]
x<-rep(obj$xx[mm]+cumsum(tree$maps[[ii[i]]])[j],2)
y<-c(obj$yy[dd]-0.5*mean(strheight(LETTERS)*cex),
obj$yy[dd]+0.5*mean(strheight(LETTERS)*cex))
lines(x,y,lwd=lwd,col=colors[ss],lend=2)
h<-c(h,x[1])
}
}
invisible(h)
}
markChanges(tree,lwd=3)
Now we can use some of the same tricks in the function above to get the heights of all the changes in the tree, along with their types:
x<-getStates(tree,"tips")
states<-sort(unique(x))
m<-length(states)
ct<-sapply(states,function(x,y) sapply(y,function(y,x)
paste(x,"->",y,sep=""),x=x),y=states)
ct
## a b c
## a "a->a" "b->a" "c->a"
## b "a->b" "b->b" "c->b"
## c "a->c" "b->c" "c->c"
ii<-matrix(TRUE,m,m)
diag(ii)<-rep(FALSE,m)
changes<-vector(mode="list",length=m*(m-1))
names(changes)<-as.vector(ct[ii])
nc<-sapply(tree$maps,length)-1
ind<-which(nc>0)
nc<-nc[ind]
H<-nodeHeights(tree)
maps<-tree$maps[ind]
for(i in 1:length(maps)){
for(j in 1:nc[i]){
sc<-paste(names(maps[[i]])[j:(j+1)],collapse="->")
h<-H[ind[i],1]+cumsum(maps[[i]])[j]
changes[[sc]]<-c(changes[[sc]],as.numeric(h))
}
}
changes<-changes[!sapply(changes,is.null)]
changes<-lapply(changes,sort)
changes
## $`a->b`
## [1] 0.3263951 0.5449173 0.5896102 0.6901126
##
## $`b->a`
## [1] 0.4872248
##
## $`b->c`
## [1] 0.6666082 0.8735417
We should be able to see by plotting the changes on the tree that our computed heights correspond with the mapped changes on our plotted tree:
plotSimmap(tree,ylim=c(0,Ntip(tree)),mar=c(4.1,0.1,0.1,0.1))
## no colors provided. using the following legend:
## a b c
## "black" "red" "green3"
xx<-markChanges(tree,lwd=3)
axis(1)
obj<-sapply(xx,function(x) lines(rep(x,2),par()$usr[3:4],
lty="dashed",col="grey"))
obj<-sapply(changes,function(x) points(x,rep(par()$usr[3]+0.2,
length(x))))
add.simmap.legend(colors=setNames(palette()[1:m],states),x=0.05,
y=0,vertical=FALSE,prompt=FALSE)
markChanges(tree,lwd=3)
Of course, we can duplicate this across a sample of trees from a posterior sample; however we need to be cognizant of the fact that we are sure to end up with different numbers of changes of each type in each different tree.
That's it.