Quantcast
Channel: Phylogenetic Tools for Comparative Biology
Viewing all 801 articles
Browse latest View live

New function cladelabels

$
0
0

I just added a new function cladelabels to the phytools package. This function is in some ways analogous to nodelabels, tiplabels, etc. in ape. It basically implements the method I gave here, while taking advantage of the trick described here.

The code for this function is here, and it is also in a new phytools version, which can be downloaded here.

Here's a demo:

## this is just code to get a "realistic" looking tree
tree<-pbtree(n=26,tip.label=
  paste(LETTERS,"._",sapply(round(runif(n=26,min=3,max=6)),
  function(x) paste(sample(letters,x),collapse="")),sep=""),
  scale=1)
tree$edge.length<-tree$edge.length+0.1*rchisq(n=
  length(tree$edge.length),df=10)/10
plotTree(tree) ## also can use plot.phylo
nodelabels()

Now let's label the three clades descended from nodes 46 & 33; and then also node 28 (which is inclusive of node 33:

## adjust xlim to make sure there is space
plotTree(tree,xlim=c(0,1.25*max(nodeHeights(tree))),
  ftype="i")
cladelabels(tree,node=49,"Clade A")
cladelabels(tree,node=34,"Clade B")
cladelabels(tree,node=31,"Clade C",offset=3))

Cool - this was more or less what we were going for. We can also do this without sending cladelabels the tree, although in this case we need to provide some guidance on the space that cladelabels should leave for tip labels - otherwise it will assume a fixed width of 8 characters.

plotTree(tree,xlim=c(0,1.25*max(nodeHeights(tree))),
  ftype="i")
## delete the tree! (we really don't need it)
rm(tree)
cladelabels(node=49,text="Clade A",offset=4.5)
cladelabels(node=34,text="Clade B",offset=6)
cladelabels(node=31,text="Clade C",offset=9)

It's also fairly obvious how this could be combined with findMRCA to label clades on the basis of the taxa in the clade, rather than their specific MRCA node number. For instance:

plotTree(tree,xlim=c(0,1.25*max(nodeHeights(tree))),
  ftype="i")
tips<-c("L._rsqlcb","J._zidwa","K._hnu","M._tgrkb",
  "N._bpso","O._hupxz","P._shimt","Q._qcyft",
  "R._khi","S._okl","T._rfz","U._kbjho")
cladelabels(tree,node=findMRCA(tree,tips),text="clade D",
  offset=2)

Finally, it's possible to send the function multiple node numbers & clade labels in one function call - although at present this does not permit us to use different offset values. So:

cladelabels(tree,node=49,"Clade A")
cladelabels(tree,node=34,"Clade B")
## is the same as:
cladelabels(tree,node=c(49,34),text=c("Clade A","Clade B"))

One caveat important to mention is that at present this works only for rightward facing phylograms (or cladograms, for tree=NULL). This is not theoretically difficult to extend to other plot types, it just requires more work.


Function to quickly compute the height above the root of a pair of taxa

$
0
0

Today I received the following phytools user request:

I've been using your phytools package (and the findMRCA function with type="height" to find the height of a specific node in a phylogenetic tree. However, the computation time is rather long for my purpose and I was thinking about using fastMRCA. However, this outputs only the node number of the parental node. Is there a chance to get the coalescent time of this node (as with the type argument of findMRCA). Or do you have a general other suggestion to increase speed (I would really only need to calculate the time of coalescence of two tips of the tree).

Unfortunately, it doesn't matter which (findMRCA or fastMRCA) is used in this case because the bottleneck is nodeHeights, which is used to compute the height of the common ancestor above the root. Here's a quick demo to illustrate that:

> ## simulate a large tree
> tree<-pbtree(n=10000)
> ## using findMRCA
> system.time(h1<-findMRCA(tree,c("t1","t100"),
 type="height"))
   user  system elapsed
  27.67    0.00   27.90
> h1
[1] 2.148063
> ## using fastMRCA & nodeHeights
> system.time(h2<-nodeHeights(tree)[which(tree$edge[,1]==
 fastMRCA(tree,"t1","t100"))[1],1])
   user  system elapsed
  27.13    0.00   27.17
> h2
[1] 2.148063
> ## this is almost completely driven by nodeHeights
> system.time(H<-nodeHeights(tree))
   user  system elapsed
  26.74    0.00   26.77

However, we can take a different approach. What if we instead (1) found all the ancestors (back to the root) of each tip, (2) identified the intersection of these two sets, and (3) summed the parent branch length for each ancestral node above the root in the intersection? Here's a function that does this and it runs pretty fast!

fastHeight<-function(tree,sp1,sp2){
  ancs<-phytools:::getAncestors
  a1<-ancs(tree,which(tree$tip.label==sp1))
  a2<-ancs(tree,which(tree$tip.label==sp2))
  a12<-intersect(a1,a2)
  if(length(a12)>1){
    a12<-a12[2:length(a12)-1]
    h<-sapply(a12,function(i,tree)
      tree$edge.length[which(tree$edge[,2]==i)],tree=tree)
    return(sum(h))
  } else return(0)
}

In use with the same example as before:

> system.time(h3<-fastHeight(tree,"t1","t100"))
   user  system elapsed
   0.01    0.00    0.01
> h3
[1] 2.148063

I will probably put this in phytools.

Rphylip now on CRAN!

$
0
0

Rphylip, an R interface for J. Felsenstein's PHYLIP phylogeny methods program package, is now on CRAN.

Rphylip is a collaborative project with Scott Chamberlain at Simon Fraser University. Although I did most of the programming of Rphylip, Scott wrote some of the initial interface code that helped me to get started on the project, he has helped to debug the package, he has contributed some of the example datasets, and he is co-authoring the program note that we are presently preparing for journal submission.

Rphylip now contains interface functions for close to 90% of the programs of the PHYLIP package. In almost every case, all or nearly all of the functionality of the PHYLIP programs are transferred to the R user. The Rphylip functions, like PHYLIP itself, cover an enormous range of applications - from phylogeny inference, to distance matrix calculation, to phylogenetic comparative methods. In every case, we have done our best to preserve all the functionality of the PHYLIP programs while allowing them to be integrated seamlessly into an R workflow. Rphylip also contains a number of helper functions which both broaden the functionality of PHYLIP (albeit slightly), and allow it to be more easily used. For instance, the user does not necessarily need to supply the path to the PHYLIP executable. If a path is not supplied, then Rphylip will search common locations for the PHYLIP executables (such as in C:\Program Files\ on a Windows machine. Rphylip even includes a function (setupOSX) that automates the somewhat complicated procedure of installing PHYLIP to a Mac OS X computer.

Of course, before Rphylip can be used, PHYLIP must first be installed. Furthermore, any use of Rphylip in publication should automatically trigger a citation of PHYLIP (as well as relevant references for the particular method employed).

This release of Rphylip in advance of submitting our program note for publication should be considered highly beta. We welcome any feedback on the package or its use. At the moment of writing, only the package source and Mac OS binary are available. I expect that a Windows package binary should be posted soon.

Finally (oops!) I accidentally misentered the package name in the R package DESCRIPTION file which is responsible for the double ("Rphylip: Rphylip: .....") package header on CRAN. I'm aware of this & it will be fixed in future releases.

Plotting a right-facing round phylogram

$
0
0

Don't ask me why I'm working on this. Here's how to do it:

roundPhylogram<-function(tree){
  n<-length(tree$tip.label)
  # reorder cladewise to assign tip positions
  cw<-reorder(tree,"cladewise")
  y<-vector(length=n+cw$Nnode)
  y[cw$edge[cw$edge[,2]<=n,2]]<-1:n
  # reorder pruningwise for post-order traversal
  pw<-reorder(tree,"pruningwise")
  nn<-unique(pw$edge[,1])
  # compute vertical position of each edge
  for(i in 1:length(nn)){
    yy<-y[pw$edge[which(pw$edge[,1]==nn[i]),2]]
    y[nn[i]]<-mean(range(yy))
  }
  # compute start & end points of each edge
  X<-nodeHeights(cw)
  ## end preliminaries
  # open & size a new plot
  plot.new(); par(mar=rep(1.1,4))
  plot.window(xlim=c(0,1.05*max(X)),ylim=c(1,max(y)))
  # plot edges
  for(i in 1:nrow(X)){
    b<-y[cw$edge[i,1]]
    c<-X[i,1]
    d<-if(y[cw$edge[i,2]]>y[cw$edge[i,1]]) 1 else -1
    xx<-X[i,2]
    yy<-y[cw$edge[i,2]]
    a<-(xx-c)/(yy-b)^2
    curve(d*sqrt((x-c)/a)+b,from=X[i,1],to=X[i,2],add=TRUE,
      lwd=2)
  }
  # plot tip labels
  for(i in 1:n)
    text(X[which(cw$edge[,2]==i),2],y[i],tree$tip.label[i],
      pos=4,offset=0.3,font=2)
}

Now let's see how it works:

> library(phytools)
> source("roundPhylogram.R")
> tree<-pbtree(n=26,tip.label=LETTERS[26:1])
> roundPhylogram(tree)

roundPhylogram now in phytools

$
0
0

A slightly more fully functional version of roundPhylogram (described on the blog earlier today) is now part of the phytools package. The function source code can be viewed here, and updated phytools package source can be downloaded from the phytools page.

Here's a demo, using the Caribbean anole tree:

> require(phytools)
Loading required package: phytools
Loading required package: ape
Loading required package: maps
Loading required package: rgl
> packageVersion("phytools")
[1] ‘0.3.97’
> data(anoletree)
> roundPhylogram(anoletree,ftype="i",fsize=0.7)

Putting a barplot next to a plotted tree

$
0
0

Today a phytools user contacted me about creating a plot that looks like this. Well, I'm not going to try to duplicate this exactly, but here is a quick demo about how to put a bar plot next to a plotted tree with a continuous character map overlain:

## first let's simulate some tree & data to work with
tree<-pbtree(n=40)
x<-fastBM(tree)

Now let's create our plot:

## create a split plot
layout(matrix(c(1,2),1,2),c(0.7,0.3))
## plot our tree
xx<-contMap(tree,x,mar=c(4.1,1.1,1.1,0),res=200,plot=FALSE)
plot(xx,legend=FALSE,mar=c(4.1,1.1,1.1,0))
## click to add legend interactively
add.color.bar(1,cols=xx$cols,lims=xx$lims,title="")
## add bar plot
par(mar=c(4.1,0,1.1,1.1))
barplot(x[tree$tip.label],horiz=TRUE,width=1,space=0,
  ylim=c(1,length(tree$tip.label))-0.5,names="")

Obviously, the same approach could be used with an ordinary right-facing phylogram or a stochastic character mapped tree.

New version of Rphylip with Rseqboot & demo

$
0
0

I just submitted a new version (Rphylip 0.1-23) of the Rphylip package ('an R interface for PHYLIP') to CRAN. This version fixes a couple of bugs in the first CRAN version - including taxon name length limits in Rconsense and Rtreedist (which are present in the corresponding PHYLIP programs CONSENSE and TREEDIST, but can easily be circumvented when calling PHYLIP from within R). It also introduces the new function Rseqboot, which is an R interface two SEQBOOT. SEQBOOT implements a range of non-parametric bootstrapping, jacknife, and data permutation methods. Because it can take a variety of different character types as input, writing the interface was a bit of a pain in the butt - but it is finished to my satisfaction today.

Here's a demo of bootstrapping, distance matrix calculation from all bootstrap samples, and then consensus phylogeny estimation using the Rphylip package:

> require(Rphylip)
Loading required package: Rphylip
> packageVersion("Rphylip")
[1] ‘0.1.23’
> ## load primate dataset
> data(primates)
> ## basic bootstrapping
> X<-Rseqboot(primates,quiet=TRUE)
> ## compute distance matrices
> D<-lapply(X,Rdnadist,quiet=TRUE)
> ## compute all NJ trees
> ## note Rneighbor(...,quiet=TRUE) is
> ## not as quiet as it should be!
> trees<-lapply(D,Rneighbor,quiet=TRUE)
> ## reroot all trees using midpoint rooting
> require(phangorn)
Loading required package: phangorn
> trees<-lapply(trees,midpoint)
> class(trees)<-"multiPhylo"
> ## compute consensus tree
> tree<-Rconsense(trees,quiet=TRUE,rooted=TRUE)
> ## now plot the result with the bootstrap %s
> plot(tree,edge.width=2,no.margin=TRUE)
> ## find edges
> e<-sapply(2:tree$Nnode+length(tree$tip.label),
 function(x,y) which(y==x),y=tree$edge[,2])
> ## add bootstrap
> edgelabels(tree$node.label[2:tree$Nnode],e,pos=3,
 frame="none",bg="transparent")

Cool. Note that RETREE in PHYLIP does midpoint rooting, but this program cannot yet be called with Rphylip. The same general pipeline could be used with ML, MP, or other phylogeny inference methods in PHYLIP (although this would be slower, of course).

This version of Rphylip is already available from GitHub, but hopefully will also be accepted on CRAN soon.

New function to paint individual branch or branches of the tree

$
0
0

A few days ago I received the following email:

"I have recently come across your excellent paper entitled "On the analysis of evolutionary change along single branches in a phylogeny". In that you employed a software "available on request", so I am writing to request permission to obtain and use your software. Alternatively, I wonder whether the functions from that software were incorporated into the Phytools R package. I am sorry if I am missing the obvious and it is indeed in the Phytools package - I am not familiar with all functions in that package yet."

(One interesting attribute of this message is that it harks back to a day in the not too distant past in which sharing data or software 'on request' seemed more than reasonable - how quaint &passé that seems today!)

The paper that's being referred to here is Revell (2008; Am. Nat.) in which I show that a (at that time) new likelihood method by O'Meara et al. (2006) could be used to test hypotheses about exceptional evolution along (one or more) single branch(es) of the tree. Previously McPeek (1995) had proposed a cruder approach based loosely on contrasts, which I showed had reasonable type I error; however I also showed that the likelihood method was probably preferable under most circumstances.

Indeed this method is available in phytools in the form of brownie.lite; although to 'paint' branches which we have a priori hypothesized to have a higher or lower evolutionary rate is a little bit tricky. I decided to make this a whole lot easier by adding the function paintBranches to the phytools package (new source build here).

Here is an example workflow using the function to test the hypothesis of a higher rate of evolution along a specific branch of the tree:

> library(phytools)
Loading required package: ape
Loading required package: maps
Loading required package: rgl
> packageVersion("phytools")
[1] ‘0.3.99’
> ## simulate data with a high rate on one branch
> tree<-pbtree(n=26,tip.label=LETTERS[26:1])
> plotTree(tree)
> nodelabels()
> ## stretch one of the branches by x 100
> tt<-tree
> tt$edge.length[which(tt$edge[,2]==29)]<-
 tt$edge.length[which(tt$edge[,2]==29)]*100
> ## simulate data on distorted tree
> x<-fastBM(tt)
> ## paint branch with second regime to fit 2-rate model
> tree<-paintBranches(tree,29,"2")
> plotSimmap(tree,lwd=4)
no colors provided. using the following legend:
      1       2
"black"   "red"
> nodelabels()
> fit<-brownie.lite(tree,x)
> fit
ML single-rate model:
        s^2     se      a       k  logL
value   1.41    0.39    1.35    2  -39.37 

ML multi-rate model:
        s^2(1)  se(1)   s^2(2)  se(2)   a       k  logL 
value   0.96    0.26    40.24   60.27   0.12    3  -35.82

P-value (based on X^2): 0.01

R thinks it has found the ML solution.

We can use the same approach if our hypothesized exceptional regime is found on several branches in the tree. In this case, the function argument edge is an integer vector instead of a single node number. Finally, if we have an a priori hypothesis of multiple classes of exceptional rate, we can test that too using multiple calls to paintBranches.

That's it.


New options & methods for ltt95

$
0
0

A couple of days ago I received the following comment about the phytools function ltt95:

"I'm using ltt95 function with the times method and log transformed y axis, but I get reversed ages (both in the plot and in the results table). I’ve not been able to tackle down this small issue."

Well, this isn't precisely a bug of ltt95. ltt95 merely records the accumulation of lineages forward in time from the root; rather than backward (as I believe is more common in publications) from the present time.

In a newly updated version of the function I have now added the argument xaxis which can assume three values: "standard" (forward in time lineage accumulation, as in the current CRAN version of phytools); "flipped" (the time-from-the-present plot); and "negative" (similar to "flipped", but with negative time from the present). At the same time I also have modified ltt95 to return an object of class "ltt95" invisibly (don't worry, this is just the same matrix with some attributes); and I have added S3 generic plotting & print methods for objects of this class.

Here's a demo - but the function can be used in more diverse ways than I will demonstrate here:

> library(phytools)
Loading required package: ape
Loading required package: maps
Loading required package: rgl
> packageVersion("phytools")
[1] ‘0.4.0’
> ## simulate some trees
> trees<-pbtree(n=100,scale=50,nsim=100)
> trees
100 phylogenetic trees
> ## run ltt95
> obj<-ltt95(trees)
> ## ok, now let's flip the axis & plot on semi-log
> ## (I could have done this originally too)
> obj
Object of class "ltt95".
  alpha:        0.05
  method:       lineages
  mode:         median
  N(lineages):  100
> plot(obj,xaxis="flipped",log=TRUE)

Code for this new function version is here and it is in a phytools build, which can be downloaded & installed from source.

That's all for now.

Computing the height above the root of a node in the tree

$
0
0

phytools already has a function that computes the height of all the nodes in the tree: nodHeights. If we only want to know the height of one node, then using this function is very inefficient, for obvious reasons.

The following is alternative code for computing the height of only one node. It uses the phytools internal function getAncestors, so if we are not using it as part of phytools we will first have to do:

getAncestors<-phytools:::getAncestors

And here is the function:

nodeheight<-function(tree,node){
  if(node==(length(tree$tip.label)+1)) h<-0
  else {
    a<-setdiff(c(getAncestors(tree,node),node),
      length(tree$tip.label)+1)
    h<-sum(tree$edge.length[sapply(a,function(x,e)
      which(e==x),e=tree$edge[,2])])
  }
  h
}

That's it.

New print & plot methods for describe.simmap

$
0
0

I just added new print & plot methods to phytools for the object returned by describe.simmap, a function designed to summarize the results of stochastic mapping conducted with the phytools function make.simmap. Here's a quick demo:

> require(phytools)
Loading required package: phytools
Loading required package: ape
Loading required package: maps
Loading required package: rgl
> packageVersion("phytools")
[1] ‘0.4.2’
> data(anoletree)
> x<-getStates(anoletree,"tips")
> trees<-make.simmap(anoletree,x,nsim=100,model="ER",
  message=FALSE)
> obj<-describe.simmap(trees)
> ## print method
> obj
100 trees with a mapped discrete character with states:
 CG, GB, TC, TG, Tr, Tw

trees have 25.88 changes between states on average

changes are of the following types:
     CG,GB CG,TC CG,TG CG,Tr CG,Tw GB,CG GB,TC GB,TG GB,Tr
x->y  0.27  0.39  0.29  0.18  0.37  0.53  0.77  0.96  0.39
     GB,Tw TC,CG TC,GB TC,TG TC,Tr TC,Tw TG,CG TG,GB TG,TC
x->y  1.09  1.23  0.74  0.51  0.71  0.94  1.15  3.32  1.89
     TG,Tr TG,Tw Tr,CG Tr,GB Tr,TC Tr,TG Tr,Tw Tw,CG Tw,GB
x->y  1.05  1.78  0.21  0.29  0.21  0.21  0.28  0.86   1.8
     Tw,TC Tw,TG Tw,Tr
x->y  1.82  0.97  0.67

mean total time spent in each state is:
             CG         GB         TC        TG          Tr
raw  12.9528230 44.2907143 32.4717543 69.734876 12.85247436
prop  0.0629796  0.2153516  0.1578851  0.339067  0.06249168
            Tw   total
raw  33.364318 205.667
prop  0.162225   1.000

> ## plot method
> plot(obj,fsize=0.6,cex=c(0.6,0.3),ftype="i")
> ## add legend
> states<-sort(unique(getStates(trees[[1]],"tips")))
> add.simmap.legend(colors=
  setNames(palette()[1:length(states)],states),fsize=0.8)
Click where you want to draw the legend

That's not bad. The plotted pies at nodes (and tips) give the posterior probability of each node being in each state from (in this case) empirical Bayesian stochastic character mapping.

The code for these new methods, and the updated describe.simmap, is here; and a new phytools build with these methods can be downloaded from the phytools page.

New option to offset tip labels in contMap and densityMap

$
0
0

In response to a recent Facebook request I just added the option to offset tip labels in the phytools functions contMap and densityMap. This is simple. Because plot.contMap and plot.densityMap (the plotting methods for objects returned by each function) use plotSimmap internally, all I had to do was migrate that option to plotSimmap. To get this functionally, users will have to update phytools to a version >=0.4-03 (e.g., here). As of writing, this is not on CRAN.

Here's a quick demo using contMap:

> require(phytools)
Loading required package: phytools
Loading required package: ape
Loading required package: maps
Loading required package: rgl
> packageVersion("phytools")
[1] ‘0.4.3’
> tree<-pbtree(n=40,scale=10)
> x<-fastBM(tree)
> obj<-contMap(tree,x)

OK, now say I'm unhappy with the spacing of my tip labels. I can now do:

> plot(obj,offset=1)

adjusting offset until satisfied. Cool.

Note that although the offset is adjusted, the x-axis limits are not. This could create a problem for a large offset, & I will fix this in future so that xlim is under user control; however, plot.contMap and plot.densityMap already allow a fairly generous space to the right of tip labels anyway.

Finally, this will not work for type="fan". This is the same as in plotSimmap, so addressing this will take a little more effort than I'm willing to invest at this precise moment in time.

New postdoc in phylogeny methods

$
0
0

I am hiring a new postdoc in phylogenetic comparative methods. The job advertisement is below:

-----------------------------------

Postdoctoral research associate in phylogenetic comparative methods

A postdoctoral position is available in the Revell lab (http://faculty.umb.edu/liam.revell/) at the University of Massachusetts Boston in theoretical phylogenetics and/or computational phylogeny methods. Applicants should have a Ph.D. and extensive training and experience in one or more of the following areas: phylogeny method development or application in software; theoretical evolutionary quantitative genetics; and/or evolutionary computational biology. The ideal candidate will also have broad training in evolutionary biology, strong writing skills, and prior teaching or mentoring experience.

The postdoc hired from this search will play a key role in a recently funded NSF project to develop and apply new methods for evolutionary analysis in the context of phylogenetic trees. Major goals of this project include developing new visualization methods for phylogenetic comparative biology, improving the integration of phylogeny inference and comparative analysis, and bridging micro- and macroevolution in phylogenetic comparative biology. Consequently, the best candidate for this position will have skills and experience in multiple areas. The project also has substantial training goals, including the development of a new series of phylogenetic analysis mini-courses in Latin America, and a young developers’ workshop at UMass Boston’s Nantucket Field Station. The successful candidate will also be expected to participate in some of these programs.

The position is available for one year with the possibility of renewal. Start date is flexible. Please email Liam Revell (liam.revell@umb.edu) with any questions about this position.

A complete application for this position will include: (1) a brief cover letter; (2) a curriculum vitae; (3) a maximum two-page statement of your research experience & interest; and (4) names & contact information for three references. Applications can be submitted online via UMass Boston’s Interview Exchange system via the following URL: http://www.phytools.org/postdoc.search/. The position is open until filled, but applications should be sent by May 29, 2014 for full consideration.

UMass Boston provides equal employment opportunities (EEO) to all employees and applicants for employment.

Dropping tips while retaining the ancestors of remaining extant tips as singleton nodes

$
0
0

Luke Mahler asked the following:

"Do you know of a way to drop a terminal branch from a phylogeny, yet preserve the node it came from as a singleton node? I initially thought drop.tip(trim.internal=F) would do this, but it does something a little different, apparently (it preserves internal branches that become tips by pruning, but not nodes that would become singleton nodes)."

In the simple case in which we just want to drop one tip, this is relatively straightforward. We just have to drop the corresponding row & element from tree$edge, tree$edge.length, and tree$tip.label, and then update our node & tip numbers in tree$edge to follow the "phylo" object convention. However, generalizing to drop an arbitrary number of tips (while retaining all ancestral nodes to extant tips, regardless of whether they now have one or multiple descendants) now becomes a little bit trickier. Here is my function for this:

drop.tip.singleton<-function(tree,tip){
  N<-length(tree$tip.label)
  m<-length(tip)
  ii<-sapply(tip,function(x,y) which(y==x),y=tree$tip.label)
  tree$tip.label<-tree$tip.label[-ii]
  ii<-sapply(ii,function(x,y) which(y==x),y=tree$edge[,2])
  tree$edge<-tree$edge[-ii,]
  tree$edge.length<-tree$edge.length[-ii]
  tree$edge[tree$edge<=N]<-
    as.integer(rank(tree$edge[tree$edge<=N]))
  tree$edge[tree$edge>N]<-tree$edge[tree$edge>N]-m
  N<-N-m
  if(any(sapply(tree$edge[tree$edge[,2]>N,2],"%in%",
    tree$edge[,1])==FALSE)) internal<-TRUE
  while(internal){
    ii<-which(sapply(tree$edge[,2],"%in%",c(1:N,
      tree$edge[,1]))==FALSE)
    nn<-sort(tree$edge[ii,2])
    tree$edge<-tree$edge[-ii,]
    tree$edge.length<-tree$edge.length[-ii]
    for(i in 1:length(nn)) tree$edge[tree$edge>nn[i]]<-
      tree$edge[tree$edge>nn[i]]-1
    tree$Nnode<-tree$Nnode-length(ii)
    if(any(sapply(tree$edge[tree$edge[,2]>N,2],
      "%in%",tree$edge[,1])==FALSE)) internal<-TRUE
    else internal<-FALSE
  }
  tree
}

Now try it:

> tree<-pbtree(n=26,tip.label=LETTERS)
> plotTree(tree)
> tip<-sample(LETTERS,10)
> tip
[1] "N""M""F""I""Z""R""P""S""G""W"
> tt<-drop.tip.singleton(tree,tip)
> plotTree.singletons(tt)

This seems to be the correct result.

Fixed drop.tip.singleton

$
0
0

Here is a fixed version of the function I just posted to drop leaves while retaining all ancestral nodes of remaining extant taxa as singletons:

drop.tip.singleton<-function(tree,tip){
  N<-length(tree$tip.label)
  m<-length(tip)
  ii<-sapply(tip,function(x,y) which(y==x),y=tree$tip.label)
  tree$tip.label<-tree$tip.label[-ii]
  ii<-sapply(ii,function(x,y) which(y==x),y=tree$edge[,2])
  tree$edge<-tree$edge[-ii,]
  tree$edge.length<-tree$edge.length[-ii]
  tree$edge[tree$edge<=N]<-
    as.integer(rank(tree$edge[tree$edge<=N]))
  tree$edge[tree$edge>N]<-tree$edge[tree$edge>N]-m
  N<-N-m
  if(any(sapply(tree$edge[tree$edge[,2]>N,2],"%in%",
    tree$edge[,1])==FALSE)) internal<-TRUE
  else internal<-FALSE
  while(internal){
    ii<-which(sapply(tree$edge[,2],"%in%",
      c(1:N,tree$edge[,1]))==FALSE)[1]
    nn<-tree$edge[ii,2]
    tree$edge<-tree$edge[-ii,]
    tree$edge.length<-tree$edge.length[-ii]
    tree$edge[tree$edge>nn]<-tree$edge[tree$edge>nn]-1
    tree$Nnode<-tree$Nnode-length(ii)
    if(any(sapply(tree$edge[tree$edge[,2]>N,2],
      "%in%",tree$edge[,1])==FALSE)) internal<-TRUE
    else internal<-FALSE
  }
  tree
}

The previous version ran into trouble when the number of remaining internal edges that the function tried to remove in a step is >1. This version is less ambitious & seems to work fine.


New version of phytools submitted to CRAN

$
0
0

I just submitted a new version of phytools (phytools 0.4-05) to CRAN. Some updates in the current version relative to the previous CRAN version include the following:

1. A new function, cladelabels, to label membership in a clade.

2. A new function, fastHeight, to get the height above the root of the common ancestor of a pair of species.

3. A new function, roundPhylogram (1, 2), to plot a round phylogram.

4. A new function, paintBranches, to paint an edge or set of edges on the tree.

5. Some new options and S3 methods for the function ltt95 (described here).

6. A simple new function, nodeheight, to quickly compute the height above the root of a single node. (For large trees this method will be much quicker than nodeHeights if the height of only one or a small number of nodes are of interest).

7. New plot & print methods for describe.simmap (described here).

8. Some new plotting options for contMap and densityMap (described here).

9. Finally, last & possibly least, a new function drop.tip.singleton (1, 2) to drop one or multiple tips from the tree while retaining all ancestral nodes of remaining tips from the original tree as singleton nodes.

Hopefully, this new version will be accepted & posted to CRAN; but in the meantime it can be downloaded & installed from source from the phytools page. Check it out.

contMap with missing data

$
0
0

Today a phytools user emailed with the following question:

"I am still having difficulty running an analysis on the evolution of a continuous character with a partial data set (I have character states for about 50% of the taxa). I would like to display the results using contMap, but since fastAnc is built into the code, it won't map the character because the # of taxa in the data set doesn't match the # of taxa in the tree. I attempted to replace fastAnc with the modified version of anc.ML from one of your blog posts (link), but I couldn't get that to work either. I received the same error message as with fastAnc:
Error in ace(x, a, method = "pic") : length of phenotypic and of phylogenetic data do not match
Is there any way to use
contMap with missing data? Any help would be greatly appreciated. Thank you for your time."

They are totally on the right track, but things get a little trickier than they might have anticipated because whereas fastAnc (when used in it's default mode) just returns a vector of estimated character values at nodes, anc.ML (when some taxa are present in the tree but missing from the input data) returns a list with the vector missing.x containing the MLE phenotypic trait values of taxa with missing data. It also happens to be the case that I never got around to adding this updated version of anc.ML to the phytools package.

Well, I have now made both updates so it is possible to (1) specify an inference method (method="fastAnc" or method="anc.ML"); and (2) phytools function anc.ML now permits missing data, in which case the states for the tip taxa in the tree missing from the data vector will be estimated using ML. The code for these updates are here: anc.ML, contMap, but since both use internal functions not in the phytools namespace it is probably wise to download the latest phytools build and install from source.

Here's a quick demo:

> packageVersion("phytools")
[1] ‘0.4.7’
> ## simulate tree & data
> tree<-pbtree(n=26,tip.label=LETTERS)
> x<-fastBM(tree)
> ## compute limits on x to use in both plots
> lims<-c(floor(min(x)*10)/10,ceiling(max(x)*10)/10)
> ## simulate missing tip values
> y<-sample(x,20)
> ## which taxa are missing
> setdiff(tree$tip.label,names(y))
[1] "H""I""O""T""W""Z"
> ## contMap from full dataset
> contMap(tree,x,lims=lims,legend=1.5)
> ## contMap from reduced dataset
> contMap(tree,y,lims=lims,method="anc.ML",legend=1.5)

OK - that's it.

Plotting bars at the tips of a circular tree

$
0
0

Today Matt Helmus posted the following request to the R-sig-phylo mailing list:

"Does anyone know of R code (or perhaps another program) to plot bars across the tips of a radial/fan phylogeny? Specifically, I have a large phylogeny and a corresponding vector of continuous trait values for the tips, and while I could use those values to vary the color and size of the tip labels, or also to plot points of varying color or size at those tips, I think a better depiction of the data would be to plot bars that vary in height."

Naturally, this appealed to me. The biggest challenge is resuscitating high-school (or perhaps elementary school) trig (think SOH-CAH-TOA) to get the position of the vertices for each tip bar/rectangle correct. Here is my attempt. Note that it doesn't contain much in the ways of bells-and-whistles, but these could easily be added:

plotFan.wBars<-function(tree,x,scale=1,width=NULL){
  lims=c(-max(nodeHeights(tree))-scale*max(x),
    max(nodeHeights(tree))+scale*max(x))
  obj<-plotTree(tree,type="fan",ftype="off",
    xlim=lims,ylim=lims)
  if(is.null(width)) width<-(par()$usr[2]-par()$usr[1])/100
  w<-width
  x<-x[tree$tip.label]*scale
  for(i in 1:length(x)){
    dx<-obj$xx[i]
    dy<-obj$yy[i]
    theta<-atan(dy/dx)
    x1<-dx+(w/2)*cos(pi/2-theta)
    y1<-dy-(w/2)*sin(pi/2-theta)
    x2<-dx-(w/2)*cos(pi/2-theta)
    y2<-dy+(w/2)*sin(pi/2-theta)
    s<-if(dx>0) 1 else -1
    x3<-s*x[i]*cos(theta)+x2
    y3<-s*x[i]*sin(theta)+y2
    x4<-s*x[i]*cos(theta)+x1
    y4<-s*x[i]*sin(theta)+y1
    polygon(c(x1,x2,x3,x4),c(y1,y2,y3,y4),col="grey")
  }
  invisible(obj)
}

The two inobvious input arguments, scale&width, control the scale of the tip bars relative to the total height of the tree & the width of bars, respectively.

Let's try it:

> tree<-pbtree(n=50,scale=1)
> x<-fastBM(tree)
> x<-abs(x)
> plotFan.wBars(tree,x,scale=0.2)

That's all.

Plotting bars at the tips of a tree, part II

$
0
0

Earlier today, I responded to an R-sig-phylo request to be able to plot bars showing phenotypic trait values for species at the times of a circular or 'fan' tree. I have now added this function (plotTree.wBars) to the phytools package. It can be downloaded & installed from source here.

In addition to plotting a circular tree, this function version also:

(1) Plots a square phylogram in "rightward" or "leftward" orientation.

(2) Plots a stochastic character mapped tree (using plotSimmap instead of plotTree internally).

(3) Can accept most of the arguments of plotTree&plotSimmap (see the documentation page for plotSimmap for more information.)

Here's a demo:

> packageVersion("phytools")
[1] ‘0.4.8’
> tree<-pbtree(n=50)
> x<-fastBM(tree,bounds=c(0,Inf))
> plotTree.wBars(tree,x)
> Q<-matrix(c(-1,1,1,-1),2,2)
> tree<-sim.history(tree,Q)
> plotTree.wBars(tree,x,method="plotSimmap", colors=setNames(c("blue","red"),1:2),lwd=3)
> plotTree.wBars(tree,x,type="fan",method="plotSimmap", colors=setNames(c("blue","red"),1:2),lwd=3,scale=0.5, width=0.2)

That's it.

Changing the color ramp in contMap or densityMap

$
0
0

I recently received the following question:

"Would it be possible in contMap to specify our own color ramp instead of stuck with the default red-to-blue?"

This cannot yet be done automatically, but it is not too hard. Here are a couple of examples. Note that they also apply equally to objects created using densityMap.

> ## first create simulated tree & data
> tree<-pbtree(n=26,tip.label=LETTERS[26:1])
> x<-fastBM(tree)
> ## build object of class "contMap"
> obj<-contMap(tree,x,plot=FALSE)
> obj
Object of class "contMap" containing:

(1) A phylogenetic tree with 26 tips and 25 internal nodes.

(2) A mapped continuous trait on the range (-4.458, 2.5).

> ## default colors
> plot(obj)
> ## what is the length of the current color ramp?
> n<-length(obj$cols)
> ## change to grey scale
> obj$cols[1:n]<-grey(0:(n-1)/(n-1))
> plot(obj)
> ## change to blue -> red
> obj$cols[1:n]<-colorRampPalette(c("blue","red"), space="Lab")(n)
> plot(obj)

That's it!

Viewing all 801 articles
Browse latest View live