Quantcast
Channel: Phylogenetic Tools for Comparative Biology
Viewing all articles
Browse latest Browse all 802

Projecting a continuous character onto a tree using different edge widths

$
0
0

Earlier today, I received the following message from a phytools user:

“I am calculating rates of change of a character trait and then plotting that change on a phylogenetic tree using the function you created in R, plotBranchbyTrait(). While, I managed to show the differences in rates of change using different colors, I was wondering if you have also created a function that represents branch thickness in each branch of the tree proportionally to the value of that branch (e.g. branches evolving at a faster rate will be represented with a thicker line etc.). If not, do you have any advice of how I would be able to do that?”

Tempted as I was to show him how to do this using phytools (which is possible, though complicated), the more responsible response was to point him towards the help page in ape::plot.phylo which documents the argument edge.width as follows:

edge.width– a numeric vector giving the width of the branches of the plotted phylogeny. These are taken to be in the same order than the component edge of phy. If fewer widths are given than the length of edge, then these are recycled.

I also cautioned that (depending on our plotting device) R might render these as only whole integer (I know that's redundant - but just to be clear) values: i.e., 0.5 to 1.5 as 1; 1.5 to 2.5 as 2; and so on.

Nonetheless, I thought it could be interesting to illustrate this method using a real example, in which I set the width of each plotted edge of the tree to match the “mean” (average of parent & daughter known or reconstructed values) of a continuous trait on the tree.

Let's go:

library(phytools)
data(mammal.tree)
## check our tree
plotTree(mammal.tree,fsize=0.8,ftype="i")

plot of chunk unnamed-chunk-1

data(mammal.data)
## check our data
mammal.data
##                    bodyMass homeRange
## U_maritimus 265.00 115.600
## U_arctos 251.30 82.800
## U_americanus 93.40 56.800
## N_narica 4.40 1.050
## P_locor 7.00 1.140
## M_mephitis 2.50 2.500
## M_meles 11.60 0.870
## C_lupus 35.30 202.800
## C_latrans 13.30 45.000
## L_pictus 20.00 160.000
## C_aureus 8.80 9.100
## U_cinereoargenteus 3.70 1.100
## V_fulva 4.80 3.870
## H_hyaena 26.80 152.800
## C_crocuta 52.00 25.000
## A_jubatus 58.80 62.100
## P_pardus 52.40 23.200
## P_tigris 161.00 69.600
## P_leo 155.80 236.000
## T_bairdii 250.00 2.000
## C_simum 2000.00 6.650
## D_bicornis 1200.00 15.600
## E_hemionus 200.00 35.000
## E_caballus 350.00 22.500
## E_burchelli 235.00 165.000
## L_guanicoe 95.00 0.500
## C_dromedarius 550.00 100.000
## G_camelopardalis 1075.00 84.600
## S_caffer 6210.00 138.000
## B_bison 865.00 133.000
## T_oryx 511.00 87.500
## G_granti 62.50 20.000
## G_thomsonii 20.50 5.300
## A_cervicapra 37.50 6.500
## M_kirki 5.00 0.043
## O_americanus 113.50 22.750
## O_canadensis 85.00 14.330
## H_equinus 226.50 80.000
## A_melampus 53.25 3.800
## C_taurinus 216.00 75.000
## D_lunatus 130.00 2.200
## A_buselaphus 136.00 5.000
## A_americana 50.00 10.000
## C_canadensis 300.00 12.930
## D_dama 55.00 1.300
## A_alces 384.00 16.100
## R_tarandus 100.00 30.000
## O_virginianus 57.00 1.960
## O_hemionus 74.00 2.850
## pull out body mass on log scale
lnBodyMass<-setNames(log(mammal.data$bodyMass),
rownames(mammal.data))
## reconstruct body mass on the tree
a<-fastAnc(mammal.tree,lnBodyMass)
a
## Ancestral character estimates using fastAnc:
## 50 51 52 53 54 55 56 57 58 59 60
## 4.640573 3.941365 3.580030 3.378235 5.008602 5.417042 2.506860 1.783328 2.075654 2.092616 2.102696
## 61 62 63 64 65 66 67 68 69 70 71
## 2.427963 2.879837 2.935833 4.003930 3.660685 4.315032 4.607959 4.760302 4.873642 5.317474 5.559303
## 72 73 74 75 76 77 78 79 80 81 82
## 7.078588 5.517309 5.552671 5.192977 5.370860 5.348416 5.312073 5.325223 5.429285 6.920095 4.688839
## 83 84 85 86 87 88 89 90 91 92 93
## 3.685079 3.636850 3.590634 4.698303 4.603683 4.874816 4.790504 4.882241 4.886430 5.262581 5.009917
## 94 95 96 97
## 4.889860 4.940759 4.842665 4.247903

OK, now for the hard part. Our edge.width vector needs to be in the order of the edges of mammal.tree$edge. As such, what I'm going to do is go through each edge of this matrix, find the observed or reconstructed value of the parent and daughter nodes, and average them into a single vector. I'm going to do this with apply, but I could have also done it using a for loop.

First, though, I will sort all of my trait values into a single long vector with an order that corresponds to the node indices.

## get node values for all nodes (including tips)
node.values<-c(lnBodyMass[mammal.tree$tip.label],
unclass(a))
node.values
##        U_maritimus           U_arctos       U_americanus           N_narica            P_locor 
## 5.5797298 5.5266474 4.5368913 1.4816045 1.9459101
## M_mephitis M_meles C_lupus C_latrans L_pictus
## 0.9162907 2.4510051 3.5638830 2.5877640 2.9957323
## C_aureus U_cinereoargenteus V_fulva H_hyaena C_crocuta
## 2.1747517 1.3083328 1.5686159 3.2884019 3.9512437
## A_jubatus P_pardus P_tigris P_leo T_bairdii
## 4.0741419 3.9589066 5.0814044 5.0485731 5.5214609
## C_simum D_bicornis E_hemionus E_caballus E_burchelli
## 7.6009025 7.0900768 5.2983174 5.8579332 5.4595855
## L_guanicoe C_dromedarius G_camelopardalis S_caffer B_bison
## 4.5538769 6.3099183 6.9800759 8.7339162 6.7627295
## T_oryx G_granti G_thomsonii A_cervicapra M_kirki
## 6.2363696 4.1351666 3.0204249 3.6243409 1.6094379
## O_americanus O_canadensis H_equinus A_melampus C_taurinus
## 4.7318028 4.4426513 5.4227449 3.9749978 5.3752784
## D_lunatus A_buselaphus A_americana C_canadensis D_dama
## 4.8675344 4.9126549 3.9120230 5.7037825 4.0073332
## A_alces R_tarandus O_virginianus O_hemionus 50
## 5.9506426 4.6051702 4.0430513 4.3040651 4.6405728
## 51 52 53 54 55
## 3.9413651 3.5800304 3.3782346 5.0086025 5.4170421
## 56 57 58 59 60
## 2.5068605 1.7833278 2.0756539 2.0926160 2.1026959
## 61 62 63 64 65
## 2.4279633 2.8798366 2.9358328 4.0039298 3.6606852
## 66 67 68 69 70
## 4.3150319 4.6079589 4.7603022 4.8736421 5.3174738
## 71 72 73 74 75
## 5.5593032 7.0785882 5.5173085 5.5526712 5.1929774
## 76 77 78 79 80
## 5.3708596 5.3484157 5.3120732 5.3252231 5.4292850
## 81 82 83 84 85
## 6.9200950 4.6888392 3.6850793 3.6368500 3.5906336
## 86 87 88 89 90
## 4.6983032 4.6036828 4.8748164 4.7905038 4.8822412
## 91 92 93 94 95
## 4.8864297 5.2625810 5.0099171 4.8898599 4.9407594
## 96 97
## 4.8426647 4.2479034
## get edge values by averaging parent & daughter nodes
## for each edge
edge.values<-apply(mammal.tree$edge,1,function(e,nv)
mean(nv[e]),nv=node.values)
edge.values
##  [1] 4.290969 3.760698 3.479132 4.193419 5.212822 5.498386 5.471845 4.772747 2.942548 2.145094 1.632466
## [12] 1.864619 2.291257 1.495972 2.263329 2.836323 2.097656 2.265330 2.653900 2.907835 3.249858 2.761798
## [23] 2.937784 2.301358 1.705514 1.830616 3.972647 3.832308 3.474544 3.805964 4.159481 4.194587 4.461495
## [34] 4.283433 4.684131 4.920853 4.904438 4.757107 5.095558 5.438389 5.540382 6.318946 7.339745 7.084333
## [45] 5.417391 5.407813 5.534990 5.705302 5.506128 5.033310 5.281918 4.962368 5.840389 5.270697 6.164246
## [56] 5.330244 5.318648 5.377254 6.174690 7.827006 6.841412 5.832827 5.007031 4.186959 3.660965 3.613742
## [67] 3.862900 3.305529 3.630595 2.647259 4.693571 4.650993 4.667743 4.523167 4.786560 5.148781 4.832660
## [78] 4.382751 4.836372 5.128760 4.884335 4.876982 4.899542 5.287327 4.587302 5.136249 4.949888 5.296821
## [89] 4.448597 4.975338 5.445701 4.891712 4.723917 4.545284 4.145477 4.275984

OK, now we're ready. I'm going to round my edge widths to have values between 1 & 10. I'll do this using a simple rescale of my current values:

edge.widths<-edge.values-min(edge.values)
edge.widths<-edge.widths*(9/max(edge.widths))+1

OK, now let's plot:

par(lend=2)
plot(mammal.tree,no.margin=TRUE,cex=0.7,edge.width=edge.widths,label.offset=1)

plot of chunk unnamed-chunk-4

Lastly, for fun why don't we try to add a legend!

par(lend=2)
plot(mammal.tree,no.margin=TRUE,cex=0.7,edge.width=edge.widths,
label.offset=1,edge.col="darkgrey")
leg.length<-30
nsegs<-20
segments(x0=seq(0,(1-1/nsegs)*leg.length,length.out=nsegs),
y0=rep(Ntip(mammal.tree),nsegs),
x1=seq(1/nsegs*leg.length,leg.length,length.out=nsegs),
y1=rep(Ntip(mammal.tree),nsegs),
lwd=seq(1,10,length.out=nsegs),col="darkgrey")
text(0,Ntip(mammal.tree),round(min(edge.values),2),pos=1,
cex=0.8)
text(leg.length,Ntip(mammal.tree),round(max(edge.values),2),
pos=1,cex=0.8)
text(leg.length/2,Ntip(mammal.tree),"log(body size kg)",
pos=1,cex=0.8)

plot of chunk unnamed-chunk-5

That's not bad, right?


Viewing all articles
Browse latest Browse all 802

Trending Articles