Quantcast
Channel: Phylogenetic Tools for Comparative Biology
Viewing all articles
Browse latest Browse all 802

More on Rphylip: Interfaces for threshml & treedist

$
0
0

I've been gradually plugging away at Rphylip (my R wrapper for PHYLIP), which now includes working interfaces for contml, contrast, dnaml, dnamlk, dnapars, neighbor, threshml, and treedist. Obviously, I have a long way to go; however some of the new additions cover functionality that can't be found in any other software. For example, Rcontrast (interface for contrast) does regular phylogenetic contrasts - but also allows for within-species sampling (Felsenstein 2008). Rthreshml (interface for threshml, which is not actually yet in PHYLIP but will be in the next release) implements the threshold model which can be used for estimating the correlation between discrete and continuous characters, among other things (Felsenstein 2012).

Here's a quick demo of Rthreshml:

> library(Rphylip)
Loading required package: ape
> require(phytools)
Loading required package: phytools
Loading required package: maps
Loading required package: rgl
> tree<-pbtree(n=100)
> V<-matrix(c(1,0,0.8,0,
0,2,0,1.2,
0.8,0,1,0,
0,1.2,0,1),4,4)
> V
     [,1] [,2] [,3] [,4]
[1,]  1.0  0.0  0.8  0.0
[2,]  0.0  2.0  0.0  1.2
[3,]  0.8  0.0  1.0  0.0
[4,]  0.0  1.2  0.0  1.0
> X<-sim.corrs(tree,V)
> th<-setNames(c(0,Inf),LETTERS[1:2])
> X<-data.frame(X[,1],X[,2],sapply(X[,3],threshState,th), sapply(X[,4],threshState,th))
> names(X)<-paste("v",1:ncol(X),sep="")
> X
             v1          v2 v3 v4
t15   2.8841985  4.98211106  B  B
t26   2.1335369  7.73244895  A  B
t27   0.9770008  6.58944668  A  B
t44   3.5651266  4.94113242  B  B
t85   1.6813201  3.73859795  A  B
t86   1.3738674  3.36676417  A  B
t69   1.1805336  2.17284155  A  B
t48   2.2652202  3.93811812  B  B
....

> fit<-Rthreshml(tree,X,proposal=0.1)

....

Threshold character Maximum Likelihood method version 3.7a

....

Markov chain Monte Carlo (MCMC) estimation of covariances:

                                 Acceptance   Norm of change
Chains (20)                         rate        in transform
------                          ----------    --------------

Burn-in: 1000 updates
Chain 1: 100000 updates  ........  0.8623         0.339614
Chain 2: 100000 updates  ........  0.8651         0.215640
Chain 3: 100000 updates  ........  0.8674         0.160055
Chain 4: 100000 updates  ....

....

Covariance matrix of continuous characters
and liabilities of discrete characters
(the continuous characters are first)

             1          2          3          4   
  1       0.94584    0.09582    0.73779   -0.15732
  2       0.09582    2.07466   -0.17442    1.13575
  3       0.73779   -0.17442    1.00000   -0.31895
  4      -0.15732    1.13575   -0.31895    1.00000

....

> fit
$Covariance_matrix
         v1       v2       v3       v4
v1  0.94584  0.09582  0.73779 -0.15732
v2  0.09582  2.07466 -0.17442  1.13575
v3  0.73779 -0.17442  1.00000 -0.31895
v4 -0.15732  1.13575 -0.31895  1.00000

$Transform_indepvar_liab
         v1       v2       v3      v4
v1  0.97254  0.00000  0.00000 0.00000
v2  0.09853  1.43699  0.00000 0.00000
v3  0.75862 -0.17339  0.62804 0.00000
v4 -0.16177  0.80146 -0.09118 0.56849

$Var_change
      v1       v2       v3       v4
2.859484 1.708908 0.238306 0.213801

$Transform_liab_cont
         v1       v2       v3      v4
v1 -0.08096  0.81423 -0.20088 0.53863
v2 -0.69130 -0.26046 -0.67287 0.03887
v3 -0.08890 -0.47040  0.32064 0.81732
v4  0.71249 -0.21889 -0.63568 0.20090

We can see that the estimated covariance matrix is actually very close to our generating matrix. Cool.

Unlike the threshml standalone, we can provide our discrete & continuous characters in any order, and our discrete character can be coded anyway we want - so long as it only has two states. Rthreshml creates the input file, reads back in the output, and sorts the estimated covariance matrix back into the order of the columns of X.


Viewing all articles
Browse latest Browse all 802

Trending Articles