Quantcast
Channel: Phylogenetic Tools for Comparative Biology
Viewing all articles
Browse latest Browse all 802

Combining phylogenetic PCA with the factoextra package

$
0
0

A couple of days ago, Roberto Toro posted the following tweet:

I responded, and then we wrote back and forth, and I think I've now developed a solution whereby a "phyl.pca" object produced by phytools::phyl.pca can be converted to a "princomp" object and plotted using the factoextra package.

The update can be seen on the phytools GitHub page, and obtained by updating phytoolsfrom GitHub using devtools.

The following demo requires that phytools be updated (to version 1.0.2 or above) & that factoextra is installed from CRAN.

First, I'll load my packages & data.

## load packages
library(phytools)
packageVersion("phytools")
## [1] '1.0.2'
library(factoextra)
## load example data
data(anoletree)
data(anole.data)

Next, I'll pull out the factor “ecomorph” from the tree object, anoletree. This is only possible because anoletree is a "simmap" class object, but I'll use this factor to plot my PC scores by group – something that factoextra facilitates. All the other steps of this demo do not require a grouping factor – I do this only to help demonstrate the full capabilities of factoextra!

ecomorph<-getStates(anoletree,"tips")
head(ecomorph,10)
##            ahli         allogus     rubribarbus           imias          sagrei 
## "TG" "TG" "TG" "TG" "TG"
## bremeri quadriocellifer ophiolepis mestrei jubar
## "TG" "TG" "GB" "TG" "TG"

Now I'll do my phylogenetic PCA (using phyl.pca) and then convert my "phyl.pca" object to a "princomp"object using as.princomp, the function I just added to phytools.

pca<-phyl.pca(anoletree,anole.data)
pca
## Phylogenetic pca
## Standard deviations:
## PC1 PC2 PC3 PC4 PC5 PC6
## 0.33222579 0.09207406 0.05012082 0.04318123 0.02008655 0.01507125
## Loads:
## PC1 PC2 PC3 PC4 PC5 PC6
## SVL -0.9712234 0.16067288 0.01972570 0.14782215 -0.06211906 0.06935433
## HL -0.9645111 0.16955087 -0.01203113 0.17994634 0.08064324 -0.04406887
## HLL -0.9814164 -0.02674808 0.10315533 -0.13790763 0.06887922 0.04126248
## FLL -0.9712265 0.17585694 0.10697935 -0.09105747 -0.06075142 -0.04864769
## LAM -0.7810052 0.37429334 -0.47398703 -0.15871456 0.00217418 0.00875408
## TL -0.9014509 -0.42528918 -0.07614571 0.01709649 -0.01750404 -0.01088743
obj<-as.princomp(pca)
obj
## Call:
## phyl.pca(tree = anoletree, Y = anole.data)
##
## Standard deviations:
## Comp.1 Comp.2 Comp.3 Comp.4 Comp.5 Comp.6
## 0.33222579 0.09207406 0.05012082 0.04318123 0.02008655 0.01507125
##
## 6 variables and observations.

Try a simple factoextra visualization.

fviz_screeplot(obj,addlabels=TRUE)

plot of chunk unnamed-chunk-4

Cool!

Finally, I'll use a differentfactoextra plotting method to show the PC scores in two different dimensions separated by my discrete factor, ecomorph.

fviz_pca_ind(obj,label="none",habillage=ecomorph,
addEllipses=TRUE)

plot of chunk unnamed-chunk-5

That's it.


Viewing all articles
Browse latest Browse all 802

Trending Articles