-
Notifications
You must be signed in to change notification settings - Fork 4
Description
Hi Martin - in my teaching light term, I have returned to trying to do some phylogenies, and noticed that in even v simple analyses, which I think should have strong ingroup support, when using the inapplicables search, I seem to be betting a basal polytomy between the outgroup and the earliest branching ingroup when I use maximiseParsimony(). To test this, I created the following nexus file, which should have unambiguous ingroup support:
#NEXUS
BEGIN DATA;
DIMENSIONS NTAX = 5 NCHAR = 10
FORMAT DATATYPE = STANDARD GAP = - MISSING = ? SYMBOLS = "0 1 2 3 4 5 6 7 8 9 A B C D E F G H J K M N P Q R S T U V W X Y Z a b c d e f g h j k m n p q r s";
MATRIX
Taxon_01 0000000000
Taxon_02 1111101111
Taxon_03 1112111111
Taxon_04 1122111111
Taxon_05 1122111111
;
END;
I then simplified a script to do a search, that I wrote last summer, to do fitch and inapplicables aware search on this, assuming the above is saved as a file ccalled test.nex in the working directory:
#Install packages if required
packages <- c("parallel", "TreeTools", "TreeDist", "TreeSearch", "Rogue", "cheapr", "paleobioDB", "strap")
if(length(packages[!packages %in% installed.packages()[,"Package"]]) > 0){
install.packages(packages[!packages %in% installed.packages()[,"Package"]])
}
# Load required libraries
invisible(lapply(packages, library, character.only = TRUE))
rm(packages)
#This is required if not running from Bash
#setwd("~/Desktop/Programming/scripts/Phylogeny/R_parsimony")
#This is used to write relevant output files, in addition to reading the data in
inputDataNexusFile<-"./test.nex"
#Set output directory for plots (all other files written to wd)
outDirectory<-getwd()
#Number of cores for parallelisation
numCores <- detectCores()
##################################################################################
#### Source functions
doSearch <- function(phyMatrix, stub, k, outDirectory) {
if (k == Inf) {
title <- paste("Equal weights - ", stub)
} else {
title <- paste("Implied weights - K", k, " ", stub, sep = "")
}
# Select starting tree
startTree <- AdditionTree(phyMatrix, concavity = k)
# Search for optimal trees under equal weights
newTrees <- MaximizeParsimony(
phyMatrix,
tree = startTree,
concavity = k,
ratchIter = 12,
tbrIter = 1,
maxHits = 100,
maxTime = 60,
startIter = 3,
finalIter = 1,
verbosity = 4
)
cat(title, file = firstHitsFile, append = TRUE)
cat(attr(newTrees, "firstHit"), file = firstHitsFile, append = TRUE)
cat("\n", file = firstHitsFile, append = TRUE)
# Calculate rogues
rogues <- Rogue::QuickRogue(newTrees, fullSeq = TRUE, p = 1)
cat("Rogues are: ")
print(rogues) # Detailed results of rogue analysis
# Calculate consensus
conTree <- Consensus(newTrees, p = 1.0)
# Root consensus
rootedConTree <- root(conTree, outgroup = outgroupForRooting)
cat("Outgroup is ",outgroupForRooting)
# Adjust parameters here
# nReplicates <- 200
# jackTrees <- lapply(logical(nReplicates), function(x) {
# Resample(
# phyMatrix,
# newTrees,
# ratchIter = 12,
# tbrIter = 1,
# maxHits = 100,
# maxTime = 20,
# startIter = 3,
# finalIter = 1,
# verbosity = 4
# )
# })
# Save tree
fileName <- gsub(" ", "_", title)
saveDirectory <- paste(outDirectory, "/", fileName, ".pdf", sep = "")
pdf(saveDirectory, width = 10, height = 20)
# Set up plotting area - reduce font size
par(cex = 0.8)
# Plot the tree
# JackLabels(rootedConTree, lapply(jackTrees, ape::consensus), add=TRUE, pos=1, adj=-10)
plot(rootedConTree, tip.color = ColByStability(newTrees))
#nodelabels(JackLabels(rootedConTree, lapply(jackTrees, ape::consensus), plot = FALSE), frame = "none", pos = 4, offset = 0.1, cex = 0.6, col = "darkgray")
# Trying to colour nodes by support:
# nodelabels(thermo=as.numeric(JackLabels(rootedConTree, lapply(jackTrees, ape::consensus), plot=FALSE)), frame="circle", pos=4, offset = 0.1, cex =0.6)
#ggtree(rootedConTree) + geom_tiplab(as_ylab = TRUE, color = ColByStability(newTrees))
title(title)
PlotTools::SpectrumLegend("bottomright", legend = c("Stable", "Unstable"), palette = hcl.colors(131, "inferno")[1:101])
dev.off()
# Remove data no longer required
rm(startTree, newTrees, rogues, conTree, rootedConTree)
}
makeFitchMatrix <- function(phyloMatrixAsPhyDat){
datasetFitchMatrix<-PhyDatToMatrix(phyloMatrixAsPhyDat)
datasetFitchMatrix[datasetFitchMatrix == "-"] <- "?"
return(MatrixToPhyDat(datasetFitchMatrix))
}
##################################################################################
##### Load data, create matrices for inapplicable aware and fitch parsimony
#Store datasets to run in a list
datasets<-list()
#Load data from matrix exported as TSV nexus format
datasetInapplicables <- ReadAsPhyDat(inputDataNexusFile)
#Add this dataset to our list of datasets
datasets[["datasetInapplicables"]]<-datasetInapplicables
rm(datasetInapplicables)
#Use this to set the outgroup now the names are changed
outgroupForRooting<-names(datasets[["datasetInapplicables"]])[1]
#Create a version of this to do Fitch treatment of inapplicables
datasets[["datasetFitch"]]<-makeFitchMatrix(datasets[["datasetInapplicables"]])
#Lets record the first hit attribute so we can check we are likely to be exploring treespace OK
firstHitsFile<-paste(outDirectory,"/search_firstHits.txt",sep="")
mclapply(1:length(datasets), mc.cores = numCores-2, function(i)
#for(i in 1:length(datasets))
{
#Work out name stub for this treatment
stub<-strsplit(names(datasets)[i],split='dataset')[[1]][2]
### Equal weights ###
doSearch(datasets[[i]],stub,Inf, outDirectory)
### Implied weights - K of 10 ###
doSearch(datasets[[i]],stub,10, outDirectory)
### Implied weights - K of 3 ###
doSearch(datasets[[i]],stub,3, outDirectory)
})
Outputs are:
Equal_weights_-__Fitch.pdf
Equal_weights_-__Inapplicables.pdf
Implied_weights_-_K10_Fitch.pdf
Implied_weights_-_K10_Inapplicables.pdf
Implied_weights_-_K3_Fitch.pdf
Implied_weights_-_K3_Inapplicables.pdf
As you can see, with the inapplicables aware search, we do indeed always get a basal polytomy between the out and earliest branching ingroup. I am very aware I might be missing something obvious (or subtle), and if so apologies, but I wanted to flag it and find out if this is the expected behaviour, or whether there something untoward going on?