(This post is for my own notes. It will probably make sense to about 0.00000001% of people subscribed to this blog.)
Consider the two unrooted phylogenies shown below. Both trees contain the same taxa. Suppose these topologies were discovered during an MCMC run. The “ingroup” taxa are blue and named with “i-.” The outgroup taxa are red and named with “o-.” On the ML tree, it's pretty clear which nodes are the last common ancestors of the ingroup and the outgroup. However, on the alternate tree, the rooting is ambiguous.
Simply put, on the alternate tree the ingroup and outgroup are not monophyletic. This is a problem for ancestral sequence reconstruction (ASR) methods that attempt to incorporate phylogenetic uncertainty. In such methods, we want to calculate the Bayesian average of the ancestral sequence across both trees. On the alternate tree, which ancestor do we choose?
Here are three potential solutions:
- Discard trees with non-monophyletic ingroups/outgroups.
- Randomly select one of the putative ancestors, and forget about the other one.
- Use the average of the putative ancestral sequences. In our example, the alternate tree contains two possible roots. Therefore, we average the ancestral sequences from both of these nodes, and then use this averaged sequence in our overall Bayesian average with other trees.
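
The third option can be sketched in a few lines of Python. This is only a rough sketch under my own assumptions, not a reference implementation: each ancestral sequence is represented as a list of per-site probability distributions over states, the putative ancestors within one tree are weighted equally, and each tree carries a posterior weight (for example, its sampling frequency in the MCMC run). The names `average_ancestors` and `average_over_trees` and the weights 0.75 and 0.25 are made up for illustration.

```python
from typing import Dict, List, Tuple

SiteDist = Dict[str, float]   # state -> probability at one alignment site
Ancestor = List[SiteDist]     # one distribution per site


def average_ancestors(ancestors: List[Ancestor]) -> Ancestor:
    """Average the per-site distributions of the putative ancestors of ONE tree.

    Assumption: every putative ancestor gets equal weight.
    """
    n_sites = len(ancestors[0])
    if any(len(a) != n_sites for a in ancestors):
        raise ValueError("all ancestors must cover the same sites")
    averaged = []
    for site in range(n_sites):
        states = set()
        for a in ancestors:
            states.update(a[site])
        averaged.append({
            s: sum(a[site].get(s, 0.0) for a in ancestors) / len(ancestors)
            for s in sorted(states)
        })
    return averaged


def average_over_trees(trees: List[Tuple[float, List[Ancestor]]]) -> Ancestor:
    """Combine trees, each given as (posterior weight, putative ancestors).

    Weights are renormalized to sum to 1, so raw MCMC sample counts work too.
    """
    total = sum(weight for weight, _ in trees)
    per_tree = [(w / total, average_ancestors(ancs)) for w, ancs in trees]
    n_sites = len(per_tree[0][1])
    result = []
    for site in range(n_sites):
        dist: SiteDist = {}
        for w, anc in per_tree:
            for state, p in anc[site].items():
                dist[state] = dist.get(state, 0.0) + w * p
        result.append(dist)
    return result


# Hypothetical one-site example: the ML tree has a single unambiguous ancestor,
# the alternate tree has two putative ancestors that disagree.
ml_tree = (0.75, [[{"A": 1.0}]])
alt_tree = (0.25, [[{"A": 1.0}], [{"G": 1.0}]])
combined = average_over_trees([ml_tree, alt_tree])
print(combined)
```

In this toy case the alternate tree contributes a 50/50 mix of A and G, which is then down-weighted by that tree's posterior weight. The first option (discarding trees) would amount to dropping every tree with more than one putative ancestor from the list before calling `average_over_trees`.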