Figure 1. The relationship between the per-site LRT and the P-value in a hypothetical case.

Consider a dataset with 100 characters (or sites in a sequence alignment) in which the alternative tree has a log-likelihood that is 5.0 higher than that of the tree for the null hypothesis. If we do not rely on asymptotic assumptions to connect 2*(the difference in log-likelihoods) to the chi-square distribution, then we can instead try to infer a null distribution from the per-site differences in log-likelihoods.

If we construct the null distribution in this way, we find that the variability of the support across characters is very important to assessing the statistical significance of a log-likelihood difference.
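To make the role of variability concrete, here is a minimal sketch for the hypothetical case of 100 sites and a total difference of 5.0. It assumes, purely for illustration, that the per-site differences are independent with mean 0 under the null and that their sum is approximately normal; neither assumption comes from the text above, and the function name `approx_p_value` is hypothetical.

```python
import math

N_SITES = 100          # number of characters in the hypothetical dataset
OBSERVED_TOTAL = 5.0   # total log-likelihood difference favoring the alternative tree


def approx_p_value(per_site_sd, n_sites=N_SITES, observed_total=OBSERVED_TOTAL):
    """One-sided P-value under a normal approximation (an assumption of this sketch).

    Under the null the per-site differences have mean 0, so the total has
    mean 0 and standard deviation per_site_sd * sqrt(n_sites).
    """
    total_sd = per_site_sd * math.sqrt(n_sites)
    z = observed_total / total_sd
    # Upper-tail probability of a standard normal at z.
    return 0.5 * math.erfc(z / math.sqrt(2.0))


# The same total difference of 5.0 is less significant when sites vary more.
for sd in (0.1, 0.25, 0.5, 1.0, 2.0):
    print(f"per-site SD = {sd:4.2f}  ->  approximate P = {approx_p_value(sd):.4g}")
```

Under these assumptions, a per-site standard deviation of 0.5 gives an approximate P-value of about 0.16, while a standard deviation of 0.1 gives a P-value far below 0.001, even though the total difference is 5.0 in both cases.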

Typically we want to test the null hypothesis that the two trees are equally good explanations of the data. Thus, under the null, the expected per-site difference in log-likelihoods is 0. So, if we have a method of producing a plausible amount of variability in the per-site log-likelihood differences, we can generate a null distribution at the per-site level.

Because the total difference in log-likelihoods is simply the sum of the differences for all sites, we can get the null distribution of the total log-likelihood difference from the per-site null distribution.
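One way to carry out this idea is sketched below. It is an illustrative assumption, not necessarily the procedure the authors have in mind: the observed per-site differences are centered so that their mean is 0 (matching the null), sites are resampled with replacement to mimic the observed variability, and each resampled set is summed to give one draw of the total difference. The function names and the simulated input data are hypothetical.

```python
import random


def null_total_differences(site_diffs, n_reps=1000, seed=0):
    """Simulate totals under the null by resampling centered per-site differences."""
    rng = random.Random(seed)
    n_sites = len(site_diffs)
    mean_diff = sum(site_diffs) / n_sites
    # Centering enforces the null expectation of 0 per site while
    # keeping the observed site-to-site variability.
    centered = [d - mean_diff for d in site_diffs]
    totals = []
    for _ in range(n_reps):
        resampled = rng.choices(centered, k=n_sites)  # sample sites with replacement
        totals.append(sum(resampled))                 # total = sum over sites
    return totals


def one_sided_p_value(observed_total, null_totals):
    """Fraction of simulated null totals at least as large as the observed total."""
    n_extreme = sum(1 for t in null_totals if t >= observed_total)
    return n_extreme / len(null_totals)


# Hypothetical per-site differences for 100 sites (made-up data for illustration).
data_rng = random.Random(42)
example_diffs = [data_rng.gauss(0.05, 0.5) for _ in range(100)]
observed = sum(example_diffs)
null_draws = null_total_differences(example_diffs)
print(f"observed total difference = {observed:.3f}")
print(f"approximate one-sided P   = {one_sided_p_value(observed, null_draws):.3f}")
```

Resampling the centered differences is only one way to produce "a plausible amount of variability"; a parametric simulation would be another. The sketch treats sites as independent, which is itself an assumption.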

Further information on topological testing will be available in a forthcoming "Encyclopedia of Evolution" article by Emily Jane B. McTavish and Mark T. Holder.