by Nicholas Galuszynski
In my previous post I attempted to outline some of the methods used to build
phylogenetic trees. While these are useful tools for exploring the
evolutionary history of life, they all share one fundamental problem: they
fail to address reticulate evolutionary processes such as hybridization and
horizontal gene transfer. While these processes may be rare for most
animals, they play an important role in the formation of new plant species.
Phylogenetic trees can therefore give an oversimplified view of evolution,
and non-tree topologies are needed to express such evolutionary histories
accurately. These non-tree topologies generally take the form of
phylogenetic networks. Phylogenetic networks, like their branching
counterparts (trees), use either distance data (much like neighbour-joining
trees) or discrete data (as with maximum likelihood and parsimony trees) to
produce a visual representation of the evolutionary relationships between
taxa. As with trees, the branch lengths reflect the amount of evolutionary
change between taxa, or genetic distance.
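To make the idea of a distance data set concrete, here is a minimal sketch of one common measure for presence/absence marker data: the Dice distance, which is the measure named in the AFLP figure caption further down. The function names and the toy band profiles are made up for illustration and do not come from any of the cited studies.

```python
from itertools import combinations


def dice_distance(profile_a, profile_b):
    """Dice distance between two presence/absence profiles (strings of 0/1).

    a = bands present in both samples, b and c = bands present in only one.
    Dice similarity is 2a / (2a + b + c); the distance is 1 minus that.
    """
    shared = sum(x == "1" and y == "1" for x, y in zip(profile_a, profile_b))
    only_a = sum(x == "1" and y == "0" for x, y in zip(profile_a, profile_b))
    only_b = sum(x == "0" and y == "1" for x, y in zip(profile_a, profile_b))
    denominator = 2 * shared + only_a + only_b
    if denominator == 0:
        raise ValueError("both profiles are empty, distance is undefined")
    return 1.0 - (2 * shared) / denominator


def distance_table(profiles):
    """Pairwise Dice distances for a dict of sample name -> profile."""
    return {
        (a, b): dice_distance(profiles[a], profiles[b])
        for a, b in combinations(profiles, 2)
    }


# Hypothetical AFLP-style band profiles (1 = band present, 0 = absent).
toy_profiles = {
    "sample_1": "110110",
    "sample_2": "110011",
    "sample_3": "001101",
}
toy_distances = distance_table(toy_profiles)
```

A table like `toy_distances` is the kind of input a distance-based method starts from; the discrete methods instead work on the 0/1 characters directly.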
Ideally, a network display would consist of both tree-like and reticulated
portions. The tree-like areas would represent sections of the phylogeny
that have no conflict among characters. Areas with a lot of reticulation,
on the other hand, would represent those portions of the phylogeny where
there is either insufficient data to accurately construct a phylogeny, or
where there are conflicting character-state patterns. Thus, phylogenetic
networks provide a visual cue as to the number of potential phylogenetic
trees that can fit the data, with more reticulate networks corresponding
to a greater number of potential trees due to increased conflicts within
the data.
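One simple way to see what "conflict among characters" means is the pairwise compatibility check for binary characters, often called the four-gamete test: two binary characters can both sit on the same tree without extra changes only if at most three of the four state combinations (00, 01, 10, 11) occur across the taxa. The sketch below assumes binary (0/1) characters and uses a made-up toy matrix; it only counts conflicts and is not how any particular network program builds its display.

```python
from itertools import combinations


def compatible(char_a, char_b):
    """Four-gamete test for two binary characters.

    Each argument is a sequence of states ('0' or '1'), one per taxon.
    The characters conflict only when all four combinations are observed.
    """
    observed = set(zip(char_a, char_b))
    return len(observed) < 4


def conflicting_pairs(matrix):
    """Return index pairs of characters (columns) that conflict.

    matrix maps taxon name -> string of binary character states.
    """
    columns = list(zip(*matrix.values()))
    return [
        (i, j)
        for i, j in combinations(range(len(columns)), 2)
        if not compatible(columns[i], columns[j])
    ]


# Hypothetical data: four taxa scored for three binary characters.
toy_matrix = {
    "taxon_A": "110",
    "taxon_B": "100",
    "taxon_C": "011",
    "taxon_D": "001",
}
conflicts = conflicting_pairs(toy_matrix)
```

In this toy matrix the first and third characters agree with each other, while the second character conflicts with both. Data like these would be expected to show up as a reticulated region rather than a clean branch.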
Neighbour-net (quality testing)
Neighbour-net constructs split networks from distance-based data sets.
Since it is a distance-based method, based partly on the neighbour-joining
algorithm, it has the advantage of being faster than methods based on
discrete data. The method attempts to generalise the tree-building
technique of neighbour-joining by slowing the rate at which connections
are made, in a similar manner to pyramid clustering. Further similarities
between neighbour-net and neighbour-joining are that both are
agglomerative algorithms, have similar selection criteria and are both
considered to be consistent. The method is straightforward to apply, as
there are few choices to be made other than the distance measure. The
display of conflicts has also been reported to respond well to increased
complexity, making the method well suited to the analysis of complex and
ambiguous phylogenies.
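For readers who have not seen the selection step that neighbour-net borrows from, the sketch below shows the standard neighbour-joining criterion: for n taxa, Q(i, j) = (n - 2) d(i, j) - sum of row i - sum of row j, and the pair with the smallest Q is joined first. This is plain neighbour-joining, not neighbour-net itself (whose exact formulas are not covered in this post), and the distance matrix is a made-up toy example.

```python
def nj_q_matrix(dist):
    """Neighbour-joining Q values for a symmetric distance matrix."""
    n = len(dist)
    row_sums = [sum(row) for row in dist]
    q = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                q[i][j] = (n - 2) * dist[i][j] - row_sums[i] - row_sums[j]
    return q


def pick_pair(dist):
    """Return the index pair (i, j), i < j, with the smallest Q value."""
    q = nj_q_matrix(dist)
    n = len(dist)
    candidates = ((i, j) for i in range(n) for j in range(i + 1, n))
    return min(candidates, key=lambda pair: q[pair[0]][pair[1]])


# Hypothetical distances between five taxa (indices 0..4).
toy_dist = [
    [0, 5, 9, 9, 8],
    [5, 0, 10, 10, 9],
    [9, 10, 0, 8, 7],
    [9, 10, 8, 0, 3],
    [8, 9, 7, 3, 0],
]
first_pair = pick_pair(toy_dist)
```

Note that the criterion does not simply pick the closest pair: taxa 3 and 4 have the smallest raw distance here, yet taxa 0 and 1 are selected because the correction terms account for how far each taxon is from everything else.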
There are, however, a number of issues that have been raised regarding
neighbour-net. Firstly, there has been criticism of its use in
phylogenetics due to its lack of an obvious tree interpretation, an issue
further complicated by the lack of informative theorems about
neighbour-net and the need to understand T-theory. This has resulted in a
sense that much of the interpretation of neighbour-nets is affected by
some degree of subjectivity. Furthermore, neighbour-net has been
recognised as a greedy algorithm, meaning that it follows a heuristic
construction path that selects the shortest branch length at each step,
finding a local minimum and not necessarily the global minimum. Despite
these limitations, neighbour-net provides a powerful means to visually
inspect conflicts between probable trees produced from large data sets
that are prone to systematic errors.
Neighbour-net for AFLP data (Dice distance computed) for the genus
Baldellia. Neighbour-net plots can be quite overwhelming when large data
sets are analysed, such as those associated with entire genera.
Fortunately, labels make cluster recognition simpler, although
interpretations can still be subjective at times. Neighbour-nets are
normally interpreted in conjunction with additional analyses such as
parsimony-based networks or phylogenetic trees. Image from Arrigo et al.,
2011.
Statistical parsimony (data displaying)
Statistical parsimony builds a network by sequentially connecting taxa in
order of increasing character-state differences until the parsimony
connection limit is reached, that is, the limit up to which parsimony can
be considered a reliable method for phylogenetic inference. The analysis
is able to process both discrete sequence data and distance values. The
method is straightforward, requiring no parameter selection, and displays
evolutionary change in a similar manner to a parsimony tree: the distances
between taxa reflect evolutionary change. Since it is based on parsimony,
this method faces some of the same issues affecting parsimony tree
construction, such as the inability to process large, complex data sets.
Under these conditions of increased complexity, statistical parsimony
disconnects the network, producing an array of separate networks rather
than a single diagram, which makes the analysis more prone to false
negatives. Furthermore, since the network is built sequentially, the most
parsimonious connections are made at each step. This can result in an
incomplete display of character conflicts due to a lack of indirect
links, and in overall networks that do not achieve maximum parsimony.
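The sequential connection step can be sketched as follows. This is a deliberately simplified illustration, not the procedure used by any particular statistical parsimony program: it counts differences between aligned haplotype sequences, joins haplotypes in order of increasing difference, and stops at a connection limit. Here the limit is simply passed in as a number (an assumption for the sketch; in practice it is derived from the data), and the sketch neither infers missing intermediate haplotypes nor keeps alternative connections that would form loops. The haplotype names and sequences are made up.

```python
from itertools import combinations


def differences(seq_a, seq_b):
    """Number of sites at which two aligned sequences differ."""
    return sum(x != y for x, y in zip(seq_a, seq_b))


def connect_within_limit(haplotypes, limit):
    """Join haplotypes in order of increasing differences, up to `limit`.

    Returns (edges, networks): the connections made, and the groups of
    haplotypes that ended up connected to one another.
    """
    names = list(haplotypes)
    parent = {name: name for name in names}

    def find(name):
        # Follow parent links to the representative of this group.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    pairs = sorted(
        combinations(names, 2),
        key=lambda p: differences(haplotypes[p[0]], haplotypes[p[1]]),
    )
    edges = []
    for a, b in pairs:
        steps = differences(haplotypes[a], haplotypes[b])
        if steps > limit:
            break  # every remaining pair is at least this far apart
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b
            edges.append((a, b, steps))

    groups = {}
    for name in names:
        groups.setdefault(find(name), []).append(name)
    return edges, list(groups.values())


# Hypothetical aligned haplotypes; H4 is far from the other three.
toy_haplotypes = {
    "H1": "AAAAAAAA",
    "H2": "AAAAAAAT",
    "H3": "AAAAAATT",
    "H4": "TTTTTAAA",
}
edges, networks = connect_within_limit(toy_haplotypes, limit=2)
```

With a limit of 2, H4 is left out as its own separate network, which mirrors the disconnection described above. Raising the limit to 5 would pull it into a single network.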
The value of parsimony networks is that they provide additional insight
into the phylogenetic history of the haplotypes under investigation
(inferring population or species history) as well as the relative
abundance of each haplotype.
Parsimony network representing the genealogical relationships of
haplotypes within the Little Karoo endemic Berkheya cuneata (Asteraceae).
The nodes (small dots) indicate sequence changes between haplotypes A-E,
and the size of the coloured circles (haplotypes) indicates the relative
frequency of these haplotypes within the samples. The large circles are
the most common haplotypes and are therefore generally considered to be
older. Outgroup species included B. fruticosa (a), B. coriacea and B.
spinosa. Image taken from Potts et al., 2013.
Relevant literature:
Arrigo, N., Buerki, S., Sarr, A., Guadagnuolo, R., Kozlowski, G.,
2011. Phylogenetics and phylogeography of the monocot genus Baldellia
(Alismataceae): Mediterranean refugia, suture zones and implications
for conservation. Molecular Phylogenetics and Evolution 58, 33–42.
Huson, D.H., Bryant, D., 2006. Application of phylogenetic networks
in evolutionary studies. Molecular Biology and Evolution 23, 254–267.
Levy, D., Pachter, L., 2011. The neighbor-net algorithm. Advances in
Applied Mathematics 47, 240–258.
Moret, B.M.E., Nakhleh, L., Warnow, T., Linder, C.R., Tholse, A.,
Padolina, A., Sun, J., Timme, R., 2004. Phylogenetic networks:
Modeling, reconstructibility, and accuracy. IEEE/ACM Transactions on
Computational Biology and Bioinformatics 1, 13–23.
Morrison, D.A., 2005. Networks in phylogenetic analysis: New tools
for population biology. International Journal for Parasitology 35,
567–582.
Potts, A.J., Hedderson, T.A., Vlok, J.H.J., Cowling, R.M., 2013.
Pleistocene range dynamics in the eastern Greater Cape Floristic
Region: A case study of the Little Karoo endemic Berkheya cuneata
(Asteraceae). South African Journal of Botany 88, 401–413.