In the l -phylogeny problem, one wishes to construct an evolutionary tree for a set of species represented by characters, in which each state of each character induces no more than l connected components. We consider the fixed-topology version of this problem for fixed-topologies of arbitrary degree. This version of the problem is known to be NP -complete for l ≥ 3 even for degree-3 trees in which no state labels more than l+1 leaves (and therefore there is a trivial l + 1 phylogeny). We give a 2 -approximation algorithm for all l ≥ 3 for arbitrary input topologies and we give an optimal approximation algorithm that constructs a 4 -phylogeny when a 3 -phylogeny exists. Dynamic programming techniques, which are typically used in fixed-topology problems, cannot be applied to l -phylogeny problems. Our 2 -approximation algorithm is the first application of linear programming to approximation algorithms for phylogeny problems. We extend our results to a related problem in which characters are polymorphic.
- Approximation algorithm
- Computational biology