Results 1-10 of 11
Inferring a level-1 phylogenetic network from a dense set of rooted triplets
, 2006
Maximum agreement and compatible supertrees
 Proceedings of the 15th Combinatorial Pattern Matching Symposium (CPM’04), volume 3109 of LNCS
, 2004
Cited by 20 (8 self)
Given a set of leaf-labelled trees with identical leaf sets, the MAST problem, respectively MCT problem, consists of finding a largest subset of leaves such that all input trees restricted to these leaves are isomorphic, respectively compatible. In this paper, we propose extensions of these problems to the context of supertree inference, where input trees have non-identical leaf sets. This situation is of particular interest in phylogenetics. The resulting problems are called SMAST and SMCT. A sufficient condition is given that identifies cases where these problems can be solved by resorting to MAST and MCT as subproblems. This condition is met, for instance, when only two input trees are considered. Then we give algorithms for SMAST and SMCT that benefit from the link with the subtree problems. These algorithms run in time linear in the time needed to solve MAST, respectively MCT, on an instance of the same or smaller size. It is shown that arbitrary instances of SMAST and SMCT can be turned in polynomial time into instances composed of trees with a bounded number of leaves. SMAST is shown to be W[2]-hard when the considered parameter is the number of input leaves that have to be removed to obtain the agreement of the input trees. A similar result holds for SMCT. Moreover, the corresponding optimization problems, that is, the complements of SMAST and SMCT, cannot be approximated in polynomial time within a constant factor, unless P = NP. These results also hold when the input trees have a bounded number of leaves. The presented results apply to collections of both rooted and unrooted trees. Preprint submitted to Elsevier Science, 17 November 2006.
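The notion of "input trees restricted to these leaves are isomorphic" used in the abstract above can be illustrated with a minimal sketch. It assumes rooted trees encoded as nested tuples with distinct string leaf labels; the names `restrict`, `canonical`, and `agree` are hypothetical helpers for illustration, not taken from the paper, and the sketch only checks agreement on a given leaf set rather than solving MAST.

```python
def restrict(tree, keep):
    """Prune every leaf not in `keep` and suppress nodes left with one child."""
    if not isinstance(tree, tuple):
        return tree if tree in keep else None
    kids = [r for r in (restrict(child, keep) for child in tree) if r is not None]
    if not kids:
        return None
    if len(kids) == 1:
        return kids[0]  # a unary node carries no branching information
    return tuple(kids)


def canonical(tree):
    """Order children deterministically so unordered trees compare with ==."""
    if not isinstance(tree, tuple):
        return tree
    return tuple(sorted((canonical(child) for child in tree), key=repr))


def agree(trees, keep):
    """True if all trees restricted to `keep` are isomorphic (rooted, unordered)."""
    forms = [canonical(restrict(t, keep)) for t in trees]
    return all(f == forms[0] for f in forms)


# Two hypothetical input trees on the same leaf set {a, b, c, d}.
t1 = (("a", "b"), ("c", "d"))
t2 = (("a", "c"), ("b", "d"))

print(agree([t1, t2], {"a", "b"}))       # any two leaves always agree
print(agree([t1, t2], {"a", "b", "c"}))  # the trees disagree on this triple
```

A MAST solver would search for the largest `keep` for which `agree` holds; the brute-force search over subsets is exponential and is deliberately left out here.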
New Results on Optimizing Rooted Triplets Consistency
Cited by 13 (2 self)
Abstract. A set of phylogenetic trees with overlapping leaf sets is consistent if it can be merged without conflicts into a supertree. In this paper, we study the polynomial-time approximability of two related optimization problems called the maximum rooted triplets consistency problem (MaxRTC) and the minimum rooted triplets inconsistency problem (MinRTI), in which the input is a set R of rooted triplets, and where the objectives are to find a largest cardinality subset of R which is consistent and a smallest cardinality subset of R whose removal from R results in a consistent set, respectively. We first show that a simple modification to Wu’s Best-Pair-Merge-First heuristic [25] results in a bottom-up-based 3-approximation for MaxRTC. We then demonstrate how any approximation algorithm for MinRTI could be used to approximate MaxRTC, and thus obtain the first polynomial-time approximation algorithm for MaxRTC with approximation ratio smaller than 3. Next, we prove that for a set of rooted triplets generated under a uniform random model, the maximum fraction of triplets which can be consistent with any tree is approximately one third, and then provide a deterministic construction of a triplet set having a similar property which is subsequently used to prove that both MaxRTC and MinRTI are NP-hard even if restricted to minimally dense instances. Finally, we prove that MinRTI cannot be approximated within a ratio of Ω(log n) in polynomial time, unless P = NP.
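To make the consistency notion in the abstract above concrete, here is a minimal sketch of the classical BUILD procedure of Aho et al. for testing whether a set of rooted triplets is consistent. This is not the approximation algorithm of the paper; it is a standard technique swapped in for illustration. It assumes each triplet is a tuple `(a, b, c)` meaning "a and b are closer to each other than either is to c", and that leaf labels are sortable strings. The function name `build` is chosen here.

```python
def build(leaves, triplets):
    """Return a nested-tuple tree consistent with all triplets, or None if none exists."""
    leaves = set(leaves)
    if len(leaves) == 1:
        return next(iter(leaves))

    # Union-find over the current leaves.
    parent = {x: x for x in leaves}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    # Only triplets whose three leaves are all present constrain this subproblem.
    relevant = [t for t in triplets if set(t) <= leaves]
    for a, b, _ in relevant:
        parent[find(a)] = find(b)  # a and b must share a child subtree of the root

    groups = {}
    for x in sorted(leaves):
        groups.setdefault(find(x), set()).add(x)
    if len(groups) == 1:
        return None  # the root cannot be split: the triplets are inconsistent

    children = []
    for group in groups.values():
        subtree = build(group, relevant)
        if subtree is None:
            return None
        children.append(subtree)
    return tuple(children)


print(build({"a", "b", "c"}, [("a", "b", "c")]))                   # a tree exists
print(build({"a", "b", "c"}, [("a", "b", "c"), ("a", "c", "b")]))  # None
```

MaxRTC asks for the largest subset of the triplets on which such a procedure succeeds, and MinRTI for the smallest subset whose removal makes it succeed; the sketch only answers the yes/no consistency question.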
On the approximation of computing evolutionary trees
 In Proceedings of the 11th International Computing and Combinatorics Conference (COCOON’05)
, 2005
Cited by 12 (4 self)
Abstract. Given a set of leaf-labelled trees with identical leaf sets, the well-known MAST problem consists of finding a subtree homeomorphically included in all input trees and with the largest number of leaves. MAST and its variant called MCT are of particular interest in computational biology. This paper presents positive and negative results on the approximation of MAST, MCT and their complement versions, denoted CMAST and CMCT. For CMAST and CMCT on rooted trees we give 3-approximation algorithms achieving significantly lower running times than those previously known. In particular, the algorithm for CMAST runs in linear time. The approximation threshold for CMAST, resp. CMCT, is shown to be the same whether collections of rooted trees or of unrooted trees are considered. Moreover, hardness of approximation results are stated for CMAST, CMCT and MCT on a small number of trees, and for MCT on an unbounded number of trees.
A generalization of Haussler’s convolution kernel: mapping kernel
 Proceedings of the International Conference on Machine Learning
, 2008
Cited by 11 (0 self)
Haussler’s convolution kernel provides a successful framework for engineering new positive semidefinite kernels, and has been applied to a wide range of data types and applications. In the framework, each data object represents a finite set of finer-grained components. Then, Haussler’s convolution kernel takes a pair of data objects as input, and returns the sum of the return values of the predetermined primitive positive semidefinite kernel calculated for all the possible pairs of the components of the input data objects. On the other hand, the mapping kernel that we introduce in this paper is a natural generalization of Haussler’s convolution kernel, in that the input to the primitive kernel ranges over a predetermined subset rather than the entire cross product. Although several instances of the mapping kernel appear in the literature, their positive semidefiniteness was investigated case by case, and worse yet, was sometimes incorrectly concluded. In fact, there exists a simple and easily checkable necessary and sufficient condition, which is generic in the sense that it enables us to investigate the positive semidefiniteness of an arbitrary instance of the mapping kernel. This is the first paper that presents and proves the validity of the condition. In addition, we introduce two important instances of the mapping kernel, which we refer to as the size-of-index-structure-distribution kernel and the edit-cost-distribution kernel. Both of them are naturally derived from well-known (dis)similarity measurements in the literature (e.g. the maximum agreement tree, the edit distance), and are reasonably expected to improve the performance of the existing measures by evaluating their distributional features rather than their peak (maximum/minimum) features.
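The difference between the two kernels described in the abstract above can be sketched in a few lines. This is an illustrative reading of the abstract only, under these assumptions: each data object is a sequence of components, the primitive kernel `delta` is a simple equality kernel, and the `aligned` mapping (pairs of equal positions) is a hypothetical choice of subset. The sketch makes no claim that this particular mapping satisfies the paper's condition for positive semidefiniteness.

```python
from itertools import product


def delta(a, b):
    """Primitive kernel assumed for illustration: 1 if components are equal, else 0."""
    return 1.0 if a == b else 0.0


def convolution_kernel(xs, ys, k):
    """Sum the primitive kernel over the entire cross product of components."""
    return sum(k(a, b) for a, b in product(xs, ys))


def mapping_kernel(xs, ys, k, mapping):
    """Sum the primitive kernel only over the index pairs selected by `mapping`."""
    return sum(k(xs[i], ys[j]) for i, j in mapping(xs, ys))


def aligned(xs, ys):
    """Hypothetical mapping: pair up components that sit at the same position."""
    return [(i, i) for i in range(min(len(xs), len(ys)))]


print(convolution_kernel("ab", "ba", delta))       # counts every matching pair
print(mapping_kernel("ab", "ba", delta, aligned))  # counts same-position matches only
```

Choosing `mapping` to return every index pair recovers the convolution kernel, which is the sense in which the mapping kernel generalizes it.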
Improved Parameterized Complexity of the Maximum Agreement Subtree and . . .
 IEEE/ACM Transactions on Computational Biology and Bioinformatics
, 2006
Cited by 9 (4 self)
Given a set of evolutionary trees on the same set of taxa, the maximum agreement subtree problem (MAST), respectively maximum compatible tree problem (MCT), consists of finding a largest subset of taxa such that all input trees restricted to these taxa are isomorphic, respectively compatible. These problems
Clustering with Relative Constraints
Cited by 3 (1 self)
Recent studies [26, 22] have suggested using relative distance comparisons as constraints to represent domain knowledge. A natural extension to relative comparisons is the combination of two comparisons defined on the same set of three instances. Constraints in this form, termed Relative Constraints, provide a unified knowledge representation for both partitional and hierarchical clusterings. But many key properties of relative constraints remain unknown. In this paper, we answer the following important questions that enable the broader application of relative constraints in general clustering problems:
• Feasibility: Does there exist a clustering that satisfies a given set of relative constraints? (consistency of constraints)
• Completeness: Given a set of consistent relative constraints, how can one derive a complete clustering without running into dead ends?
• Informativeness: How can one extract the most informative relative constraints from given knowledge sources?
We show that any hierarchical domain knowledge can be easily represented by relative constraints. We further present a hierarchical algorithm that finds a clustering satisfying all given constraints in polynomial time. Experiments showed that our algorithm achieves significantly higher accuracy than the existing metric learning approach based on relative comparisons.
A Heuristic Algorithm for Optimal Agreement Supertrees
 In Proceedings of the International Conference on Systemics, Cybernetics and Informatics (ICSCI2005)
, 2005
Cited by 2 (1 self)
A phylogenetic supertree is a collection of different phylogenetic trees combined into a single tree, forming a tree of life. Many smaller overlapping phylogenetic trees are combined in such a way that no branching information is lost. There may exist an exponentially large number of supertrees for a given set of trees. The optimal tree is selected based upon different optimality criteria. In this paper we present a polynomial-time heuristic algorithm for merging the trees, checking their optimality based on the ordinary least squares criterion. The outcome of the algorithm is an optimal agreement supertree. Keywords: Phylogenetic tree, phylogenetic supertree, heuristic algorithm, optimality criteria, ordinary least squares
Least Common Ancestor Based Method for Efficiently Constructing Rooted
A phylogenetic supertree is a collection of different phylogenetic trees combined into a single tree, forming a tree of life. The smaller overlapping phylogenetic trees are combined in such a way that no branching information is lost. This problem is important for several biological applications. Yet the solution is difficult, as an exponentially large number of supertrees exist for a given set of trees and the optimal tree has to be selected based upon some optimality criteria. In this paper, we propose a polynomial-time algorithm for combining phylogenetic trees, which makes use of least common ancestor information as the optimality criterion. The algorithm satisfies the desirable properties of a phylogenetic supertree method as mentioned in the literature, and constructs a single phylogenetic supertree even for incompatible input trees, a case that is difficult to solve. Experimental results and comparisons with other works show the superiority of our algorithm.