A 2-Approximation for the Minimum Duplication Speciation Problem
- in "Journal of Computational Biology"
Cited by 1 (0 self)
We consider the following problem: given a set of gene family trees, spanning a given set of species, find a first speciation which splits these species into two subsets and minimizes the number of gene duplications that happened before this speciation. We call this problem the Minimum Duplication Bipartition Problem. Using a generalization of the Minimum Edge-Cut Problem, we propose a polynomial time 2-approximation algorithm for the Minimum Duplication Bipartition Problem. We apply this algorithm to the inference of species trees on synthetic datasets and on two datasets of eukaryotic species. Key words: computational molecular biology, dynamic programming, genomics rearrangements.
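To make the problem statement above concrete, here is a minimal brute-force sketch in Python. It is not the 2-approximation algorithm of the paper; it simply enumerates every bipartition of a small species set. It assumes that gene trees are nested tuples whose leaves are species names, and that a gene-tree vertex counts as a duplication preceding the first speciation when at least one of its children still contains species from both sides of the split. The function names are hypothetical.

```python
from itertools import combinations


def leaf_species(tree):
    """Return the set of species labelling the leaves of a nested-tuple tree."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaf_species(child) for child in tree))


def spans_both(tree, side_a, side_b):
    """True if the subtree contains species from both sides of the split."""
    species = leaf_species(tree)
    return bool(species & side_a) and bool(species & side_b)


def count_pre_split_duplications(tree, side_a, side_b):
    """Count vertices that must be duplications before the first speciation.

    Assumption: a vertex qualifies when one of its children spans both sides.
    """
    if isinstance(tree, str):
        return 0
    here = 1 if any(spans_both(c, side_a, side_b) for c in tree) else 0
    return here + sum(
        count_pre_split_duplications(c, side_a, side_b) for c in tree
    )


def best_bipartition(gene_trees):
    """Exhaustively search all bipartitions (exponential; tiny inputs only)."""
    species = sorted(set().union(*(leaf_species(t) for t in gene_trees)))
    anchor, rest = species[0], species[1:]
    best = None
    # Side A always contains the anchor species, so each split is seen once;
    # k < len(rest) keeps side B non-empty.
    for k in range(len(rest)):
        for extra in combinations(rest, k):
            side_a = frozenset((anchor,) + extra)
            side_b = frozenset(species) - side_a
            cost = sum(
                count_pre_split_duplications(t, side_a, side_b)
                for t in gene_trees
            )
            if best is None or cost < best[0]:
                best = (cost, side_a, side_b)
    return best


trees = [(("a", "b"), ("c", "d")), (("a", "b"), ("c", "d"))]
cost, side_a, side_b = best_bipartition(trees)
```

The exhaustive search is exponential in the number of species, which is why a polynomial-time approximation is of interest.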
Minimum Leaf Removal for Reconciliation: Complexity and Algorithms
Cited by 1 (0 self)
Reconciliation is a well-known method for studying the evolution of a gene family through speciation, duplication, and loss. Unfortunately, the inferred history strongly depends on the considered gene tree for the gene family, as a few misplaced leaves can lead to a completely different history, possibly with significantly more duplications and losses. It is therefore essential to develop methods that are able to preprocess and correct gene trees prior to reconciliation. In this paper, we consider a combinatorial problem, known as the Minimum Leaf Removal problem, that has been proposed to remove errors from a gene tree by deleting some of its leaves. We prove that the problem is APX-hard, even in the restricted case of a gene family with at most two copies per genome. On the positive side, we present fixed-parameter algorithms where the parameters are the size of the solution (minimum number of leaf removals) and the number of genomes containing multiple gene copies.
Efficient genome-scale phylogenetic analysis under the duplication-loss and deep coalescence cost models
Evolution of orthologous tandemly arrayed gene clusters
Background: Tandemly Arrayed Gene (TAG) clusters are groups of paralogous genes that are found adjacent on a chromosome. TAGs represent an important repertoire of genes in eukaryotes. In addition to tandem duplication events, TAG clusters are affected during their evolution by other mechanisms, such as inversion and deletion events, that affect the order and orientation of genes. The DILTAG algorithm developed in [1] makes it possible to infer a set of optimal evolutionary histories explaining the evolution of a single TAG cluster, from an ancestral single gene, through tandem duplication (simple or multiple, direct or inverted), deletion and inversion events. Results: We present a general methodology, which is an extension of DILTAG, for the study of the evolutionary history of a set of orthologous TAG clusters in multiple species. In addition to the speciation events reflected by the phylogenetic tree of the considered species, the evolutionary events that are taken into account are simple or multiple tandem duplications, direct or inverted, simple or multiple deletions, and inversions. Conclusions: Our results on simulated data sets show good performance in inferring the total number and size distribution of duplication events. One limitation of the algorithm, however, is its handling of multiple gene deletions: its running time is exponential in this case, and it quickly becomes intractable.
Gene trees and Species trees: Irreconcilable Differences
Background: Reconciliation is the classical method for inferring a duplication and loss history from a set of extant genes. It is based upon the notion of embedding the gene tree into the species tree, the incongruence between the two indicating evidence for duplication and loss. However, results obtained by this method are highly dependent upon the considered species and gene trees. Thus, painstaking attention has been given to the development of methods for reconstructing accurate gene trees. Results: This paper highlights the fact that errors in gene trees are not the only reasons for the inference of an erroneous duplication-loss history. More precisely, we prove that, under certain reasonable hypotheses based on the widely accepted link between function and sequence constraints, even a well-supported gene tree yields a reconciliation that does not correspond to the true history. We then provide the theoretical underpinnings for a conservative approach to infer histories given such gene trees. We apply our method to the mammalian interleukin-1 (IL) gene tree, which has been used by Page and Holmes [1] as a model example to illustrate the role of reconciliation.
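The embedding of a gene tree into a species tree described above is commonly computed with the standard LCA mapping. The following Python sketch illustrates that classical mapping only, not the conservative approach proposed in the paper. It assumes trees are nested tuples, that each gene-tree leaf is labelled with the name of its species, and that every such species occurs in the species tree. The function names are hypothetical.

```python
def collect_clusters(tree, out):
    """Append the leaf set of every vertex of `tree` to `out`.

    Returns the leaf set (cluster) of the root of `tree`.
    """
    if isinstance(tree, str):
        cluster = frozenset([tree])
    else:
        cluster = frozenset().union(*(collect_clusters(c, out) for c in tree))
    out.append(cluster)
    return cluster


def lca_mapping(gene_species, species_clusters):
    """Return the smallest species-tree cluster containing `gene_species`.

    Clusters of a tree are nested or disjoint, so this smallest cluster is
    unique and corresponds to the lowest common ancestor.
    """
    return min((c for c in species_clusters if gene_species <= c), key=len)


def count_duplications(gene_tree, species_tree):
    """Count gene-tree vertices mapped to the same species vertex as a child."""
    species_clusters = []
    collect_clusters(species_tree, species_clusters)
    duplications = 0

    def visit(node):
        nonlocal duplications
        if isinstance(node, str):
            return lca_mapping(frozenset([node]), species_clusters)
        child_maps = [visit(child) for child in node]
        here = lca_mapping(frozenset().union(*child_maps), species_clusters)
        if here in child_maps:
            # A vertex and one of its children share an image: duplication.
            duplications += 1
        return here

    visit(gene_tree)
    return duplications


species_tree = (("a", "b"), "c")
congruent = count_duplications((("a", "b"), "c"), species_tree)
incongruent = count_duplications((("a", "c"), "b"), species_tree)
```

In this toy example the gene tree that agrees with the species tree needs no duplication, while the incongruent one forces a duplication at its root, which is the kind of inference the abstract warns may not reflect the true history.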
Gene Tree Correction for Reconciliation and Species Tree Inference
Background: Reconciliation is the commonly used method for inferring the evolutionary scenario for a gene family. It consists of “embedding” inferred gene trees into a known species tree, revealing the evolution of the gene family by duplications and losses. When a species tree is not known, a natural algorithmic problem is to infer a species tree from a set of gene trees, such that the corresponding reconciliation minimizes the number of duplications and/or losses. The main drawback of reconciliation is that the inferred evolutionary scenario is strongly dependent on the considered gene trees, as a few misplaced leaves may lead to a completely different history, with significantly more duplications and losses. Results: In this paper, we take advantage of certain properties of gene trees in order to preprocess them for reconciliation or species tree inference. We flag certain duplication vertices of a gene tree, the “non-apparent duplication” (NAD) vertices, as resulting from the misplacement of leaves. In the case of species tree inference, we develop a polynomial-time heuristic for removing the minimum number of species leading to a set of gene trees that exhibit no NAD vertices with respect to at least one species tree. In the case of reconciliation, we consider the optimization problem of removing the minimum number of leaves or species leading to a tree without any NAD vertex. We develop a polynomial-time algorithm that is exact for two special classes of gene trees, and show good performance on simulated data sets in the general case.

