Maximal Strip Recovery Problem with Gaps: Hardness and Approximation Algorithms
Cited by 6 (2 self)
Given two comparative maps, that is, two sequences of markers each representing a genome, the Maximal Strip Recovery problem (MSR) asks to extract a largest sequence of markers from each map such that the two extracted sequences are decomposable into nonoverlapping strips (or synteny blocks). This aims at defining a robust set of synteny blocks between different species, which is key to understanding the evolutionary process since their last common ancestor. In this paper, we add a fundamental constraint to the initial problem, which expresses the biologically sustained need to bound the number of intermediate (nonselected) markers between two consecutive markers in a strip. We therefore introduce the problem δ-gap-MSR, where δ is a (usually small) nonnegative integer that upper bounds the number of nonselected markers between two consecutive markers in a strip. Depending on the nature of the comparative maps (i.e., with or without duplicates), we show that δ-gap-MSR is NP-complete for any δ ≥ 1, and even APX-hard for any δ ≥ 2. We also provide two approximation algorithms, with ratio 1.8 for δ = 1, and ratio 4 for δ ≥ 2.
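The gap constraint in this abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: it only checks whether one candidate strip respects the bound δ on a single map. The function name `respects_gap`, the marker names, and the assumption of unsigned, duplicate-free maps are all hypothetical choices made for illustration. Because the markers of a strip are consecutive in the extracted sequence, every marker lying between two of them in the original map counts as nonselected.

```python
from typing import Sequence


def respects_gap(genome: Sequence[str], strip: Sequence[str], delta: int) -> bool:
    """Check the gap bound for one strip on one map.

    Assumes `genome` has no duplicate markers and contains every marker
    of `strip` (illustrative assumptions, not taken from the paper).
    """
    # Position of each marker in the original (unextracted) map.
    position = {marker: i for i, marker in enumerate(genome)}
    for left, right in zip(strip, strip[1:]):
        # Markers strictly between two consecutive strip markers are nonselected.
        skipped = abs(position[right] - position[left]) - 1
        if skipped > delta:
            return False
    return True


# Hypothetical maps: the strip a-b-c-d appears forward in genome_1
# and reversed in genome_2.
genome_1 = ["a", "x", "b", "c", "y", "z", "d"]
genome_2 = ["d", "c", "b", "w", "a"]
strip = ["a", "b", "c", "d"]
```

In this toy input, `c` and `d` are separated by two markers in `genome_1`, so the strip is admissible on that map for δ = 2 but not for δ = 1, while on `genome_2` it is admissible already for δ = 1. A strip must satisfy the bound on both maps.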
Inferring gene order phylogenies by learning ancestral adjacencies
As genomes evolve over very long periods, genes get rearranged, which changes their order along the genome. The resulting differences in gene order among species provide evidence of their phylogenetic relationships, particularly ancient ones. Indeed, this type of data is increasingly useful in sorting out evolutionary relationships [16, 17, 23]. In this paper, we give the first polynomial-time algorithm for inferring phylogenies from logarithmic-length gene-order data. We provide an implementation of a version of this algorithm that is effective in the high-mutation regimes that are difficult for previous polynomial-time approaches [17]. Like the branch-and-bound approach in GRAPPA [16, 17], our method takes advantage of reconstructing internal sequences. The difference between our polynomial running time and GRAPPA’s exponential running time is dramatic in practice as well as in theory. The heart of our contribution is a method for estimating the distance between genomes and an associated learning method to infer gene-order data at internal nodes. This leads to a polynomial-time algorithm for reconstructing a phylogeny on a set of n taxa from gene-order data consisting of N = O(log n) genes. Our algorithm follows the structure of the algorithm by Mihaescu et al. [15], which applies to character data. We replace the estimators and learning subroutines in the Mihaescu algorithm with methods that work with gene order. We implement and test a version of our method against the best previous polynomial-time method on simulated data. Our method does considerably better in high-evolution conditions. For low-evolution conditions, we do not do as well as previous methods.