Sorting by weighted reversals, transpositions and inverted transpositions
- Proceedings of RECOMB 2006, Lecture Notes in Bioinformatics
, 2006
Cited by 14 (4 self)
During evolution, genomes are subject to genome rearrangements that alter the ordering and orientation of genes on the chromosomes. If a genome consists of a single chromosome (like mitochondrial, chloroplast, or bacterial genomes), the biologically relevant genome rearrangements are (1) inversions (also called reversals), where a section of the genome is excised, reversed in orientation, and reinserted, and (2) transpositions, where a section of the genome is excised and reinserted at a new position in the genome; if this also involves an inversion, one speaks of an inverted transposition. To reconstruct ancient events in the evolutionary history of organisms, one is interested in finding an optimal sequence of genome rearrangements that transforms a given genome into another genome. It is well known that this problem is equivalent to the problem of "sorting" a signed permutation into the identity permutation. In this paper, we provide a 1.5-approximation algorithm for sorting by weighted reversals, transpositions and inverted transpositions for biologically realistic weights.
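The three operations named in this abstract can be illustrated on a signed permutation stored as a Python list. This is only a sketch: the function names and the half-open index convention are assumptions made here for illustration, not notation from the paper, and neither the weighting scheme nor the 1.5-approximation algorithm is reproduced.

```python
def reversal(perm, i, j):
    """Reverse the segment perm[i:j] and flip the sign of each gene in it."""
    return perm[:i] + [-g for g in reversed(perm[i:j])] + perm[j:]


def transposition(perm, i, j, k):
    """Excise perm[i:j] and reinsert it just before original index k (i < j <= k)."""
    return perm[:i] + perm[j:k] + perm[i:j] + perm[k:]


def inverted_transposition(perm, i, j, k):
    """Like transposition, but the moved segment is also reversed and sign-flipped."""
    moved = [-g for g in reversed(perm[i:j])]
    return perm[:i] + perm[j:k] + moved + perm[k:]


if __name__ == "__main__":
    # Each call below turns its input into the identity permutation.
    print(reversal([1, -3, -2, 4], 1, 3))
    print(transposition([1, 3, 4, 2, 5], 1, 3, 4))
    print(inverted_transposition([1, -4, -3, 2, 5], 1, 3, 4))
```

Sorting a signed permutation then means finding a sequence of such calls that ends at the identity; a weighted variant would additionally assign a cost to each kind of operation.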
Yeast Ancestral Genome Reconstructions: The Possibilities of Computational Methods II
, 2010
Cited by 10 (6 self)
Since the availability of assembled eukaryotic genomes, the first one being a budding yeast, many computational methods for the reconstruction of ancestral karyotypes and gene orders have been developed. The difficulty has always been to assess their reliability, since we often lack good knowledge of the true ancestral genomes to compare their results to, as well as good knowledge of the evolutionary mechanisms needed to test them on realistic simulated data. In this study, we propose some measures of reliability for several kinds of methods, and apply them to infer and analyse the architectures of two ancestral yeast genomes, based on the sequences of seven assembled extant ones. The pre-duplication common ancestor of S. cerevisiae and C. glabrata has been inferred manually by Gordon et al. (PLoS Genet. 2009). We show why, in this case, a good convergence of the methods is explained by some properties of the data, and why the results are reliable. In another study, Jean et al. (J. Comput. Biol. 2009) proposed an ancestral architecture of the last common ancestor of S. kluyveri, K. thermotolerans, K. lactis, A. gossypii, and Z. rouxii inferred by a computational method. In this case, we show that the dataset does not seem to contain enough information to infer a reliable architecture, and we construct a higher-resolution dataset which supports a new ancestral configuration with good reliability.
Stoye J: A new linear time algorithm to compute the genomic distance via the double cut and join distance. Theor Comput Sci 2009
Cited by 10 (1 self)
The genomic distance problem in the Hannenhalli-Pevzner (HP) theory is the following: Given two genomes whose chromosomes are linear, calculate the minimum number of translocations, fusions, fissions and inversions that transform one genome into the other. This paper presents a new distance formula based on a simple tree structure that captures all the delicate features of this problem in a unifying way, and a linear-time algorithm for computing this distance.
Bayesian Sampling of Genomic Rearrangement Scenarios via Double Cut and Join
Cited by 10 (4 self)
Motivation: When comparing the organization of two genomes, it is important not to draw conclusions on their modes of evolution from a single most parsimonious scenario explaining their differences. Better estimations can be obtained by sampling many different genomic rearrangement scenarios. For this problem, the Double Cut and Join (DCJ) model, while less relevant, is computationally easier than the Hannenhalli-Pevzner (HP) model. Indeed, in some special cases, the total number of DCJ sorting scenarios can be analytically calculated, and uniformly distributed random DCJ scenarios can be drawn in polynomial running time, while the complexity of counting the number of HP scenarios and sampling from the uniform distribution of their space is unknown, and conjectured to be #P-complete. Statistical methods such as MCMC are therefore required to sample from the uniform distribution of the most parsimonious HP scenarios or from the Bayesian distribution of all possible HP scenarios. Results: We use the computational facilities of the DCJ model to draw a sampling of HP scenarios. It is based on a parallel MCMC method that cools down DCJ scenarios to HP scenarios. We introduce two theorems underlying the theoretical mixing properties of this parallel MCMC method. The method was tested on yeast and mammalian genomic data, and allowed us to provide estimates of the different modes of evolution in diverse lineages.
Genome Halving with Double Cut and Join
, 2007
Cited by 9 (3 self)
The genome halving problem, previously solved by El-Mabrouk for inversions and reciprocal translocations, is here solved in a more general context allowing transpositions and block interchange as well, for genomes including multiple linear and circular chromosomes. We apply this to several data sets and compare the results to the previous algorithm.
D: Genome Aliquoting with Double Cut and Join
- BMC Bioinformatics 2009, 10(Suppl 1):S2
Multichromosomal genome median and halving problems
- In Proc. 8th Workshop on Algorithms in Bioinformatics
, 2008
Cited by 8 (1 self)
Genome median and halving problems aim at reconstructing ancestral genomes and the evolutionary events leading from the ancestor to extant species. The NP-hardness of the breakpoint distance and reversal distance median problems for unichromosomal genomes does not easily imply the same for the multichromosomal case. Here we find the complexity of several genome median and halving problems, including a surprising polynomial result for the breakpoint median and guided halving problems in genomes with circular and linear chromosomes; the multichromosomal problem is thus easier than the unichromosomal one.
D: Genome aliquoting revisited
- Journal of Computational Biology
Cited by 7 (0 self)
We prove that the genome aliquoting problem, the problem of finding a recent polyploid ancestor of a genome, with breakpoint distance can be solved in polynomial time. We propose an aliquoting algorithm that is a 2-approximation for the genome aliquoting problem with double cut and join distance, improving upon the previous best solution to this problem, Feijão and Meidanis' 4-approximation algorithm. Comparing two genomes with duplicated genes is difficult. None of the distances used to compare genomes today (breakpoint distance, reversal distance, double cut and join distance, etc.) handles duplicated genes. However, in the special case where all genes are duplicated the same number of times, there has been some success. Informally, the genome aliquoting problem is the problem of finding a genome with one copy of every gene, given a genome with exactly p copies of every gene, such that the distance between the given and resulting genomes is minimized according to some distance metric. Thus, the genome aliquoting problem eliminates the duplicate genes, allowing a genome to be compared with other genomes using an existing algorithm. Solving this problem will allow genomes which have undergone a recent polyploidization event, common in plants, to be compared. There have been a number of solutions to the genome aliquoting problem where the genome has exactly two copies of every gene. This restricted version of the problem is called the genome halving problem and was first introduced in [3]. It was solved for reversal and translocation distance in [3,1] and for double cut and join distance in [10,6]. The genome aliquoting problem was introduced in [9] along with a sketch of a heuristic algorithm for the problem under double cut and join distance. [4] provided an exact solution under single cut or join distance which is also a 4-approximation algorithm under double cut and join distance.
In this paper, we provide an exact polynomial-time algorithm under breakpoint distance which is also a 2-approximation algorithm under double cut and join distance. Since our algorithm is similar to that presented in [9], it also bounds that heuristic as a 2-approximation for double cut and join distance.
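To make the breakpoint distance mentioned above concrete, the following sketch counts adjacencies of one genome that are absent from another. It rests on simplifying assumptions that are not taken from the paper: every chromosome is circular, every gene occurs exactly once in each genome, and the names `adjacencies` and `breakpoint_distance` are hypothetical. It is not the aliquoting algorithm itself and does not handle duplicated genes.

```python
def _ends(gene):
    """Return (left end, right end) of a signed gene; 't' = tail, 'h' = head."""
    name = abs(gene)
    return ((name, "t"), (name, "h")) if gene > 0 else ((name, "h"), (name, "t"))


def adjacencies(genome):
    """Set of adjacencies of a genome given as a list of circular chromosomes."""
    result = set()
    for chromosome in genome:
        n = len(chromosome)
        for idx in range(n):
            _, right_end = _ends(chromosome[idx])
            left_end, _ = _ends(chromosome[(idx + 1) % n])  # wrap around the circle
            result.add(frozenset((right_end, left_end)))
    return result


def breakpoint_distance(genome_a, genome_b):
    """Number of adjacencies of genome_a that do not occur in genome_b."""
    return len(adjacencies(genome_a) - adjacencies(genome_b))


if __name__ == "__main__":
    a = [[1, 2, 3, 4]]
    b = [[1, -3, -2, 4]]  # a with the segment 2 3 inverted
    print(breakpoint_distance(a, b))
```

Because adjacencies are stored as unordered pairs of gene ends, reading a circular chromosome in the opposite direction yields the same set, so the distance does not depend on how the chromosome is written down.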
Multi-Break Rearrangements and Breakpoint Re-Uses: From Circular to Linear Genomes
, 2008
Cited by 6 (3 self)
Multi-break rearrangements break a genome into multiple fragments and further glue them together in a new order. While 2-break rearrangements represent standard reversals, fusions, fissions, and translocations, 3-break rearrangements represent a natural generalization of transpositions. Alekseyev and Pevzner (2007a, 2008a) studied multi-break rearrangements in circular genomes and further applied them to the analysis of chromosomal evolution in mammalian genomes. In this paper, we extend these results to the more difficult case of linear genomes. In particular, we give lower bounds for the rearrangement distance between linear genomes and for the breakpoint re-use rate as functions of the number and proportion of transpositions. We further use these results to analyze comparative genomic architecture of mammalian genomes.
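A multi-break can be pictured as replacing k adjacencies of a genome with k new adjacencies on the same 2k gene ends. The sketch below models only that definition for a circular genome stored as a set of adjacencies; the function name `k_break` and the string labels for gene ends (such as "1h" for the head of gene 1) are assumptions made for illustration, and none of the paper's distance bounds are implemented.

```python
def k_break(adjacencies, removed, added):
    """Replace the adjacencies in `removed` by those in `added`.

    Both are iterables of 2-element pairs of gene ends and must use exactly
    the same gene ends, so the genome keeps the same gene content.
    """
    removed = {frozenset(pair) for pair in removed}
    added = {frozenset(pair) for pair in added}
    if not removed <= adjacencies:
        raise ValueError("some adjacency to remove is not in the genome")
    if len(removed) != len(added):
        raise ValueError("a k-break must add as many adjacencies as it removes")
    if set().union(*removed) != set().union(*added):
        raise ValueError("new adjacencies must reuse exactly the freed gene ends")
    return (adjacencies - removed) | added


if __name__ == "__main__":
    # Circular genome 1 2 3 4 5 as adjacencies between heads (h) and tails (t).
    genome = {frozenset(p) for p in
              [("1h", "2t"), ("2h", "3t"), ("3h", "4t"), ("4h", "5t"), ("5h", "1t")]}
    # 2-break acting as a reversal of the segment 2 3.
    print(k_break(genome, [("1h", "2t"), ("3h", "4t")],
                  [("1h", "3h"), ("2t", "4t")]))
    # 3-break acting as a transposition that moves gene 2 behind gene 4.
    print(k_break(genome, [("1h", "2t"), ("2h", "3t"), ("4h", "5t")],
                  [("1h", "3t"), ("4h", "2t"), ("2h", "5t")]))
```

The validity checks are deliberately minimal: they guarantee that gene ends are conserved, but they do not restrict which of the possible regluings is chosen.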
Reactive Stochastic Local Search Algorithms for the Genomic Median Problem
, 2008
Cited by 6 (3 self)
The genomic median problem is an optimization problem inspired by a biological issue: it aims to find the chromosome organization of the common ancestor to multiple living species. It is formulated as the search for a genome that minimizes a rearrangement distance measure among given genomes. Several attempts have been reported for solving this NP-hard problem. These range from simple heuristic methods to a stochastic local search algorithm inspired by WalkSAT, a well-known local search algorithm for the satisfiability problem in propositional logic. The main objective of this research is to develop improved algorithmic techniques for tackling the genomic median problem and to provide new state-of-the-art solutions. In particular, we have developed an algorithm that is based on tabu search and iterated local search and that shows high performance. To alleviate the dependence of the algorithm performance on a single fixed parameter setting, we have included a reactive scheme that automatically adapts the tabu list length of the tabu search part and the perturbation strength of the iterated local search part. Computational results show that we have developed a new, very high-performing stochastic local search algorithm for the genomic median problem, and we have also found a new best solution for a real-world case.
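The basic idea of tabu search for a median problem can be sketched as follows. This is a deliberately simplified stand-in and not the algorithm of the paper: it uses a fixed tabu tenure (no reactive adaptation), no iterated local search, reversals as the only moves, and a breakpoint count on linear signed permutations as the distance. All function names and parameter defaults are assumptions made for illustration.

```python
from collections import deque


def _framed(perm):
    """Add artificial end markers 0 and n+1 around a linear signed permutation."""
    return [0] + list(perm) + [len(perm) + 1]


def _adjacency_set(perm):
    """Adjacent pairs of perm, stored in both reading directions."""
    ext = _framed(perm)
    pairs = set()
    for a, b in zip(ext, ext[1:]):
        pairs.add((a, b))
        pairs.add((-b, -a))
    return pairs


def breakpoints(p, q):
    """Number of adjacent pairs of p that are not adjacencies of q."""
    in_q = _adjacency_set(q)
    ext = _framed(p)
    return sum((a, b) not in in_q for a, b in zip(ext, ext[1:]))


def median_score(candidate, genomes):
    """Sum of breakpoint counts from the candidate to every input genome."""
    return sum(breakpoints(candidate, g) for g in genomes)


def reverse_segment(perm, i, j):
    """Reverse perm[i:j] and flip the signs inside it."""
    return perm[:i] + [-x for x in reversed(perm[i:j])] + perm[j:]


def tabu_median(genomes, iterations=50, tenure=5):
    """Fixed-tenure tabu search over reversals, starting from the first genome."""
    current = list(genomes[0])
    best, best_score = current, median_score(current, genomes)
    tabu = deque(maxlen=tenure)  # most recent moves; oldest drops out automatically
    n = len(current)
    for _ in range(iterations):
        neighbours = []
        for i in range(n):
            for j in range(i + 1, n + 1):
                if (i, j) in tabu:
                    continue  # a reversal undoes itself, so forbid repeating it
                cand = reverse_segment(current, i, j)
                neighbours.append((median_score(cand, genomes), (i, j), cand))
        if not neighbours:
            break
        # Take the best non-tabu neighbour even if it is worse than current.
        score, move, current = min(neighbours, key=lambda t: (t[0], t[1]))
        tabu.append(move)
        if score < best_score:
            best, best_score = current, score
    return best, best_score


if __name__ == "__main__":
    genomes = [[1, 2, 3, 4, 5], [1, -3, -2, 4, 5], [1, 2, 3, -5, -4]]
    print(tabu_median(genomes))
```

A reactive variant in the spirit of the abstract would adjust `tenure` during the run instead of fixing it, for example lengthening it when the search revisits earlier solutions; that adaptation is left out here.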