Results 1 - 7 of 7
The complexity of multiple sequence alignment with SP-score that is a metric
 TCS
, 2001
"... This paper analyzes the computational complexity of computing the optimal alignment of a set of sequences under the SP (sum of all pairs) score scheme. We solve an open question by showing that the problem is NP-complete in the very restricted case in which the sequences are over a binary alphabet ..."
Abstract

Cited by 34 (0 self)
This paper analyzes the computational complexity of computing the optimal alignment of a set of sequences under the SP (sum of all pairs) score scheme. We solve an open question by showing that the problem is NP-complete in the very restricted case in which the sequences are over a binary alphabet and the score is a metric. This result establishes the intractability of multiple sequence alignment under a score function of mathematical interest, which has indeed received much attention in biological sequence comparison.
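As a rough illustration of the SP (sum of all pairs) score mentioned in this abstract, the sketch below scores a fixed alignment over a binary alphabet. The unit-cost function `pair_cost` (0 for equal symbols, 1 otherwise, with a gap treated as an ordinary symbol) is an assumption chosen because it is a simple metric; it is not necessarily the score used in the paper. The sketch only evaluates a given alignment; finding the optimal one is the problem the paper shows to be NP-complete.

```python
from itertools import combinations

GAP = "-"  # assumed gap symbol for this sketch


def pair_cost(a, b):
    """Assumed unit-cost metric on {0, 1, gap}: 0 if equal, 1 otherwise."""
    return 0 if a == b else 1


def sp_score(alignment, cost=pair_cost):
    """Sum the column-wise cost over every unordered pair of aligned rows."""
    if len({len(row) for row in alignment}) > 1:
        raise ValueError("all rows of an alignment must have the same length")
    total = 0
    for row1, row2 in combinations(alignment, 2):
        total += sum(cost(a, b) for a, b in zip(row1, row2))
    return total


if __name__ == "__main__":
    rows = ["0110", "01" + GAP + "0", "0" + GAP + "10"]
    print(sp_score(rows))  # 4
```

The score of an alignment with k rows touches k(k-1)/2 row pairs, so evaluating a candidate is cheap; the difficulty lies in searching over all possible gap placements.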
Near optimal bounds for Steiner tree in the hypercube
 SIAM Journal on Discrete Mathematics
, 2011
"... Given a set S of vertices in a connected graph G, the classic Steiner tree problem asks for the minimum number of edges of a connected subgraph of G that contains S. We study this problem in the hypercube. Given a set S of vertices in the n-dimensional hypercube Q_n, the Steiner cost of S ..."
Abstract

Cited by 2 (0 self)
Given a set S of vertices in a connected graph G, the classic Steiner tree problem asks for the minimum number of edges of a connected subgraph of G that contains S. We study this problem in the hypercube. Given a set S of vertices in the n-dimensional hypercube Q_n, the Steiner cost of S, denoted by cost(S), is the minimum number of edges among all connected subgraphs of Q_n that contain S. We obtain the following results on cost(S). Let ε be any given small, positive constant, and set k = |S|. (1) [upper bound] For every set S we have cost(S) < (…; there is a constant c_1 depending only on ε such that if k > c_1, then cost(S) < (… (2) We develop a randomized algorithm of running time O(kn) that produces a connected subgraph H of Q_n containing S such that with probability approaching 1 as k, n → ∞ we have |E(H)| < (… We also show that for fixed k, as n → ∞, almost always a random family of k vertices in Q_n satisfies … 2 ) k ) n + √ n ln n.
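To make the quantity cost(S) more concrete, here is a minimal sketch that computes a simple upper bound on it. It is not the randomized algorithm from the paper. It assumes hypercube vertices are encoded as integer bitmasks, builds a minimum spanning tree over the terminals using Hamming distance as edge weight, and returns the tree weight. Replacing each tree edge by a shortest path in Q_n gives a connected subgraph containing S with at most that many edges, so the value bounds cost(S) from above. The function names are hypothetical.

```python
def hamming(u, v):
    """Number of coordinates in which two hypercube vertices differ."""
    return bin(u ^ v).count("1")


def mst_upper_bound(terminals):
    """Weight of a minimum spanning tree on the terminals (Prim's algorithm).

    Each tree edge of weight d corresponds to a path of d hypercube edges,
    so the result is an upper bound on the Steiner cost, not the cost itself.
    """
    terminals = list(set(terminals))
    if len(terminals) <= 1:
        return 0
    start = terminals[0]
    # best[t] = cheapest known connection from t to the growing tree
    best = {t: hamming(start, t) for t in terminals[1:]}
    total = 0
    while best:
        nxt = min(best, key=best.get)
        total += best.pop(nxt)
        for t in best:
            d = hamming(nxt, t)
            if d < best[t]:
                best[t] = d
    return total


if __name__ == "__main__":
    # The four even-weight vertices of Q_3, pairwise at Hamming distance 2.
    print(mst_upper_bound([0b000, 0b011, 0b101, 0b110]))  # 6
```

The spanning-tree bound ignores the savings available from extra (Steiner) vertices, which is exactly the gap the paper's bounds address.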
New computational approaches for . . .
, 2009
"... In this thesis we explore the theory and history behind RNA alignment. Normal sequence alignments as studied by computer scientists can be completed in O(n^2) time in the naive case. The process involves taking two input sequences and finding the list of edits that can transform one sequence into ..."
Abstract
In this thesis we explore the theory and history behind RNA alignment. Normal sequence alignments as studied by computer scientists can be completed in O(n^2) time in the naive case. The process involves taking two input sequences and finding the list of edits that can transform one sequence into the other. This process is applied to biology in many forms, such as the creation of multiple alignments and the search of genomic sequences. When the RNA sequence structure is taken into account, the problem becomes even harder. Multiple RNA structure alignment is particularly challenging because covarying mutations make sequence information alone insufficient. Existing tools for multiple RNA alignments first generate pairwise RNA structure alignments and then build the multiple alignment using only the sequence information. Here we present PMFastR, an algorithm which iteratively uses a sequence-structure alignment procedure to build a multiple RNA structure alignment. PMFastR also has low memory consumption, allowing for the alignment of large sequences such as 16S and 23S rRNA. Specifically, we reduce the memory consumption to ∼O(band^2 ∗ m), where band is the banding size. Other solutions are ∼O(n^2 ∗ m) where
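The O(n^2) pairwise case mentioned in this abstract can be sketched with the classic edit-distance dynamic program. This is a generic textbook illustration, not PMFastR and not a sequence-structure alignment; unit costs for insertion, deletion, and substitution are an assumption. The sketch returns only the number of edits, and keeps two rows of the table at a time so memory grows with the shorter dimension rather than with the full table.

```python
def edit_distance(a, b):
    """Minimum number of unit-cost insertions, deletions, and substitutions
    needed to turn sequence a into sequence b."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]  # turning a[:i] into the empty prefix costs i deletions
        for j, cb in enumerate(b, 1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # match or substitute
            ))
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    print(edit_distance("ACGU", "AGU"))  # 1
```

Recovering the actual list of edits would require keeping the full table (or back-pointers) and tracing back from the last cell.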
The complexity of multiple sequence alignment with SP-score that is a metric
"... This paper analyzes the computational complexity of computing the optimal alignment of a set of sequences under the SP (sum of all pairs) score scheme. We solve an open question by showing that the problem is NP-complete in the very restricted case in which the sequences are over a binary alphabet a ..."
Abstract
This paper analyzes the computational complexity of computing the optimal alignment of a set of sequences under the SP (sum of all pairs) score scheme. We solve an open question by showing that the problem is NP-complete in the very restricted case in which the sequences are over a binary alphabet and the score is a metric. This result establishes the intractability of multiple sequence alignment under a score function of mathematical interest, which has indeed received much attention in biological sequence comparison. Key words: multiple sequence alignment, SP-score, intractability.
On Improving Heuristics for Maximum Likelihood and Multiple . . .
, 2006
"... In the construction of phylogenetic trees, exact methods for maximum likelihood and multiple sequence alignment are very computationally intensive algorithms, even on a fixed tree. Since anything above cubic time is too computationally complex to be practical in phylogenetics, heuristic methods ar ..."
Abstract
In the construction of phylogenetic trees, exact methods for maximum likelihood and multiple sequence alignment are very computationally intensive algorithms, even on a fixed tree. Since anything above cubic time is too computationally complex to be practical in phylogenetics, heuristic methods are used to approximate a solution. We attempt to optimize maximum likelihood heuristic methods to improve performance and accuracy by eliminating “fast evolving sites.” In addition, we compare various multiple sequence alignment heuristic methods to meet an overall goal of improving generalized tree alignment heuristic methods.
Review of Common Sequence Alignment Methods: Clues to Enhance Reliability
 Bentham Science Publishers
, 2003
"... Abstract: Today, in various aspects of molecular biology, sequence alignment has become an essential tool to study the structure-function relationships of proteins. With the impressive increase of the number of available sequences, alignments provide a substantial piece of information by way of vari ..."
Abstract
Abstract: Today, in various aspects of molecular biology, sequence alignment has become an essential tool to study the structure-function relationships of proteins. With the impressive increase in the number of available sequences, alignments provide a substantial piece of information by way of various computational methods. These approaches have generally become a crucial tool to put forward working hypotheses for time-consuming bench work, such as protein engineering and site-directed mutagenesis. However, alignment methods remain far from perfect. All methods are severely limited in the twilight zone, around 25% identity between pairs of sequences. More worrying is the very high rate of false positive results generated by most algorithms, which depend on empirical parameters and are hard to validate by statistical criteria. After reviewing the main methods, this paper draws the user's attention to the fact that algorithm performance evaluations are entirely limited to the evaluation of alignment power (sensitivity). In reference to a given truth defined from the alignment of known structures, the power is defined as the proportion of truth restored in the solution. The power may be overestimated through a lack of independent sets of poorly related sequences, and its value depends entirely on the criterion used to define the truth. On the other hand, confidence (selectivity) represents the proportion of the solution that is true. Depending on the method and the parameters used, confidence may be much lower than power, and it is almost never evaluated. For nontrivial alignments, when the power is high, confidence is low, which means that correctly aligned positions are embedded in large regions that are unduly aligned. One possible solution to these problems is to use a consensus of several multiple alignment methods, which will increase the confidence of the results. The addition of external information, such as the prediction of the secondary structure and/or the prediction of solvent accessibility, is another way that should increase the performance of existing multiple alignment methods.
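The two measures defined in this abstract can be written down directly. The sketch below assumes an alignment is represented as a set of aligned position pairs `(i, j)`, which is a choice made here for illustration and not a representation taken from the review. Power is the fraction of reference pairs recovered by the predicted alignment, and confidence is the fraction of predicted pairs that appear in the reference.

```python
def power_and_confidence(reference_pairs, predicted_pairs):
    """Return (power, confidence) for a predicted alignment.

    power      = |reference ∩ predicted| / |reference|  (sensitivity)
    confidence = |reference ∩ predicted| / |predicted|  (selectivity)
    Both are reported as 0.0 when the corresponding denominator is empty.
    """
    reference = set(reference_pairs)
    predicted = set(predicted_pairs)
    correct = reference & predicted
    power = len(correct) / len(reference) if reference else 0.0
    confidence = len(correct) / len(predicted) if predicted else 0.0
    return power, confidence


if __name__ == "__main__":
    reference = {(1, 1), (2, 2), (3, 3), (4, 4)}
    # Three correct pairs embedded in a longer, mostly wrong aligned region.
    predicted = {(1, 1), (2, 2), (3, 3), (4, 5), (5, 6), (6, 7), (7, 8), (8, 9)}
    print(power_and_confidence(reference, predicted))  # (0.75, 0.375)
```

In this toy case most of the truth is recovered while most of the predicted alignment is wrong, which is the high-power, low-confidence situation the abstract describes.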