Computing the Similarity of Two Sequences with Nested Arc Annotations
 Theoretical Computer Science
, 2003
Abstract

Cited by 22 (3 self)
We present exact algorithms for the NP-complete Longest Common Subsequence problem for sequences with nested arc annotations, a problem occurring in structure comparison of RNA. Given two sequences of length at most n and nested arc structure, one of our algorithms determines (if existent) in $O(3.31^{k_1+k_2} n)$ time an arc-preserving subsequence of both sequences, which can be obtained by deleting (together with corresponding arcs) $k_1$ letters from the first and $k_2$ letters from the second sequence. A second algorithm shows that (in case of a four-letter alphabet) we can find a length-$l$ arc-annotated subsequence in $O(12^l n)$ time. This means that the problem is fixed-parameter tractable when parameterized by the number of deletions as well as when parameterized by the subsequence length. Our findings complement known approximation results which give a quadratic-time factor-2 approximation for the general and polynomial-time approximation schemes for restricted versions of the problem. In addition, we obtain further fixed-parameter tractability results for these restricted versions.
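The deletion operation described in this abstract (removing letters together with their corresponding arcs) can be illustrated with a minimal Python sketch. The representation is an assumption made here for illustration, not taken from the paper: a sequence is a string, arcs are a set of 0-based index pairs, and the helper names `delete_with_arcs` and `is_common_arc_preserving` are hypothetical. The sketch only checks a given pair of deletion sets; it does not implement the paper's search algorithm.

```python
def delete_with_arcs(seq, arcs, deleted):
    """Delete the positions in `deleted` from `seq`, dropping every arc
    that has a deleted endpoint and renumbering the surviving arcs.

    seq     -- string of letters
    arcs    -- set of (i, j) pairs of 0-based positions, i < j
    deleted -- set of 0-based positions to remove
    """
    keep = [i for i in range(len(seq)) if i not in deleted]
    new_index = {old: new for new, old in enumerate(keep)}
    sub = "".join(seq[i] for i in keep)
    sub_arcs = {
        (new_index[i], new_index[j])
        for i, j in arcs
        if i in new_index and j in new_index
    }
    return sub, sub_arcs


def is_common_arc_preserving(s1, a1, d1, s2, a2, d2):
    """True if deleting d1 from (s1, a1) and d2 from (s2, a2) leaves the
    same letters and the same arcs (assumed reading of 'arc-preserving')."""
    return delete_with_arcs(s1, a1, d1) == delete_with_arcs(s2, a2, d2)


if __name__ == "__main__":
    # k1 = 1 deletion from the first sequence, k2 = 1 from the second.
    print(is_common_arc_preserving("ACGU", {(0, 3)}, {1},
                                   "AGGU", {(0, 3)}, {2}))
```

The check runs in time linear in the sequence length plus the number of arcs; the hard part of the problem, which the abstract addresses, is choosing the deletion sets.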
Towards Optimally Solving the Longest Common Subsequence Problem for Sequences with Nested Arc Annotations in Linear Time
 In Proc. of the 13th Symposium on Combinatorial Pattern Matching (CPM02), volume 2373 of LNCS
, 2002
Abstract

Cited by 12 (1 self)
We present exact algorithms for the NP-complete Longest Common Subsequence problem for sequences with nested arc annotations, a problem occurring in structure comparison of RNA. Given two sequences of length at most n and nested arc structure, our algorithm determines (if existent) in time $O(3.31^{k_1+k_2} n)$ an arc-preserving subsequence of both sequences, which can be obtained by deleting (together with corresponding arcs) $k_1$ letters from the first and $k_2$ letters from the second sequence. Thus, the problem is fixed-parameter tractable when parameterized by the number of deletions. This complements known approximation results which give a quadratic-time factor-2 approximation for the general and polynomial-time approximation schemes for restricted versions of the problem. In addition, we obtain further fixed-parameter tractability results for these restricted versions.
Exact Algorithms for the Longest Common Subsequence Problem for Arc-Annotated Sequences
 Diploma thesis, Universität Tübingen, Fed. Rep. of
, 2002
Abstract

Cited by 7 (2 self)
Contents
1 Introduction
2 Biological Motivation
  2.1 Some Molecular Biology
  2.2 Biological Motivation
3 Some Basic Definitions
  3.1 LCS and Some Problems from Graph Theory
  3.2 Parameterized Complexity
  3.3 Arc Annotation
4 Previous Results
  4.1 Classical Complexity
  4.2 Parameterized Complexity
  4.3 Complexity of Arc-Preserving Subsequence Problem
  4.4 Overview of This Work
5 c-fragment, c-diagonal LAPCS
  5.1 c-fragment ...
Inferring an Original Sequence from Erroneous Copies: a Bayesian Approach
 Asia-Pacific BioTech News
, 2003
Abstract

Cited by 4 (3 self)
This paper considers the problem of inferring an original sequence from a number of erroneous copies. The problem arises in DNA sequencing, particularly in the context of emerging technologies that provide high throughput or other advantages, but at the cost of introducing many errors. We develop a Bayesian probabilistic model of the introduction of errors, and search for a sequence that has maximum posterior probability with respect to the model. We present results of extensive tests in which error-prone sequencing of real DNA was simulated. The results obtained using the new approach are compared to results obtained by deriving a consensus sequence from a multiple sequence alignment. We find that a significant improvement in accuracy is obtained using the new approach. The implication is that high error levels need not be a barrier to the adoption of sequencing technologies that are in other respects promising, because most errors can be detected and corrected using a small number of reads.
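The baseline mentioned in this abstract, deriving a consensus sequence from a multiple sequence alignment, is commonly done by a column-wise majority vote. The following Python sketch shows that baseline only, not the paper's Bayesian method. The function name `majority_consensus`, the use of `-` as the gap symbol, and the rule of dropping columns whose majority symbol is a gap are all assumptions made for illustration.

```python
from collections import Counter


def majority_consensus(alignment, gap="-"):
    """Column-wise majority vote over an alignment.

    alignment -- list of equal-length strings (one row per read)
    gap       -- gap symbol; a column whose most common symbol is the
                 gap contributes nothing to the consensus (assumed rule)
    """
    if not alignment:
        return ""
    width = len(alignment[0])
    if any(len(row) != width for row in alignment):
        raise ValueError("all rows must have equal length")
    consensus = []
    for col in range(width):
        counts = Counter(row[col] for row in alignment)
        symbol, _ = counts.most_common(1)[0]
        if symbol != gap:
            consensus.append(symbol)
    return "".join(consensus)


if __name__ == "__main__":
    reads = ["ACGT", "ACGA", "AC-T"]
    print(majority_consensus(reads))
```

Ties between symbols are broken arbitrarily in this sketch; a real pipeline would need an explicit tie-breaking rule.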
Parameterized Complexity and Biopolymer Sequence Comparison
, 2007
Abstract

Cited by 3 (0 self)
The paper surveys parameterized algorithms and complexities for computational tasks on biopolymer sequences, including the problems of longest common subsequence, shortest common supersequence, pairwise sequence alignment, multiple sequencing alignment, structure–sequence alignment and structure–structure alignment. Algorithm techniques, built on the structural-unit level as well as on the residue level, are discussed.
Arc-preserving subsequences of arc-annotated sequences
 Acta Universitatis Sapientiae. Informatica
Abstract

Cited by 1 (1 self)
Arc-annotated sequences are useful in representing the structural information of RNA and protein sequences. The longest arc-preserving common subsequence problem has been introduced as a framework for studying the similarity of arc-annotated sequences. In this paper, we consider arc-annotated sequences with various arc structures. We consider the longest arc-preserving common subsequence problem. In particular, we show that the decision version of the 1-fragment LAPCS(crossing, chain) and the decision version of the 0-diagonal LAPCS(crossing, chain) are NP-complete for some fixed alphabet Σ such that |Σ| = 2. Also we show that if |Σ| = 1, then the decision version of the 1-fragment LAPCS(unlimited, plain) and the decision version of the 0-diagonal LAPCS(unlimited, plain) are NP-complete.
unknown title
, 2007
Abstract
constraint reduces the size of the space that a dynamic programming implementation must search for an optimal pairwise alignment. Some algorithms, such as T-Coffee (Notredame et al., 2000) and DbClustal (Thompson et al., 2000), do use libraries of pairwise alignments, but they do not attempt to explicitly choose alignments present in multiple pairs. 3. An easy way for users to specify regions they want to see aligned in any multiple alignment computed. User input can be particularly useful when user knowledge is not reflected in sequence similarity. Such a capability was added to a semi-automatic version of DIALIGN (Morgenstern et al., 1998) by Morgenstern et al. (2006). Other methods (for example SALIGN in MODELLER (Marti-Renom et al., 2004)) include the option for a user to specify constraints when aligning sequences and structures.
Algorithms on Constrained Sequence Alignment
, 2004
Abstract
One of the fundamental issues that arises in computational biology is Multiple Sequence Alignment (MSA), which needs to be addressed in many applications of Bioinformatics (e.g. study of the SARS Coronavirus and the Human Genome Project). Many algorithms have been proposed to solve the MSA problem, but often cannot incorporate users' (biologists') knowledge of the functionalities or structures of these sequences into their solutions. This kind of information is very useful for an accurate and biologically meaningful alignment. The Constrained Multiple Sequence Alignment (CMSA) was proposed by Tang et al. (2002) to rectify the shortcomings of MSA by introducing a constrained sequence to represent more important residues in the sequences. Every character of the constrained sequence has to appear in an entire column in the alignment of the multiple sequences, and in the same order as in the constrained sequence.
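The CMSA condition stated at the end of this abstract (every character of the constrained sequence appears in an entire column, in the same order) can be checked for a given alignment with a short Python sketch. The function name `satisfies_constraint` and the representation of an alignment as equal-length strings with `-` for gaps are assumptions made here; the sketch verifies a finished alignment and does not compute one.

```python
def satisfies_constraint(alignment, constraint):
    """Check the CMSA condition on a finished alignment.

    alignment  -- list of equal-length strings, '-' marks a gap (assumed)
    constraint -- the constrained sequence

    Returns True if there are columns c_1 < c_2 < ... such that column
    c_i consists entirely of constraint[i]. A left-to-right greedy scan
    suffices, as in ordinary subsequence matching.
    """
    if not alignment:
        return not constraint
    width = len(alignment[0])
    if any(len(row) != width for row in alignment):
        raise ValueError("all rows must have equal length")
    matched = 0  # number of constraint characters matched so far
    for col in range(width):
        if matched < len(constraint) and all(
            row[col] == constraint[matched] for row in alignment
        ):
            matched += 1
    return matched == len(constraint)


if __name__ == "__main__":
    rows = ["A-CGT", "AGC-T"]
    print(satisfies_constraint(rows, "ACT"))  # columns 0, 2 and 4 qualify
    print(satisfies_constraint(rows, "AG"))   # no column is entirely 'G'
```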
COBALT: constraint-based alignment tool for multiple
 Bioinformatics, Original Paper (Sequence analysis), doi:10.1093/bioinformatics/btm076
Abstract
Motivation: A tool that simultaneously aligns multiple protein sequences, automatically utilizes information about protein domains, and has a good compromise between speed and accuracy will have practical advantages over current tools. Results: We describe COBALT, a constraint-based alignment tool that implements a general framework for multiple alignment of protein sequences. COBALT finds a collection of pairwise constraints derived from database searches, sequence similarity and user input, combines these pairwise constraints, and then incorporates them into a progressive multiple alignment. We show that using constraints derived from the conserved domain database (CDD) and PROSITE protein-motif database improves COBALT’s alignment quality. We also show that COBALT has reasonable runtime performance and alignment accuracy comparable to or exceeding that of other tools for a broad range of problems. Availability: COBALT is included in the NCBI C++ toolkit. A Linux executable for COBALT, and CDD and PROSITE data used is
Optimization Multiple Sequence Alignment Scheme in DCBTA
Abstract
doi:10.4156/jdcta.vol4.issue8.6 Multiple sequence alignment is a fundamental problem in computational molecular biology. This paper presents a new refinement strategy combining divide-and-conquer and Beam-Through alignment (DCBTA). The optimization objective function (OF) is computed additively with a new stage beam area, which corresponds to the beam area rate in [6]. The refinement uses the previous alignment result to extract heuristic start beam sources (BS). Using a logical structure of P.S.M of beam status from the latest round, combined with a beam weight strategy, sequences are classified into similarity classes to obtain better BS and improve refinement in later rounds. Some important conclusions are given on divide-and-conquer for MSA, BS collecting, and similarity degree. These conclusions promise sound and efficient optimal MSA refinements.