Exact algorithm for planted motif challenge problems
APBC, 2005
The problem of identifying meaningful patterns (i.e., motifs) from biological data has been studied extensively due to its paramount importance. Three versions of this problem have been identified in the literature. One of these three problems is the planted (l, d)-motif problem. Several instances of this problem have been posed as a challenge. Numerous algorithms have been proposed in the literature that address this challenge. Many of these algorithms fall under the category of approximation algorithms. In this paper we present algorithms for the planted (l, d)-motif problem that always find the correct answer(s). Our algorithms are very simple and are based on ideas that are fundamentally different from the ones employed in the literature. We believe that the techniques we introduce in this paper will find independent applications.
SMSForbid: An efficient algorithm for Simple Motif Problem
Finding common motifs from a set of strings coding biological sequences is an important problem in Molecular Biology. Several versions of the motif finding problem have been proposed in the literature and for each version, numerous algorithms have been developed. However, many of these algorithms fall under the category of heuristics. In this paper, we concentrate on the Simple Motif Problem (SMP) and we propose an exact algorithm, called SMSForbid, for this version of the motif finding problem. SMSForbid uses less time and space than the known exact algorithms.
An efficient motif search algorithm based on a minimal forbidden patterns approach
In Proceedings of the 5th International Conference on Practical Applications of Computational Biology and Bioinformatics, Advances in Intelligent and Soft Computing Series
One of the problems arising in the analysis of biological sequences is the discovery of sequence similarity by finding common motifs. Several versions of the motif finding problem have been proposed for dealing with this problem and for each version, numerous algorithms have been developed. In this paper, we propose an exact algorithm, called SMSHFORBID, to solve the Simple Motif Problem (SMP). SMSHFORBID is based on clever techniques that reduce the number of patterns to be searched for. These techniques are fundamentally different from the ones employed in the literature and make solving SMP more practical.
Extraction of Infrequent Simple Motifs from a Finite Set of Sequences using a Lattice Structure
In this paper we present a method for finding infrequent simple motifs in a finite set of sequences. The method uses a lattice structure and minimal forbidden patterns. It is based on a method for solving the Simple Motif Problem.
Improved Algorithms for Finding Edit Distance Based Motifs
Motif search is an important step in extracting meaningful patterns from biological data. Since the general problem of motif search is intractable, there is a pressing need to develop efficient exact and approximation algorithms to solve this problem. We design novel algorithms for solving the Edit-distance-based Motif Search (EMS) problem: given two integers l, d and n biological strings, find all strings of length l that appear in each input string with at most d substitutions, insertions and deletions. These algorithms have been evaluated on several challenging instances. Our algorithm solves a moderately hard instance (11,3) in a couple of minutes and the next difficult instance (14,3) in a couple of hours whereas the best previously known algorithm, EMS1, solves (11,3) in a few hours and does not solve (13,4) even after 3 days. This significant improvement is due to a novel and provably efficient neighborhood generation technique introduced in this paper. This efficient approach can be used in other edit-distance-based applications in Bioinformatics, such as k-spectrum-based sequence error correction algorithms. We also use a trie-based data structure to efficiently store the candidate motifs in the neighborhood and to output the motifs in sorted order.
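To make the EMS problem statement concrete, here is a minimal brute-force sketch in Python. It is not the neighborhood generation technique or the trie-based storage described in the abstract; it simply enumerates every candidate of length l and tests it against each input string. The function names and the default DNA alphabet are assumptions made for illustration, and the approach is only feasible for very small l.

```python
from itertools import product


def occurs_within(motif, text, d):
    """Return True if some substring of text is within edit distance d of motif."""
    # Edit-distance DP in which a match may start at any position of text
    # at no cost, so the first row is all zeros.
    prev = [0] * (len(text) + 1)
    for i, a in enumerate(motif, 1):
        cur = [i]  # motif prefix of length i against the empty substring
        for j, b in enumerate(text, 1):
            cur.append(min(prev[j] + 1,               # drop a motif character
                           cur[j - 1] + 1,            # absorb an extra text character
                           prev[j - 1] + (a != b)))   # match or substitute
        prev = cur
    # A match may also end anywhere, so take the best value in the last row.
    return min(prev) <= d


def ems_brute_force(strings, l, d, alphabet="ACGT"):
    """Enumerate all |alphabet|**l candidates; keep those found in every string."""
    candidates = ("".join(chars) for chars in product(alphabet, repeat=l))
    return sorted(m for m in candidates
                  if all(occurs_within(m, s, d) for s in strings))


if __name__ == "__main__":
    print(ems_brute_force(["AACGT", "TACGA"], 3, 0))
```

The exhaustive enumeration grows as 4^l for DNA, which is why instances such as (11,3) call for the more selective candidate generation the abstract refers to.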
Sequence analysis Advance Access publication December 9, 2010
, 2010
"... SlideSort: all pairs similarity search for short reads ..."
Experimental Evaluation of Fast Solution Algorithms for the Simple Motif Problem
The Simple Motif Problem (SMP) is: given a set of strings Y = {y0, y1, ..., yn−1} built from a finite alphabet Σ, an integer p > 0, and a quorum q ≤ n, find all the simple motifs of length at most p that occur in at least q strings of Y. This paper presents an experimental evaluation of algorithms that address SMP using a minimal forbidden pattern approach. The experiments measure both running time and space consumption.
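The SMP statement can be illustrated with a small exhaustive sketch in Python. The abstract does not define a simple motif, so the code assumes a common definition: a pattern over Σ plus a single-character wildcard "?" that neither begins nor ends with the wildcard. That definition, the function names, and the sample data are assumptions for illustration; this is a naive enumeration, not the minimal forbidden pattern approach evaluated in the paper.

```python
from itertools import product

WILDCARD = "?"  # assumed wildcard symbol; matches any single character


def matches_at(motif, text, pos):
    """Check whether motif matches text starting at position pos."""
    return all(m == WILDCARD or m == text[pos + k] for k, m in enumerate(motif))


def occurs_in(motif, text):
    """Check whether motif matches at some position of text."""
    return any(matches_at(motif, text, pos)
               for pos in range(len(text) - len(motif) + 1))


def simple_motifs(strings, p, q, alphabet):
    """List motifs of length at most p that occur in at least q strings."""
    found = []
    for length in range(1, p + 1):
        for chars in product(alphabet + WILDCARD, repeat=length):
            # Assumed definition: no wildcard at either end of a simple motif.
            if chars[0] == WILDCARD or chars[-1] == WILDCARD:
                continue
            motif = "".join(chars)
            if sum(occurs_in(motif, y) for y in strings) >= q:
                found.append(motif)
    return found


if __name__ == "__main__":
    Y = ["abc", "adc", "xbz"]
    print(simple_motifs(Y, p=3, q=2, alphabet="abcdxz"))
```

The number of candidate patterns grows exponentially with p, which is the cost that pruning techniques such as forbidden patterns aim to reduce.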