Download:
|
by Gerth Stølting Brodal, Rune B. Lyngsø, Christian N. S. Pedersen, Jens Stoye
http://www.daimi.aau.dk/~gerth/Papers/cpm99.ps.gz
Abstract:
A pair in a string is the occurrence of the same substring twice. A pair is maximal if the two occurrences of the substring cannot be extended to the left and right without making them different. The gap of a pair is the number of characters between the two occurrences of the substring. In this paper we present methods for finding all maximal pairs under various constraints on the gap. In a string of length n we can find all maximal pairs with gap in an upper and lower bounded interval in time O(n log n + z), where z is the number of reported pairs. If the upper bound is removed, the time reduces to O(n + z). Since a tandem repeat is a pair where the gap is zero, our methods can be seen as a generalization of finding tandem repeats. The running time of our methods equals the running time of well known methods for finding tandem repeats.
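To make the definitions in the abstract concrete, here is a minimal brute-force sketch that enumerates maximal pairs whose gap lies in a given interval. It is not the paper's method: it runs in roughly cubic time and uses no suffix tree, and it only illustrates what "maximal" and "gap" mean. The function name `maximal_pairs`, the parameters `min_gap` and `max_gap`, and the `(i, j, length)` output format are assumptions made for this illustration. With the default `min_gap=0`, overlapping occurrences (negative gap) are not reported.

```python
def maximal_pairs(s, min_gap=0, max_gap=None):
    """Return all maximal pairs (i, j, length) in s with min_gap <= gap <= max_gap.

    A pair is two occurrences s[i:i+length] == s[j:j+length] with i < j.
    Its gap is j - i - length, the number of characters between them.
    max_gap=None means no upper bound. Naive illustration only.
    """
    n = len(s)
    pairs = []
    for i in range(n):
        for j in range(i + 1, n):
            # Left-maximal: the characters just before the two
            # occurrences must differ (or i is at the string start).
            if i > 0 and s[i - 1] == s[j - 1]:
                continue
            # Right-maximal: extend the match as far as it goes.
            length = 0
            while j + length < n and s[i + length] == s[j + length]:
                length += 1
            if length == 0:
                continue
            gap = j - i - length
            if gap < min_gap:
                continue
            if max_gap is not None and gap > max_gap:
                continue
            pairs.append((i, j, length))
    return pairs


# "abab" contains the tandem repeat "ab" + "ab", a maximal pair with gap 0.
print(maximal_pairs("abab", max_gap=0))   # [(0, 2, 2)]
# "abc" occurs twice in "xabcyabcz", separated by one character.
print(maximal_pairs("xabcyabcz"))         # [(1, 5, 3)]
```

Setting `max_gap=0` restricts the output to tandem repeats, which matches the abstract's remark that tandem repeats are the special case of gap zero.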
Citations
|
572
|
Leda: A Platform for Combinatorial and Geometric Computing
– Mehlhorn, Näher
- 1999
|
|
310
|
Linear pattern matching algorithms
– Weiner
- 1973
|
|
200
|
A Dichromatic Framework for Balanced Trees
– Guibas, Sedgewick
- 1978
|
|
104
|
A new data structure for representing sorted lists
– Huddleston, Mehlhorn
- 1982
|
|
83
|
Transducers and repetitions
– Crochemore
- 1986
|
|
65
|
A space-economical suffix tree construction algorithm
– McCreight
- 1976
|
|
62
|
An algorithm for approximate tandem repeats
– Landau, Schmidt, et al.
|
|
58
|
An optimal algorithm for computing the repetitions in a word
– Crochemore
- 1981
|
|
54
|
Algorithms on strings, trees, and sequences
– Gusfield
- 1997
|
|
40
|
On-line construction of suffix trees
– Ukkonen
- 1995
|
|
37
|
Sorting and Searching, volume 1 of Data Structures and Algorithms
– Mehlhorn
- 1984
|
|
32
|
A simple algorithm for merging two disjoint linearly ordered sets
– Hwang, Lin
- 1972
|
|
30
|
A fast merging algorithm
– Brown, Tarjan
- 1979
|
|
30
|
An efficient algorithm for identifying matches with errors in multiple long molecular sequences
– Leung, Blaisdell, et al.
- 1991
|
|
28
|
An algorithm for the organization of information. Doklady Akademii Nauk SSSR
– Adelson-Velskii, Landis
- 1962
|
|
28
|
Optimal suffix tree construction with large alphabets
– Farach
- 1997
|
|
21
|
Efficient algorithms for molecular sequence analysis
– Karlin, Morris, et al.
- 1988
|
|
21
|
Linear time recognition of squarefree strings
– Main, Lorentz
- 1985
|
|
20
|
Finding maximal pairs with bounded gap
– Brodal, Lyngsø, et al.
- 1999
|
|
18
|
Computation of squares in a string
– Kosaraju
- 1994
|
|
18
|
Identifying satellites in nucleic acid sequences
– Sagot, Myers
- 1998
|
|
16
|
An O(n log n) algorithm for all repetitions in a string
– Main, Lorentz
- 1984
|
|
9
|
Optimal off-line detection of repetitions in a string. Theoretical Computer Science
– Apostolico, Preparata
- 1983
|
|
8
|
Simple and flexible detection of contiguous repeats using a suffix tree
– Stoye, Gusfield
- 1998
|
|
7
|
Maximal Repetitions in words or how to find all squares in linear time
– Kolpakov, Kucherov
- 1998
|
|
5
|
Linear time algorithms for finding and representing all the tandem repeats in a string
– Gusfield, Stoye
- 1998
|