A. Ehrenfeucht and D. Haussler. A New Distance Metric on Strings Computable in Linear Time. Discrete Applied Mathematics, 20:191--203, 1988.
.... a string as an arbitrary path of an unrooted labeled tree [4]; performing efficient dictionary matching [6, 5, 7, 21, 43]; data compression schemes [39, 40, 73, 83, 84, 102, 103]; searching for the longest run of a given motif in molecular sequences [53, 54, 100]; metric distance on strings [34]; complexity measure on random strings for cryptology [81]; inverted indices [22]; analyzing genetic sequences [25, 23]; finding duplication in programming code [13]; generating names for programs in assembly tasks [14]; testing unique decipherability for a set of words [83]; detecting ....
.... of an unrooted labeled tree [4], performing efficient dictionary matching [6, 5, 7, 21, 43]. Other interesting applications are described in the excellent survey of Apostolico [8]. Besides pattern matching, suffix trees have been applied to many other problems, such as metric distance on strings [34], complexity measure on random strings for cryptology [81], inverted indices [22], analyzing genetic sequences [25, 23], finding duplication in programming code [13], generating names for programs in assembly tasks [14], and testing unique decipherability for a set of words [83]. Suffix trees can ....
Ehrenfeucht, A., and Haussler, D., A new distance metric on strings computable in linear time, Disc. Applied Math., 20, 191-203, (1988).
....sequential symbols. 1 While the LCS indicates the sequential commonality between strings, it does not necessarily detect the minimum set of changes. More generally, it has been asserted that string metrics that examine symbols sequentially fail to emphasize the global similarity of two strings [6]. Miller and Myers [9] established the limitations of LCS when they produced a new file compare program that executes at four times the speed of the diff program while producing significantly smaller deltas. The edit distance [14] proved to be a better metric for the difference of files and ....
EHRENFEUCHT, A., AND HAUSSLER, D. A new distance metric on strings computable in linear time. Discrete Applied Mathematics 20 (1988), 191--203.
....O(m) space by evaluating an $(m+1) \times (n+1)$ table $D(i, j) = \min\{edist(P[1:i], s) \mid s \text{ is a suffix of } T[1:j]\}$ using dynamic programming. If $D(m, j) \le k$, then there is an approximate match ending at position $j$. The following definitions are motivated by Ehrenfeucht and Haussler's [4] notion of compatible markings. A partition of $v$ w.r.t. $u$ is a list $[w_1, c_1, \ldots, w_r, c_r, w_{r+1}]$ of subwords $w_1, \ldots, w_r, w_{r+1}$ of $u$ and characters $c_1, \ldots, c_r$ such that $v = w_1 c_1 \cdots w_r c_r w_{r+1}$. Let $\Psi = [w_1, c_1, \ldots, w_r, c_r, w_{r+1}]$ be a partition of ....
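The table D(i, j) in the excerpt above can be evaluated column by column, which is what keeps the space at O(m). The following is a minimal sketch under stated assumptions, not the cited authors' implementation: it assumes unit-cost edit operations (insert, delete, substitute), and the function name `approx_match_ends` is hypothetical.

```python
def approx_match_ends(pattern, text, k):
    """Return the 1-based positions j in text with D(m, j) <= k.

    Assumes unit-cost edit distance. Only one column of the table is
    kept, so the extra space is O(m) with m = len(pattern).
    """
    m = len(pattern)
    # Column j = 0: D(i, 0) = i, because the only suffix of the empty
    # text prefix is the empty string.
    col = list(range(m + 1))
    ends = []
    for j, t in enumerate(text, start=1):
        prev_diag = col[0]   # D(0, j-1)
        col[0] = 0           # D(0, j) = 0: the empty suffix matches for free
        for i in range(1, m + 1):
            prev_row = col[i]  # D(i, j-1), saved before it is overwritten
            cost = 0 if pattern[i - 1] == t else 1
            col[i] = min(prev_diag + cost,  # match or substitution
                         prev_row + 1,      # extra text character
                         col[i - 1] + 1)    # extra pattern character
            prev_diag = prev_row
        if col[m] <= k:
            ends.append(j)
    return ends
```

For instance, `approx_match_ends("abc", "xabcx", 0)` reports only position 4, where the exact occurrence of `abc` ends.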
....using the suffix tree for $u$ (see [9]). The following lemma states an important relation between alignments and partitions. Using this lemma, it is easy to show that $mmdist(u, v) \le edist(u, v)$. 1 Note that $mmdist$ is not a distance in the usual mathematical sense, since it is not symmetric; see [4]. Lemma 1. Let $A$ be an alignment of $u$ and $v$. There is an $r$, $0 \le r \le \delta(A)$, and a partition $[w_1, c_1, \ldots, w_r, c_r, w_{r+1}]$ of $v$ w.r.t. $u$ such that $w_1$ is a prefix and $w_{r+1}$ is a suffix of $u$. 3 Dynamic Filtering applied to LET. The linear expected time algorithm (LET for short) of Chang ....
A. Ehrenfeucht and D. Haussler. A New Distance Metric on Strings Computable in Linear Time. Discrete Applied Mathematics, 20:191--203, 1988.
....determine approximate matches. In Section 5, we describe algorithms which additionally compute the shortest approximate matches ending at each position, thus solving the approximate string matching problems. 2.3 Maximal Matches. The following definitions are motivated by Ehrenfeucht and Haussler's [EH88] notion of compatible markings. A partition of $v$ w.r.t. $u$ is a list $[w_1, c_1, \ldots, w_r, c_r, w_{r+1}]$ of subwords $w_1, \ldots, w_r, w_{r+1}$ of $u$ and characters $c_1, \ldots, c_r$ such that $v = w_1 c_1 \cdots w_r c_r w_{r+1}$. Let $\Psi = [w_1, c_1, \ldots, w_r, c_r, w_{r+1}]$ be a partition ....
....r, $w_{r+1}$ are the submatches in $\Psi$. $c_1, \ldots, c_r$ are the marked characters in $\Psi$. The size of $\Psi$, denoted by $|\Psi|$, is $r$. $mmdist(u, v)$ is the size of any minimal partition of $v$ w.r.t. $u$. Following Ukkonen [Ukk92a], we call $mmdist(u, v)$ the maximal matches distance of $u$ and $v$. Example 2.1 [EH88]. Let $u = abcba$ and $v = cbaabdcb$. $\Psi_1 = [cba, a, b, d, cb]$ is a partition of $v$ w.r.t. $u$, since $cba$, $b$, and $cb$ are subwords of $u$. $\Psi_2 = [cb, a, ab, d, cb]$ is a partition of $v$ w.r.t. $u$, since $cb$ and $ab$ are subwords of $u$. It is clear that $\Psi_1$ and $\Psi_2$ are of minimal size. Hence, mmdist(u; ....
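The partition definition and the example above can be checked with a small sketch. It scans v from left to right, always taking the longest submatch that is still a subword of u and then marking the next character. This is an illustration under assumptions: the function names are hypothetical, the subword test uses Python's plain `in` operator instead of a suffix tree (so it is not linear time), and the sketch relies on the greedy scan yielding a partition of minimal size.

```python
def greedy_partition(u, v):
    """Partition v w.r.t. u as [w_1, c_1, ..., w_r, c_r, w_{r+1}].

    Each w_i is the longest subword of u starting at the current
    position of v (possibly empty); each c_i is the marked character
    that follows it.
    """
    parts = []
    pos = 0
    while True:
        end = pos
        # Extend the submatch while v[pos:end+1] still occurs in u.
        while end < len(v) and v[pos:end + 1] in u:
            end += 1
        parts.append(v[pos:end])  # submatch w_i
        if end == len(v):
            return parts
        parts.append(v[end])      # marked character c_i
        pos = end + 1


def mmdist(u, v):
    """Size r of the greedy partition: the number of marked characters."""
    return len(greedy_partition(u, v)) // 2
```

With u = abcba and v = cbaabdcb, `greedy_partition(u, v)` reproduces the first partition of the example, `['cba', 'a', 'b', 'd', 'cb']`, and `mmdist(u, v)` is 2.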
A. Ehrenfeucht and D. Haussler. A New Distance Metric on Strings Computable in Linear Time. Discrete Applied Mathematics, 20:191--203, 1988.
....sequential symbols. 1 While the LCS indicates the sequential commonality between strings, it does not necessarily detect the minimum set of changes. More generally, it has been asserted that string metrics that examine symbols sequentially fail to emphasize the global similarity of two strings [4]. Miller and Myers [6] established the limitations of LCS when they produced a new file compare program that executed at four times the speed of the diff program while producing significantly smaller deltas. The edit distance [10] proved to be a better metric for the difference of files and ....
EHRENFEUCHT, A., AND HAUSSLER, D. A new distance metric on strings computable in linear time. Discrete Applied Mathematics 20 (1988), 191--203.
No context found.
A. Ehrenfeucht and D. Haussler. A New Distance Metric on Strings Computable in Linear Time. Discrete Applied Mathematics, 20:191--203, 1988.