D. Lopresti and A. Tomkins. Block edit models for approximate string matching. Theoretical Computer Science, 181:159--179, 1997.

This paper is cited in the following contexts:
Crawling the Hidden Web - Raghavan, Garcia-Molina (2001)   (40 citations)  (Correct)

....hose) 1) However, word reorderings requires a new distance measure so that two labels such as Company Type and Type of Company (these become company type and type company after normalization) are identified as being very close to each other. The block edit models proposed in [21], succinctly represent both typing errors and word reorderings. These models define the concept of block edit distance; a generalization of the traditional notion of edit distances to handle block word movements. We used one of a family of algorithms from [21] to implement our label matching ....

....The block edit models proposed in [21] succinctly represent both typing errors and word reorderings. These models define the concept of block edit distance; a generalization of the traditional notion of edit distances to handle block word movements. We used one of a family of algorithms from [21] to implement our label matching system based on the block edit model. We match a form element # to an LVS entry by minimizing the block edit distance between their labels, subject to a threshold. Specifically, let ## b ### ## denote the block edit distance between strings # and #;let# ....

Daniel Lopresti and Andrew Tomkins. Block edit models for approximate string matching. Theoretical Computer Science, 181(1):159--179, July 1997.


Text Joins in an RDBMS for Web Data Integration - Gravano, Ipeirotis, Koudas.. (2003)   (3 citations)  (Correct)

....case, we can use block edit distance, a more general edit distance metric that allows for block moves as a basic edit operation. By allowing for block moves, the block edit distance can also capture word rearrangements. Finding the exact block edit distance of two strings is an NP hard problem [17]. Block edit distance cannot capture all mismatches. Differences between records also occur due to insertions and deletions of common words. For example, KAR Corporation International and KAR Corporation have block edit distance 14. If we allow large edit distance thresholds to capture such ....

D. Lopresti and A. Tomkins. Block edit models for approximate string matching. Theoretical Computer Science, 181(1):159--179, 1997.


CLUSEQ: Efficient and Effective Sequence Clustering - Yang, Wang (2003)   (1 citation)  (Correct)

....the optimal global alignment between a pair of sequences, but ignores many other local alignments that often represent important features shared by the pair of sequences . These overlooked features may be very crucial to produce meaningful clusters. Even though allowing block operations [19, 21] may alleviate this weakness to a certain degree, the Consider three sequences aaaabbb, bbbaaaa, and abcdefg. The edit distance between aaaabbb and bbbaaaa is 6 whereas the edit distance between aaaabbb and abcdefg is also 6. This, to a certain extent, contradicts the intuition that aaaabbb is ....

D. Lopresti and A. Tomkins. Block edit models for approximate string matching. Theoretical Computer Science, 1996.
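
The two distances quoted in the CLUSEQ excerpt above can be checked with a short sketch. It assumes the excerpt means the ordinary unit-cost edit (Levenshtein) distance with insertions, deletions, and substitutions; the function name edit_distance is illustrative and does not come from the cited papers.

```python
def edit_distance(a: str, b: str) -> int:
    """Unit-cost Levenshtein distance (assumed cost model: insert,
    delete, and substitute each cost 1)."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty string
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # match or substitute
            ))
        prev = curr
    return prev[-1]


print(edit_distance("aaaabbb", "bbbaaaa"))  # 6
print(edit_distance("aaaabbb", "abcdefg"))  # 6
```

Both pairs come out equally far apart even though bbbaaaa is just aaaabbb with its two blocks swapped, which is the weakness the excerpt says block operations may alleviate.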


Automatic Processing of Document Annotations - Stevens, Gee, Dance (1998)   (Correct)

....which can match multiple words in one string with a single word in the other, and can tolerate small discrepancies between the lengths of matched words. Our approximate string matching algorithm is built around a simple dynamic programming approach [7]. [Page header: 444 British Machine Vision Conference] If # ### ## is the best cost for matching substring # # ###### # of string # to any substring of #, then we exploit the dynamic programming equation: # ##### # ### ### ## ####### ## ###### To find the best match for the entire printed word string, we construct a table, where the rows are ....

D. Lopresti and A. Tomkins. Block edit models for approximate string matching. Theoretical Computer Science, 181:159--179, 1997.


Crawling the Hidden Web - Raghavan, Garcia-Molina (2001)   (40 citations)  (Correct)

....errors but also word reorderings (e.g., we require that two labels Company Type and Type of Company, which become company type and type company after normalization, be identified as being very similar, separated by a very small edit distance). HiWE employs a string matching algorithm from [15] that meets these requirements. Given element E_i, let LabelMatch(E_i) denote the entry in the LVS table whose label has the minimum edit distance to label(E_i) subject to a threshold #. If all entries in the LVS table are more than # edit operations away from label(E_i), LabelMatch(E_i) is ....

D. Lopresti and A. Tomkins. Block edit models for approximate string matching. Theoretical Computer Science, 181(1):159--179, July 1997.
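
The thresholded minimum-distance lookup described in the HiWE excerpt above might be sketched as follows. This is only an illustration under stated assumptions: plain unit-cost edit distance stands in for the block edit algorithm the paper takes from [15] (so word reorderings are not handled here), the names edit_distance and label_match are hypothetical, and returning None when no entry is within the threshold is an assumption because the excerpt is cut off at that point.

```python
from typing import Optional, Sequence


def edit_distance(a: str, b: str) -> int:
    """Unit-cost Levenshtein distance; a stand-in for block edit distance."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1, curr[j - 1] + 1,
                            prev[j - 1] + (ca != cb)))
        prev = curr
    return prev[-1]


def label_match(label: str, lvs_labels: Sequence[str],
                threshold: int) -> Optional[str]:
    """Return the LVS label closest to `label`, or None (assumed) if every
    candidate is more than `threshold` edit operations away."""
    best, best_dist = None, None
    for candidate in lvs_labels:
        dist = edit_distance(label, candidate)
        if best_dist is None or dist < best_dist:
            best, best_dist = candidate, dist
    if best_dist is None or best_dist > threshold:
        return None
    return best


lvs = ["company type", "job category"]
print(label_match("company typ", lvs, threshold=2))  # company type
print(label_match("salary", lvs, threshold=2))       # None
```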


Approximate Nearest Neighbors and Sequence Comparison.. - Muthukrishnan, Sahinalp (2000)   (13 citations)  (Correct)

....study. 2. Block arrangements which involves moving a block (any consecutive set of characters) from one place to another; this is a rather natural notion in defining similarity of objects (such as in moving a paragraph of a text to another location [T84] or moving objects around in pen computing [LT96] or intrasequence rearrangements in genomic data [GD91]). It also involves copying blocks from one place to another within a sequence, or deleting a copy of a block that exists elsewhere. These operations are motivated by data compression. 3. Block reversals which involves reversing an entire ....

....$\min\{i \ge 0 : \log^{(i)} k \le 1\}$, where $\log^{(i)} k = \log(\log^{(i-1)} k)$ and $\log^{(0)} x = 0$. edit operations. If the distance d(S, T) between two sequences S and T is the minimum number of arbitrary block moves needed to transform S to T, the problem of computing d(S, T) becomes NP-hard [LT96]. Thus, we focus on block reversals only. Since it is not possible to transform a given sequence S (e.g. all 0s) to every other sequence T (e.g. all 1s) by block reversals alone, d(S, T) is not well defined if we allow only block reversal operations. The simplest well defined distance with ....

[Article contains additional citation context not shown here]

D. Lopresti and A. Tomkins. Block edit models for approximate string matching. Theoretical Computer Science, 1996.


File System Support for Delta Compression - MacDonald (2000)   (25 citations)  (Correct)

....(2) insert a space character, and (3) copy proxy . Insert delete algorithms do not consider the high level structure of the inputs; based on this, Lopresti and Tomkins present several refined block edit cost models with applications to such things as molecular biology and handwriting analysis [28]. The most basic copy insert algorithm is a greedy one. In a single pass through the target version, it picks the longest available match at any given offset and then continues searching at the end of the match. By assuming a simple but unrealistic cost model for delta encoding, the greedy ....

Lopresti, D., and Tomkins, A. Block edit models for approximate string matching. Theoretical Computer Science 181, 1 (15 July 1997), 159-179.
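
The greedy copy/insert pass described in the MacDonald excerpt above can be sketched roughly as below. This is a minimal illustration, not the implementation from that paper: the instruction format, the names greedy_delta and apply_delta, and the brute-force longest-match search (quadratic, with no index over the source) are all assumptions made for brevity.

```python
def greedy_delta(source: str, target: str):
    """Encode target as ('copy', source_offset, length) and ('insert', text)
    instructions, always taking the longest match at the current offset."""
    ops = []
    pending = []  # literal characters waiting to be flushed as one insert
    pos = 0
    while pos < len(target):
        best_off, best_len = -1, 0
        # Brute-force search for the longest source match starting at pos.
        for off in range(len(source)):
            length = 0
            while (off + length < len(source)
                   and pos + length < len(target)
                   and source[off + length] == target[pos + length]):
                length += 1
            if length > best_len:
                best_off, best_len = off, length
        if best_len > 0:
            if pending:
                ops.append(("insert", "".join(pending)))
                pending = []
            ops.append(("copy", best_off, best_len))
            pos += best_len  # continue searching at the end of the match
        else:
            pending.append(target[pos])
            pos += 1
    if pending:
        ops.append(("insert", "".join(pending)))
    return ops


def apply_delta(source: str, ops) -> str:
    """Rebuild the target from the source and the instruction list."""
    out = []
    for op in ops:
        if op[0] == "copy":
            _, off, length = op
            out.append(source[off:off + length])
        else:
            out.append(op[1])
    return "".join(out)


ops = greedy_delta("abcdef", "defabcX")
print(ops)  # [('copy', 3, 3), ('copy', 0, 3), ('insert', 'X')]
print(apply_delta("abcdef", ops))  # defabcX
```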


Similarity Search for Sequences of Different Lengths.. - Yazdani, Bozkaya..   (Correct)

....using R trees, based on some features of these atomic sequences. This method also depends on the sequence elements to be mostly correlated since outliers are discarded in the matching process. A problem similar to sequence matching is pattern matching in hand written text and hand drawn pictograms [LT94, LT95], also referred as electronic ink. In [LT94] a modified version of edit distance function is used to find the matching text to a hand written text using pen stroke data. The edit distance function used is similar to string edit distance function, but customized for the application with a ....

....use a modified version of the edit distance function to compute similarity between sequences. However, we use different cost functions and also create a one to one mapping between matching and nonmatching elements of sequences making deletions and insertions (via interpolation) if necessary. In [LT95], the authors look at a harder problem on approximate string matching using block edit models. They use string block edit distance which compares two strings by extracting collections of their substrings and creating a correspondence between them. The distances between the matching substrings ....

D. Lopresti, A. Tomkins, "Block Edit Models for Approximate String Matching", in SSAWSP 95.


Index Structures For Temporal And Multimedia Databases - Bozkaya (1998)   (1 citation)  (Correct)

....based on some features of these atomic sequences. This method also depends on the sequence elements to be mostly correlated since outliers are discarded in the matching process. A similar problem to sequence matching is the problem of pattern matching in hand written text and hand drawn pictograms [LT95a, LT95b], also referred as electronic ink. In [LT95a] a modified version of the edit distance function is used to find the matching text to a hand written text using pen stroke data. In [LT95b] the authors look at a harder problem on approximate string matching using block edit models. They use string ....

....to sequence matching is the problem of pattern matching in hand written text and hand drawn pictograms [LT95a, LT95b] also referred as electronic ink. In [LT95a] a modified version of the edit distance function is used to find the matching text to a hand written text using pen stroke data. In [LT95b], the authors look at a harder problem on approximate string matching using block edit models. They use string block edit distance which compares two strings by extracting collections of their substrings and creating a correspondence between them. The distances between the matching substrings are ....

D. Lopresti, A. Tomkins, "Block Edit Models for Approximate String Matching", Second South American Workshop on String Processing (SSAWSP), 1995.


Cross-Domain Approximate String Matching - Lopresti, Wilfong (1999)   Self-citation (Lopresti)   (Correct)

....edit distance between two strings is then defined as the cost of the least expensive sequence of operations that transforms one string into the other. This basic model has been both specialized and extended in numerous ways, including adding new operations (e.g., transpositions [15], block motion [5]), generalizing from simple strings to formal languages (e.g., regular and context-free languages [1, 7]), and editing other types of data structures (e.g., trees [13], 2-D strings [8]). Another Presented at the Sixth International Symposium on String Processing and Information Retrieval, ....

D. Lopresti and A. Tomkins. Block edit models for approximate string matching. Theoretical Computer Science, (181):159--179, 1997.


Temporal-Domain Matching of Hand-Drawn Pictorial Queries - Daniel Lopresti And (1995)   (2 citations)  Self-citation (Lopresti Tomkins)   (Correct)

....of block motion. [Figure 1: Approximate string matching applied to hand drawn pictorial queries. Labels recoverable from the figure: Time; Picture A; Picture B; Basic Blocks (B1 to B7); Block String Matching.] We apply a new block edit string matching algorithm, presented and analyzed in [4], that addresses these problems. Briefly, the algorithm takes a database d_1 ... d_n and a query string q_1 ... q_m as input, partitions the query string into blocks optimally, and matches each block to a database block such that the sum of the distances between corresponding blocks is ....

D. Lopresti and A. Tomkins. Block edit models for approximate string matching. In Proc. South American Workshop on String Processing, pages 11--26, Valparaíso, Chile, Apr. 1995.


Algorithms for Matching Hand-Drawn Sketches - Lopresti, Tomkins, Zhou (1997)   (3 citations)  Self-citation (Lopresti Tomkins)   (Correct)

....or repeated (see, for example, Figure 2) Algorithms that have been developed for matching textual ink are not flexible enough to capture these kinds of block motion. In a recent paper, Lopresti and Tomkins introduced a family of algorithms for block string matching that address this issue [5]. Preliminary results, based on a small experiment, seemed promising [6] In the present paper, we give a more detailed analysis of the problem of matching hand drawn sketches. We describe a hierarchical approach that exploits the temporal nature of electronic ink, and evaluate it using a larger ....

....substrings. The correspondence between blocks is given by a permutation $\sigma \in S_t$ from the symmetric group on t elements. More formally, $B(Q, D) \equiv \min_t \min_{Q|_t,\, D|_t} \min_{\sigma \in S_t} \sum_{i=1}^{t} \mathrm{dist}\bigl(Q^{(i)}, D^{(\sigma(i))}\bigr)$ (1). While the most general form of Equation 1 is NP complete [5], certain variants have efficient solutions. In particular, if the substring family for one of the ink strings, say D, is unconstrained so that: (1) blocks may overlap, and (2) all of D need not be used, we have developed the following polynomial time algorithm. Let W(i, j) be the value of the ....

D. Lopresti and A. Tomkins. Block edit models for approximate string matching. In Proc. Second South American Work. String Processing, pages 11--26, Apr. 1995. To appear in Theoretical Computer Science.


Temporal-Domain Matching of Hand-Drawn Pictorial Queries - Lopresti, Tomkins (1997)   (2 citations)  Self-citation (Lopresti Tomkins)   (Correct)

....of block motion. [Figure 4: Approximate string matching applied to hand drawn pictorial queries. Labels recoverable from the figure: Time; Picture A; Picture B; Basic Blocks (B1 to B7); Block String Matching.] For this problem, we have developed new block edit algorithms for string matching [9]. Briefly, our approach takes a database d_1 ... d_n and a query string q_1 ... q_m as input, partitions the query string into blocks optimally, and matches each block to a database block such that the sum of the distances between corresponding blocks is minimized. Thus, the procedure allows ....

....2 ) CD O(m^2 n) NP complete CD O(m^2 n^2) NP complete NP complete CD O(m^2 n) NP complete NP complete NP complete Table 2: Complexities of block matching problems. Equation 7 does not specify whether the particular substring families must be covers, disjoint, or both. In an earlier paper [9], we examined the various cases, showed which are hard, and presented algorithms for those that are solvable in polynomial time. Table 2 summarizes these results. 3.2 An Algorithm for Block Edit Distance We now present a polynomial time algorithm for one of the variants of block edit distance ....

D. Lopresti and A. Tomkins. Block edit models for approximate string matching. In R. Baeza-Yates and U. Manber, editors, Proceedings of the Second Annual South American Workshop on String Processing, pages 11--26, Valparaíso, Chile, April 1995.


Schema Matching using Duplicates - Alexander Bilke Technische   (1 citation)  (Correct)

No context found.

D. Lopresti and A. Tomkins. Block edit models for approximate string matching. Theoretical Computer Science, 181:159--179, 1997.


CDER: Efficient MT Evaluation Using Block Movements - Leusch, Ueffing, Ney (2006)   (1 citation)  (Correct)

No context found.

D. Lopresti and A. Tomkins. 1997. Block edit models for approximate string matching. Theoretical Computer Science, 181(1):159--179, Jul.


Approximate Nearest Neighbors and Sequence Comparison - With Block Operations   (Correct)

No context found.

D. Lopresti and A. Tomkins. Block edit models for approximate string matching. Theoretical Computer Science, 1996.


The Greedy Algorithm for the Minimum Common String.. - Chrobak, Kolman, Sgall (2004)   (Correct)

No context found.

D. Lopresti, A. Tomkins. Block edit models for approximate string matching. Theoretical Computer Science 181 (159--179) 1997.


File System Support for Delta Compression - Joshua Macdonald University (2000)   (25 citations)  (Correct)

No context found.

LOPRESTI, D., AND TOMKINS, A. Block edit models for approximate string matching. Theoretical Computer Science 181, 1 (15 July 1997), 159--179.


Matching and Indexing Sequences of Different Lengths - Bozkaya, Yazdani, Özsoyoglu (1997)   (24 citations)  (Correct)

No context found.

D. Lopresti, A. Tomkins, "Block Edit Models for Approximate String Matching", in SSAWSP 95.
