| M. W. Du and S. C. Chang. A model and a fast algorithm for multiple errors spelling correction. Acta Informatica, 29:281--302, 1992. |
....filling a dynamic programming matrix $D$ using the following well-known recurrence: $D[i,0] = i$, $D[0,j] = j$, and $D[i,j] = D[i-1,j-1]$ if $A[i] = B[j]$, otherwise $D[i,j] = 1 + \min(D[i-1,j-1],\, D[i-1,j],\, D[i,j-1])$. In this matrix $D[i,j] = ed(A[1..i], B[1..j])$, and so $ed(A,B) = D[|A|,|B|]$. Du and Chang [3] have given the following Recurrence 2 for the edit distance $ed_T$. In this case we denote the dynamic programming matrix by $D_T$, and the superscript $R$ denotes the reverse of a string (that is, if $A$ = "abc", then $A^R$ = "cba"). $D_T[i,-1] = D_T[-1,j] = \max(|A|,|B|)$, $D_T[i,0] = i$, $D_T[0,j] = j$. ....
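The first recurrence quoted in this excerpt can be sketched in Python as follows. This is a minimal illustration, not code from the citing paper; the function name `edit_distance` and the 0-based string indexing are assumptions made here.

```python
def edit_distance(a: str, b: str) -> int:
    """Unit-cost edit distance computed by filling the matrix D row by row."""
    m, n = len(a), len(b)
    # d[i][j] holds the distance between the first i symbols of a
    # and the first j symbols of b.
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i  # delete all i symbols
    for j in range(n + 1):
        d[0][j] = j  # insert all j symbols
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if a[i - 1] == b[j - 1]:
                d[i][j] = d[i - 1][j - 1]
            else:
                d[i][j] = 1 + min(d[i - 1][j - 1], d[i - 1][j], d[i][j - 1])
    return d[m][n]
```

The full matrix is kept for clarity; two rows would suffice if only the final distance is needed.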
....modified for use in the ABNDM algorithm [4]. [Figure: pseudocode Myers(Pat[1..m], Text[1..n], k). Preprocessing builds the pattern match vectors PM for all characters and initialises VP, VN and D[m, 0] = m. For j in 1..n, the vectors D0, HP, HN, VP and VN are updated, D[m, j] is incremented or decremented according to HP and HN, and a match ending at Text[j] is reported if D[m, j] <= k.] Fig. 5. Our ....
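The vector names in this excerpt (PM, D0, HP, HN, VP, VN) match the usual presentation of Myers' bit-parallel algorithm for approximate string matching. The Python sketch below follows that usual presentation. Whether it matches the cited figure line by line is an assumption, and the name `myers_search` is hypothetical. Python integers are unbounded, so every vector is masked to m bits explicitly.

```python
def myers_search(pat: str, text: str, k: int) -> list[int]:
    """Return 1-based end positions j in text where pat matches with <= k errors."""
    m = len(pat)
    if m == 0:
        raise ValueError("pattern must be non-empty")
    mask = (1 << m) - 1   # keeps every vector m bits wide
    high = 1 << (m - 1)   # bit that corresponds to row m of the matrix
    # Pattern match vectors: bit i-1 of pm[c] is set iff Pat[i] == c.
    pm: dict[str, int] = {}
    for i, c in enumerate(pat):
        pm[c] = pm.get(c, 0) | (1 << i)
    vp, vn, score = mask, 0, m   # column 0: D[i, 0] = i, so D[m, 0] = m
    ends = []
    for j, c in enumerate(text, 1):
        eq = pm.get(c, 0)
        d0 = ((((eq & vp) + vp) ^ vp) | eq | vn) & mask  # zero diagonal deltas
        hp = (vn | ~(d0 | vp)) & mask                    # horizontal delta +1
        hn = d0 & vp                                     # horizontal delta -1
        if hp & high:
            score += 1
        elif hn & high:
            score -= 1
        hp_shift = (hp << 1) & mask   # row 0 is all zeros when searching
        hn_shift = (hn << 1) & mask
        vp = (hn_shift | ~(d0 | hp_shift)) & mask
        vn = d0 & hp_shift
        if score <= k:
            ends.append(j)
    return ends
```

For example, `myers_search("abc", "xxabcxx", 0)` reports the single exact occurrence ending at position 5.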
[Article contains additional citation context not shown here]
M. W. Du and S. C. Chang. A model and a fast algorithm for multiple errors spelling correction. Acta Informatica, 29:281--302, 1992.
....metric is calculated through a recursive formula: $d(i-1,j-1) + c(i,j)$ (substitution), $d(i-2,j-2) + c(i,j-1) + c(i-1,j) + 1$ (transposition). This is a standard metric, most often computed by dynamic programming. Other procedures for solving the problem are proposed in, for example, [Brodda 1966] and [Du Chang 1992]. To improve speed, a cut-off criterion can be set so that the comparison is aborted when too many errors are detected. The following example is taken from [Salton 1988]: M O N S T E R C 1 E 1 N 0 1 T 0 R 0.5 E 0.5 The minimal edit distance between the words ....
....Broddawa, the edit distance algorithm with Brodda's distance cost matrix, which was only tested with Wåhlin first-letter coding, is by far the slowest algorithm. The source code for this is based on dynamic programming and could possibly be enhanced and optimised. One faster method is described in [Du Chang 1992]. Precision and Recall. The precision and recall graphs in the preceding chapter give a fairly clear picture of which algorithms have a high efficiency. All the graphs show data where real name frequencies have been taken into account. Tests were also performed with the frequency for all names in ....
Du, M. W., and Chang, S. C. A model and a fast algorithm for multiple errors spelling correction, Acta Informatica, Springer-Verlag, Vol 29, 1992, pp 281-302
.... Pollock and Zamora (1984) observed that most OCR errors are substitutions (misrecognition of a character rather than missegmentation of gaps between words) and went on to propose a new method for correcting scientific and scholarly text (a more difficult domain than previously explored). Du and Chang (1992, 1994) proposed a fast algorithm for multiple errors, furthering edit distance work (Wagner and Fischer 1974; Lowrance and Wagner 1975). Mitton (1995) wrote a PhD thesis about an improved spelling checker which, rather than being a revolutionary improvement on previous algorithms, was a ....
Du, M. and S. Chang (1992). A model and a fast algorithm for multiple errors spelling correction. Acta Informatica 29, 281--302.
....from an alphabet A. X[j] (Y[j]) denotes the initial substring of X (Y) up to and including the j-th symbol. Given X and Y, the edit distance $ed(X[m], Y[n])$, computed according to the recurrence below, gives the minimum number of unit editing operations to convert one string to the other [4]. $ed(X[i+1], Y[j+1]) = ed(X[i], Y[j])$ if $x_{i+1} = y_{j+1}$ (last characters are the same); $= 1 + \min\{ed(X[i-1], Y[j-1]),\ ed(X[i+1], Y[j]),\ ed(X[i], Y[j+1])\}$ if both $x_i = y_{j+1}$ and $x_{i+1} = y_j$ (last two characters are transposed); otherwise $= 1 + \min\{ed(X[i], Y[j]),$ ed(X[i ....
.... [Figure 2: Computation of the elements of the H matrix.] matrix H with element $H(i,j) = ed(X[i], Y[j])$ [4]. We can note that the computation of the element $H(i+1, j+1)$ recursively depends on only $H(i,j)$, $H(i,j+1)$, $H(i+1,j)$ and $H(i-1,j-1)$, from the earlier definition of the edit distance. During the depth-first search of the state graph of the recognizer, entries in column n ....
M. W. Du and S. C. Chang. A model and a fast algorithm for multiple errors spelling correction. Acta Informatica, 29:281--302, 1992.
....insertions, and deletions needed to transform one string into the other. By applying various mathematical transformations, this method becomes a family of metrics. These methods have been used in fields as various as DNA/RNA matching [54], substring matching [37, 49], spelling correction [22], syntax error correction [1, 27, 47], and even the well-known Unix diff program. A good reference to the general area of sequence comparison is [41]. Other methods of doing this measurement do not offer the versatility that the string distance metrics do. Hamming distance, for example, is the ....
M.W. Du and S.C. Chang. A model and a fast algorithm for multiple errors spelling correction. Acta Informatica, 29:281--302, 1992.
....the j-th symbol. We will use X (of length m) to denote the misspelled string, and Y (of length n) to denote the string that is a (possibly partial) candidate string. Given two strings X and Y, the edit distance $ed(X[m], Y[n])$, computed according to the recurrence below (from Du and Chang [3]), gives the minimum number of unit editing operations to convert one string to the other. $ed(X[i+1], Y[j+1]) = ed(X[i], Y[j])$ if $x_{i+1} = y_{j+1}$ (last characters are the same); if both $x_i = y_{j+1}$ and $x_{i+1} = y_j$, $= 1 + \min\{ed(X[i-1], Y[j-1]),\ ed(X[i+1], Y[j]),$ ed(X[i], Y[j ....
....the graph, as shown in Figure 3. The crucial point in this algorithm is that the cut-off edit distance computation can be performed very efficiently using a dynamic programming based approach. To illustrate this, we use the distance matrix H, an m by n matrix with element $H(i,j) = ed(X[i], Y[j])$ [3]. We can note that the computation of the element $H(i+1, j+1)$ recursively depends on only $H(i,j)$, $H(i,j+1)$, $H(i+1,j)$ and $H(i-1,j-1)$, from the definition of the edit distance (see Figure 4). Footnote 2: Note that we have to do this check since we may come to other irrelevant final ....
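The dependency noted in this excerpt means that the column of H for a candidate prefix can be computed from the two preceding columns alone, so a candidate can be extended one symbol at a time and abandoned early. The Python sketch below illustrates this under stated assumptions: the names `next_column` and `within_threshold` are hypothetical, and the cut-off test used here (abandon when every entry of the current column exceeds the threshold t) is a simplified stand-in, not necessarily the cut-off edit distance defined in the citing paper.

```python
def next_column(x, y_prefix, prev, prev2):
    """Column of H for y_prefix, from the columns for its two shorter prefixes.

    prev  is the column for y_prefix[:-1];
    prev2 is the column for y_prefix[:-2] (None when there is no such prefix).
    """
    n = len(y_prefix)
    col = [n] + [0] * len(x)          # H(0, n) = n
    for i in range(1, len(x) + 1):
        if x[i - 1] == y_prefix[-1]:
            col[i] = prev[i - 1]      # last characters are the same
        else:
            best = 1 + min(prev[i - 1], col[i - 1], prev[i])
            # last two characters are transposed
            if i > 1 and n > 1 and x[i - 2] == y_prefix[-1] and x[i - 1] == y_prefix[-2]:
                best = min(best, 1 + prev2[i - 2])
            col[i] = best
    return col


def within_threshold(x, y, t):
    """True if the edit distance (with transpositions) between x and y is <= t."""
    prev2 = None
    prev = list(range(len(x) + 1))    # column for the empty prefix of y
    for n in range(1, len(y) + 1):
        col = next_column(x, y[:n], prev, prev2)
        if min(col) > t:              # no extension of this prefix can recover
            return False
        prev2, prev = prev, col
    return prev[len(x)] <= t
```

The early exit is safe because the smallest entry of a column never decreases as the candidate grows, so once it exceeds t the final distance must as well.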
M. W. Du and S. C. Chang. 1992. A model and a fast algorithm for multiple errors spelling correction. Acta Informatica, 29:281--302.
....insertions, and deletions needed to transform one string into the other. By applying various mathematical transformations, this method becomes a family of metrics. These methods have been used in fields as various as DNA/RNA matching [120], substring matching [79, 108], spelling correction [45], syntax error correction [2, 52, 102], and even the well-known Unix diff program. A good reference to the general area of sequence comparison is [85]. Other methods of doing this measurement do not offer the versatility that the string distance metrics do. Hamming distance, for example, is the ....
M.W. Du and S.C. Chang. A model and a fast algorithm for multiple errors spelling correction. Acta Informatica, 29:281--302, 1992.
....and in agglutinative languages. We then present some preliminary definitions and mathematical background and introduce an algorithm for spelling correction for agglutinative languages. We finally present results from our implementation for Turkish. 2 The spelling correction problem. Du and Chang [3] define the spelling correction problem as follows: From a set of known words (dictionary) find those words that most resemble a given (misspelled) character string. The keyword in this definition is resemble. It is difficult to express rigorously how two strings resemble each other. Generally, a distance ....
....[j]), $ed(X[i], Y[j+1])\}$; $ed(X[0], Y[j]) = j$, $1 \le j \le n$; $ed(X[i], Y[0]) = i$, $1 \le i \le m$, gives the minimum number of insertions, deletions, replacements and transpositions one needs to perform to convert one string to the other. This is a slight modification of the edit distance formulas given by Du and Chang [3] and by Wagner et al. [15]. For example, the edit distance between iflobra and foobar is 3: an i is to be deleted, l has to be replaced with an o, and ra has to be transposed. 3.2.2 q-grams. A q-gram is simply a substring of length q. The q-gram distance is based on counting the number of ....
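The worked example in this excerpt (iflobra to foobar) can be checked with a short Python sketch of an edit distance that also allows transposition of adjacent symbols. This follows the common restricted variant, in which a transposed pair is not edited further. It is an assumption that this agrees with the citing paper's recurrence in every case, and the name `edit_distance_transpose` is hypothetical.

```python
def edit_distance_transpose(x: str, y: str) -> int:
    """Edit distance with insertion, deletion, replacement and adjacent transposition."""
    m, n = len(x), len(y)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i
    for j in range(n + 1):
        d[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if x[i - 1] == y[j - 1]:
                d[i][j] = d[i - 1][j - 1]          # last characters are the same
            else:
                best = 1 + min(d[i - 1][j - 1],    # replacement
                               d[i - 1][j],        # deletion
                               d[i][j - 1])        # insertion
                # last two characters are transposed
                if i > 1 and j > 1 and x[i - 2] == y[j - 1] and x[i - 1] == y[j - 2]:
                    best = min(best, 1 + d[i - 2][j - 2])
                d[i][j] = best
    return d[m][n]
```

Calling `edit_distance_transpose("iflobra", "foobar")` returns 3, in agreement with the example: one deletion, one replacement and one transposition.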
[Article contains additional citation context not shown here]
M. W. Du and S. C. Chang. A model and a fast algorithm for multiple errors spelling correction. Acta Informatica, 29:281--302, 1992.