A typical syntactic pattern recognition (PR) problem involves comparing a noisy string with every element of a dictionary, H. The classification problem can be greatly simplified if the dictionary is partitioned into a set of sub-dictionaries. In this case, the classification can be hierarchical: the noisy string is first compared to a representative element of each sub-dictionary, and the closest match is then located within the sub-dictionary whose representative is nearest. The problem of sub-dividing a set of strings into subsets, where each subset contains "similar" strings, has been referred to as the "String Taxonomy Problem". To our knowledge there is no reported solution to this problem (see footnote on Page 2). In this paper we present a learning-automaton-based solution to string taxonomy. The solution utilizes the Object Migrating Automaton (OMA), whose power in clustering objects and images [33,35] has been reported. The power of the scheme for string taxonomy is demonstrated using random strings and garbled versions of string representations of fragments of macromolecules.
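The hierarchical classification described above can be sketched as follows. This is only an illustration under stated assumptions, not the paper's method: it uses the unit-cost Levenshtein distance as the similarity measure, takes the partition and the representatives as given (the OMA-based partitioning itself is not shown), and the names `edit_distance`, `classify` and `sub_dictionaries` are hypothetical.

```python
from typing import Dict, List, Tuple


def edit_distance(a: str, b: str) -> int:
    """Unit-cost Levenshtein distance, computed row by row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # substitute (free if equal)
            ))
        prev = curr
    return prev[-1]


def classify(noisy: str,
             sub_dictionaries: Dict[str, List[str]]) -> Tuple[str, str]:
    """Two-stage search: nearest representative, then nearest member.

    Assumption: each key of `sub_dictionaries` is the representative
    of the list of strings it maps to.
    """
    # Stage 1: compare the noisy string with one representative per subset.
    rep = min(sub_dictionaries, key=lambda r: edit_distance(noisy, r))
    # Stage 2: search only inside the selected sub-dictionary.
    best = min(sub_dictionaries[rep], key=lambda w: edit_distance(noisy, w))
    return rep, best


# Hypothetical toy dictionary, already partitioned by hand.
sub_dictionaries = {
    "string": ["string", "strong", "spring"],
    "pattern": ["pattern", "lantern", "patter"],
}

print(classify("strig", sub_dictionaries))
print(classify("lanten", sub_dictionaries))
```

The saving comes from stage 2 visiting a single sub-dictionary instead of all of H. The price is that the search can miss the globally closest string when that string's representative is not the nearest one, which is why the quality of the partition matters.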
Clustering Algorithms – Hartigan, 1975
A general method applicable to the search for similarities in the amino acid sequence of two proteins – Needleman, Wunsch, 1970
Error bounds for convolutional codes and an asymptotically optimum decoding algorithm – Viterbi, 1967
Binary codes capable of correcting deletions, insertions and reversals – Levenshtein, 1966
The Viterbi algorithm – Forney, 1973
Non-Uniform Random Variate Generation – Devroye, 1986
A faster algorithm computing string edit distances – Masek, Paterson, 1980
Learning automata: An introduction – Narendra, Thathachar, 1989
Algorithms for the longest common subsequence problem – Hirschberg, 1977
A fast algorithm for computing longest common subsequences – Hunt, Szymanski, 1977
The complexity of some problems on subsequences and supersequences – Maier, 1978
The String to String Correction problem – Wagner, Fischer, 1974
Computer programs for detecting and correcting spelling errors – Peterson, 1980
Bounds on the Complexity of the Longest Common Subsequence Problem – Aho, Hirschberg, et al., 1976
Analysis of digital tries with Markovian dependency – Jacquet, Szpankowski, 1991
Automaton Theory and Modeling of Biological Systems – Tsetlin, 1973
Relative Frequency of English Speech Sounds – Dewey, 1923
A Method for the Correction of Garbled Words Based on the Levenshtein Metric – Okuda, Tanaka, et al., 1976
A Note on the Height of Suffix Trees – Devroye, Szpankowski, et al., 1992
Principles of – Nilsson, 1980
Computer text recognition and error correction – Srihari, 1984
Approximate string matching, Comput. Surveys – Hall, Dowling, 1980
An effective algorithm for string correction using generalized edit distances - I. Description of the algorithm and its optimality – Kashyap, Oommen, 1981
The Viterbi Algorithm as an Aid in Text Recognition – Neuhoff, 1975
Recognition of noisy subsequences using constrained edit distances – Oommen, 1987
Experiments in text recognition with the modified Viterbi algorithm – Shinghal, Toussaint, 1979
Probabilistic analysis of generalized suffix trees – Szpankowski, 1992
The noisy substring matching problem – Kashyap, Oommen, 1983
Deterministic Learning Automata Solutions to the Equipartitioning Problem – Oommen, Ma, 1988
Time Warps, String Edits and Macromolecules: The Theory and Practice of Sequence Comparison, Addison-Wesley – Sankoff, Kruskal, 1983
A common basis for similarity and dissimilarity measures involving two strings – Kashyap, Oommen, 1983
String correction using probabilistic methods, Pattern Recognition Letters – Kashyap, Oommen, 1984
Decoding with channels with insertions, deletions and substitutions with applications to speech recognition – Bahl, Jelinek, 1975
Fast Learning Automaton-Based Image Examination and Retrieval – Oommen, Fothergill, 1993
Keyboard optimization technique to improve output rate of disabled individuals – Minneman, 1986
Symbolic Channel Modelling for Noisy Channels which Permit Arbitrary Noise Distributions – Oommen, Kashyap, 1993
A language approach to string searching evaluation – Regnier, 1992
Adaptive Clustering Schemes: General Framework – Yu, Siu, et al., 1981
Atlas of Protein Sequence and Structure, National Biomedical – Dayhoff, 1971
Computer disambiguation of multi-character text entry: An adaptive design approach – Levine, H, et al., 1986
Stochastic Automata Solutions to the Object Partitioning Problem – Oommen, Ma, 1992
Similarity measures for sets of strings – Kashyap, Oommen, 1983
An adaptive approach to optimal keyboard design for nonvocal communication – Levine, 1985
A simplified Touch-Tone telecommunication aid for deaf and hearing impaired individuals – Minneman, 1985
The noisy substring matching problem – Math, 1983
A faster algorithm computing string edit distances – Mach, 1978
Bounds for the string editing problem – Mach, 1974