Identifying Hierarchical Structure in Sequences: A Linear-time Algorithm

by Craig G. Nevill-Manning, Ian H. Witten
http://www.cs.washington.edu/research/jair/volume7/nevill97a.ps

Abstract:

SEQUITUR is an algorithm that infers a hierarchical structure from a sequence of discrete symbols by replacing repeated phrases with a grammatical rule that generates the phrase, and continuing this process recursively. The result is a hierarchical representation of the original sequence, which offers insights into its lexical structure. The algorithm is driven by two constraints that reduce the size of the grammar, and produce structure as a by-product. SEQUITUR breaks new ground by operating incrementally. Moreover, the method's simple structure permits a proof that it operates in space and time that is linear in the size of the input. Our implementation can process 50,000 symbols per second and has been applied to an extensive range of real-world sequences.
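
The core idea in the abstract, replacing a repeated phrase with a rule and then repeating the process on the result, can be illustrated with a minimal sketch. This is not the SEQUITUR procedure itself: it is an offline, greedy loop that repeatedly replaces the most frequent adjacent pair of symbols. It is neither incremental nor linear-time, and it does not reproduce the paper's two constraints exactly. The function names (`count_pairs`, `replace_pair`, `infer_grammar`, `expand`) and the convention that terminals are characters and rule names are integers are assumptions made for illustration.

```python
from collections import Counter


def count_pairs(seq):
    """Count non-overlapping occurrences of each adjacent pair of symbols."""
    counts = Counter()
    last_pos = {}
    for i in range(len(seq) - 1):
        pair = (seq[i], seq[i + 1])
        if last_pos.get(pair) == i - 1:
            continue  # overlaps the occurrence just counted, as in "aaa"
        counts[pair] += 1
        last_pos[pair] = i
    return counts


def replace_pair(seq, pair, symbol):
    """Replace each left-to-right, non-overlapping occurrence of pair."""
    out = []
    i = 0
    while i < len(seq):
        if i + 1 < len(seq) and (seq[i], seq[i + 1]) == pair:
            out.append(symbol)
            i += 2
        else:
            out.append(seq[i])
            i += 1
    return out


def infer_grammar(text):
    """Return (start, rules): start is a list of symbols, rules maps an
    integer rule name to the pair of symbols it generates.
    Illustrative only: offline and greedy, not the incremental algorithm."""
    seq = list(text)      # terminals are single characters
    rules = {}
    next_rule = 1         # rule names are integers, so they never clash with terminals
    while True:
        counts = count_pairs(seq)
        if not counts:
            break
        pair, n = counts.most_common(1)[0]
        if n < 2:
            break         # no pair repeats, so nothing is left to factor out
        rules[next_rule] = pair
        seq = replace_pair(seq, pair, next_rule)
        next_rule += 1
    return seq, rules


def expand(symbols, rules):
    """Expand a symbol list back into the original string."""
    parts = []
    for s in symbols:
        if isinstance(s, int):
            parts.append(expand(rules[s], rules))
        else:
            parts.append(s)
    return "".join(parts)


start, rules = infer_grammar("abcdbcabcd")
assert expand(start, rules) == "abcdbcabcd"
```

Each new rule may itself contain earlier rule names, which is where the hierarchy comes from. Because this sketch rescans the whole sequence on every pass, its running time grows faster than linearly with the input, unlike the method the abstract describes.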

Citations

627 Language identification in the limit – Gold
557 Text Compression – Bell, Cleary, et al. - 1990
545 An introduction to hidden markov models – Rabiner, Juang - 1986
124 Inference of reversible languages – Angluin - 1982
91 Inducing probabilistic grammars by Bayesian model merging – Stolcke, Omohundro - 1994
36 Inferring Sequential Structure – Nevill-Manning - 1996
33 Learning syntax by automata induction – Berwick, Pilato - 1987
33 Discrete sequence prediction and its applications – Laird - 1994
31 A version space approach to learning context-free grammars – VanLehn, Ball - 1987
30 Attention and structure in sequence learning – Cohen, Ivry, et al. - 1990
26 Browsing in digital libraries: A phrase-based approach – Nevill-Manning, Witten, et al.
22 Manual of information to accompany the Lancaster-Oslo/Bergen corpus of British English, for use with digital computers – Johansson, Leech, et al. - 1978
21 Grammatical inference by hill climbing – Cook, Rosenfeld, et al. - 1976
17 Behaviour/Structure Transformations Under Uncertainty – Gaines - 1976
16 Language acquisition and the discovery of phrase structure – Wolff - 1980
13 Simplicity and representation change in grammar induction. Unpublished manuscript – Langley - 1994
13 The discovery of segments in natural language – Wolff - 1977
11 An algorithm for the segmentation of an artificial language analogue – Wolff - 1975
8 The art of computer programming 1: fundamental algorithms – Knuth - 1968
8 Grammar enumeration and inference – Wharton - 1977
5 Thinking With The Teachable Machine – Andreae - 1977