by Fernando Pereira, Naftali Tishby, Lillian Lee
In Proceedings of the 31st Annual Meeting of the Association for Computational Linguistics
ftp://cs.nyu.edu/pub/nlp/anlp97/model.ps
Abstract:
We describe and experimentally evaluate a method for automatically clustering words according to their distribution in particular syntactic contexts. Deterministic annealing is used to find lowest-distortion sets of clusters. As the annealing parameter increases, existing clusters become unstable and subdivide, yielding a hierarchical "soft" clustering of the data. Clusters are used as the basis for class models of word co-occurrence, and the models are evaluated with respect to held-out test data.
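The annealing procedure summarized in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from this listing: the use of KL divergence as the distortion, the exponential membership rule, the multiplicative jitter used to let coincident centroids separate, the schedule of `beta` values, the toy word-by-context distributions, and all function names (`kl`, `memberships`, `update_centroids`, `perturb`, `anneal`).

```python
import math
import random

def kl(p, q):
    """KL divergence D(p || q) in nats; assumes q[i] > 0 wherever p[i] > 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def memberships(dists, centroids, beta):
    """Soft assignment of each word to each cluster, proportional to
    exp(-beta * distortion). This rule is an assumption of the sketch."""
    result = []
    for p in dists:
        d = [kl(p, c) for c in centroids]
        dmin = min(d)  # subtract the minimum for numerical stability
        weights = [math.exp(-beta * (di - dmin)) for di in d]
        total = sum(weights)
        result.append([w / total for w in weights])
    return result

def update_centroids(dists, member, previous):
    """Each centroid becomes the membership-weighted average distribution."""
    n_ctx = len(dists[0])
    centroids = []
    for k, old in enumerate(previous):
        mass = sum(m[k] for m in member)
        if mass == 0:  # empty cluster: keep the old centroid
            centroids.append(old)
            continue
        centroids.append([sum(m[k] * p[i] for m, p in zip(member, dists)) / mass
                          for i in range(n_ctx)])
    return centroids

def perturb(centroids, rng, eps=0.01):
    """Jitter and renormalize centroids so coincident ones are able to split."""
    out = []
    for c in centroids:
        jittered = [ci * (1 + eps * rng.uniform(-1, 1)) for ci in c]
        total = sum(jittered)
        out.append([j / total for j in jittered])
    return out

def anneal(dists, n_clusters=2, betas=(0.5, 1, 2, 4, 8, 16, 32, 64),
           iters=50, seed=0):
    """Run the soft-clustering updates at increasing beta (hypothetical schedule).
    Returns a list of (beta, memberships) pairs, one per annealing stage."""
    rng = random.Random(seed)
    n = len(dists)
    mean = [sum(p[i] for p in dists) / n for i in range(len(dists[0]))]
    centroids = [list(mean) for _ in range(n_clusters)]  # start at the global mean
    history = []
    for beta in betas:
        centroids = perturb(centroids, rng)
        for _ in range(iters):
            member = memberships(dists, centroids, beta)
            centroids = update_centroids(dists, member, centroids)
        history.append((beta, memberships(dists, centroids, beta)))
    return history

# Toy data (invented): each row is one word's distribution over three contexts.
WORDS = ["eat", "drink", "run", "walk"]
DISTS = [[0.8, 0.1, 0.1], [0.7, 0.2, 0.1], [0.1, 0.1, 0.8], [0.1, 0.2, 0.7]]

for beta, member in anneal(DISTS):
    print(beta, [round(m[0], 3) for m in member])
```

On this toy data, small values of `beta` leave both centroids at the global mean, so every word belongs about equally to both clusters. Larger values make that configuration unstable, and the memberships sharpen into two groups. Applying the same step recursively to each resulting cluster would produce a hierarchy; the sketch stops at a single split.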
Citations
4363 | Elements of Information Theory | Cover, Thomas | 1991
4344 | Maximum likelihood from incomplete data via the EM algorithm | Dempster, Laird, et al. | 1977
2961 | Pattern Classification and Scene Analysis | Duda, Hart | 1973
622 | A stochastic parts program and noun phrase parser for unrestricted text | Church | 1988
390 | Class-Based n-gram Models of Natural Language | Brown, Pietra, et al. | 1992
152 | Noun classification from predicate-argument structures | Hindle | 1990
130 | Pattern Classification and Scene Analysis. Wiley-Interscience Publication | Duda, Hart | 1973
125 | A comparison of the enhanced Good-Turing and deleted estimation methods for estimating probabilities of English bigrams. Computer Speech and Language 5.19–54 | Church, Gale | 1991
115 | Statistical mechanics and phase transitions in clustering | Rose, Fox | 1990
114 | Stochastic Lexicalized Tree-Adjoining Grammars | Schabes | 1992
44 | Wordnet and distributional analysis: A class-based approach to lexical discovery | Resnik | 1992
24 | A parser for text corpora | Hindle | 1994
3 | Brandeis lectures | Jaynes | 1983
3 | CONC: Tools for text corpora | Yarowsky | 1992
2 | Contextual word similarity and the estimation of sparse lexical relations | Dagan, Markus, et al. | 1992
1 | Stochastic lexicalized tree-adjoining grammars | Schabes | 1992