by Marti A. Hearst, Hinrich Schütze
Proc. of the Workshop on Extracting Lexical Knowledge
http://www.sims.berkeley.edu/~hearst/papers/siglex93.ps.gz
Abstract:
We discuss a method for augmenting and rearranging a structured lexicon in order to make it more suitable for a topic labeling task, by making use of lexical association information from a large text corpus. We first describe an algorithm for converting the hierarchical structure of WordNet [13] into a set of flat categories. We then use lexical cooccurrence statistics in combination with these categories to classify proper names, assign more specific senses to broadly defined terms, and classify new words into existing categories. We also describe how to use these statistics to assign schema-like information to the categories and show how the new categories improve a text-labeling algorithm. In effect, we provide a mechanism for successfully combining
Citations
1519 | Indexing by latent semantic analysis – Deerwester, Dumais, et al. - 1990
1016 | WordNet: an On-line Lexical Database – Miller - 1990
544 | A framework for representing knowledge – Minsky - 1981
355 | Automatic acquisition of hyponyms from large text corpora – Hearst
83 | SCISOR: extracting information from on-line news – Jacobs, Rau - 1990
77 | Classifying News Stories Using Memory Based Reasoning – Masand, Linoff, et al. - 1992
50 | Semantically significant patterns in dictionary definitions – Markowitz, Ahlswede, et al. - 1986
45 | WordNet and Distributional Analysis: A Class-based Approach to Lexical Discovery – Resnik - 1992
43 | Large-scale sparse singular value computations – Berry - 1992
41 | Lexical cohesion, the thesaurus, and the structure of text – Morris - 1988
40 | TextTiling: A quantitative approach to discourse segmentation – Hearst - 1993
38 | Providing machine tractable dictionary tools, Machine Translation 5 – Wilks, Fass, et al. - 1990
28 | Part-of-speech induction from scratch – Schütze - 1993
26 | Processing dictionary definitions with phrasal pattern hierarchies – Alshawi - 1987
23 | Acquisition of Lexical Information from a Large Textual Italian Corpus – Calzolari, Bindi - 1990
16 | Graph drawing by force-directed placement – Fruchterman, Reingold - 1990
8 | Word sense disambiguation using statistical models of Roget's categories trained on large corpora – Yarowsky
5 | On the acquisition of lexical entries: The perceptual origin of thematic relations – Pustejovsky - 1987
4 | Part-of-speech induction from scratch – Schütze - 1993
1 | Carta: A network topology presentation tool. Project Report – Amir - 1993
1 | A new knowledge-poor technique for knowledge extraction from large corpora – Grefenstette
1 | Acquisition of lexical information from a large textual Italian corpus – Nicoletta Calzolari and Remo Bindi - 1990