Word sense disambiguation: a survey
ACM Computing Surveys, 2009
Cited by 28 (9 self)
Word sense disambiguation (WSD) is the ability to identify the meaning of words in context in a computational manner. WSD is considered an AI-complete problem, that is, a task whose solution is at least as hard as the most difficult problems in artificial intelligence. We introduce the reader to the motivations for solving the ambiguity of words and provide a description of the task. We overview supervised, unsupervised, and knowledge-based approaches. The assessment of WSD systems is discussed in the context of the Senseval/Semeval campaigns, aiming at the objective evaluation of systems participating in several different disambiguation tasks. Finally, applications, open problems, and future directions are discussed.
A WordNet Detour to FrameNet
Cited by 18 (4 self)
In this paper, we present a rule-based system for the assignment of FrameNet frames by way of a “detour via WordNet”. The system can be used to overcome sparse-data problems of statistical systems trained on current FrameNet data. We devise a weighting scheme to select the best frame(s) out of a set of candidate frames, and present first figures of evaluation.
Unsupervised and supervised exploitation of semantic domains in lexical disambiguation
Computer Speech and Language, 2004
Cited by 17 (7 self)
Domains are common areas of human discussion, such as economics, politics, law, science, etc., which are at the basis of lexical coherence. This paper explores the dual role of domains in word sense disambiguation (WSD). On the one hand, domain information provides generalized features at the paradigmatic level that are useful to discriminate among word senses. On the other hand, domain distinctions constitute a useful level of coarse-grained sense distinctions, which lends itself to more accurate disambiguation with lower amounts of knowledge. In this paper we extend and ground the modeling of domains and the exploitation of WordNet Domains, an extension of WordNet in which each synset is labeled with domain information. We propose a novel unsupervised probabilistic method for the critical step of estimating domain relevance for contexts, and suggest utilizing it within unsupervised Domain Driven Disambiguation (DDD) for word senses, as well as within a traditional supervised approach. The paper presents empirical assessments of the potential utilization of domains in WSD in a wide range of comparative settings, supervised and unsupervised. Following the dual role of domains, we report experiments that evaluate both the extent to which domain information provides effective features for WSD and the accuracy obtained by WSD at domain-level sense granularity. Furthermore, we demonstrate the potential for either avoiding or minimizing manual annotation thanks to the generalized level of information provided by domains.
Supervised Word Sense Disambiguation with Support Vector Machines and . . .
SENSEVAL-3: Third International Workshop on the Evaluation of Systems for the Semantic Analysis of Text, 2004
Cited by 15 (0 self)
... sample task and multilingual lexical sample task. We adopted a supervised learning approach with Support Vector Machines, using only the official training data provided. No other external resources were used. The knowledge sources used were part-of-speech of neighboring words, single words in the surrounding context, local collocations, and syntactic relations. For the translation and sense subtask of the multilingual lexical sample task, the English sense given for the target word was also used as an additional knowledge source. For the English lexical sample task, we obtained fine-grained and coarse-grained scores (for both recall and precision) of 0.724 and 0.788 respectively. For the multilingual lexical sample task, we obtained recall (and precision) of 0.634 for the translation subtask, and 0.673 for the translation and sense subtask.
The Role of Semantic Roles in Disambiguating Verb Senses
Proceedings of the 43rd Meeting of the Association for Computational Linguistics (ACL-05), Ann Arbor, 2005
Cited by 9 (1 self)
We describe an automatic Word Sense Disambiguation (WSD) system that disambiguates verb senses using syntactic and semantic features that encode information about predicate arguments and semantic classes. Our system performs at the best published accuracy on the English verbs of Senseval-2. We also experiment with using the gold-standard predicate-argument labels from PropBank for disambiguating fine-grained WordNet senses and coarse-grained PropBank framesets, and show that disambiguation of verb senses can be further improved with better extraction of semantic roles.
How to get a Chinese Name (Entity): Segmentation and Combination Issues
In Proceedings of EMNLP’03, 2003
Cited by 4 (3 self)
When building a Chinese named entity recognition system, one must deal with certain language-specific issues such as whether the model should be based on characters or words. While there is no unique answer to this question, we discuss in detail advantages and disadvantages of each model, identify problems in segmentation and suggest possible solutions, presenting our observations, analysis, and experimental results. The second topic of this paper is classifier combination.
Smoothing and Word Sense Disambiguation
In Proceedings of ESTAL - España for Natural Language Processing, 2004
Cited by 4 (3 self)
This paper presents an algorithm to apply the smoothing techniques described in [1] to three different Machine Learning (ML) methods for Word Sense Disambiguation (WSD). The method to obtain better estimations for the features is explained step by step, and applied to n-way ambiguities. The results obtained in the Senseval-2 framework show that the method can help improve the precision of some weak learners, and in combination attain the best results so far in this setting.
Exploring feature set combinations for WSD
In Proc. of the SEPLN, 2006
Cited by 2 (2 self)
This paper explores splitting feature sets in order to obtain better WSD systems through combinations of classifiers learned over each of the split feature sets. Our results show that only k-NN is able to profit from the combination of split features, and that simple voting is not enough for that. Instead we propose combining all k-NN subsystems, where each of the k neighbors casts one vote. We have performed a thorough evaluation on two datasets (Senseval-3 Lexical-Sample and All-words), having set the best combination options on a development dataset (Senseval-2 Lexical-Sample). The results for the All-words task are the best published to date. The results for the lexical sample are state-of-the-art. Keywords: word sense disambiguation, feature space, k Nearest Neighbor.
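The combination scheme mentioned in the abstract above (several k-NN subsystems, one per feature group, with each of the k neighbors casting one vote) might be sketched roughly as follows. This is an illustrative toy, not the authors' system: the feature groups ("local" and "topical"), the numeric vectors, the sense labels, the Euclidean distance, and the function names `k_nearest_labels` and `combine_knn_votes` are all assumptions made for the example.

```python
from collections import Counter
from math import dist  # Euclidean distance, Python 3.8+


def k_nearest_labels(train, query, k):
    """Return the sense labels of the k training points closest to `query`."""
    ranked = sorted(train, key=lambda item: dist(item[0], query))
    return [label for _, label in ranked[:k]]


def combine_knn_votes(subsystems, queries, k):
    """Pool votes across subsystems: every retrieved neighbor casts one vote.

    `subsystems` holds one training set per feature group; `queries` holds
    the same test instance represented in each feature group.
    """
    votes = Counter()
    for train, query in zip(subsystems, queries):
        votes.update(k_nearest_labels(train, query, k))
    # Ties fall back to Counter's ordering (first label seen wins).
    return votes.most_common(1)[0][0]


# Hypothetical training data for an ambiguous word, split into two feature groups.
local_train = [((0.0, 0.0), "finance"), ((0.1, 0.2), "finance"),
               ((1.0, 1.0), "river"), ((0.9, 1.1), "river")]
topical_train = [((0.0,), "finance"), ((0.2,), "finance"),
                 ((1.0,), "river"), ((0.8,), "river")]

# The same test instance, represented once per feature group.
predicted = combine_knn_votes(
    [local_train, topical_train],
    [(0.2, 0.1), (0.3,)],
    k=3,
)
print(predicted)
```

Pooling neighbor-level votes, rather than letting each subsystem cast a single vote for its winner, keeps the margin information from each feature group, which is one plausible reading of why simple voting is reported as insufficient.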
Programs for machine learning
Advances in Neural Information Processing Systems 15, 1993
Cited by 2 (0 self)
learning algorithms on word sense disambiguation with small datasets

