Results 1 - 10 of 21
Entity Linking meets Word Sense Disambiguation: A Unified Approach
Transactions of the Association for Computational Linguistics, 2014
Cited by 46 (17 self)
Entity Linking (EL) and Word Sense Disambiguation (WSD) both address the lexical ambiguity of language. But while the two tasks are pretty similar, they differ in a fundamental respect: in EL the textual mention can be linked to a named entity which may or may not contain the exact mention, while in WSD there is a perfect match between the word form (better, its lemma) and a suitable word sense. In this paper we present Babelfy, a unified graph-based approach to EL and WSD based on a loose identification of candidate meanings coupled with a densest subgraph heuristic which selects high-coherence semantic interpretations. Our experiments show state-of-the-art performances on both tasks on 6 different datasets, including a multilingual setting. Babelfy is online at
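As an illustration of the general idea behind a densest subgraph heuristic, the sketch below uses the classic greedy peeling strategy: repeatedly drop the lowest-degree node and remember the densest intermediate graph. This is not Babelfy's actual procedure, which the abstract does not spell out; the function name `densest_subgraph`, the density definition (edges divided by nodes), and the toy graph are all assumptions made for the example.

```python
from collections import defaultdict


def densest_subgraph(edges):
    """Greedy peeling heuristic for the densest subgraph (density = |E| / |V|).

    Hypothetical helper for illustration only; not taken from the paper.
    """
    # Build an undirected adjacency map, ignoring self-loops.
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:
            adj[u].add(v)
            adj[v].add(u)

    nodes = set(adj)
    num_edges = sum(len(neigh) for neigh in adj.values()) // 2
    best_nodes = set(nodes)
    best_density = num_edges / len(nodes) if nodes else 0.0

    while len(nodes) > 1:
        # Remove the node with the fewest remaining neighbours
        # (ties broken by name so the result is deterministic).
        u = min(nodes, key=lambda n: (len(adj[n]), str(n)))
        for v in adj[u]:
            adj[v].discard(u)
        num_edges -= len(adj[u])
        nodes.remove(u)
        del adj[u]

        density = num_edges / len(nodes)
        if density > best_density:
            best_density, best_nodes = density, set(nodes)

    return best_nodes, best_density


# Toy graph: a 4-clique (a, b, c, d) with a loosely attached tail (e, f).
toy_edges = [
    ("a", "b"), ("a", "c"), ("a", "d"),
    ("b", "c"), ("b", "d"), ("c", "d"),
    ("d", "e"), ("e", "f"),
]
core, density = densest_subgraph(toy_edges)
```

In a disambiguation setting the nodes would be candidate meanings and the edges semantic relations between them, so the surviving dense core corresponds to a mutually coherent set of interpretations.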
Align, Disambiguate and Walk: A Unified Approach for Measuring Semantic Similarity
Cited by 15 (6 self)
Semantic similarity is an essential component of many Natural Language Processing applications. However, prior methods for computing semantic similarity often operate at different levels, e.g., single words or entire documents, which requires adapting the method for each data type. We present a unified approach to semantic similarity that operates at multiple levels, all the way from comparing word senses to comparing text documents. Our method leverages a common probabilistic representation over word senses in order to compare different types of linguistic data. This unified representation shows state-of-the-art performance on three tasks: semantic textual similarity, word similarity, and word sense coarsening.
SemEval-2013 Task 11: Word Sense Induction & Disambiguation within an End-User Application
Cited by 7 (0 self)
In this paper we describe our SemEval-2013 task on Word Sense Induction and Disambiguation within an end-user application, namely Web search result clustering and diversification. Given a target query, induction and disambiguation systems are requested to cluster and diversify the search results returned by a search engine for that query. The task enables the end-to-end evaluation and comparison of systems.
A Large-scale Pseudoword-based Evaluation Framework for State-of-the-Art Word Sense Disambiguation
Cited by 5 (1 self)
The evaluation of several tasks in lexical semantics is often limited by the lack of large amounts of manual annotations, not only for training purposes, but also for testing purposes. Word Sense Disambiguation (WSD) is a case in point, as hand-labeled datasets are particularly hard and time-consuming to create. Consequently, evaluations tend to be performed on a small scale, which does not allow for in-depth analysis of the factors that determine a system's performance. In this paper we address this issue by means of a realistic simulation of large-scale evaluation for the WSD task. We do this by providing two main contributions: first, we put forward two novel approaches to the wide-coverage generation of semantically-aware pseudowords, i.e., artificial words capable of modeling real polysemous words; second, we leverage the most suitable type of pseudoword to create large pseudosense-annotated corpora, which enable a large-scale experimental framework for the comparison of state-of-the-art supervised and knowledge-based algorithms. Using this framework, we study the impact of supervision and knowledge on the two major disambiguation paradigms and perform an in-depth analysis of the factors which affect their performance.
Exploiting DBpedia for Web Search Results Clustering
In Proceedings of the 2013 Workshop on Automated Knowledge Base Construction, AKBC '13, 2013
Cited by 3 (1 self)
We present a knowledge-rich approach to Web search result clustering which exploits the output of an open-domain entity linker, as well as the types and topical concepts encoded within a wide-coverage ontology. Our results indicate that, thanks to an accurate and compact semantification of the search result snippets, we are able to achieve a competitive performance on a benchmarking dataset for this task.
Post-Retrieval Clustering Using Third-Order Similarity Measures
Cited by 3 (3 self)
Post-retrieval clustering is the task of clustering Web search results. Within this context, we propose a new methodology that adapts the classical K-means algorithm to a third-order similarity measure initially developed for NLP tasks. Results obtained with the definition of a new stopping criterion over the ODP-239 and the MORESQUE gold standard datasets evidence that our proposal outperforms all reported text-based approaches.
Duluth: Word Sense Induction Applied to Web Page Clustering
Cited by 1 (0 self)
The Duluth systems that participated in task 11 of SemEval-2013 carried out word sense induction (WSI) in order to cluster Web search results. They relied on an approach that represented Web snippets using second-order co-occurrences. These systems were all implemented using SenseClusters, a freely available open source software package.
Easy Web Search Results Clustering: When Baselines Can Reach State-of-the-Art Algorithms
Cited by 1 (1 self)
This work discusses the evaluation of baseline algorithms for Web search results clustering. An analysis is performed over frequently used baseline algorithms and standard datasets. Our work shows that competitive results can be obtained by either fine tuning or performing cascade clustering over well-known algorithms. In particular, the latter strategy can lead to a scalable and real-world solution, which evidences comparative results to recent text-based state-of-the-art algorithms.
SATTY: Word Sense Induction Application in Web Search Clustering
The aim of this paper is to perform Word Sense Induction (WSI), which clusters web search results and produces a diversified list of search results. It describes the WSI system developed for Task 11 of SemEval-2013. This paper implements the idea of monotone submodular function optimization using a greedy algorithm.
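To make the idea of greedy optimization of a monotone submodular function concrete, here is a minimal sketch using set coverage as the objective: at each step it picks the search result that adds the most not-yet-covered terms. The abstract does not say which objective SATTY optimizes, so the coverage function, the name `greedy_max_coverage`, and the toy results below are assumptions chosen purely for illustration.

```python
def greedy_max_coverage(candidates, k):
    """Pick up to k candidates that greedily maximize the number of covered terms.

    `candidates` maps a result id to the set of terms it contains.
    Coverage is monotone and submodular, so for this objective the greedy
    choice is within a factor (1 - 1/e) of the optimum.
    Hypothetical helper for illustration only; not taken from the paper.
    """
    remaining = dict(candidates)
    selected = []
    covered = set()

    for _ in range(min(k, len(remaining))):
        # Marginal gain = number of new terms this candidate would add.
        best = max(remaining, key=lambda name: len(remaining[name] - covered))
        if not remaining[best] - covered:
            break  # nothing left to gain
        selected.append(best)
        covered |= remaining.pop(best)

    return selected, covered


# Toy snippets for an ambiguous query such as "apple".
toy_results = {
    "r1": {"apple", "fruit"},
    "r2": {"apple", "iphone", "mac"},
    "r3": {"fruit", "tree"},
    "r4": {"mac"},
}
picked, terms = greedy_max_coverage(toy_results, k=2)
```

With these toy inputs the first pick is the result with the most terms, and the second pick is the one that contributes the most new terms rather than the one that overlaps with what is already covered, which is the diversification effect.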
Ubiquitous Knowledge Processing Lab (UKP-TUDA)
www.ukp.tu-darmstadt.de In this paper, we describe the UKP Lab system participating in the SemEval-2013 task “Word Sense Induction and Disambiguation within an End-User Application”. Our approach uses preprocessing, co-occurrence extraction, graph clustering, and a state-of-the-art word sense disambiguation system. We developed a configurable pipeline which can be used to integrate and evaluate other components for the various steps of the complex