Retrofitting word vectors to semantic lexicons
In Proceedings of NAACL 2015
"... Vector space word representations are typically learned using only co-occurrence statistics from text corpora. Although such statistics are informative, they disre-gard easily accessible (and often carefully curated) information archived in se-mantic lexicons such as WordNet, FrameNet, and the Parap ..."
Abstract
-
Cited by 6 (0 self)
- Add to MetaCart
(Show Context)
Abstract: Vector space word representations are typically learned using only co-occurrence statistics from text corpora. Although such statistics are informative, they disregard easily accessible (and often carefully curated) information archived in semantic lexicons such as WordNet, FrameNet, and the Paraphrase Database. This paper proposes a technique to leverage both distributional and lexicon-derived evidence to obtain better representations. We run belief propagation on a word type graph constructed from word similarity information from lexicons to encourage connected (related) words to have similar representations, and also to be close to the unsupervised vectors. Evaluated on a battery of standard lexical semantic evaluation tasks in several languages, using several different underlying word vector models, we obtain substantially improved vectors and consistently outperform existing approaches to incorporating semantic knowledge in word vectors.
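The propagation step this abstract describes has a simple closed form in the special case of uniform edge weights: each word's vector is repeatedly replaced by a weighted average of its original vector and its lexicon neighbours' current vectors. Below is a minimal sketch of that update, assuming uniform weights alpha and beta (the paper's belief-propagation formulation is more general):

```python
import numpy as np

def retrofit(vectors, lexicon, iterations=10, alpha=1.0, beta=1.0):
    """Pull each word toward its lexicon neighbours while staying close
    to its original (distributional) vector.

    vectors: dict word -> 1-D numpy array (unsupervised embeddings)
    lexicon: dict word -> list of related words (e.g. WordNet synonyms)
    """
    new_vecs = {w: v.copy() for w, v in vectors.items()}
    for _ in range(iterations):
        for word in new_vecs:
            neighbours = [n for n in lexicon.get(word, []) if n in new_vecs]
            if not neighbours:
                continue  # no lexicon evidence: keep the original vector
            # Closed-form coordinate update: weighted average of the original
            # vector and the current neighbour vectors.
            total = alpha * vectors[word] + beta * sum(new_vecs[n] for n in neighbours)
            new_vecs[word] = total / (alpha + beta * len(neighbours))
    return new_vecs
```

Each pass is an exact coordinate update on a convex objective, so in practice a handful of iterations is usually enough.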
PPDB 2.0: Better paraphrase ranking, fine-grained entailment relations, word embeddings, and style classification
"... We present a new release of the Para-phrase Database. PPDB 2.0 includes a discriminatively re-ranked set of para-phrases that achieve a higher correlation with human judgments than PPDB 1.0’s heuristic rankings. Each paraphrase pair in the database now also includes fine-grained entailment relations ..."
Abstract
-
Cited by 1 (0 self)
- Add to MetaCart
Abstract: We present a new release of the Paraphrase Database. PPDB 2.0 includes a discriminatively re-ranked set of paraphrases that achieve a higher correlation with human judgments than PPDB 1.0's heuristic rankings. Each paraphrase pair in the database now also includes fine-grained entailment relations, word embedding similarities, and style annotations.
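One plausible reading of "discriminatively re-ranked" is a supervised regression model fit to human judgments of paraphrase quality and then applied across the whole database. The sketch below assumes scikit-learn and hypothetical feature matrices; the paper's actual features and model are not specified in this abstract:

```python
from sklearn.linear_model import Ridge

def train_reranker(pair_features, human_scores):
    """Fit a regression re-ranker on a human-annotated subset.

    pair_features: (n_pairs, n_features) array of paraphrase-pair features,
    e.g. heuristic PPDB 1.0 scores and distributional similarities (hypothetical).
    human_scores: quality judgments collected for those pairs.
    """
    model = Ridge(alpha=1.0)
    model.fit(pair_features, human_scores)
    return model

def rescore_database(model, all_pair_features):
    # Replace the heuristic ranking with predicted human-judgment scores.
    return model.predict(all_pair_features)
```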
Named Entity Recognition for Chinese Social Media with Jointly Trained Embeddings
"... We consider the task of named entity recognition for Chinese social media. The long line of work in Chinese NER has fo-cused on formal domains, and NER for social media has been largely restricted to English. We present a new corpus of Weibo messages annotated for both name and nominal mentions. Add ..."
Abstract
-
Cited by 1 (1 self)
- Add to MetaCart
Abstract: We consider the task of named entity recognition for Chinese social media. The long line of work in Chinese NER has focused on formal domains, and NER for social media has been largely restricted to English. We present a new corpus of Weibo messages annotated for both name and nominal mentions. Additionally, we evaluate three types of neural embeddings for representing Chinese text. Finally, we propose a joint training objective for the embeddings that makes use of both NER-labeled and unlabeled raw text. Our methods yield a 9% improvement over a state-of-the-art baseline.
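The joint objective pairs a supervised tagging loss on the labeled corpus with an unsupervised context-prediction loss on raw text, both driving one shared embedding table. A sketch of one way to realise such an objective in PyTorch, with illustrative layer choices (the paper's tagger and exact losses may differ):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class JointEmbeddingModel(nn.Module):
    """One embedding table, two losses: supervised NER tagging on labeled
    sentences, skip-gram-style context prediction on raw text."""
    def __init__(self, vocab_size, emb_dim, num_tags):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        self.tagger = nn.Linear(emb_dim, num_tags)     # stand-in for a richer tagger
        self.context = nn.Linear(emb_dim, vocab_size)  # predicts context words

    def ner_loss(self, token_ids, tag_ids):
        logits = self.tagger(self.emb(token_ids))      # (batch, seq, num_tags)
        return F.cross_entropy(logits.reshape(-1, logits.size(-1)), tag_ids.reshape(-1))

    def raw_text_loss(self, centre_ids, context_ids):
        logits = self.context(self.emb(centre_ids))    # (batch, vocab)
        return F.cross_entropy(logits, context_ids)

# Joint objective: the unlabeled term shapes the shared embeddings too.
# loss = model.ner_loss(tokens, tags) + lam * model.raw_text_loss(centres, contexts)
```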
Convolutional Neural Network Based Semantic Tagging with Entity Embeddings
"... Abstract Unsupervised word embeddings provide rich linguistic and conceptual information about words. However, they may provide weak information about domain specific semantic relations for certain tasks such as semantic parsing of natural language queries, where such information about words or phr ..."
Abstract
- Add to MetaCart
(Show Context)
Abstract: Unsupervised word embeddings provide rich linguistic and conceptual information about words. However, they may provide only weak information about domain-specific semantic relations for certain tasks, such as semantic parsing of natural language queries, where such information about words or phrases can be valuable. To encode prior knowledge about semantic word relations, we extend the neural network based lexical word embedding objective function by incorporating information about relationships between entities that we extract from knowledge bases.
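A hedged sketch of the kind of objective extension the abstract suggests: alongside the usual word-embedding loss, add a penalty that pulls together the embeddings of entities related in a knowledge base. The pairing scheme and weight below are assumptions, not the paper's specification:

```python
import numpy as np

def entity_relation_penalty(emb, related_pairs, weight=0.1):
    """Extra loss term (with gradients) encouraging related entities to
    have similar embeddings.

    emb: dict entity/word -> 1-D numpy array
    related_pairs: list of (e1, e2) entity pairs extracted from a knowledge base
    """
    loss = 0.0
    grads = {w: np.zeros_like(v) for w, v in emb.items()}
    for e1, e2 in related_pairs:
        diff = emb[e1] - emb[e2]
        loss += weight * float(diff @ diff)   # squared distance between related entities
        grads[e1] += 2 * weight * diff
        grads[e2] -= 2 * weight * diff
    return loss, grads  # added to the base embedding loss and its gradients
```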
Component-Enhanced Chinese Character Embeddings
"... Distributed word representations are very useful for capturing semantic information and have been successfully applied in a variety of NLP tasks, especially on En-glish. In this work, we innovatively de-velop two component-enhanced Chinese character embedding models and their bi-gram extensions. Dis ..."
Abstract
- Add to MetaCart
(Show Context)
Abstract: Distributed word representations are very useful for capturing semantic information and have been successfully applied in a variety of NLP tasks, especially in English. In this work, we develop two component-enhanced Chinese character embedding models and their bigram extensions. Unlike English word embeddings, our models explore the compositions of Chinese characters, which often inherently serve as semantic indicators. Evaluations on both word similarity and text classification demonstrate the effectiveness of our models.
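The core idea here is compositional: a character's representation draws on the embeddings of its graphical components (such as radicals) in addition to its own. A minimal sketch of one plausible composition, concatenating the character embedding with the mean of its component embeddings (the paper's exact operator and component inventory may differ):

```python
import numpy as np

def compose_char_vector(char, char_emb, comp_emb, components):
    """char_emb: dict character -> vector; comp_emb: dict component -> vector;
    components: dict character -> list of its components (e.g. radicals)."""
    comp_dim = next(iter(comp_emb.values())).shape[0]
    comps = [c for c in components.get(char, []) if c in comp_emb]
    comp_vec = (np.mean([comp_emb[c] for c in comps], axis=0)
                if comps else np.zeros(comp_dim))
    # Concatenate character-level and component-level evidence.
    return np.concatenate([char_emb[char], comp_vec])
```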
KNET: A General Framework for Learning Word Embedding using Morphological Knowledge
"... Neural network techniques are widely applied to obtain high-quality distributed representations of words, i.e., word embeddings, to address text mining, information retrieval, and natural language processing tasks. Recently, efficient methods have been proposed to learn word embeddings from context ..."
Abstract
- Add to MetaCart
(Show Context)
Abstract: Neural network techniques are widely applied to obtain high-quality distributed representations of words, i.e., word embeddings, for text mining, information retrieval, and natural language processing tasks. Recently, efficient methods have been proposed to learn word embeddings from context, capturing both semantic and syntactic relationships between words. However, it is challenging to handle unseen or rare words with insufficient context. In this paper, inspired by studies of the word recognition process in cognitive psychology, we propose to take advantage of seemingly less obvious but essentially important morphological knowledge to address these challenges. In particular, we introduce a novel neural network architecture called KNET that leverages both contextual information and morphological word similarity built from morphological knowledge to learn word embeddings. Meanwhile, the learning architecture is also able to refine the pre-defined morphological knowledge and obtain more accurate word similarity. Experiments on an analogical reasoning task and a word similarity task both demonstrate that the proposed KNET framework can greatly enhance the effectiveness of word embeddings.
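The abstract's motivation for morphology is coverage of unseen or rare words. A toy sketch of that idea, embedding an out-of-vocabulary word as a surface-similarity-weighted average of known words; the n-gram similarity below is a crude stand-in for the pre-defined morphological knowledge that KNET additionally learns to refine:

```python
import numpy as np

def ngram_similarity(w1, w2, n=3):
    """Jaccard overlap of character n-grams; a crude proxy for
    morphological relatedness."""
    grams = lambda w: {w[i:i + n] for i in range(max(1, len(w) - n + 1))}
    g1, g2 = grams(w1), grams(w2)
    return len(g1 & g2) / max(1, len(g1 | g2))

def embed_rare_word(word, emb):
    """Back off to a similarity-weighted average of in-vocabulary vectors."""
    vocab = list(emb)
    weights = np.array([ngram_similarity(word, w) for w in vocab])
    if weights.sum() == 0:
        return None  # no morphological evidence either
    mat = np.stack([emb[w] for w in vocab])
    return weights @ mat / weights.sum()
```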