Enhanced Information Access to Social Streams through Word Clouds with Entity Grouping
Abstract: Intuitive and effective access to large volumes of information is increasingly important. As social media grows as a useful source of information, methods are required to access these large volumes of user-generated content. Word clouds are an effective information access tool. However, those generated over social media data often depict redundant and mis-ranked entries. This limits the users' ability to browse and explore datasets. This paper proposes a method for improving word cloud generation over social streams. Named entity expressions in tweets are detected, disambiguated and aggregated into entity clusters. A word cloud is generated from terms that represent the most relevant entity clusters. We find that word clouds with grouped named entities attain significantly broader coverage and significantly decreased content duplication. Further, access to relevant entries in the collection is improved. An extrinsic crowdsourced user evaluation of generated word clouds was performed. Word clouds with grouped named entities are rated as significantly more relevant and more diverse with respect to the baseline. In addition, we found that word clouds with higher levels of Mean Average Precision (MAP) are more likely to be rated by users as being relevant to the concepts reflected. Critically, this supports MAP as a tool for predicting word cloud quality without requiring a human in the loop.
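For readers unfamiliar with the metric named in this abstract, the following is a minimal sketch of how Mean Average Precision is commonly computed over ranked lists. It is not taken from the paper: the function names and the boolean relevance encoding are illustrative assumptions, and the sketch assumes every relevant item appears somewhere in each ranked list, so average precision is normalised by the number of relevant items found.

```python
from typing import Sequence


def average_precision(relevance: Sequence[bool]) -> float:
    """Average precision for one ranked list (hypothetical helper).

    relevance[i] is True when the item at rank i + 1 is relevant.
    Assumes all relevant items appear in the list.
    """
    hits = 0
    precision_sum = 0.0
    for rank, is_relevant in enumerate(relevance, start=1):
        if is_relevant:
            hits += 1
            # Precision at this rank: relevant items seen so far / rank.
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


def mean_average_precision(rankings: Sequence[Sequence[bool]]) -> float:
    """Mean of the per-list average precision values."""
    if not rankings:
        return 0.0
    return sum(average_precision(r) for r in rankings) / len(rankings)
```

For a ranked list with relevant items at ranks 1 and 3, average precision is (1/1 + 2/3) / 2; a list that places relevant entries higher scores closer to 1.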
Information Processing and Management, 2014
tweets (a form of microblog) is a challenging, emerging research area. Unlike carefully & Dredze, carefully w
Analysis of Named Entity Recognition and Linking for Tweets
Applying natural language processing for mining and intelligent information access to tweets (a form of microblog) is a challenging, emerging research area. Unlike carefully authored news text and other longer content, tweets pose a number of new challenges, due to their short, noisy, context-dependent, and dynamic nature. Information extraction from tweets is typically performed in a pipeline, comprising consecutive stages of language identification, tokenisation, part-of-speech tagging, named entity recognition and entity disambiguation (e.g. with respect to DBpedia). In this work, we describe a new Twitter entity disambiguation dataset, and conduct an empirical analysis of named entity recognition and disambiguation, investigating how robust a number of state-of-the-art systems are on such noisy texts, what the main sources of error are, and which problems should be further investigated to improve the state of the art.