Results 1 - 10 of 22
Latent Aspect Rating Analysis on Review Text Data: A Rating Regression Approach
Cited by 45 (4 self)
In this paper, we define and study a new opinionated text data analysis problem called Latent Aspect Rating Analysis (LARA), which aims at analyzing opinions expressed about an entity in an online review at the level of topical aspects to discover each individual reviewer's latent opinion on each aspect as well as the relative emphasis on different aspects when forming the overall judgment of the entity. We propose a novel probabilistic rating regression model to solve this new text mining problem in a general way. Empirical experiments on a hotel review data set show that the proposed latent rating regression model can effectively solve the problem of LARA, and that the detailed analysis of opinions at the level of topical aspects enabled by the proposed model can support a wide range of application tasks, such as aspect opinion summarization, entity ranking based on aspect ratings, and analysis of reviewers' rating behavior.
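As a rough illustration of the idea that an overall judgment reflects per-aspect opinions and the emphasis placed on each aspect, the sketch below combines aspect ratings with aspect weights. It is not the paper's model: in LARA both quantities are latent and inferred from review text, whereas here they are supplied as inputs. The function name `overall_rating`, the aspect names, and the numbers are hypothetical.

```python
def overall_rating(aspect_ratings, aspect_weights):
    """Weighted average of per-aspect ratings.

    Both dicts are keyed by aspect name. This is an illustrative
    assumption, not the probabilistic model proposed in the paper.
    """
    if aspect_ratings.keys() != aspect_weights.keys():
        raise ValueError("ratings and weights must cover the same aspects")
    total_weight = sum(aspect_weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    weighted_sum = sum(aspect_ratings[a] * aspect_weights[a] for a in aspect_ratings)
    return weighted_sum / total_weight


# Hypothetical hotel review: this reviewer cares most about location.
ratings = {"location": 5.0, "cleanliness": 3.0, "service": 4.0}
weights = {"location": 0.5, "cleanliness": 0.25, "service": 0.25}
print(overall_rating(ratings, weights))
```

Two reviewers with identical aspect ratings can therefore arrive at different overall ratings if their weights differ, which is the kind of rating behavior the abstract refers to.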
Social Context Summarization
- In Proceedings of SIGIR-11
Cited by 17 (4 self)
We study a novel problem of social context summarization for Web documents. Traditional summarization research has focused on extracting informative sentences from standard documents. With the rapid growth of online social networks, abundant user-generated content (e.g., comments) associated with the standard documents is available. Which parts of a document do social users really care about? How can we generate summaries for standard documents by considering both the informativeness of sentences and the interests of social users? This paper explores such an approach by modeling Web documents and social contexts in a unified framework. We propose a dual wing factor graph (DWFG) model, which utilizes the mutual reinforcement between Web documents and their associated social contexts to generate summaries. An efficient algorithm is designed to learn the proposed factor graph model. Experimental results on a Twitter data set validate the effectiveness of the proposed model. By leveraging the social context information, our approach obtains significant improvement (+5.0% to 17.3% on average) over several alternative methods (CRF, SVM, LR, PR, and DocLead) on summarization performance.
Micropinion Generation: An Unsupervised Approach to Generating Ultra-Concise Summaries of Opinions
Cited by 13 (3 self)
This paper presents a new unsupervised approach to generating ultra-concise summaries of opinions. We formulate the problem of generating such a micropinion summary as an optimization problem, where we seek a set of concise and non-redundant phrases that are readable and represent key opinions in text. We measure representativeness based on a modified mutual information function and model readability with an n-gram language model. We propose heuristic algorithms to solve this optimization problem efficiently. Evaluation results show that our unsupervised approach outperforms other state-of-the-art summarization methods and that the generated summaries are informative and readable.
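To make the two scoring signals concrete, here is a minimal sketch that scores a candidate phrase by a mutual-information term and a bigram language-model term. It uses plain smoothed pointwise mutual information and an add-one-smoothed bigram model as stand-ins; the paper's modified mutual information function, its language model, and its search heuristics are not reproduced here. All function names and the toy corpus are hypothetical.

```python
import math
from collections import Counter


def train_counts(sentences):
    """Count unigrams and adjacent-word bigrams over tokenized sentences."""
    unigrams, bigrams = Counter(), Counter()
    for tokens in sentences:
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams


def readability(phrase, unigrams, bigrams):
    """Average add-one-smoothed bigram log-probability of the phrase."""
    pairs = list(zip(phrase, phrase[1:]))
    if not pairs:
        return 0.0
    vocab_size = len(unigrams)
    log_prob = sum(
        math.log((bigrams[(x, y)] + 1) / (unigrams[x] + vocab_size))
        for x, y in pairs
    )
    return log_prob / len(pairs)


def representativeness(phrase, unigrams, bigrams):
    """Average smoothed PMI over adjacent word pairs (an assumed stand-in)."""
    pairs = list(zip(phrase, phrase[1:]))
    if not pairs:
        return 0.0
    n = sum(unigrams.values())
    pmi = sum(
        math.log(((bigrams[(x, y)] + 1) * n) / ((unigrams[x] + 1) * (unigrams[y] + 1)))
        for x, y in pairs
    )
    return pmi / len(pairs)


def phrase_score(phrase, unigrams, bigrams):
    """Sum of the two signals; the equal weighting is an arbitrary choice."""
    return representativeness(phrase, unigrams, bigrams) + readability(phrase, unigrams, bigrams)


corpus = [
    ["battery", "life", "is", "great"],
    ["great", "battery", "life"],
    ["screen", "is", "dim"],
]
uni, bi = train_counts(corpus)
print(phrase_score(["battery", "life"], uni, bi) > phrase_score(["life", "battery"], uni, bi))
```

On this toy corpus the phrase in its attested word order scores higher than the scrambled one on both signals, which is the behavior a readable and representative micropinion should exhibit.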
Content Models with Attitude
Cited by 13 (1 self)
We present a probabilistic topic model for jointly identifying properties and attributes of social media review snippets. Our model simultaneously learns a set of properties of a product and captures aggregate user sentiments towards these properties. This approach directly enables discovery of highly rated or inconsistent properties of a product. Our model admits an efficient variational mean-field inference algorithm which can be parallelized and run on large snippet collections. We evaluate our model on a large corpus of snippets from Yelp reviews to assess property and attribute prediction. We demonstrate that it outperforms applicable baselines by a considerable margin.
Shallow Information Extraction from Medical Forum Data
Cited by 7 (0 self)
We study a novel shallow information extraction problem that involves extracting sentences of a given set of topic categories from medical forum data. Given a corpus of medical forum documents, our goal is to extract two related types of sentences that describe a biomedical case (i.e., medical problem descriptions and medical treatment descriptions). Such an extraction task directly generates medical case descriptions that can be useful in many applications. We solve the problem using two popular machine learning methods: Support Vector Machines (SVM) and Conditional Random Fields (CRF). We propose novel features to improve the accuracy of extraction. Experimental results show that we can obtain an accuracy of up to 75%.
Comprehensive Review of Opinion Summarization
, 2011
Cited by 5 (1 self)
Last updated on: 2013/08/10. With the growth of the web over the last decade, opinions can now be found almost everywhere: blogs, social networking sites like Facebook and Twitter, news portals, e-commerce sites, etc. While these opinions are meant to be helpful, the vast availability
Automatic Aggregation by Joint Modeling of Aspects and Values
Cited by 5 (0 self)
We present a model for aggregation of product review snippets by joint aspect identification and sentiment analysis. Our model simultaneously identifies an underlying set of ratable aspects presented in the reviews of a product (e.g., sushi and miso for a Japanese restaurant) and determines the corresponding sentiment of each aspect. This approach directly enables discovery of highly rated or inconsistent aspects of a product. Our generative model admits an efficient variational mean-field inference algorithm. It is also easily extensible, and we describe several modifications and their effects on model structure and inference. We test our model on two tasks: joint aspect identification and sentiment analysis on a set of Yelp reviews, and aspect identification alone on a set of medical summaries. We evaluate the performance of the model on aspect identification, sentiment analysis, and per-word labeling accuracy. We demonstrate that our model outperforms applicable baselines by a considerable margin, yielding up to 32% relative error reduction on aspect identification and up to 20% relative error reduction on sentiment analysis.
Comparative News Summarization Using Linear Programming
Cited by 1 (0 self)
Comparative News Summarization aims to highlight the commonalities and differences between two comparable news topics. In this study, we propose a novel approach to generating comparative news summaries. We formulate the task as an optimization problem of selecting proper sentences to maximize the comparativeness within the summary and the representativeness to both news topics. We consider semantically related cross-topic concept pairs as comparative evidence, and topic-related concepts as representative evidence. The optimization problem is addressed using a linear programming model. The experimental results demonstrate the effectiveness of our proposed model.
Bringing Representativeness into Social Media Monitoring and Analysis
Cited by 1 (0 self)
The opinions, expectations and behavior of citizens are increasingly reflected online; therefore, mining the internet for such data can enhance decision-making in public policy, communications, marketing, finance and other fields. However, coming closer to the representativeness of classic opinion surveys requires knowledge, currently lacking, about the socio-demographic characteristics of those voicing opinions on the internet. This paper proposes to calibrate online opinions aggregated from multiple and heterogeneous data sources with traditional surveys enhanced with rich socio-demographic information to enable insights into which opinions are expressed on the internet by specific segments of society. The goal of this research is to provide professionals in citizen- and consumer-centered domains with more concise near real-time intelligence on online opinions. To become effective, the methodologies presented in this paper must be integrated into a coherent decision support system.