Subjectivity and sentiment analysis of arabic: A survey
- In Advanced Machine Learning Technologies and Applications, 2012
Cited by 5 (0 self)
Subjectivity and sentiment analysis (SSA) has recently gained considerable attention, but most of the resources and systems built so far are tailored to English and other Indo-European languages. The need for designing systems for other languages is increasing, especially as blogging and micro-blogging websites become popular throughout the world. This paper surveys different techniques for SSA for Arabic. After a brief synopsis about Arabic, we describe the main existing techniques and test corpora for Arabic SSA that have been introduced in the literature.
Automatic Detection of Point of View Differences in
We investigate differences in point of view (POV) between two objective documents, where one is describing the subject matter in a more positive/negative way than the other, and present an automatic method for detecting such POV differences. We use Amazon Mechanical Turk (AMT) to annotate sentences as positive, negative or neutral based on their POV towards a given target. A statistical classifier is trained to predict the POV score of a document, which reflects how positive/negative the document’s POV towards its target is. The results of our experiments on a set of articles in the Arabic and English Wikipedias from the people category show that our method successfully detects POV differences.
Multi-View AdaBoost for Multilingual Subjectivity Analysis
Subjectivity analysis has received increasing attention in the natural language processing field. However, most subjectivity analysis work is conducted on single languages. In this paper, we propose to perform multilingual subjectivity analysis by combining multi-view learning and AdaBoost techniques. We aim to show that by boosting multi-view classifiers we can develop more effective multilingual subjectivity analysis tools for new languages as well as increase the classification performance for English data. We empirically evaluate our two multi-view AdaBoost approaches on the multilingual MPQA dataset. The experimental results show that the multi-view AdaBoost approaches significantly outperform existing monolingual and multilingual methods.
Building A Sentiment Analysis Corpus With Multifaceted Hierarchical Annotation
A corpus is a collection of documents. An annotated corpus consists of documents or entities annotated with task-related labels such as part-of-speech tags, sentiment, etc. While it is customary to annotate a document for a specific task, it is also possible to annotate it for multiple tasks, resulting in a multifaceted annotation scheme. These annotations can be organized hierarchically if such a scheme occurs naturally in the data, resulting in a hierarchical text categorization problem. We developed a multifaceted, multilingual corpus for hierarchical sentiment analysis. The different facets include hierarchical nominal sentiment labels, a numerical sentiment score, language, and dialect. Our corpus consists of 191K reviews of hotels in Saudi Arabia. The reviews are divided into eleven different categories. Within each category, the reviews are further divided into positive and negative categories. The corpus contains 1.8 million tokens. Reviews are mostly written in Arabic and English, but there are instances of other languages too.
Arabic Sentiment Analysis: A Survey
Most social media commentary in the Arabic language space is written in unstructured, non-grammatical slang Arabic, presenting complex challenges for sentiment analysis and opinion extraction of online commentary and microblogging data in this important domain. This paper provides a comprehensive analysis of the important research works in the field of Arabic sentiment analysis. An in-depth qualitative analysis of the various features of the research works is carried out and a summary of objective findings is presented. We used smoothness analysis to evaluate the percentage error of the performance scores reported in the studies relative to their linearly projected values (smoothness), which is an estimate of the influence of the different approaches used by the authors on the performance scores obtained. To solve a bounding issue with the data as it was reported, we modified an existing logarithmic smoothing technique and applied it to pre-process the performance scores before the analysis. Our results from the analysis have been reported and interpreted for the various
Sentiment and Behaviour Annotation in a Corpus of Dialogue Summaries
Idioms-Proverbs Lexicon for Modern Standard Arabic and Colloquial Sentiment Analysis
Although a fair amount of work has been done on sentiment analysis (SA) and opinion mining (OM) systems in the last decade, the performance of these systems is still not at the desired level, especially for morphologically rich languages (MRLs) such as Arabic, due to the complexities and challenges inherent in the language itself. One of these challenges is the detection of idioms or proverbs within the writer's text or comment. An idiom or proverb is a form of speech or an expression that is peculiar to itself. It cannot be understood from the individual meanings of its elements and can yield a different sentiment when treated as separate words. Consequently, in order to facilitate the detection and classification of lexical phrases for automated SA systems, this paper presents AIPSeLEX, a novel idioms/proverbs sentiment lexicon for Modern Standard Arabic (MSA) and colloquial Arabic. AIPSeLEX is manually collected and annotated at sentence level with semantic orientation (positive or negative). The efforts of manually building and annotating the lexicon are reported. Moreover, we build a classifier that extracts idiom and proverb phrases from text using n-gram and similarity measure methods. Finally, several experiments were carried out on various data, including Arabic tweets and Arabic microblogs (hotel reservation, product reviews, and TV program comments) from publicly available Arabic online review websites (social media, blogs, forums, e-commerce web sites) to evaluate the coverage and accuracy of
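The abstract mentions extracting idiom and proverb phrases with n-gram and similarity measure methods but does not give details. The following is only a minimal sketch of that general idea, not the authors' classifier: the toy English lexicon, the function names `ngrams` and `find_idioms`, the 0.8 threshold, and the use of `difflib.SequenceMatcher` as the similarity measure are all assumptions made for illustration.

```python
from difflib import SequenceMatcher

# Hypothetical toy lexicon (idiom -> polarity); not taken from AIPSeLEX.
IDIOM_LEXICON = {
    "piece of cake": "positive",
    "pain in the neck": "negative",
}


def ngrams(tokens, n):
    """Return all contiguous n-token windows, each joined into one string."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def find_idioms(text, lexicon=IDIOM_LEXICON, threshold=0.8):
    """Return (candidate, idiom, polarity) for n-grams similar to a lexicon idiom."""
    tokens = text.lower().split()
    matches = []
    for idiom, polarity in lexicon.items():
        n = len(idiom.split())  # compare against windows of the same length
        for candidate in ngrams(tokens, n):
            # Character-level similarity in [0, 1]; tolerates small misspellings.
            score = SequenceMatcher(None, candidate, idiom).ratio()
            if score >= threshold:
                matches.append((candidate, idiom, polarity))
    return matches
```

Calling `find_idioms("this task is a piece of cake")` returns the exact match with its polarity, and a slightly misspelled phrase can still match because the comparison is fuzzy rather than exact.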
LABR: A Large Scale Arabic Book Reviews
We introduce LABR, the largest sentiment analysis dataset to date for the Arabic language. It consists of over 63,000 book reviews, each rated on a scale of 1 to 5 stars. We investigate the properties of the dataset and present its statistics. We explore using the dataset for two tasks: sentiment polarity classification and ratings classification. We provide standard splits of the dataset into training, validation and testing, for both polarity and ratings classification, in both balanced and unbalanced settings. We extend the work done in Aly and Atiya [2013] by performing a comprehensive analysis on the dataset. In particular, we perform an extended survey of the different classifiers typically used for the sentiment polarity classification problem. We also construct a sentiment lexicon from the dataset that contains both single and compound sentiment words, and we explore its effectiveness.
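To make the two tasks and the balanced setting concrete, here is a minimal sketch of deriving polarity labels from star ratings and drawing a class-balanced sample. The abstract does not state how ratings map to polarity, so the mapping used here (1 to 2 stars negative, 4 to 5 positive, 3 dropped) is an assumption, as are the function names `rating_to_polarity` and `balanced_sample` and the `(text, stars)` input format.

```python
import random
from collections import defaultdict


def rating_to_polarity(stars):
    """Map a 1-5 star rating to a polarity label (assumed mapping)."""
    if stars <= 2:
        return "negative"
    if stars >= 4:
        return "positive"
    return None  # assumed: 3-star reviews are left out of the polarity task


def balanced_sample(reviews, seed=0):
    """Downsample (text, stars) pairs so each polarity class has equal size."""
    by_label = defaultdict(list)
    for text, stars in reviews:
        label = rating_to_polarity(stars)
        if label is not None:
            by_label[label].append((text, label))
    if not by_label:
        return []
    size = min(len(items) for items in by_label.values())
    rng = random.Random(seed)  # fixed seed keeps the sample reproducible
    sample = []
    for label in sorted(by_label):
        sample.extend(rng.sample(by_label[label], size))
    return sample
```

An unbalanced setting would simply keep every labelled review; the balanced one trades data volume for equal class priors.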
Hierarchical Classifiers for Multi-Way Sentiment Analysis of Arabic Reviews
Sentiment Analysis (SA) is one of the hottest fields in data mining (DM) and natural language processing (NLP). The goal of SA is to extract the sentiment conveyed in a certain text based on its content. While most current works focus on the simple problem of determining whether the sentiment is positive or negative, Multi-Way Sentiment Analysis (MWSA) focuses on sentiments conveyed through a rating or scoring system (e.g., a 5-star scoring system). In such scoring systems, the sentiments conveyed in two reviews with close scores (such as 4 stars and 5 stars) can be very similar, creating an added challenge compared to traditional SA. One intuitive way of handling this challenge is via a divide-and-conquer approach where the MWSA problem is divided into a set of sub-problems, allowing the use of customized classifiers to differentiate between reviews with close scores. A hierarchical classification structure can be used with
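The divide-and-conquer idea can be sketched as a two-level structure: a root classifier picks a coarse group, and a group-specific classifier then separates the close scores inside it. This is only an illustration under stated assumptions, not the paper's architecture: the class name `HierarchicalClassifier`, the "low"/"high" grouping, and the keyword-based stand-in classifiers are all hypothetical, and real systems would plug in trained models.

```python
from typing import Callable, Dict


class HierarchicalClassifier:
    """Route text through a coarse root classifier, then a per-group specialist."""

    def __init__(self, root: Callable[[str], str],
                 children: Dict[str, Callable[[str], int]]):
        self.root = root          # predicts a coarse group name
        self.children = children  # group name -> classifier predicting a star score

    def predict(self, text: str) -> int:
        group = self.root(text)
        return self.children[group](text)


# Hypothetical keyword rules standing in for trained classifiers.
def toy_root(text: str) -> str:
    words = text.lower().split()
    return "low" if ("bad" in words or "terrible" in words) else "high"


def toy_low(text: str) -> int:
    return 1 if "terrible" in text.lower().split() else 2


def toy_high(text: str) -> int:
    return 5 if "excellent" in text.lower().split() else 4


model = HierarchicalClassifier(toy_root, {"low": toy_low, "high": toy_high})
```

Each child only has to distinguish neighbouring scores, so it can use features or training data tailored to that narrow decision.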
A HYBRID METHOD USING LEXICON-BASED APPROACH AND NAIVE BAYES CLASSIFIER FOR ARABIC OPINION QUESTION ANSWERING
Opinion Question Answering (Opinion QA) is the task of enabling users to explore others' opinions toward a particular service or product in order to make decisions. Arabic Opinion QA is more challenging than in other languages because of Arabic's complex morphology and its many dialectal varieties. Moreover, few research efforts and resources focus on Opinion QA in Arabic. This study aims to address the difficulties of Arabic Opinion QA by proposing a hybrid method that combines a lexicon-based approach with classification using a Naïve Bayes classifier. The proposed method contains pre-processing phases such as transformation, normalization and tokenization, and exploits auxiliary information (a thesaurus). The lexicon-based approach is executed by replacing some words with their synonyms using the domain dictionary. The classification task is performed by a Naïve Bayes classifier, which classifies the opinions by positive or negative sentiment polarity. The proposed method has been evaluated using the common information retrieval metrics, i.e., Precision, Recall and F-measure. For comparison, three classifiers have been applied: Naïve Bayes (NB), Support Vector Machine (SVM) and K-Nearest Neighbor (KNN). The experimental results have demonstrated that NB outperforms SVM and KNN by achieving 91% accuracy.
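The pipeline described above (synonym replacement from a domain dictionary, then Naïve Bayes classification, then precision/recall/F-measure) can be sketched roughly as follows. This is a toy illustration, not the study's implementation: the English `SYNONYMS` dictionary, the whitespace tokenizer, the Laplace-smoothed multinomial model, and all function and class names are assumptions.

```python
import math
from collections import Counter, defaultdict

# Hypothetical domain dictionary mapping variants to one canonical word.
SYNONYMS = {"great": "good", "excellent": "good", "awful": "bad", "poor": "bad"}


def normalize(text, synonyms=SYNONYMS):
    """Lowercase, split on whitespace, and replace words with their canonical synonym."""
    return [synonyms.get(tok, tok) for tok in text.lower().split()]


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.word_counts = defaultdict(Counter)
        self.class_counts = Counter(labels)
        self.vocab = set()
        for text, label in zip(texts, labels):
            tokens = normalize(text)
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, text):
        tokens = normalize(text)
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            score = math.log(n_docs / total_docs)  # log prior
            total_words = sum(self.word_counts[label].values())
            for tok in tokens:
                count = self.word_counts[label][tok]
                score += math.log((count + 1) / (total_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def precision_recall_f1(y_true, y_pred, positive="positive"):
    """Compute precision, recall and F-measure for one target class."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1
```

Mapping synonyms to a shared token before counting lets sparse training data pool evidence across word variants, which is the apparent motivation for the lexicon step.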