Results 1 - 10 of 1,680
Opinion Mining and Sentiment Analysis
, 2008
"... An important part of our information-gathering behavior has always been to find out what other people think. With the growing availability and popularity of opinion-rich resources such as online review sites and personal blogs, new opportunities and challenges arise as people now can, and do, active ..."
Cited by 749 (3 self)
Rich feature hierarchies for accurate object detection and semantic segmentation
"... Object detection performance, as measured on the canonical PASCAL VOC dataset, has plateaued in the last few years. The best-performing methods are complex ensemble systems that typically combine multiple low-level image features with high-level context. In this paper, we propose a simple and scala ..."
Cited by 251 (23 self)
Object categorization by learned universal visual dictionary
- In ICCV
, 2005
"... This paper presents a new algorithm for the automatic recognition of object classes from images (categorization). Compact and yet discriminative appearance-based object class models are automatically learned from a set of training images. The method is simple and extremely fast, making it suitable ..."
Cited by 302 (8 self)
from the web and iii) images from the recently released Pascal dataset. The proposed algorithm performs well on both texture-rich objects (e.g. grass, sky, trees) and structure-rich ones (e.g. cars, bikes, planes).
Towards rich mobile phone datasets: Lausanne data collection campaign
- In Proceedings of the 7th International Conference on Pervasive Services
, 2010
Cited by 50 (14 self)
Mobile phones have recently been used to collect large-scale continuous data about human behavior. In a paradigm known as people-centric sensing, users are not only the carriers of sensing devices, but also the sources and consumers of sensed events. This paper describes a data collection campaign wherein Nokia N95 phones are allocated to a heterogeneous sample of nearly 170 participants from Lausanne, a mid-tier city in Switzerland, to be used over a period of one year. The data collection software runs in the background of the phones in a non-intrusive manner, yielding data on modalities such as social interaction and spatial behavior. The main motivations for organizing a new campaign on top of the ones that have been successfully conducted in the past are the following: First, in comparison to the Reality Mining data, generated in 2004-2005, the present data set is expected to provide a richer means to study location attributes, in particular because today’s mobile phones are more powerful and equipped with more sensors. Second, we aim to recruit a heterogeneous set of participants, comprising family- and leisure-related social networks in addition to organizationally driven ones. This paper provides a methodological description of the project and shows the potential of the resulting data set in terms of illuminating multiple aspects of human behavior.
The VNC-Tokens Dataset
- In proceedings of the MWE workshop. ACL
, 2008
"... Idiomatic expressions formed from a verb and a noun in its direct object position are a productive cross-lingual class of multiword expressions, which can be used both idiomatically and as a literal combination. This paper presents the VNC-Tokens dataset, a resource of almost 3000 English verb–noun ..."
Cited by 11 (0 self)
Rich Probabilistic Models for Gene Expression
, 2001
"... Clustering is commonly used for analyzing gene expression data. Despite their successes, clustering methods suffer from a number of limitations. First, these methods reveal similarities that exist over all of the measurements, while obscuring relationships that exist over only a subset of the data. ..."
Cited by 89 (8 self)
Second, clustering methods cannot readily incorporate additional types of information, such as clinical data or known attributes of genes. To circumvent these shortcomings, we propose the use of a single coherent probabilistic model that encompasses much of the rich structure in the genomic expression
The Arabic Online Commentary Dataset: an Annotated Dataset of Informal Arabic with High Dialectal Content
"... The written form of Arabic, Modern Standard Arabic (MSA), differs quite a bit from the spoken dialects of Arabic, which are the true “native” languages of Arabic speakers used in daily life. However, due to MSA’s prevalence in written form, almost all Arabic datasets have predominantly MSA content. ..."
Cited by 8 (1 self)
We present the Arabic Online Commentary Dataset, a 52M-word monolingual dataset rich in dialectal content, and we describe our long-term annotation effort to identify the dialect level (and dialect itself) in each sentence of the dataset. So far, we have labeled 108K sentences, 41% of which as having
Collaborative Web UI Localization, or How to Build Feature-rich Multilingual Datasets
"... We present a method to generate feature-rich multilingual parallel datasets for machine translation systems, including e.g. type of widget, user’s locale, or geolocation. To support this argument, we have developed a bookmarklet that instruments arbitrary websites so that casual end users can modi ..."
Cited by 1 (1 self)
Efficient Mining under Rich Constraints Derived from Various Datasets
- Post-proceedings of the 5th International Workshop on Knowledge Discovery in Inductive Databases in conjunction with ECML/PKDD 2006 (KDID’06)
, 2007
"... Mining patterns under many kinds of constraints is key to successfully acquiring new knowledge. In this paper, we propose an efficient new algorithm, Music-dfs, which soundly and completely mines patterns with various constraints from large data and takes into account external data represe ..."
Cited by 1 (1 self)
represented by several heterogeneous datasets. Constraints are freely built from a large set of primitives and make it possible to link the information scattered across various knowledge sources. Efficiency is achieved thanks to a new closure operator providing an interval pruning strategy applied during the depth
The DBpedia Events Dataset
"... Wikipedia is the largest encyclopedia worldwide and is frequently updated by thousands of collaborators. A large part of the knowledge in Wikipedia is not static, but frequently updated, e.g., political events or new movies. This makes Wikipedia an extremely rich, crowdsourced informatio ..."