Results 1 - 10 of 1,680

Opinion Mining and Sentiment Analysis

by Bo Pang, Lillian Lee, 2008
"... An important part of our information-gathering behavior has always been to find out what other people think. With the growing availability and popularity of opinion-rich resources such as online review sites and personal blogs, new opportunities and challenges arise as people now can, and do, active ..."
Cited by 749 (3 self)

Rich feature hierarchies for accurate object detection and semantic segmentation

by Ross Girshick, Jeff Donahue, Trevor Darrell, Jitendra Malik
"... Object detection performance, as measured on the canonical PASCAL VOC dataset, has plateaued in the last few years. The best-performing methods are complex ensemble systems that typically combine multiple low-level image features with high-level context. In this paper, we propose a simple and scala ..."
Cited by 251 (23 self)

Object categorization by learned universal visual dictionary

by J. Winn, A. Criminisi, T. Minka - In ICCV, 2005
"... This paper presents a new algorithm for the automatic recognition of object classes from images (categorization). Compact and yet discriminative appearance-based object class models are automatically learned from a set of training images. The method is simple and extremely fast, making it suitable ..."
Cited by 302 (8 self)
from the web and iii) images from the recently released Pascal dataset. The proposed algorithm performs well on both texture-rich objects (e.g. grass, sky, trees) and structure-rich ones (e.g. cars, bikes, planes).

Towards rich mobile phone datasets: Lausanne data collection campaign

by Niko Kiukkonen, Jan Blom, Olivier Dousse, Daniel Gatica-Perez, Juha Laurila - In Proceedings of the 7th International Conference on Pervasive Services, 2010
Cited by 50 (14 self)
Mobile phones have recently been used to collect large-scale continuous data about human behavior. In a paradigm known as people centric sensing, users are not only the carriers of sensing devices, but also the sources and consumers of sensed events. This paper describes a data collection campaign wherein Nokia N95 phones are allocated to a heterogeneous sample of nearly 170 participants from Lausanne, a mid-tier city in Switzerland, to be used over a period of one year. The data collection software runs on the background of the phones in a non-intrusive manner, yielding data on modalities such as social interaction and spatial behavior. The main motivations for organizing a new campaign on top of the ones that have been successfully conducted in the past are the following: First, in comparison to the Reality Mining data, generated in 2004-2005, the present data set is expected to provide a richer means to study location attributes, in particular, because today’s mobile phones are more powerful and equipped with more sensors. Second, we aim to recruit a heterogeneous set of participants, comprising family and leisure related social networks in addition to organizationally driven ones. This paper provides a methodological description of the project and shows the potential of the resulting data set in terms of illuminating multiple aspects of human behavior.

The VNC-Tokens Dataset

by Paul Cook, Afsaneh Fazly, Suzanne Stevenson - In Proceedings of the MWE Workshop, ACL, 2008
"... Idiomatic expressions formed from a verb and a noun in its direct object position are a productive cross-lingual class of multiword expressions, which can be used both idiomatically and as a literal combination. This paper presents the VNC-Tokens dataset, a resource of almost 3000 English verb–noun ..."
Cited by 11 (0 self)

Rich Probabilistic Models for Gene Expression

by Eran Segal, Ben Taskar, Audrey Gasch, Nir Friedman, Daphne Koller, 2001
"... Clustering is commonly used for analyzing gene expression data. Despite their successes, clustering methods suffer from a number of limitations. First, these methods reveal similarities that exist over all of the measurements, while obscuring relationships that exist over only a subset of the data. ..."
Cited by 89 (8 self)
Second, clustering methods cannot readily incorporate additional types of information, such as clinical data or known attributes of genes. To circumvent these shortcomings, we propose the use of a single coherent probabilistic model, that encompasses much of the rich structure in the genomic expression

The Arabic Online Commentary Dataset: an Annotated Dataset of Informal Arabic with High Dialectal Content

by Omar F. Zaidan, Chris Callison-Burch
"... The written form of Arabic, Modern Standard Arabic (MSA), differs quite a bit from the spoken dialects of Arabic, which are the true “native” languages of Arabic speakers used in daily life. However, due to MSA’s prevalence in written form, almost all Arabic datasets have predominantly MSA content. ..."
Cited by 8 (1 self)
We present the Arabic Online Commentary Dataset, a 52M-word monolingual dataset rich in dialectal content, and we describe our long-term annotation effort to identify the dialect level (and dialect itself) in each sentence of the dataset. So far, we have labeled 108K sentences, 41% of which as having

Collaborative Web UI Localization, or How to Build Feature-rich Multilingual Datasets

by Vicent Alabau, Luis A. Leiva
"... We present a method to generate feature-rich multilingual parallel datasets for machine translation systems, including e.g. type of widget, user’s locale, or geolocation. To support this argument, we have developed a bookmarklet that instruments arbitrary websites so that casual end users can modi ..."
Cited by 1 (1 self)

Efficient Mining under Rich Constraints Derived from Various Datasets

by Arnaud Soulet, Bruno Crémilleux - Post-proceedings of the 5th International Workshop on Knowledge Discovery in Inductive Databases in conjunction with ECML/PKDD 2006 (KDID’06), 2007
"... Mining patterns under many kinds of constraints is a key point to successfully get new knowledge. In this paper, we propose an efficient new algorithm Music-dfs which soundly and completely mines patterns with various constraints from large data and takes into account external data represe ..."
Cited by 1 (1 self)
represented by several heterogeneous datasets. Constraints are freely built of a large set of primitives and enable to link the information scattered in various knowledge sources. Efficiency is achieved thanks to a new closure operator providing an interval pruning strategy applied during the depth

The DBpedia Events Dataset

by Magnus Knuth, Jens Lehmann, Dimitris Kontokostas, Thomas Steiner, Harald Sack
"... Wikipedia is the largest encyclopedia worldwide and is frequently updated by thousands of collaborators. A large part of the knowledge in Wikipedia is not static, but frequently updated, e.g., political events or new movies. This makes Wikipedia an extremely rich, crowdsourced informatio ..."