
Extracting information nuggets from disaster-related messages in social media, pages 791–801. Karlsruher Institut für Technologie (KIT) (2013)

by Muhammad Imran, Shady Elbassuoni, Carlos Castillo, Fernando Diaz, Patrick Meier
Results 1 - 1 of 1

More Features Are Not Always Better: Evaluating Generalizing Models in Incident Type Classification of Tweets

by Axel Schulz, Christian Guckelsberger, Benedikt Schmidt
Social media represents a rich source of up-to-date information about events such as incidents. The sheer amount of available information makes machine learning approaches a necessity for further processing. This learning problem is often concerned with regionally restricted datasets such as data from only one city. Because social media data such as tweets varies considerably across different cities, the training of efficient models requires labeling data from each city of interest, which is costly and time consuming. In this study, we investigate which features are most suitable for training generalizable models, i.e., models that show good performance across different datasets. We re-implemented the most popular features from the state of the art in addition to other novel approaches, and evaluated them on data from ten different cities. We show that many sophisticated features are not necessarily valuable for training a generalized model and are outperformed by classic features such as plain word-n-grams and character-n-grams.
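As a rough illustration of the "classic" features named in the abstract, the sketch below extracts word n-grams and character n-grams from a single tweet-like string. It is not the authors' implementation: the function names (`word_ngrams`, `char_ngrams`, `ngram_features`), the lowercasing, the whitespace tokenization, and the chosen n values are all assumptions made for this example.

```python
from collections import Counter


def word_ngrams(text, n):
    """Return word n-grams, assuming lowercasing and whitespace tokenization."""
    tokens = text.lower().split()
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def char_ngrams(text, n):
    """Return character n-grams over the lowercased string, spaces included."""
    s = text.lower()
    return [s[i:i + n] for i in range(len(s) - n + 1)]


def ngram_features(text, word_n=(1, 2), char_n=(3,)):
    """Count word and character n-grams in one sparse feature dictionary.

    Keys are prefixed ("w2:", "c3:") so that the two feature families
    cannot collide. The default n values are illustrative assumptions.
    """
    feats = Counter()
    for n in word_n:
        feats.update(f"w{n}:{g}" for g in word_ngrams(text, n))
    for n in char_n:
        feats.update(f"c{n}:{g}" for g in char_ngrams(text, n))
    return feats


if __name__ == "__main__":
    # Hypothetical example message, not taken from the paper's datasets.
    print(ngram_features("Car crash on Main Street").most_common(5))
```

Count dictionaries of this kind could then be vectorized and passed to a classifier such as an SVM or Naive Bayes, the model families mentioned in the citation context.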

Citation Context

... for incident type classification. (Sakaki and Okazaki, 2010; Carvalho et al., 2010; Agarwal et al., 2012; Robert Power, 2013; Schulz and Janssen, 2014) trained an SVM, whereas (Agarwal et al., 2012; Imran et al., 2013; Schulz and Janssen, 2014) also evaluated an NB classifier. In contrast to these works, (Wanichayapong et al., 2011) followed a dictionary-based approach using traffic-related keywords. (Li et al., 2...
