Results 1 - 10 of 27,621

Text Classification using String Kernels

by Huma Lodhi, Craig Saunders, John Shawe-Taylor, Nello Cristianini, Chris Watkins
"... We propose a novel approach for categorizing text documents based on the use of a special kernel. The kernel is an inner product in the feature space generated by all subsequences of length k. A subsequence is any ordered sequence of k characters occurring in the text though not necessarily contiguo ..."
Cited by 495 (7 self)
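The feature space described in this abstract can be illustrated with a brute-force sketch. It enumerates every length-k subsequence (contiguous or not) and takes the inner product of the resulting feature maps. The `decay` weight, which penalizes subsequences spread over a longer span, and the function names are assumptions made for illustration; the enumeration is exponential in k and is only meant for tiny strings, not as the authors' algorithm.

```python
from collections import defaultdict
from itertools import combinations


def subsequence_features(text, k, decay=0.5):
    """Map each length-k subsequence of `text` to a weight.

    Every occurrence contributes decay ** span, where span is the distance
    in `text` covered by that occurrence (assumed weighting scheme).
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    feats = defaultdict(float)
    for idx in combinations(range(len(text)), k):
        span = idx[-1] - idx[0] + 1
        feats["".join(text[i] for i in idx)] += decay ** span
    return feats


def subsequence_kernel(s, t, k, decay=0.5):
    """Inner product of the two subsequence feature maps."""
    fs = subsequence_features(s, k, decay)
    ft = subsequence_features(t, k, decay)
    return sum(v * ft[u] for u, v in fs.items() if u in ft)


if __name__ == "__main__":
    # "cat" and "car" share only the subsequence "ca".
    print(subsequence_kernel("cat", "car", k=2))
```

With k = 2, "cat" and "car" share only the subsequence "ca", so their kernel value is the square of that single shared feature weight.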

N-gram-based text categorization

by William B. Cavnar, John M. Trenkle - In Proc. of SDAIR-94, 3rd Annual Symposium on Document Analysis and Information Retrieval, 1994
"... Text categorization is a fundamental task in document processing, allowing the automated handling of enormous streams of documents in electronic form. One difficulty in handling some classes of documents is the presence of different kinds of textual errors, such as spelling and grammatical errors in ..."
Cited by 445 (0 self)

Accurate Methods for the Statistics of Surprise and Coincidence

by Ted Dunning - Computational Linguistics, 1993
"... Much work has been done on the statistical analysis of text. In some cases reported in the literature, inappropriate statistical methods have been used, and statistical significance of results have not been addressed. In particular, asymptotic normality assumptions have often been used unjustifiably ..."
Cited by 1057 (1 self)

XORs in the air: practical wireless network coding

by Sachin Katti, Hariharan Rahul, Wenjun Hu, Dina Katabi, Muriel Médard, Jon Crowcroft - In Proc. ACM SIGCOMM, 2006
"... This paper proposes COPE, a new architecture for wireless mesh networks. In addition to forwarding packets, routers mix (i.e., code) packets from different sources to increase the information content of each transmission. We show that intelligently mixing packets increases network throughput. Our de ..."
Cited by 548 (20 self)
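The idea of mixing packets from different sources can be sketched with the XOR operation named in the title. In this hypothetical two-party exchange, a relay holds one packet from each endpoint and broadcasts their XOR once instead of forwarding each packet separately; each endpoint recovers the other's packet by XORing the broadcast with the packet it already sent. The packet contents, names, and the equal-length requirement are assumptions made for illustration, not details taken from the paper.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length payloads."""
    if len(a) != len(b):
        raise ValueError("payloads must have equal length in this sketch")
    return bytes(x ^ y for x, y in zip(a, b))


# Hypothetical packets that two endpoints want to exchange via a relay.
alice_pkt = b"hello bob"
bob_pkt = b"hi alice!"

# The relay broadcasts one coded packet instead of two plain ones.
coded = xor_bytes(alice_pkt, bob_pkt)

# Each endpoint decodes using the packet it already knows.
received_by_alice = xor_bytes(coded, alice_pkt)
received_by_bob = xor_bytes(coded, bob_pkt)

if __name__ == "__main__":
    print(received_by_alice, received_by_bob)
```

Padding for unequal payload sizes and the decision of which packets to combine are left out of this sketch.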

A solution to Plato’s problem: The latent semantic analysis theory of acquisition, induction, and representation of knowledge

by Thomas K. Landauer, Susan T. Dumais - Psychological Review, 1997
"... How do people know as much as they do with as little information as they get? The problem takes many forms; learning vocabulary from text is an especially dramatic and convenient case for research. A new general theory of acquired similarity and knowledge representation, latent semantic analysis (LS ..."
Cited by 1816 (10 self)

Maximum entropy Markov models for information extraction and segmentation

by Andrew McCallum, Dayne Freitag, Fernando Pereira, 2000
"... Hidden Markov models (HMMs) are a powerful probabilistic tool for modeling sequential data, and have been applied with success to many text-related tasks, such as part-of-speech tagging, text segmentation and information extraction. In these cases, the observations are usually modeled as multinomial ..."
Cited by 561 (18 self)

Framing: toward clarification of a fractured paradigm

by Robert M. Entman - Journal of Communication, 1993
"... In response to the proposition that communication lacks disciplinary status because of deficient core knowledge, I propose that we turn an ostensible weakness into a strength. We should identify our mission as bringing together insights and theories that would otherwise remain scattered in other di ..."
Cited by 620 (1 self)
other fields and feed back their studies to outside researchers. At the same time, such an enterprise would enhance the theoretical rigor of communication scholarship proper. The idea of "framing" offers a case study of just the kind of scattered conceptualization I have identified. Despite

Pegasos: Primal Estimated sub-gradient solver for SVM

by Shai Shalev-Shwartz, Yoram Singer, Nathan Srebro, Andrew Cotter
"... We describe and analyze a simple and effective stochastic sub-gradient descent algorithm for solving the optimization problem cast by Support Vector Machines (SVM). We prove that the number of iterations required to obtain a solution of accuracy ɛ is Õ(1/ɛ), where each iteration operates on a singl ..."
Cited by 542 (20 self)
-linear kernels while working solely on the primal objective function, though in this case the runtime does depend linearly on the training set size. Our algorithm is particularly well suited for large text classification problems, where we demonstrate an order-of-magnitude speedup over previous SVM learning
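The stochastic sub-gradient step mentioned in this abstract can be sketched for a linear SVM as follows. Each iteration draws a single training example, shrinks the weight vector, and adds a correction only when that example violates the margin. The step size 1/(lam * t), the toy data, and the parameter names are assumptions made for illustration; the optional projection step and kernelized variants are omitted.

```python
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def pegasos(X, y, lam=0.1, n_iters=1000, seed=0):
    """Train a linear SVM on labels in {-1, +1}, one example per step."""
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    for t in range(1, n_iters + 1):
        i = rng.randrange(len(X))        # pick one training example
        eta = 1.0 / (lam * t)            # decreasing step size
        margin = y[i] * dot(w, X[i])     # computed before the update
        w = [(1.0 - eta * lam) * wj for wj in w]   # regularization shrink
        if margin < 1.0:                 # hinge loss is active
            w = [wj + eta * y[i] * xj for wj, xj in zip(w, X[i])]
    return w


def predict(w, x):
    return 1 if dot(w, x) >= 0 else -1


if __name__ == "__main__":
    # Tiny linearly separable toy set (hypothetical data).
    X = [[2.0, 1.0], [1.0, 2.0], [-2.0, -1.0], [-1.0, -2.0]]
    y = [1, 1, -1, -1]
    w = pegasos(X, y, lam=0.1, n_iters=200)
    print(w, [predict(w, x) for x in X])
```

Because each step touches one example, the cost per iteration does not grow with the size of the training set in this linear setting.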

Matching words and pictures

by Kobus Barnard, Pinar Duygulu, David Forsyth, Nando De Freitas, David M. Blei, Michael I. Jordan - Journal of Machine Learning Research, 2003
"... We present a new approach for modeling multi-modal data sets, focusing on the specific case of segmented images with associated text. Learning the joint distribution of image regions and words has many applications. We consider in detail predicting words associated with whole images (auto-annotation ..."
Cited by 665 (40 self)

Monopolistic competition and optimum product diversity

by Avinash K. Dixit, Joseph E. Stiglitz - The American Economic Review, 1977
"... The basic issue concerning production in welfare economics is whether a market solution will yield the socially optimum kinds and quantities of commodities. It is well known that problems can arise for three broad reasons: distributive justice; external effects; and scale economies. This paper is c ..."
Cited by 1911 (5 self)
commodities already embodies the desirability of variety. Thus, a consumer who is indifferent between the quantities (1,0) and (0,1) of two commodities prefers the mix (1/2,1/2) to either extreme. The advantage of this view is that the results involve the familiar own- and cross-elasticities of demand functions