
CiteSeerX

Results 1 - 10 of 2,290

Shallow Parsing with Conditional Random Fields

by Fei Sha, Fernando Pereira, 2003
"... Conditional random fields for sequence labeling offer advantages over both generative models like HMMs and classifiers applied at each sequence position. Among sequence labeling tasks in language processing, shallow parsing has received much attention, with the development of standard evaluation ..."
Cited by 581 (8 self)
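The decoding step of the linear-chain CRFs this entry describes can be sketched with the Viterbi algorithm, which finds the highest-scoring label sequence given per-token and transition scores. All scores and labels below are invented for illustration; a real model would learn them from features rather than hard-code them.

```python
# Minimal sketch of Viterbi decoding, the inference step of a linear-chain
# CRF as used for shallow parsing (chunking). Scores are made up.

def viterbi(obs_scores, trans_scores, labels):
    """Return the highest-scoring label sequence.

    obs_scores: list of {label: score} dicts, one per token
    trans_scores: {(prev_label, label): score}
    """
    # best[i][y] = (score of best path ending in label y at position i, backpointer)
    best = [{y: (obs_scores[0][y], None) for y in labels}]
    for i in range(1, len(obs_scores)):
        row = {}
        for y in labels:
            prev, score = max(
                ((p, best[i - 1][p][0] + trans_scores[(p, y)]) for p in labels),
                key=lambda t: t[1],
            )
            row[y] = (score + obs_scores[i][y], prev)
        best.append(row)
    # Trace back from the best final label.
    y = max(labels, key=lambda lab: best[-1][lab][0])
    path = [y]
    for i in range(len(best) - 1, 0, -1):
        y = best[i][y][1]
        path.append(y)
    return list(reversed(path))

labels = ["B-NP", "I-NP", "O"]
# Toy per-token scores for a 3-token sentence like "the cat sat":
obs = [
    {"B-NP": 2.0, "I-NP": 0.1, "O": 0.5},  # likely starts a noun chunk
    {"B-NP": 0.3, "I-NP": 1.8, "O": 0.2},  # likely continues it
    {"B-NP": 0.2, "I-NP": 0.3, "O": 1.5},  # likely outside any chunk
]
trans = {(a, b): 0.0 for a in labels for b in labels}
trans[("B-NP", "I-NP")] = 1.0   # reward B -> I continuation
trans[("O", "I-NP")] = -5.0     # penalize starting a chunk with I

print(viterbi(obs, trans, labels))  # -> ['B-NP', 'I-NP', 'O']
```

The transition scores are what distinguish a CRF-style decoder from a per-position classifier: they let evidence at one position influence the labels chosen at its neighbors.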

Using Maximum Entropy for Text Classification

by Kamal Nigam, John Lafferty, Andrew McCallum, 1999
"... This paper proposes the use of maximum entropy techniques for text classification. Maximum entropy is a probability distribution estimation technique widely used for a variety of natural language tasks, such as language modeling, part-of-speech tagging, and text segmentation. The underlying principle ..."
Cited by 326 (6 self)
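A maximum entropy text classifier of the kind this entry describes reduces, for bag-of-words features, to a softmax over summed feature weights: p(c|d) is proportional to exp of the total weight of features active in document d under class c. The classes, vocabulary, and weights below are invented for illustration; in practice the weights are fit by maximizing conditional log-likelihood (e.g. with iterative scaling or a gradient method).

```python
# Minimal sketch of a maximum entropy (multinomial logistic) classifier
# with hand-picked weights; everything here is a toy illustration.
import math

def maxent_probs(tokens, weights, classes):
    """Return {class: p(class | tokens)} under a bag-of-words maxent model."""
    scores = {
        c: sum(weights.get((tok, c), 0.0) for tok in tokens) for c in classes
    }
    z = sum(math.exp(s) for s in scores.values())  # partition function
    return {c: math.exp(s) / z for c, s in scores.items()}

classes = ["sports", "politics"]
weights = {
    ("goal", "sports"): 2.0,
    ("match", "sports"): 1.0,
    ("vote", "politics"): 2.0,
    ("election", "politics"): 1.5,
}

probs = maxent_probs(["goal", "match", "today"], weights, classes)
assert probs["sports"] > probs["politics"]   # "goal match today" leans sports
```

Unseen words like "today" carry no weight for either class, so they leave the distribution unchanged, which is the "maximum entropy" behavior: the model commits only as far as its feature constraints require.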

Cutting-Plane Training of Structural SVMs

by Thorsten Joachims, Thomas Finley, Chun-Nam John Yu, 2007
"... Discriminative training approaches like structural SVMs have shown much promise for building highly complex and accurate models in areas like natural language processing, protein structure prediction, and information retrieval. However, current training algorithms are computationally expensive or i ..."
Cited by 321 (10 self)

DryadLINQ: A System for General-Purpose Distributed Data-Parallel Computing Using a High-Level Language

by Yuan Yu, Michael Isard, Dennis Fetterly, Mihai Budiu, Úlfar Erlingsson, Pradeep Kumar Gunda, Jon Currey
"... DryadLINQ is a system and a set of language extensions that enable a new programming model for large scale distributed computing. It generalizes previous execution environments such as SQL, MapReduce, and Dryad in two ways: by adopting an expressive data model of strongly typed .NET objects; and by supporting general-purpose imperative and declarative operations on datasets within a traditional high-level programming language. A DryadLINQ program is a sequential program composed of LINQ expressions performing arbitrary side-effect-free transformations on datasets, and can be written and debugged using ..."
Cited by 273 (27 self)

LETOR: Benchmark dataset for research on learning to rank for information retrieval

by Tie-Yan Liu, Jun Xu, Tao Qin, Wenying Xiong, Hang Li - In Proceedings of SIGIR 2007 Workshop on Learning to Rank for Information Retrieval, 2007
"... This paper is concerned with learning to rank for information retrieval (IR). Ranking is the central problem for information retrieval, and employing machine learning techniques to learn the ranking function is viewed as a promising approach to IR. Unfortunately, there was no benchmark dataset that ..."
Cited by 156 (16 self)
"... the datasets, including both conventional features, such as term frequency, inverse document frequency, BM25, and language models for IR, and features proposed recently at SIGIR, such as HostRank, feature propagation, and topical PageRank. We have then packaged LETOR with the extracted features, queries ..."

Limitations of Co-Training for Natural Language Learning from Large Datasets

by David Pierce, Claire Cardie - In Proceedings of the 2001 Conference on Empirical Methods in Natural Language Processing, 2001
"... Co-Training is a weakly supervised learning paradigm in which the redundancy of the learning task is captured by training two classifiers using separate views of the same data. This enables bootstrapping from a small set of labeled training data via a large set of unlabeled data. This study examines the learning behavior of co-training on natural language processing tasks that typically require large numbers of training instances to achieve usable performance levels. Using base noun phrase bracketing as a case study, we find that co-training reduces by 36% the difference in error between co ..."
Cited by 86 (3 self)
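The co-training loop this entry studies can be sketched as follows: two classifiers, each trained on a separate view of the data, take turns labeling the unlabeled pool, and confident predictions from one view become training data for the other. The "classifier" here is a deliberately trivial word-memorizer, and all data, views, and thresholds are invented for illustration, not the paper's setup.

```python
# Toy sketch of the co-training paradigm: two views teach each other
# from an unlabeled pool. Everything below is illustrative only.
from collections import defaultdict

class ViewClassifier:
    def __init__(self):
        self.word_labels = defaultdict(set)  # word -> labels seen with it

    def train(self, text, label):
        for word in text.split():
            self.word_labels[word].add(label)

    def predict(self, text):
        """Return (label, confidence); confidence is the fraction of voting words."""
        votes = defaultdict(int)
        for word in text.split():
            labels = self.word_labels[word]
            if len(labels) == 1:             # only unambiguous words vote
                votes[next(iter(labels))] += 1
        if not votes:
            return None, 0.0
        label = max(votes, key=votes.get)
        return label, votes[label] / len(text.split())

def co_train(labeled, unlabeled, rounds=2, threshold=0.5):
    a, b = ViewClassifier(), ViewClassifier()
    for (v1, v2), label in labeled:
        a.train(v1, label)
        b.train(v2, label)
    for _ in range(rounds):
        still_unlabeled = []
        for v1, v2 in unlabeled:
            la, ca = a.predict(v1)
            lb, cb = b.predict(v2)
            if ca >= threshold:              # view A teaches view B
                b.train(v2, la)
            elif cb >= threshold:            # view B teaches view A
                a.train(v1, lb)
            else:
                still_unlabeled.append((v1, v2))
        unlabeled = still_unlabeled
    return a, b

# Each example has two views, e.g. page text vs. anchor text.
labeled = [(("striker scores", "match report"), "sports"),
           (("senate votes", "campaign news"), "politics")]
unlabeled = [("striker injured", "transfer rumor"),
             ("senate recess", "budget debate")]
a, b = co_train(labeled, unlabeled)
```

After one round, classifier `b` has learned labels for words like "transfer" and "budget" that it never saw in the labeled data; the paper's point is that this bootstrapping helps less on tasks needing very large training sets.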

Atmospheric Environment

by Jun Wang, Xia Hu, Wenhan Chao, Biyun Hu, Zhoujun Li - Dicarbonyl Products of the OH Radical-Initiated Reactions of Naphthalene and the C1- and C2-Alkylnaphthalenes, 2007
"... This paper is concerned with the problem of question recommendation in the setting of Community Question Answering (CQA). Given a question as query, our goal is to rank all of the retrieved questions according to their likelihood of being good recommendations for the query. In this paper, we propose a notion of public interest, and show how public interest can boost the performance of question recommendation. In particular, to model public interest in question recommendation, we build a language model to combine relevance score to the query and popularity score regarding question popularity ..."
Cited by 215 (10 self)

Topic and role discovery in social networks

by Andrew McCallum, Andrés Corrada-Emmanuel, Xuerui Wang - In IJCAI, 2005
"... Previous work in social network analysis (SNA) has modeled the existence of links from one entity to another, but not the language content or topics on those links. We present the Author-Recipient-Topic (ART) model for social network analysis, which learns topic distributions based on the direction- ..."
Cited by 221 (15 self)

The VNC-Tokens Dataset

by Paul Cook, Afsaneh Fazly, Suzanne Stevenson - In Proceedings of the MWE Workshop, ACL, 2008
"... Idiomatic expressions formed from a verb and a noun in its direct object position are a productive cross-lingual class of multiword expressions, which can be used both idiomatically and as a literal combination. This paper presents the VNC-Tokens dataset, a resource of almost 3000 English verb–noun ..."
Cited by 11 (0 self)

Why does unsupervised pre-training help deep learning?

by Dumitru Erhan, Aaron Courville, Yoshua Bengio, Pascal Vincent, 2010
"... Much recent research has been devoted to learning algorithms for deep architectures such as Deep Belief Networks and stacks of autoencoder variants with impressive results being obtained in several areas, mostly on vision and language datasets. The best results obtained on supervised learning tasks ..."
Cited by 155 (20 self)

Developed at and hosted by The College of Information Sciences and Technology

© 2007-2019 The Pennsylvania State University