Results 1 - 10 of 24,764

A Shallow Text Processing Core Engine

by Günter Neumann, Jakub Piskorski - Computational Intelligence, 2002
"... We present sppc, a high-performance system for intelligent extraction of structured data from free text documents. sppc consists of a set of domain-adaptive shallow core components which are realized by means of cascaded weighted finite state machines and generic dynamic tries. The system has ..."
Cited by 25 (13 self)
processing, shallow free text processing, German language, finite-state technology, information extract...

Combining shallow text processing and machine learning in real world applications

by Günter Neumann, Sven Schmeier - In Proceedings of the IJCAI-99 workshop on Machine Learning for Information Filtering, 1999
"... In this paper, we present first results we achieved and experiences we had combining shallow text processing methods with machine learning tools. In two research projects, where DFKI and industrial partners are involved, German real world texts have to be classified into several predefined categorie ..."
Cited by 6 (2 self)

Shallow Parsing with Conditional Random Fields

by Fei Sha, Fernando Pereira, 2003
"... Conditional random fields for sequence labeling offer advantages over both generative models like HMMs and classifiers applied at each sequence position. Among sequence labeling tasks in language processing, shallow parsing has received much attention, with the development of standard evaluati ..."
Cited by 581 (8 self)

Machine Learning in Automated Text Categorization

by Fabrizio Sebastiani - ACM Computing Surveys, 2002
"... The automated categorization (or classification) of texts into predefined categories has witnessed a booming interest in the last ten years, due to the increased availability of documents in digital form and the ensuing need to organize them. In the research community the dominant approach to this p ..."
Cited by 1734 (22 self)

Toward a model of text comprehension and production

by Walter Kintsch, Teun A. Van Dijk - Psychological Review, 1978
"... The semantic structure of texts can be described both at the local microlevel and at a more global macrolevel. A model for text comprehension based on this notion accounts for the formation of a coherent semantic text base in terms of a cyclical process constrained by limitations of working memory. ..."
Cited by 557 (12 self)

A Sequential Algorithm for Training Text Classifiers

by David D. Lewis, William A. Gale, 1994
"... The ability to cheaply train text classifiers is critical to their use in information retrieval, content analysis, natural language processing, and other tasks involving data which is partly or fully textual. An algorithm for sequential sampling during machine learning of statistical classifiers was ..."
Cited by 631 (10 self)

Parallel Networks that Learn to Pronounce English Text

by Terrence J. Sejnowski, Charles R. Rosenberg - Complex Systems, 1987
"... This paper describes NETtalk, a class of massively-parallel network systems that learn to convert English text to speech. The memory representations for pronunciations are learned by practice and are shared among many processing units. The performance of NETtalk has some similarities with observed h ..."
Cited by 549 (5 self)

Extracting Relations from Large Plain-Text Collections

by Eugene Agichtein, Luis Gravano, 2000
"... Text documents often contain valuable structured data that is hidden in regular English sentences. This data is best exploited if available as a relational table that we could use for answering precise queries or for running data mining tasks. We explore a technique for extracting such tables fr ..."
Cited by 494 (25 self)
introduces novel strategies for generating patterns and extracting tuples from plain-text documents. At each iteration of the extraction process, Snowball evaluates the quality of these patterns and tuples without human intervention, and keeps only the most reliable ones for the next iteration

Hierarchical Dirichlet processes.

by Yee Whye Teh, Michael I Jordan, Matthew J Beal, David M Blei - Journal of the American Statistical Association, 2006
"... We consider problems involving groups of data where each observation within a group is a draw from a mixture model and where it is desirable to share mixture components between groups. We assume that the number of mixture components is unknown a priori and is to be inferred from the data. In this s ..."
Cited by 942 (78 self)
Carlo algorithms for posterior inference in hierarchical Dirichlet process mixtures and describe applications to problems in information retrieval and text modeling.

Bottom-Up Relational Learning of Pattern Matching Rules for Information Extraction

by Mary Elaine Califf, Raymond J. Mooney, David Cohn, 2003
"... Information extraction is a form of shallow text processing that locates a specified set of relevant items in a natural-language document. Systems for this task require significant domain-specific knowledge and are time-consuming and difficult to build by hand, making them a good application for ..."
Cited by 406 (20 self)