Machine Learning in Automated Text Categorization
- ACM Computing Surveys, 2002
Cited by 839 (13 self)
The automated categorization (or classification) of texts into predefined categories has witnessed booming interest in the last ten years, due to the increased availability of documents in digital form and the ensuing need to organize them. In the research community the dominant approach to this problem is based on machine learning techniques: a general inductive process automatically builds a classifier by learning, from a set of preclassified documents, the characteristics of the categories. The advantages of this approach over the knowledge engineering approach (consisting of the manual definition of a classifier by domain experts) are very good effectiveness, considerable savings in terms of expert labor power, and straightforward portability to different domains. This survey discusses the main approaches to text categorization that fall within the machine learning paradigm. We will discuss in detail issues pertaining to three different problems, namely document representation, classifier construction, and classifier evaluation.
Learning Syntactic Rules and Tags with Genetic Algorithms for Information Retrieval and Filtering: An Empirical Basis for Grammatical Rules
- Information Processing & Management, 2000
Cited by 26 (3 self)
The grammars of natural languages may be learned by using genetic algorithms that reproduce and mutate grammatical rules and part-of-speech tags, improving the quality of later generations of grammatical components. Syntactic rules are randomly generated and then evolve; those rules resulting in improved parsing and occasionally improved retrieval and filtering performance are allowed to further propagate. The LUST system learns the characteristics of the language or sublanguage used in document abstracts by learning from the document rankings obtained from the parsed abstracts. Unlike the application of traditional linguistic rules to retrieval and filtering applications, LUST develops grammatical structures and tags without the prior imposition of some common grammatical assumptions (e.g., part-of-speech assumptions), producing grammars that are empirically based and are optimized for this particular application.
Genetic programming -- computers using "natural selection" to generate programs
- WC1E 6BT, 1995
Cited by 7 (0 self)
Computers that "program themselves"; science fact or fiction? Genetic Programming uses novel optimisation techniques to "evolve" simple programs; mimicking the way humans construct programs by progressively re-writing them. Trial programs are repeatedly modified in the search for "better/fitter" solutions. The underlying basis is Genetic Algorithms (GAs). Genetic Algorithms, pioneered by Holland [Hol92], Goldberg [Gol89] and others, are evolutionary search techniques inspired by natural selection (i.e., survival of the fittest). GAs work with a "population" of trial solutions to a problem, frequently encoded as strings, and repeatedly select the "fitter" solutions, attempting to evolve better ones. The power of GAs is being demonstrated for an increasing range of applications; financial, imaging, VLSI circuit layout, gas pipeline control and production scheduling [Dav91]. But one of the most intriguing uses of GAs - driven by Koza [Koz92] - is automatic program generation. Genetic Programming applies GAs to a "population" of programs - typically encoded as tree-structures. Trial programs are evaluated against a "fitness function" and the best solutions selected for modification and re-evaluation. This modification-evaluation cycle is repeated
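The select-modify-re-evaluate cycle described in this abstract can be illustrated with a minimal sketch. It is not taken from the cited paper: the toy problem (maximising the number of 1s in a bit string), the function names `fitness` and `evolve`, and all parameter values are assumptions chosen only for illustration. It evolves strings rather than program trees, so it shows a plain genetic algorithm, not genetic programming proper.

```python
import random


def fitness(candidate: str) -> int:
    """Toy fitness function (an assumption): count the 1s in the string."""
    return candidate.count("1")


def evolve(length=20, pop_size=30, generations=200, mutation_rate=0.05, seed=0):
    """Evolve bit strings towards all 1s and return the fittest one found."""
    rng = random.Random(seed)
    # Initial population of random trial solutions encoded as strings.
    population = [
        "".join(rng.choice("01") for _ in range(length)) for _ in range(pop_size)
    ]
    for _ in range(generations):
        # Evaluate: rank the trial solutions by fitness, best first.
        population.sort(key=fitness, reverse=True)
        if fitness(population[0]) == length:
            break
        # Select: the fitter half survives unchanged.
        parents = population[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            # Modify: one-point crossover of two parents, then bit-flip mutation.
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, length)
            child = a[:cut] + b[cut:]
            child = "".join(
                ("1" if bit == "0" else "0") if rng.random() < mutation_rate else bit
                for bit in child
            )
            children.append(child)
        population = parents + children
    return max(population, key=fitness)


if __name__ == "__main__":
    best = evolve()
    print(best, fitness(best))
```

Because the fitter half of each generation is carried over unchanged, the best fitness in the population never decreases from one generation to the next.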
Enhancing Information Retrieval by Automatic Acquisition of Textual Relations Using Genetic Programming
- In: Proceedings of Intelligent User Interfaces (IUI) 2000, ACM, 2000
Cited by 5 (0 self)
We have explored a novel method to find textual relations in electronic documents using genetic programming and semantic networks. This can be used for enhancing information retrieval and simplifying user interfaces. The automatic extraction of relations from text enables easier updating of electronic dictionaries and may reduce interface area both for search input and hit output on small screens such as cell phones and PDAs (Personal Digital Assistants).

