W. W. Cohen. Text categorization and relational learning. In A. Prieditis and S. J. Russell, editors, Proceedings of ICML-95, 12th International Conference on Machine Learning, pages 124--132, Lake Tahoe, US, 1995. Morgan Kaufmann Publishers, San Francisco, US.
....80 max , Approx ConstrainedSingle (FfHC; 1 FPR) TPR; is Hard. Proof included in appendix 2) As far as we know, max 175 41616 (roughly 4:2 10 ) but we think that this bound can be much improved. The accuracy can sometimes be conveniently replaced by the F statistics [Cohen, 1995], which is an accurate composition of precision and recall (see subsection 3.1 for their definition) useful for text categorization problems [Cohen, 1995]. So far, we have not been able to conclude to the hardness of using this criterion in our framework. 4.4 Beyond computational complexity and ....
....4:2 10 ) but we think that this bound can be much improved. The accuracy can sometimes be conveniently replaced by the F statistics [Cohen, 1995], which is an accurate composition of precision and recall (see subsection 3.1 for their definition) useful for text categorization problems [Cohen, 1995]. So far, we have not been able to conclude to the hardness of using this criterion in our framework. 4.4 Beyond computational complexity and ILP It is well known since [Kearns et al. 1987] that negative results on such problems can sometimes be extended to negative results for PAC type ....
Cohen, W. W. (1995). Text categorization and relational learning. In International Conference on Machine Learning, pages 124-132.
....in the TC literature. An increasing number of learning approaches have been applied, including regression models [9, 32], nearest neighbor classification [17, 29, 33, 31, 14], Bayesian probabilistic approaches [25, 16, 20, 13, 12, 18, 3], decision trees [9, 16, 20, 2, 12], inductive rule learning [1, 5, 6, 21], neural networks [28, 22], on line learning [6, 15], and Support Vector Machines [12]. While the rich literature provides valuable information about individual methods, clear conclusions about crossmethod comparison have been difficult because often the published results are not directly ....
....it would be meaningful to compare the performance of different classifiers with respect to category frequencies, and to measure how much the effectiveness of each method depends on the amount of data available for training. Evaluation scores of specific categories have been often reported [28, 5, 15, 13, 12]; however, performance analysis as a function of the rareness of categories has been seldom seen in the TC literature. Most commonly, methods are compared using a single score, such as the accuracy, error rate, or averaged F1 measure (see Section 2 for definition) over all category assignments to ....
[Article contains additional citation context not shown here]
William W. Cohen. Text categorization and relational learning. In The Twelfth International Conference on Machine Learning (ICML'95). Morgan Kaufmann, 1995.
.... of statistical learning methods have been applied to this problem in recent years, including regression models [7,27], nearest neighbor classifiers [11,14,25,26,28], Bayes belief networks [3,9,10,12,16,17,18,23], decision trees [2,9,12,18], neural networks [22,24], and inductive rule learning techniques [1,5,6,19]. Recent studies have proved the success of statistical approaches for learning to classify text documents [3,10,16,20,26]. These approaches commonly represent documents as vectors of words, and learn by gathering statistical information from the observed frequencies of these words within ....
Cohen, William W., Text categorization and relational learning, In The Twelfth International Conference on Machine Learning (ICML'95), Morgan Kaufmann, 1995.
....neighbour, FLNMAP with voting. 1 Introduction The text categorization problem appears in a number of application domains including information retrieval (IR) [23, 24, 25], data mining [22], and Web searching [14, 16]. A number of text categorization algorithms have appeared in the literature [7, 8, 25, 37]; for an overview see [23, 26, 38, 39]. While unsupervised text categorization has also attracted attention [37], most of the above algorithms address the problem of supervised text categorization where a typical task has the following characteristics. 1. A set of document categories is given ....
W.W. Cohen, "Text categorization and relational learning", in Proceedings of the 12th International Conference in Machine Learning, pp.124-132, 1995.
.... et al. 1999] With growing demands by incorporating more sophisticated services, an Ilp based approach for learning grammars looks interesting (see [Huck et al. 1998] for a discussion of pattern and grammar based wrapping, [Hammer et al. 1997] for a pattern extraction from web resources and [Cohen, 1995] for an Ilp approach for text categorization). 6 Conclusion and Prospects Currently, Bikini is limited to German news only which is caused by the use of a German lexicon for morphological reduction. Switching to other languages is rather simple, while a multi lingual approach requires a more ....
Cohen, W. W. (1995). Text categorization and relational learning. In Prieditis, A. and Russell, S., editors, Machine Learning, pages 124-132. Morgan Kaufmann.
....labeled examples (where the label indicates which category the example document belongs to) and attempts to infer a function that will map new documents into their categories. Several algorithms have been proposed within this framework, including regression models [29] inductive logic programming [6], probabilistic classifiers [17, 21, 16] decision trees [18] neural networks [22] and more recently support vector machines [12] Research on text categorization has been mainly focused on non structured documents. In the typical approach, inherited from information retrieval, each document is ....
W. W. Cohen. Text categorization and relational learning. In Proceedings of the Twelfth International Conference on Machine Learning, Lake Tahoe, California, 1995.
....or better accuracy on 22 of 37 benchmark data sets. RIPPER has already been applied to a number of standard problems in text classification with quite promising results, and FLIPPER, a first order logic version, was developed in an attempt to study the use of phrases for text classification [COH95b, COH95c]. It is important to emphasize that RIPPER is a rule based machine learning system that has made its mark in a field dominated by purely statistical methods. This makes RIPPER interesting for this study since most conclusions about the effectiveness of various representations have been drawn in a ....
William W. Cohen. Text Categorization and Relational Learning. In Proc. ICML-95. 1995. 124-132.
....In the short term, research into new algorithms based on combining classifiers probably holds the most promise. 2. A QUICK LOOK AT RIPPER Before moving on to discuss the new text representations, we need to quickly introduce the learning algorithm we used. RIPPER was developed by William Cohen [1995a] based on repeated application of Fürnkranz and Widmer's [1994] IREP algorithm followed by two new global optimization procedures. Like other rule based learners, RIPPER grows rules in a greedy fashion guided by an information gain heuristic. It is comparable in accuracy to similar algorithms ....
....[1994] IREP algorithm followed by two new global optimization procedures. Like other rule based learners, RIPPER grows rules in a greedy fashion guided by an information gain heuristic. It is comparable in accuracy to similar algorithms such as c4.5rules, but is significantly more efficient [Cohen 1995a]. This efficiency combined with RIPPER's implementation of set valued features [Cohen, 1996b] allows learning in much larger feature spaces than would be possible with c4.5rules. RIPPER has already been applied to a number of standard problems in text classification with quite promising results ....
[Article contains additional citation context not shown here]
Cohen, William W. 1995. Text categorization and relational learning. ICML-95. 124-132.
.... in [151] Particular works include applications to grammatical inference [247, 248] automatic induction of natural language interfaces for querying data bases [249, 222] information extraction tasks [216, 217, 29, 79, 80, 81, 30, 215] acquisition of verbal properties [153] text categorization [49, 50, 53, 213], and generation of natural language [176] 2.3 Subsymbolic Machine Learning Approaches 2.3.1 Neural Networks In their relation to NLP, neural networks [94] have been used basically to address low level problems, such as OCR [204] speech recognition and synthesis [206, 121, 155, 113, 229] and ....
....73] 150, 224] 182, 72, 74, 165] 202] Text categorization and filtering [183, 238, 237, 239] 233] 198, 196, 13] 98, 69, 99] Co reference and anaphora resol. 39] 148, 147] 35, 34] Rocchio RI ILP LSM GAs ME WSD [74] Text categorization and filtering [185, 91, 118, 196, 64] [49, 51, 50, 53, 134, 213] [51, 118, 62, 64] 236, 127] 161] Information Extraction [216, 217, 29, 79, 80, 81, 30, 215] Table 3: References corresponding to the discourse level semantics NLP problems. 13 IBL ILP NNs GAs Clust Lexical acquisition [76, 75] 169] PoS acquisition [108, 127] Grammatical ....
W. Cohen. Text Categorization and Relational Learning. In Proceedings of the 12th International Conference on Machine Learning, pages 124--132, San Francisco, CA, 1995. Morgan Kaufmann.
....in the TC literature. An increasing number of learning approaches have been applied, including regression models[9, 32] nearest neighbor classification[17, 29, 33, 31, 14] Bayesian probabilistic approaches [25, 16, 20, 13, 12, 18, 3] decision trees[9, 16, 20, 2, 12] inductive rule learning[1, 5, 6, 21], neural networks[28, 22] on line learning[6, 15] and Support Vector Machines [12] While the rich literature provides valuable information about individual methods, clear conclusions about crossmethod comparison have been difficult because often the published results are not directly ....
....it would be meaningful to compare the performance of different classifiers with respect to category frequencies, and to measure how much the effectiveness of each method depends on the amount of data available for training. Evaluation scores of specific categories have been often reported[28, 5, 15, 13, 12]; however, performance analysis as a function of the rareness of categories has been seldom seen in the TC literature. Most commonly, methods are compared using a single score, such as the accuracy, error rate, or averaged F1 measure(see Section 2 for definition) over all category assignments to ....
[Article contains additional citation context not shown here]
William W. Cohen. Text categorization and relational learning. In The Twelfth International Conference on Machine Learning (ICML'95). Morgan Kaufmann, 1995.
....work failed to find performance improvements using phrases with statistical learning algorithms (see [Lewis, 1992b] for a review) we thought there was still reason to believe that phrases could improve the performance of a symbolic, rule based learning algorithm. The recent development of RIPPER [Cohen, 1995a] a fast rule based learner that performs well on bag of words representations, opens up the possibility of using phrase based representations for symbolic learning. Next, we look at new representations based on synonyms and hypernyms. Previous work using semantic relationships in text ....
....In the short term, research into new algorithms based on combining classifiers probably holds the most promise. 1. A Quick Look at RIPPER Before moving on to discuss the new text representations, we need to quickly introduce the learning algorithm we used. RIPPER was developed by William Cohen [1995a] based on repeated application of Fürnkranz and Widmer's [1994] IREP algorithm. Like other rule based learners, RIPPER grows rules in a greedy fashion guided by an information gain heuristic. RIPPER is comparable in accuracy to similar algorithms such as c4.5, but is significantly more efficient ....
[Article contains additional citation context not shown here]
Cohen, William W. 1995. Text Categorization and Relational Learning. ICML-95. 124-132.
....This task abstracts away the syntactic structures by the bag of words representation as well as the structure of categories by a set of given, alternative categories. Relational approaches do not change the learning task although they cover some aspects of the structure of linked hypertexts (Cohen 1995; M. Craven, Slattery, Nigam 1998). Text classification is connected with social actions if software agents are to use the WWW as their source of information. Until now, however, the activities of the agents are not the driving force of learning how to classify texts nor is their activity used by ....
Cohen, W. W. 1995. Text categorization and relational learning. In Machine Learning: Proceedings of the Twelfth International Conference (ICML '95). Lake Tahoe, CA: Morgan Kaufmann.
....the search engine's ranking system. With growing demands by incorporating more sophisticated services, an Ilp based approach for learning grammars seems more appropriate (see [7] for a discussion of pattern and grammar based wrapping, [6] for a pattern extraction from web resources and [5] for an Ilp approach for text categorization). 3.3 Learning about Urls We currently work on an Url prediction system that we want to use to classify documents by their type with nothing but its Url as basis. Any Url can be interpreted as a 4-tuple ⟨p, s, t, f⟩ of protocol p, server name ....
W. W. Cohen. Text categorization and relational learning. In Machine Learning: Proceedings of the 12th International Conference (ML95), 1995.
.... [33] have been used with mail and USENET news filtering tasks [23, 26, 27] and automated Web browsing [4, 17] Minimum Description Length techniques have been explored in USENET news filtering [19] and relational learning algorithms such as FOIL [28] have been applied to text categorisation [9]. However, it is difficult to evaluate the relative performance of the techniques employed by learning agents, even within the same domain (such as news filtering) Evaluations performed on various agent systems to date have relied on individually constructed datasets. No standard datasets yet ....
....a database query language, so that users can retrieve popular or interesting mail articles. Other systems rely on user defined scripts which contain short programs, such as those employed by the Information Retrieval Agent (IRA) [34]. The use of scripts highlights a number of important issues [9]. Learning and utilising a scripting language may discourage nontechnical users from using the system. As well as understanding exactly how they require the system to behave, a user must appreciate how the agent will perform with the script. For example, the behaviour of individual rules may ....
W.W. Cohen. Text Categorization and Relational Learning. In The 12th International Conference on Machine Learning, pages 124--132, 1995.
.... Payne, Edwards, Green 1995) and automated Web browsing (Bayer 1995; Balabanović Yun 1995). Minimum Description Length techniques have been explored in USENET news filtering (Lang 1995) and relational learning algorithms such as FOIL (Quinlan 1990) have been applied to text categorisation (Cohen 1995). [Figure 1: A Learning Interface Agent Architecture.] Unfortunately, it is difficult to assess the relative performance of these techniques for agent systems, due to the ad ....
Cohen, W. 1995. Text Categorization and Relational Learning. In The 12th International Conference on Machine Learning, 124--132.
....system to add Internet search and to give the meta search in OySTER a natural-language interface. In addition, OySTER can benefit from the thematic classification used in Osiris. OySTER is currently in the proposal phase. 5 On categorization, see [Coh95]. ....
W. W. Cohen, Text categorization and relational learning, Machine Learning: Proceedings of the 12th International Conference (ML95), 1995.
....represent contextual information. 1 Introduction Learning methods are frequently used to automatically construct classifiers from labeled documents [Lewis, 1992b; Lewis and Ringuette, 1994; Lewis and Gale, 1994; Apté et al. 1994b; Yang and Chute, 1994; Hull et al. 1995; Wiener et al. 1995; Cohen, 1995b]. In this paper, we will investigate the performance of two recently implemented machine learning algorithms on a number of large text categorization problems. The two algorithms considered are set valued Ripper, a recent rule learning algorithm [Cohen, 1995a; Cohen, 1996b] and ....
....contextual information. When possible, we have compared our results to previous results on the same tasks. There is a large number of relevant studies (see for instance [Lewis, 1992b; Lewis, 1992a; Yang, 1994; Yang and Chute, 1994; Apté et al. 1994a; Hull et al. 1995; Wiener et al. 1995; Cohen, 1995b; Schütze et al. 1996; Ng et al. 1997] and the references therein) for which a direct comparison is impossible due to the diversity of the different datasets used in the experiments, the different methods used to pre process and partition the data, and the different measures used to evaluate ....
William W. Cohen. Text categorization and relational learning. In Machine Learning: Proceedings of the Twelfth International Conference, Lake Tahoe, California, 1995. Morgan Kaufmann.
....of the usefulness of classifiers that represent contextual information. 1 Introduction Learning methods are frequently used to automatically construct classifiers from labeled documents [Lewis, 1992; Lewis and Ringuette, 1994; Lewis and Gale, 1994; Apté et al. 1994; Wiener et al. 1995; Cohen, 1995b]. In this paper, we will investigate the performance of two recently implemented machine learning algorithms on a number of large text categorization problems. The two algorithms considered are set valued Ripper, a recent rule learning algorithm [Cohen, 1995a; Cohen, 1996] and sleeping experts, ....
William W. Cohen. Text categorization and relational learning. In Machine Learning: Proceedings of the Twelfth International Conference, Lake Tahoe, California, 1995. Morgan Kaufmann.
....as a confirmation of the usefulness of classifiers that represent contextual information. 1 Introduction Learning methods are frequently used to automatically construct classifiers from labeled documents [Lewis and Ringuette, 1994; Lewis and Gale, 1994; Apté et al. 1994; Wiener et al. 1995; Cohen, 1995b]. In this paper, we will investigate the performance of two recently implemented machine learning algorithms on a number of large text categorization problems. The two algorithms considered are set valued Ripper, a recent rule learning algorithm [Cohen, 1995a; Cohen, 1996] and a new on line ....
William W. Cohen. Text categorization and relational learning. In Machine Learning: Proceedings of the Twelfth International Conference, Lake Tahoe, California, 1995. Morgan Kaufmann.
No context found.
W. W. Cohen. Text categorization and relational learning. In A. Prieditis and S. J. Russell, editors, Proceedings of ICML-95, 12th International Conference on Machine Learning, pages 124--132, Lake Tahoe, US, 1995. Morgan Kaufmann Publishers, San Francisco, US.
No context found.
W. W. Cohen, "Text categorization and relational learning," in Proceedings of International Conference on Machine Learning ICML-95, pp. 124-132, 1995.
No context found.
Cohen W. 1995. Text categorization and relational learning. In Proceedings of the International Machine Learning Conference (ICML-95), 124-132, Morgan Kaufmann.
No context found.
Cohen, W.W. (1995). Text Categorization and Relational Learning. In A. Prieditis and S.J. Russell (Eds.), Proc. of the 12th International Conference on Machine Learning, Lake Tahoe, California (pp. 124--132).
No context found.
[16] William W. Cohen. Text categorization and relational learning. In Proceedings of the Twelfth International Conference on Machine Learning (ML95), pages
No context found.
COHEN, W. W. (1995). Text Categorization and Relational Learning. The 12th International Conference on Machine Learning, pp. 124--132.