Y. Yang, "Sampling strategies and learning efficiency in text categorization," in Proc. of the AAAI Spring Symposium on Machine Learning in Information Access, M. Hearst and H. Hirsh, Eds. 1996, pp. 88--95, AAAI Press.

This paper is cited in the following contexts:
Integrating Feature and Instance Selection - For Text Classification

....good predictors. Therefore, we can reduce the redundancies hoping that we will not reduce the amount of information in the dataset. There is not much research on Instance Selection for text classification. The issue is mostly addressed either with the traditional statistical approach of sampling [16] or by more elaborate, but sometimes heuristic, approaches. Most of the work refers to Instance based or lazy algorithms [1]. In [15] the problem is addressed using a distance measure. In essence instances that are closer to each other tend to bear overlapping information; therefore, some of ....

Yang, Y. "Sampling strategies and learning efficiency in text categorization". AAAI Spring Symposium on Machine Learning in Information Access, pp. 88-95, 1996.


Active Learning For Automatic Speech Recognition - Hakkani-Tür, Riccardi, Gorin (2002)

.... language processing framework, certainty based methods have been used for natural language parsing and information extraction [4]. Similar sampling strategies were examined for text categorization, not to reduce the transcription cost, but to reduce the training time by using less training data [5]. While there is a wide literature on confidence score computation in ASR [6, 7, among others], to the authors' knowledge none of these works address the active learning question for speech recognition. 3. APPROACH Inspired by the certainty based active learning methods to reduce the ....

Y. Yang, "Sampling strategies and learning efficiency in text categorization," in Proc. of the AAAI Spring Symposium on Machine Learning in Information Access, M. Hearst and H. Hirsh, Eds. 1996, pp. 88--95, AAAI Press.


Integrative Windowing - Fürnkranz (1998)

....runs of windowing and to select the best tree. Nevertheless, windowing is arguably one of C4.5's least frequently used options. Recent work in the areas of Knowledge Discovery in Databases (Kivinen & Mannila, 1994; Toivonen, 1996) and Intelligent Information Retrieval (Lewis & Catlett, 1994; Yang, 1996) has re-emphasized the importance of sub-sampling procedures for reducing both learning time and memory requirements. Thus the interest in windowing techniques has revived as well. We discuss some of the more recent approaches in section 6. 1. Quinlan does not explicitly specify how this case ....

Yang, Y. (1996). Sampling strategies and learning efficiency in text categorization. In Hearst, M., & Hirsh, H. (Eds.), Proceedings of the AAAI Spring Symposium on Machine Learning in Information Access, pp. 88--95. AAAI Press. Technical Report SS-96-05.


Noise-Tolerant Windowing - Fürnkranz

....certainly is the rapid development of computer hardware, which made the motivation for windowing seem less compelling. However, recent work in the areas of Knowledge Discovery in Databases [Kivinen and Mannila, 1994; Toivonen, 1996] and Intelligent Information Retrieval [Lewis and Catlett, 1994; Yang, 1996] has recognized the importance of subsampling procedures for reducing both, learning time and memory requirements. A good deal of this lack of interest can be attributed to an empirical study [Wirth and Catlett, 1988] which showed that windowing is unlikely to gain any efficiency. The authors ....

Yiming Yang. Sampling strategies and learning efficiency in text categorization. In M. Hearst and H. Hirsh, editors, Proceedings of the AAAI Spring Symposium on Machine Learning in Information Access, pages 88--95. AAAI Press, 1996. Technical Report SS-96-05.


More Efficient Windowing - Fürnkranz (1997)

....and Smyth 1996) and Intelligent Information Retrieval (Hearst and Hirsh 1996) have again shown the limits of conventional machine learning algorithms. Dimensionality reduction through subsampling procedures has been recognized as a promising field of research (Lewis and Catlett 1994; Yang 1996). A good deal of the lack of interest in windowing can also be attributed to an empirical study (Wirth and Catlett 1988) that showed that windowing is unlikely to gain any efficiency. The authors studied windowing with ID3 in various domains and concluded that windowing cannot be recommended as a ....

Yang, Y. (1996). Sampling strategies and learning efficiency in text categorization. In M. Hearst and H. Hirsh (Eds.), Proceedings of the AAAI Spring Symposium on Machine Learning in Information Access, pp. 88--95. AAAI Press. Technical Report SS-96-05.


Dimensionality Reduction in ILP: A Call To Arms - Fürnkranz

....is certainly the rapid development of computer hardware, which made the motivation for windowing seem less compelling. However, recent work in the areas of Knowledge Discovery in Databases [Kivinen and Mannila, 1994; Toivonen, 1996] and Intelligent Information Retrieval [Lewis and Catlett, 1994; Yang, 1996] has recognized the importance of dimensionality reduction through subsampling for reducing both, learning time and memory requirements. Other subsampling approaches include peepholing [Catlett, 1991] which uses dynamical subsampling at each node in a decision tree, thus extending an earlier ....

Yiming Yang. Sampling strategies and learning efficiency in text categorization. In M. Hearst and H. Hirsh, editors, Proceedings of the AAAI Spring Symposium on Machine Learning in Information Access, pages 88--95. AAAI Press, 1996. Technical Report SS-96-05.


A Re-Examination of Text Categorization Methods - Yang, Liu (1999)   (148 citations)  Self-citation (Yang)

....then what empirical evidence has been found so far? These questions have not been addressed well. Another open question for TC research is how robust methods are in solving problems with a skewed category distribution. Since categories typically have an extremely nonuniform distribution in practice [30], it would be meaningful to compare the performance of different classifiers with respect to category frequencies, and to measure how much the effectiveness of each method depends on the amount of data available for training. Evaluation scores of specific categories have been often reported [28, 5, ....

Y. Yang. Sampling strategies and learning efficiency in text categorization. In AAAI Spring Symposium on Machine Learning in Information Access, pages 88--95, 1996.


A Re-Examination of Text Categorization Methods - Yang, Liu (1999)   (148 citations)  Self-citation (Yang)

....what empirical evidence has been found so far? These questions have not been addressed well. Another open question for TC research is how robust methods are in solving problems with a skewed category distribution. Since categories typically have an extremely nonuniform distribution in practice [30], it would be meaningful to compare the performance of different classifiers with respect to category frequencies, and to measure how much the effectiveness of each method depends on the amount of data available for training. Evaluation scores of specific categories have been often reported [28, 5, ....

Y. Yang. Sampling strategies and learning efficiency in text categorization. In AAAI Spring Symposium on Machine Learning in Information Access, pages 88--95, 1996.


Active Learning for Automatic Speech Recognition - Hakkani-Tür, Riccardi, Gorin

No context found.

Y. Yang, "Sampling strategies and learning efficiency in text categorization," in Proc. of the AAAI Spring Symposium on Machine Learning in Information Access, M. Hearst and H. Hirsh, Eds. 1996, pp. 88--95, AAAI Press.


A Survey on Personalised Information Filtering Systems for the.. - Aas (1997)

No context found.

Y. Yang, Sampling Strategies and Learning Efficiency in Text Categorization, AAAI Spring Symposium on Machine Learning in Information Access, Stanford, March 1996.