25 citations found.
Yang, Y. (1995). Noise reduction in a statistical approach to text categorization. In Proceedings of the ACM SIGIR on Research and Development in Information Retrieval.

This paper is cited in the following contexts:
Text Categorization by Boosting Automatically Extracted Concepts - Cai, Hofmann (2003)   (4 citations)

....two adjacent feature values of e in the training data. 5. RELATED WORK The use of automatically extracted features and dimension reduction techniques for text categorization has been investigated before. Most notably is the use of SVD based truncation to suppress noisy features as suggested in [12, 13]. A similar idea has also been investigated more recently under the title of semantic kernels and utilized in conjunction with SVMs for text categorization [14]. Yet another approach of deriving document representations that takes semantic similarities of terms into account has been proposed in ....

Y. Yang. Noise reduction in a statistical approach to text categorization. In Proceedings of the 18th ACM-SIGIR International Conference on Research and Development in Information Retrieval, pages 256-263, 1995.


On the Use of Singular Value Decomposition for Text Retrieval - Husbands, Simon, Ding (2000)   (7 citations)

....performance. Furthermore, LSI can be alternatively reviewed as a query expansion method (see section 2.2 and 5) so that recall is generally improved. Experiments indicates both improved retrieval precision and recall when LSI is adopted [4, 6, 10, 2, 1, 21]. LSI also improves text categorization [7, 20], and word sense disambiguation [17]. Theoretical results [1, 14, 5, 21] have also provided some understanding on the effectiveness of LSI. These LSI studies have, however, mostly used relatively small text collections and simplified document models. In this work we investigate the use of the LSI ....

Y. Yang. Noise Reduction in a Statistical Approach to Text Categorization. Proc. of SIGIR'95 (ACM Press, 1995), pp.256-263.


A Probabilistic Model for Dimensionality Reduction in Information.. - Ding (2001)   (1 citation)

.... computes a much smaller semantic subspace from the original text collection, which improves recall and precision in information retrieval [Deerwester et al., 1990, Bartell et al., 1995, Zha et al., 1998, Hofmann, 1999, Husbands et al., 2000], information filtering or text classification [Dumais, 1995, Yang, 1995, Baker and McCallum, 1998], and word sense disambiguation [Schütze, 1998]. The effectiveness of LSI in these empirical studies is often attributed to reduction of noise, redundancy, and ambiguity. Synonyms and polysemy problems are somehow reduced in the process. Several recent studies ....

....done in a number of ways. A simple method is to calculate a centroid vector $c_k$ of category k, i.e. the average of all documents in the category [Dumais, 1995]. All m centroid vectors for m categories form a $d \times m$ matrix $C = (c_1 \cdots c_m)$. Another method is to solve for mapping vectors $c_k$ so that [Yang, 1995] $C = \arg\min_C \|C^T X - B\|$. In the least square problem, the $m \times n$ matrix B defines categories for each document. The Frobenius norm of a matrix J is defined as $\|J\|^2 = \sum_{i=1}^{n} \sum_{k=1}^{m} (J_{ki})^2$. A new incoming document is then projected onto these centroids or mapping vectors to ....
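
The two constructions of C described in this context can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not code from any of the cited papers: X is taken to be a d x n term-by-document matrix, B an m x n 0/1 category indicator matrix, every category is assumed to contain at least one document, and the function names and toy data are hypothetical.

```python
import numpy as np

def centroid_matrix(X, B):
    """d x m matrix whose k-th column is the average of the documents in category k."""
    counts = B.sum(axis=1)            # number of documents per category, shape (m,)
    return (X @ B.T) / counts         # per-category column sums divided by counts

def llsf_matrix(X, B):
    """d x m matrix C minimising the Frobenius norm of C^T X - B."""
    # C^T X = B is the same system as X^T C = B^T, which lstsq solves column by column.
    C, *_ = np.linalg.lstsq(X.T, B.T, rcond=None)
    return C

# Toy data (assumed): 3 terms, 4 documents, 2 categories.
X_demo = np.array([[1.0, 0.0, 1.0, 0.0],
                   [0.0, 1.0, 0.0, 1.0],
                   [1.0, 1.0, 0.0, 0.0]])
B_demo = np.array([[1.0, 0.0, 1.0, 0.0],
                   [0.0, 1.0, 0.0, 1.0]])

C_centroid = centroid_matrix(X_demo, B_demo)
C_llsf = llsf_matrix(X_demo, B_demo)

# A new document is projected onto the mapping vectors; the largest score wins.
x_new = np.array([1.0, 0.0, 0.0])
scores = C_llsf.T @ x_new
```

On this toy data the least squares fit is exact, so `C_llsf.T @ X_demo` reproduces `B_demo`; on real collections a residual generally remains.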

[Article contains additional citation context not shown here]

Y. Yang. Noise Reduction in a Statistical Approach to Text Categorization. Proc. of SIGIR'95 (ACM Press, 1995), pp.256-263.


A Similarity-based Probability Model for Latent Semantic Indexing - Ding (1999)   (17 citations)

....that effectively capture the essential associative semantic relationship between terms and documents, consistent with the empirical evidences and the general intuition. LSI SVD techniques have been used in information filtering (document classification) and computational linguistics (e.g. [4, 14, 15]). Our model applies to these cases too, as long as the semantic structures defined by the dot product similarity remains the essential relationship there. In text classification [4, 15], documents are projected into the LSI subspace; the same semantic relations remain in this new feature space as ....

....have been used in information filtering (document classification) and computational linguistics (e.g. [4, 14, 15]). Our model applies to these cases too, as long as the semantic structures defined by the dot product similarity remains the essential relationship there. In text classification [4, 15], documents are projected into the LSI subspace; the same semantic relations remain in this new feature space as in the retrieval cases. In word sense disambiguation [14], the relevant relationship is the cosine between two vectors in the context space and thus our theory applies here also. In all ....

Y. Yang. Noise Reduction in a Statistical Approach to Text Categorization. In Proc. of 18th Annual ACM SIGIR Conference (SIGIR '95) 1995:256-263.


A Dual Probabilistic Model for Latent Semantic Indexing in.. - Ding (1999)

.... The query is transformed to $q^T U_k$, the documents are represented as $\Sigma_k V_k^T$, and the relevance score is computed as $s = (q^T U_k)(\Sigma_k V_k^T)$. Typically taking k = 200 (far less than original dimensions), LSI increases the precision for retrieval and accuracy for classification [7, 8, 1, 23, 22]. The success of LSI is attributed to that LSI subspace captures the essential associative semantic relationships better than the original document space, and thus partially resolves the word choice problem. Clearly, a theoretical and quantitative understanding is important. Mathematically, the ....
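
As a rough illustration of the scoring rule quoted above, the following NumPy sketch projects a query into a rank-k subspace and scores it against every document. It assumes a term-by-document matrix X with the decomposition X = U Σ V^T; the function name `lsi_scores` and the toy matrix are hypothetical and not taken from the cited paper.

```python
import numpy as np

def lsi_scores(X, q, k):
    """Relevance scores s = (q^T U_k)(Sigma_k V_k^T) for every document (column) of X."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    U_k = U[:, :k]                          # leading k left singular vectors
    docs_k = s[:k, None] * Vt[:k, :]        # Sigma_k V_k^T, one column per document
    return (q @ U_k) @ docs_k

# Toy data (assumed): 3 terms, 4 documents, and a query containing only term 0.
X_demo = np.array([[1.0, 0.0, 1.0, 0.0],
                   [0.0, 1.0, 0.0, 1.0],
                   [1.0, 1.0, 0.0, 0.0]])
q_demo = np.array([1.0, 0.0, 0.0])

scores_k2 = lsi_scores(X_demo, q_demo, k=2)   # reduced subspace
scores_k3 = lsi_scores(X_demo, q_demo, k=3)   # full rank: same as the plain dot product q^T X
```

When k equals the rank of X the scores coincide with the ordinary dot-product scores; smaller k gives the smoothed scores that the context attributes to the LSI subspace.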

....the existence of optimal semantic subspace is conclusively demonstrated here for the first time. If we pick the first peak as $k_{opt}$, then $k_{opt}$ = 377 for Cranfield, 184 for Medline, 726 for CACM, and 274 for CISI. $k_{opt}$ values for Medline and CACM are quite close to experimentally determined values [22, 23]. As explained before, the statistical significance of each LSI index vector relates to their eigenvalue. The eigenvalues of the latent semantic vectors for Cranfield collection are shown in Figure 1b. Within the range of meaningful latent semantic vectors, $1 \le i \le k_{opt}$, they follow a Zipf-like ....

Y. Yang. Noise Reduction in a Statistical Approach to Text Categorization. Proc. of SIGIR'95 (ACM Press, 1995), pp.256-263.


Feature Reduction and Database Maintenance in NETNEWS.. - Hsu, Lang

....request expirations and specify modifications from external sources. The latter significantly increases the amount of data that can be expired. Their efficient algorithms determine what data can be expired by taking into account the types of updates that may occur. Another work [18] in text categorization uses a statistical learning method, Linear Least Square Fit Mapping, to reduce noise for computational efficiency. In this study, multiple noise reducing strategies were used and the results show significant improvements in efficiency without losing ....

Y. Yang. "Noise Reduction in a Statistical Approach to Text Categorization", SIGIR, pp. 256-263, 1995.


Classification Algorithms for NETNEWS Articles - Hsu, Lang (1999)   (2 citations)

....can declaratively request expirations and specify modifications from external sources. The latter significantly increases the amount of data that can be expired. Their efficient algorithms determine what data can be expired by taking into account the types of updates that may occur. Another work [26] in text categorization uses a statistical learning method, Linear Least Square Fit Mapping, to reduce noise for computational efficiency. In this study, multiple noise reducing strategies were used and the results show significant improvements in efficiency without losing categorization accuracy. ....

Y. Yang. "Noise Reduction in a Statistical Approach to Text Categorization", Proceedings of SIGIR, pp. 256-263, 1995.


Effect of noise reduction using rough set theory on the.. - Gardner, Lalmas, Ruthven

.... documents for retrieval purposes, and for document filtering (selective dissemination of information). It has been shown that noise reduction prior to categorisation has beneficial effects, both on the accuracy of classification and on the computational efficiency of the classification algorithm [Yan95]. Accordingly a particular focus of this study was to explore the utility of noise reduction prior to K-NN classification. The noise reduction strategy selected involved attribute reduction using a technique based on Rough Set Theory [Paw82, Paw93]. The study uses 107 anonymised reports of ....

Y. Yang. Noise reduction in a statistical approach to text categorisation. In E. A. Fox, P. Ingwersen, and R. Fidel, editors, Proceedings of ACM SIGIR Conference on Research and Development in Information Retrieval, pages 256--263, Seattle, Washington, USA, 1995.


Boosting and Rocchio Applied to Text Filtering - Schapire, Singer, Singhal (1998)   (51 citations)

.... been studied in two different communities: machine learning (ML) and information retrieval (IR). Many algorithms for text classification have been proposed and evaluated in the past, for example, Bayesian classifiers, k nearest neighbors, neural networks, rule learning algorithms, and many more [13, 16, 30, 1, 31, 11, 17, 5, 18]. Most studies use Rocchio's method [21], a well known algorithm in the IR community (usually used for relevance feedback and document routing), as a comparison baseline for their classifiers. One aim of this study is to examine the relative merits of a fairly new ML algorithm called boosting, ....

Yiming Yang. Noise reduction in a statistical approach to text categorization. In Edward Fox, Peter Ingwersen, and Raya Fidel, editors, Proceedings of the Eighteenth Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 256--263. Association for Computing Machinery, New York, July 1995.


Distributional Clustering of Words for Text Classification - Baker, McCallum (1998)   (57 citations)

....which is the vector sum of all the feature vectors of all the documents in that class. A new document is labelled with the class of the centroid to which its feature vector is closest, as measured by the cosine similarity between the two vectors. The Linear Least Squares Fit (LLSF) method [22] is another classification algorithm based on PCA, which is equivalent to Dumais' use of LSI for classification except that LLSF uses the dot product to compute similarity instead of the cosine and is thus sensitive to the length of the two vectors being compared. 4 Experimental Results This ....

Yiming Yang. Noise reduction in a statistical approach to text categorization. In Proceedings of the 18th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR'95), pages 256--263, 1995.


Boosting and Rocchio Applied to Text Filtering - Schapire, Singer, Singhal (1998)   (51 citations)

.... has been studied in two different communities: machine learning (ML) and information retrieval (IR). Many algorithms for text filtering have been proposed and evaluated in the past, for example, Bayesian classifiers, k nearest neighbors, neural networks, rule learning algorithms, and many more [17, 20, 40, 2, 41, 14, 22, 8, 24]. Most studies use Rocchio's method [28], a well known algorithm in the IR community (traditionally used for relevance feedback and more recently for document routing [38]), as a comparison baseline for their classifiers. However, most such studies use a weak version of Rocchio's algorithm, not ....

....used for evaluating text filtering effectiveness, and do not use it in this study. Some other measures that have been used to evaluate text filtering are: • Average precision, or precision at a fixed rank cutoff: Many studies have used one of these measures to evaluate filtering effectiveness [2, 40, 41, 22, 1, 7]. These measures are intended to evaluate the ranking effectiveness of a system [31], not its filtering effectiveness. Even though the filtering effectiveness of a system is related to its ranking effectiveness, this relationship is not strong enough to use ranking evaluation measures to evaluate ....

Yiming Yang. Noise reduction in a statistical approach to text categorization. In Proceedings of the Eighteenth Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 256--263, July 1995.


Daimler Benz Research: System and Experiments -.. - Bayer.. (1997)   (1 citation)

....for subsequent processing which are not discussed in detail here. However, the essential consequence is that the resulting decision vector can not be normalized to length 1. The linear classifier is identical to the LLSF (linear least square fit) classifier described by Yang (see [7] and [8]). However, the mathematical principle is different in general if higher order polynomials are used. In this case, a non-linear function (e.g. quadratic polynomial) maps the feature space to the decision space yielding better separation of categories in the decision space. 4 Experiments The ....

Y. Y. Yang, Noise reduction in a statistical approach to text categorization, Proceedings, 18th Int. ACM-SIGIR Conf. on Research and Development in Information Retrieval, Seattle, WA, 1995.


A Scalability Analysis of Classifiers in Text Categorization - Yang, Zhang, Kisiel (2003)   (1 citation)  Self-citation (Yang)

No context found.

Y. Yang. Noise reduction in a statistical approach to text categorization. In Proceedings of the 18th Ann Int ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR'95), pages 256--263, 1995.


A Scalability Analysis of Classifiers in Text Categorization - Yang, Zhang, Kisiel (2003)   (1 citation)  Self-citation (Yang)

No context found.

Y. Yang. Noise reduction in a statistical approach to text categorization. In Proceedings of the 18th Ann Int ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR'95), pages 256-263, 1995.


A Scalability Analysis of Classifiers in Text Categorization - Yang, Zhang, Kisiel (2003)   (1 citation)  Self-citation (Yang)

No context found.

Y. Yang. Noise reduction in a statistical approach to text categorization. In Proceedings of the 18th Ann Int ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR'95), pages 256-263, 1995.


A Scalability Analysis of Classifiers in Text Categorization - Yang, Zhang, Kisiel (2003)   (1 citation)  Self-citation (Yang)

....requirement is $O(N k_s)$ for storing either U or V. The value of $k_s$ can be empirically chosen through validation. In our experiments with LLSF on benchmark collections (Reuters news stories, MEDLINE documents, etc.) we observed the optimal ranges of $k_s$ to be between a few hundred and one thousand [16]. Step 2. Compute the pseudo-inverse $X^{+} = V S^{-1} U^{T} = X^{T} U S^{-2} U^{T}$. The time complexity here is $O(k_s N)$, dominated by the computation of $U S^{-2} U^{T}$. The space complexity is O(NV) for storing matrix X. Step 3. Compute the solution matrix W # = X Y. Since the matrix Y ....
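
The pseudo-inverse step mentioned in this context can be illustrated with a truncated SVD in NumPy. This is a hedged sketch, not the authors' implementation: it assumes X is a documents-by-terms matrix, Y a documents-by-categories 0/1 matrix, and that the number of retained singular vectors does not exceed the rank of X. The function name `truncated_pinv`, the final multiplication by Y, and the toy data are assumptions made for illustration.

```python
import numpy as np

def truncated_pinv(X, k):
    """Rank-k pseudo-inverse V_k S_k^{-1} U_k^T built from the leading k singular triplets."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    U_k, s_k, Vt_k = U[:, :k], s[:k], Vt[:k, :]
    # Dividing the columns of V_k by the singular values applies S_k^{-1}.
    return (Vt_k.T / s_k) @ U_k.T

# Toy data (assumed): 4 documents, 3 terms, 2 categories.
X_demo = np.array([[1.0, 0.0, 1.0],
                   [0.0, 1.0, 1.0],
                   [1.0, 0.0, 0.0],
                   [0.0, 1.0, 0.0]])
Y_demo = np.array([[1.0, 0.0],
                   [0.0, 1.0],
                   [1.0, 0.0],
                   [0.0, 1.0]])

X_pinv = truncated_pinv(X_demo, k=3)   # k equals the rank here, so nothing is truncated
W_demo = X_pinv @ Y_demo               # terms-by-categories mapping under the assumptions above
```

Choosing k below the rank drops the smallest singular values, which is the kind of truncation the context says is tuned through validation.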

....needed for the inverted indexing is also O(NLc ). Matrix X would make the space complexity O(V M) if we need to keep it in a dense form. However, our previous work showed that aggressive elimination of non-influential elements from that matrix would not cause any loss of classification accuracy [16]. Among the above steps, the dominating part in the training time of LLSF is the matrix multiplication in Step 2, with a complexity of O(ksN ). As for the space complexity, the dominating part is the storage required for matrix X, with a complexity of O(NV ). For the testing phase, the ....

Y. Yang. Noise reduction in a statistical approach to text categorization. In Proceedings of the 18th Ann Int ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR'95), pages 256--263, 1995.


An Evaluation of Statistical Approaches to Text Categorization - Yang (1999)   (127 citations)  Self-citation (Yang)

.... investigations on suitable choices of these parameter values were reported in previous papers where the main observations were that the performance of kNN is relatively stable for a large range of k values[22] and that satisfactory performance of LLSF depends on whether p is sufficiently large[23]. Given the large number of possible combinations of parameter values, exhaustive testing of all the combinations is neither practical nor necessary. We take a greedy search strategy for parameter tuning. That is, we first subjectively decide the order of parameters to be tuned, and then ....

.....004 CPU second per test document on average. LLSF is an eager learning method, and has a off line training phase and an on line testing phase. The training phase has a quadratic time complexity, O(pn′), where p is the number of singular vectors used for computing an approximated LLSF solution [23], and n′ = max{m, n} is the larger number between n, the number of training documents, and m, the number of unique terms in the training documents. This quadratic complexity is the computational bottleneck for scaling this method to large applications. Once the training is done, the on line ....

Y. Yang. Noise reduction in a statistical approach to text categorization. In Proceedings of the 18th Ann Int ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR'95), pages 256--263, 1995.


Translingual Information Retrieval: Learning from.. - Yang, Carbonell.. (1997)   (9 citations)  Self-citation (Yang)

....l documents. This can be expensive for very large applications. Fortunately, we found it possible to significantly reduce this complexity by aggressively removing non-influential elements from the transformed document vectors without sacrificing retrieval performance, as shown in our previous work [28] and in the empirical results of this study (Section 5.4). 3.3 Latent Semantic Indexing. Latent Semantic Indexing [10] (LSI) is a one step extension of GVSM. The claim is that neither terms nor documents are the optimal choice for the orthogonal basis of a semantic space, and that a reduced vector ....

Y. Yang. Noise reduction in a statistical approach to text categorization. In Proceedings of the 18th Ann Int ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR'95), pages 256--263, 1995.


An Evaluation of Statistical Approaches to Text Categorization - Yang (1998)   (127 citations)  Self-citation (Yang)

.... investigations on suitable choices of these parameter values were reported in previous papers where the main observations were that the performance of kNN is relatively stable for a large range of k values[22] and that satisfactory performance of LLSF depends on whether p is sufficiently large[23]. Given the large number of possible combinations of parameter values, exhaustive testing of all the combinations is neither practical nor necessary. We take a greedy search strategy for parameter tuning. That is, we first subjectively ....

....document on average. LLSF is an eager learning method, and has a off line training phase and an on line testing phase. The training phase has a quadratic time complexity, O(pn′), where p is the number of principal components (singular vectors) used for computing an approximated LLSF solution [23], and n′ = max{m, n} is the larger number between n, the number of training documents, and m, the number of unique terms in the training documents. This quadratic complexity is the computational bottleneck for scaling this method to large applications. Once the training is done, the on line ....

Y. Yang. Noise reduction in a statistical approach to text categorization. In Proceedings of the 18th Ann Int ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR'95), pages 256--263, 1995.


Translingual Information Retrieval: A Comparative.. - Carbonell, Yang.. (1997)   (28 citations)  Self-citation (Yang)

....the bilingual training corpus. The time complexity in the second part, is O(n) per document, or O(nl) for a test corpus of l documents. It is possible to significantly reduce this complexity in large problems by aggressively removing non-influential elements from the transformed document vectors [Yang, 1995]. 3.3 Latent Semantic Indexing. Latent Semantic Indexing [Deerwester et al. 1990] (LSI) is a one step extension of GVSM. The claim is that neither terms nor documents are the optimal choice for the orthogonal basis of a semantic space, and that a reduced vector space consisting of the most ....

Y. Yang. Noise Reduction in a Statistical Approach to Text Categorization. In Proceedings of the 18th Ann Int ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '95), pages 256--263, 1995.


The Maximum-Margin Approach to Learning Text Classifiers -.. - Joachims (2000)   (17 citations)

No context found.

Yang, Y. (1995). Noise reduction in a statistical approach to text categorization. In Proceedings of the ACM SIGIR on Research and Development in Information Retrieval.


Application of K-NN and FPTC Based Text Categorization Algorithms.. - Ilhan (2001)

No context found.

Yang, Y., Noise Reduction in a Statistical Approach to Text Categorization, In the Proceedings of the 18th International Annual ACM/SIGIR Conference, 1995.


SCAI TREC-8 Experiments - Shin, Kim, Kim, Eom, Shin, Zhang   (1 citation)

No context found.

Yang, Y., Noise Reduction In A Statistical Approach To Text Categorization, SIGIR-95, 1995.


A Two-Stage Retrieval Model for the TREC-7 Ad Hoc Task - Dong-Ho Shin (1998)   (1 citation)

No context found.

Yang, Y., Noise Reduction in a Statistical Approach to Text Categorization, SIGIR-95, pp. 256-263, 1995.


Exploiting Thesaurus Knowledge in Rule Induction for Text.. - Junker, Abecker (1997)   (6 citations)

No context found.

, pages 256--263, Seattle, Washington, USA, July 9-13 1995.

CiteSeer.IST - Copyright Penn State and NEC