Asymmetric missing-data problems: overcoming the lack of negative data in preference ranking (2002)
Aleksander Kołcz and Joshua Alspector, Personalogy, Inc., 24 South Weber Suite...
Information Retrieval



View or download:
iit.edu/~alek/ir_2002.pdf

From: iit.edu/~alek/publications

Abstract: In certain classification problems there is a strong asymmetry between the number of labeled examples available for each of the classes involved. In an extreme case, there may be a complete lack of labeled data for one of the classes while, at the same time, there are adequate labeled examples for the others, accompanied by a large body of unlabeled data. Since most classification algorithms require some information about all classes involved, label estimation for the un-represented class is...

Active bibliography (related documents):
0.8:   WebMate: A Personal Agent for Browsing and Searching - Chen, Sycara (1998)
0.5:   On a Recursive Spectral Algorithm for Clustering from.. - Cheng, Kannan..
0.5:   Stemming for Term Conflation in Malay Texts - Idris, Mustapha (2001)

BibTeX entry:

@article{ cz02asymmetric,
    author = "Aleksander Ko{\l}cz and Joshua Alspector",
    title = "Asymmetric Missing-data Problems: Overcoming the Lack of Negative Data in Preference Ranking",
    journal = "Information Retrieval",
    volume = "5",
    number = "1",
    publisher = "Kluwer Academic Publishers",
    pages = "5--40",
    year = "2002",
    url = "citeseer.ist.psu.edu/750477.html" }
Citations (may not include all citations):
2528   Maximum likelihood from incomplete data via the EM algorithm - Dempster, Laird et al. - 1977
947   Statistical Learning Theory - Vapnik - 1998
416   Information retrieval - van Rijsbergen - 1979
376   Text categorization with support vector machines: Learning w.. - Joachims - 1997
288   Relevance feedback in information retrieval - Rocchio - 1971
268   Making large-scale svm learning practical - Joachims - 1999
244   Letizia: An agent that assists web browsing - Lieberman - 1995
243   Information Retrieval: Data Structures and Algorithms - Frakes, Baeza-Yates et al. - 1992
226   The EM Algorithm and Extensions - McLachlan, Krishnan - 1996
201   Relevance weighting of search terms - Robertson, Sparck-Jones - 1976
191   The SMART Retrieval System: Experiments in Automatic Documen.. - Salton - 1971
189   Webwatcher: A tour guide for the world wide web - Joachims, Freitag et al. - 1997
180   Combining labeled and unlabeled data with co-training - Blum, Mitchell - 1998
166   A re-examination of text categorization methods - Yang, Liu - 1999
103   Naive (Bayes) at forty: the independence assumption in information retriev.. - Lewis - 1998
100   Personalized information delivery: An analysis of informatio.. - Foltz, Dumais - 1992
91   Improving generalization with active learning - Cohn, Atlas et al. - 1994
83   Query by committee - Seung, Opper et al. - 1992
80   Learning to classify text from labeled and unlabeled documen.. - Nigam, McCallum et al. - 2000
77   Probabilistic outputs for support vector machines and compar.. - Platt - 2000
72   Optimization of relevance feedback weights - Buckley, Salton - 1995
55   Using probabilistic models of document retrieval without rel.. - Croft, Harper - 1979
48   NewsWeeder: Learning to filter NetNews - Lang - 1995
32   A comparison of event models for naive bayes text classificati.. - McCallum, Nigam - 1998
29   Less is more: Active learning with support vector machines - Schohn, Cohn - 2000
23   Addressing the curse of imbalanced training sets: One-sided .. - Kubat, Matwin - 1997
20   A user model neural network for a personal news service - Jennings, Higuchi et al. - 1993
17   MetaCost: A general method for making classifiers cost-sensiti.. - Domingos - 1999
14   Automatic query expansion using SMART: TREC-3 - Buckley, Salton et al. - 1995
12   Analyzing the effectiveness and applicability of co-training - Nigam, Ghani - 2000
11   Learning and revising user profiles: The identification of inter.. - Pazzani, Billsus - 1997
10   Learning when negative examples abound - Kubat, Holte et al. - 1997
10   Active learning using adaptive resampling - Iyengar, Apte et al. - 2000
9   Boosting and Rocchio applied to text filtering - Schapire, Singer et al. - 1998
9   A sequential algorithm for training text classifiers - Lewis, Gale - 1994
7   Knowledge discovery and knowledge validation in intensive ca.. - Morik, Imhoff et al. - 2000
4   A hybrid user model for news story classification - Billsus, Pazzani - 1999
3   Combining content-based and collaborative filters in an online.. - Claypool, Gokhale et al. - 1999
3   Information filtering based on user behavior analysis and best.. - Morita, Shinoda - 1994
3   Reducing misclassification costs - Pazzani, Merz et al. - 1994
2   Using unsupervised learning to guide resampling in imbalance.. - Nickerson, Japkowicz et al. - 2001
2   An algorithm for suffix stripping - Porter - 1980
2   Handling imbalanced data sets in insurance risk modeling - Pednault, Rosen et al. - 2000
2   Machine learning from imbalanced data sets 101 - Provost - 2000
1   Automating the creation of information filters - Stevens - 1992
1   ANATAGONOMY: a personalized newspaper on the world wide we.. - Kamba, Sakagami et al. - 1997
1   Learning queries in a query zone - Mitra, Singhal et al. - 1997
1   The class imbalance problem: Significance and strategies - Japkowicz - 2000

Documents on the same site (http://ir.iit.edu/~alek/publications.html):
Summarization as Feature Selection for Text Categorization - Kolcz, Prabakarmurthi.. (2001)
SVM-based Filtering of E-mail Spam with Content-specific Misclassification Costs.. (2001)
