13 citations found.
W. Meng, W. Wang, H. Sun, and C. Yu. Concept hierarchy-based text database categorization. Knowledge and Information Systems, 4:132--150, 2002.

This paper is cited in the following contexts:
Query- vs. Crawling-based Classification of Searchable.. - Gravano, Ipeirotis.. (2002)

....web databases. The average number of queries sent to each database was 182, and no documents needed to be retrieved from the databases. Furthermore, the number of words per query ranged between just one and four words. Further details of our algorithm and evaluation are described in [8]. See [2, 3, 10, 6, 7] .... [Figure 2: Distribution of documents in the .... Axis: Specificity (0 to 1). Databases: CNN Sports Illustrated, Johns Hopkins AIDS Service, Tom's Hardware Guide, Office of Scientific and Technical Information, Duke University Rare Books. Categories: Arts, Computers, Health, Science, Sports.]

Wenxian Wang, Weiyi Meng, and Clement Yu. Concept hierarchy based text database categorization in a metasearch engine environment. In Proceedings of the First International Conference on Web Information Systems Engineering (WISE'2000).


Query- vs. Crawling-based Classification - Of Searchable Web (2002)

....[Figure 2: Distribution of documents in the top level categories for five searchable web databases. Databases shown include Johns Hopkins AIDS Service, Tom's Hardware Guide, Office of Scientific and Technical Information, Duke University Rare Books. Axis: Specificity. Categories: Arts, Computers, Health, Science, Sports.] ....in [8]. See [2, 3, 10, 6, 7] for other related work relevant to database classification. As we discussed in the introduction, our technique can also be applied to the classification of any database that offers a search interface for its contents, no matter if its contents are hidden or not. 3.2 Crawling based ....

Wenxian Wang, Weiyi Meng, and Clement Yu. Concept hierarchy based text database categorization in a metasearch engine environment. In Proceedings of the First International Conference on Web Information Systems Engineering (WISE'2000).


Frequency-Based Coverage Statistics Mining for Data Integration - Nie, Kambhampati (2003)

....discuss how such coverage statistics could be learned. In contrast, our main aim in this paper is to provide a framework for learning the required statistics. There has also been some work on ranking text databases in the context of key word queries submitted to meta search engines. Recent work ([WMY00], [IGS01]) considers the problem of classifying text databases into a topic hierarchy. While our approach is similar to these approaches in terms of using concept hierarchies, and using probing and counting methods, it differs in several significant ways. First, the text database work uses a ....

W. Wang, W. Meng, and C. Yu. Concept Hierarchy based text database categorization in a metasearch engine environment. In WISE2000, June 2000.


Mining Coverage Statistics for Websource Selection In a .. - Nie, Nambiar, Vaddi.. (2002)

....these classes. From a learning point of view, we believe that our approach makes better sense since inter class overlap statistics cannot be learned directly. There has also been some work on ranking text databases in the context of key word queries submitted to meta search engines. Recent work ([WMY00], [IGS01]) considers the problem of classifying text databases into a topic hierarchy. While our approach is similar to these approaches in terms of using concept hierarchies, and using probing and counting methods, it differs in several significant ways. First, the text database work uses a ....

....choose the conference attribute as the classificatory attribute, even if we know the domain of the year attribute. Once the classificatory attributes are selected, the AV hierarchies for those attributes can either be provided by the mediator designer (using existing domain ontologies, c.f. [WMY00; IGS01]) or be automatically generated through clustering techniques. In the following discussion, we will assume that AV hierarchies are made available. We discuss the issues involved in the hierarchy generation in Section 9. Query Classes: Since we focus on selection queries, a typical query ....

[Article contains additional citation context not shown here]

W. Wang, W. Meng, and C. Yu. Concept Hierarchy based text database categorization in a metasearch engine environment. In WISE2000, June 2000.


Mining Source Coverage Statistics for Data Integration - Nie, Kambhampati, Nambiar.. (2001)

....[NK01] we describe a framework that uses both coverage and response time statistics to jointly optimize the cost and coverage of query plans in data integration. There has been some work on ranking text databases in the context of key word queries submitted to meta search engines. Recent work ([WMY00], [IGS01]) considers the problem of classifying text databases into a topic hierarchy. While their approach involves estimating the relevance of a database for a given topic, the textual nature of the databases precludes any sophisticated estimation of coverage and overlap. 7. CONCLUSION In ....

W. Wang, W. Meng, and C. Yu. Concept Hierarchy based text database categorization in a metasearch engine environment. In WISE2000, June 2000. http://rakaposhi.eas.asu.edu/havasu.html


Probe, Count, and Classify: Categorizing Hidden-Web Databases - Ipeirotis, Gravano, Sahami (2001)   (2 citations)

....Techniques for Comparison We tested variations of our probing technique, which we refer to as Probe and Count, against two alternative strategies. The first one is an adaptation of the technique described in [2] which we refer to as Document Sampling. The second one is a method described in [29] that was specifically designed for database classification. We will refer to this method as Title based Querying. The methods are described in detail below. Probe and Count (PnC) This is our technique, described in Section 3, which uses a document classifier for each internal node of our ....

....our Probe and Count technique is that we only use the number of matches reported by each database, while the Document Sampling technique requires retrieving and analyzing the actual documents from the database for the key Step 4 termination condition test. Title based Querying (TQ) Wang et al. [29] present three different techniques for the classification of searchable web databases. For our experimental evaluation we picked the method they deemed best. Their technique creates one long query for each category using the title of the category itself (e.g., Baseball) augmented by the titles ....

[Article contains additional citation context not shown here]

W. Wang, W. Meng, and C. Yu. Concept hierarchy based text database categorization in a metasearch engine environment. In Proceedings of First International Conference on Web Information Systems Engineering (WISE'2000), June 2000.


Clustering E-Commerce Search Engines - Qian Peng Weiyi (2004)   Self-citation (Meng Yu)

No context found.

W. Meng, W. Wang, H. Sun, C. Yu. Concept Hierarchy Based Text Database Categorization. Journal of Knowledge and Information Systems, 4(2), 2002, 132-150.


Towards a Highly-Scalable Metasearch Engine - Meng, Yu, Wu   Self-citation (Meng Yu)

....Several researchers have studied the idea of assigning databases to concepts and/or clustering documents into new databases to improve retrieval effectiveness in a metasearch engine or distributed document retrieval environment [9, 36]. Positive experimental results have been reported. In [35], three methods are proposed for assigning databases to concepts in a concept hierarchy built from the category hierarchy of Yahoo. Two of the three methods can assign databases automatically. The concept hierarchy and its associated databases can be used as follows. Before a user submits a query ....

W. Wang, W. Meng, and C. Yu. Concept Hierarchy Based Text Database Categorization in a Metasearch Engine Environment. Manuscript under preparation. 2000.


Parameterized Generation of Labeled Datasets for.. - Davidov.. (2004)   (1 citation)

No context found.

W. Meng, W. Wang, H. Sun, and C. Yu. Concept hierarchy-based text database categorization. Knowledge and Information Systems, 4:132--150, 2002.


A Frequency-based Approach for Mining Coverage Statistics in .. - Nie, Kambhampati (2004)

No context found.

W. Wang, W. Meng, and C. Yu. Concept Hierarchy based text database categorization in a metasearch engine environment. In Proc. of WISE, June 2000.


A Frequency-based Approach for Mining Coverage Statistics in .. - Nie, Kambhampati (2004)

No context found.

W. Wang, W. Meng, and C. Yu. Concept Hierarchy based text database categorization in a metasearch engine environment. In WISE2000, June 2000.


Parameterized Generation of Labeled Datasets for.. - Davidov.. (2004)   (1 citation)

No context found.

W. Meng, W. Wang, H. Sun, and C. Yu. Concept hierarchy-based text database categorization. Knowledge and Information Systems, 4:132--150, 2002.


Approximate Information Filtering with Multiple.. - Stuckenschmidt (2002)

No context found.

W. Wang, W. Meng and C. Yu, Concept hierarchy based text database categorization in a metasearch engine environment, Proceedings of First International Conference on Web Information Systems Engineering (2000).

CiteSeer.IST - Copyright Penn State and NEC