Abstract: A large amount of information on the Web is contained in
regularly structured objects, which we call data records. Such
data records are important because they often present the essential
information of their host pages, e.g., lists of products and services.
It is useful to mine such data records in order to extract
information from them to provide value-added services. Existing
approaches to solving this problem mainly include the manual
approach, supervised learning, and automatic...
Cited by:
Classification Techniques for Categorization of Hypertext Documents - Arumugam
Active bibliography (related documents):
1.1: Mining Data Records in Web Pages - Liu, Grossman, Zhai (2003)
0.4: From Tables to Frames - Pivk, Cimiano, Sure (2005)
0.3: Agents Need to Become Welcome - Magnin, Snoussi, Pham, Dury, Nie (2002)
Similar documents based on text:
0.2: Building Text Classifiers Using Positive and Unlabeled Examples - Liu, Dai, Li, et al. (2003)
0.2: A Refinement Approach to Handling Model Misfit in Text.. - Wu, Phang, Liu, Li
0.2: Learning with Positive and Unlabeled Examples Using Weighted.. - Lee, Liu (2003)
Related documents from co-citation:
1519: On integrating catalogs - Agrawal, Srikant - 2001
1426: Nearest Neighbor Algorithm for Text Categorization - Baoli, Shiwen et al. - 2003
1271: Mining the Web: Discovering knowledge from hypertext data - Chakrabarti - 2003
BibTeX entry:
Liu, B., Grossman, R. and Zhai, Y. Mining data records in Web pages. UIC Technical Report, 2003. http://citeseer.ist.psu.edu/liu03mining.html
@misc{liu03mining,
  author = "B. Liu and R. Grossman and Y. Zhai",
  title = "Mining data records in Web pages",
  text = "Liu, B., Grossman, R. and Zhai, Y. Mining data records in Web pages. UIC Technical Report, 2003.",
  year = "2003",
  url = "http://citeseer.ist.psu.edu/liu03mining.html"
}
Citations (may not include all citations):
576: Authoritative sources in a hyperlinked environment - Kleinberg - 1998
228: Wrapper induction for information extraction - Kushmerick, Weld et al. - 1997
171: A scalable comparison shopping agent for the World Wide Web - Doorenbos, Etzioni et al. - 1997
145: Extracting semi-structured information from the Web - Hammer, Garcia-Molina et al. - 1997
81: Learning to construct knowledge base from the World Wide Web.. - Craven, DiPasquo et al. - 2000
68: Learning to extract text-based information from the World Wi.. - Soderland - 1997
62: A hierarchical approach to wrapper induction - Muslea, Minton et al. - 1999
61: Mining the Web: Discovering Knowledge from Hypertext Data - Chakrabarti - 2002
50: XWrap: an XML-enabled wrapper construction system for Web in.. - Liu, Pu et al. - 2000
48: Generating finite-state transducers for semi-structured data.. - Hsu, Dung - 1998
44: Wrapper induction: efficiency and expressiveness - Kushmerick - 2000
40: Record-boundary discovery in Web documents - Embley, Jiang et al. - 1999
39: Algorithms on strings - Gusfield - 1997
19: A flexible learning system for wrapping tables and lists in .. - Cohen, Hurst et al. - 2002
13: WysiWyg Web wrapper Factory - Sahuguet, Azavant - 1999
10: Mining tables from large scale html texts - Chen, Tsai et al. - 2000
8: Automatic data extraction from lists and tables in web sourc.. - Lerman, Knoblock et al. - 2001
5: IEEE Data Engineering Bulletin - Cohen, McCallum et al. - 2000
4: A machine learning based approach for table detection on the.. - Wang, Hu - 2002
3: Data mining for Web intelligence - Han, Chang - 2002
3: A fully automated extraction system for the World Wide Web - Buttler, Liu et al. - 2001
2: Algorithms for string matching: A survey - Baeza-Yates - 1989
Documents on the same site (http://www.cs.uic.edu/~liub/publications/papers_chron.html):
Building Text Classifiers Using Positive and Unlabeled Examples - Liu, Dai, Li, et al. (2003)
Learning with Positive and Unlabeled Examples Using Weighted.. - Lee, Liu (2003)
Mining Data Records in Web Pages - Liu, Grossman, Zhai (2003)