
Mining Data Records in Web Pages (2003)  (36 citations)
Bing Liu, Robert Grossman, Y Zhai



View or download:
uic.edu/~liub/publ...ataRecordfull.pdf

From: uic.edu/~liub/publ...papers_chron

Abstract: A large amount of information on the Web is contained in regularly structured objects, which we call data records. Such data records are important because they often present the essential information of their host pages, e.g., lists of products and services. It is useful to mine such data records in order to extract information from them to provide value-added services. Existing approaches to solving this problem mainly include the manual approach, supervised learning, and automatic...

Cited by:
Classification Techniques for Categorization of Hypertext Documents - Arumugam

Active bibliography (related documents):
1.1:   Mining Data Records in Web Pages - Liu, Grossman, Zhai (2003)
0.4:   From Tables to Frames - Pivk, Cimiano, Sure (2005)
0.3:   Agents Need to Become Welcome - Magnin, Snoussi, Pham, Dury, Nie (2002)

Similar documents based on text:
0.2:   Building Text Classifiers Using Positive and Unlabeled Examples - Liu, Dai, Li, et al. (2003)
0.2:   A Refinement Approach to Handling Model Misfit in Text.. - Wu, Phang, Liu, Li
0.2:   Learning with Positive and Unlabeled Examples Using Weighted.. - Lee, Liu (2003)

Related documents from co-citation:
1519:   On integrating catalogs - Agrawal, Srikant - 2001
1426:   Nearest Neighbor Algorithm for Text Categorization (context) - Baoli, Shiwen et al. - 2003
1271:   Mining the Web: Discovering knowledge from hypertext data (context) - Chakrabarti - 2003

BibTeX entry:

Liu, B., Grossman, R. and Zhai, Y. Mining data records in Web pages. UIC Technical Report, 2003. http://citeseer.ist.psu.edu/liu03mining.html

@misc{ liu03mining,
  author = "B. Liu and R. Grossman and Y. Zhai",
  title = "Mining data records in Web pages",
  text = "Liu, B., Grossman, R. and Zhai, Y. Mining data records in Web pages. UIC
    Technical Report, 2003.",
  year = "2003",
  url = "citeseer.ist.psu.edu/liu03mining.html" }

Citations (may not include all citations):
576   Authoritative sources in a hyperlinked environment - Kleinberg - 1998
228   Wrapper induction for information extraction - Kushmerick, Weld et al. - 1997
171   A scalable comparison shopping agent for the World Wide Web - Doorenbos, Etzioni et al. - 1997
145   Extracting semi-structured information from the Web - Hammer, Garcia-Molina et al. - 1997
81   Learning to construct knowledge base from the World Wide Web.. - Craven, DiPasquo et al. - 2000
68   Learning to extract text-based information from the World Wi.. - Soderland - 1997
62   A hierarchical approach to wrapper induction - Muslea, Minton et al. - 1999
61   Mining the Web: Discovering Knowledge from Hypertext Data - Chakrabarti - 2002
50   XWrap: an XML-enabled wrapper construction system for Web in.. - Liu, Pu et al. - 2000
48   Generating finite-state transducers for semi-structured data.. - Hsu, Dung - 1998
44   Wrapper induction: efficiency and expressiveness - Kushmerick - 2000
40   Record-boundary discovery in Web documents - Embley, Jiang et al. - 1999
39   Algorithms on strings - Gusfield - 1997
19   A flexible learning system for wrapping tables and lists in .. - Cohen, Hurst et al. - 2002
13   WysiWyg Web wrapper Factory - Sahuguet, Azavant - 1999
10   Mining tables from large scale html texts - Chen, Tsai et al. - 2000
8   Automatic data extraction from lists and tables in web sourc.. - Lerman, Knoblock et al. - 2001
5   IEEE Data Engineering Bulletin - Cohen, McCallum et al. - 2000
4   A machine learning based approach for table detection on the.. - Wang, Hu - 2002
3   Data mining for Web intelligence - Han, Chang - 2002
3   A fully automated extraction system for the World Wide Web - Buttler, Liu et al. - 2001
2   Algorithms for string matching: A survey - Baeza-Yates - 1989





Documents on the same site (http://www.cs.uic.edu/~liub/publications/papers_chron.html):
Building Text Classifiers Using Positive and Unlabeled Examples - Liu, Dai, Li, et al. (2003)
Learning with Positive and Unlabeled Examples Using Weighted.. - Lee, Liu (2003)
Mining Data Records in Web Pages - Liu, Grossman, Zhai (2003)


CiteSeer.IST - Copyright Penn State and NEC