
Computational Aspects of Resilient Data Extraction from Semistructured Sources (Extended Abstract) (2000)  (8 citations)
Hasan Davulcu, Guizhen Yang, Michael Kifer, I.V. Ramakrishnan
Symposium on Principles of Database Systems



From: sunysb.edu/~kifer/papers
Affiliation: Department of Computer Science, SUNY at Stony Brook, Stony Brook, NY 11794-4400, USA ({davulcu,guizyang,kifer,ram}@cs.sunysb.edu)

Abstract: Automatic data extraction from semistructured sources such as HTML pages is rapidly growing into a problem of significant importance, spurred by the growing popularity of the so-called "shopbots" that enable end users to compare prices of goods and other services at various web sites without...

Context of citations to this paper:

...under a large class of variations in the page layout. Some of the techniques used to create unambiguous resilient expressions are detailed in [7]. To illustrate the idea, consider the second visible table in Figure 1 (below the Component Detail header) This table is actually...

Cited by:
Semantic Bookmarking for Non-Visual Web Access - Saikat Mukherjee Stony
On Precision and Recall of Multi-Attribute Data.. - Yang, Mukherjee.. (2003)
Reverse Engineering for Web Data: From Visual to Semantic.. - Chung, Gertz (2002)

Active bibliography (related documents):
0.4:   Extraction Techniques for Mining Services from Web Sources - Davulcu, Mukherjee.. (2002)
0.2:   A Framework for Generating Attribute Extractors for.. - Reis, Araujo, Silva, .. (2002)
0.1:   On the Power of Semantic Partitioning of Web Documents - Yang, Mukherjee, Tan.. (2003)

Similar documents based on text:
1.2:   Design and Implementation of the Physical Layer in.. - Davulcu, Yang.. (2000)
0.4:   Cv - Davulcu
0.4:   Logic Based Modeling and Analysis of Workflows (Extended.. - Davulcu, al. (1998)

Related documents from co-citation:
4:   XTRACT: A system for extracting document type descriptors from XML documents - Garofalakis, Gionis et al. - 2000
3:   Wrapper induction for information extraction - Kushmerick, Weld et al. - 1997
3:   Wrapping web information providers by transducer induction - Chidlovskii - 2001

BibTeX entry:

H. Davulcu, G. Yang, M. Kifer, and I. Ramakrishnan. Computational aspects of resilient data extraction from semistructured sources. In 19th ACM Symposium on Principles of Database Systems, 136--144, 2000. http://citeseer.ist.psu.edu/davulcu00computational.html

@inproceedings{ davulcu00computational,
    author = "Hasan Davulcu and Guizhen Yang and Michael Kifer and I. V. Ramakrishnan",
    title = "Computational Aspects of Resilient Data Extraction from Semistructured Sources",
    booktitle = "Symposium on Principles of Database Systems",
    pages = "136--144",
    year = "2000",
    url = "citeseer.ist.psu.edu/davulcu00computational.html" }
Citations (may not include all citations):
1911   Introduction to Automata Theory - Hopcroft, Ullman - 1979
501   The lorel query language for semistructured data - Abiteboul, Quass et al. - 1997
373   Querying semi-structured data - Abiteboul - 1997
331   A query language and optimization techniques for unstructure.. - Buneman, Davidson et al. - 1996
259   Elements of the Theory of Computation - Lewis, Papadimitriou - 1981
228   Wrapper induction for information extraction - Kushmerick, Weld et al. - 1997
179   Semistructured data - Buneman - 1997
156   Inductive inference of formal languages from positive data - Angluin - 1980
145   Extracting semistructured information from the web - Hammer, Garcia-Molina et al. - 1997
142   Finding patterns common to a set of strings - Angluin - 1979
104   Wrapper generation for semi-structured internet sources - Ashish, Knoblock - 1997
80   Template-based wrappers in the tsimmis system - Hammer, Garcia-Molina et al. - 1997
62   NoDoSe: A tool for semi-automatically extracting structured .. - Adelberg - 1998
32   Learning to understand information on the internet: An examp.. - Perkowitz, Doorenbos et al. - 1997
23   Conceptual-model-based data extraction from multiple-record .. - Embley, Campbell et al. - 1999
22   Learning syntax by automata induction - Berwick, Pilato - 1987
17   Wrapper generation for web accessible data sources - Gruser, Raschid et al. - 1998
8   Web Ecology Recycling HTML page as XML document using WF - Azavant, Recycling et al. - 1999
6   Extracting semi-structured data through examples - Ribeiro-Neto, Laender et al. - 1999
6   An automated approach for retrieving hierarchical data from .. - Lim, Ng - 1999
http://www.jango.com





Documents on the same site (http://www.cs.sunysb.edu/~kifer/papers.html):
Theory of Generalized Annotated Logic Programming and its.. - Kifer, Subrahmanian (1992)
On the Decidability and Axiomatization of Query Finiteness in.. - Kifer (1998)
A Theory of Nonmonotonic Inheritance Based on Annotated Logic - Thirunarayan, Kifer (1992)
