Liu, L., Han, W., Buttler, D., Pu, C., and Tang, W. An XML-based wrapper generator for web information extraction. In Proceedings of the ACM SIGMOD International Conference on Management of Data. New York (June 1--3, 1999), pp. 540--543.
....by other tools. # XML. Our tackling of XML is different from the one of XML-QL [9] based on patterns and explicit constructs because we derive it from our extraction process that handles HTML pages with no explicit structure. For the same reason, our XML templates are more restrictive than XWrap [17]. As pointed out previously, the range of XML documents we can create is very limited, due to the choice of our template language. We think that it is important to offer an easy way to specify ### mapping, knowing that it is always possible to transform the generated XML document(s) using other tools. # ....
....structure and the OQL object model but it means writing complicated ################# queries. Semi-automatic generation benefits from support tools to help design wrappers. In WIDL [3] the entire structure understood by the system is presented to the user who has to pick what he wants. In [2] and [17], the user is presented a dual view of the document with its layout and its corresponding tree. SIMS [20] and LiveAgent [14] offer a ###################### interface where the user shows the system what information to extract. In [16] and [15] Kushmerick uses machine learning techniques to ....
[Article contains additional citation context not shown here]
Ling Liu, Calton Pu, Wei Han, David Buttler, and Wei Tang. An XML-based Wrapper Generator for Web Information Extraction. In ACM SIGMOD International Conference, June 1999.
....often both easier to use and sufficient in practise. 1 The idea of wrappers as a means of extracting information from databases was introduced by Wiederhold [Wie92]. Most wrapper applications discussed in the literature have been built for extracting data from Web sources [HGM97] [AsK97] [Ade98] [LHB99] [LPH00] [SaA99]. The wrapper specification for HTML pages is either a set of rules or a query in some query language designed for wrapping. The wrappers are built semi-automatically or automatically from these specifications using a special tool or a programming language. For heterogenous data ....
L. Liu, W. Han, D. Buttler, C. Pu, and W. Tang. An XML-based wrapper generator for Web information extraction. SIGMOD Record, 28(2):540-543, 1999.
....the Web, i.e. Web pages are not well structured and there is no schema that describes the contents of Web pages. It exploits the formatting information on Web pages to hypothesize the underlying structure of a page. With this structure a wrapper that facilitates queries on the page is generated [1, 5, 8, 12, 16, 21]. While there are various issues in Web query processing and different approaches to tackling these issues, we describe in this paper our efforts to build an online query processing system that enables users to query the Web with ease and obtain the results in a database-like fashion. By our ....
L. Liu, W. Han, D. Buttler, C. Pu, W. Tang. An XML-Based Wrapper Generator for Web Information Extraction. In Proc. of ACM-SIGMOD 99, pp. 540-543, May 1999.
No context found.
Liu, L., Han, W., Buttler, D., Pu, C., and Tang, W. An XML-based wrapper generator for web information extraction. In Proceedings of the ACM SIGMOD International Conference on Management of Data. New York (June 1--3, 1999), pp. 540--543.