X. Gao and L. Sterling. AutoWrapper: Automatic Wrapper Generation for Multiple Online Services. In Proceedings of Asia Pacific Web Conference 1999.
....probably concentrate our efforts on the information extraction task. Up to now, there has been a lot of research upon which we can build within the field of Information Extraction with regard to the Web; most importantly, on different approaches to the manual or automatic construction of Wrappers [22, 16, 34, 2, 28, 23]. Wrappers are highly specialised software modules that are able to parse HTML documents belonging to a tightly defined thematic domain (e.g. car or real estate advertisements) in order to extract their information. The Web genre notion could be optimally used to generalize these up to now ....
X. Gao and L. Sterling. AutoWrapper: Automatic Wrapper Generation for Multiple Online Services. In Proceedings of Asia Pacific Web Conference 1999.
....inductive logic programming algorithm for learning wrappers, also using labelled examples. Cohen [6] introduced a method for learning a general extraction procedure from pairs of page-specific wrappers and the pages they wrap, although the method was restricted to simple list structures. AutoWrapper [11] induces wrappers from unlabelled examples but is restricted to simple table structures. Ghani et al. [12] combined extraction of data from corporate websites with data mining on the resulting information. The TSIMMIS project [5] is another system aimed at integrating web data sources; however, its ....
X. Gao and L. Sterling, "AutoWrapper: automatic wrapper generation for multiple online services," in Asia Pacific Web Conference '99, Hong Kong (1999).
....was the case in the past. Recent work on data integration has not encompassed knowledge creation and is handicapped by the significant effort needed to set up a KIDM infrastructure (e.g. most work requires source wrappers to be painstakingly hand crafted and efforts to automate wrapper construction [9, 13, 16] have so far been constrained to very specific contexts). Thus, the need to meet the challenge above is becoming both more acute and more widespread. The main contribution of this paper is an architectural approach to the solution of some aspects of the above problem, particularly in the context ....
X. Gao and L. Sterling. AutoWrapper: Automatic Wrapper Generation for Multiple Online Services. In Proceedings of Asia Pacific Web Conference 1999 (APWeb99), 1999.
....classes of wrappers that could be induced from labelled examples. Craven et al. [4] describe an inductive logic programming algorithm for learning wrappers, also using labelled examples. The only work of which we are aware that induces general wrappers from unlabelled examples is AutoWrapper [6], which uses a different approach based on finding similarities between table rows. 6 Conclusions and further work In conclusion, we have demonstrated a method for generating information extraction wrappers using grammatical inference. Our approach does not require the overhead of ....
X. Gao and L. Sterling, "AutoWrapper: automatic wrapper generation for multiple online services," Asia Pacific Web Conference '99, Hong Kong (1999).