B. Thomas, Token-Templates and Logic Programs for Intelligent Web Search, Journal of Intelligent Information Systems 14 (2000) 241--261.
.... expressions and to learn a procedure (henceforth called wrapper) that allows for an extraction of this information (cf. [12, 21, 8], e.g.). AEFSs seem to provide an appropriate framework to describe extraction procedures that naturally comprises the approaches proposed in the IE community (cf. [12, 22], e.g.). For illustration, consider the following table and its LaTeX source, which contains details about the first half dozen workshops on Algorithmic Learning Theory (ALT). The aim of the IE task is to extract all pairs (y,c) that refer to the year y and the corresponding conference ....
B. Thomas, Token-Templates and Logic Programs for Intelligent Web Search, Journal of Intelligent Information Systems 14 (2000) 241-261.
....procedures) for certain web domains. Whenever the extraction agents visit one of these domains during their search, they use these pre-learned wrappers to extract information from one of the web pages. Currently the wrapper toolkit uses a one-shot learning strategy [Grieser et al. 2000; Thomas, 2000; Thomas, 1999a; Thomas, 1999b] extended with a special document representation based on the Document Object Model (DOM) representation. The general strategy to learn left and right anchors to define the start and end point for extraction (delimiters) is kept, but extended to path learning in the ....
Bernd Thomas. Token-Templates and Logic Programs for Intelligent Web Search. Intelligent Information Systems, 14(2/3):241--261, March-June 2000. Special Issue: Methodologies for Intelligent Information Systems.
....(type=html, tag=a, href=Url) sequence(P, E, TokenSeq) not in(token(type=html end, tag=a) TokenSeq) next(E, token(type=html end, tag=a) extracts tuples of the form Description, URL from an HTML document. For further details of how logic programs can be used for information extraction, see [35]. In the last decade various representations have been developed, some influenced largely by logic programming [16, 34] and other slot-oriented approaches motivated by natural language processing. In essence they all can be represented without much effort in a first-order predicate logic syntax. ....
B. Thomas. Token-Templates and Logic Programs for Intelligent Web Search. Intelligent Information Systems, 14(2/3):241--261, March-June 2000. Special Issue: Methodologies for Intelligent Information Systems.
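The excerpt above describes a rule that extracts (Description, URL) tuples from the anchor elements of an HTML document. As a rough illustration only, the same idea can be sketched in Python with the standard library's `html.parser`. This is not the token-template formalism of the cited paper; the class name `LinkExtractor` and the helper `extract_links` are hypothetical names chosen for this sketch.

```python
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collect (description, url) pairs from <a href=...>...</a> elements."""

    def __init__(self):
        super().__init__()
        self.pairs = []      # extracted (description, url) tuples
        self._href = None    # href of the anchor currently open, if any
        self._text = []      # text fragments seen inside that anchor

    def handle_starttag(self, tag, attrs):
        # Only anchors that actually carry an href start an extraction.
        if tag == "a":
            href = dict(attrs).get("href")
            if href is not None:
                self._href = href
                self._text = []

    def handle_data(self, data):
        # Text between the opening and closing anchor tag is the description.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        # The closing anchor tag ends the extraction and emits one tuple.
        if tag == "a" and self._href is not None:
            self.pairs.append(("".join(self._text).strip(), self._href))
            self._href = None


def extract_links(html):
    """Return all (description, url) pairs found in the given HTML string."""
    parser = LinkExtractor()
    parser.feed(html)
    parser.close()
    return parser.pairs
```

The opening anchor tag plays the role of the start token and the closing tag that of the end token; everything in between is taken as the description.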
....on web documents (HTML documents) it makes more sense to use the special text formatting and annotating strings (tags) of these documents to recognize and extract relevant information. We use the syntax-based approach of automatically learning extraction procedures (wrappers [18]) described in [7, 15, 16]. Similar approaches using machine learning techniques for the automatic wrapper construction are described in [3, 6, 9]. To extract information, we can assume special text parts to be delimiters marking the beginning and the end of the relevant information to be extracted. Thus the key idea of ....
B. Thomas. Token-Templates and Logic Programs for Intelligent Web Search. Intelligent Information Systems, 14(2/3):241--261, March-June 2000. Special Issue: Methodologies for Intelligent Information Systems.
....on web documents (HTML documents) it makes more sense to use the special text formatting and annotating strings (tags) of these documents to recognize and extract relevant information. We use the syntax-based approach of automatically learning extraction procedures (wrappers [16]) described in [5, 13, 14]. Similar approaches using machine learning techniques for the automatic wrapper construction are described in [1, 4, 7]. To extract information, we can assume special text parts to be delimiters marking the beginning and the end of the relevant information to be extracted. Thus the key idea of ....
B. Thomas. Token-Templates and Logic Programs for Intelligent Web Search. Intelligent Information Systems, 14(2/3):241--261, March-June 2000. Special Issue: Methodologies for Intelligent Information Systems.
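Several excerpts above rest on the same idea: fixed text parts (left and right delimiters, or anchors) mark where the relevant information begins and ends. A minimal sketch of applying such a delimiter pair, assuming the delimiters are already known, might look as follows. The learning step that the cited work is about is not shown; the function name `extract_between` and the sample strings are assumptions made for illustration.

```python
def extract_between(text, left, right):
    """Return every substring of `text` enclosed by `left` and `right`.

    Scans left to right; after each match the search resumes behind
    the right delimiter, so matches never overlap.
    """
    if not left or not right:
        raise ValueError("delimiters must be non-empty")

    results = []
    pos = 0
    while True:
        start = text.find(left, pos)
        if start == -1:
            break                     # no further left delimiter
        start += len(left)            # content begins after the delimiter
        end = text.find(right, start)
        if end == -1:
            break                     # unmatched left delimiter: stop
        results.append(text[start:end])
        pos = end + len(right)
    return results


# Hypothetical sample row; table cell tags act as delimiters.
row = "<td>first</td><td>second</td>"
print(extract_between(row, "<td>", "</td>"))
```

A wrapper in this simple sense is just the pair of delimiters; wrapper learning then amounts to finding a pair that isolates the wanted text on the example pages.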