Lau, T., Etzioni, O., and Weld, D. (
....study of shared calendar systems [7] resulted in the classic finding that information sharing can break down when the people who must do the work to share the information aren't the same people who will benefit. Other research on additional types of information to share includes Lau et al. [9], who discussed CollabClio, an interface for people to share information about which web pages they had visited. CollabClio looked at mechanisms for people to describe which information they wanted to share with whom, as well as mechanisms for indicating to people that they were performing actions ....
Lau, T., Etzioni, O., and Weld, D. (
....to gather Web pages for a search engine database. However it is not controlled and traces to only linked pages. NaviPlanning can trace to unlinked pages, and is controlled for generating a plan. Some learning systems have been developed for information gathering and browsing in the WWW. ShopBot (Doorenbos, Etzioni, Weld 1997) learns the text pattern indicating the price of CD ROMs, and searches for the cheapest one more efficiently than a human. The purpose of ShopBot is different from our research. WebWatcher (Armstrong et al. 1995) and Letizia (Lieberman 1995) are able to indicate the Web pages which a user wants to ....
Doorenbos, R. B.; Etzioni, O.; and Weld, D. S. 1997.
....1 It should be noted that although we are dealing with unifying information from distinct information sources in the context of value driven information gathering, these techniques are valid in domains with few information sources. are developing much more open ended extraction engines (Doorenbos, Etzioni, Weld 1997; Ashish Knoblock 1997a; 1997b; Konopnicki Shmueli 1995; Genesereth, Keller, Mueller 1996). Figure 1 shows the influence diagram used by VDIG to evaluate digital cameras. [Figure 1 node labels: Decision (model X or None); Camera evaluation (casio 10A); Camera evaluation (Quicktake 150)] ....
Doorenbos, R. B.; Etzioni, O.; and Weld, D. S. 1997.
....and passed to a natural language generator for response. For machine generated documents, the idea of wrapper induction was introduced to acquire the regularities in the data (Kushmerick 1998; Muslea, Minton, Knoblock 1998). Similar ideas were applied to learning query based information sources (Doorenbos, Etzioni, Weld 1997) or manually edited pages (Hsu 1998). SRV was proposed to extract information from electronic seminar announcements, medical abstracts, and news wire articles on corporate acquisitions (Freitag 1998). Most of these approaches rely on syntactical patterns in a document. Any semantic processing is ....
Doorenbos, R. B.; Etzioni, O.; and Weld, D. S. 1997.
....paper that has not been addressed previously is how one represents the information within a single page, across pages at a site, and across sites to support web based information integration. Another closely related body of work is on the extraction of data from web sources (Hammer et al. 1997; Doorenbos et al. 1997; Kushmerick 1997). The focus of all of these systems is on building wrappers for semi structured sources. The systems either take a template based specification of a source, as in (Hammer et al. 1997), or learn the structure of the source by example and then compile a wrapper that provides ....
Doorenbos, R.B.; Etzioni, O.; and Weld, D.S. 1997.
....have shown an interest in the general IE problem. At the University of Washington, several projects have explored the possibility of machine learning for information extraction from World Wide Web pages. The ShopBot, for example, is able to locate catalog listings for novel vendor sites on the Web (Doorenbos, Etzioni, Weld 1996). Once there, it forms a model of the listing format, in order to extract details such as product name and price. In (Kushmerick, Weld, Doorenbos 1997) this idea is elaborated into a general purpose algorithm for inferring such patterns under certain assumptions. This can be regarded as ....
Doorenbos, R. B.; Etzioni, O.; and Weld, D. S. 1996.
....[Figure 1: (a) fragment of Caltech CS faculty Web page and (b) its HTML source (as of November, 1997)] rapid wrapper construction (e.g., Doorenbos, Etzioni, Weld 1997; Ashish Knoblock 1997; Kushmerick 1997). Essentially, these wrappers extract a tuple by scanning the input HTML string, recognizing the delimiters surrounding the first attribute, and repeating the same steps for the next attribute until all attributes are extracted. (Kushmerick 1997) advanced ....
Doorenbos, R. B.; Etzioni, O.; and Weld, D. S. 1997.
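The delimiter-scanning behaviour described in the excerpt above can be illustrated with a minimal Python sketch. It is not the algorithm of any of the cited systems: the function name `extract_tuples`, the delimiter pairs, and the sample markup are all hypothetical, chosen only to show the scan-one-attribute-then-the-next loop.

```python
from typing import List, Sequence, Tuple


def extract_tuples(html: str,
                   delimiters: Sequence[Tuple[str, str]]) -> List[Tuple[str, ...]]:
    """Scan `html` left to right, extracting one attribute per (left, right) pair.

    Each pass over `delimiters` yields one tuple; scanning stops as soon as
    a delimiter can no longer be found.
    """
    if not delimiters:
        raise ValueError("at least one (left, right) delimiter pair is required")
    rows: List[Tuple[str, ...]] = []
    pos = 0
    while True:
        row = []
        for left, right in delimiters:
            start = html.find(left, pos)       # locate the left delimiter
            if start == -1:
                return rows
            start += len(left)
            end = html.find(right, start)      # locate the matching right delimiter
            if end == -1:
                return rows
            row.append(html[start:end].strip())
            pos = end + len(right)             # continue scanning after this attribute
        rows.append(tuple(row))


# Hypothetical markup, loosely modelled on a faculty list (an assumption, not the cited figure).
page = ("<LI> John Tanner, <I> Visiting Associate of Computer Science </I>"
        "<LI> Fred Thompson, <I> Professor Emeritus </I>")
faculty = extract_tuples(page, [("<LI>", ","), ("<I>", "</I>")])
```

A wrapper of this shape is brittle by design: it depends entirely on the delimiters being stable across records, which is why the cited work focuses on learning them from examples.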
....and returns tuples of arity $m_v$ that satisfy the following implication: $(\forall \bar{Z})\; v(\bar{Z}) \Rightarrow r_v(\bar{Z})$. That is, every tuple obtained from the information source satisfies the conjunction $r_v$. Note that the description does not imply that the source contains all the tuples that satisfy $r_v$ (see [ Etzioni et al. 1994 ] for a formalism dealing with complete information). The representation of information sources satisfies our desiderata. The expressive power of conjunctive queries, Classic descriptions and order constraints provides a very rich language in which fine grained distinctions between sources can be ....
....1995b ] but algorithms for answering queries for this formalism were not described. Context logics have also been proposed for modeling contents of information sources [ Farquhar et al. 1995 ] but designing algorithms for determining relevance of sources has not been addressed. Finally, [ Etzioni et al. 1994 ] describes an elegant formalism and algorithms for representing that a source has complete information of a certain kind, and shows that such information can be used to prune accesses to information sources. One direction in which we are currently extending the Information Manifold is to exploit ....
Etzioni, Oren; Golden, Keith; and Weld, Daniel 1994.
....we describe novel query processing algorithms used to combine information from multiple sources. In particular, our algorithms are guaranteed to find exactly the set of information sources relevant to a query, and to completely exploit knowledge about local closed world information [ Etzioni et al. 1994 ]. Introduction: We are currently witnessing an explosion in the amount of information that is available online. For example, the rapid rise in popularity of the World Wide Web (WWW) has increased the amount of information available over the Internet. As another example, large companies and ....
....determining relevance are sufficiently general, such that we can incorporate Horn rules with more expressive description logics, consider queries involving negation, and statements describing relationships between the information sources. • Local closed world information (LCW), introduced in [ Etzioni et al. 1994 ], enables us to express the fact that an information source has complete knowledge about some part of the domain. The query processor can use this knowledge to prune access to redundant information sources (i.e. sources that are relevant, but whose content is contained in the union of some ....
[Article contains additional citation context not shown here]
Etzioni, Oren; Golden, Keith; and Weld, Daniel 1994.
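The pruning idea in the excerpts above (skip a relevant source when sources with complete knowledge already cover what it could contribute) can be sketched with a toy set-based model. This is an assumption-laden illustration, not the formalism of Etzioni et al. 1994 or the query processor the excerpts describe: `Source`, `covers`, `complete_for`, and `select_sources` are hypothetical names, and content is reduced to sets of category labels.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List, Set


@dataclass(frozen=True)
class Source:
    name: str
    covers: FrozenSet[str]                       # categories the source may hold data about
    complete_for: FrozenSet[str] = frozenset()   # categories it is assumed complete for


def select_sources(sources: Iterable[Source], query: Set[str]) -> List[str]:
    """Return names of sources to access, skipping redundant ones."""
    # Visit sources with the most completeness knowledge for this query first
    # (sorted() is stable, so ties keep their input order).
    ranked = sorted(sources, key=lambda s: -len(s.complete_for & query))
    closed: Set[str] = set()   # query categories already fully answered
    chosen: List[str] = []
    for src in ranked:
        relevant = src.covers & query
        if not relevant:
            continue           # irrelevant to the query
        if relevant <= closed:
            continue           # relevant, but contained in what complete sources provide
        chosen.append(src.name)
        closed |= src.complete_for & query
    return chosen


catalog = [
    Source("B", frozenset({"cars"})),
    Source("A", frozenset({"cars", "trucks"}), frozenset({"cars", "trucks"})),
    Source("C", frozenset({"boats"})),
]
plan = select_sources(catalog, {"cars", "boats"})
```

Here source B is relevant to the query but is skipped, because A is assumed complete for the only category B could contribute.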