Shoens, K., Luniewski, A., Schwarz, P., Stamos, J.: The Rufus System: Information Organization for Semi-Structured Data. In Proceedings of the 19th VLDB Conference, Dublin (1993) 97-107
....receiver, date, and contents. The information collected by the Classifier Extractor can then be exported via a translator to the rest of the system, together with the raw data. The Classifier Extractor component is based on the Rufus system developed at the IBM Almaden Research Center [Shoens et al. 1993]. Mediators. Above the translators in Figure 1 lie the mediators. A mediator is a software module that refines in some way information from one or more sources [Wiederhold, 1992]. A mediator embodies the knowledge that is necessary for processing a specific type of information. For example, ....
K. Shoens, A. Luniewski, P. Schwarz, J. Stamos, and J. Thomas. The rufus system: Information organization for semistructured data. In Proceedings of the International Conference on Very Large Databases, pages 97--107, Dublin, Ireland, August 1993.
....(how to browse it). In Tsimmis we have developed components that address all of the above issues and together provide an integrated solution to the problem of managing semistructured data. Several other recent projects have similar goals (e.g. Lore [10] Garlic [3] Information Manifold [9] Rufus [14]) but we do not survey them here. 2 Representing Semistructured Data in TSIMMIS For the Tsimmis project we have adopted a simple self-describing (or tagged) object model. Similar models have been in use for years; we call our version the Object Exchange Model, or Oem [4]. Oem is a flexible model that is ....
K. Shoens, A. Luniewski, P. Schwarz, J. Stamos, and J. Thomas. The rufus system: Information organization for semi-structured data. In Proceedings of the International Conference on Very Large Databases, pages 97--107, Dublin, Ireland, August 1993.
....rapid access to huge amounts of heterogeneous information in a distributed environment without any relocation, restructuring, or reformatting of data. Many researchers have investigated the use of metadata to support run time access to the original information [1,3,8,9,11, 12,13,19,26] Others [5,11,21,27] have investigated the use of data mining for the automatic extraction of metadata. We refine and synthesize some of the ideas contained in these efforts to provide advanced search and browsing capabilities without Bell Communications Research, 444 Hoes Lane, Piscataway, NJ 08854 LSDIS, ....
K. Shoens, A. Luniewski, P. Schwarz, J. Stamos, and J. Thomas, "The Rufus System: Information Organization for Semi-Structured Data", Proceedings of the 19th VLDB Conference, Dublin, Ireland, 1993.
....update is a major need [11, 32]. Regarding the kind of DBS used for the integration, one can distinguish four different approaches [3, 18]. First, special purpose DBS are particularly tailored to store, retrieve, and update XML documents. Examples thereof are research prototypes such as Rufus [30], Lore [20], Strudel [17] and Natix [21], as well as commercial systems such as eXcelon [24] and Tamino [28]. Second, because of the rich data modeling capabilities of object oriented DBS, they are well suited for storing hypertext documents [5, 33]. Object oriented DBS and special purpose DBS, ....
Shoens, K., et al.: The Rufus system: Information organization for semi-structured data. Proc. of the Int. Conf. On Very Large Data Bases (VLDB), Dublin, Ireland, 1993
....another, human intervention is necessary. Currently, significant research is conducted in the field of semi structured data, aiming at the extraction of information and the automatic classification of electronic documents. So far, promising results have been achieved and the reader is referred to [10, 11, 12, 13] for a summary. The present work differs mainly in two respects: it allows more freedom of form and content in the extraction of information from electronic documents and it uses an intelligent agent with automatic learning capabilities in the extraction process. 3.2. DOCUMENT FILTERING The ....
K. Shoens, A. Luniewski, P. Schwarz, J. Stamos, and J. Thomas. The Rufus system: information organization for semi-structured data. In Proceedings of the 19th Conference on Very Large Databases (VLDB '93), pages 97--107, Dublin, Ireland, 1993.
....in order to retrieve data more efficiently from the Web. Databases, however, contain structured data and most Web data is semistructured in nature and cannot be retrieved easily by using traditional techniques. Therefore, quite a lot of recent work has been targeted to handle this problem [2, 7, 11, 13, 16]. Semistructured data [1, 5] is characterized by the lack of any fixed and rigid schema, although, unlike unstructured raw data, typically the data has some implicit structure. Querying such data is greatly difficult. Due to this irregularity, each web page contains its own schema or structure. ....
K. Shoens, A. Luniewski, P. Schwarz, J. Stamos, and J. Thomas. The rufus system: Information organization for semi-structured data. In Proceedings of Nineteenth International Conference on Very Large Databases, pages 97--107, Dublin, Ireland, 1993.
....path expressions and the ability to extract information about the schema from the data. There are three possible approaches to store semi structured data (i.e. XML documents) and to execute queries on that data. One, build a special purpose database system. Example research prototypes are Rufus [31], Lore [21] and Strudel [13]. Lotus Notes is an example commercial product [20]. Such a system is particularly tailored to store and retrieve XML data, using specially designed structures and indices [23, 24] and particular query optimization techniques [15, 22]. To some extent SGML databases [36, ....
K. Shoens, A. Luniewski, P. Schwarz, J. Stamos, and J. Thomas. The Rufus system: Information organization for semi-structured data. In Proc. of the Int. Conf. on Very Large Data Bases (VLDB), pages 97--107, Dublin, Ireland, 1993.
....invoked for image files and an audio player for sound files. The user need not have knowledge about ways of launching the appropriate tool. Hypertext browsing can also be supported if the original document is a hypertext. The InfoHarness system shares some of its objectives with the RUFUS system [3]. The RUFUS system has an extensible object oriented data model, storage system, and associated search and display methods for a variety of user file types. The system automatically classifies a user's data files and extracts type specific attributes. This corresponds to the metadata extraction ....
K. Shoens, A. Luniewski, P. Schwarz, J. Stamos, and J. Thomas, "The Rufus System: Information Organization for Semi-Structured Data", Proceedings of the 19th VLDB Conference, Dublin, Ireland, 1993.
....An important next step should be to support location and repository independent queries. Early steps in this direction are implemented using associative access in [28]. In Nomenclator [24], metadata about the various repositories is cached to help constrain the search space for a query. The Rufus [30] and the InfoHarness 1 [29] systems use automatically generated metadata to access and retrieve heterogeneous information independent of type, representation and location. An approach using a global ontology divided into micro theories is discussed in [7]. We extend or build upon some of the ....
K. Shoens, A. Luniewski, P. Schwarz, J. Stamos, and J. Thomas. The Rufus System: Information Organization for Semi-Structured Data. In Proceedings of the 19th VLDB Conference, Sept. 1993.
....in the interfaces that are being developed for Mosaic. Here there exist virtual files at various sites that when accessed as a file, actually invoke a computation of some sort. Other related work can also be seen in databases that try to abstract information from files such as the Rufus system [13]. 4.0 Fragment Integration Our approach to integration combines standard control integration with a much simplified form of data integration that we call fragment integration. Rather than storing everything about the system in a database, we identify fragments of the original file that are ....
K. Shoens, A. Luniewski, P. Schwarz, J. Stamos, and J. Thomas, "The Rufus system: information organization for semi-structured data," Proc. 19th VLDB Conference, pp. 1-12 (1993).
....for mediators, such as finding servers with information a client seeks, coordinating multiple operations to fulfill a user's request, and making systems more fault tolerant and reliable. Examples of systems that use mediators include Netfind [SP94], MetaCrawler [SE97], Indie [DLO92], and Rufus [SLST93]. The first three examples all help with search. The first two provide front ends to a variety of heterogeneous services, and the third includes a network of brokers that maintain and exchange search information. Rufus is a mediator-based system that gives an object oriented view of data typically ....
K. Shoens, A. Luniewski, P. Schwarz, and J. Thomas. The Rufus system: Information organization for semi-structured data. In Proceedings of the 19th VLDB Conference, Dublin, Ireland, 1993.
....edited or annotated by several persons. Typical examples are newspapers changing dynamically (partial news update strategy) or encyclopediae with many authors and editors and individualized or annotated versions for different user groups. Conventional file oriented systems without databases [25] lead to many difficulties in handling this kind of documents. For example, by leaving document file objects intact, multi user mode is not supported. The granularity is on the document level. Concurrently authoring different parts of the same document is cumbersome at least. Besides that, if ....
K. Shoens et al. The rufus system: Information organization for semistructured data. In R. Agrawal, S. Baker, and D. Bell, editors, Proceedings of the International Conference on Very Large Data Bases, pages 97--107. VLDB Endowment, 1993. Dublin, Ireland.
....work without a global database schema. Classifiers and extractors can be used to extract information from unstructured documents (e.g. plain text files, mail messages, etc.) and classify them in terms of the domain model. The classifier extractor components of Tsimmis is based on the Rufus system [54]. Rufus uses an object oriented database to store descriptive information about user's data, and a full text retrieval database to provide access to the textual content of data. Another proposal along these lines is constituted by the ARANEUS Project [7] whose aim is to make explicit the schema ....
K. Shoens, A. Luniewski, P. Schwarz, J. Stamos, and J. Thomas. The Rufus system: Information organization for semi-structured data. In Proceedings of the Nineteenth International Conference on Very Large Data Bases (VLDB-93), 1993.
....in the interfaces that are being developed for Mosaic. Here there exist virtual files at various sites that when accessed as a file, actually invoke a computation of some sort. Other related work can also be seen in databases that try to abstract information from files such as the Rufus system [21]. 6.0 Experience and Conclusions We have had some experience with the Desert environment. While the prototype is still too premature to release to even selective user communities, we have been using the system to develop itself for the last two months. The prototype involves about 60,000 lines ....
K. Shoens, A. Luniewski, P. Schwarz, J. Stamos, and J. Thomas, "The Rufus system: information organization for semi-structured data," Proc. 19th VLDB Conference, pp. 1-12 (1993).
....for dynamic sets. The penalty is that applications would have to be rewritten for this system, and may not be portable to new platforms. 192 CHAPTER 11. RELATED WORK An example of this approach is the ELFS file system discussed earlier [30]. Another example is the Rufus system by Shoens et al. [86]. Rufus provides searching, organizing, and browsing for the semi structured information commonly stored in computer systems ([86], p. 97). Rufus consists of a class hierarchy, an object oriented database which stores file contents and attributes, and an automatic indexing mechanism based on ....
....192 CHAPTER 11. RELATED WORK An example of this approach is the ELFS file system discussed earlier [30]. Another example is the Rufus system by Shoens et al. [86]. Rufus provides searching, organizing, and browsing for the semi structured information commonly stored in computer systems ([86], p. 97). Rufus consists of a class hierarchy, an object oriented database which stores file contents and attributes, and an automatic indexing mechanism based on type-specific classifiers. Users import objects into the Rufus system to add them to the database, supplying the object's class which ....
Shoens, K., Luniewski, A., Schwarz, P., Stamos, J., and Thomas, J. The Rufus system: Information organization for semi-structured data. In Proceedings of the 19th International Conference on Very Large Data Bases (Dublin, Ireland, Aug. 1993).
....to browse it). In Tsimmis we have developed components that address all of the above issues and together provide an integrated solution to the problem of managing semistructured data. Several other recent projects have similar goals (e.g. Lore [10] Garlic [3] Information Manifold [9] Rufus [14]) but we do not survey them here. 2 Representing Semistructured Data in TSIMMIS For the Tsimmis project we have adopted a simple self-describing (or tagged) object model. Similar models have been in use for years; we call our version the Object Exchange Model, or Oem [4]. Oem is a flexible ....
K. Shoens, A. Luniewski, P. Schwarz, J. Stamos, and J. Thomas. The rufus system: Information organization for semi-structured data. In Proceedings of the International Conference on Very Large Databases, pages 97--107, Dublin, Ireland, August 1993.
....work without a global database schema. Classifiers and extractors can be used to extract information from unstructured documents (e.g. plain text files, mail messages, etc.) and classify them in terms of the domain model. The classifier extractor components of Tsimmis is based on the Rufus system [14]. Rufus uses an object oriented database to store descriptive information about user's data, and a full text retrieval database to provide access to the textual content of data. Another proposal along these lines is constituted by the ARANEUS Project [10] whose aim is to make explicit the schema ....
K. Shoens, A. Luniewski, P. Schwarz, J. Stamos, and J. Thomas. The Rufus system: Information organization for semi-structured data. In Proceedings of the Nineteenth International Conference on Very Large Data Bases (VLDB-93), 1993.
....exactly the same stream of I/O operations a real system would (although the contents differ) and measures the time to execute the stream. To test our belief that Phase II CPU costs can be overlapped with I/O operations, we need to test an actual running IR system. We selected the Rufus system [9] and built an inverted index for 307 megabytes of documents from a collection of IBM internal bulletin board articles. First the Phase I requirements were measured at about 9 minutes and then the real Phase II time required to build the index in two situations was measured. The first situation ....
K. Shoens, A. Luniewski, P. Schwarz, J. Stamos, and J. Thomas. The Rufus system: Information organization for semi-structured data. In Proceedings of the 19th VLDB Conference, Dublin, Ireland, 1993.
No context found.
Shoens, K., Luniewski, A., Schwarz, P., Stamos, J.: The Rufus System: Information Organization for Semi-Structured Data. In Proceedings of the 19th VLDB Conference, Dublin (1993) 97-107
No context found.
K. Shoens, A. Luniewski, P. Schwarz, J. Stamos, and J. Thomas. The Rufus system: Information organization for semi-structured data. In Proceedings of the Nineteenth International Conference on Very Large Data Bases (VLDB-93), 1993.
No context found.
K. Shoens, A. Luniewski, P. Schwarz, J. Stamos, and J. Thomas, The Rufus system: Information organization for semi-structured data, Proc. Of the Int. Conf. On VLDB, pages 97-107, Dublin, Ireland, 1993.
No context found.
K. Shoens, A. Luniewski, P. Schwarz, J. Stamos, and J. Thomas, The Rufus system: Information organization for semi-structured data, Proc. Of the Int. Conf. On VLDB, pages 97-107, Dublin, Ireland, 1993.
No context found.
Shoens, K., A. Luniewski, P. Schwarz, J. Stamos, and J. Thomas: 1993, `The Rufus System: Information Organization for Semi-structured Data'. In: Proceedings of the 19th Conference on Very Large Databases (VLDB '93). Dublin, Ireland, pp. 97--107.
No context found.
K. Shoens, A. Luniewski, P. Schwarz, J. Stamos, and J. Thomas. The Rufus System: Information Organization for Semi-Structured Data. In Proceedings of the 19th VLDB Conference, September 1993.
No context found.
K. Shoens et al. The Rufus system: Information organization for semistructured data. In Proc. VLDB Conference, Dublin, Ireland, 1993.