Results 1 - 10 of 280
The anatomy of the Grid: Enabling scalable virtual organizations.
- The International Journal of High Performance Computing Applications
, 2001
Cited by 2673 (86 self)
"Grid" computing has emerged as an important new field, distinguished from conventional distributed computing by its focus on large-scale resource sharing, innovative applications, and, in some cases, high-performance orientation. In this article, we define this new field. First, we review the "Grid problem," which we define as flexible, secure, coordinated resource sharing among dynamic collections of individuals, institutions, and resources: what we refer to as virtual organizations. In such settings, we encounter unique authentication, authorization, resource access, resource discovery, and other challenges. It is this class of problem that is addressed by Grid technologies. Next, we present an extensible and open Grid architecture, in which protocols, services, application programming interfaces, and software development kits are categorized according to their roles in enabling resource sharing. We describe requirements that we believe any such mechanisms must satisfy, and we discuss the importance of defining a compact set of intergrid protocols to enable interoperability among different Grid systems. Finally, we discuss how Grid technologies relate to other contemporary technologies, including enterprise integration, application service provider, storage service provider, and peer-to-peer computing. We maintain that Grid concepts and technologies complement and have much to contribute to these other approaches.
The Data Grid: Towards an Architecture for the Distributed Management and Analysis of Large Scientific Datasets
- JOURNAL OF NETWORK AND COMPUTER APPLICATIONS
, 1999
Cited by 471 (41 self)
In an increasing number of scientific disciplines, large data collections are emerging as important community resources. In this paper, we introduce design principles for a data management architecture called the Data Grid. We describe two basic services that we believe are fundamental to the design of a data grid, namely, storage systems and metadata management. Next, we explain how these services can be used to develop higher-level services for replica management and replica selection. We conclude by describing our initial implementation of data grid functionality.
Underwater Acoustic Sensor Networks: Research Challenges
- AD HOC NETWORKS (ELSEVIER)
, 2005
Cited by 321 (27 self)
Underwater sensor nodes will find applications in oceanographic data collection, pollution monitoring, offshore exploration, disaster prevention, assisted navigation and tactical surveillance applications. Moreover, unmanned or autonomous underwater vehicles (UUVs, AUVs), equipped with sensors, will enable the exploration of natural undersea resources and gathering of scientific data in collaborative monitoring missions. Underwater acoustic networking is the enabling technology for these applications. Underwater networks consist of a variable number of sensors and vehicles that are deployed to perform collaborative monitoring tasks over a given area. In this
Chimera: A Virtual Data System For Representing, Querying, and Automating Data Derivation
- In Proceedings of the 14th Conference on Scientific and Statistical Database Management
, 2002
Cited by 282 (28 self)
Much scientific data is not obtained from measurements but rather derived from other data by the application of computational procedures. We hypothesize that explicit representation of these procedures can enable documentation of data provenance, discovery of available methods, and on-demand data generation (so-called "virtual data"). To explore this idea, we have developed the Chimera virtual data system, which combines a virtual data catalog, for representing data derivation procedures and derived data, with a virtual data language interpreter that translates user requests into data definition and query operations on the database. We couple the Chimera system with distributed "Data Grid" services to enable on-demand execution of computation schedules constructed from database queries. We have applied this system to two challenge problems, the reconstruction of simulated collision event data from a high-energy physics experiment, and the search of digital sky survey data for galactic clusters, with promising results.
Data Management and Transfer in High-Performance Computational Grid Environments
- Parallel Computing Journal
, 2001
Cited by 206 (13 self)
An emerging class of data-intensive applications involves the geographically dispersed extraction of complex scientific information from very large collections of measured or computed data. Such applications arise, for example, in experimental physics, where the data in question is generated by accelerators, and in simulation science, where the data is generated by supercomputers. So-called Data Grids provide essential infrastructure for such applications, much as the Internet provides essential services for applications such as e-mail and the Web. We describe here two services that we believe are fundamental to any Data Grid: reliable, high-speed transport and replica management. Our high-speed transport service, GridFTP, extends the popular FTP protocol with new features required for Data Grid applications, such as striping and partial file access. Our replica management service integrates a replica catalog with GridFTP transfers to provide for the creation, registration, location, and management of dataset replicas. We present the design of both services and also preliminary performance results. Our implementations exploit security and other services provided by the Globus Toolkit.
Giggle: A Framework for Constructing Scalable Replica Location Services
, 2002
Cited by 158 (37 self)
In wide area computing systems, it is often desirable to create remote read-only copies (replicas) of files. Replication can be used to reduce access latency, improve data locality, and/or increase robustness, scalability and performance for distributed applications. We define a replica location service (RLS) as a system that maintains and provides access to information about the physical locations of copies. An RLS typically functions as one component of a data grid architecture. This paper makes the following contributions. First, we characterize RLS requirements. Next, we describe a parameterized architectural framework, which we name Giggle (for GIGa-scale Global Location Engine), within which a wide range of RLSs can be defined. We define several concrete instantiations of this framework with different performance characteristics. Finally, we present initial performance results for an RLS prototype, demonstrating that RLS systems can be constructed that meet performance goals.
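As an illustration of the definition given in this abstract (a system that maintains and provides access to information about the physical locations of copies), a minimal in-memory sketch might look like the following. It is not the Giggle framework or any Globus API; the class name, method names, and the example URLs are all hypothetical, and a real RLS would be distributed rather than a single dictionary.

```python
from collections import defaultdict


class ReplicaLocationService:
    """Toy, single-process mapping from logical file names to replica locations.

    Hypothetical illustration only: names and behavior are assumptions,
    not taken from the Giggle paper or the Globus Toolkit.
    """

    def __init__(self):
        # logical name -> set of physical locations holding a copy
        self._replicas = defaultdict(set)

    def register(self, logical_name, physical_url):
        """Record that a copy of logical_name exists at physical_url."""
        self._replicas[logical_name].add(physical_url)

    def unregister(self, logical_name, physical_url):
        """Forget one copy; drop the logical name once no copies remain."""
        locations = self._replicas.get(logical_name)
        if locations is None:
            return
        locations.discard(physical_url)
        if not locations:
            del self._replicas[logical_name]

    def lookup(self, logical_name):
        """Return all known locations of logical_name, sorted for stable output."""
        return sorted(self._replicas.get(logical_name, ()))


# Usage with made-up names and hosts
rls = ReplicaLocationService()
rls.register("lfn://survey/run42.dat", "gsiftp://site-b.example.org/data/run42.dat")
rls.register("lfn://survey/run42.dat", "gsiftp://site-a.example.org/data/run42.dat")
locations = rls.lookup("lfn://survey/run42.dat")
```

A client would typically pass the list returned by `lookup` to a separate replica selection step; that step is outside the scope of this sketch.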
Stork: Making Data Placement a First Class Citizen in the Grid
, 2004
Cited by 122 (24 self)
Today's scientific applications have huge data requirements, which continue to increase drastically every year. These data are generally accessed by many users from all across the globe. This implies a major necessity to move huge amounts of data around wide area networks to complete the computation cycle, which brings with it the problem of efficient and reliable data placement. The current approach to this problem is either to place data manually or to employ simple scripts that have no automation or fault tolerance capabilities. Our goal is to make data placement activities first class citizens in the Grid, just like computational jobs. They will be queued, scheduled, monitored, managed, and even check-pointed. More importantly, the system will ensure that they complete successfully and without any human interaction. We also believe that data placement jobs should be treated differently from computational jobs, since they may have different semantics and different characteristics. For this purpose, we have developed Stork, a scheduler for data placement activities in the Grid.
Replica Selection in the Globus Data Grid
, 2000
Cited by 113 (12 self)
The Globus Data Grid architecture provides a scalable infrastructure for the management of storage resources and data that are distributed across Grid environments. These services are designed to support a variety of scientific applications, ranging from high-energy physics to computational genomics, that require access to large amounts of data (terabytes or even petabytes) with varied quality of service requirements. Layering on a set of core services, such as data transport, security, and replica cataloging, various higher-level services can be constructed. In this paper, we discuss the design and implementation of a high-level replica selection service that uses information regarding replica location and user preferences to guide selection from among storage replica alternatives. We first present a basic replica selection service design, then show how dynamic information collected using Globus information service capabilities concerning storage system properties can help improve and op...
A SUITE OF DAML+OIL ONTOLOGIES TO DESCRIBE BIOINFORMATICS WEB SERVICES AND DATA
- INTERNATIONAL JOURNAL OF COOPERATIVE INFORMATION SYSTEMS
Cited by 89 (32 self)
The growing quantity and distribution of bioinformatics resources means that finding and utilizing them requires a great deal of expert knowledge, especially as many resources need to be tied together into a workflow to accomplish a useful goal. We want to formally capture at least some of this knowledge within a virtual workbench and middleware framework to assist a wider range of biologists in utilizing these resources. Different activities require different representations of knowledge. Finding or substituting a service within a workflow is often best supported by a classification. Marshalling and configuring services is best accomplished using a formal description. Both representations are highly interdependent, and maintaining consistency between the two by hand is difficult. We report on a description logic approach using the web ontology language DAML+OIL that uses property-based service descriptions. The ontology is founded on DAML-S to dynamically create service classifications. These classifications are then used to support semantic service matching and discovery in a large grid-based middleware project, myGrid. We describe the extensions necessary to DAML-S in order to support bioinformatics service description; the utility of DAML+OIL in creating dynamic classifications based on formal descriptions; and the implementation of a DAML+OIL ontology service to support partial user-driven service matching and composition.