Cloud Computing and Grid Computing 360-Degree Compared
, 2008
Cited by 248 (9 self)
Cloud Computing has become another buzzword after Web 2.0. However, there are dozens of different definitions of Cloud Computing, and there seems to be no consensus on what a Cloud is. On the other hand, Cloud Computing is not a completely new concept; it is intricately connected to the relatively new but thirteen-year-established Grid Computing paradigm, and to other relevant technologies such as utility computing, cluster computing, and distributed systems in general. This paper strives to compare and contrast Cloud Computing with Grid Computing from various angles and give insights into the essential characteristics of both.
A Reference Architecture for Scientific Workflow Management Systems and the VIEW SOA Solution
Cited by 20 (14 self)
Scientific workflows have recently emerged as a new paradigm for scientists to formalize and structure complex and distributed scientific processes to enable and accelerate many scientific discoveries. In contrast to business workflows, which are typically control flow oriented, scientific workflows tend to be dataflow oriented, introducing a new set of requirements for system development. These requirements demand a new architectural design for scientific workflow management systems (SWFMSs). Although several SWFMSs have been developed that provide much experience for future research and development, a study from an architectural perspective is still missing. The main contributions of this paper are: 1) based on a comprehensive survey of the literature and identification of key requirements for SWFMSs, we propose the first reference architecture for SWFMSs; 2) according to the reference architecture, we further propose a service-oriented architecture for VIEW (a VIsual sciEntific Workflow management system); 3) we implemented VIEW to validate the feasibility of the proposed architectures; and 4) we present a VIEW-based scientific workflow application system (SWFAS), called FiberFlow, to showcase the application of our VIEW system. Index Terms: Reference architecture, scientific workflows, scientific workflow management system, SOA, VIEW.
Scientific Workflow Provenance Querying with Security Views
, 2008
Cited by 18 (1 self)
Provenance, the metadata that pertains to the derivation history of a data product, has become increasingly important in scientific workflow environments. In many cases, both data products and their provenance can be sensitive, and effective access control mechanisms are essential to protect their confidentiality. In this paper, we propose i) a formalization of scientific workflow provenance as the basis for querying and access control; ii) a security specification mechanism for provenance at various granularity levels and the derivation of a full security specification based on inheritance, overriding, and conflict resolution rules; iii) a formalization of security views that are derived from a scientific workflow run provenance for different roles of users; and iv) a framework that integrates abstraction views and security views such that a user can examine provenance at different abstraction levels while respecting the security policy prescribed for her. We have developed the SECPROV prototype to validate the effectiveness of our approach.
Service-Oriented Architecture for VIEW: a Visual Scientific Workflow Management System
- IEEE INTERNATIONAL CONFERENCE ON SERVICES COMPUTING
, 2008
Cited by 14 (11 self)
Scientific workflows have recently emerged as a new paradigm for scientists to formalize and structure complex and distributed scientific processes to enable and accelerate many scientific discoveries. In contrast to business workflows, which are typically control-flow oriented, scientific workflows tend to be dataflow oriented, introducing a new set of requirements for system development. These requirements demand a new architectural design for scientific workflow management systems (SWFMSs). Although several SWFMSs have been developed that provide much experience for future research and development, a study from an architectural perspective is still missing. The main contributions of this paper are: i) based on a comprehensive survey of the literature and identification of key requirements for SWFMSs, we propose the first reference architecture for SWFMSs; ii) in compliance with the reference architecture, we further propose a service-oriented architecture for VIEW (a VIsual sciEntific Workflow management system); iii) we implement VIEW to validate the feasibility of the proposed architectures; and iv) we present two case studies to showcase the applications of our VIEW system.
Data Replication in Data Intensive Scientific Applications With Performance Guarantee
Cited by 10 (1 self)
Data replication has been well adopted in data intensive scientific applications to reduce data file transfer time and bandwidth consumption. However, the problem of data replication in Data Grids, an enabling technology for data intensive applications, has proven to be NP-hard and even non-approximable, making this problem difficult to solve. Meanwhile, most of the previous research in this field is either theoretical investigation without practical consideration, or heuristics-based with little or no theoretical performance guarantee. In this paper, we propose a data replication algorithm that not only has a provable theoretical performance guarantee, but also can be implemented in a distributed and practical manner. Specifically, we design a polynomial time centralized replication algorithm that reduces the total data file access delay by at least half of that reduced by the optimal replication solution. Based on this centralized algorithm, we also design a distributed caching algorithm, which can be easily adopted in a distributed environment such as Data Grids. Extensive simulations are performed to validate the efficiency of our proposed algorithms. Using our own simulator, we show that our centralized replication algorithm performs comparably to the optimal algorithm and other intuitive heuristics under different network parameters. Using GridSim, a popular distributed Grid simulator, we demonstrate that the distributed caching technique significantly outperforms an existing popular file caching technique in Data Grids, and it is more scalable and adaptive to the dynamic change of file access patterns in Data Grids.
Collaborative scientific workflows
- IEEE International Conference on Web Services (ICWS 2009)
, 2009
Cited by 10 (4 self)
In recent years, a number of scientific workflow management systems (SWFMSs) have been developed to help domain scientists synergistically integrate distributed computations, datasets, and analysis tools to enable and accelerate scientific discoveries. As more scientific research projects become collaborative in nature, there is a compelling need for dedicated services to support collaborative scientific workflows on the Internet. This paper reviews the state of the art in the field of scientific workflows with respect to support for collaborative scientific workflows, identifies critical research challenges, and presents our ongoing research on how to create services supporting collaborative scientific workflows.
Formal modeling and analysis of scientific workflows using hierarchical state machines
Cited by 4 (1 self)
Scientific workflows have recently emerged as a new paradigm for representing and managing complex distributed scientific computations and data analysis, and have enabled and accelerated many scientific discoveries. Many scientific workflows are distributed and collaborative, as they result from collaborative research projects that involve a number of geographically distributed organizations. In these workflows, information flow control becomes a key security problem. In this paper, we propose to model a scientific workflow using a hierarchical state machine and present techniques for verifying and controlling information propagation in scientific workflow environments based on hierarchical state machines. To the best of our knowledge, this is the first effort at information flow analysis in the area of scientific workflows.
Confucius: a scientific collaboration system using collaborative scientific workflows
- In: Proceedings of IEEE International Conference on Web Services (ICWS) 2010
Cited by 4 (1 self)
Large-scale scientific data management and analysis usually rely on many distributed scientists with diverse expertise. In recent years, such a collaborative effort is often composed and automated into a dataflow-oriented process, a so-called scientific workflow. However, existing scientific workflow tools are single-user oriented and do not support collaborative scientific workflow composition, execution, and management among multiple distributed scientists. In this paper, we report our study of collaboration protocols towards building a tool supporting collaborative scientific workflow composition. Based on a scientific collaboration ontology, we propose a collaboration model supported by a set of collaboration primitives and patterns. The collaboration protocols are then applied to support effective concurrency control in the process of collaborative workflow composition.
Scientific workflow provenance metadata management using an RDBMS
, 2007
Cited by 1 (1 self)
Provenance management has become increasingly important to support scientific discovery reproducibility, result interpretation, and problem diagnosis in scientific workflow environments. This paper proposes an approach to provenance management that seamlessly integrates the interoperability, extensibility, and reasoning advantages of Semantic Web technologies with the storage and querying power of an RDBMS. Specifically, we propose: i) two schema mapping algorithms to map an arbitrary OWL provenance ontology to a relational database schema that is optimized for common provenance queries; ii) three efficient data mapping algorithms to map provenance RDF metadata to relational data according to the generated relational database schema; and iii) a schema-independent SPARQL-to-SQL translation algorithm that is optimized on the fly by using the type information of an instance available from the input provenance ontology and the statistics of the sizes of the tables in the database. While the schema mapping and query translation and optimization algorithms are applicable to general RDF storage and query systems, the data mapping algorithms are optimized for and applicable only to scientific workflow provenance metadata. Moreover, we extend SPARQL with negation, aggregation, and set operations to support additional important provenance queries. Experimental results are presented to show that our algorithms are efficient and scalable. The comparison with existing RDF stores, Jena and Sesame, showed that our optimizations result in improved performance and scalability for provenance metadata management.