Scientific workflow management and the Kepler system
Concurrency and Computation: Practice and Experience, 2006
"... Many scientific disciplines are now data and information driven, and new scientific knowledge is often gained by scientists putting together data analysis and knowledge discovery “pipelines”. A related trend is that more and more scientific communities realize the benefits of sharing their data and ..."
Abstract
-
Cited by 280 (19 self)
- Add to MetaCart
Many scientific disciplines are now data and information driven, and new scientific knowledge is often gained by scientists putting together data analysis and knowledge discovery “pipelines”. A related trend is that more and more scientific communities realize the benefits of sharing their data and computational services, and are thus contributing to a distributed data and computational community infrastructure (a.k.a. “the Grid”). However, this infrastructure is only a means to an end and scientists ideally should be bothered little with its existence. The goal is for scientists to focus on development and use of what we call scientific workflows. These are networks of analytical steps that may involve, e.g., database access ...
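
As a minimal illustration of the workflow-as-pipeline idea in this abstract, the sketch below composes a few analysis steps into a fixed network, with a stand-in for a database-access step. All names are hypothetical; Kepler itself composes actors graphically rather than as Python functions.

```python
# Hypothetical sketch: a scientific workflow as a chain of analysis steps.
# Each step is a plain function; the workflow is their composition.

def fetch_records(query):
    """Stand-in for a database-access step (e.g., querying a data grid)."""
    return [{"id": i, "value": float(i) * 1.5} for i in range(int(query))]

def filter_records(records, threshold):
    """Analysis step: keep records above a threshold."""
    return [r for r in records if r["value"] > threshold]

def summarize(records):
    """Knowledge-discovery step: reduce the data to a summary statistic."""
    values = [r["value"] for r in records]
    return {"count": len(values), "mean": sum(values) / len(values)}

def run_pipeline(query, threshold):
    """The 'workflow': a fixed network of analytical steps."""
    return summarize(filter_records(fetch_records(query), threshold))

print(run_pipeline("10", threshold=4.0))
```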
A Next-Generation Design Framework for Platform-Based Design
"... Abstract — The platform-based design methodology [1] is based on the usage of formal modeling techniques, clearly defined abstraction levels and the separation of concerns to enable an effective design process. The METROPOLIS framework embodies the platform-based design methodology and has been appl ..."
Abstract
-
Cited by 27 (10 self)
- Add to MetaCart
(Show Context)
The platform-based design methodology [1] is based on the use of formal modeling techniques, clearly defined abstraction levels, and the separation of concerns to enable an effective design process. The METROPOLIS framework embodies the platform-based design methodology and has been applied to a number of case studies across multiple domains. Based on these experiences, we have identified three key features that need to be enhanced: heterogeneous IP import, orthogonalization of performance from behavior, and design space exploration. The next-generation METRO II framework incorporates these advanced features. The main concepts underlying METRO II are described in this paper and illustrated with a small example.
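
The "orthogonalization of performance from behavior" named above can be illustrated with a small, hypothetical sketch: the behavioral model defines only what happens, while a separate cost model assigns performance numbers to the resulting trace. This is an assumed rendering of the idea, not METRO II's actual API.

```python
# Hypothetical sketch: behavior and performance kept orthogonal.
# The behavior records abstract events only; a separate cost model
# maps those events to performance numbers after the fact.

def behavior():
    """Pure behavior: a sequence of abstract events, no timing."""
    yield "read"
    yield "compute"
    yield "write"

# Performance concern kept separate: per-event latency in cycles.
latency_model = {"read": 10, "compute": 50, "write": 8}

def annotate(events, costs):
    """Fold a performance model over a behavioral trace."""
    total = 0
    for event in events:
        total += costs[event]
        print(f"{event}: +{costs[event]} cycles (total {total})")
    return total

annotate(behavior(), latency_model)
```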
Enabling scientific workflow reuse through structured composition of dataflow and control-flow
In IEEE Workshop on Workflow and Data Flow for Scientific Applications, 2006
"... Data-centric scientific workflows are often modeled as dataflow process networks. The simplicity of the dataflow framework facilitates workflow design, analysis, and optimization. However, modeling “control-flow intensive” tasks using dataflow constructs often leads to overly complicated workflows t ..."
Abstract
-
Cited by 27 (8 self)
- Add to MetaCart
(Show Context)
Data-centric scientific workflows are often modeled as dataflow process networks. The simplicity of the dataflow framework facilitates workflow design, analysis, and optimization. However, modeling “control-flow intensive” tasks using dataflow constructs often leads to overly complicated workflows that are hard to comprehend, reuse, and maintain. We describe a generic framework, based on scientific workflow templates and frames, for embedding control-flow intensive subtasks within dataflow process networks. This approach can seamlessly handle complex control-flow without sacrificing the benefits of dataflow. We illustrate our approach with a real-world scientific workflow from the astrophysics domain, requiring remote execution and file transfer in a semi-reliable environment. For such workflows, we also describe a 3-layered architecture based on frames and templates, where the top layer consists of an overall dataflow process network, the second layer consists of a transducer template for modeling the desired control-flow behavior, and the bottom layer consists of frames inside the template that are specialized by embedding the desired component implementation. Our approach can enable scientific workflows that are more robust (fault-tolerance strategies can be defined by control-flow driven transducer templates) and at the same time more reusable, since the embedding of frames and templates yields more structured and modular workflow designs.
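
As a rough sketch of the frame/template separation described in this abstract: the control-flow concern (here, retry-based fault tolerance for a semi-reliable environment) lives in a reusable template, and a frame is specialized by plugging in a concrete component such as a file transfer. Names and structure are illustrative assumptions, not the paper's actual constructs.

```python
# Hypothetical sketch of a control-flow "transducer template": the retry
# logic is defined once, and a frame is specialized by embedding the
# desired component implementation.
import random

random.seed(1)  # deterministic for the example

def retry_template(frame, max_attempts=3):
    """Template: retry control-flow wrapped around an embedded frame."""
    def run(*args):
        for attempt in range(1, max_attempts + 1):
            try:
                return frame(*args)
            except IOError as exc:
                print(f"attempt {attempt} failed: {exc}")
        raise RuntimeError("all attempts failed")
    return run

def flaky_file_transfer(src, dst):
    """Frame specialization: a component in a semi-reliable environment."""
    if random.random() < 0.5:
        raise IOError(f"transfer {src} -> {dst} dropped")
    return f"copied {src} to {dst}"

# The dataflow layer above sees only a single, robust task.
robust_transfer = retry_template(flaky_file_transfer)
print(robust_transfer("data.fits", "remote:/scratch/data.fits"))
```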
Managing Scientific Data: From Data Integration to Scientific Workflows
"... Scientists are confronted with significant datamanagement problems due to the large volume and high complexity of scientific data. In particular, the latter makes data integration a difficult technical challenge. In this paper, we describe our work on semantic mediation and scientific workflows, and ..."
Abstract
-
Cited by 14 (4 self)
- Add to MetaCart
Scientists are confronted with significant data-management problems due to the large volume and high complexity of scientific data. In particular, the latter makes data integration a difficult technical challenge. In this paper, we describe our work on semantic mediation and scientific workflows, and discuss how these technologies address integration challenges in scientific data management. We first give an overview of the main data-integration problems that arise from heterogeneity in the syntax, structure, and semantics of data. Starting from a traditional mediator approach, we show how semantic extensions can facilitate data integration in complex, multiple-worlds scenarios, where data sources cover different but related scientific domains. Such scenarios are not amenable to conventional schema-integration approaches. The core idea of semantic mediation is to augment database mediators and query evaluation algorithms with appropriate knowledge-representation techniques to exploit information from shared ontologies. Semantic mediation relies on semantic data registration, which associates existing data with semantic information from an ontology. The Kepler scientific workflow system addresses the problem of synthesizing, from existing tools and applications, reusable workflow components and analytical pipelines to automate scientific analyses. After presenting core features and example workflows in Kepler, we present a framework for adding semantic information to scientific workflows. The resulting system is aware of semantically plausible connections between workflow components as well as between data sources and workflow components. This information can be used by the scientist during workflow design, and by the workflow engineer for creating data transformation steps ...
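
A toy sketch of semantic data registration as outlined above: data attributes and workflow ports are annotated with ontology concepts, and a connection is semantically plausible when the source's concept is subsumed by the target's. The tiny is-a ontology and all names here are invented for illustration.

```python
# Hypothetical sketch: semantic registration and plausible connections.
# A tiny is-a ontology, represented as child -> parent.
ontology = {
    "SeaSurfaceTemperature": "Temperature",
    "Temperature": "PhysicalQuantity",
    "Salinity": "PhysicalQuantity",
}

def subsumed_by(concept, ancestor):
    """True if `concept` is `ancestor` or a descendant of it."""
    while concept is not None:
        if concept == ancestor:
            return True
        concept = ontology.get(concept)
    return False

# Semantic registration: data columns and actor ports -> ontology concepts.
data_registration = {"sst_column": "SeaSurfaceTemperature"}
port_registration = {"mean_actor.input": "Temperature"}

def plausible(source, target):
    """A connection is plausible if the data's concept fits the port's."""
    return subsumed_by(data_registration[source], port_registration[target])

print(plausible("sst_column", "mean_actor.input"))  # True
```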
Classes and Inheritance in Actor-Oriented Design
2007
"... Actor-oriented components emphasize concurrency and temporal semantics and are used for modeling and designing embedded software and hardware. Actors interact with one another through ports via a messaging schema that can follow any of several concurrent semantics. Domainspecific actor-oriented lang ..."
Abstract
-
Cited by 12 (7 self)
- Add to MetaCart
Actor-oriented components emphasize concurrency and temporal semantics and are used for modeling and designing embedded software and hardware. Actors interact with one another through ports via a messaging schema that can follow any of several concurrent semantics. Domain-specific actor-oriented languages and frameworks are common (Simulink, LabVIEW, SystemC, etc.). However, they lack many modularity and abstraction mechanisms that programmers have become accustomed to in object-oriented components, such as classes, inheritance, interfaces, and polymorphism, except as inherited from the host language. This paper shows a form that such mechanisms can take in actor-oriented components, gives a formal structure, and describes a prototype implementation. The mechanisms support actor-oriented class definitions, subclassing, inheritance, and overriding. The formal structure imposes structural constraints on a model (mainly the “derivation invariant”) that lead to a policy to govern inheritance. In particular, the structural constraints permit a disciplined form of multiple inheritance with unambiguous inheritance and overriding behavior. The policy is based formally on a generalized ultrametric space with some remarkable properties. In this space, inheritance is favored when actors are “closer” (in the generalized ultrametric), and we show that when inheritance can occur from multiple sources, one source is always unambiguously closer than the other.
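
For orientation, the standard definition of a generalized ultrametric space (the structure this abstract appeals to) is sketched below; the paper's exact formulation over actor models may differ in details.

```latex
% Standard definition of a generalized ultrametric space.
A generalized ultrametric space is a set $X$ with a distance
\[
  d \colon X \times X \to \Gamma ,
\]
where $(\Gamma, \le)$ is a partially ordered set with least element $0$,
such that for all $x, y, z \in X$ and all $\gamma \in \Gamma$:
\begin{align*}
  d(x,y) &= 0 \iff x = y && \text{(identity)} \\
  d(x,y) &= d(y,x) && \text{(symmetry)} \\
  d(x,y) &\le \gamma \text{ and } d(y,z) \le \gamma
    \implies d(x,z) \le \gamma && \text{(generalized ultrametric inequality)}
\end{align*}
When $\Gamma$ is totally ordered, the last condition reduces to the familiar
strong triangle inequality $d(x,z) \le \max\{d(x,y),\, d(y,z)\}$.
```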
Simulation and implementation of the PTIDES programming model
In Proceedings of the 12th IEEE/ACM International Symposium on Distributed Simulation and Real-Time Applications (DS-RT ’08), 2008
"... We have previously proposed PTIDES (Prog-ramming Temporally Integrated Distributed Embedded Systems), a discrete-event framework that binds real-time with model time at sensors, actuators, and network interfaces. In this experimental effort we focus on performance issues and tradeoffs in PTIDES impl ..."
Abstract
-
Cited by 10 (5 self)
- Add to MetaCart
(Show Context)
We have previously proposed PTIDES (Programming Temporally Integrated Distributed Embedded Systems), a discrete-event framework that binds real-time with model time at sensors, actuators, and network interfaces. In this experimental effort we focus on performance issues and tradeoffs in PTIDES implementation. We address event processing performance with respect to other distributed discrete-event approaches that can be applied in a similar setting. The procedure is experimentally evaluated on a distributed setup with standard software and networking components.
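
A much-simplified sketch of the model-time/real-time binding at the heart of PTIDES: an event timestamped tau can be processed once physical time exceeds tau plus a bound covering network delay and clock synchronization error, because no earlier-timestamped event can then still arrive. The bounds and function names below are assumptions for illustration, not the paper's implementation.

```python
# Hypothetical sketch of a PTIDES-style safe-to-process check.
import heapq

NETWORK_DELAY_BOUND = 0.010  # assumed worst-case network delay (s)
CLOCK_ERROR_BOUND = 0.001    # assumed clock synchronization error (s)

event_queue = []  # min-heap of (timestamp, payload)

def sensor_reading(physical_time, payload):
    """At a sensor, model time is bound to the physical time of measurement."""
    heapq.heappush(event_queue, (physical_time, payload))

def safe_to_process(physical_time):
    """Pop the earliest event once no earlier-timestamped one can arrive."""
    if not event_queue:
        return None
    timestamp, _ = event_queue[0]
    if physical_time > timestamp + NETWORK_DELAY_BOUND + CLOCK_ERROR_BOUND:
        return heapq.heappop(event_queue)
    return None

sensor_reading(1.000, "sample-A")
print(safe_to_process(1.005))  # None: an earlier event could still arrive
print(safe_to_process(1.020))  # (1.0, 'sample-A'): safe to process now
```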
A generic execution framework for models of computation
In International Workshop Series on Model-based Methodologies for Pervasive and Embedded Software (MOMPES), 2007
"... Abstract The Model Driven Engineering approach has had an important impact on the methods used for the conception of systems. However, some important difficult points remain in this domain. In this paper, we focus on problems related to the heterogeneity of the computation models (and therefore of ..."
Abstract
-
Cited by 9 (2 self)
- Add to MetaCart
(Show Context)
The Model Driven Engineering approach has had an important impact on the methods used for the conception of systems. However, some important difficulties remain in this domain. In this paper, we focus on problems related to the heterogeneity of the computation models (and therefore of the modeling techniques) used for the different aspects of a system, and to the validation and the execution of a model. We present a language for describing computation models, coupled with a generic execution platform where different computation models as well as their composition can be interpreted. Our goal is to be able to describe precisely the semantics of the computation models underlying Domain Specific Languages, and to allow the interpretation of these models within our platform. This provides an unambiguous definition of the behavior of heterogeneous models of a system, which is essential for validation, simulation, and code generation.
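
A minimal sketch of a generic execution platform of the kind described: each model of computation implements one scheduling interface, and the platform interprets heterogeneous models through it. The interface and class names are invented for illustration.

```python
# Hypothetical sketch: models of computation behind one execution interface.

class ModelOfComputation:
    """Common interface the generic platform interprets."""
    def step(self, components):
        raise NotImplementedError

class SynchronousDataflow(ModelOfComputation):
    """Fire every component once per iteration, in a fixed order."""
    def step(self, components):
        for component in components:
            component()

class RoundRobin(ModelOfComputation):
    """Fire one component per step, cycling through them."""
    def __init__(self):
        self.index = 0
    def step(self, components):
        components[self.index % len(components)]()
        self.index += 1

def execute(moc, components, steps):
    """The generic platform: interprets any MoC via the same interface."""
    for _ in range(steps):
        moc.step(components)

actors = [lambda: print("A fires"), lambda: print("B fires")]
execute(SynchronousDataflow(), actors, steps=1)  # A fires, B fires
execute(RoundRobin(), actors, steps=3)           # A, B, A
```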
The Center for Plasma Edge Simulation Workflow Requirements
In 22nd International Conference on Data Engineering Workshops (ICDEW ’06), 2006
"... The Center for Plasma Edge Simulation (CPES) is a recently funded prototype Fusion Simulation Project, which is part of the DOE SciDAC program. Our center is developing a novel integrated predictive plasma edge simulation framework, which is applicable to existing magnetic fusion facilities (D3D, NS ..."
Abstract
-
Cited by 9 (2 self)
- Add to MetaCart
The Center for Plasma Edge Simulation (CPES) is a recently funded prototype Fusion Simulation Project, which is part of the DOE SciDAC program. Our center is developing a novel integrated predictive plasma edge simulation framework, which is applicable to existing magnetic fusion facilities (D3D, NSTX, CMOD) and next-generation burning plasma experiments, e.g., ITER. The success of this project will lie in developing and understanding new models for the plasma edge in a kinetic regime with complex geometry. Because of the multi-scale nature of the problem, we will study the neoclassical physics time scale kinetically, and the fast and larger-scale MHD modes via a fluid code. Our approach is to couple these codes via a scientific workflow system, Kepler-HPC. Kepler-HPC will enhance Kepler with capabilities such as code coupling and data redistribution, high-volume data transfers, and interactive (and autonomic) monitoring, steering, and debugging, which will be necessary for scientific progress in this project.
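
As a rough sketch of the kind of code coupling such a workflow must orchestrate, the loop below alternates a kinetic step with a fluid/MHD step and redistributes data between them. All names are placeholders; the actual CPES codes and Kepler-HPC mechanisms are far more involved.

```python
# Hypothetical sketch of workflow-driven code coupling.

def kinetic_step(state):
    """Placeholder for the kinetic edge code (neoclassical time scale)."""
    state["edge_profile"] = [x * 1.01 for x in state["edge_profile"]]
    return state

def mhd_step(state):
    """Placeholder for the fluid MHD code (fast, larger-scale modes)."""
    state["mhd_field"] = sum(state["edge_profile"]) / len(state["edge_profile"])
    return state

def redistribute(state):
    """Placeholder for data redistribution between the two codes."""
    return dict(state)  # e.g., repartitioning across processor layouts

def coupled_workflow(state, cycles):
    """The workflow loop: couple the codes and monitor each cycle."""
    for cycle in range(cycles):
        state = mhd_step(redistribute(kinetic_step(state)))
        print(f"cycle {cycle}: mhd_field = {state['mhd_field']:.4f}")
    return state

coupled_workflow({"edge_profile": [1.0, 2.0, 3.0], "mhd_field": 0.0}, cycles=3)
```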