Operating Systems: Fault Tolerance

This directory is created automatically and some papers may be mislabeled. Only documents within the CiteSeer database are listed. The directory is intended to provide entry points for browsing the database and is not intended to be authoritative. Papers may not appear in all relevant categories. For example, papers in a sub-category may not appear in higher level categories.

Performance Evaluation of the Quadrics Interconnection Network - Petrini, Frachtenberg, Hoisie, Coll (2003)
In this paper we present an in-depth description of the Quadrics interconnection network (QsNET) and an experimental performance evaluation on a 64-node AlphaServer cluster. We explore several perform... / user-level communication operating system bypass. Introduction ... for Quality of Service QoS fault-tolerance remote direct memory

Distributed Shared State (Position Paper) - Scott, Chen, Dwarkadas, Tang (2003)
Increasingly, Internet-level distributed systems are oriented as much toward information access as they are toward computation. From computer-supported collaborative work to peer-to-peer computing, e-... / integrated S-DSM into the operating system kernel in order to support ... scalability latency and fault tolerance most distributed

Improving Availability with Recursive Micro-Reboots: A Soft-State.. - Candea, Cutler, Fox (2003)
Even after decades of software engineering research, complex computer systems still fail. This paper makes the case for increasing research emphasis on dependability and, specifically, on improving av... / synchronous disk writes many operating systems cache metadata updates in ... daemon process to provide fault tolerance for long-running UNIX

A Group Membership Protocol For An Intrusion-Tolerant Group.. - Ramasamy (2002)
Group Communication Systems have been developed to address the problem of maintaining consistency of replicated information. This thesis describes the research work that resulted in the design, develo... / the middleware level the operating system level or the hardware level. ... tolerance focuses on keeping the system operational in spite of benign

Active Replication of Multithreaded Applications - Basile, Kalbarczyk, Whisnant, Iyer (2002)
Software-based active replication is a well-known technique for providing fault tolerance using space redundancy and fault masking. However, much of the recent research in software replication has yet ...

Assessment of the Java Programming Language for Use in High Integrity .. - Kwon, Wellings, King (2002)
This paper sets a goal of investigating the use of Java in the development of high integrity systems. Important requirements of programming languages for the development of high integrity software are...

Quantifying the Cost of Providing Intrusion Tolerance in Group.. - Ramasamy, Pandey, Lyons, Cukier.. (2002)
Group communication systems that provide consistent group membership and reliable, ordered multicast properties in the presence of faults resulting from malicious intrusions have not been analyzed ext...

Ravenscar-Java: A High Integrity Profile for Real-Time Java - Andy (2002)
For many, Java is the antithesis of a high integrity programming language. Its combination of object-oriented programming features, its automatic garbage collection, and its poor support for real-time...

STORM: Lightning-Fast Resource Management - Frachtenberg, Petrini, Fernandez.. (2002)
Although clusters are a popular form of high-performance computing (HPC), they remain more difficult to manage than sequential systems, or even symmetric multiprocessors. Furthermore, as cluster sizes... / GB I O buses node Operating system Red Hat Linux. ... process-scheduling algorithms fault tolerance or usage policies can be

Windows Performance Monitoring and Data Reduction Using WatchTower - Michael (2002)
We describe and evaluate WatchTower, a set of library routines that simplifies the collection of performance data for the monitoring of Windows NT/2000. WatchTower has an overhead similar to that of e...

An Architecture for Adaptive Coordination of Heterogeneous Agents - Bonarini, Restelli (2002)
We present a novel architecture to design multi-agent systems. Each agent is described by features that can dynamically change, while it is participating to the common activity. Basing on these featur...

Memory Mapped Networks: a new deal for Distributed Shared Memories ? - The Scifs Experience (2002)
Distributed Shared Memories (DSM) performance has always suffered from high network latencies and software communication layers with a large overhead. Memory mapped networks such as Scalable Coherent ... / memory without involving the operating system. To show how DSM systems can ... be kept smaller and hardware fault tolerance is improved. It is also

Design and Validation of Portable Communication Infrastructure for.. - Li, Tao, Goldberg, Hsu, Tamir (2002)
We describe the communication infrastructure (CI) for our fault-tolerant cluster middleware, which is optimized for two classes of communication: for the applications and for the cluster management mi...

Scalable, Efficient Range Queries for Grid Information Services - Andrzejak, Xu (2002)
Recent Peer-to-Peer (P2P) systems such as Tapestry, Chord or CAN act primarily as a Distributed Hash Table (DHT). A DHT is a data structure for distributed storing of pairs (key, data) which allows fa... / for example the type of the operating system network address CPU speed ... by adding self-organization fault-tolerance and an ability to efficiently

Loose Synchronization of Multithreaded Replicas - Basile, Whisnant, Kalbarczyk, Iyer (2002)
Although multithreading can improve performance, it is a source of nondeterminism in application behavior. Existing approaches to replicating multithreaded applications either synchronize replicas at ... / algorithm interferes with the operating system scheduler only when granting ... and middlewares providing fault tolerance to CORBA objects. Loose

Requirements of a Middleware for Managing a large - Hughes (2002)
Programmable networking is an increasingly popular area of research in both industry and academia. Although most programmable network research projects seem to focus on the router architecture rather ... / it is located between the operating system and the application. The ... and transactions. Security fault tolerance and usability are also

Flexible Distributed Process Topologies for Enterprise Applications - Christoph Hartwich (2002)
Enterprise applications can be viewed as topologies of distributed processes that access business data objects stored in one or more transactional datastores. There are several well-known topology patt... / distributed heavyweight operating system level processes address ... properties like scalability fault tolerance or response time.

Practical QoS Network System with Fault Tolerance - Das Gerla Lee (2002)
In this paper, we present an "emulation environment" for the design and planning of intranets. Intranets must support Quality-of-Service (QoS) for real-time traffic and must be fault-tolerant for missi...

Process Migration: A Generalized Approach - Using Virtualizing Operating (2002)
Process migration has been used to perform specialized tasks, such as load sharing and checkpoint/restarting long running applications. Implementation typically consists of modifications to existing a...

The Caltech Multi-Vehicle Wireless Testbed - Lars Cremean William (2002)
In this paper we introduce the Caltech Multi-Vehicle Wireless Testbed (MVWT), a platform for testing decentralized control methodologies for multiple vehicle coordination and formation stabilization. ...

Group Communications And Database Replication: Techniques, Issues and .. - Wiesmann (2002)
Databases are an important part of today's IT infrastructure: both companies and state institutions rely on database systems to store most of their important data. As we are more and more dependent on... / to all the people in the Operating System Lab and the Distributed ... One way to ensure the fault-tolerance of a system is by replicating

Service Management in Distributed Systems: A Service Publisher - Riviere, Gimenez, de Sales, Sibilla, .. (2002)
A characteristic of an open distributed system is its dynamic behaviour. A Dynamic Distribution of Service - DDS offers services from providers to clients. The DDS can be seen as `yellowpages' in whic...

Evaluation of Dependable Layered Systems with Fault Management Architecture - Olivia (2002)
The need for a separate fault-management system, that is able to carry out both failure detection and reconfiguration, is becoming imperative due to the increasing complexity of fault-tolerant distrib...

The Quadrics Network: High Performance Clustering Technology - Petrini, Feng, Hoisie, Coll.. (2002)
this article) connects the Quadrics network to a processing node containing one or more CPUs. In addition to generating and accepting packets to and from the network, Elan provides substantial local p...

Complete Specification of APIs and Protocols for the MAFTIA Middleware - Neves, Verissimo (2002)
This document describes the complete specification of the APIs and Protocols for the MAFTIA Middleware. The architecture of the middleware subsystem has been described in a previous document, where th...

Pragmatic nonblocking synchronization for real-time systems - Hohmuth (2002)
In this thesis I present a pragmatic methodology for designing nonblocking real-time systems. My methodology uses a combination of lock-free and wait-free synchronization techniques and clearly states...

Are Law-Abiding Agents Realistic? - Brazier, Kubbe, Oskamp, Wijngaards (2002)
Software agents are an inherent extension to the current Internet. They are, however, without a legal status. They autonomously roam the Internet, perform transactions, and gather information. The leg...

Error Management in the Pluggable File System - Thain, Livny (2002)
Distributed computing continues to be an alphabet-soup of services and protocols. No single system for managing CPUs or I/O devices has emerged (or is likely to emerge) as a universal solution. Theref...

Cost and benefit of separate address spaces in real-time operating.. - Mehnert, Hohmuth, Härtig (2002)
The combination of a real-time executive and an off-the-shelf time-sharing operating system has the potential of providing both predictability and the comfort of a large application base. To isolate th...

Testing the Fault-Tolerance of Networked Systems - Sieh, Buchacker (2002)
This paper presents an extensible framework for testing the behavior of networked machines running the Linux operating system in the presence of faults. The framework allows injection of a variety of ...

On Improving Thread Migration: Safety and Performance - Jiang, Chaudhary (2002)
Application-level migration schemes have been paid more attention recently because of their great potential for heterogeneous migration. ... / migration is a part of the operating system. Threads are moved around ... dynamic load distribution fault tolerance eased system administration

UMLinux - A Versatile SWIFI Tool - Sieh, Buchacker (2002)
This tool presentation describes UMLinux, a versatile framework for testing the behavior of networked machines running the Linux operating system in the presence of faults. UMLinux can inject a vari...

Architecture-Based Exception Handling - Issarny, Banâtre (2001)
Architecture-based development environments are becoming an effective solution towards the construction of robust distributed systems. Through the abstract description of complex software systems conf... / or the underlying operating system in which case it takes the ... has been paid to software fault tolerance and in particular exception

Towards Global Storage Management and Data Placement - Veitch, Riedel, Towers, Wilkes (2001)
As users' and companies' dependence on shared, networked information services continues to increase, we will see continued growth in large data centers and service providers. This will happen both as ... / in networking security operating system design systems management ... on the other hand for fault-tolerance of critical services and to

When Local Becomes Global: An Application Study of Data Consistency.. - Riedel, Spence, Veitch (2001)
As users and companies depend increasingly on shared, networked information services, and as companies and customers become more international, computer systems will need to keep pace with a global sc... / on the other hand for fault-tolerance of critical services and to

Practical Byzantine Fault Tolerance - Castro (2001)
Our growing reliance on online services accessible on the Internet demands highly-available systems that provide correct service without interruptions. Byzantine faults such as software bugs, operator...

Dynamic Software Updating - Hicks, Moore, Nettles (2001)
Many important applications must run continuously and without interruption, yet must be changed to fix bugs or upgrade functionality. To date, no existing dynamic updating system has achieved a practic...

Introducing Fault Tolerance to Distributed Maple - Schreiner, Kusper, Bosa (2001)
We have extended the parallel computer algebra environment Distributed Maple by fault tolerance mechanisms such that the time spent in a long running computation is not any more wasted by the eventu... / our control machine network operating system faults may happen in any ... the root creates a task the system operates according to one of two modes

First Specification of APIs and Protocols for the MAFTIA Middleware - Neves, Verissimo (2001)
This document describes the first specification of the APIs and Protocols for the MAFTIA Middleware. The architecture of the middleware subsystem has been described in a previous document, where the seve... / TTCB and of course the Operating System OS This deliverable is ... Malicious- and Accidental-Fault Tolerance for Internet Applications

LegionFS: A Secure and Scalable File System Supporting Cross-Domain.. - White, Walker, Humphrey, Grimshaw (2001)
Realizing that current file systems can not cope with the diverse requirements of wide-area collaborations, researchers have developed data access facilities to meet their needs. Recent work has focus... / of the Legion wide-area operating system and an in-depth discussion of ... and addresses the goals of fault tolerance and availability.

Fault-Tolerant Cluster Management For Reliable High-Performance.. - Li, Goldberg, Tao, Tamir (2001)
Clusters of COTS workstations/PCs are commonly used to implement cost-effective high-performance systems. A central coordinator/manager is often the simplest way to implement many of the operations re... / Recovery. Introduction Operating systems such as Amoeba that ... as high-level coordination of fault tolerance mechanisms and interactions

Recursive Restartability: Turning the Reboot Sledgehammer into a.. - Candea, Fox (2001)
Even after decades of software engineering research, complex computer systems still fail, primarily due to nondeterministic bugs that are typically resolved by rebooting. Conceding that Heisenbugs wil...

Fail-Stutter Fault Tolerance - Remzi Arpaci-Dusseau And (2001)
Traditional fault models present system designers with two extremes: the Byzantine fault model, which is general and therefore difficult to apply, and the fail-stop fault model, which is easier to emp...

Proactive Management of Software Aging - Castelli, Harper, Heidelberger.. (2001)
this paper may be copied or distributed royalty free without further permission by computer-based and other information-service systems. Permission to republish any other portion of this paper must be... / process group or entire operating system depending on the ... another process. Most current fault-tolerance techniques are reactive in

A Monitoring-based Approach to Object-Oriented Real-Time Computing - Gergeleit (2001)

LocALE: a Location-Aware Lifecycle Environment for Ubiquitous.. - de Ipina (2001)
The LocALE (Location-Aware Lifecycle Environment) framework provides a simple management interface for controlling the lifecycle of CORBA distributed objects. It supports mechanisms for the remote con... / is programmed its underlying operating system or any other system aspects ... automatic activation and fault-tolerance facilities for the services

Middleware Support for Voting and Data Fusion - Zhiyuan (2001)
Middleware is a class of software systems above the operating system which is becoming widely used for programming distributed systems. Voting is a fundamental operation when distributed systems invol... / of software systems above the operating system which is becoming widely ... with respect to performance fault tolerance precision and correctness

The Architecture of a Secure Group Communication System Based on.. - Correia, Veríssimo, Neves (2001)
This paper presents the architecture of a secure group communication system with the fortress model of trust, where the participants of the group equally trust one another. We consider that only a sma... / they all assume that the operating system can be considered to be a ... the recursive use of fault tolerance and fault prevention

The Design and Implementation of a Fault-Tolerant Cluster Manager - Ming (2001)
Cluster management middleware schedules tasks on a cluster, controls access to shared resources, provides for task submission and monitoring, and coordinates the cluster's fault tolerance mechanisms. ... / local copy of an off-the-shelf operating system that is not designed to ... and coordinates the cluster's fault tolerance mechanisms. Thus reliable

Enhancing Survivability of Security Services using Redundancy - Hiltunen, Schlichting, Ugarte (2001)
Traditional distributed system services that provide guarantees related to confidentiality, integrity, and authenticity enhance security, but are not survivable since each attribute is implemented by ... / of security where the existing operating system already provides ... failure when considering fault-tolerance attributes. This problem is

The Marvel Programming Model: a higher-order distributed process.. - Schmitt, Stefani (2001)
Contents: 1 Introduction; 1.1 Requirements for a distributed programming model; 1.2 Introducing the M-calculus ... / programming language e.g. operating system processes failure ... abstractions e.g. for fault-tolerance or multi-party communication

Designing for High Availability and Measurability - Candea, Fox (2001)
We propose a structuring model, called recursive restartability, aimed at controlling the amount of end-to-end unavailability and improving the measurability of software infrastructures with high avai...

LOTTERYBUS: A New High-Performance Communication Architecture for.. - Lahiri, Raghunathan, Lakshminarayana (2001)
This paper presents LOTTERYBUS, a novel high-performance communication architecture for system-on-chip (SoC) designs. The LOTTERYBUS architecture was designed to address the following limitations of c... / in a multi-threaded operating system However in that ... as dynamic scalability and fault tolerance apply in the design of

Susceptibility of Modern Systems and Software to Soft Errors - Messer, Bernadat, Fu, Chen.. (2001)
It is widely understood that most downtime is accounted for by programming errors and administration time. However, recent work has indicated an increasing cause of downtime may stem from transient ha... / and the susceptibility of operating systems and applications to them ... Cornell's Hypervisor-based fault tolerance system provides a similar

M-Calculus: A Higher-Order Distributed Process Calculus - Schmitt, Stefani (2001)
this paper a new process calculus, called the M-calculus, which represents an attempt at defining a formal distributed programming model. Key insights for the calculus are similar to those laid out in... / programming language e.g. operating system processes failure ... abstractions e.g. for fault-tolerance or multi-party communication

A Tutorial Of Lustre - Halbwachs, Raymond (2001)
This document is an introduction to the language Lustre V4 and its associated tools. We will not give a systematic presentation of the language, but a complete bibliography is added. The basic referen...

Performance Evaluation of the Quadrics Interconnection - Petrini, Coll, Frachtenberg, Hoisie (2001)
In this paper we present an in-depth description of the Quadrics interconnection network (QsNET) and an experimental performance evaluation on a 64-node Alphaserver cluster. We expose the performance ... / User-level Communication Operating System Bypass. Introduction ... for Quality of Service QoS fault-tolerance remote direct memory access

Handoff of Application Sessions Across Time and Space - Phan, Xu, Guy, Bagrodia (2001)
Personal computing on mobile platforms such as laptops and personal digital assistants, rather than in a traditional desktop environment, is becoming increasingly more common. In this paper we address... / of the architecture and operating system thus allowing a session to ... migration in the fields of fault-tolerance and load-balancing. Process

Fault Tolerance for Cluster Computing Based on Functional Tasks - Schreiner, Kusper, Bosa (2001)
We have extended the parallel computer algebra environment Distributed Maple by fault tolerance mechanisms such that the time spent in a long running computation is not any more wasted by the eve... / our control machine network operating system faults may happen in any ... hti message is issued the system operates analogously to a task

Gang Scheduling with Lightweight User-Level Communication - Frachtenberg, Petrini, Coll, Feng (2001)
In this paper, we explore the performance of gang scheduling on a cluster using the Quadrics interconnection network. In such a cluster, the scheduler can take advantage of this network's unique capab... / delay removing the operating system from the communication ... communication patterns and fault tolerance. The Elan network interface

Using Abstraction To Improve Fault Tolerance - Castro, Rodrigues, Liskov (2001)
Software errors are a major cause of outages and they are increasingly exploited in malicious attacks. Byzantine fault tolerance allows replicated systems to mask some software errors but it is expens...

JVM Susceptibility to Memory Errors - Deqing Chen Alan (2001)
Modern computer systems are becoming more powerful and are using larger memories. However, except for very high end systems, little attention is being paid to high availability. This is particularly t... / CPU status and can notify the operating system to handle the exception. ... techniques to study the fault tolerance of UNIX systems. Fine is a

IP Network Configuration for Intradomain Traffic Engineering - Feldmann, Rexford (2001)
The smooth operation of the Internet depends on the careful configuration of routers in thousands of autonomous systems throughout the world. Configuring routers is extremely complicated because of ... / For example Cisco's Internet Operating System IOS has over commands. ... or peer for load balancing and fault tolerance. The delivery of traffic

Building Firewalls with Intelligent Network Interface Cards - Friedman, Nagle (2001)
The primary method for protecting networks today is to use a firewall: a boundary separating the protected network from the untrusted Internet. However, these firewalls offer no protection from internal a...

The L4Ka Vision - Dannowski, Elphinstone, Liedtke.. (2001)
Microkernels are minimal but highly flexible kernels. Both conventional and non-classical operating systems can be built on top or adapted to run on top of them. Microkernel-based architectures should... / and non-classical operating systems can be built on top or ... including reliability and fault tolerance protection and security.

The Quadrics Network (QsNet): High-Performance Clustering Technology - Petrini, Feng, Hoisie, Coll.. (2001)
The Quadrics interconnection network (QsNet) contributes two novel innovations to the field of high-performance interconnects: (1) integration of the virtual-address spaces of individual nodes into a si... / feats by extending the native operating system in the nodes with a network ... space and network fault tolerance via link-level and

A Highly Adaptable Infrastructure for Service Discovery and.. - Lalana Kagal Vladimir (2001)
In an age where wirelessly networked appliances and devices are becoming commonplace, there is a necessity for providing a standard interface to them that is easily accessible by any mobile user. The ... / components managed by an operating system Gaia OS which acts as a ... state management and increased fault tolerance. Even in the event of

Building Modern Distributed Systems - Pautet, Quinot, Tardieu (2001)
Ada 95 has been the first standardized language to include distribution in the core language itself. However, the set of features required by the Distributed Systems Annex of the Reference Manual is... / hardware with a working operating system. More advanced concepts such ... account advanced needs such as fault tolerance code migration or persistent

On Aspect-Orientation in Distributed Real-time Dependable Systems - Gal, Schröder-Preikschat, Spinczyk (2001)
The design and implementation of distributed real-time dependable systems is often dominated by non-functional considerations like timeliness, object placement and fault tolerance. In this paper we ... / library middleware layer or operating systems services. The variety and ... object placement and fault tolerance. In this paper we illustrate

Type Evolution and Version Management in a Persistent Distributed.. - Schoettner, Marquardt, Wende.. (2001)
Today's commercial Operating Systems (OS) use message passing communication facilities such as Corba, Remote Procedure Calls, and TCP/IP. Distributed Shared Memory (DSM) is an alternative mainly used f...

RTLinux with Address Spaces - Mehnert, Hohmuth, Schönberg, Härtig (2001)
The combination of a real-time executive and an off-the-shelf time-sharing operating system has the potential of providing both predictability and the comfort of a large application base. To isolate t... / an off-the-shelf time-sharing operating system has the potential of ... This increased level of fault tolerance is desirable for many

The Willow Survivability Architecture - Knight, Heimbigner, Wolf, Carzaniga, .. (2001)
this paper we summarize the Willow concepts and provide an overview of the Willow architecture. Finally we describe a demonstration application system that has been built on top of a prototype Willow ...

An Erlang-based hierarchical distributed VoD System - Juan Sanchez Jose (2001)
Video on Demand (VoD) is a service that enables users to request any multimedia content at any time, without being constrained by any pre-established scheduling. Current commercial solutions tend to... / tools has allowed for easy operating system and hardware neutrality. ... time for any user. Fault tolerance with x expected uptime

Dealing with Denial-of-Service Attacks in Agent-enabled Active and.. - Karnouskos (2001)
Denial of Service (DoS) attacks is a well-known problem with victims even among prestigious commercial sites. Such attacks in traditional networking are difficult to recognize and to handle. An active... / buffer etc The Node Operating System NodeOS provides the basic ... traffic and dependencies fault tolerance etc. The number of

A Framework for Group Integrity Management in Multimedia Multicasting - Andreas Meissner Lars (2001)
Multicast research has so far been focused on routing and network-level group management. Conditions on the composition of multicast groups have however been kept simple, with little efforts to speci ...

Implementation of Pipes in Distributed Process Management Protocol.. - Agarwal, al. (2001)
The Distributed Process Management Protocol (DPMP) developed at IIT Kanpur is a distributed operating system which facilitates load sharing among heterogenous UNIX workstations in a transparent fashio...

Automatic Failure Detection and Recovery for Java Servers - Klemm, Singh (2001)
Increasingly, server systems such as e-commerce and telecommunications servers are partially or completely implemented in the Java programming language. One reason why many developers prefer Java over... / are often only caught by the operating system which can result in the ... in C or C++ with no additional fault tolerance provisions. Suppose this

State Synchronization and Recovery for Strongly Consistent Replicated .. - Narasimhan, Moser, Melliar-Smith (2001)
The Eternal system provides transparent fault tolerance for CORBA applications, without requiring the modification of either the application or the ORB. Eternal replicates the application objects, and...

Compiling in a Persistent Distributed Shared Memory Environment - Schoettner, Marquardt, Wende, Link.. (2001)
Plurix is a general purpose Operating System (OS) developed for the PC platform. Network communication is implemented by a Distributed Shared Memory (DSM). Restartable transactions and optimistic ...

Scalable Resource Management in High Performance Computers - Frachtenberg, Petrini, Fernandez.. (2001)
Clusters of workstations have emerged as an important platform for building cost-effective, scalable and highly-available computers. Although many hardware solutions are available today, the largest c...

Framework for Testing the Fault-Tolerance of Systems Including OS and .. - Buchacker, Sieh (2001)
This paper presents an extensible framework for testing the behavior of networked machines running the Linux operating system in the presence of faults. The framework allows to inject a variety of faul...

End-To-End Fault Containment In Scalable Shared-Memory Multiprocessors - Teodosiu (2000)
Current shared-memory multiprocessors suffer from an inherent fragility, since a single hardware or system software failure can cause the entire machine to crash. This dissertation describes a combina... / unmodified off-the-shelf operating systems. I have validated this

Differentiated and Predictable Quality of Service in Web Server.. - Aron (2000)
As the World Wide Web experiences increasing commercial and mission-critical use, server systems are expected to deliver high and predictable performance. The phenomenal improvement in microprocessor ... / management facilities in the operating system software are studied. This

Supporting High-performance I/O in QoS-enabled ORB Middleware - Kuhns, Levine, Schmidt, O'Ryan (2000)   (Correct)
To be an effective platform for high-performance distributed applications, off-the-shelf Object Request Broker (ORB) middleware, such as CORBA, must preserve communication-layer quality of service (Qo... / and overview of the Solaris operating system. Supporting br concurrency control and fault tolerance. This requires an efficient

Dynamic User Management System for web sites - Christian (2000)   (Correct)
With the growing quantity of information around the world, besides the software development community, many other fields are interested in finding solutions for efficient information management. In t... / of the approach on an operating system. In the development of this br concurrency scalability fault tolerance and transparency. Though the

Hierarchical Error Detection in a Software Implemented Fault.. - Bagchi, Srinivasan, Whisnant.. (2000)   (Correct)
This paper proposes a hierarchical error detection framework for a Software Implemented Fault Tolerance (SIFT) layer of a distributed system. A four-level error detection hierarchy is proposed in the ... / of platforms hardware and operating systems They can migrate from one br in a Software Implemented Fault Tolerance SIFT Environment

Slipstream Processors: Improving both Performance and Fault Tolerance - Sundaramoorthy, al. (2000)   (Correct)
Processors execute the full dynamic instruction stream to arrive at the final output of a program, yet there exist shorter instruction streams that produce the same overall effect. We propose creating... / effect. Therefore the operating system creates two redundant br Improving both Performance and Fault Tolerance ABSTRACT Processors

Supporting a Flexible Parallel Programming Model on a Network of.. - Huang (2000)   (Correct)
Execution Model We provide an abstract parallel machine with shared memory to the programmer, so the users are not concerned with message passing, data and execution distribution, machine failures, a... / modification of the underlying operating system. Nested parallelism br high reliability distributed system. Operating Systems Review

Cooperating Threads Architecture: Improving both Performance and.. - Sundaramoorthy, Purser, Rotenberg (2000)   (Correct)
Processors execute the full dynamic instruction stream to arrive at the final output of a program, yet there exist shorter instruction streams that produce the same overall effect. We propose creating... / in Figure . Initially the operating system creates two redundant br Improving both Performance and Fault Tolerance April

Design and Implementation of QoS enabled OO Middleware - Kachroo, Krishnamurthy, Akers.. (2000)   (Correct)
The current interest in the commodity Internet for commercial purposes has helped to fuel R&D into advanced networks and distributed applications. Much of this research is addressing a common problem ... / endsystems Advances in operating system technology and techniques br availability and fault tolerance. Adaptive applications are

Scheduling with Global Information in Distributed Systems - Petrini, Feng (2000)   (Correct)
One of the major problems faced by the developers of parallel programs is the lack of a clear separation between the programming model and the operating system. In this paper, we present a new methodo... / the programming model and the operating system. In this paper we present a br non-trivial implementation of fault tolerance and the lack of a

On the Integration of Configuration and Meta-Level Programming.. - Loques, Sztajnberg, Leite, Lobosco (2000)   (Correct)
Configuration Programming, based on Architecture Description Languages, and Meta-Level Programming are considered promising approaches in the software engineering field. This paper shows that ther... / facilities of a particular operating system or by any mix of resources br from application modules during system operation. Configuration programming

Applying a Pattern Language to Develop Application-level Gateways - Schmidt (2000)   (Correct)
Developers of communication applications must address recurring design challenges related to efficiency, extensibility, and robustness. These challenges are often independent of application-specific r... / and relationships. Moreover operating system OS platform features br event loop integration and fault tolerance. Successful communication

Distributing Trust on the Internet - Cachin (2000)   (Correct)
This paper describes an architecture for secure and fault-tolerant service replication in an asynchronous network such as the Internet, where a malicious adversary may corrupt some servers and contr... / vary in their configuration operating system physical location load etc. br way for enhancing the fault tolerance of centralized components is

Time-Sharing Parallel Jobs in the Presence of Multiple Resource.. - Fabrizio Petrini And (2000)   (Correct)
Buffered coscheduling is a new methodology that can substantially increase resource utilization, improve response time, and simplify the development of the run-time support in a parallel machine. I... / Job Scheduling Distributed Operating Systems Communication Protocols br at kernel level to provide fault tolerance in the communication. For

EtheReal: A Fault Tolerant Host-Transparent Mechanism for Bandwidth.. - Varadarajan (2000)   (Correct)
of the Dissertation EtheReal: A Fault Tolerant Host-Transparent Mechanism for Bandwidth Guarantees over Switched Ethernet Networks by Srinidhi Varadarajan Doctor of Philosophy in Computer Scien... / any changes to the end host operating system and network br they carry this legacy in their fault tolerance mechanisms which while

A Study of Slipstream Processors - Purser, al. (2000)   (Correct)
A slipstream processor reduces the length of a running program by dynamically skipping computation non-essential for correct forward progress. The shortened program runs faster as a result, but it is ... / is instantiated twice by the operating system and each copy has its own br performance trends and fault tolerance are related. Time redundancy

Resource-Conscious Customization of CORBA for CAN-based Distributed.. - Kim, Jeon, Hong, Kim, Kim (2000)   (Correct)
The software components of embedded control systems get extremely complex as they are designed into distributed systems consisting of a large number of inexpensive microcontrollers interconnected by l... / supports from real-time operating systems well-defined network br regions and to achieve fault tolerance by replicating nodes.

An Enabling Framework for Master-Worker Applications on the.. - Jean-Pierre Goux Department (2000)   (Correct)
We describe MW -- a software framework that allows users to quickly and easily parallelize scientific computations using the master-worker paradigm on the computational grid. MW provides both a "top l... / include the architecture operating system amount of memory disk br must address issues such as fault tolerance task scheduling and

Concepts for Dependable Distributed Discrete Event Simulation - Lüthi, Berchtold (2000)   (Correct)
In many situations, parallel and distributed simulation is a well-suited approach to overcome performance as well as capacity limitations of complex simulation models. However, if distributed simulati... / which is provided by the operating system of each PE. The br Dependable Systems Fault Tolerance Hla Abstract In Many

NFTAPE: A Framework for Assessing Dependability in Distributed.. - Stott, Floering, Burke, Kalbarczyk.. (2000)   (Correct)
Many fault injection tools are available for dependability assessment. Although these tools are good at injecting a single fault model into a single system, they suffer from two main limitations for u... / interface hardware and operating system and . Versatile Error br SIFT software implemented fault tolerance middleware layer black box

Developing Next-generation Distributed Applications with QoS-enabled.. - Schmidt, Kachroo, Krishnamurthy.. (2000)   (Correct)
This paper describes how recent advances in distributed object computing (DOC) middleware are enabling the creation of common quality-of-service (QoS) capabilities that support next-generation distrib... / for applications on multiple operating system platforms. Keywords br secure communications and fault tolerance. There are also efforts to

Design Principles for Dynamic Object Systems - Salzmann (2000)   (Correct)
Dynamic distributed object systems (e.g. Jini, Salutation) are a new generation of distributed object middleware that enable the system to adapt its configuration during runtime. Those kind of middlew... / Internet Transport Operating System Virtual Platform Middleware br example need a high grade of fault tolerance. Fallout may cause serious

DOORS: Towards High-performance Fault Tolerant CORBA - Balachandran Natarajan Dept (2000)   (Correct)
An increasing number of applications are being developed using distributed object computing middleware, such as CORBA. Many of these applications require the underlying middleware, operating systems, ... / the underlying middleware operating systems and networks to provide br scalability and fault tolerance. The Object Management Group

Failure Recovery Algorithms for Multimedia servers - Shenoy, Vin (2000)   (Correct)
In this paper, we present two novel disk failure recovery methods that utilize the inherent characteristics of video streams for efficient recovery. Whereas the first method exploits the inherent re... / to software failures and operating-system crashes customers of br disk arrays -RAID -Fault tolerance -Video compression

Integrating Subscription-based and Connection-oriented Communications .. - Kim, Hong, Kim, Kim (2000)   (Correct)
Recently emerging component-based middleware technologies such as CORBA are widely believed to be a viable solution to the software complexity problem of a distributed embedded computer control syst... / on the mArx real-time operating system. Our measurements reveal br timing constraints and fault tolerance requirements. Recently

Data Replication Strategies for Fault Tolerance and Availability on.. - Amza, Cox, Zwaenepoel (2000)   (Correct)
Recent work has shown the advantages of using persistent memory for transaction processing. In particular, the Vista transaction system uses recoverable memory to avoid disk I/O, thus improving perfor... / namely power failures and operating system crashes. An un-interruptible br Data Replication Strategies for Fault Tolerance and Availability on Commodity

Implementing Journaling in a Linux Shared Disk File System - Kenneth Preslan Sistina (2000)   (Correct)
In computer systems today, speed and responsiveness is often determined by network and storage subsystem performance. Faster, more scalable networking interfaces like Fibre Channel and Gigabit Ether... / code to the open source Linux operating system. We did this for several br caching and aggregating file system operations to improve performance by

S-DSM for Heterogeneous Machine Architectures - Eduardo Pinheiro Deqing (2000)   (Correct)
Many---indeed most---distributed applications employ some notion of distributed shared state: information required at more than one location. For applications that span the Internet, this state is alm... / program and signals from the operating system. The user calls support br for access control and fault tolerance. They must also accommodate

HADES: A distributed System for Dependable Hard Real-Time.. - Chevochot, Puaut, Cabillic, Colin.. (2000)   (Correct)
Most dependable embedded real-time systems designed in the past have been specialized to meet the specific requirements of the application domain for which they were targeted, leading to inflexible an... / hard real-time operating system COTS components performance br significant overhead for the basic system operations task creation context

Transparent Migration of Distributed Communicating Processes - Nasika, Dasgupta (2000)   (Correct)
A Computing Community is a group of cooperating machines that behave like a single system and runs all general-purpose applications---without any modifications to the shrink-wrapped binary applicat... / binary applications or the operating system. In order to realize such a br global scheduling fault tolerance and application

iMW: A Web-Based Problem Solving Environment for Grid Computing.. - Good, Goux (2000)   (Correct)
Grid-enabled solvers are tied to complex grid computing platforms and are therefore difficult to distribute. To make such solvers useful to a wider community of users, remote access tools are needed... / and M. Humphrey. Legion An operating system for wide-area computing. br difficult issues such as fault tolerance task scheduling and

Highly configurable operating systems : the VVM approach - Piumarta, Folliot, Seinturier.. (2000)   (Correct)
this paper is not to tackle this problem, but rather to show that the VVM can be used as a "weaver constructor" for any given weaving model (AspectJ-like, D-like, etc.). The natural way to do this is ... / Highly configurable operating systems the VVM approach Ian br for communication fault tolerance mobility replication and

Extending MINIX with Real-Time Services and Fault Tolerance.. - Rogina, Wainer (2000)   (Correct)
The MINIX operating system was extended with real-time services, ranging from A/D drivers to new scheduling algorithms and statistics collection. A testbed was constructed to test several sensor repl... / Key-words Fault Tolerance Operating Systems Real-time Systems Sensing br with Real-Time Services and Fault Tolerance Capabilities Pablo J.

Operating System Management of MEMS-based Storage Devices - Griffin, Schlosser, Ganger, Nagle (2000)   (Correct)
MEMS-based storage devices promise significant performance, reliability, and power improvements relative to disk drives. This paper compares and contrasts these two storage technologies and explores ho... / of the th Symposium on Operating Systems Design and Implementation br to manage performance fault tolerance and power consumption. For

User-Level Infrastructure for System Call Interposition: A Platform.. - Jain, Sekar (2000)   (Correct)
Several new approaches for detecting malicious attacks on computer systems and/or confining untrusted or malicious applications have emerged over the past several years. These techniques often rely on... / are implemented within the operating system kernel. We explore an br e.g. data encryption or fault-tolerance e.g. data replication It

Supporting Component-Based Software Development Using Domain Knowledge - Baum, al. (2000)   (Correct)
A consistent implementation of component-based reuse bears several implications for the design of the software development process. For instance, requirements engineering has to be tailored to particu... / at the domain of embedded operating systems which is perceived small br aspects such as scalability or fault tolerance the architecture as a whole

Architecture for a Grid Operating System - Krauter, Maheswaran (2000)   (Correct)
A Grid architecture is proposed that is motivated by the large-scale routing principles in the Internet to provide an extensible, high-performance, scalable, and secure Grid. Central to the proposed a... / Architecture for a Grid Operating System Klaus Krauter and br of an intentional naming system Operating Systems Review Vol.

Evaluating The Performance of Non-Blocking Synchronisation on Modern.. - Tsigas, Zhang (2000)   (Correct)
Parallel programs running on shared memory multiprocessors coordinate via shared data objects/structures. To ensure the consistency of the shared data structures, programs typically rely on some forms... / parallel application or by the operating system very often need to share data br locks . they provide high fault tolerance processor failures will

A Flexible, Interoperable Framework for Active Spaces - Kon, Hess, Román, Campbell, Mickunas (2000)   (Correct)
this paper we describe the requirements faced by such a system and propose an integrated architecture meeting these requirements. The paper focuses on a representation of Active Spaces using standard ... / properly. Conventional operating systems already have a hard time br security and privacy fault-tolerance and quality of service.

Programming with Object Groups in CORBA - Felber, Guerraoui, Wiesmann (2000)   (Correct)
Our Object Group Service extends CORBA with the ability to gather several objects inside a group and to transparently handle the group membership and the consistent invocations of the group members. W... / in Maf VB nor on any operating system facility e.g.as in br Request For Proposal on CORBA Fault Tolerance. This paper does not detail

Towards Modelling and Verification of Concurrent Ada Programs Using.. - Burns, Wellings, Burns, Koelmans.. (2000)   (Correct)
Ada 95 is an expressive concurrent programming language with which it is possible to build complex multi-tasking applications. Much of the complexity of these applications stems from the interactions ... / these problems language and operating system researchers have introduced br in the world of software fault tolerance the notion of conversations

An optimized MPI library for VIA/SCI cards - Sven Schindler Wolfgang (2000)   (Correct)
Rapid developments in computer architecture and in networking technology have driven the construction of clusters of cluster. Now cluster computers are an inexpensive alternative to parallel computers... / is necessary to involve the operating system kernel for starting DMA

Experimental evaluation of the fail-silent behavior of a distributed.. - Chevochot, Puaut (2000)   (Correct)
Mainly for economic and maintainability reasons, more and more dependable real-time systems are built from Commercial Off-The-Shelf (COTS) components. To build these systems, a commonly-used assumptio... / COTS components hardware operating system The results show that with br they must be complemented by fault tolerance mechanisms error detection

Holistic Schedulability Analysis of a Fault-Tolerant Real-Time.. - Chevochot, Puaut (2000)   (Correct)
The feasibility test of a hard real-time system must not only take into account the temporal behavior of the application tasks but also the behavior of the run-time support in charge of executing appl... / components hardware and operating system COTS hardware and operating br a complex run-time support with fault-tolerance capabilities and made of

Request Sequencing: Optimizing Communication for the Grid - Arnold, Bachmann, Dongarra (2000)   (Correct)
As we research to make the use of Computational Grids seamless, the allocation of resources in these dynamic environments is proving to be very unwieldy. In this paper, we introduce, describe and ... / popular variants of the UNIX operating system and parts of the system are br tasks. Users Applications Fault Tolerance Load Balancing Server NS

Fault Tolerant Wide-Area Parallel Computing - Weissman (2000)   (Correct)
Executing parallel applications across distributed networks introduces the problem of fault tolerance. A viable solution for fault tolerance must keep overhead manageable and not compromise the hi... / in distributed systems and operating systems in which generic br introduces the problem of fault tolerance. A viable solution for fault

A Skeleton-Based Approach for the Design and Implementation of.. - Fethi Rabhi School (2000)   (Correct)
It has long been argued that developing distributed software is a difficult and error-prone activity. Based on previous work on design patterns and skeletons, this paper proposes a template-based appr... / programming language or host operating system. However such standards br data distribution and fault-tolerance. Part of the problem comes

A Simple, Fast and Scalable Non-Blocking Concurrent FIFO Queue for.. - Tsigas, Zhang (2000)   (Correct)
A non-blocking FIFO queue algorithm for multiprocessor shared memory systems is presented in this paper. The algorithm is very simple, fast and scales very well in both symmetric and non-symmetric mul... / applications algorithms and operating systems for multiprocessor systems. br locks . they provide high fault tolerance processor failures will

A Primitive Mobile-State Protocol for Constructing Fault-Tolerant.. - Brand, Van Roy, Collet, Klintskog (2000)   (Correct)
Mobile-state protocols are important for distributed object systems. We define a lightweight mobile-state protocol that has a well-defined behavior for site failures and network inactivities. The pr... / can burn disks can crash operating systems can be corrupted. br as a foundation for programming fault tolerance. As discussed in Gar

Grid-Based File Access: The Legion I/O Model - White (2000)   (Correct)
The unprecedented scale, heterogeneity, and varied usage patterns of grids pose significant technical challenges to any underlying file system that will support them. While grids present a host of new... / is an object-based grid operating system charged with reconciling a br scalability programming ease fault tolerance security and site autonomy.

Symbolic Program Execution Using the Erlang Verification Tool - Earle (2000)   (Correct)
this article is as follows. First, we introduce Erlang, the logic and the tool we use. Second, we present the symbolic program execution and debugging techniques and sketch some of the ideas behind th... / are handled in an distributed operating system while maintaining the br by special architectures for fault tolerance robust hardware and

Reanimating SAFER in VDM-SL Using CORBA - Fenkam (2000)   (Correct)
This paper presents a method for visual validation of systems based on their VDM-SL specification. In a traditional development process acceptance tests are carried out too late when a first release o... /

Nomad: A Scalable Operating System for Clusters of Uni and.. - Eduardo Souza De (2000)   (Correct)
The recent improvements in workstation and interconnection network performance have popularized the clusters of off-the-shelf workstations. However, the usefulness of these clusters is yet to be ful... / Nomad A Scalable Operating System for Clusters of Uni and br high disk I O throughput and fault tolerance anyway For instance in

Home-based Release Consistency in Object-based Software DSM Systems - Markus Zahn Computing (2000)   (Correct)
This paper discusses the application of consistency models in object-based software distributed shared memory (DSM) systems. In particular, we propose a home-based release consistency protocol as appli... / the same multithreaded operating system is used for single- and br of Computer Design and Fault Tolerance University of Karlsruhe

Writing High-Performance Server Applications in Haskell Case Study: A .. - Marlow (2000)   (Correct)
Server applications, and in particular network-based server applications, place a unique combination of demands on a programming language: lightweight concurrency and high I/O throughput are both impor... / access to the log file. . Operating-System Threads Operating system br a malevolent client. Fault tolerance is as important as

An Approach For Network Communications Systems Recovery - Mitchell, Brown (2000)   (Correct)
In this paper we examine the problem of failures within network communications and telecom systems and outline a localised Invisible Recovery solution to such systems. We introduce a new approach that... / pattern which is neither operating system nor protocol specific. This br know -way hand shake. Our IR system operates in the form of an envelope

Are COTS suitable for building distributed fault-tolerant hard.. - Chevochot, Colin, Decotigny, Puaut (2000)   (Correct)
For economic reasons, a new trend in the development of distributed hard real-time systems is to rely on the use of Commercial Off-The-Shelf (cots) hardware and operating systems. As such systems ofte... / cots hardware and operating systems. As such systems often br with stringent realtime and fault-tolerance requirements. The use of

Application-Level Fault Tolerance as a Complement to System-Level.. - Joshua Haines Jhaines (2000)   (Correct)
As multiprocessor systems become more complex, their reliability will need to increase as well. In this paper we propose a novel technique which is applicable to a wide variety of distributed real-t... / system software includes the operating system and components such as the br Application-Level Fault Tolerance as a Complement to

Modeling and Analysis of Software Aging and Rejuvenation - Trivedi, Vaidyanathan.. (2000)   (Correct)
Software systems are known to suffer from outages due to transient errors. Recently, the phenomenon of "software aging", one in which the state of the software system degrades with time, has been repo... / data collected from the UNIX operating system over a period of time. The br design diversity techniques for fault tolerance in software systems such as

Distributed architectures - Fernandez (2000)   (Correct)
evels enforce authorization constraints. Each level and mappings can be described using OO models and patterns. Authentication--- Uses cryptographic protocols Filtering---Some objects need to be fil... / Heterogeneity-A variety of Operating systems Unix several varieties br legal or security reasons Fault tolerance-Ability to stay up in the

A Universal Framework for Managing Metadata in the Distributed Dragon .. - Wedde, Siepmann (2000)   (Correct)
In the multimedia field, metadata are becoming increasingly important for efficiently cataloguing the abundant flood of information. (Metadata are data on information structures.) The number of electr... / in order to provide uniform operating system support running on a network br This results in higher system fault tolerance and faster local read

Injecting Distributed Capabilities into Legacy Applications Through.. - Boyd, Dasgupta (2000)   (Correct)
Applications and operating systems can be augmented with extra functionality by injecting additional middleware into the boundary layer between them, without tampering with their binaries. Using this ... / Abstract Applications and operating systems can be augmented with extra br are extractable during the system operation they are best captured

An object-oriented concurrent and distributed programming platform.. - Reinfelds (2000)   (Correct)
y. Today's demands on program and data dependability are so strong and so important, that the design and development of distributed concurrent multi-platform application programs is becoming the rule ... / data files were handled by the operating system. The management of br distribution structure fault tolerance and security. The current

Supporting the Design of Adaptable Operating Systems Using.. - Netinant Constantinides Elrad (2000)   (Correct)
Supporting separation of concerns in the design of operating systems can provide a number of benefits such as reusability, extensibility and reconfigurability. However, in order to maximize these bene... / the Design of Adaptable Operating Systems Using Aspect-Oriented br scheduling and fault tolerance cut across the basic

Giotto: A Time-triggered Language for Embedded Programming - Henzinger, Horowitz, Kirsch (2000)   (Correct)
Giotto provides an abstract programmer's model for the implementation of embedded control systems with hard real-time constraints. A typical control application consists of periodic software tasks t... / together with a real-time operating system Typical activities of the br and achieving a degree of fault tolerance through replication and error

The Cactus Approach to Building Configurable Middleware - Hiltunen, Schlichting (2000)   (Correct)
Introduction A number of fundamental abstractions and supporting software mechanisms have been developed for simplifying the problems associated with programming highly dependable distributed systems... / the application and above the operating system. For each type of br state machine approach to fault tolerance by ensuring that changes to

Providing Infrastructure And Interface To High-Performance.. - Arnold, Dongarra, Lee, Wheeler (2000)   (Correct)
The NetSolve project was established to aid scientists who prefer not to be concerned with the usual tedium associated with finding and maintaining software libraries which they use to create programs,... / popular variants of the UNIX operating system and parts of the system are br Users Applications Fault Tolerance Load Balancing Server NS

Virtualizing Operating Systems for Seamless Distributed Environments - Boyd, Dasgupta (2000)   (Correct)
Applications and operating systems can be augmented with extra functionality by injecting additional middleware into the boundary layer between them, without tampering with their binaries. Using this ... / Virtualizing Operating Systems for Seamless Distributed br the support for the overall system operation within the context of the

WOS: an Internet Computing Environment - Peter Kropf Informatique (2000)   (Correct)
Given the current development of the Internet, the Web, mobile communications and services, we are clearly heading towards an era of widely integrated ubiquitous services sharing some kind of global o... /

Towards Rapid Development of Configurable, Reliable, and Scalable.. - Buskens, Sabnani (2000)   (Correct)
This paper presents Aurora, a software toolkit that dramatically reduces the effort required to develop configurable, reliable, and scalable wireless applications. The toolkit consists of software lib... / in these systems excluding operating system and third party class br that provide initialization and fault tolerance support typically needed by

Integration of CORBA Services with a Dynamic Real-time Architecture - Andreas Polze Janek (2000)   (Correct)
The Common Object Request Broker Architecture (CORBA) is the most successful representative for an object-based distributed computing architecture. Although CORBA simplifies the implementation of co... / assumes that the underlying operating system supports either the priority br CORBA The idea of providing fault tolerance as additional feature to

Replication of CORBA Objects - Pascal Felber Rachid (2000)   (Correct)
Distributed computing is one of the major trends in the computer industry. As systems become more distributed, they also become more complex and have to deal with new kinds of problems, such as par... /

Harmonious Internal Clock Synchronization - Wedde, Freund (2000)   (Correct)
Internal clock synchronization has been investigated, or employed, for quite a number of years, under the requirement of good upper bounds for the deviation, or accuracy, between a predefined master n... / capacity each The local operating system is Suse Linux . .This br for the purpose of achieving fault tolerance. Nearly all previous

The USC Autonomous Flying Vehicle (AFV) Project: Year 2000 Status - Montgomery (2000)   (Correct)
this document. The BIT and health checks software has not yet been implemented. The vision processing software runs on a ground-based computer but has not yet been ported to execute onboard the AVATAR... / contains both the realtime operating system RTOS and flight software. br software for increasing robot fault tolerance vision processing

Linking And Loading In A Persistent Dsm Operating System - Schoettner, Marquardt, Wende.. (2000)   (Correct)
Our native Java compiler directly generates runtime structures in a persistent Distributed Shared Memory (DSM). The compiler has been used to build a general purpose PC Operating System (OS) on top of... /

Systematic Customization of Middleware - Zarras (2000)   (Correct)
The urgent need to deal with problems that are frequently met in many different families of application led to the evolution and standardization of a software layer that lies between the application a... / application and the underlying operating system. This layer is widely known br security transactions fault tolerance etc. Middleware is typically

A Distributed Virtual Reality Prototype for Real Time GPS Data - Ladner, Klos, Abdelguerfi, Richard.. (2000)   (Correct)
We describe a prototype that provides distributed, three-dimensional, interactive virtual worlds, which are enhanced with reliable communication and recording of real time events throughout the sys... / prototype application and the operating system. The core components of br prototype as desired for both fault tolerance and prototype functionality

Extending the Execution Environment with DITools - Serra, Navarro (2000)   (Correct)
This document describes the Dynamic Interposition Tools (DITools), a set of tools bringing an environment in which dynamically-linked executables can be extended at runtime with unforeseen functionali... / UPC-DAC- - Keywords operating systems extensibility br improvements e.g. to provide fault tolerance data stream encryption

Survivability Measure - Millen (2000)   (Correct)
Figure 1: Service Hierarchy. Services were given a survivability ordering: one service is no more survivable than another if every service set that supports t... / as a mail application and an operating system. The hardware is also built br and software. To study fault tolerance and reconfiguration we

Using Redundancy to Increase Survivability - Hiltunen, Schlichting, Ugarte (2000)   (Correct)
This paper focuses on two key requirements for using redundancy to improve survivability, the development of appropriate techniques and the availability of suitable system support. We begin by discuss... / of security where the existing operating system already provides br system failure when considering fault-tolerance attributes. This problem is

Intrusion Tolerant Systems - Partha Pal (2000)   (Correct)
this paper have outlined our approach, and presented several key problems that we are currently investigating unknown Intrusion Tolerant Systems Partha P. Pal, Franklin Webber, Richard E. Schantz an... / such as routers. At the operating system level we may see br failures if we follow fault-tolerance terminology caused by the

Designing Efficient Fault-Tolerant Systems on Wireless Networks - Guohong Cao (2000)   (Correct)
Introduction The falling cost of both communication and mobile computing devices (laptop computers, hand-held computers, etc.) is making mobile computing affordable to both business users and private... / the network level and the operating system level. At the network level br on wireless links. Traditional fault-tolerance schemes cannot be directly

Evaluating the Scalability of Distributed Systems - Jogalekar, Woodside (2000)   (Correct)
Many distributed systems must be scalable, meaning that they must be economically deployable in a wide range of sizes and configurations. This paper presents a scalability metric based on cost-effectiveness, where effec... / Microsoft's Windows operating system is discussed using in-memory br acceptable cost-benefit ratio the system operator. value the threshold is

XML based interfaces to TelORB - Johannesson, Tajallaei (2000)   (Correct)
The requirements on future telecommunication platforms include high performance, fault tolerant, open architectures and scalability. TelORB, which is a distributed operating system for large-scale, em... / which is a distributed operating system for large-scale embedded br include high performance fault tolerance open architectures and

Problem Formulations for QoS Management in Automatic Control - Martin Sanfridson (2000)   (Correct)
Quality-of-service management is a method where applications negotiate with a broker for resources. How can the notion of QoS, which comes from multimedia an... / by a broker is located to the operating system. The broker cooperates with br Can QoS improve the fault tolerance or does it make fault

A Dynamically Configurable Environment for High Performance Computing - Abdennadher, Babin, Kropf, Kuonen (2000)   (Correct)
Current tools available for high performance computing require that all the computing nodes used in a parallel execution be known in advance: the execution environment must know where the different "c... /

Runtime System Level Fault Tolerance for a Distributed Functional.. - Trinder, Pointon, Loidl (2000)   (Correct)
Functional languages potentially offer benefits for distributed fault tolerance: many computations are pure, and hence have no side-effects to be reversed during error recovery; moreover functional la... /

Testing for Software Vulnerability Using Environment Perturbation - Du, Mathur (2000)   (Correct)
We describe a methodology for testing a software system for possible security flaws. Based on the observation that most security flaws are caused by the program's inappropriate interactions with the ... /

Performance Availability for Networks of Workstations - Arpaci-Dusseau (1999)   (Correct)
Performance Availability for Networks of Workstations by Remzi H. Arpaci-Dusseau Software systems for large-scale distributed and parallel machines are difficult to build. When run in dynamic, pro... / . . . Operating System . br are unaware of the specifics of system operation. The problem of attaining

Local Anonymity In The Internet - Martin, Jr. (1999)   (Correct)
Packet-switched computer networks of all sizes are widely used for personal, professional, and governmental communication. However, the speed, versatility, and largely unregulated nature of computer n... / . . Operating System . br . . . Fault Tolerance .

Efficient Implementations of Software Architectures via Partial.. - Marlet, Thibault, Consel (1999)   (Correct)
The notion of flexibility (that is, the ability to adapt to changing requirements or execution contexts) is recognized as a key concern in structuring software, and many architectures have been desi... / available platforms hardware operating systems etc. and features the br as well as safety fault tolerance and quality of service.

Active Names: Programmable Location and Transport of Wide-Area.. - Vahdat, Anderson, Dahlin (1999)   (Correct)
Active Names are a general framework for the development and composition of wide-area applications. The key insight behind Active Names is the need to introduce programmability of name binding to supp... / for Programming Languages and Operating Systems Cambridge MA . Fox

Flexible and Adaptive Control of Real-Time Distributed Object.. - Loyall, Atlas, Schantz, Gill.. (1999)   (Correct)
Next-generation distributed systems have growing demands for real-time quality of service (QoS), flexibility, and control over the often unpredictable environments in which they are deployed. These... / and the underlying operating systems protocol stacks and br and preserve QoS during system operation. TAO's inband mechanisms

Automated Synthesis and Optimization of Robot Configurations: An.. - Leger (1999)   (Correct)
Robot configuration design is hampered by the lack of established, well-known design rules, and designers cannot easily grasp the space of possible designs and the impact of all design variables on a ... /

Chameleon: A Software Infrastructure for Adaptive Fault Tolerance - Kalbarczyk, Bagchi (1999)   (Correct)
This paper presents Chameleon, an adaptive infrastructure, which allows different levels of availability requirements to be simultaneously supported in a networked environment. Chameleon provides depe... / ARMORs the hardware and the operating system. Keywords adaptive fault br Infrastructure for Adaptive Fault Tolerance Z. Kalbarczyk S. Bagchi

Highly Reliable Upgrading of Components - Cook, Dage (1999)   (Correct)
After a system is deployed, fixes, enhancements, and modifications all occur that change the components that make up the system. Unfortunately, new versions of components can introduce new errors and ... / have not addressed software fault tolerance many of the issues

Mobility and Extensibility in the StratOSphere Framework - Wu, Agrawal, Abbadi (1999)   (Correct)
We describe the design and implementation of our StratOSphere project, a framework which unifies distributed objects and mobile code applications. We begin by first examining different mobile code par... / and packages distributed operating systems Gos and distributed br processes for load-balancing fault-tolerance and resilience. Systems such

The MultiSpace: an Evolutionary Platform for Infrastructural Services - Gribble, Welsh, Brewer, Culler (1999)   (Correct)
This paper presents the architecture for a Base, a clustered environment for building and executing highly available, scalable, but flexible and adaptable infrastructure services. Our architecture has t... / c application and feature set operating system and hardware platform. br all of the difficult service fault-tolerance availability and

A Web-Based Distributed Programming Environment - Aoki (1999)   (Correct)
A Web-Based Distributed Programming Environment Kiyoko F. Aoki A Java-based system, the GeoJAVA System, that allows a user to remotely compile his/her own C/C++ programs and execute them for visualiz... / space without specialized operating systems to handle such a procedure br which performs some checks for fault tolerance. For example in the case

Supporting Customized Failure Models for Distributed Software - Hiltunen, Immanuel, Schlichting (1999)   (Correct)
The cost of employing software fault-tolerance techniques in distributed systems is strongly related to the type of failures to be tolerated. For example, in terms of the amount of redundancy required... / On The Osf ri Mk . Mach Operating System And Cords tmr A Variant br The cost of employing software fault-tolerance techniques in distributed

CiteSeer - Copyright © 1997-2002 NEC Research Institute