Selection in the presence of memory faults, with applications to in-place resilient sorting
In Proc. 23rd ISAAC, volume 7676 of LNCS, 2012
Neutron Sensitivity and Software Hardening Strategies for Matrix Multiplication and FFT on Graphics Processing Units
Cited by 1 (1 self)
Abstract
In this paper, we compare the radiation response of GPUs executing matrix multiplication and FFT algorithms. The experimental results demonstrate that, for both algorithms, the output is affected by multiple errors in the majority of cases. The architectural and code analysis shows that multiple errors are caused by corruption of shared resources or by thread dependencies. The experimental data and analytical studies can be used to evaluate the expected error rate of GPUs in realistic applications and to design specific, optimized software-based hardening procedures.
Research Statement, 2012
Abstract
My main area of research is distributed algorithms in networks. I am particularly interested in the influence of both the amount and the quality of information on the efficiency with which a given task can be performed in a distributed way. Knowledge available to nodes of the network executing a given algorithm can be partitioned into information provided to them a priori and information acquired through message exchanges with other nodes. A natural approach to a quantitative study of the impact of information on efficiency is known in the literature as the advice paradigm. An oracle knowing the network gives a (possibly different) string of bits to each node. The nodes then execute a distributed algorithm that does not assume knowledge of the network, but uses communication between nodes and the advice provided by the oracle in order to complete the task. Thus the framework of advice makes it possible to quantify the minimum amount of information needed for an efficient solution of a given network problem, regardless of the type of information that is provided. On the other hand, the quality of information at the disposal of nodes depends on its accuracy (e.g., how precisely can a node of a sensor network perceive its geographic position) and on its reliability (e.g., how prone to faults are the network components, or how vulnerable to forgery are
Resilient dynamic programming, 2015
Abstract
We investigate the design of dynamic programming algorithms in unreliable memories, i.e., in the presence of errors that lead the logical state of some bits to be read differently from how they were last written. Assuming that a limited number of memory faults can be inserted at runtime by an adversary with unbounded computational power, we obtain the first resilient algorithms for a broad range of dynamic programming problems, devising a general framework that can be applied to both iterative and recursive implementations. Besides all local dependency problems, where updates to table entries are determined by the contents of neighboring cells, we also settle challenging non-local problems, such as all-pairs shortest paths and matrix multiplication. All our algorithms are correct with high probability and match the running time of their standard non-resilient counterparts while tolerating a polynomial number of faults. The recursive algorithms are also cache-efficient and can tolerate faults at any level of the memory hierarchy. Our results exploit a careful combination of data replication, majority techniques, fingerprint computations, and lazy fault detection. To cope with the complex data access patterns induced by some of our algorithms, we also devise amplified fingerprints, which might be of independent interest in the design of resilient algorithms for different problems.
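The abstract names data replication and majority techniques without giving details. As a rough illustration only, and not the paper's algorithm, the sketch below shows the textbook form of the idea: a value is stored in 2*delta + 1 copies, so a majority read still returns the written value as long as at most delta copies are corrupted. The class name `ReplicatedCell` and the parameter `delta` are hypothetical names chosen for this example.

```python
from collections import Counter


class ReplicatedCell:
    """Hypothetical helper: one logical value kept as 2*delta + 1 copies."""

    def __init__(self, value, delta):
        # delta is the assumed upper bound on the number of corrupted copies.
        self.delta = delta
        self.copies = [value] * (2 * delta + 1)

    def write(self, value):
        # Overwrite every copy so that all replicas agree again.
        for i in range(len(self.copies)):
            self.copies[i] = value

    def read(self):
        # With at most delta corrupted copies, the written value still
        # occupies at least delta + 1 slots and therefore wins the vote.
        value, _count = Counter(self.copies).most_common(1)[0]
        return value


# Simulate two memory faults on a cell that tolerates delta = 2.
cell = ReplicatedCell(42, delta=2)
cell.copies[0] = 7
cell.copies[3] = -1
recovered = cell.read()
print(recovered)
```

Reading and writing every copy costs time proportional to delta on each access, so a scheme like this would typically be reserved for a small number of critical values rather than applied to a whole table.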