Results 1-10 of 6,125
LogP: Towards a Realistic Model of Parallel Computation
, 1993
"... A vast body of theoretical research has focused either on overly simplistic models of parallel computation, notably the PRAM, or overly specific models that have few representatives in the real world. Both kinds of models encourage exploitation of formal loopholes, rather than rewarding developme ..."
Abstract

Cited by 562 (15 self)
A vast body of theoretical research has focused either on overly simplistic models of parallel computation, notably the PRAM, or overly specific models that have few representatives in the real world. Both kinds of models encourage exploitation of formal loopholes, rather than rewarding
The SPLASH-2 programs: Characterization and methodological considerations
 INTERNATIONAL SYMPOSIUM ON COMPUTER ARCHITECTURE
, 1995
"... The SPLASH-2 suite of parallel applications has recently been released to facilitate the study of centralized and distributed shared-address-space multiprocessors. In this context, this paper has two goals. One is to quantitatively characterize the SPLASH-2 programs in terms of fundamental propertie ..."
Abstract

Cited by 1399 (12 self)
The SPLASH-2 suite of parallel applications has recently been released to facilitate the study of centralized and distributed shared-address-space multiprocessors. In this context, this paper has two goals. One is to quantitatively characterize the SPLASH-2 programs in terms of fundamental properties and architectural interactions that are important to understand them well. The properties we study include the computational load balance, communication-to-computation ratio and traffic needs, important working set sizes, and issues related to spatial locality, as well as how these properties scale with problem size and the number of processors. The other, related goal is methodological: to assist people who will use the programs in architectural evaluations to prune the space of application and machine parameters in an informed and meaningful way. For example, by characterizing the working sets of the applications, we describe which operating points in terms of cache size and problem size are representative of realistic situations, which are not, and which are redundant. Using SPLASH-2 as an example, we hope to convey the importance of understanding the interplay of problem size, number of processors, and working sets in designing experiments and interpreting their results.
Interior Point Methods in Semidefinite Programming with Applications to Combinatorial Optimization
 SIAM Journal on Optimization
, 1993
"... We study the semidefinite programming problem (SDP), i.e., the problem of optimization of a linear function of a symmetric matrix subject to linear equality constraints and the additional condition that the matrix be positive semidefinite. First we review the classical cone duality as specialized to S ..."
Abstract

Cited by 557 (12 self)
We study the semidefinite programming problem (SDP), i.e., the problem of optimization of a linear function of a symmetric matrix subject to linear equality constraints and the additional condition that the matrix be positive semidefinite. First we review the classical cone duality as specialized to SDP. Next we present an interior point algorithm which converges to the optimal solution in polynomial time. The approach is a direct extension of Ye's projective method for linear programming. We also argue that most known interior point methods for linear programs can be transformed in a mechanical way to algorithms for SDP with proofs of convergence and polynomial time complexity also carrying over in a similar fashion. Finally we study the significance of these results in a variety of combinatorial optimization problems including the general 0-1 integer programs, the maximum clique and maximum stable set problems in perfect graphs, the maximum k-partite subgraph problem in graphs, and va...
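For orientation, the problem described in words in this abstract is commonly written in the following standard form. The symbols C, A_i, b_i, and m are assumed here for illustration and do not appear in the abstract; the paper's own notation may differ.

```latex
% Requires amsmath. X is the symmetric matrix variable; C and the A_i are
% given symmetric matrices, and the b_i are given scalars (assumed notation).
\begin{align*}
\text{minimize}\quad   & \langle C, X \rangle = \operatorname{tr}(C X) \\
\text{subject to}\quad & \langle A_i, X \rangle = b_i, \quad i = 1, \dots, m, \\
                       & X \succeq 0 .
\end{align*}
```

Here the objective is the linear function of the symmetric matrix, the m trace constraints are the linear equality constraints, and X ⪰ 0 is the positive semidefiniteness condition.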
Reflections on PRAM
, 1997
"... This paper has been written as a contribution to the Conference on Statistical Data Protection '98, March 25-27, 1998, Lisbon, Portugal. ..."
Abstract

Cited by 1 (0 self)
This paper has been written as a contribution to the Conference on Statistical Data Protection '98, March 25-27, 1998, Lisbon, Portugal.
On the Physical Design of PRAMs
, 1993
"... The Saarbrücken Parallel Random Access Machine (SB-PRAM) is a scalable shared memory machine. At the gate level it is a re-engineered version of the Fluent machine [A. G. Ranade, S. N. Bhatt and S. L. Johnson. The Fluent Abstract Machine. In Proc. 5th MIT Conference on Advanced Research in VLSI, pp. ..."
Abstract

Cited by 49 (13 self)
The Saarbrücken Parallel Random Access Machine (SB-PRAM) is a scalable shared memory machine. At the gate level it is a re-engineered version of the Fluent machine [A. G. Ranade, S. N. Bhatt and S. L. Johnson. The Fluent Abstract Machine. In Proc. 5th MIT Conference on Advanced Research in VLSI, pp
Balanced Allocations
 SIAM Journal on Computing
, 1994
"... Suppose that we sequentially place n balls into n boxes by putting each ball into a randomly chosen box. It is well known that when we are done, the fullest box has with high probability (1 + o(1)) ln n / ln ln n balls in it. Suppose instead that for each ball we choose two boxes at random and place ..."
Abstract

Cited by 331 (8 self)
Suppose that we sequentially place n balls into n boxes by putting each ball into a randomly chosen box. It is well known that when we are done, the fullest box has with high probability (1 + o(1)) ln n / ln ln n balls in it. Suppose instead that for each ball we choose two boxes at random and place the ball into the one which is less full at the time of placement. We show that with high probability, the fullest box contains only ln ln n / ln 2 + O(1) balls, exponentially less than before. Furthermore, we show that a similar gap exists in the infinite process, where at each step one ball, chosen uniformly at random, is deleted, and one ball is added in the manner above. We discuss consequences of this and related theorems for dynamic resource allocation, hashing, and online load balancing.
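The placement process described in this abstract is easy to simulate. The following is a minimal sketch, not code from the paper: the function name `max_load`, the `choices` parameter, and the decision to sample candidate boxes with replacement are all assumptions made for illustration.

```python
import random


def max_load(n, choices, rng):
    """Place n balls into n boxes and return the load of the fullest box.

    Each ball samples `choices` boxes uniformly at random (with replacement,
    an assumption of this sketch) and goes into the least-full one.
    choices=1 is the classical single-choice process.
    """
    boxes = [0] * n
    for _ in range(n):
        candidates = [rng.randrange(n) for _ in range(choices)]
        # Ties are broken in favor of the first sampled candidate.
        target = min(candidates, key=lambda i: boxes[i])
        boxes[target] += 1
    return max(boxes)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so runs are repeatable
    n = 10_000
    print("one choice :", max_load(n, 1, rng))
    print("two choices:", max_load(n, 2, rng))
```

Running it for increasing n should show the two-choice maximum growing far more slowly than the one-choice maximum, in line with the ln ln n / ln 2 + O(1) versus ln n / ln ln n bounds quoted above.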
On the Cost-Effectiveness of PRAMs
, 1991
"... We introduce a formalism which allows to treat computer architecture as a formal optimization problem. We apply this to the design of shared memory parallel machines. Present computers of this type support the programming model of a shared memory. But simultaneous access to the shared memory by seve ..."
Abstract

Cited by 33 (12 self)
We introduce a formalism which allows to treat computer architecture as a formal optimization problem. We apply this to the design of shared memory parallel machines. Present computers of this type support the programming model of a shared memory. But simultaneous access to the shared memory by several processors is in many situations processed sequentially. Asymptotically good solutions for this problem are offered by theoretical computer science. We modify these constructions under engineering aspects and improve the price/performance ratio by roughly a factor of 6. The resulting machine has surprisingly good price/performance ratio even if compared with distributed memory machines. For almost all access patterns of all processors into the shared memory, access is as fast as the access of only a single processor.
1 Introduction
Commercially available parallel machines can be classified as distributed memory machines or shared memory machines. Exchange of data between different proce...
Fast connected components algorithms for the EREW PRAM
 SIAM J. Comput
, 1999
"... We present fast and efficient parallel algorithms for finding the connected components of an undirected graph. These algorithms run on the exclusive-read, exclusive-write (EREW) PRAM. On a graph with n vertices and m edges, our randomized algorithm runs in O(log n) time using (m + n^{1+ε}) / log n EREW process ..."
Abstract

Cited by 31 (3 self)
We present fast and efficient parallel algorithms for finding the connected components of an undirected graph. These algorithms run on the exclusive-read, exclusive-write (EREW) PRAM. On a graph with n vertices and m edges, our randomized algorithm runs in O(log n) time using (m + n^{1+ε}) / log n EREW
The Owner Concept for PRAMs
, 1991
"... We analyze the owner concept for PRAMs. In OROW-PRAMs each memory cell has one distinct processor that is the only one allowed to write into this memory cell and one distinct processor that is the only one allowed to read from it. By symmetric pointer doubling, a new proof technique for OROW-PRAMs, ..."
Abstract

Cited by 19 (5 self)
We analyze the owner concept for PRAMs. In OROW-PRAMs each memory cell has one distinct processor that is the only one allowed to write into this memory cell and one distinct processor that is the only one allowed to read from it. By symmetric pointer doubling, a new proof technique for OROW-PRAMs
Parallel Algorithmic Techniques: PRAM Algorithms And PRAM Simulations
, 1995
"... PRAM, which is the Priority CRCW PRAM in which each processor can perform arbitrary complex local operations in a single step. Clearly the Abstract PRAM is stronger than the Priority CRCW PRAM, and actually, it is stronger than any other standard (hence we do not take into account the Minimum CRCW ..."
Abstract
PRAM, which is the Priority CRCW PRAM in which each processor can perform arbitrary complex local operations in a single step. Clearly the Abstract PRAM is stronger than the Priority CRCW PRAM, and actually, it is stronger than any other standard (hence we do not take into account the Minimum CRCW