Results 1 - 10 of 65

Implementation and performance of Munin

by John B. Carter, John K. Bennett, Willy Zwaenepoel - In Proceedings of the 13th ACM Symposium on Operating Systems Principles, 1991
"... Munin is a distributed shared memory (DSM) system that allows shared memory parallel programs to be executed efficiently on distributed memory multiprocessors. Munin is unique among existing DSM systems in its use of multiple consistency protocols and in its use of release consistency. In Munin, sha ..."
Cited by 587 (22 self)
to keep memory consistent. Munin's multiprotocol release consistency is implemented in software using a delayed update queue that buffers and merges pending outgoing writes. A sixteen-processor prototype of Munin is currently operational. We evaluate its implementation and describe the execution

Evaluation of Release Consistent Software Distributed Shared Memory on Emerging Network Technology

by Sandhya Dwarkadas, Pete Keleher, Alan L. Cox, Willy Zwaenepoel
"... We evaluate the effect of processor speed, network characteristics, and software overhead on the performance of release-consistent software distributed shared memory. We examine five different protocols for implementing release consistency: eager update, eager invalidate, lazy update, lazy invalidat ..."
Cited by 467 (43 self)
independent of the protocol used. Medium-grained applications, such as Water, can achieve good performance, but the choice of protocol is critical. For sixteen processors, the best protocol, lazy hybrid, performed more than three times better than the worst, the eager update. Fine-grained applications

Digital Beam Former Architecture for Sixteen Elements Planar Phased Array Radar

by Mirza Mukram Baig, Md Wajid, Hussain M-tech, Imthiazunnisa Begum M-tech, Md Abdul Khader, Cs Sadaq Basha
"... Beam forming is a signal processing technique used in antenna arrays for directional signal transmission or reception. Phased array radar is very important in modern radar development, and multiple digital beams forming technology is the most significant technology in phased array radar. D ..."
Digital multiple beam forming on each antenna element about large phased array radar is impossible in processor based digital processing units, because it needs simultaneous processing many A/D channels. This paper describes architecture for a digital beam former developed for 16 element phased array

Design and Analysis of a Many-Core Processor Architecture for Multimedia Applications

by Jyu-Yuan Lai, Po-Yu Chen, Ting-Shuo Hsu, Chih-Tsun Huang, Jing-Jia Liou
"... We present a design of many-core processor architecture with superior cost-effectiveness to fulfill the rapid increasing demand of high-speed embedded multimedia applications. The prototype platform consists of sixteen processor cores and a 4-by-4 mesh-based duplex network interconnection ..."

The Hardware Architecture and Linear Expansion of Tandem NonStop Systems

by Robert Horst, Tim Chou - Proc. 12th Int. Conf. Computer Architecture, 1985
"... The Tandem NonStop TXP is a commercially available multiple processor system that delivers mainframe class performance for transaction processing applications. Several sixteen-processor systems may be configured in a ring structure using fiber optics. This structure allows from two to over two hundr ..."
Cited by 5 (2 self)

Scientific Computing Research Environments for the Mathematical Sciences

by Matthias Heinkenschloss, Petr Kloucek, Yin Zhang, 2001
"... This report describes the research projects and accomplishments made possible through the availability of the sixteen processor SGI Origin 2000, purchased in parts with the funds from NSF SCREMS grant NSF 98-72009. To date the SGI Origin 2000 has served as the main computing facility in many inte ..."

An algorithm-by-blocks for SuperMatrix band Cholesky factorization

by Gregorio Quintana-Ortí, Enrique S. Quintana-Ortí, Alfredo Remón, Robert A. van de Geijn - In VECPAR ’08: Proceedings of the Eighth International Meeting on High Performance Computing for Computational Science
"... We pursue the scalable parallel implementation of the factorization of band matrices with medium to large bandwidth targeting SMP and multi-core architectures. Our approach decomposes the computation into a large number of fine-grained operations exposing a higher degree of parallelism. Th ..."
Cited by 3 (3 self)
The SuperMatrix run-time system allows an out-of-order scheduling of operations that is transparent to the programmer. Experimental results for the Cholesky factorization of band matrices on two parallel platforms with sixteen processors demonstrate the scalability of the solution.

Utilizing Memory Bandwidth in DSP Embedded Processors

by Catherine H. Gebotys, 2001
"... This paper presents a network flow approach to solving the register binding and allocation problem for multiword memory access DSP processors. In recently announced DSP processors, such as Star*core, sixteen bit instructions which simultaneously access four words from memory are supported. A pol ..."
Cited by 1 (0 self)

Optical logic array processor using shadowgrams

by J. Ichioka, J. Tanida, Y. Ichioka - J. Opt. Soc. Am., 1983
"... On the basis of a lensless shadow-casting technique, a new, simple method of optically implementing digital logic gates has been developed. These gates are capable of performing a complete set of logical operations on a large array of binary variables in parallel, i.e., the pattern logics. A light-e ..."
Cited by 4 (1 self)
light-emitting diode (LED) array is used as an incoherent light source in the lensless shadow-casting system. Sixteen possible functions of two binary variables are simply realizable with these gates in parallel by controlling the switching modes of the LED's. Experimental results demonstrate the feasibility

A Comparative Evaluation of Parallel Garbage Collector Implementations

by Clement Attanasio, David Bacon, Anthony Cocchi, Stephen Smith, 2001
"... While uniprocessor garbage collection is relatively well understood, experience with collectors for large multiprocessor servers is limited and it is unknown which techniques best scale with large memories and large numbers of processors. In order to explore these issues we designed a modular gar ..."
Cited by 32 (4 self)
as how little memory they can run them in. All of our collectors scale linearly up to sixteen processors. The least memory is usually required by the hybrid mark-sweep collector that uses a copying collector for its nursery, although sometimes the non-generational mark-sweep collector requires less
