Abstract: To help tolerate the latency of accessing remote
data in a shared-memory multiprocessor, we explore a
novel approach to switch-on-miss multithreading that
is software-controlled rather than hardware-controlled.
Our technique uses informing memory operations to
trigger the thread switches with sufficiently low overhead
that we observe speedups of 10% or more for
four out of seven applications, with one application
speeding up by 14%. By selectively applying register
partitioning to reduce thread ...
Context of citations to this paper:
...switch [6] There are two ways of thinking for reducing the context switch overhead. One is to shorten the context handling time itself[9, 10]. The other is to reduce the chances of switching. In this paper, we explore the latter approach to improve the performance. In order...
.... software based multithreading on an architecture in which a cache miss causes the processor to branch to a predetermined user level PC [52]. They set this PC to a light weight context switch routine and suggest compiler based register file partitioning to reduce context switch...
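The switch-on-miss idea described in the abstract and in the citation context above can be illustrated with a toy cycle-count model. This is only a sketch under stated assumptions, not the authors' implementation: the function name `simulate`, the cost constants, the infinite cache, and the round-robin queue are all hypothetical choices made for illustration. The "miss handler" here is simply the branch that requeues the current thread and moves on to the next one.

```python
from collections import deque

# Assumed, illustrative costs in cycles (not taken from the paper).
HIT_COST = 1        # access that hits in the cache
MISS_LATENCY = 10   # time until a missing line arrives
SWITCH_COST = 2     # software thread switch run from the miss handler


def simulate(traces, switch_on_miss):
    """Return total cycles needed to run every address trace to completion.

    traces: one list of addresses per thread (use disjoint addresses,
            since a line counts as cached as soon as it is requested).
    switch_on_miss: if True, a miss hands the processor to another
            thread instead of stalling for the full miss latency.
    """
    cached = set()  # infinite cache: an address misses only once
    now = 0
    # Each queue entry is (ready_at, next_position, trace).
    queue = deque((0, 0, trace) for trace in traces)
    while queue:
        ready_at, pos, trace = queue.popleft()
        now = max(now, ready_at)  # idle if this thread's miss is still pending
        while pos < len(trace):
            addr = trace[pos]
            pos += 1
            if addr in cached:
                now += HIT_COST
                continue
            cached.add(addr)
            if switch_on_miss and queue:
                # Miss handler: requeue this thread until its data arrives,
                # pay the switch overhead, and run another thread.
                queue.append((now + MISS_LATENCY, pos, trace))
                now += SWITCH_COST
                break
            now += MISS_LATENCY  # no other thread to run: stall
    return now


if __name__ == "__main__":
    traces = [[0, 1, 2, 3], [10, 11, 12, 13]]
    print("stall on miss: ", simulate(traces, switch_on_miss=False))
    print("switch on miss:", simulate(traces, switch_on_miss=True))
```

In this model the benefit depends entirely on the ratio of `SWITCH_COST` to `MISS_LATENCY`, which mirrors the concern in the citation contexts about keeping context-switch overhead low.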
Cited by:
Balanced Multithreading: Increasing Throughput via a.. - Tune, Kumar, Tullsen, .. (2004)
Intelligent Memory Manager Eliminates Cache Pollution Due to.. - Rezaei, Kavi (2003)
Mini-threads: Increasing TLP on Small-Scale SMT Processors - Joshua Redstone Susan (2003)
Similar documents (at the sentence level):
62.6%: Software-Controlled Multithreading Using Informing Memory.. - Mowry, Ramkissoon (1998)
Active bibliography (related documents):
0.2: Architectural Support for Copy and Tamper-Resistant Software - Lie (2003)
0.2: Efficient Execution of Compressed Programs - Lefurgy (2000)
0.1: Reducing Coherence-Related Communication in Software.. - Speight, Bennett (1998)
Similar documents based on text:
0.5: Per-Node Multi-Threading and Remote Latency - Kritchalach Thitikamol And
0.2: Informing Loads: Enabling Software To Observe And.. - Horowitz.. (1995)
0.2: Informing Memory Operations: Providing Memory Performance.. - Horowitz (1996)
Related documents from co-citation:
6: Exploiting choice: Instruction fetch and issue on an implementable simultaneous .. - Tullsen, Eggers et al. - 1996
5: APRIL: A Processor Architecture for Multiprocessing - Agarwal, Lim et al. - 1990
4: Simultaneous Subordinate Microthreading - Chappell, Stark et al. - 1999
BibTeX entry:
Todd Mowry. Software-controlled multithreading using informing memory operations. Technical Report CMU-CS-98-169, School of Computer Science, Carnegie Mellon University; http://www.cs.cmu.edu/~tcm/tcm papers/sw mt tr98.ps, October 1998. http://citeseer.ist.psu.edu/mowry98softwarecontrolled.html
@inproceedings{mowry00softwarecontrolled,
  author    = "Todd C. Mowry and Sherwyn R. Ramkissoon",
  title     = "Software-Controlled Multithreading Using Informing Memory Operations",
  booktitle = "{HPCA}",
  pages     = "121--132",
  year      = "2000",
  url       = "citeseer.ist.psu.edu/mowry98softwarecontrolled.html"
}
Citations (may not include all citations):
474: A data locality optimizing algorithm - Wolf, Lam - 1991
358: The Tera Computer System - Alverson, Callahan et al. - 1990
353: The SPLASH-2 programs: Characterization and methodological c.. - Woo, Ohara et al. - 1995
251: Simultaneous Multithreading: Maximizing On-Chip Parallelism - Tullsen, Eggers et al. - 1995
249: Tolerating Latency Through Software-Controlled Data Prefetchi.. - Mowry - 1994
222: The SGI Origin: A ccNUMA Highly Scalable Server - Laudon, Lenoski - 1997
212: APRIL: A Processor Architecture for Multiprocessing - Agarwal, Lim et al. - 1990
186: Exploiting Choice: Instruction Fetch and Issue on an Impleme.. - Tullsen, Eggers et al. - 1996
157: Architecture and applications of the HEP multiprocessor comp.. - Smith - 1981
137: Lockup-free Instruction Fetch/Prefetch Cache Organization - Kroft - 1981
136: Superscalar Microprocessor - Yeager - 1996
131: Fine-Grain Access Control for Distributed Shared Memory - Schoinas, Falsafi et al. - 1994
104: Compiler-Based Prefetching for Recursive Data Structures - Luk, Mowry - 1996
51: An Integrated Compile-Time/Run-Time Software Distributed Sha.. - Dwarkadas, Cox et al. - 1996
40: Interleaving: A Multithreading Technique Targeting Multiproc.. - Laudon, Gupta et al. - 1994
33: Register relocation: Flexible contexts for multithreading - Waldspurger, Weihl - 1993
19: Multi-threading and Remote Latency in Software DSMs - Thitikamol, Keleher - 1997
14: Comparative Evaluation of Latency Tolerance Techniques for S.. - Mowry, Chan et al. - 1998
4: Informing Memory Operations: Memory Performance Feedback Mech.. - Horowitz, Martonosi et al. - 1998
Documents on the same site (http://www.cs.cmu.edu/~tcm/Papers.html):
Predicting Data Cache Misses in Non-Numeric Applications.. - Mowry, Luk (1997)
Automatic Compiler-Inserted I/O Prefetching for.. - Mowry, Demke, Krieger (1996)
Informing Loads: Enabling Software To Observe And.. - Horowitz.. (1995)