Results 1-10 of 20
Soft Real-Time Scheduling on Simultaneous Multithreaded Processors
, 2002
Cited by 54 (1 self)
Simultaneous multithreading (SMT) improves processor throughput by processing instructions from multiple threads each cycle. This is the first work to explore soft real-time scheduling on an SMT processor. Scheduling with SMT requires two decisions: (1) which threads to run simultaneously (the coschedule), and (2) how to share processor resources among coscheduled threads. We explore algorithms for both for soft real-time multimedia applications, focusing more on coschedule selection. We examine previous multiprocessor coscheduling algorithms, including partitioning and global scheduling. We propose new variations that consider resource sharing and try to utilize SMT more effectively by exploiting application symbiosis. We find (using simulation) that the best algorithm uses global scheduling, exploits symbiosis, prioritizes high-utilization tasks, and uses dynamic resource sharing. This algorithm, however, imposes significant profiling overhead and does not provide admission control. We propose alternatives to overcome these limitations, but at the cost of schedulability.
The Design and Implementation of a Region-Based Parallel Language
 UNIVERSITY OF WASHINGTON
, 2001
Cited by 21 (9 self)
This dissertation describes the design and implementation of ZPL.
Parallel multilevel tetrahedral grid refinement
 SIAM J. Sci. Comput
, 2005
Cited by 8 (5 self)
In this paper we analyze a parallel version of a multilevel red/green local refinement algorithm for tetrahedral meshes. The refinement method is similar to the approaches used in the UG-package [33] and by Bey [11, 12]. We introduce a new data distribution format that is very suitable for the parallel multilevel refinement algorithm. This format is called an admissible hierarchical decomposition. We will prove that the application of the parallel refinement algorithm to an input admissible hierarchical decomposition yields an admissible hierarchical decomposition. The analysis shows that the data partitioning between the processors is such that we have a favourable data locality (e.g., parent and children are on the same processor) and at the same time only a small number of copies. AMS subject classifications. 65N50, 65N55
T and Prandi D. Taming the complexity of biological pathways through parallel computing
 Brief Bioinform
Cited by 8 (1 self)
Biological systems are characterised by a large number of interacting entities whose dynamics are described by a number of reaction equations. Mathematical methods for modelling biological systems are mostly based on a centralised solution approach: the modelled system is described as a whole and the solution technique, normally the integration of a system of ordinary differential equations (ODEs) or the simulation of a stochastic model, is commonly computed in a centralised fashion. In recent times, research efforts have moved towards the definition of parallel/distributed algorithms as a means to tackle the complexity of biological model analysis. In this article, we present a survey of the progress of such parallelisation efforts, describing the most promising results obtained so far.
On Graphs, GPUs, and Blind Dating: A Workload to Processor Matchmaking Quest
Cited by 5 (1 self)
Graph processing has gained renewed attention. The increasingly large scale and wealth of connected data, such as those accrued by social network applications, demand the design of new techniques and platforms to efficiently derive actionable information from large-scale graphs. Hybrid systems that host processing units optimized for both fast sequential processing and bulk processing (e.g., GPU-accelerated systems) have the potential to cope with the heterogeneous structure of real graphs and enable high-performance graph processing. Reaching this point, however, poses multiple challenges. The heterogeneity of the processing elements (e.g., GPUs implement a different parallel processing model than CPUs and have much less memory) and the inherent irregularity of graph workloads require careful graph partitioning and load assignment. In particular, the workload generated by a partitioning scheme should match the strength of the processing element the partition is allocated to. This work explores the feasibility and quantifies the performance gains of such low-cost partitioning schemes. We propose to partition the workload between the two types of processing elements based on vertex connectivity. We show that such partitioning schemes offer a simple, yet efficient way to boost the overall performance of the hybrid system. Our evaluation illustrates that processing a 4-billion-edge graph on a system with one CPU socket and one GPU, while offloading as little as 25% of the edges to the GPU, achieves a 2x performance improvement over state-of-the-art implementations running on a dual-socket symmetric system. Moreover, for the same graph, a hybrid system with dual sockets and dual GPUs is capable of 1.13 billion breadth-first search traversed edges per second, a performance rate that is competitive with the latest entries in the Graph500 list, yet at a much lower price point.
Multilevel Compound Tree – Construction Visualization and Interaction
 Proceedings of the International Conference on Human-Computer Interaction (INTERACT ’05), Lecture Notes in Computer Science 3583
, 2005
Cited by 4 (0 self)
Several hierarchical clustering techniques have been proposed to visualize large graphs, but fewer solutions suggest a focus-based approach. We propose a multilevel clustering technique that produces, in linear time, a contextual clustered view depending on a user focus. We get a tree of clusters where each cluster, called a meta-silhouette, is itself hierarchically clustered into an inclusion tree of silhouettes. The resulting Multilevel Silhouette Tree (MuSiTree) has a specific structure called a multilevel compound tree. This work builds upon previous work on a compound tree structure called MOTree. The work presented in this paper is a major improvement over previous work by (1) defining the multilevel compound tree as a more generic structure, (2) proposing original space-filling visualization techniques to display it, (3) defining a relevant interaction model based on both focus changes and graph filtering techniques and (4) reporting on case studies in various fields: co-citation graphs, related-document graphs and social graphs.
Virtual Network Embedding with Collocation: Benefits and Limitations of Pre-Clustering
Cited by 2 (1 self)
Given that mechanisms for resource isolation are in place, the collocation of virtual network (VNet) nodes is attractive as it reduces the inter-machine communication and hence improves the VNet embedding. However, existing VNet embedding algorithms either do not support the collocation of virtual nodes of the same VNet, or only support it implicitly by referring to the possibility of pre-clustering the VNet topology: this pre-clustered network forms the new VNet request and is embedded accordingly. This paper presents a pre-clustering algorithm OPTCUT that is optimal in the sense that it minimizes the amount of link resources needed for the embedding. It is based on a smart linear program formulation that ensures fast solutions. OPTCUT can be used together with any existing VNet embedding algorithm, and we show that it can greatly improve the state-of-the-art embedding algorithm SecondNet [16]. The paper also describes a simple algorithm LOCO that directly supports collocation. This algorithm is part of a novel and generic VNet embedding framework METATREE which may be of independent interest. We compare the performance of the pre-clustering approaches with the direct VNet embeddings by LOCO, and find that pre-clustering also has its limitations. In particular, the information gap between the pre-clustering and the actual algorithm, as well as an inaccurate estimation of the distribution of remaining substrate resources, may lead to a low network utilization.
Load Balancing Strategies for Parallel . . .
 SURF 2005
, 2005
Highly resolved numerical solutions of partial differential equations are important in many areas of science and technology. Only adaptive mesh refinement methods reduce the necessary work sufficiently, allowing the calculation of realistic problems. Block-structured
FOR
, 2006
Prof. Dr. Canan Özgen Director I certify that this thesis satisfies all the requirements as a thesis for the degree of