Macroscopic Data Structure Analysis and Optimization (2005)

by Chris Lattner

Results 1 - 10 of 28

CoreDet: A Compiler and Runtime System for Deterministic Multithreaded Execution

by Tom Bergan, Owen Anderson, Joseph Devietti, Luis Ceze, Dan Grossman
"... The behavior of a multithreaded program does not depend only on its inputs. Scheduling, memory reordering, timing, and low-level hardware effects all introduce nondeterminism in the execution of multithreaded programs. This severely complicates many tasks, including debugging, testing, and automatic ..."
Abstract - Cited by 106 (10 self)
The behavior of a multithreaded program does not depend only on its inputs. Scheduling, memory reordering, timing, and low-level hardware effects all introduce nondeterminism in the execution of multithreaded programs. This severely complicates many tasks, including debugging, testing, and automatic replication. In this work, we avoid these complications by eliminating their root cause: we develop a compiler and runtime system that runs arbitrary multithreaded C/C++ POSIX Threads programs deterministically. A trivial non-performant approach to providing determinism is simply deterministically serializing execution. Instead, we present a compiler and runtime infrastructure that ensures determinism but resorts to serialization rarely, for handling interthread communication and synchronization. We develop two basic approaches, both of which are largely dynamic with performance improved by some static compiler optimizations. First, an ownership-based approach detects interthread communication via an evolving table that tracks ownership of memory regions by threads. Second, a buffering approach uses versioned memory and employs a deterministic commit protocol to make changes visible to other threads. While buffering has larger single-threaded overhead than ownership, it tends to scale better (serializing less often). A hybrid system sometimes performs and scales better than either approach individually. Our implementation is based on the LLVM compiler infrastructure. It needs neither programmer annotations nor special hardware. Our empirical evaluation uses the PARSEC and SPLASH2 benchmarks and shows that our approach scales comparably to nondeterministic execution.
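The ownership-based approach lends itself to a compact illustration. Below is a minimal sketch of an ownership table, assuming 64-byte regions and a single global lock; all names are hypothetical, and this is far simpler than CoreDet's actual compiler-assisted implementation:

    #include <cstdint>
    #include <mutex>
    #include <unordered_map>

    // Sketch: each memory region is owned by one thread. Accesses to
    // regions owned by the current thread run in parallel; any other
    // access forces the runtime into its serialized, deterministic
    // communication phase.
    constexpr uintptr_t kRegionShift = 6;  // 64-byte regions (assumption)

    class OwnershipTable {
        std::mutex mu_;
        std::unordered_map<uintptr_t, int> owner_;  // region -> thread id
    public:
        // True if the access may proceed without serialization.
        bool CheckAccess(const void* addr, int tid) {
            uintptr_t region = reinterpret_cast<uintptr_t>(addr) >> kRegionShift;
            std::lock_guard<std::mutex> lock(mu_);
            auto it = owner_.find(region);
            if (it == owner_.end()) { owner_[region] = tid; return true; }  // claim it
            return it->second == tid;                                       // fast path
        }
        // Ownership transfers happen only inside the serialized phase,
        // so the table itself evolves deterministically.
        void Transfer(const void* addr, int new_tid) {
            uintptr_t region = reinterpret_cast<uintptr_t>(addr) >> kRegionShift;
            std::lock_guard<std::mutex> lock(mu_);
            owner_[region] = new_tid;
        }
    };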

Citation Context

... extra constraint: An object is thread-local only if all accesses of that object are through pointers that must-alias thread-local objects. We use a points-to analysis based on Data Structure Analysis [18]. DSA is unification-based, so all pointers which may-alias the same object will point at the same node in the points-to graph. This automatically satisfies our extra constraint: we simply label each n...

Automatic pool allocation: improving performance by controlling data structure layout in the heap

by Chris Lattner, Vikram Adve - In Proceedings of PLDI, 2005
"... This paper describes Automatic Pool Allocation, a transformation framework that segregates distinct instances of heap-based data structures into seperate memory pools and allows heuristics to be used to partially control the internal layout of those data structures. The primary goal of this work is ..."
Abstract - Cited by 82 (9 self)
This paper describes Automatic Pool Allocation, a transformation framework that segregates distinct instances of heap-based data structures into separate memory pools and allows heuristics to be used to partially control the internal layout of those data structures. The primary goal of this work is performance improvement, not automatic memory management, and the paper makes several new contributions. The key contribution is a new compiler algorithm for partitioning heap objects in imperative programs based on a context-sensitive pointer analysis, including a novel strategy for correct handling of indirect (and potentially unsafe) function calls. The transformation does not require type-safe programs and works for the full generality of C and C++. Second, the paper describes several optimizations that exploit data structure partitioning to further improve program performance. Third, the paper evaluates how memory hierarchy behavior and overall program performance are impacted by the new transformations. Using a number of benchmarks and a few applications, we find that compilation times are extremely low, and overall running times for heap-intensive programs speed up by 10-25% in many cases, about 2x in two cases, and more than 10x in two small benchmarks. Overall, we believe this work provides a new framework for optimizing pointer-intensive programs by segregating and controlling the layout of heap-based data structures.
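The runtime half of the transformation is easy to picture. The following is a minimal bump-pointer pool, a sketch of the kind of pool runtime the compiler targets (the real pool API and its type-homogeneous pools are more elaborate; names here are simplified):

    #include <cstddef>
    #include <cstdlib>
    #include <vector>

    // Sketch: the compiler rewrites malloc/free for one data structure
    // instance into allocations from a pool dedicated to that instance,
    // so its nodes end up contiguous and can be released all at once.
    struct Pool {
        std::vector<char*> blocks;
        size_t used = 0, cap = 0;
        void* alloc(size_t n) {            // bump-pointer allocation
            if (used + n > cap) {          // grab a fresh block
                cap = n > 4096 ? n : 4096;
                blocks.push_back(static_cast<char*>(std::malloc(cap)));
                used = 0;
            }
            void* p = blocks.back() + used;
            used += n;                     // fine for fixed-size nodes;
            return p;                      // real pools handle alignment
        }
        ~Pool() { for (char* b : blocks) std::free(b); }  // pool destruction
    };

    struct Node { int value; Node* next; };

    // After the transformation, each list instance draws its nodes from
    // its own pool rather than the shared heap, improving locality.
    Node* push(Pool& pool, Node* head, int v) {
        Node* n = static_cast<Node*>(pool.alloc(sizeof(Node)));
        n->value = v;
        n->next = head;
        return n;
    }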

Citation Context

... is driven by a points-to graph computed by some pointer analysis that uses an explicit representation of memory [27]. In our implementation, we use an algorithm we call Data Structure Analysis (DSA) [32] to compute these points-to graphs. DSA is context-sensitive (both in its analysis and in that it distinguishes heap and stack objects by entire acyclic call paths), unification-based, and field-sensi...

Making context-sensitive points-to analysis with heap cloning practical for the real world

by Chris Lattner, Andrew Lenharth, Vikram Adve - In Proceedings of the 2007 ACM SIGPLAN Conference on Programming Language Design and Implementation, 2007
"... Context-sensitive pointer analysis algorithms with full “heap cloning ” are powerful but are widely considered to be too expensive to include in production compilers. This paper shows, for the first time, that a context-sensitive, field-sensitive algorithm with full heap cloning (by acyclic call pat ..."
Abstract - Cited by 67 (5 self)
Context-sensitive pointer analysis algorithms with full “heap cloning” are powerful but are widely considered to be too expensive to include in production compilers. This paper shows, for the first time, that a context-sensitive, field-sensitive algorithm with full heap cloning (by acyclic call paths) can indeed be both scalable and extremely fast in practice. Overall, the algorithm is able to analyze programs in the range of 100K-200K lines of C code in 1-3 seconds, takes less than 5% of the time it takes for GCC to compile the code (which includes no whole-program analysis), and scales well across five orders of magnitude of code size. It is also able to analyze the Linux kernel (about 355K lines of code) in 3.1 seconds. The paper describes the major algorithmic and engineering design choices that are required to achieve these results, including (a) using flow-insensitive and unification-based analysis, which ...
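The "unification-based" property mentioned throughout these citation contexts reduces to union-find. A minimal sketch, ignoring DSA's field sensitivity, heap cloning, and context sensitivity entirely:

    #include <unordered_map>

    // Sketch of a Steensgaard-style unification step: whenever two
    // pointers may refer to the same object, their points-to nodes are
    // merged, so every pointer ends up pointing at exactly one
    // representative node. Nodes are identified by integer ids here.
    struct PointsToGraph {
        std::unordered_map<int, int> parent;  // union-find forest
        int find(int n) {
            if (!parent.count(n)) parent[n] = n;
            while (parent[n] != n) {          // path halving
                parent[n] = parent[parent[n]];
                n = parent[n];
            }
            return n;
        }
        // For a statement like "p = q", unify the targets of p and q.
        void unify(int a, int b) { parent[find(a)] = find(b); }
        bool mayAlias(int a, int b) { return find(a) == find(b); }
    };

Unification is what makes this style of analysis so fast: each assignment costs roughly one near-constant-time union-find operation, at the price of the precision loss relative to inclusion-based (Andersen-style) analysis discussed in the citation context below.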

Citation Context

...ensitive subset-based algorithm), showing that DSA is about as precise as Andersen’s for many cases, is significantly more precise for some programs, and is only worse in rare cases. Further, other work [18] shows that the mod/ref information captured by DSA is significantly better than that computed by non-context-sensitive algorithms. In addition to alias analysis, DSA can also be used to extract limit...

SAFECode: Enforcing Alias Analysis for Weakly Typed Languages

by Dinakar Dhurjati, Sumant Kowshik, Vikram Adve
"... Static analysis of programs in weakly typed languages such as C and C++ is generally not sound because of possible memory errors due to dangling pointer references, uninitialized pointers, and array bounds overflow. We describe a compilation strategy for standard C programs that guarantees that aggr ..."
Abstract - Cited by 52 (8 self)
Static analysis of programs in weakly typed languages such as C and C++ is generally not sound because of possible memory errors due to dangling pointer references, uninitialized pointers, and array bounds overflow. We describe a compilation strategy for standard C programs that guarantees that aggressive interprocedural pointer analysis (or less precise ones), a call graph, and type information for a subset of memory are never invalidated by any possible memory errors. We formalize our approach as a new type system with the necessary run-time checks in operational semantics and prove the correctness of our approach for a subset of C. Our semantics provide the foundation for other sophisticated static analyses to be applied to C programs with a guarantee of soundness. Our work builds on a previously published transformation called Automatic Pool Allocation to ensure that hard-to-detect memory errors (dangling pointer references and certain array bounds errors) cannot invalidate ...

Citation Context

...lves with how these analysis results are actually computed; we only assume that these are given in the format described below. In our implementation, we use an analysis called Data Structure Analysis [20], a context-sensitive, field-sensitive, unification-based algorithm to compute both the pointer-analysis and the call graph. DSA is context-sensitive over entire acyclic call paths, both in its analys...

Transparent Pointer Compression for Linked Data Structures

by Chris Lattner, Vikram S. Adve - In Proc. ACM Workshop on Memory System Performance, 2005
"... 64-bit address spaces are increasingly important for modern applications, but they come at a price: pointers use twice as much memory, reducing the effective cache capacity and memory bandwidth of the system (compared to 32-bit ad-dress spaces). This paper presents a sophisticated, auto-matic transf ..."
Abstract - Cited by 11 (2 self)
64-bit address spaces are increasingly important for modern applications, but they come at a price: pointers use twice as much memory, reducing the effective cache capacity and memory bandwidth of the system (compared to 32-bit address spaces). This paper presents a sophisticated, automatic transformation that shrinks pointers from 64 bits to 32 bits. The approach is “macroscopic,” i.e., it operates on an entire logical data structure in the program at a time. It allows an individual data structure instance or even a subset thereof to grow up to 2^32 bytes in size, and can compress pointers to some data structures but not others. Together, these properties allow efficient usage of a large (64-bit) address space. We also describe (but have not implemented) a dynamic version of the technique that can transparently expand the pointers in an individual data structure if it exceeds the 4GB limit. For a collection of pointer-intensive benchmarks, we show that the transformation reduces peak heap sizes substantially (by 20% to 2x) for several of these benchmarks, and improves overall performance significantly in some cases.
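The core trick can be shown in a few lines. A sketch, assuming the data structure already lives in a single pool (as arranged by Automatic Pool Allocation); all names are illustrative:

    #include <cstdint>
    #include <vector>

    // Sketch: once a linked structure lives in one pool, intra-pool
    // pointers can be stored as 32-bit offsets from the pool base
    // rather than full 64-bit addresses, halving pointer footprint.
    struct CompressedPool {
        std::vector<char> mem;              // may grow up to 2^32 bytes
        char* base() { return mem.data(); }
    };

    struct CompressedNode {
        int32_t value;
        uint32_t next;                      // pool offset, not a raw pointer
    };

    // Decompression: turn an offset back into a pointer at each use.
    inline CompressedNode* deref(CompressedPool& p, uint32_t off) {
        return reinterpret_cast<CompressedNode*>(p.base() + off);
    }

A side benefit of offsets over raw pointers is that they remain valid if the pool is moved or grown, which is what makes the dynamic expand-on-overflow variant described in the abstract conceivable.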

Citation Context

...s is affected by how DS graphs are computed. Therefore, we also very briefly describe relevant aspects of the pointer analysis we use to compute DS graphs, which we call Data Structure Analysis (DSA) [7]. Figure 4 shows a simple linked-list example and the DS graphs computed by DSA for the three functions in the example. We will use this as a running example in this Section and the next. 2.1 Points-t...

MemSafe: Ensuring spatial and temporal memory safety of C at runtime

by Matthew S. Simpson, Rajeev K. Barua - In IEEE Working Conference on Source Code Analysis and Manipulation, 2010
"... Memory access violations are a leading source of unreliability in C programs. As evidence of this problem, a variety of methods exist that retrofit C with software checks to detect memory errors at runtime. However, these methods generally suffer from one or more drawbacks including the inability to ..."
Abstract - Cited by 7 (0 self)
Memory access violations are a leading source of unreliability in C programs. As evidence of this problem, a variety of methods exist that retrofit C with software checks to detect memory errors at runtime. However, these methods generally suffer from one or more drawbacks including the inability to detect all errors, the use of incompatible metadata, the need for manual code modifications, and high runtime overheads. This paper presents a compiler analysis and transformation for ensuring the memory safety of C called MemSafe. MemSafe makes several novel contributions that improve upon previous work and lower the cost of safety. These include (1) a method for modeling temporal errors as spatial errors, (2) a metadata representation that combines features of both object- and pointer-based approaches, and (3) a data-flow representation that simplifies optimizations for removing unneeded checks. MemSafe is capable of detecting real errors with lower overheads than previous efforts. Experimental results show that MemSafe detects all memory errors in six programs with known violations as well as two large and widely-used open source applications. Finally, MemSafe ensures complete safety with an average overhead of 88% on 30 programs commonly used for evaluating the performance of error detection tools.
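Contribution (1), modeling temporal errors as spatial errors, admits a compact sketch. The per-pointer (base, bound) metadata below is a simplification of MemSafe's combined object/pointer metadata, and every name is hypothetical:

    #include <cstddef>
    #include <cstdio>

    // Sketch: every checked pointer carries a valid range. Freeing an
    // object shrinks the range to empty, so a later (temporal) use
    // fails the same bounds check that catches (spatial) overflows.
    struct SafePtr {
        char* ptr;
        char* base;
        char* bound;   // one past the end; base == bound means invalid
    };

    bool check(const SafePtr& p, size_t n) {
        return p.ptr >= p.base && p.ptr + n <= p.bound;  // spatial check
    }

    void safe_free(SafePtr& p) {
        p.base = p.bound = nullptr;  // temporal error becomes spatial
        // (the underlying storage would also be released here)
    }

    int main() {
        char buf[8];
        SafePtr p{buf, buf, buf + sizeof buf};
        std::printf("in bounds: %d\n", check(p, 4));   // prints 1
        safe_free(p);
        std::printf("after free: %d\n", check(p, 4));  // prints 0: caught
    }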

Citation Context

...retrieving metadata from the object database. Dhurjati and Adve [17] improve the runtime cost associated with this lookup operation by partitioning the object database using Automatic Pool Allocation [34], and Akritidis et al. [33] improve runtime by constraining the size and alignment of allocated objects. However, these methods do not detect sub-object overflows or temporal errors. Figure 1 depicts ...

Input-Covering Schedules for Multithreaded Programs

by Tom Bergan, Luis Ceze, Dan Grossman
"... We propose constraining multithreaded execution to small sets of input-covering schedules, which we define as follows: given a program P, we say that a set of schedules Σ covers all inputs of program P if, when given any valid input, P’s execution can be constrained to some schedule in Σ and still p ..."
Abstract - Cited by 6 (2 self)
We propose constraining multithreaded execution to small sets of input-covering schedules, which we define as follows: given a program P, we say that a set of schedules Σ covers all inputs of program P if, when given any valid input, P’s execution can be constrained to some schedule in Σ and still produce a semantically-valid result. Our approach is to first compute a small Σ for a given program P, and then, at runtime, constrain P’s execution to always follow some schedule in Σ, and never deviate. We have designed an algorithm that uses symbolic execution to systematically enumerate a set of input-covering schedules, Σ. To deal with programs that run for an unbounded length of time, we partition execution into bounded epochs, find input-covering schedules for each epoch in isolation, and then piece the schedules together at runtime. We have implemented this algorithm and a constrained execution runtime, and we report early results. Our approach has the following advantage: because all possible runtime schedules are known a priori, we can seek to validate the program by thoroughly testing each schedule in Σ, in isolation, without needing to reason about the huge space of thread interleavings that arises due to conventional nondeterministic execution.
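The enforcement side ("constrain P’s execution to always follow some schedule in Σ, and never deviate") can be pictured as a turn-taking runtime. A deliberately naive sketch, with the schedule given as a fixed sequence of thread ids; the paper's schedules are per-epoch and derived by symbolic execution, so this only illustrates the never-deviate discipline:

    #include <condition_variable>
    #include <cstddef>
    #include <mutex>
    #include <vector>

    // Sketch: instrumented code calls Yield(tid) at every scheduling
    // point; a thread proceeds only when the schedule names it next
    // (or the schedule is exhausted).
    class ScheduleRuntime {
        std::mutex mu_;
        std::condition_variable cv_;
        std::vector<int> schedule_;   // e.g., {0, 1, 0, 1}
        size_t pos_ = 0;
    public:
        explicit ScheduleRuntime(std::vector<int> s) : schedule_(std::move(s)) {}
        void Yield(int tid) {
            std::unique_lock<std::mutex> lock(mu_);
            cv_.wait(lock, [&] {
                return pos_ >= schedule_.size()   // schedule exhausted
                    || schedule_[pos_] == tid;    // our turn
            });
            ++pos_;
            cv_.notify_all();
        }
    };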

Citation Context

...to the incorrect belief that a live variable was written, which results in an overly strong schedule precondition, which results in the exploration of redundant schedules. Our implementation uses DSA [27], which, in whole-program mode, degrades to a field-sensitive Steensgaard (equality-based) analysis. Our experience suggests that an inclusion-based analysis is vital. The problem intensifies because w...

FACADE: A compiler and runtime for (almost) object-bounded big data applications

by Khanh Nguyen, Kai Wang, Yingyi Bu, Lu Fang, Jianfei Hu, Guoqing Xu - In ASPLOS, 2015
"... The past decade has witnessed the increasing demands on data-driven business intelligence that led to the proliferation of data-intensive applications. A managed object-oriented programming language such as Java is often the developer’s choice for implementing such applications, due to its quick dev ..."
Abstract - Cited by 5 (2 self)
The past decade has witnessed the increasing demands on data-driven business intelligence that led to the proliferation of data-intensive applications. A managed object-oriented programming language such as Java is often the developer’s choice for implementing such applications, due to its quick development cycle and rich community resource. While the use of such languages makes programming easier, their automated memory management comes at a cost. When the managed runtime meets Big Data, this cost is significantly magnified and becomes a scalability-prohibiting bottleneck. This paper presents a novel compiler framework, called FACADE, that can generate highly-efficient data manipulation code by automatically transforming the data path of an existing Big Data application. The key treatment is that in the generated code, the number of runtime heap objects created for data types in each thread is (almost) statically bounded, leading to significantly reduced memory management cost and improved scalability. We have implemented FACADE and used it to transform 7 common applications on 3 real-world, already well-optimized Big Data frameworks: GraphChi, Hyracks, and GPS. Our experimental results are very positive: the generated programs have (1) achieved a 3%–48% execution time reduction and an up to 88× GC reduction; (2) consumed up to 50% less memory, and (3) scaled to much larger datasets.
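The object-bounding treatment is easiest to see in miniature. A sketch, transliterated into C++ for consistency with the other examples here (FACADE itself targets Java bytecode; every name below is illustrative):

    #include <cstddef>
    #include <cstdint>
    #include <cstring>
    #include <vector>

    // Sketch: records live in one flat data buffer instead of as
    // individual heap objects, and a single reusable "facade" per
    // thread gives typed access, so the number of runtime objects is
    // bounded no matter how many records the data path processes.
    struct PersonFacade {
        char* rec = nullptr;
        void bind(char* r) { rec = r; }   // retarget: no allocation
        int32_t age() const { int32_t v; std::memcpy(&v, rec, sizeof v); return v; }
    };

    int main() {
        const size_t kRecSize = 4, kCount = 1000;
        std::vector<char> data(kRecSize * kCount);  // one buffer, not 1000 objects
        PersonFacade f;                             // one facade, reused
        long sum = 0;
        for (size_t i = 0; i < kCount; ++i) {
            f.bind(data.data() + i * kRecSize);
            sum += f.age();
        }
        return sum == 0 ? 0 : 1;
    }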

Pointer Analysis, Conditional Soundness, and Proving the Absence of Errors

by Christopher L. Conway, Dennis Dams, Kedar S. Namjoshi, Clark Barrett
"... It is well known that the use of points-to information can substantially improve the accuracy of a static program analysis. Commonly used algorithms for computing points-to information are known to be sound only for memory-safe programs. Thus, it appears problematic to utilize points-to information ..."
Abstract - Cited by 4 (0 self)
It is well known that the use of points-to information can substantially improve the accuracy of a static program analysis. Commonly used algorithms for computing points-to information are known to be sound only for memory-safe programs. Thus, it appears problematic to utilize points-to information to verify the memory safety property without giving up soundness. We show that a sound combination is possible, even if the points-to information is computed separately and only conditionally sound. This result is based on a refined statement of the soundness conditions of points-to analyses and a general mechanism for composing conditionally sound analyses.

Citation Context

...ion of conditional soundness provides a simpler model which captures the behavior of a variety of interesting analyses. Pointer analysis for C programs has been an active area of research for decades [21, 16, 28, 3, 27, 17, 12, 20, 23]. The correctness arguments for points-to algorithms are typically stated informally—each of the analyses has been developed for the purpose of program transformation and understanding, not for use in...

Practical Low-Overhead Enforcement of Memory Safety for C Programs

by Santosh Ganapati Nagarakatte
"... COPYRIGHT 2012 Santosh Ganapati NagarakatteThis dissertation is dedicated to my parents. Without them, this would not have been possible. iii Acknowledgments This dissertation is a direct result of constant support and encouragement from my parents who had more confidence in my abilities than I had, ..."
Abstract - Cited by 4 (2 self)
COPYRIGHT 2012 Santosh Ganapati Nagarakatte. This dissertation is dedicated to my parents. Without them, this would not have been possible. Acknowledgments: This dissertation is a direct result of constant support and encouragement from my parents who had more confidence in my abilities than I had, at times, in my ability. Apart from my parents, there are numerous people who have been instrumental in the growth of my research career and my development as an individual. My adviser, Milo Martin has had a transformative influence on me as a researcher. I am fortunate to have worked with him for the last five years. Milo provided me the initial insights, the motivation to work on the problem, and eventually has nourished my ideas. He was generous with his time and wisdom. He provided me an excellent platform where I could excel. Apart from the research under him, he gave me freedom to collaborate with other researchers independently. I have also learned a great deal about teaching, presentations, and mentoring that will be amazingly useful in my future ...

Citation Context

...f work is to prevent violations of type-safety through dangling pointer errors rather than detecting dangling pointer errors. To prevent temporal errors on the stack, they use data structure analysis [82] that performs a flow-insensitive and context-sensitive analysis to identify all pointers to the stack frame that escape into heap or global locations. To avoid annotations, many static analyses that ...
