Directed proof generation for machine code
2010
Cited by 15 (6 self)
We present the algorithms used in MCVETO (Machine-Code VErification TOol), a tool to check whether a stripped machine-code program satisfies a safety property. The verification problem that MCVETO addresses is challenging because it cannot assume that it has access to (i) certain structures commonly relied on by source-code verification tools, such as control-flow graphs and call-graphs, and (ii) metadata, such as information about variables, types, and aliasing. It cannot even rely on out-of-scope local variables and return addresses being protected from the program’s actions. What distinguishes MCVETO from other work on software model checking is that it shows how verification of machine code can be performed while avoiding conventional techniques that would be unsound if applied at the machine-code level.
TSL: A system for generating abstract interpreters and its application to machine-code analysis
TOPLAS
Cited by 5 (3 self)
This paper describes the design and implementation of a system, called TSL (for “Transformer Specification Language”), that provides a systematic solution to the problem of creating retargetable tools for analyzing machine code. TSL is a tool generator (i.e., a meta-tool) that automatically creates different abstract interpreters for machine-code instruction sets. The most challenging technical issue that we faced in designing TSL was how to automate the generation of the set of abstract transformers for a given abstract interpretation of a given instruction set. From a description of the concrete operational semantics of an instruction set, together with the datatypes and operations that define an abstract domain, TSL automatically creates the set of abstract transformers for the instructions of the instruction set. TSL advances the state of the art in program analysis because it provides two dimensions of parameterizability: (i) a given analysis component can be retargeted to different instruction sets; (ii) multiple analysis components can be created automatically from a single specification of the concrete operational semantics of the language to be analyzed. TSL is an abstract-transformer-generator generator. The paper describes the principles behind TSL, and discusses how one uses TSL to develop different abstract interpreters.
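The idea of obtaining several interpreters from one semantic description can be illustrated with a small hand-written sketch. It is not TSL and does not use TSL's specification language: the three-field instruction format, the `Concrete` and `Parity` classes, and the `step` and `interpret` functions are all hypothetical names chosen for illustration. The instruction semantics is written once against a domain interface (`const`, `add`), and supplying a different domain yields a different interpreter.

```python
MASK = 0xFFFFFFFF  # assumption: a toy 32-bit machine


class Concrete:
    """Concrete domain: values are 32-bit unsigned integers."""

    @staticmethod
    def const(n):
        return n & MASK

    @staticmethod
    def add(a, b):
        return (a + b) & MASK


class Parity:
    """Abstract domain: values are 'even', 'odd', or 'top' (unknown)."""

    @staticmethod
    def const(n):
        return "even" if n % 2 == 0 else "odd"

    @staticmethod
    def add(a, b):
        if "top" in (a, b):
            return "top"
        # even+even and odd+odd are even; mixed parities are odd.
        return "even" if a == b else "odd"


def step(dom, instr, state):
    """Semantics of one instruction, written once against the domain interface."""
    op, dst, src = instr
    new_state = dict(state)
    if op == "movi":        # dst := immediate constant
        new_state[dst] = dom.const(src)
    elif op == "add":       # dst := dst + src (both registers)
        new_state[dst] = dom.add(state[dst], state[src])
    else:
        raise ValueError(f"unknown opcode: {op}")
    return new_state


def interpret(dom, program, state):
    """Run a straight-line program under the given domain."""
    for instr in program:
        state = step(dom, instr, state)
    return state


PROGRAM = [("movi", "eax", 3), ("movi", "ebx", 4), ("add", "eax", "ebx")]

concrete_result = interpret(Concrete, PROGRAM, {})
parity_result = interpret(Parity, PROGRAM, {})
```

Here `step` plays the role of the single semantic specification, and each domain class plays the role of an analysis component; retargeting to another instruction set would mean writing a different `step` while reusing the domains.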
There’s Plenty of Room at the Bottom: Analyzing and Verifying Machine Code
2010
Cited by 5 (2 self)
This paper discusses the obstacles that stand in the way of doing a good job of machine-code analysis. Compared with analysis of source code, the challenge is to drop all assumptions about having certain kinds of information available (variables, control-flow graph, call-graph, etc.) and also to address new kinds of behaviors (arithmetic on addresses, jumps to “hidden” instructions starting at positions that are out of registration with the instruction boundaries of a given reading of an instruction stream, self-modifying code, etc.). The paper describes some of the challenges that arise when analyzing machine code, and what can be done about them. It also provides a rationale for some of the design decisions made in the machine-code-analysis tools that we have built over the past few years.
McDash: Refinement-based property verification for machine code
2009
Cited by 5 (3 self)
This paper presents MCDASH, a refinement-based model checker for machine code. While model checkers such as SLAM, BLAST, and DASH have each made significant contributions in the field of verification/flaw-detection, their use has been restricted to programs for which source code is available. This paper discusses several challenges that arise when working with machine code, and explains how they are addressed in MCDASH. Unlike previous model checkers, MCDASH does not require the usual preprocessing steps of (a) building control-flow graphs, and (b) performing points-to analysis (or alias analysis); nor does MCDASH require type information to be supplied. The paper also describes how we extended MCDASH to check properties of self-modifying code. MCDASH is built using language-independent meta-tools that generate the implementations of the required analysis components from descriptions of an instruction set’s syntax and semantics. It has been instantiated for Intel x86 and PowerPC.
Static Detection of Unsafe Component Loadings
Cited by 2 (0 self)
Dynamic loading of software components is a commonly used mechanism to achieve better flexibility and modularity in software. For an application’s runtime safety, it is important for the application to load only its intended components. However, programming mistakes may lead to failures to load a component, or even worse, to load a malicious component. Recent work has shown that these errors are both prevalent and severe, sometimes leading to remote code execution attacks. The work is based on dynamic analysis by monitoring and analyzing runtime component loadings. Although simple and effective in detecting real errors, it suffers from limited code coverage and may miss important vulnerabilities. Thus, it is desirable to develop effective techniques to detect all possible unsafe component loadings. This paper presents the first static binary analysis aiming at detecting all possible loading-related errors. The key challenge is how to scalably and precisely compute what components may be loaded at relevant program locations. Our main insight is that this information is often determined locally from the component loading call sites. This motivates us to design a demand-driven analysis, working backward starting from the relevant call sites. In particular, for a given call site c, we first compute its context-sensitive executable slices, one for each execution context. Then we emulate the slices to obtain the set of components possibly loaded at c. This novel combination of slicing and emulation achieves good scalability and precision by avoiding expensive symbolic analysis. We implemented our technique and evaluated its effectiveness against the existing dynamic technique on nine popular Windows applications. Results show that our tool has better coverage and is precise; it is able to detect many more unsafe loadings. It is also scalable and able to analyze all nine applications within minutes.
Synthesis of Machine Code from Semantics
Cited by 1 (0 self)
In this paper, we present a technique to synthesize machine-code instructions from a semantic specification, given as a Quantifier-Free Bit-Vector (QFBV) logic formula. Our technique uses an instantiation of the Counter-Example Guided Inductive Synthesis (CEGIS) framework, in combination with search-space pruning heuristics to synthesize instruction-sequences. To counter the exponential cost inherent in enumerative synthesis, our technique uses a divide-and-conquer strategy to break the input QFBV formula into independent sub-formulas, and synthesize instructions for the sub-formulas. Synthesizers created by our technique could be used to create semantics-based binary rewriting tools such as optimizers, partial evaluators, program obfuscators/de-obfuscators, etc. Our experiments for Intel’s IA-32 instruction set show that, in comparison to our base-line algorithm, our search-space pruning heuristics reduce the synthesis time by a factor of 473, and our divide-and-conquer strategy reduces the synthesis time by a further 3 to 5 orders of magnitude.
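The general shape of a CEGIS loop can be sketched on a toy problem. This is a minimal illustration under stated assumptions, not the paper's algorithm: the five-instruction 8-bit machine in `INSTRUCTIONS` is hypothetical, the specification is a Python function rather than a QFBV formula, the verifier is an exhaustive check standing in for a solver query, and none of the paper's pruning or divide-and-conquer strategies appear.

```python
from itertools import product

MASK = 0xFF  # assumption: a toy 8-bit machine (the paper targets IA-32)

# Hypothetical toy instruction set; each instruction maps one 8-bit value to another.
INSTRUCTIONS = {
    "inc": lambda x: (x + 1) & MASK,
    "dec": lambda x: (x - 1) & MASK,
    "shl": lambda x: (x << 1) & MASK,
    "not": lambda x: ~x & MASK,
    "neg": lambda x: -x & MASK,
}


def run(seq, x):
    """Execute a sequence of instruction names on input x."""
    for name in seq:
        x = INSTRUCTIONS[name](x)
    return x


def synthesize(spec, examples, max_len):
    """Enumerate sequences, shortest first; return the first one that
    agrees with the specification on every example seen so far."""
    for length in range(max_len + 1):
        for seq in product(INSTRUCTIONS, repeat=length):
            if all(run(seq, x) == spec(x) for x in examples):
                return seq
    return None


def verify(spec, seq):
    """Exhaustive check over all 8-bit inputs (stand-in for a solver).
    Returns a counterexample input, or None if seq meets the spec."""
    for x in range(MASK + 1):
        if run(seq, x) != spec(x):
            return x
    return None


def cegis(spec, max_len=3):
    examples = [0]
    while True:
        candidate = synthesize(spec, examples, max_len)
        if candidate is None:
            return None              # nothing within max_len fits the examples
        counterexample = verify(spec, candidate)
        if counterexample is None:
            return candidate         # verified on every input
        examples.append(counterexample)  # refine the example set and retry


def spec(x):
    return (2 * x + 2) & MASK


result = cegis(spec)
```

Each counterexample is an input on which the rejected candidate disagreed with the specification, so the example set grows strictly and the loop terminates on this finite input space.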
CS704: WLP for a Language with Pointers
2010
This lecture concerns weakest liberal precondition for a language with pointers. It discusses two approaches to the issue: (i) one based on an enhanced rule of substitution (of programming-language elements into formulas), and (ii) one based on an encoding of the programming language semantics into logic.
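As a small worked illustration of why plain substitution needs enhancing, the following assumes the standard conditional-substitution treatment of a store through a pointer. The notation is illustrative and is not taken from the lecture notes; p and q are assumed to be pointer variables, and e is assumed to contain no dereferences.

```latex
\begin{align*}
\mathrm{wlp}(x := e,\; Q) &= Q[e/x] \\
\mathrm{wlp}(*p := e,\; Q) &= Q\bigl[(\mathbf{if}\ q = p\ \mathbf{then}\ e\ \mathbf{else}\ {*q}) \,/\, {*q}\bigr]
  \quad \text{for each dereference } {*q} \text{ in } Q \\
\mathrm{wlp}(*p := 5,\; {*q} = 3) &= \bigl((\mathbf{if}\ q = p\ \mathbf{then}\ 5\ \mathbf{else}\ {*q}) = 3\bigr) \\
  &\equiv (q \neq p) \wedge ({*q} = 3)
\end{align*}
```

Naive substitution would leave the postcondition `*q = 3` unchanged, which is wrong when p and q alias; the conditional term makes the aliasing case explicit.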
NOTICE AND SIGNATURE PAGE
2015
Approved for public release; distribution unlimited. See additional restrictions described on inside pages.