Fact Extraction, Querying and Visualization of Large C++ Code Bases Design and Implementation
, 2006
Cited by 5 (0 self)
Reverse engineering aims to increase the developer’s understanding of software products. The source code of a software product typically contains all the knowledge of the software product. The first task in reverse engineering is the processing of the source code so as to obtain the desired data, and is performed by programs known as fact extractors. The requirements for a fact extractor vary widely depending on the task at hand. Facts can range from low-level information on expressions to high-level information such as class diagrams and metrics. Our original goal was to find a suitable fact extractor for the C++ programming language that is able to handle industry-size source code bases of millions of lines of code, as a basis for a source code visualizer. We conducted a survey and found no fact extractor to be suitable for our project. Instead of matching our visualization capabilities to a certain fact extractor, we decided to create a new fact extractor, which we call EFES, based on Elsa, one of the survey [...] semantic graph (ASG) and more, and stores this in an efficient custom database format. We have also created a source code visualizer. The visualizer allows for the visualizing of the source code
An interactive reverse engineering environment for large-scale C++ code
- In Proc. ACM SoftVis
, 2008
Cited by 3 (1 self)
Few toolsets for reverse-engineering and understanding of C++ code provide parsing and fact extraction, querying, analysis and code metrics, navigation, and visualization of source-code-level facts in a way that is as easy to use as integrated development environments (IDEs) are for forward engineering. We present an interactive reverse-engineering environment (IRE) for C and C++ which allows users to set up the fact extraction process, apply user-written queries and metrics, and visualize combined query results, metrics, code text, and code structure. Our IRE tightly couples a fast, tolerant C++ fact extractor, an open query system, and several scalable dense-pixel visualizations in a novel way, offering an easy way to analyze and examine large code bases. We illustrate our IRE with several examples, focusing on the added value of the integrated, visual reverse-engineering approach.
A Tool for Optimizing the Build Performance of Large Software Code Bases
Cited by 1 (1 self)
We present Build Analyzer, a tool that helps developers optimize the build performance of huge systems written in C. Due to complex C header dependencies, even small code changes can cause extremely long rebuilds, which are problematic when code is shared and modified by teams of hundreds of individuals. Build Analyzer supports several use cases. For developers, it provides an estimate of the build impact and distribution caused by a given change. For architects, it shows why a build is costly, how its cost is spread over the entire code base, and which headers cause build bottlenecks, and it suggests ways to refactor these to reduce the cost. We demonstrate Build Analyzer with a use case on a real industry code base.
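The developer use case above (estimating the build impact of a change) can be illustrated with a minimal sketch. It assumes a simple model in which a file must be rebuilt whenever it directly or transitively includes the changed header. The include map, the file names, and the function `rebuild_impact` are hypothetical, and the abstract does not describe Build Analyzer's actual algorithm, so this is an illustration of the idea only.

```python
from collections import defaultdict, deque


def rebuild_impact(includes, changed):
    """Return the set of files that directly or transitively include `changed`."""
    # Invert the include graph: header -> files that include it directly.
    included_by = defaultdict(set)
    for source, headers in includes.items():
        for header in headers:
            included_by[header].add(source)

    # Breadth-first walk over the inverted graph, starting at the changed file.
    impacted = set()
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for dependent in included_by[current]:
            if dependent not in impacted:
                impacted.add(dependent)
                queue.append(dependent)
    return impacted


# Hypothetical include graph: file -> headers it includes directly.
INCLUDES = {
    "util.h": [],
    "ui.h": [],
    "net.h": ["util.h"],
    "net.c": ["net.h"],
    "ui.c": ["util.h"],
    "main.c": ["net.h", "ui.h"],
}

if __name__ == "__main__":
    for header in ("util.h", "ui.h"):
        print(header, "->", sorted(rebuild_impact(INCLUDES, header)))
```

In this toy graph a change to `util.h` reaches four files while a change to `ui.h` reaches one, which is the kind of asymmetry that makes some headers build bottlenecks.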
A Framework for Reverse Engineering Large C++ Code Bases
- www.elsevier.com/locate/entcs
When assessing the quality and maintainability of large C++ code bases, tools are needed for extracting several facts from the source code, such as architecture, structure, code smells, and quality metrics. Moreover, these facts should be presented in such a way that one can correlate them and find outliers and anomalies. We present SolidFX, an integrated reverse-engineering environment (IRE) for C and C++. SolidFX was specifically designed to support code parsing, fact extraction, metric computation, and interactive visual analysis of the results in much the same way IDEs and design tools do for the forward engineering pipeline. In the design of SolidFX, we adapted and extended several existing code analysis and data visualization techniques to make them scale to code bases of millions of lines. In this paper, we detail several design decisions taken to construct SolidFX. We also illustrate the application of our tool and our lessons learnt in using it in several types of analyses of real-world industrial code bases, including maintainability and modularity assessments, detection of coding patterns, and complexity analyses.
Keywords: reverse engineering, parsing, C++, software visualization
Architecting an Open System for Querying Large C and C++ Code Bases
Static code analysis offers a number of tools for the assessment of complexity, maintainability, modularity and safety of industry-size source code bases. Typically, such scenarios include three main phases. First, the code is parsed and ’raw’ data is extracted and saved, such as syntax trees, possibly annotated with semantic (type) information. In the second phase, the raw data is queried to check the presence or absence of specific code patterns which support or invalidate specific claims on the code. In the third and last phase, the query results are presented (visualized) such that correlations between code structure and query results are emphasized in an easily understandable way. Whereas parsing source code is largely standardized, using several existing parsers, querying the outputs of such parsers is still a complex task. The main problem resides in the difficulty of easily translating high-level, cross-cutting concerns in the problem domain into queries in the raw data domain. We present here an open framework for constructing and executing queries on industry-size C++ code bases. Our query system adds several so-called query primitives atop a flexible C++ parser, offers options to combine these primitives into arbitrarily complex expressions, has a highly efficient way to evaluate such expressions on syntax trees of millions of nodes, and presents the query results in a visual, compact, intuitive way. We demonstrate our query framework, integrated in the SOLIDFX C++ reverse-engineering environment, with several real-world analyses on industrial code bases.
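The idea of combining query primitives into larger expressions and evaluating them over a syntax tree can be sketched as follows. This is a minimal illustration, assuming that a primitive is simply a predicate on a tree node. The `Node` type and the names `kind_is`, `has_child`, `all_of`, `negate`, and `run_query` are hypothetical and are not the SOLIDFX API. The sketch also makes no attempt at the efficient evaluation the abstract mentions.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterator, List


@dataclass
class Node:
    """A toy syntax-tree node (hypothetical; real extractors store far more)."""
    kind: str
    name: str = ""
    children: List["Node"] = field(default_factory=list)


Query = Callable[[Node], bool]


# Query primitives: each returns a predicate over a single node.
def kind_is(kind: str) -> Query:
    return lambda node: node.kind == kind


def has_child(query: Query) -> Query:
    return lambda node: any(query(child) for child in node.children)


# Combinators: build larger expressions out of existing queries.
def all_of(*queries: Query) -> Query:
    return lambda node: all(q(node) for q in queries)


def negate(query: Query) -> Query:
    return lambda node: not query(node)


def walk(root: Node) -> Iterator[Node]:
    """Yield nodes in pre-order using an explicit stack (no recursion limit)."""
    stack = [root]
    while stack:
        node = stack.pop()
        yield node
        stack.extend(reversed(node.children))


def run_query(root: Node, query: Query) -> List[Node]:
    return [node for node in walk(root) if query(node)]


# Hypothetical tree for a translation unit with two classes.
TREE = Node("unit", "", [
    Node("class", "Widget", [Node("method", "draw"), Node("field", "m_size")]),
    Node("class", "Point", [Node("field", "x"), Node("field", "y")]),
])

# "Classes that declare no methods", expressed by composing primitives.
CLASSES_WITHOUT_METHODS = all_of(kind_is("class"),
                                 negate(has_child(kind_is("method"))))

if __name__ == "__main__":
    print([node.name for node in run_query(TREE, CLASSES_WITHOUT_METHODS)])
```

Representing queries as plain predicates keeps the combinators trivial to write. A system operating on millions of nodes would likely need indexing or pruning on top of this.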
SOLIDFX: An Integrated Reverse Engineering Environment for C++
Many C++ extractors exist that produce syntax trees, call graphs, and metrics from C++ code, yet few offer integrated querying, navigation, and visualization of source-code-level facts to the end user. We present an interactive reverse-engineering environment which supports reverse-engineering tasks on C/C++ code, e.g. setting up the extraction process, applying user-written queries to the extracted facts, and visualizing query results, much like classical forward-engineering IDEs do. We illustrate our environment with several examples of reverse-engineering analyses.
Querying Large C and C++ Code Bases: The Open Approach
Static code analysis offers a number of tools for the assessment of complexity, maintainability, modularity and safety of industry-size source code bases. Most analysis scenarios include two main phases. First, the code is parsed and ’raw’ information is extracted and saved, such as syntax trees, possibly annotated with semantic (type) information. In the second phase, the raw information is queried to check the presence or absence of specific code patterns which support or invalidate specific claims on the code. Whereas parsing source code is largely standardized, and several solutions (parsers) exist already, querying the outputs of such parsers is still a complex task. The main problem resides in the difficulty of easily translating high-level concerns in the problem domain into low-level queries in the raw data domain. We present here an open system for constructing and executing queries on industry-size C++ code bases. Our query system adds several so-called query primitives atop a flexible C++ parser, offers several options to combine these predicates into arbitrarily complex expressions, and has a very efficient way to evaluate such expressions on syntax trees of millions of nodes. We demonstrate the integration of our query system, C++ parser, and interactive visualizations into the SOLIDFX integrated environment for industrial code analysis.
Visual and Analytical Extensions for the Table Lens
- Author manuscript, published in "Visualization and Data Analysis 2008 (2008)", DOI: 10.1117/12.766440
, 2012
Many visualization approaches teach us that ease of use is the key to effective visual data analysis. The Table Lens is an excellent example of a simple, yet expressive visual method that can help in analyzing even larger volumes of data. In this work, we present two extensions of the original Table Lens approach. In particular, we extend the Table Lens by Two-Tone Pseudo Coloring (TTPC) and a hybrid clustering. First, by integrating TTPC into the Table Lens, we obtain visual representations that can communicate larger volumes of data while still maintaining precision. Second, we propose to integrate a data analysis step that implements a hybrid clustering based on self-organizing maps and hierarchical clustering. The analysis step helps to extract and communicate complementary structural information about the data and also serves to drive interactive information drill-down.
Keywords: Information Visualization, Two-Tone Pseudo Coloring, Table Lens

