Opentopia Directory Encyclopedia Tools

Data-flow analysis

Encyclopedia : D : DA : DAT : Data-flow analysis


Data-flow analysis is a technique of automatically gathering information about the values calculated in a procedural computer program by applying rules to its control flow graph. The information gathered is often useful in optimizing the program, so most compilers for procedural languages include one or more data-flow analysis stages.

A simple way to perform dataflow analysis of programs is to set up dataflow equations for each node of the control flow graph and solve them by repeatedly calculating the output from the input locally at each node until the whole system stabilizes, i.e., it reaches a fixpoint. To guarantee termination it is required that the data flow equations form a fixed-point data-flow analysis, i.e., the local updates by the equations are monotone.

A canonical example of a data-flow analysis is reaching definitions.

An iterative algorithm

The fixpoint of the dataflow equations can be calculated using the round-robin iterative algorithm:

[\texttt\ i\ \leftarrow\ 1\ \texttt\ N]
[\mathit\ i]
[\texttt\ (\mathit)] [\texttt\ i\ \leftarrow\ 1\ \texttt\ N] [\mathit\ i]

The order matters

The efficiency of iteratively solving data-flow equations is influenced by the order at which local nodes are visited. For the above round-robin iterative algorithm it depends in which order nodes are selected from the set of nodes. Furthermore, it depends, whether the data-flow equations are used for forward or backward data-flow analysis over the CFG. In the following, a few iteration orders for solving data-flow equations are discussed (a related concept to iteration order of a CFG is tree traversal of a tree).

Examples

The following are examples of properties of computer programs that can be calculated by data-flow analysis. Note that the properties calculated by data-flow analysis are typically only approximations of the real properties. This is because data-flow analysis operates on the syntactical structure of the CFG without simulating the exact control flow of the program. However, to be still useful in practice, a data-flow analysis algorithm is typically designed to calculate an upper respectively lower approximation of the real program properties.

Forward Analysis

The reaching definition analysis calculates for each program point the set of variable values that may potentially reach this program point.

1: if b==4 then
2:    a = 5;
3: else
4:    a = 3;
5: endif
6:
7: if  a < 4 then
8:    ...
The reaching definition of variable "a" at line 7 is the set .

Backward Analysis

The live variable analysis calculates for each program point the variables that may be potentially read afterwards before their next write update.

1: a = 3;
2: c = 5;
3: a = b + c;
The set of live variables at line 2 is , but the set of live variables at line 1 is only since variable "c" is updated in line 2.

Sensitivities

Data-flow analysis is inherently flow-sensitive. Data-flow analysis is typically path-insensitive, though it is possible to define data-flow equations that yield a path-sensitive analysis.

These terms aren't specific to data-flow analysis. The following defintions should be moved somewhere else and fleshed out with more verbose examples. Wikipedia's inter-document structure for static analysis could use some rework.

Further reading

Aho, Alfred V. Sethi, Ravi. Ullman, Jeffrey D. Compilers: Principles, Techniques and Tools. Addison Wesley. 1986.

Appel, Andrew W. Modern Compiler Implementation in ML. Cambridge University Press. 1999.

Cooper, Keith D. and Torczon, Linda. Engineering a Compiler. Morgan Kaufmann. 2005.

Muchnick, Steven S. Advanced Compiler Design and Implementation. Morgan Kaufmann. 1997.

Hecht, Matthew S. Flow Analysis of Computer Programs. Elsevier North-Holland Inc. 1977.

 


From Wikipedia, the Free Encyclopedia. Original article here. Support Wikipedia by contributing or donating.
All text is available under the terms of the GNU Free Documentation License See Wikipedia Copyrights for details.

Search Titles
0123456789
ABCDEFGHIJ
KLMNOPQRST
UVWXYZ?

E-mail this article to:

Personal Message: