Instruction level parallelism
Encyclopedia : I : IN : INS : Instruction level parallelism
Instruction-level parallelism (ILP) is a measure of how many of the operations in a computer program can be performed simultaneously. Consider the following program:
1. e = a + b 2. f = c + d 3. g = e * fOperation 3 depends on the results of operations 1 and 2, so it cannot be calculated until both of them are completed. However, operations 1 and 2 do not depend on any other operation, so they can be calculated simultaneously. (See below for more on which operations are dependent on each other, and on the various types of instruction dependencies). If we assume that each operation can be completed in one unit of time then these three instructions can be completed in a total of two units of time, giving an ILP of 3/2.
A goal of compiler and processor designers is to identify and take advantage of as much ILP as possible.
How much ILP exists in programs is very application specific. In certain fields, such as graphics and scientific computing the amount can be very large. However, workloads such as cryptography exhibit much less parallelism.
Micro-architectural techniques that are used to exploit ILP include:
- Instruction pipelining which depend on CPU caches
- Register renaming which refers to a technique used to avoid unnecessary serialization of program operations imposed by the reuse of registers by those operations
- Speculative execution which reduce pipeline stalls due to control dependencies
- Branch prediction which is used to keep the pipeline full
- Superscalar execution in which multiple execution units are used to execute multiple instructions in parallel
- Out-of-order execution which reduces pipeline stalls due to operand dependencies
- VLIW and the closely related Explicitly Parallel Instruction Computing concepts
Dependencies
There are various dependencies that may hinder instruction level parallelism - parallel execution of multiple instructions may lead to a different instruction execution order, and a different instruction ordering will change the final effect of a set of instructions in the case of a dependency.
- True dependency
1. A = 3 2. B = A 3. C = BInstruction 3 is truly dependent on instruction 2, as the final value of C depends on the instruction updating B. Instruction 2 is truly dependent on instruction 1, as the final value of B depends on the instruction updating A. Since instruction 3 is truly dependent upon instruction 2 and instruction 2 is truly dependent on instruction 1, instruction 3 is also truly dependent on instruction 1. Instruction level parallelism is therefore not an option in this example.
- Anti-dependency
1. B = 3 2. A = B + 1 3. B = 7An anti-dependency is an example of a name dependency. That is, renaming of variables could remove the dependency, as in the below example:
1. B = 3 N. B2 = B 2. A = B2 + 1 3. B = 7A new variable, B2, has been declared as a copy of B in a new instruction, instruction N. The anti-dependency between 2 and 3 has been removed, meaning that these instructions may now be executed in parallel. However, the modification has introduced a new dependency: instruction A is now truly dependent on instruction N, which is truly dependent upon instruction 1. As true dependencies, these new dependencies are impossible to safely remove.
- Output dependency
1 A = 2 * X 2 B = A / 3 3 A = 9 * YAs with anti-dependencies, output dependencies are name dependencies. That is, they may be removed through renaming of variables, as in the below modification of the above example:
1 A2 = 2 * X 2 B = A2 /3 3 A = 9 * Y
External links
- [Reflections of the Memory Wall]
- [Wired magazine article that refers to the above paper]
- [Approaches to addressing the Memory Wall]
References
From Wikipedia, the Free Encyclopedia. Original article here. Support Wikipedia by contributing or donating.
All text is available under the terms of the GNU Free Documentation License See Wikipedia Copyrights for details.
