Unit 5 | Compiler Design Notes | AKTU Notes


Unit 5 | Compiler Design Notes | AKTU Notes





Compiler Design: Code Generation & Code Optimization

Code Generation

After the compiler completes semantic analysis and intermediate code generation, the next step is code generation. This phase translates the intermediate code into machine code or assembly code, which can be executed by the computer.

A. Design Issues in Code Generation

When generating machine code, the compiler must consider:

1. Efficiency: The generated code should run fast and use minimal memory.

2. Correctness: The code must produce the correct output.

3. Instruction Selection: Choosing the right machine instructions.

4. Register Allocation: Efficient use of CPU registers to store values.

5. Target Machine Characteristics: Different processors have different instruction sets.

B. The Target Language

- The target language is usually machine code or assembly language.

- Example:

   - High-Level Code: C = A + B;

   - Intermediate Code: T1 = A + B; C = T1;

   - Assembly Code:

LOAD R1, A
LOAD R2, B
ADD R1, R2
STORE C, R1

- The generated code must be optimized to minimize execution time and memory usage.

C. Addresses in the Target Code

In machine code, memory addresses are used for storing data and instructions. The compiler must decide:

1. Absolute Addressing: Uses a fixed memory location.

2. Relative Addressing: Uses offsets from a base address.

3. Register Addressing: Uses CPU registers instead of memory.

Example:

MOV R1, [1000] ; Load value from memory address 1000 into register R1

D. Basic Blocks and Flow Graphs

- A basic block is a sequence of instructions without branches (jumps, loops) except at the start or end.

- A flow graph represents control flow between basic blocks in a program.

Example:

A = B + C;
D = A * 2;
E = D - 5;

This forms a single basic block because there are no jumps or branches.

Flow Graph Example

START → [BASIC BLOCK 1] → [BASIC BLOCK 2] → END

Each block is connected by edges, representing possible execution paths.

E. Optimization of Basic Blocks

Basic block optimization aims to reduce execution time by:

1. Removing unnecessary computations

2. Reusing computed values

3. Simplifying expressions

Example of Optimization:

x = 2 * 4; → x = 8; (Constant Folding)
y = x * 0; → y = 0; (Multiplication by zero)

F. Code Generator

A code generator translates optimized intermediate code into the final machine code.
It performs:

- Instruction selection (choosing the best CPU instructions).

- Register allocation (minimizing memory access).

- Instruction scheduling (reordering instructions for better performance).

Code Optimization

Code optimization improves program performance by reducing:

- Execution time

- Memory usage

- Number of instructions

Optimization is divided into two types:

1. Machine-Independent Optimization (applies to any machine).

2. Machine-Dependent Optimization (specific to a CPU architecture).

A. Machine-Independent Optimizations

These optimizations do not depend on the target machine and focus on:

- Constant Folding: Replacing expressions with constant values.

- Dead Code Elimination: Removing unused code.

- Common Subexpression Elimination: Avoiding duplicate calculations.


Example:

x = a * b;
y = a * b + 5;

Optimized:

t = a * b;
x = t;
y = t + 5;

(Computes a * b only once.)

B. Loop Optimization

Loops are time-consuming because they repeat multiple times. Optimizations include:

1. Loop Invariant Code Motion: Moves calculations outside the loop if they remain constant.

2. Strength Reduction: Replacing expensive operations with cheaper ones.

3. Loop Unrolling: Reducing loop iterations by executing multiple steps in one iteration.

Example:

for (int i = 0; i < n; i++) {
    x = y + 5; // Doesn't depend on i, can move outside
    A[i] = i * x;
}

Optimized:

x = y + 5;
for (int i = 0; i < n; i++) {
    A[i] = i * x;
}

C. DAG Representation of Basic Blocks

A Directed Acyclic Graph (DAG) represents computations in a basic block. It helps in:

- Detecting common subexpressions.

- Eliminating redundant calculations.

- Reordering operations efficiently.

Example:
For x = a + b; y = a + b + c; z = x + c;, the DAG is:

      +
     / \
    a b
     \  |
      + c
     /    \
    y    z

Here, x and y share the a + b computation, avoiding redundant calculations.

D. Value Numbers and Algebraic Laws

- Value Numbering assigns unique numbers to expressions to find duplicate computations.

- Algebraic Laws simplify expressions (e.g., a * 1 = a, a + 0 = a).

Example:

x = a * b;
y = a * b;

The compiler assigns the same value number to a * b, eliminating duplicate computation.

E. Global Data-Flow Analysis

- Analyzes the entire program instead of just a basic block.

- Used for optimizations like constant propagation, dead code elimination, and live variable analysis.


Example:
If x is not used anywhere after x = a + b;, the compiler removes it.

Conclusion

- Code Generation converts intermediate code to machine code while considering efficiency, instruction selection, and register allocation.

- Code Optimization enhances performance using techniques like loop optimization, DAGs, value numbering, and algebraic simplifications.

No comments:

Post a Comment