Unit 3 | Compiler Design Notes | AKTU Notes


Unit 3 | Compiler Design Notes | AKTU Notes



Syntax-Directed Translation (SDT) in Compiler Design

What is Syntax-Directed Translation (SDT)?

Syntax-Directed Translation (SDT) is a method used in compilers where semantic actions are performed during syntax analysis. This means that while the compiler checks whether a program follows grammar rules, it also translates the program into an intermediate representation (like three-address code) at the same time.

SDT uses syntax-directed definitions (SDDs), which associate rules with grammar productions.

Syntax-Directed Translation Schemes

A Syntax-Directed Translation Scheme (SDTS) is a set of rules that specify how to translate a programming language into an intermediate representation.

It consists of:

1. Grammar rules – Define how the language is structured.

2. Semantic actions – Define what translation should happen when a rule is applied.

Example of SDT Rule

For an arithmetic expression grammar:

E → E1 + T  { print('+') }
E → T  
T → id  { print(id.name) }

If we input x + y, the translation would be:

1. Read x → print(x)

2. Read + → print(+)

3. Read y → print(y)
Output: x y + (Postfix notation)

Implementation of Syntax-Directed Translators

A syntax-directed translator is a program that follows SDT rules to convert source code into an intermediate representation.

Two main approaches to implementation:

1. Embedded actions in parsing – Actions are executed during parsing.

2. Using an explicit parse tree – First, build a parse tree, then perform translation.

Intermediate Code Generation

Intermediate code is a representation of the program that is between source code and machine code. It helps in optimization and simplifies code generation.

Common types of intermediate code:

- Postfix Notation

- Three-Address Code (TAC)

- Quadruples & Triples

Postfix Notation (Reverse Polish Notation - RPN)

- Operators appear after operands.

- Eliminates the need for parentheses.

- Example:

Infix: (a + b) * c

Postfix: a b + c *

Postfix notation is useful in stack-based evaluation (like in some calculators and interpreters).

Parse Trees & Syntax Trees

- Parse Tree: A tree that represents the derivation of an expression according to grammar rules.

- Syntax Tree: A simplified form of a parse tree that focuses on essential operations.


Example:

For a + b * c:

Parse Tree:

      E
     / \
    E   +
    / \   \
  a   *   E
      / \
    b   c

Syntax Tree:

    +
   / \
  a   *
       / \
     b   c


The syntax tree removes unnecessary non-terminals, making it more compact.


Three-Address Code (TAC)

- TAC breaks complex expressions into simpler steps using temporary variables.

- Each instruction has at most three operands.


Example:

For x = a + b * c:

1. t1 = b * c

2. t2 = a + t1

3. x = t2

Here, t1 and t2 are temporary variables.

Quadruples & Triples

TAC can be represented in different formats:

Quadruples (4 Fields)

Each instruction has four fields:
(Operator, Arg1, Arg2, Result)

{Fig }}}}}


Triples (3 Fields, No Result Column)

Each instruction has three fields:
(Operator, Arg1, Arg2)


{{{Fig}}}

Triples eliminate the result field, reducing memory usage.

Translation of Assignment Statements

Assignment statements like x = a + b * c are translated using TAC.

Example:

1. t1 = b * c

2. t2 = a + t1

3. x = t2

The same logic applies to other statements like loops and conditionals.

Translation of Boolean Expressions

Boolean expressions use jump statements for evaluation.

Example:
For if (a > b) x = 10; else x = 20;
TAC representation:

if a > b goto L1  
x = 20  
goto L2  
L1: x = 10  
L2:

Here, L1 and L2 are labels for controlling flow.

Translation of Flow-Control Statements

Statements like if-else, loops, and switch are translated into TAC using labels.

Example: While Loop

For:

while (a < b) {
    a = a + 1;
}

TAC:

L1: if a >= b goto L2  
a = a + 1  
goto L1  
L2:

This creates a loop structure using goto statements.

Postfix Translation with a Top-Down Parser

Top-down parsers can translate expressions directly into postfix notation.

Example:
For a + b * c:

1. Process b * c → Output b c *

2. Process a + (b * c) → Output a b c * +

Final output: a b c * +

Array References in Arithmetic Expressions

Arrays require index calculations before being used in expressions.

Example:
For x = A[i] + B[j], TAC is:

t1 = i * 4  
t2 = A[t1]  
t3 = j * 4  
t4 = B[t3]  
t5 = t2 + t4  
x = t5

This ensures correct memory access.

Procedure Calls in Translation

Function calls require handling parameters, return values, and stack frames.

Example:
For x = func(a, b):

param a  
param b  
t1 = call func, 2  
x = t1

This translates a function call with two parameters.

Translation of Declarations

Declarations involve allocating memory for variables.

Example:
For int x, y;, TAC could involve:

alloc x, 4  
alloc y, 4

This assigns 4 bytes for each integer variable.

Translation of Case Statements (Switch Case)

A switch statement is converted into jump tables.

Example:

switch (x) {
  case 1: y = 10; break;
  case 2: y = 20; break;
  default: y = 30;
}

TAC:

if x == 1 goto L1  
if x == 2 goto L2  
goto L3  
L1: y = 10; goto L4  
L2: y = 20; goto L4  
L3: y = 30  
L4:

Here, L1, L2, L3, and L4 manage control flow.

Conclusion

- Syntax-Directed Translation helps in converting source code into an intermediate representation.

- TAC, quadruples, and triples simplify further translation into machine code.

- Boolean expressions, control statements, and function calls require special handling in translation.

Mastering these concepts will help in understanding how modern compilers work efficiently!

No comments:

Post a Comment