Syntax-Directed Translation (SDT) in Compiler Design
What is Syntax-Directed Translation (SDT)?
Syntax-Directed Translation (SDT) is a method used in compilers where semantic actions are performed during syntax analysis. This means that while the compiler checks whether a program follows grammar rules, it also translates the program into an intermediate representation (like three-address code) at the same time.
SDT uses syntax-directed definitions (SDDs), which associate rules with grammar productions.
Syntax-Directed Translation Schemes
A Syntax-Directed Translation Scheme (SDTS) is a set of rules that specify how to translate a programming language into an intermediate representation.
It consists of:
1. Grammar rules – Define how the language is structured.
2. Semantic actions – Define what translation should happen when a rule is applied.
Example of SDT Rule
For an arithmetic expression grammar:
E → E1 + T { print('+') }
E → T
T → id { print(id.name) }
If we input x + y, the translation would be:
1. Read x → print(x)
2. Read + → print(+)
3. Read y → print(y)
Output: x y + (Postfix notation)
Implementation of Syntax-Directed Translators
A syntax-directed translator is a program that follows SDT rules to convert source code into an intermediate representation.
Two main approaches to implementation:
1. Embedded actions in parsing – Actions are executed during parsing.
2. Using an explicit parse tree – First, build a parse tree, then perform translation.
Intermediate Code Generation
Intermediate code is a representation of the program that is between source code and machine code. It helps in optimization and simplifies code generation.
Common types of intermediate code:
- Postfix Notation
- Three-Address Code (TAC)
- Quadruples & Triples
Postfix Notation (Reverse Polish Notation - RPN)
- Operators appear after operands.
- Eliminates the need for parentheses.
- Example:
Infix: (a + b) * c
Postfix: a b + c *
Postfix notation is useful in stack-based evaluation (like in some calculators and interpreters).
Parse Trees & Syntax Trees
- Parse Tree: A tree that represents the derivation of an expression according to grammar rules.
- Syntax Tree: A simplified form of a parse tree that focuses on essential operations.
Example:
For a + b * c:
Parse Tree:
E
/ \
E +
/ \ \
a * E
/ \
b c
Syntax Tree:
+
/ \
a *
/ \
b c
The syntax tree removes unnecessary non-terminals, making it more compact.
Three-Address Code (TAC)
- TAC breaks complex expressions into simpler steps using temporary variables.
- Each instruction has at most three operands.
Example:
For x = a + b * c:
1. t1 = b * c
2. t2 = a + t1
3. x = t2
Here, t1 and t2 are temporary variables.
Quadruples & Triples
TAC can be represented in different formats:
Quadruples (4 Fields)
Each instruction has four fields:
(Operator, Arg1, Arg2, Result)
{Fig }}}}}
Triples (3 Fields, No Result Column)
Each instruction has three fields:
(Operator, Arg1, Arg2)
{{{Fig}}}
Triples eliminate the result field, reducing memory usage.
Translation of Assignment Statements
Assignment statements like x = a + b * c are translated using TAC.
Example:
1. t1 = b * c
2. t2 = a + t1
3. x = t2
The same logic applies to other statements like loops and conditionals.
Translation of Boolean Expressions
Boolean expressions use jump statements for evaluation.
Example:
For if (a > b) x = 10; else x = 20;
TAC representation:
if a > b goto L1
x = 20
goto L2
L1: x = 10
L2:
Here, L1 and L2 are labels for controlling flow.
Translation of Flow-Control Statements
Statements like if-else, loops, and switch are translated into TAC using labels.
Example: While Loop
For:
while (a < b) {
a = a + 1;
}
TAC:
L1: if a >= b goto L2
a = a + 1
goto L1
L2:
This creates a loop structure using goto statements.
Postfix Translation with a Top-Down Parser
Top-down parsers can translate expressions directly into postfix notation.
Example:
For a + b * c:
1. Process b * c → Output b c *
2. Process a + (b * c) → Output a b c * +
Final output: a b c * +
Array References in Arithmetic Expressions
Arrays require index calculations before being used in expressions.
Example:
For x = A[i] + B[j], TAC is:
t1 = i * 4
t2 = A[t1]
t3 = j * 4
t4 = B[t3]
t5 = t2 + t4
x = t5
This ensures correct memory access.
Procedure Calls in Translation
Function calls require handling parameters, return values, and stack frames.
Example:
For x = func(a, b):
param a
param b
t1 = call func, 2
x = t1
This translates a function call with two parameters.
Translation of Declarations
Declarations involve allocating memory for variables.
Example:
For int x, y;, TAC could involve:
alloc x, 4
alloc y, 4
This assigns 4 bytes for each integer variable.
Translation of Case Statements (Switch Case)
A switch statement is converted into jump tables.
Example:
switch (x) {
case 1: y = 10; break;
case 2: y = 20; break;
default: y = 30;
}
TAC:
if x == 1 goto L1
if x == 2 goto L2
goto L3
L1: y = 10; goto L4
L2: y = 20; goto L4
L3: y = 30
L4:
Here, L1, L2, L3, and L4 manage control flow.
Conclusion
- Syntax-Directed Translation helps in converting source code into an intermediate representation.
- TAC, quadruples, and triples simplify further translation into machine code.
- Boolean expressions, control statements, and function calls require special handling in translation.
Mastering these concepts will help in understanding how modern compilers work efficiently!
No comments:
Post a Comment