Unit 4 | Compiler Design Notes | AKTU Notes


Unit 4 | Compiler Design Notes | AKTU Notes




Compiler Design: Symbol Tables, Run-Time Administration & Error Detection

Symbol Tables

What is a Symbol Table?

A symbol table is a data structure used by a compiler to store information about variables, functions, objects, and other identifiers in a program. It helps in semantic analysis, type checking, and code generation.

Why is a Symbol Table Important?

- It keeps track of identifiers (variable names, function names, etc.).

- It stores attributes like data type, scope, memory location, and value.

- It helps in error checking by identifying undeclared variables.

Data Structure for Symbol Tables

Symbol tables can be implemented using:

1. Hash Tables (Fast lookup and insertion)

2. Binary Search Trees (BSTs) (Keeps symbols in sorted order)

3. Linked Lists (Simple but slower lookup)

Example Symbol Table for a Program:


{{Fig}}}}




Representing Scope Information

A program has different scopes, meaning variables and functions are only accessible in certain parts of the program.

- Global Scope: Variables accessible everywhere.

- Local Scope: Variables inside functions or blocks.

To handle nested scopes, compilers use a stack-based approach where each function or block gets a new symbol table.

Example:

int x; // Global variable
void func() {
    int y; // Local variable (only inside func)
}

When func() is called, a new symbol table is created for y. When func() ends, this table is deleted.

Run-Time Administration

What is Run-Time Administration?

At run-time, a program needs memory for variables and function calls. The compiler decides how memory is allocated using storage allocation techniques.

Simple Stack Allocation Scheme

- Memory is allocated using a stack.

- When a function is called, a new stack frame is pushed.

- When the function ends, the frame is popped.

Example:

void func() {
    int x; // Allocated on stack
}

When func() runs, x gets memory on the stack. When func() ends, x's memory is released.

Storage Allocation in Block-Structured Languages

Languages like C, C++, Java use block-structured storage allocation.

Three main types of storage allocation:

1. Static Allocation (Fixed at compile time, e.g., global variables).

2. Stack Allocation (For function calls and local variables).

3. Heap Allocation (For dynamic memory using malloc() or new in C++).

Example of Heap Allocation:

int* ptr = (int*)malloc(sizeof(int)); // Allocates memory dynamically

This memory remains allocated until freed manually.

Error Detection & Recovery

Errors occur at different stages of compilation. The compiler must detect and recover from errors instead of stopping execution.

Types of Errors in Compilation

A. Lexical Phase Errors

- Errors in tokens (invalid characters, misspelled keywords).

- Example: int 123var = 5; (Variable names can't start with a number).

- Detection: The lexical analyzer identifies illegal tokens.

- Recovery: Skipping the incorrect token or replacing it with a valid token.

B. Syntactic Phase Errors

- Errors in grammar or structure.

- Example: if (x > 5 Missing )

- Detection: The parser identifies incorrect syntax.

- Recovery:

   - Panic mode: Skip invalid input until a correct token is found.

   - Phrase-level correction: Modify the code slightly to fix the error.

   - Error productions: Define additional grammar rules for common errors.

C. Semantic Errors

- Meaning-related errors (type mismatch, undeclared variables).

- Example:

int x;
x = "Hello"; // Type mismatch error

- Detection: Semantic analysis phase.

- Recovery: Suggest possible corrections.

Conclusion

- Symbol Tables store information about variables, functions, and their scope.

- Run-Time Administration manages memory allocation using stacks, heaps, and static storage.

- Error Detection & Recovery handles lexical, syntactic, and semantic errors efficiently.

No comments:

Post a Comment