Unit - 4
Symbol Tables
Q1) Describe the implementation of the symbol table?
A1) Implementation of the symbol table
The data structure for implementing the symbol table is widely used below:
● An array is used in this technique to store names and associated information.
● An "available" pointer is kept at the end of all records stored and new names are inserted in order as they arrive.
● We start looking for a name from the beginning of the list till the pointer is available and if not found, we get the error "use of the undeclared name".
● We need to ensure that it is not already present when adding a new name, otherwise, there is an error, i.e., "Multiple defined names"
● Insertion O (1) is fast, but the search is slow on average for large tables-O(n)
● The benefit is that a minimum amount of space is needed.
2. Linked list
● The linked list is used for this implementation. To each record, a connection field is added.
● Searching for names is performed in the order suggested by the link area of the link.
● To point to the first record of the symbol table, a pointer 'First' is retained.
● The insertion is fast O (1), but the search is slow on average for large tables-O(n).
3. Hash table
● Two tables are kept in the hashing system-a hash table and symbol table and are the most widely used method for implementing symbol tables.
● An array with an index range is a hash table: 0 to table size-1. These entries are pointers pointing to table symbol names.
● We use hash functions to look for a name that will result in an integer between 0 and table size-1.
● It is possible to insert and lookup very easily, O (1).
● The benefit is that it is possible to search rapidly and the downside is that it is difficult to implement hashing.
4. Binary search tree
● Another solution is to use the binary search tree to enforce the symbol table, i.e., we add two relation fields, i.e., left and right kid.
● All the names are generated as root node children, which always follow the binary search tree property.
● On average, insertion and lookup are O (log2 n).
Q2) Define a symbol table and also write the operation of the symbol table?
A2) Symbol Table
The Symbol Table is an essential data structure generated and maintained by the compiler to keep track of variable semantics, i.e., to store information about the scope and binding information about names, information about instances of different entities, such as variable and function names, classes, objects, etc.
A table of symbols used for the following purposes:
● It is used to store the names of all individuals in one location in a standardized way.
● To check if a variable has been declared, it is used.
● It is used to evaluate a name's scope.
● It is used to enforce type checking by verifying the assignments and semantically correct expressions in the source code.
A table of symbols may either be linear or a table of hashes. Using the following format, the entry for each name is kept
<symbol name, type, attribute>
For reference, suppose that a variable stores information about the following declaration of a variable:
static int salary
Then, in the following format, it stores an entry:
<salary, int, static>
The attribute of the clause includes the name-related entries.
Symbol Table entries –
Every entry in the symbol table is associated with attributes that the compiler supports at various stages.
Items contained in Table Symbols:
❏ Variable names and constants
❏ Procedure and function names
❏ Literal constants and strings
❏ Compiler generated temporaries
❏ Labels in source languages
Details from the Symbol table used by the compiler:
● Data type and name
● Declaring procedures
● Offset in storage
● If structure or record then, a pointer to structure table.
● For parameters, whether parameter passing by value or by reference
● Number and type of arguments passed to function
● Base Address
Operations of Symbol table –
The basic operations listed on the table of symbols include:
Operation | Functions |
insert | To insert a name into a table of symbols and return a pointer to its input.
|
lookup | To check for a name and return its input to the pointer
|
free | Removal of all entries and free symbol table storage.
|
allocate | To assign a new empty table of symbols. |
Q3) Write short notes on error detection and recovery?
A3) Error detection and recovery
All potential errors made by the user are identified at this compilation stage and recorded to the user in the form of error messages. This method of finding errors and disclosing them to the user is called the process of Error Handling.
Functions of the Error handler
● Detection
● Reporting
● Recovery
Classification of errors -
Time errors in the compilation are of three kinds:
Q4) How to represent scope information?
A4) Two symbol table forms are included in the compiler: the global symbol table and the scope symbol table.
Global symbol table that can be reached by all the tables of procedures and scope symbols that are generated in the software for each scope.
Both procedures and the scope symbol table can be accessed via the Global Symbol Table.
In the hierarchy structure, the scope of a name and symbol table is organized as shown below.
int value=10;
void sum_num()
{
int num_1;
int num_2;
{
int num_3;
int num_4;
}
int num_5;
{
int_num 6;
int_num 7;
}
}
Void sum_id
{
int id_1;
int id_2;
{
int id_3;
int id_4;
}
int num_5;
}
In a hierarchical data structure of symbol tables, the grammar above can be represented:
There are one global variable and two procedure names in the global symbol table. For sum id and its child tables, the name cited in the sum num table is not available.
The hierarchy of the symbol table's data structure is stored in the semantic analyzer. If you want to look for the name in the table of symbols, you can use the following algorithm to search for it:
● A symbol is first checked for in the current table of symbols.
● If the name is found, the search will be completed or the name will be checked in the parent symbol table until,
● Find a name or look for a global symbol.
Q5) What is the simple stack allocation scheme?
A5) In the source software, every name has a validity area, called that name's scope.
The following are the rules in a block-structured language:
● If a name is declared within Block B, then only within B will it be valid.
● The name that is valid for block B2 is also valid for B1 if the B1 block is nested within B2 unless the name's identifier is re-declared in B1.
These scope rules require a more complicated symbol table organization than a list of names and attribute associations.
The tables are arranged into stacks, and each table includes a list of names and attributes associated with them.
● A new table is entered into the stack whenever a new block is entered.
● The name that is declared as local to this block is included in the new table.
● The table will look for a name when the declaration is compiled.
● If the name of the table is not identified, a new name is added
Example:
int x;
void f(int m) {
float x, y;
{
int i, j;
int u, v;
}
}
int g (int n)
{
bool t;
}
Q6) Write short notes on the run - time environment?
A6) Run -time environment
As a source code, a programme is merely a set of text (code, statements, etc.) and it needs actions to be performed on the target computer to make it alive. To execute instructions, a programme requires memory resources. A programme includes names for processes, identifiers, etc. that enable the actual memory position to be mapped at runtime.
By runtime, we mean the execution of a programme. The runtime environment is the state of the target machine that can provide services to the processes running in the system, including software libraries, environment variables, and so on.
A set, often created with the executable programme itself, is a runtime support system that enables the communication of the process between the process and the runtime environment. Memory allocation and deallocation are taken care of when the programme is being executed.
Q7) What is the semantic error? write recovery method also?
A7) Semantic error
This type of error occurs during the semantic analysis process. These error types are observed at the time of compilation.
Scope and declaration errors are often compiled time errors. E.g.: undeclared or multiple identifiers declared. Another compile-time error is Form Mismatched.
Using the wrong variable or using the incorrect operator or doing an operation in the wrong order, the semantic error can occur.
Typical semantic errors are as follows:
● Incompatible type of operands
● Undeclared variables
● Not matching of actual arguments with the formal one
Example:
Use of a non-initialized variable:
int a;
void x (int s)
{
s=t;
}
t is undeclared in this code, which is why the semantic error is seen.
Error recovery
❏ If an 'Undeclared Identifier' error is found, a symbol table entry is made for the corresponding identifier to recover from this.
❏ If two operand data types are incompatible, then the compiler performs an automatic type conversion.
Q8) Describe the lexical phase error with the recovery method?
A8) Lexical phase error
This type of error can be identified during the lexical analysis stage.
The lexical error is a character sequence that does not fit any token's pattern. During programme execution, a lexical step error is observed.
The lexical phase error is:
● Spelling error.
● Exceeding the length of an identifier or numeric constants.
● The appearance of illegal characters.
● To remove the character that should be present.
● To replace a character with an incorrect character.
● Transposition of two characters.
Example:
Void main()
{
int a=10, b=20;
char * x;
x= &a;
a= 1axy;
}
1axy is neither a number nor an identifier in this code. So the lexical error will be shown by this code.
Error recovery:
Panic Mode Recovery
➔ In this process, until a specified set of synchronizing tokens is identified, successive characters from the input are extracted one at a time. Tokens for synchronization are delimiters such as; or}
➔ The advantage is that it is quick to execute and means that there is no infinite loop.
➔ The disadvantage is that a large amount of input is missed without additional errors being tested.
Q9) What do you mean by syntactic phase error with recovery?
A9) Syntactic phase error
This type of error occurs during the syntax analysis process. A syntax error is observed during programme execution.
There may be some syntax errors:
● Errors in structure
● Missing operator
● Misspelled keywords
● Unbalanced parenthesis
A syntax error can also occur when an incorrect formula is entered into a calculator. This can be activated by entering some decimal points or by opening brackets without closing them.
Example: Missing semicolon
int a = 5 // semicolon is missing
Compiler message:
ab.java:20: ';' expected
int a = 5
Error recovery:
❏ In this process, one at a time, successive characters are deleted from the input until a specified collection of synchronizing tokens is found. Tokens for synchronization are deli-meters such as; or}
❏ The advantage is that it is quick to implement and promises not to go to the infinite loop.
❏ The disadvantage is that, without testing for additional faults, a large amount of feedback is missed.
2. Statement Mode recovery
❏ In this process, when a parser detects an error, the remaining input is corrected to allow the parser to parse ahead of the rest of the input statement.
❏ You can remove extra semicolons, substitute a semicolon with a comma, or insert a missing semicolon for correction.
❏ When correcting, it is important to take the utmost care not to go in an endless loop.
❏ The drawback is that cases where the real error occurred before the detection point find it difficult to manage.
3. Error production
❏ If the consumer is aware of common errors that can then be found, these errors can be introduced by increasing the grammar by generating errors that produce erroneous constructs.
❏ If this is used, then suitable error messages can be produced during parsing and parsing can be continued.
❏ The drawback is that it's hard to hold.
4. Global Correction
❏ The parser analyses the entire programme and tries to figure out which error-free match is nearest to it.
❏ To recover from erroneous data, the closest match programme has fewer insertions, deletions, and token shifts.
❏ This technique is not practically applied due to elevated time and space complexity.
Q10) Describe the storage allocation in block-structured language?
A10) The numerous ways that memory can be allocated are:
Static storage allocation:
➢ Names are attached to storage locations in static allocation.
➢ If the memory is created at the time of compilation, the memory is created in a static region and only once.
➢ The dynamic data structure is assisted by static allocation, meaning that memory is only generated at compile-time and redistributed after programme completion.
➢ The downside of static storage allocation is that it is important to know the size and position of data objects at compile time.
➢ The constraint of the recursion method is another downside.
Stack storage allocation:
➢ Space is structured as a stack of static storage allocation.
➢ When activation begins, an activation record is placed into the stack and it pops when the activation finishes.
➢ The activation record includes the locals of each activation record so that they are connected to fresh storage. When the activation stops, the value for locals is deleted.
➢ It operates based on last-in-first-out (LIFO) and the recursion mechanism is supported by this allocation.
Heap storage allocation:
➢ The most versatile allocation scheme is heap allocation.
➢ Depending on the requirement of the user, memory allocation and deallocation can be performed at any time and any venue.
➢ Heap allocation is used to dynamically assign memory to the variables and then regain it when the variables are no longer used.
➢ The recursion mechanism is assisted by heap capacity allocation.