Unit1 | unit 1 introduction

Compiler Design

Unit-1

Introduction

Q1) What are the phases of compilation ?

A1) Phases of compiler

The method of compilation includes the sequence of different stages. Each phase takes a source programme in one representation and produces output in another representation. From its previous stage, each process takes input.

The various stages of the compiler take place :

Lexical Analysis
Syntax Analysis
Semantic Analysis
Intermediate Code Generation
Code Optimization
Code Generation

Fig 1: phases of compiler

Lexical Analysis : The first phase of the compilation process is the lexical analyzer phase. As input, it takes source code. One character at a time, it reads the source programme and translates it into meaningful lexemes. These lexemes are interpreted in the form of tokens by the Lexical analyzer.

2. Syntax Analysis : Syntax analysis is the second step of the method of compilation. Tokens are required as input and a parse tree is generated as output. The parser verifies that the expression made by the tokens is syntactically right or not in the syntax analysis process.

3. Semantic Analysis : The third step of the method of compilation is semantic analysis. This tests whether the parse tree complies with language rules. The semantic analyzer keeps track of identifiers, forms, and expressions of identifiers. The output step of semantic analysis is the syntax of the annotated tree.

4. Intermediate Code Generation : The compiler generates the source code into an intermediate code during the intermediate code generation process. Between high-level language and machine language, intermediate code is created. You can produce the intermediate code in such a way that you can convert it easily into the code of the target computer.

5. Code Optimization : An optional step is code optimization. It is used to enhance the intermediate code so that the program's output can run quicker and take up less space. It eliminates redundant code lines and arranges the sequence of statements to speed up the execution of the programme.

6. Code Generation : The final stage of the compilation process is code generation. It takes as its input the optimised intermediate code and maps it to the language of the target computer. The code generator converts the intermediate code to the required computer's machine code.

Q2) What do you mean by compilers ?

A2) Compliers

● A compiler is a translator that translates the language at the highest level into the language of the computer.

● A developer writes high-level language and the processor can understand machine language.

● The compiler is used to display a programmer's errors.

● The main objective of the compiler is to modify the code written in one language without altering the program's meaning.

● When you run a programme that is written in the language of HLL programming, it runs in two parts.

Fig 2: Execution process of Compiler

Q3) Write the features of compilers?

A3) Features of Compilers

● Maintaining the right interpretation of the code

● The speed of the target code

● Understand prototypes for legal and illegal systems

● Good documentation / management of errors

● Correctness

● Speed of compilation

● Code debugging help

Q4) What kind of the task performed by compilers ?

A4) Task of Compilers

Tasks which are performed by compilers :

● Divides the source programme into parts and imposes on them a grammatical structure.

● Allows you from the intermediate representation to construct the desired.

● target programme and also create the symbol table.

● Compiles and detects source code errors.

● Storage of all variables and codes is handled.

● Separate compilation support.

● Learn, interpret, and translate the entire programme to a semantically equivalent.

● Depending on the computer sort, conversion of the source code into object code.

Q5) Write the operation of regular language ?

A5) operation of regular language

A language is regular if, in terms of regular expression, it can be presented.

Regular languages are closed under a wide variety of operations.

● Union and intersection

Pick DFAs recognizing the two languages and use the cross-product construction to build a DFA recognizing their union or intersection.

● Set complement

Pick a DFA recognizing the language, then swap the accept/non- accept markings on its states.

● String reversal

Pick an NFA recognizing the language. Create a new final state, with epsilon transitions to it from all the old final states. Then swap the final and start states and reverse all the transition arrows.

● Set difference

Rewrite set difference using a combination of intersection and set complement.

● Concatenation and Star

Pick an NFA recognizing the language and modify.

● Homomorphism

A homomorphism is a function from strings to strings. What makes it a homomorphism is that its output on a multi-character string is just the concatenation of its outputs on each individual character in the string. Or, equivalently, h(xy) = h(x)h(y) for any strings x and y. If S is a set of strings, then h(S) is {w : w = h(x) for some x in S}.

To show that regular languages are closed under homomorphism, choose an arbitrary regular language L and a homomorphism h. It can be represented using a regular expression R. But then h(R) is a regular expression representing h(L). So h(L) must also be regular.

Notice that regular languages are not closed under the subset/superset relation. For

Example, 0 * 1 * is regular, but its subset {O n 1 n : n >= 0} is not regular, but its subset {01, 0011, 000111} is regular again.

Q6) What do you mean by regular expressions ?

A6) Regular expressions

● Simple expressions called Regular Expressions can readily define the language adopted by finite automata. This is the most powerful way of expressing any language.

● Regular languages are referred to as the languages recognised by certain regular expressions.

● A regular expression may also be defined as a pattern sequence that defines a series.

● For matching character combinations in strings, regular expressions are used. This pattern is used by the string search algorithm to locate operations on a string.

A Regular Expression can be recursively defined as follows −

● ε is a Regular Expression indicates the language containing an empty

String. (L (ε) = {ε})

● φ is a Regular Expression denoting an empty language. (L (φ) = { })

● x is a Regular Expression where L = {x}

● If X is a Regular Expression denoting the language L(X) and Y is a Regular

● Expression denoting the language L(Y), then

● X + Y is a Regular Expression corresponding to the language L(X) ∪

L(Y) where L(X+Y) = L(X) ∪ L(Y).

● X . Y is a Regular Expression corresponding to the language L(X) .

L(Y) where L(X.Y) = L(X) . L(Y)

● R* is a Regular Expression corresponding to the language L(R*)

Where L(R*) = (L(R))*

● If we apply any of the rules several times from 1 to 5, they are Regular Expressions.

Q7) Explain scanner generator lex ?

A7) Scanner generator lex

Lex is a lexical analyzer generating programme. It is used to produce the YACC parser. A software that transforms an input stream into a sequence of tokens is the lexical analyzer. It reads the input stream and, by implementing the lexical analyzer in the C programme, produces the source code as output.

Fig 3: lexical analyzer generator

If a token is considered invalid by the lexical analyzer, it produces an error. The lexical analyzer works with the syntax analyzer in close collaboration. It reads character streams from the source code, checks for legal tokens, and, when necessary, passes the information to the syntax analyzer.

Tokens

It is said that lexemes are a character sequence (alphanumeric) in a token. For each lexeme to be identified as a valid token, there are some predefined rules. These rules, by means of a pattern, are described by grammar rules. A pattern describes what a token can be, and by means of regular expressions, these patterns are described.

Keywords, constants, identifiers, sequences, numbers, operators and punctuation symbols can be used as tokens in the programming language.

Example : the variable declaration line in the C language

Int value = 100 ;

Contains the tokens:

Int (keyword) , value (identifiers) = (operator) , 100 (constant) and ; (symbol)

Q8) Define flex ?

A8) Flex

A tool for scanner generation is FLEX (Fast LEXical analyzer generator). You just need to define the vocabulary of a certain language, write a pattern specification using regular expressions (e.g. DIGIT [0-9]) instead of writing a scanner from scratch, and FLEX will create a scanner for you.

In general, FLEX is used in the manner represented here:

Fig 4: Flex

First, FLEX reads a scanner specification either from an input file *.lex or from a standard input and generates a lex.yy.c source C file as the output. Then, to create an executable a.out, lex.yy.c is compiled and connected to the '-lfl' library. Finally, a.out analyses and transforms the input stream into a token sequence.

● *.Lex comes in the form of regular expressions and C code pairs.

● A yylex() routine specifies lex.yy.c, which uses the specification to identify tokens.

● Currently, A.out is a scanner!

Q9) Describe finite automata ?

A9) Finite automata

● To identify patterns, finite state machines are used.

● The Finite Automated System takes the symbol string as an input and modifies its state accordingly. When a desired symbol is found in the input, then the transformation happens.

● The automated devices may either switch to the next state during the transition or remain in the same state.

● FA has two states: state approve or state deny. When the input string is processed successfully and the automatic has reached its final state, it will approve it.

The following refers of a finite automatic:

Q: finite set of states

∑: finite set of input symbol

q0: initial state

F: final state

δ: Transition function

It is possible to describe transition functions as

δ: Q x ∑ →Q

The FA is described in two ways:

DFA
NDFA

Q10) How to contsruct FA from RE ?

A10) constructing an FA from RE

We begin by showing how to construct an FA for the operands in a regular

Expression.

● If the operand is a character c, then our FA has two states, s0 (the start state)

And sF (the final, accepting state), and a transition from s0 to sF with label c.

● If the operand is epsilon, then our FA has two states, s0 (the start state) and

SF (the final, accepting state), and an epsilon transition from s0 to sF.

● If the operand is null, then our FA has two states, s0 (the start state) and sF

(the final, accepting state), and no transitions.

Given FA for R1 and R2, we now show how to build an FA for R1R2, R1|R2, and

R1*. Let A (with start state a0 and final state aF) be the machine accepting L(R1)

And B (with start state b0 and final state bF) be the machine accepting L(R2).

● The machine C accepting L(R1R2) includes A and B, with start state a0, final

State bF, and an epsilon transition from aF to b0.

● The machine C accepting L(R1|R2) includes A and B, with a new start state

c0, a new final state cF, and epsilon transitions from c0 to a0 and b0, and

From aF and bF to cF.

● The machine C accepting L(R1*) includes A, with a new start state c0, a new

Final state cF, and epsilon transitions from c0 to a0 and cF, and from aF to a0,

And from aF to cF.

Sign Up