● {0,1} is the binary alphabet,
● {0,1,2,3,4,5,6,7,8,9,A,B,C,D,E,F} is the hexadecimal alphabet,
● {a-z, A-Z} is the alphabet of English letters.
2. Strings : A string is any finite sequence of symbols drawn from an alphabet.
3. Special Symbols : A typical high-level language uses the following special symbols:
Arithmetic Symbols | Addition(+), Subtraction(-), Multiplication(*), Division(/) |
Punctuation | Comma(,), Semicolon(;), Dot(.) |
Assignment | = |
Special Assignment | +=, -=, *=, /= |
Comparison | ==, !=, <, <=, >, >= |
Preprocessor | # |
4. Language : A language is a set of strings over some finite alphabet. The set itself may be finite or infinite.
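5. Longest Match Rule : When the lexical analyzer reads the source code, it scans the code character by character and concludes that a lexeme is complete when it encounters white space, an operator symbol, or a special symbol. When more than one pattern could match at the current position, the longest match rule says that the analyzer should choose the longest possible lexeme.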
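A minimal Python sketch of the rule, assuming a small, hypothetical set of token patterns (the names PATTERNS and longest_match are illustrative and not part of any real lexer). Every pattern is tried at the current position and the longest lexeme wins. Giving ties to the pattern listed first is a design choice made for this sketch.

```python
import re

# Hypothetical token patterns, chosen only for illustration.
# Order matters only for ties: the earlier pattern is kept.
PATTERNS = [
    ("LT", re.compile(r"<")),
    ("LE", re.compile(r"<=")),
    ("ASSIGN", re.compile(r"=")),
    ("KEYWORD_INT", re.compile(r"int")),
    ("IDENT", re.compile(r"[A-Za-z_][A-Za-z0-9_]*")),
]

def longest_match(text, pos=0):
    """Return (token_name, lexeme) for the longest match at pos, or None."""
    best = None
    for name, pattern in PATTERNS:
        m = pattern.match(text, pos)
        # A strictly longer lexeme replaces the current best.
        if m and (best is None or len(m.group()) > len(best[1])):
            best = (name, m.group())
    return best

print(longest_match("<=b"))      # ('LE', '<=')
print(longest_match("integer"))  # ('IDENT', 'integer')
print(longest_match("int x"))    # ('KEYWORD_INT', 'int')
```

Here "<=" is read as one comparison operator rather than "<" followed by "=", and "integer" is read as one identifier rather than the keyword "int" followed by "eger".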
6. Operations : The following operations are defined on languages:
● The union of two languages L and M is written as L U M = {s | s is in L or s is in M}
● The concatenation of two languages L and M is written as LM = {st | s is in L and t is in M}
● The Kleene closure of a language L is written as L* = zero or more strings of L concatenated together.
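The three operations can be tried out on small finite languages. The following is a minimal Python sketch in which languages are modelled as sets of strings. The function names are illustrative, and because L* is infinite for any language containing a non-empty string, kleene_closure only builds the strings obtained from at most max_repeats concatenations (an assumption made to keep the result finite).

```python
from itertools import product

def union(L, M):
    """L U M: strings that are in L or in M."""
    return L | M

def concat(L, M):
    """LM: every string of L followed by every string of M."""
    return {s + t for s, t in product(L, M)}

def kleene_closure(L, max_repeats):
    """Finite approximation of L*: 0 to max_repeats concatenations of L."""
    result = {""}        # zero occurrences gives the empty string
    current = {""}
    for _ in range(max_repeats):
        current = concat(current, L)
        result |= current
    return result

L = {"a", "b"}
M = {"0", "1"}
print(sorted(union(L, M)))                # ['0', '1', 'a', 'b']
print(sorted(concat(L, M)))               # ['a0', 'a1', 'b0', 'b1']
print(sorted(kleene_closure({"ab"}, 2)))  # ['', 'ab', 'abab']
```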
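7. Notations : If r and s are regular expressions denoting the languages L(r) and L(s), then
● Union : (r)|(s) is a regular expression denoting L(r) U L(s)
● Concatenation : (r)(s) is a regular expression denoting L(r)L(s)
● Kleene Closure : (r)* is a regular expression denoting (L(r))*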
8. Representing valid tokens of a language in regular expression : If x is a regular expression, then:
● x* means zero or more occurrences of x.
● x+ means one or more occurrences of x.
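The difference between the two operators can be checked with Python's standard re module. In this small sketch x is arbitrarily taken to be the expression "ab", and the digit pattern is an assumed example of how x+ might describe an unsigned integer token.

```python
import re

# x is the regular expression "ab" here; the choice is arbitrary.
star = re.compile(r"(ab)*")   # zero or more occurrences of x
plus = re.compile(r"(ab)+")   # one or more occurrences of x

print(bool(star.fullmatch("")))      # True: zero occurrences are allowed
print(bool(plus.fullmatch("")))      # False: at least one is required
print(bool(star.fullmatch("abab")))  # True
print(bool(plus.fullmatch("abab")))  # True
print(bool(plus.fullmatch("aba")))   # False: "aba" is not a repetition of "ab"

# Assumed example: an unsigned integer token as "one or more digits".
digits = re.compile(r"[0-9]+")
print(bool(digits.fullmatch("60")))  # True
```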
9. Finite automata : A finite automaton is a state machine that takes a string of symbols as input and changes its state accordingly.
If the input string is processed successfully and the automaton has reached a final state, the string is accepted.
The mathematical model of a finite automaton consists of :
● Q - Finite set of states
● Σ - Finite set of input symbols
● q0 - One start state
● qf - Set of final states
● δ - Transition function
The transition function (δ) maps a state from Q and an input symbol from Σ to the next state, δ : Q × Σ ➔ Q
Q. 5) Write about tokens ?
Ans : Token
A lexeme is a sequence of (alphanumeric) characters that makes up a token. There are predefined rules that determine whether a lexeme is a valid token. These rules are given by grammar rules, by means of a pattern. A pattern describes what a token can be, and patterns are written as regular expressions. In a programming language, keywords, constants, identifiers, strings, numbers, operators and punctuation symbols can all be tokens.
Example : the variable declaration line in the C language
int num = 60 ;
contains the tokens:
int (keyword), num (identifier), = (operator), 60 (constant) and ; (symbol)
Q. 6) Explain finite automata ?
Ans : Finite Automata
A finite automaton (FA) is the simplest machine for recognizing patterns. The finite automaton, or finite state machine, is an abstract machine with five components (a 5-tuple). It has a set of states and rules for moving from one state to another, depending on the input symbol that is applied. It is essentially an abstract model of a digital computer. Some important features of general automata are shown in the figure below. The five components are:
● Q - Finite set of states
● Σ - Finite set of input symbols
● q0 - One start state
● qf - Set of final states
● δ - Transition function
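The five components listed above can be written down directly as data. The following is a minimal Python sketch of a deterministic finite automaton, using a hypothetical machine over the binary alphabet that accepts strings ending in "1". The state names, the table DELTA and the function accepts are all assumptions made for illustration.

```python
# Hypothetical DFA: accepts binary strings whose last symbol is "1".
Q = {"q0", "q1"}          # finite set of states
SIGMA = {"0", "1"}        # finite set of input symbols
START = "q0"              # start state
FINAL = {"q1"}            # set of final states
DELTA = {                 # transition function: (state, symbol) -> next state
    ("q0", "0"): "q0",
    ("q0", "1"): "q1",
    ("q1", "0"): "q0",
    ("q1", "1"): "q1",
}

def accepts(string):
    """Run the automaton on string and report whether it ends in a final state."""
    state = START
    for symbol in string:
        if symbol not in SIGMA:
            return False  # a symbol outside the alphabet is rejected
        state = DELTA[(state, symbol)]
    return state in FINAL

print(accepts("0101"))  # True
print(accepts("0110"))  # False
print(accepts(""))      # False: the start state is not final
```

The automaton reads one symbol at a time from left to right and keeps only its current state, which is why a lookup table is enough to represent δ.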
Formally, a finite automaton is specified as the tuple { Q, Σ, q0, qf, δ }
Finite automata model :
A finite automaton can be represented by an input tape and a finite control.
● Input tape : A linear tape divided into a number of cells. Each cell holds one input symbol.
● Finite control : On receiving a particular input from the input tape, the finite control decides the next state. The tape reader reads the cells one by one from left to right, and only one input symbol is read at a time.
● Lex is a program that generates lexical analyzers. It is used together with the YACC parser generator. A lexical analyzer is a program that transforms an input stream into a sequence of tokens. Lex reads a specification of the tokens and produces, as output, C source code that implements the lexical analyzer.
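To make the idea of turning an input stream into a sequence of tokens concrete, here is a small hand-written Python sketch. It is not what Lex generates (Lex emits C code); it only imitates the behaviour of a lexical analyzer on the declaration int num = 60 ; used earlier. The keyword list, the token names and the function tokenize are assumptions chosen for illustration.

```python
import re

KEYWORDS = {"int", "float", "char", "return"}  # small illustrative subset

# One alternative per token class; multi-character operators come first
# so that "==" is not split into two "=" tokens.
TOKEN_RE = re.compile(r"""
    (?P<constant>\d+)
  | (?P<word>[A-Za-z_]\w*)
  | (?P<operator>==|!=|<=|>=|[-+*/=<>])
  | (?P<symbol>[;,.])
  | (?P<skip>\s+)
""", re.VERBOSE)

def tokenize(source):
    """Return a list of (token_class, lexeme) pairs for source."""
    tokens = []
    pos = 0
    while pos < len(source):
        m = TOKEN_RE.match(source, pos)
        if m is None:
            raise ValueError(f"unexpected character {source[pos]!r} at {pos}")
        kind, lexeme = m.lastgroup, m.group()
        pos = m.end()
        if kind == "skip":
            continue  # white space separates lexemes but is not a token
        if kind == "word":
            kind = "keyword" if lexeme in KEYWORDS else "identifier"
        tokens.append((kind, lexeme))
    return tokens

print(tokenize("int num = 60 ;"))
# [('keyword', 'int'), ('identifier', 'num'), ('operator', '='),
#  ('constant', '60'), ('symbol', ';')]
```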