
Unit-4 Parsing and Pushdown Automata 


4.1 Definition of Pushdown Automata

A pushdown automaton (PDA) is a way to implement a context-free grammar, just as we design a DFA for a regular grammar. A DFA can remember only a finite amount of information, but a PDA can remember an infinite amount of information. Basically, a pushdown automaton is a "finite state machine" plus "a stack".

A pushdown automaton has three components:
  • an input tape,
  • a control unit, and
  • a stack with infinite size.

The stack head scans the top symbol of the stack. A stack supports two operations:
  • Push: a new symbol is added at the top.
  • Pop: the top symbol is read and removed.

A PDA may or may not read an input symbol, but it has to read the top of the stack in every transition.

Figure: Basic structure of a PDA

A PDA can be formally described as a 7-tuple (Q, ∑, S, δ, q0, I, F):
  • Q is the finite set of states
  • ∑ is the input alphabet
  • S is the set of stack symbols
  • δ is the transition function: Q × (∑ ∪ {ε}) × S → Q × S*
  • q0 is the initial state (q0 ∈ Q)
  • I is the initial stack top symbol (I ∈ S)
  • F is the set of accepting states (F ⊆ Q)

The following diagram shows a transition in a PDA from a state q1 to a state q2, labeled a, b → c:

Transition in a PDA

Figure: Example transition

This means: at state q1, if we encounter an input symbol 'a' and the top symbol of the stack is 'b', then we pop 'b', push 'c' onto the top of the stack, and move to state q2.

Machine configuration, yields, acceptance

A machine configuration is an element of Q × Σ* × Γ*:

(p, w, z) = (current state, unprocessed input, stack content)

We define the usual yields-in-one-step relation ⊢:

(p, σw, αz) ⊢ (q, w, βz) if ((p, σ, α), (q, β)) ∈ Δ

or

(p, w, αz) ⊢ (q, w, βz) if ((p, ε, α), (q, β)) ∈ Δ

As expected, the yields relation ⊢* is the reflexive, transitive closure of ⊢. A string w is accepted by the PDA if

(s, w, ε) ⊢* (f, ε, ε) for some f ∈ F

Namely, from the start state with empty stack, we

  • process the entire string,
  • end in a final state, and
  • end with an empty stack.
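The configurations and the yields relation above can be simulated directly. Below is a minimal Python sketch; the dictionary encoding of the transition relation, and the use of '' for an ε-component, are illustrative assumptions, not notation from the text:

```python
from collections import deque

def pda_accepts(transitions, start, finals, w, max_steps=100000):
    """Breadth-first search over PDA configurations (state, unread input, stack).

    `transitions` maps (state, input_or_'', top_or_'') to a set of
    (next_state, push_string) pairs; '' means that component is ignored
    (an epsilon move on the input, or no pop on the stack).
    Acceptance: final state, empty input, empty stack.
    """
    frontier = deque([(start, w, "")])   # stack stored top-first as a string
    seen = set()
    steps = 0
    while frontier and steps < max_steps:
        steps += 1
        p, rest, stack = frontier.popleft()
        if (p, rest, stack) in seen:
            continue
        seen.add((p, rest, stack))
        if p in finals and rest == "" and stack == "":
            return True                  # (s, w, e) |-* (f, e, e)
        for a in ({rest[0], ""} if rest else {""}):        # read a symbol or not
            for t in ({stack[0], ""} if stack else {""}):  # pop the top or not
                for q, push in transitions.get((p, a, t), ()):
                    frontier.append((q,
                                     rest[1:] if a else rest,
                                     push + (stack[1:] if t else stack)))
    return False

# Example (an assumption for illustration): the PDA for {a^n b^n : n >= 1},
# which pushes a's in state s and pops one a per b in state f.
ANBN = {('s', 'a', ''): {('s', 'a')},
        ('s', 'b', 'a'): {('f', '')},
        ('f', 'b', 'a'): {('f', '')}}
```

Here pda_accepts(ANBN, 's', {'f'}, 'aabb') accepts, while 'aab' is rejected because the run ends with a non-empty stack.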


Graphical representation and ε-transitions

The book does not indicate so, but there is a graphical representation of PDAs. A transition

((p, x, α), (q, β))    where x = ε or x ∈ Σ

would be depicted like this (respectively):

https://www.cs.wcupa.edu/rkline/assets/img/FCS/pda-trans.png?1286832880

or

https://www.cs.wcupa.edu/rkline/assets/img/FCS/pda-e-trans.png?1286833058

The stack usage, written α/β on the arrow, represents these actions:

  • the top of the stack must match α
  • if we make the transition, pop α and push β

A PDA is non-deterministic. There are several forms of non-determinism in the description:

  • Δ is a relation
  • there are ε-transitions in terms of the input
  • there are ε-transitions in terms of the stack contents

The true PDA ε-transition, in the sense of being equivalent to the NFA ε-transition, is this:

https://www.cs.wcupa.edu/rkline/assets/img/FCS/pda-eee-trans.png?1287413078

because it consults neither the input nor the stack, and it leaves the previous configuration intact.


Palindrome examples

{x ∈ {a,b,c}* : x = wcw^R for w ∈ {a,b}*}

 

https://www.cs.wcupa.edu/rkline/assets/img/FCS/pda-wcwR.png?1287297547

The machine pushes a's and b's in state s, makes a transition to f when it sees the middle marker c, and then matches input symbols against those on the stack, popping the stack symbol on each match. Non-accepting string examples are these:
  • ε in state s
  • ab in state s with non-empty stack
  • abcab in state f with unconsumed input and non-empty stack
  • abcb in state f with non-empty stack
  • abcbab in state f with unconsumed input and empty stack

Observe that this PDA is deterministic in the sense that there are no choices in transitions.

The second example is:

{x ∈ {a,b}* : x = ww^R for w ∈ {a,b}*}

https://www.cs.wcupa.edu/rkline/assets/img/FCS/pda-wwR.png?1287297730

This PDA is identical to the previous
one except for the ε-transition

https://www.cs.wcupa.edu/rkline/assets/img/FCS/pda-eee-trans-s-f.png?1287413193

Nevertheless, there is a significant difference in that this PDA must guess when to stop pushing symbols, jump to the final state, and start matching off of the stack. Therefore, this machine is decidedly non-deterministic. In a general programming model (like Turing Machines), we have the luxury of preprocessing the string to determine its length and thereby knowing when the middle is coming.

Key takeaway:
-         A DFA can remember a finite amount of information, but a PDA can remember an infinite amount of information.
-         A PDA may or may not read an input symbol, but it has to read the top of the stack in every transition.
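The guess described above can be mimicked directly in code: try the ε-jump from the pushing state to the matching state at every input position. A minimal Python sketch (the function name is an illustrative assumption):

```python
def accepts_wwR(x):
    """Simulates the non-deterministic PDA for {w w^R}: try the epsilon-jump
    from the pushing state s to the matching state f at every position."""
    n = len(x)
    if n % 2:                      # an odd-length string can never split as w w^R
        return False
    for mid in range(0, n + 1):    # guess where the push phase ends
        stack = list(x[:mid])      # state s: everything seen so far was pushed
        ok = True
        for ch in x[mid:]:         # state f: match each symbol and pop
            if stack and stack[-1] == ch:
                stack.pop()
            else:
                ok = False
                break
        if ok and not stack:       # accepted: input consumed, stack empty
            return True
    return False
```

Only the exact midpoint can succeed, but iterating over all candidate positions is precisely the non-determinism the text describes: the machine explores every possible moment to take the ε-transition.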


4.2 Deterministic PDA
  • Machine transitions are based on the current state, the input symbol, and the current topmost symbol of the stack.
  • Symbols lower in the stack are not visible and have no immediate effect. Machine actions include pushing, popping, or replacing the stack top.
  • A deterministic pushdown automaton has at most one legal transition for the same combination of input symbol, state, and top stack symbol.
  • This is where it differs from the nondeterministic pushdown automaton.

Key takeaway:
-         A deterministic pushdown automaton has at most one legal transition for the same combination of input symbol, state, and top stack symbol.
-         Machine transitions are based on the current state, the input symbol, and the current topmost symbol of the stack.
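Because every transition of the w c w^R machine from Section 4.1 is forced (at most one legal move per state, input symbol, and stack top), its behavior collapses to a single deterministic pass. A sketch in Python, with the two states tracked by a flag (names are illustrative):

```python
def accepts_wcwR(x):
    """Deterministic single-pass check for x = w c w^R over {a, b}, mirroring
    the two-state PDA: push before the marker c, match-and-pop after it."""
    stack = []
    seen_c = False                     # False = state s, True = state f
    for ch in x:
        if not seen_c:
            if ch == 'c':
                seen_c = True          # the unique transition s -> f
            elif ch in 'ab':
                stack.append(ch)       # state s: push
            else:
                return False
        else:
            if ch in 'ab' and stack and stack[-1] == ch:
                stack.pop()            # state f: match and pop
            else:
                return False           # mismatch, second c, or empty stack
    return seen_c and not stack        # accept iff in f with empty stack
```

Note there is never a choice point: each (state, input, stack-top) combination determines exactly one action, which is the defining property of a deterministic PDA.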




4.4 Top-down parsing

Recursive or predictive parsing is known as top-down parsing. In top-down parsing, the parsing begins with the start symbol and transforms it into the input symbols. The top-down parser is the parser that, by expanding the non-terminals, generates a parse tree for the given input string with the aid of grammar productions, i.e. it begins from the start symbol and ends on the terminals.

Parse tree

The parse tree for the input string acdb is as follows:

Fig 2: Parse tree

There are two further types of top-down parser: the recursive descent parser and the non-recursive descent parser.

I. Recursive descent parser:

It is also known as a brute-force parser or a backtracking parser. Basically, by using brute force and backtracking, it produces the parse tree. Recursive descent is a top-down parsing technique that builds the parse tree from the top and reads the input from left to right. It uses a procedure for every terminal and non-terminal entity. This parsing technique recursively parses the input to make a parse tree, which may or may not require backtracking; if the grammar is not suitably factored, backtracking cannot be avoided. Predictive parsing is considered a type of recursive-descent parsing that does not require any backtracking. This parsing technique is called recursive because it uses context-free grammar, which is recursive in nature.

II. Non-recursive descent parser:

It is also known as a non-backtracking parser, dynamic parser, LL(1) parser, or predictive parser. It uses a parsing table instead of backtracking to create the parse tree.

Key takeaway:

    -         The parsing begins with the start symbol in the top-down parsing and transforms it into the input symbol.

    -         A top-down parsing technique that builds the parse tree from the top and reads the input from left to right is recursive descent.
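As an illustration, here is a brute-force recursive-descent parser for a small hypothetical grammar chosen so that the string acdb from Fig 2 parses; the text does not give the grammar, so S → a C b, C → c | c d is an assumption:

```python
def parse(s):
    """Brute-force recursive-descent recognition of the hypothetical grammar
    S -> a C b,  C -> c | c d.  Each non-terminal gets its own procedure,
    which yields every input position it could end at; trying the next
    yielded position is where the backtracking happens."""

    def match_C(i):
        # Try each alternative for C in order.
        if i < len(s) and s[i] == 'c':
            yield i + 1                        # C -> c
            if i + 1 < len(s) and s[i + 1] == 'd':
                yield i + 2                    # C -> c d

    def match_S(i):
        if i < len(s) and s[i] == 'a':
            for j in match_C(i + 1):           # expand the non-terminal C
                if j < len(s) and s[j] == 'b':
                    yield j + 1                # S -> a C b

    # Accept iff some expansion of S consumes the entire input.
    return any(j == len(s) for j in match_S(0))
```

On 'acdb', the alternative C → c is tried first, the expected 'b' is not found at the next position, and the parser backtracks to try C → c d, which succeeds; this retry is exactly the backtracking described above.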

     


4.5 Bottom-up parsing

Bottom-up parsing is also known as shift-reduce parsing. Bottom-up parsing is used to build a parse tree for an input string. In bottom-up parsing, the parsing begins with the input symbols and builds the parse tree up to the start symbol by tracing the rightmost derivations of the string in reverse. Bottom-up parsing can be done in multiple ways, such as the following:
  • Shift-Reduce Parsing
  • Operator Precedence Parsing
  • Table Driven LR Parsing
  • Table-driven LR parsing variants: LR(1), SLR(1), CLR(1), LALR(1)

Shift-reduce parsing
  • Shift-reduce parsing is a mechanism by which a string is reduced to the start symbol of a grammar.
  • Shift-reduce parsing uses a stack to hold the grammar symbols and an input tape to hold the string.

  • Shift-reduce parsing performs two actions: shift and reduce. This is why it is called shift-reduce parsing.
  • For the shift operation, the current symbol in the input string is moved onto the stack.
  • At each reduction, the symbols on the stack matching the right side of a production are replaced by the non-terminal on its left side.

Operator precedence parsing

Operator precedence grammar is a kind of shift-reduce parsing. It is applied to a small class of operator grammars. A grammar is said to be an operator precedence grammar if it has two properties:
  • No production has ε on its right-hand side.
  • No right-hand side has two adjacent non-terminals.

Operator precedence can only be defined between the terminals of the grammar; non-terminals are ignored. Three operator precedence relations exist:
  • a ⋗ b means that terminal "a" has higher precedence than terminal "b"
  • a ⋖ b means that terminal "a" has lower precedence than terminal "b"
  • a ≐ b means that terminals "a" and "b" have the same precedence

LR parsers
  • In an LR parser, "L" means left-to-right scanning of the input.
  • LR parsing is one form of bottom-up parsing. It is used to evaluate a broad class of grammars.
  • "R" stands for the construction of a rightmost derivation in reverse.
  • "k" is the number of look-ahead input symbols used to make parsing decisions.

LR parsing is divided into four types: LR(0) parsing, SLR parsing, CLR parsing, and LALR parsing.

Key takeaway:

    -         Bottom-up parsing is also known as parsing for shift-reduce.

    -         Bottom-up parsing is used to build an input string parsing tree.

    -         Shift reduction parsing is a mechanism by which a string is reduced to the beginning symbol of a grammar.

    -         The grammar of operator precedence is a kind of shift reduction system of parsing.
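The shift and reduce actions can be sketched for a tiny hypothetical grammar E → E + E | id (not from the text). A real LR parser consults a parsing table to decide between shifting and reducing; the greedy reduce-whenever-possible rule below is only for illustration:

```python
def shift_reduce(tokens):
    """Sketch of shift-reduce recognition for the hypothetical grammar
    E -> E + E | id, over a token list like ['id', '+', 'id'].
    Returns (accepted, trace of actions taken)."""
    stack, trace = [], []
    rest = list(tokens)
    while True:
        if stack[-1:] == ['id']:
            stack[-1:] = ['E']                 # handle 'id' on top of stack
            trace.append('reduce E -> id')
        elif stack[-3:] == ['E', '+', 'E']:
            stack[-3:] = ['E']                 # handle 'E + E' on top of stack
            trace.append('reduce E -> E + E')
        elif rest:
            stack.append(rest.pop(0))          # shift next input symbol
            trace.append('shift ' + stack[-1])
        else:
            # No action applies: accept iff only the start symbol remains.
            return stack == ['E'], trace
```

For ['id', '+', 'id'] the trace is: shift id, reduce E → id, shift +, shift id, reduce E → id, reduce E → E + E, and the parse is accepted with the start symbol E alone on the stack.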

