Class 27: Converting Grammars to PDAs

Sections 3.4 of Theory of Computing, A Gentle Introduction.
Then printout the homework and answer the questions on that paper. Due 10 November.

The 6-tuple (Q,Σ,Γ,Δ,s,W) tells you what a PDA is, but it doesn't tell you how the thing works ... it's kind of like having the Monopoly board and pieces without haveing the rules that tell you how to play. In the next few sections, we'll develop those rules.

For machine M = (Q,Σ,Γ,Δ,s,W) a configuation of M is a 3-tuple (p,w,v) ⊆ Q × Σ* × Γ *, where p is the current state, w is the unread input, and v is the contents of the stack.

Just as with finite automata, a configuration is like a snapshot of the machine in the middle of a computation. It has all the information you would need to continue the computation from that point.

Yields in One Step
A machine is allowed to go from one configuration to another provided there is a transition that allows it. We mnake that idea formal with the following:

For machine M = (Q,Σ,Γ,Δ,s,W), configuration (p,u,v) yields (q,u',v') in 1 step (which we write as (p,u,v) ⇒ (q,u',v')) provided that (p,x,y,q,z) ∈ Δ, where u = xu' and x ∈ Σ ∪ {λ}, v = y r and v' = z r, where r,y,z ∈ Γ *.

Accepting computation
Machine M = (Q,Σ,Γ,Δ,s,W) accepts string w ∈ Σ* if there exists a sequence of configurations
(s,w,λ) ⇒ (p1,u1,v1) ⇒ ... ⇒ (pn,un,vn)
where pn ∈ W and un = λ. If a string is not accepted, i.e. no such sequence of configurations exists, it is rejected.

we looked at a nice example of a PDA last class that accepted anbn.
Now we can write down an accpeting computation of this machine as a sequence of configurations. For example given input w = aabb we have to following accepting computation:
(q0,aabb,λ) ⇒ (q1,aabb,$) ⇒ (q1,abb,a$) ⇒ (q1,bb,aa$) ⇒ (q1,b,a$) ⇒ (q1,λ,$) ⇒ (q2,λ,λ)

Yet another Mystery Algorithm
Consider the following Mystery Algorithm:
Input: G = (Σ, NT, R, S)
Output: M = ({q0,q1,q2}, Σ,Σ ∪ NT ∪ {$}, q0, {(q0,λ,λ,q1,$), (q1,λ,S$,q2,λ} ∪ {(q1,λ,wR,q1,A) | (A,w) ∈ R} ∪ {(q1,x,λ,q1,x) | x ∈ Σ}, {q2})
What does this algorithm output for input G = ({1,0,+},{S,B},{(S,S+B),(S,B),(B,1),(B,0)},S)?

M = ( {q0,q1,q2}, Σ, Σ ∪ NT ∪ {$}, { (q0,λ,λ,q1,$), (q1,λ,S$,q2,λ),
(q1,λ,B+S,q1,S), (q1,λ,B,q1,S),
(q1,λ,0,q1,B), (q1,λ,1,q1,B),
(q1,0,λ,q1,0), (q1,1,λ,q1,1),
} , q0,{q2} )

Mystery Algorithm Unmasked
The Mystery Algorithm actually takes grammar G and produces a push down automaton M that accepts the language generated by G. This is a handy algorithm. For starters, it proves a nice theorem:
Theorem: If a language is generated by a CFG, it is accepted by some PDA.
Second, this means that we can specify a language with a grammar, and create an automaton to recognize strings for us. We could even imagine augmenting this machine to create a parse tree for us. However, while this machine accepts the language in theory, it's pretty useless in practice. Why? Because it is really nondeterministic. Algorithms for creating deterministic machines from input grammars is a whole big field in and of itself. This is at the heart of "parsing" for programming languages.

Ignoring the nondeterministic aspect of this mathine, though, let's see why we say that it "parses" a string. Suppose that whenever we take a transition based on the rule A ⇒ w, a transition that pops wR off the stack and replaces it with A, we annotate the new varible A with the value that it replaced. When an accepting computation completes, we'll have just popped an S off the stack that is annotated with ... the entire parse tree. The following sloppy diagram should show how that works for the simple example above and the input string 0 + 1 + 1.

So if, given a grammar, we can make a deterministic version of the machine produced by the above Mystery Algorithm, we will be able to parse strings in the language defined by the grammar.

Christopher W Brown
Last modified: Mon Nov 2 08:38:35 EST 2009