- The Grammar-to-PDA conversion
-
The following algorithm (last lecture's "MysteryAlgorithm") converts a grammar G
into a PDA M that accepts exactly the language generated by the
grammar.
Algorithm: Grammar2PDA
Input: G = (Σ, NT, R, S)
Output: PDA M such that L(M) = L(G)
M = ({q0,q1,q2}, Σ, Σ ∪ NT ∪ {$},
{(q0,λ,λ,q1,$), (q1,λ,S$,q2,λ)}
∪
{(q1,λ,wR,q1,A) | (A,w) ∈ R}
∪
{(q1,x,λ,q1,x) | x ∈ Σ}, q0,
{q2})
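As a concrete sketch, here is the construction in Python, under hypothetical representation choices that are mine, not part of the algorithm: single-character symbols, rules given as (A, w) pairs with w a string ("" for λ), and transitions written (state, read, pop, new-state, push) with "" again standing for λ.

```python
def grammar_to_pda(sigma, nonterminals, rules, start):
    """Sketch of Grammar2PDA: build the three-state PDA for a grammar."""
    BOTTOM = "$"                                   # end-of-stack marker
    delta = {
        ("q0", "", "", "q1", BOTTOM),              # push $ below everything
        ("q1", "", start + BOTTOM, "q2", ""),      # pop S$ and accept
    }
    for A, w in rules:                             # reduce: pop w reversed, push A
        delta.add(("q1", "", w[::-1], "q1", A))
    for x in sigma:                                # shift: read x, push it
        delta.add(("q1", x, "", "q1", x))
    stack_alphabet = set(sigma) | set(nonterminals) | {BOTTOM}
    return ({"q0", "q1", "q2"}, set(sigma), stack_alphabet, delta, "q0", {"q2"})
```

Note how small the result is: two fixed moves, one reduce move per rule, and one shift move per terminal, regardless of how complicated the grammar is.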
Isn't it astounding that such a simple algorithm could
accomplish such a seemingly difficult task? No matter how
complex the grammar, the simple three-state machine produced
by this algorithm accepts exactly the language generated by
the grammar! It's hard to believe. Do you actually believe
it? Hopefully you won't take this on faith. You should
demand proof that this algorithm really works!
- Proof of correctness for the Grammar2PDA algorithm: Intro
-
To prove that the algorithm Grammar2PDA meets its
specification, we must prove that L(M) = L(G),
where G is the input grammar, and M is the
PDA produced by the algorithm.
It is by no means clear that this is true, and it's not easy
to see how to go about proving that it's true.
One thing that should be clear by now, though, is that proving
that L(M) = L(G) actually requires two sub-proofs:
- prove that if u ∈ L(G) then u ∈ L(M), and
- prove that if u ∈ L(M) then u ∈ L(G).
The fundamental approach to proving both of these is similar.
How do we show a string u is in L(G)? We give
a parse tree for u! How do we show string
u is in L(M)? We give a sequence of
configurations comprising an accepting computation for
machine M on input u! So to prove
u ∈ L(G) ⇒ u ∈ L(M)
we give an algorithm that converts a parse tree
in G for string u into a
configuration-sequence for an accepting computation
of M on input u!
Such an algorithm shows that any string derivable in the
grammar is accepted by the PDA.
And to prove
u ∈ L(M) ⇒ u ∈ L(G)
we give an algorithm that converts
configuration-sequence for an accepting computation
of M on input u into a parse tree
in G for string u!
Such an algorithm shows that any string that is accepted by
the PDA is derivable in the grammar.
It's going to be important in what follows to be able to
piece together several computations (given as sequences of
configurations) into a single computation (also given as a sequence of
configurations). The following lemma is the key to doing this.
Lemma 1: For any PDA M, if
(p,u,λ) ⇒*M (p',λ,r)
and
(p',v,λ) ⇒*M (q,λ,t)
then
(p,uv,λ) ⇒*M
(q,λ,tr).
Proof: This is obvious enough that I won't prove it. A
picky mathematician would use induction. It is worth looking
at an example, however.
Suppose we have
(p,aab,λ) ⇒*M (p',λ,Xa)
and
(p',cbc,λ) ⇒*M (q,λ,bccZ).
The theorem tells us that
(p,aabcbc,λ) ⇒*M (q,λ,bccZXa).
In essence, Lemma 1 is giving us an algorithm that joins two
sequences of configurations representing
computations in machine $M$ into a single long sequence of
configurations representing the computations done
one-after-another in $M$. The algorithm looks something
like this:
Algorithm: $\text{join}_M(s,t)$
Input: sequences of configurations
$s = (p_1,u_1,\lambda) \Rightarrow_M \cdots \Rightarrow_M (p_m,\lambda,v_m)$ and
$t = (q_1,x_1,\lambda) \Rightarrow_M \cdots \Rightarrow_M (q_n,\lambda,y_n)$ such that
$p_m = q_1$
Output:
$r$, a sequence of configurations representing a
computation $(p_1,u_1 x_1,\lambda) \Rightarrow_M^*
(q_n,\lambda,y_n v_m)$ in $M$
$r = (p_1,u_1 x_1,\lambda) \Rightarrow_M \cdots \Rightarrow_M (p_m,x_1,v_m)
= (q_1,x_1,v_m) \Rightarrow_M \cdots \Rightarrow_M (q_n,\lambda,y_n v_m)$
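Assuming a computation is represented as a Python list of (state, remaining-input, stack) triples, with the stack a string whose leftmost character is the top, the join algorithm is a short splice. This is a sketch; the representation is my own choice:

```python
def join(s, t):
    # s ends in (p_m, "", v_m) and t starts in (p_m, x_1, ""): same state.
    p_m, _, v_m = s[-1]
    q_1, x_1, _ = t[0]
    assert p_m == q_1 and s[-1][1] == "" and t[0][2] == ""
    # Replay s with t's input x_1 appended to every remaining-input field,
    head = [(p, u + x_1, v) for (p, u, v) in s]
    # then replay the rest of t with s's final stack v_m underneath.
    tail = [(q, x, y + v_m) for (q, x, y) in t[1:]]
    return head + tail
```

On the Lemma 1 example, join maps summaries of (p,aab,λ) ⇒* (p′,λ,Xa) and (p′,cbc,λ) ⇒* (q,λ,bccZ) to a computation from (p,aabcbc,λ) to (q,λ,bccZXa).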
- Proof of correctness for the Grammar2PDA algorithm: u ∈ L(G) ⇒ u ∈
L(M)
-
The following is a lemma that shows that a sub-parse-tree in
grammar G that
produces string u can be transformed into a sequence
of configurations in machine M that reads input
string u. If you look closely at the (inductive) proof, you'll
see it's essentially a recursive algorithm.
Algorithm: build($N$)
Input: node $N$ that is the root of a subtree of a
valid parsetree for grammar $G$
Output: $s$, a sequence of configurations representing
the computation $(q_1,w,\lambda) \Rightarrow_M^* (q_1,\lambda,A)$
in $M$, where $A$ is $N$'s label in the parse tree and $w
\in \Sigma^*$
is the string whose derivation $N$ represents
let $A$ be the label of parse tree node $N$ (i.e. the
element of $\Sigma \cup NT \cup \{\lambda\}$ it corresponds to)
if $A = \lambda$ return $(q_1,\lambda,\lambda)$
if $A \in \Sigma$ return $(q_1,A,\lambda) \Rightarrow_M (q_1,\lambda,A)$
let $c_1,\ldots,c_k$ be the children of $N$ in left-to-right
order and let $A_1,\ldots,A_k$ be their labels
$s' = \text{build}(c_1)$
for $i$ from $2\ldots k$ do
$s'=\text{join}_M(s',\text{build}(c_i))$
Note: in the last configuration of $s'$, the stack
will be $A_k A_{k-1} \cdots A_1$
and $(q_1,\lambda,A_k A_{k-1} \cdots A_1,q_1,A) \in \Delta$
$s = s' \Rightarrow_M (q_1,\lambda,A)$
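The recursive structure of build can likewise be sketched in Python, under the same hypothetical encoding as before: computations as lists of (state, remaining-input, stack) triples with the stack top at the left, and parse-tree nodes as (label, children) pairs with "" labeling a λ-leaf.

```python
def join(s, t):
    # splice two computations, as in Lemma 1 (stacks: strings, top at left)
    _, _, v_m = s[-1]
    _, x_1, _ = t[0]
    head = [(p, u + x_1, v) for (p, u, v) in s]
    tail = [(q, x, y + v_m) for (q, x, y) in t[1:]]
    return head + tail

def build(node, terminals):
    label, children = node
    if label == "":                      # λ-leaf: a zero-step computation
        return [("q1", "", "")]
    if label in terminals:               # terminal leaf: one shift move
        return [("q1", label, ""), ("q1", "", label)]
    # interior node: join the children's computations left to right ...
    s = build(children[0], terminals)
    for child in children[1:]:
        s = join(s, build(child, terminals))
    # ... then one reduce move pops the children's labels, pushes label
    return s + [("q1", "", label)]
```

For example, the parse tree for "()" under a hypothetical grammar with rules S → (S) and S → λ yields a computation from (q1,(),λ) to (q1,λ,S).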
Lemma 2: Let N be the root node of a sub parse-tree
that derives some string u, and let A be
the symbol from the grammar that N has as a label. Then
(q1,u,λ) ⇒*M (q1,λ,A).
Proof:
Proceed by induction on the depth of the parse tree rooted
at N. If the depth is zero, then N is a leaf and u =
A. If u = A is a terminal then,
(q1,u,λ) ⇒M
(q1,λ,A).
Otherwise, u = A = λ, and
(q1,u,λ) ⇒*M (q1,λ,A)
is really
(q1,λ,λ) ⇒*M (q1,λ,λ)
which is trivially true (yields in zero steps).
Suppose the depth is greater than zero. Then node N
represents the application of some rule A → w
in G, where w ∈ (Σ ∪ NT)*.
If w = λ, then u = λ and
(q1,λ,λ) ⇒M (q1,λ,A).
Otherwise, w = w1 w2 ... wk, where the wi
are individual symbols labeling the
children of N, N1,N2,...,Nk, respectively.
If we use ui to name the string yielded by Ni, then
u = u1u2...uk.
By
induction, for each Ni we have
(q1,ui,λ) ⇒*M (q1,λ,wi).
By repeated application of Lemma 1 to
(q1,u1,λ) ⇒*M (q1,λ,w1),
(q1,u2,λ) ⇒*M (q1,λ,w2),
...
(q1,uk,λ) ⇒*M (q1,λ,wk)
we get
(q1,u1u2...uk,λ) ⇒*M
(q1,λ,wk...w2w1).
Since
(q1,λ,wk...w2w1) ⇒M (q1,λ,A),
we get that
(q1,u1u2...uk,λ) ⇒*M
(q1,λ,A).
Theorem 1: If u ∈ L(G) then u ∈ L(M).
Proof:
Let T be a parse tree for u in grammar G.
The label of the root is, of course, the start
symbol S.
From the definition of M we have
(q0,λ,λ) ⇒M (q1,λ,$)
and by Lemma 2 we have
(q1,u,λ) ⇒*M (q1,λ,S)
which Lemma 1 allows us to combine into
(q0,u,λ) ⇒*M (q1,λ,S$).
Finally, since the definition of M includes
(q1,λ,S$) ⇒M (q2,λ,λ),
we have
(q0,u,λ) ⇒*M (q2,λ,λ),
which means M accepts u.
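As a sanity check on the whole theorem, the sketch below hard-codes the PDA that Grammar2PDA produces for a hypothetical balanced-parentheses grammar (S → (S) | SS | λ) and searches breadth-first for an accepting computation. The stack-depth bound is needed because the λ-rule can push S without limit; the bound can only cause false negatives, never false positives. All names and the bound are mine.

```python
from collections import deque

# Transitions (state, read, pop, new_state, push), "" = λ; stacks are
# strings with the top at the left. Grammar: S -> (S) | SS | λ.
DELTA = {
    ("q0", "", "", "q1", "$"), ("q1", "", "S$", "q2", ""),
    ("q1", "", ")S(", "q1", "S"), ("q1", "", "SS", "q1", "S"),
    ("q1", "", "", "q1", "S"),
    ("q1", "(", "", "q1", "("), ("q1", ")", "", "q1", ")"),
}

def accepts(u, max_stack=None):
    if max_stack is None:
        max_stack = len(u) + 2        # generous enough for this grammar
    seen, queue = set(), deque([("q0", u, "")])
    while queue:
        cfg = queue.popleft()
        if cfg in seen:
            continue
        seen.add(cfg)
        p, rest, stack = cfg
        if p == "q2" and rest == "":  # the $ trick: stack is empty here too
            return True
        for (q, read, pop, q_next, push) in DELTA:
            if q == p and rest.startswith(read) and stack.startswith(pop):
                new_stack = push + stack[len(pop):]
                if len(new_stack) <= max_stack:
                    queue.append((q_next, rest[len(read):], new_stack))
    return False
```

Because the bound keeps the configuration space finite, the search always terminates, and strings the grammar derives are indeed accepted.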
- Proof of correctness for the Grammar2PDA algorithm: u ∈ L(M) ⇒ u ∈
L(G)
-
Now we need to go the other way. We need to prove that if the
machine $M$ accepts string $u$, $u$ is derivable in grammar
$G$. What does it mean that $u \in L(M)$? It means we have a
sequence of configurations
$(q_0,u,\lambda) \Rightarrow_M \cdots \Rightarrow_M (q_2,\lambda,\lambda)$ [note
that we know the stack is empty at the end because $M$ uses
the \$ trick!].
To prove $u \in L(G)$, I need an algorithm that would convert
that sequence of configurations into a parse tree for $u$ in
grammar $G$. If you look at the proof below, you'll see that
what it's really doing is presenting such an algorithm.
Theorem: If u ∈ L(M) then u ∈
L(G).
Proof:
If u ∈ L(M), the structure of the machine
ensures that
(q1,u,$) ⇒*M (q1,λ,S$).
What we will do is show how to annotate the configurations in
this computation so that the final configuration includes an
annotation that is a parse tree in G for the
string u.
We annotate a configuration by labeling each character on
the stack with a sub-parse-tree according to these two rules:
-
if the stack character is a terminal x that is
pushed as a result of the transition (q1,x,λ,q1,x),
annotate x with the sub-parse tree consisting of
a single node labeled x
-
if the stack character is a non-terminal A that
is
pushed as a result of the transition
(q1,λ,wR,q1,A)
[Note that this means A → w is a rule
in G, and also note that if w is
non-empty, each character of wR
on the stack in the previous configuration is annotated
with a sub-parse-tree!]
annotate A with
-
if w = λ, a node labeled A
with one child, which is labeled λ,
-
if w = w1 ... wk, where each wi ∈
Σ ∪ NT,
the sub-parse tree consisting
of a root node labeled with A and children
consisting of the
first k annotations from the stack in the
previous configuration in reverse order.
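The two annotation rules can be sketched as a replay of the computation's moves, assuming the accepting computation is given as its list of moves: ("shift", x) for a transition (q1,x,λ,q1,x) and ("reduce", A, w) for (q1,λ,wR,q1,A), with w the rule's right-hand side as a string. Here the stack holds (symbol, subtree) pairs with the top at the END of the list, so the top |w| entries already appear in left-to-right order; trees are (label, children) pairs and a λ-leaf is labeled "". The encoding is mine:

```python
def annotate(moves):
    stack = []
    for move in moves:
        if move[0] == "shift":                   # rule 1: single node labeled x
            x = move[1]
            stack.append((x, (x, [])))
        else:                                    # rule 2: A -> w was applied
            _, A, w = move
            if w == "":                          # 2a: one child labeled λ
                tree = (A, [("", [])])
            else:                                # 2b: adopt the top |w| subtrees
                popped = stack[-len(w):]
                del stack[-len(w):]
                assert "".join(sym for sym, _ in popped) == w
                tree = (A, [t for _, t in popped])
            stack.append((A, tree))
    (_, tree), = stack                           # only the annotated S remains
    return tree
```

Replaying the moves for "()" under a hypothetical grammar with rules S → (S) and S → λ leaves a single annotated S on the stack, and its annotation is exactly the parse tree for "()".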
Our claim is that in the final configuration
(q1,λ,S$), the S is annotated with a parse
tree in grammar G for string u. A proper
proof of this would rely on induction. However, it seems
obvious enough that this procedure works that we won't
bother. We simply note that
- it's obvious the procedure
can only produce proper parse trees,
- it's obvious the
non-λ leaf nodes are exactly the characters
in u, and
- each pair of characters in u
starts off in distinct trees on the stack, appearing in
reverse order on the stack exactly up until the point at which
those trees are combined as children of a common parent, which
puts the two characters in proper order with respect to one
another, where they stay for the remainder of the
computation.
This proof is actually really important, not only because
it's half of showing that M is equivalent
to G, but also because it shows that
machine M can actually be seen as building a parse
tree for the input string. This is, in fact, how tools like
bison work.
-