Class 31: More of what you can't do with CFGs and CFLs


Reading
Sections 3.6 of Theory of Computing, A Gentle Introduction, especially Theorem 3.6.3..
Homework
Then printout the homework and answer the questions on that paper.

A pumping lemma for CFGs Version 1.0
Last class we developed a pumping lemma for Context free grammars:
Pumping Lemma for CFGs v1.0
Let G = (Σ,NT,R,S) be a CFG and let u be a non-empty string in L(G). If |u | ≥ max{ |t| | (A,t) ∈ R }|NT| + 1 then there are strings v,w,x,y,z such that
  • u = vwxyz
  • at least one of w, y is not λ
  • vwkxykz ∈ L(G) for all k ≥ 0.

You should go back and read that section of the notes to see why this is true!

Proving a language is not generated by any context from grammar
Theorem The language L = {anbncn | n ≥ 0} is not a context free language.
Proof Any context free language is defined by a grammar. We will prove that no grammar defines L by giving an algorithm that, for any grammar, produces a "counter-example string":
Input: Grammar G = (Σ,NT,R,S)
Output: String u' such that u' ∈ L but u' ∉ L(G) or u' ∉ L but u' ∈ L(G)
  1. set m = max{ |t| | (A,t) ∈ R}|NT| + 1
  2. set n = ⌈m/3⌉
  3. set u = anbncn
  4. if u ∉ L(G) then let u' = u, otherwise
    1. get v,w,x,y,z from the pumping lemma for for CFGs
    2. set u' = vw2xy2z
  • note that m from Step 1 is the bound required by the pumping lemma.
  • note that |u| ≥ m
  • clearly if u ∉ L(G) then u' = u is a counter-example string.
  • u' from Step 4 b is not in L since if wy does not contain one of a,b,c then u' will have less of that character than at least one of the others, while if all three characters are in wy then either w contains a's and b's so that w2 contains b's followed by a's, or y contains b's and c's so that y2 contains c's followed by b's.

    

Apply this algorithm to the grammar G:

G: S → aS | bSc | λ
to produce a string that proves that G does not generate the language anbncn.

Intersection of context-free languages
Consider the following two context-free languages and the grammars that generates them:
L1 = {ambmcn | m,n ≥ 0} L2 = {anbmcm | m,n ≥ 0}
S → XC
X → aXb | λ
C → Cc | λ
S → AY
Y → bYc | λ
A → Aa | λ
Clearly both L1 and L2 are CFLs, since we have grammars for them. However, L1 ∩ L2 = {ambmcm | m ≥ 0}, which is the one language we know is not context free. Therefore, in contrast with regular languages, the intersection of two context free languages can fail to be context free.
Theorem: The intersection of two context free languages is not necessarily context free.
Another Mystery Algorithm
Consider the following mystery algorithm:
Input: PDA M = (Q,Σ,Γ,s,Δ,W) and NDFA M' = (Q',Σ,Δ',s',W'), where M' has no λ-transitions
Output: PDA M* = (QxQ',Σ,Γ,(s,s'),Δ*,WxW'), where
Δ* = {((p,p'),x,y,(q,q'),z) | (p,x,y,q,z) ∈ Δ and (p',x,q') ∈ Δ'} ∪ {((p,p'),λ,y,(q,p'),z) | (p,λ,y,q,z) ∈ Δ}
Note that we already have algorithms to transform NDFA's with λ-transitions into equivalent machines without λ-transitions. Therefore, this algorithm can be applied to any PDA/NDFA combination, it just may require a little prerocessing for some NDFA's. We ran this algorithm with the concrete input below:
With the resulting machine to guide us, we determined that the above algorithm takes a PDA M and an NDFA M' and produces a PDA that accepts L(M)∩L(M'). But wait, doesn't this contradict our previous result? No! Because that would require a similar algorithm for two PDAs, not for a PDA and NDFA. So what's stopping us from doing the same thing for two PDAs? Well, that would require two stacks in the resulting machine! So, hopefully we understand a bit better why intersection of CFLs doesn't always produce a CFL, and as a side benefit we have a new theorem:
Theorem: The intersection of a context free language and a regular language is a context free language.
Complement
From what we know about intersection from the above, we get the following theorem:
Theorem: The complement of a context free language is not necessarily context free.
Proof: Suppose that the complement of a context free language were always context free. Then consider any two context free languages, L1 and L2. Given our supposition, L1 and L2 are both context free. Since we proved several lectures ago that the union of context free languages is context free, L1L2 is context free. Then given our assumption
L1L2
would be context free, but this is exactly L1 ∩ L2, and we know that L1 ∩ L2 is not always context free. Therefore, our supposition must be wrong, and the complement of a context free language is not always context free.

Christopher W Brown
Last modified: Mon Nov 16 16:34:04 EST 2009