Class 25: Simple Algorithms for Grammars


Reading
Theorem 3.5.1 of Section 3.5.1 of Theory of Computing, A Gentle Introduction presents our three algorithms from the below.
Homework
Then printout the homework and answer the questions on that paper.

Last homework & the Compilation process
The previous homework gave you a peek at how compilers work. The input file is first broken up into tokens, which are chunks of text defined by regular expressions. Relationships between tokens are defined by a CFG. So after the input is tokenized, the sequence of tokens is parsed to produce a parse tree. The structure of this parse tree tells you something about the program's structure.

Regular expression operators on CFGs
Below we give an algorithm for taking grammars G1 and G2 and producing grammar G such that L(G) = L(G1) ∪ L(G2). We did this in class. Below are algorithms for concatenation and Kleene Star as well.
Algorithm: Union
Input: G1 = (Σ1,NT1,R1,S1)
G2 = (Σ2,NT2,R2,S2), where NT1 ∩ NT2 = ∅
Output: G = (Σ1 ∪ Σ2, NT1 ∪ NT2 ∪ {$}, R1 ∪ R2 ∪ {($,S1),($,S2)}, $), where $ is a new symbol, i.e. $ ∉ NT1 ∪ NT2
Algorithm: Concatenation
Input: G1 = (Σ1,NT1,R1,S1)
G2 = (Σ2,NT2,R2,S2), where NT1 ∩ NT2 = ∅
Output: G = (Σ1 ∪ Σ2, NT1 ∪ NT2 ∪ {$}, R1 ∪ R2 ∪ {($,S1S2)}, $), where $ is a new symbol, i.e. $ ∉ NT1 ∪ NT2
Algorithm: Kleene Star
Input: G = (Σ,NT,R,S)
Output: G' = (Σ, NT ∪ {$}, R ∪ {($,S$),($,λ)}, $), where $ is a new symbol, i.e. $ ∉ NT
Since a grammar for ∅, {λ} and {a}, for any character a, is trivial, we have all the operations we need for regular expressions. Thus, we can start with a regular expression, build recursively build grammars for its subexpressions, and combine them with our algorithms. The resulting algorithm is a proof that any regular language can be defined by a context free grammar.


Christopher W Brown
Last modified: Tue Oct 27 21:56:09 EDT 2009