Class 25: Context Free Grammars
- Reading
-
Section 3.1 of Theory of Computing, A Gentle
Introduction. Please read it!
- Homework
-
- Example 3.1.5 from the text provides a grammar for
arithmetical expression in the variable v. Using
that grammar, write down a derivation for the expression
(((v + v)*v) - v).
-
The previous grammar gives us arithmetical expressions in
infix notation. Infix requires parentheses,
which get cumbersome. Postfix notation has the
advantage of not needing parentheses. In postfix, an
expression has two operands then the operator to be
applied. For example, (3 + 5)*(4 - 2) is 3 5 + 4 2 - * ,
which we view as
3 5 +
4 2 - *
which becomes
8
2 *
which gives 16 .
Write a grammar for postfix expressions in the variable
v with the operators +,-,*,/.
-
Using the grammar you created in (2), give a derivation
for v v + v v * /.
-
Let the grammar G be defined by the 4-tuple ({S,X},{0,1,@},{(S,@X@),(X,0X1),(X,1X0),(X,@)},S).
Give a derivation of @0110@1001@ in grammar G.
-
Provide the 4-tuple defining the grammar:
S → <S,S>
S → <N,N>
N → N0
N → N1
N → 1
- Context Free Grammars: Informally
-
A context free grammar is a set of "rewrite rules"
that say that a particular string can be "rewritten" by
substituting particular expressions in place of particular
symbols. A simple example: We may have a rule
stating that the character A can be replaced with the
string aaA (which we'll write as A → aaA).
With this rule, we can "rewrite" the string babAb as
babaaAb. We distinguish between symbols for which
there is no substitution rules (called terminals) and
those for which there are subsitution rules (called
nonterminals). The goal in using these rules is to
produce a string of terminals - after all, at that point no
further substitutions are possible.
Example:
Given the rules A → aaA and A → b
we can produce the string aaaab from the string
A as follows:
- start with A
- apply the first rule and get aaA
- apply the first rule and get aaaaA
- apply the second rule and get aaaab
We say that we have derived string aaaab from the string
A.
A collection of these rules along with a nonterminal
designated as the string with which derivations begin is a grammar.
When all rules have the form X → w, where
X is a single nonterminal, the grammar is called
Context Free.
Example: The following rules, with nonterminal
S designated as the start symbol, define the
language of all even-length strings over the alphabet
{a,b}.
- S → A
- A → aaA
- A → abA
- A → baA
- A → bbA
- A → e
A derivation of abaabb, for example, would be
S ⇒ A ⇒ abA ⇒ abaaA ⇒ abaabbA ⇒ abaabb
where the "⇒" denotes the application of a single
grammar rule.
Life gets substantially more interesting when grammar
derivations produce strings containing several nonterminals,
and during the derivation process you have many choices as to
which rules to apply.
Example: Consider the following grammar:
In derivations for this grammar we have decisions - decisions
about which non-terminal to replace as well as which rule to
use in the replacement. For example, the string c,c,c
can be derived in several different ways:
S ⇒ A ⇒ A,A ⇒ c,A ⇒ c,A,A ⇒ c,c,A ⇒ c,c,c
S ⇒ A ⇒ A,A ⇒ A,A,A ⇒ c,A,A ⇒ c,c,A ⇒ c,c,c
S ⇒ A ⇒ A,A ⇒ A,c ⇒ A,A,c ⇒ c,A,c ⇒ c,c,c
Problem: Consider the following grammar:
- S → cAc
- A → AcA
- A → aa
- A → bb
Find derivations for the strings caacbbcbbc
and cbbcaacaacaac.
- Languages defined by Context Free Grammars
-
CFGs are kind of like regular expressions, in the sense that
they are a nice way to specify a language, as opposed to
testing whether a string belongs to a language. So CFGs are
similar to regular expressions, but how do they compare?
Can they express everything that regular exrpessions can?
Excercise:
Here's a list of some of the languages defined by regular
expressions.
See if you can write a CFG generating any of them.
- (a|b)*, i.e. any string over {a,b}
- a(a|b)*b, i.e. any string that starts with
a and ends with b
- abb|bba, i.e. {abb,bba}
Can
they express languages that cannot be expressed by regular
expressions?
Excercise:
Here's a list of some of the languages we proved are not
regular, i.e. cannot be accepted by finite automata.
See if you can write a CFG generating any of them.
- {anbn | n ≥ 0}
- {anbncn | n ≥ 0}
- The language of properly parenthesized lists of a's
- The language of palindromes over {a,b,c}
- The language of all strings of a's whose
length is prime
- A formal definition of a CFG
-
A Context Free Grammar is a tuple
G = (Σ,NT,R,S), where
- Σ is the alphabet, i.e. the set of terminal
symbols
- NT is the set of nonterminal symbols
- R ⊆ NT
x(Σ ∪ NT)*,
i.e. the elements of R are pairs whose first
componant is an element of NT and whose second
componant is a string over the alphabet Σ ∪ NT
- S ∈ NT is the start symbol
Christopher W Brown
Last modified: Fri Oct 24 15:14:26 EDT 2003