**Reading**-
Section 3.2 of
*Theory of Computing, A Gentle Introduction*. **Homework**- Then printout the homework and answer the questions on that paper.

**Technical definition of the language defined by a grammar**-
To give a technical definition of the language derived from
some grammar, we need to define a single derivation step.
**Definition:**Let $G = (\Sigma,NT,R,S)$ be a context free grammar. For strings $u,v\in (NT\cup\Sigma)^*$, we say*$u$ yields in one step in grammar $G$ $v$*(denoted $u \Rightarrow_G v$) if \[ u = yXz \text{ and } v = ywz \text{, where $y,z,w \in (NT\cup\Sigma)^*$ and $(X,w) \in R$} \] This is also sometimes expressed as*$v$ can be derived in one step from $u$ in grammar $G$*.Just as with DFAs and NDFAs, it is helpful to define a special notation for derivable in "zero or more steps".

**Definition:**Let $G = (\Sigma,NT,R,S)$ be a context free grammar. For strings $u,v\in (NT\cup\Sigma)^*$, we say*$u$ yields in zero or more steps in grammar $G$ $v$*(denoted $u \Rightarrow_G^* v$) if $u = v$ or if for some strings $w_1,w_2,\ldots,w_k$ we have: \[ u = w_1 \Rightarrow_G w_2 \Rightarrow_G \cdots \Rightarrow_G w_k = v \] This is also sometimes expressed as*$v$ can be derived from $u$ in grammar $G$*.With these notions in hand, defining the language generated by a grammar is natural.

**Definition:**Let $G = (\Sigma,NT,R,S)$ be a context free grammar. The*language generated by grammar $G$*(denoted $L(G)$) is $\{ w \in \Sigma^*\ |\ S \Rightarrow_G^* w\}$.**Note:**The class of languages that are generated by context free grammars are called the*Context Free Languages*. **A quick grammar short-hand**-
It becomes tiresome to write grammars like
S → aSa

in which many rules share the same left-hand side. So we adopt the short-hand convention that we can list many right-hand sides for the same left-hand side all on one line by separating them with the "|" character, like this:

S → bSs

S → a

S → b

S → λS → aSa | bSs | a | b | λ

This is just a short-hand notation. The |'s are not terminal characters, in fact they're not really in the grammar at all. It's just a short-hand to save us bother and screen/paper space. **An Example & Parse Trees**-
We wrote a grammar for postfix
expressions in the variable
*v*. Here's a solution:S → SSO | v

We saw that for a string like

O → + | - | * | /*v v + v v - **there are*lots*of different derivations, all corresponding to the many choices of which particular non-terminal to substitute for at any step.*Parse Trees*allow us to represent the derivation of a string in a grammar without distinguish the order in which non-terminals are substituted. Thus, one and the same parse tree describes many derivations. **A Parse tree's structure**-
The parse tree for the previous example not only allowed us to
examine a derivation without worrying about order, it also
showed us the
*meaning*of the postfix expression*v v + v v - **, i.e. showed us how that expression was to be evaluated. This can be helpful if you're not familiar with postfix. So, for example, consider the expression*3 5 + 6 2 - **. If you annotate the*v*leaf nodes in the above parse tree with 3,5,6,2 going from left to right, you can follow the parse tree's structure and evaluate.**Exercise:**Draw a parse tree for*v v v v * + -*using the above grammar and use it to evaluate`5 6 4 9 * + -`

.**Important:**in computer science we often rely on the structure of a grammar's parse trees to provide*meaning*. **Ambiguity**-
In C++ we have expressions like
`cout << "the" << 3`

. Suppose we want to model such expressions with a grammar. Instead of worrying about concrete values, we'll just worry about some basic object types, like`ostream`

,`string`

, and`int`

.*S → S << S | obj*

obj → ostream | string | int`ostream << string << int`

.**Exercise:**Draw a parse tree for`ostream << string << int`

.*two*valid parse trees for this string! Now that we're relying on parse trees to give meaning to strings, this is a real problem. We can't have two different meanings for a string. Then the meaining is*abiguous*, we won't know what to do! If a string has more than one parse tree in a given grammar, that grammar is called*ambiguous*. C++ uses the parse tree corresponding to`(ostream << string) << int`

, a fact that is dictated by`<<`

's left-to-right associativity. We can rewrite the grammar in a way that encodes this associativity, and remove the ambiguity.*S → S << obj | obj*

obj → ostream | string | int*inherently ambiguous*, meaning that any grammar for the language will have strings with more than one parse tree. Inherent ambiguity would be quite a problem in a programming language!**Definition:**A grammar is called*ambiguous*if there is a string for which there is more than one parse tree in the grammar. **Example with arithmetic expressions in infix**-
Most people don't like postfix (I don't know why!) and
prefer our usual infix notation for arithmetic. So, let's
support these knuckleheads and write a grammar for infix
expressions on the values 2 and 3 using + and *.
*S → S + S | S * S | 2 | 3**2 + 3 * 3*. Problem! There are two possible parse trees, and they lead to different answers when we evaluate the expression! Our grammar is ambiguous! That's a real problem, since this implies that the*meaning*of the expression is ambiguous!We could try to write the grammar in a different way so that it generates the same strings, but without the ambiguity.

*S → S + T | T*

T → T * N | N

N → 2 | 3*2 + 3 * 3*and what's more, it matches our precedence rules! So that's good.However, if you play around with this grammar, you'll find that there's no way to express that you'd like the addition "2 + 3" to happen before the multiplication "*3". In fact, to adequately express arithmetic in infix, you need parentheses. And that is one of the big reasons why postfix really is a superior language for an arithmetic calculator!

*S → S + T | T*

T → T * N | N

N → (S) | 2 | 3 **Dangling Else**- The "dangling else" problem is a classic example of ambigious grammar problems in real-world programming languages. Read about it in the book (p. 78), or read the Wikipedia article.

Christopher W Brown Last modified: Fri Oct 23 15:42:38 EDT 2009