**Reading**-
Section 2.4 of
*Theory of Computing, A Gentle Introduction*and, of course, these notes. Make sure you know what a "regular language" is! **Homework**-
Repeat to yourself 10 times:
*Any lanugage accepted by a DFA, can be accepted by an NDFA. Any language accepted by an NDFA can be defined by a Regular Expression. Any language defined by a Regular Expression can be accepted by an NDFA. Any language accepted by an NDFA can be accepted by a DFA.*Then printout the homework and answer the questions on that paper.

**Repeat: Formal definition of a regular expression**-
Now that we are (hopefully) familiar with using regular
expressions in an informal way, it's time to give a formal
defintions of a regular expression
**definition of regular expression****definition of language defined by a regular expression****Definition:**A*regular expression*over the alphabet Σ is an expression derived from the following rules:- λ, ∅, and
*a*, where*a ∈ Σ*, are all regular expressions - if
*E*is a regular expression then*(E)**is a regular expression - if
*E1*and*E2*are regular expressions then*(E1)(E2)*and*(E1)***|***(E2)*are regular expressions - nothing else is a regular expression!

**Definition:**If R is a regular expression over alphabet Σ, then*L(R)*, the*language defined by regular expression R*, is- {λ} if R = λ,
{ } if R = ∅, and {
*a*} if R =*a*, where*a ∈ Σ* *L(E)**if R =*(E)**for some regular expression*E**L(E1)L(E2)*(respectively*L(E1) ∪ L(E2)*) if R =*(E1)(E2)*(respectively R =*(E1)***|***(E2)*) where*E1*and*E2*are regular expressions

- λ, ∅, and
**Informal method for constructing FAs accepting languages defined by regular expressions**-
Hopefully you remember expression trees from data structures
or parse trees from programming languages. We went over a
demo in class about how we could use the expression tree
representation of a regular expression and apply
our union, Kleene star and concatenation algorithms to
construct a NDFA that accepts the language defined by a
regular expression.
**Converting Regular Expressions to Finite Automata**-
Since the
only operations on regular expressions are union concatenation
and Kleene star, those are the only real algorithms we need
for FAs in order to do our conversion.
Here they are
(pulled from previous lectures):
**Algorithm: Union**-
**Input:**Machines*M1 = (Q1,Σ1,Δ1,s1,W1)*and*M2 = (Q2,Σ2,Δ2,s2,W2)*, where*Q1 ∩ Q2 = ∅*

**Output:***M = (Q1 ∪ Q2 &cup {s}, Σ1 ∪ Σ2, Δ1 ∪ Δ2 ∪ {(s,λ,s1),(s,λ,s2)}, s, W1 ∪ W2)*, where*s ∉ Q1*and*s ∉ Q2*(i.e.*s*is a new state) **Algorithm: Concatenation**-
**Input:**Machines*M1 = (Q1,Σ1,Δ1,s1,W1)*and*M2 = (Q2,Σ2,Δ2,s2,W2)*, where*Q1 ∩ Q2 = ∅*

**Output:**Machine*M = (Q1 ∪ Q2, Σ1 ∪ Σ2, Δ, s1,W2)*, where*Δ = Δ1 ∪ Δ2 ∪ {(q,λ,s2) | q ∈ W1}* **Algorithm: KleeneStar**-
**Input:**Machine*M = (Q, Σ, Δ, s, W)*

**Output:***M' = (Q ∪ {$}, Σ, Δ ∪ {($,λ,s)} ∪ {(q,λ,$) | q ∈ W}, $, {$})*, where*$ ∉ Q*, i.e.*$*is a new state

*a*, where*a ∈ Σ*, λ and ∅. So we need algorithms for building those machines. This is*really*easy!*a ∈ Σ*Machineλ Machine ∅ Machine **Input:***a ∈ Σ*

**Output:***M = ({q0,q1},Σ,{(q0,a,q1)},q0,{q1})***Input:***nothing*

**Output:***M = ({q0},Σ,{ },q0,{q0})***Input:***nothing*

**Output:***M = ({q0},Σ,{ },q0,{ })*Now we really have everything we need for a recursive algorithm to convert a regular expression λ into a finite automaton.

**Algorithm: RE2FA**

**Input:**regular expression*E*

**Output:**NDFA*M*such that*L(M) = L(E)*.

**if***E*is*a*, λ or ∅, return the*a ∈ Σ*Machine, λ Machine or ∅ Machine from above

**else if***E = E1 | E2*return**Union**(**RE2FA**(*E1*),**RE2FA**(*E2*))

**else if***E = E1E2*return**Concatenation**(**RE2FA**(*E1*),**RE2FA**(*E2*))

**else if***E = E1**return**KleeneStar**(**RE2FA**(*E1*)) **Regular Exressions vs. Finite Automata**- How does the set of all languages defined by regular expressions compare to the set of all languages accepted by finite automata? The previous algorithm tells us that the set of languages defined by regular expressions is contained within the set or languages accepted by finite automata. But what about the other way? Are there lanugages accepted by finite automata for which there is no regular expression? Do I care? Well, yes and no. No I don't care in the sense that I won't really want to convert finite automata into regular expressions. Yes I do care, however, because if there are languages that are accepted by finite automata but which cannot be represented by regular expressions, then I should extend regular expressions so that they take full advantage of the power of the finite automata into which we'll convert them.
**Converting Finite Automata to Regular Expressions**-
Yes, any finite automaton can be converted into regular
expression defining the language the automaton accepts. This
means the set of all languages defined by regular
expressions is equal to the set of all languages accepted by
finite automata, so there's no point trying to extend the
expressive power of regular expressions. Because of this
equivalence, we refer to all these languages as
*regular languages*.The algorithm for this conversion is a little ugly. I'm not going to cover it in any great depth. The point is that it is possible ... it's not the kind of thing there's a great need to do in practice.

**Input:**NDFA*M = (Q,Σ,Δ,s,W)*containing no λ-transitions,*(p*where_{a},x,p_{b}) ∈ QxΣxQ*(p*, and a table_{a},x,p_{b}) ∉ Δ*E*such that*E*is a regular expression for_{ij}*{w ∈ Σ* | (q*._{i},w) ⇒* (q_{j},λ)}

**Output:**A table*E*such that^{new}*E*is a regular expression for^{new}_{ij}*{w ∈ Σ* | in M*where^{new}we have (q_{i},w)⇒*(q_{j},λ)}*M*.^{new}= (Q,Σ,Δ∪{(p_{a},x,p_{b})},s,W)}-
set
*E*^{new}_{ab}= (E_{aa})(x (E_{ba}))*(E_{ab}|x)(E_{bb}) -
for all
*(r,s) ≠ (a,b)*set*E*^{new}_{rs}= E_{rs}|(E_{ra})(E^{new}_{ab})(E_{bs})

We use this algorithm to build a table T such that T[p,q] is a regular expression defining the set of all strings that take you from state p to state q in the machine, i.e. all strings

*w*such that*(p,w)⇒*(q,λ)*. We build this table incrementally, by starting with a machine that is the same as the input, except all the transitions are missing. The table for this initial machine is clearly λs along the diagonal, i.e. T[p,p] = λ for each state p, and ∅ everywhere else, since there are no transitions to take you from one state to another. We add transitions one-by-one, updating the table along the way by the algorithm above. When all the transitions have been added, we create the final regular expression by taking the "or" of all table entries T[p,q] where p is the start state and q is an accepting state.This algorithm works, but it creates horribly complicated regular expressions. I've implemented it, adding a little bit of simplification on the regular expressions that appear in the table. Even with that, the results are ugly. But, you do really get a regular expression at the end of the day that is equivalent to the input machine. The figure below shows the algorithm operating on input machine:

state 0 state 1 state 0 λ ∅ state 1 ∅ λ state 0 state 1 state 0 λ a state 1 ∅ λ state 0 state 1 state 0 λ a|b state 1 ∅ λ state 0 state 1 state 0 (a|b)(a(a|b))*a|λ (a|b)(a(a|b))*a(a|b)|a|b state 1 (a(a|b))*a (a(a|b))*a(a|b)|λ state 0 state 1 state 0 (a|b)(a(a|b))*a|λ|(((a|b)(a(a|b))*a(a|b)|a|b)(((a(a|b))*a(a|b)|λ)(b((a|b)(a(a|b))*a(a|b)|a|b))*(((a(a|b))*a)|b)((a|b)(a(a|b))*a|λ))((a|b)(a(a|b))*a|λ)) (a|b)(a(a|b))*a(a|b)|a|b|(((a|b)(a(a|b))*a(a|b)|a|b)(((a(a|b))*a(a|b)|λ)(b((a|b)(a(a|b))*a(a|b)|a|b))*(((a(a|b))*a)|b)((a|b)(a(a|b))*a|λ))((a|b)(a(a|b))*a(a|b)|a|b)) state 1 ((a(a|b))*a(a|b)|λ)(b((a|b)(a(a|b))*a(a|b)|a|b))*(((a(a|b))*a)|b)((a|b)(a(a|b))*a|λ) (a(a|b))*a(a|b)|λ|(((a(a|b))*a(a|b)|λ)(((a(a|b))*a(a|b)|λ)(b((a|b)(a(a|b))*a(a|b)|a|b))*(((a(a|b))*a)|b)((a|b)(a(a|b))*a|λ))((a|b)(a(a|b))*a(a|b)|a|b)) To get our final answer, we note that since state 1 is the only accepting state, T[0,1] is our answer, i.e.

*(a|b)(a(a|b))*a(a|b)|a|b|(((a|b)(a(a|b))*a(a|b)|a|b)(((a(a|b))*a(a|b)|λ)(b((a|b)(a(a|b))*a(a|b)|a|b))*(((a(a|b))*a)|b)((a|b)(a(a|b))*a|λ))((a|b)(a(a|b))*a(a|b)|a|b))**(a|b)*(a|b)*is also an equivalent regular expression, that looks pretty ugly. It is, however, equivalent. I actually checked!Now here is finally a case where you're looking at an algorithm and probably thinking "why would I believe this works?" It's not clear that the table has the desired property, i.e. that

*T[p,q]*define*{ w ∈ Σ*| (p,w) ⇒ (q,λ)}*. Really we need to prove that in the algorithm described above the following is true:if

This is not too bad to prove, but it requires the concept of inductive proof, which not all of you have seen at this point.*E*is a table such that*E*is a regular expression for_{ij}*{w ∈ Σ* | (q*then_{i},w) ⇒*_{M}(q_{j},λ)}*E*is a table such that^{new}*E*is a regular expression for^{new}_{ij}*{w ∈ Σ* | (q*where_{i},w)⇒*_{Mnew}(q_{j},λ)}*M*.^{new}= (Q,Σ,Δ∪{(p_{a},x,p_{b})},s,W)} -
set

Christopher W Brown Last modified: Fri Oct 9 11:13:28 EDT 2009