Algorithm: Non-Deterministic (w/o λ-transitions) to Deterministic FA
Input: M = (Q,Σ,Δ,s,W), where Δ contains no λ-transitions Output: M' = (2Q,Σ,δ, {s}, {R ∈2Q | R ∩ W ≠ ∅ }), where
δ : 2Q x Σ → 2Q δ(H,x) = { q ∈ Q | (p,x,q) ∈ Δ for some p ∈ H }
For further practice, run the above algorithm on the machine M = ({q0,q1},{a,b,c},{(q0,a,q0},(q0,c,q0),(q0,c,q1),(q1,a,q1),(q1,b,q1)},q0, {q1}).
| all paths in $M$ |
$(q_0,babc)$ | $\Rightarrow_M$ | $(q_0,abc)$ | $\Rightarrow_M$ | $(q_0,bc)$ | $\Rightarrow_M$ | $(q_0,c)$ | $\times$ | ||
| $(q_0,babc)$ | $\Rightarrow_M$ | $(q_0,abc)$ | $\Rightarrow_M$ | $(q_0,bc)$ | $\Rightarrow_M$ | $(q_1,c)$ | $\Rightarrow_M$ | $(q_1,\lambda)$ | ||
| $(q_0,babc)$ | $\Rightarrow_M$ | $(q_1,abc)$ | $\times$ | |||||||
|
reachable states |
$\{q_0\}$ | $\{q_0,q_1\}$ | $\{q_0\}$ | $\{q_0,q_1\}$ | $\{q_1\}$ | |||||
| path in $M'$ |
$(\{q_0\},babc)$ | $\Rightarrow_M'$ | $(\{q_0,q_1\},abc)$ | $\Rightarrow_M'$ | $(\{q_0\},bc)$ | $\Rightarrow_M'$ | $(\{q_0,q_1\},c)$ | $\Rightarrow_M'$ | $(\{q_1\},\lambda)$ |
Hopefully that makes clear what I meant by saying that a computation in $M'$ for $w$ is a summary of all possible computations in $M$ for $w$. If we can establish that this is indeed always the case, then it's clear that DFA $M'$ and NDFA $M$ accept exactly the same strings — i.e. that the conversion algorithm is correct. Formally, what we'd like to prove is the following theorem:
Theorem: If $M = (Q,\Sigma,\Delta,s,W)$ is an NDFA w/o $\lambda$-transitions, and $M'$ is the DFA resulting from the conversion algorithm, then for any string $w = w_1\cdots w_n \in \Sigma$ $$ (\{s\},w) \Rightarrow_{M'}^* (H,\lambda) \text{ if and only if } H = \{ q\in Q\ |\ (s,w) \Rightarrow_M^* (q,\lambda) \}. $$
Proof:
This really requires proof by induction — induction on
$n$, which is the length of $w$.
For the moment I'm not going to include this proof
Now, this idea works fine if there are not, in turn, λ-transitions out of q. The question is, if a machine has λ-transitions, can we always find at least one λ-transition to a state that has no λ-transitions? Then we could keep eliminating λ-transitions of that type until there weren't any at all. Unfortunately, there is one way that we could be unable to find an λ-transition to a state out of which there are no λ-transitions: What if there is a loop (or in graph parlance a cycle) of λ-transitions? So, first we need to get rid of any cycles of λ-transitions. Well, if there is a cycle of λ-transitions, the states in the cycle all form one big "super state", because whenever you're in one of the states you can get to any other without reading any input. So if there's a cycle, we'll merge all the states along the cycle into one state, wiping out all the λ-transitions between them in the process. This can simply be repeated until we finally arrive at a machine that has no cycles.
Algorithm: Lose λ-transition
Input: NDFA M = (Q,Σ,Δ,s,W) and (p,λ,q) ∈ Δ, where there are no λ-transitions out of q.
Output: NDFA M' such that $L(M) = L(M')$ and $M'$ has one fewer $\lambda$-transitions than $M$
M' = (Q,Σ,Δ',s,W'), where
Δ' = Δ - {(p,λ,q)} ∪ { (p,x,r) ∈ QxΣxQ | (q,x,r) ∈ Δ }
W' = W ∪ {p} if q ∈ W, and W otherwise
Claim:
Algorithm Lose λ-transition meets its
specification, i.e.
L(M') = L(M) and there is one fewer
λ-transition in M' than
in M.
Proof:
The fact that there is one fewer λ-transition in M' than
in M is obvious, so we concentrate on proving that L(M') = L(M).
We need to prove that for any string $w$,
$w \in L(M)$ if and only if $w \in L(M')$. This means that
$M$ and $M'$ accept the same strings and reject the same strings.
The only problem that remains is what to do if there are $\lambda$-transitions, but all of them into states that themselves have outgoing $\lambda$-transions. In this case, we can't apply the above algorithm. The thing to note is that this can only happen in the presence of cycles of $\lambda$-transitions. The next algorithm removes such cycles by merging all states in such a cycle into a single state.
Algorithm: Lose λ-cycle Input: NDFA M = (Q,Σ,Δ,s,W) and states p1,p2,...,pk comprising a cycle of 2 or more λ-transitions in MClaim: Algorithm: Lose λ-cycle meets its specification, i.e. L(M') = L(M) and there are fewer λ-transitions in M' than in M.
Output: NDFA M' such that $L(M') = L(M)$ and $M'$ has fewer $\lambda$-transitions than $M$
M' = (Q',Σ,Δ',s',W') where
Q' = Q - {p1,...,pk} ∪ {D}, where D is a new state i.e. D ∉ Q
s' = D if s ∈ {p1,...,pk}, and s otherwise
Δ' = Δ - {p1,...,pk}x(Σ ∪ {λ})xQ remove transitions out of the cycle - Qx(Σ ∪ {λ})x{p1,...,pk} remove transitions into the cycle ∪ {(D,x,q) ∈ {D}x(Σ ∪ {λ})xQ' | for some pi, (pi,x,q) ∈ Δ} add transitions out of D ∪ {(q,x,D) ∈ Q'x(Σ ∪ {λ})x{D} | for some pi, (q,x,pi) ∈ Δ} add transitions into D ∪ {(D,x,D) ∈ {D}xΣx{D} | for some pi and pj, (pi,x,pj) ∈ Δ} add D's loops
W' = W - {p1,...,pk} ∪ D if W ∩ {p1,...,pk} = ∅, and W otherwise
The deterministic finite automaton that results from this process accepts the same language as the original nondeterministic machine!