03 — Nondeterministic Finite Automata (NFA)
The same power as DFAs, but dramatically easier to design
Why NFAs?

Try designing a DFA for "strings over \(\{0,1\}\) where the third-to-last symbol is 1." Go ahead. You'll need 8 states because the DFA has to remember the last 3 symbols in its state (each a 0 or 1, so \(2^3 = 8\) combinations).

Now imagine a machine that can "guess": when it reads a 1, it nondeterministically guesses "this might be the third-to-last symbol" and starts counting. If the guess was right and exactly 2 more symbols follow, it accepts. That machine needs only 4 states.

NFAs are design tools. They make constructions simpler (closures, regex conversion), and they're provably equivalent to DFAs in power — every NFA can be converted to a DFA. You never lose expressiveness by using nondeterminism; you only gain convenience.

Formal Definition
Definition. A nondeterministic finite automaton (NFA) is a 5-tuple \(N = (Q, \Sigma, \delta, q_0, F)\) where:
  1. \(Q\) — a finite set of states
  2. \(\Sigma\) — a finite alphabet
  3. \(\delta: Q \times \Sigma_\varepsilon \to \mathcal{P}(Q)\) — the transition function
  4. \(q_0 \in Q\) — the start state
  5. \(F \subseteq Q\) — the set of accept states
where \(\Sigma_\varepsilon = \Sigma \cup \{\varepsilon\}\) and \(\mathcal{P}(Q)\) is the power set of \(Q\).

Two Key Differences from DFA

Difference 1: Domain includes \(\varepsilon\).
\(\delta\) takes inputs from \(\Sigma_\varepsilon = \Sigma \cup \{\varepsilon\}\), not just \(\Sigma\). This means a state can have an \(\varepsilon\)-transition: a "free move" to another state without consuming any input symbol.

Difference 2: Returns a SET of states.
\(\delta(q, a) \subseteq Q\) (returns a subset of \(Q\)), not a single state. The set can be:

Software analogy: a DFA is a single-threaded program. An NFA is a multi-threaded program that can fork() at any point. If any thread reaches an accept state at the end of input, the whole computation accepts.

\(\varepsilon\)-Transitions

An \(\varepsilon\)-transition is a "free move" — the machine changes state without reading any input. Think of it as a goto that doesn't advance the input pointer.

\(\varepsilon\)-Closure

Definition. The \(\varepsilon\)-closure of a state \(q\), written \(E(q)\) or \(\text{ECLOSE}(q)\), is the set of all states reachable from \(q\) by following zero or more \(\varepsilon\)-transitions.

For a set of states \(S\): \(E(S) = \bigcup_{q \in S} E(q)\).

Key: \(q \in E(q)\) always — you can reach yourself with zero \(\varepsilon\)-transitions.

The NFA is "simultaneously in" all states in the \(\varepsilon\)-closure at all times. Before reading any input, before processing a transition, after every step — \(\varepsilon\)-closure is applied.

NFA Acceptance
Definition. An NFA \(N\) accepts string \(w\) if there exists at least one sequence of transitions leading from \(q_0\) to some \(q_f \in F\) after reading all of \(w\).

Equivalently: the NFA accepts if, when you explore ALL possible computation paths, at least one ends in an accept state.
NFA acceptance is existential. Even if 999 paths reject and 1 path accepts, the NFA accepts. Think of it as: "the NFA magically guesses the right path." This is nondeterminism — the machine always makes the luckiest possible choice.

DFA acceptance is straightforward — there's only one path, and it either ends in \(F\) or not.
Worked Example: NFA with \(\varepsilon\)-transitions

Here's an NFA that accepts strings over \(\{a, b\}\) that end with ab:

State diagram:

digraph { rankdir=LR; bgcolor="transparent"; node [shape=circle fontname="Helvetica" fontsize=11 width=0.4 fixedsize=true]; edge [fontname="Helvetica" fontsize=10]; __start [shape=none label="" width=0]; __start -> q0; q0 [label="q0" style=filled fillcolor="#d0e8ff"]; q1 [label="q1" style=filled fillcolor="#d0e8ff"]; q2 [label="q2" shape=doublecircle style=filled fillcolor="#c8f7c5"]; q0 -> q0 [label="a"]; q0 -> q1 [label="a"]; q0 -> q0 [label="b"]; q1 -> q2 [label="b"]; }

Transition table:

\(a\)\(b\)\(\varepsilon\)
\(\to q_0\)\(\{q_0, q_1\}\)\(\{q_0\}\)\(\emptyset\)
\(q_1\)\(\emptyset\)\(\{q_2\}\)\(\emptyset\)
\(*\; q_2\)\(\emptyset\)\(\emptyset\)\(\emptyset\)

Design insight: \(q_0\) loops on everything — it's the "waiting" state. On reading \(a\), it nondeterministically guesses "this might be the start of the final ab" and also sends a copy to \(q_1\). If a \(b\) follows in \(q_1\), we reach \(q_2\) (accept). If the guess was wrong (no \(b\) follows), that branch dies.

Now let's add \(\varepsilon\)-transitions. Consider this NFA accepting strings that either (1) end with ab OR (2) are the empty string:

State diagram:

digraph { rankdir=LR; bgcolor="transparent"; node [shape=circle fontname="Helvetica" fontsize=11 width=0.4 fixedsize=true]; edge [fontname="Helvetica" fontsize=10]; __start [shape=none label="" width=0]; __start -> qs; qs [label="qs" style=filled fillcolor="#d0e8ff"]; q0 [label="q0" style=filled fillcolor="#d0e8ff"]; q1 [label="q1" style=filled fillcolor="#d0e8ff"]; q2 [label="q2" shape=doublecircle style=filled fillcolor="#c8f7c5"]; qacc [label="q_acc" shape=doublecircle style=filled fillcolor="#c8f7c5"]; qs -> q0 [label="ε"]; qs -> qacc [label="ε"]; q0 -> q0 [label="a"]; q0 -> q1 [label="a"]; q0 -> q0 [label="b"]; q1 -> q2 [label="b"]; }

Transition table:

\(a\)\(b\)\(\varepsilon\)
\(\to q_s\)\(\emptyset\)\(\emptyset\)\(\{q_0, q_\text{acc}\}\)
\(q_0\)\(\{q_0, q_1\}\)\(\{q_0\}\)\(\emptyset\)
\(q_1\)\(\emptyset\)\(\{q_2\}\)\(\emptyset\)
\(*\; q_2\)\(\emptyset\)\(\emptyset\)\(\emptyset\)
\(*\; q_\text{acc}\)\(\emptyset\)\(\emptyset\)\(\emptyset\)

\(\varepsilon\)-closure of start: \(E(q_s) = \{q_s, q_0, q_\text{acc}\}\). Since \(q_\text{acc} \in F\), the empty string is immediately accepted.

NFA Simulator: "Ends with ab"

Watch all computation branches in parallel. Green = accept, grey = dead branch.

DFA vs NFA — Side by Side
DFANFA
Transition domain\(Q \times \Sigma\)\(Q \times \Sigma_\varepsilon\)
Transition range\(Q\) (one state)\(\mathcal{P}(Q)\) (set of states)
Missing transitionsNot allowed (total)Allowed (go to \(\emptyset\))
\(\varepsilon\)-transitionsNoYes
AcceptanceSingle pathAny path accepts
ComputationDeterministicNondeterministic
Languages recognizedIdentical — both recognize exactly the regular languages
The Equivalence Theorem
Theorem (Rabin-Scott, 1959). For every NFA \(N\), there exists a DFA \(D\) such that \(L(N) = L(D)\).

Combined with the trivial fact that every DFA is already an NFA: NFAs and DFAs recognize exactly the same class of languages.

The conversion algorithm is called subset construction (next page). The resulting DFA may have up to \(2^{|Q|}\) states — exponentially more — but it always exists.

This is a deep result. Nondeterminism (the ability to "guess" and explore multiple paths) doesn't add computational power to finite automata. It adds convenience — NFAs are easier to design and compose — but the set of recognizable languages stays the same.

Caveat: this equivalence is specific to finite automata. For other models (pushdown automata, Turing machines), nondeterminism CAN add power or at least change complexity. Don't generalize this result beyond regular languages.
Summary
NFA essentials: