The simplest model of computation — and surprisingly powerful
Why DFAs?
Imagine you need to write a function that checks if a binary string has an even number of 1s. You'd probably use a boolean flag, flipping it on every 1. That flag is a state. You only need one bit of memory. You're already thinking like a DFA.
A DFA is the formalization of this idea: a machine that reads input one symbol at a time, left to right, transitioning between a finite number of states, and at the end either accepts or rejects. No backtracking, no memory beyond the current state, fully deterministic — for every state and symbol, there's exactly one next state.
DFAs are the foundation of the Chomsky hierarchy (a classification of language families by the type of automaton that recognizes them). They recognize exactly the regular languages — the same class described by regular expressions and NFAs.
Formal Definition
Definition. A deterministic finite automaton (DFA) is a 5-tuple \(M = (Q, \Sigma, \delta, q_0, F)\) where:
\(Q\) — a finite set of states
\(\Sigma\) — a finite alphabet (input symbols)
\(\delta: Q \times \Sigma \to Q\) — the transition function (total)
\(q_0 \in Q\) — the start state
\(F \subseteq Q\) — the set of accept states (also called final states)
Let's unpack each component:
\(Q\) — States
The machine's complete memory. A DFA in state \(q_3\) has forgotten everything about the input except whatever property \(q_3\) encodes. This is both the power (simple) and the limitation (can't count arbitrarily) of DFAs.
\(\delta\) — Transition Function
Total function. \(\delta\) must be defined for every \((q, a)\) pair — every state, every symbol. No "undefined" transitions. If a DFA diagram doesn't show a transition, there's an implicit transition to a dead state (trap state) that loops on all symbols. This is the key difference from NFAs.
In software terms: \(\delta\) is a lookup table with no null entries. It's a Map<[State, Symbol], State> that's exhaustively defined.
\(F\) — Accept States
\(F\) can be empty (\(L(M) = \emptyset\)), contain one state, or contain all states (\(L(M) = \Sigma^*\)). It's just a subset of \(Q\).
Extended Transition Function \(\hat{\delta}\)
Definition. The extended transition function \(\hat{\delta}: Q \times \Sigma^* \to Q\) processes an entire string:
$$\hat{\delta}(q, \varepsilon) = q$$
$$\hat{\delta}(q, wa) = \delta(\hat{\delta}(q, w), a)$$
where \(w \in \Sigma^*\) and \(a \in \Sigma\).
Read the recursion: to process string \(wa\), first process \(w\) (getting to some state), then take one more step on symbol \(a\). It's just "run the machine on the whole string."
Acceptance & Language
Definition. A DFA \(M\) accepts string \(w\) if \(\hat{\delta}(q_0, w) \in F\).
The language recognized by \(M\) is:
$$L(M) = \{w \in \Sigma^* \mid \hat{\delta}(q_0, w) \in F\}$$
Definition. A language \(L\) is regular if there exists a DFA \(M\) such that \(L = L(M)\).
Representations
State Diagram
Visual representation with circles (states), arrows (transitions), double circles (accept states), and an arrow from nowhere pointing to the start state.
Transition Table
For a DFA recognizing "strings over \(\{a,b\}\) containing an even number of a's":
\(\to\) marks the start state, \(*\) marks accept states. Two states, two symbols, every cell filled — that's what "total function" means.
Dead States (Trap States)
Dead state: A non-accepting state that loops to itself on every symbol. Once you enter it, you can never reach an accept state. In diagrams, dead states are often omitted for clarity — but they're formally required because \(\delta\) must be total.
When you see a diagram where some transitions seem "missing," mentally add a dead state with self-loops on all symbols. Every missing arrow goes to the dead state.
Worked Example
Let's build a DFA \(M\) that accepts binary strings containing the substring 01.
Final state \(q_2 \in F\) → ACCEPT. The string "1001" contains "01" (starting at position 2).
DFA Design Patterns
When designing DFAs, think about what each state needs to remember:
Counting mod \(k\): Use \(k\) states. E.g., "even number of 1s" needs 2 states (even/odd).
Pattern matching: States track how much of the pattern you've matched so far. E.g., "contains 01" needs 3 states (no match, seen 0, seen 01).
Last \(k\) symbols: Use \(2^k\) states (for binary). Each state remembers the last \(k\) symbols seen.
Combinations: Use pairs of states (one from each sub-DFA) to track two properties simultaneously. E.g., "even # of 0s AND contains 11" — each state is a tuple tracking both conditions. (This is formalized as the product construction in page 05.)
Key limitation: DFAs can't count unboundedly. They can count mod \(k\) (using \(k\) states), but they can't remember "how many 0s we've seen" for an arbitrary number. That's why \(\{0^n1^n\}\) isn't regular — you'd need infinitely many states to remember the exact count of 0s.
Summary
DFA in a nutshell:
5-tuple: \((Q, \Sigma, \delta, q_0, F)\)
Deterministic: exactly one transition per (state, symbol) pair
Total: \(\delta\) is defined everywhere — no missing transitions
Accepts: when final state is in \(F\) after reading entire input
Recognizes: the class of regular languages
Memory: only the current state — finite, with no auxiliary storage