05 — Regular Operations & Closure
What you can do with regular languages — and why the answer is "a lot"
Why Closure Properties?

In software engineering, when you compose two type-safe functions, the result is still type-safe. Similarly, when you combine regular languages using certain operations, the result is always regular. These are closure properties — the class of regular languages is "closed" under these operations.

This is powerful for two reasons:

  1. Building complex recognizers: Design simple DFAs/NFAs for simple properties, then combine them mechanically.
  2. Proving regularity: If you can express a language as combinations of known regular languages using closed operations, it's regular — no explicit DFA needed.
The Three Regular Operations
Definition. Let \(A\) and \(B\) be languages over \(\Sigma\).

Note on Kleene star: \(k = 0\) means the concatenation of zero strings, which is \(\varepsilon\). So \(\varepsilon \in A^*\) always, even if \(\varepsilon \notin A\), even if \(A = \emptyset\).

Software analogy:

Closure Proofs via NFA Constructions

The easiest way to prove closure is: given NFAs \(N_1\) and \(N_2\), construct a new NFA for the combined language. Since every NFA has an equivalent DFA, the result is regular.

Union: \(L(N_1) \cup L(N_2)\)

Idea: Create a new start state that \(\varepsilon\)-transitions to both \(N_1\)'s and \(N_2\)'s start states. The machine nondeterministically runs both in parallel.

digraph { rankdir=LR; bgcolor="transparent"; node [fontname="Helvetica" fontsize=10]; edge [fontname="Helvetica" fontsize=9]; __start [shape=none label="" width=0]; __start -> q_new; q_new [shape=circle label="q_new" style=filled fillcolor="#d0e8ff" width=0.5]; subgraph cluster_N1 { style=dashed; color="#9a9590"; label="N₁"; fontname="Helvetica"; fontsize=9; labeljust=l; s1 [shape=circle label="start₁" style=filled fillcolor="#d0e8ff" width=0.5]; f1 [shape=doublecircle label="F₁" style=filled fillcolor="#c8f7c5" width=0.5]; s1 -> f1 [label="..."]; } subgraph cluster_N2 { style=dashed; color="#9a9590"; label="N₂"; fontname="Helvetica"; fontsize=9; labeljust=l; s2 [shape=circle label="start₂" style=filled fillcolor="#d0e8ff" width=0.5]; f2 [shape=doublecircle label="F₂" style=filled fillcolor="#c8f7c5" width=0.5]; s2 -> f2 [label="..."]; } q_new -> s1 [label="ε"]; q_new -> s2 [label="ε"]; }

Accept states: F₁ ∪ F₂ (keep both machines' accept states)

Construction: Given \(N_1 = (Q_1, \Sigma, \delta_1, q_1, F_1)\) and \(N_2 = (Q_2, \Sigma, \delta_2, q_2, F_2)\), build \(N = (Q, \Sigma, \delta, q_0, F)\):
  • \(Q = Q_1 \cup Q_2 \cup \{q_0\}\) (fresh start state)
  • \(\delta(q_0, \varepsilon) = \{q_1, q_2\}\), \(\delta(q_0, a) = \emptyset\) for \(a \in \Sigma\)
  • \(\delta(q, a) = \delta_1(q, a)\) if \(q \in Q_1\), \(\delta(q, a) = \delta_2(q, a)\) if \(q \in Q_2\)
  • \(F = F_1 \cup F_2\)

Concatenation: \(L(N_1) \circ L(N_2)\)

Idea: Run \(N_1\). Whenever \(N_1\) reaches an accept state, nondeterministically jump to \(N_2\)'s start. Accept only when \(N_2\) accepts.

digraph { rankdir=LR; bgcolor="transparent"; node [fontname="Helvetica" fontsize=10]; edge [fontname="Helvetica" fontsize=9]; subgraph cluster_N1 { style=dashed; color="#9a9590"; label="N₁"; fontname="Helvetica"; fontsize=9; labeljust=l; s1 [shape=circle label="start₁" style=filled fillcolor="#d0e8ff" width=0.5]; f1a [shape=circle label="f₁" style=filled fillcolor="#e0e0e0" width=0.4]; f1b [shape=circle label="f₂" style=filled fillcolor="#e0e0e0" width=0.4]; s1 -> f1a [label="..."]; s1 -> f1b [label="..." style=dashed]; } subgraph cluster_N2 { style=dashed; color="#9a9590"; label="N₂"; fontname="Helvetica"; fontsize=9; labeljust=l; s2 [shape=circle label="start₂" style=filled fillcolor="#d0e8ff" width=0.5]; f2 [shape=doublecircle label="F₂" style=filled fillcolor="#c8f7c5" width=0.5]; s2 -> f2 [label="..."]; } __start [shape=none label="" width=0]; __start -> s1; f1a -> s2 [label="ε"]; f1b -> s2 [label="ε"]; }

ε-transitions from every N₁ accept state to N₂'s start. N₁'s accept states lose their accepting status. Only N₂'s accept states accept.

Construction:
  • Start state: \(q_1\) (N₁'s start)
  • For each \(f \in F_1\): add \(\varepsilon\)-transition to \(q_2\) (N₂'s start)
  • \(F = F_2\) only (N₁'s accept states lose their accepting status)
Subtlety: N₁'s former accept states still function as regular states — they just aren't in \(F\) anymore. The \(\varepsilon\)-transitions from them to N₂ don't remove their existing transitions. A string can still pass through them as intermediate states in N₁.

Kleene Star: \(L(N_1)^*\)

Idea: Add a new accepting start state. Add \(\varepsilon\)-transitions from all accept states back to N₁'s original start. This allows: accepting \(\varepsilon\) (new start is accepting), running N₁ once, or looping back to run it again.

digraph { rankdir=LR; bgcolor="transparent"; node [fontname="Helvetica" fontsize=10]; edge [fontname="Helvetica" fontsize=9]; __start [shape=none label="" width=0]; __start -> q_new; q_new [shape=doublecircle label="q_new" style=filled fillcolor="#c8f7c5" width=0.5]; subgraph cluster_N1 { style=dashed; color="#9a9590"; label="N₁"; fontname="Helvetica"; fontsize=9; labeljust=l; s1 [shape=circle label="start₁" style=filled fillcolor="#d0e8ff" width=0.5]; f1a [shape=doublecircle label="f₁" style=filled fillcolor="#c8f7c5" width=0.4]; f1b [shape=doublecircle label="f₂" style=filled fillcolor="#c8f7c5" width=0.4]; s1 -> f1a [label="..."]; s1 -> f1b [label="..." style=dashed]; } q_new -> s1 [label="ε"]; f1a -> s1 [label="ε"]; f1b -> s1 [label="ε"]; }

New start q_new is ACCEPTING (handles ε ∈ L*). ε from q_new to start₁. ε from every f ∈ F₁ back to start₁. Accept states: {q_new} ∪ F₁.

Construction:
  • New start state \(q_0\) with \(\delta(q_0, \varepsilon) = \{q_1\}\)
  • \(q_0 \in F\) (so \(\varepsilon\) is accepted)
  • For each \(f \in F_1\): add \(\varepsilon\)-transition to \(q_1\) (loop back)
  • \(F = \{q_0\} \cup F_1\)
Why a new start state? If we just made N₁'s start accepting and added \(\varepsilon\)-loops from accept states back to it, we might introduce unwanted paths. The fresh \(q_0\) cleanly separates "haven't started yet" from "looping."
Product Construction (DFA-based)

For union, intersection, and set difference, there's a DFA-based alternative that's often more intuitive: the product construction. It runs two DFAs simultaneously using tuple states.

Product Construction. Given DFAs \(D_1 = (Q_1, \Sigma, \delta_1, s_1, F_1)\) and \(D_2 = (Q_2, \Sigma, \delta_2, s_2, F_2)\), construct \(D = (Q, \Sigma, \delta, s, F)\):

Software analogy: the product construction is like running two state machines in parallel and combining their accept conditions with AND/OR. It's a zip operation on automata.

Complement
Closure under complement. If \(L\) is regular (recognized by DFA \(D = (Q, \Sigma, \delta, q_0, F)\)), then \(\overline{L} = \Sigma^* \setminus L\) is regular — just flip accept and non-accept states: $$D' = (Q, \Sigma, \delta, q_0, Q \setminus F)$$
Warning: Complement only works cleanly on DFAs, not NFAs! Flipping accept states on an NFA does NOT give the complement. Why? Because NFA acceptance is existential ("at least one path accepts"). Flipping states would make it "at least one path rejects," which is NOT the same as "all paths reject." Always convert to DFA first, then complement.
Closure Properties — Quick Reference
OperationRegular?Proof Method
Union \(A \cup B\)YesNFA (new start + \(\varepsilon\)) or product DFA
Intersection \(A \cap B\)YesProduct DFA or De Morgan: \(\overline{\overline{A} \cup \overline{B}}\)
Complement \(\overline{A}\)YesDFA: flip accept states
Concatenation \(A \circ B\)YesNFA: \(\varepsilon\) from \(F_1\) to \(q_2\)
Kleene star \(A^*\)YesNFA: new start + \(\varepsilon\)-loops
Difference \(A \setminus B\)YesProduct DFA or \(A \cap \overline{B}\)
Reversal \(A^R\)YesReverse all arrows, swap start/accept (see below)
Symmetric diff \(A \triangle B\)Yes\((A \setminus B) \cup (B \setminus A)\)
Worked Example: Product Construction

Build a DFA for \(L = \{w \in \{0,1\}^* \mid w \text{ has even # of 0s AND odd # of 1s}\}\) using intersection.

DFA \(D_1\): even number of 0s

State diagram:

digraph { rankdir=LR; bgcolor="transparent"; node [shape=circle fontname="Helvetica" fontsize=11 width=0.4 fixedsize=true]; edge [fontname="Helvetica" fontsize=10]; __start [shape=none label="" width=0]; __start -> e0; e0 [label="e0" shape=doublecircle style=filled fillcolor="#c8f7c5"]; o0 [label="o0" style=filled fillcolor="#d0e8ff"]; e0 -> o0 [label="0"]; e0 -> e0 [label="1"]; o0 -> e0 [label="0"]; o0 -> o0 [label="1"]; }

Transition table:

01
\(\to *\; e_0\)\(o_0\)\(e_0\)
\(o_0\)\(e_0\)\(o_0\)

DFA \(D_2\): odd number of 1s

State diagram:

digraph { rankdir=LR; bgcolor="transparent"; node [shape=circle fontname="Helvetica" fontsize=11 width=0.4 fixedsize=true]; edge [fontname="Helvetica" fontsize=10]; __start [shape=none label="" width=0]; __start -> e1; e1 [label="e1" style=filled fillcolor="#d0e8ff"]; o1 [label="o1" shape=doublecircle style=filled fillcolor="#c8f7c5"]; e1 -> e1 [label="0"]; e1 -> o1 [label="1"]; o1 -> o1 [label="0"]; o1 -> e1 [label="1"]; }

Transition table:

01
\(\to e_1\)\(e_1\)\(o_1\)
\(*\; o_1\)\(o_1\)\(e_1\)

Product DFA (intersection: both must accept):

State diagram:

digraph { rankdir=LR; bgcolor="transparent"; node [shape=circle fontname="Helvetica" fontsize=11 width=0.5 fixedsize=true]; edge [fontname="Helvetica" fontsize=10]; __start [shape=none label="" width=0]; __start -> ee; ee [label="(e0,e1)" style=filled fillcolor="#d0e8ff"]; oe [label="(o0,e1)" style=filled fillcolor="#d0e8ff"]; eo [label="(e0,o1)" shape=doublecircle style=filled fillcolor="#c8f7c5"]; oo [label="(o0,o1)" style=filled fillcolor="#d0e8ff"]; ee -> oe [label="0"]; ee -> eo [label="1"]; oe -> ee [label="0"]; oe -> oo [label="1"]; eo -> oo [label="0"]; eo -> ee [label="1"]; oo -> eo [label="0"]; oo -> oe [label="1"]; }

Transition table:

01Accept?
\(\to (e_0, e_1)\)\((o_0, e_1)\)\((e_0, o_1)\)No
\((o_0, e_1)\)\((e_0, e_1)\)\((o_0, o_1)\)No
\(*\; (e_0, o_1)\)\((o_0, o_1)\)\((e_0, e_1)\)Yes
\((o_0, o_1)\)\((e_0, o_1)\)\((o_0, e_1)\)No

Accept states for intersection: \(\{(q_1,q_2) \mid q_1 \in F_1 \text{ AND } q_2 \in F_2\}\). Here \(F_1 = \{e_0\}\) (even 0s) and \(F_2 = \{o_1\}\) (odd 1s), so the only accept state is \((e_0, o_1)\).

Trace "001":
  1. Start: \((e_0, e_1)\)
  2. Read 0: \(\delta((e_0,e_1), 0) = (\delta_1(e_0,0), \delta_2(e_1,0)) = (o_0, e_1)\)
  3. Read 0: \(\delta((o_0,e_1), 0) = (e_0, e_1)\)
  4. Read 1: \(\delta((e_0,e_1), 1) = (e_0, o_1)\)
Final state: \((e_0, o_1)\) — ACCEPT. Two 0s (even ✓) and one 1 (odd ✓).
Common mistake: Confusing which component DFA's accept state is which. \(e_0\) means even 0s (accept for \(D_1\)), \(o_1\) means odd 1s (accept for \(D_2\)). Be meticulous — the construction is mechanical, errors come from misidentifying \(F_1\) and \(F_2\).
Reversal Construction

Given a regular language \(A\), the reversal \(A^R = \{w^R \mid w \in A\}\) is also regular.

Construction: Given an NFA \(N\) for \(A\), build an NFA \(N^R\) for \(A^R\):

  1. Reverse every arrow: if \(N\) has a transition \(q_i \xrightarrow{a} q_j\), then \(N^R\) has \(q_j \xrightarrow{a} q_i\).
  2. The old accept states \(F\) become the new start states. (Since NFAs have a single start state, add a new start state \(q_s'\) with \(\varepsilon\)-transitions to every state in \(F\).)
  3. The old start state \(q_0\) becomes the sole accept state.
Intuition: If \(N\) accepts a string by following a path from \(q_0\) to some \(f \in F\), then \(N^R\) follows the same path backwards — from \(f\) to \(q_0\) — accepting the reversed string.
Summary
Regular languages are closed under: Two proof toolkits: Key trap: Complement only works on DFAs, not NFAs. Convert first.