Automata, Computability and Complexity
CSE2315 — The study of what computers can and cannot do

Course background — why automata theory exists

Act I — Regular Languages

W1 01 — Alphabets, Strings & Languages

The type system — defines what we're computing over.

Before you can talk about machines, you need to define their inputs and outputs. An alphabet \(\Sigma\) gives you symbols. Strings are sequences of those symbols. Languages are sets of strings — the "yes" answers to a decision problem. Every machine in this course takes a string as input and answers: is it in the language or not?
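
These definitions can be sketched directly in code (a toy encoding of my own, not from the course):

```python
from itertools import product

SIGMA = {"0", "1"}                          # alphabet: a finite set of symbols

def strings_of_length(n, sigma=SIGMA):
    """All strings over sigma of exactly length n (there are |sigma|**n)."""
    return {"".join(t) for t in product(sorted(sigma), repeat=n)}

# A language is a set of strings -- the "yes" instances of a decision
# problem. Example: binary strings with an even number of 1s.
def in_language(w):
    return w.count("1") % 2 == 0

members = {w for w in strings_of_length(3) if in_language(w)}
```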

W1 02 — Deterministic Finite Automata (DFA)

The simplest machine — reads left to right, no memory beyond its current state, fully deterministic.

A DFA is the formalization of "a machine with finite memory." It reads input one symbol at a time, transitions between states, and accepts or rejects at the end. No backtracking, no scratch space — just a state. The languages DFAs recognize are called regular languages. The surprise: this simple model handles pattern matching, modular counting, and protocol checking.
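
The whole model fits in a few lines. A minimal simulator sketch (the state names and example language are my own):

```python
# This DFA tracks one bit of state -- the parity of 1s seen so far --
# and accepts binary strings with an even number of 1s.
DFA = {
    "start": "even",
    "accept": {"even"},
    "delta": {
        ("even", "0"): "even", ("even", "1"): "odd",
        ("odd",  "0"): "odd",  ("odd",  "1"): "even",
    },
}

def dfa_accepts(dfa, w):
    state = dfa["start"]
    for symbol in w:                        # left to right, one symbol at a time
        state = dfa["delta"][(state, symbol)]
    return state in dfa["accept"]           # decide only when input is exhausted
```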

W1 03 — Nondeterministic Finite Automata (NFA)

Same power as DFAs, but wildly easier to design.

NFAs allow multiple transitions from one state on the same symbol, plus "free" \(\varepsilon\)-moves. The machine explores all paths simultaneously and accepts if any path reaches an accept state. Despite seeming more powerful, NFAs recognize exactly the same languages as DFAs — but they can be exponentially more compact.
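
Simulating an NFA means tracking the set of states it could be in. A sketch (my own encoding), with an example NFA for binary strings ending in "01":

```python
NFA = {
    "start": "q0",
    "accept": {"q2"},
    "eps": {},                               # epsilon-moves: state -> set of states
    "delta": {("q0", "0"): {"q0", "q1"},     # guess: is this 0 the start of "01"?
              ("q0", "1"): {"q0"},
              ("q1", "1"): {"q2"}},
}

def eps_closure(states, eps):
    """All states reachable by zero or more epsilon-moves."""
    stack, seen = list(states), set(states)
    while stack:
        q = stack.pop()
        for r in eps.get(q, ()):
            if r not in seen:
                seen.add(r); stack.append(r)
    return seen

def nfa_accepts(nfa, w):
    current = eps_closure({nfa["start"]}, nfa["eps"])
    for symbol in w:
        nxt = set()
        for q in current:                    # follow every possible transition
            nxt |= nfa["delta"].get((q, symbol), set())
        current = eps_closure(nxt, nfa["eps"])
    return bool(current & nfa["accept"])     # accept if ANY path accepts
```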

W1 04 — Subset Construction

The proof that NFA = DFA — converts any NFA into an equivalent DFA.

The subset construction algorithm takes an NFA and produces a DFA where each state represents a set of NFA states. This constructive proof is the backbone of automata theory: it lets us use NFAs for easy design, then convert to DFAs when we need determinism. The cost: potentially \(2^n\) states, but often far fewer in practice.
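
The algorithm itself is short. A sketch that, as is standard, only builds the reachable subsets (\(\varepsilon\)-moves omitted for brevity; the full version closes each subset under \(\varepsilon\) first):

```python
from collections import deque

# Example NFA: binary strings ending in "01".
NFA = {"start": "q0", "accept": {"q2"},
       "delta": {("q0", "0"): {"q0", "q1"}, ("q0", "1"): {"q0"},
                 ("q1", "1"): {"q2"}}}

def subset_construction(nfa, sigma):
    """Each DFA state is a frozenset of NFA states."""
    start = frozenset({nfa["start"]})
    delta, seen, queue = {}, {start}, deque([start])
    while queue:                             # explore only REACHABLE subsets
        S = queue.popleft()
        for a in sigma:
            T = frozenset(r for q in S for r in nfa["delta"].get((q, a), ()))
            delta[(S, a)] = T
            if T not in seen:
                seen.add(T); queue.append(T)
    accept = {S for S in seen if S & nfa["accept"]}
    return {"start": start, "accept": accept, "delta": delta, "states": seen}

dfa = subset_construction(NFA, {"0", "1"})
```

Of the \(2^3 = 8\) possible subsets, only 3 are reachable here — "often far fewer in practice" in action.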

W2 05 — Regular Operations & Closure Properties

What you can build by combining regular languages — the engineering toolkit.

Regular languages are closed under union, intersection, complement, concatenation, Kleene star, reversal, and more. Two proof toolkits: NFA constructions (union, concat, star — wire up \(\varepsilon\)-transitions) and product construction (intersection, difference — run two DFAs in parallel). Closure is powerful: prove one language is regular, then derive others for free.
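
The product construction can be run rather than drawn — simulate both DFAs in parallel and combine their verdicts at the end (a sketch; both example DFAs are my own):

```python
EVEN_ONES = {"start": 0, "accept": {0},       # even number of 1s
             "delta": {(0, "0"): 0, (0, "1"): 1, (1, "0"): 1, (1, "1"): 0}}
LEN_MOD_3 = {"start": 0, "accept": {0},       # length divisible by 3
             "delta": {(q, a): (q + 1) % 3 for q in range(3) for a in "01"}}

def product_accepts(d1, d2, w, mode="intersection"):
    s1, s2 = d1["start"], d2["start"]
    for a in w:                               # one step in EACH machine per symbol
        s1, s2 = d1["delta"][(s1, a)], d2["delta"][(s2, a)]
    in1, in2 = s1 in d1["accept"], s2 in d2["accept"]
    return (in1 and in2) if mode == "intersection" else (in1 or in2)
```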

W2 06 — Regular Expressions

The declarative notation — describes regular languages without drawing machines.

Regular expressions are a recursive syntax for building languages from base cases (\(a\), \(\varepsilon\), \(\emptyset\)) and operations (union, concatenation, star). They're equivalent to DFAs and NFAs — the equivalence triangle. Thompson's construction converts regex → NFA. State elimination (GNFA) converts DFA → regex. Three completely different notations, same class of languages.
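
The recursive definition executes directly if each language is enumerated up to a length bound (a sketch with my own AST encoding):

```python
def lang(regex, n):
    """Strings of length <= n in the language of a small regex AST:
    ('sym', a) | ('eps',) | ('empty',) | ('union', r, s) |
    ('concat', r, s) | ('star', r). Each case mirrors the definition."""
    kind = regex[0]
    if kind == "sym":    return {regex[1]} if len(regex[1]) <= n else set()
    if kind == "eps":    return {""}
    if kind == "empty":  return set()
    if kind == "union":  return lang(regex[1], n) | lang(regex[2], n)
    if kind == "concat":
        return {u + v for u in lang(regex[1], n)
                      for v in lang(regex[2], n) if len(u + v) <= n}
    if kind == "star":                       # iterate concatenation to a fixpoint
        out, frontier = {""}, {""}
        base = lang(regex[1], n)
        while frontier:
            frontier = {u + v for u in frontier for v in base
                        if v and len(u + v) <= n} - out
            out |= frontier
        return out

ANY = ("star", ("union", ("sym", "0"), ("sym", "1")))   # (0|1)*
```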

W2 07 — GNFA & State Elimination

Closes the equivalence triangle — converts any DFA/NFA back to a regex.

A GNFA is an NFA with regex-labeled transitions. State elimination removes states one by one, rerouting paths through the eliminated state using a formula that combines the original regex labels. When only the start and accept states remain, the single transition between them is the regex for the language. Think of it as "bulldozing a city and rerouting all highways."
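
The bulldozing is mechanical. A sketch of state elimination over regex strings (my own encoding; a missing edge means \(\emptyset\)), rerouting each detour through the eliminated state:

```python
def union(r, s):
    return s if r is None else r if s is None else f"({r})|({s})"

def concat(r, s):
    if r is None or s is None: return None   # anything . emptyset = emptyset
    return s if r == "" else r if s == "" else f"({r})({s})"

def star(r):
    return "" if r in (None, "") else f"({r})*"

def state_eliminate(edges, states, start, accept):
    """edges: {(i, j): regex string}. Eliminating q applies
    R[i,j] := R[i,j] | R[i,q] (R[q,q])* R[q,j] for every i, j pair."""
    for q in [s for s in states if s not in (start, accept)]:
        loop = star(edges.get((q, q)))
        ins  = [(i, r) for (i, j), r in edges.items() if j == q and i != q]
        outs = [(j, r) for (i, j), r in edges.items() if i == q and j != q]
        for i, rin in ins:
            for j, rout in outs:             # reroute each detour through q
                edges[(i, j)] = union(edges.get((i, j)),
                                      concat(concat(rin, loop), rout))
        edges = {k: v for k, v in edges.items() if q not in k}
    return edges.get((start, accept))        # the single surviving label

# GNFA for "any number of a's": start -eps-> A, A -a-> A, A -eps-> accept
regex = state_eliminate({("S", "A"): "", ("A", "A"): "a", ("A", "F"): ""},
                        ["S", "A", "F"], "S", "F")
```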

W2 08 — The Pumping Lemma

The impossibility tool — proves a language is NOT regular.

Everything so far showed what regular languages can do. The pumping lemma shows what they can't. If a language is regular, every sufficiently long string can be "pumped" (a middle section repeated any number of times) and stay in the language. To prove non-regularity, play an adversarial game: find a string where no valid pump works. This is the bridge to Act II — once you hit something DFAs can't do, you need a more powerful machine.
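
The lemma's positive half is constructive: run the DFA and, by pigeonhole, a state repeats within the first \(p\) symbols — the loop between the repeats is \(y\). A sketch (the example DFA is my own):

```python
DFA = {"start": "even", "accept": {"even"},   # even number of 1s
       "delta": {("even", "0"): "even", ("even", "1"): "odd",
                 ("odd",  "0"): "odd",  ("odd",  "1"): "even"}}

def accepts(dfa, w):
    state = dfa["start"]
    for a in w:
        state = dfa["delta"][(state, a)]
    return state in dfa["accept"]

def pump_decomposition(dfa, s):
    """Find s = xyz with |xy| <= #states, |y| >= 1, via the first
    repeated state. Repeating (or dropping) y revisits the same loop,
    so it cannot change the final verdict."""
    state, seen = dfa["start"], {dfa["start"]: 0}
    for k, a in enumerate(s, 1):
        state = dfa["delta"][(state, a)]
        if state in seen:                    # pigeonhole: the loop is y
            i = seen[state]
            return s[:i], s[i:k], s[k:]
        seen[state] = k
    raise ValueError("need len(s) >= number of states")

x, y, z = pump_decomposition(DFA, "1101")
```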

Act II — Context-Free Languages & Turing Machines

W3 09 — Context-Free Grammars (CFG)

Recursive rules for generating languages — handles nested structure.

CFGs use production rules to generate strings — variables rewrite into combinations of variables and terminals. They capture recursive/nested structure that regular languages can't: matched parentheses, palindromes, \(\{0^n1^n\}\). Covers the 4-tuple definition, derivations (\(\Rightarrow\) and \(\overset{*}{\Rightarrow}\)), parse trees, leftmost/rightmost derivations, ambiguity, and Chomsky Normal Form. Plus: every regular language is context-free (DFA → CFG conversion).
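
Derivations can be mechanized by always rewriting the leftmost variable (a sketch with my own encoding; the variables are exactly the rule keys):

```python
from collections import deque

RULES = {"S": [("0", "S", "1"), ()]}         # S -> 0 S 1 | epsilon: {0^n 1^n}

def generate(rules, start, max_len):
    """All terminal strings of length <= max_len derivable from start."""
    results, queue, seen = set(), deque([(start,)]), {(start,)}
    while queue:
        form = queue.popleft()               # a sentential form
        idx = next((i for i, s in enumerate(form) if s in rules), None)
        if idx is None:                      # no variables left: terminal string
            results.add("".join(form))
            continue
        for rhs in rules[form[idx]]:         # one leftmost derivation step
            new = form[:idx] + tuple(rhs) + form[idx + 1:]
            # prune forms whose terminals alone already exceed max_len
            if sum(len(s) for s in new if s not in rules) <= max_len \
                    and new not in seen:
                seen.add(new); queue.append(new)
    return results
```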

W3 10 — Pushdown Automata (PDA)

NFA + stack — the machine model for context-free languages.

A PDA is an NFA with a stack for memory. The 6-tuple \((Q, \Sigma, \Gamma, \delta, q_0, F)\) adds a stack alphabet and transitions that read/pop/push simultaneously. The stack lets it count and match — push for \(0\)'s, pop for \(1\)'s — which is exactly what's needed for nested structure. Critical insight: nondeterminism genuinely adds power here (unlike DFA vs. NFA). PDAs recognize exactly the context-free languages — the equivalence theorem CFG = PDA. Covers both construction directions, acceptance modes, and a full DFA/NFA/PDA comparison table.
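
The push/pop discipline for \(\{0^n1^n\}\) in executable form (a sketch — this particular language happens to be deterministic, so one stack and no guessing suffice; a general PDA simulator would also branch over configurations):

```python
def zeros_then_ones(w):
    """Membership in {0^n 1^n : n >= 0} via a stack: push a marker for
    each 0, pop one for each 1."""
    stack, seen_one = [], False
    for a in w:
        if a == "0":
            if seen_one:                     # a 0 after some 1: wrong shape
                return False
            stack.append("$")                # push for each 0
        else:                                # a == "1"
            seen_one = True
            if not stack:                    # pop for each 1; empty = too many 1s
                return False
            stack.pop()
    return not stack                         # accept iff every 0 was matched
```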

W3 11 — CF Pumping Lemma

The impossibility tool for context-free languages.

Same spirit as the regular pumping lemma, but applied to parse trees instead of DFA states. Long strings force tall parse trees, tall trees force repeated variables (pigeonhole on CNF), and repeated variables give pumpable "wings" — two pieces \(v\) and \(y\) that pump simultaneously. The 5-part decomposition \(s = uvxyz\) with \(|vxy| \leq p\) and \(|vy| \geq 1\) (the window can be anywhere, not just the start). Proves \(\{a^n b^n c^n\}\) and \(\{ww\}\) aren't context-free. Case analysis is harder here because the adversary has more freedom.
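
The adversarial game can be brute-forced for one concrete \(p\) (illustration only — a real proof argues for all \(p\)): try every decomposition and see whether any survives pumping.

```python
def in_anbncn(w):
    """Membership in {a^n b^n c^n}."""
    n = len(w) // 3
    return w == "a" * n + "b" * n + "c" * n

def some_decomposition_pumps(s, p, in_lang):
    """Does SOME s = uvxyz with |vxy| <= p, |vy| >= 1 keep
    u v^i x y^i z in the language for i = 0, 1, 2?"""
    n = len(s)
    for a in range(n + 1):                              # s[:a]  = u
        for b in range(a, min(a + p, n) + 1):           # s[a:b] = v
            for c in range(b, min(a + p, n) + 1):       # s[b:c] = x
                for d in range(c, min(a + p, n) + 1):   # s[c:d] = y
                    if (b - a) + (d - c) == 0:
                        continue                        # need |vy| >= 1
                    u, v, x, y, z = s[:a], s[a:b], s[b:c], s[c:d], s[d:]
                    if all(in_lang(u + v * i + x + y * i + z)
                           for i in range(3)):
                        return True          # this decomposition survives
    return False                             # NO window pumps: lemma violated
```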

W4 12 — Turing Machines

The universal model of computation — infinite tape, read/write head.

A TM has an infinite tape it can read, write, and move on in both directions. This is the most powerful model in the course — the 7-tuple \((Q, \Sigma, \Gamma, \delta, q_0, q_{\text{accept}}, q_{\text{reject}})\) adds a tape alphabet and separate accept/reject states. Three possible outcomes: accept, reject, or loop forever — creating the recognizable vs. decidable distinction. Variants (multitape, nondeterministic) are all equivalent in power. The Church-Turing thesis: anything we'd intuitively call "computable" can be computed by a TM. Covers formal definitions, configurations, TM design patterns (zig-zag marking, sweeping), enumerators, and the complete DFA/NFA/PDA/TM comparison.
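
The zig-zag marking pattern is concrete enough to run. A single-tape simulator sketch (my own encoding; missing transitions reject) with a TM for \(\{0^n1^n\}\):

```python
def run_tm(delta, w, start="q1", accept="acc", reject="rej",
           blank="_", max_steps=10_000):
    """delta: {(state, symbol): (state', write, 'L' or 'R')}. A real TM
    may loop forever; this sketch cuts off at max_steps instead."""
    tape = dict(enumerate(w))                # sparse tape, blank elsewhere
    state, head, steps = start, 0, 0
    while state not in (accept, reject) and steps < max_steps:
        sym = tape.get(head, blank)
        state, tape[head], move = delta.get((state, sym), (reject, sym, "R"))
        head += 1 if move == "R" else -1
        steps += 1
    return state == accept

# TM for {0^n 1^n}: mark a 0 as X, zig right to the first 1, mark it Y,
# zag back left to the next unmarked 0; accept when only X/Y remain.
D = {
    ("q1", "0"): ("q2", "X", "R"), ("q1", "Y"): ("q4", "Y", "R"),
    ("q1", "_"): ("acc", "_", "R"),
    ("q2", "0"): ("q2", "0", "R"), ("q2", "Y"): ("q2", "Y", "R"),
    ("q2", "1"): ("q3", "Y", "L"),
    ("q3", "0"): ("q3", "0", "L"), ("q3", "Y"): ("q3", "Y", "L"),
    ("q3", "X"): ("q1", "X", "R"),
    ("q4", "Y"): ("q4", "Y", "R"), ("q4", "_"): ("acc", "_", "R"),
}
```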

Act III — Decidability, Reductions & Complexity

W5 13 — Decidable & Recognizable Languages

Languages about machines — which questions can algorithms always answer?

We shift to a meta-level: machines that analyze other machines. Four standard questions (\(A, E, ALL, EQ\)) asked about each model (DFA, NFA, CFG, TM). For DFAs, all four are decidable — simulate, check reachability, compute symmetric differences. For CFGs, cracks appear: \(ALL_{\text{CFG}}\) and \(EQ_{\text{CFG}}\) are undecidable. Enumerators connect to recognizability. The key theorem: a language is decidable iff both it and its complement are recognizable.
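
Two of the DFA deciders, executable (sketches; encodings are my own): \(A_{\text{DFA}}\) by simulation, \(E_{\text{DFA}}\) by graph reachability.

```python
def decide_A_DFA(dfa, w):
    """Decides {<D, w> : DFA D accepts w}: just run D on w."""
    state = dfa["start"]
    for a in w:
        state = dfa["delta"][(state, a)]
    return state in dfa["accept"]

def decide_E_DFA(dfa):
    """Decides {<D> : L(D) is empty}: is any accept state reachable?"""
    seen, stack = {dfa["start"]}, [dfa["start"]]
    while stack:
        q = stack.pop()
        for (p, _a), r in dfa["delta"].items():
            if p == q and r not in seen:
                seen.add(r); stack.append(r)
    return not (seen & dfa["accept"])        # empty iff no accept reachable

EVEN_ONES = {"start": 0, "accept": {0},
             "delta": {(0, "0"): 0, (0, "1"): 1, (1, "0"): 1, (1, "1"): 0}}
DEAD = {"start": 0, "accept": {1},           # accept state unreachable
        "delta": {(0, "0"): 0, (0, "1"): 0}}
```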

W6 14 — Undecidability & Diagonalization

The proof that computation has absolute limits.

\(A_{\text{TM}}\) is recognizable (the Universal TM) but undecidable — proved by diagonalization. Build a TM \(D\) that does the opposite of what \(M\) does on \(\langle M \rangle\), feed \(D\) its own encoding, get contradiction. The halting problem. More undecidable problems (\(E_{\text{TM}}\), \(EQ_{\text{TM}}\)) via reduction. \(\overline{A_{\text{TM}}}\) is not recognizable. The complete decidability table filled in.
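
The diagonal trick is easiest to see on a finite table (illustration only — the real argument uses the infinite table of all TMs run on all encodings):

```python
# Row i is "machine i's verdict on input j", as a bit. Flipping the
# diagonal builds a row that disagrees with row i at column i, for
# every i -- so it cannot equal ANY row. D plays exactly this role
# against the table of all TMs run on their own encodings.
table = [
    [1, 0, 1, 0],
    [0, 0, 1, 1],
    [1, 1, 0, 0],
    [0, 1, 1, 1],
]
diagonal_flip = [1 - table[i][i] for i in range(len(table))]
```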

W7 15 — Mapping Reductions & P vs NP

Proving impossibility by comparison — "if you could solve X, you could solve Y."

Mapping reductions formalize the idea of transforming one problem into another. If problem A reduces to problem B, then B is at least as hard as A. Used to prove undecidability: reduce a known undecidable problem to a new one. Then: Big-O, TIME(t(n)), the class P, the class NP (verifiers, certificates), and the P vs NP question. Coming soon.
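
The shape of a mapping reduction, on a toy decidable pair (my own example): if \(f\) maps A-instances to B-instances with \(x \in A \iff f(x) \in B\), composing a decider for B with \(f\) decides A — and, contrapositively, undecidability of A infects B.

```python
def via_reduction(f, decide_B):
    """decide_A(x) = decide_B(f(x)) -- the whole mechanism."""
    return lambda x: decide_B(f(x))

# Toy instance: A = {w : |w| is odd} reduces to B = {n : n is odd}
# under the reduction f = len.
decide_A = via_reduction(len, lambda n: n % 2 == 1)
```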

W8 16 — NP-Completeness & Space Complexity

The hardest problems and the final complexity hierarchy.

Polynomial-time reductions (\(\leq_p\)) connect problems by hardness. NP-complete problems (SAT, 3-colorability, etc.) are the hardest in NP — if any one is in P, they all are. Cook-Levin theorem: SAT is NP-complete. Then space complexity: PSPACE, NPSPACE, Savitch's theorem, and the full hierarchy L \(\subseteq\) NL \(\subseteq\) P \(\subseteq\) NP \(\subseteq\) PSPACE \(\subseteq\) EXPTIME. Coming soon.
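
What "verifier + certificate" means concretely, using 3-colorability from the list above (a sketch): checking a proposed coloring is polynomial in the graph size, even though finding one may not be — that asymmetry is NP.

```python
def verify_3coloring(edges, coloring):
    """Polynomial-time verifier for 3-COLOR: the coloring itself is the
    certificate; just check every vertex has a legal color and no edge
    joins two vertices of the same color."""
    return (all(c in (0, 1, 2) for c in coloring.values())
            and all(coloring[u] != coloring[v] for u, v in edges))

triangle = [(0, 1), (1, 2), (0, 2)]          # needs all three colors
```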

The Thread That Connects Everything

Every topic in this course follows the same pattern:

  1. Define a machine model (DFA → NFA → PDA → TM)
  2. Show what it can recognize (closure properties, equivalences, constructions)
  3. Prove what it can't (pumping lemma, diagonalization, reductions)
  4. Move to a strictly more powerful model and repeat

The exam tests all three levels: can you build a machine (construction), can you prove a language belongs to a class (closure arguments), and can you prove a language doesn't belong (impossibility proofs)?

Exam format: 45 points / 10 questions / 3 hours. Handwritten A4 cheat sheet allowed (both sides). The 6 recurring question types: DFA/NFA construction, subset construction, closure proof, pumping lemma, TM description, and reduction/decidability argument.