Closed-book exam. One double-sided A4 handwritten cheatsheet allowed. No calculator. 10 open questions, 45 points total. Assume \(P \neq NP\) and \(NP \neq EXP\). Karp reductions denoted \(\leq_P\). Irrelevant information costs points.
This mock is harder than typical. Tricky edge cases, less obvious constructions, requires deep understanding. The questions look similar to real exam questions, but the specific instances are trickier.
DFA \(D_1\) over \(\Sigma = \{a, b, c\}\):
| State | \(a\) | \(b\) | \(c\) |
|---|---|---|---|
| \(q_0\) | \(q_1\) | \(q_0\) | \(q_0\) |
| \(q_1\) | \(q_2\) | \(q_0\) | \(q_1\) |
| \(q_2\) | \(q_2\) | \(q_2\) | \(q_2\) |
\(L(D_1)\) = strings over \(\{a,b,c\}\) where no two consecutive \(a\)’s appear. \(q_2\) is a trap (non-accepting).
DFA \(D_2\) over \(\Sigma = \{a, b, c\}\):
| State | \(a\) | \(b\) | \(c\) |
|---|---|---|---|
| \(p_0\) | \(p_0\) | \(p_1\) | \(p_0\) |
| \(p_1\) | \(p_1\) | \(p_0\) | \(p_1\) |
\(L(D_2)\) = strings with an odd number of \(b\)’s.
(a) (1 pt) Give \(F\) for \(D_1\) in set notation.
(b) (4 pts) Construct DFA \(D''\) with \(L(D'') = L(D_1) \setminus L(D_2)\) (set difference). Use product construction. Leave out unreachable states.
Part (a)
Part (b) — Key insight
Set difference: \(L(D_1) \setminus L(D_2) = L(D_1) \cap \overline{L(D_2)}\). In the product construction, accept when \(D_1\) accepts AND \(D_2\) rejects:
$$F'' = \{(q, p) \mid q \in \{q_0, q_1\} \text{ and } p = p_0\}$$
Product states & transitions
Start state: \((q_0, p_0)\). Build reachable states:
| State | \(a\) | \(b\) | \(c\) | Accept? |
|---|---|---|---|---|
| \((q_0, p_0)\) | \((q_1, p_0)\) | \((q_0, p_1)\) | \((q_0, p_0)\) | ✓ |
| \((q_1, p_0)\) | \((q_2, p_0)\) | \((q_0, p_1)\) | \((q_1, p_0)\) | ✓ |
| \((q_0, p_1)\) | \((q_1, p_1)\) | \((q_0, p_0)\) | \((q_0, p_1)\) | ✗ |
| \((q_2, p_0)\) | \((q_2, p_0)\) | \((q_2, p_1)\) | \((q_2, p_0)\) | ✗ |
| \((q_1, p_1)\) | \((q_2, p_1)\) | \((q_0, p_0)\) | \((q_1, p_1)\) | ✗ |
| \((q_2, p_1)\) | \((q_2, p_1)\) | \((q_2, p_0)\) | \((q_2, p_1)\) | ✗ |
All 6 product states are reachable.
An MNFA (Majority NFA) is like an NFA except it accepts a word \(w\) if more than half of all computation paths on \(w\) end in an accepting state.
Do MNFAs accept the same class of languages as NFAs (i.e., the regular languages)?
If so, prove both directions. If not, give a separating example.
Answer: Yes. MNFAs recognize exactly the regular languages.
Direction 1: DFA → MNFA (trivial)
Every DFA is a special case of an MNFA. A DFA has exactly one computation path per input. If that path ends in an accept state, 1/1 > 1/2 of all paths accept. If it ends in a reject state, 0/1 paths accept. So the DFA acceptance condition is a special case of the MNFA acceptance condition.
Direction 2: MNFA → NFA → DFA
Given an MNFA \(M = (Q, \Sigma, \delta, q_0, F)\), we build a DFA that tracks, for each state \(q \in Q\), the number of paths that reach \(q\) after reading the input so far.
Key insight: the branching factor per step is bounded by \(|Q|\) (at most \(|Q|\) successor states from any state). After reading a word of length \(n\), the total number of paths is at most \(|Q|^n\) — unbounded, so we cannot track raw counts in finite state.
Instead, we track the ratio of accepting-to-total paths. Observe: at each step, the transition function maps a distribution over states to a new distribution. We can represent this as a vector \(v \in \mathbb{Q}^{|Q|}\) where \(v_q\) = fraction of paths in state \(q\). The initial vector has \(v_{q_0} = 1\). Each symbol applies a linear transformation (a matrix). The accept condition is: \(\sum_{q \in F} v_q > 1/2\).
Since the matrix entries are rational, and we track fractions, the state space of possible ratio vectors is not finite in general. A cleaner approach:
Alternative (cleaner): Build a DFA \(D\) that simulates all paths of \(M\) simultaneously, tracking for each state \(q \in Q\) the exact number of paths reaching \(q\). The state of \(D\) is a function \(f: Q \to \mathbb{N}\). Since the numbers grow without bound, directly this doesn’t give a finite automaton. However, the acceptance condition only depends on whether \(\sum_{q \in F} f(q) > \frac{1}{2} \sum_{q \in Q} f(q)\), i.e., the ratio. We can normalize: track the vector of fractions and observe there are finitely many equivalence classes under the linear maps induced by each input symbol, because the transformations are integer matrices of fixed dimension.
Formally, the result follows from the fact that any language defined by a threshold condition on a stochastic (or weighted) automaton over \(\mathbb{Q}\) with an isolated cut-point is regular. This is a classical result.
The exam likely expects: (1) DFA → MNFA trivially, and (2) for MNFA → regular, state that the majority condition can be decided from the final state distribution computed via matrix powers, and since weighted automata with isolated cut-points recognize regular languages (Rabin 1963), MNFAs accept exactly the regular languages.
Give a regular expression \(R\) such that:
$$L(R) = \{w \in \{a, b\}^* \mid \text{every } a \text{ in } w \text{ is immediately followed by a } b\}$$
Examples: \(\varepsilon \in L(R)\), \(b \in L(R)\), \(ab \in L(R)\), \(abb \in L(R)\), \(abab \in L(R)\).
Counterexamples: \(a \notin L(R)\), \(ba \notin L(R)\), \(aab \notin L(R)\).
Every position in \(w\) is either a standalone \(b\) or the pair \(ab\) (an \(a\) immediately followed by \(b\)). No \(a\) can appear at the end, and no \(a\) can appear before another \(a\).
Alternatively written: \((b \mid ab)^*\). The key insight is that every \(a\) must be “consumed” together with its following \(b\) as a single unit.
Consider the language:
$$L = \{w \in \{a, b\}^* \mid \text{the number of } a\text{'s in } w \text{ is even}\}$$
A student attempts a pumping lemma proof:
“Suppose \(L\) is regular. Let \(p\) be the pumping length. Take \(w = a^{2p}\). Then \(|w| = 2p \geq p\). Split \(w = xyz\) with \(|xy| \leq p\), \(|y| \geq 1\). Then \(y = a^k\) for some \(1 \leq k \leq p\).”
Is it possible to finish this proof and reach a contradiction? Explain why or why not.
No, the proof cannot be finished.
Why it fails
\(L\) is regular. A DFA with 2 states recognizes it: one state for “even count of \(a\)’s” (accept), one for “odd count” (reject), with \(b\)-transitions staying in the same state.
The pumping lemma can only prove that a language is not regular. It cannot prove that a regular language is non-regular, because every regular language satisfies the pumping lemma.
Where the proof attempt breaks down
With \(w = a^{2p}\), any valid split gives \(y = a^k\) where \(1 \leq k \leq p\). Pumping with \(i = 0\): \(xz = a^{2p-k}\). This is in \(L\) iff \(2p - k\) is even, i.e., iff \(k\) is even.
The student needs a contradiction for all valid splits — but the split \(y = a^2\) (with \(k = 2\)) survives all pumping: \(a^{2p + 2(i-1)}\) always has an even number of \(a\)’s. Since there exists a valid split that works, no contradiction is reached.
This is a trick question testing whether the student recognizes that (1) \(L\) is regular, and (2) the pumping lemma only provides a necessary condition, not a sufficient one for non-regularity.
Consider the CFG \(G = (\{S, B\}, \{a, b, c\}, R, S)\) with rules:
$$S \to aSc \mid B \qquad B \to bBc \mid \varepsilon$$
(a) (2 pts) What is \(L(G)\)? Is \(G\) ambiguous?
(b) (3 pts) Give a PDA \(P\) with at most 7 states such that \(L(P) = \{a^i b^j c^{i+j} \mid i, j \geq 0\}\).
Part (a)
Trace the derivation:
The \(S \to aSc\) rule wraps \(a\)’s and \(c\)’s around the outside (\(i\) times). Then \(S \to B\) switches to the \(B\) production, which wraps \(b\)’s and \(c\)’s (\(j\) times). The \(c\)’s concatenate: \(c^j c^i = c^{i+j}\).
Ambiguity: \(G\) is not ambiguous. For any word \(a^i b^j c^{i+j}\), the number of \(a\)’s forces exactly \(i\) applications of \(S \to aSc\), then exactly one application of \(S \to B\), then exactly \(j\) applications of \(B \to bBc\). The parse tree is unique.
Part (b) — PDA construction
Idea: push a marker for each \(a\) and each \(b\) (counting total), then pop one marker per \(c\). Accept when stack is empty.
PDA \(P = (Q, \Sigma, \Gamma, \delta, q_0, Z_0, F)\):
Transitions:
| From | Read | Pop | Push | To | Comment |
|---|---|---|---|---|---|
| \(q_0\) | \(\varepsilon\) | \(Z_0\) | \(Z_0\) | \(q_a\) | initialize |
| \(q_a\) | \(a\) | \(Z_0\) | \(XZ_0\) | \(q_a\) | push first \(X\) |
| \(q_a\) | \(a\) | \(X\) | \(XX\) | \(q_a\) | push \(X\) for each \(a\) |
| \(q_a\) | \(\varepsilon\) | — | — | \(q_b\) | switch to reading \(b\)’s |
| \(q_b\) | \(b\) | \(Z_0\) | \(XZ_0\) | \(q_b\) | push first \(X\) for \(b\) |
| \(q_b\) | \(b\) | \(X\) | \(XX\) | \(q_b\) | push \(X\) for each \(b\) |
| \(q_b\) | \(\varepsilon\) | — | — | \(q_c\) | switch to reading \(c\)’s |
| \(q_c\) | \(c\) | \(X\) | \(\varepsilon\) | \(q_c\) | pop one \(X\) per \(c\) |
| \(q_c\) | \(\varepsilon\) | \(Z_0\) | \(\varepsilon\) | \(q_{\text{acc}}\) | stack empty → accept |
Consider the TM \(M\) with \(\Sigma = \{a, b\}\), \(\Gamma = \{a, b, \sqcup\}\), and transitions:
| State | Read | Write | Move | Next |
|---|---|---|---|---|
| \(q_0\) | \(a\) | \(b\) | R | \(q_0\) |
| \(q_0\) | \(b\) | \(a\) | R | \(q_0\) |
| \(q_0\) | \(\sqcup\) | \(\sqcup\) | L | \(q_1\) |
| \(q_1\) | \(a\) | \(a\) | L | \(q_1\) |
| \(q_1\) | \(b\) | \(b\) | L | \(q_1\) |
| \(q_1\) | \(\sqcup\) | \(\sqcup\) | R | \(q_{\text{acc}}\) |
(a) (2 pts) What does \(M\) do? Give a word \(w\) with \(|w| = 3\) that \(M\) accepts and show the full configuration sequence.
(b) (2 pts) Is \(L(M)\) decidable? Does \(M\) always halt?
Part (a)
\(M\) does two things:
Net effect: \(M\) flips every bit of the input and accepts. \(L(M) = \Sigma^*\) (accepts every string).
Example: \(w = aba\)
(Head position shown by state before the symbol it reads.)
\(q_0\, aba\) → \(b\, q_0\, ba\) → \(ba\, q_0\, a\) → \(bab\, q_0\, \sqcup\) → \(ba\, q_1\, b\) → \(b\, q_1\, ab\) → \(q_1\, \sqcup\, bab\) → \(q_{\text{acc}}\, bab\)
Input \(aba\) is flipped to \(bab\), and \(M\) accepts.
Part (b)
Decidable: Yes. \(L(M) = \Sigma^*\), which is trivially decidable (a TM that always accepts decides it).
Always halts: Yes. \(q_0\) moves right on every step until it hits blank (at most \(|w|+1\) steps). \(q_1\) moves left on every step until it hits blank (at most \(|w|+1\) steps). Then it transitions to \(q_{\text{acc}}\). Total steps: \(O(|w|)\), always terminates.
$$L = \{\langle M_1, M_2 \rangle \mid M_1 \text{ and } M_2 \text{ are TMs and } L(M_1) \cap L(M_2) = \emptyset\}$$
(a) (2 pts) Prove \(L\) is co-Turing-recognizable.
(b) (4 pts) Prove \(L\) is undecidable.
(c) (1 pt) Prove \(L\) is not Turing-recognizable.
Part (a) — co-Turing-recognizable
We show \(\overline{L}\) is Turing-recognizable. \(\overline{L} = \{\langle M_1, M_2 \rangle \mid L(M_1) \cap L(M_2) \neq \emptyset\}\).
Build a TM \(R\) that recognizes \(\overline{L}\):
On input \(\langle M_1, M_2 \rangle\):
For \(i = 1, 2, 3, \ldots\):
For each string \(s\) with \(|s| \leq i\):
Run \(M_1\) on \(s\) for \(i\) steps and \(M_2\) on \(s\) for \(i\) steps.
If both accept, accept.
If \(L(M_1) \cap L(M_2) \neq \emptyset\), some witness \(s\) exists. Dovetailing finds it. If the intersection is empty, \(R\) runs forever. So \(R\) recognizes \(\overline{L}\), meaning \(L\) is co-TR. □
Part (b) — Undecidable
We reduce from \(A_{TM}\). Given \(\langle M, w \rangle\), construct:
Analysis:
This is a mapping reduction \(A_{TM} \leq_m \overline{L}\): yes-instances of \(A_{TM}\) map to \(\overline{L}\), no-instances map to \(L\). Since \(A_{TM}\) is undecidable, \(\overline{L}\) is undecidable, and therefore \(L\) is undecidable. □
Part (c) — Not Turing-recognizable
From part (a), \(L\) is co-TR. From part (b), \(L\) is undecidable. If \(L\) were also TR, then \(L\) would be both TR and co-TR, which implies \(L\) is decidable — contradiction. Therefore \(L\) is not Turing-recognizable. □
$$L = \{\langle M, w \rangle \mid M \text{ is a TM that halts on } w \text{ within 100 steps}\}$$
(a) (1 pt) Is \(L\) decidable? Is it Turing-recognizable? Is it co-Turing-recognizable?
(b) (2 pts) A student claims: “\(L\) is undecidable because the halting problem is undecidable.” Is this correct? Explain.
Part (a)
\(L\) is decidable.
Decider: On input \(\langle M, w \rangle\), simulate \(M\) on \(w\) for exactly 100 steps. If \(M\) halts within those 100 steps, accept. Otherwise, reject.
This always terminates (at most 100 simulation steps), so it is a decider.
Since decidable ⇒ TR and co-TR: \(L\) is all three (decidable, Turing-recognizable, and co-Turing-recognizable).
Part (b)
The student is wrong.
\(L\) is not the halting problem. The halting problem \(\text{HALT}_{TM} = \{\langle M, w \rangle \mid M \text{ halts on } w\}\) asks whether \(M\) halts in any number of steps — this is undecidable because there is no bound on how long to simulate.
\(L\) asks whether \(M\) halts within a fixed bound (100 steps). With a fixed bound, we can simply simulate and check. The bounded version is strictly easier — the unboundedness is precisely what makes \(\text{HALT}_{TM}\) undecidable.
DOMINATING SET
(a) (2 pts) Prove DOMINATING SET \(\in\) NP.
(b) (1 pt) Give a yes-instance of Vertex Cover with \(|V| = 5\), \(|E| = 5\), \(K = 2\). State the cover.
(c) (2 pts) A student proposes the following reduction from Vertex Cover to Dominating Set:
Given VC instance \((G = (V, E), K)\), output DS instance \((G' = G, k = K)\).
Apply this reduction to your VC instance from (b). State the resulting DS instance.
(d) (1 pt) Is the result from (c) a yes-instance of Dominating Set?
(e) (1 pt) Is the reduction from (c) correct? If not, explain what goes wrong and sketch how to fix it.
Part (a) — DOMINATING SET ∈ NP
Certificate: A set \(D \subseteq V\).
Verifier:
Both checks are polynomial. If a dominating set of size \(\leq k\) exists, it serves as a certificate that the verifier accepts. □
Part (b) — VC yes-instance
\(V = \{1, 2, 3, 4, 5\}\), \(E = \{\{1,2\}, \{2,3\}, \{3,4\}, \{4,5\}, \{2,4\}\}\), \(K = 2\).
Vertex cover: \(\{2, 4\}\).
Check: \(\{1,2\}\) ✓ (2), \(\{2,3\}\) ✓ (2), \(\{3,4\}\) ✓ (4), \(\{4,5\}\) ✓ (4), \(\{2,4\}\) ✓ (both). All 5 edges covered.
Part (c) — Apply reduction
DS instance: \(G' = G\) (same graph), \(k = K = 2\). Candidate dominating set: \(D = \{2, 4\}\).
Part (d) — Yes-instance?
Check if \(\{2, 4\}\) is a dominating set:
Part (e) — Is the reduction correct?
No, the reduction is incorrect.
A correct reduction must map all yes-instances of VC to yes-instances of DS and all no-instances to no-instances. The identity reduction \(G' = G, k = K\) fails because:
Fix: For each edge \(\{u, v\} \in E\), add a new vertex \(w_{uv}\) connected only to \(u\) and \(v\). In the modified graph \(G'\), any dominating set must include \(u\) or \(v\) for each edge (to dominate \(w_{uv}\)), which is exactly the vertex cover condition. Set \(k = K\).
For each statement, determine if it is True or False and give a brief justification.
(a) (2 pts) If \(L\) is Turing-recognizable and \(L\) is infinite, then \(L\) contains an infinite decidable subset.
(b) (1 pt) There exists a language that is neither Turing-recognizable nor co-Turing-recognizable.
(c) (2 pts) If \(A \leq_P B\) and \(A\) is NP-complete, then \(B\) is NP-complete.
(d) (1 pt) The language \(\{\langle D \rangle \mid D \text{ is a DFA and } L(D) = \Sigma^*\}\) is decidable.
(a) True
Since \(L\) is Turing-recognizable, there exists a TM \(M\) that recognizes it. Use dovetailing: run \(M\) on all strings with increasing time bounds. This produces an enumeration \(s_1, s_2, s_3, \ldots\) of the elements of \(L\) (with possible repeats).
Define \(S\) as follows: include \(s_1\). Then for each subsequent \(s_i\), include it only if it is lexicographically greater than the last included element. Since \(L\) is infinite, arbitrarily large strings keep appearing, so \(S\) is infinite.
\(S\) is decidable: on input \(x\), run the enumeration. Either \(x\) appears and is included in \(S\) (accept), or a string lexicographically greater than \(x\) is included without \(x\) ever appearing (reject). This always terminates because the included elements grow without bound.
(b) True
\(EQ_{TM} = \{\langle M_1, M_2 \rangle \mid L(M_1) = L(M_2)\}\) is neither TR nor co-TR. This is a standard result: \(EQ_{TM}\) is not TR (reduces from \(\overline{A_{TM}}\)) and not co-TR (reduces from \(A_{TM}\)).
(c) False
\(A \leq_P B\) and \(A\) NP-complete implies \(B\) is NP-hard (every NP problem reduces to \(A\), which reduces to \(B\), so every NP problem reduces to \(B\)). But \(B\) need not be in NP.
Counterexample: Let \(A = \text{3SAT}\) (NP-complete) and \(B = \overline{\text{HALT}_{TM}}\) (undecidable). We can reduce 3SAT to \(\overline{\text{HALT}_{TM}}\): given formula \(\phi\), construct a TM that tries all assignments; it halts iff \(\phi\) is satisfiable. Map to the complement appropriately. \(B\) is NP-hard but not in NP (not even decidable), so \(B\) is not NP-complete.
(d) True
This is the complement of \(E_{DFA}\) (emptiness testing) applied to the complement DFA. Algorithm:
Both steps (complement + emptiness test) run in polynomial time.
| Question | Points | Correct? |
|---|---|---|
| Q1a | 1 | |
| Q1b | 4 | |
| Q2 | 3 | |
| Q3 | 2 | |
| Q4 | 3 | |
| Q5a | 2 | |
| Q5b | 3 | |
| Q6a | 2 | |
| Q6b | 2 | |
| Q7a | 2 | |
| Q7b | 4 | |
| Q7c | 1 | |
| Q8a | 1 | |
| Q8b | 2 | |
| Q9a | 2 | |
| Q9b | 1 | |
| Q9c+d | 3 | |
| Q9e | 1 | |
| Q10a | 2 | |
| Q10b | 1 | |
| Q10c | 2 | |
| Q10d | 1 | |
| Total | /45 |