Time: 3 hours. Aids: One double-sided A4 handwritten cheatsheet. No calculator.
Assume P ≠ NP and NP ≠ EXP. Karp reductions denoted ≤P. Mapping reductions denoted ≤m. 0 ∈ ℕ.
Irrelevant information may cost points. Use scrap paper first.
This is the hardest mock exam. Novel angles, tricky edge cases, deep reasoning required. If you can do this exam, the real one will feel easy.
| Question | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | Total |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Points | 5 | 3 | 2 | 3 | 5 | 4 | 7 | 3 | 7 | 6 | 45 |
Consider DFA D = (Q, {0, 1}, δ, s0, F) with states {s0, s1, s2, s3, s4} where s0 is the start state and transition function:
| δ | 0 | 1 |
|---|---|---|
| s0 (start) | s1 | s0 |
| s1 | s2 | s0 |
| s2 | s3 | s0 |
| s3 | s4 | s0 |
| s4 | s4 | s4 |
F = {s2, s4}.
(a) (1 pt) Describe L(D) in plain English.
(b) (4 pts) Consider DFA D′ = ({r0, r1}, {0, 1}, δ′, r0, {r0}) with:
| δ′ | 0 | 1 |
|---|---|---|
| r0 (start, accept) | r1 | r0 |
| r1 | r0 | r1 |
Construct DFA D′′ such that L(D′′) = L(D) ∩ L(D′) using the product construction. Leave out unreachable states.
(a) D counts consecutive 0’s. Any ‘1’ resets the counter to s0. s4 is a trap (both transitions stay in s4).
L(D) = strings over {0,1} that either end with a block of exactly two consecutive 0’s (not preceded/followed by more 0’s) or contain a block of four or more consecutive 0’s.
More precisely: L(D) = {w ∈ {0,1}* | w ends in a run of exactly 2 consecutive 0’s, or w contains a run of ≥ 4 consecutive 0’s}.
(b) L(D′) = strings with an even number of 0’s. (r0 = even count, r1 = odd count. 1’s don’t change state.)
Product construction D′′ = (Q′′, {0, 1}, δ′′, (s0, r0), F′′) where F′′ = {s2, s4} × {r0} = {(s2, r0), (s4, r0)}.
Exploring reachable states from (s0, r0):
| State | 0 | 1 | Accept? |
|---|---|---|---|
| (s0, r0) | (s1, r1) | (s0, r0) | No |
| (s1, r1) | (s2, r0) | (s0, r1) | No |
| (s2, r0) | (s3, r1) | (s0, r0) | Yes |
| (s0, r1) | (s1, r0) | (s0, r1) | No |
| (s3, r1) | (s4, r0) | (s0, r1) | No |
| (s1, r0) | (s2, r1) | (s0, r0) | No |
| (s4, r0) | (s4, r1) | (s4, r0) | Yes |
| (s2, r1) | (s3, r0) | (s0, r1) | No |
| (s4, r1) | (s4, r0) | (s4, r1) | No |
| (s3, r0) | (s4, r1) | (s0, r0) | No |
10 reachable states (all 5 × 2 = 10 states are reachable). L(D′′) = strings from L(D) that also have an even number of 0’s.
An SNFA (Single-Path Accepting NFA) is defined exactly like an NFA, but with a different acceptance criterion:
An SNFA N accepts a word w if exactly one computation path on w ends in an accepting state.
(If zero paths accept, or if two or more paths accept, the SNFA rejects w.)
Are SNFAs equivalent in expressive power to NFAs? That is, do they recognize exactly the regular languages?
If yes, prove both directions. If no, give a concrete regular language not recognizable by any SNFA, or a non-regular language recognizable by some SNFA.
Yes, SNFAs recognize exactly the regular languages.
Direction 1: Regular ⊆ SNFA. Every regular language has a DFA. A DFA has exactly one computation path for every input (determinism means one transition per state per symbol). If the DFA accepts, that is exactly 1 accepting path. If it rejects, that is 0 accepting paths. So every DFA is automatically a valid SNFA for the same language. Since every regular language has a DFA, every regular language is recognized by some SNFA. □
Direction 2: SNFA ⊆ Regular. Let N be an SNFA with states Q = {q1, …, qn} and accept states F. Build a DFA D that tracks, for each NFA state qi, how many computation paths reach it: “0”, “1”, or “≥2”. Formally, the DFA state space is {0, 1, 2+}n, which is finite (3n states).
Transitions: On reading symbol a in DFA state (c1, …, cn), compute the new count for each qj by summing the counts ci over all qi with qj ∈ δ(qi, a), capping at 2+.
Acceptance: D accepts iff the total number of accepting paths equals exactly 1. Formally: sum the counts for all states in F, and accept iff that sum equals 1.
This is a finite-state construction, so L(N) is regular. □
Give a regular expression R over Σ = {a, b, c} such that:
L(R) = { w ∈ {a, b, c}* | w contains “abc” as a subsequence (not necessarily contiguous) }
Examples: abc ∈ L, aXbYc ∈ L for any strings X, Y. bac ∉ L, ab ∉ L, bc ∉ L.
We need to match: anything, then ‘a’, then anything, then ‘b’, then anything, then ‘c’, then anything. Let Σ = (a | b | c).
The first and last Σ* can be omitted if you only care about containing the subsequence (they handle characters before the first ‘a’ and after the last ‘c’). But including them is cleaner and unambiguously correct. A shorter equivalent: (b | c)*a(a | b | c)*b(a | b | c)*c(a | b | c)* — the prefix only needs b’s and c’s since we pick the first ‘a’.
Let L = { ambn | m ≠ n } over Σ = {a, b}.
A student begins the following proof:
“Suppose L is regular. Let p be the pumping length. Choose w = apbp+p!. Then |w| = p + p + p! ≥ p. By the pumping lemma, w = xyz with |xy| ≤ p and |y| ≥ 1.”
Can the student complete this proof? If so, finish it. In particular, explain why the choice of w = apbp+p! is essential (why doesn’t the simpler w = apbp+1 work?).
Yes, the proof can be completed.
Since |xy| ≤ p and w begins with ap, the entire prefix xy lies in the a-block. So y = ak for some 1 ≤ k ≤ p.
Pump with i = (p!/k) + 1:
xyiz = ap − k + k · ibp+p! = ap − k + k(p!/k + 1)bp+p! = ap + p!bp+p!
Now the number of a’s equals the number of b’s (both p + p!), so xyiz ∉ L. This contradicts the pumping lemma. □
Why p! is essential: We need the pumping exponent i = (p!/k) + 1 to be a non-negative integer, which requires k | p!. Since 1 ≤ k ≤ p, we know k divides p! (every integer from 1 to p divides p!). This is why we use p! and not a simpler offset.
Why apbp+1 fails: With w = apbp+1, pumping gives ap−k+kibp+1. For this to equal ap+1bp+1 (equal counts), we need ki − k = 1, i.e., k(i − 1) = 1, i.e., k = 1 and i = 2. But k could be 2, 3, …, p, and for those values no integer i makes k(i−1) = 1. We need to find a single i that works for all possible k, and apbp+1 does not allow this.
(a) (2 pts) Give a CFG G with at most 5 production rules for the language:
L = { w ∈ {a, b}* | #a(w) = 2 · #b(w) }
(The number of a’s is exactly twice the number of b’s.)
(b) (3 pts) Give a PDA P with at most 7 states for the language:
L′ = { aibj | i ≠ j, i, j ≥ 0 }
Acceptance is by final state. Give your answer as a state diagram with transitions in the format “input, pop → push”.
(a) Assign each symbol a “weight”: a = +1, b = −2. Then L = strings with total weight 0.
Each non-ε rule must add one b (−2) and two a’s (+2) to maintain balance, but symbols can appear in any order:
S → ε | SS | aaSb | abSa | bSaa
Verification: S → SS allows concatenation of balanced strings (balanced + balanced = balanced). The three 3-symbol rules cover all orderings of (a, a, b) with S interspersed to allow arbitrary balanced strings between them. Every string with #a = 2 · #b can be built this way. 5 rules total. □
(b) PDA for L′ = { aibj | i ≠ j }.
Strategy: nondeterministically guess i < j or i > j at the start.
Branch 1 (i > j): Push all a’s onto the stack. Pop one per b. After all b’s, check the stack is nonempty → accept.
Branch 2 (i < j): Push all a’s onto the stack. Pop one per b. Once the stack is empty, read at least one more b → accept.
States: q0 (start), q1 (branch: i > j, pushing a’s), q2 (branch: i > j, popping with b’s), q3 (branch: i < j, pushing a’s), q4 (branch: i < j, popping with b’s), q5 (branch: i < j, reading extra b’s), qacc.
Transitions:
7 states: {q0, q1, q2, q3, q4, q5, qacc}. □
Note: The i > j branch accepts when, after reading all b’s (no more input), the stack still has at least one A. The nondeterministic ε-transition from q2 to qacc checks that A is on top (not $). The i < j branch accepts when the stack hits $ while b’s remain.
Consider TM M = (Q, {0, 1}, {0, 1, X, ⊔}, δ, q0, qacc, qrej) with transition function:
| State | Read | Write | Move | Next |
|---|---|---|---|---|
| q0 | 0 | X | R | q1 |
| q0 | 1 | 1 | R | qrej |
| q0 | ⊔ | ⊔ | R | qacc |
| q0 | X | X | R | q0 |
| q1 | 0 | 0 | R | q1 |
| q1 | 1 | 1 | R | q1 |
| q1 | ⊔ | ⊔ | L | q2 |
| q2 | 0 | ⊔ | L | q3 |
| q2 | 1 | ⊔ | L | q3 |
| q2 | X | X | R | qrej |
| q3 | 0 | 0 | L | q3 |
| q3 | 1 | 1 | L | q3 |
| q3 | X | X | R | q0 |
(a) (2 pts) Find the shortest word w that M accepts. Show the full sequence of configurations.
(b) (2 pts) What is L(M)? Is L(M) decidable? Justify.
(a) Testing short words:
w = ε: q0 reads ⊔ → qacc. Accepts! But wait — ε has length 0. Is there a non-trivial accepted word?
Actually, ε is accepted: q0 reads blank → accept. So ε is the shortest.
But let’s check if the question intends non-empty words. The shortest non-empty accepted word:
w = “0”:
q0 0 ⊔
⊢ X q1 ⊔ (write X, move R, go to q1)
⊢ q2 X ⊔ (read ⊔, move L, go to q2)
⊢ X qrej ⊔ (read X in q2 → reject)
Rejected. (After marking the first 0 with X and scanning right, q2 finds X immediately — only one symbol was between the marks.)
w = “00”:
q0 00 ⊔
⊢ X q1 0 ⊔ (mark first 0)
⊢ X0 q1 ⊔ (scan right past 0)
⊢ X q2 0 ⊔ (hit blank, move L)
⊢ q3 X ⊔ ⊔ (erase the 0, write ⊔, move L)
⊢ X q0 ⊔ ⊔ (read X in q3, move R to q0)
⊢ X ⊔ qacc (read ⊔ in q0 → accept!)
“00” is accepted. Shortest non-empty word: “00” (length 2).
Technically ε is the shortest, but the question asks to “find the shortest word” and “show configurations”, so “00” gives a more instructive trace. Full credit for either answer with correct configurations.
(b) M’s algorithm: mark the leftmost unmarked 0 with X (reject if it sees a 1), scan right to the end of the input, erase the rightmost symbol (reject if it immediately sees X, meaning nothing was between mark and end), scan left back to X, then repeat from q0 which skips over X’s. Accept when only X’s and blanks remain.
Each pass consumes one symbol from the left (must be 0) and one symbol from the right (can be 0 or 1). The process continues until nothing remains between the X markers.
This means:
L(M) = { 0nw | w ∈ {0, 1}n, n ≥ 0 } = strings of even length whose first half is all 0’s.
Equivalently: L(M) = {ε} ∪ { 0nw | n ≥ 1, w ∈ {0, 1}n }.
Yes, L(M) is decidable. M always halts: each pass erases 2 symbols, strictly reducing the input, so M terminates in at most |w|/2 passes. Moreover, L(M) is a regular language (a DFA can track position and verify the first half is all 0’s by counting), so it is decidable.
Consider the language:
L* = { 〈M1, M2〉 | M1, M2 are TMs and L(M1) = L(M2)* }
(M1’s language equals the Kleene star of M2’s language.)
(a) (2 pts) Prove that L* is co-Turing-recognizable.
(b) (4 pts) Prove that L* is undecidable. Hint: reduce from ATM.
(c) (1 pt) Prove that L* is not Turing-recognizable.
(a) L* is co-Turing-recognizable.
We show L* is Turing-recognizable. Build TM R:
R = “On input 〈M1, M2〉:
If L(M1) ≠ L(M2)*, some witness s exists and will eventually be found by dovetailing. So R recognizes L*, which means L* is co-TR. □
(b) L* is undecidable.
Reduction: ATM ≤m L*.
On input 〈M, w〉, construct:
Analysis:
So 〈M, w〉 ∈ ATM ⇔ f(〈M, w〉) ∈ L*. The function f is computable. Thus ATM ≤m L*.
Since ATM is undecidable, L* is undecidable. □
(c) L* is not Turing-recognizable.
From (a), L* is co-TR. From (b), L* is undecidable. If L* were also TR, then L* would be both TR and co-TR, which implies L* is decidable (a language is decidable iff it and its complement are both recognizable). But (b) shows L* is undecidable. Contradiction.
Therefore L* is not Turing-recognizable. □
Consider the language:
LDEC = { 〈M〉 | M is a TM and L(M) is decidable }
(a) (1 pt) Let MATM be a fixed TM that recognizes ATM. Is 〈MATM〉 ∈ LDEC? Explain.
(b) (2 pts) A student attempts to prove LDEC is undecidable with this reduction from ATM:
“On input 〈M, w〉: Construct M′ = ‘On input x: Ignore x. Run M on w. If M accepts, accept.’ Output 〈M′〉.”
The student argues: “If M accepts w, then L(M′) = Σ*, which is decidable, so 〈M′〉 ∈ LDEC. If M does not accept w, then L(M′) = ∅, which is decidable, so 〈M′〉 ∈ LDEC. Wait — both cases map into LDEC! This doesn’t work.”
Is the student correct that this reduction fails? What can you conclude about LDEC’s decidability, and why does a simple reduction from ATM not work here?
(a) No, 〈MATM〉 ∉ LDEC.
MATM recognizes ATM, so L(MATM) = ATM. Since ATM is undecidable, L(MATM) is not decidable, so 〈MATM〉 ∉ LDEC.
(b) Yes, the student is correct that this reduction fails.
The reduction maps every input 〈M, w〉 to some 〈M′〉 ∈ LDEC regardless of whether M accepts w (in both cases, L(M′) is decidable — either Σ* or ∅). A valid many-one reduction requires: 〈M, w〉 ∈ ATM ⇔ f(〈M, w〉) ∈ LDEC. Since both branches map into LDEC, the “only if” direction fails.
Why a simple ATM reduction fails: LDEC is a “higher-order” property. The standard trick of building M′ whose language is either Σ* or ∅ doesn’t help because both Σ* and ∅ are decidable. To get a “no” instance of LDEC, you need M′ to have an undecidable language — but that requires encoding a much more complex TM.
Conclusion: LDEC IS undecidable (this follows from Rice’s theorem since “L(M) is decidable” is a non-trivial semantic property of TMs — some TMs have it, some don’t). But proving it requires a more sophisticated reduction than the simple ATM approach above. In fact, LDEC is Σ3-complete in the arithmetic hierarchy, meaning it is “harder” than ATM — no mapping reduction from ATM to LDEC or to LDEC exists.
Consider the following problem:
SCHEDULE
Instance: A set of tasks T = {t1, …, tn}, processing times p : T → ℕ, deadlines d : T → ℕ, and a target integer k.
Question: Is there an ordering of the tasks such that at most k tasks miss their deadline? (A task ti misses its deadline if its completion time > d(ti). Tasks are processed sequentially with no gaps.)
(a) (2 pts) Prove that SCHEDULE ∈ NP.
(b) (1 pt) Give a yes-instance of 3SAT with |U| = 3 variables and |C| = 2 clauses.
(c) (2 pts) Consider the following reduction from 3SAT to SCHEDULE. Given 3SAT instance (U, C) with U = {u1, …, un} and C = {c1, …, cm}:
Apply this reduction to your 3SAT instance from (b). List all tasks with their processing times and deadlines.
(d) (1 pt) Is the resulting SCHEDULE instance a yes-instance? Explain.
(e) (1 pt) The reduction has a flaw. Identify it and suggest a fix.
(a) SCHEDULE ∈ NP.
Certificate: a permutation σ of the tasks (the proposed ordering).
Verifier:
All steps polynomial in input size. □
(b) 3SAT yes-instance.
U = {x, y, z}, C = {(x ∨ y ∨ ¬z), (¬x ∨ z ∨ y)}.
Satisfying assignment: x = T, y = T, z = T. First clause: T ∨ T ∨ F = T ✓. Second clause: F ∨ T ∨ T = T ✓.
(c) Apply the reduction.
n = 3 variables, m = 2 clauses. Tasks:
| Task | Meaning | p | d |
|---|---|---|---|
| t1 | x = true | 1 | 2 |
| f1 | x = false | 1 | 2 |
| t2 | y = true | 1 | 4 |
| f2 | y = false | 1 | 4 |
| t3 | z = true | 1 | 6 |
| f3 | z = false | 1 | 6 |
| c1 | clause 1 | 1 | 7 |
| c2 | clause 2 | 1 | 8 |
k = 0. Total tasks: 8 (processing time sum = 8).
(d) Is it a yes-instance?
No. There are 8 tasks, each with p = 1, so total processing time = 8. The last task finishes at time 8 regardless of ordering. But consider: for each variable ui, both ti and fi have deadline 2i. With k = 0, both tasks must finish by time 2i. However, we have 2 tasks per variable, and tasks before ui’s pair consume time too.
For variable u1: t1 and f1 both have deadline 2. We can schedule at most 2 tasks by time 2. That works if t1 and f1 go first.
For variable u2: t2 and f2 have deadline 4. By time 4 we can schedule 4 tasks. With t1,f1,t2,f2 first, all meet deadlines.
For variable u3: deadline 6. By time 6 we schedule 6 tasks, all 6 variable tasks fit.
Then c1 at time 7 (deadline 7 ✓) and c2 at time 8 (deadline 8 ✓).
So actually yes — ordering [t1, f1, t2, f2, t3, f3, c1, c2] has all tasks meeting deadlines with k = 0. ✓
(e) The flaw.
The reduction always produces a yes-instance of SCHEDULE, even when the 3SAT instance is unsatisfiable. The problem is that k = 0 with these deadlines allows all variable tasks (both “true” and “false”) to be scheduled on time — the deadlines are loose enough to fit everything. The reduction does not encode the constraint that setting a variable to true should “conflict” with setting it to false.
Fix: Change the deadlines so that for each variable pair (ti, fi), only one of the two can meet its deadline. For example, set d(ti) = d(fi) = 2i − 1 (giving a “slot” of size 1 for two tasks). Then exactly one misses, and set k = n (one missed task per variable). The clause tasks must then be scheduled using the “slack” created by the satisfying assignment to also meet their deadlines.
For each claim, state whether it is True or False and give a proof or counterexample.
(a) (2 pts) There exists a Turing-recognizable language whose complement is also Turing-recognizable, but no single TM can decide it (i.e., the language is undecidable).
(b) (2 pts) For every language L ∈ NP, there exists a polynomial p(n) such that L ∈ TIME(2p(n)).
(c) (1 pt) If L is regular and L′ is not regular, then L ∪ L′ might be regular.
(d) (1 pt) ETM ≤m ATM.
(a) FALSE.
This is impossible. If L is TR and L is also TR, then L is decidable. Proof: let R1 recognize L and R2 recognize L. Build decider D: “On input w, run R1 and R2 in parallel (interleaving steps). If R1 accepts, accept. If R2 accepts, reject.” One of them must accept (w is either in L or in L), so D always halts. This is a theorem, not an assumption: L is decidable iff both L and L are TR.
(b) TRUE.
If L ∈ NP, there exists a polynomial-time verifier V and a polynomial q(n) bounding the certificate length. A brute-force algorithm: for input w of length n, enumerate all 2q(n) possible certificates c and run V(w, c) on each. V runs in time poly(n), so total time is 2q(n) · poly(n). This is bounded by 2p(n) for a slightly larger polynomial p(n) = q(n) + O(log n). So L ∈ TIME(2p(n)). □
This shows every NP language is solvable in (at worst) exponential time. The converse — whether EXPTIME contains languages outside NP — is a major open question (but we assume NP ≠ EXP).
(c) TRUE.
Counterexample: L = Σ* (regular), L′ = {anbn | n ≥ 0} (not regular). Then L ∪ L′ = Σ* which is regular. More interestingly: L = {anbm | n ≠ m} (not regular — wait, let’s pick valid examples). L = Σ* works. The union of any language with Σ* is Σ*, which is regular.
(d) FALSE.
ETM = {〈M〉 | L(M) = ∅} is co-Turing-recognizable but not Turing-recognizable. ATM is Turing-recognizable. If ETM ≤m ATM, then since ATM is TR, ETM would be TR (mapping reductions preserve TR: if L ≤m L′ and L′ is TR, then L is TR). But ETM is not TR. Contradiction.
Note: the converse direction ETM ≤m ATM IS true (both are TR), but that’s a different statement. ETM itself sits outside the TR class.
| Question | Topic | Max | Your Score |
|---|---|---|---|
| Q1a | DFA — identify language | 1 | |
| Q1b | DFA — product construction (10 states) | 4 | |
| Q2 | NFA variant — SNFA equivalence proof | 3 | |
| Q3 | Regex for subsequence matching | 2 | |
| Q4 | Pumping lemma — m ≠ n with factorial trick | 3 | |
| Q5a | CFG — #a = 2·#b | 2 | |
| Q5b | PDA — aibj with i ≠ j | 3 | |
| Q6a | TM configurations | 2 | |
| Q6b | Identify L(M) + decidability | 2 | |
| Q7a | Prove co-TR (dovetailing) | 2 | |
| Q7b | Prove undecidable (ATM reduction) | 4 | |
| Q7c | Prove not TR | 1 | |
| Q8a | LDEC membership | 1 | |
| Q8b | Why reduction fails + hierarchy insight | 2 | |
| Q9a | Prove NP | 2 | |
| Q9b | 3SAT instance | 1 | |
| Q9c | Apply reduction | 2 | |
| Q9d | Check yes/no | 1 | |
| Q9e | Find & fix flaw | 1 | |
| Q10a | TR + co-TR ⇒ decidable | 2 | |
| Q10b | NP ⊆ EXP | 2 | |
| Q10c | Regular ∪ non-regular | 1 | |
| Q10d | ETM ≤m ATM? | 1 | |
| Total | 45 | ||