11 — The Context-Free Pumping Lemma
The tool for proving languages are NOT context-free
Why a Second Pumping Lemma?

In page 08, we built a powerful weapon: the pumping lemma for regular languages. It exploited a fundamental limitation of DFAs (finite states force loops) to prove languages are not regular. We used it to banish \(\{0^n1^n\}\) from the regular world.

Then in pages 09-10, we built a bigger world: context-free languages with CFGs and PDAs. These can handle \(\{0^n1^n\}\), balanced parentheses, palindromes. But are all languages context-free? No. Consider \(\{a^n b^n c^n \mid n \geq 0\}\) — matching three counts simultaneously. A PDA's single stack can match two groups (push for \(a\)'s, pop for \(b\)'s), but it has no way to simultaneously verify the \(c\)'s match too. We suspect this language is not context-free.

But suspicion isn't proof. "I couldn't build a PDA" isn't a valid argument — just like "I couldn't build a DFA" wasn't valid for the regular pumping lemma. We need a proof technique. We need a pumping lemma for context-free languages.

The story so far: Same spirit, same game structure, but the mechanism is different because the underlying machine model is different. A DFA has finitely many states; a CFG has finitely many variables. The pigeonhole principle applies to both, but in different ways.
The Intuition: Parse Trees, Not DFA States

Recall the regular pumping lemma's core argument: a DFA with \(p\) states reading a string of length \(\geq p\) must revisit a state (pigeonhole), creating a loop that can be pumped. The CF version uses the exact same reasoning pattern — but applied to parse trees instead of state sequences.

The pigeonhole argument on parse trees

Assume the grammar is in Chomsky Normal Form (page 09 covered CNF). In CNF, every rule is either \(A \to BC\) (two variables) or \(A \to a\) (one terminal). This means:

Now here's the key: the grammar has \(|V|\) variables. If a parse tree is taller than \(|V|\), then any root-to-leaf path of length \(> |V|\) must pass through more than \(|V|\) variable nodes. By the pigeonhole principle, some variable \(R\) must appear at least twice on that path.

The repeated variable gives us the pump. When variable \(R\) appears twice on a root-to-leaf path, we have two nested subtrees — both rooted at \(R\). The larger \(R\)-subtree generates some string \(vxy\), and the smaller \(R\)-subtree (nested inside it) generates just \(x\). The parts of \(vxy\) that are not in the inner subtree are \(v\) (left of \(x\)) and \(y\) (right of \(x\)). The rest of the full string gives us \(u\) (left of everything) and \(z\) (right of everything).

So the full string is \(s = uvxyz\).

Visualizing the two subtrees

digraph { rankdir=TB; size="3.5,3"; node [shape=none fontname="IBM Plex Sans" fontsize=12 margin="0.05,0.02"]; edge [arrowhead=none]; nodesep=0.3; ranksep=0.3; T [label="T (start)" fontcolor="#555"]; T -> u [style=dashed]; T -> R1; T -> z [style=dashed]; u [label="u" fontcolor="#888" fontname="IBM Plex Mono"]; z [label="z" fontcolor="#888" fontname="IBM Plex Mono"]; R1 [label="R ← upper" fontcolor="#dc2626"]; R1 -> v [style=dashed]; R1 -> R2; R1 -> y [style=dashed]; v [label="v" fontcolor="#2563eb" fontname="IBM Plex Mono"]; y [label="y" fontcolor="#2563eb" fontname="IBM Plex Mono"]; R2 [label="R ← lower" fontcolor="#dc2626"]; R2 -> x; x [label="x" fontcolor="#16a34a" fontname="IBM Plex Mono"]; { rank=same; u; v; x; y; z; } }

Now the magic: since both subtrees are rooted at the same variable \(R\), they're interchangeable. We can:

The analogy to the regular case: The regular version pumps one piece (\(y\)). The CF version pumps two pieces simultaneously (\(v\) and \(y\)) — because the repeated structure in a tree has a left side and a right side.
Interactive: See the Pump in Action

The static diagram above shows the idea, but the real insight comes from watching the tree change as you pump. The next two examples let you vary the pumping factor \(i\) and see exactly how the parse tree grows or shrinks — making the "repeated variable creates two wings" mechanism viscerally obvious.

Parse Tree Example 1: \(S \to aSb \mid \varepsilon\) generating \(\{a^n b^n\}\)

This grammar has one variable \(S\) that appears on every root-to-leaf path. When \(S\) appears twice, the outer \(S\)-subtree generates \(a \cdot S \cdot b\) (so \(v = a\), \(y = b\)) and the inner \(S\)-subtree generates the center \(x\). Pumping means inserting or removing layers of this \(S \to aSb\) wrapping.

Use the slider to change \(i\). Watch the tree grow (pump up) or shrink (pump down). The string decomposition \(uvxyz\) is color-coded throughout.

i = 1 (i=1 is the original string)
Parse Tree
String decomposition: \(uv^ixy^iz\)
u (prefix) v (left wing) x (center) y (right wing) z (suffix) R (repeated variable)

Parse Tree Example 2: Arithmetic Expressions

Grammar: \(E \to E + T \mid T\),   \(T \to T \times F \mid F\),   \(F \to (E) \mid a\).

Consider the string \(a + a \times a\). In the parse tree, the variable \(E\) appears twice on the path from root to the leftmost \(a\): the root \(E\) and the \(E\) inside \(E \to E + T\). This repeated \(E\) is our \(R\). The outer \(E\)-subtree generates the full string \(a + a \times a\), and the inner \(E\)-subtree generates just \(a\) (the leftmost \(a\)).

The decomposition is: \(u = \varepsilon\), \(v = \varepsilon\), \(x = a\), \(y = {+}\,a \times a\), \(z = \varepsilon\). Here \(v\) is empty and \(y\) carries all the pumping. When we pump up, we duplicate the \(+\,a \times a\) tail. When we pump down, we collapse to just \(a\).

But there is a more interesting decomposition using \(T\) as the repeated variable. On the path from root to the rightmost \(a\), \(T\) appears twice: the \(T\) in \(E \to E + T\) and the \(T\) inside \(T \to T \times F\). This gives: \(u = a +\), \(v = a \times\), \(x = a\), \(y = \varepsilon\), \(z = \varepsilon\). Now \(v\) carries the pumping — each pump adds another \(a \times\) factor. Use the selector below to explore both.

i = 1
Parse Tree
String decomposition: \(uv^ixy^iz\)
u (prefix) v (left wing) x (center) y (right wing) z (suffix) R (repeated variable)
What to notice in both examples:
Formal Statement
Pumping Lemma for Context-Free Languages (Sipser Theorem 2.34).
If \(A\) is a context-free language, then there exists a number \(p \geq 1\) (the pumping length) such that for every string \(s \in A\) with \(|s| \geq p\), there exists a decomposition \(s = uvxyz\) satisfying:
  1. For each \(i \geq 0\): \(uv^ixy^iz \in A\) — pumping preserves membership
  2. \(|vy| > 0\) — the pumped parts are not both empty
  3. \(|vxy| \leq p\) — the pumpable region is bounded in length
Key difference from regular PL: \(|vxy| \leq p\) means the pumpable window can be anywhere in the string (not pinned to the start like \(|xy| \leq p\) in the regular version). The adversary chooses where. This makes CF pumping proofs harder — you must argue all window placements fail. Full comparison table at the bottom of this page.

The Quantifier Structure

Same alternating quantifier pattern as the regular version — just with more pieces:

\(\exists\, p\) \(\forall\, s \in A,\; |s| \geq p\) \(\exists\, uvxyz = s\) \(\forall\, i \geq 0\) : \(uv^ixy^iz \in A\)

If you already internalized the quantifier game from page 08, this is the same game with five pieces instead of three. The adversary still picks \(p\) and the split; you still pick the string and the pump value.

The Adversarial Game

To prove a language \(L\) is not context-free, you negate the pumping lemma. Same game structure as page 08, adapted for five parts:

The CF Pumping Lemma Game (to prove \(L\) is NOT context-free):
Adversary (defends CF) You (attack)
1. Picks \(p\) — any positive integer
2. Pick \(s \in L\) with \(|s| \geq p\)
3. Picks split \(s = uvxyz\) satisfying \(|vy| > 0\) and \(|vxy| \leq p\)
4. Pick \(i\) such that \(uv^ixy^iz \notin L\)
What changed from the regular game: The adversary has strictly more power here (5-part split, window can slide anywhere). That's why CF pumping lemma proofs tend to involve more case analysis than regular PL proofs.
The Proof Scheme
Template for CF pumping lemma proofs:
  1. Assume for contradiction that \(L\) is context-free.
  2. Then the CF pumping lemma gives us a pumping length \(p\).
  3. Choose \(s \in L\) with \(|s| \geq p\). (Strategic choice — must be in the language!)
  4. Consider any decomposition \(s = uvxyz\) with \(|vy| > 0\) and \(|vxy| \leq p\).
  5. Show that some \(uv^ixy^iz \notin L\) — find a specific \(i\) that breaks it, for every valid split.
  6. Contradiction. The pumping lemma is violated, so \(L\) is not context-free.

Identical template to the regular PL proofs from page 08. The only mechanical difference: you reason about \(uvxyz\) instead of \(xyz\), and use \(|vy| > 0\), \(|vxy| \leq p\) instead of \(|y| > 0\), \(|xy| \leq p\).

Proof Sketch: Why the CF Pumping Lemma Holds

Here's the full argument in five steps (Sipser Theorem 2.34 proof).

Step 1: Put the grammar in CNF

Let \(G\) be a CFG for \(A\) in Chomsky Normal Form. Let \(|V|\) be the number of variables in \(G\). In CNF, every internal node has exactly 2 children (from rules \(A \to BC\)), and leaves correspond to terminals (from rules \(A \to a\)).

Step 2: Tall trees force repeated variables

A binary tree of height \(h\) has at most \(2^h\) leaves. Set \(p = 2^{|V|+1}\). If \(|s| \geq p = 2^{|V|+1}\), then any parse tree for \(s\) must have height at least \(|V| + 1\). A path of length \(|V| + 1\) from root to leaf passes through at least \(|V| + 1\) internal nodes (variables). Since there are only \(|V|\) variables, by the pigeonhole principle, some variable \(R\) appears at least twice on this path.

Step 3: Find the repeated variable on the longest path

Pick the parse tree \(\tau\) for \(s\) with the smallest number of nodes (the most efficient parse). Take its longest root-to-leaf path. Select \(R\) to be a variable that repeats among the bottom \(|V| + 1\) variables on this path. This choice ensures the upper \(R\)-subtree has height at most \(|V| + 1\).

Step 4: Read off the five pieces

The upper \(R\)-subtree generates some substring \(vxy\) of \(s\). The lower \(R\)-subtree (nested inside) generates \(x\). The rest gives \(u\) on the left and \(z\) on the right. So \(s = uvxyz\).

Summary of the proof: Long string \(\Rightarrow\) tall parse tree \(\Rightarrow\) long root-to-leaf path \(\Rightarrow\) repeated variable \(R\) (pigeonhole) \(\Rightarrow\) two nested \(R\)-subtrees \(\Rightarrow\) subtree surgery gives pumping. The exact same logical skeleton as the regular PL proof, just with parse trees instead of state sequences.
Worked Examples

Example 1: \(B = \{a^n b^n c^n \mid n \geq 0\}\)

The canonical non-context-free language. Matching two groups is context-free (\(\{a^n b^n\}\)); matching three is not. (Sipser Example 2.36.)

Step 1: Assume \(B\) is context-free

Then by the CF pumping lemma, there exists a pumping length \(p\).

Step 2: Choose \(s = a^p b^p c^p\)

\(s \in B\) because it has equal counts of \(a\)'s, \(b\)'s, and \(c\)'s. \(|s| = 3p \geq p\).

Step 3: Consider any valid split \(s = uvxyz\)

Since \(|vxy| \leq p\), the pumpable region spans at most \(p\) characters. The string has three blocks of \(p\) characters each: \(a^p\), \(b^p\), \(c^p\). A window of size \(\leq p\) can touch at most two of these three blocks. So \(v\) and \(y\) together contain at most two of the three symbols \(\{a, b, c\}\).

Step 4: Pump with \(i = 2\)

Consider \(uv^2xy^2z\). Since \(|vy| > 0\), pumping increases the total count of at least one symbol. But since \(v\) and \(y\) touch at most two of the three symbol types, pumping increases the count of at most two types while leaving the third unchanged. The three counts are no longer equal.

Therefore \(uv^2xy^2z \notin B\).

Step 5: Contradiction

The CF pumping lemma requires \(uv^ixy^iz \in B\) for all \(i\), but we found \(i = 2\) where it fails. Therefore \(B\) is not context-free.

Why \(|vxy| \leq p\) was essential. Without this constraint, the adversary could choose \(v = a^k\) and \(y = c^k\), spanning the entire string. Then pumping would keep all three counts equal, and the proof would fail. The length bound \(|vxy| \leq p\) forces the pumpable window to be "local" — it can't span all three blocks — so pumping necessarily unbalances the counts.
Note on case analysis. You might wonder: don't we need to consider where exactly \(vxy\) falls (all in the \(a\)'s? straddling \(a\)'s and \(b\)'s? etc.)? For this particular proof we don't, because the argument "at most two symbol types are affected" handles all cases at once. But for harder examples, you will need explicit case analysis on where the window lands.

Example 2: \(D = \{ww \mid w \in \{0,1\}^*\}\)

Strings that are the concatenation of some string with itself. This appeared in the regular pumping lemma too (page 08), where we proved \(\{ww\}\) is not regular. Now we prove the stronger result: it's not even context-free. (Sipser Example 2.38.)

Step 1: Assume \(D\) is context-free

The CF pumping lemma gives us pumping length \(p\).

Step 2: Choose \(s = 0^p 1^p 0^p 1^p\)

\(s \in D\) because \(s = ww\) with \(w = 0^p 1^p\). And \(|s| = 4p \geq p\).

Note: The string choice matters a lot here. Sipser points out that \(s = 0^p 1 0^p 1\) does not work — it can be pumped without leaving \(D\). The string \(0^p 1^p 0^p 1^p\) captures more of the language's "essence."

Step 3: Use \(|vxy| \leq p\) to constrain where pumping happens

Since \(|vxy| \leq p\), the pumpable region fits within a window of \(p\) characters. The string is: \(\underbrace{0\cdots0}_{p}\;\underbrace{1\cdots1}_{p}\;\underbrace{0\cdots0}_{p}\;\underbrace{1\cdots1}_{p}\). We consider where \(vxy\) can fall:

  • Entirely in the first half (\(0^p 1^p\)): Pumping changes the first half but not the second, so the two halves no longer match. \(uv^2xy^2z \neq ww\).
  • Entirely in the second half (\(0^p 1^p\)): Same argument — second half changes, first half doesn't.
  • Straddling the midpoint: The window crosses from the first \(1^p\) block into the second \(0^p\) block. Pumping with \(i = 2\) shifts the boundary — a 1 from the first half moves into the first position of the second half (or vice versa). The resulting string can't have the form \(ww\) because the midpoint is disrupted.

Step 4: Conclusion

In all cases, \(uv^2xy^2z \notin D\). Contradiction. Therefore \(D\) is not context-free.

Why the string choice matters so much. With the regular PL, the constraint \(|xy| \leq p\) always pinned the pump region to the start of the string, so "bad" string choices were rare. With the CF PL, \(|vxy| \leq p\) lets the window be anywhere. You need to pick a string where every position of the window leads to a contradiction. The string \(0^p 1^p 0^p 1^p\) works because it has a symmetric structure with a clear midpoint, and any local pumping disrupts the global \(ww\) property.

Example 3: \(C = \{a^i b^j c^k \mid 0 \leq i \leq j \leq k\}\)

This is Sipser Example 2.37. It looks similar to \(\{a^n b^n c^n\}\) but is subtler — the constraint is \(i \leq j \leq k\) (non-strict inequality chain), not strict equality. The proof requires more careful case analysis because we need to argue about both pumping up and pumping down.

Step 1: Assume \(C\) is context-free

The CF pumping lemma gives us pumping length \(p\).

Step 2: Choose \(s = a^p b^p c^p\)

\(s \in C\) because \(p \leq p \leq p\). And \(|s| = 3p \geq p\).

Step 3: Case analysis on the split

Since \(|vxy| \leq p\), the window \(vxy\) touches at most two symbol types. We consider all possibilities:

Case A: \(v\) and \(y\) each contain only one type of symbol.

Since \(v\) and \(y\) together span at most two of \(\{a, b, c\}\), at least one symbol is absent from both \(v\) and \(y\). We check each sub-case:

  • \(a\)'s don't appear in \(v\) or \(y\): Then pump up (\(i = 2\)). This increases \(b\)'s and/or \(c\)'s but not \(a\)'s. Since initially \(i = j = p\), increasing \(j\) (or \(k\)) while keeping \(i = p\) is fine. But what if only \(b\)'s increase? Then \(j > k\), violating \(j \leq k\). If only \(c\)'s increase, \(i \leq j \leq k\) still holds. So we need to try pump down instead: \(uv^0xy^0z = uxz\). This decreases \(b\)'s and/or \(c\)'s. If \(b\)'s decrease, then \(a > b\) violates \(a \leq b\). If \(c\)'s decrease, then \(b > c\) violates \(b \leq c\). Either way, contradiction.
  • \(b\)'s don't appear in \(v\) or \(y\): Pump up: increases \(a\)'s and/or \(c\)'s. If \(a\)'s increase, \(a > b\) violates \(a \leq b\). Contradiction.
  • \(c\)'s don't appear in \(v\) or \(y\): Pump up: increases \(a\)'s and/or \(b\)'s. If \(b\)'s increase, \(b > c\) violates \(b \leq c\). If \(a\)'s increase, \(a > b\) violates \(a \leq b\). Contradiction.

Case B: \(v\) or \(y\) contains more than one type of symbol.

Then \(uv^2xy^2z\) has symbols in the wrong order (e.g., \(a\)'s after \(b\)'s, or \(b\)'s after \(c\)'s). The string is no longer of the form \(a^* b^* c^*\), so it's certainly not in \(C\). Contradiction.

Step 4: Conclusion

Every valid split leads to a contradiction. Therefore \(C\) is not context-free.

Lesson from this example. Unlike Example 1 where one pump value (\(i = 2\)) killed all cases, here we needed different pump values for different cases: sometimes \(i = 2\) (pump up), sometimes \(i = 0\) (pump down). This is perfectly valid — you're allowed to pick different \(i\) values for different splits. The key is that for each split, some \(i\) breaks it.

The CF Pumping Game

Play against the CF pumping lemma for \(B = \{a^n b^n c^n \mid n \geq 0\}\).

The adversary picks \(p\) and a split. You pick \(i\) to break it.

Common Mistakes
1. Picking a specific \(p\).
Same mistake as with the regular PL. Wrong: "Let \(p = 5\)." The adversary picks \(p\). Your string must work for ALL \(p\).

2. Picking a specific split.
Wrong: "Let \(v = aa\), \(y = bb\)." The adversary picks the split. You must show ALL valid splits (satisfying \(|vy| > 0\) and \(|vxy| \leq p\)) fail. This usually means case analysis.

3. Forgetting \(|vxy| \leq p\) applies differently than \(|xy| \leq p\).
In the regular PL, \(|xy| \leq p\) pins the pump to the start of the string. In the CF PL, \(|vxy| \leq p\) means the window can be anywhere. Don't assume \(v\) and \(y\) are at the beginning.

4. Not handling all cases.
Because the window can be anywhere, you need to account for every possible position. A common error: proving it works when \(vxy\) is in one part of the string, but forgetting the case where it straddles two parts.

5. Only pumping up.
Sometimes \(i = 2\) doesn't work for every split, but \(i = 0\) (pumping down) does. You're free to use different \(i\) values for different cases. Some proofs need both pump-up and pump-down.

6. Bad string choice.
This is even more critical than with the regular PL. Sipser explicitly shows that for \(D = \{ww\}\), the string \(0^p 1 0^p 1\) does NOT work — but \(0^p 1^p 0^p 1^p\) does. If your proof gets stuck, try a different string before concluding the approach is wrong.

7. Using the CF PL to prove context-freeness.
Same one-way trap: CF-pumpable does NOT imply context-free. The CF pumping lemma can only prove languages are NOT context-free. It's a necessary condition for CFLs, not sufficient.
Full Comparison: Regular PL vs. CF PL
Regular Pumping Lemma CF Pumping Lemma
Proves Language is NOT regular Language is NOT context-free
Based on DFA states (pigeonhole) Parse tree variables (pigeonhole)
Machine model DFA/NFA — finite states PDA — finite states + stack
Generative model Regular expressions Context-free grammars (CNF)
Split \(s = xyz\) (3 parts) \(s = uvxyz\) (5 parts)
Pumped form \(xy^iz\) \(uv^ixy^iz\)
Non-empty \(|y| > 0\) \(|vy| > 0\)
Length bound \(|xy| \leq p\) (first \(p\) chars) \(|vxy| \leq p\) (any window of \(p\) chars)
Pumping length \(p = \) # of DFA states \(p = 2^{|V|+1}\) in CNF
Quantifier structure \(\exists p\; \forall s\; \exists xyz\; \forall i\) \(\exists p\; \forall s\; \exists uvxyz\; \forall i\)
Proof difficulty Often one case (pump at start) Usually multiple cases (window can be anywhere)
Canonical example \(\{0^n 1^n\}\) is not regular \(\{a^n b^n c^n\}\) is not context-free
The Hierarchy So Far

With the CF pumping lemma, we can now prove strict containment at both levels. Here's the picture:

??? (Turing-recognizable, decidable, ...)
\(\{a^n b^n c^n\}\), \(\{ww\}\), and beyond...
Context-Free Languages
\(\{0^n1^n\}\), \(\{ww^R\}\), balanced parens
Regular Languages
DFA/NFA/regex: \(0^*1^*\), even parity
What we can now prove: The next level up is decidable languages (Chapter 3 of Sipser), recognized by Turing machines that always halt. And \(\{a^n b^n c^n\}\) is decidable — a Turing machine with a read/write tape can easily match three counts by making multiple passes.
Closure properties confirm the separation. As noted in page 10, CFLs are NOT closed under intersection. The classic witness: \(\{a^n b^n c^k \mid n, k \geq 0\}\) and \(\{a^k b^n c^n \mid n, k \geq 0\}\) are both CFLs, but their intersection is \(\{a^n b^n c^n\}\) — which we just proved is not context-free.
Exercises

Exercise 1: Prove \(\{a^n b^n c^{2n} \mid n \geq 0\}\) is not context-free

Use the CF pumping lemma. Choose an appropriate string \(s\) and handle all cases for where \(vxy\) can fall.

Hint: Choose \(s = a^p b^p c^{2p}\). Think about what happens to the ratio between \(a\)'s, \(b\)'s, and \(c\)'s when you pump.

Exercise 2: Why can't the CF PL prove \(\{a^n b^n\}\) is not context-free?

We know \(\{a^n b^n\}\) IS context-free (it has the CFG \(S \to aSb \mid \varepsilon\)). But can you show concretely why a CF pumping lemma proof attempt would fail? Pick \(s = a^p b^p\) and show that the adversary can always find a valid split where pumping stays in the language.

Exercise 3: \(\{0^i 1^j \mid i = j^2\}\) — non-context-free?

Prove that \(L = \{0^{n^2} 1^n \mid n \geq 0\}\) (strings with \(n^2\) zeros followed by \(n\) ones) is not context-free.

Hint: Choose \(s = 0^{p^2} 1^p\). When you pump, the number of 0's or 1's changes linearly, but the relationship \(\#0 = (\#1)^2\) is quadratic. A linear perturbation breaks a quadratic relationship.

Summary
The Context-Free Pumping Lemma — the essentials: The quantifier game (CF version):
They pick \(p\)You pick \(s\)They split \(uvxyz\)You pick \(i\) → show \(uv^ixy^iz \notin L\)