# Analogies between finitary and infinitary math

Part 5 of a six-part series of adaptations from Terence Tao’s book “Structure and Randomness“.

## Prerequisites

Readers need some familiarity with sentences like “for every $\varepsilon$ there exists a large $N$ such that…” or its symbolic equivalent, “$\forall\varepsilon: \exists N: \dotsc$“.

A rough idea of big O notation would also be helpful but is not necessary. A brief introduction: we say that $f(x) = O(g(x))$ (as $x \to \infty$) to mean that the “growth” of $f(x)$ is “bounded above” by $g(x)$; more formally, there exist positive real numbers $M$ and $x_0$ such that $\lvert f(x)\rvert \leq M\lvert g(x)\rvert$ for all $x > x_0$.

(note that $\lvert x\rvert$ denotes the absolute value of $x$, not its cardinality as indicated in Part 4.)

## [1.3] Soft analysis, hard analysis, and the finite convergence principle

Analysis (something like an advanced calculus) is often differentiated into “hard analysis” (“quantitative”, “finitary”) and “soft analysis” (“qualitative”, “infinitary”). Discrete math, computer science, and analytic  number theory normally uses hard analysis while operator algebra, abstract harmonic analysis, and ergodic theory tend to rely on soft analysis. The field of partial differential equations uses techniques from both.

Convenient notation (e.g. $O(\:)$) from qualitative analysis can conceal gritty details from quantitative and argue efficiently from the big picture, at the cost of a precise description. Conversely, quantitative analysis can be seen as a more precise and detailed refinement of qualitative analysis. The intuitions, methods and results in hard analysis often have analogues in soft analysis and vice versa, despite their contrasting language. Tao argues this technique transfer can benefit both disciplines. Table 5 features a rough “dictionary” between the notational languages of soft and hard analysis. Kudos to Tao for such an illuminating comparison!

Table 5: “Translating” soft analysis to hard analysis [1.3]

 Soft analysis Hard analysis $x$ is finite $x$ is bounded (e.g. $x = O(1)$) $x$ vanishes $x$ is small (e.g. $\lvert x\rvert \leq \varepsilon$) $x$ is infinite $x$ is large (e.g. $\lvert x\rvert \geq N$) $x_n \to 0$ Quantitative decay bound (e.g. $x_n = O(n^{-c})$) $x_n$ is convergent $x_n$ is metastable*

*A sequence is “metastable” when a large number of consecutive terms are very close to each other; see “Further discussion” for more details.

Here I must reproduce two concise and important observations of Tao (almost directly lifted):

• Soft analysis statements can often be formulated both succinctly and rigorously, by using precisely defined and useful concepts (e.g. compactness, measurability, etc.). In hard analysis, one usually has to sacrifice one or the other: either one is rigorous but verbose (using lots of parameters such as $\varepsilon$, $N$, etc.), or succinct but “fuzzy” (using intuitive but vaguely defined concepts such as “size”, “complexity”, “nearby”, etc.)
• A single concept in soft analysis can have multiple hard analysis counterparts. In particular, a “naive” translation of a statement in soft analysis into hard analysis may be incorrect. (In particular, one should not use Table 5 blindly to convert from one to the other.)

Tao’s original blog post walks through one such “translation”, from soft analysis to hard analysis, of the fundamental infinite convergence principle, more commonly known as:

Monotone convergence theorem. Every bounded monotone (i.e. either nonincreasing or nondecreasing) sequence of real numbers converges.

Intuition: If a sequences increases but does not slow down enough at some limit, no upper bound can contain its growth.

As the first step to a hard analysis version, we try to understand this theorem quantitatively—using precise quantities like $\varepsilon$ and $N$. For simplicity, we restrict the bounded sequence to the interval $[0, 1]$.

Infinite convergence principle. If $\varepsilon > 0$ and $0 \leq x_1 \leq x_2 \leq \dotsb \leq 1$, then there exists an $N$ such that $\lvert x_n - x_m\rvert \leq \varepsilon$ for all $n, m \geq N$.

Intuition: After the first $N$ terms in the sequence, the terms get very close to each other. The sequence is “permanently stable” (varies very little) after $N$ terms.

(This formulation of the monotone convergence theorem uses the definition of Cauchy sequences, as opposed to the usual $(\epsilon, \delta)$-definition of convergence. See “Further discussion” for more details.)

We seek to capture in the finitary (hard analysis) version the idea that the bounds on a finite monotone sequence “squeeze” the terms together to some precise extent. Indeed, a natural candidate would be:

Pigeonhole principle (PP). If $\varepsilon > 0$ and $0 \leq x_1 \leq x_2 \leq \dotsb \leq x_M \leq 1$ is such that $M \geq 1/\varepsilon + 1$, then there exists an integer $N$ with $1 \leq N < M$ such that $\lvert x_{N + 1} - x_N\rvert \leq \varepsilon$.

Intuition: Given many points $x_n$ on the line segment $[0, 1]$, pushing points apart will result in some points being squeezed closer together.

The usual version of the Pigeonhole Principle has the same intuition – if you try to fit too many pigeons (points) into a limited number of holes (bounded interval), some pigeons will end up close together.

This is easily proved by contradiction. If the distance between consecutive terms $x_N$ and $x_{N + 1}$ is always greater than $\varepsilon$, then the sum of all the distances is greater than $(M - 1)\varepsilon \geq 1$, which is absurd because all of the terms must fit inside the interval $[0, 1]$.

Let’s say that a sequence is stable on the (integer) interval $[a, b]$ if $\lvert x_n - x_m\rvert \leq \varepsilon$ for all $a \leq n, m \leq b$. Thus we say that the infinite convergence principle guarantees stability on $[N, +\infty)$. It turns out that the PP does not obviously imply the infinite convergence principle, so the former is not a satisfactory finitary version of the latter (we would like them to be equivalent). This is because the PP only guarantees stability on the tiny range $[N, N+1]$. After different generalizations of the PP that extend that range, Tao presents the correct translation:

Finite convergence principle. If $\varepsilon > 0$ and $F$ is a function from the positive integers to the positive integers, and $0 \leq x_1 \leq x_2 \leq \dotsb \leq x_M \leq 1$ is such that $M$ is sufficiently large depending on $F$ and $\varepsilon$, then there exists an integer $N$ with $1 \leq N < N + F(N) \leq M$ such that $\lvert x_n - x_m\rvert \leq \varepsilon$ for all $N \leq n, m \leq N + F(N)$.

(This can be proved by applying the PP to the subsequence $x_{i_1}, x_{i_2}, x_{i_3}, \dotsc$, where $i_1 = 1$ and $i_{j + 1} = i_j + F(i_j)$. This gives an explicit bound on how large $M$ needs to be; any $M \geq i_{\lfloor 1/\varepsilon\rfloor + 1}$ will suffice.)

The finite convergence principle guarantees stability on $[N, N + F(N)]$, which can be very large, depending how $F$ is chosen. Tao proved its equivalence to the infinite convergence principle, thus establishing it as the true finitary version. It looks considerably uglier than its infinitary counterpart, with its epsilons and quantifiers and tangled logic; this demonstrates the ability of soft analysis to abstract away those wordy, quantitative descriptions into cleaner statements.

However, not every infinitary statement has a finitary analogue; consider the obvious statement

Infinite pigeonhole principle. If the integers are divided among a finite number of groups, one of those groups must have infinite size.

## Further discussion

I suspect that there was much of the informal intuition that I couldn’t capture in my adaptation above – simply because I don’t fully understand the subject matter. The dangers of oversimplification also make it useful for readers to check out the original blog post too.

### 1 Translation table

I left out many rows of the original Table 5 because they needed much prior knowledge; but the following row might bring insight to the student of topology:

 Soft analysis Hard analysis $f$ is uniformly continuous $f$ has a Lipschitz or Hölder bound (e.g. $\lvert f(x) - f(y)\rvert = O(\lvert x - y\rvert)$)

From one of Tao’s papers (Tao, 2008), I’m guessing metastability (a word he coined recently) to mean the stability of a sequence over a large interval.

### 2 Alternatives to the limit

Tao could well have formulated the infinite convergence principle as the equivalent

Infinite convergence principle (limit version). If $0 \leq x_1 \leq x_2 \leq \dotsb \leq 1$, then there exists a real number $x$ such that for every $\varepsilon > 0$, there exists a n $N$ such that $\lvert x_n - x\rvert \leq \varepsilon$ for all $n \geq N$.

which is the familiar $(\varepsilon, \delta)$-definition of convergence from calculus. But the idea of a “limit” has no obvious finitary counterpart, so Tao favoured the formulation using Cauchy sequences as presented earlier.

### 3 Extending the stability range

The initial attempts to extend the PP’s interval of stability are actually quite instructive. A “constant extension” gives a stability range of $[N, N + k]$:

Constant-extended pigeonhole principle. If $\varepsilon > 0$ and $k$ is a positive integer and $0 \leq x_1 \leq x_2 \leq \dotsb \leq x_M \leq 1$ is such that $M \geq k/\varepsilon + 1$, then there exists an integer $N$ with $1 \leq N < N + k \leq M$ such that $\lvert x_n - x_m\rvert \leq \varepsilon$ for all $N \leq n, m \leq N + k$.

(Prove it! Hint: apply the PP to the subsequence $x_k, x_{2k}, x_{3k}, \dotsc$)

A “linear extension” gives a stability range of $[N, 2N]$:

Linearly-extended pigeonhole principle. If $\varepsilon > 0$ and $0 \leq x_1 \leq x_2 \leq \dotsb \leq x_M \leq 1$ is such that $M \geq 2^{1/\varepsilon} + 1$, then there exists an integer $N$ with $1 \leq N < 2N \leq M$ such that $\lvert x_n - x_m\rvert \leq \varepsilon$ for all $N \leq n, m \leq 2N$.

(Prove it! Hint: apply the PP to the subsequence $x_1, x_2, x_4, x_8, \dotsc$)

Greater extensions progressively strengthen the PP but still do not yield any generalization that can imply the infinite convergence principle. The finite convergence principle is somewhat the “ultimate generalization” which takes into account every single extension $F$ possible (including the extensions $F(N) = N + k$ and $F(N) = 2N$ as shown above). Only then does the PP gain enough strength to imply the infinite convergence principle.

Challenge: can you generalize the PP to extend its stability range “polynomially” (range $[N, N^k]$) or “exponentially” (range $[N, k^N]$)?

## References

[1.3] Tao, Terence. Soft analysis, hard analysis, and the finite convergence principle. In Structure and Randomness: Pages from Year One of a Mathematical Blog. American Mathematical Society (2008).

(Tao, 2008) Tao, Terence. Norm convergence of multiple ergodic averages for commuting transformationsErgodic Theory and Dynamical Systems 28, 657-688 (2008).