As such, I have moved the blog to a new site: http://herngyi.weebly.com/ One new feature is a gallery of my origami designs.

Weebly does seem to make website design easy, but only time will tell its degree of flexibility (or lack thereof) and possible shortcomings. At least there aren’t unsavory ads.

I installed MathJax so the new blog will have LaTeX support as well.

This blog will be closed. Hope you visit its reincarnation!

More details about the public workshops will be confirmed soon. I’ll be displaying some of my own creations as a “local designer”, so please come to support the event and expand your horizons!

In the previous article we saw that fruitful analogies between finitary and infinitary mathematics can allow the techniques of one to shed light on the other. Here we borrow the power of infinitary math, in particular ultrafilters and non-standard analysis, to simplify proofs of finitary statements.

Arguments in hard analysis are notorious for their profusion of “epsilons and deltas”, a familiar example being the $(\varepsilon, \delta)$-definition of convergence from high school calculus. One may have to keep track of a whole army of epsilons, some of which are “small”, “very small” (i.e. negligible compared to even the “small” epsilons), “very very small” and so on. This “epsilon management” is exacerbated by the unsightly quantifiers (“for every $\varepsilon$ there exists $\delta$ such that…”) sprinkled within any statement. These quantifiers need care to weave together, and careful untangling to comprehend. To borrow a rather mild example from the previous article,

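**Finite convergence principle**. If $\varepsilon > 0$ and $F$ is a function from the positive integers to the positive integers, and $0 \le x_1 \le x_2 \le \ldots \le x_M \le 1$ is such that $M$ is sufficiently large depending on $F$ and $\varepsilon$, then there exists an integer $N$ with $1 \le N < N + F(N) \le M$ such that $|x_n - x_m| \le \varepsilon$ for all $N \le n, m \le N + F(N)$.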

(Does anyone know of more convoluted examples?)

“Automating” epsilon management has progressed through “asymptotic notation” like the $O()$ family of notations, as well as the $\ll$- and $\sim$-type symbols, which rigorously formulate the respective qualitative ideas of “bounded by”, “much smaller than” and “comparable in size to”, without resorting to explicit quantities like $\varepsilon$ and $\delta$.

However, the absence of actual quantities inhibits detailed study; for instance, sums and products of “bounded numbers” (i.e. $O(1)$) are also bounded, but it’s meaningless (obstructed by an axiom of set theory) to say that the set of $O(1)$ numbers is closed under addition and multiplication. *Non-standard analysis* solves this problem by adding new numbers to our number system, including infinities and infinitesimals.

Such “new numbers” are defined using *non-principal ultrafilters*, which provide a method to find the *$p$-limit* of any sequence $x_1, x_2, x_3, \ldots$ of real numbers. If $x_n$ converges then the $p$-limit is the usual limit. However, the sequence $1, 2, 3, \ldots$ has an infinite $p$-limit, written $\omega$ by loose analogy with the ordinal $\omega$, which you can think of informally as “the smallest infinity”. You can get bigger infinities from the $p$-limits of sequences like $2, 4, 6, \ldots$ (whose $p$-limit is, understandably, $2\omega$).

The standard real number system together with all possible $p$-limits forms the set of *non-standard numbers*, or *hyperreal numbers*. The analogue of an $O(1)$ number is then a hyperreal number whose absolute value is smaller than some standard real number. In fact, the set of such numbers is a *ring*, and we can readily apply every insight from *ring theory* to it. This is made possible by the principle that non-standard numbers can be manipulated just like standard numbers, also known as:

**Transfer principle**. Every first-order proposition valid over the real numbers is also valid over the hyperreals.

This allows us to take reciprocals of infinities to get infinitesimals for use in calculus. In fact, allowing calculus to work rigorously with infinitesimals was a major motivation for the development of non-standard analysis. For any infinitesimal $\varepsilon$, the $p$-limit of the sequence $\varepsilon, \varepsilon^2, \varepsilon^3, \ldots$ is much, much smaller than $\varepsilon$. This process can be iterated to churn out a hierarchy of infinitesimals that shrink at a ridiculous pace, simplifying epsilon management. Tao says that:

“it lets one avoid having to explicitly write a lot of epsilon-management phrases such as ‘Let $\eta_2$ be a small number (depending on $\eta_0$ and $\eta_1$) to be chosen later’ and ‘… assuming $\eta_2$ was chosen sufficiently small depending on $\eta_0$ and $\eta_1$’, which are very frequent in hard analysis literature…”

I guess ultrafilters do not change a proof in essence but greatly simplify its language, freeing one’s attention for the big picture, as opposed to wading in a swamp of $\varepsilon$’s and $\delta$’s.

The original blog post details several important limitations to the above properties. It also develops some interesting properties of ultrafilters, such as their connection to the usual limits (and identities concerning them) and to *propositional logic*. Many of these properties are explained with a wonderfully illustrative analogy of an ultrafilter as a “voting system”: in a sequence $x_1, x_2, x_3, \ldots$, each integer $n$ “votes” for some real number $x_n$, and the $p$-limit is the elected candidate. Different ultrafilters are distinguished by how much influence each integer has on the final decision (voting is unfair!).

The connection with propositional logic comes from asking each integer a yes-no question and evaluating the $p$-limit, which would be either yes ($1$) or no ($0$). A property $P$ of integers (i.e. $P(n)$) is “$p$-true” (i.e. almost surely true) if the “decision” after asking the integers “do you satisfy $P$?” is “yes”. $p$-truths satisfy the laws of logic, and tautologically true statements are $p$-true!

[1.5] Tao, Terence. Ultrafilters, non-standard analysis, and epsilon management. In *Structure and Randomness: Pages from Year One of a Mathematical Blog*. American Mathematical Society (2008).

Readers need some familiarity with sentences like “for every $\varepsilon > 0$ there exists a large $N$ such that…” or its symbolic equivalent, “$\forall \varepsilon > 0 \; \exists N \ldots$”.

A rough idea of *big O notation* would also be helpful but is not necessary. A brief introduction: we say that $f(x) = O(g(x))$ (as $x \to \infty$) to mean that the “growth” of $f$ is “bounded above” by $g$; more formally, there exist positive real numbers $C$ and $x_0$ such that $|f(x)| \le C|g(x)|$ for all $x > x_0$.

(Note that $|x|$ denotes the absolute value of $x$, not its cardinality as indicated in Part 4.)
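To make the definition concrete, here is a minimal sketch that tests candidate witnesses $C$ and $x_0$ for a claim $f(x) = O(g(x))$ on a finite sample of points. The functions `f` and `g`, the helper name `witnesses_big_o`, and the sample range are all illustrative assumptions; a finite check like this is only evidence, not a proof.

```python
def witnesses_big_o(f, g, C, x0, samples):
    """Return True if |f(x)| <= C * |g(x)| at every sample point x > x0."""
    return all(abs(f(x)) <= C * abs(g(x)) for x in samples if x > x0)


def f(x):
    # Hypothetical example function: 3x^2 + 5x
    return 3 * x**2 + 5 * x


def g(x):
    # Hypothetical comparison function: x^2
    return x**2


samples = range(1, 10_001)

# C = 4 and x0 = 5 work, because 3x^2 + 5x <= 4x^2 exactly when x >= 5.
print(witnesses_big_o(f, g, C=4, x0=5, samples=samples))  # True

# C = 3 can never work, because 5x > 0 for every positive x.
print(witnesses_big_o(f, g, C=3, x0=5, samples=samples))  # False
```

The point of the two calls is that the definition only asks for *some* pair of constants to exist; a bad choice of $C$ failing says nothing by itself.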

*Analysis* (something like advanced calculus) is often differentiated into “hard analysis” (“quantitative”, “finitary”) and “soft analysis” (“qualitative”, “infinitary”). *Discrete math*, *computer science*, and *analytic number theory* normally use hard analysis while *operator algebra*, *abstract harmonic analysis*, and *ergodic theory* tend to rely on soft analysis. The field of *partial differential equations* uses techniques from both.

Convenient notation (e.g. limits, written $x_n \to x$) from qualitative analysis can conceal the gritty details of quantitative analysis and let one argue efficiently from the big picture, at the cost of a precise description. Conversely, quantitative analysis can be seen as a more precise and detailed refinement of qualitative analysis. The intuitions, methods and results in hard analysis often have analogues in soft analysis and vice versa, despite their contrasting language. Tao argues that this transfer of techniques can benefit both disciplines. Table 5 features a rough “dictionary” between the notational languages of soft and hard analysis. Kudos to Tao for such an illuminating comparison!

Table 5: “Translating” soft analysis to hard analysis [1.3]

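| Soft analysis | Hard analysis |
|---|---|
| $X$ is finite | $X$ is bounded (e.g. $X = O(1)$) |
| $X$ vanishes | $X$ is small (e.g. $\lvert X \rvert \le \varepsilon$) |
| $X$ is infinite | $X$ is large (e.g. $\lvert X \rvert \ge N$) |
| $X_n \to 0$ | Quantitative decay bound (e.g. $X_n = O(n^{-c})$) |
| $X_n$ is convergent | $X_n$ is metastable* |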

*A sequence is “metastable” when a large number of consecutive terms are very close to each other; see “**Further discussion**” for more details.

Here I must reproduce two concise and important observations of Tao (almost directly lifted):

- Soft analysis statements can often be formulated both succinctly and rigorously, by using precisely defined and useful concepts (e.g. compactness, measurability, etc.). In hard analysis, one usually has to sacrifice one or the other: either one is rigorous but verbose (using lots of parameters such as $\varepsilon$, $N$, etc.), or succinct but “fuzzy” (using intuitive but vaguely defined concepts such as “size”, “complexity”, “nearby”, etc.).
- A single concept in soft analysis can have multiple hard analysis counterparts. In particular, a “naive” translation of a statement in soft analysis into hard analysis may be incorrect. (In particular, one should not use Table 5 blindly to convert from one to the other.)

Tao’s original blog post walks through one such “translation”, from soft analysis to hard analysis, of the fundamental *infinite convergence principle*, more commonly known as:

**Monotone convergence theorem.** Every bounded monotone (i.e. either nonincreasing or nondecreasing) sequence of real numbers converges.

*Intuition: If a sequence keeps increasing but does not slow down enough as it approaches some limit, no upper bound can contain its growth.*

As the first step to a hard analysis version, we try to understand this theorem quantitatively, using precise quantities like $\varepsilon$ and $N$. For simplicity, we restrict the bounded sequence to the interval $[0, 1]$.

**Infinite convergence principle.** If $\varepsilon > 0$ and $0 \le x_1 \le x_2 \le \ldots \le 1$, then there exists an $N$ such that $|x_n - x_m| \le \varepsilon$ for all $n, m \ge N$.

*Intuition: After the first $N$ terms in the sequence, the terms get very close to each other. The sequence is “permanently stable” (varies very little) after $N$ terms.*

(This formulation of the monotone convergence theorem uses the definition of *Cauchy sequences*, as opposed to the usual $(\varepsilon, N)$-definition of convergence. See “**Further discussion**” for more details.)

We seek to capture in the finitary (hard analysis) version the idea that the bounds on a finite monotone sequence “squeeze” the terms together to some precise extent. Indeed, a natural candidate would be:

**Pigeonhole principle (PP)**. If $\varepsilon > 0$ and $0 \le x_1 \le x_2 \le \ldots \le x_M \le 1$ is such that $M \ge 1/\varepsilon + 1$, then there exists an integer $N$ with $1 \le N < M$ such that $|x_{N+1} - x_N| \le \varepsilon$.

*Intuition: Given many points on the line segment $[0, 1]$, pushing points apart will result in some points being squeezed closer together.*

This is easily proved by contradiction. If the distance between consecutive terms $x_N$ and $x_{N+1}$ is always greater than $\varepsilon$, then the sum of all the distances is greater than $(M - 1)\varepsilon \ge 1$, which is absurd because all of the terms must fit inside the interval $[0, 1]$.
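The pigeonhole principle is constructive enough to run. The sketch below assumes the statement in the form "if $0 \le x_1 \le \ldots \le x_M \le 1$ and $M \ge 1/\varepsilon + 1$, then some consecutive pair differs by at most $\varepsilon$", and simply scans for the first such pair. The function name `pigeonhole_index` and the sample sequence are illustrative choices, not taken from Tao's post.

```python
def pigeonhole_index(xs, eps):
    """Return a 1-based N with x_{N+1} - x_N <= eps.

    Assumes xs is nondecreasing, lies in [0, 1], and has
    M >= 1/eps + 1 terms, as the pigeonhole principle requires.
    """
    M = len(xs)
    if M < 1 / eps + 1:
        raise ValueError("sequence too short for this eps")
    for N in range(1, M):
        # xs is 0-based, so x_N is xs[N - 1] and x_{N+1} is xs[N].
        if xs[N] - xs[N - 1] <= eps:
            return N
    # The contradiction argument says the loop must find a pair.
    raise AssertionError("input was not a monotone sequence in [0, 1]")


xs = [0.0, 0.3, 0.35, 0.9, 1.0]   # M = 5 terms
print(pigeonhole_index(xs, 0.25))  # 2, since x_3 - x_2 is about 0.05
```

Notice that the guarantee covers only two neighbouring terms, which is the weakness discussed next.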

Let’s say that a sequence is *stable* on the (integer) interval $[a, b]$ if $|x_n - x_m| \le \varepsilon$ for all $a \le n, m \le b$. Thus we say that the infinite convergence principle guarantees stability on $[N, +\infty)$. It turns out that the PP does not obviously imply the infinite convergence principle, so the former is not a satisfactory finitary version of the latter (we would like them to be equivalent). This is because the PP only guarantees stability on the tiny range $[N, N+1]$. After presenting different generalizations of the PP that extend that range, Tao presents the correct translation:

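**Finite convergence principle**. If $\varepsilon > 0$ and $F$ is a function from the positive integers to the positive integers, and $0 \le x_1 \le x_2 \le \ldots \le x_M \le 1$ is such that $M$ is sufficiently large depending on $F$ and $\varepsilon$, then there exists an integer $N$ with $1 \le N < N + F(N) \le M$ such that $|x_n - x_m| \le \varepsilon$ for all $N \le n, m \le N + F(N)$.

*(This can be proved by applying the PP to the subsequence $x_{i_1}, x_{i_2}, x_{i_3}, \ldots$, where $i_1 = 1$ and $i_{j+1} = i_j + F(i_j)$. This gives an explicit bound on how large $M$ needs to be; any $M \ge i_{\lceil 1/\varepsilon \rceil + 1}$ will suffice.)*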
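The proof sketch above can be turned into a small search. The code below assumes the sparse indices are defined by $i_1 = 1$, $i_{j+1} = i_j + F(i_j)$, and that $\lceil 1/\varepsilon \rceil + 1$ of them are enough for the pigeonhole argument; the function names and the test sequence $x_n = 1 - 1/n$ are illustrative assumptions.

```python
import math


def sparse_indices(F, count):
    """Return the first `count` indices i_1 = 1, i_{j+1} = i_j + F(i_j)."""
    idx = [1]
    while len(idx) < count:
        idx.append(idx[-1] + F(idx[-1]))
    return idx


def stable_window(xs, eps, F):
    """Return a 1-based N such that xs is stable on [N, N + F(N)].

    Assumes xs is nondecreasing and lies in [0, 1]. Because xs is
    monotone, stability on the window only needs
    x_{N + F(N)} - x_N <= eps.
    """
    count = math.ceil(1 / eps) + 1
    idx = sparse_indices(F, count)
    if idx[-1] > len(xs):
        raise ValueError(f"sequence too short: need M >= {idx[-1]}")
    for a, b in zip(idx, idx[1:]):  # here b == a + F(a)
        if xs[b - 1] - xs[a - 1] <= eps:
            return a
    raise AssertionError("input was not a monotone sequence in [0, 1]")


xs = [1 - 1 / n for n in range(1, 17)]  # x_n = 1 - 1/n, M = 16
N = stable_window(xs, eps=0.3, F=lambda n: n)
print(N)  # 2: the sequence varies by at most 0.3 on [2, 4]
```

With $F(n) = n$ the sparse indices are $1, 2, 4, 8, 16$, so the required length $M$ already grows exponentially in $1/\varepsilon$; faster-growing choices of $F$ make it grow much faster still.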

The finite convergence principle guarantees stability on $[N, N + F(N)]$, which can be very large, depending on how $F$ is chosen. Tao proved its equivalence to the infinite convergence principle, thus establishing it as the true finitary version. It looks considerably uglier than its infinitary counterpart, with its epsilons and quantifiers and tangled logic; this demonstrates the ability of soft analysis to abstract away those wordy, quantitative descriptions into cleaner statements.

However, not every infinitary statement has a finitary analogue; consider the obvious statement

**Infinite pigeonhole principle**. If the integers are divided among a finite number of groups, one of those groups must have infinite size.

I suspect that there was much of the informal intuition that I couldn’t capture in my adaptation above – simply because I don’t fully understand the subject matter. The dangers of oversimplification also make it useful for readers to check out the original blog post too.

I left out many rows of the original Table 5 because they needed much prior knowledge; but the following row might bring insight to the student of topology:

| Soft analysis | Hard analysis |
|---|---|
| $f$ is uniformly continuous | $f$ has a Lipschitz or Hölder bound (e.g. $\lvert f(x) - f(y) \rvert = O(\lvert x - y \rvert)$) |

From one of Tao’s papers (Tao, 2008), I’m guessing *metastability* (a word he coined recently) to mean the stability of a sequence over a large interval.

Tao could well have formulated the infinite convergence principle as the equivalent

**Infinite convergence principle (limit version).** If $0 \le x_1 \le x_2 \le \ldots \le 1$, then there exists a real number $x$ such that for every $\varepsilon > 0$, there exists an $N$ such that $|x_n - x| \le \varepsilon$ for all $n \ge N$.

which is the familiar $(\varepsilon, N)$-definition of convergence from calculus. But the idea of a “limit” has no obvious finitary counterpart, so Tao favoured the formulation using Cauchy sequences as presented earlier.

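The initial attempts to extend the PP’s interval of stability are actually quite instructive. A “constant extension” gives a stability range of $[N, N + k]$:

**Constant-extended pigeonhole principle**. If $\varepsilon > 0$ and $k$ is a positive integer and $0 \le x_1 \le x_2 \le \ldots \le x_M \le 1$ is such that $M \ge k/\varepsilon + 1$, then there exists an integer $N$ with $1 \le N < N + k \le M$ such that $|x_n - x_m| \le \varepsilon$ for all $N \le n, m \le N + k$.

*(Prove it! Hint: apply the PP to the subsequence $x_1, x_{k+1}, x_{2k+1}, \ldots$)*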

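A “linear extension” gives a stability range of $[N, 2N]$:

**Linearly-extended pigeonhole principle**. If $\varepsilon > 0$ and $0 \le x_1 \le x_2 \le \ldots \le x_M \le 1$ is such that $M \ge 2^{1/\varepsilon} + 1$, then there exists an integer $N$ with $1 \le N < 2N \le M$ such that $|x_n - x_m| \le \varepsilon$ for all $N \le n, m \le 2N$.

*(Prove it! Hint: apply the PP to the subsequence $x_1, x_2, x_4, x_8, \ldots$)*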

Greater extensions progressively strengthen the PP but still do not yield any generalization that can imply the infinite convergence principle. The finite convergence principle is in some sense the “ultimate generalization”, which takes into account every single extension possible (including the extensions $F(N) = k$ and $F(N) = N$ shown above). Only then does the PP gain enough strength to imply the infinite convergence principle.

**Challenge: can you generalize the PP to extend its stability range “polynomially” (range $[N, N^2]$) or “exponentially” (range $[N, 2^N]$)?**

[1.3] Tao, Terence. Soft analysis, hard analysis, and the finite convergence principle. In *Structure and Randomness: Pages from Year One of a Mathematical Blog*. American Mathematical Society (2008).

(Tao, 2008) Tao, Terence. Norm convergence of multiple ergodic averages for commuting transformations. *Ergodic Theory and Dynamical Systems* **28**, 657-688 (2008).

A graph $G = (V, E)$ is a set $V$ of *vertices* (“objects”) and a set $E$ of *edges* (“relationships between two objects”). For example, $V$ could be a set of people and $E$ the set of friendships (Facebook really uses this social graph to analyze user behavior). $V$ could also be the set of airports and $E$ the set of flights from one airport to another. The immense flexibility of this definition allows graphs to model and analyze a tremendous variety of real-life situations, but here we are interested in the abstract representation of a graph, in particular its *drawing*. The focus is on applying the technique of *amplification*, which was presented in Part 3 of this series.

A drawing of a graph $G$ simply draws the vertices in $V$ as dots on the plane, and the edges in $E$ as lines (or curves) connecting them. $G$ can have many possible drawings, in which the edges can have different numbers of crossings between pairs of edges (e.g. three edges passing through a common point count as 3 crossings). Fig. 4(a)-(c) features three drawings of the graph with different numbers of crossings.

(Moving vertices to remove crossings is the theme of the wonderful online game *Planarity.*)

Let $\mathrm{cr}(G)$ denote the minimum number of crossings among all drawings of $G$, also called the *crossing number* of $G$. The number of vertices $|V|$ is a useful measure of the “size” of $G$; likewise, $|E|$ is a useful measure of the “complexity” of $G$. Thus we attempt to express many quantities related to graphs in terms of these two numbers. For instance, Euler’s formula applies when $G$ is connected and has a drawing with no crossing edges (i.e. $G$ is *planar*, and $\mathrm{cr}(G) = 0$):

$$|V| - |E| + |F| = 2, \tag{4.1}$$

where $|F|$ is the number of *faces* of that non-crossing drawing, i.e. the number of regions the plane is divided into by the edges (verify it for a drawing of a chessboard!). You may recall this formula as the Euler characteristic for polyhedra; the analogy is obvious when you think of stretching a polyhedron open and squashing it into the plane, which yields a planar graph.
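For readers who would like to check the chessboard exercise by machine, here is a minimal sketch. It assumes Euler's formula in the form $|V| - |E| + |F| = 2$ for a connected planar drawing, and counts the outer region as a face; the function name `grid_counts` is an illustrative choice.

```python
def grid_counts(n):
    """Count vertices, edges and faces of an n-by-n board drawn as a graph.

    Vertices are the corners of the squares, edges are the sides of the
    squares, and faces are the n*n squares plus the one outer region.
    """
    vertices = (n + 1) ** 2
    edges = 2 * n * (n + 1)  # n*(n+1) horizontal plus n*(n+1) vertical
    faces = n * n + 1
    return vertices, edges, faces


v, e, f = grid_counts(8)  # a standard chessboard
print(v, e, f)            # 81 144 65
print(v - e + f)          # 2
```

The same count gives 2 for every board size, which is a quick sanity check that the outer region really must be included among the faces.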

We’d like to relate $\mathrm{cr}(G)$ to $|V|$ and $|E|$, but two graphs can have the same number of vertices and edges yet different crossing numbers (find two such graphs!). The next best thing is a bound in terms of $|V|$ and $|E|$, one of which we derive here: the crossing number inequality

$$\mathrm{cr}(G) \ge \frac{|E|^3}{64|V|^2} \quad \text{whenever } |E| \ge 4|V|. \tag{4.2}$$

The condition $|E| \ge 4|V|$ requires that $G$ is “complex enough” relative to its size; this bound is useful because we are mainly interested in the behavior of complicated graphs.

Unbelievably, this bound is the result of amplifying (twice) an inequality that is rather trivial in comparison; this “starting point” is the relationship between $|E|$ and $|F|$ when $G$ is planar. Since each edge touches at most 2 faces and each face touches at least 3 edges, we have the “starting point” $3|F| \le 2|E|$ (when does equality hold?). Substituting into Eq. (4.1), we have

$$|E| \le 3|V| - 6. \tag{4.3}$$

We now amplify by exploiting the symmetry due to the “freedom to look at a subgraph” (a smaller graph obtained by deleting some edges or vertices), something like self-similarity. The crucial observations are:

- An imbalance in this symmetry occurs if we remove edges from $G$, because the RHS of Eq. (4.3) is unaffected.
- Removing edges can remove crossings.

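Suppose that $\mathrm{cr}(G) > 0$. Take a drawing of $G$ with exactly $\mathrm{cr}(G)$ crossings. For each crossing, remove one of the edges involved; this eliminates all of the crossings by removing at most $\mathrm{cr}(G)$ edges. With no more crossings left, we apply Eq. (4.3) to this smaller graph, which still has at least $|E| - \mathrm{cr}(G)$ edges, to get

$$\mathrm{cr}(G) \ge |E| - 3|V| + 6. \tag{4.4}$$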
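As a quick sanity check, the sketch below evaluates this first bound, taken here in the form $\mathrm{cr}(G) \ge |E| - 3|V| + 6$, on small complete graphs $K_n$ and compares it with their known crossing numbers (1, 3 and 9 for $K_5$, $K_6$ and $K_7$). The function names are illustrative, and the crossing numbers are quoted from the literature rather than computed.

```python
def edge_deletion_bound(num_vertices, num_edges):
    """Lower bound on the crossing number from deleting one edge per crossing."""
    return num_edges - 3 * num_vertices + 6


def complete_graph_counts(n):
    """Number of vertices and edges of the complete graph K_n."""
    return n, n * (n - 1) // 2


# Known crossing numbers of K_5, K_6, K_7 (quoted, not computed here).
known_crossing_numbers = {5: 1, 6: 3, 7: 9}

for n, cr in known_crossing_numbers.items():
    v, e = complete_graph_counts(n)
    print(f"K_{n}: bound = {edge_deletion_bound(v, e)}, actual = {cr}")
```

The bound is exact for $K_5$ and $K_6$ but already falls behind at $K_7$ (6 versus 9), which hints at why a second round of amplification is worthwhile.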

Eq. (4.4) is our first “real” bound. We amplify more, this time by removing vertices (and the edges touching them). This indirectly removes crossings, but for the sake of a strong bound we want to remove many crossings with few vertices. Which vertices should we choose? We call upon the counter-intuitive *probabilistic method* from combinatorics:

… if there is no obvious “best” choice for some combinatorial object (such as a set of vertices to delete), then trying a *random* choice will often give a reasonable answer, provided the notion of “random” is chosen carefully.

[1.10, p. 97]

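Hence we randomly remove each vertex with probability $1 - p$. Vertices survive with probability $p$, edges with probability $p^2$ (both endpoints must survive), and crossings with probability at most $p^4$ (see the full reasoning). We apply Eq. (4.4) to the “expected surviving graph”, discarding the harmless $+6$, to obtain

$$p^4 \, \mathrm{cr}(G) \ge p^2 |E| - 3p|V|. \tag{4.5}$$

The crossing number inequality (Eq. (4.2)) results from making the RHS of Eq. (4.5) as large as possible relative to the LHS by tweaking the value of $p$; the choice $p = 4|V|/|E|$, which is a valid probability precisely when $|E| \ge 4|V|$, does the job.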
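For completeness, here is the final substitution written out. It assumes Eq. (4.5) in the form $p^4\,\mathrm{cr}(G) \ge p^2|E| - 3p|V|$ and the choice $p = 4|V|/|E|$, which is the value used in Tao's post; the intermediate steps are a routine worked calculation rather than a quotation.

```latex
\begin{align*}
p^4 \,\mathrm{cr}(G) &\ge p^2 |E| - 3p|V| \\
\mathrm{cr}(G) &\ge \frac{|E|}{p^2} - \frac{3|V|}{p^3}
  && \text{(divide by } p^4 > 0\text{)} \\
&= \frac{|E|^3}{16|V|^2} - \frac{3|E|^3}{64|V|^2}
  && \text{(substitute } p = 4|V|/|E|\text{)} \\
&= \frac{|E|^3}{64|V|^2}.
\end{align*}
```

The hypothesis $|E| \ge 4|V|$ is exactly what keeps $p \le 1$, so that "keep each vertex with probability $p$" makes sense.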

My explanation of how the probabilistic method is used here risks oversimplification; I urge readers to look up the full argument in the original blog post.

The crossing number inequality is very strong and, as a result, very useful (see “**Why do we need strong inequalities?**” in Part 3 of this series). The original blog post lists two of its applications, including an easy derivation of the Szemerédi-Trotter theorem, which bounds the number of point-line incidences that can occur given a collection of points and lines on a plane. The connection arises by viewing the setup as the drawing of a graph: points are vertices, and lines containing many points can be broken into edges connecting those points.

[1.10] Tao, Terence. The crossing number inequality. In *Structure and Randomness: Pages from Year One of a Mathematical Blog*. American Mathematical Society (2008).

This’ll be quick; the only part of the original blog post that I understood comfortably was the brief (and slightly inaccurate) explanation of traditional image compression. It’s how image formats like JPEG can reduce the memory space needed by an image file drastically while losing only a bit of quality.

A camera takes a photo of dimensions 1024 × 2048 by recording each of the roughly 2 million pixels. That makes for a lot of data, but it turns out that most of the data is *redundant* and can be thrown away if we can accept a small reduction in image quality.

Many pictures have large areas with more or less the same color (e.g. blue sky, white table). Actually, the color may vary slightly but not enough for our eyes to notice the difference. We can partition the array of 2 million pixels into a patchwork of rectangles of pixels, where each rectangle has roughly the same color. Instead of storing each pixel with its color, we simply need to store the average color and the rectangle’s size and position!
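The idea can be sketched in a few lines. The code below is a deliberately simplified stand-in for the scheme described above: it uses fixed-size square blocks instead of freely chosen rectangles, works on a tiny grayscale image stored as a list of rows, and is not how JPEG actually works. The function names and the sample image are illustrative assumptions.

```python
def compress_blocks(image, block):
    """Replace each block-by-block square by (top, left, height, width, average)."""
    height, width = len(image), len(image[0])
    records = []
    for top in range(0, height, block):
        for left in range(0, width, block):
            rows = [row[left:left + block] for row in image[top:top + block]]
            pixels = [p for row in rows for p in row]
            average = sum(pixels) / len(pixels)
            records.append((top, left, len(rows), len(rows[0]), average))
    return records


def decompress_blocks(records, height, width):
    """Rebuild an approximate image by painting each rectangle with its average."""
    image = [[0.0] * width for _ in range(height)]
    for top, left, h, w, average in records:
        for r in range(top, top + h):
            for c in range(left, left + w):
                image[r][c] = average
    return image


# A 4x4 "photo": bright sky on the left, dark table on the right.
image = [
    [200, 201, 50, 50],
    [201, 200, 50, 50],
    [200, 200, 50, 51],
    [200, 200, 51, 50],
]
records = compress_blocks(image, block=2)
print(len(records))  # 4 rectangles instead of 16 pixels
restored = decompress_blocks(records, 4, 4)
print(restored[0])   # [200.5, 200.5, 50.0, 50.0]
```

Each restored pixel is off by at most one gray level here, which is the "small reduction in image quality" traded for storing a quarter as many colors.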

Tao provides a more accurate and comprehensive picture of traditional image compression in his original blog post. He then goes into the main topic of compressed sensing, which aims to reconstruct the entire image (signal, data recording, etc.) by capturing only part of the image. He justifies the need for this ruthless efficiency in sensor networks with small power supplies. Applications include magnetic resonance imaging (“MRI scans”), astronomy, and recovering corrupted data.

[1.2] Tao, Terence. Compressed sensing and single-pixel cameras. In *Structure and Randomness: Pages from Year One of a Mathematical Blog*. American Mathematical Society (2008).

Tao uses the analogy of the game Tomb Raider as a model to give some intuition for the reasons behind the “weird” consequences of quantum mechanics (QM), in particular the so-called “many worlds interpretation”. The game consists of two worlds:

- **Internal System:** Lara Croft, the protagonist, needs to navigate tombs and solve the puzzles in them to survive. Suppose that she is intelligent and independent of the Player of the game.
- **External System:** The Player of the game can view the game world and “save” the game to “restore” it when Lara dies. “Restoring” resets Lara’s memory (well, she simply went back to an “earlier state”, right?). Unfortunately, he cannot see inside the tombs, which limits his assistance.

(At this point, Tao apologizes for the violent analogy. Well, I’ve added a little drama… and fixed a loophole.)

Each save point before a lethal puzzle causes Lara’s world (from her point of view) to split into many possible “developments”, some of which involve her failure to solve the puzzle and thus death, while others have her survive. Figure 1 illustrates a sample puzzle: a tomb whose only exit is a wooden trapdoor on the floor that leads to an underground passage. The tomb will collapse in, say, five minutes, and Lara (“L”) must escape through the passageway, but the trapdoor is locked. The Player (smiley face) helps by creating a save point.

From the Player’s perspective (Fig. 1(b)), he watches Lara enter the tomb, saves the game, and waits anxiously for five minutes before the dreaded collapse – but no sign of Lara. He restores and waits helplessly as history repeats itself. *“She wasted that health pack I gave her!”* But he brightens after the second restore, when Lara emerges in two minutes flat, carrying two extra health packs (picked up in the tomb) to boot!

So far, so good. The game proceeds in accordance with the game mechanisms, with past events determining future ones – that is to say, the **External System** operates in a “deterministic” manner.

However, from Lara’s point of view (Fig. 1(c)), her world “splits” into several “routes” at a save point, and each Lara-copy attempts the puzzle-copy in her tomb-copy. Each Lara-copy experiences the puzzle only once, no matter whether she eventually lives or dies. Like the proverbial cat, Lara as a whole (if that makes sense at all) is in a mixture of life-deathness. The **Internal System** is not deterministic, even as it exists within a deterministic **External System**!

Here’s where quantum mechanics comes in. The separate “routes” can interfere with one another (or “superimpose”, in QM-speak). Suppose that every time Lara dies, her corpse remains in the tomb even after restoring. The game still runs deterministically (in the **External System**). Each poor Lara-copy must be quite unnerved by an assortment of her own bodies strewn about the tomb she enters! But why would we dream of such a morbid setup? It turns out to be necessary to explain these strange rumours that have been intriguing the Player.

*“I’d heard that this tomb was really hard – only a one in three chance of surviving. I’d hoped the health pack would boost her chances, you know. But the other Players have all told me that no matter how well they prepared Lara, her survival rate always stayed at 33%!”*

And that’s not all; the other tombs have rigid survival rates with uncanny values like 50%, 20%, 25%… the reciprocals of whole numbers.

(If you’re wondering what QM has to do with all of this, QM grew out of the need to explain the strange phenomenon of quantized energy levels in atoms. Gas of a single element emits light of a few fixed colors when excited. In fact, those colors correspond to jumps between “uncanny” fixed energy levels, which for hydrogen are proportional to the reciprocals of square numbers.)

How can “superimposing worlds” explain the curious tomb survival rates? Well, the astute Lara observed that the termite-infested wood of the trapdoor looked brittle and tried to break through it, but her body weight wasn’t enough (the price of her figure). After two hapless Laras got buried in rubble, the third Lara had a brainwave after a couple of minutes: Nerves of steel overcoming her disgust, she dragged her two corpses onto the trapdoor and slammed against it with all her might – and it was just enough to break through! Flushed with success, she left the tomb carrying her loot – two extra health packs that, well, belonged to her in the first place…

Unfortunately, the next Lara would have no convenient corpses lying around, and only every third Lara would survive.
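The “every third Lara survives” rule is simple enough to simulate. The sketch below assumes the mechanism described above: a Lara survives only if at least two corpses are already in the tomb, and a successful escape clears the corpses for whoever comes next. The function name and the parameter `corpses_needed` are illustrative.

```python
def simulate_tomb(attempts, corpses_needed=2):
    """Return a list of booleans: True where that attempt's Lara survives."""
    corpses = 0
    outcomes = []
    for _ in range(attempts):
        if corpses >= corpses_needed:
            outcomes.append(True)
            corpses = 0       # the corpses leave the tomb along with Lara
        else:
            outcomes.append(False)
            corpses += 1      # another body left behind in the rubble
    return outcomes


outcomes = simulate_tomb(9)
print(outcomes)
print(sum(outcomes), "survivors out of", len(outcomes))  # 3 survivors out of 9
```

Changing `corpses_needed` to 1, 3 or 4 gives survival rates of one in two, one in four or one in five, matching the other “reciprocal” tombs; the health pack never enters the rule, which is why it cannot change the odds.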

Tao’s original blog post [1.1] extends this analogy to interpret more “weird” QM phenomena such as quantum entanglement, the famous double-slit experiment, the violation of Bell’s Inequality, and even the fact that “weird” quantum effects occur at microscopic scales but not in everyday life.

Tao’s original explanation of the “reciprocal survival rates” was that Lara needed to stack a certain number of corpses together to reach a switch crucial to her survival. However, the corpses would remain there and assist Lara whenever she tried the puzzle again, ensuring survival. So I arranged for the corpses to leave the tomb together with Lara.

[1.1] Tao, Terence. Quantum mechanics and Tomb Raider. In *Structure and Randomness: Pages from Year One of a Mathematical Blog*. American Mathematical Society (2008).

Given an inequality with imbalances in symmetry between the left-hand side (LHS) and right-hand side (RHS), *amplification* is a mathematical trick that can exploit that imbalance to derive a stronger inequality (i.e. the LHS and RHS are closer). As for why mathematicians might need such a technique, see “**Why do we need strong inequalities?**” below.

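Suppose the inequality reads $A(x) \le B(x)$. Consider some transformations $T$ that change $x$ such that $B$, but not $A$, is “symmetric” relative to $T$. That is, $B(Tx) = B(x)$. Then $A(Tx) \le B(x)$ for every such $T$, and we can choose $T$ to maximize the LHS and “tighten” the inequality. Let’s illustrate this trick by applying it to prove the Cauchy-Schwarz Inequality (actually, the special case of the familiar $n$-dimensional space $\mathbb{R}^n$):

$$|v \cdot w| \le \|v\| \, \|w\| \quad \text{for all } v, w \in \mathbb{R}^n. \tag{3.1}$$

Discard the trivial case of either $v$ or $w$ being a zero vector. The dot product and norms (self-dot product) like $\|v\|^2 = v \cdot v$ suggest expanding the (nonnegative) dot product $(v - w) \cdot (v - w) \ge 0$ into

$$v \cdot w \le \tfrac{1}{2}\|v\|^2 + \tfrac{1}{2}\|w\|^2.$$

The vector norm (and by extension, the RHS) is symmetric relative to dilation by factor $-1$ (i.e. flip each point 180° about the origin) but the LHS is not. So we choose the transformation $v \mapsto -v$ whenever $v \cdot w$ is negative. This gives

$$|v \cdot w| \le \tfrac{1}{2}\|v\|^2 + \tfrac{1}{2}\|w\|^2.$$

The inequality has been tightened by increasing the LHS, but the RHS is still too big. This is because $\|v\| \, \|w\| \le \tfrac{1}{2}\|v\|^2 + \tfrac{1}{2}\|w\|^2$ (easily derived by expanding $(\|v\| - \|w\|)^2 \ge 0$).

We reduce the RHS by exploiting another imbalance in symmetry, this time of a “homogenisation” transformation which scales the first vector by $\lambda > 0$ and the second by $1/\lambda$, which keeps the product of their lengths the same. Thus the dot product (and by extension, the LHS) is symmetric relative to $(v, w) \mapsto (\lambda v, w/\lambda)$ but the RHS is not, which gives

$$|v \cdot w| \le \frac{\lambda^2}{2}\|v\|^2 + \frac{1}{2\lambda^2}\|w\|^2.$$

Finally we minimize the RHS by setting $\lambda = \sqrt{\|w\| / \|v\|}$ to get Eq. (3.1). We have “amplified” our way into the Cauchy-Schwarz Inequality!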
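The last step can be checked numerically. The sketch below assumes the amplified bound has the form $|v \cdot w| \le \tfrac{\lambda^2}{2}\|v\|^2 + \tfrac{1}{2\lambda^2}\|w\|^2$ with the optimal scaling $\lambda = \sqrt{\|w\|/\|v\|}$; the helper names and the sample vectors are illustrative.

```python
import math


def dot(v, w):
    """Dot product of two vectors given as equal-length tuples."""
    return sum(a * b for a, b in zip(v, w))


def norm(v):
    """Euclidean length of a vector."""
    return math.sqrt(dot(v, v))


def amplified_rhs(v, w, lam):
    """RHS of the bound after scaling v by lam and w by 1/lam."""
    return 0.5 * lam**2 * norm(v) ** 2 + 0.5 * norm(w) ** 2 / lam**2


v, w = (3.0, 4.0), (1.0, 2.0)
best = math.sqrt(norm(w) / norm(v))

print(abs(dot(v, w)))             # LHS: 11.0
print(amplified_rhs(v, w, 1.0))   # unamplified RHS: 15.0
print(amplified_rhs(v, w, best))  # amplified RHS, equal to norm(v) * norm(w)
print(norm(v) * norm(w))
```

Trying other values of `lam` only makes the right-hand side larger, which is the numerical face of “minimize the RHS”.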

For another insight into Eq. (3.1), consider that $v \cdot w = \|v\| \, \|w\| \cos\theta$, where $\theta$ is the angle between $v$ and $w$.

The “magical cheating” of amplification can produce amazingly strong inequalities, that are difficult to prove otherwise, from comparatively weak or even trivial inequalities. However, from another perspective, this means that the strong and weak inequalities were equivalent in the first place! Our vision was simply obstructed and misled by low-quality versions of the strong inequality.

Tao’s original blog post [1.9] exploits more symmetry imbalances (via dilation, homogeneity, linearity and phase rotation for complex numbers) to amplify an astonishing variety of inequalities in *harmonic analysis*.

Even symmetry via Cartesian Products can be exploited to amplify the following inequality on cardinalities of sumsets (do check the definition)

into the more general inequality

This intrigued me for two reasons – this was a rather “discrete” inequality (mostly involving integers) as opposed to the “continuous” nature of the Cauchy-Schwarz Inequality, and amplification was used to **generalize**, not merely to strengthen inequalities!

For the sake of readers unfamiliar with inequalities, I feel that I need to explain beyond what Tao did:

Many properties and quantities in math are difficult or impossible to calculate exactly (e.g. the number of twin primes (primes differing by $2$, like 11 and 13) below some $x$, or the position or momentum of particles), so we often establish upper or lower bounds on them to at least give some idea of the size of the quantity. Conjecturing is now somewhat easier with computer methods (e.g. plot a graph of “number of twin primes below $x$” versus $x$ and look at the graph shape) but how do we prove a bound $P \le Q$?

We could start off with known inequalities and add them together, multiply or somehow manipulate them into some $P' \le Q'$, where hopefully (A) $P \le P'$ and (B) $Q' \le Q$. Naive attempts usually fail because we violate either (A) or (B). Such failure gets more likely if the gap between $P$ and $Q$ is narrow; such an inequality is called “tight”, “strong” or “powerful”. Failure becomes more likely because an inequality that is “too weak” (LHS and RHS are too far apart) may participate in our manipulations, bloating the gap between $P'$ and $Q'$ and preventing us from “squeezing” both of them between $P$ and $Q$. So we need to be extra careful to use only strong inequalities in our manipulations, which requires many strong inequalities to be available in the first place. Strong inequalities are so useful that they can give rise to quick proofs of difficult inequalities.

Of course, after we prove our conjecture $P \le Q$, ambitious souls might attempt to tighten the bound further. “Is this the most we can estimate about $P$? Could we improve our knowledge of its size?” Strengthening techniques such as amplification might then lend a hand.

For instance, an upper bound of the aforementioned number of twin primes below is known to be

where .

The smaller the constant, the tighter the upper bound, but the harder it is to prove that the inequality still holds. The history of improvement on the value of the *twin prime constant* (as proven to be possible) by various researchers is summarized in the following table (Nazardonyavi, 2012):

Interested readers can check out the proof of the best known bound (Wu, 2004).

[1.9] Tao, Terence. Amplification, arbitrage, and the tensor power trick. In *Structure and Randomness: Pages from Year One of a Mathematical Blog*. American Mathematical Society (2008).

(Nazardonyavi, 2012) Nazardonyavi, S. Some History About the Twin Prime Conjecture. *ArXiv* (2012).

(Wu, 2004) Wu, J. Chen’s double sieve, Goldbach’s conjecture and the twin prime problem. *Acta Arithmetica* **114**, 215-273, (2004).

The first posts that I read from Fields Medalist Terence Tao’s research blog “What’s new” were pieces of advice to aspiring mathematicians, such as mathematical writing tips or what it takes to do math. His blog helped me decide to create my own blog to talk about my own math research, among other things. But his technical posts put me off reading his blog until I recently read a compilation of some of his posts in his book *Structure and Randomness* (Tao, 2008) (See cover at right).

His expository articles on math and science were surprisingly nontechnical, as well as fun and enriching to read. They conveyed the big picture of the topics in question, imparting intuition for, and wonder at, the way that math and mathematicians work. I could understand only a few articles, but even so I decided to share my joy with other nontechnical readers like myself. I have extracted what I could understand, expanded on it, and adapted it to minimize prerequisite math knowledge, producing the following six-part article series:

1. **Tomb raider: an analogy for quantum weirdness** (adapted from *Quantum mechanics and Tomb Raider*)
2. **Image compression basics** (adapted from *Compressed sensing and single-pixel cameras*)
3. **Strengthening inequalities: a mathematical trick** (adapted from *Amplification, arbitrage, and the tensor power trick*)
4. **Drawing networks on a plane** (adapted from *The crossing number inequality*)
5. **Analogies between finitary and infinitary math** (adapted from *Soft analysis, hard analysis and the finite convergence principle*)
6. **Infinities as numbers: purging the epsilons and deltas from proofs** (adapted from *Ultrafilters, nonstandard analysis, and epsilon management*)

(1) and (2) are non-technical discussions of math and science topics; (3), (4), and (6) are demonstrations of mathematical tools and tricks; and (5) is about a piece of “mathematical folklore” (see “**Why blog about research?**” below).

Tao’s writing is concise enough that any further extracting, as I have done, misses the complete picture. He is careful to note down important caveats to the insights he presents. I have ignored those caveats, as worthy sacrifices to pique the interest of less technical readers. However, I hope my adaptations also serve as gateways to attract technical readers towards Tao’s well-written and more comprehensive exposition. They’d probably understand more of what he writes than I did!

Besides the many expository articles I didn’t understand enough to confidently present, *Structure and Randomness* included transcripts of some of his lecture series and some of his favourite open problems in math. The target audience of the former is more technical than I am, and the latter is (by definition) at the forefront of mathematical knowledge. However, I hope that able readers check those out as well for their equally illuminating focus on the big picture without the most technical details.

P.S. Tao’s concise writing and excellent choice of words sometimes force me to lift directly from him rather than cobble together my own contrived paraphrase. I hope he doesn’t mind; this is a testament to his writing skills. If you find my adaptations convincing and enjoyable, most of the credit goes to Tao.

**Why blog about research?**

Tao describes in the preface to *Structure and Randomness* the advantages of research blogging and its niche between traditional print media (journals, books) and informal communications (lectures, conferences). I will skip the benefits for the blogger himself in favor of those for the readers.

- Research blogs are informal, dynamic, and interactive (discussions in the comments), but with a permanent record, well-defined author, and links to further resources.
- The math community circulates pieces of “mathematical folklore”, or common intuitions about a field and their interpretations (e.g. “discrete” vs. “continuous”; also, (5) is folklore about the field of *analysis*), too fuzzy for formal literature but suitable for blogging.

However, blog posts are not permanent enough to be cited. This pushed Tao to compile some of those posts, with corrections and further ideas from reader comments, into *Structure and Randomness*.

(Tao, 2008) Tao, Terence. Structure and Randomness: Pages from Year One of a Mathematical Blog. American Mathematical Society (2008).

]]>恒行人少处

毅力自磨勤

先辈开山引

生吾效法心

Li Shou, 2012-08-11

A rough translation:

Persevering in the path less travelled;

Determination to sharpen one’s skills;

My forebears broke new ground,

which inspires me to follow.

I keep this poem around in my wallet. It fills me with ambition whenever I read it. Thank you so much for this poem!

P.S. It’s also an acrostic poem; the first characters of its four lines together form the phrase “Mr. Herng Yi”.

]]>