### Conway’s Base 13 Function

(N.b. Credit where credit’s due: I learned about this function from an answer of Robin Chapman’s on MathOverflow, and its measurability from Noah Stein.)

Conway’s base 13-function is a strange beast. It was originally crafted by John Conway as a counterexample to the converse of the intermediate value theorem, and has the property that on any open interval its image contains the entire real line. In addition, its support set also serves as an illustration of a dense, uncountable set of numbers whose Lebesgue measure is 0.

The beast itself
We define a function $f$ in the following way. Let $x$ be an arbitrary real number. Expand it in base 13 using the symbols $0,1,2,3,4,5,6,7,8,9,\cdot,-,+$ and disallow the infinite-repeating string of the plus sign (which we round up the previous digit in the same way $0.99999999\ldots := 1$ in ordinary base-10 notation). Consider the pattern $\ldots \pm a_1\ldots a_n\cdot b_1\ldots$ of symbols where $a_i,b_i$ are digits (0,1,…,9). In other words, we are looking for numbers whose base 13 expansion “ends” with either the plus or minus sign, a string of digits, the dot sign, and an infinite string of digits.

Now if $x$ is a number of the form we are looking for, we define $f(x)$ to be the number, in base 10, given by the tail-expansion $\pm a_1\ldots a_n\cdot b_1\ldots$, where the dot sign is interpreted as the decimal point. If $x$ is not a number we are looking for, we define $f(x) = 0$. In other words, if after throwing away finitely many leading tredigits of the base 13 expansion of $x$ we end up with what looks like a base 10 number, we keep it as the value of $f$, else we set the value to 0.

Every interval is mapped to the whole real line
Let $A < B$ be two distinct real numbers. I claim that for every real number $r$, there are infinitely many $x$'s between $A$ and $B$ satisfying $f(x) = r$. To illustrate: take the base 13 expansions of $A,B$. Since the two numbers are not equal, after finitely many tredigits [btw, digits are base ten, bits are base two, and tredigits are the counterpart in base 13] the two starts to differ. In other words

$A = \ldots \chi_n\ldots \chi_0 A_1 A_2 \ldots$
$B = \ldots \chi_n\ldots \chi_0 B_1 B_2 \ldots$

Furthermore, by definition, the tredecimal expansion cannot end in an infinite string of plus signs. So for some finite $N \geq 2$ we can assume $A_N$ is not the plus sign. We define the number $C$ to be

$C = \ldots \chi_n\ldots \chi_0 A_1 \ldots A_{N-1} +$

given by a finite string. By construction for any $x$ obtained by starting with $C$ and adding terms beyond the final plus sign, we have $A < C < x < B$.

Now take the decimal expansion of $r$ (prefixed with the plus or minus sign), and just append it to the end of $C$; this gives a number $x$ such that $f(x) = r$. If between the expansion of $r$ and $C$ we insert any finite string of tredigits, we get another $x$ with $f(x) = r$. Since the set of finite strings are infinite, we obtain the claim that $f(x) = r$ infinitely often within any interval.

Now denote the set where $f(x) \neq 0$ by $S$. By our explicit construction above, inside any interval, there exists an injection from the 'real line sans zero’ into $S$, so $S$ is uncountable. By the same token, $S$ intersects any open set, so $S$ is dense.

Detour: intermediate value theorem

A classic theorem from either the college algebra/pre-calculus course in high school, or from a first calculus class in college is the intermediate value theorem. This says that if a function $h(x)$ is continuous, and for some $a_0 < b_0$ we have $h(a_0) = A$ and $h(b_0) = B$, then for any value $C$ between $A$ and $B$, we can find at least one $c_0$ between $a_0$ and $b_0$ such that $h(c_0) = C$.

This appeals to the intuition that to continuously get from $A$ to $B$ on the real line, one must pass through every point in between.

Students, however, are often willing to assume the converse of the intermediate value theorem, that is, if a function is such that for every $a_0 < b_0$ with $h(a_0) = A$ and $h(b_0) = B$ and $C$ between $A,B$, there exists some $a_0 < c_0 < b_0$ such that $h(c_0) = C$, then $h$ must be continuous. This is, of course, false. And a powerful example is given by Conway’s base 13 function.

By construction, the function visits every real number within the span of any interval. So it fits the condition that to get from $A$ to $B$, it passes through every point in between. However, by the same token, for any fixed number $y$, no matter how close one gets to $y$, one can always find some $x$ even closer such that $f(x)$ is arbitrarily large. And hence $f(x)$ is, by construction, not a continuous function.

The size of the support set

We have already seen that the set $S$ on which $f$ does not vanish is a dense, uncountable set. A different way of measuring the size of a set, however, is using the notion of the Lebesgue measure. The two notions (cardinality and measure) interacts strangely. Students in a measure theory class are often taught early that the set of rational numbers have Lebesgue measure 0. The usual proof of this fact uses that the Lebesgue measure is countably additive, so that the measure of a countable union of sets is bounded above by the sum of their measures. Since the Lebesgue measure of a single point is 0, and there are countably many rational numbers, this means that the set of rational numbers have Lebesgue measure 0 (in between we also need the fact that the countable union of measurable sets is also measurable).

An immediate consequence is that the set of irrational numbers have full measure, or that, in non-precise probablistic language, real numbers are almost surely irrational.

Unfortunately this sometimes leads to the confusion among students that for dense sets, the measure directly correlates with cardinality. The set $S$ constructed above is an explicit example of a dense, uncountable set whose Lebesgue measure is 0. The proof of this fact is essentially due to Émille Borel (no, he predated Conway by at least half a century; what he proved was in fact that a much larger set [the set of abnormal numbers] than $S$ also has Lebesgue measure 0).

The key observation is that by construction, the set $S$ is a proper subset of the set of numbers such that its tredecimal expansion only has finitely many $+,-,\cdot$ symbols. So let us restrict ourselves to the fixed interval $[0,1)$. If we pick a random number in the interval, and ask what the $N$th tredigit is in the base 13 expansion, we have a uniform probability among the 13 available choices. Furthermore, it is obvious that the probability of any two distinct tredigit positions are independent. So the sequence of tredigits $r_i$ for a given number

$r = 0.r_1r_2\ldots r_N\ldots$

are a sequence of independent random variables, with probability $\mathbb{P}(r_i \in \{ +,-,\cdot\}) = 3/13$ for each $i$. Now we apply the second Borel-Cantelli Lemma to this sequence of events $E_i = \{ r_i \in \{ +,-,\cdot\}\}$. $E_i$ are independent, and the sum of the probabilities diverges as the probability of each event is fixed. Therefore almsot surely $E_i$ happens infinitely often. In other words the probability that only finitely many of the symbols $r_i$ being one of the plus, minus, and dot signs is zero. This effectively (modulo some tidying up connecting this probability with the Lebesgue measure) shows that $S$ is a subset of a measurable set with measure 0, and hence $S$ is also measurable with measure 0.