Conway’s Base 13 Function
by Willie Wong
(N.b. Credit where credit’s due: I learned about this function from an answer of Robin Chapman’s on MathOverflow, and its measurability from Noah Stein.)
Conway’s base 13-function is a strange beast. It was originally crafted by John Conway as a counterexample to the converse of the intermediate value theorem, and has the property that on any open interval its image contains the entire real line. In addition, its support set also serves as an illustration of a dense, uncountable set of numbers whose Lebesgue measure is 0.
The beast itself
We define a function in the following way. Let
be an arbitrary real number. Expand it in base 13 using the symbols
and disallow the infinite-repeating string of the plus sign (which we round up the previous digit in the same way
in ordinary base-10 notation). Consider the pattern
of symbols where
are digits (0,1,…,9). In other words, we are looking for numbers whose base 13 expansion “ends” with either the plus or minus sign, a string of digits, the dot sign, and an infinite string of digits.
Now if is a number of the form we are looking for, we define
to be the number, in base 10, given by the tail-expansion
, where the dot sign is interpreted as the decimal point. If
is not a number we are looking for, we define
. In other words, if after throwing away finitely many leading tredigits of the base 13 expansion of
we end up with what looks like a base 10 number, we keep it as the value of
, else we set the value to 0.
Every interval is mapped to the whole real line
Let be two distinct real numbers. I claim that for every real number
, there are infinitely many
's between
and
satisfying
. To illustrate: take the base 13 expansions of
. Since the two numbers are not equal, after finitely many tredigits [btw, digits are base ten, bits are base two, and tredigits are the counterpart in base 13] the two starts to differ. In other words
Furthermore, by definition, the tredecimal expansion cannot end in an infinite string of plus signs. So for some finite we can assume
is not the plus sign. We define the number
to be
given by a finite string. By construction for any obtained by starting with
and adding terms beyond the final plus sign, we have
.
Now take the decimal expansion of (prefixed with the plus or minus sign), and just append it to the end of
; this gives a number
such that
. If between the expansion of
and
we insert any finite string of tredigits, we get another
with
. Since the set of finite strings are infinite, we obtain the claim that
infinitely often within any interval.
Now denote the set where by
. By our explicit construction above, inside any interval, there exists an injection from the 'real line sans zero’ into
, so
is uncountable. By the same token,
intersects any open set, so
is dense.
Detour: intermediate value theorem
A classic theorem from either the college algebra/pre-calculus course in high school, or from a first calculus class in college is the intermediate value theorem. This says that if a function is continuous, and for some
we have
and
, then for any value
between
and
, we can find at least one
between
and
such that
.
This appeals to the intuition that to continuously get from to
on the real line, one must pass through every point in between.
Students, however, are often willing to assume the converse of the intermediate value theorem, that is, if a function is such that for every with
and
and
between
, there exists some
such that
, then
must be continuous. This is, of course, false. And a powerful example is given by Conway’s base 13 function.
By construction, the function visits every real number within the span of any interval. So it fits the condition that to get from to
, it passes through every point in between. However, by the same token, for any fixed number
, no matter how close one gets to
, one can always find some
even closer such that
is arbitrarily large. And hence
is, by construction, not a continuous function.
The size of the support set
We have already seen that the set on which
does not vanish is a dense, uncountable set. A different way of measuring the size of a set, however, is using the notion of the Lebesgue measure. The two notions (cardinality and measure) interacts strangely. Students in a measure theory class are often taught early that the set of rational numbers have Lebesgue measure 0. The usual proof of this fact uses that the Lebesgue measure is countably additive, so that the measure of a countable union of sets is bounded above by the sum of their measures. Since the Lebesgue measure of a single point is 0, and there are countably many rational numbers, this means that the set of rational numbers have Lebesgue measure 0 (in between we also need the fact that the countable union of measurable sets is also measurable).
An immediate consequence is that the set of irrational numbers have full measure, or that, in non-precise probablistic language, real numbers are almost surely irrational.
Unfortunately this sometimes leads to the confusion among students that for dense sets, the measure directly correlates with cardinality. The set constructed above is an explicit example of a dense, uncountable set whose Lebesgue measure is 0. The proof of this fact is essentially due to Émille Borel (no, he predated Conway by at least half a century; what he proved was in fact that a much larger set [the set of abnormal numbers] than
also has Lebesgue measure 0).
The key observation is that by construction, the set is a proper subset of the set of numbers such that its tredecimal expansion only has finitely many
symbols. So let us restrict ourselves to the fixed interval
. If we pick a random number in the interval, and ask what the
th tredigit is in the base 13 expansion, we have a uniform probability among the 13 available choices. Furthermore, it is obvious that the probability of any two distinct tredigit positions are independent. So the sequence of tredigits
for a given number
are a sequence of independent random variables, with probability for each
. Now we apply the second Borel-Cantelli Lemma to this sequence of events
.
are independent, and the sum of the probabilities diverges as the probability of each event is fixed. Therefore almsot surely
happens infinitely often. In other words the probability that only finitely many of the symbols
being one of the plus, minus, and dot signs is zero. This effectively (modulo some tidying up connecting this probability with the Lebesgue measure) shows that
is a subset of a measurable set with measure 0, and hence
is also measurable with measure 0.
Trivial typo: “almsot surely” should be “almost surely”
Other than that, I enjoyed this. See also my work on Decimal Goedelization, as summarized on The On-Line Encyclopedia of Integer Sequences, starting with:
http://www.research.att.com/~njas/sequences/A166746
A166746 Count of propositional theorems up to 10^n in Richard Schroeppel’s Goedelization of A101273.
A100200 Decimal Goedelization of antitheorems from propositional calculus, in Richard Schroeppel’s metatheory of A101273.
A101248 Decimal Goedelization of contingent WFFs (well-formed formulae) from propositional calculus, in Richard Schroeppel’s metatheory of A101273. Truth value depends on truth value of variables, but is neither always true (theorem) nor always false (antitheorem).
COMMENT
Blocks of 1’s and 2s are variables: A = 1, B = 2, C = 11, D = 12, E = 21, … Not (also written -) = 3; And = 4; Xor = 5; Or = 6; Implies = 7; Equiv = 8; Left Parenthesis = 9; Right Parenthesis = 0. Operator binding strength is in numerical order, Not > And > … > Equiv. The non-associative “Implies” is evaluated from Left to Right; A->B->C = is interpreted (A->B)->C.
Redundant parentheses are permitted, so long as they are balanced and centered on a valid variable or sentential formula and not on the null character. Besides A101273 (theorems = tautologies), A100200 (antitheorems = always false WFFs) there can also be the subsequence of theorems that can be proved within the more restricted intuitionistic logic; this sequence of well-formed formulae whose truth value is contingent on the truth values of their variables; and many others.
As with A101273, I conjecture that a power law approximates the number of integers in this sequence, where the number with N digits is approximately N to the power of some real number D. The union of A101273, A100200 and this sequence is the set of all WFFs in Richard Schroeppel’s metatheory of A101273.