### Compactness. Part 1: Degrees of freedom

#### by Willie Wong

**Preamble**

This series of posts has its genesis in a couple questions posted on MathOverflow.net. It is taking me a bit longer to write this than expected: I am not entirely sure what level of an audience I should address the post to. Another complication is that it is hard to get ideas organized when I am only writing about this during downtime where I hit a snag in research. The main idea that I want to get to is “a failure of sequential compactness of a domain represents the presence of some sort of infinity.” In this first post I will motivate the definition of compactness in mathematics.

**Introduction**

We start with the pigeonhole principle: *“If one tries to stuff pigeons into pigeonholes, and there are more pigeons than there are pigeonholes, then at least one hole is shared by multiple pigeons.”* This can be easily generalized to the case where there are infinitely many pigeons, which is the version we will use:

Pigeonhole principle. If one tries to stuff infinitely many pigeons into a finite number of pigeonholes, then at least one hole is occupied by infinitely many pigeons.

Now imagine being the pigeon sorter: I hand you a pigeon, you put it into *a* pigeonhole. However you sort the pigeons, you are making an (infinite) sequence of assignments of pigeons to pigeonholes. Mathematically, *you* represent a sequence of points (choices) in a finite set (the list of all holes). In this example, we will say that a point in the set of choices is a *cluster point* relative to a sequence of choices if that particular point is visited infinitely many times. A mathematical way of formulating the pigeonhole principle then is:

Let $S$ be a finite set. Let $(x_n)$ be a sequence of points in $S$. Then there exists some point $y \in S$ such that $y$ is a cluster point of $(x_n)$.

The key here is the word finite: if you have an infinitely long row of pigeonholes (say running to the right of you), and I keep handing you pigeons, as long as you keep stepping to the right after stuffing a pigeon into a hole, you can keep a “one pigeon per hole (or fewer)” policy.

If we now take $S$ to be the set of all real numbers between 0 and 1, however, the pigeonhole principle fails. That set is *uncountably infinite*, as Cantor would say, and if we consider each number in $S$ to be a unique pigeonhole, I can stuff the first pigeon into the hole labeled 1, the second into the hole labeled 1/2, the $n$th pigeon into the hole labeled $1/n$, and so on, and no pigeon will share a hole with another pigeon. On the other hand, if you look at the sequence of points given by $(1/n)$, you see that they gradually approach the point 0. So it is somewhat dissatisfying to directly extend the notion of a cluster point from the case where $S$ is finite to the case where $S$ is infinite.

What our intuition tells us is that we need some notion that two points in $S$ are close; or equivalently, we need some “fuzzy” notion of pigeonholes. For the set $S = [0,1]$, we have a natural way of measuring closeness: the distance between two points $x, y$ in $S$ is defined as the absolute value of their difference, $|x - y|$. And using the distance function to define fuzziness, we arrive at the following version of the fuzzy pigeonhole principle:

A “fuzzy” pigeonhole principle for [0,1]. Let $S = [0,1]$, the set of numbers greater-than-or-equal-to 0 and less-than-or-equal-to 1; and let $(x_n)$ be a sequence of points in $S$. Then for any positive $\epsilon$, which represents the degree of fuzziness, there exists some point $y$ in $S$ such that there are infinitely many points $x_n$ in the sequence whose distance to $y$ is less than $\epsilon$.

The fuzzy pigeonhole principle is a direct consequence of the original pigeonhole principle: for any fixed $\epsilon$, we can divide $[0,1]$ into a bunch of smaller intervals $[0,\epsilon), [\epsilon, 2\epsilon), [2\epsilon, 3\epsilon), \ldots$. There are at most $N$ such intervals, where $N$ is the smallest integer bigger than $1/\epsilon$. So we’ve divided $[0,1]$ into a finite number of smaller intervals, where any two points taken from the same sub-interval have distance less than $\epsilon$. Now use the finite pigeonhole principle on the sequence, but replacing the set $S$ by the set of the sub-intervals. So there must be a particular sub-interval such that infinitely many points of $(x_n)$ land inside it. Then any point $y$ inside that sub-interval will qualify as the point we seek.
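This subdivision argument is easy to play with numerically. Here is a small Python sketch (my own illustration, not part of the original argument): it divides $[0,1]$ into the finitely many subintervals and counts, for a finite sample of the sequence $x_n = 1/n$, which subinterval receives the most terms.

```python
import math

def busiest_subinterval(sample, eps):
    """Divide [0, 1] into N = ceil(1/eps) subintervals [k*eps, (k+1)*eps)
    and return (index, count) for the subinterval receiving the most
    points of the (finite) sample."""
    N = math.ceil(1 / eps)
    counts = [0] * N
    for x in sample:
        k = min(int(x / eps), N - 1)  # clamp x = 1 into the last subinterval
        counts[k] += 1
    best = max(range(N), key=lambda k: counts[k])
    return best, counts[best]

# A finite sample of the sequence x_n = 1/n; its terms pile up near 0,
# so the first subinterval [0, eps) should collect almost all of them.
sample = [1 / n for n in range(1, 1001)]
idx, cnt = busiest_subinterval(sample, eps=0.1)
```

Of course a finite computation can only suggest the pattern: the pigeonhole principle is what guarantees that, for the full infinite sequence, some subinterval receives infinitely many terms.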

In view of the fuzzy pigeonhole principle, we can say that a point $y$ is a cluster point of $(x_n)$ if it can serve as the point $y$ for *any* positive $\epsilon$. Observe that not every such $y$ is a cluster point. If we fix the fuzziness scale to be $\epsilon = 1/2$, then the number $1/3$ is one of the points that satisfies the fuzzy pigeonhole principle for the sequence (1/n). In fact, all but the first term lie within distance 1/2 of $1/3$. But if we shrink the fuzziness scale to, say, $\epsilon = 1/100$, then 1/3 no longer serves as a viable choice for $y$. In fact, in this example, the only cluster point is 0.

But the cluster point of a sequence need not be unique. Let the sequence $(x_n)$ be given by

$x_n = 1/n$ if n is odd, and $x_n = 1 - 1/n$ if n is even,

then it is a small exercise to see that both 0 and 1 are cluster points of this sequence. (In the finite case, the analogue is the fact that with infinitely many pigeons to put into a finite number of holes, there can be more than one hole with infinitely many pigeons.)
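A numerical sanity check of that exercise, as a Python sketch (my own illustration): sampling many terms shows large piles of them accumulating near both 0 and 1, while a non-cluster point such as 1/2 attracts only finitely many.

```python
def hits_near(seq, y, eps):
    """Count terms of a (finite) sample of the sequence within eps of y."""
    return sum(1 for x in seq if abs(x - y) < eps)

# One concrete sequence with two cluster points: odd-indexed terms 1/n
# fall toward 0, even-indexed terms 1 - 1/n climb toward 1.
seq = [1 / n if n % 2 else 1 - 1 / n for n in range(1, 2001)]

hits_0 = hits_near(seq, 0.0, 0.01)     # many terms cluster near 0
hits_1 = hits_near(seq, 1.0, 0.01)     # many terms cluster near 1
hits_half = hits_near(seq, 0.5, 0.01)  # only finitely many near 1/2
```

Shrinking `eps` further keeps the counts near 0 and 1 unbounded as the sample grows, which is exactly the cluster-point condition.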

**Compactness for metric spaces / sequential compactness**

The pigeonhole principle, by itself, discriminates between finite and infinite sets. We’ve just shown that for a particular infinite set, if we use the notion of a distance to make things fuzzier, we can recover a version of the pigeonhole principle. Can we do this for arbitrary infinite sets? The answer is no. Let $\mathbb{N}$ be the set of all natural numbers. Again we have an infinite set. This infinite set is “smaller” than $[0,1]$, in the Cantor sense that it is countably infinite. But on this set, if we use the standard distance function (the absolute value of the difference between two numbers), we see that the distance between any two points is at least 1. So if we pick $\epsilon < 1$, the fuzziness factor does not make the set look any different. In other words, if the infinite set cannot be divided into a finite collection of small (in the distance sense) sets, the fuzzy pigeonhole principle cannot hold.

On the other hand, if we use a funny distance on the set of natural numbers, we can recover the pigeonhole principle! If we define the distance between two natural numbers $m$ and $n$ as the inverse distance $d(m,n) = |1/m - 1/n|$, then we have the fuzzy pigeonhole principle. This is because for any fixed $\epsilon$, we can let $N$ be the smallest natural number such that $1/N < \epsilon$. Then for any two numbers $m, n \geq N$, we have that $d(m,n) = |1/m - 1/n| < 1/N < \epsilon$. So we can divide the natural numbers into N small sets: a subset for each of the numbers going from 1 to (N-1), and one last subset for all the numbers greater-than-or-equal-to N. Then running the argument as in the $[0,1]$ case, we can appeal to the finite pigeonhole principle and conclude that infinitely many choices must reside in one of the subsets, and any representative point in that subset is the $y$ we want.
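This partition is concrete enough to compute. A small Python sketch (mine, not from the post) of the inverse distance and the threshold $N$:

```python
def d_inv(m, n):
    """The 'inverse' distance on the natural numbers: d(m, n) = |1/m - 1/n|."""
    return abs(1 / m - 1 / n)

def tail_threshold(eps):
    """Smallest natural number N with 1/N < eps; any two m, n >= N then
    satisfy d_inv(m, n) < 1/N < eps."""
    N = 1
    while 1 / N >= eps:
        N += 1
    return N

# For eps = 1/4 the naturals split into N small holes: the singletons
# {1}, ..., {N-1} plus the single tail {N, N+1, ...}.
N = tail_threshold(0.25)
# Spot-check that any two points of the tail are within eps of each other:
tail_ok = all(d_inv(m, n) < 0.25 for m in range(N, 200) for n in range(N, 200))
```

The singletons are trivially “small”, so the whole work of the partition is done by the tail, which the inverse distance squeezes into an $\epsilon$-sized clump.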

So we see that the fuzzy pigeonhole principle depends not just on the set we choose, but also on the distance function we define!

Definition. Metric space

A metric space $(X, d)$ is a set $X$ endowed with a distance function $d$, which takes as input two elements of $X$ and outputs a non-negative real number. The distance function is required to satisfy the following properties:

- $d(x,x) = 0$: there’s no distance between a point and itself.
- If $x \neq y$, then $d(x,y) > 0$: two different points must have a positive distance between them.
- $d(x,y) = d(y,x)$: the order does not matter in the input for the distance function; another way of looking at it is saying that the distance from $x$ to $y$ is the same as the distance from $y$ to $x$.
- The triangle inequality $d(x,z) \leq d(x,y) + d(y,z)$: if two points are both close to a given third point, then the two points cannot be too far apart.
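As a quick illustration (my own sketch, not part of the post), one can spot-check the four axioms numerically for a candidate distance function, such as the inverse distance on the naturals from the previous section:

```python
from itertools import product

def is_metric_on_sample(d, points, tol=1e-12):
    """Spot-check the four metric axioms on a finite sample of points.
    Passing is evidence, not a proof, that d is a metric."""
    pts = list(points)
    for x, y, z in product(pts, repeat=3):
        if d(x, x) > tol:                      # d(x, x) = 0
            return False
        if x != y and not d(x, y) > 0:         # distinct points: positive distance
            return False
        if abs(d(x, y) - d(y, x)) > tol:       # symmetry
            return False
        if d(x, z) > d(x, y) + d(y, z) + tol:  # triangle inequality
            return False
    return True

# The inverse distance on the natural numbers passes the spot-check:
ok = is_metric_on_sample(lambda m, n: abs(1 / m - 1 / n), range(1, 25))
```

The inverse distance passes because it has the form $|f(m) - f(n)|$ for an injective function $f(n) = 1/n$, and any distance of that form inherits the triangle inequality from the absolute value.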

We can then classify metric spaces into those for which the fuzzy pigeonhole principle holds, and those for which it doesn’t.

Definition. Sequentially pre-compact

A metric space $(X, d)$ is said to be (sequentially) pre-compact if for any positive number $\epsilon$, and for any given sequence $(x_n)$ of points in $X$, we can find a point $y$ in $X$ such that infinitely many points $x_n$ in the sequence satisfy $d(x_n, y) < \epsilon$.

(The word “sequentially” is used to distinguish the definition from *the* topological definition. As it turns out, however, the word “sequentially” can be safely omitted here, as the two definitions agree for metric spaces. It is only for more unreasonable generalizations that a distinction needs to be made.) (For those of you who have already had a bit of analysis, the condition given, in view of the triangle inequality, is equivalent to saying that any sequence has a Cauchy subsequence.) A slightly stronger definition is

Definition. Sequential compactness

A metric space $(X, d)$ is said to be (sequentially) compact if for any given sequence of points $(x_n)$, we can find a cluster point $y \in X$. In other words, for any sequence $(x_n)$ there exists a point $y$ such that for any $\epsilon > 0$, infinitely many points of $(x_n)$ have distance $d(x_n, y) < \epsilon$.

(And here we see that syntactical order in the statement is important in mathematics.) The difference between pre-compact and compact is best illustrated by example: consider the sets $S_N$ and $\mathbb{N}$, the former being all natural numbers less than N, the latter being all natural numbers. On the two sets we use the same inverse distance function as described before, $d(m,n) = |1/m - 1/n|$. For any N, $S_N$, being a finite set, is sequentially compact: any “hole” described by the finite pigeonhole principle is a cluster point. $\mathbb{N}$, however, is only pre-compact (which we’ve already shown): consider the sequence of numbers $x_n = n$ in $\mathbb{N}$. For any fixed element $m \in \mathbb{N}$, if we take $\epsilon = \frac{1}{m(m+1)}$, then $d(m,n) \geq \epsilon$ for every $n > m$, so there can be no more than $m$ elements whose distance to $m$ is closer than $\epsilon$. This means that $m$ cannot be a cluster point. So for this particular sequence, there does not exist a cluster point in $\mathbb{N}$, and so $\mathbb{N}$ is not sequentially compact.
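The counting in this example can be checked directly. In the sketch below (my own; I take `eps` strictly below the gap $1/(m(m+1))$ to the nearest neighbour $m+1$, to keep clear of boundary cases in floating point), only the single term $x_m = m$ lands within `eps` of the candidate point $m$:

```python
def d_inv(m, n):
    """Inverse distance on the naturals: d(m, n) = |1/m - 1/n|."""
    return abs(1 / m - 1 / n)

def terms_near(m, eps, horizon=10_000):
    """How many terms of the sequence x_n = n (sampled up to `horizon`)
    lie within eps of the candidate cluster point m?"""
    return sum(1 for n in range(1, horizon + 1) if d_inv(m, n) < eps)

# The nearest neighbour of m in this metric is m + 1, at distance
# 1/(m(m+1)); choosing eps below that gap isolates m from the whole tail.
m = 7
eps = 1 / (2 * m * (m + 1))
count = terms_near(m, eps)  # only n = 7 itself is that close
```

Since every candidate $m$ fails in this way, the sequence $x_n = n$ has no cluster point at all, even though pre-compactness guarantees fuzzy clustering at every fixed scale.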

While it is important to make the distinction between compact and pre-compact spaces, we will not dwell too much upon it: the discussion to follow will only really need the fuzzy pigeonhole principle (i.e. pre-compactness).

**Uniform spaces and compactness**

*(I will not discuss compactness from the point of view of point-set topology. It is a useful notion, to be sure, but since my motivation is more analysis-heavy, I’ll skip it for now. Furthermore, a discussion of topological compactness cannot be complete without relating the various notions of compactness to each other, which necessitates many definitions and complications I don’t want to get into; this is especially since all the notions of compactness agree on spaces we really care about in analysis. Instead, I’ll present the same definition, just motivated from the uniform space point of view, which is a more natural setting for analysis.)*

Strictly speaking, the distance function is not necessary for getting a concept of compactness. That is, we don’t actually need to know the *numerical value* of the distance between points, we just need some notion of points being *relatively close* to each other. This is captured by the notion of a uniform space.

Definition. Uniform space

A uniform space $(X, \mathcal{U})$ is a set $X$ endowed with a uniform structure $\mathcal{U}$.

Well, that’s useful! A uniform structure is a replacement for the distance function.

Definition. Uniform structure (a.k.a. uniformity)

A uniform structure $\mathcal{U}$ on a set $X$ is a collection of subsets of $X \times X$, the set of all ordered pairs with values in $X$. $\mathcal{U}$ is required to satisfy the axioms of uniformity:

- Every element of $\mathcal{U}$ contains the diagonal $\Delta = \{(x,x) : x \in X\}$ as a subset: this says that any point is close to itself, analogous to the condition that $d(x,x) = 0$ for a distance function.
- The only points contained in all of the elements of $\mathcal{U}$ are those of the diagonal. In other words, $\bigcap_{U \in \mathcal{U}} U = \Delta$, the diagonal. This is the condition of separability, analogous to the statement that any two distinct points must have positive distance between them. [Note, quite often this separability axiom is not assumed as necessary. Indeed, much of the theory can be done without it. But for our case it is rather necessary.]
- $\mathcal{U}$ is symmetric: if $U$ is an element of $\mathcal{U}$, then the set $U^{-1} = \{(y,x) : (x,y) \in U\}$ is also an element. This says that closeness is symmetric, and replaces the condition that $d(x,y) = d(y,x)$.
- The existence of a “half-as-large” measure: if $U$ is an element of $\mathcal{U}$, then there exists another element $V$ such that if $(x,y) \in V$ and $(y,z) \in V$, then $(x,z) \in U$. This is a replacement of the triangle inequality.

$\mathcal{U}$ is also required to be a filter:

- Let $U, V$ be subsets of $X \times X$. If $U, V$ are both in $\mathcal{U}$, then so is their intersection $U \cap V$.
- Let $U, V$ be subsets of $X \times X$. If $V$ contains $U$ as a subset, and $U$ is an element of $\mathcal{U}$, then $V$ is also an element.

A subset of $X \times X$ that is an element of $\mathcal{U}$ is called an *entourage* (French for neighborhood). Given an entourage $U$ and a point $x$, we will write $U[x]$ for the subset of $X$ given by $U[x] = \{y \in X : (x,y) \in U\}$. An entourage therefore is a rule that shows, for each point in $X$, which points are close to it. The main difference between measuring using an entourage and measuring using a distance function is the following: the distance function is totally ordered. Given a point $x$, and two values $\epsilon_1, \epsilon_2$, we can write down the sets of points whose distance to $x$ is at most $\epsilon_1$ and $\epsilon_2$ respectively. Then we know that either the first set is contained in the second, or the other way around. With entourages, two measures of closeness can agree on some points without having one necessarily subsumed by the other.

One can think of a uniform structure as something obtained by querying several different opinions as to what constitutes closeness on the set , and then completing the collection via the uniformity and filter axioms. In other words, we are formalizing the concept of fuzziness that we used in the introduction. We can then define sequential (pre)compactness just as before

Definition. Sequential pre-compact for uniform spaces

A uniform space $(X, \mathcal{U})$ is said to be sequentially pre-compact if for any given sequence $(x_n)$ of points in $X$, and for any fixed entourage $U$, there exists some point $y$ such that infinitely many points in the sequence land inside $U[y]$.

Definition. Sequential compactness for uniform spaces

The uniform space $(X, \mathcal{U})$ is said to be sequentially compact if for any given sequence $(x_n)$, there exists some point $y$ such that for any entourage $U$, $U[y]$ contains infinitely many points of $(x_n)$.

To make more explicit the connection between the finite pigeonhole principle and compactness, we make the following definition

Definition. Pre-compactness of uniform spaces

Let $(X, \mathcal{U})$ be a uniform space. It is said to be pre-compact if for any entourage $U$ in $\mathcal{U}$, we can find a finite set of points $y_1, \ldots, y_k$ in $X$ such that $X = U[y_1] \cup \cdots \cup U[y_k]$. In other words, every uniform cover has a finite refinement.

This definition is stricter than sequential pre-compactness.

Proposition. A pre-compact uniform space is sequentially pre-compact.

*Proof*. Let $(x_n)$ be some sequence, and let $U$ be the fixed entourage. By pre-compactness, there exist finitely many $y_1, \ldots, y_k$ such that every point in $X$ belongs to some $U[y_i]$. Now we can apply the finite pigeonhole principle, and conclude that one of the $y_i$ is the $y$ we want. Q.E.D.

Just as an example, consider the half-open interval $(0,1]$. We’ll put on it two different uniform structures. The first is the uniform structure associated to the standard metric: $U$ is an entourage if there exists some $\epsilon > 0$ such that $U$ contains $\{(x,y) : |x - y| < \epsilon\}$. By the same construction we did for $[0,1]$, we see that for any such entourage, then, we can cover $(0,1]$ by the neighborhoods of roughly $1/\epsilon$ points spaced equally apart. So under this uniform structure $(0,1]$ is pre-compact. On the other hand, let the uniform structure be defined by the following: $U$ is an entourage if there exists some $\epsilon > 0$ such that $\{(x,y) : |1/x - 1/y| < \epsilon\}$ is contained in $U$. The entourages become funnel shaped: they are allowed to get very, very thin as one approaches 0. A nice exercise is to show that the entourage defined by $\epsilon = 1$ does not generate a cover of $(0,1]$ that has a finite refinement. (Hint: if there were a finite cover associated to the points $y_1, \ldots, y_k$, then there is a smallest $y_i$. Show that then there are necessarily points around 0 which are not in the cover.) Therefore with this uniform structure, $(0,1]$ is not pre-compact. (In fact, this uniform structure is the same as that of the half line with the standard metric, which is also not pre-compact.)
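To make the contrast tangible, here is a rough numerical sketch (my own; entourages are modeled by their basic sets, and `probe` only samples points of $(0,1]$ down to 0.001): ten equally spaced centers cover everything under the standard metric, but their funnel-shaped neighborhoods with $\epsilon = 1$ miss the points near 0.

```python
def standard_ball(y, eps):
    """U[y] for the standard-metric entourage: {x : |x - y| < eps}."""
    return lambda x: abs(x - y) < eps

def funnel_ball(y, eps):
    """U[y] for the 'funnel' entourage on (0, 1]: {x : |1/x - 1/y| < eps}."""
    return lambda x: abs(1 / x - 1 / y) < eps

centers = [k / 10 for k in range(1, 11)]    # 0.1, 0.2, ..., 1.0
probe = [k / 1000 for k in range(1, 1001)]  # sample of (0, 1] down to 0.001

# Standard metric: roughly 1/eps equally spaced centers cover everything.
std_covered = all(any(standard_ball(y, 0.1)(x) for y in centers) for x in probe)

# Funnel structure with eps = 1: the same centers miss the points near 0,
# and no finite set of centers would fare better.
funnel_covered = all(any(funnel_ball(y, 1.0)(x) for y in centers) for x in probe)
```

The failure is structural, not an artifact of these particular centers: each funnel neighborhood of a point $y$ stays a positive distance away from 0, so any finite union of them does too.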
