Continuity of the infimum

by Willie Wong

Just realised (two weeks ago, but I have only now gotten around to finishing this blog posting) that an argument used to prove a proposition in a project I am working on is wrong. After reducing the problem to its core I found that it is something quite elementary. So today’s post will be of a different flavour from the ones of the recent past.

Question Let X,Y be topological spaces. Let f:X\times Y\to\mathbb{R} be a bounded, continuous function. Is the function g(x) = \inf_{y\in Y}f(x,y) continuous?

Intuitively, one may be tempted to say “yes”. Indeed, there are plenty of examples where the answer is in the positive. The simplest one is when we can replace the infimum with the minimum:

Example Let the space Y be a finite set with the discrete topology. Then g(x) = \min_{y\in Y} f(x,y) is continuous.
Proof left as exercise.

But in fact, the answer to the question is “No”. Here’s a counterexample:

Example Let X = Y = \mathbb{R} with the standard topology. Define

\displaystyle f(x,y) = \begin{cases} 1 & x > 0 \\ 0 & x < -e^{y} \\ 1 + x e^{-y} & x\in [-e^{y},0]  \end{cases}

which is clearly continuous. But the infimum function g(x) is roughly the Heaviside function: g(x) = 1 if x \geq 0, and g(x) = 0 if x < 0.
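A quick numerical sanity check of this counterexample, offered as a rough sketch rather than a proof: the infimum over y is replaced by a minimum over a finite grid of y values, and the grid bounds and the helper name g_approx are my own choices for illustration.

```python
import math

def f(x, y):
    """The piecewise function from the counterexample."""
    if x > 0:
        return 1.0
    if x < -math.exp(y):
        return 0.0
    return 1.0 + x * math.exp(-y)  # here -e^y <= x <= 0

def g_approx(x, y_min=-50.0, y_max=50.0, n=2001):
    """Approximate inf over y by a minimum over a finite grid (assumption)."""
    step = (y_max - y_min) / (n - 1)
    return min(f(x, y_min + i * step) for i in range(n))

# Approach x = 0 from the left, then sit at and just to the right of 0.
for x in (-1e-1, -1e-3, -1e-6, 0.0, 1e-6):
    print(x, g_approx(x))
```

For the negative sample points the grid already contains a y with x < -e^{y}, so the approximate infimum drops to 0, while at x = 0 and to its right it stays at 1.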

So what is it about the first example that makes the argument work? What is the difference between the minimum and the infimum? A naive guess may be that in the finite case, we are taking a minimum, and therefore the infimum is attained. This guess is not unreasonable: there are a lot of arguments in analysis where, when the infimum can be assumed to be attained, the problem becomes a lot easier (since we are then allowed to deal with a minimizer instead of a minimizing sequence). But sadly that is not (entirely) the case here: in the counterexample, for every x_0, we can certainly find a y_0 such that f(x_0,y_0) = g(x_0). So attaining the infimum point-wise is not enough.

What we need, here, is compactness. In fact, we have the following

Theorem Let X,Y be topological spaces with Y compact. Then for any continuous f:X\times Y\to\mathbb{R}, the function g(x) := \inf_{y\in Y} f(x,y) is well-defined and continuous.

Proof. The argument usually proceeds in three parts. That g(x) > -\infty follows from the fact that for any fixed x\in X, f(x,\cdot):Y\to\mathbb{R} is a continuous function defined on a compact space, and hence is bounded (in fact the infimum is attained). Then using that the sets (-\infty,a) and (b,\infty) form a subbase for the topology of \mathbb{R}, it suffices to check that g^{-1}((-\infty,a)) and g^{-1}((b,\infty)) are open.

Let \pi_X be the canonical projection \pi_X:X\times Y\to X, which we recall is continuous and open. It is easy to see that g^{-1}((-\infty,a)) = \pi_X ( f^{-1}((-\infty,a))). So continuity of f, together with openness of \pi_X, implies that this set is open. (Note that this part does not depend on compactness of Y. In fact, a minor modification of this proof shows that for any family of upper semicontinuous functions \{f_c\}_{c\in C}, the pointwise infimum \inf_{c\in C} f_c is also upper semicontinuous, a fact that is very useful in convex analysis. And indeed, the infimum function g in the counterexample given above is upper semicontinuous.)

It is in this last part, showing that g^{-1}((b,\infty)) is open, that compactness is crucially used. Observe that g(x) > b \implies f(x,y) > b~ \forall y. In other words g(x) > b \implies \forall y, (x,y) \in f^{-1}((b,\infty)), an open set. This in particular implies that \forall x\in g^{-1}((b,\infty)) \forall y\in Y there exists a “box” neighborhood U_{(x,y)}\times V_{(x,y)} of (x,y) contained in f^{-1}((b,\infty)). Now using compactness of Y, finitely many of these boxes, say those indexed by \{(x,y_i)\}_{i=1}^k, cover \{x\}\times Y. And in particular we have

\displaystyle \{x\}\times Y \subset \left(\cap_{i = 1}^k U_{(x,y_i)}\right)\times Y \subset f^{-1}((b,\infty))

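and hence g^{-1}((b,\infty)) = \cup_{x\in g^{-1}((b,\infty))} \cap_{i = 1}^{k(x)} U_{(x,y_i)} is open (each x' in such a finite intersection satisfies f(x',y) > b for all y, and so g(x') > b because the infimum is attained). Q.E.D.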
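To see the theorem at work on the counterexample function from earlier in the post, one can restrict y to the compact interval [-M, M]. For x \leq 0 that function equals \max(0, 1 + x e^{-y}), which is nondecreasing in y, so the minimum sits at y = -M and equals \max(0, 1 + x e^{M}): a steep but continuous ramp instead of a jump. The sketch below compares a grid minimum against this closed form; the value M = 5, the grid sizes, and the helper names are assumptions made only for illustration.

```python
import math

def f(x, y):
    """Counterexample function; for x <= 0 it equals max(0, 1 + x*e^{-y})."""
    if x > 0:
        return 1.0
    return max(0.0, 1.0 + x * math.exp(-y))

def g_compact(x, M, n=201):
    """Minimum of f(x, .) over a grid on the compact interval [-M, M]."""
    ys = [-M + 2.0 * M * i / (n - 1) for i in range(n)]
    return min(f(x, y) for y in ys)

def g_closed_form(x, M):
    """f(x, .) is nondecreasing in y, so the minimum is attained at y = -M."""
    return 1.0 if x > 0 else max(0.0, 1.0 + x * math.exp(M))

M = 5.0
xs = [-0.02 + 1e-4 * i for i in range(301)]  # sample points straddling x = 0
vals = [g_compact(x, M) for x in xs]

# Largest deviation from the closed form, and largest step between neighbours.
max_err = max(abs(v - g_closed_form(x, M)) for v, x in zip(vals, xs))
max_jump = max(abs(b - a) for a, b in zip(vals, vals[1:]))
print(max_err, max_jump)
```

The step between neighbouring sample points is controlled by e^{M} times the spacing in x, so the ramp gets steeper as M grows, and the discontinuity only appears in the limit of non-compact Y.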

One question we may ask is how sharp is the requirement that Y is compact. As with most things in topology, counterexamples abound.

Example Let Y be any uncountably infinite set equipped with the co-countable topology. That is, the open subsets are precisely the empty set and all subsets whose complement is countable. The two interesting properties of this topology are (a) Y is not compact and (b) Y is hyperconnected. (a) is easy to see: let C be some countably infinite subset of Y. For each c\in C let U_c = \{c\}\cup (Y\setminus C). This forms an open cover with no finite sub-cover. Hyperconnected spaces are, roughly speaking, spaces in which all open nonempty sets are “large”, in the sense that they mutually overlap a lot. In particular, a continuous map from a hyperconnected space to a Hausdorff space must be constant. In our case we can see this directly: suppose h:Y\to \mathbb{R} is a continuous map. Fix y_1,y_2\in Y. Let N_{1,2}\subset \mathbb{R} be open neighborhoods of h(y_{1,2}). Since h is continuous, h^{-1}(N_1)\cap h^{-1}(N_2) is open and non-empty (by the co-countable assumption: both sets are non-empty and open, so their intersection has countable complement in the uncountable set Y). Therefore N_1\cap N_2\neq \emptyset for any pair of neighborhoods. Since \mathbb{R} is Hausdorff, this forces h(y_1) = h(y_2), so h is the constant map. This implies that for any topological space X, a continuous function f:X\times Y\to\mathbb{R} is constant along Y, and hence for any y_0\in Y, the function g(x) := \inf_{y\in Y} f(x,y) = f(x,y_0) is continuous.

One can try to introduce various regularity/separation assumptions on the spaces X,Y to see at what level compactness becomes a crucial requirement. As an analyst, however, I really only care about topological manifolds. In which case the second counterexample up top can be readily used. We can slightly weaken the assumptions and still prove the following partial converse in essentially the same way.

Theorem Let X be Tychonoff, connected, and first countable, such that X contains a non-trivial open subset whose closure is not the entire space; and let Y be paracompact, Lindelof. Then if Y is noncompact, there exists a continuous function f:X\times Y\to\mathbb{R} such that \inf_{y\in Y}f:X\to \mathbb{R} is not continuous.

Remark Connected (nontrivial) topological manifolds automatically satisfy the conditions on X and Y except for non-compactness. The conditions given are not necessary for the theorem to hold; but they more or less capture the topological properties used in the construction of the second counterexample above.

Remark If X is such that every non-empty open set’s closure is the entire space, we must have that it is hyperconnected (let C\subset X be a closed set. Suppose D\subset X is another closed set, with D \neq X, such that C\cup D = X. Then D^c\subset C; but D^c is open and non-empty, so its closure is all of X, and since C is closed we get C = X. Hence X cannot be written as the union of two proper closed subsets). And if it is Tychonoff, then X is either the empty-set or the one-point set.

Lemma For a paracompact Lindelof space that is noncompact, there exists a countably infinite open cover \{U_k\} and a sequence of points y_k \in U_k such that \{y_k\}\cap U_j = \emptyset if j\neq k.

Proof: By noncompactness, there exists an open cover with no finite sub-cover. By Lindelof, this open cover can be assumed to be countable (and it is necessarily infinite), which we enumerate by \{V_k\} and assume WLOG that \forall k, V_k \setminus \cup_{j =1}^{k-1} V_j \neq \emptyset. Define \{U_k\} and \{y_k\} inductively by: U_k = V_k \setminus \cup_{j = 1}^{k-1} \{ y_j\} and choose y_k \in U_k \setminus \cup_{j=1}^{k-1}U_j.

Proof of theorem: We first construct a sequence of continuous functions on X. Let G\subset X be a non-empty open set such that its closure-complement H = (\bar{G})^c is a non-empty open set (G exists by assumption). By connectedness \bar{G}\cap \bar{H} \neq \emptyset, so we can pick x_0 in the intersection. Let \{x_j\}\subset H be a sequence of points converging to x_0, which exists by first countability. Using Tychonoff, we can get a sequence of continuous functions f_j on X such that f_j|_{\bar{G}} = 0 and f_j(x_j) = -1.

On Y, choose an open cover \{U_k\} and points \{y_k\} per the previous Lemma. By paracompactness we have a (locally finite) partition of unity \{\psi_k\} subordinate to \{U_k\}, and by the conclusion of the Lemma we have that \psi_k(y_k) = 1, since y_k lies in no U_j with j\neq k. Now we define the function

\displaystyle f(x,y) = \sum_{k} f_k(x)\psi_k(y)

which is continuous, and such that f|_{\bar{G}\times Y} = 0. In particular \inf_{y\in Y}f(x_0,y) = 0. But by construction \inf_{y\in Y}f(x_k,y) \leq f(x_k,y_k) = f_k(x_k) = -1, which combined with the fact that x_k \to x_0 \in \bar{G} shows the desired result. q.e.d.
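For concreteness, here is one instance of this construction written out as a small sketch. All the specific choices are mine and only illustrative: X = \mathbb{R} with G = (-\infty,0), H = (0,\infty), x_0 = 0 and x_k = 1/k; Y = [1,\infty) with U_k = (k-1,k+1)\cap Y and y_k = k; the f_k are clipped linear ramps and the \psi_k are tent functions.

```python
import math

def f_k(k, x):
    """Vanishes on the closure of G = (-inf, 0) and equals -1 at x_k = 1/k."""
    return -min(1.0, max(0.0, k * x))

def psi(k, y):
    """Tent function supported in U_k = (k-1, k+1), with psi_k(y_k) = psi_k(k) = 1."""
    return max(0.0, 1.0 - abs(y - k))

def f(x, y):
    """f(x, y) = sum_k f_k(x) psi_k(y) for y >= 1.

    The sum is locally finite: only the two tents centred at the integers
    on either side of y can be nonzero.
    """
    k0 = math.floor(y)
    return sum(f_k(k, x) * psi(k, y) for k in (k0, k0 + 1) if k >= 1)

# Along x_k = 1/k -> 0 the value f(x_k, y_k) stays at -1 (up to rounding),
# while f(0, y) = 0 for every y, so the infimum over y jumps at x_0 = 0.
for k in (1, 10, 100, 1000):
    print(k, f(1.0 / k, float(k)), f(0.0, float(k)))
```

In this instance the infimum over y equals -1 for every x > 0 and 0 for every x \leq 0, so the discontinuity is again a jump at x_0.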