Bessaga’s converse to the contraction mapping theorem

by Willie Wong

In preparing some lecture notes for the implicit function theorem, I took a look at Schechter’s delightfully comprehensive Handbook of Analysis and its Foundations (which you can also find on his website), and I learned something new about the Banach fixed point theorem. To quote Schechter:

… although Banach’s theorem is quite easy to prove, a longer proof cannot yield stronger results.

I will write a little bit here about a “converse” to the Banach theorem due to Bessaga, which uses a little bit of help from the Axiom of Choice.

Banach’s Theorem
The theorem states:

Let (X,d) be a complete non-empty metric space. Let f:X \to X be a strict contraction: that is to say, suppose there exists some \alpha \in [0,1) such that d(f(x_1),f(x_2)) \leq \alpha d(x_1,x_2) always hold. Then f has a unique fixed point \xi.

Proof. Let x\in X be arbitrary and fixed. Consider the sequence x_0 = x, and x_n = f(x_{n-1}) for n \geq 1. Let d_0 = d(x_0,x_1). Easily we see that by triangle inequality, for 0 \leq m  0 and so we can choose M sufficiently large (using convergence) such that d(x_M,\xi) + d(x_{M+1},\xi) \leq d(\xi,f(\xi)) / 10. But then the triangle inequality implies

\displaystyle d(x_{M+1},f(\xi)) \geq d(\xi,f(\xi)) - d(x_{M+1},\xi) \geq \frac{9}{10} d(\xi,f(\xi)) > d(x_M,\xi)

giving a contradiction to the assumption that f is a contraction. Similarly, \xi is unique since for two fixed points \xi_1,\xi_2 we must have

\displaystyle d(\xi_1,\xi_2) = d(f(\xi_1),f(\xi_2)) \leq \alpha d(\xi_1,\xi_2) \implies \xi_1 = \xi_2. QED

Note that one immediate corollary is that: if f is a contraction mapping, then each iterate f^{(k)} = \underbrace{f\circ f\circ \cdots \circ f}_{k \mbox{ times}} possesses exactly one (the same) fixed point \xi.

Bessaga’s converse

Let X be a set and \xi\in X. Suppose f:X\to X is a function such that for any k we have

\displaystyle f^{(k)}(x) = x \iff x = \xi

then there exists a complete metric d on X such that f is a strict contraction.

Sketch of proof. First we define an equivalence relation on X. We say x\sim y if there exists positive integers p,q such that f^{(p)}(x) = f^{(q)}(y). If x\sim y we define

\displaystyle \rho(x,y) = \min \{p+q\mid f^{(p)}(x) = f^{(q)}(y)\} and \displaystyle x\wedge y = f^{(p)}(x) where p is the value that attains \rho(x,y).

We also define

\displaystyle \sigma(x,y) = \rho(x,x\wedge y) - \rho(y,x\wedge y).

Observe that \rho is symmetric in its arguments and \sigma antisymmetric.

Now, by the axiom of choice, there exists a choice function that chooses for each equivalence class of X/\sim a representative, this extends to a function c:X\to X. From this we can define a function \lambda: X \to \mathbb{Z} by

\displaystyle \lambda(x) = \sigma(c(x),x).

Now we are in a position to define our distance function d. We do it in several steps.

  1. If x\sim y, we define

    \displaystyle d(x,y) = 2^{-\lambda(x)} + 2^{-\lambda(y)} - 2\cdot 2^{-\lambda(x\wedge y)}

    where the empty sum yields 0.

  2. If x\not\sim \xi, we define

    \displaystyle d(x,\xi) = 2^{- \lambda(x)}

  3. If x\not\sim y and neither x,y is \xi, we define

    \displaystyle d(x,y) = d(x,\xi) + d(y,\xi).

This definition is obviously symmetric and non-negative. And it is easy to check that d(x,y) = 0 \implies x\sim y and x = y = x\wedge y. Triangle inequality involves a little bit more work, but most cases are immediately obvious except when x\sim y\sim z. Here we need to check that

\displaystyle 2\cdot 2^{-\lambda(y)} - 2 \cdot 2^{-\lambda(x\wedge y)} - 2\cdot 2^{-\lambda(y\wedge z)} \geq - 2 \cdot 2^{-\lambda(x\wedge z)}

which we do right now. Suppose f^{(p)}(x) = f^{(q_1)}(y) and f^{(q_2)}(y) = f^{(r)}(z). WLOG we can take q_1 \geq q_2. Then we have that f^{(p)}(x) = f^{(r + q_1 - q_2)}(z). This shows that \lambda(x\wedge z) \leq \min( \lambda(x\wedge y), \lambda(y\wedge z)). And this proves the inequality above.

It is easy to see that any Cauchy sequence either is eventually constant, or must converge to \xi: if x\neq y we have that

\displaystyle d(x,y) \geq 2^{- \min(\lambda(x),\lambda(y)) - 1} \geq \frac14 (d(x,\xi) + d(y,\xi))

and this shows that d is a complete metric. Now, it remains to verify that f is a contraction. Noting that \lambda(f(x)) = \lambda(x) + 1 and f(x) \wedge f(y) = x\wedge y we see easily that f is a contraction with Lipschitz constant \frac12. QED