### Bessaga’s converse to the contraction mapping theorem

In preparing some lecture notes for the implicit function theorem, I took a look at Schechter’s delightfully comprehensive Handbook of Analysis and its Foundations (which you can also find on his website), and I learned something new about the Banach fixed point theorem. To quote Schechter:

… although Banach’s theorem is quite easy to prove, a longer proof cannot yield stronger results.

I will write a little bit here about a “converse” to the Banach theorem due to Bessaga, which uses a little bit of help from the Axiom of Choice.

Banach’s Theorem
The theorem states:

Let $(X,d)$ be a complete non-empty metric space. Let $f:X \to X$ be a strict contraction: that is to say, suppose there exists some $\alpha \in [0,1)$ such that $d(f(x_1),f(x_2)) \leq \alpha d(x_1,x_2)$ always hold. Then $f$ has a unique fixed point $\xi$.

Proof. Let $x\in X$ be arbitrary and fixed. Consider the sequence $x_0 = x$, and $x_n = f(x_{n-1})$ for $n \geq 1$. Let $d_0 = d(x_0,x_1)$. Easily we see that by triangle inequality, for $0 \leq m 0$ and so we can choose $M$ sufficiently large (using convergence) such that $d(x_M,\xi) + d(x_{M+1},\xi) \leq d(\xi,f(\xi)) / 10$. But then the triangle inequality implies

$\displaystyle d(x_{M+1},f(\xi)) \geq d(\xi,f(\xi)) - d(x_{M+1},\xi) \geq \frac{9}{10} d(\xi,f(\xi)) > d(x_M,\xi)$

giving a contradiction to the assumption that $f$ is a contraction. Similarly, $\xi$ is unique since for two fixed points $\xi_1,\xi_2$ we must have

$\displaystyle d(\xi_1,\xi_2) = d(f(\xi_1),f(\xi_2)) \leq \alpha d(\xi_1,\xi_2) \implies \xi_1 = \xi_2$. QED

Note that one immediate corollary is that: if $f$ is a contraction mapping, then each iterate $f^{(k)} = \underbrace{f\circ f\circ \cdots \circ f}_{k \mbox{ times}}$ possesses exactly one (the same) fixed point $\xi$.

Bessaga’s converse

Theorem
Let $X$ be a set and $\xi\in X$. Suppose $f:X\to X$ is a function such that for any $k$ we have

$\displaystyle f^{(k)}(x) = x \iff x = \xi$

then there exists a complete metric $d$ on $X$ such that $f$ is a strict contraction.

Sketch of proof. First we define an equivalence relation on $X$. We say $x\sim y$ if there exists positive integers $p,q$ such that $f^{(p)}(x) = f^{(q)}(y)$. If $x\sim y$ we define

$\displaystyle \rho(x,y) = \min \{p+q\mid f^{(p)}(x) = f^{(q)}(y)\}$ and $\displaystyle x\wedge y = f^{(p)}(x)$ where $p$ is the value that attains $\rho(x,y)$.

We also define

$\displaystyle \sigma(x,y) = \rho(x,x\wedge y) - \rho(y,x\wedge y)$.

Observe that $\rho$ is symmetric in its arguments and $\sigma$ antisymmetric.

Now, by the axiom of choice, there exists a choice function that chooses for each equivalence class of $X/\sim$ a representative, this extends to a function $c:X\to X$. From this we can define a function $\lambda: X \to \mathbb{Z}$ by

$\displaystyle \lambda(x) = \sigma(c(x),x)$.

Now we are in a position to define our distance function $d$. We do it in several steps.

1. If $x\sim y$, we define

$\displaystyle d(x,y) = 2^{-\lambda(x)} + 2^{-\lambda(y)} - 2\cdot 2^{-\lambda(x\wedge y)}$

where the empty sum yields 0.

2. If $x\not\sim \xi$, we define

$\displaystyle d(x,\xi) = 2^{- \lambda(x)}$

3. If $x\not\sim y$ and neither $x,y$ is $\xi$, we define

$\displaystyle d(x,y) = d(x,\xi) + d(y,\xi)$.

This definition is obviously symmetric and non-negative. And it is easy to check that $d(x,y) = 0 \implies x\sim y$ and $x = y = x\wedge y$. Triangle inequality involves a little bit more work, but most cases are immediately obvious except when $x\sim y\sim z$. Here we need to check that

$\displaystyle 2\cdot 2^{-\lambda(y)} - 2 \cdot 2^{-\lambda(x\wedge y)} - 2\cdot 2^{-\lambda(y\wedge z)} \geq - 2 \cdot 2^{-\lambda(x\wedge z)}$

which we do right now. Suppose $f^{(p)}(x) = f^{(q_1)}(y)$ and $f^{(q_2)}(y) = f^{(r)}(z)$. WLOG we can take $q_1 \geq q_2$. Then we have that $f^{(p)}(x) = f^{(r + q_1 - q_2)}(z)$. This shows that $\lambda(x\wedge z) \leq \min( \lambda(x\wedge y), \lambda(y\wedge z))$. And this proves the inequality above.

It is easy to see that any Cauchy sequence either is eventually constant, or must converge to $\xi$: if $x\neq y$ we have that

$\displaystyle d(x,y) \geq 2^{- \min(\lambda(x),\lambda(y)) - 1} \geq \frac14 (d(x,\xi) + d(y,\xi))$

and this shows that $d$ is a complete metric. Now, it remains to verify that $f$ is a contraction. Noting that $\lambda(f(x)) = \lambda(x) + 1$ and $f(x) \wedge f(y) = x\wedge y$ we see easily that $f$ is a contraction with Lipschitz constant $\frac12$. QED