Goal: think uncommonly about common things; explain uncommon things commonly.

## Category: Require introductory level university maths

### Bessaga’s converse to the contraction mapping theorem

In preparing some lecture notes for the implicit function theorem, I took a look at Schechter’s delightfully comprehensive Handbook of Analysis and its Foundations (which you can also find on his website), and I learned something new about the Banach fixed point theorem. To quote Schechter:

… although Banach’s theorem is quite easy to prove, a longer proof cannot yield stronger results.

I will write a little bit here about a “converse” to the Banach theorem due to Bessaga, which uses a little bit of help from the Axiom of Choice.

### Compactifying (p,q)-Minkowski space

In a previous post I described a method of thinking about conformal compactifications, and I mentioned in passing that, in principle, the method should also apply to arbitrary signature pseudo-Euclidean space $\mathbb{R}^{p,q}$. A few days ago while visiting Oxford I had a conversation with Sergiu Klainerman where this came up, and we realised that we don’t actually know what the conformal compactifications are! So let me write down here the computations in case I need to think about it again in the future. Read the rest of this entry »

### Products and expectation values

Let us start with an instructive example (modified from one I learned from Steven Landsburg). Let us play a game:

I show you three identical looking boxes. In the first box there are 3 red marbles and 1 blue one. In the second box there are 2 red marbles and 1 blue one. In the last box there is 1 red marble and 4 blue ones. You choose one at random. What is …

• The expected number of red marbles you will find?
• The expected number of blue marbles you will find?
• The expected number of marbles, regardless of colour, you will find?
• The expected percentage of red marbles you will find?
• The expected percentage of blue marbles you will find?

Answer below the cut… Read the rest of this entry »
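Although the answers are below the cut, they can also be checked with a few lines of exact rational arithmetic (a quick sketch, with the box contents as stated above):

```python
from fractions import Fraction as F

# Box contents as stated above: (red, blue), each box chosen with probability 1/3.
boxes = [(3, 1), (2, 1), (1, 4)]

def expect(value):
    # average of value(red, blue) over the three equally likely boxes
    return sum(value(r, b) for r, b in boxes) / F(len(boxes))

e_red      = expect(lambda r, b: F(r))          # expected number of red marbles
e_blue     = expect(lambda r, b: F(b))          # expected number of blue marbles
e_total    = expect(lambda r, b: F(r + b))      # expected number of marbles
e_pct_red  = expect(lambda r, b: F(r, r + b))   # expected fraction of red marbles
e_pct_blue = expect(lambda r, b: F(b, r + b))   # expected fraction of blue marbles

print(e_red, e_blue, e_total, e_pct_red, e_pct_blue)   # 2 2 4 97/180 83/180
```

Note that the expected percentage of red marbles, $97/180 \approx 53.9\%$, is not the ratio of the expected counts, $2/4 = 50\%$: the expectation of a ratio is not the ratio of expectations.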

### Mariş’s Theorem

During a literature search (to answer my question concerning symmetries of “ground states” in a variational problem), I came across a very nice theorem due to Mihai Mariş. The theorem itself is, more than anything else, a statement about the geometry of Euclidean spaces. I will give a rather elementary write-up of (a special case of) the theorem here. (The proof presented here can equally well be applied to get the full strength of the theorem as presented in Mariş’s paper; I give just the special case for clarity of the discussion.)

### Continuity of the infimum

Just realised (two weeks ago, but I have only now gotten around to finishing this blog post) that an argument used to prove a proposition in a project I am working on is wrong. After reducing the problem to its core I found that it is something quite elementary. So today’s post will be of a different flavour from those of the recent past.

Question Let $X,Y$ be topological spaces. Let $f:X\times Y\to\mathbb{R}$ be a bounded, continuous function. Is the function $g(x) = \inf_{y\in Y}f(x,y)$ continuous?

Intuitively, one may be tempted to say “yes”. Indeed, there are plenty of examples where the answer is in the positive. The simplest one is when we can replace the infimum with the minimum:

Example Let the space $Y$ be a finite set with the discrete topology. Then $g(x) = \min_{y\in Y} f(x,y)$ is continuous.
Proof left as exercise.

But in fact, the answer to the question is “No”. Here’s a counterexample:

Example Let $X = Y = \mathbb{R}$ with the standard topology. Define

$\displaystyle f(x,y) = \begin{cases} 1 & x > 0 \\ 0 & x < -e^{y} \\ 1 + x e^{-y} & x\in [-e^{y},0] \end{cases}$

which is clearly continuous. But the infimum function $g(x)$ is roughly the Heaviside function: $g(x) = 1$ if $x \geq 0$, and $g(x) = 0$ if $x < 0$.
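A quick numerical sanity check of this counterexample (a sketch, with a minimum over a fine grid standing in for the infimum over $y$):

```python
import math

# the counterexample f, as defined above
def f(x, y):
    if x > 0:
        return 1.0
    if x < -math.exp(y):
        return 0.0
    return 1.0 + x * math.exp(-y)

def g(x, y_lo=-20.0, y_hi=20.0, n=4001):
    # approximate the infimum over y by a minimum over a fine grid
    return min(f(x, y_lo + (y_hi - y_lo) * i / (n - 1)) for i in range(n))

print(g(-0.5), g(0.0), g(0.5))   # 0.0 1.0 1.0 -- a jump at x = 0
```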

So what is it about the first example that makes the argument work? What is the difference between the minimum and the infimum? A naive guess may be that in the finite case, we are taking a minimum, and therefore the infimum is attained. This guess is not unreasonable: there are a lot of arguments in analysis where, when the infimum can be assumed to be attained, the problem becomes a lot easier (we are then allowed to deal with a minimizer instead of a minimizing sequence). But sadly that is not (entirely) the case here: for every $x_0$, we can certainly find a $y_0$ such that $f(x_0,y_0) = g(x_0)$. So attaining the infimum point-wise is not enough.

What we need, here, is compactness. In fact, we have the following

Theorem Let $X,Y$ be topological spaces, with $Y$ compact. Then for any continuous $f:X\times Y\to\mathbb{R}$, the function $g(x) := \inf_{y\in Y} f(x,y)$ is well-defined and continuous.

The proof proceeds in three parts. That $g(x) > -\infty$ follows from the fact that for any fixed $x\in X$, $f(x,\cdot):Y\to\mathbb{R}$ is a continuous function defined on a compact space, and hence is bounded (in fact the infimum is attained). Then, using that the sets $(-\infty,a)$ and $(b,\infty)$ form a subbase for the topology of $\mathbb{R}$, it suffices to check that $g^{-1}((-\infty,a))$ and $g^{-1}((b,\infty))$ are open.

Let $\pi_X$ be the canonical projection $\pi_X:X\times Y\to X$, which we recall is continuous and open. It is easy to see that $g^{-1}((-\infty,a)) = \pi_X(f^{-1}((-\infty,a)))$. So continuity of $f$, together with openness of $\pi_X$, implies that this set is open. (Note that this part does not depend on compactness of $Y$. In fact, a minor modification of this proof shows that for any family of upper semicontinuous functions $\{f_c\}_{c\in C}$, the pointwise infimum $\inf_{c\in C} f_c$ is also upper semicontinuous, a fact that is very useful in convex analysis. And indeed, the counterexample function given above is upper semicontinuous.)

It is in this last part, showing that $g^{-1}((b,\infty))$ is open, that compactness is crucially used. Observe that $g(x) > b \implies f(x,y) > b~ \forall y$. In other words, $g(x) > b \implies \forall y, (x,y) \in f^{-1}((b,\infty))$, an open set. This in particular implies that for every $x\in g^{-1}((b,\infty))$ and every $y\in Y$ there exists a “box” neighborhood $U_{(x,y)}\times V_{(x,y)}$ contained in $f^{-1}((b,\infty))$. Now, using compactness of $Y$, finitely many of these boxes, say those indexed by $\{(x,y_i)\}_{i=1}^k$, cover $\{x\}\times Y$. And in particular we have

$\displaystyle \{x\}\times Y \subset \left(\cap_{i = 1}^k U_{(x,y_i)}\right)\times Y \subset f^{-1}((b,\infty))$

and hence $g^{-1}((b,\infty)) = \cup_{x\in g^{-1}((b,\infty))} \cap_{i = 1}^{k(x)} U_{(x,y_i)}$ is open. Q.E.D.
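To see the theorem in action on the earlier counterexample, restrict $y$ to the compact interval $[0,1]$: the grid minimum now varies continuously through $x = 0$ (a numerical sketch; for $x\in[-1,0]$ the minimum over $y\in[0,1]$ is attained at $y = 0$ and equals $1+x$):

```python
import math

# the same f as in the counterexample above
def f(x, y):
    if x > 0:
        return 1.0
    if x < -math.exp(y):
        return 0.0
    return 1.0 + x * math.exp(-y)

def g(x, n=10001):
    # minimum over a grid on the compact interval Y = [0, 1]
    return min(f(x, i / (n - 1)) for i in range(n))

# no more jump: g varies continuously through x = 0
print(g(-0.01), g(0.0), g(0.01))
```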

One question we may ask is how sharp is the requirement that $Y$ is compact. As with most things in topology, counterexamples abound.

Example Let $Y$ be any uncountably infinite set equipped with the co-countable topology. That is, the open subsets are precisely the empty set and all subsets whose complement is countable. The two interesting properties of this topology are (a) $Y$ is not compact and (b) $Y$ is hyperconnected. (a) is easy to see: let $C$ be some countably infinite subset of $Y$. For each $c\in C$ let $U_c = \{c\}\cup (Y\setminus C)$. This forms an open cover with no finite sub-cover. Hyperconnected spaces are, roughly speaking, spaces in which all open nonempty sets are “large”, in the sense that they mutually overlap a lot. In particular, a continuous map from a hyperconnected space to a Hausdorff space must be constant. In our case we can see this directly: suppose $h:Y\to \mathbb{R}$ is a continuous map. Fix $y_1,y_2\in Y$, and let $N_1,N_2\subset \mathbb{R}$ be open neighborhoods of $h(y_1),h(y_2)$ respectively. Since $h$ is continuous, $h^{-1}(N_1)\cap h^{-1}(N_2)$ is open and non-empty (by the co-countable assumption), and the image of any point in this intersection shows $N_1\cap N_2\neq \emptyset$. As this holds for every such pair of neighborhoods and $\mathbb{R}$ is Hausdorff, $h(y_1) = h(y_2)$; that is, $h$ is the constant map. This implies that for any topological space $X$, a continuous function $f:X\times Y\to\mathbb{R}$ is constant along $Y$, and hence for any $y_0\in Y$, we have $\inf_{y\in Y} f(x,y) =: g(x) = f(x,y_0)$, which is continuous.

One can try to introduce various regularity/separation assumptions on the spaces $X,Y$ to see at what level compactness becomes a crucial requirement. As an analyst, however, I really only care about topological manifolds, in which case the second counterexample up top can be readily used. We can slightly weaken the assumptions and still prove the following partial converse in essentially the same way.

Theorem Let $X$ be Tychonoff, connected, and first countable, such that $X$ contains a non-trivial open subset whose closure is not the entire space; and let $Y$ be paracompact and Lindelöf. Then if $Y$ is noncompact, there exists a continuous function $f:X\times Y\to\mathbb{R}$ such that $g := \inf_{y\in Y}f(\cdot,y):X\to \mathbb{R}$ is not continuous.

Remark Connected (nontrivial) topological manifolds automatically satisfy the conditions on $X$ and $Y$ except for non-compactness. The conditions given are not necessary for the theorem to hold; but they more or less capture the topological properties used in the construction of the second counterexample above.

Remark If $X$ is such that every open set’s closure is the entire space, we must have that it is hyperconnected (let $C\subset X$ be a closed set. Suppose $D\subset X$ is another closed set such that $C\cup D = X$. Then $C\subset D^c$ and vice versa, but $D^c$ is open, so $C = X$. Hence $X$ cannot be written as the union of two proper closed subsets). And if it is Tychonoff, then $X$ is either the empty-set or the one-point set.

Lemma For a paracompact Lindelöf space that is noncompact, there exists a countably infinite open cover $\{U_k\}$ and a sequence of points $y_k \in U_k$ such that $y_k \notin U_j$ whenever $j\neq k$.

Proof: By noncompactness, there exists an open cover with no finite sub-cover; by the Lindelöf property we may assume this cover is countable. Enumerate it as $\{V_k\}$, discarding redundant sets so that WLOG $\forall k, V_k \setminus \cup_{j =1}^{k-1} V_j \neq \emptyset$. Define $\{U_k\}$ and $\{y_k\}$ inductively by: $U_k = V_k \setminus \cup_{j = 1}^{k-1} \{ y_j\}$ and choose $y_k \in U_k \setminus \cup_{j=1}^{k-1}U_j$; this choice is possible since $V_k \setminus \cup_{j=1}^{k-1} V_j$ is non-empty and each previously chosen $y_j$ lies in $\cup_{j=1}^{k-1} V_j$.

Proof of theorem: We first construct a sequence of continuous functions on $X$. Let $G\subset X$ be a non-empty open set such that its closure-complement $H = (\bar{G})^c$ is a non-empty open set ($G$ exists by assumption). By connectedness $\bar{G}\cap \bar{H} \neq \emptyset$, so we can pick $x_0$ in the intersection. Let $\{x_j\}\subset H$ be a sequence of points converging to $x_0$, which exists by first countability. Using Tychonoff, we can get a sequence of continuous functions $f_j$ on $X$ such that $f_j|_{\bar{G}} = 0$ and $f_j(x_j) = -1$.

On $Y$, choose an open cover $\{U_k\}$ and points $\{y_k\}$ per the previous Lemma. By paracompactness we have a partition of unity $\{\psi_k\}$ subordinate to $U_k$, and by the conclusion of the Lemma we have that $\psi_k(y_k) = 1$. Now we define the function

$\displaystyle f(x,y) = \sum_{k} f_k(x)\psi_k(y)$

which is continuous, and such that $f|_{\bar{G}\times Y} = 0$; in particular $g \equiv 0$ on $\bar{G}$. But by construction $g(x_k) = \inf_{y\in Y}f(x_k,y) \leq f(x_k,y_k) = f_k(x_k) = -1$, which combined with the fact that $x_k \to x_0 \in \bar{G}$ shows that $g$ is not continuous at $x_0$. q.e.d.

### Inverted time translations

Plot of the vector field K_0 and its stream function

In the study of the global properties of wave-type equations, a well-developed method is the vector field method due to Sergiu Klainerman and Demetrios Christodoulou. Maybe another day I will write a more detailed treatise on what the vector field method is and how to apply it; I won’t do it now. The method is crucial in many proofs of nonlinear stability for wave-type problems, with perhaps the most striking application being the global nonlinear stability of Minkowski space. The main idea behind the vector field method is to construct a tensor that measures the local energy content of the solution to our equations, and to exploit the properties of this tensor via vector fields. Examples of this tensor include the Einstein-Hilbert stress for electromagnetism, as well as the Bel-Robinson tensor for spin-2 (graviton) fields. To exploit the fine properties of this tensor field, one applies the divergence theorem to the tensor field contracted against suitable vector fields. For vector fields associated to the symmetries of the problem, this procedure produces conservation laws, which give control of the physical solution at a later time based on control at the present.

As it turns out, the useful symmetries of the equation, in the geometrical case, are closely related to the conformal symmetries of Minkowski space. These include the true symmetries (translations, rotations, and Lorentzian boosts), as well as the conformal scaling and, what we will discuss here, the inverted time translation, which lies at the heart of decay estimates for spin-1 and spin-2 fields on Minkowski space.

The inverted time translation, often denoted $K_0$, is the vector field given in radial coordinates by $K_0 = (t^2 + r^2)\partial_t + 2tr \partial_r$. In the picture to the upper left, the vector field is plotted along with its stream function. This vector field is a conformal symmetry of Minkowski space. Its name indicates the fact that it is associated to a conformal inversion (which is also used in the conformal compactification of Minkowski space). On Minkowski space, the inversion map $x^\mu \mapsto \frac{x^\mu}{\langle x,x\rangle}$ is a conformal isometry, and $K_0$ can be checked to be the vector field $\partial_t$ conjugated by this inversion map. As such, it has a very nice property compared with the other symmetry vector fields: the time translation $\partial_t$ and the inverted time translation $K_0$ are essentially (up to Lorentz boosts) the only globally causal conformal vector fields of Minkowski space, and so, under a dominant-energy type condition, they are the ones whose associated energies are nonnegative.
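One can verify the conformal symmetry directly in the $(t,r)$ plane. The following sketch (not from the original post) checks by finite differences that $K_0$ satisfies the conformal Killing equation $\partial_\mu K_\nu + \partial_\nu K_\mu = \lambda\,\eta_{\mu\nu}$ for the $1{+}1$-dimensional Minkowski metric $\eta = \mathrm{diag}(-1,1)$, with conformal factor $\lambda = 4t$ (computed by hand):

```python
# Check that K_0 = (t^2 + r^2) d/dt + 2tr d/dr satisfies the conformal
# Killing equation for eta = diag(-1, 1), with conformal factor 4t.
h = 1e-6
eta = [[-1.0, 0.0], [0.0, 1.0]]

def K_lower(t, r):
    # K_0 with its index lowered by eta: K_t = -(t^2 + r^2), K_r = 2tr
    return [-(t * t + r * r), 2.0 * t * r]

def dK(mu, nu, t, r):
    # central difference of K_nu in the coordinate direction mu
    p, q = [t, r], [t, r]
    p[mu] += h
    q[mu] -= h
    return (K_lower(*p)[nu] - K_lower(*q)[nu]) / (2.0 * h)

def residual(t, r):
    lam = 4.0 * t   # the conformal factor, computed by hand
    return max(abs(dK(m, n, t, r) + dK(n, m, t, r) - lam * eta[m][n])
               for m in range(2) for n in range(2))

# K_0 is quadratic in (t, r), so central differences are exact up to rounding
print(residual(0.3, 1.7), residual(-2.0, 0.5))
```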

### Shock singularities in Burgers’ equation

It is generally well known that partial differential equations that model fluid motion can exhibit “shock waves”. In fact, the subject I will write about today is generally presented as the canonical example for such behaviour in a first course in partial differential equations (while also introducing the method of characteristics). The focus here, however, will not be so much on the formation of shocks, but on the profile of the shock boundary. This discussion tends to be omitted from introductory texts.

Solving Burgers’ equation
First we recall the inviscid Burgers’ equation, a fundamental partial differential equation in the study of fluids. The equation is written

Equation 1. Inviscid Burgers’ equation
$\displaystyle \frac{\partial}{\partial t} u + u \frac{\partial}{\partial x} u = 0$

where $u = u(t,x)$ is the “local fluid velocity” at time $t$ and at spatial coordinate $x$. The solution of the equation is closely related to its derivation: notice that we can re-write the equation as

$v \cdot \nabla u = (\partial_t + u \partial_x) u = 0$

The question we consider is the initial value problem for the PDE: given some initial velocity configuration $u_0(x)$, we want to find a solution $u(t,x)$ to Burgers’ equation such that $u(0,x) = u_0(x)$.

The traditional way of obtaining a solution is via the method of characteristics. We first observe (1) that the alternate form of the equation above means that if $X(t)$ is a curve tangent to the vector field $v = \partial_t + u\partial_x$, then $u(t,X(t))$ is a constant function of the parameter $t$. (2) Plugging this back in implies that along such a curve $X(t)$, the vector field $v = \partial_t + u\partial_x = \partial_t + u_0 \partial_x$ is constant. (3) A curve whose tangent vector is constant is a straight line. So we have that a solution of Burgers’ equation must verify

$u(t, x + u_0(x) \cdot t) = u_0(x)$

And we call the family of curves given by $X_x(t) = x + u_0(x) \cdot t$ the characteristic curves of the solution.

To extract more qualitative information about Burgers’ equation, let us take another spatial derivative of the equation, and call the function $w = \partial_x u$. Then we have

$\partial_t w + w^2 + u \partial_x w = 0 \implies v \cdot \nabla w + w^2 = 0$

So, letting $X(t)$ be a characteristic curve and writing $W(t) = w(t, X(t))$, we have that along the characteristic curve

$\displaystyle \frac{d}{dt}W = - W^2 \implies W(t) = \frac{1}{t+W(0)^{-1}}$

So in particular, we see that if $W(0) < 0$, then $W(t)$ must blow up by time $t = |W(0)|^{-1}$.
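A quick numerical illustration (not from the original post): with the hypothetical initial data $u_0(x) = -\tanh(x)$, the steepest negative slope is $u_0'(0) = -1$, so the first crossing of characteristics should occur at $t = |W(0)|^{-1} = 1$:

```python
import math

# hypothetical initial data: steepest slope u0'(0) = -1, so the first
# shock should form at t = 1/|W(0)| = 1
u0 = lambda x: -math.tanh(x)

xs = [-2.0 + 4.0 * i / 400 for i in range(401)]   # grid of starting points

def crossing_time(x1, x2):
    # characteristics x1 + u0(x1) t and x2 + u0(x2) t cross at
    # t = (x2 - x1)/(u0(x1) - u0(x2)), provided the first one is faster
    du = u0(x1) - u0(x2)
    return (x2 - x1) / du if du > 0 else math.inf

t_shock = min(crossing_time(xs[i], xs[i + 1]) for i in range(len(xs) - 1))
print(t_shock)   # slightly above 1, approaching 1 as the grid is refined
```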

So what does this mean? We’ve seen that along characteristic lines, the value of $u$ stays constant. But we’ve also seen that along those lines, the value of its spatial derivative can blow up if the initial slope is negative. Perhaps the best thing to do is to illustrate it with two pictures. In the pictures the thick, red curve is the initial velocity distribution $u_0(x)$, shown with the black line representing the $x$-axis: so when the curve is above the axis, initially the local fluid velocity is positive, and the fluid is moving to the right. The blue curves are the characteristic lines.

In the first image to the right, we see that the initial velocity distribution is such that the velocity is increasing to the right, and so $w(0,x)$ is always positive. We see that in this situation the flow is divergent, the flow lines getting further and further apart, corresponding to the solution where $w(t,x)$ gets smaller and smaller along a flow line. For the second image here on our left, the situation is different. The initial velocity distribution starts out increasing, then hits a maximum, dips down to a minimum, and finally increases again. In the regions where the velocity distribution is increasing, we see the same “spreading out” behaviour as before, with the flow lines getting further and further apart (especially in the upper left region). But the characteristic curves originating in the region where the velocity distribution is decreasing get bunched together as time goes on, eventually intersecting! This intersection is what is known as a shock.

From the picture, it becomes clear what the blow-up of $W(t)$ means. Suppose the initial velocity distribution is such that for two points $x_1 < x_2$, we have $u_0(x_1) > u_0(x_2)$. Since the flow line originating from $x_1$ is moving faster, it will eventually catch up to the flow line originating from $x_2$. When the two flow lines intersect, we have a problem: if we follow the flow line from $x_1$, the function $u$ must take the value $u_0(x_1)$ at the point; but if we follow the flow line from $x_2$, the function must take the value $u_0(x_2)$ at the same point. So we cannot consistently assign a value to the function $u$ at the points of intersection of flow-lines in a way that satisfies Burgers’ equation.

Another way of thinking about this difficulty is in terms of particle dynamics. Imagine the line being a highway, and points on it being cars. The dynamics of the traffic flow described by Burgers’ equation is one in which each driver starts at one speed (which can be in reverse), and maintains that speed completely without regard for the cars in front of or behind it. If we start out with a distribution where the leading cars always drive faster than the trailing ones, then the cars will spread further apart as time goes on. But if we start out with a distribution where a car in front is driving slower than a car behind, the second car will eventually catch up and crash into the one in front. And this is the formation of the shock wave.

(Now technically, in this view, once two cars crash their flow-lines should end, and so cars that are in front of the collision and moving forward should not be affected by the collision at all. But if we imagine that instead of real cars we are driving bumper cars, so that after a collision the car in front maintains speed at the velocity of the car that hit it, while the car in back drives at the velocity of the car it hit [so they swap speeds in an elastic collision], then we have something like the picture plotted above.)

Shock boundary
Having established that shocks can form, we move on to the main discussion of this post: the geometry of the set of shock singularities. We will consider the purely local effects of the shocks; by which we mean that we will ignore the chain reactions as described in the parenthetical remark above. Therefore we will assume that at the formation of the shock, the flow-lines terminate and the particles they represent disappear. In other words, we will consider only shocks coming from nearest neighbor collisions. In this scenario, the time of existence of a characteristic line is precisely governed by the equation on $W$ we derived before: that is, given $u_0(x)$, the characteristic line emanating from $x = x_0$ will run into the shock precisely at the time $t = - \frac{1}{\partial_x u_0(x_0)}$. (It will continue indefinitely into the future if the derivative is nonnegative.)

The most well-known image of a shock formation is the image on the right, where we see the classic fan/wedge type shock. (Due to the simplicity of sketching this diagram by hand, this is probably how most people are introduced to diagrams of this type, either on a homework set or in class.) What we see here is an illustration of the fact that

If for $x_1 < x < x_2$, we have $\partial^2_{xx} u_0(x) = 0$, and $\partial_x u_0(x) < 0$, then the shock boundary is degenerate: it consists of a single focal point.

To see this analytically: observe that because the blow-up time depends on the first derivative of the initial velocity distribution, for such a set-up the blow-up time $t_0 = - (\partial_x u_0)^{-1}$ is constant for the various points. Then we see that the spatial coordinate of the blow-up will be $x + u_0(x) t_0$. But since $u_0(x)$ is linear in $x$, we have

$\displaystyle x + u_0(x) t_0 = x_1 + (x-x_1) + u_0(x_1)t_0 + \partial_xu_0 \cdot (x - x_1) t_0 = x_1 + u_0(x_1) t_0$

is constant. And therefore the shock boundary is degenerate.
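A one-line check of the degenerate case (a sketch with the hypothetical linear data $u_0(x) = -x$, so $\partial_x u_0 = -1$ and $t_0 = 1$):

```python
# linear initial data u0(x) = -x: slope -1 everywhere, so t0 = 1 and every
# characteristic X_x(t) = x + u0(x) t should land on the same focal point
u0 = lambda x: -x
t0 = 1.0
endpoints = [x + u0(x) * t0 for x in (-1.0, -0.3, 0.2, 0.9)]
print(endpoints)   # all 0.0: a degenerate, single-point shock boundary
```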

Next we consider the case where $\partial^2_{xx} u_0$ vanishes at some point $x_0$, but $\partial^3_{xxx}u_0(x_0) \neq 0$. The two pictures to the right of this paragraph illustrate the typical shock boundary behaviour. On the far right we have the slightly aphysical situation: notice that for a particle coming in from the left, before it hits its own shock boundary, it first crosses the shock boundary formed by the particles coming in from the right. This is the situation where the third derivative is positive, and the cusp point which corresponds to the shock boundary for $x_0$ opens to the future. The nearer picture is the situation where the third derivative is negative, with the cusp point opening downwards. Notice that since we are in a neighborhood of a point where the second derivative vanishes, the initial velocity distributions both look almost straight, and it is hard to distinguish from this image the sign of the third derivative. The picture on the far right is based on an arctan type initial distribution, whereas the nearer picture is based on an $x^3$ type initial distribution. Let us again analyse the situation more deeply. Near the point $x_0$, we shall assume that $\partial^3_{xxx}u_0 \sim \partial^3_{xxx}u_0(x_0) = C$ for some nonzero constant $C$. And we will assume, using Galilean transformations, that $u_0(x_0) = 0 = x_0$. Then letting $t_0 = - (\partial_x u_0(x_0))^{-1}$, we have

$\displaystyle u_0(x) = \frac{C}{6} x^3 - \frac{1}{t_0} x$

Thus as a function of $x$, the blow-up times of flow lines are given by

$\displaystyle t(x) = \frac{t_0}{1 - \frac{C}{2}t_0 x^2}$

Solving for their blow-up profile $y = x + u_0(x) t(x)$ then gives (after quite a bit of algebraic manipulation)

$\displaystyle \frac{ (\frac{t}{t_0} - 1)^3}{t} = \frac{9C}{8} y^2$

which can be easily seen to be a cusp: $\frac{dy}{dt} = 0$ at $y=0, t = t_0$. And it is clear that the side toward which the cusp opens depends on the sign of the third derivative, $C$.
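Since the algebraic manipulation is somewhat involved, here is a numerical verification of the cusp relation (a sketch; the values of $C$ and $t_0$ are arbitrary):

```python
# check the relation (t/t0 - 1)^3 / t = (9C/8) y^2 along the blow-up
# profile y = x + u0(x) t(x), for u0(x) = (C/6) x^3 - x/t0 and
# t(x) = t0 / (1 - (C/2) t0 x^2); C and t0 are arbitrary test values
C, t0 = 1.5, 2.0

def gap(x):
    u0 = C / 6.0 * x**3 - x / t0
    t = t0 / (1.0 - C / 2.0 * t0 * x * x)
    y = x + u0 * t
    lhs = (t / t0 - 1.0) ** 3 / t
    rhs = 9.0 * C / 8.0 * y * y
    return abs(lhs - rhs)

print(max(gap(x) for x in (0.05, -0.1, 0.2, 0.3)))   # ~ 0 (rounding only)
```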

The last bit of computation we will do is for the generic case $D = \partial^2_{xx}u_0(x_0) \neq 0$. In this case we can take

$\displaystyle u_0(x) = - \frac{1}{t_0}x + \frac{D}{2} x^2$

as an approximation. Then the blowup times will be

$\displaystyle t(x) = \frac{t_0}{1 - D t_0 x}$

which leads to the blowup profile $y$ being [Thanks to Huy for the correction.]

$\displaystyle y = -\frac{1}{2Dt} \left( 1 - \frac{t}{t_0}\right)^2$

and a direct computation will then lead to the conclusion that in this generic scenario, the shock boundary will be everywhere tangent to the flow-line that ends there.
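That direct computation can be spot-checked numerically (a sketch; the values of $D$ and $t_0$ are arbitrary): the slope of the shock boundary at the blow-up time $t(x)$ should agree with $u_0(x)$, the constant slope of the characteristic line ending there.

```python
# shock boundary y(t) = -(1/(2Dt)) (1 - t/t0)^2 for the quadratic data
# u0(x) = -x/t0 + (D/2) x^2; D and t0 are arbitrary test values
D, t0 = 0.7, 2.0

def y(t):
    return -(1.0 - t / t0) ** 2 / (2.0 * D * t)

def boundary_slope(t, h=1e-6):
    # numerical dy/dt along the shock boundary
    return (y(t + h) - y(t - h)) / (2.0 * h)

def u0(x):
    return -x / t0 + D / 2.0 * x * x

# at the blow-up time t(x) = t0/(1 - D t0 x), the boundary's slope should
# equal u0(x), the constant slope of the characteristic ending there
err = max(abs(boundary_slope(t0 / (1.0 - D * t0 * x)) - u0(x))
          for x in (0.1, -0.2, 0.3))
print(err)   # tiny: the boundary is everywhere tangent to a flow-line
```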

### Conway’s Base 13 Function

(N.b. Credit where credit’s due: I learned about this function from an answer of Robin Chapman’s on MathOverflow, and its measurability from Noah Stein.)

Conway’s base 13 function is a strange beast. It was originally crafted by John Conway as a counterexample to the converse of the intermediate value theorem, and has the property that on any open interval its image contains the entire real line. In addition, its support set also serves as an illustration of a dense, uncountable set of numbers whose Lebesgue measure is 0. Read the rest of this entry »

### How to ballast your ship or why my kitchen sink is always dirty

Recently I’ve been reading a bit of fluid dynamics, and came across the classical problems of free and forced vortices. In this post I will first discuss some mathematics leading to a computation of the free surface for the two types of vortices, and finish with a discussion of their implications.

Free vortex
A free vortex is seen in nature as either the bathtub drain or the maelstrom. A physical model of this is to start with a circular tub with a drain in the centre. To establish a steady state, refill the tub at the same rate it drains. The water is refilled near the rim of the tub, where the water enters tangentially to the tub with fixed angular velocity $\omega_0$. The fluid is assumed to be incompressible and inviscid (a good approximation for water) and free of vorticity (the water is rotational on a global level, but the drain hole presents a topological obstruction, so the flow is still allowed to be locally free of vorticity). We also assume that the draining rate is slow, so that the fluid motion is dominated by rotation around the drain.

Under these conditions, the fluid can be modelled by the irrotational Euler equation

$\displaystyle \frac{\partial u}{\partial t} + \nabla \left( \frac{p}{\rho} + gz + \frac{1}{2} u\cdot u\right) = 0$

where $u$ is the fluid velocity field, $p$ is the pressure, $\rho$ is the (constant) density, $g$ is the acceleration due to gravity. We work in a cylindrical coordinate system $(r,\theta,z)$ where the $z$ axis is centred with respect to the tub, whose rim is $r = R$.

Now, by the slow-draining assumption, $u\cdot u$ is given by the square norm of the azimuthal component of the velocity. This we can determine by conservation of angular momentum: the particles come in at the rim with angular momentum $\omega_0 R^2$ per unit mass, so $u\cdot u = \frac{\omega_0^2 R^4}{r^2}$. In the steady state $\partial_t u =0$, so in the end we need to have

$\displaystyle \frac{1}{\rho}\frac{\partial p}{\partial z} = -g$
and
$\displaystyle \frac{1}{\rho}\frac{\partial p}{\partial r} = \frac{\omega_0^2R^4}{r^3}$

Now, let the fluid surface be given by $z = f(r)$. At the free surface the pressure is fixed (to be the ambient atmospheric pressure), so $\frac{d}{dr} p(r,f(r)) = 0$. From the chain rule this implies

$\displaystyle \frac{\omega_0^2R^4}{r^3} - g f'(r) = 0$

which integrates to the surface profile

$\displaystyle f(r) = f(R) + \frac{\omega_0^2R^2}{2g} - \frac{\omega_0^2R^4}{2g r^2}$

(the region where $f(r)$ drops below zero is, presumably, inside the drain hole).
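The free-surface relation can be sanity-checked numerically (a sketch; the parameter values are arbitrary):

```python
# check that f(r) = f(R) + w0^2 R^2/(2g) - w0^2 R^4/(2 g r^2) satisfies
# the free-surface condition w0^2 R^4 / r^3 - g f'(r) = 0;
# the parameter values are arbitrary
w0, R, g, fR = 2.0, 1.0, 9.8, 0.5

def f(r):
    return fR + w0**2 * R**2 / (2 * g) - w0**2 * R**4 / (2 * g * r**2)

def residual(r, h=1e-6):
    fprime = (f(r + h) - f(r - h)) / (2 * h)   # numerical f'(r)
    return w0**2 * R**4 / r**3 - g * fprime

print(max(abs(residual(r)) for r in (0.2, 0.5, 0.9, 1.0)))   # ~ 0
```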

Forced vortex
For the second problem of the forced rotation, I’d like to present a slightly different method of computation. The physical model of this problem is to set a bucket of water spinning around the central axis of the bucket. Now, for an idealized perfect fluid, the lack of interaction between concentric layers means that any work done spinning the bucket will not be transferred to the fluid. However, we are interested in the steady state problem. In the steady state the fluid will be co-rotating with the bucket and each concentric layer will have the same angular velocity (so there is non-vanishing vorticity and the framework of the previous problem cannot be used), and hence no shear. In the regime where there is no shear, viscosity plays no role. So after the fluid has settled down to a steady state, we can analyse it as if it were incompressible and inviscid.

For this problem in particular, we can use the action principle. In the steady state, the total kinetic energy of the fluid is given by

$\displaystyle T = \int_0^R \pi \rho f(r) r^3 \omega_0^2 dr$

where the factor of $\pi$ enters from the suppressed azimuthal integral, and the computation comes from the particulate $\frac{1}{2} mv^2$ expression for kinetic energy. $f(r)$ is, as before, the height of the fluid. The potential energy is gravitational

$\displaystyle V = \int_0^R \pi \rho f(r)^2 g r dr$

The incompressibility gives the constraint that $\int f(r) r dr$ is a fixed constant. The action principle then says that the steady state is a stationary point of $S[f] = T - V$ subject to this constraint. To compute the variation, observe that the volume constraint implies $\int \delta f(r) r dr = 0$, and so by the fundamental theorem of calculus, the admissible perturbations are those of the form

$\displaystyle \delta f(r) = \frac{1}{r} \frac{d}{dr} h(r)$
where
$h(R) = h(0) = 0$

Thus we can compute $\delta S[f]$ and integrate by parts to obtain the Euler-Lagrange equation

$\displaystyle \frac{d}{dr}(r^2\omega_0^2 - 2f(r) g) = 0$

which leads to the paraboloid solution

$\displaystyle f(r) = f(0) + \frac{r^2\omega_0^2}{2g}$

(I have a bit of fondness for this problem as it appeared on the 2001 International Physics Olympiad’s experimental section. We were then to use the parabolic surface of the spinning bucket of gel as a focusing mirror.)
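The stationarity of the action at the paraboloid can also be checked numerically: perturb $f$ by a volume-preserving variation $\delta f = \frac{1}{r}h'(r)$ with $h(0)=h(R)=0$, and verify that the first-order change in $S = T - V$ vanishes (a sketch; the parameter values and the particular $h$ below are arbitrary choices):

```python
import math

# paraboloid f0 plus a volume-preserving perturbation; parameter values
# and the choice of h(r) are arbitrary
R, w0, g, rho = 1.0, 2.0, 9.8, 1.0
N = 20000
dr = R / N
rs = [(i + 0.5) * dr for i in range(N)]   # midpoint quadrature nodes

def f0(r):
    return 0.5 + w0 * w0 * r * r / (2 * g)   # paraboloid, arbitrary offset

def dh(r):
    # h(r) = r^2 (R - r)^2 vanishes at r = 0 and r = R, so the
    # perturbation delta_f = h'(r)/r preserves the volume integral
    return 2 * r * (R - r) ** 2 - 2 * r * r * (R - r)

def action(eps):
    T = sum(w0 * w0 * (f0(r) + eps * dh(r) / r) * r**3 for r in rs)
    V = sum(g * (f0(r) + eps * dh(r) / r) ** 2 * r for r in rs)
    return math.pi * rho * (T - V) * dr

eps = 1e-4
S_prime = (action(eps) - action(-eps)) / (2 * eps)
vol_change = sum(dh(r) for r in rs) * dr   # first-order volume change
print(S_prime, vol_change)   # both ~ 0: the paraboloid is stationary
```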

How to ballast your boat
So far the discussion is fairly well-known. Now let us consider a floating body on the vortex. We’ll assume that the angular motion of the floating body is driven by the fluid’s rotation, so that at every point of its journey its angular speed is equal to that of the fluid.

We try to draw a force diagram for the object. It experiences the force of gravity and the centrifugal force at its centre of mass. The force of gravity is $mg$ and the centrifugal force is $mv^2 / r$: in the case of a maelstrom, it is $m\omega_0^2 R^4 / r^3$, and in the case of the forced vortex it is $m\omega_0^2 r$.

On the other hand, it also experiences a force due to the pressure of the water at its centre of buoyancy. By Archimedes’ principle, this force is equal and opposite to the sum of the gravitational force and the centrifugal force on the water displaced. We can assume that the total net force normal to the water surface is zero, and we examine the radial drift of the floating body.

If the radial position of the centres of mass and of buoyancy are equal, then the total force on the object must be zero, so the object stays with the current and rotates around the vortex. This is the case of the ship being perfectly ballasted.

If the centre of mass is shallower than the centre of buoyancy (say, for a beach ball floating on water, or any uniform, lighter-than-water body), then the radial position $r_m$ of the centre of mass satisfies $r_m \le r_b$, where $r_b$ is the radial position of the centre of buoyancy. In a forced vortex, this means that the radial force of buoyancy provided by the water pressure, $\omega_0^2 r_b$ per unit mass, is at least the centripetal force $\omega_0^2 r_m$ required to keep the object on that circular trajectory, so the under-ballasted ship will drift toward the centre of a forced vortex. This should be familiar to anyone who has stirred powdered milk or made hot chocolate: when the mug or pot is stirred, the floating clumps of powder congregate toward the centre. On the other hand, in a free vortex, the radial force of buoyancy, $\omega_0^2 R^4 / r_b^3$, is at most the requisite centripetal force $\omega_0^2 R^4 / r_m^3$, so objects will drift away from the vortex! This explains why, after washing dishes and pulling the drain-stop, the floating food bits tend not to go down the drain but end up stuck to the walls of the kitchen sink.

We can also consider over-ballasted ships. This tends to be the norm, to ensure stability for travel at sea, especially in the face of changing barometric pressures. Now the centre of mass is deeper than the centre of buoyancy, so $r_m \ge r_b$, and we see that the conclusions are reversed! For a forced vortex the over-ballasted ship will be turned away from the vortex, whereas in a maelstrom the over-ballasted ship will be sucked in!

So the moral of the story is: if your ship is trapped in a maelstrom, throw all ballasts overboard and stand up straight holding your arm over your head, and hope for the best.

### A little Hilbert space problem

First let us consider the following question on a finite dimensional vector space. Let $(V, \langle\cdot,\cdot\rangle)$ be a $k$-dimensional Hermitian-product space. Let $(e_i)_{1\leq i \leq k}$ be an orthonormal basis for $V$. Let $T:V\to V$ be the linear operator defined by $T(e_i) = e_{i+1}$ when $i < k$, and $T(e_k) = 0$. Does there exist any non-trivial vector $v\in V$ such that $\langle v,v\rangle = 1$ and $\langle v, T^jv\rangle = 0$ for all $j \geq 1$?

The answer, in this case, is no. Write $v = \sum v_i e_i$, where the $v_i$ are complex numbers. Let $a$ be the smallest index such that $v_a \neq 0$, and similarly let $b$ be the largest such index. If $a = b$, then $v = v_a e_a$ is a multiple of a standard basis element, and so is trivial. So assume $a < b$. Now, the requirement $\langle v, T^{b-a}v\rangle = 0$ forces $\overline{v_b} v_a = 0$ (the shift by $b-a$ leaves only this single overlapping term), which contradicts our assumption that $a,b$ are the minimum and maximum non-vanishing indices. In this proof, we used crucially that $V$ is finite dimensional, so that a largest index $b$ can exist.
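The single-surviving-term computation can be illustrated concretely (a sketch, using the convention that the Hermitian product is conjugate-linear in its first argument):

```python
import random

# k x k shift: (T^m v)_i = v_{i-m}; <u, w> = sum conj(u_i) w_i
k = 8

def shift_pow(v, m):
    return [v[i - m] if i >= m else 0.0 for i in range(k)]

def inner(u, w):
    return sum(x.conjugate() * y for x, y in zip(u, w))

random.seed(0)
a, b = 2, 6   # support of v: indices a..b, with v_a, v_b nonzero
v = [0.0] * k
for i in range(a, b + 1):
    v[i] = complex(random.uniform(-1, 1), random.uniform(-1, 1))

# the shift by b - a leaves a single overlapping term: conj(v_b) v_a != 0
lhs = inner(v, shift_pow(v, b - a))
print(lhs, v[b].conjugate() * v[a])   # equal, and nonzero
```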

Now, onto the real question

Question
Take the complex Hilbert space $\ell^2(\mathbb{N})$, i.e. the set of all complex sequences $(a_i)_{0\leq i < \infty}$ satisfying $\sum_{i\in\mathbb{N}} |a_i|^2 < \infty$. Let $e = (1,0,0,\ldots)$, and let $T$ be the right shift operator: $(Ta)_{i+1} = a_i$ and $(Ta)_0 = 0$. Then $\{T^ke\}_{k\in\mathbb{N}}$ is an orthonormal basis of $\ell^2$, and we have $\langle e, T^ke\rangle = \delta_0^k$. Do there exist non-trivial elements $v$ of $\ell^2$ for which $\langle v, T^kv\rangle = \delta_0^k$ holds?

The answer is yes by the way. Read the rest of this entry »