… Data aequatione quotcunque fluentes quantitates involvente fluxiones invenire et vice versa …

## Category: partial differential equations

### Heat ball

There are very few things I find unsatisfactory in L.C. Evans’ wonderful textbook on Partial Differential Equations; one of them is the illustration (on p.53 of the second edition) of the “heat ball”.

The heat ball is a region with respect to which an analogue of the mean value property of solutions to Laplace’s equation can be expressed, now for solutions of the heat equation. In the case of Laplace’s equation, the regions are round balls. In the case of the heat equation, the regions are somewhat more complicated. They are defined by the expression

$\displaystyle E(x,t;r) := \left\{ (y,s)\in \mathbb{R}^{n+1}~|~s \leq t, \Phi(x-y, t-s) \geq \frac{1}{r^n} \right\}$

where $\Phi$ is the fundamental solution of the heat equation

$\displaystyle \Phi(x,t) := \frac{1}{(4\pi t)^{n/2}} e^{- \frac{|x|^2}{4t}}.$

In the expressions above, the constant $n$ is the number of spatial dimensions; $r$ is the analogue of the radius of the ball, and in $E(x,t;r)$, the point $(x,t)$ is the center. Below is a better visualization of the heat balls: the curves shown are the boundaries $\partial E(0,5;r)$ in dimension $n = 1$, for radii between 0.75 and 4 in steps of 0.25 (in particular all the red curves have integer radii). In higher dimensions the shapes are generally the same, though they appear more “squashed” in the $t$ direction.

1-dimensional heat balls centered at (0,5) for various radii. (Made using Desmos)
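For readers who want to reproduce the plot: when $n = 1$, the boundary condition $\Phi(x-y, t-s) = 1/r$ can be solved explicitly for $y$ in terms of $\tau = t - s$, giving $(x-y)^2 = 2\tau \log\frac{r^2}{4\pi\tau}$ for $\tau \in (0, r^2/(4\pi)]$. Here is a small Python sketch (the function names are my own, not from any particular library) that generates boundary points this way:

```python
import math

def heat_kernel(xi, tau, n=1):
    """Fundamental solution Phi of the heat equation in n spatial dimensions."""
    return (4 * math.pi * tau) ** (-n / 2) * math.exp(-xi**2 / (4 * tau))

def heat_ball_boundary(t=5.0, r=2.0, samples=200):
    """Points (y, s) on the boundary of E(0, t; r) for n = 1.

    On the boundary, Phi(-y, t - s) = 1/r, which solves to
    y^2 = 2 tau log(r^2 / (4 pi tau)) with tau = t - s in (0, r^2/(4 pi)].
    """
    tau_max = r**2 / (4 * math.pi)   # deepest extent of the ball below time t
    pts = []
    for i in range(1, samples + 1):
        tau = tau_max * i / samples
        y2 = 2 * tau * math.log(r**2 / (4 * math.pi * tau))
        if y2 >= 0:
            y = math.sqrt(y2)
            pts.extend([(y, t - tau), (-y, t - tau)])
    return pts
```

Feeding these points into any plotting tool reproduces curves like the ones above; by construction every generated point satisfies $\Phi(x-y, t-s) = 1/r$ exactly.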

### Decay of Waves IV: Numerical Interlude

I offer two videos. In both videos the same colour scheme is used: we have four waves in red, green, blue, and magenta. The four represent the amplitudes of spherically symmetric free waves on four different types of spatial geometries: 1 dimensional flat space, 2 dimensional flat space, 3 dimensional flat space, and a 3 dimensional asymptotically flat manifold with “trapping” (it has closed geodesics). Can you tell which is which? (Answer below the fold.)

### Gauge invariance, geometrically

A somewhat convoluted chain of events led me to think about the geometric description of partial differential equations. And a question I asked myself this morning was

Question
What is the meaning of gauge invariance in the jet-bundle treatment of partial differential equations?

The answer, actually, is quite simple.

Review of the geometric formulation of PDEs
We consider here abstract PDEs formulated geometrically. All objects considered will be smooth. For more about the formal framework presented here, a good reference is H. Goldschmidt, “Integrability criteria for systems of nonlinear partial differential equations”, JDG (1967) 1:269–307.

A quick review: the background manifold $X$ is assumed (here we take a slightly more restrictive point of view) to be a connected smooth manifold. The configuration space $\mathcal{C}$ is defined to be a fibred manifold $p:\mathcal{C}\to X$. By $J^r\mathcal{C}$ we refer to the fibred manifold of $r$-jets of $\mathcal{C}$, whose projection is $p^r = p \circ \pi^r_0$, where for $r > s$ we use $\pi^r_s: J^r\mathcal{C}\to J^s\mathcal{C}$ for the canonical projection.

A field is a (smooth) section $\phi \in \Gamma \mathcal{C}$. A simple example that captures most of the usual cases: if we are studying mappings between manifolds $\phi: X\to N$, then we take $\mathcal{C} = N\times X$ the trivial fibre bundle. The $s$-jet operator naturally sends $j^s: \Gamma\mathcal{C} \ni \phi \mapsto j^s\phi \in \Gamma J^s\mathcal{C}$.

A partial differential equation of order $r$ is defined to be a fibred submanifold $J^r\mathcal{C} \supset R^r \to X$. A field is said to solve the PDE if $j^r\phi \subset R^r$.

In the usual case of systems of PDEs on Euclidean space, $X$ is taken to be $\mathbb{R}^d$ and $\mathcal{C} = \mathbb{R}^n\times X$ the trivial vector bundle. A system of $m$ PDEs of order $r$ is usually taken to be $F(x,\phi, \partial\phi, \partial^2\phi, \ldots, \partial^r\phi) = 0$ where

$\displaystyle F: X\times \mathbb{R}^n \times \mathbb{R}^{dn} \times \mathbb{R}^{\frac{1}{2}d(d+1)n} \times \cdots \times \mathbb{R}^{{d+r-1 \choose r} n} \to \mathbb{R}^m$

is some function. We note that the domain of $F$ can be identified in this case with $J^r\mathcal{C}$. We can then extend $F$ to $\tilde{F}: J^r\mathcal{C} \ni c \mapsto (F(c),p^r(c)) \in \mathbb{R}^m\times X$, a fibre bundle morphism.

If we assume that $\tilde{F}$ has constant rank, then $\tilde{F}^{-1}(0)$ is a fibred submanifold of $J^r\mathcal{C}$, and this is our differential equation.

Gauge invariance
In this framework, the gauge invariance of a partial differential equation relative to certain symmetry groups can be captured by requiring that $R^r$ be an invariant submanifold.

More precisely, we take

Definition
A symmetry/gauge group $\mathcal{G}$ is a subgroup of $\mathrm{Diff}(\mathcal{C})$, with the property that for any $g\in\mathcal{G}$, there exists a $g'\in \mathrm{Diff}(X)$ with $p\circ g = g' \circ p$.

It is important that we are looking at the diffeomorphism group of $\mathcal{C}$, not of $J^r\mathcal{C}$. In general diffeomorphisms of $J^r\mathcal{C}$ will not preserve holonomic sections (those of the form $j^r\phi$), a condition that is essential for solving PDEs. The condition that the symmetry operation “commutes with projections” ensures that $g:\Gamma\mathcal{C}\to\Gamma\mathcal{C}$, which in particular guarantees that $g$ extends to a diffeomorphism of $J^r\mathcal{C}$ with itself that commutes with projections.

From this point of view, a (system of) partial differential equation(s) $R^r$ is said to be $\mathcal{G}$-invariant if for every $g\in\mathcal{G}$, we have $g(R^r) \subset R^r$.

We give two examples showing that this description agrees with the classical notions.

Gauge theory. In classical gauge theories, the configuration space $\mathcal{C}$ is a fibre bundle with structure group $G$ which acts on the fibres. A section of $G\times X \to X$ induces a diffeomorphism of $\mathcal{C}$ by fibre-wise action. In fact, the gauge symmetry is a fibre bundle morphism (it fixes the base points).

General relativity. In general relativity, the configuration space is the space of Lorentzian metrics. So the background manifold is the space-time $X$. And the configuration space is the open submanifold of $S^2T^*X$ given by non-degenerate symmetric bilinear forms with signature (-+++). A diffeomorphism $\Psi:X\to X$ induces $T^*\Psi = (\Psi^{-1})^*: T^*X \to T^*X$ and hence a configuration space diffeomorphism that commutes with projection. It is in this sense that Einstein’s equations are diffeomorphism invariant.

Notice, of course, that this formulation does not capture the “physical” distinction between global and local gauge transformations. For example, for a linear PDE (so $\mathcal{C}$ is a vector bundle and $R^r$ is closed under linear operations), the trivial “global scaling” of a solution is considered in this framework a gauge symmetry, though it is generally ignored in physics.

### Decay of waves IIIb: tails for homogeneous linear equation on curved background

Now we will actually show that the specific decay properties of the linear wave equation on Minkowski space–in particular the strong Huygens’ principle–is very strongly tied to the global geometry of that space-time. In particular, we’ll build, by hand, an example of a space-time where geometry itself induces back-scattering, and even linear, homogeneous waves will exhibit a tail.

For convenience, the space-time we construct will be spherically symmetric, and we will only consider spherically symmetric solutions of the wave equation on it. We will also focus on the 1+3 dimensional case. Read the rest of this entry »

### Decay of waves IIIa: nonlinear tails in Minkowski space redux

Before we move on to the geometric case, I want to flesh out the nonlinear case mentioned at the end of the last post a bit more. Recall that it was shown that for generic nonlinear (actually semilinear; for quasilinear and worse equations we cannot use Duhamel’s principle) wave equations, if the initial data has compact support, we expect the first iterate to exhibit a tail. One may ask whether this is, in fact, an artifact of the successive approximation scheme; that somehow a conspiracy always transpires, and all the higher order iterates cancel out the tail coming from the first iterate. This is rather unlikely, owing to the fact that the convergence to $\phi_\infty$ is dominated by a geometric series. But to just make double sure, here we give a nonlinear system of wave equations such that the successive approximation scheme converges after finitely many steps (in fact, after the first iterate), and so we can also explicitly compute the rate of decay for the nonlinear tail. While the decay rate is not claimed to be generic (though it is), the existence of one such example with a fixed decay rate shows that for a statement quantifying over all nonlinear wave equations, it would be impossible to demonstrate a better decay rate than the one exhibited. Read the rest of this entry »

### Decay of waves IIb: Minkowski space, with right-hand side

In the first half of this second part of the series, we considered solutions to the linear, homogeneous wave equation on flat Minkowski space, and showed that for compactly supported initial data, we have strong Huygens’ principle. We further made references to the fact that this behaviour is expected to be unstable. In this post, we will further illustrate this instability by looking at Equation 1 first with a fixed source $F = F(t,x)$, and then with a nonlinearity $F = F(t,x, \phi, \partial\phi)$.

Duhamel’s Principle

To study how one can incorporate inhomogeneous terms into a linear equation, and to get a qualitative grasp of how the source term contributes to the solution, we need to discuss the abstract method known as Duhamel’s Principle. We start by illustrating this for a very simple ordinary differential equation.

Consider the ODE satisfied by a scalar function $\alpha$:

Equation 13
$\displaystyle \frac{d}{ds}\alpha(s) = k(s)\alpha(s) + \beta(s)$

When $\beta\equiv 0$, we can easily solve the equation with an integrating factor:

$\displaystyle \alpha(s) = \alpha(0) e^{\int_0^s k(t) dt}$

Using this as a sort of an ansatz, we can solve the inhomogeneous equation as follows. For convenience we denote by $K(s) = \int_0^s k(t) dt$ the anti-derivative of $k$. Then multiplying Equation 13 through by $e^{-K(s)}$, we have that

Equation 14
$\displaystyle \frac{d}{ds} \left( e^{-K(s)}\alpha(s)\right) = e^{-K(s)}\beta(s)$

which we solve by integrating

Equation 15
$\displaystyle \alpha(s) = e^{K(s)}\alpha(0) + e^{K(s)} \int_0^s e^{-K(t)}\beta(t) dt$

If we write $K(s;t) = \int_t^s k(u) du$, then we can rewrite Equation 15 as given by an integral operator

Equation 15′
$\displaystyle \alpha(s) = e^{K(s)}\alpha(0) + \int_0^s e^{K(s;t)}\beta(t) dt$
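As a sanity check on Equation 15′, one can compare the closed-form expression against a direct numerical integration of Equation 13. The following Python sketch (with arbitrary illustrative choices of $k$ and $\beta$; none of this is in the original derivation) evaluates the integrals by the trapezoid rule and the ODE by classical RK4:

```python
import math

def duhamel_solution(alpha0, k, beta, s, steps=2000):
    """Evaluate Equation 15': alpha(s) = e^{K(s)} alpha(0) + int_0^s e^{K(s;t)} beta(t) dt,
    with K(s;t) = int_t^s k(u) du; both integrals by the trapezoid rule."""
    h = s / steps
    K = [0.0]                        # K[i] approximates K(i h) = int_0^{ih} k
    for i in range(steps):
        u0, u1 = i * h, (i + 1) * h
        K.append(K[-1] + 0.5 * h * (k(u0) + k(u1)))
    Ks = K[-1]                       # K(s)
    integral = 0.0
    for i in range(steps + 1):
        w = 0.5 if i in (0, steps) else 1.0
        integral += w * h * math.exp(Ks - K[i]) * beta(i * h)
    return math.exp(Ks) * alpha0 + integral

def ode_solution(alpha0, k, beta, s, steps=2000):
    """Integrate Equation 13, alpha' = k alpha + beta, directly with RK4."""
    h = s / steps
    a, t = alpha0, 0.0
    f = lambda t, a: k(t) * a + beta(t)
    for _ in range(steps):
        k1 = f(t, a)
        k2 = f(t + h / 2, a + h / 2 * k1)
        k3 = f(t + h / 2, a + h / 2 * k2)
        k4 = f(t + h, a + h * k3)
        a += h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
        t += h
    return a
```

The two routes agree to the accuracy of the quadrature, which is what Duhamel’s principle promises.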

### Decay of waves IIa: Minkowski background, homogeneous case

Now let us get into the mathematics. The wave equations that we will consider take the form

Equation 1
$-\partial_t^2 \phi + \triangle \phi = F$

where $\phi:\mathbb{R}^{1+n}\to\mathbb{R}$ is a real valued function defined on (1+n)-dimensional Minkowski space that describes our solution, and $F$ represents a “source” term. When $F$ vanishes identically, we say that we are looking at the linear, homogeneous wave equation. When $F$ is itself a function of $\phi$ and its first derivatives, we say that the equation is a semilinear wave equation.

Homogeneous wave equation in one spatial dimension

One interesting aspect of the wave equation is that it only possesses the second, multidimensional, dispersive mechanism as described in my previous post. In physical parlance, the “phase velocity” and the “group velocity” of the wave equation are the same. And therefore, a solution of the wave equation, quite unlike a solution of the Schroedinger equation, will not exhibit decay when there is only one spatial dimension (mathematically this is one significant difference between relativistic and quantum mechanics). In this section we make a computation to demonstrate this, a computation that will also be useful later on when we look at higher (in particular, three) dimensions.

Use $x\in\mathbb{R}$ for the variable representing spatial position. The wave equation can be written as

$-\partial_t^2 \phi + \partial_x^2\phi = 0$

Now we perform a change of variables: let $u = \frac{1}{2}(t-x)$ and $v = \frac{1}{2}(t+x)$ be the canonical null variables. The change of variable formula replaces

Equation 2
$\displaystyle \partial_t \to \frac{\partial u}{\partial t} \partial_u + \frac{\partial v}{\partial t} \partial_v = \frac{1}{2}\partial_u + \frac{1}{2}\partial_v$
$\displaystyle \partial_x \to \frac{\partial u}{\partial x} \partial_u + \frac{\partial v}{\partial x} \partial_v = -\frac{1}{2}\partial_u + \frac{1}{2}\partial_v$

and we get that in the $(u,v)$ coordinate system,

Equation 3
$-\partial_u \partial_v \phi = 0$
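Equation 3 says that in null coordinates the general solution is $\phi = F(u) + G(v)$, i.e. $\phi(t,x) = F(\tfrac{t-x}{2}) + G(\tfrac{t+x}{2})$: two travelling waves, each translating at unit speed with no decay of amplitude. A quick numerical sketch (the profiles $F, G$ are chosen arbitrarily, just for illustration) confirming that any such superposition solves the wave equation:

```python
import math

def phi(t, x, F=lambda u: math.exp(-u**2), G=lambda v: math.exp(-v**2)):
    """General solution of Equation 3: phi = F(u) + G(v) in the null
    variables u = (t - x)/2, v = (t + x)/2; F and G are arbitrary profiles."""
    return F((t - x) / 2) + G((t + x) / 2)

def wave_residual(t, x, h=1e-3):
    """Centered finite differences for -d_t^2 phi + d_x^2 phi; should be ~0."""
    dtt = (phi(t + h, x) - 2 * phi(t, x) + phi(t - h, x)) / h**2
    dxx = (phi(t, x + h) - 2 * phi(t, x) + phi(t, x - h)) / h**2
    return -dtt + dxx
```

Note also that $\phi(t, t) = F(0) + G(t)$, so the peak of the $F$-pulse keeps its full height for all time: this is the absence of decay in one spatial dimension discussed above.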

### Decay of waves I: Introduction

In the next week or so, I will compose a series of posts on the heuristics for the decay of the solutions of the wave equation on curved (and flat) backgrounds. (I have my fingers crossed that this does not end up aborted like my series of posts on compactness.) In this first post I will give some physical intuition of why waves decay. In the next post I will write about the case of linear and nonlinear waves on flat space-time, which will be used to motivate the construction, in post number three, of an example space-time which gives an upper bound on the best decay that can be generally expected for linear waves on non-flat backgrounds. This last argument, due to Mihalis Dafermos, shows why the heuristic known as Price’s Law is as good as one can reasonably hope for in the linear case. (In the nonlinear case, things immediately get much much worse, as we will see already in the next post.)

This first post will not be too heavily mathematical; indeed, the only real foray into mathematics will be in the appendix. The next ones, however, require some basic familiarity with partial differential equations and pseudo-Riemannian geometry. Read the rest of this entry »

### Shock singularities in Burgers’ equation

It is generally well known that partial differential equations that model fluid motion can exhibit “shock waves”. In fact, the subject I will write about today is generally presented as the canonical example for such behaviour in a first course in partial differential equations (while also introducing the method of characteristics). The focus here, however, will not be so much on the formation of shocks, but on the profile of the shock boundary. This discussion tends to be omitted from introductory texts.

Solving Burgers’ equation
First we recall the inviscid Burgers’ equation, a fundamental partial differential equation in the study of fluids. The equation is written

Equation 1. Inviscid Burgers’ equation
$\displaystyle \frac{\partial}{\partial t} u + u \frac{\partial}{\partial x} u = 0$

where $u = u(t,x)$ is the “local fluid velocity” at time $t$ and at spatial coordinate $x$. The solution of the equation is closely related to its derivation: notice that we can re-write the equation as

$v \cdot \nabla u = (\partial_t + u \partial_x) u = 0$

The question we consider is the initial value problem for the PDE: given some initial velocity configuration $u_0(x)$, we want to find a solution $u(t,x)$ to Burgers’ equation such that $u(0,x) = u_0(x)$.

The traditional way of obtaining a solution is via the method of characteristics. We first observe (1) the alternate form of the equation above means that if $X(t)$ is a curve tangent to the vector field $v = \partial_t + u\partial_x$, we must have $u(t,X(t))$ be a constant valued function of the parameter $t$. (2) Plugging this back in implies that along such a curve $X(t)$, the vector field $v = \partial_t + u\partial_x = \partial_t + u_0 \partial_x$ is constant. (3) A curve whose tangent vector is constant is a straight line. So we have that a solution of the Burgers’ equation must verify

$u(t, x + u_0(x) \cdot t) = u_0(x)$

And we call the family of curves given by $X_x(t) = x + u_0(x) \cdot t$ the characteristic curves of the solution.
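The implicit relation above can be turned into a numerical evaluation scheme: before a shock forms, the map $x \mapsto x + u_0(x)\cdot t$ is monotone, so for each point $(t,y)$ we can bisect for the foot point $x$ of the characteristic through $(t,y)$ and read off $u(t,y) = u_0(x)$. A Python sketch (my own, just for illustration):

```python
def burgers_solution(u0, t, y, lo=-50.0, hi=50.0, iters=100):
    """Evaluate u(t, y) from the implicit relation u(t, x + u0(x) t) = u0(x),
    by bisecting for the characteristic foot point x with x + u0(x) t = y.

    Assumes t is before shock formation, so x -> x + u0(x) t is increasing,
    and that the foot point lies in [lo, hi]."""
    g = lambda x: x + u0(x) * t - y
    a, b = lo, hi
    for _ in range(iters):
        m = 0.5 * (a + b)
        if g(a) * g(m) <= 0:
            b = m          # sign change in [a, m]: root is there
        else:
            a = m
    return u0(0.5 * (a + b))
```

For a linear profile $u_0(x) = cx$ the characteristics never cross when $c > 0$, and the implicit relation can be solved by hand, $u(t,y) = cy/(1+ct)$, which makes a convenient test case.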

To extract more qualitative information about Burgers’ equation, let us take another spatial derivative of the equation, and call the function $w = \partial_x u$. Then we have

$\partial_t w + w^2 + u \partial_x w = 0 \implies v \cdot \nabla w + w^2 = 0$

So letting $X(t)$ be a characteristic curve, and writing $W(t) = w(t, X(t))$, we have that along the characteristic curve

$\displaystyle \frac{d}{dt}W = - W^2 \implies W(t) = \frac{1}{t+W(0)^{-1}}$

So in particular, we see that if $W(0) < 0$, $W(t)$ must blow up in time $t \leq |W(0)|^{-1}$.

So what does this mean? We’ve seen that along characteristic lines, the value of $u$ stays constant. But we’ve also seen that along those lines, the value of its spatial derivative can blow up if the initial slope is negative. Perhaps the best thing to do is to illustrate it with two pictures. In the pictures the thick, red curve is the initial velocity distribution $u_0(x)$, shown with the black line representing the $x$-axis: so when the curve is above the axis, initially the local fluid velocity is positive, and the fluid is moving to the right. The blue curves are the characteristic lines.

In the first image to the right, we see that the initial velocity distribution is such that the velocity is increasing to the right, and so $w(0,x)$ is always positive. We see that in this situation the flow is divergent, the flow lines getting further and further apart, corresponding to the solution where $w(t,x)$ gets smaller and smaller along a flow line. For the second image here on our left, the situation is different. The initial velocity distribution starts out increasing, then hits a maximum, dips down to a minimum, and finally increases again. In the regions where the velocity distribution is increasing, we see the same “spreading out” behaviour as before, with the flow lines getting further and further apart (especially in the upper left region). But the flow lines originating in the region where the velocity distribution is decreasing get bunched together as time goes on, eventually intersecting! This intersection is what is known as a shock.

From the picture, it becomes clear what the blow-up of $W(t)$ means. Suppose the initial velocity distribution is such that for two points $x_1 < x_2$, we have $u_0(x_1) > u_0(x_2)$. Since the flow line originating from $x_1$ is moving faster, it will eventually catch up to the flow line originating from $x_2$. When the two flow lines intersect, we have a problem: if we follow the flow line from $x_1$, the function $u$ must take the value $u_0(x_1)$ at that point; but if we follow the flow line from $x_2$, the function must take the value $u_0(x_2)$ there. So we cannot consistently assign a value to the function $u$ at the points of intersection of flow-lines in a way that satisfies Burgers’ equation.

Another way of thinking about this difficulty is in terms of particle dynamics. Imagine the line being a highway, and points on it being cars. The dynamics of the traffic flow described by Burgers’ equation is one in which each driver starts at one speed (which can be in reverse), and maintains that speed completely without regard for the cars in front of or behind it. If we start out with a distribution where the leading cars always drive faster than the trailing ones, then the cars will spread further apart as time goes on. But if we start out with a distribution where a car in front is driving slower than a car behind, the second car will eventually catch up and crash into the one in front. And this is the formation of the shock wave.

(Now technically, in this view, once two cars crash their flow-lines should end, and so cars that are in front of the collision and moving forward should not be affected by the collision at all. But if we imagine that instead of real cars, we are driving bumper cars, so that after a collision, the car in front maintains speed at the velocity of the car that hit it, while the car in back drives at the velocity of the car it hit [so they swap speeds, as in an elastic collision], then we have something like the picture plotted above.)

Shock boundary
Having established that shocks can form, we move on to the main discussion of this post: the geometry of the set of shock singularities. We will consider the purely local effects of the shocks; by which we mean that we will ignore the chain reactions described in the parenthetical remark above. Therefore we will assume that at the formation of the shock, the flow-lines terminate and the particles they represent disappear. In other words, we will consider only shocks coming from nearest neighbor collisions. In this scenario, the time of existence of a characteristic line is precisely governed by the equation on $W$ we derived before: that is, given $u_0(x)$, the characteristic line emanating from $x = x_0$ will run into the shock precisely at the time $t = - \frac{1}{\partial_x u_0(x_0)}$. (It will continue indefinitely into the future if the derivative is positive.)
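In other words, the first shock appears at time $t^* = \min\{-1/\partial_x u_0(x) : \partial_x u_0(x) < 0\}$. A small Python sketch estimating $t^*$ on a grid (the sampling window and resolution are arbitrary choices of mine):

```python
import math

def first_shock_time(u0, lo=-10.0, hi=10.0, samples=100001):
    """Earliest blow-up time t* = min over x of -1/u0'(x), taken over points
    where u0'(x) < 0; u0' is estimated by centered differences on a grid."""
    h = (hi - lo) / (samples - 1)
    t_star = math.inf
    for i in range(1, samples - 1):
        x = lo + i * h
        slope = (u0(x + h) - u0(x - h)) / (2 * h)
        if slope < 0:
            t_star = min(t_star, -1.0 / slope)
    return t_star
```

For instance, with $u_0(x) = -x/(1+x^2)$ the most negative slope is $u_0'(0) = -1$, so the first shock forms at $t^* = 1$.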

The most well-known image of a shock formation is the image on the right, where we see the classic fan/wedge type shock. (Due to the simplicity of sketching this diagram by hand, this is probably how most people are introduced to this type of diagram, either on a homework set or in class.) What we see here is an illustration of the fact that

If for $x_1 < x < x_2$, we have $\partial^2_{xx} u_0(x) = 0$, and $\partial_x u_0(x) < 0$, then the shock boundary is degenerate: it consists of a single focal point.

To see this analytically: observe that because the blow-up time depends on the first derivative of the initial velocity distribution, for such a set-up the blow-up time $t_0 = - (\partial_x u_0)^{-1}$ is constant for the various points. Then we see that the spatial coordinate of the blow-up will be $x + u_0(x) t_0$. But since $u_0(x)$ is linear in $x$, we have

$\displaystyle x + u_0(x) t_0 = x_1 + (x-x_1) + u_0(x_1)t_0 + \partial_xu_0 \cdot (x - x_1) t_0 = x_1 + u_0(x_1) t_0$

is constant. And therefore the shock boundary is degenerate.

Next we consider the case where $\partial^2_{xx} u_0$ vanishes at some point $x_0$, but $\partial^3_{xxx}u_0(x_0) \neq 0$. The two pictures to the right of this paragraph illustrate the typical shock boundary behaviour. On the far right we have the slightly aphysical situation: notice that for a particle coming in from the left, before it hits its shock boundary, it first crosses the shock boundary formed by the particles coming in from the right. This is the situation where the third derivative is positive, and the cusp point which corresponds to the shock boundary for $x_0$ opens to the future. The nearer picture is the situation where the third derivative is negative, with the cusp point opening downwards. Notice that since we are in a neighborhood of a point where the second derivative vanishes, the initial velocity distributions both look almost straight, and it is hard to distinguish from this image the sign of the third derivative. The picture on the far right is based on an arctan type initial distribution, whereas the nearer picture is based on an $x^3$ type initial distribution.

Let us again analyse the situation more deeply. Near the point $x_0$, we shall assume that $\partial^3_{xxx}u_0 \sim \partial^3_{xxx}u_0(x_0) = C$ for some constant $C$. And we will assume, using Galilean transformations, that $u_0(x_0) = 0 = x_0$. Then letting $t_0 = - (\partial_x u_0(x_0))^{-1}$, we have

$\displaystyle u_0(x) = \frac{C}{6} x^3 - \frac{1}{t_0} x$

Thus as a function of $x$, the blow-up times of flow lines are given by

$\displaystyle t(x) = \frac{t_0}{1 - \frac{C}{2}t_0 x^2}$

Solving for their blow-up profile $y = x + u_0(x) t(x)$ then gives (after quite a bit of algebraic manipulation)

$\displaystyle \frac{ (\frac{t}{t_0} - 1)^3}{t} = \frac{9C}{8} y^2$

which can be easily seen to be a cusp: $\frac{dy}{dt} = 0$ at $y=0, t = t_0$. And it is clear that the side the cusp opens is dependent on the sign of the third derivative, $C$.
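One can check the cusp relation numerically: the characteristic from a nearby point $x$ blows up at time $t(x)$ and position $y(x) = x + u_0(x)\,t(x)$, and these pairs should satisfy $(t/t_0 - 1)^3/t = \tfrac{9C}{8} y^2$ identically. A Python sketch (the parameter values are arbitrary):

```python
def cusp_check(C=2.0, t0=1.0, x=0.3):
    """For u0(x) = (C/6) x^3 - x/t0, compute the blow-up time and position of
    the characteristic from x, and return both sides of the cusp relation
    (t/t0 - 1)^3 / t = (9C/8) y^2."""
    u0 = (C / 6) * x**3 - x / t0
    du0 = (C / 2) * x**2 - 1 / t0          # u0'(x); must be negative to blow up
    t = -1.0 / du0                          # blow-up time of this flow line
    y = x + u0 * t                          # blow-up position of this flow line
    lhs = (t / t0 - 1) ** 3 / t
    rhs = (9 * C / 8) * y**2
    return lhs, rhs
```

The two sides agree to rounding error for every admissible choice of $C$, $t_0$ and $x$, which is a reassuring consistency check on the algebra above.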

The last bit of computation we will do is for the case $D = \partial^2_{xx}u_0(x_0) \neq 0$. In this case we can take

$\displaystyle u_0(x) = - \frac{1}{t_0}x + \frac{D}{2} x^2$

as an approximation. Then the blowup times will be

$\displaystyle t(x) = \frac{t_0}{1 - D t_0 x}$

which leads to the blowup profile $y$ being [Thanks to Huy for the correction.]

$\displaystyle y = -\frac{1}{2Dt} \left( 1 - \frac{t}{t_0}\right)^2$

and a direct computation will then lead to the conclusion that in this generic scenario, the shock boundary will be everywhere tangent to the flow-line that ends there.
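That tangency claim can also be verified numerically: at the blow-up point of the characteristic from $x$, the slope $dy/dt$ of the shock boundary should equal $u_0(x)$, the slope of the flow line ending there. A Python sketch (parameter values arbitrary; the derivative of the boundary is taken by finite differences):

```python
def tangency_check(D=1.5, t0=1.0, x=0.2, h=1e-6):
    """For u0(x) = -x/t0 + (D/2) x^2, compare the slope of the shock boundary
    y(t) = -(1/(2 D t)) (1 - t/t0)^2, at the blow-up time of the flow line
    from x, with the slope u0(x) of that flow line."""
    u0 = lambda z: -z / t0 + (D / 2) * z**2
    t = t0 / (1 - D * t0 * x)              # blow-up time of this flow line
    y = lambda s: -(1 / (2 * D * s)) * (1 - s / t0) ** 2
    boundary_slope = (y(t + h) - y(t - h)) / (2 * h)
    return boundary_slope, u0(x)
```

The two slopes agree (up to the finite-difference error), as the direct computation predicts.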

### Minimal blow-up solution to an inhomogeneous NLS

Yesterday I went to a wonderful talk by Jeremie Szeftel on his recent joint work with Pierre Raphaël. The starting point is the following equation:

Eq 1. Homogeneous NLS
$i \partial_t u + \triangle u + u|u|^2 = 0$ on $[0,t) \times \mathbb{R}^2$

It is known as the mass-critical nonlinear Schrödinger equation. One of its very interesting properties is that it admits a soliton solution $Q$, which is the unique positive (real) radial solution to

Eq 2. Soliton equation
$\triangle Q - Q + Q^3 = 0$

Plugging $Q$ into the homogeneous NLS, we see that it evolves purely by phase-rotation: it represents a non-dispersing standing wave. Physically, this represents the case when the self-interaction attractive non-linearity exactly balances out the tendency for a wave to disperse.
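The phase-rotation claim can be checked in one line: writing the standing wave as $u(t,x) = e^{it} Q(x)$ and using Eq 2,

```latex
i\partial_t\big(e^{it}Q\big) + \triangle\big(e^{it}Q\big) + e^{it}Q\,\big|e^{it}Q\big|^2
  = e^{it}\big(-Q + \triangle Q + Q^3\big) = 0,
```

where we used that $Q$ is real and positive, so $|e^{it}Q|^2 = Q^2$. Thus the soliton equation is exactly the condition for $e^{it}Q$ to solve the homogeneous NLS.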

As one can easily see from its form, the homogeneous NLS has a large class of continuous symmetries:

• Time translation
• Spatial translation
• Phase rotation (multiplication by a unit complex number)
• Dilation
• Galilean boosts

(It is also symmetric under rotations, but as the spatial rotation group is compact, it cannot cause problems for the analysis [a lesson from concentration compactness; I’ll write about this another time], so we’ll just forget about it for the time being.) The NLS also admits the so-called pseudo-conformal transformation, a discrete $\mathbb{Z}_2$ action: the replacement

Eq 3. Pseudo-conformal inversion
$\displaystyle u(t,x) \longrightarrow \frac{1}{|t|}\bar{u}(\frac{1}{t},\frac{x}{t}) e^{i x^2 / (4t)}$

maps a solution to another solution. A particularly interesting phenomenon related to this additional symmetry is the existence of the minimal mass blow-up solution: by acting on $Q$ (the soliton) with the pseudo-conformal transform, we obtain a solution that blows up in finite time. But why do we call this a “minimal mass” solution? This is because previously it had been shown by Michael Weinstein (I think) that for any initial data to the NLS with initial mass ($L^2$ norm) smaller than that of $Q$, the solution must exist for all time, whereas for any value of mass strictly above that of $Q$, one can find a solution (in fact, multiple solutions) that blows up in finite time. With concentration compactness methods, Frank Merle was able to show that the pseudo-conformally inverted $Q$ is the only initial data that leads to finite-time blow-up with that fixed mass.

In some sense, however, the homogeneous NLS is too nice of an equation: because of its astounding number of symmetries, one can write down an explicit, self-similar blow-up solution just via the pseudo-conformal transform. A natural question to ask is whether the existence/uniqueness of a minimal mass blow-up solution can persist for more generic-looking equations. The toy model one is led to consider is

Eq 4. Inhomogeneous NLS
$i\partial_tu + \triangle u + k(x) u|u|^2 = 0$

for which the $k(x) = 1$ case reduces to the homogeneous equation. The addition of the arbitrary term kills all of the symmetries except phase translation and time translation. The former is a trivial set of symmetries (whose orbit is compact, so not posing any difficulty), while the latter is important since it generates the conservation of energy for this equation.

In the case where $k(x)$ is a differentiable, bounded function, some facts about this equation are known through the work of Merle. Without loss of generality, we will assume from now on that $k(x) \leq 1$ (we can always arrange for this by rescaling the equation). It was found that in this case, if the initial mass of the data is smaller than that of $Q$, again, we have global existence of a solution. Heuristically, the idea is that $k(x)$ measures the self-interaction strength of the particle, which can vary with its spatial position: the larger the value of $k$, the stronger the interaction. Now, in the homogeneous case the low-mass initial data does not have enough matter to lead to a strong enough self-interaction, so the dispersive behavior dominates and there cannot be concentration of energy and blow-up. Heuristically we expect that for interactions strictly weaker than the homogeneous case ($k \leq 1$), the dispersion should still dominate over the attractive self-force.

Furthermore, Merle also found that a minimal mass blow-up solution to the inhomogeneous NLS can only occur if $k(x) = 1$ (hits an interior maximum) at some finite point, and that $k(x)$ is bounded strictly away from 1 outside some large compact set. In this case, the blow-up can only occur in such a way that leads to a concentration of energy at the maximum point. Heuristically, again, this is natural: the strong self-interaction gives a lower potential energy. So it is natural to expect the particle to slide down into this potential well when it concentrates. If the potential asymptotes to the minimum at infinity, however, one may expect the wave to slide out to infinity and disperse, so it is important to have a strict maximum of the interaction strength in the interior.

Szeftel and Raphaël’s work shows that such a blow-up solution indeed exists, and is in fact unique.

Around the local maximum of $k(x)$, we can (heuristically) expand by Taylor polynomials. That $x_0$ is a local maximum implies that we have schematically

$k(x) = 1 + c_2\nabla^2k(x_0) (x-x_0)^2 + c_3\nabla^3k(x_0)(x-x_0)^3 + \ldots$

In the case where the Hessian term vanishes, by “zooming in” along a pre-supposed self-similar blow-up with rates identical to the one induced by the pseudo-conformal transform in the homogeneous case, we can convince ourselves that the more we zoom in, the flatter $k(x)$ looks. In this case, then it is not too unreasonable that the focusing behavior of the homogeneous case carries over: by zooming in sufficiently we rapidly approach a situation which is locally identical to the homogeneous case. If the energy is already concentrating, then the errors introduced at “large distances” will be small and controllable. This suggests that the problem admits a purely perturbative treatment. This, indeed, was the case, as Banica, Carles, and Duyckaerts have shown.

On the other hand, if the Hessian term does not vanish, one sees that it remains scale-invariant down to the smallest scales. In other words, no matter how far we zoom in, the pinnacle at $k(x_0)$ will always look curved. In this situation, a perturbative method is less suitable, and this is the regime in which Szeftel and Raphaël work.

The trick, it seems to me, is the following (I don’t think I completely understand all the intricacies of the proof; here I’ll just talk about the impression I got from the talk and from looking a bit at the paper): it turns out that by inverting the pseudo-conformal transform, we can reformulate the blow-up equation as a large-time equation in some rescaled variables, where now the potential $k$ depends on the scaling parameter, which also depends on time. The idea is to “solve backwards from infinity”. If we just naïvely plug in the stationary solution $Q$ at infinity, there will be error terms when we evolve back. What we want to do is capture the error terms. If we directly linearize the equation around $Q$, we will pick up negative eigenmodes, which lead to exponential blow-up and destroy our ansatz.

To overcome this difficulty, as is standard, the authors applied modulation theory. The idea behind modulation theory is that all the bad eigenmodes for the linearized equation of a good Hamiltonian system should be captured by the natural symmetries. In this case, we don’t have any natural symmetries to use. But we have “almost” symmetries coming from the homogeneous system. So we consider the manifold of functions spanned by symmetry transformations of $Q$, and decompose the solution into a projection part $u$, which lives on the manifold, and an orthogonal part $u'$. In this way, all the wild, uncontrolled directions of the flow are captured in some sort of motion on the symmetry manifold. We don’t actually particularly care how the flow happens, as the flow on the manifold preserves norms. The only bit we care about is how the flow converges as time approaches the blow-up time: this is what gives us the blow-up rate of the equation.

As it turns out, this decomposition is a very good one: the analysis showed that the flow on the manifold is a good approximation (to the fourth order) of the actual physical flow. This means that the orthogonal error $u'$ is going to be rather small and controllable. Of course, to establish these estimates is a lot of hard work; fundamentally, however, the idea is a beautiful one.