Bubbles Bad; Ripples Good

… Data aequatione quotcunque fluentes quantitates involvente fluxiones invenire et vice versa …

Category: Requires upper level university maths

Simulating closed cosmic strings (or yet another adventure in Julia)

One of my current research interests is in the geometry of constant-mean-curvature time-like submanifolds of Minkowski space. A special case of this is the study of cosmic strings under the Nambu-Goto action. Mathematically the classical (as opposed to quantum) behavior of such strings is quite well understood, by combining the works of theoretical physicists since the 80s (especially those of Jens Hoppe and collaborators) together with recent mathematical developments (such as the work of Nguyen and Tian and that of Jerrard, Novaga, and Orlandi). To get a better handle on what is going on with these results, and especially to provide some visual aids, I’ve tried to produce some simulations of the dynamics of closed cosmic strings. (This was also an opportunity for me to practice code-writing in Julia and learn a bit about the best practices for performance and optimization in that language.)

The code

After some false starts, here is some reasonably stable code.

function MC_corr3!(position::Array{Float64,2}, prev_vel::Array{Float64,2}, next_vel::Array{Float64,2}, result::Array{Float64,2}, dt::Float64)
  # We will assume that data is stored in the format point(coord,number), so as a 3x1000 array or something. 
  num_points = size(position,2)
  num_dims = size(position,1)

  curr_vel = zeros(num_dims)
  curr_vs = zeros(num_dims)
  curr_ps = zeros(num_dims)
  curr_pss = zeros(num_dims)

  pred_vel = zeros(num_dims)
  agreement = true

  for col = 1:num_points  #Outer loop is column
    if col == 1
      prev_col = num_points
      next_col = 2
    elseif col == num_points
      prev_col = num_points - 1
      next_col = 1
    else
      prev_col = col -1
      next_col = col + 1
    end

    for row = 1:num_dims
      curr_vel[row] = (next_vel[row,col] + prev_vel[row,col])/2
      curr_vs[row] = (next_vel[row,next_col] + prev_vel[row,next_col] - next_vel[row,prev_col] - prev_vel[row,prev_col])/4
      curr_ps[row] = (position[row,next_col] - position[row,prev_col])/2
      curr_pss[row] = position[row,next_col] + position[row,prev_col] - 2*position[row,col]
    end

    beta = (1 + dot(curr_vel,curr_vel))^(1/2)
    sigma = dot(curr_ps,curr_ps)
    psvs = dot(curr_ps,curr_vs)
    bvvs = dot(curr_vs,curr_vel) / (beta^2)
    pssps = dot(curr_pss,curr_ps)

    for row in 1:num_dims
      result[row,col] = curr_pss[row] / (sigma * beta) - curr_ps[row] * pssps / (sigma^2 * beta) - curr_vel[row] * psvs / (sigma * beta) - curr_ps[row] * bvvs / (sigma * beta)
      pred_vel[row] = prev_vel[row,col] + dt * result[row,col]
    end

    agreement = agreement && isapprox(next_vel[:,col], pred_vel, rtol=sqrt(eps(Float64)))
  end

  return agreement
end


function find_next_vel!(position::Array{Float64,2}, prev_vel::Array{Float64,2}, next_vel::Array{Float64,2}, dt::Float64; max_tries::Int64=50)
  tries = 1
  result = zeros(next_vel)
  agreement = MC_corr3!(position,prev_vel,next_vel,result,dt)
  for j in 1:size(next_vel,2), i in 1:size(next_vel,1)
    next_vel[i,j] = prev_vel[i,j] + result[i,j]*dt
  end
  while !agreement && tries < max_tries
    agreement = MC_corr3!(position,prev_vel,next_vel,result,dt)
    for j in 1:size(next_vel,2), i in 1:size(next_vel,1)
      next_vel[i,j] = prev_vel[i,j] + result[i,j]*dt
    end
    tries +=1
  end
  return tries, agreement
end

This first file does the heavy lifting of solving the evolution equation. The scheme is a semi-implicit finite difference scheme. The function MC_corr3! takes as input the current position, the previous velocity, and the next velocity, and computes the corresponding current acceleration. The function find_next_vel! iterates MC_corr3! until the computed acceleration agrees (up to numerical error) with the input previous and next velocities.

Or, in notation:
MC_corr3!: ( x[t], v[t-1], v[t+1] ) --> Delta-v[t]
and find_next_vel! iterates MC_corr3! until
v[t+1] == v[t-1] + dt * Delta-v[t]

The code in this file is also where performance matters the most, and I spent quite some time experimenting with different algorithms to find one with the most reasonable speed.
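Schematically, find_next_vel! is a fixed-point iteration for a semi-implicit midpoint rule. Below is a minimal scalar sketch of the same control flow, written in Python for illustration; the function accel is a hypothetical stand-in for the curvature computation in MC_corr3!, not the actual Nambu-Goto forcing.

```python
# Fixed-point iteration behind find_next_vel! (scalar Python sketch).
def accel(x, v_mid):
    # Hypothetical acceleration law; the real code computes the
    # mean-curvature forcing from spatial finite differences.
    return -x - 0.1 * v_mid

def find_next_vel(x, v_prev, dt, max_tries=50, tol=1e-12):
    v_next = v_prev  # initial guess for the next velocity
    for tries in range(1, max_tries + 1):
        # acceleration evaluated at the midpoint velocity (semi-implicit)
        a = accel(x, 0.5 * (v_prev + v_next))
        v_new = v_prev + dt * a
        if abs(v_new - v_next) <= tol * max(1.0, abs(v_new)):
            return v_new, tries
        v_next = v_new
    return v_next, max_tries
```

The real code does this component-wise over the whole discretized curve and caps the iteration count at max_tries.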

function make_ellipse(a::Float64,b::Float64, n::Int64, extra_dims::Int64=1)  # a,b are relative lengths of x and y axes
  s = linspace(0,2π * (n-1)/n, n)
  if extra_dims == 0
    return vcat(transpose(a*cos(s)), transpose(b*sin(s)))
  elseif extra_dims > 0
    return vcat(transpose(a*cos(s)), transpose(b*sin(s)), zeros(extra_dims,n))
  else
    error("extra_dims must be non-negative")
  end
end

function perturb_data!(data::Array{Float64,2}, coeff::Vector{Float64}, num_modes::Int64) 
  # num_modes is the number of modes
  # coeff are the relative sizes of the perturbations

  numpts = size(data,2)

  for j in 2:num_modes
    rcoeff = rand(length(coeff),2)

    for pt in 1:numpts
      theta = 2j * π * pt / numpts
      for d in 1:length(coeff)
        data[d,pt] += ( (rcoeff[d,1] - 0.5) *  cos(theta) + (rcoeff[d,2] - 0.5) * sin(theta)) * coeff[d] / j^2
      end
    end
  end

  nothing
end

This file just sets up the initial data. Note that in principle the number of ambient spatial dimensions is arbitrary.
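To make the structure of the perturbation explicit, here is the logic of perturb_data! transcribed into plain Python for a single curve stored coordinate-major (a sketch for illustration only; the Julia version above is what was actually run). Each Fourier mode j receives a random amplitude of size at most coeff[d]/j^2:

```python
import math, random

def perturb_curve(points, coeff, num_modes, rng=random.Random(0)):
    """Add random Fourier modes j = 2..num_modes with amplitudes
    bounded by coeff[d]/j^2 (Python rendition of perturb_data!)."""
    n = len(points[0])
    for j in range(2, num_modes + 1):
        # one random (cos, sin) amplitude pair per coordinate
        rc = [[rng.random(), rng.random()] for _ in coeff]
        for pt in range(n):
            theta = 2 * j * math.pi * pt / n
            for d, c in enumerate(coeff):
                points[d][pt] += ((rc[d][0] - 0.5) * math.cos(theta)
                                  + (rc[d][1] - 0.5) * math.sin(theta)) * c / j**2
```

Since the mode amplitudes decay like 1/j^2, the total perturbation of each coordinate is bounded by coeff[d] times the tail sum of 1/j^2, so the perturbed curve stays close to the original ellipse.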

using Plots

pyplot(size=(1920,1080), reuse=true)

function plot_data2D(filename_prefix::ASCIIString, filename_offset::Int64, titlestring::ASCIIString, data::Array{Float64,2}, additional_data...)
  x_max = 1.5
  y_max = 1.5
  plot(transpose(data)[:,1], transpose(data)[:,2] , xlims=(-x_max,x_max), ylims=(-y_max,y_max), title=titlestring)
  
  if length(additional_data) > 0
    for i in 1:length(additional_data)
      plot!(transpose(additional_data[i][1,:]), transpose(additional_data[i][2,:]))
    end
  end
  
  png(filename_prefix*dec(filename_offset,5)*".png")
  nothing
end

function plot_data3D(filename_prefix::ASCIIString, filename_offset::Int64, titlestring::ASCIIString, data::Array{Float64,2}, additional_data...)
  x_max = 1.5
  y_max = 1.5
  z_max = 0.9
  tdata = transpose(data)
  plot(tdata[:,1], tdata[:,2],tdata[:,3], xlims=(-x_max,x_max), ylims=(-y_max,y_max),zlims=(-z_max,z_max), title=titlestring)

  if length(additional_data) > 0
    for i in 1:length(additional_data)
      tdata = transpose(additional_data[i])
      plot!(tdata[:,1], tdata[:,2], tdata[:,3]) 
    end
  end

  png(filename_prefix*dec(filename_offset,5)*".png")
  nothing
end

This file provides some wrapper commands for generating the plots.

include("InitialData3.jl")
include("MeanCurvature3.jl")
include("GraphCode3.jl")

num_pts = 3000
default_timestep = 0.01 / num_pts
max_time = 3
plot_every_ts = 1500

my_data = make_ellipse(1.0,1.0,num_pts,0)
perturb_data!(my_data, [1.0,1.0], 15)
this_vel = zeros(my_data)
next_vel = zeros(my_data)

for t = 0:floor(Int64,max_time / default_timestep)
  num_tries, agreement = find_next_vel!(my_data, this_vel,next_vel,default_timestep)

  if !agreement
    warn("Time $(t*default_timestep): failed to converge when finding next_vel.")
    warn("Dumping information:")
    max_beta = 1.0
    max_col = 1
    for col in 1:size(my_data,2)
      beta = (1 + dot(next_vel[:,col], next_vel[:,col]))^(1/2)
      if beta > max_beta
        max_beta = beta
        max_col = col
      end
    end
    warn("   Beta attains maximum at position $max_col")
    warn("   Beta = $max_beta")
    warn("   Position = ", my_data[:,max_col])
    prevcol = max_col - 1
    nextcol = max_col + 1
    if max_col == 1
      prevcol = size(my_data,2)
    elseif max_col == size(my_data,2)
      nextcol = 1
    end
    warn("   Deltas")
    warn("    Left:  ", my_data[:,max_col] - my_data[:,prevcol])
    warn("    Right:  ", my_data[:,nextcol] - my_data[:,max_col])
    warn("   Previous velocity: ", this_vel[:,max_col])
    warn("   Putative next velocity: ", next_vel[:,max_col])
    warn("Quitting...")
    break
  end

  for col in 1:size(my_data,2)
    beta = (1 + dot(next_vel[:,col], next_vel[:,col]))^(1/2)
    for row in 1:size(my_data,1)
      my_data[row,col] += next_vel[row,col] * default_timestep / beta
      this_vel[row,col] = next_vel[row,col]
    end

    if beta > 1e7
      warn("time: ", t * default_timestep)
      warn("Almost null... beta = ", beta)
      warn("current position = ", my_data[:,col])
      warn("current Deltas")
      prevcol = col - 1
      nextcol = col + 1
      if col == 1
        prevcol = size(my_data,2)
      elseif col == size(my_data,2)
        nextcol = 1
      end
      warn(" Left: ", my_data[:,col] - my_data[:,prevcol])
      warn(" Right: ", my_data[:,nextcol] - my_data[:,col])
    end
  end

  if t % plot_every_ts ==0
    plot_data2D("3Dtest", div(t,plot_every_ts), @sprintf("elapsed: %0.4f",t*default_timestep), my_data, make_ellipse(cos(t*default_timestep), cos(t*default_timestep),100,0))
    info("Frame $(t/plot_every_ts):  used $num_tries tries.")
  end
end

And finally the main file. Mostly it just ties the other files together to produce the plots using the simulation code; there are some diagnostics included for me to keep an eye on the output.

The results

The first thing to do is to run a sanity check against explicit solutions. In rotational symmetry, the solution to the cosmic string equations can be found analytically. As you can see below, the simulation closely replicates the explicit solution in this case.

The video ends when the simulation stopped. The simulation stopped because a singularity had formed; in this video the singularity can be seen as the collapse of the string to a single point.

Next we can play around with a more complicated initial configuration.

In this video the blue curve is the closed cosmic string, which starts out as a random perturbation of the circle with zero initial speed. The string contracts with acceleration determined by the Nambu-Goto action. The simulation ends when a singularity has formed. It is perhaps a bit hard to see directly where the singularity happened. The diagnostic messages, however, help in this regard. From them we know that the onset of the singularity can be seen in the final frame:

3Dtest00268

The highlighted region is getting quite pointy. In fact, that is accompanied by the “corner” picking up infinite acceleration (in other words, experiencing an infinite force). The mathematical singularity corresponds to something unreasonable happening in the physics.

To make it easier to see the “speed” at which the curve is moving, the following videos show the string along with its “trail”. This first one again shows how a singularity can happen as the curve gets gradually more bent, eventually forming a corner.

This next one does a good job emphasizing the “wave” nature of the motion.

The closed cosmic strings behave like an elastic band. The string, overall, wants to contract to a point. Small undulations along the string, however, are propagated like traveling waves. Both of these tendencies can be seen quite clearly in the above video. That the numerical solver can solve “past” the singular point is a happy accident; while theoretically the solutions can in fact be analytically continued past the singular points, the renormalization process involved in this continuation is numerically unstable and we shouldn’t be able to see it on the computer most of the time.

The next video also emphasizes the wave nature of the motion. In addition to the traveling waves, pay attention to the bottom left of the video. Initially the string is almost straight there. This total lack of curvature is a stationary configuration for the string, and so initially there is absolutely no acceleration of that segment of the string. The curvature from the left and right of that segment slowly intrudes on the quiescent piece until the whole thing starts moving.

The last video for this post is a simulation when the ambient space is 3 dimensional. The motion of the string, as you can see, becomes somewhat more complicated. When the ambient space is 2 dimensional, a point either accelerates or decelerates based on the local (signed) curvature of the string. But when the ambient space is 3 dimensional, the curvature is now a vector, and this additional degree of freedom introduces complications into the behavior. For example, when the ambient space is 2 dimensional it is known that all closed cosmic strings become singular in finite time. But in 3 dimensions there are many closed cosmic strings that vibrate in place without ever becoming singular. The video below is one that does, however, become singular. In addition to a fading trail to help visualize the speed of the curve, this plot also includes the shadows: projections of the curve onto the three coordinate planes.

Purely kinetic initial data for Schwarzschild

I am at MSRI at the moment attending a program on mathematical relativity, and today heard something interesting from Mu-Tao Wang during his discussion of quasi-local mass.

To motivate: consider Einstein’s (vacuum) equations in general relativity. The initial data formulation requires specifying a manifold \Sigma, a Riemannian metric g, and a symmetric two tensor k. Remember that we are interested in solving for a space-time (M,\bar{g}) satisfying \mathrm{Ric}(\bar{g}) = 0; the induced metric g can be thought of as the initial data, and the tensor k (which will turn out to be the second fundamental form of the embedding of \Sigma into M) as the initial first time derivative of the data (Einstein’s equation is second order, and so we specify both the data and its first time derivative on the initial slice).

The nature of Einstein’s equations is that the initial data needs to satisfy a set of constraint equations, for it to be compatible with any solution. In any dimension we have the Hamiltonian constraint (assuming vacuum, so no stress-energy tensor)

\displaystyle R(g) - |k|^2_g + (\mathrm{tr}_g k)^2 = 0

and the momentum constraint

\displaystyle \mathrm{div}_g k - \mathrm{d}~\mathrm{tr}_g k = 0

A particular class of initial data is those called time-symmetric, which are those with k \equiv 0. The physical interpretation is clear in view of the above formal considerations: k corresponds to the instantaneous “velocity” of the space-time as a whole, and its vanishing represents that the spacetime is instantaneously “not moving”. To draw an analogy with classical mechanics, this is the situation where velocities are zero and all the energy of the system is contained in the potentials.

One may ask the question whether we can write down initial data for Einstein’s equation that represents, again using the classical mechanics analogy, a system that is instantaneously completely relaxed, so that all its energy is placed in the kinetic part. A reasonable interpretation for “completely relaxed” would be that the initial data has zero spatial curvature, like the standard slice in Minkowski space; all the energy will live inside the “kinetic term” k.

As it turns out: we can in fact do this. To exhibit a solution, let us work in spherical symmetry to reduce the constraint equations to an ordinary differential equation. Since the spatial metric is flat, we can choose spherical coordinates so that

\displaystyle g = \mathrm{d}r^2 + r^2 \mathrm{d}\omega^2

and by spherical symmetry (assuming spatial dimension n \geq 3) we have that the second fundamental form must decompose as

\displaystyle k = A(r) \mathrm{d} r^2 + r^2 B(r) \mathrm{d}\omega^2

which implies that

\displaystyle |k|^2_g = A^2 + (n-1) B^2 \qquad \mathrm{tr}_g k = A + (n-1) B

so that the Hamiltonian constraint yields

\displaystyle B \equiv 0 \qquad \text{or} \qquad 2A + (n-2)B = 0
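Explicitly, since g is flat we have R(g) = 0, and the Hamiltonian constraint factors:

\displaystyle (\mathrm{tr}_g k)^2 - |k|^2_g = \big(A + (n-1)B\big)^2 - A^2 - (n-1)B^2 = (n-1)B\big(2A + (n-2)B\big) = 0,

which gives exactly the two branches above.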

(note that if A = B = 0 then we reduce down to Minkowski initial data). Next we consider the momentum constraint. A direct computation (using Christoffel symbols or otherwise) of the divergence term yields

\displaystyle (\mathrm{div}_g k)(\partial_r) = \partial_r A + \frac{n-1}{r} (A - B)

Now, in either of the cases we can reduce the momentum constraint to a one variable ODE:

B \equiv 0 \implies A \equiv 0 \implies \text{Minkowski}

and more interestingly

\displaystyle 2A + (n-2)B = 0 \implies \partial_r A + \frac{n-1}{r} (1+ \frac{2}{n-2}) A = \partial_r A - (n-1)\frac{2}{n-2} \partial_r A

which we can simplify to get

\displaystyle \partial_r A + \frac{n}{2r} A = 0

Using integrating factors we solve to get

\displaystyle A = c r^{-n/2} \qquad B = - \frac{2c}{n-2} r^{-n/2}
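A quick numerical sanity check of the integrating-factor solution (a Python snippet with a centered finite difference; the constants c and n are arbitrary choices, not from the post):

```python
def residual(r, c=1.7, n=3, h=1e-6):
    """Residual of A'(r) + (n/(2r)) A(r) for A(r) = c * r^(-n/2)."""
    A = lambda s: c * s ** (-n / 2)
    dA = (A(r + h) - A(r - h)) / (2 * h)  # centered difference
    return dA + n / (2 * r) * A(r)

print(residual(2.0))  # ~ 0 up to finite-difference error
```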

Now where does this slice sit in Schwarzschild? If we solve, in Boyer-Lindquist coordinates, for t = f(r) such that the induced metric is flat, we easily come to the conclusion that, in the case n = 3 (I may add the computation for general n later), we have f'(r) = \pm \sqrt{2m/r}(1 - 2m/r)^{-1}. This slice goes from space-like infinity inward, and cuts through the (future or past, depending on sign) event horizon as r \to 2m.
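For the record, the n = 3 computation is short: restricting the Schwarzschild metric to the graph t = f(r) gives the induced metric

\displaystyle \left( \left(1-\frac{2m}{r}\right)^{-1} - \left(1-\frac{2m}{r}\right) f'(r)^2 \right)\mathrm{d}r^2 + r^2 \,\mathrm{d}\omega^2,

and requiring this to equal \mathrm{d}r^2 + r^2 \mathrm{d}\omega^2 forces f'(r)^2 = (2m/r)(1-2m/r)^{-2}, which is the formula above.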

Interestingly, we note that the decay of the second fundamental form k is in fact critical, at least in the context of the positive mass theorem (the momentum integral formally diverges!), so at least on that level this doesn’t give a contradiction to the PMT.

For more discussions one can see this paper.

Decay of Waves IV: Numerical Interlude

I offer two videos. In both videos the same colour scheme is used: we have four waves in red, green, blue, and magenta. The four represent the amplitudes of spherically symmetric free waves on four different types of spatial geometries: 1 dimensional flat space, 2 dimensional flat space, 3 dimensional flat space, and a 3 dimensional asymptotically flat manifold with “trapping” (it has closed geodesics). Can you tell which is which? (Answer below the fold.)

Read the rest of this entry »

Bubbles!

Joachim Krieger and I posted a new pre-print on the critical nonlinear wave equation. After close to four years of the existence of this blog I finally have a paper out that actually relates to the title of this blog! Considering that the paper itself is only ten pages long, I will just direct readers to the arXiv instead of writing more about it here.

“The asymptotically hyperboloidal is not asymptotically null.”

By way of Roland Donninger, I learned today of the statement above which is apparently well-known in the numerical relativity community.

It may seem intuitively surprising: after all, the archetype of an asymptotically hyperboloidal surface is the hyperboloid as embedded in Minkowski space. Let (t,r, \omega)\in \mathbb{R}\times\mathbb{R}_+ \times \mathbb{S}^{d-1} be the spherical coordinate system for the Minkowski space \mathbb{R}^{1,d}; the hyperboloid embeds in it as the surface t^2 - r^2 = 1. If you draw a picture we see clearly that the surface is asymptotic to the null cone t = |r|.

The key, however, lies in the definition. For better or for worse, the definition under which the titular statement makes sense is the following:

Definition
Let (M,g) be an asymptotically simple space-time (or one for which one can define a Penrose compactification), and let (\bar{M},\Omega^2 g) be the compactified space-time. We say that a hypersurface \Sigma \subset M is asymptotically null if \bar{\Sigma} intersects \partial\bar{M} transversely and the tangent space of \bar{\Sigma} is null along \partial\bar{M}.

Now suppose near \partial\bar{M} we can foliate via a double-null foliation (u,v), with \partial\bar{M} = \{ u = 0\}. Let x be a coordinate on \partial\bar{M} so that (u,v,x) form a coordinate system for a neighborhood of \partial\bar{M}. Assume that our surface \Sigma can be written as a graph

v = \phi(u,x)

where \phi is a C^3 function. Then the asymptotically null condition is just that \partial_u \phi |_{u = 0} = 0. Taking a Taylor expansion we have that this means

v \approx \phi_{\infty}(x) + \phi^{(2)}_{\infty}(x) u^2.

For the usual conformal compactification of Minkowski space, we have u = \frac{\pi}{2} - \cot^{-1}\left( \frac{1}{r+t}\right). Hence we require an asymptotically null surface to converge to the null surface at rate O(1/(r+t)^2) (if \phi is sufficiently differentiable; if we relax the differentiability at infinity we see that the above condition allows us to relax all the way to O(1/(r+t)^{1+}), but O(1/(r+t)) is not admissible).

On the other hand, the hyperboloid is given by (r+t)(r-t) = -1 \implies r-t = v = O(1/(r+t)) and so is not asymptotically null. And indeed, we can also check by direct computation that in the usual conformal compactification of Minkowski space, the limit of the hyperboloid at null infinity is space-like.

Whitney extension and counterexample to Sard’s theorem

In a first course in differential geometry/topology we are often taught the following version of Sard’s theorem:

Theorem (Sard, smooth case)
Let M and N be (finite dimensional) smooth manifolds, and let f:M\to N be a smooth (infinitely differentiable) map. Let S\subset N be the set of critical values, that is, for every y \in S there exists x \in f^{-1}(y) such that f'(x) is not surjective. Then S has measure zero.

It turns out that Sard in his 1942 paper proved something stronger.

Theorem (Sard)
Let f:\mathbb{R}^m \supset M\to N\subset \mathbb{R}^n be a mapping of class C^k (all partial derivatives up to order k exist and are continuous) for k \geq 1. Let S\subset N be defined as above. If either

  1. m \leq n; or
  2. m > n and k \geq m - n + 1,

then S has measure 0.

Over at MathOverflow, Sergei Ivanov has shown that in the case m = n = 1, the statement can be modified so that C^1 is not necessary.

In this post, we will describe counterexamples in the case m > n if the condition on k is not satisfied. We will also sketch a natural generalisation of Ivanov’s argument for higher dimensions. Read the rest of this entry »

Non-existence of multiple-black-hole solutions that look locally like subextremal Kerr-Newman

Pin Yu and I just posted a paper on arXiv based on some joint work we started doing when we were both graduate students (and parts of which appeared in his dissertation). I think I did a decent job explaining the motivation in a mathematical way in the introduction to the paper, so I’ll motivate the problem here a little differently.

Ever since the discovery of the black hole solutions, there has been interest in whether static or stationary black hole solutions with more than one hole can exist in equilibrium. One of the earliest considerations was by Bach and Weyl in 1922, fairly soon after the discovery of the Schwarzschild solutions. In general it was concluded that a static configuration is impossible due to singularities forming between two holes. Roughly speaking, a black hole in vacuum spacetime is a strongly gravitating object, and gravity attracts. If two black holes were to form and were to be kept apart, an external force would be needed to hold on to them. This manifests in a singularity between the two black holes.

On the other hand, if we were to add electrical charge to the black holes, then two black holes with charges of the same sign stand a chance of being in equilibrium: the electromagnetic interaction between two like charges is repulsive, while gravity is attractive, so maybe they can cancel out! And indeed such a configuration is possible. What are now called Majumdar-Papapetrou solutions are precisely these types of balanced multiple black hole solutions. But the balance has to be precise! The electric charge has to be large enough to equal the gravitational mass. And this in turn requires that the black holes be what are called “extremal black holes”.

For the stationary, as opposed to static, case, the situation is less clear. In the static case, the black holes must remain fixed in place. In the stationary case, we can allow the black holes to orbit each other. As we know well from our own solar system, orbiting systems can happen where static systems are disallowed. This is because of the “fictitious” centrifugal force which can balance out the gravitational attraction (or, more correctly stated in accordance to American high school physics curricula, the gravitational force provides the centripetal force for the bound orbit).

In our paper we show that orbiting systems (as opposed to inspiraling ones where the multiple black holes eventually crash into one another) cannot exist, provided that outside the black holes things look more or less like the situation with only one black hole. A posteriori, knowing that these systems cannot exist, we are justified in our inability to provide examples. But even a priori it is difficult to imagine a system which our assumptions do not immediately rule out.

  • If we have two black holes with large mass placed close to each other, we’d expect there to be lots of gravitational interaction, and the local geometry will be distorted heavily away from the single-black-hole Kerr-Newman solution.
  • If we have two black holes with significant mass placed far from each other, we’d naively expect that the gravitational field seen from infinity is that of a single body with the combined masses of the two black holes, but near one individual black hole we only see the gravitational effects from that one black hole. So intuitively a situation like this cannot have an exterior that looks everywhere just like that with one single black hole.

The situation that intuitively we cannot rule out is the case where we start with one gigantic black hole and one tiny black hole. The effect of the tiny black hole on the gravitational field is but a mere blip compared to that of the big black hole. So we expect that the space-time metric looks like that of when there is just the one big black hole. This is the situation to which our theorem applies.

Our theorem also says nothing about the extremal case. We only consider the case where the charge is insufficient to balance out the mass. As we know from the Majumdar-Papapetrou solutions, we do need to rule out that class of solutions. Where this restriction enters the proof, however, is in a cute way. First, let me describe the general strategy of the proof. Our method is roughly inspired by Morse theory. We construct a real valued function on the space-time such that its level surfaces foliate the space-time with homeomorphic leaves. We then show that the boundaries of the black holes (the event horizons) necessarily are almost level surfaces for this real valued function. This gives rise to a contradiction: on the one hand, near infinity each level surface looks like a sphere and is connected; on the other hand, the boundaries of the black holes together form a level surface that has two components. This contradicts the homeomorphism between level surfaces. In the actual argument, however, we do not prove homeomorphism. Instead, we show that the real valued function has no critical points. The contradiction then is provided by a mountain pass lemma. It is here that we need to use subextremality.

We can only show that the real valued function has no critical points where its values are at least as large as the values on the event horizons (there is a sign issue). So to actually get a contradiction by the mountain pass lemma, we need that as we go out from the event horizon, the value of the real-valued function increases. For subextremal black holes, the event horizon is non-degenerate, and the near horizon geometry forces this to be true. For extremal black holes, the event horizon is degenerate, and the near horizon geometry allows the value of the function to remain the same or decrease.

One and two (space-time) dimensional general relativity

This is a bit of a silly post, partly written because I came across this paper when doing some literature search.

To give a spoiler: the punchline is

Life is very boring in one and two space-time dimensions.

Readers familiar with Einstein’s equations (with cosmological constant) R_{ab} - \frac12 R g_{ab} = T_{ab} - \Lambda g_{ab} may want to pause and think for a few seconds; then they may want to skip reading this post entirely.

Read the rest of this entry »

Gauge invariance, geometrically

A somewhat convoluted chain of events led me to think about the geometric description of partial differential equations. And a question I asked myself this morning was

Question
What is the meaning of gauge invariance in the jet-bundle treatment of partial differential equations?

The answer, actually, is quite simple.

Review of geometric formulation PDE
We consider here abstract PDEs formulated geometrically. All objects considered will be smooth. For more about the formal framework presented here, a good reference is H. Goldschmidt, “Integrability criteria for systems of nonlinear partial differential equations”, JDG (1967) 1:269–307.

A quick review: the background manifold X is assumed (here we take a slightly more restrictive point of view) to be a connected smooth manifold. The configuration space \mathcal{C} is defined to be a fibred manifold p:\mathcal{C}\to X. By J^r\mathcal{C} we refer to the fibred manifold of r-jets of \mathcal{C}, whose projection is p^r = p \circ \pi^r_0, where for r > s we use \pi^r_s: J^r\mathcal{C}\to J^s\mathcal{C} for the canonical projection.

A field is a (smooth) section \phi \in \Gamma \mathcal{C}. A simple example that captures most of the usual cases: if we are studying mappings between manifolds \phi: X\to N, then we take \mathcal{C} = N\times X the trivial fibre bundle. The s-jet operator naturally sends j^s: \Gamma\mathcal{C} \ni \phi \mapsto j^s\phi \in \Gamma J^s\mathcal{C}.

A partial differential equation of order r is defined to be a fibred submanifold J^r\mathcal{C} \supset R^r \to X. A field is said to solve the PDE if j^r\phi \subset R^r.

In the usual case of systems of PDEs on Euclidean space, X is taken to be \mathbb{R}^d and \mathcal{C} = \mathbb{R}^n\times X the trivial vector bundle. A system of m PDEs of order r is usually taken to be F(x,\phi, \partial\phi, \partial^2\phi, \ldots, \partial^r\phi) = 0 where

\displaystyle F: X\times \mathbb{R}^n \times \mathbb{R}^{dn} \times \mathbb{R}^{\frac{1}{2}d(d+1)n} \times \cdots \times \mathbb{R}^{{d+r-1 \choose r} n} \to \mathbb{R}^m

is some function. We note that the domain of F can be identified in this case with J^r\mathcal{C}. We can then extend F to \tilde{F}: J^r\mathcal{C} \ni c \mapsto (F(c),p^r(c)) \in \mathbb{R}^m\times X, a fibre bundle morphism.

If we assume that \tilde{F} has constant rank, then \tilde{F}^{-1}(0) is a fibred submanifold of J^r\mathcal{C}, and this is our differential equation.
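The fibre dimensions appearing in the domain of F above are just counts of symmetrized partial derivatives: n components times the number of distinct derivatives of each order. A quick combinatorial check (Python; the function names are mine):

```python
from math import comb

def order_k_dim(d, n, k):
    # n components, each with C(d+k-1, k) distinct partials of order k
    return n * comb(d + k - 1, k)

def jet_dim(d, n, r):
    # total fibre dimension of J^r C over X: orders 0 through r
    return sum(order_k_dim(d, n, k) for k in range(r + 1))

# e.g. d = 3, n = 1: 3 first derivatives, 6 = d(d+1)/2 second derivatives
```

The hockey-stick identity collapses the sum: jet_dim(d, n, r) equals n * C(d + r, r), matching the familiar count of Taylor coefficients up to order r.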

Gauge invariance
In this framework, the gauge invariance of a partial differential equation relative to certain symmetry groups can be captured by requiring that R^r be an invariant submanifold.

More precisely, we take

Definition
A symmetry/gauge group \mathcal{G} is a subgroup of \mathrm{Diff}(\mathcal{C}), with the property that for any g\in\mathcal{G}, there exists a g'\in \mathrm{Diff}(X) with p\circ g = g' \circ p.

It is important that we are looking at the diffeomorphism group of \mathcal{C}, not of J^r\mathcal{C}. In general diffeomorphisms of J^r\mathcal{C} will not preserve holonomy for sections of the form j^r\phi, a condition that is essential for solving PDEs. The condition that the symmetry operation “commutes with projections” is to ensure that g acts on sections, g:\Gamma\mathcal{C}\to\Gamma\mathcal{C}, which in particular guarantees that g extends to a diffeomorphism of J^r\mathcal{C} with itself that commutes with projections.

From this point of view, a (system of) partial differential equation(s) R^r is said to be \mathcal{G}-invariant if for every g\in\mathcal{G}, we have g(R^r) \subset R^r.

We give two examples showing that this description agrees with the classical notions.

Gauge theory. In classical gauge theories, the configuration space \mathcal{C} is a fibre bundle with structure group G which acts on the fibres. A section of G\times X \to X induces a diffeomorphism of \mathcal{C} by fibre-wise action. In fact, the gauge symmetry is a fibre bundle morphism (it fixes the base points).

General relativity. In general relativity, the configuration space is the space of Lorentzian metrics: the background manifold is the space-time X, and the configuration space is the open submanifold of S^2T^*X consisting of non-degenerate symmetric bilinear forms with signature (-+++). A diffeomorphism \Psi:X\to X induces T^*\Psi = (\Psi^{-1})^*: T^*X \to T^*X and hence a configuration space diffeomorphism that commutes with the projection. It is in this sense that Einstein’s equations are diffeomorphism invariant.
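For a concrete instance of the invariance (a standard fact, sketched here in my own notation): the Ricci curvature is natural under diffeomorphisms, so for the vacuum equations

```latex
% Naturality of the Ricci tensor under a diffeomorphism \Psi: X \to X:
\mathrm{Ric}(\Psi^* g) = \Psi^*\,\mathrm{Ric}(g),
% so if g solves the vacuum Einstein equations \mathrm{Ric}(g) = 0,
% then so does the pulled-back metric \Psi^* g.
```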

Notice, of course, that this formulation does not capture the “physical” distinction between global and local gauge transformations. For example, for a linear PDE (so \mathcal{C} is a vector bundle and R^r is closed under linear operations), the trivial “global scaling” of a solution counts in this framework as a gauge symmetry, though it is generally ignored in physics.

Extensions of (co)vector fields to tangent bundles

I am reading Sasaki’s original paper on the construction of the Sasaki metric (a canonical Riemannian metric on the tangent bundle of a Riemannian manifold), and the following took me way too long to understand. So I’ll write it down in case I forget in the future.

In section two of the paper, Sasaki considers “extended transformations and extended tensors”. Basically he wanted to give a way to “lift” tensor fields from a manifold to tensor fields of the same rank on its tangent bundle. And he did so in the language of coordinate changes, whose geometrical content is a bit hard to parse. I’ll discuss his construction in a bit. But first I’ll talk about something different.

The trivial lifts
Let M, N be smooth manifolds, and let f:M\to N be a submersion. Then we can trivially lift covariant objects on N to equivalent objects on M by the pull-back operation. To define the pull-back, we start with a covariant tensor field \vartheta \in \Gamma T^0_kN, and define f^*\vartheta \in \Gamma T^0_kM by the formula:

\displaystyle f^*\vartheta(X_1,\ldots,X_k) = \vartheta(df\circ X_1, \ldots, df\circ X_k)

where X_1, \ldots, X_k \in T_pM, and we use that df(p): T_pM \to T_{f(p)}N. Observe that for a function g: N \to \mathbb{R}, the pull-back is simply f^*g = g\circ f : M\to N\to\mathbb{R}.
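A minimal worked example (my own, not from the paper): take N = \mathbb{R} with coordinate t, M = \mathbb{R}^2, and the submersion f(x,y) = x^2 + y. Then:

```latex
% Pulling back \vartheta = dt along the submersion f(x,y) = x^2 + y:
% df(\partial_x) = 2x\,\partial_t and df(\partial_y) = \partial_t, so
f^* dt = 2x\,dx + dy.
% Check against the defining formula:
% (f^* dt)(\partial_x) = dt\big(df\circ\partial_x\big) = dt(2x\,\partial_t) = 2x.
```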

On the other hand, for contravariant tensor fields the pull-back is not uniquely defined. Using that f is a submersion, we have that TM / \ker(df) = TN; so while, given a vector field v on N, we can always find a vector field w on M such that df(w) = v, the vector field w is only unique up to the addition of a vector field lying in the kernel of df. If, however, M is Riemannian, then we can take the orthogonal decomposition of TM into the kernel and its complement, thereby getting a well-defined lift of the vector field (in other words, by exploiting the identification between the tangent and cotangent spaces).

Remarkably, the extension defined by Sasaki is not this one.

(Let me just add a remark here: given two manifolds, once one obtains a well-defined way of lifting vectors, covectors, and functions from one to the other, such that they are compatible (\vartheta^*(v^*) = [\vartheta(v)]^*), one can extend this mapping to arbitrary tensor fields.)

The extensions defined by Sasaki
As seen above, if we just rely on the canonical submersion \pi:TM\to M, we cannot generally extend vector fields. Sasaki’s construction, however, strongly exploits the fact that TM is the tangent bundle of M.

We start by looking at the vector field extension defined by equation (2.6) of the linked paper. First observe that a vector field v on a manifold M is a section of the tangent bundle: that is, v is a map M\to TM such that the composition with the canonical projection, \pi\circ v:M\to M, is the identity map. By the chain rule, the map d(\pi\circ v)= d\pi \circ dv: TM\to TM is then also the identity map. Now, d\pi: T(TM) \to TM is the projection induced by the projection map \pi, which is different from the canonical projection \pi_2: T(TM) \to TM from the tangent bundle of a manifold (here TM) to the manifold itself. However, a proposition of Kobayashi (see “Theory of Connections” (1957), Proposition 1.4) shows that there exists an automorphism \alpha:T(TM) \to T(TM) such that d\pi \circ \alpha = \pi_2 and \pi_2\circ\alpha = d\pi. So v, as a differentiable mapping, induces a map \alpha\circ dv: TM \to T(TM) from the tangent bundle TM to the double tangent bundle T(TM), which when composed with the canonical projection \pi_2 is the identity. In other words, \alpha\circ dv is a vector field on TM.
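In local coordinates (x^i, \dot{x}^i) on TM, this extension works out to what is nowadays called the complete lift; the following coordinate sketch (mine, not Sasaki’s notation) may help:

```latex
% Coordinate sketch of the extension \alpha \circ dv (the "complete lift"):
% coordinates (x^i, \dot{x}^i) on TM, and v = v^i(x)\,\partial_{x^i} on M.
\alpha \circ dv
  = v^i(x)\,\frac{\partial}{\partial x^i}
  + \dot{x}^j\,\frac{\partial v^i}{\partial x^j}(x)\,
    \frac{\partial}{\partial \dot{x}^i}.
% The \partial_{x^i} part is the obvious push-forward of v; the
% \partial_{\dot{x}^i} part records how v varies in the direction \dot{x}.
```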

Next we look at the definition (2.7) for one-forms. Given a one-form \vartheta on M, it naturally induces a scalar function on TM: for p\in M, v\in T_pM, the function TM\to \mathbb{R} takes the value \vartheta(p)\cdot v at (p,v). The differential of this function is then a one-form over TM.
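In coordinates the construction reads as follows (a sketch; the name \iota_\vartheta is my own, not the paper’s):

```latex
% Coordinate sketch of the one-form extension:
% \vartheta = \vartheta_i(x)\,dx^i induces the scalar function
\iota_\vartheta(x, \dot{x}) = \vartheta_i(x)\,\dot{x}^i
% on TM, whose differential is the one-form
d\iota_\vartheta
  = \dot{x}^i\,\frac{\partial \vartheta_i}{\partial x^j}\,dx^j
  + \vartheta_i\,d\dot{x}^i.
```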

Now, what about scalar functions? Let \vartheta be a one-form and v a vector field on M, and consider the pairing of their extensions to TM. It is not too hard to check that the scalar field corresponding to \vartheta(v), when evaluated at (p,w)\in TM, is in fact d(\vartheta(v))|_{p}[w], the derivative of the scalar function \vartheta(v) in the direction of w at the point p. In general, the compatible lift of a scalar field g:M\to \mathbb{R} to TM is the function \tilde{g}(p,v) = dg(p)[v].
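One can verify the compatibility claim directly in coordinates (a sketch of mine, using the coordinate forms of the two extensions written out as comments below):

```latex
% Checking the pairing in coordinates: the extensions are
%   v        \mapsto v^i\,\partial_{x^i}
%                  + \dot{x}^j (\partial_j v^i)\,\partial_{\dot{x}^i},
%   \vartheta \mapsto d(\vartheta_i \dot{x}^i).
% Pairing them gives
\dot{x}^i \frac{\partial \vartheta_i}{\partial x^j}\, v^j
  + \vartheta_i\, \dot{x}^j \frac{\partial v^i}{\partial x^j}
  = \dot{x}^j \frac{\partial}{\partial x^j}\big(\vartheta_i v^i\big)
  = d\big(\vartheta(v)\big)\big[\dot{x}\big],
% which is exactly the compatible lift \tilde{g}(p,\dot{x}) = dg(p)[\dot{x}]
% of the scalar g = \vartheta(v).
```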

Using this we can extend the construction to arbitrary tensor fields, and a simple computation yields that this construction is in fact identical, for rank-2 tensors, to the expressions given in (2.8), (2.9), and (2.10) in the paper.

The second extension
The above extension is not the only map sending vectors on M to vectors on TM. In the statement of Lemma 3 there is also another construction. Given a vector field v, it induces a one-parameter family of diffeomorphisms of TM via the maps \psi_t(p,w) = (p, w+tv). The derivative \frac{d}{dt}\psi_t|_{t=0} is then a vector field over TM.
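In the same coordinates as above, this second extension is purely vertical (sketch mine):

```latex
% Coordinate sketch of the second extension (the "vertical lift"):
% \psi_t(x, \dot{x}) = (x, \dot{x} + t\,v(x)), so differentiating at t = 0:
\frac{d}{dt}\Big|_{t=0} \psi_t
  = v^i(x)\,\frac{\partial}{\partial \dot{x}^i},
% a vector field on TM pointing purely along the fibres of \pi: TM \to M.
```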

The construction in the statement of Lemma 4 is the trivial one mentioned at the start of this post.