**Geodesics and variation**

One of the classical formulations of the criterion for a curve to be a geodesic is that it is a stationary point of the *length functional*. Let $latex (M,g)$ be a Riemannian manifold, and let $latex \gamma:[0,1]\to M$ be a mapping. Define the length functional to be

$latex \displaystyle L[\gamma] = \int_0^1 \sqrt{g(\dot\gamma,\dot\gamma)} ~\mathrm{d}t. $

A geodesic then is a curve $latex \gamma$ that is a critical point of $latex L$ under perturbations that fix the endpoints $latex \gamma(0)$ and $latex \gamma(1)$.

One minor annoyance about the length functional is that it is invariant under reparametrization of $latex \gamma$, and so it does not admit unique critical points. One way to work around this is to instead consider the *energy functional* (which also has the advantage of being easily generalizable to pseudo-Riemannian manifolds)

$latex \displaystyle E[\gamma] = \frac12 \int_0^1 g(\dot\gamma,\dot\gamma) ~\mathrm{d}t. $

It turns out that critical points of the energy functional are always critical points of the length functional. Furthermore, the energy functional has some added convexity: a curve is a critical point of the energy functional if and only if it is a geodesic and has constant speed (in the sense that $latex g(\dot\gamma,\dot\gamma)$ is independent of the parameter $latex t$).
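The difference between the two functionals is easy to see numerically. The following is a minimal Python sketch, assuming the Euclidean plane as the manifold and the normalization $latex E = \frac12\int g(\dot\gamma,\dot\gamma)\,\mathrm{d}t$; the function name `length_and_energy` and the two sample curves are made up for this illustration. It traces the same straight segment with constant speed and with speed $latex 2t$.

```python
import math

def length_and_energy(curve, steps=20000):
    """Approximate length and energy of a planar curve [0,1] -> R^2.

    Assumes the Euclidean metric and E = (1/2) * integral of |gamma'|^2.
    The speed on each grid cell is approximated by (chord length) / dt.
    """
    dt = 1.0 / steps
    length = 0.0
    energy = 0.0
    prev = curve(0.0)
    for k in range(1, steps + 1):
        cur = curve(k * dt)
        chord = math.hypot(cur[0] - prev[0], cur[1] - prev[1])
        length += chord
        energy += 0.5 * (chord / dt) ** 2 * dt
        prev = cur
    return length, energy

def straight(t):
    """Unit segment traversed with constant speed 1."""
    return (t, 0.0)

def reparametrized(t):
    """The same segment, traversed with speed 2t."""
    return (t * t, 0.0)

L1, E1 = length_and_energy(straight)
L2, E2 = length_and_energy(reparametrized)
print("constant speed: L = %.6f, E = %.6f" % (L1, E1))
print("speed 2t:       L = %.6f, E = %.6f" % (L2, E2))
```

Both parametrizations have the same length, but the constant speed one has the smaller energy. This is the Cauchy-Schwarz inequality $latex L^2 \leq 2E$, with equality exactly for constant speed.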

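The standard way to analyze the variation of the energy functional $latex E$ is by first fixing a coordinate system $latex \{x^1,\ldots,x^n\}$, in which $latex E[\gamma] = \frac12\int_0^1 g_{ij}(\gamma)\dot\gamma^i\dot\gamma^j ~\mathrm{d}t$. Writing the infinitesimal perturbation as $latex \delta\gamma$, we can compute the first variation of $latex E$:

$latex \displaystyle \delta E[\gamma] = \int_0^1 \frac12 \partial_k g_{ij}(\gamma)\, \dot\gamma^i \dot\gamma^j\, \delta\gamma^k + g_{ij}(\gamma)\, \dot\gamma^i\, \frac{\mathrm{d}}{\mathrm{d}t}\delta\gamma^j ~\mathrm{d}t. $

Integrating the second term by parts we recover the familiar geodesic equation in local coordinates.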

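There is a second way to analyze the variation. Using the diffeomorphism invariance, we can imagine that instead of varying $latex \gamma$ while fixing the manifold, we deform the manifold while fixing the curve $latex \gamma$. From the point of view of the energy functional the two should be indistinguishable. Consider the variation $latex \delta\gamma$, which can be regarded as a vector field along $latex \gamma$ which vanishes at the two end points. Let $latex V$ be a vector field on $latex M$ that extends $latex \delta\gamma$. Then the infinitesimal variation of moving the curve in the direction $latex \delta\gamma$ should be reproducible by flowing the manifold by $latex V$ and *pulling back the metric*. To be more precise, let $latex \phi_s$ be the one parameter family of diffeomorphisms generated by the vector field $latex V$; the first variation can be analogously represented as

$latex \displaystyle \left.\frac{\mathrm{d}}{\mathrm{d}s}\right|_{s=0} \frac12 \int_0^1 (\phi_s^*g)(\dot\gamma,\dot\gamma) ~\mathrm{d}t. $

By the definition of the Lie derivative we get the following characterizing condition for a geodesic:

Theorem

A curve $latex \gamma$ is an affinely parametrized geodesic if and only if for every vector field $latex V$ vanishing near $latex \gamma(0)$ and $latex \gamma(1)$, the integral

$latex \displaystyle \int_0^1 (\mathcal{L}_V g)(\dot\gamma,\dot\gamma) ~\mathrm{d}t = 0. $

Noticing that $latex (\mathcal{L}_V g)(X,Y) = g(\nabla_X V, Y) + g(X, \nabla_Y V)$, where $latex \nabla$ is the Levi-Civita connection, we have that the above integral condition is equivalent to requiring

$latex \displaystyle \int_0^1 g(\nabla_{\dot\gamma} V, \dot\gamma) ~\mathrm{d}t = 0. $

Using the boundary conditions and integrating by parts we see this also gives us, without passing through the local coordinate formulation, the geodesic equation

$latex \displaystyle \nabla_{\dot\gamma}\dot\gamma = 0. $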
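For readers who want the final integration by parts spelled out, here is a short sketch. It writes $latex V$ for a vector field vanishing near the two end points and $latex \nabla$ for the Levi-Civita connection, and uses only that $latex \nabla$ is compatible with the metric $latex g$.

```latex
\begin{align*}
0 = \Bigl[ g(V,\dot\gamma) \Bigr]_{t=0}^{t=1}
  &= \int_0^1 \frac{\mathrm{d}}{\mathrm{d}t}\, g(V,\dot\gamma) \,\mathrm{d}t \\
  &= \int_0^1 g(\nabla_{\dot\gamma} V, \dot\gamma) \,\mathrm{d}t
   + \int_0^1 g(V, \nabla_{\dot\gamma}\dot\gamma) \,\mathrm{d}t .
\end{align*}
```

So the integral of $latex g(\nabla_{\dot\gamma}V,\dot\gamma)$ vanishes for every admissible $latex V$ exactly when the integral of $latex g(V,\nabla_{\dot\gamma}\dot\gamma)$ does, and since $latex V$ is arbitrary away from the end points this forces $latex \nabla_{\dot\gamma}\dot\gamma = 0$ along the curve.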

**The Einstein-Infeld-Hoffmann theorem**

The EIH theorem reads:

Theorem (EIH)

A curve is geodesic if and only if there exists a non-vanishing contravariant symmetric two tensor along such that for every vector field vanishing near and , the integral

(where is the induced length measure on ).

The EIH theorem follows immediately from the discussion in the previous section and the following lemma.

Lemma

A contravariant symmetric two tensor that satisfies the assumptions in the previous theorem must be proportional to .

*Proof:* Choose an orthonormal frame along for such that is tangent to . Write . Suppose . Then there exists a vector field such that and the symmetric part of is equal to . (We can construct by choosing a local coordinate system in a tubular neighborhood of such that . Then can be prescribed by its first order Taylor expansion in the normal direction to .) Let be a non-negative cut-off function and setting we note that since vanishes along . Therefore we have that the desired integral condition cannot hold. q.e.d.

References:

- Shlomo Sternberg, *Curvature in Mathematics and Physics*
- Einstein, Infeld, Hoffmann, “Gravitational Equations and the Problem of Motion”

Filed under: differential/pseudo-Riemannian geometry, Disseminating mathematics, Maths, Requires upper level university maths

It bugged me to no end, however, that the recurring example this particular individual returns to for something old-fashioned that “ought not be taught” is *integration by parts*; and he justifies this by mentioning that computer algebra systems (or even just Google) can do the integrals faster and better than we humans can.

I don’t generally mind others cracking jokes at mathematicians’ expense. But this particular self-serving *strawman* uttered by so well-regarded an individual is, to those of us actually in the field teaching calculus to freshmen and sophomores, very damaging and disingenuous.

I happen to have just spent the entirety of last year rethinking how we can best teach calculus to modern engineering majors. Believe me, students nowadays know perfectly well when we are just asking them to do busywork; they also know perfectly well that computer algebra systems are generally better at finding closed-form integral expressions than we are. Part of the challenge of the redesign that I am involved in is precisely to convince the students that calculus is worth learning *in spite of* computers. The difficulty is not a dearth of reasons; on the contrary, there are many good reasons why a solid grounding in calculus is important to a modern engineering student. To give a few examples:

- Taylor series are in fact important *because* of computers, since they provide a method of compactly encoding an entire function.
- Newton’s method for root finding (and its application to, say, numerical optimization) is built on a solid understanding of differential calculus.
- The entirety of the finite element method of numerical simulation, which underlies a lot of civil and mechanical engineering applications, is based on a variational formulation of differential equations that, guess what, only makes sense when one understands integration by parts.
- The notion of the Fourier transform, which is behind a lot of signal/image processing, requires understanding how trigonometric functions behave under integration.
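To make the finite element remark concrete, here is the standard one-dimensional model problem, chosen here purely for illustration: $latex -u'' = f$ on $latex [0,1]$ with $latex u(0) = u(1) = 0$. Multiplying by a test function $latex v$ that also vanishes at the end points and integrating by parts once gives the weak (variational) form.

```latex
\begin{align*}
\int_0^1 f\, v \,\mathrm{d}x = -\int_0^1 u''\, v \,\mathrm{d}x
  &= -\Bigl[ u'\, v \Bigr]_{0}^{1} + \int_0^1 u'\, v' \,\mathrm{d}x \\
  &= \int_0^1 u'\, v' \,\mathrm{d}x .
\end{align*}
```

The last expression only involves first derivatives of $latex u$ and $latex v$, which is why piecewise linear “hat” functions can be used as building blocks; without the integration by parts the second derivative of such a function would not even make sense classically.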

No, the difficulty for me and my collaborator is narrowing down a list of examples that we can not only reasonably explain to undergraduate students, but also give them some hands-on experience working with.

When my collaborator and I were first plunged into this adventure of designing engineering-specific calculus material, one of the very first things that we did was to seek out input from our engineering colleagues. My original impulse was to cut some curricular content in order to give the students a chance to develop a deeper understanding of fewer topics. To that end I selected a number of topics which *I thought* were old-fashioned, out-dated, and no longer used in this day and age. How wrong I was! Even something like “integration by partial fractions”, which most practicing mathematicians will defer to a computer to do, has its advocates (those who have to teach control theory insist that a lot of fundamental examples in their field can be reduced to evaluating integrals of rational functions, and a good grasp of how such integrals behave is key to developing a general sense of how control theory works).

In short, contrary to what some individuals would have you believe, math education is not obsolete because we all have calculators. In fact, I would argue the opposite: math education is especially pertinent now that we all have calculators. Long gone is the age when a superficial understanding of mathematics in terms of its rote computations was a valuable skill. A successful scientist or engineer needs to be able to effectively leverage the large toolbox that is available to her, and this requires a much deeper understanding of mathematics, one that goes beyond just the *how* to also the *what* and the *why*.

There is indeed much that can be done to better math education for the modern student. But one thing that shouldn’t be done is getting rid of integration by parts.

Filed under: Doceamus, Life of a mathematician, Maths

but with all the terms, whose decimal expansion includes the digit ‘9’, removed, in fact converges to some number below 80. The original proof is given in the Wikipedia article linked above, so I will not repeat it. But to make it easier to see the idea: let us first think about the case where the number is expressed in base 2. In base 2, every positive integer has leading binary bit 1 (since it cannot be zero). Therefore there are *no* positive binary numbers without the bit ‘1’ in their expansion. So the corresponding series converges trivially to zero. How about the case of the bit ‘0’? The only binary numbers without any ‘0’ bits are

$latex 1_2, 11_2, 111_2, \ldots$, that is, the numbers $latex 2^n - 1$.

So the corresponding series actually becomes

$latex \displaystyle \sum_{n = 1}^\infty \frac{1}{2^n - 1}. $

So somewhere from the heavily divergent harmonic series, we pick up a rapidly converging, essentially geometric, series. So what’s at work here? Among all the n-bit binary numbers, exactly 1 has all bits not being 0. So the *density* of these kinds of numbers decays rather quickly: in base 2, there are $latex 2^{n-1}$ numbers that are exactly n bits long. So if a number $latex x$ has a binary representation that is exactly n bits long (which means that $latex 2^{n-1} \leq x < 2^n$), the chance that it is one of the special type of numbers is $latex 2^{1-n}$. This probability we can then treat as a *density*: replacing the discrete sum by the integral (calculus students may recognize this as the germ of the “integral test”) and expressing the probability as a density in $latex x$, we get the estimate

.

Doing the same thing with the original Kempner series gives the chance that an n-digit number does not contain the digit nine as

$latex \displaystyle \frac{8}{9} \left(\frac{9}{10}\right)^{n-1}. $

The length of the decimal expansion of a natural number is basically . So the density we are interested in becomes

From this we can do an integral estimate

The integral can be computed using that

to get

Notice that this estimate is much closer to the currently known value of roughly 22.92 than to the original upper bound of 80 computed by Kempner.

Kempner’s estimate is a heavy overestimate because he performed a summation replacing every n-digit number that does not contain the digit 9 by $latex 10^{n-1}$; this number can be many (up to 9) times smaller than the original number. Our estimate is low because among the n-digit numbers, the numbers that do not contain the digit 9 are not evenly distributed: they tend to crowd in the front rather than in the back (in fact, we do not allow them to crowd in the back because none of the numbers that start with the digit 9 is admissible). So if in the original question we had asked for numbers that do not contain the digit 1, then our computation would give an overestimate instead, since these numbers tend to crowd to the back.
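The heuristic is easy to play with numerically. Below is a minimal Python sketch with made-up function names. The count of n-digit numbers avoiding the digit 9 is exact, but the integral estimate rests on one explicit assumption: the number of digits of $latex x$ is replaced by $latex 1 + \log_{10} x$, so that the density becomes $latex \frac{8}{9} x^{\log_{10} 9 - 1}$. A different rounding convention for the digit count shifts the answer somewhat.

```python
import math

def all_ones_series(terms=60):
    """Sum of 1/(2^n - 1): reciprocals of the binary numbers with no '0' bit."""
    return sum(1.0 / (2 ** n - 1) for n in range(1, terms + 1))

def no_nine_fraction(n):
    """Fraction of n-digit numbers whose decimal expansion avoids the digit 9."""
    return (8.0 / 9.0) * (9.0 / 10.0) ** (n - 1)

def no_nine_fraction_brute_force(n):
    """The same fraction, by direct counting (only sensible for small n)."""
    lo, hi = 10 ** (n - 1), 10 ** n
    good = sum(1 for x in range(lo, hi) if "9" not in str(x))
    return good / (hi - lo)

def kempner_integral_estimate():
    """Integral over [1, infinity) of (1/x) * (8/9) * x**(log10(9) - 1).

    ASSUMPTION: the digit count n is approximated by 1 + log10(x).
    The integrand is (8/9) * x**a with a = log10(9) - 2 < -1, and the
    integral of x**a over [1, infinity) equals -1 / (a + 1).
    """
    a = math.log10(9) - 2
    return (8.0 / 9.0) * (-1.0 / (a + 1.0))

print("binary all-ones series:", all_ones_series())
print("3-digit fraction, formula vs count:",
      no_nine_fraction(3), no_nine_fraction_brute_force(3))
print("integral estimate for the Kempner series:", kempner_integral_estimate())
```

Under this assumption the integral estimate lands between 19 and 20, below the known value of roughly 22.92, in line with the discussion of why the heuristic underestimates.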

Filed under: Disseminating mathematics, Maths, Require high school maths

**The problem**

Let’s start with a concrete example: I keep a “lab notebook”. It is where I document all my miscellaneous thoughts and computations that come up during my research. Some of those are immediately useful and are collected into papers for publication. Some of those are not, and I prefer to keep them for future reference. These computations range over many different subjects. Now and then, I want to share with a collaborator or a student some subset of these notes. So I want a way to quickly search (by keywords/abstract) for relevant notes, and then compile them into one large LaTeX document.

Another concrete example: I am starting to collect a bunch of examples and exercises in analysis for use in my various classes. Again, I want to have them organized for easy search and retrieval, especially to make into exercise sheets.

**The JabRef solution**

The “correct” way to do this is probably with a database (or a document store), with each document tagged with a list of keywords. But that requires a bit more programming than I want to worry about at the moment.

JabRef, as it turns out, is sort of a metadata database: by defining a customized entry type you can use the BibTeX syntax as a proxy for JSON-style data. So for my lab notebook example, I define a custom type `lnbentry` in JabRef with

- Required fields: `year, month, day, title, file`
- Optional fields: `keywords, abstract`

I store each lab notebook entry as an individual TeX file, whose file system address is stored in the `file` field. The remaining metadata fields’ contents are self-evident.

(Technical note: in my case I actually store the metadata in the TeX file and have a script to parse the TeX files and update the bib database accordingly.)

For generating output, we can use JabRef’s convenient export filter support. In the simplest case we can create a custom export layout with the main layout file containing the single line

    \\input{\file}

with appropriate `begin` and `end` incantations to make the output a fully-formed TeX file. Then one can simply select the entries to be exported, click on “Export”, and generate the appropriate TeX file on the fly.

(Technical note: JabRef can also be run without a GUI. So one can use this to perform searches through the database on the command line.)
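For those who prefer a script to the export dialog, the same search-and-compile step can be imitated outside JabRef. The following is a minimal Python sketch and not JabRef’s own mechanism: the sample database, the function names `parse_entries` and `compile_notes`, and the assumption that each field sits on its own line with a plain path in `file` are all made up for illustration (JabRef may store the `file` field in a more elaborate format).

```python
import re

# Hypothetical sample database with two `lnbentry` records.
SAMPLE_BIB = """
@lnbentry{note1,
  title = {Energy estimates},
  keywords = {wave equation, energy},
  file = {notes/2016-05-01.tex},
}
@lnbentry{note2,
  title = {A counterexample},
  keywords = {analysis, exercises},
  file = {notes/2016-06-12.tex},
}
"""

# One entry: "@lnbentry{key," up to a closing brace on its own line.
ENTRY_RE = re.compile(r"@lnbentry\s*\{\s*([^,\s]+)\s*,(.*?)\n\}", re.S)
# One field per line: "name = {value},"
FIELD_RE = re.compile(r"^[ \t]*(\w+)[ \t]*=[ \t]*\{(.*)\}[ \t]*,?[ \t]*$", re.M)

def parse_entries(bibtext):
    """Return a list of dicts, one per lnbentry, mapping field name to value."""
    entries = []
    for key, body in ENTRY_RE.findall(bibtext):
        fields = {name.lower(): value.strip()
                  for name, value in FIELD_RE.findall(body)}
        fields["key"] = key
        entries.append(fields)
    return entries

def compile_notes(entries, keyword):
    """Build a TeX document that inputs every note tagged with `keyword`."""
    lines = [r"\documentclass{article}", r"\begin{document}"]
    for entry in entries:
        tags = [t.strip().lower() for t in entry.get("keywords", "").split(",")]
        if keyword.lower() in tags:
            lines.append(r"\input{%s}" % entry["file"])
    lines.append(r"\end{document}")
    return "\n".join(lines)

print(compile_notes(parse_entries(SAMPLE_BIB), "energy"))
```

The regular expressions are deliberately naive; a real BibTeX parser would be needed for nested braces or multi-line field values.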

Filed under: Disseminating mathematics, Life of a mathematician

**The code**

After some false starts, here is some reasonably stable code.

    function MC_corr3!(position::Array{Float64,2}, prev_vel::Array{Float64,2}, next_vel::Array{Float64,2}, result::Array{Float64,2}, dt::Float64)
        # We will assume that data is stored in the format point(coord,number), so as a 3x1000 array or something.
        num_points = size(position,2)
        num_dims = size(position,1)

        curr_vel = zeros(num_dims)
        curr_vs = zeros(num_dims)
        curr_ps = zeros(num_dims)
        curr_pss = zeros(num_dims)
        pred_vel = zeros(num_dims)
        agreement = true

        for col = 1:num_points  #Outer loop is column
            if col == 1
                prev_col = num_points
                next_col = 2
            elseif col == num_points
                prev_col = num_points - 1
                next_col = 1
            else
                prev_col = col -1
                next_col = col + 1
            end

            for row = 1:num_dims
                curr_vel[row] = (next_vel[row,col] + prev_vel[row,col])/2
                curr_vs[row] = (next_vel[row,next_col] + prev_vel[row,next_col] - next_vel[row,prev_col] - prev_vel[row,prev_col])/4
                curr_ps[row] = (position[row,next_col] - position[row,prev_col])/2
                curr_pss[row] = position[row,next_col] + position[row,prev_col] - 2*position[row,col]
            end

            beta = (1 + dot(curr_vel,curr_vel))^(1/2)
            sigma = dot(curr_ps,curr_ps)
            psvs = dot(curr_ps,curr_vs)
            bvvs = dot(curr_vs,curr_vel) / (beta^2)
            pssps = dot(curr_pss,curr_ps)

            for row in 1:num_dims
                result[row,col] = curr_pss[row] / (sigma * beta) - curr_ps[row] * pssps / (sigma^2 * beta) - curr_vel[row] * psvs / (sigma * beta) - curr_ps[row] * bvvs / (sigma * beta)
                pred_vel[row] = prev_vel[row,col] + dt * result[row,col]
            end

            agreement = agreement && isapprox(next_vel[:,col], pred_vel, rtol=sqrt(eps(Float64)))
        end

        return agreement
    end

    function find_next_vel!(position::Array{Float64,2}, prev_vel::Array{Float64,2}, next_vel::Array{Float64,2}, dt::Float64; max_tries::Int64=50)
        tries = 1
        result = zeros(next_vel)
        agreement = MC_corr3!(position,prev_vel,next_vel,result,dt)
        for j in 1:size(next_vel,2), i in 1:size(next_vel,1)
            next_vel[i,j] = prev_vel[i,j] + result[i,j]*dt
        end
        while !agreement && tries < max_tries
            agreement = MC_corr3!(position,prev_vel,next_vel,result,dt)
            for j in 1:size(next_vel,2), i in 1:size(next_vel,1)
                next_vel[i,j] = prev_vel[i,j] + result[i,j]*dt
            end
            tries +=1
        end
        return tries, agreement
    end

This first file does the heavy lifting of solving the evolution equation. The scheme is a semi-implicit finite difference scheme. The function `MC_corr3!` takes as input the current position, the previous velocity, and the next velocity, and computes the correct current acceleration. The function `find_next_vel!` iterates `MC_corr3!` until the computed acceleration agrees (up to numerical errors) with the input previous and next velocities.

Or, in notation:

`MC_corr3!: ( x[t], v[t-1], v[t+1] ) --> Delta-v[t]`

and `find_next_vel!` iterates `MC_corr3!` until

`Delta-v[t] == (v[t+1] - v[t-1]) / 2`

The code in this file is also where the performance matters the most, and I spent quite some time experimenting with different algorithms to find one with the most reasonable speed.
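The structure of the iteration may be easier to see in a toy setting. The following is a minimal Python sketch, not a translation of the string solver: the scalar damped oscillator `accel` is a made-up stand-in for the actual right-hand side, and the tolerances are arbitrary. As in the Julia code, the acceleration is evaluated at the average of the previous and next velocities, and the guess for the next velocity is updated until it agrees with its own prediction.

```python
import math

def accel(x, v):
    """Toy acceleration (a damped oscillator); a stand-in for the real equation."""
    return -x - v

def find_next_vel(x, prev_vel, dt, max_tries=50, rtol=1e-12):
    """Fixed-point iteration for the implicit velocity update.

    Solves next_vel = prev_vel + dt * accel(x, (prev_vel + next_vel) / 2)
    by repeatedly replacing the guess with the predicted value.
    Returns (next_vel, tries, agreement).
    """
    next_vel = prev_vel  # initial guess: no acceleration
    for tries in range(1, max_tries + 1):
        curr_vel = 0.5 * (prev_vel + next_vel)
        pred_vel = prev_vel + dt * accel(x, curr_vel)
        agreement = math.isclose(next_vel, pred_vel, rel_tol=rtol, abs_tol=1e-15)
        next_vel = pred_vel
        if agreement:
            return next_vel, tries, True
    return next_vel, max_tries, False

vel, tries, ok = find_next_vel(x=1.0, prev_vel=0.0, dt=0.01)
print("next velocity:", vel, "tries:", tries, "converged:", ok)
```

For this linear toy problem each pass shrinks the error by a factor of `dt / 2`, so a small time step makes the iteration converge in a handful of tries; the same reasoning suggests why the iteration in the actual solver struggles when the acceleration becomes very large.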

    function make_ellipse(a::Float64,b::Float64, n::Int64, extra_dims::Int64=1)
        # a,b are relative lengths of x and y axes
        s = linspace(0,2π * (n-1)/n, n)
        if extra_dims == 0
            return vcat(transpose(a*cos(s)), transpose(b*sin(s)))
        elseif extra_dims > 0
            return vcat(transpose(a*cos(s)), transpose(b*sin(s)), zeros(extra_dims,n))
        else
            error("extra_dims must be non-negative")
        end
    end

    function perturb_data!(data::Array{Float64,2}, coeff::Vector{Float64}, num_modes::Int64)
        # num_modes is the number of modes
        # coeff are the relative sizes of the perturbations
        numpts = size(data,2)
        for j in 2:num_modes
            rcoeff = rand(length(coeff),2)
            for pt in 1:numpts
                theta = 2j * π * pt / numpts
                for d in 1:length(coeff)
                    data[d,pt] += ( (rcoeff[d,1] - 0.5) * cos(theta) + (rcoeff[d,2] - 0.5) * sin(theta)) * coeff[d] / j^2
                end
            end
        end
        nothing
    end

This file just sets up the initial data. Note that in principle the number of ambient spatial dimensions is arbitrary.

    using Plots
    pyplot(size=(1920,1080), reuse=true)

    function plot_data2D(filename_prefix::ASCIIString, filename_offset::Int64, titlestring::ASCIIString, data::Array{Float64,2}, additional_data...)
        x_max = 1.5
        y_max = 1.5
        plot(transpose(data)[:,1], transpose(data)[:,2] , xlims=(-x_max,x_max), ylims=(-y_max,y_max), title=titlestring)
        if length(additional_data) > 0
            for i in 1:length(additional_data)
                plot!(transpose(additional_data[i][1,:]), transpose(additional_data[i][2,:]))
            end
        end
        png(filename_prefix*dec(filename_offset,5)*".png")
        nothing
    end

    function plot_data3D(filename_prefix::ASCIIString, filename_offset::Int64, titlestring::ASCIIString, data::Array{Float64,2}, additional_data...)
        x_max = 1.5
        y_max = 1.5
        z_max = 0.9
        tdata = transpose(data)
        plot(tdata[:,1], tdata[:,2],tdata[:,3], xlims=(-x_max,x_max), ylims=(-y_max,y_max),zlims=(-z_max,z_max), title=titlestring)
        if length(additional_data) > 0
            for i in 1:length(additional_data)
                tdata = transpose(additional_data[i])
                plot!(tdata[:,1], tdata[:,2], tdata[:,3])
            end
        end
        png(filename_prefix*dec(filename_offset,5)*".png")
        nothing
    end

This file provides some wrapper commands for generating the plots.

    include("InitialData3.jl")
    include("MeanCurvature3.jl")
    include("GraphCode3.jl")

    num_pts = 3000
    default_timestep = 0.01 / num_pts
    max_time = 3
    plot_every_ts = 1500

    my_data = make_ellipse(1.0,1.0,num_pts,0)
    perturb_data!(my_data, [1.0,1.0], 15)
    this_vel = zeros(my_data)
    next_vel = zeros(my_data)

    for t = 0:floor(Int64,max_time / default_timestep)
        num_tries, agreement = find_next_vel!(my_data, this_vel,next_vel,default_timestep)

        if !agreement
            warn("Time $(t*default_timestep): failed to converge when finding next_vel.")
            warn("Dumping information:")
            max_beta = 1.0
            max_col = 1
            for col in 1:size(my_data,2)
                beta = (1 + dot(next_vel[:,col], next_vel[:,col]))^(1/2)
                if beta > max_beta
                    max_beta = beta
                    max_col = col
                end
            end
            warn(" Beta attains maximum at position $max_col")
            warn(" Beta = $max_beta")
            warn(" Position = ", my_data[:,max_col])
            prevcol = max_col - 1
            nextcol = max_col + 1
            if max_col == 1
                prevcol = size(my_data,2)
            elseif max_col == size(my_data,2)
                nextcol = 1
            end
            warn(" Deltas")
            warn(" Left: ", my_data[:,max_col] - my_data[:,prevcol])
            warn(" Right: ", my_data[:,nextcol] - my_data[:,max_col])
            warn(" Previous velocity: ", this_vel[:,max_col])
            warn(" Putative next velocity: ", next_vel[:,max_col])
            warn("Quitting...")
            break
        end

        for col in 1:size(my_data,2)
            beta = (1 + dot(next_vel[:,col], next_vel[:,col]))^(1/2)
            for row in 1:size(my_data,1)
                my_data[row,col] += next_vel[row,col] * default_timestep / beta
                this_vel[row,col] = next_vel[row,col]
            end
            if beta > 1e7
                warn("time: ", t * default_timestep)
                warn("Almost null... beta = ", beta)
                warn("current position = ", my_data[:,col])
                warn("current Deltas")
                prevcol = col - 1
                nextcol = col + 1
                if col == 1
                    prevcol = size(my_data,2)
                elseif col == size(my_data,2)
                    nextcol = 1
                end
                warn(" Left: ", my_data[:,col] - my_data[:,prevcol])
                warn(" Right: ", my_data[:,nextcol] - my_data[:,col])
            end
        end

        if t % plot_every_ts ==0
            plot_data2D("3Dtest", div(t,plot_every_ts), @sprintf("elapsed: %0.4f",t*default_timestep), my_data, make_ellipse(cos(t*default_timestep), cos(t*default_timestep),100,0))
            info("Frame $(t/plot_every_ts): used $num_tries tries.")
        end
    end

And finally the main file. Mostly it just ties the other files together to produce the plots using the simulation code; there are some diagnostics included for me to keep an eye on the output.

**The results**

First thing to do is to run a sanity check against explicit solutions. In rotational symmetry, the solution to the cosmic string equations can be found analytically. As you can see below the simulation closely replicates the explicit solution in this case.

The video ends when the simulation stops. The simulation stops because a *singularity* has formed; in this video the singularity can be seen as the collapse of the string to a single point.

Next we can play around with a more complicated initial configuration.

In this video the blue curve is the closed cosmic string, which starts out as a random perturbation of the circle with zero initial speed. The string contracts with acceleration determined by the Nambu-Goto action. The simulation ends when a singularity has formed. It is perhaps a bit hard to see directly where the singularity happened. The diagnostic messages, however, help in this regard. From them we know that the onset of singularity can be seen in the final frame:

The highlighted region is getting quite pointy. In fact, that is accompanied by the “corner” picking up infinite acceleration (in other words, experiencing an infinite force). The mathematical singularity corresponds to something unreasonable happening in the physics.

To make it easier to see the “speed” at which the curve is moving, the following videos show the string along with its “trail”. This first one again shows how a singularity can happen as the curve gets gradually more bent, eventually forming a corner.

This next one does a good job emphasizing the “wave” nature of the motion.

The closed cosmic strings behave like an elastic band. The string, overall, wants to contract to a point. Small undulations along the string, however, are propagated like traveling waves. Both of these tendencies can be seen quite clearly in the above video. That the numerical solver can solve “past” the singular point is a happy accident; while theoretically the solutions can in fact be analytically continued past the singular points, the renormalization process involved in this continuation is numerically unstable and we shouldn’t be able to see it on the computer most of the time.

The next video also emphasizes the wave nature of the motion. In addition to the traveling waves, pay attention to the bottom left of the video. Initially the string is almost straight there. This total lack of curvature is a stationary configuration for the string, and so initially there is absolutely no acceleration of that segment of the string. The curvature from the left and right of that segment slowly intrudes on the quiescent piece until the whole thing starts moving.

The last video for this post is a simulation where the ambient space is 3 dimensional. The motion of the string, as you can see, becomes somewhat more complicated. When the ambient space is 2 dimensional a point either accelerates or decelerates based on the local (signed) curvature of the string. But when the ambient space is 3 dimensional, the curvature is now a vector and this additional degree of freedom introduces complications into the behavior. For example, when the ambient space is 2 dimensional it is known that all closed cosmic strings become singular in finite time. But in 3 dimensions there are many closed cosmic strings that vibrate in place without ever becoming singular. The video below is one that does, however, become singular. In addition to a fading trail to help visualize the speed of the curve, this plot also includes the shadows: projections of the curve onto the three coordinate planes.

Filed under: differential/pseudo-Riemannian geometry, Disseminating mathematics, general relativity, Life of a mathematician, Maths, partial differential equations, Requires upper level university maths, wave and Schroedinger equations

A few random things …

**Juno**

Julia has a decent IDE in JunoLab, which is built on top of Atom. In terms of functionality it captures most of the things I used to use Spyder for with Python, so it is very convenient.

**Jupyter**

Julia interfaces with Jupyter notebooks through the IJulia kernel. I am a fan of Jupyter (I will be using it with the MATLAB kernel for a class I am teaching this fall).

**Plots.jl**

For plotting, right now one of the most convenient ways is through Plots.jl, which is a plotting front-end that bridges between your code and various different backends that can be swapped in and out almost on the fly. The actual plotting is powered by things like matplotlib or plotlyJS, but for the most part you can ignore the backend. This drastically simplifies the production of visualizations. (At least compared to what I remember from my previous simulations in Python.)

**Automatic Differentiation**

I just learned very recently about automatic differentiation. At a cost in running time for my scripts, it can very much simplify the coding of the scripts. For example, we can have a black-box root finder using Newton iteration that does not require pre-computing the Jacobian by hand:

    module NewtonIteration

    using ForwardDiff

    export RootFind

    function RootFind(f, init_guess::Vector, accuracy::Float64, cache::ForwardDiffCache; method="Newton", max_iters=100, chunk_size=0)
        ### Takes input function f(x::Vector) → y::Vector of the same dimension and an initial guess init_guess. Apply Newton iteration to find solution of f(x) = 0. Stop when accuracy is better than prescribed, or when max_iters is reached, at which point a warning is raised.
        ### Setting chunk_size=0 deactivates chunking. But for large dimensional functions, chunk_size=5 or 10 improves performance drastically. Note that chunk_size must evenly divide the dimension of the input vector.
        ### Available methods are Newton or Chord

        # First check if we are already within the accuracy bounds
        error_term = f(init_guess)
        if norm(error_term) < accuracy
            info("Initial guess accurate.")
            return init_guess
        end

        # Different solution methods
        i = 1
        current_guess = init_guess
        if method=="Chord"
            df = jacobian(f,current_guess,chunk_size=chunk_size)
            while norm(error_term) >= accuracy && i <= max_iters
                current_guess -= df \ error_term
                error_term = f(current_guess)
                i += 1
            end
        elseif method=="Newton"
            jake = jacobian(f, ForwardDiff.AllResults, chunk_size=chunk_size, cache=cache)
            df, lower_order = jake(init_guess)
            while norm(value(lower_order)) >= accuracy && i <= max_iters
                current_guess -= df \ value(lower_order)
                df, lower_order = jake(current_guess)
                i += 1
            end
            error_term = value(lower_order)
        else
            warn("Unknown method: ", method, ", returning initial guess.")
            return init_guess
        end

        # Check if converged
        if norm(error_term) >= accuracy
            warn("Did not converge, check initial guess or try increasing max_iters (currently: ", max_iters, ").")
        end

        info("Used ", i, " iterations; remaining error=", norm(error_term))
        return current_guess
    end

    end

This can then be wrapped in finite difference code for solving nonlinear PDEs!
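To see what automatic differentiation is doing under the hood, here is a minimal Python sketch of the same idea in one dimension. It is not the ForwardDiff implementation: the `Dual` class and `root_find` function are made up for illustration and only support addition, subtraction and multiplication. A dual number carries a value together with a derivative, so evaluating `f` on `Dual(x, 1.0)` returns both $latex f(x)$ and $latex f'(x)$ without any hand-computed derivative.

```python
class Dual:
    """Number val + der*eps with eps^2 = 0; `der` carries the derivative."""

    def __init__(self, val, der=0.0):
        self.val = val
        self.der = der

    @staticmethod
    def lift(other):
        """Treat plain numbers as constants (zero derivative)."""
        return other if isinstance(other, Dual) else Dual(other)

    def __add__(self, other):
        o = Dual.lift(other)
        return Dual(self.val + o.val, self.der + o.der)

    __radd__ = __add__

    def __sub__(self, other):
        o = Dual.lift(other)
        return Dual(self.val - o.val, self.der - o.der)

    def __rsub__(self, other):
        o = Dual.lift(other)
        return Dual(o.val - self.val, o.der - self.der)

    def __mul__(self, other):
        o = Dual.lift(other)
        # Product rule: (a + a' eps)(b + b' eps) = ab + (a b' + a' b) eps
        return Dual(self.val * o.val, self.val * o.der + self.der * o.val)

    __rmul__ = __mul__


def root_find(f, init_guess, accuracy=1e-12, max_iters=100):
    """Newton iteration for f(x) = 0, with f' obtained from dual numbers."""
    x = init_guess
    for _ in range(max_iters):
        y = f(Dual(x, 1.0))        # y.val = f(x), y.der = f'(x)
        if abs(y.val) < accuracy:
            return x
        x -= y.val / y.der         # Newton step (assumes f'(x) != 0)
    raise RuntimeError("did not converge; try another initial guess")


def cubic(x):
    """Works on floats and on Dual numbers alike."""
    return x * x * x - 2 * x - 5


root = root_find(cubic, 2.0)
print("root:", root, "residual:", cubic(root))
```

The vector-valued version in the post replaces the division by `y.der` with a linear solve against the Jacobian, but the black-box character is the same: the caller supplies only `f`.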

Filed under: Disseminating mathematics, Life of a mathematician

`nvim`) is that it now supports asynchronous job dispatch. This makes it a bit nicer to call external previewers, for instance (otherwise the previewer may block the editing). So here is the latest LaTeX runtime code that I use, modified for NeoVim.

    function Dvipreview()
        let dviviewjob = jobstart(['xdvi', '-sourceposition', line(".")." ".expand("%"), expand("%:r") . ".dvi"])
    endfunction

    function PDFpreview()
        let pdfviewjob = jobstart(['evince', expand("%:r") . ".pdf"])
    endfunction

    au BufRead *.tex call LaTeXStartup()

    function LaTeXStartup()
        set dictionary+=~/.config/nvim/custom/latextmp/labelsdictionary
        set iskeyword=@,48-57,_,:
        call SimpleTexFold()
        set completefunc=CompleteBib
        set completeopt=menuone,preview
        runtime custom/latextmp/bibdictionary
        call SetShortCuts()
    endfunction

    function SimpleTexFold()
        exe "normal mz"
        1
        set foldmethod=manual
        if search('\\begin{document}','nW')
            1,/\\begin{document}/-1fold
            if search('\\section','nW')
                /\\section/1
            endif
            while search('\\section','nW')
                .,/\\section/-1fold
                /\\section/1
            endwhile
            .,$fold
        endif
        if search('\\begin{entry}','nW')
            /\\begin{entry}/1
            while search('\\begin{entry}','nW')
                .,/\\begin{entry}/-1fold
                /\\begin{entry}/1
            endwhile
            .,$fold
        endif
        exe "normal g`zzv"
    endfunction

    function SetShortCuts()
        " Map <F2> to save and compile
        imap <F2> ^[:w^M:! latex -src-specials % >/dev/null^M^Mi
        " Map S-<F2> to save and compile as PDF
        " apparently <S-F2> sends the same keycode as <F12>?
        imap <F12> ^[:w^M:! pdflatex % >/dev/null^M^Mi
        " Map <F3> to Dvipreview()
        imap <F3> ^[:call Dvipreview()^M
        " Map S-<F3> to PDFpreview()
        " apparently <S-F3> = <F13>
        imap <F13> ^[:call PDFpreview()^M
        " Map <F4> to bibtex
        imap <F4> ^[:! bibtex "%:r" >/dev/null^M^Mi
        " Map <F5> to change the previous word into a latex \begin .. \end environment
        imap <F5> ^[diwi\begin{^[pi<Right>}^M^M\end{^[pi<Right>}<Up>
        " Map <F6> to 'escape the current \begin .. \end environment
        imap <F6> ^[/\\end{.*}/e^Mi<Right>
        " Map <F7> to search the labels dictionary for matching labels
        imap <F7> ^[diwi\ref{^[pi<Right>^X^K
        " Map <F8> to rebuild the labels dictionary
        imap <F8> ^[:w^M:! ~/.config/nvim/custom/latexreadlabels.sh %^M^Mi
        " Map <F9> to search using the bibs dictionary
        imap <F9> ^[diwi\cite{^[pi<Right>^X^U
        imap <S-Tab>C ^[diwi\mathcal{^[pi<Right>}
        imap <S-Tab>B ^[diwi\mathbb{^[pi<Right>}
        imap <S-Tab>F ^[diwi\mathfrak{^[pi<Right>}
        imap <S-Tab>R ^[diwi\mathrm{^[pi<Right>}
        imap <S-Tab>O ^[diwi\mathop{^[pi<Right>}
        imap <S-Tab>= ^[diWi\bar{^[pi<Right>}
        imap <S-Tab>. ^[diWi\dot{^[pi<Right>}
        imap <S-Tab>" ^[diWi\ddot{^[pi<Right>}
        imap <S-Tab>- ^[diWi\overline{^[pi<Right>}
        imap <S-Tab>^ ^[diWi\widehat{^[pi<Right>}
        imap <S-Tab>~ ^[diWi\widetilde{^[pi<Right>}
        imap <S-Tab>_ ^[diWi\underline{^[pi<Right>}
    endfunction

Note that the control characters did not copy-paste entirely correctly in the `SetShortCuts()` routine. Those need to be replaced by the actual control-X sequences. The read labels shell script is simply

    #!/bin/sh
    # Collect every \label{...} name in the given TeX file into the labels dictionary
    grep '\\label{' "$1" | sed -r 's/.*\\label\{([^}]*)\}.*/\1/' > ~/.config/nvim/custom/latextmp/labelsdictionary

(I probably should observe the proper directory structure and dump the dictionary into `~/.local/share/` instead.)

Filed under: Life of a mathematician

**Teaching**

Last semester I taught two classes. One is a “Intro to Proofs” class, and another is (supposed to be) an advanced undergraduate real analysis course. Upon reflection both of the classes could have benefitted from some inclusion of more structured proofs.

For the “Intro to Proofs” class, this is the belated recognition that “how mathematicians write and read proofs” is not always the same as “how mathematicians think about proofs”. There’s much that can be (and has been) written about this, but the short of the matter is that despite of our pretenses, mathematicians typically don’t write proofs completely rigorously. And we read proofs we don’t often check every detail, but choosing instead to absorb the “big picture”. As such, mathematical proofs that we see presented in graduate level textbooks and in journal articles are frequently really merely “sketches”: there are gaps to be filled by the reader.

What is often neglected in teaching students to read and write proofs is that these proofs or their sketches are backed up, usually, by a concrete and rigorous understanding of the subject. And that the distillation from a complete proof to what is presented on a piece of paper as the sketch is a bit of an art. In some respects a flipped classroom, especially in IBL style, is perfect for this. The students start by presenting proofs in great details, and as they collectively grow more and more may be omitted.

However, what I found disconcerting is that in a regular education, there can be fourth-year undergraduate students studying for a degree in mathematics who still have not internalized this difference and are unable to successfully read and write mathematics.

This is where more structured proof writing can become useful. Similar to how the study of literature involves diagramming sentences, here we diagram proofs. A proof can be written at various levels of detail. In a structured presentation these levels can be made explicit: a proof is decomposed into individual landmark statements which taken together yield the desired result. Each statement needs justification, and these justifications can themselves be sketched recursively. The final writing of the proof is to prune this tree of ideas by removing the “trivial” justifications and keeping the important ones.
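To make the “tree of ideas” concrete, here is a small Python sketch. Everything in it (the `Step` class, the `trivial` flag, the `render` function, and the sample steps) is invented for illustration and is not tied to any existing tool; it only shows how one nested structure can be read at several depths and pruned for the final write-up.

```python
from dataclasses import dataclass, field


@dataclass
class Step:
    """A landmark statement together with the steps that justify it."""
    statement: str
    trivial: bool = False  # marked by the author as safe to omit
    justification: list["Step"] = field(default_factory=list)


def render(step, max_depth, prune_trivial=False, depth=0):
    """Return the proof as indented lines, descending at most max_depth levels."""
    lines = ["  " * depth + step.statement]
    if depth < max_depth:
        for child in step.justification:
            if prune_trivial and child.trivial:
                continue  # drop "trivial" justifications from the final write-up
            lines.extend(render(child, max_depth, prune_trivial, depth + 1))
    return lines


proof = Step(
    "Darboux integrable functions are Riemann integrable.",
    justification=[
        Step("Choose a partition Q whose Darboux sums approximate the integral.",
             justification=[Step("Such a Q exists by the definition of inf and sup.",
                                 trivial=True)]),
        Step("For fine enough P, Riemann sums are controlled by the Q-Darboux sums.",
             justification=[Step("Few intervals of P straddle two intervals of Q.")]),
    ],
)
```

Calling `render(proof, 0)` gives only the theorem, `render(proof, 1)` the high-level sketch, and `render(proof, 2, prune_trivial=True)` the detailed version with the trivial justification removed.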

Instructors can demonstrate this in action by preparing detailed proofs of classical theorems in this format. I’ve found that rewriting theorems in this format forces me to re-examine assumptions and conclusions, and overall to be more succinct when trying to come up with a high-level sketch. Furthermore, if lecture notes are presented in this format, students will also benefit from being able to study the proofs by first obtaining a bird’s eye view of the process and then diving, at various levels of detail, into the nitty-gritty of the arguments.

I will try to implement this in a future undergraduate math major class and see what happens.

**Research**

Mathematics papers are getting longer, especially in my field of hyperbolic PDEs. It is getting harder and harder, when reading a paper, to keep in one’s head a coherent picture of the overall argument. This is a problem that I think can be beautifully solved with a more structured approach to presenting arguments.

I am not advocating rewriting proofs as pedantically as Lamport proposes. I am not even advocating the strict presentation. What I do like about the idea of structured proofs is the two-dimensionality of the presentation. In this I am also a bit influenced by Terry Tao’s “circuit-diagram” approach to diagramming proofs that he used in, among other things, his Nonlinear Dispersive Equations book and his recent Averaged Navier Stokes paper.

What I have in mind is the presentation of proofs as nested sketches, with each level written more-or-less in natural language, as is done currently. Each step of the proof is justified by its own “proof”. The proof can be read at different levels of detail, and readers can choose to zoom in and study a portion of the proof when interested. Assumptions and conclusions of individual steps should be made clear; the tree structure of the presentation can help prevent circular arguments. (Some aspects of this are already present in modern mathematical papers: important intermediate arguments are often extracted in the form of lemmata and propositions. This proposal just makes everything more organized.)

This also can improve the refereeing experience. A paper can be rejected if an individual step can be shown to be false. Additional clarifications can be inserted if the referee feels that the paper does not go deep enough in the chain of justifications.

**Technological support**

This presentation of ideas does not require non-traditional media, but it can be improved by them. I’ve just run into a web app that does some approximation of what I envisioned: Gingko. It is free to use if your usage isn’t heavy.

What I would love would be for cards to also support

- Cross referencing; currently it supports hashtags, but not referencing with a definite target card.
- Duplicating cards; currently it supports moving, but not duplicating.
- Multiple ancestry: it would be great if the same card can appear as the child of two different cards. But this can also be emulated with cross referencing support or duplication support.


`svn` to manage my research work, and I probably would have remained so if it weren’t for my next job favoring git instead. So in the past few weeks I have been reading up on git and in the process discovering all sorts of things that I have been doing wrong, or at least sub-optimally. So here are just some notes on what I’ve just figured out (yay slow me!).

**Each paper should be a repository**

Previously I kept one single giant repository for all my research work. I’ve discovered that this is not the best idea for multiple reasons:

- Collaboration: one of the great things about version control systems is that they make collaboration easier to manage. But your collaborators are not a static set and you probably don’t want them to peek at every one of your research ideas. The easiest way to share individual projects with only those who should be allowed to see and edit them is to have one repo for each paper. (I got away with what I did mostly because I failed to convince any of my collaborators to use a VCS beyond the built-in support in Dropbox.)
- Organisation: to keep track of papers I have them stored in subdirectories, some of which are “stuff I am working on” and some of which are “stuff that is finished from year X” and some of which are “stuff that is being refereed”. It is a bit silly that I have to do `svn mv` changes to “graduate” a project from one subdirectory to the next. By keeping each paper in its own (git) repository, the local directory representation of the storage is immaterial. And this makes more sense to me.
- (In)compatibility: here’s something that I changed my mind on. Previously I thought it a great idea to keep a single up-to-date bibtex file containing all the references that I can ever need, and a single up-to-date version of my custom LaTeX class and style files. The advantage of course is that I just need to issue one `svn up` to get the newest versions of everything. But the disadvantage is that when upgrading my class and style files, or when updating my bibtex files, I have to maintain *backward compatibility*. And when I do break the compatibility, I am then required to keep a copy of the old versions of the files along with the LaTeX source that uses them, which, when you think about it, completely defeats the purpose of having a single up-to-date version in one repo.

So my new workflow, instead of one giant repository, is to create a repo for each paper/project. My LaTeX class and style files will themselves be a separate Git repo, which I can upgrade and develop to my heart’s desire. When I start a new paper I will simply make a copy of the current version of the files (with `git archive` instead of `git clone` because I won’t need the previous versions, nor will I want to track the changes). This also allows me to set up my “development environment” (via `.gitattributes` and `.gitignore`) quickly.

**Keyword substitution is not necessary**

For the papers I keep in my svn repo, I have been using the svn and svn-multi packages to add time-stamp and versioning information to the PDF files. Both of those packages rely on the “keyword substitution” capabilities of the svn system at commit time. Naturally when I wanted to start using git, I looked for a replacement. The obvious one is gitinfo2. One thing I don’t like is that, unlike the keyword replacements, this package does not directly modify the source LaTeX file; instead it creates (via commit and checkout hooks) a supplementary file in the `.git/` directory which it searches for and inserts when building the PDF file. This makes it a bit more of a hassle when uploading stuff to the arXiv, for example.

So I started reading up on how one can actually imitate keyword expansion using commit and checkout filters. And I went so far as to implement something for LaTeX. And then I read the discussion by the kernel devs on this issue, and Linus Torvalds’ comments left an impression on me. In short:

- When you are working on the code in a git repository, you don’t need this tagging since you can just “ask git”.
- Conversely, this sort of tagging is only needed when your code is ready to leave the repository (upload to arXiv or sent to non-git-using collaborators, for example).

So philosophically it is much less useful to have something that works on the *working copy* compared to something that works on an *exported archive*. And while git, by design, cannot and will not do keyword expansion on commits, it is perfectly happy to do keyword expansion when one exports the repo. Furthermore, since the export substitution can be formatted essentially arbitrarily, this moots the need for something like `svn` or `svn-multi` to parse the string generated by the RCS: we can make the string appear how we want to start with. The only hiccup is that *before* the substitution (i.e. when you are working in the working copy), the syntax for the export substitution is not exactly compatible with LaTeX, and requires a little mucking about with catcodes. But with that problem solved, and with the workflow now accounting for each paper as a separate repository, for arXiv uploads the easiest thing will actually be to simply issue `git archive` and upload the resulting tarball.
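As a rough sketch of that export step, the following Python wrapper builds the `git archive` call. The helper names, the output file name `paper.tar.gz`, the `*.tex` pattern, and the particular placeholder are all assumptions chosen for illustration; the `export-subst` attribute and the `$Format:...$` placeholder syntax are the git features that perform the substitution at export time.

```python
import subprocess

# Assumed .gitattributes entry: it asks git to expand $Format:...$ placeholders
# in matching files whenever the repository is exported with `git archive`.
GITATTRIBUTES_LINE = "*.tex export-subst"

# Assumed placeholder to put in the LaTeX source; %h is the abbreviated commit
# hash and %ad the author date, in git's pretty-format notation.
PLACEHOLDER = "$Format:%h %ad$"


def archive_command(ref="HEAD", output="paper.tar.gz"):
    """Build the `git archive` call that exports `ref` as a gzipped tarball."""
    return ["git", "archive", "--format=tar.gz", "-o", output, ref]


def export_paper(repo_dir, ref="HEAD", output="paper.tar.gz"):
    """Run the export inside repo_dir; raises CalledProcessError if git fails."""
    subprocess.run(archive_command(ref, output), cwd=repo_dir, check=True)
```

The placeholder is left untouched in the working copy, which is why the catcode workaround is still needed when compiling there.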


Let us begin by setting the notations and recalling what happens *without* the Stieltjes part.

**Defn (Partition)**

Let $I \subset \mathbb{R}$ be a closed interval. A partition $P$ of $I$ is a collection of closed subintervals $J \subseteq I$ such that

- $P$ is finite;
- $P$ covers $I$, i.e. $\bigcup_{J \in P} J = I$;
- $P$ is pairwise almost disjoint, i.e. for distinct elements $J, J'$ of $P$, their intersection $J \cap J'$ contains at most one point.

We write $\mathcal{P}(I)$ for the set of all partitions of $I$.

**Defn (Refinement)**

Fix $I$ a closed interval, and $P, Q \in \mathcal{P}(I)$ two partitions. We say that $P$ refines $Q$, or that $P \preceq Q$, if for every $J \in P$ there exists $J' \in Q$ such that $J \subseteq J'$.

**Defn (Selection)**

Given $I$ a closed interval and $P \in \mathcal{P}(I)$ a partition, a selection is a mapping $\sigma: P \to I$ that satisfies $\sigma(J) \in J$ for every $J \in P$.

**Defn (Size)**

Given $I$ a closed interval and $P \in \mathcal{P}(I)$ a partition, the size of $P$ is defined as $|P| = \max_{J \in P} |J|$, where $|J|$ is the length of the closed interval $J$.

**Remark** In the above we have defined two different preorders on the set $\mathcal{P}(I)$ of all partitions. One is induced by the size: we say that $P \leq Q$ if $|P| \leq |Q|$. The other is given by the refinement $P \preceq Q$. Note that neither is a partial order. (But the preorder given by refinement can be made into a partial order if we disallow zero-length degenerate closed intervals.) Note also that if $P \preceq Q$ we must have $P \leq Q$.
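The two preorders are easy to experiment with. The sketch below is only an illustration: it represents a partition of $[0,1]$ as a list of `(a, b)` endpoint pairs, and the helper names `size`, `refines` and `uniform_partition` are made up here. It shows that a refinement never has larger size, while a partition of smaller size need not be a refinement.

```python
from fractions import Fraction


def uniform_partition(n):
    """Partition [0, 1] into n closed intervals of equal length."""
    return [(Fraction(k, n), Fraction(k + 1, n)) for k in range(n)]


def size(P):
    """Length of the longest interval in the partition."""
    return max(b - a for a, b in P)


def refines(P, Q):
    """True if every interval of P is contained in some interval of Q."""
    return all(any(c <= a and b <= d for c, d in Q) for a, b in P)


halves, thirds, quarters = (uniform_partition(n) for n in (2, 3, 4))
```

Here `quarters` refines `halves`, whereas `thirds` has smaller size than `halves` without refining it, since the middle third straddles the point $1/2$.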

Now we can define the notions of integrability.

**Defn (Integrability)**

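Let $I$ be a closed, bounded interval and $f: I \to \mathbb{R}$ be a bounded function. We say that $f$ is integrable with integral $s$ in the sense of

*Riemann* if for every $\epsilon > 0$ there exists $P_0 \in \mathcal{P}(I)$ such that for every $P \leq P_0$ (that is, $|P| \leq |P_0|$) and every selection $\sigma: P \to I$ we have

$$ \Bigl| \sum_{J \in P} f(\sigma(J)) \, |J| - s \Bigr| < \epsilon; $$

*Generalised-Riemann* if for every $\epsilon > 0$ there exists $P_0 \in \mathcal{P}(I)$ such that for every $P \preceq P_0$ (that is, every refinement $P$ of $P_0$) and every selection $\sigma: P \to I$ we have

$$ \Bigl| \sum_{J \in P} f(\sigma(J)) \, |J| - s \Bigr| < \epsilon; $$

*Darboux* if

$$ \inf_{P \in \mathcal{P}(I)} \sum_{J \in P} \bigl( \sup_J f \bigr) |J| = s = \sup_{P \in \mathcal{P}(I)} \sum_{J \in P} \bigl( \inf_J f \bigr) |J|. $$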

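From the definition it is clear that “Riemann integrable” implies “Generalised-Riemann integrable”. Furthermore, we have clearly that for a fixed partition $P$ and any selection $\sigma$

$$ \sum_{J \in P} \bigl( \inf_J f \bigr) |J| \leq \sum_{J \in P} f(\sigma(J)) \, |J| \leq \sum_{J \in P} \bigl( \sup_J f \bigr) |J| $$

and that if $P$ refines $Q$ we have

$$ \sum_{J \in Q} \bigl( \inf_J f \bigr) |J| \leq \sum_{J \in P} \bigl( \inf_J f \bigr) |J| \leq \sum_{J \in P} \bigl( \sup_J f \bigr) |J| \leq \sum_{J \in Q} \bigl( \sup_J f \bigr) |J| $$

so “Darboux integrable” also implies “Generalised-Riemann integrable”. A little bit more work shows that “Generalised-Riemann integrable” also implies “Darboux integrable” (if the suprema and infima are attained on the intervals $J$, this would follow immediately; in general, using the boundedness we can find a selection $\sigma$ such that the Riemann sum approximates the upper or lower Darboux sums arbitrarily well).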
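A quick numerical check of these two chains of inequalities, as a sketch only. It assumes the test function $f(x) = x^2$ on $[0,1]$, chosen because it is increasing, so its infimum and supremum on $[a,b]$ are simply $f(a)$ and $f(b)$. All helper names are illustrative, and exact rational arithmetic is used to avoid rounding issues.

```python
from fractions import Fraction
import random


def f(x):
    # Increasing on [0, 1], so on [a, b] the infimum is f(a) and the supremum is f(b).
    return x * x


def uniform_partition(n):
    return [(Fraction(k, n), Fraction(k + 1, n)) for k in range(n)]


def lower_sum(P):
    return sum(f(a) * (b - a) for a, b in P)


def upper_sum(P):
    return sum(f(b) * (b - a) for a, b in P)


def riemann_sum(P, selection):
    return sum(f(selection(a, b)) * (b - a) for a, b in P)


def random_selection(rng):
    # Pick a rational point of [a, b] so that all arithmetic stays exact.
    return lambda a, b: a + (b - a) * Fraction(rng.randint(0, 1000), 1000)


rng = random.Random(0)
coarse, fine = uniform_partition(5), uniform_partition(10)  # fine refines coarse
sample = riemann_sum(fine, random_selection(rng))
```

Whatever selection is drawn, `sample` lies between `lower_sum(fine)` and `upper_sum(fine)`, and passing from `coarse` to `fine` raises the lower sum and lowers the upper sum.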

The interesting part is the following

**Theorem**

Darboux integrable functions are Riemann integrable. Thus all three notions are equivalent.

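*Proof*. Let $P, Q \in \mathcal{P}(I)$ be partitions. Let $M = \sup_I |f|$, and let $N$ be the number of non-degenerate subintervals in $Q$. We have the following estimate, valid for every selection $\sigma: P \to I$:

$$ \sum_{J \in Q} \bigl( \inf_J f \bigr) |J| - 2MN|P| \leq \sum_{J \in P} f(\sigma(J)) \, |J| \leq \sum_{J \in Q} \bigl( \sup_J f \bigr) |J| + 2MN|P|. $$

The estimate follows by noting that “most” of the $J \in P$ will be subsets of some $J' \in Q$, and there can be at most $N$ of the $J \in P$ that straddle two different non-degenerate subintervals of $Q$; each of these has length at most $|P|$. To prove the theorem it suffices to choose first a $Q$ such that the upper and lower Darboux sums approximate the integral well. Then we can conclude that for all $P$ with $|P|$ sufficiently small the Riemann sum is almost controlled by the $Q$-Darboux sums. Q.E.D.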

Now that we have recalled the case of the usual integrability, let us consider the case of the Stieltjes integrals: instead of integrating against $\mathrm{d}x$, we integrate against $\mathrm{d}\rho$, where $\rho$ is roughly speaking a “cumulative distribution function”: we assume that $\rho: I \to \mathbb{R}$ is a bounded monotonically increasing function.

The definitions of the integrals are largely the same, except that at every step we replace the width $|J|$ of the interval $J$ by the diameter of $\rho(J)$, i.e. $\sup_J \rho - \inf_J \rho$. The arguments above immediately also imply that

- “Riemann-Stieltjes integrable” implies “Generalised-Riemann-Stieltjes integrable”
- “Darboux-Stieltjes integrable” implies “Generalised-Riemann-Stieltjes integrable”
- “Generalised-Riemann-Stieltjes integrable” implies “Darboux-Stieltjes integrable”

However, Darboux-Stieltjes integrable functions need not be Riemann-Stieltjes integrable. The possibility of failure can be seen in the proof of the theorem above, where we used the fact that $|P|$ is allowed to be made arbitrarily small. The same estimate, in the case of the Stieltjes version of the integrals, has $|P|$ replaced by $\max_{J \in P} \bigl( \sup_J \rho - \inf_J \rho \bigr)$, which for arbitrary partitions need not shrink to zero. To have a concrete illustration, we give the following:

**Example**

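Let $I = [0,1]$. Let $\rho(x) = 0$ if $x < \tfrac12$ and $\rho(x) = 1$ otherwise. Let $f(x) = 0$ if $x \leq \tfrac12$ and $f(x) = 1$ otherwise. Let $Q$ be the partition $\{ [0, \tfrac12], [\tfrac12, 1] \}$. We have that

$$ \sum_{J \in Q} \bigl( \sup_J f \bigr) \bigl( \sup_J \rho - \inf_J \rho \bigr) = 0 \cdot 1 + 1 \cdot 0 = 0 $$

while

$$ \sum_{J \in Q} \bigl( \inf_J f \bigr) \bigl( \sup_J \rho - \inf_J \rho \bigr) = 0 \cdot 1 + 0 \cdot 0 = 0 $$

so we have that in particular the pair $(f, \rho)$ is Darboux-Stieltjes integrable with integral 0. However, let $m$ be any odd integer, and consider the partition $P$ of $I$ into $m$ equal portions. Only the middle subinterval, which contains $\tfrac12$ in its interior, contributes. Depending on the choice of the selection $\sigma$, we see that the sum can take the values

$$ \sum_{J \in P} f(\sigma(J)) \bigl( \sup_J \rho - \inf_J \rho \bigr) \in \{ 0, 1 \} $$

which shows that the Riemann-Stieltjes condition can never be satisfied.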
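The failure can be checked by direct computation. The sketch below assumes one concrete pair with a shared jump at $1/2$, namely $\rho = 0$ on $[0, 1/2)$ and $1$ on $[1/2, 1]$, and $f = 0$ on $[0, 1/2]$ and $1$ on $(1/2, 1]$; the helper names are illustrative. Since $\rho$ is increasing, the diameter of $\rho([a,b])$ is just $\rho(b) - \rho(a)$, and since $f$ is also increasing, the left and right endpoint selections give the lower and upper Darboux-Stieltjes sums.

```python
from fractions import Fraction

HALF = Fraction(1, 2)


def rho(x):
    # Assumed integrator: jumps from 0 to 1 at x = 1/2 (right-continuous).
    return 0 if x < HALF else 1


def f(x):
    # Assumed integrand: jumps from 0 to 1 just after x = 1/2 (left-continuous).
    return 0 if x <= HALF else 1


def uniform_partition(m):
    return [(Fraction(k, m), Fraction(k + 1, m)) for k in range(m)]


def stieltjes_sum(P, selection):
    """Riemann-Stieltjes sum: f at the selected point times the rho-diameter."""
    return sum(f(selection(a, b)) * (rho(b) - rho(a)) for a, b in P)


def left(a, b):
    return a


def right(a, b):
    return b


Q = [(Fraction(0), HALF), (HALF, Fraction(1))]
lower, upper = stieltjes_sum(Q, left), stieltjes_sum(Q, right)
odd_sums = {m: (stieltjes_sum(uniform_partition(m), left),
                stieltjes_sum(uniform_partition(m), right))
            for m in (3, 5, 101)}
```

Both Darboux-Stieltjes sums on `Q` vanish, yet for every odd `m` the two selections give sums 0 and 1 respectively, no matter how fine the uniform partition is.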

The example above, where both $f$ and $\rho$ are discontinuous at the same point, is essentially sharp. An easy modification of the previous theorem shows that

**Prop**

If at least one of $f, \rho$ is continuous, then Darboux-Stieltjes integrability is equivalent to Riemann-Stieltjes integrability.

**Remark** The nonexistence of the Riemann-Stieltjes integral when $f$ and $\rho$ have shared discontinuity points is similar in spirit to the idea in distribution theory where whether the product of two distributions is well-defined (as a distribution) depends on their wave-front sets.
