Bubbles Bad; Ripples Good

… Data aequatione quotcunque fluentes quantitates involvente fluxiones invenire et vice versa …


Very short note: I am in the process of setting up a separate website for my blogging needs. (There are certain constraints of the WordPress platform that make me no longer excited about updating this page.) More info will be posted about that later once the site is completed.

For now, however, please note that some of my older posts (especially those that are very math heavy) may start to disappear from this site. Do not worry: I am just transferring those posts to my new site slowly. Not all of my posts will be migrated over; some will be left over here because they don’t fit the theme of my new site very well.

[Edit (September 5, 2019): the new site is up at https://qnlw.info.]

Chromebook adventure

Wow, it has been over a year since my last post.

I just got myself a new Chromebook, which supports Crostini. Today’s post is mostly documentation to help me remember what to do if I need to re-do the set-up.

  • To install ArchLinux instead of Debian, follow https://www.reddit.com/r/Crostini/wiki/howto/run-arch-linux
  • Remember: because of the way Chrome interacts with Crostini, pure X applications don’t work. But applications that are Wayland capable should. (Think evince.)
  • Stuff to install in Arch: texlive, asymptote, julia, neovim, vim (need classic vim to edit vim-encrypted files), gummi.
  • It has been a LONG time since I last used gummi; I’ve been mostly just using neovim for editing on my desktop. But on the limited screen real estate of the chromebook, gummi seems like a good idea.
  • As of yet, Crostini does not obey VPN routing of the host OS. This means that I cannot use ssh or git+ssh from within the container. Workaround for git: use https connections instead. I have to type lots of passwords, but it’s okay.
  • Syncthing seems to work well via the android app.
  • Joplin seems to work well via the android app. (In fact, better than on the Galaxy Tab.)
  • EbookDroid allows reading of Djvu files.
  • chrome://flags allows access to some interesting beta features; one of them allows reading of Android files in the Files dialog. This would be needed to make Syncthing really useful.
  • Asymptote (in Crostini) doesn’t work very well at the moment; I think it has to do with the fact that it needs OpenGL for 3D rendering. Right now trying to render 3D pictures either gives a freeglut error, produces a blank PDF file, or core dumps. 2D pictures are perfectly fine though.

Heat ball

There are very few things I find unsatisfactory in L.C. Evans’ wonderful textbook on Partial Differential Equations; one of them is the illustration (on p.53 of the second edition) of the “heat ball”.

The heat ball is a region with respect to which an analogue of the mean value property of solutions to Laplace’s equation can be expressed, now for solutions of the heat equation. In the case of Laplace’s equation, the regions are round balls. In the case of the heat equation, the regions are somewhat more complicated. They are defined by the expression

\displaystyle E(x,t;r) := \left\{ (y,s)\in \mathbb{R}^{n+1}~|~s \leq t, \Phi(x-y, t-s) \geq \frac{1}{r^n} \right\}

where \Phi is the fundamental solution of the heat equation

\displaystyle \Phi(x,t) := \frac{1}{(4\pi t)^{n/2}} e^{- \frac{|x|^2}{4t}}.

In the expressions above, the constant n is the number of spatial dimensions; r is the analogue of the radius of the ball, and in E(x,t;r), the point (x,t) is the center. Below is a better visualization of the heat balls: the curves shown are the boundaries \partial E(0,5;r) in dimension n = 1, for radii between 0.75 and 4 in steps of 0.25 (in particular all the red curves have integer radii). In higher dimensions the shapes are generally the same, though they appear more “squashed” in the t direction.

1-dimensional heat balls centered at (0,5) for various radii. (Made using Desmos)
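As an aside, for anyone wanting to reproduce such a plot: solving \Phi(x-y,t-s) = r^{-n} for the boundary gives |x-y|^2 = 2n\tau \log(r^2/(4\pi\tau)) with \tau = t-s ranging over (0, r^2/(4\pi)]. A quick sketch in Python (the helper names are mine, not standard):

```python
import math

def heat_kernel(x, t, n=1):
    """Fundamental solution of the heat equation in n spatial dimensions."""
    return math.exp(-x * x / (4 * t)) / (4 * math.pi * t) ** (n / 2)

def heat_ball_boundary(r, tau, n=1):
    """Half-width |x - y| of the heat ball E(x, t; r) at time offset tau = t - s.

    Obtained by solving heat_kernel(y, tau) = r**(-n) for y:
        |y|^2 = 2 n tau log(r^2 / (4 pi tau)),  valid for 0 < tau <= r^2/(4 pi).
    """
    if not 0 < tau <= r * r / (4 * math.pi):
        raise ValueError("tau outside the extent of the heat ball")
    return math.sqrt(2 * n * tau * math.log(r * r / (4 * math.pi * tau)))

# Sample the boundary of E(0, 5; r) for r = 2, as in the figure (n = 1):
r = 2.0
tau_max = r * r / (4 * math.pi)
curve = [(heat_ball_boundary(r, tau), 5 - tau)
         for tau in (0.1 * tau_max, 0.5 * tau_max, 0.9 * tau_max)]
```

Plotting the pairs (y, s) for many values of \tau traces out the pear-shaped curves in the figure.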

TeXLive — on Android

I’ve just done something that is, admittedly, rather silly. And I am somewhat surprised that it actually worked.

I managed to have TeXLive running on my new Samsung Galaxy Tab A, which runs Android.

It is painfully slow (speed is comparable to my old netbook from 2010). But in a pinch, I now know that it works. And that gives me some (minor) peace of mind.



Recently I started rethinking how I organize my incomplete and under development notes.

I know full well the inherent dangers of such an exercise in terms of my actual productivity. But now that I have completed my newest workflow I think I’ve finally found one that works well. (Fingers crossed.)

Before I describe what I do now, I’d like to document what I used to do, what I changed the last time, and why I am changing it again.


In defense of integration by parts

A prominent academic, who happens not to be a mathematician, visited my home institution recently and gave a public address about the role of the university in the modern world. Most of what he said concerning our teaching mission was the usual platitudes about not being stuck in the past and making sure that our curricular content and learning objectives are aligned with what we would expect a 21st century college graduate to need.

It however bugged me to no end that the recurring example this particular individual returned to, for something old-fashioned that “ought not be taught”, is integration by parts; he justified this by mentioning that computer algebra systems (or even just Google) can do the integrals faster and better than we humans can.

I don’t generally mind others cracking jokes at mathematicians’ expense. But this particular self-serving strawman uttered by so well-regarded an individual is, to those of us actually in the field teaching calculus to freshmen and sophomores, very damaging and disingenuous.

I happened to have just spent the entirety of last year rethinking how we can best teach calculus to modern engineering majors. Believe me, students nowadays know perfectly well when we are just asking them to do busywork; they also know perfectly well that computer algebra systems are generally better at finding closed-form integral expressions than we are. Part of the challenge of the redesign that I am involved in is precisely to convince the students that calculus is worth learning in spite of computers. The difficulty is not a dearth of reasons; on the contrary, there are many good reasons why a solid grounding in calculus is important to modern engineering students. To give a few examples:

  1. Taylor series are in fact important because of computers, since they provide a method of compactly encoding an entire function.
  2. Newton’s method for root finding (and its application to, say, numerical optimization) is built on a solid understanding of differential calculus.
  3. The entirety of the finite element method of numerical simulation, which underlies a lot of civil and mechanical engineering applications, is based on a variational formulation of differential equations that, guess what, only makes sense when one understands integration by parts.
  4. The Fourier transform, which is behind a lot of signal/image processing, requires understanding how trigonometric functions behave under integration.
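To take item 2 as a concrete illustration of the kind of thing a computer cannot do for a student who lacks the calculus: here is a minimal sketch of the Newton iteration in Python (the function names and tolerances are of course just placeholders):

```python
def newton(f, fprime, x0, tol=1e-12, max_iter=50):
    """Find a root of f by Newton's method: repeatedly replace x by the
    root of the tangent line to f at (x, f(x))."""
    x = x0
    for _ in range(max_iter):
        step = f(x) / fprime(x)
        x -= step
        if abs(step) < tol:
            return x
    raise RuntimeError("Newton's method did not converge")

# sqrt(2) as the positive root of f(x) = x^2 - 2:
root = newton(lambda x: x * x - 2, lambda x: 2 * x, x0=1.0)
```

The point is not the four lines of code, but that knowing why the tangent-line replacement converges (and when it fails) is differential calculus.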

No, the difficulty for me and my collaborator is narrowing down a list of examples that we can not only reasonably explain to undergraduate students, but also have them get some hands-on experience working with.

When my collaborator and I were first plunged into this adventure of designing engineering-specific calculus material, one of the very first things that we did was to seek out input from our engineering colleagues. My original impulse was to cut some curricular content in order to give the students a chance to develop deeper understanding of fewer topics. To that end I selected some number of topics which I thought were old-fashioned, out-dated, and no longer used in this day and age. How wrong I was! Even something like “integration by partial fractions”, which most practicing mathematicians will defer to a computer, has its advocates (those who have to teach control theory insist that a lot of fundamental examples in their field can be reduced to evaluating integrals of rational functions, and that a good grasp of how such integrals behave is key to developing a general sense of how control theory works).

In short, unlike what some individuals would have you believe, math education is not obsolete because we all have calculators. In fact, I would argue the opposite: math education is especially pertinent now that we all have calculators. Long gone is the age when a superficial understanding of mathematics, in terms of its rote computations, is a valuable skill. A successful scientist or engineer needs to be able to effectively leverage the large toolbox that is available to her, and this requires a much deeper understanding of mathematics, one that goes beyond just the how but also covers the what and the why.

There is indeed much that can be done to improve math education for the modern student. But one thing that shouldn’t be done is getting rid of integration by parts.

Abusing JabRef to manage snipplets of TeX

I use JabRef as my reference manager. In this post, however, I will discuss how we can abuse it to do some other things.

The problem

Let’s start with a concrete example: I keep a “lab notebook”. It is where I document all my miscellaneous thoughts and computations that come up during my research. Some of those are immediately useful and are collected into papers for publication. Some of those are not, and I prefer to keep them for future reference. These computations range over many different subjects. Now and then, I want to share with a collaborator or a student some subset of these notes. So I want a way to quickly search (by keywords/abstract) for relevant notes, and then compile them into one large LaTeX document.

Another concrete example: I am starting to collect a bunch of examples and exercises in analysis for use in my various classes. Again, I want to have them organized for easy search and retrieval, especially to make into exercise sheets.

The JabRef solution

The “correct” way to do this is probably with a database (or a document store), with each document tagged with a list of keywords. But that requires a bit more programming than I want to worry about at the moment.

JabRef, as it turns out, is sort of a metadata database: by defining a customized entry type you can use the BibTeX syntax as a proxy for JSON-style data. So for my lab notebook example, I define a custom type lnbentry in JabRef with

  • Required fields: year, month, day, title, file
  • Optional fields: keywords, abstract

I store each lab notebook entry as an individual TeX file, whose file system address is stored in the file field. The remaining metadata fields’ contents are self-evident.

(Technical note: in my case I actually store the metadata in the TeX file and have a script to parse the TeX files and update the bib database accordingly.)

For generating output, we can use JabRef’s convenient export filter support. In the simplest case we can create a custom export layout with the main layout file containing the single line


with appropriate begin and end incantations to make the output a fully-formed TeX file. Then one can simply select the entries to be exported, click on “Export”, and generate the appropriate TeX file on the fly.
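A plausible version of that single layout line (this is my reconstruction, not the original filter; JabRef layout files emit their text verbatim, substituting markers like \file with the corresponding field value of each entry) would be:

```latex
\input{\file}
```

so that each exported entry becomes an \input of the corresponding TeX file in the combined document.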

(Technical note: JabRef can also be run without a GUI. So one can use this to perform searches through the database on the command line.)

Gingko and proof-writing

I have recently been reminded of Lamport’s How to Write a 21st Century Proof in more than one way.


Last semester I taught two classes. One was an “Intro to Proofs” class, and the other was (supposed to be) an advanced undergraduate real analysis course. Upon reflection both classes could have benefitted from the inclusion of more structured proofs.

For the “Intro to Proofs” class, this is the belated recognition that “how mathematicians write and read proofs” is not always the same as “how mathematicians think about proofs”. There’s much that can be (and has been) written about this, but the short of the matter is that despite our pretenses, mathematicians typically don’t write proofs completely rigorously. And when we read proofs we don’t often check every detail, choosing instead to absorb the “big picture”. As such, mathematical proofs that we see presented in graduate level textbooks and in journal articles are frequently merely “sketches”: there are gaps to be filled by the reader.

What is often neglected in teaching students to read and write proofs is that these proofs or their sketches are usually backed up by a concrete and rigorous understanding of the subject, and that the distillation from a complete proof to the sketch presented on paper is a bit of an art. In some respects a flipped classroom, especially in IBL style, is perfect for this. The students start by presenting proofs in great detail, and as they collectively grow, more and more may be omitted.

However, what I found disconcerting is that in a regular education, there can be fourth year undergraduate students studying for a degree in mathematics who still have not internalized this difference and are unable to successfully read and write mathematics.

This is where the involvement of more structured proof writing can become useful. Similar to how the study of literature involves diagramming sentences, here we diagram proofs. A proof can be written at various levels of detail. In a structured presentation these levels can be made explicit: a proof is decomposed into individual landmark statements which, taken together, yield the desired result. Each statement needs its own justification, and these justifications can themselves be recursively sketched. The final writing of the proof prunes this tree of ideas by removing the “trivial” justifications and keeping the important ones.

Instructors can demonstrate this in action by preparing detailed proofs of classical theorems in this format. I’ve found that rewriting theorems in this format forces me to re-examine assumptions and conclusions, and overall to be more succinct when trying to come up with a high level sketch. Furthermore, if lecture notes are presented in this format, students will also benefit from being able to study the proofs by first obtaining a bird’s eye view of the argument and then diving, at various levels of detail, into the nitty-gritty of the arguments.

I will try to implement this in a future undergraduate math major class and see what happens.


Mathematics papers are getting longer, especially in my field of hyperbolic PDEs. It is getting harder and harder, when reading a paper, to keep a coherent picture of the overall argument in one’s head. This is a problem that I think can be beautifully solved with a more structured approach to presenting arguments.

I am not advocating rewriting proofs as pedantically as Lamport advocates. I am not even advocating his strict presentation. What I do like about the idea of structured proofs is the two-dimensionality of the presentation. In this I am also a bit influenced by Terry Tao’s “circuit-diagram” approach to diagramming proofs, which he used in, among other things, his Nonlinear Dispersive Equations book and his recent averaged Navier-Stokes paper.

What I have in mind is the presentation of proofs as nested sketches, with each level written more-or-less in natural language as is currently done. Each step of the proof is justified by its own “proof”. The proof can then be read at different levels of detail, and readers can choose to zoom in and study a portion of the proof when interested. Assumptions and conclusions of individual steps should be made clear; the tree structure of the presentation can help prevent circular arguments. (Some aspects of this are already present in modern mathematical papers: important intermediate arguments are often extracted in the form of lemmata and propositions. This proposal just makes everything more organized.)

This can also improve the refereeing experience. A paper can be rejected if an individual step is shown to be false. Additional clarifications can be requested if the referee feels that the paper does not go deep enough in its chain of justifications.

Technological support

This presentation of ideas does not require non-traditional media, but it can be improved by them. I’ve just run into a web app that does some approximation of what I envisioned: Gingko. It is free to use if your usage isn’t heavy.

What I would love would be for cards to also support

  • Cross referencing; currently it supports hashtags, but not referencing with a definite target card.
  • Duplicating cards; currently it supports moving, but not duplicating.
  • Multiple ancestry: it would be great if the same card can appear as the child of two different cards. But this can also be emulated with cross referencing support or duplication support.

Riemann-, Generalized-Riemann-, and Darboux-Stieltjes integrals

(The following is somewhat rough and may have typos.)

Let us begin by setting the notations and recalling what happens without the Stieltjes part.

Defn (Partition)
Let I be a closed interval. A partition P is a collection of closed subintervals \{I_\alpha\} such that

  1. P is finite;
  2. P covers I, i.e. \cup P = I;
  3. P is pairwise almost disjoint, i.e. for I_\alpha, I_\beta distinct elements of P, their intersection contains at most one point.

We write \mathscr{P} for the set of all partitions of I.

Defn (Refinement)
Fix I a closed interval, and P, Q two partitions. We say that P refines Q or that P \preceq Q if for every I_\alpha\in P there exists J_\beta \in Q such that I_\alpha \subseteq J_\beta.

Defn (Selection)
Given I a closed interval and P a partition, a selection \sigma: P \to I is a mapping that satisfies \sigma(I_\alpha) \in I_\alpha.

Defn (Size)
Given I a closed interval and P a partition, the size of P is defined as |P| = \sup_{I_\alpha \in P} |I_\alpha|, where |I_\alpha| is the length of the closed interval I_\alpha.

Remark In the above we have defined two different preorders on the set \mathscr{P} of all partitions. One is induced by the size: we say that P \leq Q if |P| \leq |Q|. The other is given by the refinement P\preceq Q. Note that neither is a partial order. (But the preorder given by refinement can be made into a partial order if we disallow zero-length degenerate closed intervals.) Note also that if P\preceq Q we must have P \leq Q.

Now we can define the notions of integrability.

Defn (Integrability)
Let I be a closed, bounded interval and f:I \to \mathbb{R} be a bounded function. We say that f is integrable with integral s in the sense of

  • Riemann if for every \epsilon > 0 there exists P_0\in \mathscr{P} such that for every P \leq P_0 and every selection \sigma:P \to I we have
    \displaystyle \left| \sum_{I' \in P} f(\sigma(I')) |I'| - s \right| < \epsilon

  • Generalised-Riemann if for every \epsilon > 0 there exists P_0 \in \mathscr{P} such that for every P \preceq P_0 and every selection \sigma: P\to I we have
    \displaystyle \left| \sum_{I' \in P} f(\sigma(I')) |I'| - s \right| < \epsilon

  • Darboux if
    \displaystyle \inf_{P\in\mathscr{P}} \sum_{I' \in P} (\sup_{I'} f )|I'| = \sup_{P\in\mathscr{P}} \sum_{I' \in P} (\inf_{I'} f )|I'| = s

From the definition it is clear that “Riemann integrable” implies “Generalised-Riemann integrable”. Furthermore, for a fixed P and any selection \sigma we clearly have
\displaystyle \sum_{I' \in P} (\inf_{I'} f) |I'| \leq \sum_{I' \in P} f(\sigma(I')) |I'| \leq \sum_{I' \in P} (\sup_{I'} f) |I'|
and that if P \preceq Q we have
\displaystyle \sum_{I' \in Q} (\inf_{I'} f) |I'| \leq \sum_{I' \in P} (\inf_{I'} f) |I'| \leq \sum_{I' \in P} (\sup_{I'} f) |I'| \leq \sum_{I' \in Q} (\sup_{I'} f) |I'|
so “Darboux integrable” also implies “Generalised-Riemann integrable”. A little bit more work shows that “Generalised-Riemann integrable” also implies “Darboux integrable” (if the suprema and infima were attained on the intervals I', this would follow immediately; in general, by the definition of suprema and infima we can find \sigma such that the Riemann sum approximates the upper or lower Darboux sums arbitrarily well).

The interesting part is the following:

Thm
Darboux integrable functions are Riemann integrable. Thus all three notions are equivalent.

Proof. Let P, Q be partitions. Let |P| \leq \inf_{I'\in Q, |I'| \neq 0} |I'|, and let m be the number of non-degenerate subintervals in Q. We have the following estimate
\displaystyle   \sum_{I'\in Q} (\inf_{I'} f) |I'| - (m-1) |P| (2\sup_I |f|) \leq \sum_{J'\in P} f(\sigma(J')) |J'| \leq \sum_{I'\in Q} (\sup_{I'} f) |I'| + (m-1) |P| (2\sup_I |f|)
The estimate follows by noting that “most” of the J'\in P will be proper subsets of I'\in Q, and there can be at most m-1 of the J' that straddle two different non-degenerate sub-intervals of Q. To prove the theorem it suffices to first choose a Q such that the upper and lower Darboux sums well-approximate the integral. Then we can conclude that for all P with |P| sufficiently small the Riemann sum is almost controlled by the Q-Darboux sums. Q.E.D.
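As a numerical sanity check (not part of the proof), one can compare the three kinds of sums for a simple function, here f(x) = x^2 on [0,1], with uniform partitions and midpoint selections (all helper names are mine):

```python
def riemann_sum(f, a, b, n):
    """Riemann sum over the uniform partition of [a, b] into n subintervals,
    with the midpoint of each subinterval as the selection."""
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) * h for i in range(n))

def darboux_sums(f, a, b, n, samples=100):
    """Approximate lower and upper Darboux sums; the sup/inf on each
    subinterval are estimated by sampling (exact for monotone pieces)."""
    h = (b - a) / n
    lower = upper = 0.0
    for i in range(n):
        vals = [f(a + i * h + j * h / samples) for j in range(samples + 1)]
        lower += min(vals) * h
        upper += max(vals) * h
    return lower, upper

f = lambda x: x * x
low, high = darboux_sums(f, 0.0, 1.0, 1000)
middle = riemann_sum(f, 0.0, 1.0, 1000)
# low <= middle <= high, and all three approach 1/3 as the partition refines.
```

Refining the partition squeezes the three values together, in line with the equivalence just proved.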

Now that we have recalled the case of the usual integrability, let us consider the case of the Stieltjes integrals: instead of integrating against \mathrm{d}x, we integrate against \mathrm{d}\rho, where \rho is, roughly speaking, a “cumulative distribution function”: we assume that \rho:I \to \mathbb{R} is a bounded monotonically increasing function.

The definition of the integrals are largely the same, except that at every step we replace the width of the interval |I'| by the diameter of \rho(I'), i.e. \sup_{I'} \rho - \inf_{I'} \rho. The arguments above immediately also imply that

  • “Riemann-Stieltjes integrable” implies “Generalised-Riemann-Stieltjes integrable”
  • “Darboux-Stieltjes integrable” implies “Generalised-Riemann-Stieltjes integrable”
  • “Generalised-Riemann-Stieltjes integrable” implies “Darboux-Stieltjes integrable”

However, Darboux-Stieltjes integrable functions need not be Riemann-Stieltjes integrable. The possibility of failure can be seen in the proof of the theorem above, where we used the fact that |P| is allowed to be made arbitrarily small. The same estimate, in the case of the Stieltjes version of the integrals, has |P| replaced by \sup_{J'\in P} (\sup_{J'} \rho - \inf_{J'} \rho), which for arbitrary partitions need not shrink to zero. For a concrete illustration, we give the following:

Let I = [0,1]. Let \rho(x) = 0 if x < \frac12 and 1 otherwise. Let f(x) = 0 if x \leq \frac12 and 1 otherwise. Let Q_0 be the partition \{ [0,\frac12], [\frac12,1]\}. We have that
\displaystyle \sum_{I'\in Q_0} (\sup_{I'} f) (\sup_{I'} \rho - \inf_{I'} \rho) = 0 \cdot (1 - 0) + 1\cdot (1 - 1) = 0
\displaystyle \sum_{I'\in Q_0} (\inf_{I'} f) (\sup_{I'} \rho - \inf_{I'} \rho) = 0 \cdot (1-0) + 0 \cdot(1-1) = 0
so we have that in particular the pair (f,\rho) is Darboux-Stieltjes integrable with integral 0. However, let k be any odd integer and consider the partition P_k of [0,1] into k equal portions. Depending on the choice of the selection \sigma, we see that the sum can take the values
\displaystyle \sum_{I'\in P_k} f(\sigma(I')) (\sup_{I'} \rho - \inf_{I'}\rho) = f(\sigma([\frac12 - \frac1{2k},\frac12 + \frac1{2k}])) (1 - 0) \in \{0,1\}
which shows that the Riemann-Stieltjes condition can never be satisfied.
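The example can be checked numerically; the sketch below (helper names are mine) computes the \rho-weighted sums for this f and \rho with left- and right-endpoint selections on P_k:

```python
def stieltjes_sum(f, rho, points, select):
    """Riemann-Stieltjes sum over the partition whose subintervals are the
    consecutive pairs of points, with selection rule select(left, right)."""
    total = 0.0
    for a, b in zip(points, points[1:]):
        total += f(select(a, b)) * (rho(b) - rho(a))
    return total

f = lambda x: 0.0 if x <= 0.5 else 1.0
rho = lambda x: 0.0 if x < 0.5 else 1.0

k = 7  # odd, so 1/2 lies in the interior of the middle subinterval
pts = [i / k for i in range(k + 1)]
left = stieltjes_sum(f, rho, pts, lambda a, b: a)   # f(left endpoint) = 0 on the middle piece
right = stieltjes_sum(f, rho, pts, lambda a, b: b)  # f(right endpoint) = 1 on the middle piece
```

Only the middle subinterval carries any \rho-mass, so the sum equals f evaluated at the selected point there: 0 for the left-endpoint selection, 1 for the right-endpoint selection, exactly the oscillation claimed above.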

The example above, where both f and \rho are discontinuous at the same point, is essentially sharp. An easy modification of the previous theorem shows that
If at least one of f,\rho is continuous, then Darboux-Stieltjes integrability is equivalent to Riemann-Stieltjes integrability.

Remark The nonexistence of the Riemann-Stieltjes integral when f and \rho have shared discontinuity points is similar in spirit to the idea in distribution theory that whether the product of two distributions is well-defined (as a distribution) depends on their wave-front sets.

An optimization problem: variation

Examining the theorem proven in the previous post, we are led naturally to ask whether there are higher order generalizations.

Question: Let f \in C^{k}([-1,1]) with f^{(k)} > 0. What can we say about the minimizer of C = \int_{-1}^1 |f(x) - p(x)|~\mathrm{d}x where p ranges over degree k-1 polynomials?

It is pretty easy to see that we expect p to intersect f at the maximum number of points, which is k. We label those points x_1, \ldots, x_k and call x_0 = -1 and x_{k+1}= 1. Then the cost function can be written as
\displaystyle C = \sum_{j = 0}^k (-1)^j \int_{x_j}^{x_{j+1}} f(x) - p(x; x_1, \ldots, x_k) ~\mathrm{d}x
Since we know the values of p at the points x_1, \ldots, x_k, we can write down the interpolation polynomial explicitly using Sylvester’s formula:
\displaystyle p = \sum_{j = 1}^k \left( \prod_{1 \leq m \leq k, m\neq j} \frac{x - x_m}{x_j - x_m} \right) f(x_j) = \sum L_j(x; x_1, \ldots, x_k) f(x_j)
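(As a quick numerical aside: the basis polynomials L_j above satisfy L_j(x_m) = \delta_{jm}, so the interpolant indeed matches f at the nodes. A minimal check in Python, with helper names of my own choosing:)

```python
def lagrange_basis(j, x, nodes):
    """L_j(x) = prod over m != j of (x - x_m) / (x_j - x_m)."""
    val = 1.0
    for m, xm in enumerate(nodes):
        if m != j:
            val *= (x - xm) / (nodes[j] - xm)
    return val

def interpolate(f, x, nodes):
    """Evaluate the interpolation polynomial p(x) = sum_j L_j(x) f(x_j)."""
    return sum(lagrange_basis(j, x, nodes) * f(xj) for j, xj in enumerate(nodes))

nodes = [-0.5, 0.0, 0.5]
f = lambda x: x ** 3 + 1  # any test function; p agrees with it at the nodes
```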

The partial derivatives are now
\displaystyle \partial_n C = \sum_{j = 0}^k (-1)^{j+1} \int_{x_j}^{x_{j+1}} \partial_n p(x; x_1, \ldots, x_k) ~\mathrm{d}x
It remains to compute \partial_n p for 1 \leq n \leq k. We observe that when n \neq j
\displaystyle \partial_n L_j = - \frac{1}{x - x_n} L_j + \frac{1}{x_j - x_n} L_j
and also
\displaystyle \partial_n L_n = - \left( \sum_{1\leq m \leq k, m\neq n} \frac{1}{x_n - x_m} \right) L_n
Combining these,
\displaystyle \partial_n p =  \sum_{j \neq n} \frac{x-x_j}{(x_j - x_n)(x - x_n)} L_j f(x_j) + L_n f'(x_n) - \left( \sum_{1\leq m \leq k, m\neq n} \frac{1}{x_n - x_m} \right) L_n f(x_n)
Now, we observe that
\displaystyle \frac{x - x_j}{x - x_n} L_j =   - \left( \prod_{m \neq n,j} \frac{x_n - x_m}{x_j - x_m} \right)  L_n
so after some computation we arrive at
\displaystyle \partial_n p = L_n(x) \cdot \left[ f'(x_n) - \sum_{j \neq n} \frac{1}{x_j - x_n} \left(\left( \prod_{m \neq j,n}\frac{x_n - x_m}{x_j - x_m}\right)f(x_j) - f(x_n) \right)\right]
which we can further simplify to
\displaystyle \partial_n p = L_n(x) \cdot \left( f'(x_n) - p'(x_n)\right)
Now, since f and p cross transversely at x_n, the difference of their derivatives is non-zero there. (This harks back to our assumption that f^{(k)} > 0.) So we are reduced, as in the case k = 2, to equations entirely independent of f.

More precisely, we see that the stationarity condition becomes the choice of x_1, \ldots, x_k such that the integrals
\displaystyle \sum_{j = 0}^k (-1)^{j} \int_{x_j}^{x_{j+1}} L_n(x)  ~\mathrm{d}x = 0
for each n. Since L_n form a basis for the polynomials of degree at most k-1, we have that the function
\chi(x) = (-1)^j \qquad x \in (x_j, x_{j+1})
is L^2 orthogonal to every polynomial of degree at most k-1. So in particular the x_j are solutions to the following system of equations
x_0 = -1, \qquad x_{k+1} = 1
\sum_{j = 0}^k (-1)^j \left[ x_{j+1}^d - x_{j}^d \right] = 0 \qquad \forall d \in \{1, \ldots, k\}

From symmetry considerations we have that x_j = - x_{k+1 - j}. This also kills about half of the equations. For the low k we have

  1. \{ 0\}
  2. \{ -1/2, 1/2\}
  3. \{-\sqrt{2}/2, 0, \sqrt{2}/2\}
  4. \{ (\pm 1 \pm \sqrt{5})/4 \}
  5. \{ 0, \pm\frac12, \pm \frac{\sqrt{3}}2 \}
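These solution sets can be verified directly against the system above (they appear to be the points \cos(j\pi/(k+1)), the zeros of the Chebyshev polynomials of the second kind, consistent with the classical theory of best L^1 approximation). A short check in Python; the helper residuals is mine:

```python
import math

def residuals(points):
    """Left-hand sides sum_{j=0}^{k} (-1)^j (x_{j+1}^d - x_j^d) for d = 1..k,
    with x_0 = -1 and x_{k+1} = 1 appended to the interior points."""
    k = len(points)
    x = [-1.0] + sorted(points) + [1.0]
    return [sum((-1) ** j * (x[j + 1] ** d - x[j] ** d) for j in range(k + 1))
            for d in range(1, k + 1)]

# Candidate interior points for k = 1..5 (note the k = 3 set is {0, +-sqrt(2)/2}):
solutions = {
    1: [0.0],
    2: [-0.5, 0.5],
    3: [-math.sqrt(2) / 2, 0.0, math.sqrt(2) / 2],
    4: [(e1 + e2 * math.sqrt(5)) / 4 for e1 in (-1, 1) for e2 in (-1, 1)],
    5: [0.0, -0.5, 0.5, -math.sqrt(3) / 2, math.sqrt(3) / 2],
}
```

Every residual vanishes (up to floating-point error) for each of the listed sets.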