Bubbles Bad; Ripples Good

… Data aequatione quotcunque fluentes quantitates involvente fluxiones invenire et vice versa … (Given an equation involving any number of fluent quantities, to find the fluxions; and vice versa.)

What is a function anyway?

I tried to teach differential geometry to a biophysicist friend yesterday (at his request; he doesn’t really need to know it, but he wanted to know how certain formulae commonly used in their literature came about). Rather surprisingly I hit an early snag. Hence the title of this entry.

Part of the problem was, as usual, my own folly. Since he is more interested in vector fields and tensor fields, I thought I could take a shortcut and introduce notions with more of a sheafy flavour. (At the end of the day, tangent and cotangent spaces are defined (rather circularly) as duals of each other, each with a partial, hand-wavy description.) I certainly didn’t expect having to spend a large amount of time explaining the concept of a function.

Conway’s Base 13 Function

(N.b. Credit where credit’s due: I learned about this function from an answer of Robin Chapman’s on MathOverflow, and its measurability from Noah Stein.)

Conway’s base 13 function is a strange beast. It was originally crafted by John Conway as a counterexample to the converse of the intermediate value theorem, and it has the property that on any open interval its image contains the entire real line. In addition, its support serves as an illustration of a dense, uncountable set of numbers whose Lebesgue measure is 0.
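Roughly speaking (this is my paraphrase of the construction, with the three extra base-13 digits written as the symbols ‘+’, ‘-’ and ‘.’ for readability; the genuine definition of course works on the infinite expansion): f(x) is nonzero exactly when the base-13 expansion of x ends in a sign digit followed by an ordinary decimal number, which f then reads off. A toy decoder for finite prefixes:

def conway_decode(digits):
    # digits: a finite prefix of the base-13 expansion of x, with the three
    # extra digit values written as the symbols '+', '-' and '.'
    signs = [i for i, d in enumerate(digits) if d in '+-']
    if not signs:
        return 0.0
    tail = digits[signs[-1] + 1:]        # everything after the last sign digit
    # the tail must be ordinary digits containing exactly one point digit
    if tail.count('.') == 1 and tail[-1] != '.' and \
            all(c in '0123456789.' for c in tail):
        sign = 1.0 if digits[signs[-1]] == '+' else -1.0
        return sign * float(tail)
    return 0.0

print(conway_decode('905+3.1415'))   # 3.1415: the tail encodes a decimal
print(conway_decode('905314150'))    # 0.0: no sign digit, so f vanishes

Since the tail of the expansion can be prescribed arbitrarily within any open interval, every real value is attained there; this is the source of the intermediate-value pathology.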

Why learn maths?

There is a quite well-known joke that goes something like this: a mathematician, a physicist, and an engineer are each handed a little red rubber ball, and are told to find out its volume. The mathematician takes a caliper and measures the diameter of the ball, and uses the formula V = \frac{1}{6}\pi d^3. The physicist applies Archimedes’ principle: he submerges the ball in oil, and measures the displacement. The engineer goes back to his lab, walks over to the bookshelf, and opens this gigantic leather-bound volume to the Table of Little-red-rubber-balls.

To continue my earlier rant, it appears that I am not the only one to think that modern mathematics education is misguided. It took Underwood Dudley (warning: a subscription to the AMS Notices is required to access the PDF file) 8 pages to get to it, but in the end, he comes to the same conclusion as I did, albeit from a different starting point. The joke above is meant to illustrate Professor Dudley’s main argument: that despite grandiose claims by the National Academy of Sciences or the National Research Council, a mathematics education does not provide a practical skill set that is necessary for most jobs. (I will come back to the emphasis on the word “practical” in a bit.) With many examples, Dudley’s essay illustrates a common fallacy: that mathematics is important to learn because frequently in life (especially at work) one will encounter situations which call for computations beyond basic arithmetic.

Perhaps I should make the distinction clear here: I am not claiming that mathematics education is useless. I am just observing the fact of life that most people, going about their everyday lives, will very infrequently, if ever, encounter a situation that requires a finer understanding of mathematics beyond the middle school level (US; elementary school in East Asia). And therefore we, as scientists/mathematicians/educators/parents, should not oversell the learning of mathematics as something crucial to one’s future, and bully the kids into studying it. Or, in other words, while I think that familiarity, nay, fluency, with mathematical concepts is a requirement for a well-educated man, I do not consider such erudition to be a requirement for leading a productive life in society.

The mathematics curriculum (and by extension the physical sciences, history, and all the other more academic classes in American schools) should not invent reasons to convince the students that mathematics is used in all facets of life, and hence important. The knowledge of how to change a flat tire is likely to be much more practical for the average Joe over his lifetime than the knowledge of how to solve a quadratic equation. (Most of what you need to know to succeed in life, you learn in kindergarten anyway.) That people come to question what goes into a general education based on potential on-the-job utility is completely misguided: a general education for the populace should not be equated with vocational training. A general education should train the students in the ability to reason, to think soundly, to approach problems logically. A flexible mind that is open to new ideas and is capable of solving problems is an asset applicable to any job. By narrow-mindedly restricting one’s attention to the immediate and direct applications of classroom subjects, one runs the risk of missing the grander picture, in which the whole is more than just the sum of its parts.

Arrow’s Impossibility Theorem

Partially prompted by Terry’s buzz, I decided to take a look at Arrow’s Impossibility Theorem. I had heard the name before, since I participated in CollegeBowl as an undergraduate, and questions about Arrow’s theorem are perennial favourites there. The theorem’s most famous interpretation is in voting theory:

Some definitions

  1. Given a set of electors E and a finite set of candidates C, a preference \pi assigns to each elector e \in E an ordering of the set C. In particular, we can write \pi_e(c_1) > \pi_e(c_2) for the statement “the elector e prefers candidate c_1 to candidate c_2”. The set of all possible preferences is denoted \Pi.
  2. A voting system v assigns to each preference \pi\in\Pi an ordering of the set C.
  3. Given a preference \pi and two candidates c_1,c_2, a bloc biased toward c_1 is defined as the subset b(\pi,c_1,c_2) := \{ e\in E | \pi_e(c_1) > \pi_e(c_2) \}.
  4. The voting system is said to be
    1. unanimous if, whenever all electors prefer candidate c_1 to c_2, the voting system returns the same verdict. In other words, “\pi_e(c_1) > \pi_e(c_2) \; \forall e\in E \implies v(\pi,c_1) > v(\pi,c_2)”.
    2. independent if the voting results comparing candidates c_1 and c_2 only depend on the individual preferences between them. In particular, whether v(\pi,c_1) > v(\pi,c_2) only depends on b(\pi,c_1,c_2). An independent system is said to be monotonic if, in addition, a strictly larger biased bloc will give the same voting result: if v(\pi,c_1) > v(\pi,c_2) and b(\pi,c_1,c_2) \subset b(\pi',c_1,c_2), then v(\pi',c_1) > v(\pi',c_2) necessarily.
    3. dictator-free if there is no elector e_0\in E whose vote always coincides with the end result. In other words, we define a dictator to be an elector e_0 such that v(\pi,c_1) > v(\pi,c_2) \iff \pi_{e_0}(c_1) > \pi_{e_0}(c_2) for any \pi\in \Pi, c_1,c_2\in C.
  5. A voting system is said to be fair if it is unanimous, independent and monotonic, and has no dictators. (See the toy sketch following this list.)
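To make these definitions concrete, here is a small Python sketch (my own illustration, not from the original post; all names and encodings are made up): with two electors and three candidates the set \Pi is finite, so properties like unanimity and dictatorship can be checked by brute force.

from itertools import permutations, product

C = ('c1', 'c2', 'c3')            # candidates
E = (0, 1)                        # electors

# the set Pi: every assignment of an ordering of C (best first) to each elector
ALL_PROFILES = [dict(zip(E, rankings))
                for rankings in product(permutations(C), repeat=len(E))]

def prefers(ranking, a, b):
    # does this ordering place candidate a above candidate b?
    return ranking.index(a) < ranking.index(b)

def dictatorship(profile, e0=0):
    # the (unfair) voting system that simply copies elector e0's ordering
    return profile[e0]

def is_unanimous(v):
    return all(prefers(v(pi), a, b)
               for pi in ALL_PROFILES for a in C for b in C
               if a != b and all(prefers(pi[e], a, b) for e in E))

def has_dictator(v):
    return any(all(prefers(v(pi), a, b) == prefers(pi[e0], a, b)
                   for pi in ALL_PROFILES for a in C for b in C if a != b)
               for e0 in E)

print(is_unanimous(dictatorship))   # True: unanimity holds trivially
print(has_dictator(dictatorship))   # True: elector 0 is a dictator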

And the theorem states

Arrow’s Impossibility Theorem
In an election consisting of a finite set of electors E and a set C of at least three candidates, there can be no fair voting system.

As we shall see, the finiteness of the set of electors and the lower bound on the number of candidates are both crucial. In the case where there are only two candidates, the simple-majority test is a fair voting system. (Finiteness is more subtle.) It is also easy to see that if we allow dictators, i.e. force the voting results to align with the preferences of a particular predetermined individual, then unanimity, independence, and monotonicity are all trivially satisfied.

What’s wrong with the simple-majority test when there are three or more candidates? The problem is that it is not, by definition, a proper voting system: it can create loops! Imagine we have three electors e_1, e_2, e_3 and three candidates c_1,c_2,c_3. The simple-majority test says that v(\pi,c_1) > v(\pi,c_2) if and only if two or more of the electors prefer c_1 to c_2. But this causes a problem in the following scenario:

e_1: c_1 > c_2 > c_3
e_2: c_2 > c_3 > c_1
e_3: c_3 > c_1 > c_2

then the voting result will have v(c_1) > v(c_2), v(c_2) > v(c_3), and v(c_3) > v(c_1), a circular situation which implies that the “result” is not an ordering of the candidates! (An ordering of the set requires the comparisons to be transitive.) So the simple-majority test is, in fact, not a valid voting system.
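Continuing the toy sketch above (and reusing its prefers helper), we can let the machine confirm the cycle:

cycle = {1: ('c1', 'c2', 'c3'),   # e_1
         2: ('c2', 'c3', 'c1'),   # e_2
         3: ('c3', 'c1', 'c2')}   # e_3

def majority_prefers(profile, a, b):
    # "two or more of the electors prefer a to b"
    return sum(prefers(r, a, b) for r in profile.values()) >= 2

for a, b in (('c1', 'c2'), ('c2', 'c3'), ('c3', 'c1')):
    print(a, 'beats', b, ':', majority_prefers(cycle, a, b))
# all three lines print True: c1 > c2 > c3 > c1, not an ordering of C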

From this first example we see already that, when there are more than 2 candidates, designing a voting system is a non-trivial problem. Making one fair, as we shall see, will be impossible.

Compactness. Part 1: Degrees of freedom

Preamble
This series of posts has its genesis in a couple of questions posted on MathOverflow.net. It is taking me a bit longer to write than expected: I am not entirely sure what level of audience I should address the posts to. Another complication is that it is hard to get ideas organized when I only write about this during downtime, when I hit a snag in research. The main idea that I want to get to is that “a failure of sequential compactness of a domain represents the presence of some sort of infinity.” In this first post I will motivate the definition of compactness in mathematics.

Introduction
We start with the pigeonhole principle: “If one tries to stuff pigeons into pigeonholes, and there are more pigeons than there are pigeonholes, then at least one hole is shared by multiple pigeons.” This can be easily generalized to the case where there are infinitely many pigeons, which is the version we will use:

Pigeonhole principle. If one tries to stuff infinitely many pigeons into a finite number of pigeonholes, then at least one hole is occupied by infinitely many pigeons.

Now imagine being the pigeon sorter: I hand you a pigeon, you put it into a pigeonhole. However you sort the pigeons, you are making an (infinite) sequence of assignments of pigeons to pigeonholes. Mathematically, you are producing a sequence of points (choices) in a finite set (the list of all holes). In this example, we will say that a point in the set of choices is a cluster point relative to a sequence of choices if that particular point is visited infinitely many times. A mathematical way of formulating the pigeonhole principle then is:

Let S be a finite set. Let (a_n) be a sequence of points in S. Then there exists some point p \in S such that p is a cluster point of (a_n).
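A computer cannot check infinitely many pigeons, but a finite truncation (my own toy illustration) already shows the counting at work: among the first N terms of a sequence in a k-element set, some value must appear at least N/k times, and letting N grow forces some value to recur without bound.

from collections import Counter
import random

k, N = 10, 10_000                     # 10 pigeonholes, first N pigeons
seq = [random.randrange(k) for _ in range(N)]    # an arbitrary sorting
hole, visits = Counter(seq).most_common(1)[0]    # the most crowded hole
assert visits >= N // k               # pigeonhole: at least N/k pigeons share it
print('hole', hole, 'already holds', visits, 'pigeons')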

The key here is the word finite: if you have an infinitely long row of pigeonholes (say running to the right of you), and I keep handing you pigeons, then as long as you keep stepping to the right after stuffing a pigeon into a hole, you can keep a “one pigeon per hole (or fewer)” policy.

Minimal blow-up solution to an inhomogeneous NLS

Yesterday I went to a wonderful talk by Jeremie Szeftel on his recent joint work with Pierre Raphaël. The starting point is the following equation:

Eq 1. Homogeneous NLS
i \partial_t u + \triangle u + u|u|^2 = 0 on [0,T) \times \mathbb{R}^2

It is known as the mass-critical nonlinear Schrödinger equation. One of its very interesting properties is that it admits a soliton solution u(t,x) = Q(x)e^{it}, where Q is the unique positive (real) radial solution to

Eq 2. Soliton equation
\triangle Q - Q + Q^3 = 0

Plugging this ansatz into the homogeneous NLS, we see that the solution evolves purely by phase rotation: it represents a non-dispersing standing wave. Physically, this is the case when the attractive self-interaction nonlinearity exactly balances out the tendency of a wave to disperse.
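Spelling out the computation (each term picks up the same phase factor):

\displaystyle i\partial_t\bigl(Q e^{it}\bigr) + \triangle\bigl(Q e^{it}\bigr) + Q e^{it}\bigl|Q e^{it}\bigr|^2 = \bigl(-Q + \triangle Q + Q^3\bigr)e^{it} = 0,

which vanishes precisely because Q solves the soliton equation (Eq 2).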

As one can easily see from its form, the homogeneous NLS has a large class of continuous symmetries (written out explicitly after the list):

  • Time translation
  • Spatial translation
  • Phase rotation (multiplication by a constant phase e^{i\theta})
  • Dilation
  • Galilean boosts
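Explicitly, these act as follows (my reconstruction of the standard formulas for Eq 1, with group parameters t_0, \theta \in \mathbb{R}, \lambda > 0, and x_0, \beta \in \mathbb{R}^2):

\displaystyle u(t,x) \mapsto u(t+t_0, x), \quad u(t, x+x_0), \quad e^{i\theta}u(t,x), \quad \lambda\, u(\lambda^2 t, \lambda x), \quad e^{i(\beta\cdot x - |\beta|^2 t)}\, u(t, x - 2\beta t),

and one checks directly that each of these maps solutions of Eq 1 to solutions.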

(It is also symmetric under rotations, but as the spatial rotation group is compact, it cannot cause problems for the analysis [a lesson from concentration compactness; I’ll write about this another time], so we’ll just forget about it for the time being.) The NLS also admits the so-called pseudo-conformal transformation, a discrete \mathbb{Z}_2 action: the replacement

Eq 3. Pseudo-conformal inversion
\displaystyle u(t,x) \longrightarrow \frac{1}{|t|}\bar{u}(\frac{1}{t},\frac{x}{t}) e^{i x^2 / (4t)}

maps a solution to another solution. A particularly interesting phenomenon related to this additional symmetry is the existence of the minimal mass blow-up solution: by acting on Q (the soliton) with the pseudo-conformal transform, we obtain a solution that blows up in finite time. But why do we call this a “minimal mass” solution? Because it was previously shown by Michael Weinstein (I think) that for any initial data to the NLS with mass (L^2 norm) smaller than that of Q, the solution must exist for all time, whereas for any value of mass strictly above that of Q, one can find a solution (in fact, multiple solutions) that blows up in finite time. With concentration compactness methods, Frank Merle was able to show that the pseudo-conformally inverted Q is the only initial data with that fixed mass that leads to finite-time blow-up.
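Concretely (my own unpacking of this claim, up to constants and sign conventions): applying Eq 3 to the soliton solution Q(x)e^{it} gives

\displaystyle S(t,x) = \frac{1}{|t|}\, Q\!\left(\frac{x}{t}\right) e^{i x^2/(4t) - i/t},

which solves Eq 1 for t < 0, has \|S(t)\|_{L^2} = \|Q\|_{L^2} (the change of variables y = x/t preserves the L^2 norm in two space dimensions), and concentrates at the origin as t \to 0^-.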

In some sense, however, the homogeneous NLS is too nice an equation: because of its astounding number of symmetries, one can write down an explicit, self-similar blow-up solution just via the pseudo-conformal transform. A natural question to ask is whether the existence/uniqueness of a minimal mass blow-up solution persists for more generic-looking equations. The toy model one is led to consider is

Eq 4. Inhomogeneous NLS
i\partial_t u + \triangle u + k(x) u|u|^2 = 0

for which the k(x) = 1 case reduces to the homogeneous equation. The addition of the inhomogeneous term kills all of the symmetries except phase rotation and time translation. The former is a trivial symmetry (its orbit is compact, so it poses no difficulty), while the latter is important since it generates the conservation of energy for this equation.

In the case where k(x) is a differentiable, bounded function, some facts about this equation are known through the work of Merle. Without loss of generality, we will assume from now on that k(x) \leq 1 (we can always arrange for this by rescaling the equation). It was found that in this case, if the initial mass of the data is smaller than that of Q, then again we have global existence of the solution. Heuristically, the idea is that k(x) measures the self-interaction strength of the particle, which can vary with its spatial position: the larger the value of k, the stronger the interaction. Now, in the homogeneous case low-mass initial data does not have enough matter to lead to a strong enough self-interaction, so the dispersive behavior dominates and there cannot be concentration of energy and blow-up. We therefore expect that for interactions strictly weaker than the homogeneous case (k \leq 1), the dispersion should still dominate over the attractive self-force.

Furthermore, Merle also found that a minimal mass blow-up solution to the inhomogeneous NLS can only occur if k(x) attains the value 1 at some finite point (an interior maximum), with k(x) bounded strictly away from 1 outside some large compact set. In this case, the blow-up can only occur in such a way as to concentrate energy at the maximum point. Heuristically, again, this is natural: the strong self-interaction gives a lower potential energy, so it is natural to expect the particle to slide down into this potential well as it concentrates. If the potential instead asymptoted to its minimum at infinity, one might expect the wave to slide out to infinity and disperse; so it is important to have a strict maximum of the interaction strength in the interior.

Szeftel and Raphaël’s work shows that such a blow-up solution indeed exists, and is in fact unique.

Around a local maximum x_0 of k(x), we can (heuristically) expand in Taylor polynomials. That x_0 is a local maximum (with k(x_0) = 1) implies that we have, schematically,

k(x) = 1 + c_2\nabla^2k(x_0)\,(x-x_0)^2 + c_3\nabla^3k(x_0)\,(x-x_0)^3 + \ldots

In the case where the Hessian term vanishes, by “zooming in” along a pre-supposed self-similar blow-up with rates identical to the one induced by the pseudo-conformal transform in the homogeneous case, we can convince ourselves that the more we zoom in, the flatter k(x) looks. In this case it is not too unreasonable that the focusing behavior of the homogeneous case carries over: by zooming in sufficiently we rapidly approach a situation which is locally identical to the homogeneous case. If the energy is already concentrating, then the errors introduced at “large distances” will be small and controllable. This suggests that the problem admits a purely perturbative treatment. This, indeed, is the case, as Banica, Carles, and Duyckaerts have shown.

On the other hand, if the Hessian term does not vanish, one sees that it remains scale-invariant down to the smallest scales. In other words, no matter how far we zoom in, the pinnacle at k(x_0) will always look curved. In this situation a perturbative method is less suitable, and this is the regime in which Szeftel and Raphaël work.
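To see this schematically (my own gloss): zooming in at scale \lambda around the maximum,

\displaystyle k(x_0 + \lambda y) = 1 + \frac{\lambda^2}{2}\nabla^2 k(x_0)[y,y] + O(\lambda^3 |y|^3).

When the Hessian vanishes, the deviation from k \equiv 1 is O(\lambda^3) and dies off quickly as \lambda \to 0, which is what makes the perturbative treatment viable; when it does not, the \lambda^2 term reproduces the same curved quadratic profile at every scale.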

The trick, it seems to me, is the following (I don’t think I completely understand all the intricacies of the proof; here I’ll just talk about the impression I got from the talk and from looking a bit at the paper): it turns out that by inverting the pseudo-conformal transform, we can reformulate the blow-up equation as a large-time equation in some rescaled variables, where now the potential k depends on the scaling parameter, which in turn depends on time. The idea is to “solve backwards from infinity”. If we naïvely plug in the stationary solution Q at infinity, there will be error terms when we evolve back, and what we want to do is capture those error terms. If we directly linearize the equation around Q, we will pick up negative eigenmodes, which lead to exponential blow-up and destroy our ansatz. To overcome this difficulty, as is standard, the authors applied modulation theory. The idea behind modulation theory is that all the bad eigenmodes of the linearized equation of a good Hamiltonian system should be captured by the natural symmetries. In this case, we don’t have any natural symmetries to use, but we have “almost” symmetries coming from the homogeneous system. So we consider the manifold of functions spanned by symmetry transformations of Q, and decompose the solution into a projection part \tilde{u}, which lives on the manifold, and an orthogonal part u'. In this way, all the wild, uncontrolled directions of the flow are captured in some sort of motion on the symmetry manifold. We don’t actually particularly care how that flow happens, as the flow on the manifold preserves norms. The only bit of the flow we care about is how it converges as time approaches the blow-up time: this is what gives us the blow-up rate of the equation.
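Schematically, such a decomposition takes the familiar modulation form (my paraphrase of the standard ansatz for such problems, not the paper’s exact formulation):

\displaystyle u(t,x) = \frac{1}{\lambda(t)}\,\bigl(Q + \varepsilon\bigr)\!\left(t, \frac{x - x(t)}{\lambda(t)}\right) e^{i\gamma(t)},

where the modulation parameters \lambda(t), x(t), \gamma(t) trace the motion on the symmetry manifold and \varepsilon plays the role of the orthogonal part u'.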

As it turns out, this decomposition is a very good one: the analysis shows that the flow on the manifold is a good approximation (to fourth order) of the actual physical flow. This means that the orthogonal error u' is going to be rather small and controllable. Of course, establishing these estimates is a lot of hard work; fundamentally, however, the idea is a beautiful one.

Trip to Oxford; a little lattice problem

Yesterday I gave the relativity seminar at Oxford (the last one to be organized by Piotr Chrusciel, who will be moving to Vienna soon). And I committed one of the cardinal sins of talks: I “put away” information too quickly.

I fully intend to blame it on the technology.

Usually I don’t have this problem with board talks. When I have a set number of fixed blackboards, I will go from board 1 to board 2 to board 3 and then back to board 1, sometimes using board 4 to keep a running tab of important details. When I have sliding blackboards (the kind that has two or three layers of boards, where you can slide a completed board up and write on the one underneath), I usually go from the top layer of board 1 to the bottom layer, then the top layer of board 2, then its bottom layer. After filling up all the boards I erase what I don’t need and recycle those boards. Oxford has a rather different system than what I am used to. Firstly, they use whiteboards. While it is more comfortable to hold a marker than a piece of chalk, my handwriting is actually slower and more legible with chalk on a blackboard. But most importantly, they have an interesting design of whiteboards: the writing surface is essentially a belt. Imagine three whiteboards looped together so that the bottom of board 1 is connected to the top of board 2, the bottom of board 2 to the top of board 3, and the bottom of board 3 to the top of board 1. Now mount this belt on a frame that is one and a half boards tall, so you can scroll up and down. Now put three of these monsters side by side.

Organizing references

Sort of on the opposite end of what I wrote about last week (on managing edits of my own papers), I recently came across a good way of managing citations and references.

Previously I’ve been compiling BibTeX files by hand: I have a complicated scheme set up with a central bib directory. Inside it is a master.bib file and a bunch of subdirectories. The subdirectories are categories: general relativity, PDE, differential geometry, etc. In each category are several smaller bib files. Each bib file represents a subcategory, and the BibTeX keys I use are always of the form “[category]-[subcategory]-[authors’ initials]-[year]”. By keeping separate bib files for separate subcategories, editing is easier. And I have a script that recompiles all the bib entries from the subcategory files and dumps them into the master.bib file. Furthermore, the script also compiles the key-author-title information and dumps it into a dictionary file, which I can then import in Vim to get quick citations. (I’m actually quite proud of this abuse of the Vim dictionary.)
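The script itself is nothing deep; here is a hypothetical Python reconstruction of what it does (the directory layout, file names, and dictionary format are guesses for illustration, and the entry-scraping regex is deliberately crude):

import re
from pathlib import Path

BIB_ROOT = Path('~/bib').expanduser()   # assumed layout: bib/<category>/<subcat>.bib

# concatenate every subcategory bib file into master.bib
parts = [p.read_text() for p in sorted(BIB_ROOT.glob('*/*.bib'))]
(BIB_ROOT / 'master.bib').write_text('\n'.join(parts))

# scrape key/author/title triples into a dictionary file for Vim
# (crude: assumes brace-delimited fields with author appearing before title)
entry = re.compile(r'@\w+\{(?P<key>[^,]+),'
                   r'.*?author\s*=\s*\{(?P<author>[^}]*)\}'
                   r'.*?title\s*=\s*\{(?P<title>[^}]*)\}',
                   re.DOTALL | re.IGNORECASE)
with open(BIB_ROOT / 'bibkeys.dict', 'w') as out:
    for m in entry.finditer('\n'.join(parts)):
        out.write(f"{m['key']}\t{m['author']}: {m['title']}\n")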

I’ve heard many good things about Zotero and Mendeley, and I have seen Zotero in action. While I really like its direct integration with Firefox, which allows it to very simply import citations from the web, I am not so thrilled about using a cloud service for managing my citations.

By way of Slashdot, I learned about JabRef. It is Java-based and thus automatically platform-agnostic (so I can use it both on my primary Linux computers and on my Windows computer if I have to). I like its ability to group/tag articles, making them easier to find, and its BibTeX-based backend (making exports simpler). Furthermore, one can link entries to stored PDF files. And with the database stored locally, syncing the bib file and the “links” is as simple as copying the entire directory.

Gradually I plan to migrate not just the papers that I cite, but also the papers that I read, to the database. It provides a good, searchable interface for papers in general. Right now finding a paper on my hard drive involves invoking the unix command “find”, and requires remembering either the last name of one of the authors or some keyword in the title in the proper abbreviation. This has contributed, in part, to my having two or three copies of the same papers on my hard drive. JabRef will certainly help.

I also like the fact that it has a Review field in the bib entry, so you can jot down some thoughts you (or other people) have about the paper.

Snowflakes

It took me two tries to get out of my flat this morning. I really ought to get into the habit of looking out the window in the morning; too often do I open the front door, ready myself to step out, only to turn back to fetch my umbrella. The annoying thing about snow is that I can’t hear it, unlike the pitter-patter of rain.

Somehow or another I ended up looking at Wilson Bentley’s micro-photographs of snow crystals. And a question formed in my mind: “Why are they all so symmetrical?” If all snowflakes were to look alike, then perhaps the dynamics leading to the formation of snow crystals would be stable, and the global convergence onto a completely symmetrical pattern would not be surprising. But not all snowflakes look alike. In fact, colloquially we speak of them as each completely different from every other. This implies that the dynamics of snow crystal growth should be at least somewhat sensitive to atmospheric conditions on a local scale (and perhaps to the nucleus that seeds the crystal), so that the seemingly random to-and-fro dance as the snowflake falls from the sky can effect different shapes and branches.

Now, much experimental evidence has shown that the formation of ice crystals tends to be catalyzed by impurities. Pure water can be supercooled, at normal pressure, to temperatures below 273 Kelvin. But in these situations a single mote of impurity dropped into the water can cause the entire beaker to freeze over suddenly. Similarly, ice crystals in the upper atmosphere tend to form around impurities: bacteria floating in the air, dust or ash, or perhaps particles introduced artificially. So one may surmise that all 6 branches of a snowflake grow in the same way because, somehow, the eventual shape of the snowflake is already encoded in the original formation of the central nucleus. Let me try to explain why this hypothesis is not very convincing. I’ll make one a priori assumption: that the growth of a crystal structure is purely local, and not due to some long-range interaction.

To draw an analogy, consider a large group of acrobats trying to bring themselves into a formation around a leader. Disallowing long-range interaction can be thought of as requiring that the leader cannot shout out orders to individual troupe members. But we do allow passing of information by short-range interactions, i.e. whispering instructions to the people already in formation. So the leader stands alone at the start. Then he grabs a few people nearby to form the nucleus. Then he tells each of the people he grabbed a set of instructions on how to grab more people, where to put them, and what instructions to pass on to those people (included in these instructions are instructions to be passed on to the even more remote layer of acrobats, and so on). If the instructions are passed correctly, a completely ordered pattern will form. But as anyone who has played the game of telephone can testify, in these scenarios some errors will always work their way into the instructions. In the physical case of snowflakes, these are thermodynamical fluctuations. So some irregularities should happen.

Now, if the instructions the leader is trying to pass down are very short and easy to remember, the errors tend not to build up, and the formation will, for the most part, be correct. But keeping the message short has the drawback that the total number of formations one can form is smaller. In the snowflake case, one can imagine that each small group of molecules in the snow crystal can encode some fixed amount of information. If the encoding is very redundant (so the total number of shapes is small), then the thermodynamical fluctuations will not be likely to break the symmetries between the arms. But considering the large number of possible shapes of snowflakes, such encoding of information should be taxed to the limit, and small fluctuations (errors in the game of telephone) should be able to lead one arm to look drastically different from the others. One possible way to get around this difficulty would be to use some sort of self-similarity principle. But this would suggest that snowflakes are true fractals, which they are not.
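The error-buildup point is easy to simulate (a toy illustration of the telephone analogy, nothing more): relay a message down a chain, corrupting each symbol with some small probability per hop, and watch long messages get garbled while short ones usually survive.

import random

def relay(message, hops, p=0.01):
    # pass `message` down a chain of `hops` acrobats; each symbol is
    # mis-heard (replaced by a random digit) with probability p per hop
    msg = list(message)
    for _ in range(hops):
        msg = [random.choice('0123456789') if random.random() < p else s
               for s in msg]
    return ''.join(msg)

random.seed(0)
for msg in ('42', '31415926535897932384626433832795028841971693993751'):
    intact = sum(relay(msg, hops=100) == msg for _ in range(1000))
    print(f'{len(msg)}-symbol instructions intact after 100 hops: {intact / 10:.1f}%')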

The “Hoop Conjecture” of Kip Thorne and Spherically Symmetric Space-times

Abstract. (This being a rather long post, I feel the need to write one.) In the post I first gather some miscellaneous thoughts on what the hoop conjecture is and why it is difficult to prove in general. After this motivation, I show how the conjecture becomes much easier to state and prove in spherical symmetry: the entire argument collapses to an exercise in ordinary differential equations. In particular, I demonstrate a theorem that is analogous to, yet slightly different from, a recent result of Markus Khuri, using much simpler machinery.

The Hoop conjecture is a proposed criterion for when a black hole will form under gravitational collapse. Kip Thorne, in 1972 [see Thorne, Nonspherical Gravitational Collapse: a Short Review in Magic without Magic], made the conjecture that (I paraphrase here)

Horizons form when and only when a mass M gets compacted into a region whose circumference C in EVERY direction is bounded by C \lesssim M.

This conjecture, now widely known under the name of the “Hoop conjecture”, is deliberately vague. (This seems to have been the trend in physics, especially in general relativity: conjectures are often stated in such a way that half the effort spent in proving them goes into finding the correct formulation of the statement itself.)