… Data aequatione quotcunque fluentes quantitates involvente fluxiones invenire et vice versa …

## Category: general mathematics

### An optimization problem: variation

Examining the theorem proven in the previous post, we are led naturally to ask whether there are higher order generalizations.

Question: Let $f \in C^{k}([-1,1])$ with $f^{(k)} > 0$. What can we say about the minimizer of $C = \int_{-1}^1 |f(x) - p(x)|~\mathrm{d}x$ where $p$ ranges over degree $k-1$ polynomials?

It is pretty easy to see that we expect $p$ to intersect $f$ at the maximum number of points, which is $k$. We label those points $x_1, \ldots, x_k$ and call $x_0 = -1$ and $x_{k+1}= 1$. Then the cost function can be written as
$\displaystyle C = \sum_{j = 0}^k (-1)^j \int_{x_j}^{x_{j+1}} f(x) - p(x; x_1, \ldots, x_k) ~\mathrm{d}x$
Since we know that values of $p$ at the points $x_1, \ldots, x_k$ we can write down the interpolation polynomial explicitly using Sylvester’s formula:
$\displaystyle p = \sum_{j = 1}^k \left( \prod_{1 \leq m \leq k, m\neq j} \frac{x - x_m}{x_j - x_m} \right) f(x_j) = \sum L_j(x; x_1, \ldots, x_k) f(x_j)$

The partial derivatives are now
$\displaystyle \partial_n C = \sum_{j = 0}^k (-1)^{j+1} \int_{x_j}^{x_{j+1}} \partial_n p(x; x_1, \ldots, x_k) ~\mathrm{d}x$
It remains to compute $\partial_n p$ for $1 \leq n \leq k$. We observe that when $n \neq j$
$\displaystyle \partial_n L_j = - \frac{1}{x - x_n} L_j + \frac{1}{x_j - x_n} L_j$
and also
$\displaystyle \partial_n L_n = - \left( \sum_{1\leq m \leq k, m\neq n} \frac{1}{x_n - x_m} \right) L_n$
So
$\displaystyle \partial_n p = \sum_{j \neq n} \frac{x-x_j}{(x_j - x_n)(x - x_n)} L_j f(x_j) + L_n f'(x_n) - \left( \sum_{1\leq m \leq k, m\neq n} \frac{1}{x_n - x_m} \right) L_n f(x_n)$
Now, we observe that
$\displaystyle \frac{x - x_j}{x - x_n} L_j = - \left( \prod_{m \neq n,j} \frac{x_n - x_m}{x_j - x_m} \right) L_n$
so after some computation we arrive at
$\displaystyle \partial_n p = L_n(x) \cdot \left[ f'(x_n) - \sum_{j \neq n} \frac{1}{x_j - x_n} \left(\left( \prod_{m \neq j,n}\frac{x_n - x_m}{x_j - x_m}\right)f(x_j) - f(x_n) \right)\right]$
which we can further simplify to
$\displaystyle \partial_n p = L_n(x) \cdot \left( f'(x_n) - p'(x_n)\right)$
Now, since $f$ and $p$ cross transversely at $x_n$, the difference of their derivatives is non-zero. (This harks back to our assumption that $f^{(k)} > 0$.) So we are down, as in the case where $k = 2$, to equations entirely independent of $f$.

More precisely, we see that the stationarity condition becomes the choice of $x_1, \ldots, x_k$ such that the integrals
$\displaystyle \sum_{j = 0}^k (-1)^{j} \int_{x_j}^{x_{j+1}} L_n(x) ~\mathrm{d}x = 0$
for each $n$. Since $L_n$ form a basis for the polynomials of degree at most $k-1$, we have that the function
$\chi(x) = (-1)^j \qquad x \in (x_j, x_{j+1})$
is $L^2$ orthogonal to every polynomial of degree at most $k-1$. So in particular the $x_j$ are solutions to the following system of equations
$x_0 = -1, \qquad x_{k+1} = 1$
$\sum_{j = 0}^k (-1)^j \left[ x_{j+1}^d - x_{j}^d \right] = 0 \qquad \forall d \in \{1, \ldots, k\}$

From symmetry considerations we have that $x_j = - x_{k+1 - j}$. This also kills about half of the equations. For the low $k$ we have

1. $\{ 0\}$
2. $\{ -1/2, 1/2\}$
3. $\{-1/2, 0, 1/2\}$
4. $\{ (\pm 1 \pm \sqrt{5})/4 \}$
5. $\{ 0, \pm\frac12, \pm \frac{\sqrt{3}}2 \}$

### An optimization problem: theme

Let’s start simple:

Question 1: What is the linear function $\ell(x)$ that minimizes the integral $\int_{-1}^1 |x^2 + x - \ell(x)| ~\mathrm{d}x$? In other words, what is the best linear approximation of $x^2 + x$ in the $L^1([-1,1])$ sense?

This is something that can in principle be solved by a high schooler with some calculus training. Here’s how one solution may go:

Solution: All linear functions take the form $\ell(x) = ax + b$. The integrand is equal to $x^2 + x - \ell(x)$ when $x^2 + x \geq \ell(x)$, and $\ell(x) - x - x^2$ otherwise. So we need to find the points of intersection. This requires solve $x^2 + x - ax - b = 0$, which we can solve by the quadratic formula. In the case where $x^2 + x - a x - b$ is signed, we see that changing $b$ we can make the integrand strictly smaller, and hence we cannot attain a minimizer. So we know that at the minimizer there must exist at least one root.

Consider the case where there is only one root in the interval (counted with multiplicity), call the root $x_0$. We have that the integral to minimize is equal to
$\displaystyle \left| \int_{-1}^{x_0} x^2 + x - ax - b~\mathrm{d}x \right| + \left| \int_{x_0}^1 x^2 + x - a x - b ~\mathrm{d}x \right|$
each part of which can be computed explicitly to obtain
$\displaystyle \left| \frac13 x_0^3 + \frac13 + \frac{1-a}{2} x_0^2 - \frac{1-a}{2} - b x_0 - b \right| + \left| \frac13 - \frac13 x_0^3 + \frac{1-a}{2} - \frac{1-a}{2} x_0^2 - b + b x_0\right|$
Since we know that the two terms comes from integrands with different signs, we can combine to get
$\displaystyle \left| \frac23 x_0^3 + (1-a) x_0^2 - (1-a) - 2b x_0 \right|$
as the integrand. Now, we cannot just take the partial derivatives of the above expression with respect to $a,b$ and set that to zero and see what we get: the root $x_0$ depends also on the parameters. So what we would do then is to plug in $x_0$ using the expression derived from the quadratic formula, $x_0 = \frac{1}{2} \left( a - 1 \pm \sqrt{ (1-a)^2 + 4b}\right)$, and then take the partial derivatives. Before that, though, we can simplify a little bit: since $x_0^3 + (1-a) x_0^2 - b x_0 = 0$ from the constraint, the quantity to minimize is now
$\displaystyle \left| - \frac13 x_0^3 - (1-a) - b x_0 \right|$
A long computation taking the $\partial_b$ now shows that necessarily $x_0 = 0$, which implies that $b = 0$. But for the range of $a$ where there is only one root in the interval, the quantity does not achieve a minimum. (The formal minimizer happens at $a = 1$ but we see for this case the integrand of the original cost function is signed.

So we are down to the case where there are two roots in the interval. Now we call the roots $x_+$ and $x_-$, and split the integral into
$\displaystyle \left| \int_{-1}^{x_-} x^2 + x - ax - b~\mathrm{d}x - \int_{x_-}^{x_+} x^2 + x - ax - b~\mathrm{d}x + \int_{x_+}^1 x^2 + x - ax - b~\mathrm{d}x \right|$
and proceed as before. (The remainder of the proof is omitted; the reader is encouraged to carry out this computation out by hand to see how tedious it is.) Read the rest of this entry »

### Mariş’s Theorem

During a literature search (to answer my question concerning symmetries of “ground states” in variational problem, I came across a very nice theorem due to Mihai Mariş. The theorem itself is, more than anything else, a statement about the geometry of Euclidean spaces. I will give a rather elementary write-up of (a special case of) the theorem here. (The proof presented here can equally well be applied to get the full strength of the theorem as presented in Maris’s paper; I give just the special case for clarity of the discussion.)

### Snell’s law and geometry

Today’s post is somewhat inspired by this question on math.stackexchange. To begin with, recall Snell’s Law from geometric optics. It gives a rule for describing the propagation of light between two media of different indices of refraction: namely that across an interface where on one side the index of refraction is $n_1$ and on the other side $n_2$, the angles of incidence and refraction $\theta_1$ and $\theta_2$, as measured from the normal to the interface, satisfies the relation

$\displaystyle \frac{\sin \theta_1}{\sin \theta_2} = \frac{n_2}{n_1}$.

Here I pose two problems:

1. Given an observer situated above the interface, and a fish below the interface, find the “apparent position” of the fish according to the observer.
2. Given an observer situated above the interface, and the “apparent position” of a fish below the interface, find the actual position of the fish.

Without loss of generality we can assume that the interface is the $x$-axis, and the observer is situated at the point $(0,1)$ in the $x-y$ plane. Read the rest of this entry »