An optimization problem: variation

by Willie Wong

Examining the theorem proven in the previous post, we are led naturally to ask whether there are higher order generalizations.

Question: Let f \in C^{k}([-1,1]) with f^{(k)} > 0. What can we say about the minimizer of C = \int_{-1}^1 |f(x) - p(x)|~\mathrm{d}x as p ranges over polynomials of degree at most k-1?

It is pretty easy to see that we expect p to intersect f at the maximum number of points, which is k: the k-th derivative of f - p is f^{(k)} > 0, so repeated applications of Rolle's theorem show that f - p has at most k zeroes. We label those points x_1, \ldots, x_k and set x_0 = -1 and x_{k+1} = 1. Then the cost function can be written as
\displaystyle C = \sum_{j = 0}^k (-1)^j \int_{x_j}^{x_{j+1}} f(x) - p(x; x_1, \ldots, x_k) ~\mathrm{d}x
(up to an overall sign depending on the parity of k, which does not affect the minimization). Since we know the values of p at the points x_1, \ldots, x_k, we can write down the interpolating polynomial explicitly using the Lagrange interpolation formula:
\displaystyle p = \sum_{j = 1}^k \left( \prod_{1 \leq m \leq k, m\neq j} \frac{x - x_m}{x_j - x_m} \right) f(x_j) = \sum_{j=1}^k L_j(x; x_1, \ldots, x_k) f(x_j)
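Both halves of this setup are easy to check numerically. The following is a minimal sketch of my own (the helper names and the choice f(x) = e^x are mine, not from the post): it builds the Lagrange interpolant at the k = 2 crossing points \pm 1/2, confirms that p(x_j) = f(x_j), and confirms that the alternating sum above reproduces \int_{-1}^1 |f - p| ~\mathrm{d}x.

```python
import numpy as np
from scipy.integrate import quad

def lagrange_basis(nodes, j):
    """L_j(x) = prod over m != j of (x - x_m) / (x_j - x_m)."""
    def L(x):
        out = 1.0
        for m, xm in enumerate(nodes):
            if m != j:
                out *= (x - xm) / (nodes[j] - xm)
        return out
    return L

def interpolant(f, nodes):
    """p(x) = sum_j L_j(x) f(x_j), the interpolating polynomial."""
    basis = [lagrange_basis(nodes, j) for j in range(len(nodes))]
    return lambda x: sum(L(x) * f(xj) for L, xj in zip(basis, nodes))

f = np.exp                      # any f with f'' > 0 will do for k = 2
nodes = [-0.5, 0.5]             # the optimal crossing points when k = 2
p = interpolant(f, nodes)

# p interpolates f at the crossing points
assert all(abs(p(xj) - f(xj)) < 1e-12 for xj in nodes)

# the L^1 cost equals the alternating sum over the subintervals
pts = [-1.0] + nodes + [1.0]    # x_0 = -1, x_1, x_2, x_3 = 1
direct = quad(lambda x: abs(f(x) - p(x)), -1, 1)[0]
alternating = sum((-1) ** j * quad(lambda x: f(x) - p(x), a, b)[0]
                  for j, (a, b) in enumerate(zip(pts, pts[1:])))
print(direct, alternating)      # agree (k = 2 is even, so no sign flip)
```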

The partial derivatives with respect to the nodes are now
\displaystyle \partial_n C = \sum_{j = 0}^k (-1)^{j+1} \int_{x_j}^{x_{j+1}} \partial_n p(x; x_1, \ldots, x_k) ~\mathrm{d}x
where the boundary terms coming from differentiating the limits of integration have dropped out, since f - p vanishes at each x_j.
It remains to compute \partial_n p for 1 \leq n \leq k. We observe that when n \neq j
\displaystyle \partial_n L_j = - \frac{1}{x - x_n} L_j + \frac{1}{x_j - x_n} L_j
and also
\displaystyle \partial_n L_n = - \left( \sum_{1\leq m \leq k, m\neq n} \frac{1}{x_n - x_m} \right) L_n
So
\displaystyle \partial_n p =  \sum_{j \neq n} \frac{x-x_j}{(x_j - x_n)(x - x_n)} L_j f(x_j) + L_n f'(x_n) - \left( \sum_{1\leq m \leq k, m\neq n} \frac{1}{x_n - x_m} \right) L_n f(x_n)
Now, we observe that
\displaystyle \frac{x - x_j}{x - x_n} L_j =   - \left( \prod_{m \neq n,j} \frac{x_n - x_m}{x_j - x_m} \right)  L_n
so after some computation we arrive at
\displaystyle \partial_n p = L_n(x) \cdot \left[ f'(x_n) - \sum_{j \neq n} \frac{1}{x_j - x_n} \left(\left( \prod_{m \neq j,n}\frac{x_n - x_m}{x_j - x_m}\right)f(x_j) - f(x_n) \right)\right]
which we can further simplify to
\displaystyle \partial_n p = L_n(x) \cdot \left( f'(x_n) - p'(x_n)\right)
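This closed form is easy to sanity-check by finite differences; here is a small sketch of mine (an arbitrary node set and f(x) = e^x, with helper names of my choosing) that perturbs a single node x_n and compares the change in p with L_n(x)\,(f'(x_n) - p'(x_n)).

```python
import numpy as np

def lagrange_basis(nodes, j, x):
    """L_j(x) = prod over m != j of (x - x_m) / (x_j - x_m)."""
    return np.prod([(x - xm) / (nodes[j] - xm)
                    for m, xm in enumerate(nodes) if m != j])

def p(f, nodes, x):
    """Interpolant of f at the nodes, evaluated at x."""
    return sum(lagrange_basis(nodes, j, x) * f(xj)
               for j, xj in enumerate(nodes))

f, df = np.exp, np.exp              # f and its derivative
nodes = [-0.7, -0.1, 0.4]           # arbitrary nodes, k = 3
n, x, h = 1, 0.23, 1e-6             # perturb x_n = nodes[n]; evaluate at x

# finite-difference partial derivative of p in the node x_n
up = nodes.copy(); up[n] += h
dn = nodes.copy(); dn[n] -= h
fd = (p(f, up, x) - p(f, dn, x)) / (2 * h)

# the claimed closed form L_n(x) (f'(x_n) - p'(x_n)),
# with p'(x_n) also computed by central differences
dp = (p(f, nodes, nodes[n] + h) - p(f, nodes, nodes[n] - h)) / (2 * h)
closed = lagrange_basis(nodes, n, x) * (df(nodes[n]) - dp)
print(fd, closed)                   # agree up to finite-difference error
```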
Now, since f and p cross transversely at x_n, the difference of their derivatives is non-zero. (This harks back to our assumption that f^{(k)} > 0.) So we are reduced, as in the case k = 2, to equations entirely independent of f.

More precisely, the stationarity condition becomes the requirement that x_1, \ldots, x_k be chosen so that
\displaystyle \sum_{j = 0}^k (-1)^{j} \int_{x_j}^{x_{j+1}} L_n(x)  ~\mathrm{d}x = 0
for each n. Since the L_n form a basis for the polynomials of degree at most k-1, this means that the function
\chi(x) = (-1)^j \qquad x \in (x_j, x_{j+1})
is L^2 orthogonal to every polynomial of degree at most k-1. So in particular the x_j are solutions to the following system of equations
x_0 = -1, \qquad x_{k+1} = 1
\sum_{j = 0}^k (-1)^j \left[ x_{j+1}^d - x_{j}^d \right] = 0 \qquad \forall d \in \{1, \ldots, k\}
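As a sanity check, take k = 2: the d = 1 equation reads
\displaystyle (x_1 + 1) - (x_2 - x_1) + (1 - x_2) = 2 + 2 x_1 - 2 x_2 = 0
while the d = 2 equation reads
\displaystyle (x_1^2 - 1) - (x_2^2 - x_1^2) + (1 - x_2^2) = 2 x_1^2 - 2 x_2^2 = 0.
The second forces x_2 = -x_1, and substituting into the first gives 2 + 4 x_1 = 0, i.e. the crossing points x_1 = -\frac12 and x_2 = \frac12.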

From symmetry considerations we have that x_j = - x_{k+1 - j}, which also kills about half of the equations. For small k the solutions are

  1. \{ 0\}
  2. \{ -1/2, 1/2\}
  3. \{ -\frac{\sqrt{2}}2, 0, \frac{\sqrt{2}}2 \}
  4. \{ (\pm 1 \pm \sqrt{5})/4 \}
  5. \{ 0, \pm\frac12, \pm \frac{\sqrt{3}}2 \}
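To double-check the table, here is a rough numerical sketch of mine (it assumes scipy is available; the helper name is hypothetical) that solves the moment system above for k = 1, \ldots, 5. The solutions match the list, and also match x_j = \cos\big((k+1-j)\pi/(k+1)\big), the zeros of the Chebyshev polynomial U_k of the second kind — consistent with the classical description of the crossing points of best L^1 approximations.

```python
import numpy as np
from scipy.optimize import fsolve

def moment_equations(x, k):
    """sum_j (-1)^j (x_{j+1}^d - x_j^d) = 0 for d = 1, ..., k."""
    pts = np.concatenate(([-1.0], x, [1.0]))   # x_0 = -1, x_{k+1} = 1
    return [sum((-1) ** j * (pts[j + 1] ** d - pts[j] ** d)
                for j in range(k + 1))
            for d in range(1, k + 1)]

for k in range(1, 6):
    guess = np.linspace(-0.9, 0.9, k)          # ordered initial guess
    sol = np.sort(fsolve(moment_equations, guess, args=(k,)))
    chebyshev = np.cos(np.arange(k, 0, -1) * np.pi / (k + 1))
    print(k, np.round(sol, 6), np.allclose(sol, chebyshev))
```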