The Remez Exchange Algorithm

Overview

The Remez exchange algorithm (also called the Remez algorithm) is an iterative numerical method for computing the best uniform (minimax) approximation of a continuous function \( f(x) \) on a closed interval \( [a, b] \) using functions from a finite-dimensional Chebyshev system (also known as a Haar subspace).

It is most famous for finding the polynomial \( p(x) \) of degree at most \( n \) that minimizes the maximum deviation:

\[ \min_{p \in \mathcal{P}_n} \max_{x \in [a,b]} \lvert f(x) - p(x) \rvert \quad\text{or equivalently}\quad \|f - p^*\|_\infty = \min_{p \in \mathcal{P}_n} \|f - p\|_\infty \]

However, the algorithm is far more general and applies to any linear approximating family satisfying the Haar condition, including:

  • Rational functions (fixed numerator/denominator degrees)
  • Exponential sums
  • Trigonometric polynomials
  • Splines with fixed knots
  • Other Chebyshev systems

This makes Remez the standard method for solving general minimax approximation problems.

Key Property: The Equioscillation Theorem

The solution is uniquely characterized by the de la Vallée-Poussin / Chebyshev equioscillation theorem:

The error \( e(x) = f(x) - p^*(x) \) of the best approximation \( p^* \) attains its maximum absolute value at at least \( n+2 \) points in \( [a, b] \), with alternating signs: $$ e(x_i) = (-1)^{i} \cdot E \quad \text{for} \quad i = 0, 1, \dots, n+1 $$ where \( a \leq x_0 < x_1 < \dots < x_{n+1} \leq b \) and \( E = \|f - p^*\|_\infty \).

This equioscillation of the error at \( n+2 \) points is both necessary and sufficient for optimality.
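A minimal worked example of the theorem: the best approximation to \( f(x)=x^{2} \) on \( [-1,1] \) by polynomials of degree \( n=1 \) is the constant \( p^{*}(x)=\tfrac{1}{2} \), since its error \[ e(x)=x^{2}-\tfrac{1}{2},\qquad e(-1)=\tfrac{1}{2},\quad e(0)=-\tfrac{1}{2},\quad e(1)=\tfrac{1}{2} \] attains \( \pm E=\pm\tfrac{1}{2} \) at \( n+2=3 \) points with alternating signs; by the theorem, no polynomial of degree \( \le 1 \) can do better.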

How the Algorithm Works

  1. Initialize a set of \( n+2 \) reference points \( \{x_0, x_1, \dots, x_{n+1}\} \) in \( [a, b] \) (often Chebyshev nodes or equally spaced).
  2. Solve the leveled-reference equations:
    Find the coefficients of \( p(x) \) and the deviation \( E \) such that: $$ f(x_i) - p(x_i) = (-1)^i E \quad \text{for} \quad i = 0, \dots, n+1 $$ For a linear approximating family (and a fixed sign pattern) this is a linear system in the coefficients and \( E \); for rational approximation it becomes nonlinear.
  3. Update reference set:
    Find the extrema of the current error \( e(x) = f(x) - p(x) \).
    Replace the old reference points with new ones where the error achieves local maxima/minima (typically using the current extrema).
  4. Repeat steps 2–3 until the error equioscillates (or changes are negligible).

Convergence: The algorithm typically exhibits quadratic convergence near the solution.
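The iteration described above can be sketched in a short, self-contained program. The following is an illustrative pure-Python implementation for polynomial approximation in the monomial basis; the function names, grid resolution, and the simple grid-based extremum search are my choices for the sketch, not a reference implementation (a production version would use a Chebyshev basis and proper extremum localization):

```python
import math

def solve_linear(A, y):
    """Solve the square system A x = y by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [y[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            t = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= t * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def remez(f, n, a, b, iters=30, grid_size=4000):
    """Minimax polynomial approximation of f on [a, b] by Remez exchange."""
    m = n + 2
    # step 1: initial reference, Chebyshev points mapped to [a, b]
    xs = [0.5 * (a + b) - 0.5 * (b - a) * math.cos(math.pi * k / (m - 1))
          for k in range(m)]
    coeffs, E = [0.0] * (n + 1), 0.0
    for _ in range(iters):
        # step 2: leveled system  sum_j c_j x_i^j + (-1)^i E = f(x_i)
        A = [[xi ** j for j in range(n + 1)] + [(-1.0) ** i]
             for i, xi in enumerate(xs)]
        sol = solve_linear(A, [f(xi) for xi in xs])
        coeffs, E = sol[:n + 1], sol[n + 1]
        p = lambda x: sum(c * x ** j for j, c in enumerate(coeffs))
        # step 3: local extrema of the error e = f - p on a fine grid
        g = [a + (b - a) * t / grid_size for t in range(grid_size + 1)]
        e = [f(x) - p(x) for x in g]
        cand = ([0]
                + [i for i in range(1, grid_size)
                   if (e[i] - e[i - 1]) * (e[i + 1] - e[i]) < 0]
                + [grid_size])
        # step 4: exchange the reference (this sketch expects exactly n+2 extrema)
        if len(cand) != m:
            break  # degenerate case for this simple sketch; keep last result
        xs = [g[i] for i in cand]
    return coeffs, E
```

For example, `remez(math.exp, 1, 0.0, 1.0)` computes the best line for \( e^{x} \) on \( [0,1] \): its slope is the secant slope \( e-1 \) (as expected for a convex function) and \( E\approx0.1059 \).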

Basic: with basis functions \( \phi_{0},\dots,\phi_{n} \) of the approximating family, the leveled-reference equations read \[\begin{align*}b_{0}\phi_{0}(x_{k})+b_{1}\phi_{1}(x_{k})+\cdots+b_{n}\phi_{n}(x_{k})+\sigma_{k}E=f(x_{k});\:k=0,1,\cdots,n+1;\quad\sigma_{k}=s(-1)^{k},\:s\in\left\{-1,1\right\}.\end{align*}\] Moving the deviation terms to the right-hand side gives \[\begin{align*}\begin{pmatrix}\phi_{0}(x_{0})&\cdots&\phi_{n}(x_{0})\\\vdots&\ddots&\vdots\\\phi_{0}(x_{n+1})&\cdots&\phi_{n}(x_{n+1})\end{pmatrix}\begin{pmatrix}b_{0}\\\vdots\\b_{n}\end{pmatrix}=\begin{pmatrix}f(x_{0})-\sigma_{0}E\\\vdots\\f(x_{n+1})-\sigma_{n+1}E\end{pmatrix},\end{align*}\] or, treating \( E \) as an additional unknown (here with \( s=1 \)), the square system \[\begin{align*}\begin{pmatrix}\phi_{0}(x_{0})&\cdots&\phi_{n}(x_{0})&1\\\phi_{0}(x_{1})&\cdots&\phi_{n}(x_{1})&-1\\\vdots&\ddots&\vdots&\vdots\\\phi_{0}(x_{n+1})&\cdots&\phi_{n}(x_{n+1})&(-1)^{n+1}\end{pmatrix}\begin{pmatrix}b_{0}\\\vdots\\b_{n}\\E\end{pmatrix}=\begin{pmatrix}f(x_{0})\\f(x_{1})\\\vdots\\f(x_{n+1})\end{pmatrix}\end{align*}\]
Weighted (from Grok)
The Remez algorithm, used for finding the best polynomial or rational approximation to a function over a given interval, can incorporate weight functions to prioritize certain regions of the interval or to account for the varying importance of approximation accuracy. Here is how weight functions are utilized in this context.

Role of the Weight Function

A weight function \( w(x) \) is a positive continuous function on the approximation interval \( [a, b] \). It modifies the error metric to focus the approximation accuracy on specific regions: instead of minimizing the maximum absolute error \( \max |f(x) - p(x)| \), the algorithm minimizes the weighted maximum error \[ \max_{x \in [a,b]} \lvert w(x) \cdot (f(x) - p(x)) \rvert \] where \( f(x) \) is the target function, \( p(x) \) is the approximating polynomial (or rational function), and \( w(x) \) assigns higher or lower importance to the error at different points \( x \).

Steps of the Weighted Algorithm

  1. Initialization: Choose an initial set of \( n+2 \) points in \( [a, b] \) (for a polynomial of degree \( n \)), typically Chebyshev nodes or equally spaced points, and fix \( f(x) \), \( w(x) \), and the degree \( n \).
  2. Solve for the polynomial: At the current points \( \{x_0, x_1, \dots, x_{n+1}\} \), find \( p(x) \) of degree \( n \) such that the weighted error alternates in sign with equal magnitude: \[ w(x_i) \cdot (f(x_i) - p(x_i)) = (-1)^i \cdot E \] This is a system of \( n+2 \) equations for the \( n+1 \) coefficients of \( p \) and the deviation \( E \).
  3. Compute the weighted error: Evaluate \( w(x) \cdot (f(x) - p(x)) \) over \( [a, b] \) (often on a fine grid) and locate the points where its absolute value is maximal (the new extremal points).
  4. Update extremal points: Replace the current point set with the new extremal points and repeat steps 2–3 until the extremal points stabilize, i.e. the weighted error equioscillates.
  5. Convergence: The algorithm has converged when the weighted error has (approximately) equal magnitude at the extremal points with alternating signs; this is the best approximation in the weighted max-norm.

Practical Considerations

  • Choice of weight function: \( w(x) = 1 \) reduces to the standard uniform norm (unweighted Remez). \( w(x) = 1/|x| \) emphasizes accuracy near \( x = 0 \), while \( w(x) = e^{-x} \) prioritizes accuracy for smaller \( x \); weights can also reflect application-specific needs, such as higher accuracy near critical points.
  • Numerical stability: The weight function must be positive and continuous; where \( w(x) \) approaches zero or varies extremely, the error computation becomes ill-conditioned. Discretizing the interval and evaluating \( w(x) \) at many points helps with the numerical implementation.
  • Applications: Weighted Remez is used in signal processing, where certain frequency bands need better approximation (in filter design, \( w(x) \) balances passband and stopband accuracy, as in Parks–McClellan), and in numerical analysis where specific regions, e.g. near boundaries, are critical.

Example: To approximate \( f(x) = \sin(x) \) on \( [-1, 1] \) by a polynomial of degree 2 while prioritizing accuracy near \( x = 0 \), take \( w(x) = 1/(1 + x^2) \). The algorithm starts with \( n+2 = 4 \) reference points, say \( \{-1, -0.5, 0.5, 1\} \), solves for a quadratic \( p(x) \) such that \( w(x_i) \cdot (\sin(x_i) - p(x_i)) \) alternates with equal magnitude, then computes the weighted error over \( [-1, 1] \), finds new extremal points, and iterates until the weighted error equioscillates.

Limitations: The weight function complicates the equation system and can increase the computational cost, and an ill-chosen \( w(x) \) (e.g. discontinuous or extremely varying) can lead to poor convergence or numerical errors.

In summary, weight functions allow tailored approximations by emphasizing accuracy in specific regions; the algorithm retains its iterative nature but minimizes the weighted max-norm, making it versatile for applications requiring non-uniform error control.
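To make the weighted leveled solve concrete, here is a small illustrative sketch of that step for this example (degree 2, \( w(x)=1/(1+x^{2}) \), \( n+2=4 \) reference points); the solver and all names are mine, not a fixed API:

```python
import math

def weighted_leveled_solve(f, w, xs, n):
    """Solve p(x_i) + (-1)^i * E / w(x_i) = f(x_i), i.e. the leveled equations
    w(x_i)*(f(x_i) - p(x_i)) = (-1)^i * E, for the n+1 coefficients of p and E."""
    m = len(xs)                                  # must equal n + 2
    M = [[xi ** j for j in range(n + 1)] + [(-1.0) ** i / w(xi), f(xi)]
         for i, xi in enumerate(xs)]
    # Gaussian elimination with partial pivoting
    for col in range(m):
        piv = max(range(col, m), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, m):
            t = M[r][col] / M[col][col]
            for c in range(col, m + 1):
                M[r][c] -= t * M[col][c]
    sol = [0.0] * m
    for r in range(m - 1, -1, -1):
        sol[r] = (M[r][m] - sum(M[r][c] * sol[c] for c in range(r + 1, m))) / M[r][r]
    return sol[:n + 1], sol[n + 1]               # coefficients of p, deviation E

f = math.sin
w = lambda x: 1.0 / (1.0 + x * x)                # emphasizes accuracy near x = 0
xs = [-1.0, -0.5, 0.5, 1.0]                      # n + 2 = 4 reference points
coeffs, E = weighted_leveled_solve(f, w, xs, 2)
p = lambda x: sum(c * x ** j for j, c in enumerate(coeffs))
# the weighted error now alternates with equal magnitude |E| at the nodes
residuals = [w(x) * (f(x) - p(x)) for x in xs]
```

Since \( \sin \) is odd, \( w \) even, and the nodes symmetric, the even coefficients of \( p \) come out (numerically) zero.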

c.f. 4
The algorithm will detect a test case as odd/even if the interval is symmetric and f(x) and w(x) (if used) are odd/even. In that case the approximation can be done using odd/even basis functions only. Here is how the weight function is used to achieve this: in the case of odd polynomials, p(x) is written as \( x\,q(x^{2}) \). The minimax condition becomes \[\begin{align*}E=\max_{x\in\left[-a,a\right]}{\left|f\left(x\right)-x\,q\left(x^{2}\right)\right|}=\max_{x\in\left[0,a\right]}{\left|f\left(x\right)-x\,q\left(x^{2}\right)\right|}.\end{align*}\] With \( y:=x^{2} \) this becomes \[\begin{align*}E=\max_{y\in\left[0,a^{2}\right]}{\left|f\left(\sqrt y\right)-\sqrt y\,q\left(y\right)\right|}.\end{align*}\] Setting \( w(y)=\sqrt y \) and \( g(y)=f\left(\sqrt y\right)/\sqrt y \), the approximation problem becomes \[\begin{align*}q_{n}^{*}=\arg\min_{q\in P_{n}}\left(\max_{y\in \left[0,a^{2}\right]}{\left|w(y)\left(g\left(y\right)-q\left(y\right)\right)\right|}\right).\end{align*}\]
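A quick numerical sanity check of this substitution; the trial polynomial \( q \) and the grid are arbitrary choices for illustration:

```python
import math

# odd test case: f(x) = sin(x) on [-a, a], approximant x * q(x^2)
a = 1.0
q = lambda y: 1.0 - y / 6.0           # arbitrary trial polynomial in y

xs = [a * k / 1000.0 for k in range(1, 1001)]     # grid on (0, a]
ys = [x * x for x in xs]                          # image grid on (0, a^2]

g = lambda y: math.sin(math.sqrt(y)) / math.sqrt(y)
w = lambda y: math.sqrt(y)

err_x = max(abs(math.sin(x) - x * q(x * x)) for x in xs)
err_y = max(abs(w(y) * (g(y) - q(y))) for y in ys)
# identical by the substitution y = x^2; oddness covers [-a, 0)
err_neg = max(abs(math.sin(-x) - (-x) * q(x * x)) for x in xs)
```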

c.f. 1
\[\begin{align*}\begin{pmatrix}1&\cdots&0&\sigma_{0}\\\vdots&\ddots&\vdots&\vdots\\0&\cdots&1&\sigma_{n}\\l_{0}(x_{n+1})&\cdots&l_{n}(x_{n+1})&\sigma_{n+1}\end{pmatrix}\begin{pmatrix}b_{0}\\\vdots\\b_{n}\\E\end{pmatrix}=\begin{pmatrix}f(x_{0})\\\vdots\\f(x_{n})\\f(x_{n+1})\end{pmatrix}\end{align*}\] Here \[\begin{align*}l_{k}(x)=l_{\left\{x_0,\cdots,x_{n}\right\},k}(x)=\frac{\prod_{l=0,l\neq k}^{n}(x-{x}_{l})}{\prod_{l=0,l\neq k}^{n}({x}_{k}-{x}_{l})},\:k=0,\cdots,n;\quad l_{k}(x_{i})=\delta_{k,i},\:i,k=0,\cdots,n.\end{align*}\] Eliminating \( b_{0},\cdots,b_{n} \) yields the closed form \[\begin{align*}E=\frac{\sum_{k=0}^{n+1}w_{k}f(x_{k})}{\sum_{k=0}^{n+1}w_{k}\sigma_{k}},\end{align*}\] \[\begin{align*}p(x)=\sum_{k=0}^{n}p(x_{k})l_{k}(x)=\sum_{k=0}^{n}b_{k}l_{k}(x)=\sum_{k=0}^{n}(f(x_{k})-\sigma_{k}E)l_{k}(x),\end{align*}\] with the weights \[\begin{align*}\tilde{w}_{j}:=\frac{1}{\prod_{k=0,k\neq j}^{n}({x}_{j}-{x}_{k})},\quad w_{j}:=\frac{1}{\prod_{k=0,k\neq j}^{n+1}(x_{j}-x_{k})}\quad (w_{j}=\frac{\tilde{w}_{j}}{x_{j}-x_{n+1}},\:j=0,\cdots,n).\end{align*}\] Set \( \phi(x):=\prod_{k=0}^{n}(x-x_{k}) \); then \[\begin{align*}l_{k}(x)=\frac{\phi(x)\tilde{w}_{k}}{x-x_{k}}.\end{align*}\] From \( 1=\sum_{k=0}^{n}l_{k}(x)=\phi(x)\sum_{k=0}^{n}\frac{\tilde{w}_{k}}{x-x_{k}} \) (a constant is its own interpolant) follows the barycentric representation: \[\begin{align*}p(x)=\frac{\sum_{k=0}^{n}\frac{\tilde{w}_{k}p({x}_{k})}{x-{x}_{k}}}{\sum_{k=0}^{n}\frac{\tilde{w}_{k}}{x-{x}_{k}}}.\end{align*}\]
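A small check of the closed form for \( E \), with an example chosen for illustration: for \( f(x)=x^{2} \), \( n=1 \), reference \( \{-1,0,1\} \), and \( \sigma=(1,-1,1) \), the formula gives \( E=\tfrac12 \) and \( p(x)=\tfrac12 \), the known best linear approximation of \( x^{2} \) on \( [-1,1] \):

```python
def bary_weights(nodes):
    """w_j = 1 / prod_{k != j} (x_j - x_k)."""
    ws = []
    for j, xj in enumerate(nodes):
        prod = 1.0
        for k, xk in enumerate(nodes):
            if k != j:
                prod *= xj - xk
        ws.append(1.0 / prod)
    return ws

f = lambda x: x * x
xs = [-1.0, 0.0, 1.0]                  # reference: n + 2 points with n = 1
sigma = [1.0, -1.0, 1.0]               # alternating signs

w = bary_weights(xs)                   # weights w_k over all n + 2 points
E = (sum(wk * f(xk) for wk, xk in zip(w, xs))
     / sum(wk * sk for wk, sk in zip(w, sigma)))

# p interpolates b_k = f(x_k) - sigma_k * E at the first n + 1 points
nodes = xs[:-1]
wt = bary_weights(nodes)               # weights w~_k over the first n + 1 points
b = [f(xk) - sk * E for xk, sk in zip(nodes, sigma)]

def p(x):
    """Barycentric evaluation of the interpolant (x must not be a node)."""
    num = sum(wj * bj / (x - xj) for wj, bj, xj in zip(wt, b, nodes))
    den = sum(wj / (x - xj) for wj, xj in zip(wt, nodes))
    return num / den
```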

Try any of the difficult examples from 8 Table 7.1-7.4:
f1: \(\displaystyle f(x)=\begin{cases} x^{2} & for\ x<\dfrac{1}{\sqrt{2}} \\ -x^{2}+2\sqrt{2}\,x-1 & for\ x>\dfrac{1}{\sqrt{2}} \end{cases}\)
f2: \(\displaystyle f(x)=|x|\sqrt{|x|}\)
f3: \(\displaystyle f(x)=x^{3} + \dfrac{x^{1/3}e^{-x^{2}}}{8}\)
f4: \(\displaystyle f(x)=\frac{100\pi \left(x^{2}-0.36 \right)}{\sinh\left(100\pi \left(x^{2}-0.36 \right) \right)}\)
f5: \(\displaystyle f(x)=-\dfrac{1}{\log|x|}\)
f6: \(\displaystyle f(x)=\sqrt{|x|}\)
Further interesting examples from 9 Table 1:
f1: \(\displaystyle f(x)=\tanh(x+0.5)-\tanh(x-0.5)\)
f2: \(\displaystyle f(x)=\sin(e^{x})\)
f3: \(\displaystyle f(x)=\sqrt{x+1}\)
f4: \(\displaystyle f(x)=\sqrt{|x-0.1|}\)
f5: \(\displaystyle f(x)=1-\sin(5|x-0.5|)\)
f6: \(\displaystyle f(x)=\min(\operatorname{sech}(3\sin(10x)),\ \sin(9x))\)
f7: \(\displaystyle f(x)=\max(\sin(20x),\ e^{x-1})\)
f8: \(\displaystyle f(x)=\operatorname{sech}(10(0.5x+0.3))^{2}+\operatorname{sech}(100(0.5x+0.1))^{4}+\operatorname{sech}(1000(0.5x-0.1))^{6}\)
f9: \(\displaystyle f(x)=\log(1.0001+x)\)

  1. Best \(L_{\infty}\) Approximation of a polynomial of degree n by polynomials of degree < n
    The best approximation to \( f(x)=\sum_{k=0}^{n}a_{k}x^{k} \), \( a_{n}\neq0 \), by a polynomial \( p(x) \) of degree at most \( n-1 \) on \( [-1, 1] \) is \[\begin{align*}p(x)=f(x)-a_{n}2^{-n+1}T_{n}(x),\end{align*}\] with maximum-norm error \( \left|a_{n}\right|2^{-n+1} \).
    c.f. 5
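    For instance, with \( f(x)=x^{3} \) (so \( n=3 \), \( a_{3}=1 \)), the best approximation of degree \( \le 2 \) is \( x^{3}-\tfrac14 T_{3}(x)=\tfrac34 x \), with error \( 2^{-2}=\tfrac14 \); a quick grid check:

```python
# Example: f(x) = x^3, so n = 3 and a_3 = 1; the best approximation of
# degree <= 2 on [-1, 1] is p(x) = x^3 - 2^(1-3) * T_3(x) = (3/4) x
p = lambda x: 0.75 * x
grid = [k / 10000.0 - 1.0 for k in range(20001)]
err = max(abs(x ** 3 - p(x)) for x in grid)
# maximum deviation equals 2^(1-n) * |a_n| = 1/4, attained at x = +-1/2, +-1
```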
  2. Differentiable f(x)
    If f is differentiable, the solution of one of the following eight equation systems yields the best approximation: \[\begin{aligned} 1.\;& f(x_i) - p(x_i) = (-1)^i E, \quad f'(x_i) - p'(x_i) = 0, \quad x_i \in (a,b),\; 0 \le i \le n+1 \\[4pt] 2.\;& f(x_i) - p(x_i) = -(-1)^i E, \quad f'(x_i) - p'(x_i) = 0, \quad x_i \in (a,b),\; 0 \le i \le n+1 \\[4pt] 3.\;& f(a) - p(a) = E, \quad f(x_i) - p(x_i) = (-1)^i E, \quad f'(x_i) - p'(x_i) = 0, \quad 0 < i \le n+1 \\[4pt] 4.\;& f(a) - p(a) = -E, \quad f(x_i) - p(x_i) = -(-1)^i E, \quad f'(x_i) - p'(x_i) = 0, \quad 0 < i \le n+1 \\[4pt] 5.\;& f(b) - p(b) = (-1)^{n+1} E, \quad f(x_i) - p(x_i) = (-1)^i E, \quad f'(x_i) - p'(x_i) = 0, \quad 0 \le i \le n \\[4pt] 6.\;& f(b) - p(b) = -(-1)^{n+1} E, \quad f(x_i) - p(x_i) = -(-1)^i E, \quad f'(x_i) - p'(x_i) = 0, \quad 0 \le i \le n \\[4pt] 7.\;& f(a) - p(a) = E, \quad f(b) - p(b) = (-1)^{n+1} E, \quad f(x_i) - p(x_i) = (-1)^i E, \quad f'(x_i) - p'(x_i) = 0, \quad 0 < i \le n \\[4pt] 8.\;& f(a) - p(a) = -E, \quad f(b) - p(b) = -(-1)^{n+1} E, \quad f(x_i) - p(x_i) = -(-1)^i E, \quad f'(x_i) - p'(x_i) = 0, \quad 0 < i \le n \end{aligned}\] Any solution of one of these systems yields an alternant of sufficient length and therefore is the unique solution of the approximation problem. Of course, these equations become very complex with increasing n. That's why Remez invented his algorithm!
  3. Best \(L_{\infty}\) Approximation of abs(x) on [-1,1] by polynomials of degree 2n
    Let \(p_{2n}(x)\) be the best polynomial approximation of degree \(2n\) on \([-1,1]\). Then \[\begin{align*}p_{2n}(0)=E,\quad p_{2n}(x_{i})=x_{i}+(-1)^{i}E\:\text{for}\:0\lt i\le n,\quad p_{2n}(1)=1-(-1)^{n}E,\end{align*}\] where \(0\lt x_{1}\lt x_{2}\lt\cdots\lt x_{n}\lt 1\). Note: the alternant is 1 point longer than required. \[\begin{align*}n=1:\quad p_{2}(x)=x^{2}+\frac{1}{8};\quad E=\frac{1}{8}.\end{align*}\] \[\begin{align*}n=2:\quad p_{4}(x)=ax^{4}+bx^{2}+E\end{align*}\] \[\begin{align*}ax_{1}^{4}+bx_{1}^{2}+E=x_{1}-E\end{align*}\] \[\begin{align*}ax_{2}^{4}+bx_{2}^{2}+E=x_{2}+E\end{align*}\] \[\begin{align*}a+b+E=1-E\end{align*}\] \[\begin{align*}4ax_{1}^{3}+2bx_{1}-1=0\end{align*}\] \[\begin{align*}4ax_{2}^{3}+2bx_{2}-1=0\end{align*}\] The system has the following solution: \[\begin{align*}x_{1}=\frac{5-3\sqrt{3}-2\sqrt{3\sqrt{3}-5}}{3(\sqrt{3}-3)}\cong0.28443158314308692545\end{align*}\] \[\begin{align*}a=\frac{5-3\sqrt{3}}{8x_{1}^3}\cong-1.0655411683005148005\end{align*}\] \[\begin{align*}b=\frac{3\sqrt{3}-3}{4x_{1}}\cong1.9302993697449462500\end{align*}\] \[\begin{align*}x_{2}=(1+\sqrt{3})x_{1}\cong0.7770815364241649003\end{align*}\] \[\begin{align*}E=\frac{1}{2}(1-a-b)\cong0.06762089927778427525\end{align*}\] Setting \(\phi:=3\sqrt{3}-5\cong0.19615242270663188058\) this becomes: \[\begin{align*}x_{1}=\frac{2\sqrt{\phi}+\phi}{3(3-\sqrt{3})},\quad a=\frac{-\phi}{8x_{1}^3},\quad b=\frac{\phi+2}{4x_{1}},\quad x_{2}=\frac{8+\phi}{3}x_{1},\quad E=\frac{1}{2}(1-a-b)\end{align*}\]
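    The \( n=2 \) solution can be checked numerically; an illustrative grid check with the constants truncated from the expressions above:

```python
# constants from the closed-form n = 2 solution for |x| on [-1, 1] (truncated)
a = -1.0655411683005148
b = 1.9302993697449462
E = 0.5 * (1.0 - a - b)
x1 = 0.28443158314308693
x2 = 0.7770815364241649

p = lambda x: a * x ** 4 + b * x ** 2 + E

# residuals of the five defining equations
eq = [
    a * x1 ** 4 + b * x1 ** 2 + E - (x1 - E),
    a * x2 ** 4 + b * x2 ** 2 + E - (x2 + E),
    a + b + E - (1.0 - E),
    4.0 * a * x1 ** 3 + 2.0 * b * x1 - 1.0,
    4.0 * a * x2 ** 3 + 2.0 * b * x2 - 1.0,
]

# the error |x| - p(x) equioscillates: its maximum on a grid equals E
grid = [k / 5000.0 - 1.0 for k in range(10001)]
max_err = max(abs(abs(x) - p(x)) for x in grid)
```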
  4. Best \(L_{\infty}\) Approximation of \(abs(x^{2}-\alpha),0\lt\alpha\lt1\) on [-1,1] by polynomials of degree 4
    Let \(-1=-x_{0}\lt -x_{1}\lt -x_{2}\lt x_{3}=0\lt x_{2}\lt x_{1}\lt x_{0}=1,\:p_{4}(x)=ax^{4}+bx^{2}+c\).
    We try to solve these 5 equations with the unknowns a, b, c, E, \(x_{1}\): \[\begin{align*}p_{4}(-1)=a+b+c=1-\alpha+E\end{align*}\] \[\begin{align*}p_{4}(-x_{1})=ax_{1}^{4}+bx_{1}^{2}+c=x_{1}^{2}-\alpha-E\end{align*}\] \[\begin{align*}p'_{4}(-x_{1})=-4ax_{1}^{3}-2bx_{1}=-2x_{1}\Rightarrow 4ax_{1}^{2}=2-2b\end{align*}\] \[\begin{align*}p_{4}(\sqrt{\alpha})=a\alpha^{2}+b\alpha+c=E\end{align*}\] \[\begin{align*}p'_{4}(\sqrt{\alpha})=4a\sqrt{\alpha}^{3}+2b\sqrt{\alpha}=0\Rightarrow b=-2a\alpha\end{align*}\]
  5. Approximation of the zero function by monic (normalized) polynomials

    The monic polynomial of degree \(n\) which best approximates the zero function on \([-1,1]\) in the maximum norm is the normalized Chebyshev polynomial \[\begin{align*}\tilde{T}_{n}(x)=\frac{1}{2^{n-1}}\,T_{n}(x),\end{align*}\] where \(T_{n}(x)\) is the Chebyshev polynomial of the first kind of degree \(n\). Maximum error: \(\|\tilde{T}_{n}\|_{\infty}=\frac{1}{2^{n-1}}\). Examples: \[\begin{align*}n=1&:\:\tilde{T}_{1}(x)=x,&\text{error}&=1\\n=2&:\:\tilde{T}_{2}(x)=x^{2}-\tfrac{1}{2},&\text{error}&=\tfrac{1}{2}\\n=3&:\:\tilde{T}_{3}(x)=x^{3}-\tfrac{3}{4}x,&\text{error}&=\tfrac{1}{4}\\n=4&:\:\tilde{T}_{4}(x)=x^{4}-x^{2}+\tfrac{1}{8},&\text{error}&=\tfrac{1}{8}\end{align*}\]
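    A quick illustrative check via the three-term recurrence \( T_{k+1}(x)=2xT_{k}(x)-T_{k-1}(x) \); the helper names and the grid are my choices:

```python
def cheb_coeffs(n):
    """Coefficients (lowest degree first) of T_n via T_{k+1} = 2x*T_k - T_{k-1}."""
    t_prev, t = [1.0], [0.0, 1.0]                 # T_0 and T_1
    if n == 0:
        return t_prev
    for _ in range(n - 1):
        shifted = [0.0] + [2.0 * c for c in t]    # coefficients of 2x * T_k
        padded = t_prev + [0.0] * (len(shifted) - len(t_prev))
        t_prev, t = t, [s - q for s, q in zip(shifted, padded)]
    return t

def poly_eval(cs, x):
    return sum(c * x ** j for j, c in enumerate(cs))

n = 4
monic = [c / 2.0 ** (n - 1) for c in cheb_coeffs(n)]   # T~_n = T_n / 2^(n-1)
grid = [k / 5000.0 - 1.0 for k in range(10001)]
max_dev = max(abs(poly_eval(monic, x)) for x in grid)
# leading coefficient is 1 (monic), and ||T~_4||_inf = 1/8 on [-1, 1]
```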
  6. Functions with steps.
    It is easy to see that for any piecewise continuous function f with maximum step height h, \(\|f-p_{n}\|_{\infty,[a,b]}\ge\frac{h}{2}\) holds.

    It follows that for any piecewise continuous function with exactly n+1 steps, all of height h, the best polynomial approximation of degree n must pass through the centers of the steps and can be obtained by interpolation in a unique way.

    This gives us a method to construct, for any given polynomial, functions which have that polynomial as their best approximation.

    Here is an example that uses the floor function:
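    One such construction in code; the specific choices \( p(x)=x^{2} \), \( E=0.05 \), and jumps at \( \tfrac14,\tfrac12,\tfrac34 \) on \( [0,1] \) are illustrative:

```python
import math

E = 0.05                        # half the step height, h = 2E
p = lambda x: x * x             # the given polynomial (degree n = 2)

def f(x):
    """p plus a step function with n + 1 = 3 jumps of height 2E on [0, 1]."""
    k = min(int(math.floor(4.0 * x)), 3)          # piece index 0..3
    return p(x) + E * (-1.0) ** k

# the error f - p is constantly +E or -E on each of the 4 pieces, alternating:
# an equioscillation on n + 2 = 4 subintervals, and since every continuous
# approximant misses a jump of height 2E by at least E, p is the best
# approximation of degree 2 (p passes through the center of each step).
errs = [f(x) - p(x) for x in [0.1, 0.3, 0.6, 0.9]]
```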
  7. Best approximation of \(\frac {1}{r-x}\), r>1, on [-1, 1]

    The function \( f(x) = \frac{1}{r - x} \) with \( r > 1 \) is optimally approximated in the interval \( [-1, 1] \) for each \( n \in \mathbb{N}_0 \)

    by the polynomial \( p_n(x) := \frac{1}{r - x} - E_n V_n \) of degree \( n \).

    Here, \( E_n := \frac{\left( r - \sqrt{r^2 - 1} \right)^n}{r^2 - 1} \) and \( V_n(x) := \cos(n \varphi + \psi) \), where \( \varphi \) is defined implicitly by \( \cos \varphi := x \) and \( \psi \) by \( \tan \frac{\psi}{2} := \sqrt{\frac{r + 1}{r - 1}} \tan \frac{\varphi}{2} \).

    Proof Sketch

    Polynomial Property: Through transformations, including those involving Chebyshev polynomials of the first and second kind, the function \( q(x) := 1 - (r - x) E_n V_n \) in the numerator of \( p_n(x) = \frac{1}{r - x} - E_n V_n \) is shown to be a polynomial of degree \( n + 1 \). Thus, \( p_n(x) \) is initially a rational function. Furthermore, \( q(x) \) has a zero at \( x = r \), so the factor \( (x - r) \) can be factored out of \( q(x) \) and canceled with the denominator \( (r - x) \) of \( p_n(x) \). Ultimately, \( p_n(x) \) is a polynomial of degree \( n \).

    Best Approximation: The given relations define a monotone and bijective mapping

    \[\begin{align*} ]-1,\ 1[\ \ \to\ \ ]0,\ (n + 1) \pi[\end{align*}\]

    \[\begin{align*}\quad x \mapsto n \varphi + \psi\end{align*}\]

    (decreasing, since \( \varphi = \arccos x \) decreases in \( x \)). Because of this monotonicity, the extrema of the error function \( f(x) - p_n(x) = E_n \cos(n \varphi + \psi) \) correspond exactly to the extrema of the cosine, which are attained where \( n \varphi + \psi \) is a multiple of \( \pi \).

    By adding the interval boundaries \( x_0 = -1 \) with \( n \varphi + \psi = (n + 1) \pi \) and \( x_{n+1} = 1 \) with \( n \varphi + \psi = 0 \) (using the main branches of the arc-functions), one obtains \( n + 2 \) alternation points \( -1=x_0 < \dots < x_i < \dots < x_{n+1}=1 \), at which the error function attains its extremal value \( \pm E_n \) exactly \( n + 2 \) times with alternating sign.

    Example: First 4 Best Approximations for r = 37/12 

    n | Best Polynomial p_n | Extrema Points | Max Error E_n
    0 | 444/1225 | 1, -1 | 144/1225 ≈ 0.1176
    1 | (144/1225)x + 12/35 | 1, 1/6, -1 | 24/1225 ≈ 0.0196
    2 | (48/1225)x² + (4/35)x + 396/1225 | 1, 7/12, -5/12, -1 | 4/1225 ≈ 0.0033
    3 | (16/1225)x³ + (4/105)x² + (128/1225)x + 34/105 | 1, 3/4, 1/12, -2/3, -1 | 2/3675 ≈ 0.0005
    c.f. 6
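    The closed form is easy to verify numerically; an illustrative check for \( r=37/12 \), \( n=1 \) against the tabulated polynomial \( (144/1225)x+12/35 \):

```python
import math

r = 37.0 / 12.0
n = 1

f = lambda x: 1.0 / (r - x)
E_n = (r - math.sqrt(r * r - 1.0)) ** n / (r * r - 1.0)   # = (1/6)^n / (1225/144)

def p(x):
    """p_n(x) = f(x) - E_n * cos(n*phi + psi), main branches, valid for x in (-1, 1]."""
    phi = math.acos(x)
    psi = 2.0 * math.atan(math.sqrt((r + 1.0) / (r - 1.0)) * math.tan(phi / 2.0))
    return f(x) - E_n * math.cos(n * phi + psi)

# the tabulated best linear approximation for r = 37/12
table_p1 = lambda x: (144.0 / 1225.0) * x + 12.0 / 35.0
```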

  • Polynomials on any interval: \[\begin{align*}\left\{1,x,\cdots,x^{n}\right\}\end{align*}\]
  • Polynomials with even exponents on any interval [a, b] where a > 0: \[\begin{align*}\left\{1,x^{2},\cdots,x^{2n}\right\}\end{align*}\] 2
  • Exponential functions on any interval: \[\begin{align*}\left\{1,e^{\alpha_{0}x},\cdots,e^{\alpha_{n}x}\right\},\:\alpha_{i}\in\mathbb{R}\:\text{pairwise distinct and nonzero}\end{align*}\] 2
  • Trigonometric functions: \[\begin{align*}\left\{1,\sin\left(x\right),\cdots,\sin\left(nx\right)\right\};\:\left\{1,\cos\left(x\right),\cdots,\cos\left(nx\right)\right\};\:\left\{1,\sin\left(x\right),\cos\left(x\right),\cdots,\sin\left(nx\right),\cos\left(nx\right)\right\}\:\text{on}\:\left[0,2\pi\right]\end{align*}\] 3
  • On any interval: \[\begin{align*}\left\{1,x,\cdots,x^{n},f(x)\right\},\:\text{if}\:f^{\left(n+1\right)}\left(x\right)\neq0\end{align*}\] (proof by induction using the mean value theorem) 2
  • The Faber–Schauder system on any interval (here on [0,1], with \(I_{m,k}=\left[\frac{k}{2^{m}},\frac{k+1}{2^{m}}\right]\)):
    \[\begin{align*}s_{m,k}(t)=\begin{cases}2^{m}\left(t-\frac{k}{2^{m}}\right)&\text{for }t\text{ in the left half of }I_{m,k}\\2^{m}\left(\frac{k+1}{2^{m}}-t\right)&\text{for }t\text{ in the right half of }I_{m,k}\\0&\text{otherwise,}\end{cases}\qquad f(t)=a_{0}s_{0}(t)+a_{1}s_{1}(t)+\sum_{m=0}^{n-1}\sum_{k=0}^{2^{m}-1}a_{m,k}s_{m,k}(t)\end{align*}\] 4

The Chebyshev polynomials of the first kind \(T_{0},\dots,T_{19}\), for reference:
\[\begin{align*}T_0(x)&=1,\\T_1(x)&=x,\\T_2(x)&=2x^2-1,\\T_3(x)&=4x^3-3x,\\T_4(x)&=8x^4-8x^2+1,\\T_5(x)&=16x^5-20x^3+5x,\\T_6(x)&=32x^6-48x^4+18x^2-1,\\T_7(x)&=64x^7-112x^5+56x^3-7x,\\T_8(x)&=128x^8-256x^6+160x^4-32x^2+1,\\T_9(x)&=256x^9-576x^7+432x^5-120x^3+9x,\\T_{10}(x)&=512x^{10}-1280x^8+1120x^6-400x^4+50x^2-1,\\T_{11}(x)&=1024x^{11}-2816x^9+2816x^7-1232x^5+220x^3-11x,\\T_{12}(x)&=2048x^{12}-6144x^{10}+6912x^8-3584x^6+840x^4-72x^2+1,\\T_{13}(x)&=4096x^{13}-13312x^{11}+16640x^9-9984x^7+2912x^5-364x^3+13x,\\T_{14}(x)&=8192x^{14}-28672x^{12}+39424x^{10}-26880x^8+9408x^6-1568x^4+98x^2-1,\\T_{15}(x)&=16384x^{15}-61440x^{13}+92160x^{11}-70400x^9+28800x^7-6048x^5+560x^3-15x,\\T_{16}(x)&=32768x^{16}-131072x^{14}+212992x^{12}-180224x^{10}+84480x^8-21504x^6+2688x^4-128x^2+1,\\T_{17}(x)&=65536x^{17}-278528x^{15}+487424x^{13}-454656x^{11}+240128x^9-70400x^7+10880x^5-816x^3+17x,\\T_{18}(x)&=131072x^{18}-589824x^{16}+1105920x^{14}-1136640x^{12}+662528x^{10}-221760x^8+41664x^6-4032x^4+162x^2-1,\\T_{19}(x)&=262144x^{19}-1245184x^{17}+2490368x^{15}-2799360x^{13}+1787904x^{11}-665856x^9+139968x^7-15504x^5+855x^3-19x\end{align*}\]