{\let\v=\varphi\let\a=\alpha\def\n#1{\left\|#1\right\|} \section{LEAST SQUARES APPROXIMATION} \noindent{\bf Continuous case}. Given an inner product $$\la f,g\ra= \int_a^b f(x)g(x)w(x)dx$$ on a vector space $V$ and some function $f\in V$, the problem is to find the polynomial $p^\ast(x)$ of degree $\le n$ which minimizes $$\n{f-p}^2= \int_a^b \bigl(f(x)-p(x)\bigr)^2 w(x)dx$$ over all polynomials $p(x)$ of degree $\le n$. The minimizer $p^\ast$ is called the {\it best least squares approximation\/} to $f$. It can be computed directly by setting the partial derivatives of $E(\a)= \n{f-p}^2$ equal to zero, where $p(x)=\a_0+\a_1x + \cdots + \a_nx^n$, $\a=(\a_0$, $\a_1$, $\ldots$, $\a_n)$. This gives $n+1$ linear equations in the $n+1$ unknown coefficients $\a_0$, $\ldots$, $\a_n$: $$A\a=b, \hskip40pt\hbox{(normal equations)}$$ where $A_{ij}=\la x^i,x^j\ra$, $b_i=\la f,x^i\ra$, $0\le i,j\le n$. For $[a,b]=[0,1]$ and $w(x)=1$ this matrix is $A_{ij}=1/(i+j+1)$, the {\it Hilbert matrix}, which is extremely ill conditioned, making $\a$ difficult to compute accurately. A better approach is to find orthogonal polynomials $\v_0$, $\ldots$, $\v_n$ with respect to $\la\cdot,\cdot\ra$, write $$p(x)=\sum_{i=0}^n \a_i\v_i(x),$$ and then minimize $E(\a)$ with respect to these coefficients $\a$. The normal equations $A\a=b$ then become {\it diagonal}, since $A_{ij}=\la\v_i,\v_j\ra=0$ for $i\ne j$, and $b_i= \la f,\v_i\ra$. The coefficients $\a_i=\la f,\v_i\ra / \la\v_i,\v_i\ra$ are simply the Fourier coefficients of $f$ with respect to the $\v_i$, and $$p^\ast(x)=\sum_{i=0}^n {\la f,\v_i\ra\over\la\v_i,\v_i\ra} \v_i(x).$$ {\bf Continuous least squares approximation example}: find the polynomial $p^\ast(x)$ of degree $\le2$ which minimizes $$\int_{-1}^1 \bigl(p(x)-\sin x\bigr)^2dx.$$ Solution: Define $\la r,s\ra=\int_{-1}^1 r(x)s(x)\,dx$, $f(x)=\sin x$. 
Then the problem is equivalent to $$\min_p\int_{-1}^1 \bigl(p(x)-\sin x\bigr)^2dx=\min_p \la p-f,p-f\ra=\min_p \n{p-f}^2 \eqno(1)$$ or $$\min_p \n{p-f} \eqno(2)$$ where the minimization is over all polynomials $p$ of degree $\le2$. The solution to (2) is $$p^\ast=\sum_{i=0}^2 {\la f,\v_i\ra\over\la \v_i, \v_i\ra}\;\v_i \eqno(3)$$ where $\v_0$, $\v_1$, $\v_2$ are orthogonal polynomials with respect to the inner product $\la \cdot,\cdot\ra$. Working out the details (the $\v_i$ are the Legendre polynomials, up to scaling), $\v_0=1$, $\v_1=x$, $\v_2=(3/2)(x^2-1/3)$, $\la\v_0,\v_0\ra =2$, $\la\v_1,\v_1\ra=2/3$, $\la\v_2,\v_2\ra =2/5$, $\la f,\v_0\ra=0$, $\la f,\v_1\ra=2(\sin1 -\cos1)$, $\la f,\v_2\ra=0$ (the integrands $f\v_0$ and $f\v_2$ are odd). Thus $$p^\ast(x)=0\cdot 1+{2(\sin 1-\cos1)\over2/3}\,x+0\cdot{3\over2}(x^2-1/3)=3(\sin1-\cos1)x.$$ \hrule\medskip \noindent{\bf Discrete case}. Given data points $\bigl(x_i,f(x_i)\bigr)$ and fixed weights $w_i>0$, $i=1,\ldots,M$, the problem is to fit the data by a polynomial $p(x)=\a_0+\a_1x + \cdots + \a_nx^n$, where $M>n+1$ (more data points than coefficients). The best weighted least squares fit minimizes $$E(\a) = \sum_{i=1}^M \bigl(p(x_i)-f(x_i)\bigr)^2 w_i$$ over all $\a$. Using the discrete inner product $\la p,q\ra=\sum\limits_{i=1}^M p(x_i)q(x_i)w_i$, $$\min_\a E(\a) =\min_\a \la p-f,p-f\ra= \min_{p\in{\cal P}_n} \la p-f,p-f\ra= \min_{p\in{\cal P}_n} \n{p-f}^2.$$ The closest point in ${\cal P}_n$ to $f$ is the orthogonal projection $$p^\ast(x)= P_{{\cal P}_n} (f)=\sum_{i=0}^n {\la f,\v_i\ra\over\la\v_i,\v_i\ra} \v_i(x),$$ where $\{\v_i\}_{i=0}^n$ is an orthogonal basis for ${\cal P}_n$ with respect to the inner product $\la\cdot,\cdot\ra$. \medskip\noindent{\bf Solution 1}. Using the three-term recurrence relation, compute orthogonal polynomials $\{\v_i\}_{i=0}^n$ with respect to the inner product $\la p,q\ra=\sum_{i=1}^M p(x_i)q(x_i)w_i$, then compute the projection $p^\ast$ of $f$ onto ${\cal P}_n$ with respect to this inner product. 
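Solution 1 can be sketched numerically. The following Python function is illustrative, not part of the original notes: it builds monic orthogonal polynomials $\v_{i+1}(x)=(x-a_i)\v_i(x)-b_i\v_{i-1}(x)$ via the Stieltjes three-term recurrence, with $a_i=\la x\v_i,\v_i\ra/\la\v_i,\v_i\ra$ and $b_i=\la\v_i,\v_i\ra/\la\v_{i-1},\v_{i-1}\ra$, evaluating everything only at the data points, and accumulates the Fourier-coefficient projection.

```python
import numpy as np

def discrete_lsq_orthopoly(x, f, w, n):
    """Values p*(x_j) of the best degree-n weighted least squares fit,
    computed via orthogonal polynomials (Solution 1).

    Orthogonality is with respect to <p,q> = sum_i p(x_i) q(x_i) w_i.
    """
    x, f, w = (np.asarray(v, dtype=float) for v in (x, f, w))
    ip = lambda p, q: np.sum(p * q * w)      # discrete inner product
    phi_prev = np.zeros_like(x)              # phi_{-1} = 0
    phi = np.ones_like(x)                    # phi_0 = 1
    p_star = np.zeros_like(x)
    for i in range(n + 1):
        # add the i-th Fourier term <f,phi_i>/<phi_i,phi_i> * phi_i
        p_star = p_star + ip(f, phi) / ip(phi, phi) * phi
        # three-term recurrence: phi_{i+1} = (x - a_i) phi_i - b_i phi_{i-1}
        a = ip(x * phi, phi) / ip(phi, phi)
        b = ip(phi, phi) / ip(phi_prev, phi_prev) if i > 0 else 0.0
        phi, phi_prev = (x - a) * phi - b * phi_prev, phi
    return p_star
```

With unit weights this should agree with an ordinary polynomial fit, e.g. comparing against `np.polyval(np.polyfit(x, f, n), x)`; the advantage of the recurrence is that the normal equations become diagonal, avoiding the ill-conditioned Vandermonde system.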
\medskip\hrule\medskip Let $$\displaylines{ C=\pmatrix{1&x_1&x_1^2&\cdots&x_1^n\cr 1&x_2&x_2^2&\cdots&x_2^n\cr \vdots&\vdots&\vdots&\ddots&\vdots\cr 1&x_M&x_M^2&\cdots&x_M^n\cr}, \qquad W=\pmatrix{\sqrt{w_1}&0&\cdots&0\cr 0&\sqrt{w_2}&\cdots&0\cr \vdots&\vdots&\ddots&\vdots\cr 0&0&\cdots&\sqrt{w_M}\cr}, \cr d=\bigl(f(x_1), f(x_2), \ldots, f(x_M)\bigr)^t,\qquad \a=\bigl( \a_0, \a_1, \ldots, \a_n\bigr)^t,\cr}$$ $A=WC$, $b=Wd$. Then $E(\a)=\n{W(C\a-d)}^2 = \n{A\a-b}^2$. With $A=QR$, $Q^tQ=QQ^t=I$, $$R=\bordermatrix{&\overbrace{\hskip5pt}^{n+1}\cr &\hat R\cr &0\cr} \vcenter {\hbox{$\}n+1$} \hbox{$\}M-n-1$}}\quad,\qquad Q^tb=\pmatrix{\hat b\cr \tilde b\cr} \vcenter{\hbox{$\}n+1$} \hbox{$\}M-n-1$}}\quad,$$ $\hat R$ upper triangular, $E(\a)$ reduces to $$\eqalign{\n{A\a-b}^2 &=\n{QR\a-b}^2=\n{Q(R\a-Q^tb)}^2 = \n{R\a-Q^tb}^2 \cr &=\n{\pmatrix{\hat R\a\cr 0\cr} - \pmatrix{\hat b\cr \tilde b\cr}}^2 = \n{\hat R\a-\hat b}^2 + \n{\tilde b}^2,\cr}$$ since multiplication by the orthogonal matrix $Q$ preserves the norm. The first term vanishes when $\hat R\a=\hat b$, so the minimum value of $E$ is $\n{\tilde b}^2$. \smallskip\noindent{\bf Solution 2}. Factor $A$ as $QR$, where $Q$ is $M\times M$ orthogonal and $R$ is $M\times(n+1)$ upper triangular, and then solve the triangular system $\hat R\a=\hat b$ for $\a^\ast$, the coefficients of the best discrete least squares approximation $p^\ast$. ($\hat R$ is assumed to be invertible, with rank $\hat R=$ rank $C=n+1$.) \vfil\eject }
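Solution 2 translates directly into Python/NumPy; the sketch below is illustrative, not from the original notes. The names mirror the matrices above: $C$ is the Vandermonde matrix, $A=WC$, $b=Wd$. NumPy's reduced QR returns only the first $n+1$ columns of $Q$ and the block $\hat R$, so $\hat b=Q^tb$ comes out directly and the triangular system $\hat R\a=\hat b$ is solved for the coefficients.

```python
import numpy as np

def discrete_lsq_qr(x, f, w, n):
    """Coefficients alpha of the best degree-n weighted least squares fit,
    computed via QR factorization of A = W C (Solution 2)."""
    x, f, w = (np.asarray(v, dtype=float) for v in (x, f, w))
    C = np.vander(x, n + 1, increasing=True)   # columns 1, x, ..., x^n
    sqrt_w = np.sqrt(w)
    A = sqrt_w[:, None] * C                    # A = W C,  W = diag(sqrt(w_i))
    b = sqrt_w * f                             # b = W d
    Q, R_hat = np.linalg.qr(A)                 # reduced QR: R_hat is (n+1) x (n+1)
    b_hat = Q.T @ b
    return np.linalg.solve(R_hat, b_hat)       # solve R_hat alpha = b_hat
```

The minimum error $\n{\tilde b}^2$ can be recovered afterwards as `np.sum((A @ alpha - b)**2)`. Working with $A=QR$ instead of the normal equations $A^tA\a=A^tb$ avoids squaring the condition number of $A$.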