2. The Method of Least Squares (1/2)

The method of fitting a function to a dataset containing observations of two or more variables, using the criterion of minimising the sum of squared residuals, is called Least Squares. If we fit a line of the form b1·X + b0 = Y + ε to the data, then we get one equation for each matched pair of observations. If we have N observed data pairs, then we get N observation equations:

$$
\begin{aligned}
b_1 \cdot x_1 + b_0 &= y_1 + \varepsilon_1 \\
b_1 \cdot x_2 + b_0 &= y_2 + \varepsilon_2 \\
&\;\;\vdots \\
b_1 \cdot x_N + b_0 &= y_N + \varepsilon_N
\end{aligned}
$$

These we want to solve for b0 and b1, where (x1, y1), (x2, y2), …, (xN, yN) are the observed pairs of data values. Usually no data pair falls exactly on the best-fitting line; each point has a residual. We find the best fit by minimising the sum of the squares of these residuals.
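To make the residuals concrete before deriving anything, here is a minimal Python sketch; the data pairs and the trial coefficients b0 and b1 are invented purely for illustration:

```python
# Minimal sketch: residuals and their sum of squares for a candidate line.
# The data pairs and the trial coefficients are invented for illustration.
x = [1.0, 2.0, 3.0, 4.0, 5.0]          # observed x values
y = [2.1, 3.9, 6.2, 7.8, 10.1]         # observed y values
b1, b0 = 2.0, 0.1                      # a candidate slope and intercept

residuals = [yi - (b1 * xi + b0) for xi, yi in zip(x, y)]
sum_of_squares = sum(e ** 2 for e in residuals)
print(residuals, sum_of_squares)
```

The method of least squares picks the b0 and b1 that make this sum of squares as small as possible.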

From the above equations, each residual is of the form

$$
\varepsilon_i = y_i - b_1 \cdot x_i - b_0
$$

So that the square of a residual is of the form

$$
\varepsilon_i^2 = (y_i - b_1 \cdot x_i - b_0)^2 = y_i^2 - 2 b_1 x_i y_i - 2 b_0 y_i + b_1^2 x_i^2 + 2 b_0 b_1 x_i + b_0^2
$$
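If you want to verify this expansion rather than work it out by hand, a short sketch using the sympy library (an added illustration, not part of the original derivation) expands the squared residual symbolically:

```python
import sympy as sp

# Symbols standing in for one observation pair and the two coefficients.
x_i, y_i, b0, b1 = sp.symbols('x_i y_i b0 b1')

squared_residual = (y_i - b1 * x_i - b0) ** 2
print(sp.expand(squared_residual))
# Prints the six terms of the expansion above (the ordering may differ).
```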

If you partially differentiate this with respect to the two unknowns (for partial derivatives, see also Supplement 1 of the SEOS tutorial Time Series Analysis), then you get

$$
\frac{\partial \varepsilon_i^2}{\partial b_1} = -2 x_i y_i + 2 b_1 x_i^2 + 2 b_0 x_i
$$

and

$$
\frac{\partial \varepsilon_i^2}{\partial b_0} = -2 y_i + 2 b_1 x_i + 2 b_0
$$
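As a check on these two derivatives, the same sympy-based sketch can differentiate the squared residual directly (again an added illustration, under the same assumptions as the previous snippet):

```python
import sympy as sp

x_i, y_i, b0, b1 = sp.symbols('x_i y_i b0 b1')
squared_residual = (y_i - b1 * x_i - b0) ** 2

# Partial derivatives of the squared residual with respect to b1 and b0.
print(sp.expand(sp.diff(squared_residual, b1)))   # -2*x_i*y_i + 2*b1*x_i**2 + 2*b0*x_i
print(sp.expand(sp.diff(squared_residual, b0)))   # -2*y_i + 2*b1*x_i + 2*b0
```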

When these partial derivatives are zero, the gradient of the sum of squares is zero and so the sum of squares is minimised. Each derivative gives us one equation, so there are two equations to solve for the two unknowns. To create these equations, divide each derivative by -2 (which does not change where it is zero) and sum over the N observations to give

$$
\begin{aligned}
\sum xy - b_1 \sum x^2 - b_0 \sum x &= 0 \\
\sum y - b_1 \sum x - N b_0 &= 0
\end{aligned}
$$

or

$$
\begin{aligned}
b_1 \sum x^2 + b_0 \sum x &= \sum xy \\
b_1 \sum x + N b_0 &= \sum y
\end{aligned}
$$
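To show how these two normal equations could be solved numerically, here is a short NumPy sketch; the data values are invented, and numpy.polyfit is used only as an independent cross-check of the result:

```python
import numpy as np

# Invented example data, for illustration only.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])
N = len(x)

# The two normal equations in matrix form:
#   b1*sum(x^2) + b0*sum(x) = sum(x*y)
#   b1*sum(x)   + b0*N      = sum(y)
A = np.array([[np.sum(x**2), np.sum(x)],
              [np.sum(x),    N       ]])
rhs = np.array([np.sum(x * y), np.sum(y)])

b1, b0 = np.linalg.solve(A, rhs)
print(b1, b0)

# Cross-check with NumPy's own least-squares polynomial fit.
print(np.polyfit(x, y, 1))   # returns [slope, intercept]
```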