When $A$ is not full rank, the LS problem has infinitely many solutions, and it is natural to single out the solution of minimum norm. This particular solution is $x = A^\dagger y$, where $A^\dagger$ is the Moore–Penrose pseudoinverse.
Variants of the least-squares problem:
Minimum-norm solutions to linear equations:
When the equations are underdetermined, we may want to single out the solution with minimum Euclidean norm: $\min_x \|x\|_2 : Ax = y$
Here, $A \in \mathbb{R}^{m \times n}$ with $m < n$, and $y \in \mathcal{R}(A)$
Optimality conditions and full row rank case:
By the fundamental theorem of linear algebra, any candidate $x$ can be written as $x = A^\top v + r$ with $Ar = 0$. Since $A^\top v$ and $r$ are orthogonal, $\|x\|_2^2 = \|A^\top v\|_2^2 + \|r\|_2^2$, so the optimum has $r = 0$; feasibility $Ax = y$ then requires $AA^\top v = y$
If $A$ has full row rank, $AA^\top$ is invertible, the unique $v$ is $(AA^\top)^{-1} y$, and hence $x^* = A^\top (AA^\top)^{-1} y$
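The full-row-rank formula can be checked numerically against the pseudoinverse; a minimal sketch with NumPy, using a randomly generated underdetermined system (the data here is illustrative, not from the notes):

```python
import numpy as np

rng = np.random.default_rng(0)

# Underdetermined system: m < n; a random Gaussian A has full row rank
# with probability one.
m, n = 3, 6
A = rng.standard_normal((m, n))
y = rng.standard_normal(m)

# Closed form for the full-row-rank case: x* = A^T (A A^T)^{-1} y.
x_star = A.T @ np.linalg.solve(A @ A.T, y)

# The pseudoinverse yields the same minimum-norm solution.
x_pinv = np.linalg.pinv(A) @ y

print(np.allclose(A @ x_star, y))   # feasibility: A x* = y
print(np.allclose(x_star, x_pinv))  # both formulas agree
```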
Equality-constrained LS:
A generalization of the basic LS problem adds linear equality constraints on the variable $x$, giving the constrained least-squares problem $\min_x \|Ax - y\|_2^2$ subject to $Cx = d$
This problem can be reduced to a standard LS problem by eliminating the equality constraints via a standard procedure
First, suppose the problem is feasible, and let $\bar{x}$ be any point with $C\bar{x} = d$
All feasible points (whether or not they minimize the LS objective) can then be written as $x = \bar{x} + Nz$, where the columns of $N$ form a basis for $\mathcal{N}(C)$ and $z$ is a new variable
We then solve $\min_z \|\tilde{A}z - \tilde{y}\|_2^2$, where $\tilde{A} \doteq AN$ and $\tilde{y} \doteq y - A\bar{x}$
Finally, we recover $x^* = \bar{x} + Nz^*$
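The elimination procedure above can be sketched in NumPy, computing the nullspace basis $N$ from the SVD of $C$ (all problem data below is randomly generated for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)

# Problem data: minimize ||Ax - y||^2 subject to Cx = d.
m, n, p = 8, 5, 2
A = rng.standard_normal((m, n))
y = rng.standard_normal(m)
C = rng.standard_normal((p, n))
d = rng.standard_normal(p)

# Step 1: a particular feasible point, e.g. the least-norm solution of Cx = d.
x_bar = np.linalg.pinv(C) @ d

# Step 2: orthonormal basis N for N(C), from the right singular vectors of C.
_, _, Vt = np.linalg.svd(C)
N = Vt[p:].T  # columns span the nullspace when C has full row rank

# Step 3: solve the reduced, unconstrained LS problem in z.
A_tilde = A @ N
y_tilde = y - A @ x_bar
z, *_ = np.linalg.lstsq(A_tilde, y_tilde, rcond=None)

# Step 4: recover the constrained solution.
x_star = x_bar + N @ z

print(np.allclose(C @ x_star, d))  # constraint is satisfied
```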
Weighted LS:
Sometimes the squared errors carry different weights. The objective function can be written as $f_0(x) = \|W(Ax - y)\|_2^2 = \|A_w x - y_w\|_2^2$, where
$W = \mathrm{diag}(w_1, \dots, w_m)$, $A_w \doteq WA$, $y_w \doteq Wy$
The problem thus reduces to an ordinary LS problem
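A minimal sketch of this reduction, with illustrative random data and weights:

```python
import numpy as np

rng = np.random.default_rng(2)

m, n = 6, 3
A = rng.standard_normal((m, n))
y = rng.standard_normal(m)
w = rng.uniform(0.5, 2.0, size=m)  # per-equation weights (illustrative values)

# Scale each row: A_w = W A, y_w = W y with W = diag(w).
W = np.diag(w)
Aw, yw = W @ A, W @ y

# Solve the resulting ordinary LS problem.
x_w, *_ = np.linalg.lstsq(Aw, yw, rcond=None)

# Sanity check: x_w satisfies the weighted normal equations
# A_w^T A_w x = A_w^T y_w.
print(np.allclose(Aw.T @ (Aw @ x_w - yw), np.zeros(n), atol=1e-8))
```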
$\ell_2$-regularized LS:
$\min_x \|Ax - y\|_2^2 + \lambda \|x\|_2^2, \quad \lambda \ge 0$
$\|Ax - y\|_2^2 + \lambda \|x\|_2^2 = \|\tilde{A}x - \tilde{y}\|_2^2$, where $\tilde{A} \doteq \begin{bmatrix} A \\ \sqrt{\lambda}\, I_n \end{bmatrix}$ and $\tilde{y} \doteq \begin{bmatrix} y \\ 0_n \end{bmatrix}$
$\lambda \ge 0$ is a tradeoff parameter: it balances output-tracking accuracy against input effort.
The regularized solution is $x^*(\lambda) := \arg\min_x \|Ax - y\|_2^2 + \lambda \|x\|_2^2$; for $\lambda > 0$ it is unique and given by $x^*(\lambda) = (A^\top A + \lambda I_n)^{-1} A^\top y$
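The equivalence between the augmented LS formulation and the regularized normal equations can be verified numerically; a minimal sketch with random illustrative data and an arbitrary choice of $\lambda$:

```python
import numpy as np

rng = np.random.default_rng(3)

m, n = 10, 4
A = rng.standard_normal((m, n))
y = rng.standard_normal(m)
lam = 0.1  # illustrative regularization parameter

# Augmented formulation: stack sqrt(lam) * I_n under A, and zeros under y.
A_tilde = np.vstack([A, np.sqrt(lam) * np.eye(n)])
y_tilde = np.concatenate([y, np.zeros(n)])
x_aug, *_ = np.linalg.lstsq(A_tilde, y_tilde, rcond=None)

# Equivalent closed form (regularized normal equations), valid for lam > 0.
x_ridge = np.linalg.solve(A.T @ A + lam * np.eye(n), A.T @ y)

print(np.allclose(x_aug, x_ridge))  # True
```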