Linearly Constrained least square:
- Minimize ∣∣Ax−b∣∣2 subject to Cx=d, equality constraints
- x^ is a solution of CLS if Cx^=d and ∣∣Ax^−b∣∣2≤∣∣Ax−b∣∣2 holds for any n-vector x that satisfies Cx=d
- how many row of C is dimension of λ
- Piecewise-polynomial fitting:
- In the case of splitting the graph into 2 in x-axis, the use p(x) to estimate points on the left graph and q(x) for the right
- Piecewise-polynomial f^ has the form:
- f^(x)={p(x)=θ1+θ2x+θ3x2+θ4x3q(x)=θ5+θ6x+θ7x2+θ8x3x≤ax>a
- To connect the graph, we need p(a)=q(a) and p′(a)=q′(a) as constraints
- θ1+θ2a+θ3a2+θ4a3−θ5−θ6a−θ7a2−θ8a3=0
- θ2+2θ3a+3θ4a2−θ6−2θ7a−3θ8a2=0
- Then fitting is minizing ∑i=1N(f^(xi)−yi)2 with constraints
- Prediction error on(xi,yi) is ai⊺θ−yi
- Sum square error is ∣∣Aθ−y∣∣2 where ai⊺ are the rows of A
- Solving CLS problem:
- The constrants are ci⊺x=di, i=1,...,p
- Use Lagrange Multipliers z1,...zp
- L(x,z)=f(x)+z1(c1⊺x−d1)+...+zp(cp⊺x−dp) where z is the p vector of Larange
- Optimal conditions are: ∂xi∂L(x^,z)=0, i=1,...,n and∂zi∂L(x^,z)=0, i=1,...,p
- ∂zi∂L(x^,z)=ci⊺x^−di=0 which we knew
- first n equations have form: ∂xi∂L(x^,z)=2j=1∑n(A⊺A)ijx^j−2(A⊺B)i+j=1∑pzjci=0
- Togethe with Cx^=d to get KKT conditions: [2A⊺ACC⊺0][x^z]=[2A⊺bd]
- a square set of n+p linear equations in variable x^,
- Assumming KKT matrix is invertible: [x^z]=[2A⊺ACC⊺0]−1[2A⊺bd]
- KKT matrix is invertible if and only if C has linearly independent rows and [AC] has linearly independent columns
- implies m+p≥n,p≤n
- Compute x^ in 2mn2+2(n+p)3 flops, order is n3 flops
- Vertification of solution:
- For every x satisfies Cx=d
- ∣∣Ax−b∣∣2=∣∣(Ax−Ax^)+(Ax^−b)∣∣2
- =∣∣A(x−x^)∣∣2+∣∣Ax^−b∣∣2+2(Ax−Ax^)⊺(Ax^−b)
- expand last term, using 2A⊺(Ax^−b)=−C⊺z, Cx=Cx^=d:
- 2(Ax−Ax^)⊺(Ax^−b)=2(x−x^)⊺A⊺(Ax^−b)=−(x−x^)⊺C⊺z=−(C(x−x^))⊺z=0
- so ∣∣Ax−b∣∣2=∣∣A(x−x^)∣∣2+∣∣Ax^−b∣∣2≥∣∣Ax^−b∣∣2 so x^ is the solution
Least-norm Problem:
- A simple case of CLS, to minimize ∣∣x∣∣2
- with A=I,b=0
- subject to Cx=d
- Solving Least norm problem:
- matrix [IC] always have independent columns
- Assume that C has indepent rows
- Optimal condition reduce to [2ICC⊺0][x^z]=[0d]
- then x^=−(1/2)C⊺z and −(1/2)CC⊺z=d
- Plug z=−2(CC⊺)−1d int the first equation to get x^=C⊺(CC⊺)−1d=C†d
- so when C has linearly independent rows:
- C† is a right inverse of C
- so for any D,x^=C†d satisfies Cx^=d
- ans we now know x^ is the smallest solution of Cx=d
Constrained least square applications:
Portfolio allocation:
- Allocate investment in a vector of n different assets, w
- wj is the fraction of portfolio allocated in asset j
- wj can be negative, meaning short position
- one of wj is the liquid, so 1⊺w=1
- w=en=[0 0 . . .1] means the portfolio is all cash
- Leverage L=∣w1∣+...+∣wn∣
- L=1 means all long position
- L>1 means atleast 1 short position
- Return over a period
- r^j is the return of asset j over the period, in fractional increase or decrease in value
- full portfolio return is VV+−V=r^⊺w
- for t-period, with return r1,...,rt; then Vt+1=V1(1+r1)(1+r2)...(1+rt)
- Return matrix:
- Hold portfolio with weights w over T periods
- define T×n (assets) return matrix, with Rtj is return of asset j in period t
- 1 row is return vector of 1 period, r~t⊺
- 1 column is the time series of 1 asset
- if last asset is risk-free, the last column of R is μrf1 where μrf is the risk-free per-period interest rate
- Portfolio return and risk:
- porfolio return vector (1 entry is return of 1 asset), r=Rw
- average return is avg(r) and risk is std(r)
- for small per-period returns, we have VT+1=V1(1+r1)...(1+rT)≈V1+V1(r1+...+rT)=V1+Tavg(r)V1
- so return approximates the avg per-period increase of portfolio value
- Annualized return and risk:
- Mean return and risk are often expressed in annualized form
- If there are P trading periods per year:
- annualized return =Pavg(r)
- annualized risk =Pstd(r)
- Portfolio optimization:
- minize the risk: std(Rw)2=(1/T)∣∣Rw−ρ1∣∣2
- with constrains 1⊺w=1 and avg(Rw)=ρ, meaning mean of past return is ρ
- solution w are Pareto Optimal
- Convert to contrained least squares:
- minize ∣∣Rw−ρ1∣∣2
- subject to [1⊺μ⊺]w=[1ρ]
- μ=R⊺1/T is n-vector of past asset returns
- solution: wz1z2=2R⊺R1⊺μ⊺100μ00−12ρTμ1ρ