Description:
- Prediction by Regression Model
- Actual value: $y = \beta_0 + \beta_1 x + \epsilon$
- Regression equation: $E(y) = \beta_0 + \beta_1 x$
- Predicted value: $\hat{y} = b_0 + b_1 x$
- $\hat{y}$ is the point estimator of $E(y)$
- In practice, $\beta_0$ and $\beta_1$ are not known, but can be estimated with $b_0$ and $b_1$
- where $b_0$ and $b_1$ are computed by the least squares method
- Some notations:
- $x^*$: given value of the independent variable $x$
- $y^*$: possible values (a range) of the dependent variable $y$ when $x = x^*$
- $\hat{y}^* = b_0 + b_1 x^*$: the point estimator of $E(y^*)$ and the predictor of an individual value of $y^*$ when $x = x^*$
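For instance (made-up numbers, purely illustrative): with estimates $b_0 = 10$ and $b_1 = 2$, the point prediction at $x^* = 5$ is

```latex
\hat{y}^* = b_0 + b_1 x^* = 10 + 2 \cdot 5 = 20
```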
Finding line of best fit:
- There are 2 main methods:
- Scattergraph method:
- Draw a line through data points with about an equal number of points above and below the line.
- Linear regression:
- Using correlation:
- We need to choose $a$ and $b$ to minimize the mean square error
- $E[(Y-(a+bX))^2] = E[Y^2] - 2aE[Y] - 2bE[XY] + a^2 + 2abE[X] + b^2E[X^2]$
- Taking the partial derivatives and setting them equal to 0 gives $a$ and $b$ at the minimum point (see the sketch just below)
- There cannot be a maximum, due to the nature of the problem: the objective is a convex quadratic in $a$ and $b$
- $b = \frac{\mathrm{Cov}(X,Y)}{\sigma_X^2}$
- $a = E[Y] - bE[X]$
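Spelling out that derivative step (standard calculus on the expansion above; substituting the first result into the second equation yields $b$):

```latex
\frac{\partial}{\partial a}E[(Y-(a+bX))^2] = -2E[Y] + 2a + 2bE[X] = 0
  \;\Rightarrow\; a = E[Y] - bE[X]
\frac{\partial}{\partial b}E[(Y-(a+bX))^2] = -2E[XY] + 2aE[X] + 2bE[X^2] = 0
  \;\Rightarrow\; b = \frac{E[XY]-E[X]E[Y]}{E[X^2]-E[X]^2} = \frac{\mathrm{Cov}(X,Y)}{\sigma_X^2}
```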
- The best linear predictor (lowest mean square error) is $\mu_y + \rho\frac{\sigma_y}{\sigma_x}(X - \mu_x)$
- where $\mu_y = E[Y]$, $\mu_x = E[X]$, and $\rho$ is the correlation of $X$ and $Y$
- The mean square error of this predictor is $E\!\left[\left(Y - \mu_y - \rho\frac{\sigma_y}{\sigma_x}(X - \mu_x)\right)^2\right] = \sigma_y^2(1-\rho^2)$
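A quick simulation sketch (all parameter values here are made up) that checks the $\sigma_y^2(1-\rho^2)$ claim empirically:

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed (made-up) population parameters
mu_x, mu_y = 1.0, 2.0
sd_x, sd_y, rho = 1.5, 2.5, 0.6

# Draw correlated (X, Y) pairs from a bivariate normal
cov = [[sd_x**2, rho * sd_x * sd_y],
       [rho * sd_x * sd_y, sd_y**2]]
X, Y = rng.multivariate_normal([mu_x, mu_y], cov, size=200_000).T

# Best linear predictor of Y from X
pred = mu_y + rho * (sd_y / sd_x) * (X - mu_x)

print(np.mean((Y - pred) ** 2))   # empirical mean square error
print(sd_y**2 * (1 - rho**2))     # theoretical sigma_y^2 * (1 - rho^2)
```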
- Using $y = a + bX$ (computational formulas for sample data):
- $b = \frac{n\sum x_i y_i - \sum x_i \sum y_i}{n\sum x_i^2 - (\sum x_i)^2}$
- $a = \frac{\sum y_i}{n} - b\frac{\sum x_i}{n} = \bar{y} - b\bar{x}$
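A minimal sketch of these computational formulas on a small made-up sample, including a point prediction $\hat{y}^* = a + b x^*$:

```python
import numpy as np

# Hypothetical sample data
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])
n = len(x)

# Computational formulas for the least squares coefficients
b = (n * np.sum(x * y) - np.sum(x) * np.sum(y)) / (n * np.sum(x**2) - np.sum(x)**2)
a = np.sum(y) / n - b * np.sum(x) / n   # a = y-bar - b * x-bar

# Point prediction at a given x*
x_star = 3.5
y_hat_star = a + b * x_star
print(f"a={a:.3f}, b={b:.3f}, prediction at x*={x_star}: {y_hat_star:.3f}")
```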
Assumptions about $\epsilon$ in the linear regression model:
- $E(\epsilon) = 0$. This implies $\beta_0$ and $\beta_1$ are constants, and hence $E(y) = \beta_0 + \beta_1 x$
- The variance of $\epsilon$, denoted by $\sigma^2$, is the same for all $x$
- The values of $\epsilon$ are independent
- $\epsilon$ is a normally distributed random variable for all values of $x$
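For concreteness, a short sketch (assumed parameter values) generating data that satisfies all four assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)

# Assumed true parameters for the illustration
beta0, beta1, sigma = 5.0, 2.0, 1.0

x = np.linspace(0, 10, 100)
# eps: independent draws, E(eps) = 0, constant variance sigma^2, normal at every x
eps = rng.normal(0.0, sigma, size=x.size)
y = beta0 + beta1 * x + eps   # actual values scatter around E(y) = beta0 + beta1 * x
```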
Testing for significance of the linear relationship:
- If $\beta_1 = 0$, then $E(y) = \beta_0$. In this case, we would conclude that $x$ and $y$ are not linearly related
- If $\beta_1 \neq 0$, we would conclude that the two variables are linearly related
- The t test is commonly used.
- It requires an estimate of $\sigma^2$, the variance of $\epsilon$ in the regression model
- With $\hat{y} = b_0 + b_1 x$, we can use the mean square error $s^2$ as an estimate of $\sigma^2$
- $s^2 = \frac{\sum (y_i - \hat{y}_i)^2}{n-2}$
- The standard error of the estimate, $s = \sqrt{s^2}$, is used to estimate $\sigma$
- $H_0: \beta_1 = 0$, $H_a: \beta_1 \neq 0$
- For the sampling distribution of $b_1$:
- Expected value: $E(b_1) = \beta_1$
- Standard deviation: $\sigma_{b_1} = \frac{\sigma}{\sqrt{\sum (x_i - \bar{x})^2}}$
- which we estimate with $s_{b_1} = \frac{s}{\sqrt{\sum (x_i - \bar{x})^2}}$
- $b_1 \sim N(\beta_1, \sigma_{b_1}^2)$
- $\to t = \frac{b_1 - \beta_1}{s_{b_1}} = \frac{b_1}{s_{b_1}}$ under $H_0$
- Reject $H_0$ if the p-value $\leq \alpha$; the test statistic follows a $t$ distribution with $n-2$ degrees of freedom
- Confidence interval for $\beta_1$ is $b_1 \pm t_{\alpha/2} s_{b_1}$
- with the two-tailed critical value $t_{\alpha/2}$ ($n-2$ degrees of freedom)
- If the CI includes 0, the hypothesized value $\beta_1 = 0$ is consistent with the data, so we cannot reject $H_0$: there is a chance the model is not significant enough to be used
- Conversely, if 0 lies outside the CI, we reject $H_0$
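Putting the whole test together: a sketch (slope_t_test is a hypothetical helper; scipy supplies the t distribution) that computes $t$, the p-value, and the CI for $\beta_1$:

```python
import numpy as np
from scipy import stats

def slope_t_test(x, y, alpha=0.05):
    """Hypothetical helper: t test of H0: beta1 = 0 in simple linear regression."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    n = len(x)
    sxx = np.sum((x - x.mean()) ** 2)
    b1 = np.sum((x - x.mean()) * (y - y.mean())) / sxx   # least squares slope
    b0 = y.mean() - b1 * x.mean()
    s2 = np.sum((y - (b0 + b1 * x)) ** 2) / (n - 2)      # MSE, estimates sigma^2
    s_b1 = np.sqrt(s2 / sxx)                             # estimated sd of b1
    t = b1 / s_b1                                        # statistic under H0
    p = 2 * stats.t.sf(abs(t), df=n - 2)                 # two-tailed p-value
    t_crit = stats.t.ppf(1 - alpha / 2, df=n - 2)
    ci = (b1 - t_crit * s_b1, b1 + t_crit * s_b1)        # CI for beta1
    return t, p, ci
```

Reject $H_0$ when the returned p-value is $\leq \alpha$, or equivalently when 0 falls outside the returned CI.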
Variance of the predicted value:
- $\mathrm{Var}(\hat{y}^*) = s_{\hat{y}^*}^2 = s^2\left[\frac{1}{n} + \frac{(x^* - \bar{x})^2}{\sum (x_i - \bar{x})^2}\right]$
- where $s^2$ is the mean square error computed from the collected data, as defined above
- Intuitively, the variance of $\hat{y}^*$ grows as $x^*$ moves away from the center $\bar{x}$ of the observed data
Confidence interval for the mean predicted value $E(y^*)$:
- Confidence interval for the mean of all predicted values when $x = x^*$ (see the sketch below)
- $(1-\alpha)100\%$ CI: $\hat{y}^* \pm t_{\alpha/2} s_{\hat{y}^*}$
- with $n-2$ degrees of freedom
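A sketch of that interval (mean_response_ci is a hypothetical helper mirroring the formulas above):

```python
import numpy as np
from scipy import stats

def mean_response_ci(x, y, x_star, alpha=0.05):
    """Hypothetical helper: CI for the mean response E(y*) at x = x_star."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    n = len(x)
    sxx = np.sum((x - x.mean()) ** 2)
    b1 = np.sum((x - x.mean()) * (y - y.mean())) / sxx
    b0 = y.mean() - b1 * x.mean()
    s2 = np.sum((y - (b0 + b1 * x)) ** 2) / (n - 2)               # MSE
    s_yhat = np.sqrt(s2 * (1 / n + (x_star - x.mean()) ** 2 / sxx))
    y_hat = b0 + b1 * x_star
    t_crit = stats.t.ppf(1 - alpha / 2, df=n - 2)
    return y_hat - t_crit * s_yhat, y_hat + t_crit * s_yhat
```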
Prediction interval for an individual value $y^*$: