Description:
- Prediction by Regression Model
- Actual value: $y = \beta_0 + \beta_1 x + \epsilon$
- Regression equation: $E(y) = \beta_0 + \beta_1 x$
- Predicted value: $\hat{y} = b_0 + b_1 x$
- $\hat{y}$ is the point estimator of $E(y)$
- In practice, $\beta_0$ and $\beta_1$ are not known, but can be estimated with $b_0$ and $b_1$
- where $b_0$ and $b_1$ are computed by the least squares method
- Some notations:
- $x^*$: given value of the independent variable $x$
- $y^*$: possible values (a range) of the dependent variable $y$ when $x = x^*$
- $\hat{y}^* = b_0 + b_1 x^*$: the point estimator of $E(y^*)$ and the predictor of an individual value of $y^*$ when $x = x^*$
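For instance (made-up numbers, purely illustrative): with estimates $b_0 = 10$ and $b_1 = 2$, the point prediction at $x^* = 5$ is

```latex
\hat{y}^* = b_0 + b_1 x^* = 10 + 2 \cdot 5 = 20
```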
Finding line of best fit:
- There are 2 main methods:
- Scattergraph method:
- Draw a line through data points with about an equal number of points above and below the line.
- Linear regression:
- Using correlation:
- We need to choose $a$ and $b$ to minimize the mean square error
- $E[(Y-(a+bX))^2] = E[Y^2] - 2aE[Y] - 2bE[XY] + a^2 + 2abE[X] + b^2E[X^2]$
- Taking the partial derivatives and setting them equal to 0 gives $a$ and $b$ at the minimum point (see the sketch just below)
- There cannot be a maximum, due to the nature of the problem: the objective is a convex quadratic in $a$ and $b$
- $b = \frac{\mathrm{Cov}(X,Y)}{\sigma_X^2}$
- $a = E[Y] - bE[X]$
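Spelling out that derivative step (standard calculus on the expansion above; substituting the first result into the second equation yields $b$):

```latex
\frac{\partial}{\partial a}E[(Y-(a+bX))^2] = -2E[Y] + 2a + 2bE[X] = 0
  \;\Rightarrow\; a = E[Y] - bE[X]
\frac{\partial}{\partial b}E[(Y-(a+bX))^2] = -2E[XY] + 2aE[X] + 2bE[X^2] = 0
  \;\Rightarrow\; b = \frac{E[XY]-E[X]E[Y]}{E[X^2]-E[X]^2} = \frac{\mathrm{Cov}(X,Y)}{\sigma_X^2}
```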
- The best linear predictor (lowest mean square error) is $\mu_y + \rho\frac{\sigma_y}{\sigma_x}(X - \mu_x)$
- where $\mu_y = E[Y]$, $\mu_x = E[X]$, and $\rho$ is the correlation of $X$ and $Y$
- The mean square error of this predictor is $E\!\left[\left(Y - \mu_y - \rho\frac{\sigma_y}{\sigma_x}(X - \mu_x)\right)^2\right] = \sigma_y^2(1-\rho^2)$
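A quick simulation sketch (all parameter values here are made up) that checks the $\sigma_y^2(1-\rho^2)$ claim empirically:

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed (made-up) population parameters
mu_x, mu_y = 1.0, 2.0
sd_x, sd_y, rho = 1.5, 2.5, 0.6

# Draw correlated (X, Y) pairs from a bivariate normal
cov = [[sd_x**2, rho * sd_x * sd_y],
       [rho * sd_x * sd_y, sd_y**2]]
X, Y = rng.multivariate_normal([mu_x, mu_y], cov, size=200_000).T

# Best linear predictor of Y from X
pred = mu_y + rho * (sd_y / sd_x) * (X - mu_x)

print(np.mean((Y - pred) ** 2))   # empirical mean square error
print(sd_y**2 * (1 - rho**2))     # theoretical sigma_y^2 * (1 - rho^2)
```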
- Using $y = a + bX$ (computational formulas for sample data):
- $b = \frac{n\sum x_i y_i - \sum x_i \sum y_i}{n\sum x_i^2 - (\sum x_i)^2}$
- $a = \frac{\sum y_i}{n} - b\frac{\sum x_i}{n} = \bar{y} - b\bar{x}$
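A minimal sketch of these computational formulas on a small made-up sample, including a point prediction $\hat{y}^* = a + b x^*$:

```python
import numpy as np

# Hypothetical sample data
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])
n = len(x)

# Computational formulas for the least squares coefficients
b = (n * np.sum(x * y) - np.sum(x) * np.sum(y)) / (n * np.sum(x**2) - np.sum(x)**2)
a = np.sum(y) / n - b * np.sum(x) / n   # a = y-bar - b * x-bar

# Point prediction at a given x*
x_star = 3.5
y_hat_star = a + b * x_star
print(f"a={a:.3f}, b={b:.3f}, prediction at x*={x_star}: {y_hat_star:.3f}")
```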
Assumptions about $\epsilon$ in the linear regression model:
- $E(\epsilon) = 0$. This implies $\beta_0$ and $\beta_1$ are constants, and hence $E(y) = \beta_0 + \beta_1 x$
- The variance of $\epsilon$, denoted by $\sigma^2$, is the same for all $x$
- The values of $\epsilon$ are independent
- $\epsilon$ is a normally distributed random variable for all values of $x$
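For concreteness, a short sketch (assumed parameter values) generating data that satisfies all four assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)

# Assumed true parameters for the illustration
beta0, beta1, sigma = 5.0, 2.0, 1.0

x = np.linspace(0, 10, 100)
# eps: independent draws, E(eps) = 0, constant variance sigma^2, normal at every x
eps = rng.normal(0.0, sigma, size=x.size)
y = beta0 + beta1 * x + eps   # actual values scatter around E(y) = beta0 + beta1 * x
```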
Testing for significance of the linear relationship:
- If $\beta_1 = 0$, then $E(y) = \beta_0$. In this case, we would conclude that $x$ and $y$ are not linearly related
- If $\beta_1 \neq 0$, we would conclude that the two variables are linearly related
- The t test is commonly used.
- It requires an estimate of $\sigma^2$, the variance of $\epsilon$ in the regression model
- With $\hat{y} = b_0 + b_1 x$, we can use the mean square error $s^2$ as an estimate of $\sigma^2$
- $s^2 = \frac{\sum (y_i - \hat{y}_i)^2}{n-2}$
- The standard error of the estimate, $s = \sqrt{s^2}$, is used to estimate $\sigma$
- $H_0: \beta_1 = 0$, $H_a: \beta_1 \neq 0$
- For the sampling distribution of $b_1$:
- Expected value: $E(b_1) = \beta_1$
- Standard deviation: $\sigma_{b_1} = \frac{\sigma}{\sqrt{\sum (x_i - \bar{x})^2}}$
- which we estimate with $s_{b_1} = \frac{s}{\sqrt{\sum (x_i - \bar{x})^2}}$
- $b_1 \sim N(\beta_1, \sigma_{b_1}^2)$
- $\to t = \frac{b_1 - \beta_1}{s_{b_1}} = \frac{b_1}{s_{b_1}}$ under $H_0$
- Reject $H_0$ if the p-value $\leq \alpha$; the test statistic follows a $t$ distribution with $n-2$ degrees of freedom
- Confidence interval for $\beta_1$ is $b_1 \pm t_{\alpha/2} s_{b_1}$
- with the two-tailed critical value $t_{\alpha/2}$ ($n-2$ degrees of freedom)
- If the CI includes 0, the hypothesized value $\beta_1 = 0$ is consistent with the data, so we cannot reject $H_0$: there is a chance the model is not significant enough to be used
- Conversely, if 0 lies outside the CI, we reject $H_0$
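Putting the whole test together: a sketch (slope_t_test is a hypothetical helper; scipy supplies the t distribution) that computes $t$, the p-value, and the CI for $\beta_1$:

```python
import numpy as np
from scipy import stats

def slope_t_test(x, y, alpha=0.05):
    """Hypothetical helper: t test of H0: beta1 = 0 in simple linear regression."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    n = len(x)
    sxx = np.sum((x - x.mean()) ** 2)
    b1 = np.sum((x - x.mean()) * (y - y.mean())) / sxx   # least squares slope
    b0 = y.mean() - b1 * x.mean()
    s2 = np.sum((y - (b0 + b1 * x)) ** 2) / (n - 2)      # MSE, estimates sigma^2
    s_b1 = np.sqrt(s2 / sxx)                             # estimated sd of b1
    t = b1 / s_b1                                        # statistic under H0
    p = 2 * stats.t.sf(abs(t), df=n - 2)                 # two-tailed p-value
    t_crit = stats.t.ppf(1 - alpha / 2, df=n - 2)
    ci = (b1 - t_crit * s_b1, b1 + t_crit * s_b1)        # CI for beta1
    return t, p, ci
```

Reject $H_0$ when the returned p-value is $\leq \alpha$, or equivalently when 0 falls outside the returned CI.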
Variance of the predicted value:
- $\mathrm{Var}(\hat{y}^*) = s_{\hat{y}^*}^2 = s^2\left[\frac{1}{n} + \frac{(x^* - \bar{x})^2}{\sum (x_i - \bar{x})^2}\right]$
- where $s^2$ is the mean square error computed from the collected data, as defined above
- Intuitively, the variance of $\hat{y}^*$ grows as $x^*$ moves away from the center $\bar{x}$ of the observed data
Confidence interval for the mean predicted value $E(y^*)$:
- Confidence interval for the mean of all predicted values when $x = x^*$ (see the sketch below)
- $(1-\alpha)100\%$ CI: $\hat{y}^* \pm t_{\alpha/2} s_{\hat{y}^*}$
- with $n-2$ degrees of freedom
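A sketch of that interval (mean_response_ci is a hypothetical helper mirroring the formulas above):

```python
import numpy as np
from scipy import stats

def mean_response_ci(x, y, x_star, alpha=0.05):
    """Hypothetical helper: CI for the mean response E(y*) at x = x_star."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    n = len(x)
    sxx = np.sum((x - x.mean()) ** 2)
    b1 = np.sum((x - x.mean()) * (y - y.mean())) / sxx
    b0 = y.mean() - b1 * x.mean()
    s2 = np.sum((y - (b0 + b1 * x)) ** 2) / (n - 2)               # MSE
    s_yhat = np.sqrt(s2 * (1 / n + (x_star - x.mean()) ** 2 / sxx))
    y_hat = b0 + b1 * x_star
    t_crit = stats.t.ppf(1 - alpha / 2, df=n - 2)
    return y_hat - t_crit * s_yhat, y_hat + t_crit * s_yhat
```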
Prediction interval for an individual value $y^*$: