helloworld: (Default)
[personal profile] helloworld
Practice problem for linear regression and predictions.

http://www.umiacs.umd.edu/~jbg/teaching/DATA_DIGGING/handout_04.pdf



#5 on the handout:

> newsdata <- read.csv(url("http://terpconnect.umd.edu/~ying/did/hw3/newspaper.csv"))
> plot(newsdata$daily, newsdata$Sunday)


Yes, it does make sense to use a linear regression:

> newsdata.lm = lm(Sunday ~ daily, data = newsdata)
> coeffs = coefficients(newsdata.lm); coeffs
(Intercept)       daily
  76.009807    1.276673


Now do the residuals:

> newsdata.res = resid(newsdata.lm)
> plot(newsdata$daily, newsdata.res, ylab="Residuals", xlab="Daily Readers", main = "Sunday Readership")
> abline(0,0)


(ouch, huge residuals)

Now the variance of the residuals:

> var(newsdata.res)
[1] 23308.4
> sd(newsdata.res)
[1] 152.6709
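
A side note on what var() measures here: it divides the residual sum of squares by n - 1, while the residual standard error that summary() reports for an lm fit divides by n - 2 (one degree of freedom per estimated coefficient), so the two differ slightly. A minimal sketch on made-up numbers (x, y and the other names below are invented for illustration, not the newspaper data):

```r
# Made-up data for illustration only (not the newspaper data).
x <- c(1, 2, 3, 4, 5, 6)
y <- c(2.1, 3.9, 6.2, 7.8, 10.1, 12.2)
fit <- lm(y ~ x)
res <- resid(fit)

n <- length(res)
rss <- sum(res^2)            # residual sum of squares

# Residuals from a fit with an intercept sum to zero, so var() is RSS / (n - 1).
v_sample <- var(res)
# The residual standard error from the model uses RSS / (n - 2).
s_model <- sigma(fit)

all.equal(v_sample, rss / (n - 1))        # TRUE
all.equal(s_model, sqrt(rss / (n - 2)))   # TRUE
```

The handout asks for the variance of the residuals, so sd(newsdata.res) is what gets used below; the n - 2 version would be a little larger.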


Then: "Predict the Sunday circulation of this hypothetical newspaper and give the range that would encompass one standard deviation of the normal distribution induced by this prediction (use the variance estimated from the previous question)."

This can be done by hand from the coefficients (76.009807 + 1.276673 * 800); predict() gives the same number:

> newdata <- data.frame(daily=800)
> predict(newsdata.lm, newdata)
       1
1097.348

The range is 1097 +/- 153 (one residual standard deviation, 152.67, rounded), or about 945 to 1250.
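
The same calculation can be sketched by hand, using only the numbers printed in the output above (the variable names are just illustrative shorthand, nothing from the handout):

```r
# Values copied from the console output above.
b0 <- 76.009807      # intercept
b1 <- 1.276673       # slope on daily
s  <- 152.6709       # sd of the residuals

daily <- 800
pred <- b0 + b1 * daily     # point prediction from the fitted line

# One standard deviation either side of the prediction.
c(lower = pred - s, prediction = pred, upper = pred + s)
```

This gives roughly 944.7 to 1250.0 around a prediction of 1097.3. It is a simple +/- 1 sd band as the question asks; it ignores the uncertainty in the coefficients themselves, which predict(newsdata.lm, newdata, interval = "prediction") would account for.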

