[personal profile] helloworld
Learning regularized linear regression modeling. Used the glmnet package. Just slapping a few of the most important things down here.

Remember: melt() comes from the reshape package, so load it first.

Handout: http://www.umiacs.umd.edu/~jbg/teaching/DATA_DIGGING/handout_04.pdf



6 Regularized Regression

Create a dataset with two nuisance variables:

> data(mtcars)
> mtcars <- cbind(runif(nrow(mtcars)), runif(nrow(mtcars)), mtcars)
> colnames(mtcars)[1:2] <- c("dummy1", "dummy2")


Fit regularized L2 (ridge, alpha=0) and L1 (lasso, alpha=1) regression:

> library(glmnet)
> target <- as.matrix(mtcars$mpg)
> features <- as.matrix(subset(mtcars, select=-c(mpg)))
> reg.l2 <- glmnet(features, target, alpha=0)
> reg.l1 <- glmnet(features, target, alpha=1)
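For reference, here is what alpha is switching between. As far as I understand the glmnet documentation, for a plain (Gaussian) regression the fit minimizes the objective below. The symbols N (number of rows), x_i, y_i, beta_0 and beta are my notation, not something from the handout.

```latex
\min_{\beta_0,\,\beta}\;
\frac{1}{2N}\sum_{i=1}^{N}\bigl(y_i - \beta_0 - x_i^{\top}\beta\bigr)^2
\;+\;
\lambda\left[\frac{1-\alpha}{2}\,\lVert\beta\rVert_2^2 \;+\; \alpha\,\lVert\beta\rVert_1\right]
```

With alpha=0 only the squared (L2) term is left, and with alpha=1 only the absolute-value (L1) term is left. The intercept beta_0 is not penalized.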


Plot L1 coefficients vs. lambda (do the same thing for L2):

> library(ggplot2)
> library(reshape)
> models <- data.frame(t(rbind(matrix(reg.l1$lambda, nrow=1), as.matrix(reg.l1$beta))))
> colnames(models)[1] <- "lambda"
> models <- melt(models, c("lambda"))
> ggplot(models) + aes(x=log(lambda), y=value, color=variable) + geom_line()
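For the "do the same thing for L2" part, a small helper saves retyping the reshaping step. This is only a sketch: path_long() is a name I made up, it builds the long data frame by hand instead of going through melt(), and the set.seed() call is my addition so the dummy columns come out the same every run.

```r
library(glmnet)
library(ggplot2)

# Same setup as above, seeded so the dummy columns are reproducible.
set.seed(1)
data(mtcars)
dat <- cbind(dummy1 = runif(nrow(mtcars)), dummy2 = runif(nrow(mtcars)), mtcars)
target <- as.matrix(dat$mpg)
features <- as.matrix(subset(dat, select = -c(mpg)))

# path_long() is a hypothetical helper, not part of glmnet:
# it returns one row per (lambda, variable) pair.
path_long <- function(fit) {
  beta <- as.matrix(fit$beta)  # rows = variables, columns = lambdas
  data.frame(
    lambda   = rep(fit$lambda, each = nrow(beta)),
    variable = rep(rownames(beta), times = ncol(beta)),
    value    = as.vector(beta)  # column-major, matching the two rep() calls
  )
}

reg.l2 <- glmnet(features, target, alpha = 0)
models.l2 <- path_long(reg.l2)
p <- ggplot(models.l2) + aes(x = log(lambda), y = value, color = variable) + geom_line()
print(p)
```

Calling path_long(reg.l1) gives the same kind of data frame for the L1 fit, so both plots come from one code path.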


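-- Importantly, what one notices is that as lambda gets big, all coefficients shrink toward zero because the penalty is so high. With L1 they hit exactly zero one at a time, so the last variable to drop out is, as a rough rule of thumb, the most important. With L2 they only get close to zero, so the plot is harder to read that way.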
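Instead of eyeballing the plot, the drop-out order can be read straight off the L1 coefficient matrix. A minimal sketch, assuming the same data setup as above (the seed and the name last.nonzero are mine):

```r
library(glmnet)

set.seed(1)
data(mtcars)
dat <- cbind(dummy1 = runif(nrow(mtcars)), dummy2 = runif(nrow(mtcars)), mtcars)
target <- as.matrix(dat$mpg)
features <- as.matrix(subset(dat, select = -c(mpg)))
reg.l1 <- glmnet(features, target, alpha = 1)

beta <- as.matrix(reg.l1$beta)  # rows = variables, columns = lambdas

# For each variable: the largest lambda at which its coefficient is nonzero
# (NA if it is zero along the whole path).
last.nonzero <- apply(beta, 1, function(b) {
  if (any(b != 0)) max(reg.l1$lambda[b != 0]) else NA_real_
})

# Variables that survive the heaviest penalty come first.
sort(last.nonzero, decreasing = TRUE)
```

If the rule of thumb holds, the two dummy columns should show up near the bottom of that list, but since they are random that is not guaranteed on every run.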

Use cross validation to determine the best lambda for L1 (do the same thing for L2):

> cv.l1 <- cv.glmnet(features, target, alpha=1)
> plot(cv.l1, main = "L1 n-fold cross validation error")


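To get the lambda with the lowest cross-validated error (the value depends on the random folds and the random dummy columns, so it will differ from run to run):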

> l <- cv.l1$lambda.min
> l
[1] 0.9644503
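Because the folds are assigned at random, fixing a seed before cv.glmnet() makes the chosen lambda repeatable. The sketch below also looks at lambda.1se, the other value cv.glmnet stores, which picks a heavier penalty that is still within one standard error of the best error. The seeds are arbitrary choices of mine.

```r
library(glmnet)

set.seed(1)
data(mtcars)
dat <- cbind(dummy1 = runif(nrow(mtcars)), dummy2 = runif(nrow(mtcars)), mtcars)
target <- as.matrix(dat$mpg)
features <- as.matrix(subset(dat, select = -c(mpg)))

set.seed(42)  # fold assignment is random, so seed it
cv.l1 <- cv.glmnet(features, target, alpha = 1)  # nfolds defaults to 10

cv.l1$lambda.min  # lambda with the lowest mean cross-validated error
cv.l1$lambda.1se  # largest lambda within one standard error of that minimum

# Coefficients (including the intercept) at the more conservative choice.
coef(cv.l1, s = "lambda.1se")
```

lambda.1se is never smaller than lambda.min, so it tends to keep fewer variables.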


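And then one can use predict() to grab all the coefficients and the intercept for that lambda, or to get predictions for new data (testing_set stands for a matrix with the same columns as features), ie:

> predict(reg.l1, type="coefficients", s = cv.l1$lambda.min)
> mypredictions <- predict(reg.l1, newx=testing_set, s = cv.l1$lambda.min)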
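Since testing_set is never actually built in these notes, here is one way the whole thing could look end to end. It is a sketch under my own assumptions: the 24/8 split, the seed, nfolds=5 (to keep a few rows in every fold with only 24 training rows), and the names test.idx, training_set and cv.train are all made up for illustration.

```r
library(glmnet)

set.seed(1)
data(mtcars)
dat <- cbind(dummy1 = runif(nrow(mtcars)), dummy2 = runif(nrow(mtcars)), mtcars)
x <- as.matrix(subset(dat, select = -c(mpg)))
y <- dat$mpg

# Hypothetical split: hold out 8 rows for testing, train on the other 24.
test.idx <- sample(nrow(x), 8)
training_set <- x[-test.idx, ]
testing_set  <- x[test.idx, ]

# Pick lambda by cross validation on the training rows only.
cv.train <- cv.glmnet(training_set, y[-test.idx], alpha = 1, nfolds = 5)

# Predictions for the held-out rows at the best lambda, plus the test RMSE.
mypredictions <- predict(cv.train, newx = testing_set, s = "lambda.min")
sqrt(mean((y[test.idx] - mypredictions)^2))

# Intercept and coefficients at the same lambda.
coef(cv.train, s = "lambda.min")
```

Predicting from the cv.glmnet object with s="lambda.min" is equivalent to passing the numeric lambda to the underlying glmnet fit, it just saves a step.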