Practical Machine Learning/ 016preProcessingPCA - Lecture example code change #157

Open
Arcticgrayling opened this issue Nov 3, 2016 · 1 comment

Comments

@Arcticgrayling

This lecture needs to be updated: the way the "train" function has to be called has changed.

With the new version of caret (version 6.0-71), the lecture code:
modelFit <- train(training$type ~ ., method="glm", data=trainPC)
gives an error.

I raised the issue with the caret package maintainers:
https://github.com/topepo/caret/issues/480
They say the lecture code is incorrect and that we should use this instead:
modelFit <- train(x = trainPC, y = training$type, method="glm")

You shouldn't use the data set name on the LHS of the formula. The formula interface should be used when the variables are in columns of the object that the data argument refers to.

If type is not in training and there are only numeric variables in trainPC, then you should use the non-formula method:
modelFit <- train(x = trainPC, y = training$type,method="glm")
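
For anyone hitting the same error, here is a minimal sketch of the two call styles described above. It is not the lecture's code: the data are simulated, and the column names a, b, c and the object trainPCdf are hypothetical stand-ins. Only train, preProcess and predict come from caret and base R.

```r
library(caret)

set.seed(1)

# Simulated stand-in for the lecture's training data (an assumption, not the real data set)
n <- 200
training <- data.frame(a = rnorm(n), b = rnorm(n), c = rnorm(n))
training$type <- factor(ifelse(training$a + rnorm(n) > 0, "spam", "nonspam"))

# PCA on the numeric predictors only; method = "pca" also centers and scales them
predictorCols <- c("a", "b", "c")
preProc <- preProcess(training[, predictorCols], method = "pca", pcaComp = 2)
trainPC <- predict(preProc, training[, predictorCols])

# Option 1: non-formula interface, because the outcome is not a column of trainPC
modelFit <- train(x = trainPC, y = training$type, method = "glm")

# Option 2: formula interface, after putting the outcome into the same data frame
trainPCdf <- data.frame(trainPC, type = training$type)
modelFit2 <- train(type ~ ., data = trainPCdf, method = "glm")
```

Both calls fit a glm on the same two principal components, so the final models should agree. The second form only works because type is now a column of the object passed as data.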

@jtleek
Contributor

jtleek commented Nov 3, 2016

Hi Peter, thanks for the note. I'm going to be making some edits to this ASAP.



Matthew-May referenced this issue in bcaffo/courses Feb 18, 2018