Review primitive hyperparams that cause errors #29

Open
roquelopez opened this issue Apr 19, 2023 · 1 comment

Comments

@roquelopez
Collaborator

Some default hyperparameter values cause errors every time they are used. For instance, the 'average' hyperparameter of the Sklearn Imputer always fails for categorical features (the average of a categorical feature cannot be computed). These values should be changed, like here.
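A minimal sketch of the failure described above, assuming the 'average' value corresponds to `strategy="mean"` of scikit-learn's `SimpleImputer` (the issue does not name the exact class or parameter). The toy column and the choice of `most_frequent` as a replacement are illustrative assumptions, not the project's actual fix.

```python
import numpy as np
from sklearn.impute import SimpleImputer

# Toy categorical column with one missing value (illustrative data only).
X = np.array([["a"], ["a"], ["b"], [np.nan]], dtype=object)

# Assumption: 'average' in the issue maps to strategy="mean".
# The mean of string categories is undefined, so scikit-learn raises ValueError.
try:
    SimpleImputer(strategy="mean").fit_transform(X)
    mean_error = None
except ValueError as exc:
    mean_error = str(exc)

# A strategy that is defined for categorical data, such as "most_frequent",
# fills the gap with the most common category instead.
filled = SimpleImputer(strategy="most_frequent").fit_transform(X)

print(mean_error)
print(filled.ravel())
```

A check like this could be run per primitive to find default values that never succeed on a given feature type.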

@madhuripujari95
Collaborator

The issue seems to be with CountVectorizer and TfidfVectorizer when they are used with StandardScaler and RobustScaler (the errors are shown in the screenshot below).
image

  • Fixed the error "ValueError: Cannot center sparse matrices: use with_centering=False instead. See docstring for motivation and alternatives" with the code below in pipeline_builder.py
    elif isinstance(primitive_object, RobustScaler):
        primitive_object.set_params(with_centering=False)

image

  • Removed StandardScaler from the grammar and reran the code: all the errors are gone, and fasttext is at rank 15-16
    (not sure whether this drastically impacts the other pipelines; more testing is needed)
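The two points above can be reproduced in isolation with the sketch below. It assumes the vectorizer output is a SciPy sparse matrix, which is what triggers the centering error. The helper `disable_centering` is hypothetical and is not the code in pipeline_builder.py. Its `StandardScaler` branch (`with_mean=False`) is an untested alternative to removing StandardScaler from the grammar, not something done in this thread.

```python
from scipy.sparse import issparse
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.preprocessing import RobustScaler, StandardScaler


def disable_centering(primitive_object):
    """Hypothetical helper: switch off centering so a scaler accepts sparse input."""
    if isinstance(primitive_object, StandardScaler):
        # Assumption: keep StandardScaler but skip mean subtraction.
        primitive_object.set_params(with_mean=False)
    elif isinstance(primitive_object, RobustScaler):
        # Same parameter change as the fix quoted above.
        primitive_object.set_params(with_centering=False)
    return primitive_object


# Illustrative corpus; TfidfVectorizer returns a sparse matrix.
docs = ["the cat sat", "the dog barked", "a cat and a dog"]
X = TfidfVectorizer().fit_transform(docs)

# With default parameters, both scalers refuse to center sparse data.
errors = {}
for scaler in (StandardScaler(), RobustScaler()):
    try:
        scaler.fit_transform(X)
    except ValueError as exc:
        errors[type(scaler).__name__] = str(exc)

# With centering disabled, both scalers run and keep the output sparse.
scaled = {
    type(s).__name__: disable_centering(s).fit_transform(X)
    for s in (StandardScaler(), RobustScaler())
}

print(errors)
print({name: m.shape for name, m in scaled.items()})
```

Scaling without centering changes what the scaler does, so the effect on pipeline rankings would still need the additional testing mentioned above.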
