BUG: create_feature_importance_chart has side effect on the regressor argument when the regressor has one-dimensional coef_ #29

Closed
pc-pallon opened this issue Jul 3, 2024 · 3 comments · Fixed by #30
Labels: bug (Something isn't working)

Comments

@pc-pallon

Describe the bug

create_feature_importance_chart modifies the regressor passed to it when the regressor has a one-dimensional coef_.
This is the case when the regressor is fitted with a one-dimensional target, such as a pd.Series rather than a pd.DataFrame.
It happens even with the simplest LinearRegression example given in the sklearn documentation.
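
To make the one-dimensional case concrete, here is a minimal sketch (assuming only NumPy and scikit-learn are installed) showing how the shape of coef_ follows the shape of the target. The names y_1d and y_2d are illustrative and not part of the original report.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

X = np.array([[1, 1], [1, 2], [2, 2], [2, 3]])
y_1d = np.dot(X, np.array([1, 2])) + 3  # 1-D target, shape (4,)
y_2d = y_1d.reshape(-1, 1)              # same values as a 2-D target, shape (4, 1)

# A 1-D target gives coef_ of shape (n_features,);
# a 2-D target gives coef_ of shape (n_targets, n_features).
coef_1d = LinearRegression().fit(X, y_1d).coef_
coef_2d = LinearRegression().fit(X, y_2d).coef_

print(coef_1d.shape)
print(coef_2d.shape)
```

Only the first form, with a one-dimensional coef_, is the one this report is about.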

Reproduction

This is the minimal code that exhibits the unintended behavior:

import numpy as np
from sklearn.linear_model import LinearRegression
import neptune.integrations.sklearn as npt_utils

X = np.array([[1, 1], [1, 2], [2, 2], [2, 3]])
y = np.dot(X, np.array([1, 2])) + 3

reg = LinearRegression().fit(X, y)

print(reg.coef_)
# Out: [1. 2.]

npt_utils.create_feature_importance_chart(reg, X, y)

print(reg.coef_)
# Out: [ 50. 100.]

Attached is a screenshot of equivalent code run in Jupyter:
[Screenshot 2024-07-03 at 15 35 08]

Expected behavior

The two printed values of reg.coef_ should be identical; calling create_feature_importance_chart should not modify the regressor.
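
The reported output [50, 100] is consistent with the coefficients being rescaled in place to percentages of the largest coefficient (1/2 and 2/2, times 100). The sketch below is only an illustration of how such a side effect can arise through NumPy aliasing; it is not the library's code. FakeRegressor and both helper functions are hypothetical names.

```python
import numpy as np


class FakeRegressor:
    """Hypothetical stand-in for a fitted estimator with a 1-D coef_."""

    def __init__(self):
        self.coef_ = np.array([1.0, 2.0])


def relative_importance_in_place(model):
    # `imp` is the same array object as model.coef_, so the in-place
    # operators below write straight through to the model.
    imp = model.coef_
    imp /= np.abs(imp).max()
    imp *= 100.0
    return imp


def relative_importance_copy(model):
    # Copying first keeps the model's own array untouched.
    imp = model.coef_.copy()
    imp /= np.abs(imp).max()
    imp *= 100.0
    return imp


safe = FakeRegressor()
relative_importance_copy(safe)

mutated = FakeRegressor()
relative_importance_in_place(mutated)

print(safe.coef_)     # [1. 2.]
print(mutated.coef_)  # [ 50. 100.]
```

Working on a copy is the usual way to avoid this class of bug, since in-place operators such as /= and *= never allocate a new array.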

Traceback

N/A

Environment

The output of pip list:
Related packages:

neptune                   1.10.4
neptune-sklearn           2.1.3
scikit-learn              1.5.0
scikit-plot               0.3.7
scipy                     1.11.4
All packages:

agate                     1.9.1
annotated-types           0.7.0
appdirs                   1.4.4
arrow                     1.3.0
attrs                     23.2.0
Babel                     2.15.0
boto3                     1.34.121
botocore                  1.34.121
bravado                   11.0.3
bravado-core              6.1.1
cachetools                5.3.3
certifi                   2024.2.2
cfgv                      3.4.0
chardet                   5.2.0
charset-normalizer        3.3.2
click                     8.1.7
colorama                  0.4.6
contourpy                 1.2.1
cycler                    0.12.1
daff                      1.3.46
db-dtypes                 1.2.0
dbt-adapters              1.3.2
dbt-bigquery              1.8.2
dbt-common                1.5.0
dbt-core                  1.8.3
dbt-extractor             0.5.1
dbt-metabase              1.3.1
dbt-semantic-interfaces   0.5.1
deepdiff                  7.0.1
diff_cover                9.1.0
distlib                   0.3.8
exceptiongroup            1.2.1
filelock                  3.15.4
fonttools                 4.53.0
fqdn                      1.5.1
future                    1.0.0
gitdb                     4.0.11
GitPython                 3.1.43
google-api-core           2.19.0
google-auth               2.29.0
google-cloud-bigquery     3.23.1
google-cloud-core         2.4.1
google-cloud-dataproc     5.10.0
google-cloud-storage      2.17.0
google-crc32c             1.5.0
google-resumable-media    2.7.0
googleapis-common-protos  1.63.0
grpc-google-iam-v1        0.13.1
grpcio                    1.64.0
grpcio-status             1.62.2
identify                  2.5.36
idna                      3.7
importlib-metadata        6.11.0
iniconfig                 2.0.0
isodate                   0.6.1
isoduration               20.11.0
Jinja2                    3.1.4
jinja2-simple-tags        0.6.1
jmespath                  1.0.1
joblib                    1.4.2
jsonpointer               2.4
jsonref                   1.1.0
jsonschema                4.22.0
jsonschema-specifications 2023.12.1
kiwisolver                1.4.5
leather                   0.4.0
Logbook                   1.5.3
markdown-it-py            3.0.0
MarkupSafe                2.1.5
mashumaro                 3.13.1
matplotlib                3.9.0
mdurl                     0.1.2
minimal-snowplow-tracker  0.0.2
monotonic                 1.6
more-itertools            10.3.0
msgpack                   1.0.8
neptune                   1.10.4
neptune-sklearn           2.1.3
networkx                  3.3
nodeenv                   1.8.0
numpy                     1.26.4
oauthlib                  3.2.2
ordered-set               4.1.0
packaging                 24.0
pandas                    2.2.2
pandas-stubs              2.2.2.240514
parsedatetime             2.6
pathspec                  0.12.1
pillow                    10.3.0
pip                       23.2.1
platformdirs              4.2.2
pluggy                    1.5.0
pre-commit                3.2.2
proto-plus                1.23.0
protobuf                  4.25.3
psutil                    5.9.8
pyarrow                   16.1.0
pyasn1                    0.6.0
pyasn1_modules            0.4.0
pydantic                  2.8.0
pydantic_core             2.20.0
Pygments                  2.18.0
PyJWT                     2.8.0
pyparsing                 3.1.2
pyright                   1.1.364
pytest                    8.2.2
python-dateutil           2.9.0.post0
python-dotenv             1.0.1
python-slugify            8.0.4
pytimeparse               1.1.8
pytz                      2024.1
PyYAML                    6.0.1
referencing               0.35.1
regex                     2024.5.15
requests                  2.32.2
requests-oauthlib         2.0.0
rfc3339-validator         0.1.4
rfc3986-validator         0.1.1
rich                      13.7.1
rpds-py                   0.18.1
rsa                       4.9
ruamel.yaml               0.18.6
ruamel.yaml.clib          0.2.8
s3transfer                0.10.1
scikit-learn              1.5.0
scikit-plot               0.3.7
scipy                     1.11.4
setuptools                70.0.0
simplejson                3.19.2
six                       1.16.0
smmap                     5.0.1
sqlfluff                  2.3.5
sqlfluff-templater-dbt    2.3.5
sqlparse                  0.5.0
swagger-spec-validator    3.0.3
tblib                     3.0.0
text-unidecode            1.3
threadpoolctl             3.5.0
toml                      0.10.2
tomli                     2.0.1
tqdm                      4.66.4
types-python-dateutil     2.9.0.20240316
types-pytz                2024.1.0.20240417
typing_extensions         4.12.1
tzdata                    2024.1
uri-template              1.3.0
urllib3                   2.2.1
virtualenv                20.26.3
webcolors                 24.6.0
websocket-client          1.8.0
wheel                     0.41.0
yellowbrick               1.5
zipp                      3.19.2

The operating system you're using:
macOS 14.5

The output of python --version:
Python 3.10.12

Additional context

Not applicable

SiddhantSadangi self-assigned this Jul 3, 2024
SiddhantSadangi added the bug (Something isn't working) label Jul 3, 2024
@SiddhantSadangi
Member

SiddhantSadangi commented Jul 3, 2024

Hey @pc-pallon 👋
Thanks for bringing this to our attention.

Looks like an easy fix. I'll let you know once the fix is released ✅

@SiddhantSadangi
Member

Hey @pc-pallon 👋

This should be fixed in neptune-sklearn 2.1.4 🚀

Can you please check and let me know?
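
One way to run such a check is sketched below, assuming only NumPy. The helper leaves_coef_unchanged and the two stand-in functions are hypothetical names introduced for illustration. With neptune-sklearn installed, the same helper could be pointed at create_feature_importance_chart using the (reg, X, y) call from the report.

```python
import copy
from types import SimpleNamespace

import numpy as np


def leaves_coef_unchanged(func, estimator, *args, **kwargs):
    """Return True if calling func(estimator, ...) leaves estimator.coef_ intact."""
    before = copy.deepcopy(estimator.coef_)
    func(estimator, *args, **kwargs)
    return np.array_equal(before, estimator.coef_)


# Stand-ins used here so that the sketch runs without neptune-sklearn.
def well_behaved(estimator):
    return estimator.coef_ * 100.0  # builds a new array, no mutation


def badly_behaved(estimator):
    estimator.coef_ *= 100.0  # mutates the estimator's own array


ok = leaves_coef_unchanged(well_behaved, SimpleNamespace(coef_=np.array([1.0, 2.0])))
bad = leaves_coef_unchanged(badly_behaved, SimpleNamespace(coef_=np.array([1.0, 2.0])))
print(ok, bad)

# With neptune-sklearn installed (not executed here):
# leaves_coef_unchanged(npt_utils.create_feature_importance_chart, reg, X, y)
```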

@pc-pallon
Author

Hi @SiddhantSadangi, looks like it works. Thanks for the quick fix!
