DataSetError when trying to save dataframe to ParquetDataSet #1286
meaningfromdata started this conversation in Idea
Replies: 1 comment
-
Update: When I remove load_args and save_args and specify only the type and filepath, saving to and loading from parquet works.
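For anyone landing here, a minimal sketch of what such a stripped-down entry in catalog.yml might look like. The dataset name and filepath are hypothetical placeholders, not taken from my project; only `type` and `filepath` are set, with no load_args or save_args:

```yaml
# Hypothetical dataset name and filepath, for illustration only.
# No load_args or save_args are given, so the dataset falls back to its defaults.
my_dataframe:
  type: pandas.ParquetDataSet
  filepath: data/02_intermediate/my_dataframe.parquet
```

As far as I can tell, "file_scheme" and "has_nulls" are options of the fastparquet writer rather than pyarrow, which may be why they are rejected in an environment that uses pyarrow. I have not confirmed this against the kedro source.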
-
I'm trying to save a pandas dataframe to a parquet file in local storage as the output of a node. However, when I run the pipeline I get a DataSetError. The error message seems to indicate a problem with the keyword arguments in save_args. It rejects every keyword I have tried, including "file_scheme", "has_nulls" and "engine" (each of those listed below raises the DataSetError).
I am following the example for putting parquet files in the catalog.yml from https://kedro.readthedocs.io/en/stable/kedro.extras.datasets.pandas.ParquetDataSet.html#kedro.extras.datasets.pandas.ParquetDataSet
Here is what my catalog entry looks like:
Any help getting this to work would be appreciated. I am using kedro 0.17.6 with Python 3.7.11, pandas 1.3.5 and pyarrow 6.0.1.