write_parquet encoding no longer recognized by PBI Service parquet connector from Polars 1.5.0 onwards #18819
Labels: bug, needs triage, python
Checks
Reproducible example
Log output
No response
Issue description
To explain the setup:
The parquet file is written to a network drive. A report published to the PBI Service connects to this file through an on-premises gateway.
Refreshing works on a local copy of the PBI file, but a refresh through the PBI Service now fails with this error:
Data source error: {"error":{"code":"DM_GWPipeline_Gateway_MashupDataAccessError","pbi.error":{"code":"DM_GWPipeline_Gateway_MashupDataAccessError","parameters":{},"details":[{"code":"DM_ErrorDetailNameCode_UnderlyingErrorCode","detail":{"type":1,"value":"-2147467259"}},{"code":"DM_ErrorDetailNameCode_UnderlyingErrorMessage","detail":{"type":1,"value":"Parquet: class parquet::ParquetException (message: 'Unknown encoding type.')"}},{"code":"DM_ErrorDetailNameCode_UnderlyingHResult","detail":{"type":1,"value":"-2147467259"}},{"code":"Microsoft.Data.Mashup.ValueError.Reason","detail":{"type":1,"value":"DataFormat.Error"}}],"exceptionCulprit":1}}}
The refresh works fine locally; the problem is specific to the PBI Service. I generated my parquet files with each Polars version from 1.2 up to the current release, and the errors start with files written by write_parquet in Polars 1.5.0.
I believe something changed in the write_parquet output that makes it incompatible with the PBI Service's parquet connector in newer versions. I have compared the schema and the metadata of the old and new output, and they are exactly the same.
Expected behavior
As nothing has changed in my schema or metadata, the files should refresh, but the encoding written by write_parquet appears not to be recognized by the PBI Service from 1.5.0 onwards.
Installed versions