write_csv ignores formatting when writing to io.StringIO() #18825

Open
2 tasks done
bskubi opened this issue Sep 19, 2024 · 2 comments
Labels: bug, needs triage, python

Comments


bskubi commented Sep 19, 2024

Checks

  • I have checked that this issue has not already been reported.
  • I have confirmed this bug exists on the latest version of Polars.

Reproducible example

from polars import DataFrame
import io

df = DataFrame({"c1":[1, 2], "c2":[3, 4]})
buffer = io.StringIO()
df.write_csv(buffer, separator="\t", include_header=False)
print(buffer.getvalue())

Expected output:

1      3
2      4

Actual output:

c1,c2
1,3
2,4

Log output

No response

Issue description

I'm trying to obtain a Python string containing the output that would be written by the write_csv function.

Expected behavior

The code sample above should print a string that is tab-delimited and has no header, per the options I specify with write_csv.
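
Until the StringIO path honors these options, one possible workaround is to write to a bytes buffer and decode the result yourself. The sketch below assumes that writing to io.BytesIO applies the CSV options as usual. csv_text is a hypothetical helper name, not part of Polars.

```python
import io


def csv_text(frame, **csv_options) -> str:
    """Return the output of `frame.write_csv` as a Python str.

    Hypothetical helper, not a Polars API. It deliberately avoids
    io.StringIO: the frame writes into a bytes buffer and the bytes
    are decoded afterwards.
    """
    buffer = io.BytesIO()
    # Forward every keyword option (separator, include_header, ...) unchanged.
    frame.write_csv(buffer, **csv_options)
    return buffer.getvalue().decode("utf-8")
```

With the DataFrame from the example, csv_text(df, separator="\t", include_header=False) should give the tab-delimited, header-less text. Calling df.write_csv(separator="\t", include_header=False) with no file argument is documented to return the CSV as a string and may be simpler still.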

Installed versions

--------Version info---------
Polars:              1.7.1
Index type:          UInt32
Platform:            Linux-6.8.0-40-generic-x86_64-with-glibc2.35
Python:              3.12.3 | packaged by conda-forge | (main, Apr 15 2024, 18:38:13) [GCC 12.3.0]

----Optional dependencies----
adbc_driver_manager  <not installed>
altair               <not installed>
cloudpickle          3.0.0
connectorx           <not installed>
deltalake            <not installed>
fastexcel            <not installed>
fsspec               2024.6.1
gevent               <not installed>
great_tables         <not installed>
matplotlib           3.9.1
nest_asyncio         1.6.0
numpy                1.26.4
openpyxl             <not installed>
pandas               2.2.2
pyarrow              16.1.0
pydantic             2.8.2
pyiceberg            <not installed>
sqlalchemy           2.0.34
torch                <not installed>
xlsx2csv             <not installed>
xlsxwriter           <not installed>
bskubi added the bug, needs triage, and python labels Sep 19, 2024

cmdlineluser (Contributor)

Can reproduce.

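This seems to be an issue in the Python-side logic only.

The inner .write_csv(buf) call on line 2862 does not forward the user's parameters (separator, include_header, and so on), so the StringIO branch always writes with the defaults.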

def write_csv_to_string() -> str:
    with BytesIO() as buf:
        self.write_csv(buf)
        csv_bytes = buf.getvalue()
    return csv_bytes.decode("utf8")

should_return_buffer = False
if file is None:
    buffer = file = BytesIO()
    should_return_buffer = True
elif isinstance(file, StringIO):
    csv_str = write_csv_to_string()
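
To illustrate the kind of fix this points to, here is a minimal self-contained sketch in which the StringIO branch forwards the caller's options to the inner call. ToyFrame and its _write_text method are made-up stand-ins built on the standard csv module. This is not the Polars implementation, and only two of the options are modelled.

```python
import csv
import io


class ToyFrame:
    """Toy stand-in for a DataFrame (hypothetical, not Polars code)."""

    def __init__(self, columns: dict[str, list]):
        self.columns = columns

    def _write_text(self, out, separator, include_header):
        # Plain-text CSV writer used by the non-StringIO paths.
        writer = csv.writer(out, delimiter=separator, lineterminator="\n")
        if include_header:
            writer.writerow(self.columns.keys())
        writer.writerows(zip(*self.columns.values()))

    def write_csv(self, file=None, *, separator=",", include_header=True):
        if isinstance(file, io.StringIO):
            # The reported bug corresponds to calling self.write_csv(buf)
            # here with no options. Forwarding them keeps the formatting.
            with io.BytesIO() as buf:
                self.write_csv(
                    buf, separator=separator, include_header=include_header
                )
                file.write(buf.getvalue().decode("utf8"))
            return None

        text = io.StringIO()
        self._write_text(text, separator, include_header)
        if file is None:
            # No target given: return the CSV as a string.
            return text.getvalue()
        # Binary file-like target: write encoded bytes.
        file.write(text.getvalue().encode("utf8"))
        return None
```

With this shape, writing ToyFrame({"c1": [1, 2], "c2": [3, 4]}) to a StringIO with separator="\t" and include_header=False produces the tab-delimited, header-less text the issue expects.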

mcrumiller (Contributor) commented Sep 19, 2024

Maybe we should rename the parameter to source_schema? This issue seems to pop up quite often.
