-
Notifications
You must be signed in to change notification settings - Fork 7
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: extend fastexcel
support to python >= 3.8
#152
feat: extend fastexcel
support to python >= 3.8
#152
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, thanks a lot for your PR!
We wanted to propose fastexcel
as an engine to polars
as well, but you've been faster 😉
The changes look good to me overall, just a few comments here and there.
Also, have you encountered any difficulties/questions when you looked at the code ? We want to make the docs and README more welcoming to newcomers, so feedback from a first-time contributor would be precious 😄
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
Actually I was pleasantly surprised how easy it was to go from "clone repo" to "everything is compiled and tests are running/passing" 👌 On that front I only have one comment, which is that I didn't immediately expect Otherwise it was smooth sailing; was up & running almost immediately. As for the source itself, I have only given it a superficial scan so far - mostly just looking to see what features are exposed (eg: is there a "has_headers" flag in case the sheet has no column names, an option to set them if there aren't, etc). |
Good point, I'll update the make target names :) Regarding the features you'd like to add, feel free to open issues and @ me on them! We've only exposed what we need internally for now, but things such as sheet headers handling should be pretty easy to add |
@alexander-beedie |
Fantastic; thanks for the extremely quick turnaround on this! I've already integrated it as a new option for Polars in a local build and it's looking great. Will finish up some unit tests and look to push it out in the next day or two ✌️ |
Hoping that a patch to extend Python support is welcome 😉
I was looking to integrate a
calamine
based Excel parser as one of our supported engines in the Polarsread_excel
method (I am one of the core devs for that project), but we need to support Python versions >= 3.8. While we could potentially havepython_calamine
handle earlier versions, I'm reluctant to do so asfastexcel
is much better-suited for our use-case and benchmarks significantly faster (producing Arrow data without materialising anything in Python-space; nicely done!)Took a look at the code to see if it could be extended to cover the earlier versions, and it seems like a straightforward low-impact update; I've validated that everything compiles/runs on the earlier Python versions (testing on x86 Ubuntu Linux for py3.8 and an Apple Silicon Mac for py3.9). Mostly just some minor typing updates and avoidance of the
match
expression in one of the tests. Put in an extra lint rule to catch the need for afrom __future__ import annotations
statement at the top of some files (to keep the modern typing syntax).I can't validate the workflow changes, but with a little luck they'll run ok 🤞
Thanks for the great work, and let me know if you want any changes/modifications, etc?