feat(python): Support loading data from multiple Excel/ODS workbooks #20404
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## main #20404 +/- ##
==========================================
- Coverage 79.13% 78.96% -0.17%
==========================================
Files 1572 1562 -10
Lines 219839 220103 +264
Branches 2462 2486 +24
==========================================
- Hits 173961 173811 -150
- Misses 45310 45719 +409
- Partials 568 573 +5
Nice!
Is it possible to add the filename when reading from multiple files, similar to the scan_ methods? I have been using For Excel, I have been looping through files and concatenating:
df = df.with_columns(pl.lit(path).alias("raw_file_path"))
dfs.append(df)
I am not sure how this would work if I just passed a list or glob to the
@ldacey: Sure, it's do-able; can you make this a proper feature request so it's easier to track?
Closes #20354.
Allows read_excel and read_ods to take a list or glob pattern in the "source" parameter. This enables loading a given sheet from multiple workbooks (for example: directories containing workbooks that contain the same sheet data for different dates; it can be useful to load them all into a single frame).
Also: tidied up some "source" docstrings (rogue linebreaks), and renamed the "ScanSource" type to "FileSource" (as it isn't just used for scan funcs).
Example
Load the "data" sheet from all "trades" workbooks found in subdirs of the "2024" directory into a single DataFrame.
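A minimal sketch of what such a load might look like, under stated assumptions. The helper names `find_workbooks` and `load_sheet`, the `trades*.xlsx` file pattern, and the `data/2024` root are hypothetical and chosen only for illustration. The paths are expanded with Python's own `glob` and handed to `read_excel` as a list; per the description above, a glob pattern string could be passed as "source" instead, though its matching rules may differ from Python's.

```python
from glob import glob
from pathlib import Path


def find_workbooks(root: str, pattern: str = "**/trades*.xlsx") -> list[str]:
    """Expand a recursive glob under `root` into a sorted list of paths.

    The default pattern is an assumption made for this example.
    """
    return sorted(glob(str(Path(root) / pattern), recursive=True))


def load_sheet(root: str, sheet_name: str = "data"):
    """Load one sheet from every matching workbook into a single DataFrame."""
    # Imported here so the path helper above can be used without polars.
    import polars as pl

    # With this PR, "source" accepts a list of paths (or a glob pattern).
    return pl.read_excel(find_workbooks(root), sheet_name=sheet_name)


if __name__ == "__main__":
    # Hypothetical layout: data/2024/<subdir>/trades_<something>.xlsx
    print(find_workbooks("data/2024"))
```

Calling `load_sheet("data/2024")` would then read the "data" sheet from each matched workbook. This requires polars with an Excel engine installed, and assumes the sheet has a compatible schema in every file.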