Modin is not always faster than pandas #6989
Replies: 4 comments 19 replies
-
Hi @klopstock-dviz, thank you for creating this discussion. Would you mind sharing a code example with us so we can look into it? It would help us find the issues Modin has and try to speed things up.
-
Hi @YarShev. For the dataset, you can get it here: For the .py script, I measured 49 s with pandas and 71 s with Modin. I am on a fresh install of Linux Mint.
-
Hi @YarShev. I tried with this exact line: Modin finished in 42 s. I got this warning: So it is a great improvement, from 71 s to 42 s. For casual Jupyter tasks with a high data load, I would use Modin to get faster feedback, but for backend tasks, I think pandas is more efficient when weighing CPU resources used against time spent.
-
This Project published benchmark comparisons in 2022.
-
In truth, vanilla pandas is faster.
On a Xeon 2850v2, 10 cores, 50 GB of RAM.
I used Modin to speed up load times, filters, groupings and aggregations.
For data loading, Modin did better (2x faster).
But for everything else, pandas is ... faster! It also consumes far fewer CPU resources.
The only context where Modin clearly beats pandas (2x faster) is a Jupyter notebook (in VS Code).
For casual .py scripts, I stick to pandas.
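For anyone who wants to reproduce this kind of comparison stage by stage (load, filter, group and aggregate), here is a minimal timing sketch. It is not the script from this thread: the file name `data.csv` and the column names `key` and `value` are placeholders I made up, and `make_stages` is a hypothetical helper. The same pipeline is run against whichever module you pass in (`pandas` or `modin.pandas`). One caveat, as far as I understand it: some Modin engines can defer or overlap work, so the per-stage numbers may under-report and the end-to-end total is probably the safer figure to compare.

```python
import time
from typing import Any, Callable, Dict, List, Tuple

# A stage is a (name, function) pair; each function receives the previous result.
Stage = Tuple[str, Callable[[Any], Any]]


def time_stages(stages: List[Stage], initial: Any = None) -> Dict[str, float]:
    """Run stages in order and return wall-clock seconds spent in each one."""
    timings: Dict[str, float] = {}
    value = initial
    for name, fn in stages:
        start = time.perf_counter()
        value = fn(value)
        timings[name] = time.perf_counter() - start
    return timings


def make_stages(pd: Any, csv_path: str, key: str, value_col: str) -> List[Stage]:
    """Hypothetical load -> filter -> groupby/aggregate pipeline.

    `pd` is the dataframe module to test (pandas or modin.pandas).
    The path and column names are placeholders, not taken from the thread.
    """
    return [
        ("load", lambda _: pd.read_csv(csv_path)),
        ("filter", lambda df: df[df[value_col] > 0]),
        ("groupby_agg", lambda df: df.groupby(key)[value_col].mean()),
    ]


# Example usage (assumes both libraries are installed and data.csv exists):
#
#   import pandas
#   import modin.pandas as mpd
#
#   for label, module in [("pandas", pandas), ("modin", mpd)]:
#       timings = time_stages(make_stages(module, "data.csv", "key", "value"))
#       print(label, timings, "total:", sum(timings.values()))
```

Running each variant in a fresh process, and more than once, should help separate Modin's one-time engine startup cost from the actual work.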