From 8b49fae35c1c097a0fc42479d5babf763730f216 Mon Sep 17 00:00:00 2001
From: Hossein Moein
Date: Sat, 11 Jan 2025 11:40:33 -0500
Subject: [PATCH] Added to README

---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 26cdb7d1..4e4c2c5e 100644
--- a/README.md
+++ b/README.md
@@ -64,7 +64,7 @@ Each program has three identical parts. First it generates and populates 3 colum
 
 The maximum dataset I could load into Polars was 300m rows per column. Any bigger dataset blew up the memory and caused OS to kill it. I ran C++ DataFrame with 10b rows per column and I am sure it would have run with bigger datasets too. So, I was forced to run both with 300m rows to compare. I ran each test 4 times and took the best time. Polars numbers varied a lot from one run to another, especially calculation and selection times. C++ DataFrame numbers were significantly more consistent.
 
-| | [C++ DataFrame](https://github.com/hosseinmoein/DataFrame/blob/master/benchmarks/dataframe_performance.cc) | [   Polars    ](https://github.com/hosseinmoein/DataFrame/blob/master/benchmarks/polars_performance.py) | [   Pandas    ](https://github.com/hosseinmoein/DataFrame/blob/master/benchmarks/pandas_performance.py) |
+| | [C++ DataFrame](https://github.com/hosseinmoein/DataFrame/blob/master/benchmarks/dataframe_performance.cc) | [    Polars     ](https://github.com/hosseinmoein/DataFrame/blob/master/benchmarks/polars_performance.py) | [    Pandas     ](https://github.com/hosseinmoein/DataFrame/blob/master/benchmarks/pandas_performance.py) |
 | :-- | ---: | ---: | ---: |
 | Data generation/load time | 26.9459 secs | 28.4686 secs | 36.6799 secs |
 | Calculation time | 1.2602 secs | 4.8766 secs | 40.3264 secs |