Search by content of a column vs luau filter #2483
-
Currently I use `luau filter` to search by the content of another column. I can't find anything in the documentation about doing this with `search`. If it is not possible, would it be useful? If it were useful, would it be reasonably easy to implement?
Replies: 7 comments 3 replies
-
Have you tried the `--no-globals` option?
It should be faster, as it skips creating a Luau global variable for every column on each row. Also, can you give me some metrics? How many rows? How long does it take?
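As a rough intuition for why `--no-globals` helps (a Python analogy with hypothetical helper names, not qsv's actual internals): materializing a fresh variable environment for every row costs more than reading columns through a single lookup, even though both filters compute the same answer.

```python
# Python analogy for the --no-globals speedup; the helper names and the
# per-row dict copy are illustrative assumptions, not qsv internals.
row = {"City": "BROOKLYN", "Borough": "BROOKLYN"}

def filter_with_globals(row):
    # like the default mode: materialize every column as its own
    # variable (here, a fresh dict) before evaluating the expression
    env = dict(row)
    return env["City"] == env["Borough"]

def filter_no_globals(row):
    # like --no-globals: evaluate col["..."] lookups against the row directly
    return row["City"] == row["Borough"]

# Both filters agree on the result; the second just does less setup per row.
print(filter_with_globals(row), filter_no_globals(row))  # True True
```

The saving is per-row, so it compounds over large files, which is consistent with the timings reported further down the thread.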
-
You can also just do a direct comparison with `==` and skip using `string.match`.
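Note that the two filters are not strictly equivalent: `string.match` does pattern matching, so it also keeps rows where Borough merely occurs inside City, while `==` keeps only exact matches. A minimal Python sketch of the difference, using `re.search` as a stand-in for Luau's `string.match` and made-up sample rows:

```python
import re

# Hypothetical sample rows; column names follow the NYC 311 example below.
rows = [
    {"City": "BROOKLYN", "Borough": "BROOKLYN"},       # exact match
    {"City": "EAST BROOKLYN", "Borough": "BROOKLYN"},  # substring match only
    {"City": "QUEENS", "Borough": "BRONX"},            # no match
]

# Pattern-style filter (re.search stands in for Luau's string.match):
# keeps any row where Borough occurs anywhere inside City.
pattern_hits = [r for r in rows if re.search(r["Borough"], r["City"])]

# Direct equality filter: keeps only exact matches.
equality_hits = [r for r in rows if r["City"] == r["Borough"]]

print(len(pattern_hits))   # 2
print(len(equality_hits))  # 1
```

So the equality form is only a drop-in replacement when the two columns are expected to match exactly.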
-
Some metrics using the different approaches above, using the 1 million row NYC 311 sample we use in our benchmarks:

```shell
$ /usr/bin/time qsv luau filter 'string.match(City,Borough)' /tmp/NYC_311_SR_2010-2020-sample-1M.csv -o /tmp/t1.csv
        7.72 real         7.08 user         0.57 sys
$ /usr/bin/time qsv luau filter --no-globals 'string.match(col["City"],col["Borough"])' /tmp/NYC_311_SR_2010-2020-sample-1M.csv -o /tmp/t2.csv
        5.30 real         4.68 user         0.56 sys
$ /usr/bin/time qsv luau filter 'City==Borough' /tmp/NYC_311_SR_2010-2020-sample-1M.csv -o /tmp/t3.csv
        7.48 real         6.89 user         0.55 sys
$ /usr/bin/time qsv luau filter --no-globals 'col["City"]==col["Borough"]' /tmp/NYC_311_SR_2010-2020-sample-1M.csv -o /tmp/t4.csv
        5.10 real         4.53 user         0.55 sys
```
-
Thank you. The figures for the first example look about right. I'll try with an index first for reference, then with `--no-globals`, and I'll report back.
-
Auto-indexing actually slows it down, as numerous indexes are generated, but with a million records …
-
Hi @ondohotola - I moved the issue to a Discussion so other folks who have a similar question can find it.