-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[DISCUSSION] More SqlLogicTest test coverage for queries, including join queries #13470
Comments
Yes, 100% I think adding additional test coverage in all areas would be super helpful We / I also regularly hit bugs that are caught by InfluxData's test suite during upgrades of DataFusion but were not caught during the DataFusion PR Other Areas that would benefit from improved testing from my experience:
|
One way to extend join coverage is to have some more "answer checking" for different benchmarks, such as: (Earlier doing this for TPC-H found many issues in joins handling, decimal issues, etc.) We can do the same for the imdb join order benchmark, clickbench, etc. |
I have been looking over the sqllogictests this weekend and I think first of all that they need a good reorganization to clean them up, split them into smaller chunks testing just specific portions of functionality and to reduce the size of some of the files. I also have started looking at other projects's tests to see how they've done their testing. I am unsure about actually incorporating some of those tests directly into DF (duckdb for example, MIT license vs Apache) but they could be used as a basis for seeing where DF's test suite is falling short and expanding the test suite some more. |
I came across this paper which I found interesting for join testing. It won't work for DF yet since DF doesn't support query hints (at least as best as I know). |
I would also love to see a separate test-runner (maybe nightly) of the official SQLLite SLTs (where applicable). At this point DF seems to be able to pass the majority without modification (except delete). Since they are (a) pre-vetted by sqllite and (b) test foundational capability it would make sense to me that they are included in datafusion |
I actually am part way through porting those slt's to something that df's slt test runner can run |
Can't wait to see what happens here |
I'm wondering what are test strategies in Influx? is it something we can reuse in DF as well to avoid regression |
We have a bunch of our own basic tests for end to end functionality in our system (some is SQL based, some is "InfluxQL" based) We have often seen issues where features we use heavily (metadata, and UnionExec, for example) are not as heavily covered in the existing DataFusion test suites 🤔 |
is it real world use cases ? if its not I found something used for MariaDB and get investigate it https://mariadb.com/kb/en/random-query-generator-tests/ |
After the slt work I was thinking of taking a look at the work that @2010YOUY01 has done with sqlancer to see if there are areas that I could add into it. I noticed that the datafusion-contrib for that was merged into sqlancer main a few months back - https://github.com/sqlancer/sqlancer/tree/main/src/sqlancer/datafusion. sqlancer has quite a few options wrt test oracles that might be useful to look at. The one I was looking at (DQP) I think would need support for query hints (or corresponding set options) to function but if we could get it to work would allow a lot of join testing to happen. |
I agree -- I think using sqlancer is a great idea -- I suspect it simply needs someone to maintain and file tickets |
I've gotten many of them to run so far after removing or changing troublesome features (create index, valuesort, etc). There are differences with sqllite vs DataFusion wrt how sqllite handles integers vs real numbers that I'm still working through slowly. I do see a number of failures that I believe are real or at least point to limitations of DF. I'm hoping to have a full 'clean' run sometime in the next week that only fails on likely valid queries. Here is an example of something that is currently failing:
|
It would be great if there was a common infrastructure across
Boxes on the top are existing slt tests available in various
The transpiler would also be a repository of links to existing slts. Boxes at the bottom are the resulting slt tests. Ideally, the transpilation would not be a one off process, but it |
@Omega359 I see you went quite far and already had some good findings. |
Right now I'm just working through issues with the sqlite test files. My biggest issue actually is that the test runner seems to get very very very very slow sometimes (most of the time actually). I don't know why atm - I forked off the rust sqllogictest repo and tried a few things to try and automate converting value-oriented results to row-oriented results and at one point it ran super fast but otherwise it's 'fire off a run and come back in 1/2 hour'. I'll push up my changes to my repo on github in the next day or so. I suspect it's a good few days of solid work going through all the tests and fixing/changing things. |
I am absolutely not against this at all - but it's a pretty large undertaking. Currently the definition of what the format of a slt file is .. fluid. The original format from sqlite isn't directly compatible with the rust crate which has been a large part of my work. Fixing that would be step 1. Having a base set of slt tests that work around most every db would be nice however given what I've seen so far those would be very basic tests. Even things like what the field type of a Float field isn't stable across db's (or at least the sqllogictest runner impl's). Support for create index .. not in DF (yet). Mysql using DIV for integer division vs / (mysql/oracle treat / as always float result iirc). Just a few basic examples |
I think adding test oracles like DQP to sqllogictests is also a project worth to do. SQLancer is a fuzzer implementation with query generator + test oracles (for property tests), and these two component are loosely coupled: test oracles don't necessarily have to be implemented with SQLancer's query generator, the invariants can be tested as long as there is a valid query, maybe the best strategy is to implement them both in existing test cases(slt) and SQLancer's randomly generated queries. For DQP, I remember there is a configuration we can turn off 🤔, and there will be no join-reordering, so join order can be controlled by writing table order in SQL queries differently. |
@Omega359 I was looking for the official SQLLite SLTs -- are you looking at the ones here: https://www.sqlite.org/sqllogictest/dir?ci=tip&name=test ? |
https://arxiv.org/pdf/2410.21731 They did this slt bank for SQLite, PostgreSQL, DuckDB and found multiple bugs, the authors said integrating them into a new system is easy
The implementation is at https://github.com/suyZhong/SQuaLity |
YES! See my post in DF's discord channel https://discord.com/channels/885562378132000778/1314317688150949978/1316455175376081028 |
Yes, though that wasn't where I grabbed them from. I think I grabbed the files from https://github.com/hydromatic/sql-logic-test |
Copied from Discord: I've finally gotten to the point of having most of the sqlite slt test files running in df sqllogictest suite. After attempting to run and fix things manually (for days ... I'm an idiot sometimes) I smartened up and hacked up a version of the sqllogictest crate to do most of the work for me. The result is that everything runs however a LOT of queries are skipped for DF for various reasons. I'm going to push all of this to a df branch in my github account. I would really really like some assistance with evaluating the files and submitting tickets for any issues that were uncovered. There are apparently almost 5.7 million queries of which 78,444 were marked to be skipped by Datafusion because of errors or differences in results when compared to sqlite or postgres. If DF results were the same as postgres however (even attempting equality of floats) the results in the slt were updated. This is mostly because sqlite is rather simple and just doesn't seem to properly handle floats, etc correctly. Because of the format differences between the slt files in DF vs sqlite a custom version of sqllogictest dependency is required. I'm hoping to push a PR to that crate's repo at some point to cover the additional change required to run this. run command is changes are @ https://github.com/Omega359/arrow-datafusion/tree/feature/sqllogictest_add_sqlite Also I've been looking at duckdb's test - found a conway's game of life impl in sql. It doesn't even parse in df 😦 https://gist.github.com/janickr/58fab629ee3ea7e5638a The largest issue I had with porting the tests was that DF is just missing some sql support compared to sqlite. No delete. No create index. No duplicate column names. No scalar subqueries. Type conversion doesn't handle null all that well. And it seems that CTE expressions have a lot of limitations after playing with that for the last hour. Here is a gist of the most common errors. Note that the delete errors will not show up for anyone who looks at my branch linked above - I just removed that portion of tests rather than try and deal with each file given DF's lack of support |
thank you @Omega359 for your work on this!
that's totally understandable!
i know you mentioned this earlier (#13470 (comment)), but i unfortunately don't know what the difference is. How big it is? |
I was torn about this because it would mean that the expected query types and results would then be lost.
The primary issue is that those slt files are in what I call 'valuewise' ordering, meaning that results are in a single column. For example:
Every other use of slt's use what I call 'rowwise' where the results are in proper columns. Both have their uses - valuewise better handles string with leading/trailing spaces, rowwise is easier to parse. The main change I made to the sltlogictest-rs dep is to add support for I also stripped out everything related to mysql and mssql from those files. This is the series of regex I used to clean up the files:
|
After thinking about it a bit I think I could find a way to switch the results by using the column type count in |
Did anyone check the license of the sqlite test suite? Because if https://www.sqlite.org/sqllogictest/dir?ci=tip doesn't specify a license and https://www.sqlite.org/copyright.html doesn't explicitly mention the test corpus (which one could argue is a separate artifact). Hence you cannot just copy that data (i.e. it would be a copyright violation). Even reading it during tests over the internet might be a bit of a gray zone. |
Yes, I will be 100% willing to help both review the errors and triage/file issues (and organize the effort) |
I wll look into the copyright issue |
https://github.com/hydromatic/sql-logic-test
I didn't see the actual .test files, so I am not 100% sure what files we are referring to.
https://github.com/sqlite/sqlite https://github.com/sqlite/sqlite/blob/master/LICENSE.md explicitly says that this is in the public domain:
So TLDR I didn't find any compelling evidence either way related to licensing. When we have a proposal I will take a closer look |
At least according to my understanding of US copyright law, reading public files from the internet is not a problem. Redistributing them is where there can become a problem, depending on the license. |
https://github.com/hydromatic/sql-logic-test/tree/main/src/main/resources/test As far as I could tell the files there are just copies of the ones from sqlite however I didn't have to deal with some oddball version control system to download them. |
The sqllogictest files are not in that test directory. As best as I can tell they have no license attached to them at all if the license from sqlite doesn't cover them. The C source files do have licenses though. |
I asked the sqlite guys and they added a copyright.md file to the sqllogictest repo: https://sqlite.org/sqllogictest/file?name=COPYRIGHT.md&ci=tip It's about as open as it's possible to make it :) |
Oops, not so quick. I forgot about I'll see about pushing a PR to that crate later today and then hopefully if merged we can just use the next version. |
I've updated my test branch with the latest changes including adding skipif for Postgresql queries that fail to run. What is left is at least 19 legitimate query result mismatches where tickets were filed. I've submitted a PR to the sqllogictest-rs repo for my changes but so far they haven't responded. If they do not get around to merging it for whatever reason we have two choices:
For now I'm going to slow down work on this. |
Thank you so much @Omega359 I created an epic to track the remaining work to complete the sqllogictest migration. I am hoping we can rally the community to help us finish it I also field a ticket to track completing the integratio work: |
I am closing this issue for now so we can focus on completing #13811 -- please let me know if there are other items that should be tracked and I will file tickets Thanks again |
During SDF's upgrade to DataFusion 43 we found a bug #13425 . This was possible thanks to extensive test coverage for certain query shapes that we have internally, which is good. The bad part is that the bug could be caught at the PR stage, or before the release, should similar tests exist in DataFusion project.
Would it be useful if there were more SLT query tests, including the ones that allowed to catch #13425?
cc @alamb @schulte-lukas @gliga
The text was updated successfully, but these errors were encountered: