-
Notifications
You must be signed in to change notification settings - Fork 980
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DRILL-8458: Use correct size of definition level bytes slice when reading Parquet v2 data page #2838
DRILL-8458: Use correct size of definition level bytes slice when reading Parquet v2 data page #2838
Conversation
…ding Parquet data page
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Argh, a basic buffer arithmetic bug by yours truly. I guess it's remained undetected so far because Parquet v2 is still uncommon the wild. And because of insufficient Parquet v2 test coverage.
Thank you very much for this fix which looks great. Would you mind seeing if you can relocate the test and its data? We've got TestParquetComplex already and also some Parquet v2 test files for which a naming pattern has been started, e.g.
exec/java-exec/src/test/resources/parquet/parquet_v2_logical_types_simple.parquet
(Feel free to oragnise things differently too, just aiming to bundle like tests and data files).
I guess we've all caused our fair share of those bugs ;-)
Sure, how about renaming the test file to Would you prefer if the test was moved to an existing test class, e.g. |
A test data file name like that would be great thank you. Yes, please may you move the test itself inside TestParquetComplex and see if you can integrate your test data file generation code into ParquetSimpleTestFileGenerator? It already takes a Parquet version number parameter IIRC (up until now, only ever set to v1). I think we may need to come back and organise our tests better for Parquet v2, but I don't want to saddle this PR with that reorganisation. |
P.S. if you find the test data generator only fits simply into ParquetSimpleTestFileGenerator as a standalone method that is unrelated to what's there already then I think that's fine. I'm pointing to things I know are there and related but not going so far as to ask "How would this look in code?" |
PR updated with refactored test code |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perfect. Thanks.
…ding Parquet v2 data page (apache#2838)
…ding Parquet v2 data page (#2838)
…ding Parquet v2 data page (apache#2838)
…ding Parquet v2 data page (apache#2838)
DRILL-8458: Use correct size of definition level bytes slice when reading Parquet data page
Description
The byte buffer slice used for the definition level bytes when reading a Parquet v2 data page is sized correctly (in the call to
ByteBuffer::limit
)Documentation
Bugfix only, no documentation changes
Testing
Unit test added in new test class
org.apache.drill.exec.store.parquet.TestRepeatedColumn