RowContainer's columns' HasNulls info #11741
Comments
@zsmj2017 good point. We can consider supporting this. cc @tanjialiang @zation99
@xiaoxmeng @tanjialiang one decision we need to make: after we merge these two places that add the stats, do we want to enable them by default, as #10775 does, or keep them disabled by default, as #11558 does?
nvm, @tanjialiang pointed me to the recent PR that enabled column stats by default: #11731
Summary: PR to address facebookincubator#11741
- Removed the use of columnHasNulls in RowContainer and replaced it with row stats
- Separated null count/sumBytes from min/max. On row erasure, only min/max is invalidated.
Differential Revision: D67229925
Summary: Pull Request resolved: #11860
PR to address #11741
- Removed the use of columnHasNulls in RowContainer and replaced it with row stats
- Separated null count/sumBytes from min/max. On row erasure, only min/max is invalidated.
Reviewed By: xiaoxmeng
Differential Revision: D67229925
fbshipit-source-id: ef66a1a419942f0b1d699749849c19e05be378e7
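To illustrate why erasure only needs to invalidate min/max, here is a minimal sketch. It is not the code from #11860: the class `ColumnStatsSketch` and its methods are hypothetical names invented for this example. The assumption is that additive counters (null count, sumBytes) can be decremented exactly when a cell is erased, while min/max cannot be repaired without rescanning the remaining rows.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <limits>
#include <optional>

// Hypothetical per-column stats; illustrative only, not Velox's actual API.
class ColumnStatsSketch {
 public:
  // Records one stored cell. Null cells are counted but contribute no bytes.
  void addCell(int32_t bytes, bool isNull) {
    ++numCells_;
    if (isNull) {
      ++nullCount_;
      return;
    }
    sumBytes_ += bytes;
    if (minMaxValid_) {
      minBytes_ = std::min(minBytes_, bytes);
      maxBytes_ = std::max(maxBytes_, bytes);
    }
  }

  // Erases one cell. The additive counters stay exact; min/max is dropped
  // because the erased cell may have been the current extreme.
  void removeCell(int32_t bytes, bool isNull) {
    assert(numCells_ > 0);
    --numCells_;
    if (isNull) {
      assert(nullCount_ > 0);
      --nullCount_;
    } else {
      sumBytes_ -= bytes;
    }
    minMaxValid_ = false;
  }

  int64_t nullCount() const { return nullCount_; }
  int64_t sumBytes() const { return sumBytes_; }

  // Empty when invalidated by an erasure or when no non-null cell was seen.
  std::optional<int32_t> minBytes() const {
    return hasMinMax() ? std::optional<int32_t>(minBytes_) : std::nullopt;
  }
  std::optional<int32_t> maxBytes() const {
    return hasMinMax() ? std::optional<int32_t>(maxBytes_) : std::nullopt;
  }

 private:
  bool hasMinMax() const { return minMaxValid_ && numCells_ > nullCount_; }

  int64_t numCells_ = 0;
  int64_t nullCount_ = 0;
  int64_t sumBytes_ = 0;
  int32_t minBytes_ = std::numeric_limits<int32_t>::max();
  int32_t maxBytes_ = std::numeric_limits<int32_t>::min();
  bool minMaxValid_ = true;
};
```

In this sketch, callers that only need the null count or total size keep getting exact answers after rows are erased, and only consumers of min/max have to handle the missing value.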
Description
#11558 and #10775 both add column-based statistics to RowContainer (the former includes column length information and the null count, while the latter includes only the null count).
Can we consider merging these two sets of statistics?
As things stand, the changes in #10775 are fully covered by #11558 (although #10775 is always enabled, while #11558 provides an option to turn it on or off).
My view is simply that statistics should come from a single data source wherever possible. With scattered sources, a later change can easily update one code path and miss another, so APIs that are supposed to have the same semantics may end up returning different results.
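A minimal sketch of the single-source idea, under stated assumptions: `RowStoreSketch` and `ColumnCounters` are hypothetical names for illustration and do not mirror RowContainer's real interface. The point is that the "has nulls" answer is derived from the same null count that the stats API exposes, so the two cannot disagree.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical per-column counters; the only place null information is kept.
struct ColumnCounters {
  int64_t numCells = 0;
  int64_t nullCount = 0;
};

// Illustrative stand-in for a row store; not Velox's RowContainer.
class RowStoreSketch {
 public:
  explicit RowStoreSketch(std::size_t numColumns) : counters_(numColumns) {}

  // Every write path updates the same counters.
  void store(std::size_t column, bool isNull) {
    assert(column < counters_.size());
    ++counters_[column].numCells;
    if (isNull) {
      ++counters_[column].nullCount;
    }
  }

  int64_t nullCount(std::size_t column) const {
    assert(column < counters_.size());
    return counters_[column].nullCount;
  }

  // Derived from nullCount instead of a separately maintained flag.
  bool columnHasNulls(std::size_t column) const {
    return nullCount(column) > 0;
  }

 private:
  std::vector<ColumnCounters> counters_;
};
```

With a separately maintained boolean flag, a new write path that forgets to set it would make `columnHasNulls` and `nullCount` disagree; deriving one from the other removes that failure mode.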