Introduce aggregate_column_statistics #387

Merged
merged 1 commit into from
Oct 18, 2024

delucchi-cmu
Contributor

Change Description

Closes #329.

Introduces a method, aggregate_column_statistics, that reads the footer statistics in a parquet dataset's _metadata file and reports the global minimum and maximum values for each column.
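
For readers unfamiliar with parquet footer statistics, here is a minimal sketch of the general idea. It is not the code merged in this PR. The function names `read_row_group_stats` and `merge_row_group_stats` are hypothetical, and the sketch assumes every column's min and max values are mutually comparable across row groups. The reader uses `pyarrow.parquet.read_metadata`, imported lazily so the merging step can be used without pyarrow installed.

```python
def read_row_group_stats(metadata_path):
    """Yield one {column_name: (min, max)} dict per row group.

    Hypothetical helper: reads only the footer of a _metadata file,
    never the data pages themselves.
    """
    import pyarrow.parquet as pq  # lazy import; only needed for file access

    metadata = pq.read_metadata(metadata_path)
    for rg_index in range(metadata.num_row_groups):
        row_group = metadata.row_group(rg_index)
        stats = {}
        for col_index in range(row_group.num_columns):
            column = row_group.column(col_index)
            col_stats = column.statistics
            # Some writers omit statistics, or omit min/max for some types.
            if col_stats is None or not col_stats.has_min_max:
                continue
            stats[column.path_in_schema] = (col_stats.min, col_stats.max)
        yield stats


def merge_row_group_stats(row_group_stats):
    """Reduce per-row-group (min, max) pairs to one global pair per column."""
    merged = {}
    for stats in row_group_stats:
        for name, (low, high) in stats.items():
            if name in merged:
                cur_low, cur_high = merged[name]
                merged[name] = (min(cur_low, low), max(cur_high, high))
            else:
                merged[name] = (low, high)
    return merged
```

Under these assumptions, `merge_row_group_stats(read_row_group_stats("path/to/_metadata"))` would return a dict mapping each column name to its global `(min, max)` pair; the path here is a placeholder.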

codecov bot commented Oct 18, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 92.83%. Comparing base (632292a) to head (d5ce329).
Report is 1 commit behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #387      +/-   ##
==========================================
+ Coverage   92.75%   92.83%   +0.08%     
==========================================
  Files          48       48              
  Lines        1933     1955      +22     
==========================================
+ Hits         1793     1815      +22     
  Misses        140      140              


| Before [632292a] <v0.4.1> | After [7d0b20a] | Ratio | Benchmark (Parameter) |
|---|---|---|---|
| 89.0±2ms | 93.0±3ms | 1.05 | benchmarks.Suite.time_paths_creation |
| 72.0±0.5ms | 73.3±0.6ms | 1.02 | benchmarks.MetadataSuite.time_load_partition_info_order7 |
| 72.0±0.4ms | 72.9±0.5ms | 1.01 | benchmarks.MetadataSuite.time_load_partition_join_info |
| 43.6±0.5ms | 44.0±0.5ms | 1.01 | benchmarks.Suite.time_pixel_tree_creation |
| 18.1±0.4ms | 18.0±0.3ms | 0.99 | benchmarks.MetadataSuite.time_load_partition_info_order6 |
| 13.3±0.4ms | 13.1±0.4ms | 0.99 | benchmarks.Suite.time_inner_pixel_alignment |
| 389±2ms | 384±2ms | 0.99 | benchmarks.Suite.time_outer_pixel_alignment |
| 125±1ms | 124±0.8ms | 0.99 | benchmarks.time_test_alignment_even_sky |
| 1.06±0.01ms | 1.06±0ms | 0.99 | benchmarks.time_test_cone_filter_multiple_order |

@delucchi-cmu delucchi-cmu requested a review from wilsonbb October 18, 2024 19:28
@delucchi-cmu delucchi-cmu merged commit 94ef033 into main Oct 18, 2024
11 checks passed
@delucchi-cmu delucchi-cmu deleted the issue/329/stats branch October 18, 2024 20:53
Development

Successfully merging this pull request may close these issues.

Add method like Catalog.get_statistics()
2 participants