Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use correct abbreviations for data quantities. Fixes #13054 #14859

Merged
merged 2 commits into from
Sep 19, 2024

Conversation

metadaddy
Copy link
Member

@metadaddy metadaddy commented Nov 1, 2022

Description

The Trino CLI currently reports data size incorrectly, using the legacy, 1,024-based, values for K, M, G etc rather than the modern 1,000-based values.

Is this change a fix, improvement, new feature, refactoring, or other?

Fix

Is this a change to the core query engine, a connector, client library, or the SPI interfaces? (be specific)

CLI

Non-technical explanation

This change corrects the display of data sizes in the CLI to use KiB, MiB for 1,024-based values and adds a command option, --decimal-data-size to display data sizes in 1,000-based units.

Release notes

( ) This is not user-visible or docs only and no release notes are required.
( ) Release notes are required, please propose a release note for me.
(x) Release notes are required, with the following suggested text:

The CLI now shows data sizes and rates with binary (1024-based) abbreviations: KiB, MiB, etc. A new command option, --decimal-data-size, shows decimal (1000-based) values and abbreviations: KB, MB, etc. (#13054)

Related issues, pull requests, and links

@electrum
Copy link
Member

electrum commented Nov 1, 2022

Since Trino uses base-2 units everywhere, I think it would be better to keep the base-2 units and but add the i to the unit designation. This preserves the existing behavior and compatibility with base-2 units used everywhere else. If we make the CLI use base-10, then it will show different values than other places in Trino while using the same suffix.

The downside of adding the suffix is that it will break all of the carefully formatted output, since the values will now be wider. But as long as we still fit in 80 characters, it should be ok.

We could also add a non-default command line option to use base-10 units, so as not to change the existing behavior. You can use a configuration file to effectively make this a default for your CLI installation.

@metadaddy
Copy link
Member Author

The impact on the formatting of the CLI is minimal:

trino:ds> select count(*) from drivestats;
   _col0   
-----------
 346006813 
(1 row)

Query 20221129_225226_00008_9q4rt, FINISHED, 1 node
Splits: 420 total, 420 done (100.00%)
12.27 [346M rows, 4.66MiB] [28.2M rows/s, 389KiB/s]

Same query with --decimal-data-size :

trino:ds> select count(*) from drivestats;
   _col0   
-----------
 346006813 
(1 row)

Query 20221129_225715_00012_9q4rt, FINISHED, 1 node
Splits: 421 total, 421 done (100.00%)
11.02 [346M rows, 4.89MB] [31.4M rows/s, 443KB/s]

@hashhar
Copy link
Member

hashhar commented Nov 30, 2022

@electrum Your comments seem to be addressed. Can you please take a look?

Copy link
Member

@hashhar hashhar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM % @electrum's review.


import static org.testng.Assert.assertEquals;

public class TestFormatUtils
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for adding this.

@mosabua
Copy link
Member

mosabua commented Jan 11, 2024

👋 @metadaddy - this PR has become inactive. We hope you are still interested in working on it. Please let us know, and we can try to get reviewers to help with that.

We're working on closing out old and inactive PRs, so if you're too busy or this has too many merge conflicts to be worth picking back up, we'll be making another pass to close it out in a few weeks.

Also cc @electrum and @hashhar

@hashhar
Copy link
Member

hashhar commented Jan 12, 2024

Thanks for reminder @mosabua. Sorry @metadaddy that this fell through the cracks.

I think we can just rebase this and merge. I'll wait a few days for @metadaddy if he does the rebase otherwise I'll do a rebase and merge.

@metadaddy
Copy link
Member Author

@hashhar Hey there - apologies for the latency in replying! I can rebase if you like - it should be relatively straightforward.

@metadaddy
Copy link
Member Author

Something messed up here. I'll figure it out and re-push!

@metadaddy metadaddy force-pushed the cli-wrong-units branch 2 times, most recently from 909c461 to 0b66544 Compare March 18, 2024 02:22
@metadaddy
Copy link
Member Author

@hashhar Looks like everything is good to go.

@mosabua
Copy link
Member

mosabua commented Mar 18, 2024

Could you add the new flag to the documentation @metadaddy .. if timing for merge doesnt work out I can review a separate follow up PR as well

@metadaddy
Copy link
Member Author

Hi @mosabua - I just added the flag to the doc as an amendment to the "Add option for decimal data size/rates, change KB etc to KiB etc for binary" commit: 18b9780 👍🏻

1 similar comment
@metadaddy
Copy link
Member Author

Hi @mosabua - I just added the flag to the doc as an amendment to the "Add option for decimal data size/rates, change KB etc to KiB etc for binary" commit: 18b9780 👍🏻

@mosabua
Copy link
Member

mosabua commented Mar 19, 2024

Great .. I pinged @hashhar and @electrum for review

Copy link
Member

@hashhar hashhar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Default is unchanged and is opt-in.

Only affects CLI. format related functions in Trino still behave same as before.

Copy link

github-actions bot commented Sep 4, 2024

This pull request has gone a while without any activity. Tagging the Trino developer relations team: @bitsondatadev @colebow @mosabua

@github-actions github-actions bot added the stale label Sep 4, 2024
@mosabua
Copy link
Member

mosabua commented Sep 19, 2024

I agree with the assessment from @hashhar

All concerns from @electrum are addressed.

I will merge. Worst case we can do a follow up.

@mosabua mosabua merged commit 08ff9ca into trinodb:master Sep 19, 2024
45 checks passed
@github-actions github-actions bot added this to the 459 milestone Sep 19, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging this pull request may close these issues.

Trino CLI reports data with wrong units
4 participants