Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[opt](agg function)Use float32 instead of float64 as a parameter in percentile_approx. #43593

Closed
wants to merge 3 commits into from

Conversation

Mryange
Copy link
Contributor

@Mryange Mryange commented Nov 11, 2024

What problem does this PR solve?

This issue was identified by the "compile_check".
In the functions percentile_approx and percentile_approx_weighted, the parameter type is Float64, but the internal TDigest used in these functions operates with Float32.
This results in type conversion every time the function is executed.
Modifying the function definition would cause compatibility issues for upgrades and downgrades, so both versions of the code are retained here.

test pr #43592

Release note

None

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

@doris-robot
Copy link

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@Mryange
Copy link
Contributor Author

Mryange commented Nov 11, 2024

run buildall

Copy link
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

clang-tidy made some suggestions


#pragma once

#include <glog/logging.h>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: 'glog/logging.h' file not found [clang-diagnostic-error]

#include <glog/logging.h>
         ^

@doris-robot
Copy link

TeamCity be ut coverage result:
Function Coverage: 37.70% (9814/26035)
Line Coverage: 29.02% (82094/282873)
Region Coverage: 28.14% (42209/149979)
Branch Coverage: 24.74% (21398/86508)
Coverage Report: http://coverage.selectdb-in.cc/coverage/5e66a06bec8e37e3515f8cde4b20cd7fe0348013_5e66a06bec8e37e3515f8cde4b20cd7fe0348013/report/index.html

@doris-robot
Copy link

TPC-H: Total hot run time: 51453 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 5e66a06bec8e37e3515f8cde4b20cd7fe0348013, data reload: false

------ Round 1 ----------------------------------
q1	17571	7377	7328	7328
q2	2573	1303	1298	1298
q3	9948	1150	1167	1150
q4	10225	859	815	815
q5	7585	2964	2970	2964
q6	232	153	148	148
q7	1029	601	608	601
q8	9364	2366	2357	2357
q9	12248	12252	12280	12252
q10	7148	2407	2447	2407
q11	469	258	271	258
q12	401	212	215	212
q13	17759	3032	3047	3032
q14	251	206	209	206
q15	586	529	515	515
q16	665	588	574	574
q17	984	570	533	533
q18	7316	6617	6654	6617
q19	1315	985	941	941
q20	3049	2846	2809	2809
q21	3944	3257	3073	3073
q22	1412	1363	1371	1363
Total cold run time: 116074 ms
Total hot run time: 51453 ms

----- Round 2, with runtime_filter_mode=off -----
q1	7532	7503	7468	7468
q2	343	260	249	249
q3	3151	2935	2999	2935
q4	2088	1824	1797	1797
q5	5640	5677	5677	5677
q6	220	137	140	137
q7	2223	1840	1821	1821
q8	3395	3584	3528	3528
q9	14425	14433	14442	14433
q10	3602	3572	3496	3496
q11	591	512	505	505
q12	860	596	605	596
q13	14113	3189	3236	3189
q14	294	268	272	268
q15	599	540	540	540
q16	671	623	633	623
q17	1816	1649	1599	1599
q18	8217	7771	7647	7647
q19	1660	1535	1504	1504
q20	2161	1943	1963	1943
q21	5439	5334	5184	5184
q22	621	548	553	548
Total cold run time: 79661 ms
Total hot run time: 65687 ms

@Mryange
Copy link
Contributor Author

Mryange commented Nov 14, 2024

run buildall

@doris-robot
Copy link

TeamCity be ut coverage result:
Function Coverage: 37.90% (9888/26092)
Line Coverage: 29.12% (82629/283762)
Region Coverage: 28.26% (42472/150314)
Branch Coverage: 24.85% (21546/86718)
Coverage Report: http://coverage.selectdb-in.cc/coverage/69fa500339620e76a22797e15d9344ab0ad7f3ff_69fa500339620e76a22797e15d9344ab0ad7f3ff/report/index.html

@Mryange Mryange closed this Dec 5, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants