Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DRAFT: try improve perf #8517

Conversation

camc314
Copy link
Contributor

@camc314 camc314 commented Jan 15, 2025

#8509 had a massive impact on perf because we are using this trait to compare large amounts of data. Improving the perf of these comparisons should reduce the impact of the perf degregation

the theory:

match generates code something like:

        match (self, other) {
            (Self::A(_), Self::A(_)) => true,
            (Self::B(_), Self::B(_)) => true,
            (Self::C, Self::C) => true,
            (Self::D, Self::D) => true,
            (Self::E, Self::E) => true,
            _ => false
        }
<example::Foo as example::ContentEq>::content_eq::hd04b232e31742fda:
        mov     qword ptr [rsp - 16], rdi
        mov     qword ptr [rsp - 8], rsi
        mov     rax, qword ptr [rsp - 16]
        mov     rcx, qword ptr [rax]
        movabs  rax, -9223372036854775808
        mov     rdx, rcx
        add     rdx, rax
        movabs  rax, -9223372036854775807
        add     rcx, rax
        xor     eax, eax
        sub     rdx, 4
        cmovb   rax, rcx
        mov     qword ptr [rsp - 32], rax
        mov     rax, qword ptr [rsp - 32]
        lea     rcx, [rip + .LJTI12_0]
        movsxd  rax, dword ptr [rcx + 4*rax]
        add     rax, rcx
        jmp     rax
        ud2
.LBB12_2:
        mov     rax, qword ptr [rsp - 8]
        mov     rdx, qword ptr [rax]
        movabs  rax, -9223372036854775808
        sub     rdx, rax
        mov     rcx, rdx
        add     rcx, 1
        xor     eax, eax
        cmp     rdx, 3
        cmovbe  rax, rcx
        cmp     rax, 0
        sete    al
        and     al, 1
        mov     byte ptr [rsp - 17], al
        jmp     .LBB12_7
.LBB12_3:
        mov     rax, qword ptr [rsp - 8]
        mov     rdx, qword ptr [rax]
        movabs  rax, -9223372036854775808
        sub     rdx, rax
        mov     rcx, rdx
        add     rcx, 1
        xor     eax, eax
        cmp     rdx, 3
        cmovbe  rax, rcx
        cmp     rax, 1
        sete    al
        and     al, 1
        mov     byte ptr [rsp - 17], al
        jmp     .LBB12_7
.LBB12_4:
        mov     rax, qword ptr [rsp - 8]
        mov     rdx, qword ptr [rax]
        movabs  rax, -9223372036854775808
        sub     rdx, rax
        mov     rcx, rdx
        add     rcx, 1
        xor     eax, eax
        cmp     rdx, 3
        cmovbe  rax, rcx
        cmp     rax, 2
        sete    al
        and     al, 1
        mov     byte ptr [rsp - 17], al
        jmp     .LBB12_7
.LBB12_5:
        mov     rax, qword ptr [rsp - 8]
        mov     rdx, qword ptr [rax]
        movabs  rax, -9223372036854775808
        sub     rdx, rax
        mov     rcx, rdx
        add     rcx, 1
        xor     eax, eax
        cmp     rdx, 3
        cmovbe  rax, rcx
        cmp     rax, 3
        sete    al
        and     al, 1
        mov     byte ptr [rsp - 17], al
        jmp     .LBB12_7
.LBB12_6:
        mov     rax, qword ptr [rsp - 8]
        mov     rdx, qword ptr [rax]
        movabs  rax, -9223372036854775808
        sub     rdx, rax
        mov     rcx, rdx
        add     rcx, 1
        xor     eax, eax
        cmp     rdx, 3
        cmovbe  rax, rcx
        cmp     rax, 4
        sete    al
        and     al, 1
        mov     byte ptr [rsp - 17], al
.LBB12_7:
        mov     al, byte ptr [rsp - 17]
        and     al, 1
        ret
.LJTI12_0:
        .long   .LBB12_2-.LJTI12_0
        .long   .LBB12_3-.LJTI12_0
        .long   .LBB12_4-.LJTI12_0
        .long   .LBB12_5-.LJTI12_0
        .long   .LBB12_6-.LJTI12_0

This means that it has to loop over all enum varients for a match.

i am thinking that if instead, we early return if the enum varient is not the same, this will be quicker

@github-actions github-actions bot added A-ast Area - AST A-ast-tools Area - AST tools labels Jan 15, 2025
Copy link
Contributor Author

camc314 commented Jan 15, 2025

Warning

This pull request is not mergeable via GitHub because a downstack PR is open. Once all requirements are satisfied, merge this PR as a stack on Graphite.
Learn more


How to use the Graphite Merge Queue

Add either label to this PR to merge it via the merge queue:

  • 0-merge - adds this PR to the back of the merge queue
  • hotfix - for urgent hot fixes, skip the queue and merge this PR next

You must have a Graphite account in order to use the merge queue. Sign up using this link.

An organization admin has enabled the Graphite Merge Queue in this repository.

Please do not merge from GitHub as this will restart CI on PRs being processed by the merge queue.

This stack of pull requests is managed by Graphite. Learn more about stacking.

Copy link

codspeed-hq bot commented Jan 15, 2025

CodSpeed Performance Report

Merging #8517 will not alter performance

Comparing c/01-15-draft_try_improve_perf (6c978c8) with c/01-15-feat_minifier_merge_similar_if_stmts (62a3dac)

Summary

✅ 32 untouched benchmarks

@camc314 camc314 force-pushed the c/01-15-draft_try_improve_perf branch from a6a661f to 6c978c8 Compare January 15, 2025 16:44
@camc314 camc314 closed this Jan 15, 2025
@camc314 camc314 deleted the c/01-15-draft_try_improve_perf branch January 15, 2025 17:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-ast Area - AST A-ast-tools Area - AST tools
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant