[test-failure] Active defrag - AOF loading + edge case, 32-bit #1393

Open
zuiderkwast opened this issue Dec 5, 2024 · 2 comments
Labels
test-failure An issue indicating a test failure

Comments

@zuiderkwast
Contributor

Some of the active defrag test cases are failing, possibly after we merged #1242. @JimB123 can you take a look?

Today, https://github.com/valkey-io/valkey/actions/runs/12170639212/job/33946010031:

*** [err]: Active defrag - AOF loading in tests/unit/memefficiency.tcl
Expected 1.42 < 1.4 (context: type eval line 34 cmd {assert {$frag < 1.4}} proc ::test)
*** [err]: Active defrag edge case: standalone in tests/unit/memefficiency.tcl
Expected 1.19 < 1.1 (context: type eval line 92 cmd {assert {$frag < 1.1}} proc ::start_server)
Cleanup: may take some time... OK

Yesterday, https://github.com/valkey-io/valkey/actions/runs/12150400350/job/33883093665:

*** [err]: Active defrag - AOF loading in tests/unit/memefficiency.tcl
Expected 1.45 < 1.4 (context: type eval line 34 cmd {assert {$frag < 1.4}} proc ::test)
*** [err]: Active defrag edge case: standalone in tests/unit/memefficiency.tcl
Expected 1.18 < 1.1 (context: type eval line 92 cmd {assert {$frag < 1.1}} proc ::start_server)
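To compare how far each run missed its limit, the `Expected X < Y` lines above can be parsed and turned into a relative overshoot. This is only a minimal sketch: the regular expression and the helper name `overshoot` are assumptions made for illustration and are not part of the Valkey test suite.

```python
import re

# Matches the "Expected <measured> < <limit>" part of a failed assertion line.
FAILURE_RE = re.compile(r"Expected (\d+(?:\.\d+)?) < (\d+(?:\.\d+)?)")


def overshoot(line):
    """Return (measured, limit, percent_over_limit), or None if no match."""
    match = FAILURE_RE.search(line)
    if match is None:
        return None
    measured = float(match.group(1))
    limit = float(match.group(2))
    percent_over = (measured - limit) / limit * 100.0
    return measured, limit, percent_over


log_lines = [
    "Expected 1.42 < 1.4 (context: type eval line 34 cmd {assert {$frag < 1.4}} proc ::test)",
    "Expected 1.19 < 1.1 (context: type eval line 92 cmd {assert {$frag < 1.1}} proc ::start_server)",
    "Expected 1.45 < 1.4 (context: type eval line 34 cmd {assert {$frag < 1.4}} proc ::test)",
    "Expected 1.18 < 1.1 (context: type eval line 92 cmd {assert {$frag < 1.1}} proc ::start_server)",
]

for line in log_lines:
    measured, limit, percent_over = overshoot(line)
    print(f"measured={measured} limit={limit} over_by={percent_over:.1f}%")
```

On the four lines quoted here, the AOF loading case misses its limit by roughly 1 to 4 percent, while the edge case misses by roughly 7 to 8 percent.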
zuiderkwast added the test-failure label Dec 5, 2024
@JimB123
Contributor

JimB123 commented Dec 5, 2024

I'll take a look. Some of the test logic looked a little brittle.

JimB123 added a commit to JimB123/valkey that referenced this issue Dec 10, 2024
Addresses valkey-io#1393

During AOF loading or long running script, this allows defrag to be
initiated.

Signed-off-by: Jim Brunner <[email protected]>
ranshid pushed a commit that referenced this issue Dec 11, 2024
Addresses #1393

Changes:
* During AOF loading or long running script, this allows defrag to be
initiated.
* The AOF defrag test was corrected to eliminate the wait period and
rely on non-timer invocations.
* Logic for "overage" time in defrag was changed. It previously
accumulated underage leading to large latencies in extreme tests having
very high CPU percentage. After several simple stages were completed
during infrequent blocked processing, a large cycle time would be
experienced.

Signed-off-by: Jim Brunner <[email protected]>
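The "overage" change described in the commit message can be illustrated with a toy model. This is a sketch of one possible reading of that message, not the actual Valkey code: the function name, the microsecond units, and the carry-forward rule are all assumptions. Each cycle aims for a target duration, and the difference between actual and target time is carried into the next cycle's budget. If time saved by short cycles (underage) is allowed to accumulate, a run of quick stages can build up credit for one very long cycle later; clamping the carry at zero avoids that.

```python
def next_cycle_budget_us(cycle_durations_us, target_us, clamp_underage):
    """Return the time budget for the next cycle after the given history.

    Hypothetical model: each cycle aims for target_us, and the signed
    difference (actual - target) is carried forward and subtracted from
    the next cycle's budget.
    """
    carry_us = 0
    for actual_us in cycle_durations_us:
        carry_us += actual_us - target_us
        if clamp_underage:
            # Only overage (running long) is remembered. Finishing early
            # earns no credit toward a longer cycle later on.
            carry_us = max(0, carry_us)
    # A budget can never be negative, however much overage is carried.
    return max(0, target_us - carry_us)


# Ten short cycles of 100 us each against a 500 us target.
history = [100] * 10
print(next_cycle_budget_us(history, 500, clamp_underage=False))  # 4500
print(next_cycle_budget_us(history, 500, clamp_underage=True))   # 500
```

In this model the accumulating variant grants a 4500 us cycle after ten short stages, which corresponds to the "large cycle time" mentioned in the commit message, while the clamped variant stays at the 500 us target.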
@JimB123
Contributor

JimB123 commented Dec 11, 2024

I fixed the first one with AOF. The 2nd one is a little weird. I'm not even completely sure that the test is valid. It seems to be trying to create a scenario where all of the jemalloc slabs have exactly the same fragmentation in an attempt to confuse the defrag process. I'm not sure that it's possible to reliably create this scenario. And even if we can create it, I'm not sure that it tests anything meaningful. @zvi-code can you look at this test also? I'd like a second opinion on this one: https://github.com/valkey-io/valkey/blob/unstable/tests/unit/memefficiency.tcl#L623-L728
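To show why uniformly fragmented slabs could confuse a defragmenter, here is a toy sketch. It assumes a hypothetical selection rule in which allocations are relocated only out of slabs that are emptier than the average slab. That rule, the function name, and the slab sizes are illustrative assumptions and are not taken from Valkey or jemalloc.

```python
def slabs_to_defrag(used_per_slab, slab_capacity):
    """Return indexes of slabs whose utilization is below the average.

    Hypothetical heuristic: only slabs emptier than average are worth
    evacuating, because their contents can be packed into fuller slabs.
    """
    utilizations = [used / slab_capacity for used in used_per_slab]
    average = sum(utilizations) / len(utilizations)
    return [i for i, u in enumerate(utilizations) if u < average]


# Mixed utilization: the nearly empty slab is selected.
print(slabs_to_defrag([2, 6, 8, 8], slab_capacity=8))  # [0]

# Every slab exactly half full: nothing is below average, so nothing is
# selected even though half of the reserved memory is unused.
print(slabs_to_defrag([4, 4, 4, 4], slab_capacity=8))  # []
```

Under this assumed rule, the second case is the kind of state the edge case test appears to aim for, and reaching it depends on every slab ending up with exactly the same fill level.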

vudiep411 pushed a commit to Autxmaton/valkey that referenced this issue Dec 15, 2024
Addresses valkey-io#1393