Make Fail Overflow Test less flaky #164

abutch3r · 2024-02-22T11:13:51Z

The Fail overflow test can fail if the exception is thrown after more than 256 messages have been processed.

Increase the window within which a failure can occur by applying a change similar to the one made in #158.

Fail overflow test can fail to pass if exception is thrown after more then 256 messages have been processed microprofile#162

eclipse-microprofile-bot · 2024-02-22T11:15:03Z

Can one of the admins verify this patch?

Azquelt · 2024-02-22T11:56:57Z

@eclipse-microprofile-bot please test this

Azquelt · 2024-02-22T11:58:47Z

To be honest, this test looks pretty sketchy. We're submitting to the executor in a tight loop, but the methods downstream of the processor seem to also run as quickly as possible.

I don't think there's any guarantee in this test that the emitter will overflow?

cescoffier · 2024-02-22T12:05:16Z

Yes, it may not overflow depending on the timing of the test.

We may want to disable it completely.

Azquelt · 2024-02-22T14:16:10Z

@eclipse-microprofile-bot test this please

abutch3r · 2024-02-22T14:54:08Z

@cescoffier
I have updated the test bean to be in line with the other beans which all contain a 1 millisecond sleep, which should for all practical purposes mean that the test will now throw the failure as expected.

Azquelt

I think this looks good.

Azquelt · 2024-02-26T11:01:37Z

> 1 second sleep

That'll be a 1 millisecond sleep - which should still be enough to ensure that we send messages to the emitter faster than they can be processed.
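To illustrate why a small per-message sleep makes the overflow close to certain, here is a minimal sketch using only the JDK. It assumes a bounded buffer drained by a single consumer thread; the class name, method name, and capacity are hypothetical and are not taken from the TCK test bean.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

/** Hypothetical stand-in for the test bean: a bounded buffer drained by a slow consumer. */
final class SlowConsumerOverflow {

    /**
     * Offers {@code total} items in a tight loop while a consumer thread sleeps
     * {@code sleepMillis} per item. Returns the index of the first rejected item,
     * or -1 if the buffer never filled up.
     */
    static int firstRejectedIndex(int capacity, int total, long sleepMillis) {
        BlockingQueue<Integer> buffer = new ArrayBlockingQueue<>(capacity);
        Thread consumer = new Thread(() -> {
            try {
                while (true) {
                    buffer.take();               // blocks until an item is available
                    Thread.sleep(sleepMillis);   // simulated per-message processing cost
                }
            } catch (InterruptedException e) {
                // interrupted by the producer once it is done: stop consuming
            }
        });
        consumer.setDaemon(true);
        consumer.start();
        try {
            for (int i = 0; i < total; i++) {
                if (!buffer.offer(i)) {          // non-blocking: false means the buffer is full
                    return i;
                }
            }
            return -1;                           // the consumer kept up, no overflow
        } finally {
            consumer.interrupt();
        }
    }

    public static void main(String[] args) {
        System.out.println("no sleep:   first rejected index = " + firstRejectedIndex(16, 1000, 0));
        System.out.println("1 ms sleep: first rejected index = " + firstRejectedIndex(16, 1000, 1));
    }
}
```

With the sleep, the consumer removes at most about one item per millisecond while the producer offers as fast as it can, so the buffer is expected to fill shortly after `capacity` offers. Without the sleep the result depends on thread scheduling, which is the flakiness discussed here.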

kabir · 2024-02-27T18:58:04Z

We're also seeing occasional failures in DropOverflowStrategyOverflowTest:

https://issues.redhat.com/browse/WFLY-17635

Would a similar fix be appropriate there? I've not looked beyond verifying that the classes are different :-)

Emily-Jiang · 2024-02-27T22:13:20Z

> We're also seeing occasional failures in DropOverflowStrategyOverflowTest:
>
> https://issues.redhat.com/browse/WFLY-17635
>
> Would a similar fix be appropriate there? I've not looked beyond verifying that the classes are different :-)

Please open a different issue and propose a PR @kabir

Emily-Jiang · 2024-02-27T22:13:50Z

@cescoffier are you okay with this PR?

cescoffier · 2024-02-28T07:39:18Z

@ozangunalp can you check? It should be fine (as discussed).

ozangunalp · 2024-02-28T08:30:41Z

It seems to me like it'll almost always fail.

Because of the sleep(1), the second emit will be called before the very first message arrives at the out channel, causing a lack-of-requests exception without any messages registered at the output.

I think we should just get rid of the assertion assertThat(bean.output()).isNotEmpty().... The emission and processing happen on different threads, so we can't be sure that we process any messages.

abutch3r · 2024-02-28T09:54:33Z

@kabir your issue matches the fix in #158, which is not yet in a published version of the TCK. When a new version is published, the fix for that test will be included.

Azquelt · 2024-02-28T11:29:03Z

> It seems to me like it'll almost always fail.
>
> Because of the sleep(1), the second emit will be called before the very first message arrives at the out channel, causing a lack-of-requests exception without any messages registered at the output.
>
> I think we should just get rid of the assertion assertThat(bean.output()).isNotEmpty().... The emission and processing happen on different threads, so we can't be sure that we process any messages.

Good point.

In practice, the await() line may often wait long enough for the first message to arrive, since it only checks the condition every 100ms by default, but it is still a nasty race condition.
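The masking effect of a coarse poll interval can be sketched with a plain polling loop. This is a hypothetical helper written for illustration, not Awaitility's implementation; it assumes a fixed interval between checks, like the 100ms default mentioned above.

```java
import java.util.function.BooleanSupplier;

/** Hypothetical helper, not Awaitility's API: polls a condition at a fixed interval. */
final class PollingAwait {

    /** Returns true if the condition held at one of the polls before the timeout, else false. */
    static boolean awaitUntil(BooleanSupplier condition, long pollMillis, long timeoutMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            if (System.nanoTime() >= deadline) {
                return false;
            }
            try {
                Thread.sleep(pollMillis);   // nothing is observed during this window
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        // The condition becomes true after about 10 ms, but it is only noticed at the next poll.
        boolean seen = awaitUntil(() -> System.nanoTime() - start > 10_000_000L, 100, 1_000);
        long elapsedMillis = (System.nanoTime() - start) / 1_000_000L;
        System.out.println("seen=" + seen + " after ~" + elapsedMillis + " ms");
    }
}
```

Because the caller only resumes at a poll, the assertions that follow typically run some time after the condition first became true. That delay gives other threads time to make progress, which can hide the race without removing it.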

Azquelt · 2024-02-28T12:14:40Z

Ok, @abutch3r and I just had a chat about this test.

As soon as the emitter overflows, it emits a failure and terminates the subscription (1.6). At this point all further messages should result in the Emitter throwing IllegalStateException because the stream is terminated.

So when the emitter first throws an exception, neither the first message, nor the failure signal may have reached the end of the stream yet.

Currently the test awaits on an exception being thrown from the emitter. We think if we change it to wait for the onError signal to arrive, that would guarantee that the assertThat(bean.output()).isNotEmpty().hasSizeLessThan(999); check would pass, since the first item should be processed before the error signal.

At that point, it would be theoretically possible for the test thread not to yet see an exception thrown from the emitter since that's done asynchronously, so we should await() on that arriving too.
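The sequence described above can be modelled in a few lines. This is a simplified, single-threaded sketch and not the MicroProfile Emitter API: the class and its methods are hypothetical, the failure type is a placeholder, and it assumes that the overflowing send signals the failure rather than throwing itself.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Hypothetical, single-threaded model of the behaviour described above.
 * It is not the MicroProfile Emitter API.
 */
final class FailOnOverflowModel {
    private final int capacity;
    private final List<String> buffer = new ArrayList<>();
    private Throwable failure;      // the failure signalled downstream, if any
    private boolean terminated;

    FailOnOverflowModel(int capacity) {
        this.capacity = capacity;
    }

    /** Accepts the item, or fails the stream on overflow; throws once the stream is terminated. */
    void send(String item) {
        if (terminated) {
            throw new IllegalStateException("stream is terminated");
        }
        if (buffer.size() >= capacity) {
            // Assumption: the overflowing send signals the failure rather than throwing itself.
            failure = new IllegalStateException("overflow: no downstream requests");
            terminated = true;
            return;
        }
        buffer.add(item);
    }

    Throwable failure() {
        return failure;
    }

    List<String> accepted() {
        return buffer;
    }

    public static void main(String[] args) {
        FailOnOverflowModel emitter = new FailOnOverflowModel(2);
        emitter.send("1");
        emitter.send("2");
        emitter.send("3");   // overflow: failure recorded, stream terminated
        try {
            emitter.send("4");
        } catch (IllegalStateException e) {
            System.out.println("rejected after termination: " + e.getMessage());
        }
        System.out.println("accepted=" + emitter.accepted() + ", failure=" + emitter.failure());
    }
}
```

In this model the failure exists as soon as the third send returns, while the exception only appears on the fourth send. That gap is why the test may need to emit one more message before asserting on the exception.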

Wait for the failure on the stream to occur instead of the exception on the emitter - this ensures that at least the first message will be successfully processed and that a failure did occur before assertions are checked.

The failure may occur sufficiently late in the test execution that there is a failure but not yet an exception. In this case, emit one more message and wait for the exception before checking.

emitThree was unused and would close the stream via `.complete()` - repurpose it to send one message without closing the stream if successful.

Azquelt · 2024-03-01T10:49:27Z

...e/microprofile/reactive/messaging/tck/channel/overflow/FailOverflowStrategyOverflowTest.java

+        //If an exception has not yet been thrown after the failure occurred, try one more message
+        if (bean.exception() == null) {
+            bean.emitOne();
+            await().until(() -> bean.exception() != null);


This line isn't needed because emitOne runs synchronously.

ozangunalp · 2024-03-01T11:06:32Z

I am trying to find a way to make this test work, but even when waiting on failure() rather than exception(), the error signal will arrive first and complete the stream before the first message arrives at the downstream out channel.

Azquelt · 2024-03-01T12:05:48Z

> the error signal will arrive first and complete the stream before the first message arrives at the downstream out channel

Hmm, I was under the impression that the failure signal should follow the items down the stream, not reaching the final subscriber until the previous two items had been processed, but I guess that's not the case?

Emily-Jiang · 2024-03-06T18:17:02Z

@cescoffier @ozangunalp do you have any objections to merge this PR?

ozangunalp · 2024-03-06T23:03:42Z

I think the previous changes are making this test even more flaky.
I suggest we keep the old assertion about the size of the output and make the other changes:

    @Test
    public void testOverflow() {
        bean.emitALotOfItems();

        await().until(() -> bean.failure() != null);
        assertThat(bean.failure()).isInstanceOf(Exception.class);
        assertThat(bean.output()).doesNotContain("999");
        assertThat(bean.output()).hasSizeBetween(0, 256);
        // If an exception has not yet been thrown after the failure occurred, try one more message
        if (bean.exception() == null) {
            bean.emitOne();
        }
        assertThat(bean.exception()).isInstanceOf(IllegalStateException.class);
    }

I think we can also revert the added Thread.sleep(1).

Azquelt · 2024-03-08T11:05:05Z

I think you need the Thread.sleep(1) to ensure that each message isn't just processed as it comes in.

Each call to emitter.send() results in an async task starting, and the test relies on the next call to emitter.send() running before the async task has completed. Though this is likely to happen for at least one of the 1000 calls, the Thread.sleep(1) almost guarantees it.

Having a lock in there so that the test can block tasks from completing until the emitter fails would be an alternative.
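A minimal sketch of that lock-based alternative, using a CountDownLatch from the JDK. The class, the bounded queue, and the capacity are hypothetical stand-ins for the test bean and the emitter buffer, so this only illustrates the gating idea.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical sketch: the consumer is gated by a latch until the buffer has overflowed. */
final class GatedConsumerOverflow {

    /** Where the first offer was rejected (-1 if never) and how many items were processed. */
    static final class Result {
        final int firstRejectedIndex;
        final int processed;

        Result(int firstRejectedIndex, int processed) {
            this.firstRejectedIndex = firstRejectedIndex;
            this.processed = processed;
        }
    }

    static Result run(int capacity, int total) {
        BlockingQueue<Integer> buffer = new ArrayBlockingQueue<>(capacity);
        CountDownLatch overflowed = new CountDownLatch(1);
        AtomicInteger processed = new AtomicInteger();

        Thread consumer = new Thread(() -> {
            try {
                overflowed.await();                 // process nothing until released
                while (buffer.poll() != null) {     // then drain what was accepted
                    processed.incrementAndGet();
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        consumer.start();

        int firstRejected = -1;
        for (int i = 0; i < total; i++) {
            if (!buffer.offer(i)) {                 // buffer full: this is the overflow
                firstRejected = i;
                break;
            }
        }
        overflowed.countDown();                     // release the consumer in every case
        try {
            consumer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return new Result(firstRejected, processed.get());
    }

    public static void main(String[] args) {
        Result r = run(4, 10);
        System.out.println("first rejected index = " + r.firstRejectedIndex
                + ", processed = " + r.processed);
    }
}
```

Since the consumer cannot take anything before the latch is released, the overflow point no longer depends on timing: the first rejected offer is the one at index `capacity`, whenever `total` is greater than `capacity`. The trade-off is that the test bean needs an extra hook so that the test can control the latch.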

abutch3r · 2024-03-21T17:00:17Z

@ozangunalp

I have removed the .isNotEmpty() check from the assertions and reintroduced the .doesNotContain("999") as that validates that no messages are later accepted.

The main issue we saw with the test is that the failure could occur after more than 500 messages were processed, so the 256 check does need to be increased to allow for the situation where we process more than expected. The sleep, which is used in every other emitter test, should provide a better guarantee that we see it fail.

Are these changes now satisfactory?

ozangunalp · 2024-03-22T12:18:02Z

Yes, this looks good to me, thanks! @abutch3r @Azquelt

Make Fail Overflow Test less flaky

3e5afe3

Fail overflow test can fail to pass if exception is thrown after more then 256 messages have been processed microprofile#162

Add sleep to Drop Emitter test bean

29dfac9

Azquelt approved these changes Feb 26, 2024

Azquelt reviewed Mar 1, 2024

As emitOne is synchronous, no need to wait for exception.

cc1b8f1

Remove isNotEmpty assertion and unneeded force of one messsage

3a5eeb5

abutch3r force-pushed the flaky_emitter_overflow_fail_tck branch from 44b9861 to 3a5eeb5 Compare March 21, 2024 16:56

ozangunalp approved these changes Mar 22, 2024

Azquelt merged commit e025074 into microprofile:main Mar 22, 2024

Emily-Jiang added this to the 3.0.1 milestone Apr 16, 2024
