Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Oak 11113 - MissingLastRevSeekerTest fix flaky test #1745

Open
wants to merge 3 commits into
base: trunk
Choose a base branch
from

Conversation

ionutzpi
Copy link
Contributor

MissingLastRevSeekerTest fix flaky test getNonSplitDocs

@@ -238,6 +238,11 @@ private void getBrokenSurrogateAndInitializeDifferentStores(DocumentStoreFixture
if (nodeStore != null) {
nodeStore.dispose();
}
try {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In theory, disposing the node store should take care of this. We should find out why it doesn't.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is just a precocious adding. No truly impact.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, then please remove that change, because a) not needed, b) not related, and c) potentially confusing.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@@ -249,7 +251,7 @@ public void dispose() {
}
dns.runBackgroundOperations();
//seeker should return only non split documents
int docs = Iterables.size(seeker.getCandidates(0));
int docs = Iterables.size(seeker.getCandidates(Objects.requireNonNull(dns.getRoot().getLastRevision().getRevision(dns.getClusterId())).getTimestamp()));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this related to the flakyness?

Copy link
Contributor Author

@ionutzpi ionutzpi Sep 27, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. I removed 0 and replaced it with proper revision version timestamp.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok.

a) Could you put a short description of what was wrong into the ticket?

b) I would recommend that @egli, @rishabhdaim or @Joscorbe confirm that his is the correct change.

Copy link
Contributor

@rishabhdaim rishabhdaim Sep 27, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

0 was used to fetch all the documents. Using the timestamp of lastRevision would result in skipping some documents.

If there is no proper reason, I will revert the change.

Copy link
Contributor

@rishabhdaim rishabhdaim Sep 27, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was able to reproduce this test failure on my local. The root cause the is presence of document with id 1:/test in the RDB-H2.

Once I did mvn clean install this problem went away.
IMO, it is because of some earlier test(s) didn't cleanup properly rather than test.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I found the Root cause.

These tests are failing due to the CompressedDocumentPropertyStateTest test not cleaning up properly.
On my local, it is causing either Mongo or RDB-H2 to fail.

CompressedDocumentPropertyStateTest creates the problematic node 1:/test causing test failures.

CompressedDocumentPropertyStateTest has been added recently which explains the recent failures of MissingLastRevSeekerTest.

@ionutzpi could you please look into this?
cc @Joscorbe @reschke

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. I added CompressedDocumentPropertyStateTest changes to solve this issue. Julian told that documentfixturestore should close when closing nodestore(seems that this is not true).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is not removing for RDB, I dont know the reason yet.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rishabhdaim if I add removeMe.add("1:/test"); after assert it will fix the issue. Do you agree with this change?

@rishabhdaim
Copy link
Contributor

Created PR #1749 with proposed changes.

@reschke
Copy link
Contributor

reschke commented Sep 29, 2024

@rishabhdaim - thanks for the analysis. This also explains why the test hasn't been flaky for a long time.

Note that since a few weeks, AbstractDocumentStoreTest will check the NODES table before/after. So if a test fails like that, a check of the unit test log might be interesting.

@ionutzpi - trust me; the RDB based tests assume that the test nodes are cleaned up (unless the test has been designed to purge tables). This is different from MongoDB because back then, I didn't want tests to nuke tables they did not create. This may have been the wrong approach, and we can discuss to change this. I'm just not sure it's worth the effort.

@reschke
Copy link
Contributor

reschke commented Sep 29, 2024

Once it's clear which test is flaky, please adjust the issue title and make sure that the PR commit message matches.

@rishabhdaim
Copy link
Contributor

rishabhdaim commented Sep 29, 2024

@reschke I have created PR to fix the flacky tests here #1749.
Could you please review the same. Also the build passes after the proposed fix.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants