
Implemented computation of segment replication stats at shard level #17055

Open
vinaykpud wants to merge 12 commits into main

Conversation

@vinaykpud (Contributor) commented Jan 19, 2025

Description

The method implemented here computes the segment replication stats at the shard level, instead of relying on the primary shard to compute stats based on reports from its replicas.

The method implemented in this PR serves segment replication stats for the following core APIs (an illustrative sketch follows the list):

  1. Nodes Stats API (/_nodes/stats)
  2. Cluster Stats API (/_cluster/stats)
  3. Indices Stats API (/_stats or /{index}/_stats)
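
As a rough illustration of that shift (the class and method names below are hypothetical, not the PR's actual API), each shard keeps its own view of how far behind it is, and the node serving any of these APIs reads that local state directly instead of asking the primary to aggregate reports from its replicas:

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import org.opensearch.core.index.shard.ShardId;

// Hypothetical shard-local holder for segment replication stats.
final class ShardLocalReplicationStats {
    record Stats(long bytesBehind, long replicationLagMillis) {}

    private final Map<ShardId, Stats> statsByShard = new ConcurrentHashMap<>();

    // Updated on the shard's own node whenever a newer primary checkpoint is observed.
    void update(ShardId shardId, long bytesBehind, long replicationLagMillis) {
        statsByShard.put(shardId, new Stats(bytesBehind, replicationLagMillis));
    }

    // Read directly when _nodes/stats, _cluster/stats, or _stats is served.
    Stats get(ShardId shardId) {
        return statsByShard.getOrDefault(shardId, new Stats(0L, 0L));
    }
}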

Related Issues

Resolves #16801
Related to #15306

Check List

  • Functionality includes testing.
  • API changes companion pull request created, if applicable.
  • Public documentation issue/PR created, if applicable.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

The method implemented here computes the segment replication stats at the shard level,
instead of relying on the primary shard to compute stats based on reports from its replicas.

Signed-off-by: Vinay Krishna Pudyodu <[email protected]>
Signed-off-by: Vinay Krishna Pudyodu <[email protected]>
Signed-off-by: Vinay Krishna Pudyodu <[email protected]>
Signed-off-by: Vinay Krishna Pudyodu <[email protected]>
Signed-off-by: Vinay Krishna Pudyodu <[email protected]>
Contributor

❌ Gradle check result for dd0406d: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Signed-off-by: Vinay Krishna Pudyodu <[email protected]>
Signed-off-by: Vinay Krishna Pudyodu <[email protected]>
Contributor

❌ Gradle check result for 59e2617: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@vinaykpud vinaykpud closed this Jan 22, 2025
@vinaykpud vinaykpud reopened this Jan 22, 2025

Contributor

✅ Gradle check result for 90b96a8: SUCCESS


codecov bot commented Jan 23, 2025

Codecov Report

Attention: Patch coverage is 72.41379% with 16 lines in your changes missing coverage. Please review.

Project coverage is 72.37%. Comparing base (2794655) to head (90b96a8).
Report is 7 commits behind head on main.

Files with missing lines Patch % Lines
...nsearch/indices/replication/SegmentReplicator.java 72.72% 6 Missing and 3 partials ⚠️
.../replication/checkpoint/ReplicationCheckpoint.java 66.66% 1 Missing and 2 partials ⚠️
...in/java/org/opensearch/index/shard/IndexShard.java 60.00% 1 Missing and 1 partial ⚠️
...rc/main/java/org/opensearch/index/IndexModule.java 0.00% 1 Missing ⚠️
...c/main/java/org/opensearch/index/IndexService.java 50.00% 1 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##               main   #17055      +/-   ##
============================================
+ Coverage     72.19%   72.37%   +0.17%     
- Complexity    65304    65385      +81     
============================================
  Files          5301     5301              
  Lines        303774   303817      +43     
  Branches      44034    44040       +6     
============================================
+ Hits         219323   219891     +568     
+ Misses        66458    65882     -576     
- Partials      17993    18044      +51     


@@ -149,7 +150,8 @@ private static ReplicationCheckpoint readCheckpointFromIndexInput(
             in.readLong(),
             in.readLong(),
             in.readString(),
-            toStoreFileMetadata(uploadedSegmentMetadataMap)
+            toStoreFileMetadata(uploadedSegmentMetadataMap),
+            in.readLong()
Member

How does bwc work here with this added field? Do we need to bump CURRENT_VERSION on RemoteSegmentMetadata ? @sachinpkale can you pls assist?

Collaborator

If data is being read without the writer serialising it, we will end up with an end-of-stream exception.

Member

Right, so we'd need to check the version before attempting to read. I think we may want to add a mixed-cluster test here in any case. I don't think there's an existing qa package that runs with remote-enabled clusters.
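
To make the backward-compatibility concern concrete, here is a minimal sketch of a version-gated read; the class, method, and version constant are assumptions for illustration, not the actual RemoteSegmentMetadata code:

import java.io.IOException;
import org.apache.lucene.store.IndexInput;

final class CheckpointMetadataCompat {
    // Assumed: the metadata version that first serialized the new trailing long.
    private static final int VERSION_WITH_CREATED_TIMESTAMP = 2;

    // Read the trailing long only if the writer's version serialized it, so metadata
    // written by older nodes does not trigger an end-of-stream failure on read.
    static long readCreatedTimestampOrDefault(IndexInput in, int writtenVersion) throws IOException {
        if (writtenVersion >= VERSION_WITH_CREATED_TIMESTAMP) {
            return in.readLong();
        }
        return 0L; // default for metadata written before the field existed
    }
}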

@@ -294,6 +294,7 @@ public synchronized void onNewCheckpoint(final ReplicationCheckpoint receivedChe
return;
}
updateLatestReceivedCheckpoint(receivedCheckpoint, replicaShard);
Member

We should be able to remove this and the latestReceivedCheckpoint map from this class as the replicator is the source of truth now.

;
final Map<String, StoreFileMetadata> indexStoreFileMetadata = indexReplicationCheckPoint.getMetadataMap();
// If primaryLastRefreshedCheckpoint is null, we will default to indexReplicationCheckPoint
// so that we can avoid any failures
Member

When does this happen? In that case, can we not assume the shard is up to date and return an empty ReplicationStats object instead?

);
final Map<String, StoreFileMetadata> storeFileMetadata = primaryLastRefreshedCheckpoint.getMetadataMap();

final Store.RecoveryDiff diff = Store.segmentReplicationDiff(storeFileMetadata, indexStoreFileMetadata);
Member

Rather than storing two ReplicationCheckpoint maps, let's store a single ShardId-to-ReplicationStats map and compute the stats as information about new primary checkpoints is received.

I'm also not following the lastOnGoingReplicationCheckpoint computation, because the checkpoint used to compute bytes behind is not the same as the one used for lag. To simplify, I think we only care about primaryLastRefreshed and should compute both stats against that, where lag is the time between primaryLastRefreshed and the currently searchable checkpoint on the shard.

This would still fall in line with our definition of 'Replication lag' in the docs:

segments.segment_replication.max_replication_lag (long): The maximum amount of time, in milliseconds, taken by a replica to catch up to its primary.
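
For illustration, a hedged sketch of that simplification, assuming the shard records when it received the primary's last refreshed checkpoint (the class, method, and parameter names are made up for this example):

import java.util.List;
import org.opensearch.index.store.Store;
import org.opensearch.index.store.StoreFileMetadata;

final class StatsAgainstPrimaryLastRefreshed {
    // Bytes behind: total size of segment files this shard is missing, or holds stale
    // copies of, relative to the primary's last refreshed checkpoint.
    static long bytesBehind(Store.RecoveryDiff diff) {
        return totalLength(diff.missing) + totalLength(diff.different);
    }

    // Lag: how long the primary's last refreshed checkpoint has been ahead of what the
    // shard can currently search; zero once the shard has caught up.
    static long replicationLagMillis(long primaryLastRefreshedReceivedAtMillis, long nowMillis, long bytesBehind) {
        return bytesBehind == 0L ? 0L : Math.max(0L, nowMillis - primaryLastRefreshedReceivedAtMillis);
    }

    private static long totalLength(List<StoreFileMetadata> files) {
        return files.stream().mapToLong(StoreFileMetadata::length).sum();
    }
}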

Contributor Author

As we discussed:

  1. Let's compute the diff (bytesBehind) on the fly, every time we receive the primary's refreshed checkpoint.
  2. Along with that, instead of maintaining two different maps, we can combine them into a single map, e.g. Map<ShardId, Map<SegmentInfoVersion, ComputedDiff>>, where ComputedDiff holds the computation time and the bytes behind:

class ComputedDiff {
    long timeStamp;
    long bytesBehind;
}

  3. Once replication for a checkpoint completes, we can remove all entries whose segmentInfoVersion is lower than that of the completed checkpoint.
  4. When node stats are requested, we can use the diff from the map and calculate the lag from the current time and the timeStamp in the ComputedDiff (sketched below).
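
A hedged sketch of that bookkeeping: the ComputedDiff shape, the map layout, and the pruning rule follow the list above, while the class and method names, the navigable map, and the inclusive prune bound are assumptions:

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentNavigableMap;
import java.util.concurrent.ConcurrentSkipListMap;
import org.opensearch.core.index.shard.ShardId;

final class SegmentReplicationDiffTracker {
    // Computed once when the primary's refreshed checkpoint arrives; lag is derived later.
    record ComputedDiff(long timeStampMillis, long bytesBehind) {}

    // ShardId -> (segmentInfosVersion -> diff computed when that checkpoint was received)
    private final Map<ShardId, ConcurrentNavigableMap<Long, ComputedDiff>> diffs = new ConcurrentHashMap<>();

    void onPrimaryRefreshedCheckpoint(ShardId shardId, long segmentInfosVersion, long bytesBehind, long nowMillis) {
        diffs.computeIfAbsent(shardId, id -> new ConcurrentSkipListMap<>())
            .put(segmentInfosVersion, new ComputedDiff(nowMillis, bytesBehind));
    }

    // Once replication catches up to a version, drop the entries the shard has caught up to.
    void onReplicationCompleted(ShardId shardId, long completedSegmentInfosVersion) {
        ConcurrentNavigableMap<Long, ComputedDiff> byVersion = diffs.get(shardId);
        if (byVersion != null) {
            byVersion.headMap(completedSegmentInfosVersion, true).clear();
        }
    }

    // At stats time, lag = now minus the time the oldest outstanding checkpoint was received.
    long currentLagMillis(ShardId shardId, long nowMillis) {
        ConcurrentNavigableMap<Long, ComputedDiff> byVersion = diffs.get(shardId);
        if (byVersion == null || byVersion.isEmpty()) {
            return 0L;
        }
        return Math.max(0L, nowMillis - byVersion.firstEntry().getValue().timeStampMillis());
    }
}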

@mch2 (Member) commented Jan 24, 2025

This works; we are doing similar tracking now with ReplicationTracker on the primary. You will need to wire this up with index events (IndexEventListener) through the segrep service so that shard entries are cleared on removal.
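
An illustrative sketch of that wiring, assuming current package locations; the listener name and the map it clears are assumptions, and the real change would clear whatever structure the replicator ends up holding:

import java.util.Map;
import org.opensearch.common.settings.Settings;
import org.opensearch.core.index.shard.ShardId;
import org.opensearch.index.shard.IndexEventListener;
import org.opensearch.index.shard.IndexShard;

// Illustrative listener: drops a shard's tracked replication-stats entry when the
// shard is closed, so per-shard state does not outlive the shard itself.
final class ReplicationStatsCleanupListener implements IndexEventListener {
    private final Map<ShardId, ?> perShardReplicationState;

    ReplicationStatsCleanupListener(Map<ShardId, ?> perShardReplicationState) {
        this.perShardReplicationState = perShardReplicationState;
    }

    @Override
    public void afterIndexShardClosed(ShardId shardId, IndexShard indexShard, Settings indexSettings) {
        perShardReplicationState.remove(shardId);
    }
}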

Labels: enhancement, Search:Performance
Development

Successfully merging this pull request may close these issues.

[Feature Request] Redefine the computation of segment replication metrics in Node Stats
3 participants