Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FR] Added connection counter stats for backup groups #1552

Draft
wants to merge 4 commits into
base: master
Choose a base branch
from

Conversation

ethouris
Copy link
Collaborator

@ethouris ethouris commented Sep 16, 2020

This should be updated after adding:

  • response time stats
  • stats tables in apps

@ethouris ethouris linked an issue Sep 16, 2020 that may be closed by this pull request
1 task
@ethouris ethouris marked this pull request as draft September 16, 2020 10:57
@ethouris ethouris self-assigned this Sep 16, 2020
@ethouris ethouris added [core] Area: Changes in SRT library core Impact: Low Type: Enhancement Indicates new feature requests labels Sep 16, 2020
@ethouris ethouris added this to the v1.5.0 - Sprint 23 milestone Sep 16, 2020
Copy link
Collaborator

@maxsharabayko maxsharabayko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CBytePerfMon structure keeps increasing. 🤔
8 new stats entries for group usage only (4 interval-based and 4 accumulated).
We might need something like srt_groupstats(CBytePerfMon*, CGroupPerfMon*, bool clear, bool total)
And maybe a newer CBytePerfMon, that does not mix interval-based and total. Requires some thinking effort.


uint32_t countBreak; // increased with every broken link in the group
uint32_t countActivate; // increased with every activated link in the group
uint32_t countEager; // increased with every case when unstable link became stable again
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Recovered or restored might be better. 🤔

Suggested change
uint32_t countEager; // increased with every case when unstable link became stable again
uint32_t countRecovered; // increased with every case when unstable link became stable again

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The motivation for this field is to count the cases when a link has been considered unstable, which has forced the group sender to activate the backup link, but then it turned out that the link actually wasn't broken and has returned to stability. This means that the activation was based on a false alert and resulted in only unnecessary overhead. A user who is reviewing these statistics, when seeing any nonzero value here, and moreover widely exceeding the value from countBroken, it should be an information for him that he definitely has too small value of groupstabtimeo option.

Hard to fit such an explanation in one comment line.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some better names I could propose would be:

  • countFalseAlert
  • countMissed
  • countExcess

uint32_t countBreak; // increased with every broken link in the group
uint32_t countActivate; // increased with every activated link in the group
uint32_t countEager; // increased with every case when unstable link became stable again
uint32_t countSilence; // increased with every case of switching a link back to idle
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe better? 🤔

Suggested change
uint32_t countSilence; // increased with every case of switching a link back to idle
uint32_t countDeactivated; // increased with every case of switching a link back to idle

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That looks ok, if you prefer. Just note I can't simply commit this suggestion because the name must be changed everywhere, including documentation.

@mbakholdina mbakholdina modified the milestones: v1.5.0 - Sprint 23, v1.5.0 - Sprint 26 Sep 21, 2020
@mbakholdina mbakholdina modified the milestones: v1.5.0 - Sprint 26, v1.5.0 Oct 14, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
[core] Area: Changes in SRT library core Type: Enhancement Indicates new feature requests
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[FR] Add reconnection counter for the group in main/backup mode
3 participants