Commit
total_steal_time() can decrease from one call to the next, which violates the rule that a counter metric must never decrease.

This happens because steal time is computed as "awake time (wall clock)" minus "CPU thread time (getrusage-style CPU time)". The awake time is itself "time since reactor start" minus "accumulated sleep time", and it can be under-counted because sleep time is over-counted: the true sleep time cannot be determined exactly, since we can only take timestamps before and after a period that might involve a sleep. Currently, sleep is over-counted by even more than this inherent error, because it is measured at a point which includes significant non-sleep work.

The result is that when there is little to no true steal, CPU time can exceed the measured awake wall-clock time, resulting in negative steal.

This change "fixes" this by enforcing that steal time never decreases. At measurement time, we check whether "true steal" (i.e., the old definition of steal) has increased since the last measurement and, if so, add that delta to our monotonic steal counter. Otherwise the delta is dropped.

While not ideal, this yields a useful metric which mostly clamps away the error related to negative steal times and, more importantly, avoids the catastrophic failure of PromQL functions when they are applied to a counter that decreases.

Fixes scylladb#1521.
Fixes scylladb#2374.
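
The clamping described above can be sketched as follows. This is a minimal illustration, not the code in this commit: the class name `monotonic_steal` and its `update()` method are hypothetical, and the choice to refresh the "last true steal" baseline on every measurement (even when the delta is dropped) is an assumption.

```cpp
#include <chrono>

// Hypothetical helper illustrating the clamping idea; not the reactor's actual code.
class monotonic_steal {
    std::chrono::nanoseconds _last_true_steal{0}; // previous raw (possibly decreasing) value
    std::chrono::nanoseconds _total{0};           // monotonic counter that gets exported
public:
    // awake: wall-clock time since start minus accumulated sleep time
    // cpu:   thread CPU time (getrusage-style)
    std::chrono::nanoseconds update(std::chrono::nanoseconds awake,
                                    std::chrono::nanoseconds cpu) {
        auto true_steal = awake - cpu;                // old definition, may go negative
        auto delta = true_steal - _last_true_steal;
        _last_true_steal = true_steal;                // assumption: baseline always moves
        if (delta > std::chrono::nanoseconds::zero()) {
            _total += delta;                          // only increases are accumulated
        }
        return _total;                                // never decreases between calls
    }
};
```

With this sketch, a sequence of raw steal values 10ns, 5ns, 20ns, -20ns would yield exported values 10ns, 10ns, 25ns, 25ns: decreases are dropped and increases are counted relative to the most recent raw value.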