I found a problem in yRCA's output when the invoking service logs a message matching the client_timeout template proposed in /data/templates/chaos-echo.yml.
The problem was found while testing the Online Boutique application, where I intentionally added a sleep to the responding service (adservice). When asked to explain the error matching the client_timeout template, yRCA's output was: "Found no failure cascade to the considered event".
As for timeouts expiring, the Prolog explainer of yRCA currently handles two situations:
explicitly logged failures of invoked services, due to which the timeouts in invokers expired (lines 14-26) or
requests sent but never received/handled by invoked services (lines 77-87)
The described situation seems to relate to the first case (lines 14-26), for which yRCA should also consider the base case of a service taking "too long" to answer its invokers (the same case, but with no failure logged by the invoked service while the interaction is ongoing), e.g.,
causedBy(log(SI,I,T,E,M,Sev),[X],SI) :-
    (E=errorFrom(SJ,Id);E=timeout(SJ,Id)),
    failedInteraction(Id,(SI,I),(SJ,J),Ts,Te),
    \+ (log(SJ,J,U,_,_,SevJ), lte(SevJ,warning), Ts =< U, U =< Te),
    X=log(SI,I,T,E,M,Sev).
(This would also cover error responses caused by invokers sending "bad requests".)
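To make the proposed base case concrete, here is a minimal, self-contained sketch that can be loaded in SWI-Prolog. The facts are invented for illustration and do not come from real yRCA or Online Boutique logs. In particular, `failedInteraction/5` is stated here as a plain fact (in yRCA it is derived from the logs), and the `severity/2` table behind `lte/2` is an assumption based on syslog-style levels, where a lower number means a more severe event.

```prolog
% Hypothetical log facts: log(Service, Instance, Timestamp, Event, Message, Severity).
log(frontend,  f1, 12.0, timeout(adservice, r1), 'timeout calling adservice', err).
log(adservice, a1, 10.5, internal,               'serving ad request',        info).

% Hypothetical failed interaction r1 from frontend to adservice, open from 10.0 to 12.0.
% Stated as a fact here; in yRCA it would be derived from the logs.
failedInteraction(r1, (frontend, f1), (adservice, a1), 10.0, 12.0).

% Assumed severity ordering (syslog-style): lower number = more severe.
severity(emerg, 0).   severity(alert, 1).  severity(crit, 2).  severity(err, 3).
severity(warning, 4). severity(notice, 5). severity(info, 6).  severity(debug, 7).

% lte(S1, S2): S1 is at least as severe as S2 (assumed semantics).
lte(S1, S2) :- severity(S1, N1), severity(S2, N2), N1 =< N2.

% Proposed base case: the interaction failed, but the invoked service logged
% nothing at warning level or worse while the interaction was ongoing.
causedBy(log(SI,I,T,E,M,Sev),[X],SI) :-
    (E=errorFrom(SJ,Id);E=timeout(SJ,Id)),
    failedInteraction(Id,(SI,I),(SJ,J),Ts,Te),
    \+ (log(SJ,J,U,_,_,SevJ), lte(SevJ,warning), Ts =< U, U =< Te),
    X=log(SI,I,T,E,M,Sev).
```

With these facts, the query `causedBy(log(frontend,f1,12.0,timeout(adservice,r1),'timeout calling adservice',err), Explanation, Root).` should succeed, binding `Root` to `frontend` and `Explanation` to a one-element list containing the frontend log itself. If adservice had logged an event at `warning` or worse between 10.0 and 12.0, the negated goal would fail and this rule would not apply, leaving the case to the existing rules for explicitly logged failures.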
I think yRCA should recognize that frontend had an internal error due to the timeout that happened in adservice.