debug Test{AD,IPA}.testQualifiedUsers #19578

martinpitt · 2023-11-07T12:28:37Z

Investigating these failures which have a 30% failure rate, but only with firefox.

martinpitt · 2023-11-07T12:44:12Z

I tried to reproduce this locally, by repeating the login, but that passes. So I really need a trace-enabled failed log from the bots.

--- test/verify/check-system-realms
+++ test/verify/check-system-realms
@@ -161,7 +161,11 @@ class CommonTests:
                   self.admin_password, self.admin_user), timeout=300)
 
         # log in as domain admin and check that we can do privileged operations
-        b.login_and_go('/system/services#/systemd-tmpfiles-clean.timer', user=f'{self.admin_user}@cockpit.lan', password=self.admin_password)
+        for _ in range(30):
+            b.login_and_go('/system/services#/systemd-tmpfiles-clean.timer', user=f'{self.admin_user}@cockpit.lan', password=self.admin_password)
+            time.sleep(5)
+            b.logout()
+            time.sleep(3)
         return
 
         b.wait_in_text("#statuses", "Running")

This log provides one. First the logout:

-> ph_mouse("#logout:not([disabled]):not([aria-disabled=true])","click",0,0,0,false,false,false,false)
<- {'type': 'undefined'}
-> wait: ph_is_present("#login")
CDP: executionContextDestroyed 3
CDP: frameNavigated {"frame":{"id":"8","loaderId":"15077482692674","url":"http://127.0.0.2:9391/system","securityOrigin":null,"mimeType":null}}
CDP: executionContextCreated {"context":{"id":9,"origin":"http://127.0.0.2:9391","name":"","auxData":{"isDefault":true,"frameId":"8","type":"default"}}}
<- raise {'response': {'message': 'Execution context was destroyed.'}}
CDP: executionContextDestroyed 4
CDP: executionContextDestroyed 8
CDP: executionContextDestroyed 6
CDP: executionContextDestroyed 7
-> wait: ph_is_present("#login")
<- {'type': 'undefined'}

and then the login:

-> Page.navigate({"url": "http://127.0.0.2:9391/system/services#/systemd-tmpfiles-clean.timer"})
<- {'frameId': '8'}
-> wait: ph_is_present("#login")
<- {'type': 'undefined'}
-> wait: ph_is_visible("#login")
<- {'type': 'undefined'}
-> wait: ph_is_present("#login-user-input:not([disabled]):not([aria-disabled=true])")
<- {'type': 'undefined'}
-> wait: ph_is_visible("#login-user-input:not([disabled]):not([aria-disabled=true])")
<- {'type': 'undefined'}
-> ph_set_val("#login-user-input","[email protected]")
<- {'type': 'undefined'}
-> wait: ph_is_present("#login-password-input:not([disabled]):not([aria-disabled=true])")
<- {'type': 'undefined'}
-> wait: ph_is_visible("#login-password-input:not([disabled]):not([aria-disabled=true])")
<- {'type': 'undefined'}
-> ph_set_val("#login-password-input","foobarFoo123")
<- {'type': 'undefined'}
-> window.localStorage.setItem("superuser:[email protected]", "any");
<- {'type': 'undefined'}
-> wait: ph_is_present("#login-button:not([disabled]):not([aria-disabled=true])")
<- {'type': 'undefined'}
-> wait: ph_is_visible("#login-button:not([disabled]):not([aria-disabled=true])")
<- {'type': 'undefined'}
-> ph_mouse("#login-button:not([disabled]):not([aria-disabled=true])","click",0,0,0,false,false,false,false)
CDP: executionContextDestroyed 9
<- raise {'response': {'message': 'Execution context was destroyed.'}}

This is very strange -- context 9 is still from the "old" login page after logout. So it seems this is somehow "too fast" with filling out the login page, before the new one from Page.navigate() even loads.

Locally I see the execution context destruction earlier, where it doesn't hurt:

-> Page.navigate({"url": "http://127.0.0.2:9191/system/services#/systemd-tmpfiles-clean.timer"})
<- {'frameId': '8'}
-> wait: ph_is_present("#login")
<- {'type': 'undefined'}
-> wait: ph_is_visible("#login")
<- {'type': 'undefined'}
-> wait: ph_is_present("#login-user-input:not([disabled]):not([aria-disabled=true])")
<- raise {'response': {'message': 'Execution context was destroyed.'}}
-> wait: ph_is_present("#login-user-input:not([disabled]):not([aria-disabled=true])")
<- {'type': 'undefined'}
-> wait: ph_is_visible("#login-user-input:not([disabled]):not([aria-disabled=true])")
<- {'type': 'undefined'}
-> ph_set_val("#login-user-input","[email protected]")
<- {'type': 'undefined'}

That leaves the question what context this is -- it may be a delayed effect from logout, or the login page really does a page reload.

I created a much faster toy test to study this:

        self.login_and_go("/system", user="user")
        time.sleep(3)
        b.logout()
        time.sleep(3)

        print("XXXXXXX second login")
        b.login_and_go("/system", user="user")

Firefox

First login:

CDP: executionContextCreated {"context":{"id":1,"origin":"null","name":"","auxData":{"isDefault":true,"frameId":"8","type":"default"}}}
> warning: This error page has no error code in its security info
-> Page.navigate({"url": "http://127.0.0.2:9091/system"})
<- {'frameId': '8', 'loaderId': '299286206087240'}
-> wait: ph_is_present("#login")
CDP: executionContextDestroyed 1
CDP: frameNavigated {"frame":{"id":"8","loaderId":"299286206087240","url":"http://127.0.0.2:9091/system","securityOrigin":null,"mimeType":null}}
CDP: executionContextCreated {"context":{"id":2,"origin":"http://127.0.0.2:9091","name":"","auxData":{"isDefault":true,"frameId":"8","type":"default"}}}
<- raise {'request': {'method': 'Runtime.evaluate', 'params': {'expression': 'ph_wait_cond(() => ph_is_present("#login"), 30000, null)', 'silent': False, 'awaitPromise': True, 'trace': 'wait: ph_is_present("#login")', 'contextId': -1}}, 'response': {'message': 'Cannot find context with specified id', 'data': 'evaluate@chrome://remote/content/cdp/domains/content/Runtime.sys.mjs:279:15\nexecute@chrome://remote/content/cdp/domains/DomainCache.sys.mjs:95:25\nreceiveMessage@chrome://remote/content/cdp/sessions/ContentProcessSession.sys.mjs:79:45\n'}}
-> wait: ph_is_present("#login")
<- {'type': 'undefined'}

... and then no further execution context changes until clicking the login button.

Then the logout and second login looks similar: It destroys the execution context from the previous login page, whose error message hits the first wait command after the Page.navigate():

-> ph_mouse("#logout:not([disabled]):not([aria-disabled=true])","click",0,0,0,false,false,false,false)
<- {'type': 'undefined'}
-> wait: ph_is_present("#login")
CDP: executionContextDestroyed 3
<- raise {'response': {'message': 'Execution context was destroyed.'}}
CDP: frameNavigated {"frame":{"id":"8","loaderId":"299685638045755","url":"http://127.0.0.2:9091/system","securityOrigin":null,"mimeType":null}}
CDP: executionContextCreated {"context":{"id":8,"origin":"http://127.0.0.2:9091","name":"","auxData":{"isDefault":true,"frameId":"8","type":"default"}}}
CDP: executionContextDestroyed 4
CDP: executionContextDestroyed 5
CDP: executionContextDestroyed 6
CDP: executionContextDestroyed 7
-> wait: ph_is_present("#login")
<- {'type': 'undefined'}
-> wait: ph_is_visible("#login")
<- {'type': 'undefined'}

XXXXXXX second login
-> switch to frame None
-> Page.navigate({"url": "http://127.0.0.2:9091/system"})
<- {'frameId': '8', 'loaderId': '299286206087319'}
-> wait: ph_is_present("#login")
CDP: executionContextDestroyed 8
<- raise {'response': {'message': 'Execution context was destroyed.'}}
CDP: frameNavigated {"frame":{"id":"8","loaderId":"299286206087319","url":"http://127.0.0.2:9091/system","securityOrigin":null,"mimeType":null}}
CDP: executionContextCreated {"context":{"id":9,"origin":"http://127.0.0.2:9091","name":"","auxData":{"isDefault":true,"frameId":"8","type":"default"}}}
-> wait: ph_is_present("#login")

Chromium

It looks similar -- the Page.navigate() destroys the initial "empty tab" execution context, loads the login page's, and keeps it until clicking the Login button:

CDP: executionContextCreated {"context":{"id":1,"origin":"://","name":"","uniqueId":"-7111436715566051418.7017623621479427639","auxData":{"isDefault":true,"type":"default","frameId":"5A9AD69543D0E37153D760E0EA7DB82B"}}}
<- {}
-> switch to frame None
-> Page.navigate({"url": "http://127.0.0.2:9091/system"})
<- {'frameId': '5A9AD69543D0E37153D760E0EA7DB82B', 'loaderId': '90E40DBB4E6EC8D1A30F83A873ADBE4D'}
-> wait: ph_is_present("#login")
CDP: frameNavigated {"frame":{"id":"5A9AD69543D0E37153D760E0EA7DB82B","loaderId":"90E40DBB4E6EC8D1A30F83A873ADBE4D","url":"http://127.0.0.2:9091/system","domainAndRegistry":"","securityOrigin":"http://127.0.0.2:9091","mimeType":"text/html","adFrameStatus":{"adFrameType":"none"},"secureContextType":"SecureLocalhost","crossOriginIsolatedContextType":"NotIsolated","gatedAPIFeatures":[]},"type":"Navigation"}
CDP: executionContextCreated {"context":{"id":2,"origin":"http://127.0.0.2:9091","name":"","uniqueId":"-6202134415423023087.-2894943449835186796","auxData":{"isDefault":true,"type":"default","frameId":"5A9AD69543D0E37153D760E0EA7DB82B"}}}
<- raise {'request': {'method': 'Runtime.evaluate', 'params': {'expression': 'ph_wait_cond(() => ph_is_present("#login"), 30000, null)', 'silent': False, 'awaitPromise': True, 'trace': 'wait: ph_is_present("#login")', 'contextId': -1}}, 'response': {'code': -32000, 'message': 'Cannot find context with specified id'}}
-> wait: ph_is_present("#login")

Logout / second login is again similar:

-> ph_mouse("#logout:not([disabled]):not([aria-disabled=true])","click",0,0,0,false,false,false,false)
<- {'type': 'undefined'}
-> wait: ph_is_present("#login")
CDP: executionContextDestroyed 4
CDP: executionContextDestroyed 5
CDP: executionContextDestroyed 6
CDP: executionContextDestroyed 7
<- raise {'request': {'method': 'Runtime.evaluate', 'params': {'expression': 'ph_wait_cond(() => ph_is_present("#login"), 15000, null)', 'silent': False, 'awaitPromise': True, 'trace': 'wait: ph_is_present("#login")', 'contextId': 3}}, 'response': {'code': -32000, 'message': 'Execution context was destroyed.'}}
CDP: frameNavigated {"frame":{"id":"5A9AD69543D0E37153D760E0EA7DB82B","loaderId":"A3C874166658C804BC5E702EFB7610D9","url":"http://127.0.0.2:9091/system","domainAndRegistry":"","securityOrigin":"http://127.0.0.2:9091","mimeType":"text/html","adFrameStatus":{"adFrameType":"none"},"secureContextType":"SecureLocalhost","crossOriginIsolatedContextType":"NotIsolated","gatedAPIFeatures":[]},"type":"Navigation"}
CDP: executionContextCreated {"context":{"id":8,"origin":"http://127.0.0.2:9091","name":"","uniqueId":"-3177871754975844986.-5214187059523986893","auxData":{"isDefault":true,"type":"default","frameId":"5A9AD69543D0E37153D760E0EA7DB82B"}}}
CDP: {"source":"rendering","level":"info","text":"Autofocus processing was blocked because a document already has a focused element.","timestamp":1699421447875.437,"url":"http://127.0.0.2:9091/system"}
-> wait: ph_is_present("#login")
<- {'type': 'undefined'}
-> wait: ph_is_visible("#login")
<- {'type': 'undefined'}

XXXXXXX second login
-> switch to frame None
-> Page.navigate({"url": "http://127.0.0.2:9091/system"})
<- {'frameId': '5A9AD69543D0E37153D760E0EA7DB82B', 'loaderId': '6EF6FC3D8645EB0DA1DAB69C19A4BA3B'}
-> wait: ph_is_present("#login")
CDP: frameNavigated {"frame":{"id":"5A9AD69543D0E37153D760E0EA7DB82B","loaderId":"6EF6FC3D8645EB0DA1DAB69C19A4BA3B","url":"http://127.0.0.2:9091/system","domainAndRegistry":"","securityOrigin":"http://127.0.0.2:9091","mimeType":"text/html","adFrameStatus":{"adFrameType":"none"},"secureContextType":"SecureLocalhost","crossOriginIsolatedContextType":"NotIsolated","gatedAPIFeatures":[]},"type":"Navigation"}
CDP: executionContextCreated {"context":{"id":9,"origin":"http://127.0.0.2:9091","name":"","uniqueId":"-5962254357164802096.5968245114790688992","auxData":{"isDefault":true,"type":"default","frameId":"5A9AD69543D0E37153D760E0EA7DB82B"}}}
<- raise {'request': {'method': 'Runtime.evaluate', 'params': {'expression': 'ph_wait_cond(() => ph_is_present("#login"), 15000, null)', 'silent': False, 'awaitPromise': True, 'trace': 'wait: ph_is_present("#login")', 'contextId': 8}}, 'response': {'code': -32000, 'message': 'Cannot find context with specified id'}}
-> wait: ph_is_present("#login")

So my gut feeling is that the Page.navigate() doesn't wait for the new execution context to get loaded, and that error message just hits a random next wait command. I also triggered another run to see the "execution context created/destroyed" messages in the check-realms logs.

Commit 8284618 dropped all explicit "wait for page load" calls (fka. `expect_load()`) in favor of allowing page loads during wait commands. This is better in almost all cases, but it created a race condition in tests which logged out and back in. In particular: ... TODO

martinpitt · 2023-11-08T07:25:37Z

Bazinga! I sent #19583 for landing and running all tests.

martinpitt added the no-test For doc/workflow changes, or experiments which don't need a full CI run, label Nov 7, 2023

debug Test{AD,IPA}.testQualifiedUsers

5b6736e

martinpitt force-pushed the testqualified branch from 2f7b3da to 5b6736e Compare November 8, 2023 06:07

martinpitt mentioned this pull request Nov 8, 2023

test: Wait for a page load in Browser.open() #19583

Merged

1 task

martinpitt closed this Nov 8, 2023

martinpitt deleted the testqualified branch November 8, 2023 07:25

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

debug Test{AD,IPA}.testQualifiedUsers #19578

debug Test{AD,IPA}.testQualifiedUsers #19578

martinpitt commented Nov 7, 2023 •

edited

Loading

martinpitt commented Nov 7, 2023 •

edited

Loading

martinpitt commented Nov 8, 2023

debug Test{AD,IPA}.testQualifiedUsers #19578

debug Test{AD,IPA}.testQualifiedUsers #19578

Conversation

martinpitt commented Nov 7, 2023 • edited Loading

martinpitt commented Nov 7, 2023 • edited Loading

Firefox

Chromium

martinpitt commented Nov 8, 2023

martinpitt commented Nov 7, 2023 •

edited

Loading

martinpitt commented Nov 7, 2023 •

edited

Loading