Add Wallaby driver #74

ftes · 2024-05-14T10:34:42Z

Fixes #73

Done

add Wallaby driver
prepare test setup for Wallaby (add app.js, convert Endpoint to proper/regular endpoint)
move Assertions into Driver protocol (wrap with retry for Wallaby)
run existing tests twice via test_also_with_js macro

To do

documentation
- update main moduledoc: third Driver Wallaby, use via @tag :js
- caveat: conn passed to visit/2 is mostly ignored (not supported: conn.assigns, supported: conn.req_cookies)
- possible discrepancies:
  - errors for fields without an ID (if no unique selector can be generated from other attributes)
fix NoRouteError tests (broken due to changes to Endpoint)
open_browser
- fill in form values (not automatically preserved in DOM tree via value/selected/checked attributes)
- check stylesheets are loaded correctly
copy initial conn.req_cookies via Wallaby.Browser.set_cookie/4

ftes · 2024-05-14T19:13:57Z

Changed to draft, but @germsvel would love your feedback already.

arnodirlam

Nice initiative! Just a few early comments:

test/priv/static/assets/app.js

TODO.md

ftes · 2024-05-21T20:19:37Z

lib/phoenix_test/wallaby.ex

+  defp map_errors(fun) do
+    fun.()
+  rescue
+    e ->


I would prefer:

e in Wallaby.QueryError -> raise ArgumentError, e.message

But that fails to match. Why? 🤔

I don't know. I would expect that to work. Is it that a different error was being raised when you tested?

Here's what happens if I add in Wallaby.QueryError:

1) test click_link/2 raises error when there are multiple links with same text (JS) (PhoenixTest.StaticTest) test/phoenix_test/static_test.exs:121 Expected exception ArgumentError but got Wallaby.QueryError (Expected to find 1 visible element that matched the css 'a' and contained the text 'Multiple links', but 2 visible elements were found. ) code: assert_raise ArgumentError, fn ->

test/phoenix_test/live_test.exs

ftes · 2024-05-21T20:38:12Z

Updated:

Support @tag :js (via small copy&paste snippet for test module)
Avoid duplicating tests by introducing test_also_with_js macro that runs test twice: 1. with Static or Live + 2. with Wallaby driver.

germsvel · 2024-05-22T12:14:46Z

Hi @ftes, thanks so much for this work! I really appreciate it. I know you've already put a lot of time and effort into this.

Sorry for not responding sooner. I was hoping to get a good chunk of time to start reviewing this, since it's such a large change. But it doesn't look like I'll have the time anytime soon.

Given that, I would suggest we put this on pause.

I know having something that can handle js in PhoenixTest would be awesome. (at least, I'd like that). But I haven't thought carefully about how to introduce that yet. One way might be Wallaby (which is what you have here), but before we go in that direction, I was hoping to explore some other potential alternatives (like Playwright).

Another potential idea I had was to expose the driver protocol so that other libraries could implement their own handlers -- and that would open the possibility for a Wallaby or Playwright or whatever else.

So, just to share where my mind is:

I'd like to do at least a shallow look at what other alternatives we could have to deal with JavaScript before we introduce Wallaby
I'd still like to review all the code you're trying to introduce here. I think there's plenty to learn about what it'll take to introduce something like Wallaby. Maybe we end up introducing this PR.

I just need to carve out time to review this PR and have an idea in my head of what I'd like the JS handling to be like. Does that make sense? Of course, I'm open to other suggestions -- maybe I'm missing something here, and we should just go forward with this path.

ftes · 2024-05-22T12:24:34Z

Thanks @germsvel :)

This work is mostly done. Pretty much only documentation + NoRouteError fix missing from my side.

I appreciate that it is a sizeable PR which demands significant time for review though.

The only potential blocker I see is if you found any changes to existing code problematic.
E.g.

moving Assertions functions to the Driver protocol
changes to Endpoint to allow e2e tests

Apart from that, I don't see why we shouldn't move forward with this for now:

The public API doesn't change.
The behaviour of existing drivers doesn't change.
The Wallaby driver can be drop-in-replaced in the future. Transparently for any consumers, keeping the generic@tag :js in place.
- Except of course for Hyrum's law.
- Problem: offering support for issues with the new driver until then. Solution proposal: Document as "beta" feature. See it as a chance to gather feedback to guide future initiatives.

germsvel · 2024-05-22T12:32:02Z

@ftes I like what you're saying,

The Wallaby driver can be drop-in-replaced in the future. Transparently for any consumers, keeping the generic@tag :js in place.

And

Document as "beta" feature. See it as a change to gather feedback to guide future initiatives.

As for your two potential blockers (assertions and endpoint changes), I'll have to give those a look.

Since this is so close to being done, I'll try to carve out some time to review this before going on my tour of alternatives 😄

ftes · 2024-06-03T19:25:28Z

@germsvel Would it help you if I broke this PR into a series/stack of smaller PRs?
Wouldn't take much time on my size, and would make review less daunting. Plus might trigger some valuable discussion along the way.

germsvel · 2024-06-03T19:47:19Z

Thanks for the offer @ftes, but don't spend your time on it just yet! I've been doing a little bit of research on playwright to see if that's a realistic alternative.

I know you said we could introduce this and then swap it out if we need to, but once something is in the codebase, it'll be hard to remove it. Don't want to ship breaking changes too often (even in pre 1.0 version).

Just hang in there for me if you can. I would really like to handle JS somehow -- just want to do due diligence before we fully commit to Wallaby.

ftes · 2024-06-12T13:22:59Z

Hey @germsvel how's it going?
Still happy to split the PR. Even if you decide to go with Playwright, some preliminary work would likely be helpful: Moving Assertions into the Driver protocol.

And getting the preliminary changes in would also allow us to implement an external Wallaby Driver in the current form until playwright is usable. Without having to fork phoenix_test.

germsvel

@ftes thanks again for the PR!

I haven't finished reviewing the full thing, but I figured I should at least do it in chunks when I can. There's a lot of work done here. Really appreciate it. 👍

I left a few comments and questions that I hope are helpful.

I'll continue reviewing the PR when I have more time. Let me know if you have questions in the meantime.

lib/phoenix_test/live.ex

lib/phoenix_test/wallaby.ex

germsvel · 2024-06-12T13:58:12Z

lib/phoenix_test/wallaby.ex

+      # Set via JS to avoid locale format issues
+      type when type in ~w(date datetime-local time week) ->
+        js = """
+          el = document.querySelector('#{field.selector}');
+          el.value = '#{value}';
+        """
+
+        session.session
+        |> Wallaby.Browser.execute_script(js)


I'm not sure I'm following here. Why do we do this via JS?

<input type="date" has different formats depending on the user locale.

E.g. DD.MM.YYYY (german), MM.DD.YYYY (US).
While the browser is usually smart enough to cope with different separates (-, .) it will get mixed up if the month and day are swapped.
Basically giving an incorrect input.

I've had to work around this issue before, this is the best we came up with back then.

Maybe there's a better solution.
E.g. forcing the Chrome locale to a known good value and applying a correctly formatted input string via the usual Wallaby fill_in. But we got stuck on that route (CI was still behaving differently than local machines).

lib/phoenix_test/wallaby.ex

germsvel · 2024-06-12T14:04:49Z

lib/phoenix_test/wallaby.ex

+
+  defp trigger_phx_change_validations(session, query) do
+    if has?(session.session, query) do
+      Wallaby.Browser.send_keys(session.session, query, [:tab])


This is a clever way to handle this! Do you know if it works reliably? I haven't tested this myself.

I've been using it for a while in tests.
"Works for me", but I can't speak to how stable this is.

germsvel · 2024-06-12T14:06:53Z

lib/phoenix_test/wallaby.ex

+  defp retry(fun, timeout_ms \\ 300, interval_ms \\ 10) do
+    now = DateTime.to_unix(DateTime.utc_now(), :millisecond)
+    timeout_at = DateTime.utc_now() |> DateTime.add(timeout_ms, :millisecond) |> DateTime.to_unix(:millisecond)
+    retry(fun, now, timeout_at, interval_ms)
+  end
+
+  defp retry(fun, now, timeout_at, _interval_ms) when now >= timeout_at do
+    fun.()
+  end
+
+  defp retry(fun, _now, timeout_at, interval_ms) do
+    fun.()
+  rescue
+    AssertionError ->
+      Process.sleep(interval_ms)
+      now = DateTime.to_unix(DateTime.utc_now(), :millisecond)
+      retry(fun, now, timeout_at, interval_ms)
+  end


I'm not necessarily against this, but I wonder if this is something we need to introduce now. I think we should rely as much as possible on Wallaby's built-in waiting mechanisms that come from find. Is there a reason why we're including this here and wrapping our assertions with retry?

There are two routes we can go down to implement phoenix_test assertions for Wallaby:

Wrap Wallaby's built-in assertions
a. (+) built-in retry/waiting

Use PhoenixTest.Assertions
a. (+) behaviour automatically the same as other Drivers

I opted for (2.) to ensure the Wallaby Driver behaves as similarly as possible as the other Drivers.
But if we warp PhoenixTest.Assertions we have to add our own retry around them.

Similar case:
For the mutations (fill_in) I also used PhoenixTest.Query instead of Wallaby.Query to find the correct field.
Wallaby doesn't support exact-match label semantics (e.g. field label Race vs Race 2).

Does that make sense to you?

@ftes makes total sense. I finally had a chance to give the implementation a good look. It's really interesting how far we can get if we just render HTML with Wallaby. That being said, I'd like to explore relying on Wallaby's built-in finders as much as possible.

I'm not asking you to do that here (though, of course, feel free to do so too), but I'm gonna create a branch based on this one and test out a few alternatives to see how that would look.

One hard limitation you'll run into is that Wallaby doesn't provide "Exact match" finders for the input labels.
That will definitely break some tests (Race vs Race 2). Maybe worth a PR for Wallaby?

But I agree. Consistency between Drivers is nice.
But users would likely rather care more about having a feature test that simulates real user behaviour as well as possible. And using Wallaby's built-in functions where possible is probably the best way to simulate that. Instead of doing to much "HTML interpretation" ourselves.

But users would likely rather care more about having a feature test that simulates real user behaviour as well as possible. And using Wallaby's built-in functions where possible is probably the best way to simulate that. Instead of doing to much "HTML interpretation" ourselves.

Yeah, I agree. I also imagine Wallaby's waiting mechanism logic has a few iterations on it (so it's probably ironed out any kinks), and it's also lower level (so might be able to do things we can't). So I'd rather rely on that instead.

I wanted to give this a shot and hit a stumbling block:
CSS selectors vs XPath

PhoenixTest exposes CSS selectors in the API and uses Elixir code to implement text match semantics (exact: true/false).

Wallaby usually uses XPath to encode text match semantics

A little background on Wallaby.Query

Wallaby mostly tries to push element searching to the browser using xpath and css queries.
Most of the Query functions generate xpath queries.
Example: Query.select/2 generates an xpath query with a non-exact match on the label (also supports ID filter etc.).
Nearly all Query functions generate non-exact xpath queries. Query.option is the only exception. This behaviour is not configurable. However, either wallaby could be extended to support customisable exact-match semantics or we could hand-craft the xpath queries in phoenix_test.

Wallaby supports additional query options (e.g. text) via Elixir post-processing.
This is less efficient, but flexible:

The browser executes the base query (either css or xpath).

The resulting element list is then filtered by inner text in Elixir.

The dilemma

CSS selectors and text matching don't play nicely.
CSS selectors and XPath can't be combined (although there are some attempts to transpile CSS selectors to XPath, I don't know if CSS selectors are a subset of XPath though).

It's difficult to get the best of both worlds: Expose CSS selectors to PhoenixTest users and use as much vanilla Wallaby as possible. Vanilla Wallaby relies mostly on XPath.

Where to go from here

Currently, the Wallaby driver always fetches the full HTML from the browser to do anything. That's slow.
Instead, we could at least push CSS queries to the browser where possible and filter the resulting element list via our own Elixir code (customisable exact-match semantics).
That could be a sweet spot of performance, consistency with the other drivers, and little need for changes to Wallaby itself.

I think this question might also have to be answered for Playwright.
Also, the discussion might inform what the PhoenixTest API should expose:

generic locators that can be "transpiled" to whatever works best for the driver

CSS selectors (status quo)

From a quick glance, Playwright mostly exposes generic locators but does offer full control via css or xpath if need be.

Probably not an issue for Playwright after all. After playing with it a bit (#136) I've seen that Playwright:

allows mixing different selector "engines" in locators (e.g. CSS + XPath)

has in-built selectors for label and text that support exact and non-exact matching, very similar to what PhoenixTest does

so we can get the best of both worlds: expose CSS selectors to users and combine them with efficient text selectors, executed with retry right in the browser

germsvel · 2024-06-12T14:08:00Z

lib/phoenix_test/wallaby.ex

+  defp map_errors(fun) do
+    fun.()
+  rescue
+    e ->


I don't know. I would expect that to work. Is it that a different error was being raised when you tested?

germsvel · 2024-06-12T14:15:32Z

@ftes just saw your comment after I reviewed the PR 😆

You mentioned:

Still happy to split the PR. Even if you decide to go with Playwright, some preliminary work would likely be helpful: Moving Assertions into the Driver protocol.

Having reviewed the parts that I reviewed, I do think there could be some nice splits (if you're interested in doing them). Moving the assertions into the Driver protocol would be a good one, for sure.

We could also add the current_path function as another one.

I don't want you to do more work if you don't feel like it, but if you want to do that, that'd be a great way to introduce changes in pieces. (always easier to review too 😄 )

ftes · 2024-06-17T08:44:46Z

Thanks for the review @germsvel

Having reviewed the parts that I reviewed, I do think there could be some nice splits (if you're interested in doing them).

Absolutely, I'm on it!

Split so far: #88, #89, #90, #91

germsvel · 2024-08-17T12:23:11Z

lib/phoenix_test/wallaby.ex

@@ -0,0 +1,259 @@
+defmodule PhoenixTest.Wallaby do


We probably want to wrap most of this module in Code.ensure_loaded? to check if Wallaby is included.

We shouldn't make Wallaby a required dependency in case some people don't want to use it. What do you think?

Good point! We should definitely not force Wallaby on users.

ftes · 2024-08-19T05:52:42Z

Did you gain any further insights by looking at Playwright, @germsvel ?

germsvel · 2024-08-19T12:35:44Z

@ftes yes! Great question.

Here are my thoughts so far. No particular order, and take them with a grain of salt. They're all subject to change as we discover more:

We should probably integrate both wallaby and playwright. People can choose which they can use.
That means we want to make sure the abstraction works for both
Because I'd like to integrate both Wallaby and Playwright, I think a @tag driver: :playwright or @tag driver: :wallaby would be what I'd suggest we use.
I also want to iron out a good API for Wallaby and Playwright. Should they start with a conn like the rest of the tests and we transform it into a session after visit/2? Or should we rely more on Wallaby's and Playwright's built-in stuff and get something like wallaby's session in the test context?
Wallaby has Wallaby.Feature that comes with the library. We don't have to use it, but I wanted to explore if we're losing anything by not using it. And does that mean we should introduce a use PhoenixTest.ConnCase or something like that? It might be helpful if we need to do some test setup.
The playwright-elixir library has a lot of work done, but some things are still missing. The biggest issue for us is that it hasn't implemented auto-waiting assertions yet. There _are_ways to make assertions, but it's kinda clunky. So, for now, when I introduce playwright it'll be under some "experimental" section (just so people know).
I tried avoiding playwright altogether (to see if we could handle things) and going directly to Chrome Driver Protocol. In the ruby world, Capybara (Wallaby equivalent) has a Cuprite driver which uses Ferrum. That's basically what they do -- but it seems like a lot of extra work to take on. So, I'm hoping we can rely on playwright-elixir since it already has a big head start (of course, playwright has a bigger surface area than what we need).

I know your PR seemed really close to getting Wallaby in here. And I'm sorry we haven't merged it in yet. Thanks for the patience! I just want to land at some good API before we fully introduce it.

ftes · 2024-09-06T12:55:29Z

Because I'd like to integrate both Wallaby and Playwright, I think a @tag driver: :playwright or @tag driver: :wallaby would be what I'd suggest we use.

That seems sensible.

I also want to iron out a good API for Wallaby and Playwright. Should they start with a conn like the rest of the tests and we transform it into a session after visit/2? Or should we rely more on Wallaby's and Playwright's built-in stuff and get something like wallaby's session in the test context?

I'd say: It depends on what the goal of offering alternative drivers is.

Is it plug-and-play interchangeability of drivers? Without the user needing to know/care what Wallaby is and how it works? Then use as little of the Wallaby/Playwright "standard" setup/assertions as possible.
Is it to to offer the PhoenixTest API to users that still want maximum control of the Wallaby/Playwright internals? Then use all the "standard" stuff.

Personally, I'd vote for plug&play. For me this would be the most common requirement.
For the few use cases where I need full control (e.g. test interaction of concurrent user session) I would e.g. use vanilla Wallaby then.

Wallaby has Wallaby.Feature that comes with the library. We don't have to use it, but I wanted to explore if we're losing anything by not using it. And does that mean we should introduce a use PhoenixTest.ConnCase or something like that? It might be helpful if we need to do some test setup.

I think this only adds support for @sessions. That offers the caller more flexibility (e.g. two parallel user sessions, controll session configuration).
This PR instead simply enforces a single session with default settings.

ftes · 2024-09-06T13:13:48Z

The playwright-elixir library has a lot of work done, but some things are still missing. The biggest issue for us is that it hasn't implemented auto-waiting assertions yet. There _are_ways to make assertions, but it's kinda clunky. So, for now, when I introduce playwright it'll be under some "experimental" section (just so people know).

That wouldn't be an issue if we use the approach in this PR: implement auto awaiting in Driver (wrap with retry).

ftes · 2024-10-05T15:44:10Z

config/test.exs

+  otp_app: :phoenix_test,
+  js_logger: nil,
+  chromedriver: [
+    headless: System.get_env("WALLABY_HEADLESS", "t") in ["t", "true"],


Should go in runtime.exs

ftes · 2024-10-05T15:45:03Z

lib/phoenix_test/link.ex

@@ -17,7 +18,7 @@ defmodule PhoenixTest.Link do
      raw: link_html,
      parsed: link,
      id: id,
-      selector: selector,
+      selector: Element.build_selector(link),


Not a required change. Just for consistency with the other modules.

ftes · 2024-10-31T15:49:04Z

Closing in favour of #145

ftes marked this pull request as draft May 14, 2024 19:13

arnodirlam reviewed May 15, 2024

View reviewed changes

test/priv/static/assets/app.js Outdated Show resolved Hide resolved

TODO.md Outdated Show resolved Hide resolved

TODO.md Outdated Show resolved Hide resolved

ftes force-pushed the feature/wallaby branch 3 times, most recently from f2163a9 to a5f6d7a Compare May 21, 2024 17:19

ftes commented May 21, 2024

View reviewed changes