Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix mutation observer on broken pages #33

Open
wants to merge 1 commit into
base: web-extension
Choose a base branch
from

Conversation

phiresky
Copy link

@phiresky phiresky commented Dec 9, 2020

Some websites with broken HTML currently don't work correctly, since the MutationObserver watches for changes on the head > title element, but the title element still works from the browsers' perspective even when it is in the body.

One example is youtube right now for some videos: (i'm guessing a/b testing something):
image

image

image

setting document.querySelector("body>title").textContent = "foo" updates the title on that page.

The extension currently misses that website, causing no URL to appear in the window title. This PR fixes this on both firefox and chrome.

Minimal example:

<!doctype html>
<head>
</head>
<body>
<title>hello</title>
foo

<script>setTimeout(() => document.title="updated", 2000)</script></body>


In addition, this PR adds the subtree: true flag to the mutationobserver, since that is also required for some edge cases:

https://stackoverflow.com/a/22989490/2639190

or specifically: https://bugs.chromium.org/p/chromium/issues/detail?id=134322


There is one more edge case that this still doesn't handle: If no title element exists at all from the start, and then later document.title is set in JS, the browser auto-adds the title element to the head, but the mutation observer misses that change.

@phiresky
Copy link
Author

phiresky commented Dec 9, 2020

By the way, my use case for this extension is completely different from KeePass, so I could fork it I guess, but the above still applies to the original use case and it would be great if we could keep this change upstream :)

@erichgoldman
Copy link
Owner

Thanks for the PR. I understand your use case and desire to support sites which are not standards-compliant.

There is a performance impact the broader the scope of the Mutation Observer. There are usually minimal changes under <head />, but on modern sites is quite active.subtree: true` also would impact performance, but likely not so much since there shouldn't be descendants regardless of location.

It looks like broken a/b testing in your YouTube example since there is a <head /> element and most of the tags in your screenshot belong up there and not in <body />. Are you experiencing this elsewhere or commonly on YouTube?

I will do a bit more research on the subtree: true issue. I read the old bug report and StackOverflow post. Have you personally experienced this problem on any website in the past year? If it's not a major performance impact and could fix things, I'd consider it.

In terms of the <title> being in the <body /> the suggestion I have is to write a separate simple extension that checks for these the presence of a <title /> tag in the <body /> AND NOT in the <head /> and then copies it up to the <body /> on load. Unless you are experiencing the case where the title changes dynamically and is in the wrong location, this will be result in less impact. Further, if this is only an issue on YouTube, you could set the extension to only run on YouTube and not on every page. A bit of a Unix-y approach to keep the features in any one extension minimal. Given the performance impact I do not think I would expand the scope of the Mutation Observer in the extension.

@phiresky
Copy link
Author

As far as I understand, subtree: true only applies to children of the watched element - which is <title> and not <head>. So really there shouldn't be any performance impact.

Regarding the title selector: Selecting by tag name should only be a single hashmap or btree lookup on page load - given how often querySelectors are used browsers have to have indexes to look those up faster than going through every element anyways. I doubt the performance of that is noticeable at all - it might even make it better since the selector is simpler (not having to first search for the head element and then the title element). The title element only exists once in every single so it's not like it has to go through multiple possibilities either (and querySelector always only returns the first element).

In fact, whatever negative impact is caused by changing the selector could probably be inverted 10x by simply replacing querySelector with getElementByTagName: https://humanwhocodes.com/blog/2010/09/28/why-is-getelementsbytagname-faster-that-queryselectorall/ . Lmk if i should do that.

Have you personally experienced this problem on any website in the past year

The problem is: I don't know. I've used this extension and logged the firefox window title every 30s for the past year, and I have a lot of websites that are sometimes not tagged correctly. e.g. Youtube, Google Search, Discord, Rust Docs, etc. When looking at a random sample of those sites except Youtube now they seem to have the title set correctly. So I guess most are either a race condition in the title updates or just because the title is captured in the period before the url is added.

It also sometimes seems to set it multiple times:
image though that might be because I've played around with the extension fork.

In any case, my use case is this: https://github.com/phiresky/track-pc-usage-rs . I guess at some point I have to make my own extension anyway to get more detail.

@erichgoldman
Copy link
Owner

There could be a number of things, but with PWAs/SPAs, etc there are a lot of updates. You may also have other extensions interfering in some way.

Years ago, under old style and XUL extensions, you could easily pull this information from the browser and not have to query from inside the page. You could just add this information after the browser name or it its own variable. It's pretty tricky to track state and pull parts of the <title /> as it changes. I believe when I first wrote the code i considered using some other structure to track state and compare differences, it's possible I just wanted to minimize LoC for maintenance.

The double setting shouldn't happen, but it's possible.

You are correct about getElementByTagName() being faster. I think because I wanted to compound selector which is why I likely used a query. I still need to research more or figure out a good test setup to actual make the performance decision.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants