-
Notifications
You must be signed in to change notification settings - Fork 920
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Lit]: Add build time mangler for lit-html templates #27703
base: master
Are you sure you want to change the base?
Conversation
|
||
# Note: We read from and write to the preprocess file - this way any | ||
# preprocessing that upstream does will be mangled. | ||
subprocess.run(['npx', 'tsx', '--tsconfig', '../../brave/tsconfig.json', '../../brave/build/commands/scripts/lit_mangler.ts', 'mangle', '-m', mangler_file, '-i', preprocess_file, '-o', preprocess_file]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey @thypon this seems like it'd be a good candidate for a semgrep
rule - you shouldn't be able to execute npx
without the --no
flag to prevent installing the dependency
A Storybook has been deployed to preview UI for the latest push |
const world = new JSDOM() | ||
const document = world.window.document | ||
const element = document.createElement('template') | ||
element.innerHTML = template.text |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@thypon can you review this please
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is server side, and build time, so looks fine.
It is also not "executed" as a site, since in JSDOM
.
Are we sure we want to use JSDOM
here and not a chromium-headless to interpret the DOM?
These two DOMs are not equivalent.
Okay, the approach is interesting, but we can improve things here:
|
Chromium major version is behind target branch (133.0.6943.98 vs 134.0.6998.15). Please rebase. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's exciting to think about doing this at compile time instead of run time. For modifying the lit component itself, we'd still use a combination of src_overrides and lit_overriding.ts
- though perhaps the styles could be injected using this technique instead?
Using tsx seems ok because we don't force a bundled node version, so it just uses local node script same as our brave_node.py or npm run ...
does.
def run_mangler(mangler_file, preprocess_file): | ||
"""Runs the mangler on the given file""" | ||
import subprocess | ||
|
||
ts_config = os.path.join(get_src_dir(), 'brave', 'tsconfig.json') | ||
lit_mangler = os.path.join(get_src_dir(), 'brave', 'build', 'commands', | ||
'scripts', 'lit_mangler.ts') | ||
|
||
# Note: We read from and write to the preprocess file - this way any | ||
# preprocessing that upstream does will be mangled. | ||
subprocess.run([ | ||
'npx', '--no', '--', 'tsx', '--tsconfig', ts_config, lit_mangler, | ||
'mangle', '-m', mangler_file, '-i', preprocess_file, '-o', | ||
preprocess_file | ||
], | ||
check=True) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like injecting the mangle script run after preprocess is the simplest and most universal place. And if we want to type check we could run tsc
here. Even though it's going to run separately for each file, it should be fast?
But it seems like we could do this at a higher level as part of the build_webui
GN template after the preprocess stage, since all lit .html.ts files will run through that target. Not sure if doing it in GN will help with any issues that come up in review.
Also, if we're having any performance issues perhaps we could run against all the files in the directory or whatever list preprocess_if_expr
has. But you'd need to turn the module into a class or similar.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I'm not sure what the best solution here is - at least with this, it will only be rerun when the target the manglers exist in change, whereas if this was part of build_webui
it would have to run whenever any of the manglers changed, which I'm not sure would be better.
Additionally, it would make ordering quite difficult as this needs to run after the Chromium files are preprocessed (so it can run over the preprocessed files and doesn't get overwritten) but before the Typescript build happens.
|
||
mangler(element.content) | ||
|
||
template.text = element.innerHTML |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this does not look right. Why do we need to de-escape escaped entities?
Shouldn't be possible to already get the unescaped elements anyway?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it is because there's an encode which happens when parsing to HTML and we want things to roundtrip nicely:
// the string
"<div>${"<foo&"}</div>"
// is parsed to
<div>${"<foo&"}</div>
// which should roundtrip back to the string
"<div>${"<foo&"}</div>"
This is because its valid to have html
which is really just a string inside the interpolated content (in the real world it gets replaced with the result of parsing to JS). However, we want the raw string from inside the interpolation so we can edit it.
5ab3877
to
37bb60d
Compare
[puLL-Merge] - brave/brave-core@27703 DescriptionThis PR introduces a new system for modifying Chromium's web components using "lit-html" templates. It replaces the old Polymer-based overrides with a build-time mangling approach that modifies the templates during compilation. The main motivation appears to be modernizing the override system to work with lit-html templates instead of Polymer. ChangesChanges
sequenceDiagram
participant Build as Build Process
participant Mangler as Lit Mangler
participant DOM as JSDOM
participant TS as TypeScript Compiler
Build->>Mangler: Load template file
Mangler->>Mangler: Parse lit-html templates
Mangler->>DOM: Create virtual DOM
Mangler->>DOM: Apply template modifications
DOM->>Mangler: Return modified HTML
Mangler->>TS: Type check modifications
TS->>Mangler: Validation result
Mangler->>Build: Write modified template
Security Hotspots
Possible Issues
|
.replaceAll('"', '"') | ||
.replaceAll('<', '<') | ||
.replaceAll('>', '>') | ||
}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
reported by reviewdog 🐶
[semgrep] User controlled data in methods like innerHTML
, outerHTML
or document.write
is an anti-pattern that can lead to XSS vulnerabilities
Source: https://semgrep.dev/r/javascript.browser.security.insecure-document-method.insecure-document-method
Cc @thypon @kdenhartog
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note: This is at build time
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this might still introduce issues, since we might demangle a safety mangled element
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it should be safe because &
is the first thing we encode so all HTML encoded entities should be converted and &
is the last thing we decode.
<script>alert('hahaha')<script>
==> &lt;script&gt;alert('haha')&lt;script&gt;
==> <script>alert('hahaha')<script>
// in the upstream file:
html`<div>Hi from Jay & Andrea</div>`
// mangled
<div>Hi from Jay &amp; Andrea</div>`
// we replaced all `&`s with &
// output
html`<div>Hi from Jay & Andrea</div>`
// we replaced all `&`s with `&`
// `foo="${bar ? "one" : "two"}"` | ||
.replaceAll(/\$\{.*?\}/gi, (...args) => { | ||
return args[0] | ||
.replaceAll('&', '&') // Note: & needs to be first |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
reported by reviewdog 🐶
[semgrep] Detected a call to replaceAll()
in an attempt to HTML escape the string args[0]
. Manually sanitizing input through a manually built list can be circumvented in many situations, and it's better to use a well known sanitization library such as sanitize-html
or DOMPurify
.
Source: https://semgrep.dev/r/javascript.audit.detect-replaceall-sanitization.detect-replaceall-sanitization
Cc @thypon @kdenhartog
.replaceAll(/\$\{.*?\}/gi, (...args) => { | ||
return args[0] | ||
.replaceAll('&', '&') // Note: & needs to be first | ||
.replaceAll('"', '"') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
reported by reviewdog 🐶
[semgrep] Detected a call to replaceAll()
in an attempt to HTML escape the string args[0]<br/> .replaceAll('&', '&')
. Manually sanitizing input through a manually built list can be circumvented in many situations, and it's better to use a well known sanitization library such as sanitize-html
or DOMPurify
.
Source: https://semgrep.dev/r/javascript.audit.detect-replaceall-sanitization.detect-replaceall-sanitization
Cc @thypon @kdenhartog
return args[0] | ||
.replaceAll('&', '&') // Note: & needs to be first | ||
.replaceAll('"', '"') | ||
.replaceAll('<', '<') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
reported by reviewdog 🐶
[semgrep] Detected a call to replaceAll()
in an attempt to HTML escape the string args[0]<br/> .replaceAll('&', '&') // Note: & needs to be first<br/> .replaceAll('"', '"')
. Manually sanitizing input through a manually built list can be circumvented in many situations, and it's better to use a well known sanitization library such as sanitize-html
or DOMPurify
.
Source: https://semgrep.dev/r/javascript.audit.detect-replaceall-sanitization.detect-replaceall-sanitization
Cc @thypon @kdenhartog
.replaceAll('&', '&') // Note: & needs to be first | ||
.replaceAll('"', '"') | ||
.replaceAll('<', '<') | ||
.replaceAll('>', '>') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
reported by reviewdog 🐶
[semgrep] Detected a call to replaceAll()
in an attempt to HTML escape the string args[0]<br/> .replaceAll('&', '&') // Note: & needs to be first<br/> .replaceAll('"', '"')<br/> .replaceAll('<', '<')
. Manually sanitizing input through a manually built list can be circumvented in many situations, and it's better to use a well known sanitization library such as sanitize-html
or DOMPurify
.
Source: https://semgrep.dev/r/javascript.audit.detect-replaceall-sanitization.detect-replaceall-sanitization
Cc @thypon @kdenhartog
@@ -44,17 +44,42 @@ def maybe_keep_upstream_version(override_in_folder, out_folder, override_file): | |||
return None | |||
|
|||
|
|||
def run_mangler(mangler_file, preprocess_file): | |||
"""Runs the mangler on the given file""" | |||
import subprocess |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
reported by reviewdog 🐶
[semgrep] Consider possible security implications associated with subprocess module.
Source: https://semgrep.dev/r/gitlab.bandit.B404
Cc @thypon @kdenhartog
|
||
// Write the tsconfig to a temporary file. | ||
const tempDir = os.tmpdir() | ||
const tempTsConfigPath = path.join(tempDir, `${path.basename(file)}-tsconfig.json`) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
reported by reviewdog 🐶
[semgrep] Detected possible user input going into a path.join
or path.resolve
function. This could possibly lead to a path traversal vulnerability, where the attacker can access arbitrary files stored in the file system. Instead, be sure to sanitize or validate user input first.
Source: https://semgrep.dev/r/javascript.lang.security.audit.path-traversal.path-join-resolve-traversal.path-join-resolve-traversal
Cc @thypon @kdenhartog
That's a much better location. Moved it!
I actually did this originally but moved them into one 😆 Split them again.
I've gone with a two pronged approach:
Calling via |
@fallaciousreasoning the demangling looks still somewhat problematic |
Resolves brave/brave-browser#44051
Note: Needs Privacy/Security review
https://github.com/brave/reviews/issues/1872
Background
In Polymer, the HTML is compiled to a
HTMLTemplate
element which we're free to modify to our hearts desire. Lit makes use oflit-html
to generate HTML from tagged template literals, which works super well if you don't need to modify someone else's template.In
brave-core
we have a large number of existing HTML overrides for Polymer elements (especially in chrome://settings). Upstream is gradually migrating pages from Polymer to Lit, so we need a strategy to override these pages.Unfortunately it is non-trivial to override the tagged template literals. They exist roughly in the following form:
and modifying these will be both non-trivial and extremely prone to breakage
Rough Approach
In this PR, I've introduced a
lit-mangler
for mangling the templates at build time. It works roughly like this:.html.ts
filehtml
tagged template literals into a tree (as literal can exist inside literals for ternaries/maps/children ect).id="my-id"
)DocumentFragment
DocumentFragments
to the original script and write them to thepreprocess
directoryAlternatives Considered
See #27661 for an alternative runtime approach. I think the solution in this PR is much more robust and should also be more performant, as it won't thrash Lit's template cache every render.
Demonstration
As part of this PR, I've rewritten an existing override which broke when upstream migrated their element from Polymer to Lit. I can remove that from this PR, if it makes things easier.
Rough Example:
which replaces the old, broken override brave/chromium_src/chrome/browser/resources/extensions/detail_view.ts
Submitter Checklist:
QA/Yes
orQA/No
;release-notes/include
orrelease-notes/exclude
;OS/...
) to the associated issuenpm run test -- brave_browser_tests
,npm run test -- brave_unit_tests
wikinpm run presubmit
wiki,npm run gn_check
,npm run tslint
git rebase master
(if needed)Reviewer Checklist:
gn
After-merge Checklist:
changes has landed on
Test Plan: