-
Notifications
You must be signed in to change notification settings - Fork 8
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: mb_convert_encoding PHP 8.2 deprecation #52
Conversation
Thanks you @shvlv, this solves PHP 8.2 deprecations. However, I stumbled upon another bug today. Here's the context. I'm using Vite to build my assets (with a custom Vite loader). In dev mode, I have to include vite client to handle hot reload, etc. The default URL for vite server is "http://%5B::1%5D:5173/path/to/build/@vite/client
I was wondering if instead of DomDocument (which has a lot of quirks and can be heavy on memory) and since the only manipulations achieved here targets attributes, you could leverage the new Tag processor API introduced in 6.2? Of course, that would mean a WP 6.2 requirement on the other hand. 6.3 also introduced native support for |
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #52 +/- ##
============================================
- Coverage 91.02% 88.42% -2.60%
+ Complexity 282 281 -1
============================================
Files 21 21
Lines 836 821 -15
============================================
- Hits 761 726 -35
- Misses 75 95 +20
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
After internal discussion, the new approach with WordPress HTML API was implemented. The following points should be discussed additionally:
|
Quick note: from 6.5, there is now a |
@shvlv Friendly bump, any progress on this? |
@nlemoine Sorry for the delay, we have a vacation time 😄 will proceed as soon as possible |
Hello @shvlv, would you mind having a look at this PR? 🙂 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for working on this! I have two comments but they don't stop me from approving.
- I'm not sure how much we should care about WordPress < 6.2. It might be enough simply returning
$html
without triggering an error message (and have a test for it). - If the QA issues in the unchanged files are quick wins, it would be great to have them fixed according to the Boy Scout rules.
Signed-off-by: Philipp Bammes <[email protected]>
…t' into fix-php82-deprecation-improvement
Done partially. We decided to postpone integration testing for now.
It is better to handle it in the standalone PR. |
Please check if the PR fulfills these requirements
What kind of change does this PR introduce? (Bug fix, feature, docs update, ...)
Fix.
What is the current behavior? (You can also link to an open issue here)
I added minor improvements on the top of @nlemoine's PR #49. It was inspired by very good SO answer - https://stackoverflow.com/a/8218649/13962131 and we tested same change for several projects already where we need HTML processing.
So the problem is (https://www.php.net/manual/en/function.mb-convert-encoding.php):
What is the new behavior (if this is a feature change)?
The right way to do things is
mb_encode_numericentity
.DOMDocument::loadHTML
loads ISO-8859-1 by default. So formap
argument, we choose:128
(0x80
). Anything above ISO-8859-1 that supports0-127
.U+10FFFF
is better.0
as we don't need to change code~0
that produces the maximal byte for your system. i.e.1111111111111111111111111111111111111111111111111111111111111111
for a 64bit system. It guarantees the mask never changes its actual value.Regarding
html_entity_decode
, I think it's an important addition because we can leave HTML entities for HTML content, and the browser handles them as expected. But we don't want to break Unicode characters inside JS. So if the user hasvar title = 'Lösungen'
, changing tovar = 'Lösungen'
is not good.Also, tests are added, including the character from the Unicode top plate - https://util.unicode.org/UnicodeJsps/character.jsp?a=2000B. I.e.
𠀋
Does this PR introduce a breaking change? (What changes might users need to make in their application due to this PR?)
No.
Other information: