-
Notifications
You must be signed in to change notification settings - Fork 4.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
De-duplicate parsing logic across PHP and JS #1086
Comments
This is the correct way forward - we've already run into multiple issues because these two are different (#890 and #870 (comment) for example). Note that our PEG does include some JavaScript code, so we will need to simplify this and/or provide an approach to run equivalent logic on the server side (#, # for example).
This seems like the best approach. |
It seems php-pegjs requires pegjs 0.8 wheras Gutenberg has 0.10. Also php-pegjs hasn't been updated in two years. Aside from the PHP implementation, adaptation, and integration, I think the other thing is to modify the current grammar from:
To be something like:
Where |
cc @dmsnell , I think you've been working on JSON attributes support in the https://github.com/Automattic/wp-post-grammar/blob/master/src/post.pegjs |
Doesn't look like it's quite working yet? For one, I'm seeing the JSON being stored inside of a string instead of appearing literally in the block comment:
It also seems that the JSON is limited in the characters it can represent? I couldn\'t seem to add an apostrophe to the JSON strings without causing a parse error, and |
In #1130 ^^^ I added an alternative approach to parsing these kinds of attributes. I believe that the real problem wasn't that the parser wasn't designed to handle them so much as it simply failed to handle escaped quotes, which should now be possible. Personally I would like to do what we can with the JS parser and pushing that out in the design before we spend too much effort harmonizing with a PHP-based parser. I'm sure the parser will be changing a lot and unless we need the server at this point (which maybe we do and I didn't realize it) to be able to parse I feel like we will help ourselves by figuring out the design first then add the implementation. Thanks for the poke @aduth |
@dmsnell thanks for that. Yeah, the most important problem is the lack of being able to parse attribute values with escaped quotes. The secondary concern I have is in regards to the data type of the values. For example, does
This will become increasingly important now that we are driving into dynamic blocks (widgets) which have server-side rendering. |
@westonruter thanks for the review!
I disagree. this is a minor problem hardly a problem. it's just a small bug in the parser and we've got a simple fix for that in #1130.
wouldn't see any reason to interpret it as a boolean false. we still haven't made our decisions on what the block structure needs and what attributes mean, but unless we make a decision to say, "any attribute value will be JSON unless it fails to parse as JSON" I would be reluctant to "try" on occasion. that being said, I don't think it would be bad to consider that, or to make a special <!-- wp:my/block name="false" -->
this is my point. so far I haven't seen anyone suggest that attributes are JSON so I don't see why we need to disambiguate. that is, it's not JSON unless a block attempts to parse the JSON as such.
earlier today I found something which suggested we might need to double-quote and double-quote-escape all the attributes because I suspect that WordPress or React might be "fixing" the single-quoted strings in the block comment delimiters by replacing them with double-quotes. we discovered this in #1130 when
I'm not as familiar with this, but I thought the point of the dynamic blocks was that they render client-side? Fallback HTML in the |
I meant it is the main problem (primary defect), not necessarily that it is a major problem. Good that you got a fix.
Take the |
Reopening to reflect the main issue here, that our parsing logic is duplicated (and behaves differently) between JS and PHP. |
There are currently two parsers being used: one in PegJS (in JS) and regular expressions (in the PHP
do_blocks
andparse_block_attributes
functions). Has there been any investigation into re-using the PegJS grammar directly in PHP? For example, php-pegjs. This would be important as the attributes get more complex, such as in implementing dynamic blocks (widgets, #1011).In that regard, the serialization logic is currently limited to strings that lack any special characters. If a block attempts to serialize a string that includes
-->
or"
then the parser would break. Likewise, if an attribute contains an object then it currently gets encoded as[object Object]
and if it is an integer or a boolean, then it gets converted into strings.I'm also aware of #988, but it doesn't change how attributes get serialized. To me it seems like there needs to be more robust handling of serializing attribute values that need special encoding. While the goal of being human-readable is important, it's probably less important than having data that isn't easily corrupted. This problem was tackled in the Shortcake plugin by URL-encoding such values: wp-shortcake/shortcake#496
Nevertheless, to preserve the original attribute value's data type as well as to escape the value during serialization, perhaps the value should be serialized as JSON and then URL-encoded. This is the approach that the JS Widgets plugin took when porting making widgets available as shortcodes: xwp/wp-js-widgets#32
The attributes encoded with JSON could be prefixed with
encoded:
. For example:If the parser can be made more robust than to just rely on regular expressions, however, then we could just encode straight JSON as the values.
Note here that no double-quotes are used for the value because it's encoding an object. Similarly, integers could be unquoted and as well as booleans:
The text was updated successfully, but these errors were encountered: