Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move Auto Sizes logic from Enhanced Responsive Images to Image Prioritizer #1476

Merged
merged 9 commits into from
Sep 10, 2024

Conversation

westonruter
Copy link
Member

@westonruter westonruter commented Aug 16, 2024

Now that Auto Sizes is proposed for Core (Core-61847, WordPress/wordpress-develop#7253), the logic in the Enhanced Responsive Images to apply sizes=auto in response to changes applied by Image Prioritizer doesn't make sense, as reasoned by @joemcgill in #1476 (comment). So this PR eliminates the Optimization Detective logic from that plugin and instead moves it to Image Prioritizer, incorporating it as part of the IMG tag visitor which applies changes to the loading attribute.

I also realized that the crossorigin attribute wasn't being handled properly when constructing preload links. That is fixed in 5dc1eb5.

Secondly, the IMG tag visitor also wasn't accounting for attribute values for the loading, fetchpriority, and crossorigin attributes which include whitespace or non-lowercase characters. These have been normalized in ae5d3ce.

Obsolete PR Description

There are currently two different code paths in the Enhanced Responsive Images plugin for adding the auto value to the sizes attribute, the first uses the wp_content_img_tag filter and the second uses a tag visitor for Optimization Detective:

function auto_sizes_update_content_img_tag( $html ): string {
if ( ! is_string( $html ) ) {
$html = '';
}
$processor = new WP_HTML_Tag_Processor( $html );
// Bail if there is no IMG tag.
if ( ! $processor->next_tag( array( 'tag_name' => 'IMG' ) ) ) {
return $html;
}
// Bail early if the image is not lazy-loaded.
$value = $processor->get_attribute( 'loading' );
if ( ! is_string( $value ) || 'lazy' !== strtolower( trim( $value, " \t\f\r\n" ) ) ) {
return $html;
}
$sizes = $processor->get_attribute( 'sizes' );
// Bail early if the image is not responsive.
if ( ! is_string( $sizes ) ) {
return $html;
}
// Don't add 'auto' to the sizes attribute if it already exists.
if ( auto_sizes_attribute_includes_valid_auto( $sizes ) ) {
return $html;
}
$processor->set_attribute( 'sizes', "auto, $sizes" );
return $processor->get_updated_html();
}
add_filter( 'wp_content_img_tag', 'auto_sizes_update_content_img_tag' );

function auto_sizes_visit_tag( OD_Tag_Visitor_Context $context ): bool {
if ( 'IMG' !== $context->processor->get_tag() ) {
return false;
}
$sizes = $context->processor->get_attribute( 'sizes' );
if ( ! is_string( $sizes ) ) {
return false;
}
$sizes = preg_split( '/\s*,\s*/', $sizes );
if ( ! is_array( $sizes ) ) {
return false;
}
$is_lazy_loaded = ( 'lazy' === $context->processor->get_attribute( 'loading' ) );
$has_auto_sizes = in_array( 'auto', $sizes, true );
$changed = false;
if ( $is_lazy_loaded && ! $has_auto_sizes ) {
array_unshift( $sizes, 'auto' );
$changed = true;
} elseif ( ! $is_lazy_loaded && $has_auto_sizes ) {
$sizes = array_diff( $sizes, array( 'auto' ) );
$changed = true;
}
if ( $changed ) {
$context->processor->set_attribute( 'sizes', join( ', ', $sizes ) );
}
return false; // Since this tag visitor does not require this tag to be included in the URL Metrics.
}

As of #1471, the non-Optimization Detective code path is leveraging HTML Tag Processor in the same as the Optimization Detective code path has been using since #1322. What this means is that we can reuse the same underlying tag processor logic for in both cases. This is also the case for Embed Optimizer in #1302.

Also, something that the Optimization Detective code path implemented was removing the auto value from the sizes attribute when it is not lazy-loaded. This was not implemented in current logic for auto_sizes_update_content_img_tag(), so this PR also handles that case. This remains exclusively in the Optimization Detective code path, per @felixarntz in WordPress/wordpress-develop#7253 (comment).

Lastly, in 7e8cebb I've gone back against myself for chosing strtok() which was discussed in #1445 (comment) with a follow-up by @dmsnell in #1445 (comment). The problem with strtok() is that if there are initial , characters they will be skipped over. So ,,,,,auto is currently detected as having an initial auto value, whereas Chrome will not recognize this (as it aligns with the HTML spec).

@westonruter westonruter added [Type] Enhancement A suggestion for improvement of an existing feature maybelater Issues to possibly be worked on long-term [Plugin] Enhanced Responsive Images Issues for the Enhanced Responsive Images plugin (formerly Auto Sizes) labels Aug 16, 2024
@westonruter westonruter added this to the auto-sizes 1.2.0 milestone Aug 16, 2024
Copy link

github-actions bot commented Aug 16, 2024

The following accounts have interacted with this PR and/or linked issues. I will continue to update these lists as activity occurs. You can also manually ask me to refresh this list by adding the props-bot label.

If you're merging code through a pull request on GitHub, copy and paste the following into the bottom of the merge commit message.

Co-authored-by: westonruter <[email protected]>
Co-authored-by: mukeshpanchal27 <[email protected]>
Co-authored-by: dmsnell <[email protected]>
Co-authored-by: felixarntz <[email protected]>
Co-authored-by: joemcgill <[email protected]>

To understand the WordPress project's expectations around crediting contributors, please review the Contributor Attribution page in the Core Handbook.

@westonruter westonruter removed the maybelater Issues to possibly be worked on long-term label Aug 16, 2024
@westonruter westonruter requested a review from felixarntz August 16, 2024 00:44
@westonruter westonruter changed the title Reuse logic between wp_content_img_tag filter and Optimization Detective tag visitor; account for removal of erroneous auto size Reuse logic between wp_content_img_tag filter and Optimization Detective tag visitor; remove erroneous auto size Aug 16, 2024
Base automatically changed from fix/1446-use-html-tag-processor-auto-sizes to trunk August 16, 2024 05:49
'intersectionRatio' => 1,
),
'buffer' => '<img src="https://example.com/foo.jpg" alt="Foo" width="1200" height="800" loading="lazy" srcset="https://example.com/foo-480w.jpg 480w, https://example.com/foo-800w.jpg 800w" sizes="auto">',
'expected' => '<img data-od-removed-loading="lazy" data-od-replaced-sizes="auto" src="https://example.com/foo.jpg" alt="Foo" width="1200" height="800" srcset="https://example.com/foo-480w.jpg 480w, https://example.com/foo-800w.jpg 800w" sizes="">',
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why extra space is needed before srcset attribute?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because the loading attribute was previously located here and the tag processor doesn't currently strip out that extra space.

$value = $processor->get_attribute( 'loading' );
if ( ! is_string( $value ) || 'lazy' !== strtolower( trim( $value, " \t\f\r\n" ) ) ) {
return $html;
auto_sizes_apply_tag_parser_updates( $processor );
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll trust that you know what you're doing, but the Tag and HTML Processors are most happy when local and not passed around. There's the danger here of making code appear to change without explanation, because something the process was sent to leads to its internal state changing, and those transitions aren't evident here where the processor was created.

An alternative style would be to pass in the HTML itself to the updater function, let it create its own processor, and receive the string output from it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason for passing around the processor is that there are two different code paths to this logic being needed:

  • When updating an IMG tag in isolation via the wp_content_img_tag filter.
  • When updating an IMG tag in the context of an entire document that is currently being iterated over in Optimization Detective.

In Optimization Detective the tag processor is passed to each of the registered tag visitors for each open tag while iterating over the document. See the auto_sizes_visit_tag() function in the diff below for how it is being used there.

There's the danger here of making code appear to change without explanation, because something the process was sent to leads to its internal state changing, and those transitions aren't evident here where the processor was created.

I don't understand. What's the danger?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is, by the way, a problem that eventually we'll want to solve in Core. I have no idea what the answer might look like today, but it's in the realm of "limited access processors" that could be handed off in a WordPress filter, for example.

here's an example of where things can go fishy.

$processor = new WP_HTML_Tag_Processor( $html );
$total_images = 0;
while ( $processor->next_tag( 'IMG' ) ) {
	++$total_images;
	do_something( $processor );
}

At the end of this script we only see $total_images === 1 and wonder why it's so wrong, given that we sent 1,000 images to it. Well, do_something() called $processor->next_tag( 'SOURCE' ) but there weren't any, so by the time it returns control to the calling loop, the Tag Processor has reached the end of the document and skips all remaining images.

Copy link
Member Author

@westonruter westonruter Aug 16, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for that. Optimization Detective is actually accounting for that by setting a bookmark on the tag prior to invoking all the tag visitors, and then after they have been invoked it checks if the cursor has moved at all, and if so, it seeks back to the bookmark before continuing to the next tag:

do {
$did_visit = false;
$processor->set_bookmark( $current_tag_bookmark ); // TODO: Should we break if this returns false?
foreach ( $visitors as $visitor ) {
$seek_count = $processor->get_seek_count();
$next_token_count = $processor->get_next_token_count();
$did_visit = $visitor( $tag_visitor_context ) || $did_visit;
// If the visitor traversed HTML tags, we need to go back to this tag so that in the next iteration any
// relevant tag visitors may apply, in addition to properly setting the data-od-xpath on this tag below.
if ( $seek_count !== $processor->get_seek_count() || $next_token_count !== $processor->get_next_token_count() ) {
$processor->seek( $current_tag_bookmark ); // TODO: Should this break out of the optimization loop if it returns false?
}
}
$processor->release_bookmark( $current_tag_bookmark );

The processor subclass introduces $processor->get_seek_count() and $processor->get_next_token_count() (the name of the latter I see is confusing) which return the number of times seek() and next_token() were called.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is all baked into "I trust you" but in general, I like noting it, and don't think it's a terrible idea to caution others against following the example. it looks deceivingly simpler than it is.

$processor->set_attribute(
'sizes',
(string) preg_replace( '/^[ \t\f\r\n]*auto[ \t\f\r\n]*(,[ \t\f\r\n]*)?/i', '', $sizes )
);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

at this point it seems like there's some zig-zagging going on for the auto keyword. just going to toss this out there, but we're using one method to detect its presence, and another method to add it, but I wonder if the PCRE method doesn't already in some way capture both situations, and might not be clearer to rely on vs. splitting the detection and replacement.

function toggle_auto_on_sizes_attribute( string $existing, bool $wants_auto ): string {
	// Find any existing `auto` keyword at the front of the attribute.
	preg_match( '/^([ \t\f\r\n]*auto[ \t\r\f\n]*)(?:,|$)/i', $existing, $auto_match, PREG_OFFSET_CAPTURE );
	$has_auto = isset( $auto_match[1][0] );

	if ( $wants_auto && ! $has_auto ) {
		return "auto, {$sizes}";
	} elseif ( ! $wants_auto && $has_auto ) {
		// Chop off leading `auto` keyword.
		return substr( $existing, strlen( $auto_match[1][0] ) );
	}

	return $existing;
}

Then the calling code uses $is_lazy as a proxy for $wants_auto.

$processor->set_attribute( 'sizes', toggle_auto_on_sizes_attribute( $sizes, $is_lazy ) );

@joemcgill
Copy link
Member

I was talking to @felixarntz last week and think we should reconsider whether the OD compatibility code belongs in this plugin at all. When this was added, it was to ensure that when an Image Prioritizer removed loading="lazy" from an image, this plugin didn't still add sizes="auto".

In principle, it is a great goal to try to avoid adding auto-sizes to images that have their loading attribute modified late in the execution cycle. However, if auto-sizes gets implemented in Core, it will need to be the responsibility of any plugins that change the image markup to be responsible for removing auto-sizes unless we find a better way to run this post-processing very late in the template rendering process.

Even so, I'm think it would be good to think about how we write a callback that can be consumed by other plugins that need to correct the sizes value after modifying the loading value. It seems like passing in and HTML string and returning an HTML string would be the best way to handle this, but am unsure if Image Prioritizer would be able to make use of a function that doesn't except an WP_HTML_Tag_Processor that has already been initialized.

@westonruter
Copy link
Member Author

Yeah, that makes sense. The current Optimization Detective extension code should instead be put inside of Image_Prioritizer_Img_Tag_Visitor in the Image Prioritizer plugin. I'll update this PR to make that change.

@felixarntz
Copy link
Member

felixarntz commented Aug 19, 2024

think we should reconsider whether the OD compatibility code belongs in this plugin at all. When this was added, it was to ensure that when an Image Prioritizer removed loading="lazy" from an image, this plugin didn't still add sizes="auto".

+1 to removing the OD integration for this from auto-sizes and rather put parts of it into Image Prioritizer:

  • Having sizes="auto" on a non-lazy-loaded image is not correct, so it makes sense to me that, if Image Prioritizer removes loading="lazy", it should also remove the auto from the sizes attribute, if present. This is less about integrating with the auto-sizes plugin than it is about making sure that its own functionality doesn't lead to such a bug. FWIW, another plugin could have added the attribute similarly.
  • The reverse isn't necessary though. It's not the responsibility of Image Prioritizer to add sizes="auto", so it shouldn't need to do that. Sure, in principle it would be great to have sizes="auto" added too if Image Prioritizer adds loading="lazy" to an image, but I'm not sure this plugin is the right place. If we want that part, it would make sense to have it in the auto-sizes plugin (separately of course from what would be proposed for WP core).
  • In other words, the presence of Image Prioritizer shouldn't have to make sizes="auto" usage better. But it shouldn't make it worse. :)

@westonruter westonruter force-pushed the update/auto-sizes-code-reuse branch from fa8d99b to 65bc8f0 Compare September 5, 2024 22:45
@westonruter westonruter changed the title Reuse logic between wp_content_img_tag filter and Optimization Detective tag visitor; remove erroneous auto size Move Auto Sizes logic from Enhanced Responsive Images to Image Prioritizer Sep 6, 2024
@westonruter westonruter marked this pull request as ready for review September 6, 2024 00:39
Comment on lines -101 to +102
$token = strtok( strtolower( $sizes_attr ), ',' );
return false !== $token && 'auto' === trim( $token, " \t\f\r\n" );
list( $first_size ) = explode( ',', $sizes_attr, 2 );
return 'auto' === strtolower( trim( $first_size, " \t\f\r\n" ) );
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This matches what is currently in WordPress/wordpress-develop#7253

@@ -25,7 +25,7 @@
* href?: non-empty-string,
* imagesrcset?: non-empty-string,
* imagesizes?: non-empty-string,
* crossorigin?: ''|'anonymous'|'use-credentials',
* crossorigin?: 'anonymous'|'use-credentials',
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While the HTML spec allows for links the crossorigin attribute to be (cf. MDN):

  • crossorigin (boolean attribute, same as anonymous)
  • crossorigin="" (empty value, same as anonymous`)
  • crossorigin=sooooo-secret (invalid value, same as anonymous)
  • crossorigin=anonymous
  • crossorigin=use-credentials

This PR normalizes the first three cases to only use the fourth case, which aligns with the reflected DOM crossOrigin property. This will ensure that OD_Link_Collection::OD_Link_Collection() will be able to properly merge links for all anonymous links.

Co-authored-by: Mukesh Panchal <[email protected]>
@westonruter westonruter merged commit 41706fd into trunk Sep 10, 2024
15 checks passed
@westonruter westonruter deleted the update/auto-sizes-code-reuse branch September 10, 2024 23:19
@westonruter westonruter added [Plugin] Image Prioritizer Issues for the Image Prioritizer plugin (dependent on Optimization Detective) [Plugin] Optimization Detective Issues for the Optimization Detective plugin labels Sep 10, 2024
@joemcgill
Copy link
Member

Thanks for handling this. Reviewed the changes post-merge and looks good to me.

westonruter added a commit to ShyamGadde/wp-performance that referenced this pull request Jan 28, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
[Plugin] Enhanced Responsive Images Issues for the Enhanced Responsive Images plugin (formerly Auto Sizes) [Plugin] Image Prioritizer Issues for the Image Prioritizer plugin (dependent on Optimization Detective) [Plugin] Optimization Detective Issues for the Optimization Detective plugin [Type] Enhancement A suggestion for improvement of an existing feature
Projects
Status: Done 😃
Development

Successfully merging this pull request may close these issues.

5 participants