Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[5.x] Warm paginated pages with static:warm command #9493

Draft
wants to merge 25 commits into
base: 5.x
Choose a base branch
from

Conversation

duncanmcclean
Copy link
Member

@duncanmcclean duncanmcclean commented Feb 8, 2024

This pull request implements warming of paginated pages into the static:warm command. This will be especially useful for larger sites which have lots of pagination pages since right now, only the first page is warmed.

Under the hood, Statamic's Cache middleware checks if there's a paginator instance being "blinked". If there is, a header is added to the response containing the URL of the next paginated page.

This header then gets picked up by the static:warm command and triggers the paginated page to be warmed, which happens in a loop until there's no more paginated pages to warm.

The implementation is inspired by statamic/ssg#140.

Working around the ignore_query_strings config option

Note

This part of the PR description is probably out-of-date after changes made in #10701, where we extracted the "allowed query params" bit into it's own PR.

We also need to make some changes to how this PR works alongside ignore_query_strings, see comment: #9493 (comment)

Right now, you can't use pagination at all with the ignore_query_strings config option enabled. This PR adds the ability to whitelist certain query parameters, like the page query param used for pagination.

// config/statamic/static_caching.php

'ignore_query_strings' => true,

'allowed_query_parameters' => [
    'page',
],

I've made this a config option rather than hard-coding page so it's flexible in case you have a reason for a query parameter changing the outputted HTML.

One downside to this approach (which is potentially a non-issue, depending on your reason for using the ignore setting in the first place) is that the allowed query parameters can be used on any page, even where it's not used, and it would result in a separate page in the cache.

For example: If you have page on the allow list and only have a "News" page with pagination on it, users could append ?page=whatever to any page and a separate page would be created in the cache.

@duncanmcclean duncanmcclean changed the title [4.x] Warm paginated pages [4.x] Warm paginated pages with static:warm command Feb 8, 2024
duncanmcclean and others added 7 commits February 9, 2024 16:35
By default, we'll have the `page` query parameter in here so pagination works.

However, if you change the pagination param name or have some other query parameter you want to whitelist, then that setting lets you do that.
@duncanmcclean duncanmcclean marked this pull request as ready for review February 12, 2024 15:19
jasonvarga

This comment was marked as resolved.

# Conflicts:
#	src/Console/Commands/StaticWarm.php
@duncanmcclean duncanmcclean changed the base branch from 4.x to 5.x May 13, 2024 12:10
@duncanmcclean duncanmcclean changed the title [4.x] Warm paginated pages with static:warm command [5.x] Warm paginated pages with static:warm command May 13, 2024
# Conflicts:
#	src/Console/Commands/StaticWarm.php
#	src/Console/Commands/StaticWarmJob.php
#	src/StaticCaching/Middleware/Cache.php
#	tests/Console/Commands/StaticWarmJobTest.php
#	tests/StaticCaching/CacherTest.php
@SylvesterDamgaard
Copy link
Contributor

@duncanmcclean Can we add query string sorting as well?

Take a look at the original code example i posted in the private channel.

This version takes a few more options into account:

  • ignore_query_strings
  • allowed_query_strings
  • disallowed_query_strings
public function getUrl(Request $request)
    {
        $url = $request->getUri();

        if ($this->config('ignore_query_strings')) {
            $url = explode('?', $url)[0];
        }

        $parts = parse_url($url);

        if (isset($parts['query'])) {
            parse_str($parts['query'], $query);

            if ($this->config('sort_query_strings')) {
                $query = Arr::sort($query);
            }

            if ($allowedQueryStrings = $this->config('allowed_query_strings')) {
                $query = array_intersect_key($query, array_flip($allowedQueryStrings));
            }

            if ($disallowedQueryStrings = $this->config('disallowed_query_strings')) {
                $disallowedQueryStrings = array_flip($disallowedQueryStrings);
                $query = array_diff_key($query, $disallowedQueryStrings);
            }

            $url = $parts['scheme'] . '://' . $parts['host'] . '?'. http_build_query($query);
        }

        return $url;
    }

@duncanmcclean
Copy link
Member Author

@SylvesterDamgaard I've split out the "allowed query parameters" part into a separate PR, where I've also implemented the sorting/disallowed options: #10701

Copy link
Member

@jasonvarga jasonvarga left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gonna mark this as a draft. My previous review somehow got marked as resolved but I think it's still valid.

This unfortunately only works the first time.

Run php please static:warm, you'll see the paginated pages in the output.

Run it again, they won't be there.

This might not be an issue after #9396

@jasonvarga jasonvarga marked this pull request as draft September 11, 2024 17:29
# Conflicts:
#	config/static_caching.php
#	src/StaticCaching/Cachers/AbstractCacher.php
#	src/StaticCaching/StaticCacheManager.php
#	tests/StaticCaching/CacherTest.php
$query = $this->config('ignore_query_strings') ? '' : Arr::get($urlParts, 'query', '');
$query = Arr::get($urlParts, 'query', '');

if ($this->config('ignore_query_strings')) {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When we revisit this PR, we'll need to spend some time getting it working again when ignore_query_strings is true, as it looks like our changes in #10701 broke this. 😄

Before #10701, this method ensured that only query parameters configured in the allowed_query_strings array (like page) were included in the filename.

However, it now it seems like the $url being passed into this method already has the query parameters stripped off.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants