Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

IBX-5502: Added additional tag cleaning to limit down number of orphaned tag en… #477

Open
wants to merge 10 commits into
base: 4.6
Choose a base branch
from

Conversation

mateuszbieniek
Copy link
Contributor

@mateuszbieniek mateuszbieniek commented Feb 12, 2025

Rebase of ezsystems/ezplatform-kernel#372

Question Answer
JIRA issue IBX-5502
Type bug
Target Ibexa version v4.6
BC breaks no

Due to how the cache is implemented in Symfony-based projects using the Redis adapter (and probably - not only Redis), entries (Members) inside a Tag (Set) are removed only when the tag is purged. Due to that, we can encounter on multiple occasions a situation, when a Tag is purged, resulting in cache items (Keys) being purged as well, but only the Tag that triggered a purge is cleaned up. All other tags that contain entries pointing to cache items that no longer exist will remain there until the affected tag is purged - which often is "never".

This results in orphaned entries in tags (or even in whole orphaned tags) filling up the memory. Memory allocated to tags in Redis is nonevictable, therefore it is freed only when purged on purpose (maxmemory-policy setting won't affect orphaned members/sets). On very big and/or constantly changing content structures, this unevictable memory allocated to orphaned tags and entries can significantly lower the amount of memory available for standard cache items to be stored.

Additional Tag purging added in this PR reduces this effect, but - unfortunately - it is not possible to resolve it fully with current Symfony's cache adapters implementation, due to the fact that there is no available interface to remove an entry from a Tag, without purging it whole - and that would result in purging cache items that were nor related to the original action performed on content.

Additional tag purging introduced by PR:

  • ContentHandler::updateContent:
    Multiple content-related entries (ibx-c-<contentId>-<versionNo>-0, ibx-c-<contentId>-<versionNo>-<languageCode>, ibx-cvi-<contentId>, ibx-cl-<contentId>-pfd are orphaned in tags: l-<locationId>, lp-<locationId>, c-<contentId>-v-<versionNo>, c-<contentId>-v-<prevVersionNo>. Added purging of those tags to avoid it.

  • TrashHandler::trashSubtree:
    Multiple content-related entries are orphaned (ibx-c-<contentId>-v<versionNumber>, ibx-cl-<contentId>, ibx-cvi-<contentId>, ibx-l-<locationId>-<languageCode>, ibx-c-<contentId>-<versionNo>-0, ibx-c-<contentId>-<versionNo>-<languageCode>, ibx-cvi-<contentId> ... (many more) are orphaned in tags: l-<locationId>, c-<contentId>-v-<versionNo>.

Handling those two occurrences helps to limit greatly the number of orphaned members.

My tests show, that one of the biggest culprits here is lp-<locationIdOfParent> tag, which can have multiple orphaned entries - therefore it might be a good idea to purge it as well, to avoid blocked memory in cost of potentially small performance loss - up for discussion.

This PR won't resolve the issue fully but limits the scope of it using available tools. To resolve this fully, we will need (in my opinion):

  • an interface to remove entries from sets
  • a garbage collector, as entries in tags might become orphaned as well when a key is evicted by Redis when maxmemory limit is reached.
    But those two are out of this PR scope.

Checklist:

  • Provided PR description.
  • Tested the solution manually.
  • Checked that target branch is set correctly (master for features, the oldest supported for bugs).
  • Ran PHP CS Fixer for new PHP code (use $ composer fix-cs).
  • Asked for a review (ping @ezsystems/engineering-team).

@mateuszbieniek mateuszbieniek requested a review from a team February 12, 2025 13:35
@mateuszbieniek mateuszbieniek added Bug Something isn't working Ready for review labels Feb 12, 2025
@mateuszbieniek mateuszbieniek changed the title Added additional tag cleaning to limit down number of orphaned tag en… IBX-5502: Added additional tag cleaning to limit down number of orphaned tag en… Feb 12, 2025
Copy link
Member

@alongosz alongosz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The solution to the issue concerns me a bit. The idea behind tags was always to catch everything by using just one, so it feels like this needs to be resolved in a different place. However maybe I'm wrong or maybe there's no other way to solve it - POV ping @ibexa/php-dev.

@mateuszbieniek before moving with the remarks to the solution, see the one about getContentTags - in that case prior diff comments to that are not applicable.

Comment on lines +342 to +345
$locationPathTags = array_map(function (Content\Location $location): string {
return $this->cacheIdentifierGenerator->generateTag(self::LOCATION_PATH_IDENTIFIER, [$location->id]);
}, $locations);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given LOCATION_PATH_IDENTIFIER tag exists only in conjunction with LOCATION_IDENTIFIER tag, are you sure both are necessary?
Moreover, I don't see LOCATION_PATH_IDENTIFIER tags (lp-%s) in the list off affected by the issue.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As mentioned in the PR's description, the list presented only has an example, and the number of orphaned tags is not limited to it. But if you look really closely, you will also find mention of the lp- tag ;)

@@ -335,12 +335,33 @@ public function updateContent($contentId, $versionNo, UpdateStruct $struct)
{
$this->logger->logCall(__METHOD__, ['content' => $contentId, 'version' => $versionNo, 'struct' => $struct]);
$content = $this->persistenceHandler->contentHandler()->updateContent($contentId, $versionNo, $struct);
$this->cache->invalidateTags([
$this->cacheIdentifierGenerator->generateTag(
$locations = $this->persistenceHandler->locationHandler()->loadLocationsByContent($contentId);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have you checked the impact of this on performance? Every load inside caching system is a potential risk. I'd like to avoid the situation where we release some memory, but make longer update time as the cost.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have you checked the impact of this on performance?

Only in the empirical way. If you have dozens or hundreds of locations, ofc this can have an impact, but then, probably this one is less of your worry then.

Every load inside caching system is a potential risk.

Any explanation behind this one? The change is made in a handler, not in the caching system, which to be frank main purpose is to do "loads".

I'd like to avoid the situation where we release some memory, but make longer update time as the cost.

That's, in my opinion is a wrong approach. This helps to clear unevictable memory from Redis - not doing this always results in eventually reaching a point when the whole app crashes. So this is not memory vs longer updates, but making the project less prone to failure due to OOM errors.

[$contentId, $versionNo]
),
]);
[$contentId, $versionNo - 1]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should not invalidate tags for an unrelated versions. The cache performance side effects of that might be difficult to predict.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's true, and I agree. Any suggestionson how we can handle c-<contentId>-v-<prevVersionNo> tags?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ping @alongosz

Comment on lines +339 to +342
$locationTags = array_map(function (Content\Location $location): string {
return $this->cacheIdentifierGenerator->generateTag(self::LOCATION_IDENTIFIER, [$location->id]);
}, $locations);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not very readable to me. How about:

Suggested change
$locationTags = array_map(function (Content\Location $location): string {
return $this->cacheIdentifierGenerator->generateTag(self::LOCATION_IDENTIFIER, [$location->id]);
}, $locations);
$locationTags = array_map(
fn (Content\Location $location): string => $this->cacheIdentifierGenerator->generateTag(
self::LOCATION_IDENTIFIER,
[$location->id]
),
$locations
);

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, to be honest in my opinion thi is even less readable ;)

But no strong opinion here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug Something isn't working Ready for review
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants