Enable / Disable indexing of child products in indices #3473

Open
jakwinkler opened this issue Dec 17, 2024 · 3 comments

@jakwinkler

In the current version of ElasticSuite, all child products are added to the index data.
I believe this is done here:

vendor/smile/elasticsuite/src/module-elasticsuite-catalog/Model/Product/Indexer/Fulltext/Datasource/AttributeData.php

in this section

$relationsByChildId = $this->resourceModel->loadChildrens($productIds, $storeId);

I've created a randomized catalog using my module:
https://github.com/qoliber/m2-catalog-generator

I've created 50 configurable products with 5 attributes of 5 options each, which generates 3125 child products per configurable product.
That basically creates ~160k simple products.
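
For reference, the catalog size quoted above can be checked with a few lines of arithmetic. This is only a back-of-the-envelope sketch; the variable names are illustrative and are not taken from the generator module.

```python
# Size of the generated catalog, using the figures from the issue description.
attributes = 5             # configurable attributes per parent product
options_per_attribute = 5  # options available for each attribute
configurables = 50         # number of configurable (parent) products

# Every combination of options becomes one simple (child) product.
children_per_configurable = options_per_attribute ** attributes
total_children = configurables * children_per_configurable

print(children_per_configurable)  # 3125
print(total_children)             # 156250, i.e. roughly 160k simples
```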

With this many options per product, the CatalogSearch index time increases drastically with each configurable product.

Catalog Search index has been rebuilt successfully in 00:53:01

I understand why child products are indexed, but if they are not visible, they should be skipped from indexing.

But my general question is: should they be indexed if they prolong the indexing process?

@romainruaud
Collaborator

Hi Jakub, thanks for filing this issue.

To be fully transparent, I'm pretty sure we never tested this kind of volume, and most probably none of the websites using Elasticsuite has such a configurable product matrix either; otherwise this issue would already have popped up :)

Having 3000+ children per configurable (or even bundle) product is quite... uncommon in my opinion and does not reflect real-life use cases (the most common being t-shirt sizes and colors).

The children themselves are not indexed; only their variant values are added to the parent data, to allow proper filtering in layered navigation and fulltext search based on child data. So basically, if you dismantle this mechanism, you will no longer be able to search by color label or to filter by size (in my t-shirt example).
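
To illustrate the idea (not Elasticsuite's actual code), here is a minimal Python sketch of merging child variant values into a parent document. The function name `merge_child_values`, the dictionary layout, and the sample SKUs are all hypothetical and chosen only for this example.

```python
def merge_child_values(parent, children, attribute_codes):
    """Return a copy of the parent document enriched with the distinct
    values its children carry for the given attribute codes."""
    doc = dict(parent)  # leave the caller's parent data untouched
    for code in attribute_codes:
        values = []
        for child in children:
            value = child.get(code)
            # Keep each distinct value once, in first-seen order.
            if value is not None and value not in values:
                values.append(value)
        if values:
            doc[code] = values
    return doc


# Hypothetical t-shirt example: one parent, three child variants.
parent = {"sku": "tshirt", "name": "T-shirt"}
children = [
    {"sku": "tshirt-s-red", "size": "S", "color": "Red"},
    {"sku": "tshirt-m-red", "size": "M", "color": "Red"},
    {"sku": "tshirt-m-blue", "size": "M", "color": "Blue"},
]

doc = merge_child_values(parent, children, ["size", "color"])
print(doc["size"])   # ['S', 'M']
print(doc["color"])  # ['Red', 'Blue']
```

In this sketch only the parent document is produced, but every child still has to be loaded and scanned once per attribute, so the work per parent grows with the number of children.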

53 min for indexing 160k variants is indeed a lot. I'm curious to know the specs of the server you're running your tests on but, more than that, what timing you see if you generate 160k simple products (without any configurables) and run a full reindex, for comparison.

I'll let @rbayet decide about the next steps on our side on this one.
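
As a rough way to compare runs, the reported time can be turned into a throughput figure. The sketch below assumes the 00:53:01 run covered exactly the generated catalog (50 configurables plus their 3125 children each) and nothing else; the helper names are made up for this example.

```python
def to_seconds(hms):
    """Convert an 'HH:MM:SS' duration string to seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def products_per_second(product_count, hms):
    """Average number of products processed per second of indexing."""
    return product_count / to_seconds(hms)


# Assumption: only the generated catalog was indexed during the reported run.
product_count = 50 + 50 * 5 ** 5  # parents + children = 156300
rate = products_per_second(product_count, "00:53:01")
print(round(rate, 1))  # 49.1
```

Running the same calculation on a simple-only catalog of the same size would give a baseline to compare against.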

Best regards

@jakwinkler
Author

Most fashion or clothing-related stores run on 50k+ products with configurables and simples.
My last project (10k configurables, each with 10 options) was taking 3.0h to index data in ElasticSuite. I'm not saying it's a problem (indexing was done at night); I just see a great chance for an improvement here :-)

@romainruaud
Collaborator

> Most fashion or clothing-related stores run on 50k+ products with configurables and simples.

Yes, but I doubt they have 5 or more configuration axes. The common pattern is generally size/color and that's all, no? So around 25 child products per configurable (assuming 5 sizes, 5 colors).

> My last project (10k configurables, each with 10 options) was taking 3.0h to index data in ElasticSuite. I'm not saying it's a problem (indexing was done at night); I just see a great chance for an improvement here :-)

Yes, definitely, indexing 10k parent products should not take that much time; it should take only a couple of minutes. Maybe it's worth investigating a bit more, for example with some Blackfire profiling during indexing, just to check if we missed something.

Regards
