[Bug] Archiving monthly data very slow at the end of the month #21906

Closed
4 tasks done
Cyriuz opened this issue Feb 9, 2024 · 4 comments
Labels
answered For when a question was asked and we referred to forum or answered it.

Comments

@Cyriuz

Cyriuz commented Feb 9, 2024

What happened?

We're running a Matomo instance that handles about 10 million requests a month. Cron archiving jobs run every 2 hours, which works fine, but at the end of the month we sometimes experience timeouts because the SQL queries take too long under a DB governor.

The specific query that has a hard time is this one:

SELECT /*+ MAX_EXECUTION_TIME(7200000) */ /* sites 1 */ /* 2023-10-31,2023-11-30 */ /* Core */ /* trigger = CronArchive */
    count(distinct log_visit.idvisitor) AS `1`,
    count(distinct log_visit.user_id) AS `39`
FROM
    piwik_log_visit AS log_visit
WHERE
    log_visit.visit_last_action_time >= '2023-10-31 23:00:00'
    AND log_visit.visit_last_action_time <= '2023-11-30 22:59:59'
    AND log_visit.idsite IN ('1')

What should happen?

The DB infrastructure for this setup is not optimal, but why does the monthly archiving have to look at all individual visits? Shouldn't it use the already aggregated values from the day / week reports?

How can this be reproduced?

Run console core:archive at the end of the month on a site with about 10 million requests per month.

Matomo version

5.0.2

PHP version

7.3

Server operating system

Linux

What browsers are you seeing the problem on?

No response

Computer operating system

No response

Relevant log output

No response

Validations

Cyriuz added the Potential Bug and To Triage labels Feb 9, 2024
@sgiehl
Member

sgiehl commented Feb 9, 2024

Hey @Cyriuz. This seems to be the query to calculate the unique visitors metric for a month. While all other metrics can be built by summing up metrics from day or week archives, this doesn't work for unique visitors.
If you do not need the unique visitors metric for months, you can disable the config setting enable_processing_unique_visitors_month, and that query should no longer be run.
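
To illustrate why unique visitors cannot be built by summing day archives, here is a minimal Python sketch. The visitor IDs and dates are made-up sample data, and the function names are hypothetical; this is not Matomo code, only a demonstration that a distinct count over a period differs from the sum of per-day distinct counts whenever a visitor returns on more than one day.

```python
# Hypothetical visitor IDs seen on three consecutive days (illustrative data only).
daily_visitors = {
    "2023-11-01": {"a", "b", "c"},
    "2023-11-02": {"b", "c", "d"},
    "2023-11-03": {"a", "d"},
}


def sum_of_daily_uniques(days):
    """Add up each day's unique-visitor count, as one would for an additive metric."""
    return sum(len(ids) for ids in days.values())


def period_uniques(days):
    """Count distinct visitors across the whole period, like COUNT(DISTINCT ...) does."""
    return len(set().union(*days.values()))


naive_total = sum_of_daily_uniques(daily_visitors)
true_total = period_uniques(daily_visitors)
print(naive_total, true_total)
```

Returning visitors are counted once per day in the summed figure but only once in the period figure, so the two numbers diverge. Getting the period figure requires the raw visitor IDs, which is why the monthly query goes back to the visit log instead of reusing the day or week archives.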

@Cyriuz
Author

Cyriuz commented Feb 9, 2024

Thank you for the reply, that makes sense now that you've explained it! Is there a way to change that setting per archiving run, so that the monthly aggregation of unique visitors can be scheduled less frequently?

@Cyriuz Cyriuz closed this as completed Feb 9, 2024
@sgiehl
Member

sgiehl commented Feb 9, 2024

@Cyriuz Unfortunately there isn't a setting for that at the moment. Maybe something like this gets implemented in the future, to speed that up until the month has finished: #6212

sgiehl added the answered label and removed the Potential Bug and To Triage labels Feb 9, 2024
@Cyriuz
Author

Cyriuz commented Feb 9, 2024

That does sound like it would be incredibly useful indeed!
