Monthly Automatic Solr-ReIndex #10291

Open
3 tasks
mekarpeles opened this issue Jan 7, 2025 · 1 comment
Labels
Lead: @cdrini Issues overseen by Drini (Staff: Team Lead & Solr, Library Explorer, i18n) [managed] Module: Solr Issues related to the configuration or use of the Solr subsystem. [managed] Needs: Breakdown This big issue needs a checklist or subissues to describe a breakdown of work. [managed] Needs: Staff / Internal Reviewed a PR but don't have merge powers? Use this. Priority: 2 Important, as time permits. [managed] Type: Feature Request Issue describes a feature or enhancement we'd like to implement. [managed]

Comments

@mekarpeles
Member

Proposal

As a followup to:

Create a hands-free, scheduled task to routinely run solr optimize (and possibly automated monthly re-indexing)

Justification

Over time, solr performance can get very bad if we don't regularly run optimize. See #10287

The entire site grinds to a halt

Breakdown

Requirements Checklist

  • Document the process for running solr optimize (and link here)
  • Automate (hands-free) the process of running solr optimize (may require switching ol-solr[0-1])
  • Set up a cron job (or other scheduler) to run solr optimize on a cadence (e.g. monthly)
    • We might also consider doing the same for solr full re-indexes (e.g. monthly or quarterly)
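A minimal sketch of what the hands-free optimize step might look like, assuming Solr is reachable over HTTP on its default port 8983 and that the core is named `openlibrary` (both are assumptions, not confirmed in this issue). It uses the `optimize`, `maxSegments`, and `waitSearcher` parameters of Solr's update handler; the function names are hypothetical.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def optimize_url(base_url: str, core: str, max_segments: int = 1) -> str:
    """Build the update-handler URL that asks Solr to optimize one core."""
    query = urlencode(
        {"optimize": "true", "maxSegments": max_segments, "waitSearcher": "true"}
    )
    return f"{base_url.rstrip('/')}/solr/{core}/update?{query}"


def run_optimize(base_url: str, core: str, timeout_s: float = 6 * 60 * 60) -> int:
    """Trigger optimize and block until Solr answers; returns the HTTP status.

    The six-hour timeout is a guess; optimize on a large index can run long.
    """
    with urlopen(optimize_url(base_url, core), timeout=timeout_s) as resp:
        return resp.status


# Assumed host and core name, for illustration only.
print(optimize_url("http://ol-solr1:8983", "openlibrary"))
```

If this were saved as a script (say, a hypothetical `solr_optimize.py` that calls `run_optimize`), a crontab entry such as `0 3 1 * * python3 /path/to/solr_optimize.py` would run it at 03:00 on the first day of each month.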

Related files

@mekarpeles mekarpeles added Type: Feature Request Issue describes a feature or enhancement we'd like to implement. [managed] Module: Solr Issues related to the configuration or use of the Solr subsystem. [managed] Needs: Breakdown This big issue needs a checklist or subissues to describe a breakdown of work. [managed] Priority: 2 Important, as time permits. [managed] Lead: @cdrini Issues overseen by Drini (Staff: Team Lead & Solr, Library Explorer, i18n) [managed] Needs: Staff / Internal Reviewed a PR but don't have merge powers? Use this. labels Jan 7, 2025
@mekarpeles mekarpeles added this to the Sprint 2025-01 milestone Jan 7, 2025
@mekarpeles mekarpeles changed the title Regularly Schedule & Automate Solr Optimize Monthly Automatic Solr-ReIndex Jan 13, 2025
@mekarpeles
Member Author

mekarpeles commented Jan 13, 2025

Next step is to run optimize on ol-solr1 (i.e. backup) and see if it is offline during this time.
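One way to answer the "is it offline?" question is to poll Solr while optimize runs and record how often it fails to respond. The sketch below assumes the core exposes Solr's ping handler at `/solr/<core>/admin/ping` and that the core is named `openlibrary`; the helper names are hypothetical.

```python
import time
from typing import Callable, List
from urllib.error import URLError
from urllib.request import urlopen


def ping_solr(base_url: str, core: str, timeout_s: float = 5.0) -> bool:
    """Return True if the core's ping handler answers with HTTP 200."""
    url = f"{base_url.rstrip('/')}/solr/{core}/admin/ping"
    try:
        with urlopen(url, timeout=timeout_s) as resp:
            return resp.status == 200
    except (URLError, OSError):
        # Connection refused, timeout, or an HTTP error status: count as down.
        return False


def watch_availability(
    probe: Callable[[], bool],
    samples: int,
    interval_s: float,
    sleep: Callable[[float], None] = time.sleep,
) -> List[bool]:
    """Call probe() `samples` times, pausing interval_s between calls."""
    results = []
    for i in range(samples):
        results.append(probe())
        if i < samples - 1:
            sleep(interval_s)
    return results


def downtime_fraction(results: List[bool]) -> float:
    """Fraction of probes that failed (0.0 when there are no samples)."""
    return results.count(False) / len(results) if results else 0.0
```

Running `watch_availability(lambda: ping_solr("http://ol-solr1:8983", "openlibrary"), samples=360, interval_s=10)` in one terminal while optimize runs in another would give roughly an hour of samples. A ping only shows that the core responds, so timing a real search query alongside it would give a fuller picture.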

If optimize makes solr unavailable, we need a way for our services to switch to another ready-to-go solr instance.

Can we reverse proxy to the right solr from nginx or haproxy?

One option: we give Open Library a dedicated DNS name for solr, solr.openlibrary.org. ol-www0 would handle requests to solr.openlibrary.org and route them according to its config, which the optimize script would change and reload whenever we run optimize. T.B.D.: how we handle logging for this feature if we take this approach.
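A rough sketch of the config-swap idea, assuming nginx is the proxy on ol-www0 and that the two backends are ol-solr0 and ol-solr1 on Solr's default port 8983. The upstream name `solr_backend`, the file path, and the function names are hypothetical; `nginx -s reload` is the standard nginx reload signal.

```python
import subprocess
from pathlib import Path
from typing import Sequence


def pick_active(
    optimizing_host: str, hosts: Sequence[str] = ("ol-solr0", "ol-solr1")
) -> str:
    """Return the first host that is not currently being optimized."""
    for host in hosts:
        if host != optimizing_host:
            return host
    raise ValueError("no solr host left to serve traffic")


def render_upstream(
    active_host: str, port: int = 8983, name: str = "solr_backend"
) -> str:
    """Render an nginx upstream block pointing at a single solr host."""
    return f"upstream {name} {{\n    server {active_host}:{port};\n}}\n"


def switch_and_reload(optimizing_host: str, conf_path: Path) -> str:
    """Write the upstream file for the healthy host, then reload nginx.

    conf_path is assumed to be a file that the main nginx config includes.
    """
    active = pick_active(optimizing_host)
    conf_path.write_text(render_upstream(active))
    subprocess.run(["nginx", "-s", "reload"], check=True)
    return active


print(render_upstream(pick_active("ol-solr1")))
```

The optimize script would call `switch_and_reload` before optimizing a host and again afterwards to restore the original routing. The same shape would work with haproxy by rendering a backend section instead.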
