Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add the ability to queue definition recomputation #1242

Merged
merged 16 commits into from
Dec 12, 2024

Conversation

qtomlinson
Copy link
Collaborator

@qtomlinson qtomlinson commented Nov 27, 2024

Performance Degradation During Schema Version Upgrade

During the process of upgrading the definition schema version, performance degradation has been observed. Currently, schema version updates are handled by recomputing the definition on the fly, which causes slower performance. These updates are critical for data correctness and to introduce the LicenseRef marker.

Proposed Solution:

  1. Current Implementation (versionCheck):
  • Checks the schema version and recomputes the definition on demand if it is stale.
  • This approach is currently in production.
  1. Preferred Approach (upgradeQueue):
  • When a request is received, return the existing definition.
  • If the schema has changed, trigger a recompute after returning the result.
  • Use a queue to handle the upgrade, similar to how the service computes definitions for harvested components.
  • This allows the system to function more smoothly without immediate performance degradation.
  • The tech steering committee has agreed on this approach (discussed on Nov 26).
  1. Switching Methods:
  • Once the majority of the migration is complete, the system can switch back to calculating on the fly via configuration.

PR Details:

This PR introduces two implementations for the upgrade handler:

  • versionCheck: Current implementation in production.
  • upgradeQueue: Preferred approach as described above.

Environment Variables:

  • DEFINITION_UPGRADE_PROVIDER: Controls the upgrade behavior (versionCheck, upgradeQueue) with versionCheck as the default.
  • DEFINITION_UPGRADE_QUEUE_PROVIDER: Controls the queuing of upgrades (memory, azure) with memory as the default.
  • DEFINITION_UPGRADE_QUEUE_CONNECTION_STRING: Connection string to the Azure Storage Queue.
  • If not present, HARVEST_AZBLOB_CONNECTION_STRING is used for connection information.
  • DEFINITION_UPGRADE_QUEUE_NAME: Name of the upgrade queue, defaulting to definitions-upgrade.

@qtomlinson qtomlinson changed the title Qt/definition upgrade Improve performance for definition upgrade Nov 27, 2024
@qtomlinson qtomlinson changed the title Improve performance for definition upgrade Improve performance for definition recomputation Nov 27, 2024
@qtomlinson qtomlinson changed the title Improve performance for definition recomputation Add the ability to queue definition recomputation Nov 27, 2024
@qtomlinson qtomlinson force-pushed the qt/definition_upgrade branch from b10f052 to 4c00f7b Compare November 29, 2024 01:32
Add encoding message in queueing.  AzureStorageQueue supports parsing encoded message.  Added memoryQueueConfig, to support parsing encoded messages.

Also add more logging.
@qtomlinson qtomlinson force-pushed the qt/definition_upgrade branch from fb8b49f to 96938b4 Compare December 3, 2024 05:33
@qtomlinson qtomlinson marked this pull request as ready for review December 4, 2024 03:08
Copy link
Collaborator

@elrayle elrayle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I looked at this really close and it looks good to me. Probably should get someone with more js experience to review it too.

if (!this._upgrade) throw new Error('Upgrade queue is not set')
const message = this._constructMessage(definition)
await this._upgrade.queue(JSON.stringify(message))
this.logger.debug('Queued for definition upgrade ', {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've heard this referred to as recompute which I see in the code in other places. I don't see upgrade in other parts of the code. Is there a reason to name the queue and method upgrade instead of recompute.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This recompute happens only when schema version is update/upgraded. I am open to renaming it if it improves the readability.

Copy link
Member

@mpcen mpcen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall everything looks good. Left minor comments/questions.

providers/queueing/memoryQueue.js Show resolved Hide resolved
providers/upgrade/defUpgradeQueue.js Show resolved Hide resolved
providers/upgrade/process.js Outdated Show resolved Hide resolved
-add logging to handle queuing failure
-remove redundant JSON.stringify when queuing message (there is no impact to message decoding)
-move queueHandler and defUpgrader to local scope in setup
-minor refactoring to improve code reuse and readability
-add more tests
@qtomlinson qtomlinson force-pushed the qt/definition_upgrade branch from c3cdd04 to 0efd310 Compare December 11, 2024 23:29
@qtomlinson qtomlinson merged commit 0d72a12 into clearlydefined:master Dec 12, 2024
4 checks passed
@qtomlinson qtomlinson deleted the qt/definition_upgrade branch December 12, 2024 19:15
@qtomlinson
Copy link
Collaborator Author

Upon deployment, settings to turn on queuing:

DEFINITION_UPGRADE_PROVIDER=upgradeQueue
DEFINITION_UPGRADE_QUEUE_PROVIDER=azure

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants