-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merge validate_spatial and validate_series #76
Conversation
c766af2
to
14bcf0f
Compare
acd6fd6
to
eaa032b
Compare
It's going to take a while 😅 In the meantime, I have a question regarding a doubt I've had since we merged the two caches: how expensive is it to build the R*-tree? We are building it every time we call |
The algorithm is n * log(n), so I don't think it's a concern. I do agree there may be a need to optimise the case of simple pipelines though, as the fresh pipelines will likely form the vast bulk of our request load. I think if we're going to push to optimise those though, we should be more aggressive in our optimisation strategy. I'm thinking a much simpler DataCache that removed the spatial elements and the outer vec entirely, along with a cache of the recent data (a hashmap keyed by timeseries id, containing ringbuffers of data) bypassing the need to query the DB and do a network call entirely. That was the idea with #77. It would be good to see how the more general validate function performs with real pipelines, a representative data flow, and a real DB connector though, so we can see how much optimisation it needs, and have a target to aim for. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, I only have minor comments
c135883
to
2c7ba91
Compare
No description provided.