API kiara.list_all_values endpoint takes several minutes to process #73
Comments
Thanks for the report. I'll have to set up some test cases with a context of comparable size and number of values. I wouldn't think that 356 values (like you seem to have) should be prohibitive in terms of getting metadata for all of them, so there is a good chance this is something that can be optimized away, but I'll have to spend some time trying to replicate the problem. Anyway, please let me know if this happens again; queries like that should never take more than a few seconds, unless actual data loading (as opposed to metadata loading) is involved.
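For replicating the problem, a minimal timing harness like the following could help separate the two endpoints (the `time_call` helper is hypothetical, not part of kiara; in a real session the stand-in workload would be replaced with calls such as `kiara.list_all_values()` and `kiara.list_all_value_ids()`):

```python
import time

def time_call(fn):
    """Run fn once and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn()
    return result, time.perf_counter() - start

# Stand-in workload; in a real context this would be e.g.
#   time_call(lambda: kiara.list_all_value_ids())
#   time_call(lambda: kiara.list_all_values())
ids, elapsed = time_call(lambda: list(range(356)))
print(f"{len(ids)} values listed in {elapsed:.4f}s")
```

Comparing the two elapsed times on the same context would show whether the slowdown is in metadata retrieval itself or elsewhere.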
Ah, also, if that happens again, it would be interesting to see whether this only happens with Jupyter, or, using the same context, also via 'pure Python' and/or the command line. Jupyter has some quirks that could have caused this.
As I hadn't intentionally stored anything in the data store, I didn't know there were that many values. I'm not sure users will be aware of these values being there when they don't intentionally save elements to the data store.
Something I don't understand, though: why does the size of the data matter when listing the values, and why does it take more time?
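To illustrate why data size can matter: if listing values computes metadata eagerly, and computing metadata requires touching each value's payload, the listing cost scales with the stored data, whereas an ids-only listing reads just the index. The sketch below is purely hypothetical (the `ValueRecord` class and both functions are invented for illustration, not kiara's actual implementation):

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class ValueRecord:
    value_id: str
    payload: bytes                      # stands in for data stored on disk
    _metadata: Optional[dict] = field(default=None, repr=False)

    def metadata(self) -> dict:
        # Lazily computed: the (potentially large) payload is only
        # touched the first time metadata is requested.
        if self._metadata is None:
            self._metadata = {"size": len(self.payload)}
        return self._metadata

def list_all_value_ids(store):
    # Cheap: reads only the index, never touches payloads.
    return [rec.value_id for rec in store]

def list_all_values(store):
    # Potentially expensive: computes metadata for every value,
    # which may require reading each payload.
    return {rec.value_id: rec.metadata() for rec in store}

store = [ValueRecord(f"id-{i}", b"x" * (i * 10)) for i in range(5)]
print(list_all_value_ids(store))      # fast regardless of payload sizes
print(list_all_values(store)["id-4"])  # cost depends on payloads
```

If metadata were persisted alongside the index at store time, the eager listing would become as cheap as the ids-only one, which is one plausible direction for the optimization discussed here.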
It was decided to store every value of every job run, which was mostly a consequence of the requirement to have a comment associated with every job run. Manually storing values is not necessary anymore, since everything gets stored anyway: #71 (comment) -- I did point out the potential issues that could arise in that meeting, esp. that I'm not sure about performance, since kiara wasn't designed with a pattern like that in mind. As I said, I think I can probably improve it in this instance, but I can't guarantee that we won't run into other similar issues.
In some cases kiara needs to read the data (or parts of it); it shouldn't have to for this, though.
I will then look to use another operation for my needs, but it was interesting to see the impact of metadata auto-storing. I'm not sure, but I think this also had an impact on the performance of my laptop (the fan kicked in a lot without me understanding why), and things are much better since I re-emptied the data store. Maybe it's just a coincidence, but in any case, when getting to front-end considerations, it may be important that users are aware of the amount of data in their data store (even if they didn't intentionally store anything).
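A front-end surfacing the store's size could be as simple as an aggregate over per-value size records. The `store_summary` helper and the record shape below are assumptions for illustration, not anything kiara provides:

```python
def store_summary(records):
    """Aggregate a quick overview a front-end could surface to users."""
    total_bytes = sum(r["size_bytes"] for r in records)
    return {
        "value_count": len(records),
        "total_mb": round(total_bytes / 1_000_000, 2),
    }

# Hypothetical per-value size records, e.g. gathered from the index.
records = [{"size_bytes": 2_500_000}, {"size_bytes": 500_000}]
print(store_summary(records))  # → {'value_count': 2, 'total_mb': 3.0}
```

Showing such a summary would make auto-stored values visible to users who never explicitly saved anything.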
No, this needs to stay open; I need to investigate. As I said, we can't have that operation taking minutes.
Ah ok :-) my bad, sorry. I thought the time it took was because of too much data in my store. Thanks for re-opening it.
Describe the bug
I cleared the data store recently but have run a few operations since then without saving anything in the meantime (at least, not that I recall). I need to display the values present in the data store in a Jupyter notebook. When trying to run
kiara.list_all_values()
the cell takes more than 5 minutes to run, so I stopped the process. However, when I try to run
kiara.list_all_value_ids()
it works in a few seconds.
Additional info
Partial output of kiara.list_all_value_ids() (I just copied/pasted a few lines):
Output of kiara context explain:
Expected behavior
List all values in the current context, incl. internal ones.
Environment, versions (please complete the following information):
conda