"use_api_hydrate" in app.settings v.s. pgstac's "nohydrate" conf #133

Open
louisstuart96 opened this issue Jul 12, 2024 · 1 comment
louisstuart96 commented Jul 12, 2024

Our team is testing eoapi (on top of stac-fastapi-pgstac) against STAC items that have a large number of asset links. We ran into performance problems with search requests that use the 'query' or 'filter' extensions. Our assumption is that the application's hydration setting is the cause.

search_request.conf["nohydrate"] = settings.use_api_hydrate

Here, the app's default setting is use_api_hydrate = False, which in turn becomes nohydrate=false in the PgSTAC query. However, going by the PgSTAC runtime configuration docs, the correct setting should be nohydrate=true:

https://stac-utils.github.io/pgstac/pgstac/#runtime-configurations

@mmcfarland
Contributor

The double negative nohydrate=false is a bit cumbersome to reason through here, but I think the default is logical. When "use api" is turned off, nohydrate=false, which means the DB performs the hydration. When "use api" is turned on, nohydrate=true, so the database skips the hydration step and it must be performed on the API side.

| use_api_hydrate | nohydrate | Hydration performed in |
| --- | --- | --- |
| false | false | database |
| true | true | API |
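To make the mapping concrete, here is a minimal, self-contained sketch of the assignment quoted in the issue. The `Settings` and `SearchRequest` classes and both helper functions are hypothetical stand-ins written for illustration; they are not the real stac-fastapi-pgstac classes, and only the `conf["nohydrate"] = settings.use_api_hydrate` line is taken from the thread.

```python
from dataclasses import dataclass, field


@dataclass
class Settings:
    # Hypothetical stand-in for the app settings; the default mirrors the issue.
    use_api_hydrate: bool = False


@dataclass
class SearchRequest:
    # Hypothetical stand-in for the search request; conf is passed along to PgSTAC.
    conf: dict = field(default_factory=dict)


def apply_hydration_conf(request: SearchRequest, settings: Settings) -> SearchRequest:
    # Same assignment as the line quoted in the issue.
    request.conf["nohydrate"] = settings.use_api_hydrate
    return request


def hydration_site(conf: dict) -> str:
    # nohydrate=true means the database skips hydration, so the API must do it.
    return "API" if conf.get("nohydrate", False) else "database"


default_request = apply_hydration_conf(SearchRequest(), Settings())
api_request = apply_hydration_conf(SearchRequest(), Settings(use_api_hydrate=True))
print(default_request.conf, hydration_site(default_request.conf))
print(api_request.conf, hydration_site(api_request.conf))
```

With the default settings the conf carries nohydrate=False and the database hydrates; flipping `use_api_hydrate` to True moves that work to the API.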

Whether or not the default is right for your setup is a bit subjective, though. In our experience, the option should target wherever you have the most spare compute. In the Planetary Computer, which has a single, large pgstac database server instance, DB hydration caused high CPU usage on that server and slowed down queries across the board. We also had a fairly large API cluster, so moving hydration to the API let us spread that CPU load across the nodes while the DB remained responsive. If you have more compute in your DB server than in your API instances, it may be better to keep hydration in the DB.
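As a rough illustration of that rule of thumb, the choice can be written as a one-line comparison. The function name and the "spare CPU fraction" inputs are assumptions made up for this sketch; nothing like this exists in stac-fastapi-pgstac, and real capacity planning would weigh more than a single number.

```python
def choose_use_api_hydrate(db_spare_cpu: float, api_spare_cpu: float) -> bool:
    """Return True (hydrate in the API) when the API tier has more spare CPU.

    Both arguments are hypothetical estimates of idle CPU capacity, expressed
    in the same unit for the whole tier (for example, idle cores).
    """
    return api_spare_cpu > db_spare_cpu


# One busy database server and a large API cluster: hydrate in the API.
print(choose_use_api_hydrate(db_spare_cpu=2.0, api_spare_cpu=24.0))
# A large database server and a small API deployment: keep hydration in the DB.
print(choose_use_api_hydrate(db_spare_cpu=16.0, api_spare_cpu=1.0))
```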
