Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Query on last data #348

Open
pjhaest opened this issue Mar 21, 2022 · 6 comments
Open

Query on last data #348

pjhaest opened this issue Mar 21, 2022 · 6 comments
Labels
dov Related to the DOV webservices

Comments

@pjhaest
Copy link
Collaborator

pjhaest commented Mar 21, 2022

following up on #346 , it would be useful to query the latest measurement that was added to the data series, for all measured variables. I don't think this is possible with the current backend setup? Would that be something that DOV can elaborate @marleenvd ?

@pjhaest pjhaest added the dov Related to the DOV webservices label Mar 21, 2022
@Roel
Copy link
Member

Roel commented Mar 22, 2022

This is indeed currently not possible. I think this requires extra field(s) in the WFS services, e.g. eerste_invoer and/or laatste_wijziging. This question has been raised before, and there is some analysis planned (outcome and timing TBD), linked to ticket DOVSONDCL-6747 in DOV (not publicly accessible). I will follow up here once there is some progress.

@pjhaest
Copy link
Collaborator Author

pjhaest commented Dec 16, 2022

any news on this issue? Maybe something to keep in mind for the next year?

@Roel
Copy link
Member

Roel commented Jan 31, 2023

I propose to start with adding eerste_invoer, since laatste_wijziging needs to take a lot more data into account for calculating this correctly (due to the fact that the WFS services contain aggregated fields based on data linked to the object itself, and the fact that pydov uses the XML services too).

This would already allow answering questions like 'Boringen added since X', or 'Grondwatermonsters with PFAS measurements added since X', but not 'Peilmetingen added since X' (since this is not a separate service, but included in the XML download of the Grondwaterfilter service).

@pjhaest
Copy link
Collaborator Author

pjhaest commented Jan 31, 2023

Is that from the pydov point of view?
Is it possible, in the DOV database, to add an extra table with the last addition for each object? I think additions are rare, so this would not tremendously slow down the processing, would it? I know that's out of scope for pydov, but maybe core DOV users are interested in this feature as well?
Not an expert here, is the creation date of the xml available for query?
Anyway, happy to see that this subject is under consideration!

@Roel
Copy link
Member

Roel commented Feb 21, 2023

We have history available for all tables in DOV and some of it can be consulted in the web UI too when logged in. These tables is where the timestamp of eerste_invoer comes from too. It's not a performance thing, the WFS services are materialized anyway, it's more that the timestamp of laatste_wijziging needs to take a lot more (linked) objects (read: other tables) into account which makes the queries a lot more complex to setup and maintain.

Regarding the creation date of the XML: the XML is created on the fly when it is requested, so this would not be very useful.

@Roel
Copy link
Member

Roel commented Apr 4, 2023

The field eerste_invoer has been added to all services used in pydov. This allows to search for the data added since a certain timestamp, or search for the most recently added data.

For example to find the most recently added groundwater samples with PFAS observations, you could use something like this:

from pydov.search.grondwatermonster import GrondwaterMonsterSearch
from pydov.types.grondwatermonster import GrondwaterMonster

from owslib.fes2 import SortBy, SortProperty, PropertyIsEqualTo

s = GrondwaterMonsterSearch()

return_fields = GrondwaterMonster.get_field_names()
return_fields.append('eerste_invoer')

df = s.search(
  query=PropertyIsEqualTo('chemisch_PFAS', 'true'), max_features=10,
  return_fields=return_fields,
  sort_by=SortBy([SortProperty('eerste_invoer', 'DESC')]))
df = df[df.parametergroep == "Grondwater_chemisch_PFAS"]
                                pkey_grondwatermonster                  grondwatermonsternummer                             pkey_grondwaterlocatie        gw_id                                        pkey_filter  ... detectie  waarde  eenheid  veld_labo             eerste_invoer
0    https://www.dov.vlaanderen.be/data/watermonste...  56(4-KPS-0012a_1 | 01/02/22 - 30/06/22)  https://www.dov.vlaanderen.be/data/put/2017-00...  4-KPS-0012a  https://www.dov.vlaanderen.be/data/filter/2009...  ...        <     1.0     ng/l       LABO 2022-07-01 14:15:03+02:00
1    https://www.dov.vlaanderen.be/data/watermonste...  56(4-KPS-0012a_1 | 01/02/22 - 30/06/22)  https://www.dov.vlaanderen.be/data/put/2017-00...  4-KPS-0012a  https://www.dov.vlaanderen.be/data/filter/2009...  ...        <     1.0     ng/l       LABO 2022-07-01 14:15:03+02:00
2    https://www.dov.vlaanderen.be/data/watermonste...  56(4-KPS-0012a_1 | 01/02/22 - 30/06/22)  https://www.dov.vlaanderen.be/data/put/2017-00...  4-KPS-0012a  https://www.dov.vlaanderen.be/data/filter/2009...  ...        <     1.0     ng/l       LABO 2022-07-01 14:15:03+02:00
3    https://www.dov.vlaanderen.be/data/watermonste...  56(4-KPS-0012a_1 | 01/02/22 - 30/06/22)  https://www.dov.vlaanderen.be/data/put/2017-00...  4-KPS-0012a  https://www.dov.vlaanderen.be/data/filter/2009...  ...        <     1.0     ng/l       LABO 2022-07-01 14:15:03+02:00
4    https://www.dov.vlaanderen.be/data/watermonste...  56(4-KPS-0012a_1 | 01/02/22 - 30/06/22)  https://www.dov.vlaanderen.be/data/put/2017-00...  4-KPS-0012a  https://www.dov.vlaanderen.be/data/filter/2009...  ...        <     1.0     ng/l       LABO 2022-07-01 14:15:03+02:00
..                                                 ...                                      ...                                                ...          ...                                                ...  ...      ...     ...      ...        ...                       ...
505  https://www.dov.vlaanderen.be/data/watermonste...    47(701/75/29_1 | 01/02/22 - 30/06/22)  https://www.dov.vlaanderen.be/data/put/2017-00...    701/75/29  https://www.dov.vlaanderen.be/data/filter/2003...  ...        <     2.0     ng/l       LABO 2022-07-01 14:14:40+02:00
506  https://www.dov.vlaanderen.be/data/watermonste...    47(701/75/29_1 | 01/02/22 - 30/06/22)  https://www.dov.vlaanderen.be/data/put/2017-00...    701/75/29  https://www.dov.vlaanderen.be/data/filter/2003...  ...        <     1.0     ng/l       LABO 2022-07-01 14:14:40+02:00
507  https://www.dov.vlaanderen.be/data/watermonste...    47(701/75/29_1 | 01/02/22 - 30/06/22)  https://www.dov.vlaanderen.be/data/put/2017-00...    701/75/29  https://www.dov.vlaanderen.be/data/filter/2003...  ...        <     1.0     ng/l       LABO 2022-07-01 14:14:40+02:00
508  https://www.dov.vlaanderen.be/data/watermonste...    47(701/75/29_1 | 01/02/22 - 30/06/22)  https://www.dov.vlaanderen.be/data/put/2017-00...    701/75/29  https://www.dov.vlaanderen.be/data/filter/2003...  ...        <     1.0     ng/l       LABO 2022-07-01 14:14:40+02:00
509  https://www.dov.vlaanderen.be/data/watermonste...    47(701/75/29_1 | 01/02/22 - 30/06/22)  https://www.dov.vlaanderen.be/data/put/2017-00...    701/75/29  https://www.dov.vlaanderen.be/data/filter/2003...  ...        <     1.0     ng/l       LABO 2022-07-01 14:14:40+02:00

[480 rows x 18 columns]

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
dov Related to the DOV webservices
Projects
None yet
Development

No branches or pull requests

2 participants