Fix for avoiding serialization error when datetimes are > 2263 #6654

frode-aarstad · 2023-11-23T14:04:56Z

Issue
Resolves

Approach
We now use strings rather than datetime objects to represent time values in responses and observations.
Responses are converted before they are written to file, and observations are converted after they are read from file.
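
For context, here is a minimal sketch of why dates this far in the future are a problem. It assumes the serialization error stems from nanosecond-resolution timestamps stored in a signed 64-bit integer (as with NumPy's `datetime64[ns]`); that cause is an assumption and is not stated in this PR. The sketch uses only the standard library to compute the latest instant such a timestamp can hold, and shows that an ISO 8601 string round-trips a later date without loss. The year 2300 is an arbitrary example value.

```python
from datetime import datetime, timedelta

# Largest signed 64-bit integer, interpreted as nanoseconds since 1970-01-01.
MAX_INT64_NS = 2**63 - 1

# Python's datetime has microsecond resolution, so drop the last three digits.
latest_ns_timestamp = datetime(1970, 1, 1) + timedelta(microseconds=MAX_INT64_NS // 1000)
print(latest_ns_timestamp)  # 2262-04-11 23:47:16.854775

# A date beyond that limit (arbitrary example) survives a round trip through a string.
far_future = datetime(2300, 1, 1, 12, 30, 45)
as_text = far_future.isoformat()
restored = datetime.fromisoformat(as_text)
```

Under that assumption the actual cutoff falls in April 2262, slightly earlier than the year named in the PR title.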

Pre review checklist

Read through the code changes carefully after finishing work
Make sure tests pass locally (after every commit!)
Prepare changes in small commits for more convenient review (optional)
PR title captures the intent of the changes, and is fitting for release notes.
Updated documentation
Ensured that unit tests are added for all new behavior (See Ground Rules), and changes to existing code have good test coverage.

Pre merge checklist

Added appropriate release note label
Commit history is consistent and clean, in line with the contribution guidelines.

codecov-commenter · 2023-11-27T07:31:35Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 84.37%. Comparing base (91ea363) to head (0ef0944).
Report is 924 commits behind head on main.

Additional details and impacted files

@@           Coverage Diff           @@
##             main    #6654   +/-   ##
=======================================
  Coverage   84.37%   84.37%           
=======================================
  Files         367      367           
  Lines       21856    21862    +6     
  Branches      900      900           
=======================================
+ Hits        18440    18446    +6     
  Misses       3122     3122           
  Partials      294      294


berland · 2023-11-29T07:56:20Z

src/ert/storage/local_ensemble.py

@@ -217,8 +217,15 @@ def load_responses(
            if not input_path.exists():
                raise KeyError(f"No response for key {key}, realization: {realization}")
            ds = xr.open_dataset(input_path, engine="scipy")
+            if "time" in ds.coords:
+                ds.coords["time"] = [
+                    datetime.fromisoformat(str(e.values).split(".", maxsplit=1)[0])


Are we losing the possibility for millisecond accuracy here?

Yes, with this implementation we do. Do we really need milliseconds? I think we might be able to keep milliseconds as well, but nanoseconds are not possible.

Uncertain whether we "need" it, but Eclipse can output milliseconds. Whether resdata supports it I am not quite sure.

Could we perhaps do the millisecond truncation in a more canonical way and not by string manipulation? That is, parse the ISO string as is, and then set the millisecond part to zero using the datetime API?
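
The suggestion above could look roughly like the following sketch, which parses the full ISO string and then adjusts sub-second precision through the `datetime` API rather than by splitting the string. The helper names `drop_subseconds` and `keep_milliseconds` are hypothetical and not part of this PR, and the input is assumed to carry exactly six fractional digits.

```python
from datetime import datetime


def drop_subseconds(iso_text: str) -> datetime:
    """Parse an ISO 8601 timestamp and zero out everything below one second."""
    return datetime.fromisoformat(iso_text).replace(microsecond=0)


def keep_milliseconds(iso_text: str) -> datetime:
    """Parse an ISO 8601 timestamp and truncate it to millisecond precision."""
    parsed = datetime.fromisoformat(iso_text)
    return parsed.replace(microsecond=parsed.microsecond // 1000 * 1000)


stamp = "2300-01-01T12:30:45.123456"
print(drop_subseconds(stamp))    # 2300-01-01 12:30:45
print(keep_milliseconds(stamp))  # 2300-01-01 12:30:45.123000
```

The second helper illustrates that millisecond accuracy could be kept if needed. Strings with nine fractional digits would need separate handling before parsing.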

oyvindeide

Looks like a good change! How will this work with older versions of storage? I guess we will probably need a migration for it?

oyvindeide · 2023-12-07T11:12:46Z

src/ert/analysis/_es_update.py

@@ -329,6 +329,12 @@ def _get_obs_and_measure_data(
            }
            observation = observation.sel(sub_selection)
        ds = source_fs.load_responses(group, tuple(iens_active_index))
+
+        if "time" in observation.coords:


Could we potentially encode this information in the xarray dataset, so we can do the conversion when loading it? Then it can be more generic, so we can potentially have other axes named something other than time, and also benefit from this improvement.

Not quite sure what you mean here. I guess we could add an attr to the xarray.Dataset that indicates which columns have been encoded, i.e. attr({"column_name": "datetime_to_string", ...})

Yes, something like that, so we can do: if datetime_to_string in attrs: ...
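
One way to picture the idea discussed above is the sketch below. It deliberately uses a plain dictionary as a stand-in for a dataset with `coords` and `attrs`, so it is not xarray code. The marker value `datetime_to_string` comes from the comment above, while `encode_times` and `decode_times` are hypothetical helper names.

```python
from datetime import datetime

MARKER = "datetime_to_string"


def encode_times(dataset: dict, column: str) -> None:
    """Store datetimes in `column` as ISO strings and record that in attrs."""
    dataset["coords"][column] = [t.isoformat() for t in dataset["coords"][column]]
    dataset["attrs"][column] = MARKER


def decode_times(dataset: dict) -> None:
    """Convert back every column that attrs marks as string-encoded."""
    for column, encoding in dataset["attrs"].items():
        if encoding == MARKER:
            dataset["coords"][column] = [
                datetime.fromisoformat(text) for text in dataset["coords"][column]
            ]


# Stand-in dataset; the axis is intentionally not called "time".
dataset = {"coords": {"report_date": [datetime(2300, 1, 1)]}, "attrs": {}}
encode_times(dataset, "report_date")
encoded = list(dataset["coords"]["report_date"])
decode_times(dataset)
```

Because the decoder looks only at the marker, it does not need to know which axis names carry time values.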

kvashchuka · 2023-12-07T14:48:01Z

tests/unit_tests/test_load_forward_model.py

+                continue
+            print(line, end="")
+
+    facade = LibresFacade.from_config_file("snake_oil.ert")


I guess we should avoid LibresFacade here in light of removing it in #6687

We still need it for facade.load_from_forward_model below

Might be a stupid question, but why? I just tried, and the test passes without this line as well.

I guess you want to check that it can be loaded without errors, right?

Also, I might be missing something, but the test does not fail when I add it to main and run 🤷‍♀️ I would expect it to fail without your fix, am I doing something wrong?

Your observations are correct. I have added another assert to check that we actually load what is written to the runpath.

kvashchuka · 2023-12-07T14:59:49Z

tests/unit_tests/test_load_forward_model.py

+
+    facade = LibresFacade.from_config_file("snake_oil.ert")
+    realisation_number = 0
+    storage = open_storage(facade.enspath, mode="w")


Suggested change

-    storage = open_storage(facade.enspath, mode="w")
+    storage_path = ErtConfig.from_file("snake_oil.ert").ens_path
+    storage = open_storage(storage_path, mode="w")

dafeda · 2024-01-10T06:39:25Z

src/ert/storage/local_experiment.py

-            observation.name: xr.open_dataset(observation, engine="scipy")
-            for observation in observations
-        }
+        dict = {}


dict is a built-in type in Python (a name, not a keyword), so I think we should avoid using it as a variable name.
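
A small illustration of the concern, not taken from this PR: once `dict` is rebound to a value in some scope, the built-in constructor is no longer reachable under that name there. The function name `shadowing_demo` is hypothetical.

```python
def shadowing_demo() -> str:
    dict = {}  # shadows the built-in type inside this function
    try:
        dict(a=1)  # now calls the empty dict instance, which is not callable
    except TypeError as error:
        return type(error).__name__
    return "no error"


result = shadowing_demo()
print(result)  # TypeError
```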

dafeda · 2024-01-10T06:39:51Z

src/ert/storage/local_experiment.py

+            ds = xr.open_dataset(observation, engine="scipy")
+            if "time" in ds.coords:
+                ds.coords["time"] = [
+                    t[:-3] for t in ds.coords["time"].values.astype(str)


This needs a comment I think.

dafeda

Great 👍

oyvindeide · 2024-02-13T09:00:00Z

src/ert/analysis/_es_update.py

@@ -305,13 +305,15 @@ def _get_obs_and_measure_data(
    for obs_key, obs_active_list in selected_observations:
        observation = observations[obs_key]
        group = observation.attrs["response"]
+


Something seems to be missing here?

It's just an extra line.

dafeda · 2024-03-13T09:00:41Z

What's the status of this?

frode-aarstad self-assigned this Nov 23, 2023

frode-aarstad marked this pull request as draft November 23, 2023 14:05

frode-aarstad force-pushed the datetime-problem branch 3 times, most recently from 7fa7d4c to bf4a3f7 Compare November 24, 2023 13:56

frode-aarstad marked this pull request as ready for review November 27, 2023 07:21

berland reviewed Nov 29, 2023

frode-aarstad marked this pull request as draft November 29, 2023 15:52

frode-aarstad force-pushed the datetime-problem branch from 9f99e93 to cfd5821 Compare December 1, 2023 09:15

frode-aarstad marked this pull request as ready for review December 1, 2023 09:26

oyvindeide reviewed Dec 7, 2023

kvashchuka reviewed Dec 7, 2023

frode-aarstad force-pushed the datetime-problem branch 4 times, most recently from 0c96124 to 9724536 Compare January 9, 2024 08:27

dafeda reviewed Jan 10, 2024

frode-aarstad force-pushed the datetime-problem branch 2 times, most recently from 953b0f1 to 5667e39 Compare January 18, 2024 11:26

Fix for avoiding serialization error when having datetimes larger than 2263 (0ef0944)

frode-aarstad force-pushed the datetime-problem branch from 5667e39 to 0ef0944 Compare January 18, 2024 12:28

dafeda approved these changes Jan 23, 2024

oyvindeide reviewed Feb 13, 2024

frode-aarstad commented Nov 23, 2023 •

edited by sondreso

codecov-commenter commented Nov 27, 2023 •

edited

berland Nov 29, 2023

frode-aarstad Nov 29, 2023

berland Nov 29, 2023

berland Nov 29, 2023

oyvindeide left a comment

oyvindeide Dec 7, 2023

frode-aarstad Dec 11, 2023

oyvindeide Jan 10, 2024

kvashchuka Dec 7, 2023

frode-aarstad Jan 4, 2024

kvashchuka Jan 8, 2024

kvashchuka Jan 8, 2024

kvashchuka Jan 8, 2024

frode-aarstad Jan 9, 2024

kvashchuka Dec 7, 2023

dafeda Jan 10, 2024

dafeda Jan 10, 2024

dafeda left a comment

oyvindeide Feb 13, 2024

frode-aarstad Feb 14, 2024

dafeda commented Mar 13, 2024
