Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DM-46185: Refactor and simplify. #29

Open
wants to merge 17 commits into
base: main
Choose a base branch
from
Open

DM-46185: Refactor and simplify. #29

wants to merge 17 commits into from

Conversation

ktlim
Copy link
Contributor

@ktlim ktlim commented Nov 21, 2024

No description provided.

@hsinfang hsinfang force-pushed the tickets/DM-46185 branch 3 times, most recently from 7865ff7 to f7130b3 Compare December 4, 2024 23:39
ktlim and others added 3 commits December 4, 2024 16:18
The original test is not quite right, because when the embargo hours
falls right on an exposure, it should unembargo it.

Copy the original test to a new test test_copy_just_one.

shift the "now" time test_nothing_copies to be 0.000001 second earlier,
so all data still stay within the embargo period, and nothing should
copy.

5596964.255774 sec is the time difference between 2020-01-17 16:55:11.322700
and 2019-11-13T22:12:27.066926, the start time of exposure 2019111300059.
@hsinfang
Copy link

hsinfang commented Dec 5, 2024

@ktlim may you please take a look at my commits?

Now all tests pass as before this PR (two xfail existed before this PR). I changed the expected results or data values for some tests, with justifications in the commit messages. Please see if they make sense to you.

I tried to preserve what I think the tests were written to mean, and leave test refactoring to a later ticket.

test_main_copy_midnight_precision is idential to
test_main_midnight_precision.
test_raw_and_calexp_should_copy and test_raw_and_calexp_should_copy_yaml
have two, different now_time_embargo timestamps.

With the new API, it is more difficult to test with multiple
now_time_embargo timestamps in one call. Also we wouldn't use that
in pratical use case anyway.

So, use just the later timestamp, and shift the embargo
period for the raw datasets that used the earlier timestamp.

The amount of shifting is the difference between the two original
now_time_embargo timestamps:
2022-11-13T03:35:12.836981 - 2020-01-17T16:55:11.322700 =
89030401.514281 seconds, or 24730.66708730028 hr.

Instead of an embargo period of 0.1 hr, the new embargo period
is 0.1 + 24730.66708730028 hr.
@hsinfang
Copy link

hsinfang commented Dec 5, 2024

Sorry, pulling back one commit. test_main_midnight_precision results were wrong and I'm not sure what was wrong yet.

Exposure 2020011700004 has a start time at 2020-01-17T16:55:10.322700
and end time at 2020-01-17T16:55:11.322700.

3827088.677301 seconds after 2020-01-17T16:55:10.322700 is
2020-03-02T00:00:00.000000

So, in test_main_midnight_precision, 2020011700004 _just_ got out
of the embargo period.
@hsinfang
Copy link

hsinfang commented Dec 6, 2024

Fixed

transfer_dataset_type is used with batching. It is possible that
no datasets match the data IDs in a particular batch from all
possible exposure/visit records.
@hsinfang
Copy link

hsinfang commented Dec 11, 2024

Notes: this branch doesn't work with some datasets at the moment, because of DM-48094 since w_2024_37. But older stacks didn't have butler.collections.query_info yet.

The total number of dataset types in the repo can be much
larger than those in a specific collection.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants