Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Clean up Dataspaces #1104

Merged
merged 18 commits into from
Dec 23, 2023
Merged

Clean up Dataspaces #1104

merged 18 commits into from
Dec 23, 2023

Conversation

simonbyrne
Copy link
Collaborator

@simonbyrne simonbyrne commented Sep 5, 2023

@mkitti
Copy link
Member

mkitti commented Sep 5, 2023

There is a lot of reorganization here.

Should this go into 0.17 or could it wait for 0.18?

My sense is that 0.18 is months off rather than a year or two.

@simonbyrne
Copy link
Collaborator Author

If 0.17 is otherwise ready to go, then this can wait.

One question is what should HDF5.UNLIMITED actually be? For compatibility, I made it simply -1, but we could make it its own type?

@mkitti
Copy link
Member

mkitti commented Sep 5, 2023

I'm not sure if this actually breaking or not, so perhaps it could be 0.17.1?

@simonbyrne
Copy link
Collaborator Author

The deprecations are breaking, so shouldn't go in a patch release.

This is part of my effort to clean up some interfaces, so happy to wait for 0.18.

@simonbyrne simonbyrne marked this pull request as draft September 5, 2023 22:03
@simonbyrne simonbyrne marked this pull request as ready for review September 5, 2023 22:51
@simonbyrne simonbyrne added this to the 0.18.0 milestone Sep 5, 2023
@simonbyrne simonbyrne force-pushed the sb/dataspace-unlimited branch 3 times, most recently from 6f4666b to a9815ca Compare September 6, 2023 17:10
@mkitti
Copy link
Member

mkitti commented Sep 6, 2023

Regarding the documentation, I would like to convert as many of the examples to jldoctest as possible so that these are actually executed and tested with changes that we make.

@mkitti
Copy link
Member

mkitti commented Sep 7, 2023

I would like to have parity with HDF5.Datatype. The issue with this is the similar with Core.DataType. Do you have anythoughts on how to resolve that?

@simonbyrne
Copy link
Collaborator Author

I would like to have parity with HDF5.Datatype. The issue with this is the similar with Core.DataType. Do you have anythoughts on how to resolve that?

The capitalization is different 😄?

I guess we don't have to export it: by making max_dims a keyword argument to create_dataset, I don't think users will really need to manually use Dataspace objects all that much?

@simonbyrne
Copy link
Collaborator Author

Can you expand on what you mean by "I would like to have parity with HDF5.Datatype"?

@mkitti
Copy link
Member

mkitti commented Sep 7, 2023

This package currently also exports datatype for obtaining instances of HDF5.Datatype which seems analogous to the relationship between dataspace and HDF5.Dataspace.

If we are going to increase use of HDF5.Dataspace constructors, should we also do the same HDF5.Datatype constructors?

@simonbyrne
Copy link
Collaborator Author

If we are going to increase use of HDF5.Dataspace constructors, should we also do the same HDF5.Datatype constructors?

Oh, I understand.

Yes, I think that would make sense: basically we would deprecate datatype(::Type) methods, and make them Datatype(::Type). We would keep datatype(obj) methods.

@simonbyrne simonbyrne force-pushed the sb/dataspace-unlimited branch 2 times, most recently from a486503 to b11eb00 Compare September 8, 2023 17:31
@mkitti
Copy link
Member

mkitti commented Sep 8, 2023

I think Datatype might be confusing though since it could easily be confused with Julia's DataType.

Perhaps the better strategy would be to focus on not requiring a HDF5.Datatype but rather just a Julia type to most user facing interfaces. Meanwhile we make the HDF5.Datatype constructor do most of what datatype currently does.

How about this:

  1. Mainly use Julia data types instead of HDF5 data types in the primary interface.
  2. Make the HDF5.Datatype constructor do what datatype does now.
  3. Do not export HDF5.Datatype due to confusion. Make it only necessary to use in unusual circumstances.

@simonbyrne
Copy link
Collaborator Author

I'm happy with 1 & 3.

For 2., I think it would be consistent to keep datatype(obj) for "the Datatype of elements of obj", in the same way that this PR makes dataspace(obj) be the "Dataspace of the dimensions of obj" vs Dataspace(dims) being the "Dataspace with dimensions dims".

@simonbyrne simonbyrne force-pushed the sb/dataspace-unlimited branch 3 times, most recently from 19bd547 to 8ebef2e Compare September 8, 2023 19:32
@simonbyrne
Copy link
Collaborator Author

something really odd is going on with the CI / julia 1 - windows-latest - x64.

We're getting an error in h5a_iterate:
https://github.com/JuliaIO/HDF5.jl/actions/runs/6124333998/job/16624132842#step:5:200
From what I can tell, it looks like h5a_iterate isn't throwing an error correctly. I can make it go away by removing the lock or the try/catch for the lock, or be removing the check of the error stack (which may be a good idea in any case?)

@mkitti
Copy link
Member

mkitti commented Sep 9, 2023

I saw the same error on Ubuntu. Do we have a race condition?

@mkitti
Copy link
Member

mkitti commented Sep 9, 2023

@simonbyrne
Copy link
Collaborator Author

Possibly, but I can't figure out exactly how.

@simonbyrne
Copy link
Collaborator Author

it could be something in HDF5 itself that isn't setting the error stack correctly?

@simonbyrne
Copy link
Collaborator Author

@mkitti any more thoughts on this? I'd like to make some more changes which depend on this.

@mkitti
Copy link
Member

mkitti commented Sep 15, 2023

I just got back home. Could you give me until Monday noon PT to look it over?

@simonbyrne
Copy link
Collaborator Author

No problem!

Copy link
Member

@mkitti mkitti left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few minor thoughts:

  1. What is the relationship between the UNLIMITED constant in HDF5 and API modules?
  2. Should we support a symbol for unlimited?
  3. Should we preserve some of the legacy tests?

constructor or [`create_dataset`](@ref), or as a `count` argument in
[`BlockRange`](@ref) when selecting virtual dataset mappings.
"""
const UNLIMITED = -1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we also interpret a symbol, :unlimited, for this purpose?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could. The question is what should size return? Defining HDF5.UNLIMITED = -1 seemed like the easiest option to get backwards compatibility and type stability.

test/dataspace.jl Show resolved Hide resolved
@simonbyrne
Copy link
Collaborator Author

1. What is the relationship between the UNLIMITED constant in HDF5 and API modules?

API.H5S_UNLIMITED = typemax(UInt64), whereas UNLIMITED = -1

2. Should we support a symbol for unlimited?

We could do, but the question is what should size return?

3. Should we preserve some of the legacy tests?

I believe most of the tests should still be there, I just rearranged them a bit.

@mkitti mkitti mentioned this pull request Sep 27, 2023
@mkitti
Copy link
Member

mkitti commented Dec 4, 2023

I'm thinking about just merging this and moving on to 0.18 development. It's been a while.

@mkitti mkitti merged commit 4568054 into master Dec 23, 2023
16 checks passed
@mkitti mkitti deleted the sb/dataspace-unlimited branch December 23, 2023 23:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants