Nested Dictionaries #1187
Comments
I have several questions:
One of my reservations about this is that it seems to encourage the user to read the entire file at once, whereas I would tend to encourage the user to use lazy interfaces for this purpose. The nested dictionary behavior that you describe seems to closely resemble the current dictionary-like interface:
julia> h5f = h5open("test.h5")
🗂️ HDF5.File: (read-only) test.h5
├─ 🔢 B
└─ 📂 groupA
   ├─ 🔢 A1
   └─ 🔢 A2
julia> h5f["groupA"]["A1"][]
2×2 Matrix{Float64}:
0.43893 0.583493
0.546226 0.652598
The only difference here is the final []. How would you then deal with attributes?
Thanks for the feedback! Here are my replies:
I wrote a custom read function which uses h5open. The reader code is as follows (the writing code is similar). It's not too different from the code in FileIOExt.jl, just with a different output:
using HDF5, OrderedCollections

function load_nested(filename)
    h5open(filename) do fid
        read_group(fid)
    end
end

# Recursively read a file or group into an OrderedDict keyed by object name.
function read_group(parent)
    d = OrderedDict{String,Any}()
    for key in keys(parent)
        d[key] = read_dataset(parent[key])
    end
    d
end

# Groups become nested dictionaries; datasets are read eagerly.
read_dataset(val::HDF5.Group) = read_group(val)
read_dataset(val::HDF5.Dataset) = read(val)
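The writing side is only described above as "similar", so the following is a minimal sketch of what it might look like rather than the code actually used. The names save_nested, write_group and write_entry are hypothetical; the sketch assumes every non-dictionary value is something HDF5.jl can write directly as a dataset.

```julia
using HDF5

# Hypothetical counterpart to load_nested: write a nested dictionary to a new file.
function save_nested(filename, d::AbstractDict)
    h5open(filename, "w") do fid
        write_group(fid, d)
    end
end

# Write every entry of the dictionary under `parent` (a file or a group).
function write_group(parent, d::AbstractDict)
    for (key, val) in d
        write_entry(parent, key, val)
    end
end

# A dictionary becomes a group; anything else is written as a dataset.
write_entry(parent, key, val::AbstractDict) = write_group(create_group(parent, key), val)
write_entry(parent, key, val) = write(parent, key, val)
```

Dispatching on AbstractDict mirrors the read_dataset dispatch on HDF5.Group and HDF5.Dataset, so the same sketch would accept a Dict or an OrderedDict.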
I would prefer not to create a new function. My new functions are just my local solution to avoid type piracy. So
I actually wrote it for OrderedDict, but it could mirror the current load function's sink / typeflag:
Probably best practice, especially if data volumes are large. In my case, I want to mutate the structure without risking modifying the file contents. Based on the example you provided, it is read-only. In the past (in other languages and with other data types) I got into a habit of not leaving files "open".
That's interesting. So the intent of this would be a different structure for the
# current:
h5f = load(file) # creates a flat dict
A1 = h5f["groupA/A1"]
# new
h5f = load(file; nested = true) # creates a nested dict
A1 = h5f["groupA"]["A1"]
I'm not sure. Does the current load function handle attributes?
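To make the difference between the two layouts concrete, here is a small sketch that converts a flat dictionary with "/"-separated keys (the shape the current load is described as returning) into the nested form. The helper name nest is hypothetical, and the sample data is made up for illustration; no HDF5 file is involved.

```julia
# Hypothetical helper: turn a flat Dict keyed by "/"-separated paths
# into a nested Dict keyed by individual path components.
function nest(flat::AbstractDict{<:AbstractString})
    nested = Dict{String,Any}()
    for (path, value) in flat
        parts = split(path, '/'; keepempty = false)
        node = nested
        # Walk (and create as needed) one sub-dictionary per intermediate group.
        for part in parts[1:end-1]
            node = get!(() -> Dict{String,Any}(), node, String(part))
        end
        node[String(parts[end])] = value
    end
    return nested
end

# Illustrative data only.
flat = Dict("B" => [1, 2, 3], "groupA/A1" => [1.0 2.0; 3.0 4.0], "groupA/A2" => 5)
nested = nest(flat)
nested["groupA"]["A1"] == flat["groupA/A1"]  # same data, reached through structure
```

Since the flat form already carries the full hierarchy in its keys, a nested option would change only the shape of the result, not what is read.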
My mental model of an HDF5 file is a folder structure, where related data is grouped together and buried in a nested / hierarchical format. Currently the read functions deliver a flat dictionary, and the hierarchy is held in strings as opposed to structure. The alternative, which matches my mental model, is to read an HDF5 file in as a nested dictionary, where the value of a key is the data if the key refers to a dataset, and the value is a dictionary if the key refers to a group.
So for an HDF5 like:
The current read generates:
And I'd prefer an option to read_nested as:
I've written this code locally (plus a corresponding write_nested). Would it be reasonable to include it here?