
Handling different types from the HDF hierarchical tree (dataset, group, attribute) #136


Closed
caiohamamura opened this issue Jan 29, 2021 · 8 comments

Comments

@caiohamamura

caiohamamura commented Jan 29, 2021

As it stands, the only option is to get a Vec<String> of member names, non-recursively. We can't tell what kind of object each member is: a member may be a dataset or a group, and applications need to handle the two differently.

You can try something like this to check whether a member is a group:

if let Ok(member_names) = hdf_file.get_members() {
    for member_name in member_names {
        if let Ok(group) = hdf_file.get_group(member_name.as_str()) {
            // This is a group, do something with the group
        } else {
            // This is not a group
        } 
    }
}

But when the member is not a group, the program still runs, yet it prints nasty HDF5 errors like this:

HDF5-DIAG: Error detected in HDF5 (1.10.5) thread 9184:
  #000: D:\src\osgeo4w64\src\hdf5\src\H5G.c line 470 in H5Gopen2(): unable to open group
    major: Symbol table
    minor: Can't open object

So, is there any better way to browse the HDF hierarchical tree?

@aldanor
Owner

aldanor commented Jan 29, 2021

?

@caiohamamura
Author

Sorry I mistyped <Enter + something> and it just posted an empty issue.

@caiohamamura caiohamamura changed the title Ve Handling for the HDF hierarchical tree different types (dataset, group, attribute) Jan 29, 2021
@caiohamamura caiohamamura changed the title Handling for the HDF hierarchical tree different types (dataset, group, attribute) Handling different types from the HDF hierarchical tree (dataset, group, attribute) Jan 29, 2021
@aldanor
Owner

aldanor commented Jan 29, 2021

  1. You can call silence_errors() so you won't see those errors (maybe we should do that by default and let people unsilence them instead, since HDF5 errors are normal in cases like this).
  2. We can certainly improve the traversal API, which is a bit simplistic at the moment and only gives you names; it could return a list of enums (Dataset/Group/...) or something similar.
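
To make option 2 concrete, here is a minimal sketch of a typed member listing. It runs against an in-memory stand-in for a file, not against the hdf5 crate: `Node`, `MemberKind`, `members`, and `sample_file` are all hypothetical names invented for illustration, and none of them is an existing or planned API.

```rust
use std::collections::BTreeMap;

/// Hypothetical in-memory stand-in for an HDF5 object (not an hdf5 crate type).
#[derive(Debug, Clone, PartialEq)]
pub enum Node {
    Dataset { shape: Vec<usize> },
    Group(BTreeMap<String, Node>),
}

/// Hypothetical kind tag that a typed listing could return next to each name.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum MemberKind {
    Dataset,
    Group,
}

/// Lists the direct children of a group together with their kind,
/// non-recursively. A dataset has no members, so it yields an empty list.
pub fn members(node: &Node) -> Vec<(String, MemberKind)> {
    match node {
        Node::Dataset { .. } => Vec::new(),
        Node::Group(children) => children
            .iter()
            .map(|(name, child)| {
                let kind = match child {
                    Node::Dataset { .. } => MemberKind::Dataset,
                    Node::Group(_) => MemberKind::Group,
                };
                (name.clone(), kind)
            })
            .collect(),
    }
}

/// Builds a small example tree: one group holding a dataset, plus one
/// top-level dataset.
pub fn sample_file() -> Node {
    let mut inner = BTreeMap::new();
    inner.insert("temperature".to_string(), Node::Dataset { shape: vec![10, 20] });
    let mut root = BTreeMap::new();
    root.insert("measurements".to_string(), Node::Group(inner));
    root.insert("timestamps".to_string(), Node::Dataset { shape: vec![10] });
    Node::Group(root)
}
```

With a listing like this, the caller can `match` on the kind instead of trying to open every member as a group and treating the failure as "not a group", which is what triggers the HDF5 diagnostics in the first place.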

@caiohamamura
Author

caiohamamura commented Jan 29, 2021

Yeah, I think solution 2 would be a great improvement! I use HDF5 in many languages, and this would definitely be the best overall solution of any API I've used so far.

@aldanor
Owner

aldanor commented Jan 29, 2021

In (2) the problem is how to best design the API:

  • should it be callback-based traversal (like in h5py)
  • should it be an iterator-like traversal
  • should it return a Vec of enums (Dataset/Group/...)

There's also the common task of recursive traversal (like h5py's visit).

The usual problem is that while you're traversing something, the items may have been already mutated or removed or renamed.
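
As a rough illustration of the callback-based option combined with recursive traversal, the sketch below walks an in-memory tree depth-first and lets the callback stop the walk by returning `false`. It assumes a hypothetical `Node` type and `visit` function that exist only for this example; it does not touch the hdf5 crate.

```rust
use std::collections::BTreeMap;

/// Hypothetical in-memory stand-in for an HDF5 object (not an hdf5 crate type).
#[derive(Debug)]
pub enum Node {
    Dataset,
    Group(BTreeMap<String, Node>),
}

/// Depth-first traversal. Calls `op(path, node)` for every descendant of
/// `node`; `op` returns `false` to stop. The function itself returns `false`
/// if the walk was stopped early and `true` if it ran to completion.
pub fn visit<F>(node: &Node, prefix: &str, op: &mut F) -> bool
where
    F: FnMut(&str, &Node) -> bool,
{
    if let Node::Group(children) = node {
        for (name, child) in children {
            let path = format!("{}/{}", prefix, name);
            // Report the child first, then descend into it.
            if !op(&path, child) {
                return false;
            }
            if !visit(child, &path, op) {
                return false;
            }
        }
    }
    true
}

/// Example tree: group "a" with datasets "x" and "y", plus dataset "b".
pub fn sample_tree() -> Node {
    let mut a = BTreeMap::new();
    a.insert("x".to_string(), Node::Dataset);
    a.insert("y".to_string(), Node::Dataset);
    let mut root = BTreeMap::new();
    root.insert("a".to_string(), Node::Group(a));
    root.insert("b".to_string(), Node::Dataset);
    Node::Group(root)
}
```

In this mock, the shared borrow `&Node` rules out mutation while the walk is running. A real HDF5 file gives no such guarantee, which is exactly the concern raised above: another handle could rename or remove an object mid-traversal.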

@aldanor
Owner

aldanor commented Jan 29, 2021

It's quite easy to implement, the question is just how it should look. Maybe @mulimoen will have more ideas.

@mulimoen
Collaborator

The basic atom would be

/// Take a visitor function F (return false to stop iteration)
fn iterate<F, G>(&self, mut op: F, mut val: G) -> Result<G>
where
    F: FnMut(hid_t, *const c_char, *const H5L_info_t, &mut G) -> bool,
{
    /// Struct used to pass a tuple
    struct Vtable<'a, F, D> {
        f: &'a mut F,
        d: &'a mut D,
    }
    // Monomorphs a closure to a C callback
    //
    // This function will be called multiple times, but never concurrently
    extern "C" fn callback<F, G>(
        id: hid_t, name: *const c_char, info: *const H5L_info_t, op_data: *mut c_void,
    ) -> herr_t
    where
        F: FnMut(hid_t, *const c_char, *const H5L_info_t, &mut G) -> bool,
    {
        panic::catch_unwind(|| {
            // This should not occur if `hdf5` upholds their invariants
            assert!(!op_data.is_null());
            let vtable = op_data as *mut Vtable<F, G>;
            // This points at the Vtable, which lives on the caller's stack
            let vtable: &mut Vtable<F, G> = unsafe { &mut *vtable };
            //let info = unsafe { &*info };
            if (vtable.f)(id, name, info, vtable.d) {
                0
            } else {
                1
            }
        })
        .unwrap_or(-1)
    }

    let callback_fn: H5L_iterate_t = Some(callback::<F, G>);
    let iter_pos: *mut hsize_t = &mut 0_u64;

    // Bundle our references in a stack-allocated Vtable
    let mut vtable = Vtable { f: &mut op, d: &mut val };
    let other_data = &mut vtable as *mut Vtable<_, _> as *mut c_void;

    h5call!(H5Literate(
        self.id(),
        H5_index_t::H5_INDEX_NAME,
        H5_iter_order_t::H5_ITER_INC,
        iter_pos,
        callback_fn,
        other_data
    ))?;
    Ok(val)
}

I have to think about it a bit more. There should be some way for F to take a reference, but we must ensure the lifetimes are only valid for a single iteration.
member_names is very simply expressed like this:

/// Returns names of all the members in the group, non-recursively.
pub fn member_names(&self) -> Result<Vec<String>> {
    let f = |_, name: *const c_char, _, names: &mut Vec<String>| {
        names.push(string_from_cstr(name));
        true
    };

    self.iterate(f, Vec::new())
}
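
The closure-to-C-callback trick used by iterate can be tried without HDF5 at all. The sketch below is a stdlib-only analogue under stated assumptions: `mock_c_iterate` is a made-up stand-in for a C routine such as H5Literate, `RawCallback` stands in for its callback type, and the items are plain integers instead of links. It keeps the same return convention as the code above (0 to continue, positive to stop, negative for a caught panic).

```rust
use std::os::raw::{c_int, c_void};
use std::panic::{self, AssertUnwindSafe};

/// Mock C-style callback type (assumption: stands in for `H5L_iterate_t`).
type RawCallback = extern "C" fn(item: c_int, op_data: *mut c_void) -> c_int;

/// Mock of a C iteration routine: calls `cb` once per item and stops at the
/// first non-zero return value, which it passes back to the caller.
fn mock_c_iterate(items: &[c_int], cb: RawCallback, op_data: *mut c_void) -> c_int {
    for &item in items {
        let ret = cb(item, op_data);
        if ret != 0 {
            return ret;
        }
    }
    0
}

/// Pairs the closure with its accumulator so both fit behind one pointer.
struct Vtable<'a, F, D> {
    f: &'a mut F,
    d: &'a mut D,
}

/// Monomorphized per closure type, so it can serve as a plain C callback.
extern "C" fn trampoline<F, D>(item: c_int, op_data: *mut c_void) -> c_int
where
    F: FnMut(c_int, &mut D) -> bool,
{
    // A panic must not unwind across the C boundary, so catch it here.
    let result = panic::catch_unwind(AssertUnwindSafe(|| {
        assert!(!op_data.is_null());
        // Safety: `op_data` was created from a live `Vtable<F, D>` in `iterate`.
        let vtable = unsafe { &mut *(op_data as *mut Vtable<'_, F, D>) };
        if (vtable.f)(item, vtable.d) { 0 } else { 1 }
    }));
    result.unwrap_or(-1)
}

/// Runs `op` over `items` through the C-style interface, threading `val`
/// through every call and returning it at the end.
pub fn iterate<F, D>(items: &[c_int], mut op: F, mut val: D) -> Result<D, String>
where
    F: FnMut(c_int, &mut D) -> bool,
{
    // The Vtable is a stack local that outlives the call below.
    let mut vtable = Vtable { f: &mut op, d: &mut val };
    let op_data = &mut vtable as *mut Vtable<'_, F, D> as *mut c_void;
    let ret = mock_c_iterate(items, trampoline::<F, D>, op_data);
    if ret < 0 {
        Err("callback panicked".to_string())
    } else {
        Ok(val)
    }
}
```

For example, `iterate(&[1, 2, 3, 4], |x, acc: &mut i32| { *acc += x; true }, 0)` sums the items, and a closure that returns `false` ends the loop early while still returning the accumulated value.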

@mulimoen
Collaborator

This was added in #157


3 participants