ASDF block API #1871

Open
braingram opened this issue Nov 26, 2024 Discussed in #1847 · 0 comments

Converting to an issue to add a milestone.

Discussed in #1847

Originally posted by braingram October 10, 2024
Currently asdf doesn't provide any public API for interacting with ASDF blocks read from an ASDF file. There are some situations where this might be useful. One example is loading a block into a portion of a large array (without first loading the block into intermediate memory). Something like:

import asdf
import numpy as np

my_big_array = np.zeros((100, 2000, 2000))
for i, asdf_fn in enumerate(asdf_fns):  # assume len(asdf_fns) == 100
    with asdf.open(asdf_fn, lazy_load=True) as af:
        # let's assume af contains a single 2k x 2k block
        # af.blocks and load(out=...) are the proposed API, they don't exist yet
        af.blocks[0].load(out=my_big_array[i])
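
To make the "no intermediate memory" part concrete, here is a minimal sketch of what a `load(out=...)` could do internally. It is not asdf code: `read_into` is a hypothetical helper, and it assumes the block is uncompressed, that the data offset is already known, and that `out` is C-contiguous (true for `my_big_array[i]` above). The bytes go straight from the file object into the target slice via `readinto`.

```python
import io

import numpy as np


def read_into(fd, data_offset, out):
    """Hypothetical helper: fill `out` with raw bytes read from `fd`.

    Assumes an uncompressed payload starting at `data_offset` and a
    C-contiguous, writable `out` array.
    """
    if not out.flags.c_contiguous:
        raise ValueError("out must be C-contiguous")
    fd.seek(data_offset)
    # View the array's memory as flat bytes so readinto can write into it.
    buf = memoryview(out).cast("B")
    n_read = fd.readinto(buf)
    if n_read != out.nbytes:
        raise OSError(f"expected {out.nbytes} bytes, read {n_read}")
    return out


# Stand-in for a file: 16 bytes of fake header followed by the payload.
payload = np.arange(6, dtype=np.float64).reshape(2, 3)
fd = io.BytesIO(b"\x00" * 16 + payload.tobytes())

big = np.zeros((4, 2, 3))
read_into(fd, 16, big[1])  # fills only big[1], no temporary array
```

A compressed block could not be handled this way without a decompression step, which is one of the things a real implementation would need to decide.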

I propose that we restrict this to "read-only" access at the moment (so AsdfFile.blocks will only be defined after a call to asdf.open(fn) and the block order and contents won't be modifiable). Minimally I propose that the API allow:

  • finding the number of blocks in a file: len(af.blocks)
  • access by index: blk = af.blocks[0]
  • an ASDFBlock class (name tbd) that allows:
    • getting the block header (perhaps as a frozen dataclass): blk.header
    • getting the block header offset: blk.header_offset
    • getting the block data offset: blk.data_offset
    • getting the block data: blk.load()
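
As a starting point for discussion, the list above could look roughly like the sketch below. Everything here is an assumption rather than existing asdf API: the class names `BlockHeader`, `ASDFBlock`, and `BlockList` are placeholders, the header fields are an illustrative subset, and `load()` simply returns the raw bytes of an uncompressed block.

```python
import io
from collections.abc import Sequence
from dataclasses import dataclass


@dataclass(frozen=True)
class BlockHeader:
    # Illustrative subset of header fields (assumed, not the final set).
    flags: int
    compression: bytes
    allocated_size: int
    used_size: int
    data_size: int


class ASDFBlock:
    """Read-only view of one block (name tbd)."""

    def __init__(self, fd, header, header_offset, data_offset):
        self._fd = fd
        self._header = header
        self._header_offset = header_offset
        self._data_offset = data_offset

    # Properties without setters keep the attributes read-only.
    @property
    def header(self):
        return self._header

    @property
    def header_offset(self):
        return self._header_offset

    @property
    def data_offset(self):
        return self._data_offset

    def load(self):
        # Sketch only: returns raw bytes and ignores compression.
        self._fd.seek(self._data_offset)
        return self._fd.read(self._header.used_size)


class BlockList(Sequence):
    """Immutable container: supports len() and indexing, no assignment."""

    def __init__(self, blocks):
        self._blocks = tuple(blocks)

    def __len__(self):
        return len(self._blocks)

    def __getitem__(self, index):
        return self._blocks[index]


# Fake file layout: 8 placeholder bytes, b"abcd", 4 placeholder bytes, b"wxyz!"
fd = io.BytesIO(b"\x00" * 8 + b"abcd" + b"\x00" * 4 + b"wxyz!")
blocks = BlockList([
    ASDFBlock(fd, BlockHeader(0, b"", 4, 4, 4), header_offset=0, data_offset=8),
    ASDFBlock(fd, BlockHeader(0, b"", 5, 5, 5), header_offset=12, data_offset=16),
])
blk = blocks[0]
```

Subclassing `Sequence` without defining `__setitem__` is one way to get the "order and contents won't be modifiable" behavior; a frozen dataclass does the same for the header.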

Some items to discuss are:

  • What read-only block information might be useful?
  • What are some use cases for this new API?
  • What are the pros and cons of having blk.load seed cached_data for the non-public block API?

braingram added this to the 4.x.x milestone Nov 26, 2024