Include chunk shape as a parameter in stream resource for HDF dataset #544

Merged: 32 commits, Sep 19, 2024

Changes from 21 commits

Commits (32)
2202413
Adding record for num frames in chunk along with chunk_size field in …
jwlodek Aug 28, 2024
02332c3
Attributes are saved all in a single chunk
jwlodek Aug 28, 2024
561da46
Update tests to account for chunk_size datakey parameter
jwlodek Aug 28, 2024
edc8f1c
Chunk size should be in sres not desc
jwlodek Aug 28, 2024
3cfb4e7
Move chunk size to sres parameters
jwlodek Aug 28, 2024
8117e4b
Refactor tests to reflect changes
jwlodek Aug 28, 2024
3313d1a
Merge branch 'main' of https://github.com/bluesky/ophyd-async into ch…
jwlodek Sep 4, 2024
ada0b15
chunk size can be int or none
jwlodek Sep 4, 2024
a835183
Update chunk size signal to non-zero in one of the AD test sets
jwlodek Sep 4, 2024
31dfcb1
Merge branch 'main' of https://github.com/bluesky/ophyd-async into ch…
jwlodek Sep 5, 2024
54e42f0
Use correct chunk size for PandA, make sure we use chunk size auto
jwlodek Sep 5, 2024
ad8b63f
Add comment on chunk size
jwlodek Sep 5, 2024
9856e2e
Make chunk_size a tuple that explicitly describes all chunk dims
jwlodek Sep 5, 2024
386e61b
Make sure chunk size is tuple even with one dim, update tests, simpli…
jwlodek Sep 9, 2024
415c398
Make chunk_size always tuple of int, default to empty tuple
jwlodek Sep 9, 2024
deec8d1
Merge branch 'main' into chunk-size-in-sres
jwlodek Sep 9, 2024
5ad4b7a
Use readback value to avoid disconnect between actual value and signa…
jwlodek Sep 9, 2024
09b51b3
Merge branch 'chunk-size-in-sres' of https://github.com/jwlodek/ophyd…
jwlodek Sep 9, 2024
578969c
Follow import convention for tests
jwlodek Sep 10, 2024
00275f3
Make use of slicing for detector name in ad_standard_det_factory clearer
jwlodek Sep 10, 2024
025ac06
Merge branch 'main' into chunk-size-in-sres
jwlodek Sep 10, 2024
1600a2c
Rename chunk size to chunk shape
jwlodek Sep 16, 2024
4a76859
Merge branch 'main' into chunk-size-in-sres
jwlodek Sep 16, 2024
f105e34
Add space for linting
jwlodek Sep 16, 2024
395b58b
Merge branch 'chunk-size-in-sres' of https://github.com/jwlodek/ophyd…
jwlodek Sep 16, 2024
3821fc6
Merge branch 'main' of https://github.com/bluesky/ophyd-async into ch…
jwlodek Sep 17, 2024
398ac01
Fix test
jwlodek Sep 17, 2024
9e28151
Merge with upstream
jwlodek Sep 18, 2024
0bd5ba2
Fix merge conflict
jwlodek Sep 18, 2024
4d01dae
Simplifying ad standard det factory fixture
jwlodek Sep 18, 2024
4f15ac1
Fix unawaited task issue
jwlodek Sep 18, 2024
4d4e12e
kinetix fixture doesn't need to be async
jwlodek Sep 18, 2024
5 changes: 5 additions & 0 deletions src/ophyd_async/core/_hdf_dataset.py
@@ -19,6 +19,10 @@ class HDFDataset:
dtype_numpy: str = ""
multiplier: int = 1
swmr: bool = False
# Represents explicit chunk size written to disk.
# The first dimension represents the number of frames/points
# per equally-sized chunk in the first chunking dimension.

Collaborator:

I don't think the second half of this is actually always true; we happen to make this so for areaDetector, but in the general case the chunk size is just that, the size of the HDF chunk. This may or may not correspond to a multiple of images: it might be half an image in one direction and 15 in the other, but its magnitude is in the base datatype of the dataset.

I wonder if chunk_shape might be a better name to make people think of shape in the output of describe()?
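
To illustrate the point (a hypothetical h5py sketch, not code from this PR): an HDF5 chunk shape is given in elements of the base datatype and need not align with whole frames.

    # Hypothetical sketch, not from this PR: chunks are specified in
    # elements of the base datatype and need not align with whole frames.
    import h5py
    import numpy as np

    with h5py.File("example.h5", "w") as f:
        ds = f.create_dataset(
            "data",
            shape=(100, 1024, 1024),   # 100 frames of 1024 x 1024
            dtype=np.uint16,
            chunks=(15, 512, 1024),    # 15 frames deep, half a frame in y
        )
        print(ds.chunks)               # (15, 512, 1024)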

Contributor:

I actually had the same thought yesterday about calling it chunk_shape. If it's not too late, I'd +1 to that.

Member (Author):

No preference from me about chunk_shape over chunk_size; perhaps let's discuss tomorrow at the sprint catchup and settle on a naming scheme.

chunk_size: tuple[int, ...] = ()


SLICE_NAME = "AD_HDF5_SWMR_SLICE"
@@ -65,6 +69,7 @@ def __init__(
"dataset": ds.dataset,
"swmr": ds.swmr,
"multiplier": ds.multiplier,
"chunk_size": ds.chunk_size,
},
uid=None,
validate=True,
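
For context, a minimal sketch of how a downstream consumer might use the new stream-resource parameter when opening the file. The helper open_dataset is hypothetical and not part of this PR; h5py is assumed.

    # Hypothetical helper, not part of this PR: open the dataset named by
    # a stream resource and compare the recorded chunking to the file.
    import h5py

    def open_dataset(path: str, parameters: dict):
        f = h5py.File(path, "r", swmr=parameters.get("swmr", False))
        ds = f[parameters["dataset"]]
        expected = tuple(parameters.get("chunk_size", ()))  # () means unknown
        if expected and ds.chunks != expected:
            print(f"expected chunks {expected}, file reports {ds.chunks}")
        return ds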
2 changes: 2 additions & 0 deletions src/ophyd_async/epics/adcore/_core_io.py
@@ -135,4 +135,6 @@ def __init__(self, prefix: str, name="") -> None:
self.array_size0 = epics_signal_r(int, prefix + "ArraySize0")
self.array_size1 = epics_signal_r(int, prefix + "ArraySize1")
self.create_directory = epics_signal_rw(int, prefix + "CreateDirectory")
self.num_frames_chunks = epics_signal_r(int, prefix + "NumFramesChunks_RBV")
self.chunk_size_auto = epics_signal_rw(bool, prefix + "ChunkSizeAuto")
super().__init__(prefix, name)
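
These two records come from the areaDetector HDF5 plugin: ChunkSizeAuto asks the plugin to pick the chunking, and NumFramesChunks_RBV reads back how many frames each chunk holds. A minimal sketch of exercising them, assuming a connected NDFileHDFIO instance named hdf:

    # Minimal sketch, assuming a connected NDFileHDFIO instance `hdf`.
    async def report_chunking(hdf) -> int:
        await hdf.chunk_size_auto.set(True)               # plugin picks chunking
        frames = await hdf.num_frames_chunks.get_value()  # readback value
        print(f"{frames} frame(s) per HDF chunk")
        return frames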
10 changes: 10 additions & 0 deletions src/ophyd_async/epics/adcore/_hdf_writer.py
@@ -55,6 +55,9 @@ async def open(self, multiplier: int = 1) -> Dict[str, DataKey]:
# when directory path PV is processed.
await self.hdf.create_directory.set(info.create_dir_depth)

# Make sure we are using chunk auto-sizing
await self.hdf.chunk_size_auto.set(True)

await asyncio.gather(
self.hdf.num_extra_dims.set(0),
self.hdf.lazy_open.set(True),
@@ -83,6 +86,9 @@ async def open(self, multiplier: int = 1) -> Dict[str, DataKey]:
self._multiplier = multiplier
outer_shape = (multiplier,) if multiplier > 1 else ()

# Determine number of frames that will be saved per HDF chunk
frames_per_chunk = await self.hdf.num_frames_chunks.get_value()

# Add the main data
self._datasets = [
HDFDataset(
@@ -91,6 +97,7 @@
shape=detector_shape,
dtype_numpy=np_dtype,
multiplier=multiplier,
chunk_size=(frames_per_chunk, *detector_shape),
)
]
# And all the scalar datasets
@@ -117,6 +124,9 @@
(),
np_datatype,
multiplier,
# NDAttributes appear to always be configured with
# this chunk size
chunk_size=(16384,),
)
)

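
Putting the two settings together: the chunk shape recorded for the main dataset is the readback frame count prepended to the detector shape, while scalar NDAttribute datasets get a fixed (16384,). A sketch with assumed values:

    # Sketch with assumed values: a 2-D detector, NumFramesChunks_RBV == 4.
    detector_shape = (1216, 1936)
    frames_per_chunk = 4                 # from hdf.num_frames_chunks
    chunk_size = (frames_per_chunk, *detector_shape)
    print(chunk_size)                    # (4, 1216, 1936)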
6 changes: 5 additions & 1 deletion src/ophyd_async/fastcs/panda/_writer.py
@@ -102,7 +102,11 @@ async def _update_datasets(self) -> None:

capture_table = await self.panda_data_block.datasets.get_value()
self._datasets = [
HDFDataset(dataset_name, "/" + dataset_name, [1], multiplier=1)
# TODO: Update chunk size to read signal once available in IOC
# Currently PandA IOC sets chunk size to 1024 points per chunk
Comment on lines +108 to +109

Member:

Please make it an issue and link here.

HDFDataset(
dataset_name, "/" + dataset_name, [1], multiplier=1, chunk_size=(1024,)
)
for dataset_name in capture_table["name"]
]

27 changes: 11 additions & 16 deletions tests/epics/adaravis/test_aravis.py
@@ -1,11 +1,9 @@
import re

import pytest
from bluesky.run_engine import RunEngine

from ophyd_async.core import (
DetectorTrigger,
DeviceCollector,
PathProvider,
TriggerInfo,
set_mock_value,
@@ -14,12 +12,8 @@


@pytest.fixture
async def test_adaravis(
RE: RunEngine,
static_path_provider: PathProvider,
) -> adaravis.AravisDetector:
async with DeviceCollector(mock=True):
test_adaravis = adaravis.AravisDetector("ADARAVIS:", static_path_provider)
async def test_adaravis(ad_standard_det_factory) -> adaravis.AravisDetector:
test_adaravis = await ad_standard_det_factory(adaravis.AravisDetector)

return test_adaravis

@@ -72,7 +66,7 @@ def test_gpio_pin_limited(test_adaravis: adaravis.AravisDetector):


async def test_hints_from_hdf_writer(test_adaravis: adaravis.AravisDetector):
assert test_adaravis.hints == {"fields": ["test_adaravis"]}
assert test_adaravis.hints == {"fields": ["test_adaravis1"]}


async def test_can_read(test_adaravis: adaravis.AravisDetector):
@@ -90,9 +84,9 @@ async def test_decribe_describes_writer_dataset(
await test_adaravis.stage()
await test_adaravis.prepare(one_shot_trigger_info)
assert await test_adaravis.describe() == {
"test_adaravis": {
"source": "mock+ca://ADARAVIS:HDF1:FullFileName_RBV",
"shape": (0, 0),
"test_adaravis1": {
"source": "mock+ca://ARAVIS1:HDF1:FullFileName_RBV",
"shape": (10, 10),
"dtype": "array",
"dtype_numpy": "|i1",
"external": "STREAM:",
@@ -117,12 +111,13 @@ async def test_can_collect(
assert docs[0][0] == "stream_resource"
stream_resource = docs[0][1]
sr_uid = stream_resource["uid"]
assert stream_resource["data_key"] == "test_adaravis"
assert stream_resource["data_key"] == "test_adaravis1"
assert stream_resource["uri"] == "file://localhost" + str(full_file_name)
assert stream_resource["parameters"] == {
"dataset": "/entry/data/data",
"swmr": False,
"multiplier": 1,
"chunk_size": (1, 10, 10),
}
assert docs[1][0] == "stream_datum"
stream_datum = docs[1][1]
@@ -140,9 +135,9 @@ async def test_can_decribe_collect(
await test_adaravis.stage()
await test_adaravis.prepare(one_shot_trigger_info)
assert (await test_adaravis.describe_collect()) == {
"test_adaravis": {
"source": "mock+ca://ADARAVIS:HDF1:FullFileName_RBV",
"shape": (0, 0),
"test_adaravis1": {
"source": "mock+ca://ARAVIS1:HDF1:FullFileName_RBV",
"shape": (10, 10),
"dtype": "array",
"dtype_numpy": "|i1",
"external": "STREAM:",
3 changes: 3 additions & 0 deletions tests/epics/adcore/test_writers.py
@@ -47,6 +47,9 @@ async def hdf_writer_with_stats(
hdf = adcore.NDFileHDFIO("HDF:")
stats = adcore.NDPluginStatsIO("FOO:")

# Set number of frames per chunk to something reasonable
set_mock_value(hdf.num_frames_chunks, 2)

return adcore.ADHDFWriter(
hdf,
static_path_provider,
30 changes: 12 additions & 18 deletions tests/epics/adkinetix/test_kinetix.py
@@ -1,9 +1,7 @@
import pytest
from bluesky.run_engine import RunEngine

from ophyd_async.core import (
DetectorTrigger,
DeviceCollector,
StaticPathProvider,
set_mock_value,
)
@@ -12,14 +10,9 @@


@pytest.fixture
async def test_adkinetix(
RE: RunEngine,
static_path_provider: StaticPathProvider,
) -> adkinetix.KinetixDetector:
async with DeviceCollector(mock=True):
test_adkinetix = adkinetix.KinetixDetector("KINETIX:", static_path_provider)

return test_adkinetix
async def test_adkinetix(ad_standard_det_factory):
kinetix = await ad_standard_det_factory(adkinetix.KinetixDetector)
return kinetix


async def test_get_deadtime(
@@ -54,7 +47,7 @@ async def setup_trigger_mode(trig_mode: DetectorTrigger):


async def test_hints_from_hdf_writer(test_adkinetix: adkinetix.KinetixDetector):
assert test_adkinetix.hints == {"fields": ["test_adkinetix"]}
assert test_adkinetix.hints == {"fields": ["test_adkinetix1"]}


async def test_can_read(test_adkinetix: adkinetix.KinetixDetector):
@@ -72,9 +65,9 @@ async def test_decribe_describes_writer_dataset(
await test_adkinetix.stage()
await test_adkinetix.prepare(one_shot_trigger_info)
assert await test_adkinetix.describe() == {
"test_adkinetix": {
"source": "mock+ca://KINETIX:HDF1:FullFileName_RBV",
"shape": (0, 0),
"test_adkinetix1": {
"source": "mock+ca://KINETIX1:HDF1:FullFileName_RBV",
"shape": (10, 10),
"dtype": "array",
"dtype_numpy": "|i1",
"external": "STREAM:",
@@ -99,12 +92,13 @@ async def test_can_collect(
assert docs[0][0] == "stream_resource"
stream_resource = docs[0][1]
sr_uid = stream_resource["uid"]
assert stream_resource["data_key"] == "test_adkinetix"
assert stream_resource["data_key"] == "test_adkinetix1"
assert stream_resource["uri"] == "file://localhost" + str(full_file_name)
assert stream_resource["parameters"] == {
"dataset": "/entry/data/data",
"swmr": False,
"multiplier": 1,
"chunk_size": (1, 10, 10),
}
assert docs[1][0] == "stream_datum"
stream_datum = docs[1][1]
@@ -122,9 +116,9 @@ async def test_can_decribe_collect(
await test_adkinetix.stage()
await test_adkinetix.prepare(one_shot_trigger_info)
assert (await test_adkinetix.describe_collect()) == {
"test_adkinetix": {
"source": "mock+ca://KINETIX:HDF1:FullFileName_RBV",
"shape": (0, 0),
"test_adkinetix1": {
"source": "mock+ca://KINETIX1:HDF1:FullFileName_RBV",
"shape": (10, 10),
"dtype": "array",
"dtype_numpy": "|i1",
"external": "STREAM:",
27 changes: 11 additions & 16 deletions tests/epics/advimba/test_vimba.py
@@ -1,9 +1,7 @@
import pytest
from bluesky.run_engine import RunEngine

from ophyd_async.core import (
DetectorTrigger,
DeviceCollector,
PathProvider,
set_mock_value,
)
@@ -12,12 +10,8 @@


@pytest.fixture
async def test_advimba(
RE: RunEngine,
static_path_provider: PathProvider,
) -> advimba.VimbaDetector:
async with DeviceCollector(mock=True):
test_advimba = advimba.VimbaDetector("VIMBA:", static_path_provider)
async def test_advimba(ad_standard_det_factory) -> advimba.VimbaDetector:
test_advimba = await ad_standard_det_factory(advimba.VimbaDetector)

return test_advimba

@@ -66,7 +60,7 @@ async def setup_trigger_mode(trig_mode: DetectorTrigger):


async def test_hints_from_hdf_writer(test_advimba: advimba.VimbaDetector):
assert test_advimba.hints == {"fields": ["test_advimba"]}
assert test_advimba.hints == {"fields": ["test_advimba1"]}


async def test_can_read(test_advimba: advimba.VimbaDetector):
@@ -84,9 +78,9 @@ async def test_decribe_describes_writer_dataset(
await test_advimba.stage()
await test_advimba.prepare(one_shot_trigger_info)
assert await test_advimba.describe() == {
"test_advimba": {
"source": "mock+ca://VIMBA:HDF1:FullFileName_RBV",
"shape": (0, 0),
"test_advimba1": {
"source": "mock+ca://VIMBA1:HDF1:FullFileName_RBV",
"shape": (10, 10),
"dtype": "array",
"dtype_numpy": "|i1",
"external": "STREAM:",
@@ -111,12 +105,13 @@ async def test_can_collect(
assert docs[0][0] == "stream_resource"
stream_resource = docs[0][1]
sr_uid = stream_resource["uid"]
assert stream_resource["data_key"] == "test_advimba"
assert stream_resource["data_key"] == "test_advimba1"
assert stream_resource["uri"] == "file://localhost" + str(full_file_name)
assert stream_resource["parameters"] == {
"dataset": "/entry/data/data",
"swmr": False,
"multiplier": 1,
"chunk_size": (1, 10, 10),
}
assert docs[1][0] == "stream_datum"
stream_datum = docs[1][1]
@@ -134,9 +129,9 @@ async def test_can_decribe_collect(
await test_advimba.stage()
await test_advimba.prepare(one_shot_trigger_info)
assert (await test_advimba.describe_collect()) == {
"test_advimba": {
"source": "mock+ca://VIMBA:HDF1:FullFileName_RBV",
"shape": (0, 0),
"test_advimba1": {
"source": "mock+ca://VIMBA1:HDF1:FullFileName_RBV",
"shape": (10, 10),
"dtype": "array",
"dtype_numpy": "|i1",
"external": "STREAM:",
34 changes: 34 additions & 0 deletions tests/epics/conftest.py
@@ -0,0 +1,34 @@
import pytest
from bluesky.run_engine import RunEngine

from ophyd_async.core._detector import StandardDetector
from ophyd_async.core._device import DeviceCollector
from ophyd_async.core._mock_signal_utils import set_mock_value


@pytest.fixture
async def ad_standard_det_factory(
RE: RunEngine,
static_path_provider,
) -> StandardDetector:
async def generate_ad_standard_det(ad_standard_detector_class, number=1):
# Dynamically generate a name based on the class of detector
detector_name = ad_standard_detector_class.__name__
if detector_name.endswith("Detector"):
detector_name = detector_name[:-len("Detector")]

async with DeviceCollector(mock=True):
test_adstandard_det = ad_standard_detector_class(
f"{detector_name.upper()}{number}:",
static_path_provider,
name=f"test_ad{detector_name.lower()}{number}",
)

# Set number of frames per chunk and frame dimensions to something reasonable
set_mock_value(test_adstandard_det.hdf.num_frames_chunks, 1)
set_mock_value(test_adstandard_det.drv.array_size_x, 10)
set_mock_value(test_adstandard_det.drv.array_size_y, 10)

return test_adstandard_det

return generate_ad_standard_det
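
For reference, a sketch of how the updated test modules consume this factory (mirroring the fixtures above; the import path is an assumption):

    # Sketch of a test module using the factory fixture defined above.
    import pytest

    from ophyd_async.epics import advimba  # assumed import path


    @pytest.fixture
    async def test_advimba(ad_standard_det_factory) -> advimba.VimbaDetector:
        # Mock VimbaDetector with prefix "VIMBA1:", name "test_advimba1",
        # and chunk/frame mock values preset by the factory.
        return await ad_standard_det_factory(advimba.VimbaDetector)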
1 change: 1 addition & 0 deletions tests/fastcs/panda/test_hdf_panda.py
@@ -171,6 +171,7 @@ def assert_resource_document():
"dataset": f"/{dataset_name}",
"swmr": False,
"multiplier": 1,
"chunk_size": (1024,),
},
}
assert "test-panda.h5" in stream_resource["uri"]
7 changes: 6 additions & 1 deletion tests/fastcs/panda/test_writer.py
@@ -208,7 +212,12 @@ def assert_resource_document(name, resource_doc):
"data_key": name,
"mimetype": "application/x-hdf5",
"uri": "file://localhost" + str(tmp_path / "mock_panda" / "data.h5"),
"parameters": {"dataset": f"/{name}", "swmr": False, "multiplier": 1},
"parameters": {
"dataset": f"/{name}",
"swmr": False,
"multiplier": 1,
"chunk_size": (1024,),
},
}
assert "mock_panda/data.h5" in resource_doc["uri"]
