PandA dataset naming #330

coretl · 2024-05-23T10:04:12Z

At the moment the PandA HDF writer will always write datasets like "INENC1.VAL.Mean", "COUNTER1.OUT.Diff". These are exposed in the descriptor as "panda1-inenc-1-val-Mean" and "panda1-counter-1-out-Diff". These don't make any scientific sense, so we should allow scientific names to be given to the datasets and set the ophyd name accordingly.

The options:

Pass in HDFPanda(prefix, dp, dataset_names = {"inenc-1-val": "sample-x", "counter-1-out": "izero"})
Post init but pre-connect we get the blocks, but don't know how many there are, and things like "INENC" are difficult as they are target specific
Post init and post connect we get connected blocks and can do panda1.set_datsets({panda1.inenc[1].val: "sample-x"}) in the plan

The text was updated successfully, but these errors were encountered:

coretl · 2024-05-23T10:04:51Z

Decision is:

Keep type hinting, and have target specific flavours of HDF PandA
Drop type hinting, and keep one HDFPandA

coretl · 2024-05-23T11:01:10Z

@callumforrester has my favourite suggestion so far:

Make HDFPanda.dataset_names: SignalRW[dict[str, str]], e.g. {"inenc-1-val": "sample-x", "counter-1-out": "izero"}
Keep the ophyd names as they are
Load/save the dataset names with the other attributes
At DetectorWriter.open then for each signal_name in dataset_names.keys():
- Grab the value of signal_name + "_capture"
- If it is "No" then ignore it
- If it has a single value then give it the scientific name
- If it is "Min Max" then expose scientific_name + "-min" and scientific_name + "-max"
- If it is "Min Max Mean" then expose scientific_name + "-min", scientific_name + "-max" and scientific_name
We don't ever change the name of the signal, so bps.rd will still have panda1-inenc-1-val, this is because we would still have to bps.rd(panda1.enc[1].val) so would be surprising if it had a different name, plus it's difficult to get a soft signal to change the names of things outside itself without making spaghetti

evalott100 · 2024-05-29T15:07:05Z

As part of this issue, I'm going to move save and load to the ophyd level. It doesn't really make sense to me that they're currently plan stubs - they currently assume Locatable devices too.

I think Saveable and Loadable make sense as protocols.

callumforrester · 2024-05-30T06:50:55Z

The current save/load functions don't need any protocols, they just walk the device and set signals. It's only when we need some intelligence in a device (i.e. it knows the special way to save/load itself) that we need a protocol. Is there a use case for that here?

evalott100 · 2024-05-30T07:31:16Z

Hmmm true. I supposes I could do Device::load_device and Device::save_device

coretl · 2024-06-03T08:27:00Z

As part of this issue, I'm going to move save and load to the ophyd level. It doesn't really make sense to me that they're currently plan stubs - they currently assume Locatable devices too.

I think Saveable and Loadable make sense as protocols.

Please keep these as plan stubs, we considered new protocols, but we decided to add as few new protocols as possible. Either saving or loading will require application specific logic which is better placed in a plan

evalott100 · 2024-06-03T08:50:19Z

I'll be moving the current plan stubs from core

evalott100 · 2024-06-04T10:18:13Z

Yesterday @coretl and I agreed that the dataset names should be set through the IOC. This turns this into three Issues...

pandablocks-client: Enable passing in of alternative names for devices when writing.
pandablock-ioc: For each capture record, enable giving a scientifically relevant name which given to the client for writing. These names will be set through a new PV (e.g COUNTER1:VAL:DATASET).
ophyd-async: Rather than obtaining records marked for capture by looking for names with _capture on the end, just look for which capture records have a scientifically important name set.

Then down the line we can...

pandablocks-ioc: Create a new PVI table structure consisting the all positions capture table.
ophyd-async: Use the above instead of walking.

evalott100 · 2024-06-06T14:10:02Z

@callumforrester Instead of searching through device children for things ending in _capture,

ophyd-async/src/ophyd_async/panda/writers/_hdf_writer.py

Lines 38 to 54 in 0b05b9c

    
           def get_capture_signals( 
        
               block: Device, path_prefix: Optional[str] = "" 
        
           ) -> Dict[str, SignalR]: 
        
               """Get dict mapping a capture signal's name to the signal itself""" 
        
               if not path_prefix: 
        
                   path_prefix = "" 
        
               signals: Dict[str, SignalR[Any]] = {} 
        
               for attr_name, attr in block.children(): 
        
                   # Capture signals end in _capture, but num_capture is a red herring 
        
                   if attr_name == "num_capture": 
        
                       continue 
        
                   dot_path = f"{path_prefix}{attr_name}" 
        
                   if isinstance(attr, SignalR) and attr_name.endswith("_capture"): 
        
                       signals[dot_path] = attr 
        
                   attr_signals = get_capture_signals(attr, path_prefix=dot_path + ".") 
        
                   signals.update(attr_signals) 
        
               return signals

we can just look for devices with a DATASET field in their children:

            string rw PANDA1:COUNTER1:OUT:DATASET # will have a `dataset: SignalRW(str) attr

Additionally, putting to this will give it a different scientific name in the file output

callumforrester · 2024-06-06T14:32:01Z

@evalott100 so to clarify: There will be a new PV PANDA1:COUNTER1:OUT:DATASET, and ophyd-async will automatically make a new signal on the correct block called "dataset", so we are now just looking for signals called "dataset" and making a mapping of the dataset name (the value of the PV) to those signals' parents?

evalott100 · 2024-06-06T14:39:31Z

ophyd-async will automatically make a new signal on the correct block called "dataset", so we are now just looking for signals called "dataset"

Yep, once using PandABlocks/PandABlocks-client#91 and PandABlocks/PandABlocks-ioc#118

callumforrester · 2024-06-07T11:08:03Z

Based on discussion with @abbiemery @evalott100 and @coretl, we are going to slightly change the plan and push more logic down to the IOC. The previous plan has ophyd-async putting dataset names to the panda and then reconstructing a data structure for .describe() from the various signals. It is simpler to build that data structure once, in the IOC and export it via a PV. Especially because the relationships between datasets and their min/max datasets are not actually important to ophyd-async, it just needs to know what datasets the panda is writing to HDF5 (and their data types).

So the new plan:

pandablocks-client: Enable passing in of alternative names for devices when writing, no change from previous plan.
pandablock-ioc: For each capture record, still enable giving a scientifically relevant name which given to the client for writing. These names will be set through a new PV (e.g COUNTER1:VAL:DATASET). Also create a top-level PV called DATA:DATASETS that exports a table of dataset names and HDF5 datatypes.
ophyd-async: No longer needs much complex logic, just reads the table out of the DATASETS PV and converts to a dictionary in the describe document.

Most of the complex logic now goes in the IOC, it needs to produce the DATASETS table and keep it up-to-date. The table will probably look something like this:

dataset	hdf5_type
sample_x	uint
sample_y	float
sample_z_min	float
sample_z_max	float
sample_x	float

In this case, sample_x, sample_y and sample_z are labels set via the DATASET PV on particular signals. The sample_z signal's capture PV is Max Min Mean, so all three datasets are included. Other datasets are other capture signals.
If the capture signal for a dataset is No, it should not be included in the table at all. The table only shows datasets that will be captured if the panda is armed.

The table also has to be kept up-to-date live, i.e. if a new value is put to a CAPTURE or a DATASET PV, the DATASETS PV should update and produce a new table. This requires a bit of extra logic in the IOC. The panda can report datatypes that map to float and uint, exactly how and what the mapping is I'll leave to @coretl

coretl · 2024-06-07T11:26:49Z

The panda can report datatypes that map to float and uint

I suggest we use numpy datatypes, so float64 and uint32.

exactly how and what the mapping is I'll leave to @coretl
Looking at the code, timestamps are all now floating point, so the logic is:

ext_out bits is uint32
Everything else is float64

evalott100 self-assigned this May 29, 2024

evalott100 mentioned this issue May 31, 2024

Move panda save load into plan stubs #352

Open

This was referenced Jun 4, 2024

Allow scientific name for block in hdf writing PandABlocks/PandABlocks-client#90

Closed

add dataset records for capture PandABlocks/PandABlocks-ioc#117

Closed

coretl mentioned this issue Jun 5, 2024

allowed a dictionary of alternate dataset names to be given by the writer PandABlocks/PandABlocks-client#91

Merged

callumforrester mentioned this issue Jun 7, 2024

Derive PandA describe from datasets PV #369

Merged

callumforrester closed this as completed in #369 Jun 17, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

PandA dataset naming #330

PandA dataset naming #330

coretl commented May 23, 2024 •

edited

Loading

coretl commented May 23, 2024

coretl commented May 23, 2024

evalott100 commented May 29, 2024 •

edited

Loading

callumforrester commented May 30, 2024

evalott100 commented May 30, 2024

coretl commented Jun 3, 2024

evalott100 commented Jun 3, 2024

evalott100 commented Jun 4, 2024 •

edited

Loading

evalott100 commented Jun 6, 2024 •

edited

Loading

callumforrester commented Jun 6, 2024

evalott100 commented Jun 6, 2024 •

edited

Loading

callumforrester commented Jun 7, 2024

coretl commented Jun 7, 2024

PandA dataset naming #330

PandA dataset naming #330

Comments

coretl commented May 23, 2024 • edited Loading

coretl commented May 23, 2024

coretl commented May 23, 2024

evalott100 commented May 29, 2024 • edited Loading

callumforrester commented May 30, 2024

evalott100 commented May 30, 2024

coretl commented Jun 3, 2024

evalott100 commented Jun 3, 2024

evalott100 commented Jun 4, 2024 • edited Loading

evalott100 commented Jun 6, 2024 • edited Loading

callumforrester commented Jun 6, 2024

evalott100 commented Jun 6, 2024 • edited Loading

callumforrester commented Jun 7, 2024

coretl commented Jun 7, 2024

coretl commented May 23, 2024 •

edited

Loading

evalott100 commented May 29, 2024 •

edited

Loading

evalott100 commented Jun 4, 2024 •

edited

Loading

evalott100 commented Jun 6, 2024 •

edited

Loading

evalott100 commented Jun 6, 2024 •

edited

Loading