[WIP] Support distributed read for other acquisition files #30

tristpinsm · 2022-06-21T00:01:45Z

I made these changes in order to be able to read CalibrationGainData files into distributed containers.

tristpinsm · 2022-06-21T00:14:33Z

The other part of the Unicode index map fix is in radiocosmology/caput#201.

jrs65 · 2022-06-21T18:19:45Z

Ah interesting. I was facing similar issues with the HFB code. In the end I opted to go a slightly different route and write a draco-style container that could read the archived HFB code (chime-experiment/ch_pipeline#99). I figured I could probably do the same for many of the other formats too (e.g. CorrData).

I'll have a look at this later. It is a massive pain to need to be maintaining multiple implementations for distributed IO, so we should see if there are reasons to keep it separate, and if we can combine any of it together.

tristpinsm · 2022-06-21T18:39:19Z

> Ah interesting. I was facing similar issues with the HFB code. In the end I opted to go a slightly different route and write a draco-style container that could read the archived HFB code (chime-experiment/ch_pipeline#99). I figured I could probably do the same for many of the other formats too (e.g. CorrData).

Yeah I thought about taking this approach, but decided I didn't want to tackle changing the whole framework, especially because the CorrData code is so complicated. In the end doing it this way took more effort than I anticipated anyways...

> I'll have a look at this later. It is a massive pain to need to be maintaining multiple implementations for distributed IO, so we should see if there are reasons to keep it separate, and if we can combine any of it together.

I agree that it makes no sense the way the code is organised at the moment, and it would be much better if everything used a common approach. It might be painful to try and move things over seamlessly and support all the old formats, but just making containers for the current acquisition formats would be an improvement.

jrs65 · 2022-06-21T18:44:34Z

ch_util/andata.py

+            **kwargs,
+        )
+
+        # Datasets that we should convert into distributed ones


If this routine is moving to BaseData maybe this list should be a class attribute so it can easily be modified by derived classes.
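
A minimal sketch of the class-attribute idea, assuming toy stand-in classes rather than the real `andata` ones. The attribute name `_DIST_DSETS` matches the later commit message in this PR; the dataset names (`"vis"`, `"gain"`, `"weight"`) and the `distributed_datasets` helper are hypothetical and only there for illustration.

```python
class BaseData:
    """Toy stand-in for andata.BaseData (not the real class)."""

    # Datasets to convert into distributed ones on a distributed read.
    # Kept on the class so derived classes can override or extend it.
    # The dataset names here are hypothetical.
    _DIST_DSETS = ("vis", "gain")

    @classmethod
    def distributed_datasets(cls, available):
        """Return the datasets in `available` that this class would distribute."""
        return [name for name in available if name in cls._DIST_DSETS]


class CalibrationGainData(BaseData):
    # Extend the parent's list instead of editing the base routine.
    _DIST_DSETS = BaseData._DIST_DSETS + ("weight",)


print(BaseData.distributed_datasets(["vis", "weight", "flags"]))
print(CalibrationGainData.distributed_datasets(["vis", "weight", "flags"]))
```

With this layout the base routine never needs to know which subclass it is serving; it only looks up `cls._DIST_DSETS`.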

jrs65 · 2022-07-27T00:23:04Z

ch_util/andata.py


        Examples
        --------
        Examples are analogous to those of :meth:`CorrData.from_acq_h5`.

        """

+        if distributed:


I kind of think classes should need to opt in to distributed support. See the comments below for why.

jrs65 · 2022-07-27T00:24:53Z

ch_util/andata.py

+        datasets=None,
+        out_group=None,
+        **kwargs,
+    ):


I think for the sake of people maintaining the module there should be a bit more clarity about what these various base methods are for. One thing that would help would be calling this method something else, e.g. _from_acq_h5_single, or somesuch that makes it clearer how this differs. It might also be good to put some extensive comments explaining the relationship between everything.
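
As a rough illustration of the relationship being asked for, here is a sketch with a public entry point that only dispatches and two private read paths. `_from_acq_h5_single` is the name suggested above; `_from_acq_h5_distributed` and the return values are hypothetical, and no real file I/O is performed.

```python
class BaseData:
    """Toy stand-in; the real andata.BaseData reads HDF5 acquisition files."""

    @classmethod
    def from_acq_h5(cls, acq_files, distributed=False, **kwargs):
        """Public entry point: pick the read path, do no I/O itself."""
        if distributed:
            return cls._from_acq_h5_distributed(acq_files, **kwargs)
        return cls._from_acq_h5_single(acq_files, **kwargs)

    @classmethod
    def _from_acq_h5_single(cls, acq_files, **kwargs):
        """Read the whole selection in a single process."""
        return ("single", list(acq_files))

    @classmethod
    def _from_acq_h5_distributed(cls, acq_files, **kwargs):
        """Split the read across processes (hypothetical helper name)."""
        return ("distributed", list(acq_files))


print(BaseData.from_acq_h5(["a.h5", "b.h5"]))
print(BaseData.from_acq_h5(["a.h5", "b.h5"], distributed=True))
```

Keeping the dispatch in one place makes it easier to document, in a single docstring, which method a subclass should override.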

jrs65 · 2022-07-27T00:35:49Z

ch_util/andata.py

+            comm = MPI.COMM_WORLD
+
+        # Determine the total number of frequencies
+        nfreq = None


This seems problematic as not all datasets have a frequency axis. For example, I think WeatherData.from_acq_h5(..., distributed=True) would crash, complaining about the absence of a frequency axis.

One option would be to allow subclasses to name an axis to distribute the read over. If it's present this routine uses it; if it's not present then a distributed read isn't possible and it exits gracefully.
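
A sketch of that option, assuming toy classes and a hypothetical attribute name `_dist_axis` (the PR does not state the actual name here). The rank and size are passed in as plain integers so the example runs without MPI; in practice they would come from the communicator.

```python
class BaseData:
    """Toy stand-in for andata.BaseData (not the real class)."""

    # Opt-in: subclasses name the axis to distribute over (hypothetical name).
    _dist_axis = None

    @classmethod
    def local_selection(cls, axis_lengths, rank, size):
        """Return (axis, slice) describing the part of the axis this rank reads."""
        axis = cls._dist_axis
        if axis is None or axis not in axis_lengths:
            # Fail with a clear message instead of a missing-axis crash.
            raise ValueError(
                f"{cls.__name__} does not support distributed reads."
            )
        n = axis_lengths[axis]
        # Even split; the first `extra` ranks take one additional element.
        base, extra = divmod(n, size)
        start = rank * base + min(rank, extra)
        stop = start + base + (1 if rank < extra else 0)
        return axis, slice(start, stop)


class CorrData(BaseData):
    _dist_axis = "freq"


class WeatherData(BaseData):
    pass  # no distributed axis named, so distributed reads are refused


for rank in range(4):
    print(rank, CorrData.local_selection({"freq": 10, "time": 5}, rank, 4))

try:
    WeatherData.local_selection({"time": 5}, 0, 4)
except ValueError as err:
    print(err)
```

Classes that never set the attribute keep working for ordinary reads and only raise when a distributed read is explicitly requested.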

tristpinsm · 2022-08-31T00:45:33Z

ok, I think I've addressed your comments @jrs65 (thanks!). It feels like making this work is a hack, so I'm not sure what the right path forward is. I guess getting all of the CorrData code integrated with draco will not be trivial, but probably moving the other containers in here to that model would be straightforward. And this PR only addresses the latter...

tristpinsm · 2023-05-04T18:14:30Z

It's not clear this is the way forward to enable distributed reading of other acquisition files. There is a similar problem with the HFB files, so we can shelve this for now, see how the HFB approach pans out, and consider generalising it to all non-visibility acquisition files.

tristpinsm force-pushed the tpm/dist_read branch from b4e539b to 67d2cd5 Compare June 21, 2022 00:05

tristpinsm added 2 commits June 20, 2022 17:11

fix(andata.BaseData): Also convert strings in index_map.

b0822d5

refactor(andata.BaseData): Move distributed code to BaseData.

da02011

tristpinsm force-pushed the tpm/dist_read branch from 67d2cd5 to da02011 Compare June 21, 2022 00:11

tristpinsm requested a review from jrs65 June 21, 2022 00:14

jrs65 requested changes Jul 27, 2022

feat(andata): Specify distributed axis as a container attribute.

55833be

tristpinsm force-pushed the tpm/dist_read branch 2 times, most recently from f0b982c to 4f94fec Compare August 31, 2022 00:07

tristpinsm added 6 commits August 30, 2022 17:40

feat(andata): Fail if no distributed axis.

4e901fc

fix(andata): Rename to _from_acq_h5_single.

74c48dd

feat(andata): Allow other distributed axes in addition to freq.

dda773f

doc(andata): Add some docstrings to the _from_acq_h5 methods.

35d4512

fix(andata.BaseData): Make the _DIST_DSETS list a property of the class.

66aa690

fix(andata): Fix axis selections to work with CorrData.

a97625c

tristpinsm force-pushed the tpm/dist_read branch from 4f94fec to a97625c Compare August 31, 2022 00:40

tristpinsm changed the title ~~Support distributed read for other acquisition files~~ [WIP] Support distributed read for other acquisition files May 4, 2023
