Hextof lab loader #534

zain-sohail · 2024-12-19T12:45:59Z

This PR adds the lab loader requested in #503. I tried to keep changes to the FlashLoader minimal. The only major addition is the loader-specific dataframe class; everything else stays approximately the same. Lab data therefore works with the flash loader, but with the beamline config set to cfel.

An example config is provided to make this work. Since I moved some hardcoded parameters (previously marked as a TODO) into the config, I updated the config model slightly.
Test data for this loading configuration still needs to be set up. I ask @kutnyakhov to provide a public file for this. I am not sure whether a tutorial is necessary.

zain-sohail · 2024-12-19T15:50:55Z

src/sed/loader/flash/buffer_handler.py

+        else:
+            raise ValueError(f"Unsupported core beamline: {core_beamline}")
+
+    def _validate_h5_files(self, config, h5_paths: list[Path]) -> list[Path]:


This validation was previously in BufferFilePaths, and we had discussed moving it out of there. I find this location better (the move was also necessary due to the restructuring).

zain-sohail · 2024-12-19T15:52:06Z

src/sed/loader/flash/utils.py

-# TODO: move to config
-MULTI_INDEX = ["trainId", "pulseId", "electronId"]
-PULSE_ALIAS = MULTI_INDEX[1]
-FORMATS = ["per_electron", "per_pulse", "per_train"]


These have now been moved to the config and the config model.
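
As a rough illustration of what moving these constants into the config can look like, the sketch below reads them from a plain dictionary and falls back to the old hardcoded values. The key names `index` and `formats`, the class `DataframeSettings`, and the helper `dataframe_settings` are assumptions made for this example (only `index` appears in the test config); the actual config model in this PR may be structured differently.

```python
from __future__ import annotations

from dataclasses import dataclass, field

# The former module-level constants, kept here only as fallback defaults.
DEFAULT_INDEX = ["trainId", "pulseId", "electronId"]
DEFAULT_FORMATS = ["per_electron", "per_pulse", "per_train"]


@dataclass
class DataframeSettings:
    """Hypothetical container for values that used to be hardcoded."""

    index: list[str] = field(default_factory=lambda: list(DEFAULT_INDEX))
    formats: list[str] = field(default_factory=lambda: list(DEFAULT_FORMATS))

    @property
    def pulse_alias(self) -> str:
        # Mirrors the old PULSE_ALIAS = MULTI_INDEX[1].
        return self.index[1]


def dataframe_settings(config: dict) -> DataframeSettings:
    """Build settings from a config dict, using defaults for missing keys."""
    return DataframeSettings(
        index=list(config.get("index", DEFAULT_INDEX)),
        formats=list(config.get("formats", DEFAULT_FORMATS)),
    )


settings = dataframe_settings({"index": ["trainId", "pulseId", "electronId"]})
print(settings.pulse_alias)
```

Deriving the pulse alias from the index keeps the two values from drifting apart if a config overrides the index names.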

coveralls · 2025-01-16T15:17:17Z

Pull Request Test Coverage Report for Build 13419398366

Details

71 of 124 (57.26%) changed or added relevant lines in 7 files are covered.
3 unchanged lines in 1 file lost coverage.
Overall coverage decreased (-0.6%) to 91.6%

Changes Missing Coverage	Covered Lines	Changed/Added Lines	%
src/sed/loader/flash/loader.py	3	4	75.0%
src/sed/loader/flash/utils.py	9	10	90.0%
src/sed/loader/flash/buffer_handler.py	30	34	88.24%
src/sed/loader/flash/dataframe.py	22	69	31.88%

Files with Coverage Reduction	New Missed Lines	%
src/sed/binning/numba_bin.py	3	87.62%

Totals
Change from base Build 13167417292:	-0.6%
Covered Lines:	7731
Relevant Lines:	8440

💛 - Coveralls

Copilot

Copilot reviewed 10 out of 11 changed files in this pull request and generated no comments.

Files not reviewed (1)

.cspell/custom-dictionary.txt: Language not supported

Comments suppressed due to low confidence (2)

src/sed/loader/flash/dataframe.py:423

The docstring for the df_train property refers to channels of type [per pulse], but the implementation uses 'per_train'. Please update the comment to match the code and maintain clarity.

        Returns a pandas DataFrame for given channel names of type [per pulse]

tests/data/loader/flash/config.yaml:57

[nitpick] For consistency and to avoid potential YAML parsing issues, consider quoting the index values as strings (e.g. ['trainId', 'pulseId', 'electronId']).

  index: [trainId, pulseId, electronId]

zain-sohail

Just a few comments. Is this local metadata scheme also important for the flash loader? If so, the code needs to be updated there as well.

zain-sohail · 2025-06-01T18:55:26Z

src/sed/core/config_model.py

@@ -26,6 +26,7 @@ class PathsModel(BaseModel):

    raw: DirectoryPath
    processed: Optional[Union[DirectoryPath, NewPath]] = None
+    meta: Optional[Union[DirectoryPath, NewPath]] = None


Instead of adding a new entry to the config model, I'd suggest we just allow directory paths in

sed/src/sed/core/config_model.py

Line 327 in 4a6ec53

archiver_url: Optional[HttpUrl] = None

What do you think?

kutnyakhov · 2025-06-03

Fine with me. I just thought that it would in any case be one of the main folders inside the beamtime folder.
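
The suggestion in this thread, accepting either a URL or a directory in a single config entry, could be handled with a small dispatch like the one below. This is only a sketch using the standard library: the function name `classify_metadata_source` and the example values are hypothetical, and the real config model would more likely express this through its validation types.

```python
from urllib.parse import urlparse


def classify_metadata_source(value: str) -> str:
    """Return "url" for http(s) locations, otherwise "directory"."""
    parsed = urlparse(value)
    # Only treat the value as a URL if it has an http(s) scheme and a host.
    if parsed.scheme in ("http", "https") and parsed.netloc:
        return "url"
    return "directory"


# Hypothetical values for illustration only.
print(classify_metadata_source("https://example.org/archiver/"))
print(classify_metadata_source("/beamtime/meta"))
```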

zain-sohail · 2025-06-01T19:09:44Z

src/sed/loader/cfel/loader.py

            processed_dir = Path(
                self._config["core"]["paths"].get("processed", raw_dir.joinpath("processed")),
            )
+            meta_dir = Path(
+                self._config["core"]["paths"].get("meta", raw_dir.joinpath("meta")),


The path logic is confusing right now, as there are too many possibilities. I'd put archiver_url as the default in the lab default config, plus one automatic option.
To me it's not clear whether the meta path is 'meta/' or 'meta/fabtrack/' right now.

kutnyakhov · 2025-06-03

This part is also confusing for me, as I don't really see how you can get from raw_dir to e.g. processed_dir with raw_dir.joinpath("processed"): this gives beamtime_dir/raw_dir/processed instead of beamtime_dir/processed, doesn't it?
Currently, the meta path is 'meta/fabtrack/', as it comes from Fabiano's code, but it can probably be changed to just 'meta/' as soon as it is accepted/generalized by the IT guys.
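
The joinpath concern raised here can be checked with a few lines of `pathlib`. The directory layout `/beamtime/raw` is a made-up example, not the actual beamtime structure; `PurePosixPath` is used so the result does not depend on the operating system or on the folders existing.

```python
from pathlib import PurePosixPath

# Hypothetical beamtime layout, for illustration only.
raw_dir = PurePosixPath("/beamtime/raw")

# joinpath appends below raw_dir, so the default lands inside the raw folder.
inside_raw = raw_dir.joinpath("processed")

# Going through the parent yields a sibling of raw_dir instead.
sibling = raw_dir.parent.joinpath("processed")

print(inside_raw)  # /beamtime/raw/processed
print(sibling)     # /beamtime/processed
```

Whether the default should be the nested or the sibling folder is a design decision for the loader; the snippet only shows which one each expression produces.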

zain-sohail · 2025-06-01T19:14:13Z

src/sed/loader/cfel/loader.py

+            self.metadata.update(self.parse_local_metadata())
+        else:
+            print("Metadata taken from SciCat")
+            self.metadata.update(self.parse_scicat_metadata(token) if collect_metadata else {})


Not necessarily a big issue, but parse_scicat_metadata is called twice when SciCat metadata exists: once in the if condition and once in the else branch.
One way could be:

    scicat_metadata = self.parse_scicat_metadata(token) if collect_metadata else {}
    self.metadata.update(scicat_metadata)
    if len(scicat_metadata) == 0:
        print("No SciCat metadata available, checking local folder")
        self.metadata.update(self.parse_local_metadata())

kutnyakhov · 2025-06-03

Fine with me. I just wanted to implement a check: if SciCat entries are available, use them; if not, check the local folder, to stay compatible with older beamtimes.
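
The "remote first, local as fallback" behaviour discussed in this thread can be sketched independently of the loader. The function `collect_metadata_with_fallback` and the two stand-in fetchers are hypothetical and only mimic the roles of parse_scicat_metadata and parse_local_metadata; the point is that the remote source is queried exactly once.

```python
from typing import Callable


def collect_metadata_with_fallback(
    fetch_remote: Callable[[], dict],
    fetch_local: Callable[[], dict],
) -> dict:
    """Use remote metadata if any is returned, otherwise fall back to local."""
    metadata: dict = {}
    remote = fetch_remote()  # single call; the result is reused below
    metadata.update(remote)
    if not remote:
        metadata.update(fetch_local())
    return metadata


# Stand-ins for the two metadata sources, with a counter to show call counts.
calls = {"remote": 0}


def empty_remote() -> dict:
    calls["remote"] += 1
    return {}


def local_source() -> dict:
    return {"source": "local"}


result = collect_metadata_with_fallback(empty_remote, local_source)
print(result, calls["remote"])
```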

zain-sohail · 2025-06-01T19:14:28Z

src/sed/loader/flash/metadata.py

            burl=self.url,
-            url="Datasets",
+            url="datasets",#"Datasets",


Did the API change?

kutnyakhov · 2025-06-03

Yes, all metadata was migrated to the generalized scicat.desy.de with a new API, where 'Datasets' was changed to 'datasets' :)
Hopefully it will also be available from outside DESY within the next few days.

zain-sohail changed the base branch from main to v1_feature_branch December 19, 2024 12:46

zain-sohail marked this pull request as ready for review December 19, 2024 15:48

zain-sohail commented Dec 19, 2024

zain-sohail added 7 commits January 16, 2025 03:05

working dataframe class for cfel

ceeb637

move file

1c23973

move to flash loader

e8965ee

updates for cfel loader, not breaking tests

d14bc95

fix spellcheck

289d037

add example config

dcfe456

fix cspell

788d189

zain-sohail force-pushed the hextof-lab-loader branch from c206130 to 788d189 Compare January 16, 2025 15:10

zain-sohail added 2 commits January 28, 2025 10:04

Merge branch 'v1_feature_branch' into hextof-lab-loader

69e4595

update some minor config changes

f4fd755

zain-sohail mentioned this pull request Jan 30, 2025

Upgrade to V1 #437

Merged

12 tasks

rettigl changed the base branch from v1_feature_branch to main February 5, 2025 21:58

zain-sohail added 4 commits February 7, 2025 18:38

make sure optional parameters are not necessary

053bc60

Merge branch 'main' into hextof-lab-loader

dbb7e94

fix the bugs

df78f69

add timed dataframe starting point

5cd23b4

zain-sohail requested a review from Copilot April 6, 2025 16:10

Copilot AI reviewed Apr 6, 2025

zain-sohail and others added 6 commits April 12, 2025 19:28

moving back to main branch for flash, and removing instrument support as it is not available right now anyways

5b411d1

separated lab loading procedure but using common methods from flash loader

3739505

fix a few bugs

a571fa2

add data for testing and some spelling fixes

73d7b5f

changed timestamps to use unix format

aa42cd8

Revert "changed timestamps to use unix format"

4734fea

Aserhisham and others added 2 commits May 14, 2025 22:06

working on timestamps, unfinished testing

ec2160f

added metadata retrieve from beamtime folder

4a6ec53

kutnyakhov mentioned this pull request May 19, 2025

Using local metadata in beamtime folder #561

Open

kutnyakhov and others added 2 commits May 22, 2025 15:54

adjusted SciCat part to new version and URL

227dfb1

changes to validation

ef3dcda

zain-sohail commented Jun 1, 2025

added get_count_rate() to cfel

dda08a9
