Add pydantic curation model, improve merging rules, and add splitting model #3760

alejoe91 · 2025-03-11T12:24:28Z

This PR adds more structure to the curation format.

By defining a Pydantic model, we can add proper descriptions, types, and validation strategies for the curation.
This will make the format easier for third-party software to validate and adopt.

src/spikeinterface/curation/curation_model.py

zm711 · 2025-03-11T13:12:50Z

src/spikeinterface/core/sortinganalyzer.py

+        sparsity_overlap: float = 0.75,
+        new_id_strategy: str = "append",
+        return_new_unit_ids: bool = False,
+        format: str = "memory",


One general question that will matter for typing going forward. We have been moving toward writing

`"append" | "new"` for typing, but type analysis programs don't like this, so I assume pydantic won't either. `str` is not accurate either, because the argument doesn't accept any string, only specific strings. So in this case, should we move the library over to
`Literal["append", "new"]`?
I forget the actual options, so 'new' is just something I made up for the example.

Or does pydantic only accept `str` and not `Literal`?

I think pydantic only accepts Literal.

Why don't type analysis programs like `"append" | "new"` for typing?

I don't know. In VS Code I only get a warning saying that the types in `"append" | "new"` are not defined. Others (I think Heberto) have asked why we don't use `Literal["append", "new"]`, so maybe he is seeing the typing warning too. I just want to make sure we fit the pydantic model while still being useful to the end user. Saying `str` is not useful to an end user who relies on type hints, because the argument is actually a Literal. I think adding Literal clutters things, but if we now rely on a tool that expects Literal, then we have to use it, and we should move the whole code base in that direction for consistency.

I think static type analysis programs assume "append" should be a type, because we are not specifying that it is a literal. So although Python allows it, static type checkers don't know what to do with it. It is a little similar to the Optional vs. optional debate in Python type hinting.

Actually, I think using "append" | "new" is not supported...

That is exactly what I'm saying!

I prefer it, but it is not supported. So we need to switch! I don't want us to switch to `str`; I want us to switch to `Literal`.

Got it! That makes sense to me :)
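For reference, a minimal sketch of the `Literal` approach discussed in this thread. The `MergeOptions` model is hypothetical and exists only for illustration, and "new" is the made-up option from the comment above, not a real strategy name. It suggests that pydantic accepts a `Literal` annotation and rejects strings outside the listed options:

```python
from typing import Literal

from pydantic import BaseModel, ValidationError


class MergeOptions(BaseModel):
    # Hypothetical model; "new" is the placeholder option from the discussion.
    new_id_strategy: Literal["append", "new"] = "append"


# A listed option is accepted.
ok = MergeOptions(new_id_strategy="new")

# Any other string fails validation.
try:
    MergeOptions(new_id_strategy="anything else")
    rejected = False
except ValidationError:
    rejected = True
```

Static type checkers also understand this form, so the same annotation could be reused in plain function signatures.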

alejoe91 · 2025-03-27T19:58:45Z

@samuelgarcia this PR now contains only the new pydantic model (including splitting) and a general cleanup of the validation in the curation format

alejoe91 · 2025-03-27T19:58:50Z

ready to review

samuelgarcia · 2025-04-01T06:33:49Z

Hi.
I read it very quickly.
I have the feeling that you also changed the format itself.
Now the merges are a list of dicts, which is certainly better.
Can we have a call about this?

Then we should have a clear v2 of the format.
We need to handle backward compatibility.
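One way backward compatibility could be handled is a small conversion step when loading older files. The sketch below assumes the earlier format stored merges as a plain list of unit-ID lists and that the new format uses dicts with `group` and `new_unit_id` keys. Both the helper name `merges_v1_to_v2` and the key names are assumptions for illustration, not the PR's actual implementation:

```python
def merges_v1_to_v2(merge_unit_groups):
    """Convert a list of unit-ID lists into a list of merge dicts.

    Hypothetical helper: the key names are assumed, not taken from the PR.
    """
    return [{"group": list(group), "new_unit_id": None} for group in merge_unit_groups]


old_style = [[1, 2], [5, 6, 7]]
new_style = merges_v1_to_v2(old_style)
```

A loader could apply such a conversion based on the format version stored in the file, so that older curation files keep working.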

samuelgarcia · 2025-04-01T06:35:13Z

Could you also update the format description in curation.rst?

samuelgarcia · 2025-06-02T09:22:14Z

src/spikeinterface/curation/curation_model.py

+
+class LabelDefinition(BaseModel):
+    name: str = Field(..., description="Name of the label")
+    label_options: List[str] = Field(..., description="List of possible label options", min_length=2)


Do we really need this `...` everywhere?
I think it is clear that all fields are mandatory except when there is a default, no?
Is this common in pydantic?

The `...` marks a field as required.
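A quick way to check this behaviour, as a sketch with two hypothetical models: as far as I know, a pydantic field with no default is required whether or not `...` is passed to `Field`, so the ellipsis mainly makes the intent explicit.

```python
from pydantic import BaseModel, Field, ValidationError


class WithEllipsis(BaseModel):
    # Explicitly marked as required with `...`.
    name: str = Field(..., description="Name of the label")


class WithoutEllipsis(BaseModel):
    # No default given at all.
    name: str = Field(description="Name of the label")


def is_required(model_cls) -> bool:
    """Return True if constructing the model without `name` fails validation."""
    try:
        model_cls()
        return False
    except ValidationError:
        return True
```

Calling `is_required` on both models shows whether omitting `name` is rejected in each case.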

samuelgarcia · 2025-06-02T09:25:27Z

src/spikeinterface/curation/curation_model.py

+            "If labels, the split is defined by a list of labels for each spike (`split_labels`). "
+        ),
+    )
+    split_indices: Optional[Union[List[List[int]]]] = Field(default=None, description="List of indices for the split")


Why Union?

How does this play with numpy arrays?

Good catch, Union is useless here.

Hmm, no. It should be Union[List[int], List[List[int]]]: the List[int] is for labels mode, the List[List[int]] for indices mode.
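To illustrate the two shapes, here is a sketch with a trimmed-down, hypothetical `SplitSketch` model that keeps only the field under discussion. It assumes the corrected annotation from the comment above and does not cover the numpy question:

```python
from typing import List, Optional, Union

from pydantic import BaseModel, Field


class SplitSketch(BaseModel):
    # Hypothetical model; only the annotation mirrors the review comment.
    split_indices: Optional[Union[List[int], List[List[int]]]] = Field(
        default=None, description="List of indices for the split"
    )


# Labels mode: a flat list with one entry per spike.
labels_mode = SplitSketch(split_indices=[0, 1, 0, 1])

# Indices mode: one list of spike indices per new unit.
indices_mode = SplitSketch(split_indices=[[0, 2], [1, 3]])
```

Both inputs validate against the same field, which is why the Union is needed once the two modes share one attribute.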

samuelgarcia · 2025-06-02T09:32:23Z

src/spikeinterface/curation/curation_model.py

+        return values
+
+    @classmethod
+    def check_splits(cls, values):


The structure of split_data should be described per mode here.
The split_data type is unclear.

samuelgarcia · 2025-06-02T09:38:43Z

src/spikeinterface/curation/curation_model.py

+    merge_unit_group: List[Union[int, str]] = Field(..., description="List of groups of units to be merged")
+    merge_new_unit_id: Optional[Union[int, str]] = Field(default=None, description="New unit IDs for the merge group")


If we now have a nested dict, why not rename the fields?

Suggested change

-    merge_unit_group: List[Union[int, str]] = Field(..., description="List of groups of units to be merged")
-    merge_new_unit_id: Optional[Union[int, str]] = Field(default=None, description="New unit IDs for the merge group")
+    group: List[Union[int, str]] = Field(..., description="List of groups of units to be merged")
+    new_unit_id: Optional[Union[int, str]] = Field(default=None, description="New unit IDs for the merge group")

because it is obviously a merge, no?

done, also for splits
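With the shorter field names, a nested merge entry could look like the sketch below. The `Merge` fields follow the suggested change above, while the `CurationSketch` wrapper and its `merges` attribute are assumptions added only to show the nesting:

```python
from typing import List, Optional, Union

from pydantic import BaseModel, Field


class Merge(BaseModel):
    group: List[Union[int, str]] = Field(..., description="Units to be merged")
    new_unit_id: Optional[Union[int, str]] = Field(
        default=None, description="New unit ID for the merge group"
    )


class CurationSketch(BaseModel):
    # Hypothetical container; the real curation model has more fields.
    merges: List[Merge] = Field(default_factory=list)


curation = CurationSketch(
    merges=[{"group": [1, 2], "new_unit_id": 10}, {"group": ["a", "b"]}]
)
```

Because each entry already lives under `merges`, the `merge_` prefix on the inner keys adds no information.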

Add pydantic curation model and improve curation format and merging r…

96863de

…ules

alejoe91 added the "curation" label (Related to curation module) Mar 11, 2025

alejoe91 commented Mar 11, 2025

src/spikeinterface/curation/curation_model.py (outdated)

Update src/spikeinterface/curation/curation_model.py

1ce611c

zm711 reviewed Mar 11, 2025

alejoe91 added 8 commits March 11, 2025 14:44

Merge branch 'curation-pydantic' of github.com:alejoe91/spikeinterfac…

567b2b7

…e into curation-pydantic

Move pydantic to core

3464987

Merge branch 'main' into curation-pydantic

69c4854

Merge branch 'main' into curation-pydantic

228722c

Refactor curation model to include merges and splits

dbfa315

Add merge list to tests

82526b0

Simplify and centralize conversion and checks

482f0be

Fix sortingview tests

f122db7

alejoe91 changed the title from "Add pydantic curation model and improve curation format and merging rules" to "Add pydantic curation model, improve merging rules, and add splitting model" Mar 27, 2025

Fix sortingview conversion

4f14e90

alejoe91 marked this pull request as ready for review March 27, 2025 19:58

Merge branch 'main' into curation-pydantic

d7633bf

alejoe91 added 2 commits March 28, 2025 16:45

merge_new_unit_ids -> merge_new_unit_id

317f87c

Merge branch 'curation-pydantic' of github.com:alejoe91/spikeinterfac…

a9ed838

…e into curation-pydantic

alejoe91 mentioned this pull request Mar 28, 2025

Add splitting functionality to curation and SortingAnalyzer #3817

Open

samuelgarcia reviewed Jun 2, 2025

alejoe91 added this to the 0.103.0 milestone Jun 11, 2025

Implement feedback

c64b840

alejoe91 added 2 commits June 12, 2025 15:38

Solve conflicts

6c940a3

Clearer explanation on split data

dbdac13
