Description
I'm trying to run SC2 on 3Brain HD-MEA data (4096 channels, 20 kHz) with mostly default parameters:
```python
p = {
    'apply_motion_correction': False,
    'apply_preprocessing': True,
    'cache_preprocessing': {'delete_cache': True,
                            'memory_limit': 0.5,
                            'mode': 'zarr'},
    'clustering': {'legacy': True},
    'debug': False,
    'detection': {'detect_threshold': 5, 'peak_sign': 'neg'},
    'filtering': {'filter_order': 2,
                  'freq_max': 7000,
                  'freq_min': 150,
                  'ftype': 'bessel',
                  'margin_ms': 10},
    'general': {'ms_after': 2, 'ms_before': 2, 'radius_um': 75},
    'job_kwargs': {'n_jobs': 40},  # 'total_memory': '50G'},
    'matched_filtering': True,
    'matching': {'method': 'circus-omp-svd'},
    'merging': {'auto_merge': {'corr_diff_thresh': 0.25, 'min_spikes': 10},
                'correlograms_kwargs': {},
                'similarity_kwargs': {'max_lag_ms': 0.2,
                                      'method': 'cosine',
                                      'support': 'union'}},
    'motion_correction': {'preset': 'dredge_fast'},
    'multi_units_only': False,
    'seed': 42,
    'selection': {'method': 'uniform',
                  'min_n_peaks': 100000,
                  'n_peaks_per_channel': 5000,
                  'seed': 42,
                  'select_per_channel': False},
    'sparsity': {'amplitude_mode': 'peak_to_peak',
                 'method': 'snr',
                 'threshold': 0.25},
    'whitening': {'mode': 'local', 'regularize': False}}
```
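For completeness, this is roughly how I'm launching the sorter (a minimal sketch, not my exact script: the paths are placeholders, and I'm assuming the usual `read_biocam` / `run_sorter` entry points and the `folder=` argument name):

```python
import spikeinterface.full as si

# Load the 3Brain HD-MEA recording (path is a placeholder)
recording = si.read_biocam("/path/to/recording.brw")

# Run SpyKING CIRCUS 2 with the parameter dict above
sorting = si.run_sorter(
    "spykingcircus2",
    recording,
    folder="/path/to/sc2_output",  # placeholder output folder
    **p,
)
```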
The only trace I get is:
```
spykingcircus2 could benefit from using torch. Consider installing it
Preprocessing the recording (bandpass filtering + CMR + whitening)
noise_level (workers: 20 processes): 100%|███████████████████████████████████████████████████| 20/20 [00:24<00:00, 1.23s/it]
Use zarr_path=/tmp/spikeinterface_cache/tmpdx30z0db/CCLII4NL.zarr
write_zarr_recording
engine=process - n_jobs=40 - samples_per_chunk=19,753 - chunk_memory=308.64 MiB - total_memory=12.06 GiB - chunk_duration=1.00s (999.96 ms)
write_zarr_recording (workers: 40 processes): 100%|██████████████████████████████████████████| 61/61 [01:34<00:00, 1.54s/it]
detect peaks using locally_exclusive + 1 node (workers: 40 processes): 100%|█████████████████| 61/61 [00:11<00:00, 5.10it/s]
detect peaks using matched_filtering (workers: 40 processes): 100%|██████████████████████████| 61/61 [02:18<00:00, 2.27s/it]
Kept 179242 peaks for clustering
extracting features (workers: 40 processes): 100%|███████████████████████████████████████████| 61/61 [00:06<00:00, 9.63it/s]
split_clusters with local_feature_clustering: 100%|███████████████████████████████████| 4210/4210 [00:00<00:00, 42564.83it/s]
Bus error (core dumped)
```
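For reference, the chunk-size line in that log matches a simple back-of-envelope calculation (my own reading, assuming the preprocessed traces are float32), so the zarr-writing stage itself seems to stay within the advertised ~12 GiB; the crash only happens later, after clustering:

```python
# How I read "samples_per_chunk=19,753 - chunk_memory=308.64 MiB - total_memory=12.06 GiB"
# (back-of-envelope, assuming float32 traces after filtering/CMR/whitening)
n_channels = 4096
samples_per_chunk = 19_753
bytes_per_sample = 4  # float32

chunk_memory_mib = samples_per_chunk * n_channels * bytes_per_sample / 2**20
total_memory_gib = chunk_memory_mib * 40 / 2**10  # 40 workers, one chunk each

print(f"{chunk_memory_mib:.2f} MiB per chunk")    # ~308.64 MiB
print(f"{total_memory_gib:.2f} GiB for 40 jobs")  # ~12.06 GiB
```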
I've been able to trace the crash back to a call to `estimate_templates` (here), which then seems to call `estimate_templates_with_accumulator`.
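Assuming the accumulator holds something like a single float32 buffer of shape (n_units, n_samples, n_channels), which is my guess at the allocation pattern rather than something I've verified in the code, the numbers get big very quickly at 4096 channels:

```python
# Rough size of a (n_units, n_samples, n_channels) float32 template accumulator.
# n_units is a guess based on the 4210 splits reported in the log; the real
# count may differ.
n_units = 4210
n_channels = 4096
fs = 20_000                 # Hz
ms_before, ms_after = 2, 2  # from the 'general' params above
n_samples = int((ms_before + ms_after) * fs / 1000)  # 80 samples per waveform

accumulator_gb = n_units * n_samples * n_channels * 4 / 1e9
print(f"~{accumulator_gb:.1f} GB for a single accumulator")  # ~5.5 GB
```

That alone is in the same ballpark as the ~10 GB anon-rss the OOM killer reports below, so a couple of copies of it (or per-worker duplication) could plausibly tip the machine over.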
From what I can gather this looks like an out-of-memory error, but I've never seen anything quite like it with other Python OOM issues.
The GUI monitor shows a modest 17×10⁶ TB being used:

And `dmesg` shows the following:
```
[ 954.327570] oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0-1,global_oom,task_memcg=/user.slice/user-1000.slice/user@1000.service/tmux-spawn-ad081ad7-fd73-4200-8a54-e76e0ce4b80d.scope,task=python,pid=8624,uid=1000
[ 954.327924] Out of memory: Killed process 8624 (python) total-vm:14242184kB, anon-rss:10079372kB, file-rss:6260kB, shmem-rss:0kB, UID:1000 pgtables:20984kB oom_score_adj:0
[ 956.739762] systemd-journald[641]: Under memory pressure, flushing caches.
[ 957.536557] [drm:nv_drm_master_set [nvidia_drm]] *ERROR* [nvidia-drm] [GPU ID 0x00007300] Failed to grab modeset ownership
[ 957.631410] rfkill: input handler disabled
```
I'm not sure whether this is expected behavior (and there are simply too many redundant units) or whether there's an actual issue with memory handling.
I've tried passing `total_memory` to both the sorter's `job_kwargs` and SpikeInterface's global job_kwargs, but I'm not sure it's taken into account for steps that don't operate on the recording itself.
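Concretely, these are the two things I tried (a sketch; the '50G' value is just what I used, and I'm assuming `set_global_job_kwargs` is the right entry point for the global setting):

```python
import spikeinterface.full as si

# 1) Via the sorter's own job_kwargs (see the commented-out entry in `p` above)
p['job_kwargs'] = {'n_jobs': 40, 'total_memory': '50G'}

# 2) Via SpikeInterface's global job_kwargs
si.set_global_job_kwargs(n_jobs=40, total_memory='50G')
```

(In the run shown above the `total_memory` entry was commented out, which is presumably why the log falls back to the default 1 s chunks.)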