Description
I'm trying to run SC2 on 3Brain HD-MEA data (4096 channels, 20 kHz) with mostly default parameters:
```python
p = {
    'apply_motion_correction': False,
    'apply_preprocessing': True,
    'cache_preprocessing': {'delete_cache': True,
                            'memory_limit': 0.5,
                            'mode': 'zarr'},
    'clustering': {'legacy': True},
    'debug': False,
    'detection': {'detect_threshold': 5, 'peak_sign': 'neg'},
    'filtering': {'filter_order': 2,
                  'freq_max': 7000,
                  'freq_min': 150,
                  'ftype': 'bessel',
                  'margin_ms': 10},
    'general': {'ms_after': 2, 'ms_before': 2, 'radius_um': 75},
    'job_kwargs': {'n_jobs': 40},  # 'total_memory': '50G'},
    'matched_filtering': True,
    'matching': {'method': 'circus-omp-svd'},
    'merging': {'auto_merge': {'corr_diff_thresh': 0.25, 'min_spikes': 10},
                'correlograms_kwargs': {},
                'similarity_kwargs': {'max_lag_ms': 0.2,
                                      'method': 'cosine',
                                      'support': 'union'}},
    'motion_correction': {'preset': 'dredge_fast'},
    'multi_units_only': False,
    'seed': 42,
    'selection': {'method': 'uniform',
                  'min_n_peaks': 100000,
                  'n_peaks_per_channel': 5000,
                  'seed': 42,
                  'select_per_channel': False},
    'sparsity': {'amplitude_mode': 'peak_to_peak',
                 'method': 'snr',
                 'threshold': 0.25},
    'whitening': {'mode': 'local', 'regularize': False}}
```
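For completeness, this is roughly how I'm launching the sorter (a minimal sketch, not my exact script: the paths are placeholders, and I'm assuming the usual `read_biocam` / `run_sorter` entry points and the `folder=` argument name):

```python
import spikeinterface.full as si

# Load the 3Brain HD-MEA recording (path is a placeholder)
recording = si.read_biocam("/path/to/recording.brw")

# Run SpyKING CIRCUS 2 with the parameter dict above
sorting = si.run_sorter(
    "spykingcircus2",
    recording,
    folder="/path/to/sc2_output",  # placeholder output folder
    **p,
)
```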
The only trace I get is:
```
spykingcircus2 could benefit from using torch. Consider installing it
Preprocessing the recording (bandpass filtering + CMR + whitening)
noise_level (workers: 20 processes): 100%|███████████████████████████████████████████████████| 20/20 [00:24<00:00, 1.23s/it]
Use zarr_path=/tmp/spikeinterface_cache/tmpdx30z0db/CCLII4NL.zarr
write_zarr_recording
engine=process - n_jobs=40 - samples_per_chunk=19,753 - chunk_memory=308.64 MiB - total_memory=12.06 GiB - chunk_duration=1.00s (999.96 ms)
write_zarr_recording (workers: 40 processes): 100%|██████████████████████████████████████████| 61/61 [01:34<00:00, 1.54s/it]
detect peaks using locally_exclusive + 1 node (workers: 40 processes): 100%|█████████████████| 61/61 [00:11<00:00, 5.10it/s]
detect peaks using matched_filtering (workers: 40 processes): 100%|██████████████████████████| 61/61 [02:18<00:00, 2.27s/it]
Kept 179242 peaks for clustering
extracting features (workers: 40 processes): 100%|███████████████████████████████████████████| 61/61 [00:06<00:00, 9.63it/s]
split_clusters with local_feature_clustering: 100%|███████████████████████████████████| 4210/4210 [00:00<00:00, 42564.83it/s]
Bus error (core dumped)
```
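For reference, the chunk-size line in that log matches a simple back-of-envelope calculation (my own reading, assuming the preprocessed traces are float32), so the zarr-writing stage itself seems to stay within the advertised ~12 GiB; the crash only happens later, after clustering:

```python
# How I read "samples_per_chunk=19,753 - chunk_memory=308.64 MiB - total_memory=12.06 GiB"
# (back-of-envelope, assuming float32 traces after filtering/CMR/whitening)
n_channels = 4096
samples_per_chunk = 19_753
bytes_per_sample = 4  # float32

chunk_memory_mib = samples_per_chunk * n_channels * bytes_per_sample / 2**20
total_memory_gib = chunk_memory_mib * 40 / 2**10  # 40 workers, one chunk each

print(f"{chunk_memory_mib:.2f} MiB per chunk")    # ~308.64 MiB
print(f"{total_memory_gib:.2f} GiB for 40 jobs")  # ~12.06 GiB
```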
I've been able to trace the crash back to a call to `estimate_templates` (here), which then seems to call `estimate_templates_with_accumulator`.
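Assuming the accumulator holds something like a single float32 buffer of shape (n_units, n_samples, n_channels), which is my guess at the allocation pattern rather than something I've verified in the code, the numbers get big very quickly at 4096 channels:

```python
# Rough size of a (n_units, n_samples, n_channels) float32 template accumulator.
# n_units is a guess based on the 4210 splits reported in the log; the real
# count may differ.
n_units = 4210
n_channels = 4096
fs = 20_000                 # Hz
ms_before, ms_after = 2, 2  # from the 'general' params above
n_samples = int((ms_before + ms_after) * fs / 1000)  # 80 samples per waveform

accumulator_gb = n_units * n_samples * n_channels * 4 / 1e9
print(f"~{accumulator_gb:.1f} GB for a single accumulator")  # ~5.5 GB
```

That alone is in the same ballpark as the ~10 GB anon-rss the OOM killer reports below, so a couple of copies of it (or per-worker duplication) could plausibly tip the machine over.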
From what I can gather this looks like an out-of-memory error, but I've never seen anything quite like it with other Python OOM issues.
The GUI monitor shows a modest 17×10⁶ TB being used:

And `dmesg` shows the following:
```
[ 954.327570] oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0-1,global_oom,task_memcg=/user.slice/user-1000.slice/user@1000.service/tmux-spawn-ad081ad7-fd73-4200-8a54-e76e0ce4b80d.scope,task=python,pid=8624,uid=1000
[ 954.327924] Out of memory: Killed process 8624 (python) total-vm:14242184kB, anon-rss:10079372kB, file-rss:6260kB, shmem-rss:0kB, UID:1000 pgtables:20984kB oom_score_adj:0
[ 956.739762] systemd-journald[641]: Under memory pressure, flushing caches.
[ 957.536557] [drm:nv_drm_master_set [nvidia_drm]] *ERROR* [nvidia-drm] [GPU ID 0x00007300] Failed to grab modeset ownership
[ 957.631410] rfkill: input handler disabled
```
I'm not sure whether this is expected behavior (and there are simply too many redundant units) or whether there's an actual issue with memory handling.
I've tried passing `total_memory` to both the sorter's `job_kwargs` and SpikeInterface's global job_kwargs, but I'm not sure it's taken into account for steps that don't operate on the recording itself.
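Concretely, these are the two things I tried (a sketch; the '50G' value is just what I used, and I'm assuming `set_global_job_kwargs` is the right entry point for the global setting):

```python
import spikeinterface.full as si

# 1) Via the sorter's own job_kwargs (see the commented-out entry in `p` above)
p['job_kwargs'] = {'n_jobs': 40, 'total_memory': '50G'}

# 2) Via SpikeInterface's global job_kwargs
si.set_global_job_kwargs(n_jobs=40, total_memory='50G')
```

(In the run shown above the `total_memory` entry was commented out, which is presumably why the log falls back to the default 1 s chunks.)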