coredump in segmm_ #4892

Open
sandrew11 opened this issue Sep 9, 2024 · 4 comments

@sandrew11

#0 0x000055eca5413a75 in sgemm_incopy ()
#1 0x000055eca53f670f in sgemm_tn ()
#2 0x000055eca53f4d79 in sgemm_ ()
#3 0x000055eca52d9bda in faiss::(anonymous namespace)::exhaustive_inner_product_blas<faiss::HeapResultHandler<faiss::CMin<float, long> > > (res=..., ny=64, nx=64, d=128, y=0x7fca661be000,
x=) at third_party/faiss/faiss/faiss/impl/ResultHandler.h:88
#4 faiss::knn_inner_product (x=, y=, d=128, nx=64, ny=, k=1, val=0x7fca62d35200, ids=0x7fca661dfc00, sel=)
at third_party/faiss/faiss/faiss/utils/distances.cpp:636
#5 0x000055eca52d9d8e in faiss::knn_inner_product (x=, y=, d=, nx=, ny=, res=res@entry=0x7fca58dbe1c0, sel=0x0)
at third_party/faiss/faiss/faiss/utils/distances.cpp:667

This core dump occurs in some Linux environments. Could anyone help me find the cause?

@martin-frbg
Collaborator

Which version of OpenBLAS, which CPU, which compiler, and what do the code and data look like that lead to the coredump?

@sandrew11
Author

> Which version of OpenBLAS, which CPU, which compiler, and what do the code and data look like that lead to the coredump?

OpenBLAS version: 0.3.21
lscpu:
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Address sizes: 42 bits physical, 48 bits virtual
Byte Order: Little Endian
CPU(s): 16
On-line CPU(s) list: 0-15
Vendor ID: GenuineIntel
Model name: Intel(R) Xeon(R) CPU E5-2690 v4 @ 2.60GHz
CPU family: 6
Model: 79
Thread(s) per core: 1
Core(s) per socket: 1
Socket(s): 16
Stepping: 1
BogoMIPS: 5187.98
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon rep_good nopl xtopology cpuid tsc_known_freq pni pclmulqdq
ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_single ssbd ibrs ibpb stibp fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2
erms invpcid rtm rdseed adx smap xsaveopt arat flush_l1d
Virtualization features:
Hypervisor vendor: KVM
Virtualization type: full
Caches (sum of all):
L1d: 512 KiB (16 instances)
L1i: 512 KiB (16 instances)
L2: 64 MiB (16 instances)
L3: 256 MiB (16 instances)
NUMA:
NUMA node(s): 1
NUMA node0 CPU(s): 0-15
Vulnerabilities:
Gather data sampling: Not affected
Itlb multihit: KVM: Mitigation: VMX unsupported
L1tf: Mitigation; PTE Inversion
Mds: Vulnerable: Clear CPU buffers attempted, no microcode; SMT Host state unknown
Meltdown: Vulnerable
Mmio stale data: Vulnerable: Clear CPU buffers attempted, no microcode; SMT Host state unknown
Reg file data sampling: Not affected
Retbleed: Not affected
Spec rstack overflow: Not affected
Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl and seccomp
Spectre v1: Mitigation; usercopy/swapgs barriers and __user pointer sanitization
Spectre v2: Vulnerable, IBPB: disabled, STIBP: disabled, PBRSB-eIBRS: Not affected
Srbds: Not affected
Tsx async abort: Vulnerable: Clear CPU buffers attempted, no microcode; SMT Host state unknown

compiler: gcc

The crashing call is in faiss::exhaustive_inner_product_blas:
https://github.com/facebookresearch/faiss/blob/main/faiss/utils/distances.cpp

    sgemm_("Transpose",
           "Not transpose",
           &nyi,
           &nxi,
           &di,
           &one,
           y + j0 * d,
           &di,
           x + i0 * d,
           &di,
           &zero,
           ip_block.get(),
           &nyi);
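
For readers who do not know FAISS, the call quoted above can be read as filling a block of inner products between `nxi` query vectors and `nyi` database vectors of dimension `di`. The sketch below is a naive reference for that reading, not FAISS or OpenBLAS code; the function name `inner_products_ref` and the use of `std::vector` are assumptions made for illustration. It also spells out how many floats each input buffer has to provide, which is one thing worth checking when a crash appears inside the GEMM call.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Naive reference for the quoted sgemm_ call (illustration only, hypothetical name).
// x holds nx row-major vectors of dimension d; y holds ny such vectors.
// Returns an nx-by-ny row-major matrix with out[i * ny + j] = <x_i, y_j>,
// which is the same memory layout as an ny-by-nx column-major result with ldc = ny.
std::vector<float> inner_products_ref(const std::vector<float>& x,
                                      const std::vector<float>& y,
                                      std::size_t nx,
                                      std::size_t ny,
                                      std::size_t d) {
    // The GEMM call reads nx * d floats from x and ny * d floats from y;
    // a shorter buffer would mean an out-of-bounds read.
    assert(x.size() >= nx * d);
    assert(y.size() >= ny * d);

    std::vector<float> out(nx * ny, 0.0f);
    for (std::size_t i = 0; i < nx; ++i) {
        for (std::size_t j = 0; j < ny; ++j) {
            float acc = 0.0f;
            for (std::size_t k = 0; k < d; ++k) {
                acc += x[i * d + k] * y[j * d + k];
            }
            out[i * ny + j] = acc;
        }
    }
    return out;
}
```

Comparing the output of such a reference with the BLAS result on small inputs is one way to confirm that the pointers, leading dimensions, and buffer sizes passed to `sgemm_` are what the caller intends.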

@martin-frbg
Collaborator

martin-frbg commented Sep 10, 2024

I do not know FAISS. What do I need to do to trigger the error there? Is it with a supplied example? Note that 0.3.21 is two years old, so there is a good chance that this was resolved in the meantime.

@martin-frbg
Collaborator

I cannot reproduce this with simple cases based on the examples in the FAISS tutorial. Please provide a code sample that demonstrates the problem.
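
A standalone code sample of the kind requested here could start from something like the sketch below. It is not the reporter's code. It only replays the argument layout of the quoted `sgemm_` call with the shapes visible in the backtrace (`nx=64`, `ny=64`, `d=128`) on synthetic data. The names `sgemm_fn` and `run_case` are hypothetical, and the signature assumes a BLAS build with 32-bit Fortran integers. The kernel is passed as a function pointer so that the same harness can be run against a reference kernel and against the library under test.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Assumed Fortran-style SGEMM signature with 32-bit integers (LP64 BLAS build).
using sgemm_fn = int (*)(const char*, const char*, int*, int*, int*,
                         const float*, const float*, int*,
                         const float*, int*, float*, float*, int*);

// Hypothetical harness: one GEMM call laid out like the one quoted in the issue.
// Returns the nx-by-ny block of inner products (row-major).
std::vector<float> run_case(sgemm_fn gemm, int nx, int ny, int d) {
    assert(gemm != nullptr && nx > 0 && ny > 0 && d > 0);

    // Synthetic, deterministic inputs; the real data is not shown in the thread.
    std::vector<float> x(static_cast<std::size_t>(nx) * d);
    std::vector<float> y(static_cast<std::size_t>(ny) * d);
    for (std::size_t t = 0; t < x.size(); ++t) x[t] = static_cast<float>(t % 7) - 3.0f;
    for (std::size_t t = 0; t < y.size(); ++t) y[t] = static_cast<float>(t % 5) - 2.0f;

    std::vector<float> ip_block(static_cast<std::size_t>(nx) * ny, 0.0f);
    float one = 1.0f, zero = 0.0f;
    int nyi = ny, nxi = nx, di = d;

    // Same argument order as the call in exhaustive_inner_product_blas.
    gemm("Transpose", "Not transpose", &nyi, &nxi, &di, &one,
         y.data(), &di, x.data(), &di, &zero, ip_block.data(), &nyi);
    return ip_block;
}
```

To exercise OpenBLAS itself, one would declare the library symbol with a matching `extern "C"` prototype, link against the OpenBLAS build in question, and call `run_case(&sgemm_, 64, 64, 128)`. If that runs cleanly, the next thing to vary is the data and buffer handling on the FAISS side.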
