Fixed _mm256_set_m128 is only available on gcc8+. issue#5072 #5075

Merged 23 commits on Oct 10, 2023
Commits
7db5444
Add get_gpu_instance() function and Organized the instance class codes.
whyb Apr 17, 2023
e48623d
Move class __ncnn_vulkan_instance_holder declaration from gpu.h to g…
whyb Apr 17, 2023
1042565
Delete empty line changes
whyb Apr 17, 2023
a479422
Reimplement the sleep() and get_current_time() functions using modern…
whyb Apr 28, 2023
6eb3bb7
Fixed build error for __riscv not support c++ 11 thread
whyb Apr 28, 2023
ed932ba
Add NCNN_SIMPLESTL Macro
whyb May 5, 2023
00b9f28
Fix simple stl's compiler build error
whyb May 12, 2023
67ef283
Fix linux-gcc-cpp03-nostdio-nostring-simplestl build error
whyb May 12, 2023
d1e079c
Use u_int64_t type parameters in linux-gcc-cpp03-nostdio-nostring-sim…
whyb May 15, 2023
2ad85f0
change uint64_t&u_int64_t to unsigned long long int
whyb May 15, 2023
cec75bc
Remove include stdint.h and change function sleep() default paramete…
whyb May 16, 2023
85c15a0
apply code-format changes
whyb May 16, 2023
be921fd
Update benchmark.cpp
nihui May 16, 2023
b25ac10
Merge branch 'Tencent:master' into master
whyb May 30, 2023
98af54c
Merge branch 'Tencent:master' into master
whyb May 31, 2023
27ba97b
Merge branch 'Tencent:master' into master
whyb Jun 14, 2023
3110969
Merge branch 'Tencent:master' into master
whyb Jun 26, 2023
a1f84f8
Merge branch 'Tencent:master' into master
whyb Jun 26, 2023
76941c6
Merge branch 'Tencent:master' into master
whyb Aug 7, 2023
6cee97e
Merge branch 'Tencent:master' into master
whyb Oct 10, 2023
3f8557f
Fixed _mm256_set_m128 is only available on gcc8+. issue#5072
whyb Oct 10, 2023
8d76be5
apply code-format changes
whyb Oct 10, 2023
643816e
Remove comments
whyb Oct 10, 2023
6 changes: 3 additions & 3 deletions src/layer/x86/shufflechannel_x86.cpp
@@ -343,9 +343,9 @@ int ShuffleChannel_x86::forward(const Mat& bottom_blob, Mat& top_blob, const Opt
 for (int i = 0; i < size; i++)
 {
     __m256 _p0 = _mm256_loadu_ps(ptr0);
-    // macro `_mm256_loadu2_m128` is declared in Intel® Intrinsics Guide but somehow missed in <immintrin.h>
-    // __m256 _p1 = _mm256_loadu2_m128(ptr2, ptr1);
-    __m256 _p1 = _mm256_set_m128(_mm_loadu_ps(ptr2), _mm_loadu_ps(ptr1));
+
+    __m256 _p1 = _mm256_castps128_ps256(_mm_loadu_ps(ptr1));
+    _p1 = _mm256_insertf128_ps(_p1, _mm_loadu_ps(ptr2), 1);

     __m256 _lo = _mm256_unpacklo_ps(_p0, _p1);
     __m256 _hi = _mm256_unpackhi_ps(_p0, _p1);
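
For reviewers who want to check the replacement in isolation: `_mm256_set_m128(hi, lo)` puts `lo` in the lower 128 bits and `hi` in the upper 128 bits, and the cast-then-insert pair in this diff should produce the same layout using intrinsics that older GCC releases already ship. The sketch below is only an illustration under assumptions: `combine_halves_avx` and `SKETCH_TARGET_AVX` are hypothetical names that do not exist in ncnn, and the `target("avx")` attribute is there just so the snippet compiles without `-mavx` on GCC/Clang. It still needs an AVX-capable CPU to run.

```cpp
#include <cassert>
#include <immintrin.h>

// Assumption for this sketch only: enable AVX for this one function on
// GCC/Clang so the file builds without -mavx. MSVC needs no attribute.
#if defined(__GNUC__) || defined(__clang__)
#define SKETCH_TARGET_AVX __attribute__((target("avx")))
#else
#define SKETCH_TARGET_AVX
#endif

// Hypothetical helper (not part of ncnn): loads 4 floats from `lo` and
// 4 floats from `hi`, combines them into one 256-bit vector with `lo` in
// the lower lane and `hi` in the upper lane, and stores the 8 floats to `out`.
SKETCH_TARGET_AVX
static void combine_halves_avx(const float* lo, const float* hi, float* out)
{
    assert(lo && hi && out);

    // Widen the low half; the upper 128 bits are undefined after the cast.
    __m256 v = _mm256_castps128_ps256(_mm_loadu_ps(lo));

    // Overwrite the upper 128 bits (lane index 1) with the high half.
    v = _mm256_insertf128_ps(v, _mm_loadu_ps(hi), 1);

    _mm256_storeu_ps(out, v);
}
```

Calling it with `lo = {0, 1, 2, 3}` and `hi = {4, 5, 6, 7}` fills `out` with `0` through `7` in order, which matches the argument order used in the diff: `ptr1` is the low half and `ptr2` is the high half.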