Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add clang to AppVeyor #185

Open
nemequ opened this issue Feb 5, 2016 · 19 comments
Open

Add clang to AppVeyor #185

nemequ opened this issue Feb 5, 2016 · 19 comments
Labels

Comments

@nemequ
Copy link
Member

nemequ commented Feb 5, 2016

See https://groups.google.com/d/msg/squash-compression/qjrshDe-3WY/GmGEQncpAgAJ

@nemequ nemequ added the windows label Feb 5, 2016
@jibsen
Copy link
Collaborator

jibsen commented Feb 6, 2016

Just dumping some information while working on this.

There are two (official) installers for clang on Windows, one that integrates with Visual Studio, and one that works with mingw.

There is a bug in the 3.7.x release that makes the mingw version fail when linking C++ executables (missing __chkstk_ms). For some of the plugins, adding -lgcc to the linker flags may help.

There is another bug that prevents munit from compiling. It is due to clang trying to use stdatomic.h from GCC, but failing.

By disabling the following plugins: brotli, bsc, gipfeli, lzham, lzo, ms-compress, snappy, yalz77, zling, zpaq, and doing the workaround for munit, I got the rest to build with clang on mingw-w64, and the unit test appears to work.

@jibsen
Copy link
Collaborator

jibsen commented Feb 6, 2016

Installed the version that integrates with VS to see how that works. I did not look at the error messages, but simply disabled any plugin that did not build (brotli, gipfeli, libdeflate, lzham, lzma, ms-compress, snappy, yalz77, zlib-ng, zling, zpaq, zstd).

With those disabled, it built and the unit test runs.

@jibsen
Copy link
Collaborator

jibsen commented Feb 6, 2016

Okay, upon further investigation, most of the errors with clang/VS were due to yet another bug, where clang does not recognize what version of VS it runs on top of. Adding -fms-compatibility-version=19 to the compiler flags when using Visual C++ 2015 fixed that.

The remaining issues are libdeflate, lzham, lzma, and zstd using intrinsics without including intrin.h,
and gipfeli and zlib-ng have other erros I have not looked into yet. Also it looks like lzham crashes the threads unit test.

@nemequ
Copy link
Member Author

nemequ commented Feb 6, 2016

Yikes, what a mess. Great job investigating this, though.

I think it's pretty clear that the mingw version isn't ready yet; we can revisit that after 3.8 is released.

For the VS version, though, it's easy enough to add that flag in the AppVeyor configuration, and it would be good go get a jump on fixing those plugins (especially if it's just a matter of including intrin.h).

@jibsen
Copy link
Collaborator

jibsen commented Feb 7, 2016

@elliotwoods posted a link in #145 to a blog post about an experimental clang toolset called "Clang with Microsoft CodeGen", provided by MS in VS 2015 Update 1. While it seems a bit unstable, it could potentially be used to compile plugins that would be difficult to add MSVC support to.

Just wanted to note the changes I made under properties to get it to compile the DENSITY plugin:

  • Change the toolset to Clang. Apply.
  • Change Object File Name to $(IntDir)%(filename).obj (MSVC switch /Fo can take a folder, clang -o cannot)
  • Change Debug Information Format to Full Debug Information
  • Change Microsoft Compatibility Mode to Yes (presumably needed because CMake did the probing using MSVC)

@elliotwoods
Copy link

@jibsen - I followed your instructions and then it would in Debug. Thank you!

but to build with -O2 (or -Oanything), i needed to add to globals.h:

#define density_likely(x) x
//__builtin_expect(!!(x), 1)
#define density_unlikely(x) x
//__builtin_expect(!!(x), 0)

i.e. nullify the branch prediction
which.....might...make...things...not so fast(?)

@nemequ
Copy link
Member Author

nemequ commented Feb 8, 2016

MSVC doesn't support anything like __builtin_expect, not much anyone can do about that. My understanding is that branch prediction in modern x86/x86_64 CPUs very good, and the cost of a mispredicted branch isn't actually that bad. __builtin_expect is mostly useful for other architectures these days, which Windows doesn't support anyways.

That said, I know density isn't as fast on MSVC. I doubt branch prediction is a major factor, but if you're concerned about speed you really need to be doing your own benchmarking with a realistic workload for your application. It may very well be that another codec works much better for you; that's a big part of what makes Squash interesting… you can test a lot of codecs without changing your code.

Also, you should really define the macros as (x) not x. It's much safer.

@elliotwoods
Copy link

(note : this is using CodeGen, i.e. clang with Visual Studio, not MSVC compiler)
Good to know that branch prediction is optimised well on modern architectures/compilers

yes! we'll definitely test with other compressors in real world scenario
and yes! that's one of the main reasons for using squash here.

ok i'll do as you say with the macros

@jibsen
Copy link
Collaborator

jibsen commented Feb 8, 2016

Thanks for the heads up.

Like @nemequ says, MSVC does not have an equivalent intrinsic. I get an ICE for it:

2>    %85 = call i32 @llvm.expect.i32(i32 %84, i32 0), !dbg !558
2>d:\dump\squash\plugins\density\squash-density.c(223): fatal error C1001: An internal error has occurred in the compiler.
2>  (compiler file 'llvm-bridge.cpp', line 5667)

Certainly would have been preferable if they had simply ignored that intrinsic and emitted a warning.

@elliotwoods
Copy link

yes you're totally right. MSVC's port of the Clang compiler doesn't support it.
my bad, sorry for misreading

@elliotwoods
Copy link

Here's the results with my build.
image

Notes:

  • All are built with MSCV 2015 compiler except:
  • density is built with clang/CodeGen (with the branch hints removed)
  • zlib/zlib-ng are built with MinGW (the density build from MinGW failed to load)

Here we can see snappy (selected) comes out around the same compression ratio as density, but density is considerably faster (grid lines in y are each 50MBps, so it's about 125MBps faster)

Perhaps this isn't the right thread for this (i'm just continuing from the above). so feel free to move this if it's useful elsewhere

@jibsen
Copy link
Collaborator

jibsen commented Mar 11, 2016

So, Clang 3.8.0 is out, and I have been trying to get the various combinations of Clang with mingw-w64 and MSVC headers to build squash. The closest to success was the MSYS2 version of Clang, which works except for two minor issues:

Working around these two issues, squash builds and the unit tests run.

The installer from llvm.org using the mingw-w64 (GCC 5.3.0) headers and --target=x86_64-w64-mingw32 further had an issue with lzo using __int64 instead of long long (probably not expecting Clang on Windows).

The installer from llvm.org using the MSVC (Visual Studio 2015) headers and -fms-compatibility-version=19 had problems with a number of plugins using MSVC intrinsics without including <intrin.h> (libdeflate, lzma, zlib-ng, zstd), and gipfeli using std::min/max without including <algorithm> (C++11).

@nemequ
Copy link
Member Author

nemequ commented Mar 11, 2016

The µnit thing is definitely a problem… I'm willing to work around that in µnit, but I'm not sure why you would be hitting it. stdatomic.h should only be included in µnit if defined(__STDC_VERSION__) && (__STDC_VERSION__ >= 201112L) && !defined(__STDC_NO_ATOMICS__), but we compile in C99 mode…

For the other issues, it sounds like we can just disable the offending plugins until the issues are fixed.

@jibsen
Copy link
Collaborator

jibsen commented Mar 11, 2016

There does not appear to be any use of the -std= flag in the Clang build, so it is using C11 by default. I don't know if CMake implements the standard check as a "greater than or equal".

@nemequ
Copy link
Member Author

nemequ commented Mar 11, 2016

Damnit, cmake.

Well, we could do something like (untested):

diff --git a/munit.c b/munit.c
index 0607763..d76d06f 100644
--- a/munit.c
+++ b/munit.c
@@ -263,13 +263,13 @@ munit_malloc_ex(const char* filename, int line, size_t size) {
 #  endif
 #endif

-#if defined(HAVE_STDATOMIC)
+#if defined(HAVE_CLANG_ATOMICS)
+#  define ATOMIC_UINT32_T _Atomic uint32_t
+#  define ATOMIC_UINT32_INIT(x) (x)
+#elif defined(HAVE_STDATOMIC)
 #  include <stdatomic.h>
 #  define ATOMIC_UINT32_T _Atomic uint32_t
 #  define ATOMIC_UINT32_INIT(x) ATOMIC_VAR_INIT(x)
-#elif defined(HAVE_CLANG_ATOMICS)
-#  define ATOMIC_UINT32_T _Atomic uint32_t
-#  define ATOMIC_UINT32_INIT(x) (x)
 #elif defined(_WIN32)
 #  define ATOMIC_UINT32_T volatile LONG
 #  define ATOMIC_UINT32_INIT(x) (x)
@@ -280,14 +280,14 @@ munit_malloc_ex(const char* filename, int line, size_t size) {

 static ATOMIC_UINT32_T munit_rand_state = ATOMIC_UINT32_INIT(42);

-#if defined(HAVE_STDATOMIC)
-#  define munit_atomic_store(dest, value)         atomic_store(dest, value)
-#  define munit_atomic_load(src)                  atomic_load(src)
-#  define munit_atomic_cas(dest, expected, value) atomic_compare_exchange_weak(dest, expected, value)
-#elif defined(HAVE_CLANG_ATOMICS)
+#if defined(HAVE_CLANG_ATOMICS)
 #  define munit_atomic_store(dest, value)         __c11_atomic_store(dest, value, __ATOMIC_SEQ_CST)
 #  define munit_atomic_load(src)                  __c11_atomic_load(src, __ATOMIC_SEQ_CST)
 #  define munit_atomic_cas(dest, expected, value) __c11_atomic_compare_exchange_weak(dest, expected, value, __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST)
+#elif defined(HAVE_STDATOMIC)
+#  define munit_atomic_store(dest, value)         atomic_store(dest, value)
+#  define munit_atomic_load(src)                  atomic_load(src)
+#  define munit_atomic_cas(dest, expected, value) atomic_compare_exchange_weak(dest, expected, value)
 #elif defined(__GNUC__) && (__GNUC__ > 4) || (__GNUC__ == 4 && __GNUC_MINOR__ >= 7)
 #  define munit_atomic_store(dest, value)         __atomic_store_n(dest, value, __ATOMIC_SEQ_CST)
 #  define munit_atomic_load(src)                  __atomic_load_n(src, __ATOMIC_SEQ_CST)

@jibsen
Copy link
Collaborator

jibsen commented Jun 12, 2016

I tried the May 2016 update to Clang/C2, which includes Clang 3.8.0, along with CMake 3.6.0-rc1 which supports the toolset.

I still get a couple of ICE, and a few plugins do not work, but it definitely seems to be improving. And it is really nice to be able to build from CMake instead of having to modify the build settings in VS.

@jibsen
Copy link
Collaborator

jibsen commented Jun 13, 2016

I had a little time to look into this, and I am going to try to describe the current issues building squash with Clang/C2 (warning: wall of text incoming).

Starting out with a freshly cloned squash, we build with CMake:

mkdir build
cd build
cmake -G "Visual Studio 14 2015" -T v140_clang_3_7 ..
cmake --build . --config Debug

First error is:

    %11 = atomicrmw sub i32* %10, i32 1 seq_cst, !dbg !34
d:\dump\squash2\squash\squash\object.c(253): fatal error C1001: An internal error has occurred in the compiler. [D:\dump\squash2\squash\build\squash\squash0.8.vcxproj]
  (compiler file 'llvm-bridge.cpp', line 6772)

an ICE compiling squash/object.c(253), which calls the atomic builtin __sync_fetch_and_sub. We can work around this by using:

diff --git a/squash/object.c b/squash/object.c
index 14c6175..4007c2c 100644
--- a/squash/object.c
+++ b/squash/object.c
@@ -29,7 +29,7 @@
 #include <stdbool.h>
 #include <stddef.h>

-#if defined(__GNUC__) || defined(__clang__) || defined(__INTEL_COMPILER)
+#if defined(__GNUC__) || (defined(__clang__) && !defined(__c2__)) || defined(__INTEL_COMPILER)
 #  define squash_atomic_inc(var) __sync_fetch_and_add(var, 1)
 #  define squash_atomic_dec(var) __sync_fetch_and_sub(var, 1)
 #  define squash_atomic_cas(var, orig, val) __sync_val_compare_and_swap(var, orig, val)

Next error is in the gipfeli plugin, which uses std::min and std::max without including <algorithm> (PR: google/gipfeli/pull/10).

Then we have another ICE in the lzma plugin, I haven't investigated this further:

  '_mm_movemask_epi8': Intrinsic not yet implemented:
    %4 = call i32 @llvm.x86.sse2.pmovmskb.128(<16 x i8> %3)
D:\dump\squash2\squash\build\plugins\lzma\lz_encoder_mf.c: fatal error C1001: An internal error has occurred in the compiler. [D:\dump\squash2\squash\build\plugins\lzma\squash0.8-plugin-lzma.vcxproj]
  (compiler file 'llvm-bridge.cpp', line 5686)

An error in the lzo plugin (Clang issue reported):

D:\dump\squash2\squash\plugins\lzo\lzo\src/lzo_func.h(212,1): error: expected ';' after struct [D:\dump\squash2\squash\build\plugins\lzo\squash0.8-plugin-lzo.vcxproj]
  __lzo_byte_struct(lzo_memops_TU8_struct,8)
  ^
  D:\dump\squash2\squash\plugins\lzo\lzo\include\lzo/lzodefs.h(1729,84) :
     note: expanded from macro '__lzo_byte_struct'
  #  define __lzo_byte_struct(s,n)        __lzo_struct_packed(s) unsigned char
  a[n]; __lzo_struct_packed_end()

        ^
  D:\dump\squash2\squash\plugins\lzo\lzo\include\lzo/lzodefs.h(1715,52) :
     note: expanded from macro '__lzo_struct_packed_end'
  #  define __lzo_struct_packed_end()     } __pragma(pack(pop));
                                                     ^

We can work around this with:

diff --git a/include/lzo/lzodefs.h b/include/lzo/lzodefs.h
index 1535c1e..ea60d1d 100644
--- a/include/lzo/lzodefs.h
+++ b/include/lzo/lzodefs.h
@@ -1712,7 +1712,7 @@ extern "C" {
 #  define __lzo_struct_packed_ma_end()  } __lzo_may_alias __attribute__((__packed__));
 #elif (LZO_CC_INTELC_MSC) || (LZO_CC_MSC && (_MSC_VER >= 1300))
 #  define __lzo_struct_packed(s)        __pragma(pack(push,1)) struct s {
-#  define __lzo_struct_packed_end()     } __pragma(pack(pop));
+#  define __lzo_struct_packed_end()     }; __pragma(pack(pop))
 #elif (LZO_CC_WATCOMC && (__WATCOMC__ >= 900))
 #  define __lzo_struct_packed(s)        _Packed struct s {
 #  define __lzo_struct_packed_end()     };

An error in the zlib-ng plugin, which defines its own version of __builtin_ctzl (PR: Dead2/zlib-ng/pull/70).

And finally an ICE in munit.c(462) calling C11 atomic_compare_exchange_weak:

  'munit_rand_double': Unexpected atomic instruction -- use Windows interlock intrinsics
    %25 = cmpxchg weak i32* @munit_rand_state, i32 %23, i32 %24 seq_cst seq_cst, !dbg !86
d:\dump\squash2\squash\tests\munit\munit.c(462): fatal error C1001: An internal error has occurred in the compiler. [D:\dump\squash2\squash\build\tests\test-squash.vcxproj]
  (compiler file 'llvm-bridge.cpp', line 6772)

There are preprocessor checks starting at munit.c(272) that check for the presence of C11 and Clang atomics. Clang/C2 reports both available, but produces the ICE above when trying to use them. Until this is fixed we can work around it with:

diff --git a/munit.c b/munit.c
index c41cfd3..713f5b8 100644
--- a/munit.c
+++ b/munit.c
@@ -277,6 +277,11 @@ munit_malloc_ex(const char* filename, int line, size_t size) {
 #  endif
 #endif

+#if defined(__clang__) && defined(__c2__)
+#  undef HAVE_STDATOMIC
+#  undef HAVE_CLANG_ATOMICS
+#endif
+
 #if defined(HAVE_STDATOMIC)
 #  include <stdatomic.h>
 #  define ATOMIC_UINT32_T _Atomic uint32_t

With the mentioned fixes/workarounds, and disabling the lzma plugin for now, squash builds with Clang/C2.

Running the unit tests with ctest -V -C Debug results in the gipfeli plugin failing 12 tests (crashing in two). This may be related to intrinsics, since the tests pass when rebuilding the plugin with HAVE_BUILTIN_CTZ disabled. However, the plugin works with Clang targeting mingw-w64, so the issue appears to be with Clang/C2.

jibsen added a commit to nemequ/munit that referenced this issue Jun 15, 2016
Clang 3.8.0 on mingw-w64 advertises support for both C11 and Clang
atomics, but only Clang atomics work. With Clang/C2 neither works.

http://llvm.org/bugs/show_bug.cgi?id=26911
quixdb/squash#185
@jibsen
Copy link
Collaborator

jibsen commented Oct 15, 2017

It looks like Clang/C2 most likely will not get further updates (see this tweet and comments at this blog post).

@jibsen
Copy link
Collaborator

jibsen commented Dec 11, 2018

I've updated the AppVeyor configuration, and Clang (7.0.0) targeting both MSVC (clang-cl and NMake) and mingw-w64 GCC is building and the unit tests run (with the plugins disabled which have issues with Clang on Windows or Windows in general).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants