Add ARM64 Support to ParquetSharp #519

Draft
wants to merge 14 commits into
base: master
from

Conversation

JoshuaOloton
Contributor

Description

This PR adds support for building ParquetSharp on the ARM64 architecture. It modifies the build system and dependency management, and resolves compatibility issues specific to ARM64 platforms.

Checklist

  • Added a new PowerShell build script specifically for ARM64 Windows.
  • Updated vcpkg.json to include required dependencies for ARM64 builds.
  • Modified CMakeLists.txt to configure necessary dependencies for the ARM64 target.
  • Resolved all errors during the ARM64 build process.
  • Added build configuration to properly handle both Debug and Release builds on ARM64.

Related Issues

@JoshuaOloton JoshuaOloton marked this pull request as draft April 28, 2025 07:18
Contributor Author

@JoshuaOloton JoshuaOloton Apr 28, 2025

This new Windows PowerShell script was introduced to support ARM64 builds, primarily targeting arm64-windows triplets instead of the previous x64-windows-static triplet.
Some of the key changes introduced:

  • Since Arrow is now manually built from source, the script prompts users to set ARROW_HOME to the Arrow install directory, and ensures the debug/release sub-directories exist before continuing.
  • The build system was switched from the Visual Studio generator (-G "Visual Studio 17 2022" -A "x64") to the Ninja Multi-Config generator (-G "Ninja Multi-Config"), for faster and more flexible builds with better support on Windows ARM64.
  • Compiler change: clang-cl was chosen as the compiler (CMAKE_C_COMPILER=clang-cl, CMAKE_CXX_COMPILER=clang-cl).
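
For illustration only, the same settings could also be captured on the CMake side as an initial-cache script passed with `cmake -C`. This is a sketch of an alternative, not the PowerShell script from this PR: the file name is hypothetical, and the sub-directory names `debug` and `release` under ARROW_HOME are an assumption based on the description above.

```cmake
# arm64-windows-cache.cmake (hypothetical file name, not part of this PR).
# Possible invocation, with the generator still chosen on the command line:
#   cmake -C arm64-windows-cache.cmake -G "Ninja Multi-Config" -S . -B build

# Fail early if ARROW_HOME is not set in the environment.
if(NOT DEFINED ENV{ARROW_HOME})
  message(FATAL_ERROR "Set ARROW_HOME to the Arrow install directory first.")
endif()

# Normalise backslashes so the path can be composed safely.
file(TO_CMAKE_PATH "$ENV{ARROW_HOME}" arrow_home)

# Assumed layout: one sub-directory per build configuration.
foreach(sub debug release)
  if(NOT IS_DIRECTORY "${arrow_home}/${sub}")
    message(FATAL_ERROR "Expected directory not found: ${arrow_home}/${sub}")
  endif()
endforeach()

# Compiler and triplet choices described in the comment above.
set(CMAKE_C_COMPILER clang-cl CACHE STRING "C compiler")
set(CMAKE_CXX_COMPILER clang-cl CACHE STRING "C++ compiler")
set(VCPKG_TARGET_TRIPLET arm64-windows CACHE STRING "vcpkg target triplet")
```

Keeping these values in a cache script would keep the command line short, but the generator itself cannot be set this way and still has to be passed with -G.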

Contributor Author

This update was made to address the following warning:

[48/56] Building CXX object cpp\CMakeFiles\ParquetSharpNative.dir\arrow\FileWriter.cpp.obj
C:\Users\JoshuaOloton\Desktop\ParquetSharp\cpp\arrow\FileWriter.cpp(118,43): warning: 'NewRowGroup' is deprecated: Deprecated in 19.0.0. Use NewRowGroup() without the `chunk_size` argument. [-Wdeprecated-declarations]
  118 | 	TRYCATCH(PARQUET_THROW_NOT_OK(writer->NewRowGroup(chunk_size));)
  	|                                       	^
C:\Program Files (x86)\arrow\include\parquet/arrow/writer.h(95,3): note: 'NewRowGroup' has been explicitly marked deprecated here
   95 |   ARROW_DEPRECATED(
  	|   ^
C:\Program Files (x86)\arrow\include\arrow/util/macros.h(140,35): note: expanded from macro 'ARROW_DEPRECATED'
  140 | #  define ARROW_DEPRECATED(...) [[deprecated(__VA_ARGS__)]]
  	|                               	^
1 warning generated.

Contributor Author

I refactored the native CMake configuration to reference the locally built Arrow and Parquet libraries directly. Instead of using find_package, it now specifies the include directories and library paths for Debug and Release builds via the ARROW_ROOT_DEBUG and ARROW_ROOT_RELEASE variables.
This ensures the project links against the Arrow build compiled from source rather than another copy found on the system.
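
A minimal sketch of what such a configuration might look like is shown below. It is not the actual CMakeLists.txt from this PR: the `LocalArrow` target name, the `include`/`lib` layout, and the `arrow.lib`/`parquet.lib` file names are assumptions. Only ARROW_ROOT_DEBUG, ARROW_ROOT_RELEASE, and the ParquetSharpNative target come from this conversation.

```cmake
# Hypothetical fragment; assumes ParquetSharpNative is already defined
# by add_library() earlier in the same CMakeLists.txt.

# Check that both Arrow roots were supplied and exist.
foreach(cfg DEBUG RELEASE)
  if(NOT IS_DIRECTORY "${ARROW_ROOT_${cfg}}")
    message(FATAL_ERROR
      "ARROW_ROOT_${cfg} is not a directory: '${ARROW_ROOT_${cfg}}'")
  endif()
endforeach()

# Interface target that carries per-configuration paths.
# Generator expressions are needed because Ninja Multi-Config picks
# the configuration at build time, not at configure time.
add_library(LocalArrow INTERFACE)

target_include_directories(LocalArrow INTERFACE
  "$<$<CONFIG:Debug>:${ARROW_ROOT_DEBUG}/include>"
  "$<$<NOT:$<CONFIG:Debug>>:${ARROW_ROOT_RELEASE}/include>")

# Library file names below are assumed, not taken from the PR.
target_link_libraries(LocalArrow INTERFACE
  "$<$<CONFIG:Debug>:${ARROW_ROOT_DEBUG}/lib/arrow.lib>"
  "$<$<CONFIG:Debug>:${ARROW_ROOT_DEBUG}/lib/parquet.lib>"
  "$<$<NOT:$<CONFIG:Debug>>:${ARROW_ROOT_RELEASE}/lib/arrow.lib>"
  "$<$<NOT:$<CONFIG:Debug>>:${ARROW_ROOT_RELEASE}/lib/parquet.lib>")

target_link_libraries(ParquetSharpNative PRIVATE LocalArrow)
```

In this sketch every non-Debug configuration falls back to the Release Arrow build, which is one possible choice rather than something the PR specifies.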

@JoshuaOloton
Contributor Author

Hello @adamreeve
I'm facing a blocker running the tests after building successfully on Windows ARM64. Multiple test cases fail with the following error:

System.DllNotFoundException : Unable to load DLL 'ParquetSharpNatived' or one of its dependencies: The specified module could not be found. (0x8007007E)
Stack Trace (example):
   at ParquetSharp.WriterPropertiesBuilder.WriterPropertiesBuilder_Create(IntPtr& builder)
   at ParquetSharp.WriterPropertiesBuilder..ctor() in C:\Users\mlhuser\Desktop\ParquetSharp\csharp\WriterPropertiesBuilder.cs:line 18
   ...

I’ll keep digging into it, but please let me know if there’s anything obvious I might be missing in my build setup.

@adamreeve
Contributor

Are you seeing those errors with .NET Framework? There's an older way to configure the DLL location used by .NET Framework that you might need to set up, see the csharp/ParquetSharp.targets file.

@adamreeve
Contributor

adamreeve commented May 6, 2025

Looking at this a bit more closely, I see that it relies on having a pre-built version of Arrow. Do you have a script you can share or add to this PR that shows how to build Arrow on arm64?

This PR is a very helpful step towards getting full Windows arm64 support in ParquetSharp, but there are a few more steps before we can properly integrate this. We'll need to be able to build Arrow as well as the ParquetSharp native library in our CI process using the same Arrow C++ version as other platforms, and ideally need minimal work to maintain this process as new Arrow C++ versions are released.

One way to do this might be to use a custom vcpkg triplet that implements any customisation required to build Arrow on arm64, without needing to build it separately from vcpkg. E.g. you can use the PORT MATCHES condition in a custom triplet file to set options for a specific port. See https://learn.microsoft.com/en-us/vcpkg/users/triplets#per-port-customization. Or, if more complex changes are required, we could use a port override instead to replace the whole Arrow port, but it would be good to avoid that if possible.
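
As a rough illustration of the per-port approach, a custom triplet file might look like the sketch below. The file name is hypothetical, and the Arrow option shown is only a placeholder: which options, if any, are actually needed for arm64 is not established in this conversation.

```cmake
# arm64-windows-custom.cmake (hypothetical triplet file name).
# The first three settings mirror vcpkg's built-in arm64-windows triplet.
set(VCPKG_TARGET_ARCHITECTURE arm64)
set(VCPKG_CRT_LINKAGE dynamic)
set(VCPKG_LIBRARY_LINKAGE dynamic)

# Per-port customisation: only applies while vcpkg builds the arrow port.
if(PORT MATCHES "^arrow$")
  # Placeholder option for illustration; replace with whatever the
  # arm64 build of Arrow turns out to require.
  set(VCPKG_CMAKE_CONFIGURE_OPTIONS "-DARROW_SIMD_LEVEL=NONE")
endif()
```

A directory containing a file like this could be passed to vcpkg with `--overlay-triplets`, so the Arrow version would still come from the same vcpkg baseline as the other platforms.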

@JoshuaOloton
Contributor Author

JoshuaOloton commented May 6, 2025

Thanks @adamreeve, that seems like a much cleaner and more maintainable approach long-term.
Yes, it currently relies on an already built version of Arrow. I’ll share the script I used to build Arrow and its dependencies on ARM64. I'll also look into implementing the custom vcpkg triplet as you suggested.

@JoshuaOloton
Contributor Author

Are you seeing those errors with .NET Framework? There's an older way to configure the DLL location used by .NET Framework that you might need to set up, see the csharp/ParquetSharp.targets file.

Thanks! I think I may have spotted the issue. I’ll add a new Content block in ParquetSharp.targets specifically for Windows ARM64 under .NET Framework, and I’ll report back once I’ve tested it.

Development

Successfully merging this pull request may close these issues.

[FEATURE REQUEST]: Windows ARM Support
2 participants