Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(ci): support building python on windows #1885

Open
wants to merge 16 commits into
base: main
Choose a base branch
from

Conversation

An-DJ
Copy link
Contributor

@An-DJ An-DJ commented Oct 14, 2024

What does this PR do?

Support building python on Windows.

  1. Use (git) bash to run python steps on Windows (MYSY, MINGW)
  2. Add missing header
  3. Rename FuryLogLevel::ERROR to FuryLogLevel::ERR. I don't know why this enum make build failed, but it is successed if renaming it to FuryLogLevel::ERR
  4. Rename pyx built dynamic lib name '*.so' to '*.pyd'

Related issues

Close #798

Does this PR introduce any user-facing change?

  • Does this PR introduce any public API change?
  • Does this PR introduce any binary protocol compatibility change?

Benchmark

@chaokunyang
Copy link
Collaborator

Hi @An-DJ , it this PR ready for review?

@An-DJ
Copy link
Contributor Author

An-DJ commented Oct 31, 2024

Hi @An-DJ , it this PR ready for review?

No, CI result is fake, the build script didn't have been executed successfully. We need more works on windows platform...

I will continue this PR in the next few days (too busy in this week...)

Signed-off-by: Junduo Dong <[email protected]>
I don't know why this enum make build failed...
but successed if renaming it to FuryLogLevel::ERR

Signed-off-by: Junduo Dong <[email protected]>
* pyfury/_serialization.pyd
* pyfury/_util.pyd
* pyfury/format/_format.pyd
* pyfury/lib/mmh3/mmh3.pyd

Signed-off-by: Junduo Dong <[email protected]>
Signed-off-by: Junduo Dong <[email protected]>
Signed-off-by: Junduo Dong <[email protected]>
Signed-off-by: Junduo Dong <[email protected]>
Signed-off-by: Junduo Dong <[email protected]>
@An-DJ
Copy link
Contributor Author

An-DJ commented Nov 2, 2024

@chaokunyang This UT is strange.

@pytest.mark.parametrize("language", [Language.XLANG, Language.PYTHON])
def test_array_serializer(language):
    fury = Fury(language=language, ref_tracking=True, require_class_registration=False)
    for typecode in PyArraySerializer.typecode_dict.keys():
        arr = array.array(typecode, list(range(10)))

        # additional code to print array attribute
        print(arr.itemsize, arr.typecode)
        # end

        assert ser_de(fury, arr) == arr
    for dtype in Numpy1DArraySerializer.dtypes_dict.keys():
        arr = np.array(list(range(10)), dtype=dtype)
        new_arr = ser_de(fury, arr)
        assert np.array_equal(new_arr, arr)
        np.testing.assert_array_equal(new_arr, arr)

I test it on Ubuntu 20.04 and Windows 10, but there are different result:

# Ubuntu

pyfury/tests/test_serializer.py::test_array_serializer[Language.XLANG] 2 h
4 i
8 l
4 f
8 d
PASSED
pyfury/tests/test_serializer.py::test_array_serializer[Language.PYTHON] 2 h
4 i
8 l
4 f
8 d
PASSED
# Windows

=================== 1 failed, 1 passed, 3 warnings in 0.88s ===================
FAILED         [ 50%]
2 h
4 i
4 l

The point is that l type could be 4 or 8 bit on different platform.

In addition, in _util.pyx, there are also failed converted between const int64_t and long:

cdef inline uint8_t* get_address(v):
    view = memoryview(v)
    cdef str dtype = view.format
    cdef:
       # ...
        const int64_t[:] signed_long_data
       # ...
        uint8_t* ptr
    if dtype == "b":
        signed_char_data = v
        ptr = <uint8_t*>(&signed_char_data[0])
    elif dtype == "l":  # here
        signed_long_data = v
        ptr = <uint8_t*>(&signed_long_data[0])
    else:
        raise Exception(f"Unsupported buffer of type {type(v)} and format {dtype}")
    return ptr

Could you plz give me some advice?

@An-DJ An-DJ marked this pull request as ready for review November 2, 2024 16:26
@chaokunyang
Copy link
Collaborator

chaokunyang commented Nov 2, 2024

Is Windows a big endian machine?

@An-DJ
Copy link
Contributor Author

An-DJ commented Nov 3, 2024

Is Windows a big endian machine?

You means the github windows-2022 ci runner is hosted on a big endian machine?

No. According to the related document, for Linux and Windows runners, GitHub uses Dadsv5-series virtual machines. For more information, see Dasv5 and Dadsv5-series in the Microsoft Azure documentation.

image

AMD EPYC 7763v (Genoa) [x86-64] support small endian only.

@An-DJ
Copy link
Contributor Author

An-DJ commented Nov 3, 2024

@chaokunyang Actually we check the python array initialization method array.array with different dtype, but each type code has its Mininum size in bytes, not a fixed size.

IMHO it is the reason why it behaves differently on UNIX and Windows.

Ref: https://docs.python.org/3/library/array.html

image

We should find a strict way to fix each dtype size... right?

@chaokunyang
Copy link
Collaborator

Yeah, you are right. Fury take l as 4 bytes, but it may be 8 bytes on windows. In such cases, we may need to expand every item from 4 bytes value to 8 bytes in a loop manually.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[C++] Windows build support
2 participants