Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[DRAFT] CI: Enable testing on Windows #388

Draft
wants to merge 26 commits into
base: master
Choose a base branch
from
Draft

Conversation

amotl
Copy link
Member

@amotl amotl commented Nov 3, 2020

This is just a WIP based on #386. Let's see how it goes.

@amotl
Copy link
Member Author

amotl commented Nov 3, 2020

CrateDB nightlies are not available for macOS and Windows, see [1], bummer. Let's mitigate that by only adjusting the CrateDB version to "nightly" on Linux through c99d01d.

[1] https://github.com/crate/crate-python/pull/388/checks?check_run_id=1347098559#step:4:124

@amotl amotl force-pushed the amo/tests-multi-os-ci branch 2 times, most recently from 9459866 to 8f84461 Compare November 3, 2020 11:41
@amotl
Copy link
Member Author

amotl commented Nov 3, 2020

We are seeing An existing connection was forcibly closed by the remote host again [1], like #386 (comment).

Error: Error downloading extends for URL https://cdn.crate.io/downloads/releases/cratedb/x64_windows/crate-4.3.0.zip: [WinError 10054] An existing connection was forcibly closed by the remote host
Error: Process completed with exit code 1.

[1] https://github.com/crate/crate-python/pull/388/checks?check_run_id=1347231174#step:4:115

@amotl
Copy link
Member Author

amotl commented Nov 3, 2020

While making some progress on this, the last command bin/coverage run $test_program -vv1 errors out on Windows like

No file to run: 'bin/test-quick'

or

No file to run: 'bin\test-quick'

or

No file to run: 'D:\a\crate-python\crate-python\bin\test-quick'

See also [1], [2] and [3].

This is somehow related to

how the system path and python path are handled on various operating systems Windows [4].

[1] https://github.com/crate/crate-python/pull/388/checks?check_run_id=1347349770#step:5:16
[2] https://github.com/crate/crate-python/runs/1347349770?check_suite_focus=true#step:5:16
[3] https://github.com/crate/crate-python/pull/388/checks?check_run_id=1347464565#step:5:16
[4] https://stackoverflow.com/questions/3312451/how-can-you-get-unittest2-and-coverage-py-working-together

@amotl
Copy link
Member Author

amotl commented Nov 3, 2020

The reason that bin/coverage has not been able to find bin/test-quick is that it gets installed as bin/test-quick.exe on Windows. However, addressing that file obviously will not work as intended [1]:

Couldn't run 'bin/test-quick.exe' as Python code: SyntaxError: invalid or missing encoding declaration

[1] https://github.com/crate/crate-python/pull/388/checks?check_run_id=1347537748#step:5:17

@amotl
Copy link
Member Author

amotl commented Nov 3, 2020

After some more adjustments, the test suite also starts on Windows. Now, it croaks with AttributeError: module 'time' has no attribute 'clock' from sqlalchemy.util.compat [1], at least with sqlalchemy-1.1.18 on Python 3.8.

Traceback (most recent call last):
  File "D:\a\crate-python\crate-python\src\crate\client\tests.py", line 39, in <module>
    from crate.client.sqlalchemy.dialect import CrateDialect
  File "D:\a\crate-python\crate-python\src\crate\client\sqlalchemy\__init__.py", line 22, in <module>
    from .dialect import CrateDialect
  File "D:\a\crate-python\crate-python\src\crate\client\sqlalchemy\dialect.py", line 25, in <module>
    from sqlalchemy import types as sqltypes
  File "D:\a\crate-python\crate-python\eggs\sqlalchemy-1.1.18-py3.8-win-amd64.egg\sqlalchemy\__init__.py", line 9, in <module>
    from .sql import (
  File "D:\a\crate-python\crate-python\eggs\sqlalchemy-1.1.18-py3.8-win-amd64.egg\sqlalchemy\sql\__init__.py", line 8, in <module>
    from .expression import (
  File "D:\a\crate-python\crate-python\eggs\sqlalchemy-1.1.18-py3.8-win-amd64.egg\sqlalchemy\sql\expression.py", line 33, in <module>
    from .visitors import Visitable
  File "D:\a\crate-python\crate-python\eggs\sqlalchemy-1.1.18-py3.8-win-amd64.egg\sqlalchemy\sql\visitors.py", line 28, in <module>
    from .. import util
  File "D:\a\crate-python\crate-python\eggs\sqlalchemy-1.1.18-py3.8-win-amd64.egg\sqlalchemy\util\__init__.py", line 8, in <module>
    from .compat import callable, cmp, reduce,  \
  File "D:\a\crate-python\crate-python\eggs\sqlalchemy-1.1.18-py3.8-win-amd64.egg\sqlalchemy\util\compat.py", line 172, in <module>
    time_func = time.clock
AttributeError: module 'time' has no attribute 'clock'

[1] https://github.com/crate/crate-python/pull/388/checks?check_run_id=1347946253#step:5:42


According to sqlalchemy/sqlalchemy#5305, sqlalchemy/sqlalchemy#4731 and sqlalchemy/sqlalchemy#5444, it looks like this is deprecated.

At sqlalchemy/sqlalchemy#5305 (comment), @zzzeek recommends to

upgrade to a recent version of SQLAlchemy.

@amotl
Copy link
Member Author

amotl commented Nov 3, 2020

Introduction

There's a problem spinning up CrateDB on Windows.

Problem

Now, there's an error when trying to bootstrap CrateDB on Windows [1]. It goes like

b'[2020-11-03T15:11:44,184][WARN ][o.e.b.ElasticsearchUncaughtExceptionHandler] [crate] uncaught exception in thread [main]'
b'org.elasticsearch.bootstrap.StartupException: java.lang.IllegalStateException: failed to create a child event loop'
[...]
b'Caused by: java.lang.IllegalStateException: failed to create a child event loop'
b'at io.netty.util.concurrent.MultithreadEventExecutorGroup.<init>(MultithreadEventExecutorGroup.java:88) ~[netty-common-4.1.53.Final.jar:4.1.53.Final]'
[...]
b'Caused by: io.netty.channel.ChannelException: failed to open a new selector'
[...]
b'Caused by: java.io.IOException: Unable to establish loopback connection'
[...]
b'Caused by: java.net.SocketException: Unrecognized Windows Sockets error: 10106: socket'
[...]

java.lang.IllegalStateException: failed to create a child event loop

It might be related to the same thing happening to ES [2] or Netty in general [3]. Comments within both threads seem to suggest to turn off the firewall. See also netty/netty#7243.

Maybe [4] is related to this?

GitHub hosts Linux and Windows runners on Standard_DS2_v2 virtual machines in Microsoft Azure with the GitHub Actions runner application installed. And due to security policy Azure blocks ICMP by default. Hence, you cannot get ICMP answer in workflow.

[1] https://github.com/crate/crate-python/pull/388/checks?check_run_id=1348166824#step:5:212
[2] https://discuss.elastic.co/t/elasticsearch-fails-to-start-with-java-lang-illegalstateexception-failed-to-create-child-event-loop/218797
[3] https://stackoverflow.com/questions/27506788/failed-to-create-a-child-event-loop
[4] https://github.community/t/what-is-the-default-firewall-for-github-actions-runners/17732

java.io.IOException: Unable to establish loopback connection

That seems to be related to IPv6 somehow. People suggest to prefer the IPv4 stack using export _JAVA_OPTIONS="-Djava.net.preferIPv4Stack=true". However, there might be other root causes.

java.net.SocketException: Unrecognized Windows Sockets error: 10106: socket

This issue seems to be well known to the community and has been around for quite some time, dating back to Java 1.4. Most probably, it is a socket port conflict.

While SO #37643301 says

WinSock error 10106 is WSAEPROVIDERFAILEDINIT: "Service provider failed to initialize. The requested service provider could not be loaded or initialized. This error is returned if either a service provider's DLL could not be loaded (LoadLibrary failed) or the provider's WSPStartup or NSPStartup function failed."

GSE #163543 says

java.net.SocketException: Unrecognized Windows Sockets error: 10106: create means you are failing to open the port. Usually this has one of two reasons: either you (or any program you installed) blocked the port to protect you, or any program you have got is using the Port right now.

@amotl
Copy link
Member Author

amotl commented Nov 5, 2020

java.net.SocketException: Unrecognized Windows Sockets error: 10106: socket

All of these didn't bring any improvements:

# Disable firewall.
netsh advfirewall set allprofiles state off

# Reset IP stack.
netsh winsock reset

# Disable IPv6.
netsh interface teredo set state disabled
netsh interface ipv6 6to4 set state state=disabled undoonstop=disabled
netsh interface ipv6 isatap set state state=disabled

# Run Java in headless mode and prefer IPv4 stack.
set _JAVA_OPTIONS="-Djava.awt.headless=true -Djava.net.preferIPv4Stack=true"

@amotl
Copy link
Member Author

amotl commented Nov 5, 2020

java.net.SocketException: Unrecognized Windows Sockets error: 10106: socket

Some resources suggest to set the SystemRoot environment variable to C:\Windows:

Indeed: With 4c05db6, CrateDB seems to be able to spin up successfully.

@amotl
Copy link
Member Author

amotl commented Nov 5, 2020

Now, we are seeing this:

Error in test D:\a\crate-python\crate-python\src\crate\client\sqlalchemy\doctests\itests.txt
Traceback (most recent call last):
  File "C:\hostedtoolcache\windows\Python\3.6.8\x64\lib\unittest\case.py", line 59, in testPartExecutor
    yield
  File "C:\hostedtoolcache\windows\Python\3.6.8\x64\lib\unittest\case.py", line 601, in run
    self.setUp()
  File "C:\hostedtoolcache\windows\Python\3.6.8\x64\lib\doctest.py", line 2167, in setUp
    self._dt_setUp(test)
  File "D:\a\crate-python\crate-python\src\crate\client\tests.py", line 160, in setUpCrateLayerAndSqlAlchemy
    setUpWithCrateLayer(test)
  File "D:\a\crate-python\crate-python\src\crate\client\tests.py", line 148, in setUpWithCrateLayer
    cursor.execute("copy locations from ?", (data_path,))
  File "D:\a\crate-python\crate-python\src\crate\client\cursor.py", line 54, in execute
    bulk_parameters)
  File "D:\a\crate-python\crate-python\src\crate\client\http.py", line 396, in sql
    content = self._json_request('POST', self.path, data=data)
  File "D:\a\crate-python\crate-python\src\crate\client\http.py", line 523, in _json_request
    _raise_for_status(response)
  File "D:\a\crate-python\crate-python\src\crate\client\http.py", line 207, in _raise_for_status
    error_trace=error_trace)
crate.client.exceptions.ProgrammingError: SQLParseException[Illegal character in opaque part at index 2: D:\a\crate-python\crate-python\src\crate\testing\testdata\data\test_a.json]

Regarding this, we are lucky @seut already suggested at crate/crate#7781 to

Use a valid file URI on windows, see e.g.: https://blogs.msdn.microsoft.com/ie/2006/12/06/file-uris-in-windows/.

The place to adjust that is:

data_path = docs_path('testing/testdata/data/test_a.json')
# load testing data into crate
cursor.execute("copy locations from ?", (data_path,))

@amotl amotl changed the title Enable CI-testing on macOS and Windows CI: Enable testing on Windows May 25, 2022
amotl added 3 commits May 25, 2022 23:00
In this manner, the outcome can be better inspected regarding deviances
and flaky behavior; specifically when so many things on the CI
environment setup is being changed.

It can be reverted again later.
@amotl amotl changed the title CI: Enable testing on Windows [DRAFT] CI: Enable testing on Windows Oct 13, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant