Run CI tests on 3.12 and multiple operating systems #22

EliahKagan · 2023-09-17T07:13:50Z

This expands the CI matrix in test.yml by:

Testing Python 3.12 in addition to 3.8, 3.9, 3.10, and 3.11. Currently Python 3.12 is in release candidate 2. Testing with it before the stable 3.12.0 release comes out is encouraged.
Testing macOS and Windows as well as Ubuntu. I would advocate this as a general practice, but there is also a specific reason that applies for wasm_exec, in that wasmtime includes platform-specific code. Since wasmtime's documentation recommends it be updated monthly, I think it's useful to be able to automate this as much as possible. Doing more testing, as introduced here, would give greater confidence when adding such automation later. (For example, wasmtime updates could be taken using Dependabot, by issuing @dependabot merge on a Dependabot PR, to cause the update to be merged only if, and after, all checks have passed, including running all tests on all supported version of Python on three operating systems.)

A few important things this PR does not do:

This is completely independent of what Python version the Python WASM runtime uses. Currently this remains, as you have noted in the readme, 3.11-only. (From this section of the Wasm Labs page, it seems VMWare may release a 3.12 runtime at some point after a stable 3.12 release comes out. This PR does not cover that at all.)
I did not expand the matrix in the linting workflow. I'm not sure it's necessary to lint on multiple versions at all, much less operating systems. Either way, though, I think linting is far less likely to break in a new version, and also that the effect of breaking linting is much less severe in terms of what bugs in reveals. If continuing to lint on multiple versions, then 3.12 should be added once the stable 3.12.0 is released.
I did not do anything related to updating Python or GitHub Actions dependencies. Although I think part of the benefit of expanding the test matrix is to provide greater confidence in doing regular updates with Dependabot, or Renovatebot, etc., it seems to me that this PR would be too broadly scoped if it tried to do anything with that. The considerations when reviewing those kinds of changes would, I think, be largely different from the ones here. Furthermore, I think the benefit of more automated testing can stand by itself.

Although I think it remains to be seen if the complexity of caching for poetry across CI runs is worthwhile for this repository, I suspect it very well may be worthwhile, since it's possible that, in the future, extended testing may include testing the effect of changes on specific applications or libraries like langchain, some of which are themselves sizable once their dependencies are accounted for. Therefore, I have kept, and tried to make sure not to break, the caching logic. Caching doesn't seem to be working on my feature branch for the newly introdued operating systems, but based on Actions/cache: Cache not being hit despite of being present I think that is normal and hope it will simply work once those new caches are created on main (if you choose to merge this PR). However, separately from that, I have had to make a couple of changes to actions/poetry_setup/action.yml:

Added a python-allow-prereleases option to allow a value to be passed through to the setup-python action's allow-prereleases key. Either this or hard-coding true is needed to install a prerelease. I think adding an option is better. Furthermore, while the logic that special-cases 3.12 should, it seems to me, be removed from test.yml soon after 3.12.0 comes out, the python-allow-prereleases option added to poetry_setup could be reused in the future (if poetry_setup is still in use at that time).
Installed poetry with pip instead of pipx, because pipx (and, as a result, poetry) do not find the correct interpreter on macOS and Windows GitHub Actions runners without significant effort. This is undesirable, but I believe the alternatives are worse. Commands that pass the Python version to pipx, as that workflow was doing, tend not to work very well on the macOS and Windows runners. The exact path to the Python executable is not made directly available by setup-python, and while the installation directory is available, the relative path to the interpreter varies by operating system. If other Python implementations such as PyPy are tested in the future, the situation becomes even worse in that regard. Furthermore, because the Windows runner has Python and pipx, what seems like it works will often instead just everything with the system Python instead of the one setup-python installed. Using pip eliminates all these problems with no added complexity for selecting the correct version, because setup-python provides pip on all platforms.

I have also made some other changes that seem to me to be useful in light of all the above changes:

Tweak the style of test names, so they are less likely to be cut off in the GitHub Actions web-based interface. This counterbalances the lengthening effect of operating system names being included in the job names. The job names are now fully automatically generated, rather than through interpolating ${{ }} expressions. I changed the names of the job keys, e.g. build to test in test.yml, so that the generated names would be good. I changed the values associated with the workflow name keys to be capitalized, so they would be styled differently from job names.
Specify a newer version of poetry, 1.6.1, which is currently the latest. I was unsure if I should do that in this PR or separately, but I decided to do it here because, as of 1.6.0, there are performance improvements that I think are worth having when expanding CI. (The runners run in parallel, for the most part, but the macOS and Windows runners tend to run Python code slower, including for pip and poetry.)
Small cleanup. For example, the test workflow, unlike the others, specifies the Poetry version with the poetry-version key passed to actions/poetry_setup/action.yml, but it also had a POETRY_VERSION environment variable defined at the top that had no effect, but appeared to control the version.
Run test and lint jobs on all branches, rather than just main and branches that have an open pull request targeting main. Although this has the disadvantage that it will double up CI checks on PRs that originate in this upstream repository (i.e., that you open yourself), I think that is outweighed by the advantage that, for anyone who has enabled GitHub Actions in their fork, CI checks will run even without, or before, a pull request is opened. This makes trying out changes one is not sure will work out and justify a PR easier. This was the first thing I did (it is in the first commit here) and I found it very valuable because I was able to test various approaches before even opening this pull request, and to do so without requiring any added steps.

However, some of that is subjective! I would be pleased to make any requested changes to this PR (and/or please feel free to push commits).

This makes pushes trigger CI checks even when they are not to the main branch or in a PR opened against the main branch. This way, CI will work in forks once enabled in the Actions tab (otherwise the changes have to be on the "main" branch of the fork, or a PR is needed).

Unlike in the two other workflows, the test workflow specifies the poetry version with the poetry-version key in the poetry_setup action configuration.

This includes performance improvements for installing, as detailed in https://python-poetry.org/history/#160---2023-08-20.

This is not really a bugfix, because it's possible to infer the correctness of the unquoted expansions in the cases where they appear in the CI workflows. However, quoting them makes the intent clear: we want the literal text produced by the expansion, and it is not intenteded that it be changed by word splitting and globbing (filename expansion) done by the shell.

This allows a prerelease for 3.12, which is currently at RC2 so it will not install if a prerelease is not allowed, while continuing to prohibit installing a prerelease for earlier versions. Because 3.12 is at RC2, the likelihood of test failures on it due to bugs in 3.12 is fairly low. But in any situation where tests do fail on 3.12, it would be necessary to compare the reults to those of at least one other release, to rule this out. So I have also set continue-on-error to true for 3.12 so that failure of a 3.12 job will not cancel the other jobs.

Going along with this, to make it so good names are generated: - Change job key names from "build" to be descriptive/intuitive. - Capitalize workflow names to avoid confusion with the new job names, so workflow and job names are styled differently in CI output. This change may produce a minor improvement in clarity right now, but the real benefit is for the forthcoming expansion of the test matrix to include multiple operating systems, where the terser and more common style used to distinguish different generated jobs from the same matrix will be more readable. (Longer names produced from generated jobs are hard to distinguish because the full name would appear cut off in some places in the GitHub Actions web interface.)

This extends the CI test matrix with an "os" dimension, so it tests on Ubuntu (as before), macOS, and Windows. It tests all five Python version on each, so the matrix now has 15 jobs. Building the value for "os" by appending "-latest" to names like "ubuntu", rather than just using the full names ("ubuntu-latest"), is less idiomatic. However, it helps keep the generated matrix job names narrow enough not to be cut off in the GitHub Actions interface, for matrices like this one that have a third dimension. (Currently, "core" is the only value for "test-type", but it is, and probably should, be kept in the names, thus lengthening them.)

This is imperfect, but it allows the correct version of Python to be installed and used with minimal complexity and special casing. Furthermore, because poetry is being used and the project is being installed in a virtual environment, polluting the global Python environment with poetry itself and its dependencies is unlikely to hide unstated dependency bugs or cause conflicts. On Ubuntu runners, pipx works with whatever version of Python is installed by setup-python, and commands like "python3.10" are available in $PATH for pipx to find and use. That is not the case on other GitHub Actions runners, however. This is especially a problem on Windows, where even obtaining the full path to the interpreter cannot be done in any straightforward portable way. (setup-python provides the Python directory in the pythonLocation Python_ROOT_DIR etc. environment variables, but where the Python interpeter is in it can differ by OS. This is practical, though cumbersome, to solve when only CPython is used. But if pypy is also used, which may be desired at some point in the future, then even more special-casing is required, on Windows.)

Since kebab-case is more common than snake_case for this usage, and this convention was otherwise being used throughout the workflows.

Jflick58 · 2024-01-24T16:44:04Z

Apologies for the delayed merge!

EliahKagan added 9 commits September 16, 2023 22:14

Remove unused POETRY_VERSION env var in test.yml

f36d2fe

Unlike in the two other workflows, the test workflow specifies the poetry version with the poetry-version key in the poetry_setup action configuration.

Bump poetry used on CI from 1.4.2 to 1.6.1

06bd8cf

This includes performance improvements for installing, as detailed in https://python-poetry.org/history/#160---2023-08-20.

Rename test_type to test-type

b3b3d0a

Since kebab-case is more common than snake_case for this usage, and this convention was otherwise being used throughout the workflows.

EliahKagan marked this pull request as ready for review September 17, 2023 07:15

Jflick58 merged commit fb84f93 into Jflick58:main Jan 24, 2024
19 checks passed

EliahKagan deleted the ci-matrix branch January 26, 2024 09:39

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Run CI tests on 3.12 and multiple operating systems #22

Run CI tests on 3.12 and multiple operating systems #22

EliahKagan commented Sep 17, 2023 •

edited

Loading

Jflick58 commented Jan 24, 2024

Run CI tests on 3.12 and multiple operating systems #22

Run CI tests on 3.12 and multiple operating systems #22

Conversation

EliahKagan commented Sep 17, 2023 • edited Loading

Jflick58 commented Jan 24, 2024

EliahKagan commented Sep 17, 2023 •

edited

Loading