Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Run CI tests on 3.12 and multiple operating systems #22

Merged
merged 9 commits into from
Jan 24, 2024

Conversation

EliahKagan
Copy link
Contributor

@EliahKagan EliahKagan commented Sep 17, 2023

This expands the CI matrix in test.yml by:

  • Testing Python 3.12 in addition to 3.8, 3.9, 3.10, and 3.11. Currently Python 3.12 is in release candidate 2. Testing with it before the stable 3.12.0 release comes out is encouraged.
  • Testing macOS and Windows as well as Ubuntu. I would advocate this as a general practice, but there is also a specific reason that applies for wasm_exec, in that wasmtime includes platform-specific code. Since wasmtime's documentation recommends it be updated monthly, I think it's useful to be able to automate this as much as possible. Doing more testing, as introduced here, would give greater confidence when adding such automation later. (For example, wasmtime updates could be taken using Dependabot, by issuing @dependabot merge on a Dependabot PR, to cause the update to be merged only if, and after, all checks have passed, including running all tests on all supported version of Python on three operating systems.)

A few important things this PR does not do:

  • This is completely independent of what Python version the Python WASM runtime uses. Currently this remains, as you have noted in the readme, 3.11-only. (From this section of the Wasm Labs page, it seems VMWare may release a 3.12 runtime at some point after a stable 3.12 release comes out. This PR does not cover that at all.)
  • I did not expand the matrix in the linting workflow. I'm not sure it's necessary to lint on multiple versions at all, much less operating systems. Either way, though, I think linting is far less likely to break in a new version, and also that the effect of breaking linting is much less severe in terms of what bugs in reveals. If continuing to lint on multiple versions, then 3.12 should be added once the stable 3.12.0 is released.
  • I did not do anything related to updating Python or GitHub Actions dependencies. Although I think part of the benefit of expanding the test matrix is to provide greater confidence in doing regular updates with Dependabot, or Renovatebot, etc., it seems to me that this PR would be too broadly scoped if it tried to do anything with that. The considerations when reviewing those kinds of changes would, I think, be largely different from the ones here. Furthermore, I think the benefit of more automated testing can stand by itself.

Although I think it remains to be seen if the complexity of caching for poetry across CI runs is worthwhile for this repository, I suspect it very well may be worthwhile, since it's possible that, in the future, extended testing may include testing the effect of changes on specific applications or libraries like langchain, some of which are themselves sizable once their dependencies are accounted for. Therefore, I have kept, and tried to make sure not to break, the caching logic. Caching doesn't seem to be working on my feature branch for the newly introdued operating systems, but based on Actions/cache: Cache not being hit despite of being present I think that is normal and hope it will simply work once those new caches are created on main (if you choose to merge this PR). However, separately from that, I have had to make a couple of changes to actions/poetry_setup/action.yml:

  • Added a python-allow-prereleases option to allow a value to be passed through to the setup-python action's allow-prereleases key. Either this or hard-coding true is needed to install a prerelease. I think adding an option is better. Furthermore, while the logic that special-cases 3.12 should, it seems to me, be removed from test.yml soon after 3.12.0 comes out, the python-allow-prereleases option added to poetry_setup could be reused in the future (if poetry_setup is still in use at that time).
  • Installed poetry with pip instead of pipx, because pipx (and, as a result, poetry) do not find the correct interpreter on macOS and Windows GitHub Actions runners without significant effort. This is undesirable, but I believe the alternatives are worse. Commands that pass the Python version to pipx, as that workflow was doing, tend not to work very well on the macOS and Windows runners. The exact path to the Python executable is not made directly available by setup-python, and while the installation directory is available, the relative path to the interpreter varies by operating system. If other Python implementations such as PyPy are tested in the future, the situation becomes even worse in that regard. Furthermore, because the Windows runner has Python and pipx, what seems like it works will often instead just everything with the system Python instead of the one setup-python installed. Using pip eliminates all these problems with no added complexity for selecting the correct version, because setup-python provides pip on all platforms.

I have also made some other changes that seem to me to be useful in light of all the above changes:

  • Tweak the style of test names, so they are less likely to be cut off in the GitHub Actions web-based interface. This counterbalances the lengthening effect of operating system names being included in the job names. The job names are now fully automatically generated, rather than through interpolating ${{ }} expressions. I changed the names of the job keys, e.g. build to test in test.yml, so that the generated names would be good. I changed the values associated with the workflow name keys to be capitalized, so they would be styled differently from job names.
  • Specify a newer version of poetry, 1.6.1, which is currently the latest. I was unsure if I should do that in this PR or separately, but I decided to do it here because, as of 1.6.0, there are performance improvements that I think are worth having when expanding CI. (The runners run in parallel, for the most part, but the macOS and Windows runners tend to run Python code slower, including for pip and poetry.)
  • Small cleanup. For example, the test workflow, unlike the others, specifies the Poetry version with the poetry-version key passed to actions/poetry_setup/action.yml, but it also had a POETRY_VERSION environment variable defined at the top that had no effect, but appeared to control the version.
  • Run test and lint jobs on all branches, rather than just main and branches that have an open pull request targeting main. Although this has the disadvantage that it will double up CI checks on PRs that originate in this upstream repository (i.e., that you open yourself), I think that is outweighed by the advantage that, for anyone who has enabled GitHub Actions in their fork, CI checks will run even without, or before, a pull request is opened. This makes trying out changes one is not sure will work out and justify a PR easier. This was the first thing I did (it is in the first commit here) and I found it very valuable because I was able to test various approaches before even opening this pull request, and to do so without requiring any added steps.

However, some of that is subjective! I would be pleased to make any requested changes to this PR (and/or please feel free to push commits).

This makes pushes trigger CI checks even when they are not to the
main branch or in a PR opened against the main branch. This way,
CI will work in forks once enabled in the Actions tab (otherwise
the changes have to be on the "main" branch of the fork, or a PR
is needed).
Unlike in the two other workflows, the test workflow specifies the
poetry version with the poetry-version key in the poetry_setup
action configuration.
This includes performance improvements for installing, as detailed
in https://python-poetry.org/history/#160---2023-08-20.
This is not really a bugfix, because it's possible to infer the
correctness of the unquoted expansions in the cases where they
appear in the CI workflows.

However, quoting them makes the intent clear: we want the literal
text produced by the expansion, and it is not intenteded that it be
changed by word splitting and globbing (filename expansion) done by
the shell.
This allows a prerelease for 3.12, which is currently at RC2 so it
will not install if a prerelease is not allowed, while continuing
to prohibit installing a prerelease for earlier versions.

Because 3.12 is at RC2, the likelihood of test failures on it due
to bugs in 3.12 is fairly low. But in any situation where tests do
fail on 3.12, it would be necessary to compare the reults to those
of at least one other release, to rule this out. So I have also
set continue-on-error to true for 3.12 so that failure of a 3.12
job will not cancel the other jobs.
Going along with this, to make it so good names are generated:

- Change job key names from "build" to be descriptive/intuitive.

- Capitalize workflow names to avoid confusion with the new job
  names, so workflow and job names are styled differently in CI
  output.

This change may produce a minor improvement in clarity right now,
but the real benefit is for the forthcoming expansion of the test
matrix to include multiple operating systems, where the terser and
more common style used to distinguish different generated jobs from
the same matrix will be more readable. (Longer names produced from
generated jobs are hard to distinguish because the full name would
appear cut off in some places in the GitHub Actions web interface.)
This extends the CI test matrix with an "os" dimension, so it tests
on Ubuntu (as before), macOS, and Windows. It tests all five Python
version on each, so the matrix now has 15 jobs.

Building the value for "os" by appending "-latest" to names like
"ubuntu", rather than just using the full names ("ubuntu-latest"),
is less idiomatic. However, it helps keep the generated matrix job
names narrow enough not to be cut off in the GitHub Actions
interface, for matrices like this one that have a third dimension.
(Currently, "core" is the only value for "test-type", but it is,
and probably should, be kept in the names, thus lengthening them.)
This is imperfect, but it allows the correct version of Python
to be installed and used with minimal complexity and special
casing. Furthermore, because poetry is being used and the project
is being installed in a virtual environment, polluting the global
Python environment with poetry itself and its dependencies is
unlikely to hide unstated dependency bugs or cause conflicts.

On Ubuntu runners, pipx works with whatever version of Python is
installed by setup-python, and commands like "python3.10" are
available in $PATH for pipx to find and use. That is not the case
on other GitHub Actions runners, however. This is especially a
problem on Windows, where even obtaining the full path to the
interpreter cannot be done in any straightforward portable way.
(setup-python provides the Python directory in the pythonLocation
Python_ROOT_DIR etc. environment variables, but where the Python
interpeter is in it can differ by OS. This is practical, though
cumbersome, to solve when only CPython is used. But if pypy is also
used, which may be desired at some point in the future, then even
more special-casing is required, on Windows.)
Since kebab-case is more common than snake_case for this usage, and
this convention was otherwise being used throughout the workflows.
@EliahKagan EliahKagan marked this pull request as ready for review September 17, 2023 07:15
@Jflick58 Jflick58 merged commit fb84f93 into Jflick58:main Jan 24, 2024
19 checks passed
@Jflick58
Copy link
Owner

Apologies for the delayed merge!

@EliahKagan EliahKagan deleted the ci-matrix branch January 26, 2024 09:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants