Proposal: Force recompilation of all dependencies #195

kmod · 2022-05-05T00:32:54Z

There is, in general, not a requirement that source packages compile to the same thing as available binary packages. It's rare, but this is one thing that can end up causing differences in benchmark results between two Python distributions, if one has binary packages and one does not. Forcing all dependencies to compile from source eliminates this potential difference.

The only case I know of is mypy, which ships a more-optimized binary package than what their source package compiles to.

corona10 · 2022-05-05T01:45:22Z

@kmod
IIUC, the most user doesn't use --no-binary option while they installing packages,
Doesn't it show a benchmark that deviates from the general use case?

kmod · 2022-05-05T19:12:38Z

True, that's not how users install packages, so the package installation process would be a bit different between the benchmarks and users. But in the common case the actual running of the benchmarks shouldn't be affected, and in the rare case that the benchmarks are affected by whether or not binary packages are installed I'd argue that that's when we would want this behavior.

ericsnowcurrently · 2022-05-09T22:53:43Z

Great observation!

tl;dr +1, but add an option to disable --no-binary.

The kinds of benchmarks where dependencies matter are typically workload-oriented (macro benchmarks). We want to be able to communicate a performance delta to users relative to how they should expect their Python workload to perform. So with that in mind I'd lean toward avoiding --no-binary if possible. Performance difference between binary and source distributions should be communicated by the relevant projects (e.g. mypy).

That said, I get what you are saying about apples-to-apples comparisons for Python implementations (or distros). We also run into this with running the benchmarks against CPython main: few dependencies are available on PyPI. So I agree we don't want benchmark dependencies to dictate any difference in performance between result sets.

Weighing the two sides, I'm in favor of the latter, which you are proposing. So +1. There is still a use for using pre-built extensions, though, so it would be good for pyperformance to allow us to opt out of --no-binary.

ericsnowcurrently · 2022-05-09T22:54:05Z

Also, presumably all the dependencies build on all our tested versions, based on the CI results.

There is, in general, not a requirement that source packages compile to the same thing as available binary packages. It's rare, but this is one thing that can end up causing differences in benchmark results between two Python distributions, if one has binary packages and one does not. Forcing all dependencies to compile from source eliminates this potential difference. The only case I know of is mypy, which ships a more-optimized binary package than what their source package compiles to.

kmod · 2022-05-10T18:21:38Z

No problem, I added a USE_BINARY_PACKAGES env var that will keep the old behavior of not passing --no-binary. Personally I have a hard time envisioning a scenario when a user would want to include performance differences between binary packages... but anyway I think an argument for the behavior of this PR being opt-out (rather than opt-in) is that people who want binary packages probably know that they do, and people who want more consistency are probably not aware of this as something to worry about.

Anyway, let me know if there's anything else I can do to make this more acceptable. I'm not sure what to do about the test failures -- in my view these are legitimate issues (that these interpreters are unable to build these dependencies) that are being uncovered by this PR. I think we could add wheel as an automatic dependency to get pypy to work, but I'm not sure what the issue is with 3.11 and greenlet.

gvanrossum

LGTM

corona10 requested review from ericsnowcurrently and gvanrossum May 5, 2022 01:46

corona10 closed this May 7, 2022

corona10 reopened this May 7, 2022

kmod force-pushed the nobinary branch from 7a60703 to e969a2f Compare May 10, 2022 18:06

gvanrossum reviewed May 10, 2022

View reviewed changes

kmod mentioned this pull request Jul 7, 2022

mypyc benchmark doesn't seem to run mypyc pyston/python-macrobenchmarks#5

Closed

kumaraditya303 closed this Aug 2, 2022

kumaraditya303 reopened this Aug 2, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Proposal: Force recompilation of all dependencies #195

Proposal: Force recompilation of all dependencies #195

kmod commented May 5, 2022

corona10 commented May 5, 2022 •

edited

Loading

kmod commented May 5, 2022

ericsnowcurrently commented May 9, 2022

ericsnowcurrently commented May 9, 2022

kmod commented May 10, 2022

gvanrossum left a comment

Proposal: Force recompilation of all dependencies #195

Are you sure you want to change the base?

Proposal: Force recompilation of all dependencies #195

Conversation

kmod commented May 5, 2022

corona10 commented May 5, 2022 • edited Loading

kmod commented May 5, 2022

ericsnowcurrently commented May 9, 2022

ericsnowcurrently commented May 9, 2022

kmod commented May 10, 2022

gvanrossum left a comment

Choose a reason for hiding this comment

corona10 commented May 5, 2022 •

edited

Loading