-
Notifications
You must be signed in to change notification settings - Fork 3.1k
dependency resolution behaves differently with requirements.txt vs listing on the command-line #10544
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
The resolver since 21.2 attempts to prefer the order in which you give it requirements. In your requirements file example your order is effectively:
In your command line example your order is effectively:
Because in the former example numpy is attempted to be pinned last and in the latter example numpy is attempted to pinned first they are going to have very different dependency resolution behavior, especially if your requirements need to be backtracked. You should be able to confirm this by swapping the order you give the requirements in either of your examples. I am hoping that the performance in both cases should be acceptable once this lands: #10479 but I don't have time to test right now. I'll see if I can test tomorrow evening. |
So that's an interesting detail that I wasn't aware of. However, I can be quite certain that the order is not the culprit here. I've redone the test using only bash invocation differences and I can see the behavior is quite different. With a 3 minutes might not seem like an insane amount of time to wait (IMHO it is, but that's a different discussion), but this is actually a reduced example from what I was doing which was installing many more packages. They should behave the same when provided with a requirements file vs command-line arguments, especially if order is important. It appears that when presented on the command-line, the resolver starts doing backtracking on the numpy package, whereas the requirements file fails much more quickly. Here's another invocation that does a better job of showing the difference where I simply invoke
|
I can't reproduce your issue but I don't have an OS X device to test on and I don't know what or how you've configured this second index "http://sfw.sf.smle.co:8080/sfw/pip" which definitely appears to be contributing to the backtracking. If I could reproduce the issue I would attempt to look at what is getting ordered different when |
It's an apache http server with vanilla directory listings as a simple repository server. What's your hypothesis about that index? I don't see it contributing to the longer run from the log I posted. If I run the command again without the extra index server in my pip.conf, I still see the behavior
|
I took a quick look (without debugging). It looks to me like the difference in behavior here https://github.com/pypa/pip/blob/21.2.4/src/pip/_internal/cli/req_command.py#L363 vs https://github.com/pypa/pip/blob/21.2.4/src/pip/_internal/cli/req_command.py#L387 which calls out to https://github.com/pypa/pip/blob/21.2.4/src/pip/_internal/req/constructors.py#L430 is notable. In one case pip does for req in args:
req_to_add = install_req_from_line(
req,
None,
isolated=options.isolated_mode,
use_pep517=options.use_pep517,
user_supplied=True,
)
requirements.append(req_to_add)
for req in options.editables:
req_to_add = install_req_from_editable(
req,
user_supplied=True,
isolated=options.isolated_mode,
use_pep517=options.use_pep517,
)
requirements.append(req_to_add) and in the other, it calls install_req_from_parsed_requirements (I presume line-by-line without reordering them), which itself does: if parsed_req.is_editable:
req = install_req_from_editable(
parsed_req.requirement,
comes_from=parsed_req.comes_from,
use_pep517=use_pep517,
constraint=parsed_req.constraint,
isolated=isolated,
user_supplied=user_supplied,
)
else:
req = install_req_from_line(
parsed_req.requirement,
comes_from=parsed_req.comes_from,
use_pep517=use_pep517,
isolated=isolated,
options=parsed_req.options,
constraint=parsed_req.constraint,
line_source=parsed_req.line_source,
user_supplied=user_supplied,
) IIUC, it looks like passing arguments on the command line orders non-editables before editables, whereas the requirements file is parsed line by line and added in the order in the file. That of course assumes that the downstream resolver depends on order and doesn't do any further reordering. If I'm right, I see a few solutions
Index: src/pip/_internal/req/req_file.py
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
diff --git a/src/pip/_internal/req/req_file.py b/src/pip/_internal/req/req_file.py
--- a/src/pip/_internal/req/req_file.py (revision 0981e071bee8ac3ca2497813fa0863005033c6c2)
+++ b/src/pip/_internal/req/req_file.py (date 1633472570176)
@@ -328,6 +328,8 @@
def _parse_and_recurse(
self, filename: str, constraint: bool
) -> Iterator[ParsedLine]:
+ editable_requirements = []
+
for line in self._parse_file(filename, constraint):
if not line.is_requirement and (
line.opts.requirements or line.opts.constraints
@@ -353,8 +355,12 @@
)
yield from self._parse_and_recurse(req_path, nested_constraint)
+ elif line.is_editable:
+ editable_requirements.append(line)
else:
yield line
+
+ yield from editable_requirements
def _parse_file(self, filename: str, constraint: bool) -> Iterator[ParsedLine]:
_, content = get_file_content(filename, self._session)
What's the correct ordering of requirements? Is it by command-line order, or should editables always be last? Is it considered ok to have requirements.txt install differently than command-line invocations re: editable installs? Is it considered a guarantee that requirements.txt ordering implies a specific resolution ordering, and does that same guarantee apply to command-line invocations? For my mental model, it seems like they should be the same. But I completely understand the engineering decision to not offer it as a guarantee. It also seems like this may have been a difference for years, but now that resolvers routinely take minutes and hours to find/fail to find installation candidates, suddenly the route to failure is relevant. Not sure if this is going to end up causing other differences in resolution. |
Assuming your analysis is correct (I'm quite confident it is, but didn't actually check, and nobody should trust my memory on this kind of stuff 🙂), ordering requirements by command line order is the most sensible solution. There is no "correct" ordering because everything could be editable or not depending on the situation (conceptually editable-ness is completely orthogontal to dependency resolution), so we can only use an ordering that surprises the least people. |
Also, I made a repository to make testing a little easier https://github.com/stefansjs/test_pip_install_differences. |
I suspect that at least part of the problem here is that on the command line, Unfortunately, the decision to use an option to signal an editable install is long-established, and would be very hard to change at this point. (Please check before acting on the above, this is purely from memory and I'm not that familiar with our option parsing code). |
FWIW that's my understanding from memory as well. The only reasonable-ish way would be to manually parse |
So I was curious about the same thing. Obviously doing a secondary custom parsing of sys.argv, when you have a module whose whole purpose is to parse sys.argv, is error prone. BUT, I think there may be some ways to use optparse (or future parsers) to do something desirable. The most direct of which is to have the -e flag append to the same Index: src/pip/_internal/cli/cmdoptions.py
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
diff --git a/src/pip/_internal/cli/cmdoptions.py b/src/pip/_internal/cli/cmdoptions.py
--- a/src/pip/_internal/cli/cmdoptions.py (revision 0981e071bee8ac3ca2497813fa0863005033c6c2)
+++ b/src/pip/_internal/cli/cmdoptions.py (date 1633544838206)
@@ -430,9 +430,9 @@
return Option(
"-e",
"--editable",
- dest="editables",
+ dest="requirements",
action="append",
- default=[],
+ callback=optparse_editable_callback,
metavar="path/url",
help=(
"Install a project in editable mode (i.e. setuptools "
@@ -440,6 +440,12 @@
),
)
+def optparse_editable_callback(option, opt, value, parser):
+ if not hasattr(parser.values, 'editables'):
+ setattr(parser.values, 'editables', set())
+
+ parser.values.editable.add(value)
+
def _handle_src(option: Option, opt_str: str, value: str, parser: OptionParser) -> None:
value = os.path.abspath(value) Then my thought was to modify the loop that loops over requirements then editables to react on editables differently Index: src/pip/_internal/cli/req_command.py
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
diff --git a/src/pip/_internal/cli/req_command.py b/src/pip/_internal/cli/req_command.py
--- a/src/pip/_internal/cli/req_command.py (revision 0981e071bee8ac3ca2497813fa0863005033c6c2)
+++ b/src/pip/_internal/cli/req_command.py (date 1633485496843)
@@ -382,15 +382,7 @@
isolated=options.isolated_mode,
use_pep517=options.use_pep517,
user_supplied=True,
- )
- requirements.append(req_to_add)
-
- for req in options.editables:
- req_to_add = install_req_from_editable(
- req,
- user_supplied=True,
- isolated=options.isolated_mode,
- use_pep517=options.use_pep517,
+ editable=req in options.editables,
)
requirements.append(req_to_add) If it's looking like a reasonable approach, I can start to put together a PR. The code changes themselves will be quick to write and share; we could discuss the approaches and any alternatives. Making the thing unit tested and ready for merge would take a little longer. |
Honestly, I'm -1 on reconstructing the command line on general principle. It constrains how we develop our command line interface in future. Any of the other options would be better IMO, if we decided we wanted to do anything here. |
Right, I'm not sure if we're in agreement or not. I think I agree that reconstructing the command-line from parsed options/arguments is a Bad Idea ™️. Anything that involves looking at Are you saying you disagree with the |
Personally, I feel that any attempt to preserve the order of normal requirements and But I'm happy to hear what the other pip developers say here - I don't have a strong opinion, so if someone else does, I'll defer to them. |
Relying on callbacks has an additional issue of relying on the argument parsing library calling them sequentially. I don't think either |
Ok, it really sounds like order of command-line arguments shouldn't be relied upon, and requirements.txt should be considered more reliable. I'm going to close this since it seems the behavior does differ in a way that's expected-ish. Does anybody disagree? |
Description
When doing
pip install -r requirements.txt
the resolver notices a version conflict and fails quickly, whereas listing the same requirements on the command line causes an extended un-resolved backtracking.pip install -r
failed in 20s on my machine, while thepip install
command is still running after an hour so far.Expected behavior
pip install -r
andpip install
behave the same with respect to the resolverpip version
21.2.4
Python version
3.9
OS
Mac OS 10.15.6
How to Reproduce
requirements.txt contains:
setup.py contains
Output
Code of Conduct
The text was updated successfully, but these errors were encountered: