Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix address resolution on Solaris by specifying the protocol #2088

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

zyv
Copy link

@zyv zyv commented Sep 7, 2024

Hi there,

Solaris supports address resolution only if no service name (port) is specified (None) or if a desired protocol is explicitly specified.

$ python3
Python 3.9.19 (main, Mar 26 2024, 20:30:24)
[GCC 13.2.0] on sunos5
Type "help", "copyright", "credits" or "license" for more information.

>>> import socket

>>> socket.getaddrinfo("127.0.0.1", 5000)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3.9/socket.py", line 954, in getaddrinfo
    for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
socket.gaierror: [Errno 9] service name not available for the specified socket type

>>> socket.getaddrinfo("127.0.0.1", None)
[(<AddressFamily.AF_INET: 2>, 0, 0, '', ('127.0.0.1', 0))]

>>> socket.getaddrinfo("127.0.0.1", 5000, proto=socket.IPPROTO_TCP)
[(<AddressFamily.AF_INET: 2>, <SocketKind.SOCK_STREAM: 2>, 6, '', ('127.0.0.1', 5000))]

Since you seem to support only TCP anyway, I suggest specifying the protocol explicitly, like in the Python documentation:

https://docs.python.org/3/library/socket.html#socket.getaddrinfo

It doesn't hurt on any platform and fixes the resolution on Solaris. Alternatively, you can just omit the service name from the resolution. This is how the Gevent people solved the same problem: gevent/gevent#1256 .

@miguelgrinberg
Copy link
Owner

This fix should go to eventlet in the same way gevent was fixed. This package does not need to be concerned with the networking idiosyncrasies of all the different platforms.

@zyv
Copy link
Author

zyv commented Sep 7, 2024

Well, I considered opening a PR for eventlet instead, and fixing their implementation of getaddrinfo:

https://github.com/eventlet/eventlet/blob/8bac9b2bb5ba02d42305446327a117ff51af177b/eventlet/support/greendns.py#L545

but it was not obvious to me how to fix it, unlike in the case of gevent. Note that gevent fixed the call to getaddrinfo and not a custom implementation of getaddrinfo.

Eventlet simply implements an alternative to Python socket's getaddrinfo, which would fail (and does fail, see my example above) in exactly the same way as the "stock" one if you used it instead of eventlet. This way you could argue that it's a bug in Python and should be fixed in CPython...

So my thinking is that this is an invalid use of getaddrinfo and should be fixed at the caller's end. Maybe it's not obvious, but it makes sense - either you have to specify the protocol you want, like in the Python documentation, or you have to give None the service name, because it just doesn't make sense without specifying the protocol. It's unfortunate that "friendly" systems like Linux just helpfully ignore it and Solaris & AIX hard fail, but Linux is generally quite tolerant of POSIX abuse in a constructive sense.

If you still disagree, do you have any idea how to fix this in eventlet? Or go straight to CPython?

Note that if I set port to None in eventlet if protocol is not specified, the resolution works, but it obviously doesn't return a port because it doesn't make sense. You expect it to return a port as if the protocol was set to TCP. I could of course patch the port back into the tuples returned by getaddrinfo, but that gets really messy and really questionable very quickly.

I would be grateful if you could reconsider or give me some further direction.

@miguelgrinberg
Copy link
Owner

miguelgrinberg commented Sep 7, 2024

This way you could argue that it's a bug in Python and should be fixed in CPython...

If the Solaris port of CPython does not implement the correct behavior for this function then the bug is with the Solaris port. And if eventlet copied this code and duplicated the bug, the it is also a bug in eventlet.

So my thinking is that this is an invalid use of getaddrinfo

This is not an invalid use of getaddrinfo. This is from the Python official docs:

The family, type and proto arguments can be optionally specified in order to narrow the list of addresses returned. Passing zero as a value for each of these arguments selects the full range of results.

As I said above, I don't want to maintain obscure hacks that are unrelated to this repository. If you need to start an eventlet server under Solaris, you can always use your own server starting code. Just copy the code from the socketio.run() method and adapt it to your needs in your own application. Or better yet, subclass the SocketIO class and override the run() method with the server starting logic that you need to use under Solaris.

@zyv
Copy link
Author

zyv commented Sep 8, 2024

If the Solaris port of CPython does not implement the correct behavior for this function then the bug is with the Solaris port.

I opened python/cpython#123832 with an explanation and a poc fix. Eventlet didn't copy the code directly, they just wrap Python code to make it async.

If you need to start an eventlet server under Solaris, you can always use your own server starting code.

I don't write any server code myself, the bug was exposed by another tool that depends on Flask-SocketIO - specifically https://github.com/cs01/gdbgui.

@zyv
Copy link
Author

zyv commented Nov 15, 2024

This is not an invalid use of getaddrinfo. This is from the Python official docs

Hi @miguelgrinberg, the discussion with the CPython developers is now concluded, and the bottom line is that using getaddrinfo(host, port) without specifying the desired protocol family or socket type is improper (short of wrong) if you make this call and later use its result to open a socket.

The documentation has been updated to clarify this:

https://docs.python.org/3.14/library/socket.html#socket.getaddrinfo

The recommended usage example already included proto=socket.IPPROTO_TCP before the changes (partly for this reason), so it hasn't been modified.

As I said above, I don't want to maintain obscure hacks that are unrelated to this repository.

... so it's actually a correctness fix rather than an obscure hack, but it's up to you to reject the PR anyway.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants