Update helpers.py #396

Open
wants to merge 3 commits into
base: main

@FFAMax
Contributor

FFAMax commented Oct 28, 2024

A permission problem in the tmp folder can hide the root cause: the failure shows up as an issue with binding the socket.

```
Detected system: Linux
Inference engine name after selection: tinygrad
get_inference_engine called with: tinygrad
Using inference engine: TinygradDynamicShardInferenceEngine with shard downloader: HFShardDownloader
Trying to find available port host='0.0.0.0' port=61382
Unable write to file
Traceback (most recent call last):
  File "/home/user/exo/exo/helpers.py", line 62, in find_available_port
    write_used_port(port, used_ports)
  File "/home/user/exo/exo/helpers.py", line 47, in write_used_port
    with open(used_ports_file, "w") as f:
         ^^^^^^^^^^^^^^^^^^^^^^^^^^
PermissionError: [Errno 13] Permission denied: '/tmp/exo_used_ports'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/user/exo/.venv/bin/exo", line 5, in <module>
    from exo.main import run
  File "/home/user/exo/exo/main.py", line 72, in <module>
    args.node_port = find_available_port(args.node_host)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/exo/exo/helpers.py", line 66, in find_available_port
    raise RuntimeError (e)
RuntimeError: [Errno 13] Permission denied: '/tmp/exo_used_ports'
```
exo/helpers.py Outdated
try:
  write_used_port(port, used_ports)
except Exception as e:
  if DEBUG >= 2: print(f"Unable write to file")
Contributor

I'd make this error message a bit more helpful, for example "Unable to write to file using the write_used_port function", to make this more debuggable.
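As a rough illustration of the kind of message suggested here, the sketch below wraps the write in a handler that names the function, the file, and the OS error. It is not the code in this PR: `UsedPortsFileError`, `record_port`, and the extra `used_ports_file` parameter are hypothetical names introduced for the example, and the body of `write_used_port` is a stand-in based only on the `open(used_ports_file, "w")` line visible in the traceback.

```python
class UsedPortsFileError(RuntimeError):
    """Hypothetical error type for failures while saving the used-ports list."""


def write_used_port(port, used_ports, used_ports_file):
    # Stand-in for the helper named in the traceback; the real body may differ.
    with open(used_ports_file, "w") as f:
        f.write("\n".join(str(p) for p in used_ports + [port]))


def record_port(port, used_ports, used_ports_file):
    try:
        write_used_port(port, used_ports, used_ports_file)
    except OSError as e:
        # Name the function, the file, and the OS error so the report points
        # at the file rather than at the socket.
        raise UsedPortsFileError(
            f"write_used_port could not write {used_ports_file!r}: {e.strerror}"
        ) from e
```

Chaining with `from e` keeps the original `PermissionError` in the traceback, so the more specific message adds context without discarding detail.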

@dtnewman
Contributor

dtnewman commented Nov 3, 2024

I'm curious what the impetus here is. What system are you running this on where you are having trouble writing to /tmp? And do you have any idea why?

In any case, I have a different PR which has a bunch of changes, but one of them (which can possibly be separated out) is a "config.py" file where I set up a mechanism for persistent storage. Instead of using /tmp, I write settings to a "user data directory" (e.g. ~/.local/share/exo on Linux). Anything currently written to tmp probably belongs there instead and can easily be added. But I'm concerned that whatever issue is coming up for you here with tmp directory permissions might come up there as well.
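A minimal sketch of what resolving such a "user data directory" could look like, assuming the common per-platform conventions (XDG on Linux). The function name `user_data_dir` and its behaviour are assumptions for illustration; the "config.py" mentioned above is not shown in this thread and may work differently.

```python
import os
import sys
from pathlib import Path


def user_data_dir(app_name="exo"):
    """Return a per-user data directory for app_name, creating it if needed."""
    if sys.platform == "win32":
        base = Path(os.environ.get("LOCALAPPDATA") or Path.home() / "AppData" / "Local")
    elif sys.platform == "darwin":
        base = Path.home() / "Library" / "Application Support"
    else:
        # XDG Base Directory convention; falls back to ~/.local/share.
        base = Path(os.environ.get("XDG_DATA_HOME") or Path.home() / ".local" / "share")
    path = base / app_name
    path.mkdir(parents=True, exist_ok=True)
    return path
```

Because the path sits under each user's own home directory, two users would not share one file, although a directory with unexpected ownership or permissions could still fail in the same way.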

@FFAMax
Contributor Author

FFAMax commented Nov 3, 2024

> I'm curious what the impetus here is. What system are you running this on where you are having trouble writing to /tmp? And do you have any idea why?

It is Linux (Debian). The issue is:
User A runs exo and the file is created.
User B runs exo and cannot create the same file because it already exists and user B has no permission to use it.

@FFAMax
Contributor Author

FFAMax commented Nov 3, 2024

> I'm curious what the impetus here is. What system are you running this on where you are having trouble writing to /tmp? And do you have any idea why?
>
> In any case, I have a different PR which has a bunch of changes, but one of them (which can possibly be separated out) is a "config.py" file where I set up a mechanism for persistent storage. Instead of using /tmp, I write settings to a "user data directory" (e.g. ~/.local/share/exo on Linux). Anything currently written to tmp probably belongs there instead and can easily be added. But I'm concerned that whatever issue is coming up for you here with tmp directory permissions might come up there as well.

That's a good idea for organizing storage. The issue with /tmp is odd because I would not expect any setting to be global to the whole system (as if only one process could run on the system at a time). For example, the file with the ports written into it relates to a particular PID. It is fine to keep it in temp, since it is temporary for each run, but the PID should be added to the file name and the file deleted once the program closes (ideally even if it crashed).
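The per-process file described above could be sketched roughly as follows. This is an illustration under assumptions, not code from the PR: `pid_scoped_ports_file` is a hypothetical helper, and cleanup via `atexit` only runs on normal interpreter exit, so it does not cover a hard crash or `SIGKILL`.

```python
import atexit
import os
import tempfile


def pid_scoped_ports_file(prefix="exo_used_ports"):
    """Create a temp file whose name includes this process's PID."""
    # mkstemp creates the file readable and writable only by the current user.
    fd, path = tempfile.mkstemp(prefix=f"{prefix}_{os.getpid()}_")
    os.close(fd)

    def _cleanup():
        try:
            os.remove(path)
        except FileNotFoundError:
            pass  # already removed; nothing to do

    # Runs on normal interpreter shutdown, not after SIGKILL or a hard crash.
    atexit.register(_cleanup)
    return path
```

Since every process gets its own file, user A and user B never contend for the same path in /tmp.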

@AlexCheema
Contributor

I'm still not sure exactly how this helps. It reraises the exception so doesn't it just end up with the same behaviour with a slightly different print?

@FFAMax
Contributor Author

FFAMax commented Nov 25, 2024

@AlexCheema this is to help with troubleshooting.
Instead of saying 'Unable open a socket', it will point to the more specific issue with the file permissions. This will not resolve the root cause, but it saves time on troubleshooting: instead of checking why the socket cannot be opened, it points to the permissions on the file.
But as we discussed with Daniel Newman, it would be better to refactor this portion to move away from using the file /tmp/exo_used_ports. At least, I haven't found why we need this file in the first place.
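One possible direction for such a refactor, offered only as a sketch and not as what the project does or plans, is to let the operating system choose a free port by binding to port 0, which needs no shared file. `pick_free_port` is a hypothetical name, and this technique does not reproduce whatever bookkeeping /tmp/exo_used_ports was meant to provide.

```python
import socket


def pick_free_port(host="127.0.0.1"):
    """Ask the OS for a TCP port that is free on host at the time of the call."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.bind((host, 0))  # port 0 lets the OS assign an unused port
        return s.getsockname()[1]
```

The port is released when the socket closes, so another process could claim it before the caller binds again; callers that need a guarantee would have to keep the socket open and hand it on.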
