No module named 'punctuator.punc'; 'punctuator' is not a package #3

Open
wcooper90 opened this issue Oct 21, 2020 · 16 comments

Comments

@wcooper90

I'm currently trying to create a webapp, Punctuator being an important package for it. I'm using AWS, which is "a distribution that evolved from Red Hat Enterprise Linux (RHEL) and CentOS," but I'm not sure about specifics. I'm on Python 3.7.9, and these are the errors that come out -

~ File "/var/app/venv/staging-LQM1lest/bin/punctuator.py", line 5, in <module>
~ from punctuator.punc import command_line_runner
~ ModuleNotFoundError: No module named 'punctuator.punc'; 'punctuator' is not a package

I installed punctuator 0.9.6 into the virtual environment venv via a requirements.txt file from GitHub, with the following command:

sudo pip3 install -r https://raw.githubusercontent.com/wcooper90/summarization/master/backend/requirements.txt

I also have Punctuator installed on Amazon Linux 2 with just pip3 install punctuator.

I'm wondering if there are some dependency issues, or if it may have to do with the OS?

Thanks for any help.

@chrisspen
Owner

How are you calling it?

@wcooper90
Author

Hi Chris,

We've tried:
text = pytesseract.image_to_string(img).encode('latin-1', 'ignore')

As well as executing from the command line and then reading it from a file:

os.system("tesseract -l eng /var/app/current/inputs/" + str(i) + ".png text")

Thanks for getting back so quickly.

@chrisspen
Owner

I meant how are you calling punctuator. That code only appears to call tesseract.

@wcooper90
Author

wcooper90 commented Oct 23, 2020

Sorry!

Here is the function we are using punctuator in:

def punctuate_transcript(text):
    # try different sample models in punctuator -- period accuracy is most important (especially for summary)!
    p = Punctuator('Demo-Europarl-EN.pcl')
    return p.punctuate(text)

We import Punctuator at the top of the file with:

from punctuator import Punctuator

and I've made sure to download the model, Demo-Europarl-EN.pcl, to the right place, both locally and on AWS.

@chrisspen
Owner

I meant a complete script to reproduce the issue. Try this:

cd /tmp
mkdir test
cd test
virtualenv -p python3.7 env
. env/bin/activate
pip install punctuator
python
>>> from punctuator import Punctuator

Does that throw an import error?

@wcooper90
Author

With or without the virtualenv, it does not throw an import error. Do you think we can use the os package to run Punctuator in Python from the command line within our application?

@chrisspen
Owner

chrisspen commented Oct 29, 2020

I'm not sure I understand your question. If you mean calling punctuator via os.system(), I suppose that could work, but that's a complicated workaround to what should be a simple problem to fix.

If your application is running inside the virtualenv where punctuator is installed, it'll just work and you shouldn't need to call punctuator via os.system. It looks like it's throwing an import error because you simply haven't installed punctuator in the environment your application runs in.

If you're somehow calling punctuator from Python running os.system("tesseract..."), then you need to make sure that Python instance is inside the virtualenv where punctuator is installed. Then the process called from os.system should inherit the path.
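
One way to check the points above from inside the running application is to ask the interpreter which executable it is and where it would load punctuator from. This is a minimal sketch using only the standard library; the helper name describe_module is made up for illustration and is not part of punctuator.

```python
import importlib.util
import sys


def describe_module(name):
    """Report the running interpreter and where `name` would be imported from."""
    spec = importlib.util.find_spec(name)
    return {
        # Path of the Python binary actually running this code.
        "interpreter": sys.executable,
        # False means the module is not installed for this interpreter.
        "found": spec is not None,
        # File the import would load (None if not found).
        "origin": getattr(spec, "origin", None),
        # A real package has search locations; a plain .py file does not.
        "is_package": bool(spec and spec.submodule_search_locations),
    }


if __name__ == "__main__":
    print(describe_module("punctuator"))
```

If "found" is False, the package is missing from that interpreter's environment. If "found" is True but "is_package" is False and "origin" points at a file in a bin directory, a same-named script is being picked up instead of the package.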

@Bacus96

Bacus96 commented Nov 25, 2020

I've been having the same issue when importing it in Python with from punctuator import Punctuator. I installed it as you suggested above and then ran that import in Python, but it results in the error mentioned earlier in this thread. Help would be appreciated, since it would be great to try this out.

@0xhamachi

I'm having the same problem

@chrisspen
Owner

If someone could provide a script that reproduces the problem, then I could probably fix it. However, I can find no problems on my end. I even have a Travis build that installs the package and runs some unittests.

Closing this as not-reproducible, but feel free to re-open if you can document steps to reproduce.

@evios

evios commented Jan 4, 2021

Hi! Happy NY and Merry Christmas :)
To reproduce, run anything other than python in CMD (uvicorn, celery, etc.). If you run python from CMD, everything is fine.

Dockerfile:
FROM python:slim
RUN pip3 install punctuator fastapi uvicorn
COPY main.py ./app/main.py
CMD uvicorn --host 0.0.0.0 app.main:app

app/main.py:
from punctuator import Punctuator

from fastapi import FastAPI
app = FastAPI()
@app.get("/")
async def versions():
    return "something"

Run:
docker build . -t punctuator
docker run -ti punctuator # error will occur on load

Error that occurs:
File "./app/main.py", line 1, in <module>
from punctuator import Punctuator
File "/usr/local/bin/punctuator.py", line 5, in <module>
from punctuator.punc import command_line_runner
ModuleNotFoundError: No module named 'punctuator.punc'; 'punctuator' is not a package

@chrisspen chrisspen reopened this Jan 4, 2021
@chrisspen
Owner

@evios Thanks. I can reproduce this. I can also reproduce this if I use a normal venv in Ubuntu. However, it seems to be a bug in uvicorn, not this package. That's why I couldn't reproduce this earlier, as I was only testing with a normal Python shell.

If I add import sys; print(sys.path) to my __init__.py and then run your uvicorn code, I see:

['.', '/home/chris/git/punctuator2/test/.env37/bin', '/usr/lib/python37.zip', '/usr/lib/python3.7', '/usr/lib/python3.7/lib-dynload', '/home/chris/git/punctuator2/test/.env37/lib/python3.7/site-packages']

However, if I run a normal Python shell and then do the same import, I see:

['', '/usr/lib/python37.zip', '/usr/lib/python3.7', '/usr/lib/python3.7/lib-dynload', '/home/chris/git/punctuator2/test/.env37/lib/python3.7/site-packages']

So for some odd reason, it looks like uvicorn is adding the standard bin directory as a place to look for packages, and this is breaking because I have a bin script with the same name as the package. So it tries to import the bin script, which obviously isn't a package, causing the ModuleNotFoundError.

I don't think this behavior in uvicorn is correct. It should not be looking for packages in the virtualenv's bin directory. Therefore, I don't think there's anything I can do on my end, short of changing my names to conform to uvicorn's non-standard behavior, which isn't good practice.

Correct me if I'm wrong.
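
The name collision described above can be reproduced without uvicorn or punctuator. The sketch below builds a throwaway directory layout in which a script and a package share a name, puts the script's directory first on sys.path, and captures the resulting error. The names demo_pkg, core, bin and site are hypothetical stand-ins chosen for illustration, not the real punctuator layout.

```python
import importlib
import os
import sys
import tempfile


def import_error_when_script_shadows_package():
    """Import demo_pkg while bin/demo_pkg.py shadows site/demo_pkg/; return the error text."""
    with tempfile.TemporaryDirectory() as root:
        bin_dir = os.path.join(root, "bin")
        site_dir = os.path.join(root, "site")
        os.makedirs(os.path.join(site_dir, "demo_pkg"))
        os.makedirs(bin_dir)

        # The real package: demo_pkg/__init__.py plus one submodule.
        with open(os.path.join(site_dir, "demo_pkg", "__init__.py"), "w") as f:
            f.write("")
        with open(os.path.join(site_dir, "demo_pkg", "core.py"), "w") as f:
            f.write("VALUE = 1\n")

        # A launcher script with the same name as the package.
        with open(os.path.join(bin_dir, "demo_pkg.py"), "w") as f:
            f.write("from demo_pkg.core import VALUE\n")

        saved_path = list(sys.path)
        # The script directory is searched before the package directory.
        sys.path[:0] = [bin_dir, site_dir]
        importlib.invalidate_caches()
        try:
            importlib.import_module("demo_pkg")
        except ModuleNotFoundError as exc:
            return str(exc)
        finally:
            # Restore the interpreter state so the demo has no side effects.
            sys.path[:] = saved_path
            for name in [m for m in sys.modules if m.split(".")[0] == "demo_pkg"]:
                del sys.modules[name]
    return None


if __name__ == "__main__":
    print(import_error_when_script_shadows_package())
```

The message has the same shape as the one reported in this issue: the script is found first, gets loaded as a plain module named demo_pkg, and its own import of demo_pkg.core then fails because a plain module has no submodules.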

@chrisspen
Owner

Also, as a workaround, if you remove the bin directory from sys.path before you import punctuator, that should fix it.

@zxl777

zxl777 commented Jan 5, 2021

Thanks @chrisspen, it worked.

import sys
sys.path.remove('/root/miniconda3/envs/xxx/bin')
import punctuator
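
The hard-coded path above is specific to one machine. A slightly more portable variant, sketched here under the assumption that console scripts are installed next to the interpreter (as in a typical venv or conda environment), filters the directory out by comparing resolved paths. The helper name without_script_dir and the sample paths are hypothetical.

```python
import os
import sys


def without_script_dir(path_entries, script_dir):
    """Return path_entries minus any entry that resolves to script_dir."""
    target = os.path.realpath(script_dir)
    # Keep the empty entry (current directory) untouched; compare the rest by real path.
    return [p for p in path_entries if p == "" or os.path.realpath(p) != target]


# Illustration with made-up paths.
sample = ["", "/opt/env/bin", "/opt/env/lib/python3.7/site-packages"]
cleaned = without_script_dir(sample, "/opt/env/bin")
print(cleaned)

# In an application, before importing punctuator (assumption: scripts sit beside the interpreter):
# sys.path[:] = without_script_dir(sys.path, os.path.dirname(sys.executable))
# from punctuator import Punctuator
```

Unlike sys.path.remove, this does not raise ValueError when the directory is absent, so the same code can run both under a launcher script and in a plain Python shell.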

@evios

evios commented Jan 5, 2021

Hi:)
I can also confirm that removing the bin dir fixed it.
I noticed this strange behaviour not only with uvicorn, but with celery as well.
uvicorn --host 0.0.0.0 app.main:app
celery worker -A app.worker
As you can see, although these are Python programs, you can't run them directly as binaries.
Hence, one more workaround (if you run it in Docker) is to start uvicorn or celery with:
CMD python -m uvicorn --host 0.0.0.0 app.main:app
instead of
CMD uvicorn --host 0.0.0.0 app.main:app

Run this way, everything works.
Thank you @chrisspen for packaging it in pip!
Have a great day!
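
One plausible reason the python -m form behaves differently, offered as an assumption rather than something confirmed in this thread: Python places the directory of a directly executed script at sys.path[0], while with -m it places the current working directory there (on Python 3.7 and later). The sketch below shows that difference with a throwaway module; the names showpath, lib and work are hypothetical.

```python
import os
import subprocess
import sys
import tempfile


def first_path_entries():
    """Return sys.path[0] when showpath.py runs as a script and when it runs via -m."""
    with tempfile.TemporaryDirectory() as root:
        lib_dir = os.path.join(root, "lib")    # stands in for the directory holding the launcher
        work_dir = os.path.join(root, "work")  # stands in for the application's working directory
        os.makedirs(lib_dir)
        os.makedirs(work_dir)
        script = os.path.join(lib_dir, "showpath.py")
        with open(script, "w") as f:
            f.write("import sys\nprint(sys.path[0])\n")

        # Make showpath importable for -m; drop PYTHONSAFEPATH so sys.path[0] is not suppressed.
        env = dict(os.environ, PYTHONPATH=lib_dir)
        env.pop("PYTHONSAFEPATH", None)

        def run(args):
            out = subprocess.check_output(
                [sys.executable] + args, cwd=work_dir, env=env, universal_newlines=True
            )
            return os.path.realpath(out.strip())

        as_script = run([script])
        as_module = run(["-m", "showpath"])
        return as_script, as_module, os.path.realpath(lib_dir), os.path.realpath(work_dir)


if __name__ == "__main__":
    as_script, as_module, lib_dir, work_dir = first_path_entries()
    print("script run puts this first:", as_script)
    print("-m run puts this first:   ", as_module)
```

If that is what happens with the uvicorn and celery launchers, the bin directory ends up on sys.path only when the launcher is executed directly, which would match the two sys.path listings quoted earlier.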

@ghost

ghost commented Mar 8, 2021

Hi,
I'm very interested in using Punctuator, but my configuration skills are not up to applying the import workaround mentioned in the previous posts.

I have these system paths:

['/mnt/c/PythonProgrammes/venv', '/usr/lib/python37.zip', '/usr/lib/python3.7', '/usr/lib/python3.7/lib-dynload', '/usr/local/lib/python3.7/dist-packages', '/usr/lib/python3/dist-packages']

(running Python 3.7 on Ubuntu 18.04 LTS)

I have tried removing '/mnt/c/PythonProgrammes/venv' with:

sys.path.remove('/mnt/c/PythonProgrammes/venv')

But my installed_packages_list does not include punctuator.

Any help appreciated.


6 participants