HTTP Error 404 Not Found but I was able to manually download spreadsheet #89

Open
dld2517 opened this issue May 7, 2020 · 4 comments

dld2517 commented May 7, 2020

~/repos/springer_free_books$ python3 main.py
Traceback (most recent call last):
  File "main.py", line 37, in <module>
    books = pd.read_excel(table_url)
  File "/home/ddarden/.local/lib/python3.8/site-packages/pandas/util/_decorators.py", line 188, in wrapper
    return func(*args, **kwargs)
  File "/home/ddarden/.local/lib/python3.8/site-packages/pandas/util/_decorators.py", line 188, in wrapper
    return func(*args, **kwargs)
  File "/home/ddarden/.local/lib/python3.8/site-packages/pandas/io/excel.py", line 350, in read_excel
    io = ExcelFile(io, engine=engine)
  File "/home/ddarden/.local/lib/python3.8/site-packages/pandas/io/excel.py", line 653, in __init__
    self._reader = self._engines[engine](self._io)
  File "/home/ddarden/.local/lib/python3.8/site-packages/pandas/io/excel.py", line 402, in __init__
    filepath_or_buffer = _urlopen(filepath_or_buffer)
  File "/usr/lib/python3.8/urllib/request.py", line 222, in urlopen
    return opener.open(url, data, timeout)
  File "/usr/lib/python3.8/urllib/request.py", line 531, in open
    response = meth(req, response)
  File "/usr/lib/python3.8/urllib/request.py", line 640, in http_response
    response = self.parent.error(
  File "/usr/lib/python3.8/urllib/request.py", line 569, in error
    return self._call_chain(*args)
  File "/usr/lib/python3.8/urllib/request.py", line 502, in _call_chain
    result = func(*args)
  File "/usr/lib/python3.8/urllib/request.py", line 649, in http_error_default
    raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 404: Not Found

dld2517 (Author) commented May 7, 2020

Requirement already satisfied: pandas in /home/ddarden/.local/lib/python3.8/site-packages (0.24.2)
Requirement already satisfied: python-dateutil>=2.5.0 in /home/ddarden/.local/lib/python3.8/site-packages (from pandas) (2.8.1)
Requirement already satisfied: numpy>=1.12.0 in /home/ddarden/.local/lib/python3.8/site-packages (from pandas) (1.16.6)
Requirement already satisfied: pytz>=2011k in /home/ddarden/.local/lib/python3.8/site-packages (from pandas) (2019.3)
Requirement already satisfied: six>=1.5 in /usr/lib/python3/dist-packages (from python-dateutil>=2.5.0->pandas) (1.14.0)

chaosAD (Contributor) commented May 7, 2020

Springer renamed the Excel file, but the Python script was still trying to download it from the old link, hence the 404 error you encountered. Alex has fixed the link (see #85). Try downloading or cloning the repo again.
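
For anyone hitting the same traceback, here is a minimal sketch of how a script could turn the bare 404 into a message that points at the stale link. It is not the repo's actual code: `load_table` and its `reader` parameter are hypothetical names chosen for illustration, and the hint text is only a suggestion.

```python
import urllib.error


def load_table(url, reader):
    """Call reader(url) and turn an HTTP 404 into an error that hints at a stale link.

    `reader` is any callable that fetches and parses the URL (hypothetical parameter).
    """
    try:
        return reader(url)
    except urllib.error.HTTPError as err:
        if err.code == 404:
            # A 404 on a previously working link usually means the file moved.
            raise RuntimeError(
                f"{url} returned 404: the file may have been renamed or moved. "
                "Update to the latest version of the script and try again."
            ) from err
        # Any other HTTP error is passed through unchanged.
        raise
```

With pandas installed, the call in `main.py` could then look like `books = load_table(table_url, pd.read_excel)`.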

dld2517 (Author) commented May 7, 2020

Still didn't work. I used git fetch to redownload it and got the same issue. I think I'm done with the Python mess. I just used the spreadsheet, built the URLs with a concat function, and downloaded them with wget -O.
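
The spreadsheet workaround can also be sketched in Python. This is only an illustration under assumptions: the `"title"` and `"url"` keys and the `.pdf` extension are hypothetical, since the thread does not say which spreadsheet columns were concatenated. The function only builds the `wget -O` command lines; it does not run them.

```python
import shlex


def build_wget_commands(rows, extension=".pdf"):
    """Return one `wget -O <file> <url>` command line per row.

    Each row is a dict with hypothetical "title" and "url" keys.
    """
    commands = []
    for row in rows:
        # Replace path separators so the title is usable as a file name.
        filename = row["title"].replace("/", "-") + extension
        # shlex.join quotes arguments that contain spaces or shell metacharacters.
        commands.append(shlex.join(["wget", "-O", filename, row["url"]]))
    return commands
```

The resulting lines can be written to a file and run with a shell, which mirrors the manual approach described above.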

chaosAD (Contributor) commented May 8, 2020

After git fetch, did you git merge? If you didn't, the fix wouldn't be in your working directory, so you were still running the older script. I suggest git pull rather than git fetch. Be aware, though, that this only works smoothly if you haven't modified the code in your working directory. In my opinion, the best way is to start with a clean slate by running git clone or downloading the zip from GitHub. This would have saved you all the trouble.

In fact, I did suggest downloading/cloning the repo again in my previous post.
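
To check whether a fetch was ever merged, one option is to count the commits that exist on the upstream branch but not in the local HEAD. The sketch below assumes git is installed and the current branch tracks an upstream; `run_git`, `commits_behind_upstream`, and `describe` are hypothetical helper names, not part of the repo.

```python
import subprocess


def run_git(repo_dir, *args):
    """Run a git command inside repo_dir and return its trimmed stdout."""
    result = subprocess.run(
        ["git", "-C", repo_dir, *args],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


def commits_behind_upstream(repo_dir):
    """Count commits that were fetched from upstream but are not in HEAD yet."""
    return int(run_git(repo_dir, "rev-list", "--count", "HEAD..@{upstream}"))


def describe(behind):
    """Turn the commit count into a short hint."""
    if behind == 0:
        return "Working copy is up to date with the fetched upstream branch."
    return f"{behind} fetched commit(s) not merged yet; run `git merge` or `git pull`."
```

Calling `describe(commits_behind_upstream("."))` after a `git fetch` would report a non-zero count whenever the working directory still holds the older script.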
