Link to the downloaded books #119

Open
kangli-bionic opened this issue May 26, 2020 · 3 comments

Comments

@kangli-bionic

The script is not easy to run to completion when downloading all the books; I finally managed to finish on my fifth attempt. To save people some time, I'm sharing a link to the downloaded books.

The last 13 titles in the Excel file are currently missing.

https://drive.google.com/drive/folders/1JC15m__PbPaowQ7k2zS1-Us72yvROCQs?usp=sharing

@valahna

valahna commented Jun 24, 2020

For those who want the recently released 1000 books for summer, and for anyone who wants to learn how I approached downloading any freely available set:

I grabbed the CSV report from the search page, which contains the links and some metadata. I parsed the URLs and DOIs into direct download links and created two text files: one for the PDF versions and one for the EPUB versions. I then imported these files into DownThemAll (a download manager extension), which proceeded to download all one thousand books. Not all of them have an EPUB version, so some of those downloads will fail. I kept the simultaneous download limit at 8-10, and it worked fine. Whether that is because the extension acts as a complete browser and handles the cookies, headers, and so on for you, or because limiting the number of simultaneous downloads kept it from being flagged as a bot/script, I don't know. Further testing and data would be needed to determine this.
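As a rough illustration of the link-building step described above, here is a minimal Python sketch. The column name `Item DOI`, the two URL templates, and the percent-encoding of the DOI are assumptions, not details taken from this comment; check them against the actual CSV export and a working download link before relying on them. The function names and the sample DOI are hypothetical.

```python
import csv
import io
from urllib.parse import quote

# ASSUMPTION: these URL patterns are illustrative guesses, not confirmed
# by the comment above. Verify them against a real download link.
PDF_TEMPLATE = "https://link.springer.com/content/pdf/{doi}.pdf"
EPUB_TEMPLATE = "https://link.springer.com/download/epub/{doi}.epub"


def build_links(csv_text, doi_column="Item DOI"):
    """Return (pdf_links, epub_links) for every CSV row that has a DOI.

    ASSUMPTION: the export has a DOI column named `doi_column`.
    """
    pdf_links, epub_links = [], []
    for row in csv.DictReader(io.StringIO(csv_text)):
        doi = (row.get(doi_column) or "").strip()
        if not doi:
            continue  # skip rows without a DOI
        # ASSUMPTION: the DOI is percent-encoded, so "/" becomes "%2F".
        encoded = quote(doi, safe="")
        pdf_links.append(PDF_TEMPLATE.format(doi=encoded))
        epub_links.append(EPUB_TEMPLATE.format(doi=encoded))
    return pdf_links, epub_links


def write_list(path, links):
    """Write one link per line, a format most download managers can import."""
    with open(path, "w", encoding="utf-8") as fh:
        fh.write("\n".join(links) + "\n")
```

Calling `build_links` on the contents of the CSV and passing each returned list to `write_list` (for example with the hypothetical file names `pdf_links.txt` and `epub_links.txt`) would produce the two text files to import into the download manager.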

Then I wrote a script to parse the CSV and update the PDFs with the metadata using exiftool, and renamed the files to something other than the DOI. I compressed them into three archives, one with the PDFs and two with the EPUBs. You can find the CSV I used and the archives with all the books here: Mega Hosted
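The tagging and renaming step could look something like the sketch below. This is not the actual script, only an illustration under assumptions: it writes just the `Title` and `Author` tags, it uses the book title as the new file name, and the helper names are hypothetical. It relies on the `exiftool` command-line tool being installed and uses its `-overwrite_original` option so that no backup copies are left behind.

```python
import re
import subprocess
from pathlib import Path


def safe_filename(title, max_len=120):
    """Turn a book title into a file name that is valid on common filesystems."""
    cleaned = re.sub(r'[\\/:*?"<>|]+', " ", title)  # drop reserved characters
    cleaned = re.sub(r"\s+", " ", cleaned).strip()  # collapse whitespace
    return cleaned[:max_len].rstrip() or "untitled"


def exiftool_command(pdf_path, title, author):
    """Build the exiftool argument list that writes Title and Author."""
    return [
        "exiftool",
        "-overwrite_original",  # edit in place, no "_original" backup file
        f"-Title={title}",
        f"-Author={author}",
        str(pdf_path),
    ]


def tag_and_rename(pdf_path, title, author):
    """Write the metadata, then rename the file after the title.

    ASSUMPTION: the downloaded file is named after its DOI and the
    title and author come from the matching CSV row.
    """
    pdf_path = Path(pdf_path)
    subprocess.run(exiftool_command(pdf_path, title, author), check=True)
    target = pdf_path.with_name(safe_filename(title) + pdf_path.suffix)
    pdf_path.rename(target)
    return target
```

Passing the arguments as a list instead of a shell string avoids quoting problems with titles that contain spaces or quotation marks.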

To chaosAD's point, my approach is certainly more of a "cat and mouse" approach, and not as elegant and refined as a script that handles all of this for you; however, I think it is a little impractical for someone to visit each book's Springer page and click two download buttons for all one thousand titles.

@CyclopeanBee

> The script is not easy to run to completion if downloading all books. Finally was able to finish downloading on my fifth attempt. So to save people some time I'm sharing a link to the downloaded books.
>
> Currently missing the last 13 titles on the excel file.
>
> https://drive.google.com/drive/folders/1JC15m__PbPaowQ7k2zS1-Us72yvROCQs?usp=sharing

I have 11 of the missing books! The final two didn't have download links any more when I checked.

@AntoineSoetewey

Hello @kangli-bionic,

Can you confirm that your Google Drive link and the books will remain accessible for as long as possible?

I would like to include your link at the top of this article, so I want to make sure the books will not be removed soon.

Thanks again for having downloaded the books!

Regards,
Antoine
