Patched tumblr_backup.py to archive Likes #165

Closed
wants to merge 1 commit into from

Contributor

cherryband commented Dec 10, 2018

I happened to write this on my own some months ago, but I never thought of pushing it upstream. A few key differences from the implementation in #114:

  • There is actually a way to get the timestamp of when a post was liked. I used that timestamp instead of the original post's timestamp to order the likes, which means incremental backups should also work. However, it also means the original post's timestamp is not available in the backup.
  • I didn't limit the API call to 20 posts. It somehow still works when you request 50 at a time. It does not always return exactly 50, though, and I assume the missing ones are likes whose original posts have been deleted.
  • Every post is marked with its original poster. "Hacky fixes to archive likes" (#114) does not contain the necessary fix for this.
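
For anyone curious how the first two points fit together, here is a rough sketch. It is not the code from this PR. It assumes each liked post carries a `liked_timestamp` field and that the likes endpoint accepts a `before` timestamp, which is my understanding of the Tumblr API v2. `iter_likes`, `fetch_page`, and `make_fake_fetcher` are hypothetical names made up for illustration, and the fake fetcher only imitates the "short page" behaviour described above.

```python
from typing import Callable, Dict, Iterable, Iterator, List, Optional

Post = Dict[str, int]  # simplified: only the fields this sketch needs
FetchPage = Callable[[Optional[int], int], List[Post]]


def iter_likes(fetch_page: FetchPage, last_backup_ts: int = 0,
               limit: int = 50) -> Iterator[Post]:
    """Yield liked posts newest-first, stopping at likes already archived.

    fetch_page(before, limit) is a hypothetical callable that returns the
    likes made strictly before the `before` timestamp (or the newest likes
    when `before` is None), newest first.
    """
    before: Optional[int] = None
    while True:
        page = fetch_page(before, limit)
        if not page:
            # Only an empty page ends the walk. A short page is not a
            # reliable end marker, since pages can come back with fewer
            # than `limit` posts.
            return
        for post in page:
            if post["liked_timestamp"] <= last_backup_ts:
                # Incremental backup: older likes were saved by a previous run.
                return
            yield post
        # Continue from the oldest like seen so far. Likes sharing this exact
        # timestamp could be skipped; a real script would need to handle that.
        before = min(p["liked_timestamp"] for p in page)


def make_fake_fetcher(timestamps: Iterable[int],
                      deleted: Iterable[int] = ()) -> FetchPage:
    """Build an in-memory stand-in for the API, for demonstration only."""
    likes = sorted(timestamps, reverse=True)
    gone = set(deleted)

    def fetch_page(before: Optional[int], limit: int) -> List[Post]:
        window = [t for t in likes if before is None or t < before][:limit]
        # Deleted originals take up a slot but are not returned,
        # which produces the short pages.
        return [{"id": t, "liked_timestamp": t} for t in window if t not in gone]

    return fetch_page


if __name__ == "__main__":
    fetch = make_fake_fetcher(range(1, 11), deleted={8})
    print([p["id"] for p in iter_likes(fetch, limit=4)])
    print([p["id"] for p in iter_likes(fetch, last_backup_ts=5, limit=4)])
```

Note that this sketch would still stop early if an entire page came back empty because every post in it was deleted, so it does not explain or fix the missing posts.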

All the commands from tumblr_backup.py should work the same in tumblr_likes_backup.py (it's a fork). You will get the likes under the directory <blogname>'s likes. I was able to back up 9.7K out of 11K likes with this script. Hope you make good use of this!

Edit: I have manually compared the archive to my actual Tumblr likes, and the script is indeed missing a few posts. The cause is unknown, and I don't believe it has to do with my script, as I know the API doesn't always return the correct number of posts (see also #151). The gaps are mostly in the older part of the archive; newer likes are not missed.

@cherryband
Contributor Author

Here is the diff file from the original tumblr_backup.py, if anyone's interested.

@cebtenzzre
Collaborator

cebtenzzre commented Dec 10, 2018

Is BeautifulSoup necessary?
Also, the addition of the blog URL to the header probably belongs as a separate issue or PR.

@cherryband
Contributor Author

@cebtenzzre BeautifulSoup: the class is left dangling (unused), so no.
Also, I wasn't aware that my tumblr_backup.py was actually an old version, so I didn't know about the addition of --likes. Silly me! Thank you for the guidance. I'll close this PR and look for other ways to improve.
