Normalize tags before saving #159
Comments
Fuzzy matching might help in this case. You could have any tags with similar enough strings get grouped, and still expose the underlying tags themselves.
Though I can't immediately think of a decent way to test all combinations within the tag set. Tags can be quite diverse on Tumblr.
I was having a similar problem with "/" in tags as well as with upper- and lower-case tags, so I modified tumblr_backup.py to quote and lower-case tags. Let me run a few more tests and I will try to add the code.
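To make the fuzzy-matching idea concrete, here is a minimal sketch using `difflib.SequenceMatcher` from the Python standard library. The function name `group_similar_tags` and the `threshold` value are hypothetical, chosen for illustration, and are not part of tumblr_backup.py. It compares each tag against one representative per group rather than against every other tag, which keeps the number of comparisons down but is still quadratic in the worst case.

```python
from difflib import SequenceMatcher


def group_similar_tags(tags, threshold=0.85):
    """Greedily group tags whose case-folded forms are similar enough.

    Returns a dict mapping a representative (case-folded) tag to the list
    of original tags in that group, so the underlying tags stay visible.
    `threshold` is an assumed cut-off and would need tuning on real data.
    """
    groups = []  # list of (representative, [original tags])
    for tag in tags:
        key = tag.casefold().strip()
        for representative, members in groups:
            # ratio() is 1.0 for identical strings, 0.0 for nothing in common
            if SequenceMatcher(None, representative, key).ratio() >= threshold:
                members.append(tag)
                break
        else:
            # No existing group is close enough: start a new one
            groups.append((key, [tag]))
    return {representative: members for representative, members in groups}


if __name__ == "__main__":
    sample = ["Todd Hido", "todd hido", "toddhido", "landscape"]
    for representative, members in group_similar_tags(sample).items():
        print(representative, members)
```

The grouping is order-dependent (the first tag seen becomes the representative), and a low threshold may merge unrelated short tags, so this is better treated as a starting point than as a finished approach.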
…iewing
This is for issue bbolli#159. I was having a similar issue with special characters as well as with upper/lower-case tags. I have added three new options and the code to implement them:
--normalize-tags - sets the tag text to lower case and creates a unique set to remove duplicates
--escape-tags - uses urllib.quote_plus to escape special characters in the tags
--fix-for-disk - adds an extra urllib.quote_plus when the URLs are being built, to account for weirdness when browsing from disk on Windows
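The three options described above could look roughly like the sketch below. This is not the code from the commit: the function names are hypothetical, and it uses `urllib.parse.quote_plus`, which is where Python 3 puts the `urllib.quote_plus` function mentioned in the commit message. The double quoting in `escape_for_disk` is an assumption about what the "extra quote_plus" does, namely that a browser decodes the link once, so the decoded result still matches a file name that was itself escaped.

```python
from urllib.parse import quote_plus


def normalize_tags(tags):
    """Lower-case tags and drop duplicates, keeping first-seen order."""
    seen = set()
    result = []
    for tag in tags:
        key = tag.lower()
        if key not in seen:
            seen.add(key)
            result.append(key)
    return result


def escape_tag(tag):
    """Escape special characters such as "/" so the tag is safe in a path."""
    return quote_plus(tag)


def escape_for_disk(tag):
    """Escape twice (assumed behaviour of the extra quoting for disk browsing)."""
    return quote_plus(quote_plus(tag))


if __name__ == "__main__":
    tags = normalize_tags(["Todd Hido", "todd hido", "Art/Photo"])
    for tag in tags:
        print(tag, escape_tag(tag), escape_for_disk(tag))
```

Keeping normalization separate from escaping means the displayed tag text can stay readable while only the directory and link names are escaped.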
I updated the documentation with the three new options I added for issue bbolli#159.
I have many posts tagged with real names, in which I sometimes used upper-case letters and sometimes did not. The issue is that on the tag index pages these tags are treated as different: for example, "Todd Hido" is different from "todd hido". Would it be possible to normalize every tag beforehand by making them all lower case?