This repository has been archived by the owner on Mar 9, 2021. It is now read-only.
The Tumblr API is now rate limited.
Back up your Index folder in the download location before running this version. On the first run, it permanently modifies your blog index files (*.tumblr). These files contain the information about already downloaded files and might end up broken after the upgrade.
- Saves blog databases as .json files (plain text) instead of a binary format. Allows modification in your text editor of choice.
- The url list is now a separate file (_files.tumblr, also saved as json). It is loaded on demand rather than held permanently in memory, which reduces memory usage.
- Stores only the filename of tumblr photo, video and audio posts, instead of the whole url. This lowers memory consumption, since a large part of the url is specific to the host rather than the file. The whole url was previously saved to prevent downloading the same file again, but since the host server changes, the filename should be sufficient for this task.
- The picture/video preview lags a bit in the beginning and might display nothing for several seconds, but it no longer freezes the whole application.
- Downloads inline images of all post types (#24).
- The picture preview now displays animated .gifs (#38).
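The filename-based duplicate check described above can be illustrated with a minimal Python sketch. This is not the application's actual code; the names `file_key`, `should_download` and `seen`, as well as the example urls, are hypothetical and only show why the filename alone can identify a file when the host part changes.

```python
from urllib.parse import urlsplit
import posixpath

def file_key(url: str) -> str:
    """Return only the filename (last path segment) of a url.

    Host, directories and query string are dropped, so the same
    file served from two different hosts yields the same key.
    """
    return posixpath.basename(urlsplit(url).path)

# Hypothetical in-memory index of already downloaded filenames.
seen = set()

def should_download(url: str) -> bool:
    """Return True the first time a filename is seen, False afterwards."""
    key = file_key(url)
    if key in seen:
        return False
    seen.add(key)
    return True
```

With this sketch, `should_download("https://a.example.com/x/photo_1280.jpg")` is accepted once, and a later call with `https://b.example.com/y/photo_1280.jpg` is skipped because both urls share the filename `photo_1280.jpg`.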
Rate limited Tumblr api:
Since mid-February 2017, the initial download phase, in which all the image, video and audio urls are searched for, has had to be slowed down. The servers now accept only a defined number of connections per time interval. If too many connections are opened, the servers stop responding and just close the connection with a 429 response (Limit exceeded); see #26 for more.
Therefore, this pre-release addresses this new issue by:
- Adding a rate limiter in the settings. The limit is given as a number of connections per time interval in seconds and might be increased. I've not tested these two values thoroughly, but they work without hitting the limit. Different solutions as mentioned in #26 are faster (e.g. crawl in small batches and start the download immediately) but require more work to implement properly. Only the initial evaluation period for grabbing the urls and meta information is slowed down. The picture, video and audio download is not impacted.
- It now shows an error if the api limit was reached. You should lower the limit for the api connections in the settings and re-crawl the specific blog, otherwise not all posts will be downloaded.
- Brings back some speed by simultaneously accessing the api and immediately downloading the first grabbed image, video and audio urls. It therefore no longer waits for the "evaluating xxx of xxx post" step to finish before starting to download.
- If a blog was successfully downloaded, the newest post id is saved. Upon the next download, only newer posts will be evaluated using the tumblr api, thus finishing the blog more quickly. A full rescan can be forced in the details view.
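As an illustration of the "number of connections per time in seconds" setting, here is a minimal sliding-window rate limiter in Python. It is a sketch under stated assumptions, not the implementation used by the application: the class name `RateLimiter`, its parameters and the injectable `clock`/`sleep` hooks are hypothetical, and the actual limits accepted by the Tumblr servers are not specified here.

```python
import time
from collections import deque

class RateLimiter:
    """Allow at most `max_connections` calls per `interval_seconds`."""

    def __init__(self, max_connections, interval_seconds,
                 clock=time.monotonic, sleep=time.sleep):
        self.max_connections = max_connections
        self.interval = interval_seconds
        self._clock = clock      # injectable for testing (assumption)
        self._sleep = sleep      # injectable for testing (assumption)
        self._stamps = deque()   # start times of recent connections

    def acquire(self):
        """Block until another connection may be opened."""
        while True:
            now = self._clock()
            # Forget connections that have left the time window.
            while self._stamps and now - self._stamps[0] >= self.interval:
                self._stamps.popleft()
            if len(self._stamps) < self.max_connections:
                self._stamps.append(now)
                return
            # Window is full: wait until the oldest entry expires.
            self._sleep(self.interval - (now - self._stamps[0]))
```

A crawler would call `acquire()` before each api request. Only the api requests need to pass through the limiter; the file downloads themselves would not, matching the note above that the picture, video and audio download is not impacted.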