DeviantArt can now be ripped #1548

Closed
rautamiekka opened this issue Jan 9, 2020 · 7 comments

Comments

@rautamiekka
Contributor

  • Ripme version: 1.7.90

I've made a "simple" Python 3.7+ script which successfully rips non-mature content from a whole DeviantArt gallery. The way DA works now, it's impossible to get the mature content without logging in, though note I haven't tested how difficult it would be to use cookies.

Also due to the way DA is made now, using the download button on the deviation pages is impossible without logging in.

Ripping is a pretty simple thing to do, surprisingly:

  1. Download the 1st page using an address like "https://www.deviantart.com/search/deviations?page=1&q=by%3A" + username.
  2. Run RegEx <span class="_4pI41">([0-9]+) results</span> on the 1st search page. So far the random-looking class name has been identical across the separate attempts I've made.
  3. Calculate ceil(RESULTS / 24) where ceil(...) rounds up to the next whole number; 61 / 24 = 2.541666667, which must become 3. 24 is the number of items on each search page.
  4. Download the rest of the pages.
  5. Run RegEx "data-hook=\"deviation_link\"[\\t ]+href=\"(https?://(?:www\\.)?deviantart\\.[^/]+/" + username + "/art/[^\"]+)\"" on all of the search pages you downloaded.
  6. Download the deviation pages using the links you got from the RegEx above.
  7. Run RegEx "<link[\\t ]+data-rh=\"true\"[\\t ]+rel=\"preload\"[\\t ]+href=\"([^\"]+)\"[\\t ]+as=\"image\"[\\t ]*/>" on the deviation pages you downloaded.
  8. Get the pure filename from the link by link.split('/')[-1].split("?")[0].
  9. Download the files using the links you got from the RegEx above, naming the file as the filename you got from above.
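
The parsing steps above can be sketched in Python roughly as follows. This is not the script mentioned in this issue, only a minimal illustration of steps 1 to 3, 5, 7 and 8; the function names are made up for the example, the downloading itself is left out, and the regexes only work as long as DA keeps the markup described above.

```python
import math
import re
from typing import List, Optional

PAGE_SIZE = 24  # items per search page, as stated in step 3


def search_url(username: str, page: int) -> str:
    """Step 1: address of one search page for a user (helper name is hypothetical)."""
    return "https://www.deviantart.com/search/deviations?page=%d&q=by%%3A%s" % (page, username)


def page_count(first_page_html: str) -> int:
    """Steps 2-3: read the result counter and round up to whole pages."""
    match = re.search(r'<span class="_4pI41">([0-9]+) results</span>', first_page_html)
    if match is None:
        raise ValueError("result counter not found; the class name may have changed")
    return math.ceil(int(match.group(1)) / PAGE_SIZE)


def deviation_links(search_html: str, username: str) -> List[str]:
    """Step 5: collect the deviation page links from one search page."""
    pattern = (
        r'data-hook="deviation_link"[\t ]+href="(https?://(?:www\.)?deviantart\.[^/]+/'
        + re.escape(username)
        + r'/art/[^"]+)"'
    )
    return re.findall(pattern, search_html)


def image_url(deviation_html: str) -> Optional[str]:
    """Step 7: find the preloaded image link on a deviation page, if any."""
    match = re.search(
        r'<link[\t ]+data-rh="true"[\t ]+rel="preload"[\t ]+href="([^"]+)"[\t ]+as="image"[\t ]*/>',
        deviation_html,
    )
    return match.group(1) if match else None


def filename_from(link: str) -> str:
    """Step 8: last path segment of the link, without the query string."""
    return link.split("/")[-1].split("?")[0]
```

Raw strings are used for the patterns so the backslashes don't need doubling, and the username goes through re.escape in case it contains characters like "-" or "." that mean something in a regex.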
@ReppiksProductions

Greetings rautamiekka,

Pretty awesome you found a way to rip DA again! I'm not too savvy with programming and especially anything to do with editing ripme. Could you please explain how to implement these changes to ripme?

Cheers

@rautamiekka
Contributor Author

> Greetings rautamiekka,
>
> Pretty awesome you found a way to rip DA again! I'm not too savvy with programming and especially anything to do with editing ripme. Could you please explain how to implement these changes to ripme?
>
> Cheers

The best I can do is publish my Python code, since I don't know much Java; it'll be fairly simple to follow the code and google the imports and functions, which have simple official documentation.

@sgtrusty

sgtrusty commented Feb 4, 2020

Can you post the code somewhere? Gist maybe?

@rautamiekka
Contributor Author

> Can you post the code somewhere? Gist maybe?

I suppose I could post this one that works, but it only works with usernames and with manually asking the program to acquire either the full gallery, all favs, or all scraps. It doesn't support folders and URLs yet.

@rautamiekka
Contributor Author

rautamiekka commented May 25, 2020

The code has a problem now: for some reason it can't fetch the webpage where it reads the gallery size, and even if it could there's another problem with the upload count: if one has uploaded more than 999 pieces, the counter changes to Xk, where X is the number of uploads in thousands as a decimal number and k is short for thousand. I assume it'd change to XM, where M is short for million, if someone actually uploaded that many.
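
To show why the abbreviated counter is a problem, here's a minimal Python sketch of parsing it. The exact counter text isn't quoted in this issue, so the accepted formats ("999", "1.2k", "2.5M") are assumptions, the M suffix is only the guess made above, and parse_abbreviated_count is a made-up name.

```python
import re

# Multipliers for the suffixes; "m" is an assumption, it hasn't been seen in practice.
_SUFFIXES = {"": 1, "k": 1_000, "m": 1_000_000}


def parse_abbreviated_count(text: str) -> int:
    """Turn a counter like '999' or '1.2k' into an approximate integer."""
    match = re.fullmatch(r"\s*([0-9]+(?:\.[0-9]+)?)\s*([kKmM]?)\s*", text)
    if match is None:
        raise ValueError("unrecognised counter: %r" % text)
    number, suffix = match.groups()
    # The result is only as precise as the digits shown, so "1.2k" may
    # stand for anything from roughly 1150 to 1249 uploads.
    return round(float(number) * _SUFFIXES[suffix.lower()])
```

Because the real count can be dozens of uploads away from the parsed value, a page count calculated from it could come out one or more pages short or long.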

That is worse than it sounds: the app won't be able to accurately tell the ripping progress, so I figured the progress reporting has to be dropped altogether, and the JSONs need to be fetched incrementally, since each JSON explicitly tells whether there's another batch of uploads.
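
The incremental fetching could look roughly like the Python sketch below. The JSON layout isn't described in this issue, so the field names "results", "hasMore" and "nextOffset" are purely hypothetical placeholders, and fetch_batch stands in for whatever function downloads and decodes one JSON batch.

```python
from typing import Any, Callable, Dict, Iterator


def iter_uploads(fetch_batch: Callable[[int], Dict[str, Any]]) -> Iterator[Any]:
    """Yield uploads batch by batch until a batch says there is nothing more.

    fetch_batch(offset) must return one decoded JSON batch. The keys used
    here ("results", "hasMore", "nextOffset") are hypothetical.
    """
    offset = 0
    while True:
        batch = fetch_batch(offset)
        for item in batch["results"]:
            yield item
        if not batch.get("hasMore"):
            break  # the batch itself reports that this was the last one
        offset = batch["nextOffset"]
```

Since the total is never known up front, the loop can only report how many items it has seen so far, not a percentage.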

That, however, has its own problems I've already forgotten.

@rautamiekka
Contributor Author

rautamiekka commented Oct 18, 2021

"A bit" late, but I gave up on my own downloader idea even before I found out about gallery-dl. Even if grossly simplified, my code wouldn't be nearly as simple as the then-working one I had, and writing the framework for using the Eclipse API would be too much for me.

@metaprime
Contributor

[Mega-Thread] DeviantArt ripper is broken; yes we know -- de-duping all other DeviantArt issues to this one. #2063
