Likelihood of rate limits/IP ban from Twitter? #385

Meorge · 2022-02-02T00:03:04Z

I'm currently trying to use snscrape to download Tweets from Twitter. According to my calculations, I should be getting around 2,200,000 Tweets in total by the time it finishes. I'm concerned about the possibility of getting IP banned from Twitter as a result of this. Is this something worth being concerned about, or should I not worry?

More generally:

How much scraping should be safe to do before worrying about getting rate-limited or IP banned?
If the concern of getting IP banned is valid, do these bans typically go away after a period of time, or are they permanent, etc?

This tool seems like a godsend, compared to the limits of the official Twitter API. Having a more solid understanding of the "safe zone" would make me feel more comfortable with using it. I know that maintainers can't guarantee anything about rate limits or IP bans, but if anyone has experience with where they begin to set in, knowing that would help a lot!

TheTechRobo · 2022-02-02T00:05:38Z

Well, recently JustAnotherArchivist has added an amazing feature. snscrape now reuses guest tokens across sessions. This prevents rate-limiting from burning through too many guest tokens.

Personally I have not been banned, and I've been downloading thousands of tweets recursively...

I recommend creating an "Ignore File" and writing the tweet data out every 5 tweets or so, so that you don't have to redo all your progress if/when you get banned (2.2 million tweets is a lot). I did something similar in my recursive tweet-downloading script.
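
One way the checkpointing idea above could look is sketched here. This is not TheTechRobo's actual script: the function names, the one-ID-per-line ignore file, the JSON Lines data file, and the assumption that each tweet is a dict with an "id" key are all hypothetical choices made for illustration.

```python
import json
from pathlib import Path


def load_seen_ids(ignore_path):
    """Return the set of tweet IDs already recorded in the ignore file."""
    path = Path(ignore_path)
    if not path.exists():
        return set()
    return {line.strip() for line in path.read_text().splitlines() if line.strip()}


def save_with_checkpoints(tweets, data_path, ignore_path, batch_size=5):
    """Append unseen tweets to data_path, flushing every batch_size tweets.

    `tweets` is any iterable of dicts with an "id" key (assumed shape).
    Returns the number of newly written tweets.
    """
    seen = load_seen_ids(ignore_path)
    pending = []
    written = 0

    def flush():
        # Write the data first, then the IDs, so that an ID is only
        # recorded as done once its tweet is already on disk.
        with open(data_path, "a", encoding="utf-8") as data_file:
            for tweet in pending:
                data_file.write(json.dumps(tweet) + "\n")
        with open(ignore_path, "a", encoding="utf-8") as ignore_file:
            for tweet in pending:
                ignore_file.write(str(tweet["id"]) + "\n")
        pending.clear()

    for tweet in tweets:
        tweet_id = str(tweet["id"])
        if tweet_id in seen:
            continue  # Already saved in an earlier run.
        seen.add(tweet_id)
        pending.append(tweet)
        written += 1
        if len(pending) >= batch_size:
            flush()
    if pending:
        flush()
    return written
```

If the run is interrupted, at most `batch_size - 1` tweets are lost, and restarting with the same two files skips everything already listed in the ignore file.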

JustAnotherArchivist · 2022-02-02T00:11:55Z

Based on my past experience, that should be fine. I've scraped many millions of tweets before in parallel without problems. I usually split such big runs up into monthly scrapes using a search query like keyword since:2022-01-01 until:2022-02-01 to fetch tweets from this January. Then I iterate over the months and finally check whether each monthly output file contains the expected results (e.g. whether the last result is close to midnight on the 1st).
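
The monthly split described above could be generated roughly as follows. This is a sketch, not JustAnotherArchivist's tooling: `monthly_queries` and `reaches_month_start` are hypothetical helpers, and the completeness check assumes results arrive newest first (so the last result should sit just after midnight at the start of the month), that timestamps are naive datetimes in the same timezone the search uses, and that a one-hour tolerance suits the keyword's volume.

```python
from datetime import date, datetime, timedelta


def monthly_queries(keyword, start, end):
    """Yield (since, until, query) for each calendar month in [start, end)."""
    current = start
    while current < end:
        # First day of the following month.
        if current.month == 12:
            next_month = date(current.year + 1, 1, 1)
        else:
            next_month = date(current.year, current.month + 1, 1)
        until = min(next_month, end)
        query = f"{keyword} since:{current.isoformat()} until:{until.isoformat()}"
        yield current, until, query
        current = next_month


def reaches_month_start(oldest_tweet_time, since, tolerance=timedelta(hours=1)):
    """True if the oldest result falls within `tolerance` after midnight on `since`."""
    month_start = datetime(since.year, since.month, since.day)
    return timedelta(0) <= oldest_tweet_time - month_start <= tolerance


if __name__ == "__main__":
    for since, until, query in monthly_queries("keyword", date(2021, 12, 1), date(2022, 3, 1)):
        print(query)
```

Each query string can then be passed to a separate scrape, and any month whose output fails the check can be rerun on its own. For a rarely used keyword the tolerance would need to be much looser, since a quiet month may legitimately have no tweets near midnight.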

See also #307

Meorge · 2022-02-02T01:58:07Z

Thank you for the info! Perhaps I'm in the minority on this, but it might be helpful to others to include this sort of anecdotal information somewhere, so that people have a better idea of how much they can expect to use snscrape before being in danger of getting banned or rate-limited. (Apologies if it's already available somewhere and I just didn't see it!)

JustAnotherArchivist · 2022-02-02T04:08:52Z

It's mentioned in some issues but not prominently. Documentation is WIP, and I agree it may be worth including some vague notes about it there.

cosmicoptima · 2022-02-17T23:00:23Z

I have been rate-limited by IP with a cooldown of half an hour to an hour; AFAICT it is not possible to get banned.
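
Given a cooldown of that length, a scraping script could simply wait and retry. The sketch below is an assumption-laden illustration rather than anything snscrape provides: `RateLimited` is a hypothetical exception that your own fetch function would raise when it detects throttling, and the 30-minute default only mirrors the lower end of the cooldown reported above.

```python
import time


class RateLimited(Exception):
    """Hypothetical marker raised by the caller's fetch function when throttled."""


def fetch_with_cooldown(fetch, cooldown_seconds=1800, max_attempts=3, sleep=time.sleep):
    """Call fetch(), sleeping for cooldown_seconds after each rate limit.

    Re-raises RateLimited if the final attempt is still throttled.
    `sleep` is injectable so the wait can be replaced in tests.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return fetch()
        except RateLimited:
            if attempt == max_attempts:
                raise
            sleep(cooldown_seconds)
```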

TheTechRobo · 2022-02-17T23:08:05Z

I've never been rate-limited before, at least that I've noticed.

cosmicoptima · 2022-02-17T23:22:54Z

my memory is hazy but it took... maybe 10-100 concurrent threads

TheTechRobo · 2022-02-17T23:59:27Z

Ok then that explains it, lol. I only have a few at a time

hyzhak · 2023-07-05T08:03:26Z

"At a high level, we are working to prevent these accounts from 1) scraping people’s public Twitter data to build AI models and 2) manipulating people and conversation on the platform in various ways."

https://business.twitter.com/en/blog/update-on-twitters-limited-usage.html

So even if it wasn't a problem before, it could be a problem now.

Meorge changed the title ~~Likelihood of IP ban from Twitter?~~ Likelihood of rate limits/IP ban from Twitter? Feb 2, 2022

JustAnotherArchivist added module:twitter question Further information is requested labels Feb 2, 2022

JustAnotherArchivist mentioned this issue Sep 13, 2022

Twitter Rate Limits? #551

Closed


bilgd mentioned this issue May 11, 2023

Scraping twitter in parallel #890

Closed

JustAnotherArchivist mentioned this issue Jun 26, 2023

Auth and rate limits with GraphQL API #984

Closed
