Where is the text of the tweets? #10

VictorSuarezL · 2019-04-12T09:41:51Z

I recently got access to the CREDBANK data and merged all the different databases. So far I have found the main topics, scores, and so on. I would love to use this corpus of tweets in a paper, but unfortunately I can't find the original text of the tweets. Where is it? Is it available in another resource? Did I miss anything?

One way of getting the original text of the tweets would be to use the tweet IDs with Twitter's REST API. But given the number of tweets and the time since they were posted, I am afraid this will either not be possible or will take a lot of time. So I was wondering whether it would be possible to get the text directly?
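To give a rough sense of the time involved, here is a minimal sketch that splits a list of tweet IDs into request-sized batches and estimates how long a rate-limited lookup would take. The batch size (100 IDs per request) and the rate limit (900 requests per 15-minute window) are assumptions for illustration only and are not confirmed anywhere in this thread, so check the current API documentation before relying on them. The function names `batch_ids` and `estimate_hours` are hypothetical.

```python
import math
from typing import Iterable, Iterator, List


def batch_ids(tweet_ids: Iterable[str], batch_size: int = 100) -> Iterator[List[str]]:
    """Yield successive lists of at most `batch_size` tweet IDs."""
    batch: List[str] = []
    for tweet_id in tweet_ids:
        batch.append(tweet_id)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        # Emit the final, possibly shorter, batch.
        yield batch


def estimate_hours(
    n_ids: int,
    batch_size: int = 100,           # assumed IDs per lookup request
    requests_per_window: int = 900,  # assumed rate limit per window
    window_minutes: int = 15,        # assumed rate-limit window length
) -> float:
    """Estimate wall-clock hours needed to look up `n_ids` tweets."""
    n_requests = math.ceil(n_ids / batch_size)
    n_windows = math.ceil(n_requests / requests_per_window)
    return n_windows * window_minutes / 60
```

Under these assumed limits, one million IDs would need 10,000 requests spread over 12 windows, which is about 3 hours. Deleted or protected tweets would not come back at all, so the recovered set is likely to be smaller than the ID list.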

BTW, thanks for sharing, and congrats on the great job!

marcodegra · 2019-04-13T12:14:59Z

Hello,

Here is a reply on the same topic that the creator of the dataset gave me some months ago.

"... about the *.json.gz files, these data files need to be retrieved from Twitter via their IDs. As such, I can’t share the tweet files directly (against Twitter’s terms of service).

At UMD, we maintain an archive of tweets extracted from Twitter’s 1% public sample, so I was able to pull a sample of tweets from that archive (you can find a similar one at the Internet Archive’s Twitter stream grab). You could also use Twitter’s API to rehydrate tweets slowly, or you could partner with Twitter/Gnip to rehydrate all the tweets in one go.

From there, the *.json.gz files are comprised of all relevant tweets, with each tweet occupying one line in the file."
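As an illustration of the file layout described in the quote, the following is a minimal sketch of how the text could be pulled out of such a file once you have one, whether from an archive sample or from your own rehydration. It assumes each line is a tweet object with the `id_str` (or `id`) and `text` fields of Twitter's tweet JSON; the function name `extract_texts` and the file path are hypothetical.

```python
import gzip
import json
from typing import Dict, Set


def extract_texts(gz_path: str, wanted_ids: Set[str]) -> Dict[str, str]:
    """Map tweet ID -> text for wanted tweets in a line-delimited *.json.gz file."""
    texts: Dict[str, str] = {}
    with gzip.open(gz_path, "rt", encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            try:
                tweet = json.loads(line)
            except json.JSONDecodeError:
                continue  # skip malformed lines instead of aborting
            if not isinstance(tweet, dict):
                continue
            # Assumed field names; non-tweet records such as delete
            # notices have neither field and are skipped below.
            tweet_id = tweet.get("id_str") or str(tweet.get("id", ""))
            if tweet_id in wanted_ids and "text" in tweet:
                texts[tweet_id] = tweet["text"]
    return texts
```

Calling `extract_texts("sample.json.gz", ids_from_credbank)` with the set of IDs taken from the CREDBANK files would return only the tweets that are present in that particular file, so several archive files may need to be scanned and their results merged.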

Hope you find this information useful :)

VictorSuarezL · 2019-04-24T08:19:50Z

Many thanks @marcodegra, this is exactly what I wanted to know. I saw several datasets in the Internet Archive's Twitter stream grab, so it seems pretty easy to merge them all together. Surely this takes a while, but it seems faster than using Twitter's API.

Thanks again for the info! 🎉

hansd410 · 2019-11-14T02:43:31Z

Where is the archive? Would you share the link please?

marcodegra · 2019-11-14T08:05:39Z

Hello hansd410, did you already try to follow the instructions to download it?

hansd410 · 2019-11-14T08:16:46Z

Hello marcodegra.
Yes, I downloaded the streaming data, but I can only find the tweet IDs.
I wonder whether I could get the tweet 'text' data through the archive.

marcodegra · 2019-11-14T08:21:27Z

Well, then please refer to the above comments.
Unfortunately, I can't help you more than that :(

hansd410 · 2019-11-14T08:40:32Z

Oh, I see. I had just guessed that @VictorSuarezL got an archive that matches the streaming data.
Thank you for the reply, @marcodegra!

myrainbowandsky · 2019-11-22T06:36:41Z

> Many thanks @marcodegra, this is exactly what I wanted to know. I saw several datasets in the Internet Archive's Twitter stream grab, so it seems pretty easy to merge them all together. Surely this takes a while, but it seems faster than using Twitter's API.
>
> Thanks again for the info! 🎉

Could you please show where the archive is?

AfrouzHojati · 2020-06-15T03:09:55Z

Thank you @VictorSuarezL for your help. If you were able to get the tweet text, would you please let us know how you got it?

Thank you again
