Releases: DocNow/twarc

v2.1.5

10 Jun 18:32
21833e4

A bugfix release for logging CONSUMER_KEY correctly.

v2.1.4

10 Jun 18:27
2f589e1

The log now includes a message about where the config file was loaded from. In addition, --verbose can be used to log more information, such as the keys being used to access the API. Most of the time this is probably not a good idea, since it makes your keys available in a log, but it can be useful when debugging error responses from the API.

v2.1.3

08 Jun 12:46
fb7b21b

This release includes a small fix for twarc timelines to ignore empty lines. #476

v2.1.2

01 Jun 19:37
e78352a

This release adds two new twarc2 subcommands:

conversations

conversations reads a file of tweets, finds the conversations they belong to, and downloads the full conversation thread for each one.

twarc2 conversations tweets.jsonl > conversations.jsonl

You can also give it a file of conversation_ids to download instead:

twarc2 conversations ids.txt > conversations.jsonl
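If you want to build that ids file yourself, one way is to pull the conversation_id out of tweets you have already collected. The following is only a sketch, not how twarc2 conversations works internally: the function names iter_tweets and conversation_ids are made up for illustration, and it assumes each line is either a Twitter API v2 response page with a data list or a single flattened tweet.

```python
import json
from typing import Iterable, Iterator, List


def iter_tweets(lines: Iterable[str]) -> Iterator[dict]:
    """Yield tweet objects from line-oriented JSON, skipping blank lines.

    Assumption: a line is either a response page ({"data": [...]})
    or a single tweet object with no "data" wrapper.
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue
        obj = json.loads(line)
        data = obj.get("data", obj)
        if isinstance(data, dict):
            data = [data]
        yield from data


def conversation_ids(lines: Iterable[str]) -> List[str]:
    """Return unique conversation_id values in first-seen order."""
    seen = set()
    ordered = []
    for tweet in iter_tweets(lines):
        cid = tweet.get("conversation_id")
        if cid and cid not in seen:
            seen.add(cid)
            ordered.append(cid)
    return ordered


# Made-up sample input: one response page, one blank line, one flattened tweet.
sample = [
    '{"data": [{"id": "1", "conversation_id": "10"}, {"id": "2", "conversation_id": "10"}]}',
    "",
    '{"id": "3", "conversation_id": "11"}',
]
for cid in conversation_ids(sample):
    print(cid)
```

Passing an open file handle instead of the sample list and redirecting the printed ids to ids.txt gives a file in the one-id-per-line shape the command above expects.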

timelines

Similarly, timelines reads a file of tweets and downloads the user timeline for every user who authored them.

twarc2 timelines tweets.jsonl > timelines.jsonl

You can also give it a file of user ids or usernames:

twarc2 timelines users.txt > timelines.jsonl

This functionality was first developed in the twarc-timelines plugin, which has been renamed to twarc-timeline-archive because it does some extra things, like writing timelines to separate directories and running on a schedule without redownloading previously downloaded data.

v2.1.1

31 May 19:38
68c35bb

Hot on the heels of v2.1.0, this bugfix release corrects the twarc2 flatten subcommand to ensure that data is both flattened and output as line-oriented JSON.

v2.1.0

31 May 19:25
c715bc0

v2.1.0 removes the --flatten option from many commands, in the hope of encouraging users to mostly work with the original data as retrieved from the Twitter API. The subcommand twarc2 flatten remains, mostly for use in data processing pipelines that expect line-oriented JSON where each object is a tweet:

twarc2 search blacklivesmatter | twarc2 flatten | jq .text 

The twarc.expansions.flatten() function has been updated to always return a list of tweets, and twarc.expansions.ensure_flattened() can be used to make sure data has been flattened already when processing tweet data. Since it is designed for use in twarc plugins and other pieces of software that need to work with tweets it is also available for import from twarc:

from twarc import ensure_flattened
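To make the idea of "a list of tweets" concrete, here is a deliberately simplified, standard-library-only illustration of what flattening a response page means. It is not twarc's implementation: simple_flatten is a hypothetical name, the sample page is made up, and the sketch only inlines the author from includes.users, whereas the real twarc.expansions.flatten() handles the other expansions as well.

```python
import json
from typing import List


def simple_flatten(response: dict) -> List[dict]:
    """Return the tweets in a response page as a list, with the author inlined.

    Hypothetical helper for illustration only. Assumes the Twitter API v2
    response shape: tweets under "data", expanded users under "includes".
    """
    users = {u["id"]: u for u in response.get("includes", {}).get("users", [])}
    data = response.get("data", [])
    if isinstance(data, dict):  # a single-tweet response
        data = [data]
    tweets = []
    for tweet in data:
        flat = dict(tweet)  # shallow copy so the input page is left untouched
        author = users.get(tweet.get("author_id"))
        if author is not None:
            flat["author"] = author
        tweets.append(flat)
    return tweets


# Made-up response page for demonstration.
page = {
    "data": [{"id": "1", "author_id": "7", "text": "hello"}],
    "includes": {"users": [{"id": "7", "username": "alice"}]},
    "meta": {"result_count": 1},
}

# Emit line-oriented JSON: one tweet object per line.
for tweet in simple_flatten(page):
    print(json.dumps(tweet))
```

In a plugin you would normally call ensure_flattened() from twarc rather than writing something like this yourself; the sketch is only meant to show the shape of the data before and after.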

In addition, this release includes twarc conversation for retrieving tweets from a particular conversation thread.

v2.0.13

21 May 01:18
e6a2fef

This bugfix release irons out some wrinkles that have been discovered during usage:

  • Fix for handling search responses that are missing the data stanza but still contain a next token for another page of results. #464
  • Stream diagnostics now go to stderr so they do not interfere with JSON being written to stdout. #456
  • twarc can be run from the command line as a module now: e.g. python -m twarc2 search barackobama #455
  • twarc2 command help text indicates times are UTC
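The first fix in the list (#464) is easier to picture with a small pagination loop. This is a hedged sketch rather than twarc's actual code: paginate and fetch_page are hypothetical names, and the only assumption about the API is that a page may carry meta.next_token even when it has no data key.

```python
from typing import Callable, Iterator, Optional


def paginate(fetch_page: Callable[[Optional[str]], dict]) -> Iterator[dict]:
    """Yield pages that contain data, following next_token until it runs out.

    fetch_page is a hypothetical callable that takes a pagination token
    (None for the first page) and returns the parsed JSON response.
    """
    token = None
    while True:
        page = fetch_page(token)
        # A page with no "data" stanza is skipped, but it must not end
        # the loop: it can still point at a further page of results.
        if "data" in page:
            yield page
        token = page.get("meta", {}).get("next_token")
        if token is None:
            break


# Made-up responses: the middle page has a next_token but no data.
PAGES = {
    None: {"data": [{"id": "1"}], "meta": {"next_token": "a"}},
    "a": {"meta": {"next_token": "b"}},
    "b": {"data": [{"id": "2"}], "meta": {}},
}

for page in paginate(lambda token: PAGES[token]):
    print([tweet["id"] for tweet in page["data"]])
```

The design point is that the stop condition depends only on the next token, never on whether the current page happened to contain tweets.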

v2.0.12

01 May 01:24
0277486

Bugfix for using --since-id.

v2.0.11

30 Apr 12:50
aee7964

Added the HTTP status code when emitting a message about unparseable JSON.

v2.0.10

29 Apr 14:52
5b43e24

This release adds the --max-results option to the search command to override the default of 100 for search/recent and 500 for search/all. See #449 for the details on why this is sometimes needed.