Currently, queries in the "users", "search", and "likes" modes conflict because they reuse some of the same cache files, as noted in #17 and #18.
For example, under default settings, here are the cache files associated with the target eleurent for different modes:
users mode: cache data stored in out/eleurent/cache/followers.json and out/eleurent/cache/friends.json
search mode: cache data stored in out/eleurent/cache/followers.json and out/eleurent/cache/tweets.json
likes mode: cache data stored in out/eleurent/cache/followers.json and out/eleurent/cache/tweets.json
Note also that in all three modes the final results are written to out/eleurent/edges.csv and out/eleurent/nodes.csv. Running different modes against the same target therefore overwrites the result graph, which could lead to data loss.
It may make more sense to add a new directory layer under the target name, named after the mode being used, i.e. use these files instead:
users mode: cache data stored in out/eleurent/users/cache/followers.json and out/eleurent/users/cache/friends.json
search mode: cache data stored in out/eleurent/search/cache/authors.json and out/eleurent/search/cache/tweets.json
likes mode: cache data stored in out/eleurent/likes/cache/authors.json and out/eleurent/likes/cache/tweets.json
Perhaps the output files could sit under this new directory layer as well, so that the final outputs of the different modes do not overwrite each other. E.g.
users mode: final results stored in out/eleurent/users/edges.csv and out/eleurent/users/nodes.csv
search mode: final results stored in out/eleurent/search/edges.csv and out/eleurent/search/nodes.csv
likes mode: final results stored in out/eleurent/likes/edges.csv and out/eleurent/likes/nodes.csv
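A minimal sketch of how the proposed layout could be computed, in case it helps the discussion. The names `mode_paths` and `CACHE_FILES` are hypothetical and not taken from the project's code; only the directory and file names come from the lists above.

```python
from pathlib import Path

# Cache file names per mode, as proposed in this issue.
# CACHE_FILES and mode_paths are hypothetical names used for illustration only.
CACHE_FILES = {
    "users": ("followers.json", "friends.json"),
    "search": ("authors.json", "tweets.json"),
    "likes": ("authors.json", "tweets.json"),
}


def mode_paths(target, mode, out_dir="out"):
    """Return the cache and result paths for one target and one mode."""
    if mode not in CACHE_FILES:
        raise ValueError(f"unknown mode: {mode!r}")
    # The extra directory layer: out/<target>/<mode>/
    base = Path(out_dir) / target / mode
    cache_dir = base / "cache"
    return {
        "cache": [cache_dir / name for name in CACHE_FILES[mode]],
        "edges": base / "edges.csv",
        "nodes": base / "nodes.csv",
    }


if __name__ == "__main__":
    for mode in CACHE_FILES:
        paths = mode_paths("eleurent", mode)
        print(mode, [p.as_posix() for p in paths["cache"]], paths["edges"].as_posix())
```

Because every path includes the mode, no two modes share a cache file or a result file for the same target.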
nadesai changed the title from Caching logic for "users", "search", and "likes" mode conflicts to Cache and output files for "users", "search", and "likes" mode conflicts on Feb 2, 2022
The only "desirable" conflict that I can think of is friendships.json, which can be filled a first time when creating the graph of someone's followers and reused later when creating the graph of their friends, since there is probably a significant overlap. But that is not an essential feature, and it would be preserved by your suggestion anyway, so yes, by all means :)