The extraction of the subnetwork in the Github dataset #5

Open
xiaowenmasfather opened this issue Jan 10, 2021 · 1 comment

@xiaowenmasfather

Hi,
Thanks for providing this implementation of dynamic network representation.
I am currently researching dynamic network embedding, and I am curious about the extraction of the subnetwork with 284 users in the Github dataset in your code.
I have downloaded the whole dataset for 2013 and preprocessed it into the form [src_id, dst_id, snapshot_id], keeping only events of type "FollowEvent", with a 7-day interval between adjacent snapshots.
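For reference, the binning into 7-day snapshots described above can be sketched roughly as follows. This is only an illustration: the function name `to_snapshot_triples` and the input layout (tuples of source id, destination id, and a `datetime` timestamp) are assumptions, not something taken from the repository.

```python
from datetime import datetime, timedelta

def to_snapshot_triples(events, start, days_per_snapshot=7):
    """Map (src_id, dst_id, timestamp) events to (src_id, dst_id, snapshot_id).

    Assumed input: `events` is an iterable of tuples whose third element
    is a datetime; `start` is the datetime at which snapshot 0 begins.
    """
    window = timedelta(days=days_per_snapshot)
    triples = []
    for src, dst, ts in events:
        if ts < start:
            continue  # ignore events before the first snapshot
        # timedelta // timedelta yields an integer number of whole windows
        snapshot_id = (ts - start) // window
        triples.append((src, dst, snapshot_id))
    return triples

events = [
    (1, 2, datetime(2013, 1, 1)),
    (2, 3, datetime(2013, 1, 7, 23, 59)),
    (3, 1, datetime(2013, 1, 8)),
]
triples = to_snapshot_triples(events, start=datetime(2013, 1, 1))
```

Here the first two events fall into snapshot 0 and the third into snapshot 1.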
After removing nodes with only one edge connected to the network, there are still more than 900k nodes.
When I try to generate adjacency matrices in different snapshots, I find that most nodes have degrees lower than 20.
However, if I keep only the nodes with degree above some threshold (say 20), the network induced by the selected nodes can be totally different: a node's high degree may come mostly from low-degree neighbours that have been removed, so a selected high-degree node can end up with a low degree after the selection.
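One standard way around this particular problem is k-core pruning: nodes with degree below k are removed repeatedly until every remaining node has at least k neighbours among the remaining nodes. The sketch below is a generic illustration of that technique and is not necessarily how the 284-user subnetwork was chosen; the function name `k_core_nodes` and the plain edge-list input are assumptions.

```python
from collections import defaultdict, deque

def k_core_nodes(edges, k):
    """Return the nodes of the k-core of an undirected graph.

    Assumed input: `edges` is an iterable of (u, v) pairs; direction,
    duplicates, and self-loops are ignored.
    """
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:
            adj[u].add(v)
            adj[v].add(u)

    degree = {n: len(neigh) for n, neigh in adj.items()}
    queue = deque(n for n, d in degree.items() if d < k)
    removed = set(queue)

    # Peel low-degree nodes; each removal may push neighbours below k.
    while queue:
        n = queue.popleft()
        for m in adj[n]:
            if m in removed:
                continue
            degree[m] -= 1
            if degree[m] < k:
                removed.add(m)
                queue.append(m)

    return set(adj) - removed

# A hub whose degree comes only from leaves does not survive pruning.
star = [(0, leaf) for leaf in range(1, 6)]
star_core = k_core_nodes(star, 2)

# A 4-clique with a pendant chain 4-5-6 keeps only the clique for k=3.
clique = [(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4), (4, 5), (5, 6)]
clique_core = k_core_nodes(clique, 3)
```

Each edge is visited a constant number of times, so the pruning is linear in the number of edges, which should be manageable for a graph with around 900k nodes. Applying it to the union of all snapshots rather than to each snapshot separately is a design choice that would still need to be checked against the data.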

One possible approach I have thought of is to perform community clustering across the snapshots.
But I currently have no idea how to implement this efficiently, given the large number of nodes.

So I wonder how you selected the 284 nodes to guarantee that they have dense events with each other.
Could you provide your selection code?
Looking forward to your reply.

@bknyaz
Collaborator

bknyaz commented Jan 13, 2021

Hi, thanks for your question. I found the preprocessing script, but it's quite messy. Let me try to clean it up and upload it asap.
