Distributed Graph Builder application to convert text datasets from Wikipedia into graph datasets.

commutativity/GraphBuilder

GraphBuilder


This repository provides the source code for the Graph Builder application, which was developed for the master's thesis "Generating Graph Datasets: Conceptualization of a Graph Builder for the Wikipedia Encyclopaedia".

The application converts datasets from the Wikipedia encyclopaedia into graph datasets, which can then be imported into graph exploration software such as Gephi. A video in German about the Graph Builder application is also available: https://youtu.be/Ca_VwM6rmWI

Generated graph dataset demos

The demo directory provides GEXF graph datasets that were constructed with the application. The graph datasets can be visualized and explored in Gephi, which is open source and available at https://gephi.org/users/download/. The first figure below was created with the demo-one dataset and Gephi's ForceAtlas layout algorithm. The second figure was created with the Sydney dataset, with the nodes coloured by category.
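Before opening a dataset in Gephi, it can be useful to check how many nodes and edges a GEXF file contains. The following is a minimal sketch using only the Python standard library. The sample document, the function name summarize_gexf, and the labels in it are made up for illustration and are not taken from the demo directory; the sketch also assumes the file follows the usual GEXF layout of node and edge elements.

```python
import xml.etree.ElementTree as ET

# Tiny hand-written GEXF sample (hypothetical, not one of the demo datasets).
SAMPLE_GEXF = """<gexf xmlns="http://www.gexf.net/1.2draft" version="1.2">
  <graph defaultedgetype="directed">
    <nodes>
      <node id="0" label="Sydney"/>
      <node id="1" label="Australia"/>
      <node id="2" label="New South Wales"/>
    </nodes>
    <edges>
      <edge id="0" source="0" target="1"/>
      <edge id="1" source="0" target="2"/>
    </edges>
  </graph>
</gexf>"""


def local_name(tag):
    """Drop the '{namespace}' prefix so the code does not depend on the GEXF version."""
    return tag.rsplit("}", 1)[-1]


def summarize_gexf(xml_text):
    """Count node and edge elements and collect the node labels."""
    root = ET.fromstring(xml_text)
    labels = []
    edge_count = 0
    for element in root.iter():
        name = local_name(element.tag)
        if name == "node":
            labels.append(element.get("label", ""))
        elif name == "edge":
            edge_count += 1
    return {"nodes": len(labels), "edges": edge_count, "labels": labels}


if __name__ == "__main__":
    print(summarize_gexf(SAMPLE_GEXF))
```

To try this on one of the demo datasets, read the file contents and pass them to summarize_gexf. Stripping the namespace from each tag is a deliberate shortcut so that the same code handles files written with different GEXF schema versions.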


Requirements

The requirements for the GraphBuilder application are:

  • Hadoop 3.3.1
  • Most other dependencies are set by the build.sbt file
  • The application has been run with 12 cores and 16 GB of RAM
  • At least 150 GB of disk space is required for the datasets from the encyclopaedia
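The disk space requirement can be checked before downloading the datasets. This is a small sketch, not part of the application: the function name has_enough_disk is hypothetical, and it assumes that 1 GB means 10^9 bytes and that the datasets will be stored on the volume containing the given path.

```python
import shutil

REQUIRED_GB = 150  # disk space figure from the requirements list above


def has_enough_disk(path, required_gb=REQUIRED_GB):
    """Return True if the volume holding `path` has at least `required_gb` GB free.

    Assumption: 1 GB is counted as 10**9 bytes.
    """
    free_bytes = shutil.disk_usage(path).free
    return free_bytes >= required_gb * 10**9


if __name__ == "__main__":
    if has_enough_disk("."):
        print("Enough free disk space for the datasets.")
    else:
        print(f"Less than {REQUIRED_GB} GB free on this volume.")
```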
