Importing Data
The Vistorian has two ways of importing data.
- Basic: you need to create a node table, edge table, node schema, and link schema. This is explained in the Home section.
- Through importers that can read a specific data format. Ultimately, all importers go through the basic import above, which serves as the common intermediate format.
If data is loaded other than through one of the importers, a DataSet object must be created manually. A DataSet object looks as follows:
var dataset = new networkcube.DataSet({
name: 'myAwesomeNetwork',
nodeTable: [[..some data..]], // node table here
linkTable: [[..some data..]], // edge table here
nodeSchema: {id:0, label:1}, // some node schema
linkSchema: {id:0, source:1, target:2} // some link schema
});
All of these fields are required. They guarantee that the network is in the minimal format. The basic data import takes two tables:
- a node table that contains one row per node with its respective attributes, and
- a link table that contains one row per link with its respective attributes.
Once the DataSet object is created, it can be imported via:
var session = 'someSessionStringHere';
window.vc.main.importData(session, dataset);
Again, this method is recommended only when no suitable importer function is available.
The respective node and link schemas are simple JSON objects that map data attributes to table columns. For example:
var nodeSchema = {
  id: 0
};
var linkSchema = {
  id: 0,
  source: 1,
  target: 2
};
The above code creates two schemas assigning the required attributes for each type, node and link.
You can add optional properties to any schema, e.g. an age to a node, or some other value to a link; just specify the attribute in the respective schema and make sure the table column is valid. The property will be added to the schema.
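As a minimal sketch of the idea, the following adds a hypothetical age column to a node table and looks it up through the schema. The sample rows and the readAttribute helper are assumptions made for illustration; they are not part of networkcube.

```javascript
// Node table: one row per node; the columns are id, label, age.
const nodeTable = [
  [0, 'Ana', 34],
  [1, 'Bob', 41],
];

// The schema maps attribute names to column indices.
// `age` is an optional, user-defined attribute (an assumption for this example).
const nodeSchema = { id: 0, label: 1, age: 2 };

// Hypothetical helper: read an attribute of a row through the schema.
// Returns undefined when the schema does not assign the attribute to a column.
function readAttribute(row, schema, attribute) {
  const column = schema[attribute];
  return column === undefined ? undefined : row[column];
}

console.log(readAttribute(nodeTable[1], nodeSchema, 'age')); // 41
```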
A node schema can have the following attributes:
- id (required): node ID. Must be unique, start at 0, and increase consecutively.
- label (optional): label shown for this node. If not provided, the node ID is shown.
- location (optional): a node's geographic location. Can change over time.
- time (optional): associates a time stamp with this node. Attributes in this row are valid only at this time stamp. This can be used to, e.g., change a node's type or label over time.
- nodeType (optional): a node's type.
A link schema can have the following attributes:
- id (required): link ID. Must be unique, start at 0, and increase consecutively.
- source (required): ID of the source node.
- target (required): ID of the target node.
- time (optional): associates a time stamp with this link. Attributes in this row are valid only at this time stamp. This can be used to, e.g., change a link's type or weight over time.
- weight (optional): a link's weight. Can change over time.
- linkType (optional): a link's type. Can change over time.
- directed (optional): whether a link is directed or not. If not explicitly indicated, a link is treated as directed.
Tables used in the basic import have to be normalized and must link node, link, and location tables through IDs. That means:
- Each table's first column is an id (e.g., a node id, link id, location id).
- Ids start from 0.
- source and target (in the link table) contain ids from the node table.
- location (in the node table) contains ids that link to a location table.
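The rules above can be checked before importing. The following is a rough sketch only: checkTables is a hypothetical helper, not a networkcube function, and it assumes that ids equal the row index (0, 1, 2, ...), which is one reading of "start from 0".

```javascript
// Returns a list of problems; an empty list means the tables pass these checks.
function checkTables(nodeTable, linkTable, nodeSchema, linkSchema) {
  const problems = [];

  // Assumption: ids are 0, 1, 2, ... in row order.
  nodeTable.forEach((row, i) => {
    if (row[nodeSchema.id] !== i) problems.push(`node row ${i}: expected id ${i}`);
  });
  linkTable.forEach((row, i) => {
    if (row[linkSchema.id] !== i) problems.push(`link row ${i}: expected id ${i}`);
  });

  // source and target must refer to ids that exist in the node table.
  const nodeIds = new Set(nodeTable.map((row) => row[nodeSchema.id]));
  linkTable.forEach((row, i) => {
    for (const end of ['source', 'target']) {
      const id = row[linkSchema[end]];
      if (!nodeIds.has(id)) problems.push(`link row ${i}: unknown ${end} ${id}`);
    }
  });
  return problems;
}

const nodes = [[0, 'Ana'], [1, 'Bob'], [2, 'Cyril']];
const links = [[0, 0, 1], [1, 2, 0]];
const nodeSchema = { id: 0, label: 1 };
const linkSchema = { id: 0, source: 1, target: 2 };
console.log(checkTables(nodes, links, nodeSchema, linkSchema)); // []
```

Running such a check first makes broken references visible as readable messages instead of surfacing later in a visualization.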
Importing data in this way helps keep the memory load small and makes it possible to load bigger networks.
Using the basic import is recommended only if no proper importer is available. The Vistorian provides functions for some common data formats.
networkcube.loadLinkTable(url, callBack, linkSchema, delimiter, timeFormat?)
This function reads a table in link format (each row is a link) with table columns indicating link attributes such as source, target, weight, time, etc. A link's weight at different time points must be given in separate rows. Below is an example of a CSV-formatted file; the file extension can be anything.
source, target, weight, time, type
Ana, Bob, 4, 2010, letters
Ana, Bob, 10, 2010, visits
Cyril, Ana, 10, 2009, visits
The caller must supply a linkSchema, which is a simple JSON object specifying which column of the input table maps to which attribute. For the example above, the linkSchema would look as follows:
{
source: 0,
target: 1,
weight: 2,
linkType: 4,
time: 3
}
Note that not every attribute or column must be assigned. Non-assigned columns are ignored and not imported into networkcube. If an attribute is not assigned a column, the visualizations ignore it. E.g., if no linkType is set, no link types are shown. If no time is specified, the network is shown as a static, non-temporal network.
The delimiter parameter defines the character used to separate fields within a row of the CSV file, e.g. a comma (,), a tab (\t), or a semicolon (;).
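To illustrate how a link table, a linkSchema, and a delimiter fit together, here is a sketch that turns the example table above into normalized node and link tables. It is not the code in networkcube: normalizeLinkTable is a hypothetical function, and it assumes a header row, that all five attributes are assigned in the schema, and that every input row becomes its own link row.

```javascript
// Sketch: turn delimited link rows into normalized node and link tables.
function normalizeLinkTable(text, linkSchema, delimiter) {
  const rows = text
    .trim()
    .split('\n')
    .slice(1) // assumption: the first line is a header row
    .map((line) => line.split(delimiter).map((field) => field.trim()));

  // Node name -> numeric id, assigned in order of first appearance.
  const nodeIds = new Map();
  const idOf = (name) => {
    if (!nodeIds.has(name)) nodeIds.set(name, nodeIds.size);
    return nodeIds.get(name);
  };

  // Output link columns: id, source, target, weight, time, linkType.
  const linkTable = rows.map((row, i) => [
    i,
    idOf(row[linkSchema.source]),
    idOf(row[linkSchema.target]),
    Number(row[linkSchema.weight]),
    row[linkSchema.time],
    row[linkSchema.linkType],
  ]);
  const nodeTable = [...nodeIds].map(([name, id]) => [id, name]);
  return { nodeTable, linkTable };
}

const csv = `source, target, weight, time, type
Ana, Bob, 4, 2010, letters
Ana, Bob, 10, 2010, visits
Cyril, Ana, 10, 2009, visits`;

const schema = { source: 0, target: 1, weight: 2, linkType: 4, time: 3 };
const { nodeTable, linkTable } = normalizeLinkTable(csv, schema, ',');
console.log(nodeTable); // [[0, 'Ana'], [1, 'Bob'], [2, 'Cyril']]
console.log(linkTable[2]); // [2, 2, 0, 10, '2009', 'visits']
```

The resulting tables match the schemas { id: 0, label: 1 } for nodes and { id: 0, source: 1, target: 2, weight: 3, time: 4, linkType: 5 } for links.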
If any field in the table contains a time stamp, timeFormat specifies how it is formatted. See http://momentjs.com/docs/#/parsing/ for information on how to specify your time format. If no timeFormat is given but the linkSchema contains a time field, networkcube assumes that the time is given in Unix milliseconds (since 01/01/1970).
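The Unix-millisecond default has a practical consequence for tables that store bare years. The sketch below uses only the built-in Date object to show the effect; parseUnixMillis is a hypothetical helper, and the way networkcube parses times internally may differ.

```javascript
// With no timeFormat, a time value is read as milliseconds since
// 1970-01-01T00:00:00Z, as described in the text.
function parseUnixMillis(value) {
  return new Date(Number(value));
}

console.log(parseUnixMillis('1262304000000').toISOString()); // 2010-01-01T00:00:00.000Z

// A bare year such as 2010 would be read as 2010 ms after the epoch.
console.log(parseUnixMillis('2010').toISOString()); // 1970-01-01T00:00:02.010Z
```

A table like the example above, which stores years such as 2010, therefore likely needs an explicit timeFormat; in Moment.js, the token for a four-digit year is 'YYYY'.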
The TypeScript file implementing the loaders is core/importers.ts. Add new loaders there.