We’re really excited and looking forward to our annual online developer conference NODES 2022 on Nov 16/17.
The event runs around the globe, with talks in all timezones.
You can save your spot now by registering early for some of the cool goodies coming your way.
We ran our Call for Papers in August and got 150 really great submissions by 130 speakers.
We selected 90 for the event, which was really not easy. Some of the sessions that didn’t make it into the conference will be run in later live streams.
Today, let’s look at the NODES data model and import the data for sessions and speakers into Neo4j from CSV.
If you’d rather watch the video of the live stream, here you go:
Next time we’ll import the data from a REST API, look at the schedule and session recommendations.
Let’s start by creating and connecting to our AuraDB Free instance.
If you want to try out the Workspace Beta too and provide feedback, please go to https://neo4j.com/product/workspace and sign up.
Sessionize offers both a JSON REST API to fetch the data as well as CSV/XLS export.
We saved the CSV for sessions and speakers locally and can use it with Data Importer in Workspace.
The fields in our Session CSV are:

- Session Id
- Title
- Description
- Speakers
- Session format
- Level
- Pre-requisites for attendees?
- Topic of your presentation
- Neo4j Use-Case
- Your Timezone
- Status
- Date Submitted
- Speaker Ids
The fields for our Speakers:

- Speaker Id
- FirstName
- LastName
- TagLine
- Bio
- timezone
- City
- Country
- Ninja
- LinkedIn
- Blog
- Twitter
- Company Website
- Profile Picture
TL;DR: if you want to shortcut the modeling, you can grab the nodes-sessions-data-importer-2022-09-26.zip file from the GitHub repository and load it via the "Open model (with data)" entry in the three-dots (…) menu.
If we open Workspace on our blank database (or Data Importer directly) we can add our CSV files on the left side.
Then we start mapping our data by:

- adding the session node
- setting its label to Session
- selecting the sessions CSV in the mapping view
- adding all relevant fields to the mapping from the file
- the Session Id is automatically selected as the id field
Then we do the same for the Speaker node, just with the speaker’s CSV.
To connect both, we drag out a relationship from speaker to session from the speaker node’s halo. Give it the name PRESENTS and use the fields Session Id and Speaker Ids for the mapping.
Unfortunately the comma-separated Speaker Ids are not handled by Data Importer yet, so this will only connect sessions that have a single speaker.
But fear not, we will connect the rest with a bit of post-processing.
In Preview we see how our data will look in the graph, both the properties and the relationships.
If we’re satisfied, we can click "Run Import" and it will take roughly a second to import the data.
If you click on the Show query links in the import report, you’ll see the constraints and import queries that Data Importer is running.
There you also see that for the relationships it tries to match the existing session and speaker nodes based on Session Id and Speaker Ids to connect them, which only works if there’s a single speaker for a session.
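To give a rough idea of the pattern (a simplified sketch, not the exact statements Data Importer generates), the constraint and the relationship import look roughly like this:

// unique constraint on the id field used for matching (sketch)
CREATE CONSTRAINT session_id IF NOT EXISTS
FOR (s:Session) REQUIRE s.`Session Id` IS UNIQUE;

// relationship import pattern: match both ends by id, then merge the relationship
// ($rows stands for the batch of CSV rows passed in as a parameter)
UNWIND $rows AS row
MATCH (sp:Speaker {`Speaker Id`: row.`Speaker Ids`})
MATCH (s:Session {`Session Id`: row.`Session Id`})
MERGE (sp)-[:PRESENTS]->(s);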
After running the importer we can open the "Query" tab by hitting "Run Queries".
Or just click on it on top.
A single pre-populated query shows us our graph, which should look very similar to the preview.
MATCH p=()-[:PRESENTS]->() RETURN p LIMIT 25;
We can also look for Speakers that have more than one session, like Anton, but not yet for sessions with more than one speaker.
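For example, a query along these lines (using the FirstName and LastName properties from our CSV) lists speakers with more than one session:

MATCH (sp:Speaker)-[:PRESENTS]->(:Session)
WITH sp, count(*) AS sessions
WHERE sessions > 1
RETURN sp.FirstName AS firstName, sp.LastName AS lastName, sessions
ORDER BY sessions DESC;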
Let’s first find sessions that have no speakers yet.
match (s:Session) where not exists { (s)<-[:PRESENTS]-() }
return s.`Session Id` as session, s.`Speaker Ids` as speakers
These are the ones we want to fix in our post-processing.
The approach we take for all these operations is the same.
Find the sessions to update and split a field by comma+space `, ` into a list of values.
Then turn (UNWIND) this list of values into rows, MATCH or MERGE (get-or-create) nodes for the values, and connect the session to them via a relationship.
MATCH (s:Session) WHERE NOT EXISTS { (s)<-[:PRESENTS]-() }
RETURN s.`Session Id` as session, split(s.`Speaker Ids`,', ') as speakers

MATCH (s:Session) WHERE NOT EXISTS { (s)<-[:PRESENTS]-() }
WITH s, split(s.`Speaker Ids`,', ') as speakers
UNWIND speakers as speakerId
RETURN s.`Session Id` as session, speakerId

MATCH (s:Session) WHERE NOT EXISTS { (s)<-[:PRESENTS]-() }
WITH s, split(s.`Speaker Ids`,', ') as speakers
UNWIND speakers as speakerId
MATCH (sp:Speaker {`Speaker Id`:speakerId})
MERGE (sp)-[r:PRESENTS]->(s)
RETURN *
As Sessionize exported all speakers, not just the ones with accepted sessions, we now have to remove our orphans.
MATCH (sp:Speaker)
WHERE NOT EXISTS { (sp)-[:PRESENTS]->() }
DELETE sp
We can now do the same for the other comma-separated fields:

- Level
- Topic
- Neo4j Use-Case

Here we generally call the list names, the value name, and the node n, to keep the editing needed to a minimum.
MATCH (s:Session)
WITH s, split(s.Level,', ') as names
UNWIND names as name
MERGE (n:Level {name:name})
MERGE (s)-[r:OF_LEVEL]->(n)
RETURN *

MATCH (s:Session)
WITH s, split(s.`Neo4j Use-Case`,', ') as names
UNWIND names as name
MERGE (n:UseCase {name:name})
MERGE (s)-[r:USECASE]->(n)
RETURN *

MATCH (s:Session)
WITH s, split(s.`Topic of your presentation`,', ') as names
UNWIND names as name
MERGE (n:Topic {name:name})
MERGE (s)-[r:HAS_TOPIC]->(n)
RETURN *
Now we have a beautiful graph with different nodes for:

- Session
- Speaker
- Level
- Topic
- UseCase

and their relationships.
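As a quick sanity check (a generic count query, assuming each node carries a single label, which is the case here), we can count the nodes per label:

MATCH (n)
RETURN labels(n)[0] AS label, count(*) AS count
ORDER BY count DESC;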
We can open "Explore" to visualize our data a bit.
First we set the correct captions for Session (Title) and Speaker (FirstName and LastName) as well as some icons.
Then we can pick Show me a Graph from the dropdown to quickly show a graph and explore a bit.
Then we can use our extracted topics and see which Sessions share topics by entering: Session<tab>Topic<tab>Session<tab>
into the search bar and hitting return.
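In Cypher, the same pattern would look roughly like this (an illustrative equivalent, not what Explore runs under the hood):

MATCH p=(s1:Session)-[:HAS_TOPIC]->(:Topic)<-[:HAS_TOPIC]-(s2:Session)
WHERE s1 <> s2
RETURN p LIMIT 50;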
We’re using NeoDash, a Neo4j Labs project for creating quick dashboards.
You can open it via https://tools.neo4jlabs.com, add your connection URL (from the Aura console or your credentials download) to the form, and click the Open button for NeoDash.
You still need to provide the password; for security reasons it shouldn’t be passed through the URL.
In the video we go through the charts and queries for the dashboard; here we’ll only list them quickly.
Item | Chart Type | Query |
---|---|---|
Sessions | Number | |
Description | Markdown | |
Titles | Table | |
Topics | Bar Chart | |
Levels | Pie Chart | |
Graph | Graph | |
Speaker Locations | Map | |
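The actual queries are shown in the video; as a rough idea (illustrative sketches, not the exact dashboard queries), the Sessions count and the Topics bar chart could be backed by queries like these:

// total number of sessions (Number chart)
MATCH (s:Session) RETURN count(s) AS sessions;

// sessions per topic (Bar Chart)
MATCH (t:Topic)<-[:HAS_TOPIC]-(s:Session)
RETURN t.name AS topic, count(s) AS sessions
ORDER BY sessions DESC;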
For the speaker locations we first have to compute the geolocation from their city and country.
Let’s check it for a single speaker; we need to see if the city is actually empty or has some characters.
Then we can call apoc.spatial.geocodeOnce to geocode the city and country and look at the results.
match (sp:Speaker)
where size(sp.City) > 1
with sp limit 1
call apoc.spatial.geocodeOnce(sp.City+" "+sp.Country) yield location, data, latitude, longitude
return *
We can use the latitude and longitude from the result directly to create spatial point location properties on our speakers.
match (sp:Speaker)
where size(sp.City) > 1
call apoc.spatial.geocodeOnce(sp.City+" "+sp.Country) yield location, data, latitude, longitude
set sp.location = point({latitude:latitude, longitude:longitude})
We can then put these on a map chart in the dashboard by just selecting the speakers that have a location property.
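The selection query can be as simple as this (a sketch; NeoDash’s map chart picks up nodes that carry a spatial point property):

MATCH (sp:Speaker)
WHERE sp.location IS NOT NULL
RETURN sp;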
We hope this was a fun and useful session and that you got excited for NODES 2022.
Stay tuned for the next time when we look at importing REST APIs, Schedule Modeling and computing recommendations.