Releases: neo4j/graph-data-science
Graph Data Science 2.2.7
GDS 2.2.7 is compatible with Neo4j 5 & 4.4 versions (≥ 4.4.9) & 4.3 versions (≥ 4.3.15) Database.
For GDS compatibility with previous releases, please use GDS Compatibility Table.
New features
- Added compatibility for Neo4j database 5.4.0.
Bug fixes
- Missing id fields in the Arrow records for the `CREATE_DATABASE` action would throw a `NullPointerException`. It now throws a more descriptive exception instead.
- Graphs with long node or relationship property names would fail during the restore process.
- Yen's algorithm would ignore edges in multigraphs and yield incorrect results.
- Multi-threading bug when creating projections via Cypher Aggregation or Arrow could lead to lost labels.
- Node label filtering could lead to streamed node properties being null when filters are applied.
- Cypher projections and Cypher aggregation would throw the wrong error message when loading an invalid relationship.
- Node label filtering could lead to wrong results. This also affected `gds.beta.graphSage` and `gds.beta.graph.relationships.stream`.
Graph Data Science 2.3.0-Alpha04
GDS 2.3.0-alpha04 is compatible with Neo4j 5 & 4.4 versions (≥ 4.4.9) & 4.3 versions (≥ 4.3.15) Database.
For GDS compatibility with previous releases, please use GDS Compatibility Table.
Breaking changes
- Leiden was promoted to the beta tier. It is now called via the `gds.beta.leiden` command instead of the `gds.alpha.leiden` command.
- K-means was promoted to the beta tier. It is now called via the `gds.beta.kmeans` command instead of the `gds.alpha.kmeans` command.
- Minimum weighted spanning tree algorithm was promoted to the beta tier. It is now called via the `gds.beta.spanningTree` command instead of `gds.alpha.spanningTree`.
  - The procedures `gds.alpha.spanningTree.minimum` and `gds.alpha.spanningTree.maximum` have been removed. You can get the same behaviour by specifying the new parameter `objective` in `gds.beta.spanningTree`.
  - The `weightWriteProperty` has been removed as a configuration parameter. To supply the Relationship Type and Property for the produced relationship, use:
    - `mutateRelationshipType`
    - `mutateProperty`
  - The procedures `gds.alpha.spanningTree.kmin` and `gds.alpha.spanningTree.kmax` have been removed as the K-Spanning Tree algorithm has been moved into its own space, `gds.alpha.kSpanningTree`.
  - The parameter `startNodeId` in all Spanning Tree algorithms has been replaced with `sourceNode`.
- Arrow: when projecting graphs, `null` will be translated to `NaN` for floating point values. This enables users of either the GDS Python Client or PyArrow to load `NaN` properties stored in Pandas DataFrames.
- Cypher Aggregations will become the primary surface for creating projections with Cypher. They offer a more intuitive and expressive interface than Cypher Projections and can also be used in Fabric or Composite Database setups.
- The algorithm `gds.alpha.influenceMaximization.greedy` has been removed. Its replacement is the already existing `gds.beta.influenceMaximization.celf` algorithm, which has the same configuration parameters and offers better performance.
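The `null` to `NaN` translation for floating point values can be pictured with a small Python sketch. It only illustrates the idea and is not the GDS or Arrow implementation; the function name `nulls_to_nan` is hypothetical.

```python
import math
from typing import List, Optional, Sequence


def nulls_to_nan(values: Sequence[Optional[float]]) -> List[float]:
    """Replace missing entries (None) in a floating point column with NaN.

    Hypothetical helper for illustration only; it mirrors the behaviour
    described in the release note, not the actual GDS code path.
    """
    return [math.nan if value is None else float(value) for value in values]


# A property column with one missing value.
column = nulls_to_nan([1.5, None, 3])
```

Because `NaN` never compares equal to itself, check the translated entries with `math.isnan` rather than `==`.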
New features
Minimum Directed Steiner Tree
- Added heuristic for minimum directed Steiner Tree under the `gds.beta.steinerTree` domain.
  - Added `stats` mode with `gds.beta.steinerTree.stats`
  - Added `stream` mode with `gds.beta.steinerTree.stream`
  - Added `mutate` mode with `gds.beta.steinerTree.mutate`
  - Added `write` mode with `gds.beta.steinerTree.write`
  - Now available in progress tracking: `gds.list.progress()`
Leiden
- New parameter `consecutiveIds` that assigns consecutive ids for the discovered communities.
- New parameter `seedProperty` to seed initial communities for nodes.
- New parameter `tolerance` to enable convergence criteria based on difference in modularity from one iteration to another.
- Now available in progress tracking: `gds.list.progress()`
- Added memory estimation mode:
  - `gds.beta.leiden.mutate.estimate`
  - `gds.beta.leiden.stats.estimate`
  - `gds.beta.leiden.stream.estimate`
  - `gds.beta.leiden.write.estimate`
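As a rough illustration of what `consecutiveIds` and `tolerance` mean, the Python sketch below relabels community ids to a gap-free range and checks a modularity-based stopping rule. Both functions are hypothetical and written only for this example; the ordering of the new ids and the exact comparison used by GDS are assumptions.

```python
from typing import Dict, Hashable


def relabel_consecutive(communities: Dict[Hashable, int]) -> Dict[Hashable, int]:
    """Map arbitrary community ids to 0, 1, 2, ... in order of first appearance.

    Assumption: the ordering is chosen for the example; GDS may assign
    consecutive ids in a different order.
    """
    mapping: Dict[int, int] = {}
    relabelled: Dict[Hashable, int] = {}
    for node, community in communities.items():
        if community not in mapping:
            mapping[community] = len(mapping)
        relabelled[node] = mapping[community]
    return relabelled


def has_converged(previous_modularity: float, current_modularity: float,
                  tolerance: float) -> bool:
    """Stop once modularity changes by less than `tolerance` between iterations.

    Assumption: a strict absolute-difference comparison, used here only to
    illustrate the idea of a tolerance-based convergence criterion.
    """
    return abs(current_modularity - previous_modularity) < tolerance


compact = relabel_consecutive({"a": 42, "b": 7, "c": 42})
```

In this sketch the sparse ids `42` and `7` become `0` and `1`, so downstream code can index arrays by community id.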
Logistic Regression & MLP
- New configuration parameters `classWeights` and `focusWeight` for training methods, supported by procedures:
  - `gds.beta.pipeline.nodeClassification.addLogisticRegression`
  - `gds.beta.pipeline.nodeClassification.addMLP`
  - `gds.beta.pipeline.linkPrediction.addLogisticRegression`
  - `gds.beta.pipeline.linkPrediction.addMLP`
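The release notes do not define these two parameters. The Python sketch below assumes that `classWeights` scales the loss per class and that `focusWeight` acts as the exponent of a focal loss factor, so that a value of 0 reduces to weighted cross entropy. The function `weighted_focal_loss` is hypothetical and not taken from GDS.

```python
import math
from typing import Sequence


def weighted_focal_loss(probabilities: Sequence[float], true_class: int,
                        class_weights: Sequence[float],
                        focus_weight: float = 0.0) -> float:
    """Loss for one example: -w_c * (1 - p_c) ** focus_weight * log(p_c).

    Assumption: this is the textbook class-weighted focal loss, shown only
    to illustrate how the two parameters could interact.
    """
    p = probabilities[true_class]
    return -class_weights[true_class] * (1.0 - p) ** focus_weight * math.log(p)


# With focus_weight = 0 the focal factor is 1 and plain cross entropy remains.
plain = weighted_focal_loss([0.25, 0.75], 1, [1.0, 1.0], focus_weight=0.0)
# A positive focus_weight shrinks the loss of this well-classified example.
focused = weighted_focal_loss([0.25, 0.75], 1, [1.0, 1.0], focus_weight=2.0)
```

Under these assumptions, raising `focus_weight` shifts training effort toward hard examples, while larger class weights increase the cost of mistakes on the corresponding class.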
HashGNN
- New algorithm `gds.alpha.hashgnn.{mutate,stream}` to create HashGNN node embeddings
- New procedures `gds.alpha.hashgnn.{mutate,stream}.estimate` to estimate the memory required to run HashGNN
Link Prediction
- Added new optional configuration parameter `negativeRelationshipType` to `gds.beta.pipeline.linkPrediction.configureSplit`
Spanning Tree
- New modes supported: `gds.beta.spanningTree.(stats, stream, mutate)`
- New yield output for `gds.beta.spanningTree` that outputs the sum of weights in the discovered spanning tree.
- New yield output for `gds.beta.spanningTree` that outputs the number of relationships written or added for write and mutate mode respectively.
- Added memory estimation mode:
  - `gds.beta.spanningTree.stream.estimate`
  - `gds.beta.spanningTree.mutate.estimate`
  - `gds.beta.spanningTree.stats.estimate`
  - `gds.beta.spanningTree.write.estimate`
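To make the `objective` and `sourceNode` parameters and the new "sum of weights" output more concrete, here is a Prim-style sketch in Python. It is an illustration under stated assumptions, not the GDS implementation: the function `spanning_tree`, the adjacency-list format and the return shape are all invented for this example.

```python
import heapq
from typing import Dict, List, Tuple

# Undirected graph as adjacency lists: node -> [(neighbour, weight), ...]
Graph = Dict[int, List[Tuple[int, float]]]


def spanning_tree(graph: Graph, source_node: int,
                  objective: str = "minimum") -> Tuple[List[Tuple[int, int, float]], float]:
    """Grow a spanning tree from `source_node` and return (edges, total weight)."""
    if objective not in ("minimum", "maximum"):
        raise ValueError("objective must be 'minimum' or 'maximum'")
    # Negating the weights turns the min-heap into a max-heap for 'maximum'.
    sign = 1.0 if objective == "minimum" else -1.0
    visited = {source_node}
    heap = [(sign * w, source_node, t) for t, w in graph.get(source_node, [])]
    heapq.heapify(heap)
    tree: List[Tuple[int, int, float]] = []
    total_weight = 0.0
    while heap:
        key, u, v = heapq.heappop(heap)
        if v in visited:
            continue  # a better edge to v was already taken (covers parallel edges)
        visited.add(v)
        weight = sign * key
        tree.append((u, v, weight))
        total_weight += weight
        for t, w in graph.get(v, []):
            if t not in visited:
                heapq.heappush(heap, (sign * w, v, t))
    return tree, total_weight


triangle: Graph = {
    0: [(1, 1.0), (2, 4.0)],
    1: [(0, 1.0), (2, 2.0)],
    2: [(0, 4.0), (1, 2.0)],
}
min_tree, min_total = spanning_tree(triangle, source_node=0, objective="minimum")
max_tree, max_total = spanning_tree(triangle, source_node=0, objective="maximum")
```

A single function with an `objective` switch covers both cases that the removed `minimum` and `maximum` procedures used to handle separately.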
Write Labels
- Added `gds.alpha.graph.nodeLabel.write` to allow for Node Labels to be written back from projections to a Neo4j Database
Graph Projections
- Arrow now supports specifying undirected relationship types using the `undirected_relationship_types` configuration argument
- Cypher Aggregations (`gds.alpha.graph.project`) now support specifying undirected relationship types using the `undirectedRelationshipTypes` configuration option
- New procedure to turn directed relationships into undirected relationships: `gds.beta.graph.relationships.toUndirect`
Administration
- Added the `jobId` and `username` to the `ongoingGdsProcedures` return field of `gds.alpha.systemMonitor`.
- Added username as a new return field to `gds.beta.listProgress`.
- Added a new return field to `gds.graph.list` called `schemaWithOrientation` which also includes the orientation.
- Administrators can now see all running tasks from all users with `gds.beta.listProgress`
Bug fixes
- Minimum Weighted Spanning Tree: Graphs with parallel edges could make the discovered tree have wrong weights on relationships
- Cypher Aggregations: When using `gds.alpha.graph.project`:
  - The projected graph would list relationship types with zero relationships
  - `ArrayIndexOutOfBoundsException`s could surface due to sizing errors
- Arrow: the `CREATE_DATABASE` action would throw a `NullPointerException` if id fields were missing in the Arrow record. A more descriptive exception is now provided
Improvements
Arrow
- graph import now fully supports external node ids in the 64 Bit space.
- graph import now supports 16, 32 or 64 Bit node identifiers.
Leiden
- Better parallelization and improved overall performance
Other Improvements
- Speed improvements for Dijkstra, Astar, Yens, CELF, weighted Betweenness Centrality, and the Spanning Tree algorithms. These improvements come with a slight increase in the memory consumption of these algorithms.
- Improved error message for invalid node labels and relationship types
Other changes
- Histograms returned such as `degreeDistribution` in `gds.graph.list` can have slightly different values for specific percentiles due to changes in floating point operations.
- Progress tracking in the Spanning Tree algorithm has been reworked. Progress reporting may differ from earlier versions.
- The yielded field `schema` is now marked as deprecated in `gds.graph.list` and `gds.graph.drop`. In the next major release, the `schema` field will use the semantics of `schemaWithOrientation`
- In `gds.alpha.model.store`, the positional argument `failIfUnsupportedType` is renamed to `failIfUnsupported`. Both will be supported until it is promoted to the beta tier.
- Progress tracking for Betweenness Centrality has been reworked. Progress reporting may differ from earlier versions.
Graph Data Science 2.2.6
Neo4j Graph Data Science 2.2.6 is compatible with Neo4j Database 5 versions & 4.4 versions ≥ 4.4.9 & 4.3 versions ≥ 4.3.15.
For Neo4j Graph Data Science compatibility, please use the Neo4j Compatibility Matrix.
Improvements
- Added support for Neo4j Database 5.3
Graph Data Science 2.3.0-alpha03
GDS 2.3.0-alpha03 is compatible with Neo4j 5 versions & 4.4 versions ≥ 4.4.9 & 4.3 versions ≥ 4.3.15.
For GDS compatibility with previous releases, please use GDS Compatibility Table.
Breaking changes
- Leiden was promoted to the beta tier. It is now called via the `gds.beta.leiden` command instead of the `gds.alpha.leiden` command.
- K-means was promoted to the beta tier. It is now called via the `gds.beta.kmeans` command instead of the `gds.alpha.kmeans` command.
- The parameter `startNodeId` in Spanning Tree algorithms has been replaced with `sourceNode`.
- The minimum weighted spanning tree algorithm was moved to beta. It is now called via the `gds.beta.spanningTree` command instead of `gds.alpha.spanningTree`.
  - The procedures `gds.alpha.spanningTree.minimum` and `gds.alpha.spanningTree.maximum` have been removed. You can get the same behavior by specifying the new parameter `objective` in `gds.beta.spanningTree`.
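The renames above lend themselves to a mechanical rewrite of stored queries. The following Python sketch shows one possible approach; the helper `migrate_query` is hypothetical, the example query strings and their configuration maps are placeholders, and any rewritten query should still be reviewed by hand.

```python
import re

# Procedures promoted from alpha to beta in this release (from the list above).
# Only rewrite when an execution mode follows, so removed sub-procedures
# such as `.minimum` are never silently renamed.
_PROMOTED = re.compile(
    r"\bgds\.alpha\.(leiden|kmeans|spanningTree)(?=\.(?:stats|stream|mutate|write)\b)"
)
# Procedures that were removed rather than renamed.
_REMOVED = re.compile(r"\bgds\.alpha\.spanningTree\.(?:minimum|maximum)\b")
_OLD_PARAMETER = re.compile(r"\bstartNodeId\b")


def migrate_query(query: str) -> str:
    """Rewrite a Cypher query string to the procedure names of this release."""
    removed = _REMOVED.search(query)
    if removed:
        raise ValueError(
            f"{removed.group(0)} was removed; use the `objective` parameter "
            "of gds.beta.spanningTree instead"
        )
    migrated = _PROMOTED.sub(r"gds.beta.\1", query)
    # Assumption: only Spanning Tree calls carry `startNodeId`, so the
    # parameter is renamed only in queries that mention spanningTree.
    if "spanningTree" in migrated:
        migrated = _OLD_PARAMETER.sub("sourceNode", migrated)
    return migrated


new_query = migrate_query("CALL gds.alpha.spanningTree.write('g', {startNodeId: 0})")
```

Raising an error for the removed procedures, instead of guessing an `objective` value, keeps the rewrite from changing the meaning of a query without the author noticing.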
New features
Minimum Directed Steiner Tree
- Added heuristic for minimum directed Steiner Tree under the `gds.alpha.steinerTree` domain.
  - Added `stats` mode with `gds.alpha.steinerTree.stats`
  - Added `stream` mode with `gds.alpha.steinerTree.stream`
  - Added `mutate` mode with `gds.alpha.steinerTree.mutate`
  - Added `write` mode with `gds.alpha.steinerTree.write`
Leiden
- New parameter `consecutiveIds` that assigns consecutive ids for the discovered communities.
- New parameter `seedProperty` to seed initial communities for nodes.
- New parameter `tolerance` to enable convergence criteria based on difference in modularity from one iteration to another.
- Now available in progress tracking: `gds.list.progress()`
- Added memory estimation mode:
  - `gds.beta.leiden.mutate.estimate`
  - `gds.beta.leiden.stats.estimate`
  - `gds.beta.leiden.stream.estimate`
  - `gds.beta.leiden.write.estimate`
Logistic Regression & MLP
- New configuration parameters `classWeights` and `focusWeight` for training methods, supported by procedures:
  - `gds.beta.pipeline.nodeClassification.addLogisticRegression`
  - `gds.beta.pipeline.nodeClassification.addMLP`
  - `gds.beta.pipeline.linkPrediction.addLogisticRegression`
  - `gds.beta.pipeline.linkPrediction.addMLP`
HashGNN
- New algorithm `gds.alpha.hashgnn.{mutate,stream}` to create HashGNN node embeddings
- New procedures `gds.alpha.hashgnn.{mutate,stream}.estimate` to estimate the memory required to run HashGNN
Link Prediction
- Added new optional configuration parameter `negativeRelationshipType` to `gds.beta.pipeline.linkPrediction.configureSplit`
Spanning Tree
- New modes supported: `gds.alpha.spanningTree.(stats, stream, mutate)`
- New yield output for `gds.alpha.spanningTree` that outputs the sum of weights in the discovered spanning tree.
- New yield output for `gds.alpha.spanningTree` that outputs the number of relationships written or added for write and mutate mode respectively.
- Added memory estimation mode:
  - `gds.alpha.spanningTree.stream.estimate`
  - `gds.alpha.spanningTree.mutate.estimate`
  - `gds.alpha.spanningTree.stats.estimate`
  - `gds.alpha.spanningTree.write.estimate`
Write Labels
- Added `gds.alpha.graph.nodeLabel.write` to allow for Node Labels to be written back from projections to a Neo4j Database
Administration
- Added the `jobId` and `username` to the `ongoingGdsProcedures` return field of `gds.alpha.systemMonitor`.
- Added username as a new return field to `gds.beta.listProgress`.
- Added a new return field to `gds.graph.list` called `schemaWithOrientation` which also includes the orientation.
Bug fixes
- Fixed a bug in Minimum Weighted Spanning Tree on graphs with parallel edges where the discovered tree could have wrong weights.
Improvements
Arrow
- graph import now fully supports external node ids in the 64 Bit space.
- graph import now supports 16, 32 or 64 Bit node identifiers.
Leiden
- Better parallelization and improved overall performance
Other Algorithms
- Speed improvements for Dijkstra, Astar, Yens, CELF, weighted Betweenness Centrality, and the Spanning Tree algorithms. These improvements come with a slight increase in the memory consumption of these algorithms.
Other changes
- Histograms returned such as `degreeDistribution` in `gds.graph.list` can have slightly different values for specific percentiles due to changes in floating point operations.
- Progress tracking in the Spanning Tree algorithm has been reworked. Progress reporting may differ from earlier versions.
- The yielded field `schema` is now marked as deprecated in `gds.graph.list` and `gds.graph.drop`. In the next major release, the `schema` field will use the semantics of `schemaWithOrientation`
Graph Data Science 2.2.5
Neo4j Graph Data Science 2.2.5 is compatible with Neo4j Database 5 versions & 4.4 versions ≥ 4.4.9 & 4.3 versions ≥ 4.3.15.
For Neo4j Graph Data Science compatibility, please use the Neo4j Compatibility Matrix.
Bug Fixes
- Some functions would not work as expected with Neo4j 5.x versions:
  - `gds.alpha.linkprediction.adamicAdar`
  - `gds.alpha.linkprediction.commonNeighbors`
  - `gds.alpha.linkprediction.resourceAllocation`
  - `gds.alpha.linkprediction.totalNeighbors`
Graph Data Science 2.3.0-Alpha02
GDS 2.3.0-alpha02 is compatible with Neo4j 5 versions & 4.4 versions ≥ 4.4.9 & 4.3 versions ≥ 4.3.15.
For GDS compatibility with previous releases, please use GDS Compatibility Table.
Breaking changes
- Leiden was promoted to the beta tier. It is now called via the `gds.beta.leiden` command instead of the `gds.alpha.leiden` command.
- K-means was promoted to the beta tier. It is now called via the `gds.beta.kmeans` command instead of the `gds.alpha.kmeans` command.
- The parameter `startNodeId` in Spanning Tree algorithms has been replaced with `sourceNode`.
- The procedures `gds.alpha.spanningTree.minimum` and `gds.alpha.spanningTree.maximum` have been removed. You can get the same behavior by specifying the new parameter `objective` in `gds.alpha.spanningTree`.
New features
Leiden
- New parameter `consecutiveIds` that assigns consecutive ids for the discovered communities.
- New parameter `seedProperty` to seed initial communities for nodes.
- New parameter `tolerance` to enable convergence criteria based on difference in modularity from one iteration to another.
- Now available in progress tracking: `gds.list.progress()`
- Added memory estimation mode:
  - `gds.beta.leiden.mutate.estimate`
  - `gds.beta.leiden.stats.estimate`
  - `gds.beta.leiden.stream.estimate`
  - `gds.beta.leiden.write.estimate`
Logistic Regression & MLP
- New configuration parameters `classWeights` and `focusWeight` for training methods, supported by procedures:
  - `gds.beta.pipeline.nodeClassification.addLogisticRegression`
  - `gds.beta.pipeline.nodeClassification.addMLP`
  - `gds.beta.pipeline.linkPrediction.addLogisticRegression`
  - `gds.beta.pipeline.linkPrediction.addMLP`
HashGNN
- New algorithm `gds.alpha.hashgnn.{mutate,stream}` to create HashGNN node embeddings
- New procedures `gds.alpha.hashgnn.{mutate,stream}.estimate` to estimate the memory required to run HashGNN
Spanning Tree
- New modes supported: `gds.alpha.spanningTree.(stats, stream, mutate)`
- New yield output for `gds.alpha.spanningTree` that outputs the sum of weights in the discovered spanning tree.
- New yield output for `gds.alpha.spanningTree` that outputs the number of relationships written or added for write and mutate mode respectively.
- Added memory estimation mode:
  - `gds.alpha.spanningTree.stream.estimate`
  - `gds.alpha.spanningTree.mutate.estimate`
  - `gds.alpha.spanningTree.stats.estimate`
  - `gds.alpha.spanningTree.write.estimate`
Bug fixes
- Fixed a bug in Minimum Weighted Spanning Tree on graphs with parallel edges where the discovered tree could have wrong weights.
Improvements
Arrow
- graph import now fully supports external node ids in the 64 Bit space.
- graph import now supports 16, 32 or 64 Bit node identifiers.
Leiden
- Better parallelization and improved overall performance
Other changes
- Histograms returned such as `degreeDistribution` in `gds.graph.list` can have slightly different values for specific percentiles due to changes in floating point operations.
- Progress tracking in the Spanning Tree algorithm has been reworked. Progress reporting may differ from earlier versions.
Graph Data Science 2.2.4
GDS 2.2.4 is compatible with Neo4j 5 versions & 4.4 versions ≥ 4.4.9 & 4.3 versions ≥ 4.3.15.
For GDS compatibility with previous releases, please use GDS Compatibility Table.
Bug fixes
- `gds.alpha.nodeSimilarity.filtered` would give incorrect node IDs.
- Pregel framework: the computation would not stop after terminating the underlying transaction. This affects `gds.pageRank`, `gds.articleRank`, `gds.eigenvector`.
- `alpha.hits` and `gds.alpha.sllpa` could not be used as a nodeProperty step inside ml pipeline including `gds.beta.pipeline.linkPrediction`, `gds.beta.pipeline.nodeClassification`, and `gds.alpha.pipeline.nodeRegression`.
- nodeProperty steps could not be added to ml pipelines when running against Neo4j 5.x. This affected `gds.beta.pipeline.linkPrediction`, `gds.beta.pipeline.nodeClassification`, and `gds.alpha.pipeline.nodeRegression`.
Improvements
- `gds.graph.list` will only calculate the graph size when the procedure is called without any `YIELD` or if the fields `memoryUsage` or `sizeInBytes` are explicitly `YIELD`-ed. Using `YIELD` to return other fields but not one of `memoryUsage` or `sizeInBytes` can speed up the execution time of `gds.graph.list`.
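The rule above can be modelled in a few lines of Python to check which stored queries still pay for the size calculation. This is only a sketch of the described behaviour: the function `triggers_size_calculation` is hypothetical, the parsing is deliberately naive, and treating `YIELD *` as requesting the size fields is an assumption.

```python
import re

# The two fields named in the release note.
SIZE_FIELDS = {"memoryUsage", "sizeInBytes"}

# Capture everything between YIELD and the next WHERE/RETURN (or end of query).
_YIELD_CLAUSE = re.compile(
    r"\bYIELD\b(.*?)(?:\bWHERE\b|\bRETURN\b|$)", flags=re.IGNORECASE | re.DOTALL
)


def triggers_size_calculation(query: str) -> bool:
    """Return True if a gds.graph.list call would compute the graph size."""
    match = _YIELD_CLAUSE.search(query)
    if match is None:
        return True  # no YIELD at all: every field is returned
    fields = {part.split()[0] for part in match.group(1).split(",") if part.strip()}
    # Assumption: `YIELD *` behaves like yielding every field.
    return "*" in fields or bool(fields & SIZE_FIELDS)


fast = "CALL gds.graph.list() YIELD graphName, nodeCount"
slow = "CALL gds.graph.list() YIELD graphName, sizeInBytes RETURN graphName"
```

In this model `fast` skips the size calculation and `slow` does not, which matches the guidance to yield only the fields you need.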
Graph Data Science 2.2.3
GDS 2.2.3 is compatible with Neo4j 5 versions & 4.4 versions ≥ 4.4.9 & 4.3 versions ≥ 4.3.15.
For GDS compatibility with previous releases, please use GDS Compatibility Table.
Bug fixes
- `gds.graph.export` failed to run on Neo4j 5.X
- `gds.graph.export` failed with InvalidRecordException when `writeConcurrency` is set >1.
- Enterprise users were unable to load models trained with concurrency > 4.
Improvements
- Arrow graph import now fully supports external node ids in the 64 Bit space.
Graph Data Science 2.3.0-alpha01
GDS 2.3.0-alpha01 is compatible with Neo4j 5 versions & 4.4 versions ≥ 4.4.9 & 4.3 versions ≥ 4.3.15.
For GDS compatibility with previous releases, please use GDS Compatibility Table.
New features
- Added a parameter `consecutiveIds` to Leiden to assign consecutive ids for the discovered communities.
- Added a parameter `seedProperty` to Leiden to seed initial communities for nodes.
- Added new configuration parameter `focusWeight` for Logistic Regression training method, supported by procedures:
  - `gds.beta.pipeline.nodeClassification.addLogisticRegression`
  - `gds.beta.pipeline.linkPrediction.addLogisticRegression`
Bug fixes
- Fixed a bug in Minimum Weighted Spanning Tree on graphs with parallel edges where the discovered tree could have wrong weights.
Improvements
- Arrow graph import now fully supports external node ids in the 64 Bit space.
- Arrow graph import now supports 16, 32 or 64 Bit node identifiers.
Other changes
- Histograms returned such as `degreeDistribution` in `gds.graph.list` can have slightly different values for specific percentiles due to changes in floating point operations.
Graph Data Science 2.2.2
GDS 2.2.2 is compatible with Neo4j 5 versions & 4.4 versions ≥ 4.4.9 & 4.3 versions ≥ 4.3.15.
For GDS compatibility with previous releases, please use GDS Compatibility Table.
Improvements
- Graph Data Science ≥2.2.2 now supports Neo4j 5