Releases: neo4j/graph-data-science

Graph Data Science 2.2.7

27 Jan 09:49

GDS 2.2.7 is compatible with Neo4j Database 5, 4.4 (≥ 4.4.9), and 4.3 (≥ 4.3.15).

For GDS compatibility with previous releases, please use GDS Compatibility Table.
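
As a rough illustration, the compatibility statement above can be read as a simple version check. This is only a sketch: the function names are hypothetical and not part of GDS, and the check treats every Neo4j 5.x version as compatible, exactly as the sentence reads. The GDS Compatibility Table remains the authority on which 5.x minor versions a given release supports.

```python
# Minimum patch level per supported 4.x minor line, taken from the statement above.
MINIMUM_PATCH = {(4, 4): 9, (4, 3): 15}


def parse_version(version):
    """Turn 'major.minor.patch' into three ints; missing parts default to 0."""
    parts = [int(p) for p in version.split(".")[:3]]
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def is_compatible_with_gds_2_2_7(neo4j_version):
    """Hypothetical helper: True if the Neo4j version matches the stated ranges."""
    major, minor, patch = parse_version(neo4j_version)
    if major == 5:
        # Assumption: any 5.x counts, as literally stated in the release note.
        return True
    minimum = MINIMUM_PATCH.get((major, minor))
    return minimum is not None and patch >= minimum
```

For example, `is_compatible_with_gds_2_2_7("4.4.8")` is false because 4.4 requires at least patch 9, while `"4.3.15"` and `"5.4.0"` pass.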

New features

  • Added compatibility for Neo4j database 5.4.0.

Bug fixes

  • Missing id fields in the Arrow records for the CREATE_DATABASE action would throw a NullPointerException. It now throws a more descriptive exception instead.
  • Graphs with long node or relationship property names would fail during the restore process.
  • Yen's algorithm would ignore edges in multigraphs and yield incorrect results.
  • A multi-threading bug when creating projections via Cypher Aggregation or Arrow could lead to lost labels.
  • Node label filtering could lead to streamed node properties being null when filters are applied.
  • Cypher projections and Cypher aggregation would throw the wrong error message when loading an invalid relationship.
  • Node label filtering could lead to wrong results. This also affected gds.beta.graphSage and gds.beta.graph.relationships.stream.

Graph Data Science 2.3.0-Alpha04

05 Jan 15:58
Pre-release

GDS 2.3.0-alpha04 is compatible with Neo4j Database 5, 4.4 (≥ 4.4.9), and 4.3 (≥ 4.3.15).

For GDS compatibility with previous releases, please use GDS Compatibility Table.

Breaking changes

  • Leiden was promoted to the beta tier. It is now called via the gds.beta.leiden command instead of the gds.alpha.leiden command.
  • K-means was promoted to the beta tier. It is now called via the gds.beta.kmeans command instead of the gds.alpha.kmeans command.
  • The minimum weighted spanning tree algorithm was promoted to the beta tier. It is now called via the gds.beta.spanningTree command instead of gds.alpha.spanningTree.
    • The procedures gds.alpha.spanningTree.minimum and gds.alpha.spanningTree.maximum have been removed. You can get the same behaviour by specifying the new parameter objective in gds.beta.spanningTree.
    • The weightWriteProperty has been removed as a configuration parameter. To supply the Relationship Type and Property for the produced relationship, use:
      • mutateRelationshipType
      • mutateProperty
    • gds.alpha.spanningTree.kmin and gds.alpha.spanningTree.kmax have been removed, as the K-Spanning Tree algorithm has been moved into its own namespace, gds.alpha.kSpanningTree.
    • The parameter startNodeId in all Spanning Tree algorithms has been replaced with sourceNode.
  • Arrow: when projecting graphs, null will be translated to NaN for floating point values. This enables users of either the GDS Python Client or PyArrow to load NaN properties stored in Pandas DataFrames.
  • Cypher Aggregations will become the primary surface for creating projections with Cypher. They offer a more intuitive and expressive interface than Cypher Projections and can also be used in Fabric or Composite Database setups.
  • The algorithm gds.alpha.influenceMaximization.greedy has been removed. Its replacement is the existing gds.beta.influenceMaximization.celf algorithm, which has the same configuration parameters and offers better performance.
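
The renames listed above could be captured in a small lookup when updating existing scripts. The following is a minimal sketch, not an official migration tool: the helper `migrate_procedure_name` is hypothetical, and it assumes that execution-mode suffixes such as `.stream` or `.write` carry over unchanged to the new name. Procedures whose replacement needs more than a rename (the removed `minimum`/`maximum` and `kmin`/`kmax` variants) are flagged for manual review instead of being rewritten.

```python
# Old procedure prefix -> replacement named in the breaking changes above.
PREFIX_RENAMES = {
    "gds.alpha.leiden": "gds.beta.leiden",
    "gds.alpha.kmeans": "gds.beta.kmeans",
    "gds.alpha.spanningTree": "gds.beta.spanningTree",
    "gds.alpha.influenceMaximization.greedy": "gds.beta.influenceMaximization.celf",
}

# Removed procedures that need a configuration change or a different algorithm.
NEEDS_MANUAL_CHANGE = (
    "gds.alpha.spanningTree.minimum",
    "gds.alpha.spanningTree.maximum",
    "gds.alpha.spanningTree.kmin",
    "gds.alpha.spanningTree.kmax",
)


def _has_prefix(name, prefix):
    """True if name equals prefix or continues it at a '.' boundary."""
    return name == prefix or name.startswith(prefix + ".")


def migrate_procedure_name(name):
    """Hypothetical helper: return the new procedure name, or raise if a rename is not enough."""
    if any(_has_prefix(name, removed) for removed in NEEDS_MANUAL_CHANGE):
        raise ValueError(f"{name} was removed; see the breaking changes for its replacement")
    for old, new in PREFIX_RENAMES.items():
        if _has_prefix(name, old):
            return new + name[len(old):]
    return name  # not affected by the renames listed above
```

Names that are not mentioned in the breaking changes pass through untouched, so the helper can be applied to every procedure call in a script.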

New features

Minimum Directed Steiner Tree

  • Added heuristic for minimum directed Steiner Tree under the gds.beta.steinerTree domain.
    • Added stats mode with gds.beta.steinerTree.stats
    • Added stream mode with gds.beta.steinerTree.stream
    • Added mutate mode with gds.beta.steinerTree.mutate
    • Added write mode with gds.beta.steinerTree.write
    • Now available in progress tracking - gds.list.progress()

Leiden

  • New parameter consecutiveIds that assigns consecutive ids for the discovered communities.
  • New parameter seedProperty to seed initial communities for nodes.
  • New parameter tolerance to enable convergence criteria based on difference in modularity from one iteration to another.
  • Now available in progress tracking - gds.list.progress()
  • Added memory estimation mode:
    • gds.beta.leiden.mutate.estimate
    • gds.beta.leiden.stats.estimate
    • gds.beta.leiden.stream.estimate
    • gds.beta.leiden.write.estimate

Logistic Regression & MLP

  • New configuration parameters classWeights and focusWeight for training methods, supported by procedures:
    • gds.beta.pipeline.nodeClassification.addLogisticRegression
    • gds.beta.pipeline.nodeClassification.addMLP
    • gds.beta.pipeline.linkPrediction.addLogisticRegression
    • gds.beta.pipeline.linkPrediction.addMLP

HashGNN

  • New algorithm gds.alpha.hashgnn.{mutate,stream} to create HashGNN node embeddings
  • New procedures gds.alpha.hashgnn.{mutate,stream}.estimate to estimate the memory required to run HashGNN

Link Prediction

  • Added new optional configuration parameter negativeRelationshipType to gds.beta.pipeline.linkPrediction.configureSplit

Spanning Tree

  • New modes supported: gds.beta.spanningTree.(stats, stream, mutate)
  • New yield output for gds.beta.spanningTree that outputs the sum of weights in the discovered spanning tree.
  • New yield output for gds.beta.spanningTree that outputs the number of relationships written or added for write and mutate mode respectively.
  • Added memory estimation mode:
    • gds.beta.spanningTree.stream.estimate
    • gds.beta.spanningTree.mutate.estimate
    • gds.beta.spanningTree.stats.estimate
    • gds.beta.spanningTree.write.estimate

Write Labels

  • Added gds.alpha.graph.nodeLabel.write to allow for Node Labels to be written back from projections to a Neo4j Database

Graph Projections

  • Arrow now supports specifying undirected relationship types using the undirected_relationship_types configuration argument
  • Cypher Aggregations (gds.alpha.graph.project) now support specifying undirected relationship types using the undirectedRelationshipTypes configuration option
  • New procedure to turn directed relationships into undirected relationships: gds.beta.graph.relationships.toUndirect

Administration

  • Added the jobId and username to the ongoingGdsProcedures return field of gds.alpha.systemMonitor.
  • Added username as a new return field to gds.beta.listProgress.
  • Added a new return field to gds.graph.list called schemaWithOrientation which also includes the orientation.
  • Administrators can now see all running tasks from all users with gds.beta.listProgress

Bug fixes

  • Minimum Weighted Spanning Tree: Graphs with parallel edges could make the discovered tree have wrong weights on relationships
  • Cypher Aggregations: When using gds.alpha.graph.project:
    • The projected graph would list relationship types with zero relationships
    • ArrayIndexOutOfBounds (AIOOB) exceptions could surface due to sizing errors
  • Arrow: The CREATE_DATABASE action would throw a NullPointerException if id fields were missing in the Arrow record. It now throws a more descriptive exception.

Improvements

Arrow

  • graph import now fully supports external node ids in the 64 Bit space.
  • graph import now supports 16, 32 or 64 Bit node identifiers.

Leiden

  • Better parallelization and improved overall performance.

Other Improvements

  • Speed improvements for Dijkstra, Astar, Yens, CELF, weighted Betweenness Centrality, and the Spanning Tree algorithms. These improvements come with a slight increase in the memory consumption of these algorithms.
  • Improved error message for invalid node labels and relationship types

Other changes

  • Histograms returned such as degreeDistribution in gds.graph.list can have slightly different values for specific percentiles due to changes in floating point operations.
  • Progress tracking in the Spanning Tree algorithm has been reworked. Progress reporting may differ from earlier versions.
  • The yielded field schema is now marked as deprecated in gds.graph.list and gds.graph.drop. In the next major release, the schema field will use the semantics of schemaWithOrientation.
  • In gds.alpha.model.store, the positional argument failIfUnsupportedType is renamed to failIfUnsupported. Both will be supported until it is promoted to the beta tier.
  • Progress tracking for Betweenness Centrality has been reworked. Progress reporting may differ from earlier versions.

Graph Data Science 2.2.6

16 Dec 09:15

Neo4j Graph Data Science 2.2.6 is compatible with Neo4j Database 5 versions & 4.4 versions ≥ 4.4.9 & 4.3 versions ≥ 4.3.15.

For Neo4j Graph Data Science compatibility, please use the Neo4j Compatibility Matrix.

Improvements

  • Added support for Neo4j Database 5.3

Graph Data Science 2.3.0-alpha03

01 Dec 15:45
Pre-release

GDS 2.3.0-alpha03 is compatible with Neo4j 5 versions & 4.4 versions ≥ 4.4.9 & 4.3 versions ≥ 4.3.15.

For GDS compatibility with previous releases, please use GDS Compatibility Table.

Breaking changes

  • Leiden was promoted to the beta tier. It is now called via the gds.beta.leiden command instead of the gds.alpha.leiden command.
  • K-means was promoted to the beta tier. It is now called via the gds.beta.kmeans command instead of the gds.alpha.kmeans command.
  • The parameter startNodeId in Spanning Tree algorithms has been replaced with sourceNode.
  • The minimum weighted spanning tree algorithm is moved to beta. It is now called via the gds.beta.spanningTree command instead of gds.alpha.spanningTree
    • The procedures gds.alpha.spanningTree.minimum and gds.alpha.spanningTree.maximum have been removed. You can get the same behavior by specifying the new parameter objective in gds.beta.spanningTree.

New features

Minimum Directed Steiner Tree

  • Added heuristic for minimum directed Steiner Tree under the gds.alpha.steinerTree domain.
    • Added stats mode with gds.alpha.steinerTree.stats
    • Added stream mode with gds.alpha.steinerTree.stream
    • Added mutate mode with gds.alpha.steinerTree.mutate
    • Added write mode with gds.alpha.steinerTree.write

Leiden

  • New parameter consecutiveIds that assigns consecutive ids for the discovered communities.
  • New parameter seedProperty to seed initial communities for nodes.
  • New parameter tolerance to enable convergence criteria based on difference in modularity from one iteration to another.
  • Now available in progress tracking - gds.list.progress()
  • Added memory estimation mode:
    • gds.beta.leiden.mutate.estimate
    • gds.beta.leiden.stats.estimate
    • gds.beta.leiden.stream.estimate
    • gds.beta.leiden.write.estimate

Logistic Regression & MLP

  • New configuration parameters classWeights and focusWeight for training methods, supported by procedures:
    • gds.beta.pipeline.nodeClassification.addLogisticRegression
    • gds.beta.pipeline.nodeClassification.addMLP
    • gds.beta.pipeline.linkPrediction.addLogisticRegression
    • gds.beta.pipeline.linkPrediction.addMLP

HashGNN

  • New algorithm gds.alpha.hashgnn.{mutate,stream} to create HashGNN node embeddings
  • New procedures gds.alpha.hashgnn.{mutate,stream}.estimate to estimate the memory required to run HashGNN

Link Prediction

  • Added new optional configuration parameter negativeRelationshipType to gds.beta.pipeline.linkPrediction.configureSplit

Spanning Tree

  • New modes supported: gds.alpha.spanningTree.(stats, stream, mutate)
  • New yield output for gds.alpha.spanningTree that outputs the sum of weights in the discovered spanning tree.
  • New yield output for gds.alpha.spanningTree that outputs the number of relationships written or added for write and mutate mode respectively.
  • Added memory estimation mode:
    • gds.alpha.spanningTree.stream.estimate
    • gds.alpha.spanningTree.mutate.estimate
    • gds.alpha.spanningTree.stats.estimate
    • gds.alpha.spanningTree.write.estimate

Write Labels

  • Added gds.alpha.graph.nodeLabel.write to allow for Node Labels to be written back from projections to a Neo4j Database

Administration

  • Added the jobId and username to the ongoingGdsProcedures return field of gds.alpha.systemMonitor.
  • Added username as a new return field to gds.beta.listProgress.
  • Added a new return field to gds.graph.list called schemaWithOrientation which also includes the orientation.

Bug fixes

  • Fixed a bug in Minimum Weighted Spanning Tree on graphs with parallel edges where the discovered tree could have wrong weights.

Improvements

Arrow

  • graph import now fully supports external node ids in the 64 Bit space.
  • graph import now supports 16, 32 or 64 Bit node identifiers.

Leiden

  • Better parallelization and improved overall performance.

Other Algorithms

  • Speed improvements for Dijkstra, Astar, Yens, CELF, weighted Betweenness Centrality, and the Spanning Tree algorithms. These improvements come with a slight increase in the memory consumption of these algorithms.

Other changes

  • Histograms returned such as degreeDistribution in gds.graph.list can have slightly different values for specific percentiles due to changes in floating point operations.
  • Progress tracking in the Spanning Tree algorithm has been reworked. Progress reporting may differ from earlier versions.
  • The yielded field schema is now marked as deprecated in gds.graph.list and gds.graph.drop. In the next major release, the schema field will use the semantics of schemaWithOrientation.

Graph Data Science 2.2.5

29 Nov 10:39

Neo4j Graph Data Science 2.2.5 is compatible with Neo4j Database 5 versions & 4.4 versions ≥ 4.4.9 & 4.3 versions ≥ 4.3.15.

For Neo4j Graph Data Science compatibility, please use the Neo4j Compatibility Matrix.

Bug Fixes

  • Some functions would not work as expected with Neo4j 5.x versions
    • gds.alpha.linkprediction.adamicAdar
    • gds.alpha.linkprediction.commonNeighbors
    • gds.alpha.linkprediction.resourceAllocation
    • gds.alpha.linkprediction.totalNeighbors

Graph Data Science 2.3.0-Alpha02

21 Nov 10:41
Pre-release

GDS 2.3.0-alpha02 is compatible with Neo4j 5 versions & 4.4 versions ≥ 4.4.9 & 4.3 versions ≥ 4.3.15.

For GDS compatibility with previous releases, please use GDS Compatibility Table.

Breaking changes

  • Leiden was promoted to the beta tier. It is now called via the gds.beta.leiden command instead of the gds.alpha.leiden command.
  • K-means was promoted to the beta tier. It is now called via the gds.beta.kmeans command instead of the gds.alpha.kmeans command.
  • The parameter startNodeId in Spanning Tree algorithms has been replaced with sourceNode.
  • The procedures gds.alpha.spanningTree.minimum and gds.alpha.spanningTree.maximum have been removed. You can get the same behavior by specifying the new parameter objective in gds.alpha.spanningTree.

New features

Leiden

  • New parameter consecutiveIds that assigns consecutive ids for the discovered communities.
  • New parameter seedProperty to seed initial communities for nodes.
  • New parameter tolerance to enable convergence criteria based on difference in modularity from one iteration to another.
  • Now available in progress tracking - gds.list.progress()
  • Added memory estimation mode:
    • gds.beta.leiden.mutate.estimate
    • gds.beta.leiden.stats.estimate
    • gds.beta.leiden.stream.estimate
    • gds.beta.leiden.write.estimate

Logistic Regression & MLP

  • New configuration parameters classWeights and focusWeight for training methods, supported by procedures:
    • gds.beta.pipeline.nodeClassification.addLogisticRegression
    • gds.beta.pipeline.nodeClassification.addMLP
    • gds.beta.pipeline.linkPrediction.addLogisticRegression
    • gds.beta.pipeline.linkPrediction.addMLP

HashGNN

  • New algorithm gds.alpha.hashgnn.{mutate,stream} to create HashGNN node embeddings
  • New procedures gds.alpha.hashgnn.{mutate,stream}.estimate to estimate the memory required to run HashGNN

Spanning Tree

  • New modes supported: gds.alpha.spanningTree.(stats, stream, mutate)
  • New yield output for gds.alpha.spanningTree that outputs the sum of weights in the discovered spanning tree.
  • New yield output for gds.alpha.spanningTree that outputs the number of relationships written or added for write and mutate mode respectively.
  • Added memory estimation mode:
    • gds.alpha.spanningTree.stream.estimate
    • gds.alpha.spanningTree.mutate.estimate
    • gds.alpha.spanningTree.stats.estimate
    • gds.alpha.spanningTree.write.estimate

Bug fixes

  • Fixed a bug in Minimum Weighted Spanning Tree on graphs with parallel edges where the discovered tree could have wrong weights.

Improvements

Arrow

  • graph import now fully supports external node ids in the 64 Bit space.
  • graph import now supports 16, 32 or 64 Bit node identifiers.

Leiden

  • Better parallelization and improved overall performance.

Other changes

  • Histograms returned such as degreeDistribution in gds.graph.list can have slightly different values for specific percentiles due to changes in floating point operations.
  • Progress tracking in the Spanning Tree algorithm has been reworked. Progress reporting may differ from earlier versions.

Graph Data Science 2.2.4

21 Nov 10:46

GDS 2.2.4 is compatible with Neo4j 5 versions & 4.4 versions ≥ 4.4.9 & 4.3 versions ≥ 4.3.15.

For GDS compatibility with previous releases, please use GDS Compatibility Table.

Bug fixes

  • gds.alpha.nodeSimilarity.filtered - would give incorrect node IDs.
  • Pregel framework - the computation would not stop after terminating the underlying transaction. This affects gds.pageRank, gds.articleRank, gds.eigenvector.
  • gds.alpha.hits and gds.alpha.sllpa could not be used as a nodeProperty step inside ML pipelines, including gds.beta.pipeline.linkPrediction, gds.beta.pipeline.nodeClassification, and gds.alpha.pipeline.nodeRegression.
  • nodeProperty steps could not be added to ml pipelines when running against Neo4j 5.x. This affected gds.beta.pipeline.linkPrediction, gds.beta.pipeline.nodeClassification, and gds.alpha.pipeline.nodeRegression.

Improvements

  • gds.graph.list will only calculate the graph size when the procedure is called without any YIELD or if the fields memoryUsage or sizeInBytes are explicitly YIELD-ed.
    Using YIELD to return other fields but not one of memoryUsage or sizeInBytes can speed up the execution time of gds.graph.list.
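
The rule above can be modelled in a few lines to make the cases explicit. This is an illustrative sketch only, not GDS code: the function `computes_graph_size` is hypothetical, `None` stands for a call without any YIELD clause, and the behaviour of `YIELD *` is not covered by the note and therefore not modelled here.

```python
# Fields whose presence in YIELD triggers the graph size calculation.
SIZE_FIELDS = {"memoryUsage", "sizeInBytes"}


def computes_graph_size(yielded_fields=None):
    """Hypothetical model: would gds.graph.list calculate the graph size?

    yielded_fields is None when the procedure is called without YIELD,
    otherwise a list of the explicitly yielded field names.
    """
    if yielded_fields is None:
        return True  # no YIELD: all fields are returned, including the size
    return bool(SIZE_FIELDS & set(yielded_fields))
```

Under this model, yielding only fields such as graphName and nodeCount skips the size calculation, which is where the speed-up described above comes from.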

Graph Data Science 2.2.3

07 Nov 09:55
Compare
Choose a tag to compare

GDS 2.2.3 is compatible with Neo4j 5 versions & 4.4 versions ≥ 4.4.9 & 4.3 versions ≥ 4.3.15.

For GDS compatibility with previous releases, please use GDS Compatibility Table.

Bug fixes

  • gds.graph.export failed to run on Neo4j 5.X
  • gds.graph.export failed with InvalidRecordException when writeConcurrency is set >1.
  • Enterprise users were unable to load models trained with concurrency > 4.

Improvements

  • Arrow graph import now fully supports external node ids in the 64 Bit space.

Graph Data Science 2.3.0-alpha01

23 Oct 13:42
Pre-release

GDS 2.3.0-alpha01 is compatible with Neo4j 5 versions & 4.4 versions ≥ 4.4.9 & 4.3 versions ≥ 4.3.15.

For GDS compatibility with previous releases, please use GDS Compatibility Table.

New features

  • Added a parameter consecutiveIds to Leiden to assign consecutive ids for the discovered communities.
  • Added a parameter seedProperty to Leiden to seed initial communities for nodes.
  • Added new configuration parameter focusWeight for Logistic Regression training method, supported by procedures:
    • gds.beta.pipeline.nodeClassification.addLogisticRegression
    • gds.beta.pipeline.linkPrediction.addLogisticRegression

Bug fixes

  • Fixed a bug in Minimum Weighted Spanning Tree on graphs with parallel edges where the discovered tree could have wrong weights.

Improvements

  • Arrow graph import now fully supports external node ids in the 64 Bit space.
  • Arrow graph import now supports 16, 32 or 64 Bit node identifiers.

Other changes

  • Histograms returned such as degreeDistribution in gds.graph.list can have slightly different values for specific percentiles due to changes in floating point operations.

Graph Data Science 2.2.2

21 Oct 10:17

GDS 2.2.2 is compatible with Neo4j 5 versions & 4.4 versions ≥ 4.4.9 & 4.3 versions ≥ 4.3.15.

For GDS compatibility with previous releases, please use GDS Compatibility Table.

Improvements

  • Graph Data Science ≥2.2.2 now supports Neo4j 5