Releases: neo4j/graph-data-science

GDS 1.4.1

07 Dec 23:22

Release date: 7 December, 2020

GDS 1.4.1 is compatible with Neo4j 4.0 and 4.1, but not Neo4j 3.5.x. For a 3.5 compatible release, please see GDS 1.1.6.

Bug fixes

  • Fixed a bug in progress logging for gds.graph.writeNodeProperties() and gds.graph.writeRelationships() where some percentages were skipped and others were reported multiple times.
  • Fixed a bug where gds.graph.writeNodeProperties() and gds.alpha.shortestPath.deltaStepping.write() were single-threaded by default.
  • Fixed a bug where gds.alpha.node2vec ignored relationships for graphs with multiple projected relationship types.
  • Fixed a bug where gds.pageRank.*.estimate would fail for very large node counts.
  • Fixed a bug where using float array node properties (e.g. after running gds.fastRP.mutate) would fail in some situations.
  • Fixed a bug where a graph with multiple labels and all nodes sharing at least one label could lead to either an exception or a wrongly mapped Neo4j id.

Improvements

  • gds.pageRank will now select batches more dynamically to properly respect the requested concurrency.

1.3.5

02 Dec 17:00

Release date: 23 November, 2020

GDS 1.3.5 is compatible with Neo4j 4.0 and 4.1, but not Neo4j 3.5.x or 4.2. For a 3.5 compatible release, please see GDS 1.1.6. For a 4.2 compatible release, please see GDS 1.4.0.

See also 1.3.0 release notes, 1.3.1 release notes, 1.3.2 release notes, 1.3.3 release notes, and 1.3.4 release notes.

Bug fixes

  • Fixed a bug in gds.graph.export where at most one relationship property per relationship type would be exported.
  • Fixed a bug in Louvain where changes to maxIterations were ignored.
  • Fixed a bug where gds.alpha.node2vec would ignore relationships for graphs with multiple projected relationship types.

GDS 1.4.0

05 Nov 20:16

Release date: 5 November, 2020

GDS 1.4.0 is compatible with Neo4j 4.0 and 4.1, but not Neo4j 3.5.x. For a 3.5 compatible release, please see GDS 1.1.6.

Breaking changes

  • License key configuration was renamed from licenseFile to license_file for consistency with Bloom
  • Removed sparsity parameter from gds.alpha.randomProjection.*
  • Renamed gds.alpha.randomProjection to gds.fastRP due to productization.
  • Renamed embeddingSize parameter to embeddingDimension for fastRP, GraphSAGE and Node2Vec.
  • Renamed projectedFeatureSize to projectedFeatureDimension for GraphSAGE
  • Renamed nodePropertyNames to featureProperties in gds.beta.fastRPExtended and gds.beta.graphSage.train.
  • Default parameters for gds.fastRP have changed on the following configuration parameters:
    • iterationWeights now has default [0.0, 1.0, 1.0]
    • normalizeL2 has been removed and its effect is always applied
  • Removed alpha procedures for GraphSage (replaced with beta tier, see New Features section)
    • gds.alpha.graphSage.stream
    • gds.alpha.graphSage.write
  • GraphSage no longer calculates embeddings directly. Instead, it has been split into a train mode (which generates a named model) and write, mutate, and stream modes (which apply the model's predictions to your data).
  • Due to the creation of a train mode for GraphSage, the following configuration parameters were moved:
    • embeddingSize - moved as configuration parameter of gds.beta.graphSage.train
    • aggregator - moved as configuration parameter of gds.beta.graphSage.train
    • activationFunction - moved as configuration parameter of gds.beta.graphSage.train
    • sampleSizes - moved as configuration parameter of gds.beta.graphSage.train
    • nodePropertyNames - moved as configuration parameter of gds.beta.graphSage.train
    • tolerance - moved as configuration parameter of gds.beta.graphSage.train
    • learningRate - moved as configuration parameter of gds.beta.graphSage.train
    • epochs - moved as configuration parameter of gds.beta.graphSage.train
    • maxIterations - moved as configuration parameter of gds.beta.graphSage.train
    • searchDepth - moved as configuration parameter of gds.beta.graphSage.train
    • negativeSampleWeight - moved as configuration parameter of gds.beta.graphSage.train
    • degreeAsProperty - moved as configuration parameter of gds.beta.graphSage.train
  • gds.beta.graphSage.stream procedure now requires modelName configuration parameter.
  • gds.beta.graphSage.write procedure now requires modelName configuration parameter.
  • Removed startLoss and epochLosses from the result columns of gds.beta.graphSage.write.
  • Added the graph create config as a return field to the train procedure, affecting gds.beta.graphSage.train
  • Fixed result column name embeddings to embedding in GraphSAGE, to align with the other embeddings.
  • Removed configuration parameter maxCost from gds.alpha.bfs/dfs.
  • Unlocking the Enterprise Edition of the Graph Data Science library requires a license key. The previous config setting has been removed.
  • Removed degreeDistribution from gds.graph.drop return columns.
  • gds.pageRank now respects the concurrency setting. It will not run if there is insufficient memory for the given concurrency setting.
  • Alpha similarity algorithms no longer accept a graph name as a parameter. The algorithms never used the named graph, so the option to specify one has been removed.
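
The parameter renames and removals listed above can be applied mechanically to an existing configuration map. The following is a minimal sketch in Python, not part of GDS: the helper name migrate_config is hypothetical, and it applies every rename and removal to any map it is given, even though the notes tie each change to specific procedures (for example, maxCost only to gds.alpha.bfs/dfs). Check which changes apply to the procedure you actually call.

```python
# Old name -> new name, taken from the breaking changes listed above.
RENAMED_KEYS = {
    "embeddingSize": "embeddingDimension",
    "projectedFeatureSize": "projectedFeatureDimension",
    "nodePropertyNames": "featureProperties",
}

# Parameters the notes list as removed
# (sparsity and normalizeL2 for FastRP, maxCost for bfs/dfs).
REMOVED_KEYS = {"sparsity", "normalizeL2", "maxCost"}


def migrate_config(old_config):
    """Return a copy of a pre-1.4 config map using the 1.4 key names.

    Hypothetical helper for illustration; it does not talk to Neo4j.
    """
    new_config = {}
    for key, value in old_config.items():
        if key in REMOVED_KEYS:
            continue  # removed in 1.4, so it is dropped from the new map
        new_config[RENAMED_KEYS.get(key, key)] = value
    return new_config


# Example: a config previously passed to gds.alpha.randomProjection.
old = {"embeddingSize": 64, "sparsity": 3, "normalizeL2": True, "concurrency": 4}
new = migrate_config(old)
```

The input map is left untouched, so the old and new configurations can be compared before switching procedure names.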

New features

  • Promoted GraphSage to the beta tier and added support for inductive models with the train mode
    • This adds procedures
      • gds.beta.graphSage.mutate
      • gds.beta.graphSage.mutate.estimate
      • gds.beta.graphSage.stream
      • gds.beta.graphSage.stream.estimate
      • gds.beta.graphSage.train
      • gds.beta.graphSage.train.estimate
      • gds.beta.graphSage.write
      • gds.beta.graphSage.write.estimate
    • And removes alpha procedures
      • gds.alpha.graphSage.stream
      • gds.alpha.graphSage.write
  • GraphSage supports relationship weights, driven by relationshipWeightProperty
  • GraphSage supports node labels via projectedFeatureSize
  • Introduced the model catalog to manage trained models, including:
    • gds.beta.model.exists - a procedure to check if a model exists in the catalog
    • gds.beta.model.list - lists all available models
    • gds.beta.model.drop - removes a model from the catalog
  • The Random Projection algorithm has been promoted to the product tier and we have added:
    • gds.fastRP.stats
    • gds.fastRP.mutate
    • gds.fastRP.estimate
    • Added procedures for the stats and mutate modes, as well as estimates for all modes.
  • FastRP has been extended to support relationship weights and directions
  • FastRP supports integer configuration for iteration weights.
  • We’ve added support for node property features for FastRP in the beta namespace with FastRPExtended:
    • gds.beta.fastRPExtended.mutate
    • gds.beta.fastRPExtended.stream
    • gds.beta.fastRPExtended.stats
    • gds.beta.fastRPExtended.write
    • gds.beta.fastRPExtended.mutate.estimate
    • gds.beta.fastRPExtended.stream.estimate
    • gds.beta.fastRPExtended.stats.estimate
    • gds.beta.fastRPExtended.write.estimate
  • We’ve added the K-Nearest Neighbors (KNN) algorithm to the beta tier:
    • gds.beta.knn.mutate and gds.beta.knn.mutate.estimate
    • gds.beta.knn.stats and gds.beta.knn.stats.estimate
    • gds.beta.knn.stream and gds.beta.knn.stream.estimate
    • gds.beta.knn.write and gds.beta.knn.write.estimate
  • The in-memory graph now supports list properties, so embedding results can be stored in memory and embeddings can be loaded from nodes for KNN or similarity calculations.
  • Pregel framework
    • Added Pregel annotation processor to generate GDS procedures for custom Pregel algorithms.
    • Pregel now supports long and double array node values.
    • Added support for composite node state to allow complex data types on nodes.
    • Reduced memory consumption.
    • Improved memory estimation.
    • Simplified message iteration in compute methods.
    • Split context into Init- and ComputeContext and simplified API.
    • Removed K1ColoringExample standalone project.
    • Added pregel-bootstrap standalone project.
    • Added pregel-examples module.
  • Licensing: GDS Enterprise edition now requires license keys issued by Neo4j to unlock enterprise features.
  • Added a density property to the output of gds.graph.list.
  • Added a failIfMissing flag to gds.graph.drop
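
The new GraphSage workflow is two-step: train a named model, then apply it with stream, mutate, or write by passing modelName. The sketch below only builds the Cypher text and parameter maps for those two calls. The helper names, the graph name "persons", the model name, and the property "age" are all hypothetical. The notes confirm the procedure names, the modelName parameter, and the move of featureProperties to train; the positional (graphName, configuration) call shape is assumed from the named-graph form of other GDS procedures.

```python
def graphsage_train_call(graph_name, model_name, feature_properties):
    """Build Cypher text and parameters for a gds.beta.graphSage.train call.

    Hypothetical helper: it only assembles the request, it does not run it.
    """
    query = "CALL gds.beta.graphSage.train($graphName, $config)"
    params = {
        "graphName": graph_name,
        "config": {
            "modelName": model_name,  # name under which the model is stored
            "featureProperties": list(feature_properties),
        },
    }
    return query, params


def graphsage_stream_call(graph_name, model_name):
    """Build Cypher text and parameters for applying a trained model."""
    query = "CALL gds.beta.graphSage.stream($graphName, $config)"
    params = {"graphName": graph_name, "config": {"modelName": model_name}}
    return query, params


# Assumed names for illustration only.
train_query, train_params = graphsage_train_call("persons", "exampleModel", ["age"])
stream_query, stream_params = graphsage_stream_call("persons", "exampleModel")
```

Each query and parameter pair would then be handed to whatever Neo4j client is in use. Passing the configuration as a parameter map keeps the Cypher text constant across models.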

Bug fixes

  • Pregel:
    • Fixed a bug in Pregel that could lead to incorrect results when running in parallel.
    • Fixed a cast exception when returning array node properties in generated Pregel procedures.
  • Fixed a bug in a multi-source BFS traversal strategy that could affect the following procedures:
    • gds.alpha.closeness
    • gds.alpha.closeness.harmonic
    • gds.alpha.allShortestPaths
  • Fixed a bug in gds.alpha.shortestPath.deltaStepping where large relationship weights led to incorrect results
  • Weakly connected components:
    • Fixed a bug in WCC where componentCount would be negative when the graph is empty.
    • Fixed a regression where WCC could run more slowly with increased concurrency.
  • Fixed bugs in Louvain:
    • communityCount is no longer negative when the graph is empty.
    • changes to maxIterations are no longer ignored.
  • Fixed a bug in LabelPropagation where communityCount would be negative when the graph is empty.
  • Fixed a bug in KNN where it failed when run on graphs with filtere...

GDS 1.4 Preview

16 Oct 20:03
Pre-release

Breaking changes

  • Removed sparsity parameter from gds.alpha.randomProjection.*
  • Renamed gds.alpha.randomProjection to gds.fastRP due to productization.
  • Renamed embeddingSize parameter to embeddingDimension for fastRP, GraphSAGE and Node2Vec.
  • Default parameters for gds.fastRP have changed on the following configuration parameters:
    • iterationWeights now has default [0.0, 1.0, 1.0]
    • normalizeL2 has been removed and its effect is always applied
  • Removed alpha procedures for GraphSage (replaced with beta tier, see New Features section)
    • gds.alpha.graphSage.stream
    • gds.alpha.graphSage.write
  • GraphSage no longer calculates embeddings directly. Instead, it has been split into a train mode (which generates a named model) and write, mutate, and stream modes (which apply the model's predictions to your data).
  • Due to the creation of a train mode for GraphSage, the following configuration parameters were moved:
    • embeddingSize - moved as configuration parameter of gds.beta.graphSage.train
    • aggregator - moved as configuration parameter of gds.beta.graphSage.train
    • activationFunction - moved as configuration parameter of gds.beta.graphSage.train
    • sampleSizes - moved as configuration parameter of gds.beta.graphSage.train
    • nodePropertyNames - moved as configuration parameter of gds.beta.graphSage.train
    • tolerance - moved as configuration parameter of gds.beta.graphSage.train
    • learningRate - moved as configuration parameter of gds.beta.graphSage.train
    • epochs - moved as configuration parameter of gds.beta.graphSage.train
    • maxIterations - moved as configuration parameter of gds.beta.graphSage.train
    • searchDepth - moved as configuration parameter of gds.beta.graphSage.train
    • negativeSampleWeight - moved as configuration parameter of gds.beta.graphSage.train
    • degreeAsProperty - moved as configuration parameter of gds.beta.graphSage.train
  • gds.beta.graphSage.stream procedure now requires modelName configuration parameter.
  • gds.beta.graphSage.write procedure now requires modelName configuration parameter.
  • Removed startLoss and epochLosses from the result columns of gds.beta.graphSage.write.
  • Added the graph create config as a return field to the train procedure, affecting gds.beta.graphSage.train
  • Fixed result column name embeddings to embedding in GraphSAGE, to align with the other embeddings.
  • Removed configuration parameter maxCost from gds.alpha.bfs/dfs.
  • Unlocking the Enterprise Edition of the Graph Data Science library requires a license key. The previous config setting has been removed.
  • Removed degreeDistribution from gds.graph.drop return columns.
  • gds.pageRank now respects the concurrency setting. It will not run if there is insufficient memory for the given concurrency setting.
  • Alpha similarity algorithms no longer accept a graph name as a parameter. The algorithms never used the named graph, so the option to specify one has been removed.

New features

  • Promoted GraphSage to the beta tier and added support for inductive models with the train mode
    • This adds procedures
      • gds.beta.graphSage.mutate
      • gds.beta.graphSage.mutate.estimate
      • gds.beta.graphSage.stream
      • gds.beta.graphSage.stream.estimate
      • gds.beta.graphSage.train
      • gds.beta.graphSage.train.estimate
      • gds.beta.graphSage.write
      • gds.beta.graphSage.write.estimate
    • And removes alpha procedures
      • gds.alpha.graphSage.stream
      • gds.alpha.graphSage.write
  • GraphSage supports relationship weights, driven by relationshipWeightProperty
  • GraphSage supports node labels via projectedFeatureSize
  • Introduced the model catalog to manage trained models, including:
    • gds.beta.model.exists - a procedure to check if a model exists in the catalog
    • gds.beta.model.list - lists all available models
    • gds.beta.model.drop - removes a model from the catalog
  • The Random Projection algorithm has been promoted to the product tier and we have added:
    • gds.fastRP.stats
    • gds.fastRP.mutate
    • gds.fastRP.estimate
    • Added procedures for the stats and mutate modes, as well as estimates for all modes.
  • FastRP has been extended to support relationship weights and directions
  • FastRP supports integer configuration for iteration weights.
  • We’ve added support for node property features for FastRP in the beta namespace with FastRPExtended:
    • gds.beta.fastRPExtended.mutate
    • gds.beta.fastRPExtended.stream
    • gds.beta.fastRPExtended.stats
    • gds.beta.fastRPExtended.write
    • gds.beta.fastRPExtended.mutate.estimate
    • gds.beta.fastRPExtended.stream.estimate
    • gds.beta.fastRPExtended.stats.estimate
    • gds.beta.fastRPExtended.write.estimate
  • We’ve added the K-Nearest Neighbors (KNN) algorithm to the beta tier:
    • gds.beta.knn.mutate and gds.beta.knn.mutate.estimate
    • gds.beta.knn.stats and gds.beta.knn.stats.estimate
    • gds.beta.knn.stream and gds.beta.knn.stream.estimate
    • gds.beta.knn.write and gds.beta.knn.write.estimate
  • The in-memory graph now supports list properties, so embedding results can be stored in memory and embeddings can be loaded from nodes for KNN or similarity calculations.
  • Pregel framework
    • Added Pregel annotation processor to generate GDS procedures for custom Pregel algorithms.
    • Pregel now supports long and double array node values.
    • Added support for composite node state to allow complex data types on nodes.
    • Reduced memory consumption.
    • Improved memory estimation.
    • Simplified message iteration in compute methods.
    • Split context into Init- and ComputeContext and simplified API.
    • Removed K1ColoringExample standalone project.
    • Added pregel-bootstrap standalone project.
    • Added pregel-examples module.
  • Licensing: GDS Enterprise edition now requires license keys issued by Neo4j to unlock enterprise features.
  • Added a density property to the output of gds.graph.list.
  • Added a failIfMissing flag to gds.graph.drop

Bug fixes

  • Pregel:
    • Fixed a bug in Pregel that could lead to incorrect results when running in parallel.
    • Fixed a cast exception when returning array node properties in generated Pregel procedures.
  • Fixed a bug in a multi-source BFS traversal strategy that could affect the following procedures:
    • gds.alpha.closeness
    • gds.alpha.closeness.harmonic
    • gds.alpha.allShortestPaths
  • Weakly connected components:
    • Fixed a bug in WCC where componentCount would be negative when the graph is empty.
    • Fixed a regression where WCC could run more slowly with increased concurrency.
  • Fixed bugs in Louvain:
    • communityCount is no longer negative when the graph is empty.
    • changes to maxIterations are no longer ignored.
  • Fixed a bug in LabelPropagation where communityCount would be negative when the graph is empty.
  • Fixed a bug in gds.graph.export where at most one relationship property per relationship type would be exported.
  • Graph loading:
    • Fixed a bug where using node label projections including properties on large graphs and high concurrency could lead to loss of some properties.
    • Fixed a bug in graph creation that could cause an AIOOB (ArrayIndexOutOfBounds) exception during node loading.
    • The readConcurrency config parameter can no longer be overwritten by the concurrency parameter when it is explicitly set in an implicit graph creation config.
  • Fixed a bug in memory estimation of large anonymous fictitious graphs.
  • Fixed a bug in gds.alpha.dfs/bfs where the algorithm did not terminate for graphs containing loops.
  • Fixed result column name embeddings to embedding in GraphSAGE, to align with the other embeddings.
  • Fixed a bug in Node2Vec where many disconnected nodes would cause a StackOverflowError
  • Fixed a bug in RandomProjection where each iteration weight was effectively multiplied by all previous iteration weights.
  • Similarity algorithms:
    • Fixed a bug where Alpha Similarity algorithms would load a graph even though it was not needed
    • Fixed a bug where similarity algorithms would not remove the placeholder graph if config validation fails on invalid user input.
  • Fixed a bug where community statistic computation could overflow for large community ids.
  • Fixed a bug where DegreeCentrality returned incorrect values when concurrency > 1.
  • Fixed a bug where ClosenessCentrality used a slightly incorrect formula for the Wasserman-Faust variant.
  • Fixed a bug that affected gds.triangleCount() and gds.alpha.triangles() where not all triangles would be counted under certain conditions.
  • Parallel edges in a graph no longer lead to incorrect Local Clustering Coefficient and Triangle Count results.
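
For context on the ClosenessCentrality fix above, the Wasserman-Faust variant is commonly stated as (n-1)/(N-1) * (n-1)/sum of distances, where n is the number of nodes in the same component as the node and N is the total node count. The sketch below implements that textbook form in Python. The function name is hypothetical, and the notes do not say which part of the formula was previously wrong, so this should not be read as the GDS implementation.

```python
def wasserman_faust_closeness(distances, total_nodes):
    """Closeness of one node using the Wasserman-Faust normalization.

    distances: shortest-path distances from the node to every other node
               it can reach (so len(distances) == n - 1).
    total_nodes: N, the number of nodes in the whole graph.
    """
    reachable = len(distances)  # n - 1
    if reachable == 0 or total_nodes <= 1:
        return 0.0  # isolated node or trivial graph: define closeness as 0
    fraction_reached = reachable / (total_nodes - 1)
    inverse_mean_distance = reachable / sum(distances)
    return fraction_reached * inverse_mean_distance


# A node that reaches 2 of the 4 other nodes, at distances 1 and 2.
score = wasserman_faust_closeness([1, 2], total_nodes=5)
```

The first factor scales the score down for nodes in small components, which is what distinguishes this variant from plain closeness on disconnected graphs.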

Improvements

  • gds.fastRP now accepts integer iterationWeights
  • If graphSage.train is run on a graph without relationships, GDS now fails gracefully with an appropriate error message
  • Added validation that properties used by GraphSage exist on graph
  • Added validation that embeddingSize >= 1
  • Added a failIfExists flag to graph creation to enable a user to specify that if a graph already exists, it should be overwritten without failing.
  • Progress logging:
    • We now log progress in equally spaced percentages. This is 0-100% either in steps of 1, or in ...

1.4.0-alpha06

08 Oct 22:48
Pre-release
Tagging for 1.4.0-alpha06

1.3.4

01 Oct 13:53

Release date: 1 October, 2020

GDS 1.3.4 is compatible with Neo4j 4.0 and 4.1, but not Neo4j 3.5.x. For a 3.5 compatible release, please see GDS 1.1.6.

See also 1.3.0 release notes, 1.3.1 release notes, 1.3.2 release notes, and 1.3.3 release notes.

Bug fixes

  • Fixed a bug where node label projections that included properties, on large graphs with high concurrency, failed to load all properties.

1.1.6

01 Oct 13:53

Release date: 1 October, 2020

GDS 1.1.6 is compatible with Neo4j 3.5.9 and above, but not Neo4j 4.x. For a 4.x compatible release, please see GDS 1.3.4.

Bug fixes

  • Fixed a bug in memory estimation for large, anonymous, fictitious graphs.
  • The readConcurrency config parameter can no longer be overwritten by the concurrency parameter when it is set explicitly in an implicit graph creation config.
  • Fixed a bug where node label projections that included properties, on large graphs with high concurrency, could lose some properties in the in-memory graph.

1.3.3

22 Sep 13:49

Release date: 22 September, 2020

GDS 1.3.3 is compatible with Neo4j 4.0 and 4.1, but not Neo4j 3.5.x. For a 3.5 compatible release, please see GDS 1.1.5.

See also 1.3.0 release notes, 1.3.1 release notes, and 1.3.2 release notes.

Bug fixes

  • Fixed a bug in memory estimation on large, anonymous fictitious graphs
  • The readConcurrency configuration parameter cannot be overwritten by the concurrency parameter when it is explicitly set during graph creation
  • Fixed a bug in gds.triangleCount and gds.alpha.triangles where not all triangles were being counted under certain conditions

Improvements

  • Improved memory estimation for * node projections (loading all nodes regardless of label)

1.3.2

20 Aug 16:55

Release date: August 20, 2020

Bug fixes

  • Fixed a bug in RandomProjection where each iteration weight was effectively multiplied by all previous iteration weights.
  • Fixed a bug that could cause an AIOOB (ArrayIndexOutOfBounds) exception during graph creation.

GDS 1.1.5

20 Aug 16:16

Release date: August 20, 2020

Bug fixes

  • Fixed a bug that could cause an AIOOB (ArrayIndexOutOfBounds) exception during graph creation.