Releases · neo4j/graph-data-science

20 Dec 17:15

1.8.1

bb33cfe

Graph Data Science 1.8.1

GDS 1.8.1 is compatible with Neo4j 4.1, 4.2, 4.3 and 4.4 but not Neo4j 3.5.x. For a 3.5 compatible release, please see GDS 1.1.7. For a 4.0 compatible release, please see GDS 1.6.5.

Bug fixes

Fixed a bug where ForkJoin pools were not properly closed, which could lead to out-of-memory errors (OOMs) when using Pregel-based algorithms, e.g. PageRank.
Fixed a bug where gds.beta.graphSage could produce incorrect results for small graphs.
Fixed a bug where gds.beta.graphSage could produce incorrect results for the pool aggregator.
Fixed a bug where gds.graph.create.cypher would not accept list properties for nodes.
Fixed a bug in gds.beta.graph.create.subgraph where long values greater than 2⁵³ were not properly handled during expression evaluation.
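
The 2⁵³ threshold in the last fix is the point beyond which a 64-bit floating-point number can no longer represent every integer exactly. The Python sketch below illustrates that general effect. It assumes, without confirmation from these notes, that the faulty evaluation path converted long values to doubles; it is not the GDS implementation.

```python
# Illustration only: why integers above 2**53 are fragile when an
# expression evaluator routes them through 64-bit floats. That routing is
# an assumption about the cause, not something the release notes state.

LIMIT = 2 ** 53  # every integer up to and including 2**53 is exact as a double


def survives_double_round_trip(value: int) -> bool:
    """Return True if `value` is unchanged after conversion to float and back."""
    return int(float(value)) == value


if __name__ == "__main__":
    for candidate in (LIMIT - 1, LIMIT, LIMIT + 1, LIMIT + 2):
        print(candidate, survives_double_round_trip(candidate))
```

Only `LIMIT + 1` fails the round trip in this run, which is why comparisons against node ids or property values in that range can silently match the wrong number if floats are involved.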

Assets 4

03 Dec 13:41

AliciaFrame

1.7.3

ba86fc1

Graph Data Science 1.7.3

GDS 1.7.3 is compatible with Neo4j 4.1, 4.2, and 4.3 but not Neo4j 3.5.x, 4.0, or 4.4. For a 3.5 compatible release, please see GDS 1.1.7. For a 4.0 compatible release, please see GDS 1.6.5. For a 4.4 compatible release, please see GDS 1.8.0.

Bug fixes

Fixed a bug where Node2Vec would produce an ArrayIndexOutOfBoundsException on sufficiently large graphs.
Fixed a bug where ForkJoin pools were not properly closed, which could lead to out-of-memory errors (OOMs) when using Pregel-based algorithms, e.g. PageRank.

Assets 4

01 Dec 19:23

AliciaFrame

1.8.0

49f69d9

GDS 1.8.0

GDS 1.8 is compatible with Neo4j 4.1, 4.2, 4.3, and 4.4 but not Neo4j 3.5.x. For a 3.5 compatible release, please see GDS 1.1.7. For a 4.0 compatible release, please see GDS 1.6.5

Breaking changes

GDS now throws an error on identifiers with trailing whitespace to avoid input errors. This affects graphName, modelName, and several property parameters such as nodeWeightProperty or seedProperty.
We have removed the separate concurrency parameter from the model parameter space in gds.alpha.ml.nodeClassification.train, gds.alpha.ml.linkPrediction.train and gds.alpha.ml.pipeline.linkPrediction.configureParams. The concurrency value in the configuration of the train procedure will be used.
The procedure gds.alpha.randomWalk.stream has graduated to the beta tier, as gds.beta.randomWalk.stream.
- Random Walk has been improved and aligned with the Node2Vec implementation. Please consult the documentation to find out about the new configuration options.
- gds.alpha.randomWalk.stream has been removed.
- A memory estimation procedure, gds.beta.randomWalk.estimate, has been added.
The procedure gds.beta.fastRPExtended has been merged with gds.fastRP.
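
Client code that builds graph or model names dynamically can catch the trailing-whitespace breaking change before calling GDS. The following is a minimal sketch that assumes only what the note states (trailing whitespace is rejected). The helper name `check_identifier` and its error message are hypothetical and do not mirror GDS's own wording.

```python
def check_identifier(name: str, parameter: str = "graphName") -> str:
    """Raise ValueError if `name` ends with whitespace, else return it unchanged.

    Hypothetical client-side guard; GDS performs its own validation.
    """
    if name != name.rstrip():
        raise ValueError(
            f"`{parameter}` must not end with whitespace, got {name!r}"
        )
    return name


if __name__ == "__main__":
    print(check_identifier("my-graph"))
    try:
        check_identifier("my-graph ", parameter="graphName")
    except ValueError as error:
        print(error)
```

Rejecting the value rather than stripping it silently keeps the client consistent with the server-side behaviour described above.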

New features

Link Prediction
- Added a new link prediction stream procedure, gds.alpha.ml.pipeline.linkPrediction.predict.stream.
- Added probabilityDistribution and samplingStats to the result of gds.alpha.ml.pipeline.linkPrediction.predict.mutate.
- To improve prediction performance, we’ve added a kNN-based approximate search strategy option to the link prediction procedures gds.alpha.ml.pipeline.linkPrediction.predict.stream|mutate.
- Node property steps in Link Prediction pipelines can use a relationship property.
Node Classification pipelines: similar to link prediction pipelines, we’ve added a pipeline procedure for node classification, where users can define the features, splitting strategy, and model training options. We’ve added:
- gds.alpha.ml.pipeline.nodeClassification.create
- gds.alpha.ml.pipeline.nodeClassification.addNodeProperty
- gds.alpha.ml.pipeline.nodeClassification.selectFeatures
- gds.alpha.ml.pipeline.nodeClassification.configureParams
- gds.alpha.ml.pipeline.nodeClassification.configureSplit
- gds.alpha.ml.pipeline.nodeClassification.train
- gds.alpha.ml.pipeline.nodeClassification.predict.mutate|stream|write
New algorithm: Conductance, gds.alpha.conductance.stream, can be used to compute a metric to evaluate the quality of communities identified by community detection algorithms.
Added support for preserving a relationship property in gds.alpha.ml.splitRelationships.mutate.
The procedure gds.fastRP has received additional configuration parameters:
- featureProperties: to configure using node properties as part of the embedding.
- propertyRatio: to control how much of the embedding is computed from properties.
- nodeSelfInfluence: allows using each node's initial random vector as a contribution to the node's embedding. Especially useful for graphs with disconnected nodes.
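
To make the Conductance metric mentioned above concrete, here is a small Python sketch. It assumes the common definition, where the conductance of a community is the share of relationships starting in the community that end outside it. How gds.alpha.conductance.stream handles weights and direction may differ, and the function and variable names are illustrative only.

```python
from collections import defaultdict
from typing import Dict, Hashable, Iterable, Tuple


def conductance(
    relationships: Iterable[Tuple[Hashable, Hashable]],
    community_of: Dict[Hashable, Hashable],
) -> Dict[Hashable, float]:
    """Per-community ratio of outgoing relationships that leave the community."""
    leaving = defaultdict(int)
    total = defaultdict(int)
    for source, target in relationships:
        community = community_of[source]
        total[community] += 1
        if community_of[target] != community:
            leaving[community] += 1
    return {c: leaving[c] / total[c] for c in total}


if __name__ == "__main__":
    # Made-up toy graph: a triangle (community 0) linked once to a pair (community 1).
    rels = [("a", "b"), ("b", "c"), ("c", "a"), ("c", "d"), ("d", "e"), ("e", "d")]
    communities = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1}
    print(conductance(rels, communities))  # {0: 0.25, 1: 0.0}
```

Lower values indicate a more tightly knit community, since fewer of its relationships cross the community boundary.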

Bug fixes

Added a check that concurrency meets the determinism constraints for K-Nearest Neighbors whenever randomSeed is overridden.
Fixed an ArrayIndexOutOfBounds error that could happen in triangle count on some graphs with multiple relationship types.
Fixed an issue where seeded algorithms (such as WCC) on graphs with multiple node labels could assign seeded communities to new nodes.
Fixed an issue where KNN did not add candidates to the topK result.
Fixed an issue where running an algorithm could return incorrect results on graphs filtered with the configuration parameter nodeLabels.
Fixed an issue where running gds.alpha.ml.pipeline.linkPrediction.train could result in an error on graphs filtered with the configuration parameter nodeLabels.
Fixed an issue with unmapped Neo4j node ids throwing ArrayIndexOutOfBoundsException.
Fixed a bug where the in-memory storage engine would not find the correct graph store if the database name was not lowercase.
Fixed a bug where the graph store would be released when storing the CypherGraphStore in the catalog.
Fixed a bug where Node2Vec would produce an ArrayIndexOutOfBounds error on sufficiently large graphs.

Improvements

Added context information to log entries at the debug and warning levels.
Training loss is now logged as part of general progress logging.
Running transactions while projecting a graph now has less chance of breaking the projected graph.
Improved runtime performance for FastRP.
FastRP now uses the Neo4j node id instead of the internal GDS node id when seeding the generation of initial random vectors.
The in-memory Cypher database is now capable of querying relationship ids, types and properties.
The procedure gds.alpha.randomWalk.stream has been improved and should now be faster and more stable.

Assets 4

26 Nov 15:20

AliciaFrame

1.8.0-alpha04

acf8462

Graph Data Science 1.8.0-Preview

GDS 1.8 is compatible with Neo4j 4.1, 4.2, 4.3, and 4.4 but not Neo4j 3.5.x. For a 3.5 compatible release, please see GDS 1.1.7. For a 4.0 compatible release, please see GDS 1.6.5

Breaking changes

GDS now throws an error on identifiers with trailing whitespace to avoid input errors. This affects graphName, modelName, and several property parameters such as nodeWeightProperty or seedProperty.
We have removed the separate concurrency parameter from the model parameter space in gds.alpha.ml.nodeClassification.train, gds.alpha.ml.linkPrediction.train and gds.alpha.ml.pipeline.linkPrediction.configureParams. The concurrency value in the configuration of the train procedure will be used.
The procedure gds.alpha.randomWalk.stream has been improved and aligned with the Node2Vec implementation. Please consult the documentation to find out about the new configuration options.
The procedure gds.beta.fastRPExtended has been merged with gds.fastRP.

New features

Link Prediction
- Added a new link prediction stream procedure, gds.alpha.ml.pipeline.linkPrediction.predict.stream.
- Added probabilityDistribution and samplingStats to the result of gds.alpha.ml.pipeline.linkPrediction.predict.mutate.
- To improve prediction performance, we’ve added a kNN-based approximate search strategy option to the link prediction procedures gds.alpha.ml.pipeline.linkPrediction.predict.stream|mutate.
- Node property steps in Link Prediction pipelines can use a relationship property.
Node Classification pipelines: similar to link prediction pipelines, we’ve added a pipeline procedure for node classification, where users can define the features, splitting strategy, and model training options. We’ve added:
- gds.alpha.ml.pipeline.nodeClassification.create
- gds.alpha.ml.pipeline.nodeClassification.addNodeProperty
- gds.alpha.ml.pipeline.nodeClassification.addFeatures
- gds.alpha.ml.pipeline.nodeClassification.configureParams
- gds.alpha.ml.pipeline.nodeClassification.configureSplit
- gds.alpha.ml.pipeline.nodeClassification.train
- gds.alpha.ml.pipeline.nodeClassification.predict.mutate|stream|write
New algorithm: Conductance, gds.alpha.conductance.stream, can be used to compute a metric to evaluate the quality of communities identified by community detection algorithms.
Added support for preserving a relationship property in gds.alpha.ml.splitRelationships.mutate.
The procedure gds.fastRP has received additional configuration parameters:
- featureProperties: to configure using node properties as part of the embedding.
- propertyRatio: to control how much of the embedding is computed from properties.
- nodeSelfInfluence: allows using each node's initial random vector as a contribution to the node's embedding. Especially useful for graphs with disconnected nodes.

Bug fixes

Added a check that concurrency meets the determinism constraints for K-Nearest Neighbors whenever randomSeed is overridden.
Fixed an ArrayIndexOutOfBounds error that could happen in triangle count on some graphs with multiple relationship types.
Fixed an issue where seeded algorithms (such as WCC) on graphs with multiple node labels could assign seeded communities to new nodes.
Fixed an issue where KNN did not add candidates to the topK result.
Fixed an issue where running an algorithm could return incorrect results on graphs filtered with the configuration parameter nodeLabels.
Fixed an issue where running gds.alpha.ml.pipeline.linkPrediction.train could result in an error on graphs filtered with the configuration parameter nodeLabels.
Fixed an issue with unmapped Neo4j node ids throwing ArrayIndexOutOfBoundsException.
Fixed a bug where the in-memory storage engine would not find the correct graph store if the database name was not lowercase.
Fixed a bug where the graph store would be released when storing the CypherGraphStore in the catalog.
Fixed a bug where Node2Vec would produce an ArrayIndexOutOfBounds error on sufficiently large graphs.

Improvements

Added context information to log entries at the debug and warning levels.
Training loss is now logged as part of general progress logging.
Running transactions while projecting a graph now has less chance of breaking the projected graph.
Improved runtime performance for FastRP.
FastRP now uses the Neo4j node id instead of the internal GDS node id when seeding the generation of initial random vectors.
The in-memory Cypher database is now capable of querying relationship ids, types and properties.
The procedure gds.alpha.randomWalk.stream has been improved and should now be faster and more stable.

Assets 4

01 Nov 20:35

AliciaFrame

1.7.2

94ba4d1

Graph Data Science 1.7.2

GDS 1.7.2 is compatible with Neo4j 4.1, 4.2, and 4.3 but not Neo4j 3.5.x. For a 3.5 compatible release, please see GDS 1.1.7. For a 4.0 compatible release, please see GDS 1.6.5

Bug fixes

Fixed an issue where seeded algorithms (such as WCC) on graphs with multiple node labels could assign seeded communities to new nodes.
Fixed an issue where KNN did not add candidates to the topK result.
Fixed an issue where running an algorithm could return incorrect results on graphs filtered with the configuration parameter nodeLabels.
Fixed an issue where running gds.alpha.ml.pipeline.linkPrediction.train could result in an error on graphs filtered with the configuration parameter nodeLabels.
Fixed an issue with unmapped Neo4j node ids throwing ArrayIndexOutOfBoundsException

Assets 4

01 Nov 20:31

AliciaFrame

1.1.7

87ac369

GDS 1.1.7

GDS 1.1.7 is compatible with Neo4j 3.5.x. For a 4.x compatible release, please see GDS 1.7.2.

Bug fixes

Fixed a bug in Louvain where changes to maxIterations were ignored.
Fixed a bug which caused gds.graph.list and gds.graph.drop to throw an error when specifying a graph with duplicate property keys. The fix is to fail early on such graphs.
Fixed a bug where gds.alpha.scc would sometimes fail with an ArrayIndexOutOfBoundsException.
Fixed an issue where running an algorithm could return incorrect results on graphs filtered with the configuration parameter nodeLabels.

Assets 4

13 Oct 13:17

AliciaFrame

1.7.1

5d8a546

GDS 1.7.1

Release Date October 12, 2021

GDS 1.7.1 is compatible with Neo4j 4.1, 4.2, and 4.3 but not Neo4j 3.5.x. For a 3.5 compatible release, please see GDS 1.1.6. For a 4.0 compatible release, please see GDS 1.6.5

Fixed a bug in Cypher graph loading and subgraph creation which could lead to ArrayIndexOutOfBounds errors.
Fixed an ArrayIndexOutOfBounds error caused by running triangleCount on graphs with multiple relationship types.

Assets 4

23 Sep 18:32

AliciaFrame

1.7.0

06cc10a

Graph Data Science 1.7.0

GDS 1.7.0 is compatible with Neo4j 4.1, 4.2, and 4.3 but not Neo4j 3.5.x. For a 3.5 compatible release, please see GDS 1.1.6. For a 4.0 compatible release, please see GDS 1.6.5

Breaking changes

This release does not support Neo4j 4.0.x.
Aligned the returned modelInfo entry names of gds.alpha.ml.linkPrediction.train and gds.alpha.ml.nodeClassification.train with the model catalog. The result now contains modelName and modelInfo instead of name and info.
Removed the sharedUpdater parameter from gds.alpha.ml.linkPrediction and gds.alpha.ml.nodeClassification.
gds.beta.graph.export.csv now exports into a subdirectory called export. Previously, the exported graphs were written directly into the configured directory.
Renamed all graphalgo packages to gds.
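
Client code that reads the training results and has to work against releases before and after the modelInfo rename could normalise the entry names first. The sketch below assumes the results arrive as a plain Python dict (for example via a driver). The helper name `normalize_model_info` and the sample values are hypothetical.

```python
from typing import Any, Dict

# Old entry name -> new entry name, as listed in the breaking change above.
RENAMED_ENTRIES = {"name": "modelName", "info": "modelInfo"}


def normalize_model_info(entries: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of `entries` that uses the 1.7.0 entry names."""
    return {RENAMED_ENTRIES.get(key, key): value for key, value in entries.items()}


if __name__ == "__main__":
    old_style = {"name": "lp-model", "info": {"metrics": {}}}  # made-up sample
    print(normalize_model_info(old_style))
    # {'modelName': 'lp-model', 'modelInfo': {'metrics': {}}}
```

Entries that already use the new names pass through unchanged, so the same helper can be applied regardless of the GDS version in use.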

New features

New Algorithm: Approximate Maximum K-Cut
- Includes procedures: gds.alpha.maxkcut.[mutate|mutate.estimate|stream|stream.estimate].
Introduced Link Prediction Pipelines to make it easier to define and calculate features, split your graph, and make predictions.
- Includes procedures: gds.alpha.ml.pipeline.linkPrediction.create|addNodeProperty|addFeature|configureSplit|configureParams|train|predict.mutate.
Introduced support for exporting additional node properties, including strings, from the underlying database.
- Added additionalNodeProperties parameter to gds.graph.export
- Added additionalNodeProperties parameter to gds.graph.export.csv
Introduced experimental support for querying the in-memory graph with Cypher.
- Added gds.alpha.create.cypherdb to allow Neo4j to recognize the in-memory graph as a database for Cypher queries.
To help handle multiple concurrent users, we’ve added a system monitoring procedure, gds.alpha.systemMonitor, which provides an overview of the system's workload and available resources.
Progress logging is now turned on by default, and no longer requires changing your configuration settings. Progress can be accessed with gds.beta.listProgress.
GraphSAGE now supports deterministic results with the randomSeed configuration parameter to gds.beta.graphSage.train.
Improved performance (up to 20x speedup) of weakly connected components, gds.wcc, for undirected graphs by applying a subgraph sampling optimization.

Bug fixes

Fixed a bug regarding weighted graphs with multiple relationship types, which affected gds.beta.graphSage and gds.alpha.spanningTree.
Supervised Machine Learning (Node Classification & Link Prediction):
- Fixed a NaN issue in NodeClassification where computations with very small probability values could cause the result to flip to infinity.
- Fixed a bug in seeded NodeClassification and LinkPrediction which led to non-deterministic behaviour.
- Corrected the training size used in gds.alpha.ml.linkPrediction.train. This affects the penalty parameter used in logistic regression.
Progress Logging:
- Fixed a bug in beta progress event tracking where progress events would not be released if computation was abandoned before completion.
- Fixed a bug in beta progress event tracking for Pregel algorithms where progress events would not be released on algorithm completion.
Node Similarity & KNN:
- Fixed a bug where on a node-filtered multi-relationship-type graph KNN and NodeSimilarity could write out of bounds.
- Fixed a bug which affected gds.nodeSimilarity.write and gds.alpha.knn.write when being executed in combination with a nodeLabels filter. The bug either led to an exception or to wrong results due to an incorrect mapping between internal and Neo4j node ids.
- Fixed a bug where gds.nodeSimilarity.[write|mutate] and gds.beta.knn.[write|mutate] wrote duplicate relationships if the input graph is undirected.
KNN:
- Fixed a bug in gds.beta.knn where negative values in node properties of type float arrays failed when returning the similarityDistribution.
Fast RP:
- FastRP stream mode explicitly returns a list of floats rather than a list of numbers. This agrees with the other embeddings, and saves users from having to cast/transform when processing the results further in Cypher.
GraphSAGE:
- Fixed a bug in weighted GraphSAGE where the relationshipWeightProperty was not loaded.
- Fixed a bug in gds.beta.graphSage, where the concurrency parameter was not considered.
Graph Operations:
- Fixed a bug in gds.graph.removeNodeProperties where removedPropertiesWritten was too large for properties shared across multiple labels.
- Fixed a bug in gds.beta.graph.generate, where random graphs with relationship properties could not be generated.
- Fixed a bug in gds.create.subgraph which could lead to undefined behaviour or an ArrayIndexOutOfBounds exception when executed on GDS Enterprise Edition.
- Fixed a bug in gds.graph.create, where default values for array properties would throw for convertible types.

Improvements

Pathfinding: Added existence checks for sourceNode and targetNode to all shortest path procedures in the product tier.
Improved runtime of gds.fastRP via better workload balancing between threads.
Lowered the memory footprint for LinkPrediction and NodeClassification.
Improved the procedure output of gds.beta.listProgress.
Scaled down scores computed by gds.articleRank.
Assets 4

13 Sep 17:48

AliciaFrame

1.6.5

5d1ccca

Graph Data Science 1.6.5

GDS 1.6.5 is compatible with Neo4j 4.0, 4.1, 4.2, and 4.3 but not Neo4j 3.5.x. For a 3.5 compatible release, please see GDS 1.1.6.

Bug fixes

Fixed a bug in gds.beta.graph.generate, where random graphs with relationship properties could not be generated.
Fixed a bug in gds.graph.create, where default values for array properties would throw for convertible types.
Fixed a bug in gds.beta.graphSage, where the concurrency parameter was not considered.
Fixed a bug where the BitIdMap node mapping builder (on by default in GDS Enterprise Edition) would not correctly count all nodes in certain situations.
Corrected the training size used in gds.alpha.ml.linkPrediction.train. This affects the penalty parameter used in logistic regression.

Assets 4

09 Sep 22:10

AliciaFrame

1.7.0-alpha06

3de7fac

GDS 1.7.0-Preview

Pre-release

GDS 1.7.0-preview is compatible with Neo4j 4.1, 4.2, and 4.3 but not Neo4j 3.5.x. For a 3.5 compatible release, please see GDS 1.1.6. For a 4.0 compatible release, please see GDS 1.6.5.

Breaking changes

This release does not support Neo4j 4.0.x.
Aligned the returned modelInfo entry names of gds.alpha.ml.linkPrediction.train and gds.alpha.ml.nodeClassification.train with the model catalog. The result now contains modelName and modelInfo instead of name and info.
Removed the sharedUpdater parameter from gds.alpha.ml.linkPrediction and gds.alpha.ml.nodeClassification.
gds.beta.graph.export.csv now exports into a subdirectory called export. Previously, the exported graphs were written directly into the configured directory.
Renamed all graphalgo packages to gds.

New features

New Algorithm: Approximate Maximum K-Cut
- Includes procedures: gds.alpha.maxkcut.[mutate|mutate.estimate|stream|stream.estimate].
Introduced Link Prediction Pipelines to make it easier to define and calculate features, split your graph, and make predictions.
- Includes procedures: gds.alpha.ml.pipeline.linkPrediction.create|addNodeProperty|addFeature|configureSplit|configureParams|train|predict.mutate.
Introduced support for exporting additional node properties, including strings, from the underlying database.
- Added additionalNodeProperties parameter to gds.graph.export
- Added additionalNodeProperties parameter to gds.graph.export.csv
Introduced experimental support for querying the in-memory graph with Cypher.
- Added gds.alpha.create.cypherdb to allow Neo4j to recognize the in-memory graph as a database for Cypher queries.
To help handle multiple concurrent users, we’ve added a system monitoring procedure, gds.alpha.systemMonitor, which provides an overview of the system's workload and available resources.
Progress logging is now turned on by default, and no longer requires changing your configuration settings. Progress can be accessed with gds.beta.listProgress.
GraphSAGE now supports deterministic results with the randomSeed configuration parameter to gds.beta.graphSage.train.
Improved performance (up to 20x speedup) of weakly connected components, gds.wcc, for undirected graphs by applying a subgraph sampling optimization.

Bug fixes

Fixed a bug regarding weighted graphs with multiple relationship types, which affected gds.beta.graphSage and gds.alpha.spanningTree.
Supervised Machine Learning (Node Classification & Link Prediction):
- Fixed a NaN issue in NodeClassification where computations with very small probability values could cause the result to flip to infinity.
- Fixed a bug in seeded NodeClassification and LinkPrediction which led to non-deterministic behaviour.
- Corrected the training size used in gds.alpha.ml.linkPrediction.train. This affects the penalty parameter used in logistic regression.
Progress Logging:
- Fixed a bug in beta progress event tracking where progress events would not be released if computation was abandoned before completion.
- Fixed a bug in beta progress event tracking for Pregel algorithms where progress events would not be released on algorithm completion.
Node Similarity & KNN:
- Fixed a bug where on a node-filtered multi-relationship-type graph KNN and NodeSimilarity could write out of bounds.
- Fixed a bug which affected gds.nodeSimilarity.write and gds.alpha.knn.write when being executed in combination with a nodeLabels filter. The bug either led to an exception or to wrong results due to an incorrect mapping between internal and Neo4j node ids.
- Fixed a bug where gds.nodeSimilarity.[write|mutate] and gds.beta.knn.[write|mutate] wrote duplicate relationships if the input graph is undirected.
KNN:
- Fixed a bug in gds.beta.knn where negative values in node properties of type float arrays failed when returning the similarityDistribution.
Fast RP:
- FastRP stream mode explicitly returns a list of floats rather than a list of numbers. This agrees with the other embeddings, and saves users from having to cast/transform when processing the results further in Cypher.
GraphSAGE:
- Fixed a bug in weighted GraphSAGE where the relationshipWeightProperty was not loaded.
- Fixed a bug in gds.beta.graphSage, where the concurrency parameter was not considered.
Graph Operations:
- Fixed a bug in gds.graph.removeNodeProperties where removedPropertiesWritten was too large for properties shared across multiple labels.
- Fixed a bug in gds.beta.graph.generate, where random graphs with relationship properties could not be generated.
- Fixed a bug in gds.create.subgraph which could lead to undefined behaviour or an ArrayIndexOutOfBounds exception when executed on GDS Enterprise Edition.
- Fixed a bug in gds.graph.create, where default values for array properties would throw for convertible types.

Improvements

Pathfinding: Added existence checks for sourceNode and targetNode to all shortest path procedures in the product tier.
Improved runtime of gds.fastRP via better workload balancing between threads.
Lowered the memory footprint for LinkPrediction and NodeClassification.
Improved the procedure output of gds.beta.listProgress.
Scaled down scores computed by gds.articleRank.

Assets 4
