Releases: neo4j/graph-data-science
GDS 1.4.1
Release date: 7 December, 2020
GDS 1.4.1 is compatible with Neo4j 4.0 and 4.1, but not Neo4j 3.5.x. For a 3.5 compatible release, please see GDS 1.1.6.
Bug fixes
- Fixed a bug in progress logging for `gds.graph.writeNodeProperties()` and `gds.graph.writeRelationships()` where some percentages were missed or others were reported multiple times.
- Fixed a bug where `gds.graph.writeNodeProperties()` and `gds.alpha.shortestPathDeltaStepping.write()` were single-threaded by default.
- Fixed a bug where `gds.alpha.node2vec` ignored relationships for graphs with multiple projected relationship types.
- Fixed a bug where `gds.pagerank.*.estimate` would fail for very large node counts.
- Fixed a bug where using float array node properties (e.g. after running `gds.fastRP.mutate`) would fail in some situations.
- Fixed a bug where a graph with multiple labels and all nodes sharing at least one label could lead to either an exception or a wrongly mapped Neo4j id.
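The progress-logging fix above is about percentages being skipped or repeated. As a rough illustration of the idea, and not the GDS implementation, the following Python sketch reports every whole percentage exactly once regardless of batch size. The `PercentageLogger` class and its method names are hypothetical.

```python
class PercentageLogger:
    """Hypothetical helper: emits each whole percentage exactly once."""

    def __init__(self, total_tasks):
        if total_tasks <= 0:
            raise ValueError("total_tasks must be positive")
        self.total_tasks = total_tasks
        self.completed = 0
        self.last_reported = 0  # highest percentage already emitted
        self.messages = []

    def log_progress(self, tasks_done=1):
        """Record finished tasks and emit every newly reached percentage."""
        self.completed = min(self.completed + tasks_done, self.total_tasks)
        current = (self.completed * 100) // self.total_tasks
        # Emit every percentage between the last report and the current one,
        # so a large batch cannot skip values and a small one cannot repeat them.
        for pct in range(self.last_reported + 1, current + 1):
            self.messages.append(f"{pct}%")
        self.last_reported = max(self.last_reported, current)


logger = PercentageLogger(total_tasks=250)
for batch in (1, 1, 100, 48, 100):
    logger.log_progress(batch)
```

The sketch is single-threaded. A concurrent writer would additionally need to guard `completed` and `last_reported`, for example with a lock.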
Improvements
- `gds.pageRank` will now select batches more dynamically to properly respect the requested concurrency.
1.3.5
Release date: 23 November, 2020
GDS 1.3.5 is compatible with Neo4j 4.0 and 4.1, but not Neo4j 3.5.x or 4.2. For a 3.5 compatible release, please see GDS 1.1.6. For a 4.2 compatible release, please see GDS 1.4.0.
See also 1.3.0 release notes, 1.3.1 release notes, 1.3.2 release notes, 1.3.3 release notes, and 1.3.4 release notes.
Bug fixes
- Fixed a bug in `gds.graph.export` where at most one relationship property per relationship type would be exported.
- Fixed a bug in Louvain where changes to `maxIterations` were ignored.
- Fixed a bug where `gds.alpha.node2vec` would ignore relationships for graphs with multiple projected relationship types.
GDS 1.4.0
Release date: 5 November, 2020
GDS 1.4.0 is compatible with Neo4j 4.0 and 4.1, but not Neo4j 3.5.x. For a 3.5 compatible release, please see GDS 1.1.6.
Breaking changes
- License key configuration was renamed from `licenseFile` to `license_file` for consistency with Bloom.
- Removed the `sparsity` parameter from `gds.alpha.randomProjection.*`.
- Renamed `gds.alpha.randomProjection` to `gds.fastRP` due to productization.
- Renamed the `embeddingSize` parameter to `embeddingDimension` for FastRP, GraphSAGE and Node2Vec.
- Renamed `projectedFeatureSize` to `projectedFeatureDimension` for GraphSAGE.
- Renamed `nodePropertyNames` to `featureProperties` in `gds.beta.fastRPExtended` and `gds.beta.graphSage.train`.
- Default parameters for `gds.fastRP` have changed for the following configuration parameters:
  - `iterationWeights` now has default `[0.0, 1.0, 1.0]`
  - `normalizeL2` has been removed and its effect is always applied
- Removed alpha procedures for GraphSage (replaced with `beta` tier, see New Features section):
  - `gds.alpha.graphSage.stream`
  - `gds.alpha.graphSage.write`
- GraphSage no longer directly calculates embeddings. Instead, it has been split into `train` (to generate a named model) and `write`, `mutate`, and `stream` (to apply the model predictions to your data).
- Due to the creation of a `train` mode for GraphSage, the following configuration parameters were moved and are now configuration parameters of `gds.beta.graphSage.train`:
  - `embeddingSize`
  - `aggregator`
  - `activationFunction`
  - `sampleSizes`
  - `nodePropertyNames`
  - `tolerance`
  - `learningRate`
  - `epochs`
  - `maxIterations`
  - `searchDepth`
  - `negativeSampleWeight`
  - `degreeAsProperty`
- The `gds.beta.graphSage.stream` procedure now requires the `modelName` configuration parameter.
- The `gds.beta.graphSage.write` procedure now requires the `modelName` configuration parameter.
- Removed `startLoss` and `epochLosses` from the result columns of `gds.beta.graphSage.write`.
- Added the graph create config as a return field to the train procedure, affecting `gds.beta.graphSage.train`.
- Fixed result column name `embeddings` to `embedding` in GraphSAGE, to align with the other embeddings.
- Removed configuration parameter `maxCost` from `gds.alpha.bfs/dfs`.
- Unlocking the Enterprise Edition of the Graph Data Science library requires a license key. The previous config setting has been removed.
- Removed `degreeDistribution` from `gds.graph.drop` return columns.
- `gds.pageRank` now respects the concurrency setting. It will not run if there is insufficient memory for the given concurrency setting.
- Alpha similarity algorithms no longer accept graph name as a parameter. The algorithm never used the named graph, and now the possibility to specify one is removed.
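The parameter renames and removals listed above can be applied mechanically to configuration maps written for earlier releases. The following Python sketch shows one way to do so. `migrate_config` is a hypothetical helper, not something GDS ships, and it applies every rename to every map, whereas the release notes scope each rename to specific algorithms.

```python
# Renames and removals taken from the breaking-changes list above.
RENAMED = {
    "embeddingSize": "embeddingDimension",
    "projectedFeatureSize": "projectedFeatureDimension",
    "nodePropertyNames": "featureProperties",
}
REMOVED = {"sparsity", "normalizeL2"}


def migrate_config(old_config):
    """Return a copy of a pre-1.4 configuration map using 1.4 parameter names.

    Hypothetical helper for illustration only.
    """
    new_config = {}
    for key, value in old_config.items():
        if key in REMOVED:
            continue  # parameter no longer exists in 1.4
        new_config[RENAMED.get(key, key)] = value
    return new_config


old = {"embeddingSize": 128, "sparsity": 3, "normalizeL2": True, "maxIterations": 10}
new = migrate_config(old)
```

The original map is left untouched, so the old and new configurations can be compared side by side before switching procedure calls over.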
New features
- Promoted GraphSage to `beta` tier and added support for inductive models with the `train` mode.
  - This adds procedures:
    - `gds.beta.graphSage.mutate`
    - `gds.beta.graphSage.mutate.estimate`
    - `gds.beta.graphSage.stream`
    - `gds.beta.graphSage.stream.estimate`
    - `gds.beta.graphSage.train`
    - `gds.beta.graphSage.train.estimate`
    - `gds.beta.graphSage.write`
    - `gds.beta.graphSage.write.estimate`
  - And removes alpha procedures:
    - `gds.alpha.graphSage.stream`
    - `gds.alpha.graphSage.write`
- GraphSage supports relationship weights, driven by `relationshipWeightProperty`.
- GraphSage supports node labels via `projectedFeatureSize`.
- Introduced the model catalog to manage trained models, including:
  - `gds.beta.model.exists` - a procedure to check if a model exists in the catalog
  - `gds.beta.model.list` - list all available models
  - `gds.beta.model.drop` - removes a model from the catalog
- The Random Projection algorithm has been promoted to the product tier and we have added:
  - `gds.fastRP.stats`
  - `gds.fastRP.mutate`
  - `gds.fastRP.estimate`
  - Added procedures for `stats` and `mutate` mode, as well as estimates for all modes.
- FastRP has been extended to support relationship weights and directions.
- FastRP supports integer configuration for iteration weights.
- We’ve added support for node property features for FastRP in the beta namespace with FastRPExtended:
  - `gds.beta.fastRPExtended.mutate`
  - `gds.beta.fastRPExtended.stream`
  - `gds.beta.fastRPExtended.stats`
  - `gds.beta.fastRPExtended.write`
  - `gds.beta.fastRPExtended.mutate.estimate`
  - `gds.beta.fastRPExtended.stream.estimate`
  - `gds.beta.fastRPExtended.stats.estimate`
  - `gds.beta.fastRPExtended.write.estimate`
- We’ve added the K-Nearest Neighbors (KNN) algorithm to the beta tier:
  - `gds.beta.knn.mutate` and `gds.beta.knn.mutate.estimate`
  - `gds.beta.knn.stats` and `gds.beta.knn.stats.estimate`
  - `gds.beta.knn.stream` and `gds.beta.knn.stream.estimate`
  - `gds.beta.knn.write` and `gds.beta.knn.write.estimate`
- The in-memory graph now supports list properties, so embedding results can be stored in memory and embeddings can be loaded from nodes for KNN or similarity calculations.
- Pregel framework:
  - Added Pregel annotation processor to generate GDS procedures for custom Pregel algorithms.
  - Pregel now supports long and double array node values.
  - Added support for composite node state to allow complex data types on nodes.
  - Reduced memory consumption.
  - Improved memory estimation.
  - Simplified message iteration in `compute` methods.
  - Split context into Init- and ComputeContext and simplified API.
  - Removed `K1ColoringExample` standalone project.
  - Added `pregel-bootstrap` standalone project.
  - Added `pregel-examples` module.
- Licensing: GDS Enterprise edition now requires license keys issued by Neo4j to unlock enterprise features.
- Added a `density` property to the output for each graph in `graph.list`.
- Added a `failIfMissing` flag to `gds.graph.drop`.
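The release notes do not state how the `density` value is computed. As a point of reference only, the sketch below uses the common definition for a directed graph without self-loops, the number of relationships divided by the number of possible ordered node pairs. Both the formula choice and the `graph_density` function name are assumptions for illustration.

```python
def graph_density(node_count, relationship_count):
    """Density of a directed graph without self-loops: m / (n * (n - 1)).

    Assumed definition for illustration; not taken from the GDS source.
    """
    if node_count < 2:
        return 0.0  # no relationships are possible between distinct nodes
    return relationship_count / (node_count * (node_count - 1))


density = graph_density(node_count=4, relationship_count=6)
```

With 4 nodes there are 12 possible directed relationships, so 6 relationships fill half of them.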
Bug fixes
- Pregel:
  - Fixed a bug in Pregel that could lead to incorrect results when running in parallel.
  - Fixed a cast exception when returning array node properties in generated Pregel procedures.
- Fixed a bug in a multi-source BFS traversal strategy that could affect the following procedures:
  - `gds.alpha.closeness`
  - `gds.alpha.closeness.harmonic`
  - `gds.alpha.allShortestPaths`
- Fixed a bug in `gds.alpha.shortestPath.deltaStepping` where large relationship weights led to incorrect results.
- Weakly connected components:
  - Fixed a bug in WCC where `componentCount` would be negative when the graph is empty.
  - Fixed a regression where WCC could run more slowly with increased concurrency.
- Fixed bugs in Louvain:
  - `communityCount` is no longer negative when the graph is empty.
  - Changes to `maxIterations` are no longer ignored.
- Fixed a bug in LabelPropagation where `communityCount` would be negative when the graph is empty.
- Fixed a bug in KNN where it failed when run on graphs with filtere...
GDS 1.4 Preview
Breaking changes
- Removed the `sparsity` parameter from `gds.alpha.randomProjection.*`.
- Renamed `gds.alpha.randomProjection` to `gds.fastRP` due to productization.
- Renamed the `embeddingSize` parameter to `embeddingDimension` for FastRP, GraphSAGE and Node2Vec.
- Default parameters for `gds.fastRP` have changed for the following configuration parameters:
  - `iterationWeights` now has default `[0.0, 1.0, 1.0]`
  - `normalizeL2` has been removed and its effect is always applied
- Removed alpha procedures for GraphSage (replaced with `beta` tier, see New Features section):
  - `gds.alpha.graphSage.stream`
  - `gds.alpha.graphSage.write`
- GraphSage no longer directly calculates embeddings. Instead, it has been split into `train` (to generate a named model) and `write`, `mutate`, and `stream` (to apply the model predictions to your data).
- Due to the creation of a `train` mode for GraphSage, the following configuration parameters were moved and are now configuration parameters of `gds.beta.graphSage.train`:
  - `embeddingSize`
  - `aggregator`
  - `activationFunction`
  - `sampleSizes`
  - `nodePropertyNames`
  - `tolerance`
  - `learningRate`
  - `epochs`
  - `maxIterations`
  - `searchDepth`
  - `negativeSampleWeight`
  - `degreeAsProperty`
- The `gds.beta.graphSage.stream` procedure now requires the `modelName` configuration parameter.
- The `gds.beta.graphSage.write` procedure now requires the `modelName` configuration parameter.
- Removed `startLoss` and `epochLosses` from the result columns of `gds.beta.graphSage.write`.
- Added the graph create config as a return field to the train procedure, affecting `gds.beta.graphSage.train`.
- Fixed result column name `embeddings` to `embedding` in GraphSAGE, to align with the other embeddings.
- Removed configuration parameter `maxCost` from `gds.alpha.bfs/dfs`.
- Unlocking the Enterprise Edition of the Graph Data Science library requires a license key. The previous config setting has been removed.
- Removed `degreeDistribution` from `gds.graph.drop` return columns.
- `gds.pageRank` now respects the concurrency setting. It will not run if there is insufficient memory for the given concurrency setting.
- Alpha similarity algorithms no longer accept graph name as a parameter. The algorithm never used the named graph, and now the possibility to specify one is removed.
New features
- Promoted GraphSage to `beta` tier and added support for inductive models with the `train` mode.
  - This adds procedures:
    - `gds.beta.graphSage.mutate`
    - `gds.beta.graphSage.mutate.estimate`
    - `gds.beta.graphSage.stream`
    - `gds.beta.graphSage.stream.estimate`
    - `gds.beta.graphSage.train`
    - `gds.beta.graphSage.train.estimate`
    - `gds.beta.graphSage.write`
    - `gds.beta.graphSage.write.estimate`
  - And removes alpha procedures:
    - `gds.alpha.graphSage.stream`
    - `gds.alpha.graphSage.write`
- GraphSage supports relationship weights, driven by `relationshipWeightProperty`.
- GraphSage supports node labels via `projectedFeatureSize`.
- Introduced the model catalog to manage trained models, including:
  - `gds.beta.model.exists` - a procedure to check if a model exists in the catalog
  - `gds.beta.model.list` - list all available models
  - `gds.beta.model.drop` - removes a model from the catalog
- The Random Projection algorithm has been promoted to the product tier and we have added:
  - `gds.fastRP.stats`
  - `gds.fastRP.mutate`
  - `gds.fastRP.estimate`
  - Added procedures for `stats` and `mutate` mode, as well as estimates for all modes.
- FastRP has been extended to support relationship weights and directions.
- FastRP supports integer configuration for iteration weights.
- We’ve added support for node property features for FastRP in the beta namespace with FastRPExtended:
  - `gds.beta.fastRPExtended.mutate`
  - `gds.beta.fastRPExtended.stream`
  - `gds.beta.fastRPExtended.stats`
  - `gds.beta.fastRPExtended.write`
  - `gds.beta.fastRPExtended.mutate.estimate`
  - `gds.beta.fastRPExtended.stream.estimate`
  - `gds.beta.fastRPExtended.stats.estimate`
  - `gds.beta.fastRPExtended.write.estimate`
- We’ve added the K-Nearest Neighbors (KNN) algorithm to the beta tier:
  - `gds.beta.knn.mutate` and `gds.beta.knn.mutate.estimate`
  - `gds.beta.knn.stats` and `gds.beta.knn.stats.estimate`
  - `gds.beta.knn.stream` and `gds.beta.knn.stream.estimate`
  - `gds.beta.knn.write` and `gds.beta.knn.write.estimate`
- The in-memory graph now supports list properties, so embedding results can be stored in memory and embeddings can be loaded from nodes for KNN or similarity calculations.
- Pregel framework:
  - Added Pregel annotation processor to generate GDS procedures for custom Pregel algorithms.
  - Pregel now supports long and double array node values.
  - Added support for composite node state to allow complex data types on nodes.
  - Reduced memory consumption.
  - Improved memory estimation.
  - Simplified message iteration in `compute` methods.
  - Split context into Init- and ComputeContext and simplified API.
  - Removed `K1ColoringExample` standalone project.
  - Added `pregel-bootstrap` standalone project.
  - Added `pregel-examples` module.
- Licensing: GDS Enterprise edition now requires license keys issued by Neo4j to unlock enterprise features.
- Added a `density` property to the output for each graph in `graph.list`.
- Added a `failIfMissing` flag to `gds.graph.drop`.
Bug fixes
- Pregel:
  - Fixed a bug in Pregel that could lead to incorrect results when running in parallel.
  - Fixed a cast exception when returning array node properties in generated Pregel procedures.
- Fixed a bug in a multi-source BFS traversal strategy that could affect the following procedures:
  - `gds.alpha.closeness`
  - `gds.alpha.closeness.harmonic`
  - `gds.alpha.allShortestPaths`
- Weakly connected components:
  - Fixed a bug in WCC where `componentCount` would be negative when the graph is empty.
  - Fixed a regression where WCC could run more slowly with increased concurrency.
- Fixed bugs in Louvain:
  - `communityCount` is no longer negative when the graph is empty.
  - Changes to `maxIterations` are no longer ignored.
- Fixed a bug in LabelPropagation where `communityCount` would be negative when the graph is empty.
- Fixed a bug in `gds.graph.export` where at most one relationship property per relationship type would be exported.
- Graph loading:
  - Fixed a bug where using node label projections including properties on large graphs and high concurrency could lead to loss of some properties.
  - Fixed a bug in graph creation which could cause an AIOOB exception during node loading.
  - The `readConcurrency` config parameter can no longer be overwritten by the `concurrency` param when it is explicitly set in an implicit graph creation config.
- Fixed a bug in memory estimation of large anonymous fictitious graphs.
- Fixed a bug in `gds.alpha.dfs/bfs`, where the algorithm did not terminate for graphs containing loops.
- Fixed result column name `embeddings` to `embedding` in GraphSAGE, to align with the other embeddings.
- Fixed a bug in Node2Vec where many disconnected nodes would cause a StackOverflowError.
- Fixed a bug in RandomProjection where each iteration weight was multiplied by all previous iteration weights.
- Similarity algorithms:
  - Fixed a bug where Alpha Similarity algorithms would load a graph even though it was not needed.
  - Fixed a bug where similarity algorithms would not remove the placeholder graph if config validation fails on invalid user input.
- Fixed a bug where community statistic computation could overflow for large community ids.
- Fixed a bug where DegreeCentrality returned incorrect values when concurrency > 1.
- Fixed a bug where ClosenessCentrality was using a slightly incorrect formula for the Wasserman-Faust algorithm.
- Fixed a bug that affected `gds.triangleCount()` and `gds.alpha.triangles()` where not all triangles would be counted under certain conditions.
- Parallel edges in a graph no longer lead to incorrect Local Clustering Coefficient and Triangle Count results.
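The last fix concerns parallel edges skewing triangle counts. The Python sketch below illustrates why: if two nodes are connected by several relationships, a naive count can see the same triangle more than once, whereas collapsing parallel edges into neighbor sets counts it once. This is a generic illustration under that assumption and not how GDS implements triangle counting. `count_triangles` is a hypothetical function.

```python
from itertools import combinations


def count_triangles(edges):
    """Count triangles in an undirected graph, ignoring parallel edges and self-loops."""
    neighbors = {}
    for u, v in edges:
        if u == v:
            continue  # self-loops cannot form triangles
        # Sets collapse parallel edges between the same pair of nodes.
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)

    triangles = 0
    for node, adjacent in neighbors.items():
        # Only look at neighbors with a larger id so each triangle is counted once.
        higher = sorted(n for n in adjacent if n > node)
        for a, b in combinations(higher, 2):
            if b in neighbors[a]:
                triangles += 1
    return triangles


# One triangle (0, 1, 2) with the 0-1 relationship stored twice.
edges = [(0, 1), (0, 1), (1, 2), (2, 0), (2, 3)]
triangle_count = count_triangles(edges)
```

The same deduplicated neighbor sets could feed a local clustering coefficient, since that metric divides a node's triangle count by the number of possible neighbor pairs.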
Improvements
- `gds.fastRP` now accepts integer `iterationWeights`.
- If `graphSage.train` is run on a graph without relationships, GDS now fails gracefully with an appropriate error message.
- Added validation that properties used by GraphSage exist on the graph.
- Added validation for `embeddingSize` >= 1.
- Added a `failIfExists` flag to graph creation to enable a user to specify that if a graph already exists, it should be overwritten without failing.
- Progress logging:
  - We now log progress in equally spaced percentages. This is 0-100% either in steps of 1, or in ...
1.4.0-alpha06
Tagging for 1.4.0-alpha06
1.3.4
Release date: 1 October, 2020
GDS 1.3.4 is compatible with Neo4j 4.0 and 4.1, but not Neo4j 3.5.x. For a 3.5 compatible release, please see GDS 1.1.6.
See also 1.3.0 release notes, 1.3.1 release notes, 1.3.2 release notes, and 1.3.3 release notes
Bug fixes
- Fixed a bug where node label projections that included properties, on large graphs with high concurrency, failed to load all properties.
1.1.6
Release date: 1 October, 2020
GDS 1.1.6 is compatible with Neo4j 3.5.9 and above, but not Neo4j 4.x. For a 4.x compatible release, please see GDS 1.3.4.
Bug fixes:
- Fixed a bug in memory estimation for large, anonymous, fictitious graphs
- The `readConcurrency` config parameter can no longer be overwritten by the `concurrency` parameter when it is set explicitly in an implicit graph creation config.
- Fixed a bug where node label projections including properties, in situations with large graphs and high concurrency, could lead to the loss of some properties in the in-memory graph.
1.3.3
Release date: 22 September, 2020
GDS 1.3.3 is compatible with Neo4j 4.0 and 4.1, but not Neo4j 3.5.x. For a 3.5 compatible release, please see GDS 1.1.5.
See also 1.3.0 release notes, 1.3.1 release notes, and 1.3.2 release notes
Bug fixes
- Fixed a bug in memory estimation on large, anonymous fictitious graphs
- The `readConcurrency` configuration parameter cannot be overwritten by the `concurrency` parameter when it is explicitly set during graph creation.
- Fixed a bug in `gds.triangleCount` and `gds.alpha.triangles` where not all triangles were being counted under certain conditions.
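The `readConcurrency` fix above describes a precedence rule: an explicitly set `readConcurrency` must win over `concurrency`. A minimal Python sketch of that rule follows. The `resolve_read_concurrency` function and the default value of 4 are assumptions made for illustration, and the fallback to `concurrency` when `readConcurrency` is absent is inferred from the wording of the fix rather than stated in it.

```python
DEFAULT_CONCURRENCY = 4  # assumed default, for illustration only


def resolve_read_concurrency(config):
    """Pick the concurrency used for graph loading.

    An explicitly set readConcurrency always wins; otherwise fall back to
    concurrency, and finally to the assumed default.
    """
    if "readConcurrency" in config:
        return config["readConcurrency"]
    return config.get("concurrency", DEFAULT_CONCURRENCY)


explicit = resolve_read_concurrency({"readConcurrency": 2, "concurrency": 8})
inherited = resolve_read_concurrency({"concurrency": 8})
```

Checking for the key's presence, rather than comparing against a default value, is what keeps an explicit setting from being overwritten even when it happens to equal the default.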
Improvements
- Improved memory estimation for `*` node projections (loading all nodes regardless of label)
1.3.2
Release date: August 20, 2020
Bug fixes
- Fixed a bug in RandomProjection where effectively each iteration weight was multiplied by all previous iteration weights.
- Fixed bug in graph creation which could cause an AIOOB exception during graph creation.
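To make the RandomProjection fix concrete, the Python sketch below contrasts the intended weighting, where each iteration's contribution is scaled only by its own weight, with one reading of the described bug, where the weights compound across iterations. It uses toy scalar contributions rather than real embedding vectors, and both function names are hypothetical.

```python
def combine_intended(iteration_values, weights):
    """Weight each iteration's contribution by its own weight only."""
    return sum(w * x for w, x in zip(weights, iteration_values))


def combine_buggy(iteration_values, weights):
    """Mimic the described bug: each weight is multiplied by all previous ones."""
    total = 0.0
    running = 1.0
    for w, x in zip(weights, iteration_values):
        running *= w  # accumulates the product of all weights so far
        total += running * x
    return total


values = [1.0, 1.0, 1.0]   # toy per-iteration contributions
weights = [0.5, 2.0, 4.0]  # toy iteration weights
intended = combine_intended(values, weights)
buggy = combine_buggy(values, weights)
```

Under this reading, a single zero weight would silence every later iteration, since the running product can never recover from zero.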
GDS 1.1.5
Release date: August 20, 2020
Bug fixes
- Fixed bug in graph creation which could cause an AIOOB exception during graph creation.