Releases: neo4j/graph-data-science
Graph Data Science 2.2.1
GDS 2.2.1 is compatible with Neo4j 4.3 versions ≥ 4.3.15 and 4.4 versions ≥ 4.4.9.
For GDS compatibility with previous releases of 4.3 and 4.4, please see GDS 2.1.6. The 2.1 series is also incompatible with Neo4j 3.5.x, 4.0, 4.1, and 4.2. For a 3.5 compatible release, please see GDS 1.1.7. For a 4.0 compatible release, please see GDS 1.6.5. For a 4.1 or 4.2 compatible release, please see GDS 1.8.8.
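The compatibility rules above can be read as a small lookup. The following is a minimal Python sketch of that reading, not part of GDS: the function name `recommended_gds` is hypothetical, and a version string without a patch number is assumed to mean patch 0.

```python
import re


def recommended_gds(neo4j_version: str) -> str:
    """Return the GDS release these notes point to for a Neo4j version.

    Hypothetical helper; it only encodes the compatibility statements above.
    """
    match = re.match(r"^(\d+)\.(\d+)(?:\.(\d+))?", neo4j_version)
    if match is None:
        raise ValueError(f"unrecognised Neo4j version: {neo4j_version!r}")
    major, minor = int(match.group(1)), int(match.group(2))
    # Assumption: a missing patch number is treated as patch 0.
    patch = int(match.group(3)) if match.group(3) is not None else 0

    # Minimum patch level that GDS 2.2.1 requires on each supported minor line.
    min_patch = {(4, 3): 15, (4, 4): 9}
    if (major, minor) in min_patch:
        return "2.2.1" if patch >= min_patch[(major, minor)] else "2.1.6"
    if (major, minor) in {(4, 1), (4, 2)}:
        return "1.8.8"
    if (major, minor) == (4, 0):
        return "1.6.5"
    if (major, minor) == (3, 5):
        return "1.1.7"
    raise ValueError(f"no GDS release listed for Neo4j {neo4j_version}")
```

For example, `recommended_gds("4.4.8")` falls below the 4.4.9 minimum and therefore yields the 2.1.6 fallback named above.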
Breaking changes
- Change the content of some fields from the output of `gds.debug.arrow`:
  - `listenAddress` now always returns the same content as `advertisedListenAddress`
  - `serverLocation` always returns `NULL`
Graph Data Science 2.2.0
GDS 2.2.0 is compatible with Neo4j 4.3 versions ≥ 4.3.15 and 4.4 versions ≥ 4.4.9.
For GDS compatibility with previous releases of 4.3 and 4.4, please see GDS 2.1.6. The 2.1 series is also incompatible with Neo4j 3.5.x, 4.0, 4.1, and 4.2. For a 3.5 compatible release, please see GDS 1.1.7. For a 4.0 compatible release, please see GDS 1.6.5. For a 4.1 or 4.2 compatible release, please see GDS 1.8.8.
Breaking changes
- Link Prediction filtering:
  - Change graph filtering in `gds.beta.pipeline.linkPrediction.train`
    - Replace parameter `nodeLabels` with `sourceNodeLabel` and `targetNodeLabel`.
    - Replace parameter `relationshipTypes` with `targetRelationshipType`.
  - Change graph filtering in `gds.beta.pipeline.linkPrediction.predict`
    - Replace parameter `nodeLabels` with optional `sourceNodeLabels` and `targetNodeLabels`. By default, they will be derived from the model's train configuration.
    - Change the default value for `relationshipTypes` to the `targetRelationshipType` from the model's train configuration.
- Node Classification & Regression filtering:
  - Change graph filtering in `gds.beta.pipeline.nodeClassification.train` and `gds.beta.pipeline.nodeRegression.train`
    - Replace parameter `nodeLabels` with `targetNodeLabels`
  - Change graph filtering in `gds.beta.pipeline.nodeClassification.predict` and `gds.beta.pipeline.nodeRegression.predict`
    - Replace parameter `nodeLabels` with `targetNodeLabels`. By default, they will be derived from the model's train configuration.
- Promoting Collapse Path to beta tier
  - Changed the procedure name to `gds.beta.collapsePath.mutate`
  - Use parameter `pathTemplates` to now specify multiple _path templates_.
- Promoting CELF to `beta` tier
  - Moved `gds.alpha.influenceMaximization.celf.stream` to `gds.beta.influenceMaximization.celf.stream`
- For graphs created with `gds.graph.project.cypher`, reduce output of `gds.graph.list` to only print the names of `parameters`. This avoids printing the parameter values, which could lead to long procedure execution times.
- RandomWalk algorithm promoted to product tier
  - `gds.beta.randomWalk.stats` => `gds.randomWalk.stats`
  - `gds.beta.randomWalk.stats.estimate` => `gds.randomWalk.stats.estimate`
  - `gds.beta.randomWalk.stream` => `gds.randomWalk.stream`
  - `gds.beta.randomWalk.stream.estimate` => `gds.randomWalk.stream.estimate`
- Removed `debug_log` config field from Arrow Create Database action.
- Node2Vec uses new embedding initializer `NORMALIZED` as default.
- Dropped support for older patches:
  - for 4.3, only 4.3.15 and later is supported
  - for 4.4, only 4.4.9 and later is supported
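The procedure renames listed above are mechanical, so existing Cypher strings can be updated with a lookup table. Below is a minimal Python sketch under that assumption; `RENAMES` and `migrate_query` are hypothetical names, and the sketch only rewrites procedure names. Renamed configuration parameters such as `nodeLabels` still need to be migrated by hand.

```python
import re

# Old procedure name -> new procedure name, as listed in the breaking changes.
RENAMES = {
    "gds.beta.randomWalk.stats": "gds.randomWalk.stats",
    "gds.beta.randomWalk.stats.estimate": "gds.randomWalk.stats.estimate",
    "gds.beta.randomWalk.stream": "gds.randomWalk.stream",
    "gds.beta.randomWalk.stream.estimate": "gds.randomWalk.stream.estimate",
    "gds.alpha.influenceMaximization.celf.stream":
        "gds.beta.influenceMaximization.celf.stream",
}

# Matches a complete dotted name starting with "gds", e.g. "gds.graph.list".
_PROCEDURE_NAME = re.compile(r"\bgds(?:\.\w+)+")


def migrate_query(cypher: str) -> str:
    """Rewrite renamed GDS procedure names; leave everything else untouched."""
    return _PROCEDURE_NAME.sub(
        lambda m: RENAMES.get(m.group(0), m.group(0)), cypher
    )
```

Matching the complete dotted name, rather than replacing prefixes, keeps procedures that are not in the table unchanged.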
New features
- Link Prediction filtering:
  - Supports heterogeneous LinkPrediction pipelines by allowing configuring which node labels and relationship type to train and predict for.
  - See Breaking changes above for more details.
- K-means:
  - Added centroids and average node-centroid distance to result for Mutate, Stats, and Write modes.
  - Added distance to centroid per node result in Stream mode.
  - Introduced a parameter `numberOfRestarts` that runs K-Means multiple times and picks the one with the minimum node-centroid distance.
  - Introduced a parameter `computeSilhouette` that if enabled will compute silhouette related metrics.
  - Introduced a parameter `initialSampler` which can select different sampling strategies for picking the first centroids.
    - Added the `K-means++` initialization algorithm which can be enabled by setting `initialSampler=kmeans++`.
  - Introduced the parameter `seedCentroids` which seeds input centroids to k-means (in negation of the above).
- Introduced a new scaler `Center` for `ScaleProperties` that subtracts the mean from each value.
- Expose `penaltyL2` to configure the l2 regularization term to the loss function in `gds.beta.graphSage.train`.
- Add Multilayer Perceptron as a training method for node classification (`gds.alpha.pipeline.nodeClassification.addMLP`) and link prediction (`gds.alpha.pipeline.linkPrediction.addMLP`).
- Add `SAME_CATEGORY` feature type to `gds.beta.pipeline.linkPrediction.addFeature`.
- Added new procedure `gds.beta.graph.relationships.stream` that streams relationship topology.
- Added arrow export endpoint `gds.beta.graph.relationships.stream` that streams relationship topology.
- Added new procedure `gds.alpha.graph.sample.rwr` that creates a new graph projection by sampling using random walk with restarts.
- Added the ability to collapse multiple paths using `gds.beta.collapsePath.mutate`.
- Promoting CELF algorithm to `beta` tier.
  - Added `gds.beta.influenceMaximization.celf.stats`
  - Added `gds.beta.influenceMaximization.celf.mutate`
  - Added `gds.beta.influenceMaximization.celf.write`
  - Added progress tracking capabilities.
  - Added memory estimation.
- Progress tracking for KMeans algorithm.
- Memory estimation for KMeans.
  - added `gds.alpha.kmeans.mutate.estimate`
  - added `gds.alpha.kmeans.stats.estimate`
  - added `gds.alpha.kmeans.stream.estimate`
  - added `gds.alpha.kmeans.write.estimate`
- Added procedure to compute modularity for pre-computed communities.
  - `gds.alpha.modularity.stats`
  - `gds.alpha.modularity.stream`
- Added new config options to the GDS Flight server.
  - `gds.arrow.encryption.never` deactivates the server encryption even if it would otherwise be enabled.
  - `gds.arrow.advertised_listen_address` sets the server location that clients should connect to.
- Added support for importing `String` node identifiers for the Arrow `CREATE_DATABASE` action.
- Added capability to run BetweennessCentrality using relationship weights.
  - Added `relationshipWeightProperty` optional configuration parameter.
- Added `stats` mode procedures for RandomWalk.
  - `gds.beta.randomWalk.stats`
  - `gds.beta.randomWalk.stats.estimate`
- Introduced the ability to configure defaults and limits for configuration parameters.
  - `gds.alpha.config.defaults.list`
  - `gds.alpha.config.defaults.set`
  - `gds.alpha.config.limits.list`
  - `gds.alpha.config.limits.set`
- Introduce new configuration parameters `contextNodeLabels` and `contextRelationshipTypes` in nodePropertySteps.
  - `gds.beta.pipeline.linkPrediction.addNodeProperty`
  - `gds.beta.pipeline.nodeClassification.addNodeProperty`
  - `gds.alpha.pipeline.nodeRegression.addNodeProperty`
  - The context is used to enlarge the input graph to the node property steps when running `gds.beta.pipeline.linkPrediction.addNodeProperty.[train|predict]`, `gds.beta.pipeline.nodeClassification.[train|predict]` and `gds.alpha.pipeline.nodeRegression.[train|predict]`.
- Leiden
  - Add capability to mutate `intermediateCommunities` when `includeIntermediateCommunities` is set to `true`.
  - Add capability to write `intermediateCommunities` when `includeIntermediateCommunities` is set to `true`.
- Node2Vec adds new embedding initializer `NORMALIZED` configured with the parameter `embeddingInitializer`.
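As an illustration of the `Center` scaler mentioned in the list above, which "subtracts the mean from each value", here is a minimal Python sketch of that arithmetic. It is not the GDS implementation; the function name `center` and the handling of an empty input are assumptions made for the example.

```python
from statistics import fmean


def center(values):
    """Subtract the mean of `values` from each value (illustrative only)."""
    if not values:
        # Assumption: an empty property column scales to an empty result.
        return []
    mean = fmean(values)
    return [v - mean for v in values]
```

After centering, the values keep their spread but have a mean of zero, which is the only transformation the release note describes for this scaler.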
Bug fixes
- Fixed a bug where eager checking for business rules around GDS on a Neo4j cluster would cause the cluster to fail to start.
- Fixed a bug where Neo4j users with `admin` role could not see all graphs in the catalog on GDS enterprise.
- Fixed a bug in random graph generation where the resulting graph can end up with an incorrect relationship schema.
- Fixed a bug where a schema filter would not create a deep copy of the property schema map.
- Fixed a bug where modularity could have been incorrectly updated in ModularityOptimization. This may affect the number of performed iterations for ModularityOptimization or number of levels for Louvain.
- Fixed a bug where restoring from csv could not read values wrapped in quotes.
- Fixed a bug where KNN did not use the expected search space. This will improve the result but also increase the runtime.
- Fixed a bug in ML autotuning where `maxTrials` included model evaluations with concrete configs.
- Fixed a bug where `gds.triangleCount` and `gds.localClusteringCoefficient` were allowed to run on directed graphs.
- Fixed a bug in `gds.graph.export` and Arrow DB import where the `writeConcurrency` was not respected.
- Fixed a bug with Node Operations where `gds.graph.nodeProperties.write`, `gds.graph.nodeProperties.drop`, `gds.graph.nodeProperty.stream` and `gds.graph.nodeProperties.stream` would not accept `String` input for parameters `nodeLabels` and/or `nodeProperties`.
- Fixed a bug where Node2Vec would report negative losses.
- Fixed a bug with `gds.graph.nodeProperty.stream` and `gds.graph.nodeProperties.stream`, where the wrong nodes were returned when specifying a `nodeLabels` filter and using Arrow.
- Fixed a bug in the Louvain algorithm, where aggregating dense communities could potentially lead to an exception.
- Fixed a bug where model loading was attempted even for unlicensed users, which could cause database startup to fail.
Improvements
- Better error handling in K-means
- Improve memory estimation for `gds.beta.pipeline.linkPrediction.train` when the nodePropertySteps used a weighted graph.
- Improve runtime of feature generation in `gds.beta.linkPrediction.[train|predict]`.
- Improve performance of `gds.graph.project.cypher` by using the subscriber API.
- Improve convergence criteria for `LogisticRegression` and `LinearRegression` trainers, by making it independent of the number of batches. This affects `gds.alpha.pipeline.nodeRegression.train`, `gds.beta.pipeline.[linkPrediction|nodeClassification].train`.
. - Improve error handling on invalid user input.
- Cypher on GDS projections is now capable of setting labels on nodes.
- Promoting CELF algorithm to `bet...
Graph Data Science 2.1.13
GDS 2.1.13 is compatible with Neo4j 4.3 versions ≥ 4.3.15 and 4.4 versions ≥ 4.4.9.
For GDS compatibility with previous releases of 4.3 and 4.4, please see GDS 2.1.6. The 2.1 series is also incompatible with Neo4j 3.5.x, 4.0, 4.1, and 4.2. For a 3.5 compatible release, please see GDS 1.1.7. For a 4.0 compatible release, please see GDS 1.6.5. For a 4.1 or 4.2 compatible release, please see GDS 1.8.8.
Bug Fixes
- `gds.graph.nodeProperties.write`, `gds.graph.nodeProperties.drop`, `gds.graph.nodeProperty.stream` and `gds.graph.nodeProperties.stream` now accept `String` input for parameters `nodeLabels` and/or `nodeProperties`.
- `gds.graph.nodeProperty.stream` and `gds.graph.nodeProperties.stream` would return the wrong nodes when specifying a `nodeLabels` filter when using Arrow.
- Louvain algorithm would throw an exception when aggregating dense communities.
Improvements
- Export to CSV now enabled when GDS is running on a Causal Cluster Read Replica
  - `gds.beta.graph.export.csv`
  - `gds.beta.graph.export.csv.estimate`
Graph Data Science 2.1.12
GDS 2.1.12 is compatible with Neo4j 4.3 versions ≥ 4.3.15 and 4.4 versions ≥ 4.4.9.
For GDS compatibility with previous releases of 4.3 and 4.4, please see GDS 2.1.6. The 2.1 series is also incompatible with Neo4j 3.5.x, 4.0, 4.1, and 4.2. For a 3.5 compatible release, please see GDS 1.1.7. For a 4.0 compatible release, please see GDS 1.6.5. For a 4.1 or 4.2 compatible release, please see GDS 1.8.8.
Improvements
- New procedures for enabling and disabling Arrow database import (default: enabled)
  - `gds.features.enableArrowDatabaseImport`
  - `gds.features.enableArrowDatabaseImport.reset`
Graph Data Science 2.1.11
GDS 2.1.11 is compatible with Neo4j 4.3 versions ≥ 4.3.15 and 4.4 versions ≥ 4.4.9.
For GDS compatibility with previous releases of 4.3 and 4.4, please see GDS 2.1.6. The 2.1 series is also incompatible with Neo4j 3.5.x, 4.0, 4.1, and 4.2. For a 3.5 compatible release, please see GDS 1.1.7. For a 4.0 compatible release, please see GDS 1.6.5. For a 4.1 or 4.2 compatible release, please see GDS 1.8.8.
Bug Fixes
- `gds.graph.export` and Arrow DB import did not respect the `writeConcurrency`.
Graph Data Science 2.1.10
GDS 2.1.10 is compatible with Neo4j 4.3 versions ≥ 4.3.15 and 4.4 versions ≥ 4.4.9.
For GDS compatibility with previous releases of 4.3 and 4.4, please see GDS 2.1.6. The 2.1 series is also incompatible with Neo4j 3.5.x, 4.0, 4.1, and 4.2. For a 3.5 compatible release, please see GDS 1.1.7. For a 4.0 compatible release, please see GDS 1.6.5. For a 4.1 or 4.2 compatible release, please see GDS 1.8.8.
Bug Fixes
- Modularity Optimization and Louvain may run into an ArrayIndexOutOfBoundsException. This bug was introduced in 2.1.8.
- `gds.alpha.create.cypherdb` would not clean up internal state when encountering an unexpected error.
- `gds.alpha.backup`, `gds.alpha.restore` and `gds.beta.project.subgraph` would lose information on relationship projections. This made some algorithms unable to run on graphs produced through the above procedures.
Graph Data Science 2.1.9
GDS 2.1.9 is compatible with Neo4j 4.3 versions ≥ 4.3.15 and 4.4 versions ≥ 4.4.9.
For GDS compatibility with previous releases of 4.3 and 4.4, please see GDS 2.1.6. The 2.1 series is also incompatible with Neo4j 3.5.x, 4.0, 4.1, and 4.2. For a 3.5 compatible release, please see GDS 1.1.7. For a 4.0 compatible release, please see GDS 1.6.5. For a 4.1 or 4.2 compatible release, please see GDS 1.8.8.
New Features
- New setting `gds.arrow.advertised_listen_address` added to the GDS Flight server.
Bug Fixes
- `gds.beta.listProgress()` would throw due to negative values. This could occur while running procedures such as `gds.louvain` or `gds.graph.project.cypher`.
- `gds.beta.pipeline.[nodeClassification|linkPrediction].train()` or `gds.alpha.pipeline.nodeRegression.train()` would not work when using a pipeline with a graphSage nodeProperty.
- ML autotuning included model evaluations with concrete configs when using `maxTrials`.
Improvements
- The `gds.debug.sysInfo` procedure now shows the license expiration date when run with a valid GDS license.
Graph Data Science 2.1.8
GDS 2.1.8 is compatible with Neo4j 4.3 versions ≥ 4.3.15 and 4.4 versions ≥ 4.4.9.
For GDS compatibility with previous releases of 4.3 and 4.4, please see GDS 2.1.6. The 2.1 series is also incompatible with Neo4j 3.5.x, 4.0, 4.1, and 4.2. For a 3.5 compatible release, please see GDS 1.1.7. For a 4.0 compatible release, please see GDS 1.6.5. For a 4.1 or 4.2 compatible release, please see GDS 1.8.8.
New Features
- New configuration setting to deactivate encryption for the GDS Flight server.
Bug Fixes
- Fixed a bug where KNN did not use the expected search space. This will improve the accuracy of results but may also increase the runtime.
- Fixed a bug where the GDS plugin would only work with the latest Neo4j patch release. It now works on all `4.3.x` & `4.4.x` releases.
Improvements
- A new column `serverLocation` has been added to `gds.debug.arrow()`. It shows the actual location where the server is running, which can differ from the configured location.
Graph Data Science 2.1.7
GDS 2.1.7 is compatible with Neo4j 4.3 versions ≥ 4.3.15 and 4.4 versions ≥ 4.4.9.
For GDS compatibility with previous releases of 4.3 and 4.4, please see GDS 2.1.6. The 2.1 series is also incompatible with Neo4j 3.5.x, 4.0, 4.1, and 4.2. For a 3.5 compatible release, please see GDS 1.1.7. For a 4.0 compatible release, please see GDS 1.6.5. For a 4.1 or 4.2 compatible release, please see GDS 1.8.8.
Breaking Changes
- Link prediction pipeline training no longer accepts directed graphs. This is because the algorithm & ML techniques used by link prediction pipelines are only defined for undirected graphs.
Bug Fixes
- Fixed a bug where `modularityOptimization` could incorrectly update modularity values
- Fixed a bug where `gds.restore` did not correctly read values wrapped in quotes
2.1.6
GDS 2.1.6 is compatible with Neo4j 4.3 and 4.4 but not Neo4j 3.5.x, 4.0, 4.1, or 4.2. For a 3.5 compatible release, please see GDS 1.1.7. For a 4.0 compatible release, please see GDS 1.6.5. For a 4.1 or 4.2 compatible release, please see GDS 1.8.8.
Bug Fixes
- Fixed a bug where relationship types or node labels were not handled correctly when importing previously exported data via Apache Arrow.
- Fixed a bug where `gds.graphSage.[stream|write|mutate]` did not use the correct relationship weights when run with concurrency > 1.