Releases: NVIDIA/NVFlare
2.5.2
Contributions
Special thanks to all the contributors for this release (in git shortlog order):
@IsaacYangSLA, @yhwen, @yanchengnv, @nvidianz, @YuanTingHsieh
What's New
In this release, we have introduced several exciting new features and enhancements, building on the foundation of version 2.5.1. Key updates include:
Extended Python Version Support (2.5.1)
We now support a broader range of Python versions, from 3.9 to 3.12, ensuring greater compatibility and flexibility for your development needs.
Secure Federated XGBoost Enhancements
Please see this note
The Secure Federated XGBoost framework has been significantly improved with optimizations to the CUDA Paillier Plugin:
New Parallel CUDA-Based Reduction Algorithm:
Version 2 of the CUDA Paillier Plugin introduces a cutting-edge parallel reduction algorithm. This improvement:
- Doubles the performance compared to version 1 on certain datasets (e.g., small feature sets with a large number of rows).
- Dramatically improves efficiency on datasets with many features (over 2,000).
- Parameter Conversion Optimization: we reduced unnecessary parameter conversions, streamlining overall performance.
Performance Benchmarks:
Benchmarks conducted on the V100 GPU highlight the remarkable improvements achieved with these enhancements:
- For small feature datasets, our solution is 30x to 36.5x faster compared to third-party CPU-based implementations.
- For wide-feature datasets, we maintain a competitive edge, being 4.6x faster.
The CPU-based plugin is also optimized to reduce the memory usage during ciphertext operations by utilizing shared memory.
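The speedups above come from operating on Paillier ciphertexts, which can be multiplied together to add the underlying plaintexts. As a rough illustration of why encrypted gradient histograms can be summed without decryption, here is a minimal pure-Python Paillier sketch; the toy key size and all names are our own, not the CUDA plugin's implementation:

```python
import math, random

# Toy Paillier keypair (tiny primes for illustration; real plugins use 2048-bit keys)
p, q = 293, 433
n = p * q                      # public modulus
n2 = n * n
g = n + 1                      # standard choice of generator
lam = math.lcm(p - 1, q - 1)   # private key

def L(x):
    return (x - 1) // n

mu = pow(L(pow(g, lam, n2)), -1, n)  # precomputed decryption constant

def encrypt(m):
    r = random.randrange(1, n)
    while math.gcd(r, n) != 1:
        r = random.randrange(1, n)
    return (pow(g, m, n2) * pow(r, n, n2)) % n2

def decrypt(c):
    return (L(pow(c, lam, n2)) * mu) % n

# Additive homomorphism: Enc(a) * Enc(b) decrypts to a + b, which is
# what lets encrypted gradient contributions be aggregated server-side.
a, b = 123, 456
assert decrypt((encrypt(a) * encrypt(b)) % n2) == a + b
```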
End-to-end fraud detection example enhancements
In addition to the existing manual feature engineering, we added an example that feeds graph embeddings to XGBoost; the embeddings, used as new features, work better than manual feature enrichment.
We also show how to use federated explainability with federated XGBoost.
Support Normal TLS & signed messages
By default, NVIDIA FLARE uses mutual TLS (mTLS) connections. Some customers need normal (one-way) TLS, so this release adds support for it.
Description
Currently, FLARE's message security comes from mutual TLS: the server and clients authenticate each other when making connections, so only clients with the right startup kits can connect to the server.
Requiring only one-way SSL between the server and clients breaks this assumption: the server could be exposed to the internet, and anyone could write a client to connect to it. To ensure message security, explicit message authentication is required.
This release implements message authentication: messages received by the server must carry an auth token, and the token must validate successfully to prove that it was issued by the server.
Here is how it works:
The client first logs in to the server. The server and client authenticate each other explicitly with the credentials in their startup kits. This step is independent of how the client and server are connected.
If the client's credentials validate correctly, the server issues a token and a signature that binds the client name and token together. The signature is generated with the server's private key, proving that it could only have been issued by the server.
When sending a message to the server, the client adds its client name, the token, and the signature as headers to the message.
When the message is received, the server validates the token and the signature. Messages that are missing these headers or fail validation are rejected.
Note that this mechanism is based on the security of the startup kits. All sites must protect their startup kits securely.
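The token/signature binding described above can be sketched as follows. This is a simplified stand-in: NVFlare signs with the server's private key from provisioning, whereas this illustration uses an HMAC secret, and all names here are hypothetical:

```python
import hmac, hashlib, secrets

# Stand-in for the server's private key. NVFlare uses an asymmetric key
# from the provisioning system; an HMAC secret here only illustrates the
# name-token binding idea.
SERVER_SECRET = secrets.token_bytes(32)

def issue_token(client_name: str):
    """Issued after the client's startup-kit credentials are validated."""
    token = secrets.token_hex(16)
    sig = hmac.new(SERVER_SECRET, f"{client_name}:{token}".encode(),
                   hashlib.sha256).hexdigest()
    return token, sig  # the client attaches these as message headers

def validate_message(headers: dict) -> bool:
    """Server-side check on every incoming message."""
    try:
        name, token, sig = headers["client_name"], headers["token"], headers["signature"]
    except KeyError:
        return False  # missing headers -> reject
    expected = hmac.new(SERVER_SECRET, f"{name}:{token}".encode(),
                        hashlib.sha256).hexdigest()
    return hmac.compare_digest(sig, expected)

token, sig = issue_token("site-1")
assert validate_message({"client_name": "site-1", "token": token, "signature": sig})
# A different client name cannot reuse the token: the signature no longer binds.
assert not validate_message({"client_name": "site-2", "token": token, "signature": sig})
```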
Bug fixes:
We fixed various bugs discovered by our users and customers
What's Changed
- [2.5] Fix TF examples by @YuanTingHsieh in #3038
- [2.5] Update dashboard cloud base image version to meet Python 3.9 by @IsaacYangSLA in #3063
- [2.5] Support one-way SSL by @yanchengnv in #3062
- [2.5] Fix custom pythonpath missing by @yhwen in #3070
- [2.5] Fix the simulator without custom folder job run error by @yhwen in #3090
- [2.5] Support explicit message authentication by @yanchengnv in #3096
- [2.5] XGB Optimization using Shared Memory by @nvidianz in #3094
- [2.5] Cherry pick xgb updates by @YuanTingHsieh in #3099
- [2.5] Add authentication to Client API messages by @yanchengnv in #3103
- [2.5] Support connection security for cell pipe by @yanchengnv in #3105
- [2.5] Cherry pick 3106 by @IsaacYangSLA in #3107
- [2.5] Updated XGBoost User Guide by @nvidianz in #3109
- [2.5] Remove the extra client app custom folder by @yhwen in #3102
Full Changelog: 2.5.1...2.5.2
2.5.1: bug fixes and additional features
What's Changed
- [2.5] doc fix typo by @chesterxgchen in #2940
- [2.5] Update higgs data link by @SYangster in #2945
- [2.5] Update video links by @SYangster in #2944
- [2.5] Add research examples to tutorial page by @SYangster in #2943
- [2.5] Cherry-pick fix doc and docstring issues (#2931) by @YuanTingHsieh in #2946
- [2.5] Add check before api send 25 by @YuanTingHsieh in #2949
- [2.5] Fix dashboard server resource by @IsaacYangSLA in #2959
- [2.5] F3 Streaming Code Rewrite by @nvidianz in #2956
- [2.5] Add the hello-pt-resnet example by @yhwen in #2955
- [2.5] Fix simulator result path by @YuanTingHsieh in #2967
- [2.5] Add python 3.12 support by @YuanTingHsieh in #2966
- [2.5] update PSI to support python 3.11 by @chesterxgchen in #2973
- [2.5] Web updates by @SYangster in #2977
- [2.5] Replace the distutils with shutil by @yhwen in #2979
- [2.5] Update openmined-psi to 2.0.5 to support python 3.12 by @chesterxgchen in #2982
- [2.5] Allow JobAPI wrapper customization by @YuanTingHsieh in #2988
- [2.5] Job api update by @YuanTingHsieh in #2991
- [2.5] Bionemo demos (#2968) by @holgerroth in #2998
- [2.5] Update pt params converter (#2989) by @YuanTingHsieh in #2997
- [2.5] BioNeMo: use multi threading but reduce num workers by @holgerroth in #3003
- [2.5] Update test script by @YuanTingHsieh in #3001
- [2.5] Fix doc typo and VDR reported issues by @YuanTingHsieh in #3000
- [2.5] Expose init in client lightning api by @YuanTingHsieh in #3005
- [2.5] Update flwr job object, client, server by @YuanTingHsieh in #3009
- [2.5] Cherry pick Update documentation for Dockerfile, add location of tbevents, fix link by @nvkevlu in #3006
- [2.5] Update finance end to end README and images by @YuanTingHsieh in #3013
- [2.5] Fix fobs issue by @YuanTingHsieh in #3015
- [2.5] Fix fobs doc (#3012) by @YuanTingHsieh in #3016
- [2.5] Update version and python requires by @IsaacYangSLA in #3024
- [2.5] Fix PTModel optional arguments by @SYangster in #3026
- [2.5] Set numpy less than 2.0.0 by @YuanTingHsieh in #3027
- [2.5] Fedbpt fix by @holgerroth in #3030
- [2.5] split learning: upgrade openmined-psi to 2.0.5 by @chesterxgchen in #3019
- [2.5] Cherry-pick Fixed broken doc ref to 'helm_chart' (#3022) [skip ci] by @YuanTingHsieh in #3032
- [2.5] Enhance POC notebook and docs by @SYangster in #3033
- [2.5] Fix hello-pt req [skip ci] by @YuanTingHsieh in #3036
- [2.5] Remove extra "." [skip ci] by @YuanTingHsieh in #3037
- [2.5] Add TF based TBAnalyticsReceiver by @YuanTingHsieh in #3035
- [2.5] Bump up python version of Dockerfile by @IsaacYangSLA in #3042
Full Changelog: 2.5.0...2.5.1
2.5.1rc2: bug fixes and document updates
What's Changed
- [2.5] Update finance end to end README and images by @YuanTingHsieh in #3013
- [2.5] Fix fobs issue by @YuanTingHsieh in #3015
- [2.5] Fix fobs doc (#3012) by @YuanTingHsieh in #3016
- [2.5] Update version and python requires by @IsaacYangSLA in #3024
- [2.5] Fix PTModel optional arguments by @SYangster in #3026
- [2.5] Set numpy less than 2.0.0 by @YuanTingHsieh in #3027
- [2.5] Fedbpt fix by @holgerroth in #3030
- [2.5] split learning: upgrade openmined-psi to 2.0.5 by @chesterxgchen in #3019
- [2.5] Cherry-pick Fixed broken doc ref to 'helm_chart' (#3022) [skip ci] by @YuanTingHsieh in #3032
Full Changelog: 2.5.1rc1...2.5.1rc2
2.5.1rc1: Bug fixes and more features
What's Changed
- [2.5] doc fix typo by @chesterxgchen in #2940
- [2.5] Update higgs data link by @SYangster in #2945
- [2.5] Update video links by @SYangster in #2944
- [2.5] Add research examples to tutorial page by @SYangster in #2943
- [2.5] Cherry-pick fix doc and docstring issues (#2931) by @YuanTingHsieh in #2946
- [2.5] Add check before api send 25 by @YuanTingHsieh in #2949
- [2.5] Fix dashboard server resource by @IsaacYangSLA in #2959
- [2.5] F3 Streaming Code Rewrite by @nvidianz in #2956
- [2.5] Add the hello-pt-resnet example by @yhwen in #2955
- [2.5] Fix simulator result path by @YuanTingHsieh in #2967
- [2.5] Add python 3.12 support by @YuanTingHsieh in #2966
- [2.5] update PSI to support python 3.11 by @chesterxgchen in #2973
- [2.5] Web updates by @SYangster in #2977
- [2.5] Replace the distutils with shutil by @yhwen in #2979
- [2.5] Update openmined-psi to 2.0.5 to support python 3.12 by @chesterxgchen in #2982
- [2.5] Allow JobAPI wrapper customization by @YuanTingHsieh in #2988
- [2.5] Job api update by @YuanTingHsieh in #2991
- [2.5] Bionemo demos (#2968) by @holgerroth in #2998
- [2.5] Update pt params converter (#2989) by @YuanTingHsieh in #2997
- [2.5] BioNeMo: use multi threading but reduce num workers by @holgerroth in #3003
- [2.5] Update test script by @YuanTingHsieh in #3001
- [2.5] Fix doc typo and VDR reported issues by @YuanTingHsieh in #3000
- [2.5] Expose init in client lightning api by @YuanTingHsieh in #3005
- [2.5] Update flwr job object, client, server by @YuanTingHsieh in #3009
- [2.5] Cherry pick Update documentation for Dockerfile, add location of tbevents, fix link by @nvkevlu in #3006
Full Changelog: 2.5.0...2.5.1rc1
2.5.0: Latest release with features and bug fixes
What's new
https://nvflare.readthedocs.io/en/main/whats_new.html
https://nvidia.github.io/NVFlare/
User Experience Improvements
NVFlare 2.5.0 offers several new sets of APIs that allow for end-to-end ease of use and can greatly improve the experience of researchers and data scientists working with FLARE. The new APIs cover client, server, and job construction with an end-to-end pythonic user experience.
Model Controller API
The new ModelController API greatly simplifies the experience of developing new federated learning workflows. Users can simply subclass ModelController to develop new workflows. The new API doesn't require users to know the details of NVFlare constructs except for the FLModel class, which is simply a data structure containing model weights, optimization parameters, and metadata.
You can easily construct a new workflow with basic Python code, and when ready, the send_and_wait() communication function is all you need for communication between clients and the server.
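To illustrate the pattern, a FedAvg-style workflow built on a ModelController-like base might look like the sketch below. These are simplified stand-ins of our own, not the actual NVFlare classes; the client round trip is simulated in-process:

```python
from dataclasses import dataclass, field

# Stand-ins mirroring the FLModel / ModelController names; the real
# send_and_wait() dispatches to remote clients, which we fake here.
@dataclass
class FLModel:
    params: dict
    meta: dict = field(default_factory=dict)

class ToyModelController:
    def send_and_wait(self, model, clients):
        # Simulated round trip: each "client" perturbs the global weights.
        return [FLModel({k: v + i + 1 for k, v in model.params.items()})
                for i, _ in enumerate(clients)]

class FedAvg(ToyModelController):
    def run(self, num_rounds, clients):
        model = FLModel({"w": 0.0})
        for _ in range(num_rounds):
            results = self.send_and_wait(model, clients)
            # Aggregate: plain average of the returned weights
            model = FLModel({k: sum(r.params[k] for r in results) / len(results)
                             for k in model.params})
        return model

final = FedAvg().run(num_rounds=2, clients=["site-1", "site-2"])
```

The point of the real API is the same shape: the whole control flow is ordinary Python, with send_and_wait() as the only communication primitive.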
Client API
We introduced another Client API implementation, InProcessClientAPIExecutor. It has the same interface and syntax as the previous Client API using SubprocessLauncher, except that all communication happens in memory.
Using this in-process Client API, we built a ScriptExecutor, which is used directly in the new Job API.
Compared with SubProcessLauncherClientAPI, the in-process Client API offers better efficiency and is easier to configure. All operations are carried out within the memory space of the executor.
SubProcessLauncherClientAPI can still be used for cases where a separate training process is required.
Job API
The new Job API, or FedJob API, combined with the Client API and Model Controller API, gives users an end-to-end pythonic experience. The job configuration, which had to be written by hand prior to this release, can now be generated automatically, so users no longer need to edit configuration files manually.
We provide many examples that demonstrate the power of the new Job API, making it easy to experiment with new federated learning algorithms or create new applications.
Flower Integration
The integration between NVFlare and the Flower framework lets researchers leverage the strengths of both: applications crafted within the Flower framework can run in the FLARE runtime environment without any modifications. This initial integration streamlines the process and ensures smooth interoperability between the two platforms. Please find details here. A hello-world example is available here.
Secure XGBoost
The latest XGBoost releases introduce support for secure federated learning via homomorphic encryption. For vertical federated XGBoost, the gradients of each sample are protected by encryption so that label information is not leaked to unintended parties; for horizontal federated XGBoost, the local gradient histograms are not learned by the central aggregation server.
With our encryption plugins working with XGBoost, NVFlare now supports all secure federated schemes for XGBoost model training, on both CPU and GPU.
Please see the Federated XGBoost with NVFlare user guide (https://nvflare.readthedocs.io/en/main/user_guide/federated_xgboost.html) and the example.
TensorFlow Support
With community contributions, we added FedOpt, FedProx, and Scaffold algorithms implemented in TensorFlow. You can check the code here and the example.
FOBS Auto Registration
FOBS, the secure mechanism NVFlare uses for message serialization and deserialization, is enhanced with new auto-registration features. These changes reduce the number of decomposers that users have to register:
- Auto registration of decomposers on deserialization. The decomposer class is stored in the serialized data, and decomposers are registered automatically when deserializing. If a component only receives serialized data and does not perform serialization itself, registering decomposers is no longer needed.
- Data Class decomposer auto registration on serialization. If no decomposer is found for a class, FOBS tries to treat the class as a Data Class and registers a DataClassDecomposer for it. This works in most cases, but not all.
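The Data Class fallback can be illustrated with a toy registry. This is our own hypothetical sketch of the idea, not FOBS itself, which is built on MessagePack and its own Decomposer interface:

```python
import dataclasses

# Toy registry illustrating auto-registration on serialization.
_decomposers = {}

def decompose(obj):
    cls = type(obj)
    if cls not in _decomposers:
        if dataclasses.is_dataclass(obj):
            # No decomposer found: auto-register a generic Data Class
            # decomposer on first use instead of raising.
            _decomposers[cls] = lambda o: (type(o).__name__, dataclasses.asdict(o))
        else:
            raise TypeError(f"no decomposer registered for {cls.__name__}")
    return _decomposers[cls](obj)

@dataclasses.dataclass
class TrainResult:
    loss: float
    rounds: int

name, payload = decompose(TrainResult(0.25, 3))
assert (name, payload) == ("TrainResult", {"loss": 0.25, "rounds": 3})
```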
New Examples
Secure Federated Kaplan-Meier Analysis
The Secure Federated Kaplan-Meier Analysis via Time-Binning and Homomorphic Encryption example illustrates two features:
- How to perform Kaplan-Meier survival analysis in a federated setting, both without and with secure features, via time-binning and Homomorphic Encryption (HE).
- How to use the FLARE ModelController API to construct a workflow that facilitates HE under simulator mode.
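For reference, the Kaplan-Meier statistic the example computes can be sketched in a few lines of plain, non-federated Python; the example layers time-binning and HE on top of this:

```python
# Minimal Kaplan-Meier estimator: survival probability drops at each
# event time by (1 - deaths / at_risk); censored subjects only shrink
# the at-risk set.
def kaplan_meier(times, events):
    """times: observation times; events: 1 = event occurred, 0 = censored."""
    at_risk = len(times)
    survival, curve = 1.0, []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, ei in zip(times, events) if ti == t and ei == 1)
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= sum(1 for ti in times if ti == t)
    return curve

curve = kaplan_meier([1, 2, 2, 3, 4], [1, 1, 0, 1, 0])
```

Time-binning makes the federated version possible: sites only need to share per-bin death and censoring counts, which can be summed under HE.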
BioNeMo Example for Drug Discovery
BioNeMo is NVIDIA’s generative AI platform for drug discovery. We included several examples of running BioNeMo in a federated learning environment using NVFlare:
The task fitting example includes a notebook that shows how to obtain protein-learned representations in the form of embeddings using the ESM-1nv pre-trained model.
The downstream example shows three different downstream tasks for fine-tuning a BioNeMo ESM-style model.
Federated Logistic Regression with NR Optimization
The Federated Logistic Regression with Second-Order Newton-Raphson Optimization example shows how to implement federated binary classification via logistic regression with second-order Newton-Raphson optimization.
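A minimal single-site sketch of the Newton-Raphson update for logistic regression is shown below. This is our own illustration, not the example's code; in the federated example, each site contributes its local gradient and Hessian, which the server sums before taking the Newton step:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def newton_raphson(xs, ys, iters=20):
    """Fit a 1-feature logistic model p = sigmoid(w0 + w1*x) by Newton steps."""
    w0 = w1 = 0.0
    for _ in range(iters):
        g0 = g1 = a = b = c = 0.0
        for x, y in zip(xs, ys):
            p = sigmoid(w0 + w1 * x)
            g0 += p - y                # gradient of the log-loss
            g1 += (p - y) * x
            s = p * (1 - p)            # Hessian entries of X^T diag(s) X
            a += s; b += s * x; c += s * x * x
        det = a * c - b * b
        w0 -= (c * g0 - b * g1) / det  # Newton step: w -= H^-1 g
        w1 -= (a * g1 - b * g0) / det
    return w0, w1

# Toy, non-separable data: labels mostly switch on around x = 0.45
xs = [0.1, 0.2, 0.4, 0.5, 0.6, 0.9]
ys = [0, 0, 1, 0, 1, 1]
w0, w1 = newton_raphson(xs, ys)
```

Because the gradient and Hessian are sums over samples, summing each site's local (g, H) before the solve gives exactly the global Newton step, which is what makes the second-order method federate cleanly.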
Hierarchical Federated Statistics
Hierarchical federated statistics are helpful when multiple organizations are involved. For example, in medical device applications, device usage statistics can be viewed from the perspective of the device, the hosting site, and the hospital or manufacturer. Manufacturers would like to see the usage stats of their products across different sites and hospitals, while hospitals may want overall stats covering devices from different manufacturers. Hierarchical federated statistics support such cases.
FedAvg Early Stopping Example
The FedAvg Early Stopping example demonstrates that, with the new server-side model controller API, it is easy to change the control conditions and adjust workflows with a few lines of Python code.
TensorFlow Algorithms & Examples
FedOpt, FedProx, and Scaffold implementations for TensorFlow.
FedBN: Federated Learning on Non-IID Features via Local Batch Normalization
The FedBN example showcases a federated learning algorithm designed to addr...
2.5.0rc12: Bug fixes
What's Changed
- Fixed XGBoost Example README by @nvidianz in #2913
- Fix cifar10 examples num_clients by @SYangster in #2914
- Fix data save path by @YuanTingHsieh in #2917
- trim the whitespace of the clients and gpu from the job simulator_run by @yhwen in #2912
- Add CSE with job api with client api by @YuanTingHsieh in #2918
- Update to use BaseFedJob by @SYangster in #2919
- Warning for Mixed Plugin Use by @nvidianz in #2920
- BugFix: Hierarchical Fed Stats, prepare data: replace os.rename() function by @chesterxgchen in #2921
- Note about Simulator in XGBoost Doc by @nvidianz in #2911
- Add params_transfer_type to ScriptRunner by @SYangster in #2922
- Fix nemo examples by @holgerroth in #2923
- Added the current-round info the fl_ctx for BaseModelController by @yhwen in #2916
- Fix ci path by @YuanTingHsieh in #2927
- Fix xgb standalone fed by @YuanTingHsieh in #2924
- Fixing the memoryview issues by @nvidianz in #2926
Full Changelog: 2.5.0rc11...2.5.0rc12
2.5.0rc11: Bug fixes
What's Changed
- Fix hello-pt-cse job by @YuanTingHsieh in #2905
- Undo remove bionemo from new by @nvkevlu in #2902
- Add vertical xgboost gpu instructions by @YuanTingHsieh in #2903
- Fix bionemo examples by @holgerroth in #2904
- Fixed Plugin README by @nvidianz in #2906
- Update xgboost docs by @nvkevlu in #2907
- Added debug info for memoryview error by @nvidianz in #2908
- Change job simulator run to use Popen by @yhwen in #2909
- Fix hello_world tf result printing by @SYangster in #2910
Full Changelog: 2.5.0rc10...2.5.0rc11
2.5.0rc9: Bug fixes
What's Changed
- Updated plugin build doc by @nvidianz in #2892
- fix PSI and Vertical learning paths by @chesterxgchen in #2893
- Fix ci test configs format issue by @YuanTingHsieh in #2896
- Remove bionemo from new by @nvkevlu in #2897
- Update random forest and vertical xgb examples by @ZiyueXu77 in #2895
- Site, docs, and example updates by @SYangster in #2894
- Update xgboost requirements by @YuanTingHsieh in #2898
- Update flare simulator tutorial by @YuanTingHsieh in #2899
- Fix tf weights filename by @SYangster in #2901
Full Changelog: 2.5.0rc8...2.5.0rc9
2.5.0rc10: Feature improvements
What's Changed
- Add log info for flower executor by @YuanTingHsieh in #2900
Full Changelog: 2.5.0rc9...2.5.0rc10
2.5.0rc8: Bug fixes
What's Changed
- Fix hierarchical stats documentation by @apatole in #2882
- Update fedbn example by @ZiyueXu77 in #2883
- fix path due to simulator output structure changes by @chesterxgchen in #2885
- Add note on installing nvflare in requirements by @nvkevlu in #2884
- Fix sbs notebooks by @SYangster in #2887
- Re-factor hello-numpy-cse example by @YuanTingHsieh in #2880
- Update CrossSiteEval by @YuanTingHsieh in #2886
- Add printing of tb logdir by @YuanTingHsieh in #2888
- Update getting_started cifar notebook by @ZiyueXu77 in #2889
- Deprecate decorator pattern by @YuanTingHsieh in #2891
- Added instructions to run horizontal secure XGBoost in simulator by @nvidianz in #2890
Full Changelog: 2.5.0rc7...2.5.0rc8