
Commit

Merge branch 'NVIDIA:main' into embed
ZiyueXu77 authored Sep 20, 2024
2 parents e37834e + aa4a0de commit 1096beb
Showing 53 changed files with 842 additions and 791 deletions.
4 changes: 0 additions & 4 deletions docs/examples/fl_experiment_tracking_mlflow.rst
@@ -54,8 +54,6 @@ Adding MLflow Logging to Configurations
Inside the config folder there are two files, ``config_fed_client.conf`` and ``config_fed_server.conf``.

.. literalinclude:: ../../examples/advanced/experiment-tracking/mlflow/jobs/hello-pt-mlflow/app/config/config_fed_client.conf
:language:
:linenos:
:caption: config_fed_client.conf

Take a look at the components section of the client config at line 24.
@@ -70,8 +68,6 @@ Finally, :class:`ConvertToFedEvent<nvflare.app_common.widgets.convert_to_fed_eve
This changes the event ``analytix_log_stats`` into a fed event ``fed.analytix_log_stats``, which will then be streamed from the clients to the server.
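
For reference, a minimal Python sketch of how this widget might be created with those event names
(the exact keyword arguments are an assumption to verify against your NVFlare version):

.. code-block:: python

   from nvflare.app_common.widgets.convert_to_fed_event import ConvertToFedEvent

   # Convert the local "analytix_log_stats" event into "fed.analytix_log_stats"
   # so it can be streamed from the clients to the server.
   converter = ConvertToFedEvent(
       events_to_convert=["analytix_log_stats"],
       fed_event_prefix="fed.",
   )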

.. literalinclude:: ../../examples/advanced/experiment-tracking/mlflow/jobs/hello-pt-mlflow/app/config/config_fed_server.conf
:language:
:linenos:
:caption: config_fed_server.conf

Under the component section in the server config, we have the
166 changes: 0 additions & 166 deletions docs/examples/hello_cross_site_eval.rst

This file was deleted.

51 changes: 13 additions & 38 deletions docs/examples/hello_cross_val.rst
@@ -110,36 +110,15 @@ and adding a ``random_epsilon`` before returning the results packaged with a DXO
NVIDIA FLARE can be used with any data packaged inside a :ref:`Shareable <shareable>` object (a subclass of ``dict``), and
a :ref:`DXO <data_exchange_object>` is the recommended way to manage that data in a standard form.
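
For illustration, a short sketch of packaging NumPy data in a DXO and converting it to a
:ref:`Shareable <shareable>` (the weight values are placeholders):

.. code-block:: python

   import numpy as np

   from nvflare.apis.dxo import DXO, DataKind, from_shareable

   # Wrap NumPy weights in a DXO and package it as a Shareable for transmission.
   dxo = DXO(data_kind=DataKind.WEIGHTS, data={"numpy_key": np.zeros((3, 3))})
   shareable = dxo.to_shareable()

   # On the receiving side, recover the DXO and its data from the Shareable.
   received = from_shareable(shareable)
   weights = received.data["numpy_key"]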

Application Configuration
^^^^^^^^^^^^^^^^^^^^^^^^^^

Inside the config folder there are two files, ``config_fed_client.json`` and ``config_fed_server.json``.

.. literalinclude:: ../../examples/hello-world/hello-numpy-cross-val/jobs/hello-numpy-cross-val/app/config/config_fed_server.json
:language: json
:linenos:
:caption: config_fed_server.json

The server now has a second workflow configured after Scatter and Gather, :class:`CrossSiteModelEval<nvflare.app_common.workflows.cross_site_model_eval.CrossSiteModelEval>`.

The components "model_locator" and "formatter" have been added to work with the cross site model evaluation workflow,
and the rest is the same as in :doc:`Hello Scatter and Gather <hello_scatter_and_gather>`.
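
As a concrete (hypothetical) Python sketch, the workflow references those two components by id;
the exact constructor arguments may differ between NVFlare versions, so treat this as an assumption:

.. code-block:: python

   from nvflare.app_common.workflows.cross_site_model_eval import CrossSiteModelEval

   # "model_locator" and "formatter" are the ids of components defined elsewhere in the server config.
   cross_site_eval = CrossSiteModelEval(
       model_locator_id="model_locator",
       formatter_id="formatter",
   )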

Cross site validation!
----------------------

.. literalinclude:: ../../examples/hello-world/hello-numpy-cross-val/jobs/hello-numpy-cross-val/app/config/config_fed_client.json
:language: json
:linenos:
:caption: config_fed_client.json
We can run it using the NVFlare simulator:

The client configuration now has more tasks and an additional Executor ``NPValidator`` configured to handle the "validate" task.
The "submit_model" task has been added to the ``NPTrainer`` Executor to work with the :class:`CrossSiteModelEval<nvflare.app_common.workflows.cross_site_model_eval.CrossSiteModelEval>`
workflow to get the client models.
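
As a rough sketch (not the actual example code) of how an Executor such as ``NPValidator``
dispatches on task names like "validate", assuming the standard Executor interface:

.. code-block:: python

   from nvflare.apis.dxo import DXO, DataKind
   from nvflare.apis.executor import Executor
   from nvflare.apis.fl_constant import ReturnCode
   from nvflare.apis.fl_context import FLContext
   from nvflare.apis.shareable import Shareable, make_reply
   from nvflare.apis.signal import Signal

   class SimpleValidator(Executor):
       def execute(self, task_name: str, shareable: Shareable,
                   fl_ctx: FLContext, abort_signal: Signal) -> Shareable:
           if task_name == "validate":
               # Evaluate the received model and return metrics packaged in a DXO.
               metrics = {"accuracy": 0.0}  # placeholder value
               return DXO(data_kind=DataKind.METRICS, data=metrics).to_shareable()
           return make_reply(ReturnCode.TASK_UNKNOWN)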
.. code-block:: bash
Cross site validation!
----------------------
python3 job_train_and_cse.py
.. |ExampleApp| replace:: hello-numpy-cross-val
.. include:: run_fl_system.rst
During the first phase, the model will be trained.

@@ -154,23 +133,18 @@ This can produce a lot of results. All the results will be kept in the job's wor
Understanding the Output
^^^^^^^^^^^^^^^^^^^^^^^^

After starting the server and clients, you should begin to see
some outputs in each terminal tracking the progress of the FL run.
As each client finishes training, it will start the cross site validation process.
During this, you'll see several important outputs that track the progress of cross site validation.
You can find the running logs and results inside the simulator's workspace:

The server log shows each client requesting models, the models it sends, and the results it receives.
Since the server may be responding to many clients at the same time, it can take
careful examination of the interleaved log entries to follow the sequence of events.
.. code-block:: bash

  ls /tmp/nvflare/jobs/workdir/
  server/  site-1/  site-2/  startup/
.. include:: access_result.rst
.. note::
   You can find the cross-site validation results
   at ``[DOWNLOAD_DIR]/[JOB_ID]/workspace/cross_site_val/cross_val_results.json``
The cross site validation results:

.. include:: shutdown_fl_system.rst
.. code-block:: bash

  cat /tmp/nvflare/jobs/workdir/server/simulate_job/cross_site_val/cross_val_results.json
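
If you prefer to inspect the results programmatically, a small sketch (assuming the path shown
above and a nested mapping of validating client to evaluated model):

.. code-block:: python

   import json

   path = "/tmp/nvflare/jobs/workdir/server/simulate_job/cross_site_val/cross_val_results.json"
   with open(path) as f:
       results = json.load(f)

   # Print each client's metrics for every model it validated.
   for validating_client, models in results.items():
       for model_name, metrics in models.items():
           print(validating_client, model_name, metrics)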
Congratulations!

@@ -186,3 +160,4 @@
- `hello-numpy-cross-val for 2.1 <https://github.com/NVIDIA/NVFlare/tree/2.1/examples/hello-numpy-cross-val>`_
- `hello-numpy-cross-val for 2.2 <https://github.com/NVIDIA/NVFlare/tree/2.2/examples/hello-numpy-cross-val>`_
- `hello-numpy-cross-val for 2.3 <https://github.com/NVIDIA/NVFlare/tree/2.3/examples/hello-world/hello-numpy-cross-val/>`_
- `hello-numpy-cross-val for 2.4 <https://github.com/NVIDIA/NVFlare/tree/2.4/examples/hello-world/hello-numpy-cross-val/>`_
4 changes: 2 additions & 2 deletions docs/examples/hello_fedavg_numpy.rst
@@ -1,4 +1,4 @@
.. _hello_fedavg_w_numpy:
.. _hello_fedavg_numpy:

Hello FedAvg with NumPy
=======================
@@ -35,7 +35,7 @@ The following steps compose one cycle of weight updates, called a **round**:
#. These updates are then sent to the server which will aggregate them to produce a model with new weights.
#. Finally, the server sends this updated version of the model back to each client, so the clients can continue to calculate the next model weights in future rounds.
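
As a minimal NumPy illustration of the aggregation step in this cycle (placeholder values,
equal client weighting assumed):

.. code-block:: python

   import numpy as np

   # Updates received from two clients in one round.
   client_updates = [np.array([1.0, 2.0, 3.0]), np.array([3.0, 2.0, 1.0])]

   # The server averages them to produce the new global weights: [2., 2., 2.]
   new_global_weights = np.mean(client_updates, axis=0)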

For this exercise, we will be working with the ``hello-fedavg-numpy`` application in the examples folder.
For this exercise, we will be working with the ``hello-fedavg-numpy`` in the examples folder.

Let's get started. First clone the repo, if you haven't already:

4 changes: 2 additions & 2 deletions docs/examples/hello_pt_job_api.rst
@@ -9,7 +9,7 @@ Before You Start
Feel free to refer to the :doc:`detailed documentation <../programming_guide>` at any point
to learn more about the specifics of `NVIDIA FLARE <https://pypi.org/project/nvflare/>`_.

We recommend you first finish the :doc:`Hello FedAvg with NumPy <hello_fedavg_w_numpy>` exercise since it introduces the
We recommend you first finish the :doc:`Hello FedAvg with NumPy <hello_fedavg_numpy>` exercise since it introduces the
federated learning concepts of `NVIDIA FLARE <https://pypi.org/project/nvflare/>`_.

Make sure you have an environment with NVIDIA FLARE installed.
@@ -55,7 +55,7 @@ NVIDIA FLARE Job API
--------------------

The ``fedavg_script_executor_hello-pt.py`` script for this hello-pt example is very similar to the ``fedavg_script_executor_hello-numpy.py`` script
for the :doc:`Hello FedAvg with NumPy <hello_fedavg_w_numpy>` exercise. Other than changes to the names of the job and client script, the only difference
for the :doc:`Hello FedAvg with NumPy <hello_fedavg_numpy>` exercise. Other than changes to the names of the job and client script, the only difference
is a line to define the initial global model for the server:

.. code-block:: python