From 49fe4775ef1176f2fd98994141af7b72e0023a59 Mon Sep 17 00:00:00 2001
From: Adam Narozniak <51029327+adam-narozniak@users.noreply.github.com>
Date: Sat, 24 Aug 2024 10:13:03 +0200
Subject: [PATCH 01/42] refactor(examples) Update federated kaplan meier fitter
(#3922)
Co-authored-by: jafermarq
---
.../federated-kaplan-meier-fitter/README.md | 84 +++++++------------
.../examplefkm/__init__.py | 1 +
.../{client.py => examplefkm/client_app.py} | 41 ++++-----
.../{server.py => examplefkm/server_app.py} | 36 +++++---
.../examplefkm/task.py | 17 ++++
.../pyproject.toml | 43 +++++++---
.../requirements.txt | 5 --
7 files changed, 118 insertions(+), 109 deletions(-)
create mode 100644 examples/federated-kaplan-meier-fitter/examplefkm/__init__.py
rename examples/federated-kaplan-meier-fitter/{client.py => examplefkm/client_app.py} (54%)
rename examples/federated-kaplan-meier-fitter/{server.py => examplefkm/server_app.py} (83%)
create mode 100644 examples/federated-kaplan-meier-fitter/examplefkm/task.py
delete mode 100644 examples/federated-kaplan-meier-fitter/requirements.txt
diff --git a/examples/federated-kaplan-meier-fitter/README.md b/examples/federated-kaplan-meier-fitter/README.md
index 20d4ca4c47af..1964ec4e5653 100644
--- a/examples/federated-kaplan-meier-fitter/README.md
+++ b/examples/federated-kaplan-meier-fitter/README.md
@@ -4,9 +4,9 @@ dataset: [Waltons]
framework: [lifelines]
---
-# Flower Example using KaplanMeierFitter
+# Federated Survival Analysis with Flower and KaplanMeierFitter
-This is an introductory example on **federated survival analysis** using [Flower](https://flower.ai/)
+This is an introductory example of **federated survival analysis** using [Flower](https://flower.ai/)
and the [lifelines](https://lifelines.readthedocs.io/en/stable/index.html) library.
The aim of this example is to estimate the survival function using the
@@ -25,86 +25,60 @@ the group it comes from therefore to simulate the division that might occur.
-## Project Setup
+## Set up the project
-Start by cloning the example project. We prepared a single-line command that you can copy into your shell which will checkout the example for you:
+### Clone the project
-```shell
-$ git clone --depth=1 https://github.com/adap/flower.git _tmp && mv _tmp/examples/federated-kaplan-meier-fitter . && rm -rf _tmp && cd federated-kaplan-meier-fitter
-```
-
-This will create a new directory called `federated-kaplan-meier-fitter` containing the following files:
-
-```shell
--- pyproject.toml
--- requirements.txt
--- client.py
--- server.py
--- centralized.py
--- README.md
-```
-
-### Installing Dependencies
-
-Project dependencies (such as `lifelines` and `flwr`) are defined in `pyproject.toml` and `requirements.txt`. We recommend [Poetry](https://python-poetry.org/docs/) to install those dependencies and manage your virtual environment ([Poetry installation](https://python-poetry.org/docs/#installation)) or [pip](https://pip.pypa.io/en/latest/development/), but feel free to use a different way of installing dependencies and managing virtual environments if you have other preferences.
-
-#### Poetry
-
-```shell
-poetry install
-poetry shell
-```
-
-Poetry will install all your dependencies in a newly created virtual environment. To verify that everything works correctly you can run the following command:
+Start by cloning the example project:
```shell
-poetry run python3 -c "import flwr"
+$ git clone --depth=1 https://github.com/adap/flower.git _tmp && mv _tmp/examples/federated-kaplan-meier-fitter . && rm -rf _tmp && cd federated-kaplan-meier-fitter
```
-If you don't see any errors you're good to go!
-
-#### pip
-
-Write the command below in your terminal to install the dependencies according to the configuration file requirements.txt.
+This will create a new directory called `federated-kaplan-meier-fitter` with the following structure:
```shell
-pip install -r requirements.txt
+federated-kaplan-meier-fitter
+├── examplefkm
+│ ├── __init__.py
+│ ├── client_app.py # Defines your ClientApp
+│ ├── server_app.py # Defines your ServerApp
+│ └── task.py # Defines your model, training and data loading
+├── centralized.py   # Runs the centralized (non-federated) version
+├── pyproject.toml # Project metadata like dependencies and configs
+└── README.md
```
-## Run Federated Survival Analysis with Flower and lifelines's KaplanMeierFitter
+### Install dependencies and project
-### Start the long-running Flower server (SuperLink)
+Install the dependencies defined in `pyproject.toml` as well as the `examplefkm` package.
```bash
-flower-superlink --insecure
+pip install -e .
```
-### Start the long-running Flower client (SuperNode)
-
-In a new terminal window, start the first long-running Flower client:
+## Run the project
-```bash
-flower-client-app client:node_1_app --insecure
-```
+You can run your Flower project in both _simulation_ and _deployment_ mode without making changes to the code. If you are starting with Flower, we recommend using the _simulation_ mode as it requires fewer components to be launched manually. By default, `flwr run` will make use of the Simulation Engine.
-In yet another new terminal window, start the second long-running Flower client:
+### Run with the Simulation Engine
```bash
-flower-client-app client:node_2_app --insecure
+flwr run .
```
-### Run the Flower App
-
-With both the long-running server (SuperLink) and two clients (SuperNode) up and running, we can now run the actual Flower App:
+You can also override some of the settings for your `ClientApp` and `ServerApp` defined in `pyproject.toml`. For example:
```bash
-flower-server-app server:app --insecure
+flwr run . --run-config num-server-rounds=5,min-num-clients=2
```
-You will see that the server is printing survival function, median survival time and saves the plot with the survival function.
-
You can also check that the results match the centralized version.
```shell
$ python3 centralized.py
```
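+
+For illustration, here is a minimal sketch of what the centralized computation looks like (see `centralized.py` for the actual script):
+
+```python
+from lifelines import KaplanMeierFitter
+from lifelines.datasets import load_waltons
+
+df = load_waltons()  # columns: "T" (durations), "E" (event observed), "group"
+fitter = KaplanMeierFitter()
+fitter.fit(df["T"], df["E"])
+print(fitter.survival_function_)
+print(fitter.median_survival_time_)
+```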
+
+### Run with the Deployment Engine
+
+> \[!NOTE\]
+> An update to this example will show how to run this Flower application with the Deployment Engine and TLS certificates, or with Docker.
diff --git a/examples/federated-kaplan-meier-fitter/examplefkm/__init__.py b/examples/federated-kaplan-meier-fitter/examplefkm/__init__.py
new file mode 100644
index 000000000000..794b6a6600e9
--- /dev/null
+++ b/examples/federated-kaplan-meier-fitter/examplefkm/__init__.py
@@ -0,0 +1 @@
+"""federated-kaplan-feier-fitter."""
diff --git a/examples/federated-kaplan-meier-fitter/client.py b/examples/federated-kaplan-meier-fitter/examplefkm/client_app.py
similarity index 54%
rename from examples/federated-kaplan-meier-fitter/client.py
rename to examples/federated-kaplan-meier-fitter/examplefkm/client_app.py
index 948492efc575..ea744af85be8 100644
--- a/examples/federated-kaplan-meier-fitter/client.py
+++ b/examples/federated-kaplan-meier-fitter/examplefkm/client_app.py
@@ -1,11 +1,13 @@
+"""examplefkm: A Flower / Lifelines app."""
+
from typing import Dict, List, Tuple
import flwr as fl
import numpy as np
-from datasets import Dataset
-from flwr.common import NDArray, NDArrays
-from flwr_datasets.partitioner import NaturalIdPartitioner
-from lifelines.datasets import load_waltons
+from flwr.client import Client, ClientApp
+from flwr.common import NDArray, NDArrays, Context
+
+from examplefkm.task import load_partition
class FlowerClient(fl.client.NumPyClient):
@@ -40,26 +42,17 @@ def fit(
)
-# Prepare data
-X = load_waltons()
-partitioner = NaturalIdPartitioner(partition_by="group")
-partitioner.dataset = Dataset.from_pandas(X)
-
+def client_fn(context: Context) -> Client:
+ """Construct a Client that will be run in a ClientApp.
-def get_client_fn(partition_id: int):
- def client_fn(cid: str):
- partition = partitioner.load_partition(partition_id).to_pandas()
- events = partition["E"].values
- times = partition["T"].values
- return FlowerClient(times=times, events=events).to_client()
-
- return client_fn
+ You can use settings in `context.run_config` to parameterize the
+ construction of your Client. You could use the `context.node_config` to, for
+    example, indicate which dataset partition to load (e.g., by accessing the partition-id).
+ """
+ partition_id = context.node_config["partition-id"]
+ times, events = load_partition(partition_id)
+ return FlowerClient(times=times, events=events).to_client()
-# Run via `flower-client-app client:app`
-node_1_app = fl.client.ClientApp(
- client_fn=get_client_fn(0),
-)
-node_2_app = fl.client.ClientApp(
- client_fn=get_client_fn(1),
-)
+# Flower ClientApp
+app = ClientApp(client_fn=client_fn)
diff --git a/examples/federated-kaplan-meier-fitter/server.py b/examples/federated-kaplan-meier-fitter/examplefkm/server_app.py
similarity index 83%
rename from examples/federated-kaplan-meier-fitter/server.py
rename to examples/federated-kaplan-meier-fitter/examplefkm/server_app.py
index e1f84a961bf1..2515e8ea852d 100644
--- a/examples/federated-kaplan-meier-fitter/server.py
+++ b/examples/federated-kaplan-meier-fitter/examplefkm/server_app.py
@@ -16,8 +16,6 @@
from typing import Any, Dict, List, Optional, Tuple, Union
-import flwr as fl
-import matplotlib.pyplot as plt
import numpy as np
from flwr.common import (
EvaluateIns,
@@ -27,7 +25,9 @@
Parameters,
Scalar,
parameters_to_ndarrays,
+ Context,
)
+from flwr.server import ServerApp, ServerConfig, ServerAppComponents
from flwr.server.client_manager import ClientManager
from flwr.server.client_proxy import ClientProxy
from flwr.server.strategy import Strategy
@@ -66,7 +66,7 @@ def configure_fit(
config = {}
fit_ins = FitIns(parameters, config)
clients = client_manager.sample(
- num_clients=client_manager.num_available(),
+ num_clients=self._min_num_clients,
min_num_clients=self._min_num_clients,
)
return [(client, fit_ins) for client in clients]
@@ -99,9 +99,6 @@ def aggregate_fit(
self.fitter.fit(sorted_times, sorted_events)
print("Survival function:")
print(self.fitter.survival_function_)
- self.fitter.plot_survival_function()
- plt.title("Survival function of fruit flies (Walton's data)", fontsize=16)
- plt.savefig("./_static/survival_function_federated.png", dpi=200)
print("Mean survival time:")
print(self.fitter.median_survival_time_)
return None, {}
@@ -136,10 +133,25 @@ def configure_evaluate(
return []
-fitter = KaplanMeierFitter() # You can choose other method that work on E, T data
-strategy = EventTimeFitterStrategy(min_num_clients=2, fitter=fitter)
+def server_fn(context: Context) -> ServerAppComponents:
+ """Construct components that set the ServerApp behaviour.
-app = fl.server.ServerApp(
- config=fl.server.ServerConfig(num_rounds=1),
- strategy=strategy,
-)
+ You can use settings in `context.run_config` to parameterize the
+    construction of all elements (e.g. the strategy or the number of rounds)
+ wrapped in the returned ServerAppComponents object.
+ """
+
+ # Define the strategy
+    fitter = KaplanMeierFitter()  # You can choose another fitter that works on E, T data
+ min_num_clients = context.run_config["min-num-clients"]
+ strategy = EventTimeFitterStrategy(min_num_clients=min_num_clients, fitter=fitter)
+
+ # Construct ServerConfig
+ num_rounds = context.run_config["num-server-rounds"]
+ config = ServerConfig(num_rounds=num_rounds)
+
+ return ServerAppComponents(strategy=strategy, config=config)
+
+
+# Create ServerApp
+app = ServerApp(server_fn=server_fn)
diff --git a/examples/federated-kaplan-meier-fitter/examplefkm/task.py b/examples/federated-kaplan-meier-fitter/examplefkm/task.py
new file mode 100644
index 000000000000..d76dc79c6724
--- /dev/null
+++ b/examples/federated-kaplan-meier-fitter/examplefkm/task.py
@@ -0,0 +1,17 @@
+"""examplefkm: A Flower / Lifelines app."""
+
+from lifelines.datasets import load_waltons
+
+from flwr_datasets.partitioner import NaturalIdPartitioner
+from datasets import Dataset
+
+X = load_waltons()
+
+
+def load_partition(partition_id: int):
+ partitioner = NaturalIdPartitioner(partition_by="group")
+ partitioner.dataset = Dataset.from_pandas(X)
+ partition = partitioner.load_partition(partition_id).to_pandas()
+ times = partition["T"].values
+ events = partition["E"].values
+ return times, events
diff --git a/examples/federated-kaplan-meier-fitter/pyproject.toml b/examples/federated-kaplan-meier-fitter/pyproject.toml
index 8fe354ffb750..47cb0a4ba286 100644
--- a/examples/federated-kaplan-meier-fitter/pyproject.toml
+++ b/examples/federated-kaplan-meier-fitter/pyproject.toml
@@ -1,18 +1,35 @@
[build-system]
-requires = ["poetry-core>=1.4.0"]
-build-backend = "poetry.core.masonry.api"
+requires = ["hatchling"]
+build-backend = "hatchling.build"
-[tool.poetry]
+[project]
name = "federated-kaplan-meier-fitter"
-version = "0.1.0"
+version = "1.0.0"
description = "Federated Kaplan Meier Fitter with Flower"
-authors = ["The Flower Authors "]
-maintainers = ["The Flower Authors "]
+license = "Apache-2.0"
+dependencies = [
+ "flwr[simulation]>=1.10.0",
+ "flwr-datasets>=0.3.0",
+ "numpy>=1.23.2",
+ "pandas>=2.0.0",
+ "lifelines>=0.28.0",
+]
+[tool.hatch.build.targets.wheel]
+packages = ["."]
-[tool.poetry.dependencies]
-python = ">=3.9,<3.11"
-flwr-nightly = "*"
-flwr-datasets = ">=0.0.2,<1.0.0"
-numpy = ">=1.23.2"
-pandas = ">=2.0.0"
-lifelines = ">=0.28.0"
+[tool.flwr.app]
+publisher = "flwrlabs"
+
+[tool.flwr.app.components]
+serverapp = "examplefkm.server_app:app"
+clientapp = "examplefkm.client_app:app"
+
+[tool.flwr.app.config]
+min-num-clients = 2
+num-server-rounds = 1
+
+[tool.flwr.federations]
+default = "local-simulation"
+
+[tool.flwr.federations.local-simulation]
+options.num-supernodes = 2
diff --git a/examples/federated-kaplan-meier-fitter/requirements.txt b/examples/federated-kaplan-meier-fitter/requirements.txt
deleted file mode 100644
index cc8146545c7b..000000000000
--- a/examples/federated-kaplan-meier-fitter/requirements.txt
+++ /dev/null
@@ -1,5 +0,0 @@
-flwr-nightly
-flwr-datasets>=0.0.2, <1.0.0
-numpy>=1.23.2
-pandas>=2.0.0
-lifelines>=0.28.0
From 71c4eed28d31a17c19c863917913a666d44fd6e9 Mon Sep 17 00:00:00 2001
From: Javier
Date: Sat, 24 Aug 2024 14:31:52 +0100
Subject: [PATCH 02/42] docs(framework) Add note to `flower-server-app` CLI ref
(#4076)
---
doc/source/ref-api-cli.rst | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/doc/source/ref-api-cli.rst b/doc/source/ref-api-cli.rst
index ff1a9606f58d..95664b2f490a 100644
--- a/doc/source/ref-api-cli.rst
+++ b/doc/source/ref-api-cli.rst
@@ -45,6 +45,12 @@ flower-supernode
flower-server-app
~~~~~~~~~~~~~~~~~
+.. note::
+ Note that since version :code:`1.11.0`, :code:`flower-server-app` no longer supports passing a reference to a `ServerApp` attribute.
+    Instead, you need to pass the path to a Flower app via the argument :code:`--app`.
+ This is the path to a directory containing a `pyproject.toml`.
+ You can create a valid Flower app by executing :code:`flwr new` and following the prompt.
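+
+    For example, assuming :code:`./my-app` is such a directory (e.g. one created with :code:`flwr new`):
+
+    .. code-block:: shell
+
+        flower-server-app --app ./my-app --insecure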
+
.. argparse::
:module: flwr.server.run_serverapp
:func: _parse_args_run_server_app
From 5be5b1d09393db2937d6ff94e9ca874a15c1bc04 Mon Sep 17 00:00:00 2001
From: Heng Pan
Date: Sat, 24 Aug 2024 14:37:26 +0100
Subject: [PATCH 03/42] feat(framework:skip) Print prompts when users pass a
reference instead of a path to `flower-server-app` (#4077)
---
src/py/flwr/server/run_serverapp.py | 16 +++++++++++++++-
1 file changed, 15 insertions(+), 1 deletion(-)
diff --git a/src/py/flwr/server/run_serverapp.py b/src/py/flwr/server/run_serverapp.py
index 8f67c917c8ed..d9c363245a2e 100644
--- a/src/py/flwr/server/run_serverapp.py
+++ b/src/py/flwr/server/run_serverapp.py
@@ -97,6 +97,21 @@ def run_server_app() -> None:
args = _parse_args_run_server_app().parse_args()
+ # Check if the server app reference is passed.
+ # Since Flower 1.11, passing a reference is not allowed.
+ app_path: Optional[str] = args.app
+ # If the provided app_path doesn't exist, and contains a ":",
+ # it is likely to be a server app reference instead of a path.
+ if app_path is not None and not Path(app_path).exists() and ":" in app_path:
+ sys.exit(
+ "It appears you've passed a reference like `server:app`.\n\n"
+ "Note that since version `1.11.0`, `flower-server-app` no longer supports "
+ "passing a reference to a `ServerApp` attribute. Instead, you need to pass "
+        "the path to a Flower app via the argument `--app`. This is the path to a "
+ "directory containing a `pyproject.toml`. You can create a valid Flower "
+ "app by executing `flwr new` and following the prompt."
+ )
+
if args.server != ADDRESS_DRIVER_API:
warn = "Passing flag --server is deprecated. Use --superlink instead."
warn_deprecated_feature(warn)
@@ -151,7 +166,6 @@ def run_server_app() -> None:
cert_path,
)
- app_path: Optional[str] = args.app
if not (app_path is None) ^ (args.run_id is None):
raise sys.exit(
"Please provide either a Flower App path or a Run ID, but not both. "
From dbe957018153074ae412e99307550e9752a602b0 Mon Sep 17 00:00:00 2001
From: Chong Shen Ng
Date: Sat, 24 Aug 2024 17:17:40 +0100
Subject: [PATCH 04/42] refactor(examples) Update Flower example using Hugging
Face (#3754)
Co-authored-by: jafermarq
---
examples/quickstart-huggingface/README.md | 83 ++++++-----
examples/quickstart-huggingface/client.py | 129 ------------------
.../huggingface_example/__init__.py | 1 +
.../huggingface_example/client_app.py | 58 ++++++++
.../huggingface_example/server_app.py | 33 +++++
.../huggingface_example/task.py | 105 ++++++++++++++
.../quickstart-huggingface/pyproject.toml | 61 ++++++---
.../quickstart-huggingface/requirements.txt | 7 -
examples/quickstart-huggingface/run.sh | 15 --
examples/quickstart-huggingface/server.py | 15 --
10 files changed, 282 insertions(+), 225 deletions(-)
delete mode 100644 examples/quickstart-huggingface/client.py
create mode 100644 examples/quickstart-huggingface/huggingface_example/__init__.py
create mode 100644 examples/quickstart-huggingface/huggingface_example/client_app.py
create mode 100644 examples/quickstart-huggingface/huggingface_example/server_app.py
create mode 100644 examples/quickstart-huggingface/huggingface_example/task.py
delete mode 100644 examples/quickstart-huggingface/requirements.txt
delete mode 100755 examples/quickstart-huggingface/run.sh
delete mode 100644 examples/quickstart-huggingface/server.py
diff --git a/examples/quickstart-huggingface/README.md b/examples/quickstart-huggingface/README.md
index fa4330040ea7..ac0acebb9b99 100644
--- a/examples/quickstart-huggingface/README.md
+++ b/examples/quickstart-huggingface/README.md
@@ -4,77 +4,76 @@ dataset: [IMDB]
framework: [transformers]
---
-# Federated HuggingFace Transformers using Flower and PyTorch
+# Federated Learning with HuggingFace Transformers and Flower (Quickstart Example)
-This introductory example to using [HuggingFace](https://huggingface.co) Transformers with Flower with PyTorch. This example has been extended from the [quickstart-pytorch](https://flower.ai/docs/examples/quickstart-pytorch.html) example. The training script closely follows the [HuggingFace course](https://huggingface.co/course/chapter3?fw=pt), so you are encouraged to check that out for a detailed explanation of the transformer pipeline.
+This is an introductory example of using [🤗Transformers](https://huggingface.co/docs/transformers/en/index) with Flower. The training script closely follows the [HuggingFace course](https://huggingface.co/course/chapter3?fw=pt), so you are encouraged to check that out for a detailed explanation of the transformer pipeline.
-Like `quickstart-pytorch`, running this example in itself is also meant to be quite easy.
+In this example, we will federate the training of a [DistilBERT](https://huggingface.co/distilbert/distilbert-base-uncased) model on the [IMDB](https://huggingface.co/datasets/stanfordnlp/imdb) dataset. The data will be downloaded and partitioned using [Flower Datasets](https://flower.ai/docs/datasets/). This example runs best when a GPU is available.
-## Project Setup
+## Set up the project
+
+### Clone the project
Start by cloning the example project. We prepared a single-line command that you can copy into your shell which will checkout the example for you:
```shell
-git clone --depth=1 https://github.com/adap/flower.git && mv flower/examples/quickstart-huggingface . && rm -rf flower && cd quickstart-huggingface
+git clone --depth=1 https://github.com/adap/flower.git _tmp \
+ && mv _tmp/examples/quickstart-huggingface . \
+ && rm -rf _tmp && cd quickstart-huggingface
```
This will create a new directory called `quickstart-huggingface` containing the following files:
```shell
--- pyproject.toml
--- requirements.txt
--- client.py
--- server.py
--- README.md
+quickstart-huggingface
+├── huggingface_example
+│ ├── __init__.py
+│ ├── client_app.py # Defines your ClientApp
+│ ├── server_app.py # Defines your ServerApp
+│ └── task.py # Defines your model, training and data loading
+├── pyproject.toml # Project metadata like dependencies and configs
+└── README.md
```
-### Installing Dependencies
-
-Project dependencies (such as `torch` and `flwr`) are defined in `pyproject.toml` and `requirements.txt`. We recommend [Poetry](https://python-poetry.org/docs/) to install those dependencies and manage your virtual environment ([Poetry installation](https://python-poetry.org/docs/#installation)) or [pip](https://pip.pypa.io/en/latest/development/), but feel free to use a different way of installing dependencies and managing virtual environments if you have other preferences.
+### Install dependencies and project
-#### Poetry
+Install the dependencies defined in `pyproject.toml` as well as the `huggingface_example` package.
-```shell
-poetry install
-poetry shell
+```bash
+pip install -e .
```
-Poetry will install all your dependencies in a newly created virtual environment. To verify that everything works correctly you can run the following command:
+## Run the Example
-```shell
-poetry run python3 -c "import flwr"
-```
-
-If you don't see any errors you're good to go!
+You can run your Flower project in both _simulation_ and _deployment_ mode without making changes to the code. If you are starting with Flower, we recommend using the _simulation_ mode as it requires fewer components to be launched manually. By default, `flwr run` will make use of the Simulation Engine.
-#### pip
+### Run with the Simulation Engine
-Write the command below in your terminal to install the dependencies according to the configuration file requirements.txt.
+> \[!TIP\]
+> This example runs faster when the `ClientApp`s have access to a GPU. If your system has one, you can make use of it by configuring the `backend.client-resources` component in `pyproject.toml`. If you want to try running the example with GPU right away, use the `local-simulation-gpu` federation as shown below.
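+
+For reference, this is how the `local-simulation-gpu` federation is defined at the bottom of this example's `pyproject.toml`:
+
+```toml
+[tool.flwr.federations.local-simulation-gpu]
+options.num-supernodes = 100
+options.backend.client-resources.num-cpus = 4   # each ClientApp is assumed to use 4 CPUs
+options.backend.client-resources.num-gpus = 1.0 # at most 1 ClientApp per GPU
+```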
-```shell
-pip install -r requirements.txt
+```bash
+# Run with the default federation (CPU only)
+flwr run .
```
-## Run Federated Learning with Flower
+Run the project in the `local-simulation-gpu` federation, which gives CPU and GPU resources to each `ClientApp`. By default, at most 1x`ClientApp` (using ~12 GB of VRAM) will run per available GPU. Note that you can adjust the degree of parallelism by modifying the `client-resources` specification.
-Afterwards you are ready to start the Flower server as well as the clients. You can simply start the server in a terminal as follows:
-
-```shell
-python3 server.py
+```bash
+# Run with the `local-simulation-gpu` federation
+flwr run . local-simulation-gpu
```
-Now you are ready to start the Flower clients which will participate in the learning. To do so simply open two more terminal windows and run the following commands.
-
-Start client 1 in the first terminal:
+You can also override some of the settings for your `ClientApp` and `ServerApp` defined in `pyproject.toml`. For example:
-```shell
-python3 client.py --partition-id 0
+```bash
+flwr run . --run-config num-server-rounds=5
```
-Start client 2 in the second terminal:
+> \[!TIP\]
+> For a more detailed walk-through, check out our [quickstart 🤗Transformers tutorial](https://flower.ai/docs/framework/tutorial-quickstart-huggingface.html).
-```shell
-python3 client.py --partition-id 1
-```
+### Run with the Deployment Engine
-You will see that PyTorch is starting a federated training.
+> \[!NOTE\]
+> An update to this example will show how to run this Flower project with the Deployment Engine and TLS certificates, or with Docker.
diff --git a/examples/quickstart-huggingface/client.py b/examples/quickstart-huggingface/client.py
deleted file mode 100644
index b880119d1c7c..000000000000
--- a/examples/quickstart-huggingface/client.py
+++ /dev/null
@@ -1,129 +0,0 @@
-import argparse
-import warnings
-from collections import OrderedDict
-
-import flwr as fl
-import torch
-from evaluate import load as load_metric
-from flwr_datasets import FederatedDataset
-from torch.optim import AdamW
-from torch.utils.data import DataLoader
-from transformers import (
- AutoModelForSequenceClassification,
- AutoTokenizer,
- DataCollatorWithPadding,
-)
-
-warnings.filterwarnings("ignore", category=UserWarning)
-DEVICE = torch.device("cpu")
-CHECKPOINT = "distilbert-base-uncased" # transformer model checkpoint
-
-
-def load_data(partition_id):
- """Load IMDB data (training and eval)"""
- fds = FederatedDataset(dataset="imdb", partitioners={"train": 1_000})
- partition = fds.load_partition(partition_id)
- # Divide data: 80% train, 20% test
- partition_train_test = partition.train_test_split(test_size=0.2, seed=42)
-
- tokenizer = AutoTokenizer.from_pretrained(CHECKPOINT, model_max_length=512)
-
- def tokenize_function(examples):
- return tokenizer(examples["text"], truncation=True)
-
- partition_train_test = partition_train_test.map(tokenize_function, batched=True)
- partition_train_test = partition_train_test.remove_columns("text")
- partition_train_test = partition_train_test.rename_column("label", "labels")
-
- data_collator = DataCollatorWithPadding(tokenizer=tokenizer)
- trainloader = DataLoader(
- partition_train_test["train"],
- shuffle=True,
- batch_size=32,
- collate_fn=data_collator,
- )
-
- testloader = DataLoader(
- partition_train_test["test"], batch_size=32, collate_fn=data_collator
- )
-
- return trainloader, testloader
-
-
-def train(net, trainloader, epochs):
- optimizer = AdamW(net.parameters(), lr=5e-5)
- net.train()
- for _ in range(epochs):
- for batch in trainloader:
- batch = {k: v.to(DEVICE) for k, v in batch.items()}
- outputs = net(**batch)
- loss = outputs.loss
- loss.backward()
- optimizer.step()
- optimizer.zero_grad()
-
-
-def test(net, testloader):
- metric = load_metric("accuracy")
- loss = 0
- net.eval()
- for batch in testloader:
- batch = {k: v.to(DEVICE) for k, v in batch.items()}
- with torch.no_grad():
- outputs = net(**batch)
- logits = outputs.logits
- loss += outputs.loss.item()
- predictions = torch.argmax(logits, dim=-1)
- metric.add_batch(predictions=predictions, references=batch["labels"])
- loss /= len(testloader.dataset)
- accuracy = metric.compute()["accuracy"]
- return loss, accuracy
-
-
-def main(partition_id):
- net = AutoModelForSequenceClassification.from_pretrained(
- CHECKPOINT, num_labels=2
- ).to(DEVICE)
-
- trainloader, testloader = load_data(partition_id)
-
- # Flower client
- class IMDBClient(fl.client.NumPyClient):
- def get_parameters(self, config):
- return [val.cpu().numpy() for _, val in net.state_dict().items()]
-
- def set_parameters(self, parameters):
- params_dict = zip(net.state_dict().keys(), parameters)
- state_dict = OrderedDict({k: torch.Tensor(v) for k, v in params_dict})
- net.load_state_dict(state_dict, strict=True)
-
- def fit(self, parameters, config):
- self.set_parameters(parameters)
- print("Training Started...")
- train(net, trainloader, epochs=1)
- print("Training Finished.")
- return self.get_parameters(config={}), len(trainloader), {}
-
- def evaluate(self, parameters, config):
- self.set_parameters(parameters)
- loss, accuracy = test(net, testloader)
- return float(loss), len(testloader), {"accuracy": float(accuracy)}
-
- # Start client
- fl.client.start_client(
- server_address="127.0.0.1:8080", client=IMDBClient().to_client()
- )
-
-
-if __name__ == "__main__":
- parser = argparse.ArgumentParser(description="Flower")
- parser.add_argument(
- "--partition-id",
- choices=list(range(1_000)),
- required=True,
- type=int,
- help="Partition of the dataset divided into 1,000 iid partitions created "
- "artificially.",
- )
- partition_id = parser.parse_args().partition_id
- main(partition_id)
diff --git a/examples/quickstart-huggingface/huggingface_example/__init__.py b/examples/quickstart-huggingface/huggingface_example/__init__.py
new file mode 100644
index 000000000000..6d897650c6bf
--- /dev/null
+++ b/examples/quickstart-huggingface/huggingface_example/__init__.py
@@ -0,0 +1 @@
+"""huggingface_example: A Flower / Hugging Face app."""
diff --git a/examples/quickstart-huggingface/huggingface_example/client_app.py b/examples/quickstart-huggingface/huggingface_example/client_app.py
new file mode 100644
index 000000000000..8989e52281ad
--- /dev/null
+++ b/examples/quickstart-huggingface/huggingface_example/client_app.py
@@ -0,0 +1,58 @@
+"""huggingface_example: A Flower / Hugging Face app."""
+
+import warnings
+
+import torch
+from flwr.client import Client, ClientApp, NumPyClient
+from flwr.common import Context
+from transformers import logging
+from huggingface_example.task import (
+ train,
+ test,
+ load_data,
+ set_params,
+ get_params,
+ get_model,
+)
+
+warnings.filterwarnings("ignore", category=FutureWarning)
+
+# Mute warnings reminding us that the model needs to be fine-tuned on a
+# downstream task, which is exactly what this example does.
+logging.set_verbosity_error()
+
+
+# Flower client
+class IMDBClient(NumPyClient):
+ def __init__(self, model_name, trainloader, testloader) -> None:
+ self.device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
+ self.trainloader = trainloader
+ self.testloader = testloader
+ self.net = get_model(model_name)
+ self.net.to(self.device)
+
+ def fit(self, parameters, config) -> tuple[list, int, dict]:
+ set_params(self.net, parameters)
+ train(self.net, self.trainloader, epochs=1, device=self.device)
+ return get_params(self.net), len(self.trainloader), {}
+
+ def evaluate(self, parameters, config) -> tuple[float, int, dict[str, float]]:
+ set_params(self.net, parameters)
+ loss, accuracy = test(self.net, self.testloader, device=self.device)
+ return float(loss), len(self.testloader), {"accuracy": float(accuracy)}
+
+
+def client_fn(context: Context) -> Client:
+ """Construct a Client that will be run in a ClientApp."""
+ # Read the node_config to fetch data partition associated to this node
+ partition_id = context.node_config["partition-id"]
+ num_partitions = context.node_config["num-partitions"]
+
+ # Read the run config to get settings to configure the Client
+ model_name = context.run_config["model-name"]
+ trainloader, testloader = load_data(partition_id, num_partitions, model_name)
+
+ return IMDBClient(model_name, trainloader, testloader).to_client()
+
+
+app = ClientApp(client_fn=client_fn)
diff --git a/examples/quickstart-huggingface/huggingface_example/server_app.py b/examples/quickstart-huggingface/huggingface_example/server_app.py
new file mode 100644
index 000000000000..d0db1b43fa36
--- /dev/null
+++ b/examples/quickstart-huggingface/huggingface_example/server_app.py
@@ -0,0 +1,33 @@
+"""huggingface_example: A Flower / Hugging Face app."""
+
+from flwr.common import Context, ndarrays_to_parameters
+from flwr.server import ServerApp, ServerAppComponents, ServerConfig
+from flwr.server.strategy import FedAvg
+
+from huggingface_example.task import get_params, get_model
+
+
+def server_fn(context: Context) -> ServerAppComponents:
+ """Construct components for ServerApp."""
+ # Construct ServerConfig
+ num_rounds = context.run_config["num-server-rounds"]
+ config = ServerConfig(num_rounds=num_rounds)
+
+ # Set global model initialization
+ model_name = context.run_config["model-name"]
+ ndarrays = get_params(get_model(model_name))
+ global_model_init = ndarrays_to_parameters(ndarrays)
+
+ # Define strategy
+ fraction_fit = context.run_config["fraction-fit"]
+ fraction_evaluate = context.run_config["fraction-evaluate"]
+ strategy = FedAvg(
+ fraction_fit=fraction_fit,
+ fraction_evaluate=fraction_evaluate,
+ initial_parameters=global_model_init,
+ )
+
+ return ServerAppComponents(config=config, strategy=strategy)
+
+
+app = ServerApp(server_fn=server_fn)
diff --git a/examples/quickstart-huggingface/huggingface_example/task.py b/examples/quickstart-huggingface/huggingface_example/task.py
new file mode 100644
index 000000000000..25304d134a67
--- /dev/null
+++ b/examples/quickstart-huggingface/huggingface_example/task.py
@@ -0,0 +1,105 @@
+"""huggingface_example: A Flower / Hugging Face app."""
+
+from typing import Any
+from collections import OrderedDict
+
+import torch
+from evaluate import load as load_metric
+from torch.optim import AdamW
+from torch.utils.data import DataLoader
+from transformers import (
+ AutoTokenizer,
+ DataCollatorWithPadding,
+ AutoModelForSequenceClassification,
+)
+from datasets.utils.logging import disable_progress_bar
+from flwr_datasets import FederatedDataset
+from flwr_datasets.partitioner import IidPartitioner
+
+
+disable_progress_bar()
+fds = None # Cache FederatedDataset
+
+
+def load_data(
+ partition_id: int, num_partitions: int, model_name: str
+) -> tuple[DataLoader[Any], DataLoader[Any]]:
+ """Load IMDB data (training and eval)"""
+ # Only initialize `FederatedDataset` once
+ global fds
+ if fds is None:
+ # Partition the IMDB dataset into N partitions
+ partitioner = IidPartitioner(num_partitions=num_partitions)
+ fds = FederatedDataset(
+ dataset="stanfordnlp/imdb", partitioners={"train": partitioner}
+ )
+ partition = fds.load_partition(partition_id)
+ # Divide data: 80% train, 20% test
+ partition_train_test = partition.train_test_split(test_size=0.2, seed=42)
+
+ tokenizer = AutoTokenizer.from_pretrained(model_name, model_max_length=512)
+
+ def tokenize_function(examples):
+ return tokenizer(examples["text"], truncation=True)
+
+ partition_train_test = partition_train_test.map(tokenize_function, batched=True)
+ partition_train_test = partition_train_test.remove_columns("text")
+ partition_train_test = partition_train_test.rename_column("label", "labels")
+
+ data_collator = DataCollatorWithPadding(tokenizer=tokenizer)
+ trainloader = DataLoader(
+ partition_train_test["train"],
+ shuffle=True,
+ batch_size=32,
+ collate_fn=data_collator,
+ )
+
+ testloader = DataLoader(
+ partition_train_test["test"], batch_size=32, collate_fn=data_collator
+ )
+
+ return trainloader, testloader
+
+
+def get_model(model_name):
+ return AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)
+
+
+def get_params(model):
+ return [val.cpu().numpy() for _, val in model.state_dict().items()]
+
+
+def set_params(model, parameters) -> None:
+ params_dict = zip(model.state_dict().keys(), parameters)
+ state_dict = OrderedDict({k: torch.Tensor(v) for k, v in params_dict})
+ model.load_state_dict(state_dict, strict=True)
+
+
+def train(net, trainloader, epochs, device) -> None:
+ optimizer = AdamW(net.parameters(), lr=5e-5)
+ net.train()
+ for _ in range(epochs):
+ for batch in trainloader:
+ batch = {k: v.to(device) for k, v in batch.items()}
+ outputs = net(**batch)
+ loss = outputs.loss
+ loss.backward()
+ optimizer.step()
+ optimizer.zero_grad()
+
+
+def test(net, testloader, device) -> tuple[Any | float, Any]:
+ metric = load_metric("accuracy")
+ loss = 0
+ net.eval()
+ for batch in testloader:
+ batch = {k: v.to(device) for k, v in batch.items()}
+ with torch.no_grad():
+ outputs = net(**batch)
+ logits = outputs.logits
+ loss += outputs.loss.item()
+ predictions = torch.argmax(logits, dim=-1)
+ metric.add_batch(predictions=predictions, references=batch["labels"])
+ loss /= len(testloader.dataset)
+ accuracy = metric.compute()["accuracy"]
+ return loss, accuracy
diff --git a/examples/quickstart-huggingface/pyproject.toml b/examples/quickstart-huggingface/pyproject.toml
index 2b46804d7b45..af48b2429635 100644
--- a/examples/quickstart-huggingface/pyproject.toml
+++ b/examples/quickstart-huggingface/pyproject.toml
@@ -1,22 +1,49 @@
[build-system]
-requires = ["poetry-core>=1.4.0"]
-build-backend = "poetry.core.masonry.api"
+requires = ["hatchling"]
+build-backend = "hatchling.build"
-[tool.poetry]
-name = "quickstart-huggingface"
-version = "0.1.0"
-description = "Hugging Face Transformers Federated Learning Quickstart with Flower"
+[project]
+name = "huggingface_example"
+version = "1.0.0"
+description = "Federated Learning with Hugginface Transformers and Flower (Quickstart Example)"
+license = "Apache-2.0"
authors = [
- "The Flower Authors ",
- "Kaushik Amar Das ",
+ { name = "The Flower Authors", email = "hello@flower.ai" },
+ { name = "Kaushik Amar Das", email = "kaushik.das@iiitg.ac.in" },
]
+dependencies = [
+ "flwr-nightly[simulation]==1.11.0.dev20240823",
+ "flwr-datasets>=0.3.0",
+ "torch==2.4.0",
+ "transformers>=4.30.0,<5.0",
+ "evaluate>=0.4.0,<1.0",
+ "datasets>=2.0.0, <3.0",
+ "scikit-learn>=1.3.1, <2.0",
+]
+
+[tool.hatch.build.targets.wheel]
+packages = ["."]
+
+[tool.flwr.app]
+publisher = "flwrlabs"
+
+[tool.flwr.app.components]
+serverapp = "huggingface_example.server_app:app"
+clientapp = "huggingface_example.client_app:app"
+
+[tool.flwr.app.config]
+num-server-rounds = 3
+model-name = "distilbert-base-uncased"
+fraction-fit = 0.05
+fraction-evaluate = 0.1
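+# With 100 supernodes (see federations below), fraction-fit = 0.05 samples
+# 5 clients for training and fraction-evaluate = 0.1 samples 10 for evaluation
+# in each round.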
+
+[tool.flwr.federations]
+default = "local-simulation"
+
+[tool.flwr.federations.local-simulation]
+options.num-supernodes = 100
-[tool.poetry.dependencies]
-python = ">=3.8,<3.11"
-flwr = ">=1.0,<2.0"
-flwr-datasets = ">=0.0.2,<1.0.0"
-torch = ">=1.13.1,<2.0"
-transformers = ">=4.30.0,<5.0"
-evaluate = ">=0.4.0,<1.0"
-datasets = ">=2.0.0, <3.0"
-scikit-learn = ">=1.3.1, <2.0"
+[tool.flwr.federations.local-simulation-gpu]
+options.num-supernodes = 100
+options.backend.client-resources.num-cpus = 4   # each ClientApp is assumed to use 4 CPUs
+options.backend.client-resources.num-gpus = 1.0 # at most 1 ClientApp will run on a given GPU (lower it to increase parallelism)
\ No newline at end of file
diff --git a/examples/quickstart-huggingface/requirements.txt b/examples/quickstart-huggingface/requirements.txt
deleted file mode 100644
index 3cd5735625ba..000000000000
--- a/examples/quickstart-huggingface/requirements.txt
+++ /dev/null
@@ -1,7 +0,0 @@
-flwr>=1.0, <2.0
-flwr-datasets>=0.0.2, <1.0.0
-torch>=1.13.1, <2.0
-transformers>=4.30.0, <5.0
-evaluate>=0.4.0, <1.0
-datasets>=2.0.0, <3.0
-scikit-learn>=1.3.1, <2.0
diff --git a/examples/quickstart-huggingface/run.sh b/examples/quickstart-huggingface/run.sh
deleted file mode 100755
index fa989eab1471..000000000000
--- a/examples/quickstart-huggingface/run.sh
+++ /dev/null
@@ -1,15 +0,0 @@
-#!/bin/bash
-
-echo "Starting server"
-python server.py &
-sleep 3 # Sleep for 3s to give the server enough time to start
-
-for i in `seq 0 1`; do
- echo "Starting client $i"
- python client.py --partition-id ${i}&
-done
-
-# This will allow you to use CTRL+C to stop all background processes
-trap "trap - SIGTERM && kill -- -$$" SIGINT SIGTERM
-# Wait for all background processes to complete
-wait
diff --git a/examples/quickstart-huggingface/server.py b/examples/quickstart-huggingface/server.py
deleted file mode 100644
index 4eeb9da7da75..000000000000
--- a/examples/quickstart-huggingface/server.py
+++ /dev/null
@@ -1,15 +0,0 @@
-import flwr as fl
-
-if __name__ == "__main__":
- # Define strategy
- strategy = fl.server.strategy.FedAvg(
- fraction_fit=1.0,
- fraction_evaluate=1.0,
- )
-
- # Start server
- fl.server.start_server(
- server_address="0.0.0.0:8080",
- config=fl.server.ServerConfig(num_rounds=3),
- strategy=strategy,
- )
From ecac7f54e659a3cad23f482bdc62f002db35ba4d Mon Sep 17 00:00:00 2001
From: Javier
Date: Sat, 24 Aug 2024 17:24:48 +0100
Subject: [PATCH 05/42] refactor(examples) Update `quickstart-monai` example
(#3934)
---
examples/quickstart-monai/.gitignore | 1 +
examples/quickstart-monai/README.md | 96 ++++-----
examples/quickstart-monai/client.py | 61 ------
examples/quickstart-monai/data.py | 158 --------------
examples/quickstart-monai/model.py | 33 ---
.../quickstart-monai/monaiexample/__init__.py | 0
.../monaiexample/client_app.py | 41 ++++
.../monaiexample/server_app.py | 46 ++++
.../quickstart-monai/monaiexample/task.py | 199 ++++++++++++++++++
examples/quickstart-monai/pyproject.toml | 58 +++--
examples/quickstart-monai/requirements.txt | 7 -
examples/quickstart-monai/run.sh | 19 --
examples/quickstart-monai/server.py | 25 ---
13 files changed, 369 insertions(+), 375 deletions(-)
delete mode 100644 examples/quickstart-monai/client.py
delete mode 100644 examples/quickstart-monai/data.py
delete mode 100644 examples/quickstart-monai/model.py
create mode 100644 examples/quickstart-monai/monaiexample/__init__.py
create mode 100644 examples/quickstart-monai/monaiexample/client_app.py
create mode 100644 examples/quickstart-monai/monaiexample/server_app.py
create mode 100644 examples/quickstart-monai/monaiexample/task.py
delete mode 100644 examples/quickstart-monai/requirements.txt
delete mode 100755 examples/quickstart-monai/run.sh
delete mode 100644 examples/quickstart-monai/server.py
diff --git a/examples/quickstart-monai/.gitignore b/examples/quickstart-monai/.gitignore
index a218cab9669e..2626387e2a4f 100644
--- a/examples/quickstart-monai/.gitignore
+++ b/examples/quickstart-monai/.gitignore
@@ -1 +1,2 @@
MedNIST*
+.data_download.lock
diff --git a/examples/quickstart-monai/README.md b/examples/quickstart-monai/README.md
index dc31f03e4b1b..c470a6a6c86f 100644
--- a/examples/quickstart-monai/README.md
+++ b/examples/quickstart-monai/README.md
@@ -4,88 +4,76 @@ dataset: [MedNIST]
framework: [MONAI]
---
-# Flower Example using MONAI
+# Federated Learning with MONAI and Flower (Quickstart Example)
This introductory example to Flower uses MONAI, but deep knowledge of MONAI is not necessarily required to run the example. However, it will help you understand how to adapt Flower to your use case.
-Running this example in itself is quite easy.
+Running this example in itself is quite easy. [MONAI](https://docs.monai.io/en/latest/index.html) (Medical Open Network for AI) is a PyTorch-based, open-source framework for deep learning in healthcare imaging, part of the PyTorch Ecosystem. This example uses a subset of the MedNIST dataset including 6 classes, as done in [MONAI's classification demo](https://colab.research.google.com/drive/1wy8XUSnNWlhDNazFdvGBHLfdkGvOHBKe). Each client trains a [DenseNet121](https://docs.monai.io/en/stable/networks.html#densenet121) from MONAI.
-[MONAI](https://docs.monai.io/en/latest/index.html)(Medical Open Network for AI) is a PyTorch-based, open-source framework for deep learning in healthcare imaging, part of the PyTorch Ecosystem.
+> \[!NOTE\]
+> This example uses [Flower Datasets](https://flower.ai/docs/datasets/) to partition the MedNIST dataset. It's a good example of how to bring any dataset into Flower and partition it using any of the built-in [partitioners](https://flower.ai/docs/datasets/ref-api/flwr_datasets.partitioner.html) (e.g. `DirichletPartitioner`, `PathologicalPartitioner`). Learn [how to use partitioners](https://flower.ai/docs/datasets/tutorial-use-partitioners.html) in a step-by-step tutorial.
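+
+As a minimal sketch (assuming a pandas DataFrame `df` holding your dataset), the pattern looks like this:
+
+```python
+from datasets import Dataset
+from flwr_datasets.partitioner import IidPartitioner
+
+# `df` is a placeholder for your own data
+partitioner = IidPartitioner(num_partitions=10)
+partitioner.dataset = Dataset.from_pandas(df)
+partition = partitioner.load_partition(partition_id=0)
+```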
-Its ambitions are:
+## Set up the project
-- developing a community of academic, industrial and clinical researchers collaborating on a common foundation;
+### Clone the project
-- creating state-of-the-art, end-to-end training workflows for healthcare imaging;
-
-- providing researchers with an optimized and standardized way to create and evaluate deep learning models.
-
-## Project Setup
-
-Start by cloning the example project. We prepared a single-line command that you can copy into your shell which will checkout the example for you:
+Start by cloning the example project:
```shell
-git clone --depth=1 https://github.com/adap/flower.git _tmp && mv _tmp/examples/quickstart-monai . && rm -rf _tmp && cd quickstart-monai
+git clone --depth=1 https://github.com/adap/flower.git _tmp \
+ && mv _tmp/examples/quickstart-monai . \
+ && rm -rf _tmp \
+ && cd quickstart-monai
```
-This will create a new directory called `quickstart-monai` containing the following files:
+This will create a new directory called `quickstart-monai` with the following structure:
```shell
--- pyproject.toml
--- requirements.txt
--- client.py
--- data.py
--- model.py
--- server.py
--- README.md
+quickstart-monai
+├── monaiexample
+│ ├── __init__.py
+│ ├── client_app.py # Defines your ClientApp
+│ ├── server_app.py # Defines your ServerApp
+│ └── task.py # Defines your model, training and data loading
+├── pyproject.toml # Project metadata like dependencies and configs
+└── README.md
```
-### Installing Dependencies
-
-Project dependencies (such as `monai` and `flwr`) are defined in `pyproject.toml` and `requirements.txt`. We recommend [Poetry](https://python-poetry.org/docs/) to install those dependencies and manage your virtual environment ([Poetry installation](https://python-poetry.org/docs/#installation)) or [pip](https://pip.pypa.io/en/latest/development/), but feel free to use a different way of installing dependencies and managing virtual environments if you have other preferences.
+### Install dependencies and project
-#### Poetry
+Install the dependencies defined in `pyproject.toml` as well as the `monaiexample` package.
-```shell
-poetry install
-poetry shell
+```bash
+pip install -e .
```
-Poetry will install all your dependencies in a newly created virtual environment. To verify that everything works correctly you can run the following command:
-
-```shell
-poetry run python3 -c "import flwr"
-```
+## Run the project
-If you don't see any errors you're good to go!
+You can run your Flower project in both _simulation_ and _deployment_ mode without making changes to the code. If you are starting with Flower, we recommend using the _simulation_ mode as it requires fewer components to be launched manually. By default, `flwr run` will make use of the Simulation Engine.
-#### pip
+### Run with the Simulation Engine
-Write the command below in your terminal to install the dependencies according to the configuration file requirements.txt.
+> \[!TIP\]
+> This example runs faster when the `ClientApp`s have access to a GPU. If your system has one, you can make use of it by configuring the `backend.client-resources` component in `pyproject.toml`. If you want to try running the example with GPU right away, use the `local-simulation-gpu` federation as shown below.
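+
+A sketch of what such a federation could look like in `pyproject.toml` (the resource values here are illustrative; check this example's `pyproject.toml` for the actual ones):
+
+```toml
+[tool.flwr.federations.local-simulation-gpu]
+options.num-supernodes = 10
+options.backend.client-resources.num-cpus = 2    # assumed value for illustration
+options.backend.client-resources.num-gpus = 0.25 # at most 4 ClientApps share one GPU
+```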
-```shell
-pip install -r requirements.txt
+```bash
+# Run with the default federation (CPU only)
+flwr run .
```
-## Run Federated Learning with MONAI and Flower
-
-Afterwards you are ready to start the Flower server as well as the clients. You can simply start the server in a terminal as follows:
+Run the project in the `local-simulation-gpu` federation, which gives CPU and GPU resources to each `ClientApp`. By default, at most 4x`ClientApp` will run in parallel on the available GPU.
-```shell
-python3 server.py
+```bash
+# Run with the `local-simulation-gpu` federation
+flwr run . local-simulation-gpu
```
-Now you are ready to start the Flower clients which will participate in the learning. To do so simply open two more terminal windows and run the following commands. Clients will train a [DenseNet121](https://docs.monai.io/en/stable/networks.html#densenet121) from MONAI. If a GPU is present in your system, clients will use it.
-
-Start client 1 in the first terminal:
+You can also override some of the settings for your `ClientApp` and `ServerApp` defined in `pyproject.toml`. For example:
-```shell
-python3 client.py --partition-id 0
+```bash
+flwr run . --run-config num-server-rounds=5,batch-size=32
```
-Start client 2 in the second terminal:
-
-```shell
-python3 client.py --partition-id 1
-```
+### Run with the Deployment Engine
-You will see that the federated training is starting. Look at the [code](https://github.com/adap/flower/tree/main/examples/quickstart-monai) for a detailed explanation.
+> \[!NOTE\]
+> An update to this example will show how to run this Flower project with the Deployment Engine and TLS certificates, or with Docker.
diff --git a/examples/quickstart-monai/client.py b/examples/quickstart-monai/client.py
deleted file mode 100644
index 1401928af1ff..000000000000
--- a/examples/quickstart-monai/client.py
+++ /dev/null
@@ -1,61 +0,0 @@
-import argparse
-import warnings
-from collections import OrderedDict
-
-import flwr as fl
-import torch
-from monai.networks.nets.densenet import DenseNet121
-
-from data import load_data
-from model import test, train
-
-warnings.filterwarnings("ignore", category=UserWarning)
-DEVICE = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
-
-
-# Define Flower client
-class FlowerClient(fl.client.NumPyClient):
- def __init__(self, net, trainloader, testloader, device):
- self.net = net
- self.trainloader = trainloader
- self.testloader = testloader
- self.device = device
-
- def get_parameters(self, config):
- return [val.cpu().numpy() for _, val in self.net.state_dict().items()]
-
- def set_parameters(self, parameters):
- params_dict = zip(self.net.state_dict().keys(), parameters)
- state_dict = OrderedDict({k: torch.tensor(v) for k, v in params_dict})
- self.net.load_state_dict(state_dict, strict=True)
-
- def fit(self, parameters, config):
- self.set_parameters(parameters)
- train(self.net, self.trainloader, epoch_num=1, device=self.device)
- return self.get_parameters(config={}), len(self.trainloader), {}
-
- def evaluate(self, parameters, config):
- self.set_parameters(parameters)
- loss, accuracy = test(self.net, self.testloader, self.device)
- return loss, len(self.testloader), {"accuracy": accuracy}
-
-
-if __name__ == "__main__":
- total_partitions = 10
- parser = argparse.ArgumentParser()
- parser.add_argument(
- "--partition-id", type=int, choices=range(total_partitions), required=True
- )
- args = parser.parse_args()
-
- # Load model and data (simple CNN, CIFAR-10)
- trainloader, _, testloader, num_class = load_data(
- total_partitions, args.partition_id
- )
- net = DenseNet121(spatial_dims=2, in_channels=1, out_channels=num_class).to(DEVICE)
-
- # Start Flower client
- fl.client.start_numpy_client(
- server_address="127.0.0.1:8080",
- client=FlowerClient(net, trainloader, testloader, DEVICE),
- )
diff --git a/examples/quickstart-monai/data.py b/examples/quickstart-monai/data.py
deleted file mode 100644
index d184476522e8..000000000000
--- a/examples/quickstart-monai/data.py
+++ /dev/null
@@ -1,158 +0,0 @@
-import os
-import tarfile
-from urllib import request
-
-import numpy as np
-from monai.data import DataLoader, Dataset
-from monai.transforms import (
- Compose,
- EnsureChannelFirst,
- LoadImage,
- RandFlip,
- RandRotate,
- RandZoom,
- ScaleIntensity,
- ToTensor,
-)
-
-
-def _partition(files_list, labels_list, num_shards, index):
- total_size = len(files_list)
- assert total_size == len(
- labels_list
- ), f"List of datapoints and labels must be of the same length"
- shard_size = total_size // num_shards
-
- # Calculate start and end indices for the shard
- start_idx = index * shard_size
- if index == num_shards - 1:
- # Last shard takes the remainder
- end_idx = total_size
- else:
- end_idx = start_idx + shard_size
-
- # Create a subset for the shard
- files = files_list[start_idx:end_idx]
- labels = labels_list[start_idx:end_idx]
- return files, labels
-
-
-def load_data(num_shards, index):
- image_file_list, image_label_list, _, num_class = _download_data()
-
- # Get partition given index
- files_list, labels_list = _partition(
- image_file_list, image_label_list, num_shards, index
- )
-
- trainX, trainY, valX, valY, testX, testY = _split_data(
- files_list, labels_list, len(files_list)
- )
- train_transforms, val_transforms = _get_transforms()
-
- train_ds = MedNISTDataset(trainX, trainY, train_transforms)
- train_loader = DataLoader(train_ds, batch_size=300, shuffle=True)
-
- val_ds = MedNISTDataset(valX, valY, val_transforms)
- val_loader = DataLoader(val_ds, batch_size=300)
-
- test_ds = MedNISTDataset(testX, testY, val_transforms)
- test_loader = DataLoader(test_ds, batch_size=300)
-
- return train_loader, val_loader, test_loader, num_class
-
-
-class MedNISTDataset(Dataset):
- def __init__(self, image_files, labels, transforms):
- self.image_files = image_files
- self.labels = labels
- self.transforms = transforms
-
- def __len__(self):
- return len(self.image_files)
-
- def __getitem__(self, index):
- return self.transforms(self.image_files[index]), self.labels[index]
-
-
-def _download_data():
- data_dir = "./MedNIST/"
- _download_and_extract(
- "https://dl.dropboxusercontent.com/s/5wwskxctvcxiuea/MedNIST.tar.gz",
- os.path.join(data_dir),
- )
-
- class_names = sorted(
- [x for x in os.listdir(data_dir) if os.path.isdir(os.path.join(data_dir, x))]
- )
- num_class = len(class_names)
- image_files = [
- [
- os.path.join(data_dir, class_name, x)
- for x in os.listdir(os.path.join(data_dir, class_name))
- ]
- for class_name in class_names
- ]
- image_file_list = []
- image_label_list = []
- for i, class_name in enumerate(class_names):
- image_file_list.extend(image_files[i])
- image_label_list.extend([i] * len(image_files[i]))
- num_total = len(image_label_list)
- return image_file_list, image_label_list, num_total, num_class
-
-
-def _split_data(image_file_list, image_label_list, num_total):
- valid_frac, test_frac = 0.1, 0.1
- trainX, trainY = [], []
- valX, valY = [], []
- testX, testY = [], []
-
- for i in range(num_total):
- rann = np.random.random()
- if rann < valid_frac:
- valX.append(image_file_list[i])
- valY.append(image_label_list[i])
- elif rann < test_frac + valid_frac:
- testX.append(image_file_list[i])
- testY.append(image_label_list[i])
- else:
- trainX.append(image_file_list[i])
- trainY.append(image_label_list[i])
-
- return trainX, trainY, valX, valY, testX, testY
-
-
-def _get_transforms():
- train_transforms = Compose(
- [
- LoadImage(image_only=True),
- EnsureChannelFirst(),
- ScaleIntensity(),
- RandRotate(range_x=15, prob=0.5, keep_size=True),
- RandFlip(spatial_axis=0, prob=0.5),
- RandZoom(min_zoom=0.9, max_zoom=1.1, prob=0.5, keep_size=True),
- ToTensor(),
- ]
- )
-
- val_transforms = Compose(
- [LoadImage(image_only=True), EnsureChannelFirst(), ScaleIntensity(), ToTensor()]
- )
-
- return train_transforms, val_transforms
-
-
-def _download_and_extract(url, dest_folder):
- if not os.path.isdir(dest_folder):
- # Download the tar.gz file
- tar_gz_filename = url.split("/")[-1]
- if not os.path.isfile(tar_gz_filename):
- with request.urlopen(url) as response, open(
- tar_gz_filename, "wb"
- ) as out_file:
- out_file.write(response.read())
-
- # Extract the tar.gz file
- with tarfile.open(tar_gz_filename, "r:gz") as tar_ref:
- tar_ref.extractall()
diff --git a/examples/quickstart-monai/model.py b/examples/quickstart-monai/model.py
deleted file mode 100644
index 4c74d50553e4..000000000000
--- a/examples/quickstart-monai/model.py
+++ /dev/null
@@ -1,33 +0,0 @@
-import torch
-
-
-def train(model, train_loader, epoch_num, device):
- loss_function = torch.nn.CrossEntropyLoss()
- optimizer = torch.optim.Adam(model.parameters(), 1e-5)
- for _ in range(epoch_num):
- model.train()
- for inputs, labels in train_loader:
- optimizer.zero_grad()
- loss_function(model(inputs.to(device)), labels.to(device)).backward()
- optimizer.step()
-
-
-def test(model, test_loader, device):
- model.eval()
- loss = 0.0
- y_true = list()
- y_pred = list()
- loss_function = torch.nn.CrossEntropyLoss()
- with torch.no_grad():
- for test_images, test_labels in test_loader:
- out = model(test_images.to(device))
- test_labels = test_labels.to(device)
- loss += loss_function(out, test_labels).item()
- pred = out.argmax(dim=1)
- for i in range(len(pred)):
- y_true.append(test_labels[i].item())
- y_pred.append(pred[i].item())
- accuracy = sum([1 if t == p else 0 for t, p in zip(y_true, y_pred)]) / len(
- test_loader.dataset
- )
- return loss, accuracy
diff --git a/examples/quickstart-monai/monaiexample/__init__.py b/examples/quickstart-monai/monaiexample/__init__.py
new file mode 100644
index 000000000000..e69de29bb2d1
diff --git a/examples/quickstart-monai/monaiexample/client_app.py b/examples/quickstart-monai/monaiexample/client_app.py
new file mode 100644
index 000000000000..c0dcac0cdae2
--- /dev/null
+++ b/examples/quickstart-monai/monaiexample/client_app.py
@@ -0,0 +1,41 @@
+"""monaiexample: A Flower / MONAI app."""
+
+import torch
+from flwr.common import Context
+from flwr.client import NumPyClient, ClientApp
+
+from monaiexample.task import load_data, load_model, test, train, get_params, set_params
+
+
+# Define Flower client
+class FlowerClient(NumPyClient):
+ def __init__(self, net, trainloader, valloader):
+ self.net = net
+ self.trainloader = trainloader
+ self.valloader = valloader
+ self.device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
+
+ def fit(self, parameters, config):
+ set_params(self.net, parameters)
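+        # Train for a single local epoch per round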
+ train(self.net, self.trainloader, epoch_num=1, device=self.device)
+ return get_params(self.net), len(self.trainloader), {}
+
+ def evaluate(self, parameters, config):
+ set_params(self.net, parameters)
+ loss, accuracy = test(self.net, self.valloader, self.device)
+ return loss, len(self.valloader), {"accuracy": accuracy}
+
+
+def client_fn(context: Context):
+
+ partition_id = context.node_config["partition-id"]
+ num_partitions = context.node_config["num-partitions"]
+
+ batch_size = context.run_config["batch-size"]
+ trainloader, valloader = load_data(num_partitions, partition_id, batch_size)
+ net = load_model()
+
+ return FlowerClient(net, trainloader, valloader).to_client()
+
+
+app = ClientApp(client_fn=client_fn)
diff --git a/examples/quickstart-monai/monaiexample/server_app.py b/examples/quickstart-monai/monaiexample/server_app.py
new file mode 100644
index 000000000000..f68d3887a488
--- /dev/null
+++ b/examples/quickstart-monai/monaiexample/server_app.py
@@ -0,0 +1,46 @@
+"""monaiexample: A Flower / MONAI app."""
+
+from typing import List, Tuple
+
+from flwr.common import Metrics, Context, ndarrays_to_parameters
+from flwr.server import ServerApp, ServerAppComponents, ServerConfig
+from flwr.server.strategy import FedAvg
+
+from monaiexample.task import load_model, get_params
+
+
+# Define metric aggregation function
+def weighted_average(metrics: List[Tuple[int, Metrics]]) -> Metrics:
+ # Multiply accuracy of each client by number of examples used
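+    # e.g. metrics = [(10, {"accuracy": 0.8}), (30, {"accuracy": 0.6})] -> 0.65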
+ accuracies = [num_examples * m["accuracy"] for num_examples, m in metrics]
+ examples = [num_examples for num_examples, _ in metrics]
+
+ # Aggregate and return custom metric (weighted average)
+ return {"accuracy": sum(accuracies) / sum(examples)}
+
+
+def server_fn(context: Context):
+
+ # Init model
+ model = load_model()
+
+ # Convert model parameters to flwr.common.Parameters
+ ndarrays = get_params(model)
+ global_model_init = ndarrays_to_parameters(ndarrays)
+
+ # Define strategy
+ fraction_fit = context.run_config["fraction-fit"]
+ strategy = FedAvg(
+ fraction_fit=fraction_fit,
+ evaluate_metrics_aggregation_fn=weighted_average,
+ initial_parameters=global_model_init,
+ )
+
+ # Construct ServerConfig
+ num_rounds = context.run_config["num-server-rounds"]
+ config = ServerConfig(num_rounds=num_rounds)
+
+ return ServerAppComponents(strategy=strategy, config=config)
+
+
+app = ServerApp(server_fn=server_fn)
diff --git a/examples/quickstart-monai/monaiexample/task.py b/examples/quickstart-monai/monaiexample/task.py
new file mode 100644
index 000000000000..09597562a1f2
--- /dev/null
+++ b/examples/quickstart-monai/monaiexample/task.py
@@ -0,0 +1,199 @@
+"""monaiexample: A Flower / MONAI app."""
+
+import os
+import tarfile
+from urllib import request
+from collections import OrderedDict
+
+import torch
+import monai
+from monai.networks.nets import densenet
+from monai.transforms import (
+ Compose,
+ EnsureChannelFirst,
+ LoadImage,
+ RandFlip,
+ RandRotate,
+ RandZoom,
+ ScaleIntensity,
+ ToTensor,
+)
+from filelock import FileLock
+from datasets import Dataset
+from flwr_datasets.partitioner import IidPartitioner
+
+
+def load_model():
+ """Load a DenseNet12."""
+ return densenet.DenseNet121(spatial_dims=2, in_channels=1, out_channels=6)
+
+
+def get_params(model):
+ """Return tensors in the model's state_dict."""
+ return [val.cpu().numpy() for _, val in model.state_dict().items()]
+
+
+def set_params(model, ndarrays):
+ """Apply parameters to a model."""
+ params_dict = zip(model.state_dict().keys(), ndarrays)
+ state_dict = OrderedDict({k: torch.tensor(v) for k, v in params_dict})
+ model.load_state_dict(state_dict, strict=True)
+
+
+def train(model, train_loader, epoch_num, device):
+ """Train a model using the supplied dataloader."""
+ model.to(device)
+ loss_function = torch.nn.CrossEntropyLoss()
+ optimizer = torch.optim.Adam(model.parameters(), 1e-5)
+ for _ in range(epoch_num):
+ model.train()
+ for batch in train_loader:
+ images, labels = batch["img"], batch["label"]
+ optimizer.zero_grad()
+ loss_function(model(images.to(device)), labels.to(device)).backward()
+ optimizer.step()
+
+
+def test(model, test_loader, device):
+ """Evaluate a model on a held-out dataset."""
+ model.to(device)
+ model.eval()
+ loss = 0.0
+ y_true = list()
+ y_pred = list()
+ loss_function = torch.nn.CrossEntropyLoss()
+ with torch.no_grad():
+ for batch in test_loader:
+ images, labels = batch["img"], batch["label"]
+ out = model(images.to(device))
+ labels = labels.to(device)
+ loss += loss_function(out, labels).item()
+ pred = out.argmax(dim=1)
+ for i in range(len(pred)):
+ y_true.append(labels[i].item())
+ y_pred.append(pred[i].item())
+ accuracy = sum([1 if t == p else 0 for t, p in zip(y_true, y_pred)]) / len(
+ test_loader.dataset
+ )
+ return loss, accuracy
+
+
+def _get_transforms():
+ """Return transforms to be used for training and evaluation."""
+ train_transforms = Compose(
+ [
+ LoadImage(image_only=True),
+ EnsureChannelFirst(),
+ ScaleIntensity(),
+ RandRotate(range_x=15, prob=0.5, keep_size=True),
+ RandFlip(spatial_axis=0, prob=0.5),
+ RandZoom(min_zoom=0.9, max_zoom=1.1, prob=0.5, keep_size=True),
+ ToTensor(),
+ ]
+ )
+
+ val_transforms = Compose(
+ [LoadImage(image_only=True), EnsureChannelFirst(), ScaleIntensity(), ToTensor()]
+ )
+
+ return train_transforms, val_transforms
+
+
+def get_apply_transforms_fn(transforms_to_apply):
+ """Return a function that applies the transforms passed as input argument."""
+
+ def apply_transforms(batch):
+ """Apply transforms to the partition from FederatedDataset."""
+ batch["img"] = [transforms_to_apply(img) for img in batch["img_file"]]
+ return batch
+
+ return apply_transforms
+
+
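+# Module-level cache so the dataset and partitioner are only created once per process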
+ds = None
+partitioner = None
+
+
+def load_data(num_partitions, partition_id, batch_size):
+ """Download dataset, partition it and return data loader of specific partition."""
+ # Set dataset and partitioner only once
+ global ds, partitioner
+ if ds is None:
+ image_file_list, image_label_list = _download_data()
+
+ # Construct HuggingFace dataset
+ ds = Dataset.from_dict({"img_file": image_file_list, "label": image_label_list})
+ # Set partitioner
+ partitioner = IidPartitioner(num_partitions)
+ partitioner.dataset = ds
+
+ partition = partitioner.load_partition(partition_id)
+
+ # Split train/validation
+ partition_train_test = partition.train_test_split(test_size=0.2, seed=42)
+
+ # Get transforms
+ train_t, test_t = _get_transforms()
+
+ # Apply transforms individually to each split
+ train_partition = partition_train_test["train"]
+ test_partition = partition_train_test["test"]
+
+ partition_train = train_partition.with_transform(get_apply_transforms_fn(train_t))
+ partition_val = test_partition.with_transform(get_apply_transforms_fn(test_t))
+
+ # Create dataloaders
+ train_loader = monai.data.DataLoader(
+ partition_train, batch_size=batch_size, shuffle=True
+ )
+ val_loader = monai.data.DataLoader(partition_val, batch_size=batch_size)
+
+ return train_loader, val_loader
+
+
+def _download_data():
+ """Download and extract dataset."""
+ data_dir = "./MedNIST/"
+ _download_and_extract_if_needed(
+ "https://dl.dropboxusercontent.com/s/5wwskxctvcxiuea/MedNIST.tar.gz",
+        data_dir,
+ )
+
+    # Compute the list of files and their associated labels
+ class_names = sorted(
+ [x for x in os.listdir(data_dir) if os.path.isdir(os.path.join(data_dir, x))]
+ )
+ image_files = [
+ [
+ os.path.join(data_dir, class_name, x)
+ for x in os.listdir(os.path.join(data_dir, class_name))
+ ]
+ for class_name in class_names
+ ]
+ image_file_list = []
+ image_label_list = []
+ for i, _ in enumerate(class_names):
+ image_file_list.extend(image_files[i])
+ image_label_list.extend([i] * len(image_files[i]))
+
+ return image_file_list, image_label_list
+
+
+def _download_and_extract_if_needed(url, dest_folder):
+ """Download dataset if not present."""
+
+    # Use a file lock to prevent multiple processes (e.g. ClientApps)
+    # from downloading the dataset at the same time.
+ with FileLock(".data_download.lock"):
+ if not os.path.isdir(dest_folder):
+ # Download the tar.gz file
+ tar_gz_filename = url.split("/")[-1]
+ if not os.path.isfile(tar_gz_filename):
+ with request.urlopen(url) as response, open(
+ tar_gz_filename, "wb"
+ ) as out_file:
+ out_file.write(response.read())
+
+            # Extract the tar.gz file into the current working directory
+ with tarfile.open(tar_gz_filename, "r:gz") as tar_ref:
+ tar_ref.extractall()
diff --git a/examples/quickstart-monai/pyproject.toml b/examples/quickstart-monai/pyproject.toml
index 2b77a2fc061f..6ecf5011d24f 100644
--- a/examples/quickstart-monai/pyproject.toml
+++ b/examples/quickstart-monai/pyproject.toml
@@ -1,19 +1,41 @@
[build-system]
-requires = ["poetry-core>=1.4.0"]
-build-backend = "poetry.core.masonry.api"
-
-[tool.poetry]
-name = "quickstart-monai"
-version = "0.1.0"
-description = "MONAI Federated Learning Quickstart with Flower"
-authors = ["The Flower Authors "]
-
-[tool.poetry.dependencies]
-python = ">=3.8,<3.11"
-flwr = ">=1.0,<2.0"
-torch = "1.13.1"
-tqdm = "4.66.3"
-scikit-learn = "1.3.1"
-monai = { version = "1.3.0", extras=["gdown", "nibabel", "tqdm", "itk"] }
-numpy = "1.24.4"
-pillow = "10.2.0"
+requires = ["hatchling"]
+build-backend = "hatchling.build"
+
+[project]
+name = "monaiexample"
+version = "1.0.0"
+description = "Federated Learning with MONAI and Flower (Quickstart Example)"
+license = "Apache-2.0"
+dependencies = [
+ "flwr-nightly[simulation]==1.11.0.dev20240823",
+ "flwr-datasets[vision]>=0.3.0",
+ "monai==1.3.2",
+ "filelock==3.15.4",
+]
+
+[tool.hatch.build.targets.wheel]
+packages = ["."]
+
+[tool.flwr.app]
+publisher = "flwrlabs"
+
+[tool.flwr.app.components]
+serverapp = "monaiexample.server_app:app"
+clientapp = "monaiexample.client_app:app"
+
+[tool.flwr.app.config]
+num-server-rounds = 5
+fraction-fit = 0.5
+batch-size = 128
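+# The values above can be overridden at run time,
+# e.g. `flwr run . --run-config batch-size=64`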
+
+[tool.flwr.federations]
+default = "local-simulation"
+
+[tool.flwr.federations.local-simulation]
+options.num-supernodes = 10
+
+[tool.flwr.federations.local-simulation-gpu]
+options.num-supernodes = 10
+options.backend.client-resources.num-cpus = 4
+options.backend.client-resources.num-gpus = 0.25 # at most 4 ClientApps will run on a given GPU
diff --git a/examples/quickstart-monai/requirements.txt b/examples/quickstart-monai/requirements.txt
deleted file mode 100644
index e3f1e463c629..000000000000
--- a/examples/quickstart-monai/requirements.txt
+++ /dev/null
@@ -1,7 +0,0 @@
-flwr>=1.0, <2.0
-torch==1.13.1
-tqdm==4.65.0
-scikit-learn==1.3.1
-monai[gdown,nibabel,tqdm,itk]==1.3.0
-numpy==1.24.4
-pillow==10.2.0
diff --git a/examples/quickstart-monai/run.sh b/examples/quickstart-monai/run.sh
deleted file mode 100755
index 1da60bccb86d..000000000000
--- a/examples/quickstart-monai/run.sh
+++ /dev/null
@@ -1,19 +0,0 @@
-#!/bin/bash
-set -e
-cd "$( cd "$( dirname "${BASH_SOURCE[0]}" )" >/dev/null 2>&1 && pwd )"/
-
-python -c "from data import _download_data; _download_data()"
-
-echo "Starting server"
-python server.py &
-sleep 3 # Sleep for 3s to give the server enough time to start
-
-for i in `seq 0 1`; do
- echo "Starting client $i"
- python client.py --partition-id $i &
-done
-
-# Enable CTRL+C to stop all background processes
-trap "trap - SIGTERM && kill -- -$$" SIGINT SIGTERM
-# Wait for all background processes to complete
-wait
diff --git a/examples/quickstart-monai/server.py b/examples/quickstart-monai/server.py
deleted file mode 100644
index fe691a88aba0..000000000000
--- a/examples/quickstart-monai/server.py
+++ /dev/null
@@ -1,25 +0,0 @@
-from typing import List, Tuple
-
-import flwr as fl
-from flwr.common import Metrics
-
-
-# Define metric aggregation function
-def weighted_average(metrics: List[Tuple[int, Metrics]]) -> Metrics:
- # Multiply accuracy of each client by number of examples used
- accuracies = [num_examples * m["accuracy"] for num_examples, m in metrics]
- examples = [num_examples for num_examples, _ in metrics]
-
- # Aggregate and return custom metric (weighted average)
- return {"accuracy": sum(accuracies) / sum(examples)}
-
-
-# Define strategy
-strategy = fl.server.strategy.FedAvg(evaluate_metrics_aggregation_fn=weighted_average)
-
-# Start Flower server
-fl.server.start_server(
- server_address="0.0.0.0:8080",
- config=fl.server.ServerConfig(num_rounds=3),
- strategy=strategy,
-)
From 75ea504aedeca48634eb957cd163a6e27c888d64 Mon Sep 17 00:00:00 2001
From: Javier
Date: Sat, 24 Aug 2024 17:31:33 +0100
Subject: [PATCH 06/42] refactor(examples) Update `vit-finetune` example
(#3935)
---
README.md | 2 +-
doc/source/conf.py | 1 +
.../README.md | 87 ++++++------
.../_static/central_evaluation.png | Bin
examples/flowertune-vit/pyproject.toml | 43 ++++++
.../flowertune-vit/vitexample/__init__.py | 1 +
.../flowertune-vit/vitexample/client_app.py | 62 +++++++++
.../flowertune-vit/vitexample/server_app.py | 77 ++++++++++
examples/flowertune-vit/vitexample/task.py | 131 ++++++++++++++++++
examples/vit-finetune/client.py | 80 -----------
examples/vit-finetune/dataset.py | 51 -------
examples/vit-finetune/main.py | 57 --------
examples/vit-finetune/model.py | 71 ----------
examples/vit-finetune/pyproject.toml | 17 ---
examples/vit-finetune/requirements.txt | 5 -
examples/vit-finetune/server.py | 61 --------
16 files changed, 362 insertions(+), 384 deletions(-)
rename examples/{vit-finetune => flowertune-vit}/README.md (56%)
rename examples/{vit-finetune => flowertune-vit}/_static/central_evaluation.png (100%)
create mode 100644 examples/flowertune-vit/pyproject.toml
create mode 100644 examples/flowertune-vit/vitexample/__init__.py
create mode 100644 examples/flowertune-vit/vitexample/client_app.py
create mode 100644 examples/flowertune-vit/vitexample/server_app.py
create mode 100644 examples/flowertune-vit/vitexample/task.py
delete mode 100644 examples/vit-finetune/client.py
delete mode 100644 examples/vit-finetune/dataset.py
delete mode 100644 examples/vit-finetune/main.py
delete mode 100644 examples/vit-finetune/model.py
delete mode 100644 examples/vit-finetune/pyproject.toml
delete mode 100644 examples/vit-finetune/requirements.txt
delete mode 100644 examples/vit-finetune/server.py
diff --git a/README.md b/README.md
index 1dd686e5f1b6..3f1d96ca53c0 100644
--- a/README.md
+++ b/README.md
@@ -144,7 +144,7 @@ Other [examples](https://github.com/adap/flower/tree/main/examples):
- [Vertical FL](https://github.com/adap/flower/tree/main/examples/vertical-fl)
- [Federated Finetuning of OpenAI's Whisper](https://github.com/adap/flower/tree/main/examples/whisper-federated-finetuning)
- [Federated Finetuning of Large Language Model](https://github.com/adap/flower/tree/main/examples/llm-flowertune)
-- [Federated Finetuning of a Vision Transformer](https://github.com/adap/flower/tree/main/examples/vit-finetune)
+- [Federated Finetuning of a Vision Transformer](https://github.com/adap/flower/tree/main/examples/flowertune-vit)
- [Advanced Flower with TensorFlow/Keras](https://github.com/adap/flower/tree/main/examples/advanced-tensorflow)
- [Advanced Flower with PyTorch](https://github.com/adap/flower/tree/main/examples/advanced-pytorch)
- Single-Machine Simulation of Federated Learning Systems ([PyTorch](https://github.com/adap/flower/tree/main/examples/simulation-pytorch)) ([Tensorflow](https://github.com/adap/flower/tree/main/examples/simulation-tensorflow))
diff --git a/doc/source/conf.py b/doc/source/conf.py
index d3881325a5ce..5d434dd729bb 100644
--- a/doc/source/conf.py
+++ b/doc/source/conf.py
@@ -195,6 +195,7 @@ def find_test_modules(package_path):
"apiref-binaries": "ref-api-cli.html",
"fedbn-example-pytorch-from-centralized-to-federated": "example-fedbn-pytorch-from-centralized-to-federated.html",
"how-to-use-built-in-middleware-layers": "how-to-use-built-in-mods.html",
+ "vit-finetune": "flowertune-vit.html",
# Restructuring: tutorials
"tutorial/Flower-0-What-is-FL": "tutorial-series-what-is-federated-learning.html",
"tutorial/Flower-1-Intro-to-FL-PyTorch": "tutorial-series-get-started-with-flower-pytorch.html",
diff --git a/examples/vit-finetune/README.md b/examples/flowertune-vit/README.md
similarity index 56%
rename from examples/vit-finetune/README.md
rename to examples/flowertune-vit/README.md
index 957c0eda0b68..9e2b0fd6b079 100644
--- a/examples/vit-finetune/README.md
+++ b/examples/flowertune-vit/README.md
@@ -1,68 +1,78 @@
---
-title: Federated finetuning of a ViT
-tags: [finetuneing, vision, fds]
+tags: [finetuning, vision, fds]
dataset: [Oxford Flower-102]
framework: [torch, torchvision]
---
-# Federated finetuning of a ViT
+# Federated Finetuning of a Vision Transformer with Flower
-This example shows how to use Flower's Simulation Engine to federate the finetuning of a Vision Transformer ([ViT-Base-16](https://pytorch.org/vision/main/models/generated/torchvision.models.vit_b_16.html#torchvision.models.vit_b_16)) that has been pretrained on ImageNet. To keep things simple we'll be finetuning it to [Oxford Flower-102](https://www.robots.ox.ac.uk/~vgg/data/flowers/102/index.html) datasset, creating 20 partitions using [Flower Datasets](https://flower.ai/docs/datasets/). We'll be finetuning just the exit `head` of the ViT, this means that the training is not that costly and each client requires just ~1GB of VRAM (for a batch size of 32 images).
+This example shows how to use Flower's Simulation Engine to federate the finetuning of a Vision Transformer ([ViT-Base-16](https://pytorch.org/vision/main/models/generated/torchvision.models.vit_b_16.html#torchvision.models.vit_b_16)) that has been pretrained on ImageNet. To keep things simple, we'll be finetuning it on the [Oxford Flower-102](https://www.robots.ox.ac.uk/~vgg/data/flowers/102/index.html) dataset, creating 20 partitions using [Flower Datasets](https://flower.ai/docs/datasets/). We'll be finetuning just the exit `head` of the ViT, which means that training is not that costly and each client requires just ~1GB of VRAM (for a batch size of 32 images) if you choose to use a GPU.
-## Running the example
+## Set up the project
-If you haven't cloned the Flower repository already you might want to clone code example and discard the rest. We prepared a single-line command that you can copy into your shell which will checkout the example for you:
+### Clone the project
+
+Start by cloning the example project:
```shell
-git clone --depth=1 https://github.com/adap/flower.git && mv flower/examples/vit-finetune . && rm -rf flower && cd vit-finetune
+git clone --depth=1 https://github.com/adap/flower.git _tmp \
+ && mv _tmp/examples/flowertune-vit . \
+ && rm -rf _tmp \
+ && cd flowertune-vit
```
-This will create a new directory called `vit-finetune` containing the following files:
+This will create a new directory called `flowertune-vit` with the following structure:
+```shell
+flowertune-vit
+├── vitexample
+│ ├── __init__.py
+│ ├── client_app.py # Defines your ClientApp
+│ ├── server_app.py # Defines your ServerApp
+│ └── task.py # Defines your model, training and data loading
+├── pyproject.toml # Project metadata like dependencies and configs
+└── README.md
```
--- README.md <- Your're reading this right now
--- main.py <- Main file that launches the simulation
--- client.py <- Contains Flower client code and ClientApp
--- server.py <- Contains Flower server code and ServerApp
--- model.py <- Defines model and train/eval functions
--- dataset.py <- Downloads, partitions and processes dataset
--- pyproject.toml <- Example dependencies, installable using Poetry
--- requirements.txt <- Example dependencies, installable using pip
-```
-
-### Installing Dependencies
-Project dependencies (such as `torch` and `flwr`) are defined in `pyproject.toml` and `requirements.txt`. We recommend [Poetry](https://python-poetry.org/docs/) to install those dependencies and manage your virtual environment ([Poetry installation](https://python-poetry.org/docs/#installation)) or [pip](https://pip.pypa.io/en/latest/development/), but feel free to use a different way of installing dependencies and managing virtual environments if you have other preferences.
+### Install dependencies and project
-#### Poetry
+Install the dependencies defined in `pyproject.toml` as well as the `vitexample` package.
-```shell
-poetry install
-poetry shell
+```bash
+pip install -e .
```
-#### pip
+## Run the project
-With an activated environemnt, install the dependencies for this example:
+You can run your Flower project in both _simulation_ and _deployment_ mode without making changes to the code. If you are starting with Flower, we recommend using the _simulation_ mode as it requires fewer components to be launched manually. By default, `flwr run` will make use of the Simulation Engine.
-```shell
-pip install -r requirements.txt
+### Run with the Simulation Engine
+
+> \[!TIP\]
+> This example runs faster when the `ClientApp`s have access to a GPU. If your system has one, you can make use of it by configuring the `backend.client-resources` component in `pyproject.toml`. If you want to try running the example with a GPU right away, use the `local-simulation-gpu` federation as shown below.
+
+```bash
+# Run with the default federation (CPU only)
+flwr run .
```
-### Run with `start_simulation()`
+You can also override some of the settings for your `ClientApp` and `ServerApp` defined in `pyproject.toml`. For example:
+
+```bash
+flwr run . --run-config num-server-rounds=5,batch-size=64
+```
-Running the example is quite straightforward. You can control the number of rounds `--num-rounds` (which defaults to 20).
+Run the project in the `local-simulation-gpu` federation, which gives CPU and GPU resources to each `ClientApp`. By default, at most five `ClientApp`s will run in parallel on the available GPU. You can tweak the degree of parallelism by adjusting the settings of this federation in the `pyproject.toml`.
```bash
-python main.py
+# Run with the `local-simulation-gpu` federation
+flwr run . local-simulation-gpu
```
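+
+For reference, the `local-simulation-gpu` federation used above is defined in this example's `pyproject.toml` as follows:
+
+```toml
+[tool.flwr.federations.local-simulation-gpu]
+options.num-supernodes = 10
+options.backend.client-resources.num-cpus = 2 # each ClientApp is assumed to use 2 CPUs
+options.backend.client-resources.num-gpus = 0.2 # at most 5 ClientApps will run on a given GPU
+```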

Running the example as-is on an RTX 3090Ti should take ~15s/round running 5 clients in parallel (plus the _global model_ during centralized evaluation stages) on a single GPU. Note that more clients could fit in VRAM, but since GPU utilization is high (99%-100%) we are probably better off not doing that (at least in this case).
-You can adjust the `client_resources` passed to `start_simulation()` so more/less clients run at the same time in the GPU. Take a look at the [Documentation](https://flower.ai/docs/framework/how-to-run-simulations.html) for more details on how you can customise your simulation.
-
```bash
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.161.07 Driver Version: 535.161.07 CUDA Version: 12.2 |
@@ -90,12 +100,7 @@ You can adjust the `client_resources` passed to `start_simulation()` so more/les
+---------------------------------------------------------------------------------------+
```
-### Run with Flower Next (preview)
+### Run with the Deployment Engine
-```bash
-flower-simulation \
- --client-app=client:app \
- --server-app=server:app \
- --num-supernodes=20 \
- --backend-config='{"client_resources": {"num_cpus":4, "num_gpus":0.25}}'
-```
+> \[!NOTE\]
+> An update to this example will show how to run this Flower project with the Deployment Engine and TLS certificates, or with Docker.
diff --git a/examples/vit-finetune/_static/central_evaluation.png b/examples/flowertune-vit/_static/central_evaluation.png
similarity index 100%
rename from examples/vit-finetune/_static/central_evaluation.png
rename to examples/flowertune-vit/_static/central_evaluation.png
diff --git a/examples/flowertune-vit/pyproject.toml b/examples/flowertune-vit/pyproject.toml
new file mode 100644
index 000000000000..0f11dc54c81a
--- /dev/null
+++ b/examples/flowertune-vit/pyproject.toml
@@ -0,0 +1,43 @@
+[build-system]
+requires = ["hatchling"]
+build-backend = "hatchling.build"
+
+[project]
+name = "vitexample"
+version = "1.0.0"
+description = "Federated Finetuning of a Vision Transformer with Flower"
+license = "Apache-2.0"
+dependencies = [
+ "flwr-nightly[simulation]==1.11.0.dev20240823",
+ "flwr-datasets[vision]>=0.3.0",
+ "torch==2.2.1",
+ "torchvision==0.17.1",
+]
+
+[tool.hatch.build.targets.wheel]
+packages = ["."]
+
+[tool.flwr.app]
+publisher = "flwrlabs"
+
+[tool.flwr.app.components]
+serverapp = "vitexample.server_app:app"
+clientapp = "vitexample.client_app:app"
+
+[tool.flwr.app.config]
+num-server-rounds = 3
+batch-size = 32
+learning-rate = 0.01
+dataset-name = "nelorth/oxford-flowers"
+num-classes = 102
+
+[tool.flwr.federations]
+default = "local-simulation"
+
+[tool.flwr.federations.local-simulation]
+options.num-supernodes = 10
+
+[tool.flwr.federations.local-simulation-gpu]
+options.num-supernodes = 10
+options.backend.client-resources.num-cpus = 2 # each ClientApp is assumed to use 2 CPUs
+options.backend.client-resources.num-gpus = 0.2 # at most 5 ClientApps will run on a given GPU
diff --git a/examples/flowertune-vit/vitexample/__init__.py b/examples/flowertune-vit/vitexample/__init__.py
new file mode 100644
index 000000000000..f0ce539fac90
--- /dev/null
+++ b/examples/flowertune-vit/vitexample/__init__.py
@@ -0,0 +1 @@
+"""vitexample: A Flower / PyTorch app with Vision Transformers."""
diff --git a/examples/flowertune-vit/vitexample/client_app.py b/examples/flowertune-vit/vitexample/client_app.py
new file mode 100644
index 000000000000..59143f1d25f8
--- /dev/null
+++ b/examples/flowertune-vit/vitexample/client_app.py
@@ -0,0 +1,62 @@
+"""vitexample: A Flower / PyTorch app with Vision Transformers."""
+
+import torch
+from torch.utils.data import DataLoader
+
+from flwr.common import Context
+from flwr.client import NumPyClient, ClientApp
+
+
+from vitexample.task import apply_train_transforms, get_dataset_partition
+from vitexample.task import get_model, set_params, get_params, train
+
+
+class FedViTClient(NumPyClient):
+ def __init__(self, trainloader, learning_rate, num_classes):
+ self.trainloader = trainloader
+ self.learning_rate = learning_rate
+ self.model = get_model(num_classes)
+
+ # Determine device
+ self.device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
+ self.model.to(self.device) # send model to device
+
+ def fit(self, parameters, config):
+ set_params(self.model, parameters)
+
+ # Set optimizer
+ optimizer = torch.optim.Adam(self.model.parameters(), lr=self.learning_rate)
+ # Train locally
+ avg_train_loss = train(
+ self.model, self.trainloader, optimizer, epochs=1, device=self.device
+ )
+ # Return locally-finetuned part of the model
+ return (
+ get_params(self.model),
+ len(self.trainloader.dataset),
+ {"train_loss": avg_train_loss},
+ )
+
+
+def client_fn(context: Context):
+ """Return a FedViTClient."""
+
+ # Read the node_config to fetch data partition associated to this node
+ partition_id = context.node_config["partition-id"]
+ num_partitions = context.node_config["num-partitions"]
+ dataset_name = context.run_config["dataset-name"]
+ trainpartition = get_dataset_partition(num_partitions, partition_id, dataset_name)
+
+ batch_size = context.run_config["batch-size"]
+ lr = context.run_config["learning-rate"]
+ num_classes = context.run_config["num-classes"]
+ trainset = trainpartition.with_transform(apply_train_transforms)
+
+ trainloader = DataLoader(
+ trainset, batch_size=batch_size, num_workers=2, shuffle=True
+ )
+
+ return FedViTClient(trainloader, lr, num_classes).to_client()
+
+
+app = ClientApp(client_fn=client_fn)
diff --git a/examples/flowertune-vit/vitexample/server_app.py b/examples/flowertune-vit/vitexample/server_app.py
new file mode 100644
index 000000000000..f37215df5eb9
--- /dev/null
+++ b/examples/flowertune-vit/vitexample/server_app.py
@@ -0,0 +1,77 @@
+"""vitexample: A Flower / PyTorch app with Vision Transformers."""
+
+from logging import INFO
+
+import torch
+from datasets import Dataset, load_dataset
+from torch.utils.data import DataLoader
+
+from vitexample.task import apply_eval_transforms
+from vitexample.task import get_model, set_params, test, get_params
+
+from flwr.common import Context, ndarrays_to_parameters
+from flwr.common.logger import log
+from flwr.server import ServerApp, ServerConfig, ServerAppComponents
+from flwr.server.strategy import FedAvg
+
+
+def get_evaluate_fn(
+ centralized_testset: Dataset,
+ num_classes: int,
+):
+ """Return an evaluation function for centralized evaluation."""
+
+ def evaluate(server_round, parameters, config):
+ """Use the entire Oxford Flowers-102 test set for evaluation."""
+
+ # Determine device
+ device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
+
+ # Instantiate model and apply current global parameters
+ model = get_model(num_classes)
+ set_params(model, parameters)
+ model.to(device)
+
+ # Apply transform to dataset
+ testset = centralized_testset.with_transform(apply_eval_transforms)
+
+ testloader = DataLoader(testset, batch_size=128)
+ # Run evaluation
+ loss, accuracy = test(model, testloader, device=device)
+ log(INFO, f"round: {server_round} -> acc: {accuracy:.4f}, loss: {loss: .4f}")
+
+ return loss, {"accuracy": accuracy}
+
+ return evaluate
+
+
+def server_fn(context: Context):
+
+    # Define the test set for central evaluation
+ dataset_name = context.run_config["dataset-name"]
+ dataset = load_dataset(dataset_name)
+ test_set = dataset["test"]
+
+ # Set initial global model
+ num_classes = context.run_config["num-classes"]
+ ndarrays = get_params(get_model(num_classes))
+ init_parameters = ndarrays_to_parameters(ndarrays)
+
+ # Configure the strategy
+ strategy = FedAvg(
+ fraction_fit=0.5, # Sample 50% of available clients
+ fraction_evaluate=0.0, # No federated evaluation
+ evaluate_fn=get_evaluate_fn(
+ test_set, num_classes
+ ), # Global evaluation function
+ initial_parameters=init_parameters,
+ )
+
+ # Construct ServerConfig
+ num_rounds = context.run_config["num-server-rounds"]
+ config = ServerConfig(num_rounds=num_rounds)
+
+ return ServerAppComponents(strategy=strategy, config=config)
+
+
+app = ServerApp(server_fn=server_fn)
diff --git a/examples/flowertune-vit/vitexample/task.py b/examples/flowertune-vit/vitexample/task.py
new file mode 100644
index 000000000000..3512d1891db2
--- /dev/null
+++ b/examples/flowertune-vit/vitexample/task.py
@@ -0,0 +1,131 @@
+"""vitexample: A Flower / PyTorch app with Vision Transformers."""
+
+from collections import OrderedDict
+
+import torch
+from torchvision.models import vit_b_16, ViT_B_16_Weights
+from torchvision.transforms import (
+ Compose,
+ Normalize,
+ ToTensor,
+ RandomResizedCrop,
+ Resize,
+ CenterCrop,
+)
+
+from flwr_datasets import FederatedDataset
+from flwr_datasets.partitioner import IidPartitioner
+
+
+def get_model(num_classes: int):
+ """Return a pretrained ViT with all layers frozen except output head."""
+
+ # Instantiate a pre-trained ViT-B on ImageNet
+ model = vit_b_16(weights=ViT_B_16_Weights.IMAGENET1K_V1)
+
+    # We're going to federate the finetuning of this model
+ # using (by default) the Oxford Flowers-102 dataset. One easy way
+ # to achieve this is by re-initializing the output block of the
+    # ViT so it outputs 102 classes instead of the default 1k
+ in_features = model.heads[-1].in_features
+ model.heads[-1] = torch.nn.Linear(in_features, num_classes)
+
+ # Disable gradients for everything
+ model.requires_grad_(False)
+ # Now enable just for output head
+ model.heads.requires_grad_(True)
+
+ return model
+
+
+def set_params(model, parameters):
+ """Apply the parameters to model head."""
+ finetune_layers = model.heads
+ params_dict = zip(finetune_layers.state_dict().keys(), parameters)
+ state_dict = OrderedDict({k: torch.tensor(v) for k, v in params_dict})
+ finetune_layers.load_state_dict(state_dict, strict=True)
+
+
+def get_params(model):
+ """Get parameters from model head as ndarrays."""
+ finetune_layers = model.heads
+ return [val.cpu().numpy() for _, val in finetune_layers.state_dict().items()]
+
+
+def train(net, trainloader, optimizer, epochs, device):
+ """Train the model on the training set."""
+ criterion = torch.nn.CrossEntropyLoss()
+ net.train()
+ net.to(device)
+ avg_loss = 0
+ # A very standard training loop for image classification
+ for _ in range(epochs):
+ for batch in trainloader:
+ images, labels = batch["image"].to(device), batch["label"].to(device)
+ optimizer.zero_grad()
+ loss = criterion(net(images), labels)
+ avg_loss += loss.item() / labels.shape[0]
+ loss.backward()
+ optimizer.step()
+
+ return avg_loss / len(trainloader)
+
+
+def test(net, testloader, device: str):
+ """Validate the network on the entire test set."""
+ criterion = torch.nn.CrossEntropyLoss()
+ correct, loss = 0, 0.0
+ net.to(device)
+ net.eval()
+ with torch.no_grad():
+ for data in testloader:
+ images, labels = data["image"].to(device), data["label"].to(device)
+ outputs = net(images)
+ loss += criterion(outputs, labels).item()
+ _, predicted = torch.max(outputs.data, 1)
+ correct += (predicted == labels).sum().item()
+ accuracy = correct / len(testloader.dataset)
+ return loss, accuracy
+
+
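+# Cache the FederatedDataset so it is only initialized once per process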
+fds = None
+
+
+def get_dataset_partition(num_partitions: int, partition_id: int, dataset_name: str):
+ """Get Oxford Flowers datasets and partition it."""
+ global fds
+ if fds is None:
+ # Get dataset (by default Oxford Flowers-102) and create IID partitions
+ partitioner = IidPartitioner(num_partitions)
+ fds = FederatedDataset(
+ dataset=dataset_name, partitioners={"train": partitioner}
+ )
+
+ return fds.load_partition(partition_id)
+
+
+def apply_eval_transforms(batch):
+ """Apply a very standard set of image transforms."""
+ transforms = Compose(
+ [
+ Resize((256, 256)),
+ CenterCrop((224, 224)),
+ ToTensor(),
+ Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
+ ]
+ )
+ batch["image"] = [transforms(img) for img in batch["image"]]
+ return batch
+
+
+def apply_train_transforms(batch):
+ """Apply a very standard set of image transforms."""
+ transforms = Compose(
+ [
+ RandomResizedCrop((224, 224)),
+ ToTensor(),
+ Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
+ ]
+ )
+ batch["image"] = [transforms(img) for img in batch["image"]]
+ return batch
diff --git a/examples/vit-finetune/client.py b/examples/vit-finetune/client.py
deleted file mode 100644
index 6226b9363ca4..000000000000
--- a/examples/vit-finetune/client.py
+++ /dev/null
@@ -1,80 +0,0 @@
-import flwr
-import torch
-from flwr.client import NumPyClient
-from torch.utils.data import DataLoader
-
-from dataset import apply_transforms, get_dataset_with_partitions
-from model import get_model, set_parameters, train
-
-
-class FedViTClient(NumPyClient):
- def __init__(self, trainset):
- self.trainset = trainset
- self.model = get_model()
-
- # Determine device
- self.device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
- self.model.to(self.device) # send model to device
-
- def set_for_finetuning(self):
- """Freeze all parameter except those in the final head.
-
- Only output MLP will be updated by the client and therefore, the only part of
- the model that will be federated (hence, communicated back to the server for
- aggregation.)
- """
-
- # Disable gradients for everything
- self.model.requires_grad_(False)
- # Now enable just for output head
- self.model.heads.requires_grad_(True)
-
- def get_parameters(self, config):
- """Get locally updated parameters."""
- finetune_layers = self.model.heads
- return [val.cpu().numpy() for _, val in finetune_layers.state_dict().items()]
-
- def fit(self, parameters, config):
- set_parameters(self.model, parameters)
-
- # Get some info from the config
- # Get batchsize and LR set from server
- batch_size = config["batch_size"]
- lr = config["lr"]
-
- trainloader = DataLoader(
- self.trainset, batch_size=batch_size, num_workers=2, shuffle=True
- )
-
- # Set optimizer
- optimizer = torch.optim.Adam(self.model.parameters(), lr=lr)
- # Train locally
- avg_train_loss = train(
- self.model, trainloader, optimizer, epochs=1, device=self.device
- )
- # Return locally-finetuned part of the model
- return (
- self.get_parameters(config={}),
- len(trainloader.dataset),
- {"train_loss": avg_train_loss},
- )
-
-
-# Downloads and partition dataset
-federated_ox_flowers, _ = get_dataset_with_partitions(num_partitions=20)
-
-
-def client_fn(cid: str):
- """Return a FedViTClient that trains with the cid-th data partition."""
-
- trainset_for_this_client = federated_ox_flowers.load_partition(int(cid), "train")
-
- trainset = trainset_for_this_client.with_transform(apply_transforms)
-
- return FedViTClient(trainset).to_client()
-
-
-# To be used with Flower Next
-app = flwr.client.ClientApp(
- client_fn=client_fn,
-)
diff --git a/examples/vit-finetune/dataset.py b/examples/vit-finetune/dataset.py
deleted file mode 100644
index e1e01da61dd4..000000000000
--- a/examples/vit-finetune/dataset.py
+++ /dev/null
@@ -1,51 +0,0 @@
-from flwr_datasets import FederatedDataset
-from torchvision.transforms import (
- CenterCrop,
- Compose,
- Normalize,
- RandomResizedCrop,
- Resize,
- ToTensor,
-)
-
-
-def get_dataset_with_partitions(num_partitions: int):
- """Get Oxford Flowers datasets and partition it.
-
- Return partitioned dataset as well as the whole test set.
- """
-
- # Get Oxford Flowers-102 and divide it into 20 IID partitions
- ox_flowers_fds = FederatedDataset(
- dataset="nelorth/oxford-flowers", partitioners={"train": num_partitions}
- )
-
- centralized_testset = ox_flowers_fds.load_split("test")
- return ox_flowers_fds, centralized_testset
-
-
-def apply_eval_transforms(batch):
- """Apply a very standard set of image transforms."""
- transforms = Compose(
- [
- Resize((256, 256)),
- CenterCrop((224, 224)),
- ToTensor(),
- Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
- ]
- )
- batch["image"] = [transforms(img) for img in batch["image"]]
- return batch
-
-
-def apply_transforms(batch):
- """Apply a very standard set of image transforms."""
- transforms = Compose(
- [
- RandomResizedCrop((224, 224)),
- ToTensor(),
- Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
- ]
- )
- batch["image"] = [transforms(img) for img in batch["image"]]
- return batch
diff --git a/examples/vit-finetune/main.py b/examples/vit-finetune/main.py
deleted file mode 100644
index 33ad78a04d47..000000000000
--- a/examples/vit-finetune/main.py
+++ /dev/null
@@ -1,57 +0,0 @@
-import argparse
-
-import flwr as fl
-import matplotlib.pyplot as plt
-
-from client import client_fn
-from server import strategy
-
-parser = argparse.ArgumentParser(
- description="Finetuning of a ViT with Flower Simulation."
-)
-
-parser.add_argument(
- "--num-rounds",
- type=int,
- default=20,
- help="Number of rounds.",
-)
-
-
-def main():
- args = parser.parse_args()
-
- # To control the degree of parallelism
- # With default settings in this example,
- # each client should take just ~1GB of VRAM.
- client_resources = {
- "num_cpus": 4,
- "num_gpus": 0.2,
- }
-
- # Launch simulation
- history = fl.simulation.start_simulation(
- client_fn=client_fn,
- num_clients=20,
- client_resources=client_resources,
- config=fl.server.ServerConfig(num_rounds=args.num_rounds),
- strategy=strategy,
- )
-
- print(history)
-
- # Basic plotting
- global_accuracy_centralised = history.metrics_centralized["accuracy"]
- round = [int(data[0]) for data in global_accuracy_centralised]
- acc = [100.0 * data[1] for data in global_accuracy_centralised]
- plt.plot(round, acc)
- plt.xticks(round)
- plt.grid()
- plt.ylabel("Accuracy (%)")
- plt.xlabel("Round")
- plt.title("Federated finetuning of ViT for Flowers-102")
- plt.savefig("central_evaluation.png")
-
-
-if __name__ == "__main__":
- main()
diff --git a/examples/vit-finetune/model.py b/examples/vit-finetune/model.py
deleted file mode 100644
index a0b8294aa485..000000000000
--- a/examples/vit-finetune/model.py
+++ /dev/null
@@ -1,71 +0,0 @@
-from collections import OrderedDict
-
-import torch
-from torchvision.models import ViT_B_16_Weights, vit_b_16
-
-
-def get_model():
- """Return a pretrained ViT with all layers frozen except output head."""
-
- # Instantiate a pre-trained ViT-B on ImageNet
- model = vit_b_16(weights=ViT_B_16_Weights.IMAGENET1K_V1)
-
- # We're going to federated the finetuning of this model
- # using the Oxford Flowers-102 dataset. One easy way to achieve
- # this is by re-initializing the output block of the ViT so it
- # outputs 102 clases instead of the default 1k
- in_features = model.heads[-1].in_features
- model.heads[-1] = torch.nn.Linear(in_features, 102)
-
- # Disable gradients for everything
- model.requires_grad_(False)
- # Now enable just for output head
- model.heads.requires_grad_(True)
-
- return model
-
-
-def set_parameters(model, parameters):
- """Apply the parameters to the model.
-
- Recall this example only federates the head of the ViT so that's the only part of
- the model we need to load.
- """
- finetune_layers = model.heads
- params_dict = zip(finetune_layers.state_dict().keys(), parameters)
- state_dict = OrderedDict({k: torch.tensor(v) for k, v in params_dict})
- finetune_layers.load_state_dict(state_dict, strict=True)
-
-
-def train(net, trainloader, optimizer, epochs, device):
- """Train the model on the training set."""
- criterion = torch.nn.CrossEntropyLoss()
- net.train()
- avg_loss = 0
- # A very standard training loop for image classification
- for _ in range(epochs):
- for batch in trainloader:
- images, labels = batch["image"].to(device), batch["label"].to(device)
- optimizer.zero_grad()
- loss = criterion(net(images), labels)
- avg_loss += loss.item() / labels.shape[0]
- loss.backward()
- optimizer.step()
-
- return avg_loss / len(trainloader)
-
-
-def test(net, testloader, device: str):
- """Validate the network on the entire test set."""
- criterion = torch.nn.CrossEntropyLoss()
- correct, loss = 0, 0.0
- net.eval()
- with torch.no_grad():
- for data in testloader:
- images, labels = data["image"].to(device), data["label"].to(device)
- outputs = net(images)
- loss += criterion(outputs, labels).item()
- _, predicted = torch.max(outputs.data, 1)
- correct += (predicted == labels).sum().item()
- accuracy = correct / len(testloader.dataset)
- return loss, accuracy
diff --git a/examples/vit-finetune/pyproject.toml b/examples/vit-finetune/pyproject.toml
deleted file mode 100644
index d014d6b6fb2a..000000000000
--- a/examples/vit-finetune/pyproject.toml
+++ /dev/null
@@ -1,17 +0,0 @@
-[build-system]
-requires = ["poetry-core>=1.4.0"]
-build-backend = "poetry.core.masonry.api"
-
-[tool.poetry]
-name = "vit-finetune"
-version = "0.1.0"
-description = "FL finetuning of a Vision Transformer with Flower."
-authors = ["The Flower Authors "]
-
-[tool.poetry.dependencies]
-python = ">=3.8,<3.11"
-flwr = { extras = ["simulation"], version = ">=1.0,<2.0" }
-flwr-datasets = { extras = ["vision"], version = ">=0.0.2,<1.0.0" }
-torch = "2.2.1"
-torchvision = "0.17.1"
-matplotlib = "3.8.3"
diff --git a/examples/vit-finetune/requirements.txt b/examples/vit-finetune/requirements.txt
deleted file mode 100644
index 3692be0d6c2c..000000000000
--- a/examples/vit-finetune/requirements.txt
+++ /dev/null
@@ -1,5 +0,0 @@
-flwr[simulation]>=1.0, <2.0
-flwr-datasets[vision]>=0.0.2, <1.0.0
-matplotlib==3.8.3
-torch==2.2.1
-torchvision==0.17.1
\ No newline at end of file
diff --git a/examples/vit-finetune/server.py b/examples/vit-finetune/server.py
deleted file mode 100644
index 5352d34c4fe2..000000000000
--- a/examples/vit-finetune/server.py
+++ /dev/null
@@ -1,61 +0,0 @@
-import flwr as fl
-import torch
-from datasets import Dataset
-from torch.utils.data import DataLoader
-
-from dataset import apply_eval_transforms, get_dataset_with_partitions
-from model import get_model, set_parameters, test
-
-
-def fit_config(server_round: int):
- """Return a configuration with static batch size and (local) epochs."""
- config = {
- "lr": 0.01, # Learning rate used by clients
- "batch_size": 32, # Batch size to use by clients during fit()
- }
- return config
-
-
-def get_evaluate_fn(
- centralized_testset: Dataset,
-):
- """Return an evaluation function for centralized evaluation."""
-
- def evaluate(server_round, parameters, config):
- """Use the entire Oxford Flowers-102 test set for evaluation."""
-
- # Determine device
- device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
-
- model = get_model()
- set_parameters(model, parameters)
- model.to(device)
-
- # Apply transform to dataset
- testset = centralized_testset.with_transform(apply_eval_transforms)
-
- testloader = DataLoader(testset, batch_size=128)
- # Run evaluation
- loss, accuracy = test(model, testloader, device=device)
-
- return loss, {"accuracy": accuracy}
-
- return evaluate
-
-
-# Downloads and partition dataset
-_, centralized_testset = get_dataset_with_partitions(num_partitions=20)
-
-# Configure the strategy
-strategy = fl.server.strategy.FedAvg(
- fraction_fit=0.5, # Sample 50% of available clients for training each round
- fraction_evaluate=0.0, # No federated evaluation
- on_fit_config_fn=fit_config,
- evaluate_fn=get_evaluate_fn(centralized_testset), # Global evaluation function
-)
-
-# To be used with Flower Next
-app = fl.server.ServerApp(
- config=fl.server.ServerConfig(num_rounds=3),
- strategy=strategy,
-)
From 489247a64adeed412e8f6ea966e656b049a2fe71 Mon Sep 17 00:00:00 2001
From: Daniel Nata Nugraha
Date: Sat, 24 Aug 2024 18:45:38 +0200
Subject: [PATCH 07/42] refactor(framework) Rename client auth to node auth
(#4074)
---
src/py/flwr/server/app.py | 26 ++---
src/py/flwr/server/server_test.py | 18 ++--
.../fleet/grpc_rere/server_interceptor.py | 22 ++---
.../grpc_rere/server_interceptor_test.py | 94 +++++++++----------
.../server/superlink/state/in_memory_state.py | 30 +++---
.../server/superlink/state/sqlite_state.py | 20 ++--
src/py/flwr/server/superlink/state/state.py | 16 ++--
.../flwr/server/superlink/state/state_test.py | 20 ++--
8 files changed, 122 insertions(+), 124 deletions(-)
diff --git a/src/py/flwr/server/app.py b/src/py/flwr/server/app.py
index ef632a0c014d..32b490903554 100644
--- a/src/py/flwr/server/app.py
+++ b/src/py/flwr/server/app.py
@@ -278,24 +278,24 @@ def run_superlink() -> None:
fleet_thread.start()
bckg_threads.append(fleet_thread)
elif args.fleet_api_type == TRANSPORT_TYPE_GRPC_RERE:
- maybe_keys = _try_setup_client_authentication(args, certificates)
+ maybe_keys = _try_setup_node_authentication(args, certificates)
interceptors: Optional[Sequence[grpc.ServerInterceptor]] = None
if maybe_keys is not None:
(
- client_public_keys,
+ node_public_keys,
server_private_key,
server_public_key,
) = maybe_keys
state = state_factory.state()
- state.store_client_public_keys(client_public_keys)
+ state.store_node_public_keys(node_public_keys)
state.store_server_private_public_key(
private_key_to_bytes(server_private_key),
public_key_to_bytes(server_public_key),
)
log(
INFO,
- "Client authentication enabled with %d known public keys",
- len(client_public_keys),
+ "Node authentication enabled with %d known public keys",
+ len(node_public_keys),
)
interceptors = [AuthenticateServerInterceptor(state)]
@@ -344,7 +344,7 @@ def _format_address(address: str) -> Tuple[str, str, int]:
return (f"[{host}]:{port}" if is_v6 else f"{host}:{port}", host, port)
-def _try_setup_client_authentication(
+def _try_setup_node_authentication(
args: argparse.Namespace,
certificates: Optional[Tuple[bytes, bytes, bytes]],
) -> Optional[Tuple[Set[bytes], ec.EllipticCurvePrivateKey, ec.EllipticCurvePublicKey]]:
@@ -373,16 +373,16 @@ def _try_setup_client_authentication(
"`--ssl-keyfile`, and `—-ssl-ca-certfile` and try again."
)
- client_keys_file_path = Path(args.auth_list_public_keys)
- if not client_keys_file_path.exists():
+ node_keys_file_path = Path(args.auth_list_public_keys)
+ if not node_keys_file_path.exists():
sys.exit(
"The provided path to the known public keys CSV file does not exist: "
- f"{client_keys_file_path}. "
+ f"{node_keys_file_path}. "
"Please provide the CSV file path containing known public keys "
"to '--auth-list-public-keys'."
)
- client_public_keys: Set[bytes] = set()
+ node_public_keys: Set[bytes] = set()
try:
ssh_private_key = load_ssh_private_key(
@@ -413,13 +413,13 @@ def _try_setup_client_authentication(
"path points to a valid public key file and try again."
)
- with open(client_keys_file_path, newline="", encoding="utf-8") as csvfile:
+ with open(node_keys_file_path, newline="", encoding="utf-8") as csvfile:
reader = csv.reader(csvfile)
for row in reader:
for element in row:
public_key = load_ssh_public_key(element.encode())
if isinstance(public_key, ec.EllipticCurvePublicKey):
- client_public_keys.add(public_key_to_bytes(public_key))
+ node_public_keys.add(public_key_to_bytes(public_key))
else:
sys.exit(
"Error: Unable to parse the public keys in the CSV "
@@ -427,7 +427,7 @@ def _try_setup_client_authentication(
"known SSH public keys files and try again."
)
return (
- client_public_keys,
+ node_public_keys,
ssh_private_key,
ssh_public_key,
)
diff --git a/src/py/flwr/server/server_test.py b/src/py/flwr/server/server_test.py
index f47b5c3d8469..b80811a6f730 100644
--- a/src/py/flwr/server/server_test.py
+++ b/src/py/flwr/server/server_test.py
@@ -55,7 +55,7 @@
)
from flwr.server.client_manager import SimpleClientManager
-from .app import _try_setup_client_authentication
+from .app import _try_setup_node_authentication
from .client_proxy import ClientProxy
from .server import Server, evaluate_clients, fit_clients
@@ -203,8 +203,8 @@ def test_set_max_workers() -> None:
assert server.max_workers == 42
-def test_setup_client_auth() -> None: # pylint: disable=R0914
- """Test setup client authentication."""
+def test_setup_node_auth() -> None: # pylint: disable=R0914
+ """Test setup node authentication."""
# Prepare
_, first_public_key = generate_key_pairs()
private_key, public_key = generate_key_pairs()
@@ -220,12 +220,12 @@ def test_setup_client_auth() -> None: # pylint: disable=R0914
# Execute
with tempfile.TemporaryDirectory() as temp_dir:
# Initialize temporary files
- client_keys_file_path = Path(temp_dir) / "client_keys.csv"
+ node_keys_file_path = Path(temp_dir) / "node_keys.csv"
server_private_key_path = Path(temp_dir) / "server_private_key"
server_public_key_path = Path(temp_dir) / "server_public_key"
# Fill the files with relevant keys
- with open(client_keys_file_path, "w", newline="", encoding="utf-8") as csvfile:
+ with open(node_keys_file_path, "w", newline="", encoding="utf-8") as csvfile:
writer = csv.writer(csvfile)
writer.writerow(
[
@@ -240,15 +240,15 @@ def test_setup_client_auth() -> None: # pylint: disable=R0914
server_public_key_path.write_bytes(server_public_key)
server_private_key_path.write_bytes(server_private_key)
- # Mock argparse with `require-client-authentication`` flag
+ # Mock argparse with `require-node-authentication`` flag
mock_args = argparse.Namespace(
- auth_list_public_keys=str(client_keys_file_path),
+ auth_list_public_keys=str(node_keys_file_path),
auth_superlink_private_key=str(server_private_key_path),
auth_superlink_public_key=str(server_public_key_path),
)
- # Run _try_setup_client_authentication
- result = _try_setup_client_authentication(mock_args, (b"", b"", b""))
+ # Run _try_setup_node_authentication
+ result = _try_setup_node_authentication(mock_args, (b"", b"", b""))
expected_private_key = load_ssh_private_key(server_private_key, None)
expected_public_key = load_ssh_public_key(server_public_key)
diff --git a/src/py/flwr/server/superlink/fleet/grpc_rere/server_interceptor.py b/src/py/flwr/server/superlink/fleet/grpc_rere/server_interceptor.py
index 87ac45a4f9c8..70b38f8b625e 100644
--- a/src/py/flwr/server/superlink/fleet/grpc_rere/server_interceptor.py
+++ b/src/py/flwr/server/superlink/fleet/grpc_rere/server_interceptor.py
@@ -78,13 +78,13 @@ def _get_value_from_tuples(
class AuthenticateServerInterceptor(grpc.ServerInterceptor): # type: ignore
- """Server interceptor for client authentication."""
+ """Server interceptor for node authentication."""
def __init__(self, state: State):
self.state = state
- self.client_public_keys = state.get_client_public_keys()
- if len(self.client_public_keys) == 0:
+ self.node_public_keys = state.get_node_public_keys()
+ if len(self.node_public_keys) == 0:
log(WARNING, "Authentication enabled, but no known public keys configured")
private_key = self.state.get_server_private_key()
@@ -103,9 +103,9 @@ def intercept_service(
) -> grpc.RpcMethodHandler:
"""Flower server interceptor authentication logic.
- Intercept all unary calls from clients and authenticate clients by validating
- auth metadata sent by the client. Continue RPC call if client is authenticated,
- else, terminate RPC call by setting context to abort.
+ Intercept all unary calls from nodes and authenticate nodes by validating auth
+ metadata sent by the node. Continue RPC call if node is authenticated, else,
+ terminate RPC call by setting context to abort.
"""
# One of the method handlers in
# `flwr.server.superlink.fleet.grpc_rere.fleet_server.FleetServicer`
@@ -119,17 +119,17 @@ def _generic_method_handler(
request: Request,
context: grpc.ServicerContext,
) -> Response:
- client_public_key_bytes = base64.urlsafe_b64decode(
+ node_public_key_bytes = base64.urlsafe_b64decode(
_get_value_from_tuples(
_PUBLIC_KEY_HEADER, context.invocation_metadata()
)
)
- if client_public_key_bytes not in self.client_public_keys:
+ if node_public_key_bytes not in self.node_public_keys:
context.abort(grpc.StatusCode.UNAUTHENTICATED, "Access denied")
if isinstance(request, CreateNodeRequest):
response = self._create_authenticated_node(
- client_public_key_bytes, request, context
+ node_public_key_bytes, request, context
)
log(
INFO,
@@ -144,13 +144,13 @@ def _generic_method_handler(
_AUTH_TOKEN_HEADER, context.invocation_metadata()
)
)
- public_key = bytes_to_public_key(client_public_key_bytes)
+ public_key = bytes_to_public_key(node_public_key_bytes)
if not self._verify_hmac(public_key, request, hmac_value):
context.abort(grpc.StatusCode.UNAUTHENTICATED, "Access denied")
# Verify node_id
- node_id = self.state.get_node_id(client_public_key_bytes)
+ node_id = self.state.get_node_id(node_public_key_bytes)
if not self._verify_node_id(node_id, request):
context.abort(grpc.StatusCode.UNAUTHENTICATED, "Access denied")
diff --git a/src/py/flwr/server/superlink/fleet/grpc_rere/server_interceptor_test.py b/src/py/flwr/server/superlink/fleet/grpc_rere/server_interceptor_test.py
index ece443a816cb..74914be68a8f 100644
--- a/src/py/flwr/server/superlink/fleet/grpc_rere/server_interceptor_test.py
+++ b/src/py/flwr/server/superlink/fleet/grpc_rere/server_interceptor_test.py
@@ -58,7 +58,7 @@ class TestServerInterceptor(unittest.TestCase): # pylint: disable=R0902
def setUp(self) -> None:
"""Initialize mock stub and server interceptor."""
- self._client_private_key, self._client_public_key = generate_key_pairs()
+ self._node_private_key, self._node_public_key = generate_key_pairs()
self._server_private_key, self._server_public_key = generate_key_pairs()
state_factory = StateFactory(":flwr-in-memory-state:")
@@ -69,9 +69,7 @@ def setUp(self) -> None:
private_key_to_bytes(self._server_private_key),
public_key_to_bytes(self._server_public_key),
)
- self.state.store_client_public_keys(
- {public_key_to_bytes(self._client_public_key)}
- )
+ self.state.store_node_public_keys({public_key_to_bytes(self._node_public_key)})
self._server_interceptor = AuthenticateServerInterceptor(self.state)
self._server: grpc.Server = _run_fleet_api_grpc_rere(
@@ -122,7 +120,7 @@ def test_successful_create_node_with_metadata(self) -> None:
"""Test server interceptor for creating node."""
# Prepare
public_key_bytes = base64.urlsafe_b64encode(
- public_key_to_bytes(self._client_public_key)
+ public_key_to_bytes(self._node_public_key)
)
# Execute
@@ -145,9 +143,9 @@ def test_successful_create_node_with_metadata(self) -> None:
def test_unsuccessful_create_node_with_metadata(self) -> None:
"""Test server interceptor for creating node unsuccessfully."""
# Prepare
- _, client_public_key = generate_key_pairs()
+ _, node_public_key = generate_key_pairs()
public_key_bytes = base64.urlsafe_b64encode(
- public_key_to_bytes(client_public_key)
+ public_key_to_bytes(node_public_key)
)
# Execute & Assert
@@ -161,17 +159,17 @@ def test_successful_delete_node_with_metadata(self) -> None:
"""Test server interceptor for deleting node."""
# Prepare
node_id = self.state.create_node(
- ping_interval=30, public_key=public_key_to_bytes(self._client_public_key)
+ ping_interval=30, public_key=public_key_to_bytes(self._node_public_key)
)
request = DeleteNodeRequest(node=Node(node_id=node_id))
shared_secret = generate_shared_key(
- self._client_private_key, self._server_public_key
+ self._node_private_key, self._server_public_key
)
hmac_value = base64.urlsafe_b64encode(
compute_hmac(shared_secret, request.SerializeToString(True))
)
public_key_bytes = base64.urlsafe_b64encode(
- public_key_to_bytes(self._client_public_key)
+ public_key_to_bytes(self._node_public_key)
)
# Execute
@@ -191,16 +189,16 @@ def test_unsuccessful_delete_node_with_metadata(self) -> None:
"""Test server interceptor for deleting node unsuccessfully."""
# Prepare
node_id = self.state.create_node(
- ping_interval=30, public_key=public_key_to_bytes(self._client_public_key)
+ ping_interval=30, public_key=public_key_to_bytes(self._node_public_key)
)
request = DeleteNodeRequest(node=Node(node_id=node_id))
- client_private_key, _ = generate_key_pairs()
- shared_secret = generate_shared_key(client_private_key, self._server_public_key)
+ node_private_key, _ = generate_key_pairs()
+ shared_secret = generate_shared_key(node_private_key, self._server_public_key)
hmac_value = base64.urlsafe_b64encode(
compute_hmac(shared_secret, request.SerializeToString(True))
)
public_key_bytes = base64.urlsafe_b64encode(
- public_key_to_bytes(self._client_public_key)
+ public_key_to_bytes(self._node_public_key)
)
# Execute & Assert
@@ -217,17 +215,17 @@ def test_successful_pull_task_ins_with_metadata(self) -> None:
"""Test server interceptor for pull task ins."""
# Prepare
node_id = self.state.create_node(
- ping_interval=30, public_key=public_key_to_bytes(self._client_public_key)
+ ping_interval=30, public_key=public_key_to_bytes(self._node_public_key)
)
request = PullTaskInsRequest(node=Node(node_id=node_id))
shared_secret = generate_shared_key(
- self._client_private_key, self._server_public_key
+ self._node_private_key, self._server_public_key
)
hmac_value = base64.urlsafe_b64encode(
compute_hmac(shared_secret, request.SerializeToString(True))
)
public_key_bytes = base64.urlsafe_b64encode(
- public_key_to_bytes(self._client_public_key)
+ public_key_to_bytes(self._node_public_key)
)
# Execute
@@ -247,16 +245,16 @@ def test_unsuccessful_pull_task_ins_with_metadata(self) -> None:
"""Test server interceptor for pull task ins unsuccessfully."""
# Prepare
node_id = self.state.create_node(
- ping_interval=30, public_key=public_key_to_bytes(self._client_public_key)
+ ping_interval=30, public_key=public_key_to_bytes(self._node_public_key)
)
request = PullTaskInsRequest(node=Node(node_id=node_id))
- client_private_key, _ = generate_key_pairs()
- shared_secret = generate_shared_key(client_private_key, self._server_public_key)
+ node_private_key, _ = generate_key_pairs()
+ shared_secret = generate_shared_key(node_private_key, self._server_public_key)
hmac_value = base64.urlsafe_b64encode(
compute_hmac(shared_secret, request.SerializeToString(True))
)
public_key_bytes = base64.urlsafe_b64encode(
- public_key_to_bytes(self._client_public_key)
+ public_key_to_bytes(self._node_public_key)
)
# Execute & Assert
@@ -273,19 +271,19 @@ def test_successful_push_task_res_with_metadata(self) -> None:
"""Test server interceptor for push task res."""
# Prepare
node_id = self.state.create_node(
- ping_interval=30, public_key=public_key_to_bytes(self._client_public_key)
+ ping_interval=30, public_key=public_key_to_bytes(self._node_public_key)
)
request = PushTaskResRequest(
task_res_list=[TaskRes(task=Task(producer=Node(node_id=node_id)))]
)
shared_secret = generate_shared_key(
- self._client_private_key, self._server_public_key
+ self._node_private_key, self._server_public_key
)
hmac_value = base64.urlsafe_b64encode(
compute_hmac(shared_secret, request.SerializeToString(True))
)
public_key_bytes = base64.urlsafe_b64encode(
- public_key_to_bytes(self._client_public_key)
+ public_key_to_bytes(self._node_public_key)
)
# Execute
@@ -305,18 +303,18 @@ def test_unsuccessful_push_task_res_with_metadata(self) -> None:
"""Test server interceptor for push task res unsuccessfully."""
# Prepare
node_id = self.state.create_node(
- ping_interval=30, public_key=public_key_to_bytes(self._client_public_key)
+ ping_interval=30, public_key=public_key_to_bytes(self._node_public_key)
)
request = PushTaskResRequest(
task_res_list=[TaskRes(task=Task(producer=Node(node_id=node_id)))]
)
- client_private_key, _ = generate_key_pairs()
- shared_secret = generate_shared_key(client_private_key, self._server_public_key)
+ node_private_key, _ = generate_key_pairs()
+ shared_secret = generate_shared_key(node_private_key, self._server_public_key)
hmac_value = base64.urlsafe_b64encode(
compute_hmac(shared_secret, request.SerializeToString(True))
)
public_key_bytes = base64.urlsafe_b64encode(
- public_key_to_bytes(self._client_public_key)
+ public_key_to_bytes(self._node_public_key)
)
# Execute & Assert
@@ -333,18 +331,18 @@ def test_successful_get_run_with_metadata(self) -> None:
"""Test server interceptor for pull task ins."""
# Prepare
self.state.create_node(
- ping_interval=30, public_key=public_key_to_bytes(self._client_public_key)
+ ping_interval=30, public_key=public_key_to_bytes(self._node_public_key)
)
run_id = self.state.create_run("", "", "", {})
request = GetRunRequest(run_id=run_id)
shared_secret = generate_shared_key(
- self._client_private_key, self._server_public_key
+ self._node_private_key, self._server_public_key
)
hmac_value = base64.urlsafe_b64encode(
compute_hmac(shared_secret, request.SerializeToString(True))
)
public_key_bytes = base64.urlsafe_b64encode(
- public_key_to_bytes(self._client_public_key)
+ public_key_to_bytes(self._node_public_key)
)
# Execute
@@ -364,17 +362,17 @@ def test_unsuccessful_get_run_with_metadata(self) -> None:
"""Test server interceptor for pull task ins unsuccessfully."""
# Prepare
self.state.create_node(
- ping_interval=30, public_key=public_key_to_bytes(self._client_public_key)
+ ping_interval=30, public_key=public_key_to_bytes(self._node_public_key)
)
run_id = self.state.create_run("", "", "", {})
request = GetRunRequest(run_id=run_id)
- client_private_key, _ = generate_key_pairs()
- shared_secret = generate_shared_key(client_private_key, self._server_public_key)
+ node_private_key, _ = generate_key_pairs()
+ shared_secret = generate_shared_key(node_private_key, self._server_public_key)
hmac_value = base64.urlsafe_b64encode(
compute_hmac(shared_secret, request.SerializeToString(True))
)
public_key_bytes = base64.urlsafe_b64encode(
- public_key_to_bytes(self._client_public_key)
+ public_key_to_bytes(self._node_public_key)
)
# Execute & Assert
@@ -391,17 +389,17 @@ def test_successful_ping_with_metadata(self) -> None:
"""Test server interceptor for pull task ins."""
# Prepare
node_id = self.state.create_node(
- ping_interval=30, public_key=public_key_to_bytes(self._client_public_key)
+ ping_interval=30, public_key=public_key_to_bytes(self._node_public_key)
)
request = PingRequest(node=Node(node_id=node_id))
shared_secret = generate_shared_key(
- self._client_private_key, self._server_public_key
+ self._node_private_key, self._server_public_key
)
hmac_value = base64.urlsafe_b64encode(
compute_hmac(shared_secret, request.SerializeToString(True))
)
public_key_bytes = base64.urlsafe_b64encode(
- public_key_to_bytes(self._client_public_key)
+ public_key_to_bytes(self._node_public_key)
)
# Execute
@@ -421,16 +419,16 @@ def test_unsuccessful_ping_with_metadata(self) -> None:
"""Test server interceptor for pull task ins unsuccessfully."""
# Prepare
node_id = self.state.create_node(
- ping_interval=30, public_key=public_key_to_bytes(self._client_public_key)
+ ping_interval=30, public_key=public_key_to_bytes(self._node_public_key)
)
request = PingRequest(node=Node(node_id=node_id))
- client_private_key, _ = generate_key_pairs()
- shared_secret = generate_shared_key(client_private_key, self._server_public_key)
+ node_private_key, _ = generate_key_pairs()
+ shared_secret = generate_shared_key(node_private_key, self._server_public_key)
hmac_value = base64.urlsafe_b64encode(
compute_hmac(shared_secret, request.SerializeToString(True))
)
public_key_bytes = base64.urlsafe_b64encode(
- public_key_to_bytes(self._client_public_key)
+ public_key_to_bytes(self._node_public_key)
)
# Execute & Assert
@@ -446,7 +444,7 @@ def test_unsuccessful_ping_with_metadata(self) -> None:
def test_successful_restore_node(self) -> None:
"""Test server interceptor for restoring node."""
public_key_bytes = base64.urlsafe_b64encode(
- public_key_to_bytes(self._client_public_key)
+ public_key_to_bytes(self._node_public_key)
)
response, call = self._create_node.with_call(
request=CreateNodeRequest(),
@@ -461,20 +459,20 @@ def test_successful_restore_node(self) -> None:
)
node = response.node
- client_node_id = node.node_id
+ node_node_id = node.node_id
assert call.initial_metadata()[0] == expected_metadata
assert isinstance(response, CreateNodeResponse)
request = DeleteNodeRequest(node=node)
shared_secret = generate_shared_key(
- self._client_private_key, self._server_public_key
+ self._node_private_key, self._server_public_key
)
hmac_value = base64.urlsafe_b64encode(
compute_hmac(shared_secret, request.SerializeToString(True))
)
public_key_bytes = base64.urlsafe_b64encode(
- public_key_to_bytes(self._client_public_key)
+ public_key_to_bytes(self._node_public_key)
)
response, call = self._delete_node.with_call(
request=request,
@@ -488,7 +486,7 @@ def test_successful_restore_node(self) -> None:
assert grpc.StatusCode.OK == call.code()
public_key_bytes = base64.urlsafe_b64encode(
- public_key_to_bytes(self._client_public_key)
+ public_key_to_bytes(self._node_public_key)
)
response, call = self._create_node.with_call(
request=CreateNodeRequest(),
@@ -504,4 +502,4 @@ def test_successful_restore_node(self) -> None:
assert call.initial_metadata()[0] == expected_metadata
assert isinstance(response, CreateNodeResponse)
- assert response.node.node_id == client_node_id
+ assert response.node.node_id == node_node_id
diff --git a/src/py/flwr/server/superlink/state/in_memory_state.py b/src/py/flwr/server/superlink/state/in_memory_state.py
index fde8fe41912f..c87ba86e47e7 100644
--- a/src/py/flwr/server/superlink/state/in_memory_state.py
+++ b/src/py/flwr/server/superlink/state/in_memory_state.py
@@ -45,7 +45,7 @@ def __init__(self) -> None:
self.task_ins_store: Dict[UUID, TaskIns] = {}
self.task_res_store: Dict[UUID, TaskRes] = {}
- self.client_public_keys: Set[bytes] = set()
+ self.node_public_keys: Set[bytes] = set()
self.server_public_key: Optional[bytes] = None
self.server_private_key: Optional[bytes] = None
@@ -237,7 +237,7 @@ def create_node(
return node_id
def delete_node(self, node_id: int, public_key: Optional[bytes] = None) -> None:
- """Delete a client node."""
+ """Delete a node."""
with self.lock:
if node_id not in self.node_ids:
raise ValueError(f"Node {node_id} not found")
@@ -254,7 +254,7 @@ def delete_node(self, node_id: int, public_key: Optional[bytes] = None) -> None:
del self.node_ids[node_id]
def get_nodes(self, run_id: int) -> Set[int]:
- """Return all available client nodes.
+ """Return all available nodes.
Constraints
-----------
@@ -271,9 +271,9 @@ def get_nodes(self, run_id: int) -> Set[int]:
if online_until > current_time
}
- def get_node_id(self, client_public_key: bytes) -> Optional[int]:
- """Retrieve stored `node_id` filtered by `client_public_keys`."""
- return self.public_key_to_node_id.get(client_public_key)
+ def get_node_id(self, node_public_key: bytes) -> Optional[int]:
+ """Retrieve stored `node_id` filtered by `node_public_keys`."""
+ return self.public_key_to_node_id.get(node_public_key)
def create_run(
self,
@@ -318,19 +318,19 @@ def get_server_public_key(self) -> Optional[bytes]:
"""Retrieve `server_public_key` in urlsafe bytes."""
return self.server_public_key
- def store_client_public_keys(self, public_keys: Set[bytes]) -> None:
- """Store a set of `client_public_keys` in state."""
+ def store_node_public_keys(self, public_keys: Set[bytes]) -> None:
+ """Store a set of `node_public_keys` in state."""
with self.lock:
- self.client_public_keys = public_keys
+ self.node_public_keys = public_keys
- def store_client_public_key(self, public_key: bytes) -> None:
- """Store a `client_public_key` in state."""
+ def store_node_public_key(self, public_key: bytes) -> None:
+ """Store a `node_public_key` in state."""
with self.lock:
- self.client_public_keys.add(public_key)
+ self.node_public_keys.add(public_key)
- def get_client_public_keys(self) -> Set[bytes]:
- """Retrieve all currently stored `client_public_keys` as a set."""
- return self.client_public_keys
+ def get_node_public_keys(self) -> Set[bytes]:
+ """Retrieve all currently stored `node_public_keys` as a set."""
+ return self.node_public_keys
def get_run(self, run_id: int) -> Optional[Run]:
"""Retrieve information about the run with the specified `run_id`."""
diff --git a/src/py/flwr/server/superlink/state/sqlite_state.py b/src/py/flwr/server/superlink/state/sqlite_state.py
index 93b3cd63ca7f..daa211560912 100644
--- a/src/py/flwr/server/superlink/state/sqlite_state.py
+++ b/src/py/flwr/server/superlink/state/sqlite_state.py
@@ -569,7 +569,7 @@ def create_node(
return node_id
def delete_node(self, node_id: int, public_key: Optional[bytes] = None) -> None:
- """Delete a client node."""
+ """Delete a node."""
query = "DELETE FROM node WHERE node_id = ?"
params = (node_id,)
@@ -607,10 +607,10 @@ def get_nodes(self, run_id: int) -> Set[int]:
result: Set[int] = {row["node_id"] for row in rows}
return result
- def get_node_id(self, client_public_key: bytes) -> Optional[int]:
- """Retrieve stored `node_id` filtered by `client_public_keys`."""
+ def get_node_id(self, node_public_key: bytes) -> Optional[int]:
+ """Retrieve stored `node_id` filtered by `node_public_keys`."""
query = "SELECT node_id FROM node WHERE public_key = :public_key;"
- row = self.query(query, {"public_key": client_public_key})
+ row = self.query(query, {"public_key": node_public_key})
if len(row) > 0:
node_id: int = row[0]["node_id"]
return node_id
@@ -684,19 +684,19 @@ def get_server_public_key(self) -> Optional[bytes]:
public_key = None
return public_key
- def store_client_public_keys(self, public_keys: Set[bytes]) -> None:
- """Store a set of `client_public_keys` in state."""
+ def store_node_public_keys(self, public_keys: Set[bytes]) -> None:
+ """Store a set of `node_public_keys` in state."""
query = "INSERT INTO public_key (public_key) VALUES (?)"
data = [(key,) for key in public_keys]
self.query(query, data)
- def store_client_public_key(self, public_key: bytes) -> None:
- """Store a `client_public_key` in state."""
+ def store_node_public_key(self, public_key: bytes) -> None:
+ """Store a `node_public_key` in state."""
query = "INSERT INTO public_key (public_key) VALUES (:public_key)"
self.query(query, {"public_key": public_key})
- def get_client_public_keys(self) -> Set[bytes]:
- """Retrieve all currently stored `client_public_keys` as a set."""
+ def get_node_public_keys(self) -> Set[bytes]:
+ """Retrieve all currently stored `node_public_keys` as a set."""
query = "SELECT public_key FROM public_key"
rows = self.query(query)
result: Set[bytes] = {row["public_key"] for row in rows}
diff --git a/src/py/flwr/server/superlink/state/state.py b/src/py/flwr/server/superlink/state/state.py
index 80d3b799bce3..fea53105b23f 100644
--- a/src/py/flwr/server/superlink/state/state.py
+++ b/src/py/flwr/server/superlink/state/state.py
@@ -153,8 +153,8 @@ def get_nodes(self, run_id: int) -> Set[int]:
"""
@abc.abstractmethod
- def get_node_id(self, client_public_key: bytes) -> Optional[int]:
- """Retrieve stored `node_id` filtered by `client_public_keys`."""
+ def get_node_id(self, node_public_key: bytes) -> Optional[int]:
+ """Retrieve stored `node_id` filtered by `node_public_keys`."""
@abc.abstractmethod
def create_run(
@@ -199,16 +199,16 @@ def get_server_public_key(self) -> Optional[bytes]:
"""Retrieve `server_public_key` in urlsafe bytes."""
@abc.abstractmethod
- def store_client_public_keys(self, public_keys: Set[bytes]) -> None:
- """Store a set of `client_public_keys` in state."""
+ def store_node_public_keys(self, public_keys: Set[bytes]) -> None:
+ """Store a set of `node_public_keys` in state."""
@abc.abstractmethod
- def store_client_public_key(self, public_key: bytes) -> None:
- """Store a `client_public_key` in state."""
+ def store_node_public_key(self, public_key: bytes) -> None:
+ """Store a `node_public_key` in state."""
@abc.abstractmethod
- def get_client_public_keys(self) -> Set[bytes]:
- """Retrieve all currently stored `client_public_keys` as a set."""
+ def get_node_public_keys(self) -> Set[bytes]:
+ """Retrieve all currently stored `node_public_keys` as a set."""
@abc.abstractmethod
def acknowledge_ping(self, node_id: int, ping_interval: float) -> bool:
diff --git a/src/py/flwr/server/superlink/state/state_test.py b/src/py/flwr/server/superlink/state/state_test.py
index 3efce9ca0c88..0cf30a42ca2c 100644
--- a/src/py/flwr/server/superlink/state/state_test.py
+++ b/src/py/flwr/server/superlink/state/state_test.py
@@ -575,22 +575,22 @@ def test_store_server_private_public_key_twice(self) -> None:
new_private_key_bytes, new_public_key_bytes
)
- def test_client_public_keys(self) -> None:
- """Test store_client_public_keys and get_client_public_keys from state."""
+ def test_node_public_keys(self) -> None:
+ """Test store_node_public_keys and get_node_public_keys from state."""
# Prepare
state: State = self.state_factory()
key_pairs = [generate_key_pairs() for _ in range(3)]
public_keys = {public_key_to_bytes(pair[1]) for pair in key_pairs}
# Execute
- state.store_client_public_keys(public_keys)
- client_public_keys = state.get_client_public_keys()
+ state.store_node_public_keys(public_keys)
+ node_public_keys = state.get_node_public_keys()
# Assert
- assert client_public_keys == public_keys
+ assert node_public_keys == public_keys
- def test_client_public_key(self) -> None:
- """Test store_client_public_key and get_client_public_keys from state."""
+ def test_node_public_key(self) -> None:
+ """Test store_node_public_key and get_node_public_keys from state."""
# Prepare
state: State = self.state_factory()
key_pairs = [generate_key_pairs() for _ in range(3)]
@@ -598,11 +598,11 @@ def test_client_public_key(self) -> None:
# Execute
for public_key in public_keys:
- state.store_client_public_key(public_key)
- client_public_keys = state.get_client_public_keys()
+ state.store_node_public_key(public_key)
+ node_public_keys = state.get_node_public_keys()
# Assert
- assert client_public_keys == public_keys
+ assert node_public_keys == public_keys
def test_acknowledge_ping(self) -> None:
"""Test if acknowledge_ping works and if get_nodes return online nodes."""
From 2612332fc26696d972344d8e08d3f4fdc5638944 Mon Sep 17 00:00:00 2001
From: Danny
Date: Sat, 24 Aug 2024 19:32:08 +0200
Subject: [PATCH 08/42] feat(framework) Log `node_id`, `fab_hash` and `run_id`
on SuperLink (#4079)
Signed-off-by: Danny Heinrich
Co-authored-by: Daniel J. Beutel
---
.../fleet/grpc_rere/fleet_servicer.py | 27 +++++++++++++------
1 file changed, 19 insertions(+), 8 deletions(-)
diff --git a/src/py/flwr/server/superlink/fleet/grpc_rere/fleet_servicer.py b/src/py/flwr/server/superlink/fleet/grpc_rere/fleet_servicer.py
index e0501e54fafc..02e34e0bba02 100644
--- a/src/py/flwr/server/superlink/fleet/grpc_rere/fleet_servicer.py
+++ b/src/py/flwr/server/superlink/fleet/grpc_rere/fleet_servicer.py
@@ -51,19 +51,22 @@ def CreateNode(
self, request: CreateNodeRequest, context: grpc.ServicerContext
) -> CreateNodeResponse:
"""."""
- log(INFO, "FleetServicer.CreateNode")
+ log(INFO, "[Fleet.CreateNode] Request ping_interval=%s", request.ping_interval)
+ log(DEBUG, "[Fleet.CreateNode] Request: %s", request)
response = message_handler.create_node(
request=request,
state=self.state_factory.state(),
)
- log(INFO, "FleetServicer: Created node_id=%s", response.node.node_id)
+ log(INFO, "[Fleet.CreateNode] Created node_id=%s", response.node.node_id)
+ log(DEBUG, "[Fleet.CreateNode] Response: %s", response)
return response
def DeleteNode(
self, request: DeleteNodeRequest, context: grpc.ServicerContext
) -> DeleteNodeResponse:
"""."""
- log(INFO, "FleetServicer.DeleteNode")
+ log(INFO, "[Fleet.DeleteNode] Delete node_id=%s", request.node.node_id)
+ log(DEBUG, "[Fleet.DeleteNode] Request: %s", request)
return message_handler.delete_node(
request=request,
state=self.state_factory.state(),
@@ -71,7 +74,7 @@ def DeleteNode(
def Ping(self, request: PingRequest, context: grpc.ServicerContext) -> PingResponse:
"""."""
- log(DEBUG, "FleetServicer.Ping")
+ log(DEBUG, "[Fleet.Ping] Request: %s", request)
return message_handler.ping(
request=request,
state=self.state_factory.state(),
@@ -81,7 +84,8 @@ def PullTaskIns(
self, request: PullTaskInsRequest, context: grpc.ServicerContext
) -> PullTaskInsResponse:
"""Pull TaskIns."""
- log(INFO, "FleetServicer.PullTaskIns")
+ log(INFO, "[Fleet.PullTaskIns] node_id=%s", request.node.node_id)
+ log(DEBUG, "[Fleet.PullTaskIns] Request: %s", request)
return message_handler.pull_task_ins(
request=request,
state=self.state_factory.state(),
@@ -91,7 +95,14 @@ def PushTaskRes(
self, request: PushTaskResRequest, context: grpc.ServicerContext
) -> PushTaskResResponse:
"""Push TaskRes."""
- log(INFO, "FleetServicer.PushTaskRes")
+ if request.task_res_list:
+ log(
+ INFO,
+ "[Fleet.PushTaskRes] Push results from node_id=%s",
+ request.task_res_list[0].task.producer.node_id,
+ )
+ else:
+ log(INFO, "[Fleet.PushTaskRes] No task results to push")
return message_handler.push_task_res(
request=request,
state=self.state_factory.state(),
@@ -101,7 +112,7 @@ def GetRun(
self, request: GetRunRequest, context: grpc.ServicerContext
) -> GetRunResponse:
"""Get run information."""
- log(INFO, "FleetServicer.GetRun")
+ log(INFO, "[Fleet.GetRun] Requesting `Run` for run_id=%s", request.run_id)
return message_handler.get_run(
request=request,
state=self.state_factory.state(),
@@ -111,7 +122,7 @@ def GetFab(
self, request: GetFabRequest, context: grpc.ServicerContext
) -> GetFabResponse:
"""Get FAB."""
- log(DEBUG, "DriverServicer.GetFab")
+ log(INFO, "[Fleet.GetFab] Requesting FAB for fab_hash=%s", request.hash_str)
return message_handler.get_fab(
request=request,
ffs=self.ffs_factory.ffs(),
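The new log statements keep the lazy `%`-style formatting of the `log` helper, and `PushTaskRes` now guards against an empty `task_res_list` before indexing its first element. A minimal sketch of the same guard pattern using only the standard library (the integer list is a stand-in for the real `TaskRes` protos):

```python
import logging
from typing import List

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("fleet")


def log_push_task_res(producer_node_ids: List[int]) -> None:
    """Log the producer of the first result, or a fallback if there is none."""
    if producer_node_ids:
        # Lazy %-formatting: interpolated only if the INFO level is enabled
        logger.info(
            "[Fleet.PushTaskRes] Push results from node_id=%s", producer_node_ids[0]
        )
    else:
        logger.info("[Fleet.PushTaskRes] No task results to push")


log_push_task_res([42])  # -> "[Fleet.PushTaskRes] Push results from node_id=42"
log_push_task_res([])    # -> "[Fleet.PushTaskRes] No task results to push"
```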
From 559408bd29baf3119b6a57a5c4afa197d9367bdd Mon Sep 17 00:00:00 2001
From: Taner Topal
Date: Mon, 26 Aug 2024 13:09:54 +0200
Subject: [PATCH 09/42] break(framework) Remove `flwr example` command (#4084)
---
src/py/flwr/cli/app.py | 2 --
1 file changed, 2 deletions(-)
diff --git a/src/py/flwr/cli/app.py b/src/py/flwr/cli/app.py
index d1b270026cd7..93effea6df98 100644
--- a/src/py/flwr/cli/app.py
+++ b/src/py/flwr/cli/app.py
@@ -18,7 +18,6 @@
from typer.main import get_command
from .build import build
-from .example import example
from .install import install
from .new import new
from .run import run
@@ -33,7 +32,6 @@
)
app.command()(new)
-app.command()(example)
app.command()(run)
app.command()(build)
app.command()(install)
From 6651729128c189408270c3dbe5256dd22deaa03a Mon Sep 17 00:00:00 2001
From: Adam Narozniak <51029327+adam-narozniak@users.noreply.github.com>
Date: Mon, 26 Aug 2024 14:05:32 +0200
Subject: [PATCH 10/42] feat(datasets) Add grouped natural id partitioner
(#4051)
Co-authored-by: Carlos Mari
Co-authored-by: jafermarq
---
.../flwr_datasets/partitioner/__init__.py | 2 +
.../grouped_natural_id_partitioner.py | 224 +++++++++++++
.../grouped_natural_id_partitioner_test.py | 310 ++++++++++++++++++
3 files changed, 536 insertions(+)
create mode 100644 datasets/flwr_datasets/partitioner/grouped_natural_id_partitioner.py
create mode 100644 datasets/flwr_datasets/partitioner/grouped_natural_id_partitioner_test.py
diff --git a/datasets/flwr_datasets/partitioner/__init__.py b/datasets/flwr_datasets/partitioner/__init__.py
index bb35d5a85cdc..acb2e6e832f5 100644
--- a/datasets/flwr_datasets/partitioner/__init__.py
+++ b/datasets/flwr_datasets/partitioner/__init__.py
@@ -18,6 +18,7 @@
from .dirichlet_partitioner import DirichletPartitioner
from .distribution_partitioner import DistributionPartitioner
from .exponential_partitioner import ExponentialPartitioner
+from .grouped_natural_id_partitioner import GroupedNaturalIdPartitioner
from .iid_partitioner import IidPartitioner
from .inner_dirichlet_partitioner import InnerDirichletPartitioner
from .linear_partitioner import LinearPartitioner
@@ -32,6 +33,7 @@
"DirichletPartitioner",
"DistributionPartitioner",
"ExponentialPartitioner",
+ "GroupedNaturalIdPartitioner",
"IidPartitioner",
"InnerDirichletPartitioner",
"LinearPartitioner",
diff --git a/datasets/flwr_datasets/partitioner/grouped_natural_id_partitioner.py b/datasets/flwr_datasets/partitioner/grouped_natural_id_partitioner.py
new file mode 100644
index 000000000000..f10d80b3aaac
--- /dev/null
+++ b/datasets/flwr_datasets/partitioner/grouped_natural_id_partitioner.py
@@ -0,0 +1,224 @@
+# Copyright 2024 Flower Labs GmbH. All Rights Reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+# ==============================================================================
+"""Grouped natural id partitioner class that works with Hugging Face Datasets."""
+
+
+from typing import Any, Dict, List, Literal
+
+import numpy as np
+
+import datasets
+from flwr_datasets.common.typing import NDArrayInt
+from flwr_datasets.partitioner.partitioner import Partitioner
+
+
+class GroupedNaturalIdPartitioner(Partitioner):
+ """Partition dataset by creating groups of natural ids.
+
+ Conceptually, you can think of this partitioner as a way of creating an organization
+ of `group_size` users instead of each user representing a separate partition. You can
+ change the nature of the problem from cross-device to cross-silo (cross-organization).
+
+ Parameters
+ ----------
+ partition_by: str
+ The name of the column that contains the unique values of partitions.
+ group_size: int
+ The number of unique ids that will be placed in a single group.
+ mode: Literal["allow-smaller", "allow-bigger", "drop-reminder", "strict"]
+ The mode that will be used to handle the remainder of the unique ids.
+ - "allow-smaller": The last group can be smaller than the group_size.
+ - "allow-bigger": The first group can be bigger than the group_size.
+ - "drop-reminder": The last group will be dropped if it is smaller than the
+ group_size.
+ - "strict": Raises a ValueError if the remainder is not zero. In this mode, you
+ expect each group to have the same size.
+ sort_unique_ids: bool
+ If True, the unique natural ids will be sorted before creating the groups.
+
+ Examples
+ --------
+ Partition users in the "sentiment140" (aka Twitter) dataset into groups of two
+ users following the default mode:
+
+ >>> from flwr_datasets import FederatedDataset
+ >>> from flwr_datasets.partitioner import GroupedNaturalIdPartitioner
+ >>>
+ >>> partitioner = GroupedNaturalIdPartitioner(partition_by="user", group_size=2)
+ >>> fds = FederatedDataset(dataset="sentiment140",
+ >>> partitioners={"train": partitioner})
+ >>> partition = fds.load_partition(0)
+ """
+
+ def __init__(
+ self,
+ partition_by: str,
+ group_size: int,
+ mode: Literal[
+ "allow-smaller", "allow-bigger", "drop-reminder", "strict"
+ ] = "allow-smaller",
+ sort_unique_ids: bool = False,
+ ) -> None:
+ super().__init__()
+ self._partition_id_to_natural_ids: Dict[int, List[Any]] = {}
+ self._natural_id_to_partition_id: Dict[Any, int] = {}
+ self._partition_id_to_indices: Dict[int, NDArrayInt] = {}
+ self._partition_by = partition_by
+ self._mode = mode
+ self._sort_unique_ids = sort_unique_ids
+
+ if group_size <= 0:
+ raise ValueError("group_size must be a positive integer")
+ self._group_size = group_size
+
+ def _create_int_partition_id_to_natural_id(self) -> None:
+ """Create a mapping from int indices to unique client ids from dataset.
+
+ Natural ids come from the column specified in `partition_by`.
+ """
+ unique_natural_ids = self.dataset.unique(self._partition_by)
+ if self._mode != "allow-smaller" and self._group_size > len(unique_natural_ids):
+ raise ValueError(
+ "The group size needs to be smaller than the number of the unique "
+ "natural ids unless you are using allow-smaller mode which will "
+ "result in a single partition."
+ )
+ if self._sort_unique_ids:
+ unique_natural_ids = sorted(unique_natural_ids)
+ num_unique_natural_ids = len(unique_natural_ids)
+ remainder = num_unique_natural_ids % self._group_size
+ num_groups = num_unique_natural_ids // self._group_size
+ if num_groups == 0 and self._mode == "allow-smaller":
+ num_groups = 1
+ remainder = 0
+ # Note that the final number of groups might differ from this number
+ # due to certain modes; it's a base value.
+
+ if self._mode == "allow-bigger":
+ groups_of_natural_ids = np.array_split(unique_natural_ids, num_groups)
+ elif self._mode == "drop-reminder":
+ # Trim unique_natural_ids so that np.array_split does not
+ # create a bigger group (its default behavior with a remainder)
+ unique_natural_ids = unique_natural_ids[
+ : int(num_groups * self._group_size)
+ ]
+ groups_of_natural_ids = np.array_split(unique_natural_ids, num_groups)
+ elif self._mode == "allow-smaller":
+ if remainder > 0:
+ last_group_ids = unique_natural_ids[-remainder:]
+ unique_natural_ids = unique_natural_ids[
+ : int(num_groups * self._group_size)
+ ]
+ groups_of_natural_ids = np.array_split(unique_natural_ids, num_groups)
+ if remainder > 0:
+ groups_of_natural_ids.append(np.array(last_group_ids))
+ elif self._mode == "strict":
+ if remainder != 0:
+ raise ValueError(
+ "Strict mode requires that the number of unique natural ids is "
+ "perfectly divisible by the group size. "
+ f"Found remainder: {remainder}. Please pass the group_size that "
+ f"enables strict mode or relax the mode parameter. Refer to the "
+ f"documentation of the mode parameter for the available modes."
+ )
+ groups_of_natural_ids = np.array_split(unique_natural_ids, num_groups)
+ else:
+ raise ValueError(
+ f"Given {self._mode} is not a valid mode. Refer to the documentation of"
+ " the mode parameter for the available modes."
+ )
+
+ self._partition_id_to_natural_ids = {}
+ for group_of_natural_ids_id, group_of_natural_ids in enumerate(
+ groups_of_natural_ids
+ ):
+ self._partition_id_to_natural_ids[group_of_natural_ids_id] = (
+ group_of_natural_ids.tolist()
+ )
+
+ def _create_natural_id_to_int_partition_id(self) -> None:
+ """Create a mapping from unique client ids from dataset to int indices.
+
+ Natural ids come from the column specified in `partition_by`. This object is
+ inverse of the `self._partition_id_to_natural_id`. This method assumes that
+ `self._partition_id_to_natural_id` already exists.
+ """
+ self._natural_id_to_partition_id = {}
+ for partition_id, natural_ids in self._partition_id_to_natural_ids.items():
+ for natural_id in natural_ids:
+ self._natural_id_to_partition_id[natural_id] = partition_id
+
+ def _create_partition_id_to_indices(self) -> None:
+ natural_id_to_indices = {} # type: ignore
+ natural_ids = np.array(self.dataset[self._partition_by])
+
+ for index, natural_id in enumerate(natural_ids):
+ if natural_id not in natural_id_to_indices:
+ natural_id_to_indices[natural_id] = []
+ natural_id_to_indices[natural_id].append(index)
+
+ self._partition_id_to_indices = {}
+ for partition_id, natural_id_group in self._partition_id_to_natural_ids.items():
+ indices = []
+ for natural_id in natural_id_group:
+ indices.extend(natural_id_to_indices[natural_id])
+ self._partition_id_to_indices[partition_id] = np.array(indices)
+
+ def load_partition(self, partition_id: int) -> datasets.Dataset:
+ """Load a single partition corresponding to a single `partition_id`.
+
+ The choice of the partition is based on unique integers assigned to each
+ natural id present in the dataset in the `partition_by` column.
+
+ Parameters
+ ----------
+ partition_id : int
+ the index that corresponds to the requested partition
+
+ Returns
+ -------
+ dataset_partition : Dataset
+ single dataset partition
+ """
+ if len(self._partition_id_to_natural_ids) == 0:
+ self._create_int_partition_id_to_natural_id()
+ self._create_natural_id_to_int_partition_id()
+
+ if len(self._partition_id_to_indices) == 0:
+ self._create_partition_id_to_indices()
+
+ return self.dataset.select(self._partition_id_to_indices[partition_id])
+
+ @property
+ def num_partitions(self) -> int:
+ """Total number of partitions."""
+ if len(self._partition_id_to_natural_ids) == 0:
+ self._create_int_partition_id_to_natural_id()
+ self._create_natural_id_to_int_partition_id()
+ return len(self._partition_id_to_natural_ids)
+
+ @property
+ def partition_id_to_natural_ids(self) -> Dict[int, List[Any]]:
+ """Partition id to the corresponding group of natural ids present.
+
+ Natural ids are the unique values in `partition_by` column in dataset.
+ """
+ return self._partition_id_to_natural_ids
+
+ @property
+ def natural_id_to_partition_id(self) -> Dict[Any, int]:
+ """Natural id to the corresponding partition id."""
+ return self._natural_id_to_partition_id
diff --git a/datasets/flwr_datasets/partitioner/grouped_natural_id_partitioner_test.py b/datasets/flwr_datasets/partitioner/grouped_natural_id_partitioner_test.py
new file mode 100644
index 000000000000..635d3850624d
--- /dev/null
+++ b/datasets/flwr_datasets/partitioner/grouped_natural_id_partitioner_test.py
@@ -0,0 +1,310 @@
+# Copyright 2024 Flower Labs GmbH. All Rights Reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+# ==============================================================================
+"""Test GroupedNaturalIdPartitioner."""
+
+
+import unittest
+from typing import List, Literal, Set
+
+from parameterized import parameterized, parameterized_class
+
+from datasets import Dataset
+from flwr_datasets.partitioner.grouped_natural_id_partitioner import (
+ GroupedNaturalIdPartitioner,
+)
+
+
+def _create_dataset(num_rows: int, n_unique_natural_ids: int) -> Dataset:
+ """Create dataset based on the number of rows and unique natural ids."""
+ data = {
+ "features": list(range(num_rows)),
+ "natural_id": [f"{i % n_unique_natural_ids}" for i in range(num_rows)],
+ "labels": [i % 2 for i in range(num_rows)],
+ }
+ dataset = Dataset.from_dict(data)
+ return dataset
+
+
+# mypy: disable-error-code="attr-defined"
+@parameterized_class(
+ ("sort_unique_ids",),
+ [
+ (False,),
+ (True,),
+ ],
+)
+# pylint: disable=no-member
+class TestGroupedNaturalIdPartitioner(unittest.TestCase):
+ """Test GroupedNaturalIdPartitioner."""
+
+ @parameterized.expand( # type: ignore
+ # num_rows, num_unique_natural_ids, group_size, expected_num_partitions
+ [
+ [10, 10, 2, 5],
+ [11, 10, 2, 5],
+ [100, 10, 2, 5],
+ [12, 6, 3, 2],
+ ]
+ )
+ def test_strict_mode_num_partitions_and_partition_sizes(
+ self,
+ num_rows: int,
+ num_unique_natural_id: int,
+ group_size: int,
+ expected_num_partitions: int,
+ ) -> None:
+ """Test strict mode with valid group size."""
+ dataset = _create_dataset(num_rows, num_unique_natural_id)
+ partitioner = GroupedNaturalIdPartitioner(
+ partition_by="natural_id",
+ group_size=group_size,
+ mode="strict",
+ sort_unique_ids=self.sort_unique_ids,
+ )
+ partitioner.dataset = dataset
+ # Trigger partitioning
+ _ = partitioner.load_partition(0)
+ self.assertEqual(partitioner.num_partitions, expected_num_partitions)
+
+ @parameterized.expand( # type: ignore
+ # num_rows, num_unique_natural_ids, group_size, expected_num_unique_natural_ids
+ [
+ [10, 10, 2, [2, 2, 2, 2, 2]],
+ [100, 10, 2, [2, 2, 2, 2, 2]],
+ [12, 6, 3, [3, 3]],
+ # The cases in which the partitions should be smaller
+ [10, 7, 2, [2, 2, 2, 1]],
+ [10, 3, 2, [2, 1]],
+ ]
+ )
+ def test_allow_smaller_mode_num_partitions_and_partition_sizes(
+ self,
+ num_rows: int,
+ num_unique_natural_id: int,
+ group_size: int,
+ expected_num_unique_natural_ids: List[int],
+ ) -> None:
+ """Test allow-smaller mode handles the remainder correctly."""
+ dataset = _create_dataset(num_rows, num_unique_natural_id)
+ partitioner = GroupedNaturalIdPartitioner(
+ partition_by="natural_id",
+ group_size=group_size,
+ mode="allow-smaller",
+ sort_unique_ids=self.sort_unique_ids,
+ )
+ partitioner.dataset = dataset
+ # Trigger partitioning
+ partitions = [
+ partitioner.load_partition(i) for i in range(partitioner.num_partitions)
+ ]
+ unique_natural_ids = [
+ len(partition.unique("natural_id")) for partition in partitions
+ ]
+ self.assertEqual(unique_natural_ids, expected_num_unique_natural_ids)
+
+ @parameterized.expand( # type: ignore
+ # num_rows, num_unique_natural_ids, group_size, expected_num_unique_natural_ids
+ [
+ [10, 10, 2, [2, 2, 2, 2, 2]],
+ [100, 10, 2, [2, 2, 2, 2, 2]],
+ [12, 6, 3, [3, 3]],
+ # The cases in which the partitions should be bigger
+ [10, 7, 2, [3, 2, 2]],
+ [10, 3, 2, [3]],
+ ]
+ )
+ def test_allow_bigger_mode_num_partitions_and_partition_sizes(
+ self,
+ num_rows: int,
+ num_unique_natural_id: int,
+ group_size: int,
+ expected_num_unique_natural_ids: List[int],
+ ) -> None:
+ """Test allow-bigger mode handles the remainder correctly."""
+ dataset = _create_dataset(num_rows, num_unique_natural_id)
+ partitioner = GroupedNaturalIdPartitioner(
+ partition_by="natural_id",
+ group_size=group_size,
+ mode="allow-bigger",
+ sort_unique_ids=self.sort_unique_ids,
+ )
+ partitioner.dataset = dataset
+ # Trigger partitioning
+ partitions = [
+ partitioner.load_partition(i) for i in range(partitioner.num_partitions)
+ ]
+ unique_natural_ids = [
+ len(partition.unique("natural_id")) for partition in partitions
+ ]
+ self.assertEqual(unique_natural_ids, expected_num_unique_natural_ids)
+
+ @parameterized.expand( # type: ignore
+ # num_rows, num_unique_natural_ids, group_size, expected_num_unique_natural_ids
+ [
+ [10, 10, 2, [2, 2, 2, 2, 2]],
+ [100, 10, 2, [2, 2, 2, 2, 2]],
+ [12, 6, 3, [3, 3]],
+ # The cases in which the remainder of unique ids is dropped
+ [10, 7, 2, [2, 2, 2]],
+ [10, 3, 2, [2]],
+ ]
+ )
+ def test_drop_reminder_mode_num_partitions_and_partition_sizes(
+ self,
+ num_rows: int,
+ num_unique_natural_id: int,
+ group_size: int,
+ expected_num_unique_natural_ids: List[int],
+ ) -> None:
+ """Test drop reminder mode."""
+ dataset = _create_dataset(num_rows, num_unique_natural_id)
+ partitioner = GroupedNaturalIdPartitioner(
+ partition_by="natural_id",
+ group_size=group_size,
+ mode="drop-reminder",
+ sort_unique_ids=self.sort_unique_ids,
+ )
+ partitioner.dataset = dataset
+ # Trigger partitioning
+ partitions = [
+ partitioner.load_partition(i) for i in range(partitioner.num_partitions)
+ ]
+ unique_natural_ids = [
+ len(partition.unique("natural_id")) for partition in partitions
+ ]
+ self.assertEqual(unique_natural_ids, expected_num_unique_natural_ids)
+
+ @parameterized.expand( # type: ignore
+ # mode, num_rows, num_unique_natural_ids, group_size
+ [
+ ["strict", 10, 10, 2],
+ ["allow-smaller", 10, 7, 2],
+ ["allow-bigger", 10, 7, 2],
+ ["drop-reminder", 10, 7, 2],
+ ["strict", 12, 6, 3],
+ ["allow-smaller", 12, 6, 3],
+ ["allow-bigger", 12, 6, 3],
+ ["drop-reminder", 12, 6, 3],
+ ["allow-smaller", 10, 2, 3],
+ ]
+ )
+ def test_no_overlapping_natural_ids(
+ self,
+ mode: Literal["allow-smaller", "allow-bigger", "drop-reminder", "strict"],
+ num_rows: int,
+ num_unique_natural_id: int,
+ group_size: int,
+ ) -> None:
+ """Test that no natural_ids overlap across partitions."""
+ dataset = _create_dataset(num_rows, num_unique_natural_id)
+ partitioner = GroupedNaturalIdPartitioner(
+ partition_by="natural_id",
+ group_size=group_size,
+ mode=mode,
+ sort_unique_ids=self.sort_unique_ids,
+ )
+ partitioner.dataset = dataset
+
+ # Trigger partitioning
+ partitions = [
+ partitioner.load_partition(i) for i in range(partitioner.num_partitions)
+ ]
+
+ # Check for overlaps between partitions
+ seen_natural_ids: Set[str] = set()
+ for partition in partitions:
+ natural_ids_in_partition = set(partition.unique("natural_id"))
+
+ # Check if there is any overlap with previously seen natural IDs
+ overlap = seen_natural_ids.intersection(natural_ids_in_partition)
+ self.assertTrue(
+ len(overlap) == 0,
+ f"Overlapping natural IDs found between partitions in mode: {mode}. "
+ f"Overlapping IDs: {overlap}",
+ )
+
+ # Add the natural IDs from this partition to the seen set
+ seen_natural_ids.update(natural_ids_in_partition)
+
+ def test_group_size_bigger_than_num_unique_natural_ids_allow_smaller(self) -> None:
+ """Test the allow-smaller mode with group size > number of unique natural ids.
+
+ That's the only mode that should work in this scenario.
+ """
+ dataset = _create_dataset(num_rows=10, n_unique_natural_ids=2)
+ expected_num_unique_natural_ids = [2]
+ partitioner = GroupedNaturalIdPartitioner(
+ partition_by="natural_id",
+ group_size=3,
+ mode="allow-smaller",
+ sort_unique_ids=self.sort_unique_ids,
+ )
+ partitioner.dataset = dataset
+ # Trigger partitioning
+ partitions = [
+ partitioner.load_partition(i) for i in range(partitioner.num_partitions)
+ ]
+ unique_natural_ids = [
+ len(partition.unique("natural_id")) for partition in partitions
+ ]
+
+ self.assertEqual(unique_natural_ids, expected_num_unique_natural_ids)
+
+ def test_strict_mode_with_invalid_group_size(self) -> None:
+ """Test strict mode raises if group_size does not divide unique IDs evenly."""
+ dataset = _create_dataset(num_rows=10, n_unique_natural_ids=3)
+ partitioner = GroupedNaturalIdPartitioner(
+ partition_by="natural_id",
+ group_size=2,
+ mode="strict",
+ sort_unique_ids=self.sort_unique_ids,
+ )
+ partitioner.dataset = dataset
+ with self.assertRaises(ValueError) as context:
+ _ = partitioner.load_partition(0)
+ self.assertIn(
+ "Strict mode requires that the number of unique natural ids is perfectly "
+ "divisible by the group size.",
+ str(context.exception),
+ )
+
+ def test_too_big_group_size(self) -> None:
+ """Test raises if the group size > than the number of unique natural ids."""
+ n_unique_natural_ids = 3
+ dataset = _create_dataset(
+ num_rows=10, n_unique_natural_ids=n_unique_natural_ids
+ )
+ partitioner = GroupedNaturalIdPartitioner(
+ partition_by="natural_id",
+ group_size=n_unique_natural_ids + 1,
+ mode="allow-bigger",
+ sort_unique_ids=self.sort_unique_ids,
+ )
+ partitioner.dataset = dataset
+ with self.assertRaises(ValueError) as context:
+ _ = partitioner.load_partition(0)
+ self.assertIn(
+ "The group size needs to be smaller than the number of the unique "
+ "natural ids unless you are using allow-smaller mode which will "
+ "result in a single partition.",
+ str(context.exception),
+ )
+
+
+if __name__ == "__main__":
+ unittest.main()
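To make the mode semantics concrete: with 7 unique natural ids and `group_size=2`, the tests above expect groups of `[2, 2, 2, 1]` ids under `allow-smaller`, `[3, 2, 2]` under `allow-bigger`, and `[2, 2, 2]` under `drop-reminder`. A minimal sketch reproducing this, with a synthetic dataset mirroring `_create_dataset` from the test file:

```python
from datasets import Dataset
from flwr_datasets.partitioner import GroupedNaturalIdPartitioner

# 14 rows spread over 7 unique natural ids ("0" .. "6")
dataset = Dataset.from_dict(
    {
        "features": list(range(14)),
        "natural_id": [f"{i % 7}" for i in range(14)],
    }
)

for mode in ["allow-smaller", "allow-bigger", "drop-reminder"]:
    partitioner = GroupedNaturalIdPartitioner(
        partition_by="natural_id", group_size=2, mode=mode
    )
    partitioner.dataset = dataset
    # Number of unique natural ids per partition
    sizes = [
        len(partitioner.load_partition(i).unique("natural_id"))
        for i in range(partitioner.num_partitions)
    ]
    print(mode, sizes)
    # allow-smaller -> [2, 2, 2, 1]
    # allow-bigger  -> [3, 2, 2]
    # drop-reminder -> [2, 2, 2]
```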
From 95b9f116269be4f2e42a9daa10dadf89df0216c5 Mon Sep 17 00:00:00 2001
From: Yan Gao
Date: Tue, 27 Aug 2024 14:27:28 +0100
Subject: [PATCH 11/42] fix(framework) Update FlowerTune template code (#3997)
Co-authored-by: Javier
---
src/py/flwr/cli/new/new.py | 16 ++--
.../templates/app/README.flowertune.md.tpl | 22 +++--
.../templates/app/code/flwr_tune/app.py.tpl | 89 -----------------
.../{client.py.tpl => client_app.py.tpl} | 90 ++++++++++--------
.../app/code/flwr_tune/config.yaml.tpl | 34 -------
.../app/code/flwr_tune/dataset.py.tpl | 34 ++++++-
.../app/code/flwr_tune/models.py.tpl | 3 -
.../app/code/flwr_tune/server.py.tpl | 48 ----------
.../app/code/flwr_tune/server_app.py.tpl | 95 +++++++++++++++++++
.../app/code/flwr_tune/static_config.yaml.tpl | 11 ---
.../app/code/flwr_tune/strategy.py.tpl | 64 +++++++++++++
.../app/pyproject.flowertune.toml.tpl | 41 ++++++--
12 files changed, 297 insertions(+), 250 deletions(-)
delete mode 100644 src/py/flwr/cli/new/templates/app/code/flwr_tune/app.py.tpl
rename src/py/flwr/cli/new/templates/app/code/flwr_tune/{client.py.tpl => client_app.py.tpl} (66%)
delete mode 100644 src/py/flwr/cli/new/templates/app/code/flwr_tune/config.yaml.tpl
delete mode 100644 src/py/flwr/cli/new/templates/app/code/flwr_tune/server.py.tpl
create mode 100644 src/py/flwr/cli/new/templates/app/code/flwr_tune/server_app.py.tpl
delete mode 100644 src/py/flwr/cli/new/templates/app/code/flwr_tune/static_config.yaml.tpl
create mode 100644 src/py/flwr/cli/new/templates/app/code/flwr_tune/strategy.py.tpl
diff --git a/src/py/flwr/cli/new/new.py b/src/py/flwr/cli/new/new.py
index 862244da9158..31da7b4ab9fb 100644
--- a/src/py/flwr/cli/new/new.py
+++ b/src/py/flwr/cli/new/new.py
@@ -187,24 +187,20 @@ def new(
"pyproject.toml": {"template": f"app/pyproject.{framework_str}.toml.tpl"},
"README.md": {"template": f"app/README.{framework_str}.md.tpl"},
f"{import_name}/__init__.py": {"template": "app/code/__init__.py.tpl"},
- f"{import_name}/server.py": {
- "template": "app/code/flwr_tune/server.py.tpl"
+ f"{import_name}/server_app.py": {
+ "template": "app/code/flwr_tune/server_app.py.tpl"
},
- f"{import_name}/client.py": {
- "template": "app/code/flwr_tune/client.py.tpl"
+ f"{import_name}/client_app.py": {
+ "template": "app/code/flwr_tune/client_app.py.tpl"
},
- f"{import_name}/app.py": {"template": "app/code/flwr_tune/app.py.tpl"},
f"{import_name}/models.py": {
"template": "app/code/flwr_tune/models.py.tpl"
},
f"{import_name}/dataset.py": {
"template": "app/code/flwr_tune/dataset.py.tpl"
},
- f"{import_name}/conf/config.yaml": {
- "template": "app/code/flwr_tune/config.yaml.tpl"
- },
- f"{import_name}/conf/static_config.yaml": {
- "template": "app/code/flwr_tune/static_config.yaml.tpl"
+ f"{import_name}/strategy.py": {
+ "template": "app/code/flwr_tune/strategy.py.tpl"
},
}
diff --git a/src/py/flwr/cli/new/templates/app/README.flowertune.md.tpl b/src/py/flwr/cli/new/templates/app/README.flowertune.md.tpl
index 2b59937e4130..4bdc9c779a29 100644
--- a/src/py/flwr/cli/new/templates/app/README.flowertune.md.tpl
+++ b/src/py/flwr/cli/new/templates/app/README.flowertune.md.tpl
@@ -23,10 +23,12 @@ pip install -e .
## Experimental setup
-The dataset is partitioned into $num_clients shards with IID fashion serving as clients.
-We randomly sample $fraction_fit clients to be available for each round,
-and the federated fine-tuning lasts for `200` rounds.
-All settings are defined in `$project_name/conf/static_config.yaml`, which is not allowed to be modified for fair competition if you plan to participated in the [LLM leaderboard](https://flower.ai/benchmarks/llm-leaderboard).
+The dataset is divided into $num_clients partitions in an IID fashion; one partition is assigned to each ClientApp.
+We randomly sample a fraction ($fraction_fit) of the total nodes to participate in each round, for a total of `200` rounds.
+All settings are defined in `pyproject.toml`.
+
+> [!IMPORTANT]
+> Please note that `[tool.flwr.app.config.static]` and `options.num-supernodes` under `[tool.flwr.federations.local-simulation]` are not allowed to be modified for fair competition if you plan to participate in the [LLM leaderboard](https://flower.ai/benchmarks/llm-leaderboard).
## Running the challenge
@@ -39,7 +41,7 @@ huggingface-cli login
```
Run the challenge with default config values.
-The configs are in `$project_name/conf/config.yaml` and `$project_name/conf/static_config.yaml`, and are loaded automatically.
+The configs are defined in the `[tool.flwr.app.config]` entry of `pyproject.toml` and are loaded automatically.
```bash
flwr run
@@ -53,4 +55,12 @@ We use Mistral-7B model with 4-bit quantization as default. The estimated VRAM c
| :--------: | :--------: | :--------: | :--------: | :--------: |
| VRAM | ~25.50 GB | ~17.30 GB | ~22.80 GB | ~17.40 GB |
-You can adjust the CPU/GPU resources you assign to each of the clients based on your device, which is specified with `flower.engine.simulation` in `pyproject.toml`.
+You can adjust the CPU/GPU resources you assign to each of the clients based on your device, which are specified with `options.backend.clientapp-cpus` and `options.backend.clientapp-gpus` under the `[tool.flwr.federations.local-simulation]` entry in `pyproject.toml`.
+
+
+## Model saving
+
+The global PEFT model checkpoints are saved every 5 rounds after aggregation on the server side by default; this can be configured with `train.save-every-round` under the `[tool.flwr.app.config]` entry in `pyproject.toml`.
+
+> [!NOTE]
+> Please provide the last PEFT checkpoint if you plan to participate in the [LLM leaderboard](https://flower.ai/benchmarks/llm-leaderboard).
diff --git a/src/py/flwr/cli/new/templates/app/code/flwr_tune/app.py.tpl b/src/py/flwr/cli/new/templates/app/code/flwr_tune/app.py.tpl
deleted file mode 100644
index 637658c5b23c..000000000000
--- a/src/py/flwr/cli/new/templates/app/code/flwr_tune/app.py.tpl
+++ /dev/null
@@ -1,89 +0,0 @@
-"""$project_name: A Flower / FlowerTune app."""
-
-import os
-import warnings
-from datetime import datetime
-
-from flwr_datasets import FederatedDataset
-from hydra import compose, initialize
-from hydra.utils import instantiate
-
-from flwr.client import ClientApp
-from flwr.common import Context, ndarrays_to_parameters
-from flwr.server import ServerApp, ServerAppComponents, ServerConfig
-
-from $import_name.client_app import gen_client_fn, get_parameters
-from $import_name.dataset import get_tokenizer_and_data_collator_and_propt_formatting
-from $import_name.models import get_model
-from $import_name.server_app import fit_weighted_average, get_evaluate_fn, get_on_fit_config
-
-# Avoid warnings
-warnings.filterwarnings("ignore", category=UserWarning)
-os.environ["TOKENIZERS_PARALLELISM"] = "true"
-os.environ["RAY_DISABLE_DOCKER_CPU_WARNING"] = "1"
-
-# Initialise regular config
-with initialize(config_path="conf", version_base="1.1"):
- cfg = compose(config_name="config")
-
-# Initialise static config
-with initialize(config_path="conf", version_base="1.1"):
- cfg_static = compose(config_name="static_config")
-
-cfg.train.num_rounds = cfg_static.num_rounds
-
-# Create output directory given current timestamp
-current_time = datetime.now()
-folder_name = current_time.strftime("%Y-%m-%d_%H-%M-%S")
-save_path = os.path.join(os.getcwd(), f"results/{folder_name}")
-os.makedirs(save_path, exist_ok=True)
-
-# Partition dataset and get dataloaders
-partitioner = instantiate(cfg_static.partitioner)
-fds = FederatedDataset(
- dataset=cfg_static.dataset.name, partitioners={"train": partitioner}
-)
-(
- tokenizer,
- data_collator,
- formatting_prompts_func,
-) = get_tokenizer_and_data_collator_and_propt_formatting(cfg.model.name)
-
-# ClientApp for Flower Next
-client = ClientApp(
- client_fn=gen_client_fn(
- fds,
- tokenizer,
- formatting_prompts_func,
- data_collator,
- cfg.model,
- cfg.train,
- save_path,
- ),
-)
-
-# Get initial model weights
-init_model = get_model(cfg.model)
-init_model_parameters = get_parameters(init_model)
-init_model_parameters = ndarrays_to_parameters(init_model_parameters)
-
-def server_fn(context: Context):
- # Instantiate strategy according to config. Here we pass other arguments
- # that are only defined at runtime.
- strategy = instantiate(
- cfg.strategy,
- on_fit_config_fn=get_on_fit_config(),
- fit_metrics_aggregation_fn=fit_weighted_average,
- initial_parameters=init_model_parameters,
- evaluate_fn=get_evaluate_fn(
- cfg.model, cfg.train.save_every_round, cfg_static.num_rounds, save_path
- ),
- )
-
- config = ServerConfig(num_rounds=cfg_static.num_rounds)
-
- return ServerAppComponents(strategy=strategy, config=config)
-
-
-# ServerApp for Flower Next
-server = ServerApp(server_fn=server_fn)
diff --git a/src/py/flwr/cli/new/templates/app/code/flwr_tune/client.py.tpl b/src/py/flwr/cli/new/templates/app/code/flwr_tune/client_app.py.tpl
similarity index 66%
rename from src/py/flwr/cli/new/templates/app/code/flwr_tune/client.py.tpl
rename to src/py/flwr/cli/new/templates/app/code/flwr_tune/client_app.py.tpl
index 2472e23ece44..19d1e20baccd 100644
--- a/src/py/flwr/cli/new/templates/app/code/flwr_tune/client.py.tpl
+++ b/src/py/flwr/cli/new/templates/app/code/flwr_tune/client_app.py.tpl
@@ -1,20 +1,32 @@
"""$project_name: A Flower / FlowerTune app."""
+import os
+import warnings
from collections import OrderedDict
-from typing import Callable, Dict, Tuple
+from typing import Dict, Tuple
import torch
+from flwr.client import ClientApp, NumPyClient
+from flwr.common import Context
+from flwr.common.config import unflatten_dict
+from flwr.common.typing import NDArrays, Scalar
from omegaconf import DictConfig
from peft import get_peft_model_state_dict, set_peft_model_state_dict
from transformers import TrainingArguments
from trl import SFTTrainer
-from flwr.client import NumPyClient
-from flwr.common import Context
-from flwr.common.typing import NDArrays, Scalar
-from $import_name.dataset import reformat
+from $import_name.dataset import (
+ get_tokenizer_and_data_collator_and_propt_formatting,
+ load_data,
+ replace_keys,
+)
from $import_name.models import cosine_annealing, get_model
+# Avoid warnings
+os.environ["TOKENIZERS_PARALLELISM"] = "true"
+os.environ["RAY_DISABLE_DOCKER_CPU_WARNING"] = "1"
+warnings.filterwarnings("ignore", category=UserWarning)
+
# pylint: disable=too-many-arguments
# pylint: disable=too-many-instance-attributes
@@ -29,7 +41,7 @@ class FlowerClient(NumPyClient):
tokenizer,
formatting_prompts_func,
data_collator,
- save_path,
+ num_rounds,
): # pylint: disable=too-many-arguments
self.device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
self.train_cfg = train_cfg
@@ -37,13 +49,12 @@ class FlowerClient(NumPyClient):
self.tokenizer = tokenizer
self.formatting_prompts_func = formatting_prompts_func
self.data_collator = data_collator
- self.save_path = save_path
+ self.num_rounds = num_rounds
+ self.trainset = trainset
# instantiate model
self.model = get_model(model_cfg)
- self.trainset = trainset
-
def fit(
self, parameters: NDArrays, config: Dict[str, Scalar]
) -> Tuple[NDArrays, int, Dict]:
@@ -52,13 +63,13 @@ class FlowerClient(NumPyClient):
new_lr = cosine_annealing(
int(config["current_round"]),
- self.train_cfg.num_rounds,
+ self.num_rounds,
self.train_cfg.learning_rate_max,
self.train_cfg.learning_rate_min,
)
self.training_argumnets.learning_rate = new_lr
- self.training_argumnets.output_dir = self.save_path
+ self.training_argumnets.output_dir = config["save_path"]
# Construct trainer
trainer = SFTTrainer(
@@ -95,32 +106,31 @@ def get_parameters(model) -> NDArrays:
return [val.cpu().numpy() for _, val in state_dict.items()]
-def gen_client_fn(
- fds,
- tokenizer,
- formatting_prompts_func,
- data_collator,
- model_cfg: DictConfig,
- train_cfg: DictConfig,
- save_path: str,
-) -> Callable[[Context], FlowerClient]: # pylint: disable=too-many-arguments
- """Generate the client function that creates the Flower Clients."""
-
- def client_fn(context: Context) -> FlowerClient:
- """Create a Flower client representing a single organization."""
- # Let's get the partition corresponding to the i-th client
- partition_id = context.node_config["partition-id"]
- client_trainset = fds.load_partition(partition_id, "train")
- client_trainset = reformat(client_trainset, llm_task="$llm_challenge_str")
-
- return FlowerClient(
- model_cfg,
- train_cfg,
- client_trainset,
- tokenizer,
- formatting_prompts_func,
- data_collator,
- save_path,
- ).to_client()
-
- return client_fn
+def client_fn(context: Context) -> FlowerClient:
+ """Create a Flower client representing a single organization."""
+ partition_id = context.node_config["partition-id"]
+ num_partitions = context.node_config["num-partitions"]
+ num_rounds = context.run_config["num-server-rounds"]
+ cfg = DictConfig(replace_keys(unflatten_dict(context.run_config)))
+
+ # Let's get the client partition
+ client_trainset = load_data(partition_id, num_partitions, cfg.static.dataset.name)
+ (
+ tokenizer,
+ data_collator,
+ formatting_prompts_func,
+ ) = get_tokenizer_and_data_collator_and_propt_formatting(cfg.model.name)
+
+ return FlowerClient(
+ cfg.model,
+ cfg.train,
+ client_trainset,
+ tokenizer,
+ formatting_prompts_func,
+ data_collator,
+ num_rounds,
+ ).to_client()
+
+
+# Flower ClientApp
+app = ClientApp(client_fn)
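With this refactor, the checkpoint directory is no longer fixed at client construction: the server sends it every round through the fit config (see `get_on_fit_config` in the new `server_app.py.tpl` below). A minimal sketch of the dict the client's `fit()` receives, with hypothetical values:

```python
# Hypothetical fit config, as assembled by the server's fit_config_fn.
config = {"current_round": 3, "save_path": "results/2024-08-27_10-00-00"}

# FlowerClient.fit() consumes both entries: `current_round` drives the
# cosine-annealed learning rate, `save_path` becomes the HF output_dir.
print(int(config["current_round"]))  # 3
print(config["save_path"])           # results/2024-08-27_10-00-00
```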
diff --git a/src/py/flwr/cli/new/templates/app/code/flwr_tune/config.yaml.tpl b/src/py/flwr/cli/new/templates/app/code/flwr_tune/config.yaml.tpl
deleted file mode 100644
index 9f700dd5b8da..000000000000
--- a/src/py/flwr/cli/new/templates/app/code/flwr_tune/config.yaml.tpl
+++ /dev/null
@@ -1,34 +0,0 @@
-# Federated Instruction Tuning
----
-model:
- name: "mistralai/Mistral-7B-v0.3"
- quantization: 4 # 8 or 4 if you want to do quantization with BitsAndBytes
- gradient_checkpointing: True
- lora:
- peft_lora_r: 32
- peft_lora_alpha: 64
-
-train:
- num_rounds: null
- save_every_round: 5
- learning_rate_max: 5e-5
- learning_rate_min: 1e-6
- seq_length: 512
- training_arguments:
- output_dir: null # to be set by hydra
- learning_rate: null # to be set by the client
- per_device_train_batch_size: 16
- gradient_accumulation_steps: 1
- logging_steps: 10
- num_train_epochs: 3
- max_steps: 10
- report_to: null
- save_steps: 1000
- save_total_limit: 10
- gradient_checkpointing: True
- lr_scheduler_type: "constant"
-
-strategy:
- _target_: flwr.server.strategy.FedAvg
- fraction_fit: $fraction_fit
- fraction_evaluate: 0.0 # no client evaluation
diff --git a/src/py/flwr/cli/new/templates/app/code/flwr_tune/dataset.py.tpl b/src/py/flwr/cli/new/templates/app/code/flwr_tune/dataset.py.tpl
index 1b3691d7cf3c..41381ef7c7a3 100644
--- a/src/py/flwr/cli/new/templates/app/code/flwr_tune/dataset.py.tpl
+++ b/src/py/flwr/cli/new/templates/app/code/flwr_tune/dataset.py.tpl
@@ -1,8 +1,12 @@
"""$project_name: A Flower / FlowerTune app."""
+from flwr_datasets import FederatedDataset
+from flwr_datasets.partitioner import IidPartitioner
from transformers import AutoTokenizer
from trl import DataCollatorForCompletionOnlyLM
+FDS = None # Cache FederatedDataset
+
def formatting_prompts_func(example):
"""Construct prompts."""
@@ -24,7 +28,6 @@ def formatting_prompts_func(example):
def get_tokenizer_and_data_collator_and_propt_formatting(model_name: str):
"""Get tokenizer, data_collator and prompt formatting."""
- # From: https://huggingface.co/docs/trl/en/sft_trainer
tokenizer = AutoTokenizer.from_pretrained(
model_name, use_fast=True, padding_side="right"
)
@@ -49,9 +52,36 @@ def formatting(dataset):
def reformat(dataset, llm_task):
"""Reformat datasets."""
dataset = dataset.rename_column("output", "response")
- if llm_task == "finance" or llm_task == "code":
+ if llm_task in ["finance", "code"]:
dataset = dataset.map(formatting, remove_columns=["input"])
if llm_task == "medical":
dataset = dataset.remove_columns(["instruction"])
dataset = dataset.rename_column("input", "instruction")
return dataset
+
+
+def load_data(partition_id: int, num_partitions: int, dataset_name: str):
+ """Load partition data."""
+ # Only initialize `FederatedDataset` once
+ global FDS
+ if FDS is None:
+ partitioner = IidPartitioner(num_partitions=num_partitions)
+ FDS = FederatedDataset(
+ dataset=dataset_name,
+ partitioners={"train": partitioner},
+ )
+ client_trainset = FDS.load_partition(partition_id, "train")
+ client_trainset = reformat(client_trainset, llm_task="generalnlp")
+ return client_trainset
+
+
+def replace_keys(input_dict, match="-", target="_"):
+ """Recursively replace match string with target string in dictionary keys."""
+ new_dict = {}
+ for key, value in input_dict.items():
+ new_key = key.replace(match, target)
+ if isinstance(value, dict):
+ new_dict[new_key] = replace_keys(value, match, target)
+ else:
+ new_dict[new_key] = value
+ return new_dict
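Run-config keys in `pyproject.toml` use dashes (e.g. `train.learning-rate-max`), while OmegaConf attribute access requires valid Python identifiers; the `replace_keys` helper above bridges the two. A minimal sketch of the round trip, using a simplified stand-in for `flwr.common.config.unflatten_dict` and hypothetical values:

```python
from omegaconf import DictConfig


def unflatten(flat):
    """Simplified stand-in for flwr.common.config.unflatten_dict."""
    nested = {}
    for key, value in flat.items():
        *parents, leaf = key.split(".")
        node = nested
        for part in parents:
            node = node.setdefault(part, {})
        node[leaf] = value
    return nested


def replace_keys(input_dict, match="-", target="_"):
    """Same recursive helper as in the template above."""
    return {
        key.replace(match, target): (
            replace_keys(value, match, target) if isinstance(value, dict) else value
        )
        for key, value in input_dict.items()
    }


run_config = {"num-server-rounds": 200, "train.learning-rate-max": 5e-5}
cfg = DictConfig(replace_keys(unflatten(run_config)))
print(cfg.train.learning_rate_max)  # 5e-05
print(cfg.num_server_rounds)        # 200
```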
diff --git a/src/py/flwr/cli/new/templates/app/code/flwr_tune/models.py.tpl b/src/py/flwr/cli/new/templates/app/code/flwr_tune/models.py.tpl
index a2794f35518c..a548ba9abeef 100644
--- a/src/py/flwr/cli/new/templates/app/code/flwr_tune/models.py.tpl
+++ b/src/py/flwr/cli/new/templates/app/code/flwr_tune/models.py.tpl
@@ -22,9 +22,6 @@ def cosine_annealing(
def get_model(model_cfg: DictConfig):
"""Load model with appropriate quantization config and other optimizations.
-
- Please refer to this example for `peft + BitsAndBytes`:
- https://github.com/huggingface/peft/blob/main/examples/fp4_finetuning/finetune_fp4_opt_bnb_peft.py
"""
if model_cfg.quantization == 4:
quantization_config = BitsAndBytesConfig(load_in_4bit=True)
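The `cosine_annealing` helper imported by the client (its body is unchanged by this patch, hence not shown here) decays the learning rate from `learning-rate-max` to `learning-rate-min` over the federated rounds. A sketch assuming the standard cosine-annealing formula:

```python
import math


def cosine_annealing(current_round, total_round, lr_max, lr_min):
    # Standard schedule: lr_min + 0.5 * (lr_max - lr_min) * (1 + cos(pi * t / T)).
    cos_inner = math.pi * current_round / total_round
    return lr_min + 0.5 * (lr_max - lr_min) * (1 + math.cos(cos_inner))


# With the template defaults (5e-5 -> 1e-6 over 200 rounds):
for rnd in (1, 100, 200):
    print(rnd, f"{cosine_annealing(rnd, 200, 5e-5, 1e-6):.2e}")
```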
diff --git a/src/py/flwr/cli/new/templates/app/code/flwr_tune/server.py.tpl b/src/py/flwr/cli/new/templates/app/code/flwr_tune/server.py.tpl
deleted file mode 100644
index 5dd4d881f2f1..000000000000
--- a/src/py/flwr/cli/new/templates/app/code/flwr_tune/server.py.tpl
+++ /dev/null
@@ -1,48 +0,0 @@
-"""$project_name: A Flower / FlowerTune app."""
-
-from $import_name.client_app import set_parameters
-from $import_name.models import get_model
-
-
-# Get function that will be executed by the strategy's evaluate() method
-# Here we use it to save global model checkpoints
-def get_evaluate_fn(model_cfg, save_every_round, total_round, save_path):
- """Return an evaluation function for saving global model."""
-
- def evaluate(server_round: int, parameters, config):
- # Save model
- if server_round != 0 and (
- server_round == total_round or server_round % save_every_round == 0
- ):
- # Init model
- model = get_model(model_cfg)
- set_parameters(model, parameters)
-
- model.save_pretrained(f"{save_path}/peft_{server_round}")
-
- return 0.0, {}
-
- return evaluate
-
-
-def get_on_fit_config():
- """
- Return a function that will be used to construct the config
- that the client's fit() method will receive.
- """
-
- def fit_config_fn(server_round: int):
- fit_config = {"current_round": server_round}
- return fit_config
-
- return fit_config_fn
-
-
-def fit_weighted_average(metrics):
- """Aggregate (federated) evaluation metrics."""
- # Multiply accuracy of each client by number of examples used
- losses = [num_examples * m["train_loss"] for num_examples, m in metrics]
- examples = [num_examples for num_examples, _ in metrics]
-
- # Aggregate and return custom metric (weighted average)
- return {"train_loss": sum(losses) / sum(examples)}
diff --git a/src/py/flwr/cli/new/templates/app/code/flwr_tune/server_app.py.tpl b/src/py/flwr/cli/new/templates/app/code/flwr_tune/server_app.py.tpl
new file mode 100644
index 000000000000..586b929be06c
--- /dev/null
+++ b/src/py/flwr/cli/new/templates/app/code/flwr_tune/server_app.py.tpl
@@ -0,0 +1,95 @@
+"""$project_name: A Flower / FlowerTune app."""
+
+import os
+from datetime import datetime
+
+from flwr.common import Context, ndarrays_to_parameters
+from flwr.common.config import unflatten_dict
+from flwr.server import ServerApp, ServerAppComponents, ServerConfig
+from omegaconf import DictConfig
+
+from $import_name.client_app import get_parameters, set_parameters
+from $import_name.models import get_model
+from $import_name.dataset import replace_keys
+from $import_name.strategy import FlowerTuneLlm
+
+
+# Get function that will be executed by the strategy's evaluate() method
+# Here we use it to save global model checkpoints
+def get_evaluate_fn(model_cfg, save_every_round, total_round, save_path):
+ """Return an evaluation function for saving global model."""
+
+ def evaluate(server_round: int, parameters, config):
+ # Save model
+ if server_round != 0 and (
+ server_round == total_round or server_round % save_every_round == 0
+ ):
+ # Init model
+ model = get_model(model_cfg)
+ set_parameters(model, parameters)
+
+ model.save_pretrained(f"{save_path}/peft_{server_round}")
+
+ return 0.0, {}
+
+ return evaluate
+
+
+def get_on_fit_config(save_path):
+ """Return a function that will be used to construct the config that the
+ client's fit() method will receive."""
+
+ def fit_config_fn(server_round: int):
+ fit_config = {}
+ fit_config["current_round"] = server_round
+ fit_config["save_path"] = save_path
+ return fit_config
+
+ return fit_config_fn
+
+
+def fit_weighted_average(metrics):
+ """Aggregate (federated) evaluation metrics."""
+ # Multiply accuracy of each client by number of examples used
+ losses = [num_examples * m["train_loss"] for num_examples, m in metrics]
+ examples = [num_examples for num_examples, _ in metrics]
+
+ # Aggregate and return custom metric (weighted average)
+ return {"train_loss": sum(losses) / sum(examples)}
+
+
+def server_fn(context: Context):
+ """Construct components that set the ServerApp behaviour."""
+ # Create output directory given current timestamp
+ current_time = datetime.now()
+ folder_name = current_time.strftime("%Y-%m-%d_%H-%M-%S")
+ save_path = os.path.join(os.getcwd(), f"results/{folder_name}")
+ os.makedirs(save_path, exist_ok=True)
+
+ # Read from config
+ num_rounds = context.run_config["num-server-rounds"]
+ cfg = DictConfig(replace_keys(unflatten_dict(context.run_config)))
+
+ # Get initial model weights
+ init_model = get_model(cfg.model)
+ init_model_parameters = get_parameters(init_model)
+ init_model_parameters = ndarrays_to_parameters(init_model_parameters)
+
+ # Define strategy
+ strategy = FlowerTuneLlm(
+ fraction_fit=cfg.strategy.fraction_fit,
+ fraction_evaluate=cfg.strategy.fraction_evaluate,
+ on_fit_config_fn=get_on_fit_config(save_path),
+ fit_metrics_aggregation_fn=fit_weighted_average,
+ initial_parameters=init_model_parameters,
+ evaluate_fn=get_evaluate_fn(
+ cfg.model, cfg.train.save_every_round, num_rounds, save_path
+ ),
+ )
+ config = ServerConfig(num_rounds=num_rounds)
+
+ return ServerAppComponents(strategy=strategy, config=config)
+
+
+# Flower ServerApp
+app = ServerApp(server_fn=server_fn)
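A tiny worked example of `fit_weighted_average` above, with hypothetical client reports: losses are weighted by the number of training examples each client used, so larger clients pull the aggregate harder.

```python
# Two clients report (num_examples, metrics) pairs after local training.
metrics = [(100, {"train_loss": 2.0}), (300, {"train_loss": 1.0})]

losses = [num_examples * m["train_loss"] for num_examples, m in metrics]
examples = [num_examples for num_examples, _ in metrics]

# (100 * 2.0 + 300 * 1.0) / (100 + 300) = 1.25
print({"train_loss": sum(losses) / sum(examples)})
```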
diff --git a/src/py/flwr/cli/new/templates/app/code/flwr_tune/static_config.yaml.tpl b/src/py/flwr/cli/new/templates/app/code/flwr_tune/static_config.yaml.tpl
deleted file mode 100644
index a8a4039fc831..000000000000
--- a/src/py/flwr/cli/new/templates/app/code/flwr_tune/static_config.yaml.tpl
+++ /dev/null
@@ -1,11 +0,0 @@
-# Federated Instruction Tuning (static)
----
-dataset:
- name: $dataset_name
-
-# FL experimental settings
-num_clients: $num_clients # total number of clients
-num_rounds: 200
-partitioner:
- _target_: flwr_datasets.partitioner.IidPartitioner
- num_partitions: $num_clients
diff --git a/src/py/flwr/cli/new/templates/app/code/flwr_tune/strategy.py.tpl b/src/py/flwr/cli/new/templates/app/code/flwr_tune/strategy.py.tpl
new file mode 100644
index 000000000000..55b2ee3c9aa7
--- /dev/null
+++ b/src/py/flwr/cli/new/templates/app/code/flwr_tune/strategy.py.tpl
@@ -0,0 +1,64 @@
+"""$project_name: A Flower / FlowerTune app."""
+
+from logging import INFO, WARN
+from typing import List, Tuple, Union
+
+from flwr.common import FitIns, FitRes, Parameters, log, parameters_to_ndarrays
+from flwr.server.client_manager import ClientManager
+from flwr.server.client_proxy import ClientProxy
+from flwr.server.strategy import FedAvg
+
+
+class FlowerTuneLlm(FedAvg):
+ """Customised FedAvg strategy implementation."""
+
+ def configure_fit(
+ self, server_round: int, parameters: Parameters, client_manager: ClientManager
+ ):
+ """Configure the next round of training."""
+ return_clients = super().configure_fit(server_round, parameters, client_manager)
+
+ # Test communication costs
+ fit_ins_list = [fit_ins for _, fit_ins in return_clients]
+ test_communication_costs(fit_ins_list)
+
+ return return_clients
+
+ def aggregate_fit(
+ self,
+ server_round: int,
+ results: List[Tuple[ClientProxy, FitRes]],
+ failures: List[Union[Tuple[ClientProxy, FitRes], BaseException]],
+ ):
+ """Aggregate fit results using weighted average."""
+ # Test communication costs
+ fit_res_list = [fit_res for _, fit_res in results]
+ test_communication_costs(fit_res_list)
+
+ parameters_aggregated, metrics_aggregated = super().aggregate_fit(
+ server_round, results, failures
+ )
+
+ return parameters_aggregated, metrics_aggregated
+
+
+def test_communication_costs(fit_list: List[Union[FitIns, FitRes]]):
+ """Test communication costs per FL round."""
+
+ def compute_bytes(weights):
+ return sum(ele.nbytes for ele in weights)
+
+ size_bytes_list = [
+ compute_bytes(parameters_to_ndarrays(fit_ele.parameters))
+ for fit_ele in fit_list
+ ]
+ comm_cost = 2 * sum(size_bytes_list) / 1024**2
+ log(INFO, "Communication costs per round: %f MB", comm_cost)
+
+ if comm_cost > 500:
+ log(
+ WARN,
+ "The total communication costs per round exceed 500 MB. "
+ "Please consider reducing it if you plan to participate "
+ "FlowerTune LLM Leaderboard.",
+ )
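To make the figure logged by `test_communication_costs` concrete, here is a rough estimate with hypothetical LoRA adapter shapes; the helper sums `nbytes` over the deserialized arrays, and the factor 2 accounts for the payload travelling server-to-client and back.

```python
import numpy as np

# 16 hypothetical LoRA adapter matrices, shape (4096, 32), float32.
weights = [np.zeros((4096, 32), dtype=np.float32) for _ in range(16)]


def compute_bytes(weights):
    return sum(ele.nbytes for ele in weights)


# 16 * 4096 * 32 * 4 bytes = 8 MiB one way, 16 MiB both ways.
comm_cost = 2 * compute_bytes(weights) / 1024**2
print(f"~{comm_cost:.1f} MB per client per round")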
diff --git a/src/py/flwr/cli/new/templates/app/pyproject.flowertune.toml.tpl b/src/py/flwr/cli/new/templates/app/pyproject.flowertune.toml.tpl
index b564a66090d2..5046a6f89f27 100644
--- a/src/py/flwr/cli/new/templates/app/pyproject.flowertune.toml.tpl
+++ b/src/py/flwr/cli/new/templates/app/pyproject.flowertune.toml.tpl
@@ -8,15 +8,15 @@ version = "1.0.0"
description = ""
license = "Apache-2.0"
dependencies = [
- "flwr[simulation]>=1.9.0,<2.0",
- "flwr-datasets>=0.1.0,<1.0.0",
- "hydra-core==1.3.2",
+ "flwr[simulation]>=1.10.0",
+ "flwr-datasets>=0.3.0",
"trl==0.8.1",
"bitsandbytes==0.43.0",
"scipy==1.13.0",
"peft==0.6.2",
"transformers==4.39.3",
"sentencepiece==0.2.0",
+ "omegaconf==2.3.0",
]
[tool.hatch.build.targets.wheel]
@@ -26,14 +26,41 @@ packages = ["."]
publisher = "$username"
[tool.flwr.app.components]
-serverapp = "$import_name.app:server"
-clientapp = "$import_name.app:client"
+serverapp = "$import_name.server_app:app"
+clientapp = "$import_name.client_app:app"
[tool.flwr.app.config]
-num-server-rounds = 3
+model.name = "mistralai/Mistral-7B-v0.3"
+model.quantization = 4
+model.gradient-checkpointing = true
+model.lora.peft-lora-r = 32
+model.lora.peft-lora-alpha = 64
+train.save-every-round = 5
+train.learning-rate-max = 5e-5
+train.learning-rate-min = 1e-6
+train.seq-length = 512
+train.training-arguments.output-dir = ""
+train.training-arguments.learning-rate = ""
+train.training-arguments.per-device-train-batch-size = 16
+train.training-arguments.gradient-accumulation-steps = 1
+train.training-arguments.logging-steps = 10
+train.training-arguments.num-train-epochs = 3
+train.training-arguments.max-steps = 10
+train.training-arguments.save-steps = 1000
+train.training-arguments.save-total-limit = 10
+train.training-arguments.gradient-checkpointing = true
+train.training-arguments.lr-scheduler-type = "constant"
+strategy.fraction-fit = $fraction_fit
+strategy.fraction-evaluate = 0.0
+num-server-rounds = 200
+
+[tool.flwr.app.config.static]
+dataset.name = "$dataset_name"
[tool.flwr.federations]
default = "local-simulation"
[tool.flwr.federations.local-simulation]
-options.num-supernodes = 10
+options.num-supernodes = $num_clients
+options.backend.client-resources.num-cpus = 6
+options.backend.client-resources.num-gpus = 1.0
From 01db3c20e6256af44a2b2bd781719f811fec33e7 Mon Sep 17 00:00:00 2001
From: Charles Beauville
Date: Tue, 27 Aug 2024 15:42:10 +0200
Subject: [PATCH 12/42] fix(framework:skip) Use `app` directory as root for
certs (#3850)
---
src/py/flwr/cli/run/run.py | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/src/py/flwr/cli/run/run.py b/src/py/flwr/cli/run/run.py
index b2c4dc4151cd..6375e71522de 100644
--- a/src/py/flwr/cli/run/run.py
+++ b/src/py/flwr/cli/run/run.py
@@ -124,14 +124,14 @@ def run(
def _run_with_superexec(
- app: Optional[Path],
+ app: Path,
federation_config: Dict[str, Any],
config_overrides: Optional[List[str]],
) -> None:
insecure_str = federation_config.get("insecure")
if root_certificates := federation_config.get("root-certificates"):
- root_certificates_bytes = Path(root_certificates).read_bytes()
+ root_certificates_bytes = (app / root_certificates).read_bytes()
if insecure := bool(insecure_str):
typer.secho(
"❌ `root_certificates` were provided but the `insecure` parameter"
From ebf0d3fcaae1cdade98740aae7f2bfa02b5f6634 Mon Sep 17 00:00:00 2001
From: Yan Gao
Date: Tue, 27 Aug 2024 17:39:44 +0100
Subject: [PATCH 13/42] feat(framework) Update communication cost calculator
for LLM leaderboard (#4013)
Co-authored-by: jafermarq
Co-authored-by: Daniel J. Beutel
---
.../app/code/flwr_tune/strategy.py.tpl | 55 +++++++++++++------
1 file changed, 37 insertions(+), 18 deletions(-)
diff --git a/src/py/flwr/cli/new/templates/app/code/flwr_tune/strategy.py.tpl b/src/py/flwr/cli/new/templates/app/code/flwr_tune/strategy.py.tpl
index 55b2ee3c9aa7..8accd70c4e76 100644
--- a/src/py/flwr/cli/new/templates/app/code/flwr_tune/strategy.py.tpl
+++ b/src/py/flwr/cli/new/templates/app/code/flwr_tune/strategy.py.tpl
@@ -1,5 +1,6 @@
"""$project_name: A Flower / FlowerTune app."""
+from io import BytesIO
from logging import INFO, WARN
from typing import List, Tuple, Union
@@ -10,7 +11,14 @@ from flwr.server.strategy import FedAvg
class FlowerTuneLlm(FedAvg):
- """Customised FedAvg strategy implementation."""
+ """Customised FedAvg strategy implementation.
+
+ This class behaves just like FedAvg but also tracks the communication
+ costs associated with `fit` over FL rounds.
+ """
+ def __init__(self, **kwargs):
+ super().__init__(**kwargs)
+ self.comm_tracker = CommunicationTracker()
def configure_fit(
self, server_round: int, parameters: Parameters, client_manager: ClientManager
@@ -20,7 +28,7 @@ class FlowerTuneLlm(FedAvg):
# Test communication costs
fit_ins_list = [fit_ins for _, fit_ins in return_clients]
- test_communication_costs(fit_ins_list)
+ self.comm_tracker.track(fit_ins_list)
return return_clients
@@ -33,7 +41,7 @@ class FlowerTuneLlm(FedAvg):
"""Aggregate fit results using weighted average."""
# Test communication costs
fit_res_list = [fit_res for _, fit_res in results]
- test_communication_costs(fit_res_list)
+ self.comm_tracker.track(fit_res_list)
parameters_aggregated, metrics_aggregated = super().aggregate_fit(
server_round, results, failures
@@ -42,23 +50,34 @@ class FlowerTuneLlm(FedAvg):
return parameters_aggregated, metrics_aggregated
-def test_communication_costs(fit_list: List[Union[FitIns, FitRes]]):
- """Test communication costs per FL round."""
+class CommunicationTracker:
+ """Communication costs tracker over FL rounds."""
+ def __init__(self):
+ self.curr_comm_cost = 0.0
- def compute_bytes(weights):
- return sum(ele.nbytes for ele in weights)
+ @staticmethod
+ def _compute_bytes(parameters):
+ return sum([BytesIO(t).getbuffer().nbytes for t in parameters.tensors])
- size_bytes_list = [
- compute_bytes(parameters_to_ndarrays(fit_ele.parameters))
- for fit_ele in fit_list
- ]
- comm_cost = 2 * sum(size_bytes_list) / 1024**2
- log(INFO, "Communication costs per round: %f MB", comm_cost)
+ def track(self, fit_list: List[Union[FitIns, FitRes]]):
+ size_bytes_list = [
+ self._compute_bytes(fit_ele.parameters)
+ for fit_ele in fit_list
+ ]
+ comm_cost = sum(size_bytes_list) / 1024**2
- if comm_cost > 500:
+ self.curr_comm_cost += comm_cost
log(
- WARN,
- "The total communication costs per round exceed 500 MB. "
- "Please consider reducing it if you plan to participate "
- "FlowerTune LLM Leaderboard.",
+ INFO,
+ "Communication budget: used %.2f MB (+%.2f MB this round) / 200,000 MB",
+ self.curr_comm_cost,
+ comm_cost,
)
+
+ if self.curr_comm_cost > 2e5:
+ log(
+ WARN,
+ "The accumulated communication cost has exceeded 200,000 MB. "
+ "Please consider reducing it if you plan to participate "
+ "FlowerTune LLM Leaderboard.",
+ )
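The reworked `CommunicationTracker` measures the serialized `Parameters.tensors` directly (each tensor is already a `bytes` payload), drops the doubling factor, and accumulates the total across rounds against the 200,000 MB budget. A minimal sketch of the byte counting, assuming a local Flower installation:

```python
from io import BytesIO

import numpy as np
from flwr.common import ndarrays_to_parameters

# Serialize one 256x256 float32 array and measure it the way the tracker does.
params = ndarrays_to_parameters([np.ones((256, 256), dtype=np.float32)])
size_bytes = sum(BytesIO(t).getbuffer().nbytes for t in params.tensors)
print(f"{size_bytes / 1024**2:.2f} MB on the wire")  # ~0.25 MB
```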
From 5b3784a099f2f285e7ccb657c8457a2f83d0d8a8 Mon Sep 17 00:00:00 2001
From: Yan Gao
Date: Tue, 27 Aug 2024 18:05:58 +0100
Subject: [PATCH 14/42] refactor(framework) Update FlowerTune LLM example
(#4046)
Co-authored-by: jafermarq
---
README.md | 2 +-
examples/doc/source/conf.py | 1 +
examples/flowertune-llm/README.md | 118 ++++++++++++++
.../_static/train_loss_smooth.png | Bin
.../flowertune-llm/flowertune_llm/__init__.py | 1 +
.../flowertune_llm/client_app.py} | 119 ++++++++-------
.../flowertune_llm}/dataset.py | 33 ++++
.../flowertune_llm}/models.py | 5 +-
.../flowertune_llm/server_app.py | 95 ++++++++++++
examples/flowertune-llm/pyproject.toml | 66 ++++++++
.../test.py | 0
examples/llm-flowertune/README.md | 144 ------------------
examples/llm-flowertune/app.py | 85 -----------
examples/llm-flowertune/conf/config.yaml | 45 ------
examples/llm-flowertune/main.py | 92 -----------
examples/llm-flowertune/requirements.txt | 10 --
examples/llm-flowertune/utils.py | 43 ------
17 files changed, 384 insertions(+), 475 deletions(-)
create mode 100644 examples/flowertune-llm/README.md
rename examples/{llm-flowertune => flowertune-llm}/_static/train_loss_smooth.png (100%)
create mode 100644 examples/flowertune-llm/flowertune_llm/__init__.py
rename examples/{llm-flowertune/client.py => flowertune-llm/flowertune_llm/client_app.py} (55%)
rename examples/{llm-flowertune => flowertune-llm/flowertune_llm}/dataset.py (53%)
rename examples/{llm-flowertune => flowertune-llm/flowertune_llm}/models.py (96%)
create mode 100644 examples/flowertune-llm/flowertune_llm/server_app.py
create mode 100644 examples/flowertune-llm/pyproject.toml
rename examples/{llm-flowertune => flowertune-llm}/test.py (100%)
delete mode 100644 examples/llm-flowertune/README.md
delete mode 100644 examples/llm-flowertune/app.py
delete mode 100644 examples/llm-flowertune/conf/config.yaml
delete mode 100644 examples/llm-flowertune/main.py
delete mode 100644 examples/llm-flowertune/requirements.txt
delete mode 100644 examples/llm-flowertune/utils.py
diff --git a/README.md b/README.md
index 3f1d96ca53c0..c36e012d5644 100644
--- a/README.md
+++ b/README.md
@@ -143,7 +143,7 @@ Other [examples](https://github.com/adap/flower/tree/main/examples):
- [PyTorch: From Centralized to Federated](https://github.com/adap/flower/tree/main/examples/pytorch-from-centralized-to-federated)
- [Vertical FL](https://github.com/adap/flower/tree/main/examples/vertical-fl)
- [Federated Finetuning of OpenAI's Whisper](https://github.com/adap/flower/tree/main/examples/whisper-federated-finetuning)
-- [Federated Finetuning of Large Language Model](https://github.com/adap/flower/tree/main/examples/llm-flowertune)
+- [Federated Finetuning of Large Language Model](https://github.com/adap/flower/tree/main/examples/flowertune-llm)
- [Federated Finetuning of a Vision Transformer](https://github.com/adap/flower/tree/main/examples/flowertune-vit)
- [Advanced Flower with TensorFlow/Keras](https://github.com/adap/flower/tree/main/examples/advanced-tensorflow)
- [Advanced Flower with PyTorch](https://github.com/adap/flower/tree/main/examples/advanced-pytorch)
diff --git a/examples/doc/source/conf.py b/examples/doc/source/conf.py
index 2c2dd2742633..7712aa5f4f59 100644
--- a/examples/doc/source/conf.py
+++ b/examples/doc/source/conf.py
@@ -66,6 +66,7 @@
"quickstart-mxnet": "index.html",
"mxnet-from-centralized-to-federated": "index.html",
"app-secure-aggregation": "flower-secure-aggregation.html",
+ "llm-flowertune": "flowertune-llm.html",
}
diff --git a/examples/flowertune-llm/README.md b/examples/flowertune-llm/README.md
new file mode 100644
index 000000000000..51cae73ae88a
--- /dev/null
+++ b/examples/flowertune-llm/README.md
@@ -0,0 +1,118 @@
+---
+tags: [llm, nlp, LLama]
+dataset: [Alpaca-GPT4]
+framework: [PEFT, torch]
+---
+
+# FlowerTune LLM: Federated LLM Fine-tuning with Flower
+
+Large language models (LLMs), which have been trained on vast amounts of publicly accessible data, have shown remarkable effectiveness in a wide range of areas.
+However, although more data typically leads to better performance, there is a concerning prospect that the supply of high-quality public data will be depleted within a few years.
+Federated LLM training could unlock access to an endless pool of distributed private data by allowing multiple data owners to collaboratively train a shared model without the need to exchange raw data.
+
+This introductory example conducts federated instruction tuning with pretrained [OpenLLaMA](https://huggingface.co/openlm-research) models on the [Alpaca-GPT4](https://huggingface.co/datasets/vicgalle/alpaca-gpt4) dataset.
+We implement FlowerTune LLM by integrating a bundle of techniques: 1) We use [Flower Datasets](https://flower.dev/docs/datasets/) to download, partition and preprocess the dataset. 2) The fine-tuning is done using the [🤗PEFT](https://huggingface.co/docs/peft/en/index) library. 3) We use Flower's Simulation Engine to simulate the LLM fine-tuning process in a federated way,
+which allows users to perform the training on a single GPU.
+
+## Set up the project
+
+Start by cloning the example project:
+
+```shell
+git clone --depth=1 https://github.com/adap/flower.git _tmp \
+ && mv _tmp/examples/flowertune-llm . \
+ && rm -rf _tmp \
+ && cd flowertune-llm
+```
+
+This will create a new directory called `flowertune-llm` with the following structure:
+
+```shell
+flowertune-llm
+├── flowertune_llm
+│ ├── __init__.py
+│ ├── client_app.py # Defines your ClientApp
+│ ├── server_app.py # Defines your ServerApp
+│ ├── dataset.py # Defines your dataset and tokenizer
+│ └── models.py # Defines your models
+│
+├── pyproject.toml # Project metadata like dependencies and configs
+├── test.py # Test pre-trained model
+└── README.md
+```
+
+### Install dependencies and project
+
+Install the dependencies defined in `pyproject.toml` as well as the `flowertune_llm` package.
+
+```bash
+pip install -e .
+```
+
+## Run the project
+
+You can run your Flower project in both _simulation_ and _deployment_ mode without making changes to the code. If you are starting with Flower, we recommend using the _simulation_ mode, as it requires fewer components to be launched manually. By default, `flwr run` will make use of the Simulation Engine.
+
+### Run with the Simulation Engine
+
+```bash
+flwr run .
+```
+
+This command will run FL simulations with a 4-bit [OpenLLaMA 3Bv2](https://huggingface.co/openlm-research/open_llama_3b_v2) model involving 2 clients per round for 100 FL rounds. You can override configuration parameters directly from the command line. Below are a few settings you might want to test:
+
+```bash
+# Use OpenLLaMA-7B instead of 3B and 8-bits quantization
+flwr run . --run-config "model.name='openlm-research/open_llama_7b_v2' model.quantization=8"
+
+# Run for 50 rounds while increasing the fraction of clients that participate per round to 25%
+flwr run . --run-config "num-server-rounds=50 strategy.fraction-fit=0.25"
+```
+
+### Run with the Deployment Engine
+
+> \[!NOTE\]
+> An update to this example will show how to run this Flower application with the Deployment Engine and TLS certificates, or with Docker.
+
+## Expected results
+
+
+
+As expected, the OpenLLaMA-7B model works better than its 3B version, reaching a lower training loss. With the hyperparameters tested, 8-bit quantization seems to deliver a lower training loss for the smaller 3B model than its 4-bit version.
+
+## VRAM consumption
+
+| Models | 7-billion (8-bit) | 7-billion (4-bit) | 3-billion (8-bit) | 3-billion (4-bit) |
+| :----: | :---------------: | :---------------: | :---------------: | :---------------: |
+| VRAM | ~22.00 GB | ~16.50 GB | ~13.50 GB | ~10.60 GB |
+
+We make use of the [bitsandbytes](https://huggingface.co/docs/bitsandbytes/main/en/index) library in conjunction with [PEFT](https://huggingface.co/docs/peft/en/index) to derive LLMs that can be fine-tuned efficiently.
+The above table shows the VRAM consumption per client for the different models considered in this example.
+You can adjust the CPU/GPU resources you assign to each of the clients based on your device.
+For example, it is easy to train 2 concurrent clients on each GPU (24 GB VRAM) if you choose the 3-billion (4-bit) model.
+Assign 50% of the GPU's VRAM to each client by setting `options.backend.client-resources.num-gpus = 0.5` under `[tool.flwr.federations.local-simulation]` in `pyproject.toml`, as shown below.
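+
+A sketch of the corresponding `pyproject.toml` section (supernode and CPU values as shipped in this example; only the GPU share is changed):
+
+```toml
+[tool.flwr.federations.local-simulation]
+options.num-supernodes = 20
+options.backend.client-resources.num-cpus = 8
+options.backend.client-resources.num-gpus = 0.5  # two clients share one GPU
+```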
+
+## Test with your questions
+
+We provide a script to test your trained model by passing your specified questions. For example:
+
+```bash
+python test.py --peft-path=/path/to/trained-model-dir/ \
+ --question="What is the ideal 1-day plan in London?"
+```
+
+An answer generated from federated trained 7-billion (8-bit) OpenLLaMA model:
+
+```
+Great choice.
+London has so much to offer, and you can really soak up all the sights and sounds in just a single day.
+Here's a suggested itinerary for you.
+Start your day off with a hearty breakfast at an authentic British diner.
+Then head to the iconic Big Ben and the Houses of Parliament to learn about the history of the city.
+Next, make your way to Westminster Abbey to see the many historical monuments and memorials.
+From there, cross the river Thames to the Tower of London, which is home to the Crown Jewels of England and Scotland.
+Finally, end your day with a relaxing visit to the London Eye, the tallest Ferris wheel in Europe, for a beautiful view of the city.
+```
+
+The [`Vicuna`](https://huggingface.co/lmsys/vicuna-13b-v1.1) template we used in this example is for a chat assistant.
+The generated answer is expected to be part of a multi-turn conversation. Feel free to try more interesting questions!
diff --git a/examples/llm-flowertune/_static/train_loss_smooth.png b/examples/flowertune-llm/_static/train_loss_smooth.png
similarity index 100%
rename from examples/llm-flowertune/_static/train_loss_smooth.png
rename to examples/flowertune-llm/_static/train_loss_smooth.png
diff --git a/examples/flowertune-llm/flowertune_llm/__init__.py b/examples/flowertune-llm/flowertune_llm/__init__.py
new file mode 100644
index 000000000000..e786a4d4b73d
--- /dev/null
+++ b/examples/flowertune-llm/flowertune_llm/__init__.py
@@ -0,0 +1 @@
+"""flowertune_llm."""
diff --git a/examples/llm-flowertune/client.py b/examples/flowertune-llm/flowertune_llm/client_app.py
similarity index 55%
rename from examples/llm-flowertune/client.py
rename to examples/flowertune-llm/flowertune_llm/client_app.py
index c81333f664b3..992b0f1a3e09 100644
--- a/examples/llm-flowertune/client.py
+++ b/examples/flowertune-llm/flowertune_llm/client_app.py
@@ -1,21 +1,40 @@
+"""flowertune-llm: A Flower / FlowerTune app."""
+
+import os
+import warnings
+from typing import Dict, Tuple
from collections import OrderedDict
-from typing import Callable, Dict, Tuple
-import flwr as fl
import torch
+from flwr.client import ClientApp, NumPyClient
+from flwr.common import Context
+from flwr.common.config import unflatten_dict
from flwr.common.typing import NDArrays, Scalar
from omegaconf import DictConfig
+
from peft import get_peft_model_state_dict, set_peft_model_state_dict
from transformers import TrainingArguments
from trl import SFTTrainer
-from models import cosine_annealing, get_model
+from flowertune_llm.dataset import (
+ get_tokenizer_and_data_collator_and_propt_formatting,
+ load_data,
+ replace_keys,
+)
+from flowertune_llm.models import (
+ cosine_annealing,
+ get_model,
+)
+
+# Avoid warnings
+os.environ["TOKENIZERS_PARALLELISM"] = "true"
+os.environ["RAY_DISABLE_DOCKER_CPU_WARNING"] = "1"
+warnings.filterwarnings("ignore", category=UserWarning)
# pylint: disable=too-many-arguments
-class FlowerClient(
- fl.client.NumPyClient
-): # pylint: disable=too-many-instance-attributes
+# pylint: disable=too-many-instance-attributes
+class FlowerClient(NumPyClient):
"""Standard Flower client for CNN training."""
def __init__(
@@ -26,7 +45,7 @@ def __init__(
tokenizer,
formatting_prompts_func,
data_collator,
- save_path,
+ num_rounds,
): # pylint: disable=too-many-arguments
self.device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
self.train_cfg = train_cfg
@@ -34,19 +53,12 @@ def __init__(
self.tokenizer = tokenizer
self.formatting_prompts_func = formatting_prompts_func
self.data_collator = data_collator
- self.save_path = save_path
+ self.num_rounds = num_rounds
+ self.trainset = trainset
# instantiate model
self.model = get_model(model_cfg)
- self.trainset = trainset
-
- def get_parameters(self, config: Dict[str, Scalar]) -> NDArrays:
- """Return the parameters of the current net."""
-
- state_dict = get_peft_model_state_dict(self.model)
- return [val.cpu().numpy() for _, val in state_dict.items()]
-
def fit(
self, parameters: NDArrays, config: Dict[str, Scalar]
) -> Tuple[NDArrays, int, Dict]:
@@ -55,13 +67,13 @@ def fit(
new_lr = cosine_annealing(
int(config["current_round"]),
- self.train_cfg.num_rounds,
+ self.num_rounds,
self.train_cfg.learning_rate_max,
self.train_cfg.learning_rate_min,
)
self.training_argumnets.learning_rate = new_lr
- self.training_argumnets.output_dir = self.save_path
+ self.training_argumnets.output_dir = config["save_path"]
# Construct trainer
trainer = SFTTrainer(
@@ -78,7 +90,7 @@ def fit(
results = trainer.train()
return (
- self.get_parameters({}),
+ get_parameters(self.model),
len(self.trainset),
{"train_loss": results.training_loss},
)
@@ -92,38 +104,37 @@ def set_parameters(model, parameters: NDArrays) -> None:
set_peft_model_state_dict(model, state_dict)
-def gen_client_fn(
- fds,
- tokenizer,
- formatting_prompts_func,
- data_collator,
- model_cfg: DictConfig,
- train_cfg: DictConfig,
- save_path: str,
- partition_id: int = 0,
- api: bool = False,
-) -> Callable[[str], FlowerClient]: # pylint: disable=too-many-arguments
- """Generate the client function that creates the Flower Clients."""
-
- def client_fn(cid: str) -> FlowerClient:
- """Create a Flower client representing a single organization."""
-
- # Let's get the partition corresponding to the i-th client
- client_trainset = (
- fds.load_partition(partition_id, "train")
- if api
- else fds.load_partition(int(cid), "train")
- )
- client_trainset = client_trainset.rename_column("output", "response")
-
- return FlowerClient(
- model_cfg,
- train_cfg,
- client_trainset,
- tokenizer,
- formatting_prompts_func,
- data_collator,
- save_path,
- ).to_client()
-
- return client_fn
+def get_parameters(model) -> NDArrays:
+ """Return the parameters of the current net."""
+ state_dict = get_peft_model_state_dict(model)
+ return [val.cpu().numpy() for _, val in state_dict.items()]
+
+
+def client_fn(context: Context) -> FlowerClient:
+ """Create a Flower client representing a single organization."""
+ partition_id = context.node_config["partition-id"]
+ num_partitions = context.node_config["num-partitions"]
+ num_rounds = context.run_config["num-server-rounds"]
+ cfg = DictConfig(replace_keys(unflatten_dict(context.run_config)))
+
+ # Let's get the client partition
+ client_trainset = load_data(partition_id, num_partitions, cfg.dataset.name)
+ (
+ tokenizer,
+ data_collator,
+ formatting_prompts_func,
+ ) = get_tokenizer_and_data_collator_and_propt_formatting(cfg.model.name)
+
+ return FlowerClient(
+ cfg.model,
+ cfg.train,
+ client_trainset,
+ tokenizer,
+ formatting_prompts_func,
+ data_collator,
+ num_rounds,
+ ).to_client()
+
+
+# Flower ClientApp
+app = ClientApp(client_fn)
diff --git a/examples/llm-flowertune/dataset.py b/examples/flowertune-llm/flowertune_llm/dataset.py
similarity index 53%
rename from examples/llm-flowertune/dataset.py
rename to examples/flowertune-llm/flowertune_llm/dataset.py
index 571be31f7fba..87595b3f9ccd 100644
--- a/examples/llm-flowertune/dataset.py
+++ b/examples/flowertune-llm/flowertune_llm/dataset.py
@@ -1,6 +1,11 @@
from transformers import AutoTokenizer
from trl import DataCollatorForCompletionOnlyLM
+from flwr_datasets.partitioner import IidPartitioner
+from flwr_datasets import FederatedDataset
+
+FDS = None # Cache FederatedDataset
+
def formatting_prompts_func(example):
output_texts = []
@@ -27,3 +32,31 @@ def get_tokenizer_and_data_collator_and_propt_formatting(model_name: str):
)
return tokenizer, data_collator, formatting_prompts_func
+
+
+def load_data(partition_id: int, num_partitions: int, dataset_name: str):
+ """Load partition data."""
+ # Only initialize `FederatedDataset` once
+ global FDS
+ if FDS is None:
+ partitioner = IidPartitioner(num_partitions=num_partitions)
+ FDS = FederatedDataset(
+ dataset=dataset_name,
+ partitioners={"train": partitioner},
+ )
+ client_trainset = FDS.load_partition(partition_id, "train")
+ client_trainset = client_trainset.rename_column("output", "response")
+
+ return client_trainset
+
+
+def replace_keys(input_dict, match="-", target="_"):
+ """Recursively replace match string with target string in dictionary keys."""
+ new_dict = {}
+ for key, value in input_dict.items():
+ new_key = key.replace(match, target)
+ if isinstance(value, dict):
+ new_dict[new_key] = replace_keys(value, match, target)
+ else:
+ new_dict[new_key] = value
+ return new_dict
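Why the module-level `FDS` cache above: under simulation, `client_fn` may run many times inside one Python process, and rebuilding the `FederatedDataset` (and its partitioner) on every call would be wasteful. A stripped-down sketch of the pattern, with a stand-in for the real dataset object:

```python
FDS = None  # module-level cache, shared across client_fn invocations


def load_data(partition_id: int, num_partitions: int):
    global FDS
    if FDS is None:
        print("building FederatedDataset (first call only)")
        FDS = {"num_partitions": num_partitions}  # stand-in for FederatedDataset
    return f"partition {partition_id} of {FDS['num_partitions']}"


print(load_data(0, 20))  # builds the cache
print(load_data(1, 20))  # reuses it
```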
diff --git a/examples/llm-flowertune/models.py b/examples/flowertune-llm/flowertune_llm/models.py
similarity index 96%
rename from examples/llm-flowertune/models.py
rename to examples/flowertune-llm/flowertune_llm/models.py
index f32c800cf2c1..7d0c8f391687 100644
--- a/examples/llm-flowertune/models.py
+++ b/examples/flowertune-llm/flowertune_llm/models.py
@@ -2,7 +2,10 @@
import torch
from omegaconf import DictConfig
-from peft import LoraConfig, get_peft_model
+from peft import (
+ LoraConfig,
+ get_peft_model,
+)
from peft.utils import prepare_model_for_kbit_training
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
diff --git a/examples/flowertune-llm/flowertune_llm/server_app.py b/examples/flowertune-llm/flowertune_llm/server_app.py
new file mode 100644
index 000000000000..309166cc30a3
--- /dev/null
+++ b/examples/flowertune-llm/flowertune_llm/server_app.py
@@ -0,0 +1,95 @@
+"""flowertune-llm: A Flower / FlowerTune app."""
+
+import os
+from datetime import datetime
+
+from flwr.common import Context, ndarrays_to_parameters
+from flwr.common.config import unflatten_dict
+from flwr.server import ServerApp, ServerAppComponents, ServerConfig
+from flwr.server.strategy import FedAvg
+from omegaconf import DictConfig
+
+from flowertune_llm.models import get_model
+from flowertune_llm.dataset import replace_keys
+from flowertune_llm.client_app import get_parameters, set_parameters
+
+
+# Get function that will be executed by the strategy's evaluate() method
+# Here we use it to save global model checkpoints
+def get_evaluate_fn(model_cfg, save_every_round, total_round, save_path):
+ """Return an evaluation function for saving global model."""
+
+ def evaluate(server_round: int, parameters, config):
+ # Save model
+ if server_round != 0 and (
+ server_round == total_round or server_round % save_every_round == 0
+ ):
+ # Init model
+ model = get_model(model_cfg)
+ set_parameters(model, parameters)
+
+ model.save_pretrained(f"{save_path}/peft_{server_round}")
+
+ return 0.0, {}
+
+ return evaluate
+
+
+def get_on_fit_config(save_path):
+ """Return a function that will be used to construct the config that the client's
+ fit() method will receive."""
+
+ def fit_config_fn(server_round: int):
+ fit_config = {}
+ fit_config["current_round"] = server_round
+ fit_config["save_path"] = save_path
+ return fit_config
+
+ return fit_config_fn
+
+
+def fit_weighted_average(metrics):
+ """Aggregate (federated) evaluation metrics."""
+ # Multiply accuracy of each client by number of examples used
+ losses = [num_examples * m["train_loss"] for num_examples, m in metrics]
+ examples = [num_examples for num_examples, _ in metrics]
+
+ # Aggregate and return custom metric (weighted average)
+ return {"train_loss": sum(losses) / sum(examples)}
+
+
+def server_fn(context: Context):
+ """Construct components that set the ServerApp behaviour."""
+ # Create output directory given current timestamp
+ current_time = datetime.now()
+ folder_name = current_time.strftime("%Y-%m-%d_%H-%M-%S")
+ save_path = os.path.join(os.getcwd(), f"results/{folder_name}")
+ os.makedirs(save_path, exist_ok=True)
+
+ # Read from config
+ num_rounds = context.run_config["num-server-rounds"]
+ cfg = DictConfig(replace_keys(unflatten_dict(context.run_config)))
+
+ # Get initial model weights
+ init_model = get_model(cfg.model)
+ init_model_parameters = get_parameters(init_model)
+ init_model_parameters = ndarrays_to_parameters(init_model_parameters)
+
+ # Define strategy
+ strategy = FedAvg(
+ fraction_fit=cfg.strategy.fraction_fit,
+ fraction_evaluate=cfg.strategy.fraction_evaluate,
+ on_fit_config_fn=get_on_fit_config(save_path),
+ fit_metrics_aggregation_fn=fit_weighted_average,
+ initial_parameters=init_model_parameters,
+ evaluate_fn=get_evaluate_fn(
+ cfg.model, cfg.train.save_every_round, num_rounds, save_path
+ ),
+ )
+ config = ServerConfig(num_rounds=num_rounds)
+
+ return ServerAppComponents(strategy=strategy, config=config)
+
+
+# Flower ServerApp
+app = ServerApp(server_fn=server_fn)
diff --git a/examples/flowertune-llm/pyproject.toml b/examples/flowertune-llm/pyproject.toml
new file mode 100644
index 000000000000..8171d7680620
--- /dev/null
+++ b/examples/flowertune-llm/pyproject.toml
@@ -0,0 +1,66 @@
+[build-system]
+requires = ["hatchling"]
+build-backend = "hatchling.build"
+
+[project]
+name = "flowertune-llm"
+version = "1.0.0"
+description = "FlowerTune LLM: Federated LLM Fine-tuning with Flower"
+license = "Apache-2.0"
+dependencies = [
+ "flwr-nightly[simulation]==1.11.0.dev20240826",
+ "flwr-datasets>=0.3.0",
+ "trl==0.8.1",
+ "bitsandbytes==0.43.0",
+ "scipy==1.13.0",
+ "peft==0.6.2",
+ "fschat[model_worker,webui]==0.2.35",
+ "transformers==4.39.3",
+ "sentencepiece==0.2.0",
+ "omegaconf==2.3.0",
+ "hf_transfer==0.1.8",
+]
+
+[tool.hatch.build.targets.wheel]
+packages = ["."]
+
+[tool.flwr.app]
+publisher = "flwrlabs"
+
+[tool.flwr.app.components]
+serverapp = "flowertune_llm.server_app:app"
+clientapp = "flowertune_llm.client_app:app"
+
+[tool.flwr.app.config]
+dataset.name = "vicgalle/alpaca-gpt4"
+model.name = "openlm-research/open_llama_3b_v2"
+model.quantization = 4
+model.gradient-checkpointing = true
+model.lora.peft-lora-r = 32
+model.lora.peft-lora-alpha = 64
+train.save-every-round = 5
+train.learning-rate-max = 5e-5
+train.learning-rate-min = 1e-6
+train.seq-length = 512
+train.training-arguments.output-dir = ""
+train.training-arguments.learning-rate = ""
+train.training-arguments.per-device-train-batch-size = 16
+train.training-arguments.gradient-accumulation-steps = 1
+train.training-arguments.logging-steps = 10
+train.training-arguments.num-train-epochs = 3
+train.training-arguments.max-steps = 10
+train.training-arguments.save-steps = 1000
+train.training-arguments.save-total-limit = 10
+train.training-arguments.gradient-checkpointing = true
+train.training-arguments.lr-scheduler-type = "constant"
+strategy.fraction-fit = 0.1
+strategy.fraction-evaluate = 0.0
+num-server-rounds = 100
+
+[tool.flwr.federations]
+default = "local-simulation"
+
+[tool.flwr.federations.local-simulation]
+options.num-supernodes = 20
+options.backend.client-resources.num-cpus = 8
+options.backend.client-resources.num-gpus = 1.0
diff --git a/examples/llm-flowertune/test.py b/examples/flowertune-llm/test.py
similarity index 100%
rename from examples/llm-flowertune/test.py
rename to examples/flowertune-llm/test.py
diff --git a/examples/llm-flowertune/README.md b/examples/llm-flowertune/README.md
deleted file mode 100644
index 46076e0b2078..000000000000
--- a/examples/llm-flowertune/README.md
+++ /dev/null
@@ -1,144 +0,0 @@
----
-title: Federated LLM Fine-tuning with Flower
-tags: [llm, nlp, LLama2]
-dataset: [Alpaca-GPT4]
-framework: [PEFT, torch]
----
-
-# LLM FlowerTune: Federated LLM Fine-tuning with Flower
-
-Large language models (LLMs), which have been trained on vast amounts of publicly accessible data, have shown remarkable effectiveness in a wide range of areas.
-However, despite the fact that more data typically leads to improved performance, there is a concerning prospect that the supply of high-quality public data will deplete within a few years.
-Federated LLM training could unlock access to an endless pool of distributed private data by allowing multiple data owners to collaboratively train a shared model without the need to exchange raw data.
-
-This introductory example conducts federated instruction tuning with pretrained [LLama2](https://huggingface.co/openlm-research) models on [Alpaca-GPT4](https://huggingface.co/datasets/vicgalle/alpaca-gpt4) dataset.
-We implement LLM FlowerTune by integrating a bundle of techniques: 1) We use [Flower Datasets](https://flower.dev/docs/datasets/) to download, partition and preprocess the dataset. 2) The fine-tuning is done using the [🤗PEFT](https://huggingface.co/docs/peft/en/index) library. 3) We use Flower's Simulation Engine to simulate the LLM fine-tuning process in federated way,
-which allows users to perform the training on a single GPU.
-
-## Environment Setup
-
-Start by cloning the code example. We prepared a single-line command that you can copy into your shell which will checkout the example for you:
-
-```shell
-git clone --depth=1 https://github.com/adap/flower.git && mv flower/examples/llm-flowertune . && rm -rf flower && cd llm-flowertune
-```
-
-This will create a new directory called `llm-flowertune` containing the following files:
-
-```
--- README.md <- Your're reading this right now
--- main.py <- Start fed-LLM simulation
--- client.py <- Flower client constructor
--- model.py <- Model build
--- dataset.py <- Dataset and tokenizer build
--- utils.py <- Utility functions
--- test.py <- Test pre-trained model
--- app.py <- ServerApp/ClientApp for Flower-Next
--- conf/config.yaml <- Configuration file
--- requirements.txt <- Example dependencies
-```
-
-### Installing dependencies
-
-Project dependencies are defined in `requirements.txt`. Install them with:
-
-```shell
-pip install -r requirements.txt
-```
-
-## Run LLM Fine-tuning
-
-With an activated Python environment, run the example with default config values. The config is in `conf/config.yaml` and is loaded automatically.
-
-```bash
-# Run with default config
-python main.py
-```
-
-This command will run FL simulations with a 4-bit [OpenLLaMA 7Bv2](https://huggingface.co/openlm-research/open_llama_7b_v2) model involving 2 clients per rounds for 100 FL rounds. You can override configuration parameters directly from the command line. Below are a few settings you might want to test:
-
-```bash
-# Use OpenLLaMA-3B instead of 7B and 8-bits quantization
-python main.py model.name="openlm-research/open_llama_3b_v2" model.quantization=8
-
-# Run for 50 rounds but increasing the fraction of clients that participate per round to 25%
-python main.py num_rounds=50 fraction_fit.fraction_fit=0.25
-```
-
-## Expected Results
-
-
-
-As expected, LLama2-7B model works better than its 3B version with lower training loss. With the hyperparameters tested, the 8-bit model seems to deliver lower training loss for the smaller 3B model compared to its 4-bit version.
-
-You can run all 8 experiments with a single command as:
-
-```bash
-python main.py --multirun model.name="openlm-research/open_llama_7b_v2","openlm-research/open_llama_3b_v2" model.quantization=8,4 strategy.fraction_fit=0.1,0.2
-```
-
-## VRAM Consumption
-
-| Models | 7-billion (8-bit) | 7-billion (4-bit) | 3-billion (8-bit) | 3-billion (4-bit) |
-| :----: | :---------------: | :---------------: | :---------------: | :---------------: |
-| VRAM | ~22.00 GB | ~16.50 GB | ~13.50 GB | ~10.60 GB |
-
-We make use of the [bitsandbytes](https://huggingface.co/docs/bitsandbytes/main/en/index) library in conjunction with [PEFT](https://huggingface.co/docs/peft/en/index) to derive LLMs that can be fine-tuned efficiently.
-The above table shows the VRAM consumption per client for the different models considered in this example.
-You can adjust the CPU/GPU resources you assign to each of the clients based on your device.
-For example, it is easy to train 2 concurrent clients on each GPU (24 GB VRAM) if you choose 3-billion (4-bit) model.
-
-```bash
-# This will assign 50% of the GPU's VRAM to each client.
-python main.py model.name="openlm-research/open_llama_3b_v2" model.quantization=4 client_resources.num_gpus=0.5
-```
-
-## Test with your Questions
-
-We provide a script to test your trained model by passing your specified questions. For example:
-
-```bash
-python test.py --peft-path=/path/to/trained-model-dir/ \
- --question="What is the ideal 1-day plan in London?"
-```
-
-An answer generated from federated trained 7-billion (8-bit) LLama2 model:
-
-```
-Great choice.
-London has so much to offer, and you can really soak up all the sights and sounds in just a single day.
-Here's a suggested itinerary for you.
-Start your day off with a hearty breakfast at an authentic British diner.
-Then head to the iconic Big Ben and the Houses of Parliament to learn about the history of the city.
-Next, make your way to Westminster Abbey to see the many historical monuments and memorials.
-From there, cross the river Thames to the Tower of London, which is home to the Crown Jewels of England and Scotland.
-Finally, end your day with a relaxing visit to the London Eye, the tallest Ferris wheel in Europe, for a beautiful view of the city.
-```
-
-The [`Vicuna`](https://huggingface.co/lmsys/vicuna-13b-v1.1) template we used in this example is for a chat assistant.
-The generated answer is expected to be a multi-turn conversations. Feel free to try more interesting questions!
-
-## Run with Flower Next (preview)
-
-We conduct a 2-client setting to demonstrate how to run federated LLM fine-tuning with Flower Next.
-Please follow the steps below:
-
-1. Start the long-running Flower server (SuperLink)
- ```bash
- flower-superlink --insecure
- ```
-2. Start the long-running Flower client (SuperNode)
- ```bash
- # In a new terminal window, start the first long-running Flower client:
- flower-client-app app:client1 --insecure
- ```
- ```bash
- # In another new terminal window, start the second long-running Flower client:
- flower-client-app app:client2 --insecure
- ```
-3. Run the Flower App
- ```bash
- # With both the long-running server (SuperLink) and two clients (SuperNode) up and running,
- # we can now run the actual Flower App:
- flower-server-app app:server --insecure
- ```
diff --git a/examples/llm-flowertune/app.py b/examples/llm-flowertune/app.py
deleted file mode 100644
index db6595c94d31..000000000000
--- a/examples/llm-flowertune/app.py
+++ /dev/null
@@ -1,85 +0,0 @@
-import os
-import warnings
-
-import flwr as fl
-from flwr_datasets import FederatedDataset
-from hydra import compose, initialize
-
-from client import gen_client_fn
-from dataset import get_tokenizer_and_data_collator_and_propt_formatting
-from utils import fit_weighted_average, get_on_fit_config
-
-warnings.filterwarnings("ignore", category=UserWarning)
-
-NUM_ROUNDS = 100
-save_path = "./results/"
-
-with initialize(config_path="conf"):
- cfg = compose(config_name="config")
-
-# Reset the number of number
-cfg.num_rounds = NUM_ROUNDS
-cfg.train.num_rounds = NUM_ROUNDS
-
-# Create output directory
-if not os.path.exists(save_path):
- os.mkdir(save_path)
-
-# Partition dataset and get dataloaders
-# We set the number of partitions to 20 for fast processing.
-fds = FederatedDataset(
- dataset=cfg.dataset.name, partitioners={"train": cfg.num_clients}
-)
-(
- tokenizer,
- data_collator,
- formatting_prompts_func,
-) = get_tokenizer_and_data_collator_and_propt_formatting(cfg.model.name)
-
-
-# ClientApp for client #1 (Flower Next)
-client1 = fl.client.ClientApp(
- client_fn=gen_client_fn(
- fds,
- tokenizer,
- formatting_prompts_func,
- data_collator,
- cfg.model,
- cfg.train,
- save_path,
- partition_id=0,
- api=True,
- ),
-)
-
-
-# ClientApp for client #2 (Flower Next)
-client2 = fl.client.ClientApp(
- client_fn=gen_client_fn(
- fds,
- tokenizer,
- formatting_prompts_func,
- data_collator,
- cfg.model,
- cfg.train,
- save_path,
- partition_id=1,
- api=True,
- ),
-)
-
-
-# Instantiate strategy.
-strategy = fl.server.strategy.FedAvg(
- min_available_clients=2, # Simulate a 2-client setting
- fraction_fit=1.0,
- fraction_evaluate=0.0, # no client evaluation
- on_fit_config_fn=get_on_fit_config(),
- fit_metrics_aggregation_fn=fit_weighted_average,
-)
-
-# ServerApp for Flower-Next
-server = fl.server.ServerApp(
- config=fl.server.ServerConfig(num_rounds=NUM_ROUNDS),
- strategy=strategy,
-)
diff --git a/examples/llm-flowertune/conf/config.yaml b/examples/llm-flowertune/conf/config.yaml
deleted file mode 100644
index 0b769d351479..000000000000
--- a/examples/llm-flowertune/conf/config.yaml
+++ /dev/null
@@ -1,45 +0,0 @@
-# Federated Instruction Tuning on General Dataset
----
-
-num_clients: 20 # total number of clients
-num_rounds: 100
-
-dataset:
- name: "vicgalle/alpaca-gpt4"
-
-model:
- name: "openlm-research/open_llama_7b_v2"
- quantization: 4 # 8 or 4 if you want to do quantization with BitsAndBytes
- gradient_checkpointing: True
- lora:
- peft_lora_r: 32
- peft_lora_alpha: 64
-
-train:
- num_rounds: ${num_rounds}
- save_every_round: 5
- learning_rate_max: 5e-5
- learning_rate_min: 1e-6
- seq_length: 512
- training_arguments:
- output_dir: null # to be set by hydra
- learning_rate: null # to be set by the client
- per_device_train_batch_size: 16
- gradient_accumulation_steps: 1
- logging_steps: 10
- num_train_epochs: 3
- max_steps: 10
- report_to: null
- save_steps: 1000
- save_total_limit: 10
- gradient_checkpointing: ${model.gradient_checkpointing}
- lr_scheduler_type: "constant"
-
-strategy:
- _target_: flwr.server.strategy.FedAvg
- fraction_fit: 0.1 # sample 10% of clients (i.e. 2 per round)
- fraction_evaluate: 0.0 # no client evaluation
-
-client_resources:
- num_cpus: 8
- num_gpus: 1.0
diff --git a/examples/llm-flowertune/main.py b/examples/llm-flowertune/main.py
deleted file mode 100644
index ec8308601efb..000000000000
--- a/examples/llm-flowertune/main.py
+++ /dev/null
@@ -1,92 +0,0 @@
-import pickle
-import warnings
-
-import flwr as fl
-import hydra
-from flwr_datasets import FederatedDataset
-from hydra.core.hydra_config import HydraConfig
-from hydra.utils import instantiate
-from omegaconf import DictConfig, OmegaConf
-
-from client import gen_client_fn
-from dataset import get_tokenizer_and_data_collator_and_propt_formatting
-from utils import fit_weighted_average, get_evaluate_fn, get_on_fit_config
-
-warnings.filterwarnings("ignore", category=UserWarning)
-
-
-@hydra.main(config_path="conf", config_name="config", version_base=None)
-def main(cfg: DictConfig) -> None:
- """Run federated LLM fine-tuning.
-
- Parameters
- ----------
- cfg : DictConfig
- An omegaconf object that stores the hydra config.
- """
- # Print config structured as YAML
- print(OmegaConf.to_yaml(cfg))
-
- # Partition dataset and get dataloaders
- fds = FederatedDataset(
- dataset=cfg.dataset.name, partitioners={"train": cfg.num_clients}
- )
- (
- tokenizer,
- data_collator,
- formatting_prompts_func,
- ) = get_tokenizer_and_data_collator_and_propt_formatting(
- cfg.model.name,
- )
-
- # Hydra automatically creates an output directory
- # Let's retrieve it and save some results there
- save_path = HydraConfig.get().runtime.output_dir
-
- # Prepare function that will be used to spawn each client
- client_fn = gen_client_fn(
- fds,
- tokenizer,
- formatting_prompts_func,
- data_collator,
- cfg.model,
- cfg.train,
- save_path,
- )
-
- # Instantiate strategy according to config. Here we pass other arguments
- # that are only defined at run time.
- strategy = instantiate(
- cfg.strategy,
- on_fit_config_fn=get_on_fit_config(),
- fit_metrics_aggregation_fn=fit_weighted_average,
- evaluate_fn=get_evaluate_fn(
- cfg.model, cfg.train.save_every_round, cfg.num_rounds, save_path
- ),
- )
-
- # Start simulation
- history = fl.simulation.start_simulation(
- client_fn=client_fn,
- num_clients=cfg.num_clients,
- config=fl.server.ServerConfig(num_rounds=cfg.num_rounds),
- client_resources={
- "num_cpus": cfg.client_resources.num_cpus,
- "num_gpus": cfg.client_resources.num_gpus,
- },
- strategy=strategy,
- )
-
- # Experiment completed. Now we save the results and
- # generate plots using the `history`
- print("................")
- print(history)
-
- # Save results as a Python pickle using a file_path
- # the directory created by Hydra for each run
- with open(f"{save_path}/results.pkl", "wb") as f:
- pickle.dump(history, f)
-
-
-if __name__ == "__main__":
- main()
diff --git a/examples/llm-flowertune/requirements.txt b/examples/llm-flowertune/requirements.txt
deleted file mode 100644
index 2d0e65da3615..000000000000
--- a/examples/llm-flowertune/requirements.txt
+++ /dev/null
@@ -1,10 +0,0 @@
-flwr[rest,simulation]>=1.8.0, <2.0
-flwr-datasets>=0.0.2
-hydra-core==1.3.2
-trl==0.7.2
-bitsandbytes==0.41.3
-scipy==1.11.2
-peft==0.4.0
-fschat[model_worker,webui]==0.2.35
-transformers==4.38.1
-hf_transfer==0.1.8
diff --git a/examples/llm-flowertune/utils.py b/examples/llm-flowertune/utils.py
deleted file mode 100644
index bbb607810537..000000000000
--- a/examples/llm-flowertune/utils.py
+++ /dev/null
@@ -1,43 +0,0 @@
-from client import set_parameters
-from models import get_model
-
-
-# Get function that will be executed by the strategy's evaluate() method
-# Here we use it to save global model checkpoints
-def get_evaluate_fn(model_cfg, save_every_round, total_round, save_path):
- """Return an evaluation function for saving global model."""
-
- def evaluate(server_round: int, parameters, config):
- # Save model
- if server_round != 0 and (
- server_round == total_round or server_round % save_every_round == 0
- ):
- # Init model
- model = get_model(model_cfg)
- set_parameters(model, parameters)
-
- model.save_pretrained(f"{save_path}/peft_{server_round}")
-
- return 0.0, {}
-
- return evaluate
-
-
-# Get a function that will be used to construct the config that the client's
-# fit() method will receive
-def get_on_fit_config():
- def fit_config_fn(server_round: int):
- fit_config = {"current_round": server_round}
- return fit_config
-
- return fit_config_fn
-
-
-def fit_weighted_average(metrics):
- """Aggregation function for (federated) evaluation metrics."""
- # Multiply accuracy of each client by number of examples used
- losses = [num_examples * m["train_loss"] for num_examples, m in metrics]
- examples = [num_examples for num_examples, _ in metrics]
-
- # Aggregate and return custom metric (weighted average)
- return {"train_loss": sum(losses) / sum(examples)}
From 46374c8054c68eae6ba807820ffa9f055c4edb16 Mon Sep 17 00:00:00 2001
From: Yan Gao
Date: Tue, 27 Aug 2024 18:49:17 +0100
Subject: [PATCH 15/42] fix(examples:skip) Fix redirect issue for ViT example
(#4089)
---
doc/source/conf.py | 1 -
examples/doc/source/conf.py | 1 +
2 files changed, 1 insertion(+), 1 deletion(-)
diff --git a/doc/source/conf.py b/doc/source/conf.py
index 5d434dd729bb..d3881325a5ce 100644
--- a/doc/source/conf.py
+++ b/doc/source/conf.py
@@ -195,7 +195,6 @@ def find_test_modules(package_path):
"apiref-binaries": "ref-api-cli.html",
"fedbn-example-pytorch-from-centralized-to-federated": "example-fedbn-pytorch-from-centralized-to-federated.html",
"how-to-use-built-in-middleware-layers": "how-to-use-built-in-mods.html",
- "vit-finetune": "flowertune-vit.html",
# Restructuring: tutorials
"tutorial/Flower-0-What-is-FL": "tutorial-series-what-is-federated-learning.html",
"tutorial/Flower-1-Intro-to-FL-PyTorch": "tutorial-series-get-started-with-flower-pytorch.html",
diff --git a/examples/doc/source/conf.py b/examples/doc/source/conf.py
index 7712aa5f4f59..4e4b7b210051 100644
--- a/examples/doc/source/conf.py
+++ b/examples/doc/source/conf.py
@@ -67,6 +67,7 @@
"mxnet-from-centralized-to-federated": "index.html",
"app-secure-aggregation": "flower-secure-aggregation.html",
"llm-flowertune": "flowertune-llm.html",
+ "vit-finetune": "flowertune-vit.html",
}
From d02b81a14c5f211fbc3d7cae563f527a36b42974 Mon Sep 17 00:00:00 2001
From: Javier
Date: Wed, 28 Aug 2024 13:00:32 +0100
Subject: [PATCH 16/42] feat(framework) Add template to create a Flower
Baseline (#3979)
Co-authored-by: Taner Topal
Co-authored-by: Daniel J. Beutel
---
src/py/flwr/cli/new/new.py | 19 ++
src/py/flwr/cli/new/templates/app/LICENSE.tpl | 202 ++++++++++++++++++
.../new/templates/app/README.baseline.md.tpl | 127 +++++++++++
.../app/code/__init__.baseline.py.tpl | 1 +
.../templates/app/code/client.baseline.py.tpl | 58 +++++
.../app/code/dataset.baseline.py.tpl | 36 ++++
.../templates/app/code/model.baseline.py.tpl | 80 +++++++
.../templates/app/code/server.baseline.py.tpl | 46 ++++
.../app/code/strategy.baseline.py.tpl | 1 +
.../templates/app/code/utils.baseline.py.tpl | 1 +
.../templates/app/pyproject.baseline.toml.tpl | 138 ++++++++++++
11 files changed, 709 insertions(+)
create mode 100644 src/py/flwr/cli/new/templates/app/LICENSE.tpl
create mode 100644 src/py/flwr/cli/new/templates/app/README.baseline.md.tpl
create mode 100644 src/py/flwr/cli/new/templates/app/code/__init__.baseline.py.tpl
create mode 100644 src/py/flwr/cli/new/templates/app/code/client.baseline.py.tpl
create mode 100644 src/py/flwr/cli/new/templates/app/code/dataset.baseline.py.tpl
create mode 100644 src/py/flwr/cli/new/templates/app/code/model.baseline.py.tpl
create mode 100644 src/py/flwr/cli/new/templates/app/code/server.baseline.py.tpl
create mode 100644 src/py/flwr/cli/new/templates/app/code/strategy.baseline.py.tpl
create mode 100644 src/py/flwr/cli/new/templates/app/code/utils.baseline.py.tpl
create mode 100644 src/py/flwr/cli/new/templates/app/pyproject.baseline.toml.tpl
diff --git a/src/py/flwr/cli/new/new.py b/src/py/flwr/cli/new/new.py
index 31da7b4ab9fb..9f2d32ddf99c 100644
--- a/src/py/flwr/cli/new/new.py
+++ b/src/py/flwr/cli/new/new.py
@@ -42,6 +42,7 @@ class MlFramework(str, Enum):
MLX = "MLX"
NUMPY = "NumPy"
FLOWERTUNE = "FlowerTune"
+ BASELINE = "Flower Baseline"
class LlmChallengeName(str, Enum):
@@ -164,6 +165,8 @@ def new(
llm_challenge_str = selected_value[0]
llm_challenge_str = llm_challenge_str.lower()
+ is_baseline_project = framework_str == "baseline"
+
print(
typer.style(
f"\n🔨 Creating Flower App {app_name}...",
@@ -193,6 +196,7 @@ def new(
f"{import_name}/client_app.py": {
"template": "app/code/flwr_tune/client_app.py.tpl"
},
+ f"{import_name}/app.py": {"template": "app/code/flwr_tune/app.py.tpl"},
f"{import_name}/models.py": {
"template": "app/code/flwr_tune/models.py.tpl"
},
@@ -255,6 +259,21 @@ def new(
"template": f"app/code/task.{framework_str}.py.tpl"
}
+ if is_baseline_project:
+ # Include additional files for baseline template
+ for file_name in ["model", "dataset", "strategy", "utils", "__init__"]:
+ files[f"{import_name}/{file_name}.py"] = {
+ "template": f"app/code/{file_name}.{framework_str}.py.tpl"
+ }
+
+ # Replace README.md
+ files["README.md"]["template"] = f"app/README.{framework_str}.md.tpl"
+
+ # Add LICENSE
+ files["LICENSE"] = {"template": "app/LICENSE.tpl"}
+
+ context["framework_str"] = "baseline"
+
for file_path, value in files.items():
render_and_create(
 file_path=project_dir / file_path,
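For orientation, here is a hedged sketch of how the template wiring above is exercised end to end (the project name `mybaseline` is made up):

```bash
# Scaffold a new baseline project using the template added in this patch
flwr new mybaseline
# When prompted for the framework, select "Flower Baseline". The generated
# project should include client_app.py, server_app.py, model.py, dataset.py,
# strategy.py, utils.py, a README.md, and an Apache-2.0 LICENSE.
```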
diff --git a/src/py/flwr/cli/new/templates/app/LICENSE.tpl b/src/py/flwr/cli/new/templates/app/LICENSE.tpl
new file mode 100644
index 000000000000..7a4a3ea2424c
--- /dev/null
+++ b/src/py/flwr/cli/new/templates/app/LICENSE.tpl
@@ -0,0 +1,202 @@
+
+ Apache License
+ Version 2.0, January 2004
+ http://www.apache.org/licenses/
+
+ TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
+
+ 1. Definitions.
+
+ "License" shall mean the terms and conditions for use, reproduction,
+ and distribution as defined by Sections 1 through 9 of this document.
+
+ "Licensor" shall mean the copyright owner or entity authorized by
+ the copyright owner that is granting the License.
+
+ "Legal Entity" shall mean the union of the acting entity and all
+ other entities that control, are controlled by, or are under common
+ control with that entity. For the purposes of this definition,
+ "control" means (i) the power, direct or indirect, to cause the
+ direction or management of such entity, whether by contract or
+ otherwise, or (ii) ownership of fifty percent (50%) or more of the
+ outstanding shares, or (iii) beneficial ownership of such entity.
+
+ "You" (or "Your") shall mean an individual or Legal Entity
+ exercising permissions granted by this License.
+
+ "Source" form shall mean the preferred form for making modifications,
+ including but not limited to software source code, documentation
+ source, and configuration files.
+
+ "Object" form shall mean any form resulting from mechanical
+ transformation or translation of a Source form, including but
+ not limited to compiled object code, generated documentation,
+ and conversions to other media types.
+
+ "Work" shall mean the work of authorship, whether in Source or
+ Object form, made available under the License, as indicated by a
+ copyright notice that is included in or attached to the work
+ (an example is provided in the Appendix below).
+
+ "Derivative Works" shall mean any work, whether in Source or Object
+ form, that is based on (or derived from) the Work and for which the
+ editorial revisions, annotations, elaborations, or other modifications
+ represent, as a whole, an original work of authorship. For the purposes
+ of this License, Derivative Works shall not include works that remain
+ separable from, or merely link (or bind by name) to the interfaces of,
+ the Work and Derivative Works thereof.
+
+ "Contribution" shall mean any work of authorship, including
+ the original version of the Work and any modifications or additions
+ to that Work or Derivative Works thereof, that is intentionally
+ submitted to Licensor for inclusion in the Work by the copyright owner
+ or by an individual or Legal Entity authorized to submit on behalf of
+ the copyright owner. For the purposes of this definition, "submitted"
+ means any form of electronic, verbal, or written communication sent
+ to the Licensor or its representatives, including but not limited to
+ communication on electronic mailing lists, source code control systems,
+ and issue tracking systems that are managed by, or on behalf of, the
+ Licensor for the purpose of discussing and improving the Work, but
+ excluding communication that is conspicuously marked or otherwise
+ designated in writing by the copyright owner as "Not a Contribution."
+
+ "Contributor" shall mean Licensor and any individual or Legal Entity
+ on behalf of whom a Contribution has been received by Licensor and
+ subsequently incorporated within the Work.
+
+ 2. Grant of Copyright License. Subject to the terms and conditions of
+ this License, each Contributor hereby grants to You a perpetual,
+ worldwide, non-exclusive, no-charge, royalty-free, irrevocable
+ copyright license to reproduce, prepare Derivative Works of,
+ publicly display, publicly perform, sublicense, and distribute the
+ Work and such Derivative Works in Source or Object form.
+
+ 3. Grant of Patent License. Subject to the terms and conditions of
+ this License, each Contributor hereby grants to You a perpetual,
+ worldwide, non-exclusive, no-charge, royalty-free, irrevocable
+ (except as stated in this section) patent license to make, have made,
+ use, offer to sell, sell, import, and otherwise transfer the Work,
+ where such license applies only to those patent claims licensable
+ by such Contributor that are necessarily infringed by their
+ Contribution(s) alone or by combination of their Contribution(s)
+ with the Work to which such Contribution(s) was submitted. If You
+ institute patent litigation against any entity (including a
+ cross-claim or counterclaim in a lawsuit) alleging that the Work
+ or a Contribution incorporated within the Work constitutes direct
+ or contributory patent infringement, then any patent licenses
+ granted to You under this License for that Work shall terminate
+ as of the date such litigation is filed.
+
+ 4. Redistribution. You may reproduce and distribute copies of the
+ Work or Derivative Works thereof in any medium, with or without
+ modifications, and in Source or Object form, provided that You
+ meet the following conditions:
+
+ (a) You must give any other recipients of the Work or
+ Derivative Works a copy of this License; and
+
+ (b) You must cause any modified files to carry prominent notices
+ stating that You changed the files; and
+
+ (c) You must retain, in the Source form of any Derivative Works
+ that You distribute, all copyright, patent, trademark, and
+ attribution notices from the Source form of the Work,
+ excluding those notices that do not pertain to any part of
+ the Derivative Works; and
+
+ (d) If the Work includes a "NOTICE" text file as part of its
+ distribution, then any Derivative Works that You distribute must
+ include a readable copy of the attribution notices contained
+ within such NOTICE file, excluding those notices that do not
+ pertain to any part of the Derivative Works, in at least one
+ of the following places: within a NOTICE text file distributed
+ as part of the Derivative Works; within the Source form or
+ documentation, if provided along with the Derivative Works; or,
+ within a display generated by the Derivative Works, if and
+ wherever such third-party notices normally appear. The contents
+ of the NOTICE file are for informational purposes only and
+ do not modify the License. You may add Your own attribution
+ notices within Derivative Works that You distribute, alongside
+ or as an addendum to the NOTICE text from the Work, provided
+ that such additional attribution notices cannot be construed
+ as modifying the License.
+
+ You may add Your own copyright statement to Your modifications and
+ may provide additional or different license terms and conditions
+ for use, reproduction, or distribution of Your modifications, or
+ for any such Derivative Works as a whole, provided Your use,
+ reproduction, and distribution of the Work otherwise complies with
+ the conditions stated in this License.
+
+ 5. Submission of Contributions. Unless You explicitly state otherwise,
+ any Contribution intentionally submitted for inclusion in the Work
+ by You to the Licensor shall be under the terms and conditions of
+ this License, without any additional terms or conditions.
+ Notwithstanding the above, nothing herein shall supersede or modify
+ the terms of any separate license agreement you may have executed
+ with Licensor regarding such Contributions.
+
+ 6. Trademarks. This License does not grant permission to use the trade
+ names, trademarks, service marks, or product names of the Licensor,
+ except as required for reasonable and customary use in describing the
+ origin of the Work and reproducing the content of the NOTICE file.
+
+ 7. Disclaimer of Warranty. Unless required by applicable law or
+ agreed to in writing, Licensor provides the Work (and each
+ Contributor provides its Contributions) on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
+ implied, including, without limitation, any warranties or conditions
+ of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
+ PARTICULAR PURPOSE. You are solely responsible for determining the
+ appropriateness of using or redistributing the Work and assume any
+ risks associated with Your exercise of permissions under this License.
+
+ 8. Limitation of Liability. In no event and under no legal theory,
+ whether in tort (including negligence), contract, or otherwise,
+ unless required by applicable law (such as deliberate and grossly
+ negligent acts) or agreed to in writing, shall any Contributor be
+ liable to You for damages, including any direct, indirect, special,
+ incidental, or consequential damages of any character arising as a
+ result of this License or out of the use or inability to use the
+ Work (including but not limited to damages for loss of goodwill,
+ work stoppage, computer failure or malfunction, or any and all
+ other commercial damages or losses), even if such Contributor
+ has been advised of the possibility of such damages.
+
+ 9. Accepting Warranty or Additional Liability. While redistributing
+ the Work or Derivative Works thereof, You may choose to offer,
+ and charge a fee for, acceptance of support, warranty, indemnity,
+ or other liability obligations and/or rights consistent with this
+ License. However, in accepting such obligations, You may act only
+ on Your own behalf and on Your sole responsibility, not on behalf
+ of any other Contributor, and only if You agree to indemnify,
+ defend, and hold each Contributor harmless for any liability
+ incurred by, or claims asserted against, such Contributor by reason
+ of your accepting any such warranty or additional liability.
+
+ END OF TERMS AND CONDITIONS
+
+ APPENDIX: How to apply the Apache License to your work.
+
+ To apply the Apache License to your work, attach the following
+ boilerplate notice, with the fields enclosed by brackets "[]"
+ replaced with your own identifying information. (Don't include
+ the brackets!) The text should be enclosed in the appropriate
+ comment syntax for the file format. We also recommend that a
+ file or class name and description of purpose be included on the
+ same "printed page" as the copyright notice for easier
+ identification within third-party archives.
+
+ Copyright [yyyy] [name of copyright owner]
+
+ Licensed under the Apache License, Version 2.0 (the "License");
+ you may not use this file except in compliance with the License.
+ You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
\ No newline at end of file
diff --git a/src/py/flwr/cli/new/templates/app/README.baseline.md.tpl b/src/py/flwr/cli/new/templates/app/README.baseline.md.tpl
new file mode 100644
index 000000000000..9bbbe8f22794
--- /dev/null
+++ b/src/py/flwr/cli/new/templates/app/README.baseline.md.tpl
@@ -0,0 +1,127 @@
+---
+title: title of the paper # TODO
+url: https://arxiv.org/abs/2007.14390 # TODO: update with the link to your paper
+labels: [label1, label2] # TODO: please add between 4 and 10 single-word (maybe two-word) labels (e.g. system heterogeneity, image classification, asynchronous, weight sharing, cross-silo). Do not use "". Remove this comment once you are done.
+dataset: [dataset1, dataset2] # TODO: list of datasets you include in your baseline. Do not use "". Remove this comment once you are done.
+---
+
+> [!IMPORTANT]
+> This is the template for your `README.md`. Please fill in the information in all areas marked with a :warning: symbol.
+> Please refer to the [Flower Baselines contribution](https://flower.ai/docs/baselines/how-to-contribute-baselines.html) and [Flower Baselines usage](https://flower.ai/docs/baselines/how-to-use-baselines.html) guides for more details.
+> Please complete the metadata section at the very top of this README. This generates a table at the top of the file that will facilitate indexing baselines.
+> Please remove this [!IMPORTANT] block once you are done with your `README.md` as well as all the `:warning:` symbols and the comments next to them.
+
+> [!IMPORTANT]
+> To help keep all baselines similarly formatted and structured, we have included two scripts in `baselines/dev` that format your code and run tests checking that it is formatted correctly.
+> These checks use standard packages such as `isort`, `black`, `pylint`, and others. As a baseline creator, you will need to install additional packages. These are already specified in the `pyproject.toml` of
+> your baseline. Follow these steps:
+
+```bash
+# Create a python env
+pyenv virtualenv 3.10.14 $project_name
+
+# Activate it
+pyenv activate $project_name
+
+# Install project including developer packages
+# Note the `-e`: this means you install it in editable mode,
+# so even if you change the code you don't need to do `pip install`
+# again. However, if you add a new dependency to `pyproject.toml` you
+# will need to re-run the command below
+pip install -e ".[dev]"
+
+# Even without modifying or adding new code, you can run your baseline
+# with the placeholder code generated when you did `flwr new`. If you
+# want to test this to familiarise yourself with how Flower apps are
+# executed, run this from the directory where your `pyproject.toml` is:
+flwr run .
+
+# At any point during the process of creating your baseline you can
+# run the formatting script. To do so:
+cd .. # so you are in the `flower/baselines` directory
+
+# Run the formatting script (it will auto-correct issues if possible)
+./dev/format-baseline.sh $project_name
+
+# Then, if the above is all good, run the tests.
+./dev/test-baseline.sh $project_name
+```
+
+> [!IMPORTANT]
+> When you open a PR to get the baseline merged into the main Flower repository, the `./dev/test-baseline.sh` script will run. The baseline can be merged only if these tests pass.
+> Some issues highlighted by the test script are easier to fix than others. Do not hesitate to reach out to us for help (e.g. as a comment in your PR) if you are stuck with these.
+> Before opening your PR, please remove the code snippet above as well as all the [!IMPORTANT] message blocks. Yes, including this one.
+
+# :warning: *_Title of your baseline_* # Also copy this title to the `description` in the `[project]` section of your `pyproject.toml`.
+
+> [!NOTE]
+> If you use this baseline in your work, please remember to cite the original authors of the paper as well as the Flower paper.
+
+**Paper:** :warning: *_add the URL of the paper page (not to the .pdf). For instance, if you link a paper on arXiv, add here the URL to the abstract page (e.g. [paper](https://arxiv.org/abs/1512.03385)). If your paper is from a journal or conference proceedings, please follow the same logic._*
+
+**Authors:** :warning: *_list authors of the paper_*
+
+**Abstract:** :warning: *_add here the abstract of the paper you are implementing_*
+
+
+## About this baseline
+
+**What’s implemented:** :warning: *_Concisely describe what experiment(s) (e.g. Figure 1, Table 2, etc.) in the publication can be replicated by running the code. Please only use a few sentences._*
+
+**Datasets:** :warning: *_List the datasets you used (if you used a medium to large dataset, >10GB, please also include the sizes of the dataset). We highly recommend using [FlowerDatasets](https://flower.ai/docs/datasets/index.html) to download and partition your dataset. If you have other ways to download the data, you can also use `FlowerDatasets` to partition it._*
+
+**Hardware Setup:** :warning: *_Give some details about the hardware (e.g. a server with 8x V100 32GB and 256GB of RAM) you used to run the experiments for this baseline. Indicate how long it took to run the experiments. Someone out there might not have access to the same resources you have, so could you list the absolute minimum hardware needed to run the experiment in a reasonable amount of time? (e.g. the minimum is 1x 16GB GPU, otherwise a client model can’t be trained with a sufficiently large batch size). Could you test that this works too?_*
+
+**Contributors:** :warning: *_let the world know who contributed to this baseline. This could be either your name, your name and affiliation at the time, or your GitHub profile name if you prefer. If multiple contributors signed up for this baseline, please list yourself and your colleagues_*
+
+
+## Experimental Setup
+
+**Task:** :warning: *_what’s the primary task that is being federated? (e.g. image classification, next-word prediction). If you have experiments for several, please list them_*
+
+**Model:** :warning: *_provide details about the model you used in your experiments (if more than one, use a list). If your model is small, describing it as a table would be :100:. Some FL methods do not use an off-the-shelf model (e.g. ResNet18); instead, they define their own. If this is your case, please provide a summary here and give pointers to where in the paper (e.g. Appendix B.4) it is detailed._*
+
+**Dataset:** :warning: *_Earlier you already listed the datasets that your baseline uses. Now you should include a breakdown of the details about each of them. Please include information about: how the dataset is partitioned (e.g. LDA with alpha 0.1 as default and all clients having the same number of training examples; or each client getting assigned a different number of samples following a power-law distribution, with each client holding instances of only 2 classes); if your dataset is naturally partitioned, just state “naturally partitioned”; and how many partitions there are (i.e. how many clients). Please include this and all other relevant information about the dataset and its partitioning in a table._*
+
+**Training Hyperparameters:** :warning: *_Include a table with all the main hyperparameters in your baseline. Please show them with their default value._*
+
+
+## Environment Setup
+
+:warning: _Specify the steps to create and activate your environment and install the baseline project. Most baselines are expected to require minimal steps as shown below. These instructions should be comprehensive enough so anyone can run them (if non-standard, describe them step-by-step)._
+
+:warning: _The dependencies for your baseline are listed in the `pyproject.toml`, extend it with additional packages needed for your baseline._
+
+:warning: _Baselines should use Python 3.10, [pyenv](https://github.com/pyenv/pyenv), and the [virtualenv](https://github.com/pyenv/pyenv-virtualenv) plugin._
+
+```bash
+# Create the virtual environment
+pyenv virtualenv 3.10.14 <name-of-your-baseline-env>
+
+# Activate it
+pyenv activate <name-of-your-baseline-env>
+
+# Install the baseline
+pip install -e .
+```
+
+:warning: _If your baseline requires running some script before starting an experiment, please indicate so here_.
+
+## Running the Experiments
+
+:warning: _Make sure you have adjusted the `client-resources` in the federation in `pyproject.toml` so your simulation makes the best use of the system resources available._
+
+:warning: _Your baseline implementation should replicate several of the experiments in the original paper. Please include here the exact command(s) needed to run each of those experiments followed by a figure (e.g. a line plot) or table showing the results you obtained when you ran the code. Below is an example of how you can present this. Please add the command followed by the results for all your experiments._
+
+:warning: _You might want to add more hyperparameters and settings for your baseline. You can do so by extending `[tool.flwr.app.config]` in `pyproject.toml`. In addition, you can create a new `.toml` file that can be passed with the `--run-config` option (see an example below) to override several config values **already present** in `pyproject.toml`._
+```bash
+# it is likely that for one experiment you need to override some arguments.
+flwr run . --run-config learning-rate=0.1,coefficient=0.123
+
+# or you might want to load different `.toml` configs all together:
+flwr run . --run-config <your-config>.toml
+```
+
+:warning: _It is preferable to show a single command (or multiple commands if they belong to the same experiment) and then a table/plot with the expected results, instead of showing all the commands first and then all the results/plots._
+:warning: _If you present plots or other figures, please include either a Jupyter notebook showing how to create them or a utility function that can be called after the experiments finish running._
+:warning: If you include plots or figures, save them in `.png` format and place them in a new directory named `_static` at the same level as your `README.md`.
diff --git a/src/py/flwr/cli/new/templates/app/code/__init__.baseline.py.tpl b/src/py/flwr/cli/new/templates/app/code/__init__.baseline.py.tpl
new file mode 100644
index 000000000000..5ad8041381d6
--- /dev/null
+++ b/src/py/flwr/cli/new/templates/app/code/__init__.baseline.py.tpl
@@ -0,0 +1 @@
+"""$project_name: A Flower Baseline."""
diff --git a/src/py/flwr/cli/new/templates/app/code/client.baseline.py.tpl b/src/py/flwr/cli/new/templates/app/code/client.baseline.py.tpl
new file mode 100644
index 000000000000..83a475f20d27
--- /dev/null
+++ b/src/py/flwr/cli/new/templates/app/code/client.baseline.py.tpl
@@ -0,0 +1,58 @@
+"""$project_name: A Flower Baseline."""
+
+import torch
+
+from flwr.client import ClientApp, NumPyClient
+from flwr.common import Context
+from $import_name.dataset import load_data
+from $import_name.model import Net, get_weights, set_weights, test, train
+
+
+class FlowerClient(NumPyClient):
+ """A class defining the client."""
+
+ def __init__(self, net, trainloader, valloader, local_epochs):
+ self.net = net
+ self.trainloader = trainloader
+ self.valloader = valloader
+ self.local_epochs = local_epochs
+ self.device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
+ self.net.to(self.device)
+
+ def fit(self, parameters, config):
+ """Traim model using this client's data."""
+ set_weights(self.net, parameters)
+ train_loss = train(
+ self.net,
+ self.trainloader,
+ self.local_epochs,
+ self.device,
+ )
+ return (
+ get_weights(self.net),
+ len(self.trainloader.dataset),
+ {"train_loss": train_loss},
+ )
+
+ def evaluate(self, parameters, config):
+ """Evaluate model using this client's data."""
+ set_weights(self.net, parameters)
+ loss, accuracy = test(self.net, self.valloader, self.device)
+ return loss, len(self.valloader.dataset), {"accuracy": accuracy}
+
+
+def client_fn(context: Context):
+ """Construct a Client that will be run in a ClientApp."""
+ # Load model and data
+ net = Net()
+ partition_id = int(context.node_config["partition-id"])
+ num_partitions = int(context.node_config["num-partitions"])
+ trainloader, valloader = load_data(partition_id, num_partitions)
+ local_epochs = context.run_config["local-epochs"]
+
+ # Return Client instance
+ return FlowerClient(net, trainloader, valloader, local_epochs).to_client()
+
+
+# Flower ClientApp
+app = ClientApp(client_fn)
diff --git a/src/py/flwr/cli/new/templates/app/code/dataset.baseline.py.tpl b/src/py/flwr/cli/new/templates/app/code/dataset.baseline.py.tpl
new file mode 100644
index 000000000000..46f1f64418c0
--- /dev/null
+++ b/src/py/flwr/cli/new/templates/app/code/dataset.baseline.py.tpl
@@ -0,0 +1,36 @@
+"""$project_name: A Flower Baseline."""
+
+from flwr_datasets import FederatedDataset
+from flwr_datasets.partitioner import IidPartitioner
+from torch.utils.data import DataLoader
+from torchvision.transforms import Compose, Normalize, ToTensor
+
+FDS = None # Cache FederatedDataset
+
+
+def load_data(partition_id: int, num_partitions: int):
+ """Load partition CIFAR10 data."""
+ # Only initialize `FederatedDataset` once
+ global FDS # pylint: disable=global-statement
+ if FDS is None:
+ partitioner = IidPartitioner(num_partitions=num_partitions)
+ FDS = FederatedDataset(
+ dataset="uoft-cs/cifar10",
+ partitioners={"train": partitioner},
+ )
+ partition = FDS.load_partition(partition_id)
+ # Divide data on each node: 80% train, 20% test
+ partition_train_test = partition.train_test_split(test_size=0.2, seed=42)
+ pytorch_transforms = Compose(
+ [ToTensor(), Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))]
+ )
+
+ def apply_transforms(batch):
+ """Apply transforms to the partition from FederatedDataset."""
+ batch["img"] = [pytorch_transforms(img) for img in batch["img"]]
+ return batch
+
+ partition_train_test = partition_train_test.with_transform(apply_transforms)
+ trainloader = DataLoader(partition_train_test["train"], batch_size=32, shuffle=True)
+ testloader = DataLoader(partition_train_test["test"], batch_size=32)
+ return trainloader, testloader
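A minimal usage sketch of the `load_data` helper above (assuming the template's CIFAR-10 defaults and that `flwr-datasets` and `torch` are installed; the first call downloads and caches the dataset):

```python
# Load the first of 10 IID partitions and inspect one training batch
trainloader, valloader = load_data(partition_id=0, num_partitions=10)
batch = next(iter(trainloader))
print(batch["img"].shape)  # expected: torch.Size([32, 3, 32, 32])
print(batch["label"][:5])  # first five labels of the batch
```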
diff --git a/src/py/flwr/cli/new/templates/app/code/model.baseline.py.tpl b/src/py/flwr/cli/new/templates/app/code/model.baseline.py.tpl
new file mode 100644
index 000000000000..8a914fcf60d1
--- /dev/null
+++ b/src/py/flwr/cli/new/templates/app/code/model.baseline.py.tpl
@@ -0,0 +1,80 @@
+"""$project_name: A Flower Baseline."""
+
+from collections import OrderedDict
+
+import torch
+import torch.nn.functional as F
+from torch import nn
+
+
+class Net(nn.Module):
+ """Model (simple CNN adapted from 'PyTorch: A 60 Minute Blitz')."""
+
+ def __init__(self):
+ super().__init__()
+ self.conv1 = nn.Conv2d(3, 6, 5)
+ self.pool = nn.MaxPool2d(2, 2)
+ self.conv2 = nn.Conv2d(6, 16, 5)
+ self.fc1 = nn.Linear(16 * 5 * 5, 120)
+ self.fc2 = nn.Linear(120, 84)
+ self.fc3 = nn.Linear(84, 10)
+
+ def forward(self, x):
+ """Do forward."""
+ x = self.pool(F.relu(self.conv1(x)))
+ x = self.pool(F.relu(self.conv2(x)))
+ x = x.view(-1, 16 * 5 * 5)
+ x = F.relu(self.fc1(x))
+ x = F.relu(self.fc2(x))
+ return self.fc3(x)
+
+
+def train(net, trainloader, epochs, device):
+ """Train the model on the training set."""
+ net.to(device) # move model to GPU if available
+ criterion = torch.nn.CrossEntropyLoss()
+ criterion.to(device)
+ optimizer = torch.optim.SGD(net.parameters(), lr=0.1, momentum=0.9)
+ net.train()
+ running_loss = 0.0
+ for _ in range(epochs):
+ for batch in trainloader:
+ images = batch["img"]
+ labels = batch["label"]
+ optimizer.zero_grad()
+ loss = criterion(net(images.to(device)), labels.to(device))
+ loss.backward()
+ optimizer.step()
+ running_loss += loss.item()
+
+ avg_trainloss = running_loss / len(trainloader)
+ return avg_trainloss
+
+
+def test(net, testloader, device):
+ """Validate the model on the test set."""
+ net.to(device)
+ criterion = torch.nn.CrossEntropyLoss()
+ correct, loss = 0, 0.0
+ with torch.no_grad():
+ for batch in testloader:
+ images = batch["img"].to(device)
+ labels = batch["label"].to(device)
+ outputs = net(images)
+ loss += criterion(outputs, labels).item()
+ correct += (torch.max(outputs.data, 1)[1] == labels).sum().item()
+ accuracy = correct / len(testloader.dataset)
+ loss = loss / len(testloader)
+ return loss, accuracy
+
+
+def get_weights(net):
+ """Extract model parameters as numpy arrays from state_dict."""
+ return [val.cpu().numpy() for _, val in net.state_dict().items()]
+
+
+def set_weights(net, parameters):
+ """Apply parameters to an existing model."""
+ params_dict = zip(net.state_dict().keys(), parameters)
+ state_dict = OrderedDict({k: torch.tensor(v) for k, v in params_dict})
+ net.load_state_dict(state_dict, strict=True)
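The two weight helpers are the glue between PyTorch state dicts and the NumPy arrays Flower exchanges between server and clients; a minimal round-trip sketch using the template's own `Net`, `get_weights`, and `set_weights`:

```python
# Round-trip: model weights -> list of NumPy arrays -> model weights
net = Net()
weights = get_weights(net)  # one ndarray per state_dict entry
set_weights(net, weights)   # restores an identical state_dict
```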
diff --git a/src/py/flwr/cli/new/templates/app/code/server.baseline.py.tpl b/src/py/flwr/cli/new/templates/app/code/server.baseline.py.tpl
new file mode 100644
index 000000000000..ea536e3efffb
--- /dev/null
+++ b/src/py/flwr/cli/new/templates/app/code/server.baseline.py.tpl
@@ -0,0 +1,46 @@
+"""$project_name: A Flower Baseline."""
+
+from typing import List, Tuple
+
+from flwr.common import Context, Metrics, ndarrays_to_parameters
+from flwr.server import ServerApp, ServerAppComponents, ServerConfig
+from flwr.server.strategy import FedAvg
+from $import_name.model import Net, get_weights
+
+
+# Define metric aggregation function
+def weighted_average(metrics: List[Tuple[int, Metrics]]) -> Metrics:
+ """Do weighted average of accuracy metric."""
+ # Multiply accuracy of each client by number of examples used
+ accuracies = [num_examples * float(m["accuracy"]) for num_examples, m in metrics]
+ examples = [num_examples for num_examples, _ in metrics]
+
+ # Aggregate and return custom metric (weighted average)
+ return {"accuracy": sum(accuracies) / sum(examples)}
+
+
+def server_fn(context: Context):
+ """Construct components that set the ServerApp behaviour."""
+ # Read from config
+ num_rounds = context.run_config["num-server-rounds"]
+ fraction_fit = context.run_config["fraction-fit"]
+
+ # Initialize model parameters
+ ndarrays = get_weights(Net())
+ parameters = ndarrays_to_parameters(ndarrays)
+
+ # Define strategy
+ strategy = FedAvg(
+ fraction_fit=float(fraction_fit),
+ fraction_evaluate=1.0,
+ min_available_clients=2,
+ initial_parameters=parameters,
+ evaluate_metrics_aggregation_fn=weighted_average,
+ )
+ config = ServerConfig(num_rounds=int(num_rounds))
+
+ return ServerAppComponents(strategy=strategy, config=config)
+
+
+# Create ServerApp
+app = ServerApp(server_fn=server_fn)
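To make the aggregation concrete, a tiny worked example for `weighted_average` with made-up client metrics:

```python
# Two clients reporting (num_examples, {"accuracy": ...})
metrics = [(100, {"accuracy": 0.8}), (300, {"accuracy": 0.9})]
print(weighted_average(metrics))  # {'accuracy': 0.875}, i.e. (100*0.8 + 300*0.9) / 400
```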
diff --git a/src/py/flwr/cli/new/templates/app/code/strategy.baseline.py.tpl b/src/py/flwr/cli/new/templates/app/code/strategy.baseline.py.tpl
new file mode 100644
index 000000000000..5ad8041381d6
--- /dev/null
+++ b/src/py/flwr/cli/new/templates/app/code/strategy.baseline.py.tpl
@@ -0,0 +1 @@
+"""$project_name: A Flower Baseline."""
diff --git a/src/py/flwr/cli/new/templates/app/code/utils.baseline.py.tpl b/src/py/flwr/cli/new/templates/app/code/utils.baseline.py.tpl
new file mode 100644
index 000000000000..5ad8041381d6
--- /dev/null
+++ b/src/py/flwr/cli/new/templates/app/code/utils.baseline.py.tpl
@@ -0,0 +1 @@
+"""$project_name: A Flower Baseline."""
diff --git a/src/py/flwr/cli/new/templates/app/pyproject.baseline.toml.tpl b/src/py/flwr/cli/new/templates/app/pyproject.baseline.toml.tpl
new file mode 100644
index 000000000000..71afc184ffa9
--- /dev/null
+++ b/src/py/flwr/cli/new/templates/app/pyproject.baseline.toml.tpl
@@ -0,0 +1,138 @@
+[build-system]
+requires = ["hatchling"]
+build-backend = "hatchling.build"
+
+[project]
+name = "$package_name"
+version = "1.0.0"
+description = ""
+license = "Apache-2.0"
+dependencies = [
+ "flwr[simulation]>=1.11.0",
+ "flwr-datasets[vision]>=0.3.0",
+ "torch==2.2.1",
+ "torchvision==0.17.1",
+]
+
+[tool.hatch.metadata]
+allow-direct-references = true
+
+[project.optional-dependencies]
+dev = [
+ "isort==5.13.2",
+ "black==24.2.0",
+ "docformatter==1.7.5",
+ "mypy==1.8.0",
+ "pylint==3.2.6",
+ "flake8==5.0.4",
+ "pytest==6.2.4",
+ "pytest-watch==4.2.0",
+ "ruff==0.1.9",
+ "types-requests==2.31.0.20240125",
+]
+
+[tool.isort]
+profile = "black"
+known_first_party = ["flwr"]
+
+[tool.black]
+line-length = 88
+target-version = ["py38", "py39", "py310", "py311"]
+
+[tool.pytest.ini_options]
+minversion = "6.2"
+addopts = "-qq"
+testpaths = [
+ "flwr_baselines",
+]
+
+[tool.mypy]
+ignore_missing_imports = true
+strict = false
+plugins = "numpy.typing.mypy_plugin"
+
+[tool.pylint."MESSAGES CONTROL"]
+disable = "duplicate-code,too-few-public-methods,useless-import-alias"
+good-names = "i,j,k,_,x,y,X,Y,K,N"
+max-args = 10
+max-attributes = 15
+max-locals = 36
+max-branches = 20
+max-statements = 55
+
+[tool.pylint.typecheck]
+generated-members = "numpy.*, torch.*, tensorflow.*"
+
+[[tool.mypy.overrides]]
+module = [
+ "importlib.metadata.*",
+ "importlib_metadata.*",
+]
+follow_imports = "skip"
+follow_imports_for_stubs = true
+disallow_untyped_calls = false
+
+[[tool.mypy.overrides]]
+module = "torch.*"
+follow_imports = "skip"
+follow_imports_for_stubs = true
+
+[tool.docformatter]
+wrap-summaries = 88
+wrap-descriptions = 88
+
+[tool.ruff]
+target-version = "py38"
+line-length = 88
+select = ["D", "E", "F", "W", "B", "ISC", "C4"]
+fixable = ["D", "E", "F", "W", "B", "ISC", "C4"]
+ignore = ["B024", "B027"]
+exclude = [
+ ".bzr",
+ ".direnv",
+ ".eggs",
+ ".git",
+ ".hg",
+ ".mypy_cache",
+ ".nox",
+ ".pants.d",
+ ".pytype",
+ ".ruff_cache",
+ ".svn",
+ ".tox",
+ ".venv",
+ "__pypackages__",
+ "_build",
+ "buck-out",
+ "build",
+ "dist",
+ "node_modules",
+ "venv",
+ "proto",
+]
+
+[tool.ruff.pydocstyle]
+convention = "numpy"
+
+[tool.hatch.build.targets.wheel]
+packages = ["."]
+
+[tool.flwr.app]
+publisher = "$username"
+
+[tool.flwr.app.components]
+serverapp = "$import_name.server_app:app"
+clientapp = "$import_name.client_app:app"
+
+[tool.flwr.app.config]
+num-server-rounds = 3
+fraction-fit = 0.5
+local-epochs = 1
+
+[tool.flwr.federations]
+default = "local-simulation"
+
+[tool.flwr.federations.local-simulation]
+options.num-supernodes = 10
+options.backend.client-resources.num-cpus = 2
+options.backend.client-resources.num-gpus = 0.0
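The keys under `[tool.flwr.app.config]` become run-config defaults that can be overridden at run time; a hedged example (the override values are made up):

```bash
# Override the defaults defined in [tool.flwr.app.config]
flwr run . --run-config "num-server-rounds=5 local-epochs=2"
```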
From 2dd161ac4bf3e53d020509c1e3c9c5bd3476b0d1 Mon Sep 17 00:00:00 2001
From: Javier
Date: Wed, 28 Aug 2024 18:39:58 +0200
Subject: [PATCH 17/42] fix(framework:skip) Parse node config if set (#4091)
---
src/py/flwr/client/supernode/app.py | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/src/py/flwr/client/supernode/app.py b/src/py/flwr/client/supernode/app.py
index 8d28e69dea6e..ac845417415e 100644
--- a/src/py/flwr/client/supernode/app.py
+++ b/src/py/flwr/client/supernode/app.py
@@ -77,7 +77,9 @@ def run_supernode() -> None:
authentication_keys=authentication_keys,
max_retries=args.max_retries,
max_wait_time=args.max_wait_time,
- node_config=parse_config_args([args.node_config]),
+ node_config=parse_config_args(
+ [args.node_config] if args.node_config else args.node_config
+ ),
isolation=args.isolation,
supernode_address=args.supernode_address,
)
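The guard matters because `parse_config_args` expects either `None` or a list of config strings; wrapping an unset `--node-config` as `[None]` would hand the parser a list containing `None`. A minimal sketch of the intended behavior (assuming `parse_config_args` from `flwr.common.config`):

```python
from flwr.common.config import parse_config_args

node_config = None  # --node-config was not set on the command line

# With the guard, None is passed through and yields empty overrides
parsed = parse_config_args([node_config] if node_config else node_config)
print(parsed)  # expected: {} (empty overrides)
```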
From 7ae277075009458c1e4a9d11268b77bdad675945 Mon Sep 17 00:00:00 2001
From: Javier
Date: Wed, 28 Aug 2024 18:52:16 +0200
Subject: [PATCH 18/42] fix(framework:skip) Support overriding run config from
a `TOML` (#4080)
---
src/py/flwr/common/config.py | 25 +++++++++-------
src/py/flwr/common/config_test.py | 48 +++++++++++++++++++++++++++++++
2 files changed, 62 insertions(+), 11 deletions(-)
diff --git a/src/py/flwr/common/config.py b/src/py/flwr/common/config.py
index eec7cfb726b7..42039fa959ac 100644
--- a/src/py/flwr/common/config.py
+++ b/src/py/flwr/common/config.py
@@ -185,23 +185,26 @@ def parse_config_args(
if config is None:
return overrides
+ # Handle if .toml file is passed
+ if len(config) == 1 and config[0].endswith(".toml"):
+ with Path(config[0]).open("rb") as config_file:
+ overrides = flatten_dict(tomli.load(config_file))
+ return overrides
+
# Regular expression to capture key-value pairs with possible quoted values
pattern = re.compile(r"(\S+?)=(\'[^\']*\'|\"[^\"]*\"|\S+)")
for config_line in config:
if config_line:
- matches = pattern.findall(config_line)
+ # .toml files aren't allowed alongside other configs
+ if config_line.endswith(".toml"):
+ raise ValueError(
+ "TOML files cannot be passed alongside key-value pairs."
+ )
- if (
- len(matches) == 1
- and "=" not in matches[0][0]
- and matches[0][0].endswith(".toml")
- ):
- with Path(matches[0][0]).open("rb") as config_file:
- overrides = flatten_dict(tomli.load(config_file))
- else:
- toml_str = "\n".join(f"{k} = {v}" for k, v in matches)
- overrides.update(tomli.loads(toml_str))
+ matches = pattern.findall(config_line)
+ toml_str = "\n".join(f"{k} = {v}" for k, v in matches)
+ overrides.update(tomli.loads(toml_str))
return overrides
diff --git a/src/py/flwr/common/config_test.py b/src/py/flwr/common/config_test.py
index 712e07264d3f..34bc691cc957 100644
--- a/src/py/flwr/common/config_test.py
+++ b/src/py/flwr/common/config_test.py
@@ -15,6 +15,7 @@
"""Test util functions handling Flower config."""
import os
+import tempfile
import textwrap
from pathlib import Path
from unittest.mock import patch
@@ -254,3 +255,50 @@ def test_parse_config_args_overrides() -> None:
"key5": True,
"key6": "value6",
}
+
+
+def test_parse_config_args_from_toml_file() -> None:
+ """Test if a toml passed to --run-config it is loaded and fused correctly."""
+ # Will be saved as a temp .toml file
+ toml_config = """
+ num-server-rounds = 10
+ momentum = 0.1
+ verbose = true
+ """
+ # This is the UserConfig that would be extracted from pyproject.toml
+ initial_run_config: UserConfig = {
+ "num-server-rounds": 5,
+ "momentum": 0.2,
+ "dataset": "my-fancy-dataset",
+ "verbose": False,
+ }
+ expected_config = {
+ "num-server-rounds": 10,
+ "momentum": 0.1,
+ "dataset": "my-fancy-dataset",
+ "verbose": True,
+ }
+
+ # Create a temporary directory using a context manager
+ with tempfile.TemporaryDirectory() as temp_dir:
+ # Create a temporary TOML file within that directory
+ toml_config_file = os.path.join(temp_dir, "extra_config.toml")
+
+ # Write the data to the TOML file
+ with open(toml_config_file, "w", encoding="utf-8") as toml_file:
+ toml_file.write(textwrap.dedent(toml_config))
+
+ # Parse config (this mimics what `--run-config path/to/config.toml` does)
+ config_from_toml = parse_config_args([toml_config_file])
+ # Fuse
+ config = fuse_dicts(initial_run_config, config_from_toml)
+
+ # Assert
+ assert config == expected_config
+
+
+def test_parse_config_args_passing_toml_and_key_value() -> None:
+ """Test that passing a toml and key-value configs aren't allowed."""
+ config = ["my-other-config.toml", "lr=0.1", "epochs=99"]
+ with pytest.raises(ValueError):
+ parse_config_args(config)
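In practice, the new code path lets a whole TOML file stand in for a list of key-value overrides; a hedged end-to-end sketch (file name and values are made up):

```bash
# Write a hypothetical override file...
cat > extra_config.toml <<EOF
num-server-rounds = 10
momentum = 0.1
verbose = true
EOF

# ...and pass it as the single --run-config argument
flwr run . --run-config extra_config.toml
```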
From 84eeda695f7eadd10bb8a1ee5d69c52a098a9393 Mon Sep 17 00:00:00 2001
From: Javier
Date: Thu, 29 Aug 2024 10:46:55 +0200
Subject: [PATCH 19/42] fix(framework:skip) Fix parsing run config in
simulation (#4095)
---
src/py/flwr/simulation/run_simulation.py | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/src/py/flwr/simulation/run_simulation.py b/src/py/flwr/simulation/run_simulation.py
index af12da4a5814..1eddd91108d8 100644
--- a/src/py/flwr/simulation/run_simulation.py
+++ b/src/py/flwr/simulation/run_simulation.py
@@ -177,7 +177,9 @@ def run_simulation_from_cli() -> None:
client_app_attr = app_components["clientapp"]
server_app_attr = app_components["serverapp"]
- override_config = parse_config_args([args.run_config])
+ override_config = parse_config_args(
+ [args.run_config] if args.run_config else args.run_config
+ )
fused_config = get_fused_config_from_dir(app_path, override_config)
app_dir = args.app
is_app = True
From e3d74f7e7e8bd505ece0e12a17e4321b1a80863d Mon Sep 17 00:00:00 2001
From: Javier
Date: Thu, 29 Aug 2024 13:20:41 +0200
Subject: [PATCH 20/42] refactor(baselines:skip) Remove baseline template and
creation script (#4082)
---
.../baseline_template/EXTENDED_README.md | 123 -----------
baselines/baseline_template/LICENSE | 202 ------------------
baselines/baseline_template/README.md | 87 --------
.../baseline_template/__init__.py | 1 -
.../baseline_template/client.py | 5 -
.../baseline_template/conf/base.yaml | 17 --
.../baseline_template/dataset.py | 10 -
.../baseline_template/dataset_preparation.py | 34 ---
.../baseline_template/main.py | 57 -----
.../baseline_template/models.py | 7 -
.../baseline_template/server.py | 5 -
.../baseline_template/strategy.py | 5 -
.../baseline_template/utils.py | 6 -
baselines/baseline_template/pyproject.toml | 137 ------------
baselines/dev/create-baseline.sh | 30 ---
15 files changed, 726 deletions(-)
delete mode 100644 baselines/baseline_template/EXTENDED_README.md
delete mode 100644 baselines/baseline_template/LICENSE
delete mode 100644 baselines/baseline_template/README.md
delete mode 100644 baselines/baseline_template/baseline_template/__init__.py
delete mode 100644 baselines/baseline_template/baseline_template/client.py
delete mode 100644 baselines/baseline_template/baseline_template/conf/base.yaml
delete mode 100644 baselines/baseline_template/baseline_template/dataset.py
delete mode 100644 baselines/baseline_template/baseline_template/dataset_preparation.py
delete mode 100644 baselines/baseline_template/baseline_template/main.py
delete mode 100644 baselines/baseline_template/baseline_template/models.py
delete mode 100644 baselines/baseline_template/baseline_template/server.py
delete mode 100644 baselines/baseline_template/baseline_template/strategy.py
delete mode 100644 baselines/baseline_template/baseline_template/utils.py
delete mode 100644 baselines/baseline_template/pyproject.toml
delete mode 100755 baselines/dev/create-baseline.sh
diff --git a/baselines/baseline_template/EXTENDED_README.md b/baselines/baseline_template/EXTENDED_README.md
deleted file mode 100644
index 9c8f5bc72fa9..000000000000
--- a/baselines/baseline_template/EXTENDED_README.md
+++ /dev/null
@@ -1,123 +0,0 @@
-
-# Extended Readme
-
-> The baselines are expected to run in a machine running Ubuntu 22.04
-
-While `README.md` should include information about the baseline you implement and how to run it, this _extended_ readme provides info on the expected directory structure for a new baseline and, more generally, the instructions to follow before your baseline can be merged into the Flower repository. Please follow these instructions closely. It is likely that you have already completed steps 1-2.
-
-1. Fork the Flower repository and clone it.
-2. Navigate to the `baselines/` directory and from there run:
- ```bash
- # This will create a new directory with the same structure as this `baseline_template` directory.
- ./dev/create-baseline.sh
- ```
-3. All your code and configs should go into a sub-directory with the same name as the name of your baseline.
- * The sub-directory contains a series of Python scripts that you can edit. Please stick to these files and consult with us if you need additional ones.
- * There is also a basic config structure in `<baseline-name>/conf` ready to be parsed by [Hydra](https://hydra.cc/) when executing your `main.py`.
-4. Therefore, the directory structure in your baseline should look like:
- ```bash
- baselines/<baseline-name>/
- ├── README.md # describes your baseline and everything needed to use it
- ├── EXTENDED_README.md # to remove before creating your PR
- ├── pyproject.toml # details your Python environment
- └── <baseline-name>
- ├── *.py # several .py files including main.py and __init__.py
- └── conf
- └── *.yaml # one or more Hydra config files
-
- ```
-> :warning: Make sure the variable `name` in `pyproject.toml` is set to the name of the sub-directory containing all your code.
-
-5. Add your dependencies to the `pyproject.toml` (see below a few examples on how to do it). Read more about Poetry below in this `EXTENDED_README.md`.
-6. Regularly check that your coding style and the documentation you add follow good coding practices. To test whether your code meets the requirements, please run the following:
- ```bash
- # After activating your environment and from your baseline's directory
- cd .. # to go to the top-level directory of all baselines
- ./dev/test-baseline.sh
- ./dev/test-baseline-structure.sh
- ```
- Both `test-baseline.sh` and `test-baseline-structure.sh` will also be automatically run when you create a PR, and both tests need to pass for the baseline to be merged.
- To automatically solve some formatting issues and apply easy fixes, please run the formatting script:
- ```bash
- # After activating your environment and from your baseline's directory
- cd .. # to go to the top-level directory of all baselines
- ./dev/format-baseline.sh
- ```
-7. Ensure that the Python environment for your baseline can be created without errors by simply running `poetry install` and that this is properly described later when you complete the `Environment Setup` section in `README.md`. This is especially important if your environment requires additional steps after doing `poetry install`.
-8. Ensure that your baseline runs with default arguments by running `poetry run python -m <baseline-name>.main`. Then, describe this and other forms of running your code in the `Running the Experiments` section in `README.md`.
-9. Once your code is ready and you have checked:
- * that following the instructions in your `README.md` the Python environment can be created correctly
-
- * that running the code following your instructions can reproduce the experiments in the paper
-
- , then you just need to create a Pull Request (PR) to kickstart the process of merging your baseline into the Flower repository.
-
-> Once you are happy to merge your baseline contribution, please delete this `EXTENDED_README.md` file.
-
-
-## About Poetry
-
-We use Poetry to manage the Python environment for each individual baseline. You can follow the instructions [here](https://python-poetry.org/docs/) to install Poetry in your machine.
-
-
-### Specifying a Python Version (optional)
-By default, Poetry will use the Python version in your system. In some settings, you might want to specify a particular version of Python to use inside your Poetry environment. You can do so with [`pyenv`](https://github.com/pyenv/pyenv). Check the documentation for the different ways of installing `pyenv`, but one easy way is using the [automatic installer](https://github.com/pyenv/pyenv-installer):
-```bash
-curl https://pyenv.run | bash # then, don't forget links to your .bashrc/.zshrc
-```
-
-You can then install any Python version with `pyenv install <python-version>` (e.g. `pyenv install 3.9.17`). Then, in order to use that version for your baseline, you'd do the following:
-
-```bash
-# cd to your baseline directory (i.e. where the `pyproject.toml` is)
-pyenv local <python-version>
-
-# set that version for poetry
-poetry env use <python-version>
-
-# then you can install your Poetry environment (see the next step)
-```
-
-### Installing Your Environment
-With the Poetry tool already installed, you can create an environment for this baseline with commands:
-```bash
-# run this from the same directory as the `pyproject.toml` file is
-poetry install
-```
-
-This will create a basic Python environment with just Flower and additional packages, including those needed for simulation. Next, you should add the dependencies for your code. It is **critical** that you fix the version of the packages you use using a `=` not a `=^`. You can do so via [`poetry add`](https://python-poetry.org/docs/cli/#add). Below are some examples:
-
-```bash
-# For instance, if you want to install tqdm
-poetry add tqdm==4.65.0
-
-# If you already have a requirements.txt, you can add all those packages (but ensure you have fixed the version) in one go as follows:
-poetry add $( cat requirements.txt )
-```
-With each `poetry add` command, the `pyproject.toml` gets automatically updated so you don't need to keep that `requirements.txt` as part of this baseline.
-
-
-More critically however, is adding your ML framework of choice to the list of dependencies. For some frameworks you might be able to do so with the `poetry add` command. Check [the Poetry documentation](https://python-poetry.org/docs/cli/#add) for how to add packages in various ways. For instance, let's say you want to use PyTorch:
-
-```bash
-# with plain `pip` you'd run a command such as:
-pip install torch==1.13.1+cu117 torchvision==0.14.1+cu117 torchaudio==0.13.1 --extra-index-url https://download.pytorch.org/whl/cu117
-
-# to add the same 3 dependencies to your Poetry environment you'd need to add the URL to the wheel that the above pip command auto-resolves for you.
-# You can find those wheels in `https://download.pytorch.org/whl/cu117`. Copy the link and paste it after the `poetry add` command.
-# For instance to add `torch==1.13.1+cu117` and a x86 Linux system with Python3.8 you'd:
-poetry add https://download.pytorch.org/whl/cu117/torch-1.13.1%2Bcu117-cp38-cp38-linux_x86_64.whl
-# you'll need to repeat this for both `torchvision` and `torchaudio`
-```
-The above is just an example of how you can add these dependencies. Please refer to the Poetry documentation for further reference.
-
-If all attempts fail, you can still install packages via standard `pip`. You'd first need to source/activate your Poetry environment.
-```bash
-# first ensure you have created your environment
-# and installed the base packages provided in the template
-poetry install
-
-# then activate it
-poetry shell
-```
-Now you are inside your environment (pretty much as when you use `virtualenv` or `conda`) so you can install further packages with `pip`. Please note that, unlike with `poetry add`, these extra requirements won't be captured by `pyproject.toml`. Therefore, please ensure that you provide all instructions needed to: (1) create the base environment with Poetry and (2) install any additional dependencies via `pip` when you complete your `README.md`.
\ No newline at end of file
diff --git a/baselines/baseline_template/LICENSE b/baselines/baseline_template/LICENSE
deleted file mode 100644
index d64569567334..000000000000
--- a/baselines/baseline_template/LICENSE
+++ /dev/null
@@ -1,202 +0,0 @@
-
- Apache License
- Version 2.0, January 2004
- http://www.apache.org/licenses/
-
- TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
-
- 1. Definitions.
-
- "License" shall mean the terms and conditions for use, reproduction,
- and distribution as defined by Sections 1 through 9 of this document.
-
- "Licensor" shall mean the copyright owner or entity authorized by
- the copyright owner that is granting the License.
-
- "Legal Entity" shall mean the union of the acting entity and all
- other entities that control, are controlled by, or are under common
- control with that entity. For the purposes of this definition,
- "control" means (i) the power, direct or indirect, to cause the
- direction or management of such entity, whether by contract or
- otherwise, or (ii) ownership of fifty percent (50%) or more of the
- outstanding shares, or (iii) beneficial ownership of such entity.
-
- "You" (or "Your") shall mean an individual or Legal Entity
- exercising permissions granted by this License.
-
- "Source" form shall mean the preferred form for making modifications,
- including but not limited to software source code, documentation
- source, and configuration files.
-
- "Object" form shall mean any form resulting from mechanical
- transformation or translation of a Source form, including but
- not limited to compiled object code, generated documentation,
- and conversions to other media types.
-
- "Work" shall mean the work of authorship, whether in Source or
- Object form, made available under the License, as indicated by a
- copyright notice that is included in or attached to the work
- (an example is provided in the Appendix below).
-
- "Derivative Works" shall mean any work, whether in Source or Object
- form, that is based on (or derived from) the Work and for which the
- editorial revisions, annotations, elaborations, or other modifications
- represent, as a whole, an original work of authorship. For the purposes
- of this License, Derivative Works shall not include works that remain
- separable from, or merely link (or bind by name) to the interfaces of,
- the Work and Derivative Works thereof.
-
- "Contribution" shall mean any work of authorship, including
- the original version of the Work and any modifications or additions
- to that Work or Derivative Works thereof, that is intentionally
- submitted to Licensor for inclusion in the Work by the copyright owner
- or by an individual or Legal Entity authorized to submit on behalf of
- the copyright owner. For the purposes of this definition, "submitted"
- means any form of electronic, verbal, or written communication sent
- to the Licensor or its representatives, including but not limited to
- communication on electronic mailing lists, source code control systems,
- and issue tracking systems that are managed by, or on behalf of, the
- Licensor for the purpose of discussing and improving the Work, but
- excluding communication that is conspicuously marked or otherwise
- designated in writing by the copyright owner as "Not a Contribution."
-
- "Contributor" shall mean Licensor and any individual or Legal Entity
- on behalf of whom a Contribution has been received by Licensor and
- subsequently incorporated within the Work.
-
- 2. Grant of Copyright License. Subject to the terms and conditions of
- this License, each Contributor hereby grants to You a perpetual,
- worldwide, non-exclusive, no-charge, royalty-free, irrevocable
- copyright license to reproduce, prepare Derivative Works of,
- publicly display, publicly perform, sublicense, and distribute the
- Work and such Derivative Works in Source or Object form.
-
- 3. Grant of Patent License. Subject to the terms and conditions of
- this License, each Contributor hereby grants to You a perpetual,
- worldwide, non-exclusive, no-charge, royalty-free, irrevocable
- (except as stated in this section) patent license to make, have made,
- use, offer to sell, sell, import, and otherwise transfer the Work,
- where such license applies only to those patent claims licensable
- by such Contributor that are necessarily infringed by their
- Contribution(s) alone or by combination of their Contribution(s)
- with the Work to which such Contribution(s) was submitted. If You
- institute patent litigation against any entity (including a
- cross-claim or counterclaim in a lawsuit) alleging that the Work
- or a Contribution incorporated within the Work constitutes direct
- or contributory patent infringement, then any patent licenses
- granted to You under this License for that Work shall terminate
- as of the date such litigation is filed.
-
- 4. Redistribution. You may reproduce and distribute copies of the
- Work or Derivative Works thereof in any medium, with or without
- modifications, and in Source or Object form, provided that You
- meet the following conditions:
-
- (a) You must give any other recipients of the Work or
- Derivative Works a copy of this License; and
-
- (b) You must cause any modified files to carry prominent notices
- stating that You changed the files; and
-
- (c) You must retain, in the Source form of any Derivative Works
- that You distribute, all copyright, patent, trademark, and
- attribution notices from the Source form of the Work,
- excluding those notices that do not pertain to any part of
- the Derivative Works; and
-
- (d) If the Work includes a "NOTICE" text file as part of its
- distribution, then any Derivative Works that You distribute must
- include a readable copy of the attribution notices contained
- within such NOTICE file, excluding those notices that do not
- pertain to any part of the Derivative Works, in at least one
- of the following places: within a NOTICE text file distributed
- as part of the Derivative Works; within the Source form or
- documentation, if provided along with the Derivative Works; or,
- within a display generated by the Derivative Works, if and
- wherever such third-party notices normally appear. The contents
- of the NOTICE file are for informational purposes only and
- do not modify the License. You may add Your own attribution
- notices within Derivative Works that You distribute, alongside
- or as an addendum to the NOTICE text from the Work, provided
- that such additional attribution notices cannot be construed
- as modifying the License.
-
- You may add Your own copyright statement to Your modifications and
- may provide additional or different license terms and conditions
- for use, reproduction, or distribution of Your modifications, or
- for any such Derivative Works as a whole, provided Your use,
- reproduction, and distribution of the Work otherwise complies with
- the conditions stated in this License.
-
- 5. Submission of Contributions. Unless You explicitly state otherwise,
- any Contribution intentionally submitted for inclusion in the Work
- by You to the Licensor shall be under the terms and conditions of
- this License, without any additional terms or conditions.
- Notwithstanding the above, nothing herein shall supersede or modify
- the terms of any separate license agreement you may have executed
- with Licensor regarding such Contributions.
-
- 6. Trademarks. This License does not grant permission to use the trade
- names, trademarks, service marks, or product names of the Licensor,
- except as required for reasonable and customary use in describing the
- origin of the Work and reproducing the content of the NOTICE file.
-
- 7. Disclaimer of Warranty. Unless required by applicable law or
- agreed to in writing, Licensor provides the Work (and each
- Contributor provides its Contributions) on an "AS IS" BASIS,
- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
- implied, including, without limitation, any warranties or conditions
- of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
- PARTICULAR PURPOSE. You are solely responsible for determining the
- appropriateness of using or redistributing the Work and assume any
- risks associated with Your exercise of permissions under this License.
-
- 8. Limitation of Liability. In no event and under no legal theory,
- whether in tort (including negligence), contract, or otherwise,
- unless required by applicable law (such as deliberate and grossly
- negligent acts) or agreed to in writing, shall any Contributor be
- liable to You for damages, including any direct, indirect, special,
- incidental, or consequential damages of any character arising as a
- result of this License or out of the use or inability to use the
- Work (including but not limited to damages for loss of goodwill,
- work stoppage, computer failure or malfunction, or any and all
- other commercial damages or losses), even if such Contributor
- has been advised of the possibility of such damages.
-
- 9. Accepting Warranty or Additional Liability. While redistributing
- the Work or Derivative Works thereof, You may choose to offer,
- and charge a fee for, acceptance of support, warranty, indemnity,
- or other liability obligations and/or rights consistent with this
- License. However, in accepting such obligations, You may act only
- on Your own behalf and on Your sole responsibility, not on behalf
- of any other Contributor, and only if You agree to indemnify,
- defend, and hold each Contributor harmless for any liability
- incurred by, or claims asserted against, such Contributor by reason
- of your accepting any such warranty or additional liability.
-
- END OF TERMS AND CONDITIONS
-
- APPENDIX: How to apply the Apache License to your work.
-
- To apply the Apache License to your work, attach the following
- boilerplate notice, with the fields enclosed by brackets "[]"
- replaced with your own identifying information. (Don't include
- the brackets!) The text should be enclosed in the appropriate
- comment syntax for the file format. We also recommend that a
- file or class name and description of purpose be included on the
- same "printed page" as the copyright notice for easier
- identification within third-party archives.
-
- Copyright [yyyy] [name of copyright owner]
-
- Licensed under the Apache License, Version 2.0 (the "License");
- you may not use this file except in compliance with the License.
- You may obtain a copy of the License at
-
- http://www.apache.org/licenses/LICENSE-2.0
-
- Unless required by applicable law or agreed to in writing, software
- distributed under the License is distributed on an "AS IS" BASIS,
- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- See the License for the specific language governing permissions and
- limitations under the License.
diff --git a/baselines/baseline_template/README.md b/baselines/baseline_template/README.md
deleted file mode 100644
index ee6e1e96976f..000000000000
--- a/baselines/baseline_template/README.md
+++ /dev/null
@@ -1,87 +0,0 @@
----
-title: title of the paper
-url: URL to the paper page (not the pdf)
-labels: [label1, label2] # please add between 4 and 10 single-word (or at most two-word) labels (e.g. system heterogeneity, image classification, asynchronous, weight sharing, cross-silo). Do not use ""
-dataset: [dataset1, dataset2] # list of datasets you include in your baseline. Do not use ""
----
-
-# :warning: *_Title of your baseline_*
-
-> Note: If you use this baseline in your work, please remember to cite the original authors of the paper as well as the Flower paper.
-
-> :warning: This is the template to follow when creating a new Flower Baseline. Please follow the instructions in `EXTENDED_README.md`
-
-> :warning: Please follow the instructions carefully. You can see the [FedProx-MNIST baseline](https://github.com/adap/flower/tree/main/baselines/fedprox) as an example of a baseline that followed this guide.
-
-> :warning: Please complete the metadata section at the very top of this README. This generates a table at the top of the file that will facilitate indexing baselines.
-
-**Paper:** :warning: *_add the URL of the paper page (not to the .pdf). For instance, if you link a paper on ArXiv, add here the URL to the abstract page (e.g. https://arxiv.org/abs/1512.03385). If your paper is from a journal or conference proceedings, please follow the same logic._*
-
-**Authors:** :warning: *_list authors of the paper_*
-
-**Abstract:** :warning: *_add here the abstract of the paper you are implementing_*
-
-
-## About this baseline
-
-**What’s implemented:** :warning: *_Concisely describe what experiment(s) in the publication can be replicated by running the code. Please only use a few sentences. Start with: “The code in this directory …”_*
-
-**Datasets:** :warning: *_List the datasets you used (if you used a medium to large dataset, >10GB please also include the sizes of the dataset)._*
-
-**Hardware Setup:** :warning: *_Give some details about the hardware (e.g. a server with 8x V100 32GB and 256GB of RAM) you used to run the experiments for this baseline. Someone out there might not have access to the same resources you have, so could you list the absolute minimum hardware needed to run the experiment in a reasonable amount of time? (e.g. minimum is 1x 16GB GPU, otherwise a client model can't be trained with a sufficiently large batch size). Could you test that this works too?_*
-
-**Contributors:** :warning: *_let the world know who contributed to this baseline. This could be either your name, your name and affiliation at the time, or your GitHub profile name if you prefer. If multiple contributors signed up for this baseline, please list yourself and your colleagues_*
-
-
-## Experimental Setup
-
-**Task:** :warning: *_what’s the primary task that is being federated? (e.g. image classification, next-word prediction). If you have experiments for several, please list them_*
-
-**Model:** :warning: *_provide details about the model you used in your experiments (if more than one, use a list). If your model is small, describing it as a table would be :100:. Some FL methods do not use an off-the-shelf model (e.g. ResNet18); instead they define their own. If this is your case, please provide a summary here and give pointers to where in the paper (e.g. Appendix B.4) it is detailed._*
-
-**Dataset:** :warning: *_Earlier you already listed the datasets that your baseline uses. Now you should include a breakdown of the details about each of them. Please include information about: how the dataset is partitioned (e.g. LDA with alpha 0.1 as default and all clients have the same number of training examples; or each client gets assigned a different number of samples following a power-law distribution with each client holding instances of only 2 classes); if your dataset is naturally partitioned, just state "naturally partitioned"; and how many partitions there are (i.e. how many clients). Please include this and all other relevant information about the dataset and its partitioning in a table (a sketch of such partitioning follows this section)._*
-
-**Training Hyperparameters:** :warning: *_Include a table with all the main hyperparameters in your baseline. Please show them with their default value._*
-
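For concreteness, here is a minimal sketch (not part of the template) of the LDA/Dirichlet partitioning mentioned in the **Dataset** item above, using Flower Datasets; the dataset name, client count, and alpha value are illustrative assumptions:

```python
# Hedged sketch: Dirichlet (LDA-style) partitioning with Flower Datasets.
# "cifar10" and num_partitions=10 are illustrative assumptions.
from flwr_datasets import FederatedDataset
from flwr_datasets.partitioner import DirichletPartitioner

partitioner = DirichletPartitioner(
    num_partitions=10,     # one partition per client
    partition_by="label",  # skew the label distribution across clients
    alpha=0.1,             # lower alpha -> more heterogeneous partitions
)
fds = FederatedDataset(dataset="cifar10", partitioners={"train": partitioner})
partition = fds.load_partition(0)  # training data for client 0
```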
-
-## Environment Setup
-
-:warning: _The Python environment for all baselines should follow the guidelines in the `EXTENDED_README`. Specify the steps to create and activate your environment. If there are any external system-wide requirements, please include instructions for them too. These instructions should be comprehensive enough so anyone can run them (if non-standard, describe them step-by-step)._
-
-
-## Running the Experiments
-
-:warning: _Provide instructions on the steps to follow to run all the experiments._
-```bash
-# The main experiment implemented in your baseline using default hyperparameters (that should be set up in the Hydra configs) should run (including dataset download and necessary partitioning) by executing the command:
-
-poetry run python -m <baseline-name>.main # where <baseline-name> is the name of this directory and that of the only sub-directory in this directory (i.e. where all your source code is)
-
-# If you are using a dataset that requires a complicated download (i.e. not using one natively supported by TF/PyTorch) + preprocessing logic, you might want to tell people to run one script first that will do all that. Please ensure the download + preprocessing can be configured to suit (at least!) a different download directory (and use as default the current directory). The expected command to run to do this is:
-
-poetry run python -m <baseline-name>.dataset_preparation
-
-# It is expected that your baseline supports more than one dataset and different FL settings (e.g. different number of clients, dataset partitioning methods, etc.). Please provide a list of commands showing how these experiments are run. Include also a short explanation of what each one does. Here it is expected you'll be using the Hydra syntax to override the default config.
-
-poetry run python -m <baseline-name>.main
-.
-.
-.
-poetry run python -m <baseline-name>.main
-```
-
-
-## Expected Results
-
-:warning: _Your baseline implementation should replicate several of the experiments in the original paper. Please include here the exact command(s) needed to run each of those experiments followed by a figure (e.g. a line plot) or table showing the results you obtained when you ran the code. Below is an example of how you can present this. Please add command followed by results for all your experiments._
-
-```bash
-# it is likely that for one experiment you need to sweep over different hyperparameters. You are encouraged to use Hydra's multirun functionality for this. This is an example of how you could achieve this for some typical FL hyperparameters
-
-poetry run python -m <baseline-name>.main --multirun num_client_per_round=5,10,50 dataset=femnist,cifar10
-# the above command will run a total of 6 individual experiments (because 3 client configs x 2 datasets = 6 -- you can think of it as a grid).
-
-[Now show a figure/table displaying the results of the above command]
-
-# add more commands + plots for additional experiments.
-```
diff --git a/baselines/baseline_template/baseline_template/__init__.py b/baselines/baseline_template/baseline_template/__init__.py
deleted file mode 100644
index a5e567b59135..000000000000
--- a/baselines/baseline_template/baseline_template/__init__.py
+++ /dev/null
@@ -1 +0,0 @@
-"""Template baseline package."""
diff --git a/baselines/baseline_template/baseline_template/client.py b/baselines/baseline_template/baseline_template/client.py
deleted file mode 100644
index d2e2206111f3..000000000000
--- a/baselines/baseline_template/baseline_template/client.py
+++ /dev/null
@@ -1,5 +0,0 @@
-"""Define your client class and a function to construct such clients.
-
-Please subclass `flwr.client.NumPyClient` or `flwr.client.Client` and create a function
-to instantiate your client.
-"""
diff --git a/baselines/baseline_template/baseline_template/conf/base.yaml b/baselines/baseline_template/baseline_template/conf/base.yaml
deleted file mode 100644
index 2d65b3b989b2..000000000000
--- a/baselines/baseline_template/baseline_template/conf/base.yaml
+++ /dev/null
@@ -1,17 +0,0 @@
----
-# this is the config that will be loaded as default by main.py
-# Please follow the provided structure (this will ensure all baselines follow
-# a similar configuration structure and hence are easy to customise)
-
-dataset:
- # dataset config
-
-model:
- # model config
-
-strategy:
- _target_: # points to your strategy (either custom or existing in Flower)
- # rest of strategy config
-
-client:
- # client config
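To make the role of `_target_` concrete, here is a hedged sketch of how Hydra resolves such a strategy entry; the `FedAvg` target and the `fraction_fit` key are illustrative assumptions, not part of the template:

```python
# Hedged sketch: instantiating a strategy from a Hydra-style config.
from hydra.utils import instantiate
from omegaconf import OmegaConf

cfg = OmegaConf.create(
    {
        "strategy": {
            "_target_": "flwr.server.strategy.FedAvg",  # an existing Flower strategy
            "fraction_fit": 0.1,                        # forwarded as a kwarg to FedAvg
        }
    }
)

# Hydra imports the class named by `_target_` and passes the remaining keys as kwargs
strategy = instantiate(cfg.strategy)
print(type(strategy).__name__)  # -> FedAvg
```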
diff --git a/baselines/baseline_template/baseline_template/dataset.py b/baselines/baseline_template/baseline_template/dataset.py
deleted file mode 100644
index 5e436abe12fb..000000000000
--- a/baselines/baseline_template/baseline_template/dataset.py
+++ /dev/null
@@ -1,10 +0,0 @@
-"""Handle basic dataset creation.
-
-In case of PyTorch it should return dataloaders for your dataset (for both the clients
-and the server). If you are using a custom dataset class, this module is the place to
-define it. If your dataset requires to be downloaded (and this is not done
-automatically -- e.g. as it is the case for many dataset in TorchVision) and
-partitioned, please include all those functions and logic in the
-`dataset_preparation.py` module. You can use all those functions from functions/methods
-defined here of course.
-"""
diff --git a/baselines/baseline_template/baseline_template/dataset_preparation.py b/baselines/baseline_template/baseline_template/dataset_preparation.py
deleted file mode 100644
index bd3440b9276b..000000000000
--- a/baselines/baseline_template/baseline_template/dataset_preparation.py
+++ /dev/null
@@ -1,34 +0,0 @@
-"""Handle the dataset partitioning and (optionally) complex downloads.
-
-Please add here all the necessary logic to either download, uncompress, pre/post-process
-your dataset (or all of the above). If the desired way of running your baseline is to
-first download the dataset and partition it and then run the experiments, please
-uncomment the lines below and tell us in the README.md (see the "Running the Experiment"
-block) that this file should be executed first.
-"""
-# import hydra
-# from hydra.core.hydra_config import HydraConfig
-# from hydra.utils import call, instantiate
-# from omegaconf import DictConfig, OmegaConf
-
-
-# @hydra.main(config_path="conf", config_name="base", version_base=None)
-# def download_and_preprocess(cfg: DictConfig) -> None:
-# """Does everything needed to get the dataset.
-
-# Parameters
-# ----------
-# cfg : DictConfig
-# An omegaconf object that stores the hydra config.
-# """
-
-# ## 1. print parsed config
-# print(OmegaConf.to_yaml(cfg))
-
-# # Please include here all the logic
-# # Please use the Hydra config style as much as possible specially
-# # for parts that can be customised (e.g. how data is partitioned)
-
-# if __name__ == "__main__":
-
-# download_and_preprocess()
diff --git a/baselines/baseline_template/baseline_template/main.py b/baselines/baseline_template/baseline_template/main.py
deleted file mode 100644
index 25ae1bec6a10..000000000000
--- a/baselines/baseline_template/baseline_template/main.py
+++ /dev/null
@@ -1,57 +0,0 @@
-"""Create and connect the building blocks for your experiments; start the simulation.
-
-It includes processing the dataset, instantiating the strategy, specifying how the
-global model is going to be evaluated, etc. At the end, this script saves the results.
-"""
-# these are the basic packages you'll need here
-# feel free to remove some if they aren't needed
-import hydra
-from omegaconf import DictConfig, OmegaConf
-
-
-@hydra.main(config_path="conf", config_name="base", version_base=None)
-def main(cfg: DictConfig) -> None:
- """Run the baseline.
-
- Parameters
- ----------
- cfg : DictConfig
- An omegaconf object that stores the hydra config.
- """
- # 1. Print parsed config
- print(OmegaConf.to_yaml(cfg))
-
- # 2. Prepare your dataset
- # here you should call a function in datasets.py that returns whatever is needed to:
- # (1) ensure the server can access the dataset used to evaluate your model after
- # aggregation
- # (2) tell each client what dataset partitions they should use (e.g. this could
- # be a location in the file system, a list of dataloaders, a list of ids to extract
- # from a dataset, it's up to you)
-
- # 3. Define your clients
- # Define a function that returns another function that will be used during
- # simulation to instantiate each individual client
- # client_fn = client.<my_function_that_returns_a_client_fn>()
-
- # 4. Define your strategy
- # pass all relevant arguments (including the global dataset used after aggregation,
- # if needed by your method)
- # strategy = instantiate(cfg.strategy, <additional arguments if needed>)
-
- # 5. Start Simulation
- # history = fl.simulation.start_simulation(<arguments for simulation>)
-
- # 6. Save your results
- # Here you can save the `history` returned by the simulation and include
- # also other buffers, statistics, info needed to be saved in order to later
- # on generate the plots you provide in the README.md. You can for instance
- # access elements that belong to the strategy:
- # data = strategy.get_my_custom_data() -- assuming you have such a method defined.
- # Hydra will generate for you a directory each time you run the code. You
- # can retrieve the path to that directory with this:
- # save_path = HydraConfig.get().runtime.output_dir
-
-
-if __name__ == "__main__":
- main()
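Putting steps 3-5 together, here is a hedged sketch of the simulation launch this skeleton builds towards; `client_fn` and `strategy` are the objects from the commented steps above, and the client count and resources are assumptions:

```python
# Hedged sketch of step 5: launching the (legacy) Flower simulation.
# `client_fn` and `strategy` come from steps 3 and 4 above.
import flwr as fl

history = fl.simulation.start_simulation(
    client_fn=client_fn,
    num_clients=100,  # total number of simulated clients (assumption)
    config=fl.server.ServerConfig(num_rounds=10),
    strategy=strategy,
    client_resources={"num_cpus": 1, "num_gpus": 0.0},
)
```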
diff --git a/baselines/baseline_template/baseline_template/models.py b/baselines/baseline_template/baseline_template/models.py
deleted file mode 100644
index 71fa553d1f59..000000000000
--- a/baselines/baseline_template/baseline_template/models.py
+++ /dev/null
@@ -1,7 +0,0 @@
-"""Define our models, and training and eval functions.
-
-If your model is 100% off-the-shelf (e.g. directly from torchvision without requiring
-modifications) you might be better off instantiating your model directly from the Hydra
-config. In this way, swapping your model for another one can be done without changing
-the python code at all
-"""
diff --git a/baselines/baseline_template/baseline_template/server.py b/baselines/baseline_template/baseline_template/server.py
deleted file mode 100644
index 2fd7d42cde5a..000000000000
--- a/baselines/baseline_template/baseline_template/server.py
+++ /dev/null
@@ -1,5 +0,0 @@
-"""Create global evaluation function.
-
-Optionally, also define a new Server class (please note this is not needed in most
-settings).
-"""
diff --git a/baselines/baseline_template/baseline_template/strategy.py b/baselines/baseline_template/baseline_template/strategy.py
deleted file mode 100644
index 17436c401c30..000000000000
--- a/baselines/baseline_template/baseline_template/strategy.py
+++ /dev/null
@@ -1,5 +0,0 @@
-"""Optionally define a custom strategy.
-
-Needed only when the strategy is not yet implemented in Flower or because you want to
-extend or modify the functionality of an existing strategy.
-"""
diff --git a/baselines/baseline_template/baseline_template/utils.py b/baselines/baseline_template/baseline_template/utils.py
deleted file mode 100644
index 9a831719d623..000000000000
--- a/baselines/baseline_template/baseline_template/utils.py
+++ /dev/null
@@ -1,6 +0,0 @@
-"""Define any utility function.
-
-They are not directly relevant to the other (more FL-specific) Python modules. For
-example, you may define here things like: loading a model from a checkpoint, saving
-results, plotting.
-"""
diff --git a/baselines/baseline_template/pyproject.toml b/baselines/baseline_template/pyproject.toml
deleted file mode 100644
index 31f1ee7bfe6d..000000000000
--- a/baselines/baseline_template/pyproject.toml
+++ /dev/null
@@ -1,137 +0,0 @@
-[build-system]
-requires = ["poetry-core>=1.4.0"]
-build-backend = "poetry.masonry.api"
-
-[tool.poetry]
-name = "" # <----- Ensure it matches the name of your baseline directory containing all the source code
-version = "1.0.0"
-description = "Flower Baselines"
-license = "Apache-2.0"
-authors = ["The Flower Authors "]
-readme = "README.md"
-homepage = "https://flower.ai"
-repository = "https://github.com/adap/flower"
-documentation = "https://flower.ai"
-classifiers = [
- "Development Status :: 3 - Alpha",
- "Intended Audience :: Developers",
- "Intended Audience :: Science/Research",
- "License :: OSI Approved :: Apache Software License",
- "Operating System :: MacOS :: MacOS X",
- "Operating System :: POSIX :: Linux",
- "Programming Language :: Python",
- "Programming Language :: Python :: 3",
- "Programming Language :: Python :: 3 :: Only",
- "Programming Language :: Python :: 3.8",
- "Programming Language :: Python :: 3.9",
- "Programming Language :: Python :: 3.10",
- "Programming Language :: Python :: 3.11",
- "Programming Language :: Python :: Implementation :: CPython",
- "Topic :: Scientific/Engineering",
- "Topic :: Scientific/Engineering :: Artificial Intelligence",
- "Topic :: Scientific/Engineering :: Mathematics",
- "Topic :: Software Development",
- "Topic :: Software Development :: Libraries",
- "Topic :: Software Development :: Libraries :: Python Modules",
- "Typing :: Typed",
-]
-
-[tool.poetry.dependencies]
-python = ">=3.8.15, <3.12.0" # don't change this
-flwr = { extras = ["simulation"], version = "1.5.0" }
-hydra-core = "1.3.2" # don't change this
-
-[tool.poetry.dev-dependencies]
-isort = "==5.13.2"
-black = "==24.2.0"
-docformatter = "==1.7.5"
-mypy = "==1.4.1"
-pylint = "==2.8.2"
-flake8 = "==3.9.2"
-pytest = "==6.2.4"
-pytest-watch = "==4.2.0"
-ruff = "==0.0.272"
-types-requests = "==2.27.7"
-
-[tool.isort]
-line_length = 88
-indent = " "
-multi_line_output = 3
-include_trailing_comma = true
-force_grid_wrap = 0
-use_parentheses = true
-
-[tool.black]
-line-length = 88
-target-version = ["py38", "py39", "py310", "py311"]
-
-[tool.pytest.ini_options]
-minversion = "6.2"
-addopts = "-qq"
-testpaths = [
- "flwr_baselines",
-]
-
-[tool.mypy]
-ignore_missing_imports = true
-strict = false
-plugins = "numpy.typing.mypy_plugin"
-
-[tool.pylint."MESSAGES CONTROL"]
-disable = "bad-continuation,duplicate-code,too-few-public-methods,useless-import-alias"
-good-names = "i,j,k,_,x,y,X,Y"
-signature-mutators = "hydra.main.main"
-
-[tool.pylint.typecheck]
-generated-members = "numpy.*, torch.*, tensorflow.*"
-
-[[tool.mypy.overrides]]
-module = [
- "importlib.metadata.*",
- "importlib_metadata.*",
-]
-follow_imports = "skip"
-follow_imports_for_stubs = true
-disallow_untyped_calls = false
-
-[[tool.mypy.overrides]]
-module = "torch.*"
-follow_imports = "skip"
-follow_imports_for_stubs = true
-
-[tool.docformatter]
-wrap-summaries = 88
-wrap-descriptions = 88
-
-[tool.ruff]
-target-version = "py38"
-line-length = 88
-select = ["D", "E", "F", "W", "B", "ISC", "C4"]
-fixable = ["D", "E", "F", "W", "B", "ISC", "C4"]
-ignore = ["B024", "B027"]
-exclude = [
- ".bzr",
- ".direnv",
- ".eggs",
- ".git",
- ".hg",
- ".mypy_cache",
- ".nox",
- ".pants.d",
- ".pytype",
- ".ruff_cache",
- ".svn",
- ".tox",
- ".venv",
- "__pypackages__",
- "_build",
- "buck-out",
- "build",
- "dist",
- "node_modules",
- "venv",
- "proto",
-]
-
-[tool.ruff.pydocstyle]
-convention = "numpy"
diff --git a/baselines/dev/create-baseline.sh b/baselines/dev/create-baseline.sh
deleted file mode 100755
index 53cd79c569aa..000000000000
--- a/baselines/dev/create-baseline.sh
+++ /dev/null
@@ -1,30 +0,0 @@
-#!/bin/bash
-
-# This script duplicates the `baseline_template` directory and changes its name
-# to the one you specify when running this script. That name is also used to
-# rename the subdirectory inside your new baseline directory as well as to set
-# the Python package name that Poetry will build
-
-set -e
-cd "$( cd "$( dirname "${BASH_SOURCE[0]}" )" >/dev/null 2>&1 && pwd )"/../
-
-template="baseline_template"
-name=$1
-
-# copying directory
-echo "Copying '$template' and renaming it to '$name'"
-cp -r $template $name
-
-# renaming sub-directory
-echo "Renaming sub-directory as '$name'"
-mv $name/$template $name/$name
-
-# adjusting package name in pyproject.toml
-cd $name
-if [[ "$OSTYPE" == "darwin"* ]]; then
- sed -i '' -e "s/<baseline-name>/$name/" pyproject.toml
-else
- sed -i -e "s/<baseline-name>/$name/" pyproject.toml
-fi
-
-echo "!!! Your directory for your baseline '$name' is ready."
From 1c01e7fe0147d4e446b8e6b003cab424ce3394d2 Mon Sep 17 00:00:00 2001
From: Javier
Date: Thu, 29 Aug 2024 13:26:53 +0200
Subject: [PATCH 21/42] docs(baselines) Update baselines contribution docs
(#3995)
---
baselines/README.md | 34 +++---
.../source/how-to-contribute-baselines.rst | 42 ++++---
baselines/doc/source/how-to-use-baselines.rst | 114 +++++++++++-------
3 files changed, 112 insertions(+), 78 deletions(-)
diff --git a/baselines/README.md b/baselines/README.md
index 3a84df02d8de..75bcccb68b2a 100644
--- a/baselines/README.md
+++ b/baselines/README.md
@@ -1,10 +1,9 @@
# Flower Baselines
+> [!NOTE]
> We are changing the way we structure the Flower baselines. While we complete the transition to the new format, you can still find the existing baselines in the `flwr_baselines` directory. Currently, you can make use of baselines for [FedAvg](https://github.com/adap/flower/tree/main/baselines/flwr_baselines/flwr_baselines/publications/fedavg_mnist), [FedOpt](https://github.com/adap/flower/tree/main/baselines/flwr_baselines/flwr_baselines/publications/adaptive_federated_optimization), and [LEAF-FEMNIST](https://github.com/adap/flower/tree/main/baselines/flwr_baselines/flwr_baselines/publications/leaf/femnist).
-> The documentation below has been updated to reflect the new way of using Flower baselines.
-
## Structure
@@ -15,17 +14,15 @@ baselines/<baseline-name>/
├── README.md
├── pyproject.toml
 └── <baseline-name>
- ├── *.py # several .py files including main.py and __init__.py
- └── conf
- └── *.yaml # one or more Hydra config files
+ └── *.py # several .py files
```
-Please note that some baselines might include additional files (e.g. a `requirements.txt`) or a hierarchy of `.yaml` files for [Hydra](https://hydra.cc/).
## Running the baselines
-Each baseline is self-contained in its own directory. Furthermore, each baseline defines its own Python environment using [Poetry](https://python-poetry.org/docs/) via a `pyproject.toml` file and [`pyenv`](https://github.com/pyenv/pyenv). If you haven't set up `Poetry` and `pyenv` already on your machine, please take a look at the [Documentation](https://flower.ai/docs/baselines/how-to-use-baselines.html#setting-up-your-machine) for a guide on how to do so.
+> [!NOTE]
+> We are in the process of migrating all baselines to use `flwr run`. Those baselines that still use the previous system (i.e. [Poetry](https://python-poetry.org/), [Hydra](https://hydra.cc/) and [start_simulation](https://flower.ai/docs/framework/ref-api/flwr.simulation.start_simulation.html)) might require you to first set up `Poetry` and `pyenv` on your machine. Please take a look at the [Documentation](https://flower.ai/docs/baselines/how-to-use-baselines.html#setting-up-your-machine) for a guide on how to do so.
-Assuming `pyenv` and `Poetry` are already installed on your system, running a baseline can be done by:
+Each baseline is self-contained in its own directory. To run a baseline:
1. Clone the Flower repository
@@ -34,11 +31,7 @@ Assuming `pyenv` and `Poetry` are already installed on your system. Running a ba
```
2. Navigate inside the directory of the baseline you'd like to run.
-3. Follow the `[Environment Setup]` instructions in the `README.md`. In most cases this will require you to just do:
-
- ```bash
- poetry install
- ```
+3. Follow the `[Environment Setup]` instructions in the `README.md`.
4. Run the baseline as indicated in the `[Running the Experiments]` section in the `README.md` or in the `[Expected Results]` section to reproduce the experiments in the paper.
@@ -46,17 +39,22 @@ Assuming `pyenv` and `Poetry` are already installed on your system. Running a ba
Do you have a new federated learning paper and want to add a new baseline to Flower? Or do you want to add an experiment to an existing baseline paper? Great, we really appreciate your contribution!
+> [!TIP]
+> A more verbose version of these steps can be found in the [Flower Baselines documentation](https://flower.ai/docs/baselines/how-to-contribute-baselines.html).
+
The steps to follow are:
+1. Create a new Python 3.10 environment and install Flower (`pip install flwr`)
1. Fork the Flower repo and clone it into your machine.
-2. Navigate to the `baselines/` directory, choose a single-word (and **lowercase**) name for your baseline, and from there run:
+2. Navigate to the `baselines/` directory, from there and with your environment activated, run:
```bash
- # This will create a new directory with the same structure as `baseline_template`.
- ./dev/create-baseline.sh
+ # Choose option "Flower Baseline" when prompted
+ flwr new
```
-3. Then, go inside your baseline directory and continue with the steps detailed in `EXTENDED_README.md` and `README.md`.
-4. Once your code is ready and you have checked that following the instructions in your `README.md` the Python environment can be created correctly and that running the code following your instructions can reproduce the experiments in the paper, you just need to create a Pull Request (PR). Then, the process to merge your baseline into the Flower repo will begin!
+3. Then, go inside your baseline directory and continue with the steps detailed in the `README.md`.
+4. Once your code is ready, check that you have completed all the sections in the `README.md` and that your baseline still runs in a freshly created environment (i.e. play the role of a person running the baseline you want to contribute).
+5. Create a Pull Request (PR). Then, the process to merge your baseline into the Flower repo will begin!
Further resources:
diff --git a/baselines/doc/source/how-to-contribute-baselines.rst b/baselines/doc/source/how-to-contribute-baselines.rst
index b568e73f1c11..429ac714c1aa 100644
--- a/baselines/doc/source/how-to-contribute-baselines.rst
+++ b/baselines/doc/source/how-to-contribute-baselines.rst
@@ -6,16 +6,14 @@ Do you have a new federated learning paper and want to add a new baseline to Flo
The goal of Flower Baselines is to reproduce experiments from popular papers to accelerate researchers by enabling faster comparisons to new strategies, datasets, models, and federated pipelines in general.
Before you start to work on a new baseline or experiment, please check the `Flower Issues <https://github.com/adap/flower/issues>`_ or `Flower Pull Requests <https://github.com/adap/flower/pulls>`_ to see if someone else is already working on it. Please open a new issue if you are planning to work on a new baseline or experiment with a short description of the corresponding paper and the experiment you want to contribute.
+If you are proposing a brand new baseline, please indicate what experiments from the paper you are planning to include.
Requirements
------------
-Contributing a new baseline is really easy. You only have to make sure that your federated learning experiments are running with Flower and replicate the results of a paper. Flower baselines need to make use of:
+Contributing a new baseline is really easy. You only have to make sure that your federated learning experiments run with Flower, use `Flower Datasets <https://flower.ai/docs/datasets/>`_, and replicate the results of a paper.
+Preferably, the baselines make use of PyTorch, but other ML frameworks are also welcome. The baselines are expected to run on a machine with Ubuntu 22.04, but if yours also runs on macOS, even better!
-* `Poetry <https://python-poetry.org/docs/>`_ to manage the Python environment.
-* `Hydra <https://hydra.cc/>`_ to manage the configuration files for your experiments.
-
-You can find more information about how to set up Poetry on your machine in the ``EXTENDED_README.md`` that is generated when you prepare your baseline.
Add a new Flower Baseline
-------------------------
@@ -27,11 +25,18 @@ Let's say you want to contribute the code of your most recent Federated Learning
#. **Get the Flower source code on your machine**
 #. Fork the Flower codebase: go to the `Flower GitHub repo <https://github.com/adap/flower>`_ and fork the code (click the *Fork* button in the top-right corner and follow the instructions)
#. Clone the (forked) Flower source code: :code:`git clone git@github.com:[your_github_username]/flower.git`
- #. Open the code in your favorite editor.
-#. **Use the provided script to create your baseline directory**
- #. Navigate to the baselines directory and run :code:`./dev/create-baseline.sh fedawesome`
- #. A new directory in :code:`baselines/fedawesome` is created.
- #. Follow the instructions in :code:`EXTENDED_README.md` and :code:`README.md` in your baseline directory.
+#. **Create a new baseline using the template**
+ #. Create a new Python environment with Python 3.10 (we recommend doing this with `pyenv <https://github.com/pyenv/pyenv>`_)
+ #. Install Flower with: :code:`pip install flwr`.
+ #. Navigate to the baselines directory and run: :code:`flwr new fedawesome`. When prompted, choose the option :code:`Flower Baseline`.
+ #. A new directory in :code:`baselines/fedawesome` is created with the structure needed for a Flower Baseline.
+ #. Follow the instructions in the :code:`README.md` in your baseline directory.
+
+ .. tip::
+ At this point, your baseline contains source code showing how a simple :code:`PyTorch+CIFAR10` project can be built with Flower.
+ You can run it directly by executing :code:`flwr run .` from inside the directory of your baseline. Replace the template code with
+ what is needed to implement your baseline.
+
#. **Open a pull request**
#. Stage your changes: :code:`git add .`
#. Commit & push: :code:`git commit -m "Create new FedAwesome baseline" ; git push`
@@ -49,15 +54,18 @@ Further reading:
Usability
---------
-Flower is known and loved for its usability. Therefore, make sure that your baseline or experiment can be executed with a single command such as:
+Flower is known and loved for its usability. Therefore, make sure that your baseline or experiment can be executed with a single command after installing the baseline project:
.. code-block:: bash
- poetry run python -m .main
-
- # or, once sourced into your environment
- python -m .main
+ # Install the baseline project
+ pip install -e .
+
+ # Run the baseline using default config
+ flwr run .
+
+ # Run the baseline overriding the config
+ flwr run . --run-config lr=0.01,num-server-rounds=200
-We provide you with a `template-baseline <https://github.com/adap/flower/tree/main/baselines/baseline_template>`_ to use as guidance when contributing your baseline. Having all baselines follow a homogeneous structure helps users to try out many baselines without the overhead of having to understand each individual codebase. Similarly, by using Hydra throughout, users will immediately know how to parameterise your experiments directly from the command line.
-We look forward to your contribution!
+We look forward to your contribution!
\ No newline at end of file
diff --git a/baselines/doc/source/how-to-use-baselines.rst b/baselines/doc/source/how-to-use-baselines.rst
index 4704a9b6074e..ec65f8f7d5ee 100644
--- a/baselines/doc/source/how-to-use-baselines.rst
+++ b/baselines/doc/source/how-to-use-baselines.rst
@@ -5,7 +5,6 @@ Use Baselines
We are changing the way we structure the Flower baselines. While we complete the transition to the new format, you can still find the existing baselines and use them: `baselines (old) <https://github.com/adap/flower/tree/main/baselines/flwr_baselines>`_.
Currently, you can make use of baselines for `FedAvg `_, `FedOpt `_, and `LEAF-FEMNIST `_.
- The documentation below has been updated to reflect the new way of using Flower baselines.
Structure
---------
@@ -15,87 +14,116 @@ All baselines are available in the directory `baselines /
+ ├── LICENSE
├── README.md
- ├── pyproject.toml
+ ├── pyproject.toml # defines dependencies
+ ├── _static # optionally a directory to save plots
 └── <baseline-name>
- ├── *.py # several .py files including main.py and __init__.py
- └── conf
- └── *.yaml # one or more Hydra config files
-
-Please note that some baselines might include additional files (e.g. a :code:`requirements.txt`) or a hierarchy of :code:`.yaml` files for `Hydra <https://hydra.cc/>`_.
+ └── *.py # several .py files
Setting up your machine
-----------------------
-.. note::
- Flower baselines are designed to run on Ubuntu 22.04. While a GPU is not required to run the baselines, some of the more computationally demanding ones do benefit from GPU acceleration.
+.. tip::
+ Flower baselines are designed to run on Ubuntu 22.04 and Python 3.10. While a GPU is not required to run the baselines, some of the more computationally demanding ones do benefit from GPU acceleration.
+ All baselines are expected to make use of `pyenv <https://github.com/pyenv/pyenv>`_.
-Common to all baselines is `Poetry <https://python-poetry.org/docs/>`_, a tool to manage Python dependencies. Baselines also make use of `Pyenv <https://github.com/pyenv/pyenv>`_. You'll need to install both on your system before running a baseline. What follows is a step-by-step guide on getting :code:`pyenv` and :code:`Poetry` installed on your system.
+.. note::
+ We are in the process of migrating all baselines to use `flwr run`. Those that haven't yet been migrated still make use of `Poetry <https://python-poetry.org/docs/>`_, a tool to manage Python dependencies.
+ Identifying whether the baseline you want to run requires Poetry or not is easy: check if the `Environment Setup` section in the baseline readme mentions Poetry.
+ Follow the instructions later in this section if you need to setup Poetry in your system.
-Let's begin by installing :code:`pyenv`. We'll be following the standard procedure. Please refer to the `pyenv docs <https://github.com/pyenv/pyenv>`_ for alternative ways of installing it.
+Let's begin by installing :code:`pyenv`. We'll be following the standard procedure. Please refer to the `pyenv docs <https://github.com/pyenv/pyenv>`_ for alternative ways of installing it, including for platforms other than Ubuntu.
.. code-block:: bash
- # first install a few packages needed later for pyenv
- sudo apt install build-essential zlib1g-dev libssl-dev libsqlite3-dev \
- libreadline-dev libbz2-dev libffi-dev liblzma-dev
+ # first install a few packages needed later for pyenv
+ sudo apt install build-essential zlib1g-dev libssl-dev libsqlite3-dev \
+ libreadline-dev libbz2-dev libffi-dev liblzma-dev
- # now clone pyenv into your home directory (this is the default way of installing pyenv)
- git clone https://github.com/pyenv/pyenv.git ~/.pyenv
+ # now clone pyenv into your home directory (this is the default way of installing pyenv)
+ git clone https://github.com/pyenv/pyenv.git ~/.pyenv
- # Then add pyenv to your path by adding the below to your .bashrc/.zshrc
- export PYENV_ROOT="$HOME/.pyenv"
- command -v pyenv >/dev/null || export PATH="$PYENV_ROOT/bin:$PATH"
- eval "$(pyenv init -)"
+ # Then add pyenv to your path by adding the below to your .bashrc/.zshrc
+ export PYENV_ROOT="$HOME/.pyenv"
+ command -v pyenv >/dev/null || export PATH="$PYENV_ROOT/bin:$PATH"
+ eval "$(pyenv init -)"
Verify your installation by opening a new terminal and running:
.. code-block:: bash
- # check python versions available
- pyenv versions
- # * system (...) # <-- it should just show one
+ # check python versions available
+ pyenv versions
+ # * system (...) # <-- it should just show one
+
+Then you can proceed and install any version of Python. Baselines use Python 3.10, so we'll be installing a recent version of it.
+
+.. code-block:: bash
+
+ pyenv install 3.10.14
+ # this will take a little while
+ # once done, you should see that that version is available
+ pyenv versions
+ # system
+ # * 3.10.14 # <-- you just installed this
-Then you can proceed and install any version of Python. Most baselines currently use Python 3.10.6, so we'll be installing that one.
+Next, let's install the :code:`virtualenv` plugin. Check `the documentation <https://github.com/pyenv/pyenv-virtualenv>`_ for alternative installation methods.
.. code-block:: bash
- pyenv install 3.10.6
- # this will take a little while
- # once done, you should see that that version is available
- pyenv versions
- # system
- # * 3.10.6 # <-- you just installed this
+ # Clone `pyenv-virtualenv`
+ git clone https://github.com/pyenv/pyenv-virtualenv.git $(pyenv root)/plugins/pyenv-virtualenv
+
+ # Restart your shell
+ exec "$SHELL"
+
-Now that we have :code:`pyenv` installed, we are ready to install :code:`poetry`. Installing Poetry can be done from a single command:
+Using :code:`pyenv`
+~~~~~~~~~~~~~~~~~~~
+
+Creating a virtual environment can be done as follows:
.. code-block:: bash
- curl -sSL https://install.python-poetry.org | python3 -
+ # Create an environment for Python 3.10.14 named test-env
+ pyenv virtualenv 3.10.14 test-env
+
+ # Then activate it
+ pyenv activate test-env
+
+ # Deactivate it as follows
+ pyenv deactivate
- # add to path by putting this line at the end of your .zshrc/.bashrc
- export PATH="$HOME/.local/bin:$PATH"
+
+(optional) Setup Poetry
+~~~~~~~~~~~~~~~~~~~~~~~
+
+Now that we have :code:`pyenv` installed, we are ready to install :code:`poetry`. It can be installed with a single command:
+
+.. code-block:: bash
+
+ curl -sSL https://install.python-poetry.org | python3 -
+
+ # add to path by putting this line at the end of your .zshrc/.bashrc
+ export PATH="$HOME/.local/bin:$PATH"
To install Poetry from source, to customise your installation, or to further integrate Poetry with your shell after installation, please check `the Poetry documentation <https://python-poetry.org/docs/>`_.
+
Using a Flower Baseline
-----------------------
-To use Flower Baselines you need first to install :code:`pyenv` and :code:`Poetry`, then:
+To use Flower Baselines, you first need to install :code:`pyenv` and, depending on the baseline, also :code:`Poetry`, then:
1. Clone the flower repository
.. code-block:: bash
- git clone https://github.com/adap/flower.git && cd flower
+ git clone https://github.com/adap/flower.git && cd flower
2. Navigate inside the directory of the baseline you'd like to run
-3. Follow the :code:`[Environment Setup]` instructions in the :code:`README.md`. In most cases this will require you to just do:
-
-.. code-block:: bash
-
- poetry install
-
-4. Run the baseline as indicated in the :code:`[Running the Experiments]` section in the :code:`README.md` or in the `[Expected Results]` section to reproduce the experiments in the paper.
+3. Follow the :code:`[Environment Setup]` instructions in the :code:`README.md`.
+4. Run the baseline as indicated in the :code:`[Running the Experiments]` section in the :code:`README.md` or in the :code:`[Expected Results]` section to reproduce the experiments in the paper.
From 0ea6c511526a7f6f27b028ab69dc7329e8071d7f Mon Sep 17 00:00:00 2001
From: "Daniel J. Beutel"
Date: Thu, 29 Aug 2024 14:23:00 +0200
Subject: [PATCH 22/42] refactor(framework) Reorganize `pyproject.toml` and
telemetry (#4098)
---
pyproject.toml | 10 +++++---
src/py/flwr/common/telemetry.py | 42 +++++++++++++++++++++++----------
2 files changed, 36 insertions(+), 16 deletions(-)
diff --git a/pyproject.toml b/pyproject.toml
index 0d0138a5689b..6c974304e785 100644
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -52,14 +52,18 @@ exclude = [
]
[tool.poetry.scripts]
+# `flwr` CLI
flwr = "flwr.cli.app:app"
-flower-superlink = "flwr.server:run_superlink"
+# SuperExec (can run with either Deployment Engine or Simulation Engine)
flower-superexec = "flwr.superexec:run_superexec"
+# Simulation Engine
+flower-simulation = "flwr.simulation.run_simulation:run_simulation_from_cli"
+# Deployment Engine
+flower-superlink = "flwr.server:run_superlink"
flower-supernode = "flwr.client:run_supernode"
-flower-client-app = "flwr.client:run_client_app"
flower-server-app = "flwr.server:run_server_app"
-flower-simulation = "flwr.simulation.run_simulation:run_simulation_from_cli"
flwr-clientapp = "flwr.client.clientapp:flwr_clientapp"
+flower-client-app = "flwr.client:run_client_app" # Deprecated, use `flower-supernode`
[tool.poetry.dependencies]
python = "^3.8"
diff --git a/src/py/flwr/common/telemetry.py b/src/py/flwr/common/telemetry.py
index 399f400b7edc..4c9f53ee1e17 100644
--- a/src/py/flwr/common/telemetry.py
+++ b/src/py/flwr/common/telemetry.py
@@ -132,14 +132,36 @@ def _generate_next_value_(name: str, start: int, count: int, last_values: List[A
# Ping
PING = auto()
- # Client: start_client
+ # --- LEGACY FUNCTIONS -------------------------------------------------------------
+
+ # Legacy: `start_client` function
START_CLIENT_ENTER = auto()
START_CLIENT_LEAVE = auto()
- # Server: start_server
+ # Legacy: `start_server` function
START_SERVER_ENTER = auto()
START_SERVER_LEAVE = auto()
+ # Legacy: `start_simulation` function
+ START_SIMULATION_ENTER = auto()
+ START_SIMULATION_LEAVE = auto()
+
+ # --- `flwr` CLI -------------------------------------------------------------------
+
+ # Not yet implemented
+
+ # --- SuperExec --------------------------------------------------------------------
+
+ # SuperExec
+ RUN_SUPEREXEC_ENTER = auto()
+ RUN_SUPEREXEC_LEAVE = auto()
+
+ # --- Simulation Engine ------------------------------------------------------------
+
+ # Not yet implemented
+
+ # --- Deployment Engine ------------------------------------------------------------
+
# Driver API
RUN_DRIVER_API_ENTER = auto()
RUN_DRIVER_API_LEAVE = auto()
@@ -152,10 +174,6 @@ def _generate_next_value_(name: str, start: int, count: int, last_values: List[A
RUN_SUPERLINK_ENTER = auto()
RUN_SUPERLINK_LEAVE = auto()
- # Simulation
- START_SIMULATION_ENTER = auto()
- START_SIMULATION_LEAVE = auto()
-
# Driver: Driver
DRIVER_CONNECT = auto()
DRIVER_DISCONNECT = auto()
@@ -164,10 +182,6 @@ def _generate_next_value_(name: str, start: int, count: int, last_values: List[A
START_DRIVER_ENTER = auto()
START_DRIVER_LEAVE = auto()
- # flower-client-app
- RUN_CLIENT_APP_ENTER = auto()
- RUN_CLIENT_APP_LEAVE = auto()
-
# flower-server-app
RUN_SERVER_APP_ENTER = auto()
RUN_SERVER_APP_LEAVE = auto()
@@ -176,9 +190,11 @@ def _generate_next_value_(name: str, start: int, count: int, last_values: List[A
RUN_SUPERNODE_ENTER = auto()
RUN_SUPERNODE_LEAVE = auto()
- # SuperExec
- RUN_SUPEREXEC_ENTER = auto()
- RUN_SUPEREXEC_LEAVE = auto()
+ # --- DEPRECATED -------------------------------------------------------------------
+
+ # [DEPRECATED] CLI: `flower-client-app`
+ RUN_CLIENT_APP_ENTER = auto()
+ RUN_CLIENT_APP_LEAVE = auto()
# Use the ThreadPoolExecutor with max_workers=1 to have a queue
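For readers unfamiliar with the enum pattern shown in this hunk, here is a hedged, self-contained sketch of how `auto()` combined with `_generate_next_value_` yields string-valued members; `DemoEventType` is a simplified stand-in for the real `EventType`, not its full definition:

```python
# Hedged sketch of the EventType pattern: `auto()` asks
# `_generate_next_value_` for a value, which here is the member's own name.
from enum import Enum, auto
from typing import Any, List


class DemoEventType(str, Enum):
    @staticmethod
    def _generate_next_value_(
        name: str, start: int, count: int, last_values: List[Any]
    ) -> str:
        return name

    PING = auto()
    RUN_SUPERLINK_ENTER = auto()
    RUN_SUPERLINK_LEAVE = auto()


print(DemoEventType.PING.value)  # -> "PING"
```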
From b817b3b954fe7843f97fb2f85a6f38293f2151dd Mon Sep 17 00:00:00 2001
From: Robert Steiner
Date: Thu, 29 Aug 2024 14:33:28 +0200
Subject: [PATCH 23/42] fix(*:skip) Fix Legacy Key Value Format in Ubuntu
Dockerfile (#4090)
Signed-off-by: Robert Steiner
---
src/docker/base/ubuntu/Dockerfile | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/src/docker/base/ubuntu/Dockerfile b/src/docker/base/ubuntu/Dockerfile
index 31cc8381b7c5..643308f324df 100644
--- a/src/docker/base/ubuntu/Dockerfile
+++ b/src/docker/base/ubuntu/Dockerfile
@@ -32,7 +32,7 @@ RUN apt-get update \
# Install PyEnv and Python
ARG PYTHON_VERSION=3.11
ENV PYENV_ROOT=/root/.pyenv
-ENV PATH $PYENV_ROOT/bin:$PATH
+ENV PATH=$PYENV_ROOT/bin:$PATH
# https://github.com/hadolint/hadolint/wiki/DL4006
SHELL ["/bin/bash", "-o", "pipefail", "-c"]
RUN curl -L https://github.com/pyenv/pyenv-installer/raw/master/bin/pyenv-installer | bash
From ea2f4cea47599c2275023feeda9536eeb3ead533 Mon Sep 17 00:00:00 2001
From: "Daniel J. Beutel"
Date: Thu, 29 Aug 2024 16:02:26 +0200
Subject: [PATCH 24/42] refactor(framework:skip) Reorder and remove unused
events (#4105)
---
src/py/flwr/common/telemetry.py | 26 +++++-------------------
src/py/flwr/server/compat/app.py | 5 -----
src/py/flwr/server/driver/grpc_driver.py | 4 +---
3 files changed, 6 insertions(+), 29 deletions(-)
diff --git a/src/py/flwr/common/telemetry.py b/src/py/flwr/common/telemetry.py
index 4c9f53ee1e17..bb5747eca2a6 100644
--- a/src/py/flwr/common/telemetry.py
+++ b/src/py/flwr/common/telemetry.py
@@ -162,34 +162,18 @@ def _generate_next_value_(name: str, start: int, count: int, last_values: List[A
# --- Deployment Engine ------------------------------------------------------------
- # Driver API
- RUN_DRIVER_API_ENTER = auto()
- RUN_DRIVER_API_LEAVE = auto()
-
- # Fleet API
- RUN_FLEET_API_ENTER = auto()
- RUN_FLEET_API_LEAVE = auto()
-
- # Driver API and Fleet API
+ # CLI: `flower-superlink`
RUN_SUPERLINK_ENTER = auto()
RUN_SUPERLINK_LEAVE = auto()
- # Driver: Driver
- DRIVER_CONNECT = auto()
- DRIVER_DISCONNECT = auto()
-
- # Driver: start_driver
- START_DRIVER_ENTER = auto()
- START_DRIVER_LEAVE = auto()
+ # CLI: `flower-supernode`
+ RUN_SUPERNODE_ENTER = auto()
+ RUN_SUPERNODE_LEAVE = auto()
- # flower-server-app
+ # CLI: `flower-server-app`
RUN_SERVER_APP_ENTER = auto()
RUN_SERVER_APP_LEAVE = auto()
- # SuperNode
- RUN_SUPERNODE_ENTER = auto()
- RUN_SUPERNODE_LEAVE = auto()
-
# --- DEPRECATED -------------------------------------------------------------------
# [DEPRECATED] CLI: `flower-client-app`
diff --git a/src/py/flwr/server/compat/app.py b/src/py/flwr/server/compat/app.py
index e978359fa828..1d3e5024ba90 100644
--- a/src/py/flwr/server/compat/app.py
+++ b/src/py/flwr/server/compat/app.py
@@ -18,7 +18,6 @@
from logging import INFO
from typing import Optional
-from flwr.common import EventType, event
from flwr.common.logger import log
from flwr.server.client_manager import ClientManager
from flwr.server.history import History
@@ -65,8 +64,6 @@ def start_driver( # pylint: disable=too-many-arguments, too-many-locals
hist : flwr.server.history.History
Object containing training and evaluation metrics.
"""
- event(EventType.START_DRIVER_ENTER)
-
# Initialize the Driver API server and config
initialized_server, initialized_config = init_defaults(
server=server,
@@ -96,6 +93,4 @@ def start_driver( # pylint: disable=too-many-arguments, too-many-locals
f_stop.set()
thread.join()
- event(EventType.START_SERVER_LEAVE)
-
return hist
diff --git a/src/py/flwr/server/driver/grpc_driver.py b/src/py/flwr/server/driver/grpc_driver.py
index 80ce9623ab3f..ea6d1c9ea3e5 100644
--- a/src/py/flwr/server/driver/grpc_driver.py
+++ b/src/py/flwr/server/driver/grpc_driver.py
@@ -21,7 +21,7 @@
import grpc
-from flwr.common import DEFAULT_TTL, EventType, Message, Metadata, RecordSet, event
+from flwr.common import DEFAULT_TTL, Message, Metadata, RecordSet
from flwr.common.grpc import create_channel
from flwr.common.logger import log
from flwr.common.serde import (
@@ -94,7 +94,6 @@ def _connect(self) -> None:
This will not call GetRun.
"""
- event(EventType.DRIVER_CONNECT)
if self._is_connected:
log(WARNING, "Already connected")
return
@@ -108,7 +107,6 @@ def _connect(self) -> None:
def _disconnect(self) -> None:
"""Disconnect from the Driver API."""
- event(EventType.DRIVER_DISCONNECT)
if not self._is_connected:
log(DEBUG, "Already disconnected")
return
From 5c248330fbada244edbc01bfd9320396fd1a0d8b Mon Sep 17 00:00:00 2001
From: Taner Topal
Date: Thu, 29 Aug 2024 16:18:48 +0200
Subject: [PATCH 25/42] ci(*:skip) Add Robert and Danny as codeowners (#4100)
---
.github/CODEOWNERS | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/.github/CODEOWNERS b/.github/CODEOWNERS
index 5270bf89ae33..ce280c6bd2d4 100644
--- a/.github/CODEOWNERS
+++ b/.github/CODEOWNERS
@@ -27,3 +27,8 @@ README.md @jafermarq @tanertopal @danieljanes
# GitHub Actions and Workflows
/.github/workflows @Robert-Steiner @tanertopal @danieljanes
/.github/actions @Robert-Steiner @tanertopal @danieljanes
+
+# Docker-related files
+/.devcontainer @Robert-Steiner @Moep90
+**/Dockerfile @Robert-Steiner @Moep90
+**/*.Dockerfile @Robert-Steiner @Moep90
From 807d0586317a7baea1964de98249561d6034eee6 Mon Sep 17 00:00:00 2001
From: Daniel Nata Nugraha
Date: Thu, 29 Aug 2024 15:25:20 +0100
Subject: [PATCH 26/42] fix(framework:skip) Allow retry when SuperLink is not
reachable (#4106)
---
.../grpc_rere_client/client_interceptor.py | 17 ++++++++++----
.../client_interceptor_test.py | 23 ++++++++++++++++++-
2 files changed, 35 insertions(+), 5 deletions(-)
diff --git a/src/py/flwr/client/grpc_rere_client/client_interceptor.py b/src/py/flwr/client/grpc_rere_client/client_interceptor.py
index d2dded8a73d9..c16f911eb4c2 100644
--- a/src/py/flwr/client/grpc_rere_client/client_interceptor.py
+++ b/src/py/flwr/client/grpc_rere_client/client_interceptor.py
@@ -17,11 +17,13 @@
import base64
import collections
+from logging import WARNING
from typing import Any, Callable, Optional, Sequence, Tuple, Union
import grpc
from cryptography.hazmat.primitives.asymmetric import ec
+from flwr.common.logger import log
from flwr.common.secure_aggregation.crypto.symmetric_encryption import (
bytes_to_public_key,
compute_hmac,
@@ -151,8 +153,15 @@ def intercept_unary_unary(
server_public_key_bytes = base64.urlsafe_b64decode(
_get_value_from_tuples(_PUBLIC_KEY_HEADER, response.initial_metadata())
)
- self.server_public_key = bytes_to_public_key(server_public_key_bytes)
- self.shared_secret = generate_shared_key(
- self.private_key, self.server_public_key
- )
+
+ if server_public_key_bytes != b"":
+ self.server_public_key = bytes_to_public_key(server_public_key_bytes)
+ else:
+ log(WARNING, "Can't get server public key, SuperLink may be offline")
+
+ if self.server_public_key is not None:
+ self.shared_secret = generate_shared_key(
+ self.private_key, self.server_public_key
+ )
+
return response
diff --git a/src/py/flwr/client/grpc_rere_client/client_interceptor_test.py b/src/py/flwr/client/grpc_rere_client/client_interceptor_test.py
index 79416a8eb31b..155bae202720 100644
--- a/src/py/flwr/client/grpc_rere_client/client_interceptor_test.py
+++ b/src/py/flwr/client/grpc_rere_client/client_interceptor_test.py
@@ -164,7 +164,7 @@ def _init_retry_invoker() -> RetryInvoker:
return RetryInvoker(
wait_gen_factory=exponential,
recoverable_exceptions=grpc.RpcError,
- max_tries=None,
+ max_tries=1,
max_time=None,
on_giveup=lambda retry_state: (
log(
@@ -415,6 +415,27 @@ def test_client_auth_get_run(self) -> None:
assert actual_public_key == expected_public_key
assert actual_hmac == expected_hmac
+ def test_without_servicer(self) -> None:
+ """Test client authentication without servicer."""
+ # Prepare
+ self._server.stop(grace=None)
+ retry_invoker = _init_retry_invoker()
+
+ # Execute and Assert
+ with self._connection(
+ self._address,
+ True,
+ retry_invoker,
+ GRPC_MAX_MESSAGE_LENGTH,
+ None,
+ (self._client_private_key, self._client_public_key),
+ ) as conn:
+ _, _, create_node, _, _, _ = conn
+ assert create_node is not None
+ create_node()
+
+ assert self._servicer.received_client_metadata() is None
+
if __name__ == "__main__":
unittest.main(verbosity=2)
From 9456fbf9636aca5028f77bd63885d2b4c36d71e8 Mon Sep 17 00:00:00 2001
From: "Daniel J. Beutel"
Date: Thu, 29 Aug 2024 16:54:53 +0200
Subject: [PATCH 27/42] feat(framework) Add events for `flower-simulation` and
`run_simulation` (#4107)
Co-authored-by: jafermarq
---
src/py/flwr/common/telemetry.py | 8 +++++++-
src/py/flwr/simulation/run_simulation.py | 25 +++++++++++++++++++-----
2 files changed, 27 insertions(+), 6 deletions(-)
diff --git a/src/py/flwr/common/telemetry.py b/src/py/flwr/common/telemetry.py
index bb5747eca2a6..981cfe79966a 100644
--- a/src/py/flwr/common/telemetry.py
+++ b/src/py/flwr/common/telemetry.py
@@ -158,7 +158,13 @@ def _generate_next_value_(name: str, start: int, count: int, last_values: List[A
# --- Simulation Engine ------------------------------------------------------------
- # Not yet implemented
+ # CLI: flower-simulation
+ CLI_FLOWER_SIMULATION_ENTER = auto()
+ CLI_FLOWER_SIMULATION_LEAVE = auto()
+
+ # Python API: `run_simulation`
+ PYTHON_API_RUN_SIMULATION_ENTER = auto()
+ PYTHON_API_RUN_SIMULATION_LEAVE = auto()
# --- Deployment Engine ------------------------------------------------------------
diff --git a/src/py/flwr/simulation/run_simulation.py b/src/py/flwr/simulation/run_simulation.py
index 1eddd91108d8..38a6ee7d6c14 100644
--- a/src/py/flwr/simulation/run_simulation.py
+++ b/src/py/flwr/simulation/run_simulation.py
@@ -109,6 +109,11 @@ def run_simulation_from_cli() -> None:
"""Run Simulation Engine from the CLI."""
args = _parse_args_run_simulation().parse_args()
+ event(
+ EventType.CLI_FLOWER_SIMULATION_ENTER,
+ event_details={"backend": args.backend, "num-supernodes": args.num_supernodes},
+ )
+
# Add warnings for deprecated server_app and client_app arguments
if args.server_app:
warn_deprecated_feature(
@@ -214,6 +219,7 @@ def run_simulation_from_cli() -> None:
verbose_logging=args.verbose,
server_app_run_config=fused_config,
is_app=is_app,
+ exit_event=EventType.CLI_FLOWER_SIMULATION_LEAVE,
)
@@ -267,6 +273,11 @@ def run_simulation(
When disabled, only INFO, WARNING and ERROR log messages will be shown. If
enabled, DEBUG-level logs will be displayed.
"""
+ event(
+ EventType.PYTHON_API_RUN_SIMULATION_ENTER,
+ event_details={"backend": backend_name, "num-supernodes": num_supernodes},
+ )
+
if enable_tf_gpu_growth:
warn_deprecated_feature_with_example(
"Passing `enable_tf_gpu_growth=True` is deprecated.",
@@ -284,6 +295,7 @@ def run_simulation(
backend_config=backend_config,
enable_tf_gpu_growth=enable_tf_gpu_growth,
verbose_logging=verbose_logging,
+ exit_event=EventType.PYTHON_API_RUN_SIMULATION_LEAVE,
)
@@ -367,6 +379,7 @@ def _main_loop(
is_app: bool,
enable_tf_gpu_growth: bool,
run: Run,
+ exit_event: EventType,
flwr_dir: Optional[str] = None,
client_app: Optional[ClientApp] = None,
client_app_attr: Optional[str] = None,
@@ -374,7 +387,7 @@ def _main_loop(
server_app_attr: Optional[str] = None,
server_app_run_config: Optional[UserConfig] = None,
) -> None:
- """Launch SuperLink with Simulation Engine, then ServerApp on a separate thread."""
+ """Start ServerApp on a separate thread, then launch Simulation Engine."""
# Initialize StateFactory
state_factory = StateFactory(":flwr-in-memory-state:")
@@ -382,6 +395,7 @@ def _main_loop(
# A Threading event to indicate if an exception was raised in the ServerApp thread
server_app_thread_has_exception = threading.Event()
serverapp_th = None
+ success = True
try:
# Register run
log(DEBUG, "Pre-registering run with id %s", run.run_id)
@@ -405,8 +419,7 @@ def _main_loop(
enable_tf_gpu_growth=enable_tf_gpu_growth,
)
- # SuperLink with Simulation Engine
- event(EventType.RUN_SUPERLINK_ENTER)
+ # Start Simulation Engine
vce.start_vce(
num_supernodes=num_supernodes,
client_app_attr=client_app_attr,
@@ -424,13 +437,13 @@ def _main_loop(
except Exception as ex:
log(ERROR, "An exception occurred !! %s", ex)
log(ERROR, traceback.format_exc())
+ success = False
raise RuntimeError("An error was encountered. Ending simulation.") from ex
finally:
# Trigger stop event
f_stop.set()
-
- event(EventType.RUN_SUPERLINK_LEAVE)
+ event(exit_event, event_details={"success": success})
if serverapp_th:
serverapp_th.join()
if server_app_thread_has_exception.is_set():
@@ -442,6 +455,7 @@ def _main_loop(
# pylint: disable=too-many-arguments,too-many-locals
def _run_simulation(
num_supernodes: int,
+ exit_event: EventType,
client_app: Optional[ClientApp] = None,
server_app: Optional[ServerApp] = None,
backend_name: str = "ray",
@@ -508,6 +522,7 @@ def _run_simulation(
is_app,
enable_tf_gpu_growth,
run,
+ exit_event,
flwr_dir,
client_app,
client_app_attr,
From 4b59fe5d2e416f909103d88f87d27db180eb2314 Mon Sep 17 00:00:00 2001
From: Robert Steiner
Date: Thu, 29 Aug 2024 17:01:34 +0200
Subject: [PATCH 28/42] refactor(*:skip) Keep version of setuptools and pip in
sync (#4093)
Signed-off-by: Robert Steiner
---
src/docker/base/alpine/Dockerfile | 8 +++++---
src/docker/base/ubuntu/Dockerfile | 10 +++++-----
2 files changed, 10 insertions(+), 8 deletions(-)
diff --git a/src/docker/base/alpine/Dockerfile b/src/docker/base/alpine/Dockerfile
index 441e0fdd9b85..3e6a246e53c1 100644
--- a/src/docker/base/alpine/Dockerfile
+++ b/src/docker/base/alpine/Dockerfile
@@ -51,9 +51,11 @@ RUN pip install -U --no-cache-dir \
FROM python:${PYTHON_VERSION}-${DISTRO}${DISTRO_VERSION} AS base
-# Upgrade system Python pip and setuptools
-# hadolint ignore=DL3013
-RUN pip install -U --no-cache-dir pip setuptools
+# Keep the version of system Python pip and setuptools in sync with those installed in the
+# virtualenv.
+ARG PIP_VERSION
+ARG SETUPTOOLS_VERSION
+RUN pip install -U --no-cache-dir pip==${PIP_VERSION} setuptools==${SETUPTOOLS_VERSION}
# required by the grpc package
RUN apk add --no-cache \
diff --git a/src/docker/base/ubuntu/Dockerfile b/src/docker/base/ubuntu/Dockerfile
index 643308f324df..ddc662a0ae98 100644
--- a/src/docker/base/ubuntu/Dockerfile
+++ b/src/docker/base/ubuntu/Dockerfile
@@ -50,16 +50,16 @@ RUN LATEST=$(pyenv latest -k ${PYTHON_VERSION}) \
ENV PATH=/usr/local/bin/python/bin:$PATH
-# Upgrade system Python pip and setuptools
-# hadolint ignore=DL3013
-RUN pip install -U --no-cache-dir pip setuptools \
+ARG PIP_VERSION
+ARG SETUPTOOLS_VERSION
+# Keep the version of system Python pip and setuptools in sync with those installed in the
+# virtualenv.
+RUN pip install -U --no-cache-dir pip==${PIP_VERSION} setuptools==${SETUPTOOLS_VERSION} \
# Use a virtual environment to ensure that Python packages are installed in the same location
# regardless of whether the subsequent image build is run with the app or the root user
&& python -m venv /python/venv
ENV PATH=/python/venv/bin:$PATH
-ARG PIP_VERSION
-ARG SETUPTOOLS_VERSION
ARG FLWR_VERSION
ARG FLWR_PACKAGE=flwr
RUN pip install -U --no-cache-dir \
From be870143eee648e98b3c691e949c5b4187b29dc5 Mon Sep 17 00:00:00 2001
From: "Daniel J. Beutel"
Date: Thu, 29 Aug 2024 17:21:28 +0200
Subject: [PATCH 29/42] break(framework) Remove CLI entry points exports
(#4101)
---
pyproject.toml | 10 +++++-----
src/py/flwr/client/__init__.py | 4 ----
src/py/flwr/server/__init__.py | 4 ----
src/py/flwr/superexec/__init__.py | 6 ------
4 files changed, 5 insertions(+), 19 deletions(-)
diff --git a/pyproject.toml b/pyproject.toml
index 6c974304e785..91b518af5e03 100644
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -55,15 +55,15 @@ exclude = [
# `flwr` CLI
flwr = "flwr.cli.app:app"
# SuperExec (can run with either Deployment Engine or Simulation Engine)
-flower-superexec = "flwr.superexec:run_superexec"
+flower-superexec = "flwr.superexec.app:run_superexec"
# Simulation Engine
flower-simulation = "flwr.simulation.run_simulation:run_simulation_from_cli"
# Deployment Engine
-flower-superlink = "flwr.server:run_superlink"
-flower-supernode = "flwr.client:run_supernode"
-flower-server-app = "flwr.server:run_server_app"
+flower-superlink = "flwr.server.app:run_superlink"
+flower-supernode = "flwr.client.supernode.app:run_supernode"
+flower-server-app = "flwr.server.run_serverapp:run_server_app"
flwr-clientapp = "flwr.client.clientapp:flwr_clientapp"
-flower-client-app = "flwr.client:run_client_app" # Deprecated, use `flower-supernode`
+flower-client-app = "flwr.client.supernode:run_client_app" # Deprecated
[tool.poetry.dependencies]
python = "^3.8"
diff --git a/src/py/flwr/client/__init__.py b/src/py/flwr/client/__init__.py
index 218f2fe20d62..dce3be9036bb 100644
--- a/src/py/flwr/client/__init__.py
+++ b/src/py/flwr/client/__init__.py
@@ -20,8 +20,6 @@
from .client import Client as Client
from .client_app import ClientApp as ClientApp
from .numpy_client import NumPyClient as NumPyClient
-from .supernode import run_client_app as run_client_app
-from .supernode import run_supernode as run_supernode
from .typing import ClientFn as ClientFn
from .typing import ClientFnExt as ClientFnExt
@@ -32,8 +30,6 @@
"ClientFnExt",
"NumPyClient",
"mod",
- "run_client_app",
- "run_supernode",
"start_client",
"start_numpy_client",
]
diff --git a/src/py/flwr/server/__init__.py b/src/py/flwr/server/__init__.py
index 896b46298327..1dde95b6b047 100644
--- a/src/py/flwr/server/__init__.py
+++ b/src/py/flwr/server/__init__.py
@@ -17,14 +17,12 @@
from . import strategy
from . import workflow as workflow
-from .app import run_superlink as run_superlink
from .app import start_server as start_server
from .client_manager import ClientManager as ClientManager
from .client_manager import SimpleClientManager as SimpleClientManager
from .compat import LegacyContext as LegacyContext
from .driver import Driver as Driver
from .history import History as History
-from .run_serverapp import run_server_app as run_server_app
from .server import Server as Server
from .server_app import ServerApp as ServerApp
from .server_config import ServerConfig as ServerConfig
@@ -40,8 +38,6 @@
"ServerAppComponents",
"ServerConfig",
"SimpleClientManager",
- "run_server_app",
- "run_superlink",
"start_server",
"strategy",
"workflow",
diff --git a/src/py/flwr/superexec/__init__.py b/src/py/flwr/superexec/__init__.py
index a510c41f4182..0584ca663a02 100644
--- a/src/py/flwr/superexec/__init__.py
+++ b/src/py/flwr/superexec/__init__.py
@@ -13,9 +13,3 @@
# limitations under the License.
# ==============================================================================
"""Flower SuperExec service."""
-
-from .app import run_superexec as run_superexec
-
-__all__ = [
- "run_superexec",
-]
From 01d865e76cdc398c26c54a2fa5caf405fb045179 Mon Sep 17 00:00:00 2001
From: Robert Steiner
Date: Thu, 29 Aug 2024 17:37:37 +0200
Subject: [PATCH 30/42] feat(framework:skip) Move Docker Hub READMEs to flower
repository (#4094)
Signed-off-by: Robert Steiner
Co-authored-by: Taner Topal
---
.github/workflows/docker-readme.yml | 51 +++++++++++++++++++++++++++++
src/docker/base/README.md | 38 +++++++++++++++++++++
src/docker/clientapp/README.md | 22 +++++++++++++
src/docker/serverapp/README.md | 34 +++++++++++++++++++
src/docker/superexec/README.md | 26 +++++++++++++++
src/docker/superlink/README.md | 28 ++++++++++++++++
src/docker/supernode/README.md | 30 +++++++++++++++++
7 files changed, 229 insertions(+)
create mode 100644 .github/workflows/docker-readme.yml
create mode 100644 src/docker/base/README.md
create mode 100644 src/docker/clientapp/README.md
create mode 100644 src/docker/serverapp/README.md
create mode 100644 src/docker/superexec/README.md
create mode 100644 src/docker/superlink/README.md
create mode 100644 src/docker/supernode/README.md
diff --git a/.github/workflows/docker-readme.yml b/.github/workflows/docker-readme.yml
new file mode 100644
index 000000000000..29dd787d638e
--- /dev/null
+++ b/.github/workflows/docker-readme.yml
@@ -0,0 +1,51 @@
+name: Update Docker READMEs
+
+on:
+ push:
+ branches:
+ - 'main'
+ paths:
+ - 'src/docker/**/README.md'
+
+jobs:
+ collect:
+ if: ${{ github.repository == 'adap/flower' }}
+ name: Collect Docker READMEs
+ runs-on: ubuntu-22.04
+ timeout-minutes: 10
+ outputs:
+ readme_files: ${{ steps.filter.outputs.readme_files }}
+ steps:
+ - uses: actions/checkout@b4ffde65f46336ab88eb53be808477a3936bae11 # v4.1.1
+
+ - uses: dorny/paths-filter@de90cc6fb38fc0963ad72b210f1f284cd68cea36 # v3.0.2
+ id: filter
+ with:
+ list-files: "json"
+ filters: |
+ readme:
+ - 'src/docker/**/README.md'
+
+ update:
+ if: ${{ needs.collect.outputs.readme_files != '' && toJson(fromJson(needs.collect.outputs.readme_files)) != '[]' }}
+ name: Update Docker READMEs
+ runs-on: ubuntu-22.04
+ timeout-minutes: 10
+ needs: collect
+ strategy:
+ matrix:
+ readme_path: ${{ fromJSON(needs.collect.outputs.readme_files) }}
+
+ steps:
+ - uses: actions/checkout@b4ffde65f46336ab88eb53be808477a3936bae11 # v4.1.1
+
+ - id: repository
+ run: echo "name=$(basename $(dirname ${{ matrix.readme_path }}))" >> "$GITHUB_OUTPUT"
+
+ - name: Docker Hub Description
+ uses: peter-evans/dockerhub-description@e98e4d1628a5f3be2be7c231e50981aee98723ae # v4.0.0
+ with:
+ repository: flwr/${{ steps.repository.outputs.name }}
+ readme-filepath: ${{ matrix.readme_path }}
+ username: ${{ secrets.DOCKERHUB_USERNAME }}
+ password: ${{ secrets.DOCKERHUB_TOKEN }}
diff --git a/src/docker/base/README.md b/src/docker/base/README.md
new file mode 100644
index 000000000000..e61899525f19
--- /dev/null
+++ b/src/docker/base/README.md
@@ -0,0 +1,38 @@
+# Flower Base
+
+
+
+
+
+
+
+## Quick reference
+
+- **Learn more:**
+ [Flower Docs](https://flower.ai/docs/framework/how-to-run-flower-using-docker.html)
+
+- **Where to get help:**
+ [Flower Discuss](https://discuss.flower.ai), [Slack](https://flower.ai/join-slack) or [GitHub](https://github.com/adap/flower)
+
+- **Supported architectures:**
+ `amd64`, `arm64v8`
+
+## Supported tags
+
+- `nightly`, `.dev` e.g. `1.11.0.dev20240724`
+ - nightly image uses Python 3.11 and Ubuntu 22.04
+- `1.10.0-py3.11-alpine3.19`
+- `1.10.0-py3.11-ubuntu22.04`
+- `1.10.0-py3.10-ubuntu22.04`
+- `1.10.0-py3.9-ubuntu22.04`
+- `1.10.0-py3.8-ubuntu22.04`
+- `1.9.0-py3.11-alpine3.19`
+- `1.9.0-py3.11-ubuntu22.04`
+- `1.9.0-py3.10-ubuntu22.04`
+- `1.9.0-py3.9-ubuntu22.04`
+- `1.9.0-py3.8-ubuntu22.04`
+- `1.8.0-py3.11-alpine3.19`
+- `1.8.0-py3.11-ubuntu22.04`
+- `1.8.0-py3.10-ubuntu22.04`
+- `1.8.0-py3.9-ubuntu22.04`
+- `1.8.0-py3.8-ubuntu22.04`
diff --git a/src/docker/clientapp/README.md b/src/docker/clientapp/README.md
new file mode 100644
index 000000000000..ac50d4dc9b8f
--- /dev/null
+++ b/src/docker/clientapp/README.md
@@ -0,0 +1,22 @@
+# Flower ClientApp
+
+
+
+
+
+
+
+## Quick reference
+
+- **Learn more:**
+ [Flower Docs](https://flower.ai/docs/framework/how-to-run-flower-using-docker.html)
+
+- **Where to get help:**
+ [Flower Discuss](https://discuss.flower.ai), [Slack](https://flower.ai/join-slack) or [GitHub](https://github.com/adap/flower)
+
+- **Supported architectures:**
+ `amd64`, `arm64v8`
+
+## Supported tags
+
+- `nightly`, `.dev` e.g. `1.11.0.dev20240724`
diff --git a/src/docker/serverapp/README.md b/src/docker/serverapp/README.md
new file mode 100644
index 000000000000..c4283fa51f00
--- /dev/null
+++ b/src/docker/serverapp/README.md
@@ -0,0 +1,34 @@
+# Flower ServerApp
+
+
+
+
+
+
+
+## Quick reference
+
+- **Learn more:**
+ [Flower Docs](https://flower.ai/docs/framework/how-to-run-flower-using-docker.html)
+
+- **Where to get help:**
+ [Flower Discuss](https://discuss.flower.ai), [Slack](https://flower.ai/join-slack) or [GitHub](https://github.com/adap/flower)
+
+- **Supported architectures:**
+ `amd64`, `arm64v8`
+
+## Supported tags
+
+- `nightly`, `.dev` e.g. `1.11.0.dev20240724`
+- `1.10.0`, `1.10.0-py3.11-ubuntu22.04`
+- `1.10.0-py3.10-ubuntu22.04`
+- `1.10.0-py3.9-ubuntu22.04`
+- `1.10.0-py3.8-ubuntu22.04`
+- `1.9.0`, `1.9.0-py3.11-ubuntu22.04`
+- `1.9.0-py3.10-ubuntu22.04`
+- `1.9.0-py3.9-ubuntu22.04`
+- `1.9.0-py3.8-ubuntu22.04`
+- `1.8.0`, `1.8.0-py3.11-ubuntu22.04`
+- `1.8.0-py3.10-ubuntu22.04`
+- `1.8.0-py3.9-ubuntu22.04`
+- `1.8.0-py3.8-ubuntu22.04`
diff --git a/src/docker/superexec/README.md b/src/docker/superexec/README.md
new file mode 100644
index 000000000000..03dcc2cba5c9
--- /dev/null
+++ b/src/docker/superexec/README.md
@@ -0,0 +1,26 @@
+# Flower SuperExec
+
+
+
+
+
+
+
+## Quick reference
+
+- **Learn more:**
+ [Flower Docs](https://flower.ai/docs/framework/how-to-run-flower-using-docker.html)
+
+- **Where to get help:**
+ [Flower Discuss](https://discuss.flower.ai), [Slack](https://flower.ai/join-slack) or [GitHub](https://github.com/adap/flower)
+
+- **Supported architectures:**
+ `amd64`, `arm64v8`
+
+## Supported tags
+
+- `nightly`, `.dev` e.g. `1.11.0.dev20240724`
+- `1.10.0`, `1.10.0-py3.11-ubuntu22.04`
+- `1.10.0-py3.10-ubuntu22.04`
+- `1.10.0-py3.9-ubuntu22.04`
+- `1.10.0-py3.8-ubuntu22.04`
diff --git a/src/docker/superlink/README.md b/src/docker/superlink/README.md
new file mode 100644
index 000000000000..3da3c16909b8
--- /dev/null
+++ b/src/docker/superlink/README.md
@@ -0,0 +1,28 @@
+# Flower SuperLink
+
+
+
+
+
+
+
+## Quick reference
+
+- **Learn more:**
+ [Flower Docs](https://flower.ai/docs/framework/how-to-run-flower-using-docker.html)
+
+- **Where to get help:**
+ [Flower Discuss](https://discuss.flower.ai), [Slack](https://flower.ai/join-slack) or [GitHub](https://github.com/adap/flower)
+
+- **Supported architectures:**
+ `amd64`, `arm64v8`
+
+## Supported tags
+
+- `nightly`, `.dev` e.g. `1.11.0.dev20240724`
+- `1.10.0`, `1.10.0-py3.11-alpine3.19`
+- `1.10.0-py3.11-ubuntu22.04`
+- `1.9.0`, `1.9.0-py3.11-alpine3.19`
+- `1.9.0-py3.11-ubuntu22.04`
+- `1.8.0`, `1.8.0-py3.11-alpine3.19`
+- `1.8.0-py3.11-ubuntu22.04`
diff --git a/src/docker/supernode/README.md b/src/docker/supernode/README.md
new file mode 100644
index 000000000000..defee36b35ae
--- /dev/null
+++ b/src/docker/supernode/README.md
@@ -0,0 +1,30 @@
+# Flower SuperNode
+
+
+
+
+
+
+
+## Quick reference
+
+- **Learn more:**
+ [Flower Docs](https://flower.ai/docs/framework/how-to-run-flower-using-docker.html)
+
+- **Where to get help:**
+ [Flower Discuss](https://discuss.flower.ai), [Slack](https://flower.ai/join-slack) or [GitHub](https://github.com/adap/flower)
+
+- **Supported architectures:**
+ `amd64`, `arm64v8`
+
+## Supported tags
+
+- `nightly`, `.dev` e.g. `1.11.0.dev20240724`
+- `1.10.0`, `1.10.0-py3.11-ubuntu22.04`
+- `1.10.0-py3.10-ubuntu22.04`
+- `1.10.0-py3.9-ubuntu22.04`
+- `1.10.0-py3.8-ubuntu22.04`
+- `1.9.0`, `1.9.0-py3.11-ubuntu22.04`
+- `1.9.0-py3.10-ubuntu22.04`
+- `1.9.0-py3.9-ubuntu22.04`
+- `1.9.0-py3.8-ubuntu22.04`
From 6d617154b99c9fab2c429259ad88ce3ac7008753 Mon Sep 17 00:00:00 2001
From: "Daniel J. Beutel"
Date: Thu, 29 Aug 2024 22:15:45 +0200
Subject: [PATCH 31/42] docs(framework) Remove redundant JAX page (#4108)
---
doc/source/conf.py | 1 +
...mple-jax-from-centralized-to-federated.rst | 282 ------------------
doc/source/index.rst | 1 -
3 files changed, 1 insertion(+), 283 deletions(-)
delete mode 100644 doc/source/example-jax-from-centralized-to-federated.rst
diff --git a/doc/source/conf.py b/doc/source/conf.py
index d3881325a5ce..de475748abb1 100644
--- a/doc/source/conf.py
+++ b/doc/source/conf.py
@@ -264,6 +264,7 @@ def find_test_modules(package_path):
"example-mxnet-walk-through": "index.html",
"ref-api/flwr.simulation.run_simulation_from_cli": "index.html",
"contributor-how-to-create-new-messages": "index.html",
+ "example-jax-from-centralized-to-federated": "tutorial-quickstart-jax.html",
}
# -- Options for HTML output -------------------------------------------------
diff --git a/doc/source/example-jax-from-centralized-to-federated.rst b/doc/source/example-jax-from-centralized-to-federated.rst
deleted file mode 100644
index 6b06a288a67a..000000000000
--- a/doc/source/example-jax-from-centralized-to-federated.rst
+++ /dev/null
@@ -1,282 +0,0 @@
-Example: JAX - Run JAX Federated
-================================
-
-This tutorial will show you how to use Flower to build a federated version of an existing JAX workload.
-We are using JAX to train a linear regression model on a scikit-learn dataset.
-We will structure the example similarly to our `PyTorch - From Centralized To Federated `_ walkthrough.
-First, we build a centralized training approach based on the `Linear Regression with JAX `_ tutorial.
-Then, we build upon the centralized training code to run the training in a federated fashion.
-
-Before we start building our JAX example, we need to install the packages :code:`jax`, :code:`jaxlib`, :code:`scikit-learn`, and :code:`flwr`:
-
-.. code-block:: shell
-
- $ pip install jax jaxlib scikit-learn flwr
-
-
-Linear Regression with JAX
---------------------------
-
-We begin with a brief description of the centralized training code based on a :code:`Linear Regression` model.
-If you want a more in-depth explanation of what's going on, have a look at the official `JAX documentation `_.
-
-Let's create a new file called :code:`jax_training.py` with all the components required for a traditional (centralized) linear regression training.
-First, the JAX packages :code:`jax` and :code:`jaxlib` need to be imported. In addition, we need to import :code:`numpy` and :code:`sklearn` since we use :code:`make_regression` for the dataset and :code:`train_test_split` to split the dataset into a training and test set.
-You can see that we do not yet import the :code:`flwr` package for federated learning. This will be done later.
-
-.. code-block:: python
-
- from typing import Dict, List, Tuple, Callable
- import jax
- import jax.numpy as jnp
- import numpy as np
- from sklearn.datasets import make_regression
- from sklearn.model_selection import train_test_split
-
- key = jax.random.PRNGKey(0)
-
-The :code:`load_data()` function loads the mentioned training and test sets.
-
-.. code-block:: python
-
- def load_data() -> Tuple[List[np.ndarray], List[np.ndarray], List[np.ndarray], List[np.ndarray]]:
- # create our dataset and start with similar datasets for different clients
- X, y = make_regression(n_features=3, random_state=0)
- X, X_test, y, y_test = train_test_split(X, y)
- return X, y, X_test, y_test
-
-The model architecture (a very simple :code:`Linear Regression` model) is defined in :code:`load_model()`.
-
-.. code-block:: python
-
- def load_model(model_shape) -> Dict:
- # model weights
- params = {
- 'b' : jax.random.uniform(key),
- 'w' : jax.random.uniform(key, model_shape)
- }
- return params
-
-We now need to define the training function, :code:`train()`, which loops over the training set and measures the loss via :code:`loss_fn()` for each batch of training examples. The loss function is kept separate because JAX takes derivatives with a :code:`grad()` function (defined in the :code:`main()` function and called in :code:`train()`).
-
-.. code-block:: python
-
- def loss_fn(params, X, y) -> jnp.ndarray:
- err = jnp.dot(X, params['w']) + params['b'] - y
- return jnp.mean(jnp.square(err)) # mse
-
- def train(params, grad_fn, X, y) -> Tuple[Dict, float, int]:
-     num_examples = X.shape[0]
-     for epoch in range(10):
-         grads = grad_fn(params, X, y)
-         params = jax.tree_util.tree_map(lambda p, g: p - 0.05 * g, params, grads)
-         loss = loss_fn(params, X, y)
-         # if epoch % 10 == 9:
-         #     print(f'For epoch {epoch} loss {loss}')
-     return params, loss, num_examples
-
-The evaluation of the model is defined in the function :code:`evaluation()`. The function takes all test examples and measures the loss of the linear regression model.
-
-.. code-block:: python
-
- def evaluation(params, grad_fn, X_test, y_test) -> Tuple[float, int]:
-     num_examples = X_test.shape[0]
-     # loss_fn already returns the mean squared error
-     loss_test = loss_fn(params, X_test, y_test)
-     # print(f'Test loss {loss_test}')
-     return loss_test, num_examples
-
-Having defined the data loading, model architecture, training, and evaluation, we can put everything together and train our model using JAX. As already mentioned, the :code:`jax.grad()` function is defined in :code:`main()` and passed to :code:`train()`.
-
-.. code-block:: python
-
- def main():
- X, y, X_test, y_test = load_data()
- model_shape = X.shape[1:]
- grad_fn = jax.grad(loss_fn)
- print("Model Shape", model_shape)
- params = load_model(model_shape)
- params, loss, num_examples = train(params, grad_fn, X, y)
- evaluation(params, grad_fn, X_test, y_test)
-
-
- if __name__ == "__main__":
- main()
-
-You can now run your (centralized) JAX linear regression workload:
-
-.. code-block:: shell
-
- $ python3 jax_training.py
-
-So far this should all look fairly familiar if you've used JAX before.
-Let's take the next step and use what we've built to create a simple federated learning system consisting of one server and two clients.
-
-JAX meets Flower
-----------------
-
-The concept of federating an existing workload is always the same and easy to understand.
-We have to start a *server* and then use the code in :code:`jax_training.py` for the *clients* that are connected to the *server*.
-The *server* sends model parameters to the clients. The *clients* run the training and update the parameters.
-The updated parameters are sent back to the *server*, which averages all received parameter updates.
-This describes one round of the federated learning process, and we repeat this for multiple rounds.
-
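-To make the averaging step concrete, here is a minimal sketch of the idea behind it
-(a plain weighted average over NumPy arrays, with weights given by the number of
-training examples each client used; an illustration, not Flower's internal
-implementation):
-
-.. code-block:: python
-
- def average_updates(updates, num_examples):
-     # updates: one list of NumPy ndarrays per client
-     # num_examples: number of training examples each client trained on
-     total = sum(num_examples)
-     return [
-         sum(w * n for w, n in zip(layer, num_examples)) / total
-         for layer in zip(*updates)
-     ]
-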
-Our example consists of one *server* and two *clients*. Let's set up :code:`server.py` first. The *server* needs to import the Flower package :code:`flwr`.
-Next, we use the :code:`start_server` function to start a server and tell it to perform three rounds of federated learning.
-
-.. code-block:: python
-
- import flwr as fl
-
- if __name__ == "__main__":
- fl.server.start_server(server_address="0.0.0.0:8080", config=fl.server.ServerConfig(num_rounds=3))
-
-We can already start the *server*:
-
-.. code-block:: shell
-
- $ python3 server.py
-
-Finally, we will define our *client* logic in :code:`client.py` and build upon the previously defined JAX training in :code:`jax_training.py`.
-Our *client* needs to import :code:`flwr`, but also :code:`jax` and :code:`jaxlib` to update the parameters on our JAX model:
-
-.. code-block:: python
-
- from typing import Dict, List, Callable, Tuple
-
- import flwr as fl
- import numpy as np
- import jax
- import jax.numpy as jnp
-
- import jax_training
-
-
-Implementing a Flower *client* basically means implementing a subclass of either :code:`flwr.client.Client` or :code:`flwr.client.NumPyClient`.
-Our implementation will be based on :code:`flwr.client.NumPyClient` and we'll call it :code:`FlowerClient`.
-:code:`NumPyClient` is slightly easier to implement than :code:`Client` if you use a framework with good NumPy interoperability (like JAX) because it avoids some of the boilerplate that would otherwise be necessary.
-:code:`FlowerClient` needs to implement four methods: two for getting/setting model parameters, one for training the model, and one for testing the model:
-
-#. :code:`set_parameters` (optional)
- * set the model parameters on the local model that are received from the server
- * transform parameters to NumPy :code:`ndarray`'s
- * loop over the list of model parameters received as NumPy :code:`ndarray`'s (think list of neural network layers)
-#. :code:`get_parameters`
- * get the model parameters and return them as a list of NumPy :code:`ndarray`'s (which is what :code:`flwr.client.NumPyClient` expects)
-#. :code:`fit`
- * update the parameters of the local model with the parameters received from the server
- * train the model on the local training set
- * get the updated local model parameters and return them to the server
-#. :code:`evaluate`
- * update the parameters of the local model with the parameters received from the server
- * evaluate the updated model on the local test set
- * return the local loss to the server
-
-The challenging part is to transform the JAX model parameters from :code:`DeviceArray` to NumPy :code:`ndarray` to make them compatible with :code:`NumPyClient`.
-
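-For illustration, a dict of JAX arrays can be flattened into the list of NumPy
-:code:`ndarray`'s that :code:`NumPyClient` expects, and restored again, roughly like
-this (a sketch of the same conversion that the :code:`FlowerClient` below performs in
-:code:`get_parameters` and :code:`set_parameters`):
-
-.. code-block:: python
-
- import numpy as np
-
- def params_to_ndarrays(params):
-     # Convert each JAX array in the parameter dict to a NumPy ndarray
-     return [np.asarray(value) for value in params.values()]
-
- def ndarrays_to_params(params, arrays):
-     # Write the received ndarrays back into the parameter dict
-     for key, array in zip(params.keys(), arrays):
-         params[key] = array
-     return params
-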
-The two :code:`NumPyClient` methods :code:`fit` and :code:`evaluate` make use of the functions :code:`train()` and :code:`evaluate()` previously defined in :code:`jax_training.py`.
-So what we really do here is tell Flower, through our :code:`NumPyClient` subclass, which of our already defined functions to call for training and evaluation.
-We included type annotations to give you a better understanding of the data types that get passed around.
-
-.. code-block:: python
-
-
- class FlowerClient(fl.client.NumPyClient):
- """Flower client implementing using linear regression and JAX."""
-
- def __init__(
- self,
- params: Dict,
- grad_fn: Callable,
- train_x: List[np.ndarray],
- train_y: List[np.ndarray],
- test_x: List[np.ndarray],
- test_y: List[np.ndarray],
- ) -> None:
- self.params = params
- self.grad_fn = grad_fn
- self.train_x = train_x
- self.train_y = train_y
- self.test_x = test_x
- self.test_y = test_y
-
- def get_parameters(self, config) -> Dict:
- # Return model parameters as a list of NumPy ndarrays
- parameter_value = []
- for _, val in self.params.items():
- parameter_value.append(np.array(val))
- return parameter_value
-
- def set_parameters(self, parameters: List[np.ndarray]) -> Dict:
-     # Collect model parameters and update the parameters of the local model
-     for key, value in zip(self.params.keys(), parameters):
-         self.params[key] = value
-     return self.params
-
-
- def fit(
- self, parameters: List[np.ndarray], config: Dict
- ) -> Tuple[List[np.ndarray], int, Dict]:
- # Set model parameters, train model, return updated model parameters
- print("Start local training")
- self.params = self.set_parameters(parameters)
- self.params, loss, num_examples = jax_training.train(self.params, self.grad_fn, self.train_x, self.train_y)
- results = {"loss": float(loss)}
- print("Training results", results)
- return self.get_parameters(config={}), num_examples, results
-
- def evaluate(
- self, parameters: List[np.ndarray], config: Dict
- ) -> Tuple[float, int, Dict]:
- # Set model parameters, evaluate the model on a local test dataset, return result
- print("Start evaluation")
- self.params = self.set_parameters(parameters)
- loss, num_examples = jax_training.evaluation(self.params, self.grad_fn, self.test_x, self.test_y)
- print("Evaluation loss", loss)
- return (
- float(loss),
- num_examples,
- {"loss": float(loss)},
- )
-
-Having defined the federation process, we can run it.
-
-.. code-block:: python
-
- def main() -> None:
- """Load data, start MNISTClient."""
-
- # Load data
- train_x, train_y, test_x, test_y = jax_training.load_data()
- grad_fn = jax.grad(jax_training.loss_fn)
-
- # Load model (from centralized training) and initialize parameters
- model_shape = train_x.shape[1:]
- params = jax_training.load_model(model_shape)
-
- # Start Flower client
- client = FlowerClient(params, grad_fn, train_x, train_y, test_x, test_y)
- fl.client.start_client(server_address="0.0.0.0:8080", client=client.to_client())
-
- if __name__ == "__main__":
- main()
-
-
-And that's it. You can now open two additional terminal windows and run
-
-.. code-block:: shell
-
- $ python3 client.py
-
-in each window (make sure that the server is still running before you do so) and see your JAX project run federated learning across two clients. Congratulations!
-
-Next Steps
-----------
-
-The source code of this example was improved over time and can be found here: `Quickstart JAX `_.
-Our example is somewhat over-simplified because both clients load the same dataset.
-
-You're now prepared to explore this topic further. How about using a more sophisticated model or using a different dataset? How about adding more clients?
diff --git a/doc/source/index.rst b/doc/source/index.rst
index 2a34693f7b26..4f6ad705e9bc 100644
--- a/doc/source/index.rst
+++ b/doc/source/index.rst
@@ -102,7 +102,6 @@ Problem-oriented how-to guides show step-by-step how to achieve a specific goal.
:caption: Legacy example guides
example-pytorch-from-centralized-to-federated
- example-jax-from-centralized-to-federated
example-fedbn-pytorch-from-centralized-to-federated
Explanations
From 0f7bbe119e94146a07ede813a2a50f7e903b7878 Mon Sep 17 00:00:00 2001
From: "Daniel J. Beutel"
Date: Fri, 30 Aug 2024 14:13:11 +0200
Subject: [PATCH 32/42] docs(framework) Update changelog for Flower 1.11.0
(#4110)
---
doc/source/ref-changelog.md | 104 +++++++++++++++++++++++++++++++++++-
1 file changed, 103 insertions(+), 1 deletion(-)
diff --git a/doc/source/ref-changelog.md b/doc/source/ref-changelog.md
index 531afb9ada52..7fcea7edc729 100644
--- a/doc/source/ref-changelog.md
+++ b/doc/source/ref-changelog.md
@@ -1,6 +1,108 @@
# Changelog
-## Unreleased
+## v1.11.0 (2024-08-30)
+
+### Thanks to our contributors
+
+We would like to give our special thanks to all the contributors who made the new version of Flower possible (in `git shortlog` order):
+
+`Adam Narozniak`, `Charles Beauville`, `Chong Shen Ng`, `Daniel J. Beutel`, `Daniel Nata Nugraha`, `Danny`, `Edoardo Gabrielli`, `Heng Pan`, `Javier`, `Meng Yan`, `Michal Danilowski`, `Mohammad Naseri`, `Robert Steiner`, `Steve Laskaridis`, `Taner Topal`, `Yan Gao`
+
+### What's new?
+
+- **Deliver Flower App Bundle (FAB) to SuperLink and SuperNodes** ([#4006](https://github.com/adap/flower/pull/4006), [#3945](https://github.com/adap/flower/pull/3945), [#3999](https://github.com/adap/flower/pull/3999), [#4027](https://github.com/adap/flower/pull/4027), [#3851](https://github.com/adap/flower/pull/3851), [#3946](https://github.com/adap/flower/pull/3946), [#4003](https://github.com/adap/flower/pull/4003), [#4029](https://github.com/adap/flower/pull/4029), [#3942](https://github.com/adap/flower/pull/3942), [#3957](https://github.com/adap/flower/pull/3957), [#4020](https://github.com/adap/flower/pull/4020), [#4044](https://github.com/adap/flower/pull/4044), [#3852](https://github.com/adap/flower/pull/3852), [#4019](https://github.com/adap/flower/pull/4019), [#4031](https://github.com/adap/flower/pull/4031), [#4036](https://github.com/adap/flower/pull/4036), [#4049](https://github.com/adap/flower/pull/4049), [#4017](https://github.com/adap/flower/pull/4017), [#3943](https://github.com/adap/flower/pull/3943), [#3944](https://github.com/adap/flower/pull/3944), [#4011](https://github.com/adap/flower/pull/4011), [#3619](https://github.com/adap/flower/pull/3619))
+
+ Dynamic code updates are here! `flwr run` can now ship and install the latest version of your `ServerApp` and `ClientApp` to an already-running federation (SuperLink and SuperNodes).
+
+ How does it work? `flwr run` bundles your Flower app into a single FAB (Flower App Bundle) file. It then ships this FAB file, via the SuperExec, to both the SuperLink and those SuperNodes that need it. This allows you to keep SuperExec, SuperLink and SuperNodes running as permanent infrastructure, and then ship code updates (including completely new projects!) dynamically.
+
+ `flwr run` is all you need.
+
+- **Introduce isolated** `ClientApp` **execution** ([#3970](https://github.com/adap/flower/pull/3970), [#3976](https://github.com/adap/flower/pull/3976), [#4002](https://github.com/adap/flower/pull/4002), [#4001](https://github.com/adap/flower/pull/4001), [#4034](https://github.com/adap/flower/pull/4034), [#4037](https://github.com/adap/flower/pull/4037), [#3977](https://github.com/adap/flower/pull/3977), [#4042](https://github.com/adap/flower/pull/4042), [#3978](https://github.com/adap/flower/pull/3978), [#4039](https://github.com/adap/flower/pull/4039), [#4033](https://github.com/adap/flower/pull/4033), [#3971](https://github.com/adap/flower/pull/3971), [#4035](https://github.com/adap/flower/pull/4035), [#3973](https://github.com/adap/flower/pull/3973), [#4032](https://github.com/adap/flower/pull/4032))
+
+ The SuperNode can now run your `ClientApp` in a fully isolated way. In an enterprise deployment, this allows you to set strict limits on what the `ClientApp` can and cannot do.
+
+ `flower-supernode` supports three `--isolation` modes:
+
+ - Unset: The SuperNode runs the `ClientApp` in the same process (as in previous versions of Flower). This is the default mode.
+ - `--isolation=subprocess`: The SuperNode starts a subprocess to run the `ClientApp`.
+ - `--isolation=process`: The SuperNode expects an externally-managed process to run the `ClientApp`. This external process is not managed by the SuperNode, so it has to be started beforehand and terminated manually. The common way to use this isolation mode is via the new `flwr/clientapp` Docker image.
+
+- **Improve Docker support for enterprise deployments** ([#4050](https://github.com/adap/flower/pull/4050), [#4090](https://github.com/adap/flower/pull/4090), [#3784](https://github.com/adap/flower/pull/3784), [#3998](https://github.com/adap/flower/pull/3998), [#4094](https://github.com/adap/flower/pull/4094), [#3722](https://github.com/adap/flower/pull/3722))
+
+ Flower 1.11 ships many Docker improvements that are especially useful for enterprise deployments:
+
+ - `flwr/supernode` comes with a new Alpine Docker image.
+ - `flwr/clientapp` is a new image to be used with the `--isolation=process` option. In this mode, SuperNode and `ClientApp` run in two different Docker containers. `flwr/supernode` (preferably the Alpine version) runs the long-running SuperNode with `--isolation=process`. `flwr/clientapp` runs the `ClientApp`. This is the recommended way to deploy Flower in enterprise settings.
+ - A new all-in-one Docker Compose setup enables you to easily start a full Flower Deployment Engine on a single machine.
+ - Completely new Docker documentation: https://flower.ai/docs/framework/docker/index.html
+
+- **Improve SuperNode authentication** ([#4043](https://github.com/adap/flower/pull/4043), [#4047](https://github.com/adap/flower/pull/4047), [#4074](https://github.com/adap/flower/pull/4074))
+
+ SuperNode authentication has been improved in several ways, including improved logging, testing, and error handling.
+
+- **Update** `flwr new` **templates** ([#3933](https://github.com/adap/flower/pull/3933), [#3894](https://github.com/adap/flower/pull/3894), [#3930](https://github.com/adap/flower/pull/3930), [#3931](https://github.com/adap/flower/pull/3931), [#3997](https://github.com/adap/flower/pull/3997), [#3979](https://github.com/adap/flower/pull/3979), [#3965](https://github.com/adap/flower/pull/3965), [#4013](https://github.com/adap/flower/pull/4013), [#4064](https://github.com/adap/flower/pull/4064))
+
+ All `flwr new` templates have been updated to show the latest recommended use of Flower APIs.
+
+- **Improve Simulation Engine** ([#4095](https://github.com/adap/flower/pull/4095), [#3913](https://github.com/adap/flower/pull/3913), [#4059](https://github.com/adap/flower/pull/4059), [#3954](https://github.com/adap/flower/pull/3954), [#4071](https://github.com/adap/flower/pull/4071), [#3985](https://github.com/adap/flower/pull/3985), [#3988](https://github.com/adap/flower/pull/3988))
+
+ The Flower Simulation Engine comes with several updates, including improved run config support, verbose logging, simulation backend configuration via `flwr run`, and more.
+
+- **Improve** `RecordSet` ([#4052](https://github.com/adap/flower/pull/4052), [#3218](https://github.com/adap/flower/pull/3218), [#4016](https://github.com/adap/flower/pull/4016))
+
+ `RecordSet` is the core object to exchange model parameters, configuration values and metrics between `ClientApp` and `ServerApp`. This release ships several smaller improvements to `RecordSet` and related `*Record` types.
+
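+ As an illustration, a `RecordSet` can carry configuration values and metrics side by side (a rough sketch using the record types from `flwr.common`; see the API reference for details):
+
+ ```python
+ from flwr.common import ConfigsRecord, MetricsRecord, RecordSet
+
+ recordset = RecordSet()
+ recordset.configs_records["my-config"] = ConfigsRecord({"lr": 0.01})
+ recordset.metrics_records["my-metrics"] = MetricsRecord({"loss": 0.214})
+ ```
+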
+- **Update documentation** ([#3972](https://github.com/adap/flower/pull/3972), [#3925](https://github.com/adap/flower/pull/3925), [#4061](https://github.com/adap/flower/pull/4061), [#3984](https://github.com/adap/flower/pull/3984), [#3917](https://github.com/adap/flower/pull/3917), [#3900](https://github.com/adap/flower/pull/3900), [#4066](https://github.com/adap/flower/pull/4066), [#3765](https://github.com/adap/flower/pull/3765), [#4021](https://github.com/adap/flower/pull/4021), [#3906](https://github.com/adap/flower/pull/3906), [#4063](https://github.com/adap/flower/pull/4063), [#4076](https://github.com/adap/flower/pull/4076), [#3920](https://github.com/adap/flower/pull/3920), [#3916](https://github.com/adap/flower/pull/3916))
+
+ Many parts of the documentation, including the main tutorial, have been migrated to show new Flower APIs and other new Flower features like the improved Docker support.
+
+- **Migrate code example to use new Flower APIs** ([#3758](https://github.com/adap/flower/pull/3758), [#3701](https://github.com/adap/flower/pull/3701), [#3919](https://github.com/adap/flower/pull/3919), [#3918](https://github.com/adap/flower/pull/3918), [#3934](https://github.com/adap/flower/pull/3934), [#3893](https://github.com/adap/flower/pull/3893), [#3833](https://github.com/adap/flower/pull/3833), [#3922](https://github.com/adap/flower/pull/3922), [#3846](https://github.com/adap/flower/pull/3846), [#3777](https://github.com/adap/flower/pull/3777), [#3874](https://github.com/adap/flower/pull/3874), [#3873](https://github.com/adap/flower/pull/3873), [#3935](https://github.com/adap/flower/pull/3935), [#3754](https://github.com/adap/flower/pull/3754), [#3980](https://github.com/adap/flower/pull/3980), [#4089](https://github.com/adap/flower/pull/4089), [#4046](https://github.com/adap/flower/pull/4046), [#3314](https://github.com/adap/flower/pull/3314), [#3316](https://github.com/adap/flower/pull/3316), [#3295](https://github.com/adap/flower/pull/3295), [#3313](https://github.com/adap/flower/pull/3313))
+
+ Many code examples have been migrated to use new Flower APIs.
+
+- **Update Flower framework, framework internals and quality infrastructure** ([#4018](https://github.com/adap/flower/pull/4018), [#4053](https://github.com/adap/flower/pull/4053), [#4098](https://github.com/adap/flower/pull/4098), [#4067](https://github.com/adap/flower/pull/4067), [#4105](https://github.com/adap/flower/pull/4105), [#4048](https://github.com/adap/flower/pull/4048), [#4107](https://github.com/adap/flower/pull/4107), [#4069](https://github.com/adap/flower/pull/4069), [#3915](https://github.com/adap/flower/pull/3915), [#4101](https://github.com/adap/flower/pull/4101), [#4108](https://github.com/adap/flower/pull/4108), [#3914](https://github.com/adap/flower/pull/3914), [#4068](https://github.com/adap/flower/pull/4068), [#4041](https://github.com/adap/flower/pull/4041), [#4040](https://github.com/adap/flower/pull/4040), [#3986](https://github.com/adap/flower/pull/3986), [#4026](https://github.com/adap/flower/pull/4026), [#3961](https://github.com/adap/flower/pull/3961), [#3975](https://github.com/adap/flower/pull/3975), [#3983](https://github.com/adap/flower/pull/3983), [#4091](https://github.com/adap/flower/pull/4091), [#3982](https://github.com/adap/flower/pull/3982), [#4079](https://github.com/adap/flower/pull/4079), [#4073](https://github.com/adap/flower/pull/4073), [#4060](https://github.com/adap/flower/pull/4060), [#4106](https://github.com/adap/flower/pull/4106), [#4080](https://github.com/adap/flower/pull/4080), [#3974](https://github.com/adap/flower/pull/3974), [#3996](https://github.com/adap/flower/pull/3996), [#3991](https://github.com/adap/flower/pull/3991), [#3981](https://github.com/adap/flower/pull/3981), [#4093](https://github.com/adap/flower/pull/4093), [#4100](https://github.com/adap/flower/pull/4100), [#3939](https://github.com/adap/flower/pull/3939), [#3955](https://github.com/adap/flower/pull/3955), [#3940](https://github.com/adap/flower/pull/3940), [#4038](https://github.com/adap/flower/pull/4038))
+
+ As always, many parts of the Flower framework and quality infrastructure were improved and updated.
+
+### Deprecations
+
+- **Deprecate accessing `Context` via `Client.context`** ([#3797](https://github.com/adap/flower/pull/3797))
+
+ Now that both `client_fn` and `server_fn` receive a `Context` object, accessing `Context` via `Client.context` is deprecated. `Client.context` will be removed in a future release. If you need to access `Context` in your `Client` implementation, pass it manually when creating the `Client` instance in `client_fn`:
+
+ ```python
+ def client_fn(context: Context) -> Client:
+ return FlowerClient(context).to_client()
+ ```
+
+### Incompatible changes
+
+- **Update CLIs to accept an app directory instead of** `ClientApp` **and** `ServerApp` ([#3952](https://github.com/adap/flower/pull/3952), [#4077](https://github.com/adap/flower/pull/4077), [#3850](https://github.com/adap/flower/pull/3850))
+
+ The CLI commands `flower-supernode` and `flower-server-app` now accept an app directory as an argument (instead of references to a `ClientApp` or `ServerApp`). An app directory is any directory containing a `pyproject.toml` file (with the appropriate Flower config fields set). The easiest way to generate a compatible project structure is to use `flwr new`.
+
+- **Disable** `flower-client-app` **CLI command** ([#4022](https://github.com/adap/flower/pull/4022))
+
+ `flower-client-app` has been disabled. Use `flower-supernode` instead.
+
+- **Use spaces instead of commas for separating config args** ([#4000](https://github.com/adap/flower/pull/4000))
+
+ When passing configs (run config, node config) to Flower, you now need to separate key-value pairs using spaces instead of commas. For example:
+
+ ```bash
+ flwr run . --run-config "learning-rate=0.01 num_rounds=10" # Works
+ ```
+
+ Previously, you could pass configs using commas, like this:
+
+ ```bash
+ flwr run . --run-config "learning-rate=0.01,num_rounds=10" # Doesn't work
+ ```
+
+- **Remove** `flwr example` **CLI command** ([#4084](https://github.com/adap/flower/pull/4084))
+
+ The experimental `flwr example` CLI command has been removed. Use `flwr new` to generate a project and then run it using `flwr run`.
## v1.10.0 (2024-07-24)
From e301ee24d9ef4672dbacb66f8d7ec7f56afcfa2d Mon Sep 17 00:00:00 2001
From: "Daniel J. Beutel"
Date: Fri, 30 Aug 2024 14:51:14 +0200
Subject: [PATCH 33/42] feat(framework) Increase dev version to Flower 1.12
(#4113)
---
baselines/doc/source/conf.py | 2 +-
doc/source/conf.py | 4 ++--
examples/doc/source/conf.py | 2 +-
pyproject.toml | 2 +-
4 files changed, 5 insertions(+), 5 deletions(-)
diff --git a/baselines/doc/source/conf.py b/baselines/doc/source/conf.py
index ecc3482c6fce..974c264a6220 100644
--- a/baselines/doc/source/conf.py
+++ b/baselines/doc/source/conf.py
@@ -37,7 +37,7 @@
author = "The Flower Authors"
# The full version, including alpha/beta/rc tags
-release = "1.10.0"
+release = "1.11.0"
# -- General configuration ---------------------------------------------------
diff --git a/doc/source/conf.py b/doc/source/conf.py
index de475748abb1..c645c556c603 100644
--- a/doc/source/conf.py
+++ b/doc/source/conf.py
@@ -90,10 +90,10 @@
author = "The Flower Authors"
# The full version of the next release, including alpha/beta/rc tags
-release = "1.11.0"
+release = "1.12.0"
# The current released version
rst_prolog = """
-.. |stable_flwr_version| replace:: 1.10.0
+.. |stable_flwr_version| replace:: 1.11.0
.. |stable_flwr_superlink_docker_digest| replace:: 4b317d5b6030710b476f4dbfab2c3a33021ad40a0fcfa54d7edd45e0c51d889c
.. |ubuntu_version| replace:: 22.04
.. |setuptools_version| replace:: 70.3.0
diff --git a/examples/doc/source/conf.py b/examples/doc/source/conf.py
index 4e4b7b210051..3500d7f0b59c 100644
--- a/examples/doc/source/conf.py
+++ b/examples/doc/source/conf.py
@@ -29,7 +29,7 @@
author = "The Flower Authors"
# The full version, including alpha/beta/rc tags
-release = "1.11.0"
+release = "1.12.0"
# -- General configuration ---------------------------------------------------
diff --git a/pyproject.toml b/pyproject.toml
index 91b518af5e03..6df9180ac3f8 100644
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -4,7 +4,7 @@ build-backend = "poetry.core.masonry.api"
[tool.poetry]
name = "flwr"
-version = "1.11.0"
+version = "1.12.0"
description = "Flower: A Friendly Federated Learning Framework"
license = "Apache-2.0"
authors = ["The Flower Authors "]
From e7487fc31099e0b0b6d1367bddef7f2eee611c99 Mon Sep 17 00:00:00 2001
From: Robert Steiner
Date: Fri, 30 Aug 2024 17:37:05 +0200
Subject: [PATCH 34/42] docs(framework:skip) Add flwr 1.11.0 to Docker READMEs
(#4116)
Signed-off-by: Robert Steiner
---
src/docker/base/README.md | 7 ++++++-
src/docker/clientapp/README.md | 6 +++++-
src/docker/serverapp/README.md | 6 +++++-
src/docker/superexec/README.md | 6 +++++-
src/docker/superlink/README.md | 4 +++-
src/docker/supernode/README.md | 7 ++++++-
6 files changed, 30 insertions(+), 6 deletions(-)
diff --git a/src/docker/base/README.md b/src/docker/base/README.md
index e61899525f19..16822c18782e 100644
--- a/src/docker/base/README.md
+++ b/src/docker/base/README.md
@@ -19,8 +19,13 @@
## Supported tags
-- `nightly`, `.dev` e.g. `1.11.0.dev20240724`
+- `nightly`, `.dev` e.g. `1.12.0.dev20240830`
- nightly image uses Python 3.11 and Ubuntu 22.04
+- `1.11.0-py3.11-alpine3.19`
+- `1.11.0-py3.11-ubuntu22.04`
+- `1.11.0-py3.10-ubuntu22.04`
+- `1.11.0-py3.9-ubuntu22.04`
+- `1.11.0-py3.8-ubuntu22.04`
- `1.10.0-py3.11-alpine3.19`
- `1.10.0-py3.11-ubuntu22.04`
- `1.10.0-py3.10-ubuntu22.04`
diff --git a/src/docker/clientapp/README.md b/src/docker/clientapp/README.md
index ac50d4dc9b8f..5827cb8974df 100644
--- a/src/docker/clientapp/README.md
+++ b/src/docker/clientapp/README.md
@@ -19,4 +19,8 @@
## Supported tags
-- `nightly`, `.dev` e.g. `1.11.0.dev20240724`
+- `nightly`, `.dev` e.g. `1.12.0.dev20240830`
+- `1.11.0`, `1.11.0-py3.11-ubuntu22.04`
+- `1.11.0-py3.10-ubuntu22.04`
+- `1.11.0-py3.9-ubuntu22.04`
+- `1.11.0-py3.8-ubuntu22.04`
diff --git a/src/docker/serverapp/README.md b/src/docker/serverapp/README.md
index c4283fa51f00..f75704ad7bbb 100644
--- a/src/docker/serverapp/README.md
+++ b/src/docker/serverapp/README.md
@@ -19,7 +19,11 @@
## Supported tags
-- `nightly`, `.dev` e.g. `1.11.0.dev20240724`
+- `nightly`, `.dev` e.g. `1.12.0.dev20240830`
+- `1.11.0`, `1.11.0-py3.11-ubuntu22.04`
+- `1.11.0-py3.10-ubuntu22.04`
+- `1.11.0-py3.9-ubuntu22.04`
+- `1.11.0-py3.8-ubuntu22.04`
- `1.10.0`, `1.10.0-py3.11-ubuntu22.04`
- `1.10.0-py3.10-ubuntu22.04`
- `1.10.0-py3.9-ubuntu22.04`
diff --git a/src/docker/superexec/README.md b/src/docker/superexec/README.md
index 03dcc2cba5c9..c5c102313ccb 100644
--- a/src/docker/superexec/README.md
+++ b/src/docker/superexec/README.md
@@ -19,7 +19,11 @@
## Supported tags
-- `nightly`, `.dev` e.g. `1.11.0.dev20240724`
+- `nightly`, `.dev` e.g. `1.12.0.dev20240830`
+- `1.11.0`, `1.11.0-py3.11-ubuntu22.04`
+- `1.11.0-py3.10-ubuntu22.04`
+- `1.11.0-py3.9-ubuntu22.04`
+- `1.11.0-py3.8-ubuntu22.04`
- `1.10.0`, `1.10.0-py3.11-ubuntu22.04`
- `1.10.0-py3.10-ubuntu22.04`
- `1.10.0-py3.9-ubuntu22.04`
diff --git a/src/docker/superlink/README.md b/src/docker/superlink/README.md
index 3da3c16909b8..729a1f7ba7fb 100644
--- a/src/docker/superlink/README.md
+++ b/src/docker/superlink/README.md
@@ -19,7 +19,9 @@
## Supported tags
-- `nightly`, `.dev` e.g. `1.11.0.dev20240724`
+- `nightly`, `.dev` e.g. `1.12.0.dev20240830`
+- `1.11.0`, `1.11.0-py3.11-alpine3.19`
+- `1.11.0-py3.11-ubuntu22.04`
- `1.10.0`, `1.10.0-py3.11-alpine3.19`
- `1.10.0-py3.11-ubuntu22.04`
- `1.9.0`, `1.9.0-py3.11-alpine3.19`
diff --git a/src/docker/supernode/README.md b/src/docker/supernode/README.md
index defee36b35ae..becc2323ca2d 100644
--- a/src/docker/supernode/README.md
+++ b/src/docker/supernode/README.md
@@ -19,7 +19,12 @@
## Supported tags
-- `nightly`, `.dev` e.g. `1.11.0.dev20240724`
+- `nightly`, `.dev` e.g. `1.12.0.dev20240830`
+- `1.11.0`, `1.11.0-py3.11-alpine3.19`
+- `1.11.0-py3.11-ubuntu22.04`
+- `1.11.0-py3.10-ubuntu22.04`
+- `1.11.0-py3.9-ubuntu22.04`
+- `1.11.0-py3.8-ubuntu22.04`
- `1.10.0`, `1.10.0-py3.11-ubuntu22.04`
- `1.10.0-py3.10-ubuntu22.04`
- `1.10.0-py3.9-ubuntu22.04`
From 6c12082e66c9152fc6ca8c232931ae82d3bf6336 Mon Sep 17 00:00:00 2001
From: Robert Steiner
Date: Sat, 31 Aug 2024 12:26:25 +0200
Subject: [PATCH 35/42] docs(framework:skip) Update Docker docs for 1.11.0
(#4075)
Signed-off-by: Robert Steiner
Co-authored-by: Taner Topal
---
doc/source/docker/index.rst | 5 +-
doc/source/docker/run-as-subprocess.rst | 53 +++++
.../docker/tutorial-quickstart-docker.rst | 183 +++++++++++-------
3 files changed, 164 insertions(+), 77 deletions(-)
create mode 100644 doc/source/docker/run-as-subprocess.rst
diff --git a/doc/source/docker/index.rst b/doc/source/docker/index.rst
index a070a47cb853..ac6124b4c138 100644
--- a/doc/source/docker/index.rst
+++ b/doc/source/docker/index.rst
@@ -33,11 +33,12 @@ Advanced Options
set-environment-variables
run-as-root-user
+ run-as-subprocess
pin-version
use-a-different-version
-Run Flower Docker Compose
--------------------------
+Run Flower using Docker Compose
+-------------------------------
.. toctree::
:maxdepth: 1
diff --git a/doc/source/docker/run-as-subprocess.rst b/doc/source/docker/run-as-subprocess.rst
new file mode 100644
index 000000000000..f8c482f632a0
--- /dev/null
+++ b/doc/source/docker/run-as-subprocess.rst
@@ -0,0 +1,53 @@
+Run ClientApp as a Subprocess
+=============================
+
+In this mode, the ClientApp is executed as a subprocess within the SuperNode Docker container,
+rather than running in a separate container. This approach reduces the number of running containers,
+which can be beneficial for environments with limited resources. However, it also means that the
+ClientApp is no longer isolated from the SuperNode, which may introduce additional security
+concerns.
+
+Prerequisites
+-------------
+
+#. Before running the ClientApp as a subprocess, ensure that the FAB dependencies have been installed
+ in the SuperNode images. This can be done by extending the SuperNode image:
+
+ .. code-block:: dockerfile
+ :caption: Dockerfile.supernode
+ :linenos:
+ :substitutions:
+
+ FROM flwr/supernode:|stable_flwr_version|
+
+ WORKDIR /app
+ COPY pyproject.toml .
+ RUN sed -i 's/.*flwr\[simulation\].*//' pyproject.toml \
+ && python -m pip install -U --no-cache-dir .
+
+ ENTRYPOINT ["flower-supernode"]
+
+#. Next, build the SuperNode Docker image by running the following command in the directory where
+ the Dockerfile is located:
+
+ .. code-block:: shell
+
+ $ docker build -f Dockerfile.supernode -t flwr_supernode:0.0.1 .
+
+
+Run the ClientApp as a Subprocess
+---------------------------------
+
+Start the SuperNode with the flag ``--isolation subprocess``, which tells the SuperNode to execute
+the ClientApp as a subprocess:
+
+.. code-block:: shell
+
+ $ docker run --rm \
+ --detach \
+ flwr_supernode:0.0.1 \
+ --insecure \
+ --superlink superlink:9092 \
+ --node-config "partition-id=1 num-partitions=2" \
+ --supernode-address localhost:9094 \
+ --isolation subprocess
diff --git a/doc/source/docker/tutorial-quickstart-docker.rst b/doc/source/docker/tutorial-quickstart-docker.rst
index 29ae6d5f6a43..189d019cb097 100644
--- a/doc/source/docker/tutorial-quickstart-docker.rst
+++ b/doc/source/docker/tutorial-quickstart-docker.rst
@@ -66,8 +66,8 @@ Open your terminal and run:
* ``docker run``: This tells Docker to run a container from an image.
* ``--rm``: Remove the container once it is stopped or the command exits.
* | ``-p 9091:9091 -p 9092:9092``: Map port ``9091`` and ``9092`` of the container to the same port of
- | the host machine, allowing you to access the Driver API on ``http://localhost:9091`` and
- | the Fleet API on ``http://localhost:9092``.
+ | the host machine, allowing other services to access the Driver API on
+ | ``http://localhost:9091`` and the Fleet API on ``http://localhost:9092``.
* ``--network flwr-network``: Make the container join the network named ``flwr-network``.
* ``--name superlink``: Assign the name ``superlink`` to the container.
* ``--detach``: Run the container in the background, freeing up the terminal.
@@ -79,32 +79,92 @@ Open your terminal and run:
Step 3: Start the SuperNode
---------------------------
-The SuperNode Docker image comes with a pre-installed version of Flower and serves as a base for
-building your own SuperNode image.
+Start two SuperNode containers.
-#. Create a SuperNode Dockerfile called ``Dockerfile.supernode`` and paste the following code into it:
+#. Start the first container:
+
+ .. code-block:: bash
+ :substitutions:
+
+ $ docker run --rm \
+ -p 9094:9094 \
+ --network flwr-network \
+ --name supernode-1 \
+ --detach \
+ flwr/supernode:|stable_flwr_version| \
+ --insecure \
+ --superlink superlink:9092 \
+ --node-config "partition-id=0 num-partitions=2" \
+ --supernode-address 0.0.0.0:9094 \
+ --isolation process
+
+ .. dropdown:: Understand the command
+
+ * ``docker run``: This tells Docker to run a container from an image.
+ * ``--rm``: Remove the container once it is stopped or the command exits.
+ * | ``-p 9094:9094``: Map port ``9094`` of the container to the same port of
+ | the host machine, allowing other services to access the SuperNode API on
+ | ``http://localhost:9094``.
+ * ``--network flwr-network``: Make the container join the network named ``flwr-network``.
+ * ``--name supernode-1``: Assign the name ``supernode-1`` to the container.
+ * ``--detach``: Run the container in the background, freeing up the terminal.
+ * | ``flwr/supernode:|stable_flwr_version|``: This is the name of the image to be run and the specific tag
+ | of the image.
+ * | ``--insecure``: This flag tells the container to operate in an insecure mode, allowing
+ | unencrypted communication.
+ * | ``--superlink superlink:9092``: Connect to the SuperLink's Fleet API at the address
+ | ``superlink:9092``.
+ * | ``--node-config "partition-id=0 num-partitions=2"``: Set the partition ID to ``0`` and the
+ | number of partitions to ``2`` for the SuperNode configuration.
+ * | ``--supernode-address 0.0.0.0:9094``: Set the address and port number that the SuperNode
+ | is listening on.
+    * | ``--isolation process``: Tells the SuperNode that the ClientApp is created by a separate,
+      | independent process. The SuperNode does not attempt to create it.
+
+#. Start the second container:
+
+ .. code-block:: shell
+ :substitutions:
+
+ $ docker run --rm \
+ -p 9095:9095 \
+ --network flwr-network \
+ --name supernode-2 \
+ --detach \
+ flwr/supernode:|stable_flwr_version| \
+ --insecure \
+ --superlink superlink:9092 \
+ --node-config "partition-id=1 num-partitions=2" \
+ --supernode-address 0.0.0.0:9095 \
+ --isolation process
+
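Before starting the ClientApps, it can help to verify that both SuperNodes are up (an optional check, assuming the container names ``supernode-1`` and ``supernode-2`` used above):

.. code-block:: shell

   # List the running SuperNode containers with their status
   $ docker ps --filter name=supernode --format "table {{.Names}}\t{{.Status}}"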
+Step 4: Start the ClientApp
+---------------------------
+
+The ClientApp Docker image comes with a pre-installed version of Flower and serves as a base for
+building your own ClientApp image. In order to install the FAB dependencies, you will need to create
+a Dockerfile that extends the ClientApp image and installs the required dependencies.
+
+#. Create a ClientApp Dockerfile called ``Dockerfile.clientapp`` and paste the following code into it:
.. code-block:: dockerfile
- :caption: Dockerfile.supernode
+ :caption: Dockerfile.clientapp
:linenos:
:substitutions:
- FROM flwr/supernode:|stable_flwr_version|
+ FROM flwr/clientapp:|stable_flwr_version|
WORKDIR /app
COPY pyproject.toml .
RUN sed -i 's/.*flwr\[simulation\].*//' pyproject.toml \
&& python -m pip install -U --no-cache-dir .
- COPY flower.quickstart-docker.1-0-0.fab .
- RUN flwr install flower.quickstart-docker.1-0-0.fab
-
- ENTRYPOINT ["flower-supernode"]
+ ENTRYPOINT ["flwr-clientapp"]
.. dropdown:: Understand the Dockerfile
- * | :substitution-code:`FROM flwr/supernode:|stable_flwr_version|`: This line specifies that the Docker image
- | to be built from is the ``flwr/supernode image``, version :substitution-code:`|stable_flwr_version|`.
+ * | :substitution-code:`FROM flwr/clientapp:|stable_flwr_version|`: This line specifies that the Docker image
+      | to be built from is the ``flwr/clientapp`` image, version :substitution-code:`|stable_flwr_version|`.
* | ``WORKDIR /app``: Set the working directory for the container to ``/app``.
| Any subsequent commands that reference a directory will be relative to this directory.
* | ``COPY pyproject.toml .``: Copy the ``pyproject.toml`` file
@@ -116,51 +176,37 @@ building your own SuperNode image.
|
| The ``-U`` flag indicates that any existing packages should be upgraded, and
| ``--no-cache-dir`` prevents pip from using the cache to speed up the installation.
- * | ``COPY flower.quickstart-docker.1-0-0.fab .``: Copy the
- | ``flower.quickstart-docker.1-0-0.fab`` file from the current working directory into
- | the container's ``/app`` directory.
- * | ``RUN flwr install flower.quickstart-docker.1-0-0.fab``: Run the ``flwr`` install command
- | to install the Flower App Bundle locally.
- * | ``ENTRYPOINT ["flower-supernode"]``: Set the command ``flower-supernode`` to be
+ * | ``ENTRYPOINT ["flwr-clientapp"]``: Set the command ``flwr-clientapp`` to be
| the default command run when the container is started.
.. important::
- Note that `flwr `__ is already installed in the ``flwr/supernode``
+ Note that `flwr `__ is already installed in the ``flwr/clientapp``
base image, so only other package dependencies such as ``flwr-datasets``, ``torch``, etc.,
need to be installed. As a result, the ``flwr`` dependency is removed from the
``pyproject.toml`` after it has been copied into the Docker image (see line 5).
-#. Build the Flower App Bundle (FAB):
-
- .. code-block:: bash
-
- $ flwr build
-
-#. Next, build the SuperNode Docker image by running the following command in the directory where
- Dockerfile is located:
+#. Next, build the ClientApp Docker image by running the following command in the directory where
+ the Dockerfile is located:
.. code-block:: bash
- $ docker build -f Dockerfile.supernode -t flwr_supernode:0.0.1 .
+ $ docker build -f Dockerfile.clientapp -t flwr_clientapp:0.0.1 .
.. note::
- The image name was set as ``flwr_supernode`` with the tag ``0.0.1``. Remember that
+ The image name was set as ``flwr_clientapp`` with the tag ``0.0.1``. Remember that
these values are merely examples, and you can customize them according to your requirements.
-#. Start the first SuperNode container:
+#. Start the first ClientApp container:
.. code-block:: bash
$ docker run --rm \
--network flwr-network \
--detach \
- flwr_supernode:0.0.1 \
- --insecure \
- --superlink superlink:9092 \
- --node-config \
- partition-id=0,num-partitions=2
+ flwr_clientapp:0.0.1 \
+ --supernode supernode-1:9094
.. dropdown:: Understand the command
@@ -168,35 +214,28 @@ building your own SuperNode image.
* ``--rm``: Remove the container once it is stopped or the command exits.
* ``--network flwr-network``: Make the container join the network named ``flwr-network``.
* ``--detach``: Run the container in the background, freeing up the terminal.
- * | ``flwr_supernode:0.0.1``: This is the name of the image to be run and the specific tag
+ * | ``flwr_clientapp:0.0.1``: This is the name of the image to be run and the specific tag
| of the image.
- * | ``--insecure``: This flag tells the container to operate in an insecure mode, allowing
- | unencrypted communication.
- * | ``--superlink superlink:9092``: Connect to the SuperLinks Fleet API on the address
- | ``superlink:9092``.
- * | ``--node-config partition-id=0,num-partitions=2``: Set the partition ID to ``0`` and the
- | number of partitions to ``2`` for the SuperNode configuration.
+ * | ``--supernode supernode-1:9094``: Connect to the SuperNode's Fleet API at the address
+ | ``supernode-1:9094``.
-#. Start the second SuperNode container:
+#. Start the second ClientApp container:
.. code-block:: shell
$ docker run --rm \
--network flwr-network \
--detach \
- flwr_supernode:0.0.1 \
- --insecure \
- --superlink superlink:9092 \
- --node-config \
- partition-id=1,num-partitions=2
+ flwr_clientapp:0.0.1 \
+ --supernode supernode-2:9095
-Step 4: Start the SuperExec
+Step 5: Start the SuperExec
---------------------------
-The procedure for building and running a SuperExec image is almost identical to the SuperNode image.
+The procedure for building and running a SuperExec image is almost identical to that of the ClientApp image.
-Similar to the SuperNode image, the SuperExec Docker image comes with a pre-installed version of
-Flower and serves as a base for building your own SuperExec image.
+Similar to the ClientApp image, you will need to create a Dockerfile that extends the SuperExec
+image and installs the required FAB dependencies.
#. Create a SuperExec Dockerfile called ``Dockerfile.superexec`` and paste the following code in:
@@ -254,8 +293,7 @@ Flower and serves as a base for building your own SuperExec image.
--detach \
flwr_superexec:0.0.1 \
--insecure \
- --executor-config \
- superlink=\"superlink:9091\"
+ --executor-config superlink=\"superlink:9091\"
.. dropdown:: Understand the command
@@ -273,7 +311,7 @@ Flower and serves as a base for building your own SuperExec image.
* | ``--executor-config superlink=\"superlink:9091\"``: Configure the SuperExec executor to
| connect to the SuperLink running on port ``9091``.
-Step 5: Run the Quickstart Project
+Step 6: Run the Quickstart Project
----------------------------------
#. Add the following lines to the ``pyproject.toml``:
@@ -297,7 +335,7 @@ Step 5: Run the Quickstart Project
$ docker logs -f superexec
-Step 6: Update the Application
+Step 7: Update the Application
------------------------------
#. Change the application code. For example, change the ``seed`` in ``quickstart_docker/task.py``
@@ -310,39 +348,32 @@ Step 6: Update the Application
partition_train_test = partition.train_test_split(test_size=0.2, seed=43)
# ...
-#. Stop the current SuperNode containers:
+#. Stop the current ClientApp containers:
.. code-block:: bash
- $ docker stop $(docker ps -a -q --filter ancestor=flwr_supernode:0.0.1)
+ $ docker stop $(docker ps -a -q --filter ancestor=flwr_clientapp:0.0.1)
-#. Rebuild the FAB and SuperNode image:
+#. Rebuild the FAB and ClientApp image:
.. code-block:: bash
- $ flwr build
- $ docker build -f Dockerfile.supernode -t flwr_supernode:0.0.1 .
+ $ docker build -f Dockerfile.clientapp -t flwr_clientapp:0.0.1 .
-#. Launch two new SuperNode containers based on the newly built image:
+#. Launch two new ClientApp containers based on the newly built image:
.. code-block:: bash
$ docker run --rm \
--network flwr-network \
--detach \
- flwr_supernode:0.0.1 \
- --insecure \
- --superlink superlink:9092 \
- --node-config \
- partition-id=0,num-partitions=2
+ flwr_clientapp:0.0.1 \
+ --supernode supernode-1:9094
$ docker run --rm \
--network flwr-network \
--detach \
- flwr_supernode:0.0.1 \
- --insecure \
- --superlink superlink:9092 \
- --node-config \
- partition-id=1,num-partitions=2
+ flwr_clientapp:0.0.1 \
+ --supernode supernode-2:9095
#. Run the updated project:
@@ -350,14 +381,16 @@ Step 6: Update the Application
$ flwr run . docker
-Step 7: Clean Up
+Step 8: Clean Up
----------------
Remove the containers and the bridge network:
.. code-block:: bash
- $ docker stop $(docker ps -a -q --filter ancestor=flwr_supernode:0.0.1) \
+ $ docker stop $(docker ps -a -q --filter ancestor=flwr_clientapp:0.0.1) \
+ supernode-1 \
+ supernode-2 \
superexec \
superlink
$ docker network rm flwr-network
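As a final sanity check (optional, and not part of the original tutorial), confirm that no tutorial containers remain and that the bridge network is gone:

.. code-block:: bash

   # Neither supernode-*, superexec, nor superlink should be listed
   $ docker ps -a

   # flwr-network should no longer appear in the output
   $ docker network ls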
From 7215dfb7a7a65833471cdcec0689ec5f6503d6f7 Mon Sep 17 00:00:00 2001
From: Robert Steiner
Date: Sat, 31 Aug 2024 12:37:23 +0200
Subject: [PATCH 36/42] docs(framework:skip) Update Docker Compose docs for
1.11.0 (#4085)
Signed-off-by: Robert Steiner
---
.../tutorial-quickstart-docker-compose.rst | 56 +++++---
src/docker/complete/compose.yml | 124 +++++++++++-------
src/docker/complete/with-tls.yml | 22 +++-
3 files changed, 135 insertions(+), 67 deletions(-)
diff --git a/doc/source/docker/tutorial-quickstart-docker-compose.rst b/doc/source/docker/tutorial-quickstart-docker-compose.rst
index 93a000295951..49cef55ec5a2 100644
--- a/doc/source/docker/tutorial-quickstart-docker-compose.rst
+++ b/doc/source/docker/tutorial-quickstart-docker-compose.rst
@@ -44,7 +44,7 @@ Step 1: Set Up
Setting the ``PROJECT_DIR`` helps Docker Compose locate the ``pyproject.toml`` file, allowing
it to install dependencies in the SuperExec and SuperNode images correctly.
-Step 2: Run Flower in insecure mode
+Step 2: Run Flower in Insecure Mode
-----------------------------------
To begin, start Flower with the most basic configuration. In this setup, Flower
@@ -230,7 +230,7 @@ Step 6: Run Flower with TLS
[tool.flwr.federations.docker-compose-tls]
address = "127.0.0.1:9093"
- root-certificates = "superexec-certificates/ca.crt"
+ root-certificates = "../superexec-certificates/ca.crt"
#. Restart the services with TLS enabled:
@@ -248,43 +248,57 @@ Step 6: Run Flower with TLS
Step 7: Add another SuperNode
-----------------------------
-You can add more SuperNodes by duplicating the SuperNode definition in the ``compose.yml`` file.
+You can add more SuperNodes and ClientApps by duplicating their definitions in the ``compose.yml``
+file.
-Just make sure to give each new SuperNode service a unique service name like ``supernode-3``, ``supernode-4``, etc.
+Just give each new SuperNode and ClientApp service a unique service name like ``supernode-3``,
+``clientapp-3``, etc.
In ``compose.yml``, add the following:
.. code-block:: yaml
:caption: compose.yml
+ :substitutions:
- services:
# other service definitions
supernode-3:
- user: root
- deploy:
- resources:
- limits:
- cpus: "2"
+ image: flwr/supernode:${FLWR_VERSION:-|stable_flwr_version|}
command:
+ - --insecure
- --superlink
- superlink:9092
- - --insecure
+ - --supernode-address
+ - 0.0.0.0:9096
+ - --isolation
+ - process
+ - --node-config
+ - "partition-id=1 num-partitions=2"
depends_on:
- superlink
- volumes:
- - apps-volume:/app/.flwr/apps/:ro
+
+ clientapp-3:
build:
context: ${PROJECT_DIR:-.}
dockerfile_inline: |
- FROM flwr/supernode:${FLWR_VERSION:-1.10.0}
+ FROM flwr/clientapp:${FLWR_VERSION:-|stable_flwr_version|}
WORKDIR /app
COPY --chown=app:app pyproject.toml .
RUN sed -i 's/.*flwr\[simulation\].*//' pyproject.toml \
&& python -m pip install -U --no-cache-dir .
- ENTRYPOINT ["flower-supernode", "--node-config", "partition-id=0,num-partitions=2"]
+ ENTRYPOINT ["flwr-clientapp"]
+ command:
+ - --supernode
+ - supernode-3:9096
+ deploy:
+ resources:
+ limits:
+ cpus: "2"
+ stop_signal: SIGINT
+ depends_on:
+ - supernode-3
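With the two new services defined, they can be built and started alongside the running stack (a sketch, assuming the stack is managed with Docker Compose as in the rest of this tutorial):

.. code-block:: shell

   # Build and start only the newly added pair of services
   $ docker compose up --build -d supernode-3 clientapp-3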
If you also want to enable TLS for the new SuperNodes, duplicate the SuperNode definition for
each new SuperNode service in the ``with-tls.yml`` file.
@@ -296,13 +310,18 @@ In ``with-tls.yml``, add the following:
.. code-block:: yaml
:caption: with-tls.yml
- services:
# other service definitions
supernode-3:
command:
- --superlink
- superlink:9092
+ - --supernode-address
+ - 0.0.0.0:9096
+ - --isolation
+ - process
+ - --node-config
+ - "partition-id=1 num-partitions=2"
- --root-certificates
- certificates/ca.crt
secrets:
@@ -315,14 +334,13 @@ Step 8: Persisting the SuperLink State and Enabling TLS
To run Flower with persisted SuperLink state and enabled TLS, a slight change in the ``with-state.yml``
file is required:
-#. Comment out the lines 3-5 and uncomment the lines 6-10:
+#. Comment out lines 2-4 and uncomment lines 5-9:
.. code-block:: yaml
:caption: with-state.yml
:linenos:
- :emphasize-lines: 3-10
+ :emphasize-lines: 2-9
- services:
superlink:
# command:
# - --insecure
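Once ``with-state.yml`` is edited, the overlays are combined on the command line in the usual Compose fashion (a sketch; the exact invocation used by the tutorial is not shown in this patch):

.. code-block:: bash

   # Apply the TLS and state overlays on top of the base compose file
   $ docker compose -f compose.yml -f with-tls.yml -f with-state.yml up --build -d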
diff --git a/src/docker/complete/compose.yml b/src/docker/complete/compose.yml
index 90261249f322..60279adceb37 100644
--- a/src/docker/complete/compose.yml
+++ b/src/docker/complete/compose.yml
@@ -1,17 +1,16 @@
services:
# create a SuperLink service
superlink:
- image: flwr/superlink:${FLWR_VERSION:-1.10.0}
+ image: flwr/superlink:${FLWR_VERSION:-1.11.0}
command:
- --insecure
# create a SuperExec service
superexec:
- user: root
build:
context: ${PROJECT_DIR:-.}
dockerfile_inline: |
- FROM flwr/superexec:${FLWR_VERSION:-1.10.0}
+ FROM flwr/superexec:${FLWR_VERSION:-1.11.0}
WORKDIR /app
COPY --chown=app:app pyproject.toml .
@@ -29,89 +28,122 @@ services:
- superlink="superlink:9091"
depends_on:
- superlink
- volumes:
- - apps-volume:/app/.flwr/apps/:rw
  # create two SuperNode services with different node configs
supernode-1:
- user: root
- deploy:
- resources:
- limits:
- cpus: "2"
+ image: flwr/supernode:${FLWR_VERSION:-1.11.0}
command:
+ - --insecure
- --superlink
- superlink:9092
+ - --supernode-address
+ - 0.0.0.0:9094
+ - --isolation
+ - process
+ - --node-config
+ - "partition-id=0 num-partitions=2"
+ depends_on:
+ - superlink
+
+ supernode-2:
+ image: flwr/supernode:${FLWR_VERSION:-1.11.0}
+ command:
- --insecure
+ - --superlink
+ - superlink:9092
+ - --supernode-address
+ - 0.0.0.0:9095
+ - --isolation
+ - process
+ - --node-config
+ - "partition-id=1 num-partitions=2"
depends_on:
- superlink
- volumes:
- - apps-volume:/app/.flwr/apps/:ro
+
+ # uncomment to add another SuperNode
+ #
+ # supernode-3:
+ # image: flwr/supernode:${FLWR_VERSION:-1.11.0}
+ # command:
+ # - --insecure
+ # - --superlink
+ # - superlink:9092
+ # - --supernode-address
+ # - 0.0.0.0:9096
+ # - --isolation
+ # - process
+ # - --node-config
+ # - "partition-id=1 num-partitions=2"
+ # depends_on:
+ # - superlink
+
+ clientapp-1:
build:
context: ${PROJECT_DIR:-.}
dockerfile_inline: |
- FROM flwr/supernode:${FLWR_VERSION:-1.10.0}
+ FROM flwr/clientapp:${FLWR_VERSION:-1.11.0}
WORKDIR /app
COPY --chown=app:app pyproject.toml .
RUN sed -i 's/.*flwr\[simulation\].*//' pyproject.toml \
&& python -m pip install -U --no-cache-dir .
- ENTRYPOINT ["flower-supernode", "--node-config", "partition-id=0,num-partitions=2"]
-
- supernode-2:
- user: root
+ ENTRYPOINT ["flwr-clientapp"]
+ command:
+ - --supernode
+ - supernode-1:9094
deploy:
resources:
limits:
cpus: "2"
- command:
- - --superlink
- - superlink:9092
- - --insecure
+ stop_signal: SIGINT
depends_on:
- - superlink
- volumes:
- - apps-volume:/app/.flwr/apps/:ro
+ - supernode-1
+
+ clientapp-2:
build:
context: ${PROJECT_DIR:-.}
dockerfile_inline: |
- FROM flwr/supernode:${FLWR_VERSION:-1.10.0}
+ FROM flwr/clientapp:${FLWR_VERSION:-1.11.0}
WORKDIR /app
COPY --chown=app:app pyproject.toml .
RUN sed -i 's/.*flwr\[simulation\].*//' pyproject.toml \
&& python -m pip install -U --no-cache-dir .
- ENTRYPOINT ["flower-supernode", "--node-config", "partition-id=1,num-partitions=2"]
+ ENTRYPOINT ["flwr-clientapp"]
+ command:
+ - --supernode
+ - supernode-2:9095
+ deploy:
+ resources:
+ limits:
+ cpus: "2"
+ stop_signal: SIGINT
+ depends_on:
+ - supernode-2
- # uncomment to add another supernode
+ # uncomment to add another ClientApp
#
- # supernode-3:
- # user: root
- # deploy:
- # resources:
- # limits:
- # cpus: "2"
- # command:
- # - --superlink
- # - superlink:9092
- # - --insecure
- # depends_on:
- # - superlink
- # volumes:
- # - apps-volume:/app/.flwr/apps/:ro
+ # clientapp-3:
# build:
# context: ${PROJECT_DIR:-.}
# dockerfile_inline: |
- # FROM flwr/supernode:${FLWR_VERSION:-1.10.0}
+ # FROM flwr/clientapp:${FLWR_VERSION:-1.11.0}
# WORKDIR /app
# COPY --chown=app:app pyproject.toml .
# RUN sed -i 's/.*flwr\[simulation\].*//' pyproject.toml \
# && python -m pip install -U --no-cache-dir .
- # ENTRYPOINT ["flower-supernode", "--node-config", "partition-id=0,num-partitions=2"]
-
-volumes:
- apps-volume:
+ # ENTRYPOINT ["flwr-clientapp"]
+ # command:
+ # - --supernode
+ # - supernode-3:9096
+ # deploy:
+ # resources:
+ # limits:
+ # cpus: "2"
+ # stop_signal: SIGINT
+ # depends_on:
+ # - supernode-3
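After editing ``compose.yml``, the merged configuration can be validated before anything is started (an optional step, not part of the original file; ``docker compose config`` renders and checks the file):

```shell
$ docker compose -f compose.yml config
```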
diff --git a/src/docker/complete/with-tls.yml b/src/docker/complete/with-tls.yml
index 1b8540e09b64..6cbeb2ba7397 100644
--- a/src/docker/complete/with-tls.yml
+++ b/src/docker/complete/with-tls.yml
@@ -17,7 +17,7 @@ services:
- --executor
- flwr.superexec.deployment:executor
- --executor-config
- - superlink="superlink:9091",root-certificates="certificates/superlink-ca.crt"
+ - superlink="superlink:9091" root-certificates="certificates/superlink-ca.crt"
- --ssl-ca-certfile=certificates/ca.crt
- --ssl-certfile=certificates/server.pem
- --ssl-keyfile=certificates/server.key
@@ -35,6 +35,12 @@ services:
command:
- --superlink
- superlink:9092
+ - --supernode-address
+ - 0.0.0.0:9094
+ - --isolation
+ - process
+ - --node-config
+ - "partition-id=0 num-partitions=2"
- --root-certificates
- certificates/ca.crt
secrets:
@@ -45,18 +51,30 @@ services:
command:
- --superlink
- superlink:9092
+ - --supernode-address
+ - 0.0.0.0:9095
+ - --isolation
+ - process
+ - --node-config
+ - "partition-id=1 num-partitions=2"
- --root-certificates
- certificates/ca.crt
secrets:
- source: superlink-ca-certfile
target: /app/certificates/ca.crt
- # uncomment to enable TLS on another supernode
+ # uncomment to enable TLS on another SuperNode
#
# supernode-3:
# command:
# - --superlink
# - superlink:9092
+ # - --supernode-address
+ # - 0.0.0.0:9096
+ # - --isolation
+ # - process
+ # - --node-config
+ # - "partition-id=1 num-partitions=2"
# - --root-certificates
# - certificates/ca.crt
# secrets:
From 9bbfe9c6fee79c802f9aec8210c530bf598e1d9e Mon Sep 17 00:00:00 2001
From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com>
Date: Mon, 2 Sep 2024 09:14:03 +0200
Subject: [PATCH 37/42] build(deps): bump actions/upload-artifact from 4.3.6 to
4.4.0 (#4122)
Bumps [actions/upload-artifact](https://github.com/actions/upload-artifact) from 4.3.6 to 4.4.0.
- [Release notes](https://github.com/actions/upload-artifact/releases)
- [Commits](https://github.com/actions/upload-artifact/compare/834a144ee995460fba8ed112a2fc961b36a5ec5a...50769540e7f4bd5e21e526ee35c689e35e0d6874)
---
updated-dependencies:
- dependency-name: actions/upload-artifact
dependency-type: direct:production
update-type: version-update:semver-minor
...
Signed-off-by: dependabot[bot]
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
---
.github/workflows/_docker-build.yml | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/.github/workflows/_docker-build.yml b/.github/workflows/_docker-build.yml
index a3373c6e93fa..ac88502b748a 100644
--- a/.github/workflows/_docker-build.yml
+++ b/.github/workflows/_docker-build.yml
@@ -122,7 +122,7 @@ jobs:
touch "/tmp/digests/${digest#sha256:}"
- name: Upload digest
- uses: actions/upload-artifact@834a144ee995460fba8ed112a2fc961b36a5ec5a # v4.3.6
+ uses: actions/upload-artifact@50769540e7f4bd5e21e526ee35c689e35e0d6874 # v4.4.0
with:
name: digests-${{ steps.build-id.outputs.id }}-${{ matrix.platform.name }}
path: /tmp/digests/*
From 24abe659769ae16ee03d9e545359bb1469f3d919 Mon Sep 17 00:00:00 2001
From: Chong Shen Ng
Date: Mon, 2 Sep 2024 22:24:31 +0800
Subject: [PATCH 38/42] fix(framework) Fix parsing `executor_config` if present
(#4125)
---
src/py/flwr/superexec/app.py | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/src/py/flwr/superexec/app.py b/src/py/flwr/superexec/app.py
index 9510479ec8e1..67568b8378e0 100644
--- a/src/py/flwr/superexec/app.py
+++ b/src/py/flwr/superexec/app.py
@@ -56,7 +56,9 @@ def run_superexec() -> None:
address=address,
executor=_load_executor(args),
certificates=certificates,
- config=parse_config_args([args.executor_config]),
+ config=parse_config_args(
+ [args.executor_config] if args.executor_config else args.executor_config
+ ),
)
grpc_servers = [superexec_server]
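The conditional matters because, without it, omitting ``--executor-config`` made ``parse_config_args`` receive ``[None]`` and fail while processing it. With the fix, both invocations below should work (a sketch, assuming the ``flower-superexec`` entry point that backs the SuperExec images used in the Docker docs above):

```shell
# Previously crashed while parsing [None]; now starts with an empty config
$ flower-superexec --insecure

# Unchanged behavior when a config string is provided
$ flower-superexec --insecure --executor-config superlink=\"superlink:9091\"
```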
From 5c921888382c69838e41bc362579804907f201f7 Mon Sep 17 00:00:00 2001
From: Javier
Date: Mon, 2 Sep 2024 16:31:55 +0200
Subject: [PATCH 39/42] fix(examples) Update examples READMEs (#4117)
---
examples/federated-kaplan-meier-fitter/README.md | 2 +-
examples/federated-kaplan-meier-fitter/pyproject.toml | 2 +-
examples/flower-secure-aggregation/README.md | 2 +-
examples/flower-secure-aggregation/pyproject.toml | 2 +-
examples/flowertune-llm/pyproject.toml | 2 +-
examples/flowertune-vit/README.md | 2 +-
examples/flowertune-vit/pyproject.toml | 2 +-
examples/quickstart-fastai/pyproject.toml | 2 +-
examples/quickstart-huggingface/pyproject.toml | 2 +-
examples/quickstart-mlx/README.md | 2 +-
examples/quickstart-mlx/pyproject.toml | 2 +-
examples/quickstart-monai/README.md | 2 +-
examples/quickstart-monai/pyproject.toml | 2 +-
examples/quickstart-pytorch-lightning/README.md | 2 +-
examples/quickstart-pytorch-lightning/pyproject.toml | 2 +-
examples/quickstart-pytorch/README.md | 2 +-
examples/quickstart-pytorch/pyproject.toml | 2 +-
examples/quickstart-tensorflow/README.md | 2 +-
examples/sklearn-logreg-mnist/README.md | 2 +-
examples/sklearn-logreg-mnist/pyproject.toml | 2 +-
20 files changed, 20 insertions(+), 20 deletions(-)
diff --git a/examples/federated-kaplan-meier-fitter/README.md b/examples/federated-kaplan-meier-fitter/README.md
index 1964ec4e5653..cc68a331bbba 100644
--- a/examples/federated-kaplan-meier-fitter/README.md
+++ b/examples/federated-kaplan-meier-fitter/README.md
@@ -69,7 +69,7 @@ flwr run .
You can also override some of the settings for your `ClientApp` and `ServerApp` defined in `pyproject.toml`. For example:
```bash
-flwr run . --run-config num-server-rounds=5,learning-rate=0.05
+flwr run . --run-config "num-server-rounds=5 learning-rate=0.05"
```
You can also check that the results match the centralized version.
diff --git a/examples/federated-kaplan-meier-fitter/pyproject.toml b/examples/federated-kaplan-meier-fitter/pyproject.toml
index 47cb0a4ba286..159ccc15efe4 100644
--- a/examples/federated-kaplan-meier-fitter/pyproject.toml
+++ b/examples/federated-kaplan-meier-fitter/pyproject.toml
@@ -8,7 +8,7 @@ version = "1.0.0"
description = "Federated Kaplan Meier Fitter with Flower"
license = "Apache-2.0"
dependencies = [
- "flwr[simulation]>=1.10.0",
+ "flwr[simulation]>=1.11.0",
"flwr-datasets>=0.3.0",
"numpy>=1.23.2",
"pandas>=2.0.0",
diff --git a/examples/flower-secure-aggregation/README.md b/examples/flower-secure-aggregation/README.md
index 9e92aed01d9e..0a9056263db3 100644
--- a/examples/flower-secure-aggregation/README.md
+++ b/examples/flower-secure-aggregation/README.md
@@ -57,7 +57,7 @@ flwr run .
You can also override some of the settings for your `ClientApp` and `ServerApp` defined in `pyproject.toml`. For example
```bash
-flwr run . --run-config num-server-rounds=5,learning-rate=0.25
+flwr run . --run-config "num-server-rounds=5 learning-rate=0.25"
```
To adapt the example for practical usage, set `is-demo=false` as shown below. You might want to adjust the `num-shares` and `reconstruction-threshold` settings to suit your requirements. You can override those via `--run-config` as well.
diff --git a/examples/flower-secure-aggregation/pyproject.toml b/examples/flower-secure-aggregation/pyproject.toml
index d9be719653b0..6ac94253e839 100644
--- a/examples/flower-secure-aggregation/pyproject.toml
+++ b/examples/flower-secure-aggregation/pyproject.toml
@@ -8,7 +8,7 @@ version = "1.0.0"
description = "Secure Aggregation in Flower"
license = "Apache-2.0"
dependencies = [
- "flwr[simulation]>=1.10.0",
+ "flwr[simulation]>=1.11.0",
"flwr-datasets[vision]>=0.3.0",
"torch==2.2.1",
"torchvision==0.17.1",
diff --git a/examples/flowertune-llm/pyproject.toml b/examples/flowertune-llm/pyproject.toml
index 8171d7680620..20aa7267d9d5 100644
--- a/examples/flowertune-llm/pyproject.toml
+++ b/examples/flowertune-llm/pyproject.toml
@@ -8,7 +8,7 @@ version = "1.0.0"
description = "FlowerTune LLM: Federated LLM Fine-tuning with Flower"
license = "Apache-2.0"
dependencies = [
- "flwr-nightly[simulation]==1.11.0.dev20240826",
+ "flwr[simulation]==1.11.0",
"flwr-datasets>=0.3.0",
"trl==0.8.1",
"bitsandbytes==0.43.0",
diff --git a/examples/flowertune-vit/README.md b/examples/flowertune-vit/README.md
index 9e2b0fd6b079..48327880f412 100644
--- a/examples/flowertune-vit/README.md
+++ b/examples/flowertune-vit/README.md
@@ -59,7 +59,7 @@ flwr run .
You can also override some of the settings for your `ClientApp` and `ServerApp` defined in `pyproject.toml`. For example:
```bash
-flwr run . --run-config num-server-rounds=5,batch-size=64
+flwr run . --run-config "num-server-rounds=5 batch-size=64"
```
Run the project in the `local-simulation-gpu` federation that gives CPU and GPU resources to each `ClientApp`. By default, at most 5x`ClientApp` will run in parallel in the available GPU. You can tweak the degree of parallelism by adjusting the settings of this federation in the `pyproject.toml`.
diff --git a/examples/flowertune-vit/pyproject.toml b/examples/flowertune-vit/pyproject.toml
index 0f11dc54c81a..d0feabc14212 100644
--- a/examples/flowertune-vit/pyproject.toml
+++ b/examples/flowertune-vit/pyproject.toml
@@ -8,7 +8,7 @@ version = "1.0.0"
description = "Federated Finetuning of a Vision Transformer with Flower"
license = "Apache-2.0"
dependencies = [
- "flwr-nightly[simulation]==1.11.0.dev20240823",
+ "flwr[simulation]==1.11.0",
"flwr-datasets[vision]>=0.3.0",
"torch==2.2.1",
"torchvision==0.17.1",
diff --git a/examples/quickstart-fastai/pyproject.toml b/examples/quickstart-fastai/pyproject.toml
index 4d160bae0eec..25219ffcac4c 100644
--- a/examples/quickstart-fastai/pyproject.toml
+++ b/examples/quickstart-fastai/pyproject.toml
@@ -8,7 +8,7 @@ version = "1.0.0"
description = "Federated Learning with Fastai and Flower (Quickstart Example)"
license = "Apache-2.0"
dependencies = [
- "flwr[simulation]>=1.10.0",
+ "flwr[simulation]>=1.11.0",
"flwr-datasets[vision]>=0.3.0",
"fastai==2.7.14",
"torch==2.2.0",
diff --git a/examples/quickstart-huggingface/pyproject.toml b/examples/quickstart-huggingface/pyproject.toml
index af48b2429635..696f05b33ebf 100644
--- a/examples/quickstart-huggingface/pyproject.toml
+++ b/examples/quickstart-huggingface/pyproject.toml
@@ -12,7 +12,7 @@ authors = [
{ name = "Kaushik Amar Das", email = "kaushik.das@iiitg.ac.in" },
]
dependencies = [
- "flwr-nightly[simulation]==1.11.0.dev20240823",
+ "flwr[simulation]==1.11.0",
"flwr-datasets>=0.3.0",
"torch==2.4.0",
"transformers>=4.30.0,<5.0",
diff --git a/examples/quickstart-mlx/README.md b/examples/quickstart-mlx/README.md
index 95b9ccf605b5..ef28c3728279 100644
--- a/examples/quickstart-mlx/README.md
+++ b/examples/quickstart-mlx/README.md
@@ -58,7 +58,7 @@ flwr run .
You can also override some of the settings for your `ClientApp` and `ServerApp` defined in `pyproject.toml`. For example:
```bash
-flwr run . --run-config num-server-rounds=5,learning-rate=0.05
+flwr run . --run-config "num-server-rounds=5 learning-rate=0.05"
```
> \[!TIP\]
diff --git a/examples/quickstart-mlx/pyproject.toml b/examples/quickstart-mlx/pyproject.toml
index 36e39bcd6d78..459cac86f5d6 100644
--- a/examples/quickstart-mlx/pyproject.toml
+++ b/examples/quickstart-mlx/pyproject.toml
@@ -8,7 +8,7 @@ version = "1.0.0"
description = "Federated Learning with MLX and Flower (Quickstart Example)"
license = "Apache-2.0"
dependencies = [
- "flwr[simulation]>=1.10.0",
+ "flwr[simulation]>=1.11.0",
"flwr-datasets[vision]>=0.3.0",
"mlx==0.16.0",
"numpy==1.26.4",
diff --git a/examples/quickstart-monai/README.md b/examples/quickstart-monai/README.md
index c470a6a6c86f..8189a8e98406 100644
--- a/examples/quickstart-monai/README.md
+++ b/examples/quickstart-monai/README.md
@@ -70,7 +70,7 @@ flwr run . local-simulation-gpu
You can also override some of the settings for your `ClientApp` and `ServerApp` defined in `pyproject.toml`. For example:
```bash
-flwr run . --run-config num-server-rounds=5,batch-size=32
+flwr run . --run-config "num-server-rounds=5 batch-size=32"
```
### Run with the Deployment Engine
diff --git a/examples/quickstart-monai/pyproject.toml b/examples/quickstart-monai/pyproject.toml
index 6ecf5011d24f..daa92fc0387d 100644
--- a/examples/quickstart-monai/pyproject.toml
+++ b/examples/quickstart-monai/pyproject.toml
@@ -8,7 +8,7 @@ version = "1.0.0"
description = "Federated Learning with MONAI and Flower (Quickstart Example)"
license = "Apache-2.0"
dependencies = [
- "flwr-nightly[simulation]==1.11.0.dev20240823",
+ "flwr[simulation]==1.11.0",
"flwr-datasets[vision]>=0.3.0",
"monai==1.3.2",
"filelock==3.15.4",
diff --git a/examples/quickstart-pytorch-lightning/README.md b/examples/quickstart-pytorch-lightning/README.md
index e520be856962..0aa34db9af75 100644
--- a/examples/quickstart-pytorch-lightning/README.md
+++ b/examples/quickstart-pytorch-lightning/README.md
@@ -52,7 +52,7 @@ flwr run .
You can also override some of the settings for your `ClientApp` and `ServerApp` defined in `pyproject.toml`. For example:
```bash
-flwr run . --run-config num-server-rounds=5,max-epochs=2
+flwr run . --run-config "num-server-rounds=5 max-epochs=2"
```
### Run with the Deployment Engine
diff --git a/examples/quickstart-pytorch-lightning/pyproject.toml b/examples/quickstart-pytorch-lightning/pyproject.toml
index 482fc1356527..c5537ac6fcbe 100644
--- a/examples/quickstart-pytorch-lightning/pyproject.toml
+++ b/examples/quickstart-pytorch-lightning/pyproject.toml
@@ -8,7 +8,7 @@ version = "1.0.0"
description = "Federated Learning with PyTorch Lightning and Flower (Quickstart Example)"
license = "Apache-2.0"
dependencies = [
- "flwr[simulation]>=1.10.0",
+ "flwr[simulation]>=1.11.0",
"flwr-datasets[vision]>=0.3.0",
"pytorch-lightning<2.0.0; sys_platform == 'darwin'",
"pytorch-lightning==1.6.0; sys_platform != 'darwin'",
diff --git a/examples/quickstart-pytorch/README.md b/examples/quickstart-pytorch/README.md
index e37d49194b01..d07f83a7ea85 100644
--- a/examples/quickstart-pytorch/README.md
+++ b/examples/quickstart-pytorch/README.md
@@ -55,7 +55,7 @@ flwr run .
You can also override some of the settings for your `ClientApp` and `ServerApp` defined in `pyproject.toml`. For example:
```bash
-flwr run . --run-config num-server-rounds=5,learning-rate=0.05
+flwr run . --run-config "num-server-rounds=5 learning-rate=0.05"
```
> \[!TIP\]
diff --git a/examples/quickstart-pytorch/pyproject.toml b/examples/quickstart-pytorch/pyproject.toml
index 29414962ba6b..98f02626a429 100644
--- a/examples/quickstart-pytorch/pyproject.toml
+++ b/examples/quickstart-pytorch/pyproject.toml
@@ -8,7 +8,7 @@ version = "1.0.0"
description = "Federated Learning with PyTorch and Flower (Quickstart Example)"
license = "Apache-2.0"
dependencies = [
- "flwr[simulation]>=1.10.0",
+ "flwr[simulation]>=1.11.0",
"flwr-datasets[vision]>=0.3.0",
"torch==2.2.1",
"torchvision==0.17.1",
diff --git a/examples/quickstart-tensorflow/README.md b/examples/quickstart-tensorflow/README.md
index f1fa12a3393c..a162e756d799 100644
--- a/examples/quickstart-tensorflow/README.md
+++ b/examples/quickstart-tensorflow/README.md
@@ -56,7 +56,7 @@ flwr run .
You can also override some of the settings for your `ClientApp` and `ServerApp` defined in `pyproject.toml`. For example:
```bash
-flwr run . --run-config num-server-rounds=5,learning-rate=0.05
+flwr run . --run-config "num-server-rounds=5 learning-rate=0.05"
```
> \[!TIP\]
diff --git a/examples/sklearn-logreg-mnist/README.md b/examples/sklearn-logreg-mnist/README.md
index b56dbfc5dd3a..7c75e2ecfb85 100644
--- a/examples/sklearn-logreg-mnist/README.md
+++ b/examples/sklearn-logreg-mnist/README.md
@@ -55,7 +55,7 @@ flwr run .
You can also override some of the settings for your `ClientApp` and `ServerApp` defined in `pyproject.toml`. For example:
```bash
-flwr run . --run-config num-server-rounds=5,fraction-fit=0.25
+flwr run . --run-config "num-server-rounds=5 fraction-fit=0.25"
```
> \[!TIP\]
diff --git a/examples/sklearn-logreg-mnist/pyproject.toml b/examples/sklearn-logreg-mnist/pyproject.toml
index be1e4810b312..937f05e35eda 100644
--- a/examples/sklearn-logreg-mnist/pyproject.toml
+++ b/examples/sklearn-logreg-mnist/pyproject.toml
@@ -12,7 +12,7 @@ authors = [
{ name = "Kaushik Amar Das", email = "kaushik.das@iiitg.ac.in" },
]
dependencies = [
- "flwr[simulation]>=1.10.0",
+ "flwr[simulation]>=1.11.0",
"flwr-datasets[vision]>=0.3.0",
"numpy<2.0.0",
"scikit-learn~=1.2.2",
From daaa54e78982839d927fad9d59cb4b57d61d2137 Mon Sep 17 00:00:00 2001
From: Yan Gao
Date: Mon, 2 Sep 2024 15:37:49 +0100
Subject: [PATCH 40/42] fix(framework) Fix `FlowerTune` template (#4123)
---
src/py/flwr/cli/new/new.py | 1 -
1 file changed, 1 deletion(-)
diff --git a/src/py/flwr/cli/new/new.py b/src/py/flwr/cli/new/new.py
index 9f2d32ddf99c..520f683a47d8 100644
--- a/src/py/flwr/cli/new/new.py
+++ b/src/py/flwr/cli/new/new.py
@@ -196,7 +196,6 @@ def new(
f"{import_name}/client_app.py": {
"template": "app/code/flwr_tune/client_app.py.tpl"
},
- f"{import_name}/app.py": {"template": "app/code/flwr_tune/app.py.tpl"},
f"{import_name}/models.py": {
"template": "app/code/flwr_tune/models.py.tpl"
},
From 0f7c64ed2136f95de5afb472b8ed044f52d292d3 Mon Sep 17 00:00:00 2001
From: Taner Topal
Date: Mon, 2 Sep 2024 19:07:04 +0200
Subject: [PATCH 41/42] ci(*:skip) Add Javier as codeowner to /benchmarks
(#4126)
---
.github/CODEOWNERS | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/.github/CODEOWNERS b/.github/CODEOWNERS
index ce280c6bd2d4..ccf031344f67 100644
--- a/.github/CODEOWNERS
+++ b/.github/CODEOWNERS
@@ -7,7 +7,10 @@
README.md @jafermarq @tanertopal @danieljanes
# Flower Baselines
-/baselines @jafermarq @tanertopal @danieljanes
+/baselines @jafermarq @danieljanes
+
+# Flower Benchmarks
+/benchmarks @jafermarq @danieljanes
# Flower Datasets
/datasets @jafermarq @tanertopal @danieljanes
From 24e9af9c61bd55f21f0beb4ceba0e4fc2b93395d Mon Sep 17 00:00:00 2001
From: Yan Gao
Date: Mon, 2 Sep 2024 18:36:34 +0100
Subject: [PATCH 42/42] feat(benchmarks) Add LLM evaluation pipeline for
general NLP challenge (#3767)
Co-authored-by: jafermarq
Co-authored-by: Daniel J. Beutel
---
benchmarks/flowertune-llm/README.md | 54 +++----
.../flowertune-llm/_static/flower_llm.jpg | Bin 1627444 -> 0 bytes
.../flowertune-llm/_static/flower_llm.png | Bin 0 -> 119787 bytes
.../flowertune-llm/evaluation/README.md | 46 ++++++
.../evaluation/general-nlp/README.md | 63 ++++++++
.../evaluation/general-nlp/gen_judgement.py | 130 +++++++++++++++++
.../general-nlp/gen_model_answer.py | 135 ++++++++++++++++++
.../evaluation/general-nlp/requirements.txt | 6 +
.../evaluation/general-nlp/show_result.py | 36 +++++
9 files changed, 446 insertions(+), 24 deletions(-)
delete mode 100644 benchmarks/flowertune-llm/_static/flower_llm.jpg
create mode 100644 benchmarks/flowertune-llm/_static/flower_llm.png
create mode 100644 benchmarks/flowertune-llm/evaluation/README.md
create mode 100644 benchmarks/flowertune-llm/evaluation/general-nlp/README.md
create mode 100644 benchmarks/flowertune-llm/evaluation/general-nlp/gen_judgement.py
create mode 100644 benchmarks/flowertune-llm/evaluation/general-nlp/gen_model_answer.py
create mode 100644 benchmarks/flowertune-llm/evaluation/general-nlp/requirements.txt
create mode 100644 benchmarks/flowertune-llm/evaluation/general-nlp/show_result.py
diff --git a/benchmarks/flowertune-llm/README.md b/benchmarks/flowertune-llm/README.md
index 0cb69e7ff9c7..ed2f8821cd88 100644
--- a/benchmarks/flowertune-llm/README.md
+++ b/benchmarks/flowertune-llm/README.md
@@ -1,4 +1,4 @@
-
+
# FlowerTune LLM Leaderboard
@@ -9,39 +9,40 @@ Please follow the instructions to run and evaluate the federated LLMs.
## Create a new project
-As the first step, please register a Flower account on [Flower website](https://flower.ai/login).
-Assuming `flwr` package is already installed on your system (check [here](https://flower.ai/docs/framework/how-to-install-flower.html) for `flwr` installation).
-We provide a single-line command to create a new project directory based on your selected challenge:
+As the first step, please register for a Flower account on [flower.ai/login](https://flower.ai/login).
+Then, create a new Python environment and install Flower.
+
+> [!TIP]
+> We recommend using `pyenv` and the `virtualenv` plugin to create your environment. Other managers such as Conda would likely work too. Check the [documentation](https://flower.ai/docs/framework/how-to-install-flower.html) for alternative ways of installing Flower.
```shell
-flwr new --framework=flwrtune --username=your_flower_account
+pip install flwr
```
-Then you will see a prompt to ask your project name and the choice of LLM challenges from the set of general NLP, finance, medical and code.
-Type your project name and select your preferred challenge,
-and then a new project directory will be generated automatically.
-
-### Structure
+In the new environment, create a new Flower project using the `FlowerTune` template. You will be prompted for a project name, your username, and your choice of LLM challenge:
+```shell
+flwr new --framework=FlowerTune
+```
-After running `flwr new`, you will see a new directory generated with the following structure:
+The `flwr new` command will generate a directory with the following structure:
```bash
├── README.md # <- Instructions
- ├── pyproject.toml # <- Environment dependencies
+ ├── pyproject.toml # <- Environment dependencies and configs
└──
- ├── app.py # <- Flower ClientApp/ServerApp build
- ├── client.py # <- Flower client constructor
- ├── server.py # <- Sever-related functions
- ├── models.py # <- Model build
+ ├── client_app.py # <- Flower ClientApp build
├── dataset.py # <- Dataset and tokenizer build
- ├── conf/config.yaml # <- User configuration
- └── conf/static_config.yaml # <- Static configuration
+ ├── models.py # <- Model build
+ ├── server_app.py # <- Flower ServerApp build
+ └── strategy.py # <- Flower strategy build
```
This can serve as the starting point for you to build up your own federated LLM fine-tuning methods.
-Please note that any modification to the content of `conf/static_config.yaml` is strictly prohibited for those who wish to participate in the [LLM Leaderboard](https://flower.ai/benchmarks/llm-leaderboard).
-Otherwise, the submission will not be considered.
+
+> [!IMPORTANT]
+> Please note that if you intend to submit your project as an entry to the [LLM Leaderboard](https://flower.ai/benchmarks/llm-leaderboard), modifications to the `[tool.flwr.app.config.static]` and `[tool.flwr.federations.local-simulation]` sections in the `pyproject.toml` are not allowed and will invalidate the submission.
+
## Run FlowerTune LLM challenges
@@ -50,12 +51,17 @@ With a new project directory created, running a baseline challenge can be done b
1. Navigate inside the directory that you just created.
-2. Follow the `Environments setup` section of `README.md` in the project directory to install project dependencies.
+2. Follow the `Environments setup` section of `README.md` in the project directory to install the project dependencies.
3. Run the challenge as indicated in the `Running the challenge` section in the `README.md`.
-## Evaluate pre-trained LLMs
+## Evaluate fine-tuned LLMs
+
+Once the LLM fine-tuning has finished, evaluate the performance of your fine-tuned LLM
+following the `README.md` in the [`evaluation`](https://github.com/adap/flower/tree/main/benchmarks/flowertune-llm/evaluation) directory.
+
-After the LLM fine-tuning finished, evaluate the performance of your pre-trained LLMs
-following the `README.md` in `evaluation` directory.
+> [!NOTE]
+> If you have any questions about running the FlowerTune LLM challenges or evaluation, please feel free to post on the [Flower Discuss](https://discuss.flower.ai) forum,
+> or join our [Slack channel](https://flower.ai/join-slack/) to ask questions in the `#flowertune-llm-leaderboard` channel.
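Based on the files added by this patch (see the diffstat above), the general NLP evaluation flow likely begins along these lines; the exact flags live in the new `evaluation/general-nlp/README.md` and are not reproduced here:

```shell
# Enter the new evaluation directory and install its dependencies
cd benchmarks/flowertune-llm/evaluation/general-nlp
pip install -r requirements.txt

# Generate model answers, judge them, then aggregate the scores;
# check each script's options first (flags assumed, not shown in this patch)
python gen_model_answer.py --help
python gen_judgement.py --help
python show_result.py --help
```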
diff --git a/benchmarks/flowertune-llm/_static/flower_llm.jpg b/benchmarks/flowertune-llm/_static/flower_llm.jpg
deleted file mode 100644
index 96081d9c2ad1990ae72819f3f5eb69d969ebe971..0000000000000000000000000000000000000000
GIT binary patch
literal 0
HcmV?d00001
literal 1627444
zcmb@sWmFtb+c(&_yKB&47$gj?!3Ng>26q|Uog^d#cXvr}clY2<@DMym2pS+pi0$Mr
z_x-%j?m4?3wwUhv)m4{QHC5H8f3N@E14z}C)sz7=G&DdJ^#}akCs|NaRJ797(@|E_
zR6=zC0Le=mFE2NAQUGxG@bl4AQD8APF=fGd51;{<00AHZpxfB_ddchRY67V8Z~X`V
zw|rScaihvI-ydE7rT>3IWcCieb^w5;i{iGm_qB6J;WhxkEU@$P@dE(tKXmp$Kd(O+
zbpezop0K^L@I!l0^BT5JH6$%S@
z*gJcm;%7nO2k!2k_9$$P!eA7S7YhH;FK+uk^f>+xY-3~lADuQfj{m`b$AXeX8HPFg
zdb!yI|JC^a^XBg6kIL6y1%$d2I{B#Tp=wH0-qu|`HU3~i6h8N`)z(2_aumLGMn(5G
zM#p#ZS4E&OiXWZP&Q}qYOMn`M1?+8L8Ys+&!pa_Q+JDCT58W2&tgME@q$nKZ?5Adc
z!n7zH@wD>t0pq_(2`N0^qv-!(^R4f<^4j%IVaGuj02hDgbYaY5
z^#8&C*jN5*bRGX=P46#%Tv6lV{2!iw+9dD!vuvIq(Y3Io8O_2mx_0DOf1wQitM&;FMdz6$^{gTH_O?)@)K
zeGmYe#ZZrb_kU?%aR4AA0D#GEJAa>mzx84MsjyK0B}hXeOARb5oGJ!ncIZz5z0rfx&&d+|pb4Uhqdh=VLeoUkLo-3MMsq~-K=Vg?
zgcgn#kCuj(hgOVMjn<6TjW&cfg|>+H2JJ1{3EB;titTT
z9K~G5e2aO7`3s8(ixGB%NWZMD-bIRD+{Xvs~u|$^(-A>eZj`YrpD&N7DGLY
zCfF|653v)l3$YuqUt%v}A7FpR!NH-!;lYu?(ZR97@xzJ2$;GL|8N^w_Il{TcCBVIh
zD~PL%Yl`cE`wTY=w+43rcM10x_d6ab9w(k8o(`TpUNBxVUKw5w-YdK#yzlsA_+0oh
z_y+hc_+j`t_zn1D_*?iN2?z+-2qXyL1kMCc333RU2qp>M5_}^hCxj3x5}FhG6DAQ>
z622sSL->)1hzLX^M`TLmN0dZVO*BHZLv%|_LCjCAMr=>~ggBqLgLsMfk_3;0lSH1x
zk|cyAi=>t070Ed%9w~@ak<^AXlr*2ToAfp5XEF*hD47nKJ6Qr*4cQdgdvY9d5V;b$
zJ$VFqDftNb0R;vHJB1>J9Yr`r8O0dI5hXSy7o{4d3uQcIJ>@**6%{#^FqI)y5LG@^
zKh+*J1~rIUjoOVmnYx8~jrxv;iAJ8rfhLxwfo6&3D=h=99IXRw9Bm`*D(xK|3!M_3
zD_tsGC*3wZCOw26P9I2LL_bD<&Opf^#bCz}$I!yC!HCWXVbo)M$XLoa%lMgznMsAo
ziz%0Bgz21_npuw7g*k(HfcfMe#k~jjobP4a8@zYQLd7D_;>ME0GRktr%E+q9>d#ut
zI?wu@jf)Mz_KdBOZJQmRU7Q`sp20rMe#LQ*LyO}PM=i$&CmyFbrxRy3=LF|h5EsY@
z6blIJYZzA@>3V9U=m8g5*P9@u2gF@;LJp
z@+|UV@k;P|@Rsnt<|E{j=L_Jg<$KFd&9BM-jK7or@;=9X)BCCSC+`0e5D{<_C==L(
zl0(&@&!9cf8$pPmonXG;iV%^Ivd|NuE}?5-9$^RJBH?ur3K1=lXpv!&pQ2)-KB5hx
zCt@68)?x)>uf-|Fb;T3JCnc~X6eONV^htb^6qWRoY?1sR#V6$|RU`FYnoHV2x=i{&
zhE2vsrdVe00m}pH2QMD%$+F7Y$d<_N%dyMZ%T>r7%X7=S$k)kVDhMd}D6}hlRg_eG
ztoRa!0aJm+!RD37l@LmKN;}GI%8tsl%2z5PDi2j&s$!{XsHUpER%24LQ>#|HQWsNy
ztUjtisG+Zsr?IaI(e%;m(L&c!*Gkvg(&p6m(C*X$bkuaxb+&c6biH+Z;n;9pcs~4{
z9#rp<-h@7dzLkEB{ucuUgJgqE1Q_9u7&as^G&ig^{AvUk|W
zm#?lmu9dF8-HhBC-ErNm-Mc-=JzPCTJ()d&JQuxqyrR7Jyd}Ley{~*Ud@6j=d@X#t
z{3uZy`B{Ikf29B0fCmBj0k?sMfvrJgK^{S~!I0qC;NuYGkn)F^5A7ZfKVp3p_UP?n
z`NzefXrZ>D!%x_sL_9fqs`9id3_r{@Z0?!Bv$SWQ!i~avBA6ncMjS+{MAk+TNBKm(
zj+Tmk9)lU<95Wv)6q_6SE6zS{I$j_?EB;4vbK3O!mCcdC~cfS7NWa79xS)>hm$>2}r*<4(yg_ip2!=wAO@*xT8C{r$}YyMwbs-@_kA5$}lJ
zWgXu;u6{50zVAf&WZ~58^ytj%?E87t1=&TxCHS)KgWQMNE90x9YoF`iHwho0iJ9ca4fhNC>rg`d{_$P1L>=wH^6?E$HZIXs90=`oHDB2>p+e|DoXjVEdQ-
z|F!=85+K7x_rsOOKqCXt$anHPJBf@Ud`ku?bKNR-~wq
zQO;ptV4`9Cu?$5+#~>req+r9sW(9EAaVepKctY|D_#9MlJ!(Z8TW=a+pDdaagIS1j61gFe9_IePnTI`7O!-Guj^yP!9Z`9$;bP;Nsz<
z`VGlY41ernpktw-HcV(t=xF2^WNZ|yf&eCzU0wl851Ud5Zfk?1=;3K~(0SpK~M5MrS8l3|bmvcQL4{WOwG$tKL)
z4!FR2O=QfObz{fi2$b2FDB&k#DG2E`ul|az{xx^v{Ii|*pL<`@zvdMSK`jI8IFx)hwnp_iN71P(ZpCQ06EY8$t1l!I
zjS>v~`#n2KMCcb|S4^Z9(&W9}#tpvy>>o4YzlMZ98N|YNxJ$Cdr9=*cV*?=NN|b%!
z)T@ivAAh#9df!ZUpL0E--mCS#*$}wD_DZlzsusxbjFi`X83kcDs))WtgT>}@Ny>uv9<1QDb+%J845EQb
ziwwr3xCZ3|Hla<=y{a1_z3#auhr159clG_nCb|Y6t#%fAgJ7E<9}hlQDA1{?3K32C
zr7u!#gwSK4pEA=A=kJ^eW(N-L$W>NpSq~-??1VnuCuU7F3c~>B
zg)$U=weo=lCe0+(T7mUXNmwpe&`+ryG-^`j0cUDAPUetP#YeY2ovC`tFh<7#X#JPO
zAF$a_YWNXPXNI1Q2jy!m*tsdiP2uLxK$A{9eQtwuXN6YDPxzVPVeDV8nuB^
z`YgOTv>gGwR=)vAa+=!E@)T(aiNj!9jm{BH-xFua16inEI->(so``_v&B|yiv`nd1
zV0QL-B(qrJvvRc;rMjsNYg8kUXo#UZd(Mw6TLrL#)$QR=y@H2h&}1JwU#Be|8cD<7
zK--d7bt=<7%bo9JJ9o@YbZ4c?a$_efIWcHz#bRqGY;
zyWouuL56T%O)TuZszN|Y_y?rUU8k4nFO;fX`(A!32!b`0a#Fv_m>BD9(5(>@kt|Yp
z;7W!lshi5g7hdJfFlqqXUs`~lLOP_VT5q)R1*LuiNRatZP-k5n4bv;WQI(*By#dFc
zp8d+th0=u&Ywu!BNuG?$Elzy^daM7j;gETFD%OZ$%2+f%0A7{5F5BYxauj^MH64Xh
zvfrU1=|S9eL+qspc_n!&ZxLliv_FWy%JP5{S61j=_EwSEx7;?!mw45qZBi37@BuvG
zAhyjt_{&nX?2+VJ30I&P8%+)8F7b3y@(nJxQkEXAOV>At3%)`#OsQkpbwc(wq~26L
zli{+Zl~Z+2ZlY(i!HSWuNTQOckGujKm_(WbrQbz?B)OjFGLKFWhHUqb8m9-(ex(?`5Og3kVVRc@tMgR3VC7^`
z-Xz2K-&xzK@Tl4d73Nc}dd#N;w#+m-bI_Z(JC~Q0ofzLx{c^h!%a!R`D|Ztk#4Ncs
zI|h01?iuBw^z+oAP`$&U@Z8z#VW)tZ}Sq
z2>V5udBc*v(IHPKb)hr6G6s@8UzyK^i>@vAHiOEd~Xc>QEB#l
zO}I5BLd(H!Wj0{f_g)LqU+7KG>zNC(jjS`_v!Yo9njgYWO(|YbAitKn3g^h&{nznZ
zM4Y|026#14`s;4*b#BE}p~pQtvIlxeKZVqWXCz(>L(3|7>=?hVEgyXR4bYyK!vvU4
zqOFtYI?)zH9J&l9FW5BQy7a@V^lJ7pXDmzUpw&DYZU1S4cinEt`r+~zw0vNQJ*3R}T$ojP#1-06^NInRU-L~%f^M2($+Ov`fV@X1O?Mek
z8g+Uyh~~t$_lIds*zh)sLX3wih2*CZVKCB^Ukt_tP)FtjvBtceRO1NmYSeU+n5pf}S->j#VC5kv}!!kKXKYX}$y13V?
zPccq%k+dlF*|&wqrCCJsuymN6a_C5ApR4VMT9GWLiOIZj-U~xrW5Me5X3%WhZC=I4
zBJ~1{j+6bw@DV!d&yyc;)=Zj5H;7de9qkW+nKqU%m;gH#cyL_MeNmjG8wtU)dZ
z+vRrNX^XOoN!3+wlShitEqx?bPPKQ6)%!P_u!D5$Wy!$<(U@N|gHspSqb^Ru$0jEB
ztx2v-CR>jOsmLn^%o4kxp}MJ}p*6sIbU8LW{uvX^tXd>kxeodAN6qOF)wVtd#o<^5
zq0{VmokPIwYlm+EveH$d>eiYoeruZYItTj+^PSfR!Pm*1qn?(jT1)dQ1^WI0Ufrie
zS$feJVvgyF84W(Z8Ly`H*_6fcZFR!*jjkRITnqG$9GhwS{)m~iC$(j8aB{XgD!mW?
zC>}g=7w;T_Yn%~GD>^8P(ds)A`^L9K+Ny~o<~6ba7HEcODqda+Yg&9eqD{lh1*
zIc%dhQnAv3PWR9VKvbEq*AsEur6Ebz>fn6wTXw6{ASN^Cg}Ay^Bu{;6n=;Iha)ren
zzq%#SYiO`>2)CISbX8BEW2xIcR+)924Q?lUodC{}a%ZnirjVQv-F{T-xg9)Fon>RS
zPYAn|y0e62Jy+HIpp@#O9bdK&PgOrF?KTx6Ns-j~l>qjDDMW=h8o9+LIX{}~oG4(P
zsh+mmw|Qp4J;JJ4S2sI9hkV_)Gnl|fR5ZQ?HFZ
zgEB&y6JZ|o8}O{&q`F!dH;uF&ih&D3lix|}_Ji=DzNGfBA39H(-l-@@RdREw9}<@#=hI$FGo&8MLDv(CXD20#hvy5WDht&uafD-wL4_!VS;eWJR_BT)z5H=M#i>^_u%_(b
zO-&PcebgJf;&FPy(1$3N>50NdN55$UO
zMZX0kvzZU-35`F-FoC(W{l!Rn<7~p6b$yacu1)LGWpo8@^pIRmdV_@kC!E!p&%Vk
zF$#Y>rjl{VHztKC^q?9rYAI@}TE(_HX*N2jP1&<`e`aF9<|s-x30SQZdLchLFU9V>
zn)Ej$$f^)K^759Oh$riK;LB14_Ff^9IEE5wZ?Vf}FpRb4Q5&dFN3JQGVHHW>cZ5Se
z&_(Sh(r~I?;qP7|Vn$;+HK=Ug2#&E`S$?)P$@ht{Gx)&um3{|YmP04$8qiSbc=C$2
zr$dQ8@x#0;!_(0a8~EBiNw~^IG^KEOJzu$;A!{Tpn>C9rahDbqEsGA~VB4mvz{O{y
zi2kq~oaS+UQtp3vG~1oTEWM}}nattwk)lEDgP_r2`%1dyGQR$b>A@P27P6*{u0}jG
zB|vI*SHYmgFTJN4Erne1McM{jlz5S%dsVNvKdDPkIP%0PoI*gOvubEc%l@!djDk!^
zW?(7i(Yp#fZy|!mGqLPzaQnX8+6#z!)cX}Ap@QCPg8KZ!Lg}zy3A@~R6X-m$Gt&gR
z(TA!NJWmth!8)|Lt{GDwPH|6b2iiZVzB#c%dy%z8MxsBql6S`KlCzkG6VDQ2qhH#e
znXLNiU~A`*<74sjK`si$ecPgp23OaFeH~82GiK+=X#(;?pVx?yW+T>aE5gC;eA;=e
zgmEf6Y5VJb>r2)cN@wEH)3;P#PC}@C3XFZee5LS!->TD_jd&89B~iU$Ce}>#!(3p1
z?_{Jc(m7By45%hG(4JX;$-~xKX_x(InlbJD8Epb2ne*1vXyjwIEScRL#L6S}tt0Nf
zML{eMv)EPVZfcv#j^?Ky2%0Bxr_74N`?>;zn?jy0i6^Qq1H>NcpP_`fT*Bk0cY-9<
znP2ahE#Ag7s5dDNN~4a~)y+
z=B`}E;f5{hAEvpE`-%r{H2GH|xdt<)=4)KlWf6>Xwiwkl<=8_R*W;g@d!|bLkubeJ
zT39SpN;u=J=$FVXThk16g-a!wRVXqYXYNmFYKHllxSaM^fo2g;@D*X_rQSQmmdR?7
zCDpeoe3_mXf(fnZv=Xvav+U(&)eroim9TVKmtr@)t!bsT&3t%?{ey*xy`WP&M#J_Vz#xkTBGLhzSVcwgB+t>
z-$+t=1YFQEh-U`Bh~LT2aFKM%OzLej+HXD|B&FdZGdj}S;^*9zHaQX!;gsK4`5^F750QLU#fwk?B}x-=9^
zVeWX3TO`P1ZLulQ{QTwpFfl@zR?vI)n-&|JSR1DI539%8D{+m3)zC%F9Bsm!I+e@o
zW?$U*v(qBdX%n^p8|ZHd*`eHkhaSPylE@7;1)9A}U@BA{IV?~`}hGCnr2WAyhiznOKnoq%h_L>Mva7QnUZ%+)UH%dl8CAOQ$d
zT_5Bat+D$!Rhq_)m>CIU>6{r?XNXYMVX`-t2+b~@NF9sciG9+uhnxiTLpLi*mE!NZ
zGn?(7k5@S8IMe7LYiFVq!qB-D9`jZEoMRx&Qb{XQwhQvKBdXu9yTo~~EQD1dmg~qX
z-7|A(IyD%kHoTR_CHitrp9X|m%+(B1{t|R^T}HBLa+7)|=>^Kkt{GMxrrT{gb0DPN
zoUg?4qkj$+#*6);nI6te1Ni^{rm|duz$#{ul!f`OXT}xK$Z_9JdsK3YS
zr((N`_7v8c%xH5JyK-|e-{&|u=8UMgsJ?=D!?jN2dDLx;^^z6NlzO)JKHR*oimEY*
zg|Dr-iw2^7wo_rIpu4IRRcIG3mt`f(x^mUCv2?Kwr^IA(K|h?W8&)oP!h|J~j*no~
zG!^9nIZejZgI8=t0ic4^`(*>N$e=gj&4e>km&(+uR546p&
zY;oHj3CB;RmS&FJ1o05w9zFUR=JIBIMNigZVn5vp7IQW6DiIP8h!(Wd2{40
z{Y7@H8=4L+q^t15ZoPiZXbOa~J=?Y#4zpXFgJ{L#jlP-Abz5Y^(P2V`RnVK6{Ni-N
zw=*-(iJ{QZT25?GMG?(<(bmuVX}NPv>J4X;Wfc)lks}{WpS3s84Vbz{NK)S*tpyyL
z*XZYJ4Lm3LKKRSKjP~**2G*!;TZ|?|f7iVn*DJclEQV4^_f{o}Y{OtYIDNm4SOz<*
z@0%eKatou$ljRNPWS75<$N3RIvOvJ>b4OHnq`{wV7o@!!N)|1$DU2FEw>%FcShS4^
zOY1rgW{YCk_AnUicyIbRvZ&bTP;)w_*cIPY3ofNz?sh2YeM&N0bKD-usm#U0Vc{hj
zpi`n}ZF8TnAfS6OtK^D3k-SC-^-5wX_JccjLx=cRXf|#KueZN3x_{OKvCVOA(I}%=
zcN|QqjM~4%?j`Y>)n1pHFcaTHNy)@`4FE|6UNum$NoO`^@d!zI|drMCxylcUpLNd6RK9@{f_N-xQ(bg;|pJV&9
zof^s5&_`LHsgy+hj)*lqB1`&>&qL3no8!8lBBg>Aoc=sLlQN$=AC-x$QCwo>);ek?#9ec~k|sqR
z>hv@ZWeR-Sl%jW!JnstX0(%C`1gfU8H)G%rO@q0&PDYGkMRh4TjskM6wkYGY_$8-}
zHL*c{-0zU2OZv65IkVs-bAjo|T_k?p+wAg~1y`oByy{v}>zgZN(2CT9pS|v+uW6SB
znoEzkm89%5$wXwfhTDFAHCeX)Oj2S{P}S0xN`(%aY6zoxJ)jnUz_j`@FSk)fSW6-UFvNE)~9beAOnt_)10Z)Xieb)lNCy
z$R!a>q&6*$dIq;DxTqzaC4(Q(=Wadcuq|ocVKa!NUVkw`+`(3)B5=71O&Qt9|qGV!3417czD=*LQKo`Yi|N-C)J;`r<7Qi>_VWW_|;58J$|=1^OR0=3RFpITG6Y_Obo}O(
zWgb1|;=LGF55MHL?R(pm29Ktyo_4Ejan65f6d)YL`+mz>mRsCIHADn^u2>04@HD>Q
zV;Yp|K8Kyji-WO=2@i`wQp$V?Z9e3>6voY&T2${t)3EQ(lMxWNky`aYQRDk*=bSb2
zr|eaz6v;*(H)sLOt?<%DXNG?raN0aIeeq_#SldHy_(vMUZv{W0YS~Z;?J!
z#7#LJ>KX5M)H?l`P-3P$Lmfn!O%=guzg+HqZ-Z}?l0^K00P5Ql{CtHUu5`7*1S}=b
z<~CFx)x32wG1|+Mj_)>9xu~R8E99QnJ-8_z`1Rzn@U}joGA*nNqGityO~`N1AT1Hu
zouf-;wt`9{9fJG`VM;A@d99HiFDw{q`6A6ts0@Vx_{=kdlxYiE{v27gWcSst_6{R5
z20zI*x_L3kwlSihC-BcI=`O&`b11m|6PZEem1bd0a^*pJsmSI+Sj_
z%|96~MW3JZoQ_n!;Q@H+nO-Sh#$sB+y0T^fi7FavoPW=a!-Gy|Fq{Sy@nhlM(a3?F
zp!f{PPB?X)&5iGnU59aSM|H4CrR3}$CE#bElF*~)A=!bi_c&4
z&o^`wG$BYXegl*)8*a-?+vfE~w;t_pL`!%c^S*n30FghH&$cWU)!=j65?wKj1cj){
z_lg%SWBbO^iN7_Z$_U72$at7`tHs+JsG9LAsDP^8d(uWMUeiP?F4KV+4iCrN()C$R
zF`i{>bw5b^W}cn>eq1r7npUpa=K6_P`b~(cy69p|J{i612u_8vs)aoq6v_g#N-Uvd
zHJ#~6aMk@5&Ox@R?5v@`=BT<-1RJL2mOvO}<}8LgO2-1JFz(`$P3W`?JtSb#Zh%>b84N92DE*F%WC1+Z`;Yx=dre{29@?o
z2U|q>=diM6n%9Q+=-J6`%k`bA4GOOKeryX0cw9bqyXm7j(wpa0sm!%HJv&a}qEnz?
zh(l-i(LY50!4v4d4PC8}kS?W2x_wrz6ZtadL0|cS(VP=Ad59Az<
zvrVU0aGm6V-xeWU?$=R{i@10x(PMpLCvD3oXZS9q8tSo2sei*ZL_t^AD6|4Oir3R_
zd(IxJkQTShiEA^7Z}+s-O?!IMgsvIM&N5ZR=*e+OsdFEDy^$Ycqj0FyS=E=8AIn&B
zA5T7Md;YuURg&
znyee7Ip!_ZmXhsolsZpZF2|Sa<1xl{PL2>0VIBrfv*e(h$Z_Z3t;-tm+_<8gg+=Qr
zaklBZ?A5Mop+as0GCozMEO`)B^h~#jYoWPRBel~^qTSbEc90Ud_GebQNL#IfuH85*
z^~MKG@`SFJ;>XhOCi>v`o6%n_3WlFIf1_$@C?qM9d@0(R+wdBn?Gz)|AK@%cSgQ+|
zsPRs>f~hQkJJ`Glb~o81a_OsN+f$qVnet&z>uFxG)^d)w_8k%h?TP%pZ$V%CenH9H
zX>TkRD-8{549XG>laf>egPF@?EwaqixY#U*7e)y8ZZ{=+$S7?Jz6sh2dUKM#3b5^3
z;$gcwQR}I%)83cBB#s}=8D~?UI2icG#nkc1C&>HJ*6vMW5nb=vXt(#uf^lBwG$xK_tV7=1xL5ptFZ`Oyy!9l
z+RR}Kp8nu8Adx>W+ua4HOjSmLQjUWIQ6R;R<&c_2#!P?n02xDx5lz&<-7!cqIr{yP
z2l>oBf`kTxLQXytDf(*&S`T=1<$;erp?#h19Q(j3Wp_3o{wY%P8a|PG6BDOQz_uzX
zd>W`c+ux?fbKVo|+d&u@mVhea1m$XhaD9a4;t-j}l-0IRi)vz8z33bypvvb3TChosoM}K%sUtvh?gm^Y
zel|9sTM}%2!6;`|otZQ;9gIUcpb^R`NojH^8Ku?KV&{vy2HFxK{}lC3x)Sy>QdNlT
z0E4G(gH_LrW6kD*dn(!Wof}=tJ4K_3co~Lkc+0L5XQ=Vra@*I>ntO`&MM_`R&0&K)KpkhG6@ISc-5Ep
zFKXXZn%Tuz6iVBvEEAhdVA5S?e4U`6K9X%oXUK8X&mNY~B$XkkH)?%G9_4Y;wm+qP
z_m*_FEbc^X%8CP7Z*;VX!-&hnqMy#J@MdeLFIhrMyZq*Xz7l;)LtbZO;=#o11hZ(l
z9AuGcWiGb(vI?79vQ4YwkOEw7yaER*`M7HtwX^u-zp3V>D9Bg{C>uOx(|nlo*t{{=
z22a2p1Qj2YynVAOzfLgy5FtveIw3eadW0K{DP9%x&9u_a5FX=mo4t*t!nwt&CgGsZ
z&5qD=6-EQtvj#g%s7#+WG*tfvm?A~`jIaHgUvs3{1ll?~^!t#_ZCJh$!j_uea|&A;
z)yFlo{(yADbQWU}+fig0%6W~^nBa8)6$Ie+?^m7)SJ2D2fZb{kWCw~C#x&nZCVUzu
z?v8U(({%{G;xo3=EDzH37;)=bXX~}~Jbm&0YukRLxhJE2+~&rCf^4%ATu2hitDBUzW}JjYY7Gak67C)e2W89To@o-kVcl%*`~d!A+$14G=sn
zho*UX4er|KyIwZgZ~H+TaBkKXGV0~~r@Ht-&SWfj+{Sd!Jrj6>*NdM8Tmv>H0&mT(
zb)Ua`cr%%;sA70%5YCO$>MCay=M0IyZ$b^lPyE@0bliB7g)d`np@gAY4slhY!F!~Y
zn&?x+n88|gVEsuTW{$RhP=i=4oMQdt^_RF)+ogLB)-i7wa?^bb)SVX`Ctg$@)Hboy
z6@4lo_xxTk6=PnAn6aB$rJ{39X8H}BkT9;Rrcd)sCJu=Ip!wovqEEg>Y%Y{hnSXlF
zt@sX6A;z?^t)8LI@}bF~EX2}N=q*uCR%1YGIVfnW(&e2JM@d*j4iA7$IcJ6Cm+MV9
zxi^)P3s*g;B%zH7h@4SpG5(NIvyr?~5QOA-H%TxF(`=CWeuFRti*^Y*OlKikdMz@m
zCF%Fq<0&)ePFzPGhaf6T@#FHiKI=dDz?(%!Ti^wq-DI@SJaEbO`u5(T5TW_4EOH7i
z=A~czn!JuKCzQQlV!GnUmS!eRi3#xM$yR?YF
zUJo}`s#OV-#cn)Or-Nipu_`O_H0eMWUgLp=1c;ou8>sRq{IL~GrBh&5U$|>OF6pSm
z(`l?4^;rNNHjOKL4HR!89i-zyDDtUY_-5gNYka$ex5U8f_V%e3tv5fp>xyxPKYnb+
z>Nli^ZdgqOVHv$v=%d&>>*-vR#7wKmz*VEvI=Hag#E~hUl+$S3o#xw<%FAXh<5Z4W
z&8o(*q)PFg;Wr^rk=Z=F6dD+31XqSgfkDb5jR$6^Uy&KIMyj*K1RB?U1Er95g$%
zvQVFy`b|$|SS>tDTPYkAhfFAAM?1)~)YO9?W@%(o9y35C-*8`3Yq(Wp-+yW@GSH}P
zs3g{QrP6d!u38q+_|Bf1bN{j@-S`RBb2mC<3PLTC;c5b#q8sN-aK>P4}@4CMoC35kG+C^`jEz
zX%Aio98PDMTPGxmn<1Vzdvoq3L2yo7DrXYdmok$AX~gV|qVSpQSK*B1dM6-v&6gQV
zK`{1-HElIajz=_h1hlaP36oZ*zVWH@*PJ$s?qpQ|dMQMa45_ytdF%Z8RqBnJ&ix`G
zZ@lYIYXRiiNrIQ#&kQh{<-5%3r+k%}*pqwD4H)j7_M|fQE?0|R+mVHQc=$m}Kaqe6
zlv+iK|K<{+5m72QF-;x|L6^P|98SUSg@??CA^C*P*s<|ty(%pYm@IR*ld4W!<8Eqe
zS(oDs8ZZdBu8=if6O?@3(yt6DIR+L^s>(Ctb8-!JKMiSo|Lh=vZbrDYNY1`!(H<<{cqMD$I~G
z{mf;r_;2HTSw`q@jn6CPUqpW9qEH)pxp!|z^M#VDtiXQY`m9S(Kln2i+vjX!a)mBm
z*c>nT>>x?y;3cE-&7{I?o$#xo$T&9P_Ilr?v}EP+1$P@P`{=3-_12Du`CpA?#&0|)
z3t>62Ykg$~G*SgIY;D#&8u|0-kp-#Bx=Uvqe62p7^|W&gSuMQT8);r;LtyFeg
zT8enfYHTcg>5jK>r;5NqS7q!?Clr!#g}3?g3^C8!?$mfB3vFUg3UedW%j|OTx$DU*
z%Yz#yLjBV!67ebX!B|MJ+xu8VP0u4EIs2!l)hu*S;nzGrHE4Q9(m}^nsj>c~UF=3J
zp$=bXWK46OldL0pMn>2P49hz1Q4pN#ZorchmXBuO9_6*QQFE!|F|jUa?xG$4wp>
zgzt4DN|)j@AY%_7NDCyUGpirYp7P+$KVD5^)Gm1xyy8@D50zVX&nZ1i$nr-J-v(I!$gxZ8C{n2|
zv{MJMA^YATsWQ{dJ*DKiYv!03tCQW6%y_s$J3oUsc`)#&F-4Bjn-E#l~T&)cr^q
zq1>Rm)$xjuD5qntjf!RX_XBk$1aFttCkkaYV`ZStl&wNfr0gZ^VA?H8$@a36bv{cg
zouI6>eq1^?37X_{f1L2kXtuJ)-00i+E1GIY?i(z#
z)Sk6<5#aNOHShqecj$UIeyKvatGVtI>39Xb*8Fj!|I_3A!?mD}xCz&&Zd0p3bT
zRDvEZ>=e`yBdz4u6O=X)&WP9dy3l0sIZLvTn&31evbEfx>X2U6SbKbHxSbeZUgr{`
zMOu?y;!-pCx+AIKeVxzjMEl%|>2@vC=a~qj8AGSsK(5223L)}-W?pZW98l^#Y;`H~
zX4>$3_}h!w$&Ui461LuIe}A^usyXovl$YF0=M<;3l3bVZT6vc_Aa!EbeJ>ZrhESvLwqT&tYjb*P?b*tg?Z9
z#(UqjEN(;XPK?H1e&eb$y$(r8c`5HR!7o?H$D=f9-BRFO!DxRTq*6Jpw|+@b{aEp-
zplzgSLDhJre8enq-9Ux?Q&GEAX0t+*UaBw`phx0=V;!$T$YXmpB|4emD^|AfekXc
zui-(v(~5(_#qU?q)h2(I;#e4zy}k|&5?quZwoT>y0ndB}c9erl_~UHZj%!S4
z-Avx8j>h&y6Xw|Bx%OL+GL1pd8hsc~_e!yZ|zd8xmvq)Jz+E}@8}BxPTskj^d~j5N8$0T
zICo0wyZ&&OQ+_(~hceL`nbVpnF+x~}El{4jhfVpWK3cpx1wmhZ$z&-Itu=f_GAER^
zJ_NN+=?*Usa=Q=*;WzDhKQeh|erArmCl<70dg7TFbJAcom@xMlgOf}|v_1O47wC!T
zRF