From 89618b04ca6ac84303099014f8ee67697bdb9544 Mon Sep 17 00:00:00 2001
From: Chong Shen Ng
Date: Wed, 18 Sep 2024 18:25:20 +0100
Subject: [PATCH] docs(framework) Update Quickstart Tutorial documentation for
 TensorFlow with `flwr run` (#3338)

Co-authored-by: Javier
Co-authored-by: Charles Beauville
---
 doc/source/tutorial-quickstart-tensorflow.rst | 452 ++++++++++++------
 1 file changed, 294 insertions(+), 158 deletions(-)

diff --git a/doc/source/tutorial-quickstart-tensorflow.rst b/doc/source/tutorial-quickstart-tensorflow.rst
index bd63eb461d21..ffcd9efeb9bc 100644
--- a/doc/source/tutorial-quickstart-tensorflow.rst
+++ b/doc/source/tutorial-quickstart-tensorflow.rst
@@ -1,171 +1,307 @@
 .. _quickstart-tensorflow:
 
+#######################
+ Quickstart TensorFlow
+#######################
+
+In this tutorial we will learn how to train a Convolutional Neural
+Network on CIFAR-10 using the Flower framework and TensorFlow. First of
+all, it is recommended to create a virtual environment and run
+everything within a :doc:`virtualenv
+`.
+
+Let's use ``flwr new`` to create a complete Flower+TensorFlow project.
+It will generate all the files needed to run, by default with the
+Flower Simulation Engine, a federation of 10 nodes using `FedAvg
+`_.
+The dataset will be partitioned using Flower Datasets' `IidPartitioner
+`_.
+
+Now that we have a rough idea of what this example is about, let's get
+started. First, install Flower in your new environment:
+
+.. code:: shell
+
+   # In a new Python environment
+   $ pip install flwr
+
+Then, run the command below. You will be prompted to select one of the
+available templates (choose ``TensorFlow``), give a name to your
+project, and type in your developer name:
+
+.. code:: shell
+
+   $ flwr new
+
+After running it, you'll notice that a new directory with your project
+name has been created. It should have the following structure:
+
+.. code:: shell
+
+   <your-project-name>
+   ├── <your-project-name>
+   │   ├── __init__.py
+   │   ├── client_app.py   # Defines your ClientApp
+   │   ├── server_app.py   # Defines your ServerApp
+   │   └── task.py         # Defines your model, training and data loading
+   ├── pyproject.toml      # Project metadata like dependencies and configs
+   └── README.md
+
+If you haven't yet installed the project and its dependencies, you can
+do so by:
+
+.. code:: shell
+
+   # From the directory where your pyproject.toml is
+   $ pip install -e .
+
+To run the project, do:
+
+.. code:: shell
+
+   # Run with default arguments
+   $ flwr run .
+
+With default arguments you will see an output like this one:
+
+.. code:: shell
+
+   Loading project configuration...
+   Success
+   INFO : Starting Flower ServerApp, config: num_rounds=3, no round_timeout
+   INFO :
+   INFO : [INIT]
+   INFO : Using initial global parameters provided by strategy
+   INFO : Starting evaluation of initial global parameters
+   INFO : Evaluation returned no results (`None`)
+   INFO :
+   INFO : [ROUND 1]
+   INFO : configure_fit: strategy sampled 10 clients (out of 10)
+   INFO : aggregate_fit: received 10 results and 0 failures
+   WARNING : No fit_metrics_aggregation_fn provided
+   INFO : configure_evaluate: strategy sampled 10 clients (out of 10)
+   INFO : aggregate_evaluate: received 10 results and 0 failures
+   WARNING : No evaluate_metrics_aggregation_fn provided
+   INFO :
+   INFO : [ROUND 2]
+   INFO : configure_fit: strategy sampled 10 clients (out of 10)
+   INFO : aggregate_fit: received 10 results and 0 failures
+   INFO : configure_evaluate: strategy sampled 10 clients (out of 10)
+   INFO : aggregate_evaluate: received 10 results and 0 failures
+   INFO :
+   INFO : [ROUND 3]
+   INFO : configure_fit: strategy sampled 10 clients (out of 10)
+   INFO : aggregate_fit: received 10 results and 0 failures
+   INFO : configure_evaluate: strategy sampled 10 clients (out of 10)
+   INFO : aggregate_evaluate: received 10 results and 0 failures
+   INFO :
+   INFO : [SUMMARY]
+   INFO : Run finished 3 round(s) in 31.31s
+   INFO : History (loss, distributed):
+   INFO : round 1: 1.9066195368766785
+   INFO : round 2: 1.657227087020874
+   INFO : round 3: 1.559039831161499
+   INFO :
+
+You can also override the parameters defined in the
+``[tool.flwr.app.config]`` section in ``pyproject.toml`` like this:
+
+.. code:: shell
+
+   # Override some arguments
+   $ flwr run . --run-config "num-server-rounds=5 batch-size=16"
+
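+The default values for these keys live in the ``[tool.flwr.app.config]``
+section of ``pyproject.toml``. As an illustrative sketch (the exact keys
+and values in your generated project may differ), that section looks
+along these lines:
+
+.. code:: toml
+
+   # Illustrative sketch of [tool.flwr.app.config]; check the
+   # pyproject.toml generated by `flwr new` for the real values
+   [tool.flwr.app.config]
+   num-server-rounds = 3
+   local-epochs = 1
+   batch-size = 32
+   verbose = false
+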
+**********
+ The Data
+**********
+
+This tutorial uses `Flower Datasets `_
+to easily download and partition the `CIFAR-10` dataset. In this example
+you'll make use of the `IidPartitioner
+`_
+to generate ``num_partitions`` partitions. You can choose `other
+partitioners
+`_
+available in Flower Datasets. Each ``ClientApp`` will call this function
+to create the ``NumPy`` arrays that correspond to its data partition.
+
+.. code:: python
+
+   partitioner = IidPartitioner(num_partitions=num_partitions)
+   fds = FederatedDataset(
+       dataset="uoft-cs/cifar10",
+       partitioners={"train": partitioner},
+   )
+   partition = fds.load_partition(partition_id, "train")
+   partition.set_format("numpy")
+
+   # Divide data on each node: 80% train, 20% test
+   partition = partition.train_test_split(test_size=0.2)
+   x_train, y_train = partition["train"]["img"] / 255.0, partition["train"]["label"]
+   x_test, y_test = partition["test"]["img"] / 255.0, partition["test"]["label"]
+
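+If you want a non-IID federation instead, only the partitioner changes.
+The sketch below (not part of the generated template) uses the
+``DirichletPartitioner`` from Flower Datasets, where ``alpha`` controls
+how skewed the label distribution of each partition is:
+
+.. code:: python
+
+   from flwr_datasets import FederatedDataset
+   from flwr_datasets.partitioner import DirichletPartitioner
+
+   # Sketch: swap the IID partitioner for a Dirichlet-based one
+   partitioner = DirichletPartitioner(
+       num_partitions=10,     # one partition per client
+       partition_by="label",  # partition over the CIFAR-10 labels
+       alpha=0.5,             # smaller alpha -> more heterogeneous partitions
+   )
+   fds = FederatedDataset(
+       dataset="uoft-cs/cifar10",
+       partitioners={"train": partitioner},
+   )
+   partition = fds.load_partition(0, "train")
+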
+***********
+ The Model
+***********
+
+Next, we need a model. We define a simple Convolutional Neural Network
+(CNN), but feel free to replace it with a more sophisticated model if
+you'd like:
+
+.. code:: python
+
+   def load_model(learning_rate: float = 0.001):
+       # Define a simple CNN for CIFAR-10 and set Adam optimizer
+       model = keras.Sequential(
+           [
+               keras.Input(shape=(32, 32, 3)),
+               layers.Conv2D(32, kernel_size=(3, 3), activation="relu"),
+               layers.MaxPooling2D(pool_size=(2, 2)),
+               layers.Conv2D(64, kernel_size=(3, 3), activation="relu"),
+               layers.MaxPooling2D(pool_size=(2, 2)),
+               layers.Flatten(),
+               layers.Dropout(0.5),
+               layers.Dense(10, activation="softmax"),
+           ]
+       )
+       model.compile(
+           optimizer=keras.optimizers.Adam(learning_rate),
+           loss="sparse_categorical_crossentropy",
+           metrics=["accuracy"],
+       )
+       return model
+
+***************
+ The ClientApp
+***************
+
+With TensorFlow, we can use the built-in ``get_weights()`` and
+``set_weights()`` methods of the Keras model, which simplifies the
+implementation with Flower. The rest of the functionality in the
+ClientApp is directly inspired by the centralized case. The ``fit()``
+method in the client trains the model using the local dataset.
+Similarly, the ``evaluate()`` method is used to evaluate the model
+received on a held-out validation set that the client might have:
+
+.. code:: python
+
+   class FlowerClient(NumPyClient):
+       def __init__(self, model, data, epochs, batch_size, verbose):
+           self.model = model
+           self.x_train, self.y_train, self.x_test, self.y_test = data
+           self.epochs = epochs
+           self.batch_size = batch_size
+           self.verbose = verbose
+
+       def fit(self, parameters, config):
+           self.model.set_weights(parameters)
+           self.model.fit(
+               self.x_train,
+               self.y_train,
+               epochs=self.epochs,
+               batch_size=self.batch_size,
+               verbose=self.verbose,
+           )
+           return self.model.get_weights(), len(self.x_train), {}
+
+       def evaluate(self, parameters, config):
+           self.model.set_weights(parameters)
+           loss, accuracy = self.model.evaluate(self.x_test, self.y_test, verbose=0)
+           return loss, len(self.x_test), {"accuracy": accuracy}
+
+Finally, we can construct a ``ClientApp`` using the ``FlowerClient``
+defined above by means of a ``client_fn()`` callback. Note that the
+``context`` enables you to get access to hyperparameters defined in
+your ``pyproject.toml`` to configure the run. For example, in this
+tutorial we access the ``local-epochs`` setting to control the number
+of epochs a ``ClientApp`` will perform when running the ``fit()``
+method, in addition to ``batch-size``. You could define additional
+hyperparameters in ``pyproject.toml`` and access them here (see the
+sketch after the next code block).
+
+.. code:: python
+
+   def client_fn(context: Context):
+       # Load model and data
+       net = load_model()
+
+       partition_id = context.node_config["partition-id"]
+       num_partitions = context.node_config["num-partitions"]
+       data = load_data(partition_id, num_partitions)
+       epochs = context.run_config["local-epochs"]
+       batch_size = context.run_config["batch-size"]
+       verbose = context.run_config.get("verbose")
+
+       # Return Client instance
+       return FlowerClient(net, data, epochs, batch_size, verbose).to_client()
+
+
+   # Flower ClientApp
+   app = ClientApp(client_fn=client_fn)
+
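+For instance, to expose the learning rate as a run-time option, you
+could add a hypothetical ``learning-rate`` key under
+``[tool.flwr.app.config]`` in ``pyproject.toml`` (this key is not part
+of the generated template) and read it in ``client_fn()``:
+
+.. code:: python
+
+   def client_fn(context: Context):
+       # Hypothetical extra hyperparameter; requires adding
+       # `learning-rate = 0.001` to [tool.flwr.app.config] first
+       learning_rate = context.run_config["learning-rate"]
+       net = load_model(learning_rate=learning_rate)
+
+       partition_id = context.node_config["partition-id"]
+       num_partitions = context.node_config["num-partitions"]
+       data = load_data(partition_id, num_partitions)
+       epochs = context.run_config["local-epochs"]
+       batch_size = context.run_config["batch-size"]
+       verbose = context.run_config.get("verbose")
+
+       return FlowerClient(net, data, epochs, batch_size, verbose).to_client()
+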
+***************
+ The ServerApp
+***************
+
+To construct a ``ServerApp`` we define a ``server_fn()`` callback with
+an identical signature to that of ``client_fn()``, but the return type
+is `ServerAppComponents
+`_
+as opposed to a `Client
+`_.
+In this example we use the ``FedAvg`` strategy. We pass it a randomly
+initialized model that will serve as the global model to federate.
+
+.. code:: python
+
+   def server_fn(context: Context):
+       # Read from config
+       num_rounds = context.run_config["num-server-rounds"]
+
+       # Get parameters to initialize global model
+       parameters = ndarrays_to_parameters(load_model().get_weights())
+
+       # Define strategy
+       strategy = FedAvg(
+           fraction_fit=1.0,
+           fraction_evaluate=1.0,
+           min_available_clients=2,
+           initial_parameters=parameters,
+       )
+       config = ServerConfig(num_rounds=num_rounds)
+
+       return ServerAppComponents(strategy=strategy, config=config)
+
+
+   # Create ServerApp
+   app = ServerApp(server_fn=server_fn)
+
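+The run output above contains two warnings, ``No
+fit_metrics_aggregation_fn provided`` and ``No
+evaluate_metrics_aggregation_fn provided``. If you want the
+``accuracy`` reported by each client to show up in the run history, you
+can pass an aggregation callback to ``FedAvg`` via its
+``evaluate_metrics_aggregation_fn`` argument. A minimal sketch (the
+``weighted_average`` helper is ours, not template code):
+
+.. code:: python
+
+   from typing import List, Tuple
+
+   from flwr.common import Metrics
+
+
+   def weighted_average(metrics: List[Tuple[int, Metrics]]) -> Metrics:
+       # `metrics` is a list of (num_examples, metrics_dict) tuples, one
+       # per client; weight each client's accuracy by its dataset size
+       accuracies = [n * m["accuracy"] for n, m in metrics]
+       examples = [n for n, _ in metrics]
+       return {"accuracy": sum(accuracies) / sum(examples)}
+
+
+   # Inside server_fn, pass the callback when creating the strategy
+   strategy = FedAvg(
+       fraction_fit=1.0,
+       fraction_evaluate=1.0,
+       min_available_clients=2,
+       initial_parameters=parameters,
+       evaluate_metrics_aggregation_fn=weighted_average,
+   )
+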
-Quickstart TensorFlow
-=====================
-
-.. meta::
-    :description: Check out this Federated Learning quickstart tutorial for using Flower with TensorFlow to train a MobilNetV2 model on CIFAR-10.
-
-.. youtube:: FGTc2TQq7VM
-    :width: 100%
-
-Let's build a federated learning system in less than 20 lines of code!
-
-Before Flower can be imported we have to install it:
-
-.. code-block:: shell
-
-    $ pip install flwr
-
-Since we want to use the Keras API of TensorFlow (TF), we have to install TF as well:
-
-.. code-block:: shell
-
-    $ pip install tensorflow
-
-
-Flower Client
--------------
-
-Next, in a file called :code:`client.py`, import Flower and TensorFlow:
-
-.. code-block:: python
-
-    import flwr as fl
-    import tensorflow as tf
-
-We use the Keras utilities of TF to load CIFAR10, a popular colored image classification
-dataset for machine learning. The call to
-:code:`tf.keras.datasets.cifar10.load_data()` downloads CIFAR10, caches it locally,
-and then returns the entire training and test set as NumPy ndarrays.
-
-.. code-block:: python
-
-    (x_train, y_train), (x_test, y_test) = tf.keras.datasets.cifar10.load_data()
-
-Next, we need a model. For the purpose of this tutorial, we use MobilNetV2 with 10 output classes:
-
-.. code-block:: python
-
-    model = tf.keras.applications.MobileNetV2((32, 32, 3), classes=10, weights=None)
-    model.compile("adam", "sparse_categorical_crossentropy", metrics=["accuracy"])
-
-The Flower server interacts with clients through an interface called
-:code:`Client`. When the server selects a particular client for training, it
-sends training instructions over the network. The client receives those
-instructions and calls one of the :code:`Client` methods to run your code
-(i.e., to train the neural network we defined earlier).
-
-Flower provides a convenience class called :code:`NumPyClient` which makes it
-easier to implement the :code:`Client` interface when your workload uses Keras.
-The :code:`NumPyClient` interface defines three methods which can be
-implemented in the following way:
-
-.. code-block:: python
-
-    class CifarClient(fl.client.NumPyClient):
-        def get_parameters(self, config):
-            return model.get_weights()
-
-        def fit(self, parameters, config):
-            model.set_weights(parameters)
-            model.fit(x_train, y_train, epochs=1, batch_size=32, steps_per_epoch=3)
-            return model.get_weights(), len(x_train), {}
-
-        def evaluate(self, parameters, config):
-            model.set_weights(parameters)
-            loss, accuracy = model.evaluate(x_test, y_test)
-            return loss, len(x_test), {"accuracy": float(accuracy)}
-
-
-We can now create an instance of our class :code:`CifarClient` and add one line
-to actually run this client:
-
-.. code-block:: python
-
-    fl.client.start_client(server_address="[::]:8080", client=CifarClient().to_client())
-
-
-That's it for the client. We only have to implement :code:`Client` or
-:code:`NumPyClient` and call :code:`fl.client.start_client()`. If you implement
-a client of type :code:`NumPyClient` you'll need to first call its
-:code:`to_client()` method. The string :code:`"[::]:8080"` tells the client
-which server to connect to. In our case we can run the server and the client on
-the same machine, therefore we use :code:`"[::]:8080"`. If we run a truly
-federated workload with the server and clients running on different machines,
-all that needs to change is the :code:`server_address` we point the client at.
-
-
-Flower Server
--------------
-
-For simple workloads we can start a Flower server and leave all the
-configuration possibilities at their default values. In a file named
-:code:`server.py`, import Flower and start the server:
-
-.. code-block:: python
-
-    import flwr as fl
-
-    fl.server.start_server(config=fl.server.ServerConfig(num_rounds=3))
-
-
-Train the model, federated!
----------------------------
-
-With both client and server ready, we can now run everything and see federated
-learning in action. FL systems usually have a server and multiple clients. We
-therefore have to start the server first:
-
-.. code-block:: shell
-
-    $ python server.py
-
-Once the server is running we can start the clients in different terminals.
-Open a new terminal and start the first client:
-
-.. code-block:: shell
-
-    $ python client.py
-
-Open another terminal and start the second client:
-
-.. code-block:: shell
-
-    $ python client.py
-
-Each client will have its own dataset.
-
-You should now see how the training does in the very first terminal (the one
-that started the server):
-
-.. code-block:: shell
-
-    INFO flower 2021-02-25 14:15:46,741 | app.py:76 | Flower server running (insecure, 3 rounds)
-    INFO flower 2021-02-25 14:15:46,742 | server.py:72 | Getting initial parameters
-    INFO flower 2021-02-25 14:16:01,770 | server.py:74 | Evaluating initial parameters
-    INFO flower 2021-02-25 14:16:01,770 | server.py:87 | [TIME] FL starting
-    DEBUG flower 2021-02-25 14:16:12,341 | server.py:165 | fit_round: strategy sampled 2 clients (out of 2)
-    DEBUG flower 2021-02-25 14:21:17,235 | server.py:177 | fit_round received 2 results and 0 failures
-    DEBUG flower 2021-02-25 14:21:17,512 | server.py:139 | evaluate: strategy sampled 2 clients
-    DEBUG flower 2021-02-25 14:21:29,628 | server.py:149 | evaluate received 2 results and 0 failures
-    DEBUG flower 2021-02-25 14:21:29,696 | server.py:165 | fit_round: strategy sampled 2 clients (out of 2)
-    DEBUG flower 2021-02-25 14:25:59,917 | server.py:177 | fit_round received 2 results and 0 failures
-    DEBUG flower 2021-02-25 14:26:00,227 | server.py:139 | evaluate: strategy sampled 2 clients
-    DEBUG flower 2021-02-25 14:26:11,457 | server.py:149 | evaluate received 2 results and 0 failures
-    DEBUG flower 2021-02-25 14:26:11,530 | server.py:165 | fit_round: strategy sampled 2 clients (out of 2)
-    DEBUG flower 2021-02-25 14:30:43,389 | server.py:177 | fit_round received 2 results and 0 failures
-    DEBUG flower 2021-02-25 14:30:43,630 | server.py:139 | evaluate: strategy sampled 2 clients
-    DEBUG flower 2021-02-25 14:30:53,384 | server.py:149 | evaluate received 2 results and 0 failures
-    INFO flower 2021-02-25 14:30:53,384 | server.py:122 | [TIME] FL finished in 891.6143046000007
-    INFO flower 2021-02-25 14:30:53,385 | app.py:109 | app_fit: losses_distributed [(1, 2.3196680545806885), (2, 2.3202896118164062), (3, 2.1818180084228516)]
-    INFO flower 2021-02-25 14:30:53,385 | app.py:110 | app_fit: accuracies_distributed []
-    INFO flower 2021-02-25 14:30:53,385 | app.py:111 | app_fit: losses_centralized []
-    INFO flower 2021-02-25 14:30:53,385 | app.py:112 | app_fit: accuracies_centralized []
-    DEBUG flower 2021-02-25 14:30:53,442 | server.py:139 | evaluate: strategy sampled 2 clients
-    DEBUG flower 2021-02-25 14:31:02,848 | server.py:149 | evaluate received 2 results and 0 failures
-    INFO flower 2021-02-25 14:31:02,848 | app.py:121 | app_evaluate: federated loss: 2.1818180084228516
-    INFO flower 2021-02-25 14:31:02,848 | app.py:125 | app_evaluate: results [('ipv4:127.0.0.1:57158', EvaluateRes(loss=2.1818180084228516, num_examples=10000, accuracy=0.0, metrics={'accuracy': 0.21610000729560852})), ('ipv4:127.0.0.1:57160', EvaluateRes(loss=2.1818180084228516, num_examples=10000, accuracy=0.0, metrics={'accuracy': 0.21610000729560852}))]
-    INFO flower 2021-02-25 14:31:02,848 | app.py:127 | app_evaluate: failures []
-
-Congratulations! You've successfully built and run your first federated
-learning system. The full `source code `_ for this can be found in
-:code:`examples/quickstart-tensorflow/client.py`.
+Congratulations! You've successfully built and run your first federated
+learning system.
+
+.. note::
+
+   Check the source code of the extended version of this tutorial in
+   |quickstart_tf_link|_ in the Flower GitHub repository.
+
+.. |quickstart_tf_link| replace::
+   :code:`examples/quickstart-tensorflow`
+
+.. _quickstart_tf_link: https://github.com/adap/flower/blob/main/examples/quickstart-tensorflow
+
+****************
+ Video tutorial
+****************
+
+.. note::
+
+   The video shown below shows how to set up a TensorFlow + Flower
+   project using our previously recommended APIs. A new video tutorial
+   will be released that shows the new APIs (as the content above
+   does).
+
+.. meta::
+   :description: Check out this Federated Learning quickstart tutorial for using Flower with TensorFlow to train a CNN model on CIFAR-10.
+
+.. youtube:: FGTc2TQq7VM
+   :width: 100%