From d524b80d9decefb370d8d5353b42c86bf23569fd Mon Sep 17 00:00:00 2001 From: Charles Beauville Date: Tue, 13 Feb 2024 12:53:41 +0100 Subject: [PATCH 001/102] Fix minor doc errors (#2940) Co-authored-by: Taner Topal --- doc/source/.gitignore | 1 + doc/source/example-walkthrough-pytorch-mnist.rst | 2 +- doc/source/how-to-use-built-in-mods.rst | 2 +- doc/source/ref-changelog.md | 2 +- 4 files changed, 4 insertions(+), 3 deletions(-) create mode 100644 doc/source/.gitignore diff --git a/doc/source/.gitignore b/doc/source/.gitignore new file mode 100644 index 000000000000..e9341a1383b7 --- /dev/null +++ b/doc/source/.gitignore @@ -0,0 +1 @@ +ref-api/ diff --git a/doc/source/example-walkthrough-pytorch-mnist.rst b/doc/source/example-walkthrough-pytorch-mnist.rst index ab311813f5de..0be0af6e1ca6 100644 --- a/doc/source/example-walkthrough-pytorch-mnist.rst +++ b/doc/source/example-walkthrough-pytorch-mnist.rst @@ -76,7 +76,7 @@ Inside the server helper script *run-server.sh* you will find the following code We can go a bit deeper and see that :code:`server.py` simply launches a server that will coordinate three rounds of training. -Flower Servers are very customizable, but for simple workloads, we can start a server using the :ref:`start_server ` function and leave all the configuration possibilities at their default values, as seen below. +Flower Servers are very customizable, but for simple workloads, we can start a server using the `start_server `_ function and leave all the configuration possibilities at their default values, as seen below. .. code-block:: python diff --git a/doc/source/how-to-use-built-in-mods.rst b/doc/source/how-to-use-built-in-mods.rst index af7102de9d0b..3c13892356ea 100644 --- a/doc/source/how-to-use-built-in-mods.rst +++ b/doc/source/how-to-use-built-in-mods.rst @@ -86,4 +86,4 @@ Conclusion By following this guide, you have learned how to effectively use mods to enhance your ``ClientApp``'s functionality. 
Remember that the order of mods is crucial and affects how the input and output are processed. -Enjoy building more robust and flexible ``ClientApp``s with mods! +Enjoy building more robust and flexible ``ClientApp`` s with mods! diff --git a/doc/source/ref-changelog.md b/doc/source/ref-changelog.md index 78d1e0e491a4..e9282632410d 100644 --- a/doc/source/ref-changelog.md +++ b/doc/source/ref-changelog.md @@ -217,7 +217,7 @@ We would like to give our special thanks to all the contributors who made the ne - **Restructure Flower Docs** ([#1824](https://github.com/adap/flower/pull/1824), [#1865](https://github.com/adap/flower/pull/1865), [#1884](https://github.com/adap/flower/pull/1884), [#1887](https://github.com/adap/flower/pull/1887), [#1919](https://github.com/adap/flower/pull/1919), [#1922](https://github.com/adap/flower/pull/1922), [#1920](https://github.com/adap/flower/pull/1920), [#1923](https://github.com/adap/flower/pull/1923), [#1924](https://github.com/adap/flower/pull/1924), [#1962](https://github.com/adap/flower/pull/1962), [#2006](https://github.com/adap/flower/pull/2006), [#2133](https://github.com/adap/flower/pull/2133), [#2203](https://github.com/adap/flower/pull/2203), [#2215](https://github.com/adap/flower/pull/2215), [#2122](https://github.com/adap/flower/pull/2122), [#2223](https://github.com/adap/flower/pull/2223), [#2219](https://github.com/adap/flower/pull/2219), [#2232](https://github.com/adap/flower/pull/2232), [#2233](https://github.com/adap/flower/pull/2233), [#2234](https://github.com/adap/flower/pull/2234), [#2235](https://github.com/adap/flower/pull/2235), [#2237](https://github.com/adap/flower/pull/2237), [#2238](https://github.com/adap/flower/pull/2238), [#2242](https://github.com/adap/flower/pull/2242), [#2231](https://github.com/adap/flower/pull/2231), [#2243](https://github.com/adap/flower/pull/2243), [#2227](https://github.com/adap/flower/pull/2227)) - Much effort went into a completely restructured Flower docs experience. 
The documentation on [flower.dev/docs](flower.dev/docs) is now divided into Flower Framework, Flower Baselines, Flower Android SDK, Flower iOS SDK, and code example projects. + Much effort went into a completely restructured Flower docs experience. The documentation on [flower.dev/docs](https://flower.dev/docs) is now divided into Flower Framework, Flower Baselines, Flower Android SDK, Flower iOS SDK, and code example projects. - **Introduce Flower Swift SDK** ([#1858](https://github.com/adap/flower/pull/1858), [#1897](https://github.com/adap/flower/pull/1897)) From d213534fccfa57c6b963bf8f73ff7141eff73e7e Mon Sep 17 00:00:00 2001 From: Javier Date: Tue, 13 Feb 2024 18:16:33 +0100 Subject: [PATCH 002/102] Add `node_id` to `Metadata` (#2912) Co-authored-by: Heng Pan <134433891+panh99@users.noreply.github.com> --- src/py/flwr/client/grpc_client/connection.py | 1 + .../flwr/client/grpc_client/connection_test.py | 2 ++ .../client/message_handler/message_handler.py | 8 +++++--- .../message_handler/message_handler_test.py | 2 ++ .../mod/secure_aggregation/secaggplus_mod.py | 9 ++++++++- .../secure_aggregation/secaggplus_mod_test.py | 18 ++++++++++++++++-- src/py/flwr/client/mod/utils_test.py | 4 +++- src/py/flwr/common/message.py | 3 +++ src/py/flwr/common/serde.py | 2 ++ src/py/flwr/common/serde_test.py | 5 +++++ 10 files changed, 47 insertions(+), 7 deletions(-) diff --git a/src/py/flwr/client/grpc_client/connection.py b/src/py/flwr/client/grpc_client/connection.py index c7d8494b8fde..aaaf1fcc863c 100644 --- a/src/py/flwr/client/grpc_client/connection.py +++ b/src/py/flwr/client/grpc_client/connection.py @@ -173,6 +173,7 @@ def receive() -> Message: task_id=str(uuid.uuid4()), group_id="", ttl="", + node_id=0, task_type=task_type, ), content=recordset, diff --git a/src/py/flwr/client/grpc_client/connection_test.py b/src/py/flwr/client/grpc_client/connection_test.py index 7ba39568f0a4..e193c9484fff 100644 --- a/src/py/flwr/client/grpc_client/connection_test.py +++ 
b/src/py/flwr/client/grpc_client/connection_test.py @@ -48,6 +48,7 @@ run_id=0, task_id="", group_id="", + node_id=0, ttl="", task_type=TASK_TYPE_GET_PROPERTIES, ), @@ -60,6 +61,7 @@ run_id=0, task_id="", group_id="", + node_id=0, ttl="", task_type="reconnect", ), diff --git a/src/py/flwr/client/message_handler/message_handler.py b/src/py/flwr/client/message_handler/message_handler.py index c45d13d43f94..52dbdec7691b 100644 --- a/src/py/flwr/client/message_handler/message_handler.py +++ b/src/py/flwr/client/message_handler/message_handler.py @@ -92,6 +92,7 @@ def handle_control_message(message: Message) -> Tuple[Optional[Message], int]: run_id=0, task_id="", group_id="", + node_id=0, ttl="", task_type="reconnect", ), @@ -150,9 +151,10 @@ def handle_legacy_message_from_tasktype( # Return Message out_message = Message( metadata=Metadata( - run_id=0, # Non-user defined - task_id="", # Non-user defined - group_id="", # Non-user defined + run_id=0, + task_id="", + group_id="", + node_id=0, ttl="", task_type=task_type, ), diff --git a/src/py/flwr/client/message_handler/message_handler_test.py b/src/py/flwr/client/message_handler/message_handler_test.py index 288597b764f4..820de9bbe27c 100644 --- a/src/py/flwr/client/message_handler/message_handler_test.py +++ b/src/py/flwr/client/message_handler/message_handler_test.py @@ -124,6 +124,7 @@ def test_client_without_get_properties() -> None: run_id=0, task_id=str(uuid.uuid4()), group_id="", + node_id=0, ttl="", task_type=TASK_TYPE_GET_PROPERTIES, ), @@ -162,6 +163,7 @@ def test_client_with_get_properties() -> None: run_id=0, task_id=str(uuid.uuid4()), group_id="", + node_id=0, ttl="", task_type=TASK_TYPE_GET_PROPERTIES, ), diff --git a/src/py/flwr/client/mod/secure_aggregation/secaggplus_mod.py b/src/py/flwr/client/mod/secure_aggregation/secaggplus_mod.py index a6f59b736911..eba5dec658aa 100644 --- a/src/py/flwr/client/mod/secure_aggregation/secaggplus_mod.py +++ b/src/py/flwr/client/mod/secure_aggregation/secaggplus_mod.py 
@@ -207,7 +207,14 @@ def secaggplus_mod( # Return message return Message( - metadata=Metadata(0, "", "", "", TASK_TYPE_FIT), + metadata=Metadata( + run_id=0, + task_id="", + group_id="", + node_id=0, + ttl="", + task_type=TASK_TYPE_FIT, + ), content=RecordSet(configs={RECORD_KEY_CONFIGS: ConfigsRecord(res, False)}), ) diff --git a/src/py/flwr/client/mod/secure_aggregation/secaggplus_mod_test.py b/src/py/flwr/client/mod/secure_aggregation/secaggplus_mod_test.py index 2ca9748f8219..d0f461e7a8de 100644 --- a/src/py/flwr/client/mod/secure_aggregation/secaggplus_mod_test.py +++ b/src/py/flwr/client/mod/secure_aggregation/secaggplus_mod_test.py @@ -57,7 +57,14 @@ def get_test_handler( def empty_ffn(_: Message, _2: Context) -> Message: return Message( - metadata=Metadata(0, "", "", "", TASK_TYPE_FIT), + metadata=Metadata( + run_id=0, + task_id="", + group_id="", + node_id=0, + ttl="", + task_type=TASK_TYPE_FIT, + ), content=RecordSet(), ) @@ -65,7 +72,14 @@ def empty_ffn(_: Message, _2: Context) -> Message: def func(configs: Dict[str, ConfigsRecordValues]) -> Dict[str, ConfigsRecordValues]: in_msg = Message( - metadata=Metadata(0, "", "", "", TASK_TYPE_FIT), + metadata=Metadata( + run_id=0, + task_id="", + group_id="", + node_id=0, + ttl="", + task_type=TASK_TYPE_FIT, + ), content=RecordSet(configs={RECORD_KEY_CONFIGS: ConfigsRecord(configs)}), ) out_msg = app(in_msg, ctxt) diff --git a/src/py/flwr/client/mod/utils_test.py b/src/py/flwr/client/mod/utils_test.py index 00c0ef850dc0..07b38a250d6c 100644 --- a/src/py/flwr/client/mod/utils_test.py +++ b/src/py/flwr/client/mod/utils_test.py @@ -74,7 +74,9 @@ def app(message: Message, context: Context) -> Message: def _get_dummy_flower_message() -> Message: return Message( content=RecordSet(), - metadata=Metadata(run_id=0, task_id="", group_id="", ttl="", task_type="mock"), + metadata=Metadata( + run_id=0, task_id="", group_id="", node_id=0, ttl="", task_type="mock" + ), ) diff --git a/src/py/flwr/common/message.py 
b/src/py/flwr/common/message.py index 1a971749966e..ada4a617b60d 100644 --- a/src/py/flwr/common/message.py +++ b/src/py/flwr/common/message.py @@ -33,6 +33,8 @@ class Metadata: group_id : str An identifier for grouping tasks. In some settings this is used as the FL round. + node_id : int + An identifier for the node running a task. ttl : str Time-to-live for this task. task_type : str @@ -43,6 +45,7 @@ class Metadata: run_id: int task_id: str group_id: str + node_id: int ttl: str task_type: str diff --git a/src/py/flwr/common/serde.py b/src/py/flwr/common/serde.py index 3b8e4c3a1c2c..5a8c5c753136 100644 --- a/src/py/flwr/common/serde.py +++ b/src/py/flwr/common/serde.py @@ -563,6 +563,7 @@ def message_from_taskins(taskins: TaskIns) -> Message: run_id=taskins.run_id, task_id=taskins.task_id, group_id=taskins.group_id, + node_id=taskins.task.consumer.node_id, ttl=taskins.task.ttl, task_type=taskins.task.task_type, ) @@ -592,6 +593,7 @@ def message_from_taskres(taskres: TaskRes) -> Message: run_id=taskres.run_id, task_id=taskres.task_id, group_id=taskres.group_id, + node_id=taskres.task.consumer.node_id, ttl=taskres.task.ttl, task_type=taskres.task.task_type, ) diff --git a/src/py/flwr/common/serde_test.py b/src/py/flwr/common/serde_test.py index c30f24d3700c..1c36f2171149 100644 --- a/src/py/flwr/common/serde_test.py +++ b/src/py/flwr/common/serde_test.py @@ -219,6 +219,7 @@ def metadata(self) -> Metadata: run_id=self.rng.randint(0, 1 << 30), task_id=self.get_str(64), group_id=self.get_str(30), + node_id=self.rng.randint(0, 1 << 63), ttl=self.get_str(10), task_type=self.get_str(10), ) @@ -309,6 +310,7 @@ def test_message_to_and_from_taskins() -> None: run_id=0, task_id="", group_id="", + node_id=metadata.node_id, ttl=metadata.ttl, task_type=metadata.task_type, ), @@ -320,6 +322,7 @@ def test_message_to_and_from_taskins() -> None: taskins.run_id = metadata.run_id taskins.task_id = metadata.task_id taskins.group_id = metadata.group_id + taskins.task.consumer.node_id = 
metadata.node_id deserialized = message_from_taskins(taskins) # Assert @@ -337,6 +340,7 @@ def test_message_to_and_from_taskres() -> None: run_id=0, task_id="", group_id="", + node_id=metadata.node_id, ttl=metadata.ttl, task_type=metadata.task_type, ), @@ -348,6 +352,7 @@ def test_message_to_and_from_taskres() -> None: taskres.run_id = metadata.run_id taskres.task_id = metadata.task_id taskres.group_id = metadata.group_id + taskres.task.consumer.node_id = metadata.node_id deserialized = message_from_taskres(taskres) # Assert From d1c0481e254a3d8d7948bd8cbe2ea3bf1a5748cc Mon Sep 17 00:00:00 2001 From: Javier Date: Tue, 13 Feb 2024 19:39:42 +0100 Subject: [PATCH 003/102] Update mt-pytorch example (#2933) --- examples/mt-pytorch/README.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/examples/mt-pytorch/README.md b/examples/mt-pytorch/README.md index 721a26ed814d..0f676044ee90 100644 --- a/examples/mt-pytorch/README.md +++ b/examples/mt-pytorch/README.md @@ -36,17 +36,17 @@ flower-superlink --insecure In a new terminal window, start the first long-running Flower client: ```bash -flower-client client:app --insecure +flower-client-app client:app --insecure ``` In yet another new terminal window, start the second long-running Flower client: ```bash -flower-client client:app --insecure +flower-client-app client:app --insecure ``` ## Start the driver ```bash -python driver.py +python start_driver.py ``` From db9a527573f2f8beb034e421d76ae702bf1cadc5 Mon Sep 17 00:00:00 2001 From: "Daniel J. 
Beutel" Date: Wed, 14 Feb 2024 12:46:51 +0100 Subject: [PATCH 004/102] Create ServerApp (#2931) --- doc/source/how-to-use-built-in-mods.rst | 2 +- src/py/flwr/client/clientapp.py | 4 +- src/py/flwr/server/__init__.py | 2 + src/py/flwr/server/serverapp.py | 95 +++++++++++++++++++++++++ src/py/flwr/server/typing.py | 23 ++++++ 5 files changed, 123 insertions(+), 3 deletions(-) create mode 100644 src/py/flwr/server/serverapp.py create mode 100644 src/py/flwr/server/typing.py diff --git a/doc/source/how-to-use-built-in-mods.rst b/doc/source/how-to-use-built-in-mods.rst index 3c13892356ea..341139175074 100644 --- a/doc/source/how-to-use-built-in-mods.rst +++ b/doc/source/how-to-use-built-in-mods.rst @@ -86,4 +86,4 @@ Conclusion By following this guide, you have learned how to effectively use mods to enhance your ``ClientApp``'s functionality. Remember that the order of mods is crucial and affects how the input and output are processed. -Enjoy building more robust and flexible ``ClientApp`` s with mods! +Enjoy building a more robust and flexible ``ClientApp`` with mods! 
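Aside: the guide edited above stresses that the order of mods matters. As a self-contained sketch of the composition pattern behind the `make_ffn` helper referenced in the `ClientApp` diff — using plain callables and dicts as stand-ins for Flower's `Message` and `Context` types, not the real API — mod chaining can be illustrated like this:

```python
from typing import Callable, Dict, List

# Stand-ins for flwr.common.Message / Context (assumptions, not the real types)
Message = Dict[str, list]
Context = dict
App = Callable[[Message, Context], Message]
Mod = Callable[[Message, Context, App], Message]


def make_ffn(ffn: App, mods: List[Mod]) -> App:
    """Wrap `ffn` with `mods` so the first mod in the list runs outermost."""
    for mod in reversed(mods):
        # Bind the current `ffn` and `mod` via default args to avoid late binding
        def wrapped(msg: Message, ctx: Context, nxt: App = ffn, m: Mod = mod) -> Message:
            return m(msg, ctx, nxt)

        ffn = wrapped
    return ffn


def tracing_mod(name: str) -> Mod:
    """Build a mod that records when it runs, on the way in and on the way out."""

    def mod(msg: Message, ctx: Context, nxt: App) -> Message:
        msg.setdefault("trace", []).append(f"{name}-in")
        out = nxt(msg, ctx)
        out["trace"].append(f"{name}-out")
        return out

    return mod


def app(msg: Message, ctx: Context) -> Message:
    msg.setdefault("trace", []).append("app")
    return msg


wrapped = make_ffn(app, [tracing_mod("a"), tracing_mod("b")])
result = wrapped({}, {})
print(result["trace"])  # ['a-in', 'b-in', 'app', 'b-out', 'a-out']
```

The trace shows why order is crucial: each mod sees the request before every mod listed after it, and sees the response after them, in reverse.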
diff --git a/src/py/flwr/client/clientapp.py b/src/py/flwr/client/clientapp.py index 51c912890c7e..2d8e9e652dba 100644 --- a/src/py/flwr/client/clientapp.py +++ b/src/py/flwr/client/clientapp.py @@ -72,12 +72,12 @@ def ffn( self._call = make_ffn(ffn, mods if mods is not None else []) def __call__(self, message: Message, context: Context) -> Message: - """.""" + """Execute `ClientApp`.""" return self._call(message, context) class LoadClientAppError(Exception): - """.""" + """Error when trying to load `ClientApp`.""" def load_client_app(module_attribute_str: str) -> ClientApp: diff --git a/src/py/flwr/server/__init__.py b/src/py/flwr/server/__init__.py index 84c24b4bc2c1..1472bcaf0893 100644 --- a/src/py/flwr/server/__init__.py +++ b/src/py/flwr/server/__init__.py @@ -26,6 +26,7 @@ from .client_manager import SimpleClientManager as SimpleClientManager from .history import History as History from .server import Server as Server +from .serverapp import ServerApp as ServerApp __all__ = [ "ClientManager", @@ -36,6 +37,7 @@ "run_server_app", "run_superlink", "Server", + "ServerApp", "ServerConfig", "SimpleClientManager", "start_server", diff --git a/src/py/flwr/server/serverapp.py b/src/py/flwr/server/serverapp.py new file mode 100644 index 000000000000..1a89093621e1 --- /dev/null +++ b/src/py/flwr/server/serverapp.py @@ -0,0 +1,95 @@ +# Copyright 2024 Flower Labs GmbH. All Rights Reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+# ============================================================================== +"""Flower ServerApp.""" + + +import importlib +from typing import Optional, cast + +from flwr.common.context import Context +from flwr.server.driver.driver import Driver +from flwr.server.strategy import Strategy + +from .app import ServerConfig +from .client_manager import ClientManager +from .server import Server + + +class ServerApp: + """Flower ServerApp.""" + + def __init__( + self, + server: Optional[Server] = None, + config: Optional[ServerConfig] = None, + strategy: Optional[Strategy] = None, + client_manager: Optional[ClientManager] = None, + ) -> None: + self.server = server + self.config = config + self.strategy = strategy + self.client_manager = client_manager + + def __call__(self, driver: Driver, context: Context) -> None: + """Execute `ServerApp`.""" + + +class LoadServerAppError(Exception): + """Error when trying to load `ServerApp`.""" + + +def load_server_app(module_attribute_str: str) -> ServerApp: + """Load the `ServerApp` object specified in a module attribute string. + + The module/attribute string should have the form <module>:<attribute>. Valid + examples include `server:app` and `project.package.module:wrapper.app`. It + must refer to a module on the PYTHONPATH, the module needs to have the specified + attribute, and the attribute must be of type `ServerApp`. 
+ """ + module_str, _, attributes_str = module_attribute_str.partition(":") + if not module_str: + raise LoadServerAppError( + f"Missing module in {module_attribute_str}", + ) from None + if not attributes_str: + raise LoadServerAppError( + f"Missing attribute in {module_attribute_str}", + ) from None + + # Load module + try: + module = importlib.import_module(module_str) + except ModuleNotFoundError: + raise LoadServerAppError( + f"Unable to load module {module_str}", + ) from None + + # Recursively load attribute + attribute = module + try: + for attribute_str in attributes_str.split("."): + attribute = getattr(attribute, attribute_str) + except AttributeError: + raise LoadServerAppError( + f"Unable to load attribute {attributes_str} from module {module_str}", + ) from None + + # Check type + if not isinstance(attribute, ServerApp): + raise LoadServerAppError( + f"Attribute {attributes_str} is not of type {ServerApp}", + ) from None + + return cast(ServerApp, attribute) diff --git a/src/py/flwr/server/typing.py b/src/py/flwr/server/typing.py new file mode 100644 index 000000000000..728121c2eddf --- /dev/null +++ b/src/py/flwr/server/typing.py @@ -0,0 +1,23 @@ +# Copyright 2024 Flower Labs GmbH. All Rights Reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+# ============================================================================== +"""Custom types for Flower servers.""" + + +from typing import Callable + +from flwr.common.context import Context +from flwr.server.driver import Driver + +ServerAppCallable = Callable[[Driver, Context], None] From 87ad890da6d9a5f947c6b21e46678f5b10c87100 Mon Sep 17 00:00:00 2001 From: Robert Steiner Date: Wed, 14 Feb 2024 13:29:34 +0100 Subject: [PATCH 005/102] Update artifact bucket ID (#2945) --- .github/workflows/e2e.yml | 7 ++++--- .github/workflows/framework-draft-release.yml | 21 +++++++++++-------- .github/workflows/framework-release.yml | 13 +++++++----- 3 files changed, 24 insertions(+), 17 deletions(-) diff --git a/.github/workflows/e2e.yml b/.github/workflows/e2e.yml index 065e79fff9ab..c89896dd9d6b 100644 --- a/.github/workflows/e2e.yml +++ b/.github/workflows/e2e.yml @@ -14,6 +14,7 @@ concurrency: env: FLWR_TELEMETRY_ENABLED: 0 + ARTIFACT_BUCKET: artifact.flower.ai jobs: wheel: @@ -43,7 +44,7 @@ jobs: echo "SHORT_SHA=$sha_short" >> "$GITHUB_OUTPUT" [ -z "${{ github.head_ref }}" ] && dir="${{ github.ref_name }}" || dir="pr/${{ github.head_ref }}" echo "DIR=$dir" >> "$GITHUB_OUTPUT" - aws s3 cp --content-disposition "attachment" --cache-control "no-cache" ./ s3://artifact.flower.dev/py/$dir/$sha_short --recursive + aws s3 cp --content-disposition "attachment" --cache-control "no-cache" ./ s3://${{ env.ARTIFACT_BUCKET }}/py/$dir/$sha_short --recursive outputs: whl_path: ${{ steps.upload.outputs.WHL_PATH }} short_sha: ${{ steps.upload.outputs.SHORT_SHA }} @@ -123,7 +124,7 @@ jobs: - name: Install Flower wheel from artifact store if: ${{ github.repository == 'adap/flower' && !github.event.pull_request.head.repo.fork && github.actor != 'dependabot[bot]' }} run: | - python -m pip install https://artifact.flower.dev/py/${{ needs.wheel.outputs.dir }}/${{ needs.wheel.outputs.short_sha }}/${{ needs.wheel.outputs.whl_path }} + python -m pip install https://${{ env.ARTIFACT_BUCKET 
}}/py/${{ needs.wheel.outputs.dir }}/${{ needs.wheel.outputs.short_sha }}/${{ needs.wheel.outputs.whl_path }} - name: Download dataset if: ${{ matrix.dataset }} run: python -c "${{ matrix.dataset }}" @@ -164,7 +165,7 @@ jobs: - name: Install Flower wheel from artifact store if: ${{ github.repository == 'adap/flower' && !github.event.pull_request.head.repo.fork && github.actor != 'dependabot[bot]' }} run: | - python -m pip install https://artifact.flower.dev/py/${{ needs.wheel.outputs.dir }}/${{ needs.wheel.outputs.short_sha }}/${{ needs.wheel.outputs.whl_path }} + python -m pip install https://${{ env.ARTIFACT_BUCKET }}/py/${{ needs.wheel.outputs.dir }}/${{ needs.wheel.outputs.short_sha }}/${{ needs.wheel.outputs.whl_path }} - name: Cache Datasets uses: actions/cache@v3 with: diff --git a/.github/workflows/framework-draft-release.yml b/.github/workflows/framework-draft-release.yml index 959d17249765..91a89953cf96 100644 --- a/.github/workflows/framework-draft-release.yml +++ b/.github/workflows/framework-draft-release.yml @@ -5,6 +5,9 @@ on: tags: - "v*.*.*" +env: + ARTIFACT_BUCKET: artifact.flower.ai + jobs: publish: if: ${{ github.repository == 'adap/flower' }} @@ -26,16 +29,16 @@ jobs: run: | tag_name=$(echo "${GITHUB_REF_NAME}" | cut -c2-) echo "TAG_NAME=$tag_name" >> "$GITHUB_ENV" - + wheel_name="flwr-${tag_name}-py3-none-any.whl" echo "WHEEL_NAME=$wheel_name" >> "$GITHUB_ENV" - + tar_name="flwr-${tag_name}.tar.gz" echo "TAR_NAME=$tar_name" >> "$GITHUB_ENV" - wheel_url="https://artifact.flower.dev/py/main/${GITHUB_SHA::7}/${wheel_name}" - tar_url="https://artifact.flower.dev/py/main/${GITHUB_SHA::7}/${tar_name}" - + wheel_url="https://${{ env.ARTIFACT_BUCKET }}/py/main/${GITHUB_SHA::7}/${wheel_name}" + tar_url="https://${{ env.ARTIFACT_BUCKET }}/py/main/${GITHUB_SHA::7}/${tar_name}" + curl $wheel_url --output $wheel_name curl $tar_url --output $tar_name - name: Upload wheel @@ -44,14 +47,14 @@ jobs: AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }} 
AWS_SECRET_ACCESS_KEY: ${{ secrets. AWS_SECRET_ACCESS_KEY }} run: | - aws s3 cp --content-disposition "attachment" --cache-control "no-cache" ./${{ env.WHEEL_NAME }} s3://artifact.flower.dev/py/release/v${{ env.TAG_NAME }}/${{ env.WHEEL_NAME }} - aws s3 cp --content-disposition "attachment" --cache-control "no-cache" ./${{ env.TAR_NAME }} s3://artifact.flower.dev/py/release/v${{ env.TAG_NAME }}/${{ env.TAR_NAME }} - + aws s3 cp --content-disposition "attachment" --cache-control "no-cache" ./${{ env.WHEEL_NAME }} s3://${{ env.ARTIFACT_BUCKET }}/py/release/v${{ env.TAG_NAME }}/${{ env.WHEEL_NAME }} + aws s3 cp --content-disposition "attachment" --cache-control "no-cache" ./${{ env.TAR_NAME }} s3://${{ env.ARTIFACT_BUCKET }}/py/release/v${{ env.TAG_NAME }}/${{ env.TAR_NAME }} + - name: Generate body run: | ./dev/get-latest-changelog.sh > body.md cat body.md - + - name: Release uses: softprops/action-gh-release@v1 with: diff --git a/.github/workflows/framework-release.yml b/.github/workflows/framework-release.yml index f052d3a4a928..04b68fd38af9 100644 --- a/.github/workflows/framework-release.yml +++ b/.github/workflows/framework-release.yml @@ -3,11 +3,14 @@ name: Publish `flwr` release on PyPI on: release: types: [released] - + concurrency: group: ${{ github.workflow }}-${{ github.ref == 'refs/heads/main' && github.run_id || github.event.pull_request.number || github.ref }} cancel-in-progress: true - + +env: + ARTIFACT_BUCKET: artifact.flower.ai + jobs: publish: if: ${{ github.repository == 'adap/flower' }} @@ -28,11 +31,11 @@ jobs: run: | TAG_NAME=$(echo "${GITHUB_REF_NAME}" | cut -c2-) - wheel_name="flwr-${TAG_NAME}-py3-none-any.whl" + wheel_name="flwr-${TAG_NAME}-py3-none-any.whl" tar_name="flwr-${TAG_NAME}.tar.gz" - wheel_url="https://artifact.flower.dev/py/release/v${TAG_NAME}/${wheel_name}" - tar_url="https://artifact.flower.dev/py/release/v${TAG_NAME}/${tar_name}" + wheel_url="https://${{ env.ARTIFACT_BUCKET }}/py/release/v${TAG_NAME}/${wheel_name}" + 
tar_url="https://${{ env.ARTIFACT_BUCKET }}/py/release/v${TAG_NAME}/${tar_name}" mkdir -p dist From cbbb813a7e46c165788e5c405a90dbfb6565730b Mon Sep 17 00:00:00 2001 From: "Daniel J. Beutel" Date: Wed, 14 Feb 2024 14:53:21 +0100 Subject: [PATCH 006/102] Load ServerApp (#2934) --- src/py/flwr/server/__init__.py | 2 +- src/py/flwr/server/app.py | 44 +++++++++++++++-------------- src/py/flwr/server/driver/app.py | 3 +- src/py/flwr/server/server_config.py | 31 ++++++++++++++++++++ src/py/flwr/server/serverapp.py | 2 +- src/py/flwr/simulation/app.py | 3 +- 6 files changed, 60 insertions(+), 25 deletions(-) create mode 100644 src/py/flwr/server/server_config.py diff --git a/src/py/flwr/server/__init__.py b/src/py/flwr/server/__init__.py index 1472bcaf0893..b0f95f903811 100644 --- a/src/py/flwr/server/__init__.py +++ b/src/py/flwr/server/__init__.py @@ -16,7 +16,6 @@ from . import driver, strategy -from .app import ServerConfig as ServerConfig from .app import run_driver_api as run_driver_api from .app import run_fleet_api as run_fleet_api from .app import run_server_app as run_server_app @@ -26,6 +25,7 @@ from .client_manager import SimpleClientManager as SimpleClientManager from .history import History as History from .server import Server as Server +from .server_config import ServerConfig as ServerConfig from .serverapp import ServerApp as ServerApp __all__ = [ diff --git a/src/py/flwr/server/app.py b/src/py/flwr/server/app.py index 6e2f1f7dc88d..66adcbdb6b85 100644 --- a/src/py/flwr/server/app.py +++ b/src/py/flwr/server/app.py @@ -19,7 +19,6 @@ import importlib.util import sys import threading -from dataclasses import dataclass from logging import DEBUG, ERROR, INFO, WARN from os.path import isfile from pathlib import Path @@ -43,17 +42,20 @@ from flwr.proto.fleet_pb2_grpc import ( # pylint: disable=E0611 add_FleetServicer_to_server, ) -from flwr.server.client_manager import ClientManager, SimpleClientManager -from flwr.server.history import History -from 
flwr.server.server import Server -from flwr.server.strategy import FedAvg, Strategy -from flwr.server.superlink.driver.driver_servicer import DriverServicer -from flwr.server.superlink.fleet.grpc_bidi.grpc_server import ( + +from .client_manager import ClientManager, SimpleClientManager +from .history import History +from .server import Server +from .server_config import ServerConfig +from .serverapp import ServerApp, load_server_app +from .strategy import FedAvg, Strategy +from .superlink.driver.driver_servicer import DriverServicer +from .superlink.fleet.grpc_bidi.grpc_server import ( generic_create_grpc_server, start_grpc_server, ) -from flwr.server.superlink.fleet.grpc_rere.fleet_servicer import FleetServicer -from flwr.server.superlink.state import StateFactory +from .superlink.fleet.grpc_rere.fleet_servicer import FleetServicer +from .superlink.state import StateFactory ADDRESS_DRIVER_API = "0.0.0.0:9091" ADDRESS_FLEET_API_GRPC_RERE = "0.0.0.0:9092" @@ -63,18 +65,6 @@ DATABASE = ":flwr-in-memory-state:" -@dataclass -class ServerConfig: - """Flower server config. - - All attributes have default values which allows users to configure just the ones - they care about. 
-    """
-
-    num_rounds: int = 1
-    round_timeout: Optional[float] = None
-
-
 def run_server_app() -> None:
     """Run Flower server app."""
     event(EventType.RUN_SERVER_APP_ENTER)
@@ -126,6 +116,18 @@ def run_server_app() -> None:
 
     log(WARN, "Not implemented: run_server_app")
 
+    server_app_dir = args.dir
+    if server_app_dir is not None:
+        sys.path.insert(0, server_app_dir)
+
+    def _load() -> ServerApp:
+        server_app: ServerApp = load_server_app(getattr(args, "server-app"))
+        return server_app
+
+    server_app = _load()
+
+    log(DEBUG, "server_app: `%s`", server_app)
+
     event(EventType.RUN_SERVER_APP_LEAVE)
diff --git a/src/py/flwr/server/driver/app.py b/src/py/flwr/server/driver/app.py
index ae47c58f4e9c..b47454b7b4b6 100644
--- a/src/py/flwr/server/driver/app.py
+++ b/src/py/flwr/server/driver/app.py
@@ -26,10 +26,11 @@
 from flwr.common.address import parse_address
 from flwr.common.logger import log
 from flwr.proto import driver_pb2  # pylint: disable=E0611
-from flwr.server.app import ServerConfig, init_defaults, run_fl
+from flwr.server.app import init_defaults, run_fl
 from flwr.server.client_manager import ClientManager
 from flwr.server.history import History
 from flwr.server.server import Server
+from flwr.server.server_config import ServerConfig
 from flwr.server.strategy import Strategy
 
 from .driver_client_proxy import DriverClientProxy
diff --git a/src/py/flwr/server/server_config.py b/src/py/flwr/server/server_config.py
new file mode 100644
index 000000000000..823f832da6f8
--- /dev/null
+++ b/src/py/flwr/server/server_config.py
@@ -0,0 +1,31 @@
+# Copyright 2024 Flower Labs GmbH. All Rights Reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+# ==============================================================================
+"""Flower ServerConfig."""
+
+
+from dataclasses import dataclass
+from typing import Optional
+
+
+@dataclass
+class ServerConfig:
+    """Flower server config.
+
+    All attributes have default values which allows users to configure just the ones
+    they care about.
+    """
+
+    num_rounds: int = 1
+    round_timeout: Optional[float] = None
diff --git a/src/py/flwr/server/serverapp.py b/src/py/flwr/server/serverapp.py
index 1a89093621e1..1ffa087719dc 100644
--- a/src/py/flwr/server/serverapp.py
+++ b/src/py/flwr/server/serverapp.py
@@ -22,9 +22,9 @@
 from flwr.server.driver.driver import Driver
 from flwr.server.strategy import Strategy
 
-from .app import ServerConfig
 from .client_manager import ClientManager
 from .server import Server
+from .server_config import ServerConfig
 
 
 class ServerApp:
diff --git a/src/py/flwr/simulation/app.py b/src/py/flwr/simulation/app.py
index 6a18a258ac60..b159042588c9 100644
--- a/src/py/flwr/simulation/app.py
+++ b/src/py/flwr/simulation/app.py
@@ -29,9 +29,10 @@
 from flwr.common import EventType, event
 from flwr.common.logger import log
 from flwr.server import Server
-from flwr.server.app import ServerConfig, init_defaults, run_fl
+from flwr.server.app import init_defaults, run_fl
 from flwr.server.client_manager import ClientManager
 from flwr.server.history import History
+from flwr.server.server_config import ServerConfig
 from flwr.server.strategy import Strategy
 from flwr.simulation.ray_transport.ray_actor import (
     DefaultActor,

From 314492cc64eff9689d98c2d22eba348d801c4d28 Mon Sep 17 00:00:00 2001
From: Heng Pan <134433891+panh99@users.noreply.github.com>
Date: Wed, 14 Feb 2024 14:23:59 +0000
Subject: [PATCH 007/102] Rename `task_*` to `message_*` (#2944)

---
 src/py/flwr/client/clientapp.py               |  4 +-
 src/py/flwr/client/grpc_client/connection.py  | 40 +++++++++----------
 .../client/grpc_client/connection_test.py     | 12 +++---
 .../client/message_handler/message_handler.py | 32 +++++++--------
 .../message_handler/message_handler_test.py   | 20 +++++-----
 .../mod/secure_aggregation/secaggplus_mod.py  |  8 ++--
 .../secure_aggregation/secaggplus_mod_test.py | 10 ++---
 src/py/flwr/client/mod/utils_test.py          |  2 +-
 src/py/flwr/common/constant.py                |  8 ++--
 src/py/flwr/common/message.py                 | 20 +++++-----
 src/py/flwr/common/serde.py                   | 12 +++---
 src/py/flwr/common/serde_test.py              | 16 ++++----
 .../flwr/server/driver/driver_client_proxy.py | 16 ++++----
 .../server/driver/driver_client_proxy_test.py | 18 ++++-----
 14 files changed, 109 insertions(+), 109 deletions(-)

diff --git a/src/py/flwr/client/clientapp.py b/src/py/flwr/client/clientapp.py
index 2d8e9e652dba..cfc59c9298ed 100644
--- a/src/py/flwr/client/clientapp.py
+++ b/src/py/flwr/client/clientapp.py
@@ -19,7 +19,7 @@
 from typing import List, Optional, cast
 
 from flwr.client.message_handler.message_handler import (
-    handle_legacy_message_from_tasktype,
+    handle_legacy_message_from_msgtype,
 )
 from flwr.client.mod.utils import make_ffn
 from flwr.client.typing import ClientFn, Mod
@@ -63,7 +63,7 @@ def ffn(
             message: Message,
             context: Context,
         ) -> Message:  # pylint: disable=invalid-name
-            out_message = handle_legacy_message_from_tasktype(
+            out_message = handle_legacy_message_from_msgtype(
                 client_fn=client_fn, message=message, context=context
             )
             return out_message
diff --git a/src/py/flwr/client/grpc_client/connection.py b/src/py/flwr/client/grpc_client/connection.py
index aaaf1fcc863c..e6d21963fcbf 100644
--- a/src/py/flwr/client/grpc_client/connection.py
+++ b/src/py/flwr/client/grpc_client/connection.py
@@ -27,10 +27,10 @@
 from flwr.common import serde
 from flwr.common.configsrecord import ConfigsRecord
 from flwr.common.constant import (
-    TASK_TYPE_EVALUATE,
-    TASK_TYPE_FIT,
-    TASK_TYPE_GET_PARAMETERS,
-    TASK_TYPE_GET_PROPERTIES,
+    MESSAGE_TYPE_EVALUATE,
+    MESSAGE_TYPE_FIT,
+    MESSAGE_TYPE_GET_PARAMETERS,
+    MESSAGE_TYPE_GET_PROPERTIES,
 )
 from flwr.common.grpc import create_channel
 from flwr.common.logger import log
@@ -133,33 +133,33 @@ def receive() -> Message:
         # ServerMessage proto --> *Ins --> RecordSet
         field = proto.WhichOneof("msg")
-        task_type = ""
+        message_type = ""
         if field == "get_properties_ins":
             recordset = compat.getpropertiesins_to_recordset(
                 serde.get_properties_ins_from_proto(proto.get_properties_ins)
             )
-            task_type = TASK_TYPE_GET_PROPERTIES
+            message_type = MESSAGE_TYPE_GET_PROPERTIES
         elif field == "get_parameters_ins":
             recordset = compat.getparametersins_to_recordset(
                 serde.get_parameters_ins_from_proto(proto.get_parameters_ins)
             )
-            task_type = TASK_TYPE_GET_PARAMETERS
+            message_type = MESSAGE_TYPE_GET_PARAMETERS
         elif field == "fit_ins":
             recordset = compat.fitins_to_recordset(
                 serde.fit_ins_from_proto(proto.fit_ins), False
             )
-            task_type = TASK_TYPE_FIT
+            message_type = MESSAGE_TYPE_FIT
         elif field == "evaluate_ins":
             recordset = compat.evaluateins_to_recordset(
                 serde.evaluate_ins_from_proto(proto.evaluate_ins), False
             )
-            task_type = TASK_TYPE_EVALUATE
+            message_type = MESSAGE_TYPE_EVALUATE
         elif field == "reconnect_ins":
             recordset = RecordSet()
             recordset.set_configs(
                 "config", ConfigsRecord({"seconds": proto.reconnect_ins.seconds})
             )
-            task_type = "reconnect"
+            message_type = "reconnect"
         else:
             raise ValueError(
                 "Unsupported instruction in ServerMessage, "
@@ -170,44 +170,44 @@ def receive() -> Message:
         return Message(
             metadata=Metadata(
                 run_id=0,
-                task_id=str(uuid.uuid4()),
+                message_id=str(uuid.uuid4()),
                 group_id="",
                 ttl="",
                 node_id=0,
-                task_type=task_type,
+                message_type=message_type,
             ),
             content=recordset,
         )
 
     def send(message: Message) -> None:
-        # Retrieve RecordSet and task_type
+        # Retrieve RecordSet and message_type
         recordset = message.content
-        task_type = message.metadata.task_type
+        message_type = message.metadata.message_type
 
         # RecordSet --> *Res --> *Res proto -> ClientMessage proto
-        if task_type == TASK_TYPE_GET_PROPERTIES:
+        if message_type == MESSAGE_TYPE_GET_PROPERTIES:
             getpropres = compat.recordset_to_getpropertiesres(recordset)
             msg_proto = ClientMessage(
                 get_properties_res=serde.get_properties_res_to_proto(getpropres)
             )
-        elif task_type == TASK_TYPE_GET_PARAMETERS:
+        elif message_type == MESSAGE_TYPE_GET_PARAMETERS:
             getparamres = compat.recordset_to_getparametersres(recordset, False)
             msg_proto = ClientMessage(
                 get_parameters_res=serde.get_parameters_res_to_proto(getparamres)
             )
-        elif task_type == TASK_TYPE_FIT:
+        elif message_type == MESSAGE_TYPE_FIT:
             fitres = compat.recordset_to_fitres(recordset, False)
             msg_proto = ClientMessage(fit_res=serde.fit_res_to_proto(fitres))
-        elif task_type == TASK_TYPE_EVALUATE:
+        elif message_type == MESSAGE_TYPE_EVALUATE:
             evalres = compat.recordset_to_evaluateres(recordset)
             msg_proto = ClientMessage(evaluate_res=serde.evaluate_res_to_proto(evalres))
-        elif task_type == "reconnect":
+        elif message_type == "reconnect":
             reason = cast(Reason.ValueType, recordset.get_configs("config")["reason"])
             msg_proto = ClientMessage(
                 disconnect_res=ClientMessage.DisconnectRes(reason=reason)
             )
         else:
-            raise ValueError(f"Invalid task type: {task_type}")
+            raise ValueError(f"Invalid task type: {message_type}")
 
         # Send ClientMessage proto
         return queue.put(msg_proto, block=False)
diff --git a/src/py/flwr/client/grpc_client/connection_test.py b/src/py/flwr/client/grpc_client/connection_test.py
index e193c9484fff..127e27356f64 100644
--- a/src/py/flwr/client/grpc_client/connection_test.py
+++ b/src/py/flwr/client/grpc_client/connection_test.py
@@ -25,7 +25,7 @@
 from flwr.common import recordset_compat as compat
 from flwr.common.configsrecord import ConfigsRecord
-from flwr.common.constant import TASK_TYPE_GET_PROPERTIES
+from flwr.common.constant import MESSAGE_TYPE_GET_PROPERTIES
 from flwr.common.message import Message, Metadata
 from flwr.common.recordset import RecordSet
 from flwr.common.typing import Code, GetPropertiesRes, Status
@@ -46,11 +46,11 @@
 MESSAGE_GET_PROPERTIES = Message(
     metadata=Metadata(
         run_id=0,
-        task_id="",
+        message_id="",
         group_id="",
         node_id=0,
         ttl="",
-        task_type=TASK_TYPE_GET_PROPERTIES,
+        message_type=MESSAGE_TYPE_GET_PROPERTIES,
     ),
     content=compat.getpropertiesres_to_recordset(
         GetPropertiesRes(Status(Code.OK, ""), {})
@@ -59,11 +59,11 @@
 MESSAGE_DISCONNECT = Message(
     metadata=Metadata(
         run_id=0,
-        task_id="",
+        message_id="",
         group_id="",
         node_id=0,
         ttl="",
-        task_type="reconnect",
+        message_type="reconnect",
     ),
     content=RecordSet(configs={"config": ConfigsRecord({"reason": 0})}),
 )
@@ -134,7 +134,7 @@ def run_client() -> int:
             message = receive()
             messages_received += 1
-            if message.metadata.task_type == "reconnect":  # type: ignore
+            if message.metadata.message_type == "reconnect":  # type: ignore
                 send(MESSAGE_DISCONNECT)
                 break
diff --git a/src/py/flwr/client/message_handler/message_handler.py b/src/py/flwr/client/message_handler/message_handler.py
index 52dbdec7691b..93de7d7d8821 100644
--- a/src/py/flwr/client/message_handler/message_handler.py
+++ b/src/py/flwr/client/message_handler/message_handler.py
@@ -26,10 +26,10 @@
 from flwr.client.typing import ClientFn
 from flwr.common.configsrecord import ConfigsRecord
 from flwr.common.constant import (
-    TASK_TYPE_EVALUATE,
-    TASK_TYPE_FIT,
-    TASK_TYPE_GET_PARAMETERS,
-    TASK_TYPE_GET_PROPERTIES,
+    MESSAGE_TYPE_EVALUATE,
+    MESSAGE_TYPE_FIT,
+    MESSAGE_TYPE_GET_PARAMETERS,
+    MESSAGE_TYPE_GET_PROPERTIES,
 )
 from flwr.common.context import Context
 from flwr.common.message import Message, Metadata
@@ -75,7 +75,7 @@ def handle_control_message(message: Message) -> Tuple[Optional[Message], int]:
     sleep_duration : int
         Number of seconds that the client should disconnect from the server.
     """
-    if message.metadata.task_type == "reconnect":
+    if message.metadata.message_type == "reconnect":
         # Retrieve ReconnectIns from recordset
         recordset = message.content
         seconds = cast(int, recordset.get_configs("config")["seconds"])
@@ -90,11 +90,11 @@ def handle_control_message(message: Message) -> Tuple[Optional[Message], int]:
         out_message = Message(
             metadata=Metadata(
                 run_id=0,
-                task_id="",
+                message_id="",
                 group_id="",
                 node_id=0,
                 ttl="",
-                task_type="reconnect",
+                message_type="reconnect",
             ),
             content=recordset,
         )
@@ -105,7 +105,7 @@ def handle_control_message(message: Message) -> Tuple[Optional[Message], int]:
     return None, 0
 
 
-def handle_legacy_message_from_tasktype(
+def handle_legacy_message_from_msgtype(
     client_fn: ClientFn, message: Message, context: Context
 ) -> Message:
     """Handle legacy message in the inner most mod."""
@@ -113,17 +113,17 @@ def handle_legacy_message_from_tasktype(
     client.set_context(context)
 
-    task_type = message.metadata.task_type
+    message_type = message.metadata.message_type
 
     # Handle GetPropertiesIns
-    if task_type == TASK_TYPE_GET_PROPERTIES:
+    if message_type == MESSAGE_TYPE_GET_PROPERTIES:
         get_properties_res = maybe_call_get_properties(
             client=client,
             get_properties_ins=recordset_to_getpropertiesins(message.content),
         )
         out_recordset = getpropertiesres_to_recordset(get_properties_res)
     # Handle GetParametersIns
-    elif task_type == TASK_TYPE_GET_PARAMETERS:
+    elif message_type == MESSAGE_TYPE_GET_PARAMETERS:
         get_parameters_res = maybe_call_get_parameters(
             client=client,
             get_parameters_ins=recordset_to_getparametersins(message.content),
@@ -132,31 +132,31 @@ def handle_legacy_message_from_tasktype(
             get_parameters_res, keep_input=False
         )
     # Handle FitIns
-    elif task_type == TASK_TYPE_FIT:
+    elif message_type == MESSAGE_TYPE_FIT:
         fit_res = maybe_call_fit(
             client=client,
             fit_ins=recordset_to_fitins(message.content, keep_input=True),
         )
         out_recordset = fitres_to_recordset(fit_res, keep_input=False)
     # Handle EvaluateIns
-    elif task_type == TASK_TYPE_EVALUATE:
+    elif message_type == MESSAGE_TYPE_EVALUATE:
         evaluate_res = maybe_call_evaluate(
             client=client,
             evaluate_ins=recordset_to_evaluateins(message.content, keep_input=True),
         )
         out_recordset = evaluateres_to_recordset(evaluate_res)
     else:
-        raise ValueError(f"Invalid task type: {task_type}")
+        raise ValueError(f"Invalid task type: {message_type}")
 
     # Return Message
     out_message = Message(
         metadata=Metadata(
             run_id=0,
-            task_id="",
+            message_id="",
             group_id="",
             node_id=0,
             ttl="",
-            task_type=task_type,
+            message_type=message_type,
         ),
         content=out_recordset,
     )
diff --git a/src/py/flwr/client/message_handler/message_handler_test.py b/src/py/flwr/client/message_handler/message_handler_test.py
index 820de9bbe27c..c4c65d98b833 100644
--- a/src/py/flwr/client/message_handler/message_handler_test.py
+++ b/src/py/flwr/client/message_handler/message_handler_test.py
@@ -34,12 +34,12 @@
 )
 from flwr.common import recordset_compat as compat
 from flwr.common import typing
-from flwr.common.constant import TASK_TYPE_GET_PROPERTIES
+from flwr.common.constant import MESSAGE_TYPE_GET_PROPERTIES
 from flwr.common.context import Context
 from flwr.common.message import Message, Metadata
 from flwr.common.recordset import RecordSet
 
-from .message_handler import handle_legacy_message_from_tasktype
+from .message_handler import handle_legacy_message_from_msgtype
 
 
 class ClientWithoutProps(Client):
@@ -122,17 +122,17 @@ def test_client_without_get_properties() -> None:
     message = Message(
         metadata=Metadata(
             run_id=0,
-            task_id=str(uuid.uuid4()),
+            message_id=str(uuid.uuid4()),
             group_id="",
             node_id=0,
             ttl="",
-            task_type=TASK_TYPE_GET_PROPERTIES,
+            message_type=MESSAGE_TYPE_GET_PROPERTIES,
         ),
         content=recordset,
     )
 
     # Execute
-    actual_msg = handle_legacy_message_from_tasktype(
+    actual_msg = handle_legacy_message_from_msgtype(
         client_fn=_get_client_fn(client),
         message=message,
         context=Context(state=RecordSet()),
@@ -150,7 +150,7 @@ def test_client_without_get_properties() -> None:
     expected_msg = Message(message.metadata, expected_rs)
 
     assert actual_msg.content == expected_msg.content
-    assert actual_msg.metadata.task_type == expected_msg.metadata.task_type
+    assert actual_msg.metadata.message_type == expected_msg.metadata.message_type
 
 
 def test_client_with_get_properties() -> None:
@@ -161,17 +161,17 @@ def test_client_with_get_properties() -> None:
     message = Message(
         metadata=Metadata(
             run_id=0,
-            task_id=str(uuid.uuid4()),
+            message_id=str(uuid.uuid4()),
             group_id="",
             node_id=0,
             ttl="",
-            task_type=TASK_TYPE_GET_PROPERTIES,
+            message_type=MESSAGE_TYPE_GET_PROPERTIES,
         ),
         content=recordset,
     )
 
     # Execute
-    actual_msg = handle_legacy_message_from_tasktype(
+    actual_msg = handle_legacy_message_from_msgtype(
         client_fn=_get_client_fn(client),
         message=message,
         context=Context(state=RecordSet()),
@@ -189,4 +189,4 @@ def test_client_with_get_properties() -> None:
     expected_msg = Message(message.metadata, expected_rs)
 
     assert actual_msg.content == expected_msg.content
-    assert actual_msg.metadata.task_type == expected_msg.metadata.task_type
+    assert actual_msg.metadata.message_type == expected_msg.metadata.message_type
diff --git a/src/py/flwr/client/mod/secure_aggregation/secaggplus_mod.py b/src/py/flwr/client/mod/secure_aggregation/secaggplus_mod.py
index eba5dec658aa..fa5a9fd24109 100644
--- a/src/py/flwr/client/mod/secure_aggregation/secaggplus_mod.py
+++ b/src/py/flwr/client/mod/secure_aggregation/secaggplus_mod.py
@@ -24,7 +24,7 @@
 from flwr.common import ndarray_to_bytes, parameters_to_ndarrays
 from flwr.common import recordset_compat as compat
 from flwr.common.configsrecord import ConfigsRecord
-from flwr.common.constant import TASK_TYPE_FIT
+from flwr.common.constant import MESSAGE_TYPE_FIT
 from flwr.common.context import Context
 from flwr.common.logger import log
 from flwr.common.message import Message, Metadata
@@ -168,7 +168,7 @@ def secaggplus_mod(
 ) -> Message:
     """Handle incoming message and return results, following the SecAgg+ protocol."""
     # Ignore non-fit messages
-    if msg.metadata.task_type != TASK_TYPE_FIT:
+    if msg.metadata.message_type != MESSAGE_TYPE_FIT:
         return call_next(msg, ctxt)
 
     # Retrieve local state
@@ -209,11 +209,11 @@ def secaggplus_mod(
     return Message(
         metadata=Metadata(
             run_id=0,
-            task_id="",
+            message_id="",
             group_id="",
             node_id=0,
             ttl="",
-            task_type=TASK_TYPE_FIT,
+            message_type=MESSAGE_TYPE_FIT,
         ),
         content=RecordSet(configs={RECORD_KEY_CONFIGS: ConfigsRecord(res, False)}),
     )
diff --git a/src/py/flwr/client/mod/secure_aggregation/secaggplus_mod_test.py b/src/py/flwr/client/mod/secure_aggregation/secaggplus_mod_test.py
index d0f461e7a8de..4033306d0845 100644
--- a/src/py/flwr/client/mod/secure_aggregation/secaggplus_mod_test.py
+++ b/src/py/flwr/client/mod/secure_aggregation/secaggplus_mod_test.py
@@ -20,7 +20,7 @@
 from flwr.client.mod import make_ffn
 from flwr.common.configsrecord import ConfigsRecord
-from flwr.common.constant import TASK_TYPE_FIT
+from flwr.common.constant import MESSAGE_TYPE_FIT
 from flwr.common.context import Context
 from flwr.common.message import Message, Metadata
 from flwr.common.recordset import RecordSet
@@ -59,11 +59,11 @@ def empty_ffn(_: Message, _2: Context) -> Message:
     return Message(
         metadata=Metadata(
             run_id=0,
-            task_id="",
+            message_id="",
             group_id="",
             node_id=0,
             ttl="",
-            task_type=TASK_TYPE_FIT,
+            message_type=MESSAGE_TYPE_FIT,
         ),
         content=RecordSet(),
     )
@@ -74,11 +74,11 @@ def func(configs: Dict[str, ConfigsRecordValues]) -> Dict[str, ConfigsRecordValu
     in_msg = Message(
         metadata=Metadata(
             run_id=0,
-            task_id="",
+            message_id="",
             group_id="",
             node_id=0,
             ttl="",
-            task_type=TASK_TYPE_FIT,
+            message_type=MESSAGE_TYPE_FIT,
         ),
         content=RecordSet(configs={RECORD_KEY_CONFIGS: ConfigsRecord(configs)}),
     )
diff --git a/src/py/flwr/client/mod/utils_test.py b/src/py/flwr/client/mod/utils_test.py
index 07b38a250d6c..4a086d9ae3f7 100644
--- a/src/py/flwr/client/mod/utils_test.py
+++ b/src/py/flwr/client/mod/utils_test.py
@@ -75,7 +75,7 @@ def _get_dummy_flower_message() -> Message:
     return Message(
         content=RecordSet(),
         metadata=Metadata(
-            run_id=0, task_id="", group_id="", node_id=0, ttl="", task_type="mock"
+            run_id=0, message_id="", group_id="", node_id=0, ttl="", message_type="mock"
         ),
     )
diff --git a/src/py/flwr/common/constant.py b/src/py/flwr/common/constant.py
index 8d1d865f084b..fcafd853a349 100644
--- a/src/py/flwr/common/constant.py
+++ b/src/py/flwr/common/constant.py
@@ -32,7 +32,7 @@
     TRANSPORT_TYPE_REST,
 ]
 
-TASK_TYPE_GET_PROPERTIES = "get_properties"
-TASK_TYPE_GET_PARAMETERS = "get_parameters"
-TASK_TYPE_FIT = "fit"
-TASK_TYPE_EVALUATE = "evaluate"
+MESSAGE_TYPE_GET_PROPERTIES = "get_properties"
+MESSAGE_TYPE_GET_PARAMETERS = "get_parameters"
+MESSAGE_TYPE_FIT = "fit"
+MESSAGE_TYPE_EVALUATE = "evaluate"
diff --git a/src/py/flwr/common/message.py b/src/py/flwr/common/message.py
index ada4a617b60d..9258edccbcd5 100644
--- a/src/py/flwr/common/message.py
+++ b/src/py/flwr/common/message.py
@@ -22,32 +22,32 @@
 @dataclass
 class Metadata:
-    """A dataclass holding metadata associated with the current task.
+    """A dataclass holding metadata associated with the current message.
 
     Parameters
     ----------
     run_id : int
         An identifier for the current run.
-    task_id : str
-        An identifier for the current task.
+    message_id : str
+        An identifier for the current message.
     group_id : str
-        An identifier for grouping tasks. In some settings
+        An identifier for grouping messages. In some settings
         this is used as the FL round.
     node_id : int
-        An identifier for the node running a task.
+        An identifier for the node running a message.
     ttl : str
-        Time-to-live for this task.
-    task_type : str
+        Time-to-live for this message.
+    message_type : str
         A string that encodes the action to be executed on
         the receiving end.
     """
 
     run_id: int
-    task_id: str
+    message_id: str
     group_id: str
     node_id: int
     ttl: str
-    task_type: str
+    message_type: str
 
 
 @dataclass
@@ -57,7 +57,7 @@ class Message:
     Parameters
     ----------
     metadata : Metadata
-        A dataclass including information about the task to be executed.
+        A dataclass including information about the message to be executed.
     content : RecordSet
         Holds records either sent by another entity (e.g. sent by the
         server-side logic to a client, or vice-versa) or that will be sent to it.
diff --git a/src/py/flwr/common/serde.py b/src/py/flwr/common/serde.py
index 5a8c5c753136..2808cb88fb5c 100644
--- a/src/py/flwr/common/serde.py
+++ b/src/py/flwr/common/serde.py
@@ -550,7 +550,7 @@ def message_to_taskins(message: Message) -> TaskIns:
     return TaskIns(
         task=Task(
             ttl=message.metadata.ttl,
-            task_type=message.metadata.task_type,
+            task_type=message.metadata.message_type,
             recordset=recordset_to_proto(message.content),
         ),
     )
@@ -561,11 +561,11 @@ def message_from_taskins(taskins: TaskIns) -> Message:
     # Retrieve the Metadata
     metadata = Metadata(
         run_id=taskins.run_id,
-        task_id=taskins.task_id,
+        message_id=taskins.task_id,
         group_id=taskins.group_id,
         node_id=taskins.task.consumer.node_id,
         ttl=taskins.task.ttl,
-        task_type=taskins.task.task_type,
+        message_type=taskins.task.task_type,
     )
 
     # Return the Message
@@ -580,7 +580,7 @@ def message_to_taskres(message: Message) -> TaskRes:
     return TaskRes(
         task=Task(
             ttl=message.metadata.ttl,
-            task_type=message.metadata.task_type,
+            task_type=message.metadata.message_type,
             recordset=recordset_to_proto(message.content),
         ),
     )
@@ -591,11 +591,11 @@ def message_from_taskres(taskres: TaskRes) -> Message:
     # Retrieve the MetaData
     metadata = Metadata(
         run_id=taskres.run_id,
-        task_id=taskres.task_id,
+        message_id=taskres.task_id,
         group_id=taskres.group_id,
         node_id=taskres.task.consumer.node_id,
         ttl=taskres.task.ttl,
-        task_type=taskres.task.task_type,
+        message_type=taskres.task.task_type,
     )
 
     # Return the Message
diff --git a/src/py/flwr/common/serde_test.py b/src/py/flwr/common/serde_test.py
index 1c36f2171149..44085e8d9ab8 100644
--- a/src/py/flwr/common/serde_test.py
+++ b/src/py/flwr/common/serde_test.py
@@ -217,11 +217,11 @@ def metadata(self) -> Metadata:
         """Create a Metadata."""
         return Metadata(
             run_id=self.rng.randint(0, 1 << 30),
-            task_id=self.get_str(64),
+            message_id=self.get_str(64),
             group_id=self.get_str(30),
             node_id=self.rng.randint(0, 1 << 63),
             ttl=self.get_str(10),
-            task_type=self.get_str(10),
+            message_type=self.get_str(10),
         )
@@ -308,11 +308,11 @@ def test_message_to_and_from_taskins() -> None:
     original = Message(
         metadata=Metadata(
             run_id=0,
-            task_id="",
+            message_id="",
             group_id="",
             node_id=metadata.node_id,
             ttl=metadata.ttl,
-            task_type=metadata.task_type,
+            message_type=metadata.message_type,
         ),
         content=maker.recordset(1, 1, 1),
     )
@@ -320,7 +320,7 @@ def test_message_to_and_from_taskins() -> None:
     # Execute
     taskins = message_to_taskins(original)
     taskins.run_id = metadata.run_id
-    taskins.task_id = metadata.task_id
+    taskins.task_id = metadata.message_id
     taskins.group_id = metadata.group_id
     taskins.task.consumer.node_id = metadata.node_id
     deserialized = message_from_taskins(taskins)
@@ -338,11 +338,11 @@ def test_message_to_and_from_taskres() -> None:
     original = Message(
         metadata=Metadata(
             run_id=0,
-            task_id="",
+            message_id="",
             group_id="",
             node_id=metadata.node_id,
             ttl=metadata.ttl,
-            task_type=metadata.task_type,
+            message_type=metadata.message_type,
         ),
         content=maker.recordset(1, 1, 1),
     )
@@ -350,7 +350,7 @@ def test_message_to_and_from_taskres() -> None:
     # Execute
     taskres = message_to_taskres(original)
     taskres.run_id = metadata.run_id
-    taskres.task_id = metadata.task_id
+    taskres.task_id = metadata.message_id
     taskres.group_id = metadata.group_id
     taskres.task.consumer.node_id = metadata.node_id
     deserialized = message_from_taskres(taskres)
diff --git a/src/py/flwr/server/driver/driver_client_proxy.py b/src/py/flwr/server/driver/driver_client_proxy.py
index e0ff26c035f7..8ea288dbb50c 100644
--- a/src/py/flwr/server/driver/driver_client_proxy.py
+++ b/src/py/flwr/server/driver/driver_client_proxy.py
@@ -22,10 +22,10 @@
 from flwr.common import recordset_compat as compat
 from flwr.common import serde
 from flwr.common.constant import (
-    TASK_TYPE_EVALUATE,
-    TASK_TYPE_FIT,
-    TASK_TYPE_GET_PARAMETERS,
-    TASK_TYPE_GET_PROPERTIES,
+    MESSAGE_TYPE_EVALUATE,
+    MESSAGE_TYPE_FIT,
+    MESSAGE_TYPE_GET_PARAMETERS,
+    MESSAGE_TYPE_GET_PROPERTIES,
 )
 from flwr.common.recordset import RecordSet
 from flwr.proto import driver_pb2, node_pb2, task_pb2  # pylint: disable=E0611
@@ -54,7 +54,7 @@ def get_properties(
         out_recordset = compat.getpropertiesins_to_recordset(ins)
         # Fetch response
         in_recordset = self._send_receive_recordset(
-            out_recordset, TASK_TYPE_GET_PROPERTIES, timeout
+            out_recordset, MESSAGE_TYPE_GET_PROPERTIES, timeout
         )
         # RecordSet to Res
         return compat.recordset_to_getpropertiesres(in_recordset)
@@ -67,7 +67,7 @@ def get_parameters(
         out_recordset = compat.getparametersins_to_recordset(ins)
         # Fetch response
         in_recordset = self._send_receive_recordset(
-            out_recordset, TASK_TYPE_GET_PARAMETERS, timeout
+            out_recordset, MESSAGE_TYPE_GET_PARAMETERS, timeout
         )
         # RecordSet to Res
         return compat.recordset_to_getparametersres(in_recordset, False)
@@ -78,7 +78,7 @@ def fit(self, ins: common.FitIns, timeout: Optional[float]) -> common.FitRes:
         out_recordset = compat.fitins_to_recordset(ins, keep_input=True)
         # Fetch response
         in_recordset = self._send_receive_recordset(
-            out_recordset, TASK_TYPE_FIT, timeout
+            out_recordset, MESSAGE_TYPE_FIT, timeout
         )
         # RecordSet to Res
         return compat.recordset_to_fitres(in_recordset, keep_input=False)
@@ -91,7 +91,7 @@ def evaluate(
         out_recordset = compat.evaluateins_to_recordset(ins, keep_input=True)
         # Fetch response
         in_recordset = self._send_receive_recordset(
-            out_recordset, TASK_TYPE_EVALUATE, timeout
+            out_recordset, MESSAGE_TYPE_EVALUATE, timeout
         )
         # RecordSet to Res
         return compat.recordset_to_evaluateres(in_recordset)
diff --git a/src/py/flwr/server/driver/driver_client_proxy_test.py b/src/py/flwr/server/driver/driver_client_proxy_test.py
index 18277a7ce80c..aa60448bd72e 100644
--- a/src/py/flwr/server/driver/driver_client_proxy_test.py
+++ b/src/py/flwr/server/driver/driver_client_proxy_test.py
@@ -25,10 +25,10 @@
 from flwr.common import recordset_compat as compat
 from flwr.common import serde
 from flwr.common.constant import (
-    TASK_TYPE_EVALUATE,
-    TASK_TYPE_FIT,
-    TASK_TYPE_GET_PARAMETERS,
-    TASK_TYPE_GET_PROPERTIES,
+    MESSAGE_TYPE_EVALUATE,
+    MESSAGE_TYPE_FIT,
+    MESSAGE_TYPE_GET_PARAMETERS,
+    MESSAGE_TYPE_GET_PROPERTIES,
 )
 from flwr.common.typing import (
     Code,
@@ -56,21 +56,21 @@ def _make_task(
     res: Union[GetParametersRes, GetPropertiesRes, FitRes, EvaluateRes]
 ) -> task_pb2.Task:  # pylint: disable=E1101
     if isinstance(res, GetParametersRes):
-        task_type = TASK_TYPE_GET_PARAMETERS
+        message_type = MESSAGE_TYPE_GET_PARAMETERS
         recordset = compat.getparametersres_to_recordset(res, True)
     elif isinstance(res, GetPropertiesRes):
-        task_type = TASK_TYPE_GET_PROPERTIES
+        message_type = MESSAGE_TYPE_GET_PROPERTIES
         recordset = compat.getpropertiesres_to_recordset(res)
     elif isinstance(res, FitRes):
-        task_type = TASK_TYPE_FIT
+        message_type = MESSAGE_TYPE_FIT
         recordset = compat.fitres_to_recordset(res, True)
     elif isinstance(res, EvaluateRes):
-        task_type = TASK_TYPE_EVALUATE
+        message_type = MESSAGE_TYPE_EVALUATE
         recordset = compat.evaluateres_to_recordset(res)
     else:
         raise ValueError(f"Unsupported type: {type(res)}")
     return task_pb2.Task(  # pylint: disable=E1101
-        task_type=task_type,
+        task_type=message_type,
         recordset=serde.recordset_to_proto(recordset),
     )

From 06295891c635197ca6629af9518eb4b9500147a2 Mon Sep 17 00:00:00 2001
From: Robert Steiner
Date: Thu, 15 Feb 2024 11:33:19 +0100
Subject: [PATCH 008/102] Update URLs in READMEs (#2949)

---
 README.md                                     | 50 +++++------
 baselines/README.md                           |  6 +-
 baselines/fedpara/README.md                   | 84 +++++++++----------
 datasets/README.md                            | 16 ++--
 examples/advanced-pytorch/README.md           |  2 +-
 examples/advanced-tensorflow/README.md        |  2 +-
 examples/custom-metrics/README.md             | 12 +--
 examples/flower-via-docker-compose/README.md  |  4 +-
 .../README.md                                 |  2 +-
 examples/quickstart-huggingface/README.md     |  2 +-
 examples/quickstart-pandas/README.md          |  4 +-
 .../quickstart-pytorch-lightning/README.md    |  2 +-
 examples/quickstart-pytorch/README.md         |  2 +-
 examples/quickstart-sklearn-tabular/README.md |  2 +-
 examples/quickstart-tensorflow/README.md      |  2 +-
 examples/simulation-pytorch/README.md         |  4 +-
 examples/simulation-tensorflow/README.md      |  4 +-
 examples/sklearn-logreg-mnist/README.md       |  2 +-
 .../whisper-federated-finetuning/README.md    |  4 +-
 examples/xgboost-comprehensive/README.md      |  6 +-
 examples/xgboost-quickstart/README.md         |  2 +-
 21 files changed, 107 insertions(+), 107 deletions(-)

diff --git a/README.md b/README.md
index e4433e517b88..38a11d951fe7 100644
--- a/README.md
+++ b/README.md
@@ -1,16 +1,16 @@
 # Flower: A Friendly Federated Learning Framework

-
-    Flower Website
+
+    Flower Website

-    Website |
-    Blog |
-    Docs |
-    Conference |
-    Slack
+    Website |
+    Blog |
+    Docs |
+    Conference |
+    Slack

@@ -18,7 +18,7 @@ [![PRs Welcome](https://img.shields.io/badge/PRs-welcome-brightgreen.svg)](https://github.com/adap/flower/blob/main/CONTRIBUTING.md) ![Build](https://github.com/adap/flower/actions/workflows/framework.yml/badge.svg) [![Downloads](https://static.pepy.tech/badge/flwr)](https://pepy.tech/project/flwr) -[![Slack](https://img.shields.io/badge/Chat-Slack-red)](https://flower.dev/join-slack) +[![Slack](https://img.shields.io/badge/Chat-Slack-red)](https://flower.ai/join-slack) Flower (`flwr`) is a framework for building federated learning systems. The design of Flower is based on a few guiding principles: @@ -39,7 +39,7 @@ design of Flower is based on a few guiding principles: - **Understandable**: Flower is written with maintainability in mind. The community is encouraged to both read and contribute to the codebase. -Meet the Flower community on [flower.dev](https://flower.dev)! +Meet the Flower community on [flower.ai](https://flower.ai)! ## Federated Learning Tutorial @@ -73,19 +73,19 @@ Stay tuned, more tutorials are coming soon. 
Topics include **Privacy and Securit ## Documentation -[Flower Docs](https://flower.dev/docs): +[Flower Docs](https://flower.ai/docs): -- [Installation](https://flower.dev/docs/framework/how-to-install-flower.html) -- [Quickstart (TensorFlow)](https://flower.dev/docs/framework/tutorial-quickstart-tensorflow.html) -- [Quickstart (PyTorch)](https://flower.dev/docs/framework/tutorial-quickstart-pytorch.html) -- [Quickstart (Hugging Face)](https://flower.dev/docs/framework/tutorial-quickstart-huggingface.html) -- [Quickstart (PyTorch Lightning)](https://flower.dev/docs/framework/tutorial-quickstart-pytorch-lightning.html) -- [Quickstart (Pandas)](https://flower.dev/docs/framework/tutorial-quickstart-pandas.html) -- [Quickstart (fastai)](https://flower.dev/docs/framework/tutorial-quickstart-fastai.html) -- [Quickstart (JAX)](https://flower.dev/docs/framework/tutorial-quickstart-jax.html) -- [Quickstart (scikit-learn)](https://flower.dev/docs/framework/tutorial-quickstart-scikitlearn.html) -- [Quickstart (Android [TFLite])](https://flower.dev/docs/framework/tutorial-quickstart-android.html) -- [Quickstart (iOS [CoreML])](https://flower.dev/docs/framework/tutorial-quickstart-ios.html) +- [Installation](https://flower.ai/docs/framework/how-to-install-flower.html) +- [Quickstart (TensorFlow)](https://flower.ai/docs/framework/tutorial-quickstart-tensorflow.html) +- [Quickstart (PyTorch)](https://flower.ai/docs/framework/tutorial-quickstart-pytorch.html) +- [Quickstart (Hugging Face)](https://flower.ai/docs/framework/tutorial-quickstart-huggingface.html) +- [Quickstart (PyTorch Lightning)](https://flower.ai/docs/framework/tutorial-quickstart-pytorch-lightning.html) +- [Quickstart (Pandas)](https://flower.ai/docs/framework/tutorial-quickstart-pandas.html) +- [Quickstart (fastai)](https://flower.ai/docs/framework/tutorial-quickstart-fastai.html) +- [Quickstart (JAX)](https://flower.ai/docs/framework/tutorial-quickstart-jax.html) +- [Quickstart 
(scikit-learn)](https://flower.ai/docs/framework/tutorial-quickstart-scikitlearn.html) +- [Quickstart (Android [TFLite])](https://flower.ai/docs/framework/tutorial-quickstart-android.html) +- [Quickstart (iOS [CoreML])](https://flower.ai/docs/framework/tutorial-quickstart-ios.html) ## Flower Baselines @@ -112,9 +112,9 @@ Flower Baselines is a collection of community-contributed projects that reproduc - [FedAvg](https://github.com/adap/flower/tree/main/baselines/flwr_baselines/flwr_baselines/publications/fedavg_mnist) - [FedOpt](https://github.com/adap/flower/tree/main/baselines/flwr_baselines/flwr_baselines/publications/adaptive_federated_optimization) -Please refer to the [Flower Baselines Documentation](https://flower.dev/docs/baselines/) for a detailed categorization of baselines and for additional info including: -* [How to use Flower Baselines](https://flower.dev/docs/baselines/how-to-use-baselines.html) -* [How to contribute a new Flower Baseline](https://flower.dev/docs/baselines/how-to-contribute-baselines.html) +Please refer to the [Flower Baselines Documentation](https://flower.ai/docs/baselines/) for a detailed categorization of baselines and for additional info including: +* [How to use Flower Baselines](https://flower.ai/docs/baselines/how-to-use-baselines.html) +* [How to contribute a new Flower Baseline](https://flower.ai/docs/baselines/how-to-contribute-baselines.html) ## Flower Usage Examples @@ -151,7 +151,7 @@ Other [examples](https://github.com/adap/flower/tree/main/examples): ## Community -Flower is built by a wonderful community of researchers and engineers. [Join Slack](https://flower.dev/join-slack) to meet them, [contributions](#contributing-to-flower) are welcome. +Flower is built by a wonderful community of researchers and engineers. [Join Slack](https://flower.ai/join-slack) to meet them, [contributions](#contributing-to-flower) are welcome. 
diff --git a/baselines/README.md b/baselines/README.md index a18c0553b2b4..3a84df02d8de 100644 --- a/baselines/README.md +++ b/baselines/README.md @@ -1,7 +1,7 @@ # Flower Baselines -> We are changing the way we structure the Flower baselines. While we complete the transition to the new format, you can still find the existing baselines in the `flwr_baselines` directory. Currently, you can make use of baselines for [FedAvg](https://github.com/adap/flower/tree/main/baselines/flwr_baselines/flwr_baselines/publications/fedavg_mnist), [FedOpt](https://github.com/adap/flower/tree/main/baselines/flwr_baselines/flwr_baselines/publications/adaptive_federated_optimization), and [LEAF-FEMNIST](https://github.com/adap/flower/tree/main/baselines/flwr_baselines/flwr_baselines/publications/leaf/femnist). +> We are changing the way we structure the Flower baselines. While we complete the transition to the new format, you can still find the existing baselines in the `flwr_baselines` directory. Currently, you can make use of baselines for [FedAvg](https://github.com/adap/flower/tree/main/baselines/flwr_baselines/flwr_baselines/publications/fedavg_mnist), [FedOpt](https://github.com/adap/flower/tree/main/baselines/flwr_baselines/flwr_baselines/publications/adaptive_federated_optimization), and [LEAF-FEMNIST](https://github.com/adap/flower/tree/main/baselines/flwr_baselines/flwr_baselines/publications/leaf/femnist). > The documentation below has been updated to reflect the new way of using Flower baselines. @@ -23,7 +23,7 @@ Please note that some baselines might include additional files (e.g. a `requirem ## Running the baselines -Each baseline is self-contained in its own directory. Furthermore, each baseline defines its own Python environment using [Poetry](https://python-poetry.org/docs/) via a `pyproject.toml` file and [`pyenv`](https://github.com/pyenv/pyenv). 
If you haven't setup `Poetry` and `pyenv` already on your machine, please take a look at the [Documentation](https://flower.dev/docs/baselines/how-to-use-baselines.html#setting-up-your-machine) for a guide on how to do so. +Each baseline is self-contained in its own directory. Furthermore, each baseline defines its own Python environment using [Poetry](https://python-poetry.org/docs/) via a `pyproject.toml` file and [`pyenv`](https://github.com/pyenv/pyenv). If you haven't setup `Poetry` and `pyenv` already on your machine, please take a look at the [Documentation](https://flower.ai/docs/baselines/how-to-use-baselines.html#setting-up-your-machine) for a guide on how to do so. Assuming `pyenv` and `Poetry` are already installed on your system. Running a baseline can be done by: @@ -54,7 +54,7 @@ The steps to follow are: ```bash # This will create a new directory with the same structure as `baseline_template`. ./dev/create-baseline.sh - ``` + ``` 3. Then, go inside your baseline directory and continue with the steps detailed in `EXTENDED_README.md` and `README.md`. 4. Once your code is ready and you have checked that following the instructions in your `README.md` the Python environment can be created correctly and that running the code following your instructions can reproduce the experiments in the paper, you just need to create a Pull Request (PR). Then, the process to merge your baseline into the Flower repo will begin! 
diff --git a/baselines/fedpara/README.md b/baselines/fedpara/README.md
index 068366aa261c..82efe5fac537 100644
--- a/baselines/fedpara/README.md
+++ b/baselines/fedpara/README.md
@@ -5,7 +5,7 @@ labels: [image classification, personalization, low-rank training, tensor decomp
 dataset: [CIFAR-10, CIFAR-100, MNIST]
 ---

-# FedPara: Low-rank Hadamard Product for Communication-Efficient Federated Learning
+# FedPara: Low-rank Hadamard Product for Communication-Efficient Federated Learning

 > Note: If you use this baseline in your work, please remember to cite the original authors of the paper as well as the Flower paper.

@@ -43,7 +43,7 @@ Specifically, it replicates the results for CIFAR-10 and CIFAR-100 in Figure 3
 On a machine with RTX 3090Ti (24GB VRAM) it takes approximately 1h to run each CIFAR-10/100 experiment while using < 12GB of VRAM. You can lower the VRAM footprint by reducing the number of clients allowed to run in parallel on your GPU (do this by raising `client_resources.num_gpus`).

-**Contributors:** Yahia Salaheldin Shaaban, Omar Mokhtar and Roeia Amr
+**Contributors:** Yahia Salaheldin Shaaban, Omar Mokhtar and Roeia Amr

 ## Experimental Setup

@@ -52,48 +52,48 @@ On a machine with RTX 3090Ti (24GB VRAM) it takes approximately 1h to run each C
 **Model:** This baseline implements VGG16 with group normalization.
-**Dataset:** +**Dataset:** -| Dataset | #classes | #partitions | partitioning method IID | partitioning method non-IID | -|:---------|:--------:|:-----------:|:----------------------:| :----------------------:| -| CIFAR-10 | 10 | 100 | random split | Dirichlet distribution ($\alpha=0.5$)| -| CIFAR-100 | 100 | 50 | random split| Dirichlet distribution ($\alpha=0.5$)| +| Dataset | #classes | #partitions | partitioning method IID | partitioning method non-IID | +| :-------- | :------: | :---------: | :---------------------: | :-----------------------------------: | +| CIFAR-10 | 10 | 100 | random split | Dirichlet distribution ($\alpha=0.5$) | +| CIFAR-100 | 100 | 50 | random split | Dirichlet distribution ($\alpha=0.5$) | **Training Hyperparameters:** -| | Cifar10 IID | Cifar10 Non-IID | Cifar100 IID | Cifar100 Non-IID | MNIST | -|---|-------|-------|------|-------|----------| -| Fraction of client (K) | 16 | 16 | 8 | 8 | 10 | -| Total rounds (T) | 200 | 200 | 400 | 400 | 100 | -| Number of SGD epochs (E) | 10 | 5 | 10 | 5 | 5 | -| Batch size (B) | 64 | 64 | 64 | 64 | 10 | -| Initial learning rate (η) | 0.1 | 0.1 | 0.1 | 0.1 | 0.1-0.01 | -| Learning rate decay (τ) | 0.992 | 0.992 | 0.992| 0.992 | 0.999 | -| Regularization coefficient (λ) | 1 | 1 | 1 | 1 | 0 | +| | Cifar10 IID | Cifar10 Non-IID | Cifar100 IID | Cifar100 Non-IID | MNIST | +| ------------------------------ | ----------- | --------------- | ------------ | ---------------- | -------- | +| Fraction of client (K) | 16 | 16 | 8 | 8 | 10 | +| Total rounds (T) | 200 | 200 | 400 | 400 | 100 | +| Number of SGD epochs (E) | 10 | 5 | 10 | 5 | 5 | +| Batch size (B) | 64 | 64 | 64 | 64 | 10 | +| Initial learning rate (η) | 0.1 | 0.1 | 0.1 | 0.1 | 0.1-0.01 | +| Learning rate decay (τ) | 0.992 | 0.992 | 0.992 | 0.992 | 0.999 | +| Regularization coefficient (λ) | 1 | 1 | 1 | 1 | 0 | As for the parameters ratio ($\gamma$) we use the following model sizes. 
As in the paper, $\gamma=0.1$ is used for CIFAR-10 and $\gamma=0.4$ for CIFAR-100:

 | Parameters ratio ($\gamma$) | CIFAR-10 | CIFAR-100 |
-|----------|--------|--------|
-| 1.0 (original) | 15.25M | 15.30M |
-| 0.1 | 1.55M | - |
-| 0.4 | - | 4.53M |
+| --------------------------- | -------- | --------- |
+| 1.0 (original) | 15.25M | 15.30M |
+| 0.1 | 1.55M | - |
+| 0.4 | - | 4.53M |

-### Notes:
+### Notes:

 - Notably, Fedpara's low-rank training technique heavily relies on initialization, with our experiments revealing that employing a 'Fan-in' He initialization (or Kaiming) renders the model incapable of convergence, resulting in a performance akin to that of a random classifier. We found that only Fan-out initialization yielded the anticipated results, and we postulated that this is attributed to the variance conservation during backward propagation.
 - The paper lacks explicit guidance on calculating the rank, aside from the "Rank_min - Rank_max" equation. To address this, we devised an equation aligning with the literature's explanation and constraint, solving a quadratic equation to determine max_rank and utilizing proposition 2 from the paper to establish min_rank.
 - The Jacobian correction was not incorporated into our implementation, primarily due to the lack of explicit instructions in the paper regarding the specific implementation of the dual update principle mentioned in the Jacobian correction section.
-- It was observed that data generation is crutial for model convergence
+- It was observed that data generation is crucial for model convergence

 ## Environment Setup

 To construct the Python environment follow these steps:

-It is assumed that `pyenv` is installed, `poetry` is installed and python 3.10.6 is installed using `pyenv`. Refer to this [documentation](https://flower.dev/docs/baselines/how-to-usef-baselines.html#setting-up-your-machine) to ensure that your machine is ready.
+It is assumed that `pyenv` is installed, `poetry` is installed and Python 3.10.6 is installed using `pyenv`. Refer to this [documentation](https://flower.ai/docs/baselines/how-to-use-baselines.html#setting-up-your-machine) to ensure that your machine is ready.

 ```bash
 # Set Python 3.10
@@ -112,7 +112,7 @@ poetry shell

 Running `FedPara` is easy. You can run it with default parameters directly or by tweaking them directly on the command line. Some command examples are shown below.

-```bash
+```bash
 # To run fedpara with default parameters
 python -m fedpara.main

@@ -138,45 +138,45 @@ To reproduce the curves shown below (which correspond to those in Figure 3 in th

 ```bash
 # To run fedpara for non-iid CIFAR-10 on vgg16 for lowrank and original schemes
-python -m fedpara.main --multirun model.param_type=standard,lowrank
+python -m fedpara.main --multirun model.param_type=standard,lowrank

 # To run fedpara for non-iid CIFAR-100 on vgg16 for lowrank and original schemes
-python -m fedpara.main --config-name cifar100 --multirun model.param_type=standard,lowrank
+python -m fedpara.main --config-name cifar100 --multirun model.param_type=standard,lowrank

 # To run fedpara for iid CIFAR-10 on vgg16 for lowrank and original schemes
-python -m fedpara.main --multirun model.param_type=standard,lowrank num_epochs=10 dataset_config.partition=iid
+python -m fedpara.main --multirun model.param_type=standard,lowrank num_epochs=10 dataset_config.partition=iid

 # To run fedpara for iid CIFAR-100 on vgg16 for lowrank and original schemes
 python -m fedpara.main --config-name cifar100 --multirun model.param_type=standard,lowrank num_epochs=10 dataset_config.partition=iid

-# To run fedavg for non-iid MINST on FC
+python -m fedpara.main --config-name mnist_fedavg
+# To run fedper for non-iid MNIST on FC
+python -m fedpara.main --config-name mnist_fedper
+# To run pfedpara for non-iid MNIST on FC
+python -m fedpara.main --config-name mnist_pfedpara
 ```

-#### Communication Cost:
-Communication costs as measured as described in the paper:
+#### Communication Cost:
+Communication costs are measured as described in the paper:

 *"FL evaluation typically measures the required rounds to achieve the target accuracy as communication costs, but we instead assess total transferred bit sizes, 2 × (#participants)×(model size)×(#rounds)"*

 ### CIFAR-100 (Accuracy vs Communication Cost)

 | IID | Non-IID |
-|:----:|:----:|
-|![Cifar100 iid](_static/Cifar100_iid.jpeg) | ![Cifar100 non-iid](_static/Cifar100_noniid.jpeg) |
+| :----------------------------------------: | :-----------------------------------------------: |
+| ![Cifar100 iid](_static/Cifar100_iid.jpeg) | ![Cifar100 non-iid](_static/Cifar100_noniid.jpeg) |

 ### CIFAR-10 (Accuracy vs Communication Cost)

 | IID | Non-IID |
-|:----:|:----:|
-|![CIFAR10 iid](_static/Cifar10_iid.jpeg) | ![CIFAR10 non-iid](_static/Cifar10_noniid.jpeg) |
+| :--------------------------------------: | :---------------------------------------------: |
+| ![CIFAR10 iid](_static/Cifar10_iid.jpeg) | ![CIFAR10 non-iid](_static/Cifar10_noniid.jpeg) |

 ### NON-IID MNIST (FedAvg vs FedPer vs pFedPara)

 Only the federated averaging (FedAvg) implementation replicates the results outlined in the paper. However, challenges with convergence were encountered when applying `pFedPara` and `FedPer` methods.
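The transferred-bit formula quoted from the paper above is simple enough to sketch as a small Python helper. This is a hypothetical utility for illustration, not part of the baseline's code; it assumes 32-bit floats when converting parameter counts to bits:

```python
def communication_cost_bits(num_participants: int, model_size_bits: int, num_rounds: int) -> int:
    """Total transferred bits: 2 x (#participants) x (model size) x (#rounds)."""
    return 2 * num_participants * model_size_bits * num_rounds

# CIFAR-10 low-rank run from the tables above: K=16 participants per round,
# a gamma=0.1 model with 1.55M parameters (32 bits each), T=200 rounds.
cost = communication_cost_bits(16, 1_550_000 * 32, 200)
print(f"{cost / 8e9:.1f} GB transferred")  # -> 39.7 GB transferred
```

Plugging in the standard 15.25M-parameter model instead shows immediately how much communication the low-rank parameterization saves per run.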
-![Personalization algorithms](_static/non-iid_mnist_personalization.png) +![Personalization algorithms](_static/non-iid_mnist_personalization.png) ## Code Acknowledgments Our code is inspired from these repos: diff --git a/datasets/README.md b/datasets/README.md index 876b6f453fa5..61292fe988bf 100644 --- a/datasets/README.md +++ b/datasets/README.md @@ -4,9 +4,9 @@ [![PRs Welcome](https://img.shields.io/badge/PRs-welcome-brightgreen.svg)](https://github.com/adap/flower/blob/main/CONTRIBUTING.md) ![Build](https://github.com/adap/flower/actions/workflows/framework.yml/badge.svg) ![Downloads](https://pepy.tech/badge/flwr-datasets) -[![Slack](https://img.shields.io/badge/Chat-Slack-red)](https://flower.dev/join-slack) +[![Slack](https://img.shields.io/badge/Chat-Slack-red)](https://flower.ai/join-slack) -Flower Datasets (`flwr-datasets`) is a library to quickly and easily create datasets for federated learning, federated evaluation, and federated analytics. It was created by the `Flower Labs` team that also created Flower: A Friendly Federated Learning Framework. +Flower Datasets (`flwr-datasets`) is a library to quickly and easily create datasets for federated learning, federated evaluation, and federated analytics. It was created by the `Flower Labs` team that also created Flower: A Friendly Federated Learning Framework. Flower Datasets library supports: * **downloading datasets** - choose the dataset from Hugging Face's `datasets`, * **partitioning datasets** - customize the partitioning scheme, @@ -14,10 +14,10 @@ Flower Datasets library supports: Thanks to using Hugging Face's `datasets` used under the hood, Flower Datasets integrates with the following popular formats/frameworks: * Hugging Face, -* PyTorch, -* TensorFlow, -* Numpy, -* Pandas, +* PyTorch, +* TensorFlow, +* Numpy, +* Pandas, * Jax, * Arrow. 
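To make the partitioning support described above concrete, here is a toy sketch of an IID split over sample indices. This is a simplified illustration of the concept only, not the actual `flwr-datasets` implementation:

```python
def iid_partition(num_samples: int, num_partitions: int) -> list[list[int]]:
    """Assign sample indices to partitions round-robin (a toy IID split)."""
    return [list(range(i, num_samples, num_partitions)) for i in range(num_partitions)]

# Every sample lands in exactly one partition, and partition sizes are balanced,
# so each partition is an identically distributed sample of the whole dataset.
partitions = iid_partition(num_samples=10, num_partitions=2)
```

In `flwr-datasets` itself, the analogous behavior is provided by `IidPartitioner(num_partitions)` and applied to a Hugging Face dataset for you.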
@@ -25,7 +25,7 @@ Create **custom partitioning schemes** or choose from the **implemented partitio
 * Partitioner (the abstract base class) `Partitioner`
 * IID partitioning `IidPartitioner(num_partitions)`
 * Natural ID partitioner `NaturalIdPartitioner`
-* Size partitioner (the abstract base class for the partitioners dictating the division based the number of samples) `SizePartitioner`
+* Size partitioner (the abstract base class for the partitioners dictating the division based on the number of samples) `SizePartitioner`
 * Linear partitioner `LinearPartitioner`
 * Square partitioner `SquarePartitioner`
 * Exponential partitioner `ExponentialPartitioner`
@@ -83,7 +83,7 @@ Here are a few of the things that we will work on in future releases:
 * ✅ Support for more datasets (especially the ones that have user id present).
 * ✅ Creation of custom `Partitioner`s.
 * ✅ More out-of-the-box `Partitioner`s.
-* ✅ Passing `Partitioner`s via `FederatedDataset`'s `partitioners` argument.
+* ✅ Passing `Partitioner`s via `FederatedDataset`'s `partitioners` argument.
 * ✅ Customization of the dataset splitting before the partitioning.
 * Simplification of the dataset transformation to the popular frameworks/types.
 * Creation of the synthetic data,
diff --git a/examples/advanced-pytorch/README.md b/examples/advanced-pytorch/README.md
index 9101105b2618..c1ba85b95879 100644
--- a/examples/advanced-pytorch/README.md
+++ b/examples/advanced-pytorch/README.md
@@ -1,6 +1,6 @@
 # Advanced Flower Example (PyTorch)

-This example demonstrates an advanced federated learning setup using Flower with PyTorch.
This example uses [Flower Datasets](https://flower.ai/docs/datasets/) and it differs from the quickstart example in the following ways: - 10 clients (instead of just 2) - Each client holds a local dataset of 5000 training examples and 1000 test examples (note that using the `run.sh` script will only select 10 data samples by default, as the `--toy` argument is set). diff --git a/examples/advanced-tensorflow/README.md b/examples/advanced-tensorflow/README.md index b21c0d2545ca..59866fd99a06 100644 --- a/examples/advanced-tensorflow/README.md +++ b/examples/advanced-tensorflow/README.md @@ -1,6 +1,6 @@ # Advanced Flower Example (TensorFlow/Keras) -This example demonstrates an advanced federated learning setup using Flower with TensorFlow/Keras. This example uses [Flower Datasets](https://flower.dev/docs/datasets/) and it differs from the quickstart example in the following ways: +This example demonstrates an advanced federated learning setup using Flower with TensorFlow/Keras. This example uses [Flower Datasets](https://flower.ai/docs/datasets/) and it differs from the quickstart example in the following ways: - 10 clients (instead of just 2) - Each client holds a local dataset of 1/10 of the train datasets and 80% is training examples and 20% as test examples (note that by default only a small subset of this data is used when running the `run.sh` script) diff --git a/examples/custom-metrics/README.md b/examples/custom-metrics/README.md index debcd7919839..317fb6336106 100644 --- a/examples/custom-metrics/README.md +++ b/examples/custom-metrics/README.md @@ -9,7 +9,7 @@ The main takeaways of this implementation are: - the use of the `output_dict` on the client side - inside `evaluate` method on `client.py` - the use of the `evaluate_metrics_aggregation_fn` - to aggregate the metrics on the server side, part of the `strategy` on `server.py` -This example is based on the `quickstart-tensorflow` with CIFAR-10, source 
[here](https://flower.dev/docs/quickstart-tensorflow.html), with the addition of [Flower Datasets](https://flower.dev/docs/datasets/index.html) to retrieve the CIFAR-10.
+This example is based on the `quickstart-tensorflow` with CIFAR-10, source [here](https://flower.ai/docs/quickstart-tensorflow.html), with the addition of [Flower Datasets](https://flower.ai/docs/datasets/index.html) to retrieve the CIFAR-10.

 Using the CIFAR-10 dataset for classification, this is a multi-class classification problem, thus some changes on how to calculate the metrics using `average='micro'` and `np.argmax` is required. For binary classification, this is not required. Also, for unsupervised learning tasks, such as using a deep autoencoder, a custom metric based on reconstruction error could be implemented on client side.

@@ -91,16 +91,16 @@ chmod +x run.sh
 ./run.sh
 ```

-You will see that Keras is starting a federated training. Have a look to the [Flower Quickstarter documentation](https://flower.dev/docs/quickstart-tensorflow.html) for a detailed explanation. You can add `steps_per_epoch=3` to `model.fit()` if you just want to evaluate that everything works without having to wait for the client-side training to finish (this will save you a lot of time during development).
+You will see that Keras is starting a federated training. Have a look at the [Flower Quickstarter documentation](https://flower.ai/docs/quickstart-tensorflow.html) for a detailed explanation. You can add `steps_per_epoch=3` to `model.fit()` if you just want to evaluate that everything works without having to wait for the client-side training to finish (this will save you a lot of time during development).
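As a sketch of what an `evaluate_metrics_aggregation_fn` typically looks like: it receives `(num_examples, metrics_dict)` pairs from the clients and reduces them to a single dict. The example-weighted average below is a common pattern; the exact keys and function used in this example's `server.py` may differ:

```python
def weighted_average(metrics):
    """Aggregate (num_examples, metrics_dict) pairs from clients,
    weighting each client's metric by how many examples it evaluated."""
    total_examples = sum(num for num, _ in metrics)
    return {
        "accuracy": sum(num * m["accuracy"] for num, m in metrics) / total_examples,
    }

# Two clients: 10 examples at 0.5 accuracy and 30 examples at 0.9 accuracy.
aggregated = weighted_average([(10, {"accuracy": 0.5}), (30, {"accuracy": 0.9})])
# aggregated["accuracy"] is approximately 0.8, not the naive mean 0.7
```

The strategy calls this function once per round, which is what produces the per-round `metrics_distributed` history shown below.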
Running `run.sh` will result in the following output (after 3 rounds): ```shell INFO flwr 2024-01-17 17:45:23,794 | app.py:228 | app_fit: metrics_distributed { - 'accuracy': [(1, 0.10000000149011612), (2, 0.10000000149011612), (3, 0.3393000066280365)], - 'acc': [(1, 0.1), (2, 0.1), (3, 0.3393)], - 'rec': [(1, 0.1), (2, 0.1), (3, 0.3393)], - 'prec': [(1, 0.1), (2, 0.1), (3, 0.3393)], + 'accuracy': [(1, 0.10000000149011612), (2, 0.10000000149011612), (3, 0.3393000066280365)], + 'acc': [(1, 0.1), (2, 0.1), (3, 0.3393)], + 'rec': [(1, 0.1), (2, 0.1), (3, 0.3393)], + 'prec': [(1, 0.1), (2, 0.1), (3, 0.3393)], 'f1': [(1, 0.10000000000000002), (2, 0.10000000000000002), (3, 0.3393)] } ``` diff --git a/examples/flower-via-docker-compose/README.md b/examples/flower-via-docker-compose/README.md index 1d830e46cbdb..3ef1ac37bcda 100644 --- a/examples/flower-via-docker-compose/README.md +++ b/examples/flower-via-docker-compose/README.md @@ -1,7 +1,7 @@ # Leveraging Flower and Docker for Device Heterogeneity Management in Federated Learning

- Flower Website + Flower Website Docker Logo

@@ -141,7 +141,7 @@ By following these steps, you will have a fully functional federated learning en ### Data Pipeline with FLWR-Datasets -We have integrated [`flwr-datasets`](https://flower.dev/docs/datasets/) into our data pipeline, which is managed within the `load_data.py` file in the `helpers/` directory. This script facilitates standardized access to datasets across the federated network and incorporates a `data_sampling_percentage` argument. This argument allows users to specify the percentage of the dataset to be used for training and evaluation, accommodating devices with lower memory capabilities to prevent Out-of-Memory (OOM) errors. +We have integrated [`flwr-datasets`](https://flower.ai/docs/datasets/) into our data pipeline, which is managed within the `load_data.py` file in the `helpers/` directory. This script facilitates standardized access to datasets across the federated network and incorporates a `data_sampling_percentage` argument. This argument allows users to specify the percentage of the dataset to be used for training and evaluation, accommodating devices with lower memory capabilities to prevent Out-of-Memory (OOM) errors. ### Model Selection and Dataset diff --git a/examples/pytorch-from-centralized-to-federated/README.md b/examples/pytorch-from-centralized-to-federated/README.md index fccb14158ecd..06ee89dddcac 100644 --- a/examples/pytorch-from-centralized-to-federated/README.md +++ b/examples/pytorch-from-centralized-to-federated/README.md @@ -2,7 +2,7 @@ This example demonstrates how an already existing centralized PyTorch-based machine learning project can be federated with Flower. -This introductory example for Flower uses PyTorch, but you're not required to be a PyTorch expert to run the example. The example will help you to understand how Flower can be used to build federated learning use cases based on existing machine learning projects. 
This example uses [Flower Datasets](https://flower.dev/docs/datasets/) to download, partition and preprocess the CIFAR-10 dataset.
+This introductory example for Flower uses PyTorch, but you're not required to be a PyTorch expert to run the example. The example will help you to understand how Flower can be used to build federated learning use cases based on existing machine learning projects. This example uses [Flower Datasets](https://flower.ai/docs/datasets/) to download, partition and preprocess the CIFAR-10 dataset.

 ## Project Setup
diff --git a/examples/quickstart-huggingface/README.md b/examples/quickstart-huggingface/README.md
index fd868aa1fcce..5fdba887f181 100644
--- a/examples/quickstart-huggingface/README.md
+++ b/examples/quickstart-huggingface/README.md
@@ -1,6 +1,6 @@
 # Federated HuggingFace Transformers using Flower and PyTorch

-This introductory example to using [HuggingFace](https://huggingface.co) Transformers with Flower with PyTorch. This example has been extended from the [quickstart-pytorch](https://flower.dev/docs/examples/quickstart-pytorch.html) example. The training script closely follows the [HuggingFace course](https://huggingface.co/course/chapter3?fw=pt), so you are encouraged to check that out for a detailed explanation of the transformer pipeline.
+This introductory example shows how to use [HuggingFace](https://huggingface.co) Transformers with Flower and PyTorch. This example has been extended from the [quickstart-pytorch](https://flower.ai/docs/examples/quickstart-pytorch.html) example. The training script closely follows the [HuggingFace course](https://huggingface.co/course/chapter3?fw=pt), so you are encouraged to check that out for a detailed explanation of the transformer pipeline.

 Like `quickstart-pytorch`, running this example in itself is also meant to be quite easy.
diff --git a/examples/quickstart-pandas/README.md b/examples/quickstart-pandas/README.md
index a25e6ea6ee36..efcda43cf34d 100644
--- a/examples/quickstart-pandas/README.md
+++ b/examples/quickstart-pandas/README.md
@@ -1,6 +1,6 @@
 # Flower Example using Pandas

-This introductory example to Flower uses Pandas, but deep knowledge of Pandas is not necessarily required to run the example. However, it will help you understand how to adapt Flower to your use case. This example uses [Flower Datasets](https://flower.dev/docs/datasets/) to
+This introductory example to Flower uses Pandas, but deep knowledge of Pandas is not necessarily required to run the example. However, it will help you understand how to adapt Flower to your use case. This example uses [Flower Datasets](https://flower.ai/docs/datasets/) to
 download, partition and preprocess the dataset.
 Running this example in itself is quite easy.

@@ -79,4 +79,4 @@ Start client 2 in the second terminal:
 $ python3 client.py --node-id 1
 ```

-You will see that the server is printing aggregated statistics about the dataset distributed amongst clients. Have a look to the [Flower Quickstarter documentation](https://flower.dev/docs/quickstart-pandas.html) for a detailed explanation.
+You will see that the server is printing aggregated statistics about the dataset distributed amongst clients. Have a look at the [Flower Quickstarter documentation](https://flower.ai/docs/quickstart-pandas.html) for a detailed explanation.
diff --git a/examples/quickstart-pytorch-lightning/README.md b/examples/quickstart-pytorch-lightning/README.md
index 1287b50bca65..1d404a5d714f 100644
--- a/examples/quickstart-pytorch-lightning/README.md
+++ b/examples/quickstart-pytorch-lightning/README.md
@@ -1,6 +1,6 @@
 # Flower Example using PyTorch Lightning

-This introductory example to Flower uses PyTorch, but deep knowledge of PyTorch Lightning is not necessarily required to run the example.
However, it will help you understand how to adapt Flower to your use case. Running this example in itself is quite easy. This example uses [Flower Datasets](https://flower.dev/docs/datasets/) to download, partition and preprocess the MNIST dataset. +This introductory example to Flower uses PyTorch, but deep knowledge of PyTorch Lightning is not necessarily required to run the example. However, it will help you understand how to adapt Flower to your use case. Running this example in itself is quite easy. This example uses [Flower Datasets](https://flower.ai/docs/datasets/) to download, partition and preprocess the MNIST dataset. ## Project Setup diff --git a/examples/quickstart-pytorch/README.md b/examples/quickstart-pytorch/README.md index 6de0dcf7ab32..3b9b9b310608 100644 --- a/examples/quickstart-pytorch/README.md +++ b/examples/quickstart-pytorch/README.md @@ -1,6 +1,6 @@ # Flower Example using PyTorch -This introductory example to Flower uses PyTorch, but deep knowledge of PyTorch is not necessarily required to run the example. However, it will help you understand how to adapt Flower to your use case. Running this example in itself is quite easy. This example uses [Flower Datasets](https://flower.dev/docs/datasets/) to download, partition and preprocess the CIFAR-10 dataset. +This introductory example to Flower uses PyTorch, but deep knowledge of PyTorch is not necessarily required to run the example. However, it will help you understand how to adapt Flower to your use case. Running this example in itself is quite easy. This example uses [Flower Datasets](https://flower.ai/docs/datasets/) to download, partition and preprocess the CIFAR-10 dataset. 
## Project Setup diff --git a/examples/quickstart-sklearn-tabular/README.md b/examples/quickstart-sklearn-tabular/README.md index d62525c96c18..373aaea5999c 100644 --- a/examples/quickstart-sklearn-tabular/README.md +++ b/examples/quickstart-sklearn-tabular/README.md @@ -3,7 +3,7 @@ This example of Flower uses `scikit-learn`'s `LogisticRegression` model to train a federated learning system on "iris" (tabular) dataset. It will help you understand how to adapt Flower for use with `scikit-learn`. -Running this example in itself is quite easy. This example uses [Flower Datasets](https://flower.dev/docs/datasets/) to +Running this example in itself is quite easy. This example uses [Flower Datasets](https://flower.ai/docs/datasets/) to download, partition and preprocess the dataset. ## Project Setup diff --git a/examples/quickstart-tensorflow/README.md b/examples/quickstart-tensorflow/README.md index 92d38c9340d7..8d5e9434b086 100644 --- a/examples/quickstart-tensorflow/README.md +++ b/examples/quickstart-tensorflow/README.md @@ -1,7 +1,7 @@ # Flower Example using TensorFlow/Keras This introductory example to Flower uses Keras but deep knowledge of Keras is not necessarily required to run the example. However, it will help you understand how to adapt Flower to your use case. -Running this example in itself is quite easy. This example uses [Flower Datasets](https://flower.dev/docs/datasets/) to download, partition and preprocess the CIFAR-10 dataset. +Running this example in itself is quite easy. This example uses [Flower Datasets](https://flower.ai/docs/datasets/) to download, partition and preprocess the CIFAR-10 dataset. 
## Project Setup
diff --git a/examples/simulation-pytorch/README.md b/examples/simulation-pytorch/README.md
index 11b7a3364376..5ba5ec70dc3e 100644
--- a/examples/simulation-pytorch/README.md
+++ b/examples/simulation-pytorch/README.md
@@ -1,6 +1,6 @@
 # Flower Simulation example using PyTorch

-This introductory example uses the simulation capabilities of Flower to simulate a large number of clients on a single machine. Take a look at the [Documentation](https://flower.dev/docs/framework/how-to-run-simulations.html) for a deep dive into how Flower simulation works. This example uses [Flower Datasets](https://flower.dev/docs/datasets/) to download, partition and preprocess the MNIST dataset. This examples uses 100 clients by default.
+This introductory example uses the simulation capabilities of Flower to simulate a large number of clients on a single machine. Take a look at the [Documentation](https://flower.ai/docs/framework/how-to-run-simulations.html) for a deep dive into how Flower simulation works. This example uses [Flower Datasets](https://flower.ai/docs/datasets/) to download, partition and preprocess the MNIST dataset. This example uses 100 clients by default.

 ## Running the example (via Jupyter Notebook)

@@ -79,4 +79,4 @@ python sim.py --num_cpus=2
 python sim.py --num_cpus=2 --num_gpus=0.2
 ```

-Take a look at the [Documentation](https://flower.dev/docs/framework/how-to-run-simulations.html) for more details on how you can customise your simulation.
+Take a look at the [Documentation](https://flower.ai/docs/framework/how-to-run-simulations.html) for more details on how you can customise your simulation.
diff --git a/examples/simulation-tensorflow/README.md b/examples/simulation-tensorflow/README.md
index f0d94f343d37..75be823db2eb 100644
--- a/examples/simulation-tensorflow/README.md
+++ b/examples/simulation-tensorflow/README.md
@@ -1,6 +1,6 @@
 # Flower Simulation example using TensorFlow/Keras

-This introductory example uses the simulation capabilities of Flower to simulate a large number of clients on a single machine. Take a look at the [Documentation](https://flower.dev/docs/framework/how-to-run-simulations.html) for a deep dive into how Flower simulation works. This example uses [Flower Datasets](https://flower.dev/docs/datasets/) to download, partition and preprocess the MNIST dataset. This examples uses 100 clients by default.
+This introductory example uses the simulation capabilities of Flower to simulate a large number of clients on a single machine. Take a look at the [Documentation](https://flower.ai/docs/framework/how-to-run-simulations.html) for a deep dive into how Flower simulation works. This example uses [Flower Datasets](https://flower.ai/docs/datasets/) to download, partition and preprocess the MNIST dataset. This example uses 100 clients by default.

 ## Running the example (via Jupyter Notebook)

@@ -78,4 +78,4 @@ python sim.py --num_cpus=2
 python sim.py --num_cpus=2 --num_gpus=0.2
 ```

-Take a look at the [Documentation](https://flower.dev/docs/framework/how-to-run-simulations.html) for more details on how you can customise your simulation.
+Take a look at the [Documentation](https://flower.ai/docs/framework/how-to-run-simulations.html) for more details on how you can customise your simulation.
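For intuition on the `--num_gpus=0.2` flag used in the simulation examples above: each virtual client reserves that fraction of a GPU, which bounds how many clients can train concurrently. A hypothetical helper (not part of the examples' code) illustrating the arithmetic:

```python
import math

def max_concurrent_clients(total_gpus: float, num_gpus_per_client: float) -> int:
    """How many simulated clients fit in parallel given fractional GPU reservations."""
    if num_gpus_per_client <= 0:
        raise ValueError("use CPU-only scheduling when clients reserve no GPU")
    # Small epsilon guards against float rounding (e.g. 1 / 0.2 landing just below 5).
    return math.floor(total_gpus / num_gpus_per_client + 1e-9)

# With one physical GPU and --num_gpus=0.2 per client, at most 5 clients train
# at the same time; the remaining virtual clients wait for a free slot.
print(max_concurrent_clients(1, 0.2))  # -> 5
```

Raising `num_gpus` per client therefore lowers parallelism (and per-client VRAM pressure), while lowering it packs more clients onto each GPU.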
diff --git a/examples/sklearn-logreg-mnist/README.md b/examples/sklearn-logreg-mnist/README.md index ee3cdfc9768e..50576d98ba3d 100644 --- a/examples/sklearn-logreg-mnist/README.md +++ b/examples/sklearn-logreg-mnist/README.md @@ -1,7 +1,7 @@ # Flower Example using scikit-learn This example of Flower uses `scikit-learn`'s `LogisticRegression` model to train a federated learning system. It will help you understand how to adapt Flower for use with `scikit-learn`. -Running this example in itself is quite easy. This example uses [Flower Datasets](https://flower.dev/docs/datasets/) to download, partition and preprocess the MNIST dataset. +Running this example in itself is quite easy. This example uses [Flower Datasets](https://flower.ai/docs/datasets/) to download, partition and preprocess the MNIST dataset. ## Project Setup diff --git a/examples/whisper-federated-finetuning/README.md b/examples/whisper-federated-finetuning/README.md index e89a09519fed..ddebe51247b2 100644 --- a/examples/whisper-federated-finetuning/README.md +++ b/examples/whisper-federated-finetuning/README.md @@ -110,7 +110,7 @@ An overview of the FL pipeline built with Flower for this example is illustrated 3. Once on-site training is completed, each client sends back the (now updated) classification head to the Flower server. 4. The Flower server aggregates (via FedAvg) the classification heads in order to obtain a new _global_ classification head. This head will be shared with clients in the next round. -Flower supports two ways of doing Federated Learning: simulated and non-simulated FL. The former, managed by the [`VirtualClientEngine`](https://flower.dev/docs/framework/how-to-run-simulations.html), allows you to run large-scale workloads in a system-aware manner, that scales with the resources available on your system (whether it is a laptop, a desktop with a single GPU, or a cluster of GPU servers). The latter is better suited for settings where clients are unique devices (e.g. 
a server, a smart device, etc). This example shows you how to use both. +Flower supports two ways of doing Federated Learning: simulated and non-simulated FL. The former, managed by the [`VirtualClientEngine`](https://flower.ai/docs/framework/how-to-run-simulations.html), allows you to run large-scale workloads in a system-aware manner, that scales with the resources available on your system (whether it is a laptop, a desktop with a single GPU, or a cluster of GPU servers). The latter is better suited for settings where clients are unique devices (e.g. a server, a smart device, etc). This example shows you how to use both. ### Preparing the dataset @@ -147,7 +147,7 @@ INFO flwr 2023-11-08 14:03:57,557 | app.py:229 | app_fit: metrics_centralized {' With just 5 FL rounds, the global model should be reaching ~95% validation accuracy. A test accuracy of 97% can be reached with 10 rounds of FL training using the default hyperparameters. On an RTX 3090Ti, each round takes ~20-30s depending on the amount of data the clients selected in a round have. -Take a look at the [Documentation](https://flower.dev/docs/framework/how-to-run-simulations.html) for more details on how you can customize your simulation. +Take a look at the [Documentation](https://flower.ai/docs/framework/how-to-run-simulations.html) for more details on how you can customize your simulation. ### Federated Finetuning (non-simulated) diff --git a/examples/xgboost-comprehensive/README.md b/examples/xgboost-comprehensive/README.md index 97ecc39b47f2..01fed646d056 100644 --- a/examples/xgboost-comprehensive/README.md +++ b/examples/xgboost-comprehensive/README.md @@ -1,7 +1,7 @@ # Flower Example using XGBoost (Comprehensive) This example demonstrates a comprehensive federated learning setup using Flower with XGBoost. -We use [HIGGS](https://archive.ics.uci.edu/dataset/280/higgs) dataset to perform a binary classification task. 
This examples uses [Flower Datasets](https://flower.dev/docs/datasets/) to retrieve, partition and preprocess the data for each Flower client. +We use the [HIGGS](https://archive.ics.uci.edu/dataset/280/higgs) dataset to perform a binary classification task. This example uses [Flower Datasets](https://flower.ai/docs/datasets/) to retrieve, partition and preprocess the data for each Flower client. It differs from the [xgboost-quickstart](https://github.com/adap/flower/tree/main/examples/xgboost-quickstart) example in the following ways: - Arguments parsers of server and clients for hyperparameters selection. @@ -91,7 +91,7 @@ pip install -r requirements.txt ## Run Federated Learning with XGBoost and Flower -You can run this example in two ways: either by manually launching the server, and then several clients that connect to it; or by launching a Flower simulation. Both run the same workload, yielding identical results. The former is ideal for deployments on different machines, while the latter makes it easy to simulate large client cohorts in a resource-aware manner. You can read more about how Flower Simulation works in the [Documentation](https://flower.dev/docs/framework/how-to-run-simulations.html). The commands shown below assume you have activated your environment (if you decide to use Poetry, you can activate it via `poetry shell`). +You can run this example in two ways: either by manually launching the server, and then several clients that connect to it; or by launching a Flower simulation. Both run the same workload, yielding identical results. The former is ideal for deployments on different machines, while the latter makes it easy to simulate large client cohorts in a resource-aware manner. You can read more about how Flower Simulation works in the [Documentation](https://flower.ai/docs/framework/how-to-run-simulations.html).
The commands shown below assume you have activated your environment (if you decide to use Poetry, you can activate it via `poetry shell`). ### Independent Client/Server Setup @@ -143,7 +143,7 @@ python sim.py --train-method=cyclic --pool-size=5 --num-rounds=30 --centralised- ``` In addition, we provide more options to customise the experimental settings, including data partitioning and centralised/distributed evaluation (see `utils.py`). -Check the [tutorial](https://flower.dev/docs/framework/tutorial-quickstart-xgboost.html) for a detailed explanation. +Check the [tutorial](https://flower.ai/docs/framework/tutorial-quickstart-xgboost.html) for a detailed explanation. ### Expected Experimental Results diff --git a/examples/xgboost-quickstart/README.md b/examples/xgboost-quickstart/README.md index 5174c236c668..cd99cd4c2895 100644 --- a/examples/xgboost-quickstart/README.md +++ b/examples/xgboost-quickstart/README.md @@ -85,4 +85,4 @@ poetry run ./run.sh ``` Look at the [code](https://github.com/adap/flower/tree/main/examples/xgboost-quickstart) -and [tutorial](https://flower.dev/docs/framework/tutorial-quickstart-xgboost.html) for a detailed explanation. +and [tutorial](https://flower.ai/docs/framework/tutorial-quickstart-xgboost.html) for a detailed explanation. From 5af1679d1dc712f3a027ebbfb028cf3f633b0afc Mon Sep 17 00:00:00 2001 From: Robert Steiner Date: Thu, 15 Feb 2024 11:45:44 +0100 Subject: [PATCH 009/102] Update URL in fedbn README (#2950) --- baselines/fedbn/README.md | 52 +++++++++++++++++++-------------------- 1 file changed, 26 insertions(+), 26 deletions(-) diff --git a/baselines/fedbn/README.md b/baselines/fedbn/README.md index 4b271bd49851..d50c6f5bb605 100644 --- a/baselines/fedbn/README.md +++ b/baselines/fedbn/README.md @@ -34,37 +34,37 @@ dataset: [MNIST, MNIST-M, SVHN, USPS, SynthDigits] **Model:** A six-layer CNN with 14,219,210 parameters following the structure described in appendix D.2. 
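FedBN's defining trick, as used by the baseline introduced above, is that clients exchange every model layer *except* the Batch Normalization ones, which stay local to each client. A minimal illustration of that filtering step (a hypothetical helper, not the baseline's actual code, assuming BN layers carry `bn` in their names as in the FedBN reference implementation):

```python
def federated_keys(state_dict_keys):
    """Keep only the parameters FedBN would share with the server:
    everything except Batch Normalization layers."""
    return [k for k in state_dict_keys if "bn" not in k.lower()]

keys = ["conv1.weight", "bn1.weight", "bn1.bias", "fc1.weight", "fc1.bias"]
print(federated_keys(keys))  # ['conv1.weight', 'fc1.weight', 'fc1.bias']
```

Keeping the BN statistics and affine parameters local is what lets each client adapt to its own feature distribution, which is why FedBN helps on the heterogeneous digit datasets listed below.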
-**Dataset:** This baseline makes use of the pre-processed partitions created and open source by the authors of the FedBN paper. You can read more about how those were created [here](https://github.com/med-air/FedBN). Follow the steps below in the `Environment Setup` section to download them. +**Dataset:** This baseline makes use of the pre-processed partitions created and open source by the authors of the FedBN paper. You can read more about how those were created [here](https://github.com/med-air/FedBN). Follow the steps below in the `Environment Setup` section to download them. A more detailed explanation of the datasets is given in the following table. -| | MNIST | MNIST-M | SVHN | USPS | SynthDigits | -|--- |--- |--- |--- |--- |--- | -| data type| handwritten digits| MNIST modification randomly colored with colored patches| Street view house numbers | handwritten digits from envelopes by the U.S. Postal Service | Syntehtic digits Windows TM font varying the orientation, blur and stroke colors | -| color | greyscale | RGB | RGB | greyscale | RGB | -| pixelsize | 28x28 | 28 x 28 | 32 x32 | 16 x16 | 32 x32 | -| labels | 0-9 | 0-9 | 1-10 | 0-9 | 1-10 | -| number of trainset | 60.000 | 60.000 | 73.257 | 9,298 | 50.000 | -| number of testset| 10.000 | 10.000 | 26.032 | - | - | -| image shape | (28,28) | (28,28,3) | (32,32,3) | (16,16) | (32,32,3) | +| | MNIST | MNIST-M | SVHN | USPS | SynthDigits | +| ------------------ | ------------------ | -------------------------------------------------------- | ------------------------- | ------------------------------------------------------------ | -------------------------------------------------------------------------------- | +| data type | handwritten digits | MNIST modification randomly colored with colored patches | Street view house numbers | handwritten digits from envelopes by the U.S. 
Postal Service | Synthetic digits Windows TM font varying the orientation, blur and stroke colors | +| color | greyscale | RGB | RGB | greyscale | RGB | +| pixelsize | 28x28 | 28 x 28 | 32 x32 | 16 x16 | 32 x32 | +| labels | 0-9 | 0-9 | 1-10 | 0-9 | 1-10 | +| number of trainset | 60.000 | 60.000 | 73.257 | 9,298 | 50.000 | +| number of testset | 10.000 | 10.000 | 26.032 | - | - | +| image shape | (28,28) | (28,28,3) | (32,32,3) | (16,16) | (32,32,3) | **Training Hyperparameters:** By default (i.e. if you don't override anything in the config) these main hyperparameters used are shown in the table below. For a complete list of hyperparameters, please refer to the config files in `fedbn/conf`. -| Description | Value | -| ----------- | ----- | -| rounds | 10 | -| num_clients | 5 | -| strategy_fraction_fit | 1.0 | -| strategy.fraction_evaluate | 0.0 | -| training samples per client| 743 | -| client.l_r | 10E-2 | -| local epochs | 1 | -| loss | cross entropy loss | -| optimizer | SGD | -| client_resources.num_cpu | 2 | -| client_resources.num_gpus | 0.0 | +| Description | Value | +| --------------------------- | ------------------ | +| rounds | 10 | +| num_clients | 5 | +| strategy_fraction_fit | 1.0 | +| strategy.fraction_evaluate | 0.0 | +| training samples per client | 743 | +| client.l_r | 10E-2 | +| local epochs | 1 | +| loss | cross entropy loss | +| optimizer | SGD | +| client_resources.num_cpu | 2 | +| client_resources.num_gpus | 0.0 | ## Environment Setup @@ -93,7 +93,7 @@ cd data .. ## Running the Experiments -First, activate your environment via `poetry shell`. The commands below show how to run the experiments and modify some of its key hyperparameters via the cli. Each time you run an experiment, the log and results will be stored inside `outputs//
{% endif %} diff --git a/doc/source/_templates/sidebar/versioning.html b/doc/source/_templates/sidebar/versioning.html index dde7528d15e4..74f1cd8febb7 100644 --- a/doc/source/_templates/sidebar/versioning.html +++ b/doc/source/_templates/sidebar/versioning.html @@ -59,8 +59,8 @@ -
diff --git a/doc/source/conf.py b/doc/source/conf.py index 259d8a988841..88cb5c05b1d8 100644 --- a/doc/source/conf.py +++ b/doc/source/conf.py @@ -123,25 +123,27 @@ # The full name is still at the top of the page add_module_names = False + def find_test_modules(package_path): """Go through the python files and exclude every *_test.py file.""" full_path_modules = [] for root, dirs, files in os.walk(package_path): for file in files: - if file.endswith('_test.py'): + if file.endswith("_test.py"): # Construct the module path relative to the package directory full_path = os.path.join(root, file) relative_path = os.path.relpath(full_path, package_path) # Convert file path to dotted module path - module_path = os.path.splitext(relative_path)[0].replace(os.sep, '.') + module_path = os.path.splitext(relative_path)[0].replace(os.sep, ".") full_path_modules.append(module_path) modules = [] for full_path_module in full_path_modules: - parts = full_path_module.split('.') + parts = full_path_module.split(".") for i in range(len(parts)): - modules.append('.'.join(parts[i:])) + modules.append(".".join(parts[i:])) return modules + # Stop from documenting the *_test.py files. # That's the only way to do that in autosummary (make the modules as mock_imports). 
autodoc_mock_imports = find_test_modules(os.path.abspath("../../src/py/flwr")) @@ -249,7 +251,7 @@ def find_test_modules(package_path): html_title = f"Flower Framework" html_logo = "_static/flower-logo.png" html_favicon = "_static/favicon.ico" -html_baseurl = "https://flower.dev/docs/framework/" +html_baseurl = "https://flower.ai/docs/framework/" html_theme_options = { # diff --git a/doc/source/contributor-explanation-architecture.rst b/doc/source/contributor-explanation-architecture.rst index 0e2ea1f6e66b..a20a84313118 100644 --- a/doc/source/contributor-explanation-architecture.rst +++ b/doc/source/contributor-explanation-architecture.rst @@ -4,7 +4,7 @@ Flower Architecture Edge Client Engine ------------------ -`Flower `_ core framework architecture with Edge Client Engine +`Flower `_ core framework architecture with Edge Client Engine .. figure:: _static/flower-architecture-ECE.png :width: 80 % @@ -12,7 +12,7 @@ Edge Client Engine Virtual Client Engine --------------------- -`Flower `_ core framework architecture with Virtual Client Engine +`Flower `_ core framework architecture with Virtual Client Engine .. figure:: _static/flower-architecture-VCE.png :width: 80 % @@ -20,7 +20,7 @@ Virtual Client Engine Virtual Client Engine and Edge Client Engine in the same workload ----------------------------------------------------------------- -`Flower `_ core framework architecture with both Virtual Client Engine and Edge Client Engine +`Flower `_ core framework architecture with both Virtual Client Engine and Edge Client Engine .. figure:: _static/flower-architecture.drawio.png :width: 80 % diff --git a/doc/source/contributor-how-to-build-docker-images.rst b/doc/source/contributor-how-to-build-docker-images.rst index 2c6c7a7ab986..3beae7422bef 100644 --- a/doc/source/contributor-how-to-build-docker-images.rst +++ b/doc/source/contributor-how-to-build-docker-images.rst @@ -17,7 +17,7 @@ Before we can start, we need to meet a few prerequisites in our local developmen #. 
Verify the Docker daemon is running. Please follow the first section on - `Run Flower using Docker `_ + `Run Flower using Docker `_ which covers this step in more detail. Currently, Flower provides two images, a base image and a server image. There will also be a client diff --git a/doc/source/contributor-how-to-contribute-translations.rst b/doc/source/contributor-how-to-contribute-translations.rst index d97a2cb8c64f..1614b8e5a040 100644 --- a/doc/source/contributor-how-to-contribute-translations.rst +++ b/doc/source/contributor-how-to-contribute-translations.rst @@ -2,7 +2,7 @@ Contribute translations ======================= Since `Flower 1.5 -`_ we +`_ we have introduced translations to our doc pages, but, as you might have noticed, the translations are often imperfect. If you speak languages other than English, you might be able to help us in our effort to make Federated Learning @@ -67,5 +67,5 @@ Add new languages ----------------- If you want to add a new language, you will first have to contact us, either on -`Slack `_, or by opening an issue on our `GitHub +`Slack `_, or by opening an issue on our `GitHub repo `_. diff --git a/doc/source/contributor-ref-good-first-contributions.rst b/doc/source/contributor-ref-good-first-contributions.rst index 523a4679c6ef..cbf21e2845bc 100644 --- a/doc/source/contributor-ref-good-first-contributions.rst +++ b/doc/source/contributor-ref-good-first-contributions.rst @@ -14,7 +14,7 @@ Until the Flower core library matures it will be easier to get PR's accepted if they only touch non-core areas of the codebase. Good candidates to get started are: -- Documentation: What's missing? What could be expressed more clearly? +- Documentation: What's missing? What could be expressed more clearly? - Baselines: See below. - Examples: See below. 
@@ -22,9 +22,9 @@ are: Request for Flower Baselines ---------------------------- -If you are not familiar with Flower Baselines, you should probably check-out our `contributing guide for baselines `_. +If you are not familiar with Flower Baselines, you should probably check-out our `contributing guide for baselines `_. -You should then check out the open +You should then check out the open `issues `_ for baseline requests. If you find a baseline that you'd like to work on and that has no assignes, feel free to assign it to yourself and start working on it! diff --git a/doc/source/contributor-tutorial-contribute-on-github.rst b/doc/source/contributor-tutorial-contribute-on-github.rst index d409802897e4..273b47a636cc 100644 --- a/doc/source/contributor-tutorial-contribute-on-github.rst +++ b/doc/source/contributor-tutorial-contribute-on-github.rst @@ -3,8 +3,8 @@ Contribute on GitHub This guide is for people who want to get involved with Flower, but who are not used to contributing to GitHub projects. -If you're familiar with how contributing on GitHub works, you can directly checkout our -`getting started guide for contributors `_. +If you're familiar with how contributing on GitHub works, you can directly checkout our +`getting started guide for contributors `_. Setting up the repository @@ -16,9 +16,9 @@ Setting up the repository GitHub, itself, is a code hosting platform for version control and collaboration. It allows for everyone to collaborate and work from anywhere on remote repositories. - If you haven't already, you will need to create an account on `GitHub `_. + If you haven't already, you will need to create an account on `GitHub `_. - The idea behind the generic Git and GitHub workflow boils down to this: + The idea behind the generic Git and GitHub workflow boils down to this: you download code from a remote repository on GitHub, make changes locally and keep track of them using Git and then you upload your new history back to GitHub. 2. 
**Forking the Flower repository** @@ -26,7 +26,7 @@ Setting up the repository and click the ``Fork`` button situated on the top right of the page. .. image:: _static/fork_button.png - + You can change the name if you want, but this is not necessary as this version of Flower will be yours and will sit inside your own account (i.e., in your own list of repositories). Once created, you should see on the top left corner that you are looking at your own version of Flower. @@ -34,14 +34,14 @@ Setting up the repository 3. **Cloning your forked repository** The next step is to download the forked repository on your machine to be able to make changes to it. - On your forked repository page, you should first click on the ``Code`` button on the right, + On your forked repository page, you should first click on the ``Code`` button on the right, this will give you the ability to copy the HTTPS link of the repository. .. image:: _static/cloning_fork.png Once you copied the \, you can open a terminal on your machine, navigate to the place you want to download the repository to and type: - .. code-block:: shell + .. code-block:: shell $ git clone @@ -58,14 +58,14 @@ Setting up the repository To obtain it, we can do as previously mentioned by going to our fork repository on our GitHub account and copying the link. .. image:: _static/cloning_fork.png - + Once the \ is copied, we can type the following command in our terminal: .. code-block:: shell $ git remote add origin - + 5. **Add upstream** Now we will add an upstream address to our repository. Still in the same directroy, we must run the following command: @@ -76,10 +76,10 @@ Setting up the repository The following diagram visually explains what we did in the previous steps: - .. image:: _static/github_schema.png + .. image:: _static/github_schema.png - The upstream is the GitHub remote address of the parent repository (in this case Flower), - i.e. 
the one we eventually want to contribute to and therefore need an up-to-date history of. + The upstream is the GitHub remote address of the parent repository (in this case Flower), + i.e. the one we eventually want to contribute to and therefore need an up-to-date history of. The origin is just the GitHub remote address of the forked repository we created, i.e. the copy (fork) in our own account. To make sure our local version of the fork is up-to-date with the latest changes from the Flower repository, @@ -113,9 +113,9 @@ And with Flower's repository: $ git pull upstream main 1. **Create a new branch** - To make the history cleaner and easier to work with, it is good practice to + To make the history cleaner and easier to work with, it is good practice to create a new branch for each feature/project that needs to be implemented. - + To do so, just run the following command inside the repository's directory: .. code-block:: shell @@ -137,7 +137,7 @@ And with Flower's repository: $ ./dev/test.sh # to test that your code can be accepted $ ./baselines/dev/format.sh # same as above but for code added to baselines $ ./baselines/dev/test.sh # same as above but for code added to baselines - + 4. **Stage changes** Before creating a commit that will update your history, you must specify to Git which files it needs to take into account. @@ -184,21 +184,21 @@ Creating and merging a pull request (PR) Once you click the ``Compare & pull request`` button, you should see something similar to this: .. image:: _static/creating_pr.png - + At the top you have an explanation of which branch will be merged where: .. image:: _static/merging_branch.png - + In this example you can see that the request is to merge the branch ``doc-fixes`` from my forked repository to branch ``main`` from the Flower repository. - The input box in the middle is there for you to describe what your PR does and to link it to existing issues. 
+ The input box in the middle is there for you to describe what your PR does and to link it to existing issues. We have placed comments (that won't be rendered once the PR is opened) to guide you through the process. It is important to follow the instructions described in comments. For instance, in order to not break how our changelog system works, you should read the information above the ``Changelog entry`` section carefully. You can also checkout some examples and details in the :ref:`changelogentry` appendix. - At the bottom you will find the button to open the PR. This will notify reviewers that a new PR has been opened and + At the bottom you will find the button to open the PR. This will notify reviewers that a new PR has been opened and that they should look over it to merge or to request changes. If your PR is not yet ready for review, and you don't want to notify anyone, you have the option to create a draft pull request: @@ -218,7 +218,7 @@ Creating and merging a pull request (PR) Merging will be blocked if there are ongoing requested changes. .. image:: _static/changes_requested.png - + To resolve them, just push the necessary changes to the branch associated with the PR: .. image:: _static/make_changes.png @@ -277,12 +277,12 @@ This is a tiny change, but it’ll allow us to test your end-to-end setup. After - Find the source file in ``doc/source`` - Make the change in the ``.rst`` file (beware, the dashes under the title should be the same length as the title itself) -- Build the docs and check the result: ``_ +- Build the docs and check the result: ``_ Rename file ::::::::::: -You might have noticed that the file name still reflects the old wording. +You might have noticed that the file name still reflects the old wording. If we just change the file, then we break all existing links to it - it is **very important** to avoid that, breaking links can harm our search engine ranking. 
Here’s how to change the file name: @@ -295,7 +295,7 @@ This will cause a redirect from ``saving-progress.html`` to ``save-progress.html Apply changes in the index file ::::::::::::::::::::::::::::::: -For the lateral navigation bar to work properly, it is very important to update the ``index.rst`` file as well. +For the lateral navigation bar to work properly, it is very important to update the ``index.rst`` file as well. This is where we define the whole arborescence of the navbar. - Find and modify the file name in ``index.rst`` @@ -343,7 +343,7 @@ Next steps Once you have made your first PR, and want to contribute more, be sure to check out the following : -- `Good first contributions `_, where you should particularly look into the :code:`baselines` contributions. +- `Good first contributions `_, where you should particularly look into the :code:`baselines` contributions. Appendix @@ -358,10 +358,10 @@ When opening a new PR, inside its description, there should be a ``Changelog ent Above this header you should see the following comment that explains how to write your changelog entry: - Inside the following 'Changelog entry' section, + Inside the following 'Changelog entry' section, you should put the description of your changes that will be added to the changelog alongside your PR title. - If the section is completely empty (without any token) or non-existant, + If the section is completely empty (without any token) or non-existant, the changelog will just contain the title of the PR for the changelog entry, without any description. If the section contains some text other than tokens, it will use it to add a description to the change. 
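The changelog-entry rules quoted above boil down to a small decision procedure; here is a hypothetical Python sketch of it (the real Flower tooling lives in the repo's `dev` scripts and may differ):

```python
def changelog_entry(pr_title, section_text):
    """Derive a changelog entry from a PR following the rules above:
    an empty 'Changelog entry' section yields the PR title alone,
    otherwise the section text becomes the entry's description."""
    description = section_text.strip()
    if not description:
        return pr_title
    return f"{pr_title}: {description}"

print(changelog_entry("Fix minor doc errors", ""))  # Fix minor doc errors
print(changelog_entry("Fix minor doc errors", "Corrects broken links."))
# Fix minor doc errors: Corrects broken links.
```

In other words, leaving the section empty is always safe: the PR title alone becomes the changelog line.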
diff --git a/doc/source/example-fedbn-pytorch-from-centralized-to-federated.rst b/doc/source/example-fedbn-pytorch-from-centralized-to-federated.rst index 5ebaa337dde8..5d4dac0c0cda 100644 --- a/doc/source/example-fedbn-pytorch-from-centralized-to-federated.rst +++ b/doc/source/example-fedbn-pytorch-from-centralized-to-federated.rst @@ -3,11 +3,11 @@ Example: FedBN in PyTorch - From Centralized To Federated This tutorial will show you how to use Flower to build a federated version of an existing machine learning workload with `FedBN `_, a federated training strategy designed for non-iid data. We are using PyTorch to train a Convolutional Neural Network(with Batch Normalization layers) on the CIFAR-10 dataset. -When applying FedBN, only few changes needed compared to `Example: PyTorch - From Centralized To Federated `_. +When applying FedBN, only a few changes are needed compared to `Example: PyTorch - From Centralized To Federated `_. Centralized Training -------------------- -All files are revised based on `Example: PyTorch - From Centralized To Federated `_. +All files are revised based on `Example: PyTorch - From Centralized To Federated `_. The only thing to do is modifying the file called :code:`cifar.py`, revised part is shown below: The model architecture defined in class Net() is added with Batch Normalization layers accordingly. @@ -50,8 +50,8 @@ Let's take the next step and use what we've built to create a federated learning Federated Training ------------------ -If you have read `Example: PyTorch - From Centralized To Federated `_, the following parts are easy to follow, onyl :code:`get_parameters` and :code:`set_parameters` function in :code:`client.py` needed to revise. -If not, please read the `Example: PyTorch - From Centralized To Federated `_. first.
+If you have read `Example: PyTorch - From Centralized To Federated `_, the following parts are easy to follow; only the :code:`get_parameters` and :code:`set_parameters` functions in :code:`client.py` need to be revised. +If not, please read `Example: PyTorch - From Centralized To Federated `_ first. Our example consists of one *server* and two *clients*. In FedBN, :code:`server.py` keeps unchanged, we can start the server directly. @@ -66,7 +66,7 @@ Finally, we will revise our *client* logic by changing :code:`get_parameters` an class CifarClient(fl.client.NumPyClient): """Flower client implementing CIFAR-10 image classification using PyTorch.""" - + ... def get_parameters(self, config) -> List[np.ndarray]: @@ -79,7 +79,7 @@ Finally, we will revise our *client* logic by changing :code:`get_parameters` an params_dict = zip(keys, parameters) state_dict = OrderedDict({k: torch.tensor(v) for k, v in params_dict}) self.model.load_state_dict(state_dict, strict=False) - + ... Now, you can now open two additional terminal windows and run diff --git a/doc/source/how-to-configure-clients.rst b/doc/source/how-to-configure-clients.rst index 26c132125ccf..bfb5a8f63761 100644 --- a/doc/source/how-to-configure-clients.rst +++ b/doc/source/how-to-configure-clients.rst @@ -13,7 +13,7 @@ Configuration values are represented as a dictionary with ``str`` keys and value config_dict = { "dropout": True, # str key, bool value "learning_rate": 0.01, # str key, float value - "batch_size": 32, # str key, int value + "batch_size": 32, # str key, int value "optimizer": "sgd", # str key, str value } @@ -56,7 +56,7 @@ To make the built-in strategies use this function, we can pass it to ``FedAvg`` One the client side, we receive the configuration dictionary in ``fit``: ..
code-block:: python - + class FlowerClient(flwr.client.NumPyClient): def fit(parameters, config): print(config["batch_size"]) # Prints `32` @@ -86,7 +86,7 @@ Configuring individual clients In some cases, it is necessary to send different configuration values to different clients. -This can be achieved by customizing an existing strategy or by `implementing a custom strategy from scratch `_. Here's a nonsensical example that customizes :code:`FedAvg` by adding a custom ``"hello": "world"`` configuration key/value pair to the config dict of a *single client* (only the first client in the list, the other clients in this round to not receive this "special" config value): +This can be achieved by customizing an existing strategy or by `implementing a custom strategy from scratch `_. Here's a nonsensical example that customizes :code:`FedAvg` by adding a custom ``"hello": "world"`` configuration key/value pair to the config dict of a *single client* (only the first client in the list, the other clients in this round to not receive this "special" config value): .. code-block:: python diff --git a/doc/source/how-to-install-flower.rst b/doc/source/how-to-install-flower.rst index ff3dbb605846..dc88076424f8 100644 --- a/doc/source/how-to-install-flower.rst +++ b/doc/source/how-to-install-flower.rst @@ -57,7 +57,7 @@ Advanced installation options Install via Docker ~~~~~~~~~~~~~~~~~~ -`How to run Flower using Docker `_ +`How to run Flower using Docker `_ Install pre-release ~~~~~~~~~~~~~~~~~~~ diff --git a/doc/source/how-to-run-flower-using-docker.rst b/doc/source/how-to-run-flower-using-docker.rst index 40df1ffcb63c..ed034c820142 100644 --- a/doc/source/how-to-run-flower-using-docker.rst +++ b/doc/source/how-to-run-flower-using-docker.rst @@ -54,7 +54,7 @@ to the Flower server. Here, we are passing the flag ``--insecure``. The ``--insecure`` flag enables insecure communication (using HTTP, not HTTPS) and should only be used for testing purposes. 
We strongly recommend enabling - `SSL `_ + `SSL `_ when deploying to a production environment. You can use ``--help`` to view all available flags that the server supports: @@ -90,7 +90,7 @@ To enable SSL, you will need a CA certificate, a server certificate and a server .. note:: For testing purposes, you can generate your own self-signed certificates. The - `Enable SSL connections `_ + `Enable SSL connections `_ page contains a section that will guide you through the process. Assuming all files we need are in the local ``certificates`` directory, we can use the flag diff --git a/doc/source/how-to-upgrade-to-flower-1.0.rst b/doc/source/how-to-upgrade-to-flower-1.0.rst index fd380e95d69c..c4429d61d0a9 100644 --- a/doc/source/how-to-upgrade-to-flower-1.0.rst +++ b/doc/source/how-to-upgrade-to-flower-1.0.rst @@ -50,7 +50,7 @@ Strategies / ``start_server`` / ``start_simulation`` - Replace ``num_rounds=1`` in ``start_simulation`` with the new ``config=ServerConfig(...)`` (see previous item) - Remove ``force_final_distributed_eval`` parameter from calls to ``start_server``. Distributed evaluation on all clients can be enabled by configuring the strategy to sample all clients for evaluation after the last round of training. - Rename parameter/ndarray conversion functions: - + - ``parameters_to_weights`` --> ``parameters_to_ndarrays`` - ``weights_to_parameters`` --> ``ndarrays_to_parameters`` @@ -88,4 +88,4 @@ Along with the necessary changes above, there are a number of potential improvem Further help ------------ -Most official `Flower code examples `_ are already updated to Flower 1.0, they can serve as a reference for using the Flower 1.0 API. If there are further questionsm, `join the Flower Slack `_ and use the channgel ``#questions``. +Most official `Flower code examples `_ are already updated to Flower 1.0; they can serve as a reference for using the Flower 1.0 API. If there are further questions, `join the Flower Slack `_ and use the channel ``#questions``.
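The renamed helpers mentioned in the upgrade hunk above (`parameters_to_ndarrays` / `ndarrays_to_parameters`) convert between Flower's wire-level `Parameters` message and plain arrays. A self-contained, standard-library-only sketch of such a round-trip (illustrative; Flower's real implementation serializes NumPy arrays):

```python
import struct
from dataclasses import dataclass
from typing import List

@dataclass
class Parameters:
    tensors: List[bytes]
    tensor_type: str

def ndarrays_to_parameters(arrays: List[List[float]]) -> Parameters:
    # Pack each 1-D array as little-endian float64 bytes.
    tensors = [struct.pack(f"<{len(a)}d", *a) for a in arrays]
    return Parameters(tensors=tensors, tensor_type="float64")

def parameters_to_ndarrays(params: Parameters) -> List[List[float]]:
    # Each float64 takes 8 bytes; unpack back into Python lists.
    return [list(struct.unpack(f"<{len(t) // 8}d", t)) for t in params.tensors]

weights = [[0.1, 0.2, 0.3], [1.5]]
assert parameters_to_ndarrays(ndarrays_to_parameters(weights)) == weights
```

The rename in Flower 1.0 is purely terminological (ndarrays instead of weights); the conversion direction each function performs is unchanged.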
diff --git a/doc/source/index.rst b/doc/source/index.rst index 7e2b4052bee6..ea52a9421b61 100644 --- a/doc/source/index.rst +++ b/doc/source/index.rst @@ -4,7 +4,7 @@ Flower Framework Documentation .. meta:: :description: Check out the documentation of the main Flower Framework enabling easy Python development for Federated Learning. -Welcome to Flower's documentation. `Flower `_ is a friendly federated learning framework. +Welcome to Flower's documentation. `Flower `_ is a friendly federated learning framework. Join the Flower Community @@ -12,7 +12,7 @@ Join the Flower Community The Flower Community is growing quickly - we're a friendly group of researchers, engineers, students, professionals, academics, and other enthusiasts. -.. button-link:: https://flower.dev/join-slack +.. button-link:: https://flower.ai/join-slack :color: primary :shadow: diff --git a/doc/source/ref-changelog.md b/doc/source/ref-changelog.md index e9282632410d..41dc91873c6c 100644 --- a/doc/source/ref-changelog.md +++ b/doc/source/ref-changelog.md @@ -42,7 +42,7 @@ We would like to give our special thanks to all the contributors who made the ne - **Introduce Docker image for Flower server** ([#2700](https://github.com/adap/flower/pull/2700), [#2688](https://github.com/adap/flower/pull/2688), [#2705](https://github.com/adap/flower/pull/2705), [#2695](https://github.com/adap/flower/pull/2695), [#2747](https://github.com/adap/flower/pull/2747), [#2746](https://github.com/adap/flower/pull/2746), [#2680](https://github.com/adap/flower/pull/2680), [#2682](https://github.com/adap/flower/pull/2682), [#2701](https://github.com/adap/flower/pull/2701)) - The Flower server can now be run using an official Docker image. A new how-to guide explains [how to run Flower using Docker](https://flower.dev/docs/framework/how-to-run-flower-using-docker.html). An official Flower client Docker image will follow. + The Flower server can now be run using an official Docker image. 
A new how-to guide explains [how to run Flower using Docker](https://flower.ai/docs/framework/how-to-run-flower-using-docker.html). An official Flower client Docker image will follow. - **Introduce** `flower-via-docker-compose` **example** ([#2626](https://github.com/adap/flower/pull/2626)) @@ -52,7 +52,7 @@ We would like to give our special thanks to all the contributors who made the ne - **Update code examples to use Flower Datasets** ([#2450](https://github.com/adap/flower/pull/2450), [#2456](https://github.com/adap/flower/pull/2456), [#2318](https://github.com/adap/flower/pull/2318), [#2712](https://github.com/adap/flower/pull/2712)) - Several code examples were updated to use [Flower Datasets](https://flower.dev/docs/datasets/). + Several code examples were updated to use [Flower Datasets](https://flower.ai/docs/datasets/). - **General updates to Flower Examples** ([#2381](https://github.com/adap/flower/pull/2381), [#2805](https://github.com/adap/flower/pull/2805), [#2782](https://github.com/adap/flower/pull/2782), [#2806](https://github.com/adap/flower/pull/2806), [#2829](https://github.com/adap/flower/pull/2829), [#2825](https://github.com/adap/flower/pull/2825), [#2816](https://github.com/adap/flower/pull/2816), [#2726](https://github.com/adap/flower/pull/2726), [#2659](https://github.com/adap/flower/pull/2659), [#2655](https://github.com/adap/flower/pull/2655)) @@ -213,11 +213,11 @@ We would like to give our special thanks to all the contributors who made the ne The new simulation engine has been rewritten from the ground up, yet it remains fully backwards compatible. It offers much improved stability and memory handling, especially when working with GPUs. Simulations transparently adapt to different settings to scale simulation in CPU-only, CPU+GPU, multi-GPU, or multi-node multi-GPU environments. 
- Comprehensive documentation includes a new [how-to run simulations](https://flower.dev/docs/framework/how-to-run-simulations.html) guide, new [simulation-pytorch](https://flower.dev/docs/examples/simulation-pytorch.html) and [simulation-tensorflow](https://flower.dev/docs/examples/simulation-tensorflow.html) notebooks, and a new [YouTube tutorial series](https://www.youtube.com/watch?v=cRebUIGB5RU&list=PLNG4feLHqCWlnj8a_E1A_n5zr2-8pafTB). + Comprehensive documentation includes a new [how-to run simulations](https://flower.ai/docs/framework/how-to-run-simulations.html) guide, new [simulation-pytorch](https://flower.ai/docs/examples/simulation-pytorch.html) and [simulation-tensorflow](https://flower.ai/docs/examples/simulation-tensorflow.html) notebooks, and a new [YouTube tutorial series](https://www.youtube.com/watch?v=cRebUIGB5RU&list=PLNG4feLHqCWlnj8a_E1A_n5zr2-8pafTB). - **Restructure Flower Docs** ([#1824](https://github.com/adap/flower/pull/1824), [#1865](https://github.com/adap/flower/pull/1865), [#1884](https://github.com/adap/flower/pull/1884), [#1887](https://github.com/adap/flower/pull/1887), [#1919](https://github.com/adap/flower/pull/1919), [#1922](https://github.com/adap/flower/pull/1922), [#1920](https://github.com/adap/flower/pull/1920), [#1923](https://github.com/adap/flower/pull/1923), [#1924](https://github.com/adap/flower/pull/1924), [#1962](https://github.com/adap/flower/pull/1962), [#2006](https://github.com/adap/flower/pull/2006), [#2133](https://github.com/adap/flower/pull/2133), [#2203](https://github.com/adap/flower/pull/2203), [#2215](https://github.com/adap/flower/pull/2215), [#2122](https://github.com/adap/flower/pull/2122), [#2223](https://github.com/adap/flower/pull/2223), [#2219](https://github.com/adap/flower/pull/2219), [#2232](https://github.com/adap/flower/pull/2232), [#2233](https://github.com/adap/flower/pull/2233), [#2234](https://github.com/adap/flower/pull/2234), [#2235](https://github.com/adap/flower/pull/2235), 
[#2237](https://github.com/adap/flower/pull/2237), [#2238](https://github.com/adap/flower/pull/2238), [#2242](https://github.com/adap/flower/pull/2242), [#2231](https://github.com/adap/flower/pull/2231), [#2243](https://github.com/adap/flower/pull/2243), [#2227](https://github.com/adap/flower/pull/2227)) - Much effort went into a completely restructured Flower docs experience. The documentation on [flower.dev/docs](https://flower.dev/docs) is now divided into Flower Framework, Flower Baselines, Flower Android SDK, Flower iOS SDK, and code example projects. + Much effort went into a completely restructured Flower docs experience. The documentation on [flower.ai/docs](https://flower.ai/docs) is now divided into Flower Framework, Flower Baselines, Flower Android SDK, Flower iOS SDK, and code example projects. - **Introduce Flower Swift SDK** ([#1858](https://github.com/adap/flower/pull/1858), [#1897](https://github.com/adap/flower/pull/1897)) @@ -303,7 +303,7 @@ We would like to give our special thanks to all the contributors who made the ne - **Introduce new "What is Federated Learning?" tutorial** ([#1657](https://github.com/adap/flower/pull/1657), [#1721](https://github.com/adap/flower/pull/1721)) - A new [entry-level tutorial](https://flower.dev/docs/framework/tutorial-what-is-federated-learning.html) in our documentation explains the basics of Fedetated Learning. It enables anyone who's unfamiliar with Federated Learning to start their journey with Flower. Forward it to anyone who's interested in Federated Learning! + A new [entry-level tutorial](https://flower.ai/docs/framework/tutorial-what-is-federated-learning.html) in our documentation explains the basics of Federated Learning. It enables anyone who's unfamiliar with Federated Learning to start their journey with Flower. Forward it to anyone who's interested in Federated Learning! 
- **Introduce new Flower Baseline: FedProx MNIST** ([#1513](https://github.com/adap/flower/pull/1513), [#1680](https://github.com/adap/flower/pull/1680), [#1681](https://github.com/adap/flower/pull/1681), [#1679](https://github.com/adap/flower/pull/1679)) @@ -417,7 +417,7 @@ We would like to give our special thanks to all the contributors who made the ne - **Introduce new Flower Baseline: FedAvg MNIST** ([#1497](https://github.com/adap/flower/pull/1497), [#1552](https://github.com/adap/flower/pull/1552)) - Over the coming weeks, we will be releasing a number of new reference implementations useful especially to FL newcomers. They will typically revisit well known papers from the literature, and be suitable for integration in your own application or for experimentation, in order to deepen your knowledge of FL in general. Today's release is the first in this series. [Read more.](https://flower.dev/blog/2023-01-12-fl-starter-pack-fedavg-mnist-cnn/) + Over the coming weeks, we will be releasing a number of new reference implementations useful especially to FL newcomers. They will typically revisit well known papers from the literature, and be suitable for integration in your own application or for experimentation, in order to deepen your knowledge of FL in general. Today's release is the first in this series. [Read more.](https://flower.ai/blog/2023-01-12-fl-starter-pack-fedavg-mnist-cnn/) - **Improve GPU support in simulations** ([#1555](https://github.com/adap/flower/pull/1555)) @@ -427,16 +427,16 @@ We would like to give our special thanks to all the contributors who made the ne Some users reported that Jupyter Notebooks have not always been easy to use on GPU instances. We listened and made improvements to all of our Jupyter notebooks! 
Check out the updated notebooks here: - - [An Introduction to Federated Learning](https://flower.dev/docs/framework/tutorial-get-started-with-flower-pytorch.html) - - [Strategies in Federated Learning](https://flower.dev/docs/framework/tutorial-use-a-federated-learning-strategy-pytorch.html) - - [Building a Strategy](https://flower.dev/docs/framework/tutorial-build-a-strategy-from-scratch-pytorch.html) - - [Client and NumPyClient](https://flower.dev/docs/framework/tutorial-customize-the-client-pytorch.html) + - [An Introduction to Federated Learning](https://flower.ai/docs/framework/tutorial-get-started-with-flower-pytorch.html) + - [Strategies in Federated Learning](https://flower.ai/docs/framework/tutorial-use-a-federated-learning-strategy-pytorch.html) + - [Building a Strategy](https://flower.ai/docs/framework/tutorial-build-a-strategy-from-scratch-pytorch.html) + - [Client and NumPyClient](https://flower.ai/docs/framework/tutorial-customize-the-client-pytorch.html) - **Introduce optional telemetry** ([#1533](https://github.com/adap/flower/pull/1533), [#1544](https://github.com/adap/flower/pull/1544), [#1584](https://github.com/adap/flower/pull/1584)) After a [request for feedback](https://github.com/adap/flower/issues/1534) from the community, the Flower open-source project introduces optional collection of *anonymous* usage metrics to make well-informed decisions to improve Flower. Doing this enables the Flower team to understand how Flower is used and what challenges users might face. - **Flower is a friendly framework for collaborative AI and data science.** Staying true to this statement, Flower makes it easy to disable telemetry for users who do not want to share anonymous usage metrics. [Read more.](https://flower.dev/docs/telemetry.html). + **Flower is a friendly framework for collaborative AI and data science.** Staying true to this statement, Flower makes it easy to disable telemetry for users who do not want to share anonymous usage metrics. 
[Read more.](https://flower.ai/docs/telemetry.html). - **Introduce (experimental) Driver API** ([#1520](https://github.com/adap/flower/pull/1520), [#1525](https://github.com/adap/flower/pull/1525), [#1545](https://github.com/adap/flower/pull/1545), [#1546](https://github.com/adap/flower/pull/1546), [#1550](https://github.com/adap/flower/pull/1550), [#1551](https://github.com/adap/flower/pull/1551), [#1567](https://github.com/adap/flower/pull/1567)) @@ -468,7 +468,7 @@ We would like to give our special thanks to all the contributors who made the ne As usual, the documentation has improved quite a bit. It is another step in our effort to make the Flower documentation the best documentation of any project. Stay tuned and as always, feel free to provide feedback! - One highlight is the new [first time contributor guide](https://flower.dev/docs/first-time-contributors.html): if you've never contributed on GitHub before, this is the perfect place to start! + One highlight is the new [first time contributor guide](https://flower.ai/docs/first-time-contributors.html): if you've never contributed on GitHub before, this is the perfect place to start! ### Incompatible changes @@ -657,7 +657,7 @@ We would like to give our **special thanks** to all the contributors who made Fl - **Flower Baselines (preview): FedOpt, FedBN, FedAvgM** ([#919](https://github.com/adap/flower/pull/919), [#1127](https://github.com/adap/flower/pull/1127), [#914](https://github.com/adap/flower/pull/914)) - The first preview release of Flower Baselines has arrived! We're kickstarting Flower Baselines with implementations of FedOpt (FedYogi, FedAdam, FedAdagrad), FedBN, and FedAvgM. Check the documentation on how to use [Flower Baselines](https://flower.dev/docs/using-baselines.html). With this first preview release we're also inviting the community to [contribute their own baselines](https://flower.dev/docs/contributing-baselines.html). + The first preview release of Flower Baselines has arrived! 
We're kickstarting Flower Baselines with implementations of FedOpt (FedYogi, FedAdam, FedAdagrad), FedBN, and FedAvgM. Check the documentation on how to use [Flower Baselines](https://flower.ai/docs/using-baselines.html). With this first preview release we're also inviting the community to [contribute their own baselines](https://flower.ai/docs/contributing-baselines.html). - **C++ client SDK (preview) and code example** ([#1111](https://github.com/adap/flower/pull/1111)) @@ -703,7 +703,7 @@ We would like to give our **special thanks** to all the contributors who made Fl - New option to keep Ray running if Ray was already initialized in `start_simulation` ([#1177](https://github.com/adap/flower/pull/1177)) - Add support for custom `ClientManager` as a `start_simulation` parameter ([#1171](https://github.com/adap/flower/pull/1171)) - - New documentation for [implementing strategies](https://flower.dev/docs/framework/how-to-implement-strategies.html) ([#1097](https://github.com/adap/flower/pull/1097), [#1175](https://github.com/adap/flower/pull/1175)) + - New documentation for [implementing strategies](https://flower.ai/docs/framework/how-to-implement-strategies.html) ([#1097](https://github.com/adap/flower/pull/1097), [#1175](https://github.com/adap/flower/pull/1175)) - New mobile-friendly documentation theme ([#1174](https://github.com/adap/flower/pull/1174)) - Limit version range for (optional) `ray` dependency to include only compatible releases (`>=1.9.2,<1.12.0`) ([#1205](https://github.com/adap/flower/pull/1205)) diff --git a/doc/source/ref-example-projects.rst b/doc/source/ref-example-projects.rst index b47bd8e48997..8eb723000cac 100644 --- a/doc/source/ref-example-projects.rst +++ b/doc/source/ref-example-projects.rst @@ -23,8 +23,8 @@ The TensorFlow/Keras quickstart example shows CIFAR-10 image classification with MobileNetV2: - `Quickstart TensorFlow (Code) `_ -- `Quickstart TensorFlow (Tutorial) `_ -- `Quickstart TensorFlow (Blog Post) `_ +- `Quickstart 
TensorFlow (Tutorial) `_ +- `Quickstart TensorFlow (Blog Post) `_ Quickstart PyTorch @@ -34,7 +34,7 @@ The PyTorch quickstart example shows CIFAR-10 image classification with a simple Convolutional Neural Network: - `Quickstart PyTorch (Code) `_ -- `Quickstart PyTorch (Tutorial) `_ +- `Quickstart PyTorch (Tutorial) `_ PyTorch: From Centralized To Federated @@ -43,7 +43,7 @@ PyTorch: From Centralized To Federated This example shows how a regular PyTorch project can be federated using Flower: - `PyTorch: From Centralized To Federated (Code) `_ -- `PyTorch: From Centralized To Federated (Tutorial) `_ +- `PyTorch: From Centralized To Federated (Tutorial) `_ Federated Learning on Raspberry Pi and Nvidia Jetson @@ -52,7 +52,7 @@ Federated Learning on Raspberry Pi and Nvidia Jetson This example shows how Flower can be used to build a federated learning system that run across Raspberry Pi and Nvidia Jetson: - `Federated Learning on Raspberry Pi and Nvidia Jetson (Code) `_ -- `Federated Learning on Raspberry Pi and Nvidia Jetson (Blog Post) `_ +- `Federated Learning on Raspberry Pi and Nvidia Jetson (Blog Post) `_ diff --git a/doc/source/ref-faq.rst b/doc/source/ref-faq.rst index 13c44bc64b0e..932396e3c583 100644 --- a/doc/source/ref-faq.rst +++ b/doc/source/ref-faq.rst @@ -6,20 +6,20 @@ This page collects answers to commonly asked questions about Federated Learning .. dropdown:: :fa:`eye,mr-1` Can Flower run on Juptyter Notebooks / Google Colab? Yes, it can! Flower even comes with a few under-the-hood optimizations to make it work even better on Colab. Here's a quickstart example: - + * `Flower simulation PyTorch `_ * `Flower simulation TensorFlow/Keras `_ .. dropdown:: :fa:`eye,mr-1` How can I run Federated Learning on a Raspberry Pi? - Find the `blog post about federated learning on embedded device here `_ and the corresponding `GitHub code example `_. + Find the `blog post about federated learning on embedded devices here `_ and the corresponding `GitHub code example `_. 
.. dropdown:: :fa:`eye,mr-1` Does Flower support federated learning on Android devices? - Yes, it does. Please take a look at our `blog post `_ or check out the code examples: + Yes, it does. Please take a look at our `blog post `_ or check out the code examples: - * `Android Kotlin example `_ - * `Android Java example `_ + * `Android Kotlin example `_ + * `Android Java example `_ .. dropdown:: :fa:`eye,mr-1` Can I combine federated learning with blockchain? diff --git a/doc/source/ref-telemetry.md b/doc/source/ref-telemetry.md index 206e641d8b41..49efef5c8559 100644 --- a/doc/source/ref-telemetry.md +++ b/doc/source/ref-telemetry.md @@ -41,7 +41,7 @@ Flower telemetry collects the following metrics: **Source.** Flower telemetry tries to store a random source ID in `~/.flwr/source` the first time a telemetry event is generated. The source ID is important to identify whether an issue is recurring or whether an issue is triggered by multiple clusters running concurrently (which often happens in simulation). For example, if a device runs multiple workloads at the same time, and this results in an issue, then, in order to reproduce the issue, multiple workloads must be started at the same time. -You may delete the source ID at any time. If you wish for all events logged under a specific source ID to be deleted, you can send a deletion request mentioning the source ID to `telemetry@flower.dev`. All events related to that source ID will then be permanently deleted. +You may delete the source ID at any time. If you wish for all events logged under a specific source ID to be deleted, you can send a deletion request mentioning the source ID to `telemetry@flower.ai`. All events related to that source ID will then be permanently deleted. We will not collect any personally identifiable information. If you think any of the metrics collected could be misused in any way, please [get in touch with us](#how-to-contact-us). 
We will update this page to reflect any changes to the metrics collected and publish changes in the changelog. @@ -63,4 +63,4 @@ FLWR_TELEMETRY_ENABLED=0 FLWR_TELEMETRY_LOGGING=1 python server.py # or client.p ## How to contact us -We want to hear from you. If you have any feedback or ideas on how to improve the way we handle anonymous usage metrics, reach out to us via [Slack](https://flower.dev/join-slack/) (channel `#telemetry`) or email (`telemetry@flower.dev`). +We want to hear from you. If you have any feedback or ideas on how to improve the way we handle anonymous usage metrics, reach out to us via [Slack](https://flower.ai/join-slack/) (channel `#telemetry`) or email (`telemetry@flower.ai`). diff --git a/doc/source/tutorial-quickstart-ios.rst b/doc/source/tutorial-quickstart-ios.rst index 7c8007baaa75..aa94a72580c1 100644 --- a/doc/source/tutorial-quickstart-ios.rst +++ b/doc/source/tutorial-quickstart-ios.rst @@ -7,14 +7,14 @@ Quickstart iOS .. meta:: :description: Read this Federated Learning quickstart tutorial for creating an iOS app using Flower to train a neural network on MNIST. -In this tutorial we will learn how to train a Neural Network on MNIST using Flower and CoreML on iOS devices. +In this tutorial we will learn how to train a Neural Network on MNIST using Flower and CoreML on iOS devices. -First of all, for running the Flower Python server, it is recommended to create a virtual environment and run everything within a `virtualenv `_. +First of all, for running the Flower Python server, it is recommended to create a virtual environment and run everything within a `virtualenv `_. For the Flower client implementation in iOS, it is recommended to use Xcode as our IDE. -Our example consists of one Python *server* and two iPhone *clients* that all have the same model. +Our example consists of one Python *server* and two iPhone *clients* that all have the same model. 
-*Clients* are responsible for generating individual weight updates for the model based on their local datasets. +*Clients* are responsible for generating individual weight updates for the model based on their local datasets. These updates are then sent to the *server* which will aggregate them to produce a better model. Finally, the *server* sends this improved version of the model back to each *client*. A complete cycle of weight updates is called a *round*. @@ -44,10 +44,10 @@ For simplicity reasons we will use the complete Flower client with CoreML, that public func getParameters() -> GetParametersRes { let parameters = parameters.weightsToParameters() let status = Status(code: .ok, message: String()) - + return GetParametersRes(parameters: parameters, status: status) } - + /// Calls the routine to fit the local model /// /// - Returns: The result from the local training, e.g., updated parameters @@ -55,17 +55,17 @@ For simplicity reasons we will use the complete Flower client with CoreML, that let status = Status(code: .ok, message: String()) let result = runMLTask(configuration: parameters.parametersToWeights(parameters: ins.parameters), task: .train) let parameters = parameters.weightsToParameters() - + return FitRes(parameters: parameters, numExamples: result.numSamples, status: status) } - + /// Calls the routine to evaluate the local model /// /// - Returns: The result from the evaluation, e.g., loss public func evaluate(ins: EvaluateIns) -> EvaluateRes { let status = Status(code: .ok, message: String()) let result = runMLTask(configuration: parameters.parametersToWeights(parameters: ins.parameters), task: .test) - + return EvaluateRes(loss: Float(result.loss), numExamples: result.numSamples, status: status) } @@ -88,12 +88,12 @@ For the MNIST dataset, we need to preprocess it into :code:`MLBatchProvider` obj // prepare train dataset let trainBatchProvider = DataLoader.trainBatchProvider() { _ in } - + // prepare test dataset let testBatchProvider = 
DataLoader.testBatchProvider() { _ in } - + // load them together - let dataLoader = MLDataLoader(trainBatchProvider: trainBatchProvider, + let dataLoader = MLDataLoader(trainBatchProvider: trainBatchProvider, testBatchProvider: testBatchProvider) Since CoreML does not allow the model parameters to be seen before training, and accessing the model parameters during or after the training can only be done by specifying the layer name, @@ -122,7 +122,7 @@ Then start the Flower gRPC client and start communicating to the server by passi self.flwrGRPC.startFlwrGRPC(client: self.mlFlwrClient) That's it for the client. We only have to implement :code:`Client` or call the provided -:code:`MLFlwrClient` and call :code:`startFlwrGRPC()`. The attribute :code:`hostname` and :code:`port` tells the client which server to connect to. +:code:`MLFlwrClient` and call :code:`startFlwrGRPC()`. The attributes :code:`hostname` and :code:`port` tell the client which server to connect to. This can be done by entering the hostname and port in the application before clicking the start button to start the federated learning process. Flower Server diff --git a/doc/source/tutorial-quickstart-mxnet.rst b/doc/source/tutorial-quickstart-mxnet.rst index ff8d4b2087dd..08304483af86 100644 --- a/doc/source/tutorial-quickstart-mxnet.rst +++ b/doc/source/tutorial-quickstart-mxnet.rst @@ -9,13 +9,13 @@ Quickstart MXNet .. meta:: :description: Check out this Federated Learning quickstart tutorial for using Flower with MXNet to train a Sequential model on MNIST. -In this tutorial, we will learn how to train a :code:`Sequential` model on MNIST using Flower and MXNet. +In this tutorial, we will learn how to train a :code:`Sequential` model on MNIST using Flower and MXNet. -It is recommended to create a virtual environment and run everything within this `virtualenv `_. +It is recommended to create a virtual environment and run everything within this `virtualenv `_. 
-Our example consists of one *server* and two *clients* all having the same model. +Our example consists of one *server* and two *clients* all having the same model. -*Clients* are responsible for generating individual model parameter updates for the model based on their local datasets. +*Clients* are responsible for generating individual model parameter updates for the model based on their local datasets. These updates are then sent to the *server* which will aggregate them to produce an updated global model. Finally, the *server* sends this improved version of the model back to each *client*. A complete cycle of parameters updates is called a *round*. @@ -35,12 +35,12 @@ Since we want to use MXNet, let's go ahead and install it: Flower Client ------------- -Now that we have all our dependencies installed, let's run a simple distributed training with two clients and one server. Our training procedure and network architecture are based on MXNet´s `Hand-written Digit Recognition tutorial `_. +Now that we have all our dependencies installed, let's run a simple distributed training with two clients and one server. Our training procedure and network architecture are based on MXNet's `Hand-written Digit Recognition tutorial `_. In a file called :code:`client.py`, import Flower and MXNet related packages: .. code-block:: python - + import flwr as fl import numpy as np @@ -58,7 +58,7 @@ In addition, define the device allocation in MXNet with: DEVICE = [mx.gpu() if mx.test_utils.list_gpus() else mx.cpu()] -We use MXNet to load MNIST, a popular image classification dataset of handwritten digits for machine learning. The MXNet utility :code:`mx.test_utils.get_mnist()` downloads the training and test data. +We use MXNet to load MNIST, a popular image classification dataset of handwritten digits for machine learning. The MXNet utility :code:`mx.test_utils.get_mnist()` downloads the training and test data. ..
code-block:: python @@ -72,7 +72,7 @@ We use MXNet to load MNIST, a popular image classification dataset of handwritte val_data = mx.io.NDArrayIter(mnist["test_data"], mnist["test_label"], batch_size) return train_data, val_data -Define the training and loss with MXNet. We train the model by looping over the dataset, measure the corresponding loss, and optimize it. +Define the training and loss with MXNet. We train the model by looping over the dataset, measuring the corresponding loss, and optimizing it. .. code-block:: python @@ -110,7 +110,7 @@ Define the training and loss with MXNet. We train the model by looping over the return trainings_metric, num_examples -Next, we define the validation of our machine learning model. We loop over the test set and measure both loss and accuracy on the test set. +Next, we define the validation of our machine learning model. We loop over the test set and measure both loss and accuracy on the test set. .. code-block:: python @@ -155,7 +155,7 @@ Our Flower clients will use a simple :code:`Sequential` model: init = nd.random.uniform(shape=(2, 784)) model(init) -After loading the dataset with :code:`load_data()` we perform one forward propagation to initialize the model and model parameters with :code:`model(init)`. Next, we implement a Flower client. +After loading the dataset with :code:`load_data()`, we perform one forward propagation to initialize the model and model parameters with :code:`model(init)`. Next, we implement a Flower client. The Flower server interacts with clients through an interface called :code:`Client`.
When the server selects a particular client for training, it @@ -207,7 +207,7 @@ They can be implemented in the following way: [accuracy, loss], num_examples = test(model, val_data) print("Evaluation accuracy & loss", accuracy, loss) return float(loss[1]), val_data.batch_size, {"accuracy": float(accuracy[1])} - + We can now create an instance of our class :code:`MNISTClient` and add one line to actually run this client: diff --git a/doc/source/tutorial-quickstart-pytorch.rst b/doc/source/tutorial-quickstart-pytorch.rst index f15a4a93114e..32f9c5ebb3a1 100644 --- a/doc/source/tutorial-quickstart-pytorch.rst +++ b/doc/source/tutorial-quickstart-pytorch.rst @@ -10,13 +10,13 @@ Quickstart PyTorch .. youtube:: jOmmuzMIQ4c :width: 100% -In this tutorial we will learn how to train a Convolutional Neural Network on CIFAR10 using Flower and PyTorch. +In this tutorial we will learn how to train a Convolutional Neural Network on CIFAR10 using Flower and PyTorch. -First of all, it is recommended to create a virtual environment and run everything within a `virtualenv `_. +First of all, it is recommended to create a virtual environment and run everything within a `virtualenv `_. -Our example consists of one *server* and two *clients* all having the same model. +Our example consists of one *server* and two *clients* all having the same model. -*Clients* are responsible for generating individual weight-updates for the model based on their local datasets. +*Clients* are responsible for generating individual weight-updates for the model based on their local datasets. These updates are then sent to the *server* which will aggregate them to produce a better model. Finally, the *server* sends this improved version of the model back to each *client*. A complete cycle of weight updates is called a *round*. @@ -26,7 +26,7 @@ Now that we have a rough idea of what is going on, let's get started. 
We first n $ pip install flwr -Since we want to use PyTorch to solve a computer vision task, let's go ahead and install PyTorch and the **torchvision** library: +Since we want to use PyTorch to solve a computer vision task, let's go ahead and install PyTorch and the **torchvision** library: .. code-block:: shell @@ -36,12 +36,12 @@ Since we want to use PyTorch to solve a computer vision task, let's go ahead and Flower Client ------------- -Now that we have all our dependencies installed, let's run a simple distributed training with two clients and one server. Our training procedure and network architecture are based on PyTorch's `Deep Learning with PyTorch `_. +Now that we have all our dependencies installed, let's run a simple distributed training with two clients and one server. Our training procedure and network architecture are based on PyTorch's `Deep Learning with PyTorch `_. In a file called :code:`client.py`, import Flower and PyTorch related packages: .. code-block:: python - + from collections import OrderedDict import torch @@ -59,7 +59,7 @@ In addition, we define the device allocation in PyTorch with: DEVICE = torch.device("cuda:0" if torch.cuda.is_available() else "cpu") -We use PyTorch to load CIFAR10, a popular colored image classification dataset for machine learning. The PyTorch :code:`DataLoader()` downloads the training and test data that are then normalized. +We use PyTorch to load CIFAR10, a popular colored image classification dataset for machine learning. The PyTorch :code:`DataLoader()` downloads the training and test data that are then normalized. .. code-block:: python @@ -75,7 +75,7 @@ We use PyTorch to load CIFAR10, a popular colored image classification dataset f num_examples = {"trainset" : len(trainset), "testset" : len(testset)} return trainloader, testloader, num_examples -Define the loss and optimizer with PyTorch. The training of the dataset is done by looping over the dataset, measure the corresponding loss and optimize it. 
+Define the loss and optimizer with PyTorch. The training is done by looping over the dataset, measuring the corresponding loss, and optimizing it. .. code-block:: python @@ -91,7 +91,7 @@ Define the loss and optimizer with PyTorch. The training of the dataset is done loss.backward() optimizer.step() -Define then the validation of the machine learning network. We loop over the test set and measure the loss and accuracy of the test set. +Then define the validation of the machine learning network. We loop over the test set and measure its loss and accuracy. .. code-block:: python @@ -139,7 +139,7 @@ The Flower clients will use a simple CNN adapted from 'PyTorch: A 60 Minute Blit net = Net().to(DEVICE) trainloader, testloader, num_examples = load_data() -After loading the data set with :code:`load_data()` we define the Flower interface. +After loading the dataset with :code:`load_data()`, we define the Flower interface. The Flower server interacts with clients through an interface called :code:`Client`. When the server selects a particular client for training, it diff --git a/doc/source/tutorial-quickstart-scikitlearn.rst b/doc/source/tutorial-quickstart-scikitlearn.rst index 4921f63bab2c..b95118aa091f 100644 --- a/doc/source/tutorial-quickstart-scikitlearn.rst +++ b/doc/source/tutorial-quickstart-scikitlearn.rst @@ -7,13 +7,13 @@ Quickstart scikit-learn .. meta:: :description: Check out this Federated Learning quickstart tutorial for using Flower with scikit-learn to train a linear regression model. -In this tutorial, we will learn how to train a :code:`Logistic Regression` model on MNIST using Flower and scikit-learn. +In this tutorial, we will learn how to train a :code:`Logistic Regression` model on MNIST using Flower and scikit-learn. -It is recommended to create a virtual environment and run everything within this `virtualenv `_. +It is recommended to create a virtual environment and run everything within this `virtualenv `_. 
-Our example consists of one *server* and two *clients* all having the same model. +Our example consists of one *server* and two *clients* all having the same model. -*Clients* are responsible for generating individual model parameter updates for the model based on their local datasets. +*Clients* are responsible for generating individual model parameter updates for the model based on their local datasets. These updates are then sent to the *server* which will aggregate them to produce an updated global model. Finally, the *server* sends this improved version of the model back to each *client*. A complete cycle of parameter updates is called a *round*. @@ -59,7 +59,7 @@ Please check out :code:`utils.py` `here `_, a popular image classification dataset of handwritten digits for machine learning. The utility :code:`utils.load_mnist()` downloads the training and test data. The training set is split afterwards into 10 partitions with :code:`utils.partition()`. +We load the MNIST dataset from `OpenML `_, a popular image classification dataset of handwritten digits for machine learning. The utility :code:`utils.load_mnist()` downloads the training and test data. The training set is split afterwards into 10 partitions with :code:`utils.partition()`. .. code-block:: python diff --git a/doc/source/tutorial-quickstart-xgboost.rst b/doc/source/tutorial-quickstart-xgboost.rst index 3a7b356c4d2a..ec9101f4b3fd 100644 --- a/doc/source/tutorial-quickstart-xgboost.rst +++ b/doc/source/tutorial-quickstart-xgboost.rst @@ -36,7 +36,7 @@ and then we dive into a more complex example (`full code xgboost-comprehensive < Environment Setup -------------------- -First of all, it is recommended to create a virtual environment and run everything within a `virtualenv `_. +First of all, it is recommended to create a virtual environment and run everything within a `virtualenv `_. We first need to install Flower and Flower Datasets.
You can do this by running: @@ -596,7 +596,7 @@ Comprehensive Federated XGBoost Now that you know how federated XGBoost works with Flower, it's time to run some more comprehensive experiments by customising the experimental settings. In the xgboost-comprehensive example (`full code `_), we provide more options to define various experimental setups, including aggregation strategies, data partitioning and centralised/distributed evaluation. -We also support `Flower simulation `_ making it easy to simulate large client cohorts in a resource-aware manner. +We also support `Flower simulation `_ making it easy to simulate large client cohorts in a resource-aware manner. Let's take a look! Cyclic training diff --git a/doc/source/tutorial-series-customize-the-client-pytorch.ipynb b/doc/source/tutorial-series-customize-the-client-pytorch.ipynb index 0ff67de6f51d..bcfdeb30d3c7 100644 --- a/doc/source/tutorial-series-customize-the-client-pytorch.ipynb +++ b/doc/source/tutorial-series-customize-the-client-pytorch.ipynb @@ -7,11 +7,11 @@ "source": [ "# Customize the client\n", "\n", - "Welcome to the fourth part of the Flower federated learning tutorial. In the previous parts of this tutorial, we introduced federated learning with PyTorch and Flower ([part 1](https://flower.dev/docs/framework/tutorial-get-started-with-flower-pytorch.html)), we learned how strategies can be used to customize the execution on both the server and the clients ([part 2](https://flower.dev/docs/framework/tutorial-use-a-federated-learning-strategy-pytorch.html)), and we built our own custom strategy from scratch ([part 3](https://flower.dev/docs/framework/tutorial-build-a-strategy-from-scratch-pytorch.html)).\n", + "Welcome to the fourth part of the Flower federated learning tutorial. 
In the previous parts of this tutorial, we introduced federated learning with PyTorch and Flower ([part 1](https://flower.ai/docs/framework/tutorial-get-started-with-flower-pytorch.html)), we learned how strategies can be used to customize the execution on both the server and the clients ([part 2](https://flower.ai/docs/framework/tutorial-use-a-federated-learning-strategy-pytorch.html)), and we built our own custom strategy from scratch ([part 3](https://flower.ai/docs/framework/tutorial-build-a-strategy-from-scratch-pytorch.html)).\n", "\n", "In this notebook, we revisit `NumPyClient` and introduce a new base class for building clients, simply named `Client`. In previous parts of this tutorial, we've based our client on `NumPyClient`, a convenience class which makes it easy to work with machine learning libraries that have good NumPy interoperability. With `Client`, we gain a lot of flexibility that we didn't have before, but we'll also have to do a few things that we didn't have to do before.\n", "\n", - "> [Star Flower on GitHub](https://github.com/adap/flower) ⭐️ and join the Flower community on Slack to connect, ask questions, and get help: [Join Slack](https://flower.dev/join-slack) 🌼 We'd love to hear from you in the `#introductions` channel! And if anything is unclear, head over to the `#questions` channel.\n", + "> [Star Flower on GitHub](https://github.com/adap/flower) ⭐️ and join the Flower community on Slack to connect, ask questions, and get help: [Join Slack](https://flower.ai/join-slack) 🌼 We'd love to hear from you in the `#introductions` channel! And if anything is unclear, head over to the `#questions` channel.\n", "\n", "Let's go deeper and see what it takes to move from `NumPyClient` to `Client`!"
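The extra work the notebook alludes to when moving from `NumPyClient` to `Client` is, at its core, manual (de)serialization. The classes below are illustrative stand-ins written for this document, NOT the real `flwr.client` API:

```python
# Illustrative stand-ins only — not the real flwr.client classes. The point:
# a NumPyClient-style class works on plain parameter lists, while a
# Client-style class must also encode/decode the wire format itself.

def serialize(params):
    """Toy stand-in for converting parameters to a wire format."""
    return ",".join(str(p) for p in params).encode()

def deserialize(blob):
    """Toy stand-in for the reverse conversion."""
    return [float(part) for part in blob.split(b",")]

class NumPyStyleClient:
    """The convenient path: fit() sees plain parameters directly."""

    def fit(self, params):
        return [p + 1.0 for p in params]  # pretend one step of local training

class LowLevelClient:
    """The flexible path: fit() receives, and must return, encoded bytes."""

    def __init__(self, inner):
        self.inner = inner

    def fit(self, blob):
        params = deserialize(blob)        # extra step: decode the request
        updated = self.inner.fit(params)  # the actual training logic
        return serialize(updated)         # extra step: encode the reply

client = LowLevelClient(NumPyStyleClient())
reply = deserialize(client.fit(serialize([1.0, 2.0])))  # [2.0, 3.0]
```

The flexibility comes from controlling both encoding steps yourself, at the cost of writing them.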
] diff --git a/doc/source/tutorial-series-get-started-with-flower-pytorch.ipynb b/doc/source/tutorial-series-get-started-with-flower-pytorch.ipynb index 704ed520bf3e..f4b8acaa5bb8 100644 --- a/doc/source/tutorial-series-get-started-with-flower-pytorch.ipynb +++ b/doc/source/tutorial-series-get-started-with-flower-pytorch.ipynb @@ -9,9 +9,9 @@ "\n", "Welcome to the Flower federated learning tutorial!\n", "\n", - "In this notebook, we'll build a federated learning system using Flower, [Flower Datasets](https://flower.dev/docs/datasets/) and PyTorch. In part 1, we use PyTorch for the model training pipeline and data loading. In part 2, we continue to federate the PyTorch-based pipeline using Flower.\n", + "In this notebook, we'll build a federated learning system using Flower, [Flower Datasets](https://flower.ai/docs/datasets/) and PyTorch. In part 1, we use PyTorch for the model training pipeline and data loading. In part 2, we continue to federate the PyTorch-based pipeline using Flower.\n", "\n", - "> [Star Flower on GitHub](https://github.com/adap/flower) ⭐️ and join the Flower community on Slack to connect, ask questions, and get help: [Join Slack](https://flower.dev/join-slack) 🌼 We'd love to hear from you in the `#introductions` channel! And if anything is unclear, head over to the `#questions` channel.\n", + "> [Star Flower on GitHub](https://github.com/adap/flower) ⭐️ and join the Flower community on Slack to connect, ask questions, and get help: [Join Slack](https://flower.ai/join-slack) 🌼 We'd love to hear from you in the `#introductions` channel! And if anything is unclear, head over to the `#questions` channel.\n", "\n", "Let's get started!"
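Each round in the kind of federated system this notebook builds ends with the server averaging client parameters, weighted by each client's local example count (FedAvg). A framework-free sketch of that aggregation step (an illustration, not Flower's implementation):

```python
def federated_average(results):
    """Average client parameter vectors, weighted by local example counts.

    `results` mimics what a FedAvg-style strategy aggregates after a round:
    a list of (num_examples, parameters) pairs, one per participating client.
    """
    total_examples = sum(n for n, _ in results)
    num_params = len(results[0][1])
    return [
        sum(n * params[i] for n, params in results) / total_examples
        for i in range(num_params)
    ]

# Two clients: one trained on 100 local examples, the other on 300.
round_results = [(100, [1.0, 2.0]), (300, [3.0, 4.0])]
global_params = federated_average(round_results)  # [2.5, 3.5]
```

Weighting by example count is what pulls the global model toward clients with more data; an unweighted mean would give `[2.0, 3.0]` instead.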
] diff --git a/doc/source/tutorial-series-use-a-federated-learning-strategy-pytorch.ipynb b/doc/source/tutorial-series-use-a-federated-learning-strategy-pytorch.ipynb index 06f53cd8e1b1..c758b8f637b0 100644 --- a/doc/source/tutorial-series-use-a-federated-learning-strategy-pytorch.ipynb +++ b/doc/source/tutorial-series-use-a-federated-learning-strategy-pytorch.ipynb @@ -7,11 +7,11 @@ "source": [ "# Use a federated learning strategy\n", "\n", - "Welcome to the next part of the federated learning tutorial. In previous parts of this tutorial, we introduced federated learning with PyTorch and Flower ([part 1](https://flower.dev/docs/framework/tutorial-get-started-with-flower-pytorch.html)).\n", + "Welcome to the next part of the federated learning tutorial. In previous parts of this tutorial, we introduced federated learning with PyTorch and Flower ([part 1](https://flower.ai/docs/framework/tutorial-get-started-with-flower-pytorch.html)).\n", "\n", - "In this notebook, we'll begin to customize the federated learning system we built in the introductory notebook (again, using [Flower](https://flower.dev/) and [PyTorch](https://pytorch.org/)).\n", + "In this notebook, we'll begin to customize the federated learning system we built in the introductory notebook (again, using [Flower](https://flower.ai/) and [PyTorch](https://pytorch.org/)).\n", "\n", - "> [Star Flower on GitHub](https://github.com/adap/flower) ⭐️ and join the Flower community on Slack to connect, ask questions, and get help: [Join Slack](https://flower.dev/join-slack) 🌼 We'd love to hear from you in the `#introductions` channel! And if anything is unclear, head over to the `#questions` channel.\n", + "> [Star Flower on GitHub](https://github.com/adap/flower) ⭐️ and join the Flower community on Slack to connect, ask questions, and get help: [Join Slack](https://flower.ai/join-slack) 🌼 We'd love to hear from you in the `#introductions` channel! 
And if anything is unclear, head over to the `#questions` channel.\n", "\n", "Let's move beyond FedAvg with Flower strategies!" ] diff --git a/examples/doc/source/_templates/base.html b/examples/doc/source/_templates/base.html index e4fe80720b74..08030fb08c15 100644 --- a/examples/doc/source/_templates/base.html +++ b/examples/doc/source/_templates/base.html @@ -5,7 +5,7 @@ - + {%- if metatags %}{{ metatags }}{% endif -%} @@ -99,6 +99,6 @@ {%- endblock -%} {%- endblock scripts -%} - + diff --git a/examples/doc/source/conf.py b/examples/doc/source/conf.py index 608aaeaeed6b..bf177aa5ae24 100644 --- a/examples/doc/source/conf.py +++ b/examples/doc/source/conf.py @@ -76,7 +76,7 @@ html_title = f"Flower Examples {release}" html_logo = "_static/flower-logo.png" html_favicon = "_static/favicon.ico" -html_baseurl = "https://flower.dev/docs/examples/" +html_baseurl = "https://flower.ai/docs/examples/" html_theme_options = { # From 6da77b1ee5061e681db68c0b9ae575448d2c7ac2 Mon Sep 17 00:00:00 2001 From: Gustavo Bertoli Date: Fri, 16 Feb 2024 17:35:23 +0100 Subject: [PATCH 015/102] Fix a typo in the datasets documentation (#2808) --- datasets/doc/source/tutorial-quickstart.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/datasets/doc/source/tutorial-quickstart.rst b/datasets/doc/source/tutorial-quickstart.rst index b93a08f234f2..bd4f336d618d 100644 --- a/datasets/doc/source/tutorial-quickstart.rst +++ b/datasets/doc/source/tutorial-quickstart.rst @@ -40,7 +40,7 @@ To iid partition your dataset, choose the split you want to partition and the nu partition = fds.load_partition(0, "train") centralized_dataset = fds.load_full("test") -Now you're ready to go. You have ten partitions created from the train split of the MNIST dataset and the test split +Now you're ready to go. You have ten partitions created from the train split of the CIFAR10 dataset and the test split for the centralized evaluation. 
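The setup this fix describes — ten partitions from the train split, plus a whole test split for centralized evaluation — can be imitated with a tiny IID partitioner. This is a sketch of the idea only, not the `flwr-datasets` implementation:

```python
def iid_partition(dataset, num_partitions):
    """Deal examples round-robin into near-equal IID shards, one per client."""
    return [dataset[i::num_partitions] for i in range(num_partitions)]

trainset = list(range(50_000))            # stand-in for a 50k-example train split
partitions = iid_partition(trainset, 10)  # one shard per simulated client
# Each shard holds 5,000 examples; together they cover the train split exactly,
# while the test split stays whole for centralized evaluation.
```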
We will convert the type of the dataset from Hugging Face's `Dataset` type to the one supported by your framework. From 17fb28f27ed1be4e1d9ebba8f4438b72d36c78e2 Mon Sep 17 00:00:00 2001 From: "Daniel J. Beutel" Date: Fri, 16 Feb 2024 18:04:17 +0100 Subject: [PATCH 016/102] Introduce ServerApp compatibility mode (#2958) --- doc/source/ref-api-cli.rst | 28 +++++------ .../{mt-pytorch => app-pytorch}/README.md | 0 .../{mt-pytorch => app-pytorch}/client.py | 11 +---- .../{mt-pytorch => app-pytorch}/driver.py | 0 .../pyproject.toml | 2 +- .../requirements.txt | 0 .../start_driver.py => app-pytorch/server.py} | 13 +++-- examples/{mt-pytorch => app-pytorch}/task.py | 0 src/py/flwr/server/compat/app.py | 48 ++++++++++++------- src/py/flwr/server/run_serverapp.py | 20 +++++++- src/py/flwr/server/serverapp.py | 9 ++++ 11 files changed, 82 insertions(+), 49 deletions(-) rename examples/{mt-pytorch => app-pytorch}/README.md (100%) rename examples/{mt-pytorch => app-pytorch}/client.py (78%) rename examples/{mt-pytorch => app-pytorch}/driver.py (100%) rename examples/{mt-pytorch => app-pytorch}/pyproject.toml (83%) rename examples/{mt-pytorch => app-pytorch}/requirements.txt (100%) rename examples/{mt-pytorch/start_driver.py => app-pytorch/server.py} (85%) rename examples/{mt-pytorch => app-pytorch}/task.py (100%) diff --git a/doc/source/ref-api-cli.rst b/doc/source/ref-api-cli.rst index c0e8940061fc..63579143755d 100644 --- a/doc/source/ref-api-cli.rst +++ b/doc/source/ref-api-cli.rst @@ -31,22 +31,22 @@ flower-fleet-api :func: _parse_args_run_fleet_api :prog: flower-fleet-api -.. .. _flower-client-app-apiref: +.. _flower-client-app-apiref: -.. flower-client-app -.. ~~~~~~~~~~~~~~~~~ +flower-client-app +~~~~~~~~~~~~~~~~~ -.. .. argparse:: -.. :filename: flwr.client -.. :func: _parse_args_run_client_app -.. :prog: flower-client-app +.. argparse:: + :module: flwr.client.app + :func: _parse_args_run_client_app + :prog: flower-client-app -.. .. _flower-server-app-apiref: +.. 
_flower-server-app-apiref: -.. flower-server-app -.. ~~~~~~~~~~~~~~~~~ +flower-server-app +~~~~~~~~~~~~~~~~~ -.. .. argparse:: -.. :filename: flwr.server -.. :func: _parse_args_run_server_app -.. :prog: flower-server-app +.. argparse:: + :module: flwr.server.run_serverapp + :func: _parse_args_run_server_app + :prog: flower-server-app diff --git a/examples/mt-pytorch/README.md b/examples/app-pytorch/README.md similarity index 100% rename from examples/mt-pytorch/README.md rename to examples/app-pytorch/README.md diff --git a/examples/mt-pytorch/client.py b/examples/app-pytorch/client.py similarity index 78% rename from examples/mt-pytorch/client.py rename to examples/app-pytorch/client.py index 1f2db323ac34..8095a2d7aa93 100644 --- a/examples/mt-pytorch/client.py +++ b/examples/app-pytorch/client.py @@ -38,16 +38,7 @@ def client_fn(cid: str): return FlowerClient().to_client() -# To run this: `flower-client client:app` +# Run via `flower-client-app client:app` app = fl.client.ClientApp( client_fn=client_fn, ) - - -if __name__ == "__main__": - # Start Flower client - fl.client.start_client( - server_address="0.0.0.0:9092", # "0.0.0.0:9093" for REST - client_fn=client_fn, - transport="grpc-rere", # "rest" for REST - ) diff --git a/examples/mt-pytorch/driver.py b/examples/app-pytorch/driver.py similarity index 100% rename from examples/mt-pytorch/driver.py rename to examples/app-pytorch/driver.py diff --git a/examples/mt-pytorch/pyproject.toml b/examples/app-pytorch/pyproject.toml similarity index 83% rename from examples/mt-pytorch/pyproject.toml rename to examples/app-pytorch/pyproject.toml index 4978035495ea..5603b0f03480 100644 --- a/examples/mt-pytorch/pyproject.toml +++ b/examples/app-pytorch/pyproject.toml @@ -10,7 +10,7 @@ authors = ["The Flower Authors "] [tool.poetry.dependencies] python = ">=3.8,<3.11" -flwr-nightly = {version = ">=1.0,<2.0", extras = ["rest", "simulation"]} +flwr = { path = "../../", develop = true, extras = ["simulation"] } torch = "1.13.1" 
torchvision = "0.14.1" tqdm = "4.65.0" diff --git a/examples/mt-pytorch/requirements.txt b/examples/app-pytorch/requirements.txt similarity index 100% rename from examples/mt-pytorch/requirements.txt rename to examples/app-pytorch/requirements.txt diff --git a/examples/mt-pytorch/start_driver.py b/examples/app-pytorch/server.py similarity index 85% rename from examples/mt-pytorch/start_driver.py rename to examples/app-pytorch/server.py index 307f4ebd1a3b..fbf3f24a133d 100644 --- a/examples/mt-pytorch/start_driver.py +++ b/examples/app-pytorch/server.py @@ -33,10 +33,9 @@ def weighted_average(metrics: List[Tuple[int, Metrics]]) -> Metrics: fit_metrics_aggregation_fn=weighted_average, ) -if __name__ == "__main__": - # Start Flower server - fl.server.driver.start_driver( - server_address="0.0.0.0:9091", - config=fl.server.ServerConfig(num_rounds=3), - strategy=strategy, - ) + +# Run via `flower-server-app server:app` +app = fl.server.ServerApp( + config=fl.server.ServerConfig(num_rounds=3), + strategy=strategy, +) diff --git a/examples/mt-pytorch/task.py b/examples/app-pytorch/task.py similarity index 100% rename from examples/mt-pytorch/task.py rename to examples/app-pytorch/task.py diff --git a/src/py/flwr/server/compat/app.py b/src/py/flwr/server/compat/app.py index 06debb858c38..c0255391b885 100644 --- a/src/py/flwr/server/compat/app.py +++ b/src/py/flwr/server/compat/app.py @@ -24,7 +24,7 @@ from flwr.common import EventType, event from flwr.common.address import parse_address -from flwr.common.logger import log +from flwr.common.logger import log, warn_deprecated_feature from flwr.proto import driver_pb2 # pylint: disable=E0611 from flwr.server.app import init_defaults, run_fl from flwr.server.client_manager import ClientManager @@ -33,6 +33,7 @@ from flwr.server.server_config import ServerConfig from flwr.server.strategy import Strategy +from ..driver import Driver from ..driver.grpc_driver import GrpcDriver from .driver_client_proxy import DriverClientProxy @@ 
-54,6 +55,7 @@ def start_driver( # pylint: disable=too-many-arguments, too-many-locals strategy: Optional[Strategy] = None, client_manager: Optional[ClientManager] = None, root_certificates: Optional[Union[bytes, str]] = None, + driver: Optional[Driver] = None, ) -> History: """Start a Flower Driver API server. @@ -81,6 +83,8 @@ def start_driver( # pylint: disable=too-many-arguments, too-many-locals The PEM-encoded root certificates as a byte string or a path string. If provided, a secure connection using the certificates will be established to an SSL-enabled Flower server. + driver : Optional[Driver] (default: None) + The Driver object to use. Returns ------- @@ -101,21 +105,29 @@ def start_driver( # pylint: disable=too-many-arguments, too-many-locals """ event(EventType.START_DRIVER_ENTER) - # Parse IP address - parsed_address = parse_address(server_address) - if not parsed_address: - sys.exit(f"Server IP address ({server_address}) cannot be parsed.") - host, port, is_v6 = parsed_address - address = f"[{host}]:{port}" if is_v6 else f"{host}:{port}" - - # Create the Driver - if isinstance(root_certificates, str): - root_certificates = Path(root_certificates).read_bytes() - grpc_driver = GrpcDriver( - driver_service_address=address, root_certificates=root_certificates - ) + if driver: + # pylint: disable=protected-access + grpc_driver, _ = driver._get_grpc_driver_and_run_id() + # pylint: enable=protected-access + else: + # Not passing a `Driver` object is deprecated + warn_deprecated_feature("start_driver") + + # Parse IP address + parsed_address = parse_address(server_address) + if not parsed_address: + sys.exit(f"Server IP address ({server_address}) cannot be parsed.") + host, port, is_v6 = parsed_address + address = f"[{host}]:{port}" if is_v6 else f"{host}:{port}" + + # Create the Driver + if isinstance(root_certificates, str): + root_certificates = Path(root_certificates).read_bytes() + grpc_driver = GrpcDriver( + driver_service_address=address, 
root_certificates=root_certificates + ) + grpc_driver.connect() - grpc_driver.connect() lock = threading.Lock() # Initialize the Driver API server and config @@ -150,7 +162,11 @@ def start_driver( # pylint: disable=too-many-arguments, too-many-locals # Stop the Driver API server and the thread with lock: - grpc_driver.disconnect() + if driver: + del driver + else: + grpc_driver.disconnect() + thread.join() event(EventType.START_SERVER_LEAVE) diff --git a/src/py/flwr/server/run_serverapp.py b/src/py/flwr/server/run_serverapp.py index 35fffcf2d7ba..685675b72e75 100644 --- a/src/py/flwr/server/run_serverapp.py +++ b/src/py/flwr/server/run_serverapp.py @@ -21,8 +21,11 @@ from pathlib import Path from flwr.common import EventType, event +from flwr.common.context import Context from flwr.common.logger import log +from flwr.common.recordset import RecordSet +from .driver.driver import Driver from .serverapp import ServerApp, load_server_app @@ -89,6 +92,21 @@ def _load() -> ServerApp: log(DEBUG, "server_app: `%s`", server_app) + # Initialize Context + context = Context(state=RecordSet()) + + # Initialize Driver + driver = Driver( + driver_service_address=args.server, + root_certificates=root_certificates, + ) + + # Call ServerApp + server_app(driver=driver, context=context) + + # Clean up + del driver + event(EventType.RUN_SERVER_APP_LEAVE) @@ -117,7 +135,7 @@ def _parse_args_run_server_app() -> argparse.ArgumentParser: ) parser.add_argument( "--server", - default="0.0.0.0:9092", + default="0.0.0.0:9091", help="Server address", ) parser.add_argument( diff --git a/src/py/flwr/server/serverapp.py b/src/py/flwr/server/serverapp.py index 1ffa087719dc..39c4f3f1e87f 100644 --- a/src/py/flwr/server/serverapp.py +++ b/src/py/flwr/server/serverapp.py @@ -23,6 +23,7 @@ from flwr.server.strategy import Strategy from .client_manager import ClientManager +from .compat import start_driver from .server import Server from .server_config import ServerConfig @@ -44,6 +45,14 @@ def 
__init__( def __call__(self, driver: Driver, context: Context) -> None: """Execute `ServerApp`.""" + # Compatibility mode + start_driver( + server=self.server, + config=self.config, + strategy=self.strategy, + client_manager=self.client_manager, + driver=driver, + ) class LoadServerAppError(Exception): From d76a6f27426650673cc9ac2c0f836ee977a86b11 Mon Sep 17 00:00:00 2001 From: Daniel Nata Nugraha Date: Fri, 16 Feb 2024 23:10:51 +0000 Subject: [PATCH 017/102] Update quickstart-fastai requirements (#2467) Co-authored-by: Taner Topal Co-authored-by: jafermarq --- examples/quickstart-fastai/pyproject.toml | 8 +++++--- examples/quickstart-fastai/requirements.txt | 7 ++++--- 2 files changed, 9 insertions(+), 6 deletions(-) diff --git a/examples/quickstart-fastai/pyproject.toml b/examples/quickstart-fastai/pyproject.toml index ffaa97267493..7f458ef3431c 100644 --- a/examples/quickstart-fastai/pyproject.toml +++ b/examples/quickstart-fastai/pyproject.toml @@ -9,6 +9,8 @@ description = "Fastai Federated Learning Quickstart with Flower" authors = ["The Flower Authors "] [tool.poetry.dependencies] -python = ">=3.8,<3.10" -flwr = "^1.0.0" -fastai = "^2.7.10" +python = ">=3.8,<3.11" +flwr = ">=1.0,<2.0" +fastai = "2.7.14" +torch = "2.2.0" +torchvision = "0.17.0" diff --git a/examples/quickstart-fastai/requirements.txt b/examples/quickstart-fastai/requirements.txt index 0a7f315018a2..9c6e8d77293a 100644 --- a/examples/quickstart-fastai/requirements.txt +++ b/examples/quickstart-fastai/requirements.txt @@ -1,3 +1,4 @@ -fastai~=2.7.12 -flwr~=1.4.0 -torch~=2.0.1 +flwr>=1.0, <2.0 +fastai==2.7.14 +torch==2.2.0 +torchvision==0.17.0 From c4bc415ea50341dfd0f1bab26d3b7294cb2754a0 Mon Sep 17 00:00:00 2001 From: "Daniel J. 
Beutel" Date: Sat, 17 Feb 2024 12:27:18 +0100 Subject: [PATCH 018/102] Update README in app-pytorch (#2970) --- examples/app-pytorch/README.md | 25 +++++++++++++------------ 1 file changed, 13 insertions(+), 12 deletions(-) diff --git a/examples/app-pytorch/README.md b/examples/app-pytorch/README.md index 0f676044ee90..73d19ff11258 100644 --- a/examples/app-pytorch/README.md +++ b/examples/app-pytorch/README.md @@ -1,10 +1,9 @@ # Flower App (PyTorch) 🧪 -🧪 = This example covers experimental features that might change in future versions of Flower +> 🧪 = This example covers experimental features that might change in future versions of Flower +> Please consult the regular PyTorch code examples ([quickstart](https://github.com/adap/flower/tree/main/examples/quickstart-pytorch), [advanced](https://github.com/adap/flower/tree/main/examples/advanced-pytorch)) to learn how to use Flower with PyTorch. -Please consult the regular PyTorch code examples ([quickstart](https://github.com/adap/flower/tree/main/examples/quickstart-pytorch), [advanced](https://github.com/adap/flower/tree/main/examples/advanced-pytorch)) to learn how to use Flower with PyTorch. - -This how-to guide describes the deployment of a long-running Flower server. +The following steps describe how to start a long-running Flower server (SuperLink) and then run a Flower App (consisting of a `ClientApp` and a `ServerApp`). ## Preconditions @@ -13,10 +12,10 @@ Let's assume the following project structure: ```bash $ tree . . -├── client.py -├── server.py -├── task.py -└── requirements.txt +├── client.py # <-- contains `ClientApp` +├── server.py # <-- contains `ServerApp` +├── task.py # <-- task-specific code (model, data) +└── requirements.txt # <-- dependencies ``` ## Install dependencies @@ -25,13 +24,13 @@ $ tree . 
pip install -r requirements.txt ``` -## Start the SuperLink +## Start the long-running Flower server (SuperLink) ```bash flower-superlink --insecure ``` -## Start the long-running Flower client +## Start the long-running Flower client (SuperNode) In a new terminal window, start the first long-running Flower client: @@ -45,8 +44,10 @@ In yet another new terminal window, start the second long-running Flower client: flower-client-app client:app --insecure ``` -## Start the driver +## Run the Flower App + +With both the long-running server (SuperLink) and two clients (SuperNode) up and running, we can now run the actual Flower App: ```bash -python start_driver.py +flower-server-app server:app --insecure ``` From b731913a4864e90b47a45e8cbfaf12289d56a459 Mon Sep 17 00:00:00 2001 From: Javier Date: Sat, 17 Feb 2024 12:39:44 +0100 Subject: [PATCH 019/102] Update `app-pytorch` example (#2967) --- examples/app-pytorch/pyproject.toml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/examples/app-pytorch/pyproject.toml b/examples/app-pytorch/pyproject.toml index 5603b0f03480..3fd8d3170f5e 100644 --- a/examples/app-pytorch/pyproject.toml +++ b/examples/app-pytorch/pyproject.toml @@ -3,7 +3,7 @@ requires = ["poetry-core>=1.4.0"] build-backend = "poetry.core.masonry.api" [tool.poetry] -name = "mt-pytorch" +name = "app-pytorch" version = "0.1.0" description = "Multi-Tenant Federated Learning with Flower and PyTorch" authors = ["The Flower Authors "] From ce2d97019b10e6db9e226919158a7cda574875c0 Mon Sep 17 00:00:00 2001 From: Javier Date: Sun, 18 Feb 2024 16:02:51 +0100 Subject: [PATCH 020/102] Make VCE use `Message` and `Flower` callable (#2783) --- doc/source/how-to-run-simulations.rst | 2 +- e2e/pandas/simulation.py | 6 +- .../client/message_handler/message_handler.py | 2 +- src/py/flwr/simulation/app.py | 8 +- .../simulation/ray_transport/ray_actor.py | 64 ++-- .../ray_transport/ray_client_proxy.py | 291 ++++++------------ 
.../ray_transport/ray_client_proxy_test.py | 103 ++++--- src/py/flwr/simulation/ray_transport/utils.py | 23 -- 8 files changed, 189 insertions(+), 310 deletions(-) diff --git a/doc/source/how-to-run-simulations.rst b/doc/source/how-to-run-simulations.rst index 6e0520a79bf5..d1dcb511ed51 100644 --- a/doc/source/how-to-run-simulations.rst +++ b/doc/source/how-to-run-simulations.rst @@ -29,7 +29,7 @@ Running Flower simulations still require you to define your client class, a stra def client_fn(cid: str): # Return a standard Flower client - return MyFlowerClient() + return MyFlowerClient().to_client() # Launch the simulation hist = fl.simulation.start_simulation( diff --git a/e2e/pandas/simulation.py b/e2e/pandas/simulation.py index 91af84062712..b548b5ebb760 100644 --- a/e2e/pandas/simulation.py +++ b/e2e/pandas/simulation.py @@ -1,12 +1,8 @@ import flwr as fl -from client import FlowerClient +from client import client_fn from strategy import FedAnalytics -def client_fn(cid): - _ = cid - return FlowerClient() - hist = fl.simulation.start_simulation( client_fn=client_fn, num_clients=2, diff --git a/src/py/flwr/client/message_handler/message_handler.py b/src/py/flwr/client/message_handler/message_handler.py index 93de7d7d8821..f8c8a725aec7 100644 --- a/src/py/flwr/client/message_handler/message_handler.py +++ b/src/py/flwr/client/message_handler/message_handler.py @@ -109,7 +109,7 @@ def handle_legacy_message_from_msgtype( client_fn: ClientFn, message: Message, context: Context ) -> Message: """Handle legacy message in the inner most mod.""" - client = client_fn("-1") + client = client_fn(str(message.metadata.node_id)) client.set_context(context) diff --git a/src/py/flwr/simulation/app.py b/src/py/flwr/simulation/app.py index b159042588c9..9ee230942890 100644 --- a/src/py/flwr/simulation/app.py +++ b/src/py/flwr/simulation/app.py @@ -35,7 +35,7 @@ from flwr.server.server_config import ServerConfig from flwr.server.strategy import Strategy from 
flwr.simulation.ray_transport.ray_actor import ( - DefaultActor, + ClientAppActor, VirtualClientEngineActor, VirtualClientEngineActorPool, pool_size_from_resources, @@ -83,7 +83,7 @@ def start_simulation( client_manager: Optional[ClientManager] = None, ray_init_args: Optional[Dict[str, Any]] = None, keep_initialised: Optional[bool] = False, - actor_type: Type[VirtualClientEngineActor] = DefaultActor, + actor_type: Type[VirtualClientEngineActor] = ClientAppActor, actor_kwargs: Optional[Dict[str, Any]] = None, actor_scheduling: Union[str, NodeAffinitySchedulingStrategy] = "DEFAULT", ) -> History: @@ -139,10 +139,10 @@ def start_simulation( keep_initialised: Optional[bool] (default: False) Set to True to prevent `ray.shutdown()` in case `ray.is_initialized()=True`. - actor_type: VirtualClientEngineActor (default: DefaultActor) + actor_type: VirtualClientEngineActor (default: ClientAppActor) Optionally specify the type of actor to use. The actor object, which persists throughout the simulation, will be the process in charge of - running the clients' jobs (i.e. their `fit()` method). + executing a ClientApp wrapping input argument `client_fn`. 
    actor_kwargs: Optional[Dict[str, Any]] (default: None)
        If you want to create your own Actor classes, you might need to pass
diff --git a/src/py/flwr/simulation/ray_transport/ray_actor.py b/src/py/flwr/simulation/ray_transport/ray_actor.py
index 853566a4cbeb..974773a3f577 100644
--- a/src/py/flwr/simulation/ray_transport/ray_actor.py
+++ b/src/py/flwr/simulation/ray_transport/ray_actor.py
@@ -25,18 +25,12 @@
 from ray import ObjectRef
 from ray.util.actor_pool import ActorPool
 
-from flwr import common
-from flwr.client import Client, ClientFn
+from flwr.client.clientapp import ClientApp
 from flwr.common.context import Context
 from flwr.common.logger import log
-from flwr.simulation.ray_transport.utils import check_clientfn_returns_client
+from flwr.common.message import Message
 
-# All possible returns by a client
-ClientRes = Union[
-    common.GetPropertiesRes, common.GetParametersRes, common.FitRes, common.EvaluateRes
-]
-# A function to be executed by a client to obtain some results
-JobFn = Callable[[Client], ClientRes]
+ClientAppFn = Callable[[], ClientApp]
 
 
 class ClientException(Exception):
@@ -58,27 +52,25 @@ def terminate(self) -> None:
 
     def run(
         self,
-        client_fn: ClientFn,
-        job_fn: JobFn,
+        client_app_fn: ClientAppFn,
+        message: Message,
         cid: str,
         context: Context,
-    ) -> Tuple[str, ClientRes, Context]:
+    ) -> Tuple[str, Message, Context]:
         """Run a client run."""
-        # Execute tasks and return result
+        # Pass message through ClientApp and return a message
         # return also cid which is needed to ensure results
         # from the pool are correctly assigned to each ClientProxy
         try:
-            # Instantiate client (check 'Client' type is returned)
-            client = check_clientfn_returns_client(client_fn(cid))
-            # Inject context
-            client.set_context(context)
-            # Run client job
-            job_results = job_fn(client)
-            # Retrieve context (potentially updated)
-            updated_context = client.get_context()
+            # Load app
+            app: ClientApp = client_app_fn()
+
+            # Handle task message
+            out_message = app(message=message, context=context)
+
         except Exception as ex:
             client_trace = traceback.format_exc()
-            message = (
+            mssg = (
                 "\n\tSomething went wrong when running your client run."
                 "\n\tClient "
                 + cid
@@ -87,13 +79,13 @@ def run(
                 + " was running its run."
                 "\n\tException triggered on the client side: " + client_trace,
             )
-            raise ClientException(str(message)) from ex
+            raise ClientException(str(mssg)) from ex
 
-        return cid, job_results, updated_context
+        return cid, out_message, context
 
 
 @ray.remote
-class DefaultActor(VirtualClientEngineActor):
+class ClientAppActor(VirtualClientEngineActor):
     """A Ray Actor class that runs client runs.
 
     Parameters
@@ -237,16 +229,16 @@ def add_actors_to_pool(self, num_actors: int) -> None:
         self._idle_actors.extend(new_actors)
         self.num_actors += num_actors
 
-    def submit(self, fn: Any, value: Tuple[ClientFn, JobFn, str, Context]) -> None:
-        """Take idle actor and assign it a client run.
+    def submit(self, fn: Any, value: Tuple[ClientAppFn, Message, str, Context]) -> None:
+        """Take an idle actor and assign it to run a client app and Message.
 
         Submit a job to an actor by first removing it from the list of idle actors, then
-        check if this actor was flagged to be removed from the pool
+        check if this actor was flagged to be removed from the pool.
         """
-        client_fn, job_fn, cid, context = value
+        app_fn, mssg, cid, context = value
         actor = self._idle_actors.pop()
         if self._check_and_remove_actor_from_pool(actor):
-            future = fn(actor, client_fn, job_fn, cid, context)
+            future = fn(actor, app_fn, mssg, cid, context)
             future_key = tuple(future) if isinstance(future, List) else future
             self._future_to_actor[future_key] = (self._next_task_index, actor, cid)
             self._next_task_index += 1
@@ -255,7 +247,7 @@ def submit(self, fn: Any, value: Tuple[ClientFn, JobFn, str, Context]) -> None:
         self._cid_to_future[cid]["future"] = future_key
 
     def submit_client_job(
-        self, actor_fn: Any, job: Tuple[ClientFn, JobFn, str, Context]
+        self, actor_fn: Any, job: Tuple[ClientAppFn, Message, str, Context]
     ) -> None:
         """Submit a job while tracking client ids."""
         _, _, cid, _ = job
@@ -295,7 +287,7 @@ def _is_future_ready(self, cid: str) -> bool:
 
         return self._cid_to_future[cid]["ready"]  # type: ignore
 
-    def _fetch_future_result(self, cid: str) -> Tuple[ClientRes, Context]:
+    def _fetch_future_result(self, cid: str) -> Tuple[Message, Context]:
         """Fetch result and updated context for a VirtualClient from Object Store.
 
         The job submitted by the ClientProxy interfacing with client with cid=cid is
@@ -303,9 +295,9 @@ def _fetch_future_result(self, cid: str) -> Tuple[ClientRes, Context]:
         """
         try:
             future: ObjectRef[Any] = self._cid_to_future[cid]["future"]  # type: ignore
-            res_cid, res, updated_context = ray.get(
+            res_cid, out_mssg, updated_context = ray.get(
                 future
-            )  # type: (str, ClientRes, Context)
+            )  # type: (str, Message, Context)
         except ray.exceptions.RayActorError as ex:
             log(ERROR, ex)
             if hasattr(ex, "actor_id"):
@@ -322,7 +314,7 @@ def _fetch_future_result(self, cid: str) -> Tuple[ClientRes, Context]:
         # Reset mapping
         self._reset_cid_to_future_dict(cid)
 
-        return res, updated_context
+        return out_mssg, updated_context
 
     def _flag_actor_for_removal(self, actor_id_hex: str) -> None:
         """Flag actor that should be removed from pool."""
@@ -409,7 +401,7 @@ def process_unordered_future(self, timeout: Optional[float] = None) -> None:
 
     def get_client_result(
         self, cid: str, timeout: Optional[float]
-    ) -> Tuple[ClientRes, Context]:
+    ) -> Tuple[Message, Context]:
         """Get result from VirtualClient with specific cid."""
         # Loop until all jobs submitted to the pool are completed. Break early
         # if the result for the ClientProxy calling this method is ready
diff --git a/src/py/flwr/simulation/ray_transport/ray_client_proxy.py b/src/py/flwr/simulation/ray_transport/ray_client_proxy.py
index 894012dc6d70..ddac030b2ef0 100644
--- a/src/py/flwr/simulation/ray_transport/ray_client_proxy.py
+++ b/src/py/flwr/simulation/ray_transport/ray_client_proxy.py
@@ -17,107 +17,33 @@
 
 import traceback
 from logging import ERROR
-from typing import Dict, Optional, cast
-
-import ray
+from typing import Optional
 
 from flwr import common
-from flwr.client import Client, ClientFn
-from flwr.client.client import (
-    maybe_call_evaluate,
-    maybe_call_fit,
-    maybe_call_get_parameters,
-    maybe_call_get_properties,
-)
+from flwr.client import ClientFn
+from flwr.client.clientapp import ClientApp
 from flwr.client.node_state import NodeState
+from flwr.common.constant import (
+    MESSAGE_TYPE_EVALUATE,
+    MESSAGE_TYPE_FIT,
+    MESSAGE_TYPE_GET_PARAMETERS,
+    MESSAGE_TYPE_GET_PROPERTIES,
+)
 from flwr.common.logger import log
-from flwr.server.client_proxy import ClientProxy
-from flwr.simulation.ray_transport.ray_actor import (
-    ClientRes,
-    JobFn,
-    VirtualClientEngineActorPool,
+from flwr.common.message import Message, Metadata
+from flwr.common.recordset import RecordSet
+from flwr.common.recordset_compat import (
+    evaluateins_to_recordset,
+    fitins_to_recordset,
+    getparametersins_to_recordset,
+    getpropertiesins_to_recordset,
+    recordset_to_evaluateres,
+    recordset_to_fitres,
+    recordset_to_getparametersres,
+    recordset_to_getpropertiesres,
 )
-
-
-class RayClientProxy(ClientProxy):
-    """Flower client proxy which delegates work using Ray."""
-
-    def __init__(self, client_fn: ClientFn, cid: str, resources: Dict[str, float]):
-        super().__init__(cid)
-        self.client_fn = client_fn
-        self.resources = resources
-
-    def get_properties(
-        self, ins: common.GetPropertiesIns, timeout: Optional[float]
-    ) -> common.GetPropertiesRes:
-        """Return client's properties."""
-        future_get_properties_res = launch_and_get_properties.options(  # type: ignore
-            **self.resources,
-        ).remote(self.client_fn, self.cid, ins)
-        try:
-            res = ray.get(future_get_properties_res, timeout=timeout)
-        except Exception as ex:
-            log(ERROR, ex)
-            raise ex
-        return cast(
-            common.GetPropertiesRes,
-            res,
-        )
-
-    def get_parameters(
-        self, ins: common.GetParametersIns, timeout: Optional[float]
-    ) -> common.GetParametersRes:
-        """Return the current local model parameters."""
-        future_paramseters_res = launch_and_get_parameters.options(  # type: ignore
-            **self.resources,
-        ).remote(self.client_fn, self.cid, ins)
-        try:
-            res = ray.get(future_paramseters_res, timeout=timeout)
-        except Exception as ex:
-            log(ERROR, ex)
-            raise ex
-        return cast(
-            common.GetParametersRes,
-            res,
-        )
-
-    def fit(self, ins: common.FitIns, timeout: Optional[float]) -> common.FitRes:
-        """Train model parameters on the locally held dataset."""
-        future_fit_res = launch_and_fit.options(  # type: ignore
-            **self.resources,
-        ).remote(self.client_fn, self.cid, ins)
-        try:
-            res = ray.get(future_fit_res, timeout=timeout)
-        except Exception as ex:
-            log(ERROR, ex)
-            raise ex
-        return cast(
-            common.FitRes,
-            res,
-        )
-
-    def evaluate(
-        self, ins: common.EvaluateIns, timeout: Optional[float]
-    ) -> common.EvaluateRes:
-        """Evaluate model parameters on the locally held dataset."""
-        future_evaluate_res = launch_and_evaluate.options(  # type: ignore
-            **self.resources,
-        ).remote(self.client_fn, self.cid, ins)
-        try:
-            res = ray.get(future_evaluate_res, timeout=timeout)
-        except Exception as ex:
-            log(ERROR, ex)
-            raise ex
-        return cast(
-            common.EvaluateRes,
-            res,
-        )
-
-    def reconnect(
-        self, ins: common.ReconnectIns, timeout: Optional[float]
-    ) -> common.DisconnectRes:
-        """Disconnect and (optionally) reconnect later."""
-        return common.DisconnectRes(reason="")  # Nothing to do here (yet)
+from flwr.server.client_proxy import ClientProxy
+from flwr.simulation.ray_transport.ray_actor import VirtualClientEngineActorPool
 
 
 class RayActorClientProxy(ClientProxy):
@@ -127,15 +53,17 @@ def __init__(
         self, client_fn: ClientFn, cid: str, actor_pool: VirtualClientEngineActorPool
     ):
         super().__init__(cid)
-        self.client_fn = client_fn
+
+        def _load_app() -> ClientApp:
+            return ClientApp(client_fn=client_fn)
+
+        self.app_fn = _load_app
         self.actor_pool = actor_pool
         self.proxy_state = NodeState()
 
-    def _submit_job(self, job_fn: JobFn, timeout: Optional[float]) -> ClientRes:
-        # The VCE is not exposed to TaskIns, it won't handle multilple runs
-        # For the time being, fixing run_id is a small compromise
-        # This will be one of the first points to address integrating VCE + DriverAPI
-        run_id = 0
+    def _submit_job(self, message: Message, timeout: Optional[float]) -> Message:
+        """Submit a message to the ActorPool."""
+        run_id = message.metadata.run_id
 
         # Register state
         self.proxy_state.register_context(run_id=run_id)
@@ -145,10 +73,12 @@ def _submit_job(self, job_fn: JobFn, timeout: Optional[float]) -> ClientRes:
 
         try:
             self.actor_pool.submit_client_job(
-                lambda a, c_fn, j_fn, cid, state: a.run.remote(c_fn, j_fn, cid, state),
-                (self.client_fn, job_fn, self.cid, state),
+                lambda a, a_fn, mssg, cid, state: a.run.remote(a_fn, mssg, cid, state),
+                (self.app_fn, message, self.cid, state),
+            )
+            out_mssg, updated_context = self.actor_pool.get_client_result(
+                self.cid, timeout
             )
-            res, updated_context = self.actor_pool.get_client_result(self.cid, timeout)
 
             # Update state
             self.proxy_state.update_context(run_id=run_id, context=updated_context)
@@ -162,134 +92,87 @@ def _submit_job(self, job_fn: JobFn, timeout: Optional[float]) -> ClientRes:
             log(ERROR, ex)
             raise ex
 
-        return res
+        return out_mssg
+
+    def _wrap_recordset_in_message(
+        self,
+        recordset: RecordSet,
+        message_type: str,
+        timeout: Optional[float],
+    ) -> Message:
+        """Wrap a RecordSet inside a Message."""
+        return Message(
+            content=recordset,
+            metadata=Metadata(
+                run_id=0,
+                message_id="",
+                group_id="",
+                node_id=int(self.cid),
+                ttl=str(timeout) if timeout else "",
+                message_type=message_type,
+            ),
+        )
 
     def get_properties(
         self, ins: common.GetPropertiesIns, timeout: Optional[float]
    ) -> common.GetPropertiesRes:
         """Return client's properties."""
+        recordset = getpropertiesins_to_recordset(ins)
+        message = self._wrap_recordset_in_message(
+            recordset,
+            message_type=MESSAGE_TYPE_GET_PROPERTIES,
+            timeout=timeout,
+        )
 
-        def get_properties(client: Client) -> common.GetPropertiesRes:
-            return maybe_call_get_properties(
-                client=client,
-                get_properties_ins=ins,
-            )
+        message_out = self._submit_job(message, timeout)
 
-        res = self._submit_job(get_properties, timeout)
-
-        return cast(
-            common.GetPropertiesRes,
-            res,
-        )
+        return recordset_to_getpropertiesres(message_out.content)
 
     def get_parameters(
         self, ins: common.GetParametersIns, timeout: Optional[float]
     ) -> common.GetParametersRes:
         """Return the current local model parameters."""
+        recordset = getparametersins_to_recordset(ins)
+        message = self._wrap_recordset_in_message(
+            recordset,
+            message_type=MESSAGE_TYPE_GET_PARAMETERS,
+            timeout=timeout,
+        )
 
-        def get_parameters(client: Client) -> common.GetParametersRes:
-            return maybe_call_get_parameters(
-                client=client,
-                get_parameters_ins=ins,
-            )
-
-        res = self._submit_job(get_parameters, timeout)
+        message_out = self._submit_job(message, timeout)
 
-        return cast(
-            common.GetParametersRes,
-            res,
-        )
+        return recordset_to_getparametersres(message_out.content, keep_input=False)
 
     def fit(self, ins: common.FitIns, timeout: Optional[float]) -> common.FitRes:
         """Train model parameters on the locally held dataset."""
+        recordset = fitins_to_recordset(
+            ins, keep_input=True
+        )  # This must stay TRUE since ins are in-memory
+        message = self._wrap_recordset_in_message(
+            recordset, message_type=MESSAGE_TYPE_FIT, timeout=timeout
+        )
 
-        def fit(client: Client) -> common.FitRes:
-            return maybe_call_fit(
-                client=client,
-                fit_ins=ins,
-            )
+        message_out = self._submit_job(message, timeout)
 
-        res = self._submit_job(fit, timeout)
-
-        return cast(
-            common.FitRes,
-            res,
-        )
+        return recordset_to_fitres(message_out.content, keep_input=False)
 
     def evaluate(
         self, ins: common.EvaluateIns, timeout: Optional[float]
     ) -> common.EvaluateRes:
         """Evaluate model parameters on the locally held dataset."""
+        recordset = evaluateins_to_recordset(
+            ins, keep_input=True
+        )  # This must stay TRUE since ins are in-memory
+        message = self._wrap_recordset_in_message(
+            recordset, message_type=MESSAGE_TYPE_EVALUATE, timeout=timeout
+        )
 
-        def evaluate(client: Client) -> common.EvaluateRes:
-            return maybe_call_evaluate(
-                client=client,
-                evaluate_ins=ins,
-            )
-
-        res = self._submit_job(evaluate, timeout)
+        message_out = self._submit_job(message, timeout)
 
-        return cast(
-            common.EvaluateRes,
-            res,
-        )
+        return recordset_to_evaluateres(message_out.content)
 
     def reconnect(
         self, ins: common.ReconnectIns, timeout: Optional[float]
     ) -> common.DisconnectRes:
         """Disconnect and (optionally) reconnect later."""
         return common.DisconnectRes(reason="")  # Nothing to do here (yet)
-
-
-@ray.remote
-def launch_and_get_properties(
-    client_fn: ClientFn, cid: str, get_properties_ins: common.GetPropertiesIns
-) -> common.GetPropertiesRes:
-    """Exectue get_properties remotely."""
-    client: Client = _create_client(client_fn, cid)
-    return maybe_call_get_properties(
-        client=client,
-        get_properties_ins=get_properties_ins,
-    )
-
-
-@ray.remote
-def launch_and_get_parameters(
-    client_fn: ClientFn, cid: str, get_parameters_ins: common.GetParametersIns
-) -> common.GetParametersRes:
-    """Exectue get_parameters remotely."""
-    client: Client = _create_client(client_fn, cid)
-    return maybe_call_get_parameters(
-        client=client,
-        get_parameters_ins=get_parameters_ins,
-    )
-
-
-@ray.remote
-def launch_and_fit(
-    client_fn: ClientFn, cid: str, fit_ins: common.FitIns
-) -> common.FitRes:
-    """Exectue fit remotely."""
-    client: Client = _create_client(client_fn, cid)
-    return maybe_call_fit(
-        client=client,
-        fit_ins=fit_ins,
-    )
-
-
-@ray.remote
-def launch_and_evaluate(
-    client_fn: ClientFn, cid: str, evaluate_ins: common.EvaluateIns
-) -> common.EvaluateRes:
-    """Exectue evaluate remotely."""
-    client: Client = _create_client(client_fn, cid)
-    return maybe_call_evaluate(
-        client=client,
-        evaluate_ins=evaluate_ins,
-    )
-
-
-def _create_client(client_fn: ClientFn, cid: str) -> Client:
-    """Create a client instance."""
-    # Materialize client
-    return client_fn(cid)
diff --git a/src/py/flwr/simulation/ray_transport/ray_client_proxy_test.py b/src/py/flwr/simulation/ray_transport/ray_client_proxy_test.py
index b380d37d01c8..9ade31c323d8 100644
--- a/src/py/flwr/simulation/ray_transport/ray_client_proxy_test.py
+++ b/src/py/flwr/simulation/ray_transport/ray_client_proxy_test.py
@@ -17,19 +17,25 @@
 
 from math import pi
 from random import shuffle
-from typing import List, Tuple, Type, cast
+from typing import Dict, List, Tuple, Type
 
 import ray
 
 from flwr.client import Client, NumPyClient
-from flwr.common import Code, GetPropertiesRes, Status
+from flwr.client.clientapp import ClientApp
+from flwr.common import Config, Scalar
 from flwr.common.configsrecord import ConfigsRecord
+from flwr.common.constant import MESSAGE_TYPE_GET_PROPERTIES
 from flwr.common.context import Context
+from flwr.common.message import Message, Metadata
 from flwr.common.recordset import RecordSet
+from flwr.common.recordset_compat import (
+    getpropertiesins_to_recordset,
+    recordset_to_getpropertiesres,
+)
+from flwr.common.recordset_compat_test import _get_valid_getpropertiesins
 from flwr.simulation.ray_transport.ray_actor import (
-    ClientRes,
-    DefaultActor,
-    JobFn,
+    ClientAppActor,
     VirtualClientEngineActor,
     VirtualClientEngineActorPool,
 )
@@ -42,34 +48,24 @@ class DummyClient(NumPyClient):
     def __init__(self, cid: str) -> None:
         self.cid = int(cid)
 
-
-def get_dummy_client(cid: str) -> Client:
-    """Return a DummyClient converted to Client type."""
-    return DummyClient(cid).to_client()
-
-
-# A dummy run
-def job_fn(cid: str) -> JobFn:  # pragma: no cover
-    """Construct a simple job with cid dependency."""
-
-    def cid_times_pi(client: Client) -> ClientRes:  # pylint: disable=unused-argument
-        result = int(cid) * pi
+    def get_properties(self, config: Config) -> Dict[str, Scalar]:
+        """Return properties by doing a simple calculation."""
+        result = int(self.cid) * pi
 
         # store something in context
-        client.numpy_client.context.state.set_configs(  # type: ignore
+        self.context.state.set_configs(
             "result", record=ConfigsRecord({"result": str(result)})
         )
+        return {"result": result}
 
-        # now let's convert it to a GetPropertiesRes response
-        return GetPropertiesRes(
-            status=Status(Code(0), message="test"), properties={"result": result}
-        )
 
-    return cid_times_pi
+def get_dummy_client(cid: str) -> Client:
+    """Return a DummyClient converted to Client type."""
+    return DummyClient(cid).to_client()
 
 
 def prep(
-    actor_type: Type[VirtualClientEngineActor] = DefaultActor,
+    actor_type: Type[VirtualClientEngineActor] = ClientAppActor,
 ) -> Tuple[List[RayActorClientProxy], VirtualClientEngineActorPool]:  # pragma: no cover
     """Prepare ClientProxies and pool for tests."""
     client_resources = {"num_cpus": 1, "num_gpus": 0.0}
@@ -104,13 +100,23 @@ def test_cid_consistency_one_at_a_time() -> None:
 
     Submit one job and waits for completion. Then submits the next and so on
     """
     proxies, _ = prep()
+
+    getproperties_ins = _get_valid_getpropertiesins()
+    recordset = getpropertiesins_to_recordset(getproperties_ins)
+
     # submit jobs one at a time
     for prox in proxies:
-        res = prox._submit_job(  # pylint: disable=protected-access
-            job_fn=job_fn(prox.cid), timeout=None
+        message = prox._wrap_recordset_in_message(  # pylint: disable=protected-access
+            recordset,
+            MESSAGE_TYPE_GET_PROPERTIES,
+            timeout=None,
         )
+        message_out = prox._submit_job(  # pylint: disable=protected-access
+            message=message, timeout=None
+        )
+
+        res = recordset_to_getpropertiesres(message_out.content)
 
-        res = cast(GetPropertiesRes, res)
         assert int(prox.cid) * pi == res.properties["result"]
 
     ray.shutdown()
@@ -125,6 +131,9 @@ def test_cid_consistency_all_submit_first_run_consistency() -> None:
     proxies, _ = prep()
     run_id = 0
 
+    getproperties_ins = _get_valid_getpropertiesins()
+    recordset = getpropertiesins_to_recordset(getproperties_ins)
+
     # submit all jobs (collect later)
     shuffle(proxies)
     for prox in proxies:
@@ -133,18 +142,24 @@ def test_cid_consistency_all_submit_first_run_consistency() -> None:
         # Retrieve state
         state = prox.proxy_state.retrieve_context(run_id=run_id)
 
-        job = job_fn(prox.cid)
+        message = prox._wrap_recordset_in_message(  # pylint: disable=protected-access
+            recordset,
+            message_type=MESSAGE_TYPE_GET_PROPERTIES,
+            timeout=None,
+        )
         prox.actor_pool.submit_client_job(
-            lambda a, c_fn, j_fn, cid, state: a.run.remote(c_fn, j_fn, cid, state),
-            (prox.client_fn, job, prox.cid, state),
+            lambda a, a_fn, mssg, cid, state: a.run.remote(a_fn, mssg, cid, state),
+            (prox.app_fn, message, prox.cid, state),
         )
 
     # fetch results one at a time
     shuffle(proxies)
     for prox in proxies:
-        res, updated_context = prox.actor_pool.get_client_result(prox.cid, timeout=None)
+        message_out, updated_context = prox.actor_pool.get_client_result(
+            prox.cid, timeout=None
+        )
         prox.proxy_state.update_context(run_id, context=updated_context)
-        res = cast(GetPropertiesRes, res)
+        res = recordset_to_getpropertiesres(message_out.content)
 
         assert int(prox.cid) * pi == res.properties["result"]
         assert (
@@ -163,20 +178,36 @@ def test_cid_consistency_without_proxies() -> None:
     num_clients = len(proxies)
     cids = [str(cid) for cid in range(num_clients)]
 
+    getproperties_ins = _get_valid_getpropertiesins()
+    recordset = getpropertiesins_to_recordset(getproperties_ins)
+
+    def _load_app() -> ClientApp:
+        return ClientApp(client_fn=get_dummy_client)
+
     # submit all jobs (collect later)
     shuffle(cids)
     for cid in cids:
-        job = job_fn(cid)
+        message = Message(
+            content=recordset,
+            metadata=Metadata(
+                run_id=0,
+                message_id="",
+                group_id="",
+                ttl="",
+                node_id=int(cid),
+                message_type=MESSAGE_TYPE_GET_PROPERTIES,
+            ),
+        )
         pool.submit_client_job(
             lambda a, c_fn, j_fn, cid_, state: a.run.remote(c_fn, j_fn, cid_, state),
-            (get_dummy_client, job, cid, Context(state=RecordSet())),
+            (_load_app, message, cid, Context(state=RecordSet())),
         )
 
     # fetch results one at a time
     shuffle(cids)
     for cid in cids:
-        res, _ = pool.get_client_result(cid, timeout=None)
-        res = cast(GetPropertiesRes, res)
+        message_out, _ = pool.get_client_result(cid, timeout=None)
+        res = recordset_to_getpropertiesres(message_out.content)
         assert int(cid) * pi == res.properties["result"]
 
     ray.shutdown()
diff --git a/src/py/flwr/simulation/ray_transport/utils.py b/src/py/flwr/simulation/ray_transport/utils.py
index dd9fb6b2aa85..3861164998a4 100644
--- a/src/py/flwr/simulation/ray_transport/utils.py
+++ b/src/py/flwr/simulation/ray_transport/utils.py
@@ -18,7 +18,6 @@
 import warnings
 from logging import ERROR
 
-from flwr.client import Client
 from flwr.common.logger import log
 
 try:
@@ -60,25 +59,3 @@ def enable_tf_gpu_growth() -> None:
         log(ERROR, traceback.format_exc())
         log(ERROR, ex)
         raise ex
-
-
-def check_clientfn_returns_client(client: Client) -> Client:
-    """Warn once that clients returned in `clinet_fn` should be of type Client.
-
-    This is here for backwards compatibility. If a ClientFn is provided returning
-    a different type of client (e.g. NumPyClient) we'll warn the user but convert
-    the client internally to `Client` by calling `.to_client()`.
-    """
-    if not isinstance(client, Client):
-        mssg = (
-            " Ensure your client is of type `flwr.client.Client`. Please convert it"
-            " using the `.to_client()` method before returning it"
-            " in the `client_fn` you pass to `start_simulation`."
-            " We have applied this conversion on your behalf."
-            " Not returning a `Client` might trigger an error in future"
-            " versions of Flower."
-        )
-
-        warnings.warn(mssg, DeprecationWarning, stacklevel=2)
-        client = client.to_client()
-    return client

From e6fe6f48ae3ca391182e928ac6673ef18906be8f Mon Sep 17 00:00:00 2001
From: "Daniel J. Beutel"
Date: Sun, 18 Feb 2024 16:42:17 +0100
Subject: [PATCH 021/102] Introduce custom ServerApp functions (#2972)

---
 src/py/flwr/server/__init__.py                 |   2 +-
 src/py/flwr/server/run_serverapp.py            |   6 +-
 .../server/{serverapp.py => server_app.py}     | 102 +++++++++++++++---
 src/py/flwr/server/server_app_test.py          |  63 +++++++++++
 4 files changed, 154 insertions(+), 19 deletions(-)
 rename src/py/flwr/server/{serverapp.py => server_app.py} (51%)
 create mode 100644 src/py/flwr/server/server_app_test.py

diff --git a/src/py/flwr/server/__init__.py b/src/py/flwr/server/__init__.py
index 09372e258861..cad054fb86ab 100644
--- a/src/py/flwr/server/__init__.py
+++ b/src/py/flwr/server/__init__.py
@@ -26,8 +26,8 @@
 from .history import History as History
 from .run_serverapp import run_server_app as run_server_app
 from .server import Server as Server
+from .server_app import ServerApp as ServerApp
 from .server_config import ServerConfig as ServerConfig
-from .serverapp import ServerApp as ServerApp
 
 __all__ = [
     "ClientManager",
diff --git a/src/py/flwr/server/run_serverapp.py b/src/py/flwr/server/run_serverapp.py
index 685675b72e75..283348f05457 100644
--- a/src/py/flwr/server/run_serverapp.py
+++ b/src/py/flwr/server/run_serverapp.py
@@ -26,7 +26,7 @@
 from flwr.common.recordset import RecordSet
 
 from .driver.driver import Driver
-from .serverapp import ServerApp, load_server_app
+from .server_app import ServerApp, load_server_app
 
 
 def run_server_app() -> None:
@@ -78,8 +78,6 @@ def run_server_app() -> None:
         root_certificates,
     )
 
-    log(WARN, "Not implemented: run_server_app")
-
     server_app_dir = args.dir
     if server_app_dir is not None:
         sys.path.insert(0, server_app_dir)
@@ -90,8 +88,6 @@ def _load() -> ServerApp:
 
     server_app = _load()
 
-    log(DEBUG, "server_app: `%s`", server_app)
-
     # Initialize Context
     context = Context(state=RecordSet())
diff --git a/src/py/flwr/server/serverapp.py b/src/py/flwr/server/server_app.py
similarity index 51%
rename from src/py/flwr/server/serverapp.py
rename to src/py/flwr/server/server_app.py
index 39c4f3f1e87f..027818019786 100644
--- a/src/py/flwr/server/serverapp.py
+++ b/src/py/flwr/server/server_app.py
@@ -16,9 +16,10 @@
 
 import importlib
-from typing import Optional, cast
+from typing import Callable, Optional, cast
 
 from flwr.common.context import Context
+from flwr.common.recordset import RecordSet
 from flwr.server.driver.driver import Driver
 from flwr.server.strategy import Strategy
 
@@ -26,10 +27,32 @@
 from .compat import start_driver
 from .server import Server
 from .server_config import ServerConfig
+from .typing import ServerAppCallable
 
 
 class ServerApp:
-    """Flower ServerApp."""
+    """Flower ServerApp.
+
+    Examples
+    --------
+    Use the `ServerApp` with an existing `Strategy`:
+
+    >>> server_config = ServerConfig(num_rounds=3)
+    >>> strategy = FedAvg()
+    >>>
+    >>> app = ServerApp(
+    >>>     server_config=server_config,
+    >>>     strategy=strategy,
+    >>> )
+
+    Use the `ServerApp` with a custom main function:
+
+    >>> app = ServerApp()
+    >>>
+    >>> @app.main()
+    >>> def main(driver: Driver, context: Context) -> None:
+    >>>    print("ServerApp running")
+    """
 
     def __init__(
         self,
@@ -38,21 +61,74 @@ def __init__(
         strategy: Optional[Strategy] = None,
         client_manager: Optional[ClientManager] = None,
     ) -> None:
-        self.server = server
-        self.config = config
-        self.strategy = strategy
-        self.client_manager = client_manager
+        self._server = server
+        self._config = config
+        self._strategy = strategy
+        self._client_manager = client_manager
+        self._main: Optional[ServerAppCallable] = None
 
     def __call__(self, driver: Driver, context: Context) -> None:
         """Execute `ServerApp`."""
         # Compatibility mode
-        start_driver(
-            server=self.server,
-            config=self.config,
-            strategy=self.strategy,
-            client_manager=self.client_manager,
-            driver=driver,
-        )
+        if not self._main:
+            start_driver(
+                server=self._server,
+                config=self._config,
+                strategy=self._strategy,
+                client_manager=self._client_manager,
+                driver=driver,
+            )
+            return
+
+        # New execution mode
+        context = Context(state=RecordSet())
+        self._main(driver, context)
+
+    def main(self) -> Callable[[ServerAppCallable], ServerAppCallable]:
+        """Return a decorator that registers the main fn with the server app.
+
+        Examples
+        --------
+        >>> app = ServerApp()
+        >>>
+        >>> @app.main()
+        >>> def main(driver: Driver, context: Context) -> None:
+        >>>    print("ServerApp running")
+        """
+
+        def main_decorator(main_fn: ServerAppCallable) -> ServerAppCallable:
+            """Register the main fn with the ServerApp object."""
+            if self._server or self._config or self._strategy or self._client_manager:
+                raise ValueError(
+                    """Use either a custom main function or a `Strategy`, but not both.
+
+                    Use the `ServerApp` with an existing `Strategy`:
+
+                    >>> server_config = ServerConfig(num_rounds=3)
+                    >>> strategy = FedAvg()
+                    >>>
+                    >>> app = ServerApp(
+                    >>>     server_config=server_config,
+                    >>>     strategy=strategy,
+                    >>> )
+
+                    Use the `ServerApp` with a custom main function:
+
+                    >>> app = ServerApp()
+                    >>>
+                    >>> @app.main()
+                    >>> def main(driver: Driver, context: Context) -> None:
+                    >>>    print("ServerApp running")
+                    """,
+                )
+
+            # Register provided function with the ServerApp object
+            self._main = main_fn
+
+            # Return provided function unmodified
+            return main_fn
+
+        return main_decorator
 
 
 class LoadServerAppError(Exception):
diff --git a/src/py/flwr/server/server_app_test.py b/src/py/flwr/server/server_app_test.py
new file mode 100644
index 000000000000..011b22b0d60d
--- /dev/null
+++ b/src/py/flwr/server/server_app_test.py
@@ -0,0 +1,63 @@
+# Copyright 2020 Flower Labs GmbH. All Rights Reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+# ==============================================================================
+"""Tests for ServerApp."""
+
+
+from unittest.mock import MagicMock
+
+import pytest
+
+from flwr.common.context import Context
+from flwr.common.recordset import RecordSet
+from flwr.server import ServerApp, ServerConfig
+from flwr.server.driver import Driver
+
+
+def test_server_app_custom_mode() -> None:
+    """Test executing a ServerApp with a custom main function."""
+    # Prepare
+    app = ServerApp()
+    driver = MagicMock()
+    context = Context(state=RecordSet())
+
+    called = {"called": False}
+
+    # pylint: disable=unused-argument
+    @app.main()
+    def custom_main(driver: Driver, context: Context) -> None:
+        called["called"] = True
+
+    # pylint: enable=unused-argument
+
+    # Execute
+    app(driver, context)
+
+    # Assert
+    assert called["called"]
+
+
+def test_server_app_exception_when_both_modes() -> None:
+    """Test ServerApp error when both compat mode and custom fns are used."""
+    # Prepare
+    app = ServerApp(config=ServerConfig(num_rounds=3))
+
+    # Execute and assert
+    with pytest.raises(ValueError):
+        # pylint: disable=unused-argument
+        @app.main()
+        def custom_main(driver: Driver, context: Context) -> None:
+            pass
+
+        # pylint: enable=unused-argument

From 155a58f15e324ffd2c399ca9a82f0874071e81b1 Mon Sep 17 00:00:00 2001
From: Javier
Date: Sun, 18 Feb 2024 17:13:41 +0100
Subject: [PATCH 022/102] Split `run_server_app` (#2974)

---
 src/py/flwr/server/run_serverapp.py | 34 ++++++++++++++++++-----------
 1 file changed, 21 insertions(+), 13 deletions(-)

diff --git a/src/py/flwr/server/run_serverapp.py b/src/py/flwr/server/run_serverapp.py
index 283348f05457..26b8243b1f88 100644
--- a/src/py/flwr/server/run_serverapp.py
+++ b/src/py/flwr/server/run_serverapp.py
@@ -29,6 +29,24 @@
 from .server_app import ServerApp, load_server_app
 
 
+def run(server_app_attr: str, driver: Driver, server_app_dir: str) -> None:
+    """Run ServerApp with a given Driver."""
+    if server_app_dir is not None:
+        sys.path.insert(0, server_app_dir)
+
+    def _load() -> ServerApp:
+        server_app: ServerApp = load_server_app(server_app_attr)
+        return server_app
+
+    server_app = _load()
+
+    # Initialize Context
+    context = Context(state=RecordSet())
+
+    # Call ServerApp
+    server_app(driver=driver, context=context)
+
+
 def run_server_app() -> None:
     """Run Flower server app."""
     event(EventType.RUN_SERVER_APP_ENTER)
@@ -79,17 +97,7 @@ def run_server_app() -> None:
     )
 
     server_app_dir = args.dir
-    if server_app_dir is not None:
-        sys.path.insert(0, server_app_dir)
-
-    def _load() -> ServerApp:
-        server_app: ServerApp = load_server_app(getattr(args, "server-app"))
-        return server_app
-
-    server_app = _load()
-
-    # Initialize Context
-    context = Context(state=RecordSet())
+    server_app_attr = getattr(args, "server-app")
 
     # Initialize Driver
     driver = Driver(
@@ -97,8 +105,8 @@ def _load() -> ServerApp:
         root_certificates=root_certificates,
     )
 
-    # Call ServerApp
-    server_app(driver=driver, context=context)
+    # Run the Server App with the Driver
+    run(server_app_attr, driver, server_app_dir)
 
     # Clean up
     del driver

From 461abc8b9e86cf551cb885bc53c33398b9fc52f8 Mon Sep 17 00:00:00 2001
From: "Daniel J. Beutel"
Date: Sun, 18 Feb 2024 17:27:47 +0100
Subject: [PATCH 023/102] Export Context, Message and RecordSet from flwr.common (#2975)

---
 examples/secaggplus-mt/workflows.py                 |  2 +-
 src/py/flwr/client/app.py                           |  3 +--
 src/py/flwr/client/client.py                        |  2 +-
 src/py/flwr/client/clientapp.py                     |  3 +--
 src/py/flwr/client/grpc_client/connection.py        |  4 +---
 src/py/flwr/client/grpc_client/connection_test.py   |  3 +--
 src/py/flwr/client/grpc_rere_client/connection.py   |  3 +--
 .../flwr/client/message_handler/message_handler.py  |  4 +---
 .../client/message_handler/message_handler_test.py  |  7 ++++---
 .../client/message_handler/task_handler_test.py     |  3 +--
 .../mod/secure_aggregation/secaggplus_mod.py        | 12 ++++++++----
 .../mod/secure_aggregation/secaggplus_mod_test.py   |  4 +---
 src/py/flwr/client/mod/utils.py                     |  3 +--
 src/py/flwr/client/mod/utils_test.py                | 13 ++++++++-----
 src/py/flwr/client/node_state.py                    |  3 +--
 src/py/flwr/client/node_state_tests.py              |  2 +-
 src/py/flwr/client/numpy_client.py                  |  2 +-
 src/py/flwr/client/rest_client/connection.py        |  3 +--
 src/py/flwr/client/typing.py                        |  3 +--
 src/py/flwr/common/__init__.py                      | 14 ++++++++++++++
 src/py/flwr/server/compat/driver_client_proxy.py    |  2 +-
 src/py/flwr/server/run_serverapp.py                 |  4 +---
 src/py/flwr/server/server_app.py                    |  3 +--
 src/py/flwr/server/typing.py                        |  2 +-
 src/py/flwr/simulation/ray_transport/ray_actor.py   |  3 +--
 .../simulation/ray_transport/ray_client_proxy.py    |  3 +--
 .../ray_transport/ray_client_proxy_test.py          |  5 +----
 27 files changed, 57 insertions(+), 58 deletions(-)

diff --git a/examples/secaggplus-mt/workflows.py b/examples/secaggplus-mt/workflows.py
index b98de883b8f7..c9d0190555eb 100644
--- a/examples/secaggplus-mt/workflows.py
+++ b/examples/secaggplus-mt/workflows.py
@@ -58,7 +58,7 @@
 from flwr.proto.task_pb2 import Task
 from flwr.common import serde
 from flwr.common.constant import TASK_TYPE_FIT
-from flwr.common.recordset import RecordSet
+from flwr.common import RecordSet
 from flwr.common import recordset_compat as compat
 from flwr.common.configsrecord import ConfigsRecord
diff --git a/src/py/flwr/client/app.py b/src/py/flwr/client/app.py
index 15f7c5057a20..86e792d784e7 100644
--- a/src/py/flwr/client/app.py
+++ b/src/py/flwr/client/app.py
@@ -25,7 +25,7 @@
 from flwr.client.client import Client
 from flwr.client.clientapp import ClientApp
 from flwr.client.typing import ClientFn
-from flwr.common import GRPC_MAX_MESSAGE_LENGTH, EventType, event
+from flwr.common import GRPC_MAX_MESSAGE_LENGTH, EventType, Message, event
 from flwr.common.address import parse_address
 from flwr.common.constant import (
     MISSING_EXTRA_REST,
@@ -35,7 +35,6 @@
     TRANSPORT_TYPES,
 )
 from flwr.common.logger import log, warn_deprecated_feature, warn_experimental_feature
-from flwr.common.message import Message
 
 from .clientapp import load_client_app
 from .grpc_client.connection import grpc_connection
diff --git a/src/py/flwr/client/client.py b/src/py/flwr/client/client.py
index 6d982ecc9a9e..23a3755f3efe 100644
--- a/src/py/flwr/client/client.py
+++ b/src/py/flwr/client/client.py
@@ -21,6 +21,7 @@
 
 from flwr.common import (
     Code,
+    Context,
     EvaluateIns,
     EvaluateRes,
     FitIns,
@@ -32,7 +33,6 @@
     Parameters,
     Status,
 )
-from flwr.common.context import Context
 
 
 class Client(ABC):
diff --git a/src/py/flwr/client/clientapp.py b/src/py/flwr/client/clientapp.py
index cfc59c9298ed..9de6516c7a39 100644
--- a/src/py/flwr/client/clientapp.py
+++ b/src/py/flwr/client/clientapp.py
@@ -23,8 +23,7 @@
 )
 from flwr.client.mod.utils import make_ffn
 from flwr.client.typing import ClientFn, Mod
-from flwr.common.context import Context
-from flwr.common.message import Message
+from flwr.common import Context, Message
 
 
 class ClientApp:
diff --git a/src/py/flwr/client/grpc_client/connection.py b/src/py/flwr/client/grpc_client/connection.py
index e6d21963fcbf..e04846985845 100644
--- a/src/py/flwr/client/grpc_client/connection.py
+++ b/src/py/flwr/client/grpc_client/connection.py
@@ -22,7 +22,7 @@
 from queue import Queue
 from typing import Callable, Iterator, Optional, Tuple, Union, cast
 
-from flwr.common import GRPC_MAX_MESSAGE_LENGTH
+from flwr.common import GRPC_MAX_MESSAGE_LENGTH, Message, Metadata, RecordSet
 from flwr.common import recordset_compat as compat
 from flwr.common import serde
 from flwr.common.configsrecord import ConfigsRecord
@@ -34,8 +34,6 @@
 )
 from flwr.common.grpc import create_channel
 from flwr.common.logger import log
-from flwr.common.message import Message, Metadata
-from flwr.common.recordset import RecordSet
 from flwr.proto.transport_pb2 import (  # pylint: disable=E0611
     ClientMessage,
     Reason,
diff --git a/src/py/flwr/client/grpc_client/connection_test.py b/src/py/flwr/client/grpc_client/connection_test.py
index 127e27356f64..4fa289d8bd46 100644
--- a/src/py/flwr/client/grpc_client/connection_test.py
+++ b/src/py/flwr/client/grpc_client/connection_test.py
@@ -23,11 +23,10 @@
 
 import grpc
 
+from flwr.common import Message, Metadata, RecordSet
 from flwr.common import recordset_compat as compat
 from flwr.common.configsrecord import ConfigsRecord
 from flwr.common.constant import MESSAGE_TYPE_GET_PROPERTIES
-from flwr.common.message import Message, Metadata
-from flwr.common.recordset import RecordSet
 from flwr.common.typing import Code, GetPropertiesRes, Status
 from flwr.proto.transport_pb2 import (  # pylint: disable=E0611
     ClientMessage,
diff --git a/src/py/flwr/client/grpc_rere_client/connection.py b/src/py/flwr/client/grpc_rere_client/connection.py
index 07635d002721..04f03299f320 100644
--- a/src/py/flwr/client/grpc_rere_client/connection.py
+++ b/src/py/flwr/client/grpc_rere_client/connection.py
@@ -26,10 +26,9 @@
     validate_task_ins,
     validate_task_res,
 )
-from flwr.common import GRPC_MAX_MESSAGE_LENGTH
+from flwr.common import GRPC_MAX_MESSAGE_LENGTH, Message
 from flwr.common.grpc import create_channel
 from flwr.common.logger import log, warn_experimental_feature
-from flwr.common.message import Message
 from flwr.common.serde import message_from_taskins, message_to_taskres
 from
flwr.proto.fleet_pb2 import ( # pylint: disable=E0611 CreateNodeRequest, diff --git a/src/py/flwr/client/message_handler/message_handler.py b/src/py/flwr/client/message_handler/message_handler.py index f8c8a725aec7..c5bc91969291 100644 --- a/src/py/flwr/client/message_handler/message_handler.py +++ b/src/py/flwr/client/message_handler/message_handler.py @@ -24,6 +24,7 @@ maybe_call_get_properties, ) from flwr.client.typing import ClientFn +from flwr.common import Context, Message, Metadata, RecordSet from flwr.common.configsrecord import ConfigsRecord from flwr.common.constant import ( MESSAGE_TYPE_EVALUATE, @@ -31,9 +32,6 @@ MESSAGE_TYPE_GET_PARAMETERS, MESSAGE_TYPE_GET_PROPERTIES, ) -from flwr.common.context import Context -from flwr.common.message import Message, Metadata -from flwr.common.recordset import RecordSet from flwr.common.recordset_compat import ( evaluateres_to_recordset, fitres_to_recordset, diff --git a/src/py/flwr/client/message_handler/message_handler_test.py b/src/py/flwr/client/message_handler/message_handler_test.py index c4c65d98b833..361d301bc8fc 100644 --- a/src/py/flwr/client/message_handler/message_handler_test.py +++ b/src/py/flwr/client/message_handler/message_handler_test.py @@ -21,6 +21,7 @@ from flwr.client.typing import ClientFn from flwr.common import ( Code, + Context, EvaluateIns, EvaluateRes, FitIns, @@ -29,15 +30,15 @@ GetParametersRes, GetPropertiesIns, GetPropertiesRes, + Message, + Metadata, Parameters, + RecordSet, Status, ) from flwr.common import recordset_compat as compat from flwr.common import typing from flwr.common.constant import MESSAGE_TYPE_GET_PROPERTIES -from flwr.common.context import Context -from flwr.common.message import Message, Metadata -from flwr.common.recordset import RecordSet from .message_handler import handle_legacy_message_from_msgtype diff --git a/src/py/flwr/client/message_handler/task_handler_test.py b/src/py/flwr/client/message_handler/task_handler_test.py index 9a668231d509..65ad23630ec2 
100644 --- a/src/py/flwr/client/message_handler/task_handler_test.py +++ b/src/py/flwr/client/message_handler/task_handler_test.py @@ -20,8 +20,7 @@ validate_task_ins, validate_task_res, ) -from flwr.common import serde -from flwr.common.recordset import RecordSet +from flwr.common import RecordSet, serde from flwr.proto.fleet_pb2 import PullTaskInsResponse # pylint: disable=E0611 from flwr.proto.task_pb2 import Task, TaskIns, TaskRes # pylint: disable=E0611 diff --git a/src/py/flwr/client/mod/secure_aggregation/secaggplus_mod.py b/src/py/flwr/client/mod/secure_aggregation/secaggplus_mod.py index fa5a9fd24109..ce8aab523787 100644 --- a/src/py/flwr/client/mod/secure_aggregation/secaggplus_mod.py +++ b/src/py/flwr/client/mod/secure_aggregation/secaggplus_mod.py @@ -21,14 +21,18 @@ from typing import Any, Callable, Dict, List, Tuple, cast from flwr.client.typing import ClientAppCallable -from flwr.common import ndarray_to_bytes, parameters_to_ndarrays +from flwr.common import ( + Context, + Message, + Metadata, + RecordSet, + ndarray_to_bytes, + parameters_to_ndarrays, +) from flwr.common import recordset_compat as compat from flwr.common.configsrecord import ConfigsRecord from flwr.common.constant import MESSAGE_TYPE_FIT -from flwr.common.context import Context from flwr.common.logger import log -from flwr.common.message import Message, Metadata -from flwr.common.recordset import RecordSet from flwr.common.secure_aggregation.crypto.shamir import create_shares from flwr.common.secure_aggregation.crypto.symmetric_encryption import ( bytes_to_private_key, diff --git a/src/py/flwr/client/mod/secure_aggregation/secaggplus_mod_test.py b/src/py/flwr/client/mod/secure_aggregation/secaggplus_mod_test.py index 4033306d0845..760d1a26984c 100644 --- a/src/py/flwr/client/mod/secure_aggregation/secaggplus_mod_test.py +++ b/src/py/flwr/client/mod/secure_aggregation/secaggplus_mod_test.py @@ -19,11 +19,9 @@ from typing import Callable, Dict, List from flwr.client.mod import make_ffn 
+from flwr.common import Context, Message, Metadata, RecordSet from flwr.common.configsrecord import ConfigsRecord from flwr.common.constant import MESSAGE_TYPE_FIT -from flwr.common.context import Context -from flwr.common.message import Message, Metadata -from flwr.common.recordset import RecordSet from flwr.common.secure_aggregation.secaggplus_constants import ( KEY_ACTIVE_SECURE_ID_LIST, KEY_CIPHERTEXT_LIST, diff --git a/src/py/flwr/client/mod/utils.py b/src/py/flwr/client/mod/utils.py index 3db5da563c23..4c3c32944f01 100644 --- a/src/py/flwr/client/mod/utils.py +++ b/src/py/flwr/client/mod/utils.py @@ -18,8 +18,7 @@ from typing import List from flwr.client.typing import ClientAppCallable, Mod -from flwr.common.context import Context -from flwr.common.message import Message +from flwr.common import Context, Message def make_ffn(ffn: ClientAppCallable, mods: List[Mod]) -> ClientAppCallable: diff --git a/src/py/flwr/client/mod/utils_test.py b/src/py/flwr/client/mod/utils_test.py index 4a086d9ae3f7..bb3db6e1d6ce 100644 --- a/src/py/flwr/client/mod/utils_test.py +++ b/src/py/flwr/client/mod/utils_test.py @@ -19,11 +19,14 @@ from typing import List from flwr.client.typing import ClientAppCallable, Mod -from flwr.common.configsrecord import ConfigsRecord -from flwr.common.context import Context -from flwr.common.message import Message, Metadata -from flwr.common.metricsrecord import MetricsRecord -from flwr.common.recordset import RecordSet +from flwr.common import ( + ConfigsRecord, + Context, + Message, + Metadata, + MetricsRecord, + RecordSet, +) from .utils import make_ffn diff --git a/src/py/flwr/client/node_state.py b/src/py/flwr/client/node_state.py index 465bbd356c1c..71681b783419 100644 --- a/src/py/flwr/client/node_state.py +++ b/src/py/flwr/client/node_state.py @@ -17,8 +17,7 @@ from typing import Any, Dict -from flwr.common.context import Context -from flwr.common.recordset import RecordSet +from flwr.common import Context, RecordSet class NodeState: diff 
--git a/src/py/flwr/client/node_state_tests.py b/src/py/flwr/client/node_state_tests.py index 11e5e74a31ec..20885880983f 100644 --- a/src/py/flwr/client/node_state_tests.py +++ b/src/py/flwr/client/node_state_tests.py @@ -16,8 +16,8 @@ from flwr.client.node_state import NodeState +from flwr.common import Context from flwr.common.configsrecord import ConfigsRecord -from flwr.common.context import Context from flwr.proto.task_pb2 import TaskIns # pylint: disable=E0611 diff --git a/src/py/flwr/client/numpy_client.py b/src/py/flwr/client/numpy_client.py index a77889912a09..0247958d88a9 100644 --- a/src/py/flwr/client/numpy_client.py +++ b/src/py/flwr/client/numpy_client.py @@ -21,12 +21,12 @@ from flwr.client.client import Client from flwr.common import ( Config, + Context, NDArrays, Scalar, ndarrays_to_parameters, parameters_to_ndarrays, ) -from flwr.common.context import Context from flwr.common.typing import ( Code, EvaluateIns, diff --git a/src/py/flwr/client/rest_client/connection.py b/src/py/flwr/client/rest_client/connection.py index a5c8ea0957d2..cee1529ac285 100644 --- a/src/py/flwr/client/rest_client/connection.py +++ b/src/py/flwr/client/rest_client/connection.py @@ -26,10 +26,9 @@ validate_task_ins, validate_task_res, ) -from flwr.common import GRPC_MAX_MESSAGE_LENGTH +from flwr.common import GRPC_MAX_MESSAGE_LENGTH, Message from flwr.common.constant import MISSING_EXTRA_REST from flwr.common.logger import log -from flwr.common.message import Message from flwr.common.serde import message_from_taskins, message_to_taskres from flwr.proto.fleet_pb2 import ( # pylint: disable=E0611 CreateNodeRequest, diff --git a/src/py/flwr/client/typing.py b/src/py/flwr/client/typing.py index 7aef2b30e0fc..956ac7a15c05 100644 --- a/src/py/flwr/client/typing.py +++ b/src/py/flwr/client/typing.py @@ -17,8 +17,7 @@ from typing import Callable -from flwr.common.context import Context -from flwr.common.message import Message +from flwr.common import Context, Message from .client 
import Client as Client diff --git a/src/py/flwr/common/__init__.py b/src/py/flwr/common/__init__.py index 2f45de45dfc3..3ee5c1b14500 100644 --- a/src/py/flwr/common/__init__.py +++ b/src/py/flwr/common/__init__.py @@ -15,14 +15,21 @@ """Common components shared between server and client.""" +from .configsrecord import ConfigsRecord as ConfigsRecord +from .context import Context as Context from .date import now as now from .grpc import GRPC_MAX_MESSAGE_LENGTH from .logger import configure as configure from .logger import log as log +from .message import Message as Message +from .message import Metadata as Metadata +from .metricsrecord import MetricsRecord as MetricsRecord from .parameter import bytes_to_ndarray as bytes_to_ndarray from .parameter import ndarray_to_bytes as ndarray_to_bytes from .parameter import ndarrays_to_parameters as ndarrays_to_parameters from .parameter import parameters_to_ndarrays as parameters_to_ndarrays +from .parametersrecord import ParametersRecord as ParametersRecord +from .recordset import RecordSet as RecordSet from .telemetry import EventType as EventType from .telemetry import event as event from .typing import ClientMessage as ClientMessage @@ -53,7 +60,9 @@ "ClientMessage", "Code", "Config", + "ConfigsRecord", "configure", + "Context", "DisconnectRes", "EvaluateIns", "EvaluateRes", @@ -67,8 +76,11 @@ "GetPropertiesRes", "GRPC_MAX_MESSAGE_LENGTH", "log", + "Message", + "Metadata", "Metrics", "MetricsAggregationFn", + "MetricsRecord", "ndarray_to_bytes", "now", "NDArray", @@ -76,8 +88,10 @@ "ndarrays_to_parameters", "Parameters", "parameters_to_ndarrays", + "ParametersRecord", "Properties", "ReconnectIns", + "RecordSet", "Scalar", "ServerMessage", "Status", diff --git a/src/py/flwr/server/compat/driver_client_proxy.py b/src/py/flwr/server/compat/driver_client_proxy.py index 1dc992106f60..a52bd229892d 100644 --- a/src/py/flwr/server/compat/driver_client_proxy.py +++ b/src/py/flwr/server/compat/driver_client_proxy.py @@ -19,6 +19,7 
@@ from typing import List, Optional from flwr import common +from flwr.common import RecordSet from flwr.common import recordset_compat as compat from flwr.common import serde from flwr.common.constant import ( @@ -27,7 +28,6 @@ MESSAGE_TYPE_GET_PARAMETERS, MESSAGE_TYPE_GET_PROPERTIES, ) -from flwr.common.recordset import RecordSet from flwr.proto import driver_pb2, node_pb2, task_pb2 # pylint: disable=E0611 from flwr.server.client_proxy import ClientProxy diff --git a/src/py/flwr/server/run_serverapp.py b/src/py/flwr/server/run_serverapp.py index 26b8243b1f88..e7205ebd1444 100644 --- a/src/py/flwr/server/run_serverapp.py +++ b/src/py/flwr/server/run_serverapp.py @@ -20,10 +20,8 @@ from logging import DEBUG, WARN from pathlib import Path -from flwr.common import EventType, event -from flwr.common.context import Context +from flwr.common import Context, EventType, RecordSet, event from flwr.common.logger import log -from flwr.common.recordset import RecordSet from .driver.driver import Driver from .server_app import ServerApp, load_server_app diff --git a/src/py/flwr/server/server_app.py b/src/py/flwr/server/server_app.py index 027818019786..1d775878bbd9 100644 --- a/src/py/flwr/server/server_app.py +++ b/src/py/flwr/server/server_app.py @@ -18,8 +18,7 @@ import importlib from typing import Callable, Optional, cast -from flwr.common.context import Context -from flwr.common.recordset import RecordSet +from flwr.common import Context, RecordSet from flwr.server.driver.driver import Driver from flwr.server.strategy import Strategy diff --git a/src/py/flwr/server/typing.py b/src/py/flwr/server/typing.py index 728121c2eddf..dd2463c8e939 100644 --- a/src/py/flwr/server/typing.py +++ b/src/py/flwr/server/typing.py @@ -17,7 +17,7 @@ from typing import Callable -from flwr.common.context import Context +from flwr.common import Context from flwr.server.driver import Driver ServerAppCallable = Callable[[Driver, Context], None] diff --git 
a/src/py/flwr/simulation/ray_transport/ray_actor.py b/src/py/flwr/simulation/ray_transport/ray_actor.py index 974773a3f577..70a220dc2a19 100644 --- a/src/py/flwr/simulation/ray_transport/ray_actor.py +++ b/src/py/flwr/simulation/ray_transport/ray_actor.py @@ -26,9 +26,8 @@ from ray.util.actor_pool import ActorPool from flwr.client.clientapp import ClientApp -from flwr.common.context import Context +from flwr.common import Context, Message from flwr.common.logger import log -from flwr.common.message import Message ClientAppFn = Callable[[], ClientApp] diff --git a/src/py/flwr/simulation/ray_transport/ray_client_proxy.py b/src/py/flwr/simulation/ray_transport/ray_client_proxy.py index ddac030b2ef0..10fe0f41dd28 100644 --- a/src/py/flwr/simulation/ray_transport/ray_client_proxy.py +++ b/src/py/flwr/simulation/ray_transport/ray_client_proxy.py @@ -23,6 +23,7 @@ from flwr.client import ClientFn from flwr.client.clientapp import ClientApp from flwr.client.node_state import NodeState +from flwr.common import Message, Metadata, RecordSet from flwr.common.constant import ( MESSAGE_TYPE_EVALUATE, MESSAGE_TYPE_FIT, @@ -30,8 +31,6 @@ MESSAGE_TYPE_GET_PROPERTIES, ) from flwr.common.logger import log -from flwr.common.message import Message, Metadata -from flwr.common.recordset import RecordSet from flwr.common.recordset_compat import ( evaluateins_to_recordset, fitins_to_recordset, diff --git a/src/py/flwr/simulation/ray_transport/ray_client_proxy_test.py b/src/py/flwr/simulation/ray_transport/ray_client_proxy_test.py index 9ade31c323d8..5910607e2012 100644 --- a/src/py/flwr/simulation/ray_transport/ray_client_proxy_test.py +++ b/src/py/flwr/simulation/ray_transport/ray_client_proxy_test.py @@ -23,12 +23,9 @@ from flwr.client import Client, NumPyClient from flwr.client.clientapp import ClientApp -from flwr.common import Config, Scalar +from flwr.common import Config, Context, Message, Metadata, RecordSet, Scalar from flwr.common.configsrecord import ConfigsRecord from 
flwr.common.constant import MESSAGE_TYPE_GET_PROPERTIES -from flwr.common.context import Context -from flwr.common.message import Message, Metadata -from flwr.common.recordset import RecordSet from flwr.common.recordset_compat import ( getpropertiesins_to_recordset, recordset_to_getpropertiesres, From 1884222dd7f97f421193a36aa9d32254cc27a1be Mon Sep 17 00:00:00 2001 From: "Daniel J. Beutel" Date: Sun, 18 Feb 2024 18:01:22 +0100 Subject: [PATCH 024/102] Export Driver from flwr.server (#2973) --- src/py/flwr/server/__init__.py | 5 +++-- src/py/flwr/server/driver/driver.py | 3 ++- src/py/flwr/server/driver/driver_test.py | 3 ++- src/py/flwr/server/server_app.py | 2 +- src/py/flwr/server/typing.py | 3 ++- 5 files changed, 10 insertions(+), 6 deletions(-) diff --git a/src/py/flwr/server/__init__.py b/src/py/flwr/server/__init__.py index cad054fb86ab..969bea96d1fe 100644 --- a/src/py/flwr/server/__init__.py +++ b/src/py/flwr/server/__init__.py @@ -15,7 +15,7 @@ """Flower server.""" -from . import driver, strategy +from . 
import strategy from .app import run_driver_api as run_driver_api from .app import run_fleet_api as run_fleet_api from .app import run_superlink as run_superlink @@ -23,6 +23,7 @@ from .client_manager import ClientManager as ClientManager from .client_manager import SimpleClientManager as SimpleClientManager from .compat import start_driver as start_driver +from .driver import Driver as Driver from .history import History as History from .run_serverapp import run_server_app as run_server_app from .server import Server as Server @@ -31,7 +32,7 @@ __all__ = [ "ClientManager", - "driver", + "Driver", "History", "run_driver_api", "run_fleet_api", diff --git a/src/py/flwr/server/driver/driver.py b/src/py/flwr/server/driver/driver.py index 0a7cb36f8847..8b8b637e47cb 100644 --- a/src/py/flwr/server/driver/driver.py +++ b/src/py/flwr/server/driver/driver.py @@ -25,7 +25,8 @@ ) from flwr.proto.node_pb2 import Node # pylint: disable=E0611 from flwr.proto.task_pb2 import TaskIns, TaskRes # pylint: disable=E0611 -from flwr.server.driver.grpc_driver import DEFAULT_SERVER_ADDRESS_DRIVER, GrpcDriver + +from .grpc_driver import DEFAULT_SERVER_ADDRESS_DRIVER, GrpcDriver class Driver: diff --git a/src/py/flwr/server/driver/driver_test.py b/src/py/flwr/server/driver/driver_test.py index 0ee7fbfec37e..b0d548af53a5 100644 --- a/src/py/flwr/server/driver/driver_test.py +++ b/src/py/flwr/server/driver/driver_test.py @@ -24,7 +24,8 @@ PushTaskInsRequest, ) from flwr.proto.task_pb2 import Task, TaskIns, TaskRes # pylint: disable=E0611 -from flwr.server.driver.driver import Driver + +from .driver import Driver class TestDriver(unittest.TestCase): diff --git a/src/py/flwr/server/server_app.py b/src/py/flwr/server/server_app.py index 1d775878bbd9..7b5630b1bad2 100644 --- a/src/py/flwr/server/server_app.py +++ b/src/py/flwr/server/server_app.py @@ -19,11 +19,11 @@ from typing import Callable, Optional, cast from flwr.common import Context, RecordSet -from flwr.server.driver.driver import 
Driver from flwr.server.strategy import Strategy from .client_manager import ClientManager from .compat import start_driver +from .driver import Driver from .server import Server from .server_config import ServerConfig from .typing import ServerAppCallable diff --git a/src/py/flwr/server/typing.py b/src/py/flwr/server/typing.py index dd2463c8e939..fa84322bc785 100644 --- a/src/py/flwr/server/typing.py +++ b/src/py/flwr/server/typing.py @@ -18,6 +18,7 @@ from typing import Callable from flwr.common import Context -from flwr.server.driver import Driver + +from .driver import Driver ServerAppCallable = Callable[[Driver, Context], None] From da682a1709185f359bbba68e34df5d86f5077853 Mon Sep 17 00:00:00 2001 From: Heng Pan <134433891+panh99@users.noreply.github.com> Date: Sun, 18 Feb 2024 20:45:55 +0000 Subject: [PATCH 025/102] Add new fields to `Metadata` class (#2961) --- src/py/flwr/client/grpc_client/connection.py | 6 +- .../client/grpc_client/connection_test.py | 8 +- .../client/grpc_rere_client/connection.py | 56 +++--- .../client/message_handler/message_handler.py | 46 +++-- .../message_handler/message_handler_test.py | 117 +++++++++++-- .../client/message_handler/task_handler.py | 69 +------- .../message_handler/task_handler_test.py | 43 +---- .../mod/secure_aggregation/secaggplus_mod.py | 14 +- .../secure_aggregation/secaggplus_mod_test.py | 18 +- src/py/flwr/client/mod/utils_test.py | 9 +- src/py/flwr/client/rest_client/connection.py | 50 +++--- src/py/flwr/common/message.py | 164 ++++++++++++++++-- src/py/flwr/common/serde.py | 30 +++- src/py/flwr/common/serde_test.py | 31 +--- .../ray_transport/ray_client_proxy.py | 4 +- .../ray_transport/ray_client_proxy_test.py | 4 +- 16 files changed, 398 insertions(+), 271 deletions(-) diff --git a/src/py/flwr/client/grpc_client/connection.py b/src/py/flwr/client/grpc_client/connection.py index e04846985845..c59c33d7ae8e 100644 --- a/src/py/flwr/client/grpc_client/connection.py +++ 
b/src/py/flwr/client/grpc_client/connection.py @@ -169,9 +169,11 @@ def receive() -> Message: metadata=Metadata( run_id=0, message_id=str(uuid.uuid4()), + src_node_id=0, + dst_node_id=0, + reply_to_message="", group_id="", ttl="", - node_id=0, message_type=message_type, ), content=recordset, @@ -205,7 +207,7 @@ def send(message: Message) -> None: disconnect_res=ClientMessage.DisconnectRes(reason=reason) ) else: - raise ValueError(f"Invalid task type: {message_type}") + raise ValueError(f"Invalid message type: {message_type}") # Send ClientMessage proto return queue.put(msg_proto, block=False) diff --git a/src/py/flwr/client/grpc_client/connection_test.py b/src/py/flwr/client/grpc_client/connection_test.py index 4fa289d8bd46..9f0aa3b8980c 100644 --- a/src/py/flwr/client/grpc_client/connection_test.py +++ b/src/py/flwr/client/grpc_client/connection_test.py @@ -46,8 +46,10 @@ metadata=Metadata( run_id=0, message_id="", + src_node_id=0, + dst_node_id=0, + reply_to_message="", group_id="", - node_id=0, ttl="", message_type=MESSAGE_TYPE_GET_PROPERTIES, ), @@ -59,8 +61,10 @@ metadata=Metadata( run_id=0, message_id="", + src_node_id=0, + dst_node_id=0, + reply_to_message="", group_id="", - node_id=0, ttl="", message_type="reconnect", ), diff --git a/src/py/flwr/client/grpc_rere_client/connection.py b/src/py/flwr/client/grpc_rere_client/connection.py index 04f03299f320..00b7a864c5d6 100644 --- a/src/py/flwr/client/grpc_rere_client/connection.py +++ b/src/py/flwr/client/grpc_rere_client/connection.py @@ -16,19 +16,17 @@ from contextlib import contextmanager +from copy import copy from logging import DEBUG, ERROR from pathlib import Path from typing import Callable, Dict, Iterator, Optional, Tuple, Union, cast -from flwr.client.message_handler.task_handler import ( - configure_task_res, - get_task_ins, - validate_task_ins, - validate_task_res, -) -from flwr.common import GRPC_MAX_MESSAGE_LENGTH, Message +from flwr.client.message_handler.message_handler import 
validate_out_message +from flwr.client.message_handler.task_handler import get_task_ins, validate_task_ins +from flwr.common import GRPC_MAX_MESSAGE_LENGTH from flwr.common.grpc import create_channel from flwr.common.logger import log, warn_experimental_feature +from flwr.common.message import Message, Metadata from flwr.common.serde import message_from_taskins, message_to_taskres from flwr.proto.fleet_pb2 import ( # pylint: disable=E0611 CreateNodeRequest, @@ -41,7 +39,7 @@ from flwr.proto.task_pb2 import TaskIns # pylint: disable=E0611 KEY_NODE = "node" -KEY_TASK_INS = "current_task_ins" +KEY_METADATA = "in_message_metadata" def on_channel_state_change(channel_connectivity: str) -> None: @@ -102,8 +100,8 @@ def grpc_request_response( channel.subscribe(on_channel_state_change) stub = FleetStub(channel) - # Necessary state to link TaskRes to TaskIns - state: Dict[str, Optional[TaskIns]] = {KEY_TASK_INS: None} + # Necessary state to validate messages to be sent + state: Dict[str, Optional[Metadata]] = {KEY_METADATA: None} # Enable create_node and delete_node to store node node_store: Dict[str, Optional[Node]] = {KEY_NODE: None} @@ -149,14 +147,20 @@ def receive() -> Optional[Message]: task_ins: Optional[TaskIns] = get_task_ins(response) # Discard the current TaskIns if not valid - if task_ins is not None and not validate_task_ins(task_ins): + if task_ins is not None and not ( + task_ins.task.consumer.node_id == node.node_id + and validate_task_ins(task_ins) + ): task_ins = None - # Remember `task_ins` until `task_res` is available - state[KEY_TASK_INS] = task_ins + # Construct the Message + in_message = message_from_taskins(task_ins) if task_ins else None + + # Remember `metadata` of the in message + state[KEY_METADATA] = copy(in_message.metadata) if in_message else None # Return the message if available - return message_from_taskins(task_ins) if task_ins is not None else None + return in_message def send(message: Message) -> None: """Send task result back to 
server.""" @@ -164,30 +168,26 @@ def send(message: Message) -> None: if node_store[KEY_NODE] is None: log(ERROR, "Node instance missing") return - node: Node = cast(Node, node_store[KEY_NODE]) - # Get incoming TaskIns - if state[KEY_TASK_INS] is None: - log(ERROR, "No current TaskIns") + # Get incoming message + in_metadata = state[KEY_METADATA] + if in_metadata is None: + log(ERROR, "No current message") + return + + # Validate out message + if not validate_out_message(message, in_metadata): + log(ERROR, "Invalid out message") return - task_ins: TaskIns = cast(TaskIns, state[KEY_TASK_INS]) # Construct TaskRes task_res = message_to_taskres(message) - # Check if fields to be set are not initialized - if not validate_task_res(task_res): - state[KEY_TASK_INS] = None - log(ERROR, "TaskRes has been initialized accidentally") - - # Configure TaskRes - task_res = configure_task_res(task_res, task_ins, node) - # Serialize ProtoBuf to bytes request = PushTaskResRequest(task_res_list=[task_res]) _ = stub.PushTaskRes(request) - state[KEY_TASK_INS] = None + state[KEY_METADATA] = None try: # Yield methods diff --git a/src/py/flwr/client/message_handler/message_handler.py b/src/py/flwr/client/message_handler/message_handler.py index c5bc91969291..643128f31061 100644 --- a/src/py/flwr/client/message_handler/message_handler.py +++ b/src/py/flwr/client/message_handler/message_handler.py @@ -85,17 +85,7 @@ def handle_control_message(message: Message) -> Tuple[Optional[Message], int]: reason = cast(int, disconnect_msg.disconnect_res.reason) recordset = RecordSet() recordset.set_configs("config", ConfigsRecord({"reason": reason})) - out_message = Message( - metadata=Metadata( - run_id=0, - message_id="", - group_id="", - node_id=0, - ttl="", - message_type="reconnect", - ), - content=recordset, - ) + out_message = message.create_reply(recordset, ttl="") # Return TaskRes and sleep duration return out_message, sleep_duration @@ -107,7 +97,7 @@ def handle_legacy_message_from_msgtype( 
client_fn: ClientFn, message: Message, context: Context ) -> Message: """Handle legacy message in the inner most mod.""" - client = client_fn(str(message.metadata.node_id)) + client = client_fn(str(message.metadata.dst_node_id)) client.set_context(context) @@ -144,21 +134,10 @@ def handle_legacy_message_from_msgtype( ) out_recordset = evaluateres_to_recordset(evaluate_res) else: - raise ValueError(f"Invalid task type: {message_type}") + raise ValueError(f"Invalid message type: {message_type}") # Return Message - out_message = Message( - metadata=Metadata( - run_id=0, - message_id="", - group_id="", - node_id=0, - ttl="", - message_type=message_type, - ), - content=out_recordset, - ) - return out_message + return message.create_reply(out_recordset, ttl="") def _reconnect( @@ -173,3 +152,20 @@ def _reconnect( # Build DisconnectRes message disconnect_res = ClientMessage.DisconnectRes(reason=reason) return ClientMessage(disconnect_res=disconnect_res), sleep_duration + + +def validate_out_message(out_message: Message, in_message_metadata: Metadata) -> bool: + """Validate the out message.""" + out_meta = out_message.metadata + in_meta = in_message_metadata + if ( # pylint: disable-next=too-many-boolean-expressions + out_meta.run_id == in_meta.run_id + and out_meta.message_id == "" # This will be generated by the server + and out_meta.src_node_id == in_meta.dst_node_id + and out_meta.dst_node_id == in_meta.src_node_id + and out_meta.reply_to_message == in_meta.message_id + and out_meta.group_id == in_meta.group_id + and out_meta.message_type == in_meta.message_type + ): + return True + return False diff --git a/src/py/flwr/client/message_handler/message_handler_test.py b/src/py/flwr/client/message_handler/message_handler_test.py index 361d301bc8fc..9fc126f27923 100644 --- a/src/py/flwr/client/message_handler/message_handler_test.py +++ b/src/py/flwr/client/message_handler/message_handler_test.py @@ -15,7 +15,10 @@ """Client-side message handler tests.""" +import unittest 
import uuid +from copy import copy +from typing import List from flwr.client import Client from flwr.client.typing import ClientFn @@ -40,7 +43,7 @@ from flwr.common import typing from flwr.common.constant import MESSAGE_TYPE_GET_PROPERTIES -from .message_handler import handle_legacy_message_from_msgtype +from .message_handler import handle_legacy_message_from_msgtype, validate_out_message class ClientWithoutProps(Client): @@ -122,10 +125,12 @@ def test_client_without_get_properties() -> None: recordset = compat.getpropertiesins_to_recordset(GetPropertiesIns({})) message = Message( metadata=Metadata( - run_id=0, + run_id=123, message_id=str(uuid.uuid4()), - group_id="", - node_id=0, + group_id="some group ID", + src_node_id=0, + dst_node_id=1123, + reply_to_message="", ttl="", message_type=MESSAGE_TYPE_GET_PROPERTIES, ), @@ -148,10 +153,22 @@ def test_client_without_get_properties() -> None: properties={}, ) expected_rs = compat.getpropertiesres_to_recordset(expected_get_properties_res) - expected_msg = Message(message.metadata, expected_rs) + expected_msg = Message( + metadata=Metadata( + run_id=123, + message_id="", + group_id="some group ID", + src_node_id=1123, + dst_node_id=0, + reply_to_message=message.metadata.message_id, + ttl="", + message_type=MESSAGE_TYPE_GET_PROPERTIES, + ), + content=expected_rs, + ) assert actual_msg.content == expected_msg.content - assert actual_msg.metadata.message_type == expected_msg.metadata.message_type + assert actual_msg.metadata == expected_msg.metadata def test_client_with_get_properties() -> None: @@ -161,10 +178,12 @@ def test_client_with_get_properties() -> None: recordset = compat.getpropertiesins_to_recordset(GetPropertiesIns({})) message = Message( metadata=Metadata( - run_id=0, + run_id=123, message_id=str(uuid.uuid4()), - group_id="", - node_id=0, + group_id="some group ID", + src_node_id=0, + dst_node_id=1123, + reply_to_message="", ttl="", message_type=MESSAGE_TYPE_GET_PROPERTIES, ), @@ -187,7 +206,83 @@ def 
test_client_with_get_properties() -> None: properties={"str_prop": "val", "int_prop": 1}, ) expected_rs = compat.getpropertiesres_to_recordset(expected_get_properties_res) - expected_msg = Message(message.metadata, expected_rs) + expected_msg = Message( + metadata=Metadata( + run_id=123, + message_id="", + group_id="some group ID", + src_node_id=1123, + dst_node_id=0, + reply_to_message=message.metadata.message_id, + ttl="", + message_type=MESSAGE_TYPE_GET_PROPERTIES, + ), + content=expected_rs, + ) assert actual_msg.content == expected_msg.content - assert actual_msg.metadata.message_type == expected_msg.metadata.message_type + assert actual_msg.metadata == expected_msg.metadata + + +class TestMessageValidation(unittest.TestCase): + """Test message validation.""" + + def setUp(self) -> None: + """Set up the message validation.""" + # Common setup for tests + self.in_metadata = Metadata( + run_id=123, + message_id="qwerty", + src_node_id=10, + dst_node_id=20, + reply_to_message="", + group_id="group1", + ttl="60", + message_type="mock", + ) + self.valid_out_metadata = Metadata( + run_id=123, + message_id="", + src_node_id=20, + dst_node_id=10, + reply_to_message="qwerty", + group_id="group1", + ttl="60", + message_type="mock", + ) + self.common_content = RecordSet() + + def test_valid_message(self) -> None: + """Test a valid message.""" + # Prepare + valid_message = Message(metadata=self.valid_out_metadata, content=RecordSet()) + + # Assert + self.assertTrue(validate_out_message(valid_message, self.in_metadata)) + + def test_invalid_message_run_id(self) -> None: + """Test invalid messages.""" + # Prepare + msg = Message(metadata=self.valid_out_metadata, content=RecordSet()) + + # Execute + invalid_metadata_list: List[Metadata] = [] + attrs = list(vars(self.valid_out_metadata).keys()) + for attr in attrs: + if attr == "_ttl": # Skip configurable ttl + continue + # Make an invalid metadata + invalid_metadata = copy(self.valid_out_metadata) + value = 
getattr(invalid_metadata, attr) + if isinstance(value, int): + value = 999 + elif isinstance(value, str): + value = "999" + setattr(invalid_metadata, attr, value) + # Add to list + invalid_metadata_list.append(invalid_metadata) + + # Assert + for invalid_metadata in invalid_metadata_list: + msg._metadata = invalid_metadata # pylint: disable=protected-access + self.assertFalse(validate_out_message(msg, self.in_metadata)) diff --git a/src/py/flwr/client/message_handler/task_handler.py b/src/py/flwr/client/message_handler/task_handler.py index daac1be77138..7f515a30fe5a 100644 --- a/src/py/flwr/client/message_handler/task_handler.py +++ b/src/py/flwr/client/message_handler/task_handler.py @@ -18,8 +18,7 @@ from typing import Optional from flwr.proto.fleet_pb2 import PullTaskInsResponse # pylint: disable=E0611 -from flwr.proto.node_pb2 import Node # pylint: disable=E0611 -from flwr.proto.task_pb2 import Task, TaskIns, TaskRes # pylint: disable=E0611 +from flwr.proto.task_pb2 import TaskIns # pylint: disable=E0611 def validate_task_ins(task_ins: TaskIns) -> bool: @@ -41,40 +40,6 @@ def validate_task_ins(task_ins: TaskIns) -> bool: return True -def validate_task_res(task_res: TaskRes) -> bool: - """Validate a TaskRes before filling its fields in the `send()` function. - - Parameters - ---------- - task_res: TaskRes - The task response to be sent to the server. - - Returns - ------- - is_valid: bool - True if the `task_id`, `group_id`, and `run_id` fields in TaskRes - and the `producer`, `consumer`, and `ancestry` fields in its sub-message Task - are not initialized accidentally elsewhere, - False otherwise. 
- """ - # Retrieve initialized fields in TaskRes and Task - initialized_fields_in_task_res = {field.name for field, _ in task_res.ListFields()} - initialized_fields_in_task = {field.name for field, _ in task_res.task.ListFields()} - - # Check if certain fields are already initialized - if ( # pylint: disable-next=too-many-boolean-expressions - "task_id" in initialized_fields_in_task_res - or "group_id" in initialized_fields_in_task_res - or "run_id" in initialized_fields_in_task_res - or "producer" in initialized_fields_in_task - or "consumer" in initialized_fields_in_task - or "ancestry" in initialized_fields_in_task - ): - return False - - return True - - def get_task_ins( pull_task_ins_response: PullTaskInsResponse, ) -> Optional[TaskIns]: @@ -87,35 +52,3 @@ def get_task_ins( task_ins: TaskIns = pull_task_ins_response.task_ins_list[0] return task_ins - - -def configure_task_res( - task_res: TaskRes, ref_task_ins: TaskIns, producer: Node -) -> TaskRes: - """Set the metadata of a TaskRes. - - Fill `group_id` and `run_id` in TaskRes - and `producer`, `consumer`, and `ancestry` in Task in TaskRes. - - `producer` in Task in TaskRes will remain unchanged/unset. - - Note that protobuf API `protobuf.message.MergeFrom(other_msg)` - does NOT always overwrite fields that are set in `other_msg`. 
- Please refer to: - https://googleapis.dev/python/protobuf/latest/google/protobuf/message.html - """ - task_res = TaskRes( - task_id="", # This will be generated by the server - group_id=ref_task_ins.group_id, - run_id=ref_task_ins.run_id, - task=task_res.task, - ) - # pylint: disable-next=no-member - task_res.task.MergeFrom( - Task( - producer=producer, - consumer=ref_task_ins.task.producer, - ancestry=[ref_task_ins.task_id], - ) - ) - return task_res diff --git a/src/py/flwr/client/message_handler/task_handler_test.py b/src/py/flwr/client/message_handler/task_handler_test.py index 65ad23630ec2..fa204198e6b6 100644 --- a/src/py/flwr/client/message_handler/task_handler_test.py +++ b/src/py/flwr/client/message_handler/task_handler_test.py @@ -15,14 +15,11 @@ """Tests for module task_handler.""" -from flwr.client.message_handler.task_handler import ( - get_task_ins, - validate_task_ins, - validate_task_res, -) -from flwr.common import RecordSet, serde +from flwr.client.message_handler.task_handler import get_task_ins, validate_task_ins +from flwr.common import serde +from flwr.common.recordset import RecordSet from flwr.proto.fleet_pb2 import PullTaskInsResponse # pylint: disable=E0611 -from flwr.proto.task_pb2 import Task, TaskIns, TaskRes # pylint: disable=E0611 +from flwr.proto.task_pb2 import Task, TaskIns # pylint: disable=E0611 def test_validate_task_ins_no_task() -> None: @@ -46,38 +43,6 @@ def test_validate_task_ins_valid() -> None: assert validate_task_ins(task_ins) -def test_validate_task_res() -> None: - """Test validate_task_res.""" - task_res = TaskRes(task=Task()) - assert validate_task_res(task_res) - - task_res.task_id = "123" - assert not validate_task_res(task_res) - - task_res.Clear() - task_res.group_id = "123" - assert not validate_task_res(task_res) - - task_res.Clear() - task_res.run_id = 61016 - assert not validate_task_res(task_res) - - task_res.Clear() - # pylint: disable-next=no-member - task_res.task.producer.node_id = 0 - assert not 
validate_task_res(task_res) - - task_res.Clear() - # pylint: disable-next=no-member - task_res.task.consumer.node_id = 0 - assert not validate_task_res(task_res) - - task_res.Clear() - # pylint: disable-next=no-member - task_res.task.ancestry.append("123") - assert not validate_task_res(task_res) - - def test_get_task_ins_empty_response() -> None: """Test get_task_ins.""" res = PullTaskInsResponse(reconnect=None, task_ins_list=[]) diff --git a/src/py/flwr/client/mod/secure_aggregation/secaggplus_mod.py b/src/py/flwr/client/mod/secure_aggregation/secaggplus_mod.py index ce8aab523787..68cafb3e9825 100644 --- a/src/py/flwr/client/mod/secure_aggregation/secaggplus_mod.py +++ b/src/py/flwr/client/mod/secure_aggregation/secaggplus_mod.py @@ -24,7 +24,6 @@ from flwr.common import ( Context, Message, - Metadata, RecordSet, ndarray_to_bytes, parameters_to_ndarrays, @@ -210,17 +209,8 @@ def secaggplus_mod( ctxt.state.set_configs(RECORD_KEY_STATE, ConfigsRecord(state.to_dict())) # Return message - return Message( - metadata=Metadata( - run_id=0, - message_id="", - group_id="", - node_id=0, - ttl="", - message_type=MESSAGE_TYPE_FIT, - ), - content=RecordSet(configs={RECORD_KEY_CONFIGS: ConfigsRecord(res, False)}), - ) + content = RecordSet(configs={RECORD_KEY_CONFIGS: ConfigsRecord(res, False)}) + return msg.create_reply(content, ttl="") def check_stage(current_stage: str, configs: Dict[str, ConfigsRecordValues]) -> None: diff --git a/src/py/flwr/client/mod/secure_aggregation/secaggplus_mod_test.py b/src/py/flwr/client/mod/secure_aggregation/secaggplus_mod_test.py index 760d1a26984c..bbb7df1eb7f0 100644 --- a/src/py/flwr/client/mod/secure_aggregation/secaggplus_mod_test.py +++ b/src/py/flwr/client/mod/secure_aggregation/secaggplus_mod_test.py @@ -53,18 +53,8 @@ def get_test_handler( ) -> Callable[[Dict[str, ConfigsRecordValues]], Dict[str, ConfigsRecordValues]]: """.""" - def empty_ffn(_: Message, _2: Context) -> Message: - return Message( - metadata=Metadata( - run_id=0, - 
message_id="", - group_id="", - node_id=0, - ttl="", - message_type=MESSAGE_TYPE_FIT, - ), - content=RecordSet(), - ) + def empty_ffn(_msg: Message, _2: Context) -> Message: + return _msg.create_reply(RecordSet(), ttl="") app = make_ffn(empty_ffn, [secaggplus_mod]) @@ -73,8 +63,10 @@ def func(configs: Dict[str, ConfigsRecordValues]) -> Dict[str, ConfigsRecordValu metadata=Metadata( run_id=0, message_id="", + src_node_id=0, + dst_node_id=123, + reply_to_message="", group_id="", - node_id=0, ttl="", message_type=MESSAGE_TYPE_FIT, ), diff --git a/src/py/flwr/client/mod/utils_test.py b/src/py/flwr/client/mod/utils_test.py index bb3db6e1d6ce..4915f0710a0f 100644 --- a/src/py/flwr/client/mod/utils_test.py +++ b/src/py/flwr/client/mod/utils_test.py @@ -78,7 +78,14 @@ def _get_dummy_flower_message() -> Message: return Message( content=RecordSet(), metadata=Metadata( - run_id=0, message_id="", group_id="", node_id=0, ttl="", message_type="mock" + run_id=0, + message_id="", + group_id="", + src_node_id=0, + dst_node_id=0, + reply_to_message="", + ttl="", + message_type="mock", ), ) diff --git a/src/py/flwr/client/rest_client/connection.py b/src/py/flwr/client/rest_client/connection.py index cee1529ac285..c637475551ed 100644 --- a/src/py/flwr/client/rest_client/connection.py +++ b/src/py/flwr/client/rest_client/connection.py @@ -17,18 +17,16 @@ import sys from contextlib import contextmanager +from copy import copy from logging import ERROR, INFO, WARN from typing import Callable, Dict, Iterator, Optional, Tuple, Union, cast -from flwr.client.message_handler.task_handler import ( - configure_task_res, - get_task_ins, - validate_task_ins, - validate_task_res, -) -from flwr.common import GRPC_MAX_MESSAGE_LENGTH, Message +from flwr.client.message_handler.message_handler import validate_out_message +from flwr.client.message_handler.task_handler import get_task_ins, validate_task_ins +from flwr.common import GRPC_MAX_MESSAGE_LENGTH from flwr.common.constant import 
MISSING_EXTRA_REST from flwr.common.logger import log +from flwr.common.message import Message, Metadata from flwr.common.serde import message_from_taskins, message_to_taskres from flwr.proto.fleet_pb2 import ( # pylint: disable=E0611 CreateNodeRequest, @@ -49,7 +47,7 @@ KEY_NODE = "node" -KEY_TASK_INS = "current_task_ins" +KEY_METADATA = "in_message_metadata" PATH_CREATE_NODE: str = "api/v0/fleet/create-node" @@ -121,8 +119,8 @@ def http_request_response( "must be provided as a string path to the client.", ) - # Necessary state to link TaskRes to TaskIns - state: Dict[str, Optional[TaskIns]] = {KEY_TASK_INS: None} + # Necessary state to validate messages to be sent + state: Dict[str, Optional[Metadata]] = {KEY_METADATA: None} # Enable create_node and delete_node to store node node_store: Dict[str, Optional[Node]] = {KEY_NODE: None} @@ -257,16 +255,18 @@ def receive() -> Optional[Message]: task_ins: Optional[TaskIns] = get_task_ins(pull_task_ins_response_proto) # Discard the current TaskIns if not valid - if task_ins is not None and not validate_task_ins(task_ins): + if task_ins is not None and not ( + task_ins.task.consumer.node_id == node.node_id + and validate_task_ins(task_ins) + ): task_ins = None - # Remember `task_ins` until `task_res` is available - state[KEY_TASK_INS] = task_ins - # Return the Message if available message = None + state[KEY_METADATA] = None if task_ins is not None: message = message_from_taskins(task_ins) + state[KEY_METADATA] = copy(message.metadata) log(INFO, "[Node] POST /%s: success", PATH_PULL_TASK_INS) return message @@ -276,25 +276,21 @@ def send(message: Message) -> None: if node_store[KEY_NODE] is None: log(ERROR, "Node instance missing") return - node: Node = cast(Node, node_store[KEY_NODE]) - if state[KEY_TASK_INS] is None: - log(ERROR, "No current TaskIns") + # Get incoming message + in_metadata = state[KEY_METADATA] + if in_metadata is None: + log(ERROR, "No current message") return - task_ins: TaskIns = cast(TaskIns, 
state[KEY_TASK_INS]) + # Validate out message + if not validate_out_message(message, in_metadata): + log(ERROR, "Invalid out message") + return # Construct TaskRes task_res = message_to_taskres(message) - # Check if fields to be set are not initialized - if not validate_task_res(task_res): - state[KEY_TASK_INS] = None - log(ERROR, "TaskRes has been initialized accidentally") - - # Configure TaskRes - task_res = configure_task_res(task_res, task_ins, node) - # Serialize ProtoBuf to bytes push_task_res_request_proto = PushTaskResRequest(task_res_list=[task_res]) push_task_res_request_bytes: bytes = ( @@ -313,7 +309,7 @@ def send(message: Message) -> None: timeout=None, ) - state[KEY_TASK_INS] = None + state[KEY_METADATA] = None # Check status code and headers if res.status_code != 200: diff --git a/src/py/flwr/common/message.py b/src/py/flwr/common/message.py index 9258edccbcd5..7735ad85323b 100644 --- a/src/py/flwr/common/message.py +++ b/src/py/flwr/common/message.py @@ -15,13 +15,15 @@ """Message.""" +from __future__ import annotations + from dataclasses import dataclass from .recordset import RecordSet @dataclass -class Metadata: +class Metadata: # pylint: disable=too-many-instance-attributes """A dataclass holding metadata associated with the current message. Parameters @@ -30,11 +32,15 @@ class Metadata: An identifier for the current run. message_id : str An identifier for the current message. + src_node_id : int + An identifier for the node sending this message. + dst_node_id : int + An identifier for the node receiving this message. + reply_to_message : str + An identifier for the message this message replies to. group_id : str - An identifier for grouping messages. In some settings + An identifier for grouping messages. In some settings, this is used as the FL round. - node_id : int - An identifier for the node running a message. ttl : str Time-to-live for this message. message_type : str @@ -42,12 +48,94 @@ class Metadata: the receiving end. 
""" - run_id: int - message_id: str - group_id: str - node_id: int - ttl: str - message_type: str + _run_id: int + _message_id: str + _src_node_id: int + _dst_node_id: int + _reply_to_message: str + _group_id: str + _ttl: str + _message_type: str + + def __init__( # pylint: disable=too-many-arguments + self, + run_id: int, + message_id: str, + src_node_id: int, + dst_node_id: int, + reply_to_message: str, + group_id: str, + ttl: str, + message_type: str, + ) -> None: + self._run_id = run_id + self._message_id = message_id + self._src_node_id = src_node_id + self._dst_node_id = dst_node_id + self._reply_to_message = reply_to_message + self._group_id = group_id + self._ttl = ttl + self._message_type = message_type + + @property + def run_id(self) -> int: + """An identifier for the current run.""" + return self._run_id + + @property + def message_id(self) -> str: + """An identifier for the current message.""" + return self._message_id + + @property + def src_node_id(self) -> int: + """An identifier for the node sending this message.""" + return self._src_node_id + + @property + def reply_to_message(self) -> str: + """An identifier for the message this message replies to.""" + return self._reply_to_message + + @property + def dst_node_id(self) -> int: + """An identifier for the node receiving this message.""" + return self._dst_node_id + + @dst_node_id.setter + def dst_node_id(self, value: int) -> None: + """Set dst_node_id.""" + self._dst_node_id = value + + @property + def group_id(self) -> str: + """An identifier for grouping messages.""" + return self._group_id + + @group_id.setter + def group_id(self, value: str) -> None: + """Set group_id.""" + self._group_id = value + + @property + def ttl(self) -> str: + """Time-to-live for this message.""" + return self._ttl + + @ttl.setter + def ttl(self, value: str) -> None: + """Set ttl.""" + self._ttl = value + + @property + def message_type(self) -> str: + """A string that encodes the action to be executed on the 
receiving end.""" + return self._message_type + + @message_type.setter + def message_type(self, value: str) -> None: + """Set message_type.""" + self._message_type = value @dataclass @@ -63,5 +151,57 @@ class Message: logic to a client, or vice-versa) or that will be sent to it. """ - metadata: Metadata - content: RecordSet + _metadata: Metadata + _content: RecordSet + + def __init__(self, metadata: Metadata, content: RecordSet) -> None: + self._metadata = metadata + self._content = content + + @property + def metadata(self) -> Metadata: + """A dataclass including information about the message to be executed.""" + return self._metadata + + @property + def content(self) -> RecordSet: + """The content of this message.""" + return self._content + + @content.setter + def content(self, value: RecordSet) -> None: + """Set content.""" + self._content = value + + def create_reply(self, content: RecordSet, ttl: str) -> Message: + """Create a reply to this message with specified content and TTL. + + The method generates a new `Message` as a reply to this message. + It inherits 'run_id', 'src_node_id', 'dst_node_id', and 'message_type' from + this message and sets 'reply_to_message' to the ID of this message. + + Parameters + ---------- + content : RecordSet + The content for the reply message. + ttl : str + Time-to-live for this message. + + Returns + ------- + Message + A new `Message` instance representing the reply. 
+ """ + return Message( + metadata=Metadata( + run_id=self.metadata.run_id, + message_id="", + src_node_id=self.metadata.dst_node_id, + dst_node_id=self.metadata.src_node_id, + reply_to_message=self.metadata.message_id, + group_id=self.metadata.group_id, + ttl=ttl, + message_type=self.metadata.message_type, + ), + content=content, + ) diff --git a/src/py/flwr/common/serde.py b/src/py/flwr/common/serde.py index 2808cb88fb5c..530597d89807 100644 --- a/src/py/flwr/common/serde.py +++ b/src/py/flwr/common/serde.py @@ -20,6 +20,7 @@ from google.protobuf.message import Message as GrpcMessage # pylint: disable=E0611 +from flwr.proto.node_pb2 import Node from flwr.proto.recordset_pb2 import Array as ProtoArray from flwr.proto.recordset_pb2 import BoolList, BytesList from flwr.proto.recordset_pb2 import ConfigsRecord as ProtoConfigsRecord @@ -547,10 +548,16 @@ def recordset_from_proto(recordset_proto: ProtoRecordSet) -> RecordSet: def message_to_taskins(message: Message) -> TaskIns: """Create a TaskIns from the Message.""" + md = message.metadata return TaskIns( + group_id=md.group_id, + run_id=md.run_id, task=Task( - ttl=message.metadata.ttl, - task_type=message.metadata.message_type, + producer=Node(node_id=0, anonymous=True), # Assume driver node + consumer=Node(node_id=md.dst_node_id, anonymous=False), + ttl=md.ttl, + ancestry=[md.reply_to_message] if md.reply_to_message != "" else [], + task_type=md.message_type, recordset=recordset_to_proto(message.content), ), ) @@ -562,8 +569,10 @@ def message_from_taskins(taskins: TaskIns) -> Message: metadata = Metadata( run_id=taskins.run_id, message_id=taskins.task_id, + src_node_id=taskins.task.producer.node_id, + dst_node_id=taskins.task.consumer.node_id, + reply_to_message=taskins.task.ancestry[0] if taskins.task.ancestry else "", group_id=taskins.group_id, - node_id=taskins.task.consumer.node_id, ttl=taskins.task.ttl, message_type=taskins.task.task_type, ) @@ -577,10 +586,17 @@ def message_from_taskins(taskins: TaskIns) -> 
Message: def message_to_taskres(message: Message) -> TaskRes: """Create a TaskRes from the Message.""" + md = message.metadata return TaskRes( + task_id="", # This will be generated by the server + group_id=md.group_id, + run_id=md.run_id, task=Task( - ttl=message.metadata.ttl, - task_type=message.metadata.message_type, + producer=Node(node_id=md.src_node_id, anonymous=False), + consumer=Node(node_id=0, anonymous=True), # Assume driver node + ttl=md.ttl, + ancestry=[md.reply_to_message] if md.reply_to_message != "" else [], + task_type=md.message_type, recordset=recordset_to_proto(message.content), ), ) @@ -592,8 +608,10 @@ def message_from_taskres(taskres: TaskRes) -> Message: metadata = Metadata( run_id=taskres.run_id, message_id=taskres.task_id, + src_node_id=taskres.task.producer.node_id, + dst_node_id=taskres.task.consumer.node_id, + reply_to_message=taskres.task.ancestry[0] if taskres.task.ancestry else "", group_id=taskres.group_id, - node_id=taskres.task.consumer.node_id, ttl=taskres.task.ttl, message_type=taskres.task.task_type, ) diff --git a/src/py/flwr/common/serde_test.py b/src/py/flwr/common/serde_test.py index 44085e8d9ab8..9a38d7e5ecee 100644 --- a/src/py/flwr/common/serde_test.py +++ b/src/py/flwr/common/serde_test.py @@ -219,7 +219,9 @@ def metadata(self) -> Metadata: run_id=self.rng.randint(0, 1 << 30), message_id=self.get_str(64), group_id=self.get_str(30), - node_id=self.rng.randint(0, 1 << 63), + src_node_id=self.rng.randint(0, 1 << 63), + dst_node_id=self.rng.randint(0, 1 << 63), + reply_to_message=self.get_str(64), ttl=self.get_str(10), message_type=self.get_str(10), ) @@ -305,24 +307,16 @@ def test_message_to_and_from_taskins() -> None: # Prepare maker = RecordMaker(state=1) metadata = maker.metadata() + # pylint: disable-next=protected-access + metadata._src_node_id = 0 # Assume driver node original = Message( - metadata=Metadata( - run_id=0, - message_id="", - group_id="", - node_id=metadata.node_id, - ttl=metadata.ttl, - 
message_type=metadata.message_type, - ), + metadata=metadata, content=maker.recordset(1, 1, 1), ) # Execute taskins = message_to_taskins(original) - taskins.run_id = metadata.run_id taskins.task_id = metadata.message_id - taskins.group_id = metadata.group_id - taskins.task.consumer.node_id = metadata.node_id deserialized = message_from_taskins(taskins) # Assert @@ -335,24 +329,15 @@ def test_message_to_and_from_taskres() -> None: # Prepare maker = RecordMaker(state=2) metadata = maker.metadata() + metadata.dst_node_id = 0 # Assume driver node original = Message( - metadata=Metadata( - run_id=0, - message_id="", - group_id="", - node_id=metadata.node_id, - ttl=metadata.ttl, - message_type=metadata.message_type, - ), + metadata=metadata, content=maker.recordset(1, 1, 1), ) # Execute taskres = message_to_taskres(original) - taskres.run_id = metadata.run_id taskres.task_id = metadata.message_id - taskres.group_id = metadata.group_id - taskres.task.consumer.node_id = metadata.node_id deserialized = message_from_taskres(taskres) # Assert diff --git a/src/py/flwr/simulation/ray_transport/ray_client_proxy.py b/src/py/flwr/simulation/ray_transport/ray_client_proxy.py index 10fe0f41dd28..405e0920c5a4 100644 --- a/src/py/flwr/simulation/ray_transport/ray_client_proxy.py +++ b/src/py/flwr/simulation/ray_transport/ray_client_proxy.py @@ -106,7 +106,9 @@ def _wrap_recordset_in_message( run_id=0, message_id="", group_id="", - node_id=int(self.cid), + src_node_id=0, + dst_node_id=int(self.cid), + reply_to_message="", ttl=str(timeout) if timeout else "", message_type=message_type, ), diff --git a/src/py/flwr/simulation/ray_transport/ray_client_proxy_test.py b/src/py/flwr/simulation/ray_transport/ray_client_proxy_test.py index 5910607e2012..7cc6de0c7315 100644 --- a/src/py/flwr/simulation/ray_transport/ray_client_proxy_test.py +++ b/src/py/flwr/simulation/ray_transport/ray_client_proxy_test.py @@ -190,8 +190,10 @@ def _load_app() -> ClientApp: run_id=0, message_id="", group_id="", + 
src_node_id=0, + dst_node_id=int(cid), + reply_to_message="", ttl="", - node_id=int(cid), message_type=MESSAGE_TYPE_GET_PROPERTIES, ), ) From 03632ffd5040763f4399024c0621d84d98a9e1be Mon Sep 17 00:00:00 2001 From: "Weblate (bot)" Date: Mon, 19 Feb 2024 12:49:59 +0100 Subject: [PATCH 026/102] Translated using Weblate (Chinese (Simplified)) (#2978) --- .../zh_Hans/LC_MESSAGES/framework-docs.po | 59 ++++++++++--------- 1 file changed, 31 insertions(+), 28 deletions(-) diff --git a/doc/locales/zh_Hans/LC_MESSAGES/framework-docs.po b/doc/locales/zh_Hans/LC_MESSAGES/framework-docs.po index 720de8578261..ab1c8dc39e64 100644 --- a/doc/locales/zh_Hans/LC_MESSAGES/framework-docs.po +++ b/doc/locales/zh_Hans/LC_MESSAGES/framework-docs.po @@ -8,15 +8,16 @@ msgstr "" "Project-Id-Version: Flower main\n" "Report-Msgid-Bugs-To: \n" "POT-Creation-Date: 2024-02-13 11:23+0100\n" -"PO-Revision-Date: 2024-02-10 11:56+0000\n" +"PO-Revision-Date: 2024-02-19 11:37+0000\n" "Last-Translator: Yan Gao \n" +"Language-Team: Chinese (Simplified) \n" "Language: zh_Hans\n" -"Language-Team: Chinese (Simplified) \n" -"Plural-Forms: nplurals=1; plural=0;\n" "MIME-Version: 1.0\n" "Content-Type: text/plain; charset=utf-8\n" "Content-Transfer-Encoding: 8bit\n" +"Plural-Forms: nplurals=1; plural=0;\n" +"X-Generator: Weblate 5.4\n" "Generated-By: Babel 2.13.1\n" #: ../../source/contributor-explanation-architecture.rst:2 @@ -55,7 +56,7 @@ msgstr "具有虚拟客户端引擎和边缘客户端引擎的`Flower `_." msgstr "" -"如果您对 Flower Baselines 还不熟悉,也许应该看看我们的 \"基线贡献指南 `_\"。" +"如果您对 Flower Baselines 还不熟悉,也许可以看看我们的 `Baselines贡献指南 " +"`_。" #: ../../source/contributor-ref-good-first-contributions.rst:27 msgid "" @@ -1239,30 +1240,33 @@ msgid "" " and that has no assignes, feel free to assign it to yourself and start " "working on it!" msgstr "" -"然后,您应该查看开放的 `issues " -"`_" -" 基线请求。如果您发现了自己想做的基线,而它还没有被分配,请随时把它分配给自己,然后开始工作!" +"然后查看开放的 `issues `_ baseline请求。如" +"果您发现了自己想做的baseline,而它还没有被分配,请随时把它分配给自己,然后开" +"始工作!" 
#: ../../source/contributor-ref-good-first-contributions.rst:31 msgid "" "Otherwise, if you don't find a baseline you'd like to work on, be sure to" " open a new issue with the baseline request template!" -msgstr "否则,如果您没有找到想要处理的基线,请务必使用基线请求模板打开一个新问题!" +msgstr "如果您没有找到想要做的baseline,请务必使用baseline请求模板打开一个新问题(" +"GitHub issue)!" #: ../../source/contributor-ref-good-first-contributions.rst:34 msgid "Request for examples" -msgstr "要求提供范例" +msgstr "示例请求" #: ../../source/contributor-ref-good-first-contributions.rst:36 msgid "" "We wish we had more time to write usage examples because we believe they " "help users to get started with building what they want to build. Here are" " a few ideas where we'd be happy to accept a PR:" -msgstr "我们希望有更多的时间来撰写使用示例,因为我们相信这些示例可以帮助用户开始构建他们想要构建的东西。以下是我们乐意接受 PR 的几个想法:" +msgstr "我们希望有更多的时间来撰写使用示例,因为我们相信这些示例可以帮助用户开始构建" +"他们想要的东西。以下是我们乐意接受 PR 的几个想法:" #: ../../source/contributor-ref-good-first-contributions.rst:40 msgid "Llama 2 fine-tuning, with Hugging Face Transformers and PyTorch" -msgstr "微调 \"拉玛 2\",使用 \"抱脸变形金刚 \"和 PyTorch" +msgstr "微调 Llama 2,使用 Hugging Face Transformers 和 PyTorch" #: ../../source/contributor-ref-good-first-contributions.rst:41 msgid "XGBoost" @@ -1270,7 +1274,7 @@ msgstr "XGBoost" #: ../../source/contributor-ref-good-first-contributions.rst:42 msgid "Android ONNX on-device training" -msgstr "安卓 ONNX 设备上培训" +msgstr "安卓 ONNX 设备上训练" #: ../../source/contributor-ref-secure-aggregation-protocols.rst:2 msgid "Secure Aggregation Protocols" @@ -1326,16 +1330,16 @@ msgid "" msgstr "本指南适用于想参与 Flower,但不习惯为 GitHub 项目贡献的人。" #: ../../source/contributor-tutorial-contribute-on-github.rst:6 -#, fuzzy msgid "" "If you're familiar with how contributing on GitHub works, you can " "directly checkout our `getting started guide for contributors " "`_." 
msgstr "" -"如果您熟悉如何在 GitHub 上贡献,可以直接查看我们的 \"贡献者入门指南\" `_ 和 \"优秀的首次贡献示例\" " -"`_。" +"如果您熟悉如何在 GitHub 上贡献,可以直接查看我们的 \"贡献者入门指南\" " +"`_ 和 " +"\"优秀的首次贡献示例\" `_。" #: ../../source/contributor-tutorial-contribute-on-github.rst:11 msgid "Setting up the repository" @@ -1997,7 +2001,7 @@ msgid "" "ref-good-first-contributions.html>`_, where you should particularly look " "into the :code:`baselines` contributions." msgstr "" -"好的第一批贡献 `_,在这里你应该特别看看 :code:`baselines` 的贡献。" #: ../../source/contributor-tutorial-contribute-on-github.rst:350 @@ -20489,4 +20493,3 @@ msgstr "" #~ "`_\" " #~ "的类来配置,因此行为方式也完全相同。除此之外,由 :code:`VirtualClientEngine` " #~ "管理的客户端还包括:" - From f4b6e2242a9cd0424c07a0c4b4505c067c955585 Mon Sep 17 00:00:00 2001 From: Robert Steiner Date: Mon, 19 Feb 2024 18:24:40 +0100 Subject: [PATCH 027/102] Allow to run multiple baseline tests for collobrators (#2966) --- .github/workflows/baselines.yml | 166 ++++++++++++-------------------- 1 file changed, 60 insertions(+), 106 deletions(-) diff --git a/.github/workflows/baselines.yml b/.github/workflows/baselines.yml index bfb26053836d..c4485fe72d10 100644 --- a/.github/workflows/baselines.yml +++ b/.github/workflows/baselines.yml @@ -1,14 +1,5 @@ name: Baselines -# The aim of this workflow is to test only the changed (or added) baseline. -# Here is the rough idea of how it works (more details are presented later in the comments): -# 1. Checks for the changes between the current branch and the main - in case of PR - -# or between the HEAD and HEAD~1 (main last commit and the previous one) - in case of -# a push to main. -# 2. Fails the test if there are changes to more than one baseline. Passes the test -# (skips the rests) if there are no changes to any baselines. Follows the test if only -# one baseline is added or modified. -# 3. Sets up the env specified for the baseline. -# 4. Runs the tests. 
+ on: push: branches: @@ -24,112 +15,75 @@ concurrency: env: FLWR_TELEMETRY_ENABLED: 0 -defaults: - run: - working-directory: baselines - jobs: - test_baselines: - name: Test + changes: runs-on: ubuntu-22.04 + permissions: + pull-requests: read + outputs: + baselines: ${{ steps.filter.outputs.changes }} steps: - uses: actions/checkout@v4 - # The depth two of the checkout is needed in case of merging to the main - # because we compare the HEAD (current version) with HEAD~1 (version before - # the PR was merged) - with: - fetch-depth: 2 - - name: Fetch main branch - run: | - # The main branch is needed in case of the PR to make a comparison (by - # default the workflow takes as little information as possible - it does not - # have the history - if [ ${{ github.event_name }} == "pull_request" ] - then - git fetch origin main:main - fi - - name: Find changed/new baselines - id: find_changed_baselines_dirs + + - shell: bash run: | - if [ ${{ github.event_name }} == "push" ] - then - # Push event triggered when merging to main - change_references="HEAD..HEAD~1" - else - # Pull request event triggered for any commit to a pull request - change_references="main..HEAD" - fi - dirs=$(git diff --dirstat=files,0 ${change_references} . | awk '{print $2}' | grep -E '^baselines/[^/]*/$' | \ - grep -v \ - -e '^baselines/dev' \ - -e '^baselines/baseline_template' \ - -e '^baselines/flwr_baselines' \ - -e '^baselines/doc' \ - | sed 's/^baselines\///') - # git diff --dirstat=files,0 ${change_references} . - checks the differences - # and a file is counted as changed if more than 0 lines were changed - # it returns the results in the format x.y% path/to/dir/ - # awk '{print $2}' - takes only the directories (skips the percentages) - # grep -E '^baselines/[^/]*/$' - takes only the paths that start with - # baseline (and have at least one subdirectory) - # grep -v -e ... 
- excludes the `baseline_template`, `dev`, `flwr_baselines` - # sed 's/^baselines\///' - narrows down the path to baseline/ - echo "Detected changed directories: ${dirs}" - # Save changed dirs to output of this step - EOF=$(dd if=/dev/urandom bs=15 count=1 status=none | base64) - echo "dirs<> "$GITHUB_OUTPUT" - for dir in $dirs - do - echo "$dir" >> "$GITHUB_OUTPUT" - done - echo "EOF" >> "$GITHUB_OUTPUT" - - name: Validate changed/new baselines - id: validate_changed_baselines_dirs + # create a list of all directories in baselines + { + echo 'FILTER_PATHS<> "$GITHUB_ENV" + + - uses: dorny/paths-filter@v3 + id: filter + with: + filters: ${{ env.FILTER_PATHS }} + + - if: ${{ github.event.pull_request.head.repo.fork }} run: | - dirs="${{ steps.find_changed_baselines_dirs.outputs.dirs }}" - dirs_array=() - if [[ -n $dirs ]]; then - while IFS= read -r line; do - dirs_array+=("$line") - done <<< "$dirs" - fi - length=${#dirs_array[@]} - echo "The number of changed baselines is $length" - - if [ $length -gt 1 ]; then - echo "The changes should only apply to a single baseline" - exit 1 + CHANGES=$(echo "${{ toJson(steps.filter.outputs.changes) }}" | jq '. | length') + if [ "$CHANGES" -gt 1 ]; then + echo "::error ::The changes should only apply to a single baseline." + exit 1 fi - - if [ $length -eq 0 ]; then - echo "The baselines were not changed - skipping the remaining steps." 
- echo "baseline_changed=false" >> "$GITHUB_OUTPUT" - exit 0 - fi - - echo "changed_dir=${dirs[0]}" >> "$GITHUB_OUTPUT" - echo "baseline_changed=true" >> "$GITHUB_OUTPUT" + + test: + runs-on: ubuntu-22.04 + needs: changes + if: ${{ needs.changes.outputs.baselines != '' && toJson(fromJson(needs.changes.outputs.baselines)) != '[]' }} + strategy: + matrix: + baseline: ${{ fromJSON(needs.changes.outputs.baselines) }} + steps: + - uses: actions/checkout@v4 + - name: Bootstrap - if: steps.validate_changed_baselines_dirs.outputs.baseline_changed == 'true' uses: ./.github/actions/bootstrap with: python-version: '3.10' + - name: Install dependencies - if: steps.validate_changed_baselines_dirs.outputs.baseline_changed == 'true' - run: | - changed_dir="${{ steps.validate_changed_baselines_dirs.outputs.changed_dir }}" - cd "${changed_dir}" - python -m poetry install - - name: Test - if: steps.validate_changed_baselines_dirs.outputs.baseline_changed == 'true' - run: | - dir="${{ steps.validate_changed_baselines_dirs.outputs.changed_dir }}" - echo "Testing ${dir}" - ./dev/test-baseline.sh $dir - - name: Test Structure - if: steps.validate_changed_baselines_dirs.outputs.baseline_changed == 'true' - run: | - dir="${{ steps.validate_changed_baselines_dirs.outputs.changed_dir }}" - echo "Testing ${dir}" - ./dev/test-baseline-structure.sh $dir + working-directory: baselines/${{ matrix.baseline }} + run: python -m poetry install + + - name: Testing ${{ matrix.baseline }} + working-directory: baselines + run: ./dev/test-baseline.sh ${{ matrix.baseline }} + - name: Test Structure of ${{ matrix.baseline }} + working-directory: baselines + run: ./dev/test-baseline-structure.sh ${{ matrix.baseline }} From 78cf9721f9170f7743cf92c9182561ba58e04e37 Mon Sep 17 00:00:00 2001 From: Javier Date: Mon, 19 Feb 2024 18:44:54 +0100 Subject: [PATCH 028/102] Merges dependencies (#2968) --- baselines/fedavgm/pyproject.toml | 5 ++--- baselines/fjord/requirements.txt | 3 +-- 
examples/whisper-federated-finetuning/requirements.txt | 3 +-- 3 files changed, 4 insertions(+), 7 deletions(-) diff --git a/baselines/fedavgm/pyproject.toml b/baselines/fedavgm/pyproject.toml index 298deafd8932..cfd55a5b1fba 100644 --- a/baselines/fedavgm/pyproject.toml +++ b/baselines/fedavgm/pyproject.toml @@ -38,11 +38,10 @@ classifiers = [ [tool.poetry.dependencies] python = ">=3.9, <3.12.0" # changed! original baseline template uses >= 3.8.15 -flwr = "1.5.0" -ray = "2.6.3" +flwr = { extras = ["simulation"], version = "1.5.0" } hydra-core = "1.3.2" # don't change this cython = "^3.0.0" -tensorflow = "2.10" +tensorflow = "2.11.1" numpy = "1.25.2" matplotlib = "^3.7.2" diff --git a/baselines/fjord/requirements.txt b/baselines/fjord/requirements.txt index 35583b1a45c4..28f700c27537 100644 --- a/baselines/fjord/requirements.txt +++ b/baselines/fjord/requirements.txt @@ -1,8 +1,7 @@ coloredlogs==15.0.1 hydra-core==1.3.2 -flwr==1.5.0 +flwr[simulation]==1.5.0 omegaconf==2.3.0 -ray==2.6.3 torch==2.0.1 torchvision==0.15.2 tqdm==4.65.0 diff --git a/examples/whisper-federated-finetuning/requirements.txt b/examples/whisper-federated-finetuning/requirements.txt index eb4a5d7eb47b..f16b3d6993ce 100644 --- a/examples/whisper-federated-finetuning/requirements.txt +++ b/examples/whisper-federated-finetuning/requirements.txt @@ -3,5 +3,4 @@ tokenizers==0.13.3 datasets==2.14.6 soundfile==0.12.1 librosa==0.10.1 -flwr==1.5.0 -ray==2.6.3 \ No newline at end of file +flwr[simulation]>=1.0, <2.0 \ No newline at end of file From 2b14a81f97bb7222b8c39cda1e31ba0e6e93c4ed Mon Sep 17 00:00:00 2001 From: Heng Pan <134433891+panh99@users.noreply.github.com> Date: Mon, 19 Feb 2024 18:49:14 +0000 Subject: [PATCH 029/102] Migrate `Driver` to use `Message` (#2941) --- src/py/flwr/server/driver/driver.py | 182 ++++++++++++++++++++--- src/py/flwr/server/driver/driver_test.py | 88 ++++++++--- 2 files changed, 234 insertions(+), 36 deletions(-) diff --git a/src/py/flwr/server/driver/driver.py 
b/src/py/flwr/server/driver/driver.py index 8b8b637e47cb..932ec617d7b3 100644 --- a/src/py/flwr/server/driver/driver.py +++ b/src/py/flwr/server/driver/driver.py @@ -15,8 +15,12 @@ """Flower driver service client.""" +import time from typing import Iterable, List, Optional, Tuple +from flwr.common.message import Message, Metadata +from flwr.common.recordset import RecordSet +from flwr.common.serde import message_from_taskres, message_to_taskins from flwr.proto.driver_pb2 import ( # pylint: disable=E0611 CreateRunRequest, GetNodesRequest, @@ -24,7 +28,7 @@ PushTaskInsRequest, ) from flwr.proto.node_pb2 import Node # pylint: disable=E0611 -from flwr.proto.task_pb2 import TaskIns, TaskRes # pylint: disable=E0611 +from flwr.proto.task_pb2 import TaskIns # pylint: disable=E0611 from .grpc_driver import DEFAULT_SERVER_ADDRESS_DRIVER, GrpcDriver @@ -41,7 +45,6 @@ class Driver: Tuple containing root certificate, server certificate, and private key to start a secure SSL-enabled server. The tuple is expected to have three bytes elements in the following order: - * CA certificate. * server certificate. * server private key. @@ -69,44 +72,185 @@ def _get_grpc_driver_and_run_id(self) -> Tuple[GrpcDriver, int]: self.grpc_driver.connect() res = self.grpc_driver.create_run(CreateRunRequest()) self.run_id = res.run_id - return self.grpc_driver, self.run_id - def get_nodes(self) -> List[Node]: + def _check_message(self, message: Message) -> None: + # Check if the message is valid + if not ( + message.metadata.run_id == self.run_id + and message.metadata.src_node_id == self.node.node_id + and message.metadata.message_id == "" + and message.metadata.reply_to_message == "" + ): + raise ValueError(f"Invalid message: {message}") + + def create_message( # pylint: disable=too-many-arguments + self, + content: RecordSet, + message_type: str, + dst_node_id: int, + group_id: str, + ttl: str, + ) -> Message: + """Create a new message with specified parameters. 
+ + This method constructs a new `Message` with given content and metadata. + The `run_id` and `src_node_id` will be set automatically. + + Parameters + ---------- + content : RecordSet + The content for the new message. This holds records that are to be sent + to the destination node. + message_type : str + The type of the message, defining the action to be executed on + the receiving end. + dst_node_id : int + The ID of the destination node to which the message is being sent. + group_id : str + The ID of the group to which this message is associated. In some settings, + this is used as the FL round. + ttl : str + Time-to-live for the round trip of this message, i.e., the time from sending + this message to receiving a reply. It specifies the duration for which the + message and its potential reply are considered valid. + + Returns + ------- + message : Message + A new `Message` instance with the specified content and metadata. + """ + _, run_id = self._get_grpc_driver_and_run_id() + metadata = Metadata( + run_id=run_id, + message_id="", # Will be set by the server + src_node_id=self.node.node_id, + dst_node_id=dst_node_id, + reply_to_message="", + group_id=group_id, + ttl=ttl, + message_type=message_type, + ) + return Message(metadata=metadata, content=content) + + def get_node_ids(self) -> List[int]: """Get node IDs.""" grpc_driver, run_id = self._get_grpc_driver_and_run_id() - # Call GrpcDriver method res = grpc_driver.get_nodes(GetNodesRequest(run_id=run_id)) - return list(res.nodes) + return [node.node_id for node in res.nodes] - def push_task_ins(self, task_ins_list: List[TaskIns]) -> List[str]: - """Schedule tasks.""" - grpc_driver, run_id = self._get_grpc_driver_and_run_id() + def push_messages(self, messages: Iterable[Message]) -> Iterable[str]: + """Push messages to specified node IDs. 
- # Set run_id - for task_ins in task_ins_list: - task_ins.run_id = run_id + This method takes an iterable of messages and sends each message + to the node specified in `dst_node_id`. + Parameters + ---------- + messages : Iterable[Message] + An iterable of messages to be sent. + + Returns + ------- + message_ids : Iterable[str] + An iterable of IDs for the messages that were sent, which can be used + to pull replies. + """ + grpc_driver, _ = self._get_grpc_driver_and_run_id() + # Construct TaskIns + task_ins_list: List[TaskIns] = [] + for msg in messages: + # Check message + self._check_message(msg) + # Convert Message to TaskIns + taskins = message_to_taskins(msg) + # Add to list + task_ins_list.append(taskins) # Call GrpcDriver method res = grpc_driver.push_task_ins(PushTaskInsRequest(task_ins_list=task_ins_list)) return list(res.task_ids) - def pull_task_res(self, task_ids: Iterable[str]) -> List[TaskRes]: - """Get task results.""" - grpc_driver, _ = self._get_grpc_driver_and_run_id() + def pull_messages(self, message_ids: Iterable[str]) -> Iterable[Message]: + """Pull messages based on message IDs. - # Call GrpcDriver method + This method is used to collect messages from the SuperLink + that correspond to a set of given message IDs. + + Parameters + ---------- + message_ids : Iterable[str] + An iterable of message IDs for which reply messages are to be retrieved. + + Returns + ------- + messages : Iterable[Message] + An iterable of messages received. 
+ """ + grpc_driver, _ = self._get_grpc_driver_and_run_id() + # Pull TaskRes res = grpc_driver.pull_task_res( - PullTaskResRequest(node=self.node, task_ids=task_ids) + PullTaskResRequest(node=self.node, task_ids=message_ids) ) - return list(res.task_res_list) + # Convert TaskRes to Message + msgs = [message_from_taskres(taskres) for taskres in res.task_res_list] + return msgs + + def send_and_receive( + self, + messages: Iterable[Message], + *, + timeout: Optional[float] = None, + ) -> Iterable[Message]: + """Push messages to specified node IDs and pull the reply messages. + + This method sends a list of messages to their destination node IDs and then + waits for the replies. It continues to pull replies until either all + replies are received or the specified timeout duration is exceeded. + + Parameters + ---------- + messages : Iterable[Message] + An iterable of messages to be sent. + timeout : Optional[float] (default: None) + The timeout duration in seconds. If specified, the method will wait for + replies for this duration. If `None`, there is no time limit and the method + will wait until replies for all messages are received. + + Returns + ------- + replies : Iterable[Message] + An iterable of reply messages received from the SuperLink. + + Notes + ----- + This method uses `push_messages` to send the messages and `pull_messages` + to collect the replies. If `timeout` is set, the method may not return + replies for all sent messages. A message remains valid until its TTL, + which is not affected by `timeout`. 
+ """ + # Push messages + msg_ids = set(self.push_messages(messages)) + + # Pull messages + end_time = time.time() + (timeout if timeout is not None else 0.0) + ret: List[Message] = [] + while timeout is None or time.time() < end_time: + res_msgs = self.pull_messages(msg_ids) + ret.extend(res_msgs) + msg_ids.difference_update( + {msg.metadata.reply_to_message for msg in res_msgs} + ) + if len(msg_ids) == 0: + break + # Sleep + time.sleep(3) + return ret def __del__(self) -> None: """Disconnect GrpcDriver if connected.""" # Check if GrpcDriver is initialized if self.grpc_driver is None: return - # Disconnect self.grpc_driver.disconnect() diff --git a/src/py/flwr/server/driver/driver_test.py b/src/py/flwr/server/driver/driver_test.py index b0d548af53a5..bd5c23a407fd 100644 --- a/src/py/flwr/server/driver/driver_test.py +++ b/src/py/flwr/server/driver/driver_test.py @@ -15,15 +15,17 @@ """Tests for driver SDK.""" +import time import unittest from unittest.mock import Mock, patch +from flwr.common import RecordSet from flwr.proto.driver_pb2 import ( # pylint: disable=E0611 GetNodesRequest, PullTaskResRequest, PushTaskInsRequest, ) -from flwr.proto.task_pb2 import Task, TaskIns, TaskRes # pylint: disable=E0611 +from flwr.proto.task_pb2 import Task, TaskRes # pylint: disable=E0611 from .driver import Driver @@ -74,11 +76,11 @@ def test_get_nodes(self) -> None: """Test retrieval of nodes.""" # Prepare mock_response = Mock() - mock_response.nodes = [Mock(), Mock()] + mock_response.nodes = [Mock(node_id=404), Mock(node_id=200)] self.mock_grpc_driver.get_nodes.return_value = mock_response # Execute - nodes = self.driver.get_nodes() + node_ids = self.driver.get_node_ids() args, kwargs = self.mock_grpc_driver.get_nodes.call_args # Assert @@ -87,18 +89,19 @@ def test_get_nodes(self) -> None: self.assertEqual(len(kwargs), 0) self.assertIsInstance(args[0], GetNodesRequest) self.assertEqual(args[0].run_id, 61016) - self.assertEqual(nodes, mock_response.nodes) + 
self.assertEqual(node_ids, [404, 200]) - def test_push_task_ins(self) -> None: - """Test pushing task instructions.""" + def test_push_messages_valid(self) -> None: + """Test pushing valid messages.""" # Prepare - mock_response = Mock() - mock_response.task_ids = ["id1", "id2"] + mock_response = Mock(task_ids=["id1", "id2"]) self.mock_grpc_driver.push_task_ins.return_value = mock_response - task_ins_list = [TaskIns(), TaskIns()] + msgs = [ + self.driver.create_message(RecordSet(), "", 0, "", "") for _ in range(2) + ] # Execute - task_ids = self.driver.push_task_ins(task_ins_list) + msg_ids = self.driver.push_messages(msgs) args, kwargs = self.mock_grpc_driver.push_task_ins.call_args # Assert @@ -106,12 +109,27 @@ def test_push_task_ins(self) -> None: self.assertEqual(len(args), 1) self.assertEqual(len(kwargs), 0) self.assertIsInstance(args[0], PushTaskInsRequest) - self.assertEqual(task_ids, mock_response.task_ids) + self.assertEqual(msg_ids, mock_response.task_ids) for task_ins in args[0].task_ins_list: self.assertEqual(task_ins.run_id, 61016) - def test_pull_task_res_with_given_task_ids(self) -> None: - """Test pulling task results with specific task IDs.""" + def test_push_messages_invalid(self) -> None: + """Test pushing invalid messages.""" + # Prepare + mock_response = Mock(task_ids=["id1", "id2"]) + self.mock_grpc_driver.push_task_ins.return_value = mock_response + msgs = [ + self.driver.create_message(RecordSet(), "", 0, "", "") for _ in range(2) + ] + # Use invalid run_id + msgs[1].metadata._run_id += 1 # pylint: disable=protected-access + + # Execute and assert + with self.assertRaises(ValueError): + self.driver.push_messages(msgs) + + def test_pull_messages_with_given_message_ids(self) -> None: + """Test pulling messages with specific message IDs.""" # Prepare mock_response = Mock() mock_response.task_res_list = [ @@ -119,10 +137,11 @@ def test_pull_task_res_with_given_task_ids(self) -> None: TaskRes(task=Task(ancestry=["id3"])), ] 
self.mock_grpc_driver.pull_task_res.return_value = mock_response - task_ids = ["id1", "id2", "id3"] + msg_ids = ["id1", "id2", "id3"] # Execute - task_res_list = self.driver.pull_task_res(task_ids) + msgs = self.driver.pull_messages(msg_ids) + reply_tos = {msg.metadata.reply_to_message for msg in msgs} args, kwargs = self.mock_grpc_driver.pull_task_res.call_args # Assert @@ -130,8 +149,43 @@ def test_pull_task_res_with_given_task_ids(self) -> None: self.assertEqual(len(args), 1) self.assertEqual(len(kwargs), 0) self.assertIsInstance(args[0], PullTaskResRequest) - self.assertEqual(args[0].task_ids, task_ids) - self.assertEqual(task_res_list, mock_response.task_res_list) + self.assertEqual(args[0].task_ids, msg_ids) + self.assertEqual(reply_tos, {"id2", "id3"}) + + def test_send_and_receive_messages_complete(self) -> None: + """Test send and receive all messages successfully.""" + # Prepare + mock_response = Mock(task_ids=["id1"]) + self.mock_grpc_driver.push_task_ins.return_value = mock_response + mock_response = Mock(task_res_list=[TaskRes(task=Task(ancestry=["id1"]))]) + self.mock_grpc_driver.pull_task_res.return_value = mock_response + msgs = [self.driver.create_message(RecordSet(), "", 0, "", "")] + + # Execute + ret_msgs = list(self.driver.send_and_receive(msgs)) + + # Assert + self.assertEqual(len(ret_msgs), 1) + self.assertEqual(ret_msgs[0].metadata.reply_to_message, "id1") + + def test_send_and_receive_messages_timeout(self) -> None: + """Test send and receive messages but time out.""" + # Prepare + sleep_fn = time.sleep + mock_response = Mock(task_ids=["id1"]) + self.mock_grpc_driver.push_task_ins.return_value = mock_response + mock_response = Mock(task_res_list=[]) + self.mock_grpc_driver.pull_task_res.return_value = mock_response + msgs = [self.driver.create_message(RecordSet(), "", 0, "", "")] + + # Execute + with patch("time.sleep", side_effect=lambda t: sleep_fn(t * 0.01)): + start_time = time.time() + ret_msgs = list(self.driver.send_and_receive(msgs, 
timeout=0.15)) + + # Assert + self.assertLess(time.time() - start_time, 0.2) + self.assertEqual(len(ret_msgs), 0) def test_del_with_initialized_driver(self) -> None: """Test cleanup behavior when Driver is initialized.""" From 4d96a398f33067afc22f0b6b1b583ddb67eedde3 Mon Sep 17 00:00:00 2001 From: Raj Parekh Date: Mon, 19 Feb 2024 13:32:54 -0800 Subject: [PATCH 030/102] FedStar (#2482) Co-authored-by: jafermarq --- README.md | 1 + baselines/fedstar/.gitignore | 4 + baselines/fedstar/LICENSE | 202 ++++++ baselines/fedstar/README.md | 163 +++++ baselines/fedstar/data_splits.tar | Bin 0 -> 13105152 bytes baselines/fedstar/fedstar/.gitattributes | 1 + baselines/fedstar/fedstar/.gitignore | 3 + baselines/fedstar/fedstar/__init__.py | 1 + baselines/fedstar/fedstar/client.py | 154 ++++ baselines/fedstar/fedstar/clients.py | 238 +++++++ baselines/fedstar/fedstar/conf/L/L20.yaml | 3 + baselines/fedstar/fedstar/conf/L/L3.yaml | 3 + baselines/fedstar/fedstar/conf/L/L5.yaml | 3 + baselines/fedstar/fedstar/conf/L/L50.yaml | 3 + baselines/fedstar/fedstar/conf/table3.yaml | 39 ++ baselines/fedstar/fedstar/conf/table4.yaml | 40 ++ baselines/fedstar/fedstar/dataset.py | 506 ++++++++++++++ .../fedstar/fedstar/dataset_preparation.py | 43 ++ baselines/fedstar/fedstar/main.py | 1 + baselines/fedstar/fedstar/models.py | 427 +++++++++++ baselines/fedstar/fedstar/server.py | 291 ++++++++ baselines/fedstar/fedstar/strategy.py | 5 + baselines/fedstar/fedstar/utils.py | 660 ++++++++++++++++++ baselines/fedstar/pyproject.toml | 141 ++++ baselines/fedstar/setup_datasets.sh | 42 ++ 25 files changed, 2974 insertions(+) create mode 100644 baselines/fedstar/.gitignore create mode 100644 baselines/fedstar/LICENSE create mode 100644 baselines/fedstar/README.md create mode 100644 baselines/fedstar/data_splits.tar create mode 100644 baselines/fedstar/fedstar/.gitattributes create mode 100644 baselines/fedstar/fedstar/.gitignore create mode 100644 baselines/fedstar/fedstar/__init__.py create mode 
100644 baselines/fedstar/fedstar/client.py create mode 100644 baselines/fedstar/fedstar/clients.py create mode 100644 baselines/fedstar/fedstar/conf/L/L20.yaml create mode 100644 baselines/fedstar/fedstar/conf/L/L3.yaml create mode 100644 baselines/fedstar/fedstar/conf/L/L5.yaml create mode 100644 baselines/fedstar/fedstar/conf/L/L50.yaml create mode 100644 baselines/fedstar/fedstar/conf/table3.yaml create mode 100644 baselines/fedstar/fedstar/conf/table4.yaml create mode 100644 baselines/fedstar/fedstar/dataset.py create mode 100644 baselines/fedstar/fedstar/dataset_preparation.py create mode 100644 baselines/fedstar/fedstar/main.py create mode 100644 baselines/fedstar/fedstar/models.py create mode 100644 baselines/fedstar/fedstar/server.py create mode 100644 baselines/fedstar/fedstar/strategy.py create mode 100644 baselines/fedstar/fedstar/utils.py create mode 100644 baselines/fedstar/pyproject.toml create mode 100644 baselines/fedstar/setup_datasets.sh diff --git a/README.md b/README.md index 38a11d951fe7..00e0a277104a 100644 --- a/README.md +++ b/README.md @@ -101,6 +101,7 @@ Flower Baselines is a collection of community-contributed projects that reproduc - [FedNova](https://github.com/adap/flower/tree/main/baselines/fednova) - [HeteroFL](https://github.com/adap/flower/tree/main/baselines/heterofl) - [FedAvgM](https://github.com/adap/flower/tree/main/baselines/fedavgm) +- [FedStar](https://github.com/adap/flower/tree/main/baselines/fedstar) - [FedWav2vec2](https://github.com/adap/flower/tree/main/baselines/fedwav2vec2) - [FjORD](https://github.com/adap/flower/tree/main/baselines/fjord) - [MOON](https://github.com/adap/flower/tree/main/baselines/moon) diff --git a/baselines/fedstar/.gitignore b/baselines/fedstar/.gitignore new file mode 100644 index 000000000000..e39a6de1f73d --- /dev/null +++ b/baselines/fedstar/.gitignore @@ -0,0 +1,4 @@ +datasets +outputs +data_splits +multirun diff --git a/baselines/fedstar/LICENSE b/baselines/fedstar/LICENSE new file mode 
100644 index 000000000000..d64569567334 --- /dev/null +++ b/baselines/fedstar/LICENSE @@ -0,0 +1,202 @@ + + Apache License + Version 2.0, January 2004 + http://www.apache.org/licenses/ + + TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + + 1. Definitions. + + "License" shall mean the terms and conditions for use, reproduction, + and distribution as defined by Sections 1 through 9 of this document. + + "Licensor" shall mean the copyright owner or entity authorized by + the copyright owner that is granting the License. + + "Legal Entity" shall mean the union of the acting entity and all + other entities that control, are controlled by, or are under common + control with that entity. For the purposes of this definition, + "control" means (i) the power, direct or indirect, to cause the + direction or management of such entity, whether by contract or + otherwise, or (ii) ownership of fifty percent (50%) or more of the + outstanding shares, or (iii) beneficial ownership of such entity. + + "You" (or "Your") shall mean an individual or Legal Entity + exercising permissions granted by this License. + + "Source" form shall mean the preferred form for making modifications, + including but not limited to software source code, documentation + source, and configuration files. + + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. + + "Work" shall mean the work of authorship, whether in Source or + Object form, made available under the License, as indicated by a + copyright notice that is included in or attached to the work + (an example is provided in the Appendix below). 
+ + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other modifications + represent, as a whole, an original work of authorship. For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. + + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." + + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + + 2. Grant of Copyright License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + + 3. 
Grant of Patent License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + + 4. Redistribution. You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative 
Works, in at least one + of the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. The contents + of the NOTICE file are for informational purposes only and + do not modify the License. You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. + + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + + 7. Disclaimer of Warranty. 
Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + + 8. Limitation of Liability. In no event and under no legal theory, + whether in tort (including negligence), contract, or otherwise, + unless required by applicable law (such as deliberate and grossly + negligent acts) or agreed to in writing, shall any Contributor be + liable to You for damages, including any direct, indirect, special, + incidental, or consequential damages of any character arising as a + result of this License or out of the use or inability to use the + Work (including but not limited to damages for loss of goodwill, + work stoppage, computer failure or malfunction, or any and all + other commercial damages or losses), even if such Contributor + has been advised of the possibility of such damages. + + 9. Accepting Warranty or Additional Liability. While redistributing + the Work or Derivative Works thereof, You may choose to offer, + and charge a fee for, acceptance of support, warranty, indemnity, + or other liability obligations and/or rights consistent with this + License. However, in accepting such obligations, You may act only + on Your own behalf and on Your sole responsibility, not on behalf + of any other Contributor, and only if You agree to indemnify, + defend, and hold each Contributor harmless for any liability + incurred by, or claims asserted against, such Contributor by reason + of your accepting any such warranty or additional liability. 
+ + END OF TERMS AND CONDITIONS + + APPENDIX: How to apply the Apache License to your work. + + To apply the Apache License to your work, attach the following + boilerplate notice, with the fields enclosed by brackets "[]" + replaced with your own identifying information. (Don't include + the brackets!) The text should be enclosed in the appropriate + comment syntax for the file format. We also recommend that a + file or class name and description of purpose be included on the + same "printed page" as the copyright notice for easier + identification within third-party archives. + + Copyright [yyyy] [name of copyright owner] + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. diff --git a/baselines/fedstar/README.md b/baselines/fedstar/README.md new file mode 100644 index 000000000000..9005ae01679d --- /dev/null +++ b/baselines/fedstar/README.md @@ -0,0 +1,163 @@ +--- +title: Federated Self-training for Semi-supervised Audio Recognition +url: https://dl.acm.org/doi/10.1145/3520128 +labels: [Audio Classification, Semi Supervised learning] +dataset: [Ambient Context, Speech Commands] +--- + +# FedStar: Federated Self-training for Semi-supervised Audio Recognition + +> Note: If you use this baseline in your work, please remember to cite the original authors of the paper as well as the Flower paper. 
+
+**Paper:** [dl.acm.org/doi/10.1145/3520128](https://dl.acm.org/doi/10.1145/3520128)
+
+**Authors:** Vasileios Tsouvalas, Aaqib Saeed, Tanir Özcelebi
+
+**Abstract:** Federated Learning is a distributed machine learning paradigm dealing with decentralized and personal datasets. Since data reside on devices such as smartphones and virtual assistants, labeling is entrusted to the clients or labels are extracted in an automated way. Specifically, in the case of audio data, acquiring semantic annotations can be prohibitively expensive and time-consuming. As a result, an abundance of audio data remains unlabeled and unexploited on users’ devices. Most existing federated learning approaches focus on supervised learning without harnessing the unlabeled data. In this work, we study the problem of semi-supervised learning of audio models via self-training in conjunction with federated learning. We propose FedSTAR to exploit large-scale on-device unlabeled data to improve the generalization of audio recognition models. We further demonstrate that self-supervised pre-trained models can accelerate the training of on-device models, significantly improving convergence within fewer training rounds. We conduct experiments on diverse public audio classification datasets and investigate the performance of our models under varying percentages of labeled and unlabeled data. Notably, we show that with as little as 3% labeled data available, FedSTAR on average can improve the recognition rate by 13.28% compared to the fully supervised federated model.
+
+
+## About this baseline
+
+**What’s implemented:** The code is structured in such a way that all experiments for ambient context and speech commands can be derived.
+
+**Datasets:** Ambient Context, Speech Commands
+
+**Hardware Setup:** These experiments were run on a Linux server with 56 CPU threads, 325 GB of RAM, and an A10 GPU.
Any machine with at least 16 CPU cores and 32 GB of memory should be able to run experiments with a small number of clients in a reasonable amount of time. For context, a machine with 24 cores and an RTX 3090 Ti ran the Speech Commands experiment in Table 3 with 10 clients in 1 hour. This experiment used 30 GB of RAM, and each client required ~1.4 GB of VRAM. The same experiment with the Ambient Context dataset took 13 minutes.
+
+**Contributors:** Raj Parekh [GitHub](https://github.com/Raj-Parekh24), [Mail](mailto:rajparekhwc@gmail.com)
+
+## Environment Setup
+```bash
+# Set python version
+pyenv local 3.10.6
+# Tell poetry to use python 3.10
+poetry env use 3.10.6
+# Now install the environment
+poetry install
+# Start the shell to activate your environment.
+poetry shell
+```
+
+Next, you'll need to download the datasets. In the case of Speech Commands, some preprocessing is also required:
+
+```bash
+# Make the shell script executable
+chmod +x setup_datasets.sh
+
+# The script below will download the datasets and create the directory structure required to run this experiment.
+./setup_datasets.sh
+
+# If you want to run the Speech Commands experiment, pre-process the dataset.
+# This will generate a few training examples from the _silence_ category.
+python -m fedstar.dataset_preparation
+# Please note the above will make the following changes:
+# * Add new files to datasets/speech_commands/Data/Train/_silence_
+# * Add new entries to data_splits/speech_commands/train_split.txt
+# Therefore the above command should only be run once. If you want to run it again
+# after making modifications to the script, please either revert the changes outlined
+# above or erase the dataset and repeat the download + preprocessing as defined in the setup_datasets.sh script.
+```
+
+## Setting up GPU Memory
+
+**Note:** The experiment is designed to run on both GPU and CPU, but runs better on a system with a GPU (especially when using the Speech Commands dataset).
If you wish to use a GPU, make sure you have installed the [CUDA Toolkit](https://developer.nvidia.com/cuda-downloads). This baseline has been tested with CUDA 12.3. By default, it will run only on the CPU. Please update the list `gpu_free_mem` with the corresponding memory for each GPU in your machine that you want to expose to the experiment. The variable is in the `distribute_gpus` function inside `clients.py`. A reference is shown below.
+
+```python
+# Example: a system with two GPUs, with 8 GB and 4 GB of VRAM.
+# The modified variable would look like this:
+gpu_free_mem = [8000, 4000]
+```
+
+
+
+## Running the Experiments
+
+By default, the `Ambient Context` experiment in Table 3 with 10 clients will be run.
+
+```bash
+python -m fedstar.server
+python -m fedstar.clients
+```
+
+You can change the dataset, number of clients, and number of rounds like this:
+
+```bash
+python -m fedstar.server num_clients=5 dataset_name=speech_commands server.rounds=20
+python -m fedstar.clients num_clients=5 dataset_name=speech_commands
+```
+
+To run the experiments for Table 4, you should pass a different config file (i.e. the one in `fedstar/conf/table4.yaml`). You can do this as follows:
+
+```bash
+# by default will run FedStar with Ambient Context and L=3%
+python -m fedstar.server --config-name table4
+python -m fedstar.clients --config-name table4
+```
+
+To modify the ratio of labelled data, do so as follows:
+```bash
+# To use a different L setting
+python -m fedstar.server --config-name table4 L=L5 # {L3, L5, L20, L50}
+# same for fedstar.clients
+```
+
+To run in supervised mode, pass `fedstar=false` to any of the commands above (when launching both the server and clients). Naturally, you can also override any other setting, like `dataset_name` and `num_clients`, if desired.
+
+
+## Expected Results
+
+
+This section lists the commands to execute to obtain the results shown below in Table 3 and Table 4.
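Once the runs finish, it can be handy to enumerate each run's output folder programmatically before inspecting metrics. Below is a minimal sketch that only assumes each individual run writes into a leaf directory under the sweep folder; the `list_run_dirs` helper is illustrative and not part of the baseline.

```python
import os


def list_run_dirs(root: str) -> list[str]:
    """Return the leaf directories under a results tree.

    A leaf directory (one with no sub-directories) corresponds to the
    output folder of one individual run.
    """
    leaves = []
    for dirpath, dirnames, _filenames in os.walk(root):
        if not dirnames:
            leaves.append(dirpath)
    return sorted(leaves)


if __name__ == "__main__":
    # Point this at the sweep folder (or outputs/ for single runs).
    for run_dir in list_run_dirs("multirun"):
        print(run_dir)
```

Pointing it at `multirun/` prints one line per completed run, which makes it easy to check that a sweep produced the expected number of configurations.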
While both configs fix the number of rounds to 100, in many settings fewer rounds are enough for the model to reach the accuracy shown in the tables. The commands below make use of Hydra's `--multirun` to run multiple experiments. This mode is better suited to Flower simulations; it works fine here, but if you encounter any issues, you can always "unroll" the multirun and run one configuration at a time. If you do this, results won't go into the `multirun/` directory but into the default `outputs/` directory instead.
+
+
+### Table 3
+
+Results will be stored in `multirun/Table3//N_//