Merge branch 'main' into update-mlx-example-flwr-run
jafermarq authored Jul 16, 2024
2 parents bfe6023 + 1f3fe0f commit cbd6122
Showing 29 changed files with 738 additions and 336 deletions.
1 change: 1 addition & 0 deletions datasets/pyproject.toml
@@ -59,6 +59,7 @@ pillow = { version = ">=6.2.1", optional = true }
 soundfile = { version = ">=0.12.1", optional = true }
 librosa = { version = ">=0.10.0.post2", optional = true }
 tqdm ="^4.66.1"
+pyarrow = "==16.1.0"
 matplotlib = "^3.7.5"
 seaborn = "^0.13.0"

118 changes: 118 additions & 0 deletions doc/source/contributor-explanation-public-and-private-apis.rst
@@ -0,0 +1,118 @@
Public and private APIs
=======================

In Python, everything is public.
To enable developers to understand which components can be relied upon, Flower declares a public API.
Components that are part of the public API can be relied upon.
Changes to the public API are announced in the release notes and are subject to deprecation policies.

Everything that is not part of the public API is part of the private API.
Even though Python allows accessing them, user code should never use those components.
Private APIs can change at any time, even in patch releases.

How can you determine whether a component is part of the public API or not? Easy:

- `Use the Flower API reference documentation <ref-api/flwr.html>`_
- `Use the Flower CLI reference documentation <ref-api-cli.html>`_

Everything listed in the reference documentation is part of the public API.
This document explains how Flower maintainers define the public API and how you can determine whether a component is part of the public API or not by reading the Flower source code.

Flower public API
-----------------

Flower has a well-defined public API. Let's look at this in more detail.

.. important::

    Every component that is reachable by recursively following ``__init__.__all__`` starting from the root package (``flwr``) is part of the public API.

If you want to determine whether a component (class/function/generator/...) is part of the public API or not, you need to start at the root of the ``flwr`` package.
Let's use ``tree -L 1 -d src/py/flwr`` to look at the Python subpackages contained in ``flwr``:

.. code-block:: bash

    flwr
    ├── cli
    ├── client
    ├── common
    ├── proto
    ├── server
    └── simulation

Contrast this with the definition of ``__all__`` in the root ``src/py/flwr/__init__.py``:

.. code-block:: python

    # From `flwr/__init__.py`
    __all__ = [
        "client",
        "common",
        "server",
        "simulation",
    ]

You can see that ``flwr`` has six subpackages (``cli``, ``client``, ``common``, ``proto``, ``server``, ``simulation``), but only four of them are "exported" via ``__all__`` (``client``, ``common``, ``server``, ``simulation``).

What does this mean? It means that ``client``, ``common``, ``server`` and ``simulation`` are part of the public API, but ``cli`` and ``proto`` are not.
The ``flwr`` subpackages ``cli`` and ``proto`` are private APIs.
A private API can change completely from one release to the next (even in patch releases).
It can change in a breaking way, it can be renamed (for example, ``flwr.cli`` could be renamed to ``flwr.command``) and it can even be removed completely.

Therefore, as a Flower user:

- ``from flwr import client`` ✅ Ok, you're importing a public API.
- ``from flwr import proto`` ❌ Not recommended, you're importing a private API.
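
The rule above can be sketched in a few lines of Python. This is a minimal illustration, not Flower's actual tooling: ``collect_public_api`` is a hypothetical helper, and ``demo`` is a stand-in package built in memory so that the snippet runs without Flower installed.

```python
import types


def collect_public_api(module, seen=None):
    """Return dotted names reachable by recursively following ``__all__``."""
    seen = set() if seen is None else seen
    if id(module) in seen:  # guard against modules that export each other
        return []
    seen.add(id(module))

    names = []
    for name in getattr(module, "__all__", []):
        obj = getattr(module, name)
        names.append(f"{module.__name__}.{name}")
        if isinstance(obj, types.ModuleType):
            # Exported subpackage: keep following its own ``__all__``
            names.extend(collect_public_api(obj, seen))
    return names


# Hypothetical stand-in package ("demo"), not the real ``flwr``
demo = types.ModuleType("demo")
demo.client = types.ModuleType("demo.client")
demo.proto = types.ModuleType("demo.proto")  # present, but never exported
demo.client.start = lambda: None
demo.client.__all__ = ["start"]
demo.__all__ = ["client"]

public = collect_public_api(demo)
print(public)
```

In this sketch ``demo.proto`` exists as an attribute of ``demo`` but never shows up in the result, which mirrors how ``flwr.proto`` is importable yet private.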

What about components that are nested deeper in the hierarchy? Let's look at Flower strategies to see another typical pattern.
Flower strategies like ``FedAvg`` are often imported using ``from flwr.server.strategy import FedAvg``.
Let's look at ``src/py/flwr/server/strategy/__init__.py``:

.. code-block:: python

    from .fedavg import FedAvg as FedAvg
    # ... more imports

    __all__ = [
        "FedAvg",
        # ... more exports
    ]

What's notable here is that all strategies are implemented in dedicated modules (e.g., ``fedavg.py``).
In ``__init__.py``, we *import* the components we want to make part of the public API and then *export* them via ``__all__``.
Note that we export the component itself (for example, the ``FedAvg`` class), but not the module it is defined in (for example, ``fedavg.py``).
This allows us to move the definition of ``FedAvg`` into a different module (or even a module in a subpackage) without breaking the public API (as long as we update the import path in ``__init__.py``).

Therefore:

- ``from flwr.server.strategy import FedAvg`` ✅ Ok, you're importing a class that is part of the public API.
- ``from flwr.server.strategy import fedavg`` ❌ Not recommended, you're importing a private module.
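
The import-then-export pattern can be tried out in isolation. The following is a rough sketch under stated assumptions: ``demo_strategy`` is a hypothetical throwaway package written to a temporary directory, and its ``FedAvg`` is an empty placeholder class, not Flower's strategy.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    pkg = Path(tmp) / "demo_strategy"  # hypothetical package name
    pkg.mkdir()
    # The class lives in a dedicated module ...
    (pkg / "fedavg.py").write_text("class FedAvg:\n    pass\n")
    # ... and the package imports it and exports it via ``__all__``
    (pkg / "__init__.py").write_text(
        "from .fedavg import FedAvg as FedAvg\n\n__all__ = ['FedAvg']\n"
    )
    sys.path.insert(0, tmp)
    try:
        demo_strategy = importlib.import_module("demo_strategy")
    finally:
        sys.path.remove(tmp)

public_class = demo_strategy.FedAvg  # the supported import path
defining_module = public_class.__module__  # an implementation detail
print(demo_strategy.__all__, defining_module)
```

Only the class name appears in ``__all__``. The defining module (``demo_strategy.fedavg`` here) could be renamed or split up, and code that imports ``FedAvg`` from the package would keep working as long as ``__init__.py`` is updated accordingly.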

This approach is also implemented in the tooling that automatically builds API reference docs.

Flower public API of private packages
-------------------------------------

We also use this approach to define the public API of private subpackages.
Public, in this context, means the API that other ``flwr`` subpackages should use.
For example, ``flwr.server.driver`` is a private subpackage (it's not exported via ``src/py/flwr/server/__init__.py``'s ``__all__``).

Still, the private subpackage ``flwr.server.driver`` defines a "public" API using ``__all__`` in ``src/py/flwr/server/driver/__init__.py``:

.. code-block:: python

    from .driver import Driver
    from .grpc_driver import GrpcDriver
    from .inmemory_driver import InMemoryDriver

    __all__ = [
        "Driver",
        "GrpcDriver",
        "InMemoryDriver",
    ]

The interesting part is that neither ``GrpcDriver`` nor ``InMemoryDriver`` is ever used by Flower framework users; both are used only by other parts of the Flower framework codebase.
Those other parts of the codebase import, for example, ``InMemoryDriver`` using ``from flwr.server.driver import InMemoryDriver`` (i.e., the ``InMemoryDriver`` exported via ``__all__``), not ``from flwr.server.driver.inmemory_driver import InMemoryDriver`` (``inmemory_driver.py`` is the module containing the actual ``InMemoryDriver`` class definition).

This is because ``flwr.server.driver`` defines a public interface for other ``flwr`` subpackages.
This allows codeowners of ``flwr.server.driver`` to refactor the package without breaking other ``flwr``-internal users.
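
One way to keep internal callers on the package-level interface is a small static check. The sketch below is only an illustration and is not part of Flower's tooling: ``deep_imports`` is a hypothetical helper that parses source text with ``ast`` and reports absolute ``from ... import`` statements that reach below a given package.

```python
import ast


def deep_imports(source, package):
    """Return imported module paths that reach below ``package``."""
    found = []
    for node in ast.walk(ast.parse(source)):
        # Only absolute ``from X import Y`` statements are inspected
        if isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            if node.module.startswith(package + "."):
                found.append(node.module)
    return found


# The source is only parsed, never executed, so Flower need not be installed
code = (
    "from flwr.server.driver import InMemoryDriver\n"
    "from flwr.server.driver.inmemory_driver import InMemoryDriver\n"
)
flagged = deep_imports(code, "flwr.server.driver")
print(flagged)
```

The first import goes through the package's ``__all__`` and is left alone. The second one depends on the module layout and is the kind of import a refactoring of ``flwr.server.driver`` could break.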
1 change: 1 addition & 0 deletions doc/source/index.rst
@@ -174,6 +174,7 @@ The Flower community welcomes contributions. The following docs are intended to
    :caption: Contributor explanations

    contributor-explanation-architecture
+   contributor-explanation-public-and-private-apis

 .. toctree::
    :maxdepth: 1
37 changes: 22 additions & 15 deletions examples/simulation-pytorch/sim.py
@@ -87,11 +87,13 @@ def get_client_fn(dataset: FederatedDataset):
     the strategy to participate.
     """

-    def client_fn(cid: str) -> fl.client.Client:
+    def client_fn(context) -> fl.client.Client:
         """Construct a FlowerClient with its own dataset partition."""

         # Let's get the partition corresponding to the i-th client
-        client_dataset = dataset.load_partition(int(cid), "train")
+        client_dataset = dataset.load_partition(
+            int(context.node_config["partition-id"]), "train"
+        )

         # Now let's split it into train (90%) and validation (10%)
         client_dataset_splits = client_dataset.train_test_split(test_size=0.1, seed=42)
@@ -171,26 +173,31 @@ def evaluate(
 mnist_fds = FederatedDataset(dataset="mnist", partitioners={"train": NUM_CLIENTS})
 centralized_testset = mnist_fds.load_split("test")

-# Configure the strategy
-strategy = fl.server.strategy.FedAvg(
-    fraction_fit=0.1,  # Sample 10% of available clients for training
-    fraction_evaluate=0.05,  # Sample 5% of available clients for evaluation
-    min_available_clients=10,
-    on_fit_config_fn=fit_config,
-    evaluate_metrics_aggregation_fn=weighted_average,  # Aggregate federated metrics
-    evaluate_fn=get_evaluate_fn(centralized_testset),  # Global evaluation function
-)
+from flwr.server import ServerAppComponents
+
+
+def server_fn(context):
+    # Configure the strategy
+    strategy = fl.server.strategy.FedAvg(
+        fraction_fit=0.1,  # Sample 10% of available clients for training
+        fraction_evaluate=0.05,  # Sample 5% of available clients for evaluation
+        min_available_clients=10,
+        on_fit_config_fn=fit_config,
+        evaluate_metrics_aggregation_fn=weighted_average,  # Aggregate federated metrics
+        evaluate_fn=get_evaluate_fn(centralized_testset),  # Global evaluation function
+    )
+    return ServerAppComponents(
+        strategy=strategy, config=fl.server.ServerConfig(num_rounds=NUM_ROUNDS)
+    )
+

 # ClientApp for Flower-Next
 client = fl.client.ClientApp(
     client_fn=get_client_fn(mnist_fds),
 )

 # ServerApp for Flower-Next
-server = fl.server.ServerApp(
-    config=fl.server.ServerConfig(num_rounds=NUM_ROUNDS),
-    strategy=strategy,
-)
+server = fl.server.ServerApp(server_fn=server_fn)


 def main():
1 change: 1 addition & 0 deletions pyproject.toml
@@ -71,6 +71,7 @@ pycryptodome = "^3.18.0"
 iterators = "^0.0.2"
 typer = { version = "^0.9.0", extras=["all"] }
 tomli = "^2.0.1"
+tomli-w = "^1.0.0"
 pathspec = "^0.12.1"
 # Optional dependencies (Simulation Engine)
 ray = { version = "==2.10.0", optional = true, python = ">=3.8,<3.12" }
18 changes: 16 additions & 2 deletions src/py/flwr/cli/build.py
@@ -20,6 +20,7 @@
 from typing import Optional

 import pathspec
+import tomli_w
 import typer
 from typing_extensions import Annotated

@@ -85,23 +86,36 @@ def build(

     # Set the name of the zip file
     fab_filename = (
-        f"{conf['flower']['publisher']}"
+        f"{conf['tool']['flwr']['app']['publisher']}"
         f".{directory.name}"
         f".{conf['project']['version'].replace('.', '-')}.fab"
     )
     list_file_content = ""

     allowed_extensions = {".py", ".toml", ".md"}

+    # Remove the 'federations' field from 'tool.flwr' if it exists
+    if (
+        "tool" in conf
+        and "flwr" in conf["tool"]
+        and "federations" in conf["tool"]["flwr"]
+    ):
+        del conf["tool"]["flwr"]["federations"]
+
+    toml_contents = tomli_w.dumps(conf)
+
     with zipfile.ZipFile(fab_filename, "w", zipfile.ZIP_DEFLATED) as fab_file:
+        fab_file.writestr("pyproject.toml", toml_contents)
+
+        # Continue with adding other files
         for root, _, files in os.walk(directory, topdown=True):
             # Filter directories and files based on .gitignore
             files = [
                 f
                 for f in files
                 if not ignore_spec.match_file(Path(root) / f)
                 and f != fab_filename
                 and Path(f).suffix in allowed_extensions
+                and f != "pyproject.toml"  # Exclude the original pyproject.toml
             ]

             for file in files:
38 changes: 23 additions & 15 deletions src/py/flwr/cli/config_utils.py
@@ -60,7 +60,7 @@ def get_fab_metadata(fab_file: Union[Path, bytes]) -> Tuple[str, str]:

     return (
         conf["project"]["version"],
-        f"{conf['flower']['publisher']}/{conf['project']['name']}",
+        f"{conf['tool']['flwr']['app']['publisher']}/{conf['project']['name']}",
     )


@@ -136,20 +136,28 @@ def validate_fields(config: Dict[str, Any]) -> Tuple[bool, List[str], List[str]]
         if "authors" not in config["project"]:
             warnings.append('Recommended property "authors" missing in [project]')

-    if "flower" not in config:
-        errors.append("Missing [flower] section")
+    if (
+        "tool" not in config
+        or "flwr" not in config["tool"]
+        or "app" not in config["tool"]["flwr"]
+    ):
+        errors.append("Missing [tool.flwr.app] section")
     else:
-        if "publisher" not in config["flower"]:
-            errors.append('Property "publisher" missing in [flower]')
-        if "config" in config["flower"]:
-            _validate_run_config(config["flower"]["config"], errors)
-        if "components" not in config["flower"]:
-            errors.append("Missing [flower.components] section")
+        if "publisher" not in config["tool"]["flwr"]["app"]:
+            errors.append('Property "publisher" missing in [tool.flwr.app]')
+        if "config" in config["tool"]["flwr"]["app"]:
+            _validate_run_config(config["tool"]["flwr"]["app"]["config"], errors)
+        if "components" not in config["tool"]["flwr"]["app"]:
+            errors.append("Missing [tool.flwr.app.components] section")
         else:
-            if "serverapp" not in config["flower"]["components"]:
-                errors.append('Property "serverapp" missing in [flower.components]')
-            if "clientapp" not in config["flower"]["components"]:
-                errors.append('Property "clientapp" missing in [flower.components]')
+            if "serverapp" not in config["tool"]["flwr"]["app"]["components"]:
+                errors.append(
+                    'Property "serverapp" missing in [tool.flwr.app.components]'
+                )
+            if "clientapp" not in config["tool"]["flwr"]["app"]["components"]:
+                errors.append(
+                    'Property "clientapp" missing in [tool.flwr.app.components]'
+                )

     return len(errors) == 0, errors, warnings

@@ -165,14 +173,14 @@ def validate(

     # Validate serverapp
     is_valid, reason = object_ref.validate(
-        config["flower"]["components"]["serverapp"], check_module
+        config["tool"]["flwr"]["app"]["components"]["serverapp"], check_module
     )
     if not is_valid and isinstance(reason, str):
         return False, [reason], []

     # Validate clientapp
     is_valid, reason = object_ref.validate(
-        config["flower"]["components"]["clientapp"], check_module
+        config["tool"]["flwr"]["app"]["components"]["clientapp"], check_module
     )

     if not is_valid and isinstance(reason, str):