Creating the glossary folder and moving the .mdx files
Julianrussmeyer committed Sep 10, 2024
1 parent d24ebe5 commit 126d52e
Showing 12 changed files with 248 additions and 0 deletions.
18 changes: 18 additions & 0 deletions glossary/aggregation.mdx
---
title: "Aggregation"
description: "Combine model weights from sampled clients to update the global model. This process enables the global model to learn from each client's data."
date: "2024-05-23"
author:
name: "Charles Beauville"
position: "Machine Learning Engineer"
website: "https://www.linkedin.com/in/charles-beauville/"
github: "github.com/charlesbvll"
related:
- text: "Federated Learning"
link: "/glossary/federated-learning"
- text: "Tutorial: What is Federated Learning?"
link: "/docs/framework/tutorial-series-what-is-federated-learning.html"
---

During each Federated Learning round, the server receives model weights from sampled clients and needs a function to combine them into an improved global model. This is what is called `aggregation`. It can be a simple weighted average (like `FedAvg`) or something more complex (e.g., incorporating optimization techniques). Aggregation is where FL's magic happens: it allows the global model to learn from each client's particular data distribution using only their trained weights.
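As a sketch of the idea, here is what a `FedAvg`-style weighted average might look like in plain Python with NumPy. The function name and data layout are illustrative, not Flower's actual API:

```python
import numpy as np

def fedavg(client_weights, num_examples):
    """FedAvg-style sketch: weighted average of client model weights.

    client_weights: one list of np.ndarray layers per client
    num_examples:   how many training examples each client used
    """
    total = sum(num_examples)
    num_layers = len(client_weights[0])
    return [
        sum(w[layer] * n for w, n in zip(client_weights, num_examples)) / total
        for layer in range(num_layers)
    ]

# Two toy clients with a single "layer" each; client 2 trained on 3x the data
clients = [[np.array([1.0, 1.0])], [np.array([3.0, 3.0])]]
avg = fedavg(clients, num_examples=[1, 3])
```

Weighting by the number of examples ensures that clients that trained on more data contribute proportionally more to the global model.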

17 changes: 17 additions & 0 deletions glossary/client.mdx
---
title: "Client"
description: "A client is any machine with local data that connects to a server, trains on received global model weights, and sends back updated weights. Clients may also evaluate global model weights."
date: "2024-05-23"
author:
name: "Charles Beauville"
position: "Machine Learning Engineer"
website: "https://www.linkedin.com/in/charles-beauville/"
github: "github.com/charlesbvll"
related:
- text: "Federated Learning"
link: "/glossary/federated-learning"
- text: "Tutorial: What is Federated Learning?"
link: "/docs/framework/tutorial-series-what-is-federated-learning.html"
---

Any machine with access to some data that connects to a server to perform Federated Learning. During each round of FL, a sampled client receives the global model weights from the server, trains on the data it has access to, and sends the resulting trained weights back to the server. Clients can also be sampled to evaluate the global model weights on their local data; this is called federated evaluation.
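To make the round-trip concrete, here is a toy client sketch in plain Python. The class and its methods are hypothetical, not Flower's actual client API; local "training" is a single gradient step on a linear model:

```python
import numpy as np

class ToyClient:
    """Hypothetical FL client: local data stays here; only weights travel."""

    def __init__(self, x, y):
        self.x, self.y = x, y  # local data, never sent to the server

    def fit(self, global_w, lr=0.1):
        """Receive global weights, run one local SGD step, return new weights."""
        w = global_w.copy()
        grad = 2 * self.x.T @ (self.x @ w - self.y) / len(self.y)
        return w - lr * grad, len(self.y)

    def evaluate(self, global_w):
        """Federated evaluation: report the loss on this client's own data."""
        loss = float(np.mean((self.x @ global_w - self.y) ** 2))
        return loss, len(self.y)

client = ToyClient(np.array([[1.0], [2.0]]), np.array([1.0, 2.0]))
w, n = client.fit(np.array([0.0]))     # one round of local training
loss, _ = client.evaluate(w)           # federated evaluation of the result
```

Both methods also return the local example count, which the server needs to weight this client's contribution during aggregation.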
22 changes: 22 additions & 0 deletions glossary/docker.mdx
---
title: "Docker"
description: "Docker is a containerization tool that allows for consistent and reliable deployment of applications across different environments."
date: "2024-07-08"
author:
name: "Robert Steiner"
position: "DevOps Engineer at Flower Labs"
website: "https://github.com/Robert-Steiner"
---

Docker is an open-source containerization tool for deploying and running applications. Docker
containers encapsulate an application's code, dependencies, and configuration files, allowing
for consistent and reliable deployment across different environments.

In the context of federated learning, Docker containers can be used to package the entire client
and server application, including all the necessary dependencies, and then deployed on various
devices such as edge devices, cloud servers, or even on-premises servers.

In Flower, Docker containers are used to containerize various applications like `SuperLink`,
`SuperNode`, and `SuperExec`. Flower's Docker images allow users to quickly get Flower up and
running, reducing the time and effort required to set up and configure the necessary software
and dependencies.
19 changes: 19 additions & 0 deletions glossary/evaluation.mdx
---
title: "Evaluation"
description: "Evaluation measures how well the trained model performs by testing it on each client's local data, providing insights into its generalizability across varied data sources."
date: "2024-07-08"
author:
name: "Heng Pan"
position: "Research Scientist"
website: "https://discuss.flower.ai/u/pan-h/summary"
github: "github.com/panh99"
related:
- text: "Server"
link: "/glossary/server"
- text: "Client"
link: "/glossary/client"
---

Evaluation in machine learning is the process of assessing a model's performance on unseen data to determine its ability to generalize beyond the training set. This typically involves using a separate test set and various metrics like accuracy or F1-score to measure how well the model performs on new data, ensuring it isn't overfitting or underfitting.

In federated learning, evaluation (or distributed evaluation) refers to the process of assessing a model's performance across multiple clients, such as devices or data centers. Each client evaluates the model locally using its own data and then sends the results to the server, which aggregates all the evaluation outcomes. This process allows for understanding how well the model generalizes to different data distributions without centralizing sensitive data.
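A minimal sketch of how a server might aggregate federated evaluation results, weighting each client's metrics by its number of examples (the function name and tuple layout are illustrative, not Flower's API):

```python
def aggregate_evaluate(results):
    """Weighted average of client evaluation results.

    results: one (num_examples, loss, accuracy) tuple per client.
    """
    total = sum(n for n, _, _ in results)
    loss = sum(n * l for n, l, _ in results) / total
    acc = sum(n * a for n, _, a in results) / total
    return loss, acc

# Two hypothetical clients report their locally computed metrics
loss, acc = aggregate_evaluate([(100, 0.40, 0.90), (300, 0.20, 0.80)])
```

The raw evaluation data never leaves the clients; only the summary metrics are sent to the server.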
14 changes: 14 additions & 0 deletions glossary/federated-learning.mdx
---
title: "Federated Learning"
description: "Federated Learning is a machine learning approach where model training occurs on decentralized devices, preserving data privacy and leveraging local computations."
date: "2024-05-23"
author:
name: "Julian Rußmeyer"
position: "UX/UI Designer"
website: "https://www.linkedin.com/in/julian-russmeyer/"
related:
- text: "Tutorial: What is Federated Learning?"
link: "/docs/framework/tutorial-series-what-is-federated-learning.html"
---

Federated learning is an approach to machine learning in which the model is trained on multiple decentralized devices or servers with local data samples without exchanging them. Instead of sending raw data to a central server, updates to the model are calculated locally and only the model parameters are aggregated centrally. In this way, user privacy is maintained and communication costs are reduced, while collaborative model training is enabled.
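The whole cycle can be sketched in a few lines of plain Python: each simulated client runs a few gradient steps on its own data, and the server averages the returned weights. This is a toy illustration of the idea, not a real FL deployment:

```python
import numpy as np

def local_update(w, x, y, lr=0.1, steps=5):
    """One client's local training: a few SGD steps on a linear model."""
    for _ in range(steps):
        w = w - lr * 2 * x.T @ (x @ w - y) / len(y)
    return w

# Each client holds its own data; only model weights are ever exchanged
clients = [
    (np.array([[1.0]]), np.array([2.0])),   # client A's local samples
    (np.array([[1.0]]), np.array([4.0])),   # client B's local samples
]

w_global = np.zeros(1)
for rnd in range(20):                        # federated learning rounds
    updates = [local_update(w_global, x, y) for x, y in clients]
    w_global = np.mean(updates, axis=0)      # aggregate (plain averaging)
```

After a few rounds the global weight settles between the two clients' local optima, which is exactly the collaborative behavior described above.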
21 changes: 21 additions & 0 deletions glossary/inference.mdx
---
title: "Inference"
description: "Inference is the phase in which a trained machine learning model applies its learned patterns to new, unseen data to make predictions or decisions."
date: "2024-07-12"
author:
name: "Yan Gao"
position: "Research Scientist"
website: "https://discuss.flower.ai/u/yan-gao/"
github: "github.com/yan-gao-GY"
related:
- text: "Federated Learning"
link: "/glossary/federated-learning"
- text: "Server"
link: "/glossary/server"
- text: "Client"
link: "/glossary/client"
---

Inference, also known as model prediction, is the stage in the machine learning workflow where a trained model is used to make predictions on new, unseen data. In a typical machine learning setting, model inference involves the following steps: model loading, where the trained model is loaded into the application or service where it will be used; data preparation, which preprocesses the new data in the same way as the training data; and model prediction, where the prepared data is fed into the model to compute outputs based on the patterns learned during training.

In the context of federated learning (FL), inference can be performed locally on the user's device. A global model produced by the FL process is deployed and loaded on individual nodes (e.g., smartphones, hospital servers) for local inference. This keeps all data on-device, enhancing privacy and reducing latency.
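The three steps above can be sketched for a toy on-device linear model. All weights and normalization statistics here are made up for illustration:

```python
import numpy as np

# "Model loading": materialize the weights received from the FL server
w_global = np.array([0.5, -0.25])
# Normalization statistics saved from training time
mean, std = np.array([10.0, 4.0]), np.array([2.0, 1.0])

def preprocess(raw):
    """Data preparation: apply the same normalization used during training."""
    return (raw - mean) / std

def predict(raw):
    """Model prediction: feed the prepared features through the model."""
    return float(preprocess(raw) @ w_global)

score = predict(np.array([12.0, 5.0]))  # inference on a new, unseen sample
```

Note that the preprocessing must match training exactly; a mismatch here is a common source of silent inference errors.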
24 changes: 24 additions & 0 deletions glossary/medical-ai.mdx
---
title: "Medical AI"
description: "Medical AI involves the application of artificial intelligence technologies to healthcare, enhancing diagnosis, treatment planning, and patient monitoring by analyzing complex medical data."
date: "2024-07-12"
author:
name: "Yan Gao"
position: "Research Scientist"
website: "https://discuss.flower.ai/u/yan-gao/"
github: "github.com/yan-gao-GY"
related:
- text: "Federated Learning"
link: "/glossary/federated-learning"
- text: "Server"
link: "/glossary/server"
- text: "Client"
link: "/glossary/client"
---

Medical AI refers to the application of artificial intelligence technologies, particularly machine learning algorithms, to medical and healthcare-related fields. This includes, but is not limited to, tasks such as disease diagnosis, personalized treatment plans, drug development, medical imaging analysis, and healthcare management. The goal of Medical AI is to enhance healthcare services, improve treatment outcomes, reduce costs, and increase efficiency within healthcare systems.

Federated learning (FL) introduces a novel approach: training machine learning models across multiple decentralized devices or servers that hold local data samples, without exchanging those samples. This is particularly appropriate in the medical field due to the sensitive nature of medical data and strict privacy requirements. It leverages the strength of diverse datasets without compromising patient confidentiality, making it an increasingly popular choice in Medical AI applications.

#### Medical AI in Flower
Flower, a friendly FL framework, is developing a more versatile and privacy-enhancing solution for Medical AI through the use of FL. Please check out the [Flower industry healthcare](flower.ai/industry/healthcare) page for more detailed information.
24 changes: 24 additions & 0 deletions glossary/model-training.mdx
---
title: "Model Training"
description: "Model training is the process of teaching an algorithm to learn from data to make predictions or decisions."
date: "2024-07-12"
author:
name: "Yan Gao"
position: "Research Scientist"
website: "https://discuss.flower.ai/u/yan-gao/"
github: "github.com/yan-gao-GY"
related:
- text: "Federated Learning"
link: "/glossary/federated-learning"
- text: "Server"
link: "/glossary/server"
- text: "Client"
link: "/glossary/client"
---

Model training is a core component of developing machine learning (ML) systems, where an algorithm learns from data to make predictions or decisions. A typical model training process involves several key steps: dataset preparation; feature selection and engineering; choice of model based on the task (e.g., classification, regression); choice of training algorithm (e.g., optimizer); and iterative updating of the model's weights and biases to minimize the loss function, which measures the difference between predicted and actual outcomes on the training data. Traditional ML model training typically involves considerable manual effort, whereas deep learning (DL) offers an end-to-end automated process.

This approach assumes easy access to data and often requires substantial computational resources, depending on the size of the dataset and the complexity of the model. However, much of the data in the real world is distributed and protected due to privacy concerns, making it inaccessible for typical (centralized) model training. Federated learning (FL) moves model training from the data center to users' local devices. After local training, each participant sends only their model's updates (not the data) to a central server for aggregation. The updated global model is then sent back to the participants for further rounds of local training and updates. This way, model training benefits from diverse, real-world data without compromising individual data privacy.
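As a minimal illustration of the loss-minimization loop described above, here is gradient descent fitting a one-parameter linear model. This is a toy example, not tied to any particular framework:

```python
import numpy as np

# Fit y = w * x by gradient descent on the mean squared error (MSE) loss
x = np.array([1.0, 2.0, 3.0])
y = 2.0 * x                      # ground truth: w = 2

w, lr = 0.0, 0.05
for _ in range(100):
    pred = w * x
    grad = 2 * np.mean((pred - y) * x)   # d(MSE)/dw
    w -= lr * grad                       # update the weight to reduce the loss

loss = float(np.mean((w * x - y) ** 2))  # loss shrinks toward zero as w -> 2
```

Each iteration nudges the weight in the direction that most reduces the loss, which is exactly the "iterative updating" step in the process above.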

#### Model training in Flower
Flower, a friendly FL framework, offers a wealth of model training examples and baselines tailored for federated environments. Please refer to the [examples](https://flower.ai/docs/examples/) and [baselines](https://flower.ai/docs/baselines/) documentation for more detailed information.
19 changes: 19 additions & 0 deletions glossary/platform-independence.mdx
---
title: "Platform Independence"
description: "The capability to run programs across different hardware and operating systems."
date: "2024-07-08"
author:
name: "Heng Pan"
position: "Research Scientist"
website: "https://discuss.flower.ai/u/pan-h/summary"
github: "github.com/panh99"
related:
- text: "Federated Learning"
link: "/glossary/federated-learning"
---

Platform independence in federated learning refers to the capability of machine learning systems to operate seamlessly across various hardware and operating system environments. This ensures that the federated learning process can function effectively on various devices with different operating systems such as Windows, Linux, Mac OS, iOS, and Android without requiring platform-specific modifications. By achieving platform independence, federated learning frameworks enable efficient data analysis and model training across heterogeneous edge devices, enhancing scalability and flexibility in distributed machine learning scenarios.

### Platform Independence in Flower

Flower is interoperable with different operating systems and hardware platforms to work well in heterogeneous edge device environments.
31 changes: 31 additions & 0 deletions glossary/protocol-buffers.mdx
---
title: "Protocol Buffers"
description: "Protocol Buffers, often abbreviated as Protobuf, are a language-neutral, platform-neutral, extensible mechanism for serializing structured data, similar to XML but smaller, faster, and simpler."
date: "2024-05-24"
author:
name: "Taner Topal"
position: "Co-Creator and CTO @ Flower Labs"
website: "https://www.linkedin.com/in/tanertopal/"
github: "github.com/tanertopal"
related:
- text: "Federated Learning"
link: "/glossary/federated-learning"
- text: "Tutorial: What is Federated Learning?"
link: "/docs/framework/tutorial-series-what-is-federated-learning.html"
---

### Introduction to Protocol Buffers

Protocol Buffers, often abbreviated as Protobuf, are a language-neutral, platform-neutral, extensible mechanism for serializing structured data, similar to XML but smaller, faster, and simpler. The method involves defining how you want your data to be structured once, then using language-specific generated source code to write and read structured data to and from a variety of data streams.

### How Protocol Buffers Work

Protocol Buffers require a `.proto` file where the data structure (the messages) is defined. This is essentially a schema describing the data to be serialized. Once the `.proto` file is prepared, it is compiled using the Protobuf compiler (`protoc`), which generates data access classes in supported languages like Java, C++, Python, Swift, Kotlin, and more. These classes provide simple accessors for each field (like standard getters and setters) and methods to serialize the entire structure to a binary format that can be easily transmitted over network protocols or written to a file.
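To make this concrete, a small `.proto` schema for shipping a model update might look like the following. This is a hypothetical message for illustration, not Flower's actual protocol definition:

```proto
syntax = "proto3";

package example;

// Hypothetical message carrying one client's trained weights.
message ModelUpdate {
  repeated double weights = 1;   // flattened model parameters
  int64 num_examples = 2;        // examples used for local training
  string client_id = 3;
}
```

Compiling this file with `protoc` (e.g. `protoc --python_out=. model_update.proto`) would generate the data access classes described above.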

### Advantages and Use Cases

The primary advantages of Protocol Buffers include their simplicity, efficiency, and backward compatibility. They are more efficient than XML or JSON because they serialize to a binary format, which makes them both smaller and faster. They support backward compatibility, allowing data structures to be modified without breaking deployed programs that communicate using the protocol. This makes Protobuf an excellent choice for data storage or RPC (Remote Procedure Call) applications where small size, low latency, and schema evolution are critical.

### Protocol Buffers in Flower

In the context of Flower, Protocol Buffers play a crucial role in ensuring efficient and reliable communication between the server and clients. Federated learning involves heterogeneous clients (e.g., servers, mobile devices, edge devices) running different environments and programming languages. This setup requires frequent exchanges of model updates and other metadata between the server and clients. Protocol Buffers, with their efficient binary serialization, enable Flower to handle these exchanges with minimal overhead, ensuring low latency and reducing the bandwidth required for communication. Moreover, the backward compatibility feature of Protobuf allows Flower to evolve and update its communication protocols without disrupting existing deployments. Best of all, Flower users typically do not have to deal directly with Protobuf, as Flower provides language-specific abstractions that simplify interaction with the underlying communication protocols.
22 changes: 22 additions & 0 deletions glossary/scalability.mdx
---
title: "Scalability"
description: "Scalability ensures systems grow with demand. In Federated Learning, it involves efficiently managing dynamic clients and diverse devices. Flower supports large-scale FL on various devices/resources."
date: "2024-05-23"
author:
name: "Daniel Nata Nugraha"
position: "Software Engineer"
image: "daniel_nata_nugraha.png"
website: "https://www.linkedin.com/in/daniel-nugraha/"
github: "github.com/danielnugraha"
related:
- text: "Flower Paper"
link: "https://arxiv.org/pdf/2007.14390"
- text: "Federated Learning"
link: "/glossary/federated-learning"
- text: "Tutorial: What is Federated Learning?"
link: "/docs/framework/tutorial-series-what-is-federated-learning.html"
---

Scalability is the ability of a system, network, or process to accommodate an increasing amount of work. This involves adding resources (like servers) or optimizing existing ones to maintain or enhance performance. There are two main types of scalability: horizontal scalability (adding more nodes, such as servers) and vertical scalability (adding more power to existing nodes, like increasing CPU or RAM). Ideally, a scalable system can do both, seamlessly adapting to increased demands without significant downtime. Scalability is essential for businesses to grow while ensuring services remain reliable and responsive.

Scalability in Federated Learning involves managing dynamic client participation, as clients may join or leave unpredictably. This requires algorithms that adapt to varying availability and efficiently aggregate updates from numerous models. Additionally, scalable federated learning systems must handle heterogeneous client devices with different processing powers, network conditions, and data distributions, ensuring balanced contributions to the global model.

Scalability in Flower means efficiently conducting large-scale federated learning (FL) training and evaluation. Flower enables researchers to launch FL experiments with many clients using reasonable computing resources, such as a single machine or a multi-GPU rack. Flower supports scaling workloads to millions of clients, including diverse devices like Raspberry Pis, Android and iOS mobile devices, laptops, etc. It offers complete control over connection management and includes a virtual client engine for large-scale simulations.
17 changes: 17 additions & 0 deletions glossary/server.mdx
---
title: "Server"
description: "The central entity coordinating the aggregation of local model updates from multiple clients to build a comprehensive, privacy-preserving global model."
date: "2024-07-08"
author:
name: "Heng Pan"
position: "Research Scientist"
website: "https://discuss.flower.ai/u/pan-h/summary"
github: "github.com/panh99"
related:
- text: "Client"
link: "/glossary/client"
- text: "Federated Learning"
link: "/glossary/federated-learning"
---

A server in federated learning plays a pivotal role by managing the distributed training process across various clients. Each client independently trains its local model using its local data and then sends the model updates to the server. The server aggregates the received updates to create a new global model, which is subsequently sent back to the clients. This iterative process allows the global model to improve over time without the need for the clients to share their raw data, ensuring data privacy and minimizing data transfer.
