Malicious clients simulation #78

Merged
merged 7 commits into from
Sep 18, 2024

Conversation

gautamjajoo
Collaborator

No description provided.

Contributor

@tremblerz tremblerz left a comment


I wonder if we should create a deeper layer of abstraction in sys_config, where each node is assigned its own algorithm configuration (the algorithm it will run). We could then create an algorithm, and a corresponding algorithm config file, for malicious nodes to run.

Obviously, to avoid cumbersome manual assignment of the same algorithm to hundreds of nodes, this configuration method would need another parameter layered on top that automatically assigns an algorithm to each node. I have already done something similar for GPU assignment in get_device_ids: https://github.com/aidecentralized/sonar/blob/main/src/configs/sys_config.py#L203
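A minimal sketch of what such a layered parameter could look like, assuming node ids run from 1 to num_nodes. The function name `assign_algorithms`, the `algo_counts` parameter, and the `free_rider` algorithm name are hypothetical and are not taken from the repository; this is not how get_device_ids itself works.

```python
from typing import Dict


def assign_algorithms(
    num_nodes: int, algo_counts: Dict[str, int], default_algo: str
) -> Dict[int, str]:
    """Expand a compact {algorithm: count} spec into a per-node map.

    Hypothetical helper: nodes are numbered 1..num_nodes, the first nodes
    receive the algorithms listed in algo_counts, and every remaining node
    falls back to default_algo.
    """
    requested = sum(algo_counts.values())
    if requested > num_nodes:
        raise ValueError(
            f"{requested} nodes requested but only {num_nodes} exist"
        )

    assignment: Dict[int, str] = {}
    node_id = 1
    for algo, count in algo_counts.items():
        for _ in range(count):
            assignment[node_id] = algo
            node_id += 1

    # Everything not explicitly requested runs the default algorithm.
    for remaining_id in range(node_id, num_nodes + 1):
        assignment[remaining_id] = default_algo
    return assignment


node_algos = assign_algorithms(
    num_nodes=5,
    algo_counts={"malicious_traditional_fl": 1, "free_rider": 1},
    default_algo="traditional_fl",
)
```

Because the expansion is deterministic, every process that reads the same config derives the same map without any shared random state.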

Advantages --

  1. Each node can run a different algorithm, so we can have heterogeneous mixes, such as one malicious node and one free-rider node.
  2. The current implementation is risky because it relies on the random function producing exactly the same results on every node.
  3. This abstraction layer/API will work much better when we actually run these things in a more decentralized manner, where someone starts n nodes of one type and someone else starts m nodes of another type.

Disadvantages --

  1. It does not make sense to do this if we do not anticipate many use cases with a mix of different kinds of nodes.
  2. Our data partitioning system also relies on the random function behaving exactly the same way on all the nodes.
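To illustrate the dependence on the random function mentioned in both lists, here is a hedged sketch of one way to keep the selection reproducible: each node builds its own private generator from a shared seed instead of using the global random state. The function name `pick_malicious_nodes` and the `seed` parameter are hypothetical. This assumes all nodes run the same Python version, since the sampling routine is not guaranteed to be identical across versions.

```python
import random
from typing import List


def pick_malicious_nodes(num_nodes: int, num_malicious: int, seed: int) -> List[int]:
    """Choose which node ids (1..num_nodes) act maliciously.

    A private random.Random instance is used so that other code calling
    the global random functions cannot change the outcome.
    """
    rng = random.Random(seed)
    return sorted(rng.sample(range(1, num_nodes + 1), num_malicious))


# Two independent calls with the same seed agree, even if the global
# generator is touched in between.
first = pick_malicious_nodes(num_nodes=10, num_malicious=2, seed=42)
random.random()  # unrelated use of the global generator
second = pick_malicious_nodes(num_nodes=10, num_malicious=2, seed=42)
```

This narrows the assumption to "same seed and same Python version" but does not remove it, which is the concern an explicit per-node assignment would avoid.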

@@ -30,6 +30,7 @@
"model_lr": 3e-4,
"batch_size": 256,
"exp_keys": [],
"num_malicious_clients": 2,
Contributor


Please use the word "nodes" everywhere, not "clients".

src/scheduler.py Outdated
@@ -87,6 +94,21 @@ def merge_configs(self):
self.config.update(self.sys_config)
self.config.update(self.algo_config)

def malicious_simulation(self):
num_clients = self.config.get("num_users", 0)
Contributor


num_users

@@ -1,7 +1,8 @@
from typing import Any, Dict, List
import jmespath
import importlib

from utils.node_map import NodeMap
Contributor


Why are these two getting imported?

@@ -30,6 +30,7 @@
"model_lr": 3e-4,
"batch_size": 256,
"exp_keys": [],
"num_malicious_clients": 2,
Contributor


This should go in sys_config.py

@gautamjajoo
Collaborator Author

I understand your point, and it seems like a better solution. But as you mentioned, if we do not plan to run with a mix of different kinds of nodes, we will lose time here.

I will start digging and check whether running with a mix is feasible.

@gautamjajoo
Collaborator Author

Now here is the structure:

  1. sys_config assigns each node either traditional_fl or malicious_traditional_fl. The assignment is random and is stored as a map from each node to its algo type.
  2. Inside algo_config, each node assigned malicious_traditional_fl is randomly given one of the malicious types; a node assigned traditional_fl works in its usual way.
  3. The different malicious types are defined in malicious_config.py.
  4. Also introduced a hex UUID for naming the expt folder inside the expt_dump directory.
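The structure above can be sketched roughly as follows. This is an illustration under stated assumptions, not the merged code: the function names, the `malicious_type` key, and the placeholder type names `type_a` and `type_b` are hypothetical (the real types live in malicious_config.py), and node ids are assumed to run from 1 to num_nodes.

```python
import os
import random
import uuid
from typing import Any, Dict, List


def build_node_algo_map(
    num_nodes: int, num_malicious: int, rng: random.Random
) -> Dict[int, str]:
    """Step 1: randomly mark num_malicious nodes as malicious_traditional_fl."""
    malicious_ids = set(rng.sample(range(1, num_nodes + 1), num_malicious))
    return {
        node_id: (
            "malicious_traditional_fl"
            if node_id in malicious_ids
            else "traditional_fl"
        )
        for node_id in range(1, num_nodes + 1)
    }


def build_algo_configs(
    node_algo_map: Dict[int, str], malicious_types: List[str], rng: random.Random
) -> Dict[int, Dict[str, Any]]:
    """Step 2: give every malicious node one randomly chosen malicious type."""
    configs: Dict[int, Dict[str, Any]] = {}
    for node_id, algo in node_algo_map.items():
        config: Dict[str, Any] = {"algo": algo}
        if algo == "malicious_traditional_fl":
            config["malicious_type"] = rng.choice(malicious_types)
        configs[node_id] = config
    return configs


def new_experiment_dir(base_dir: str = "expt_dump") -> str:
    """Step 4: build a unique folder path from a 32-character hex UUID."""
    return os.path.join(base_dir, uuid.uuid4().hex)


rng = random.Random(0)  # fixed seed only to make the sketch repeatable
node_algo_map = build_node_algo_map(num_nodes=10, num_malicious=3, rng=rng)
algo_configs = build_algo_configs(node_algo_map, ["type_a", "type_b"], rng)
experiment_dir = new_experiment_dir()  # path only; nothing is created on disk
```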

@tremblerz tremblerz merged commit 40e6a04 into main Sep 18, 2024
0 of 2 checks passed
@gautamjajoo gautamjajoo deleted the malicious_simulation branch September 20, 2024 14:21