feat(baselines) Add FedRep Baseline #3790
Conversation
Hi, @jafermarq. It seems FedRep is working normally again. My laptop doesn't have enough resources to run all the experiments. I think this PR is ready again.
Hi @KarhouTam, thanks for the update. I re-ran the experiments but they perform similarly to when I did my last review (so they are worse than earlier in the PR thread). I looked into it and I'm happy to say that I found the issue 😆. Before using flower-dataset, the test set of CIFAR-10/100 was not being used at all. It seems the only reason the accuracy of some experiments (mostly CIFAR-10 w/ FedRep) appears lower is simply that the trained model was not exposed to a wider evaluation set. See below the results I got with the code as in the last commit and after applying the changes I propose below.

[results image]

> Note: Top row is …

What to change?
I've done it like this:

```python
def get_client_fn_simulation(...):
    ...
    # new partitioner for the test split
    partitioner_test = PathologicalPartitioner(<same-args-as-the-existing-one>)
    ...
    FEDERATED_DATASET = FederatedDataset(
        dataset=config.dataset.name.lower(),
        partitioners={"train": partitioner_train, "test": partitioner_test},
    )
    ...

    def client_fn(cid: str) -> Client:
        ...
        trainset = partition_train_test["train"].with_transform(apply_train_transforms)
        valset = partition_train_test["test"].with_transform(apply_test_transforms)  # just like before
        #! Now extract the test partition and fuse validation and test into a single set
        testset = FEDERATED_DATASET.load_partition(cid_use, split="test")
        testset = testset.with_transform(apply_test_transforms)
        testset = concatenate_datasets([valset, testset])
        # rest is the same
```

Finally, I also had to make a simple change to your `evaluate_metrics_aggregation_fn`:

```python
def evaluate_metrics_aggregation_fn(eval_metrics):
    weights, accuracies = [], []
    for num_examples, metric in eval_metrics:
        weights.append(num_examples)
        accuracies.append(metric["accuracy"] * num_examples)
    accuracy = sum(accuracies) / sum(weights)
    return {"accuracy": accuracy}
```

I thought of presenting my changes here in a comment instead of as comments on the code. Let me know what you think about this finding of now including the test set.
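As a sanity check, the aggregation function above can be exercised standalone. The toy client metrics below are made up for illustration (they are not from the actual experiments):

```python
# Self-contained sketch of the weighted-average accuracy aggregation above.
# Each entry of `eval_metrics` is (num_examples, {"accuracy": ...}), which is
# the shape Flower hands to an evaluate-metrics aggregation function.

def evaluate_metrics_aggregation_fn(eval_metrics):
    weights, accuracies = [], []
    for num_examples, metric in eval_metrics:
        weights.append(num_examples)
        accuracies.append(metric["accuracy"] * num_examples)
    return {"accuracy": sum(accuracies) / sum(weights)}

# Two hypothetical clients: 100 examples at 1.0 accuracy, 300 at 0.5.
# Weighted average: (100*1.0 + 300*0.5) / 400 = 0.625.
result = evaluate_metrics_aggregation_fn([(100, {"accuracy": 1.0}), (300, {"accuracy": 0.5})])
print(result)  # {'accuracy': 0.625}
```

The larger client dominates, which is the point of weighting by `num_examples` instead of taking a plain mean (a plain mean would give 0.75 here).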
Hi, @jafermarq. Thanks for running the extra experiments; your analysis is really helpful 🚀. I've fixed the code and followed most of your suggestions. Would you mind running the experiments again for testing? If so, that would be a great help.
@KarhouTam, actually reusing the same partitioner isn't possible since it can only be linked to one dataset. A super simplified test of this is the following:

```python
from flwr_datasets import FederatedDataset
from flwr_datasets.partitioner import IidPartitioner

partitioner = IidPartitioner(num_partitions=10)
fds = FederatedDataset(dataset="mnist", partitioners={"test": partitioner, "train": partitioner})

# Load the first partition of each split
iid_partition = fds.load_partition(partition_id=0, split="test")
print(iid_partition)
iid_partition = fds.load_partition(partition_id=0, split="train")
print(iid_partition)
```

this gets printed:

```
Dataset({
    features: ['image', 'label'],
    num_rows: 1000
})
Dataset({
    features: ['image', 'label'],
    num_rows: 1000  # <------------- clearly this isn't coming from the training set (which has 60K images)
})
```

@KarhouTam, in the meantime: is it OK if we have that "duplicated" partitioner? Maybe do a deep copy, if you prefer, to keep the code cleaner, and leave a comment?
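The deep-copy suggestion can be illustrated without flwr_datasets at all. `TinyPartitioner` below is a hypothetical stand-in that only mimics the relevant behavior (a partitioner gets linked to a dataset once):

```python
import copy

# Hypothetical stand-in for a partitioner: like the real one, it can be
# linked to a dataset only once.
class TinyPartitioner:
    def __init__(self, num_partitions):
        self.num_partitions = num_partitions
        self._dataset = None

    def assign(self, dataset):
        if self._dataset is None:  # first assignment "wins"
            self._dataset = dataset

partitioner = TinyPartitioner(num_partitions=10)

# Reusing the same object for both splits: both keys point at ONE partitioner,
# so whichever split is assigned first is the only one that takes effect.
shared = {"train": partitioner, "test": partitioner}
print(shared["train"] is shared["test"])  # True

# Deep-copying gives each split its own independent partitioner.
separate = {"train": partitioner, "test": copy.deepcopy(partitioner)}
print(separate["train"] is separate["test"])  # False
```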
@jafermarq, sorry, I didn't check the internals of the partitioner. As you mentioned, I've set different partitioners for the dataset splits.
@jafermarq I'll open a PR that checks whether the same partitioner object is specified for multiple splits in the partitioners dict. It's a bug (I hadn't thought about this problem): the partitioning requested later does not happen; it reuses the already assigned dataset.
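A minimal sketch of such a check (the function name and error message are made up; the real PR may implement it differently):

```python
# Hypothetical duplicate-instance check: reject a `partitioners` dict in which
# the same partitioner *object* backs more than one split.
def check_partitioners_unique(partitioners):
    seen = {}
    for split, partitioner in partitioners.items():
        if id(partitioner) in seen:
            raise ValueError(
                f"Splits '{seen[id(partitioner)]}' and '{split}' share one "
                f"partitioner instance; create a separate instance per split."
            )
        seen[id(partitioner)] = split

shared = object()  # stand-in for a partitioner instance
try:
    check_partitioners_unique({"train": shared, "test": shared})
    caught = False
except ValueError:
    caught = True
print(caught)  # True: the shared instance was detected

check_partitioners_unique({"train": object(), "test": object()})  # no error
```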
@jafermarq @KarhouTam
Thanks @adam-narozniak! I'm seeing good results with the change Adam proposed.
@adam-narozniak Thanks for the clean …

@jafermarq Well, I think it is fine to blend the data from different splits (original …). Anyway, I agree to use the …
Hi, @jafermarq. The change has been done. I've tested the algorithm with …
Amazing job @KarhouTam! 💯
Issue
Description
Add baseline FedRep to flower-baseline.
Reproduce experiments on CIFAR-10 and CIFAR-100.
Related issues/PRs
Related issue #3788
Checklist