diff --git a/README.md b/README.md index a5fe478c0e..bf091f58fd 100644 --- a/README.md +++ b/README.md @@ -9,7 +9,6 @@ -[DOCS](./doc) | [中文](./README_zh.md) FATE (Federated AI Technology Enabler) is the world's first industrial grade federated learning open source framework to enable enterprises and institutions to collaborate on data while protecting data security and privacy. It implements secure computation protocols based on homomorphic encryption and multi-party computation (MPC). @@ -36,8 +35,23 @@ Deploying FATE to multiple nodes to achieve scalability, reliability and managea - [Cluster deployment by CLI](./deploy/cluster-deploy): Using CLI to deploy a FATE cluster. ### Quick Start -- [Training Demo With Installing FATE AND FATE-Flow From Pypi](doc/2.0/fate/quick_start.md) - [Training Demo With Installing FATE Only From Pypi](doc/2.0/fate/ml) +- [Training Demo With Installing FATE AND FATE-Flow From Pypi](doc/2.0/fate/quick_start.md) + +### More examples +- [ML examples](examples/launchers) +- [PipeLine examples](examples/pipeline) + +## Documentation + +### FATE Design +- [Architecture](./doc/architecture/README.md): Building Unified and Standardized API for Heterogeneous Computing Engines Interconnection +- [FATE Algorithm Components](./doc/2.0/fate/components/README.md): Building Standardized Algorithm Components for different Scheduling Engines +- [OSX (Open Site Exchange)](./doc/2.0/osx/osx.md): Building Open Platform for Cross-Site Communication Interconnection +- [FATE-Flow](https://github.com/FederatedAI/FATE-Flow/blob/main/doc/fate_flow.md): Building Open and Standardized Scheduling Platform for Scheduling Interconnection +- [PipeLine Design](https://github.com/FederatedAI/FATE-Client/blob/main/doc/pipeline.md): Building Scalable Federated DSL for Application Layer Interconnection And Providing Tools For Fast Federated Modeling +- [RoadMap](./doc/images/roadmap.png) +- [Paper & Conference](./doc/resources/README.md) ## Related Repositories 
(Projects) - [KubeFATE](https://github.com/FederatedAI/KubeFATE): An operational tool for the FATE platform using cloud native technologies such as containers and Kubernetes. diff --git a/README_zh.md b/README_zh.md deleted file mode 100644 index f632fae216..0000000000 --- a/README_zh.md +++ /dev/null @@ -1,69 +0,0 @@ -[![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](https://opensource.org/licenses/Apache-2.0) [![CodeStyle](https://img.shields.io/badge/Check%20Style-Google-brightgreen)](https://checkstyle.sourceforge.io/google_style.html) [![Style](https://img.shields.io/badge/Check%20Style-Black-black)](https://checkstyle.sourceforge.io/google_style.html) [![Build Status](https://travis-ci.org/FederatedAI/FATE.svg?branch=master)](https://travis-ci.org/FederatedAI/FATE) -[![codecov](https://codecov.io/gh/FederatedAI/FATE/branch/master/graph/badge.svg)](https://codecov.io/gh/FederatedAI/FATE) -[![Documentation Status](https://readthedocs.org/projects/fate/badge/?version=latest)](https://fate.readthedocs.io/en/latest/?badge=latest) -[![Gitpod Ready-to-Code](https://img.shields.io/badge/Gitpod-Ready--to--Code-blue?logo=gitpod)](https://gitpod.io/from-referrer/) -[![CII Best Practices](https://bestpractices.coreinfrastructure.org/projects/6308/badge)](https://bestpractices.coreinfrastructure.org/projects/6308) - -
- -
- -[DOC](./doc) | [Quick Start](doc/tutorial/pipeline/pipeline_tutorial_hetero_sbt.ipynb) | [English](./README.md) - -FATE (Federated AI Technology Enabler) is the world's first industrial-grade open-source federated learning framework. It lets enterprises and institutions collaborate on data while protecting data security and privacy. -The FATE project builds its underlying secure computation protocols on multi-party computation (MPC) and homomorphic encryption (HE), and uses them to support secure computation for many kinds of machine learning, including logistic regression, tree-based algorithms, deep learning, and transfer learning. -FATE was first open-sourced in February 2019, and the -[FATE TSC](https://github.com/FederatedAI/FATE-Community/blob/master/FATE_Project_Technical_Charter.pdf) -was established to govern the FATE community as an open-source project; its members include major domestic cloud computing and financial services companies. - - - -## Tutorials - -### Versions before 2.0 -Releases of FATE before 2.0 are listed on the [releases page](https://github.com/FederatedAI/FATE/releases), and a summary of download resources is on the [wiki](https://github.com/FederatedAI/FATE/wiki/Download) - -### Version 2.0.0 -#### Standalone deployment -Deploy FATE standalone on a single node. Three methods are supported: direct installation from PyPI, Docker, and a host installation package. -- [Standalone deployment tutorial](./deploy/standalone-deploy) -#### Cluster -- [Native cluster installation](./deploy/cluster-deploy): Using CLI to deploy a FATE cluster. - -### Quick Start -- [Install FATE and FATE-Flow from PyPI and run a training job example](doc/2.0/fate/quick_start.md) -- [Install FATE from PyPI and run a training job example](doc/2.0/fate/ml) - -## Related Repositories -- [KubeFATE](https://github.com/FederatedAI/KubeFATE) -- [FATE-Flow](https://github.com/FederatedAI/FATE-Flow) -- [FATE-Board](https://github.com/FederatedAI/FATE-Board) -- [FATE-Serving](https://github.com/FederatedAI/FATE-Serving) -- [FATE-Cloud](https://github.com/FederatedAI/FATE-Cloud) -- [FedVision](https://github.com/FederatedAI/FedVision) -- [EggRoll](https://github.com/WeBankFinTech/eggroll) -- [AnsibleFATE](https://github.com/FederatedAI/AnsibleFATE) -- [FATE-Builder](https://github.com/FederatedAI/FATE-Builder) -- [FATE-Client](https://github.com/FederatedAI/FATE-Client) -- [FATE-Test](https://github.com/FederatedAI/FATE-Test) - -## Community Governance - -The [FATE-Community](https://github.com/FederatedAI/FATE-Community) repository contains documents on past community collaboration, communication, meetings, charters, and so on. - -- [GOVERNANCE.md](https://github.com/FederatedAI/FATE-Community/blob/master/GOVERNANCE.md) -- [Minutes](https://github.com/FederatedAI/FATE-Community/blob/master/meeting-minutes) -- [Development Process 
Guidelines](https://github.com/FederatedAI/FATE-Community/blob/master/FederatedAI_PROJECT_PROCESS_GUIDELINE.md) -- [Security Release Process](https://github.com/FederatedAI/FATE-Community/blob/master/SECURITY.md) - -## Learn More - -- [Fate-FedAI Group IO](https://groups.io/g/Fate-FedAI) -- [FAQ](https://github.com/FederatedAI/FATE/wiki) -- [issues](https://github.com/FederatedAI/FATE/issues) -- [pull requests](https://github.com/FederatedAI/FATE/pulls) -- [Twitter: @FATEFedAI](https://twitter.com/FateFedAI) - - -## License -[Apache License 2.0](LICENSE) diff --git a/RELEASE.md b/RELEASE.md index 85d9a85d0b..b09def23f3 100644 --- a/RELEASE.md +++ b/RELEASE.md @@ -34,7 +34,7 @@ > Algorithm Performance Improvements: (Comparison with FATE-v1.11.*) * PSI (Privacy Set Intersection): tested on a dataset of 100 million with an intersection result of 100 million, 1.8+ times of FATE-v1.11.4 * Hetero-SSHE-LR: tested on data of guest 10w * 30 dimensions and host 10w * 300 dimensions, 4.3+ times of FATE-v1.11.4 -* Hetero-NN(Based on FedPass Protocol): tested on data of guest 10w * 30 dimensions and host 10w * 300 dimensions, basically consistent with the plaintext performance, 56+ times of FATE-v1.11.4 +* Hetero-NN(Based on FedPass Protocol): tested on data of guest 10w * 30 dimensions and host 10w * 300 dimensions, basically consistent with the plaintext performance, 143+ times of FATE-v1.11.4 * Hetero-Coordinated-LR: tested on data of guest 10w * 30 dimensions and host 10w * 300 dimensions, 1.2+ times of FATE-v1.11.4 * Hetero-Feature-Binning: tested on data of guest 10w * 30 dimensions and host 10w * 300 dimensions, 1.5+ times of FATE-v1.11.4 diff --git a/deploy/cluster-deploy/allinone/fate-exchange_deployment_guide.zh.md b/deploy/cluster-deploy/allinone/fate-exchange_deployment_guide.zh.md index 6225fdf4b3..2391dfc443 100644 --- a/deploy/cluster-deploy/allinone/fate-exchange_deployment_guide.zh.md +++ b/deploy/cluster-deploy/allinone/fate-exchange_deployment_guide.zh.md @@ -316,7 
+316,7 @@ EOF **OSX module of each party that needs to connect to the exchange; modify as the app user** -Modify the relevant part of /data/projects/fate/osx/conf/broker/route_table.json so that the default route points to the deployed exchange; the peer party's fateflow information does not need to be configured. Restart osx after the change: +Modify the relevant part of /data/projects/fate/osx/conf/broker/route_table.json so that the default route points to the deployed exchange, and configure the local party's fateflow and related information; the peer party's rollsite or osx information does not need to be configured. Restart osx after the change: ``` "default": { @@ -328,6 +328,35 @@ EOF ] } ``` +For example, for party 9999, i.e. the target server (192.168.0.2), configure /data/projects/fate/osx/conf/broker/route_table.json as: + +``` +{ + "route_table": { + "default": { + "default": [ + { + "ip": "192.168.0.1", + "port": 9370 + } + ] + }, + "9999": { + "default": [{ + "ip": "rollsite", + "port": 9370 + }], + "fateflow": [{ + "ip": "fateflow", + "port": 9360 + }] + } + }, + "permission": { + "default_allow": true + } +} +``` diff --git a/deploy/standalone-deploy/README.zh.md b/deploy/standalone-deploy/README.zh.md index b86b7eda7f..504c4ffbb2 100644 --- a/deploy/standalone-deploy/README.zh.md +++ b/deploy/standalone-deploy/README.zh.md @@ -41,7 +41,7 @@ pip install fate_client[fate,fate_flow]==2.0.0 #### 2.2.1.2 Service initialization ```shell fate_flow init --ip 127.0.0.1 --port 9380 --home $HOME_DIR -pipeline --ip 127.0.0.1 --port 9380 +pipeline init --ip 127.0.0.1 --port 9380 ``` - `ip`: the IP the service runs on - `port`: the HTTP port the service runs on diff --git a/doc/2.0/fate/ml/hetero_nn_tutorial.md b/doc/2.0/fate/ml/hetero_nn_tutorial.md index f5409e2fd6..e072ac74b6 100644 --- a/doc/2.0/fate/ml/hetero_nn_tutorial.md +++ b/doc/2.0/fate/ml/hetero_nn_tutorial.md @@ -234,10 +234,13 @@ def run(ctx): ds, model, optimizer, loss, args = get_setting(ctx) trainer = train(ctx, ds, model, optimizer, loss, args) pred = predict(trainer, ds) - # compute auc here - from sklearn.metrics import roc_auc_score - print('auc is') - print(roc_auc_score(pred.label_ids, pred.predictions)) + if ctx.is_on_guest: + # print("pred:", pred) + # compute auc here + from sklearn.metrics import roc_auc_score + print('auc is') + print(roc_auc_score(pred.label_ids, pred.predictions)) + if __name__ == '__main__': from 
fate.arch.launchers.multiprocess_launcher import launch diff --git a/doc/2.0/fate/performance.md b/doc/2.0/fate/performance.md index ac805688c8..618c95abe3 100644 --- a/doc/2.0/fate/performance.md +++ b/doc/2.0/fate/performance.md @@ -18,6 +18,6 @@ Testing configuration: | ------------------------| ---------------------------------------- | --------------------------- | ------------------ | | PSI | 50m54s | 1h32m20s | 1.8x+ | | Hetero-SSHE-LR | 4m54s/epoch | 21m03s/epoch | 4.3x+ | -| Hetero-NN | 52.5s/epoch(based on FedPass protocol) | 2940s/epoch | 56x+ | +| Hetero-NN | 2.425s/epoch(based on FedPass protocol) | 347s/epoch | 143x+ | | Hetero-Coordinated-LR | 2m16s/epoch | 2m41s/epoch | 1.2x+ | | Hetero-Feature-Binning | 1m08s | 1m45s | 1.5x+ | diff --git a/doc/architecture/README.md b/doc/architecture/README.md new file mode 100644 index 0000000000..569fdc7157 --- /dev/null +++ b/doc/architecture/README.md @@ -0,0 +1,18 @@ +### FATE InterOp Goal +![](../images/interop_goal.png) + +### FATE InterOp Principles +![](../images/interop-principles.png) + +### FATE Overall Architecture + +![](../images/fate_arch.png) + +### FATE Core Architecture +![](../images/fate-core-arch.png) + + + + + + diff --git a/doc/images/fate-core-arch.png b/doc/images/fate-core-arch.png new file mode 100644 index 0000000000..33ddde8a7d Binary files /dev/null and b/doc/images/fate-core-arch.png differ diff --git a/doc/images/fate_arch.png b/doc/images/fate_arch.png new file mode 100644 index 0000000000..6c64976e8f Binary files /dev/null and b/doc/images/fate_arch.png differ diff --git a/doc/images/interop-principles.png b/doc/images/interop-principles.png new file mode 100644 index 0000000000..ee6d82408a Binary files /dev/null and b/doc/images/interop-principles.png differ diff --git a/doc/images/interop_goal.png b/doc/images/interop_goal.png new file mode 100644 index 0000000000..114fbb2850 Binary files /dev/null and b/doc/images/interop_goal.png differ diff --git a/doc/images/roadmap.png 
b/doc/images/roadmap.png new file mode 100644 index 0000000000..fcaca281a1 Binary files /dev/null and b/doc/images/roadmap.png differ diff --git a/doc/resources/GDPR_Data_Shortage_and_AI-AAAI_2019_PPT.pdf b/doc/resources/GDPR_Data_Shortage_and_AI-AAAI_2019_PPT.pdf new file mode 100644 index 0000000000..ff52de0b44 Binary files /dev/null and b/doc/resources/GDPR_Data_Shortage_and_AI-AAAI_2019_PPT.pdf differ diff --git a/doc/resources/README.md b/doc/resources/README.md new file mode 100644 index 0000000000..9ea7481568 --- /dev/null +++ b/doc/resources/README.md @@ -0,0 +1,25 @@ +# Materials + +[中文](./README.zh.md) + +## Speech and Conference + +- [2021. Professor Yang's Seminar Presentation: An Introduction to Federated Learning (2021)](杨强教授:2021联邦学习专题研讨会.pdf) +- [2019. SecureBoost-ijcai2019-workshop](SecureBoost-ijcai2019-workshop.pdf) +- [2019. GDPR_Data_Shortage_and_AI-AAAI_2019_PPT](GDPR_Data_Shortage_and_AI-AAAI_2019_PPT.pdf) + + +## Paper +1. Yang Q, Liu Y, Chen T, et al. Federated machine learning: Concept and applications[J]. ACM Transactions on Intelligent Systems and Technology (TIST), 2019, 10(2): 1-19. +2. Liu Y, Fan T, Chen T, et al. FATE: An industrial grade platform for collaborative learning with data protection[J]. Journal of Machine Learning Research, 2021, 22(226): 1-6. +3. Cheng K, Fan T, Jin Y, et al. Secureboost: A lossless federated learning framework[J]. IEEE Intelligent Systems, 2021. +4. Chen W, Ma G, Fan T, et al. SecureBoost+: A High Performance Gradient Boosting Tree Framework for Large Scale Vertical Federated Learning[J]. arXiv preprint arXiv:2110.10927, 2021. +5. Zhang Q, Wang C, Wu H, et al. GELU-Net: A Globally Encrypted, Locally Unencrypted Deep Neural Network for Privacy-Preserved Learning[C]//IJCAI. 2018: 3933-3939. +6. Zhang Y, Zhu H. Additively Homomorphical Encryption based Deep Neural Network for Asymmetrically Collaborative Machine Learning[J]. arXiv preprint arXiv:2007.06849, 2020. +7. Yang K, Fan T, Chen T, et al. 
A quasi-newton method based vertical federated learning framework for logistic regression[J]. arXiv preprint arXiv:1912.00513, 2019. +8. Hardy S, Henecka W, Ivey-Law H, et al. Private federated learning on vertically partitioned data via entity resolution and additively homomorphic encryption[J]. arXiv preprint arXiv:1711.10677, 2017. +9. Chen C, Zhou J, Wang L, et al. When homomorphic encryption marries secret sharing: Secure large-scale sparse logistic regression and applications in risk control[C]//Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining. 2021: 2652-2662. +10. Gu H, Luo J, Kang Y, et al. FedPass: Privacy-Preserving Vertical Federated Deep Learning with Adaptive Obfuscation[J]. arXiv preprint arXiv:2301.12623, 2023. + + + diff --git a/doc/resources/README.zh.md b/doc/resources/README.zh.md new file mode 100644 index 0000000000..6b075c2c1d --- /dev/null +++ b/doc/resources/README.zh.md @@ -0,0 +1,25 @@ +# Materials + +[English](./README.md) + +## Speeches & Conferences +- [2021. Professor Qiang Yang: 2021 Federated Learning Seminar](杨强教授:2021联邦学习专题研讨会.pdf) +- [2019. SecureBoost-ijcai2019-workshop](SecureBoost-ijcai2019-workshop.pdf) +- [2019. GDPR_Data_Shortage_and_AI-AAAI_2019_PPT](GDPR_Data_Shortage_and_AI-AAAI_2019_PPT.pdf) + +## Papers +1. Yang Q, Liu Y, Chen T, et al. Federated machine learning: Concept and applications[J]. ACM Transactions on Intelligent Systems and Technology (TIST), 2019, 10(2): 1-19. +2. Liu Y, Fan T, Chen T, et al. FATE: An industrial grade platform for collaborative learning with data protection[J]. Journal of Machine Learning Research, 2021, 22(226): 1-6. +3. Cheng K, Fan T, Jin Y, et al. Secureboost: A lossless federated learning framework[J]. IEEE Intelligent Systems, 2021. +4. Chen W, Ma G, Fan T, et al. SecureBoost+: A High Performance Gradient Boosting Tree Framework for Large Scale Vertical Federated Learning[J]. arXiv preprint arXiv:2110.10927, 2021. +5. Zhang Q, Wang C, Wu H, et al. 
GELU-Net: A Globally Encrypted, Locally Unencrypted Deep Neural Network for Privacy-Preserved Learning[C]//IJCAI. 2018: 3933-3939. +6. Zhang Y, Zhu H. Additively Homomorphical Encryption based Deep Neural Network for Asymmetrically Collaborative Machine Learning[J]. arXiv preprint arXiv:2007.06849, 2020. +7. Yang K, Fan T, Chen T, et al. A quasi-newton method based vertical federated learning framework for logistic regression[J]. arXiv preprint arXiv:1912.00513, 2019. +8. Hardy S, Henecka W, Ivey-Law H, et al. Private federated learning on vertically partitioned data via entity resolution and additively homomorphic encryption[J]. arXiv preprint arXiv:1711.10677, 2017. +9. Chen C, Zhou J, Wang L, et al. When homomorphic encryption marries secret sharing: Secure large-scale sparse logistic regression and applications in risk control[C]//Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining. 2021: 2652-2662. +10. Gu H, Luo J, Kang Y, et al. FedPass: Privacy-Preserving Vertical Federated Deep Learning with Adaptive Obfuscation[J]. arXiv preprint arXiv:2301.12623, 2023. 
+ + + + + diff --git a/doc/resources/SecureBoost-ijcai2019-workshop.pdf b/doc/resources/SecureBoost-ijcai2019-workshop.pdf new file mode 100644 index 0000000000..f30ce4b227 Binary files /dev/null and b/doc/resources/SecureBoost-ijcai2019-workshop.pdf differ diff --git "a/doc/resources/\346\235\250\345\274\272\346\225\231\346\216\210\357\274\2322021\350\201\224\351\202\246\345\255\246\344\271\240\344\270\223\351\242\230\347\240\224\350\256\250\344\274\232.pdf" "b/doc/resources/\346\235\250\345\274\272\346\225\231\346\216\210\357\274\2322021\350\201\224\351\202\246\345\255\246\344\271\240\344\270\223\351\242\230\347\240\224\350\256\250\344\274\232.pdf" new file mode 100644 index 0000000000..487454db1c Binary files /dev/null and "b/doc/resources/\346\235\250\345\274\272\346\225\231\346\216\210\357\274\2322021\350\201\224\351\202\246\345\255\246\344\271\240\344\270\223\351\242\230\347\240\224\350\256\250\344\274\232.pdf" differ diff --git a/examples/launchers/fedpass_nn_launcher.py b/examples/launchers/fedpass_nn_launcher.py new file mode 100644 index 0000000000..6a95c64c8b --- /dev/null +++ b/examples/launchers/fedpass_nn_launcher.py @@ -0,0 +1,211 @@ +import torch as t + +from fate.arch import Context +from fate.ml.nn.hetero.hetero_nn import HeteroNNTrainerGuest, HeteroNNTrainerHost, TrainingArguments +from fate.ml.nn.model_zoo.hetero_nn_model import FedPassArgument, TopModelStrategyArguments +from fate.ml.nn.model_zoo.hetero_nn_model import HeteroNNModelGuest, HeteroNNModelHost + + +def train(ctx: Context, + dataset=None, + model=None, + optimizer=None, + loss_func=None, + args: TrainingArguments = None, + ): + if ctx.is_on_guest: + trainer = HeteroNNTrainerGuest(ctx=ctx, + model=model, + train_set=dataset, + optimizer=optimizer, + loss_fn=loss_func, + training_args=args + ) + else: + trainer = HeteroNNTrainerHost(ctx=ctx, + model=model, + train_set=dataset, + optimizer=optimizer, + training_args=args + ) + + trainer.train() + return trainer + + +def 
predict(trainer, dataset): + return trainer.predict(dataset) + + +def get_setting(ctx): + import torchvision + + # define model + from torch import nn + from torch.nn import init + + class ConvBlock(nn.Module): + def __init__(self, in_channels, out_channels, kernel_size, stride=1, padding=0, bias=True, norm_type=None, + relu=False): + super().__init__() + + self.conv = nn.Conv2d(in_channels, out_channels, kernel_size, stride, padding, bias=bias) + self.norm_type = norm_type + + if self.norm_type: + if self.norm_type == 'bn': + self.bn = nn.BatchNorm2d(out_channels) + elif self.norm_type == 'gn': + self.bn = nn.GroupNorm(out_channels // 16, out_channels) + elif self.norm_type == 'in': + self.bn = nn.InstanceNorm2d(out_channels) + else: + raise ValueError("Wrong norm_type") + else: + self.bn = None + + if relu: + self.relu = nn.ReLU(inplace=True) + else: + self.relu = None + + self.reset_parameters() + + def reset_parameters(self): + init.kaiming_normal_(self.conv.weight, mode='fan_out', nonlinearity='relu') + + def forward(self, x, scales=None, biases=None): + x = self.conv(x) + if self.norm_type is not None: + x = self.bn(x) + if scales is not None and biases is not None: + x = scales[-1] * x + biases[-1] + + if self.relu is not None: + x = self.relu(x) + return x + + # host bottom model + class LeNetBottom(nn.Module): + def __init__(self): + super(LeNetBottom, self).__init__() + self.layer0 = nn.Sequential( + ConvBlock(1, 8, kernel_size=5), + nn.ReLU(inplace=True), + nn.MaxPool2d(2, 2) + ) + + def forward(self, x): + x = self.layer0(x) + return x + + # guest top model + class LeNetTop(nn.Module): + + def __init__(self, out_feat=84): + super(LeNetTop, self).__init__() + self.pool = nn.MaxPool2d(2, 2) + self.fc1 = nn.Linear(16 * 4 * 4, 120) + self.fc1act = nn.ReLU(inplace=True) + self.fc2 = nn.Linear(120, 84) + self.fc2act = nn.ReLU(inplace=True) + self.fc3 = nn.Linear(84, out_feat) + + def forward(self, x_a): + x = x_a + x = self.pool(x) + x = x.view(x.size(0), -1) + 
x = self.fc1(x) + x = self.fc1act(x) + x = self.fc2(x) + x = self.fc2act(x) + x = self.fc3(x) + return x + + # fed simulate tool + from torch.utils.data import Dataset + + class NoFeatureDataset(Dataset): + def __init__(self, ds): + self.ds = ds + + def __len__(self): + return len(self.ds) + + def __getitem__(self, item): + return [self.ds[item][1]] + + class NoLabelDataset(Dataset): + def __init__(self, ds): + self.ds = ds + + def __len__(self): + return len(self.ds) + + def __getitem__(self, item): + return [self.ds[item][0]] + + # prepare mnist data + train_data = torchvision.datasets.MNIST(root='./', + train=True, download=True, transform=torchvision.transforms.ToTensor()) + + if ctx.is_on_guest: + + model = HeteroNNModelGuest( + top_model=LeNetTop(), + top_arg=TopModelStrategyArguments( + protect_strategy='fedpass', + fed_pass_arg=FedPassArgument( + layer_type='linear', + in_channels_or_features=84, + hidden_features=64, + out_channels_or_features=10, + passport_mode='multi', + activation='relu', + num_passport=1000, + low=-10 + ) + ) + ) + optimizer = t.optim.Adam(model.parameters(), lr=0.01) + loss = t.nn.CrossEntropyLoss() + ds = NoFeatureDataset(train_data) + + else: + + model = HeteroNNModelHost( + bottom_model=LeNetBottom(), + agglayer_arg=FedPassArgument( + layer_type='conv', + in_channels_or_features=8, + out_channels_or_features=16, + kernel_size=(5, 5), + stride=(1, 1), + passport_mode='multi', + activation='relu', + num_passport=1000 + ) + ) + optimizer = t.optim.Adam(model.parameters(), lr=0.01) + loss = None + ds = NoLabelDataset(train_data) + + args = TrainingArguments( + num_train_epochs=3, + per_device_train_batch_size=256, + disable_tqdm=False + ) + + return ds, model, optimizer, loss, args + + +def run(ctx): + ds, model, optimizer, loss, args = get_setting(ctx) + trainer = train(ctx, ds, model, optimizer, loss, args) + pred = predict(trainer, ds) + + +if __name__ == '__main__': + from fate.arch.launchers.multiprocess_launcher import launch + 
+ launch(run) diff --git a/examples/launchers/run.sh b/examples/launchers/run.sh new file mode 100644 index 0000000000..652fe30a50 --- /dev/null +++ b/examples/launchers/run.sh @@ -0,0 +1,9 @@ +# examples; you can adjust it if you need. + +python sshe_lr_launcher.py --parties guest:9999 host:10000 --log_level INFO --guest_data ../data/breast_hetero_guest.csv --host_data ../data/breast_hetero_host.csv + +python secureboost_launcher.py --parties guest:9999 host:10000 --log_level INFO + +python sshe_nn_launcher.py --parties guest:9999 host:10000 --log_level INFO + +python fedpass_nn_launcher.py --parties guest:9999 host:10000 --log_level INFO \ No newline at end of file diff --git a/examples/launchers/secureboost_launcher.py b/examples/launchers/secureboost_launcher.py new file mode 100644 index 0000000000..d68ddaf981 --- /dev/null +++ b/examples/launchers/secureboost_launcher.py @@ -0,0 +1,62 @@ +import pandas as pd + +from fate.arch import Context +from fate.arch.dataframe import DataFrame +from fate.arch.dataframe import PandasReader +from fate.arch.launchers.multiprocess_launcher import launch +from fate.ml.ensemble.algo.secureboost.hetero.guest import HeteroSecureBoostGuest +from fate.ml.ensemble.algo.secureboost.hetero.host import HeteroSecureBoostHost + + +def train(ctx: Context, data: DataFrame, num_trees: int = 3, objective: str = 'binary:bce', max_depth: int = 3, + learning_rate: float = 0.3): + if ctx.is_on_guest: + bst = HeteroSecureBoostGuest(num_trees=num_trees, objective=objective, \ + max_depth=max_depth, learning_rate=learning_rate) + else: + bst = HeteroSecureBoostHost(num_trees=num_trees, max_depth=max_depth) + + bst.fit(ctx, data) + + return bst + + +def predict(ctx: Context, data: DataFrame, model_dict: dict): + if ctx.is_on_guest: + bst = HeteroSecureBoostGuest() + else: + bst = HeteroSecureBoostHost() + bst.from_model(model_dict) + return bst.predict(ctx, data) + + +def csv_to_df(ctx, file_path, has_label=True): + df = pd.read_csv(file_path) + 
df["sample_id"] = [i for i in range(len(df))] + if has_label: + reader = PandasReader(sample_id_name="sample_id", match_id_name="id", label_name="y", dtype="float32") + else: + reader = PandasReader(sample_id_name="sample_id", match_id_name="id", dtype="float32") + + fate_df = reader.to_frame(ctx, df) + return fate_df + + +def run(ctx): + num_tree = 3 + max_depth = 3 + if ctx.is_on_guest: + data = csv_to_df(ctx, '../data/breast_hetero_guest.csv') + bst = train(ctx, data, num_trees=num_tree, max_depth=max_depth) + model_dict = bst.get_model() + pred = predict(ctx, data, model_dict) + print(pred.as_pd_df()) + else: + data = csv_to_df(ctx, '../data/breast_hetero_host.csv', has_label=False) + bst = train(ctx, data, num_trees=num_tree, max_depth=max_depth) + model_dict = bst.get_model() + predict(ctx, data, model_dict) + + +if __name__ == '__main__': + launch(run) diff --git a/examples/launchers/sshe_nn_launcher.py b/examples/launchers/sshe_nn_launcher.py new file mode 100644 index 0000000000..da7402b6c1 --- /dev/null +++ b/examples/launchers/sshe_nn_launcher.py @@ -0,0 +1,112 @@ +import torch as t + +from fate.arch import Context +from fate.ml.nn.hetero.hetero_nn import HeteroNNTrainerGuest, HeteroNNTrainerHost, TrainingArguments +from fate.ml.nn.model_zoo.hetero_nn_model import HeteroNNModelGuest, HeteroNNModelHost +from fate.ml.nn.model_zoo.hetero_nn_model import SSHEArgument + + +def train(ctx: Context, + dataset=None, + model=None, + optimizer=None, + loss_func=None, + args: TrainingArguments = None, + ): + if ctx.is_on_guest: + trainer = HeteroNNTrainerGuest(ctx=ctx, + model=model, + train_set=dataset, + optimizer=optimizer, + loss_fn=loss_func, + training_args=args + ) + else: + trainer = HeteroNNTrainerHost(ctx=ctx, + model=model, + train_set=dataset, + optimizer=optimizer, + training_args=args + ) + + trainer.train() + return trainer + + +def predict(trainer, dataset): + return trainer.predict(dataset) + + +def get_setting(ctx): + from fate.ml.nn.dataset.table 
import TableDataset + # prepare data + if ctx.is_on_guest: + ds = TableDataset(to_tensor=True) + ds.load("../data/breast_hetero_guest.csv") + + bottom_model = t.nn.Sequential( + t.nn.Linear(10, 8), + t.nn.ReLU(), + ) + top_model = t.nn.Sequential( + t.nn.Linear(8, 1), + t.nn.Sigmoid() + ) + model = HeteroNNModelGuest( + top_model=top_model, + bottom_model=bottom_model, + agglayer_arg=SSHEArgument( + guest_in_features=8, + host_in_features=8, + out_features=8, + layer_lr=0.01 + ) + ) + + optimizer = t.optim.Adam(model.parameters(), lr=0.01) + loss = t.nn.BCELoss() + + else: + ds = TableDataset(to_tensor=True) + ds.load("../data/breast_hetero_host.csv") + bottom_model = t.nn.Sequential( + t.nn.Linear(20, 8), + t.nn.ReLU(), + ) + + model = HeteroNNModelHost( + bottom_model=bottom_model, + agglayer_arg=SSHEArgument( + guest_in_features=8, + host_in_features=8, + out_features=8, + layer_lr=0.01 + ) + ) + optimizer = t.optim.Adam(model.parameters(), lr=0.01) + loss = None + + args = TrainingArguments( + num_train_epochs=3, + per_device_train_batch_size=256 + ) + + return ds, model, optimizer, loss, args + + +def run(ctx): + ds, model, optimizer, loss, args = get_setting(ctx) + trainer = train(ctx, ds, model, optimizer, loss, args) + pred = predict(trainer, ds) + if ctx.is_on_guest: + # print("pred:", pred) + # compute auc here + from sklearn.metrics import roc_auc_score + print('auc is') + print(roc_auc_score(pred.label_ids, pred.predictions)) + + +if __name__ == '__main__': + from fate.arch.launchers.multiprocess_launcher import launch + + launch(run) diff --git a/examples/pipeline/homo_nn/homo_nn_testsuite.yaml b/examples/pipeline/homo_nn/homo_nn_testsuite.yaml index fc0abbabbd..c3158cc0fe 100644 --- a/examples/pipeline/homo_nn/homo_nn_testsuite.yaml +++ b/examples/pipeline/homo_nn/homo_nn_testsuite.yaml @@ -34,7 +34,9 @@ data: namespace: experiment role: host_0 tasks: - hetero-nn-binary-sshe: - script: test_nn_binary_sshe.py - hetero-nn-binary-fedpass: - script: 
test_nn_binary_fedpass.py \ No newline at end of file + homo-nn-binary: + script: test_nn_binary.py + homo-nn-multi: + script: test_nn_multi.py + homo-nn-regression: + script: test_nn_regression.py \ No newline at end of file diff --git a/examples/pipeline/sshe_linr/test_linr.py b/examples/pipeline/sshe_linr/test_linr.py index 4042b3e574..baa5aff534 100644 --- a/examples/pipeline/sshe_linr/test_linr.py +++ b/examples/pipeline/sshe_linr/test_linr.py @@ -41,7 +41,7 @@ def main(config="../config.yaml", namespace=""): epochs=2, batch_size=100, init_param={"fit_intercept": True}, - train_data=scale_0.outputs["output_data"], + train_data=scale_0.outputs["train_output_data"], reveal_every_epoch=False, early_stop="diff", reveal_loss_freq=1, diff --git a/java/osx/osx-broker/src/main/resources/broker/broker.properties b/java/osx/osx-broker/src/main/resources/broker/broker.properties index db43ee98f7..3a78fa97e8 100644 --- a/java/osx/osx-broker/src/main/resources/broker/broker.properties +++ b/java/osx/osx-broker/src/main/resources/broker/broker.properties @@ -1,7 +1,7 @@ grpc.port= 9370 self.party=9999 # the IP of the cluster manager component of eggroll -eggroll.cluster.manager.ip = 127.0.0. +eggroll.cluster.manager.ip = 127.0.0.1 # the port of the cluster manager component of eggroll eggroll.cluster.manager.port = 4670
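
Review note on the `broker.properties` hunk above: it fixes a truncated address (`127.0.0.` becomes `127.0.0.1`). As a rough illustration only, and not part of this change set, a small standard-library check along the following lines could catch that kind of typo before deployment. The helper names `parse_properties` and `invalid_ip_keys` are hypothetical, and the sketch assumes that keys ending in `.ip` are meant to hold literal IP addresses, so a hostname value would also be flagged.

```python
import ipaddress


def parse_properties(text: str) -> dict:
    """Parse simple `key = value` lines, skipping blanks and `#` comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        props[key.strip()] = value.strip()
    return props


def invalid_ip_keys(props: dict) -> list:
    """Return keys ending in `.ip` whose values are not valid IP literals.

    Assumption: such keys hold literal addresses; hostnames are rejected too.
    """
    bad = []
    for key, value in props.items():
        if not key.endswith(".ip"):
            continue
        try:
            ipaddress.ip_address(value)
        except ValueError:
            bad.append(key)
    return bad


# The properties as they stood before and after the hunk above.
BEFORE = """
grpc.port= 9370
self.party=9999
# the IP of the cluster manager component of eggroll
eggroll.cluster.manager.ip = 127.0.0.
eggroll.cluster.manager.port = 4670
"""
AFTER = BEFORE.replace("127.0.0.\n", "127.0.0.1\n")

print(invalid_ip_keys(parse_properties(BEFORE)))  # ['eggroll.cluster.manager.ip']
print(invalid_ip_keys(parse_properties(AFTER)))   # []
```

The check is deliberately narrow: it only inspects `.ip` keys and leaves ports and party IDs alone, so it would need extending before it could validate a full OSX broker configuration.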