Release FL on SGX V1.0 #882

Open
wants to merge 151 commits into
base: sgx_ra_tls
Commits
81de145
fix(deloy): operator ingress tls (#844)
Abingcbc Jun 11, 2021
fe8f0c2
[WebConsole] Sync to github. (#845)
tjulinfan Jun 15, 2021
a1abcd7
[fix](trainer/data): trainer data_block_loader ignore repeated reques…
whisylan Jun 29, 2021
5d91338
Fix server response body size too large (#848)
duanbing Jul 1, 2021
21b5026
feat(raw_data): add parameter validation for data portal job (#849)
nolanliou Jul 2, 2021
718dd75
fix follower restart index missing (#850)
duanbing Jul 8, 2021
4edcc85
fix follower restart index missing (#851)
duanbing Jul 8, 2021
e43a3fa
fix(data_portal): fix data portal config checking
nolanliou Jul 10, 2021
b6eb2d0
refactor(deploy): removes unused argo component in helm (#854)
tjulinfan Jul 14, 2021
70f3c5b
fix(psi): PSI failed when CPU_LIMIT < 4 (#855)
nolanliou Jul 19, 2021
3b82e88
feat(tree): enable label select (#852)
hangweiqiang-uestc Jul 21, 2021
31fd49a
feat: import spark operator (#368)
peng09 Jul 23, 2021
0fae30e
chore: upgrade gin version (#864)
peng09 Jul 29, 2021
38c6c0e
Fix default value double encode in csv (#861)
duanbing Jul 29, 2021
54f50ea
fix(mysql_client): creates table if datasourcemeta does not exist (#867)
tjulinfan Aug 2, 2021
d146479
feat(tree): add eval metrics (#865)
hangweiqiang-uestc Aug 3, 2021
df66fd6
fix(tree): remove x_names=none (#863)
hangweiqiang-uestc Aug 3, 2021
70a4b39
feat(deploy): adds node port service for web console v2 (#869)
tjulinfan Aug 9, 2021
81f7762
feat(deploy): enables webconsole v2 by default (#870)
tjulinfan Aug 10, 2021
eed77f2
feat(deploy): enables public service for webconsole v2 by default (#874)
tjulinfan Aug 10, 2021
0c25ce6
fix(tree): fix to_float (#873)
hangweiqiang-uestc Aug 10, 2021
8e2216f
chore(deploy): refactor image structure (#878)
chong1144 Aug 19, 2021
2d5dce3
feat(scripts): add pre_start_hook (#877)
chong1144 Aug 19, 2021
130f126
fix(imags): fix krb5 installation (#879)
chong1144 Aug 19, 2021
f360bc0
Expose peer enclave sig by temp file
duanbing Aug 24, 2021
cfc6d79
chore(images): add cron (#880)
chong1144 Aug 24, 2021
f034556
fix(common): do not modify value when convert tf example to dict (#883)
nolanliou Aug 25, 2021
0b0f7df
Remove grpc from dockerfile
duanbing Aug 27, 2021
308a26f
feat(authorize): add pre_start_hook and custom volumes (#886)
CAGeng Aug 31, 2021
5da8f0c
Enable RA and add tf example
duanbing Sep 1, 2021
2d9fcba
Enable RA TLS in TF
duanbing Sep 6, 2021
b938673
Enable RA TLS in TF
duanbing Sep 6, 2021
c117f92
bump graphene to master
duanbing Sep 7, 2021
25b8623
bump graphene to master
duanbing Sep 7, 2021
31823dc
Remove tf sample
duanbing Sep 7, 2021
2344e71
fix(trainer): fix summary hook panic (#888)
whisylan Sep 7, 2021
af8cbc2
Fix mnist manifest
duanbing Sep 10, 2021
1c0094e
Code polish
duanbing Sep 10, 2021
beda5b6
Grpc client in enclave support
duanbing Sep 14, 2021
d5d0bb5
post meterials used in e2e_test for webconsole_v2 (#891)
ChrisL1n Sep 17, 2021
b660f37
Debug streaming grpc
duanbing Sep 17, 2021
733c5d1
Debug streaming grpc
duanbing Sep 17, 2021
0414aee
Fix streaming rpc invalid root cert
duanbing Sep 18, 2021
71a8e7f
streaming client in-enclave
duanbing Sep 20, 2021
252e73f
Fix quote verification error
duanbing Sep 21, 2021
4fc7d4a
Adjust tf config
duanbing Sep 22, 2021
e82252d
Remove unused envs
duanbing Sep 24, 2021
745137a
Fix ps oom (#893)
whisylan Sep 27, 2021
0b12658
Update README
duanbing Sep 27, 2021
a89610b
one way SGX RA-TLS pass in V&H FL (#892)
0400H Sep 28, 2021
1173698
Fix build script
duanbing Sep 29, 2021
82ec70f
chore(spark-operator): use patched image to fix the ca issue in some …
tjulinfan Oct 14, 2021
2a78b32
feat(charts): update nfs client image to solve k8s 1.20 self link iss…
chong1144 Oct 25, 2021
76bfa22
fix(script): parse FILE_WILDCARD env correctly (#902)
tjulinfan Oct 25, 2021
09451a3
fix(metrics): not emitting invalid fields (#901)
ZhZhang711 Oct 25, 2021
eedf36b
fix(log): support containerd (#897)
CAGeng Oct 25, 2021
405c8a1
feat(deploy): use patch2 as default spark opeartor (#903)
chong1144 Oct 27, 2021
69c8fd2
feat(hdfs): more robust hdfs env initialization (#906)
chong1144 Oct 27, 2021
1d91f90
feat(deploy): support kubernetes 1.20 api version (#908)
chong1144 Oct 29, 2021
aa8fcd3
feat(trainer): add --load-checkpoint-path (#910)
whisylan Nov 4, 2021
fc62eba
fix(trainer): change bool to str_as_bool in trainer job (#911)
Ssskrilex Nov 4, 2021
f0b9e8a
Move grpc patch into tf patch
duanbing Nov 5, 2021
75c69f5
Fix some pylint
duanbing Nov 5, 2021
d1db3bb
Fix some pylint
duanbing Nov 5, 2021
1f28228
Upgrade Graphene to Gramine (#909)
RodgerZhu Nov 9, 2021
3464848
Delete build_docker_image.sh
duanbing Nov 9, 2021
90d9257
fix(deploy): use v1beta1 for 1.16-1.18 and v1 for 1.19+ (#913)
chong1144 Nov 9, 2021
e09b2d2
move gramine patches to sgx/gramine (#912)
RodgerZhu Nov 10, 2021
07f4195
Add two way SGX-RA-TLS support for Worker2Worker (#915)
0400H Nov 12, 2021
6b253c3
Merge branch 'sgx_ra_tls' into fix_dev_sgx
duanbing Nov 12, 2021
c72e521
Update sgx README.md (#917)
0400H Nov 12, 2021
9f3b247
Config op thread (#916)
duanbing Nov 12, 2021
873a7c3
fix(sgx): fix apply patch fail (#918)
lixiaoguang01 Nov 16, 2021
a446a9c
Fix wrong data type of function get_tf_config (#920)
0400H Nov 17, 2021
2eb2b94
feat: support start date and end date of rawdata (#922)
nolanliou Nov 18, 2021
78c099b
fix: make example_id unrequired. (#924)
nolanliou Dec 1, 2021
f5cc151
fix: list input dir by folder name (#926)
nolanliou Dec 2, 2021
e86f433
Enable TF MKL Optimization (#923)
0400H Dec 3, 2021
de2f3ab
feat(deploy): introduce fedlearner-pvc (#927)
Dec 6, 2021
ac7073c
refactor(tree): refactor to_float in tree (#905)
hangweiqiang-uestc Dec 8, 2021
5b07143
fix(script): add sleep to start script (#930)
lixiaoguang01 Dec 13, 2021
21d2ae4
feat(code): add dictionary support for code_key (#931)
Ssskrilex Dec 15, 2021
570101e
feat(tree): enable label select (#852) (#932)
duanbing Dec 16, 2021
5c9aded
Revert "feat(tree): enable label select (#852) (#932)" (#935)
chong1144 Dec 20, 2021
cf985ca
feat(tree): enable confusion matrix (#936)
hangweiqiang-uestc Dec 20, 2021
f11cc3b
feat(deploy): add new networking topology (#938)
chong1144 Dec 27, 2021
df5c257
Add multi-instance support in single process and make thread safe for…
0400H Dec 28, 2021
89f5a23
refactor(tree): add log to detect zero split point (#940)
hangweiqiang-uestc Dec 31, 2021
66584f6
Decrease grpc worker number
duanbing Jan 11, 2022
423273b
for compatibility, also write to metrics(es) (#944)
whisylan Jan 18, 2022
ca96d7f
Add RA-TLS cache and do stability optimization for gRPC (#943)
0400H Jan 20, 2022
3f512e9
Update sgx_ra_tls.pyx.pxi (#945)
Hsy-Intel Jan 24, 2022
77448c7
feat(sgx): support deploy in webconsole (#946)
lixiaoguang01 Jan 29, 2022
89b426c
feat(sgx): support NN eval (#950)
lixiaoguang01 Feb 18, 2022
882310f
Feat fedavg tf1.15 (#952)
whisylan Feb 25, 2022
d73ec44
feat(sgx): raw data support sgx (#955)
lixiaoguang01 Mar 2, 2022
5ee00b4
Add sgx release dockerfile (#956)
0400H Mar 10, 2022
bd291ca
Trainer add fcv2 (#962)
whisylan Apr 2, 2022
269dc6a
chore(nn): adapt to hostnetwork
Apr 24, 2022
3f7dfad
chore(nn): adapt to hostnetwork
Apr 24, 2022
4d1551f
chore(nn): adapt to hostnetwork
Apr 24, 2022
3060d93
chore(nn): adapt to hostnetwork (#966)
hangweiqiang-uestc May 9, 2022
5d7a1a2
chore(deploy): update fedlearner to adapt to hostnetwork
May 9, 2022
75e8043
Add Fedlearner README (#970)
RodgerZhu May 9, 2022
b7cd6d4
Sync new commits to stable branch (#993)
tjulinfan May 27, 2022
bdc6233
feat(metrics): support new metrics collector and apply to tree model …
lixiaoguang01 May 31, 2022
89e6be2
fix(psi): fix psi signer exit before client (#997)
Ssskrilex Jun 1, 2022
3b6d4a3
feat(tree): add file_wildcard to filter files (#981) (#1000)
hangweiqiang-uestc Jun 8, 2022
2c36d4a
feat(trainer): add extra_params and export_model parameters (#1005)
hangweiqiang-uestc Jun 22, 2022
9260f9c
chore(*): trigger release pipeline
Jun 22, 2022
739ff3f
chore(*): update image version of model preset template
Jun 22, 2022
786b567
fix(trainer): fix export_model args (#1006)
hangweiqiang-uestc Jun 23, 2022
f5ba0a7
feat(nn): add data path to start scripts (#1008)
chong1144 Jun 27, 2022
679b998
refactor(trainer): add checkpoint hook for eval (#1012)
hangweiqiang-uestc Jul 6, 2022
bfb84e2
fix(data_join): hot-fix fpath is none (#1018)
LHhan1996 Jul 14, 2022
92ac380
refactor(visitor): change allocate logic of DataPathVisitor (#1029) (…
hangweiqiang-uestc Aug 8, 2022
2d30b94
feat(tree): add secret sharing multiplication (#1025)
gejielun Aug 12, 2022
f575e55
feat(tree): add file_wildcard to DataBlockLoader (#1032)
hangweiqiang-uestc Aug 12, 2022
2a1cda4
feat(metrics): apply new metrics to nn model (#1031) (#1033)
lixiaoguang01 Aug 12, 2022
5f500f3
fix(tree): fix data_block_loader (#1037)
hangweiqiang-uestc Aug 17, 2022
af9d27e
feat(tree): add multiprocessing tree data loader (#1041) (#1043)
Lemon-412 Aug 30, 2022
b55f0ae
Feat: multiple data sources for trainer (#976)
nolanliou Aug 30, 2022
b7a5a8c
fix(trainer): fix data path visitor
Aug 31, 2022
98040b0
fix(trainer): fix data path visitor
Sep 1, 2022
18d26b2
fix(trainer): fix data path visitor
Sep 1, 2022
556f6a9
fix(trainer): fix data path visitor
Sep 2, 2022
696ef31
feat(metrics): append more metrics to nn model (#1042) (#1047)
lixiaoguang01 Sep 7, 2022
73f0653
fix(deploy): solve pull code startswith file error (#1020) (#1048)
hangweiqiang-uestc Sep 7, 2022
608e2c3
fix(deploy): fix pure path error (#1050) (#1051)
gejielun Sep 8, 2022
50a6945
feat(trainer): save checkpoints when load no data (#1066) (#1069)
lixiaoguang01 Jan 9, 2023
d8bbfa1
feat(metrics): support fedavg training (#1083) (#1084)
lixiaoguang01 May 12, 2023
9c5bb8b
feat(trainer): data_path_visitor add date filter (#1086)
gejielun May 23, 2023
809255c
feat(sgx): rebase current code from stable_without_spark_raw_data
Gezq Jan 18, 2024
f4d9048
feat(sgx): sgx sh add current param, set env and add aesm sh
Gezq Jan 18, 2024
2cfdfb9
fix(channel): fix rebase error
Gezq Jan 23, 2024
7e61d49
feat(deploy): add cp tensorflow_io.py
Gezq Jan 30, 2024
ec3ddb9
feat(sgx): update gramine version to 1.3 bigDL for hdfs error
Gezq Feb 28, 2024
f994c84
feat(sgx): gramine log level support setting
Gezq Mar 4, 2024
7dd7a64
feat(sgx): get token when buiding image
Gezq Mar 5, 2024
1733328
feat(sgx): get taskset parameters from env
Gezq Mar 6, 2024
141736e
feat(sgx): add proxy local port to file
Gezq Mar 18, 2024
546b5b3
fix(sgx):change entrypoint dir
Mar 12, 2024
561e259
change algorithm package to allowed file
zeuson0 Mar 19, 2024
c3ea95d
Use meituan hdfs to read or write train data and model (#1088)
henshy Mar 20, 2024
ab1a6bb
feat(sgx): support multi-measurements attestation of tensorflow compo…
zeuson0 Apr 7, 2024
51673e8
Fix dev sgx: Add label protection (#1090)
Jerseyshin Apr 8, 2024
383d1a5
Update marvell.py: fix ordering (#1094)
Jerseyshin Apr 8, 2024
f875f10
feat(sgx): put all python packages into trusted_files (#1095)
zeuson0 Apr 8, 2024
0e267b8
Meituan HDFS Access Without Proxy for Kerberos Authentication (#1096)
henshy Apr 9, 2024
f905579
FedLearner Framework and Core Dependency RA-TLS Configuration (#1097)
henshy Apr 15, 2024
f2b8203
Fix the signature inconsistency issue in Gramine (#1100)
henshy Apr 16, 2024
27 changes: 0 additions & 27 deletions .github/workflows/build-web-console.yaml

This file was deleted.

7 changes: 6 additions & 1 deletion .github/workflows/ci.yaml
@@ -10,7 +10,12 @@ on:
 
 jobs:
   test:
-    runs-on: ubuntu-latest
+    strategy:
+      matrix:
+        # As long as we need Python 3.6 here in the test, we can only use up to Ubuntu 20.
+        # https://github.com/actions/setup-python/issues/544
+        os: [ubuntu-20.04]
+    runs-on: ${{ matrix.os }}
     name: CI tests
     steps:
       - uses: actions/checkout@v2
40 changes: 0 additions & 40 deletions .github/workflows/publish-to-pypi.yml

This file was deleted.

20 changes: 0 additions & 20 deletions .github/workflows/release-web-console-v2.yaml

This file was deleted.

19 changes: 0 additions & 19 deletions .github/workflows/release-web-console.yaml

This file was deleted.

21 changes: 0 additions & 21 deletions .github/workflows/tag-web-console.yaml

This file was deleted.

21 changes: 0 additions & 21 deletions .github/workflows/update-nightly.yml

This file was deleted.

30 changes: 0 additions & 30 deletions .github/workflows/web-console-v2-api.yaml

This file was deleted.

28 changes: 0 additions & 28 deletions .github/workflows/web-console-v2-client.yaml

This file was deleted.

4 changes: 4 additions & 0 deletions Dockerfile
@@ -4,9 +4,13 @@ WORKDIR /app
COPY . /app

RUN apt-get -y update \
&& apt-get -y install cron \
&& apt-get -y install libgmp-dev \
&& apt-get -y install libmpfr-dev \
&& apt-get -y install libmpc-dev \
# For krb5-user installation
&& export DEBIAN_FRONTEND=noninteractive \
&& apt-get -y install krb5-user \
&& rm -rf /var/lib/apt/lists/*

RUN pip install --upgrade pip \
8 changes: 8 additions & 0 deletions Makefile
@@ -23,6 +23,14 @@ protobuf:
 		--grpc_python_out=. \
 		protocols/fedlearner/channel/*.proto
 
+	python -m grpc_tools.protoc -I. \
+		--python_out=. \
+		fedlearner/fedavg/cluster/cluster.proto
+	python -m grpc_tools.protoc -I. \
+		--python_out=. \
+		--grpc_python_out=. \
+		fedlearner/fedavg/training_service.proto
+
 lint:
 	pylint --rcfile ci/pylintrc fedlearner example
 
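For readers who want to see what the two new `protobuf` recipe lines do, the sketch below rebuilds the same `grpc_tools.protoc` command lines in Python. It is only an illustration: the helper names `protoc_command` and `fedavg_commands` are hypothetical and not part of this PR, and the flags and proto paths are copied from the Makefile hunk above. Only `training_service.proto` gets `--grpc_python_out`, matching the diff.

```python
import sys
from typing import List, Tuple

# Proto files added to the `protobuf` target, paired with whether the
# Makefile also passes --grpc_python_out for them (taken from the hunk above).
FEDAVG_PROTOS: List[Tuple[str, bool]] = [
    ("fedlearner/fedavg/cluster/cluster.proto", False),
    ("fedlearner/fedavg/training_service.proto", True),
]


def protoc_command(proto_path: str, with_grpc: bool) -> List[str]:
    """Build the argv for one `python -m grpc_tools.protoc` call (hypothetical helper)."""
    argv = [sys.executable, "-m", "grpc_tools.protoc", "-I.", "--python_out=."]
    if with_grpc:
        # Service stubs are only generated when this flag is present.
        argv.append("--grpc_python_out=.")
    argv.append(proto_path)
    return argv


def fedavg_commands() -> List[List[str]]:
    """Return one argv per fedavg proto file, in Makefile order."""
    return [protoc_command(path, grpc) for path, grpc in FEDAVG_PROTOS]


if __name__ == "__main__":
    # Print the commands instead of running them, so the sketch works
    # without grpcio-tools installed.
    for argv in fedavg_commands():
        print(" ".join(argv))
```

To actually generate the modules, each argv could be passed to `subprocess.run(argv, check=True)` from the repository root, assuming `grpcio-tools` is installed.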
39 changes: 39 additions & 0 deletions deploy/charts/fedlearner-add-on/templates/ingress-v1.yaml
@@ -0,0 +1,39 @@
+{{- if semverCompare ">=1.19-0" .Capabilities.KubeVersion.GitVersion -}}
+
+{{- if .Values.ingress.enabled -}}
+apiVersion: networking.k8s.io/v1
+kind: Ingress
+metadata:
+  name: fedlearner-proxy
+  namespace: {{ .Release.Namespace }}
+  labels:
+    {{- include "fedlearner-add-on.labels" . | nindent 4 }}
+  {{- with .Values.ingress.annotations }}
+  annotations:
+    {{- toYaml . | nindent 4 }}
+    nginx.ingress.kubernetes.io/auth-tls-secret: default/ca-secret
+    nginx.ingress.kubernetes.io/auth-tls-verify-client: "on"
+    nginx.ingress.kubernetes.io/configuration-snippet: |
+      grpc_set_header Authority $http_x_host;
+      grpc_set_header Host $http_x_host;
+      grpc_next_upstream_tries 5;
+  {{- end }}
+spec:
+  rules:
+    - host: {{ .Values.ingress.host | quote }}
+      http:
+        paths:
+          - path: /
+            pathType: Prefix
+            backend:
+              service:
+                name: fedlearner-stack-ingress-nginx-controller
+                port:
+                  number: 80
+  tls:
+    - hosts:
+        - {{ .Values.ingress.host }}
+      secretName: fedlearner-proxy-server
+{{- end }}
+
+{{- end }}
@@ -1,6 +1,7 @@
+{{- if semverCompare "<1.19-0" .Capabilities.KubeVersion.GitVersion -}}
+
 {{- if .Values.ingress.enabled -}}
-{{- $configurationSnippet := .Files.Get "configuration-snippet.txt" -}}
-{{- $serverSnippet := .Files.Get "server-snippet.txt" -}}
+
 {{- if semverCompare ">=1.14-0" .Capabilities.KubeVersion.GitVersion -}}
 apiVersion: networking.k8s.io/v1beta1
 {{- else -}}
@@ -15,14 +16,12 @@ metadata:
   {{- with .Values.ingress.annotations }}
   annotations:
     {{- toYaml . | nindent 4 }}
-    {{- if not (empty $configurationSnippet) }}
+    nginx.ingress.kubernetes.io/auth-tls-secret: default/ca-secret
+    nginx.ingress.kubernetes.io/auth-tls-verify-client: "on"
     nginx.ingress.kubernetes.io/configuration-snippet: |
-{{ $configurationSnippet | indent 6 }}
-    {{- end }}
-    {{- if not (empty $serverSnippet) }}
-    nginx.ingress.kubernetes.io/server-snippet: |
-{{ $serverSnippet | indent 6 }}
-    {{- end }}
+      grpc_set_header Authority $http_x_host;
+      grpc_set_header Host $http_x_host;
+      grpc_next_upstream_tries 5;
   {{- end }}
 spec:
   rules:
@@ -31,6 +30,12 @@ spec:
         paths:
           - path: "/"
             backend:
-              serviceName: fedlearner-proxy
-              servicePort: {{ .Values.ingress.port }}
+              serviceName: fedlearner-stack-ingress-nginx-controller
+              servicePort: 80
+  tls:
+    - hosts:
+        - {{ .Values.ingress.host }}
+      secretName: fedlearner-proxy-server
 {{- end }}
+
+{{- end }}
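The two ingress templates are gated on the cluster version, so exactly one of them renders for a given cluster. The Python sketch below mirrors that selection as a rough approximation. It compares only the major and minor version, whereas Helm's `semverCompare` applies full semantic-version rules. The function names are hypothetical, and the result for clusters older than 1.14 is left as `None` because the `else` branch is collapsed in the diff above.

```python
import re
from typing import Optional, Tuple


def parse_major_minor(git_version: str) -> Tuple[int, int]:
    """Extract (major, minor) from a version string such as 'v1.19.3' (hypothetical helper)."""
    match = re.match(r"v?(\d+)\.(\d+)", git_version)
    if match is None:
        raise ValueError(f"unrecognised Kubernetes version: {git_version!r}")
    return int(match.group(1)), int(match.group(2))


def ingress_api_version(git_version: str) -> Optional[str]:
    """Approximate which Ingress apiVersion the chart templates would emit."""
    version = parse_major_minor(git_version)
    if version >= (1, 19):
        # ingress-v1.yaml: rendered when semverCompare ">=1.19-0" holds.
        return "networking.k8s.io/v1"
    if version >= (1, 14):
        # Older template: rendered when "<1.19-0" and ">=1.14-0" both hold.
        return "networking.k8s.io/v1beta1"
    # The else branch is not visible in the diff, so no value is assumed here.
    return None


if __name__ == "__main__":
    for v in ("v1.20.4", "v1.18.9", "v1.13.0"):
        print(v, ingress_api_version(v))
```

Splitting the resource into two files keeps each template free of nested version conditionals around the `backend` block, whose schema differs between the two API versions (`serviceName`/`servicePort` versus `service.name`/`service.port.number`).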
11 changes: 0 additions & 11 deletions deploy/charts/fedlearner-add-on/templates/secrets.yaml
@@ -1,14 +1,3 @@
-{{- if .Values.imageCredentials.enabled }}
-apiVersion: v1
-kind: Secret
-metadata:
-  name: regcred
-  namespace: {{ .Release.Namespace }}
-type: kubernetes.io/dockerconfigjson
-data:
-  .dockerconfigjson: {{ template "imagePullSecret" . }}
-{{- end}}
----
 {{- if .Values.tls.enabled }}
 apiVersion: v1
 kind: Secret
16 changes: 2 additions & 14 deletions deploy/charts/fedlearner-add-on/values.yaml
@@ -2,26 +2,14 @@
 # This is a YAML-formatted file.
 # Declare variables to be passed into your templates.
 
-imageCredentials:
-  enabled: true
-  registry: ""
-  username: ""
-  password: ""
-
-service:
-  enabled: true
-  type: ExternalName
-  externalName: ""
-
 ingress:
   enabled: true
   annotations:
     "nginx.ingress.kubernetes.io/proxy-body-size": 10g
-    "nginx.ingress.kubernetes.io/backend-protocol": GRPCS
+    "nginx.ingress.kubernetes.io/backend-protocol": GRPC
     "nginx.ingress.kubernetes.io/http2-insecure-port": "true"
     "kubernetes.io/ingress.class": nginx
-  host: external.name
-  port: 443
+  host: test.fedlearner.net
 
 tls:
   enabled: true