Does building the image for efls-data standalone deployment require a Flink-on-K8s environment? #17

Open
rewonderful opened this issue Mar 7, 2022 · 4 comments

@rewonderful

https://github.com/alibaba/Elastic-Federated-Learning-Solution/blob/master/docs/English/Standalone_Deployment_CN.md
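I followed the standalone deployment guide above and then ran python /xfl/test/test_data_join.py, which failed with the error below.
The error is raised at env = StreamExecutionEnvironment.get_execution_environment().
Does standalone mode require deploying a Flink environment on K8s before starting the Docker image?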

SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
Exception in thread "Thread-4" java.lang.NoClassDefFoundError: org/apache/flink/table/functions/python/PythonFunction
        at java.lang.Class.getDeclaredMethods0(Native Method)
        at java.lang.Class.privateGetDeclaredMethods(Class.java:2701)
        at java.lang.Class.privateGetPublicMethods(Class.java:2902)
        at java.lang.Class.getMethods(Class.java:1615)
        at sun.misc.ProxyGenerator.generateClassFile(ProxyGenerator.java:451)
        at sun.misc.ProxyGenerator.generateProxyClass(ProxyGenerator.java:339)
        at java.lang.reflect.Proxy$ProxyClassFactory.apply(Proxy.java:639)
        at java.lang.reflect.Proxy$ProxyClassFactory.apply(Proxy.java:557)
        at java.lang.reflect.WeakCache$Factory.get(WeakCache.java:230)
        at java.lang.reflect.WeakCache.get(WeakCache.java:127)
        at java.lang.reflect.Proxy.getProxyClass0(Proxy.java:419)
        at java.lang.reflect.Proxy.newProxyInstance(Proxy.java:719)
        at org.apache.flink.api.python.shaded.py4j.Gateway.createProxy(Gateway.java:368)
        at org.apache.flink.api.python.shaded.py4j.Protocol.getPythonProxy(Protocol.java:433)
        at org.apache.flink.api.python.shaded.py4j.Protocol.getObject(Protocol.java:311)
        at org.apache.flink.api.python.shaded.py4j.commands.AbstractCommand.getArguments(AbstractCommand.java:82)
        at org.apache.flink.api.python.shaded.py4j.commands.CallCommand.execute(CallCommand.java:77)
        at org.apache.flink.api.python.shaded.py4j.GatewayConnection.run(GatewayConnection.java:238)
        at java.lang.Thread.run(Thread.java:750)
Caused by: java.lang.ClassNotFoundException: org.apache.flink.table.functions.python.PythonFunction
        at java.net.URLClassLoader.findClass(URLClassLoader.java:387)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:418)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:352)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:351)
        ... 19 more

ERROR:root:Exception while sending command.
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/dist-packages/py4j/java_gateway.py", line 1188, in send_command
    raise Py4JNetworkError("Answer from Java side is empty")
py4j.protocol.Py4JNetworkError: Answer from Java side is empty

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.7/dist-packages/py4j/java_gateway.py", line 1014, in send_command
    response = connection.send_command(command)
  File "/usr/local/lib/python3.7/dist-packages/py4j/java_gateway.py", line 1193, in send_command
    "Error while receiving", e, proto.ERROR_ON_RECEIVE)
py4j.protocol.Py4JNetworkError: Error while receiving

Traceback (most recent call last):
  File "./test_data_join.py", line 180, in <module>
    t.test_psi_join()
  File "./test_data_join.py", line 174, in test_psi_join
    run_client_and_server()
  File "./test_data_join.py", line 91, in run_client_and_server
    'example_id', 'example_id', 8)
  File "./test_data_join2.py", line 78, in __init__
    conf=conf)
  File "/xfl/xfl/data/pipelines.py", line 71, in __init__
    env = get_flink_batch_env(conf)
  File "/xfl/xfl/data/pipelines.py", line 41, in get_flink_batch_env
    env = StreamExecutionEnvironment.get_execution_environment()
  File "/usr/local/lib/python3.7/dist-packages/pyflink/datastream/stream_execution_environment.py", line 688, in get_execution_environment
    gateway = get_gateway()
  File "/usr/local/lib/python3.7/dist-packages/pyflink/java_gateway.py", line 75, in get_gateway
    _gateway.entry_point.put("PythonFunctionFactory", PythonFunctionFactory())
  File "/usr/local/lib/python3.7/dist-packages/py4j/java_gateway.py", line 1286, in __call__
    answer, self.gateway_client, self.target_id, self.name)
  File "/usr/local/lib/python3.7/dist-packages/pyflink/util/exceptions.py", line 146, in deco
    return f(*a, **kw)
  File "/usr/local/lib/python3.7/dist-packages/py4j/protocol.py", line 336, in get_return_value
    format(target_id, ".", name))
py4j.protocol.Py4JError: An error occurred while calling t.put

@YanZhangN
Collaborator

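Standalone deployment does not require a K8s environment. It looks like you may have run the test program directly on the host machine; it needs to run inside the Docker environment.
Please try the following command:

docker run ${efls-data-docker-image} python /xfl/test/test_data_join.py

where ${efls-data-docker-image} is the name of the image you built.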
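
For anyone scripting this check, here is a minimal Python sketch that assembles the same docker run command. The helper names build_test_command and run_test, and the image tag used in the usage note, are illustrative assumptions and do not come from the EFLS documentation.

```python
import shlex
import subprocess

# Test script path as given in the command above.
TEST_SCRIPT = "/xfl/test/test_data_join.py"


def build_test_command(image, script=TEST_SCRIPT):
    """Return the argv that runs the data-join test inside the given image."""
    if not image:
        raise ValueError("image name must not be empty")
    return ["docker", "run", image, "python", script]


def run_test(image):
    """Run the test in a new container and return the process exit code."""
    cmd = build_test_command(image)
    print("running:", shlex.join(cmd))
    return subprocess.run(cmd).returncode
```

Calling run_test("efls-data:latest") (a hypothetical tag) would start the container. A non-zero return code means the test failed inside the image, which rules out a host-side environment problem.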

@rewonderful
Author

python /xfl/test/test_data_join.py

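Thanks for the reply. I was in fact running inside the Docker environment, and the error still occurs. Could the Dockerfile be at fault? The first time I built the image from the Dockerfile, I even hit a JAVA_HOME-not-found error, so I suspect pyflink was not installed correctly when the image was built.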
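
One way to narrow this down inside the container is to check whether any jar actually ships the class named in the NoClassDefFoundError. The sketch below uses only the Python standard library. The function names and the choice of jar directory are assumptions; in particular, the idea that pyflink keeps its jars in a lib folder next to the package should be verified against the image.

```python
import os
import zipfile
from pathlib import Path

# Class reported missing in the stack trace above.
MISSING_CLASS = "org.apache.flink.table.functions.python.PythonFunction"


def jars_containing(jar_dir, class_name=MISSING_CLASS):
    """Return the names of jars directly under jar_dir that contain class_name."""
    entry = class_name.replace(".", "/") + ".class"
    hits = []
    for jar in sorted(Path(jar_dir).glob("*.jar")):
        try:
            with zipfile.ZipFile(jar) as zf:
                if entry in zf.namelist():
                    hits.append(jar.name)
        except zipfile.BadZipFile:
            # A truncated or corrupt jar is itself a hint that the build went wrong.
            print("unreadable jar:", jar)
    return hits


def report(jar_dir):
    """Print JAVA_HOME and which jars, if any, provide the missing class."""
    print("JAVA_HOME =", os.environ.get("JAVA_HOME", "<not set>"))
    hits = jars_containing(jar_dir)
    print("jars providing the class:", hits or "none")
    return hits
```

A possible call is report(Path(pyflink.__file__).parent / "lib") after importing pyflink. If no jar provides the class, the Flink Python or table jars are likely missing from the image rather than a Kubernetes dependency being absent.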

One more question: how do I save and load a model trained with EFLS? The documentation does not seem to cover this.

@zonghua94
Collaborator

Per the documentation, pass checkpoint_dir when calling the fit function to load the model:
https://github.com/alibaba/Elastic-Federated-Learning-Solution/blob/master/docs/efls-train/model_api.md#fit
To save the model, add a tf.train.CheckpointSaverHook to the model.

@YanZhangN
Collaborator

YanZhangN commented Mar 15, 2022

python /xfl/test/test_data_join.py

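Thanks for the reply. I was in fact running inside the Docker environment, and the error still occurs. Could the Dockerfile be at fault? The first time I built the image from the Dockerfile, I even hit a JAVA_HOME-not-found error, so I suspect pyflink was not installed correctly when the image was built.

One more question: how do I save and load a model trained with EFLS? The documentation does not seem to cover this.

This part of the Dockerfile handles the compilation of xfl-java. It looks like xfl-java was not compiled correctly during your docker build. If necessary, please post the docker build log for this step.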
