Hello, thank you very much for this beautiful project. I am trying to run the MNIST example on our cluster, but so far without any success :(
The first training iteration runs without any problems, but the weight update step crashes with this error:
weights = get_server_weights(master_url)
File "/grid/3/hadoop/yarn/local/usercache/user/appcache/application_blabla/container_blabla/virtualenv_application_blabla/lib/python3.6/site-packages/sparkflow/HogwildSparkModel.py", line 33, in get_server_weights
weights = pickle.loads(r.content)
_pickle.UnpicklingError: invalid load key, '<'.
I thought that maybe the weights were not correctly pickled before being sent to the master, so I checked the source code, but from my point of view everything seems to be correct.
So I am wondering why this fails :(
I would really appreciate any further insights.
Thank you in advance!! :)
Thank you for the answer :)
I have been experimenting a lot, and we fixed the issue. I think the framework itself is cool; the problem was with our proxy. We need to set a proxy on the cluster, but it was also being applied to the connection that transfers the weights between the driver and the executors. After setting a number of environment variables, I think it runs now.
I will keep you updated!
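For anyone hitting the same error: a payload that starts with '<' is consistent with an HTML page (for example a proxy error page) arriving where pickled weights were expected. Below is a minimal sketch of how one might check for that and fetch the weights while ignoring proxy settings. It is not sparkflow's code: the helper names looks_like_html, load_weights and fetch_weights_without_proxy are hypothetical, it uses the standard library's urllib instead of the HTTP client sparkflow uses, and the URL is whatever your driver actually exposes.

```python
import pickle
import urllib.request


def looks_like_html(payload: bytes) -> bool:
    """Return True if the payload starts like markup rather than a pickle."""
    return payload.lstrip()[:1] == b"<"


def load_weights(payload: bytes):
    """Unpickle the weights, failing with a readable message on a markup body."""
    if looks_like_html(payload):
        # Show the start of the body so a proxy error page is easy to recognise.
        preview = payload[:80].decode("utf-8", errors="replace")
        raise ValueError(f"expected pickled weights, got markup: {preview!r}")
    return pickle.loads(payload)


def fetch_weights_without_proxy(url: str, timeout: float = 10.0):
    """Fetch the weights while ignoring http_proxy/https_proxy settings.

    Assumption: `url` points at the endpoint on the driver that serves the
    pickled weights. An empty ProxyHandler mapping disables proxy detection.
    """
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(url, timeout=timeout) as response:
        return load_weights(response.read())
```

If the preview in the error message shows a proxy page, excluding the driver host from the proxy (for instance through the no_proxy environment variable on the executors) may be enough, which matches what worked for us.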