Split large dictionary #4

Open
arita37 opened this issue Jan 18, 2017 · 4 comments
arita37 commented Jan 18, 2017

Hello,

Sounds like a good project.
Is there a way to pass / scatter a dictionary of 500k keys?
The dictionary is a dictionary of dictionaries, used to handle complex data.

How is the data copied? (I have 5 GB and would rather not copy all of it.)

Owner

mattja commented Jan 19, 2017

Thanks, that is a good suggestion.
Currently there is a special case for scattering of sequences, but not for dictionaries. It would be a useful feature and should be straightforward to add, though I do not have time to do it this month.
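
Until dictionaries get their own special case, one possible workaround is to turn the dictionary into a sequence of smaller dictionaries, since sequences are already handled. The following is only a minimal sketch using the standard library; `split_dict` is a hypothetical helper, not part of distob, and the toy `data` stands in for the real 500k-key dictionary.

```python
from itertools import islice


def split_dict(d, n_chunks):
    """Split dict d into at most n_chunks smaller dicts of near-equal size.

    Hypothetical helper for illustration; it is not part of distob.
    """
    if n_chunks < 1:
        raise ValueError("n_chunks must be at least 1")
    items = iter(d.items())
    base, extra = divmod(len(d), n_chunks)
    chunks = []
    for i in range(n_chunks):
        # The first `extra` chunks take one additional item each.
        size = base + (1 if i < extra else 0)
        if size == 0:
            break
        chunks.append(dict(islice(items, size)))
    return chunks


# Toy nested dictionary standing in for the large one.
data = {"key%d" % i: {"value": i} for i in range(10)}
parts = split_dict(data, 3)
print([len(p) for p in parts])  # [4, 3, 3]
```

The split itself is shallow: each sub-dictionary refers to the same inner dictionaries as the original, so nothing is duplicated locally. The resulting list is an ordinary sequence that could then be handed to whatever scatter mechanism is in use.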

At the moment distob is implemented as an object layer on top of ipyparallel 4.x.
So it is really designed for computation on multiple hosts separated by network links. In my experience,
distob is very inefficient when using multiple CPUs on a single host, because time is wasted on serialization,
deserialization and communication overhead. Currently, when scattering, I believe all the data are serialized by dill and then copied by ipyparallel across sockets, using ZeroMQ.
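
To get a rough feel for that serialization cost before scattering anything, a small measurement like the one below may help. It is a sketch under stated assumptions: the standard-library `pickle` module is used as a stand-in for dill (which offers `dumps` and `loads` functions of the same basic shape), `measure_serialization` is a hypothetical helper, and the sample data is made up. Real numbers will depend on the actual data and on the network.

```python
import pickle
import time


def measure_serialization(obj):
    """Return (size_in_bytes, seconds) for one pickle round trip of obj.

    Hypothetical helper; pickle is a stand-in for dill here.
    """
    start = time.perf_counter()
    payload = pickle.dumps(obj, protocol=pickle.HIGHEST_PROTOCOL)
    restored = pickle.loads(payload)
    elapsed = time.perf_counter() - start
    # Sanity check that the round trip preserved the data.
    assert restored == obj
    return len(payload), elapsed


# Made-up nested dictionary; scale the range up to approximate real data.
sample = {"key%d" % i: {"value": i, "tags": ["a", "b"]} for i in range(1000)}
size, seconds = measure_serialization(sample)
print("%d bytes, %.4f s for %d keys" % (size, seconds, len(sample)))
```

Running this on a representative slice of the data and extrapolating gives an estimate of how much would be serialized and sent over the sockets for the full dictionary.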

In the future it should be possible for distob to use faster back-ends, including one suitable for
parallel processing on a single host without copying data needlessly (and also a back-end for GPU
computation).

Author

arita37 commented Jan 19, 2017 via email

Owner

mattja commented Jan 19, 2017

Sure, if you have some example code, I would be interested to have a look later, when I have some time available.

Author

arita37 commented Jan 20, 2017 via email
