Uniform key popularity distribution for the windows key #7

Open
fbjorkman opened this issue Aug 22, 2022 · 6 comments

@fbjorkman

Hello again and thanks for the previous help!
I'm trying to generate a trace file in which the window keys are drawn uniformly from a specific interval. However, when I set the option "make a key popularity distribution for the windows key (keys used to access state store)" to uniform, every generated key is just "0000000000" (I'm using a constant key size of 10). I've tried different intervals, but the result is always the same.
I couldn't get the zipf distribution to work either. I set the parameter s, but only got the following error message:

Hello from Gadget!
The config file has been successfully read!
Error: The operator cannot be made!

The sequential and constant distributions worked as expected.

@showanasyabi
Contributor

Could you please share the config file you use for your experiments?

@fbjorkman
Author

Here are the config files:
uniformconfig.txt
zipfconfig.txt

@showanasyabi
Contributor

In practice, the keys used to access the state store must be unique. If keys are drawn from a uniform distribution, different windows will end up with the same key in the state store, which leads to nondeterministic behavior. That is why the current version of Gadget only supports unique sequential keys.
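
To illustrate the collision argument, here is a minimal sketch that is independent of Gadget's actual key generator. It draws zero-padded 10-digit keys uniformly from an interval and compares them with sequential keys. The function names and the interval are hypothetical, chosen only for this example.

```python
import random


def uniform_keys(n, low, high, seed=0):
    """Draw n keys uniformly from [low, high] as zero-padded 10-digit strings."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [f"{rng.randint(low, high):010d}" for _ in range(n)]


def sequential_keys(n, start=1):
    """Produce n consecutive keys, each of which is unique by construction."""
    return [f"{k:010d}" for k in range(start, start + n)]


def count_collisions(keys):
    """Number of keys that repeat an earlier key in the list."""
    return len(keys) - len(set(keys))


# Hypothetical setup: 1000 windows, keys drawn from an interval of 10000 values.
uniform = uniform_keys(1000, 0, 9999)
sequential = sequential_keys(1000)

print("uniform collisions:   ", count_collisions(uniform))
print("sequential collisions:", count_collisions(sequential))
```

With far fewer windows than possible key values, uniform draws still repeat keys (the birthday effect), whereas sequential keys never do.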

@fbjorkman
Author

Okay, but the generated traces still access the same state store multiple times anyway, so what is the difference there? I'm looking for a trace where the keys are distributed more randomly in key order, so that consecutive accesses may hit different SSTables. Is there another way to generate that kind of trace?

E.g., instead of having a trace like this:
1 put 0000000001 0000000000 gadget gadgetFile
2 get 0000000002 0000000000 gadget gadgetFile
3 put 0000000002 0000000000 gadget gadgetFile
4 get 0000000003 0000000000 gadget gadgetFile
5 put 0000000003 0000000000 gadget gadgetFile
6 get 0000000003 0000000000 gadget gadgetFile
7 put 0000000003 0000000000 gadget gadgetFile
8 get 0000000004 0000000000 gadget gadgetFile
9 put 0000000004 0000000000 gadget gadgetFile
10 get 0000000003 0000000000 gadget gadgetFile

I'd like something like this, with keys distributed randomly over a range:
1 put 0000100001 0000000000 gadget gadgetFile
2 get 0000000500 0000000000 gadget gadgetFile
3 put 0000000333 0000000000 gadget gadgetFile
4 get 0034005664 0000000000 gadget gadgetFile
5 put 0000012398 0000000000 gadget gadgetFile
6 get 0004505803 0000000000 gadget gadgetFile
7 put 0000004503 0000000000 gadget gadgetFile
8 get 0000200004 0000000000 gadget gadgetFile
9 put 0000000123 0000000000 gadget gadgetFile
10 get 0235956002 0000000000 gadget gadgetFile
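
One possible workaround is to post-process an existing trace rather than change the generator. This is only a sketch and not a Gadget feature. It assumes the third whitespace-separated field of each trace line is the key, as in the examples above, and it remaps every key through a fixed one-to-one function so that distinct keys stay distinct and repeated accesses to one key still hit the same key. The names `scatter_key` and `scatter_trace_line` and the constants are hypothetical.

```python
import math

KEY_WIDTH = 10                 # constant key size used in the traces above
MODULUS = 10 ** KEY_WIDTH      # number of distinct 10-digit keys
MULTIPLIER = 2654435761        # arbitrary constant; must be coprime to MODULUS
OFFSET = 12345                 # arbitrary shift

# An affine map k -> (a*k + b) mod M is a bijection on [0, M) when gcd(a, M) == 1.
assert math.gcd(MULTIPLIER, MODULUS) == 1


def scatter_key(key: str) -> str:
    """Map a zero-padded numeric key to another key, one-to-one."""
    scattered = (int(key) * MULTIPLIER + OFFSET) % MODULUS
    return f"{scattered:0{KEY_WIDTH}d}"


def scatter_trace_line(line: str) -> str:
    """Rewrite the key field (assumed to be the third field) of one trace line."""
    fields = line.split()
    fields[2] = scatter_key(fields[2])
    return " ".join(fields)


trace = [
    "1 put 0000000001 0000000000 gadget gadgetFile",
    "2 get 0000000002 0000000000 gadget gadgetFile",
    "3 put 0000000002 0000000000 gadget gadgetFile",
]
for line in trace:
    print(scatter_trace_line(line))
```

Because the mapping is a bijection, key uniqueness is preserved, but neighbouring keys are no longer adjacent in sort order, so the remapped trace does not have the access pattern that Gadget originally generated.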

@showanasyabi
Contributor

State store operations with the same key stem from events that belong to the same window.

As the paper indicates, state store workloads in stream processing systems have high spatial locality. Gadget mimics a stream processing system to generate its workload. That's why its workloads have high spatial locality.
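
As a rough way to see the difference in locality between the two example traces earlier in this thread, the sketch below computes the mean distance between the keys of consecutive operations. This is a crude proxy chosen for illustration, not the metric used in the paper, and `mean_key_distance` is a hypothetical helper name.

```python
def mean_key_distance(keys):
    """Average absolute difference between the keys of consecutive operations."""
    gaps = [abs(b - a) for a, b in zip(keys, keys[1:])]
    return sum(gaps) / len(gaps)


# Keys of the ten operations in the sequential example trace.
sequential_trace_keys = [1, 2, 2, 3, 3, 3, 3, 4, 4, 3]

# Keys of the ten operations in the randomly distributed example trace.
scattered_trace_keys = [
    100001, 500, 333, 34005664, 12398,
    4505803, 4503, 200004, 123, 235956002,
]

print("sequential trace:", mean_key_distance(sequential_trace_keys))
print("scattered trace: ", mean_key_distance(scattered_trace_keys))
```

A small mean distance means consecutive operations touch keys that sort close together, which is the behavior described above for window-based workloads.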

@fredrikbjorkman

Okay, now it's clearer to me why it behaves the way it does. Thank you for your answers!
