Improve Agitator #188
Comments
So, one idea I had was to make the agitator more dynamic, so that you only have to run one script instead of several. You could, for example, have a script that randomly runs commands from files in a directory. I tested this out, but it looks like my test files have been wiped. The basic gist was to have a set of bash files in a directory, where each file has two functions, do() and undo(). The main agitator script would randomly select a file from the directory, source it, call do(), wait for some period of time, and then call undo().
This sounds like it could be a whole project on its own, rather than something specifically for testing Accumulo. I'd want to be careful about scope creep here. Perhaps if that project were created separately, we could just add config/plugins to make use of it for testing Accumulo?
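I was able to reproduce my test pretty easily. Two script files and the driver:

```bash
#!/bin/bash
# Script A: one unit of agitation with its matching cleanup
function do_it {
  echo "do A"
}
function undo_it {
  echo "undo A"
}
```

```bash
#!/bin/bash
# Script B: same interface as script A
function do_it {
  echo "do B"
}
function undo_it {
  echo "undo B"
}
```

```bash
#!/bin/bash
# Driver: visit every file under scripts/ in random order
for file in $(find scripts -type f | shuf)
do
  source "$file"   # load this file's do_it and undo_it
  do_it
  sleep 1          # wait between do and undo
  undo_it
done
```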
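A variation on the driver above, offered only as a sketch: it picks a single random script per call and runs it in a subshell, so that `do_it` and `undo_it` from one script cannot leak into the next. The function name `run_random_fault`, the `*.sh` naming convention, and the wait argument are assumptions made for illustration; they are not part of the existing agitator.

```bash
#!/bin/bash
# Hypothetical helper (names are illustrative, not existing agitator code).
# $1 = directory of fault scripts, $2 = seconds between do_it and undo_it (default 1)
# The body is wrapped in ( ... ) so everything below runs in a subshell:
# variables and sourced functions disappear when the call returns.
run_random_fault() (
  dir=$1
  wait_secs=${2:-1}
  # shuf -n 1 prints one randomly chosen line from the list of files.
  file=$(find "$dir" -type f -name '*.sh' | shuf -n 1)
  if [ -z "$file" ]; then
    echo "no fault scripts found in $dir" >&2
    exit 1
  fi
  . "$file"
  # Refuse to run a script that does not provide both halves of the pair.
  if ! command -v do_it >/dev/null || ! command -v undo_it >/dev/null; then
    echo "$file must define do_it and undo_it" >&2
    exit 1
  fi
  do_it
  sleep "$wait_secs"
  undo_it
)
```

Calling `run_random_fault scripts 30` in a loop would then give the "one script to run" behaviour described earlier, with each fault isolated from the others.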
My suggestion was merely to redo the packaging for the existing agitator functions (stop/start DN, NN, TS, etc.) and maybe to add new functions to introduce host issues. I was thinking of something as simple as my example above.
It seems to me like there are two separate ideas here:
Not sure if both of these should be handled together or on their own.
Actually, both came from me, and the idea is that 2 allows for 1 to be easily added.
Have you looked into existing open source utilities out there to provide agitation that we wouldn't have to maintain, but can just use off the shelf to provide those extra features?
Yes, the scripts mentioned above come from the old Netflix Chaos Monkey tools. They have since rewritten that framework in Go and, from what I can tell, it's targeted at commercial cloud infrastructure. I'm not sure how much work it would be to fork it and add our stuff to it. Your question led me to look again at alternatives, and I found https://harness.io/blog/chaos-engineering-tools/, which references several possibilities in a new discipline called Chaos Engineering (which sounds way cooler than Agitation, by the way).
A recent comment from @dlmarion about possible improvements to Agitator:
"I think we could make the scripts more modular, reuse some code, and maybe include other types of failure. For example, the old Netflix Chaos Monkey scripts use Linux host commands to introduce various things into the environment - high CPU / IO usage, packet loss, packet latency, packet corruption, bad routing, etc."
https://github.com/Netflix/SimianArmy/tree/master/src/main/resources/scripts
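As a rough illustration of how one of the failure types listed above (packet latency) might be packaged in the do_it/undo_it style discussed in this thread, here is a minimal sketch using `tc` with the `netem` queueing discipline. It is not taken from the SimianArmy scripts or from the agitator; `FAULT_IFACE`, `FAULT_DELAY`, `DRY_RUN`, and `run_cmd` are hypothetical names, and the real commands need root on a Linux host.

```bash
#!/bin/bash
# Hypothetical fault script: add and remove network latency with tc/netem.
# FAULT_IFACE, FAULT_DELAY and DRY_RUN are illustrative settings, not existing ones.
FAULT_IFACE=${FAULT_IFACE:-eth0}
FAULT_DELAY=${FAULT_DELAY:-100ms}

# With DRY_RUN=1, print the command instead of running it
# (changing queueing disciplines requires root).
run_cmd() {
  if [ "${DRY_RUN:-0}" = 1 ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Delay every outgoing packet on the interface by FAULT_DELAY.
do_it() {
  run_cmd tc qdisc add dev "$FAULT_IFACE" root netem delay "$FAULT_DELAY"
}

# Remove the netem qdisc again, restoring normal behaviour.
undo_it() {
  run_cmd tc qdisc del dev "$FAULT_IFACE" root netem
}
```

Running it with `DRY_RUN=1` prints the two `tc` commands without touching the host, which makes it easier to review a new fault before letting a driver pick it up.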