Consistent Riak (riak_kv_ensemble_backend) fails to start if AAE is disabled #959
Comments
Working up a test case to reproduce.
Turn off AAE and try to start riak_kv with strong consistency on. Then look at the Riak ensembles: they will never reach quorum.
@sargun you are correct -- this behavior is as designed. When AAE is disabled and strong consistency is enabled, strongly consistent operations will not work because ensembles will never gain quorum. However, non-consistent operations will continue to work. We plan to add a log message, as well as a riak-admin warning, regarding the inconsistent configuration. cc @jtuple
Why not do the AAE sync once, when riak_ensemble first starts up? riak_kv_entropy_manager supports being executed manually. You don't need continual synchronization, just a sync during ensemble member start-up.
strong consistency is enabled and AAE is disabled (defect basho/riak_kv#959)
AAE is disabled. (defect basho/riak_kv#959) - Adds additional console output to reset-current-env to explain configuration and steps being executed - Adds the -n option to the reset-current-env script to specify the number of nodes to build. By default, 5 will be created.
@sargun AAE tree construction is a long-running operation (often measured in hours to days, depending on dataset size). Thus, Riak doesn't build an AAE tree on demand for a given exchange. Riak builds AAE trees once a week and keeps them up to date in real time as K/V operations come in. If AAE is disabled, these trees aren't being maintained in real time, so all necessary trees would need to be built from scratch. Since that could take days, and therefore leave ensembles unavailable for days, this is operationally infeasible. /cc @jtuple
Opened the following PRs to verify proper Riak startup under the conditions described, log a warning on startup, and check the strong consistency/AAE configuration using Riaknostic:
All PRs have been approved and merged.
I think I'm done with shell scripts for now. Port to Perl. Some cleaning up of the Perl code. Add check for ring_creation_size != num_partitions. Check if the Riak node is running. Prefer riak over riaksearch when looking for commands. Make it a proper hash. Fetch number of nodes in the ring. Look for crash dumps and emfile errors. Add checks for number of partitions vs. nodes and for not being part of the ring. Align status output for easier parsing. Check number of connected nodes vs. ring members. Use a better ratio for partitions vs. nodes. Add some checks if the node is running to prevent errors in checks. Start porting Riaknostic to Erlang. Remove io:format that's broken anyway. Remove shell script version. Fetch and start printing data from the Riak instance. Add Riak installation detection. Find Riak logs. Less code is good. Just tail-recurse it. Change run code to use a config dict. Add module to test for ring membership. Add more check modules. Add module to check for connected nodes. Fixed ping riak to work more consistantly. General code cleanup. Moved log directories into the riaknostic.app. Removed unneccassary filter code. Updated rebar to the current version Scriptized. Name and cookie can be passed in as parameters. Added a README Added type specs. Added more output to nodes connected. Added node to Config dict. Added initial version of disk check. Added noatime check for all mounted disks. Removed flag to specify vm name because it's not needed. Exrcised riaknostic node from connected node list. Removed perl script. Added dizzy's bitcask large value check. Better output from nodes connected Got rid of unnecessary sup. More readable output. Fixed incorrect application start callback return. Memory use stats Improved organization Added util library Added ability to output warnings and errors from riaknostic modules. Added a gen_server for logging. 
Improved logging with more logical strucutre Using list:keyfind for OTP release per Sean's comment Added conversion from binary to float Fixed issue with higher memory usage check vm.args can now be parsed in A bit of cleanup Integrated lager Improved code organization Moved from dicts to basic prop lists Added all ebin directories in riak lib to path Riaknostics are discovered via their run/1 methods Added lots of command line configuration Added sibling and vclock options to large value check Added a guard to bitcask_threshold_check function Key vals are binary_to_termed before printing. Inserted tabs in readme - usage Fixed broken lager:warning call. Add license headers to all source files. Closes #3. Upgrade rebar. Add a Makefile, copied from lager. Add check-module behavior according to plan. Starting refactor of check modules. Refactor memory use, add TODOs. Refactor ring membership. Added noatime check. Getting the DataDir from riaknostic_config won't work, but if DataDir is set correctly, it is a valid noatime check. Refactor nodes connected and fix some compilation bugs. Refactor ring size check. Fix typo/syntax error. Rename disk check module. Add ability to identify modules that are checks. Update TODOs. Refactor Joe's disk check module. Add a little documentation to the private functions. Add getopt, cleanup unused or antiquated modules. Switching to use a global notion of the config, probably app env but TBD. Do a little line-wrapping. WIP riaknostic_config accessors. Add top-level script with getopt and check descriptions. Remove high-impact bitcask check. Implement a huge swath of the runner, disk check works! application:get_env/2 returns {ok, Value}. Absolutize data directories. Expose base_dir/0 and etc_dir/0. Adjust crash dump detector to use base_dir(). Recognize -sname switch and distinguish between short and long names. Added riaknostic_node module for interacting with the local/cluster nodes. Fix a few bugs and enhance debugging information. 
* Messages are properly sorted now. * Match output of ps command properly. * Add debug logging of node-connection logic. * Improve detection of node connectivity. Fix docs target, ignore generated docs. Add edoc overview, initial stylesheet. Finish up some styles and documentation, more detail on behaviour needed. Add a more verbose description of the behaviour. Make clear that this stylesheet is for edoc. Initial version of the landing page. Don't need to link to edoc, reflow some of those paragraphs. Ignore parts of the gh-pages branch. WIP make pages. Add forkme ribbon. Make sure to ignore root PNG files and add the new image to the pages. Remove useless memsup info. Added Dr. Basho. Thanks @jgnewman! Closes #13. Build package tarball. Closes #12. Update the README. Minor wording correction, add missing docs to riaknostic_node. Add a word of caution. Solaris ps doesn't understand -o command and we don't use it anyway. Forgot to stage this line. Use -nocookie to prevent usage of the .erlang.cookie file. Closes #16. Fedora installs Riak libraries to /usr/lib64. Closes #18 Setup dialyzer. Fix dialyzer warnings. Check for: * Ring sizes not a multiple of 2 * Deployments where vnodes/node < 3% of ring size * Deployments where vnodes/node > 70% of ring size Change ring size inappropriate check from multiple of 2 to power of 2. Leave out the ring size/vnode messages until we have a better understanding of the relationship and can give better advice. Fix a few mistakes. Fix cluster_command 1 Added check for ring preflists satisfying n_val Check whether search is enabled on all nodes v1.0.1 Update lager dependency Changed regex split to string tokens. Fixes failure in Ubuntu Fix xargs argument for Linux eaccess -> eacces to catch the error correctly Add can_connect_all to check if all nodes are available. The reason for this, is that for search we're checking if search is enabled or disabled on all nodes. 
If a node is down, this is not a valid test, and errors out otherwise. Check if connected first before running all connected Travis CI config Ignore .eunit folder Add meck as a dependency Initial eunit test for riaknostic_check_ring using meck v1.0.2 Add Travis CI Build Status to README.md some work on the docs re: 26 & 29 add lines for the autosaves of the one true editor Added some reassuring output. Just a few lines so that the runner of the command knows that riaknostic is running and that it exited without error. Update README.md SmartOS has different paths from standard Solaris, using pkg_add package with current version 1.2 add basic sysctl checking Removed freeBSD stuff end of the day temp commit, code still kind of broken move stuff to zip rather than os:cmd added multiple platform support. a couple of bugs/features: - we also need to be able to just grab a copy of a file - we need a list of tests for each platform - need cases for sunos and freebsd - fold in regular diagnostic messages (once I land the fix for #14). - there is a bug in shelling out, only some of the output is actually recorded. another broken checkin, so I can work on something else added the ability to copy out named files changed the where the files were stored before cleanup to CWD. update to flesh out the export command a bit more. 
still needs much testing, especially on smartos clean up, fix some bugs, add directory-grabbing added a (bad) first pass at machine-readable output midstream checking to get back to work on export added a (bad) first pass at machine-readable output midstream checking to get back to work on export Changed getopt version, one other fix Fix default output Added some comments and TODO's Fix lager dependency version now that lager was updated Fixate lager dependency on 1.2.1 Change dep on lager to 1.2.2 to match the rest of riak Roll version riaknostic 1.1.0 Clarify that Riak 1.3 already has Riaknostic installed removed misplaced parathesis Add OpenBSD bits Update lager dep to 2.0.0rc2 Lager to 2.0.0 final Un-escriptize riaknostic and modify for lager 2.0 compatability Add an extra log line for clarity when running non-existent checks newline fix Restore riaknostic output to console When riaknostic became part of Riak instead of a separate app, its output (through lager) ended up in the node's console.log instead of being output by 'riak-admin diag'. Among other things, this broke the riaknostic_rt riak test. This adds a layer on top of lager, so messages can be directed to the console again, simply by using io:format. This way, messages are sent to the group_leader instead of the user process, which is what the lager backend does. When riaknostic is invoked through RPC by riak-admin, the caller becomes the group leader and picks up those messages. I wish there was a cleaner way to do this leveraging something in lager, but I couldn't find any. Roll riaknostic version 1.2.0 Pin meck dependency to a specific tag Remove sysctl checks Sysctl checks are now handled by the riak_kv_env module. 
Standardize meck dep Standardize on a rebar.config dep format to reduce conflicts pull app.config and vm.args from init:get_arguments added extra -vm_args to CONFIG_ARGS for easy access by erlang vm Roll riaknostic 1.2.1 to pull in lager 2.0.1 Fix rebar.config url to stay consistent Bump lager dep to 2.0.2 Bump lager dep to 2.0.3 - Adds a check for strong consistency configuration -- warning when strong consistency is enabled and AAE is disabled (defect basho/riak_kv#959)
So, I noticed that if I don't have anti-entropy on and I enable strongly consistent Riak, it doesn't work. Specifically, riak_kv_ensembles sets up the ensembles, but the riak_ensemble_peer never gets past the all_sync state. It appears that this is because the riak_kv_ensemble_backend relies on anti-entropy to perform an exchange before it comes up. See here:
(I uncommented the debugging). If riak_kv_entropy_manager is not enabled, then riak_kv_entropy_info:exchanges will always be empty. Can we either (1) manually trigger an AAE exchange upon noticing that strong consistency is enabled (I imagine you can do this by setting the mode to manual and then queueing up the AAE jobs), or (2) show a warning telling the user to enable AAE?
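To make option (2) concrete, here is a minimal sketch of the kind of configuration check being asked for. It is not the check that was merged into Riak or Riaknostic (that code is Erlang and is not shown in this thread). It assumes riak.conf-style `key = value` lines and the setting names `strong_consistency` and `anti_entropy`; the function names and the warning text are hypothetical.

```python
# Sketch only: illustrates the "warn on inconsistent configuration" idea.
# Assumptions (not taken from this thread): settings arrive as riak.conf-style
# "key = value" lines, and the keys are `strong_consistency` and `anti_entropy`.

def parse_conf(text):
    """Parse 'key = value' lines, ignoring blank lines and '#' comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


def consistency_warnings(settings):
    """Return a list of warnings when strong consistency is on but AAE is off.

    Simplification for this sketch: a missing key is treated as "off".
    """
    sc_on = settings.get("strong_consistency", "off") == "on"
    aae_on = settings.get("anti_entropy", "off") in ("active", "active-debug")
    if sc_on and not aae_on:
        return ["strong consistency is enabled but AAE is disabled; "
                "ensembles will never reach quorum"]
    return []


if __name__ == "__main__":
    conf = "strong_consistency = on\nanti_entropy = passive\n"
    for warning in consistency_warnings(parse_conf(conf)):
        print("WARNING:", warning)
```

The check only reports the mismatch; it does not try to trigger an exchange, which matches the maintainers' point that building AAE trees on demand is not practical.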