Ensure new indexes can be created during deployment #210
Comments
Might also want to experiment with cluster.routing.allocation.cluster_concurrent_rebalance and other throttling settings: reduce them pre-deploy, reset them post-deploy. By setting them extremely low, you'd effectively pause rebalancing while still allowing new primaries and replicas to be allocated as necessary. Used it a few times at cityindex. Important to note that index and shard size is a factor, though: if other disaster scenarios occurred during the deployment, it would cause additional strain and delay on the cluster. And once a shard does start transferring, the transfer can't be interrupted even if the original node comes back online mid-transfer.

You may also be able to take advantage of how BOSH operates on a single AZ at a time in cloud-config-aware environments. If you assume operators are using awareness settings to distribute their shards across AZs in an HA-aware scenario, drain could use a document to persist which AZ is being restarted: if it's the current AZ, restart; if not, and the cluster is not yet green, wait until it is green or the document points at this AZ, then update the document and restart. This approach would require operators to use cloud-config and specific index awareness settings, though.

A couple of other things concerned me when initially looking at the new drain script...

I don't think there's an easy solution though. I'll think about it some more.
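A minimal sketch of the throttling idea above, using the Elasticsearch cluster settings API; the `localhost:9200` endpoint and the specific values are assumptions and would need tuning per cluster:

```sh
# Pre-deploy: throttle rebalancing down to an effective pause.
# New primaries and replicas can still be allocated as needed.
curl -XPUT 'http://localhost:9200/_cluster/settings' \
  -H 'Content-Type: application/json' \
  -d '{
    "transient": {
      "cluster.routing.allocation.cluster_concurrent_rebalance": 0,
      "indices.recovery.max_bytes_per_sec": "1mb"
    }
  }'

# Post-deploy: clear the transient overrides so the cluster defaults apply again.
curl -XPUT 'http://localhost:9200/_cluster/settings' \
  -H 'Content-Type: application/json' \
  -d '{
    "transient": {
      "cluster.routing.allocation.cluster_concurrent_rebalance": null,
      "indices.recovery.max_bytes_per_sec": null
    }
  }'
```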
@dpb587 Excellent points; thank you. Looks like we have some experimenting to do with those throttling settings. We're considering scale down as a special case and attempting to document the gotchas. We're also telling people not to have more "in flight" nodes than replicas.
True. Meaning that the cluster will go red during upgrades (with logs backing up in the queue); but it should then recover once shard allocation is re-enabled.
@mrdavidlaing any updates on the topic in the meantime? Is it still the case that I have to run an errand to re-enable shard allocation after a successful deployment?
@voelzmo You still need to run the `enable_shard_allocation` errand.
Just curious--could re-enabling shard allocation be moved to a post-deploy script instead of an errand so that operators don't have to run the errand manually or add it to CI?
Joshua, Definitely; we're just waiting for that functionality to become widely available.

David Laing
Even when people have upgraded their Directors, keep in mind that post-deploy is still optional and disabled by default: http://bosh.io/jobs/director?source=github.com/cloudfoundry/bosh&version=257.15#p=director.enable_post_deploy
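If post-deploy support does become broadly available and enabled, the errand body could presumably move into a job's post-deploy script. A hypothetical sketch only; the endpoint and the assumption that the errand simply flips `cluster.routing.allocation.enable` back to `all` are not taken from the release:

```sh
#!/bin/bash
# Hypothetical /var/vcap/jobs/<job>/bin/post-deploy script: re-enable shard
# allocation once all instances in the deployment have been updated.
set -e

ES_URL="http://localhost:9200"   # assumed; in practice this would come from job properties

curl -sf -XPUT "${ES_URL}/_cluster/settings" \
  -H 'Content-Type: application/json' \
  -d '{"transient": {"cluster.routing.allocation.enable": "all"}}'
```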
Following on from the [known limitations](https://github.com/logsearch/logsearch-boshrelease/releases/tag/v201.0.0#known-limitations) in v201 and the discussion at #209 ...
As of v201, shard allocation is disabled at the beginning of a deployment that affects elasticsearch nodes, and then manually re-enabled after the deployment with:

`bosh run errand enable_shard_allocation`
This means that:
a. Unnecessary shard movements are avoided during deployment, speeding up deploys
b. Primary indices remain "green" throughout the deployment so that
c. New data can be written to existing indexes during the deployment
BUT:
d. New indexes cannot be created during deployment
e. Index replicas remain unallocated until the `enable_shard_allocation` errand is run.

The purpose of this issue is to capture ideas for alternative techniques that can remove the two remaining limitations, d & e.
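For context, the disable/re-enable behaviour described above roughly corresponds to the following cluster settings calls (a sketch only; the exact setting and value the release's drain script uses are assumptions):

```sh
# Pre-deploy (drain): stop shard allocation. Already-allocated primaries stay in place
# and keep accepting writes, but replicas -- and shards for any newly created index --
# cannot be assigned, which matches limitations d & e above.
curl -XPUT 'http://localhost:9200/_cluster/settings' \
  -H 'Content-Type: application/json' \
  -d '{"transient": {"cluster.routing.allocation.enable": "none"}}'

# Post-deploy (the enable_shard_allocation errand): allow allocation again so that
# replicas and any indexes created during the deployment can be assigned to nodes.
curl -XPUT 'http://localhost:9200/_cluster/settings' \
  -H 'Content-Type: application/json' \
  -d '{"transient": {"cluster.routing.allocation.enable": "all"}}'
```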