while scaling - affects the running AI response - causing spikes and slowness #472

danupo068 · 2019-04-20T15:29:20Z

Description
When using App Autoscaler, the overall throughput of the app drops while scaling is in progress. The app instances (AIs) that already exist show higher response times while the new instances are coming up. This is hugely impacting our production systems that use the autoscaler. We would appreciate your insights, especially regarding large-scale production environments.
Observations:
Some queries on scaling_events are significantly affecting performance, especially since some tables do not have indices.

cf-gitbot · 2019-04-20T15:29:22Z

We have created an issue in Pivotal Tracker to manage this:

https://www.pivotaltracker.com/story/show/165481535

The labels on this github issue will be updated when the story is started.

cdlliuy · 2019-04-23T03:26:39Z

@danupo068

Which version of app-autoscaler are you using in your production system?
app-autoscaler does not inject any data into your application instances. Could you provide more detailed information about what happened?
The scaling history query only happens internally, between autoscaler components. Yes, the missing index may be an issue, but it should not affect the performance of app instances.

In summary, more information is welcome.

boyang9527 · 2019-05-06T18:34:23Z

@cdlliuy Let us run some experiments on that. This might be related to the health check of new app instances: the default is "port" instead of "http". If "port" is used, CF will consider an instance ready when it actually is not, which causes failures or long delays. Another potential cause is load balancing: new instances need time to warm up, so they have longer response times than existing instances, while the router distributes requests round-robin.
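
To illustrate the health check point, here is a minimal local sketch (not CF's actual health check implementation). It assumes a hypothetical app that binds its port immediately but answers 503 until it has finished warming up. `WarmingHandler`, `port_check`, `http_check` and `demo` are made-up names for this illustration.

```python
import http.client
import socket
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class WarmingHandler(BaseHTTPRequestHandler):
    """Hypothetical app: listens right away, but is not ready to serve yet."""

    ready = False  # flipped to True once the (simulated) warm-up is done

    def do_GET(self):
        self.send_response(200 if WarmingHandler.ready else 503)
        self.end_headers()

    def log_message(self, *args):
        pass  # keep the demo output quiet


def port_check(host, port, timeout=1.0):
    """Port-style check: passes as soon as a TCP connection can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def http_check(host, port, path="/", timeout=1.0):
    """HTTP-style check: passes only if the endpoint answers with status 200."""
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("GET", path)
        return conn.getresponse().status == 200
    except (OSError, http.client.HTTPException):
        return False
    finally:
        conn.close()


def demo():
    """Return (port_ok, http_ok) before and after the simulated warm-up."""
    server = HTTPServer(("127.0.0.1", 0), WarmingHandler)  # port 0: pick a free port
    port = server.server_address[1]
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        WarmingHandler.ready = False
        before = (port_check("127.0.0.1", port), http_check("127.0.0.1", port))
        WarmingHandler.ready = True
        after = (port_check("127.0.0.1", port), http_check("127.0.0.1", port))
    finally:
        server.shutdown()
        server.server_close()
    return before, after


if __name__ == "__main__":
    print(demo())
```

During the warm-up phase the port-style check already passes while the HTTP-style check still fails, which is the window in which a port-checked instance could receive traffic before it can serve it well.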

These two may explain why the overall response time increases, but they cannot explain the increased response time of existing instances. We need to investigate.
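
A rough sketch of that reasoning, as a toy model rather than a measurement: requests are spread round-robin over existing and new instances, and new instances are slow until they have served a number of requests. All numbers and the `simulate` function are assumptions made up for illustration.

```python
from statistics import mean


def simulate(num_existing=4, num_new=2, requests=600,
             warm_ms=100.0, cold_ms=400.0, warmup_requests=50):
    """Round-robin requests over all instances.

    A new instance answers in cold_ms until it has served warmup_requests
    requests; every other request is answered in warm_ms.
    Returns (mean existing latency, mean new latency, overall mean latency).
    """
    instances = [{"new": False, "served": 0} for _ in range(num_existing)]
    instances += [{"new": True, "served": 0} for _ in range(num_new)]
    latencies = {"existing": [], "new": []}

    for i in range(requests):
        inst = instances[i % len(instances)]  # round-robin routing
        cold = inst["new"] and inst["served"] < warmup_requests
        group = "new" if inst["new"] else "existing"
        latencies[group].append(cold_ms if cold else warm_ms)
        inst["served"] += 1

    overall = mean(latencies["existing"] + latencies["new"])
    return mean(latencies["existing"]), mean(latencies["new"]), overall


if __name__ == "__main__":
    existing, new, overall = simulate()
    print(f"existing={existing:.0f}ms new={new:.0f}ms overall={overall:.0f}ms")
```

In this model the overall mean goes up while the existing instances stay at their warm latency, so slower existing instances would have to come from something the model leaves out.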

@danupo068 FYI, see the above. If you can provide more information, it will be much easier for us to diagnose: for example, which language runtime you are using, the health check type, the scaling rules, etc.

cf-gitbot added the unscheduled label Apr 20, 2019

danupo068 changed the title ~~while scaling - the running AI response times have increased~~ while scaling - affects the running AI response - causing spikes and slowness Apr 20, 2019
