Hystrix: Working Principles and Practice #2

Open
greatwqs opened this issue Mar 22, 2019 · 0 comments
greatwqs commented Mar 22, 2019

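"Hystrix" means porcupine, an animal that protects itself with the quills on its back. Netflix's Hystrix is a library that helps with timeout handling and fault tolerance when distributed systems interact, and it protects the system in a similar way.
Hystrix evolved out of resilience engineering work that the Netflix API team began in 2011. In 2012, Hystrix continued to evolve and mature, and many teams within Netflix adopted it. Today, Netflix executes tens of billions of thread-isolated calls and hundreds of billions of semaphore-isolated calls through Hystrix every day. This has led to a dramatic improvement in uptime and resilience.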
https://github.com/Netflix/Hystrix

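What problem does Hystrix solve?

Applications in complex distributed architectures have many dependencies, each of which will inevitably fail at some point. If the host application is not isolated from these external failures, it risks being taken down with them.
For example, for an application that depends on 30 services, each with 99.99% uptime, you can expect the following: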

99.99%^30 ≈ 99.7% uptime
0.3% of 1 billion requests = 3,000,000 failures
2+ hours of downtime per month, even if all dependencies have excellent uptime.
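
The arithmetic above can be checked with a few lines of Java. This is only a minimal sketch: it assumes the 30 dependencies fail independently and that a month has 30 days, and the class and method names are illustrative, not part of Hystrix.

```java
/** Illustrative helper (not part of Hystrix) for the compound-uptime arithmetic. */
class UptimeEstimate {

    /**
     * Uptime of a request path that needs every one of the given dependencies,
     * assuming their failures are independent.
     */
    static double compoundUptime(double perServiceUptime, int services) {
        return Math.pow(perServiceUptime, services);
    }

    /** Expected downtime in hours over a 30-day month for a given uptime fraction. */
    static double monthlyDowntimeHours(double uptime) {
        return (1.0 - uptime) * 30 * 24;
    }

    public static void main(String[] args) {
        double uptime = compoundUptime(0.9999, 30);
        long failuresPerBillion = Math.round((1.0 - uptime) * 1_000_000_000L);
        System.out.printf("compound uptime: %.2f%%%n", uptime * 100);
        System.out.printf("failures per 1 billion requests: %,d%n", failuresPerBillion);
        System.out.printf("downtime per month: %.1f hours%n", monthlyDowntimeHours(uptime));
    }
}
```

The computed figures land close to the rounded values quoted above: about 99.7% uptime, roughly 3 million failed requests per billion, and a little over 2 hours of downtime per month.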

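Reality is usually worse.
Even when every dependency performs well, the aggregate impact of just 0.01% downtime on each of dozens of services adds up to hours of downtime per month unless the whole system is engineered for resilience.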

Hystrix design goals

  • Preventing any single dependency from using up all container (such as Tomcat) user threads.
  • Shedding load and failing fast instead of queueing.
  • Providing fallbacks wherever feasible to protect users from failure.
  • Using isolation techniques (such as bulkhead, swimlane, and circuit breaker patterns) to limit the impact of any one dependency.
  • Optimizing for time-to-discovery through near real-time metrics, monitoring, and alerting.
  • Optimizing for time-to-recovery by means of low latency propagation of configuration changes and support for dynamic property changes in most aspects of Hystrix, which allows you to make real-time operational modifications with low latency feedback loops.
  • Protecting against failures in the entire dependency client execution, not just in the network traffic.

How Hystrix achieves its goals

  • Wrapping all calls to external systems (or "dependencies") in a HystrixCommand or HystrixObservableCommand object, which typically executes within a separate thread.
  • Maintaining a small thread pool (or semaphore) for each dependency; if it becomes full, requests destined for that dependency are immediately rejected instead of queued up.
  • Measuring successes, failures (exceptions thrown by the client), timeouts, and thread rejections.
  • Tripping a circuit breaker to stop all requests to a particular service for a period of time, either manually or automatically if the error percentage for the service passes a threshold.
  • Performing fallback logic when a request fails, is rejected, times out, or is short-circuited.
  • Monitoring metrics and configuration changes in near real-time.
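
To make the "reject instead of queue, then fall back" idea concrete, here is a simplified semaphore bulkhead in plain Java. It is a sketch under assumptions, not Hystrix's implementation: the class `SemaphoreBulkhead` and its `execute` method are hypothetical names, and only `java.util.concurrent.Semaphore` from the JDK is used.

```java
import java.util.concurrent.Semaphore;
import java.util.function.Supplier;

/** Hypothetical bulkhead: caps concurrent calls to one dependency and never queues. */
class SemaphoreBulkhead {
    private final Semaphore permits;

    SemaphoreBulkhead(int maxConcurrentCalls) {
        this.permits = new Semaphore(maxConcurrentCalls);
    }

    /** Runs the call if a permit is free; otherwise, or if the call throws, returns the fallback. */
    <T> T execute(Supplier<T> call, Supplier<T> fallback) {
        if (!permits.tryAcquire()) {
            return fallback.get();   // rejected: fail fast instead of waiting for a permit
        }
        try {
            return call.get();
        } catch (RuntimeException e) {
            return fallback.get();   // the dependency call itself failed
        } finally {
            permits.release();
        }
    }

    public static void main(String[] args) {
        SemaphoreBulkhead bulkhead = new SemaphoreBulkhead(1);
        System.out.println(bulkhead.execute(() -> "live response", () -> "fallback"));
        // The nested call runs while the only permit is held, so it is rejected.
        System.out.println(bulkhead.execute(
                () -> bulkhead.execute(() -> "inner", () -> "rejected"),
                () -> "fallback"));
    }
}
```

A semaphore keeps the call on the caller's thread, so it cannot enforce a timeout by itself; that is one reason a separate thread pool per dependency is the more common isolation choice.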

Hystrix dependency isolation

[Figure: dependency isolation diagram (soa-4-isolation-640)]

How Hystrix works

[Figure: Hystrix command flow chart]
https://raw.githubusercontent.com/wiki/Netflix/Hystrix/images/hystrix-command-flow-chart.png

  1. Construct a HystrixCommand or HystrixObservableCommand Object
  2. Execute the Command
  3. Is the Response Cached?
  4. Is the Circuit Open?
  5. Is the Thread Pool/Queue/Semaphore Full?
  6. HystrixObservableCommand.construct() or HystrixCommand.run()
  7. Calculate Circuit Health
  8. Get the Fallback
  9. Return the Successful Response

Source: https://github.com/Netflix/Hystrix/wiki/How-it-Works
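
The order of the numbered steps can be sketched in a few lines of plain Java. This is a deliberately simplified, single-threaded illustration: `CommandFlowSketch`, its fields, and its `execute` method are hypothetical and do not mirror Hystrix's real classes, and the circuit and capacity checks are reduced to boolean flags.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

/** Illustrative walk through the numbered steps; not Hystrix's real classes. */
class CommandFlowSketch {
    private final Map<String, String> responseCache = new HashMap<>();
    boolean circuitOpen;        // stand-in for the circuit breaker's state (step 4)
    boolean capacityExhausted;  // stand-in for a full thread pool, queue, or semaphore (step 5)
    int successes;              // health counters updated in step 7
    int failures;

    String execute(String cacheKey, Supplier<String> run, Supplier<String> fallback) {
        String cached = responseCache.get(cacheKey);   // step 3: is the response cached?
        if (cached != null) {
            return cached;
        }
        if (circuitOpen || capacityExhausted) {        // steps 4 and 5
            return fallback.get();                     // step 8: get the fallback
        }
        try {
            String response = run.get();               // step 6: run the wrapped call
            successes++;                               // step 7: record health
            responseCache.put(cacheKey, response);
            return response;                           // step 9: successful response
        } catch (RuntimeException e) {
            failures++;                                // step 7: record health
            return fallback.get();                     // step 8: get the fallback
        }
    }

    public static void main(String[] args) {
        CommandFlowSketch flow = new CommandFlowSketch();
        System.out.println(flow.execute("student:1", () -> "fresh", () -> "fallback"));
        flow.circuitOpen = true;
        // A cached key is still served, because the cache check comes before the circuit check.
        System.out.println(flow.execute("student:1", () -> "fresh", () -> "fallback"));
        System.out.println(flow.execute("student:2", () -> "fresh", () -> "fallback"));
    }
}
```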

Hystrix circuit breaker internals

[Figure: circuit breaker diagram]

https://raw.githubusercontent.com/wiki/Netflix/Hystrix/images/circuit-breaker-1280.png

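How the circuit breaker decides to open or close

  1. Assuming the request volume across the circuit meets a certain threshold (HystrixCommandProperties.circuitBreakerRequestVolumeThreshold())
  2. and assuming the error percentage exceeds the threshold error percentage (HystrixCommandProperties.circuitBreakerErrorThresholdPercentage()),
  3. then the circuit breaker transitions from CLOSED to OPEN.
  4. While it is open, it short-circuits all requests made against that circuit breaker.
  5. After some amount of time (HystrixCommandProperties.circuitBreakerSleepWindowInMilliseconds()), the next single request is let through (this is the HALF-OPEN state). If the request fails, the circuit breaker returns to the OPEN state for the duration of the sleep window. If the request succeeds, the circuit breaker transitions to CLOSED and the logic in step 1 takes over again.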
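
// Enable the circuit breaker
@HystrixProperty(name = "circuitBreaker.enabled", value = "true"),
// Minimum number of requests that must be seen before the error rate is evaluated
@HystrixProperty(name = "circuitBreaker.requestVolumeThreshold", value = "10"),
// Error percentage at or above which the circuit trips open
@HystrixProperty(name = "circuitBreaker.errorThresholdPercentage", value = "40"),
// Sleep window: once the circuit trips, requests are short-circuited to the fallback logic for this long.
// After the window elapses, a trial request is let through; if it succeeds the circuit closes, otherwise a new sleep window starts.
@HystrixProperty(name = "circuitBreaker.sleepWindowInMilliseconds", value = "10000"),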
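
The CLOSED, OPEN, and HALF-OPEN transitions described above can be sketched as a small state machine. This is a simplified illustration rather than Hystrix's implementation: `SimpleCircuitBreaker` and its methods are hypothetical names, the counters cover the whole time since the circuit last closed instead of a rolling window, and the clock is injected so the behaviour can be exercised without sleeping.

```java
import java.util.function.LongSupplier;

/** Hypothetical, simplified circuit breaker; Hystrix itself uses rolling-window metrics. */
class SimpleCircuitBreaker {
    enum State { CLOSED, OPEN, HALF_OPEN }

    private final int requestVolumeThreshold;
    private final int errorThresholdPercentage;
    private final long sleepWindowMillis;
    private final LongSupplier clock;   // injected time source in milliseconds

    private State state = State.CLOSED;
    private int total;
    private int errors;
    private long openedAt;

    SimpleCircuitBreaker(int requestVolumeThreshold, int errorThresholdPercentage,
                         long sleepWindowMillis, LongSupplier clock) {
        this.requestVolumeThreshold = requestVolumeThreshold;
        this.errorThresholdPercentage = errorThresholdPercentage;
        this.sleepWindowMillis = sleepWindowMillis;
        this.clock = clock;
    }

    /** Returns true if the next request may reach the dependency. */
    synchronized boolean allowRequest() {
        if (state == State.CLOSED) {
            return true;
        }
        if (state == State.OPEN && clock.getAsLong() - openedAt >= sleepWindowMillis) {
            state = State.HALF_OPEN;    // let exactly one trial request through
            return true;
        }
        return false;                   // still sleeping, or a trial request is in flight
    }

    synchronized void recordSuccess() {
        if (state == State.HALF_OPEN) { // trial succeeded: close and start counting afresh
            state = State.CLOSED;
            total = 0;
            errors = 0;
        } else if (state == State.CLOSED) {
            total++;
        }
    }

    synchronized void recordFailure() {
        if (state == State.HALF_OPEN) { // trial failed: reopen for another sleep window
            trip();
        } else if (state == State.CLOSED) {
            total++;
            errors++;
            if (total >= requestVolumeThreshold
                    && errors * 100 >= errorThresholdPercentage * total) {
                trip();
            }
        }
    }

    synchronized State state() {
        return state;
    }

    private void trip() {
        state = State.OPEN;
        openedAt = clock.getAsLong();
    }

    public static void main(String[] args) {
        // Same values as the properties above: 10 requests, 40% errors, 10 s sleep window.
        SimpleCircuitBreaker breaker =
                new SimpleCircuitBreaker(10, 40, 10_000, System::currentTimeMillis);
        for (int i = 0; i < 6; i++) breaker.recordSuccess();
        for (int i = 0; i < 4; i++) breaker.recordFailure();
        System.out.println(breaker.state() + ", allowRequest=" + breaker.allowRequest());
    }
}
```

A caller would check `allowRequest()` before each call, run the fallback when it returns false, and report the outcome through `recordSuccess()` or `recordFailure()` otherwise.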

Demo

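// Run callStudentService_Fallback when the call fails, times out, or is rejected
@HystrixCommand(fallbackMethod = "callStudentService_Fallback",
   commandProperties = {
      // forceClosed keeps the circuit closed regardless of the error percentage, so it never trips in this demo
      @HystrixProperty(name = "circuitBreaker.forceClosed", value = "true"),
      // Time out the call after 4 seconds
      @HystrixProperty(name = "execution.isolation.thread.timeoutInMilliseconds", value = "4000")
   },
   threadPoolKey = "studentServiceThreadPool",
   threadPoolProperties = {
      // 5 worker threads plus a queue of at most 5 waiting requests
      @HystrixProperty(name = "coreSize", value = "5"),
      @HystrixProperty(name = "maxQueueSize", value = "5")
   })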
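
To see what a core size of 5, a queue of 5, and a 4000 ms timeout amount to, here is a rough plain-JDK analogue. It is a sketch under assumptions, not how Hystrix is implemented: `ThreadPoolIsolationSketch` is a hypothetical class, the timeout is measured from the moment the caller starts waiting (so it includes any time spent in the queue), and every failure mode simply returns the fallback.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Supplier;

/** Hypothetical analogue of a per-dependency thread pool with timeout and fallback. */
class ThreadPoolIsolationSketch {
    private final ThreadPoolExecutor pool;
    private final long timeoutMillis;

    ThreadPoolIsolationSketch(int coreSize, int maxQueueSize, long timeoutMillis) {
        // Fixed-size pool with a bounded queue; the default AbortPolicy rejects when both are full.
        this.pool = new ThreadPoolExecutor(coreSize, coreSize, 0L, TimeUnit.MILLISECONDS,
                new LinkedBlockingQueue<>(maxQueueSize));
        this.timeoutMillis = timeoutMillis;
    }

    <T> T execute(Callable<T> call, Supplier<T> fallback) {
        Future<T> future;
        try {
            future = pool.submit(call);
        } catch (RejectedExecutionException e) {
            return fallback.get();              // all threads busy and the queue is full
        }
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            future.cancel(true);                // interrupt the slow call
            return fallback.get();
        } catch (ExecutionException e) {
            return fallback.get();              // the call threw an exception
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the caller's interrupt status
            return fallback.get();
        }
    }

    void shutdown() {
        pool.shutdownNow();
    }

    public static void main(String[] args) {
        ThreadPoolIsolationSketch studentService = new ThreadPoolIsolationSketch(5, 5, 4000);
        System.out.println(studentService.execute(() -> "student list", () -> "fallback response"));
        studentService.shutdown();
    }
}
```

With these values, at most 5 calls run at once and 5 more wait in the queue; an 11th concurrent call is rejected immediately and receives the fallback.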

Dashboard

http://localhost:8088/hystrix/monitor?stream=http%3A%2F%2Flocalhost%3A8088%2Fhystrix.stream

Closing notes

Configuration center: Spring Cloud Config / Apollo, @RefreshScope
Gateway: Spring Cloud Gateway, the 80/20 rule

Related products

@greatwqs changed the title from "hystrix" to "Hystrix: Working Principles and Practice" on Mar 22, 2019
@greatwqs self-assigned this on Mar 22, 2019