Consider upstreaming #111

Open · bprashanth opened this issue Dec 7, 2016 · 19 comments

Comments
@bprashanth

Since this project is based on nginx, we have an nginx backend (https://github.com/kubernetes/ingress/tree/master/controllers/nginx), and we're working on a generic controller with pluggable backends (https://github.com/kubernetes/ingress/blob/master/core/pkg/ingress/doc.go#L19), upstreaming seems like a good way to collaborate, for greater justice.

Even if you don't feel like it's worth your time right now, it might make sense to streamline how feed and the standard nginx backend handle nginx-specific features.

@craigbox

craigbox commented Dec 7, 2016

@balooo: Hello and thank you for your time today :-)

@jsravn
Contributor

jsravn commented Dec 8, 2016

This sounds good to me. I'll talk to the team and see what they think. We are going to be live in production very soon, so we'll have to stick with this project for at least the short term. We still have some work planned for feed that we need to do very soon (moving to client-go, using endpoints instead of services); after that we can look into switching to upstream.
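
For context on the endpoints work, a minimal sketch, assuming a current client-go release (the API at the time of this thread differed slightly) — not feed's actual code: reading a Service's Endpoints object so a proxy can target pod IPs directly instead of the Service VIP. The namespace and service name are placeholders.

```go
// Hedged sketch: list the pod addresses behind a Service via its Endpoints object.
// "default"/"my-service" are placeholder names; error handling is abbreviated.
package main

import (
	"context"
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
)

func main() {
	cfg, err := rest.InClusterConfig() // assumes the controller runs in-cluster
	if err != nil {
		panic(err)
	}
	client, err := kubernetes.NewForConfig(cfg)
	if err != nil {
		panic(err)
	}

	// The Endpoints object shares the Service's name; each subset lists ready pod addresses.
	eps, err := client.CoreV1().Endpoints("default").Get(context.TODO(), "my-service", metav1.GetOptions{})
	if err != nil {
		panic(err)
	}
	for _, subset := range eps.Subsets {
		for _, addr := range subset.Addresses {
			for _, port := range subset.Ports {
				fmt.Printf("upstream %s:%d\n", addr.IP, port.Port)
			}
		}
	}
}
```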

@bprashanth
Author

FYI, the upstream interface should give you something like a snapshot of URLs + services + endpoints from its local cache on each update notification from the apiserver, so you can pick whether you want to go straight to the svc VIP, svc DNS, svc NodePort, or endpoints.
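
A rough sketch of the kind of pluggable-backend contract described above (hypothetical type names, not the actual kubernetes/ingress API): the generic controller builds a snapshot from its local cache on each apiserver notification and hands it to the backend, which decides whether to route via the service VIP, DNS name, NodePort, or endpoints.

```go
// Hypothetical pluggable-backend interface; names are illustrative only.
package backend

// Endpoint is a single pod address behind a service.
type Endpoint struct {
	IP   string
	Port int
}

// Upstream carries the different ways a backend could reach a service.
type Upstream struct {
	Name      string
	ClusterIP string     // service VIP
	DNSName   string     // e.g. my-svc.my-namespace.svc.cluster.local
	NodePort  int        // 0 if the service has no NodePort
	Endpoints []Endpoint // pod IPs, for backends that bypass kube-proxy
}

// Rule maps an ingress host/path to an upstream.
type Rule struct {
	Host     string
	Path     string
	Upstream Upstream
}

// Snapshot is what the generic controller would pass on each update notification,
// built from its local cache of ingresses, services, and endpoints.
type Snapshot struct {
	Rules []Rule
}

// Backend is implemented per proxy (nginx, haproxy, feed, ...). The controller calls
// OnUpdate with a fresh Snapshot; the backend renders its config and reloads itself.
type Backend interface {
	OnUpdate(Snapshot) error
}
```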

@jsravn
Contributor

jsravn commented Dec 16, 2016

@bprashanth We're probably going to back away from adding endpoints to nginx. The problem is that reloading nginx frequently (as would happen with endpoint updates) is quite problematic:

  • When keepalive connections are involved, clients will frequently get stale connections, as nginx simply closes any connection it considers idle on a reload/graceful stop rather than waiting for keepalive_timeout.
  • Memory usage can balloon very quickly under load, as old nginx workers stick around after a reload.

So I think using endpoints with nginx is a bad idea. Ideally, the problem of persistent connections + endpoint updates needs to be solved at the kube-proxy layer.

@bprashanth
Author

I'd expect both sides to follow the TCP protocol? As long as nginx sends a FIN and does the normal close handshake, the client should tear down the connection, keepalive or not, unless there's a stateful NAT in between that disagrees about TCP timeouts?

Maybe you can describe what you see on a tcpdump in a controlled environment (or is this too hard to hit in testing)?

@jsravn
Contributor

jsravn commented Dec 16, 2016

We're using an ELB in front of nginx, and we see the ELB return 504s under load when nginx does a reload. It isn't very many, though, because ingress updates are pretty rare. As the ELB is a black box, I can't see what it's doing.

I will try to duplicate this and capture logs to understand it better.

@jsravn
Contributor

jsravn commented Dec 16, 2016

I think something else may be going wrong, so I'll investigate further. Could be an ELB bug. I think somewhere in the ELB docs it says the ELB needs to close the backend connections. Akamai has a similar restriction.

@jsravn
Contributor

jsravn commented Dec 20, 2016

@bprashanth tcpdump shows the nginx host sending tons of TCP resets to the ELB when a reload occurs. These translate directly into 504s returned from the ELB. I can also reproduce it by hitting nginx directly with wrk using HTTP/1.1 keepalives: I get connection resets. So far nginx reload isn't as graceful as one would hope. I would expect it to close its connections gracefully, but it seems nginx simply closes the socket when idle rather than waiting for the connection to be terminated on the ELB/client side.

@bprashanth
Author

How are you reloading it? (http://nginx.org/en/docs/control.html)

@jsravn
Contributor

jsravn commented Dec 20, 2016

We send a HUP to the master process. I'm pretty sure this is all expected behaviour: HTTP/1.1 (RFC 2616) says clients should be able to handle asynchronous (non-graceful) closes. Otherwise nginx workers would have to wait for some grace period to be sure the connections close.

I spoke with AWS support and they suggested I use the new application load balancers instead. I tried one out, and indeed it works fine with nginx reloads: not a single 504. Apparently ALBs don't reuse persistent connections; instead, it looks like they batch multiple requests through a connection and then close it. That seems to solve the problem for us, so with an ALB I can start updating endpoints with confidence :).

@bprashanth
Author

An RST is still kind of rude; I get the feeling there's a more graceful FIN/FIN-wait-timeout option, but I'd have to dig through the nginx config a bit. Great that you have it working. Unfortunately, Service.Type=LoadBalancer doesn't set up an ALB, so maybe this is how we should finally manifest Ingress on AWS?

Right now it's an ELB created by Service.Type=LoadBalancer in front of the nginx ingress controller. Is setting up an ALB just like setting up an ELB plus one last step to create something like a URL map, or are all the resources in the pipeline completely different (as is the case on GCE)?

@jsravn
Contributor

jsravn commented Dec 20, 2016

An ALB seems very similar to an ELB; I didn't have to set up any URL mapping when I created one through the console. Unfortunately, it's a different API, so I'll have to modify feed to use it.

If you find a way to change nginx's behaviour, please let me know. I spent a lot of time looking at config options and its source code, and it seems to do as I describe: it closes the socket immediately.

Apparently you can improve things by issuing a USR2 followed by WINCH, which will actually wait for connections to drain, but USR2 doesn't seem to work in containers.
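
To make the two reload styles discussed in this thread concrete, here is a minimal Go sketch (a hypothetical helper, not feed's code): HUP for nginx's normal reload, or USR2 followed by WINCH to spawn a new master/worker set and let the old workers drain. The PID-file path and the one-second wait are illustrative, and whether USR2 behaves as expected inside a container is exactly the open question above.

```go
// Hedged sketch of driving nginx reloads from a Go controller via signals.
package main

import (
	"os"
	"strconv"
	"strings"
	"syscall"
	"time"
)

// readPID reads the nginx master PID from its pid file.
func readPID(pidFile string) (int, error) {
	b, err := os.ReadFile(pidFile)
	if err != nil {
		return 0, err
	}
	return strconv.Atoi(strings.TrimSpace(string(b)))
}

// reload sends HUP: nginx re-reads its config and replaces workers, but old workers
// close idle keepalive connections immediately (the behaviour observed above).
func reload(masterPID int) error {
	return syscall.Kill(masterPID, syscall.SIGHUP)
}

// drainReload uses the binary-upgrade signals: USR2 starts a new master and workers,
// then WINCH asks the old workers to stop accepting and wind down their connections.
func drainReload(masterPID int) error {
	if err := syscall.Kill(masterPID, syscall.SIGUSR2); err != nil {
		return err
	}
	time.Sleep(time.Second) // crude: a real controller would poll for the new master
	return syscall.Kill(masterPID, syscall.SIGWINCH)
}

func main() {
	pid, err := readPID("/var/run/nginx.pid")
	if err != nil {
		panic(err)
	}
	// Plain reload for comparison:
	//   _ = reload(pid)
	// Drain-style reload:
	if err := drainReload(pid); err != nil {
		panic(err)
	}
}
```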

@bprashanth
Author

bprashanth commented Dec 20, 2016 via email

@jsravn
Contributor

jsravn commented Dec 20, 2016

I think the ALB is layer 7 only, so you couldn't do TCP load balancing with it.

I only tested with a layer 7 ELB, so for me the ALB was a drop-in replacement.

@bprashanth
Author

bprashanth commented Dec 20, 2016 via email

@justinsb

Ideally creating an Ingress on AWS should just work, no surprises, which apparently isn't the case right now since people are shoving a Service of Type=LB/ELB in front.

What is the alternative?

@bprashanth
Author

An ingress controller that spins up an ALB, like the GCE one? You can also tier them, so you have ALB -> nginx ingress. Or maybe we can find a way to set up a catch-all ALB instead of an ELB via annotations on the Service (not sure how similar they are in setup).

@jsravn
Contributor

jsravn commented May 5, 2017

Just an update:

  1. We've pretty much given up on nginx as an ingress proxy. It's very difficult to get seamless reloads out of nginx: it will drop requests roughly in proportion to how often you reload. After much research and spiking, we've come to the conclusion that haproxy is the superior, and only, viable solution for use as a k8s ingress proxy. We can reload many times a second without any dropped requests and no noticeable performance problems, unlike nginx. With https://www.haproxy.com/blog/truly-seamless-reloads-with-haproxy-no-more-hacks/ we don't even need to hack around the small issues in haproxy's reload.
  2. ALBs are also out, because we discovered that ALBs don't support connection draining. That means there is no way to do a zero-downtime redeployment of the ingress component. It's a huge deficiency in ALBs, and we had to work with AWS before we got to the bottom of the problem. It's not documented in the AWS docs; they have told us they raised an internal ticket to update them. In the meantime, we will probably go back to ELBs. ELB + haproxy should give us truly zero downtime in the face of even heavy ingress updates.

@jsravn
Contributor

jsravn commented Jul 6, 2017

Update #2: as ingress doesn't reload very often and the percentage of dropped requests is quite low, we haven't prioritised the haproxy work. Otherwise the nginx proxy works great.
