Gateway + Letsencrypt changes for app, notebook-api #5244
Changes from 2 commits
Dockerfile:

```
@@ -1,13 +1,6 @@
-FROM debian:9.5
-
-RUN apt-get update -y && \
-    apt-get install -y nginx && \
-    rm -rf /var/lib/apt/lists/*
+FROM nginx:1.15-alpine
 
 RUN rm -f /etc/nginx/sites-enabled/default
 ADD @nginx_conf@ /etc/nginx/conf.d/hail.conf
 
 RUN ln -sf /dev/stdout /var/log/nginx/access.log
 RUN ln -sf /dev/stderr /var/log/nginx/error.log
 
 CMD ["nginx", "-g", "daemon off;"]
```
Gateway nginx config:

```
@@ -35,6 +35,9 @@ server {
 server {
     server_name scorecard.@domain@;
 
+    gzip on;
+    gzip_types text/plain text/html application/json;
```
> gzip is on by default, see

> Got it. Do we want a single global types definition, or application-controlled definitions? For instance, notebook-api never serves HTML.

> Global, I would assume. I don't see a problem with having types in gzip_types that aren't served. The question is: does one service want to compress a type that another one does not? I don't see why that would be the case.

> To not incur the cost of compression/decompression when you know your responses are small, and to avoid having to modify a global config every time you want to add a new type (for instance a font, or .ico): I can modify the server block for the nginx instance that sits in front of my service, rather than modifying the nginx server that sits at the public/private boundary in our modular nginx world. For small items, gzipping increases size (e.g. 0 bytes -> ~600 bytes).

> Then set gzip_min_length.

> I don't see what's wrong with that. In fact, I see it as preferable to editing every server block that might use that modification. And, as you know, I'd like to make gateway service-generic, so there won't be a per-service block.

> I don't have a strong opinion on this. A global option will be set in the next commit, and overridden in server blocks if needed.

> Why would it need to be overridden?
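The `gzip_min_length` suggestion could be realized as a single global setting. A sketch only; the threshold and type list are illustrative assumptions, not part of this PR:

```
# Hypothetical http-level defaults (values illustrative, not from this PR).
http {
    gzip on;
    # Skip compression for responses below this size, where gzip's
    # fixed framing overhead can exceed any savings.
    gzip_min_length 1000;
    # One shared list of compressible types; listing a type that a
    # given service never serves is harmless.
    gzip_types text/plain text/html application/json;
}
```

Individual server blocks could still override any of these directives if one service turned out to need different behavior.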
```
+
     location / {
         proxy_pass http://scorecard/;
     }
```
```
@@ -158,3 +161,113 @@ server {
     include /etc/letsencrypt/options-ssl-nginx.conf;
     ssl_dhparam /etc/letsencrypt/ssl-dhparams.pem;
 }
+
+server {
+    server_name app.@domain@;
+
+    gzip on;
+    gzip_types text/plain text/css text/html text/javascript application/javascript application/x-javascript text/js text/xml application/xml application/json application/rss+xml image/svg+xml font/opentype;
+
+    location / {
+        proxy_pass http://web/;
+    }
+
+    listen [::]:443 ssl;
+    listen 443 ssl;
+    ssl_certificate /etc/letsencrypt/fullchain.pem;
+    ssl_certificate_key /etc/letsencrypt/privkey.pem;
+    include /etc/letsencrypt/options-ssl-nginx.conf;
+    ssl_dhparam /etc/letsencrypt/ssl-dhparams.pem;
+}
+
+server {
+    server_name notebook-api.@domain@;
+
+    # Support large URIs stemming from access tokens
+    client_header_buffer_size 64k;
+    large_client_header_buffers 4 64k;
```
> We had a brief discussion the other day about passing the token when redirecting to notebook. Would making the token a cookie solve this? Is that better than the header?

> Yeah, in my response to your question about /auth-notebook I mention this as a potential strategy. This is maybe the preferred way for the 2nd ... Nth access of /instance/<svc_name>. The initial request needs to either have an access_token GET parameter, or, if we're OK with not having a portable URL, we can have app.hail.is do something more complicated, where the listed notebook isn't simply an anchor tag. Edit: I think a simpler version of the latter approach is to specify domain=hail.is when setting the access token cookie. It should then be usable with the anchor tag. In that case, the /instance... URL still loses portability: it will only be usable when clicked from a browser that has had this cookie set by app.hail.is.

> I found that response way too specific to the details of Jupyter (discussion of file types, etc.). Won't making the JWT a cookie instead of putting it in the header completely solve this problem? Then all accesses (1st and Nth) check the user in the JWT and are the same, you don't need a second cookie, we get a portable URL, the link to the notebook is a regular anchor, etc. No?

> If the JWT is a cookie, one of our services needs to set it. That happens in one of two scenarios: app.hail.is sets it when it receives the token from auth0 after authentication, or the nginx gateway sets it after auth_request reads/authorizes an ?access_token ... You of course can't pin a cookie to a URL, since the URL is fully defined by its string domain/path/to/resource?query_variables, which I don't think you're suggesting. So when I say "portable" I mean that the URL will only work from a browser that already has the cookie set ... unless the URL contains access_token, and after auth_request nginx or Flask sets it (such that all other requests have this).

> I realize I might be swinging in the dark here because I don't understand this stuff super well, but can't you set the cookie in JavaScript after the auth0 authentication?

> Like

> That's correct.

> To elaborate, if we set a domain for the cookie, all subdomains of that domain will have access to it. However, the nginx gateway will not see that cookie if the /instance/... URL is accessed from a browser in which that cookie wasn't already set. So if we want the instance URL to be copy-pastable this won't work; only having the access_token in the URL on the first request will work. We probably prefer to require a cookie, since the cookie represents one added assurance (that the URL was accessed from app.hail.is), although we should not feel too confident in this as an access control, because the access token can, and over enough time and use will, be leaked. In that case, the only access control is the access token's short lifespan.
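The domain=hail.is idea would look roughly like this on the service that first receives the token; a sketch only, with the cookie name and attributes as assumptions:

```
# Hypothetical: after app.hail.is validates the token received from
# auth0, set it as a cookie scoped to the parent domain so that every
# subdomain (e.g. notebook-api.hail.is) receives it on later requests.
add_header Set-Cookie "access_token=$arg_access_token; Domain=hail.is; Path=/; Secure; HttpOnly";
```

If the cookie were instead set from JavaScript after authentication, as floated in the thread, the HttpOnly attribute would have to be dropped, since HttpOnly cookies are invisible to scripts.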
```
+
+    gzip on;
+    gzip_types text/plain application/json;
+
+    location = /auth {
+        internal;
+
+        resolver kube-dns.kube-system.svc.cluster.local;
+        proxy_pass http://auth-gateway.default.svc.cluster.local/verify/$auth_request_uri;
+    }
```
```
+
+    location = /auth-notebook {
```

> Is this used? The auth_request below is commented out.

> This will be used; it's a placeholder until I re-enable it. It works as provided, just not on subsequent requests issued by Jupyter, which drop the ?access_token. I can remove it if you prefer.
```
+        internal;
+
+        resolver kube-dns.kube-system.svc.cluster.local;
+        proxy_pass http://notebook-api.default.svc.cluster.local/api/verify/$svc_name/$auth_request_uri;
+    }
```
```
+
+    location / {
+        if ($request_method = 'OPTIONS') {
```

> Is there a particular reason to handle this here and not in the notebook API?

> I saw nginx as the controller of ingress, not notebook-api. It should also be more resource-efficient than Flask. That said, I don't have a strong preference; both Flask and Sanic (which I prefer over Flask) have CORS helpers that are very simple to use.
```
+            add_header 'Access-Control-Allow-Origin' '*';
+            add_header 'Access-Control-Allow-Credentials' 'true';
+            add_header 'Access-Control-Allow-Methods' 'GET, POST, DELETE, OPTIONS';
+            add_header 'Access-Control-Allow-Headers' 'Authorization,Keep-Alive,User-Agent,X-Requested-With,Cache-Control,Content-Type';
+            add_header 'Access-Control-Max-Age' 86400;
+            add_header 'Content-Type' 'text/plain charset=UTF-8';
+            add_header 'Content-Length' 0;
+            return 204;
+            break;
+        }
+
+        set $auth_request_uri "$is_args$args";
+        auth_request /auth;
+        auth_request_set $auth_user $upstream_http_user;
+        auth_request_set $auth_scope $upstream_http_scope;
+        proxy_pass http://notebook-api;
+        proxy_set_header Host $http_host;
+        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
+        proxy_set_header X-Forwarded-Host $http_host;
+        proxy_set_header X-Forwarded-Proto $scheme;
+        proxy_set_header X-Real-IP $remote_addr;
+        proxy_set_header Upgrade $http_upgrade;
+        proxy_set_header Connection "upgrade";
+        proxy_set_header User $auth_user;
+        proxy_set_header Scope $auth_scope;
+
+        proxy_http_version 1.1;
+        proxy_read_timeout 86400;
+        proxy_buffering off;
+    }
```
```
+
+    location ~ /instance/([^/]+)/(.*) {
+        set $svc_name $1;
+        #set $auth_request_uri "$is_args$args";
+        #auth_request /auth-notebook;
```
> Remove.

> Ah, I see. Leaving this out means, essentially, that anyone with the right URL can get execution access to our cluster. This seems bad, even temporarily. What's involved in fixing the endpoint to support the other requests?

> Right, yes, not safe. For an initial demo I don't think it's very problematic, since attackers need to guess a svc_name and a long randomly generated string, for resources that live only the length of the demo, and whose names are transmitted over HTTPS and not stored browser-side (in-memory only). Potential strategies:
> We could also save a Flask-issued session cookie, encrypted with a secret, and check that to avoid re-validating the access_token.
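A sketch of what re-enabling auth on this route with a cookie fallback might look like, so that Jupyter's follow-up requests (which drop ?access_token) can still be authorized. `$cookie_access_token` is nginx's built-in variable for a cookie named `access_token`; that the verify endpoint would accept a token passed this way is an assumption, not something this PR implements:

```
# Hypothetical: authorize from the query string on the first request,
# falling back to the access_token cookie on subsequent requests.
location ~ /instance/([^/]+)/(.*) {
    set $svc_name $1;
    set $auth_request_uri "$is_args$args";
    if ($args = '') {
        set $auth_request_uri "?access_token=$cookie_access_token";
    }
    auth_request /auth-notebook;

    resolver kube-dns.kube-system.svc.cluster.local;
    proxy_pass http://$svc_name.default.svc.cluster.local$request_uri;
}
```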
```
+
+        resolver kube-dns.kube-system.svc.cluster.local;
+        proxy_pass http://$svc_name.default.svc.cluster.local$request_uri;
+
+        proxy_set_header Host $http_host;
+        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
+        proxy_set_header X-Forwarded-Host $http_host;
+        proxy_set_header X-Forwarded-Proto $scheme;
+        proxy_set_header X-Real-IP $remote_addr;
+        proxy_set_header Upgrade $http_upgrade;
+        proxy_set_header Connection "upgrade";
+        proxy_http_version 1.1;
+        proxy_redirect off;
+        proxy_buffering off;
+        proxy_read_timeout 86400;
+        proxy_connect_timeout 10s;
+    }
```
```
+
+    add_header 'Access-Control-Allow-Origin' '*';
+    add_header 'Access-Control-Allow-Credentials' 'true';
+    add_header 'Access-Control-Allow-Methods' 'GET, POST, DELETE, OPTIONS';
+    add_header 'Access-Control-Allow-Headers' 'Authorization,Keep-Alive,User-Agent,Cache-Control,Content-Type';
```

> I thought we were using Flask-CORS for this. Is that not the case?

> We certainly can; I mostly wanted to try to separate notebook business logic from public/private ingress concerns.
```
+
+    listen [::]:443 ssl;
+    listen 443 ssl;
+    ssl_certificate /etc/letsencrypt/fullchain.pem;
+    ssl_certificate_key /etc/letsencrypt/privkey.pem;
+    include /etc/letsencrypt/options-ssl-nginx.conf;
+    ssl_dhparam /etc/letsencrypt/ssl-dhparams.pem;
+}
```
> I don't prefer this at this point. Alpine is much harder to work with if you're trying to debug anything (and we're at the stage where we often are). Thoughts, @danking?
> Why is it harder to debug? To debug nginx requests I view the log, where error.log and access.log are symlinked to stderr/stdout.
> Sometimes you want to log in and do stuff, and Alpine has none of the tooling that makes that easy (which is why it is so small).
> I can remove the change if there is something we're doing that Alpine can't cover. The other motivation was to move us to nginx 1.15.N from 1.10.3, which is years old, if for no reason other than patching vulnerabilities: https://www.cvedetails.com/vulnerability-list/vendor_id-10048/product_id-17956/version_id-198730/Nginx-Nginx-1.10.0.html , nixawk/labs#15
> I'm not talking about the normal behavior; I'm talking about the developer experience when you log into the pod to muck around, e.g. you get bash instead of sh. I'm just trying to make things easy for us. I don't think image size matters here, and if all the images are based on Debian, it will be cached on every node with very little overhead. I'd even prefer a standard image built on Debian with useful stuff installed (e.g. emacs, which I immediately install every time I log into a pod).
>
> I'm all for upgrading. You can do that without switching to Alpine. Let's make that a separate PR.
> OK, I'll roll back the change.