v0.4.0 testing tracker #976
Comments
I've been having trouble upgrading from 0.3.12 on AWS (using Auth0) to the version of qhub on
I've seen errors like this in the past, but I haven't been able to get around it. @danlester do you have any idea why this might be failing, or if there are additional steps I need to take?
@iameskild Not too sure, but we can have a call if you want to look together.
@danlester I've attempted another upgrade with the same results. I will try to perform an upgrade from 0.3.13 to main for another cloud provider and see if I get it working. I'm free to jump on a call whenever is convenient for you, thanks for your help!
I don't think there will be much difference, but I would suggest also trying 0.3.12 to main for another cloud provider, so you're changing less for comparison. It could also be worth trying with
@danlester I was able to upgrade from 0.3.12 to 0.4.0 (main) running on DO using
Unfortunately the
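For reference, a minimal sketch of the upgrade flow being exercised in these tests, assuming the `qhub upgrade` path and the `-c/--config` flag (check `qhub --help` for the exact options in your version):

```shell
# Hedged sketch of a 0.3.x -> 0.4.0 upgrade attempt; flags are assumptions.
pip install --upgrade qhub            # install the target release (e.g. 0.4.0 / main)
qhub upgrade -c qhub-config.yaml      # rewrite the existing config for the new schema
qhub deploy -c qhub-config.yaml       # redeploy against the upgraded config
```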
@iameskild This is the same problem that Vini faced: #967 (comment)

I'm not too sure why you manually updated the image tags to
But since the qhub repo doesn't yet have a

If you still have the broken site running, try updating the image tags in qhub-config.yaml and redeploy - it will still be a helpful test, I think. Still happy to have a call to go through all of this together.
Redeploying with image tags set to
I still want to go back and test upgrading a QHub instance that uses Auth0.
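For anyone reproducing this, a hedged sketch of the check-and-redeploy step described above. The `default_images` key name is an assumption about the qhub-config.yaml layout; confirm against your own config before editing:

```shell
# Inspect which image tags the config currently references, edit them as
# advised above, then redeploy. Key name `default_images` is assumed.
grep -n -A 4 'default_images' qhub-config.yaml
qhub deploy -c qhub-config.yaml
```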
Upgrading qhub (on AWS, using Auth0) from
I also noticed a few bizarre Terraform outputs:
@danlester are you available to troubleshoot together tomorrow after the QHub sync?
@danlester capturing the Terraform logs led me to:
Googling this, I found an issue on the terraform-aws-eks repo. Here, one of the top recommendations was terraform-aws-modules/terraform-aws-eks#1234 (comment).
With this trick, the deployment seemed to be working, but then it started deleting subnet resources and errored out, leaving the cluster in a half-deleted state. Logs in this gist.
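For reference, a sketch of how the Terraform logs above can be captured during a deploy. `TF_LOG` and `TF_LOG_PATH` are standard Terraform environment variables; since qhub drives Terraform as a subprocess, they should be picked up, but that behavior is an assumption here:

```shell
# Capture verbose Terraform logs while qhub runs its deploy stages.
export TF_LOG=DEBUG
export TF_LOG_PATH=/tmp/qhub-terraform.log
qhub deploy -c qhub-config.yaml
# Then search the log for the failing provider/resource calls:
grep -i 'localhost' /tmp/qhub-terraform.log | head
```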
@iameskild I believe I've solved this particular problem (Terraform trying to access the localhost cluster) in the following issue, which gives more details. It has a corresponding PR - please review: Kubeconfig state unavailable, Terraform defaults to localhost

However, (in AWS) it leads me to the problem you were seeing about subnet resources being replaced (some outputs below). Once it wants to replace the node groups, the apply will never finish, since the nodes can't be destroyed until the cluster has had its contents removed safely.

By the way, I tried the upgrade on AWS and got the same localhost error using password auth (not Auth0) - I don't think the auth type has anything to do with it, and you were just lucky if you got the password upgrade to work before - or maybe something has changed since!

As discussed, the login problem you saw with Auth0 above is because the callback URL needs to be changed, and we need to advise the user in

Terraform AWS subnet replacement logs
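As a local sanity check (not the root-cause fix tracked in the issue/PR above), it can help to confirm that kubectl and the refreshed kubeconfig point at the real EKS endpoint rather than localhost before re-running the deploy. The cluster name and region below are placeholders:

```shell
# Refresh local credentials for the EKS cluster and confirm the context.
aws eks update-kubeconfig --region us-west-2 --name <your-qhub-cluster>
kubectl config current-context
kubectl cluster-info   # should report the EKS API endpoint, not 127.0.0.1
```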
I think it's something to do with CIDR changes:
I would take a look at where these have been changed (e.g. `vpc_cidr_newbits` and `vpc_cidr_block` in the code), find out why, and see if they can at least be preserved for old installations.
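To see why a change to `vpc_cidr_newbits` would force subnet replacement, the effect can be reproduced with Terraform's built-in `cidrsubnet` function. The values below are illustrative, not the actual QHub defaults:

```shell
# cidrsubnet(prefix, newbits, netnum): changing newbits changes every computed
# subnet range, so Terraform plans to destroy and recreate the subnets.
echo 'cidrsubnet("10.10.0.0/16", 2, 0)' | terraform console   # => "10.10.0.0/18"
echo 'cidrsubnet("10.10.0.0/16", 4, 0)' | terraform console   # => "10.10.0.0/20"
```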
@iameskild just to keep in mind during tests
CI/CD workflows have been tested, and a PR for the relevant bug fixes/modifications has been opened:
The Azure issues seen in integration tests do not affect fresh local deployments.
@danlester @HarshCasper Have you tested the `qhub upgrade` command for the above version migrations? Just checking whether that still needs to be tested 😄
v0.4.0 released. Closing issue 🙌 |
Checklist:

- Validate successful `qhub deploy` and `qhub destroy` for each provider (a command-level sketch of this loop follows the checklist):
  - AWS: validate the following services:
  - Azure: validate the following services:
  - DO: validate the following services:
  - GCP: validate the following services:
  - local/existing Kubernetes cluster/minikube: validate the following services:
- Validate `qhub upgrade` is successful for each provider:
  - AWS: `v0.3.12` / `v0.3.13` / `v0.3.14` to `v0.4.0`
  - Azure: `v0.3.12` / `v0.3.13` / `v0.3.14` to `v0.4.0`
  - DO: `v0.3.12` / `v0.3.13` / `v0.3.14` to `v0.4.0`
  - GCP: `v0.3.12` / `v0.3.13` / `v0.3.14` to `v0.4.0`
  - local/existing Kubernetes deployment/minikube: `v0.3.12` / `v0.3.13` / `v0.3.14` to `v0.4.0`
- Validate `qhub-ops.yaml` workflow (outdated)
- #1003: Testing Keycloak
- Azure deployments fail; see #978 for more details.
- `qhub upgrade`: AWS, DO
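A hedged sketch of the per-provider validation loop referenced in the checklist above. The `qhub init` flags shown are assumptions for illustration; consult `qhub init --help` for the exact options in v0.4.0:

```shell
# One provider iteration: initialize a config, deploy, validate, tear down.
qhub init aws --project qhub-test --domain qhub.example.com   # flags are assumptions
qhub deploy -c qhub-config.yaml
# Manually validate the deployed services (the specific list is elided in the checklist).
qhub destroy -c qhub-config.yaml
```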