
Specifying loadbalancer for proxy service #18

Open
oruchreis opened this issue Dec 29, 2018 · 17 comments

Comments

@oruchreis

Hi,
I'm using the default Kubernetes deployment scripts from the codis repo, with small changes, on Azure AKS in production. But I ran into situations where the proxy or the server went down, so I started looking for another Codis-on-k8s solution. I tried your project, but as far as I can see, it installs the services as cluster-internal. I need the LoadBalancer type for the proxy and fe services, and I also need to add annotations marking the load balancer as internal so that Azure AKS gives the service a static internal IP address. How can I specify the LoadBalancer type for the services and add annotations to them? Also, is there anything that prevents us from using this in production environments?
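In plain Kubernetes terms, what I'm after is roughly the Service shape below (the name, selector, port, and IP are placeholders, not the operator's output; the annotation is the standard AKS one for an internal load balancer):

```yaml
# Sketch: an internal Azure load balancer with a static private IP.
apiVersion: v1
kind: Service
metadata:
  name: codis-proxy
  annotations:
    # AKS-specific: provision the LB on the VNet instead of a public IP.
    service.beta.kubernetes.io/azure-load-balancer-internal: "true"
spec:
  type: LoadBalancer
  loadBalancerIP: 10.240.0.100   # desired static internal IP (placeholder)
  selector:
    app: codis-proxy
  ports:
    - port: 19000                # codis-proxy's default client port
      targetPort: 19000
```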

@tangcong
Owner

tangcong commented Dec 30, 2018

Thanks for your attention and the good advice. I have opened an issue (#19) and will complete it by tomorrow at the latest.
Warning: currently, codis-operator is a work in progress (WIP) and is NOT ready for production. Use at your own risk; you can try it in your test environment.

@tangcong
Owner

tangcong commented Dec 30, 2018

best practices:

  • use a PV to store Redis data (an SSD disk is better)
  • use dedicated nodes to run codis-server (Redis)
  • set a max memory limit for codis-server (below node memory) and assign enough memory
  • make sure the resource request and limit are equal, so the k8s pod QoS class is Guaranteed and eviction/OOM seldom happens (see the sketch after this list)
  • it is better if your pod IPs are sticky
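A minimal sketch of the request/limit point, with an illustrative container name and sizes (Redis maxmemory should then be set somewhat below the container limit):

```yaml
# Sketch: equal requests and limits put the pod in the Guaranteed QoS
# class, so it is among the last to be evicted or OOM-killed.
apiVersion: v1
kind: Pod
metadata:
  name: codis-server-example
spec:
  containers:
    - name: codis-server
      image: codis/codis3.2        # placeholder image
      resources:
        requests:
          cpu: "2"
          memory: 8Gi
        limits:
          cpu: "2"                 # identical to the request
          memory: 8Gi              # identical to the request
```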

@tangcong
Owner

tangcong commented Dec 30, 2018

there are some issues remaining to be solved:

  • monitoring (proxy/redis)
  • a dedicated scheduler (k8s does not know the "codis group" concept; one group may have 2-N replicas, and we want every codis-server pod in the same group scheduled onto a different node, so that when one node crashes we can promote another slave to master; see the anti-affinity sketch after this list)
  • make sure nodes can be drained safely and automatically
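Until a dedicated scheduler exists, the per-group spreading could be approximated with pod anti-affinity, assuming the operator stamps a per-group label onto each codis-server pod (the codis-group label key below is hypothetical):

```yaml
# Sketch of a pod-spec fragment: never co-locate two replicas of the
# same codis group on one node. "codis-group" is a hypothetical label.
affinity:
  podAntiAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchLabels:
            codis-group: "1"
        topologyKey: kubernetes.io/hostname
```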


@tangcong
Owner

tangcong commented Dec 30, 2018

ea59584
done, you can give it a try~ @oruchreis
example:
https://github.com/tangcong/codis-operator/blob/master/examples/sample-3.yml

@oruchreis
Author

Thanks a lot, I'll try it and report the results here before long.

@oruchreis
Author

Hi,
I tried this on a fresh Kubernetes install. First I tried without RBAC, but codis-fe showed the proxies in a timeout state, and there were no servers or sentinels. Then I tried with RBAC, and it failed again. Meanwhile, the Kubernetes dashboard shows the pods as healthy. By the way, the serviceAnnotations worked as expected: I could assign public and internal IPs to the load balancer flawlessly.
I've attached codis-operator logs.
logs-from-codis-operator-in-codis-operator-0.txt
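Judging from the resources the operator creates, the RBAC I'd expect it to need is along these lines (a sketch for discussion, not the repo's actual manifest; the role name is made up):

```yaml
# Sketch of a ClusterRole covering what the operator log shows it creating:
# Services, Deployments, StatefulSets, HPAs, Events, and CodisCluster CRs.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: codis-operator
rules:
  - apiGroups: [""]
    resources: ["services", "events"]
    verbs: ["get", "list", "watch", "create", "update"]
  - apiGroups: ["apps"]
    resources: ["deployments", "statefulsets"]
    verbs: ["get", "list", "watch", "create", "update"]
  - apiGroups: ["autoscaling"]
    resources: ["horizontalpodautoscalers"]
    verbs: ["get", "list", "watch", "create", "update"]
  - apiGroups: ["codis.k8s.io"]
    resources: ["codisclusters"]
    verbs: ["get", "list", "watch", "update"]
```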

@tangcong
Owner

tangcong commented Feb 12, 2019

How can I reproduce it (as minimally and precisely as possible)? What is your Kubernetes version?
Can you provide the codis-proxy logs and a codis-fe screenshot?

@oruchreis
Author

Kubernetes version is 1.12.4.
Here is the YAML I used, cloned from sample-3:
codis-operator.txt
Here are the logs from proxy and dashboard.
logs.zip
I don't know how to take a screenshot of codis-fe. Codis-fe has no logs other than these:
2019/02/12 08:05:09 main.go:101: [WARN] set ncpu = 2
2019/02/12 08:05:09 main.go:104: [WARN] set listen = 10.90.44.166:9090
2019/02/12 08:05:09 main.go:120: [WARN] set assets = /gopath/src/github.com/CodisLabs/codis/bin/assets
2019/02/12 08:05:09 main.go:162: [WARN] set --etcd = etcd-client:2379

@tangcong
Owner

[error]: Get http://web-codis-dashboard.default.svc.cluster.local:18080/api/topom/model: dial tcp: lookup web-codis-dashboard.default.svc.cluster.local: no such host
    4   /gopath/src/github.com/CodisLabs/codis/pkg/utils/rpc/api.go:134
            github.com/CodisLabs/codis/pkg/utils/rpc.apiRequestJson
    3   /gopath/src/github.com/CodisLabs/codis/pkg/utils/rpc/api.go:169
            github.com/CodisLabs/codis/pkg/utils/rpc.ApiGetJson
    2   /gopath/src/github.com/CodisLabs/codis/pkg/topom/topom_api.go:787
            github.com/CodisLabs/codis/pkg/topom.(*ApiClient).Model
    1   /gopath/src/github.com/CodisLabs/codis/cmd/proxy/main.go:329
            main.OnlineProxy
    0   /gopath/src/github.com/CodisLabs/codis/cmd/proxy/main.go:289
            main.AutoOnlineWithDashboard

It seems that codis-proxy cannot connect to the dashboard and fails to resolve the dashboard DNS name (web-codis-dashboard.default.svc.cluster.local). Is the k8s DNS service working properly? I have only tested it on k8s 1.10.
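One quick way to check in-cluster DNS is a throwaway pod with lookup tools (any image that ships nslookup works; the one below is the image used in the k8s DNS-debugging docs):

```yaml
# Sketch of a debug pod for checking in-cluster DNS resolution.
apiVersion: v1
kind: Pod
metadata:
  name: dnsutils
spec:
  containers:
    - name: dnsutils
      image: registry.k8s.io/e2e-test-images/jessie-dnsutils:1.3
      command: ["sleep", "infinity"]
```

Then `kubectl exec dnsutils -- nslookup web-codis-dashboard.default.svc.cluster.local` should print the Service's ClusterIP if DNS is healthy.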

@oruchreis
Author

I can connect to this URL with curl from inside the proxy. I removed the problematic proxies; now a single proxy appears connected to the dashboard. But no server or sentinel is displayed in codis-fe. I also recreated the server and sentinel pods, but that didn't work either. I also checked etcd with etcd-browser: it shows only the one proxy, and no groups, servers, or sentinels. By the way, I can create groups and add the servers' IP addresses manually.

@tangcong
Owner

tangcong commented Feb 13, 2019

ERROR: logging before flag.Parse: I0212 06:58:46.134697       1 dashboard.go:59] Successful Create,create Service web-codis-dashboard in CodisCluster web-codis successful
ERROR: logging before flag.Parse: I0212 06:58:46.134740       1 dashboard.go:190] deploy codis dashboard image:ccr.ccs.tencentyun.com/codis/codis3.2:latest
ERROR: logging before flag.Parse: I0212 06:58:46.135471       1 event.go:221] Event(v1.ObjectReference{Kind:"CodisCluster", Namespace:"default", Name:"web-codis", UID:"a41bdd20-2e93-11e9-9e23-c2091dccf549", APIVersion:"codis.k8s.io/v1alpha1", ResourceVersion:"113544", FieldPath:""}): type: 'Normal' reason: 'Successful Create' create Service web-codis-dashboard in CodisCluster web-codis successful
ERROR: logging before flag.Parse: I0212 06:58:46.216557       1 dashboard.go:77] Successful Create,create StatefulSet web-codis-dashboard in CodisCluster web-codis successful
ERROR: logging before flag.Parse: I0212 06:58:46.216627       1 codiscluster_controller.go:167] reconcile dashboard succ
ERROR: logging before flag.Parse: I0212 06:58:46.217034       1 event.go:221] Event(v1.ObjectReference{Kind:"CodisCluster", Namespace:"default", Name:"web-codis", UID:"a41bdd20-2e93-11e9-9e23-c2091dccf549", APIVersion:"codis.k8s.io/v1alpha1", ResourceVersion:"113544", FieldPath:""}): type: 'Normal' reason: 'Successful Create' create StatefulSet web-codis-dashboard in CodisCluster web-codis successful
ERROR: logging before flag.Parse: I0212 06:58:46.300882       1 proxy.go:72] Successful Create,create Service web-codis-proxy in CodisCluster web-codis successful
ERROR: logging before flag.Parse: I0212 06:58:46.300924       1 proxy.go:284] deploy codis proxy image:ccr.ccs.tencentyun.com/codis/codis3.2:latest
ERROR: logging before flag.Parse: I0212 06:58:46.302047       1 event.go:221] Event(v1.ObjectReference{Kind:"CodisCluster", Namespace:"default", Name:"web-codis", UID:"a41bdd20-2e93-11e9-9e23-c2091dccf549", APIVersion:"codis.k8s.io/v1alpha1", ResourceVersion:"113544", FieldPath:""}): type: 'Normal' reason: 'Successful Create' create Service web-codis-proxy in CodisCluster web-codis successful
ERROR: logging before flag.Parse: I0212 06:58:46.330207       1 proxy.go:90] Successful Create,create Deploy web-codis-proxy in CodisCluster web-codis successful
ERROR: logging before flag.Parse: I0212 06:58:46.330360       1 proxy.go:422] codis proxy hpa:{1 3 10}
ERROR: logging before flag.Parse: I0212 06:58:46.331222       1 event.go:221] Event(v1.ObjectReference{Kind:"CodisCluster", Namespace:"default", Name:"web-codis", UID:"a41bdd20-2e93-11e9-9e23-c2091dccf549", APIVersion:"codis.k8s.io/v1alpha1", ResourceVersion:"113544", FieldPath:""}): type: 'Normal' reason: 'Successful Create' create Deploy web-codis-proxy in CodisCluster web-codis successful
ERROR: logging before flag.Parse: I0212 06:58:46.538063       1 proxy.go:108] Successful Create,create HPA web-codis-hpa in CodisCluster web-codis successful
ERROR: logging before flag.Parse: I0212 06:58:46.538083       1 codiscluster_controller.go:173] reconcile proxy succ
ERROR: logging before flag.Parse: I0212 06:58:46.538289       1 event.go:221] Event(v1.ObjectReference{Kind:"CodisCluster", Namespace:"default", Name:"web-codis", UID:"a41bdd20-2e93-11e9-9e23-c2091dccf549", APIVersion:"codis.k8s.io/v1alpha1", ResourceVersion:"113544", FieldPath:""}): type: 'Normal' reason: 'Successful Create' create HPA web-codis-hpa in CodisCluster web-codis successful
ERROR: logging before flag.Parse: I0212 06:58:46.628963       1 fe.go:58] Successful Create,create Service web-codis-fe in CodisCluster web-codis successful
ERROR: logging before flag.Parse: I0212 06:58:46.628992       1 fe.go:212] deploy codis fe image:ccr.ccs.tencentyun.com/codis/codis3.2:latest
ERROR: logging before flag.Parse: I0212 06:58:46.629463       1 event.go:221] Event(v1.ObjectReference{Kind:"CodisCluster", Namespace:"default", Name:"web-codis", UID:"a41bdd20-2e93-11e9-9e23-c2091dccf549", APIVersion:"codis.k8s.io/v1alpha1", ResourceVersion:"113544", FieldPath:""}): type: 'Normal' reason: 'Successful Create' create Service web-codis-fe in CodisCluster web-codis successful
ERROR: logging before flag.Parse: I0212 06:58:46.659217       1 fe.go:76] Successful Create,create Deploy web-codis-fe in CodisCluster web-codis successful
ERROR: logging before flag.Parse: I0212 06:58:46.659236       1 codiscluster_controller.go:179] reconcile fe succ
ERROR: logging before flag.Parse: I0212 06:58:46.659890       1 event.go:221] Event(v1.ObjectReference{Kind:"CodisCluster", Namespace:"default", Name:"web-codis", UID:"a41bdd20-2e93-11e9-9e23-c2091dccf549", APIVersion:"codis.k8s.io/v1alpha1", ResourceVersion:"113544", FieldPath:""}): type: 'Normal' reason: 'Successful Create' create Deploy web-codis-fe in CodisCluster web-codis successful
ERROR: logging before flag.Parse: I0212 06:58:46.693087       1 redis.go:62] Successful Create,create Service web-codis-redis in CodisCluster web-codis successful
ERROR: logging before flag.Parse: I0212 06:58:46.693128       1 redis.go:224] deploy codis-server image:ccr.ccs.tencentyun.com/codis/codis3.2:latest
ERROR: logging before flag.Parse: I0212 06:58:46.693440       1 event.go:221] Event(v1.ObjectReference{Kind:"CodisCluster", Namespace:"default", Name:"web-codis", UID:"a41bdd20-2e93-11e9-9e23-c2091dccf549", APIVersion:"codis.k8s.io/v1alpha1", ResourceVersion:"113544", FieldPath:""}): type: 'Normal' reason: 'Successful Create' create Service web-codis-redis in CodisCluster web-codis successful
ERROR: logging before flag.Parse: I0212 06:58:46.729154       1 redis.go:80] Successful Create,create StatefulSet web-codis-redis in CodisCluster web-codis successful
ERROR: logging before flag.Parse: I0212 06:58:46.729206       1 codiscluster_controller.go:185] reconcile redis succ
ERROR: logging before flag.Parse: I0212 06:58:46.729652       1 event.go:221] Event(v1.ObjectReference{Kind:"CodisCluster", Namespace:"default", Name:"web-codis", UID:"a41bdd20-2e93-11e9-9e23-c2091dccf549", APIVersion:"codis.k8s.io/v1alpha1", ResourceVersion:"113544", FieldPath:""}): type: 'Normal' reason: 'Successful Create' create StatefulSet web-codis-redis in CodisCluster web-codis successful
ERROR: logging before flag.Parse: I0212 06:58:46.752893       1 sentinel.go:62] Successful Create,create Service web-codis-redis-sentinel in CodisCluster web-codis successful
ERROR: logging before flag.Parse: I0212 06:58:46.752932       1 sentinel.go:220] deploy redis-sentinel image:ccr.ccs.tencentyun.com/codis/codis3.2:latest
ERROR: logging before flag.Parse: I0212 06:58:46.753208       1 event.go:221] Event(v1.ObjectReference{Kind:"CodisCluster", Namespace:"default", Name:"web-codis", UID:"a41bdd20-2e93-11e9-9e23-c2091dccf549", APIVersion:"codis.k8s.io/v1alpha1", ResourceVersion:"113544", FieldPath:""}): type: 'Normal' reason: 'Successful Create' create Service web-codis-redis-sentinel in CodisCluster web-codis successful
ERROR: logging before flag.Parse: I0212 06:58:46.776349       1 sentinel.go:80] Successful Create,create StatefulSet web-codis-redis-sentinel in CodisCluster web-codis successful

The codis-operator log shows that every component was created successfully at that time; it is strange that I don't find any error message. Are all of the component pods running? Actually, I think I've got it: for now, you have to create the groups and add the redis/sentinel instances to your cluster manually. Later, I will make the operator register the components in codis-fe automatically.

@oruchreis
Author

Hmm, then it's my mistake. I expected every component to show up in codis-fe automatically, which is what the k8s YAML scripts in the codis repo do. If I add every component and create the groups manually, then when I scale up or down, do I add servers to the existing groups or create new groups?

@tangcong
Owner

yes~ I will make the operator register the components in codis-fe automatically as soon as possible (I am busy recently).

@ZhangSIming-blyq

ZhangSIming-blyq commented Dec 1, 2021

In my case, the codis-operator error is "codis-dashboard.codis-operator.svc.cluster.local: no such host". The reason is that my cluster cannot resolve xxx.xxx.svc.cluster.local; it resolves something like "xxx.xxx.svc.[mycluster].local" instead.
Is there a configuration option to change this?
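For context, the suffix comes from the kubelet's clusterDomain setting; a cluster configured like the sketch below (values illustrative) resolves *.svc.mycluster.local rather than *.svc.cluster.local, which is why a hard-coded "cluster.local" fails here:

```yaml
# Sketch of a KubeletConfiguration fragment with a non-default cluster domain.
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
clusterDomain: mycluster.local   # the default is cluster.local
clusterDNS:
  - 10.96.0.10                   # placeholder cluster DNS service IP
```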

@ZhangSIming-blyq

(screenshot of the source: the "cluster.local" DNS suffix is hard-coded)

@tangcong
Owner

tangcong commented Dec 1, 2021

It is a demo and not ready for production (this is explained in the README). My job has changed and I have no time to optimize it. @ZhangSIming-blyq

@ZhangSIming-blyq

> It is a demo and not ready for production (this is explained in the README). My job has changed and I have no time to optimize it.

ok, thanks anyway
