### Description

### Code of Conduct

- [x] I agree to follow this project's Code of Conduct

### Search before asking

- [x] I have searched in the issues and found no similar issues.

### What would you like to be improved?
There is currently no clear, common approach to configuring a Kyuubi deployment when the Helm chart is used.
I would like to discuss the requirements, limitations, and different options in order to choose one approach to follow and support in the Kyuubi Helm chart configuration. The problem has been mentioned and discussed in multiple issues and PRs, so the idea is to collect all opinions in one place and make a decision.
#### Configuration system of Apache Kyuubi

The configuration system allows configuring values using the following options (ordered from low to high priority):

- [static] Kyuubi configuration files
- [static] Hadoop configuration files
- [static] Engine (Spark, Flink, Trino, etc.) configuration files
- [runtime] Environment variables
- [runtime] JDBC connection URLs
- [runtime] SET statements
#### Runtime options: JDBC connection URLs and SET statements

These can be skipped in the discussion because they are used only when Kyuubi is already up and running.
#### Runtime option: environment variables

Configured by the `{{ .Values.env }}` and `{{ .Values.envFrom }}` value properties.
Helm chart users can specify environment variables to provide necessary configuration values with little effort. These properties also allow using provided (existing) `ConfigMaps` and `Secrets` as sources of environment variables, for instance:
```yaml
env:
  - name: ENV_VALUE
    value: env-value
  - name: ENV_FROM_CONFIGMAP_KEY
    valueFrom:
      configMapKeyRef:
        name: env-configmap
        key: env-key
envFrom:
  - configMapRef:
      name: all-env-configmap
```
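For the example above to work, the referenced resources must already exist in the namespace. A minimal sketch of what they might look like (the names `env-configmap` and `all-env-configmap` come from the example; the keys and values here are purely illustrative):

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: env-configmap
data:
  # Referenced by configMapKeyRef in the example above
  env-key: some-value
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: all-env-configmap
data:
  # With envFrom, every key here becomes an environment variable
  SOME_ENV_VAR: some-value
```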
#### Static options

Represented by configuration files which should be located at specific paths in each Kyuubi container.
In the general case, the easiest way to provide files to a Kubernetes pod (container) is to mount a `ConfigMap` or `Secret` at a specific path.
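As a minimal sketch of this mechanism (image, resource, and volume names here are illustrative, not part of the chart), mounting a `ConfigMap` at a configuration path looks like:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: kyuubi-example
spec:
  containers:
    - name: kyuubi
      image: apache/kyuubi
      volumeMounts:
        # Each key of the ConfigMap appears as a file under /opt/kyuubi/conf
        - name: kyuubi-conf
          mountPath: /opt/kyuubi/conf
  volumes:
    - name: kyuubi-conf
      configMap:
        name: my-kyuubi-confs
```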
#### Requirements

Note: this section is subject to discussion.

- Ability to create `ConfigMaps` under the hood from value properties of the `values.yaml` file.
  `Secrets` should never be created and managed by the Helm chart because of security considerations!
- Ability to specify existing (created outside the chart) `ConfigMaps` and `Secrets` by resource name as a reference.
- Ability to provide multiple existing `ConfigMaps` and `Secrets` with priority order.
  Multiple `ConfigMaps` and `Secrets` might have duplicate keys, so the implementation should clearly resolve the collision by merging keys in priority order.
- Ability to mix `ConfigMaps` managed by the chart with `ConfigMaps` and `Secrets` provided by the user, with priority order.
  The issue with duplicate keys should be clearly resolved. `ConfigMaps` managed by the chart should have the lowest priority.
- The approach should work for Helm and for GitOps tools like ArgoCD, Flux, etc.
- An easy way to specify one or many configuration files as Helm values, i.e. properties in the `values.yaml` file.
  Some configuration files might be huge and complex, so the idea is to prevent indentation issues in the `values.yaml` file.
- An easy way to create `ConfigMaps` and `Secrets` from one or many configuration files.
  Users might have a lot of XML, properties, and other files, so the idea is to help users create `ConfigMap` and `Secret` resources in a simple way.
### How should we improve?

#### Approach

Note: this section is subject to discussion.
- Group configuration file properties in `values.yaml` by system: Kyuubi, Hadoop, Spark, Trino, etc.

  ```yaml
  kyuubiConfDir: /opt/kyuubi/conf
  kyuubiConf:
    ...
  sparkConfDir: /opt/spark/conf
  sparkConf:
    ...
  ```
- Use the `files` property to specify various files.
  Users can define files with any file name. Each entry within the `files` property is used as a key/value pair in the corresponding `ConfigMap`.

  ```yaml
  sparkConf:
    files:
      'spark-env.sh': |
        #!/usr/bin/env bash
        export SPARK_LOG_DIR=/opt/spark/logs
      'spark-defaults.conf': |
        spark.submit.deployMode=cluster
  ```
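Under this approach, the chart would render the `files` entries into a `ConfigMap` roughly like the following sketch (the resource name `kyuubi-spark-conf` is hypothetical; in practice it would be generated by the chart's templates):

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: kyuubi-spark-conf  # hypothetical chart-generated name
data:
  # Each `files` entry becomes one key/value pair
  spark-env.sh: |
    #!/usr/bin/env bash
    export SPARK_LOG_DIR=/opt/spark/logs
  spark-defaults.conf: |
    spark.submit.deployMode=cluster
```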
- Use the `filesFrom` property to specify a list of existing `ConfigMaps` and `Secrets` to be mounted to the configuration path of the Kyuubi container.

  ```yaml
  sparkConf:
    filesFrom:
      - configMap:
          name: my-spark-confs
      - secret:
          name: my-sensitive-spark-confs
      - secret:
          name: my-sensitive-spark-confs-2
          items:
            - key: secretKey
              path: filename.xml
  ```
  The implementation idea is to use Projected Volumes with `core/v1/SecretProjection` and `core/v1/ConfigMapProjection` entities.
  This will also allow merging the `ConfigMap` created from the `files` property with the entities from the `filesFrom` property.
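A minimal sketch of the projected volume the chart could render for the `filesFrom` example above (the volume name and the chart-managed `kyuubi-spark-conf` ConfigMap are hypothetical; the user-provided names come from the example):

```yaml
volumes:
  - name: spark-conf
    projected:
      sources:
        # Chart-managed ConfigMap rendered from the `files` property
        - configMap:
            name: kyuubi-spark-conf
        # User-provided resources listed in `filesFrom`
        - configMap:
            name: my-spark-confs
        - secret:
            name: my-sensitive-spark-confs
        - secret:
            name: my-sensitive-spark-confs-2
            items:
              - key: secretKey
                path: filename.xml
```

Note that projected volumes combine all sources under one `mountPath`, so the chart implementation would still need to handle path collisions between sources to satisfy the priority-order requirement.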
- Move the `xxxConfDir` property into the `xxxConf` property.

  ```yaml
  sparkConf:
    dir: /opt/spark/conf
  ```
- Configuration example for Spark:

  ```yaml
  sparkConf:
    dir: /opt/spark/conf
    files:
      'spark-env.sh': |
        #!/usr/bin/env bash
        export SPARK_LOG_DIR=/opt/spark/logs
      'spark-defaults.conf': |
        spark.submit.deployMode=cluster
    filesFrom:
      - configMap:
          name: my-spark-confs
      - secret:
          name: my-sensitive-spark-confs
      - secret:
          name: my-sensitive-spark-confs-2
          items:
            - key: secretKey
              path: filename.xml
  ```
- Provide documentation with examples on how to set file content as a property when installing the chart, see the Helm docs.

  ```shell
  helm install kyuubi charts/kyuubi --set-file kyuubiConf.log4j2=kyuubi/conf/log4j2.xml.template
  ```
- Provide documentation with examples on how to create a `ConfigMap` from a file or directory, see the Kubernetes docs.

  ```shell
  kubectl create configmap my-spark-configs --from-file=prod/spark-configs/
  ```
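For GitOps tools like ArgoCD or Flux, where imperative `kubectl create` is not an option, the same `ConfigMap` can be kept as a declarative manifest in the repository. A sketch (the file contents are illustrative; `kubectl create configmap ... --dry-run=client -o yaml` can generate such a manifest from existing files):

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: my-spark-configs
data:
  # One key per file in prod/spark-configs/
  spark-defaults.conf: |
    spark.submit.deployMode=cluster
```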
### Are you willing to submit PR?

- Yes. I would be willing to submit a PR with guidance from the Kyuubi community to improve.
- No. I cannot submit a PR at this time.