This guide demonstrates how to start a job with a Python application using pyFlink, without deploying Apache Beam.
Apache Flink does not provide official Docker images for pyFlink, so you need to build and host your own image. A sample Dockerfile is provided in images/flink/python.Dockerfile.
Alternatively:
- Create your own Dockerfile by following DockerSetup#enabling-python in the Flink docs
- Build the image from your Dockerfile and push it to any Docker registry
You can start a job from a single Python file by setting the .spec.job.pyFile property. The value of .spec.job.pyFile is passed to the flink command as the python argument.
Make sure you update .spec.image.name to point to your pyFlink Docker image and its registry.
apiVersion: flinkoperator.k8s.io/v1beta1
kind: FlinkCluster
metadata:
  name: flinkjobcluster-sample
spec:
  ...
  image:
    name: <your_dockerfile>
  ...
  job:
    pyFile: "examples/python/table/word_count.py"
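To make the mapping concrete, here is a minimal Python sketch of how a pyFile value could end up on the flink command line. It assumes the property is forwarded to the Flink CLI's --python option. The helper name flink_args_for_py_file is hypothetical, and the operator's actual command construction is not shown in this guide.

```python
from typing import List


def flink_args_for_py_file(py_file: str) -> List[str]:
    """Build an illustrative `flink run` argument list for a single-file job.

    Assumption: .spec.job.pyFile is forwarded to the Flink CLI's --python
    option. This helper is hypothetical and not part of the operator.
    """
    if not py_file:
        raise ValueError("pyFile must be a non-empty path")
    # --python names the Python script that contains the job entry point.
    return ["flink", "run", "--python", py_file]


if __name__ == "__main__":
    args = flink_args_for_py_file("examples/python/table/word_count.py")
    print(" ".join(args))
```

Under that assumption, the manifest above corresponds roughly to running flink run --python examples/python/table/word_count.py inside the job container.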
If your application consists of multiple Python files, specify both .spec.job.pyModule and .spec.job.pyFiles. These properties are passed to the flink command as the pyModule and pyFiles arguments, respectively. Refer to the pyFlink CLI docs for further information.
apiVersion: flinkoperator.k8s.io/v1beta1
kind: FlinkCluster
metadata:
  name: flinkjobcluster-sample
spec:
  ...
  image:
    name: <your_dockerfile>
  ...
  job:
    pyModule: "word_count"
    pyFiles: "examples/python/table"
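The multi-file case can be sketched the same way. The Python snippet below assumes the two properties are forwarded to the Flink CLI's --pyModule and --pyFiles options. The helper name flink_args_for_py_module is hypothetical, and the requirement that both values be present is an assumption of this sketch rather than a documented operator rule.

```python
from typing import List


def flink_args_for_py_module(py_module: str, py_files: str) -> List[str]:
    """Build an illustrative `flink run` argument list for a multi-file job.

    Assumption: .spec.job.pyModule and .spec.job.pyFiles are forwarded to the
    Flink CLI's --pyModule and --pyFiles options. This helper is hypothetical.
    """
    if not py_module or not py_files:
        # The entry module must be importable from the files given in pyFiles.
        raise ValueError("pyModule and pyFiles must both be set")
    return ["flink", "run", "--pyModule", py_module, "--pyFiles", py_files]


if __name__ == "__main__":
    args = flink_args_for_py_module("word_count", "examples/python/table")
    print(" ".join(args))
```

Here pyModule names the entry module (word_count) and pyFiles points at the directory that contains it, so Flink can resolve the module when the job starts.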