Job Cluster on Kubernetes with PyFlink


Job Cluster on Kubernetes with PyFlink

Wojciech Korczyński
Hello,

I would like to run PyFlink jobs on Kubernetes in a job cluster. I managed to do this in session cluster mode, but deploying an independent job cluster for each job seems like a better option for me.

If I understand the documentation correctly [1], [2], I should create a custom Docker image containing my job, which would then be used in the YAML files for the jobmanager and taskmanager K8s deployments. However, the documentation only mentions JAR jobs. How can I do something similar for a PyFlink job? When I deployed in session cluster mode, I also included a .jar file for the Kafka connector, which would also need to be included somehow.

The only source I have found [3] has a bash script that builds images, but I am not sure whether it can be used for PyFlink jobs (especially when I want to use UDFs).


Kind regards,
Wojtek


Re: Job Cluster on Kubernetes with PyFlink

Shuiqiang Chen
Hi Wojciech,

Currently, we are not able to deploy a job cluster for PyFlink jobs on Kubernetes, but it will be supported in release-1.12.

Best,
Shuiqiang

Re: Job Cluster on Kubernetes with PyFlink

Shuiqiang Chen
Hi Wojciech,

After double-checking, there should be a way to run PyFlink jobs on Kubernetes in a job cluster. You can give this a try:
1. Make sure the custom image has the matching PyFlink version installed (it seems that you have already done this).
2. If your Python UDFs use third-party Python dependencies, make sure they are also pip-installed in the image.
3. Put flink-python_{your_scala_version}-{your_flink_version}.jar into the /opt/flink/usrlib directory when building the custom Docker image.
4. Set the "--job-classname" option to "org.apache.flink.client.python.PythonDriver".
5. Add '-pym {the_entry_module_of_your_pyflink_job}' to [job arguments].
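
To make steps 3 to 5 concrete, here is a rough sketch in plain Python that only assembles the strings involved; it does not call Flink or Kubernetes. The helper names (build_job_cluster_args, flink_python_jar_path), the module name "my_job" and the version numbers are made up for illustration. Only the driver class name, the "-pym" flag and the /opt/flink/usrlib directory come from the steps above.

```python
# Sketch only: builds the argument list and jar path described in steps 3-5.
# Function names, the module name and the versions below are hypothetical.

PYTHON_DRIVER = "org.apache.flink.client.python.PythonDriver"  # from step 4


def build_job_cluster_args(entry_module, extra_args=()):
    """Return the job cluster arguments for a PyFlink entry module."""
    if not entry_module:
        raise ValueError("entry_module must be a non-empty module name")
    args = ["--job-classname", PYTHON_DRIVER]  # step 4
    args += ["-pym", entry_module]             # step 5
    args += list(extra_args)                   # any further job arguments
    return args


def flink_python_jar_path(scala_version, flink_version):
    """Return where step 3 expects the flink-python jar inside the image."""
    return f"/opt/flink/usrlib/flink-python_{scala_version}-{flink_version}.jar"


if __name__ == "__main__":
    # "my_job", "2.11" and "1.11.1" are example values, not recommendations.
    print(build_job_cluster_args("my_job"))
    print(flink_python_jar_path("2.11", "1.11.1"))
```

The resulting list is meant to be passed wherever your job cluster deployment currently takes "--job-classname" and the job arguments for a JAR job; how exactly that is wired up depends on your image and YAML files.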

Best,
Shuiqiang


Shuiqiang Chen <[hidden email]> wrote on Tuesday, July 28, 2020 at 5:55 PM:
Hi Wojciech,

Currently, we are not able to deploy a job cluster for PyFlink jobs on Kubernetes, but it will be supported in release-1.12.

Best,
Shuiqiang