Flink[Python] questions

classic Classic list List threaded Threaded
2 messages Options
Reply | Threaded
Open this post in threaded view
|

Flink[Python] questions

Dc Zhao (BLOOMBERG/ 120 PARK)
Hi Flink Community:
We are using the pyflink to develop a POC for our project. We encountered some questions while using the flink.

We are using the flink version 1.2, python3.7, data stream API

1. Do you have examples of providing a python customized class as a `result type`? Based on the documentation research, we found out only built-in types are supported in Python. Also, what is the payload size limitation inside the flink, do we have a recommendation for that?

2. Do you have examples of `flink run --python` data stream API codes to the cluster? We tried to do that, however the process hangs on a `socket read from the java gateway`, due to the lack of the missing logs, we are not sure what is missing while submitting the job.



Regards
Dc


<< {CH} {TS} Anything that can possibly go wrong, it does. >>
Reply | Threaded
Open this post in threaded view
|

Re: Flink[Python] questions

Shuiqiang Chen
Hi Dc,

Thank you for your feedback.

1. Currently, only built-in types are supported in Python DataStream API, however, you can apply a Row type to represent a  custom Python class as a workaround that field names stand for the name of member variables and field types stand for the type of member variables.

2. Could you please provide the full executed command line and which kind of cluster you are running (standalone/yarn/k8s)? Various command lines to submit a Pylink job are shown in https://ci.apache.org/projects/flink/flink-docs-master/deployment/cli.html#submitting-pyflink-jobs.

The attachment is an example code for a Python DataStream API job, for your information.
  
Best,
Shuiqiang

Dc Zhao (BLOOMBERG/ 120 PARK) <[hidden email]> 于2021年1月14日周四 下午1:00写道:
Hi Flink Community:
We are using the pyflink to develop a POC for our project. We encountered some questions while using the flink.

We are using the flink version 1.2, python3.7, data stream API

1. Do you have examples of providing a python customized class as a `result type`? Based on the documentation research, we found out only built-in types are supported in Python. Also, what is the payload size limitation inside the flink, do we have a recommendation for that?

2. Do you have examples of `flink run --python` data stream API codes to the cluster? We tried to do that, however the process hangs on a `socket read from the java gateway`, due to the lack of the missing logs, we are not sure what is missing while submitting the job.
 
 


Regards
Dc


<< {CH} {TS} Anything that can possibly go wrong, it does. >>

PyFlinkDataStreamExample.py (2K) Download Attachment