Hi community,
I am using PyFlink with Pandas UDFs in my job.
The job executes a SQL query like this:
```
SELECT
LABEL_ENCODE(a),
LABEL_ENCODE(b),
LABEL_ENCODE(c)
...
```
And my LABEL_ENCODE UDF is defined below:
```
from pyflink.table.udf import ScalarFunction, udf

class LabelEncode(ScalarFunction):
    def open(self, function_context):
        # load_encoder() performs I/O
        self.encoder = load_encoder()

    def eval(self, x):
        ...

labelEncode = udf(LabelEncode(), ...)
```
When I run the job, the TaskManager log shows "LabelEncode.open" printed 3 times, which matches the number of times the LABEL_ENCODE UDF is called in the query.
Since every call to LabelEncode.open triggers I/O (inside load_encoder()), I wonder whether I can initialize the UDF only once and reuse it for all 3 calls?
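To make the question concrete, here is a rough sketch of the kind of sharing I have in mind. It is plain Python with no PyFlink in it: `LabelEncodeSketch` and `load_encoder_cached` are hypothetical stand-ins for my real class and `load_encoder()`, and the dictionary is a placeholder encoder. It would only help if all three UDF instances run in the same Python worker process, which is an assumption I have not verified.

```python
import functools

LOAD_COUNT = 0  # counts how many times the expensive load actually runs


@functools.lru_cache(maxsize=None)
def load_encoder_cached():
    """Hypothetical stand-in for load_encoder(); the body runs once per process."""
    global LOAD_COUNT
    LOAD_COUNT += 1
    return {"a": 0, "b": 1, "c": 2}  # placeholder encoder, not the real one


class LabelEncodeSketch:
    """Plain-Python stand-in for the ScalarFunction, so it runs without PyFlink."""

    def open(self, function_context=None):
        # Every instance still calls open(), but only the first call pays for the load.
        self.encoder = load_encoder_cached()

    def eval(self, x):
        # Unknown labels map to -1 in this placeholder.
        return self.encoder.get(x, -1)


# Three instances, mirroring the three LABEL_ENCODE calls in the query.
instances = [LabelEncodeSketch() for _ in range(3)]
for inst in instances:
    inst.open()
```

In this sketch open() still runs three times, but the I/O would happen only once because the cached function returns the same object on later calls.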
Thank you!
Best,
Yik San