Side outputs PyFlink

Wouter Zorgdrager
Dear Flink community,

First of all, I'm very excited about the new 1.13 release. Among other features, I'm particularly excited about the support for stateful operations in Python. I think it will make the wonders of stream processing and the power of Flink accessible to more developers.

I'm currently playing around a bit with these new features and I was wondering if there are already plans to support side outputs in the Python API? This already works pretty neatly in the DataStream API, but I couldn't find any discussion about adding it to PyFlink.

In the meantime, what do you suggest as a workaround for side outputs? Intuitively, I would copy the stream and add a filter for each side output, but this seems a bit inefficient: in that setup, every filter has to go over the complete stream. Any ideas?
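
For illustration, the filter-per-side-output idea described above could look roughly like the sketch below. The record shape (sensor_id, reading), the predicates and the build_job helper are hypothetical names made up for this example; only from_collection, filter and print are PyFlink DataStream calls. PyFlink is imported inside the function so the predicates can be tried without a Flink installation.

```python
# Hypothetical record shape: (sensor_id, reading) tuples.

def is_valid(record):
    """Records that stay on the main output (assumed rule: reading >= 0)."""
    return record[1] >= 0


def is_invalid(record):
    """Records that would have gone to the side output."""
    return not is_valid(record)


def build_job(records):
    """Wire one source into two filtered branches (a sketch, not tuned)."""
    # Imported lazily so the predicates above work without PyFlink installed.
    from pyflink.datastream import StreamExecutionEnvironment

    env = StreamExecutionEnvironment.get_execution_environment()
    stream = env.from_collection(records)

    # Both branches read the same stream; each record is evaluated twice.
    main_output = stream.filter(is_valid)
    side_output = stream.filter(is_invalid)

    main_output.print()
    side_output.print()
    return env  # the caller decides when to run env.execute(...)
```

Every record passes through both filters here, which is exactly the extra work the question is about.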

Thanks in advance!
Regards,
Wouter

Re: Side outputs PyFlink

Dian Fu
Hi Wouter,

You are right that side output is not yet supported in PyFlink. It’s definitely one of the features we want to support in the next release.

For now, the workaround you mentioned is also what I had in mind. Personally, I think that as long as the filter functions themselves are cheap, the workaround should not affect the overall performance too much.
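
One possible way to keep each filter cheap, sketched under assumptions: classify every record once in a map, then let the per-output filters compare only the tag. The event fields ("value", "delay_s"), the tag names, the 60-second threshold and the helper functions are all hypothetical; the only DataStream methods used are map and filter.

```python
MAIN, LATE, INVALID = "main", "late", "invalid"  # hypothetical tag names


def classify(event):
    """Run the (possibly expensive) routing logic once per record.

    `event` is assumed to be a dict with 'value' and 'delay_s' keys.
    """
    if event["value"] is None:
        tag = INVALID
    elif event["delay_s"] > 60:  # assumed lateness threshold
        tag = LATE
    else:
        tag = MAIN
    return (tag, event)


def has_tag(tag):
    """Build a cheap predicate that only compares the tag."""
    return lambda tagged: tagged[0] == tag


def untag(tagged):
    """Drop the tag again so downstream operators see the plain event."""
    return tagged[1]


def split(stream):
    """Return one stream per tag; `stream` is a PyFlink DataStream."""
    tagged = stream.map(classify)
    return {
        tag: tagged.filter(has_tag(tag)).map(untag)
        for tag in (MAIN, LATE, INVALID)
    }
```

Each record still reaches every filter, but the filters do a single string comparison, so the cost of routing is paid only once in classify.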

Regards,
Dian

> On May 20, 2021, at 5:15 PM, Wouter Zorgdrager <[hidden email]> wrote:
>
> Dear Flink community,
>
> First of all, I'm very excited about the new 1.13 release. Among other features, I'm particularly excited about the support of stateful operations in Python. I think it will make the wonders of stream processing and the power of Flink accessible to more developers.
>
> I'm currently playing around a bit with these new features and I was wondering if there are already plans to support side output in the Python API? This already works pretty neatly in the DataStream API but couldn't find any communication on adding this to PyFlink.
>
> In the meantime, what do you suggest for a workaround on side outputs? Intuitively, I would copy a stream and add a filter for each side output but this seems a bit inefficient. In that setup, each side output will need to go over the complete stream. Any ideas?
>
> Thanks in advance!
> Regards,
> Wouter