Apache Flink - Question about broadcast state pattern usage


M Singh
Hi Flink folks:

I am reading the documentation on the broadcast state pattern (https://ci.apache.org/projects/flink/flink-docs-release-1.7/dev/stream/state/broadcast_state.html) and have the following questions:

1. Point number 2 - '2. it is only available to specific operators that have as inputs a broadcasted stream and a non-broadcasted one,'. From what I understand, it can be used with connected streams. Are there any other operators where it can be used?

2. Point number 3 - '3. such an operator can have multiple broadcast states with different names.'. Is there any additional documentation or example showing how to implement and use multiple broadcast states?

Thanks

Mans


On Saturday, April 6, 2019, 3:14:54 PM EDT, <[hidden email]> wrote:


Hi,

I have a simple data pipeline consisting of a Kafka source, a Flink map operator, and a Kafka sink.

I have a quick question about the latency caused by checkpointing in exactly-once mode.

Because changes are committed and become visible only when a checkpoint completes, could the latency be as long as the checkpoint interval, e.g. 5 seconds?

Is my understanding correct?
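
The reasoning above can be sketched as a small model in plain Java. It is only an illustration under stated assumptions, not Flink code: checkpoints are assumed to complete exactly once per interval and to take no time, the sink is assumed to make its output visible at the moment a checkpoint completes, and the class and method names (CheckpointCommitLatency, visibleAtMillis, addedLatencyMillis) are hypothetical.

```java
public class CheckpointCommitLatency {

    // Hypothetical model, not a Flink API: returns the time at which a record
    // written at writtenAtMillis becomes visible, assuming output is committed
    // only at checkpoint completions occurring at multiples of the interval.
    static long visibleAtMillis(long writtenAtMillis, long checkpointIntervalMillis) {
        long checkpointsAlreadyCompleted = writtenAtMillis / checkpointIntervalMillis;
        return (checkpointsAlreadyCompleted + 1) * checkpointIntervalMillis;
    }

    // Extra latency caused purely by waiting for the next checkpoint completion.
    static long addedLatencyMillis(long writtenAtMillis, long checkpointIntervalMillis) {
        return visibleAtMillis(writtenAtMillis, checkpointIntervalMillis) - writtenAtMillis;
    }

    public static void main(String[] args) {
        long intervalMillis = 5_000L; // the 5 second interval from the question
        for (long writtenAt : new long[] {1L, 2_500L, 4_999L}) {
            System.out.println("written at " + writtenAt + " ms -> added latency "
                    + addedLatencyMillis(writtenAt, intervalMillis) + " ms");
        }
    }
}
```

Under these assumptions a record written just after a checkpoint waits almost the full interval, while one written just before the next checkpoint waits almost nothing, so the added latency is bounded by the checkpoint interval. In a real job the time a checkpoint takes to complete would come on top of this.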

If I use at-least-once mode, there will not be this additional latency. More interestingly, the Flink documentation at https://ci.apache.org/projects/flink/flink-docs-release-1.7/internals/stream_checkpointing.html indicates that "dataflows with only embarrassingly parallel streaming operations (map(), flatMap(), filter(), …) actually give exactly once guarantees even in at least once mode."

Unfortunately, I have not been able to achieve exactly-once behavior in at-least-once mode. Do I need more settings than I have in exactly-once mode?

Many thanks in advance for any advice.

Min