(DEPRECATED) Apache Flink User Mailing List archive.

Understand Broadcast State in Node Failure Case

Classic

List

Threaded

3 messages Options

Chengzhi Zhao

Understand Broadcast State in Node Failure Case

Hey folks,

We are trying to use the broadcast state as "Shared Rule" state to filter test data in our stream pipeline, the broadcast will be connected with other streams in the pipeline.

I noticed on broadcast_state[1] important consideration page, it is mentioned No RocksDB state backend and state would be kept in in-memory at runtime.

I am trying to figure out how it works, for example,

1. If a node goes down, will broadcast state lost the entire state for that node and then sync from other nodes?

2. In case of the entire job fail or savepoint been triggered, how broadcast state get its state back or additional bootstrapping logic needs to be added ourselves?

Thanks for your help!

Best regards,

Chengzhi

[1] https://ci.apache.org/projects/flink/flink-docs-stable/dev/stream/state/broadcast_state.html#important-considerations

Fabian Hueske-2

Re: Understand Broadcast State in Node Failure Case

Hi Chengzhi,

Broadcast State is checkpointed like any other state and will be restored in all failure cases (including the ones you mentioned).

We added the warning to inform users that Broadcast state will also be stored in the JVM memory, even if the RocksDB StateBackend was configured (which stores state on disk).

This warning is only about the size of the state, not about the consistency guarantees.

Best, Fabian

Am Mo., 22. Okt. 2018 um 19:26 Uhr schrieb Chengzhi Zhao <[hidden email]>:

Hey folks,

We are trying to use the broadcast state as "Shared Rule" state to filter test data in our stream pipeline, the broadcast will be connected with other streams in the pipeline.
I noticed on broadcast_state[1] important consideration page, it is mentioned No RocksDB state backend and state would be kept in in-memory at runtime.

I am trying to figure out how it works, for example,
1. If a node goes down, will broadcast state lost the entire state for that node and then sync from other nodes?
2. In case of the entire job fail or savepoint been triggered, how broadcast state get its state back or additional bootstrapping logic needs to be added ourselves?

Thanks for your help!

Best regards,
Chengzhi

[1] https://ci.apache.org/projects/flink/flink-docs-stable/dev/stream/state/broadcast_state.html#important-considerations

Chengzhi Zhao

Re: Understand Broadcast State in Node Failure Case

Thanks Fabian for the clarification!

Best regards,

Chengzhi

On Mon, Oct 22, 2018 at 5:19 PM Fabian Hueske <[hidden email]> wrote:

Hi Chengzhi,

Broadcast State is checkpointed like any other state and will be restored in all failure cases (including the ones you mentioned).
We added the warning to inform users that Broadcast state will also be stored in the JVM memory, even if the RocksDB StateBackend was configured (which stores state on disk).
This warning is only about the size of the state, not about the consistency guarantees.

Best, Fabian

Am Mo., 22. Okt. 2018 um 19:26 Uhr schrieb Chengzhi Zhao <[hidden email]>:
Hey folks,

We are trying to use the broadcast state as "Shared Rule" state to filter test data in our stream pipeline, the broadcast will be connected with other streams in the pipeline.
I noticed on broadcast_state[1] important consideration page, it is mentioned No RocksDB state backend and state would be kept in in-memory at runtime.

I am trying to figure out how it works, for example,
1. If a node goes down, will broadcast state lost the entire state for that node and then sync from other nodes?
2. In case of the entire job fail or savepoint been triggered, how broadcast state get its state back or additional bootstrapping logic needs to be added ourselves?

Thanks for your help!

Best regards,
Chengzhi

[1] https://ci.apache.org/projects/flink/flink-docs-stable/dev/stream/state/broadcast_state.html#important-considerations