Keyed State


Boris Lublinsky
Can you please confirm that my understanding is correct?
And the example there.
When we do keyBy and then process, Flink maintains an instance per key and makes sure that, for a given key, the instance for this key is used. Correct?
It means that the value state for a given key is maintained by Flink, and in my code I do not need to worry about the key value.
In my code I can use ValueState and assume that Flink will keep track of it on a per-key basis.

Boris Lublinsky
FDP Architect
[hidden email]
https://www.lightbend.com/

Re: Keyed State

Fabian Hueske-2
Yes, that is correct.
You can treat keyed ValueState like a distributed hashmap and Flink routes all state accesses to the entry for the key of the current record.
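
To make the hashmap analogy concrete, here is a minimal plain-Java sketch. It is a toy model and not Flink's API or implementation: `KeyedValueState`, `setCurrentKey`, `countRecord`, and `run` are hypothetical names invented for illustration. The point it shows is that the user logic only calls `value()` and `update()`, while something outside the user logic selects the entry for the key of the current record.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class KeyedStateSketch {

    /** Toy stand-in for keyed ValueState: one entry per key, scoped by the "current key". */
    static final class KeyedValueState<K, V> {
        private final Map<K, V> entries = new HashMap<>();
        private K currentKey;

        // Assumption of this model: the runtime, not user code, sets the key before each record.
        void setCurrentKey(K key) {
            currentKey = key;
        }

        // Returns null if nothing has been stored for the current key yet.
        V value() {
            return entries.get(currentKey);
        }

        void update(V newValue) {
            entries.put(currentKey, newValue);
        }
    }

    /** User logic: counts records per key without ever mentioning the key. */
    static long countRecord(KeyedValueState<String, Long> count) {
        Long current = count.value();
        long updated = (current == null) ? 1L : current + 1L;
        count.update(updated);
        return updated;
    }

    /** Plays the role of the runtime: scope the state to the record's key, then call user logic. */
    static List<String> run(List<String> keys) {
        KeyedValueState<String, Long> count = new KeyedValueState<>();
        List<String> out = new ArrayList<>();
        for (String key : keys) {
            count.setCurrentKey(key);
            out.add(key + "=" + countRecord(count));
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(run(Arrays.asList("a", "b", "a", "a", "b")));
        // prints [a=1, b=1, a=2, a=3, b=2]
    }
}
```

Each key gets its own counter even though `countRecord` reads and writes a single state handle.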

(In reply to Boris Lublinsky's message of 2018-01-13 17:07 GMT+01:00.)

Re: Keyed State

Boris Lublinsky
Thanks Fabian

(In reply to Fabian Hueske's message of Jan 13, 2018, 11:06 AM.)

Re: Keyed State

Boris Lublinsky
In reply to this post by Fabian Hueske-2
Thanks Fabian
Can you also explain the threading model?
How is processing parallelized across multiple keys? Is it hash based?
Also, are processElement1 and processElement2 executed on different threads?
More specifically, if processElement1 is an order of magnitude slower than processElement2, will it impact processElement2?


Re: Keyed State

Fabian Hueske-2
Sure.

A CoProcessFunction is executed in parallel by running multiple instances of the CoProcessFunction. Each instance runs in a separate TaskManager slot and is responsible for a subset of all keys. Keys are assigned by hash partitioning to function instances.
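
As a rough illustration of hash partitioning, the sketch below maps each key to one of a fixed number of parallel instances. It is a simplification under stated assumptions: `instanceFor` and `assign` are hypothetical helpers, and the plain modulo is only an approximation. Flink's real assignment is more involved (it goes through key groups derived from the maximum parallelism), so treat this as a picture of the idea rather than the actual formula.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class HashPartitionSketch {

    // Simplified assignment: hash the key, then map it onto one of `parallelism` instances.
    // floorMod keeps the result in [0, parallelism) even for negative hash codes.
    static int instanceFor(Object key, int parallelism) {
        return Math.floorMod(key.hashCode(), parallelism);
    }

    // Groups keys by the instance that would be responsible for them.
    static Map<Integer, List<String>> assign(List<String> keys, int parallelism) {
        Map<Integer, List<String>> byInstance = new TreeMap<>();
        for (String key : keys) {
            byInstance
                .computeIfAbsent(instanceFor(key, parallelism), i -> new ArrayList<>())
                .add(key);
        }
        return byInstance;
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList("user-1", "user-2", "user-3", "user-4", "user-5");
        // Every key lands on exactly one instance; several keys may share an instance.
        System.out.println(assign(keys, 3));
    }
}
```

Because the mapping depends only on the key's hash, all records with the same key reach the same instance, which is what lets that instance own the state for the key.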

All calls to methods of an individual CoProcessFunction instance are synchronized, i.e., processElement1, processElement2, and onTimer are never called concurrently. A long-running processElement1 method will block all method calls for all keys that are assigned to the same instance (not just method calls for the same key).
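
The blocking behavior described above can be modeled with a single-threaded executor standing in for one function instance. This is a sketch of the consequence, not of how Flink schedules tasks internally; the class name, the `run` helper, and the log strings are all hypothetical. A slow processElement1 call for one key is followed by a fast processElement2 call for a different key, and the second call cannot start until the first has finished.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class SerializedCallsSketch {

    // Runs two calls on one "instance" and returns the order in which they started and ended.
    static List<String> run(long slowMillis) {
        List<String> log = Collections.synchronizedList(new ArrayList<>());
        // One thread models one function instance: calls are processed strictly one at a time.
        ExecutorService instance = Executors.newSingleThreadExecutor();

        instance.execute(() -> {
            log.add("processElement1(keyA) start");
            sleepQuietly(slowMillis); // simulate a slow call
            log.add("processElement1(keyA) end");
        });
        instance.execute(() -> {
            log.add("processElement2(keyB) start");
            log.add("processElement2(keyB) end");
        });

        instance.shutdown();
        try {
            if (!instance.awaitTermination(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("calls did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return new ArrayList<>(log);
    }

    private static void sleepQuietly(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        for (String line : run(100)) {
            System.out.println(line);
        }
        // The processElement2(keyB) lines always come after "processElement1(keyA) end".
    }
}
```

Note that keyB is delayed even though it is a different key from keyA, because both are handled by the same instance in this model.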

(In reply to Boris Lublinsky's message of 2018-01-13 20:33 GMT+01:00.)