Calling an operator serially

classic Classic list List threaded Threaded
3 messages Options
Reply | Threaded
Open this post in threaded view
|

Calling an operator serially

Philip Doctor

I’ve got a KeyedStream, I only want max parallelism (1) per key in the keyed stream (i.e. if key is OrganizationFoo then only 1 input at a time from OrganizationFoo is processed by this operator).  I feel like this is obvious somehow, but I’m struggling to find the docs for this.  Can anyone point me the right way here?

Reply | Threaded
Open this post in threaded view
|

Re: Calling an operator serially

Philip Doctor

I guess my logs suggest this is simply what a KeyedStream does by default, I guess I was just trying to find a doc that said that rather than relying on my logs.

 

From: Philip Doctor <[hidden email]>
Date: Tuesday, December 12, 2017 at 5:50 PM
To: "[hidden email]" <[hidden email]>
Subject: Calling an operator serially

 

I’ve got a KeyedStream, I only want max parallelism (1) per key in the keyed stream (i.e. if key is OrganizationFoo then only 1 input at a time from OrganizationFoo is processed by this operator).  I feel like this is obvious somehow, but I’m struggling to find the docs for this.  Can anyone point me the right way here?

Reply | Threaded
Open this post in threaded view
|

Re: Calling an operator serially

Fabian Hueske-2
Hi,

you are right.
The purpose of a KeyedStream is to process all events/records with the same key by the same operator task (which runs in a single thread). The operator itself can have a greater parallelism, such that different keys are processed by different tasks. 

Best, Fabian

2017-12-13 1:09 GMT+01:00 Philip Doctor <[hidden email]>:

I guess my logs suggest this is simply what a KeyedStream does by default, I guess I was just trying to find a doc that said that rather than relying on my logs.

 

From: Philip Doctor <[hidden email]>
Date: Tuesday, December 12, 2017 at 5:50 PM
To: "[hidden email]" <[hidden email]>
Subject: Calling an operator serially

 

I’ve got a KeyedStream, I only want max parallelism (1) per key in the keyed stream (i.e. if key is OrganizationFoo then only 1 input at a time from OrganizationFoo is processed by this operator).  I feel like this is obvious somehow, but I’m struggling to find the docs for this.  Can anyone point me the right way here?