When I start my Flink application with a -p parallelism value of 24, 29 slots are used for the application. Is that expected behavior in some scenarios?
My application reads an event stream from Kafka. It does some filtering and does a keyBy on the stream. Then it processes the same stream in two different ways. The first does some data extraction and writes to a sink (it uses RocksDB to manage state). The second applies windowing to the stream and writes to a different sink.
Hi,
Parallelism is actually set at the operator level, and each parallel instance of an operator will occupy one slot. In some cases, Flink uses chaining to combine multiple operators so that they share a single slot, but sometimes this cannot be done. If your job contains multiple operators and some of them cannot be chained, it's possible that the job will use more slots than the parallelism you configured.
Best,
Kurt
On Thu, Feb 9, 2017 at 2:19 AM, bwong247 <[hidden email]> wrote:
Hi Kurt,
Thanks for the reply. Does this mean that if my job has 3 operators (not chained), it will use at least 3 slots? I thought parallelism was task-based. You can define it at the operator level, but that only means that the tasks for that operator are distributed across that many slots. Shouldn't I be able to start the 3-operator job with a parallelism of 1, where all the operators run in the same single slot?
Regards,
Bernard
Hi,
The first answer is "yes": 3 unchained operators will use at least 3 slots. The exception is a batch job in which these 3 operators are blocking operators; in that case they will use the same slot one after another.
Regarding your second question: if you start 3 operators with a parallelism of 1, Flink will chain these operators and execute them in the same slot. But if you disable chaining, they do need 3 slots.
Best,
Kurt
On Sat, Feb 11, 2017 at 2:35 AM, bwong247 <[hidden email]> wrote:
Hi Bernard and Kurt, Chaining affects how subtasks operate within slots. Resource groups segregate subtasks into different slots. Only a subset of operators can be chained. On Sat, Feb 11, 2017 at 3:25 AM, Kurt Young <[hidden email]> wrote:
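
To make the point about resource groups concrete, here is a minimal sketch in plain Java. It does not use the Flink API. It assumes the slot sharing behaviour described in the Flink documentation: subtasks of different operators in the same slot sharing group may share a slot, so a group needs as many slots as its highest operator parallelism, and separate groups never share slots. The class SlotEstimate, the operator and group names, and the parallelism of 5 are all hypothetical; they only show one way a job started with -p 24 could end up occupying 29 slots. Nothing in this thread establishes that this is what happens in the job above.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model, not part of Flink: estimates how many slots a job
// needs from each operator's slot sharing group and parallelism.
class SlotEstimate {

    // One operator: an illustrative name, its group, and its parallelism.
    static final class Op {
        final String name;
        final String group;
        final int parallelism;

        Op(String name, String group, int parallelism) {
            this.name = name;
            this.group = group;
            this.parallelism = parallelism;
        }
    }

    // Assumption: operators in the same group share slots, so a group needs
    // max(parallelism) slots; slots are never shared across groups.
    static int requiredSlots(Op... ops) {
        Map<String, Integer> maxPerGroup = new HashMap<>();
        for (Op op : ops) {
            maxPerGroup.merge(op.group, op.parallelism, Integer::max);
        }
        int total = 0;
        for (int slots : maxPerGroup.values()) {
            total += slots;
        }
        return total;
    }

    public static void main(String[] args) {
        // Every operator in the default group: the estimate equals the
        // job parallelism.
        int shared = requiredSlots(
                new Op("kafka-source", "default", 24),
                new Op("extract", "default", 24),
                new Op("window", "default", 24));

        // One operator moved to its own group with a made-up parallelism
        // of 5: the two groups are added together.
        int split = requiredSlots(
                new Op("kafka-source", "default", 24),
                new Op("extract", "default", 24),
                new Op("window", "isolated", 5));

        System.out.println("single group: " + shared);
        System.out.println("two groups:   " + split);
    }
}
```

Under this model chaining never changes the estimate, since it only affects how subtasks run inside a slot. Only the group assignment and the per-operator parallelism matter for the slot count.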