Hi, I am doing simple aggregation with a keyed and global windows in flink. What can be the reason behind this difference in performance? Thanks, Adrienne |
Hi,
to answer this question, we would first need to know what you mean by „global windows“: using „windowAll()“ or „GlobalWindows“? Also, the answer might depend on the Flink version that you are using. Best, Stefan > Am 07.05.2017 um 23:23 schrieb Adrienne Kole <[hidden email]>: > > Hi, > > I am doing simple aggregation with a keyed and global windows in flink. > When I compare the keyed window aggregation with 1 key and global window (which has parallelism 1) I would expect that both of them would have similar performance. > > However, keyed stream with 1 key performs with 2x more throughput than global window. > My configuration is 8 node cluster, 16 core in each node, parallelism = 128. > > AFAIK, Flink doesn't manage skew by default and uses hash function to assign keys to partitions. So if I have 1 key only, it should go to only one partition always, which is semantically similar to global windows in flink. > > What can be the reason behind this difference in performance? > > Thanks, > Adrienne |
Hi, Thanks for the reply. So I have 2 cases:I am using Flink 1.1.3. Thanks, Adrienne On Mon, May 8, 2017 at 3:55 PM, Stefan Richter <[hidden email]> wrote: Hi, |
That is interesting, because already in Flink 1.1.x, windowAll() is implemented as input.keyBy(new DummyKeySelector()).window(). Are you using event time or processing or event time, and most important, do the execution graphs in the web frontend look different in both variants?
|
Free forum by Nabble | Edit this page |