Custom Partitioner and Graph Algorithms

Custom Partitioner and Graph Algorithms

mbilalce.dev@gmail.com
Hi,

I am observing a behaviour in the task statistics that I don't fully understand.
Essentially, I have created a partitioner that assigns all the edges to a single partition.
I see an imbalance (in terms of records sent/received) in the task statistics between different instances of the same operator in the second and third stages.
But from the fourth stage onwards, all operator instances process roughly the same number of records. I would have expected the imbalance to persist in those stages as well.
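
For readers without access to the linked code, here is a minimal sketch of the kind of partitioner described above. It is not the code from the question. The `Partitioner` interface below is a local stand-in that mirrors the shape of Flink's `org.apache.flink.api.common.functions.Partitioner<K>`, so the sketch runs without Flink on the classpath. The class names, the sample keys, and the parallelism of 4 are all assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.List;

public class SinglePartitionSketch {

    // Local stand-in with the same method shape as Flink's Partitioner<K>;
    // defined here only so the sketch is self-contained.
    interface Partitioner<K> {
        int partition(K key, int numPartitions);
    }

    // Hypothetical partitioner: ignores the key and sends everything to partition 0.
    static class SinglePartitionPartitioner implements Partitioner<Long> {
        @Override
        public int partition(Long key, int numPartitions) {
            return 0;
        }
    }

    // Counts how many keys each of the numPartitions operator instances would receive.
    static long[] countPerPartition(List<Long> keys, Partitioner<Long> partitioner, int numPartitions) {
        long[] counts = new long[numPartitions];
        for (Long key : keys) {
            counts[partitioner.partition(key, numPartitions)]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        // Assumed sample data: source vertex IDs of six edges.
        List<Long> edgeSourceIds = Arrays.asList(1L, 1L, 2L, 3L, 3L, 4L);
        long[] counts = countPerPartition(edgeSourceIds, new SinglePartitionPartitioner(), 4);
        System.out.println(Arrays.toString(counts)); // prints [6, 0, 0, 0]
    }
}
```

In an actual Flink job, a partitioner of this shape would typically be applied with `DataSet.partitionCustom(partitioner, field)`. The skew shown here only describes the records right after that partitioning step; any later operator that redistributes data by key would use its own partitioning.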

Details of my code and the task statistics are in this Stack Overflow question:
https://stackoverflow.com/questions/55138553/behaviour-of-custom-partitioner-in-apache-flink

Thanks.

- Bilal

Re: Custom Partitioner and Graph Algorithms

mbilalce.dev@gmail.com
I have added a working code example to the Stack Overflow question that is representative of what I am using. The GitHub repo can be found here: https://github.com/MBtech/graphtest

On 2019/03/13 09:46:52, MBilal <[hidden email]> wrote:

> Hi,
>
> I am observing a behaviour in the task statistics that I don't fully understand.
> Essentially, I have created a partitioner that assigns all the edges to a single partition.
> I see an imbalance (in terms of records sent/received) in the task statistics between different instances of the same operator in the second and third stages.
> But from the fourth stage onwards, all operator instances process roughly the same number of records. I would have expected the imbalance to persist in those stages as well.
>
> Details of my code and the task statistics are in this Stack Overflow question:
> https://stackoverflow.com/questions/55138553/behaviour-of-custom-partitioner-in-apache-flink
>
> Thanks.
>
> - Bilal