Hello, the performance of apply function after join

Posted by Philip Lee on
URL: http://deprecated-apache-flink-user-mailing-list-archive.369.s1.nabble.com/Hello-the-performance-of-apply-function-after-join-tp3835.html

Hello, I have two questions about the performance of the apply() function after a join.


For context, I am running a Flink job on a cluster of 9 machines with 48 cores each. I am working on a benchmark comparing Flink, Spark SQL, and Hive.

I tried to optimize the join with a hint for better performance, and I want to improve the performance as much as possible.

Here are my questions:
1) In the job progress log, the apply() after the join seems to take a fairly long time. If I did not use apply() to format the tuples, would I get better performance? I could simply specify the column numbers instead of using apply().

2) When using a join with a hint such as Huge or Tiny, is there an ideal ratio between the sizes of the two tables? Currently, if one table is 10 times bigger than the other, I use the join with a hint; otherwise, I use the plain join().
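
To make question 2 concrete, the rule of thumb described above can be sketched roughly as follows. This is only an illustration in plain Java with no Flink dependency: the class name `JoinChoice`, the method `pick`, and the constant `RATIO` are made-up names, the row counts are assumed to be known in advance, and the enum values merely label which of the DataSet methods `joinWithTiny`, `joinWithHuge`, or `join` would be called on the left input.

```java
// Sketch only: plain Java, no Flink dependency. JoinChoice, pick and RATIO are
// hypothetical names; the enum constants only label which join variant to call.
final class JoinChoice {

    /** Which join variant to call on the left input. */
    enum Strategy {
        JOIN_WITH_TINY, // left.joinWithTiny(right): the right side is much smaller
        JOIN_WITH_HUGE, // left.joinWithHuge(right): the right side is much larger
        DEFAULT_JOIN    // left.join(right): leave the choice to the optimizer
    }

    // Size ratio from which a hint is used; 10 is the rule of thumb in question.
    static final long RATIO = 10L;

    static Strategy pick(long leftRows, long rightRows) {
        if (leftRows < 0 || rightRows < 0) {
            throw new IllegalArgumentException("row counts must be non-negative");
        }
        // Integer division avoids overflow from RATIO * rows. For non-negative
        // values, a / RATIO >= b is equivalent to a >= RATIO * b.
        if (leftRows > rightRows && leftRows / RATIO >= rightRows) {
            return Strategy.JOIN_WITH_TINY;
        }
        if (rightRows > leftRows && rightRows / RATIO >= leftRows) {
            return Strategy.JOIN_WITH_HUGE;
        }
        return Strategy.DEFAULT_JOIN;
    }
}
```

For example, `JoinChoice.pick(1000L, 50L)` selects the Tiny hint, while `JoinChoice.pick(300L, 100L)` falls back to the plain join. What I would like to know is whether a threshold of 10 is sensible at all, or whether the ideal ratio depends on something else.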

Best,
Phil