Hi All,
I have a question: has anyone compared the performance of a Flink batch job writing to S3 vs. Spark writing to S3?
Thanks & Regards
Sri Tummala
Fair benchmarks are notoriously difficult to set up. Usually, it's easy to find a workload where one system shines, and as its vendor you report that. Then the competitor benchmarks a different use case where their system outperforms ours. In the end, customers are more confused than before.

You should run your own benchmarks on your own workloads. That is the only reliable way.

Both systems use similar setups, and improvements in one system are often incorporated into the other with some delay, so there should be no ground-breaking differences between two systems running on Java and using the same set of libraries. Of course, if one system has a very specific optimization for your use case, it could be much faster.

On Mon, Feb 24, 2020 at 11:26 PM sri hari kali charan Tummala <[hidden email]> wrote:
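As a rough illustration of the "do your own benchmarks" advice, here is a minimal timing harness sketch. It is not taken from Flink or Spark: `benchmark` and `make_local_write_job` are hypothetical names, and the workload below only writes to a local temporary file as a stand-in. To measure something meaningful, you would swap in a callable that submits your actual Flink or Spark job against S3 and blocks until it finishes.

```python
import statistics
import tempfile
import time
from pathlib import Path


def benchmark(job, runs=5, warmup=1):
    """Time `job` (a zero-argument callable) and return per-run seconds."""
    for _ in range(warmup):
        job()  # warm-up runs are executed but not measured
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        job()
        timings.append(time.perf_counter() - start)
    return timings


def make_local_write_job(directory, n_records=10_000):
    """Hypothetical stand-in workload: write records to a local file.

    Assumption: in a real comparison this would be replaced by a callable
    that launches your Flink or Spark job and waits for it to complete.
    """
    target = Path(directory) / "out.txt"

    def job():
        with target.open("w") as f:
            for i in range(n_records):
                f.write(f"record-{i}\n")

    return job


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        timings = benchmark(make_local_write_job(tmp), runs=5, warmup=1)
        print(
            f"median: {statistics.median(timings):.4f}s, "
            f"min: {min(timings):.4f}s, max: {max(timings):.4f}s"
        )
```

Running the same harness with the same data, cluster size, and output format for each system, and comparing medians over several runs rather than a single run, keeps the comparison closer to fair.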
Thank you. Regarding "the two systems running on Java and using the same set of libraries": so, from my understanding, Flink uses the AWS SDK behind the scenes, the same as Spark.

On Wed, Feb 26, 2020 at 8:49 AM Arvid Heise <[hidden email]> wrote:
Exactly. We use hadoop-fs as an indirection layer on top of that, but Spark probably does the same.

On Wed, Feb 26, 2020 at 3:52 PM sri hari kali charan Tummala <[hidden email]> wrote:
OK, thanks for the clarification.

On Wed, Feb 26, 2020 at 9:22 AM Arvid Heise <[hidden email]> wrote:
Sorry for being lazy; I should have gone through the Flink source code myself.

On Wed, Feb 26, 2020 at 9:35 AM sri hari kali charan Tummala <[hidden email]> wrote: