It's not clear to me whether you're deploying streaming applications or batch jobs.
For a batch job, you'd probably want to fold everything into one big job to use resources as efficiently as possible. I'm assuming streaming for the remainder of this mail.
The granularity of the jobs depends more on your operational requirements:
what do you want to (and what can you) restart in case of a failure, a configuration change, or an application update?
Usually, you want to break things down as much as possible for high SLAs / low downtime. However, smaller jobs use disproportionately more resources, and it can be harder to keep an overview of all the jobs. So I'd probably go with your intuition.
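
If you go with one job per company, one way to keep the 10 deployments uniform is a single parameterized job that you submit once per company. Below is a minimal sketch under that assumption; SlackMessage, the inlined test elements, and score() are placeholders standing in for your real Slack connector and sentiment model:

import org.apache.flink.api.common.typeinfo.Types;
import org.apache.flink.api.java.utils.ParameterTool;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class SentimentJob {

    // Placeholder message type; in practice this comes from your Slack connector.
    public static class SlackMessage {
        public String workspaceId = "";
        public String text = "";
        public SlackMessage() {}
        public SlackMessage(String workspaceId, String text) {
            this.workspaceId = workspaceId;
            this.text = text;
        }
    }

    // Placeholder scoring function; swap in your real sentiment model.
    private static double score(String text) {
        return Math.min(1.0, text.length() / 100.0);
    }

    public static void main(String[] args) throws Exception {
        // Which company this deployment handles, e.g. --company acme.
        ParameterTool params = ParameterTool.fromArgs(args);
        String company = params.getRequired("company");

        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Stand-in for a real source (e.g. one Kafka topic per company).
        DataStream<SlackMessage> messages = env.fromElements(
                new SlackMessage("workspace-1", "the new release looks great"),
                new SlackMessage("workspace-2", "login has been broken all day"));

        messages
                .keyBy(msg -> msg.workspaceId)          // keeps state isolated per workspace
                .map(msg -> msg.workspaceId + " -> " + score(msg.text))
                .returns(Types.STRING)                  // helps lambda type extraction
                .print();

        env.execute("sentiment-analysis-" + company);
    }
}

You'd then submit the same jar once per company (e.g. flink run sentiment.jar --company acme), so a failure or an upgrade only requires restarting that one company's job.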
I'm new to Apache Flink and I would like to get some opinions on how I should deploy my Flink jobs.
Let's say I want to do sentiment analysis for Slack workspaces, and I have 10 companies, each with 2 Slack workspaces.
How should I deploy my Flink jobs to utilize Flink efficiently?
1 sentiment analysis Flink job per Slack workspace (20 jobs total).
1 sentiment analysis Flink job per company (10 jobs total).
1 sentiment analysis Flink job for all workspaces (1 job total).
My intuition tells me I should use 1 job per company, for a total of 10 jobs, so they would be easy to manage and restart if a fault occurs. But I'd like to hear some other opinions.
Thank you!
--
Arvid Heise | Senior Java Developer