Hey everyone, I’m quite new to Apache Flink. I’m trying to build a system with Flink and wanted to hear your opinion and whether the proposed architecture is even possible with Flink. The environment for the system will be a microservice
architecture handling messaging via async events. I want to give you a brief description of the system: -
there are a lot of sensors, which each produces a stream of data -
on each stream of sensor data I want to match one or more patterns via Flink’s CEP library -
each of these sensors belongs to one or more administrative entities -
each pattern belongs to one administrative entity and needs to be evaluated on one or more sensors of this entity -
the user can change the connection of a sensor to an administrative entity as well as the sensors on which a pattern needs to be evaluated I hope this description is enough to give you an overview of the system. This is what I am thinking of doing: -
I will have an Apache Kafka cluster and a Flink cluster running inside docker containers -
I create a topic in Kafka for each administrative entity -
for each entity I create a Flink job which consumes the corresponding topic -
the Flink job creates a stream of the sensor data -
it splits the stream to a stream for each sensor -
for each pattern that hast to be evaluated on one stream I create a pattern stream This results in the following: -
there will be a lot of Kafka topics -
for each topic there will be one Flink job (-> a lot of jobs, too) -
in each job there will be quite a lot of streams and patterns and therefore even more pattern streams The main questions that arose while thinking of this implementation: 1.)
From other questions here, I know that there is currently no way to dynamically add taskmanagers to the Flink cluster. The proposed way to handle that, is to start up much more taskmanagers than first needed.
Is it even possible to have a great number of jobs on one cluster? 2.)
Would a viable alternative be to just dynamically start up a new cluster for each administrative entity? 3.)
I also came to know, that Flink isn’t able to handle dynamically created streams and patterns. I guess that is due to the fixed calculation of the execution graph at the jobs beginning. Is there a way to make
Flink recalculate the graph of a running job? I also just found out about this [1] example, where they use scripts to hot deploy queries. I will look into that, too. Maybe that provides an acceptable solution for me, too. 4.)
Is it even possible to have a great number of streams and patterns in one Flink job? Any comments and feedback are greatly appreciated.
Thanks a lot in advance J Best, Claudia [1]:
https://techblog.king.com/rbea-scalable-real-time-analytics-king/ |
Hi Claudia, 1) What do you mean by dynamically adding? In standalone mode (which you would probably use with Docker images), you can just start additional TaskManagers, which will connect to a JobManager. So you could implement some monitoring to start new TaskManagers as soon as they are needed. In general, we recommend to start one JobManager per job, but running multiple jobs per JM is also possible. I don't have much experience with many concurrent jobs on a JM, but in theory there are no limits. In practice you'll probably run into stability issues at some point, because the JM needs to coordinate too many jobs / taskmanagers. 2) Yes, that would be an option. The most important aspects here are: data throughput per admin group / state size / analysis complexity. If the each administrative group is low traffic (~100.000 elements / second), you could maybe process the data not using a Flink cluster at all. The Standalone mode of Flink starts a JobManager and TaskManager within the same JVM. You could prepare a docker image with a standalone flink + the job and start that per administrative group. I think a reasonably sized machine (8 cores, 32 gb of main memory) should handle that. 3) Yes, you can not modify a running job. You can follow the King.com / RBEA approach. 4) That depends on the the definition of great. I think above answers greatly depend on the expected amount of data and the available hardware. Since Flink is quite easy to deploy, and a simple testing job is implemented in a few hours, I would suggest to do some experiments to see how Flink behaves in the given environment. Regards, Robert On Mon, Jul 11, 2016 at 9:39 AM, Claudia Wegmann <[hidden email]> wrote:
|
Hey Robert, thanks for the valuable input. I have some follow-up questions: To 2) I had a separate Docker container for the JobManager and the TaskManagers for playing around with the Flink Cluster. In production
that’s not an option. As I mentioned before the Flink job has to interact in a microservice environment communicating via async messages. For message handling the tool vert.x is used. Therefore the JobManager and TaskManager also need a connection to the vert.x
instance. If they were deployed in a separate Docker container each container would need to have to include a vert.x instance, which is quite heavy weighed. The prefer to have a verticle, which starts the JobManager and the needed TaskManagers and starts the
Flink job. Is there a common way to start JobManagers/TaskManagers from the Java programm? I would even prefer to have JobManager and TaskManager build as separate JARs decorated with a verticle. Is that possible? Where would
I start separating the JobManager and the TaskManager? To 3) So there is no way to recalculate the JobGraph/ExecutionGraph of a running job? Would it be possible to alter the code to make
this possible (without making Flink to stand upside down ;)? To 4) What do you think would an upper limit for the number of streams in one job be? Thx again and best wishes, Claudia Von: Robert Metzger [mailto:[hidden email]]
Hi Claudia, 1) What do you mean by dynamically adding? In standalone mode (which you would probably use with Docker images), you can just start additional TaskManagers, which will connect to a JobManager. So you could implement some monitoring to start new TaskManagers as soon as they are needed. In general, we recommend to start one JobManager per job, but running multiple jobs per JM is also possible. I don't have much experience with many concurrent jobs on a JM, but in theory there are no limits. In practice you'll probably run into stability issues at some point, because the JM needs to coordinate too many jobs / taskmanagers. 2) Yes, that would be an option. The most important aspects here are: data throughput per admin group / state size / analysis complexity. If the each administrative group is low traffic (~100.000 elements / second), you could maybe process the data not using a Flink cluster at all. The Standalone mode of Flink starts a JobManager and TaskManager within
the same JVM. You could prepare a docker image with a standalone flink + the job and start that per administrative group. I think a reasonably sized machine (8 cores, 32 gb of main memory) should handle that. 3) Yes, you can not modify a running job. You can follow the King.com / RBEA approach. 4) That depends on the the definition of great. I think above answers greatly depend on the expected amount of data and the available hardware. Since Flink is quite easy to deploy, and a simple testing job is implemented in a few hours, I would suggest to do some experiments
to see how Flink behaves in the given environment. Regards, Robert On Mon, Jul 11, 2016 at 9:39 AM, Claudia Wegmann <[hidden email]> wrote:
|
In reply to this post by rmetzger0
Hey everyone, the last few days I looked into the approach the King-Team took. And another question to my original point 3 arose: To 3) Would an approach similar to King/RBEA even be possible combined with Flink CEP? As I understand, Patterns have to be defined
in Java code and therefore have to be recompiled? Do I overlook something important? Thx for some input. Best, Claudia Von: Claudia Wegmann
Hey Robert, thanks for the valuable input. I have some follow-up questions: To 2) I had a separate Docker container for the JobManager and the TaskManagers for playing around with the Flink Cluster. In production
that’s not an option. As I mentioned before the Flink job has to interact in a microservice environment communicating via async messages. For message handling the tool vert.x is used. Therefore the JobManager and TaskManager also need a connection to the vert.x
instance. If they were deployed in a separate Docker container each container would need to have to include a vert.x instance, which is quite heavy weighed. The prefer to have a verticle, which starts the JobManager and the needed TaskManagers and starts the
Flink job. Is there a common way to start JobManagers/TaskManagers from the Java programm? I would even prefer to have JobManager and TaskManager build as separate JARs decorated with a verticle. Is that possible? Where would
I start separating the JobManager and the TaskManager? To 3) So there is no way to recalculate the JobGraph/ExecutionGraph of a running job? Would it be possible to alter the code to make
this possible (without making Flink to stand upside down ;)? To 4) What do you think would an upper limit for the number of streams in one job be? Thx again and best wishes, Claudia Von: Robert Metzger [[hidden email]]
Hi Claudia, 1) What do you mean by dynamically adding? In standalone mode (which you would probably use with Docker images), you can just start additional TaskManagers, which will connect to a JobManager. So you could implement some monitoring to start new TaskManagers as soon as they are needed. In general, we recommend to start one JobManager per job, but running multiple jobs per JM is also possible. I don't have much experience with many concurrent jobs on a JM, but in theory there are no limits. In practice you'll probably run into stability issues at some point, because the JM needs to coordinate too many jobs / taskmanagers. 2) Yes, that would be an option. The most important aspects here are: data throughput per admin group / state size / analysis complexity. If the each administrative group is low traffic (~100.000 elements / second), you could maybe process the data not using a Flink cluster at all. The Standalone mode of Flink starts a JobManager and TaskManager within
the same JVM. You could prepare a docker image with a standalone flink + the job and start that per administrative group. I think a reasonably sized machine (8 cores, 32 gb of main memory) should handle that. 3) Yes, you can not modify a running job. You can follow the King.com / RBEA approach. 4) That depends on the the definition of great. I think above answers greatly depend on the expected amount of data and the available hardware. Since Flink is quite easy to deploy, and a simple testing job is implemented in a few hours, I would suggest to do some experiments
to see how Flink behaves in the given environment. Regards, Robert On Mon, Jul 11, 2016 at 9:39 AM, Claudia Wegmann <[hidden email]> wrote:
|
On Mon, Jul 25, 2016 at 10:09 AM, Claudia Wegmann <[hidden email]> wrote:
> To 3) Would an approach similar to King/RBEA even be possible combined with > Flink CEP? As I understand, Patterns have to be defined in Java code and > therefore have to be recompiled? Do I overlook something important? Pulling in Till (cc'd) who worked on the CEP library. |
Free forum by Nabble | Edit this page |