Hi Team,
Is it possible to edit the job graph at runtime? Suppose I want to add a new sink to the Flink application at runtime, one that depends on specific parameters in the incoming events. Can I edit the JobGraph of a running Flink application?

Thanks
Jessy
Hi Jessy,
to be precise, the JobGraph is not used at runtime; it is translated into an ExecutionGraph. Nevertheless, such patterns are possible, but they require a bit of manual implementation.

Option 1) You stop the job with a savepoint and restart the application with slightly different parameters. If the pipeline has not changed much, the old state can be remapped to the slightly modified job graph. This is the easiest solution, but it comes with the downside of maybe a couple of seconds of downtime.

Option 2) You introduce a dedicated control stream (e.g. by using the connect() DataStream API [1]). Either you implement a custom sink in the main stream of the CoProcessFunction, or you enrich every record in the main stream with sink parameters that are read by your custom sink implementation.

I hope this helps.

Regards,
Timo

[1] https://ci.apache.org/projects/flink/flink-docs-master/docs/dev/datastream/operators/overview/#connect
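PS: A minimal sketch of the enrichment variant of option 2 could look like the following. The Event, ControlMsg, and Enriched types and their fields are placeholders, not an existing API:

    import org.apache.flink.api.common.state.ValueState;
    import org.apache.flink.api.common.state.ValueStateDescriptor;
    import org.apache.flink.configuration.Configuration;
    import org.apache.flink.streaming.api.datastream.DataStream;
    import org.apache.flink.streaming.api.functions.co.CoProcessFunction;
    import org.apache.flink.util.Collector;

    DataStream<Enriched> enriched = mainStream
        .keyBy(e -> e.key)
        .connect(controlStream.keyBy(c -> c.key))
        .process(new CoProcessFunction<Event, ControlMsg, Enriched>() {
            // latest sink parameters per key, kept in keyed state
            private transient ValueState<String> targetTopic;

            @Override
            public void open(Configuration parameters) {
                targetTopic = getRuntimeContext().getState(
                        new ValueStateDescriptor<>("targetTopic", String.class));
            }

            @Override
            public void processElement1(Event event, Context ctx,
                                        Collector<Enriched> out) throws Exception {
                // attach the currently configured sink parameters to every record
                String topic = targetTopic.value();
                out.collect(new Enriched(event, topic == null ? "default" : topic));
            }

            @Override
            public void processElement2(ControlMsg msg, Context ctx,
                                        Collector<Enriched> out) throws Exception {
                // the control stream updates the sink parameters
                targetTopic.update(msg.targetTopic);
            }
        });

A custom sink downstream would then read the parameters from each Enriched record and act on them.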
Hi Timo/Team,

Thanks for the reply. Just take the example from the following pseudo code. Suppose this is the current application logic:

    firstInputStream = env.addSource(...);   // Kafka consumer C1
    secondInputStream = env.addSource(...);  // Kafka consumer C2

    outputStream = firstInputStream.keyBy(a -> a.key)
        .connect(secondInputStream.keyBy(b -> b.key))
        .process(coProcessFunction);

    // logic determines whether a new sink should be added to the
    // application or not. If not, the events are produced to the
    // existing sink(s). If a new sink is required, the events are
    // produced to the existing sinks plus the new one.

    sink1 = outputStream.addSink(...);  // Kafka producer P1
    ...
    sinkN = outputStream.addSink(...);  // Kafka producer PN

Questions:
--> Can I add a new sink into the execution graph at runtime, for example a new Kafka producer, without restarting the current application or using option 1?
--> (Option 2) What do you mean by adding a custom sink at the CoProcessFunction, and how will that change the execution graph?

Thanks
Jessy
Hi Team,

Can you provide your thoughts on this? It would be helpful.

Thanks
Jessy
Hi Jessy,
No, there is currently no way to add a sink without a restart. Could you elaborate on why a restart is not an option for you?

You can use option 2, which means that you implement one source and one sink that dynamically read from or write to different topics, possibly by wrapping the existing source and sink. This is a rather complex task that I would not recommend to a new Flink user.

If you have a known set of possible sink topics, another option would be to add all sinks from the get-go and only route messages dynamically with side outputs, as in the sketch below. However, I'm not aware of such a pattern for sources, although with the new source interface it should be possible to do that.
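A rough sketch of that side-output routing, assuming a fixed set of known topics (the tag names, the route field, and the producer variables are all placeholders):

    import org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator;
    import org.apache.flink.streaming.api.functions.ProcessFunction;
    import org.apache.flink.util.Collector;
    import org.apache.flink.util.OutputTag;

    // one tag per known topic; the anonymous subclass keeps the type information
    final OutputTag<Event> topicA = new OutputTag<Event>("topic-a") {};
    final OutputTag<Event> topicB = new OutputTag<Event>("topic-b") {};

    SingleOutputStreamOperator<Event> routed = outputStream
        .process(new ProcessFunction<Event, Event>() {
            @Override
            public void processElement(Event event, Context ctx, Collector<Event> out) {
                // route on a field of the event; "route" is a placeholder
                if ("a".equals(event.route)) {
                    ctx.output(topicA, event);
                } else if ("b".equals(event.route)) {
                    ctx.output(topicB, event);
                } else {
                    out.collect(event);  // main output goes to the default sink
                }
            }
        });

    routed.addSink(defaultProducer);                  // Kafka producer for the default topic
    routed.getSideOutput(topicA).addSink(producerA);  // pre-registered sink for topic-a
    routed.getSideOutput(topicB).addSink(producerB);  // pre-registered sink for topic-b

All sinks are part of the job graph from the start; only the per-record routing is dynamic.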
Thanks Arvid for the reply. Can you please elaborate a little bit on option 2, if possible?

Thanks
Jessy
Thanks Arvid for the reply. Can you please elaborate a little bit on option 2, if possible? We are looking for a similar option; currently we are proceeding with option 1.

Thanks Jessy for the question.
Hi Arvid,

Thanks for the reply. I am currently exploring Flink's features, and we have certain use cases where new producers will be added to the system dynamically, and we don't want to restart the application frequently. It would be helpful if you could explain option 2 in detail.

Thanks & Regards
Jessy
Hi,

Option 2 is to implement your own source/sink. Currently, we have the old, discouraged interfaces along with the new interfaces.

For the source, you want to read [1]. There is a KafkaSource already in 1.12 that we consider beta; you can replace it with the 1.13 version after the 1.13 release (they should be compatible). If that is too new for you, or you are on 1.11 or below, then you may also just implement your own SourceFunction [2], quite possibly by copying the existing Kafka source function and adjusting it. Inside your source function, you would read from your main/control topic and read the additional topics as you go. Your emitted values would then probably also contain the source topic along with your payload. How you model your payload data types depends on your use case, and in general dynamic types are harder to use, as you will experience lots of runtime errors (misspelled field names, mismatched field types, etc.). I can outline the options once you provide more details.

With the sink it's similar, although we are missing nice documentation for the new interface. I'd recommend having a look at the FLIP [3], which is quite non-technical. Again, if it doesn't work for you, you can also go with the old SinkFunctions. The implementation idea is similar: you start with an existing sink (by copying or delegating) and write to additional topics as you go. Again, dynamic types (different types for output topic1 and topic2?) will cause you quite a few runtime bugs. If all types are the same, there are better options than going with custom sources and sinks.

Implementing them is an advanced topic, so I usually discourage it unless you have lots of experience already. I can help in any case, but then I also need to understand your use case better. (In general, if you ask a technical question, you get a technical answer. If you ask it on a conceptual level, we may also offer architectural advice.)
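To make the "read/write additional topics as you go" idea concrete, here is a rough sketch on the old interfaces, using the pattern-based consumer and a per-record target topic in the producer. The topic naming pattern, broker address, and the Enriched type with its targetTopic/payload fields are assumptions for illustration:

    import java.nio.charset.StandardCharsets;
    import java.util.Properties;
    import java.util.regex.Pattern;

    import org.apache.flink.api.common.serialization.SimpleStringSchema;
    import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer;
    import org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducer;
    import org.apache.flink.streaming.connectors.kafka.KafkaSerializationSchema;
    import org.apache.kafka.clients.producer.ProducerRecord;

    Properties props = new Properties();
    props.setProperty("bootstrap.servers", "broker:9092");
    // discover new topics/partitions matching the pattern while the job runs
    props.setProperty("flink.partition-discovery.interval-millis", "30000");

    // one source that picks up new input topics matching an assumed naming scheme
    FlinkKafkaConsumer<String> source = new FlinkKafkaConsumer<>(
        Pattern.compile("events-.*"), new SimpleStringSchema(), props);

    // one sink that chooses the target topic per record, so writing to a
    // new output topic needs no job restart
    FlinkKafkaProducer<Enriched> sink = new FlinkKafkaProducer<>(
        "default-topic",
        new KafkaSerializationSchema<Enriched>() {
            @Override
            public ProducerRecord<byte[], byte[]> serialize(Enriched e, Long ts) {
                return new ProducerRecord<>(
                    e.targetTopic, e.payload.getBytes(StandardCharsets.UTF_8));
            }
        },
        props,
        FlinkKafkaProducer.Semantic.AT_LEAST_ONCE);

Note that the job graph itself still never changes; the single source and single sink simply cover a growing set of topics.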