Hello, I am trying to use Flink's CEP library on log files (as a batch job) rather than on streams (in real time).
Is that possible? If so, do you know of any Scala code examples for this? Or should I convert the log files (which contain timestamps) into streams?
And how are timestamps handled in Flink? If Flink cannot be used for this purpose at all, can you recommend other tools? I would like to do CEP-style analysis of log files.
Hi Esa, you can also read files as a stream. However, you have to be careful in which order you read the files and how you generate watermarks.
2018-02-07 9:40 GMT+01:00 Esa Heikkinen <[hidden email]>:
Hi, thanks for the reply. Since I am new to Flink, do you have any good Scala code examples for this?
Esa
From: Fabian Hueske [mailto:[hidden email]]
Hi Esa, you can also read files as a stream. The easiest approach is to implement a non-parallel source function that reads the files in the right order and generates watermarks. Things become trickier when you try to read the files in parallel.
Best, Fabian
2018-02-07 9:40 GMT+01:00 Esa Heikkinen <[hidden email]>:
Hi, I'm not aware of a good example, but I can give you some pointers.
- Implement the SourceFunction interface. This function will not be executed in parallel, so you don't have to worry about parallelism.
- Every n-th record, you can emit a watermark. The watermark timestamp must be smaller than the timestamps of all records that will be emitted in the future.
2018-02-07 13:59 GMT+01:00 Esa Heikkinen <[hidden email]>:
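The two pointers above could be sketched in Scala roughly as follows. This is only a minimal sketch and not code from this thread. It assumes that every log line starts with a timestamp in the form "yyyy-MM-dd HH:mm:ss" (in UTC) and that the lines are already sorted by time, both within each file and across the files in the order given. The names LogLine, OrderedLogSource and watermarkEvery are made up for illustration. Only SourceFunction, collectWithTimestamp, emitWatermark and Watermark come from Flink's DataStream API.

```scala
import java.time.{LocalDateTime, ZoneOffset}
import java.time.format.DateTimeFormatter
import scala.io.Source
import org.apache.flink.streaming.api.functions.source.SourceFunction
import org.apache.flink.streaming.api.watermark.Watermark

// Assumed (hypothetical) line layout: "2018-02-07 09:40:00 <message>"
object LogLine {
  private val fmt = DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss")

  // Parses the first 19 characters as a UTC timestamp and returns epoch milliseconds.
  def timestampOf(line: String): Long =
    LocalDateTime.parse(line.take(19), fmt).toInstant(ZoneOffset.UTC).toEpochMilli
}

// Non-parallel source: reads the files one after another, in the order given.
class OrderedLogSource(paths: Seq[String], watermarkEvery: Int = 100)
    extends SourceFunction[String] {

  @volatile private var running = true

  override def run(ctx: SourceFunction.SourceContext[String]): Unit = {
    var count = 0L
    var maxTs = Long.MinValue
    val files = paths.iterator
    while (running && files.hasNext) {
      val src = Source.fromFile(files.next())
      try {
        val lines = src.getLines()
        while (running && lines.hasNext) {
          val line = lines.next()
          val ts = LogLine.timestampOf(line)
          // Attach the event time taken from the log line to the record.
          ctx.collectWithTimestamp(line, ts)
          maxTs = math.max(maxTs, ts)
          count += 1
          // Every n-th record, emit a watermark. Because the input is assumed
          // to be sorted, maxTs - 1 is smaller than every future timestamp.
          if (count % watermarkEvery == 0) {
            ctx.emitWatermark(new Watermark(maxTs - 1))
          }
        }
      } finally {
        src.close()
      }
    }
  }

  override def cancel(): Unit = {
    running = false
  }
}
```

Such a source would be attached with env.addSource(new OrderedLogSource(Seq("first.log", "second.log"))) on a StreamExecutionEnvironment that is set to event time, with org.apache.flink.streaming.api.scala._ imported. The file names here are placeholders. If the log lines are not strictly sorted, the watermark would have to lag further behind maxTs than one millisecond.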