Hi,
I have an API that emits output that I want to use as a data source for Flink. I have written a custom source function along the following lines:

    public class DynamicRuleSource extends AlertingRuleSource {
        ...
    }

The run() method in this source polls for any data to be ingested. The same object instance is shared between the API and the Flink execution environment; however, the output of the API does not get ingested into the Flink DataStream.

Is this the right pattern to use, or is Kafka the recommended way of streaming data into Flink?

--Aarti
Director, Engineering, Correlation
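For reference, a minimal sketch of the polling pattern described above, written directly against Flink's RichSourceFunction since the body of AlertingRuleSource is not shown; the String record type, field names, and queue-based hand-off are illustrative assumptions, not the actual code from the question:

    import java.util.concurrent.BlockingQueue;
    import java.util.concurrent.LinkedBlockingQueue;
    import java.util.concurrent.TimeUnit;

    import org.apache.flink.streaming.api.functions.source.RichSourceFunction;

    // Illustrative polling source: the API thread offers records into a queue,
    // and the source's run() loop drains that queue into the DataStream.
    public class DynamicRuleSource extends RichSourceFunction<String> {

        // Shared with the API layer; only meaningful while the API and the job
        // run inside the same JVM.
        private final BlockingQueue<String> queue = new LinkedBlockingQueue<>();

        private volatile boolean running = true;

        // Called by the API whenever it has new output to feed into the stream.
        public void addRecord(String record) {
            queue.offer(record);
        }

        @Override
        public void run(SourceContext<String> ctx) throws Exception {
            while (running) {
                String record = queue.poll(100, TimeUnit.MILLISECONDS);
                if (record != null) {
                    // Emit under the checkpoint lock so emission does not
                    // interleave with checkpointing.
                    synchronized (ctx.getCheckpointLock()) {
                        ctx.collect(record);
                    }
                }
            }
        }

        @Override
        public void cancel() {
            running = false;
        }
    }

Note that when such a source is registered with env.addSource(...), Flink serializes it and re-instantiates it inside the runtime, so the instance the API writes to is generally not the instance the job reads from unless everything runs in a single local JVM, which is the issue the reply below points at.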
Hi Aarti,

I would imagine that the described approach (sharing the same object instance between the API and the Flink runtime) would only work in toy executions, such as running the job within the IDE. Moreover, you would not be able to get exactly-once semantics with this source, which, for most users, is one of the main advantages of choosing Kafka as the source for Flink jobs. Given the way Flink checkpointing works [1], Flink's exactly-once guarantees rely on the fact that Kafka records can be replayed from a deterministic record offset.

Please let me know if you have any further questions.

Cheers,
Gordon

On Mon, Nov 12, 2018 at 3:16 PM Aarti Gupta <[hidden email]> wrote:
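For illustration, a minimal Kafka-backed version of such a job, assuming the API publishes its output to a Kafka topic instead of handing records to a shared object; the topic name "rules", group id, broker address, and checkpoint interval are placeholders, and older Flink releases use a version-suffixed consumer class (e.g. FlinkKafkaConsumer011) instead of the universal FlinkKafkaConsumer:

    import java.util.Properties;

    import org.apache.flink.api.common.serialization.SimpleStringSchema;
    import org.apache.flink.streaming.api.datastream.DataStream;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
    import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer;

    public class KafkaRuleSourceJob {

        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

            // Checkpointing is what lets Flink rewind to a consistent Kafka offset
            // after a failure, which is the basis of the exactly-once behaviour
            // described above.
            env.enableCheckpointing(10_000);

            Properties props = new Properties();
            props.setProperty("bootstrap.servers", "localhost:9092");  // placeholder broker
            props.setProperty("group.id", "alerting-rules");           // placeholder group id

            // The API writes its output to the "rules" topic; Flink reads it back
            // as a replayable DataStream.
            FlinkKafkaConsumer<String> consumer =
                    new FlinkKafkaConsumer<>("rules", new SimpleStringSchema(), props);

            DataStream<String> rules = env.addSource(consumer);
            rules.print();

            env.execute("Kafka-backed rule source");
        }
    }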