Hi,
Is there any project already working on a Kinesis connector for Flink or any plan to add a Kinesis connector to the main Flink distribution in the future? Thanks, Giancarlo |
Hi Giancarlo, I have no knowledge of someone working on such a project. However it would be a valuable contribution, if you were to start the effort please keep us notified, I would also suggest to file a JIRA ticket for it. Best, Marton On Thu, Sep 17, 2015 at 12:54 PM, Giancarlo Pagano <[hidden email]> wrote: Hi, |
In reply to this post by Giancarlo Pagano
Hi Giancarlo! I am not aware of any existing Kinesis connector. Would be definitely something to put onto the roadmap for the near future. This is a stream source we should support similarly to Kafka. I am not super familiar with Kinesis, but it looks a bit like offering a similar abstraction as Kafka, especially with the ability to read the streams from specific positions. That way, it should be possible to follow the same design as the Kafka connector (even simpler, if they don't have the tricky offset committing part of Kafka). Greetings, Stephan On Thu, Sep 17, 2015 at 12:54 PM, Giancarlo Pagano <[hidden email]> wrote: Hi, |
Hi Stephan,
I’m not a lot familiar with Kafka on the other hand, but I think they offer a very similar abstraction. Kinesis has a low-level api and an high level consumer, the Kinesis Client Library (KCL). I‘ve implemented a first version of the connector using the KCL, that I’ve been using for testing. It doesn’t support checkpointing yet, I’ll have a better look at the Flink Kafka Consumer and see what needs to be done to add support for checkpoints. I’ll probably need more help for that. Thanks, Giancarlo > On 17 Sep 2015, at 12:27, Stephan Ewen <[hidden email]> wrote: > > Hi Giancarlo! > > I am not aware of any existing Kinesis connector. Would be definitely something to put onto the roadmap for the near future. This is a stream source we should support similarly to Kafka. > > I am not super familiar with Kinesis, but it looks a bit like offering a similar abstraction as Kafka, especially with the ability to read the streams from specific positions. That way, it should be possible to follow the same design as the Kafka connector (even simpler, if they don't have the tricky offset committing part of Kafka). > > Greetings, > Stephan > > > On Thu, Sep 17, 2015 at 12:54 PM, Giancarlo Pagano <[hidden email]> wrote: > Hi, > > Is there any project already working on a Kinesis connector for Flink or any plan to add a Kinesis connector to the main Flink distribution in the future? > > Thanks, > Giancarlo > |
Hi Giancarlo! Cool that you are working on a Kinesis connector, very exciting :-) To have a look at the Kafka fault tolerance, you can check out this blog post, it explains it in one of the later sections: http://data-artisans.com/kafka-flink-a-practical-how-to/ A general overview of checkpointing is here: https://ci.apache.org/projects/flink/flink-docs-master/internals/stream_checkpointing.html The basic principles behind the Kafka checkpointing are the following: 1) Partition-to-Source assignment is deterministic. On a retry run, the parallel source task n will get the same partitions assigned. 2) Whenever a record is fetched from Kafka, the "offset" (incrementing position) is fetched as well and remembered locally. We always keep the position of the last fetched element in each partition. 3) That position is stored as part of the checkpointed, to define where in a partition the stream was at the time of the checkpoint. On a restore-after-failure, we set the source to start reading from that position. Greetings, Stephan On Fri, Sep 18, 2015 at 12:00 PM, Giancarlo Pagano <[hidden email]> wrote: Hi Stephan, |
Hi Giancarlo,
Since it has been a while since the last post and there hasn't been a JIRA ticket opened for Kinesis connector yet, I'm wondering how you are doing on the Kinesis connector and hope to help out with this feature :) I've opened a JIRA (https://issues.apache.org/jira/browse/FLINK-3211), finished the Kinesis sink, and half way through the Kinesis consumer. Would you like to merge our current efforts so that we can complete this feature ASAP for the AWS user community? Thankfully, Gordon |
Thank you for picking up this issue. I am excited to have a Kinesis connector in the code base. I added you as a contributor in JIRA, you can assign the issue to yourself now, to let others know you are working on this. Thanks! Stephan On Sat, Jan 9, 2016 at 8:18 AM, Tzu-Li (Gordon) Tai <[hidden email]> wrote: Hi Giancarlo, |
Free forum by Nabble | Edit this page |