Kinesis Connector

classic Classic list List threaded Threaded
7 messages Options
Reply | Threaded
Open this post in threaded view
|

Kinesis Connector

Giancarlo Pagano
Hi,

Is there any project already working on a Kinesis connector for Flink or any plan to add a Kinesis connector to the main Flink distribution in the future?

Thanks,
Giancarlo
Reply | Threaded
Open this post in threaded view
|

Re: Kinesis Connector

Márton Balassi
Hi Giancarlo,

I have no knowledge of someone working on such a project. However it would be a valuable contribution, if you were to start the effort please keep us notified, I would also suggest to file a JIRA ticket for it.

Best,

Marton

On Thu, Sep 17, 2015 at 12:54 PM, Giancarlo Pagano <[hidden email]> wrote:
Hi,

Is there any project already working on a Kinesis connector for Flink or any plan to add a Kinesis connector to the main Flink distribution in the future?

Thanks,
Giancarlo

Reply | Threaded
Open this post in threaded view
|

Re: Kinesis Connector

Stephan Ewen
In reply to this post by Giancarlo Pagano
Hi Giancarlo!

I am not aware of any existing Kinesis connector. Would be definitely something to put onto the roadmap for the near future. This is a stream source we should support similarly to Kafka.

I am not super familiar with Kinesis, but it looks a bit like offering a similar abstraction as Kafka, especially with the ability to read the streams from specific positions. That way, it should be possible to follow the same design as the Kafka connector (even simpler, if they don't have the tricky offset committing part of Kafka).

Greetings,
Stephan


On Thu, Sep 17, 2015 at 12:54 PM, Giancarlo Pagano <[hidden email]> wrote:
Hi,

Is there any project already working on a Kinesis connector for Flink or any plan to add a Kinesis connector to the main Flink distribution in the future?

Thanks,
Giancarlo

Reply | Threaded
Open this post in threaded view
|

Re: Kinesis Connector

Giancarlo Pagano
Hi Stephan,

I’m not a lot familiar with Kafka on the other hand, but I think they offer a very similar abstraction. Kinesis has a low-level api and an high level consumer, the Kinesis Client Library (KCL).
I‘ve implemented a first version of the connector using the KCL, that I’ve been using for testing.
It doesn’t support checkpointing yet, I’ll have a better look at the Flink Kafka Consumer and see what needs to be done to add support for checkpoints. I’ll probably need more help for that.
 
Thanks,
Giancarlo


> On 17 Sep 2015, at 12:27, Stephan Ewen <[hidden email]> wrote:
>
> Hi Giancarlo!
>
> I am not aware of any existing Kinesis connector. Would be definitely something to put onto the roadmap for the near future. This is a stream source we should support similarly to Kafka.
>
> I am not super familiar with Kinesis, but it looks a bit like offering a similar abstraction as Kafka, especially with the ability to read the streams from specific positions. That way, it should be possible to follow the same design as the Kafka connector (even simpler, if they don't have the tricky offset committing part of Kafka).
>
> Greetings,
> Stephan
>
>
> On Thu, Sep 17, 2015 at 12:54 PM, Giancarlo Pagano <[hidden email]> wrote:
> Hi,
>
> Is there any project already working on a Kinesis connector for Flink or any plan to add a Kinesis connector to the main Flink distribution in the future?
>
> Thanks,
> Giancarlo
>

Reply | Threaded
Open this post in threaded view
|

Re: Kinesis Connector

Stephan Ewen
Hi Giancarlo!

Cool that you are working on a Kinesis connector, very exciting :-)

To have a look at the Kafka fault tolerance, you can check out this blog post, it explains it in one of the later sections: http://data-artisans.com/kafka-flink-a-practical-how-to/



The basic principles behind the Kafka checkpointing are the following:

1) Partition-to-Source assignment is deterministic. On a retry run, the parallel source task n will get the same partitions assigned.

2) Whenever a record is fetched from Kafka, the "offset" (incrementing position) is fetched as well and remembered locally. We always keep the position of the last fetched element in each partition.

3) That position  is stored as part of the checkpointed, to define where in a partition the stream was at the time of the checkpoint. On a restore-after-failure, we set the source to start reading from that position.

Greetings,
Stephan


On Fri, Sep 18, 2015 at 12:00 PM, Giancarlo Pagano <[hidden email]> wrote:
Hi Stephan,

I’m not a lot familiar with Kafka on the other hand, but I think they offer a very similar abstraction. Kinesis has a low-level api and an high level consumer, the Kinesis Client Library (KCL).
I‘ve implemented a first version of the connector using the KCL, that I’ve been using for testing.
It doesn’t support checkpointing yet, I’ll have a better look at the Flink Kafka Consumer and see what needs to be done to add support for checkpoints. I’ll probably need more help for that.

Thanks,
Giancarlo


> On 17 Sep 2015, at 12:27, Stephan Ewen <[hidden email]> wrote:
>
> Hi Giancarlo!
>
> I am not aware of any existing Kinesis connector. Would be definitely something to put onto the roadmap for the near future. This is a stream source we should support similarly to Kafka.
>
> I am not super familiar with Kinesis, but it looks a bit like offering a similar abstraction as Kafka, especially with the ability to read the streams from specific positions. That way, it should be possible to follow the same design as the Kafka connector (even simpler, if they don't have the tricky offset committing part of Kafka).
>
> Greetings,
> Stephan
>
>
> On Thu, Sep 17, 2015 at 12:54 PM, Giancarlo Pagano <[hidden email]> wrote:
> Hi,
>
> Is there any project already working on a Kinesis connector for Flink or any plan to add a Kinesis connector to the main Flink distribution in the future?
>
> Thanks,
> Giancarlo
>


Reply | Threaded
Open this post in threaded view
|

Re: Kinesis Connector

Tzu-Li Tai
Hi Giancarlo,

Since it has been a while since the last post and there hasn't been a JIRA ticket opened for Kinesis connector yet, I'm wondering how you are doing on the Kinesis connector and hope to help out with this feature :)

I've opened a JIRA (https://issues.apache.org/jira/browse/FLINK-3211), finished the Kinesis sink, and half way through the Kinesis consumer. Would you like to merge our current efforts so that we can complete this feature ASAP for the AWS user community?

Thankfully,
Gordon
Reply | Threaded
Open this post in threaded view
|

Re: Kinesis Connector

Stephan Ewen
Thank you for picking up this issue. I am excited to have a Kinesis connector in the code base.

I added you as a contributor in JIRA, you can assign the issue to yourself now, to let others know you are working on this.

Thanks!
Stephan


On Sat, Jan 9, 2016 at 8:18 AM, Tzu-Li (Gordon) Tai <[hidden email]> wrote:
Hi Giancarlo,

Since it has been a while since the last post and there hasn't been a JIRA
ticket opened for Kinesis connector yet, I'm wondering how you are doing on
the Kinesis connector and hope to help out with this feature :)

I've opened a JIRA (https://issues.apache.org/jira/browse/FLINK-3211),
finished the Kinesis sink, and half way through the Kinesis consumer. Would
you like to merge our current efforts so that we can complete this feature
ASAP for the AWS user community?

Thankfully,
Gordon



--
View this message in context: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Kinesis-Connector-tp2872p4206.html
Sent from the Apache Flink User Mailing List archive. mailing list archive at Nabble.com.