Best practice for adding support for Kafka variants

Best practice for adding support for Kafka variants

deepthi Sridharan
Hi,

We have an internal version of the open-source Kafka consumer and producer that we use, and we are working on adding it as a source and sink for Flink.

It seems like the easiest way to add the consumer as a source would be to override the FlinkKafkaConsumer class's createFetcher method to provide our own subclass of the KafkaFetcher class, which could hook up its own version of the consumer thread. But the fetcher classes are annotated as @Internal, so it seems they are not meant to be used this way. (The changes for the producer would be along similar lines.)
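
For reference, this is the shape of the override we were considering (the createFetcher signature below is the one from Flink 1.13 and has changed across versions; MyKafkaFetcher would be our own, hypothetical subclass of KafkaFetcher):

    import java.util.Map;
    import java.util.Properties;

    import org.apache.flink.api.common.eventtime.WatermarkStrategy;
    import org.apache.flink.api.common.serialization.DeserializationSchema;
    import org.apache.flink.metrics.MetricGroup;
    import org.apache.flink.streaming.api.functions.source.SourceFunction.SourceContext;
    import org.apache.flink.streaming.api.operators.StreamingRuntimeContext;
    import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer;
    import org.apache.flink.streaming.connectors.kafka.config.OffsetCommitMode;
    import org.apache.flink.streaming.connectors.kafka.internals.AbstractFetcher;
    import org.apache.flink.streaming.connectors.kafka.internals.KafkaTopicPartition;
    import org.apache.flink.util.SerializedValue;

    public class MyFlinkKafkaConsumer<T> extends FlinkKafkaConsumer<T> {

        public MyFlinkKafkaConsumer(
                String topic, DeserializationSchema<T> schema, Properties props) {
            super(topic, schema, props);
        }

        @Override
        protected AbstractFetcher<T, ?> createFetcher(
                SourceContext<T> sourceContext,
                Map<KafkaTopicPartition, Long> assignedPartitionsWithInitialOffsets,
                SerializedValue<WatermarkStrategy<T>> watermarkStrategy,
                StreamingRuntimeContext runtimeContext,
                OffsetCommitMode offsetCommitMode,
                MetricGroup consumerMetricGroup,
                boolean useMetrics) throws Exception {
            // Here we would return MyKafkaFetcher, a subclass of KafkaFetcher
            // that starts our internal consumer thread instead of the stock
            // KafkaConsumerThread. All of these types are annotated @Internal.
            throw new UnsupportedOperationException("TODO: return MyKafkaFetcher");
        }
    }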

Is there a recommendation from the community on how to add new flavors of the Kafka consumer/producer? Would it be advisable to maintain a copy of all the connector classes so we don't have to deal with changes to classes tagged as internal?

--
Thanks & Regards

Re: Best practice for adding support for Kafka variants

Roman Khachatryan
Hi,

Those classes will likely be deprecated in the future in favor of the
FLIP-27 [1][2] source and FLIP-143 [3] sink implementations, and
eventually removed (though that won't happen soon).
You should probably take a look at those new APIs.
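
As a very rough sketch, a FLIP-27 source skeleton looks like this (package names as of Flink 1.13; the split and enumerator-state types here are trivial placeholders you would replace with real implementations built around your client):

    import org.apache.flink.api.connector.source.Boundedness;
    import org.apache.flink.api.connector.source.Source;
    import org.apache.flink.api.connector.source.SourceReader;
    import org.apache.flink.api.connector.source.SourceReaderContext;
    import org.apache.flink.api.connector.source.SourceSplit;
    import org.apache.flink.api.connector.source.SplitEnumerator;
    import org.apache.flink.api.connector.source.SplitEnumeratorContext;
    import org.apache.flink.core.io.SimpleVersionedSerializer;

    public class MyKafkaSource<OUT>
            implements Source<OUT, MyKafkaSource.MySplit, MyKafkaSource.MyEnumState> {

        // Placeholder split: in a real source, one split per topic partition.
        public static class MySplit implements SourceSplit {
            @Override
            public String splitId() { return "topic-partition-0"; }
        }

        // Placeholder enumerator checkpoint state (e.g. discovered partitions).
        public static class MyEnumState {}

        @Override
        public Boundedness getBoundedness() {
            return Boundedness.CONTINUOUS_UNBOUNDED; // streaming source
        }

        @Override
        public SourceReader<OUT, MySplit> createReader(SourceReaderContext ctx) {
            // This is where your internal consumer would be wrapped.
            throw new UnsupportedOperationException("TODO: wrap internal consumer");
        }

        @Override
        public SplitEnumerator<MySplit, MyEnumState> createEnumerator(
                SplitEnumeratorContext<MySplit> ctx) {
            // Partition discovery and split assignment go here.
            throw new UnsupportedOperationException("TODO");
        }

        @Override
        public SplitEnumerator<MySplit, MyEnumState> restoreEnumerator(
                SplitEnumeratorContext<MySplit> ctx, MyEnumState state) {
            throw new UnsupportedOperationException("TODO");
        }

        @Override
        public SimpleVersionedSerializer<MySplit> getSplitSerializer() {
            throw new UnsupportedOperationException("TODO");
        }

        @Override
        public SimpleVersionedSerializer<MyEnumState> getEnumeratorCheckpointSerializer() {
            throw new UnsupportedOperationException("TODO");
        }
    }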

Either way, there is no such recommendation AFAIK. Copied connector
classes would have to be updated whenever something in Flink changes.
A better way might be to build your own Kafka client and use it to build
the flink-kafka connector (by overriding ${kafka.version}, for example).
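
For example, something along these lines (a sketch; "2.4.1-internal" stands for whatever version your client build publishes, and assumes it is installed in your local Maven repo and is API-compatible with kafka-clients):

    # in a checkout of the Flink source tree
    mvn clean install -pl flink-connectors/flink-connector-kafka \
        -Dkafka.version=2.4.1-internal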

[1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-27%3A+Refactor+Source+Interface
[2] https://issues.apache.org/jira/browse/FLINK-18323
[3] https://cwiki.apache.org/confluence/display/FLINK/FLIP-143%3A+Unified+Sink+API

Regards,
Roman

Re: Best practice for adding support for Kafka variants

deepthi Sridharan
Thank you, Roman. I should have said our own flavor of Kafka, not version. Thanks for the references to the new source and sink interfaces, though; those seem to be the interfaces we should implement for our custom Kafka connector.

I did notice, however, that the FLIP does not cover the table interfaces. KafkaDynamicTableFactory, for example, still creates a FlinkKafkaConsumer instance. Is that something that will change in the future, or are the table interfaces somehow an exception to the advantages of the new interfaces?

--
Regards,
Deepthi

Re: Best practice for adding support for Kafka variants

Chesnay Schepler
FLIP-27 was primarily aimed at the DataStream API; the integration into the SQL/Table APIs will happen at a later date.

Re: Best practice for adding support for Kafka variants

Arvid Heise
Just to add: we are targeting that for 1.14.

However, it's also not too complicated to add a new TableFactory that uses the new sources (or your source).
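
A rough sketch of such a factory (the identifier and options are made up, MyKafkaSource stands for a FLIP-27 source like the skeleton sketched earlier in this thread producing RowData, and option validation is skipped for brevity):

    import java.util.HashSet;
    import java.util.Set;

    import org.apache.flink.configuration.ConfigOption;
    import org.apache.flink.configuration.ConfigOptions;
    import org.apache.flink.table.connector.ChangelogMode;
    import org.apache.flink.table.connector.source.DynamicTableSource;
    import org.apache.flink.table.connector.source.ScanTableSource;
    import org.apache.flink.table.connector.source.SourceProvider;
    import org.apache.flink.table.data.RowData;
    import org.apache.flink.table.factories.DynamicTableSourceFactory;

    public class MyKafkaTableFactory implements DynamicTableSourceFactory {

        public static final ConfigOption<String> TOPIC =
                ConfigOptions.key("topic").stringType().noDefaultValue();

        @Override
        public String factoryIdentifier() {
            return "my-kafka"; // 'connector' = 'my-kafka' in the DDL
        }

        @Override
        public Set<ConfigOption<?>> requiredOptions() {
            Set<ConfigOption<?>> options = new HashSet<>();
            options.add(TOPIC);
            return options;
        }

        @Override
        public Set<ConfigOption<?>> optionalOptions() {
            return new HashSet<>();
        }

        @Override
        public DynamicTableSource createDynamicTableSource(Context context) {
            return new ScanTableSource() {
                @Override
                public ChangelogMode getChangelogMode() {
                    return ChangelogMode.insertOnly();
                }

                @Override
                public ScanRuntimeProvider getScanRuntimeProvider(ScanContext ctx) {
                    // SourceProvider bridges a FLIP-27 source into the table runtime.
                    return SourceProvider.of(new MyKafkaSource<RowData>());
                }

                @Override
                public DynamicTableSource copy() {
                    return this;
                }

                @Override
                public String asSummaryString() {
                    return "my-kafka";
                }
            };
        }
    }

The factory then gets registered via the usual META-INF/services/org.apache.flink.table.factories.Factory file.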

Re: Best practice for adding support for Kafka variants

deepthi Sridharan
Makes sense. Thanks for the confirmation.
