Hi guys,
Is the Flink Connector Kafka 0.10 fully compatible with Kafka 0.11?

Thank you in advance.

Best,
Gabriele
Hi,
Yes, the Flink connector for Kafka 0.10 should work without problems with Kafka 0.11. There is also pending work on a Kafka 0.11 connector that will add support for exactly-once semantics.

Piotrek

> On Aug 18, 2017, at 5:21 PM, Gabriele Di Bernardo <[hidden email]> wrote:
Hi Piotr!
In this page of the documentation [1] I can see the different versions of the Kafka connectors, but I am only now learning about Kafka, so some help would be valuable.

1 -- Are 0.8, 0.9, 0.11 etc. different versions of the same thing, or do they do the same thing? I mean, does 0.11 offer everything that 0.8 already has?

2 -- I would like to use the Kafka Streams API in my Flink cluster [2], which is used for standalone clusters if I am not mistaken, i.e. one node only by default.

3 -- Can you give some hints and briefly explain cluster deployment with many machines? I mean, what are Yarn, Mesos etc.? I think they are "coordinators" of the cluster. Now that I would like to test my algorithm on a real cluster with several machines, I would like some hints on which one I should use. What about Kubernetes and Docker [3]?

Thanks a lot in advance!

Best,
Max

[1] -- https://ci.apache.org/projects/flink/flink-docs-release-1.4/dev/connectors/kafka.html
[2] -- https://kafka.apache.org/documentation/streams/
[3] -- https://ci.apache.org/projects/flink/flink-docs-release-1.4/ops/deployment/cluster_setup.html

--
Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/
The different versions of the connector correspond to different versions of Kafka. If you are using Kafka 0.8, use the 0.8 connector, and so on. Connector versions after 0.10 support exactly-once delivery; earlier versions offer only at-least-once delivery.
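As a rough illustration of the rule of thumb above, here is a small Python sketch that maps a Kafka broker version to a connector version and the delivery guarantee mentioned in this thread. The names `CONNECTORS`, `version_key` and `pick_connector` are hypothetical, and the logic only encodes what the posts say (use the matching connector, exactly-once only after 0.10); it is not an official compatibility matrix.

```python
# Hypothetical helper: encodes only the rules of thumb stated in this thread.
CONNECTORS = ("0.8", "0.9", "0.10", "0.11")


def version_key(version: str) -> tuple:
    """Turn '0.10' into (0, 10) so versions compare numerically, not as text."""
    return tuple(int(part) for part in version.split("."))


def pick_connector(broker_version: str, need_exactly_once: bool = False) -> dict:
    """Suggest a connector version for a broker version (illustrative only)."""
    broker = version_key(broker_version)
    # Keep the connectors that are not newer than the broker, oldest first.
    usable = [c for c in CONNECTORS if version_key(c) <= broker]
    if not usable:
        raise ValueError(f"no connector listed for Kafka {broker_version}")
    connector = usable[-1]
    # Per the thread: only connectors after 0.10 offer exactly-once delivery.
    exactly_once = version_key(connector) > version_key("0.10")
    if need_exactly_once and not exactly_once:
        raise ValueError("exactly-once needs a connector newer than 0.10")
    return {
        "connector": connector,
        "delivery": "exactly-once" if exactly_once else "at-least-once",
    }


print(pick_connector("0.9"))
print(pick_connector("0.11"))
```

Note that the sketch always picks the newest connector that is not newer than the broker; as mentioned earlier in the thread, the 0.10 connector should also work against a 0.11 broker, just without the exactly-once guarantee.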
Kafka supports distributed processing through the deployment of multiple brokers. Each topic can be divided into partitions, and those are allocated to brokers to get distributed processing on a cluster. I have not used its stream processing API yet, but I assume it runs within the brokers, working on partitions of a topic.

Cluster deployment for Flink and Kafka can be done as a standalone cluster (manual deployment to a bunch of machines, via a custom-built AMI, etc.) or through a cluster manager like Mesos, Yarn, or Kubernetes, which manages the work performed on the machines in a cluster.

I have not seen any good tutorials on multi-machine deployments. There are a few suggesting how to do so with Kubernetes for Flink only, but none that I have found for Flink + Kafka. For my proof of concept it was just easier to manually build out 4 machines. The installs for Kafka and Flink are simple, and getting Java and Maven onto a base Ubuntu image on AWS is quick, so I can manually build out a machine in about 5 minutes.

Michael

> On Apr 22, 2018, at 2:22 AM, m@xi <[hidden email]> wrote:
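To make the idea of partitions being allocated to brokers more concrete, here is a minimal Python sketch that spreads the partitions of one topic over four brokers round-robin. The function `assign_partitions` and the broker names are hypothetical, and the scheme is deliberately simplified: a real Kafka cluster also places replicas and elects partition leaders, which this sketch ignores.

```python
from collections import defaultdict


def assign_partitions(num_partitions: int, brokers: list) -> dict:
    """Spread partition ids over brokers round-robin (simplified, no replicas)."""
    if not brokers:
        raise ValueError("at least one broker is required")
    assignment = defaultdict(list)
    for partition in range(num_partitions):
        # Partition i goes to broker i modulo the number of brokers.
        broker = brokers[partition % len(brokers)]
        assignment[broker].append(partition)
    return dict(assignment)


# Four brokers, mirroring the four-machine proof of concept described above.
layout = assign_partitions(6, ["broker-1", "broker-2", "broker-3", "broker-4"])
for broker, partitions in layout.items():
    print(broker, partitions)
```

With more partitions than brokers, some brokers hold several partitions; with fewer, some brokers hold none and do not appear in the result.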
Hey Michael! Thanks a lot for your answer.
1 -- OK then. It seems that Kafka version 0.11 is preferable, since it supports exactly-once semantics.

2 -- I have implemented my algorithm in Flink, but I would also like to implement it on Kafka Streams. All of them should run on a Flink cluster (a standalone cluster with 1 and also >1 machines). Finally, I would like to compare the performance of Flink's join algorithm with the performance of a join implemented in Kafka Streams. More specifically, I have implemented a custom join in Flink and 1) I do not know whether I can do the same with Kafka Streams, and 2) currently I do not use Kafka, although I should if I want to compare with Kafka Streams, as it only operates on top of the Kafka broker protocol.

3 -- Imagine an algorithm A as a black box. Do you think there are big performance differences between the next two scenarios: 1) Flink reading input from txt files, and 2) Flink + Kafka, using Kafka to move the streams?

4 -- I am now trying to set up Flink on a cluster of many machines in Microsoft Azure. If you have any experience or any tutorial related to it, please forward it to me.

Thanks in advance. If someone else can help, that is more than welcome!

Best,
Max
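One way to approach the black-box comparison in point 3 is to time the same algorithm against interchangeable input sources. The Python sketch below is only a generic harness under that assumption: `measure` and `algorithm_a` are hypothetical names, the source is a stand-in range, and nothing here calls Flink or Kafka. In a real test the source would be a file reader in one run and a Kafka consumer in the other.

```python
import time
from typing import Callable, Iterable


def measure(source: Iterable, algorithm: Callable) -> dict:
    """Run `algorithm` on every record from `source` and time the whole pass."""
    start = time.perf_counter()
    count = 0
    for record in source:
        algorithm(record)
        count += 1
    elapsed = time.perf_counter() - start
    return {
        "records": count,
        "seconds": elapsed,
        # Guard against a zero elapsed time on very small inputs.
        "records_per_second": count / elapsed if elapsed > 0 else float("inf"),
    }


def algorithm_a(record):
    # Stand-in for the black-box algorithm A from the question.
    return record * 2


# Stand-in source; swap in a file reader or a Kafka consumer for a real test.
result = measure(range(100_000), algorithm_a)
print(result["records"], "records in", round(result["seconds"], 4), "s")
```

Keeping the algorithm fixed and changing only the source isolates the cost of the input path, which is the difference the two scenarios are asking about.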
For #4, there was a past thread:

You can find related information on Azure Table in: docs/dev/batch/connectors.md

FYI

> On Mon, Apr 23, 2018 at 4:13 AM, m@xi <[hidden email]> wrote:
Thanks a lot Ted!
I will look into it! If someone else could elaborate on the other bullets, it would be great.

Best,
Max