Hi,

I have created the Kafka messaging architecture as a microservice that feeds both Spark Streaming and Flink. Spark Streaming uses micro-batches, meaning "collect and process data", while Flink follows an event-driven architecture (a stateful application that reacts to incoming events by triggering computations, etc.).

According to Wikipedia, a microservice architecture is a technique that structures an application as a collection of loosely coupled services. In a microservices architecture, services are fine-grained and the protocols are lightweight.

For streaming data, among other things, I have to create and configure a topic (or topics), design a robust ZooKeeper ensemble, and create Kafka brokers with scalability and resiliency. Then I can offer the streaming as a microservice to subscribers, among them Spark and Flink, and I can upgrade this microservice component in isolation without impacting either Spark or Flink.

The problem I face here is the dependency of Flink etc. on the jar files specific to the version of Kafka deployed. For example, kafka_2.12-1.1.0 is built on Scala 2.12 and Kafka version 1.1.0. To make this work in a Flink 1.5 application, I need to use the correct dependencies in the sbt build. For example:

libraryDependencies += "org.apache.flink" %% "flink-connector-kafka-0.11" % "1.5.0"
libraryDependencies += "org.apache.flink" %% "flink-connector-kafka-base" % "1.5.0"
libraryDependencies += "org.apache.flink" %% "flink-scala" % "1.5.0"
libraryDependencies += "org.apache.kafka" % "kafka-clients" % "0.11.0.0"
libraryDependencies += "org.apache.flink" %% "flink-streaming-scala" % "1.5.0"
libraryDependencies += "org.apache.kafka" %% "kafka" % "0.11.0.0"

and the Scala code needs to change:

import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer011
…
val stream = env
  .addSource(new FlinkKafkaConsumer011[String]("md", new SimpleStringSchema(), properties))

So in summary, some changes need to be made to Flink to be able to interact with the new version of Kafka.
And more importantly, can one use an abstract notion of a microservice here?

Dr Mich Talebzadeh
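For context, the consumer snippet above slots into a complete Flink 1.5 job roughly as follows. This is only a sketch: the bootstrap server address and group id are assumptions, while the topic name "md" and the dependencies come from the message above.

```scala
import java.util.Properties

import org.apache.flink.api.common.serialization.SimpleStringSchema
import org.apache.flink.streaming.api.scala._
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer011

object MarketDataStream {
  def main(args: Array[String]): Unit = {
    val env = StreamExecutionEnvironment.getExecutionEnvironment

    // Kafka connection settings; broker address and group id are assumptions,
    // not taken from the thread.
    val properties = new Properties()
    properties.setProperty("bootstrap.servers", "rhes75:9092")
    properties.setProperty("group.id", "md-consumer")

    // "md" is the topic name used in the snippet above.
    val stream = env
      .addSource(new FlinkKafkaConsumer011[String]("md", new SimpleStringSchema(), properties))

    stream.print()
    env.execute("Market data consumer")
  }
}
```

Note that upgrading Kafka means revisiting both the connector artifact name (flink-connector-kafka-0.11) and the consumer class (FlinkKafkaConsumer011), which is exactly the coupling described above.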
LinkedIn https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
http://talebzadehmich.wordpress.com Disclaimer: Use it at your own risk. Any and all responsibility for any loss, damage or destruction of data or any other property which may arise from relying on this email's technical content is explicitly disclaimed. The author will in no case be liable for any monetary damages arising from such loss, damage or destruction.
That they are loosely coupled does not mean they are independent. For instance, you would not be able to replace Kafka with ZeroMQ in your scenario. Unfortunately, Kafka also sometimes needs to introduce breaking changes, and the dependent application then needs to upgrade.
You will not be able to avoid these scenarios in the future (that would only be possible if microservices did not communicate with each other, or never needed to change their communication protocol, which is pretty much impossible). However, there are of course ways to reduce the impact: e.g. Kafka could reduce the number of breaking changes, or you can develop a very lightweight microservice that is very easy to change and that deals only with the broker integration for your application.

> On 8. Jul 2018, at 10:59, Mich Talebzadeh <[hidden email]> wrote:
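The "lightweight microservice that only deals with the broker integration" suggested here can be sketched as a thin adapter layer. All names below are illustrative, not from the thread: downstream code compiles against a small trait, so a Kafka upgrade only touches the one module implementing it.

```scala
// Sketch of a thin broker-integration layer. Consumers (Flink, Spark jobs)
// depend only on this trait, never on kafka-clients directly.
trait MessageSource {
  def poll(): Seq[String]
}

// The real implementation would wrap org.apache.kafka.clients.consumer.KafkaConsumer
// in its own module; an in-memory stub is shown here so the shape is clear
// and the example is self-contained.
class StubKafkaSource(messages: Seq[String]) extends MessageSource {
  def poll(): Seq[String] = messages
}

object Demo extends App {
  val source: MessageSource = new StubKafkaSource(Seq("tick-1", "tick-2"))
  source.poll().foreach(println)
}
```

When Kafka introduces a breaking change, only the module containing the real `MessageSource` implementation needs to be rebuilt and redeployed.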
Thanks Jörn. So I gather, as you correctly suggested, that microservices do provide value in terms of modularisation. However, there will inevitably be scenarios where the receiving artefact, say Flink, needs communication protocol changes?

Thanks,

Dr Mich Talebzadeh
On Sun, 8 Jul 2018 at 10:25, Jörn Franke <[hidden email]> wrote:
Yes, or Kafka will need them ...
As soon as you orchestrate different microservices, this will happen.

> On 8. Jul 2018, at 11:33, Mich Talebzadeh <[hidden email]> wrote:
Hi all,

I have now managed to deploy both ZooKeeper and Kafka as microservices using Docker images. The idea came to me as I wanted to create lightweight processes for both ZooKeeper and Kafka to be used as services for Flink and Spark simultaneously. In this design both Flink and Spark rely on streaming market data messages published through Kafka.

My current design is simple: one Docker container for ZooKeeper and another for Kafka:

[root@rhes75 ~]# docker ps -a
CONTAINER ID   IMAGE              COMMAND                  CREATED        STATUS                  PORTS                                            NAMES
05cf097ac139   ches/kafka         "/start.sh"              9 hours ago    Up 9 hours              0.0.0.0:7203->7203/tcp, 0.0.0.0:9092->9092/tcp   kafka
b173e455cc80   jplock/zookeeper   "/opt/zookeeper/bin/…"   10 hours ago   Up 10 hours (healthy)   2888/tcp, 0.0.0.0:2181->2181/tcp, 3888/tcp       zookeeper

Note that the container ports are exposed to the physical host that is running the containers. A test topic is created as follows:

${KAFKA_HOME}/bin/kafka-topics.sh --create --zookeeper rhes75:2181 --replication-factor 1 --partitions 1 --topic test

Note that rhes75 is the host that houses the containers, and port 2181 is the ZooKeeper port used by the ZooKeeper container and mapped to the host.

Spark Streaming acts as the speed layer in a Lambda architecture, writing selected market data to an HBase table (HBase requires connectivity to a ZooKeeper instance). For HBase I specified a ZooKeeper instance running on another host, and HBase works fine.

Anyway, I will provide further information and diagrams.

Cheers,

Dr Mich Talebzadeh
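The two-container layout above could also be expressed as a Docker Compose file. This is only a sketch: the image tags and host ports mirror the `docker ps` output above, but the environment variable wiring for the ches/kafka image is an assumption and should be checked against that image's documentation.

```yaml
# Sketch of the ZooKeeper + Kafka pair as a compose file.
version: "2"
services:
  zookeeper:
    image: jplock/zookeeper
    ports:
      - "2181:2181"       # client port, as in the docker ps output
  kafka:
    image: ches/kafka
    ports:
      - "9092:9092"       # broker port
      - "7203:7203"       # JMX port
    environment:
      # Variable name per the ches/kafka image README; verify before use.
      ZOOKEEPER_CONNECTION_STRING: zookeeper:2181
    depends_on:
      - zookeeper
```

With compose, the pair can be started and upgraded together as a single unit, which fits the goal of treating the messaging layer as one replaceable microservice.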
On Sun, 15 Jul 2018 at 08:40, Mich Talebzadeh <[hidden email]> wrote:
Is it on GitHub, Mich? I would love to use the Flink and Spark editions and add some use cases from my side.

Thanks,
Deepak

On Sun, Jul 15, 2018, 13:38 Mich Talebzadeh <[hidden email]> wrote:
Hi Deepak,

I will put it there once all the bits and pieces come together. At the moment I am drawing the diagrams; I will let you know. Everyone's contribution is definitely welcome.

Regards,

Dr Mich Talebzadeh
On Sun, 15 Jul 2018 at 09:16, Deepak Sharma <[hidden email]> wrote: