Hello Guys,
I am working on a Flink application, in which I consume data from Apache Kafka, the data is published in three topics of the cluster, and I need to read from them, I suppose I can create three FlikKafkaConsumer constructors. The data I am consuming is in the same format {Id_sensor:, Id_equipement, Date:, Value{...}, ...}, the problem is the "Value" field changes from topic to topic, in fact in the first topic I have the temperature as a value
"Value":{"temperature":26} , the second topic contains oil data as a value "Value":{"oil_data":26}, the third topic the value field is "Value": {"Pitch":, "Roll", "Yaw"}. So I created three FlinkKafkaConsumer, and I defined three DeserializationSchema for each data of a topic, the problem is I want to do some aggregations on those data all together in order to apply a function. So I am wondering if It is a problem to join the three streams together in one stream and then do my aggregation by a field, and then apply the function, and finally sink it. and if so, am I going to have a problem sinking the data, because actually as I explained the value field is different from topic to another. Can anyone give me an explanation Please, I would be so grateful. Thank you for your time !! Aissa |
Hi Aissa,
https://stackoverflow.com/questions/54277910/how-do-i-join-two-streams-in-apache-flink
https://ci.apache.org/projects/flink/flink-docs-stable/dev/stream/operators/joining.html
https://ci.apache.org/projects/flink/flink-docs-release-1.11/dev/table/tableApi.html#joins It is important to understand what type of join is suits to your scenario, could one of streams miss some data + max delay. From: Aissa Elaffani <[hidden email]> Hello Guys, I am working on a Flink application, in which I consume data from Apache Kafka, the data is published in three topics of the cluster, and I need to read from them, I suppose I can create three FlikKafkaConsumer constructors. The data I
am consuming is in the same format {Id_sensor:, Id_equipement, Date:, Value{...}, ...}, the problem is the "Value" field changes from topic to topic, in fact in the first topic I have the temperature as a value "Value":{"temperature":26} , the second topic
contains oil data as a value "Value":{"oil_data":26}, the third topic the value field is "Value": {"Pitch":, "Roll", "Yaw"}. So I created three FlinkKafkaConsumer, and I defined three DeserializationSchema for each data of a topic, the problem is I want to do some aggregations on those data all together in order to apply a function. So I am wondering if It is
a problem to join the three streams together in one stream and then do my aggregation by a field, and then apply the function, and finally sink it. and if so, am I going to have a problem sinking the data, because actually as I explained the value field is
different from topic to another. Can anyone give me an explanation Please, I would be so grateful. Thank you for your time !! Aissa |
Free forum by Nabble | Edit this page |