On 29 July 2017 at 4:46:57 AM, ninad ([hidden email]) wrote:
Hi Gordon, I was able to reproduce the data loss on a standalone Flink cluster
as well. A stripped-down version of our code is attached here:
Environment:
Flink standalone 1.3.0
Kafka 0.9
*What the code is doing:*
-consume messages from a Kafka topic ('event.filter.topic' property in
application.properties)
-group them by key
-analyze the events in a window and filter out some messages
-send the remaining messages to a Kafka topic ('sep.http.topic' property in
application.properties)
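For reference, the steps above can be sketched in plain Java. The actual job is in the attached zip; this toy model only mimics the group-by-key / window / filter semantics (not the real Flink operators), and the class name, event shape, and filter condition are all illustrative assumptions:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Toy model of the pipeline: group events by key, treat each group as one
// window, filter some messages out, and collect the rest for the sink.
public class PipelineSketch {
    // Illustrative event shape: a grouping key plus a payload.
    static final class Event {
        final String key;
        final String payload;
        Event(String key, String payload) { this.key = key; this.payload = payload; }
    }

    // The real job keys the stream and applies a session window; here we
    // simply group a finished batch by key to mimic that step.
    static Map<String, List<Event>> groupByKey(List<Event> events) {
        return events.stream().collect(Collectors.groupingBy(e -> e.key));
    }

    // Placeholder for the filtering logic: drop events whose payload
    // matches a (hypothetical) condition, keep everything else.
    static List<Event> filterWindow(List<Event> window) {
        List<Event> kept = new ArrayList<>();
        for (Event e : window) {
            if (!e.payload.startsWith("drop")) {
                kept.add(e);
            }
        }
        return kept;
    }

    // Everything that survives filtering would go to the Kafka sink.
    static List<Event> run(List<Event> input) {
        List<Event> out = new ArrayList<>();
        for (List<Event> window : groupByKey(input).values()) {
            out.addAll(filterWindow(window));
        }
        return out;
    }
}
```

In the real job the "window" step is a session window on event time, so the grouping happens per key as events arrive rather than over a finished batch as shown here.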
To build:
./gradlew clean assemble
The jar needs the path to the 'application.properties' file to run.
Important properties in application.properties:
window.session.interval.sec
kafka.brokers
event.filter.topic --> source topic
sep.http.topic --> destination topic
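The properties above can be read with the JDK's standard `Properties` class. A minimal sketch, assuming the file path is passed as the first command-line argument (the key names follow the list above; the class name and default values are illustrative):

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Properties;

public class AppConfig {
    // Loads application.properties from the given path.
    static Properties load(String path) throws IOException {
        Properties props = new Properties();
        try (InputStream in = new FileInputStream(path)) {
            props.load(in);
        }
        return props;
    }

    public static void main(String[] args) throws IOException {
        Properties props = load(args[0]);
        // Keys used by the job, per the list above.
        String brokers = props.getProperty("kafka.brokers");
        String source = props.getProperty("event.filter.topic"); // source topic
        String sink = props.getProperty("sep.http.topic");       // destination topic
        int windowSec = Integer.parseInt(
                props.getProperty("window.session.interval.sec", "30"));
        System.out.println(source + " -> " + sink + " via " + brokers
                + ", session gap " + windowSec + "s");
    }
}
```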
To test:
-Use the 'EventGenerator' class to publish messages to the source Kafka topic.
The published data won't be filtered out by the logic, so if you publish 10
messages to the source topic, all 10 should arrive at the destination topic.
-Once you see that Flink has received all the messages, bring down all Kafka
brokers.
-Let the Flink jobs fail 2-3 times.
-Restart the Kafka brokers.
Note: Data loss isn't observed on every run; roughly 1 in 4 attempts.
Thanks for all your help.
eventFilter.zip
<http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/file/n14522/eventFilter.zip>
--
View this message in context: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Fink-KafkaProducer-Data-Loss-tp11413p14522.html
Sent from the Apache Flink User Mailing List archive at Nabble.com.