Flink Kafka consumer with low latency requirement

Posted by xwang355 on
URL: http://deprecated-apache-flink-user-mailing-list-archive.369.s1.nabble.com/Flink-Kafka-consumer-with-low-latency-requirement-tp28368.html


Dear Flink experts,

I am experimenting Flink for a use case where there is a tight latency requirements.

A stackoverflow article suggests that I can use setParallism(n) to process a Kafka partition in a multi-threaded way. My understanding is there is still one kafka consumer per partition, but by using setParallelism, I can spin up multiple worker threads to process the messages read from the consumer. 

And according to Fabian`s comments in this link:
 http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Does-Flink-Kafka-connector-has-max-pending-offsets-concept-td28119.html
Flink is able to manage the offset correctly (commit in the right order).

Here is my questions, let`s say there is a Kafka topic with only one partition, and I setup a consumer with setParallism(2). Hypothetically,  worker threads call out to a REST service which may get slow or stuck periodically. If I want to make sure that the consumer overall is making progress even in face of a 'slow woker'. In other words, I`d like to have  multiple pending but uncommitted offsets by the fast worker even when the other worker is stuck. Is there such a knob  to tune in Flink? 

From my own experiment, I use Kafka consume group tool to to monitor the offset lag,  soon as one worker thread is stuck, the other cannot make any progress either. I really want the fast worker still progress to certain extend. For this use case, exactly once processing is not required. 

Thanks for helping.
Ben