"Slow ReadProcessor" warnings when using BucketSink

static-max
Hi,

I have a low throughput job (approx. 1000 messages per minute) that consumes from Kafka and writes directly to HDFS. After an hour or so, I get the following warnings in the Task Manager log:

2016-10-10 01:59:44,635 WARN  org.apache.hadoop.hdfs.DFSClient                              - Slow ReadProcessor read fields took 30001ms (threshold=30000ms); ack: seqno: 66 reply: SUCCESS reply: SUCCESS reply: SUCCESS downstreamAckTimeNanos: 1599276 flag: 0 flag: 0 flag: 0, targets: [DatanodeInfoWithStorage[Node1, Node2, Node3]]
2016-10-10 02:04:44,635 WARN  org.apache.hadoop.hdfs.DFSClient                              - Slow ReadProcessor read fields took 30002ms (threshold=30000ms); ack: seqno: 13 reply: SUCCESS reply: SUCCESS reply: SUCCESS downstreamAckTimeNanos: 2394027 flag: 0 flag: 0 flag: 0, targets: [DatanodeInfoWithStorage[Node1, Node2, Node3]]
2016-10-10 02:05:14,635 WARN  org.apache.hadoop.hdfs.DFSClient                              - Slow ReadProcessor read fields took 30001ms (threshold=30000ms); ack: seqno: 17 reply: SUCCESS reply: SUCCESS reply: SUCCESS downstreamAckTimeNanos: 2547467 flag: 0 flag: 0 flag: 0, targets: [DatanodeInfoWithStorage[Node1, Node2, Node3]]

I have not found any errors or warnings at the datanodes or the namenode. Every other application using HDFS performs fine. There is very little load, and network latency is fine as well. I also checked GC and disk I/O.

The files written are very small (only a few MB), so writing the blocks should be fast.

The threshold is exceeded by only 1 or 2 ms, which makes me wonder.

Does anyone have an idea where to look next or how to fix these warnings?

Re: "Slow ReadProcessor" warnings when using BucketSink

rmetzger0
Hi,
I haven't seen this error before. Also, I didn't find anything helpful searching for the error on Google. 

Did you also check the GC times for Flink? Is your Flink job doing any heavy tasks (like maintaining large windows, or other operations involving a lot of heap space)?

Regards,
Robert


Re: "Slow ReadProcessor" warnings when using BucketSink

static-max
Hi Robert,

Thanks for your reply. I also didn't find anything helpful on Google.

I checked all GC times and they look OK. Here are the GC times for the Job Manager (the job has been running fine for 5 days):

Collector      Count   Time
PS-MarkSweep       3   1s
PS-Scavenge     5814   2m 12s

I have no windows or any other computation, just reading from Kafka and writing directly to HDFS.
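For reference, the whole pipeline is roughly the following (a minimal sketch, not the actual code; the topic name, output path, Kafka properties and the exact connector/sink classes are placeholders, sketched here with the bucketing file sink from flink-connector-filesystem):

import java.util.Properties;

import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.connectors.fs.bucketing.BucketingSink;
import org.apache.flink.streaming.connectors.fs.bucketing.DateTimeBucketer;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer09;
import org.apache.flink.streaming.util.serialization.SimpleStringSchema;

public class KafkaToHdfsJob {

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Kafka consumer properties (placeholder broker / group id)
        Properties kafkaProps = new Properties();
        kafkaProps.setProperty("bootstrap.servers", "kafka1:9092");
        kafkaProps.setProperty("group.id", "hdfs-writer");

        // Source: plain string messages from a single topic
        DataStream<String> messages = env.addSource(
                new FlinkKafkaConsumer09<>("events", new SimpleStringSchema(), kafkaProps));

        // Sink: bucketing file sink writing directly to HDFS, one bucket per hour,
        // rolling part files at 64 MB
        BucketingSink<String> sink = new BucketingSink<>("hdfs:///data/events");
        sink.setBucketer(new DateTimeBucketer<String>("yyyy-MM-dd--HH"));
        sink.setBatchSize(64L * 1024 * 1024);

        messages.addSink(sink);

        env.execute("Kafka to HDFS");
    }
}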

I can also run a terasort or teragen in parallel without any problems.

Best,
Max


Re: "Slow ReadProcessor" warnings when using BucketSink

rmetzger0
Hi Max,

Maybe you need to ask this question on the Hadoop user mailing list (or your Hadoop vendor support, if you are using a Hadoop distribution).
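If the warnings turn out to be harmless noise, one thing you could try in the meantime is raising the client-side threshold that triggers them. If I remember correctly (please double-check for your Hadoop version), the 30000 ms in the message is the DFS client's slow-I/O warning threshold, controlled by dfs.client.slow.io.warning.threshold.ms in the client-side hdfs-site.xml, e.g.:

<!-- client-side hdfs-site.xml (the one on the Flink TaskManager classpath);
     key name assumed, please verify it exists in your Hadoop version -->
<property>
  <name>dfs.client.slow.io.warning.threshold.ms</name>
  <value>60000</value> <!-- default is 30000 ms -->
</property>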
