Hi, I have a rather simple Flink job which has a KinesisConsumer as a source and an HBase table as sink, in which I write using writeOutputFormat. I'm running it on a local machine with a single taskmanager (2 slots, 2G). The KinesisConsumer works fine and the connection to the HBase table gets opened fine (i.e. the open method of the class implementing OutputFormat gets actually called). I'm running the job at a parallelism of 2, while the sink has a parallelism of 1. The INFO org.apache.flink.runtime.blob.BlobCache - Downloading 8638bdf78b0e540786de6c291f710a8db447a2b4 from localhost/127.0.0.1:43268 Each following one another, like this: 2017-08-30 14:17:21,318 INFO org.apache.flink.runtime.blob.BlobCache - Created BLOB cache storage directory /tmp/blobStore-8a2a96af-b836-4c95-b79a-a4b80929126f 2017-08-30 14:17:21,321 DEBUG org.apache.flink.runtime.blob.BlobClient - PUT content addressable BLOB stream to /127.0.0.1:59937 2017-08-30 14:17:21,323 DEBUG org.apache.flink.runtime.blob.BlobServerConnection - Received PUT request for content addressable BLOB 2017-08-30 14:17:21,324 INFO org.apache.flink.runtime.blob.BlobCache - Downloading 3ff486dff4c4eaafdab42b30a877326e62bfca82 from localhost/127.0.0.1:43268 2017-08-30 14:17:21,324 DEBUG org.apache.flink.runtime.blob.BlobClient - GET content addressable BLOB 3ff486dff4c4eaafdab42b30a877326e62bfca82 from /127.0.0.1:59938 2017-08-30 14:18:13,708 DEBUG org.apache.flink.runtime.blob.BlobClient - PUT content addressable BLOB stream to /127.0.0.1:59976 2017-08-30 14:18:13,708 DEBUG org.apache.flink.runtime.blob.BlobServerConnection - Received PUT request for content addressable BLOB 2017-08-30 14:18:13,710 INFO org.apache.flink.runtime.blob.BlobCache - Downloading 2f5283326aab77faa047b705cd1d6470035b3b7d from localhost/127.0.0.1:43268 2017-08-30 14:18:13,710 DEBUG org.apache.flink.runtime.blob.BlobClient - GET content addressable BLOB 2f5283326aab77faa047b705cd1d6470035b3b7d from /127.0.0.1:59978 2017-08-30 14:19:29,811 DEBUG org.apache.flink.runtime.blob.BlobClient - PUT content addressable BLOB stream to /127.0.0.1:60022 2017-08-30 14:19:29,812 DEBUG org.apache.flink.runtime.blob.BlobServerConnection - Received PUT request for content addressable BLOB 2017-08-30 14:19:29,814 INFO org.apache.flink.runtime.blob.BlobCache - Downloading f91fd7ecec6f90809f52ee189cb48aa1e30b04f6 from localhost/127.0.0.1:43268 2017-08-30 14:19:29,814 DEBUG org.apache.flink.runtime.blob.BlobClient - GET content addressable BLOB f91fd7ecec6f90809f52ee189cb48aa1e30b04f6 from /127.0.0.1:60024 2017-08-30 14:21:42,856 DEBUG org.apache.flink.runtime.blob.BlobClient - PUT content addressable BLOB stream to /127.0.0.1:60110 2017-08-30 14:21:42,856 DEBUG org.apache.flink.runtime.blob.BlobServerConnection - Received PUT request for content addressable BLOB 2017-08-30 14:21:42,858 INFO org.apache.flink.runtime.blob.BlobCache - Downloading 8638bdf78b0e540786de6c291f710a8db447a2b4 from localhost/127.0.0.1:43268 2017-08-30 14:21:42,859 DEBUG org.apache.flink.runtime.blob.BlobClient - GET content addressable BLOB 8638bdf78b0e540786de6c291f710a8db447a2b4 from /127.0.0.1:60112 2017-08-30 14:26:11,242 DEBUG org.apache.flink.runtime.blob.BlobClient - PUT content addressable BLOB stream to /127.0.0.1:60295 2017-08-30 14:26:11,243 DEBUG org.apache.flink.runtime.blob.BlobServerConnection - Received PUT request for content addressable BLOB 2017-08-30 14:26:11,247 INFO org.apache.flink.runtime.blob.BlobCache - Downloading 6d30c88539d511bb9acc13b53bb2a128614f5621 from localhost/127.0.0.1:43268 2017-08-30 14:26:11,247 DEBUG org.apache.flink.runtime.blob.BlobClient - GET content addressable BLOB 6d30c88539d511bb9acc13b53bb2a128614f5621 from /127.0.0.1:60297 2017-08-30 14:29:20,942 DEBUG org.apache.flink.runtime.blob.BlobClient - PUT content addressable BLOB stream to /127.0.0.1:60410 Federico D'Ambrosio
|
Hi Federico, Not sure what's going on there but Nico (in CC) is more familiar with the blob cache and might be able to help. Best, Fabian 2017-08-30 15:35 GMT+02:00 Federico D'Ambrosio <[hidden email]>:
|
Hi Federico,
1) Which version of Flink are you using? 2) Can you also share the JobManager log? 3) Why do you think, Flink is stuck at the BlobCache? Is it really blocked, or do you still have CPU load? Can you post stack traces of the TaskManager (TM) and JobManager processes when you think they are stuck (using jstack)? 4) These PUT requests in the TM logs are strange, unless you showed the TM logs in the web interface - did you? Nico On Thursday, 31 August 2017 09:45:59 CEST Fabian Hueske wrote: > Hi Federico, > > Not sure what's going on there but Nico (in CC) is more familiar with the > blob cache and might be able to help. > > Best, Fabian > > 2017-08-30 15:35 GMT+02:00 Federico D'Ambrosio <[hidden email]>: > > Hi, > > > > I have a rather simple Flink job which has a KinesisConsumer as a source > > and an HBase table as sink, in which I write using writeOutputFormat. I'm > > running it on a local machine with a single taskmanager (2 slots, 2G). The > > KinesisConsumer works fine and the connection to the HBase table gets > > opened fine (i.e. the open method of the class implementing OutputFormat > > gets actually called). > > > > I'm running the job at a parallelism of 2, while the sink has a > > parallelism of 1. The > > > > Still, looking at the log I see that after opening the connection, the job > > gets stuck at lines like this one: > > > > INFO org.apache.flink.runtime.blob.BlobCache - > > Downloading 8638bdf78b0e540786de6c291f710a8db447a2b4 from > > localhost/127.0.0.1:43268 > > > > Each following one another, like this: > > > > 2017-08-30 14:17:21,318 INFO org.apache.flink.runtime.blob.BlobCache > > - Created BLOB cache storage directory > > /tmp/blobStore-8a2a96af-b836-4c95-b79a-a4b80929126f 2017-08-30 > > 14:17:21,321 DEBUG org.apache.flink.runtime.blob.BlobClient > > - PUT content addressable BLOB stream to /127.0.0.1:59937 > > 2017-08-30 14:17:21,323 DEBUG > > org.apache.flink.runtime.blob.BlobServerConnection - Received > > PUT request for content addressable BLOB 2017-08-30 14:17:21,324 INFO > > org.apache.flink.runtime.blob.BlobCache - > > Downloading 3ff486dff4c4eaafdab42b30a877326e62bfca82 from > > localhost/127.0.0.1:43268 2017-08-30 14:17:21,324 DEBUG > > org.apache.flink.runtime.blob.BlobClient - GET > > content addressable BLOB 3ff486dff4c4eaafdab42b30a877326e62bfca82 from > > /127.0.0.1:59938 2017-08-30 14:18:13,708 DEBUG > > org.apache.flink.runtime.blob.BlobClient - PUT > > content addressable BLOB stream to /127.0.0.1:59976 2017-08-30 > > 14:18:13,708 DEBUG org.apache.flink.runtime.blob.BlobServerConnection > > - Received PUT request for content addressable BLOB 2017-08-30 > > 14:18:13,710 INFO org.apache.flink.runtime.blob.BlobCache > > - Downloading 2f5283326aab77faa047b705cd1d6470035b3b7d from > > localhost/127.0.0.1:43268 2017-08-30 14:18:13,710 DEBUG > > org.apache.flink.runtime.blob.BlobClient - GET > > content addressable BLOB 2f5283326aab77faa047b705cd1d6470035b3b7d from > > /127.0.0.1:59978 2017-08-30 14:19:29,811 DEBUG > > org.apache.flink.runtime.blob.BlobClient - PUT > > content addressable BLOB stream to /127.0.0.1:60022 2017-08-30 > > 14:19:29,812 DEBUG org.apache.flink.runtime.blob.BlobServerConnection > > - Received PUT request for content addressable BLOB 2017-08-30 > > 14:19:29,814 INFO org.apache.flink.runtime.blob.BlobCache > > - Downloading f91fd7ecec6f90809f52ee189cb48aa1e30b04f6 from > > localhost/127.0.0.1:43268 2017-08-30 14:19:29,814 DEBUG > > org.apache.flink.runtime.blob.BlobClient - GET > > content addressable BLOB f91fd7ecec6f90809f52ee189cb48aa1e30b04f6 from > > /127.0.0.1:60024 2017-08-30 14:21:42,856 DEBUG > > org.apache.flink.runtime.blob.BlobClient - PUT > > content addressable BLOB stream to /127.0.0.1:60110 2017-08-30 > > 14:21:42,856 DEBUG org.apache.flink.runtime.blob.BlobServerConnection > > - Received PUT request for content addressable BLOB 2017-08-30 > > 14:21:42,858 INFO org.apache.flink.runtime.blob.BlobCache > > - Downloading 8638bdf78b0e540786de6c291f710a8db447a2b4 from > > localhost/127.0.0.1:43268 2017-08-30 14:21:42,859 DEBUG > > org.apache.flink.runtime.blob.BlobClient - GET > > content addressable BLOB 8638bdf78b0e540786de6c291f710a8db447a2b4 from > > /127.0.0.1:60112 2017-08-30 14:26:11,242 DEBUG > > org.apache.flink.runtime.blob.BlobClient - PUT > > content addressable BLOB stream to /127.0.0.1:60295 2017-08-30 > > 14:26:11,243 DEBUG org.apache.flink.runtime.blob.BlobServerConnection > > - Received PUT request for content addressable BLOB 2017-08-30 > > 14:26:11,247 INFO org.apache.flink.runtime.blob.BlobCache > > - Downloading 6d30c88539d511bb9acc13b53bb2a128614f5621 from > > localhost/127.0.0.1:43268 2017-08-30 14:26:11,247 DEBUG > > org.apache.flink.runtime.blob.BlobClient - GET > > content addressable BLOB 6d30c88539d511bb9acc13b53bb2a128614f5621 from > > /127.0.0.1:60297 2017-08-30 14:29:20,942 DEBUG > > org.apache.flink.runtime.blob.BlobClient - PUT > > content addressable BLOB stream to /127.0.0.1:60410 > > > > > > My questions are: what is the jobmanager doing here? Why is he taking ages > > to do this? How do i speed up this behaviour? > > > > Thank you very much for your attention, > > > > Federico D'Ambrosio signature.asc (201 bytes) Download Attachment |
Hi, 1) I'm using Flink 1.3.22) Th JobManager log is pretty much the same concerning those lines: 2017-08-30 14:16:53,344 INFO org.apache.zookeeper.ClientCnxn - Socket connection established to master-1.localdomain/10.0.0.55:2181, initiating session 2017-08-30 14:16:53,348 INFO org.apache.zookeeper.ClientCnxn - Session establishment complete on server master-1.localdomain/10.0.0.55:2181, sessionid = 0x15e326a8fe6000e, negotiated timeout = 40000 2017-08-30 14:16:53,467 INFO org.apache.flink.streaming.connectors.kinesis.internals.KinesisDataFetcher - Subtask 0 has no active shards to read on startup; marking the subtask as temporarily idle ... 2017-08-30 14:16:53,469 INFO org.apache.flink.streaming.connectors.kinesis.FlinkKinesisConsumer - Subtask 1 will be seeded with initial shard StreamShardHandle{streamName='fdt', shard='{ShardId: shardId-000000000000,HashKeyRange: {StartingHashKey: 0,EndingHashKey: 340282366920938463463374607431768211455},SequenceNumberRange: {StartingSequenceNumber: 49576318619583361934524589342704326364484033599642796034,}}'}, starting state set as sequence number LATEST_SEQUENCE_NUM 2017-08-30 14:16:53,470 INFO org.apache.flink.streaming.connectors.kinesis.internals.KinesisDataFetcher - Subtask 1 will start consuming seeded shard StreamShardHandle{streamName='fdt', shard='{ShardId: shardId-000000000000,HashKeyRange: {StartingHashKey: 0,EndingHashKey: 340282366920938463463374607431768211455},SequenceNumberRange: {StartingSequenceNumber: 49576318619583361934524589342704326364484033599642796034,}}'} from sequence number LATEST_SEQUENCE_NUM with ShardConsumer 0 2017-08-30 14:16:53,608 INFO lab.vardata.HBaseBatchFormat - Task 0: Opening connection to currentDay to execute 1 tasks on Single Put job 2017-08-30 14:17:21,318 INFO org.apache.flink.runtime.blob. 2017-08-30 14:17:21,321 DEBUG org.apache.flink.runtime.blob. 2017-08-30 14:17:21,323 DEBUG org.apache.flink.runtime.blob. 2017-08-30 14:17:21,324 INFO org.apache.flink.runtime.blob. 2017-08-30 14:17:21,324 DEBUG org.apache.flink.runtime.blob. 2017-08-30 14:18:13,708 DEBUG org.apache.flink.runtime.blob. 2017-08-30 14:18:13,708 DEBUG org.apache.flink.runtime.blob. 2017-08-30 14:18:13,710 INFO org.apache.flink.runtime.blob. 2017-08-30 14:18:13,710 DEBUG org.apache.flink.runtime.blob. 2017-08-30 14:19:29,811 DEBUG org.apache.flink.runtime.blob. 2017-08-30 14:19:29,812 DEBUG org.apache.flink.runtime.blob. 2017-08-30 14:19:29,814 INFO org.apache.flink.runtime.blob. 2017-08-30 14:19:29,814 DEBUG org.apache.flink.runtime.blob. 2017-08-30 14:21:42,856 DEBUG org.apache.flink.runtime.blob. 2017-08-30 14:21:42,856 DEBUG org.apache.flink.runtime.blob. 2017-08-30 14:21:42,858 INFO org.apache.flink.runtime.blob. 2017-08-30 14:21:42,859 DEBUG org.apache.flink.runtime.blob. 2017-08-30 14:26:11,242 DEBUG org.apache.flink.runtime.blob. 2017-08-30 14:26:11,243 DEBUG org.apache.flink.runtime.blob. 2017-08-30 14:26:11,247 INFO org.apache.flink.runtime.blob. 2017-08-30 14:26:11,247 DEBUG org.apache.flink.runtime.blob. 2017-08-30 14:29:20,942 DEBUG org.apache.flink.runtime.blob. 2017-08-30 14:29:20,943 DEBUG org.apache.flink.runtime.blob. 2017-08-30 14:29:20,945 INFO org.apache.flink.runtime.blob. 2017-08-30 14:29:20,946 DEBUG org.apache.flink.runtime.blob. 2017-08-30 14:32:10,004 DEBUG org.apache.flink.runtime.blob. 2017-08-30 14:32:10,004 DEBUG org.apache.flink.runtime.blob. 2017-08-30 14:32:10,006 INFO org.apache.flink.runtime.blob. 2017-08-30 14:32:10,006 DEBUG org.apache.flink.runtime.blob. 2017-08-30 14:37:51,282 DEBUG org.apache.flink.runtime.blob. 2017-08-30 14:37:51,283 DEBUG org.apache.flink.runtime.blob. 2017-08-30 14:37:51,285 INFO org.apache.flink.runtime.blob. 2017-08-30 14:37:51,286 DEBUG org.apache.flink.runtime.blob. 2017-08-30 14:37:54,222 DEBUG org.apache.flink.runtime.blob. 2017-08-30 14:37:54,223 DEBUG org.apache.flink.runtime.blob. 2017-08-30 14:37:54,225 INFO org.apache.flink.runtime.blob. 2017-08-30 14:37:54,225 DEBUG org.apache.flink.runtime.blob. 2017-08-30 14:40:44,005 DEBUG org.apache.flink.runtime.blob. 2017-08-30 14:40:44,006 DEBUG org.apache.flink.runtime.blob. 2017-08-30 14:40:44,007 INFO org.apache.flink.runtime.blob. 2017-08-30 14:40:44,008 DEBUG org.apache.flink.runtime.blob. 2017-08-30 14:45:52,671 DEBUG org.apache.flink.runtime.blob. 2017-08-30 14:45:52,671 DEBUG org.apache.flink.runtime.blob. 2017-08-30 14:45:52,672 INFO org.apache.flink.runtime.blob. 2017-08-30 14:45:52,673 DEBUG org.apache.flink.runtime.blob. 2017-08-30 15:15:24,635 DEBUG org.apache.flink.runtime.blob. 2017-08-30 15:15:24,636 DEBUG org.apache.flink.runtime.blob. 2017-08-30 15:15:24,638 INFO org.apache.flink.runtime.blob. 2017-08-30 15:15:24,638 DEBUG org.apache.flink.runtime.blob. 2017-08-30 15:21:04,189 DEBUG org.apache.flink.runtime.blob. 2017-08-30 15:21:04,189 DEBUG org.apache.flink.runtime.blob. 2017-08-30 15:21:04,191 INFO org.apache.flink.runtime.blob. 2017-08-30 15:21:04,191 DEBUG org.apache.flink.runtime.blob. 2017-08-30 15:22:10,141 DEBUG org.apache.flink.runtime.blob. 2017-08-30 15:22:10,142 DEBUG org.apache.flink.runtime.blob. 2017-08-30 15:22:10,144 INFO org.apache.flink.runtime.blob. 2017-08-30 15:22:10,144 DEBUG org.apache.flink.runtime.blob. 2017-08-30 15:30:10,133 DEBUG org.apache.flink.runtime.blob. 2017-08-30 15:30:10,133 DEBUG org.apache.flink.runtime.blob. 2017-08-30 15:30:10,135 INFO org.apache.flink.runtime.blob. 2017-08-30 15:30:10,136 DEBUG org.apache.flink.runtime.blob. 2017-08-30 16:01:58,159 DEBUG org.apache.flink.runtime.blob. 2017-08-30 16:01:58,160 DEBUG org.apache.flink.runtime.blob. 2017-08-30 16:01:58,162 INFO org.apache.flink.runtime.blob. 2017-08-30 16:01:58,162 DEBUG org.apache.flink.runtime.blob. 3) There actually was CPU load, but I thought Flink was stuck in a loop or something, because as you can see from the timestamp on the logs these lines went on for a couple hours with no sign of actual writes on HBase (before crashing for unrelated reasons, Kinesis stream was deleted). Since I was writing on HBase, which I was accessing via Zookeeper, I was expecting to see the Logs I put inside the writeRecord method of the OutputFormat. As for the jstack trace, I'm currently unable to provide one (because I can't access the kinesis stream right now), but I'll try to emulate the stream and provide that stack as soon as I can. 2017-08-31 10:29 GMT+02:00 Nico Kruber <[hidden email]>: Hi Federico, -- Federico D'Ambrosio
|
to sum up: the lines you were seeing seem to be the down- and upload of the
TaskManager logs from the web interface which go through the BlobServer and its components. Nico On Thursday, 31 August 2017 11:51:27 CEST Federico D'Ambrosio wrote: > Hi, > > 1) I'm using Flink 1.3.2 > > 2) Th JobManager log is pretty much the same concerning those lines: > > 2017-08-30 14:16:53,343 INFO > org.apache.zookeeper.ClientCnxn - Opening > socket connection to server master-1.localdomain/10.0.0.55:2181 > 2017-08-30 14:16:53,344 INFO > org.apache.zookeeper.ClientCnxn - Socket > connection established to master-1.localdomain/10.0.0.55:2181, initiating > session > 2017-08-30 14:16:53,348 INFO > org.apache.zookeeper.ClientCnxn - Session > establishment complete on server master-1.localdomain/10.0.0.55:2181, > sessionid = 0x15e326a8fe6000e, negotiated timeout = 40000 > 2017-08-30 14:16:53,467 INFO > org.apache.flink.streaming.connectors.kinesis.internals.KinesisDataFetcher > - Subtask 0 has no active shards to read on startup; marking the subtask as > temporarily idle ... > 2017-08-30 14:16:53,469 INFO > org.apache.flink.streaming.connectors.kinesis.FlinkKinesisConsumer - > Subtask 1 will be seeded with initial shard > StreamShardHandle{streamName='fdt', shard='{ShardId: > shardId-000000000000,HashKeyRange: {StartingHashKey: 0,EndingHashKey: > 340282366920938463463374607431768211455},SequenceNumberRange: > {StartingSequenceNumber: > 49576318619583361934524589342704326364484033599642796034,}}'}, starting > state set as sequence number LATEST_SEQUENCE_NUM > 2017-08-30 14:16:53,470 INFO > org.apache.flink.streaming.connectors.kinesis.internals.KinesisDataFetcher > - Subtask 1 will start consuming seeded shard > StreamShardHandle{streamName='fdt', shard='{ShardId: > shardId-000000000000,HashKeyRange: {StartingHashKey: 0,EndingHashKey: > 340282366920938463463374607431768211455},SequenceNumberRange: > {StartingSequenceNumber: > 49576318619583361934524589342704326364484033599642796034,}}'} from sequence > number LATEST_SEQUENCE_NUM with ShardConsumer 0 > 2017-08-30 14:16:53,608 INFO > lab.vardata.HBaseBatchFormat - Task 0: > Opening connection to currentDay to execute 1 tasks on Single Put job > > 2017-08-30 14:17:21,318 INFO org.apache.flink.runtime.blob. > BlobCache - Created BLOB cache storage directory > /tmp/blobStore-8a2a96af-b836-4c95-b79a-a4b80929126f > 2017-08-30 14:17:21,321 DEBUG org.apache.flink.runtime.blob. > BlobClient - PUT content addressable BLOB stream to / > 127.0.0.1:59937 > 2017-08-30 14:17:21,323 DEBUG org.apache.flink.runtime.blob. > BlobServerConnection - Received PUT request for content > addressable BLOB > 2017-08-30 14:17:21,324 INFO org.apache.flink.runtime.blob. > BlobCache - Downloading > 3ff486dff4c4eaafdab42b30a877326e62bfca82 > from localhost/127.0.0.1:43268 > 2017-08-30 14:17:21,324 DEBUG org.apache.flink.runtime.blob. > BlobClient - GET content addressable BLOB > 3ff486dff4c4eaafdab42b30a877326e62bfca82 from /127.0.0.1:59938 > 2017-08-30 14:18:13,708 DEBUG org.apache.flink.runtime.blob. > BlobClient - PUT content addressable BLOB stream to / > 127.0.0.1:59976 > 2017-08-30 14:18:13,708 DEBUG org.apache.flink.runtime.blob. > BlobServerConnection - Received PUT request for content > addressable BLOB > 2017-08-30 14:18:13,710 INFO org.apache.flink.runtime.blob. > BlobCache - Downloading > 2f5283326aab77faa047b705cd1d6470035b3b7d > from localhost/127.0.0.1:43268 > 2017-08-30 14:18:13,710 DEBUG org.apache.flink.runtime.blob. > BlobClient - GET content addressable BLOB > 2f5283326aab77faa047b705cd1d6470035b3b7d from /127.0.0.1:59978 > 2017-08-30 14:19:29,811 DEBUG org.apache.flink.runtime.blob. > BlobClient - PUT content addressable BLOB stream to / > 127.0.0.1:60022 > 2017-08-30 14:19:29,812 DEBUG org.apache.flink.runtime.blob. > BlobServerConnection - Received PUT request for content > addressable BLOB > 2017-08-30 14:19:29,814 INFO org.apache.flink.runtime.blob. > BlobCache - Downloading > f91fd7ecec6f90809f52ee189cb48aa1e30b04f6 > from localhost/127.0.0.1:43268 > 2017-08-30 14:19:29,814 DEBUG org.apache.flink.runtime.blob. > BlobClient - GET content addressable BLOB > f91fd7ecec6f90809f52ee189cb48aa1e30b04f6 from /127.0.0.1:60024 > 2017-08-30 14:21:42,856 DEBUG org.apache.flink.runtime.blob. > BlobClient - PUT content addressable BLOB stream to / > 127.0.0.1:60110 > 2017-08-30 14:21:42,856 DEBUG org.apache.flink.runtime.blob. > BlobServerConnection - Received PUT request for content > addressable BLOB > 2017-08-30 14:21:42,858 INFO org.apache.flink.runtime.blob. > BlobCache - Downloading > 8638bdf78b0e540786de6c291f710a8db447a2b4 > from localhost/127.0.0.1:43268 > 2017-08-30 14:21:42,859 DEBUG org.apache.flink.runtime.blob. > BlobClient - GET content addressable BLOB > 8638bdf78b0e540786de6c291f710a8db447a2b4 from /127.0.0.1:60112 > 2017-08-30 14:26:11,242 DEBUG org.apache.flink.runtime.blob. > BlobClient - PUT content addressable BLOB stream to / > 127.0.0.1:60295 > 2017-08-30 14:26:11,243 DEBUG org.apache.flink.runtime.blob. > BlobServerConnection - Received PUT request for content > addressable BLOB > 2017-08-30 14:26:11,247 INFO org.apache.flink.runtime.blob. > BlobCache - Downloading > 6d30c88539d511bb9acc13b53bb2a128614f5621 > from localhost/127.0.0.1:43268 > 2017-08-30 14:26:11,247 DEBUG org.apache.flink.runtime.blob. > BlobClient - GET content addressable BLOB > 6d30c88539d511bb9acc13b53bb2a128614f5621 from /127.0.0.1:60297 > 2017-08-30 14:29:20,942 DEBUG org.apache.flink.runtime.blob. > BlobClient - PUT content addressable BLOB stream to / > 127.0.0.1:60410 > 2017-08-30 14:29:20,943 DEBUG org.apache.flink.runtime.blob. > BlobServerConnection - Received PUT request for content > addressable BLOB > 2017-08-30 14:29:20,945 INFO org.apache.flink.runtime.blob. > BlobCache - Downloading > 345ae4be7a2a8077ecefd6a155d7e975b38efd48 > from localhost/127.0.0.1:43268 > 2017-08-30 14:29:20,946 DEBUG org.apache.flink.runtime.blob. > BlobClient - GET content addressable BLOB > 345ae4be7a2a8077ecefd6a155d7e975b38efd48 from /127.0.0.1:60412 > 2017-08-30 14:32:10,004 DEBUG org.apache.flink.runtime.blob. > BlobClient - PUT content addressable BLOB stream to / > 127.0.0.1:60525 > 2017-08-30 14:32:10,004 DEBUG org.apache.flink.runtime.blob. > BlobServerConnection - Received PUT request for content > addressable BLOB > 2017-08-30 14:32:10,006 INFO org.apache.flink.runtime.blob. > BlobCache - Downloading > 709e2f64a484ff39634fe53d50dc760ceefd1f8e > from localhost/127.0.0.1:43268 > 2017-08-30 14:32:10,006 DEBUG org.apache.flink.runtime.blob. > BlobClient - GET content addressable BLOB > 709e2f64a484ff39634fe53d50dc760ceefd1f8e from /127.0.0.1:60527 > 2017-08-30 14:37:51,282 DEBUG org.apache.flink.runtime.blob. > BlobClient - PUT content addressable BLOB stream to / > 127.0.0.1:60734 > 2017-08-30 14:37:51,283 DEBUG org.apache.flink.runtime.blob. > BlobServerConnection - Received PUT request for content > addressable BLOB > 2017-08-30 14:37:51,285 INFO org.apache.flink.runtime.blob. > BlobCache - Downloading > 98fce553f71d4cc5a11c4345482e6cc3a37a91e3 > from localhost/127.0.0.1:43268 > 2017-08-30 14:37:51,286 DEBUG org.apache.flink.runtime.blob. > BlobClient - GET content addressable BLOB > 98fce553f71d4cc5a11c4345482e6cc3a37a91e3 from /127.0.0.1:60736 > 2017-08-30 14:37:54,222 DEBUG org.apache.flink.runtime.blob. > BlobClient - PUT content addressable BLOB stream to / > 127.0.0.1:60740 > 2017-08-30 14:37:54,223 DEBUG org.apache.flink.runtime.blob. > BlobServerConnection - Received PUT request for content > addressable BLOB > 2017-08-30 14:37:54,225 INFO org.apache.flink.runtime.blob. > BlobCache - Downloading > 96387068e0bf71c5ff2019dbb1a43b74fb53067a > from localhost/127.0.0.1:43268 > 2017-08-30 14:37:54,225 DEBUG org.apache.flink.runtime.blob. > BlobClient - GET content addressable BLOB > 96387068e0bf71c5ff2019dbb1a43b74fb53067a from /127.0.0.1:60742 > 2017-08-30 14:40:44,005 DEBUG org.apache.flink.runtime.blob. > BlobClient - PUT content addressable BLOB stream to / > 127.0.0.1:60852 > 2017-08-30 14:40:44,006 DEBUG org.apache.flink.runtime.blob. > BlobServerConnection - Received PUT request for content > addressable BLOB > 2017-08-30 14:40:44,007 INFO org.apache.flink.runtime.blob. > BlobCache - Downloading > 430066a8a3febb68ada10d78663b0972b92e66e5 > from localhost/127.0.0.1:43268 > 2017-08-30 14:40:44,008 DEBUG org.apache.flink.runtime.blob. > BlobClient - GET content addressable BLOB > 430066a8a3febb68ada10d78663b0972b92e66e5 from /127.0.0.1:60854 > 2017-08-30 14:45:52,671 DEBUG org.apache.flink.runtime.blob. > BlobClient - PUT content addressable BLOB stream to / > 127.0.0.1:32820 > 2017-08-30 14:45:52,671 DEBUG org.apache.flink.runtime.blob. > BlobServerConnection - Received PUT request for content > addressable BLOB > 2017-08-30 14:45:52,672 INFO org.apache.flink.runtime.blob. > BlobCache - Downloading > ff0c1f371621c2f4ffb7ed06f448841a8a2458ac > from localhost/127.0.0.1:43268 > 2017-08-30 14:45:52,673 DEBUG org.apache.flink.runtime.blob. > BlobClient - GET content addressable BLOB > ff0c1f371621c2f4ffb7ed06f448841a8a2458ac from /127.0.0.1:32822 > 2017-08-30 15:15:24,635 DEBUG org.apache.flink.runtime.blob. > BlobClient - PUT content addressable BLOB stream to / > 127.0.0.1:33955 > 2017-08-30 15:15:24,636 DEBUG org.apache.flink.runtime.blob. > BlobServerConnection - Received PUT request for content > addressable BLOB > 2017-08-30 15:15:24,638 INFO org.apache.flink.runtime.blob. > BlobCache - Downloading > 291f5aeb9306cb94098255237e01ab6735cf42ea > from localhost/127.0.0.1:43268 > 2017-08-30 15:15:24,638 DEBUG org.apache.flink.runtime.blob. > BlobClient - GET content addressable BLOB > 291f5aeb9306cb94098255237e01ab6735cf42ea from /127.0.0.1:33957 > 2017-08-30 15:21:04,189 DEBUG org.apache.flink.runtime.blob. > BlobClient - PUT content addressable BLOB stream to / > 127.0.0.1:34170 > 2017-08-30 15:21:04,189 DEBUG org.apache.flink.runtime.blob. > BlobServerConnection - Received PUT request for content > addressable BLOB > 2017-08-30 15:21:04,191 INFO org.apache.flink.runtime.blob. > BlobCache - Downloading > e46b51d3bd0476b6a8a656469e7546cd933fa478 > from localhost/127.0.0.1:43268 > 2017-08-30 15:21:04,191 DEBUG org.apache.flink.runtime.blob. > BlobClient - GET content addressable BLOB > e46b51d3bd0476b6a8a656469e7546cd933fa478 from /127.0.0.1:34172 > 2017-08-30 15:22:10,141 DEBUG org.apache.flink.runtime.blob. > BlobClient - PUT content addressable BLOB stream to / > 127.0.0.1:34223 > 2017-08-30 15:22:10,142 DEBUG org.apache.flink.runtime.blob. > BlobServerConnection - Received PUT request for content > addressable BLOB > 2017-08-30 15:22:10,144 INFO org.apache.flink.runtime.blob. > BlobCache - Downloading > cf9700032c229b39e634eeda73284e116314f7bb > from localhost/127.0.0.1:43268 > 2017-08-30 15:22:10,144 DEBUG org.apache.flink.runtime.blob. > BlobClient - GET content addressable BLOB > cf9700032c229b39e634eeda73284e116314f7bb from /127.0.0.1:34225 > 2017-08-30 15:30:10,133 DEBUG org.apache.flink.runtime.blob. > BlobClient - PUT content addressable BLOB stream to / > 127.0.0.1:34544 > 2017-08-30 15:30:10,133 DEBUG org.apache.flink.runtime.blob. > BlobServerConnection - Received PUT request for content > addressable BLOB > 2017-08-30 15:30:10,135 INFO org.apache.flink.runtime.blob. > BlobCache - Downloading > 3b1d4b4d4820c2b74fd4a45d5741f2df940f5189 > from localhost/127.0.0.1:43268 > 2017-08-30 15:30:10,136 DEBUG org.apache.flink.runtime.blob. > BlobClient - GET content addressable BLOB > 3b1d4b4d4820c2b74fd4a45d5741f2df940f5189 from /127.0.0.1:34546 > 2017-08-30 16:01:58,159 DEBUG org.apache.flink.runtime.blob. > BlobClient - PUT content addressable BLOB stream to / > 127.0.0.1:35741 > 2017-08-30 16:01:58,160 DEBUG org.apache.flink.runtime.blob. > BlobServerConnection - Received PUT request for content > addressable BLOB > 2017-08-30 16:01:58,162 INFO org.apache.flink.runtime.blob. > BlobCache - Downloading > 6bde2f7a709181065c6710c2252a5846f361ad68 > from localhost/127.0.0.1:43268 > 2017-08-30 16:01:58,162 DEBUG org.apache.flink.runtime.blob. > BlobClient - GET content addressable BLOB > 6bde2f7a709181065c6710c2252a5846f361ad68 from /127.0.0.1:35743 > > 3) There actually was CPU load, but I thought Flink was stuck in a loop or > something, because as you can see from the timestamp on the logs these > lines went on for a couple hours with no sign of actual writes on HBase > (before crashing for unrelated reasons, Kinesis stream was deleted). Since > I was writing on HBase, which I was accessing via Zookeeper, I was > expecting to see the Logs I put inside the writeRecord method of the > OutputFormat. > As for the jstack trace, I'm currently unable to provide one (because I > can't access the kinesis stream right now), but I'll try to emulate the > stream and provide that stack as soon as I can. > > 4) Yes, those were log lines from the TM log in the web dashboard > > Thank you very much for your help > > 2017-08-31 10:29 GMT+02:00 Nico Kruber <[hidden email]>: > > Hi Federico, > > 1) Which version of Flink are you using? > > 2) Can you also share the JobManager log? > > 3) Why do you think, Flink is stuck at the BlobCache? Is it really > > blocked, or > > do you still have CPU load? Can you post stack traces of the TaskManager > > (TM) > > and JobManager processes when you think they are stuck (using jstack)? > > 4) These PUT requests in the TM logs are strange, unless you showed the TM > > logs in the web interface - did you? > > > > > > Nico > > > > On Thursday, 31 August 2017 09:45:59 CEST Fabian Hueske wrote: > > > Hi Federico, > > > > > > Not sure what's going on there but Nico (in CC) is more familiar with > > > the > > > blob cache and might be able to help. > > > > > > Best, Fabian > > > > > > 2017-08-30 15:35 GMT+02:00 Federico D'Ambrosio <[hidden email]>: > > > > Hi, > > > > > > > > I have a rather simple Flink job which has a KinesisConsumer as a > > > > source > > > > > > and an HBase table as sink, in which I write using writeOutputFormat. > > > > I'm > > > > > > running it on a local machine with a single taskmanager (2 slots, 2G). > > > > The > > > > > > KinesisConsumer works fine and the connection to the HBase table gets > > > > opened fine (i.e. the open method of the class implementing > > > > OutputFormat > > > > > > gets actually called). > > > > > > > > I'm running the job at a parallelism of 2, while the sink has a > > > > parallelism of 1. The > > > > > > > > Still, looking at the log I see that after opening the connection, the > > > > job > > > > > > gets stuck at lines like this one: > > > > > > > > INFO org.apache.flink.runtime.blob.BlobCache - > > > > Downloading 8638bdf78b0e540786de6c291f710a8db447a2b4 from > > > > localhost/127.0.0.1:43268 > > > > > > > > Each following one another, like this: > > > > > > > > 2017-08-30 14:17:21,318 INFO org.apache.flink.runtime.blob.BlobCache > > > > > > > > - Created BLOB cache storage directory > > > > > > > > /tmp/blobStore-8a2a96af-b836-4c95-b79a-a4b80929126f 2017-08-30 > > > > 14:17:21,321 DEBUG org.apache.flink.runtime.blob.BlobClient > > > > > > > > - PUT content addressable BLOB stream to /127.0.0.1:59937 > > > > > > > > 2017-08-30 14:17:21,323 DEBUG > > > > org.apache.flink.runtime.blob.BlobServerConnection - > > > > Received > > > > > > PUT request for content addressable BLOB 2017-08-30 14:17:21,324 INFO > > > > org.apache.flink.runtime.blob.BlobCache - > > > > Downloading 3ff486dff4c4eaafdab42b30a877326e62bfca82 from > > > > localhost/127.0.0.1:43268 2017-08-30 14:17:21,324 DEBUG > > > > org.apache.flink.runtime.blob.BlobClient - GET > > > > content addressable BLOB 3ff486dff4c4eaafdab42b30a877326e62bfca82 from > > > > /127.0.0.1:59938 2017-08-30 14:18:13,708 DEBUG > > > > org.apache.flink.runtime.blob.BlobClient - PUT > > > > content addressable BLOB stream to /127.0.0.1:59976 2017-08-30 > > > > 14:18:13,708 DEBUG org.apache.flink.runtime.blob.BlobServerConnection > > > > > > > > - Received PUT request for content addressable BLOB 2017-08-30 > > > > > > > > 14:18:13,710 INFO org.apache.flink.runtime.blob.BlobCache > > > > > > > > - Downloading 2f5283326aab77faa047b705cd1d6470035b3b7d from > > > > > > > > localhost/127.0.0.1:43268 2017-08-30 14:18:13,710 DEBUG > > > > org.apache.flink.runtime.blob.BlobClient - GET > > > > content addressable BLOB 2f5283326aab77faa047b705cd1d6470035b3b7d from > > > > /127.0.0.1:59978 2017-08-30 14:19:29,811 DEBUG > > > > org.apache.flink.runtime.blob.BlobClient - PUT > > > > content addressable BLOB stream to /127.0.0.1:60022 2017-08-30 > > > > 14:19:29,812 DEBUG org.apache.flink.runtime.blob.BlobServerConnection > > > > > > > > - Received PUT request for content addressable BLOB 2017-08-30 > > > > > > > > 14:19:29,814 INFO org.apache.flink.runtime.blob.BlobCache > > > > > > > > - Downloading f91fd7ecec6f90809f52ee189cb48aa1e30b04f6 from > > > > > > > > localhost/127.0.0.1:43268 2017-08-30 14:19:29,814 DEBUG > > > > org.apache.flink.runtime.blob.BlobClient - GET > > > > content addressable BLOB f91fd7ecec6f90809f52ee189cb48aa1e30b04f6 from > > > > /127.0.0.1:60024 2017-08-30 14:21:42,856 DEBUG > > > > org.apache.flink.runtime.blob.BlobClient - PUT > > > > content addressable BLOB stream to /127.0.0.1:60110 2017-08-30 > > > > 14:21:42,856 DEBUG org.apache.flink.runtime.blob.BlobServerConnection > > > > > > > > - Received PUT request for content addressable BLOB 2017-08-30 > > > > > > > > 14:21:42,858 INFO org.apache.flink.runtime.blob.BlobCache > > > > > > > > - Downloading 8638bdf78b0e540786de6c291f710a8db447a2b4 from > > > > > > > > localhost/127.0.0.1:43268 2017-08-30 14:21:42,859 DEBUG > > > > org.apache.flink.runtime.blob.BlobClient - GET > > > > content addressable BLOB 8638bdf78b0e540786de6c291f710a8db447a2b4 from > > > > /127.0.0.1:60112 2017-08-30 14:26:11,242 DEBUG > > > > org.apache.flink.runtime.blob.BlobClient - PUT > > > > content addressable BLOB stream to /127.0.0.1:60295 2017-08-30 > > > > 14:26:11,243 DEBUG org.apache.flink.runtime.blob.BlobServerConnection > > > > > > > > - Received PUT request for content addressable BLOB 2017-08-30 > > > > > > > > 14:26:11,247 INFO org.apache.flink.runtime.blob.BlobCache > > > > > > > > - Downloading 6d30c88539d511bb9acc13b53bb2a128614f5621 from > > > > > > > > localhost/127.0.0.1:43268 2017-08-30 14:26:11,247 DEBUG > > > > org.apache.flink.runtime.blob.BlobClient - GET > > > > content addressable BLOB 6d30c88539d511bb9acc13b53bb2a128614f5621 from > > > > /127.0.0.1:60297 2017-08-30 14:29:20,942 DEBUG > > > > org.apache.flink.runtime.blob.BlobClient - PUT > > > > content addressable BLOB stream to /127.0.0.1:60410 > > > > > > > > > > > > My questions are: what is the jobmanager doing here? Why is he taking > > > > ages > > > > > > to do this? How do i speed up this behaviour? > > > > > > > > Thank you very much for your attention, > > > > > > > > Federico D'Ambrosio signature.asc (201 bytes) Download Attachment |
Ok, thank you very much! Thank you againSo that was nothing actually related to what I was trying to do. I guess I'll have to investigate further on the effective correctness of the implementation of the OutputFormat then, because the total lack of other log lines was the most strucking thing about this whole issue. 2017-08-31 12:24 GMT+02:00 Nico Kruber <[hidden email]>: to sum up: the lines you were seeing seem to be the down- and upload of the -- Federico D'Ambrosio
|
Free forum by Nabble | Edit this page |