BlobCache and its functioning

classic Classic list List threaded Threaded
6 messages Options
Reply | Threaded
Open this post in threaded view
|

BlobCache and its functioning

Federico D'Ambrosio-2
Hi,

I have a rather simple Flink job which has a KinesisConsumer as a source and an HBase table as sink, in which I write using writeOutputFormat. I'm running it on a local machine with a single taskmanager (2 slots, 2G). The KinesisConsumer works fine and the connection to the HBase table gets opened fine (i.e. the open method of the class implementing OutputFormat gets actually called).

I'm running the job at a parallelism of 2, while the sink has a parallelism of 1. The

Still, looking at the log I see that after opening the connection, the job gets stuck at lines like this one:

INFO  org.apache.flink.runtime.blob.BlobCache                       - Downloading 8638bdf78b0e540786de6c291f710a8db447a2b4 from localhost/127.0.0.1:43268
Each following one another, like this:

2017-08-30 14:17:21,318 INFO  org.apache.flink.runtime.blob.BlobCache                       - Created BLOB cache storage directory /tmp/blobStore-8a2a96af-b836-4c95-b79a-a4b80929126f
2017-08-30 14:17:21,321 DEBUG org.apache.flink.runtime.blob.BlobClient                      - PUT content addressable BLOB stream to /127.0.0.1:59937
2017-08-30 14:17:21,323 DEBUG org.apache.flink.runtime.blob.BlobServerConnection            - Received PUT request for content addressable BLOB
2017-08-30 14:17:21,324 INFO  org.apache.flink.runtime.blob.BlobCache                       - Downloading 3ff486dff4c4eaafdab42b30a877326e62bfca82 from localhost/127.0.0.1:43268
2017-08-30 14:17:21,324 DEBUG org.apache.flink.runtime.blob.BlobClient                      - GET content addressable BLOB 3ff486dff4c4eaafdab42b30a877326e62bfca82 from /127.0.0.1:59938
2017-08-30 14:18:13,708 DEBUG org.apache.flink.runtime.blob.BlobClient                      - PUT content addressable BLOB stream to /127.0.0.1:59976
2017-08-30 14:18:13,708 DEBUG org.apache.flink.runtime.blob.BlobServerConnection            - Received PUT request for content addressable BLOB
2017-08-30 14:18:13,710 INFO  org.apache.flink.runtime.blob.BlobCache                       - Downloading 2f5283326aab77faa047b705cd1d6470035b3b7d from localhost/127.0.0.1:43268
2017-08-30 14:18:13,710 DEBUG org.apache.flink.runtime.blob.BlobClient                      - GET content addressable BLOB 2f5283326aab77faa047b705cd1d6470035b3b7d from /127.0.0.1:59978
2017-08-30 14:19:29,811 DEBUG org.apache.flink.runtime.blob.BlobClient                      - PUT content addressable BLOB stream to /127.0.0.1:60022
2017-08-30 14:19:29,812 DEBUG org.apache.flink.runtime.blob.BlobServerConnection            - Received PUT request for content addressable BLOB
2017-08-30 14:19:29,814 INFO  org.apache.flink.runtime.blob.BlobCache                       - Downloading f91fd7ecec6f90809f52ee189cb48aa1e30b04f6 from localhost/127.0.0.1:43268
2017-08-30 14:19:29,814 DEBUG org.apache.flink.runtime.blob.BlobClient                      - GET content addressable BLOB f91fd7ecec6f90809f52ee189cb48aa1e30b04f6 from /127.0.0.1:60024
2017-08-30 14:21:42,856 DEBUG org.apache.flink.runtime.blob.BlobClient                      - PUT content addressable BLOB stream to /127.0.0.1:60110
2017-08-30 14:21:42,856 DEBUG org.apache.flink.runtime.blob.BlobServerConnection            - Received PUT request for content addressable BLOB
2017-08-30 14:21:42,858 INFO  org.apache.flink.runtime.blob.BlobCache                       - Downloading 8638bdf78b0e540786de6c291f710a8db447a2b4 from localhost/127.0.0.1:43268
2017-08-30 14:21:42,859 DEBUG org.apache.flink.runtime.blob.BlobClient                      - GET content addressable BLOB 8638bdf78b0e540786de6c291f710a8db447a2b4 from /127.0.0.1:60112
2017-08-30 14:26:11,242 DEBUG org.apache.flink.runtime.blob.BlobClient                      - PUT content addressable BLOB stream to /127.0.0.1:60295
2017-08-30 14:26:11,243 DEBUG org.apache.flink.runtime.blob.BlobServerConnection            - Received PUT request for content addressable BLOB
2017-08-30 14:26:11,247 INFO  org.apache.flink.runtime.blob.BlobCache                       - Downloading 6d30c88539d511bb9acc13b53bb2a128614f5621 from localhost/127.0.0.1:43268
2017-08-30 14:26:11,247 DEBUG org.apache.flink.runtime.blob.BlobClient                      - GET content addressable BLOB 6d30c88539d511bb9acc13b53bb2a128614f5621 from /127.0.0.1:60297
2017-08-30 14:29:20,942 DEBUG org.apache.flink.runtime.blob.BlobClient                      - PUT content addressable BLOB stream to /127.0.0.1:60410

My questions are: what is the jobmanager doing here? Why is he taking ages to do this? How do i speed up this behaviour?

Thank you very much for your attention,

Federico D'Ambrosio
Reply | Threaded
Open this post in threaded view
|

Re: BlobCache and its functioning

Fabian Hueske-2
Hi Federico,

Not sure what's going on there but Nico (in CC) is more familiar with the blob cache and might be able to help.

Best, Fabian

2017-08-30 15:35 GMT+02:00 Federico D'Ambrosio <[hidden email]>:
Hi,

I have a rather simple Flink job which has a KinesisConsumer as a source and an HBase table as sink, in which I write using writeOutputFormat. I'm running it on a local machine with a single taskmanager (2 slots, 2G). The KinesisConsumer works fine and the connection to the HBase table gets opened fine (i.e. the open method of the class implementing OutputFormat gets actually called).

I'm running the job at a parallelism of 2, while the sink has a parallelism of 1. The

Still, looking at the log I see that after opening the connection, the job gets stuck at lines like this one:

INFO  org.apache.flink.runtime.blob.BlobCache                       - Downloading 8638bdf78b0e540786de6c291f710a8db447a2b4 from localhost/127.0.0.1:43268
Each following one another, like this:

2017-08-30 14:17:21,318 INFO  org.apache.flink.runtime.blob.BlobCache                       - Created BLOB cache storage directory /tmp/blobStore-8a2a96af-b836-4c95-b79a-a4b80929126f
2017-08-30 14:17:21,321 DEBUG org.apache.flink.runtime.blob.BlobClient                      - PUT content addressable BLOB stream to /127.0.0.1:59937
2017-08-30 14:17:21,323 DEBUG org.apache.flink.runtime.blob.BlobServerConnection            - Received PUT request for content addressable BLOB
2017-08-30 14:17:21,324 INFO  org.apache.flink.runtime.blob.BlobCache                       - Downloading 3ff486dff4c4eaafdab42b30a877326e62bfca82 from localhost/127.0.0.1:43268
2017-08-30 14:17:21,324 DEBUG org.apache.flink.runtime.blob.BlobClient                      - GET content addressable BLOB 3ff486dff4c4eaafdab42b30a877326e62bfca82 from /127.0.0.1:59938
2017-08-30 14:18:13,708 DEBUG org.apache.flink.runtime.blob.BlobClient                      - PUT content addressable BLOB stream to /127.0.0.1:59976
2017-08-30 14:18:13,708 DEBUG org.apache.flink.runtime.blob.BlobServerConnection            - Received PUT request for content addressable BLOB
2017-08-30 14:18:13,710 INFO  org.apache.flink.runtime.blob.BlobCache                       - Downloading 2f5283326aab77faa047b705cd1d6470035b3b7d from localhost/127.0.0.1:43268
2017-08-30 14:18:13,710 DEBUG org.apache.flink.runtime.blob.BlobClient                      - GET content addressable BLOB 2f5283326aab77faa047b705cd1d6470035b3b7d from /127.0.0.1:59978
2017-08-30 14:19:29,811 DEBUG org.apache.flink.runtime.blob.BlobClient                      - PUT content addressable BLOB stream to /127.0.0.1:60022
2017-08-30 14:19:29,812 DEBUG org.apache.flink.runtime.blob.BlobServerConnection            - Received PUT request for content addressable BLOB
2017-08-30 14:19:29,814 INFO  org.apache.flink.runtime.blob.BlobCache                       - Downloading f91fd7ecec6f90809f52ee189cb48aa1e30b04f6 from localhost/127.0.0.1:43268
2017-08-30 14:19:29,814 DEBUG org.apache.flink.runtime.blob.BlobClient                      - GET content addressable BLOB f91fd7ecec6f90809f52ee189cb48aa1e30b04f6 from /127.0.0.1:60024
2017-08-30 14:21:42,856 DEBUG org.apache.flink.runtime.blob.BlobClient                      - PUT content addressable BLOB stream to /127.0.0.1:60110
2017-08-30 14:21:42,856 DEBUG org.apache.flink.runtime.blob.BlobServerConnection            - Received PUT request for content addressable BLOB
2017-08-30 14:21:42,858 INFO  org.apache.flink.runtime.blob.BlobCache                       - Downloading 8638bdf78b0e540786de6c291f710a8db447a2b4 from localhost/127.0.0.1:43268
2017-08-30 14:21:42,859 DEBUG org.apache.flink.runtime.blob.BlobClient                      - GET content addressable BLOB 8638bdf78b0e540786de6c291f710a8db447a2b4 from /127.0.0.1:60112
2017-08-30 14:26:11,242 DEBUG org.apache.flink.runtime.blob.BlobClient                      - PUT content addressable BLOB stream to /127.0.0.1:60295
2017-08-30 14:26:11,243 DEBUG org.apache.flink.runtime.blob.BlobServerConnection            - Received PUT request for content addressable BLOB
2017-08-30 14:26:11,247 INFO  org.apache.flink.runtime.blob.BlobCache                       - Downloading 6d30c88539d511bb9acc13b53bb2a128614f5621 from localhost/127.0.0.1:43268
2017-08-30 14:26:11,247 DEBUG org.apache.flink.runtime.blob.BlobClient                      - GET content addressable BLOB 6d30c88539d511bb9acc13b53bb2a128614f5621 from /127.0.0.1:60297
2017-08-30 14:29:20,942 DEBUG org.apache.flink.runtime.blob.BlobClient                      - PUT content addressable BLOB stream to /127.0.0.1:60410

My questions are: what is the jobmanager doing here? Why is he taking ages to do this? How do i speed up this behaviour?

Thank you very much for your attention,

Federico D'Ambrosio

Reply | Threaded
Open this post in threaded view
|

Re: BlobCache and its functioning

Nico Kruber
Hi Federico,
1) Which version of Flink are you using?
2) Can you also share the JobManager log?
3) Why do you think, Flink is stuck at the BlobCache? Is it really blocked, or
do you still have CPU load? Can you post stack traces of the TaskManager (TM)
and JobManager processes when you think they are stuck (using jstack)?
4) These PUT requests in the TM logs are strange, unless you showed the TM
logs in the web interface - did you?


Nico


On Thursday, 31 August 2017 09:45:59 CEST Fabian Hueske wrote:

> Hi Federico,
>
> Not sure what's going on there but Nico (in CC) is more familiar with the
> blob cache and might be able to help.
>
> Best, Fabian
>
> 2017-08-30 15:35 GMT+02:00 Federico D'Ambrosio <[hidden email]>:
> > Hi,
> >
> > I have a rather simple Flink job which has a KinesisConsumer as a source
> > and an HBase table as sink, in which I write using writeOutputFormat. I'm
> > running it on a local machine with a single taskmanager (2 slots, 2G). The
> > KinesisConsumer works fine and the connection to the HBase table gets
> > opened fine (i.e. the open method of the class implementing OutputFormat
> > gets actually called).
> >
> > I'm running the job at a parallelism of 2, while the sink has a
> > parallelism of 1. The
> >
> > Still, looking at the log I see that after opening the connection, the job
> > gets stuck at lines like this one:
> >
> > INFO  org.apache.flink.runtime.blob.BlobCache                       -
> > Downloading 8638bdf78b0e540786de6c291f710a8db447a2b4 from
> > localhost/127.0.0.1:43268
> >
> > Each following one another, like this:
> >
> > 2017-08-30 14:17:21,318 INFO  org.apache.flink.runtime.blob.BlobCache    
> >                  - Created BLOB cache storage directory
> > /tmp/blobStore-8a2a96af-b836-4c95-b79a-a4b80929126f 2017-08-30
> > 14:17:21,321 DEBUG org.apache.flink.runtime.blob.BlobClient              
> >        - PUT content addressable BLOB stream to /127.0.0.1:59937
> > 2017-08-30 14:17:21,323 DEBUG
> > org.apache.flink.runtime.blob.BlobServerConnection            - Received
> > PUT request for content addressable BLOB 2017-08-30 14:17:21,324 INFO
> > org.apache.flink.runtime.blob.BlobCache                       -
> > Downloading 3ff486dff4c4eaafdab42b30a877326e62bfca82 from
> > localhost/127.0.0.1:43268 2017-08-30 14:17:21,324 DEBUG
> > org.apache.flink.runtime.blob.BlobClient                      - GET
> > content addressable BLOB 3ff486dff4c4eaafdab42b30a877326e62bfca82 from
> > /127.0.0.1:59938 2017-08-30 14:18:13,708 DEBUG
> > org.apache.flink.runtime.blob.BlobClient                      - PUT
> > content addressable BLOB stream to /127.0.0.1:59976 2017-08-30
> > 14:18:13,708 DEBUG org.apache.flink.runtime.blob.BlobServerConnection    
> >        - Received PUT request for content addressable BLOB 2017-08-30
> > 14:18:13,710 INFO  org.apache.flink.runtime.blob.BlobCache              
> >        - Downloading 2f5283326aab77faa047b705cd1d6470035b3b7d from
> > localhost/127.0.0.1:43268 2017-08-30 14:18:13,710 DEBUG
> > org.apache.flink.runtime.blob.BlobClient                      - GET
> > content addressable BLOB 2f5283326aab77faa047b705cd1d6470035b3b7d from
> > /127.0.0.1:59978 2017-08-30 14:19:29,811 DEBUG
> > org.apache.flink.runtime.blob.BlobClient                      - PUT
> > content addressable BLOB stream to /127.0.0.1:60022 2017-08-30
> > 14:19:29,812 DEBUG org.apache.flink.runtime.blob.BlobServerConnection    
> >        - Received PUT request for content addressable BLOB 2017-08-30
> > 14:19:29,814 INFO  org.apache.flink.runtime.blob.BlobCache              
> >        - Downloading f91fd7ecec6f90809f52ee189cb48aa1e30b04f6 from
> > localhost/127.0.0.1:43268 2017-08-30 14:19:29,814 DEBUG
> > org.apache.flink.runtime.blob.BlobClient                      - GET
> > content addressable BLOB f91fd7ecec6f90809f52ee189cb48aa1e30b04f6 from
> > /127.0.0.1:60024 2017-08-30 14:21:42,856 DEBUG
> > org.apache.flink.runtime.blob.BlobClient                      - PUT
> > content addressable BLOB stream to /127.0.0.1:60110 2017-08-30
> > 14:21:42,856 DEBUG org.apache.flink.runtime.blob.BlobServerConnection    
> >        - Received PUT request for content addressable BLOB 2017-08-30
> > 14:21:42,858 INFO  org.apache.flink.runtime.blob.BlobCache              
> >        - Downloading 8638bdf78b0e540786de6c291f710a8db447a2b4 from
> > localhost/127.0.0.1:43268 2017-08-30 14:21:42,859 DEBUG
> > org.apache.flink.runtime.blob.BlobClient                      - GET
> > content addressable BLOB 8638bdf78b0e540786de6c291f710a8db447a2b4 from
> > /127.0.0.1:60112 2017-08-30 14:26:11,242 DEBUG
> > org.apache.flink.runtime.blob.BlobClient                      - PUT
> > content addressable BLOB stream to /127.0.0.1:60295 2017-08-30
> > 14:26:11,243 DEBUG org.apache.flink.runtime.blob.BlobServerConnection    
> >        - Received PUT request for content addressable BLOB 2017-08-30
> > 14:26:11,247 INFO  org.apache.flink.runtime.blob.BlobCache              
> >        - Downloading 6d30c88539d511bb9acc13b53bb2a128614f5621 from
> > localhost/127.0.0.1:43268 2017-08-30 14:26:11,247 DEBUG
> > org.apache.flink.runtime.blob.BlobClient                      - GET
> > content addressable BLOB 6d30c88539d511bb9acc13b53bb2a128614f5621 from
> > /127.0.0.1:60297 2017-08-30 14:29:20,942 DEBUG
> > org.apache.flink.runtime.blob.BlobClient                      - PUT
> > content addressable BLOB stream to /127.0.0.1:60410
> >
> >
> > My questions are: what is the jobmanager doing here? Why is he taking ages
> > to do this? How do i speed up this behaviour?
> >
> > Thank you very much for your attention,
> >
> > Federico D'Ambrosio


signature.asc (201 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: BlobCache and its functioning

Federico D'Ambrosio-2
Hi,

1) I'm using Flink 1.3.2

2) Th JobManager log is pretty much the same concerning those lines:

2017-08-30 14:16:53,343 INFO  org.apache.zookeeper.ClientCnxn                               - Opening socket connection to server master-1.localdomain/10.0.0.55:2181
2017-08-30 14:16:53,344 INFO  org.apache.zookeeper.ClientCnxn                               - Socket connection established to master-1.localdomain/10.0.0.55:2181, initiating session
2017-08-30 14:16:53,348 INFO  org.apache.zookeeper.ClientCnxn                               - Session establishment complete on server master-1.localdomain/10.0.0.55:2181, sessionid = 0x15e326a8fe6000e, negotiated timeout = 40000
2017-08-30 14:16:53,467 INFO  org.apache.flink.streaming.connectors.kinesis.internals.KinesisDataFetcher  - Subtask 0 has no active shards to read on startup; marking the subtask as temporarily idle ...
2017-08-30 14:16:53,469 INFO  org.apache.flink.streaming.connectors.kinesis.FlinkKinesisConsumer  - Subtask 1 will be seeded with initial shard StreamShardHandle{streamName='fdt', shard='{ShardId: shardId-000000000000,HashKeyRange: {StartingHashKey: 0,EndingHashKey: 340282366920938463463374607431768211455},SequenceNumberRange: {StartingSequenceNumber: 49576318619583361934524589342704326364484033599642796034,}}'}, starting state set as sequence number LATEST_SEQUENCE_NUM
2017-08-30 14:16:53,470 INFO  org.apache.flink.streaming.connectors.kinesis.internals.KinesisDataFetcher  - Subtask 1 will start consuming seeded shard StreamShardHandle{streamName='fdt', shard='{ShardId: shardId-000000000000,HashKeyRange: {StartingHashKey: 0,EndingHashKey: 340282366920938463463374607431768211455},SequenceNumberRange: {StartingSequenceNumber: 49576318619583361934524589342704326364484033599642796034,}}'} from sequence number LATEST_SEQUENCE_NUM with ShardConsumer 0
2017-08-30 14:16:53,608 INFO  lab.vardata.HBaseBatchFormat                                  - Task 0: Opening connection to currentDay to execute 1 tasks on Single Put job

2017-08-30 14:17:21,318 INFO  org.apache.flink.runtime.blob.BlobCache                       - Created BLOB cache storage directory /tmp/blobStore-8a2a96af-b836-4c95-b79a-a4b80929126f
2017-08-30 14:17:21,321 DEBUG org.apache.flink.runtime.blob.BlobClient                      - PUT content addressable BLOB stream to /127.0.0.1:59937
2017-08-30 14:17:21,323 DEBUG org.apache.flink.runtime.blob.BlobServerConnection            - Received PUT request for content addressable BLOB
2017-08-30 14:17:21,324 INFO  org.apache.flink.runtime.blob.BlobCache                       - Downloading 3ff486dff4c4eaafdab42b30a877326e62bfca82 from localhost/127.0.0.1:43268
2017-08-30 14:17:21,324 DEBUG org.apache.flink.runtime.blob.BlobClient                      - GET content addressable BLOB 3ff486dff4c4eaafdab42b30a877326e62bfca82 from /127.0.0.1:59938
2017-08-30 14:18:13,708 DEBUG org.apache.flink.runtime.blob.BlobClient                      - PUT content addressable BLOB stream to /127.0.0.1:59976
2017-08-30 14:18:13,708 DEBUG org.apache.flink.runtime.blob.BlobServerConnection            - Received PUT request for content addressable BLOB
2017-08-30 14:18:13,710 INFO  org.apache.flink.runtime.blob.BlobCache                       - Downloading 2f5283326aab77faa047b705cd1d6470035b3b7d from localhost/127.0.0.1:43268
2017-08-30 14:18:13,710 DEBUG org.apache.flink.runtime.blob.BlobClient                      - GET content addressable BLOB 2f5283326aab77faa047b705cd1d6470035b3b7d from /127.0.0.1:59978
2017-08-30 14:19:29,811 DEBUG org.apache.flink.runtime.blob.BlobClient                      - PUT content addressable BLOB stream to /127.0.0.1:60022
2017-08-30 14:19:29,812 DEBUG org.apache.flink.runtime.blob.BlobServerConnection            - Received PUT request for content addressable BLOB
2017-08-30 14:19:29,814 INFO  org.apache.flink.runtime.blob.BlobCache                       - Downloading f91fd7ecec6f90809f52ee189cb48aa1e30b04f6 from localhost/127.0.0.1:43268
2017-08-30 14:19:29,814 DEBUG org.apache.flink.runtime.blob.BlobClient                      - GET content addressable BLOB f91fd7ecec6f90809f52ee189cb48aa1e30b04f6 from /127.0.0.1:60024
2017-08-30 14:21:42,856 DEBUG org.apache.flink.runtime.blob.BlobClient                      - PUT content addressable BLOB stream to /127.0.0.1:60110
2017-08-30 14:21:42,856 DEBUG org.apache.flink.runtime.blob.BlobServerConnection            - Received PUT request for content addressable BLOB
2017-08-30 14:21:42,858 INFO  org.apache.flink.runtime.blob.BlobCache                       - Downloading 8638bdf78b0e540786de6c291f710a8db447a2b4 from localhost/127.0.0.1:43268
2017-08-30 14:21:42,859 DEBUG org.apache.flink.runtime.blob.BlobClient                      - GET content addressable BLOB 8638bdf78b0e540786de6c291f710a8db447a2b4 from /127.0.0.1:60112
2017-08-30 14:26:11,242 DEBUG org.apache.flink.runtime.blob.BlobClient                      - PUT content addressable BLOB stream to /127.0.0.1:60295
2017-08-30 14:26:11,243 DEBUG org.apache.flink.runtime.blob.BlobServerConnection            - Received PUT request for content addressable BLOB
2017-08-30 14:26:11,247 INFO  org.apache.flink.runtime.blob.BlobCache                       - Downloading 6d30c88539d511bb9acc13b53bb2a128614f5621 from localhost/127.0.0.1:43268
2017-08-30 14:26:11,247 DEBUG org.apache.flink.runtime.blob.BlobClient                      - GET content addressable BLOB 6d30c88539d511bb9acc13b53bb2a128614f5621 from /127.0.0.1:60297
2017-08-30 14:29:20,942 DEBUG org.apache.flink.runtime.blob.BlobClient                      - PUT content addressable BLOB stream to /127.0.0.1:60410
2017-08-30 14:29:20,943 DEBUG org.apache.flink.runtime.blob.BlobServerConnection            - Received PUT request for content addressable BLOB
2017-08-30 14:29:20,945 INFO  org.apache.flink.runtime.blob.BlobCache                       - Downloading 345ae4be7a2a8077ecefd6a155d7e975b38efd48 from localhost/127.0.0.1:43268
2017-08-30 14:29:20,946 DEBUG org.apache.flink.runtime.blob.BlobClient                      - GET content addressable BLOB 345ae4be7a2a8077ecefd6a155d7e975b38efd48 from /127.0.0.1:60412
2017-08-30 14:32:10,004 DEBUG org.apache.flink.runtime.blob.BlobClient                      - PUT content addressable BLOB stream to /127.0.0.1:60525
2017-08-30 14:32:10,004 DEBUG org.apache.flink.runtime.blob.BlobServerConnection            - Received PUT request for content addressable BLOB
2017-08-30 14:32:10,006 INFO  org.apache.flink.runtime.blob.BlobCache                       - Downloading 709e2f64a484ff39634fe53d50dc760ceefd1f8e from localhost/127.0.0.1:43268
2017-08-30 14:32:10,006 DEBUG org.apache.flink.runtime.blob.BlobClient                      - GET content addressable BLOB 709e2f64a484ff39634fe53d50dc760ceefd1f8e from /127.0.0.1:60527
2017-08-30 14:37:51,282 DEBUG org.apache.flink.runtime.blob.BlobClient                      - PUT content addressable BLOB stream to /127.0.0.1:60734
2017-08-30 14:37:51,283 DEBUG org.apache.flink.runtime.blob.BlobServerConnection            - Received PUT request for content addressable BLOB
2017-08-30 14:37:51,285 INFO  org.apache.flink.runtime.blob.BlobCache                       - Downloading 98fce553f71d4cc5a11c4345482e6cc3a37a91e3 from localhost/127.0.0.1:43268
2017-08-30 14:37:51,286 DEBUG org.apache.flink.runtime.blob.BlobClient                      - GET content addressable BLOB 98fce553f71d4cc5a11c4345482e6cc3a37a91e3 from /127.0.0.1:60736
2017-08-30 14:37:54,222 DEBUG org.apache.flink.runtime.blob.BlobClient                      - PUT content addressable BLOB stream to /127.0.0.1:60740
2017-08-30 14:37:54,223 DEBUG org.apache.flink.runtime.blob.BlobServerConnection            - Received PUT request for content addressable BLOB
2017-08-30 14:37:54,225 INFO  org.apache.flink.runtime.blob.BlobCache                       - Downloading 96387068e0bf71c5ff2019dbb1a43b74fb53067a from localhost/127.0.0.1:43268
2017-08-30 14:37:54,225 DEBUG org.apache.flink.runtime.blob.BlobClient                      - GET content addressable BLOB 96387068e0bf71c5ff2019dbb1a43b74fb53067a from /127.0.0.1:60742
2017-08-30 14:40:44,005 DEBUG org.apache.flink.runtime.blob.BlobClient                      - PUT content addressable BLOB stream to /127.0.0.1:60852
2017-08-30 14:40:44,006 DEBUG org.apache.flink.runtime.blob.BlobServerConnection            - Received PUT request for content addressable BLOB
2017-08-30 14:40:44,007 INFO  org.apache.flink.runtime.blob.BlobCache                       - Downloading 430066a8a3febb68ada10d78663b0972b92e66e5 from localhost/127.0.0.1:43268
2017-08-30 14:40:44,008 DEBUG org.apache.flink.runtime.blob.BlobClient                      - GET content addressable BLOB 430066a8a3febb68ada10d78663b0972b92e66e5 from /127.0.0.1:60854
2017-08-30 14:45:52,671 DEBUG org.apache.flink.runtime.blob.BlobClient                      - PUT content addressable BLOB stream to /127.0.0.1:32820
2017-08-30 14:45:52,671 DEBUG org.apache.flink.runtime.blob.BlobServerConnection            - Received PUT request for content addressable BLOB
2017-08-30 14:45:52,672 INFO  org.apache.flink.runtime.blob.BlobCache                       - Downloading ff0c1f371621c2f4ffb7ed06f448841a8a2458ac from localhost/127.0.0.1:43268
2017-08-30 14:45:52,673 DEBUG org.apache.flink.runtime.blob.BlobClient                      - GET content addressable BLOB ff0c1f371621c2f4ffb7ed06f448841a8a2458ac from /127.0.0.1:32822
2017-08-30 15:15:24,635 DEBUG org.apache.flink.runtime.blob.BlobClient                      - PUT content addressable BLOB stream to /127.0.0.1:33955
2017-08-30 15:15:24,636 DEBUG org.apache.flink.runtime.blob.BlobServerConnection            - Received PUT request for content addressable BLOB
2017-08-30 15:15:24,638 INFO  org.apache.flink.runtime.blob.BlobCache                       - Downloading 291f5aeb9306cb94098255237e01ab6735cf42ea from localhost/127.0.0.1:43268
2017-08-30 15:15:24,638 DEBUG org.apache.flink.runtime.blob.BlobClient                      - GET content addressable BLOB 291f5aeb9306cb94098255237e01ab6735cf42ea from /127.0.0.1:33957
2017-08-30 15:21:04,189 DEBUG org.apache.flink.runtime.blob.BlobClient                      - PUT content addressable BLOB stream to /127.0.0.1:34170
2017-08-30 15:21:04,189 DEBUG org.apache.flink.runtime.blob.BlobServerConnection            - Received PUT request for content addressable BLOB
2017-08-30 15:21:04,191 INFO  org.apache.flink.runtime.blob.BlobCache                       - Downloading e46b51d3bd0476b6a8a656469e7546cd933fa478 from localhost/127.0.0.1:43268
2017-08-30 15:21:04,191 DEBUG org.apache.flink.runtime.blob.BlobClient                      - GET content addressable BLOB e46b51d3bd0476b6a8a656469e7546cd933fa478 from /127.0.0.1:34172
2017-08-30 15:22:10,141 DEBUG org.apache.flink.runtime.blob.BlobClient                      - PUT content addressable BLOB stream to /127.0.0.1:34223
2017-08-30 15:22:10,142 DEBUG org.apache.flink.runtime.blob.BlobServerConnection            - Received PUT request for content addressable BLOB
2017-08-30 15:22:10,144 INFO  org.apache.flink.runtime.blob.BlobCache                       - Downloading cf9700032c229b39e634eeda73284e116314f7bb from localhost/127.0.0.1:43268
2017-08-30 15:22:10,144 DEBUG org.apache.flink.runtime.blob.BlobClient                      - GET content addressable BLOB cf9700032c229b39e634eeda73284e116314f7bb from /127.0.0.1:34225
2017-08-30 15:30:10,133 DEBUG org.apache.flink.runtime.blob.BlobClient                      - PUT content addressable BLOB stream to /127.0.0.1:34544
2017-08-30 15:30:10,133 DEBUG org.apache.flink.runtime.blob.BlobServerConnection            - Received PUT request for content addressable BLOB
2017-08-30 15:30:10,135 INFO  org.apache.flink.runtime.blob.BlobCache                       - Downloading 3b1d4b4d4820c2b74fd4a45d5741f2df940f5189 from localhost/127.0.0.1:43268
2017-08-30 15:30:10,136 DEBUG org.apache.flink.runtime.blob.BlobClient                      - GET content addressable BLOB 3b1d4b4d4820c2b74fd4a45d5741f2df940f5189 from /127.0.0.1:34546
2017-08-30 16:01:58,159 DEBUG org.apache.flink.runtime.blob.BlobClient                      - PUT content addressable BLOB stream to /127.0.0.1:35741
2017-08-30 16:01:58,160 DEBUG org.apache.flink.runtime.blob.BlobServerConnection            - Received PUT request for content addressable BLOB
2017-08-30 16:01:58,162 INFO  org.apache.flink.runtime.blob.BlobCache                       - Downloading 6bde2f7a709181065c6710c2252a5846f361ad68 from localhost/127.0.0.1:43268
2017-08-30 16:01:58,162 DEBUG org.apache.flink.runtime.blob.BlobClient                      - GET content addressable BLOB 6bde2f7a709181065c6710c2252a5846f361ad68 from /127.0.0.1:35743

3) There actually was CPU load, but I thought Flink was stuck in a loop or something, because as you can see from the timestamp on the logs these lines went on for a couple hours with no sign of actual writes on HBase (before crashing for unrelated reasons, Kinesis stream was deleted). Since I was writing on HBase, which I was accessing via Zookeeper, I was expecting to see the Logs I put inside the writeRecord method of the OutputFormat.
As for the jstack trace, I'm currently unable to provide one (because I can't access the kinesis stream right now), but I'll try to emulate the stream and provide that stack as soon as I can.

4) Yes, those were log lines from the TM log in the web dashboard

Thank you very much for your help


2017-08-31 10:29 GMT+02:00 Nico Kruber <[hidden email]>:
Hi Federico,
1) Which version of Flink are you using?
2) Can you also share the JobManager log?
3) Why do you think, Flink is stuck at the BlobCache? Is it really blocked, or
do you still have CPU load? Can you post stack traces of the TaskManager (TM)
and JobManager processes when you think they are stuck (using jstack)?
4) These PUT requests in the TM logs are strange, unless you showed the TM
logs in the web interface - did you?


Nico


On Thursday, 31 August 2017 09:45:59 CEST Fabian Hueske wrote:
> Hi Federico,
>
> Not sure what's going on there but Nico (in CC) is more familiar with the
> blob cache and might be able to help.
>
> Best, Fabian
>
> 2017-08-30 15:35 GMT+02:00 Federico D'Ambrosio <[hidden email]>:
> > Hi,
> >
> > I have a rather simple Flink job which has a KinesisConsumer as a source
> > and an HBase table as sink, in which I write using writeOutputFormat. I'm
> > running it on a local machine with a single taskmanager (2 slots, 2G). The
> > KinesisConsumer works fine and the connection to the HBase table gets
> > opened fine (i.e. the open method of the class implementing OutputFormat
> > gets actually called).
> >
> > I'm running the job at a parallelism of 2, while the sink has a
> > parallelism of 1. The
> >
> > Still, looking at the log I see that after opening the connection, the job
> > gets stuck at lines like this one:
> >
> > INFO  org.apache.flink.runtime.blob.BlobCache                       -
> > Downloading 8638bdf78b0e540786de6c291f710a8db447a2b4 from
> > localhost/127.0.0.1:43268
> >
> > Each following one another, like this:
> >
> > 2017-08-30 14:17:21,318 INFO  org.apache.flink.runtime.blob.BlobCache
> >                  - Created BLOB cache storage directory
> > /tmp/blobStore-8a2a96af-b836-4c95-b79a-a4b80929126f 2017-08-30
> > 14:17:21,321 DEBUG org.apache.flink.runtime.blob.BlobClient
> >        - PUT content addressable BLOB stream to /127.0.0.1:59937
> > 2017-08-30 14:17:21,323 DEBUG
> > org.apache.flink.runtime.blob.BlobServerConnection            - Received
> > PUT request for content addressable BLOB 2017-08-30 14:17:21,324 INFO
> > org.apache.flink.runtime.blob.BlobCache                       -
> > Downloading 3ff486dff4c4eaafdab42b30a877326e62bfca82 from
> > localhost/127.0.0.1:43268 2017-08-30 14:17:21,324 DEBUG
> > org.apache.flink.runtime.blob.BlobClient                      - GET
> > content addressable BLOB 3ff486dff4c4eaafdab42b30a877326e62bfca82 from
> > /127.0.0.1:59938 2017-08-30 14:18:13,708 DEBUG
> > org.apache.flink.runtime.blob.BlobClient                      - PUT
> > content addressable BLOB stream to /127.0.0.1:59976 2017-08-30
> > 14:18:13,708 DEBUG org.apache.flink.runtime.blob.BlobServerConnection
> >        - Received PUT request for content addressable BLOB 2017-08-30
> > 14:18:13,710 INFO  org.apache.flink.runtime.blob.BlobCache
> >        - Downloading 2f5283326aab77faa047b705cd1d6470035b3b7d from
> > localhost/127.0.0.1:43268 2017-08-30 14:18:13,710 DEBUG
> > org.apache.flink.runtime.blob.BlobClient                      - GET
> > content addressable BLOB 2f5283326aab77faa047b705cd1d6470035b3b7d from
> > /127.0.0.1:59978 2017-08-30 14:19:29,811 DEBUG
> > org.apache.flink.runtime.blob.BlobClient                      - PUT
> > content addressable BLOB stream to /127.0.0.1:60022 2017-08-30
> > 14:19:29,812 DEBUG org.apache.flink.runtime.blob.BlobServerConnection
> >        - Received PUT request for content addressable BLOB 2017-08-30
> > 14:19:29,814 INFO  org.apache.flink.runtime.blob.BlobCache
> >        - Downloading f91fd7ecec6f90809f52ee189cb48aa1e30b04f6 from
> > localhost/127.0.0.1:43268 2017-08-30 14:19:29,814 DEBUG
> > org.apache.flink.runtime.blob.BlobClient                      - GET
> > content addressable BLOB f91fd7ecec6f90809f52ee189cb48aa1e30b04f6 from
> > /127.0.0.1:60024 2017-08-30 14:21:42,856 DEBUG
> > org.apache.flink.runtime.blob.BlobClient                      - PUT
> > content addressable BLOB stream to /127.0.0.1:60110 2017-08-30
> > 14:21:42,856 DEBUG org.apache.flink.runtime.blob.BlobServerConnection
> >        - Received PUT request for content addressable BLOB 2017-08-30
> > 14:21:42,858 INFO  org.apache.flink.runtime.blob.BlobCache
> >        - Downloading 8638bdf78b0e540786de6c291f710a8db447a2b4 from
> > localhost/127.0.0.1:43268 2017-08-30 14:21:42,859 DEBUG
> > org.apache.flink.runtime.blob.BlobClient                      - GET
> > content addressable BLOB 8638bdf78b0e540786de6c291f710a8db447a2b4 from
> > /127.0.0.1:60112 2017-08-30 14:26:11,242 DEBUG
> > org.apache.flink.runtime.blob.BlobClient                      - PUT
> > content addressable BLOB stream to /127.0.0.1:60295 2017-08-30
> > 14:26:11,243 DEBUG org.apache.flink.runtime.blob.BlobServerConnection
> >        - Received PUT request for content addressable BLOB 2017-08-30
> > 14:26:11,247 INFO  org.apache.flink.runtime.blob.BlobCache
> >        - Downloading 6d30c88539d511bb9acc13b53bb2a128614f5621 from
> > localhost/127.0.0.1:43268 2017-08-30 14:26:11,247 DEBUG
> > org.apache.flink.runtime.blob.BlobClient                      - GET
> > content addressable BLOB 6d30c88539d511bb9acc13b53bb2a128614f5621 from
> > /127.0.0.1:60297 2017-08-30 14:29:20,942 DEBUG
> > org.apache.flink.runtime.blob.BlobClient                      - PUT
> > content addressable BLOB stream to /127.0.0.1:60410
> >
> >
> > My questions are: what is the jobmanager doing here? Why is he taking ages
> > to do this? How do i speed up this behaviour?
> >
> > Thank you very much for your attention,
> >
> > Federico D'Ambrosio




--
Federico D'Ambrosio
Reply | Threaded
Open this post in threaded view
|

Re: BlobCache and its functioning

Nico Kruber
to sum up: the lines you were seeing seem to be the down- and upload of the
TaskManager logs from the web interface which go through the BlobServer and
its components.


Nico

On Thursday, 31 August 2017 11:51:27 CEST Federico D'Ambrosio wrote:

> Hi,
>
> 1) I'm using Flink 1.3.2
>
> 2) Th JobManager log is pretty much the same concerning those lines:
>
> 2017-08-30 14:16:53,343 INFO
> org.apache.zookeeper.ClientCnxn                               - Opening
> socket connection to server master-1.localdomain/10.0.0.55:2181
> 2017-08-30 14:16:53,344 INFO
> org.apache.zookeeper.ClientCnxn                               - Socket
> connection established to master-1.localdomain/10.0.0.55:2181, initiating
> session
> 2017-08-30 14:16:53,348 INFO
> org.apache.zookeeper.ClientCnxn                               - Session
> establishment complete on server master-1.localdomain/10.0.0.55:2181,
> sessionid = 0x15e326a8fe6000e, negotiated timeout = 40000
> 2017-08-30 14:16:53,467 INFO
> org.apache.flink.streaming.connectors.kinesis.internals.KinesisDataFetcher
> - Subtask 0 has no active shards to read on startup; marking the subtask as
> temporarily idle ...
> 2017-08-30 14:16:53,469 INFO
> org.apache.flink.streaming.connectors.kinesis.FlinkKinesisConsumer  -
> Subtask 1 will be seeded with initial shard
> StreamShardHandle{streamName='fdt', shard='{ShardId:
> shardId-000000000000,HashKeyRange: {StartingHashKey: 0,EndingHashKey:
> 340282366920938463463374607431768211455},SequenceNumberRange:
> {StartingSequenceNumber:
> 49576318619583361934524589342704326364484033599642796034,}}'}, starting
> state set as sequence number LATEST_SEQUENCE_NUM
> 2017-08-30 14:16:53,470 INFO
> org.apache.flink.streaming.connectors.kinesis.internals.KinesisDataFetcher
> - Subtask 1 will start consuming seeded shard
> StreamShardHandle{streamName='fdt', shard='{ShardId:
> shardId-000000000000,HashKeyRange: {StartingHashKey: 0,EndingHashKey:
> 340282366920938463463374607431768211455},SequenceNumberRange:
> {StartingSequenceNumber:
> 49576318619583361934524589342704326364484033599642796034,}}'} from sequence
> number LATEST_SEQUENCE_NUM with ShardConsumer 0
> 2017-08-30 14:16:53,608 INFO
> lab.vardata.HBaseBatchFormat                                  - Task 0:
> Opening connection to currentDay to execute 1 tasks on Single Put job
>
> 2017-08-30 14:17:21,318 INFO  org.apache.flink.runtime.blob.
> BlobCache                       - Created BLOB cache storage directory
> /tmp/blobStore-8a2a96af-b836-4c95-b79a-a4b80929126f
> 2017-08-30 14:17:21,321 DEBUG org.apache.flink.runtime.blob.
> BlobClient                      - PUT content addressable BLOB stream to /
> 127.0.0.1:59937
> 2017-08-30 14:17:21,323 DEBUG org.apache.flink.runtime.blob.
> BlobServerConnection            - Received PUT request for content
> addressable BLOB
> 2017-08-30 14:17:21,324 INFO  org.apache.flink.runtime.blob.
> BlobCache                       - Downloading
> 3ff486dff4c4eaafdab42b30a877326e62bfca82
> from localhost/127.0.0.1:43268
> 2017-08-30 14:17:21,324 DEBUG org.apache.flink.runtime.blob.
> BlobClient                      - GET content addressable BLOB
> 3ff486dff4c4eaafdab42b30a877326e62bfca82 from /127.0.0.1:59938
> 2017-08-30 14:18:13,708 DEBUG org.apache.flink.runtime.blob.
> BlobClient                      - PUT content addressable BLOB stream to /
> 127.0.0.1:59976
> 2017-08-30 14:18:13,708 DEBUG org.apache.flink.runtime.blob.
> BlobServerConnection            - Received PUT request for content
> addressable BLOB
> 2017-08-30 14:18:13,710 INFO  org.apache.flink.runtime.blob.
> BlobCache                       - Downloading
> 2f5283326aab77faa047b705cd1d6470035b3b7d
> from localhost/127.0.0.1:43268
> 2017-08-30 14:18:13,710 DEBUG org.apache.flink.runtime.blob.
> BlobClient                      - GET content addressable BLOB
> 2f5283326aab77faa047b705cd1d6470035b3b7d from /127.0.0.1:59978
> 2017-08-30 14:19:29,811 DEBUG org.apache.flink.runtime.blob.
> BlobClient                      - PUT content addressable BLOB stream to /
> 127.0.0.1:60022
> 2017-08-30 14:19:29,812 DEBUG org.apache.flink.runtime.blob.
> BlobServerConnection            - Received PUT request for content
> addressable BLOB
> 2017-08-30 14:19:29,814 INFO  org.apache.flink.runtime.blob.
> BlobCache                       - Downloading
> f91fd7ecec6f90809f52ee189cb48aa1e30b04f6
> from localhost/127.0.0.1:43268
> 2017-08-30 14:19:29,814 DEBUG org.apache.flink.runtime.blob.
> BlobClient                      - GET content addressable BLOB
> f91fd7ecec6f90809f52ee189cb48aa1e30b04f6 from /127.0.0.1:60024
> 2017-08-30 14:21:42,856 DEBUG org.apache.flink.runtime.blob.
> BlobClient                      - PUT content addressable BLOB stream to /
> 127.0.0.1:60110
> 2017-08-30 14:21:42,856 DEBUG org.apache.flink.runtime.blob.
> BlobServerConnection            - Received PUT request for content
> addressable BLOB
> 2017-08-30 14:21:42,858 INFO  org.apache.flink.runtime.blob.
> BlobCache                       - Downloading
> 8638bdf78b0e540786de6c291f710a8db447a2b4
> from localhost/127.0.0.1:43268
> 2017-08-30 14:21:42,859 DEBUG org.apache.flink.runtime.blob.
> BlobClient                      - GET content addressable BLOB
> 8638bdf78b0e540786de6c291f710a8db447a2b4 from /127.0.0.1:60112
> 2017-08-30 14:26:11,242 DEBUG org.apache.flink.runtime.blob.
> BlobClient                      - PUT content addressable BLOB stream to /
> 127.0.0.1:60295
> 2017-08-30 14:26:11,243 DEBUG org.apache.flink.runtime.blob.
> BlobServerConnection            - Received PUT request for content
> addressable BLOB
> 2017-08-30 14:26:11,247 INFO  org.apache.flink.runtime.blob.
> BlobCache                       - Downloading
> 6d30c88539d511bb9acc13b53bb2a128614f5621
> from localhost/127.0.0.1:43268
> 2017-08-30 14:26:11,247 DEBUG org.apache.flink.runtime.blob.
> BlobClient                      - GET content addressable BLOB
> 6d30c88539d511bb9acc13b53bb2a128614f5621 from /127.0.0.1:60297
> 2017-08-30 14:29:20,942 DEBUG org.apache.flink.runtime.blob.
> BlobClient                      - PUT content addressable BLOB stream to /
> 127.0.0.1:60410
> 2017-08-30 14:29:20,943 DEBUG org.apache.flink.runtime.blob.
> BlobServerConnection            - Received PUT request for content
> addressable BLOB
> 2017-08-30 14:29:20,945 INFO  org.apache.flink.runtime.blob.
> BlobCache                       - Downloading
> 345ae4be7a2a8077ecefd6a155d7e975b38efd48
> from localhost/127.0.0.1:43268
> 2017-08-30 14:29:20,946 DEBUG org.apache.flink.runtime.blob.
> BlobClient                      - GET content addressable BLOB
> 345ae4be7a2a8077ecefd6a155d7e975b38efd48 from /127.0.0.1:60412
> 2017-08-30 14:32:10,004 DEBUG org.apache.flink.runtime.blob.
> BlobClient                      - PUT content addressable BLOB stream to /
> 127.0.0.1:60525
> 2017-08-30 14:32:10,004 DEBUG org.apache.flink.runtime.blob.
> BlobServerConnection            - Received PUT request for content
> addressable BLOB
> 2017-08-30 14:32:10,006 INFO  org.apache.flink.runtime.blob.
> BlobCache                       - Downloading
> 709e2f64a484ff39634fe53d50dc760ceefd1f8e
> from localhost/127.0.0.1:43268
> 2017-08-30 14:32:10,006 DEBUG org.apache.flink.runtime.blob.
> BlobClient                      - GET content addressable BLOB
> 709e2f64a484ff39634fe53d50dc760ceefd1f8e from /127.0.0.1:60527
> 2017-08-30 14:37:51,282 DEBUG org.apache.flink.runtime.blob.
> BlobClient                      - PUT content addressable BLOB stream to /
> 127.0.0.1:60734
> 2017-08-30 14:37:51,283 DEBUG org.apache.flink.runtime.blob.
> BlobServerConnection            - Received PUT request for content
> addressable BLOB
> 2017-08-30 14:37:51,285 INFO  org.apache.flink.runtime.blob.
> BlobCache                       - Downloading
> 98fce553f71d4cc5a11c4345482e6cc3a37a91e3
> from localhost/127.0.0.1:43268
> 2017-08-30 14:37:51,286 DEBUG org.apache.flink.runtime.blob.
> BlobClient                      - GET content addressable BLOB
> 98fce553f71d4cc5a11c4345482e6cc3a37a91e3 from /127.0.0.1:60736
> 2017-08-30 14:37:54,222 DEBUG org.apache.flink.runtime.blob.
> BlobClient                      - PUT content addressable BLOB stream to /
> 127.0.0.1:60740
> 2017-08-30 14:37:54,223 DEBUG org.apache.flink.runtime.blob.
> BlobServerConnection            - Received PUT request for content
> addressable BLOB
> 2017-08-30 14:37:54,225 INFO  org.apache.flink.runtime.blob.
> BlobCache                       - Downloading
> 96387068e0bf71c5ff2019dbb1a43b74fb53067a
> from localhost/127.0.0.1:43268
> 2017-08-30 14:37:54,225 DEBUG org.apache.flink.runtime.blob.
> BlobClient                      - GET content addressable BLOB
> 96387068e0bf71c5ff2019dbb1a43b74fb53067a from /127.0.0.1:60742
> 2017-08-30 14:40:44,005 DEBUG org.apache.flink.runtime.blob.
> BlobClient                      - PUT content addressable BLOB stream to /
> 127.0.0.1:60852
> 2017-08-30 14:40:44,006 DEBUG org.apache.flink.runtime.blob.
> BlobServerConnection            - Received PUT request for content
> addressable BLOB
> 2017-08-30 14:40:44,007 INFO  org.apache.flink.runtime.blob.
> BlobCache                       - Downloading
> 430066a8a3febb68ada10d78663b0972b92e66e5
> from localhost/127.0.0.1:43268
> 2017-08-30 14:40:44,008 DEBUG org.apache.flink.runtime.blob.
> BlobClient                      - GET content addressable BLOB
> 430066a8a3febb68ada10d78663b0972b92e66e5 from /127.0.0.1:60854
> 2017-08-30 14:45:52,671 DEBUG org.apache.flink.runtime.blob.
> BlobClient                      - PUT content addressable BLOB stream to /
> 127.0.0.1:32820
> 2017-08-30 14:45:52,671 DEBUG org.apache.flink.runtime.blob.
> BlobServerConnection            - Received PUT request for content
> addressable BLOB
> 2017-08-30 14:45:52,672 INFO  org.apache.flink.runtime.blob.
> BlobCache                       - Downloading
> ff0c1f371621c2f4ffb7ed06f448841a8a2458ac
> from localhost/127.0.0.1:43268
> 2017-08-30 14:45:52,673 DEBUG org.apache.flink.runtime.blob.
> BlobClient                      - GET content addressable BLOB
> ff0c1f371621c2f4ffb7ed06f448841a8a2458ac from /127.0.0.1:32822
> 2017-08-30 15:15:24,635 DEBUG org.apache.flink.runtime.blob.
> BlobClient                      - PUT content addressable BLOB stream to /
> 127.0.0.1:33955
> 2017-08-30 15:15:24,636 DEBUG org.apache.flink.runtime.blob.
> BlobServerConnection            - Received PUT request for content
> addressable BLOB
> 2017-08-30 15:15:24,638 INFO  org.apache.flink.runtime.blob.
> BlobCache                       - Downloading
> 291f5aeb9306cb94098255237e01ab6735cf42ea
> from localhost/127.0.0.1:43268
> 2017-08-30 15:15:24,638 DEBUG org.apache.flink.runtime.blob.
> BlobClient                      - GET content addressable BLOB
> 291f5aeb9306cb94098255237e01ab6735cf42ea from /127.0.0.1:33957
> 2017-08-30 15:21:04,189 DEBUG org.apache.flink.runtime.blob.
> BlobClient                      - PUT content addressable BLOB stream to /
> 127.0.0.1:34170
> 2017-08-30 15:21:04,189 DEBUG org.apache.flink.runtime.blob.
> BlobServerConnection            - Received PUT request for content
> addressable BLOB
> 2017-08-30 15:21:04,191 INFO  org.apache.flink.runtime.blob.
> BlobCache                       - Downloading
> e46b51d3bd0476b6a8a656469e7546cd933fa478
> from localhost/127.0.0.1:43268
> 2017-08-30 15:21:04,191 DEBUG org.apache.flink.runtime.blob.
> BlobClient                      - GET content addressable BLOB
> e46b51d3bd0476b6a8a656469e7546cd933fa478 from /127.0.0.1:34172
> 2017-08-30 15:22:10,141 DEBUG org.apache.flink.runtime.blob.
> BlobClient                      - PUT content addressable BLOB stream to /
> 127.0.0.1:34223
> 2017-08-30 15:22:10,142 DEBUG org.apache.flink.runtime.blob.
> BlobServerConnection            - Received PUT request for content
> addressable BLOB
> 2017-08-30 15:22:10,144 INFO  org.apache.flink.runtime.blob.
> BlobCache                       - Downloading
> cf9700032c229b39e634eeda73284e116314f7bb
> from localhost/127.0.0.1:43268
> 2017-08-30 15:22:10,144 DEBUG org.apache.flink.runtime.blob.
> BlobClient                      - GET content addressable BLOB
> cf9700032c229b39e634eeda73284e116314f7bb from /127.0.0.1:34225
> 2017-08-30 15:30:10,133 DEBUG org.apache.flink.runtime.blob.
> BlobClient                      - PUT content addressable BLOB stream to /
> 127.0.0.1:34544
> 2017-08-30 15:30:10,133 DEBUG org.apache.flink.runtime.blob.
> BlobServerConnection            - Received PUT request for content
> addressable BLOB
> 2017-08-30 15:30:10,135 INFO  org.apache.flink.runtime.blob.
> BlobCache                       - Downloading
> 3b1d4b4d4820c2b74fd4a45d5741f2df940f5189
> from localhost/127.0.0.1:43268
> 2017-08-30 15:30:10,136 DEBUG org.apache.flink.runtime.blob.
> BlobClient                      - GET content addressable BLOB
> 3b1d4b4d4820c2b74fd4a45d5741f2df940f5189 from /127.0.0.1:34546
> 2017-08-30 16:01:58,159 DEBUG org.apache.flink.runtime.blob.
> BlobClient                      - PUT content addressable BLOB stream to /
> 127.0.0.1:35741
> 2017-08-30 16:01:58,160 DEBUG org.apache.flink.runtime.blob.
> BlobServerConnection            - Received PUT request for content
> addressable BLOB
> 2017-08-30 16:01:58,162 INFO  org.apache.flink.runtime.blob.
> BlobCache                       - Downloading
> 6bde2f7a709181065c6710c2252a5846f361ad68
> from localhost/127.0.0.1:43268
> 2017-08-30 16:01:58,162 DEBUG org.apache.flink.runtime.blob.
> BlobClient                      - GET content addressable BLOB
> 6bde2f7a709181065c6710c2252a5846f361ad68 from /127.0.0.1:35743
>
> 3) There actually was CPU load, but I thought Flink was stuck in a loop or
> something, because as you can see from the timestamp on the logs these
> lines went on for a couple hours with no sign of actual writes on HBase
> (before crashing for unrelated reasons, Kinesis stream was deleted). Since
> I was writing on HBase, which I was accessing via Zookeeper, I was
> expecting to see the Logs I put inside the writeRecord method of the
> OutputFormat.
> As for the jstack trace, I'm currently unable to provide one (because I
> can't access the kinesis stream right now), but I'll try to emulate the
> stream and provide that stack as soon as I can.
>
> 4) Yes, those were log lines from the TM log in the web dashboard
>
> Thank you very much for your help
>
> 2017-08-31 10:29 GMT+02:00 Nico Kruber <[hidden email]>:
> > Hi Federico,
> > 1) Which version of Flink are you using?
> > 2) Can you also share the JobManager log?
> > 3) Why do you think, Flink is stuck at the BlobCache? Is it really
> > blocked, or
> > do you still have CPU load? Can you post stack traces of the TaskManager
> > (TM)
> > and JobManager processes when you think they are stuck (using jstack)?
> > 4) These PUT requests in the TM logs are strange, unless you showed the TM
> > logs in the web interface - did you?
> >
> >
> > Nico
> >
> > On Thursday, 31 August 2017 09:45:59 CEST Fabian Hueske wrote:
> > > Hi Federico,
> > >
> > > Not sure what's going on there but Nico (in CC) is more familiar with
> > > the
> > > blob cache and might be able to help.
> > >
> > > Best, Fabian
> > >
> > > 2017-08-30 15:35 GMT+02:00 Federico D'Ambrosio <[hidden email]>:
> > > > Hi,
> > > >
> > > > I have a rather simple Flink job which has a KinesisConsumer as a
> >
> > source
> >
> > > > and an HBase table as sink, in which I write using writeOutputFormat.
> >
> > I'm
> >
> > > > running it on a local machine with a single taskmanager (2 slots, 2G).
> >
> > The
> >
> > > > KinesisConsumer works fine and the connection to the HBase table gets
> > > > opened fine (i.e. the open method of the class implementing
> >
> > OutputFormat
> >
> > > > gets actually called).
> > > >
> > > > I'm running the job at a parallelism of 2, while the sink has a
> > > > parallelism of 1. The
> > > >
> > > > Still, looking at the log I see that after opening the connection, the
> >
> > job
> >
> > > > gets stuck at lines like this one:
> > > >
> > > > INFO  org.apache.flink.runtime.blob.BlobCache                       -
> > > > Downloading 8638bdf78b0e540786de6c291f710a8db447a2b4 from
> > > > localhost/127.0.0.1:43268
> > > >
> > > > Each following one another, like this:
> > > >
> > > > 2017-08-30 14:17:21,318 INFO  org.apache.flink.runtime.blob.BlobCache
> > > >
> > > >                  - Created BLOB cache storage directory
> > > >
> > > > /tmp/blobStore-8a2a96af-b836-4c95-b79a-a4b80929126f 2017-08-30
> > > > 14:17:21,321 DEBUG org.apache.flink.runtime.blob.BlobClient
> > > >
> > > >        - PUT content addressable BLOB stream to /127.0.0.1:59937
> > > >
> > > > 2017-08-30 14:17:21,323 DEBUG
> > > > org.apache.flink.runtime.blob.BlobServerConnection            -
> >
> > Received
> >
> > > > PUT request for content addressable BLOB 2017-08-30 14:17:21,324 INFO
> > > > org.apache.flink.runtime.blob.BlobCache                       -
> > > > Downloading 3ff486dff4c4eaafdab42b30a877326e62bfca82 from
> > > > localhost/127.0.0.1:43268 2017-08-30 14:17:21,324 DEBUG
> > > > org.apache.flink.runtime.blob.BlobClient                      - GET
> > > > content addressable BLOB 3ff486dff4c4eaafdab42b30a877326e62bfca82 from
> > > > /127.0.0.1:59938 2017-08-30 14:18:13,708 DEBUG
> > > > org.apache.flink.runtime.blob.BlobClient                      - PUT
> > > > content addressable BLOB stream to /127.0.0.1:59976 2017-08-30
> > > > 14:18:13,708 DEBUG org.apache.flink.runtime.blob.BlobServerConnection
> > > >
> > > >        - Received PUT request for content addressable BLOB 2017-08-30
> > > >
> > > > 14:18:13,710 INFO  org.apache.flink.runtime.blob.BlobCache
> > > >
> > > >        - Downloading 2f5283326aab77faa047b705cd1d6470035b3b7d from
> > > >
> > > > localhost/127.0.0.1:43268 2017-08-30 14:18:13,710 DEBUG
> > > > org.apache.flink.runtime.blob.BlobClient                      - GET
> > > > content addressable BLOB 2f5283326aab77faa047b705cd1d6470035b3b7d from
> > > > /127.0.0.1:59978 2017-08-30 14:19:29,811 DEBUG
> > > > org.apache.flink.runtime.blob.BlobClient                      - PUT
> > > > content addressable BLOB stream to /127.0.0.1:60022 2017-08-30
> > > > 14:19:29,812 DEBUG org.apache.flink.runtime.blob.BlobServerConnection
> > > >
> > > >        - Received PUT request for content addressable BLOB 2017-08-30
> > > >
> > > > 14:19:29,814 INFO  org.apache.flink.runtime.blob.BlobCache
> > > >
> > > >        - Downloading f91fd7ecec6f90809f52ee189cb48aa1e30b04f6 from
> > > >
> > > > localhost/127.0.0.1:43268 2017-08-30 14:19:29,814 DEBUG
> > > > org.apache.flink.runtime.blob.BlobClient                      - GET
> > > > content addressable BLOB f91fd7ecec6f90809f52ee189cb48aa1e30b04f6 from
> > > > /127.0.0.1:60024 2017-08-30 14:21:42,856 DEBUG
> > > > org.apache.flink.runtime.blob.BlobClient                      - PUT
> > > > content addressable BLOB stream to /127.0.0.1:60110 2017-08-30
> > > > 14:21:42,856 DEBUG org.apache.flink.runtime.blob.BlobServerConnection
> > > >
> > > >        - Received PUT request for content addressable BLOB 2017-08-30
> > > >
> > > > 14:21:42,858 INFO  org.apache.flink.runtime.blob.BlobCache
> > > >
> > > >        - Downloading 8638bdf78b0e540786de6c291f710a8db447a2b4 from
> > > >
> > > > localhost/127.0.0.1:43268 2017-08-30 14:21:42,859 DEBUG
> > > > org.apache.flink.runtime.blob.BlobClient                      - GET
> > > > content addressable BLOB 8638bdf78b0e540786de6c291f710a8db447a2b4 from
> > > > /127.0.0.1:60112 2017-08-30 14:26:11,242 DEBUG
> > > > org.apache.flink.runtime.blob.BlobClient                      - PUT
> > > > content addressable BLOB stream to /127.0.0.1:60295 2017-08-30
> > > > 14:26:11,243 DEBUG org.apache.flink.runtime.blob.BlobServerConnection
> > > >
> > > >        - Received PUT request for content addressable BLOB 2017-08-30
> > > >
> > > > 14:26:11,247 INFO  org.apache.flink.runtime.blob.BlobCache
> > > >
> > > >        - Downloading 6d30c88539d511bb9acc13b53bb2a128614f5621 from
> > > >
> > > > localhost/127.0.0.1:43268 2017-08-30 14:26:11,247 DEBUG
> > > > org.apache.flink.runtime.blob.BlobClient                      - GET
> > > > content addressable BLOB 6d30c88539d511bb9acc13b53bb2a128614f5621 from
> > > > /127.0.0.1:60297 2017-08-30 14:29:20,942 DEBUG
> > > > org.apache.flink.runtime.blob.BlobClient                      - PUT
> > > > content addressable BLOB stream to /127.0.0.1:60410
> > > >
> > > >
> > > > My questions are: what is the jobmanager doing here? Why is he taking
> >
> > ages
> >
> > > > to do this? How do i speed up this behaviour?
> > > >
> > > > Thank you very much for your attention,
> > > >
> > > > Federico D'Ambrosio


signature.asc (201 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: BlobCache and its functioning

Federico D'Ambrosio-2
Ok, thank you very much!

So that was nothing actually related to what I was trying to do. I guess I'll have to investigate further on the effective correctness of the implementation of the OutputFormat then, because the total lack of other log lines was the most strucking thing about this whole issue.

Thank you again

2017-08-31 12:24 GMT+02:00 Nico Kruber <[hidden email]>:
to sum up: the lines you were seeing seem to be the down- and upload of the
TaskManager logs from the web interface which go through the BlobServer and
its components.


Nico

On Thursday, 31 August 2017 11:51:27 CEST Federico D'Ambrosio wrote:
> Hi,
>
> 1) I'm using Flink 1.3.2
>
> 2) Th JobManager log is pretty much the same concerning those lines:
>
> 2017-08-30 14:16:53,343 INFO
> org.apache.zookeeper.ClientCnxn                               - Opening
> socket connection to server master-1.localdomain/10.0.0.55:2181
> 2017-08-30 14:16:53,344 INFO
> org.apache.zookeeper.ClientCnxn                               - Socket
> connection established to master-1.localdomain/10.0.0.55:2181, initiating
> session
> 2017-08-30 14:16:53,348 INFO
> org.apache.zookeeper.ClientCnxn                               - Session
> establishment complete on server master-1.localdomain/10.0.0.55:2181,
> sessionid = 0x15e326a8fe6000e, negotiated timeout = 40000
> 2017-08-30 14:16:53,467 INFO
> org.apache.flink.streaming.connectors.kinesis.internals.KinesisDataFetcher
> - Subtask 0 has no active shards to read on startup; marking the subtask as
> temporarily idle ...
> 2017-08-30 14:16:53,469 INFO
> org.apache.flink.streaming.connectors.kinesis.FlinkKinesisConsumer  -
> Subtask 1 will be seeded with initial shard
> StreamShardHandle{streamName='fdt', shard='{ShardId:
> shardId-000000000000,HashKeyRange: {StartingHashKey: 0,EndingHashKey:
> 340282366920938463463374607431768211455},SequenceNumberRange:
> {StartingSequenceNumber:
> 49576318619583361934524589342704326364484033599642796034,}}'}, starting
> state set as sequence number LATEST_SEQUENCE_NUM
> 2017-08-30 14:16:53,470 INFO
> org.apache.flink.streaming.connectors.kinesis.internals.KinesisDataFetcher
> - Subtask 1 will start consuming seeded shard
> StreamShardHandle{streamName='fdt', shard='{ShardId:
> shardId-000000000000,HashKeyRange: {StartingHashKey: 0,EndingHashKey:
> 340282366920938463463374607431768211455},SequenceNumberRange:
> {StartingSequenceNumber:
> 49576318619583361934524589342704326364484033599642796034,}}'} from sequence
> number LATEST_SEQUENCE_NUM with ShardConsumer 0
> 2017-08-30 14:16:53,608 INFO
> lab.vardata.HBaseBatchFormat                                  - Task 0:
> Opening connection to currentDay to execute 1 tasks on Single Put job
>
> 2017-08-30 14:17:21,318 INFO  org.apache.flink.runtime.blob.
> BlobCache                       - Created BLOB cache storage directory
> /tmp/blobStore-8a2a96af-b836-4c95-b79a-a4b80929126f
> 2017-08-30 14:17:21,321 DEBUG org.apache.flink.runtime.blob.
> BlobClient                      - PUT content addressable BLOB stream to /
> 127.0.0.1:59937
> 2017-08-30 14:17:21,323 DEBUG org.apache.flink.runtime.blob.
> BlobServerConnection            - Received PUT request for content
> addressable BLOB
> 2017-08-30 14:17:21,324 INFO  org.apache.flink.runtime.blob.
> BlobCache                       - Downloading
> 3ff486dff4c4eaafdab42b30a877326e62bfca82
> from localhost/127.0.0.1:43268
> 2017-08-30 14:17:21,324 DEBUG org.apache.flink.runtime.blob.
> BlobClient                      - GET content addressable BLOB
> 3ff486dff4c4eaafdab42b30a877326e62bfca82 from /127.0.0.1:59938
> 2017-08-30 14:18:13,708 DEBUG org.apache.flink.runtime.blob.
> BlobClient                      - PUT content addressable BLOB stream to /
> 127.0.0.1:59976
> 2017-08-30 14:18:13,708 DEBUG org.apache.flink.runtime.blob.
> BlobServerConnection            - Received PUT request for content
> addressable BLOB
> 2017-08-30 14:18:13,710 INFO  org.apache.flink.runtime.blob.
> BlobCache                       - Downloading
> 2f5283326aab77faa047b705cd1d6470035b3b7d
> from localhost/127.0.0.1:43268
> 2017-08-30 14:18:13,710 DEBUG org.apache.flink.runtime.blob.
> BlobClient                      - GET content addressable BLOB
> 2f5283326aab77faa047b705cd1d6470035b3b7d from /127.0.0.1:59978
> 2017-08-30 14:19:29,811 DEBUG org.apache.flink.runtime.blob.
> BlobClient                      - PUT content addressable BLOB stream to /
> 127.0.0.1:60022
> 2017-08-30 14:19:29,812 DEBUG org.apache.flink.runtime.blob.
> BlobServerConnection            - Received PUT request for content
> addressable BLOB
> 2017-08-30 14:19:29,814 INFO  org.apache.flink.runtime.blob.
> BlobCache                       - Downloading
> f91fd7ecec6f90809f52ee189cb48aa1e30b04f6
> from localhost/127.0.0.1:43268
> 2017-08-30 14:19:29,814 DEBUG org.apache.flink.runtime.blob.
> BlobClient                      - GET content addressable BLOB
> f91fd7ecec6f90809f52ee189cb48aa1e30b04f6 from /127.0.0.1:60024
> 2017-08-30 14:21:42,856 DEBUG org.apache.flink.runtime.blob.
> BlobClient                      - PUT content addressable BLOB stream to /
> 127.0.0.1:60110
> 2017-08-30 14:21:42,856 DEBUG org.apache.flink.runtime.blob.
> BlobServerConnection            - Received PUT request for content
> addressable BLOB
> 2017-08-30 14:21:42,858 INFO  org.apache.flink.runtime.blob.
> BlobCache                       - Downloading
> 8638bdf78b0e540786de6c291f710a8db447a2b4
> from localhost/127.0.0.1:43268
> 2017-08-30 14:21:42,859 DEBUG org.apache.flink.runtime.blob.
> BlobClient                      - GET content addressable BLOB
> 8638bdf78b0e540786de6c291f710a8db447a2b4 from /127.0.0.1:60112
> 2017-08-30 14:26:11,242 DEBUG org.apache.flink.runtime.blob.
> BlobClient                      - PUT content addressable BLOB stream to /
> 127.0.0.1:60295
> 2017-08-30 14:26:11,243 DEBUG org.apache.flink.runtime.blob.
> BlobServerConnection            - Received PUT request for content
> addressable BLOB
> 2017-08-30 14:26:11,247 INFO  org.apache.flink.runtime.blob.
> BlobCache                       - Downloading
> 6d30c88539d511bb9acc13b53bb2a128614f5621
> from localhost/127.0.0.1:43268
> 2017-08-30 14:26:11,247 DEBUG org.apache.flink.runtime.blob.
> BlobClient                      - GET content addressable BLOB
> 6d30c88539d511bb9acc13b53bb2a128614f5621 from /127.0.0.1:60297
> 2017-08-30 14:29:20,942 DEBUG org.apache.flink.runtime.blob.
> BlobClient                      - PUT content addressable BLOB stream to /
> 127.0.0.1:60410
> 2017-08-30 14:29:20,943 DEBUG org.apache.flink.runtime.blob.
> BlobServerConnection            - Received PUT request for content
> addressable BLOB
> 2017-08-30 14:29:20,945 INFO  org.apache.flink.runtime.blob.
> BlobCache                       - Downloading
> 345ae4be7a2a8077ecefd6a155d7e975b38efd48
> from localhost/127.0.0.1:43268
> 2017-08-30 14:29:20,946 DEBUG org.apache.flink.runtime.blob.
> BlobClient                      - GET content addressable BLOB
> 345ae4be7a2a8077ecefd6a155d7e975b38efd48 from /127.0.0.1:60412
> 2017-08-30 14:32:10,004 DEBUG org.apache.flink.runtime.blob.
> BlobClient                      - PUT content addressable BLOB stream to /
> 127.0.0.1:60525
> 2017-08-30 14:32:10,004 DEBUG org.apache.flink.runtime.blob.
> BlobServerConnection            - Received PUT request for content
> addressable BLOB
> 2017-08-30 14:32:10,006 INFO  org.apache.flink.runtime.blob.
> BlobCache                       - Downloading
> 709e2f64a484ff39634fe53d50dc760ceefd1f8e
> from localhost/127.0.0.1:43268
> 2017-08-30 14:32:10,006 DEBUG org.apache.flink.runtime.blob.
> BlobClient                      - GET content addressable BLOB
> 709e2f64a484ff39634fe53d50dc760ceefd1f8e from /127.0.0.1:60527
> 2017-08-30 14:37:51,282 DEBUG org.apache.flink.runtime.blob.
> BlobClient                      - PUT content addressable BLOB stream to /
> 127.0.0.1:60734
> 2017-08-30 14:37:51,283 DEBUG org.apache.flink.runtime.blob.
> BlobServerConnection            - Received PUT request for content
> addressable BLOB
> 2017-08-30 14:37:51,285 INFO  org.apache.flink.runtime.blob.
> BlobCache                       - Downloading
> 98fce553f71d4cc5a11c4345482e6cc3a37a91e3
> from localhost/127.0.0.1:43268
> 2017-08-30 14:37:51,286 DEBUG org.apache.flink.runtime.blob.
> BlobClient                      - GET content addressable BLOB
> 98fce553f71d4cc5a11c4345482e6cc3a37a91e3 from /127.0.0.1:60736
> 2017-08-30 14:37:54,222 DEBUG org.apache.flink.runtime.blob.
> BlobClient                      - PUT content addressable BLOB stream to /
> 127.0.0.1:60740
> 2017-08-30 14:37:54,223 DEBUG org.apache.flink.runtime.blob.
> BlobServerConnection            - Received PUT request for content
> addressable BLOB
> 2017-08-30 14:37:54,225 INFO  org.apache.flink.runtime.blob.
> BlobCache                       - Downloading
> 96387068e0bf71c5ff2019dbb1a43b74fb53067a
> from localhost/127.0.0.1:43268
> 2017-08-30 14:37:54,225 DEBUG org.apache.flink.runtime.blob.
> BlobClient                      - GET content addressable BLOB
> 96387068e0bf71c5ff2019dbb1a43b74fb53067a from /127.0.0.1:60742
> 2017-08-30 14:40:44,005 DEBUG org.apache.flink.runtime.blob.
> BlobClient                      - PUT content addressable BLOB stream to /
> 127.0.0.1:60852
> 2017-08-30 14:40:44,006 DEBUG org.apache.flink.runtime.blob.
> BlobServerConnection            - Received PUT request for content
> addressable BLOB
> 2017-08-30 14:40:44,007 INFO  org.apache.flink.runtime.blob.
> BlobCache                       - Downloading
> 430066a8a3febb68ada10d78663b0972b92e66e5
> from localhost/127.0.0.1:43268
> 2017-08-30 14:40:44,008 DEBUG org.apache.flink.runtime.blob.
> BlobClient                      - GET content addressable BLOB
> 430066a8a3febb68ada10d78663b0972b92e66e5 from /127.0.0.1:60854
> 2017-08-30 14:45:52,671 DEBUG org.apache.flink.runtime.blob.
> BlobClient                      - PUT content addressable BLOB stream to /
> 127.0.0.1:32820
> 2017-08-30 14:45:52,671 DEBUG org.apache.flink.runtime.blob.
> BlobServerConnection            - Received PUT request for content
> addressable BLOB
> 2017-08-30 14:45:52,672 INFO  org.apache.flink.runtime.blob.
> BlobCache                       - Downloading
> ff0c1f371621c2f4ffb7ed06f448841a8a2458ac
> from localhost/127.0.0.1:43268
> 2017-08-30 14:45:52,673 DEBUG org.apache.flink.runtime.blob.
> BlobClient                      - GET content addressable BLOB
> ff0c1f371621c2f4ffb7ed06f448841a8a2458ac from /127.0.0.1:32822
> 2017-08-30 15:15:24,635 DEBUG org.apache.flink.runtime.blob.
> BlobClient                      - PUT content addressable BLOB stream to /
> 127.0.0.1:33955
> 2017-08-30 15:15:24,636 DEBUG org.apache.flink.runtime.blob.
> BlobServerConnection            - Received PUT request for content
> addressable BLOB
> 2017-08-30 15:15:24,638 INFO  org.apache.flink.runtime.blob.
> BlobCache                       - Downloading
> 291f5aeb9306cb94098255237e01ab6735cf42ea
> from localhost/127.0.0.1:43268
> 2017-08-30 15:15:24,638 DEBUG org.apache.flink.runtime.blob.
> BlobClient                      - GET content addressable BLOB
> 291f5aeb9306cb94098255237e01ab6735cf42ea from /127.0.0.1:33957
> 2017-08-30 15:21:04,189 DEBUG org.apache.flink.runtime.blob.
> BlobClient                      - PUT content addressable BLOB stream to /
> 127.0.0.1:34170
> 2017-08-30 15:21:04,189 DEBUG org.apache.flink.runtime.blob.
> BlobServerConnection            - Received PUT request for content
> addressable BLOB
> 2017-08-30 15:21:04,191 INFO  org.apache.flink.runtime.blob.
> BlobCache                       - Downloading
> e46b51d3bd0476b6a8a656469e7546cd933fa478
> from localhost/127.0.0.1:43268
> 2017-08-30 15:21:04,191 DEBUG org.apache.flink.runtime.blob.
> BlobClient                      - GET content addressable BLOB
> e46b51d3bd0476b6a8a656469e7546cd933fa478 from /127.0.0.1:34172
> 2017-08-30 15:22:10,141 DEBUG org.apache.flink.runtime.blob.
> BlobClient                      - PUT content addressable BLOB stream to /
> 127.0.0.1:34223
> 2017-08-30 15:22:10,142 DEBUG org.apache.flink.runtime.blob.
> BlobServerConnection            - Received PUT request for content
> addressable BLOB
> 2017-08-30 15:22:10,144 INFO  org.apache.flink.runtime.blob.
> BlobCache                       - Downloading
> cf9700032c229b39e634eeda73284e116314f7bb
> from localhost/127.0.0.1:43268
> 2017-08-30 15:22:10,144 DEBUG org.apache.flink.runtime.blob.
> BlobClient                      - GET content addressable BLOB
> cf9700032c229b39e634eeda73284e116314f7bb from /127.0.0.1:34225
> 2017-08-30 15:30:10,133 DEBUG org.apache.flink.runtime.blob.
> BlobClient                      - PUT content addressable BLOB stream to /
> 127.0.0.1:34544
> 2017-08-30 15:30:10,133 DEBUG org.apache.flink.runtime.blob.
> BlobServerConnection            - Received PUT request for content
> addressable BLOB
> 2017-08-30 15:30:10,135 INFO  org.apache.flink.runtime.blob.
> BlobCache                       - Downloading
> 3b1d4b4d4820c2b74fd4a45d5741f2df940f5189
> from localhost/127.0.0.1:43268
> 2017-08-30 15:30:10,136 DEBUG org.apache.flink.runtime.blob.
> BlobClient                      - GET content addressable BLOB
> 3b1d4b4d4820c2b74fd4a45d5741f2df940f5189 from /127.0.0.1:34546
> 2017-08-30 16:01:58,159 DEBUG org.apache.flink.runtime.blob.
> BlobClient                      - PUT content addressable BLOB stream to /
> 127.0.0.1:35741
> 2017-08-30 16:01:58,160 DEBUG org.apache.flink.runtime.blob.
> BlobServerConnection            - Received PUT request for content
> addressable BLOB
> 2017-08-30 16:01:58,162 INFO  org.apache.flink.runtime.blob.
> BlobCache                       - Downloading
> 6bde2f7a709181065c6710c2252a5846f361ad68
> from localhost/127.0.0.1:43268
> 2017-08-30 16:01:58,162 DEBUG org.apache.flink.runtime.blob.
> BlobClient                      - GET content addressable BLOB
> 6bde2f7a709181065c6710c2252a5846f361ad68 from /127.0.0.1:35743
>
> 3) There actually was CPU load, but I thought Flink was stuck in a loop or
> something, because as you can see from the timestamp on the logs these
> lines went on for a couple hours with no sign of actual writes on HBase
> (before crashing for unrelated reasons, Kinesis stream was deleted). Since
> I was writing on HBase, which I was accessing via Zookeeper, I was
> expecting to see the Logs I put inside the writeRecord method of the
> OutputFormat.
> As for the jstack trace, I'm currently unable to provide one (because I
> can't access the kinesis stream right now), but I'll try to emulate the
> stream and provide that stack as soon as I can.
>
> 4) Yes, those were log lines from the TM log in the web dashboard
>
> Thank you very much for your help
>
> 2017-08-31 10:29 GMT+02:00 Nico Kruber <[hidden email]>:
> > Hi Federico,
> > 1) Which version of Flink are you using?
> > 2) Can you also share the JobManager log?
> > 3) Why do you think, Flink is stuck at the BlobCache? Is it really
> > blocked, or
> > do you still have CPU load? Can you post stack traces of the TaskManager
> > (TM)
> > and JobManager processes when you think they are stuck (using jstack)?
> > 4) These PUT requests in the TM logs are strange, unless you showed the TM
> > logs in the web interface - did you?
> >
> >
> > Nico
> >
> > On Thursday, 31 August 2017 09:45:59 CEST Fabian Hueske wrote:
> > > Hi Federico,
> > >
> > > Not sure what's going on there but Nico (in CC) is more familiar with
> > > the
> > > blob cache and might be able to help.
> > >
> > > Best, Fabian
> > >
> > > 2017-08-30 15:35 GMT+02:00 Federico D'Ambrosio <[hidden email]>:
> > > > Hi,
> > > >
> > > > I have a rather simple Flink job which has a KinesisConsumer as a
> >
> > source
> >
> > > > and an HBase table as sink, in which I write using writeOutputFormat.
> >
> > I'm
> >
> > > > running it on a local machine with a single taskmanager (2 slots, 2G).
> >
> > The
> >
> > > > KinesisConsumer works fine and the connection to the HBase table gets
> > > > opened fine (i.e. the open method of the class implementing
> >
> > OutputFormat
> >
> > > > gets actually called).
> > > >
> > > > I'm running the job at a parallelism of 2, while the sink has a
> > > > parallelism of 1. The
> > > >
> > > > Still, looking at the log I see that after opening the connection, the
> >
> > job
> >
> > > > gets stuck at lines like this one:
> > > >
> > > > INFO  org.apache.flink.runtime.blob.BlobCache                       -
> > > > Downloading 8638bdf78b0e540786de6c291f710a8db447a2b4 from
> > > > localhost/127.0.0.1:43268
> > > >
> > > > Each following one another, like this:
> > > >
> > > > 2017-08-30 14:17:21,318 INFO  org.apache.flink.runtime.blob.BlobCache
> > > >
> > > >                  - Created BLOB cache storage directory
> > > >
> > > > /tmp/blobStore-8a2a96af-b836-4c95-b79a-a4b80929126f 2017-08-30
> > > > 14:17:21,321 DEBUG org.apache.flink.runtime.blob.BlobClient
> > > >
> > > >        - PUT content addressable BLOB stream to /127.0.0.1:59937
> > > >
> > > > 2017-08-30 14:17:21,323 DEBUG
> > > > org.apache.flink.runtime.blob.BlobServerConnection            -
> >
> > Received
> >
> > > > PUT request for content addressable BLOB 2017-08-30 14:17:21,324 INFO
> > > > org.apache.flink.runtime.blob.BlobCache                       -
> > > > Downloading 3ff486dff4c4eaafdab42b30a877326e62bfca82 from
> > > > localhost/127.0.0.1:43268 2017-08-30 14:17:21,324 DEBUG
> > > > org.apache.flink.runtime.blob.BlobClient                      - GET
> > > > content addressable BLOB 3ff486dff4c4eaafdab42b30a877326e62bfca82 from
> > > > /127.0.0.1:59938 2017-08-30 14:18:13,708 DEBUG
> > > > org.apache.flink.runtime.blob.BlobClient                      - PUT
> > > > content addressable BLOB stream to /127.0.0.1:59976 2017-08-30
> > > > 14:18:13,708 DEBUG org.apache.flink.runtime.blob.BlobServerConnection
> > > >
> > > >        - Received PUT request for content addressable BLOB 2017-08-30
> > > >
> > > > 14:18:13,710 INFO  org.apache.flink.runtime.blob.BlobCache
> > > >
> > > >        - Downloading 2f5283326aab77faa047b705cd1d6470035b3b7d from
> > > >
> > > > localhost/127.0.0.1:43268 2017-08-30 14:18:13,710 DEBUG
> > > > org.apache.flink.runtime.blob.BlobClient                      - GET
> > > > content addressable BLOB 2f5283326aab77faa047b705cd1d6470035b3b7d from
> > > > /127.0.0.1:59978 2017-08-30 14:19:29,811 DEBUG
> > > > org.apache.flink.runtime.blob.BlobClient                      - PUT
> > > > content addressable BLOB stream to /127.0.0.1:60022 2017-08-30
> > > > 14:19:29,812 DEBUG org.apache.flink.runtime.blob.BlobServerConnection
> > > >
> > > >        - Received PUT request for content addressable BLOB 2017-08-30
> > > >
> > > > 14:19:29,814 INFO  org.apache.flink.runtime.blob.BlobCache
> > > >
> > > >        - Downloading f91fd7ecec6f90809f52ee189cb48aa1e30b04f6 from
> > > >
> > > > localhost/127.0.0.1:43268 2017-08-30 14:19:29,814 DEBUG
> > > > org.apache.flink.runtime.blob.BlobClient                      - GET
> > > > content addressable BLOB f91fd7ecec6f90809f52ee189cb48aa1e30b04f6 from
> > > > /127.0.0.1:60024 2017-08-30 14:21:42,856 DEBUG
> > > > org.apache.flink.runtime.blob.BlobClient                      - PUT
> > > > content addressable BLOB stream to /127.0.0.1:60110 2017-08-30
> > > > 14:21:42,856 DEBUG org.apache.flink.runtime.blob.BlobServerConnection
> > > >
> > > >        - Received PUT request for content addressable BLOB 2017-08-30
> > > >
> > > > 14:21:42,858 INFO  org.apache.flink.runtime.blob.BlobCache
> > > >
> > > >        - Downloading 8638bdf78b0e540786de6c291f710a8db447a2b4 from
> > > >
> > > > localhost/127.0.0.1:43268 2017-08-30 14:21:42,859 DEBUG
> > > > org.apache.flink.runtime.blob.BlobClient                      - GET
> > > > content addressable BLOB 8638bdf78b0e540786de6c291f710a8db447a2b4 from
> > > > /127.0.0.1:60112 2017-08-30 14:26:11,242 DEBUG
> > > > org.apache.flink.runtime.blob.BlobClient                      - PUT
> > > > content addressable BLOB stream to /127.0.0.1:60295 2017-08-30
> > > > 14:26:11,243 DEBUG org.apache.flink.runtime.blob.BlobServerConnection
> > > >
> > > >        - Received PUT request for content addressable BLOB 2017-08-30
> > > >
> > > > 14:26:11,247 INFO  org.apache.flink.runtime.blob.BlobCache
> > > >
> > > >        - Downloading 6d30c88539d511bb9acc13b53bb2a128614f5621 from
> > > >
> > > > localhost/127.0.0.1:43268 2017-08-30 14:26:11,247 DEBUG
> > > > org.apache.flink.runtime.blob.BlobClient                      - GET
> > > > content addressable BLOB 6d30c88539d511bb9acc13b53bb2a128614f5621 from
> > > > /127.0.0.1:60297 2017-08-30 14:29:20,942 DEBUG
> > > > org.apache.flink.runtime.blob.BlobClient                      - PUT
> > > > content addressable BLOB stream to /127.0.0.1:60410
> > > >
> > > >
> > > > My questions are: what is the jobmanager doing here? Why is he taking
> >
> > ages
> >
> > > > to do this? How do i speed up this behaviour?
> > > >
> > > > Thank you very much for your attention,
> > > >
> > > > Federico D'Ambrosio




--
Federico D'Ambrosio