Flink deadLetters


Flavio Pompermaier
Hi to all,

my job seems to be stuck, and nothing is logged even in debug mode.
The only strange thing is this message:

Received message SendHeartbeat at akka://flink/user/taskmanager_1 from Actor[akka://flink/deadLetters].

Could it be a symptom of a problem?

Best,
Flavio
Re: Flink deadLetters

Till Rohrmann
That is usually nothing to worry about. This just means that the message was sent without specifying a sender. What Akka then does is to use the `/deadLetters` actor as the sender.
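
To make that concrete, here is a minimal plain-Java sketch of the fallback. It is a toy model under stated assumptions, not Akka's actual code: the class `SenderFallbackDemo`, the nested `Envelope`, and the `logLine` helper are hypothetical names made up for illustration. It only shows how a message sent without a sender ends up attributed to the dead-letters path, which is why the log line above looks the way it does.

```java
import java.util.Objects;

// Toy model only: these classes are hypothetical and are NOT part of Akka's API.
final class SenderFallbackDemo {

    // Stand-in for the path of the dead-letters actor in the Flink actor system.
    static final String DEAD_LETTERS = "akka://flink/deadLetters";

    // A message together with the sender path recorded for it.
    static final class Envelope {
        final Object message;
        final String sender;

        Envelope(Object message, String senderOrNull) {
            this.message = Objects.requireNonNull(message, "message");
            // No sender given: fall back to the dead-letters path.
            this.sender = (senderOrNull != null) ? senderOrNull : DEAD_LETTERS;
        }
    }

    // Formats a log line in the style of the one quoted in this thread.
    static String logLine(Envelope envelope, String receiver) {
        return "Received message " + envelope.message + " at " + receiver
                + " from Actor[" + envelope.sender + "].";
    }

    public static void main(String[] args) {
        // A heartbeat sent without specifying a sender.
        Envelope heartbeat = new Envelope("SendHeartbeat", null);
        System.out.println(logLine(heartbeat, "akka://flink/user/taskmanager_1"));
    }
}
```

So the deadLetters entry in that line only reflects the missing sender; it does not mean the heartbeat itself was dropped.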

What kind of job is it?

Cheers,
Till

On Fri, Jul 17, 2015 at 6:30 PM, Flavio Pompermaier <[hidden email]> wrote:
Hi to all,

my job seems to be stuck, and nothing is logged even in debug mode.
The only strange thing is this message:

Received message SendHeartbeat at akka://flink/user/taskmanager_1 from Actor[akka://flink/deadLetters].

Could it be a symptom of a problem?

Best,
Flavio

Re: Flink deadLetters

Flavio Pompermaier

The job is quite simple. It reads 10 Parquet directories, extracts some information from the Thrift objects, and generates Tuple3 records. It then applies a project() and a distinct() so that an external service is called only for some of the extracted ids (the external service translates a local id into a global one).
Then there are two coGroup() operations that substitute the obtained global ids into the original Tuple5.
Finally, the job writes the mapping table to a CSV file and the translated Tuple5 records to a file.
The strange thing is that the job finishes in about 1 hour with the default memory settings (in Eclipse), while it hangs if I set -Xmx12g.
I have 16 GB of RAM (maybe not all 12 GB are available, but I have 15 GB of swap on an SSD).

Any idea of what can cause this strange behaviour?

On 17 Jul 2015 19:04, "Till Rohrmann" <[hidden email]> wrote:
That is usually nothing to worry about. This just means that the message was sent without specifying a sender. What Akka then does is to use the `/deadLetters` actor as the sender.

What kind of job is it?

Cheers,
Till

Re: Flink deadLetters

Till Rohrmann
Without the logs it is hard to say.

On Sat, Jul 18, 2015 at 11:41 AM, Flavio Pompermaier <[hidden email]> wrote:

The job is quite simple. It reads 10 Parquet directories, extracts some information from the Thrift objects, and generates Tuple3 records. It then applies a project() and a distinct() so that an external service is called only for some of the extracted ids (the external service translates a local id into a global one).
Then there are two coGroup() operations that substitute the obtained global ids into the original Tuple5.
Finally, the job writes the mapping table to a CSV file and the translated Tuple5 records to a file.
The strange thing is that the job finishes in about 1 hour with the default memory settings (in Eclipse), while it hangs if I set -Xmx12g.
I have 16 GB of RAM (maybe not all 12 GB are available, but I have 15 GB of swap on an SSD).

Any idea of what can cause this strange behaviour?


Re: Flink deadLetters

Stephan Ewen
Can you share the logs, so we can check for suspicious error messages?

On Mon, Jul 20, 2015 at 9:34 AM, Till Rohrmann <[hidden email]> wrote:
Without the logs it is hard to say.


Re: Flink deadLetters

Ufuk Celebi
@Stephan: This is, with high probability, a deadlock in the spillable partitions. I'm looking into it (https://issues.apache.org/jira/browse/FLINK-2384).

@Flavio: Can you run your job in forced pipelined mode for the time being and check whether it works?

env.getConfig().setExecutionMode(ExecutionMode.PIPELINED_FORCED);

– Ufuk

On 21 Jul 2015, at 15:05, Stephan Ewen <[hidden email]> wrote:

> Can you share the logs, so we can check for suspicious error messages?

Re: Flink deadLetters

Flavio Pompermaier
Setting env.getConfig().setExecutionMode(ExecutionMode.PIPELINED_FORCED) makes it work :)

On Tue, Jul 21, 2015 at 4:20 PM, Ufuk Celebi <[hidden email]> wrote:
@Stephan: This is, with high probability, a deadlock in the spillable partitions. I'm looking into it (https://issues.apache.org/jira/browse/FLINK-2384).

@Flavio: Can you run your job in forced pipelined mode for the time being and check whether it works?

env.getConfig().setExecutionMode(ExecutionMode.PIPELINED_FORCED);

– Ufuk


Re: Flink deadLetters

Ufuk Celebi
OK, but I need to get back to you soon to test the fix as well. :D I hope that's OK. ;)

On 21 Jul 2015, at 17:45, Flavio Pompermaier <[hidden email]> wrote:

> Setting env.getConfig().setExecutionMode(ExecutionMode.PIPELINED_FORCED) makes it work :)