akka.pattern.AskTimeoutException

Frederick Ayala
Hi,

I am getting an error while running some Flink transformations using the DataStream Scala API.

The error I get is:
Timeout while waiting for JobManager answer. Job time exceeded 21474835 seconds
...
Caused by: akka.pattern.AskTimeoutException: Ask timed out on [Actor[akka://flink/user/$a#183984057]] after [21474835000 ms]

This happens after a couple of minutes, not after 21474835 seconds...

I tried different configurations, but with no result so far:
      import org.apache.flink.api.scala.ExecutionEnvironment
      import org.apache.flink.configuration.Configuration

      // Custom settings for the local (embedded) Flink environment
      val customConfiguration = new Configuration()
      customConfiguration.setInteger("parallelism", 8)
      // Memory, slot, and network buffer settings
      customConfiguration.setInteger("jobmanager.heap.mb", 2560)
      customConfiguration.setInteger("taskmanager.heap.mb", 10240)
      customConfiguration.setInteger("taskmanager.numberOfTaskSlots", 8)
      customConfiguration.setInteger("taskmanager.network.numberOfBuffers", 16384)
      // Akka timeouts
      customConfiguration.setString("akka.ask.timeout", "10000 s")
      customConfiguration.setString("akka.lookup.timeout", "100 s")
      val env = ExecutionEnvironment.createLocalEnvironment(customConfiguration)

Any idea what the problem could be?

Thanks!

Frederick
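
A note on the figure in the error message: 21474835000 ms is just 21474835 s expressed in milliseconds, which works out to roughly 248.5 days. That suggests the number is a configured upper bound on the wait rather than time that actually elapsed, although nothing in this thread confirms where the bound comes from. A minimal Scala sketch of the arithmetic (the object and value names are made up for illustration):

```scala
object TimeoutArithmetic {
  // Value copied from the AskTimeoutException message, in milliseconds
  val reportedMillis: Long = 21474835000L

  // Same value in seconds; matches the "Job time exceeded" line
  val reportedSeconds: Long = reportedMillis / 1000

  // Converted to days (86400 seconds per day)
  val reportedDays: Double = reportedSeconds / 86400.0

  def main(args: Array[String]): Unit = {
    println(s"$reportedMillis ms = $reportedSeconds s")
    println(s"that is about ${reportedDays.toInt} days")
  }
}
```

Since the job failed after a couple of minutes, the reported duration cannot be the time actually waited.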

Re: akka.pattern.AskTimeoutException

rmetzger0
Hi Frederick,

Sorry for the delayed response.

I have no idea what the problem could be.
Has the exception been thrown from the env.execute() call?
Why did you set the akka.ask.timeout to 10k seconds?




Re: akka.pattern.AskTimeoutException

Frederick Ayala
Hi Robert,

Thanks for your reply. 

I set the akka.ask.timeout to 10k seconds just to see what happened. I tried different values, but none did the trick.

My problem was solved by using a machine with more RAM. However, it was not clear that memory was the problem :)

Attached are the log and the Scala code of the transformation that I was running.

The data file I am processing is around 57M lines (~1.7GB).

Let me know if you have any comments or suggestions.

Thanks again,

Frederick

 


Attachments: flink_transformations.txt (1K), netflix_100_sample_05.out (10K)

Re: akka.pattern.AskTimeoutException

Stephan Ewen
Hi!

Do you get this problem with other Jobs as well?

The logs suggest that the JobManager receives the job and starts tasks, but the Client thinks it lost connection.

Greetings,
Stephan



Re: akka.pattern.AskTimeoutException

Frederick Ayala

Hi Stephan,

Other jobs run fine, but this one does not work on the machine that I was using previously (16GB RAM) [1].

Is there a way to debug the Akka messages to understand what's happening between the JobManager and the Client? I can add logging and send it.

Thanks!

Fred

[1] The failure started to happen when I added the flatMap transformation. Previously I was calling the collect function after the reduceGroup and then using Scala's flatten function. Since this was very slow and failed with a large data file, I now use Flink to flatten the list of lists, and it is faster.
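
The change described in footnote [1] might look roughly like the sketch below. The attached code is not reproduced in this thread, so the (userId, itemId) record type, the sample data, and the names FlattenSketch and flattenGroups are hypothetical; only groupBy/reduceGroup, flatMap, collect, and flatten come from the description. It assumes the Flink Scala DataSet API (flink-scala) is on the classpath.

```scala
import org.apache.flink.api.scala._

object FlattenSketch {
  // Hypothetical records: (userId, itemId)
  def flattenGroups(ratings: DataSet[(Int, Int)]): DataSet[(Int, Int)] = {
    // reduceGroup emits one list per key, giving a data set of lists
    val grouped: DataSet[List[(Int, Int)]] =
      ratings.groupBy(0).reduceGroup(group => group.toList)

    // Earlier approach (per the footnote): grouped.collect().flatten,
    // which ships every list to the client before flattening.
    // Later approach: flatten inside Flink with flatMap.
    grouped.flatMap(list => list)
  }

  def main(args: Array[String]): Unit = {
    val env = ExecutionEnvironment.getExecutionEnvironment
    val ratings = env.fromElements((1, 10), (1, 11), (2, 10))
    // print() triggers execution and writes the flattened records to stdout
    flattenGroups(ratings).print()
  }
}
```

With the flatMap variant, the intermediate lists stay inside the Flink job and only the final records are brought back to the client.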


Re: akka.pattern.AskTimeoutException

Till Rohrmann

You can set Flink’s log level to DEBUG in the log4j.properties file. Furthermore, you can activate logging of Akka’s life cycle events by setting akka.log.lifecycle.events: true in flink-conf.yaml.

Cheers,
Till
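
The two settings above could look like the fragments below. The key akka.log.lifecycle.events and the two file names come from the message itself; the appender name "file" is an assumption based on the log4j.properties shipped with Flink distributions, so keep whatever appender your file already lists.

```
# log4j.properties: raise the root log level to DEBUG
# ("file" is assumed to be the appender already configured in your file)
log4j.rootLogger=DEBUG, file

# flink-conf.yaml: log Akka life cycle events
akka.log.lifecycle.events: true
```

DEBUG output is verbose, so it is probably worth switching back to INFO once the failing run has been captured.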



Re: akka.pattern.AskTimeoutException

stefanobaghino
Frederick,

Did you find the problem? I'm having a similar issue (the timeout apparently goes off immediately, despite the error message) with a very simple test job that reads from Kafka, appends a string to the input and writes it back to Kafka. Other jobs seem to work fine as well.

Re: akka.pattern.AskTimeoutException

Frederick Ayala
Hi Stefano,

In my case, running the program on a machine with more RAM solved the problem. Have you tried enabling debug logging as Till suggested?

Fred




--
View this message in context: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/akka-pattern-AskTimeoutException-tp4253p5572.html
Sent from the Apache Flink User Mailing List archive at Nabble.com.



--
Frederick Ayala

Re: akka.pattern.AskTimeoutException

stefanobaghino
Hi Frederick,

Thanks for helping me. In the end, it looks like it was just a property missing from the ones I passed to Kafka, but the error message is really misleading. Thanks again.

Best,
Stefano




--
BR,
Stefano Baghino

Software Engineer @ Radicalbit