Hi,
I am getting an error while running some Flink transformations with the DataStream Scala API. The error I get is:
This happens after a couple of minutes, not after 21474835 seconds... I tried different configurations, but no result so far:

    val customConfiguration = new Configuration()
    customConfiguration.setInteger("parallelism", 8)
    customConfiguration.setInteger("jobmanager.heap.mb", 2560)
    customConfiguration.setInteger("taskmanager.heap.mb", 10240)
    customConfiguration.setInteger("taskmanager.numberOfTaskSlots", 8)
    customConfiguration.setInteger("taskmanager.network.numberOfBuffers", 16384)
    customConfiguration.setString("akka.ask.timeout", "10000 s")
    customConfiguration.setString("akka.lookup.timeout", "100 s")
    env = ExecutionEnvironment.createLocalEnvironment(customConfiguration)

Any idea what the problem could be?

Thanks!
Frederick
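[Editor's note: for reference, a minimal sketch of a self-contained local-environment job using the same configuration style, reusing the createLocalEnvironment(customConfiguration) call from the message above (whether the Scala ExecutionEnvironment object exposes this exact overload depends on the Flink version). The keys mirror the ones above; the values, sink path, and job name are illustrative only, not Frederick's actual job:

    import org.apache.flink.api.scala._
    import org.apache.flink.configuration.Configuration

    object TimeoutSketch {
      def main(args: Array[String]): Unit = {
        val customConfiguration = new Configuration()
        // Same keys as in the message above; values here are illustrative only.
        customConfiguration.setInteger("taskmanager.numberOfTaskSlots", 8)
        customConfiguration.setString("akka.ask.timeout", "100 s")

        // Mirrors the createLocalEnvironment(Configuration) call used above.
        val env = ExecutionEnvironment.createLocalEnvironment(customConfiguration)

        val doubled = env.fromElements(1, 2, 3).map(_ * 2)
        doubled.writeAsText("/tmp/timeout-sketch-output") // placeholder sink path
        env.execute("timeout sketch")                     // Robert's question below: is this the call that throws?
      }
    }
]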
Hi Frederick, sorry for the delayed response. I have no idea what the problem could be. Has the exception been thrown from the env.execute() call? Why did you set the akka.ask.timeout to 10k seconds? On Wed, Jan 13, 2016 at 2:13 AM, Frederick Ayala <[hidden email]> wrote:
Hi Robert, Thanks for your reply. I set the akka.ask.timeout to 10k seconds just to see what happened. I tried different values, but none did the trick. My problem was solved by using a machine with more RAM. However, it was not clear that memory was the problem :) Attached are the log and the Scala code of the transformation that I was running. The data file I am processing is around 57M lines (~1.7GB). Let me know if you have any comments or suggestions. Thanks again, Frederick On Fri, Jan 15, 2016 at 10:01 AM, Robert Metzger <[hidden email]> wrote:
Frederick Ayala — Attachments: flink_transformations.txt (1K), netflix_100_sample_05.out (10K)
Hi! Do you get this problem with other Jobs as well? The logs suggest that the JobManager receives the job and starts tasks, but the Client thinks it lost connection. Greetings, Stephan On Fri, Jan 15, 2016 at 10:31 AM, Frederick Ayala <[hidden email]> wrote:
Hi Stephan, Other jobs run fine, but this one is not working on the machine that I was using previously (16GB RAM) [1]. Is there a way to debug the Akka messages to understand what's happening between the JobManager and the Client? I can add logging and send it. Thanks! Fred [1] The failure started to happen when I added the flatMap transformation. Previously I was calling the collect function after the reduceGroup and then using Scala's flatten function. Since this was very slow and failed with the large data file, I used Flink to flatten the list of lists, and now it is faster. On Jan 15, 2016 11:51, "Stephan Ewen" <[hidden email]> wrote:
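[Editor's note: a minimal sketch of the change described in [1]. The names, types, and input data here are hypothetical placeholders (the real transformation is in the attached flink_transformations.txt); it only illustrates collect-then-flatten on the client versus flattening inside Flink with flatMap:

    import org.apache.flink.api.scala._

    object FlattenSketch {
      def main(args: Array[String]): Unit = {
        val env = ExecutionEnvironment.getExecutionEnvironment

        // Hypothetical input: (user, item) pairs reduced to one list per user.
        val pairs = env.fromElements(("u1", "a"), ("u1", "b"), ("u2", "c"))
        val perUserLists: DataSet[List[String]] =
          pairs.groupBy(0).reduceGroup(group => group.map(_._2).toList)

        // Earlier approach: materialize everything on the client, then flatten in Scala.
        // Slow and memory-hungry for a ~57M-line input.
        val flattenedOnClient: Seq[String] = perUserLists.collect().flatten

        // Later approach: flatten inside Flink with flatMap, keeping the data distributed.
        val flattenedInFlink: DataSet[String] = perUserLists.flatMap(list => list)

        flattenedInFlink.print()
      }
    }
]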
You can set Flink’s log level to DEBUG to see the messages exchanged between the client and the JobManager. Cheers, Till On Fri, Jan 15, 2016 at 12:41 PM, Frederick Ayala <[hidden email]> wrote:
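[Editor's note: in a standalone or local setup this is typically done in conf/log4j.properties. The appender name and exact layout vary with the Flink version, so treat this as a sketch rather than the exact shipped file:

    # conf/log4j.properties -- raise the root log level from INFO to DEBUG
    log4j.rootLogger=DEBUG, file

    # Or narrow the extra verbosity to the runtime/Akka side only:
    log4j.logger.org.apache.flink.runtime=DEBUG
    log4j.logger.akka=DEBUG
]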
Frederick,
did you find the problem? I'm having a similar issue (the timeout apparently goes off immediately, despite the error message) with a very simple test job that reads from Kafka, appends a string to the input, and writes it back to Kafka. Other jobs seem to work fine as well.
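[Editor's note: a minimal sketch of the kind of job described above. The topic names, broker address, and the Kafka 0.9 connector classes are assumptions, not Stefano's actual code. The consumer Properties are the part that turns out to matter later in the thread: leaving out a required entry (e.g. bootstrap.servers or group.id) can surface as a confusing timeout rather than a clear configuration error:

    import java.util.Properties
    import org.apache.flink.streaming.api.scala._
    import org.apache.flink.streaming.connectors.kafka.{FlinkKafkaConsumer09, FlinkKafkaProducer09}
    import org.apache.flink.streaming.util.serialization.SimpleStringSchema

    object KafkaAppendSketch {
      def main(args: Array[String]): Unit = {
        val env = StreamExecutionEnvironment.getExecutionEnvironment

        val props = new Properties()
        props.setProperty("bootstrap.servers", "localhost:9092") // assumed broker address
        props.setProperty("group.id", "append-test")             // omitting a required property can
                                                                 // show up as a misleading timeout

        // Read from Kafka, append a string, write back to Kafka.
        val source = env.addSource(
          new FlinkKafkaConsumer09[String]("input-topic", new SimpleStringSchema(), props))
        val appended = source.map(_ + " - processed")
        appended.addSink(
          new FlinkKafkaProducer09[String]("localhost:9092", "output-topic", new SimpleStringSchema()))

        env.execute("kafka append sketch")
      }
    }
]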
Hi Stefano, In my case, running the program on a machine with more RAM solved the problem. Have you tried enabling debugging as Till suggested? Fred On Wed, Mar 16, 2016 at 1:51 PM, stefanobaghino <[hidden email]> wrote:
Hi Frederick,
thanks for helping me. In the end it turned out to be just a missing property in the configuration I gave to Kafka, but the error message was really misleading. Thanks again. Best, Stefano On Wed, Mar 16, 2016 at 4:04 PM, Frederick Ayala <[hidden email]> wrote:
BR, Stefano Baghino