 
	
					
		
	
					| Hi to all, I have this strange error in my job and I don't know what's going on. What can I do? The full exception is: The slot in which the task was scheduled has been killed (probably loss of TaskManager). 	at org.apache.flink.runtime.instance.SimpleSlot.cancel(SimpleSlot.java:98) 	at org.apache.flink.runtime.jobmanager.scheduler.SlotSharingGroupAssignment.releaseSimpleSlot(SlotSharingGroupAssignment.java:335) 	at org.apache.flink.runtime.jobmanager.scheduler.SlotSharingGroupAssignment.releaseSharedSlot(SlotSharingGroupAssignment.java:319) 	at org.apache.flink.runtime.instance.SharedSlot.releaseSlot(SharedSlot.java:106) 	at org.apache.flink.runtime.instance.Instance.markDead(Instance.java:151) 	at org.apache.flink.runtime.instance.InstanceManager.unregisterTaskManager(InstanceManager.java:182) 	at org.apache.flink.runtime.jobmanager.JobManager$$anonfun$receiveWithLogMessages$1.applyOrElse(JobManager.scala:435) 	at scala.runtime.AbstractPartialFunction$mcVL$sp.apply$mcVL$sp(AbstractPartialFunction.scala:33) 	at scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:33) 	at scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:25) 	at org.apache.flink.runtime.ActorLogMessages$$anon$1.apply(ActorLogMessages.scala:37) 	at org.apache.flink.runtime.ActorLogMessages$$anon$1.apply(ActorLogMessages.scala:30) 	at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:118) 	at org.apache.flink.runtime.ActorLogMessages$$anon$1.applyOrElse(ActorLogMessages.scala:30) 	at akka.actor.Actor$class.aroundReceive(Actor.scala:465) 	at org.apache.flink.runtime.jobmanager.JobManager.aroundReceive(JobManager.scala:94) 	at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516) 	at akka.actor.ActorCell.invoke(ActorCell.scala:487) 	at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:254) 	at akka.dispatch.Mailbox.run(Mailbox.scala:221) 	at akka.dispatch.Mailbox.exec(Mailbox.scala:231) 	at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) 	at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339) 	at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) 	at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) | 
 
	
					
		
	
					| This means that the TaskManager was lost. The JobManager can no longer reach the TaskManager and consists all tasks executing ob the TaskManager as failed. Have a look at the TaskManager log, it should describe why the TaskManager failed. Am 15.04.2015 14:45 schrieb "Flavio Pompermaier" <[hidden email]>: 
 | 
 
	
					
		
	
					| Yes , sorry for that..I found it somewhere in the logs..the problem was that the program didn't die immediately but was somehow hanging and I discovered the source of the problem only running the program on a subset of the data. Thnks for the support, Flavio On Wed, Apr 15, 2015 at 2:56 PM, Stephan Ewen <[hidden email]> wrote: 
 | 
 
	
					
		
	
					| I witnessed a similar issue yesterday on a simple job (single task chain, no shuffles) with a release-0.9 based fork. 2015-04-15 14:59 GMT+02:00 Flavio Pompermaier <[hidden email]>: 
 | 
 
	
					
		
	
					| Something similar in flink-0.10-SNAPSHOT: 06/29/2015 10:33:46 CHAIN Join(Join at main(TriangleCount.java:79)) -> Combine (Reduce at main(TriangleCount.java:79))(222/224) switched to FAILED java.lang.Exception: The slot in which the task was executed has been released. Probably loss of TaskManager 57c67d938c9144bec5ba798bb8ebe636 @ wally025 - 8 slots - URL: akka.tcp://flink@130.149.249.35:56135/user/taskmanager at org.apache.flink.runtime.instance.SimpleSlot.releaseSlot(SimpleSlot.java:151) at org.apache.flink.runtime.instance.SlotSharingGroupAssignment.releaseSharedSlot(SlotSharingGroupAssignment.java:547) at org.apache.flink.runtime.instance.SharedSlot.releaseSlot(SharedSlot.java:119) at org.apache.flink.runtime.instance.Instance.markDead(Instance.java:154) at org.apache.flink.runtime.instance.InstanceManager.unregisterTaskManager(InstanceManager.java:182) at org.apache.flink.runtime.jobmanager.JobManager$$anonfun$receiveWithLogMessages$1.applyOrElse(JobManager.scala:421) at scala.runtime.AbstractPartialFunction$mcVL$sp.apply$mcVL$sp(AbstractPartialFunction.scala:33) at scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:33) at scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:25) at org.apache.flink.runtime.ActorLogMessages$$anon$1.apply(ActorLogMessages.scala:36) at org.apache.flink.runtime.ActorLogMessages$$anon$1.apply(ActorLogMessages.scala:29) at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:118) at org.apache.flink.runtime.ActorLogMessages$$anon$1.applyOrElse(ActorLogMessages.scala:29) at akka.actor.Actor$class.aroundReceive(Actor.scala:465) at org.apache.flink.runtime.jobmanager.JobManager.aroundReceive(JobManager.scala:92) at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516) at akka.actor.dungeon.DeathWatch$class.receivedTerminated(DeathWatch.scala:46) at akka.actor.ActorCell.receivedTerminated(ActorCell.scala:369) at akka.actor.ActorCell.autoReceiveMessage(ActorCell.scala:501) at akka.actor.ActorCell.invoke(ActorCell.scala:486) at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:254) at akka.dispatch.Mailbox.run(Mailbox.scala:221) at akka.dispatch.Mailbox.exec(Mailbox.scala:231) at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339) at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) 06/29/2015 10:33:46 Job execution switched to status FAILING. On Mon, Jun 29, 2015 at 1:08 PM, Alexander Alexandrov <[hidden email]> wrote: 
 | 
 
	
					
		
	
					| Do you have the JobManager and TaskManager logs of the corresponding TM, by any chance? On Mon, Jun 29, 2015 at 8:12 PM, Andra Lungu <[hidden email]> wrote: 
 | 
 
	
					
		
	
					| Hey Till,I managed to reproduce the bug; The logs are in the corresponding JIRA [hopefully I got the right ones :)]: FLINK-2299 On Tue, Jun 30, 2015 at 10:51 AM, Till Rohrmann <[hidden email]> wrote: 
 | 
| Free forum by Nabble | Edit this page | 
 
	

 
	
	
		
