In my understanding, Flink just uses the task heartbeat to monitor whether the TaskManager is running.
If the source stream (Time Trigger for XXX) thread crashes, it seems Flink can't recover from this state?
-- Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/
We use Akka's DeathWatch mechanism to detect dead components.
A TaskManager failure shouldn't prevent recovery from state (as long as there are enough task slots). I'm not sure I understand what you mean by a "source stream thread" crash. If it was an error while performing a checkpoint, so that the checkpoint didn't complete, Flink will not be able to recover from that incomplete checkpoint. Could you share the logs showing your issue?
Thanks, Piotrek
> On Sep 29, 2017, at 7:30 AM, yunfan123 <[hidden email]> wrote:
By "source stream" I mean the function that implements org.apache.flink.streaming.api.functions.source.SourceFunction.
My question is: how does Flink know that all working threads are alive? If a working thread that executes the SourceFunction crashes, how does Flink know this happened?
Any exception thrown by your SourceFunction will be caught by Flink, which will then mark the task that was executing this SourceFunction as failed.
If you started some custom threads in your SourceFunction, you have to manually propagate their exceptions to the SourceFunction.
Piotrek
> On Sep 29, 2017, at 2:09 PM, yunfan123 <[hidden email]> wrote:
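The manual propagation described above could look roughly like the following. This is only a sketch under stated assumptions: `PropagatingSource`, `fetchLoop`, and the `Consumer<String>` used in place of Flink's `SourceContext` are hypothetical stand-ins, so that the example compiles without Flink on the classpath. A real implementation would put the same logic into `SourceFunction.run` and `SourceFunction.cancel`.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Consumer;

/**
 * Sketch of a source that reads on a custom thread and rethrows that
 * thread's failure from run(). Shaped like a SourceFunction (run/cancel),
 * but deliberately free of Flink dependencies; all names are hypothetical.
 */
class PropagatingSource {
    private volatile boolean running = true;

    // Holds the first failure seen on the worker thread, if any.
    private final AtomicReference<Throwable> workerError = new AtomicReference<>();

    /** Stand-in for SourceFunction.run(SourceContext). */
    void run(Consumer<String> out) throws Exception {
        Thread worker = new Thread(() -> {
            try {
                fetchLoop(out);
            } catch (Throwable t) {
                // Record the failure instead of letting the thread die silently.
                workerError.compareAndSet(null, t);
            }
        }, "custom-fetcher");
        worker.start();
        worker.join(); // returns once the worker has exited, for whatever reason

        // Rethrow on the thread that called run(), where the caller can see it.
        Throwable failure = workerError.get();
        if (failure != null) {
            throw new Exception("custom-fetcher thread failed", failure);
        }
    }

    /** Stand-in for SourceFunction.cancel(). */
    void cancel() {
        running = false;
    }

    // Hypothetical fetch loop that fails after emitting three records.
    private void fetchLoop(Consumer<String> out) {
        int i = 0;
        while (running) {
            if (i == 3) {
                throw new IllegalStateException("connection lost");
            }
            out.accept("record-" + i++);
        }
    }
}
```

Without the `workerError` check, `run` would return normally after the worker died, and the caller would have no way to tell the failure from a clean finish.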
So my question is: what if this thread crashes without throwing any exception?
It seems Flink can't handle this state.
I am still not sure what you mean by "thread crash without throw".
If the SourceFunction.run method returns without an exception, Flink assumes that it has shut down cleanly and that there were simply no more elements for this task to collect/create. If it continues working without throwing an exception but is in some corrupted state, there is no way for Flink to know that anything has broken. If it crashes with a segfault, the whole TaskManager will crash, and that should be detected by Akka.
Piotrek
> On Sep 29, 2017, at 3:05 PM, yunfan123 <[hidden email]> wrote:
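The distinction between "returned" and "threw" can be illustrated with a small sketch. It is not Flink's actual task code: `TaskOutcomeSketch`, its `State` enum, and the use of `Callable<Void>` as a stand-in for the source's `run` method are all assumptions made for the example.

```java
import java.util.concurrent.Callable;

/** Hypothetical sketch of how a caller can interpret the way run() ended. */
class TaskOutcomeSketch {
    enum State { FINISHED, FAILED }

    /** Runs the source body and maps how it ended onto a task state. */
    static State execute(Callable<Void> sourceRun) {
        try {
            sourceRun.call();
            // A normal return is read as "no more elements, clean shutdown".
            return State.FINISHED;
        } catch (Throwable t) {
            // Only a thrown exception tells the caller that something broke.
            return State.FAILED;
        }
    }
}
```

Under this model, a `run` body that swallows an error and returns is classified as `FINISHED`, exactly like a source that legitimately ran out of data. A body that keeps looping in a corrupted state never reaches either branch.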
Thank you.
"If the SourceFunction.run method returns without an exception, Flink assumes that it has shut down cleanly and that there were simply no more elements for this task to collect/create."
This sentence resolved my confusion.
You are welcome :)
Piotrek
> On Oct 2, 2017, at 1:19 PM, yunfan123 <[hidden email]> wrote:
I think this might not actually be resolved. What YunFan was referring to in the initial mail is the Thread factory that is used for the processing-time service: https://github.com/apache/flink/blob/5af463a9c0ff62603bc342a78dfd5483d834e8a7/flink-streaming-java/src/main/java/org/apache/flink/streaming/runtime/tasks/StreamTask.java#L223
Best, Aljoscha
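For threads that a framework creates itself through a thread factory, plain Java offers one general way to notice a thread that dies from an exception: an uncaught-exception handler installed by the factory. The sketch below shows only that general technique. `ReportingThreadFactory` and `firstFailure` are hypothetical names, and nothing here is taken from the linked StreamTask code or claims to describe what Flink does.

```java
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical thread factory that records the first uncaught failure. */
class ReportingThreadFactory implements ThreadFactory {
    private final String namePrefix;
    private final AtomicReference<Throwable> firstFailure = new AtomicReference<>();
    private int counter = 0;

    ReportingThreadFactory(String namePrefix) {
        this.namePrefix = namePrefix;
    }

    @Override
    public synchronized Thread newThread(Runnable r) {
        Thread t = new Thread(r, namePrefix + "-" + counter++);
        t.setDaemon(true);
        // Invoked on the dying thread when an exception escapes its Runnable.
        t.setUncaughtExceptionHandler(
                (thread, error) -> firstFailure.compareAndSet(null, error));
        return t;
    }

    /** Returns the first recorded failure, or null if no thread has failed. */
    Throwable firstFailure() {
        return firstFailure.get();
    }
}
```

One caveat with this approach: tasks handed to `ExecutorService.submit` or scheduled on a `ScheduledExecutorService` are wrapped in a `Future`, which captures their exceptions, so those failures never reach the handler and have to be caught inside the task itself.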
On 4. Oct 2017, at 09:27, Piotr Nowojski <[hidden email]> wrote: