How flink monitor source stream task(Time Trigger) is running?

classic Classic list List threaded Threaded
9 messages Options
Reply | Threaded
Open this post in threaded view
|

How flink monitor source stream task(Time Trigger) is running?

yunfan123
In my understanding, flink just use task heartbeat to monitor taskManager is
running.
If source stream (Time Trigger for XXX)thread is crash, it seems flink can't
recovery from this state?



--
Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/
Reply | Threaded
Open this post in threaded view
|

Re: How flink monitor source stream task(Time Trigger) is running?

Piotr Nowojski
We use Akka's DeathWatch mechanism to detect dead components.

TaskManager failure shouldn’t prevent recovering from state (as long as there are enough task slots).

I’m not sure if I understand what you mean by "source stream thread" crash. If is was some error during performing a checkpoint so that it didn’t complete, Flink will not be able to recover from such incomplete checkpoint.

Could you share us the logs with your issue?

Thanks, Piotrek

> On Sep 29, 2017, at 7:30 AM, yunfan123 <[hidden email]> wrote:
>
> In my understanding, flink just use task heartbeat to monitor taskManager is
> running.
> If source stream (Time Trigger for XXX)thread is crash, it seems flink can't
> recovery from this state?
>
>
>
> --
> Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/

Reply | Threaded
Open this post in threaded view
|

Re: How flink monitor source stream task(Time Trigger) is running?

yunfan123
My source stream means the funciton implement the
org.apache.flink.streaming.api.functions.source.SourceFunction.
My question is how flink know all working thread is alive?
If one working thread that execute the SourceFunction crash, how flink know
this happenned?



--
Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/
Reply | Threaded
Open this post in threaded view
|

Re: How flink monitor source stream task(Time Trigger) is running?

Piotr Nowojski
Any exception thrown by your SourceFunction will be caught by Flink and that will mark a task (that was executing this SourceFuntion) as failed.

If you started some custom threads in your SourceFunction, you have to manually propagate their exceptions to the SourceFunction.

Piotrek

> On Sep 29, 2017, at 2:09 PM, yunfan123 <[hidden email]> wrote:
>
> My source stream means the funciton implement the
> org.apache.flink.streaming.api.functions.source.SourceFunction.
> My question is how flink know all working thread is alive?
> If one working thread that execute the SourceFunction crash, how flink know
> this happenned?
>
>
>
> --
> Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/

Reply | Threaded
Open this post in threaded view
|

Re: How flink monitor source stream task(Time Trigger) is running?

yunfan123
So my question is if this thread crash without throw any Exception.
It seems flink can't handle this state.



--
Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/
Reply | Threaded
Open this post in threaded view
|

Re: How flink monitor source stream task(Time Trigger) is running?

Piotr Nowojski
I am still not sure what do you mean by “thread crash without throw”.

If SourceFunction.run methods returns without an exception Flink assumes that it has cleanly shutdown and that there were simply no more elements to collect/create by this task.
If it continue working, without throwing an exception, but it is in some corrupted state, then there is no way for Flink to know that anything has broken.
If it crash with some segfault, whole TaskManager will crash and that should be detected by Akka.

Piotrek

> On Sep 29, 2017, at 3:05 PM, yunfan123 <[hidden email]> wrote:
>
> So my question is if this thread crash without throw any Exception.
> It seems flink can't handle this state.
>
>
>
> --
> Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/

Reply | Threaded
Open this post in threaded view
|

Re: How flink monitor source stream task(Time Trigger) is running?

yunfan123
Thank you.
"If SourceFunction.run methods returns without an exception Flink assumes
that it has cleanly shutdown and that there were simply no more elements to
collect/create by this task. "
This sentence solve my confusion.




--
Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/
Reply | Threaded
Open this post in threaded view
|

Re: How flink monitor source stream task(Time Trigger) is running?

Piotr Nowojski
You are welcome :)

Piotrek

> On Oct 2, 2017, at 1:19 PM, yunfan123 <[hidden email]> wrote:
>
> Thank you.
> "If SourceFunction.run methods returns without an exception Flink assumes
> that it has cleanly shutdown and that there were simply no more elements to
> collect/create by this task. "
> This sentence solve my confusion.
>
>
>
>
> --
> Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/

Reply | Threaded
Open this post in threaded view
|

Re: How flink monitor source stream task(Time Trigger) is running?

Aljoscha Krettek
I think this might not actually be resolved. What YunFan was referring to in the initial mail is the Thread factory that is used for the processing-time service: https://github.com/apache/flink/blob/5af463a9c0ff62603bc342a78dfd5483d834e8a7/flink-streaming-java/src/main/java/org/apache/flink/streaming/runtime/tasks/StreamTask.java#L223

How likely is it that a ScheduledThreadPoolExecutor simply fails? I don't think we currently have a mechanism that checks whether this service is still alive and would actually start scheduled tasks.

Best,
Aljoscha


On 4. Oct 2017, at 09:27, Piotr Nowojski <[hidden email]> wrote:

You are welcome :)

Piotrek

On Oct 2, 2017, at 1:19 PM, yunfan123 <[hidden email]> wrote:

Thank you. 
"If SourceFunction.run methods returns without an exception Flink assumes
that it has cleanly shutdown and that there were simply no more elements to
collect/create by this task. "
This sentence solve my confusion.




--
Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/