Is there a lifecycle listener that gets notified when a topology starts/stops on a task manager


Stephen Connolly
We are using a 3rd party library that allocates some resources in one of our topologies.

Is there a listener or something that gets notified when the topology starts / stops running in the Task Manager's JVM?

The 3rd party library uses a singleton, so I need to initialize the singleton when the first task is started on the task manager and clear out the singleton when the last task is stopped, so that the topology classloader can be unloaded.

I had thought it could all be done from the Topology's main method, but after much head-banging we were able to identify that *when run on a distributed cluster* the main method is not invoked to start the topology for each task manager.

Re: Is there a lifecycle listener that gets notified when a topology starts/stops on a task manager

Stephen Connolly
Currently the best I can see is to make *everything* a Rich... and hook into the open and close methods... but it feels very ugly.



On Mon 23 Sep 2019 at 15:45, Stephen Connolly <[hidden email]> wrote:
We are using a 3rd party library that allocates some resources in one of our topologies.

Is there a listener or something that gets notified when the topology starts / stops running in the Task Manager's JVM?

The 3rd party library uses a singleton, so I need to initialize the singleton when the first task is started on the task manager and clear out the singleton when the last task is stopped in order to allow the topology classloader to be unloadable.

I had thought it could all be done from the Topology's main method, but after much head-banging we were able to identify that *when run on a distributed cluster* the main method is not invoked to start the topology for each task manager.

Re: Is there a lifecycle listener that gets notified when a topology starts/stops on a task manager

Dian Fu
AFAIK, RichFunction is the only option for this purpose. It is designed for life-cycle management of functions.
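
One way to keep the open/close approach manageable is to put the first-task/last-task bookkeeping into a small reference-counting helper instead of repeating it in every function. The following is only a sketch in plain Java, not a Flink API: `SharedResourceGuard`, `acquire()` and `release()` are hypothetical names, and the two `Runnable`s stand in for whatever initializes and clears the 3rd party singleton. It assumes that all tasks of the job on one task manager are loaded by the same user-code classloader and therefore share the static guard.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Hypothetical helper (not part of Flink): counts the tasks in this JVM that
 * currently use a shared resource. The first user triggers initialization,
 * the last user to leave triggers cleanup.
 */
final class SharedResourceGuard {
    private final Runnable init;
    private final Runnable cleanup;
    private int users; // guarded by "this"

    SharedResourceGuard(Runnable init, Runnable cleanup) {
        this.init = init;
        this.cleanup = cleanup;
    }

    /** Intended to be called from open(): initializes on the first user. */
    synchronized void acquire() {
        if (users == 0) {
            init.run(); // if this throws, the count stays at zero
        }
        users++;
    }

    /** Intended to be called from close(): cleans up after the last user. */
    synchronized void release() {
        if (users == 0) {
            throw new IllegalStateException("release() without matching acquire()");
        }
        users--;
        if (users == 0) {
            cleanup.run();
        }
    }
}

/** Simulates two tasks on the same task manager opening and closing. */
final class GuardDemo {
    static final List<String> EVENTS = new ArrayList<>();

    // Placeholder actions; real code would initialize / clear the singleton here.
    static final SharedResourceGuard GUARD = new SharedResourceGuard(
            () -> EVENTS.add("init"),
            () -> EVENTS.add("cleanup"));

    public static void main(String[] args) {
        GUARD.acquire(); // first task opens: runs init
        GUARD.acquire(); // second task opens: no-op
        GUARD.release(); // first task closes: resource still in use
        GUARD.release(); // last task closes: runs cleanup
        System.out.println(EVENTS); // prints [init, cleanup]
    }
}
```

Each rich function would call `GUARD.acquire()` in `open` and `GUARD.release()` in `close`, so the singleton is cleared only after the last task on that task manager has closed.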

Regards,
Dian

On 24 Sep 2019, at 2:13 AM, Stephen Connolly <[hidden email]> wrote:

Currently the best I can see is to make *everything* a Rich... and hook into the open and close methods... but it feels very ugly.




Re: Is there a lifecycle listener that gets notified when a topology starts/stops on a task manager

Zhu Zhu
In reply to this post by Stephen Connolly
Hi Stephen,

I think disposing of static components in the closing stage of a task is required.
This is because your code (operators/UDFs) is part of the task, so it can only be executed while the task has not yet been disposed.

Thanks,
Zhu Zhu

On Tue, 24 Sep 2019 at 2:13 AM, Stephen Connolly <[hidden email]> wrote:
Currently the best I can see is to make *everything* a Rich... and hook into the open and close methods... but it feels very ugly.




Re: Is there a lifecycle listener that gets notified when a topology starts/stops on a task manager

Stephen Connolly
I have created https://issues.apache.org/jira/browse/FLINK-14184 as a proposal to improve Flink in this specific area.

On Tue, 24 Sep 2019 at 03:23, Zhu Zhu <[hidden email]> wrote:
Hi Stephen,

I think disposing of static components in the closing stage of a task is required.
This is because your code (operators/UDFs) is part of the task, so it can only be executed while the task has not yet been disposed.

Thanks,
Zhu Zhu
