Question about sharing resource among slots with a TM

Vishal Santoshi
We use HBase extensively, and our general pattern is to acquire a Connection in the open() method of a RichFunction and close it in the close() method. That implies that with a parallelism of n there will be n HBase Connections. An HBase Connection is inherently thread safe (http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Connection.html) and is a pretty heavy object to begin with, so it makes sense to share one Connection across the slots of a single TM. We could do that through a static singleton pattern, but I was wondering whether there is an established paradigm for sharing a resource ....



Re: Question about sharing resource among slots with a TM

Vishal Santoshi
Anyone?

On Wed, Sep 26, 2018 at 9:15 AM Vishal Santoshi <[hidden email]> wrote:
We use HBase extensively, and our general pattern is to acquire a Connection in the open() method of a RichFunction and close it in the close() method. That implies that with a parallelism of n there will be n HBase Connections. An HBase Connection is inherently thread safe (http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Connection.html) and is a pretty heavy object to begin with, so it makes sense to share one Connection across the slots of a single TM. We could do that through a static singleton pattern, but I was wondering whether there is an established paradigm for sharing a resource ....



Re: Question about sharing resource among slots with a TM

Hequn Cheng
Hi Vishal,

Yes, we can define a static connection and reuse it, or implement a connection pool. Maybe we can also raise the question in the HBase community and see whether there are better ways.

Best, Hequn
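
As an illustration of the pooling option mentioned above, here is a minimal sketch of a fixed-size pool in plain Java. SimplePool, tryBorrow and giveBack are hypothetical names, and T merely stands in for an expensive client object; nothing here is Flink or HBase API. For a thread-safe object such as an HBase Connection a single shared instance is probably enough, so a pool matters mainly for clients that are not thread safe.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.function.Supplier;

/** Hypothetical fixed-size pool; T stands in for an expensive client object. */
final class SimplePool<T> {
    // Objects that are currently free to be borrowed.
    private final BlockingQueue<T> idle;

    /** Eagerly creates "size" objects with the given factory. */
    SimplePool(int size, Supplier<T> factory) {
        idle = new ArrayBlockingQueue<>(size);
        for (int i = 0; i < size; i++) {
            idle.add(factory.get());
        }
    }

    /** Returns a free object, or null if every object is currently borrowed. */
    T tryBorrow() {
        return idle.poll();
    }

    /** Hands a borrowed object back; silently ignored if the pool is already full. */
    void giveBack(T resource) {
        idle.offer(resource);
    }

    /** Number of objects that are free right now. */
    int idleCount() {
        return idle.size();
    }
}
```

Each caller borrows an object, uses it, and gives it back in a finally block so that a failure does not leak the object.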


On Thu, Sep 27, 2018 at 12:40 AM Vishal Santoshi <[hidden email]> wrote:
Anyone?

Re: Question about sharing resource among slots with a TM

Kostas Kloudas
Hi Vishal,

Currently there is no way to share (user-defined) resources between tasks on the same TM.
So I suppose that a singleton is the best way to go for now.

Cheers,
Kostas
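
One way the singleton suggested above might look is a static, reference-counted holder: the first open() creates the resource, later calls reuse it, and the last close() shuts it down. This is only a sketch under assumptions: SharedResource, acquire and release are made-up names, and T stands in for a thread-safe client such as an HBase Connection.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.function.Supplier;

/**
 * Hypothetical JVM-wide holder for one shared, thread-safe resource.
 * T stands in for something like an HBase Connection.
 */
final class SharedResource<T extends Closeable> {
    private final Supplier<T> factory;
    private T instance; // guarded by "this"
    private int users;  // guarded by "this"

    SharedResource(Supplier<T> factory) {
        this.factory = factory;
    }

    /** Meant for open(): creates the resource on first use, then reuses it. */
    synchronized T acquire() {
        if (users == 0) {
            instance = factory.get();
        }
        users++;
        return instance;
    }

    /** Meant for close(): closes the resource once the last user is done. */
    synchronized void release() {
        if (users == 0) {
            throw new IllegalStateException("release() without matching acquire()");
        }
        users--;
        if (users == 0) {
            T toClose = instance;
            instance = null;
            try {
                toClose.close();
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
    }

    /** Number of callers that currently hold the resource. */
    synchronized int userCount() {
        return users;
    }
}
```

A job class could keep one such holder in a static final field, call acquire() in open() and release() in close(). With HBase the factory would presumably wrap ConnectionFactory.createConnection(...) and convert its IOException to an unchecked exception.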

On Sep 27, 2018, at 3:43 AM, Hequn Cheng <[hidden email]> wrote:

Hi Vishal,

Yes, we can define a static connection and reuse it, or implement a connection pool. Maybe we can also raise the question in the HBase community and see whether there are better ways.

Best, Hequn


Re: Question about sharing resource among slots with a TM

Vishal Santoshi
Makes sense. An additional query: how does Flink handle class loading? Is there a separate class loader per job? In essence, if I have a static member in a class in a job, it would be highly inappropriate for that static member to be available to another job.
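
The JVM rule behind this question is that a static field belongs to a class as loaded by one particular class loader, so the same class loaded by two loaders has two independent copies of its statics. The sketch below demonstrates only that rule in plain Java; ClassLoaderIsolationDemo, IsolatingLoader and StaticCounter are hypothetical stand-ins, not Flink classes. As far as I know, Flink loads classes from a submitted job jar through a per-job user-code class loader, while classes on the TM's own classpath (for example in lib/) are shared by all jobs, but please check that against the documentation for your Flink version and deployment mode.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.function.IntSupplier;

final class ClassLoaderIsolationDemo {

    /** Stand-in for a class holding a static singleton; counts calls in a static field. */
    public static class StaticCounter implements IntSupplier {
        private static int calls = 0;

        @Override
        public int getAsInt() {
            return ++calls;
        }
    }

    /** Defines its own copy of StaticCounter instead of delegating to the parent loader. */
    static final class IsolatingLoader extends ClassLoader {
        private static final String TARGET = StaticCounter.class.getName();

        IsolatingLoader(ClassLoader parent) {
            super(parent);
        }

        @Override
        protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
            if (!TARGET.equals(name)) {
                return super.loadClass(name, resolve); // everything else is shared
            }
            synchronized (getClassLoadingLock(name)) {
                Class<?> loaded = findLoadedClass(name);
                if (loaded == null) {
                    byte[] bytes = readClassBytes(name);
                    loaded = defineClass(name, bytes, 0, bytes.length);
                }
                return loaded;
            }
        }

        /** Reads the compiled class file through the parent loader's resources. */
        private byte[] readClassBytes(String name) throws ClassNotFoundException {
            String path = name.replace('.', '/') + ".class";
            try (InputStream in = getParent().getResourceAsStream(path)) {
                if (in == null) {
                    throw new ClassNotFoundException(name);
                }
                return in.readAllBytes();
            } catch (IOException e) {
                throw new ClassNotFoundException(name, e);
            }
        }
    }

    /** Creates a StaticCounter whose class is defined by a fresh, isolated loader. */
    static IntSupplier newIsolatedCounter() {
        ClassLoader parent = ClassLoaderIsolationDemo.class.getClassLoader();
        try {
            Class<?> cls = new IsolatingLoader(parent).loadClass(StaticCounter.class.getName());
            return (IntSupplier) cls.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Two calls to newIsolatedCounter() return counters that each start counting at 1, because each loader defined its own StaticCounter with its own static field.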

On Thu, Sep 27, 2018, 8:15 AM Kostas Kloudas <[hidden email]> wrote:
Hi Vishal,

Currently there is no way to share (user-defined) resources between tasks on the same TM.
So I suppose that a singleton is the best way to go for now.

Cheers,
Kostas

Re: Question about sharing resource among slots with a TM

Vishal Santoshi

On Thu, Sep 27, 2018 at 11:04 AM Vishal Santoshi <[hidden email]> wrote:
Makes sense. An additional query: how does Flink handle class loading? Is there a separate class loader per job? In essence, if I have a static member in a class in a job, it would be highly inappropriate for that static member to be available to another job.
