Support for multiple HDFS


vijikarthi
Hello,

Is it possible for a Flink cluster to use multiple HDFS repositories (HDFS-1 for managing the Flink state backend, HDFS-2 for syncing results from the user job)?

The scenario can be viewed in the context of running some jobs that are meant to push the results to an archive repository (cold storage).

Since the Hadoop configuration is static, I think this is hard to achieve, but I could be wrong.

Please share any thoughts.

Regards
Vijay

Re: Support for multiple HDFS

Ted Yu
Would HDFS-6584 help with your use case?

On Wed, Aug 23, 2017 at 11:00 AM, Vijay Srinivasaraghavan <[hidden email]> wrote:

> Hello,
> Is it possible for a Flink cluster to use multiple HDFS repository (HDFS-1
> for managing Flink state backend, HDFS-2 for syncing results from user
> job)?
> The scenario can be viewed in the context of running some jobs that are
> meant to push the results to an archive repository (cold storage).
> Since the hadoop configuration is static, I am thinking it is hard to
> achieve this but I could be wrong.
> Please share any thoughts.
> Regards
> Vijay


Re: Support for multiple HDFS

vijikarthi
Hi Ted,

I believe HDFS-6584 is more of an HDFS feature that supports the archival use case through policy configuration.

My requirement is different: I have two distinct, independent HCFS file systems. The Flink job decides which one to use as its sink, while the Flink infrastructure is configured by default with one of them as the state backend store.
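
To make the setup concrete, here is a minimal sketch of the addressing idea only. It does not call Flink or Hadoop. It assumes each location is given as a fully qualified URI, so that the scheme and authority identify the target file system, and that a path without them falls back to the single configured default. The host names, ports, paths, and the helper `file_system_key` are all hypothetical.

```python
from urllib.parse import urlparse

# Hypothetical locations: host names, ports, and paths are made up for illustration.
STATE_BACKEND_URI = "hdfs://hdfs-1.example.com:8020/flink/checkpoints"
ARCHIVE_SINK_URI = "hdfs://hdfs-2.example.com:8020/archive/results"


def file_system_key(path_uri, default_fs):
    """Return the (scheme, authority) pair that identifies the target file system.

    Assumption: a path that lacks a scheme and authority is resolved against
    the single default file system, as a static configuration would imply.
    """
    parsed = urlparse(path_uri)
    if parsed.scheme and parsed.netloc:
        return parsed.scheme, parsed.netloc
    fallback = urlparse(default_fs)
    return fallback.scheme, fallback.netloc


# The state backend and the job's sink resolve to different file systems.
state_fs = file_system_key(STATE_BACKEND_URI, STATE_BACKEND_URI)
sink_fs = file_system_key(ARCHIVE_SINK_URI, STATE_BACKEND_URI)
print(state_fs, sink_fs)
```

Whether a single static Hadoop configuration can actually serve both clusters (for example with differing security or HA settings) is exactly the open question here; the sketch only shows how the two targets would be told apart.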

Hope this helps.

Regards
Vijay


On Wednesday, August 23, 2017 11:06 AM, Ted Yu <[hidden email]> wrote:

> Would HDFS-6584 help with your use case?
