(DEPRECATED) Apache Flink User Mailing List archive.

Support for multiple HDFS

vijikarthi

Support for multiple HDFS

Hello,

Is it possible for a Flink cluster to use multiple HDFS repositories (HDFS-1 for managing the Flink state backend, HDFS-2 for syncing results from user jobs)?

The scenario can be viewed in the context of running some jobs that are meant to push the results to an archive repository (cold storage).

Since the Hadoop configuration is static, I suspect this is hard to achieve, but I could be wrong.

Please share any thoughts.

Regards

Vijay

Ted Yu

Re: Support for multiple HDFS

Would HDFS-6584 help with your use case?

vijikarthi

Re: Support for multiple HDFS

Hi Ted,

I believe HDFS-6584 is more of an HDFS feature that supports the archive use case through policy configuration.

My situation is different: I have two distinct, independent HCFS file systems. The Flink job decides which one to use for its sink, while the Flink infrastructure is configured by default with one of them as the state backend store.
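To make the setup concrete, here is a minimal sketch of how a job might name the second file system explicitly instead of relying on the default one. It only uses `java.net.URI` from the JDK. The host names `namenode-1` and `namenode-2`, the port `8020`, the path `/archive/results`, and the class `SinkPathChooser` are all hypothetical placeholders. Whether Flink then picks the right cluster depends on it resolving the file system from the URI's scheme and authority, which is an assumption here and not something confirmed in this thread.

```java
import java.net.URI;

public class SinkPathChooser {
    // Hypothetical addresses: HDFS-1 holds Flink state, HDFS-2 is the archive.
    static final URI STATE_FS = URI.create("hdfs://namenode-1:8020");
    static final URI ARCHIVE_FS = URI.create("hdfs://namenode-2:8020");

    /** Resolves an absolute path against the chosen file system root. */
    static URI qualify(URI fileSystem, String absolutePath) {
        // The result keeps the scheme and authority of fileSystem
        // and takes its path from absolutePath.
        return fileSystem.resolve(absolutePath);
    }

    public static void main(String[] args) {
        // The job decides per sink which file system to target.
        URI sinkPath = qualify(ARCHIVE_FS, "/archive/results");
        System.out.println(sinkPath);
    }
}
```

The resulting string could then be handed to a sink that accepts a path, for example `writeAsText(sinkPath.toString())`, while the state backend keeps pointing at the default file system.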

Hope this helps.

Regards

Vijay
