how to override s3 key config in flink job

Tony Wei
Hi,

Is there any way to provide s3.access-key and s3.secret-key in a Flink application, instead of setting
them in flink-conf.yaml?

In our use case, my team provides a Flink standalone cluster to users. However, we don't want
every user to use the same S3 bucket as the filesystem for storing checkpoints. So we want to know
whether it is feasible to let users provide their own checkpoint path and the corresponding AWS keys
to access their own S3 bucket.

If not, could you explain why it doesn't work currently? And could it become a new
feature?

Thanks in advance for your help.

Best,
Tony Wei

Re: how to override s3 key config in flink job

Tony Wei
Hi,

Is there anyone who can answer this?

Thanks,
Tony Wei

Tony Wei <[hidden email]> wrote on Tue, Nov 20, 2018 at 7:39 PM:
Hi,

Is there any way to provide s3.access-key and s3.secret-key in a Flink application, instead of setting
them in flink-conf.yaml?

In our use case, my team provides a Flink standalone cluster to users. However, we don't want
every user to use the same S3 bucket as the filesystem for storing checkpoints. So we want to know
whether it is feasible to let users provide their own checkpoint path and the corresponding AWS keys
to access their own S3 bucket.

If not, could you explain why it doesn't work currently? And could it become a new
feature?

Thanks in advance for your help.

Best,
Tony Wei

Re: how to override s3 key config in flink job

yinhua.dai
Did you try "-Dkey=value"?

Re: how to override s3 key config in flink job

Tony Wei
Hi yinhua,

I haven't tried this yet, but I didn't see this option in either the Flink CLI tool or the REST API.
Could you please provide more details about how to use this option when submitting a Flink
application?

BTW, we are using a standalone session cluster, not a YARN session cluster. And I need
to submit different Flink applications with different S3 keys for the Flink Presto S3 filesystem.

Any other suggestions are also welcome. Thank you.

Best,
Tony Wei

yinhua.dai <[hidden email]> wrote on Tue, Nov 27, 2018 at 11:37 AM:
Did you try "-Dkey=value"?



--
Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/

Re: how to override s3 key config in flink job

yinhua.dai
Which Flink version are you using?
I know how it works on YARN, but I'm not sure about standalone mode.

Re: how to override s3 key config in flink job

Tony Wei
Hi yinhua,

Our Flink version is 1.6.0.

Best,
Tony Wei

yinhua.dai <[hidden email]> wrote on Tue, Nov 27, 2018 at 2:32 PM:
Which Flink version are you using?
I know how it works on YARN, but I'm not sure about standalone mode.

Re: how to override s3 key config in flink job

yinhua.dai
It might be difficult, as the task manager and job manager are pre-started
in session mode.

It seems that the Flink HTTP server will always use the configuration that you
specified when you started your Flink cluster, i.e. via start-cluster.sh. I can't
find a way to override it.

Re: how to override s3 key config in flink job

Andrey Zagrebin
Hi Tony,

File system factories are class-loaded in running JVMs of task executors.
That is why their configured objects are shared by different Flink jobs.
It is not possible to change their options per created file system and per job at the moment.
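
To illustrate the point above, here is a minimal toy sketch in plain Java. It is not Flink code: `ToyS3Factory`, `ToyFileSystem`, and the key value `CLUSTER_KEY` are hypothetical names made up for the example. It only models the idea that a factory configured once per JVM hands the same credentials to every job, because callers pass nothing but a URI.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model only: none of these classes are Flink's real ones.
class SharedFactoryDemo {

    /** Hypothetical file system handle that remembers which key it was built with. */
    static final class ToyFileSystem {
        final String uri;
        final String accessKey;

        ToyFileSystem(String uri, String accessKey) {
            this.uri = uri;
            this.accessKey = accessKey;
        }
    }

    /** Hypothetical factory holding one configuration for the whole JVM. */
    static final class ToyS3Factory {
        private static Map<String, String> clusterConfig = new HashMap<>();

        // Called once when the process starts, e.g. with values from flink-conf.yaml.
        static void configure(Map<String, String> config) {
            clusterConfig = new HashMap<>(config);
        }

        // The caller passes only a URI, so there is nowhere to put per-job keys.
        static ToyFileSystem get(String uri) {
            return new ToyFileSystem(uri, clusterConfig.get("s3.access-key"));
        }
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put("s3.access-key", "CLUSTER_KEY");
        ToyS3Factory.configure(conf);

        // Two jobs with different buckets still get the same credentials.
        ToyFileSystem jobA = ToyS3Factory.get("s3://bucket-a/checkpoints");
        ToyFileSystem jobB = ToyS3Factory.get("s3://bucket-b/checkpoints");
        System.out.println(jobA.accessKey.equals(jobB.accessKey)); // prints "true"
    }
}
```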

This could be changed, e.g. for S3, by passing a "rewriting config" to the file system factory "get" method,
but users do not usually call this method directly in user-facing components such as checkpointing or the file sink. The user API is currently mainly the file system URI string, without any specific config.
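
As a rough sketch of what such a "rewriting config" could look like, the toy Java below layers per-call overrides on top of the cluster-wide settings. This is a hypothetical API shape, not an existing Flink interface: `ToyS3Factory`, its two-argument `get`, and the key values are all assumptions for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model only: a hypothetical API shape, not an existing Flink interface.
class RewritingConfigDemo {

    /** Hypothetical factory whose get method accepts per-call overrides. */
    static final class ToyS3Factory {
        private final Map<String, String> clusterConfig;

        ToyS3Factory(Map<String, String> clusterConfig) {
            this.clusterConfig = new HashMap<>(clusterConfig);
        }

        // Returns the settings a file system for this URI would be built with:
        // cluster-wide values first, then the caller's overrides on top.
        Map<String, String> get(String uri, Map<String, String> rewritingConfig) {
            Map<String, String> effective = new HashMap<>(clusterConfig);
            effective.putAll(rewritingConfig);
            effective.put("uri", uri);
            return effective;
        }
    }

    public static void main(String[] args) {
        Map<String, String> cluster = new HashMap<>();
        cluster.put("s3.access-key", "CLUSTER_KEY");
        ToyS3Factory factory = new ToyS3Factory(cluster);

        Map<String, String> userKeys = new HashMap<>();
        userKeys.put("s3.access-key", "USER_A_KEY");

        // A job that passes overrides gets its own key; one that passes none keeps the default.
        System.out.println(factory.get("s3://bucket-a/checkpoints", userKeys).get("s3.access-key"));
        System.out.println(factory.get("s3://shared/checkpoints", new HashMap<>()).get("s3.access-key"));
    }
}
```

The catch described in this message still applies to the sketch: components that accept only a URI string have no way to supply the second argument.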

I see that making this possible has value, but it would require some involved changes in file-system-dependent APIs, or changing the way file systems are created in general.
You could create a JIRA issue to discuss it.

Best,
Andrey

> On 27 Nov 2018, at 10:06, yinhua.dai <[hidden email]> wrote:
>
> It might be difficult, as the task manager and job manager are pre-started
> in session mode.
>
> It seems that the Flink HTTP server will always use the configuration that you
> specified when you started your Flink cluster, i.e. via start-cluster.sh. I can't
> find a way to override it.

Re: how to override s3 key config in flink job

Tony Wei
Hi Andrey,

Thanks for your detailed answer. I have created a JIRA issue to discuss it [1].
Please check the description and help me fill in the details, such as component/s, since
I'm not sure where it should go. Thank you very much.

Best,
Tony Wei



Andrey Zagrebin <[hidden email]> wrote on Tue, Nov 27, 2018 at 10:43 PM:
Hi Tony,

File system factories are class-loaded in running JVMs of task executors.
That is why their configured objects are shared by different Flink jobs.
It is not possible to change their options per created file system and per job at the moment.

This could be changed, e.g. for S3, by passing a "rewriting config" to the file system factory "get" method,
but users do not usually call this method directly in user-facing components such as checkpointing or the file sink. The user API is currently mainly the file system URI string, without any specific config.

I see that making this possible has value, but it would require some involved changes in file-system-dependent APIs, or changing the way file systems are created in general.
You could create a JIRA issue to discuss it.

Best,
Andrey
