Hi Dustin,
Are you using S3 for a Flink source / sink / streaming state backend? Or is it simply used in one of your operators?
I’m assuming the latter, since you mentioned “doing a search on an S3 system”. For that case, I think it would make sense to simply pass the job-specific S3 endpoint / credentials as program arguments.
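For example, something along these lines (a minimal sketch — the argument names and the tiny parser are hypothetical; in a real Flink job you would typically use Flink’s `ParameterTool.fromArgs` instead, but a stdlib-only parser keeps the snippet self-contained):

```java
import java.util.HashMap;
import java.util.Map;

public class S3Args {
    // Parse "--key value" pairs from the program arguments.
    // (Stand-in for Flink's ParameterTool.fromArgs.)
    static Map<String, String> parse(String[] args) {
        Map<String, String> out = new HashMap<>();
        for (int i = 0; i + 1 < args.length; i += 2) {
            if (args[i].startsWith("--")) {
                out.put(args[i].substring(2), args[i + 1]);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, String> params = parse(args);
        // Job-specific S3 settings come in per submission, so the
        // cluster configuration itself stays generic.
        String endpoint = params.getOrDefault("s3.endpoint", "");
        String accessKey = params.getOrDefault("s3.access-key", "");
        // Use endpoint / accessKey to build the S3 client inside the
        // operator that talks to S3 (e.g. via the AWS SDK).
        System.out.println("endpoint=" + endpoint);
    }
}
```

The operator that does the S3 lookup would then be constructed with these values, rather than reading them from core-site.xml.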
As for setting the AWS access key / secret key in the configuration files: that is actually not recommended. The recommended way is to grant access via AWS IAM settings [1].
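For context, this is the kind of static, cluster-wide configuration in core-site.xml that ties every job to one endpoint (assuming the s3a filesystem; the property names are Hadoop’s standard s3a keys, the values are placeholders):

```xml
<!-- core-site.xml: static S3 credentials shared by the whole cluster.
     This is the approach being discouraged here, since every job
     submitted to the cluster inherits this one endpoint and key pair. -->
<configuration>
  <property>
    <name>fs.s3a.endpoint</name>
    <value>s3.example.com</value>
  </property>
  <property>
    <name>fs.s3a.access.key</name>
    <value>YOUR_ACCESS_KEY</value>
  </property>
  <property>
    <name>fs.s3a.secret.key</name>
    <value>YOUR_SECRET_KEY</value>
  </property>
</configuration>
```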
Cheers,
Gordon
On 28 September 2017 at 12:41:26 AM, Dustin Jenkins ([hidden email]) wrote:
Hello,
I’m running a single Flink Job Manager with a Task Manager in Docker containers with Java 8. They are remotely located (flink.example.com). I’m submitting a job from my desktop and passing the job to the Job Manager with -m flink.example.com:6123, which seems to work well. I’m doing a search on an S3 system located at s3.example.com.
The problem is that in order to have access to the S3 system, the $HADOOP_CONFIG/core-site.xml and $FLINK_HOME/flink-conf.yaml need to be configured for it at the Job Manager and Task Manager level, which means they are tied to that particular endpoint (including my access key and secret key).
Is there some way I can specify the configuration only in my application, so that my Flink server cluster can stay mostly generic?
Thank you!
Dustin