Running flink on AWS ECS

Running flink on AWS ECS

Navneeth Krishnan
Hi All,

I’m currently running Flink on Amazon ECS, and I have assigned task slots based on the number of vCPUs per instance. Is it more beneficial to run separate containers with one slot each, or one container with as many slots as there are virtual cores?

Thanks

Re: Running flink on AWS ECS

Terry Wang
Hi, Navneeth,

I think both are OK.
IMO, running one container with as many slots as virtual cores may be better, because the slots can share the Flink framework and thus reduce memory cost.

Best,
Terry Wang
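
To make the memory argument concrete, here is a minimal Python sketch that compares the two layouts for a single instance. It assumes each container runs one TaskManager JVM that pays a fixed framework overhead once, plus a fixed amount per slot. The names `Layout` and `candidate_layouts` and the 512 MB / 1024 MB figures are hypothetical placeholders, not measured Flink values; the only real Flink setting referenced is `taskmanager.numberOfTaskSlots`.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Layout:
    """One way of splitting an instance into TaskManager containers (hypothetical helper)."""
    containers: int           # TaskManager containers per instance
    slots_per_container: int  # value to set for taskmanager.numberOfTaskSlots

    def total_slots(self) -> int:
        return self.containers * self.slots_per_container

    def memory_mb(self, framework_mb: int, per_slot_mb: int) -> int:
        # Assumption: every container is one JVM and pays the framework
        # overhead once, regardless of how many slots it hosts.
        return self.containers * framework_mb + self.total_slots() * per_slot_mb

    def flink_conf_line(self) -> str:
        return f"taskmanager.numberOfTaskSlots: {self.slots_per_container}"


def candidate_layouts(vcpus: int) -> Dict[str, Layout]:
    """Return the two layouts discussed in the thread for an instance with `vcpus` cores."""
    if vcpus <= 0:
        raise ValueError("vcpus must be positive")
    return {
        "one_slot_per_container": Layout(containers=vcpus, slots_per_container=1),
        "one_container_per_instance": Layout(containers=1, slots_per_container=vcpus),
    }


if __name__ == "__main__":
    # Illustrative numbers only: 512 MB framework overhead, 1024 MB per slot.
    for name, layout in candidate_layouts(vcpus=4).items():
        total = layout.memory_mb(framework_mb=512, per_slot_mb=1024)
        print(f"{name}: {layout.flink_conf_line()} -> {total} MB")
```

With these assumed numbers, 4 vCPUs cost 6144 MB as four one-slot containers and 4608 MB as one four-slot container; the difference is the framework overhead being paid three extra times.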



> On Sep 25, 2019, at 3:26 PM, Navneeth Krishnan <[hidden email]> wrote:
>
> Hi All,
>
> I’m currently running Flink on Amazon ECS, and I have assigned task slots based on the number of vCPUs per instance. Is it more beneficial to run separate containers with one slot each, or one container with as many slots as there are virtual cores?
>
> Thanks


Re: Running flink on AWS ECS

Navneeth Krishnan
Thanks, Terry. The reason I asked is that I saw somewhere that running one slot per container is beneficial, but I can't find where I saw that.
Also, I think running with multiple slots will reduce IPC, since some of the data will be processed within the same JVM.

Thanks
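
The IPC point can be illustrated with a small worked calculation. The sketch below estimates what share of the channels in an all-to-all exchange (for example, after a `keyBy`) stays inside one TaskManager JVM. It is a simplified model under stated assumptions: every slot hosts exactly one sender subtask and one receiver subtask, and TaskManagers are filled evenly. The function name `local_channel_fraction` is hypothetical, and real jobs with skewed keys or uneven placement will behave differently.

```python
from fractions import Fraction


def local_channel_fraction(parallelism: int, slots_per_tm: int) -> Fraction:
    """Share of sender-to-receiver channels that stay within one JVM.

    Simplified model: each sender subtask has `parallelism` outgoing
    channels (one per receiver subtask), and `slots_per_tm` of those
    receivers live in the sender's own TaskManager.
    """
    if parallelism <= 0 or slots_per_tm <= 0:
        raise ValueError("parallelism and slots_per_tm must be positive")
    if parallelism % slots_per_tm != 0:
        raise ValueError("this sketch assumes evenly filled TaskManagers")
    return Fraction(slots_per_tm, parallelism)


if __name__ == "__main__":
    for slots in (1, 2, 4, 8):
        share = local_channel_fraction(parallelism=8, slots_per_tm=slots)
        print(f"{slots} slot(s) per TaskManager: {share} of channels are local")
```

Under this model, a job with parallelism 8 keeps 1/8 of its channels local with one slot per container and 1/2 with four slots per container.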

On Wed, Sep 25, 2019 at 1:16 AM Terry Wang <[hidden email]> wrote:
> Hi, Navneeth,
>
> I think both are OK.
> IMO, running one container with as many slots as virtual cores may be better, because the slots can share the Flink framework and thus reduce memory cost.
>
> Best,
> Terry Wang




Re: Running flink on AWS ECS

David Anderson-2
I believe there can be advantages and disadvantages in both
directions. For example, fewer containers with multiple slots reduces
the effort the Flink Master has to do whenever global coordination is
required, i.e., during checkpointing. And the network stack in the
task managers is optimized to take advantage of locality, whenever
possible.

On the other hand, if you have a lot of pressure on the heap (e.g.,
because you are using a heap-based state backend), then having more,
smaller task managers can reduce latency by reducing the impact of
garbage collection pauses.

I'm sure I've overlooked some factors, but the bottom line appears to
be that there's no one-size-fits-all answer.

David
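
The trade-off described above can be laid out as simple arithmetic. The sketch below is a toy model, not a performance prediction: for a fixed number of slots it reports how many TaskManager processes the Flink Master has to coordinate, how large each heap becomes, and how many slots stall together when one JVM hits a stop-the-world garbage collection pause. The names `Tradeoff` and `tradeoff` and the 1024 MB per-slot heap are hypothetical.

```python
from typing import NamedTuple


class Tradeoff(NamedTuple):
    task_managers: int          # processes the Flink Master coordinates
    heap_per_tm_mb: int         # heap each TaskManager JVM has to collect
    slots_paused_together: int  # slots stalled by one stop-the-world pause


def tradeoff(total_slots: int, slots_per_tm: int, heap_per_slot_mb: int) -> Tradeoff:
    """Summarize one layout; assumes heap scales linearly with slot count."""
    if min(total_slots, slots_per_tm, heap_per_slot_mb) <= 0:
        raise ValueError("all arguments must be positive")
    if total_slots % slots_per_tm != 0:
        raise ValueError("total_slots must be a multiple of slots_per_tm")
    return Tradeoff(
        task_managers=total_slots // slots_per_tm,
        heap_per_tm_mb=slots_per_tm * heap_per_slot_mb,
        slots_paused_together=slots_per_tm,
    )


if __name__ == "__main__":
    for slots in (1, 4, 8):
        print(slots, tradeoff(total_slots=16, slots_per_tm=slots, heap_per_slot_mb=1024))
```

For 16 slots, one slot per TaskManager means 16 processes with 1024 MB heaps each, while eight slots per TaskManager means 2 processes with 8192 MB heaps, where a single pause stalls eight slots at once. Which side matters more depends on the job, as noted above.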

On Wed, Sep 25, 2019 at 5:43 PM Navneeth Krishnan
<[hidden email]> wrote:

>
> Thanks, Terry. The reason I asked is that I saw somewhere that running one slot per container is beneficial, but I can't find where I saw that.
> Also, I think running with multiple slots will reduce IPC, since some of the data will be processed within the same JVM.
>
> Thanks

Re: Running flink on AWS ECS

sri hari kali charan Tummala
AWS already offers an auto-scaling Flink cluster; it's called Kinesis Data Analytics. Just add your Flink JAR to Kinesis Data Analytics and that's all: AWS will auto-provision a Flink cluster and handle the administration for you.

On Saturday, September 28, 2019, David Anderson <[hidden email]> wrote:
> I believe there can be advantages and disadvantages in both
> directions. For example, fewer containers with multiple slots reduces
> the effort the Flink Master has to do whenever global coordination is
> required, i.e., during checkpointing. And the network stack in the
> task managers is optimized to take advantage of locality, whenever
> possible.
>
> On the other hand, if you have a lot of pressure on the heap (e.g.,
> because you are using a heap-based state backend), then having more,
> smaller task managers can reduce latency by reducing the impact of
> garbage collection pauses.
>
> I'm sure I've overlooked some factors, but the bottom line appears to
> be that there's no one-size-fits-all answer.
>
> David


--
Thanks & Regards
Sri Tummala



Re: Running flink on AWS ECS

Oytun Tez
I said to myself I'll take another look at Kinesis Analytics, but this is the state of it:
[attached image: image.png]

The Flink version is 1.6, which is way behind and lacks many good features and reliability improvements.


---
Oytun Tez

M O T A W O R D
The World's Fastest Human Translation Platform.


On Sat, Sep 28, 2019 at 9:04 AM sri hari kali charan Tummala <[hidden email]> wrote:
> AWS already offers an auto-scaling Flink cluster; it's called Kinesis Data Analytics. Just add your Flink JAR to Kinesis Data Analytics and that's all: AWS will auto-provision a Flink cluster and handle the administration for you.
