[Third-party Tool] Flink memory calculator

Yangze Guo
Hi there,

In release-1.10, the memory setup of task managers has changed a lot.
I would like to share a third-party tool that simulates Flink's memory
configuration and shows the calculated result.

Although there are already a detailed official setup guide [1] and
migration guide [2], the calculator additionally allows users to:
- Verify whether there is any conflict in their configuration. The
calculator is more lightweight than starting a Flink cluster,
especially when running Flink on Yarn/Kubernetes. Users can make sure
their configuration is correct locally before deploying it to external
resource managers.
- Get all of the memory configurations before deploying. Users may set
taskmanager.memory.task.heap.size and taskmanager.memory.managed.size,
but they also want to know the total memory consumption of Flink. With
this tool, users can get all of the memory configurations they are
interested in. If anything is unexpected, they do not need to
re-deploy a Flink cluster.
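
As a rough illustration of the kind of conflict meant in the first
bullet, here is a minimal shell sketch. It is not the calculator's
logic (the tool delegates to Flink's BashJavaUtils.jar); it only checks
that explicitly configured components fit into a configured total such
as taskmanager.memory.flink.size. The function names to_mb and
check_fits are hypothetical, and only the "m" and "g" suffixes are
handled.

```shell
# Convert a size such as "512m" or "2g" to whole megabytes.
# Assumption: only the "m" and "g" suffixes appear in this sketch.
to_mb() {
  case "$1" in
    *g) echo $(( ${1%g} * 1024 )) ;;
    *m) echo "${1%m}" ;;
    *)  echo "unsupported size: $1" >&2; return 1 ;;
  esac
}

# Print "ok" if the configured components fit into the total,
# otherwise "conflict".
# Arguments: the total size, then any number of component sizes.
check_fits() {
  local total_mb
  local sum_mb
  local part_mb
  total_mb=$(to_mb "$1") || return 1
  shift
  sum_mb=0
  for part in "$@"; do
    part_mb=$(to_mb "$part") || return 1
    sum_mb=$(( sum_mb + part_mb ))
  done
  if [ "$sum_mb" -le "$total_mb" ]; then
    echo "ok"
  else
    echo "conflict"
  fi
}

# Task heap 1g plus managed memory 2g against a total of 4g, then 2g.
check_fits 4g 1g 2g
check_fits 2g 1g 2g
```

The real calculation also accounts for framework, network, metaspace
and JVM overhead memory, which is why reusing Flink's own code gives
more trustworthy results than a hand-written check like this one.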

The tool's repository is
https://github.com/KarmaGYZ/flink-memory-calculator. It reuses Flink's
BashJavaUtils.jar, which ensures that the calculation result is
exactly the same as that of your Flink distribution. For more details,
please take a look at the README.

Any feedback or suggestions are welcome!

[1] https://ci.apache.org/projects/flink/flink-docs-master/ops/memory/mem_setup.html
[2] https://ci.apache.org/projects/flink/flink-docs-master/ops/memory/mem_migration.html

Best,
Yangze Guo

Re: [Third-party Tool] Flink memory calculator

Yun Tang
Very interesting and convenient tool! Just a quick question: could this tool also handle cluster deployment command-line options such as "-tm" mixed with the configuration in `flink-conf.yaml`?

Best
Yun Tang

From: Yangze Guo <[hidden email]>
Sent: Friday, March 27, 2020 18:00
To: user <[hidden email]>; [hidden email] <[hidden email]>
Subject: [Third-party Tool] Flink memory calculator
 

Re: [Third-party Tool] Flink memory calculator

Yangze Guo
Hi, Yun,

I'm sorry, the tool currently cannot handle that. But I think it is a
really good idea, and the feature will be added in the next version.

Best,
Yangze Guo

On Mon, Mar 30, 2020 at 12:21 AM Yun Tang <[hidden email]> wrote:


Re: [Third-party Tool] Flink memory calculator

Xintong Song
Thanks Yangze, I've tried the tool and I think it's very helpful.


Thank you~

Xintong Song



On Mon, Mar 30, 2020 at 9:40 AM Yangze Guo <[hidden email]> wrote:

Re: [Third-party Tool] Flink memory calculator

Jeff Zhang
In reply to this post by Yangze Guo
Hi Yangze,

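Does this tool just parse the configuration in flink-conf.yaml? Maybe it could be done in JobListener [1] (we should enhance it by adding a hook before job submission), so that it could cover all the cases (e.g. parameters coming from the command line).

[1] https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/core/execution/JobListener.java#L35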


--
Best Regards

Jeff Zhang

Re: [Third-party Tool] Flink memory calculator

Yangze Guo
Thanks for your feedback, @Xintong and @Jeff.

@Jeff
I think it is always good to leverage existing logic in Flink, such
as JobListener. However, this calculator is not only meant to check
for conflicts; it is also meant to show users the calculation result
before the job is actually deployed, in case there is any
unexpected configuration. It's a good point that we need to parse the
dynamic configs. I would prefer to parse the dynamic configs and CLI
options in bash rather than adding a hook in JobListener.

Best,
Yangze Guo

On Mon, Mar 30, 2020 at 10:32 AM Jeff Zhang <[hidden email]> wrote:


Re: [Third-party Tool] Flink memory calculator

Xintong Song
Hi Jeff,

I think the purpose of this tool is to allow users to play with the memory configurations without needing to actually deploy a Flink cluster or even have a job. As for sanity checks, we currently have them in the start-up scripts (for standalone clusters) and in the resource managers (on K8s/Yarn/Mesos).

I think it makes sense to do the checks earlier, i.e. on the client side. But I'm not sure whether JobListener is the right place. IIUC, JobListener is invoked before submitting a specific job, while the mentioned checks validate Flink's cluster-level configuration. It might be okay for a job cluster, but it does not cover the scenario of session clusters.

Thank you~

Xintong Song



On Mon, Mar 30, 2020 at 12:03 PM Yangze Guo <[hidden email]> wrote:

Re: [Third-party Tool] Flink memory calculator

Yangze Guo
Hi, there.

In the latest version, the calculator supports dynamic options. You
can append all your dynamic options to the end of the
"bin/calculator.sh [-h]" command line.
Since "-tm" will be deprecated eventually, please replace it with
"-Dtaskmanager.memory.process.size=".
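
For anyone with existing launch commands, the replacement described
above could be scripted roughly as follows. This is a minimal sketch,
not part of the calculator: translate_tm is a hypothetical helper, and
it assumes that a bare number passed to "-tm" is meant in megabytes.

```shell
# Rewrite "-tm <size>" into the equivalent dynamic option and pass all
# other arguments through unchanged, one argument per output line.
# Assumption: a bare number given to -tm is in megabytes.
translate_tm() {
  local size
  while [ "$#" -gt 0 ]; do
    case "$1" in
      -tm)
        if [ "$#" -lt 2 ]; then
          echo "-tm requires a value" >&2
          return 1
        fi
        size="$2"
        case "$size" in
          *[!0-9]*) ;;              # already carries a unit, e.g. 4g
          *) size="${size}m" ;;     # bare number: assume megabytes
        esac
        echo "-Dtaskmanager.memory.process.size=${size}"
        shift 2
        ;;
      *)
        echo "$1"
        shift
        ;;
    esac
  done
}

# Example: one legacy option mixed with one dynamic option.
translate_tm -tm 4096 -Dtaskmanager.memory.managed.size=1g
```

The printed options can then be appended to the calculator command
line in the way described above.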

Best,
Yangze Guo

On Mon, Mar 30, 2020 at 12:57 PM Xintong Song <[hidden email]> wrote:


Re: [Third-party Tool] Flink memory calculator

Marta Paes Moreira
Hey, Yangze.

I'd like to suggest that you submit this tool to Flink Community Packages [1]. That way it can get more exposure and it'll be easier for users to find.

Thanks for your contribution!

[1] https://flink-packages.org/

On Tue, Mar 31, 2020 at 9:09 AM Yangze Guo <[hidden email]> wrote:

Re: [Third-party Tool] Flink memory calculator

Yangze Guo
@Marta
Thanks for the tip! I'll do that.

Best,
Yangze Guo

On Wed, Apr 1, 2020 at 8:05 PM Marta Paes Moreira <[hidden email]> wrote:

>
> Hey, Yangze.
>
> I'd like to suggest that you submit this tool to Flink Community Pages [1]. That way it can get more exposure and it'll be easier for users to find it.
>
> Thanks for your contribution!
>
> [1] https://flink-packages.org/
>
> On Tue, Mar 31, 2020 at 9:09 AM Yangze Guo <[hidden email]> wrote:
>>
>> Hi, there.
>>
>> In the latest version, the calculator supports dynamic options. You
>> could append all your dynamic options to the end of "bin/calculator.sh
>> [-h]".
>> Since "-tm" will be deprecated eventually, please replace it with
>> "-Dtaskmanager.memory.process.size=".
>>
>> Best,
>> Yangze Guo
>>
>> On Mon, Mar 30, 2020 at 12:57 PM Xintong Song <[hidden email]> wrote:
>> >
>> > Hi Jeff,
>> >
>> > I think the purpose of this tool it to allow users play with the memory configurations without needing to actually deploy the Flink cluster or even have a job. For sanity checks, we currently have them in the start-up scripts (for standalone clusters) and resource managers (on K8s/Yarn/Mesos).
>> >
>> > I think it makes sense do the checks earlier, i.e. on the client side. But I'm not sure if JobListener is the right place. IIUC, JobListener is invoked before submitting a specific job, while the mentioned checks validate Flink's cluster level configurations. It might be okay for a job cluster, but does not cover the scenarios of session clusters.
>> >
>> > Thank you~
>> >
>> > Xintong Song
>> >
>> >
>> >
>> > On Mon, Mar 30, 2020 at 12:03 PM Yangze Guo <[hidden email]> wrote:
>> >>
>> >> Thanks for your feedbacks, @Xintong and @Jeff.
>> >>
>> >> @Jeff
>> >> I think it would always be good to leverage exist logic in Flink, such
>> >> as JobListener. However, this calculator does not only target to check
>> >> the conflict, it also targets to provide the calculating result to
>> >> user before the job is actually deployed in case there is any
>> >> unexpected configuration. It's a good point that we need to parse the
>> >> dynamic configs. I prefer to parse the dynamic configs and cli
>> >> commands in bash instead of adding hook in JobListener.
>> >>
>> >> Best,
>> >> Yangze Guo
>> >>
>> >> On Mon, Mar 30, 2020 at 10:32 AM Jeff Zhang <[hidden email]> wrote:
>> >> >
>> >> > Hi Yangze,
>> >> >
>> >> > Does this tool just parse the configuration in flink-conf.yaml ?  Maybe it could be done in JobListener [1] (we should enhance it via adding hook before job submission), so that it could all the cases (e.g. parameters coming from command line)
>> >> >
>> >> > [1] https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/core/execution/JobListener.java#L35
>> >> >
>> >> >
>> >> > On Mon, Mar 30, 2020 at 9:40 AM Yangze Guo <[hidden email]> wrote:
>> >> >>
>> >> >> Hi, Yun,
>> >> >>
>> >> >> I'm sorry, it currently cannot handle that. But I think it is a
>> >> >> really good idea, and the feature will be added in the next version.
>> >> >>
>> >> >> Best,
>> >> >> Yangze Guo
>> >> >>
>> >> >
>> >> >
>> >> >
>> >> > --
>> >> > Best Regards
>> >> >
>> >> > Jeff Zhang

Re: [Third-party Tool] Flink memory calculator

Yangze Guo
Hi, there,

Now that Flink 1.11.0 has been released, we provide a new calculator[1]
for this version. Feel free to try it; any feedback or suggestions are
welcome!

[1] https://github.com/KarmaGYZ/flink-memory-calculator/blob/master/calculator-1.11.sh

Best,
Yangze Guo

On Wed, Apr 1, 2020 at 9:45 PM Yangze Guo <[hidden email]> wrote:

>
> @Marta
> Thanks for the tip! I'll do that.
>
> Best,
> Yangze Guo
>
> On Wed, Apr 1, 2020 at 8:05 PM Marta Paes Moreira <[hidden email]> wrote:
> >
> > Hey, Yangze.
> >
> > I'd like to suggest that you submit this tool to Flink Community Packages [1]. That way it can get more exposure and it'll be easier for users to find it.
> >
> > Thanks for your contribution!
> >
> > [1] https://flink-packages.org/
> >
> > On Tue, Mar 31, 2020 at 9:09 AM Yangze Guo <[hidden email]> wrote:
> >>
> >> Hi, there.
> >>
> >> In the latest version, the calculator supports dynamic options. You
> >> can append all your dynamic options to the end of "bin/calculator.sh
> >> [-h]".
> >> Since "-tm" will eventually be deprecated, please replace it with
> >> "-Dtaskmanager.memory.process.size=".
> >>
> >> Best,
> >> Yangze Guo
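
The replacement of "-tm" with "-Dtaskmanager.memory.process.size=" described in the message above can be sketched as a small argument rewriter. This is only an illustration: the helper name rewrite_tm_flag is hypothetical and is not part of Flink or the calculator, and the handling of a bare number as megabytes is an assumption about how "-tm" treats values without a unit.

```python
from typing import List

# Config key named in the thread as the replacement for the legacy "-tm" flag.
PROCESS_SIZE_KEY = "taskmanager.memory.process.size"


def rewrite_tm_flag(args: List[str]) -> List[str]:
    """Return a copy of args with each '-tm <size>' pair rewritten as a dynamic option.

    Hypothetical helper for illustration only; not part of Flink or the calculator.
    """
    rewritten: List[str] = []
    i = 0
    while i < len(args):
        if args[i] == "-tm":
            if i + 1 >= len(args):
                raise ValueError("-tm requires a size argument, e.g. 2048m")
            size = args[i + 1]
            # Assumption: a bare number passed to -tm means megabytes, so an
            # explicit "m" is appended to keep the meaning in the dynamic option.
            if size.isdigit():
                size += "m"
            rewritten.append(f"-D{PROCESS_SIZE_KEY}={size}")
            i += 2  # skip the flag and its value
        else:
            rewritten.append(args[i])  # pass every other argument through unchanged
            i += 1
    return rewritten


if __name__ == "__main__":
    legacy = ["-tm", "2048m", "-Dtaskmanager.numberOfTaskSlots=4"]
    # The rewritten list is what would be appended after "bin/calculator.sh".
    print(" ".join(rewrite_tm_flag(legacy)))
```

The rewritten arguments are meant to be appended to the calculator invocation as dynamic options, as the message describes; everything other than the "-tm" pair is passed through untouched.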
> >>