Hi,
My app is based on a lib that is not thread safe (yet...). In waiting of the patch has been pushed, how can I be sure that my Sink that uses this lib is in one JVM ? Context: I use one Yarn session and send my Flink jobs to this session Regards, David |
Hi David, In general, I don't think you can control all parallel subtasks of a certain task run in the same JVM process with the current Flink. If you job scale is very small, one thing you might try is to have only one task manager in the Flink session cluster. You need to make sure the task manager has enough cpu/memory resources and slots for running your job. Thank you~ Xintong Song On Sun, Feb 23, 2020 at 1:11 AM David Morin <[hidden email]> wrote: Hi, |
In reply to this post by aldu29
Hi,
Thanks Xintong. I've noticed than when I use yarn-session.sh with --slots (-s) parameter but without --container (-n) it creates one task/slot per taskmanager. Before with the both n and -s it was not the case. I prefer to use only small container with only one task to scale my pipeline and of course to prevent from thread-safe issue Do you think I cannot be confident on that behaviour ? Regards, David On 2020/02/22 17:11:25, David Morin <[hidden email]> wrote: > Hi, > My app is based on a lib that is not thread safe (yet...). > In waiting of the patch has been pushed, how can I be sure that my Sink that uses this lib is in one JVM ? > Context: I use one Yarn session and send my Flink jobs to this session > > Regards, > David > |
Depending on your Flink version, the '-n' option might not take effect. It is removed in the latest release, but before that there were a few versions where this option is neither removed nor taking effect. Anyway, as long as you have multiple containers, I don't think there's a way to make some of the tasks scheduled to the same JVM. Not that I'm aware of. Thank you~ Xintong Song On Mon, Feb 24, 2020 at 8:43 PM David Morin <[hidden email]> wrote: Hi, |
Hi Xintong, At the moment I'm using the 1.9.2 with this command: yarn-session.sh -d -s 1 -jm 4096 -tm 4096 -qu "XXX" -nm "MyPipeline" So, after a lot of tests, I've noticed that if I increase the parallelism of my Custom Sink, each task is embedded into one TS and, the most important, each one into one TaskManager (Yarn container in fact). So, if I understand I have to keep this Flink release (1.9.2) ? Thanks David Le mar. 25 févr. 2020 à 02:02, Xintong Song <[hidden email]> a écrit :
|
Hi David, Before with the both n and -s it was not the case. What do you mean by before? At least in 1.8 "-s" could be used to specify the number of slots per TM. how can I be sure that my Sink that uses this lib is in one JVM ? Is it enough that no other parallel instance of your sink runs in the same JVM? If that is the case, it is enough to start your your YARN session with: ./bin/yarn-session.sh -s 1 [...] This will result in exactly one slot per TM. Note that a single slot may still hold several subtasks of the job (Slot Sharing) but never two parallel instances of your sink [2]. You can also control Slot Sharing manually [3]. So, if I understand I have to keep this Flink release (1.9.2) ? I don't see why 1.10.0 would not work for you. Best, Gary [1] https://ci.apache.org/projects/flink/flink-docs-release-1.8/ops/deployment/yarn_setup.html#start-a-session [2] https://ci.apache.org/projects/flink/flink-docs-stable/concepts/runtime.html#task-slots-and-resources [3] https://ci.apache.org/projects/flink/flink-docs-release-1.10/dev/stream/operators/#task-chaining-and-resource-groups On Tue, Feb 25, 2020 at 10:28 AM David Morin <[hidden email]> wrote:
|
Hi Gary, Sorry I was probably not very clear. Yes that's exactly what I want to hear :) I use the -s 1 parameter and what I expect to have is one task of my Sink (one instance in fact) per TM (i.e. per JVM) That's the current behaviour during my tests but I want to be sure. Thanks a lot David Le mar. 25 févr. 2020 à 11:16, Gary Yao <[hidden email]> a écrit :
|
Ah, I misunderstood and thought that you want to keep all your Sink instances on the same TM. If what you want is to have one instance per TM, then as Gary mentioned specifying "-s 1" at starting the session would be enough, and it should work with all existing versions above (including) 1.8. Thank you~ Xintong Song On Tue, Feb 25, 2020 at 7:41 PM David Morin <[hidden email]> wrote:
|
Perfect. No problem. My Bad. Not really clear. Thanks ! Le mar. 25 févr. 2020 à 13:45, Xintong Song <[hidden email]> a écrit :
|
Free forum by Nabble | Edit this page |