Hi folks,

While reading the Flink client API code, I found the concept of a session vague. It seems to apply only in batch mode: I see it in ExecutionEnvironment but not in StreamExecutionEnvironment. In local mode (LocalExecutionEnvironment), starting a new session starts a new MiniCluster, but in remote mode (RemoteExecutionEnvironment), starting a new session only starts a new ClusterClient rather than a new cluster. So I am confused about what a Flink session really means. Could anyone help me understand this? Thanks.
Best Regards
Jeff Zhang
Hi Jeff,

the session functionality you found in Flink's client is the remnant of a feature that was abandoned before it was completed. The idea was that one could submit multiple parts of a job to the same cluster, where these parts would be added to the same ExecutionGraph. That way we wanted to allow reusing computed results, for example when using a notebook for ad-hoc queries. But as I said, this feature was never completed.

Cheers,
Till

On Sun, Jun 2, 2019 at 3:20 PM Jeff Zhang <[hidden email]> wrote:
Thanks for the reply, Till.

Regarding reusing computed results: I think the JM keeps all the metadata of the intermediate data, and interactive programming is also trying to reuse computed results. So it looks like introducing the session concept may not be necessary, as long as we can reuse computed results. Let me know if I understand it correctly.

Till Rohrmann <[hidden email]> wrote on Tue, Jun 4, 2019 at 4:03 PM:
Best Regards
Jeff Zhang
Yes, interactive programming solves the problem by storing the meta information on the client, whereas in the past we considered keeping the information on the JM. But keeping it on the JM would not allow sharing results between different clusters. Thus, the interactive programming approach is a bit more general, I think.

Cheers,
Till

On Tue, Jun 4, 2019 at 11:13 AM Jeff Zhang <[hidden email]> wrote:
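
To make the client-side versus JM-side distinction above concrete, here is a minimal toy sketch in plain Java. It is not Flink code: ToyJobMaster, ToyClient, ResultLocation and the "toy://" paths are all hypothetical names invented for illustration. It only models where the metadata about a computed result lives and who can look it up afterwards.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical toy model, NOT Flink API: describes where a computed result lives.
final class ResultLocation {
    final String clusterId;
    final String path;

    ResultLocation(String clusterId, String path) {
        this.clusterId = clusterId;
        this.path = path;
    }
}

// Hypothetical stand-in for a JobManager that keeps result metadata itself.
// The metadata is only visible to whoever talks to this one cluster.
final class ToyJobMaster {
    private final String clusterId;
    private final Map<String, ResultLocation> results = new HashMap<>();

    ToyJobMaster(String clusterId) {
        this.clusterId = clusterId;
    }

    // Pretend to compute a result and record its location on this cluster.
    ResultLocation compute(String resultName) {
        ResultLocation location =
                new ResultLocation(clusterId, "toy://" + clusterId + "/" + resultName);
        results.put(resultName, location);
        return location;
    }

    Optional<ResultLocation> lookup(String resultName) {
        return Optional.ofNullable(results.get(resultName));
    }
}

// Hypothetical client that keeps the metadata on its own side,
// so it does not depend on any single cluster remembering it.
final class ToyClient {
    private final Map<String, ResultLocation> known = new HashMap<>();

    void remember(String resultName, ResultLocation location) {
        known.put(resultName, location);
    }

    Optional<ResultLocation> lookup(String resultName) {
        return Optional.ofNullable(known.get(resultName));
    }
}

class SessionMetadataSketch {
    public static void main(String[] args) {
        ToyJobMaster clusterA = new ToyJobMaster("cluster-A");
        ToyJobMaster clusterB = new ToyJobMaster("cluster-B");
        ToyClient client = new ToyClient();

        // The result is computed on cluster A; the client records where it is.
        client.remember("wordCounts", clusterA.compute("wordCounts"));

        // Cluster B has no metadata about a result computed on cluster A.
        System.out.println(clusterB.lookup("wordCounts").isPresent());
        // The client still knows the location, whichever cluster it uses next.
        System.out.println(client.lookup("wordCounts").isPresent());
    }
}
```

In this toy model, metadata held by one ToyJobMaster is invisible to the other, while the client's own map can be consulted no matter which cluster the next job is sent to. That is the sense in which the client-side approach is more general.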