How many times Flink initialize an operator?

classic Classic list List threaded Threaded
3 messages Options
Reply | Threaded
Open this post in threaded view
|

How many times Flink initialize an operator?

Hao Sun
I am using Flink 1.7 on K8S. This might does not matter :D.

I think Flink only initialize the MapFunction once per taskManager right?
Because Flink will serialize the execution graph and distribute it to taskManagers.

Or it creates a new MapFunction for every element?
stream.map(new MapFunction[I,O]).addSink(discard)

Hao Sun
Team Lead
1019 Market St. 7F
San Francisco, CA 94103
Reply | Threaded
Open this post in threaded view
|

Re: How many times Flink initialize an operator?

Ken Krugler
It’s based the parallelism of that operator, not the number of TaskManagers.

E.g. you can have an operator with a parallelism of one, and your cluster has 10 TaskManagers, and you’ll only get a single instance of the operator.

— Ken


On Dec 11, 2018, at 2:01 PM, Hao Sun <[hidden email]> wrote:

I am using Flink 1.7 on K8S. This might does not matter :D.

I think Flink only initialize the MapFunction once per taskManager right?
Because Flink will serialize the execution graph and distribute it to taskManagers.

Or it creates a new MapFunction for every element?
stream.map(new MapFunction[I,O]).addSink(discard)

Hao Sun
Team Lead
1019 Market St. 7F
San Francisco, CA 94103

--------------------------
Ken Krugler
+1 530-210-6378
http://www.scaleunlimited.com
Custom big data solutions & training
Flink, Solr, Hadoop, Cascading & Cassandra

Reply | Threaded
Open this post in threaded view
|

Re: How many times Flink initialize an operator?

Hao Sun
Ok, thanks for the clarification. 
Hao Sun
Team Lead
1019 Market St. 7F
San Francisco, CA 94103


On Tue, Dec 11, 2018 at 2:38 PM Ken Krugler <[hidden email]> wrote:
It’s based the parallelism of that operator, not the number of TaskManagers.

E.g. you can have an operator with a parallelism of one, and your cluster has 10 TaskManagers, and you’ll only get a single instance of the operator.

— Ken


On Dec 11, 2018, at 2:01 PM, Hao Sun <[hidden email]> wrote:

I am using Flink 1.7 on K8S. This might does not matter :D.

I think Flink only initialize the MapFunction once per taskManager right?
Because Flink will serialize the execution graph and distribute it to taskManagers.

Or it creates a new MapFunction for every element?
stream.map(new MapFunction[I,O]).addSink(discard)

Hao Sun
Team Lead
1019 Market St. 7F
San Francisco, CA 94103

--------------------------
Ken Krugler
+1 530-210-6378
http://www.scaleunlimited.com
Custom big data solutions & training
Flink, Solr, Hadoop, Cascading & Cassandra