Hi,
I'm trying to figure out which graph the execution plan represents when you call env.getExecutionPlan on the StreamExecutionEnvironment. From my understanding, the StreamGraph is what you call an APIGraph, which will be used to create the JobGraph. So is the ExecutionPlan a full representation of the StreamGraph? And is there a way to get a human-interpretable representation of the JobGraph? :)
Best, Alex
Hey Alex,
Flink currently has 3 abstractions with a Graph suffix for streaming jobs:
* StreamGraph: Represents the logical plan of a streaming job while it is still under construction in the API. This is the only streaming-specific one in this list.
* JobGraph: Represents the logical plan of a streaming job whose construction is finished.
* ExecutionGraph: The physical plan of the JobGraph; contains parallelism, estimated input sizes etc.
env.getExecutionPlan gives you a JSON String representation of the ExecutionGraph, which should contain most of the info you need. To visualize it, go to your Flink binary distribution and open tools/planVisualizer.html in a browser, paste the JSON there and hit the button. :) You might find it useful that the new Flink Dashboard also comes with this feature integrated, so you can visualize jobs that have been submitted to the cluster.
Hope that helps, Marton
On Thu, Jan 14, 2016 at 11:56 AM, lofifnc <[hidden email]> wrote: Hi,
Hi,
Is there a way to map a JSON representation back to an executable Flink job? If there is no such API, what is the best starting point for implementing such a feature?
Best, Christian
2016-01-14 15:18 GMT+01:00 Márton Balassi <[hidden email]>:
In reply to this post by Márton Balassi
Hi Márton,
Thanks for your answer. But now I'm even more confused, as it somehow conflicts with the documentation. ;) According to the wiki and the Stratosphere paper, the JobGraph is submitted to the JobManager, and the JobManager then translates it into the ExecutionGraph. So the ExecutionGraph should only be available at the JobManager and should contain a node for each parallel instance of an operator and the corresponding vertices. The question comes up in the context of my master's thesis, in which I'm trying to describe the deployment process of Flink. I want to use a visualization of the execution plan as a concrete example for one of these three graphs.
Best, Alex
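To make the "one node per parallel instance" idea concrete, here is a minimal toy sketch in plain Java. It does not use any Flink classes: the class name ExpansionSketch, the method expand and the vertex names are all made up for illustration, and the real translation in the JobManager does considerably more than this (edges, intermediate results, scheduling).

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model only: NOT Flink's implementation, all names are hypothetical.
class ExpansionSketch {

    /**
     * Expands each logical vertex (name -> parallelism) into one entry
     * per parallel instance, labelled "name (i/parallelism)".
     */
    public static List<String> expand(Map<String, Integer> parallelismByVertex) {
        List<String> subtasks = new ArrayList<>();
        for (Map.Entry<String, Integer> vertex : parallelismByVertex.entrySet()) {
            int parallelism = vertex.getValue();
            for (int i = 1; i <= parallelism; i++) {
                subtasks.add(vertex.getKey() + " (" + i + "/" + parallelism + ")");
            }
        }
        return subtasks;
    }

    public static void main(String[] args) {
        // A made-up logical plan: three vertices with their parallelism.
        Map<String, Integer> logicalPlan = new LinkedHashMap<>();
        logicalPlan.put("Source", 1);
        logicalPlan.put("Map", 2);
        logicalPlan.put("Sink", 1);

        // Three logical vertices become four parallel instances.
        for (String subtask : expand(logicalPlan)) {
            System.out.println(subtask);
        }
    }
}
```

In this toy model the logical plan has three vertices while the expanded plan has four, which is the kind of difference one would expect between a logical graph and a graph with one node per parallel instance.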
@Christian: I don't think that is possible. There are quite a few things missing in the JSON, including:
- User function objects (Flink ships objects, not class names)
2016-01-14 16:02 GMT+01:00 lofifnc <[hidden email]>: Hi Márton,
Actually, the thing with the JSON plans is slightly different now. There are two types of plans:
1) The plan that describes the user program originally. That is what you get from env.getExecutionPlan(). In the Batch API, this contains the result of the optimizer; in the streaming API, it is the stream graph.
2) There is a JSON plan for the JobGraph / ExecutionGraph. This is what the web dashboard uses.
The main difference from the other JSON plan is that in the JobGraph, not all operators are visible any more. Chained operations currently look like one operator to the JobGraph. Hence this JSON plan usually has fewer operators, and the names indicate that an operator is actually a chain of operations.
Greetings, Stephan
On Thu, Jan 14, 2016 at 6:15 PM, Fabian Hueske <[hidden email]> wrote:
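Regarding the chaining point in Stephan's description above, the effect on the number of visible operators can be illustrated with a small toy sketch in plain Java. It is not Flink code: the class ChainingSketch, the Op type and the chainableToPrevious flag are assumptions made up for this example, and the actual conditions under which Flink chains operators are not modelled here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model only: NOT Flink's chaining logic, all names are hypothetical.
class ChainingSketch {

    /** A made-up operator: a name plus a flag saying it may be fused with its predecessor. */
    static final class Op {
        final String name;
        final boolean chainableToPrevious;

        Op(String name, boolean chainableToPrevious) {
            this.name = name;
            this.chainableToPrevious = chainableToPrevious;
        }
    }

    /**
     * Collapses a linear pipeline into vertices. Consecutive chainable
     * operators are merged into one vertex whose name lists the chain.
     */
    public static List<String> chain(List<Op> pipeline) {
        List<String> vertices = new ArrayList<>();
        for (Op op : pipeline) {
            if (op.chainableToPrevious && !vertices.isEmpty()) {
                int last = vertices.size() - 1;
                vertices.set(last, vertices.get(last) + " -> " + op.name);
            } else {
                vertices.add(op.name);
            }
        }
        return vertices;
    }

    public static void main(String[] args) {
        List<Op> pipeline = Arrays.asList(
                new Op("Source", false),
                new Op("Map", true),
                new Op("Filter", true),
                new Op("Reduce", false), // assumed not chainable, e.g. after a repartitioning
                new Op("Sink", true));

        // Five operators collapse into two vertices with chain-style names.
        for (String vertex : chain(pipeline)) {
            System.out.println(vertex);
        }
    }
}
```

With this made-up pipeline, five operators end up as two vertices, "Source -> Map -> Filter" and "Reduce -> Sink", which mirrors why the second JSON plan usually shows fewer operators and why their names read like chains.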