I was wondering if it is possible to use a Stateful Function within a Flink pipeline. I know they work with different APIs, so I was wondering if it is possible to have a DataStream as ingress for a Stateful Function.

Some context: I'm working on a streaming graph analytics system, and I want to save the state of the graph within a window. Stateful Functions could then allow me to process these distributed graph states by making use of the StateFun messaging.

Thanks!
Hi Annemarie,

Unfortunately this is not possible at the moment, but DataStream as ingress/egress is in the plans, as far as I know.

On Wed, Apr 22, 2020 at 10:26 AM Annemarie Burger <[hidden email]> wrote:
> I was wondering if it is possible to use a Stateful Function within a Flink pipeline.
In reply to this post by Annemarie Burger
Hi Annemarie,

There are plans to make Stateful Functions more easily embeddable within a Flink job, perhaps skipping the ingress/egress routing abstraction altogether and basically exposing the core Flink job that is at the heart of Stateful Functions. Although these plans are not concrete yet, I believe this will be brought up for discussion with the community in the upcoming weeks.

Currently, you can split your pipeline into a preprocessing Flink job and a Stateful Functions job.

Good luck,
Igal

On Wed, Apr 22, 2020 at 4:26 PM Annemarie Burger <[hidden email]> wrote:
> I was wondering if it is possible to use a Stateful Function within a Flink pipeline.
@Igal, this sounds more comprehensive (better) than just opening up DataStreams: "basically exposing the core Flink job that is the heart of stateful functions". Great!

On Wed, Apr 22, 2020 at 10:56 AM Igal Shilman <[hidden email]> wrote:
> There are plans to make stateful functions more easily embeddable within a Flink Job.
In reply to this post by Igal Shilman
Dear Igal,

Can you elaborate more on your proposed solution of splitting the pipeline? If possible, providing some skeleton pseudocode would be awesome! Thanks in advance.

Best,
Max
Hi Max,

Sorry for not being clearer earlier. By splitting the pipeline I mean: having a pre-processing job that does whatever transformations are necessary with the DataStream and outputs to Kafka / Kinesis, and then having a separate StateFun deployment that consumes from that transformed Kafka / Kinesis topic [1].

[1] https://ci.apache.org/projects/flink/flink-statefun-docs-release-2.0/io-module/apache-kafka.html

Good luck,
Igal

On Sun, Apr 26, 2020 at 6:43 PM m@xi <[hidden email]> wrote:
> Can you elaborate more on your proposed solution of splitting the pipeline?
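[Editor's note] A minimal sketch of what the pre-processing half of this split could look like with the Flink DataStream API. This is not from the thread: the topic names (`input-topic`, `statefun-ingress`), the broker address, and the `toUpperCase` transformation are placeholder assumptions; only the overall shape (DataStream job → Kafka topic → StateFun Kafka ingress) follows Igal's description.

```java
import java.util.Properties;

import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducer;

public class PreprocessingJob {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "localhost:9092");

        // 1) Read the raw stream (topic name is an assumption).
        DataStream<String> raw = env.addSource(
                new FlinkKafkaConsumer<>("input-topic", new SimpleStringSchema(), props));

        // 2) Apply whatever DataStream transformations are needed
        //    (placeholder transformation here).
        DataStream<String> transformed = raw.map(String::toUpperCase);

        // 3) Write the transformed records to the topic that the separate
        //    StateFun deployment declares as its Kafka ingress [1].
        transformed.addSink(
                new FlinkKafkaProducer<>("statefun-ingress", new SimpleStringSchema(), props));

        env.execute("preprocessing-job");
    }
}
```

The StateFun side would then be a second, independently deployed job whose module declares a Kafka ingress on `statefun-ingress` and routes those records to your functions, so the two jobs are coupled only through the topic.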
Hi Igal,

Thanks for your responses. Regarding "having a pre-processing job that does whatever transformations necessary with the DataStream and outputs to Kafka / Kinesis, and then having a separate StateFun deployment that consumes from that transformed Kafka / Kinesis topic": I was wondering how to do this when using windows. I would like to call the Stateful Functions job on each separate window of the stream, because the stateful function only needs the information from within the window for its processing. Do you have any idea how to achieve this?

Best,
Annemarie