State Processor API to bootstrap keyed state for Stream Application.


Marco Villalobos
I have read the documentation and various blogs which state that it is possible to load data into a DataSet and use that data to bootstrap the keyed state of a stream application.

The documentation literally says this, "...you can read a batch of data from any store, preprocess it, and write the result to a savepoint that you use to bootstrap the state of a streaming application." (source: https://ci.apache.org/projects/flink/flink-docs-master/dev/libs/state_processor_api.html).
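For context, the DataSet-based State Processor API (current as of Flink 1.11, when this thread was written) writes such a savepoint from an ordinary batch job. Below is a minimal sketch under stated assumptions: the `Account` type, the state name `"total"`, the uid `"accounts-uid"`, and the output path are all hypothetical placeholders, not names from the thread.

```java
import org.apache.flink.api.common.state.ValueState;
import org.apache.flink.api.common.state.ValueStateDescriptor;
import org.apache.flink.api.common.typeinfo.Types;
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.runtime.state.memory.MemoryStateBackend;
import org.apache.flink.state.api.BootstrapTransformation;
import org.apache.flink.state.api.OperatorTransformation;
import org.apache.flink.state.api.Savepoint;
import org.apache.flink.state.api.functions.KeyedStateBootstrapFunction;

public class BootstrapSavepointJob {

    // Hypothetical record type; substitute your own.
    public static class Account {
        public Integer id;
        public Double total;
    }

    // Writes each account's total into keyed ValueState named "total".
    public static class AccountBootstrapper
            extends KeyedStateBootstrapFunction<Integer, Account> {
        private transient ValueState<Double> total;

        @Override
        public void open(Configuration parameters) {
            total = getRuntimeContext().getState(
                    new ValueStateDescriptor<>("total", Types.DOUBLE));
        }

        @Override
        public void processElement(Account value, Context ctx) throws Exception {
            total.update(value.total);
        }
    }

    public static void main(String[] args) throws Exception {
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

        // In practice this DataSet would come from any batch source (JDBC, files, ...).
        DataSet<Account> accounts = env.fromElements(new Account());

        BootstrapTransformation<Account> transformation = OperatorTransformation
                .bootstrapWith(accounts)
                .keyBy(acc -> acc.id)
                .transform(new AccountBootstrapper());

        Savepoint
                .create(new MemoryStateBackend(), 128) // max parallelism must match the streaming job
                .withOperator("accounts-uid", transformation) // must equal the streaming operator's uid()
                .write("file:///tmp/bootstrap-savepoint");

        env.execute("bootstrap");
    }
}
```

The max parallelism passed to `Savepoint.create` and the uid passed to `withOperator` must match the streaming job that will later restore from this savepoint.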

Another blog states, "You can create both Batch and Stream environment in a single job." (source: https://www.kharekartik.dev/2019/12/14/bootstrap-your-flink-jobs/).
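With the DataSet-based API, the "single job" phrasing is misleading in practice: the bootstrap typically runs as its own batch job, and the streaming job is then started from the resulting savepoint. A sketch of the streaming side, assuming the same hypothetical state name (`"total"`) and operator uid (`"accounts-uid"`) as in a bootstrap job:

```java
import org.apache.flink.api.common.functions.RichMapFunction;
import org.apache.flink.api.common.state.ValueState;
import org.apache.flink.api.common.state.ValueStateDescriptor;
import org.apache.flink.api.common.typeinfo.Types;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class StreamingJob {

    // Reads the bootstrapped "total" state for each key and keeps updating it.
    public static class TotalEnricher
            extends RichMapFunction<Tuple2<Integer, Double>, Tuple2<Integer, Double>> {
        private transient ValueState<Double> total;

        @Override
        public void open(Configuration parameters) {
            // Same state name and type as used by the bootstrap function.
            total = getRuntimeContext().getState(
                    new ValueStateDescriptor<>("total", Types.DOUBLE));
        }

        @Override
        public Tuple2<Integer, Double> map(Tuple2<Integer, Double> in) throws Exception {
            Double current = total.value();
            double updated = (current == null ? 0.0 : current) + in.f1;
            total.update(updated);
            return Tuple2.of(in.f0, updated);
        }
    }

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        env.fromElements(Tuple2.of(1, 10.0)) // stand-in for a real source (Kafka, etc.)
           .keyBy(t -> t.f0)
           .map(new TotalEnricher())
           .uid("accounts-uid") // must match the uid the savepoint was written under
           .print();

        env.execute("streaming");
    }
}
```

The streaming job is then submitted from the savepoint, e.g. `bin/flink run -s file:///tmp/bootstrap-savepoint streaming-job.jar`; Flink matches the bootstrapped state to this operator by its uid.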

I want to try this approach, but I cannot find any real examples online.

I have failed on numerous attempts.

I have a few questions:

1) Is there an example that demonstrates this feature?
2) How can you launch a batch and a stream environment from a single job?
3) Does this require two jobs?

Anybody, please help.


Re: State Processor API to bootstrap keyed state for Stream Application.

Arvid Heise
For future readers: this thread was resolved in "Please help, I need to bootstrap keyed state into a stream", asked by Marco on the user mailing list.

On Fri, Aug 7, 2020 at 11:52 PM Marco Villalobos <[hidden email]> wrote:

--

Arvid Heise | Senior Java Developer


Ververica GmbH | Invalidenstrasse 115, 10115 Berlin, Germany