Hi,
Is it possible to call a batch job from a streaming context? What I want to do is: for a given input event, fetch Cassandra elements based on the event data, apply a transformation on them, and apply a ranking once all elements fetched from Cassandra have been processed. If I do this in batch mode I would have to submit a job for each event, and I can have an event every 45 seconds. Is there any alternative? Can I start a batch job that will receive an external request, process it, and then wait for the next request?

Thanks,
Eric
Hi,
I’m not sure I understand your problem and your context, but spawning a batch job every 45 seconds doesn’t sound like that bad an idea (as long as the job is short). Another option would be to incorporate this batch job into your streaming job, for example by reading from Cassandra using an AsyncIO operator: https://ci.apache.org/projects/flink/flink-docs-release-1.6/dev/stream/operators/asyncio.html

A quick Google search turned up, for example, this: https://stackoverflow.com/questions/43067681/read-data-from-cassandra-for-processing-in-flink

Piotrek
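[Editor's note: to make the AsyncIO suggestion concrete, here is a minimal sketch, not from the thread, using the DataStax 3.x driver. It assumes the incoming event has already been reduced to a String key; the contact point, keyspace "my_keyspace", table "elements", and column "key" are placeholders.]

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.ResultSet;
import com.datastax.driver.core.Session;
import com.google.common.util.concurrent.FutureCallback;
import com.google.common.util.concurrent.Futures;
import com.google.common.util.concurrent.MoreExecutors;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.functions.async.ResultFuture;
import org.apache.flink.streaming.api.functions.async.RichAsyncFunction;

import java.util.Collections;

public class CassandraLookup extends RichAsyncFunction<String, ResultSet> {

    private transient Cluster cluster;
    private transient Session session;

    @Override
    public void open(Configuration parameters) {
        // One Cassandra session per task, reused for every incoming event.
        cluster = Cluster.builder().addContactPoint("127.0.0.1").build(); // placeholder host
        session = cluster.connect("my_keyspace"); // placeholder keyspace
    }

    @Override
    public void asyncInvoke(String eventKey, ResultFuture<ResultSet> resultFuture) {
        // Non-blocking query: the callback completes Flink's future when Cassandra answers,
        // so the operator never blocks a task thread while waiting on I/O.
        Futures.addCallback(
                session.executeAsync("SELECT * FROM elements WHERE key = ?", eventKey),
                new FutureCallback<ResultSet>() {
                    @Override
                    public void onSuccess(ResultSet rows) {
                        resultFuture.complete(Collections.singleton(rows));
                    }

                    @Override
                    public void onFailure(Throwable t) {
                        resultFuture.completeExceptionally(t);
                    }
                },
                MoreExecutors.directExecutor());
    }

    @Override
    public void close() {
        if (session != null) session.close();
        if (cluster != null) cluster.close();
    }
}

Wired in with, for example, AsyncDataStream.unorderedWait(events, new CassandraLookup(), 10, TimeUnit.SECONDS, 100), the downstream transformation and ranking can then run as ordinary operators once all rows for an event have arrived.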
Hi Eric,

You can run a job from another one using the REST API. This is the only way we have found to launch a batch job from a streaming job.
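[Editor's note: a rough sketch of that approach, not from the thread. Flink's REST API exposes POST /jars/:jarid/run for a jar previously uploaded via POST /jars/upload; the host, port, jar id, and program argument below are all placeholders.]

import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

public class BatchJobTrigger {

    // Submits a previously uploaded batch jar as a new job, passing event
    // data along as program arguments. Returns the HTTP status code.
    public static int triggerBatchJob(String eventKey) throws Exception {
        URL url = new URL(
                "http://jobmanager:8081/jars/my-batch-job.jar/run"); // placeholder jar id
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestMethod("POST");
        conn.setRequestProperty("Content-Type", "application/json");
        conn.setDoOutput(true);

        String body = "{\"programArgs\": \"--eventKey " + eventKey + "\"}";
        try (OutputStream os = conn.getOutputStream()) {
            os.write(body.getBytes(StandardCharsets.UTF_8));
        }
        return conn.getResponseCode(); // 200 means the job was accepted
    }
}

A call like this can be made from a sink or a process function in the streaming job whenever an event requires the batch computation.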