window function outputs two different values

classic Classic list List threaded Threaded
4 messages Options
Reply | Threaded
Open this post in threaded view
|

window function outputs two different values

tao xiao
Hi team,

I have a requirement that wants to output two different values from a time window reduce function. Here is basic workflow

1. fetch data from Kafka
2. flow the data to a event session window. kafka source -> keyBy -> session window -> reduce
3. inside the reduce function, count the number of data and also emit the data itself to another operator for further processing

As the reduce function can only emit the count, I want to know how to also emit the data as well?

Reply | Threaded
Open this post in threaded view
|

Re: window function outputs two different values

tao xiao
Hi team,

any suggestions on below topic?

I have a requirement that wants to output two different values from a time window reduce function. Here is basic workflow

1. fetch data from Kafka
2. flow the data to a event session window. kafka source -> keyBy -> session window -> reduce
3. inside the reduce function, count the number of data and also emit the data itself to another operator for further processing

As the reduce function can only emit the count, I want to know how to also emit the data as well?



On Sat, 7 Jan 2017 at 20:30 tao xiao <[hidden email]> wrote:
Hi team,

I have a requirement that wants to output two different values from a time window reduce function. Here is basic workflow

1. fetch data from Kafka
2. flow the data to a event session window. kafka source -> keyBy -> session window -> reduce
3. inside the reduce function, count the number of data and also emit the data itself to another operator for further processing

As the reduce function can only emit the count, I want to know how to also emit the data as well?

Reply | Threaded
Open this post in threaded view
|

Re: window function outputs two different values

Aljoscha Krettek
Hi,
I'm afraid this is not possible with the current model. A reduce function is only meant to combine two values and output the result of that. Side effects, such as emitting further data are not allowed right now.

Cheers,
Aljoscha

On Mon, 9 Jan 2017 at 15:27 tao xiao <[hidden email]> wrote:
Hi team,

any suggestions on below topic?

I have a requirement that wants to output two different values from a time window reduce function. Here is basic workflow

1. fetch data from Kafka
2. flow the data to a event session window. kafka source -> keyBy -> session window -> reduce
3. inside the reduce function, count the number of data and also emit the data itself to another operator for further processing

As the reduce function can only emit the count, I want to know how to also emit the data as well?



On Sat, 7 Jan 2017 at 20:30 tao xiao <[hidden email]> wrote:
Hi team,

I have a requirement that wants to output two different values from a time window reduce function. Here is basic workflow

1. fetch data from Kafka
2. flow the data to a event session window. kafka source -> keyBy -> session window -> reduce
3. inside the reduce function, count the number of data and also emit the data itself to another operator for further processing

As the reduce function can only emit the count, I want to know how to also emit the data as well?

Reply | Threaded
Open this post in threaded view
|

Re: window function outputs two different values

Yury Ruchin
Hi,

Is there a strict requirement that elements must proceed along the processing pipeline exactly after being accounted by the reduce function? If not, you could derive two streams from the original one to be processed concurrently, something like this:

val protoStream = kafka source -> keyBy

val aggregateStream = protoStream -> window -> reduce
val someOtherStream = protoStream -> <other processing operators go here>

Or, if the above is not an option and window collection latency is not an issue, you could just use generic window function or fold function. The former gives access to window elements as an iterable, the latter allows using custom accumulator that contains the intermediate count and window elements seen so far.

Regards,
Yury

2017-01-10 17:43 GMT+03:00 Aljoscha Krettek <[hidden email]>:
Hi,
I'm afraid this is not possible with the current model. A reduce function is only meant to combine two values and output the result of that. Side effects, such as emitting further data are not allowed right now.

Cheers,
Aljoscha

On Mon, 9 Jan 2017 at 15:27 tao xiao <[hidden email]> wrote:
Hi team,

any suggestions on below topic?

I have a requirement that wants to output two different values from a time window reduce function. Here is basic workflow

1. fetch data from Kafka
2. flow the data to a event session window. kafka source -> keyBy -> session window -> reduce
3. inside the reduce function, count the number of data and also emit the data itself to another operator for further processing

As the reduce function can only emit the count, I want to know how to also emit the data as well?



On Sat, 7 Jan 2017 at 20:30 tao xiao <[hidden email]> wrote:
Hi team,

I have a requirement that wants to output two different values from a time window reduce function. Here is basic workflow

1. fetch data from Kafka
2. flow the data to a event session window. kafka source -> keyBy -> session window -> reduce
3. inside the reduce function, count the number of data and also emit the data itself to another operator for further processing

As the reduce function can only emit the count, I want to know how to also emit the data as well?