finite subset of an infinite data stream

Posted by rss rss on
URL: http://deprecated-apache-flink-user-mailing-list-archive.369.s1.nabble.com/finite-subset-of-an-infinite-data-stream-tp3400.html

Hello,

 

I need to extract a finite subset like a data buffer from an infinite data stream. The best way for me is to obtain a finite stream with data accumulated for a 1minute before (as example). But I not found any existing technique to do it.

 

As a possible ways how to do something near to a stream’s subset I see following cases:

-          some transformation operation like ‘take_while’ that produces new stream but able to switch one to FINNISHED state. Unfortunately I not found how to switch the state of a stream from a user code of transformation functions;

-          new DataStream or StreamSource constructors which allow to connect a data processing chain to the source stream. It may be something like mentioned take_while transform function or modified StreamSource.run method with data from the source stream.

 

That is I have two questions.

1)      Is there any technique to extract accumulated data from a stream as a stream (to union it with another stream)? This is like pure buffer mode.

2)      If the answer to first question is negative, is there something like take_while transformation or should I think about custom implementation of it? Is it possible to implement it without modification of the core of Flink?

 

Regards,

Roman