TimelyFlatMapFunction and DataStream

classic Classic list List threaded Threaded
4 messages Options
Reply | Threaded
Open this post in threaded view
|

TimelyFlatMapFunction and DataStream

Ken Krugler
I’m curious why it seems like a TimelyFlatMapFunction can’t be used with a regular DataStream, but it can be used with a KeyedStream.

Or maybe I’m missing something obvious (this is with 1.2-SNAPSHOT, pulled today).

Also the documentation of TimelyFlatMapFunction (https://ci.apache.org/projects/flink/flink-docs-master/api/java/index.html?org/apache/flink/streaming/api/functions/TimelyFlatMapFunction.html) shows using it with a DataStream.flatMap(xxx) call.

Thanks,

— Ken

--------------------------
Ken Krugler
+1 530-210-6378
custom big data solutions & training
Hadoop, Cascading, Cassandra & Solr



Reply | Threaded
Open this post in threaded view
|

Re: TimelyFlatMapFunction and DataStream

Stephan Ewen
Hi Ken!

It may not be obvious, so here is a bit of background:

The timers that are used in the FlatMapFunction are scoped by key. We thought that this is how they are mainly useful - that's why you need to define keys to use them.
I think the docs are in error, thanks for pointing that out.

In your use case, do you need timers without keys, or only access to the current processing/event time?

Best,
Stephan


On Wed, Nov 2, 2016 at 1:59 AM, Ken Krugler <[hidden email]> wrote:
I’m curious why it seems like a TimelyFlatMapFunction can’t be used with a regular DataStream, but it can be used with a KeyedStream.

Or maybe I’m missing something obvious (this is with 1.2-SNAPSHOT, pulled today).

Also the documentation of TimelyFlatMapFunction (https://ci.apache.org/projects/flink/flink-docs-master/api/java/index.html?org/apache/flink/streaming/api/functions/TimelyFlatMapFunction.html) shows using it with a DataStream.flatMap(xxx) call.

Thanks,

— Ken

--------------------------
Ken Krugler
<a href="tel:%2B1%20530-210-6378" value="+15302106378" target="_blank">+1 530-210-6378
custom big data solutions & training
Hadoop, Cascading, Cassandra & Solr




Reply | Threaded
Open this post in threaded view
|

Re: TimelyFlatMapFunction and DataStream

Aljoscha Krettek
There is already an open PR for fixing those Javadoc issues (along with some other issues): https://github.com/apache/flink/pull/2715

On Wed, 2 Nov 2016 at 11:04 Stephan Ewen <[hidden email]> wrote:
Hi Ken!

It may not be obvious, so here is a bit of background:

The timers that are used in the FlatMapFunction are scoped by key. We thought that this is how they are mainly useful - that's why you need to define keys to use them.
I think the docs are in error, thanks for pointing that out.

In your use case, do you need timers without keys, or only access to the current processing/event time?

Best,
Stephan


On Wed, Nov 2, 2016 at 1:59 AM, Ken Krugler <[hidden email]> wrote:
I’m curious why it seems like a TimelyFlatMapFunction can’t be used with a regular DataStream, but it can be used with a KeyedStream.

Or maybe I’m missing something obvious (this is with 1.2-SNAPSHOT, pulled today).

Also the documentation of TimelyFlatMapFunction (https://ci.apache.org/projects/flink/flink-docs-master/api/java/index.html?org/apache/flink/streaming/api/functions/TimelyFlatMapFunction.html) shows using it with a DataStream.flatMap(xxx) call.

Thanks,

— Ken

--------------------------
Ken Krugler
<a href="tel:%2B1%20530-210-6378" value="+15302106378" class="gmail_msg" target="_blank">+1 530-210-6378
custom big data solutions & training
Hadoop, Cascading, Cassandra & Solr




Reply | Threaded
Open this post in threaded view
|

Re: TimelyFlatMapFunction and DataStream

Ken Krugler
In reply to this post by Stephan Ewen
Hi Stephan,

On Nov 2, 2016, at 3:04am, Stephan Ewen <[hidden email]> wrote:

Hi Ken!

It may not be obvious, so here is a bit of background:

The timers that are used in the FlatMapFunction are scoped by key. We thought that this is how they are mainly useful - that's why you need to define keys to use them.
I think the docs are in error, thanks for pointing that out.

In your use case, do you need timers without keys, or only access to the current processing/event time?

I was hoping to use timers as an alternative approach for async generation of tuples. I’ve got several functions that use multi-threading to process the incoming tuples. These get put into a queue, processed by threads, and the results placed in another queue. With timers it seemed like I could regularly flush the output queue to the collector.

So unkeyed data, yes.

Though it wasn’t clear from the docs if/how I would set up a timer that regularly fires (say every 100ms).

In any case I can keep using my current approach of having a “tickler” Tuple0 stream that I use with CoFlatMapFunctions.

Regards,

— Ken
 
On Wed, Nov 2, 2016 at 1:59 AM, Ken Krugler <[hidden email]> wrote:
I’m curious why it seems like a TimelyFlatMapFunction can’t be used with a regular DataStream, but it can be used with a KeyedStream.

Or maybe I’m missing something obvious (this is with 1.2-SNAPSHOT, pulled today).

Also the documentation of TimelyFlatMapFunction (https://ci.apache.org/projects/flink/flink-docs-master/api/java/index.html?org/apache/flink/streaming/api/functions/TimelyFlatMapFunction.html) shows using it with a DataStream.flatMap(xxx) call.

Thanks,

— Ken

--------------------------
Ken Krugler
<a href="tel:%2B1%20530-210-6378" value="+15302106378" target="_blank" class="">+1 530-210-6378
custom big data solutions & training
Hadoop, Cascading, Cassandra & Solr





--------------------------
Ken Krugler
+1 530-210-6378
custom big data solutions & training
Hadoop, Cascading, Cassandra & Solr