Best way to derive streams from another one

AndreaKinn
Hi,
I have a data stream that results from an operation executed on another
data stream.
Essentially, I want to derive two different streams from it and send them
to different Cassandra tables.

I.e.:

Data stream 0 is composed of Tuple3<Val1, Val2, Val3>.

I want to have:

a data stream 1 composed of every triple <Val1, Val2, Val3> of data stream 0
where Val2 > X
and
a data stream 2 composed of every pair <Val1, Val3>.

This leads to two data streams with tuples of different arity (3 and
2).
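
To make the intended split concrete, here is a minimal plain-Java sketch with no Flink dependency. The class and method names, the field types (String and double), and the sample values are all hypothetical stand-ins for illustration only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class StreamSplitSketch {

    /** Hypothetical stand-in for Tuple3<Val1, Val2, Val3>; field types are assumed. */
    static final class Triple {
        final String val1;
        final double val2;
        final double val3;

        Triple(String val1, double val2, double val3) {
            this.val1 = val1;
            this.val2 = val2;
            this.val3 = val3;
        }
    }

    /** Hypothetical stand-in for the pair <Val1, Val3>. */
    static final class Pair {
        final String val1;
        final double val3;

        Pair(String val1, double val3) {
            this.val1 = val1;
            this.val3 = val3;
        }
    }

    /** Data stream 1: every triple whose Val2 is greater than the threshold x. */
    static List<Triple> stream1(List<Triple> stream0, double x) {
        List<Triple> out = new ArrayList<>();
        for (Triple t : stream0) {
            if (t.val2 > x) {
                out.add(t);
            }
        }
        return out;
    }

    /** Data stream 2: the pair (Val1, Val3) of every triple, unfiltered. */
    static List<Pair> stream2(List<Triple> stream0) {
        List<Pair> out = new ArrayList<>();
        for (Triple t : stream0) {
            out.add(new Pair(t.val1, t.val3));
        }
        return out;
    }

    public static void main(String[] args) {
        List<Triple> stream0 = Arrays.asList(
                new Triple("a", 1.0, 10.0),
                new Triple("b", 5.0, 20.0),
                new Triple("c", 9.0, 30.0));

        System.out.println(stream1(stream0, 4.0).size()); // prints 2 ("b" and "c" pass the filter)
        System.out.println(stream2(stream0).size());      // prints 3 (one pair per input triple)
    }
}
```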

Currently I have implemented it by taking data stream 0 and then
separately calling a map function to obtain data stream 2 and a flatMap
function to obtain data stream 1. So I have two different Cassandra
prepared statements, one for each stream.
It works fine.

However, this solution looks really awful and inefficient. Is there a more
elegant alternative?

I also tried sending a data stream of <Val1, Val2, Val3> to Cassandra and
selecting just two of the values in the statement (this way I would need
only the flatMap operator), but an exception is raised during execution.



--
Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/

Re: Best way to derive streams from another one

Chesnay Schepler
Have a look at side outputs in the documentation; they allow you to emit
to multiple streams (of different types!) with a ProcessFunction.
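
As a rough sketch of what that could look like for the case in the question (not a definitive implementation): the main output carries the filtered triples and a side output carries the pairs. The field types (String, Double, Double), the threshold value X = 4.0, the tag name "pairs", the class name, and the sample elements are all assumptions made for illustration. The Cassandra sinks are left out.

```java
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.api.java.tuple.Tuple3;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.ProcessFunction;
import org.apache.flink.util.Collector;
import org.apache.flink.util.OutputTag;

public class SideOutputSketch {

    // Created as an anonymous subclass so Flink can capture the side output's generic type.
    static final OutputTag<Tuple2<String, Double>> PAIRS =
            new OutputTag<Tuple2<String, Double>>("pairs") {};

    // Hypothetical threshold standing in for X from the question.
    static final double X = 4.0;

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Sample input standing in for data stream 0.
        DataStream<Tuple3<String, Double, Double>> stream0 = env.fromElements(
                Tuple3.of("a", 1.0, 10.0),
                Tuple3.of("b", 5.0, 20.0),
                Tuple3.of("c", 9.0, 30.0));

        // One pass over stream 0: the main output is data stream 1 (Val2 > X),
        // the side output is data stream 2 (every <Val1, Val3> pair).
        SingleOutputStreamOperator<Tuple3<String, Double, Double>> stream1 = stream0.process(
                new ProcessFunction<Tuple3<String, Double, Double>, Tuple3<String, Double, Double>>() {
                    @Override
                    public void processElement(
                            Tuple3<String, Double, Double> value,
                            Context ctx,
                            Collector<Tuple3<String, Double, Double>> out) {
                        if (value.f1 > X) {
                            out.collect(value);
                        }
                        ctx.output(PAIRS, Tuple2.of(value.f0, value.f2));
                    }
                });

        DataStream<Tuple2<String, Double>> stream2 = stream1.getSideOutput(PAIRS);

        // Each stream would get its own Cassandra sink here; print() is used only for the sketch.
        stream1.print();
        stream2.print();

        env.execute("side-output-sketch");
    }
}
```

Because both outputs come from a single operator, stream 0 is traversed once, and the two resulting streams keep their different arities (3 and 2), so each can use its own prepared statement.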

On 10.09.2017 22:15, AndreaKinn wrote:

> Hi,
> I have a data stream that results from an operation executed on another
> data stream.
> Essentially, I want to derive two different streams from it and send them
> to different Cassandra tables.
>
> I.e.:
>
> Data stream 0 is composed of Tuple3<Val1, Val2, Val3>.
>
> I want to have:
>
> a data stream 1 composed of every triple <Val1, Val2, Val3> of data stream 0
> where Val2 > X
> and
> a data stream 2 composed of every pair <Val1, Val3>.
>
> This leads to two data streams with tuples of different arity (3 and
> 2).
>
> Currently I have implemented it by taking data stream 0 and then
> separately calling a map function to obtain data stream 2 and a flatMap
> function to obtain data stream 1. So I have two different Cassandra
> prepared statements, one for each stream.
> It works fine.
>
> However, this solution looks really awful and inefficient. Is there a more
> elegant alternative?
>
> I also tried sending a data stream of <Val1, Val2, Val3> to Cassandra and
> selecting just two of the values in the statement (this way I would need
> only the flatMap operator), but an exception is raised during execution.