This line «stream.keyBy(0)» only works if stream is of type DataStream[Tuple] - and this Tuple is not a scala tuple but a flink tuple (why not to use scala Tuple?). Currently keyBy can be applied to anything (at least in scala) like DataStream[String] and DataStream[Array[String]].
As described in the documentation, calls like keyBy(0) are meant for Tuples, so it only works for DataStream[Tuple]. Other key definition types like keyBy(new KeySelector() {...}) can basically take any DataStream of arbitrary data type. Flink finds out whether or not there is a conflict between the type of the data in the DataStream and the way the key is defined at runtime.
Hi,
using .keyBy(0) on a Scala DataStream[Tuple2] where Tuple2 is a Scala Tuple should work. Look, for example, at the SocketTextStreamWordCount example in Flink.
Cheers,
Aljoscha
> On 13 Jan 2016, at 18:25, Tzu-Li (Gordon) Tai <[hidden email]> wrote:
>
> Hi Saiph,
>
> In Flink, the key for keyBy() can be provided in different ways:
> https://ci.apache.org/projects/flink/flink-docs-master/apis/programming_guide.html#specifying-keys > (the doc is for DataSet API, but specifying keys is basically the same for
> DataStream and DataSet).
>
> As described in the documentation, calls like keyBy(0) are meant for Tuples,
> so it only works for DataStream[Tuple]. Other key definition types like
> keyBy(new KeySelector() {...}) can basically take any DataStream of
> arbitrary data type. Flink finds out whether or not there is a conflict
> between the type of the data in the DataStream and the way the key is
> defined at runtime.
>
> Hope this helps!
>
> Cheers,
> Gordon
>
>
>
>
>
> --
> View this message in context: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Flink-DataStream-and-KeyBy-tp4271p4272.html > Sent from the Apache Flink User Mailing List archive. mailing list archive at Nabble.com.