dynamically partitioned stream
Posted by
Martin Eden on
URL: http://deprecated-apache-flink-user-mailing-list-archive.369.s1.nabble.com/dynamically-partitioned-stream-tp15280.html
Hi all,
I am trying to implement the following using Flink:
I have 2 input message streams:
1. Data Stream:
KEY VALUE TIME
.
.
.
C V6 6
B V6 6
A V5 5
A V4 4
C V3 3
A V3 3
B V3 3
B V2 2
A V1 1
2. Control Stream:
Lambda ArgumentKeys TIME
.
.
.
f2 [A, C] 4
f1 [A, B, C] 1
I want to apply the lambdas coming in the control stream to the selection of keys that are coming in the data stream.
Since we have 2 streams I naturally thought of connecting them using .connect. For this I need to key both of them by a certain criteria. And here lies the problem, how can I make sure the messages with keys A,B,C specified in the control stream end up in the same task as well as the control message (f1, [A, B, C]) itself. Basically I don't know how to key by to achieve this.
I suspect a custom partitioner is required that partitions the data stream based on the messages in the control stream? Is this even possible?
Any suggestions welcomed!
Thanks,
M