How to visit outer service in batch for sql

classic Classic list List threaded Threaded
2 messages Options
Reply | Threaded
Open this post in threaded view
|

How to visit outer service in batch for sql

liujiangang
      For API, we can visit outer service in batch through countWindow, such as the following. We can visit outer service every 1000 records. If we visit outer service every record, it will be very slow for our job.
source.keyBy(new KeySelector())
.countWindow(1000)
.apply((WindowFunction<MyType, MyType, String, GlobalWindow>)
(s, globalWindow, values, collector) -> {
List<MyType> resultList = service.visit(values);
for (MyType result: resultList) {
if (result.ok) {
collector.collect(result);
}
}
});
      But how can I write SQL to implement the batch logic? I can use udf to visit outer service. Currently, Flink only support time window but not count window. I also check the udf wiki but find it hard to batch records. 
      Any suggestion is welcome. Thank you.






Reply | Threaded
Open this post in threaded view
|

Re: How to visit outer service in batch for sql

Danny Chan
Hi, did you try to define a UDAF there within your group window sql, where you can have a custom service there.

I’m afraid you are right, SQL only supports time windows.

Best,
Danny Chan
在 2020年8月26日 +0800 PM8:02,刘建刚 <[hidden email]>,写道:
      For API, we can visit outer service in batch through countWindow, such as the following. We can visit outer service every 1000 records. If we visit outer service every record, it will be very slow for our job.
source.keyBy(new KeySelector())
.countWindow(1000)
.apply((WindowFunction<MyType, MyType, String, GlobalWindow>)
(s, globalWindow, values, collector) -> {
List<MyType> resultList = service.visit(values);
for (MyType result: resultList) {
if (result.ok) {
collector.collect(result);
}
}
});
      But how can I write SQL to implement the batch logic? I can use udf to visit outer service. Currently, Flink only support time window but not count window. I also check the udf wiki but find it hard to batch records. 
      Any suggestion is welcome. Thank you.