join in sql without time interval

lec ssmi
Hi,
Consider the following SQL:

   SELECT * FROM Orders INNER JOIN Product ON Orders.productId = Product.id

How can this be implemented with the DataStream API instead of SQL?
The DataStream API only offers Window Join and Interval Join. According to the official website, the state-size problem of such a SQL query is addressed through TableConfig. But TableConfig can only configure the state TTL of non-time-based operators. So I think this SQL is implemented as neither a tumbling window join, nor a sliding window join, nor an interval join.
 
Best Regards
Lec Ssmi

Re: join in sql without time interval

lec ssmi
Maybe the connect method?

On Thu, Apr 30, 2020 at 3:59 PM lec ssmi <[hidden email]> wrote:
Hi,
Consider the following SQL:

   SELECT * FROM Orders INNER JOIN Product ON Orders.productId = Product.id

How can this be implemented with the DataStream API instead of SQL?
The DataStream API only offers Window Join and Interval Join. According to the official website, the state-size problem of such a SQL query is addressed through TableConfig. But TableConfig can only configure the state TTL of non-time-based operators. So I think this SQL is implemented as neither a tumbling window join, nor a sliding window join, nor an interval join.
 
Best Regards
Lec Ssmi

Re: join in sql without time interval

Konstantin Knauf-3
Hi Lec Ssmi,

Yes, DataStream#connect on two streams, both keyed on the productId, with a KeyedCoProcessFunction is the way to go.
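
To make the idea concrete, here is a minimal plain-Java sketch of the logic such a function would carry. It deliberately has no Flink dependency: the two maps stand in for the per-key state a KeyedCoProcessFunction would hold, and `onOrder` / `onProduct` play the role of its two element callbacks. The class name, method names, and the reduced record shapes (order id, product name) are assumptions for illustration, and the sketch covers append-only input only (no updates or retractions).

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the join logic; not Flink code.
class RegularJoinSketch {
    // "State" for each input side, grouped by the join key (productId).
    private final Map<String, List<String>> orderIdsByProduct = new HashMap<>();
    private final Map<String, List<String>> productNamesById = new HashMap<>();
    // Joined rows emitted so far, as "orderId,productName".
    private final List<String> output = new ArrayList<>();

    // Called for every record of the Orders stream.
    void onOrder(String orderId, String productId) {
        // Remember the order so that products arriving later can still match it.
        orderIdsByProduct.computeIfAbsent(productId, k -> new ArrayList<>()).add(orderId);
        // Join against every product already seen for this key.
        for (String name : productNamesById.getOrDefault(productId, Collections.emptyList())) {
            output.add(orderId + "," + name);
        }
    }

    // Called for every record of the Product stream.
    void onProduct(String productId, String name) {
        // Remember the product so that orders arriving later can still match it.
        productNamesById.computeIfAbsent(productId, k -> new ArrayList<>()).add(name);
        // Join against every order already seen for this key.
        for (String orderId : orderIdsByProduct.getOrDefault(productId, Collections.emptyList())) {
            output.add(orderId + "," + name);
        }
    }

    List<String> output() {
        return output;
    }

    public static void main(String[] args) {
        RegularJoinSketch join = new RegularJoinSketch();
        join.onOrder("o1", "p1");        // no product yet: only buffered
        join.onProduct("p1", "Laptop");  // matches the buffered order o1
        join.onOrder("o2", "p1");        // matches the stored product
        join.onOrder("o3", "p2");        // no product for p2: only buffered
        System.out.println(join.output()); // prints [o1,Laptop, o2,Laptop]
    }
}
```

Because neither side is ever cleared, both maps grow without bound, which is the state-size problem raised in the original question.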

Cheers, 

Konstantin

On Thu, Apr 30, 2020 at 11:10 AM lec ssmi <[hidden email]> wrote:
Maybe the connect method?

Re: join in sql without time interval

lec ssmi
Thanks, but is the Table API really implemented like this under the hood?

On Thu, Apr 30, 2020 at 10:02 PM Konstantin Knauf <[hidden email]> wrote:
Hi Lec Ssmi,

Yes, DataStream#connect on two streams, both keyed on the productId, with a KeyedCoProcessFunction is the way to go.

Cheers, 

Konstantin

Re: join in sql without time interval

Jark Wu-3
Yes. Flink Table & SQL uses something like that, but built on a lower-level API called `TwoInputStreamOperator`. See:
org.apache.flink.table.runtime.operators.join.stream.StreamingJoinOperator

The state TTL configured in TableConfig also takes effect on such a join query.
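
What a state TTL means for one side of such a join can be illustrated with a small plain-Java sketch. This is not Flink's implementation: the class and method names are hypothetical, timestamps are passed in explicitly, and the sketch assumes that the TTL clock of an entry is reset on every write.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of time-to-live cleanup for per-key state; not Flink code.
class TtlStateSketch {
    private final long ttlMillis;
    private final Map<String, String> values = new HashMap<>();
    private final Map<String, Long> lastWrite = new HashMap<>();

    TtlStateSketch(long ttlMillis) {
        this.ttlMillis = ttlMillis;
    }

    // Stores the value and resets the TTL clock for this key.
    void put(String key, String value, long nowMillis) {
        values.put(key, value);
        lastWrite.put(key, nowMillis);
    }

    // Returns the value, or null if the key is absent or its entry has expired.
    // Expired entries are dropped on access so they stop occupying state.
    String get(String key, long nowMillis) {
        Long written = lastWrite.get(key);
        if (written == null) {
            return null;
        }
        if (nowMillis - written >= ttlMillis) {
            values.remove(key);
            lastWrite.remove(key);
            return null;
        }
        return values.get(key);
    }

    int size() {
        return values.size();
    }

    public static void main(String[] args) {
        TtlStateSketch products = new TtlStateSketch(1000);
        products.put("p1", "Laptop", 0);
        System.out.println(products.get("p1", 999));  // still within the TTL
        System.out.println(products.get("p1", 1000)); // expired: the entry is gone
    }
}
```

The trade-off is that a record arriving after its join partner has expired finds nothing to match, so the TTL bounds the state size at the cost of potentially missing results.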

Best,
Jark

On Thu, 30 Apr 2020 at 22:35, lec ssmi <[hidden email]> wrote:
Thanks, but is the Table API really implemented like this under the hood?
