Hi,
I am following the PopularPlacesSQL example (http://training.data-artisans.com/exercises/popularPlacesSql.html), but I am unable to understand why the following statement would pick up only events with the START flag:

"SELECT " +
"toCoords(cell), wstart, wend, isStart, popCnt " +
"FROM " +
"(SELECT " +
"cell, " +
"isStart, " +
"HOP_START(eventTime, INTERVAL '5' MINUTE, INTERVAL '15' MINUTE) AS wstart, " +
"HOP_END(eventTime, INTERVAL '5' MINUTE, INTERVAL '15' MINUTE) AS wend, " +
"COUNT(isStart) AS popCnt " +
"FROM " +
"(SELECT " +
"eventTime, " +
"isStart, " +
"CASE WHEN isStart THEN toCellId(startLon, startLat) ELSE toCellId(endLon, endLat) END AS cell " +
"FROM TaxiRides " +
"WHERE isInNYC(startLon, startLat) AND isInNYC(endLon, endLat)) " +
"GROUP BY cell, isStart, HOP(eventTime, INTERVAL '5' MINUTE, INTERVAL '15' MINUTE)) " +
"WHERE popCnt > 20"

With the low-level ProcessFunction we can update state in processElement. How does SQL rule out the unpaired events?

James C.-C.Yu
Hi James,

the exercise does not require filtering on pickup events. It says:

"This is done by counting every five minutes the number of taxi rides that started and ended in the same area within the last 15 minutes. Arrival and departure locations should be separately counted."

Best, Fabian

[1] https://ci.apache.org/projects/flink/flink-docs-release-1.4/dev/table/sql.html#joins
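
To illustrate the point about separate counting, here is a minimal plain-Java sketch of what GROUP BY cell, isStart, HOP(eventTime, INTERVAL '5' MINUTE, INTERVAL '15' MINUTE) does conceptually. It is not Flink code: the class HopGroupingSketch, the Event type, and the methods windowStarts and countPerGroup are hypothetical names made up for this example, and it assumes windows are aligned to the epoch with no offset. Start and end events are both kept; they simply land in different groups because isStart is part of the grouping key.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java sketch (no Flink dependency) of grouping by cell, isStart and hopping window.
class HopGroupingSketch {

    static final long MINUTE = 60_000L;
    static final long SLIDE = 5 * MINUTE;   // INTERVAL '5' MINUTE
    static final long SIZE = 15 * MINUTE;   // INTERVAL '15' MINUTE

    // Hypothetical stand-in for one row produced by the innermost SELECT.
    static final class Event {
        final long eventTime;   // epoch millis, assumed non-negative
        final boolean isStart;
        final int cell;

        Event(long eventTime, boolean isStart, int cell) {
            this.eventTime = eventTime;
            this.isStart = isStart;
            this.cell = cell;
        }
    }

    // Start timestamps of every hopping window [start, start + SIZE) that contains eventTime.
    static List<Long> windowStarts(long eventTime) {
        List<Long> starts = new ArrayList<>();
        long lastStart = eventTime - (eventTime % SLIDE);
        for (long s = lastStart; s > eventTime - SIZE; s -= SLIDE) {
            starts.add(s);
        }
        return starts;
    }

    // Counts rows per (cell, isStart, window start); no event is dropped.
    static Map<String, Long> countPerGroup(List<Event> events) {
        Map<String, Long> counts = new TreeMap<>();
        for (Event e : events) {
            for (long wstart : windowStarts(e.eventTime)) {
                String key = e.cell + "|" + e.isStart + "|" + wstart;
                counts.merge(key, 1L, Long::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        List<Event> events = new ArrayList<>();
        events.add(new Event(16 * MINUTE, true, 42));   // two departures from cell 42
        events.add(new Event(17 * MINUTE, true, 42));
        events.add(new Event(18 * MINUTE, false, 42));  // one arrival in cell 42
        for (Map.Entry<String, Long> entry : countPerGroup(events).entrySet()) {
            System.out.println(entry.getKey() + " -> " + entry.getValue());
        }
    }
}
```

Each event falls into three overlapping 15-minute windows, and departures and arrivals for the same cell are counted under different keys. The outer WHERE popCnt > 20 then filters these per-group counts; nothing in the query pairs a start event with its end event.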