Hi,
We have a requirement to aggregate data every 10 minutes and write the aggregated results to Elasticsearch ONCE per window. Right now, we iterate over the Iterable in the window function to count the different status codes. Is there a better way to count them?

    public void apply(TimeWindow timeWindow,
                      Iterable<Tuple4<String, Long, String, String>> iterable,
                      Collector<Tuple4<String, Long, String, String>> collector) throws Exception {
        long[] counts = new long[10];
        Arrays.fill(counts, 0L);
        // count the different types of records in a window
        for (Tuple4<String, Long, String, String> in : iterable) {
            counts[0]++;
            // f2 holds the status code
            if (in.f2 != null && in.f2.startsWith("5")) counts[1]++;
            else if (in.f2 != null && in.f2.startsWith("4")) counts[2]++;
            else if (in.f2 != null && in.f2.startsWith("2")) counts[3]++;
            // f3 holds the HTTP method
            if (in.f3 != null && in.f3.equalsIgnoreCase("GET")) counts[4]++;
            else if (in.f3 != null && in.f3.equalsIgnoreCase("POST")) counts[5]++;
            else if (in.f3 != null && in.f3.equalsIgnoreCase("PUT")) counts[6]++;
            else if (in.f3 != null && in.f3.equalsIgnoreCase("HEAD")) counts[7]++;
        }
        ...
    }
Hi Raj,

I would recommend using a ReduceFunction instead of a WindowFunction. The benefit of a ReduceFunction is that it can be computed eagerly whenever an element is put into the window, so the state of the window is only a single element. In contrast, a WindowFunction collects all elements of a window in state and is only applied when the window is closed [1].

[1] https://ci.apache.org/projects/flink/flink-docs-release-1.3/dev/windows.html#reducefunction
[2] https://ci.apache.org/projects/flink/flink-docs-release-1.3/dev/table/tableApi.html#group-windows

2017-07-24 2:47 GMT+02:00 Raj Kumar <[hidden email]>:
> Hi,
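To make the eager-aggregation idea concrete, here is a minimal plain-Java sketch; it is not code from this thread and does not depend on Flink. It assumes each record is first turned into a small count vector (the hypothetical `of` helper), so that combining two vectors is an element-wise sum (the hypothetical `merge` helper). In a Flink job, the body of `merge` is what a `ReduceFunction<long[]>` would do in its `reduce(value1, value2)` method, and `of` would run in a preceding map step. The class name and slot constants are made up for illustration; the slot order follows the counts array in the question.

```java
import java.util.Arrays;

/** Plain-Java sketch of reduce-style counting; names here are illustrative only. */
class StatusCounts {
    // Slot layout mirrors the counts[] array used in the question.
    static final int TOTAL = 0, S5XX = 1, S4XX = 2, S2XX = 3,
                     GET = 4, POST = 5, PUT = 6, HEAD = 7, SIZE = 8;

    /** Turns one record (status code, HTTP method) into a one-record count vector. */
    static long[] of(String status, String method) {
        long[] c = new long[SIZE];
        c[TOTAL] = 1;
        if (status != null) {
            if (status.startsWith("5")) c[S5XX] = 1;
            else if (status.startsWith("4")) c[S4XX] = 1;
            else if (status.startsWith("2")) c[S2XX] = 1;
        }
        if (method != null) {
            if (method.equalsIgnoreCase("GET")) c[GET] = 1;
            else if (method.equalsIgnoreCase("POST")) c[POST] = 1;
            else if (method.equalsIgnoreCase("PUT")) c[PUT] = 1;
            else if (method.equalsIgnoreCase("HEAD")) c[HEAD] = 1;
        }
        return c;
    }

    /** Element-wise sum of two count vectors; associative, so it can be applied eagerly. */
    static long[] merge(long[] a, long[] b) {
        long[] out = new long[SIZE];
        for (int i = 0; i < SIZE; i++) {
            out[i] = a[i] + b[i];
        }
        return out;
    }

    public static void main(String[] args) {
        // Only one accumulator is kept, no matter how many records arrive.
        long[] acc = of("200", "GET");
        acc = merge(acc, of("503", "post"));
        acc = merge(acc, of("404", "GET"));
        System.out.println(Arrays.toString(acc)); // prints [3, 1, 1, 1, 2, 1, 0, 0]
    }
}
```

Because only the running vector is held, the per-window state stays at a fixed size instead of growing with the number of records.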
Thanks Fabian. That helped.
However, I also need access to the window start time. AFAIK, reduce cannot provide this, since no TimeWindow object is passed to the reduce method. How can I achieve this?
Hi Raj,

You can use a ReduceFunction in combination with a WindowFunction [1].

Best, Fabian

2017-07-24 20:31 GMT+02:00 Raj Kumar <[hidden email]>:
> Thanks Fabian. That helped.
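The combination can be pictured with a small plain-Java model; this is a sketch of the idea only, not Flink code. It assumes 10-minute tumbling windows with no offset and non-negative epoch-millisecond timestamps. The class `WindowedReduceSketch` and its methods are hypothetical: `add` plays the role of the eager reduce step, and `fire` plays the role of the window function, which sees only the single pre-aggregated value together with the window metadata. In Flink 1.3 the corresponding call is, as far as I know, `reduce(reduceFunction, windowFunction)` on the windowed stream, with the start time read from `TimeWindow.getStart()` inside the window function.

```java
import java.util.Map;
import java.util.TreeMap;

/** Plain-Java model of "incremental reduce + window function"; illustrative names only. */
class WindowedReduceSketch {
    static final long WINDOW_MS = 10 * 60 * 1000L; // 10-minute tumbling windows

    // One running total per window start: the only state kept per window.
    private final Map<Long, Long> totals = new TreeMap<>();

    /** Start of the tumbling window containing the timestamp (assumes timestampMs >= 0). */
    static long windowStart(long timestampMs) {
        return timestampMs - (timestampMs % WINDOW_MS);
    }

    /** Reduce step: folds the new value into its window's accumulator on arrival. */
    void add(long timestampMs, long value) {
        totals.merge(windowStart(timestampMs), value, Long::sum);
    }

    /** Window-function step: emits the window start next to the pre-aggregated value. */
    String fire(long windowStart) {
        Long total = totals.remove(windowStart);
        return "windowStart=" + windowStart + " count=" + (total == null ? 0 : total);
    }

    public static void main(String[] args) {
        WindowedReduceSketch sketch = new WindowedReduceSketch();
        sketch.add(0L, 1);
        sketch.add(599_999L, 1);
        sketch.add(600_000L, 1);
        System.out.println(sketch.fire(0L));       // prints windowStart=0 count=2
        System.out.println(sketch.fire(600_000L)); // prints windowStart=600000 count=1
    }
}
```

The point of the split is that the expensive part (counting) happens per element, while the part that needs the window object runs once per window on a single value.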