Hi,
I have been using Flink 1.1.1 with Kafka 0.9 to process real-time streams. We have written our own window function and process the data with sliding windows. We use event time with a custom watermark generator.

We select one particular window out of the multiple sliding windows and iterate over all items in it to increment counts for the selected items. After this we call the sink to log the result to a database.

This works most of the time, i.e. it produces the expected result. However, under the same test conditions there are runs in which certain windows are not processed, so the counts are lower than expected. In other runs, instead of whole windows being skipped, individual items within a window are not processed. This is unpredictable.

What could be the cause of this inconsistent behaviour, and how can we resolve it? We have since moved to Flink 1.2 and Kafka 0.10, and the problem persists.

Many thanks.

Sujit Sakre

This email is sent on behalf of Northgate Public Services (UK) Limited and its associated companies including Rave Technologies (India) Pvt Limited (together "Northgate Public Services") and is strictly confidential and intended solely for the addressee(s). If you are not the intended recipient of this email you must: (i) not disclose, copy or distribute its contents to any other person nor use its contents in any way or you may be acting unlawfully; (ii) contact Northgate Public Services immediately on +44(0)1908 264500 quoting the name of the sender and the addressee then delete it from your system. Northgate Public Services has taken reasonable precautions to ensure that no viruses are contained in this email, but does not accept any responsibility once this email has been transmitted. You should scan attachments (if any) for viruses.
Northgate Public Services (UK) Limited, registered in England and Wales under number 00968498 with a registered address of Peoplebuilding 2, Peoplebuilding Estate, Maylands Avenue, Hemel Hempstead, Hertfordshire, HP2 4NN. Rave Technologies (India) Pvt Limited, registered in India under number 117068 with a registered address of 2nd Floor, Ballard House, Adi Marzban Marg, Ballard Estate, Mumbai, Maharashtra, India, 400001.
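
For context on the setup described in the question, here is a minimal plain-Java sketch of how one event-time timestamp maps to several overlapping sliding windows. It does not use Flink and is not taken from the poster's job; the class name `SlidingWindowSketch`, the method `windowStarts`, and the 10 s / 5 s window parameters are assumptions chosen for illustration. It only shows the usual "align to the slide, then step backwards" arithmetic.

```java
import java.util.ArrayList;
import java.util.List;

public class SlidingWindowSketch {

    /**
     * Returns the start timestamps of every sliding window [start, start + size)
     * that contains the given event timestamp (all values in milliseconds).
     */
    static List<Long> windowStarts(long ts, long size, long slide) {
        List<Long> starts = new ArrayList<>();
        // Start of the most recent window that begins at or before ts.
        long lastStart = ts - Math.floorMod(ts, slide);
        // Step back one slide at a time while the window still covers ts.
        for (long start = lastStart; start > ts - size; start -= slide) {
            starts.add(start);
        }
        return starts;
    }

    public static void main(String[] args) {
        // 10 s windows sliding every 5 s: an event at 12 s belongs to
        // the windows starting at 10 s and at 5 s.
        System.out.println(windowStarts(12_000L, 10_000L, 5_000L));
    }
}
```

Because every element belongs to size/slide windows, a single element that is dropped can lower the count in more than one window.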
|
Hi Sujit,
this does indeed sound strange and we are not aware of any data loss issues. Are there any exceptions or other errors in the job/taskmanager logs?

Do you have a minimal working example? Is it that whole windows are not processed or just single items inside a window?

Nico

On Tuesday, 14 February 2017 16:57:31 CET Sujit Sakre wrote:
> Hi,
>
> I have been using Flink 1.1.1 with Kafka 0.9 to process real time streams.
> [...]
|
Hi Nico,

Thanks for the reply.

There are no exceptions or other errors in the job/task manager logs. I am running this example from the Eclipse IDE, with Kafka and Zookeeper running separately; the console shows no errors during processing. Previously we were missing some windows because the watermark had advanced past the timestamps of upcoming data; however, that is no longer the case.

There is a working example. I will share the code and data with you separately by email.

The results of processing are of three types:
1) The complete set of results is obtained, without any missing calculations
2) A few windows (from 1 or 2 to more) are missing uniformly from the calculations
3) In a particular window, only some of the data is missing, whereas the other data is processed accurately

These results are for the same set of inputs under the same processing steps.

This is not predictable and makes the error difficult to identify: sometimes we get results of pattern #1, and sometimes of pattern #2 or #3 (#3 occurs less frequently, but it does occur).

Sujit Sakre
Senior Technical Architect
Ext: 247
Direct: 6740 5247
Mobile: +91 98672 01204
On 14 February 2017 at 18:55, Nico Kruber <[hidden email]> wrote:
> Hi Sujit,
> [...]
|
Hmm, without any exceptions in the logs, I'd say that you may be on the right track with elements arriving with timestamps older than the last watermark. You may play around with allowed lateness
https://ci.apache.org/projects/flink/flink-docs-release-1.2/dev/windows.html#allowed-lateness
to see if this is actually the case.

Nico

On Tuesday, 14 February 2017 19:39:46 CET Sujit Sakre wrote:
> Hi Nico,
>
> Thanks for the reply.
> [...]
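
To illustrate the suggestion above about elements older than the last watermark, here is a small plain-Java simulation. It is a simplified model under stated assumptions, not Flink code and not the poster's job: the class `LatenessSketch`, the method `countPerWindow`, the sample timestamps, and a watermark that trails the highest timestamp seen by a fixed bound and advances after every element are all hypothetical. The thread does not show the custom watermark generator, so the real one may behave differently. The sketch shows that the same four events can yield different counts depending on arrival order, and that an allowed-lateness setting can restore the missing counts.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class LatenessSketch {

    /**
     * Counts elements per sliding window (keyed by window start), dropping an
     * element for any window the watermark has already passed by more than
     * the allowed lateness. Elements are processed in arrival order.
     */
    static Map<Long, Integer> countPerWindow(List<Long> arrivals, long size, long slide,
                                             long maxOutOfOrderness, long allowedLateness) {
        Map<Long, Integer> counts = new TreeMap<>();
        long watermark = Long.MIN_VALUE;
        for (long ts : arrivals) {
            long lastStart = ts - Math.floorMod(ts, slide);
            for (long start = lastStart; start > ts - size; start -= slide) {
                long maxTimestamp = start + size - 1; // last timestamp inside the window
                boolean windowIsLate = maxTimestamp + allowedLateness <= watermark;
                if (!windowIsLate) {
                    counts.merge(start, 1, Integer::sum);
                }
            }
            // Assumed watermark model: highest timestamp seen minus a fixed bound.
            watermark = Math.max(watermark, ts - maxOutOfOrderness);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<Long> inOrder = List.of(11L, 13L, 14L, 22L);
        List<Long> reordered = List.of(11L, 13L, 22L, 14L); // 14 arrives after 22

        // Window size 10, slide 5, out-of-orderness bound 2.
        System.out.println(countPerWindow(inOrder, 10, 5, 2, 0));    // {5=3, 10=3, 15=1, 20=1}
        System.out.println(countPerWindow(reordered, 10, 5, 2, 0));  // {5=2, 10=2, 15=1, 20=1}
        System.out.println(countPerWindow(reordered, 10, 5, 2, 10)); // {5=3, 10=3, 15=1, 20=1}
    }
}
```

In the reordered run without allowed lateness, the element with timestamp 14 arrives after the watermark has reached 20 and is missing from both windows it belongs to. If arrival order varies between runs (for example when reading from several Kafka partitions), this could produce the kind of run-to-run differences described in the thread, which is why experimenting with allowed lateness is a useful check.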