Hi,
I don't see much discussion of anomaly detection using Flink. We are working on a project where we need to monitor server logs in real time. If there is a sudden, unusual spike in the number of transactions or server errors, we need to raise an alert.

1. How can we best achieve this?
2. How do we store historical information about the observed patterns and compute the baseline? Do we need an external store such as Elasticsearch to hold the window snapshots used to build the baseline?
3. The baseline should be self-learning: as new patterns are discovered, it should adjust accordingly.
4. Does Flink ML have any capabilities to achieve this?

Please let me know if you have any approaches or suggestions.
Raj -
I'm looking for the same thing. Since the ML library doesn't support the DataStream API, I'm tossing ideas around, maybe using a windowing function to build up a model that changes over time.

Jeremy D. Branham
Technology Architect - Sprint

-----Original Message-----
From: Raj Kumar [mailto:[hidden email]]
Sent: Thursday, July 20, 2017 4:24 PM
To: [hidden email]
Subject: Flink Anomaly Detection
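To make the windowing idea in the reply above a bit more concrete, here is a minimal sketch of a baseline that adjusts itself as new per-window counts arrive. It is plain Python rather than Flink code, and everything in it is an assumption for illustration: the class name `EwmaBaseline`, the exponentially weighted mean/variance technique, and the `alpha`, `threshold`, and `warmup` parameters do not come from the thread. In a Flink job, the same arithmetic could presumably live in keyed state and be fed by the output of a window aggregation.

```python
import math


class EwmaBaseline:
    """Self-adjusting baseline over per-window counts (hypothetical helper).

    Keeps an exponentially weighted mean and variance, so older windows
    gradually lose influence and the baseline follows new patterns.
    """

    def __init__(self, alpha=0.1, threshold=3.0, warmup=5):
        self.alpha = alpha          # weight given to the newest window
        self.threshold = threshold  # how many deviations count as a spike
        self.warmup = warmup        # windows to observe before alerting
        self.mean = 0.0
        self.var = 0.0
        self.seen = 0

    def update(self, count):
        """Return True if `count` looks anomalous, then fold it into the baseline."""
        anomalous = False
        if self.seen >= self.warmup:
            std = math.sqrt(self.var)
            # Floor the deviation at 1.0 so a very flat history
            # does not turn tiny fluctuations into alerts.
            anomalous = abs(count - self.mean) > self.threshold * max(std, 1.0)

        if self.seen == 0:
            self.mean = float(count)
        else:
            diff = count - self.mean
            incr = self.alpha * diff
            self.mean += incr
            self.var = (1 - self.alpha) * (self.var + diff * incr)
        self.seen += 1
        return anomalous


baseline = EwmaBaseline()
counts = [100, 102, 98, 101, 99, 100, 500]  # made-up transaction counts per window
flags = [baseline.update(c) for c in counts]
```

Note that this sketch folds anomalous windows into the baseline too, which lets it adapt to a lasting change in load but also means a long burst will eventually look normal. Whether that is acceptable depends on the use case.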
FWIW, we have built a similar log aggregator internally using Apache NiFi + the KFC stack (KFC = Kafka, Flink, Cassandra):

Apache NiFi ingests logs from OpenStack via rsyslog and writes them to Kafka topics -> Flink Streaming + CEP detects anomalous patterns -> the patterns are persisted with relevant metadata to Cassandra -> dashboard or search engine.

We use Flink CEP to detect patterns in server logs and to flag alerts on a dashboard. You can check out the implementation here: https://github.com/keedio/openstack-log-processor

On Thu, Jul 20, 2017 at 5:47 PM, Branham, Jeremy [IT] <[hidden email]> wrote:
Raj -
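As a rough illustration of the kind of rule a CEP step like the one described above might express, the sketch below flags a burst of `ERROR` events inside a time window. It is plain Python, not the Flink CEP API and not the code from the linked repository. The `ErrorBurstRule` class, the `max_errors` and `within_seconds` parameters, and the event shape (timestamp in seconds plus a log level) are all assumptions made for the example, and it assumes events arrive in timestamp order.

```python
from collections import deque


class ErrorBurstRule:
    """Fires when at least `max_errors` ERROR events fall within `within_seconds`.

    Hypothetical stand-in for a CEP pattern; assumes in-order timestamps.
    """

    def __init__(self, max_errors=3, within_seconds=60):
        self.max_errors = max_errors
        self.within_seconds = within_seconds
        self.times = deque()  # timestamps of recent ERROR events

    def on_event(self, timestamp, level):
        """Return True if this event completes an error burst."""
        if level != "ERROR":
            return False
        self.times.append(timestamp)
        # Drop errors that are older than the window relative to this event.
        while timestamp - self.times[0] > self.within_seconds:
            self.times.popleft()
        return len(self.times) >= self.max_errors


rule = ErrorBurstRule(max_errors=3, within_seconds=60)
events = [  # made-up log events: (seconds, level)
    (0, "ERROR"), (10, "INFO"), (20, "ERROR"),
    (100, "ERROR"), (110, "ERROR"), (120, "ERROR"),
]
alerts = [rule.on_event(ts, level) for ts, level in events]
```

Only the last event raises an alert here, because the two early errors have aged out of the 60-second window by the time the later three arrive.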