Hi,
I'm looking into setting up monitoring for our (Flink) environment and realized that both Kafka and Cassandra use the Yammer metrics library. This library enables the direct export of all metrics to Graphite (and possibly statsd). Does Flink use Yammer metrics? Cheers, Sanne
|
Flink currently doesn't expose any metrics beyond those shown in the Dashboard.
I am currently working on integrating a new metrics system that is partly based on Yammer/Codahale/Dropwizard metrics. For the first version, the plan is to export metrics only via JMX, since this effectively covers all use cases with the help of JMXTrans and similar tools. Reporting to specific systems is something we want to add as well, though. Regards, Chesnay Schepler On 08.04.2016 09:32, Sanne de Roever wrote:
|
Thanks, Chesnay. Might I make a tentative case for Yammer? I'm not an expert, but I am currently trying to pull together information on this and was reviewing jmxtrans. This is all tentative; I only dived in a few days ago. Please find the information below. Using Yammer, it is possible to load an MBean that queries all metrics and exports them en masse to a target destination, statsd or Graphite for example, but any destination will do. The gist is that the metrics do not have to be queried one by one, and that the export can be arranged by loading an extra jar. An example of this setup can be seen at: https://github.com/airbnb/kafka-statsd-metrics2 Put the MBean jar in the classpath, add a config file, and presto. This would work for Kafka and Cassandra, and I would not need a separate JMXtrans server. On Fri, Apr 8, 2016 at 11:01 AM, Chesnay Schepler <[hidden email]> wrote:
|
I forgot to add some extra information, still all tentative. Earlier (erm, 14 years ago to be honest), I also worked on a JMX monitoring system, and it turned out to be a pain to identify all the components, write the correct JMX queries, and then plot everything. It was hard. This presentation echoes that sentiment and proposes the Yammer route: http://www.slideshare.net/NaderGan/cassandra-jmxexpresshow-do-we-monitor-cassandra-using-graphite-leveraging-yammer-codahale-library On Fri, Apr 8, 2016 at 12:12 PM, Sanne de Roever <[hidden email]> wrote:
|
I'm very much aware of how Yammer works.
As the slides you linked show (near the end), there are several small issues with the general-purpose reporters offered by Yammer. Instead of hacking around those issues, I would very much prefer creating our own reporters that are, again, based on the Yammer reporters, as the concept is sound, but that can properly interact with the rest of the system. Long-term, this will create cleaner code, make debugging easier, and allow us to adjust things as required at any time. On 08.04.2016 12:39, Sanne de Roever wrote:
|
Note that we could still expose the option of using the Yammer reporters; there isn't any technical limitation as of now that would prohibit that.
On 08.04.2016 13:05, Chesnay Schepler wrote:
|
OK, sounds nice. I didn't mean to impose; my train of thought was a bit murky, my apologies for that. Although I had prior experience with JMX, and none with Yammer, at this moment it is all new again. Cleaning up my thinking a bit, the following picture, not directly Flink-related, comes up. For production usage, a complete export of metrics makes a lot of sense, assuming that each individual metric is sound. In practice, one would start by graphing/monitoring general metrics. In case of an incident, additional metrics stored in a system could help to explain system behavior and provide valuable indicators for tuning or for things to come in the future. This would lead to graphing/monitoring additional metrics. Cheers. On Fri, Apr 8, 2016 at 1:15 PM, Chesnay Schepler <[hidden email]> wrote:
|