
Adding metrics #53

Open
pshemass opened this issue Aug 16, 2019 · 12 comments

@pshemass
Contributor

pshemass commented Aug 16, 2019

One of the most important parts of a distributed system is observability, which is why we should build our library with that mindset.

We should probably try to use https://github.com/zio/zio-metrics; if it is not ready, we should join them and help get it released.

@jdegoes @mijicd Let me know if you see other options.

@mijicd
Member

mijicd commented Aug 17, 2019

cc @toxicafunk

@toxicafunk
Member

I just migrated zio-metrics to the new org and successfully built it on CircleCI:

https://circleci.com/gh/zio/zio-metrics

It should be ready, minus any bugs that may be encountered during usage. I am currently writing an implementation that creates a ZEnv for Prometheus and one for Dropwizard, but the current version should be usable.

I will ask on the channel how to publish it, and I'm more than willing to work on integrating it here.

@pshemass
Contributor Author

@toxicafunk do you need any help? Do you plan to add tracing?

@mijicd
Member

mijicd commented Aug 19, 2019

@toxicafunk let us know if you need any help.

@pshemass I'm not sure about tracing. There are plenty of options for OpenTracing out there, and it's relatively simple to "bake" another, ZIO-compatible client. However, OpenTelemetry looks more interesting to me, and I think it's worth jumping on that train.
cc @jdegoes

However, at the moment I don't think we should broaden the scope too much. Let's come up with the set of metrics we want to expose, and see whether zio-metrics fits our needs right now.

@pshemass
Contributor Author

pshemass commented Aug 19, 2019

No, we should not blow up the scope now; I'm just curious.

@toxicafunk
Member

  1. Thanks to @mijicd, zio-metrics is now published on Sonatype.

  2. I made a small test. I have an implementation for Dropwizard as a backend and another for Prometheus. The one for Dropwizard is more stable and appears to work as expected; I seem to have some bugs in the Prometheus backend.

  3. The idea for zio-metrics is to have one common API and then implement it for different backends (see the sketch after this list). Since the original API was mostly based on Dropwizard, it makes sense that this one is mostly ready. I could use some help fixing the Prometheus one, noting that this may imply changes to the original API. @pshemass I wouldn't mind some help in this respect, especially if you have experience with Prometheus.

  4. To avoid scope creep we can ignore tracing for zio-keeper, but I will add it as an issue in zio-metrics so it's something we can look at in parallel without affecting zio-keeper. @mijicd @pshemass wdyt?

  5. I will share the tests with you shortly. It's an app that reads a file where each line is a JSON message; it extracts the ID and uses it as the key to send a Kafka message. It's just something I use at work and was the most convenient thing I had for testing this.
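
To make point 3 concrete, here is a rough sketch of that shape; the names and signatures below are illustrative only, not the actual zio-metrics API:

import zio.Task

// Illustrative label type, matching usages like
// Label("kafka_sent_messages", Array("zenv")) later in this thread.
final case class Label(name: String, labels: Array[String])

// One backend-agnostic API: registering a metric yields a function
// that records values as a ZIO effect.
trait Metrics {
  def counter(label: Label): Task[Long => Task[Unit]]
  def timer(label: Label): Task[() => Task[Unit]]
}

// Each backend (Dropwizard, Prometheus, ...) then ships its own Metrics
// implementation backed by that backend's registry, so user code such as
// cnt <- metrics.counter(...) stays the same across backends.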

@mijicd
Member

mijicd commented Aug 19, 2019

@toxicafunk I agree with point 4. Also, we should take a look at zio-metrics together; maybe we could have different modules (e.g. Prometheus and Dropwizard backends, OpenTracing, etc.) and parallelize the work on them.

edit: Not sure whether it makes sense, just a wild guess :)

@toxicafunk
Member

toxicafunk commented Aug 19, 2019

So you can see my test for Prometheus here:

https://github.com/toxicafunk/zio-tests/blob/prometheus/src/main/scala/com/richweb/Main.scala

In the main function you'll see I define the backend implementation:

val metrics = new PrometheusMetrics()

and since my own reporters are... "funny"... I just use the HTTP server included in Prometheus:

val server = new HTTPServer(new InetSocketAddress(1234), metrics.registry);

which produces the following output, given the counter

cnt <- metrics.counter(Label("kafka_sent_messages", Array("zenv")))

and the timer I defined:

tmr <- metrics.timer(Label("simple_timer", Array("test", "timer")))

# HELP kafka_sent_messages kafka_sent_messages counter
# TYPE kafka_sent_messages counter
kafka_sent_messages{zenv="zenv",} 100000.0
# HELP simple_timer simple_timer timer
# TYPE simple_timer summary
simple_timer_count{test="test",timer="timer",} 100000.0
simple_timer_sum{test="test",timer="timer",} 2456428.277020058

The count is correct, since my file has 100k messages, and averaging (sum/count) shows that processing took about 24.56 ms per message.
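
For reference, here is roughly how those Prometheus pieces fit together in one place, with the imports they need. This is only a sketch against the zio-metrics snapshot discussed in this thread; the zio.metrics package paths in the imports are my assumption, everything else comes from the snippets above.

import java.net.InetSocketAddress
import io.prometheus.client.exporter.HTTPServer
import zio.metrics.Label                         // package path is an assumption
import zio.metrics.prometheus.PrometheusMetrics  // package path is an assumption

val metrics = new PrometheusMetrics()

// Expose the registry over HTTP on port 1234 so Prometheus can scrape it.
val server = new HTTPServer(new InetSocketAddress(1234), metrics.registry)

val program = for {
  // Registering a counter and a timer returns recording functions;
  // the Label becomes the metric name plus its label names.
  cnt <- metrics.counter(Label("kafka_sent_messages", Array("zenv")))
  tmr <- metrics.timer(Label("simple_timer", Array("test", "timer")))
  // Per processed message: bump the counter and record a timer observation.
  _   <- cnt(1) *> tmr()
} yield ()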

For Dropwizard (https://github.com/toxicafunk/zio-tests/blob/dropwizard/src/main/scala/com/richweb/Main.scala) I just define its backend:

val metrics = new DropwizardMetrics()

The easiest reporter to set up is the ConsoleReporter:

val reporter = ConsoleReporter.forRegistry(metrics.registry)
	.convertRatesTo(TimeUnit.SECONDS)
	.convertDurationsTo(TimeUnit.MILLISECONDS)
	.build()

reporter.start(20, TimeUnit.SECONDS);
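
For completeness, the imports that wiring needs are roughly the following (the zio.metrics package path is an assumption; ConsoleReporter and TimeUnit are the usual Dropwizard and JDK classes). The reporter.start(20, TimeUnit.SECONDS) call dumps all registered metrics to stdout every 20 seconds.

import java.util.concurrent.TimeUnit
import com.codahale.metrics.ConsoleReporter
import zio.metrics.dropwizard.DropwizardMetrics  // package path is an assumption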

You may observe that defining the counter and timer (or their usage) doesn't change:

cnt <- metrics.counter(Label("kafka_sent_messages", Array("zenv")))
tmr <- metrics.timer(Label("simple_timer", Array("test", "timer")))

which produces the following output:

[info] Completed in 27355 ms
[info] 8/20/19 1:32:24 AM =============================================================
[info] -- Counters --------------------------------------------------------------------
[info] kafka_sent_messages.zenv
[info]              count = 100000
[info] -- Timers ----------------------------------------------------------------------
[info] simple_timer.test.timer
[info]              count = 100000
[info]          mean rate = 2502.84 calls/second
[info]      1-minute rate = 1857.60 calls/second
[info]      5-minute rate = 1293.06 calls/second
[info]     15-minute rate = 1167.15 calls/second
[info]                min = 1570.23 milliseconds
[info]                max = 27192.09 milliseconds
[info]               mean = 18209.35 milliseconds
[info]             stddev = 6573.32 milliseconds
[info]             median = 19658.25 milliseconds
[info]               75% <= 23925.28 milliseconds
[info]               95% <= 26495.53 milliseconds
[info]               98% <= 26861.20 milliseconds
[info]               99% <= 27068.98 milliseconds
[info]             99.9% <= 27192.09 milliseconds

So the library is usable, but it does need some care and love 😄

Any comments?

@toxicafunk
Member

@pshemass zio/zio-metrics-legacy#8

@pshemass
Contributor Author

I have an opportunity to test it at work because we have a Kamon setup with Prometheus.

The API is usable but it needs some love :) Is there a zio-metrics Gitter channel where we could discuss this?

@toxicafunk
Member

There is now: https://gitter.im/ZIO/zio-metrics

@toxicafunk
Member

Just wanted to add that there is also a histogramTimer in Prometheus that is easier to use than the regular timer:

tmr <- metrics.histogramTimer(Label("simple_timer", Array.empty[String]))
...
  .mapMParUnordered(120)(
    l => messenger.send(prd, idL.getOption(toJson(l)).getOrElse("UND"), l)
  )
  .tap(md => cnt(1) *> tmr() *> putStrLn(md.toString()))
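
To put that in context, here is a stripped-down sketch of such a pipeline, reusing the metrics and Label values from the Prometheus sketch above and targeting ZIO 1.x. The stream contents and the process function are hypothetical stand-ins for the real JSON-parsing and Kafka-sending code; the parallelism of 120 and the tap come from the snippet above.

import zio._
import zio.console.putStrLn
import zio.stream.ZStream

// Hypothetical stand-in for the real per-message work (JSON parsing, Kafka send, ...).
def process(line: String): Task[String] = Task.succeed(line)

val pipeline = for {
  cnt <- metrics.counter(Label("kafka_sent_messages", Array("zenv")))
  tmr <- metrics.histogramTimer(Label("simple_timer", Array.empty[String]))
  _   <- ZStream
           .fromIterable(List("""{"id":"1"}""", """{"id":"2"}"""))
           .mapMParUnordered(120)(process)
           // Per element: bump the counter, record the timer, print the element.
           .tap(md => cnt(1) *> tmr() *> putStrLn(md))
           .runDrain
} yield ()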

mijicd added this to the 0.2.0 milestone Nov 5, 2019
mijicd modified the milestone: 0.2.0 Dec 13, 2019