Releases: grafana/carbon-relay-ng
Release O'Time
various network setting fixes 351f807, #243
make windows builds work #245
Add tmpfiles.d config for centos/7 #235
Use useradd to support multiple distros #233
support variable substitution in instance, and default to $HOST #236
fix blacklist #237
add support for specifying explicit prefixFilter and/or substringFilter in aggregations, which can help perf a lot. #239
show target address for GrafanaNet routes #244
massive aggregator improvements.
aggregators
- massive aggregation improvements. they now run faster (sometimes by 20x), use less memory and can handle much more load. see #227, #230
- add derive (rate) aggregator #230
- aggregator regex cache, which lowers cpu usage and increases the max workload, in exchange for a bit more ram usage. #227. By default, the cache is enabled for aggregators set up via commands (init commands in the config) but disabled for aggregators configured via config sections (due to a limitation in our config library).
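The cpu-for-ram tradeoff of a regex cache can be sketched as follows: remember the regex result per metric key, so that repeated keys skip the regex entirely. This is a minimal sketch under assumptions, not the actual cache from #227; `cachedMatcher` and its fields are hypothetical, and the map here grows without bound, which a real cache would need to address.

```go
package main

import (
	"fmt"
	"regexp"
)

// cachedMatcher is a hypothetical type that stores one regex result per key.
type cachedMatcher struct {
	re    *regexp.Regexp
	cache map[string]bool // key -> did the regex match
	evals int             // number of real regex evaluations performed
}

func newCachedMatcher(expr string) *cachedMatcher {
	return &cachedMatcher{re: regexp.MustCompile(expr), cache: make(map[string]bool)}
}

// match returns the cached result when available and evaluates the regex otherwise.
func (c *cachedMatcher) match(key string) bool {
	if res, ok := c.cache[key]; ok {
		return res
	}
	c.evals++
	res := c.re.MatchString(key)
	c.cache[key] = res
	return res
}

func main() {
	m := newCachedMatcher(`^servers\.[^.]+\.cpu\.user$`)
	for i := 0; i < 3; i++ {
		m.match("servers.web1.cpu.user")
	}
	fmt.Println("regex evaluations:", m.evals, "cached keys:", len(m.cache))
}
```

Since metric keys tend to repeat every interval, most lookups hit the map; the cost is one map entry per distinct key seen.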
docs
- add dashboard explanation screenshot #208
packaging and versioning
- fix logging on cent6/amzn/el6 #224 43b265d
- Fix the creation of /var/run/carbon-relay-ng directory #213
- version argument to get the version
other
- disable http2.0 support for grafanaNet route, since an incompatibility with nginx was resulting in bogus 400 Bad Request responses. note this fix does not properly work, use 0.9.3-1 instead.
- track allocated memory
non-blocking mode, better monitoring
- make blocking behavior configurable for kafkaMdm and grafanaNet routes. previously, grafanaNet was blocking, kafkaMdm was non-blocking. Now both default to non-blocking, but you can specify blocking=true.
  - nonblocking (default): when the route's buffer fills up, data will be discarded for that route, but everything else (e.g. other routes) will be unaffected. If you set your buffers large enough this won't be an issue. rule of thumb: rate in metrics/s times how many seconds you want to be able to buffer in case of downstream issues. memory used will be bufSize * 100B, or use bufSize * 150B to be extra safe.
  - blocking: when the route's buffer fills up, ingestion into the route will slow down/block, providing backpressure to the clients, and also blocking other routes from making progress. use this only if you know what you're doing and have smart clients that can gracefully handle the backpressure.
- monitor queue drops for non-blocking queues
- document route options better
- monitor queue size and ram used #218
- preliminary support for parsing out the new graphite tag format (kafkaMdm and grafanaNet route only)
the included dashboard is updated accordingly, and is also available at https://grafana.com/dashboards/338
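The difference between the blocking and non-blocking route modes, and the buffer sizing rule of thumb, can be sketched with a buffered Go channel. This is an illustration under assumptions rather than the relay's code: the `route` type, `dispatch`, `bufSizeFor` and `estimatedBytes` are hypothetical names, and the rate and duration in `main` are made-up inputs.

```go
package main

import "fmt"

// route is a hypothetical stand-in for a route with a bounded buffer.
type route struct {
	buf      chan string
	blocking bool
	dropped  int // metrics discarded because the buffer was full
}

func newRoute(bufSize int, blocking bool) *route {
	return &route{buf: make(chan string, bufSize), blocking: blocking}
}

// dispatch enqueues a metric. In blocking mode it waits for free space,
// which slows down the caller. In non-blocking mode it drops and counts.
func (r *route) dispatch(metric string) {
	if r.blocking {
		r.buf <- metric
		return
	}
	select {
	case r.buf <- metric:
	default:
		r.dropped++
	}
}

// bufSizeFor applies the rule of thumb: metrics/s times seconds to buffer.
func bufSizeFor(ratePerSec, seconds int) int {
	return ratePerSec * seconds
}

// estimatedBytes estimates memory use at bytesPerMetric per buffered metric
// (100 per the rule of thumb, 150 to be extra safe).
func estimatedBytes(bufSize, bytesPerMetric int) int {
	return bufSize * bytesPerMetric
}

func main() {
	// Nothing consumes from the buffer here, so it fills up after 2 metrics.
	r := newRoute(2, false)
	for i := 0; i < 5; i++ {
		r.dispatch(fmt.Sprintf("some.metric %d 1500000000", i))
	}
	fmt.Println("queued:", len(r.buf), "dropped:", r.dropped)

	size := bufSizeFor(50000, 60)
	fmt.Println("bufSize:", size, "bytes:", estimatedBytes(size, 100), "to", estimatedBytes(size, 150))
}
```

With these example inputs, 50000 metrics/s buffered for 60 seconds gives a bufSize of 3000000, or roughly 300 MB to 450 MB of memory. Calling `dispatch` on a full route created with `blocking` set to true would instead stall the caller until a consumer frees space.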
more tuneables for destinations
these new settings were previously hardcoded (to the values that are now the defaults):
connbuf=<int> connection buffer (how many metrics can be queued, not written into network conn). default 30k
iobuf=<int> buffered io connection buffer in bytes. default: 2M
spoolbuf=<int> num of metrics to buffer across disk-write stalls. practically, tune this to number of metrics in a second. default: 10000
spoolmaxbytesperfile=<int> max filesize for spool files. default: 200MiB (200 * 1024 * 1024)
spoolsyncevery=<int> sync spool to disk every this many metrics. default: 10000
spoolsyncperiod=<int> sync spool to disk every this many milliseconds. default 1000
spoolsleep=<int> sleep this many microseconds(!) in between ingests from bulkdata/redo buffers into spool. default 500
unspoolsleep=<int> sleep this many microseconds(!) in between reads from the spool, when replaying spooled data. default 10
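To make the defaults above concrete, here is a small sketch that parses key=value destination options and falls back to the listed defaults for anything not specified. The `destOpts` struct and `parseOpts` function are hypothetical and not part of carbon-relay-ng; the iobuf default of "2M" is taken as 2000000 bytes here, which is an assumption.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// destOpts mirrors the tuneables listed above. Hypothetical type.
type destOpts struct {
	ConnBuf              int // metrics
	IOBuf                int // bytes (assumption: "2M" means 2000000)
	SpoolBuf             int // metrics
	SpoolMaxBytesPerFile int // bytes
	SpoolSyncEvery       int // metrics
	SpoolSyncPeriod      int // milliseconds
	SpoolSleep           int // microseconds
	UnspoolSleep         int // microseconds
}

func defaultOpts() destOpts {
	return destOpts{
		ConnBuf:              30000,
		IOBuf:                2000000,
		SpoolBuf:             10000,
		SpoolMaxBytesPerFile: 200 * 1024 * 1024,
		SpoolSyncEvery:       10000,
		SpoolSyncPeriod:      1000,
		SpoolSleep:           500,
		UnspoolSleep:         10,
	}
}

// parseOpts overrides the defaults with space-separated key=<int> options.
func parseOpts(s string) (destOpts, error) {
	o := defaultOpts()
	fields := map[string]*int{
		"connbuf":              &o.ConnBuf,
		"iobuf":                &o.IOBuf,
		"spoolbuf":             &o.SpoolBuf,
		"spoolmaxbytesperfile": &o.SpoolMaxBytesPerFile,
		"spoolsyncevery":       &o.SpoolSyncEvery,
		"spoolsyncperiod":      &o.SpoolSyncPeriod,
		"spoolsleep":           &o.SpoolSleep,
		"unspoolsleep":         &o.UnspoolSleep,
	}
	for _, tok := range strings.Fields(s) {
		parts := strings.SplitN(tok, "=", 2)
		if len(parts) != 2 {
			return o, fmt.Errorf("invalid option %q", tok)
		}
		target, known := fields[parts[0]]
		if !known {
			return o, fmt.Errorf("unknown option %q", parts[0])
		}
		n, err := strconv.Atoi(parts[1])
		if err != nil {
			return o, fmt.Errorf("option %q: %v", parts[0], err)
		}
		*target = n
	}
	return o, nil
}

func main() {
	o, err := parseOpts("connbuf=50000 spoolsyncperiod=500")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%+v\n", o)
}
```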
some fixes (requires config change)
- unrouteable messages should be debug not notice #198
- rewrite before aggregator; fix race condition, sometimes incorrect aggregations and occasional panics #199
- config parsing fix #175
- kafka-mdm: support multiple brokers. fix #195
- bugfix: make init section work again. fix #201
attention: the init section must be changed from:
init = ...
to:
[init]
cmds = ...
note: this was previously released as 0.8.9 but the breaking config change warrants a major version bump.
3 new aggregators, better input plugins and packaging; and better config method
inputs
better pickle input #174
better amqp input options #168, update amqp library 3e87664
kafka input logging fixes #188
config
more proper config format so you don't have to use init commands. #183
aggregations
add last, delta and stdev aggregator #191, #194
packaging
Add tmpfiles.d config for deb package #179
prevent erasing of configs #181
add root certs to docker container for better grafanaCloud experience #180
better docs and stuff
minor release
multi-line amqp
see #165
amqp input, min/max aggregators, and more
- fix metrics initialisation (#150)
- update to new metrics2.0 format (mtype instead of target_type)
- refactor docker build process (#158)
- amqp input (#160)
- allow grafanaNet route to make concurrent connections (#153) to Grafana hosted metrics
- add min/max aggregators (#161)
- add kafka-mdm route for metrictank (#161)
v0.8: Growing up a little
- build packages for ubuntu, debian, centos and automatically push to circleCI upon successful builds (https://github.com/graphite-ng/carbon-relay-ng#installation)
- add pickle input (#140)
- publish dashboard on grafana.net
- fix build for go <1.6 (#118)
- allow overriding docker commandline (#124)
- validation: tuneable validation of metrics2.0 messages. switch default legacy validation to medium, strict was too strict for many. show validation level in UI. Add time-order validation on a per-key basis
- support rewriting of metrics
- document limitations
- grafana.net route updates and doc updates
- re-organize code into modules
- various small improvements
- remove seamless restart stuff. not sure anymore if it still worked, especially now that we have two listeners. we didn't maintain it well, and no-one seems to care
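The per-key time-order validation mentioned in the list above can be pictured as remembering the last accepted timestamp for every metric key and rejecting points that do not move forward. The sketch below is an assumption about the general technique, not the relay's validation code; `orderValidator` is a hypothetical type, and treating an equal timestamp as invalid is also an assumption.

```go
package main

import "fmt"

// orderValidator remembers the last accepted timestamp per key. Hypothetical type.
type orderValidator struct {
	lastTs map[string]int64
}

func newOrderValidator() *orderValidator {
	return &orderValidator{lastTs: make(map[string]int64)}
}

// valid reports whether ts is newer than the last accepted timestamp for key.
// Accepted timestamps are recorded; rejected ones leave the state untouched.
func (v *orderValidator) valid(key string, ts int64) bool {
	if prev, ok := v.lastTs[key]; ok && ts <= prev {
		return false
	}
	v.lastTs[key] = ts
	return true
}

func main() {
	v := newOrderValidator()
	fmt.Println(v.valid("servers.web1.cpu.user", 1500000010)) // first point for this key
	fmt.Println(v.valid("servers.web1.cpu.user", 1500000000)) // older than the previous point
	fmt.Println(v.valid("servers.web2.cpu.user", 1500000000)) // different key, tracked separately
}
```

Tracking state per key means memory use grows with the number of distinct keys seen.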
known issues:
there are some open tickets on github for various smaller issues, but one thing that has been impacting people a lot for a long time now is memory usage growing indefinitely when instrumentation is not configured (because it keeps accumulating internal metrics).
The solution is to configure graphite_addr to an address it can send metrics to (e.g. its own input address)
see ticket 50 for more info