Vincent Bernat [Wed, 10 Dec 2014 14:41:49 +0000 (15:41 +0100)]
write_kafka: check for partition availability before selecting one
When a partition is unavailable, sending to it will just lead to a lost
metric. Therefore, after selecting the partition, check if it is
available. If not, select the next one until we tried them all.
A future iteration may use consistent hashing to avoid doubling the
work done on a partition when the previous one is unavailable.
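The selection loop described above can be sketched as follows. This is a
minimal illustration, not the plugin's code: `select_available_partition`,
`partition_available_fn` and `available_from_table` are hypothetical names,
and the availability probe is a stand-in for whatever the Kafka client
library provides.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical availability probe. A real producer would ask its Kafka
 * client library; here the caller supplies the answer. */
typedef bool (*partition_available_fn)(int32_t partition, void *user_data);

/* Illustrative probe backed by a plain table of booleans. */
static bool available_from_table(int32_t partition, void *user_data)
{
    const bool *table = user_data;
    return table[partition];
}

/* Start at the preferred partition and walk forward (wrapping around)
 * until an available one is found. Returns -1 once all partitions have
 * been tried without success. */
static int32_t select_available_partition(int32_t preferred,
                                          int32_t partition_count,
                                          partition_available_fn is_available,
                                          void *user_data)
{
    if (partition_count <= 0)
        return -1;

    /* Bring the preferred partition into the range [0, partition_count). */
    int32_t start = preferred % partition_count;
    if (start < 0)
        start += partition_count;

    for (int32_t i = 0; i < partition_count; i++) {
        int32_t candidate = (start + i) % partition_count;
        if (is_available(candidate, user_data))
            return candidate;
    }
    return -1;
}
```

Note that with this linear probing, the partition following an unavailable
one receives both its own share and its neighbour's, which is the doubling
that consistent hashing would avoid.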
- The query is expected to be the block argument
- The type instance is inferred from the query if unsupplied
- The type will default to gauge if not supplied
Now that the redis plugin has moved to hiredis, it could
be worthwhile to add support for custom commands.
This diff implements a mechanism for executing commands which
allows for setting the type and type-instance. It does not
support hash or array returns, but support for these could be
added later on if deemed necessary.
The canonical use case for this is for people using redis
as a queue (for instance, with rq, sidekiq or similar
solutions) who want a simple way to ensure the work queue
size is not growing. To address this
you would use:
When reading from tables, upon errors the PDUs sent have already
been freed by snmp_synch_response, since they are freed right after
snmp_send is called.
This commit syncs collectd's approach with other occurrences of
snmp_synch_response calls.
There might be a few corner cases where we leak PDUs, but it
is unclear how to check for those since we would need to
have an indication that snmp_send was never called, which
as far as I can tell is not possible.
The potential for failure in snmp_send is rather low, though, and
any failure would be easy to spot: invalid PDUs make snmp_send
fail constantly, while valid configurations can never leak
memory.
Vincent Bernat [Mon, 17 Nov 2014 09:35:16 +0000 (10:35 +0100)]
libstatgrab: only use one configure test for 0.90 API change
Previously, each API change was tested in configure.ac. Some of the
tests rely on signature checks and would need the -Werror flag
enabled to work. This is quite fragile.
Instead, we assume that if `sg_init()` requires an argument, we must use
the 0.90 API.
Marc Fournier [Fri, 14 Nov 2014 21:04:16 +0000 (22:04 +0100)]
write_redis: avoid passing a float/double to redisCommand()
... as it seems to not be well supported by hiredis 0.10.1 on Debian
7.0, leading to a segfault. Storing the string representation in a
variable instead is the compromise I found to make the plugin work on
this system.
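A minimal sketch of that workaround, assuming a `%f` representation and a
hypothetical helper named `format_value` (neither is taken from the plugin
source): the double is rendered into a buffer first, and only the resulting
string is handed to `redisCommand()` through a `%s` placeholder.

```c
#include <stddef.h>
#include <stdio.h>

/* Render a double into buf as text. Returns 0 on success, -1 if the
 * buffer is too small or formatting fails. The "%f" format is an
 * assumption made for this sketch. */
static int format_value(char *buf, size_t buf_size, double value)
{
    int n = snprintf(buf, buf_size, "%f", value);
    if (n < 0 || (size_t)n >= buf_size)
        return -1;
    return 0;
}

/* The caller would then pass the string rather than the double, e.g.
 *   redisCommand(ctx, "ZADD %s %s %s", key, score_str, member_str);
 * where ctx, key, score_str and member_str are illustrative names. */
```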
Vincent Bernat [Thu, 13 Nov 2014 16:57:46 +0000 (17:57 +0100)]
libstatgrab: fix sg_get_disk_io_stats() invocation for libstatgrab >= 0.9
In those versions, `sg_get_disk_io_stats()` needs to be invoked with a
pointer to size_t instead of a pointer to int. Such a requirement is
detected at configure-time.
Vincent Bernat [Fri, 7 Nov 2014 14:20:22 +0000 (15:20 +0100)]
network: don't enable gcrypt thread callbacks when gcrypt recent enough
From `gcrypt.h`:
> NOTE: Since Libgcrypt 1.6 the thread callbacks are not anymore used.
> However we keep it to allow for some source code compatibility if used
> in the standard way.
Otherwise, we get a deprecation warning which is turned into an error:
```
CC libcollectdclient_la-network_buffer.lo
../../../src/libcollectdclient/network_buffer.c:58:15: error: 'gcry_thread_cbs' is deprecated (declared at /usr/include/gcrypt.h:213) [-Werror=deprecated-declarations]
GCRY_THREAD_OPTION_PTHREAD_IMPL;
```
This checks appropriate environment variables. When supervised
by either upstart or systemd, collectd will not daemonize but
will signal readiness with the appropriate method.
This allows collectd to be either configured with `expect stop`
in upstart or `Type=notify` with systemd.
The rationale for this is detailed here: http://spootnik.org/entries/2014/11/09_pid-tracking-in-modern-init-systems.html
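The environment check can be sketched as below. This is only an
illustration: the function and enum names are hypothetical, and the choice
of `NOTIFY_SOCKET` (set by systemd for `Type=notify` services) and
`UPSTART_JOB` (set by upstart for its jobs) as the variables to inspect is
an assumption about what "appropriate environment variables" means here.

```c
#include <stdlib.h>

/* Hypothetical result type for this sketch. */
enum supervisor {
    SUPERVISOR_NONE,    /* daemonize as usual */
    SUPERVISOR_UPSTART, /* with `expect stop`: signal readiness by raising SIGSTOP */
    SUPERVISOR_SYSTEMD  /* with `Type=notify`: send "READY=1" to the notify socket */
};

/* Return non-zero if the variable is set to a non-empty value. */
static int env_is_set(const char *name)
{
    const char *value = getenv(name);
    return value != NULL && value[0] != '\0';
}

/* Guess which init system is supervising the process, if any. */
static enum supervisor detect_supervisor(void)
{
    if (env_is_set("NOTIFY_SOCKET"))
        return SUPERVISOR_SYSTEMD;
    if (env_is_set("UPSTART_JOB"))
        return SUPERVISOR_UPSTART;
    return SUPERVISOR_NONE;
}
```

In both supervised cases the daemon stays in the foreground, so the init
system tracks the right PID without guessing at forks.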
Vincent Bernat [Fri, 7 Nov 2014 14:13:27 +0000 (15:13 +0100)]
smart: add a SMART plugin
This plugin uses libatasmart:
http://0pointer.de/blog/projects/being-smart.html
As libatasmart is Linux-only, the plugin is therefore Linux-only
too. The disks are discovered through libudev.
Each SMART attribute is extracted. The current value, worst value and
threshold value (if any) are recorded. Those are normalized
values (between 0 and 255, higher is better). For some values, it makes
more sense to record the raw value. libatasmart converts this raw
value to something sensible, and we record that form. Sometimes, this is
just the raw value but sometimes it is converted to another scale (for
example, the temperature). People should know what each attribute means
before using those values. Otherwise, the normalized values are better.
Four values (power-on time, power cycle count, bad sectors and
temperature) are also recorded on their own. Those are usually the
values that the user cares about the most.
Here is an excerpt of the plugin output with the CSV plugin (the SSD
disk on my laptop doesn't provide a temperature sensor):
Marc Fournier [Mon, 10 Nov 2014 06:58:13 +0000 (07:58 +0100)]
write_redis: fix format of commands sent to redis
The commands getting submitted to redis now look like this:
"ZADD" "collectd/hostname/entropy/entropy" "1415602051.335973024" "1415602051.335973024:823"
"SADD" "collectd/values" "hostname/entropy/entropy"
... which is the same as in the initial implementation, except for the
added decimals in the timestamp (the plugin was developed before
high-precision timestamps support was added to collectd).
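As a rough sketch of the member format shown above, the ZADD member is the
timestamp and the value joined by a colon. The helper name `build_member`
is hypothetical and the inputs are assumed to be already formatted as
strings.

```c
#include <stddef.h>
#include <stdio.h>

/* Join an already formatted timestamp and value as "<timestamp>:<value>".
 * Returns 0 on success, -1 if the result does not fit in buf. */
static int build_member(char *buf, size_t buf_size,
                        const char *timestamp, const char *value)
{
    int n = snprintf(buf, buf_size, "%s:%s", timestamp, value);
    if (n < 0 || (size_t)n >= buf_size)
        return -1;
    return 0;
}
```

Calling it with `"1415602051.335973024"` and `"823"` produces the member of
the ZADD command quoted above.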
When running f3706b0b87, the following command gets sent to redis:
"ZADD" "collectd/hostname/entropy/entropy" "1415487432.000000" "1415487432:932"
Meaning the value actually stored, and later returned by redis, is:
"<timestamp>:<value>".
b7984797 accidentally dropped the colon separating the timestamp and the
value, which leads the plugin to store a somewhat confusing value in
redis:
"ZADD" "collectd/hostname/entropy/entropy" "1415487432.000000" "1415487432932"
Andrés J. Díaz [Thu, 7 Nov 2013 08:57:53 +0000 (09:57 +0100)]
Switch redis.c plugin from credis to hiredis.
Change the entire redis.c plugin to use libhiredis (tested with
libhiredis0.10) instead of credis. libhiredis is available in a number
of distributions, such as Debian or Ubuntu.
This patch keeps the same functionality as the old redis.c.
Vincent Bernat [Fri, 7 Nov 2014 14:40:37 +0000 (15:40 +0100)]
build: fix out-of-tree build
When building collectd out of tree, `srcdir` and `builddir` are
different. We add `$(top_srcdir)/src` to the search path since this is
needed to find `liboconfig/config.h`. Also fix the search path for
libcollectdclient, where only one header is in `builddir` while the
remaining ones are in `srcdir`.
Syslog: if we can't find the log level specified by the configuration string, default to 'info' and warn about the unknown configuration option. There is no longer any way to make syslog totally silent.
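A sketch of that fallback, assuming a hypothetical helper named
`parse_syslog_level` and an illustrative set of accepted level names (the
names the plugin really accepts are not listed in this message):

```c
#include <stddef.h>
#include <strings.h>
#include <syslog.h>

/* Map a configuration string to a syslog priority. Unknown or missing
 * names fall back to LOG_INFO; *known is set to 0 in that case so the
 * caller can emit a warning. The accepted names are an assumption. */
static int parse_syslog_level(const char *name, int *known)
{
    static const struct {
        const char *name;
        int level;
    } levels[] = {
        {"err", LOG_ERR},       {"warning", LOG_WARNING},
        {"notice", LOG_NOTICE}, {"info", LOG_INFO},
        {"debug", LOG_DEBUG},
    };

    if (name != NULL) {
        for (size_t i = 0; i < sizeof(levels) / sizeof(levels[0]); i++) {
            if (strcasecmp(name, levels[i].name) == 0) {
                *known = 1;
                return levels[i].level;
            }
        }
    }

    *known = 0;
    return LOG_INFO;
}
```

Because every unknown string now resolves to a valid level, a bogus value
can no longer be used to silence the plugin.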