Wolf480pl [Tue, 21 Nov 2023 10:53:59 +0000 (11:53 +0100)]
write_prometheus: don't use AI_ADDRCONFIG for resolving bind address
Fixes #4150
write_prometheus uses getaddrinfo to resolve the bind address.
The AI_ADDRCONFIG flag causes getaddrinfo to refuse to resolve
0.0.0.0 when the system has no non-loopback IPv4 addresses configured,
and to refuse to resolve :: when the system has no non-loopback IPv6 addresses configured.
We want binding to a wildcard address (0.0.0.0 or ::) to always work,
even if the network is down.
To achieve that, don't pass the AI_ADDRCONFIG flag
when resolving a bind address.
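A minimal sketch of the resulting getaddrinfo hints, assuming a plain passive/stream setup (the function name and error handling are illustrative, not the plugin's actual code):
```
#include <netdb.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Resolve a bind address without AI_ADDRCONFIG so that wildcard
 * addresses (0.0.0.0 / ::) resolve even when no matching non-loopback
 * address is currently configured on the system. */
static int resolve_bind_address(const char *node, const char *service,
                                struct addrinfo **res) {
  struct addrinfo hints;
  memset(&hints, 0, sizeof(hints));
  hints.ai_family = AF_UNSPEC;
  hints.ai_socktype = SOCK_STREAM;
  hints.ai_flags = AI_PASSIVE; /* note: no AI_ADDRCONFIG */
  return getaddrinfo(node, service, &hints, res);
}
```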
Georg Gast [Sun, 27 Aug 2023 11:20:11 +0000 (13:20 +0200)]
contrib/postgresql: Second postgresql database layout.
Changelog: contrib/postgresql: Second postgresql database layout.
Motivation for this second possible PostgreSQL layout:
------------------------------------------------------
The first layout, by Sebastian 'tokkee' Harl, looks like this:
```
+-------------------+      +------------------+
| Identifiers       |      | values           |
+-------------------+      +------------------+
| ID          int <-+------+> ID          int |
| plugin      text  |      | tstamp      time |
| plugin_inst text  |      | name        text |
| type        text  |      | value     double |
| type_inst   text  |      |                  |
+-------------------+      +------------------+
```
The ID connects the two tables. The plugin, plugin_inst, type and type_inst
fields form a so-called identifier. The timestamp, name and value are inserted
into the values table.
The collectd postgresql plugin calls the collectd_insert function:
```
collectd_insert(timestamp with time zone, -- tstamp
                character varying,        -- host
                character varying,        -- plugin
                character varying,        -- plugin_inst
                character varying,        -- type
                character varying,        -- type_inst
                character varying[],      -- value_name
                character varying[],      -- type_name
                double precision[])       -- values
```
This seems to represent the user_data_t/notification_t structure.
https://github.com/collectd/collectd/blob/ef1e157de1a4f2cff10f6f902002066d0998232c/src/daemon/plugin.h#L172
Let's take the ping plugin as an example. It collects three values: ping, ping_stddev and ping_droprate.
The current structure creates three identifiers and three rows for each entry. The identifiers get reused. It reports "192.168.myping.ip" as the type.
To draw a diagram with e.g. Grafana, I would like all three values next to each other for the host I am pinging. See the graph in the wiki. With the current setup, a query must join across all collected values to pull the ping values out, and it has to do that separately for each value because each one has a different identifier.
This second setup creates two tables:
```
+------------------+      +----------------------+
| Instance         |      | plugin_ping          |
+------------------+      +----------------------+
| ID         int <-+------+> ID              int |
| plugin      text |      | tstamp          time |
| plugin_inst text |      | ping          double |
|                  |      | ping_stddev   double |
|                  |      | ping_droprate double |
|                  |      |                      |
+------------------+      +----------------------+
```
The instance ID gets reused. Each plugin's data gets its own table, and all relevant measurement values end up on one row, which makes getting the data out much easier.
What could be argued, I must admit, is that the creation of the instance table and its sequence should perhaps be moved out of the collectd_insert function.
The type, type_inst and value_name are used to create the name of the value column. The impl_location() function handles data anomalies like the ones from the ping plugin.
Description:
------------
Development was done on PostgreSQL 15.
This layout has some advantages: the data has much higher locality, as it stays in one table, and there are far fewer unneeded text columns.
This leads to much smaller table spaces. In my case the first setup created about 300 MB per day, the new setup about 50 MB, with the added benefit that related data is stored next to each other.
You can also consider changing the data type of the plugin_$plugin columns to real. Just consider whether you really need double precision rather than real; this cuts the needed space in half again.
Sample configuration:
---------------------
```
<Plugin postgresql>
  <Writer sqlstore>
    Statement "SELECT collectd_insert($1, $2, $3, $4, $5, $6, $7, $8, $9);"
  </Writer>
  <Database collectd>
    Host "127.0.0.1"
    Port 5432
    User collector
    Password "mypassword"
    SSLMode "prefer"
    Writer sqlstore
  </Database>
</Plugin>
```
Please make sure that your database user (in this example, collector) has the rights to create tables, insert and update. The user that drops old data must also have the delete right.
Function description:
---------------------
The function collectd_insert() creates all tables and columns by itself.
1. The instance table consists of host/plugin/plugin_inst
2. The plugin_$plugin table (e.g. plugin_apache) contains all data for that plugin. The function collectd_insert() inserts each value into the column determined by its type/type_inst/name. One unfortunate detail about collectd is that the submitted timestamps don't match 100%, so an epsilon (0.5 s) is used to decide which row a value belongs to. If the column is not yet present, it is added by this function.
The function impl_location() removes some data anomalies that are present when the data is submitted. There is a default that matches most cases; for the cpufreq, ping and memory plugins, the names and plugin_inst get adjusted.
The procedure collectd_cleanup() is the maintenance function. Its argument is the number of days for which to keep the data. It can be called by pgAgent or a similar mechanism, e.g. "CALL collectd_cleanup(180)", which deletes all data older than 180 days.
Chris Sibbitt [Fri, 25 Aug 2023 19:04:16 +0000 (15:04 -0400)]
Change AMQP queue drops from DEBUG to WARNING
Debug messages are only available if collectd is compiled with debug enabled, making it hard to troubleshoot the situation where the amqp queue is being overrun.
Because of the way strptime is activated using feature macros,
_DEFAULT_SOURCE (the successor to _BSD_SOURCE) is implicitly disabled,
so timegm is hidden. Defining _DEFAULT_SOURCE alongside the other
feature macros solves this and removes the need for the
TIMEGM_NEEDS_BSD configure macro.
This avoids an implicit declaration of timegm in src/bind.c, and build
failures with future compilers.
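In schematic form the interaction looks roughly like this; the specific _XOPEN_SOURCE value is an assumption for illustration and not necessarily the macro combination used in src/bind.c:
```
/* Defining only a strict feature macro such as _XOPEN_SOURCE (needed for
 * strptime) implicitly disables _DEFAULT_SOURCE and hides timegm.
 * Defining _DEFAULT_SOURCE explicitly brings it back. */
#define _XOPEN_SOURCE 600
#define _DEFAULT_SOURCE

#include <time.h>

time_t parse_utc(const char *s) {
  struct tm tm = {0};
  if (strptime(s, "%Y-%m-%dT%H:%M:%S", &tm) == NULL)
    return (time_t)-1;
  return timegm(&tm); /* implicitly declared without _DEFAULT_SOURCE */
}
```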
tiozhang [Wed, 1 Mar 2023 06:19:15 +0000 (14:19 +0800)]
vmem: add metrics starting with "pgscan_" on Linux
Some Linux kernel versions have metrics starting with "pgscan_"
in /proc/vmstat, for instance:
```
cat /proc/vmstat | grep pgscan
pgscan_kswapd 0
pgscan_direct 0
pgscan_direct_throttle 0
```
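A hedged sketch of the kind of prefix match this implies when walking /proc/vmstat (the printf is a stand-in, not the plugin's real submit path):
```
#include <stdio.h>
#include <string.h>

/* Walk /proc/vmstat and pick up every counter whose name starts with
 * "pgscan_". */
static void read_pgscan_counters(void) {
  FILE *fh = fopen("/proc/vmstat", "r");
  if (fh == NULL)
    return;

  char key[64];
  unsigned long long value;
  while (fscanf(fh, "%63s %llu", key, &value) == 2) {
    if (strncmp(key, "pgscan_", strlen("pgscan_")) == 0)
      printf("vmem/%s = %llu\n", key, value); /* stand-in for dispatching the value */
  }
  fclose(fh);
}
```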
Yadnesh Kulkarni [Wed, 22 Feb 2023 12:27:55 +0000 (17:57 +0530)]
Hugepages plugin skips reading write-only file
Since 'demote' is a write-only file, do not attempt to read it.
This also prevents the plugin from generating incessant logs about
the failure to open it.
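A minimal sketch of such a guard, assuming the plugin iterates over hugepage sysfs attribute names (the helper name is illustrative, not the plugin's actual code):
```
#include <string.h>

/* Skip the write-only "demote" attribute instead of trying to open it
 * for reading and logging a failure on every interval. */
static int hugepages_attr_is_readable(const char *name) {
  if (strcmp(name, "demote") == 0)
    return 0;
  return 1;
}
```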
Thomas Renninger [Tue, 31 Jan 2023 15:40:42 +0000 (16:40 +0100)]
Fix compile issue if net-snmp has NETSNMP_DISABLE_MD5 set
Otherwise one gets:
```
src/snmp.c: In function 'csnmp_config_add_host_auth_protocol':
src/snmp.c:678:25: error: 'usmHMACMD5AuthProtocol' undeclared (first use in this function); did you mean 'usmHMACSHA1AuthProtocol'?
  678 |     hd->auth_protocol = usmHMACMD5AuthProtocol;
      |                         ^~~~~~~~~~~~~~~~~~~~~~
      |                         usmHMACSHA1AuthProtocol
```
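The usual shape of the fix is to guard the MD5 branch at compile time; a sketch under that assumption (the helper name is illustrative, and the actual patch may structure this differently):
```
#include <net-snmp/net-snmp-config.h>
#include <net-snmp/net-snmp-includes.h>

/* Reference usmHMACMD5AuthProtocol only when this net-snmp build
 * provides it; otherwise report that MD5 is unavailable. */
static const oid *md5_auth_protocol_or_null(void) {
#ifndef NETSNMP_DISABLE_MD5
  return usmHMACMD5AuthProtocol;
#else
  return NULL; /* net-snmp was built with MD5 disabled */
#endif
}
```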
Leonard Göhrs [Tue, 27 Sep 2022 06:03:14 +0000 (08:03 +0200)]
mmc: cache open file descriptors to block devices
Udev rules can contain a "watch" option, which is described in the man page as:
Watch the device node with inotify; when the node is closed after being
opened for writing, a change uevent is synthesized.
This watch option is enabled by default for all block devices[1].
The intention behind this is to be notified about changes to the partition
table. The mmc plugin, however, also needs to open the block device for
writing, even though it never modifies its content, in order to issue
ioctls with vendor-defined MMC commands.
Reduce the number of generated change events from one per read cycle to one
per collectd runtime by caching the open file descriptor.
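A minimal sketch of the caching idea, with per-device bookkeeping reduced to a single static descriptor (names are illustrative, not the plugin's actual code):
```
#include <fcntl.h>
#include <unistd.h>

/* Keep the block device open across read cycles so that the close()
 * after each read no longer triggers a synthesized udev change event. */
static int mmc_dev_fd = -1;

static int mmc_get_fd(const char *dev_path) {
  if (mmc_dev_fd < 0)
    mmc_dev_fd = open(dev_path, O_RDWR); /* O_RDWR is needed for the vendor ioctls */
  return mmc_dev_fd;
}

static void mmc_shutdown(void) {
  if (mmc_dev_fd >= 0) {
    close(mmc_dev_fd); /* the single change event happens here, at shutdown */
    mmc_dev_fd = -1;
  }
}
```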
Jim Klimov [Wed, 31 Aug 2022 13:32:46 +0000 (15:32 +0200)]
configure.ac: if neither UPSCONN{,_t} type was found, refuse to build NUT plugin
NOTE: src/nut.c also has pragmas to error out in this situation,
but that handling is compiler-dependent and happens too late in
the checkout/configure/build loop.
Presumably this inability to find the type in the earlier-found header file
is also triggered by build environment "inconsistencies" like lack of basic
types in the libc implementation (maybe highlighting the need for additional
headers or macros for the platform).
Jim Klimov [Wed, 31 Aug 2022 09:40:01 +0000 (11:40 +0200)]
configure.ac, src/nut.c: detect int types required by NUT API we build against
Either use the stricter int types required by NUT headers since the v2.8.0
release, or the relaxed (arch-dependent) types required by older NUT releases,
depending on which NUT API version collectd is being built against.
Inspired by discussion at https://github.com/networkupstools/nut/issues/1638
lns [Wed, 15 Jun 2022 18:08:32 +0000 (20:08 +0200)]
Fix-configure.ac: define PREFIX in config.h
PREFIX was never defined and therefore fell back to the
default value `/opt/collectd`. collectd searched this path for
the files it needs, e.g. typesdb, plugins, etc.,
regardless of whether it was configured with `--prefix`.
Signed-off-by: lns <matzeton@googlemail.com>
Signed-off-by: Toni Uhlig <matzeton@googlemail.com>
Eero Tamminen [Wed, 8 Jun 2022 09:31:07 +0000 (12:31 +0300)]
Add scalloc() and use it in src/processes.c (#4014)
* Add scalloc() wrapper similar to smalloc() to common utils
scalloc() wraps calloc() with exit on alloc failure,
similarly to what smalloc() does for malloc().
* Handle (Solaris-only) ps_read_process calloc failures by using scalloc
Everything else checks and handles calloc failures except this function.
As I cannot test Solaris-specific code, I have simply replaced calloc with
scalloc, which gracefully exits collectd with an error message on allocation
failure (instead of corrupting memory / crashing, as could happen with the
current code).
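A sketch of what such a scalloc() wrapper looks like, following the description above (the exact error message and exit path in the real helper may differ):
```
#include <stdio.h>
#include <stdlib.h>

/* calloc() wrapper that aborts collectd on allocation failure,
 * mirroring what smalloc() does for malloc(). */
void *scalloc(size_t nmemb, size_t size) {
  void *ptr = calloc(nmemb, size);
  if (ptr == NULL) {
    fprintf(stderr, "Not enough memory.\n");
    exit(EXIT_FAILURE);
  }
  return ptr;
}
```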
Leonard Göhrs [Fri, 3 Jun 2022 13:31:54 +0000 (15:31 +0200)]
mmc: add more vendor specific and generic data sources (#4006)
* mmc plugin: integrate into configure.ac
The mmc plugin is not fully integrated into configure.ac.
Change that.
Signed-off-by: Leonard Göhrs <l.goehrs@pengutronix.de>
* mmc plugin: Skip mmc paths in /sys that start with a '.' (like "." and "..")
The plugin tries to (and obviously fails to) use "." and "..", which come out
of listdir, as mmc devices.
Filter these out by skipping hidden files/directories.
Signed-off-by: Leonard Göhrs <l.goehrs@pengutronix.de>
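A sketch of the filter described above (the helper name and directory-walking details are illustrative):
```
#include <dirent.h>

/* Skip "." and ".." (and any other hidden entry) when scanning the
 * sysfs directory for mmc devices. */
static int is_candidate_mmc_entry(const struct dirent *entry) {
  return entry->d_name[0] != '.';
}
```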
* mmc plugin: read standard eMMC 5.0 health metrics
Signed-off-by: Leonard Göhrs <l.goehrs@pengutronix.de>
* mmc plugin: remove type-name defines
These defines can become confusing, especially when combined with the defines
for attribute names in the sysfs. This will only get worse when more
vendor-specific metrics are supported.
Remove the defines and use the type names directly.
Signed-off-by: Leonard Göhrs <l.goehrs@pengutronix.de>
* mmc plugin: remove sysfs-attribute defines
These defines are used only once or twice and do not help with readability.
Replace them with just the raw strings.
Signed-off-by: Leonard Göhrs <l.goehrs@pengutronix.de>
* mmc plugin: port to libudev
While using sysfs directly works fine for the Swissbit and generic eMMC
drivers, it does not scale well to other vendor-specific interfaces where one
has to open the block device in /dev to perform ioctls.
Signed-off-by: Leonard Göhrs <l.goehrs@pengutronix.de>
* mmc plugin: add micron eMMC support
While this patch was only tested with a single product (MTFC16GAPALBH), I am
fairly confident that it will generalize to others as well, as Micron
themselves ship a single tool[1], which this patch uses as a reference, to read
similar info from all of their eMMCs.
This patch also increases the maximum value of mmc_bad_blocks to infinity,
as it can be any 16-bit integer for Micron eMMCs but could be even larger for
other vendors.
Signed-off-by: Leonard Göhrs <l.goehrs@pengutronix.de>
* mmc plugin: add sandisk eMMC support
While this patch was only tested with a single product (SDINBDG4-8G), I am
fairly confident that it should generalize to other devices as well, as the
current product portfolio on their website looks very similar to the one I
tested and new devices will likely use a Western Digital manufacturer ID.
Signed-off-by: Leonard Göhrs <l.goehrs@pengutronix.de>
Leonard Göhrs [Wed, 1 Jun 2022 06:04:28 +0000 (08:04 +0200)]
MQTT: fix off-by-one error in published message length (#4005)
The mqtt plugin publishes messages including the trailing '\0' byte,
as can be verified using e.g. the mosquitto_sub command with a hex message
formatter:
```
$ mosquitto_sub -t "#" -F "%t: %X"
metrics/loragw1/users/users: 313635323334303737392E3938353A3000
                                                             ^^
```
While the MQTT PUBLISH payload is, according to the specification,
application specific, and most (C-based) consumers will not notice the
trailing '\0' byte, it is rather uncommon to publish messages like this.
We stumbled upon this error while using Telegraf to ingest metrics via MQTT,
as it is Go-based and does not use '\0'-terminated strings, leading to issues
when parsing these strings into numbers.
Fix the off-by-one error by using the result of strlen() as-is.
Signed-off-by: Leonard Göhrs <l.goehrs@pengutronix.de>
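With libmosquitto the difference comes down to the payload length passed to mosquitto_publish(); a sketch (topic, QoS and retain values are illustrative, not the plugin's actual settings):
```
#include <stdbool.h>
#include <string.h>

#include <mosquitto.h>

/* Publish the formatted metric without the trailing '\0': passing
 * strlen(payload) instead of strlen(payload) + 1 keeps the NUL byte out
 * of the MQTT PUBLISH payload. */
static int publish_metric(struct mosquitto *mosq, const char *topic,
                          const char *payload) {
  return mosquitto_publish(mosq, /* mid = */ NULL, topic,
                           (int)strlen(payload), payload,
                           /* qos = */ 0, /* retain = */ false);
}
```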
CURLOPT_POSTFIELDSIZE allows specifying the data size, which is known in
advance and equals cb->send_buffer_fill. When CURLOPT_POSTFIELDSIZE is not
set (or is set to -1), curl determines the data size using strlen(), which
has O(N) complexity, so we save a few CPU cycles here.
Signed-off-by: Matwey V. Kornilov <matwey.kornilov@gmail.com>
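A sketch of the relevant libcurl calls (the buffer/fill parameters mirror the cb->send_buffer* fields mentioned above; the surrounding request setup is omitted):
```
#include <curl/curl.h>

/* Tell curl the exact payload size up front so it does not have to
 * strlen() the buffer on every request. */
static void set_post_payload(CURL *curl, const char *buffer,
                             size_t send_buffer_fill) {
  curl_easy_setopt(curl, CURLOPT_POSTFIELDS, buffer);
  curl_easy_setopt(curl, CURLOPT_POSTFIELDSIZE, (long)send_buffer_fill);
}
```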
* write_influxdb_udp: Split formatting functions to format_influxdb
Signed-off-by: Matwey V. Kornilov <matwey.kornilov@gmail.com>
* write_http: Add influxdb format
Signed-off-by: Matwey V. Kornilov <matwey.kornilov@gmail.com>
* write_http: Enable using unix socket in libcurl
Currently, meta_data supports only key lookup over a forward-list data
structure, so iterating over all entries via lookups takes O(N^2).
Here we introduce the meta_data_iter() and meta_data_iter_next() functions,
which work with an opaque iterator type.
Signed-off-by: Matwey V. Kornilov <matwey.kornilov@gmail.com>
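To illustrate the complexity argument, here is a toy model of a forward-list meta data store; the types and helpers below are purely illustrative and are not collectd's meta_data API:
```
#include <stddef.h>
#include <string.h>

/* Toy model: meta data as a singly linked ("forward") list. A by-name
 * lookup is O(N), so visiting all N entries through lookups costs
 * O(N^2); a direct walk (what an iterator exposes) is O(N). */
typedef struct meta_entry_s {
  char key[64];
  char value[64];
  struct meta_entry_s *next;
} meta_entry_t;

/* O(N) lookup of a single key. */
static meta_entry_t *meta_lookup(meta_entry_t *head, const char *key) {
  for (meta_entry_t *e = head; e != NULL; e = e->next)
    if (strcmp(e->key, key) == 0)
      return e;
  return NULL;
}

/* O(N) walk over all entries, the job an opaque iterator does for the
 * real meta_data type. */
static void meta_walk(meta_entry_t *head,
                      void (*cb)(const char *key, const char *value)) {
  for (meta_entry_t *e = head; e != NULL; e = e->next)
    cb(e->key, e->value);
}
```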
* format_influxdb: Support serializing meta_data
collectd 6.0 supports serializing series labels as InfluxDB tags. Here we
backport this feature by serializing string-valued meta data keys as InfluxDB
tags.
Signed-off-by: Matwey V. Kornilov <matwey.kornilov@gmail.com>
Florian Eckert [Thu, 17 Mar 2022 07:34:33 +0000 (08:34 +0100)]
smart: check udev_enumerate_scan_devices() return value and unify log messages (#3984)
* Check udev_enumerate_scan_devices return value
This change checks the function's return value and cancels the operation
if the returned integer is negative.
Signed-off-by: Florian Eckert <fe@dev.tdt.de>
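A sketch of that check (the surrounding enumeration setup is omitted and the error handling is illustrative):
```
#include <libudev.h>

/* Abort the read cycle if scanning for devices failed, instead of
 * iterating over a possibly incomplete enumeration. */
static int smart_scan_devices(struct udev_enumerate *enumerate) {
  int status = udev_enumerate_scan_devices(enumerate);
  if (status < 0)
    return -1; /* caller logs "smart plugin: ..." and cancels the read */
  return 0;
}
```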
* Unify and fix log messages
The error log messages were not consistent and had no prefix. This
commit adds the uniform prefix 'smart plugin:' to each log message. While
we're at it, the trailing punctuation mark at the end of each message was
also removed.
Emma Foley [Tue, 15 Feb 2022 07:46:21 +0000 (07:46 +0000)]
Fix CI failures caused by unsupported distros and updates to dependencies (#3975)
* [ci][gha] Replace trusty with Bionic and Focal
Ubuntu 14.04 (Trusty) is out of standard support [1].
``make check`` fails for test_capabilities, as noted in [2].
[3] indicates that the cause is glibc, but updates to the version in trusty
are not expected.
This PR replaces trusty with Ubuntu 18.04 (Bionic) and 20.04 (Focal).
Emma Foley [Fri, 5 Nov 2021 19:20:31 +0000 (19:20 +0000)]
[ci][cirrus] Replace trusty with bionic/focal in debian_default_toolchain
Ubuntu 14.04 (Trusty) is out of standard support [1].
``make check`` fails for test_capabilities, as noted in [2].
[3] indicates that the cause is glibc, but updates to the version in trusty
are not expected.
This PR replaces trusty with Ubuntu 18.04 (Bionic) and 20.04 (Focal).