Vincent Bernat [Tue, 27 Oct 2020 17:31:22 +0000 (18:31 +0100)]
interfaces: listen to all incoming packets on Linux, not just LLDP ones
This mostly reverts fc5526dae75f. Listening only on ETH_P_LLDP makes
us miss incoming packets on interfaces enslaved to an Open vSwitch.
Therefore, prefer listening to ETH_P_ALL instead of ETH_P_LLDP. It is
likely that enslaved interfaces do not fully process Ethernet packets
and `type` is not correctly filled.
Vincent Bernat [Mon, 7 Sep 2020 18:10:10 +0000 (20:10 +0200)]
tests: fix tests around XML by canonicalizing XML representation
Since Python 3.8, insertion order is respected for attributes, so we
cannot just compare strings as previously. Python 3.8 also introduces
a `canonicalize()` function to normalize XML for digital signature. We
apply this function if it exists.
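
A minimal sketch of this comparison strategy, not the actual test helper. `normalize_xml` is a hypothetical name, and returning the text unchanged on older interpreters is an assumption about the fallback:

```python
import xml.etree.ElementTree as ET


def normalize_xml(text):
    """Return a canonical form of `text` when the interpreter supports it."""
    # ET.canonicalize() only exists on Python 3.8 and later.
    canonicalize = getattr(ET, "canonicalize", None)
    if canonicalize is None:
        return text  # assumption: older Pythons keep the raw string
    return canonicalize(text)


# Two documents that differ only in attribute order.
left = normalize_xml('<port name="eth0" age="5"/>')
right = normalize_xml('<port age="5" name="eth0"/>')
```

On Python 3.8 and later, `left` and `right` compare equal because canonicalization emits attributes in a fixed order.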
Vincent Bernat [Tue, 14 Jul 2020 05:16:47 +0000 (07:16 +0200)]
lib: remove limit on system description length
The limit was introduced in 9c49cedf8e75 while fixing a memory leak.
The state data is used to ensure we don't interleave operations. We
need to handle the case where the value is truncated because it is
larger than the allocated size.
Vincent Bernat [Sat, 23 May 2020 12:32:39 +0000 (14:32 +0200)]
agent: fix SNMP walk on lldpRemTable when missing remote sysName
When enumerating lldpRemSysName (and some others), one row could have
a NULL value because the remote system didn't provide a value. In this
case, we should return the next row.
There was already some code around that but it was not systematically
used. Therefore, we fix the issue for lldpRemTable and
lldpLocalSystemData. To catch future cases, helper functions now
follow one of two patterns. When no missing value is allowed, they use
`default: return NULL` (no `break`, the compiler would catch it if
that were the case) and the caller does not need to try the next OID.
When a value may be missing, they use `default: break` and the caller
should try the next OID upon receiving NULL.
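
The caller-side behavior can be sketched as follows. This is an illustration only: the real agent is written in C against Net-SNMP, and `next_value`, the `rows` list, and the dictionary layout are hypothetical stand-ins for the table walk.

```python
def next_value(rows, column, start=0):
    """Return (index, value) for the first row at or after `start`
    that has a value in `column`, or None when the table is exhausted."""
    for index in range(start, len(rows)):
        value = rows[index].get(column)
        if value is not None:
            return index, value
        # Missing value (NULL in the C code): try the next row.
    return None


# The second remote system did not advertise a sysName.
rows = [
    {"sysName": "switch1"},
    {"sysName": None},
    {"sysName": "switch3"},
]
```

Starting the walk at the second row skips the missing value and lands on the third row instead of stopping.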
Vincent Bernat [Fri, 24 Apr 2020 17:29:36 +0000 (19:29 +0200)]
lib: introduce lldpctl_watch_callback2()
This is similar to `lldpctl_watch_callback()` (which is getting
deprecated), except the callback won't receive the current connection.
This prevents a user from using the connection, which is unusable
because it is now dedicated to watching events.
Minor ABI bump due to the new function, but everything is
backward-compatible, except you may now get an error if you use the
connection while watching (but this was already not supported).
Vincent Bernat [Sat, 1 Feb 2020 22:21:44 +0000 (23:21 +0100)]
interfaces: include "netinet/in.h" before kernel headers
This should ensure definitions of things like in6_addr are provided by
the libc, not by the kernel headers. Recent kernel headers know how to
handle that when loaded second.
Vincent Bernat [Sat, 1 Feb 2020 16:34:29 +0000 (17:34 +0100)]
snmp: tentative to fix compilation with older versions of NetSNMP
In fact, gcc doesn't consider the signatures to be compatible when
using a function pointer. Let's use a preprocessor trick to detect the
version and hope gcc is OK with `char*` changed to `void*`.
interfaces: fix for limitation of 10 VLANs for LLDP .1q feature
Max 10 VLAN ids are supported on a port for .1q feature
Root Cause: All the VLANs learnt from netlink are dropped after 10 VLANs due
to the static array allocation of only 10 VLAN ids in the interface structure.
VLAN memberships beyond 10 for a port are ignored and an error message gets
printed, flooding the logs when hundreds of VLANs are configured.
Fix: Changed the static VLAN id array to VLAN id bitmap. With the bitmap all 4k
VLANs can be stored and learnt from netlink messages.
- Added a message to indicate when the LLDP packet is not sent out because
it is too big. This will help the user when too many VLANs are
configured and LLDP packets are not sent out.
Limitation: Even though the limit on VLAN learning from netlink messages has
been lifted, only around 380 VLANs can be advertised in the packet because of
the LLDP message size. This number can vary based on the number of other TLVs
being advertised by LLDP.
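
The switch from a fixed array to a bitmap can be illustrated with the sketch below. It is not the C code from the commit: the function names and the byte-level layout are assumptions, and only the idea (one bit per possible VLAN id, 4096 in total) comes from the description above.

```python
MAX_VLAN = 4096  # VLAN ids are 12 bits wide


def new_vlan_bitmap():
    """Allocate one bit per possible VLAN id (512 bytes)."""
    return bytearray(MAX_VLAN // 8)


def set_vlan(bitmap, vlan_id):
    """Record membership of `vlan_id` in the bitmap."""
    if not 0 <= vlan_id < MAX_VLAN:
        raise ValueError("VLAN id out of range")
    bitmap[vlan_id // 8] |= 1 << (vlan_id % 8)


def has_vlan(bitmap, vlan_id):
    """Tell whether `vlan_id` was recorded."""
    return bool(bitmap[vlan_id // 8] & (1 << (vlan_id % 8)))


def list_vlans(bitmap):
    """Return all recorded VLAN ids in increasing order."""
    return [v for v in range(MAX_VLAN) if has_vlan(bitmap, v)]


bitmap = new_vlan_bitmap()
for vid in (100, 1, 4094):
    set_vlan(bitmap, vid)
```

The storage cost is constant regardless of how many VLANs a port belongs to, which removes the need for any per-port membership limit.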
vlan info shows interface_name.vlan-id which makes it look like sub
interface and causes confusion
Root Cause: Vlan name is sent as part of the .1q TLVs. But, the vlan name
format was <interface-name>.<vlan-id> which makes it look like a sub-interface.
Fix: The vlan name cannot be removed from the vlan-info display since it is
sent/received as part of the .1q TLV in LLDP packet. But, changed the
vlan name format to vlan-<vlan-id> to avoid the confusion.
(cherry picked from commit 38db598121f5ce615f98d6cdaf41d5360c40dc3c)
lldpd: set a 30 seconds lower limit to the safeguard timer
This timer is a safeguard to refresh information about all network
interfaces at regular intervals, in case there was something wrong with
the event-based refresh.
When using milliseconds-grained tx-interval, this could happen several
times per second, which is not the intended use of the safeguard.
To quote Vincent Bernat:
"Minimal value of 30 seconds even if we don't have event-based
refresh (so outside the if). Most people have the event-based
refresh, so lowering the pace for the others should be pretty
invisible."
- Added a test in test_lldpcli.py to check that milliseconds delay can
be read back in either seconds or milliseconds units.
- Updated the manual page for lldpcli; warn about performance issue.
- Added an entry in NEWS
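
A minimal sketch of the lower limit, under stated assumptions: the factor of 20 is taken from the system refresh timer described elsewhere in this log (tx-interval * 20), the millisecond unit follows the "milliseconds-grained tx-interval" wording, and `safeguard_delay_ms` is a hypothetical name rather than a function in lldpd.

```python
SAFEGUARD_FLOOR_MS = 30_000  # 30-second lower limit from this commit
REFRESH_MULTIPLIER = 20      # assumption: tx-interval * 20


def safeguard_delay_ms(tx_interval_ms):
    """Delay between two safeguard refreshes, never below the floor."""
    return max(tx_interval_ms * REFRESH_MULTIPLIER, SAFEGUARD_FLOOR_MS)
```

With a 100 ms tx-interval the unclamped value would be 2 seconds, so the floor applies. With the usual 30-second tx-interval the result is unchanged at 10 minutes.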
Issue: Error messages "netlink: 8 bytes leftover after parsing attributes
in process `lldpd'"
Root Cause: The length of the netlink message was not being set
properly for non-bridge family type messages. The same length was being used
for both types of messages even though the bridge family type message has an
extra attribute. This causes 8 extra bytes to be left over in the non-bridge
family type netlink messages.
Fix: Calculating and setting the length separately for bridge and non-bridge
family type messages.
(cherry picked from commit aac76966539bf932d5923b165762db370990bf94)
Sam Tannous [Thu, 21 Nov 2019 17:27:27 +0000 (09:27 -0800)]
LLDPD should document system refresh timer (tx-interval * 20)
In LLDPD, each port has its own timer to catch port-related
changes and is modified by changing the tx-interval.
LLDPD also starts another system based refresh timer on each port
for changes like hostname. This is the tx-interval multiplied by
20. This needs to be documented.
Signed-off-by: Sam Tannous <stannous@cumulusnetworks.com>
Vincent Bernat [Thu, 21 Nov 2019 19:13:38 +0000 (20:13 +0100)]
lldp: don't discard the whole LLDPDU when only one TLV is invalid
IEEE802.1AB-2005 says:
> If TLV_type_value is in the range of reserved TLV types in Table
> 9-1, the TLV is unrecognized and may be a basic TLV from a later
> LLDP version. The statsTLVsUnrecognizedTotal counter shall be
> incremented, and the TLV shall be assumed to be validated.
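
The behavior described by the standard can be sketched as below. This is not lldpd's C decoder: `parse_lldpdu`, `make_tlv`, and `KNOWN_TYPES` are hypothetical names. The sketch relies only on the LLDP TLV header layout (7-bit type, 9-bit length) and on types 9 to 126 being reserved.

```python
import struct

# Basic TLV types 0-8 plus organizationally specific TLV 127.
KNOWN_TYPES = set(range(0, 9)) | {127}


def make_tlv(tlv_type, value):
    """Encode one TLV: 7-bit type, 9-bit length, then the value."""
    return struct.pack("!H", (tlv_type << 9) | len(value)) + value


def parse_lldpdu(payload):
    """Return (recognized TLVs, count of unrecognized TLVs)."""
    tlvs, unrecognized, offset = [], 0, 0
    while offset + 2 <= len(payload):
        (header,) = struct.unpack_from("!H", payload, offset)
        tlv_type, length = header >> 9, header & 0x1FF
        offset += 2
        if offset + length > len(payload):
            raise ValueError("truncated TLV")
        value = payload[offset:offset + length]
        offset += length
        if tlv_type == 0:
            break  # End of LLDPDU
        if tlv_type not in KNOWN_TYPES:
            unrecognized += 1  # count it, but keep the rest of the LLDPDU
            continue
        tlvs.append((tlv_type, value))
    return tlvs, unrecognized


frame = (make_tlv(1, b"\x04abc") + make_tlv(42, b"xx")
         + make_tlv(5, b"host") + make_tlv(0, b""))
result = parse_lldpdu(frame)
```

Here the reserved type 42 is counted and skipped, while the TLVs before and after it are still returned.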
Vincent Bernat [Mon, 11 Nov 2019 08:54:10 +0000 (09:54 +0100)]
lib: fix memory leak when handling I/O
The state data is used to ensure we don't interleave requests of the
same kind (eg requesting data for eth0, then for eth1 while eth0 is
running). The data was freed only when reaching `CONN_STATE_IDLE`
again. Otherwise, there was a memory leak.
To avoid the memory leak, we use a static allocation instead.
Vincent Bernat [Tue, 8 Oct 2019 17:35:41 +0000 (19:35 +0200)]
lldp: when receiving a shutdown LLDPU, don't clear chassis information
The chassis may be shared with another port. When the MSAP is known
and we receive a shutdown LLDPDU, just leave the original chassis as
is instead of copying information from the new chassis to the old
chassis.