git.ipfire.org Git - thirdparty/iproute2.git/log

iprule: Move port parsing to a function

In preparation for adding port mask support, move port parsing to a
function.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: David Ahern <dsahern@kernel.org>

tc: Fix rounding in tc_calc_xmittime and tc_calc_xmitsize.

Currently, tc_calc_xmittime and tc_calc_xmitsize round from double to
int three times: once when they call tc_core_time2tick /
tc_core_tick2time (whose argument is int), once when those functions
return (their return value is int), and finally when the tc_calc_*
functions return. This leads to extremely coarse and inaccurate
conversions.

As a result, for example, on my test system (where tick_in_usec=15.625,
clock_factor=1, and hz=1000000000) for a bitrate of 1Gbps, all tc htb
burst values between 0 and 999 bytes get encoded as 0 ticks; all values
between 1000 and 1999 bytes get encoded as 15 ticks (equivalent to 960
bytes); all values between 2000 and 2999 bytes as 31 ticks (1984 bytes);
etc.

The patch changes the code so these calculations are done internally in
floating-point, and only rounded to integer values when the value is
returned. It also changes tc_calc_xmittime to round its calculated value
up, rather than down, to ensure that the calculated time is actually
sufficient for the requested size.
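
The effect of the repeated truncation can be reproduced with a minimal sketch. It assumes the tick_in_usec value quoted above and a rate expressed in bytes per second (tc's "bps" suffix means bytes per second). The function names are hypothetical and this is not the tc_core.c code.

```c
/* Assumed values, taken from the example in the text above. */
#define TIME_UNITS_PER_SEC 1000000.0
static const double tick_in_usec = 15.625;

/* Old behaviour (sketch): the time in usec is truncated to int before it is
 * converted to ticks, and the tick count is truncated again on return. */
static unsigned int xmittime_truncating(double rate_bytes_per_sec,
					unsigned int size)
{
	int usec = (int)(size * TIME_UNITS_PER_SEC / rate_bytes_per_sec);

	return (unsigned int)(usec * tick_in_usec);
}

/* New behaviour (sketch): stay in floating point and round up once, so the
 * resulting time is never shorter than what the size requires. */
static unsigned int xmittime_rounding_up(double rate_bytes_per_sec,
					 unsigned int size)
{
	double ticks = size * TIME_UNITS_PER_SEC / rate_bytes_per_sec *
		       tick_in_usec;
	unsigned int whole = (unsigned int)ticks;

	return (double)whole < ticks ? whole + 1 : whole;
}
```

With a rate of 1e9 bytes per second, the truncating variant maps both 1000 and 1999 bytes to 15 ticks, while the rounding-up variant maps 1000 bytes to 16 ticks.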

Signed-off-by: Jonathan Lennox <jonathan.lennox@8x8.com>
Signed-off-by: David Ahern <dsahern@kernel.org>

Update kernel headers

Update kernel headers to commit
56794b5862c5 ("Merge branch 'mlx5-health-syndrome'")

Signed-off-by: David Ahern <dsahern@kernel.org>

ip: link: netkit: Support scrub options

Add "scrub" option to configure IFLA_NETKIT_SCRUB and
IFLA_NETKIT_PEER_SCRUB when setting up a link. Add "scrub" and
"peer scrub" to device details as well when printing.

$ sudo ./ip/ip link add jordan type netkit scrub default peer scrub none
$ ./ip/ip -details link show jordan
43: jordan@nk0: <BROADCAST,MULTICAST,NOARP,M-DOWN> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff promiscuity 0 allmulti 0 minmtu 68 maxmtu 65535
netkit mode l3 type primary policy forward peer policy forward scrub default peer scrub none numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535 tso_max_size 524280 tso_max_segs 65535 gro_max_size 65536 gso_ipv4_max_size 65536 gro_ipv4_max_size 65536

v2->v3: Updated man page.
v1->v2: Added some spaces around "scrub SCRUB" in the help message.

Link: https://lore.kernel.org/netdev/20241004101335.117711-1-daniel@iogearbox.net/
Signed-off-by: Jordan Rife <jordan@jrife.io>
Signed-off-by: David Ahern <dsahern@kernel.org>

lib: remove redundant checks in get_u64 and get_s64

Static analyzer reported:
1. if (res > 0xFFFFFFFFFFFFFFFFULL)
Expression 'res > 0xFFFFFFFFFFFFFFFFULL' is always false , which may be caused by a logical error:
'res' has a type 'unsigned long long' with minimum value '0' and a maximum value '18446744073709551615'

2. if (res > INT64_MAX || res < INT64_MIN)
Expression 'res > INT64_MAX' is always false , which may be caused by a logical error: 'res' has a type 'long long'
with minimum value '-9223372036854775808' and a maximum value '9223372036854775807'
Expression 'res < INT64_MIN' is always false , which may be caused by a logical error: 'res' has a type 'long long'
with minimum value '-9223372036854775808' and a maximum value '9223372036854775807'

Corrections explained:
- Removed redundant check `res > 0xFFFFFFFFFFFFFFFFULL` in `get_u64`,
as `res` cannot exceed this value due to its type.
- Removed redundant checks `res > INT64_MAX` and `res < INT64_MIN` in `get_s64`,
as `res` cannot exceed the range of `long long`.
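
A minimal sketch of a parser without the redundant comparison follows. parse_u64() is a hypothetical helper written for illustration, not the iproute2 get_u64(); it relies on errno to detect overflow and assumes unsigned long long is 64 bits wide.

```c
#include <errno.h>
#include <limits.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: parse an unsigned 64-bit value.
 * Returns 0 on success and -1 on error. */
static int parse_u64(uint64_t *val, const char *arg, int base)
{
	unsigned long long res;
	char *end;

	if (!arg || !*arg)
		return -1;

	errno = 0;
	res = strtoull(arg, &end, base);

	/* no digits consumed, or trailing garbage */
	if (end == arg || *end)
		return -1;

	/* overflow is reported through errno, not through the value range */
	if (res == ULLONG_MAX && errno == ERANGE)
		return -1;

	/* No "res > 0xFFFFFFFFFFFFFFFFULL" test: with a 64-bit
	 * unsigned long long that comparison can never be true. */
	*val = res;
	return 0;
}
```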

Triggers found by static analyzer Svace.

Signed-off-by: Anton Moryakov <ant.v.moryakov@gmail.com>
Signed-off-by: David Ahern <dsahern@kernel.org>

ip: remove duplicate condition in ila_csum_name2mode in

Static analyzer reported:
expression is identical to previous condition

Corrections explained:
The condition checking for "neutral-map-auto" was duplicated in the
ila_csum_name2mode function. This commit removes the redundant check
to improve code readability and maintainability.

Triggers found by static analyzer Svace.

Signed-off-by: Anton Moryakov <ant.v.moryakov@gmail.com>
Signed-off-by: David Ahern <dsahern@kernel.org>

ip: handle NULL return from localtime in strxf_time in

Static analyzer reported:
Pointer 'tp', returned from function 'localtime' at ipxfrm.c:352, may be NULL
and is dereferenced at ipxfrm.c:354 by calling function 'strftime'.

Corrections explained:
The function localtime() may return NULL if the provided time value is
invalid. This commit adds a check for NULL and handles the error case
by copying "invalid-time" into the output buffer.
This is unlikely, but localtime() can fail.
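
The NULL handling described above can be sketched as follows. format_tm() and its format string are hypothetical and chosen for illustration; only the "invalid-time" fallback string comes from the description.

```c
#include <stdio.h>
#include <time.h>

/* Hypothetical helper: format a broken-down time, falling back to a fixed
 * string when localtime() could not convert the value and returned NULL. */
static void format_tm(const struct tm *tp, char *buf, size_t len)
{
	if (!tp) {
		/* never hand a NULL pointer to strftime() */
		snprintf(buf, len, "invalid-time");
		return;
	}

	/* strftime() returns 0 if the result does not fit; leave an empty
	 * string in that case rather than indeterminate contents */
	if (strftime(buf, len, "%Y-%m-%d %H:%M:%S", tp) == 0 && len > 0)
		buf[0] = '\0';
}
```

A caller would pass the result of localtime() straight through, for example format_tm(localtime(&t), buf, sizeof(buf)).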

Triggers found by static analyzer Svace.

Signed-off-by: Anton Moryakov <ant.v.moryakov@gmail.com>
Signed-off-by: David Ahern <dsahern@kernel.org>

ip: check return value of iproute_flush_cache() in iproute.c

Static analyzer reported:
Return value of function 'iproute_flush_cache', called at iproute.c:1732,
is not checked. The return value is obtained from function 'open64' and possibly contains an error code.

Corrections explained:
The function iproute_flush_cache() may return an error code, which was
previously ignored. This could lead to unexpected behavior if the cache
flush fails. Added error handling to ensure the function fails gracefully
when iproute_flush_cache() returns an error.

Triggers found by static analyzer Svace.

Signed-off-by: Anton Moryakov <ant.v.moryakov@gmail.com>
Signed-off-by: David Ahern <dsahern@kernel.org>

tc_util: Add support for 64-bit hardware packets counter

The netlink nest that carries tc action statistics looks as follows:

[TCA_ACT_STATS]
[TCA_STATS_BASIC]
[TCA_STATS_BASIC_HW]

Where 'TCA_STATS_BASIC' carries the combined software and hardware
packets (32-bit) and bytes (64-bit) counters and 'TCA_STATS_BASIC_HW'
carries the hardware statistics.

When the number of packets exceeds 0xffffffff, the kernel emits the
'TCA_STATS_PKT64' attribute:

[TCA_ACT_STATS]
[TCA_STATS_BASIC]
[TCA_STATS_PKT64]
[TCA_STATS_BASIC_HW]
[TCA_STATS_PKT64]

This layout is not ideal as the only way for user space to know what
each 'TCA_STATS_PKT64' attribute carries is to check which attribute
precedes it, which is exactly what some applications are doing [1].

Do the same in iproute2 so that users with existing kernels could read
the 64-bit hardware packets counter of tc actions instead of reading the
truncated 32-bit counter.
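
The "check which attribute precedes it" approach can be sketched on a simplified model. The types below are stand-ins invented for this illustration and do not use the real netlink attribute layout or the code in tc_util.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins for the attributes named above. */
enum stat_attr { STATS_BASIC, STATS_BASIC_HW, STATS_PKT64, STATS_OTHER };

struct attr {
	enum stat_attr type;
	uint64_t packets;	/* 32-bit count for BASIC*, 64-bit for PKT64 */
};

struct counters {
	uint64_t total_packets;
	uint64_t hw_packets;
};

/* Walk the nest in order and let each PKT64 override the packet count of
 * whichever BASIC or BASIC_HW attribute came immediately before it. */
static void collect_packets(const struct attr *attrs, size_t n,
			    struct counters *out)
{
	uint64_t *last = NULL;
	size_t i;

	out->total_packets = 0;
	out->hw_packets = 0;

	for (i = 0; i < n; i++) {
		switch (attrs[i].type) {
		case STATS_BASIC:
			out->total_packets = attrs[i].packets;
			last = &out->total_packets;
			break;
		case STATS_BASIC_HW:
			out->hw_packets = attrs[i].packets;
			last = &out->hw_packets;
			break;
		case STATS_PKT64:
			if (last)
				*last = attrs[i].packets;
			last = NULL;
			break;
		default:
			last = NULL;
			break;
		}
	}
}
```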

Before:

$ tc -s filter show dev swp2 ingress
filter protocol all pref 1 flower chain 0
filter protocol all pref 1 flower chain 0 handle 0x1
  skip_sw
  in_hw in_hw_count 1
        action order 1: mirred (Egress Redirect to device swp1) stolen
        index 1 ref 1 bind 1 installed 47 sec used 23 sec
        Action statistics:
        Sent 368689092544 bytes 5760767071 pkt (dropped 0, overlimits 0 requeues 0)
        Sent software 0 bytes 0 pkt
        Sent hardware 368689092544 bytes 1465799775 pkt
        backlog 0b 0p requeues 0
        used_hw_stats immediate

Where 5760767071 - 1465799775 = 0x100000000
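
The difference is exactly 2^32 because the hardware count shown above was read from a 32-bit field. A tiny sketch of that truncation, with a hypothetical helper name:

```c
#include <stdint.h>

/* Truncate a 64-bit packet count the way a 32-bit field would store it. */
static uint32_t truncate_to_u32(uint64_t packets)
{
	return (uint32_t)packets;	/* keeps only the low 32 bits */
}
```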

After:

$ tc -s filter show dev swp2 ingress
filter protocol all pref 1 flower chain 0
filter protocol all pref 1 flower chain 0 handle 0x1
  skip_sw
  in_hw in_hw_count 1
        action order 1: mirred (Egress Redirect to device swp1) stolen
        index 1 ref 1 bind 1 installed 71 sec used 47 sec
        Action statistics:
        Sent 368689092544 bytes 5760767071 pkt (dropped 0, overlimits 0 requeues 0)
        Sent software 0 bytes 0 pkt
        Sent hardware 368689092544 bytes 5760767071 pkt
        backlog 0b 0p requeues 0
        used_hw_stats immediate

[1] https://github.com/openvswitch/ovs/commit/006e1c6dbfbadf474c17c8fa1ea358918d371588

Reported-by: Joe Botha <joe@atomic.ac>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David Ahern <dsahern@kernel.org>

Merge remote-tracking branch 'main/main' into next

Signed-off-by: David Ahern <dsahern@kernel.org>

uapi: update bpf.h

Autogenerated from 6.14-rc1

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>

Merge ssh://gitolite.kernel.org/pub/scm/network/iproute2/iproute2-next

v6.13.0

ip: vxlan: Support IFLA_VXLAN_RESERVED_BITS

A new attribute, IFLA_VXLAN_RESERVED_BITS, was added in Linux kernel
commit 6c11379b104e ("vxlan: Add an attribute to make VXLAN header
validation configurable") (See the link below for the full patchset).

The payload is a 64-bit binary field that covers the VXLAN header. The set
bits indicate which bits in a VXLAN packet header should be allowed to
carry 1's. Support the new attribute through a CLI keyword "reserved_bits".

Link: https://patch.msgid.link/173378643250.273075.13832548579412179113.git-patchwork-notify@kernel.org
Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: David Ahern <dsahern@kernel.org>

iproute2: add 'ip monitor acaddress' support

Enhanced the 'ip monitor' command to track changes in IPv6
anycast addresses. This update allows the command to listen for
events related to anycast address additions and deletions by
registering to the newly introduced RTNLGRP_IPV6_ACADDR netlink group.

This patch depends on the kernel patch that adds RTNLGRP_IPV6_ACADDR
being merged first.

Here is an example usage:

root@uml-x86-64:/# ip monitor acaddress
2: if2    inet6 any 2001:db8:7b:0:528e:a53a:9224:c9c5 scope global
       valid_lft forever preferred_lft forever
Deleted 2: if2    inet6 any 2001:db8:7b:0:528e:a53a:9224:c9c5 scope global
       valid_lft forever preferred_lft forever

Cc: Maciej Żenczykowski <maze@google.com>
Cc: Lorenzo Colitti <lorenzo@google.com>
Signed-off-by: Yuyang Huang <yuyanghuang@google.com>
Signed-off-by: David Ahern <dsahern@kernel.org>

Update kernel headers

Update kernel headers to commit:
59372af69d4d ("Merge tag 'batadv-next-pullrequest-20250117' of git://git.open-mesh.org/linux-merge")

Signed-off-by: David Ahern <dsahern@kernel.org>

iproute2: Fix grammar in duplicate argument error message

Change "is a garbage" to "is garbage". Because garbage is a collective
noun, it does not need the indefinite article.

Signed-off-by: Neil Svedberg <neil.svedberg@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>

uapi: update kernel headers

Update for 6.13-rc6

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>

iprule: Add flow label support

Add support for 'flowlabel' selector in ip-rule.

Rules can be added with or without a mask. Without a mask, an exact
match is used:

# ip -6 rule add flowlabel 0x12345 table 100
# ip -6 rule add flowlabel 0x11/0xff table 200
# ip -6 rule add flowlabel 0x54321 table 300
# ip -6 rule del flowlabel 0x54321 table 300

Dump output:

$ ip -6 rule show
0:      from all lookup local
32764:  from all lookup 200 flowlabel 0x11/0xff
32765:  from all lookup 100 flowlabel 0x12345
32766:  from all lookup main

Dump can be filtered by flow label value and mask:

$ ip -6 rule show flowlabel 0x12345
32765:  from all lookup 100 flowlabel 0x12345
$ ip -6 rule show flowlabel 0x11/0xff
32764:  from all lookup 200 flowlabel 0x11/0xff

JSON output:

$ ip -6 -j -p rule show flowlabel 0x12345
[ {
         "priority": 32765,
         "src": "all",
         "table": "100",
         "flowlabel": "0x12345",
         "flowlabel_mask": "0xfffff"
     } ]
$ ip -6 -j -p rule show flowlabel 0x11/0xff
[ {
         "priority": 32764,
         "src": "all",
         "table": "200",
         "flowlabel": "0x11",
         "flowlabel_mask": "0xff"
     } ]
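
The VALUE[/MASK] syntax shown above can be sketched with a small parser. parse_flowlabel() is hypothetical and not the iproute2 implementation; the default mask of 0xfffff follows the JSON output above and the 20-bit width of the IPv6 flow label.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

#define FLOWLABEL_MAX 0xfffffUL	/* IPv6 flow labels are 20 bits wide */

/* Hypothetical parser for "VALUE[/MASK]".
 * Returns 0 on success, -1 on malformed or out-of-range input. */
static int parse_flowlabel(const char *arg, uint32_t *value, uint32_t *mask)
{
	unsigned long v, m = FLOWLABEL_MAX;	/* no mask: exact match */
	char *end;

	if (!arg || !*arg)
		return -1;

	errno = 0;
	v = strtoul(arg, &end, 0);
	if (end == arg || errno || v > FLOWLABEL_MAX)
		return -1;

	if (*end == '/') {
		const char *mstr = end + 1;

		m = strtoul(mstr, &end, 0);
		if (end == mstr || errno || m > FLOWLABEL_MAX)
			return -1;
	}

	/* anything left over is trailing garbage */
	if (*end)
		return -1;

	*value = (uint32_t)v;
	*mask = (uint32_t)m;
	return 0;
}
```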

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: David Ahern <dsahern@kernel.org>

ip: route: Add IPv6 flow label support

Allow specifying an IPv6 flow label when performing a route lookup.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: David Ahern <dsahern@kernel.org>

tc: fq: add support for TCA_FQ_OFFLOAD_HORIZON attribute

In linux-6.13, we added the ability to offload pacing on
capable devices.

tc qdisc add ... fq ... offload_horizon 100ms

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David Ahern <dsahern@kernel.org>

Update kernel headers

Update kernel headers to commit:
9268abe611b0 ("Merge branch 'net-lan969x-add-rgmii-support'")

Signed-off-by: David Ahern <dsahern@kernel.org>

man: fix two small typos on xdp manipulations

Signed-off-by: Alexis Lothoré <alexis.lothore@bootlin.com>

iproute2: add 'ip monitor maddress' support

Enhanced the 'ip monitor' command to track changes in IPv4 and IPv6
multicast addresses. This update allows the command to listen for
events related to multicast address additions and deletions by
registering to the newly introduced RTNLGRP_IPV4_MCADDR and
RTNLGRP_IPV6_MCADDR netlink groups.

This patch depends on the kernel patch that adds RTNLGRP_IPV4_MCADDR
and RTNLGRP_IPV6_MCADDR being merged first.

Here is an example usage:

root@uml-x86-64:/# ip monitor maddress
9: nettest123    inet6 mcast ff01::1 scope global
       valid_lft forever preferred_lft forever
9: nettest123    inet6 mcast ff02::1 scope global
       valid_lft forever preferred_lft forever
9: nettest123    inet mcast 224.0.0.1 scope global
       valid_lft forever preferred_lft forever
9: nettest123    inet6 mcast ff02::1:ff00:7b01 scope global
       valid_lft forever preferred_lft forever
Deleted 9: nettest123    inet mcast 224.0.0.1 scope global
       valid_lft forever preferred_lft forever
Deleted 9: nettest123    inet6 mcast ff02::1:ff00:7b01 scope global
       valid_lft forever preferred_lft forever
Deleted 9: nettest123    inet6 mcast ff02::1 scope global
       valid_lft forever preferred_lft forever

Cc: Maciej Żenczykowski <maze@google.com>
Cc: Lorenzo Colitti <lorenzo@google.com>
Signed-off-by: Yuyang Huang <yuyanghuang@google.com>
Signed-off-by: David Ahern <dsahern@kernel.org>

Update kernel headers

Update kernel headers to commit:
92c932b9946c ("Merge branch 'mptcp-pm-userspace-misc-cleanups'")

Signed-off-by: David Ahern <dsahern@kernel.org>

ip: link: rmnet: add support for flag handling

Extend the current rmnet support to allow enabling or disabling
IFLA_RMNET_FLAGS via ip link as well as printing the current settings.

Signed-off-by: Robert Marko <robert.marko@sartura.hr>
Signed-off-by: David Ahern <dsahern@kernel.org>

uapi: remove no longer used linux/limits.h

Code is now using limits.h instead.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>

flower: replace XATTR_SIZE_MAX

The flower tc parser was using XATTR_SIZE_MAX from linux/limits.h,
but this constant is intended for extended filesystem attributes,
not for TC. Replace it with a local define.

This fixes an issue on systems with musl, where XATTR_SIZE_MAX is not
defined in limits.h.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>

cg_map: use limits.h

Prefer limits.h from system headers over linux/limits.h
Fixes build with musl.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>

ip: rearrange and prune header files

The recent report of issues with missing limits.h impacting musl
suggested looking at what files are and are not included in ip code.

The standard practice is to put standard headers first, then system,
then local headers. Used iwyu to get suggestions about missing
and extraneous headers.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>

rdma: add missing header for basename

The function basename prototype is in libgen.h
Fixes build on musl

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>

libnetlink: add missing endian.h

Need endian.h to get htobe64 with musl.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>

bridge: dump mcast querier state

Kernel support for dumping the multicast querier state was added in
commit [1]. As some people might be interested in getting this information
from userspace, this commit implements the necessary changes to show it
via

ip -d link show [dev]

The querier state shows the following information for IPv4 and IPv6
respectively:

1) The ip address of the current querier in the network. This could be
ourselves or an external querier.
2) The port on which the querier was seen
3) Querier timeout in seconds

[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=c7fa1d9b1fb179375e889ff076a1566ecc997bfc

Signed-off-by: Fabian Pfitzner <f.pfitzner@pengutronix.de>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>

Merge remote-tracking branch 'main/main' into next

Signed-off-by: David Ahern <dsahern@kernel.org>

devlink: use the correct handle flag for port param show

The port param show command's argument parser used the devlink dev flag
instead of the port flag, so the port device argument was not
recognized, causing the following error:

$ devlink port param show eth0 name link_type
Wrong identification string format.
Devlink identification ("bus_name/dev_name") expected

Use the correct devlink handle flag.

Fixes: 70faecdca8f5 ("devlink: implement dump selector for devlink objects show commands")
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>

devlink: do dry parse for extended handle with selector

When parsing with a selector, there is a list of extended handles
(devname/busname/x) that require special treatment.
DL_OPT_HANDLEP is one of them. The code tries to parse the devname/busname
handle and, if that succeeds, goes the "dump" way. If it does not
succeed, however, parsing is done directly. That is wrong, as the options
may still be incomplete. Break in that case instead, which allows a dry
parse and possibly going the "dump" way if the option list is not
complete.

Fixes: 70faecdca8f5 ("devlink: implement dump selector for devlink objects show commands")
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>

bridge: fix memory leak in error path

The 'json' object is not freed when 'rtnl_dump_filter()' fails;
fix it.

Signed-off-by: Minhong He <heminhong@kylinos.cn>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>

uapi: update headers to 6.13-rc1

Update of headers after 6.13-rc1

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>

Merge ssh://gitolite.kernel.org/pub/scm/network/iproute2/iproute2-next

devlink: fix memory leak in ifname_map_rtnl_init()

When the return value of rtnl_talk() is greater than
or equal to 0, 'answer' will be allocated.
'answer' should be freed after use,
otherwise it will cause a memory leak.

Signed-off-by: Minhong He <heminhong@kylinos.cn>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>

ip: fix memory leak in do_show()

Free the 'answer' obtained from 'rtnl_talk()'.

Fixes: 6887a0656dad ("ip: netconf: fix overzealous error checking")
Signed-off-by: Minhong He <heminhong@kylinos.cn>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>

add .editorconfig file for basic formatting

EditorConfig is a specification for defining the most basic code formatting
settings, and it is supported by many editors and IDEs, either directly or
via plugins, including VSCode/VSCodium, Vim, emacs and more.

It allows defining formatting styles related to indentation, charset, end
of lines and trailing whitespace. It also allows applying different
formats to different files based on wildcards, so for example it is
possible to apply different configurations to *.{c,h}, *.json or *.yaml.

In Linux-related projects, defining a .editorconfig might help people that
work on different projects with different indentation styles, so they
cannot define a global style. Now they will directly see the correct
indentation on every fresh clone of the project.

Add the .editorconfig file at the root of the iproute2 project with a broad
generic configuration for all file types. Then add exceptions for the file
types which follow different conventions.

See https://editorconfig.org

Signed-off-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr>
Signed-off-by: David Ahern <dsahern@kernel.org>

v6.12.0

tc: Add support for Hold/Release mechanism in TSN as per IEEE 802.1Q-2018

This commit enhances the q_taprio module by adding support for the
Hold/Release mechanism in Time-Sensitive Networking (TSN), as specified
in the IEEE 802.1Q-2018 standard.

Changes include:
- Addition of `TC_TAPRIO_CMD_SET_AND_HOLD` and `TC_TAPRIO_CMD_SET_AND_RELEASE`
cases in the `entry_cmd_to_str` function to return "H" and "R" respectively.
- Addition of corresponding string comparisons in the `str_to_entry_cmd`
function to map "H" and "R" to `TC_TAPRIO_CMD_SET_AND_HOLD` and
`TC_TAPRIO_CMD_SET_AND_RELEASE`.
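
The two mappings described above could look roughly like the sketch below. The enum is a local stand-in whose numeric values are assumptions (the real constants live in linux/pkt_sched.h), the "S" entry for plain gate setting is assumed to exist already, and the function bodies are an illustration rather than the q_taprio code.

```c
#include <string.h>

/* Local stand-ins for the TC_TAPRIO_CMD_* constants; values are assumed. */
enum {
	CMD_INVALID = -1,
	CMD_SET_GATES,
	CMD_SET_AND_HOLD,
	CMD_SET_AND_RELEASE,
};

/* Map an entry command to the letter used on the command line. */
static const char *entry_cmd_to_str(int cmd)
{
	switch (cmd) {
	case CMD_SET_GATES:
		return "S";
	case CMD_SET_AND_HOLD:
		return "H";
	case CMD_SET_AND_RELEASE:
		return "R";
	default:
		return "Invalid";
	}
}

/* Map a command-line letter back to the entry command. */
static int str_to_entry_cmd(const char *str)
{
	if (strcmp(str, "S") == 0)
		return CMD_SET_GATES;
	if (strcmp(str, "H") == 0)
		return CMD_SET_AND_HOLD;
	if (strcmp(str, "R") == 0)
		return CMD_SET_AND_RELEASE;

	return CMD_INVALID;
}
```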

The Hold/Release feature works as follows:
- Set-And-Hold-MAC (H): This command sets the gates and holds the current
configuration, preventing any further changes until a release command is
issued.
- Set-And-Release-MAC (R): This command releases the hold, allowing
subsequent gate configuration changes to take effect.

These changes ensure that the q_taprio module can correctly interpret and
handle the Hold/Release commands, aligning with the IEEE 802.1Q-2018 standard
for enhanced TSN configuration.

Signed-off-by: Choong Yong Liang <yong.liang.choong@linux.intel.com>
Signed-off-by: Jose Abreu <Jose.Abreu@synopsys.com>
Signed-off-by: David Ahern <dsahern@kernel.org>

Merge branch 'can-xl-prep' into next

Vincent Mailhol says:

====================

An RFC was sent last weekend to kick-off the discussion of the
introduction of CAN XL: [1] for the kernel side and [2] for the
iproute2 interface. While the series received some positive feedback,
it is far from completion. Some work is still needed to:

  - adjust the nesting of the IFLA_CAN_XL_DATA_BITTIMING_CONST in the
    netlink interface

  - add the CAN XL PWM configuration

and this TODO list may grow if more feedback is received.

Regardless of this, the RFC started with a set of trivial patches to
do some clean-up and some renaming in preparation of the introduction
of CAN XL.

This series just contains those preparation patches which were cherry
picked from the RFC.

The goal is to have those merged first to remove some overhead from
the netlink CAN XL main series before taking care of the other
comments.

[1] [RFC] can: netlink: add CAN XL
Link: https://lore.kernel.org/linux-can/20241110155902.72807-16-mailhol.vincent@wanadoo.fr/
[2] [RFC] iplink_can: add CAN XL
Link: https://lore.kernel.org/linux-can/20241110160406.73584-10-mailhol.vincent@wanadoo.fr/
====================

Signed-off-by: David Ahern <dsahern@kernel.org>

iplink_can: rename dbt into fd_dbt in can_parse_opt()

The CAN XL support will introduce another dbt variable. Rename the
current dbt variable into fd_dbt to avoid future confusion. When
introduced, the CAN XL variable will be named xl_dbt.

Signed-off-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr>
Signed-off-by: David Ahern <dsahern@kernel.org>

iplink_can: add struct can_tdc

Add the struct can_tdc to group the tdcv, tdco and tdcf variables
together. The structure is borrowed from linux/can/bittiming.h [1].

This refactor is a preparation for the introduction of CAN XL.

[1] https://elixir.bootlin.com/linux/v6.11/source/include/linux/can/bittiming.h#L78

Signed-off-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr>
Signed-off-by: David Ahern <dsahern@kernel.org>

iplink_can: use invarg() instead of fprintf()

invarg() is specifically designed to print error messages when an
invalid argument is provided. Replace the generic fprintf() by
invarg() in can_parse_opt().

Signed-off-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr>
Signed-off-by: David Ahern <dsahern@kernel.org>

iplink_can: remove newline at the end of invarg()'s messages

invarg() already prints a new line by default. Adding an explicit "\n"
at the end of the message results in two lines being printed. Remove
all newlines at the end of the invarg() messages.

Signed-off-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr>
Signed-off-by: David Ahern <dsahern@kernel.org>

iplink_can: reduce the visibility of tdc in can_parse_opt()

tdc is only used in a single if block. Move its declaration to the top
of the compound statement where it is used.

Signed-off-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr>
Signed-off-by: David Ahern <dsahern@kernel.org>

iplink_can: remove unused FILE *f parameter in three functions

FILE *f, the first parameter of below functions:

* can_print_tdc_opt()
* can_print_tdc_const_opt()
* can_print_ctrlmode_ext()

is unused. Remove it.

Signed-off-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr>
Signed-off-by: David Ahern <dsahern@kernel.org>

lib: names: check calloc return value in db_names_alloc

db_names_load() may crash since it touches the
hash member. Fix it by checking the return value of calloc().
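
The pattern of checking each allocation before the result is used can be sketched as follows. The struct and function names are hypothetical simplifications and are not the definitions in lib/names.c.

```c
#include <stdlib.h>

struct name_entry;	/* opaque for this sketch */

struct name_table {
	struct name_entry **hash;
	unsigned int size;
};

/* Allocate a table with 'size' hash buckets. Returns NULL if either
 * allocation fails, so callers never see a table with a NULL hash. */
static struct name_table *name_table_alloc(unsigned int size)
{
	struct name_table *t = calloc(1, sizeof(*t));

	if (!t)
		return NULL;

	t->hash = calloc(size, sizeof(*t->hash));
	if (!t->hash) {
		free(t);
		return NULL;
	}

	t->size = size;
	return t;
}

static void name_table_free(struct name_table *t)
{
	if (!t)
		return;
	free(t->hash);
	free(t);
}
```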

Signed-off-by: Denis Kirjanov <kirjanov@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>

bridge: add ip/iplink_bridge files to MAINTAINERS

Add F line for the ip/iplink_bridge* files to bridge's MAINTAINERS
entry.

Signed-off-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>

lib: utils: close file handle on error

read_prop() doesn't close the file descriptor
on some errors, fix it.
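
One common way to avoid this kind of leak is a single exit path that always closes the file. The helper below is a hypothetical sketch of that pattern and is not the function fixed by this commit.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: read one long value from a text file.
 * Returns 0 on success and -1 on any error. */
static int read_long_from_file(const char *path, long *value)
{
	char buf[80], *end;
	long res;
	FILE *fp;
	int ret = -1;

	fp = fopen(path, "r");
	if (!fp)
		return -1;

	if (!fgets(buf, sizeof(buf), fp))
		goto out;

	errno = 0;
	res = strtol(buf, &end, 0);
	if (end == buf || errno == ERANGE)
		goto out;

	*value = res;
	ret = 0;
out:
	/* every path that opened the file ends up here */
	fclose(fp);
	return ret;
}
```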

Signed-off-by: Denis Kirjanov <kirjanov@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>

Merge remote-tracking branch 'main/main' into next

Signed-off-by: David Ahern <dsahern@kernel.org>

Merge branch 'rdma-monitor' into next

Chiara Meiohas says:

====================

This series adds support for a new command to monitor IB events
and expands the rdma-sys command to indicate whether this new
functionality is supported.
We've also included a fix for a typo in rdma-link man page.

Command usage and examples are in the commits and man pages.

These patches are complementary to the kernel patches:
https://lore.kernel.org/linux-rdma/20240821051017.7730-1-michaelgur@nvidia.com/
https://lore.kernel.org/linux-rdma/093c978ef2766fd3ab4ff8798eeb68f2f11582f6.1730367038.git.leon@kernel.org/

====================

Signed-off-by: David Ahern <dsahern@kernel.org>

rdma: Add IB device and net device rename events

rdma monitor displays the IB device name and the netdevice
name when displaying event info. Since users can modify these
names, we track and notify on renaming events.

$ rdma monitor
$ rmmod mlx5_ib
[UNREGISTER]    dev 1  rocep8s0f1
[UNREGISTER]    dev 0  rocep8s0f0

$ modprobe mlx5_ib
[REGISTER]      dev 2  mlx5_0
[NETDEV_ATTACH] dev 2  mlx5_0 port 1 netdev 4 eth2
[REGISTER]      dev 3  mlx5_1
[NETDEV_ATTACH] dev 3  mlx5_1 port 1 netdev 5 eth3
[RENAME]        dev 2  rocep8s0f0
[RENAME]        dev 3  rocep8s0f1

$ devlink dev eswitch set pci/0000:08:00.0 mode switchdev
[UNREGISTER]    dev 2  rocep8s0f0
[REGISTER]      dev 4  mlx5_0
[NETDEV_ATTACH] dev 4  mlx5_0 port 30 netdev 4 eth2
[RENAME]        dev 4  rdmap8s0f0

$ echo 4 > /sys/class/net/eth2/device/sriov_numvfs
[NETDEV_ATTACH] dev 4  rdmap8s0f0 port 2 netdev 7 eth4
[NETDEV_ATTACH] dev 4  rdmap8s0f0 port 3 netdev 8 eth5
[NETDEV_ATTACH] dev 4  rdmap8s0f0 port 4 netdev 9 eth6
[NETDEV_ATTACH] dev 4  rdmap8s0f0 port 5 netdev 10 eth7
[REGISTER]      dev 5  mlx5_0
[NETDEV_ATTACH] dev 5  mlx5_0 port 1 netdev 11 eth8
[REGISTER]      dev 6  mlx5_1
[NETDEV_ATTACH] dev 6  mlx5_1 port 1 netdev 12 eth9
[RENAME]        dev 5  rocep8s0f0v0
[RENAME]        dev 6  rocep8s0f0v1
[REGISTER]      dev 7  mlx5_0
[NETDEV_ATTACH] dev 7  mlx5_0 port 1 netdev 13 eth10
[RENAME]        dev 7  rocep8s0f0v2
[REGISTER]      dev 8  mlx5_0
[NETDEV_ATTACH] dev 8  mlx5_0 port 1 netdev 14 eth11
[RENAME]        dev 8  rocep8s0f0v3

$ ip link set eth2 name myeth2
[NETDEV_RENAME] netdev 4 myeth2

$ ip link set eth1 name myeth1

** no events received, because eth1 is not attached to
   an IB device **

Signed-off-by: Chiara Meiohas <cmeiohas@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: David Ahern <dsahern@kernel.org>

rdma: update uapi headers

Update rdma_netlink.h file up to kernel commit 7566752e4d7d
("RDMA/nldev: Add IB device and net device rename events")

Signed-off-by: Chiara Meiohas <cmeiohas@nvidia.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: David Ahern <dsahern@kernel.org>

rdma: Fix typo in rdma-link man page

Signed-off-by: Chiara Meiohas <cmeiohas@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: David Ahern <dsahern@kernel.org>

rdma: Expose whether RDMA monitoring is supported

Extend the "rdma sys" command to display whether RDMA
monitoring is supported.

Example output for kernel where monitoring is supported:
$ rdma sys show
netns shared privileged-qkey off monitor on copy-on-fork on

Example output for kernel where monitoring is not supported:
$ rdma sys show
netns shared privileged-qkey off monitor off copy-on-fork on

Signed-off-by: Chiara Meiohas <cmeiohas@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: David Ahern <dsahern@kernel.org>

rdma: Add support for rdma monitor

Introduce a new command for RDMA event monitoring.
This patch adds a new attribute "event_type" which describes
the event received. Add a new NETLINK_RDMA multicast group;
processes listening to this multicast group receive RDMA
events.

The event types supported are IB device registration/unregistration
and net device attachment/detachment.

Example output of rdma monitor and the commands which trigger
the events:

$ rdma monitor
$ rmmod mlx5_ib
[UNREGISTER]    dev 3 rocep8s0f1
[UNREGISTER]    dev 2 rocep8s0f0

$ modprobe mlx5_ib
[REGISTER]      dev 4 mlx5_0
[NETDEV_ATTACH] dev 4 mlx5_0 port 1 netdev 4 eth2
[REGISTER]      dev 5 mlx5_1
[NETDEV_ATTACH] dev 5 mlx5_1 port 1 netdev 5 eth3

$ devlink dev eswitch set pci/0000:08:00.0 mode switchdev
[UNREGISTER]    dev 4 rocep8s0f0
[REGISTER]      dev 6 mlx5_0
[NETDEV_ATTACH] dev 6 mlx5_0 port 30 netdev 4 eth2

$ echo 4 > /sys/class/net/eth2/device/sriov_numvfs
[NETDEV_ATTACH] dev 6 rdmap8s0f0 port 2 netdev 7 eth4
[NETDEV_ATTACH] dev 6 rdmap8s0f0 port 3 netdev 8 eth5
[NETDEV_ATTACH] dev 6 rdmap8s0f0 port 4 netdev 9 eth6
[NETDEV_ATTACH] dev 6 rdmap8s0f0 port 5 netdev 10 eth7
[REGISTER]      dev 7 mlx5_0
[NETDEV_ATTACH] dev 7 mlx5_0 port 1 netdev 11 eth8
[REGISTER]      dev 8 mlx5_0
[NETDEV_ATTACH] dev 8 mlx5_0 port 1 netdev 12 eth9
[REGISTER]      dev 9 mlx5_0
[NETDEV_ATTACH] dev 9 mlx5_0 port 1 netdev 13 eth10
[REGISTER]      dev 10 mlx5_0
[NETDEV_ATTACH] dev 10 mlx5_0 port 1 netdev 14 eth11

$ echo 0 > /sys/class/net/eth2/device/sriov_numvfs
[UNREGISTER]    dev 7 rocep8s0f0v0
[UNREGISTER]    dev 8 rocep8s0f0v1
[UNREGISTER]    dev 9 rocep8s0f0v2
[UNREGISTER]    dev 10 rocep8s0f0v3
[NETDEV_DETACH] dev 6 rdmap8s0f0 port 2
[NETDEV_DETACH] dev 6 rdmap8s0f0 port 3
[NETDEV_DETACH] dev 6 rdmap8s0f0 port 4
[NETDEV_DETACH] dev 6 rdmap8s0f0 port 5

Signed-off-by: Chiara Meiohas <cmeiohas@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: David Ahern <dsahern@kernel.org>

bridge: add ip/iplink_bridge files to MAINTAINERS

Add F line for the ip/iplink_bridge* files to bridge's MAINTAINERS
entry.

Signed-off-by: Nikolay Aleksandrov <razor@blackwall.org>

lib: utils: close file handle on error

read_prop() doesn't close the file descriptor
on some errors, fix it.

Signed-off-by: Denis Kirjanov <kirjanov@gmail.com>

uapi: update to bpf.h

Stay in sync with upstream

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>

vdpa: Add support for setting the MAC address in vDPA tool.

Add a new function to the vDPA tool to support setting the MAC address.
Currently, the MAC address is the only attribute the kernel supports
setting.

Update the man page to include usage for setting the MAC address.

The usage is: vdpa dev set name vdpa_name mac **:**:**:**:**:**

Here is an example:
root@L1# vdpa -jp dev config show vdpa0
{
    "config": {
        "vdpa0": {
            "mac": "82:4d:e9:5d:d7:e6",
            "link ": "up",
            "link_announce ": false,
            "mtu": 1500
        }
    }
}

root@L1# vdpa dev set name vdpa0 mac 00:11:22:33:44:55

root@L1# vdpa -jp dev config show vdpa0
{
    "config": {
        "vdpa0": {
            "mac": "00:11:22:33:44:55",
            "link ": "up",
            "link_announce ": false,
            "mtu": 1500
        }
    }
}

Signed-off-by: Cindy Lu <lulu@redhat.com>
Signed-off-by: David Ahern <dsahern@kernel.org>

Merge remote-tracking branch 'main/main' into next

Signed-off-by: David Ahern <dsahern@kernel.org>

ip: Add "down" filter for "ip addr/link show"

Currently there is an "up" option, which allows showing only devices
that are up and running. Add a corresponding "down" option.

Also change the usage and man pages accordingly.

Signed-off-by: Yedaya Katsman <yedaya.ka@gmail.com>
Signed-off-by: David Ahern <dsahern@kernel.org>

uapi: update of bpf.h

Update from 6.12-rc4

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>

iplink: Fix link-netns id and link ifindex

When link-netns or link-netnsid is supplied, look up the link in that
netns. If both netns and link-netns are given, IFLA_LINK_NETNSID should
be the nsid of link-netns as seen from the target netns, not from the
current one.

For example, when handling:

# ip -n ns1 link add netns ns2 link-netns ns3 link eth1 eth1.100 type vlan id 100

ip should look up eth1 in ns3, and IFLA_LINK_NETNSID is the id of ns3
as seen from ns2.

Signed-off-by: Xiao Liang <shaw.leon@gmail.com>
Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>

ip: Move set_netnsid_from_name() to namespace.c

Move set_netnsid_from_name() outside for reuse, like what's done for
netns_id_from_name().

Signed-off-by: Xiao Liang <shaw.leon@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>

rdma: Fix help information of 'rdma resource'

'rdma resource show cq' supports object 'dev' but not 'link', and
doesn't support device name with port.

Fixes: b0b8e32cbf6e ("rdma: Add CQ resource tracking information")
Signed-off-by: wenglianfa <wenglianfa@huawei.com>
Signed-off-by: Junxian Huang <huangjunxian6@hisilicon.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>

rt_names: read `rt_addrprotos.d` directory

`rt_addrprotos` doesn't currently use the `.d` directory thing - add it.

My magic 8-ball predicts we might be grabbing a value or two for use in
FRRouting at some point in the future. Let's make it so we can ship
those in a separate file when it's time.

Signed-off-by: David Lamparter <equinox@diac24.net>
Signed-off-by: David Ahern <dsahern@kernel.org>

ip: netconf: fix overzealous error checking

The rtnetlink.sh kernel test started reporting errors after an
iproute2 update. The error checking introduced by the commit
under Fixes is incorrect. rtnl_listen() always returns
an error, because the only way to break the loop is to
return an error from the handler, it seems.

Switch this code to using normal rtnl_talk(), instead of
the rtnl_listen() abuse. As far as I can tell the use of
rtnl_listen() was to make get and dump use common handling
but that's no longer the case, anyway.

Before:
  $ ip -6 netconf show dev lo
  inet6 lo forwarding off mc_forwarding off proxy_neigh off ignore_routes_with_linkdown off
  $ echo $?
  2

After:
  $ ./ip/ip -6 netconf show dev lo
  inet6 lo forwarding off mc_forwarding off proxy_neigh off ignore_routes_with_linkdown off
  $ echo $?
  0

Fixes: 00e8a64dac3b ("ip: detect errors in netconf monitor mode")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>

iprule: Add DSCP support

Add support for 'dscp' selector in ip-rule.

Rules can be added with a numeric DSCP value:

# ip rule add dscp 1 table 100
# ip rule add dscp 0x02 table 200

Or using symbolic names from /usr/share/iproute2/rt_dsfield or
/etc/iproute2/rt_dsfield:

# ip rule add dscp AF42 table 300

Dump output:

$ ip rule show
0:      from all lookup local
32763:  from all lookup 300 dscp AF42
32764:  from all lookup 200 dscp 2
32765:  from all lookup 100 dscp 1
32766:  from all lookup main
32767:  from all lookup default

Dump can be filtered by DSCP value:

$ ip rule show dscp 1
32765:  from all lookup 100 dscp 1

Or by a symbolic name:

$ ip rule show dscp AF42
32763:  from all lookup 300 dscp AF42

When the numeric option is specified, symbolic names will be translated
to numeric values:

$ ip -N rule show
0:      from all lookup 255
32763:  from all lookup 300 dscp 36
32764:  from all lookup 200 dscp 2
32765:  from all lookup 100 dscp 1
32766:  from all lookup 254
32767:  from all lookup 253
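
The AF42 to 36 translation follows from the Assured Forwarding code
point layout (three class bits, two drop-precedence bits, one zero
bit). The sketch below only illustrates that arithmetic; the helper
names are hypothetical and are not part of iproute2:

```c
#include <stdint.h>

/* DSCP of Assured Forwarding class x (1..4) with drop precedence
 * y (1..3): the 6-bit code point is laid out as xxxyy0.
 */
static uint8_t af_to_dscp(unsigned int af_class, unsigned int drop_prec)
{
	return (uint8_t)((af_class << 3) | (drop_prec << 1));
}

/* The DSCP is the upper six bits of the 8-bit DS field, so a table
 * that stores full DS field values (0x90 for AF42) needs a shift.
 */
static uint8_t dsfield_to_dscp(uint8_t dsfield)
{
	return dsfield >> 2;
}
```

Both af_to_dscp(4, 2) and dsfield_to_dscp(0x90) evaluate to 36, the
value shown for AF42 in the numeric dump above.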

The same applies to the JSON output in order to be consistent with
existing fields such as "tos" and "table":

$ ip -j -p rule show dscp AF42
[ {
         "priority": 32763,
         "src": "all",
         "table": "300",
         "dscp": "AF42"
     } ]

$ ip -j -p -N rule show dscp AF42
[ {
         "priority": 32763,
         "src": "all",
         "table": "300",
         "dscp": "36"
     } ]

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Daniel Machon <daniel.machon@microchip.com>
Reviewed-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: David Ahern <dsahern@kernel.org>

man: Add ip-rule(8) as generation target

In a similar fashion to other man pages, add ip-rule(8) as generation
target so that we could use variable substitutions there in a subsequent
patch.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Daniel Machon <daniel.machon@microchip.com>
Reviewed-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: David Ahern <dsahern@kernel.org>

ip/ipmroute: use preferred_family to get prefix

The mroute family is reset to RTNL_FAMILY_IPMR or RTNL_FAMILY_IP6MR when
retrieving the multicast routing cache. However, get_prefix(), and
subsequently __get_addr_1(), cannot identify these families. Using
preferred_family to obtain the prefix resolves this issue.

Fixes: 98ce99273f24 ("mroute: fix up family handling")
Reported-by: Jianlin Shi <jishi@redhat.com>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>

netem: swap transposed calloc args

Gcc with -Wextra complains about transposed args to calloc
in netem.
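
For reference, calloc() takes the element count first and the element
size second. A minimal sketch of the expected order; alloc_table() is
a hypothetical helper, not the netem code:

```c
#include <stdlib.h>

/* Hypothetical helper: allocate a zeroed table of n 16-bit entries. */
static short *alloc_table(size_t n)
{
	/* calloc(nmemb, size): count first, element size second.
	 * calloc(sizeof(short), n) would request the same number of
	 * bytes, but is what the compiler flags as transposed.
	 */
	return calloc(n, sizeof(short));
}
```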

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>

bridge: catch invalid stp state

The stp state parsing stored its result in a __u8, so the check
for an invalid string (state == -1) could never be true.

Caught by enabling -Wextra:
CC mst.o
mst.c: In function ‘mst_set’:
mst.c:217:27: warning: comparison is always false due to limited range of data type [-Wtype-limits]
217 | if (state == -1) {
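
The effect can be reproduced in isolation. In the sketch below,
parse_state() and its name table are illustrative assumptions rather
than the bridge code; the point is only the difference between storing
the result in an 8-bit unsigned type and in an int:

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical parser: index of the state name, or -1 if unknown. */
static int parse_state(const char *s)
{
	static const char *const names[] = {
		"disabled", "listening", "learning", "forwarding", "blocking",
	};
	size_t i;

	for (i = 0; i < sizeof(names) / sizeof(names[0]); i++)
		if (strcmp(s, names[i]) == 0)
			return (int)i;
	return -1;
}

/* Buggy variant: -1 is truncated to 255 on assignment, and 255
 * promoted back to int never equals -1.
 */
static int state_is_invalid_u8(const char *s)
{
	uint8_t state = parse_state(s);

	return state == -1;
}

/* Fixed variant: keep the result in an int until it is validated. */
static int state_is_invalid_int(const char *s)
{
	int state = parse_state(s);

	return state == -1;
}
```

Building this with -Wextra is expected to produce the same
-Wtype-limits warning for state_is_invalid_u8().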

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>

Merge remote-tracking branch 'main/main' into next

Fixed conflicts in lib/utils.c

Signed-off-by: David Ahern <dsahern@kernel.org>

lib: utils: move over `print_num` from ip/

`print_num()` was born in `ip/ipaddress.c` but considering it has
nothing to do with IP addresses it should really live in `lib/utils.c`.

(I've had reason to call it from bridge/* on some random hackery.)

Signed-off-by: David Lamparter <equinox@diac24.net>
Signed-off-by: David Ahern <dsahern@kernel.org>

uapi: update headers

Current headers from 6.12-rc1

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>

iplink: fix fd leak when playing with netns

The command 'ip link set foo netns mynetns' opens a file descriptor to fill
the netlink attribute IFLA_NET_NS_FD. This file descriptor is never closed.
When batch mode is used, the number of open file descriptors may grow
until it reaches the maximum number of file descriptors that can be opened.

This fd can be closed only after the netlink answer. Moreover, a second
fd could be opened because some (struct link_util)->parse_opt() handlers
call iplink_parse().

Let's add a helper to manage these fds:
- open_fds_add() stores a fd, up to 5 (arbitrary choice, it seems enough);
- open_fds_close() closes all stored fds.
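
A minimal sketch of what such a helper pair could look like. Only the
two function names and the limit of 5 come from the description above;
the bodies, the MAX_OPEN_FDS name and the return convention are
assumptions made for illustration:

```c
#include <unistd.h>

#define MAX_OPEN_FDS 5	/* arbitrary limit, as in the description */

static int open_fds[MAX_OPEN_FDS];
static int open_fds_cnt;

/* Remember fd so it can be closed once the netlink answer has been
 * received. Returns 0 on success, -1 if the table is full.
 */
static int open_fds_add(int fd)
{
	if (open_fds_cnt >= MAX_OPEN_FDS)
		return -1;

	open_fds[open_fds_cnt++] = fd;
	return 0;
}

/* Close every stored fd and reset the table for the next command. */
static void open_fds_close(void)
{
	while (open_fds_cnt > 0)
		close(open_fds[--open_fds_cnt]);
}
```

In batch mode, open_fds_close() would be called once per command after
the request has been answered, so the count of open descriptors stays
bounded.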

Fixes: 0dc34c7713bb ("iproute2: Add processless network namespace support")
Reported-by: Alexandre Ferrieux <alexandre.ferrieux@orange.com>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>

arpd: use designated initializers for msghdr structure

This patch fixes the following error:

arpd.c:442:17: error: initialization of 'int' from 'void *' makes integer from pointer without a cast [-Wint-conversion]
442 | NULL, 0,

raised by Buildroot autobuilder [1].

In the case in question, the analysis of socket.h [2] containing the
msghdr structure shows that it has been modified with the addition of
padding fields, which cause the compilation error. The use of designated
initializers allows the issue to be fixed.

struct msghdr {
void *msg_name;
socklen_t msg_namelen;
struct iovec *msg_iov;
int __pad1;
int msg_iovlen;
int __pad1;
void *msg_control;
int __pad2;
socklen_t msg_controllen;
int __pad2;
int msg_flags;
};
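
A minimal sketch of the designated-initializer approach follows.
make_msghdr() is a hypothetical helper, not the arpd code; it shows
that naming the members makes the initialization independent of member
order and of any padding the C library inserts:

```c
#include <stddef.h>
#include <sys/socket.h>
#include <sys/uio.h>

/* Build a msghdr for a single buffer without relying on the order
 * of the members in the C library's definition.
 */
static struct msghdr make_msghdr(struct iovec *iov, void *name,
				 socklen_t namelen)
{
	struct msghdr msg = {
		.msg_name = name,
		.msg_namelen = namelen,
		.msg_iov = iov,
		.msg_iovlen = 1,
		/* members not named here, including any padding
		 * fields, are zero-initialized
		 */
	};

	return msg;
}
```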

[1] http://autobuild.buildroot.org/results/e4cdfa38ae9578992f1c0ff5c4edae3cc0836e3c/
[2] iproute2/host/mips64-buildroot-linux-musl/sysroot/usr/include/sys/socket.h

Signed-off-by: Dario Binacchi <dario.binacchi@amarulasolutions.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>

bridge: mst: fix a further musl build issue

This patch fixes the following build errors:

In file included from mst.c:11:
../include/json_print.h:80:30: warning: 'struct timeval' declared inside parameter list will not be visible outside of this definition or declaration
   80 | _PRINT_FUNC(tv, const struct timeval *)
      |                              ^~~~~~~
../include/json_print.h:50:37: note: in definition of macro '_PRINT_FUNC'
   50 |                                     type value);                        \
      |                                     ^~~~
../include/json_print.h:80:30: warning: 'struct timeval' declared inside parameter list will not be visible outside of this definition or declaration
   80 | _PRINT_FUNC(tv, const struct timeval *)
      |                              ^~~~~~~
../include/json_print.h:55:45: note: in definition of macro '_PRINT_FUNC'
   55 |                                             type value)                 \
      |                                             ^~~~
../include/json_print.h: In function 'print_tv':
../include/json_print.h:58:48: error: passing argument 5 of 'print_color_tv' from incompatible pointer type [-Wincompatible-pointer-types]
   58 |                                                value);                  \
      |                                                ^~~~~
      |                                                |
      |                                                const struct timeval *
../include/json_print.h:80:1: note: in expansion of macro '_PRINT_FUNC'
   80 | _PRINT_FUNC(tv, const struct timeval *)
      | ^~~~~~~~~~~
../include/json_print.h:50:42: note: expected 'const struct timeval *' but argument is of type 'const struct timeval *'
   50 |                                     type value);                        \
      |                                          ^
../include/json_print.h:80:1: note: in expansion of macro '_PRINT_FUNC'
   80 | _PRINT_FUNC(tv, const struct timeval *)

Signed-off-by: Dario Binacchi <dario.binacchi@amarulasolutions.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>

bridge: mst: fix a musl build issue

This patch fixes a compilation error raised by the bump to version 6.11.0
in Buildroot using musl as the C library for the cross-compilation
toolchain.

After setting the CFLAGS

ifeq ($(BR2_TOOLCHAIN_USES_MUSL),y)
IPROUTE2_CFLAGS += -D__UAPI_DEF_IN6_ADDR=0 -D__UAPI_DEF_SOCKADDR_IN6=0 \
-D__UAPI_DEF_IPV6_MREQ=0
endif

to fix the following errors:

In file included from ../../../host/mips64-buildroot-linux-musl/sysroot/usr/include/arpa/inet.h:9,
                 from ../include/libnetlink.h:14,
                 from mst.c:10:
../../../host/mips64-buildroot-linux-musl/sysroot/usr/include/netinet/in.h:23:8: error: redefinition of 'struct in6_addr'
   23 | struct in6_addr {
      |        ^~~~~~~~
In file included from ../include/uapi/linux/if_bridge.h:19,
                 from mst.c:7:
../include/uapi/linux/in6.h:33:8: note: originally defined here
   33 | struct in6_addr {
      |        ^~~~~~~~
../../../host/mips64-buildroot-linux-musl/sysroot/usr/include/netinet/in.h:34:8: error: redefinition of 'struct sockaddr_in6'
   34 | struct sockaddr_in6 {
      |        ^~~~~~~~~~~~
../include/uapi/linux/in6.h:50:8: note: originally defined here
   50 | struct sockaddr_in6 {
      |        ^~~~~~~~~~~~
../../../host/mips64-buildroot-linux-musl/sysroot/usr/include/netinet/in.h:42:8: error: redefinition of 'struct ipv6_mreq'
   42 | struct ipv6_mreq {
      |        ^~~~~~~~~
../include/uapi/linux/in6.h:60:8: note: originally defined here
   60 | struct ipv6_mreq {

I got these further errors

../include/uapi/linux/in6.h:72:25: error: field 'flr_dst' has incomplete type
   72 |         struct in6_addr flr_dst;
      |                         ^~~~~~~
../include/uapi/linux/if_bridge.h:711:41: error: field 'ip6' has incomplete type
  711 |                         struct in6_addr ip6;
      |                                         ^~~

fixed by including the netinet/in.h header.

Signed-off-by: Dario Binacchi <dario.binacchi@amarulasolutions.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>

Merge ssh://gitolite.kernel.org/pub/scm/network/iproute2/iproute2-next

v6.11.0

man: replace use of term whitelist

Avoid use of term whitelist because it propagates white == good
assumptions. Not really needed on the man page.
See: https://inclusivenaming.org/word-lists/tier-1/whitelist/

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>

man: replace use of word segregate

The term segregate carries a lot of racist baggage in the US.
It is on the Inclusive Naming word list.
See: https://inclusivenaming.org/word-lists/tier-3/segregate/

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>

replace use of term 'Sanity check'

The term "sanity check" is on the Tier2 word list (should replace).
See https://inclusivenaming.org/word-lists/tier-2/sanity-check/

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>

man8: ip-route: update documentation

Include "tunsrc" in the man page.

Signed-off-by: Justin Iurman <justin.iurman@uliege.be>
Signed-off-by: David Ahern <dsahern@kernel.org>

ip: lwtunnel: tunsrc support

Add support for setting/getting the new "tunsrc" feature.

Signed-off-by: Justin Iurman <justin.iurman@uliege.be>
Signed-off-by: David Ahern <dsahern@kernel.org>

Update kernel headers

Update kernel headers to commit:
bfba7bc8b7c2 ("Merge branch 'unmask-dscp-part-four'")

Signed-off-by: David Ahern <dsahern@kernel.org>

Merge remote-tracking branch 'main/main' into next

Signed-off-by: David Ahern <dsahern@kernel.org>

ip: nexthop: Support 16-bit nexthop weights

Two interlinked changes related to the nexthop group management have been
recently merged in kernel commit e96f6fd30eec ("Merge branch
'net-nexthop-increase-weight-to-u16'").

- One of the reserved bytes in struct nexthop_grp was redefined to carry
  high-order bits of the nexthop weight, thus allowing 16-bit nexthop
  weights.

- NHA_OP_FLAGS started getting dumped on nexthop group dump to carry a
  flag, NHA_OP_FLAG_RESP_GRP_RESVD_0, that indicates that reserved fields
  in struct nexthop_grp are zeroed before dumping.

If NHA_OP_FLAG_RESP_GRP_RESVD_0 is given, it is safe to interpret the newly
named nexthop_grp.weight_high as high-order bits of nexthop weight.

Extend ipnexthop to support configuring nexthop weights of up to 65536, and
when dumping, to interpret nexthop_grp.weight_high if safe.
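
The split can be sketched as follows. The struct is a local mirror
written for illustration, and the "weight minus one" encoding is an
assumption that would explain the 65536 upper bound with 16 bits; see
the kernel's nexthop uapi header for the authoritative layout:

```c
#include <stdint.h>

/* Local stand-in for the group entry (assumed layout). */
struct nh_grp_entry {
	uint32_t id;
	uint8_t weight;		/* low-order byte of (weight - 1) */
	uint8_t weight_high;	/* high-order byte of (weight - 1) */
	uint16_t resvd2;
};

/* Encode a weight in the range 1..65536. */
static void nh_grp_set_weight(struct nh_grp_entry *e, uint32_t w)
{
	uint32_t raw = w - 1;

	e->weight = raw & 0xff;
	e->weight_high = (raw >> 8) & 0xff;
}

/* Decode a weight. weight_high is only trusted when the kernel has
 * signalled (NHA_OP_FLAG_RESP_GRP_RESVD_0) that reserved fields were
 * zeroed before dumping.
 */
static uint32_t nh_grp_get_weight(const struct nh_grp_entry *e,
				  int resvd_zeroed)
{
	uint32_t raw = e->weight;

	if (resvd_zeroed)
		raw |= (uint32_t)e->weight_high << 8;
	return raw + 1;
}
```

Without the flag, the decoder falls back to the 8-bit interpretation,
which is what an older kernel would have meant.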

Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: David Ahern <dsahern@kernel.org>

Update kernel headers

Update kernel headers to commit:
a99ef548bba0 ("bnx2x: Set ivi->vlan field as an integer")

Signed-off-by: David Ahern <dsahern@kernel.org>

ss: fix libbpf version check for ENABLE_BPF_SKSTORAGE_SUPPORT

This patch fixes a problem with the libbpf version comparison to decide
if ENABLE_BPF_SKSTORAGE_SUPPORT could be enabled.

- The code enabled by ENABLE_BPF_SKSTORAGE_SUPPORT uses the function
  btf_dump__new with an API that was introduced in libbpf 0.6.0. So
  check now against libbpf version to be >= 0.6.x instead of 0.5.x.

- This code still requires LIBBPF_MAJOR_VERSION and
  LIBBPF_MINOR_VERSION to be defined, even if libbpf_version.h is not
  present in the library development package. This was ensured with
  the previous patch for the configure script.
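
A minimal sketch of such a compile-time check. The LIBBPF_AT_LEAST()
macro and the fallback values are assumptions made so the fragment
compiles on its own; in a real build the two version macros come from
libbpf_version.h or from the CFLAGS set by configure:

```c
/* Fallbacks for this sketch only (assumed values). */
#ifndef LIBBPF_MAJOR_VERSION
#define LIBBPF_MAJOR_VERSION 0
#endif
#ifndef LIBBPF_MINOR_VERSION
#define LIBBPF_MINOR_VERSION 6
#endif

/* Hypothetical helper: true if libbpf is at least maj.min. */
#define LIBBPF_AT_LEAST(maj, min)				\
	((LIBBPF_MAJOR_VERSION > (maj)) ||			\
	 (LIBBPF_MAJOR_VERSION == (maj) &&			\
	  LIBBPF_MINOR_VERSION >= (min)))

/* The btf_dump__new() API in use needs libbpf >= 0.6. */
#if LIBBPF_AT_LEAST(0, 6)
#define ENABLE_BPF_SKSTORAGE_SUPPORT
#endif

/* Report whether the feature was compiled in. */
static int have_skstorage_support(void)
{
#ifdef ENABLE_BPF_SKSTORAGE_SUPPORT
	return 1;
#else
	return 0;
#endif
}
```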

Fixes: e3ecf048 ("ss: pretty-print BPF socket-local storage")
Signed-off-by: Stefan Mätje <stefan.maetje@esd.eu>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>

configure: provide surrogates for possibly missing libbpf_version.h

Old libbpf library versions (< 0.7.x) may not have the libbpf_version.h
header packaged. This header would provide LIBBPF_MAJOR_VERSION and
LIBBPF_MINOR_VERSION; without it, they are not available to control
conditional compilation in some source files.

Provide surrogates for these defines via CFLAGS that are derived from
the LIBBPF_VERSION determined with $(${PKG_CONFIG} libbpf --modversion).

Signed-off-by: Stefan Mätje <stefan.maetje@esd.eu>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>

tc-cake: reformat

Reformat tc-cake to use man format (nroff) instead of pre-formatting.

Signed-off-by: Lương Việt Hoàng <tcm4095@gmail.com>
Acked-by: Toke Høiland-Jørgensen <toke@kernel.org>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>

tc-cake: document 'ingress'

Linux kernel commit 7298de9cd7255a783ba ("sch_cake: Add ingress mode") added
an ingress mode for CAKE, which can be enabled with the 'ingress' parameter.
Document the changes in CAKE's behavior when ingress mode is enabled.

Signed-off-by: Lương Việt Hoàng <tcm4095@gmail.com>
Reviewed-by: Toke Høiland-Jørgensen <toke@kernel.org>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>