This reverts commit 1cebf8f143c2 ("socket: fix struct ifreq
size in compat ioctl"), it's a bugfix for another commit that
I'll revert next.
This is not a 'perfect' revert, I'm keeping some coding style
intact rather than revert to the state with indentation errors.
Cc: stable@vger.kernel.org Fixes: 1cebf8f143c2 ("socket: fix struct ifreq size in compat ioctl") Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Florian Fainelli [Mon, 18 Feb 2019 22:30:11 +0000 (14:30 -0800)]
net: dsa: Fix NPD checking for br_vlan_enabled()
[ Upstream not applicable ]
It is possible for the DSA slave network device not to be part of a
bridge, yet have an upper device like a VLAN device be part of a bridge.
When that VLAN device is enslaved, since it does not define any
switchdev_ops, we will recurse down to the lower/physical port device,
call switchdev_port_obj_add() with a VLAN, and here we will check
br_vlan_enabled() on a NULL dp->bridge_dev, thus causing a NULL pointer
de-reference.
This is no longer a problem upstream after commit d17d9f5e5143
("switchdev: Replace port obj add/del SDO with a notification").
Fixes: 2ea7a679ca2a ("net: dsa: Don't add vlans when vlan filtering is disabled") Reported-by: Frank Wunderlich <frank-w@public-files.de> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
as while we retrieve 'opt_inst' from team->option_inst_list, it could
be added to the local 'opt_inst_list' for multiple times. The
__team_option_inst_tmp_find() doesn't work, as the setter
team_mode_option_set() still calls team->ops.exit() which uses
->tmp_list too in __team_options_change_check().
Simplify the list operations by moving the 'opt_inst_list' and
team_nl_send_event_options_get() into the nla_for_each_nested() loop so
that it can be guranteed that we won't insert a same list entry for
multiple times. Therefore, __team_option_inst_tmp_find() can be removed
too.
Fixes: 4fb0534fb7bb ("team: avoid adding twice the same option to the event list") Fixes: 2fcdb2c9e659 ("team: allow to send multiple set events in one message") Reported-by: syzbot+4d4af685432dc0e56c91@syzkaller.appspotmail.com Reported-by: syzbot+68ee510075cf64260cc4@syzkaller.appspotmail.com Cc: Jiri Pirko <jiri@resnulli.us> Cc: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In sctp_stream_init(), after sctp_stream_outq_migrate() freed the
surplus streams' ext, but sctp_stream_alloc_out() returns -ENOMEM,
stream->outcnt will not be set to 'outcnt'.
With the bigger value on stream->outcnt, when closing the assoc and
freeing its streams, the ext of those surplus streams will be freed
again since those stream exts were not set to NULL after freeing in
sctp_stream_outq_migrate(). Then the invalid-free issue reported by
syzbot would be triggered.
We fix it by simply setting them to NULL after freeing.
Fixes: 5bbbbe32a431 ("sctp: introduce stream scheduler foundations") Reported-by: syzbot+58e480e7b28f2d890bfd@syzkaller.appspotmail.com Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
It was caused by SKB_GSO_CB(skb)->csum_start not set in sctp_gso_segment.
sctp_gso_segment() calls skb_segment() with 'feature | NETIF_F_HW_CSUM',
which causes SKB_GSO_CB(skb)->csum_start not to be set in skb_segment().
For TCP/UDP, when feature supports HW_CSUM, CHECKSUM_PARTIAL will be set
and gso_reset_checksum will be called to set SKB_GSO_CB(skb)->csum_start.
So SCTP should do the same as TCP/UDP, to call gso_reset_checksum() when
computing checksum in sctp_gso_segment.
Reported-by: Jianlin Shi <jishi@redhat.com> Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When we probe a SFP module, we expect to be able to call the upstream
device's module_insert() function so that the upstream link can be
configured. However, when the upstream device is delayed, we currently
may end up probing the module before the upstream device is available,
and lose the module_insert() call.
Avoid this by holding off probing the module until the SFP bus is
properly connected to both the SFP socket driver and the upstream
driver.
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When calculating rb->frames_per_block * req->tp_block_nr the result
can overflow. Check it for overflow without limiting the total buffer
size to UINT_MAX.
This change fixes support for packet ring buffers >= UINT_MAX.
Fixes: 8f8d28e4d6d8 ("net/packet: fix overflow in check for tp_frame_nr") Signed-off-by: Kal Conley <kal.conley@dectris.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In some case, we may use multiple pedit actions to modify packets.
The command shown as below: the last pedit action is effective.
$ tc filter add dev netdev_rep parent ffff: protocol ip prio 1 \
flower skip_sw ip_proto icmp dst_ip 3.3.3.3 \
action pedit ex munge ip dst set 192.168.1.100 pipe \
action pedit ex munge eth src set 00:00:00:00:00:01 pipe \
action pedit ex munge eth dst set 00:00:00:00:00:02 pipe \
action csum ip pipe \
action tunnel_key set src_ip 1.1.1.100 dst_ip 1.1.1.200 dst_port 4789 id 100 \
action mirred egress redirect dev vxlan0
To fix it, we add max_mod_hdr_actions to mlx5e_tc_flow_parse_attr struction,
max_mod_hdr_actions will store the max pedit action number we support and
num_mod_hdr_actions indicates how many pedit action we used, and store all
pedit action to mod_hdr_actions.
Fixes: d79b6df6b10a ("net/mlx5e: Add parsing of TC pedit actions to HW format") Cc: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Acked-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When an ethernet frame is padded to meet the minimum ethernet frame
size, the padding octets are not covered by the hardware checksum.
Fortunately the padding octets are usually zero's, which don't affect
checksum. However, it is not guaranteed. For example, switches might
choose to make other use of these octets.
This repeatedly causes kernel hardware checksum fault.
Prior to the cited commit below, skb checksum was forced to be
CHECKSUM_NONE when padding is detected. After it, we need to keep
skb->csum updated. However, fixing up CHECKSUM_COMPLETE requires to
verify and parse IP headers, it does not worth the effort as the packets
are so small that CHECKSUM_COMPLETE has no significant advantage.
Future work: when reporting checksum complete is not an option for
IP non-TCP/UDP packets, we can actually fallback to report checksum
unnecessary, by looking at cqe IPOK bit.
Fixes: 88078d98d1bb ("net: pskb_trim_rcsum() and CHECKSUM_COMPLETE are friends") Cc: Eric Dumazet <edumazet@google.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Fix race condition between ena_update_on_link_change() and
ena_restore_device().
This race can occur if link notification arrives while the driver
is performing a reset sequence. In this case link can be set up,
enabling the device, before it is fully restored. If packets are
sent at this time, the driver might access uninitialized data
structures, causing kernel crash.
Move the clearing of ENA_FLAG_ONGOING_RESET and netif_carrier_on()
after ena_up() to ensure the device is ready when link is set up.
Fixes: d18e4f683445 ("net: ena: fix race condition between device reset and link up setup") Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
genlmsg_reply can fail, so propagate its return code
Fixes: 915d7e5e593 ("ipv6: sr: add code base for control plane support of SR-IPv6") Signed-off-by: Li RongQing <lirongqing@baidu.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Field idiag_ext in struct inet_diag_req_v2 used as bitmap of requested
extensions has only 8 bits. Thus extensions starting from DCTCPINFO
cannot be requested directly. Some of them included into response
unconditionally or hook into some of lower 8 bits.
Extension INET_DIAG_CLASS_ID has not way to request from the beginning.
This patch bundle it with INET_DIAG_TCLASS (ipv6 tos), fixes space
reservation, and documents behavior for other extensions.
Also this patch adds fallback to reporting socket priority. This filed
is more widely used for traffic classification because ipv4 sockets
automatically maps TOS to priority and default qdisc pfifo_fast knows
about that. But priority could be changed via setsockopt SO_PRIORITY so
INET_DIAG_TOS isn't enough for predicting class.
Also cgroup2 obsoletes net_cls classid (it always zero), but we cannot
reuse this field for reporting cgroup2 id because it is 64-bit (ino+gen).
So, after this patch INET_DIAG_CLASS_ID will report socket priority
for most common setup when net_cls isn't set and/or cgroup2 in use.
Fixes: 0888e372c37f ("net: inet: diag: expose sockets cgroup classid") Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Fixes: c6c8fea29769 ("net: Add batman-adv meshing protocol") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com>
Cc: Marek Lindner <mareklindner@neomailbox.ch>
Cc: Simon Wunderlich <sw@simonwunderlich.de>
Cc: Antonio Quartulli <a@unstable.cc> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
A recent commit in Clang expanded the -Wstring-plus-int warning, showing
some odd behavior in this file.
drivers/isdn/hardware/avm/b1.c:426:30: warning: adding 'int' to a string does not append to the string [-Wstring-plus-int]
cinfo->version[j] = "\0\0" + 1;
~~~~~~~^~~
drivers/isdn/hardware/avm/b1.c:426:30: note: use array indexing to silence this warning
cinfo->version[j] = "\0\0" + 1;
^
& [ ]
1 warning generated.
This is equivalent to just "\0". Nick pointed out that it is smarter to
use "" instead of "\0" because "" is used elsewhere in the kernel and
can be deduplicated at the linking stage.
Link: https://github.com/ClangBuiltLinux/linux/issues/309 Suggested-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
Dan reported that bpftool does not compile for him:
$ make tools/bpf
DESCEND bpf
Auto-detecting system features:
.. libbfd: [ on ]
.. disassembler-four-args: [ OFF ]
DESCEND bpftool
Auto-detecting system features:
.. libbfd: [ on ]
.. disassembler-four-args: [ OFF ]
CC /opt/linux.git/tools/bpf/bpftool/net.o
In file included from /opt/linux.git/tools/include/uapi/linux/pkt_cls.h:6:0,
from /opt/linux.git/tools/include/uapi/linux/tc_act/tc_bpf.h:14,
from net.c:13:
net.c: In function 'show_dev_tc_bpf':
net.c:164:21: error: 'TC_H_CLSACT' undeclared (first use in this function)
handle = TC_H_MAKE(TC_H_CLSACT, TC_H_MIN_INGRESS);
[...]
Fix it by importing pkt_sched.h header copy into tooling
infrastructure.
Fixes: 49a249c38726 ("tools/bpftool: copy a few net uapi headers to tools directory") Fixes: f6f3bac08ff9 ("tools/bpf: bpftool: add net support") Reported-by: Dan Gilson <dan_gilson@yahoo.com>
Reference: https://bugzilla.kernel.org/show_bug.cgi?id=202315 Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Test that externally learned FDB entries can roam, but not age out.
Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Petr Machata <petrm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
The driver currently treats static FDB entries as both static and
sticky. This is incorrect and prevents such entries from being roamed to
a different port via learning.
Fix this by configuring static entries with ageing disabled and roaming
enabled.
In net-next we can add proper support for the newly introduced 'sticky'
flag.
Fixes: 56ade8fe3fe1 ("mlxsw: spectrum: Add initial support for Spectrum ASIC") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reported-by: Alexander Petrovskiy <alexpe@mellanox.com> Reviewed-by: Petr Machata <petrm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
Externally learned entries can be added by a user or by a switch driver
that is notifying the bridge driver about entries that were learned in
hardware.
In the first case, the entries are not marked with the 'added_by_user'
flag, which causes switch drivers to ignore them and not offload them.
The 'added_by_user' flag can be set on externally learned FDB entries
based on the 'swdev_notify' parameter in br_fdb_external_learn_add(),
which effectively means if the created / updated FDB entry was added by
a user or not.
Fixes: 816a3bed9549 ("switchdev: Add fdb.added_by_user to switchdev notifications") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reported-by: Alexander Petrovskiy <alexpe@mellanox.com> Reviewed-by: Petr Machata <petrm@mellanox.com> Cc: Roopa Prabhu <roopa@cumulusnetworks.com> Cc: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Cc: bridge@lists.linux-foundation.org Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
As txq_trans_update() only updates trans_start when the lock is held,
trans_start does not get updated if NETIF_F_LLTX is declared.
Signed-off-by: Madalin Bucur <madalin.bucur@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
If sch_fq packet scheduler is not used, TCP can fallback to
internal pacing, but this requires sk_pacing_status to
be properly set.
Fixes: 8c4b4c7e9ff0 ("bpf: Add setsockopt helper function to bpf") Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Lawrence Brakmo <brakmo@fb.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
In sock_setsockopt() (net/core/sock.h), when SO_MARK option is used
to change sk_mark, sk_dst_reset(sk) is called. The same should be
done in bpf_setsockopt().
Fixes: 8c4b4c7e9ff0 ("bpf: Add setsockopt helper function to bpf") Reported-by: Maciej Żenczykowski <maze@google.com> Signed-off-by: Peter Oskolkov <posk@google.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Reviewed-by: Maciej Żenczykowski <maze@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
Shifting the 1 by exp by an int can lead to sign-extension overlow when
exp is 31 since 1 is an signed int and sign-extending this result to an
unsigned long long will set the upper 32 bits. Fix this by shifting an
unsigned long.
Detected by cppcheck:
(warning) Shifting signed 32-bit value by 31 bits is undefined behaviour
Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
While running test_progs in a loop I found out that I'm sometimes hitting
"Didn't find expected build ID from the map" error.
Looking at stack_map_get_build_id_offset() it seems that it is racy (by
design) and can sometimes return BPF_STACK_BUILD_ID_IP (i.e. can't trylock
current->mm->mmap_sem).
Let's retry this test a single time.
Fixes: 13790d1cc72c ("bpf: add selftest for stackmap with build_id in NMI context") Acked-by: Song Liu <songliubraving@fb.com> Signed-off-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
When returning BPF_STACK_BUILD_ID_IP from stack_map_get_build_id_offset,
make sure that build_id field is empty. Since we are using percpu
free list, there is a possibility that we might reuse some previous
bpf_stack_build_id with non-zero build_id.
Fixes: 615755a77b24 ("bpf: extend stackmap to save binary_build_id+offset instead of address") Acked-by: Song Liu <songliubraving@fb.com> Signed-off-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
Build-id length is not fixed to 20, it can be (`man ld` /--build-id):
* 128-bit (uuid)
* 160-bit (sha1)
* any length specified in ld --build-id=0xhexstring
To fix the issue of missing BPF_STACK_BUILD_ID_VALID for shorter build-ids,
assume that build-id is somewhere in the range of 1 .. 20.
Set the remaining bytes to zero.
v2:
* don't introduce new "len = min(BPF_BUILD_ID_SIZE, nhdr->n_descsz)",
we already know that nhdr->n_descsz <= BPF_BUILD_ID_SIZE if we enter
this 'if' condition
Fixes: 615755a77b24 ("bpf: extend stackmap to save binary_build_id+offset instead of address") Acked-by: Song Liu <songliubraving@fb.com> Signed-off-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
There's a race between afs_make_call() and afs_wake_up_async_call() in the
case that an error is returned from rxrpc_kernel_send_data() after it has
queued the final packet.
afs_make_call() will try and clean up the mess, but the call state may have
been moved on thereby causing afs_process_async_call() to also try and to
delete the call.
Fix this by:
(1) Getting an extra ref for an asynchronous call for the call itself to
hold. This makes sure the call doesn't evaporate on us accidentally
and will allow the call to be retained by the caller in a future
patch. The ref is released on leaving afs_make_call() or
afs_wait_for_call_to_complete().
(2) In the event of an error from rxrpc_kernel_send_data():
(a) Don't set the call state to AFS_CALL_COMPLETE until *after* the
call has been aborted and ended. This prevents
afs_deliver_to_call() from doing anything with any notifications
it gets.
(b) Explicitly end the call immediately to prevent further callbacks.
(c) Cancel any queued async_work and wait for the work if it's
executing. This allows us to be sure the race won't recur when we
change the state. We put the work queue's ref on the call if we
managed to cancel it.
(d) Put the call's ref that we got in (1). This belongs to us as long
as the call is in state AFS_CALL_CL_REQUESTING.
Fixes: 341f741f04be ("afs: Refcount the afs_call struct") Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Fix the refcounting of the authentication keys in the file locking code.
The vnode->lock_key member points to a key on which it expects to be
holding a ref, but it isn't always given an extra ref, however.
A cb_interest record is not necessarily attached to the vnode on entry to
afs_validate(), which can cause an oops when we try to bring the vnode's
cb_s_break up to date in the default case (ie. no current callback promise
and the vnode has not been deleted).
Fix this by simply removing the line, as vnode->cb_s_break will be set when
needed by afs_register_server_cb_interest() when we next get a callback
promise from RPC call.
In iproute2 commit 90c5c969f0b9 ("fix print_0xhex on 32 bit"), the format
specifier for the ife type changed from 0x%X to %#llX, causing systematic
failures in the following TDC test cases:
7682 - Create valid ife encode action with mark and pass control
ef47 - Create valid ife encode action with mark and pipe control
df43 - Create valid ife encode action with mark and continue control
e4cf - Create valid ife encode action with mark and drop control
ccba - Create valid ife encode action with mark and reclassify control
a1cf - Create valid ife encode action with mark and jump control
cb3d - Create valid ife encode action with mark value at 32-bit maximum
95ed - Create valid ife encode action with prio and pass control
aa17 - Create valid ife encode action with prio and pipe control
74c7 - Create valid ife encode action with prio and continue control
7a97 - Create valid ife encode action with prio and drop control
f66b - Create valid ife encode action with prio and reclassify control
3056 - Create valid ife encode action with prio and jump control
7dd3 - Create valid ife encode action with prio value at 32-bit maximum
05bb - Create valid ife encode action with tcindex and pass control
ce65 - Create valid ife encode action with tcindex and pipe control
09cd - Create valid ife encode action with tcindex and continue control
8eb5 - Create valid ife encode action with tcindex and continue control
451a - Create valid ife encode action with tcindex and drop control
d76c - Create valid ife encode action with tcindex and reclassify control
e731 - Create valid ife encode action with tcindex and jump control
b7b8 - Create valid ife encode action with tcindex value at 16-bit maximum
2a9c - Create valid ife encode action with mac src parameter
cf5c - Create valid ife encode action with mac dst parameter
2353 - Create valid ife encode action with mac src and mac dst parameters
552c - Create valid ife encode action with mark and type parameters
0421 - Create valid ife encode action with prio and type parameters
4017 - Create valid ife encode action with tcindex and type parameters
fac3 - Create valid ife encode action with index at 32-bit maximnum
7c25 - Create valid ife decode action with pass control
dccb - Create valid ife decode action with pipe control
7bb9 - Create valid ife decode action with continue control
d9ad - Create valid ife decode action with drop control
219f - Create valid ife decode action with reclassify control
8f44 - Create valid ife decode action with jump control
b330 - Create ife encode action with cookie
Change 'matchPattern' values, allowing '0' and '0x0' if ife type is equal
to 0, and accepting both '0x' and '0X' otherwise, to let these tests pass
both with old and new tc binaries.
While at it, fix a small typo in test case fac3 ('maximnum'->'maximum').
Signed-off-by: Davide Caratti <dcaratti@redhat.com> Acked-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
After commit 1c25324caf82 ("net/sched: act_tunnel_key: Don't dump dst port
if it wasn't set"), act_tunnel_key doesn't dump anymore the destination
port, unless it was explicitly configured. This caused systematic failures
in the following TDC test case:
7a88 - Add tunnel_key action with cookie parameter
Avoid matching zero values of TCA_TUNNEL_KEY_ENC_DST_PORT to let the test
pass again.
Signed-off-by: Davide Caratti <dcaratti@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
After merge of commit 80ef0f22ceda ("net/sched: act_tunnel_key: Allow
key-less tunnels"), act_tunnel_key does not reject anymore requests to
install 'set' rules where the key id is missing. Therefore, drop the
following TDC testcase:
ba4e - Add tunnel_key set action with missing mandatory id parameter
because it's going to become a systematic fail as soon as userspace
iproute2 will start supporting key-less tunnels.
Signed-off-by: Davide Caratti <dcaratti@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
So far genphy_soft_reset was used automatically if the PHY driver
didn't implement the soft_reset callback. This changed with the
mentioned commit and broke KSZ9031. To fix this configure the
KSZ9031 PHY driver to use genphy_soft_reset.
Fixes: 6e2d85ec0559 ("net: phy: Stop with excessive soft reset") Reported-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Tested-by: Tony Lindgren <tony@atomide.com> Tested-by: Sekhar Nori <nsekhar@ti.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
Some systems have had functional issues since commit 5a8361f7ecce
(ACPICA: Integrate package handling with module-level code) that,
among other things, changed the initial values of the
acpi_gbl_group_module_level_code and acpi_gbl_parse_table_as_term_list
global flags in ACPICA which implicitly caused acpi_ec_ecdt_probe() to
be called before acpi_load_tables() on the vast majority of platforms.
Namely, before commit 5a8361f7ecce, acpi_load_tables() was called from
acpi_early_init() if acpi_gbl_parse_table_as_term_list was FALSE and
acpi_gbl_group_module_level_code was TRUE, which almost always was
the case as FALSE and TRUE were their initial values, respectively.
The acpi_gbl_parse_table_as_term_list value would be changed to TRUE
for a couple of platforms in acpi_quirks_dmi_table[], but it remained
FALSE in the vast majority of cases.
After commit 5a8361f7ecce, the initial values of the two flags have
been reversed, so in effect acpi_load_tables() has not been called
from acpi_early_init() any more. That, in turn, affects
acpi_ec_ecdt_probe() which is invoked before acpi_load_tables() now
and it is not possible to evaluate the _REG method for the EC address
space handler installed by it. That effectively causes the EC address
space to be inaccessible to AML on platforms with an ECDT matching the
EC device definition in the DSDT and functional problems ensue in
there.
Because the default behavior before commit 5a8361f7ecce was to call
acpi_ec_ecdt_probe() after acpi_load_tables(), it should be safe to
do that again. Moreover, the EC address space handler installed by
acpi_ec_ecdt_probe() is only needed for AML to be able to access the
EC address space and the only AML that can run during acpi_load_tables()
is module-level code which only is allowed to access address spaces
with default handlers (memory, I/O and PCI config space).
For this reason, move the acpi_ec_ecdt_probe() invocation back to
acpi_bus_init(), from where it was taken away by commit d737f333b211
(ACPI: probe ECDT before loading AML tables regardless of module-level
code flag), and put it after the invocation of acpi_load_tables() to
restore the original code ordering from before commit 5a8361f7ecce.
Fixes: 5a8361f7ecce ("ACPICA: Integrate package handling with module-level code") Link: https://bugzilla.kernel.org/show_bug.cgi?id=199981 Reported-by: step-ali <sunmooon15@gmail.com> Reported-by: Charles Stanhope <charles.stanhope@gmail.com> Tested-by: Charles Stanhope <charles.stanhope@gmail.com> Reported-by: Paulo Nascimento <paulo.ulusu@googlemail.com> Reported-by: David Purton <dcpurton@marshwiggle.net> Reported-by: Adam Harvey <adam@adamharvey.name> Reported-by: Zhang Rui <rui.zhang@intel.com> Tested-by: Zhang Rui <rui.zhang@intel.com> Tested-by: Jean-Marc Lenoir <archlinux@jihemel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
static checker warning:
drivers/xen/pvcalls-front.c:373 alloc_active_ring()
error: we previously assumed 'map->active.ring' could be null
(see line 357)
The device node iterators perform an of_node_get on each
iteration, so a jump out of the loop requires an of_node_put.
Remote and port also have augmented reference counts, so drop them
on each iteration and at the end of the function, respectively.
Remote is only used for the address it contains, not for the
contents of that address, so the reference count can be dropped
immediately.
The semantic patch that fixes the first part of this problem is
as follows (http://coccinelle.lip6.fr):
// <smpl>
@@
expression root,e;
local idexpression child;
iterator name for_each_child_of_node;
@@
for_each_available_child_of_node(root, child) {
... when != of_node_put(child)
when != e = child
+ of_node_put(child);
? break;
...
}
... when != child
// </smpl>
x86 compilation has required asm goto support since 4.17.
Since clang does not support asm goto, at 4.17,
Commit b1ae32dbab50 ("x86/cpufeature: Guard asm_volatile_goto usage
for BPF compilation") worked around the issue by permitting an
alternative implementation without asm goto for clang.
Compiling samples/bpf directories, most bpf programs failed
compilation with error messages like:
In file included from /home/yhs/work/bpf-next/samples/bpf/xdp_sample_pkts_kern.c:2:
In file included from /home/yhs/work/bpf-next/include/linux/ptrace.h:6:
In file included from /home/yhs/work/bpf-next/include/linux/sched.h:15:
In file included from /home/yhs/work/bpf-next/include/linux/sem.h:5:
In file included from /home/yhs/work/bpf-next/include/uapi/linux/sem.h:5:
In file included from /home/yhs/work/bpf-next/include/linux/ipc.h:9:
In file included from /home/yhs/work/bpf-next/include/linux/refcount.h:72:
/home/yhs/work/bpf-next/arch/x86/include/asm/refcount.h:70:9: error: 'asm goto' constructs are not supported yet
return GEN_BINARY_SUFFIXED_RMWcc(LOCK_PREFIX "subl",
^
/home/yhs/work/bpf-next/arch/x86/include/asm/rmwcc.h:67:2: note: expanded from macro 'GEN_BINARY_SUFFIXED_RMWcc'
__GEN_RMWcc(op " %[val], %[var]\n\t" suffix, var, cc, \
^
/home/yhs/work/bpf-next/arch/x86/include/asm/rmwcc.h:21:2: note: expanded from macro '__GEN_RMWcc'
asm_volatile_goto (fullop "; j" #cc " %l[cc_label]" \
^
/home/yhs/work/bpf-next/include/linux/compiler_types.h:188:37: note: expanded from macro 'asm_volatile_goto'
#define asm_volatile_goto(x...) asm goto(x)
Most implementation does not even provide an alternative
implementation. And it is also not practical to make changes
for each call site.
This patch workarounded the asm goto issue by redefining the macro like below:
#define asm_volatile_goto(x...) asm volatile("invalid use of asm_volatile_goto")
If asm_volatile_goto is not used by bpf programs, which is typically the case, nothing bad
will happen. If asm_volatile_goto is used by bpf programs, which is incorrect, the compiler
will issue an error since "invalid use of asm_volatile_goto" is not valid assembly codes.
With this patch, all bpf programs under samples/bpf can pass compilation.
Note that bpf programs under tools/testing/selftests/bpf/ compiled fine as
they do not access kernel internal headers.
Fixes: e769742d3584 ("Revert "x86/jump-labels: Macrofy inline assembly code to work around GCC inlining bugs"") Fixes: 18fe58229d80 ("x86, asm: change the GEN_*_RMWcc() macros to not quote the condition") Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
In the xdp_umem_assign_dev() path, the xsk code does not
check if a queue for which umem is to be created exists.
It leads to a situation where umem is not assigned to any
Tx/Rx queue of a netdevice, without notifying the stack
about an error. This affects both XDP_SKB and XDP_DRV
modes - in case of XDP_DRV_ZC, queue index is checked by
the driver.
This patch fixes xsk code, so that in both XDP_SKB and
XDP_DRV mode of AF_XDP, an error is returned when requested
queue index exceedes an existing maximum.
Fixes: c9b47cc1fabca ("xsk: fix bug when trying to use both copy and zero-copy on one queue id") Reported-by: Jakub Spizewski <jakub.spizewski@intel.com> Signed-off-by: Krzysztof Kazimierczak <krzysztof.kazimierczak@intel.com> Acked-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
We've failed to copy and process vhost_iotlb_msg so let userspace at
least know about it. For instance before these patch the code below runs
without any error:
int main()
{
struct vhost_msg msg;
struct iovec iov;
int fd;
if (writev(fd, &iov,1) == -1) {
perror("writev");
return 1;
}
return 0;
}
Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Charlene Liu <charlene.liu@amd.com> Reviewed-by: Chris Park <Chris.Park@amd.com> Acked-by: Leo Li <sunpeng.li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Add an of_node_put when the result of of_graph_get_remote_port_parent is
not available.
An of_node_put is also needed when meson_probe_remote completes. This was
present at the recursive call, but not in the call from meson_drv_probe.
The semantic match that finds this problem is as follows
(http://coccinelle.lip6.fr):
// <smpl>
@r exists@
local idexpression e;
expression x;
@@
e = of_graph_get_remote_port_parent(...);
... when != x = e
when != true e == NULL
when != of_node_put(e)
when != of_fwnode_handle(e)
(
return e;
|
*return ...;
)
// </smpl>
Commit e657fcc clears cpu capability bit instead of using fake cpuid
value, the EXTD should always be off for PV guest without depending
on cpuid value. So remove the cpuid check in xen_read_msr_safe() to
always clear the X2APIC_ENABLE bit.
Signed-off-by: Talons Lee <xin.li@citrix.com> Reviewed-by: Juergen Gross <jgross@suse.com> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
This patch uses nfct_help() to detect whether an established connection
needs conntrack helper instead of using test_bit(IPS_HELPER_BIT,
&ct->status).
The reason is that IPS_HELPER_BIT is only set when using explicit CT
target.
However, in the case that a device enables conntrack helper via command
"echo 1 > /proc/sys/net/netfilter/nf_conntrack_helper", the status of
IPS_HELPER_BIT will not present any change, and consequently it loses
the checking ability in the context.
Signed-off-by: Henry Yen <henry.yen@mediatek.com> Reviewed-by: Ryder Lee <ryder.lee@mediatek.com> Tested-by: John Crispin <john@phrozen.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
In case of ->set_param() and ->bind_conn() cxgb4i driver does not wait for
cmd completion, this can create race conditions, to avoid this add
wait_for_completion().
Signed-off-by: Varun Prakash <varun@chelsio.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
After Commit 54aed4dd3526 ("MIPS: IP27: use dma_direct_ops") qla1280 driver
failed on SGI IP27 machines with
qla1280: QLA1040 found on PCI bus 0, dev 0
qla1280 0000:00:00.0: enabling device (0006 -> 0007)
qla1280: Failed to get request memory
qla1280: probe of 0000:00:00.0 failed with error -12
Reason is that SGI IP27 always generates 64bit DMA addresses and has no
fallback mode for 32bit DMA addresses implemented. QLA1280 supports 64bit
addressing for all DMA accesses so setting coherent mask to 64bit fixes the
issue.
Signed-off-by: Thomas Bogendoerfer <tbogendoerfer@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Albeit we no longer rely on those hard-coded descriptor sizes, we still use
them as our defaults, so better get it right. While adding its sysfs
entries, we forgot to update the geometry descriptor size. It is 0x48
according to UFS2.1, and wasn't changed in UFS3.0.
[mkp: typo]
Fixes: c720c091222e (scsi: ufs: sysfs: geometry descriptor) Signed-off-by: Avri Altman <avri.altman@wdc.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
When the driver finds invalid destination MAC for the first un-reachable
target, and before completes the PATH_REQ operation, set new ep_state to
OFFLDCONN_NONE so that as part of driver ep_poll mechanism, the upper
open-iscsi layer is notified to complete the login process on the first
un-reachable target and thus proceed login to other reachable targets.
Signed-off-by: Manish Rangankar <mrangankar@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
hba->is_sys_suspended is set after successful system suspend but
not clear after successful system resume.
According to current behavior, hba->is_sys_suspended will not be set if
host is runtime-suspended but not system-suspended. Thus we shall aligh the
same policy: clear this flag even if host remains runtime-suspended after
ufshcd_system_resume is successfully returned.
Simply fix this flag to correct host status logs.
Signed-off-by: Stanley Chu <stanley.chu@mediatek.com> Reviewed-by: Avri Altman <avri.altman@wdc.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Currently there is one cmd timeout timer and one qfull timer for each udev,
and whenever any new command is coming in we will update the cmd timer or
qfull timer. For some corner cases the timers are always working only for
the ringbuffer's and full queue's newest cmd. That's to say the timer won't
be fired even if one cmd has been stuck for a very long time and the
deadline is reached.
This fix will keep the cmd/qfull timers to be pended for the oldest cmd in
ringbuffer and full queue, and will update them with the next cmd's
deadline only when the old cmd's deadline is reached or removed from the
ringbuffer and full queue.
Signed-off-by: Xiubo Li <xiubli@redhat.com> Acked-by: Mike Christie <mchristi@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
The functions isdn_tty_tiocmset() and isdn_tty_set_termios() may be
concurrently executed.
isdn_tty_tiocmset
isdn_tty_modem_hup
line 719: kfree(info->dtmf_state);
line 721: kfree(info->silence_state);
line 723: kfree(info->adpcms);
line 725: kfree(info->adpcmr);
isdn_tty_set_termios
isdn_tty_modem_hup
line 719: kfree(info->dtmf_state);
line 721: kfree(info->silence_state);
line 723: kfree(info->adpcms);
line 725: kfree(info->adpcmr);
Thus, some concurrency double-free bugs may occur.
These possible bugs are found by a static tool written by myself and
my manual code review.
To fix these possible bugs, the mutex lock "modem_info_mutex" used in
isdn_tty_tiocmset() is added in isdn_tty_set_termios().
Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
Currently, TX is given a budget which is consumed by stmmac_tx_clean()
and stmmac_rx() is given the remaining non-consumed budget.
This is wrong and in case we are sending a large number of packets this
can starve RX because remaining budget will be low.
Let's give always the same budget for RX and TX clean.
While at it, check if we missed any interrupts while we were in NAPI
callback by looking at DMA interrupt status.
Cc: Joao Pinto <jpinto@synopsys.com> Cc: David S. Miller <davem@davemloft.net> Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com> Cc: Alexandre Torgue <alexandre.torgue@st.com> Signed-off-by: Jose Abreu <joabreu@synopsys.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
RX Watchdog can be disabled by platform definitions but currently we are
initializing the descriptors before checking if Watchdog must be
disabled or not.
Fix this by checking earlier if user wants Watchdog disabled or not.
Cc: Joao Pinto <jpinto@synopsys.com> Cc: David S. Miller <davem@davemloft.net> Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com> Cc: Alexandre Torgue <alexandre.torgue@st.com> Signed-off-by: Jose Abreu <joabreu@synopsys.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
Check if CBS is currently supported before trying to configure it in HW.
Cc: Joao Pinto <jpinto@synopsys.com> Cc: David S. Miller <davem@davemloft.net> Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com> Cc: Alexandre Torgue <alexandre.torgue@st.com> Signed-off-by: Jose Abreu <joabreu@synopsys.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
In DMA interrupt handler we were clearing all interrupts status, even
the ones that were not active. Fix this and only clear the active
interrupts.
Cc: Joao Pinto <jpinto@synopsys.com> Cc: David S. Miller <davem@davemloft.net> Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com> Cc: Alexandre Torgue <alexandre.torgue@st.com> Signed-off-by: Jose Abreu <joabreu@synopsys.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
Since commit b7d0f08e9129, the enable / disable of PCI device is not
managed which will result in IO regions not being automatically unmapped.
As regions continue mapped it is currently not possible to remove and
then probe again the PCI module of stmmac.
Fix this by manually unmapping regions on remove callback.
Changes from v1:
- Fix build error
Cc: Joao Pinto <jpinto@synopsys.com> Cc: David S. Miller <davem@davemloft.net> Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com> Cc: Alexandre Torgue <alexandre.torgue@st.com> Fixes: b7d0f08e9129 ("net: stmmac: Fix WoL for PCI-based setups") Signed-off-by: Jose Abreu <joabreu@synopsys.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
Reported-by: Jane Chu <jane.chu@oracle.com> Fixes: 23222f8f8dce ("acpi, nfit: Add function to look up nvdimm...") Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Commit 8c8c10b90d88 ("powerpc/8xx: fix handling of early NULL pointer
dereference") moved the loading of r6 earlier in the code. As some
functions are called inbetween, r6 needs to be loaded again with the
address of swapper_pg_dir in order to set PTE pointers for
the Abatron BDI.
Fixes: 8c8c10b90d88 ("powerpc/8xx: fix handling of early NULL pointer dereference") Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Sasha Levin <sashal@kernel.org>
As part of audit process to update drivers to use rdma_restrack_add()
ensure that QP objects is cleared before access. Such change fixes the
crash observed with uninitialized non zero sgid attr accessed by
ib_destroy_qp().
Reported-by: Alexander Murashkin <AlexanderMurashkin@msn.com> Tested-by: Alexander Murashkin <AlexanderMurashkin@msn.com> Fixes: 1a1f460ff151 ("RDMA: Hold the sgid_attr inside the struct ib_ah/qp") Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
In the forward chain, the iif is changed from slave device to master vrf
device. Thus, flow offload does not find a match on the lower slave
device.
This patch uses the cached route, ie. dst->dev, to update the iif and
oif fields in the flow entry.
After this patch, the following example works fine:
# ip addr add dev eth0 1.1.1.1/24
# ip addr add dev eth1 10.0.0.1/24
# ip link add user1 type vrf table 1
# ip l set user1 up
# ip l set dev eth0 master user1
# ip l set dev eth1 master user1
# nft add table firewall
# nft add flowtable f fb1 { hook ingress priority 0 \; devices = { eth0, eth1 } \; }
# nft add chain f ftb-all {type filter hook forward priority 0 \; policy accept \; }
# nft add rule f ftb-all ct zone 1 ip protocol tcp flow offload @fb1
# nft add rule f ftb-all ct zone 1 ip protocol udp flow offload @fb1
A lock type of 0 is "LockRead", which makes the fileserver record an
unintentional read lock on the new file. This will cause problems
later on if the file is the subject of locking operations.
The correct default value should be -1 ("LockNone").
Fix the operation marshalling code to set the value and provide an enum to
symbolise the values whilst we're at it.
Fixes: 30062bd13e36 ("afs: Implement YFS support in the fs client") Signed-off-by: Marc Dionne <marc.dionne@auristor.com> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
As Naresh reported, test_stacktrace_build_id() causes panic on i386 and
arm32 systems. This is caused by page_address() returns NULL in certain
cases.
This patch fixes this error by using kmap_atomic/kunmap_atomic instead
of page_address.
Fixes: 615755a77b24 (" bpf: extend stackmap to save binary_build_id+offset instead of address") Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org> Signed-off-by: Song Liu <songliubraving@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
The problem is that we call this with a spin lock held.
The call tree is:
pvcalls_front_accept() holds bedata->socket_lock.
-> create_active()
-> __get_free_pages() uses GFP_KERNEL
The create_active() function is only called from pvcalls_front_accept()
with a spin_lock held, The allocation is not allowed to sleep and
GFP_KERNEL is not sufficient.
This issue was detected by using the Coccinelle software.
v2: Add a function doing the allocations which is called
outside the lock and passing the allocated data to
create_active().
v3: Use the matching deallocators i.e., free_page()
and free_pages(), respectively.
v4: It would be better to pre-populate map (struct sock_mapping),
rather than introducing one more new struct.
v5: Since allocating the data outside of this call it should also
be freed outside, when create_active() fails.
Move kzalloc(sizeof(*map2), GFP_ATOMIC) outside spinlock and
use GFP_KERNEL instead.
v6: Drop the superfluous calls.
Suggested-by: Juergen Gross <jgross@suse.com> Suggested-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Suggested-by: Stefano Stabellini <sstabellini@kernel.org> Signed-off-by: Wen Yang <wen.yang99@zte.com.cn> Acked-by: Stefano Stabellini <sstabellini@kernel.org> CC: Julia Lawall <julia.lawall@lip6.fr> CC: Boris Ostrovsky <boris.ostrovsky@oracle.com> CC: Juergen Gross <jgross@suse.com> CC: Stefano Stabellini <sstabellini@kernel.org> CC: xen-devel@lists.xenproject.org CC: linux-kernel@vger.kernel.org Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
The existing BPF TCP initial congestion window (TCP_BPF_IW) does not
to work on (active) Fast Open sender. This is because it changes the
(initial) window only if data_segs_out is zero -- but data_segs_out
is also incremented on SYN-data. This patch fixes the issue by
proerly accounting for SYN-data additionally.
Fixes: fc7478103c84 ("bpf: Adds support for setting initial cwnd") Signed-off-by: Yuchung Cheng <ycheng@google.com> Reviewed-by: Neal Cardwell <ncardwell@google.com> Acked-by: Lawrence Brakmo <brakmo@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
linux-next/arch/mips/jazz/jazzdma.c: In function `vdma_init`:
/linux-next/arch/mips/jazz/jazzdma.c:77:30: error: implicit declaration
of function `KSEG1ADDR`; did you mean `CKSEG1ADDR`?
[-Werror=implicit-function-declaration]
pgtbl = (VDMA_PGTBL_ENTRY *)KSEG1ADDR(pgtbl);
^~~~~~~~~
CKSEG1ADDR
/linux-next/arch/mips/jazz/jazzdma.c:77:10: error: cast to pointer from
integer of different size [-Werror=int-to-pointer-cast]
pgtbl = (VDMA_PGTBL_ENTRY *)KSEG1ADDR(pgtbl);
^
In file included from /linux-next/arch/mips/include/asm/barrier.h:11:0,
from /linux-next/include/linux/compiler.h:248,
from /linux-next/include/linux/kernel.h:10,
from /linux-next/arch/mips/jazz/jazzdma.c:11:
/linux-next/arch/mips/include/asm/addrspace.h:41:29: error: cast from
pointer to integer of different size [-Werror=pointer-to-int-cast]
#define _ACAST32_ (_ATYPE_)(_ATYPE32_) /* widen if necessary */
^
/linux-next/arch/mips/include/asm/addrspace.h:53:25: note: in
expansion of macro `_ACAST32_`
#define CPHYSADDR(a) ((_ACAST32_(a)) & 0x1fffffff)
^~~~~~~~~
/linux-next/arch/mips/jazz/jazzdma.c:84:44: note: in expansion of
macro `CPHYSADDR`
r4030_write_reg32(JAZZ_R4030_TRSTBL_BASE, CPHYSADDR(pgtbl));
Using correct casts and CKSEG1ADDR when dealing with the pgtbl setup
fixes this.
Signed-off-by: Thomas Bogendoerfer <tbogendoerfer@suse.de> Signed-off-by: Paul Burton <paul.burton@mips.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: James Hogan <jhogan@kernel.org> Cc: linux-mips@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Sasha Levin <sashal@kernel.org>
Since commit 815f0ddb346c ("include/linux/compiler*.h: make compiler-*.h
mutually exclusive") clang no longer reuses the OPTIMIZER_HIDE_VAR macro
from compiler-gcc - instead it gets the version in
include/linux/compiler.h. Unfortunately that version doesn't actually
prevent compiler from optimizing out the variable.
Fix up by moving the macro out from compiler-gcc.h to compiler.h.
Compilers without incline asm support will keep working
since it's protected by an ifdef.
Also fix up comments to match reality since we are no longer overriding
any macros.
Build-tested with gcc and clang.
Fixes: 815f0ddb346c ("include/linux/compiler*.h: make compiler-*.h mutually exclusive") Cc: Eli Friedman <efriedma@codeaurora.org> Cc: Joe Perches <joe@perches.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
scsi_mq_setup_tags(), which is called by scsi_add_host(), calculates the
command size to allocate based on the prot_capabilities. In the isci
driver, scsi_host_set_prot() is called after scsi_add_host() so the command
size gets calculated to be smaller than it needs to be. Eventually,
scsi_mq_init_request() locates the 'prot_sdb' after the command assuming it
was sized correctly and a buffer overrun may occur.
However, seeing blk_mq_alloc_rqs() rounds up to the nearest cache line
size, the mistake can go unnoticed.
The bug was noticed after the struct request size was reduced by commit 9d037ad707ed ("block: remove req->timeout_list")
Which likely reduced the allocated space for the request by an entire cache
line, enough that the overflow could be hit and it caused a panic, on boot,
at:
sd_done() would call scsi_prot_sg_count() which reads the number of
entities in 'prot_sdb', but seeing 'prot_sdb' is located after the end of
the allocated space it reads a garbage number and erroneously calls
t10_pi_complete().
To prevent this, the calls to scsi_host_set_prot() are moved into
isci_host_alloc() before the call to scsi_add_host(). Out of caution, also
move the similar call to scsi_host_set_guard().
Fixes: 3d2d75254915 ("[SCSI] isci: T10 DIF support") Link: http://lkml.kernel.org/r/da851333-eadd-163a-8c78-e1f4ec5ec857@deltatee.com Signed-off-by: Logan Gunthorpe <logang@deltatee.com> Cc: Intel SCU Linux support <intel-linux-scu@intel.com> Cc: Artur Paszkiewicz <artur.paszkiewicz@intel.com> Cc: "James E.J. Bottomley" <jejb@linux.ibm.com> Cc: "Martin K. Petersen" <martin.petersen@oracle.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Jens Axboe <axboe@kernel.dk> Cc: Jeff Moyer <jmoyer@redhat.com> Reviewed-by: Jeff Moyer <jmoyer@redhat.com> Reviewed-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
There is no code that decreases the reference count of stateful objects
in error path of the nft_add_set_elem(). this causes a leak of reference
count of stateful objects.
Test commands:
$nft add table ip filter
$nft add counter ip filter c1
$nft add map ip filter m1 { type ipv4_addr : counter \;}
$nft add element ip filter m1 { 1 : c1 }
$nft add element ip filter m1 { 1 : c1 }
$nft delete element ip filter m1 { 1 }
$nft delete counter ip filter c1
Result:
Error: Could not process rule: Device or resource busy
delete counter ip filter c1
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
At the second 'nft add element ip filter m1 { 1 : c1 }', the reference
count of the 'c1' is increased then it tries to insert into the 'm1'. but
the 'm1' already has same element so it returns -EEXIST.
But it doesn't decrease the reference count of the 'c1' in the error path.
Due to a leak of the reference count of the 'c1', the 'c1' can't be
removed by 'nft delete counter ip filter c1'.
Fixes: 8aeff920dcc9 ("netfilter: nf_tables: add stateful object reference to set elements") Signed-off-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
When VXLAN is a loadable module, MLXSW_SPECTRUM must not be built-in:
drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c:2547: undefined
reference to `vxlan_fdb_find_uc'
Add Kconfig dependency to enforce usable configurations.
Fixes: 1231e04f5bba ("mlxsw: spectrum_switchdev: Add support for VxLAN encapsulation") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reported-by: kbuild test robot <lkp@intel.com> Reviewed-by: Petr Machata <petrm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
When writing to C-TCAM, mlxsw driver uses cregion->ops->entry_insert().
In case of C-TCAM HW insertion error, the opposite action should take
place.
Add error handling case in which the C-TCAM region entry is removed, by
calling cregion->ops->entry_remove().
Fixes: a0a777b9409f ("mlxsw: spectrum_acl: Start using A-TCAM") Signed-off-by: Nir Dotan <nird@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
This patch solves a crash at the time of mlx4 driver unload or system
shutdown. The crash occurs because dma_alloc_coherent() returns one
value in mlx4_alloc_icm_coherent(), but a different value is passed to
dma_free_coherent() in mlx4_free_icm_coherent(). In turn this is because
when allocated, that pointer is passed to sg_set_buf() to record it,
then when freed it is re-calculated by calling
lowmem_page_address(sg_page()) which returns a different value. Solve
this by recording the value that dma_alloc_coherent() returns, and
passing this to dma_free_coherent().
This patch is roughly equivalent to commit 378efe798ecf ("RDMA/hns: Get
rid of page operation after dma_alloc_coherent").
Based-on-code-from: Christoph Hellwig <hch@lst.de> Signed-off-by: Stephen Warren <swarren@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
sys_sendmsg has supported unspecified destination IPv6 (wildcard) for
unconnected UDP sockets since 876c7f41. When [::] is passed by user as
destination, sys_sendmsg rewrites it with [::1] to be consistent with
BSD (see "BSD'ism" comment in the code).
This didn't work when cgroup-bpf was enabled though since the rewrite
[::] -> [::1] happened before passing control to cgroup-bpf block where
fl6.daddr was updated with passed by user sockaddr_in6.sin6_addr (that
might or might not be changed by BPF program). That way if user passed
[::] as dst IPv6 it was first rewritten with [::1] by original code from 876c7f41, but then rewritten back with [::] by cgroup-bpf block.
It happened even when BPF_CGROUP_UDP6_SENDMSG program was not present
(CONFIG_CGROUP_BPF=y was enough).
The fix is to apply BSD'ism after cgroup-bpf block so that [::] is
replaced with [::1] no matter where it came from: passed by user to
sys_sendmsg or set by BPF_CGROUP_UDP6_SENDMSG program.
Commit ade446403bfb ("net: ipv4: do not handle duplicate fragments as
overlapping") changed IPv4 defragmentation so that duplicate fragments,
as well as _some_ fragments completely covered by previously delivered
fragments, do not lead to the whole frag queue being discarded. This
makes the existing ip_defrag selftest flaky.
This patch
* makes sure that negative IPv4 defrag tests generate truly overlapping
fragments that trigger defrag queue drops;
* tests that duplicate IPv4 fragments do not trigger defrag queue drops;
* makes a couple of minor tweaks to the test aimed at increasing its code
coverage and reduce flakiness.
Signed-off-by: Peter Oskolkov <posk@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
struct hnae_handle is a member of struct hnae_vf_cb, so when vf_cb is
freed, than use hnae_handle will cause use after free panic.
This patch frees vf_cb after hnae_handle used.
Signed-off-by: Yonglong Liu <liuyonglong@huawei.com> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
In PBL chains with non power of 2 page count, the producer is not at the
beginning of the chain when index is 0 after a wrap. Therefore, after the
producer index wrap around, page index should be calculated more carefully.
Signed-off-by: Denis Bolotin <dbolotin@marvell.com> Signed-off-by: Ariel Elior <aelior@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
Currently there are some issues with the ucc_of_parse_tdm function:
1, a possible null pointer dereference in ucc_of_parse_tdm,
detected by the semantic patch deref_null.cocci,
with the following warning:
drivers/soc/fsl/qe/qe_tdm.c:177:21-24: ERROR: pdev is NULL but dereferenced.
2, dev gets modified, so in any case that devm_iounmap() will fail
even when the new pdev is valid, because the iomap was done with a
different pdev.
3, there is no driver bind with the "fsl,t1040-qe-si" or
"fsl,t1040-qe-siram" device. So allocating resources using devm_*()
with these devices won't provide a cleanup path for these resources
when the caller fails.
This patch fixes them.
Suggested-by: Li Yang <leoyang.li@nxp.com> Suggested-by: Christophe LEROY <christophe.leroy@c-s.fr> Signed-off-by: Wen Yang <wen.yang99@zte.com.cn> Reviewed-by: Peng Hao <peng.hao2@zte.com.cn> CC: Julia Lawall <julia.lawall@lip6.fr> CC: Zhao Qiang <qiang.zhao@nxp.com> CC: David S. Miller <davem@davemloft.net> CC: netdev@vger.kernel.org CC: linuxppc-dev@lists.ozlabs.org CC: linux-kernel@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
drivers/xen/pvcalls-back.c: In function 'pvcalls_sk_state_change':
drivers/xen/pvcalls-back.c:286:28: warning:
variable 'intf' set but not used [-Wunused-but-set-variable]
It not used since e6587cdbd732 ("pvcalls-back: set -ENOTCONN in
pvcalls_conn_back_read")
Signed-off-by: YueHaibing <yuehaibing@huawei.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
When mc13xxx_reg_read() fails, "old_adc0" is uninitialized and will
contain random value. Further execution uses "old_adc0" even when
mc13xxx_reg_read() fails.
The fix checks the return value of mc13xxx_reg_read(), and exits
the execution when it fails.
Signed-off-by: Kangjie Lu <kjlu@umn.edu> Signed-off-by: Lee Jones <lee.jones@linaro.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Use devm_regmap_add_irq_chip and clean up error path in probe
and also the remove function.
Reported-by: Christian Hohnstaedt <Christian.Hohnstaedt@wago.com> Signed-off-by: Keerthy <j-keerthy@ti.com> Reviewed-by: Sebastian Reichel <sebastian.reichel@collabora.com> Signed-off-by: Lee Jones <lee.jones@linaro.org> Signed-off-by: Sasha Levin <sashal@kernel.org>