The function assumes that each PMD points to head of a
huge page. This is not correct as a PMD can point to
start of any 8M region with a, say 256M, hugepage. The
fix ensures that it points to the correct head of any PMD
huge page.
Cc: Julian Calaby <julian.calaby@gmail.com> Signed-off-by: Nitin Gupta <nitin.m.gupta@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The following regex in Makefile.build matches only one ___EXPORT_SYMBOL per line.
sed
's/.*___EXPORT_SYMBOL[[:space:]]*\([a-zA-Z0-9_]*\)[[:space:]]*,.*/EXPORT_SYMBOL(\1);/'
ATOMIC_OPS macro in atomic_64.S expands multiple symbols in same line hence
version generation is done only for the last matched symbol. This patch adds
new line between the symbol expansions.
Signed-off-by: Nagarathnam Muthusamy <nagarathnam.muthusamy@oracle.com> Reviewed-by: Babu Moger <babu.moger@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This patch adds the prototypes of assembly defined functions to asm-prototypes.h.
Some prototypes are directly added as they are not present in any existing header
files.
Signed-off-by: Nagarathnam Muthusamy <nagarathnam.muthusamy@oracle.com> Reviewed-by: Babu Moger <babu.moger@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
If we have more than 32 unicast MAC addresses assigned to an interface
we will read beyond the end of the address table in the driver when
adding filters. The next 256 entries store multicast addresses, so we
will end up attempting to insert duplicate filters, which is mostly
harmless. If we add more than 288 unicast addresses we will then read
past the multicast address table, which is likely to be more exciting.
Fixes: 12fb0da45c9a ("sfc: clean fallbacks between promisc/normal in efx_ef10_filter_sync_rx_mode") Signed-off-by: Bert Kenward <bkenward@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The lower level nl80211 code in cfg80211 ensures that "len" is between
25 and NL80211_ATTR_FRAME (2304). We subtract DOT11_MGMT_HDR_LEN (24) from
"len" so thats's max of 2280. However, the action_frame->data[] buffer is
only BRCMF_FIL_ACTION_FRAME_SIZE (1800) bytes long so this memcpy() can
overflow.
We are not allowed to block on the RCU reader side, so can't
just hold the mutex as before. As a quick fix, convert it to
a spinlock.
Fixes: d9f1f61c0801 ("tap: Extending tap device create/destroy APIs") Reported-by: Christian Borntraeger <borntraeger@de.ibm.com> Tested-by: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Sainath Grandhi <sainath.grandhi@intel.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Since the introduction of ULD (Upper-Layer Drivers), the MSI-X
deallocating path changed in cxgb4: the driver frees the interrupts
of ULD when unregistering it or on shutdown PCI handler.
Problem is that if a MSI-X is not freed before deallocated in the PCI
layer, it will trigger a BUG() due to still "alive" interrupt being
tentatively quiesced.
The below trace was observed when doing a simple unbind of Chelsio's
adapter PCI function, like:
"echo 001e:80:00.4 > /sys/bus/pci/drivers/cxgb4/unbind"
This patch fixes the issue by refactoring the shutdown path of ULD on
cxgb4 driver, by properly freeing and disabling interrupts on PCI
remove handler too.
Fixes: 0fbc81b3ad51 ("Allocate resources dynamically for all cxgb4 ULD's") Reported-by: Harsha Thyagaraja <hathyaga@in.ibm.com> Signed-off-by: Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Latest change in open-lldp code uses bytes 6-11 of perm_addr buffer
as the Ethernet source address for the host TLV packet.
Since our driver does not fill these bytes, they stay at zero and
the open-lldp code ends up sending the TLV packet with zero source
address and the switch drops this packet.
The fix is to initialize these bytes to 0xff. The open-lldp code
considers 0xff:ff:ff:ff:ff:ff as the invalid address and falls back to
use the host's mac address as the Ethernet source address.
There are two problems with calling sock_create_kern() from
rds_tcp_accept_one()
1. it sets up a new_sock->sk that is wasteful, because this ->sk
is going to get replaced by inet_accept() in the subsequent ->accept()
2. The new_sock->sk is a leaked reference in sock_graft() which
expects to find a null parent->sk
Avoid these problems by calling sock_create_lite().
Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com> Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When destroying a VRF device we cleanup the slaves in its ndo_uninit()
function, but that causes packets to be switched (skb->dev == vrf being
destroyed) even though we're pass the point where the VRF should be
receiving any packets while it is being dismantled. This causes a BUG_ON
to trigger if we have raw sockets (trace below).
The reason is that the inetdev of the VRF has been destroyed but we're
still sending packets up the stack with it, so let's free the slaves in
the dellink callback as David Ahern suggested.
Note that this fix doesn't prevent packets from going up when the VRF
device is admin down.
Fixes: 193125dbd8eb ("net: Introduce VRF device driver") Reported-by: Chris Cormier <chriscormier@cumulusnetworks.com> Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Acked-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The code that detects a failed soft reset of Octeon is comparing the wrong
value against the reset value of the Octeon SLI_SCRATCH_1 register,
resulting in an inability to detect a soft reset failure. Fix it by using
the correct value in the comparison, which is any non-zero value.
Fixes: f21fb3ed364b ("Add support of Cavium Liquidio ethernet adapters") Fixes: c0eab5b3580a ("liquidio: CN23XX firmware download") Signed-off-by: Derek Chickles <derek.chickles@cavium.com> Signed-off-by: Satanand Burla <satananda.burla@cavium.com> Signed-off-by: Raghu Vatsavayi <raghu.vatsavayi@cavium.com> Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9256645af098 ("net/core: relax BUILD_BUG_ON in
netdev_stats_to_stats64") made an attempt to read beyond
the size of the source a possibility.
Fix to only copy src size to dest. As dest might be bigger than src.
==================================================================
BUG: KASAN: slab-out-of-bounds in netdev_stats_to_stats64+0xe/0x30 at addr ffff8801be248b20
Read of size 192 by task VBoxNetAdpCtl/6734
CPU: 1 PID: 6734 Comm: VBoxNetAdpCtl Tainted: G O 4.11.4prahal+intel+ #118
Hardware name: LENOVO 20CDCTO1WW/20CDCTO1WW, BIOS GQET52WW (1.32 ) 05/04/2017
Call Trace:
dump_stack+0x63/0x86
kasan_object_err+0x1c/0x70
kasan_report+0x270/0x520
? netdev_stats_to_stats64+0xe/0x30
? sched_clock_cpu+0x1b/0x190
? __module_address+0x3e/0x3b0
? unwind_next_frame+0x1ea/0xb00
check_memory_region+0x13c/0x1a0
memcpy+0x23/0x50
netdev_stats_to_stats64+0xe/0x30
dev_get_stats+0x1b9/0x230
rtnl_fill_stats+0x44/0xc00
? nla_put+0xc6/0x130
rtnl_fill_ifinfo+0xe9e/0x3700
? rtnl_fill_vfinfo+0xde0/0xde0
? sched_clock+0x9/0x10
? sched_clock+0x9/0x10
? sched_clock_local+0x120/0x130
? __module_address+0x3e/0x3b0
? unwind_next_frame+0x1ea/0xb00
? sched_clock+0x9/0x10
? sched_clock+0x9/0x10
? sched_clock_cpu+0x1b/0x190
? VBoxNetAdpLinuxIOCtlUnlocked+0x14b/0x280 [vboxnetadp]
? depot_save_stack+0x1d8/0x4a0
? depot_save_stack+0x34f/0x4a0
? depot_save_stack+0x34f/0x4a0
? save_stack+0xb1/0xd0
? save_stack_trace+0x16/0x20
? save_stack+0x46/0xd0
? kasan_slab_alloc+0x12/0x20
? __kmalloc_node_track_caller+0x10d/0x350
? __kmalloc_reserve.isra.36+0x2c/0xc0
? __alloc_skb+0xd0/0x560
? rtmsg_ifinfo_build_skb+0x61/0x120
? rtmsg_ifinfo.part.25+0x16/0xb0
? rtmsg_ifinfo+0x47/0x70
? register_netdev+0x15/0x30
? vboxNetAdpOsCreate+0xc0/0x1c0 [vboxnetadp]
? vboxNetAdpCreate+0x210/0x400 [vboxnetadp]
? VBoxNetAdpLinuxIOCtlUnlocked+0x14b/0x280 [vboxnetadp]
? do_vfs_ioctl+0x17f/0xff0
? SyS_ioctl+0x74/0x80
? do_syscall_64+0x182/0x390
? __alloc_skb+0xd0/0x560
? __alloc_skb+0xd0/0x560
? save_stack_trace+0x16/0x20
? init_object+0x64/0xa0
? ___slab_alloc+0x1ae/0x5c0
? ___slab_alloc+0x1ae/0x5c0
? __alloc_skb+0xd0/0x560
? sched_clock+0x9/0x10
? kasan_unpoison_shadow+0x35/0x50
? kasan_kmalloc+0xad/0xe0
? __kmalloc_node_track_caller+0x246/0x350
? __alloc_skb+0xd0/0x560
? kasan_unpoison_shadow+0x35/0x50
? memset+0x31/0x40
? __alloc_skb+0x31f/0x560
? napi_consume_skb+0x320/0x320
? br_get_link_af_size_filtered+0xb7/0x120 [bridge]
? if_nlmsg_size+0x440/0x630
rtmsg_ifinfo_build_skb+0x83/0x120
rtmsg_ifinfo.part.25+0x16/0xb0
rtmsg_ifinfo+0x47/0x70
register_netdevice+0xa2b/0xe50
? __kmalloc+0x171/0x2d0
? netdev_change_features+0x80/0x80
register_netdev+0x15/0x30
vboxNetAdpOsCreate+0xc0/0x1c0 [vboxnetadp]
vboxNetAdpCreate+0x210/0x400 [vboxnetadp]
? vboxNetAdpComposeMACAddress+0x1d0/0x1d0 [vboxnetadp]
? kasan_check_write+0x14/0x20
VBoxNetAdpLinuxIOCtlUnlocked+0x14b/0x280 [vboxnetadp]
? VBoxNetAdpLinuxOpen+0x20/0x20 [vboxnetadp]
? lock_acquire+0x11c/0x270
? __audit_syscall_entry+0x2fb/0x660
do_vfs_ioctl+0x17f/0xff0
? __audit_syscall_entry+0x2fb/0x660
? ioctl_preallocate+0x1d0/0x1d0
? __audit_syscall_entry+0x2fb/0x660
? kmem_cache_free+0xb2/0x250
? syscall_trace_enter+0x537/0xd00
? exit_to_usermode_loop+0x100/0x100
SyS_ioctl+0x74/0x80
? do_sys_open+0x350/0x350
? do_vfs_ioctl+0xff0/0xff0
do_syscall_64+0x182/0x390
entry_SYSCALL64_slow_path+0x25/0x25
RIP: 0033:0x7f7e39a1ae07
RSP: 002b:00007ffc6f04c6d8 EFLAGS: 00000206 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 00007ffc6f04c730 RCX: 00007f7e39a1ae07
RDX: 00007ffc6f04c730 RSI: 00000000c0207601 RDI: 0000000000000007
RBP: 00007ffc6f04c700 R08: 00007ffc6f04c780 R09: 0000000000000008
R10: 0000000000000541 R11: 0000000000000206 R12: 0000000000000007
R13: 00000000c0207601 R14: 00007ffc6f04c730 R15: 0000000000000012
Object at ffff8801be248008, in cache kmalloc-4096 size: 4096
Allocated:
PID = 6734
save_stack_trace+0x16/0x20
save_stack+0x46/0xd0
kasan_kmalloc+0xad/0xe0
__kmalloc+0x171/0x2d0
alloc_netdev_mqs+0x8a7/0xbe0
vboxNetAdpOsCreate+0x65/0x1c0 [vboxnetadp]
vboxNetAdpCreate+0x210/0x400 [vboxnetadp]
VBoxNetAdpLinuxIOCtlUnlocked+0x14b/0x280 [vboxnetadp]
do_vfs_ioctl+0x17f/0xff0
SyS_ioctl+0x74/0x80
do_syscall_64+0x182/0x390
return_from_SYSCALL_64+0x0/0x6a
Freed:
PID = 5600
save_stack_trace+0x16/0x20
save_stack+0x46/0xd0
kasan_slab_free+0x73/0xc0
kfree+0xe4/0x220
kvfree+0x25/0x30
single_release+0x74/0xb0
__fput+0x265/0x6b0
____fput+0x9/0x10
task_work_run+0xd5/0x150
exit_to_usermode_loop+0xe2/0x100
do_syscall_64+0x26c/0x390
return_from_SYSCALL_64+0x0/0x6a
Memory state around the buggy address: ffff8801be248a80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ffff8801be248b00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>ffff8801be248b80: 00 00 00 00 00 00 00 00 00 00 00 07 fc fc fc fc
^ ffff8801be248c00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ffff8801be248c80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
==================================================================
Signed-off-by: Alban Browaeys <alban.browaeys@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
It's not a good idea to add the same hlist_node to two different hash lists.
This leads to various hard to debug memory corruptions.
Fixes: 8ed66f0e8235 ("geneve: implement support for IPv6-based tunnels") Cc: John W. Linville <linville@tuxdriver.com> Signed-off-by: Jiri Benc <jbenc@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
It's not a good idea to add the same hlist_node to two different hash lists.
This leads to various hard to debug memory corruptions.
Fixes: b1be00a6c39f ("vxlan: support both IPv4 and IPv6 sockets in a single vxlan device") Signed-off-by: Jiri Benc <jbenc@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Currently, when the link for $DEV is down, this command succeeds but the
address is removed immediately by DAD (1):
ip addr add 1111::12/64 dev $DEV valid_lft 3600 preferred_lft 1800
In the same situation, this will succeed and not remove the address (2):
ip addr add 1111::12/64 dev $DEV
ip addr change 1111::12/64 dev $DEV valid_lft 3600 preferred_lft 1800
The comment in addrconf_dad_begin() when !IF_READY makes it look like
this is the intended behavior, but doesn't explain why:
* If the device is not ready:
* - keep it tentative if it is a permanent address.
* - otherwise, kill it.
We clearly cannot prevent userspace from doing (2), but we can make (1)
work consistently with (2).
addrconf_dad_stop() is only called in two cases: if DAD failed, or to
skip DAD when the link is down. In that second case, the fix is to avoid
deleting the address, like we already do for permanent addresses.
Fixes: 3c21edbd1137 ("[IPV6]: Defer IPv6 device initialization until the link becomes ready.") Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Draining the health workqueue will ignore future health works including
the one that report hardware failure and thus we can't enter error state
Instead cancel the recovery flow and make sure only recovery flow won't
be scheduled.
Fixes: 5e44fca50470 ('net/mlx5: Only cancel recovery work when cleaning up device') Signed-off-by: Mohamad Haj Yahia <mohamad@mellanox.com> Signed-off-by: Moshe Shemesh <moshe@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Recently I started seeing warnings about pages with refcount -1. The
problem was traced to packets being reused after their head was merged into
a GRO packet by skb_gro_receive(). While bisecting the issue pointed to
commit c21b48cc1bbf ("net: adjust skb->truesize in ___pskb_trim()") and
I have never seen it on a kernel with it reverted, I believe the real
problem appeared earlier when the option to merge head frag in GRO was
implemented.
Handling NAPI_GRO_FREE_STOLEN_HEAD state was only added to GRO_MERGED_FREE
branch of napi_skb_finish() so that if the driver uses napi_gro_frags()
and head is merged (which in my case happens after the skb_condense()
call added by the commit mentioned above), the skb is reused including the
head that has been merged. As a result, we release the page reference
twice and eventually end up with negative page refcount.
To fix the problem, handle NAPI_GRO_FREE_STOLEN_HEAD in napi_frags_finish()
the same way it's done in napi_skb_finish().
Fixes: d7e8883cfcf4 ("net: make GRO aware of skb->head_frag") Signed-off-by: Michal Kubecek <mkubecek@suse.cz> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Doing pointer arithmetic on them is also forbidden, so that they
don't turn into unknown value and then get leaked out. However,
there's xadd as a special case, where we don't check the src reg
for being a pointer register, e.g. the following will pass:
We could store the pointer into skb->cb, loose the type context,
and then read it out from there again to leak it eventually out
of a map value. Or more easily in a different variant, too:
My static checker complains that ofdpa_neigh_del() can sometimes free
"found". It just makes sense to use it first before deleting it.
Fixes: ecf244f753e0 ("rocker: fix maybe-uninitialized warning") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In case a VLAN device is enslaved to a bridge we shouldn't create a
router interface (RIF) for it when it's configured with an IP address.
This is already handled by the driver for other types of netdevs, such
as physical ports and LAG devices.
If this IP address is then removed and the interface is subsequently
unlinked from the bridge, a NULL pointer dereference can happen, as the
original 802.1d FID was replaced with an rFID which was then deleted.
To reproduce:
$ ip link set dev enp3s0np9 up
$ ip link add name enp3s0np9.111 link enp3s0np9 type vlan id 111
$ ip link set dev enp3s0np9.111 up
$ ip link add name br0 type bridge
$ ip link set dev br0 up
$ ip link set enp3s0np9.111 master br0
$ ip address add dev enp3s0np9.111 192.168.0.1/24
$ ip address del dev enp3s0np9.111 192.168.0.1/24
$ ip link set dev enp3s0np9.111 nomaster
Fixes: 99724c18fc66 ("mlxsw: spectrum: Introduce support for router interfaces") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reported-by: Petr Machata <petrm@mellanox.com> Tested-by: Petr Machata <petrm@mellanox.com> Reviewed-by: Petr Machata <petrm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When qdisc fail to init, qdisc_create would invoke the destroy callback
to cleanup. But there is no check if the callback exists really. So it
would cause the panic if there is no real destroy callback like the qdisc
codel, fq, and so on.
Take codel as an example following:
When a malicious user constructs one invalid netlink msg, it would cause
codel_init->codel_change->nla_parse_nested failed.
Then kernel would invoke the destroy callback directly but qdisc codel
doesn't define one. It causes one panic as a result.
Now add one the check for destroy to avoid the possible panic.
Fixes: 87b60cfacf9f ("net_sched: fix error recovery at qdisc creation") Signed-off-by: Gao Feng <gfree.wind@vip.163.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
We don't hold any tx lock when trying to disable TX during reset, this
would lead a use after free since ndo_start_xmit() tries to access
the virtqueue which has already been freed. Fix this by using
netif_tx_disable() before freeing the vqs, this could make sure no tx
after vq freeing.
Reported-by: Jean-Philippe Menil <jpmenil@gmail.com> Tested-by: Jean-Philippe Menil <jpmenil@gmail.com>
Fixes commit f600b6905015 ("virtio_net: Add XDP support") Cc: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Robert McCabe <robert.mccabe@rockwellcollins.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Similar to the fix provided by Dominik Heidler in commit 9b3dc0a17d73 ("l2tp: cast l2tp traffic counter to unsigned")
we need to take care of 32bit kernels in dev_get_stats().
When using atomic_long_read(), we add a 'long' to u64 and
might misinterpret high order bit, unless we cast to unsigned.
Fixes: caf586e5f23ce ("net: add a core netdev->rx_dropped counter") Fixes: 015f0688f57ca ("net: net: add a core netdev->tx_dropped counter") Fixes: 6e7333d315a76 ("net: add rx_nohandler stat counter") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Jarod Wilson <jarod@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
We have to reset the sk->sk_rx_dst when we disconnect a TCP
connection, because otherwise when we re-connect it this
dst reference is simply overridden in tcp_finish_connect().
This fixes a dst leak which leads to a loopback dev refcnt
leak. It is a long-standing bug, Kevin reported a very similar
(if not same) bug before. Thanks to Andrei for providing such
a reliable reproducer which greatly narrows down the problem.
Fixes: 41063e9dd119 ("ipv4: Early TCP socket demux.") Reported-by: Andrei Vagin <avagin@gmail.com> Reported-by: Kevin Xu <kaiwen.xu@hulu.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The function, skb_complete_tx_timestamp(), used to allow passing in a
NULL pointer for the time stamps, but that was changed in commit 62bccb8cdb69051b95a55ab0c489e3cab261c8ef ("net-timestamp: Make the
clone operation stand-alone from phy timestamping"), and the existing
call sites, all of which are in the dp83640 driver, were fixed up.
Even though the kernel-doc was subsequently updated in commit 7a76a021cd5a292be875fbc616daf03eab1e6996 ("net-timestamp: Update
skb_complete_tx_timestamp comment"), still a bug fix from Manfred
Rudigier came into the driver using the old semantics. Probably
Manfred derived that patch from an older kernel version.
This fix should be applied to the stable trees as well.
Fixes: 81e8f2e930fe ("net: dp83640: Fix tx timestamp overflow handling.") Signed-off-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Our customer encountered stuck NFS writes for blocks starting at specific
offsets w.r.t. page boundary caused by networking stack sending packets via
UFO enabled device with wrong checksum. The problem can be reproduced by
composing a long UDP datagram from multiple parts using MSG_MORE flag:
Assume this packet is to be routed via a device with MTU 1500 and
NETIF_F_UFO enabled. When second sendto() gets into __ip_append_data(),
this condition is tested (among others) to decide whether to call
ip_ufo_append_data():
At the moment, we already have skb with 1028 bytes of data which is not
marked for GSO so that the test is false (fragheaderlen is usually 20).
Thus we append second 1000 bytes to this skb without invoking UFO. Third
sendto(), however, has sufficient length to trigger the UFO path so that we
end up with non-UFO skb followed by a UFO one. Later on, udp_send_skb()
uses udp_csum() to calculate the checksum but that assumes all fragments
have correct checksum in skb->csum which is not true for UFO fragments.
When checking against MTU, we need to add skb->len to length of new segment
if we already have a partially filled skb and fragheaderlen only if there
isn't one.
In the IPv6 case, skb can only be null if this is the first segment so that
we have to use headersize (length of the first IPv6 header) rather than
fragheaderlen (length of IPv6 header of further fragments) for skb == NULL.
Fixes: e89e9cf539a2 ("[IPv4/IPv6]: UFO Scatter-gather approach") Fixes: e4c5e13aa45c ("ipv6: Should use consistent conditional judgement for
ip6 fragment between __ip6_append_data and ip6_finish_output") Signed-off-by: Michal Kubecek <mkubecek@suse.cz> Acked-by: Vlad Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The 8000 series adapters uses catch-all filters for encapsulated traffic
to support filtering VXLAN, NVGRE and GENEVE traffic.
This new filter functionality requires a longer MCDI command.
This patch increases the size of buffers on stack that were missed, which
fixes a kernel panic from the stack protector.
Fixes: 9b41080125176 ("sfc: insert catch-all filters for encapsulated traffic") Signed-off-by: Martin Habets <mhabets@solarflare.com> Acked-by: Edward Cree <ecree@solarflare.com> Acked-by: Bert Kenward bkenward@solarflare.com Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This structure member is hidden behind CONFIG_SYSFS, and we
get a build error when that is disabled:
drivers/net/hyperv/netvsc_drv.c: In function 'netvsc_set_channels':
drivers/net/hyperv/netvsc_drv.c:754:49: error: 'struct net_device' has no member named 'num_rx_queues'; did you mean 'num_tx_queues'?
drivers/net/hyperv/netvsc_drv.c: In function 'netvsc_set_rxfh':
drivers/net/hyperv/netvsc_drv.c:1181:25: error: 'struct net_device' has no member named 'num_rx_queues'; did you mean 'num_tx_queues'?
As the value is only set once to the argument of alloc_netdev_mq(),
we can compare against that constant directly.
Fixes: ff4a44199012 ("netvsc: allow get/set of RSS indirection table") Fixes: 2b01888d1b45 ("netvsc: allow more flexible setting of number of channels") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The per netns loopback_dev->ip6_ptr is unregistered and set to
NULL when its mtu is set to smaller than IPV6_MIN_MTU, this
leads to that we could set rt->rt6i_idev NULL after a
rt6_uncached_list_flush_dev() and then crash after another
call.
In this case we should just bring its inet6_dev down, rather
than unregistering it, at least prior to commit 176c39af29bc
("netns: fix addrconf_ifdown kernel panic") we always
override the case for loopback.
Thanks a lot to Andrey for finding a reliable reproducer.
Fixes: 176c39af29bc ("netns: fix addrconf_ifdown kernel panic") Reported-by: Andrey Konovalov <andreyknvl@google.com> Cc: Andrey Konovalov <andreyknvl@google.com> Cc: Daniel Lezcano <dlezcano@fr.ibm.com> Cc: David Ahern <dsahern@gmail.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Acked-by: David Ahern <dsahern@gmail.com> Tested-by: Andrey Konovalov <andreyknvl@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The commit ("net/phy: micrel: Add workaround for bad autoneg") fixes an
autoneg failure case by resetting the hardware. This turns off
intterupts. Things will work themselves out if the phy polls, as it will
figure out it's state during a poll. However if the phy uses only
intterupts, the phy will stall, since interrupts are off. This patch
fixes the issue by calling config_intr after resetting the phy.
Fixes: d2fd719bcb0e ("net/phy: micrel: Add workaround for bad autoneg ") Signed-off-by: Zach Brown <zach.brown@ni.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The pat_enabled() logic is broken on CPUs which do not support PAT and
where the initialization code fails to call pat_init(). Due to that the
enabled flag stays true and pat_enabled() returns true wrongfully.
As a consequence the mappings, e.g. for Xorg, are set up with the wrong
caching mode and the required MTRR setups are omitted.
To cure this the following changes are required:
1) Make pat_enabled() return true only if PAT initialization was
invoked and successful.
2) Invoke init_cache_modes() unconditionally in setup_arch() and
remove the extra callsites in pat_disable() and the pat disabled
code path in pat_init().
Also rename __pat_enabled to pat_disabled to reflect the real purpose of
this variable.
Fixes: 9cd25aac1f44 ("x86/mm/pat: Emulate PAT when it is disabled") Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Bernhard Held <berny156@gmx.de> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Brian Gerst <brgerst@gmail.com> Cc: "Luis R. Rodriguez" <mcgrof@suse.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Andy Lutomirski <luto@kernel.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/alpine.LRH.2.02.1707041749300.3456@file01.intranet.prod.int.rdu2.redhat.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
kstrtoull returns 0 on success, however, in reserved_clusters_store we
will return -EINVAL if kstrtoull returns 0, it makes us fail to update
reserved_clusters value through sysfs.
Otherwise, we enable all sorts of forgeries via timing attack.
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Suggested-by: Stephan Müller <smueller@chronox.de> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: linux-crypto@vger.kernel.org Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Changes in the SW cts (ciphertext stealing) code in
commit 0605c41cc53ca ("crypto: cts - Convert to skcipher")
revealed a problem in the CAAM driver:
when cts(cbc(aes)) is executed and cts runs in SW,
cbc(aes) is offloaded in CAAM; cts encrypts the last block
in atomic context and CAAM incorrectly decides to use GFP_KERNEL
for memory allocation.
Fix this by allowing GFP_KERNEL (sleeping) only when MAY_SLEEP flag is
set, i.e. remove MAY_BACKLOG flag.
We split the fix in two parts - first is sent to -stable, while the
second is not (since there is no known failure case).
Link: http://lkml.kernel.org/g/20170602122446.2427-1-david@sigma-star.at Reported-by: David Gstir <david@sigma-star.at> Signed-off-by: Horia Geantă <horia.geanta@nxp.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
There is a clean-up bug in the core comedi module initialization
functions, `comedi_init()`. If the `comedi_num_legacy_minors` module
parameter is non-zero (and valid), it creates that many "legacy" devices
and registers them in SysFS. A failure causes the function to clean up
and return an error. Unfortunately, it fails to destroy the "comedi"
class that was created earlier. Fix it by adding a call to
`class_destroy(comedi_class)` at the appropriate place in the clean-up
sequence.
Signed-off-by: Ian Abbott <abbotti@mev.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
would have moved us to taking the sem. So, it's
not the time to wake a writer now, and only readers
are allowed now. Thus, 0 must be passed to __rwsem_do_wake().
Next, __rwsem_do_wake() wakes readers unconditionally.
But we mustn't do that if the sem is owned by writer
in the moment. Otherwise, writer and reader own the sem
the same time, which leads to memory corruption in
callers.
rwsem-xadd.c does not need that, as:
1) the similar check is made lockless there,
2) in __rwsem_mark_wake::try_reader_grant we test,
that sem is not owned by writer.
Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Niklas Cassel <niklas.cassel@axis.com> Cc: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: 17fcbd590d0c "locking/rwsem: Fix down_write_killable() for CONFIG_RWSEM_GENERIC_SPINLOCK=y" Link: http://lkml.kernel.org/r/149762063282.19811.9129615532201147826.stgit@localhost.localdomain Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Andrei Vagin writes:
FYI: This bug has been reproduced on 4.11.7
> BUG: Dentry ffff895a3dd01240{i=4e7c09a,n=lo} still in use (1) [unmount of proc proc]
> ------------[ cut here ]------------
> WARNING: CPU: 1 PID: 13588 at fs/dcache.c:1445 umount_check+0x6e/0x80
> CPU: 1 PID: 13588 Comm: kworker/1:1 Not tainted 4.11.7-200.fc25.x86_64 #1
> Hardware name: CompuLab sbc-flt1/fitlet, BIOS SBCFLT_0.08.04 06/27/2015
> Workqueue: events proc_cleanup_work
> Call Trace:
> dump_stack+0x63/0x86
> __warn+0xcb/0xf0
> warn_slowpath_null+0x1d/0x20
> umount_check+0x6e/0x80
> d_walk+0xc6/0x270
> ? dentry_free+0x80/0x80
> do_one_tree+0x26/0x40
> shrink_dcache_for_umount+0x2d/0x90
> generic_shutdown_super+0x1f/0xf0
> kill_anon_super+0x12/0x20
> proc_kill_sb+0x40/0x50
> deactivate_locked_super+0x43/0x70
> deactivate_super+0x5a/0x60
> cleanup_mnt+0x3f/0x90
> mntput_no_expire+0x13b/0x190
> kern_unmount+0x3e/0x50
> pid_ns_release_proc+0x15/0x20
> proc_cleanup_work+0x15/0x20
> process_one_work+0x197/0x450
> worker_thread+0x4e/0x4a0
> kthread+0x109/0x140
> ? process_one_work+0x450/0x450
> ? kthread_park+0x90/0x90
> ret_from_fork+0x2c/0x40
> ---[ end trace e1c109611e5d0b41 ]---
> VFS: Busy inodes after unmount of proc. Self-destruct in 5 seconds. Have a nice day...
> BUG: unable to handle kernel NULL pointer dereference at (null)
> IP: _raw_spin_lock+0xc/0x30
> PGD 0
Fix this by taking a reference to the super block in proc_sys_prune_dcache.
The superblock reference is the core of the fix however the sysctl_inodes
list is converted to a hlist so that hlist_del_init_rcu may be used. This
allows proc_sys_prune_dache to remove inodes the sysctl_inodes list, while
not causing problems for proc_sys_evict_inode when if it later choses to
remove the inode from the sysctl_inodes list. Removing inodes from the
sysctl_inodes list allows proc_sys_prune_dcache to have a progress
guarantee, while still being able to drop all locks. The fact that
head->unregistering is set in start_unregistering ensures that no more
inodes will be added to the the sysctl_inodes list.
Previously the code did a dance where it delayed calling iput until the
next entry in the list was being considered to ensure the inode remained on
the sysctl_inodes list until the next entry was walked to. The structure
of the loop in this patch does not need that so is much easier to
understand and maintain.
Reported-by: Andrei Vagin <avagin@gmail.com> Tested-by: Andrei Vagin <avagin@openvz.org> Fixes: ace0c791e6c3 ("proc/sysctl: Don't grab i_lock under sysctl_lock.") Fixes: d6cffbbe9a7e ("proc/sysctl: prune stale dentries during unregistering") Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The retry logic for netlink_attachskb() inside sys_mq_notify()
is nasty and vulnerable:
1) The sock refcnt is already released when retry is needed
2) The fd is controllable by user-space because we already
release the file refcnt
so we when retry but the fd has been just closed by user-space
during this small window, we end up calling netlink_detachskb()
on the error path which releases the sock again, later when
the user-space closes this socket a use-after-free could be
triggered.
Setting 'sock' to NULL here should be sufficient to fix it.
Thinkpad Helix 2 is a tablet PC, the audio is powered by Core M
broadwell-audio and rt286 codec. For all versions of Linux kernel,
the stereo output doesn't work properly when earphones are plugged
in, the sound was coming out from both channels even if the audio
contains only the left or right channel. Furthermore, if a music
recorded in stereo is played, the two channels cancle out each other
out, as a result, no voice but only distorted background music can be
heard, like a sound card with builtin a Karaoke sount effect.
Apparently this tablet uses a combo jack with polarity incorrectly
set by rt286 driver. This patch adds DMI information of Thinkpad Helix 2
to force_combo_jack_table[] and the issue is resolved. The microphone
input doesn't work regardless to the presence of this patch and still
needs help from other developers to investigate.
This is my first patch to LKML directly, sorry for CC-ing too many
people here.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=93841 Signed-off-by: Yifeng Li <tomli@tomli.me> Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
According to the Intel datasheet, the REP MOVSB instruction
exposes a pretty heavy setup cost (50 ticks), which hurts
short string copy operations.
This change tries to avoid this cost by calling the explicit
loop available in the unrolled code for strings shorter
than 64 bytes.
The 64 bytes cutoff value is arbitrary from the code logic
point of view - it has been selected based on measurements,
as the largest value that still ensures a measurable gain.
Micro benchmarks of the __copy_from_user() function with
lengths in the [0-63] range show this performance gain
(shorter the string, larger the gain):
- in the [55%-4%] range on Intel Xeon(R) CPU E5-2690 v4
- in the [72%-9%] range on Intel Core i7-4810MQ
Other tested CPUs - namely Intel Atom S1260 and AMD Opteron
8216 - show no difference, because they do not expose the
ERMS feature bit.
Signed-off-by: Paolo Abeni <pabeni@redhat.com> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Alan Cox <gnomes@lxorguk.ukuu.org.uk> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Hannes Frederic Sowa <hannes@stressinduktion.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Kees Cook <keescook@chromium.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/4533a1d101fd460f80e21329a34928fad521c1d4.1498744345.git.pabeni@redhat.com
[ Clarified the changelog. ] Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
While cleaning up sysfs callback that prints EK we discovered a kernel
memory leak. This commit fixes the issue by zeroing the buffer used for
TPM command/response.
The leak happen when we use either tpm_vtpm_proxy, tpm_ibmvtpm or
xen-tpmfront.
Fixes: 0883743825e3 ("TPM: sysfs functions consolidation") Reported-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Tested-by: Stefan Berger <stefanb@linux.vnet.ibm.com> Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> Signed-off-by: James Morris <james.l.morris@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
If a TPM2 loses power without a TPM2_Shutdown command being issued (a
"disorderly reboot"), it may lose some state that has yet to be
persisted to NVRam, and will increment the DA counter. After the DA
counter gets sufficiently large, the TPM will lock the user out.
NOTE: This only changes behavior on TPM2 devices. Since TPM1 uses sysfs,
and sysfs relies on implicit locking on chip->ops, it is not safe to
allow this code to run in TPM1, or to add sysfs support to TPM2, until
that locking is made explicit.
Signed-off-by: Josh Zimmerman <joshz@google.com> Fixes: 74d6b3ceaa17 ("tpm: fix suspend/resume paths for TPM 2.0") Reviewed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> Tested-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> Signed-off-by: James Morris <james.l.morris@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Before commit 88ffbf3e03 "GFS2: Use resizable hash table for glocks",
glocks were freed via call_rcu to allow reading the glock hashtable
locklessly using rcu. This was then changed to free glocks immediately,
which made reading the glock hashtable unsafe. Bring back the original
code for freeing glocks via call_rcu.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
For AMD Promontory xHCI host, although you can disable USB 2.0 ports in
BIOS settings, those ports will be enabled anyway after you remove a
device on that port and re-plug it in again. It's a known limitation of
the chip. As a workaround we can clear the PORT_WAKE_BITS.
This will disable wake on connect, disconnect and overcurrent on
AMD Promontory USB2 ports
Update the sh_pfc_soc_info pointer after calling the SoC-specific
initialization function, as it may have been updated to e.g. handle
different SoC revisions. This makes sure the correct subdriver name is
printed later.
Fixes: 0c151062f32c9db8 ("sh-pfc: Add support for SoC-specific initialization") Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The R8A7791 PFC driver was apparently based on the preliminary revisions
of the user's manual, which omitted the HSCIF1 group E signals in the
IPSR4 register description. This would cause HSCIF1's probe to fail with
the messages like below:
sh-pfc e6060000.pfc: cannot locate data/mark enum_id for mark 1989
sh-sci e62c8000.serial: Error applying setting, reverse things back
sh-sci: probe of e62c8000.serial failed with error -22
Add the neceassary PINMUX_IPSR_MSEL() invocations for the HSCK1_E,
HCTS1#_E, and HRTS1#_E signals...
. This however results in the mux mode being 0 between the two writes.
On my machine there is an IC's reset pin connected to LCD_D20. The
bootloader configures this pin as GPIO output-high (i.e. not holding the
IC in reset). When Linux reconfigures the pin to GPIO the short time
LCD_D20 is muxed as LCD_D20 instead of GPIO_1_20 is enough to confuse
the connected IC.
The same problem is present for the pin's drive strength setting which is
reset to low drive strength before using the right value.
So instead of relying on the hardware to modify the register setting
using two writes implement the bit toggling using read-modify-write.
Andre Przywara <andre.przywara@arm.com> noticed that we can get the
following warning with -EPROBE_DEFER:
"WARNING: CPU: 1 PID: 89 at drivers/base/dd.c:349
driver_probe_device+0x2ac/0x2e8"
Let's fix the issue by removing the indices as suggested by
Tejun Heo <tj@kernel.org>. All we have to do here is kill the radix
tree.
I probably ended up with the indices after grepping for removal
of all entries using radix_tree_for_each_slot() and the first
match found was gmap_radix_tree_free(). Anyways, no need for
indices here, and we can just do remove all the entries using
radix_tree_for_each_slot() along how the item_kill_tree() test
case does.
Fixes: c7059c5ac70a ("pinctrl: core: Add generic pinctrl functions for managing groups") Fixes: a76edc89b100 ("pinctrl: core: Add generic pinctrl functions for managing groups") Reported-by: Andre Przywara <andre.przywara@arm.com> Signed-off-by: Tony Lindgren <tony@atomide.com> Reviewed-by: Andre Przywara <andre.przywara@arm.com> Tested-by: Andre Przywara <andre.przywara@arm.com> Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In stm32_pconf_parse_conf function, stm32_pmx_gpio_set_direction is
called with wrong parameter value. Indeed, using NULL value for range
will raise an oops.
The nand_groups table uses different names for the NAND DQS pins than
the GROUP() definition in meson8b_cbus_groups (nand_dqs_0 vs nand_dqs0).
This prevents using the NAND DQS pins in the devicetree.
Fix this by ensuring that the GROUP() definition and the
meson8b_cbus_groups use the same name for these pins.
Fixes: 0fefcb6876d0 ("pinctrl: Add support for Meson8b") Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com> Acked-by: Kevin Hilman <khilman@baylibre.com> Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The R8A7791 PFC driver was apparently based on the preliminary revisions
of the user's manual, which omitted the DVC_MUTE signal altogether in
the PFC section. The modern manual has the signal described, so just add
the necassary data to the driver...
The usbip stack dynamically allocates the transfer_buffer and
setup_packet of each urb that got generated by the tcp to usb stub code.
As these pointers are always used only once we will set them to NULL
after use. This is done likewise to the free_urb code in vudc_dev.c.
This patch fixes double kfree situations where the usbip remote side
added the URB_FREE_BUFFER.
The USB core and sysfs will attempt to enumerate certain parameters
which are unsupported by the au0828 - causing inconsistent behavior
and sometimes causing the chip to reset. Avoid making these calls.
This problem manifested as intermittent cases where the au8522 would
be reset on analog video startup, in particular when starting up ALSA
audio streaming in parallel - the sysfs entries created by
snd-usb-audio on streaming startup would result in unsupported control
messages being sent during tuning which would put the chip into an
unknown state.
%p will leak kernel pointers, so let's not expose the information on
dmesg and instead use %pK. %pK will only show the actual addresses if
explicitly enabled under /proc/sys/kernel/kptr_restrict.
Always try to parse an address, since kstrtoul() will safely fail when
given a symbol as input. If that fails (which will be the case for a
symbol), try to parse a symbol instead.
The dirfragtree is lazily updated, it's not always accurate. Infinite
loops happens in following circumstance.
- client send request to read frag A
- frag A has been fragmented into frag B and C. So mds fills the reply
with contents of frag B
- client wants to read next frag C. ceph_choose_frag(frag value of C)
return frag A.
The fix is using previous readdir reply to calculate next readdir frag
when possible.
The ib_uverbs_create_ah() ind ib_uverbs_modify_qp() calls receive
the port number from user input as part of its attributes and assumes
it is valid. Down on the stack, that parameter is used to access kernel
data structures. If the value is invalid, the kernel accesses memory
it should not. To prevent this, verify the port number before using it.
BUG: KASAN: use-after-free in ib_uverbs_create_ah+0x6d5/0x7b0
Read of size 4 at addr ffff880018d67ab8 by task syz-executor/313
BUG: KASAN: slab-out-of-bounds in modify_qp.isra.4+0x19d0/0x1ef0
Read of size 4 at addr ffff88006c40ec58 by task syz-executor/819
Fixes: 67cdb40ca444 ("[IB] uverbs: Implement more commands") Fixes: 189aba99e70 ("IB/uverbs: Extend modify_qp and support packet pacing") Cc: Yevgeny Kliteynik <kliteyn@mellanox.com> Cc: Tziporet Koren <tziporet@mellanox.com> Cc: Alex Polak <alexpo@mellanox.com> Signed-off-by: Boris Pismenny <borisp@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The driver_override implementation is susceptible to race condition when
different threads are reading vs storing a different driver override.
Add locking to avoid race condition.
Currently we just stash anything we got into file->f_flags, and the
report it in fcntl(F_GETFD). This patch just clears out all unknown
flags so that we don't pass them to the fs or report them.
Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> ../drivers/hsi/clients/ssi_protocol.c:1069:5: error: 'struct net_device' has no member named 'destructor'
Reported-by: Mark Brown <broonie@kernel.org> Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: David S. Miller <davem@davemloft.net> Cc: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Locally generated TCP packets are usually cloned, so we
do skb_cow_data() on this packets. After that we need to
reload the pointer to the esp header. On udpencap this
header has an offset to skb_transport_header, so take this
offset into account.
This is a backport of:
commit 0e78a87306a ("esp4: Fix udpencap for local TCP packets.")
Fixes: 67d349ed603 ("net/esp4: Fix invalid esph pointer crash") Fixes: fca11ebde3f0 ("esp4: Reorganize esp_output") Reported-by: Don Bowman <db@donbowman.ca> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This is triggered occasionally by running both win7 and win2016 in L2, in
addition, EPT is disabled on both L1 and L2. It can't be reproduced easily.
Commit 0b6ac343fc (KVM: nVMX: Correct handling of exception injection) mentioned
that "KVM wants to inject page-faults which it got to the guest. This function
assumes it is called with the exit reason in vmcs02 being a #PF exception".
Commit e011c663 (KVM: nVMX: Check all exceptions for intercept during delivery to
L2) allows to check all exceptions for intercept during delivery to L2. However,
there is no guarantee the exit reason is exception currently, when there is an
external interrupt occurred on host, maybe a time interrupt for host which should
not be injected to guest, and somewhere queues an exception, then the function
nested_vmx_check_exception() will be called and the vmexit emulation codes will
try to emulate the "Acknowledge interrupt on exit" behavior, the warning is
triggered.
Reusing the exit reason from the L2->L0 vmexit is wrong in this case,
the reason must always be EXCEPTION_NMI when injecting an exception into
L1 as a nested vmexit.
Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com> Fixes: e011c663b9c7 ("KVM: nVMX: Check all exceptions for intercept during delivery to L2") Signed-off-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Static checker noticed that base3 could be used uninitialized if the
segment was not present (useable). Random stack values probably would
not pass VMCS entry checks.
Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Fixes: 1aa366163b8b ("KVM: x86 emulator: consolidate segment accessors") Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: David Hildenbrand <david@redhat.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
On AMD, the effect of set_nmi_mask called by emulate_iret_real and em_rsm
on hflags is reverted later on in x86_emulate_instruction where hflags are
overwritten with ctxt->emul_flags (the kvm_set_hflags call). This manifests
as a hang when rebooting Windows VMs with QEMU, OVMF, and >1 vcpu.
Instead of trying to merge ctxt->emul_flags into vcpu->arch.hflags after
an instruction is emulated, this commit deletes emul_flags altogether and
makes the emulator access vcpu->arch.hflags using two new accessors. This
way all changes, on the emulator side as well as in functions called from
the emulator and accessing vcpu state with emul_to_vcpu, are preserved.
More details on the bug and its manifestation with Windows and OVMF:
It's a KVM bug in the interaction between SMI/SMM and NMI, specific to AMD.
I believe that the SMM part explains why we started seeing this only with
OVMF.
KVM masks and unmasks NMI when entering and leaving SMM. When KVM emulates
the RSM instruction in em_rsm, the set_nmi_mask call doesn't stick because
later on in x86_emulate_instruction we overwrite arch.hflags with
ctxt->emul_flags, effectively reverting the effect of the set_nmi_mask call.
The AMD-specific hflag of interest here is HF_NMI_MASK.
When rebooting the system, Windows sends an NMI IPI to all but the current
cpu to shut them down. Only after all of them are parked in HLT will the
initiating cpu finish the restart. If NMI is masked, other cpus never get
the memo and the initiating cpu spins forever, waiting for
hal!HalpInterruptProcessorsStarted to drop. That's the symptom we observe.
Fixes: a584539b24b8 ("KVM: x86: pass the whole hflags field to emulator and back") Signed-off-by: Ladi Prosek <lprosek@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In commit eea628199d5b ("mtd: Add device-tree support to fsmc_nand"),
Device Tree support was added to the fmsc_nand driver. However, this
code has a bug in how it handles the bank-width DT property to set the
bus width.
Indeed, in the function fsmc_nand_probe_config_dt() that parses the
Device Tree, it sets pdata->width to either 8 or 16 depending on the
value of the bank-width DT property.
Then, the ->probe() function will test if pdata->width is equal to
FSMC_NAND_BW16 (which is 2) to set NAND_BUSWIDTH_16 in
nand->options. Therefore, with the DT probing, this condition will never
match.
This commit fixes that by removing the "width" field from
fsmc_nand_platform_data and instead have the fsmc_nand_probe_config_dt()
function directly set the appropriate nand->options value.
It is worth mentioning that if this commit gets backported to older
kernels, prior to the drop of non-DT probing, then non-DT probing will
be broken because nand->options will no longer be set to
NAND_BUSWIDTH_16.
Fixes: eea628199d5b ("mtd: Add device-tree support to fsmc_nand") Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
On brcmnand controller v6.x and v7.x, the #WP pin is controlled through
the NAND_WP bit in CS_SELECT register.
The driver currently assumes that toggling the #WP pin is
instantaneously enabling/disabling write-protection, but it actually
takes some time to propagate the new state to the internal NAND chip
logic. This behavior is sometime causing data corruptions when an
erase/program operation is executed before write-protection has really
been disabled.
Fixes: 27c5b17cd1b1 ("mtd: nand: add NAND driver "library" for Broadcom STB NAND controller") Signed-off-by: Kamal Dasu <kdasu.kdev@gmail.com> Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
hns_roce_v1_cq_set_ci() calls roce_set_bit() on an uninitialized field,
which will then change only a few of its bits, causing a warning with
the latest gcc:
infiniband/hw/hns/hns_roce_hw_v1.c: In function 'hns_roce_v1_cq_set_ci':
infiniband/hw/hns/hns_roce_hw_v1.c:1854:23: error: 'doorbell[1]' is used uninitialized in this function [-Werror=uninitialized]
roce_set_bit(doorbell[1], ROCEE_DB_OTHERS_H_ROCEE_DB_OTH_HW_SYNS_S, 1);
The code is actually correct since we always set all bits of the
port_vlan field, but gcc correctly points out that the first
access does contain uninitialized data.
This initializes the field to zero first before setting the
individual bits.
Pass-through devices to VM guest can get updated IRQ affinity
information via irq_set_affinity() when not running in guest mode.
Currently, AMD IOMMU driver in GA mode ignores the updated information
if the pass-through device is setup to use vAPIC regardless of guest_mode.
This could cause invalid interrupt remapping.
Also, the guest_mode bit should be set and cleared only when
SVM updates posted-interrupt interrupt remapping information.
In function amd_iommu_bind_pasid(), the control flow jumps
to label out_free when pasid_state->mm and mm is NULL. And
mmput(mm) is called. In function mmput(mm), mm is
referenced without validation. This will result in a NULL
dereference bug. This patch fixes the bug.
Signed-off-by: Pan Bian <bianpan2016@163.com> Fixes: f0aac63b873b ('iommu/amd: Don't hold a reference to mm_struct') Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Even if a host controller's CPU-side MMIO windows into PCI I/O space do
happen to leak into PCI memory space such that it might treat them as
peer addresses, trying to reserve the corresponding I/O space addresses
doesn't do anything to help solve that problem. Stop doing a silly thing.
Another deadlock path caused by recursive locking is reported. This
kind of issue was introduced since commit 743b5f1434f5 ("ocfs2: take
inode lock in ocfs2_iop_set/get_acl()"). Two deadlock paths have been
fixed by commit b891fa5024a9 ("ocfs2: fix deadlock issue when taking
inode lock at vfs entry points"). Yes, we intend to fix this kind of
case in incremental way, because it's hard to find out all possible
paths at once.
This one can be reproduced like this. On node1, cp a large file from
home directory to ocfs2 mountpoint. While on node2, run
setfacl/getfacl. Both nodes will hang up there. The backtraces:
Fix this one by using ocfs2_inode_{lock|unlock}_tracker, which is
exported by commit 439a36b8ef38 ("ocfs2/dlmglue: prepare tracking logic
to avoid recursive cluster lock").
Link: http://lkml.kernel.org/r/20170622014746.5815-1-zren@suse.com Fixes: 743b5f1434f5 ("ocfs2: take inode lock in ocfs2_iop_set/get_acl()") Signed-off-by: Eric Ren <zren@suse.com> Reported-by: Thomas Voegtle <tv@lio96.de> Tested-by: Thomas Voegtle <tv@lio96.de> Reviewed-by: Joseph Qi <jiangqi903@gmail.com> Cc: Mark Fasheh <mfasheh@versity.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Configfs is the interface for ocfs2-tools to set configure to kernel and
$configfs_dir/cluster/$clustername/heartbeat/dead_threshold is the one
used to configure heartbeat dead threshold. Kernel has a default value
of it but user can set O2CB_HEARTBEAT_THRESHOLD in /etc/sysconfig/o2cb
to override it.
Commit 45b997737a80 ("ocfs2/cluster: use per-attribute show and store
methods") changed heartbeat dead threshold name while ocfs2-tools did
not, so ocfs2-tools won't set this configurable and the default value is
always used. So revert it.
Fixes: 45b997737a80 ("ocfs2/cluster: use per-attribute show and store methods") Link: http://lkml.kernel.org/r/1490665245-15374-1-git-send-email-junxiao.bi@oracle.com Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com> Acked-by: Joseph Qi <jiangqi903@gmail.com> Cc: Mark Fasheh <mfasheh@versity.com> Cc: Joel Becker <jlbec@evilplan.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
flush_tlb_page() passes a bogus range to flush_tlb_others() and
expects the latter to fix it up. native_flush_tlb_others() has the
fixup but Xen's version doesn't. Move the fixup to
flush_tlb_others().
AFAICS the only real effect is that, without this fix, Xen would
flush everything instead of just the one page on remote vCPUs in
when flush_tlb_page() was called.
Signed-off-by: Andy Lutomirski <luto@kernel.org> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Nadav Amit <namit@vmware.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: e7b52ffd45a6 ("x86/flush_tlb: try flush_tlb_single one by one in flush_tlb_range") Link: http://lkml.kernel.org/r/10ed0e4dfea64daef10b87fb85df1746999b4dba.1492844372.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When this function fails it just sends a SIGSEGV signal to
user-space using force_sig(). This signal is missing
essential information about the cause, e.g. the trap_nr or
an error code.
Fix this by propagating the error to the only caller of
mpx_handle_bd_fault(), do_bounds(), which sends the correct
SIGSEGV signal to the process.
Signed-off-by: Joerg Roedel <jroedel@suse.de> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: fe3d197f84319 ('x86, mpx: On-demand kernel allocation of bounds tables') Link: http://lkml.kernel.org/r/1491488362-27198-1-git-send-email-joro@8bytes.org Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>