While adding support for loading the kernel and initrd above 4G to grub2 in
legacy mode, I was referring to efi_high_alloc().
That will allocate a buffer for the kernel and then one for the initrd, and
the initrd will use the kernel buffer start as its limit.
During testing I found that the two buffers overlap when the initrd is
very big, like 400M.
It turns out the efi_high_alloc() boundary checking is not right.
end - size will be the new start, so we should not compare the new
start with max; we need to make sure end does not exceed max.
[ Basically, with the current efi_high_alloc() code it's possible to
allocate memory above 'max', because efi_high_alloc() doesn't check
that the tail of the allocation is below 'max'.
If you have an EFI memory map with a single entry that looks like so,
[0xc0000000-0xc0004000]
And want to allocate 0x3000 bytes below 0xc0003000 the current code
will allocate [0xc0001000-0xc0004000], not [0xc0000000-0xc0003000]
like you would expect. - Matt ]
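The worked example above can be sketched in plain C. This is a reconstruction of the two checks as the text describes them, not a verbatim copy of the kernel source; the function names (pick_start_old, pick_start_fixed, round_down_pow2) are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Round x down to a multiple of align (align must be a power of two). */
uint64_t round_down_pow2(uint64_t x, uint64_t align)
{
	return x & ~(align - 1);
}

/* Assumed pre-fix logic: only the head of the allocation is tied to max. */
bool pick_start_old(uint64_t start, uint64_t end, uint64_t size,
		    uint64_t align, uint64_t max, uint64_t *out)
{
	if (start + size > end || start + size > max)
		return false;
	if (end - size > max)
		end = max;
	if (round_down_pow2(end - size, align) < start)
		return false;
	*out = round_down_pow2(end - size, align);
	return true;
}

/* Post-fix logic: clamp the region end to max before placing the buffer. */
bool pick_start_fixed(uint64_t start, uint64_t end, uint64_t size,
		      uint64_t align, uint64_t max, uint64_t *out)
{
	if (end > max)
		end = max;
	if (start + size > end)
		return false;
	if (round_down_pow2(end - size, align) < start)
		return false;
	*out = round_down_pow2(end - size, align);
	return true;
}
```

For the region [0xc0000000-0xc0004000], a size of 0x3000 and a max of 0xc0003000, the first variant places the buffer at 0xc0001000 (tail above max) and the second at 0xc0000000.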
Signed-off-by: Yinghai Lu <yinghai@kernel.org> Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Tested-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Matt Fleming <matt.fleming@intel.com>
[ luis: backported to 3.16:
- file rename: drivers/firmware/efi/libstub/efi-stub-helper.c ->
drivers/firmware/efi/efi-stub-helper.c ] Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Like the JMicron JMS567, enclosures with the JMS539 choke on report-opcodes,
so avoid it.
Tested-and-reported-by: Tom Arild Naess <tanaess@gmail.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
When a signal is delivered, the information in the siginfo structure
is copied to userspace. Good security practice dictates that the
unused fields in this structure should be initialized to 0 so that
random kernel stack data isn't exposed to the user. This patch adds
such an initialization to the two places where usbfs raises signals.
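A minimal sketch of the pattern, assuming a simplified stand-in structure: demo_siginfo and demo_fill_siginfo() are hypothetical names, and the real struct siginfo has a different layout. The point is only that the whole object is cleared before the fields in use are set.

```c
#include <string.h>

/* Hypothetical stand-in for the kernel's struct siginfo. */
struct demo_siginfo {
	int signo;
	int err;
	int code;
	void *addr;
	unsigned char unused[32];	/* fields this sender never sets */
};

/* Zero the whole structure first, then fill in only the fields we use. */
void demo_fill_siginfo(struct demo_siginfo *info, int signo, int err,
		       void *addr)
{
	memset(info, 0, sizeof(*info));	/* clears unused fields and padding */
	info->signo = signo;
	info->err = err;
	info->code = 0;
	info->addr = addr;
}
```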
Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Reported-by: Dave Mielke <dave@mielke.cc> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Include the high order bit fields for Max scratchpad buffers when
calculating how many scratchpad buffers are needed.
I'm surprised this hasn't caused more issues; we never allocated more than
32 buffers even if xhci needed more. Either we got lucky and xhci never
really used past that area, or we got enough zeroed dma memory anyway.
Should be backported as far back as possible.
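The calculation can be sketched as follows. The bit positions are an assumption based on my reading of the xHCI HCSPARAMS2 layout (low field in bits 31:27, high field in bits 25:21), and the function names are hypothetical.

```c
#include <stdint.h>

/* Low 5 bits only (assumed bits 31:27): the pre-fix calculation. */
unsigned int max_scratchpad_lo_only(uint32_t hcs_params2)
{
	return (hcs_params2 >> 27) & 0x1f;
}

/* High 5 bits (assumed bits 25:21) combined with the low 5 bits. */
unsigned int max_scratchpad(uint32_t hcs_params2)
{
	unsigned int hi = (hcs_params2 >> 21) & 0x1f;
	unsigned int lo = (hcs_params2 >> 27) & 0x1f;

	return (hi << 5) | lo;
}
```

A controller that reports hi = 1 and lo = 2 needs 34 buffers, but the low-only variant would allocate just 2.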
Reported-by: Tim Chen <tim.c.chen@linux.intel.com> Tested-by: Tim Chen <tim.c.chen@linux.intel.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
The commit 973747928514 ("usb: host: xhci-plat: add support for the Armada
375/38x XHCI controllers") extended the xhci-plat driver to support the Armada
375/38x SoCs, mostly by adding a quirk configuring the MBUS window.
However, that quirk was run before the clock the controller needs had been
enabled. This usually worked because the clock was first enabled by the
bootloader and left as such until the driver was probed, at which point it
tried to access the MBUS configuration registers before enabling the clock.
Things get messy when EPROBE_DEFER is involved during the probe, since as part
of its error path, the driver will rightfully disable the clock. When the
driver is reprobed, it tries to access the MBUS registers again, but this
time with the clock disabled, which hangs forever.
Fix this by running the quirks after the clock has been enabled by the driver.
Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ luis: backported to 3.16: adjusted context ] Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
This has been broken for a long time: it broke first in 2.6.35, then was
almost fixed in 2.6.36 but this one-liner slipped through the cracks.
The bug shows up as an infinite loop in Windows 7 (and newer) boot on
32-bit hosts without EPT.
Windows uses CMPXCHG8B to write to page tables, which causes a
page fault if running without EPT; the emulator is then called from
kvm_mmu_page_fault. The loop then happens if the higher 4 bytes are
not 0; the common case for this is that the NX bit (bit 63) is 1.
Fixes: 6550e1f165f384f3a46b60a1be9aba4bc3c2adad Fixes: 16518d5ada690643453eb0aef3cc7841d3623c2d Reported-by: Erik Rull <erik.rull@rdsoftware.de> Tested-by: Erik Rull <erik.rull@rdsoftware.de> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
The "Extended Compat ID OS Feature Descriptor Specification" does not
require the (sub)compatible ids to be NUL-terminated, because they
are placed in a fixed-size buffer and only unused parts of it should
contain NULs. If the buffer is fully utilized, there is no place for NULs.
Consequently, the code which uses desc->ext_compat_id never expects the
data contained to be NUL terminated.
If the compatible id is stored after sub-compatible id, and the compatible
id is full length (8 bytes), the (useless) NUL terminator overwrites the
first byte of the sub-compatible id.
If the sub-compatible id is full length (8 bytes), the (useless) NUL
terminator ends up out of the buffer. The situation can happen in the RNDIS
function, where the buffer is a part of struct f_rndis_opts. The next
member of struct f_rndis_opts is a mutex, so its first byte gets
overwritten. The said byte is a part of a member of the mutex which contains
the information on whether the mutex is locked or not. This can lead to a
deadlock, because, in a configfs-composed gadget when a function is linked
into a configuration with config_usb_cfg_link(), usb_get_function()
is called, which then calls rndis_alloc(), which tries locking the same
mutex and (wrongly) finds it already locked.
This patch eliminates NUL terminating of the (sub)compatible id.
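A small sketch of storing an id in a fixed-width field without a terminator. The struct and helper names (demo_ext_compat, demo_store_id) are hypothetical and do not mirror the configfs code; only the 8-byte field width comes from the description above.

```c
#include <string.h>

#define ID_LEN 8	/* fixed field width for each id */

struct demo_ext_compat {
	char compatible_id[ID_LEN];
	char sub_compatible_id[ID_LEN];
};

/* Store src in a fixed-width field: NUL-pad short ids, never terminate. */
void demo_store_id(char dst[ID_LEN], const char *src)
{
	size_t len = strlen(src);

	if (len > ID_LEN)
		len = ID_LEN;
	memset(dst, 0, ID_LEN);
	memcpy(dst, src, len);	/* no dst[len] = '\0' past the field */
}
```

With this shape a full-length compatible id leaves the first byte of the neighbouring field untouched.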
Fixes: da4243145fb1: "usb: gadget: configfs: OS Extended Compatibility descriptors support" Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Andrzej Pietrasiewicz <andrzej.p@samsung.com> Signed-off-by: Felipe Balbi <balbi@ti.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
In the wrapper the IRQ disable should be done by writing 1's to the
IRQ*_CLR register. Existing code is broken because it instead writes
zeros to IRQ*_SET register.
Fix this by adding functions dwc3_omap_write_irqmisc_clr() and
dwc3_omap_write_irq0_clr() which do the right thing.
Fixes: 72246da40f37 ("usb: Introduce DesignWare USB3 DRD Driver") Signed-off-by: George Cherian <george.cherian@ti.com> Signed-off-by: Felipe Balbi <balbi@ti.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Since commit c8231a9af8147f8a ("iio: mxs-lradc: compute temperature
from channel 8 and 9") with the removal of adc channel 9 there is
no 1-1 mapping in the channel spec.
All hwmon channel values above 9 are accessible via their index minus
one. So add a hidden iio channel 9 to fix this issue.
Signed-off-by: Stefan Wahren <stefan.wahren@i2se.com> Acked-by: Alexandre Belloni <alexandre.belloni@free-electrons.com> Reviewed-by: Marek Vasut <marex@denx.de> Signed-off-by: Jonathan Cameron <jic23@kernel.org> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
The intention is obviously to sign-extend a 12 bit quantity. But
because of C's promotion rules, the assignment is equivalent to "val16
&= 0xfff;". Use the proper API for this.
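To illustrate the promotion problem: the message does not quote the original expression, so the shift pair in broken_extend12() is an assumed stand-in, and demo_sign_extend() is a portable userspace sketch rather than the kernel helper itself.

```c
#include <stdint.h>

/*
 * Assumed broken idiom: the operands are promoted to int, so bit 11 is
 * never propagated and the result is just raw & 0xfff.
 */
int16_t broken_extend12(int16_t raw)
{
	return (int16_t)(((raw & 0xfff) << 4) >> 4);
}

/* Portable sign extension of the low (index + 1) bits, for index < 31. */
int32_t demo_sign_extend(uint32_t value, unsigned int index)
{
	uint32_t sign = UINT32_C(1) << index;

	value &= (sign << 1) - 1;	/* keep only the field itself */
	return (int32_t)(value ^ sign) - (int32_t)sign;
}
```

For the 12-bit pattern 0xfff the first function yields 4095, while sign extension at bit 11 yields -1.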
Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Acked-by: Lars-Peter Clausen <lars@metafoo.de> Signed-off-by: Jonathan Cameron <jic23@kernel.org> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Using the touchscreen while running buffered capture results in the
buffer reporting lots of wrong values, often just zeros. This is because
we push readings to the buffer every time a touchscreen interrupt
arrives, including when the buffer's own conversions have not yet
finished. So let's only push to the buffer when its conversions are
ready.
Signed-off-by: Kristina Martšenko <kristina.martsenko@gmail.com> Reviewed-by: Marek Vasut <marex@denx.de> Signed-off-by: Jonathan Cameron <jic23@kernel.org>
[ luis: backported to 3.16: adjusted context ] Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Reading a channel through sysfs, or starting a buffered capture, can
occasionally turn off the touchscreen.
This is because the read_raw() and buffer preenable()/postdisable()
callbacks unschedule current conversions on all channels. If a delay
channel happens to schedule a touchscreen conversion at the same time,
the conversion gets cancelled and the touchscreen sequence stops.
This is probably related to this note from the reference manual:
"If a delay group schedules channels to be sampled and a manual
write to the schedule field in CTRL0 occurs while the block is
discarding samples, the LRADC will switch to the new schedule
and will not sample the channels that were previously scheduled.
The time window for this to happen is very small and lasts only
while the LRADC is discarding samples."
So make the callbacks only unschedule conversions for the channels they
use. This means channel 0 for read_raw() and channels 0-5 for the buffer
(if the touchscreen is enabled). Since the touchscreen uses different
channels (6 and 7), it no longer gets turned off.
This is tested and fixes the issue on i.MX28, but hasn't been tested on
i.MX23.
Signed-off-by: Kristina Martšenko <kristina.martsenko@gmail.com> Reviewed-by: Marek Vasut <marex@denx.de> Signed-off-by: Jonathan Cameron <jic23@kernel.org> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Reading a channel through sysfs, or starting a buffered capture, will
currently turn off the touchscreen. This is because the read_raw() and
buffer preenable()/postdisable() callbacks disable interrupts for all
LRADC channels, including those the touchscreen uses.
So make the callbacks only disable interrupts for the channels they use.
This means channel 0 for read_raw() and channels 0-5 for the buffer (if
the touchscreen is enabled). Since the touchscreen uses different
channels (6 and 7), it no longer gets turned off.
Note that only i.MX28 is affected by this issue, i.MX23 should be fine.
Signed-off-by: Kristina Martšenko <kristina.martsenko@gmail.com> Reviewed-by: Marek Vasut <marex@denx.de> Signed-off-by: Jonathan Cameron <jic23@kernel.org> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
The touchscreen was initially designed [1] to map all of its physical
channels to one virtual channel, leaving buffered capture to use the
remaining 7 virtual channels. When the touchscreen was reimplemented
[2], it was made to use four virtual channels, which overlap and
conflict with the channels the buffer uses.
As a result, when the buffer is enabled, the touchscreen's virtual
channels are remapped to whichever physical channels the buffer was
configured with, causing the touchscreen to read those instead of the
touch measurement channels. Effectively the touchscreen stops working.
So here we separate the channels again, giving the touchscreen 2 virtual
channels and the buffer 6. We can't give the touchscreen just 1 channel
as before, as the current pressure calculation requires 2 channels to be
read at the same time.
This makes the touchscreen continue to work during buffered capture. It
has been tested on i.MX28, but not on i.MX23.
[1] 06ddd353f5c8 ("iio: mxs: Implement support for touchscreen")
[2] dee05308f602 ("Staging/iio/adc/touchscreen/MXS: add interrupt driven
touch detection")
Signed-off-by: Kristina Martšenko <kristina.martsenko@gmail.com> Reviewed-by: Marek Vasut <marex@denx.de> Signed-off-by: Jonathan Cameron <jic23@kernel.org> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Each inode of nilfs2 stores a root node of a b-tree, and it turned out to
have a memory overrun issue:
Each b-tree node of nilfs2 stores a set of key-value pairs and the number
of them (in "bn_nchildren" member of nilfs_btree_node struct), as well as
a few other "bn_*" members.
Since the value of "bn_nchildren" is used for operations on the key-values
within the b-tree node, it can cause memory access overrun if a large
number is incorrectly set to "bn_nchildren".
For instance, nilfs_btree_node_lookup() function determines the range of
binary search with it, and too large "bn_nchildren" leads
nilfs_btree_node_get_key() in that function to overrun.
As for intermediate b-tree nodes, this is prevented by a sanity check
performed when each node is read from a drive; however, no sanity check
has been done for root nodes stored in inodes.
This patch fixes the issue by adding the missing sanity check for b-tree
root nodes so that it's called when on-memory inodes are read from ifile,
the inode metadata file.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
The native (64-bit) sigval_t union contains sival_int (32-bit) and
sival_ptr (64-bit). When a compat application invokes a syscall that
takes a sigval_t value (as part of a larger structure, e.g.
compat_sys_mq_notify, compat_sys_timer_create), the compat_sigval_t
union is converted to the native sigval_t with sival_int overlapping
with either the least or the most significant half of sival_ptr,
depending on endianness. When the corresponding signal is delivered to a
compat application, on big endian the current (compat_uptr_t)sival_ptr
cast always returns 0 since sival_int corresponds to the top part of
sival_ptr. This patch fixes copy_siginfo_to_user32() so that sival_int
is copied to the compat_siginfo_t structure.
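The endianness dependence can be seen with a small userspace model. demo_sigval and the helper names are hypothetical; a uint64_t stands in for the 64-bit pointer member.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical model of the native 64-bit sigval_t. */
union demo_sigval {
	int32_t  sival_int;
	uint64_t sival_ptr;	/* stands in for a 64-bit void * */
};

/* Pre-fix approach: truncate the pointer member to 32 bits. */
uint32_t demo_copy_via_ptr(union demo_sigval v)
{
	return (uint32_t)v.sival_ptr;
}

/* Post-fix approach: copy the int member directly. */
uint32_t demo_copy_via_int(union demo_sigval v)
{
	return (uint32_t)v.sival_int;
}

/* Returns 1 on little-endian hosts, 0 on big-endian ones. */
int demo_is_little_endian(void)
{
	uint16_t probe = 1;
	unsigned char first;

	memcpy(&first, &probe, 1);
	return first == 1;
}
```

On a little-endian host both helpers return the stored value; on a big-endian host the pointer truncation returns 0 because sival_int sits in the upper half.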
This is essentially a partial revert of the commit [b1920c21102a:
'ALSA: hda - Enable runtime PM on Panther Point']. There was a bug
report showing the HD-audio bus hang during runtime PM on HP Spectre
XT.
When we walk the list of vma, or even for protecting against concurrent
framebuffer creation, we must hold the struct_mutex or else a second
thread can corrupt the list as we walk it.
References: https://bugs.freedesktop.org/show_bug.cgi?id=89085 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
The KSTK_EIP() and KSTK_ESP() macros should return the user program
counter (PC) and stack pointer (A0StP) of the given task. These are used
to determine which VMA corresponds to the user stack in
/proc/<pid>/maps, and for the user PC & A0StP in /proc/<pid>/stat.
However for Meta the PC & A0StP from the task's kernel context are used,
resulting in broken output. For example in following /proc/<pid>/maps
output, the 3afff000-3b021000 VMA should be described as the stack:
And in the following /proc/<pid>/stat output, the PC is in kernel code
(1074234964 = 0x40078654) and the A0StP is in the kernel heap
(1335981392 = 0x4fa17550):
Fix the definitions of KSTK_EIP() and KSTK_ESP() to use
task_pt_regs(tsk)->ctx rather than (tsk)->thread.kernel_context. This
gets the registers from the user context stored after the thread info at
the base of the kernel stack, which is from the last entry into the
kernel from userland, regardless of where in the kernel the task may
have been interrupted, which results in the following more correct
/proc/<pid>/maps output:
When a PCM drain is performed on an empty stream that is already in
the PREPARED state, the current code just ignores it and leaves the stream
as it is, although the drain is supposed to set all such streams to the
SETUP state. This patch covers that overlooked case.
Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
The gpio_chip operations receive a pointer to the gpio_chip struct which is
contained in the driver's private struct, yet the container_of call in those
functions points to the mfd struct defined in include/linux/mfd/tps65912.h.
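A sketch of why the container type matters, using hypothetical structures (demo_gpio_chip, demo_mfd, demo_gpio_priv) and a userspace container_of built on offsetof. The pointer arithmetic is only valid against the struct that actually embeds the member.

```c
#include <stddef.h>

#define demo_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct demo_gpio_chip {
	int ngpio;
};

/* Hypothetical MFD-level struct: it does not embed the gpio_chip. */
struct demo_mfd {
	int id;
};

/* Hypothetical driver-private struct that embeds the gpio_chip. */
struct demo_gpio_priv {
	struct demo_mfd *mfd;
	struct demo_gpio_chip gc;
};

/* Recover the private struct from the chip pointer passed to the ops. */
struct demo_gpio_priv *to_priv(struct demo_gpio_chip *gc)
{
	return demo_container_of(gc, struct demo_gpio_priv, gc);
}
```

The mfd struct is then reached through the private struct's pointer rather than by treating the chip pointer as if it lived inside the mfd struct.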
Signed-off-by: Nicolas Saenz Julienne <nicolassaenzj@gmail.com> Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Commit 7b8792bbdffd ("gpiolib: of: Correct error handling in
of_get_named_gpiod_flags") assumed that only one gpio-chip is registered
per of-node.
Some drivers register more than one chip per of-node, so
adjust the matching function of_gpiochip_find_and_xlate to
not stop looking for chips if a node-match is found and
the translation fails.
Fixes: 7b8792bbdffd ("gpiolib: of: Correct error handling in of_get_named_gpiod_flags") Signed-off-by: Hans Holmberg <hans.holmberg@intel.com> Acked-by: Alexandre Courbot <acourbot@nvidia.com> Tested-by: Robert Jarzmik <robert.jarzmik@free.fr> Tested-by: Tyler Hall <tylerwhall@gmail.com> Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
For filesystems without separate project quota inode field in the
superblock we just reuse project quota file for group quotas (and vice
versa) if project quota file is allocated and we need group quota file.
When we reuse the file, quota structures on disk suddenly have wrong
type stored in d_flags though. Nobody really cares about this (although
structure type reported to userspace was wrong as well) except
that after commit 14bf61ffe6ac (quota: Switch ->get_dqblk() and
->set_dqblk() to use bytes as space units) assertion in
xfs_qm_scall_getquota() started to trigger on xfs/106 test (apparently I
was testing without XFS_DEBUG so I didn't notice when submitting the
above commit).
Fix the problem by properly resetting ddq->d_flags when running quotacheck
for a quota file.
Reported-by: Al Viro <viro@ZenIV.linux.org.uk> Signed-off-by: Jan Kara <jack@suse.cz> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Currently the list is traversed using rcu variant. That is not correct
since dev_set_mac_address can be called which eventually calls
rtmsg_ifinfo_build_skb and there, skb allocation can sleep. So fix this
by removing the rcu usage here.
Fixes: 3d249d4ca7 "net: introduce ethernet teaming device" Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
1. For an IPv4 ping socket, ping_check_bind_addr does not check
the family of the socket address that's passed in. Instead,
make it behave like inet_bind, which enforces either that the
address family is AF_INET, or that the family is AF_UNSPEC and
the address is 0.0.0.0.
2. For an IPv6 ping socket, ping_check_bind_addr returns EINVAL
if the socket family is not AF_INET6. Return EAFNOSUPPORT
instead, for consistency with inet6_bind.
3. Make ping_v4_sendmsg and ping_v6_sendmsg return EAFNOSUPPORT
instead of EINVAL if an incorrect socket address structure is
passed in.
4. Make IPv6 ping sockets be IPv6-only. The code does not support
IPv4, and it cannot easily be made to support IPv4 because
the protocol numbers for ICMP and ICMPv6 are different. This
makes connect(::ffff:192.0.2.1) fail with EAFNOSUPPORT instead
of making the socket unusable.
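Item 1 can be sketched as a standalone check. demo_check_v4_bind_family() is a hypothetical userspace helper that models the family rule described for inet_bind; it is not the kernel function.

```c
#include <arpa/inet.h>
#include <errno.h>
#include <netinet/in.h>
#include <sys/socket.h>

/*
 * Accept AF_INET, or AF_UNSPEC only when the address is 0.0.0.0;
 * everything else is rejected with -EAFNOSUPPORT.
 */
int demo_check_v4_bind_family(const struct sockaddr_in *addr)
{
	if (addr->sin_family == AF_INET)
		return 0;
	if (addr->sin_family == AF_UNSPEC &&
	    addr->sin_addr.s_addr == htonl(INADDR_ANY))
		return 0;
	return -EAFNOSUPPORT;
}
```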
Among other things, this fixes an oops that can be triggered by:
Change-Id: If06ca86d9f1e4593c0d6df174caca3487c57a241 Signed-off-by: Lorenzo Colitti <lorenzo@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
[ luis: backported to 3.16: based on davem's backport to 3.14 ] Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
If an over-MTU UDP datagram is sent through a SOCK_RAW socket to a
UFO-capable device, ip_ufo_append_data() sets skb->ip_summed to
CHECKSUM_PARTIAL unconditionally as all GSO code assumes transport layer
checksum is to be computed on segmentation. However, in this case,
skb->csum_start and skb->csum_offset are never set as raw socket
transmit path bypasses udp_send_skb() where they are usually set. As a
result, driver may access invalid memory when trying to calculate the
checksum and store the result (as observed in virtio_net driver).
Moreover, the very idea of modifying the userspace provided UDP header
is IMHO against raw socket semantics (I wasn't able to find a document
clearly stating this or the opposite, though). And while allowing
CHECKSUM_NONE in the UFO case would be more efficient, it would be a bit
too intrusive change just to handle a corner case like this. Therefore
disallowing UFO for packets from sockets other than SOCK_DGRAM seems to be
the best option.
Signed-off-by: Michal Kubecek <mkubecek@suse.cz> Signed-off-by: David S. Miller <davem@davemloft.net>
[ luis: backported to 3.16: adjusted context ] Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
The National Instruments USB Host-to-Host Cable is based on the Prolific
PL-25A1 chipset. Add its VID/PID so the plusb driver will recognize it.
Signed-off-by: Ben Shelton <ben.shelton@ni.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Neighbour code assumes headroom to push Ethernet header is
at least 16 bytes.
It appears macvtap has only 14 bytes available on arches
where NET_IP_ALIGN is 0 (like x86).
The effect is a corruption of 2 bytes right before skb->head,
and possible crashes if accessing non-existent memory.
This fix should also increase IPv4 performance, as paranoid code
in ip_finish_output2() won't have to call skb_realloc_headroom().
Reported-by: Brian Rak <brak@vultr.com> Tested-by: Brian Rak <brak@vultr.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
With commit a7526eb5d06b (net: Unbreak compat_sys_{send,recv}msg), the
MSG_CMSG_COMPAT flag is blocked at the compat syscall entry points,
changing the kernel compat behaviour from the one before the commit it
was trying to fix (1be374a0518a, net: Block MSG_CMSG_COMPAT in
send(m)msg and recv(m)msg).
On 32-bit kernels (!CONFIG_COMPAT), MSG_CMSG_COMPAT is 0 and the native
32-bit sys_sendmsg() allows flag 0x80000000 to be set (it is ignored by
the kernel). However, on a 64-bit kernel, the compat ABI is different
with commit a7526eb5d06b.
This patch changes the compat_sys_{send,recv}msg behaviour to the one
prior to commit 1be374a0518a.
The problem was found running 32-bit LTP (sendmsg01) binary on an arm64
kernel. Arguably, LTP should not pass 0xffffffff as flags to sendmsg()
but the general rule is not to break user ABI (even when the user
behaviour is not entirely sane).
Fixes: a7526eb5d06b (net: Unbreak compat_sys_{send,recv}msg) Cc: Andy Lutomirski <luto@amacapital.net> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
[ luis: backported to 3.16: adjusted context ] Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
CPU0                                    CPU1
team_port_del
  team_upper_dev_unlink
    priv_flags &= ~IFF_TEAM_PORT
                                        team_handle_frame
                                          team_port_get_rcu
                                            team_port_exists
                                              priv_flags & IFF_TEAM_PORT == 0
                                            return NULL (instead of port got
                                                         from rx_handler_data)
  netdev_rx_handler_unregister
The thing is that the flag is removed before rx_handler is unregistered.
If team_handle_frame is called in between, team_port_exists returns 0
and team_port_get_rcu will return NULL.
So do not check the flag here. It is guaranteed by netdev_rx_handler_unregister
that team_handle_frame will always see valid rx_handler_data pointer.
Signed-off-by: Jiri Pirko <jiri@resnulli.us> Fixes: 3d249d4ca7d0 ("net: introduce ethernet teaming device") Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Colons are used as a separator in netdev device lookup in dev_ioctl.c.
Specific functions are SIOCGIFTXQLEN, SIOCETHTOOL and SIOCSIFNAME.
Signed-off-by: Matthew Thode <mthode@mthode.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Open vSwitch allows moving an internal vport to a different namespace
while it is still connected to the bridge. But when the namespace is
deleted, OVS does not detach these vports, which results in a dangling
pointer to the netdevice and causes a kernel panic as follows.
This issue is fixed by detaching all ovs ports from the deleted
namespace at net-exit.
Reported-by: Assaf Muller <amuller@redhat.com> Fixes: 46df7b81454("openvswitch: Add support for network namespaces.") Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Reviewed-by: Thomas Graf <tgraf@noironetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
In tcf_em_validate(), after calling request_module() to load the
kind-specific module, set em->ops to NULL before returning -EAGAIN, so
that module_put() is not called again by tcf_em_tree_destroy().
Signed-off-by: Ignacy Gawędzki <ignacy.gawedzki@green-communications.fr> Acked-by: Cong Wang <cwang@twopensource.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
phy_init_eee uses phy_find_setting(phydev->speed, phydev->duplex)
to find a valid entry in the settings array for the given speed
and duplex value. For full duplex 1000baseT, this will return
the first matching entry, which is the entry for 1000baseKX_Full.
If the phy eee does not support 1000baseKX_Full, this entry will not
match, causing phy_init_eee to fail for no good reason.
Fixes: 9a9c56cb34e6 ("net: phy: fix a bug when verify the EEE support") Fixes: 3e7077067e80c ("phy: Expand phy speed/duplex settings array") Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com> Signed-off-by: Guenter Roeck <linux@roeck-us.net> Acked-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
[ luis: backported to 3.16: adjusted context ] Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
ip_check_defrag() may be used by af_packet to defragment outgoing packets.
skb_network_offset() of af_packet's outgoing packets is not zero.
Signed-off-by: Alexander Drozdov <al.drozdov@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
skb_copy_bits() returns zero on success and negative value on error,
so it is needed to invert the condition in ip_check_defrag().
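A sketch of the return convention, with demo_copy_bits() as a hypothetical stand-in for skb_copy_bits() operating on a flat buffer. The caller treats only a negative value as failure.

```c
#include <errno.h>
#include <string.h>

/* Hypothetical stand-in: returns 0 on success, negative errno on error. */
int demo_copy_bits(const unsigned char *buf, size_t buf_len,
		   size_t offset, void *to, size_t len)
{
	if (offset > buf_len || len > buf_len - offset)
		return -EFAULT;
	memcpy(to, buf + offset, len);
	return 0;
}

/* Returns 1 if a header of hdr_len bytes could be read, 0 otherwise. */
int demo_header_readable(const unsigned char *buf, size_t buf_len,
			 size_t offset, void *hdr, size_t hdr_len)
{
	/* Failure is "< 0"; testing "!ret" would treat success as failure. */
	if (demo_copy_bits(buf, buf_len, offset, hdr, hdr_len) < 0)
		return 0;
	return 1;
}
```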
Fixes: 1bf3751ec90c ("ipv4: ip_check_defrag must not modify skb before unsharing") Signed-off-by: Alexander Drozdov <al.drozdov@gmail.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
The gnet_stats_copy_app() function gets called, more often than not, with its
second argument a pointer to an automatic variable in the caller's stack.
Therefore, to avoid copying garbage afterwards when calling
gnet_stats_finish_copy(), this data is better copied to a dynamically allocated
memory that gets freed after use.
[xiyou.wangcong@gmail.com: remove a useless kfree()]
Signed-off-by: Ignacy Gawędzki <ignacy.gawedzki@green-communications.fr> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Ignacy reported that when eth0 is down and we add a vlan device
on top of it like:
ip link add link eth0 name eth0.1 up type vlan id 1
We will get a refcount leak:
unregister_netdevice: waiting for eth0.1 to become free. Usage count = 2
The problem is that when rtnl_configure_link() fails in rtnl_newlink(),
we simply call unregister_netdevice(), but for a stacked device like vlan,
we do almost nothing when we unregister the upper device; more work
is done when we unregister the lower device, so call its ->dellink().
Reported-by: Ignacy Gawedzki <ignacy.gawedzki@green-communications.fr> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
[ luis: backported to 3.16: based on davem's backport to 3.14 ] Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
ipv6_cow_metrics() currently assumes only DST_HOST routes require
dynamic metrics allocation from inetpeer. The assumption breaks
when ndisc discovers a router with RTAX_MTU and RTAX_HOPLIMIT metrics.
Refer to ndisc_router_discovery() in ndisc.c and note that dst_metric_set()
is called after the route is created.
This patch creates the metrics array (by calling dst_cow_metrics_generic) in
ipv6_cow_metrics().
Before:
[root@qemu1 ~]# ip -6 r show | egrep -v unreachable
fd00:face:face:face::/64 dev eth0 proto kernel metric 256 expires 27sec
fe80::/64 dev eth0 proto kernel metric 256
default via fe80::74df:d0ff:fe23:8ef2 dev eth0 proto ra metric 1024 expires 27sec
After:
[root@qemu1 ~]# ip -6 r show | egrep -v unreachable
fd00:face:face:face::/64 dev eth0 proto kernel metric 256 expires 27sec mtu 1300
fe80::/64 dev eth0 proto kernel metric 256 mtu 1300
default via fe80::74df:d0ff:fe23:8ef2 dev eth0 proto ra metric 1024 expires 27sec mtu 1300 hoplimit 30
Fixes: 8e2ec639173f325 (ipv6: don't use inetpeer to store metrics for routes.) Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
IPv6 can keep a copy of the SYN message using skb_get() in
tcp_v6_conn_request() so that the caller won't free the skb when calling
kfree_skb() later.
Therefore TCP fast open has to clone the skb it is queuing in
child->sk_receive_queue, as all skbs consumed from receive_queue are
freed using __kfree_skb() (i.e. assuming skb->users == 1).
Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Yuchung Cheng <ycheng@google.com> Fixes: 5b7ed0892f2af ("tcp: move fastopen functions to tcp_fastopen.c") Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
ifla_vf_policy[] is wrong in advertising its individual member types as
NLA_BINARY since .type = NLA_BINARY in combination with .len declares the
len member as *max* attribute length [0, len].
The issue is that when do_setvfinfo() is being called to set up a VF
through ndo handler, we could set corrupted data if the attribute length
is less than the size of the related structure itself.
The intent is exactly the opposite, namely to make sure to pass at least
data of minimum size of len.
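The difference between the two length semantics can be sketched like this. The enum, struct and function below are hypothetical simplifications of netlink policy validation, written only to show the maximum-versus-minimum distinction described above.

```c
#include <errno.h>

enum demo_nla_type { DEMO_NLA_UNSPEC, DEMO_NLA_BINARY };

struct demo_nla_policy {
	enum demo_nla_type type;
	int len;
};

/* Validate a payload length against a policy entry. */
int demo_validate_len(const struct demo_nla_policy *pt, int attrlen)
{
	if (pt->type == DEMO_NLA_BINARY) {
		/* len is an upper bound: anything in [0, len] passes. */
		if (pt->len && attrlen > pt->len)
			return -ERANGE;
		return 0;
	}
	/* Otherwise len is a lower bound on the payload size. */
	if (attrlen < pt->len)
		return -ERANGE;
	return 0;
}
```

With a 32-byte structure, a 4-byte payload passes the binary policy but is rejected by the minimum-length one.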
Fixes: ebc08a6f47ee ("rtnetlink: Add VF config code to rtnetlink") Cc: Mitch Williams <mitch.a.williams@intel.com> Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
This patch fixes two issues in UDP checksum computation in pktgen.
First, the pseudo-header uses the source and destination IP
addresses. Currently, the ports are used for IPv4.
Second, the UDP checksum covers both header and data. So we need to
generate the data earlier (move pktgen_finalize_skb up), and compute
the checksum for UDP header + data.
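Both points can be illustrated with a userspace sketch of the IPv4 UDP checksum: the pseudo-header is built from the IP addresses, and the sum runs over the UDP header plus data. The demo_* names are hypothetical, and this is not the pktgen code.

```c
#include <stddef.h>
#include <stdint.h>

/* Add len bytes to a running sum of 16-bit big-endian words. */
uint32_t demo_sum16(const uint8_t *p, size_t len, uint32_t acc)
{
	while (len > 1) {
		acc += ((uint32_t)p[0] << 8) | p[1];
		p += 2;
		len -= 2;
	}
	if (len)
		acc += (uint32_t)p[0] << 8;	/* pad odd trailing byte */
	return acc;
}

/* Fold carries and return the one's complement of the sum. */
uint16_t demo_fold(uint32_t acc)
{
	while (acc >> 16)
		acc = (acc & 0xffff) + (acc >> 16);
	return (uint16_t)~acc;
}

/*
 * Complemented sum over the IPv4 pseudo-header (addresses, protocol 17,
 * UDP length) followed by the UDP header and payload. The checksum
 * field inside udp must be zero when generating a checksum.
 */
uint16_t demo_udp4_sum(const uint8_t saddr[4], const uint8_t daddr[4],
		       const uint8_t *udp, uint16_t udp_len)
{
	uint8_t pseudo[12] = {
		saddr[0], saddr[1], saddr[2], saddr[3],
		daddr[0], daddr[1], daddr[2], daddr[3],
		0, 17, (uint8_t)(udp_len >> 8), (uint8_t)(udp_len & 0xff),
	};
	uint32_t acc = demo_sum16(pseudo, sizeof(pseudo), 0);

	return demo_fold(demo_sum16(udp, udp_len, acc));
}
```

Once the result is written into the checksum field, running the same function over the packet again returns 0, which is how a receiver verifies it.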
Fixes: c26bf4a51308c ("pktgen: Add UDPCSUM flag to support UDP checksums") Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Acked-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
flow_cache_flush_task references a structure member flow_cache_gc_work
where it should reference flow_cache_flush_task instead.
Kernel panic occurs on kernels using IPsec during XFRM garbage
collection. The garbage collection interval can be shortened using the
following sysctl settings:
With the default settings, our production servers crash approximately
once a week. With the settings above, they crash immediately.
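Why naming the wrong member is fatal can be sketched in userspace C. The macro below is a simplified rendition of container_of(), and the structure and function names are hypothetical, chosen only to mirror the situation described above.

```c
#include <stddef.h>

/* Simplified userspace rendition of the kernel's container_of(). */
#define container_of_example(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct work_example {
	int pending;
};

/* Hypothetical structure with two embedded work items. */
struct xfrm_example {
	int id;
	struct work_example gc_work;
	struct work_example flush_work;
};

/* Correct: the member named matches the work item that was queued. */
struct xfrm_example *from_flush_work(struct work_example *w)
{
	return container_of_example(w, struct xfrm_example, flush_work);
}

/* Buggy: wrong member, so the result is offset from the real container. */
struct xfrm_example *from_flush_work_buggy(struct work_example *w)
{
	return container_of_example(w, struct xfrm_example, gc_work);
}
```

The buggy variant returns a pointer that is shifted by the distance between the two members, so every field read through it is garbage.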
Fixes: ca925cf1534e ("flowcache: Make flow cache name space aware") Reported-by: Tomáš Charvát <tc@excello.cz> Tested-by: Jan Hejl <jh@excello.cz> Signed-off-by: Miroslav Urbanek <mu@miroslavurbanek.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
ip_vs_conn_fill_param_sync() gets in param.pe a module
reference for persistence engine from __ip_vs_pe_getbyname()
but forgets to put it. Problem occurs in backup for
sync protocol v1 (2.6.39).
Also, pe_data usually comes in sync messages for
connection templates and ip_vs_conn_new() copies
the pointer only in this case. Make sure pe_data
is not leaked if it comes unexpectedly for normal
connections. Leak can happen only if bogus messages
are sent to backup server.
Fixes: fe5e7a1efb66 ("IPVS: Backup, Adding Version 1 receive capability") Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
As soon as extract_icmp6_fields() returns, its local storage (automatic
variables) is deallocated and can be overwritten.
Let's add an additional parameter to make sure storage is valid long
enough.
While we are at it, add some const qualifiers.
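The pattern of passing storage in from the caller can be sketched as follows. This is an illustration only: struct addr_example and extract_addr_example() are hypothetical and do not reproduce the netfilter code.

```c
#include <stddef.h>
#include <string.h>

struct addr_example {
	unsigned char bytes[16];
};

/*
 * Copy the bytes into caller-owned storage and return a pointer to it,
 * so the result remains valid after this function has returned.
 * Returns NULL if the packet is too short.
 */
const struct addr_example *extract_addr_example(const unsigned char *pkt,
						size_t len,
						struct addr_example *storage)
{
	if (len < sizeof(storage->bytes))
		return NULL;
	memcpy(storage->bytes, pkt, sizeof(storage->bytes));
	return storage;
}
```

Had the function copied into one of its own automatic variables and returned that address, the caller would be left with a dangling pointer.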
Signed-off-by: Eric Dumazet <edumazet@google.com> Fixes: b64c9256a9b76 ("tproxy: added IPv6 support to the socket match") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
commit f5a41847acc5 ("ipvs: move ip_route_me_harder for ICMP")
from 2.6.37 introduced ip_route_me_harder() call for responses to
local clients, so that we can provide valid rt_src after SNAT.
It was used by TCP to provide valid daddr for ip_send_reply().
After commit 0a5ebb8000c5 ("ipv4: Pass explicit daddr arg to
ip_send_reply().") from 3.0 this rerouting is not needed anymore
and should be avoided, especially in LOCAL_IN.
Fixes 3.12.33 crash in xfrm reported by Florian Wiessner:
"3.12.33 - BUG xfrm_selector_match+0x25/0x2f6"
Reported-by: Andreas Schultz <aschultz@tpip.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Currently the maximum space limit that a quota format supports is expressed
in blocks. Since we store space limits in bytes, this is somewhat confusing,
so store the maximum limit in bytes as well. Also rename the field to match
the new unit, and rename the related inode field to match the new naming scheme.
Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jan Kara <jack@suse.cz>
[ luis: backported to 3.16: adjusted context ] Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
src_net points to the netns where the netlink message has been received. This
netns may be different from the netns where the interface is created (because
the user may add IFLA_NET_NS_[PID|FD]). In this case, src_net is the link netns.
It seems wrong to override the netns in the newlink() handler because if it
was not already src_net, it means that the user explicitly asks to create the
netdevice in another netns.
CC: Sjur Brændeland <sjur.brandeland@stericsson.com> CC: Dmitry Tarnyagin <dmitry.tarnyagin@lockless.no> Fixes: 8391c4aab1aa ("caif: Bugfixes in CAIF netdevice for close and flow control") Fixes: c41254006377 ("caif-hsi: Add rtnl support") Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Fixes: http://bugs.elinux.org/issues/127
The bb.org community was seeing random reboots before this change.
Signed-off-by: Robert Nelson <robertcnelson@gmail.com> Reviewed-by: Felipe Balbi <balbi@ti.com> Acked-by: Felipe Balbi <balbi@ti.com> Signed-off-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Lockless access to pte in pagemap_pte_range() might race with page
migration and trigger BUG_ON(!PageLocked()) in migration_entry_to_page():
CPU A (pagemap)                         CPU B (migration)

                                        lock_page()
                                        try_to_unmap(page, TTU_MIGRATION...)
                                             make_migration_entry()
                                             set_pte_at()
<read *pte>
pte_to_pagemap_entry()
                                        remove_migration_ptes()
                                        unlock_page()
    if(is_migration_entry())
        migration_entry_to_page()
            BUG_ON(!PageLocked(page))
Also lockless read might be non-atomic if pte is larger than wordsize.
Other pte walkers (smaps, numa_maps, clear_refs) already lock ptes.
Fixes: 052fb0d635df ("proc: report file/anon bit in /proc/pid/pagemap") Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Reported-by: Andrey Ryabinin <a.ryabinin@samsung.com> Reviewed-by: Cyrill Gorcunov <gorcunov@openvz.org> Acked-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
If a /proc/pid/pagemap read spans a [VMA, an unmapped region, then a
VM_SOFTDIRTY VMA], the virtual pages in the unmapped region are reported
as softdirty. Here's a program to demonstrate the bug:
It appears that the Cintiq Companion Hybrid does not send an ABS_MISC event to
userspace when any of its ExpressKeys are pressed. This is not strictly
necessary now that the pad exists on its own device, but should be fixed for
consistency's sake.
Traditionally both the stylus and pad shared the same device node, and
xf86-input-wacom would use ABS_MISC for disambiguation. Not sending this causes
the Hybrid to behave incorrectly with xf86-input-wacom beginning with its
8f44f3 commit.
Signed-off-by: Jason Gerecke <killertofu@gmail.com> Reviewed-by: Benjamin Tissoires <benjamin.tissoires@redhat.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
[killertofu@gmail.com: ported to drivers/input/tablet/wacom_wac.c] Signed-off-by: Jason Gerecke <killertofu@gmail.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
For all PLLs on sun6i, n == 0 means use a multiplier of 1, rather than 0 as
it does on sun4i / sun5i / sun7i. n_start = 1 is already correctly set
for sun6i pll6, but was missing for pll1; this commit fixes this.
Cc: Chen-Yu Tsai <wens@csie.org> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
This code in vhost_scsi_make_tpg() is confusing because we limit "tpgt"
to UINT_MAX but the data type of "tpg->tport_tpgt" is u16.
I looked at the context and it turns out that in
vhost_scsi_set_endpoint(), "tpg->tport_tpgt" is used as an offset into
the vs_tpg[] array which has VHOST_SCSI_MAX_TARGET (256) elements, so
anything higher than 255 is invalid. I have made that the limit
now.
In vhost_scsi_send_evt() we mask away values higher than 255, but now
that the limit has changed, we don't need the mask.
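A userspace sketch of the tightened bound is shown below. It is not the driver code: parse_tpgt_example(), the "tpgt_" name prefix and MAX_TARGET_EXAMPLE are assumptions made for illustration, with the constant mirroring the 256-element array mentioned above.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

#define MAX_TARGET_EXAMPLE 256UL /* assumed to mirror VHOST_SCSI_MAX_TARGET */

/*
 * Parse "tpgt_<n>" and reject any value that would index past an array
 * of MAX_TARGET_EXAMPLE elements. Returns 0 on success, -EINVAL otherwise.
 */
int parse_tpgt_example(const char *name, unsigned short *tpgt)
{
	char *end;
	unsigned long val;

	if (strncmp(name, "tpgt_", 5) != 0)
		return -EINVAL;

	errno = 0;
	val = strtoul(name + 5, &end, 10);
	if (end == name + 5 || *end != '\0' || errno == ERANGE)
		return -EINVAL;
	if (val >= MAX_TARGET_EXAMPLE)
		return -EINVAL;

	*tpgt = (unsigned short)val;
	return 0;
}
```

Once the value is validated at creation time, later users of the index no longer need to mask it.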
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
[ luis: backported to 3.16: functions rename:
- tcm_vhost_send_evt -> vhost_scsi_send_evt
- tcm_vhost_make_tpg -> vhost_scsi_make_tpg ] Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
For PGR reservation of type Write Exclusive Access, allow all non
reservation holding I_T nexuses with active registrations to READ
from the device.
This addresses a bug where active registrations that attempted
to READ would result in a reservation conflict.
Signed-off-by: Lee Duncan <lduncan@suse.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
This patch changes core_scsi3_pro_release() logic to allow an
existing AllRegistrants type reservation to be re-reserved by
any registered I_T nexus.
This addresses an issue where AllRegistrants type RESERVE was
receiving RESERVATION_CONFLICT status if dev_pr_res_holder did
not match the same I_T nexus, instead of just returning GOOD
status following spc4r34 Section 5.9.9:
"If the device server receives a PERSISTENT RESERVE OUT command
with RESERVE service action where the TYPE field and the SCOPE
field contain the same values as the existing type and scope
from a persistent reservation holder, it shall not make any
change to the existing persistent reservation and shall complete
the command with GOOD status."
Reported-by: Ilias Tsitsimpis <i.tsitsimpis@gmail.com> Cc: Ilias Tsitsimpis <i.tsitsimpis@gmail.com> Cc: Lee Duncan <lduncan@suse.com> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
This patch fixes an issue with AllRegistrants reservations where
an unregister operation by the I_T nexus reservation holder would
incorrectly drop the reservation, instead of waiting until the
last active I_T nexus is unregistered as per SPC-4.
This includes updating __core_scsi3_complete_pro_release() to reset
dev->dev_pr_res_holder with another pr_reg for this special case,
as well as a new 'unreg' parameter to determine when the release
is occurring from an implicit unregister, vs. explicit RELEASE.
It also adds special handling in core_scsi3_free_pr_reg_from_nacl()
to release the left-over pr_res_holder, now that pr_reg is deleted
from pr_reg_list within __core_scsi3_complete_pro_release().
Reported-by: Ilias Tsitsimpis <i.tsitsimpis@gmail.com> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
This patch fixes the usage of R_HOLDER bit for an All Registrants
reservation in READ_FULL_STATUS, where only the registration who
issued RESERVE was being reported as having an active reservation.
It changes core_scsi3_pri_read_full_status() to check ahead of the
list walk of active registrations to see if All Registrants is active,
and if so set R_HOLDER bit and scope/type fields for all active
registrations.
Reported-by: Ilias Tsitsimpis <i.tsitsimpis@gmail.com> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
The hardware range code values and list of valid ranges for the AI
subdevice are incorrect for several supported boards. The hardware range
code values for all boards except PCI-DAS4020/12 are determined by
calling `ai_range_bits_6xxx()` based on the maximum voltage of the range
and whether it is bipolar or unipolar, however it only returns the
correct hardware range code for the PCI-DAS60xx boards. For
PCI-DAS6402/16 (and /12) it returns the wrong code for the unipolar
ranges. For PCI-DAS64/Mx/16 it returns the wrong code for all the
ranges and the comedi range table is incorrect.
Change `ai_range_bits_6xxx()` to use a look-up table pointed to by new
member `ai_range_codes` of `struct pcidas64_board` to map the comedi
range table indices to the hardware range codes. Use a new comedi range
table for the PCI-DAS64/Mx/16 boards (and the commented out variants).
Signed-off-by: Ian Abbott <abbotti@mev.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Increase si2168 cmd execute timeout to prevent firmware load failures. Tests
show it takes up to 52ms to load the 'dvb-demod-si2168-a30-01.fw' firmware.
Increase timeout to a safe value of 70ms.
Signed-off-by: Jurgen Kramer <gtmkramer@xs4all.nl> Reviewed-by: Antti Palosaari <crope@iki.fi> Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
The MLX4_PROT_IB_IPV4 protocol should only be used with RoCEv2 and such.
Removing this wrong usage allows multicast applications to run over RoCE.
Fixes: d487ee77740c ("IB/mlx4: Use IBoE (RoCE) IP based GIDs in the port GID table") Reported-by: Carol Soto <clsoto@linux.vnet.ibm.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
The deadlock occurs in __uverbs_modify_qp: we take a lock (idr_read_qp)
and in case of failure in ib_resolve_eth_l2_attrs we don't release
it (put_qp_read). Fix that.
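The shape of the fix, a single exit path that always drops the read lock, can be sketched like this. Everything here is hypothetical: the reader counter stands in for the lock, and resolve_rc simulates the result of the address resolution step.

```c
/* Hypothetical object whose reader count stands in for the read lock. */
struct obj_example {
	int readers;
};

void get_read_example(struct obj_example *o)
{
	o->readers++;
}

void put_read_example(struct obj_example *o)
{
	o->readers--;
}

/* resolve_rc simulates the return value of the resolution step. */
int modify_example(struct obj_example *o, int resolve_rc)
{
	int ret;

	get_read_example(o);

	ret = resolve_rc;
	if (ret)
		goto out_put; /* the error path must still drop the lock */

	/* ... the actual modification would happen here ... */
	ret = 0;

out_put:
	put_read_example(o);
	return ret;
}
```

Returning directly from the failure branch instead of jumping to out_put would leave the reader count elevated, which is the leak that leads to the deadlock.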
Fixes: ed4c54e5b4ba ("IB/core: Resolve Ethernet L2 addresses when modifying QP") Signed-off-by: Moshe Lazer <moshel@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
It seems that the same problems which lead to adding an rfkill blacklist and
putting the Lenovo Yoga 2 11 on it are also present on the Lenovo Yoga 2 13
and Lenovo Yoga 2 Pro too:
https://bugzilla.redhat.com/show_bug.cgi?id=1021036
https://forums.lenovo.com/t5/Linux-Discussion/Yoga-2-13-not-Pro-Linux-Warning/m-p/1517612
Testing has shown that the firmware rfkill settings are persistent over
reboots. So blacklisting the driver is not good enough; if the wifi is blocked
at the firmware level the wifi needs to be explicitly unblocked through the
ideapad-laptop interface.
And at least on the Lenovo Yoga 2 13 the VPCCMD_RF register, which on devices
with a hardware kill switch reports the hardware switch state, needs to be
explicitly set to 1 (radio enabled / not blocked).
So this patch does 3 things to get proper rfkill handling on these models:
1) Instead of blacklisting the rfkill functionality, which means that people
with a firmware blocked wifi get stuck in that situation, ignore the value
reported by the not present hardware rfkill switch, as this is what is causing
ideapad-laptop to wrongly report all radios as hardware blocked. But do register
the rfkill interfaces so that the user can soft [un]block them.
2) On models without a hardware rfkill switch, explicitly set VPCCMD_RF to 1
3) Drop the " 11" postfix from the dmi match string, as the entire Yoga 2
series is affected.
Yoga 2 11: Reported-and-tested-by: Vincent Gerris <vgerris@gmail.com>
Yoga 2 13: Tested-by: madls05 <http://ubuntuforums.org/showthread.php?t=2215044>
Yoga 2 Pro: Reported-and-tested-by: Peter F. Patel-Schneider <pfpschneider@gmail.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Matthew Garrett <matthew.garrett@nebula.com> Cc: Gaudenz Steinlin <gaudenz@debian.org> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
It has been reported that generating an MLD listener report on
devices with large MTUs (e.g. 9000) and a high number of IPv6
addresses can trigger a skb_over_panic():
mld_newpack() skb allocations are usually requested with dev->mtu
in size, since commit 72e09ad107e7 ("ipv6: avoid high order allocations")
we have changed the limit in order to be less likely to fail.
However, in MLD/IGMP code, we have some rather ugly AVAILABLE(skb)
macros, which determine if we may end up doing an skb_put() for
adding another record. To avoid possible fragmentation, we check
the skb's tailroom as skb->dev->mtu - skb->len, which is a wrong
assumption as the actual max allocation size can be much smaller.
The IGMP case doesn't have this issue as commit 57e1ab6eaddc
("igmp: refine skb allocations") stores the allocation size in
the cb[].
Set a reserved_tailroom to make it fit into the MTU and use
skb_availroom() helper instead. This also allows us to get rid of
igmp_skb_size().
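The arithmetic behind the reserved tailroom can be sketched with a simplified buffer model. struct buf_example and the three helpers are hypothetical; real skbs also have headroom and paged data, which this sketch ignores.

```c
#include <stddef.h>

struct buf_example {
	size_t size;              /* bytes actually allocated for data */
	size_t len;               /* bytes already used */
	size_t reserved_tailroom; /* bytes at the end that must stay unused */
};

/* Reserve whatever part of the allocation lies beyond the MTU. */
void reserve_for_mtu_example(struct buf_example *b, size_t mtu)
{
	size_t usable = mtu < b->size ? mtu : b->size;

	b->reserved_tailroom = b->size - usable;
}

/* Room that may still be filled: bounded by allocation and by MTU. */
size_t availroom_example(const struct buf_example *b)
{
	size_t used = b->len + b->reserved_tailroom;

	return used < b->size ? b->size - used : 0;
}

/* The old estimate: assumes the buffer is as large as the MTU. */
size_t mtu_estimate_example(const struct buf_example *b, size_t mtu)
{
	return mtu > b->len ? mtu - b->len : 0;
}
```

For a 4096-byte allocation with 100 bytes used and an MTU of 9000, the old estimate claims 8900 bytes are available while only 3996 exist. With an MTU of 1500 both approaches agree on 1400.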
Reported-by: Wei Liu <lw1a2.jing@gmail.com> Fixes: 72e09ad107e7 ("ipv6: avoid high order allocations") Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Cc: Eric Dumazet <edumazet@google.com> Cc: Hannes Frederic Sowa <hannes@stressinduktion.org> Cc: David L Stevens <david.stevens@oracle.com> Acked-by: Eric Dumazet <edumazet@google.com> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
I.e. one-to-many sockets in SCTP are not required to explicitly
call into connect(2) or sctp_connectx(2) prior to data exchange.
Instead, they can directly invoke sendmsg(2) and the SCTP stack
will automatically trigger connection establishment through 4WHS
via sctp_primitive_ASSOCIATE(). However, this in its current
implementation is racy: INIT is being sent out immediately (as
it cannot be bundled anyway) and the rest of the DATA chunks are
queued up for later xmit when connection is established, meaning
sendmsg(2) will return successfully. This behaviour can result
in an undesired side-effect that the kernel made the application
think the data has already been transmitted, although none of it
has actually left the machine, worst case even after close(2)'ing
the socket.
Instead, when the association from client side has been shut down
e.g. first gracefully through SCTP_EOF and then close(2), the
client could afterwards still receive the server's INIT_ACK due
to a connection with higher latency. This INIT_ACK is then considered
out of the blue and hence responded with ABORT as there was no
alive assoc found anymore. This can be easily reproduced f.e.
with sctp_test application from lksctp. One way to fix this race
is to wait for the handshake to actually complete.
The fix defers waiting after sctp_primitive_ASSOCIATE() and
sctp_primitive_SEND() succeeded, so that DATA chunks cooked up
from sctp_sendmsg() have already been placed into the output
queue through the side-effect interpreter, and therefore can then
be bundled together with COOKIE_ECHO control chunks.
Looks like this bug is from the pre-git history museum. ;)
Fixes: 08707d5482df ("lksctp-2_5_31-0_5_1.patch") Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Acked-by: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
[ luis: backported to 3.16: adjusted context ] Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
In recent testing I had disabled CONFIG_IP_MULTIPLE_TABLES and as a result
when I ran "cat /proc/net/fib_trie" the main trie was displayed multiple
times. I found that the problem line of code was in the function
fib_trie_seq_next. Specifically the line below caused the indexes to go in
the opposite direction of our traversal:
h = tb->tb_id & (FIB_TABLE_HASHSZ - 1);
The issue was that the RT tables are defined such that RT_TABLE_LOCAL is ID
255, while it is located at TABLE_LOCAL_INDEX of 0, and RT_TABLE_MAIN is 254
with a TABLE_MAIN_INDEX of 1. This means that the above line will return 1
for the local table and 0 for main. The result is that fib_trie_seq_next
will return NULL at the end of the local table, fib_trie_seq_start will
return the start of the main table, and then fib_trie_seq_next will loop on
main forever as h will always return 0.
The fix for this is to reverse the ordering of the two tables. It has the
advantage of making it so that the tables now print in the same order
regardless of whether multiple tables are enabled. In order to make the
definition consistent with the multiple tables case I simply masked the two
RT_TABLE_XXX values by (FIB_TABLE_HASHSZ - 1). This way the two table
layouts should always stay consistent.
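The index arithmetic is easy to check by hand. The sketch below assumes a hash size of 2 for the single-table configuration, which matches the values 1 and 0 quoted above; the macro and function names are hypothetical.

```c
#define RT_TABLE_MAIN_EXAMPLE  254u
#define RT_TABLE_LOCAL_EXAMPLE 255u
#define HASHSZ_EXAMPLE         2u   /* assumed hash size without multiple tables */

/* Same masking as the line quoted from fib_trie_seq_next(). */
unsigned int table_hash_example(unsigned int tb_id)
{
	return tb_id & (HASHSZ_EXAMPLE - 1);
}
```

255 & 1 is 1 and 254 & 1 is 0, so the hash places local at slot 1 and main at slot 0, the reverse of the old fixed indices. Deriving the indices from the same mask keeps the two in agreement.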
Fixes: 93456b6 ("[IPV4]: Unify access to the routing tables") Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
d1c7e29e8d27 (HID: i2c-hid: prevent buffer overflow in early IRQ)
changed hid_get_input() to read ihid->bufsize bytes, which can be
more than wMaxInputLength. This is the case with the Dell XPS 13
9343, and it is causing events to be missed. In some cases the
missed events are releases, which can cause the cursor to jump or
freeze, among other problems. Limit the number of bytes read to
min(wMaxInputLength, ihid->bufsize) to prevent such problems.
Fixes: d1c7e29e8d27 "HID: i2c-hid: prevent buffer overflow in early IRQ" Signed-off-by: Seth Forshee <seth.forshee@canonical.com> Reviewed-by: Benjamin Tissoires <benjamin.tissoires@redhat.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Max unacked packets/bytes is an int while sizeof(long) was used in the
sysctl table.
This means that when they were getting read we'd also leak kernel memory
to userspace along with the timeout values.
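How an oversized length exposes neighbouring data can be sketched in userspace. The structure and helper are hypothetical; the real sysctl table and its handlers are not reproduced here.

```c
#include <stddef.h>
#include <string.h>

struct tunables_example {
	int max_unacked; /* the value the entry is meant to expose */
	int neighbour;   /* unrelated data that happens to sit next to it */
};

/*
 * Copy maxlen bytes starting at the tunable into out and return how many
 * of those bytes belong to whatever follows the tunable. maxlen is capped
 * at the size of the structure so the copy stays in bounds.
 */
size_t bytes_leaked_example(const struct tunables_example *t, size_t maxlen,
			    unsigned char *out)
{
	if (maxlen > sizeof(*t))
		maxlen = sizeof(*t);
	memcpy(out, t, maxlen);
	return maxlen > sizeof(t->max_unacked) ?
	       maxlen - sizeof(t->max_unacked) : 0;
}
```

With maxlen set to sizeof(int) nothing extra is copied. With sizeof(long) on a platform where long is wider than int, the extra bytes come from the adjacent field.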
Signed-off-by: Sasha Levin <sasha.levin@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net> Cc: Moritz Muehlenhoff <jmm@inutil.org> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
The timeout entries are sizeof(int) rather than sizeof(long), which
means that when they were getting read we'd also leak kernel memory
to userspace along with the timeout values.
Signed-off-by: Sasha Levin <sasha.levin@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net> Cc: Moritz Muehlenhoff <jmm@inutil.org> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Fixes: e01580bf9e ("gfs2: use generic posix ACL infrastructure") Reported-by: Eric Meddaugh <etmsys@rit.edu> Tested-by: Eric Meddaugh <etmsys@rit.edu> Signed-off-by: Andrew Elble <aweits@rit.edu> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Then we use it as an offset into an array with SNDRV_CARDS elements.
if (!request_region(joystick_port[dev], 8, "Riptide gameport")) {
This has 3 effects:
1) If you use the module option to specify the joystick port then it has
to be shifted one space over.
2) The wrong error message will be printed on failure if you have over
32 cards.
3) Static checkers will correctly complain that we are off by one.
When calling early_setup(), we pick up "boot_paca" for the master CPU
and initialize it with initialise_paca(). At that point, the SLB
shadow buffer isn't populated yet. Updating the SLB shadow buffer would
corrupt what we had in physical address 0 where the trap instruction is
usually stored.
This hasn't been observed to cause any trouble in practice, but is
obviously fishy.
Fixes: 6f4441ef7009 ("powerpc: Dynamically allocate slb_shadow from memblock") Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
In case the CLK_GATE_HIWORD_MASK flag is passed to clk_register_gate(), the bit #
should be no higher than 15, however the corresponding check is obviously
off-by-one.
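Why 15 is the highest usable bit can be sketched as follows, assuming the usual hiword layout in which the upper 16 bits of the register act as a write-enable mask for the lower 16. The function name is hypothetical.

```c
#include <stdint.h>

/*
 * Build the 32-bit word to write for a hiword-mask gate. Returns 0 and
 * stores the word in *val, or -1 if bit_idx does not fit in the low half.
 */
int hiword_gate_value_example(unsigned int bit_idx, int enable, uint32_t *val)
{
	if (bit_idx > 15)
		return -1; /* bit_idx + 16 would not fit in a 32-bit register */

	/* The upper half selects which bit is affected by this write. */
	*val = UINT32_C(1) << (bit_idx + 16);
	if (enable)
		*val |= UINT32_C(1) << bit_idx;
	return 0;
}
```

A check against 16 instead of 15 would let bit 16 through, and its mask bit would have to live at position 32.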
Fixes: 045779942c04 ("clk: gate: add CLK_GATE_HIWORD_MASK") Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Signed-off-by: Michael Turquette <mturquette@linaro.org> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
The "> 0" here should be ">= 0" so we free map_entries[0].
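A reverse cleanup loop of this kind can be sketched in userspace. The function is hypothetical and uses free() in place of the kernel's release routine.

```c
#include <stdlib.h>

/* Free entries[count-1] down to entries[0]; returns how many were released. */
int free_entries_example(void **entries, int count)
{
	int i, released = 0;

	for (i = count - 1; i >= 0; i--) { /* ">= 0" so index 0 is included */
		free(entries[i]);
		entries[i] = NULL;
		released++;
	}
	return released;
}
```

With "i > 0" as the condition the loop would stop one step early and the first entry would be leaked.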
Fixes: 926172d46038 ('efi: Export EFI runtime memory mapping to sysfs') Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Dave Young <dyoung@redhat.com> Signed-off-by: Matt Fleming <matt.fleming@intel.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
If the call to devm_kzalloc() fails, nothing needs to be cleaned up.
This was missed before because gpio_rcar_probe() had a "return"
statement after the first "goto err0".
Commit f6b2a04590bb ("ASoC: pxa: mioa701_wm9713: Convert to table based DAPM
setup") converted the driver to register the board level DAPM elements with
the card's DAPM context rather than the CODEC's DAPM context. The change
overlooked that the speaker widget event callback accesses the widget's
codec field which is only valid if the widget has been registered in a CODEC
DAPM context. This patch modifies the callback to take an alternative route
to get the CODEC.
Fixes: f6b2a04590bb ("ASoC: pxa: mioa701_wm9713: Convert to table based DAPM
setup") Signed-off-by: Lars-Peter Clausen <lars@metafoo.de> Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
As it is, we have debugfs_remove() racing with symlink traversals.
Supply ->evict_inode() and do freeing there - inode will remain
pinned until we are done with the symlink body.
And rip the idiocy with checking if dentry is positive right after
we'd verified debugfs_positive(), which is a stronger check...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
When reading blkio.throttle.io_serviced in a recently created blkio
cgroup, it's possible to race against the creation of a throttle policy,
which delays the allocation of stats_cpu.
Like other functions in the throttle code, just checking for a NULL
stats_cpu prevents the following oops caused by that race.
The output of the KDB 'summary' command should report MemTotal, MemFree
and Buffers in kB. The current code reports them in units of pages.
A define of K(x) as
is defined in the code, but not used.
This patch applies the define to convert the values to kB.
Please include me on Cc on replies. I do not subscribe to linux-kernel.
Signed-off-by: Jay Lan <jlan@sgi.com> Signed-off-by: Jason Wessel <jason.wessel@windriver.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>