regulator_disable of pbias always writes '0' to the enable_reg.
However actual disable value of pbias regulator is not always '0'.
Fix it by populating the disable_val in pbias_reg_info for the
various platforms and assign it to the disable_val of
pbias regulator descriptor. This will be used by
regulator_disable_regmap while disabling pbias regulator.
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com> Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Don't set .read_flag_mask for adav803, it's for adav801 only.
Fixes: 0c2d69645628 ("ASoC: adav80x: Split SPI and I2C code into different modules") Signed-off-by: Axel Lin <axel.lin@ingics.com> Acked-by: Lars-Peter Clausen <lars@metafoo.de> Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Before the make_empty_dir_inode calls were introduce into proc, sysfs,
and sysctl those directories when stated reported an i_size of 0.
make_empty_dir_inode started reporting an i_size of 2. At least one
userspace application depended on stat returning i_size of 0. So
modify make_empty_dir_inode to cause an i_size of 0 to be reported for
these directories.
Reported-by: Tejun Heo <tj@kernel.org> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
In the logic in the initial commit of unshare made creating a new
thread group for a process, contingent upon creating a new memory
address space for that process. That is wrong. Two separate
processes in different thread groups can share a memory address space
and clone allows creation of such proceses.
This is significant because it was observed that mm_users > 1 does not
mean that a process is multi-threaded, as reading /proc/PID/maps
temporarily increments mm_users, which allows other processes to
(accidentally) interfere with unshare() calls.
Correct the check in check_unshare_flags() to test for
!thread_group_empty() for CLONE_THREAD, CLONE_SIGHAND, and CLONE_VM.
For sighand->count > 1 for CLONE_SIGHAND and CLONE_VM.
For !current_is_single_threaded instead of mm_users > 1 for CLONE_VM.
By using the correct checks in unshare this removes the possibility of
an accidental denial of service attack.
Additionally using the correct checks in unshare ensures that only an
explicit unshare(CLONE_VM) can possibly trigger the slow path of
current_is_single_threaded(). As an explict unshare(CLONE_VM) is
pointless it is not expected there are many applications that make
that call.
Fixes: b2e0d98705e60e45bbb3c0032c48824ad7ae0704 userns: Implement unshare of the user namespace Reported-by: Ricky Zhou <rickyz@chromium.org> Reported-by: Kees Cook <keescook@chromium.org> Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
NFSv4 takes O_EXCL as a sign that a setattr command should be sent,
probably to reset the timestamps.
When it was an O_RDONLY open, the SETATTR command does not
identify any actual attributes to change.
If no delegation was provided to the open, the SETATTR uses the
all-zeros stateid and the request is accepted (at least by the
Linux NFS server - no harm, no foul).
If a read-delegation was provided, this is used in the SETATTR
request, and a Netapp filer will justifiably claim
NFS4ERR_BAD_STATEID, which the Linux client takes as a sign
to retry - indefinitely.
So only treat O_EXCL specially if O_CREAT was also given.
Negative return values are not supported by iio_event_poll since
its return type is unsigned int.
Fixes: f18e7a068a0a3 ("iio: Return -ENODEV for file operations if the device has been unregistered") Signed-off-by: Cristina Opriceana <cristina.opriceana@gmail.com> Signed-off-by: Jonathan Cameron <jic23@kernel.org> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
The power for line out was not turned on when line out is enabled.
So we add "LOUT amp" widget to turn on the power for line out.
Signed-off-by: John Lin <john.lin@realtek.com> Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Like some of the other Yoga models the Lenovo Yoga 3 14 does not have a
hw rfkill switch, and trying to read the hw rfkill switch through the
ideapad module causes it to always reported blocking breaking wifi.
This commit adds the Lenovo Yoga 3 14 to the no_hw_rfkill dmi list, fixing
the wifi breakage.
BugLink: https://bugzilla.redhat.com/show_bug.cgi?id=1239050 Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Darren Hart <dvhart@linux.intel.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Add inverse unit conversion macro to convert from standard IIO units to
units that might be used by some devices.
Those are useful in combination with scale factors that are specified as
IIO_VAL_FRACTIONAL. Typically the denominator for those specifications will
contain the maximum raw value the sensor will generate and the numerator
the value it maps to in a specific unit. Sometimes datasheets specify those
in different units than the standard IIO units (e.g. degree/s instead of
rad/s) and so we need to do a unit conversion.
From a mathematical point of view it does not make a difference whether we
apply the unit conversion to the numerator or the inverse unit conversion
to the denominator since (x / y) / z = x / (y * z). But as the denominator
is typically a larger value and we are rounding both the numerator and
denominator to integer values using the later method gives us a better
precision (E.g. the relative error is smaller if we round 8000.3 to 8000
rather than rounding 8.3 to 8).
This is where in inverse unit conversion macros will be used.
Marked for stable as used by some upcoming fixes.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de> Signed-off-by: Jonathan Cameron <jic23@kernel.org> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
parport_find_base() will implicitly do parport_get_port() which
increases the refcount. Then parport_register_device() will again
increment the refcount. But while unloading the module we are only
doing parport_unregister_device() decrementing the refcount only once.
We add an parport_put_port() to neutralize the effect of
parport_get_port().
We got the bug that qemu complained with "KVM: unknown exit, hardware
reason 31" and KVM shown these info:
[84245.284948] EPT: Misconfiguration.
[84245.285056] EPT: GPA: 0xfeda848
[84245.285154] ept_misconfig_inspect_spte: spte 0x5eaef50107 level 4
[84245.285344] ept_misconfig_inspect_spte: spte 0x5f5fadc107 level 3
[84245.285532] ept_misconfig_inspect_spte: spte 0x5141d18107 level 2
[84245.285723] ept_misconfig_inspect_spte: spte 0x52e40dad77 level 1
This is because we got a mmio #PF and the handler see the mmio spte becomes
normal (points to the ram page)
However, this is valid after introducing fast mmio spte invalidation which
increases the generation-number instead of zapping mmio sptes, a example
is as follows:
1. QEMU drops mmio region by adding a new memslot
2. invalidate all mmio sptes
3.
VCPU 0 VCPU 1
access the invalid mmio spte
access the region originally was MMIO before
set the spte to the normal ram map
mmio #PF
check the spte and see it becomes normal ram mapping !!!
This patch fixes the bug just by dropping the check in mmio handler, it's
good for backport. Full check will be introduced in later patches
Reported-by: Pavel Shirshov <ru.pchel@gmail.com> Tested-by: Pavel Shirshov <ru.pchel@gmail.com> Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Pericom PI7C9X795[1248] are Uno/Dual/Quad/Octal UART devices, this
patch enables them, also defines PCI_VENDOR_ID_PERICOM here.
Signed-off-by: Adam Lee <adam.lee@canonical.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ luis: backported to 3.16: adjusted context ] Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Signed-off-by: Peter Chen <peter.chen@freescale.com> Signed-off-by: Felipe Balbi <balbi@ti.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Signed-off-by: Peter Chen <peter.chen@freescale.com> Signed-off-by: Felipe Balbi <balbi@ti.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
DWC3 uses bounce buffer to handle non max packet aligned OUT transfers and
the size of bounce buffer is 512 bytes. However if the host initiates OUT
transfers of size more than 512 bytes (and non max packet aligned), the
driver throws a WARN dump but still programs the TRB to receive more than
512 bytes. This will cause bounce buffer to overflow and corrupt the
adjacent memory locations which can be fatal.
Fix it by programming the TRB to receive a maximum of DWC3_EP0_BOUNCE_SIZE
(512) bytes.
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com> Signed-off-by: Felipe Balbi <balbi@ti.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
The TSADC gate clock was used in Exynos4x12 DTSI for exynos-adc driver.
However TSADC is present only on Exynos4210 so on Trats2 board (with
Exynos4412 SoC) the exynos-adc driver could not be probed:
ERROR: could not get clock /adc@126C0000:adc(0)
exynos-adc 126c0000.adc: failed getting clock, err = -2
exynos-adc: probe of 126c0000.adc failed with error -2
Instead on Exynos4x12 SoCs the main clock used by Analog to Digital
Converter is located in different register and it is named in datasheet
as PCLK_ADC. Regardless of the name the purpose of this PCLK_ADC clock
is the same as purpose of TSADC from Exynos4210.
The patch adds gate clock for Exynos4x12 using the proper register so
backward compatibility is preserved. This fixes the probe of exynos-adc
driver on Exynos4x12 boards and allows accessing sensors connected to it
on Trats2 board (ntc,ncp15wb473 AP and battery thermistors).
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com> Fixes: c63c57433003 ("ARM: dts: Add ADC's dt data to read raw data for exynos4x12") Reviewed-by: Javier Martinez Canillas <javier.martinez@collabora.co.uk> Acked-by: Tomasz Figa <tomasz.figa@gmail.com> Signed-off-by: Stephen Boyd <sboyd@codeaurora.org> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
`devpriv->ao_timer` is used while an asynchronous command is running on
the AO subdevice. It also gets modified by the subdevice's `cmdtest`
handler for checking new asynchronous commands,
`usbduxsigma_ao_cmdtest()`, which is not correct as it's allowed to
check new commands while an old command is still running. Fix it by
moving the code which sets up `devpriv->ao_timer` into the subdevice's
`cmd` handler, `usbduxsigma_ao_cmd()`.
** This backported patch also moves the code that sets up
`devpriv->ao_sample_count` from `usbduxsigma_ao_cmdtest()` to
`usbduxsigma_ao_cmd()` for the same reason as above. (This was not
needed in the upstream commit.) **
Note that the removed code in `usbduxsigma_ao_cmdtest()` checked that
`devpriv->ao_timer` did not end up less that 1, but that could not
happen due because `cmd->scan_begin_arg` or `cmd->convert_arg` had
already been range-checked.
Also note that we tested the `high_speed` variable in the old code, but
that is currently always 0 and means that we always use "scan" timing
(`cmd->scan_begin_src == TRIG_TIMER` and `cmd->convert_src == TRIG_NOW`)
and never "convert" (individual sample) timing (`cmd->scan_begin_src ==
TRIG_FOLLOW` and `cmd->convert_src == TRIG_TIMER`). The moved code
tests `cmd->convert_src` instead to decide whether "scan" or "convert"
timing is being used, although currently only "scan" timing is
supported.
Fixes: fb1ef622e7a3 ("staging: comedi: usbduxsigma: tidy up analog output command support") Signed-off-by: Ian Abbott <abbotti@mev.co.uk> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
`devpriv->ai_timer` is used while an asynchronous command is running on
the AI subdevice. It also gets modified by the subdevice's `cmdtest`
handler for checking new asynchronous commands
(`usbduxsigma_ai_cmdtest()`), which is not correct as it's allowed to
check new commands while an old command is still running. Fix it by
moving the code which sets up `devpriv->ai_timer` and
`devpriv->ai_interval` into the subdevice's `cmd` handler,
`usbduxsigma_ai_cmd()`.
** This backported patch also moves the code that sets up
`devpriv->ai_sample_count` from `usbduxsigma_ai_cmdtest()` to
`usbduxsigma_ai_cmd()` for the same reason as above. (This was not
needed in the upstream commit.) **
Note that the removed code in `usbduxsigma_ai_cmdtest()` checked that
`devpriv->ai_timer` did not end up less than than 1, but that could not
happen because `cmd->scan_begin_arg` had already been checked to be at
least the minimum required value (at least when `cmd->scan_begin_src ==
TRIG_TIMER`, which had also been checked to be the case).
Fixes: b986be8527c7 ("staging: comedi: usbduxsigma: tidy up analog input command support) Signed-off-by: Ian Abbott <abbotti@mev.co.uk> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Set the PCI_DEV_FLAGS_VPD_REF_F0 flag on all Intel Ethernet device
functions other than function 0, so that on multi-function devices, we will
always read VPD from function 0 instead of from the other functions.
[bhelgaas: changelog] Signed-off-by: Mark Rustad <mark.d.rustad@intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Alexander Duyck <alexander.h.duyck@redhat.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Add a dev_flags bit, PCI_DEV_FLAGS_VPD_REF_F0, to access VPD through
function 0 to provide VPD access on other functions. This is for hardware
devices that provide copies of the same VPD capability registers in
multiple functions. Because the kernel expects that each function has its
own registers, both the locking and the state tracking are affected by VPD
accesses to different functions.
On such devices for example, if a VPD write is performed on function 0,
*any* later attempt to read VPD from any other function of that device will
hang. This has to do with how the kernel tracks the expected value of the
F bit per function.
Concurrent accesses to different functions of the same device can not only
hang but also corrupt both read and write VPD data.
When hangs occur, typically the error message:
vpd r/w failed. This is likely a firmware bug on this device.
will be seen.
Never set this bit on function 0 or there will be an infinite recursion.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Alexander Duyck <alexander.h.duyck@redhat.com>
[ luis: backported to 3.16: adjusted context ] Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
We already set a station to be associated when peering completes, both
in user space and in the kernel. Thus we should always have an
associated sta before sending data frames to that station.
Failure to check assoc state can cause crashes in the lower-level driver
due to transmitting unicast data frames before driver sta structures
(e.g. ampdu state in ath9k) are initialized. This occurred when
forwarding in the presence of fixed mesh paths: frames were transmitted
to stations with whom we hadn't yet completed peering.
Reported-by: Alexis Green <agreen@cococorp.com> Tested-by: Jesse Jones <jjones@cococorp.com> Signed-off-by: Bob Copeland <me@bobcopeland.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Legacy IPs like PWMSS, present under l4per2_7xx_clkdm, cannot support
smart-idle when its clock domain is in HW_AUTO on DRA7 SoCs. Hence,
program clock domain to SW_WKUP.
Signed-off-by: Vignesh R <vigneshr@ti.com> Acked-by: Tero Kristo <t-kristo@ti.com> Reviewed-by: Paul Walmsley <paul@pwsan.com> Signed-off-by: Paul Walmsley <paul@pwsan.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Userspace return code may skip restoring THREADPTR register if there are
no registers that need to be zeroed. This leads to spurious failures in
libc NPTL tests.
Always restore THREADPTR on return to userspace.
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Change all occurrences of be16 to le16 in cp2112_xfer(),
because SMBUS words are little endian, not big endian.
Signed-off-by: Ellen Wang <ellen@cumulusnetworks.com> Signed-off-by: Jiri Kosina <jkosina@suse.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
When doing an I2C_SMBUS_BYTE write (one byte write, no address),
the data to be written is in "command" not "data->byte".
Signed-off-by: Ellen Wang <ellen@cumulusnetworks.com> Acked-by: Wolfram Sang <wsa@the-dreams.de> Reviewed-by: Antonio Borneo <borneo.antonio@gmail.com> Signed-off-by: Jiri Kosina <jkosina@suse.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
The input_dev is already gone when the rc device is being unregistered
so checking for its presence only means that no remove uevent will be
generated.
Signed-off-by: David Härdeman <david@hardeman.nu> Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Commit 813f5c0ac5cc ("media: Change media device link_notify behaviour")
modified the media controller link setup notification API and updated the
OMAP3 ISP driver accordingly. As a side effect it introduced a bug by
turning power on after setting the link instead of before. This results in
sub-devices not being powered down in some cases when they should be. Fix
it.
Fixes: 813f5c0ac5cc [media] media: Change media device link_notify behaviour Signed-off-by: Sakari Ailus <sakari.ailus@iki.fi> Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
virtio declares support for NETIF_F_FRAGLIST, but assumes
that there are at most MAX_SKB_FRAGS + 2 fragments which isn't
always true with a fraglist.
A longer fraglist in the skb will make the call to skb_to_sgvec overflow
the sg array, leading to memory corruption.
Drop NETIF_F_FRAGLIST so we only get what we can handle.
Cc: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Cc: Moritz Muehlenhoff <jmm@debian.org> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Currently we don't check if the new MTU is valid or not and this allows
one to configure a smaller than minimum allowed by RFCs or even bigger
than interface own MTU, which is a problem as it may lead to packet
drops.
If you have a daemon like NetworkManager running, this may be exploited
by remote attackers by forging RA packets with an invalid MTU, possibly
leading to a DoS. (NetworkManager currently only validates for values
too small, but not for too big ones.)
The fix is just to make sure the new value is valid. That is, between
IPV6_MIN_MTU and interface's MTU.
Note that similar check is already performed at
ndisc_router_discovery(), for when kernel itself parses the RA.
Signed-off-by: Marcelo Ricardo Leitner <mleitner@redhat.com> Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: David S. Miller <davem@davemloft.net> Cc: Moritz Muehlenhoff <jmm@debian.org> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
This function can leak kernel stack data when the user siginfo_t has a
positive si_code value. The top 16 bits of si_code descibe which fields
in the siginfo_t union are active, but they are treated inconsistently
between copy_siginfo_from_user32, copy_siginfo_to_user32 and
copy_siginfo_to_user.
copy_siginfo_from_user32 is called from rt_sigqueueinfo and
rt_tgsigqueueinfo in which the user has full control overthe top 16 bits
of si_code.
This fixes the following information leaks:
x86: 8 bytes leaked when sending a signal from a 32-bit process to
itself. This leak grows to 16 bytes if the process uses x32.
(si_code = __SI_CHLD)
x86: 100 bytes leaked when sending a signal from a 32-bit process to
a 64-bit process. (si_code = -1)
sparc: 4 bytes leaked when sending a signal from a 32-bit process to a
64-bit process. (si_code = any)
parsic and s390 have similar bugs, but they are not vulnerable because
rt_[tg]sigqueueinfo have checks that prevent sending a positive si_code
to a different process. These bugs are also fixed for consistency.
Signed-off-by: Amanieu d'Antras <amanieu@gmail.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Russell King <rmk@arm.linux.org.uk> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Chris Metcalf <cmetcalf@ezchip.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
This function may copy the si_addr_lsb, si_lower and si_upper fields to
user mode when they haven't been initialized, which can leak kernel
stack data to user mode.
Just checking the value of si_code is insufficient because the same
si_code value is shared between multiple signals. This is solved by
checking the value of si_signo in addition to si_code.
Signed-off-by: Amanieu d'Antras <amanieu@gmail.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Russell King <rmk@arm.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
[ luis: backported to 3.16:
- dropped 2nd hunk in kernel/signal.c ] Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
This function may copy the si_addr_lsb field to user mode when it hasn't
been initialized, which can leak kernel stack data to user mode.
Just checking the value of si_code is insufficient because the same
si_code value is shared between multiple signals. This is solved by
checking the value of si_signo in addition to si_code.
While gntdev_release() is called the MMU notifier is still registered
and can traverse priv->maps list even if no pages are mapped (which is
the case -- gntdev_release() is called after all). But
gntdev_release() will clear that list, so make sure that only one of
those things happens at the same time.
Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com> Cc: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Fix this error when compiling with CONFIG_SMP=n and
CONFIG_DYNAMIC_DEBUG=y:
drivers/s390/char/sclp_early.c: In function 'sclp_read_info_early':
drivers/s390/char/sclp_early.c:87:19: error: 'EBUSY' undeclared (first use in this function)
} while (rc == -EBUSY);
^
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Stephen Powell <zlinuxman@wowway.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
When a tunnel is deleted, the cached dst entry should be released.
This problem may prevent the removal of a netns (seen with a x-netns IPv6
gre tunnel):
unregister_netdevice: waiting for lo to become free. Usage count = 3
CC: Dmitry Kozlov <xeb@mail.ru> Fixes: c12b395a4664 ("gre: Support GRE over IPv6") Signed-off-by: huaibin Wang <huaibin.wang@6wind.com> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
In pfkey_sendmsg the net mutex is taken and then pfkey_broadcast takes
the RCU lock.
Since pfkey_broadcast takes the RCU lock the allocation argument is
pointless since GFP_ATOMIC must be used between the rcu_read_{,un}lock.
The one call outside of rcu can be done with GFP_KERNEL.
Fixes: 7f6b9dbd5afbd ("af_key: locking change") Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
The tt_local_entry deletion performed in batadv_tt_local_remove() was neither
protecting against simultaneous deletes nor checking whether the element was
still part of the list before calling hlist_del_rcu().
Replacing the hlist_del_rcu() call with batadv_hash_remove() provides adequate
protection via hash spinlocks as well as an is-element-still-in-hash check to
avoid 'blind' hash removal.
Fixes: 068ee6e204e1 ("batman-adv: roaming handling mechanism redesign") Reported-by: alfonsname@web.de Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch> Signed-off-by: Antonio Quartulli <antonio@meshcoding.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
When injecting a fault into a misbehaving 32bit guest, it seems
rather idiotic to also inject a 64bit fault that is only going
to corrupt the guest state. This leads to a situation where we
perform an illegal exception return at EL2 causing the host
to crash instead of killing the guest.
Just fix the stupid bug that has been there from day 1.
Reported-by: Russell King <rmk+kernel@arm.linux.org.uk> Tested-by: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
The Crucial M500 is known to have issues with queued TRIM commands, the
factory recertified SSDs use a different model number naming convention
which causes them to get ignored by the blacklist.
The new naming convention boils down to: s/Crucial_/FC/
Signed-off-by: Guillermo A. Amaral <g@maral.me> Signed-off-by: Tejun Heo <tj@kernel.org>
[ luis: backported to 3.16:
- dropped ATA_HORKAGE_ZERO_AFTER_TRIM flag
- adjusted context ] Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
The fix for deadlock in PM in commit [1ee23fe07ee8: ALSA: usb-audio:
Fix deadlocks at resuming] introduced a new check of in_pm flag.
However, the brainless patch author evaluated it in a wrong way
(logical AND instead of logical OR), thus usb_autopm_get_interface()
is wrongly called at probing, leading to unbalance of runtime PM
refcount.
This patch fixes it by correcting the logic.
Reported-by: Hans Yang <hansy@nvidia.com> Fixes: 1ee23fe07ee8 ('ALSA: usb-audio: Fix deadlocks at resuming') Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
The routines in scsi_rpm.c assume that if a runtime-PM callback is
invoked for a SCSI device, it can only mean that the device's driver
has asked the block layer to handle the runtime power management (by
calling blk_pm_runtime_init(), which among other things sets q->dev).
However, this assumption turns out to be wrong for things like the ses
driver. Normally ses devices are not allowed to do runtime PM, but
userspace can override this setting. If this happens, the kernel gets
a NULL pointer dereference when blk_post_runtime_resume() tries to use
the uninitialized q->dev pointer.
This patch fixes the problem by calling the block layer's runtime-PM
routines only if the device's driver really does have a runtime-PM
callback routine. Since ses doesn't define any such callbacks, the
crash won't occur.
This fixes Bugzilla #101371.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Reported-by: Stanisław Pitucha <viraptor@gmail.com> Reported-by: Ilan Cohen <ilanco@gmail.com> Tested-by: Ilan Cohen <ilanco@gmail.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: James Bottomley <JBottomley@Odin.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Commit 4c21b8fd8f14 ("MIPS: seccomp: Handle indirect system calls (o32)")
fixed indirect system calls on O32 but it also introduced a bug for MIPS64
where it erroneously modified the v0 (syscall) register with the assumption
that the sycall offset hasn't been taken into consideration. This breaks
seccomp on MIPS64 n64 and n32 ABIs. We fix this by replacing the addition
with a move instruction.
Fixes: 4c21b8fd8f14 ("MIPS: seccomp: Handle indirect system calls (o32)") Reviewed-by: James Hogan <james.hogan@imgtec.com> Signed-off-by: Markos Chandras <markos.chandras@imgtec.com> Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/10951/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
!spin_is_locked() and spin_unlock_wait() are both only control barriers.
The code needs an acquire barrier, otherwise the cpu might perform read
operations before the lock test.
As no primitive exists inside <include/spinlock.h> and since it seems
noone wants another primitive, the code creates a local primitive within
ipc/sem.c.
With regards to -stable:
The change of sem_wait_array() is a bugfix, the change to sem_lock() is a
nop (just a preprocessor redefinition to improve the readability). The
bugfix is necessary for all kernels that use sem_wait_array() (i.e.:
starting from 3.10).
When I fixed bugs in the sem_lock() logic, I was more conservative than
necessary. Therefore it is safe to replace the smp_mb() with smp_rmb().
And: With smp_rmb(), semop() syscalls are up to 10% faster.
The race we must protect against is:
sem->lock is free
sma->complex_count = 0
sma->sem_perm.lock held by thread B
The current semaphore code allows a potential use after free: in
exit_sem we may free the task's sem_undo_list while there is still
another task looping through the same semaphore set and cleaning the
sem_undo list at freeary function (the task called IPC_RMID for the same
semaphore set).
For example, with a test program [1] running which keeps forking a lot
of processes (which then do a semop call with SEM_UNDO flag), and with
the parent right after removing the semaphore set with IPC_RMID, and a
kernel built with CONFIG_SLAB, CONFIG_SLAB_DEBUG and
CONFIG_DEBUG_SPINLOCK, you can easily see something like the following
in the kernel log:
I wasn't able to trigger any badness on a recent kernel without the
proper config debugs enabled, however I have softlockup reports on some
kernel versions, in the semaphore code, which are similar as above (the
scenario is seen on some servers running IBM DB2 which uses semaphore
syscalls).
The patch here fixes the race against freeary, by acquiring or waiting
on the sem_undo_list lock as necessary (exit_sem can race with freeary,
while freeary sets un->semid to -1 and removes the same sem_undo from
list_proc or when it removes the last sem_undo).
After the patch I'm unable to reproduce the problem using the test case
[1].
void create_set()
{
int i, j;
pid_t p;
union {
int val;
struct semid_ds *buf;
unsigned short int *array;
struct seminfo *__buf;
} un;
/* Create and initialize semaphore set */
for (i = 0; i < NSET; i++) {
sid[i] = semget(IPC_PRIVATE , NSEM, 0644 | IPC_CREAT);
if (sid[i] < 0) {
perror("semget");
exit(EXIT_FAILURE);
}
}
un.val = 0;
for (i = 0; i < NSET; i++) {
for (j = 0; j < NSEM; j++) {
if (semctl(sid[i], j, SETVAL, un) < 0)
perror("semctl");
}
}
/* Launch threads that operate on semaphore set */
for (i = 0; i < NSEM * NSET * NSET; i++) {
p = fork();
if (p < 0)
perror("fork");
if (p == 0)
thread();
}
/* Free semaphore set */
for (i = 0; i < NSET; i++) {
if (semctl(sid[i], NSEM, IPC_RMID))
perror("IPC_RMID");
}
/* Wait for forked processes to exit */
while (wait(NULL)) {
if (errno == ECHILD)
break;
};
}
int main(int argc, char **argv)
{
pid_t p;
srand(time(NULL));
while (1) {
p = fork();
if (p < 0) {
perror("fork");
exit(EXIT_FAILURE);
}
if (p == 0) {
create_set();
goto end;
}
/* Wait for forked processes to exit */
while (wait(NULL)) {
if (errno == ECHILD)
break;
};
}
end:
return 0;
}
[akpm@linux-foundation.org: use normal comment layout] Signed-off-by: Herton R. Krzesinski <herton@redhat.com> Acked-by: Manfred Spraul <manfred@colorfullife.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Rafael Aquini <aquini@redhat.com> CC: Aristeu Rozanski <aris@redhat.com> Cc: David Jeffery <djeffery@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
After trying to drain pages from pagevec/pageset, we try to get reference
count of the page again, however, the reference count of the page is not
reduced if the page is still not on LRU list.
Fix it by adding the put_page() to drop the page reference which is from
__get_any_page().
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com> Acked-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
This addresses two issues that cause problems with viewperf maya-03 in
situation with memory pressure.
The first issue causes attempts to unreserve buffers if batched
reservation fails due to, for example, a signal pending. While previously
the ttm_eu api was resistant against this type of error, it is no longer
and the lockdep code will complain about attempting to unreserve buffers
that are not reserved. The issue is resolved by avoid calling
ttm_eu_backoff_reservation in the buffer reserve error path.
The second issue is that the binding_mutex may be held when user-space
fence objects are created and hence during memory reclaims. This may cause
recursive attempts to grab the binding mutex. The issue is resolved by not
holding the binding mutex across fence creation and submission.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Sinclair Yeh <syeh@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Since fc_fcp_cleanup_cmd() can sleep this function must not
be called while holding a spinlock. This patch avoids that
fc_fcp_cleanup_each_cmd() triggers the following bug:
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Signed-off-by: Vasu Dev <vasu.dev@intel.com> Signed-off-by: James Bottomley <JBottomley@Odin.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Due to patch "libfc: Do not invoke the response handler after
fc_exch_done()" (commit ID 7030fd62) the lport_recv() call
in fc_exch_recv_req() is passed a dangling pointer. Avoid this
by moving the fc_frame_free() call from fc_invoke_resp() to its
callers. This patch fixes the following crash:
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Signed-off-by: Vasu Dev <vasu.dev@intel.com> Signed-off-by: James Bottomley <JBottomley@Odin.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
In case of hw iscsi offload, an host can have N-number of active
connections. There can be IO's running on some connections which
make host->host_busy always TRUE. Now if logout from a connection
is tried then the code gets into an infinite loop as host->host_busy
is always TRUE.
iscsi_conn_teardown(....)
{
.........
/*
* Block until all in-progress commands for this connection
* time out or fail.
*/
for (;;) {
spin_lock_irqsave(session->host->host_lock, flags);
if (!atomic_read(&session->host->host_busy)) { /* OK for ERL == 0 */
spin_unlock_irqrestore(session->host->host_lock, flags);
break;
}
spin_unlock_irqrestore(session->host->host_lock, flags);
msleep_interruptible(500);
iscsi_conn_printk(KERN_INFO, conn, "iscsi conn_destroy(): "
"host_busy %d host_failed %d\n",
atomic_read(&session->host->host_busy),
session->host->host_failed);
................
...............
}
}
This is not an issue with software-iscsi/iser as each cxn is a separate
host.
Fix:
Acquiring eh_mutex in iscsi_conn_teardown() before setting
session->state = ISCSI_STATE_TERMINATE.
Signed-off-by: John Soni Jose <sony.john@avagotech.com> Reviewed-by: Mike Christie <michaelc@cs.wisc.edu> Reviewed-by: Chris Leech <cleech@redhat.com> Signed-off-by: James Bottomley <JBottomley@Odin.com>
[ luis: backported to 3.16: adjusted context ] Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
When using nested btrees, the top leaves of the top levels contain
block addresses for the root of the next tree down. If we shadow a
shared leaf node the leaf values (sub tree roots) should be incremented
accordingly.
This is only an issue if there is metadata sharing in the top levels.
Which only occurs if metadata snapshots are being used (as is possible
with dm-thinp). And could result in a block from the thinp metadata
snap being reused early, thus corrupting the thinp metadata snap.
Signed-off-by: Joe Thornber <ejt@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
[ luis: backported to 3.16:
- dropped changes to remove_one() as suggested by Mike Snitzer ] Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
The device details and mapping trees were just being decremented
before. Now btree_del() is called to do a deep delete.
Signed-off-by: Joe Thornber <ejt@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
In kbuild it is allowed to define objects in files named "Makefile"
and "Kbuild".
Currently localmodconfig reads objects only from "Makefile"s and misses
modules like nouveau.
Link: http://lkml.kernel.org/r/1437948415-16290-1-git-send-email-richard@nod.at Reported-and-tested-by: Leonidas Spyropoulos <artafinde@gmail.com> Signed-off-by: Richard Weinberger <richard@nod.at> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
I want to use lockless_dereference() from seqlock.h, which would mean
including rcupdate.h from it, however rcupdate.h already includes
seqlock.h.
Avoid this by moving lockless_dereference() into compiler.h. This is
somewhat tricky since it uses smp_read_barrier_depends() which isn't
available there, but its a CPP macro so we can get away with it.
The alternative would be moving it into asm/barrier.h, but that would
be updating each arch (I can do if people feel that is more
appropriate).
Cc: Paul McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
[ luis: 3.16 prereq for: 37868fe113ff "x86/ldt: Make modify_ldt synchronous" ] Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Although rcu_dereference() and friends can be used in situations where
object lifetimes are being managed by something other than RCU, the
resulting sparse and lockdep-RCU noise can be annoying. This commit
therefore supplies a lockless_dereference(), which provides the
protection for dereferences without the RCU-related debugging noise.
Reported-by: Al Viro <viro@ZenIV.linux.org.uk> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
[ luis: 3.16 prereq for: 37868fe113ff "x86/ldt: Make modify_ldt synchronous" ] Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
modify_ldt() has questionable locking and does not synchronize
threads. Improve it: redesign the locking and synchronize all
threads' LDTs using an IPI on all modifications.
This will dramatically slow down modify_ldt in multithreaded
programs, but there shouldn't be any multithreaded programs that
care about modify_ldt's performance in the first place.
This fixes some fallout from the CVE-2015-5157 fixes.
Signed-off-by: Andy Lutomirski <luto@kernel.org> Reviewed-by: Borislav Petkov <bp@suse.de> Cc: Andrew Cooper <andrew.cooper3@citrix.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jan Beulich <jbeulich@suse.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sasha Levin <sasha.levin@oracle.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: security@kernel.org <security@kernel.org> Cc: xen-devel <xen-devel@lists.xen.org> Link: http://lkml.kernel.org/r/4c6978476782160600471bd865b318db34c7b628.1438291540.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
[ luis: backported to 3.16:
- dropped changes to comments in switch_mm()
- included asm/mmu_context.h in perf_event.c
- adjusted context ] Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
When kvm_set_msr_common() handles a guest's write to
MSR_IA32_TSC_ADJUST, it will calcuate an adjustment based on the data
written by guest and then use it to adjust TSC offset by calling a
call-back adjust_tsc_offset(). The 3rd parameter of adjust_tsc_offset()
indicates whether the adjustment is in host TSC cycles or in guest TSC
cycles. If SVM TSC scaling is enabled, adjust_tsc_offset()
[i.e. svm_adjust_tsc_offset()] will first scale the adjustment;
otherwise, it will just use the unscaled one. As the MSR write here
comes from the guest, the adjustment is in guest TSC cycles. However,
the current kvm_set_msr_common() uses it as a value in host TSC
cycles (by using true as the 3rd parameter of adjust_tsc_offset()),
which can result in an incorrect adjustment of TSC offset if SVM TSC
scaling is enabled. This patch fixes this problem.
Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Vince reported that the fasync signal stuff doesn't work proper for
inherited events. So fix that.
Installing fasync allocates memory and sets filp->f_flags |= FASYNC,
which upon the demise of the file descriptor ensures the allocation is
freed and state is updated.
Now for perf, we can have the events stick around for a while after the
original FD is dead because of references from child events. So we
cannot copy the fasync pointer around. We can however consistently use
the parent's fasync, as that will be updated.
Reported-and-Tested-by: Vince Weaver <vincent.weaver@maine.edu> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Arnaldo Carvalho deMelo <acme@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: eranian@google.com Link: http://lkml.kernel.org/r/1434011521.1495.71.camel@twins Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
If an initiator doesn't have any real LUNs assigned, we should report
LUN 0 and a LUN list length of 1. Some versions of Solaris at least
go beserk if we report a LUN list length of 0.
Signed-off-by: Roland Dreier <roland@purestorage.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
[ luis: backported to 3.16: adjusted context ] Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Make sure all non-READ SCSI commands get targ_xfer_tag initialized
to 0xffffffff, not just WRITEs.
Double-free of a TUR cmd object occurs under the following scenario:
1. TUR received (targ_xfer_tag is uninitialized and left at 0)
2. TUR status sent
3. First unsolicited NOPIN is sent to initiator (gets targ_xfer_tag of 0)
4. NOPOUT for NOPIN (with TTT=0) arrives
- its ExpStatSN acks TUR status, TUR is queued for removal
- LIO tries to find NOPIN with TTT=0, but finds the same TUR instead,
TUR is queued for removal for the 2nd time
(Drop unbalanced conditional bracket usage - nab)
Signed-off-by: Alexei Potashnik <alexei@purestorage.com> Signed-off-by: Spencer Baugh <sbaugh@catern.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
[ luis: backported to 3.16:
- kept brackets as they are needed in 3.16 kernel ] Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
When inserting a new register into a block, the present bit map size is
increased using krealloc. krealloc does not clear the additionally
allocated memory, leaving it filled with random values. Result is that
some registers are considered cached even though this is not the case.
Fix the problem by clearing the additionally allocated memory. Also, if
the bitmap size does not increase, do not reallocate the bitmap at all
to reduce overhead.
Fixes: 3f4ff561bc88 ("regmap: rbtree: Make cache_present bitmap per node") Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
The BUG_ON() in purge_persistent_gnt() will be triggered when previous purge
work haven't finished.
There is a work_pending() before this BUG_ON, but it doesn't account if the work
is still currently running.
Acked-by: Roger Pau Monné <roger.pau@citrix.com> Signed-off-by: Bob Liu <bob.liu@oracle.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
[ luis: backported to 3.16: adjusted context ] Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
We should consider info->feature_persistent when adding indirect page to list
info->indirect_pages, else the BUG_ON() in blkif_free() would be triggered.
When we are using persistent grants the indirect_pages list
should always be empty because blkfront has pre-allocated enough
persistent pages to fill all requests on the ring.
Acked-by: Roger Pau Monné <roger.pau@citrix.com> Signed-off-by: Bob Liu <bob.liu@oracle.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
The PM runtime core by default assumes a chip is suspended when runtime
PM is enabled. Currently the arizona driver enables runtime PM when the
chip is fully active and then disables the DCVDD regulator at the end of
arizona_dev_init. This however has several problems, firstly the if we
reach the end of arizona_dev_init, we did not properly follow all the
proceedures for shutting down the chip, and most notably we never marked
the chip as cache only so any writes occurring between then and the next
PM runtime resume will be lost. Secondly, if we are already resumed when
we reach the end of dev_init, then at best we get unbalanced regulator
enable/disables at work we lose DCVDD whilst we need it.
Additionally, since the commit 4f0216409f7c ("mfd: arizona: Add better
support for system suspend"), the PM runtime operations may
disable/enable the IRQ, so the IRQs must now be enabled before we call
any PM operations.
This patch adds a call to pm_runtime_set_active to inform the PM core
that the device is starting up active and moves the PM enabling to
around the IRQ initialisation to avoid any PM callbacks happening until
the IRQs are initialised.
Signed-off-by: Charles Keepax <ckeepax@opensource.wolfsonmicro.com> Signed-off-by: Lee Jones <lee.jones@linaro.org>
[ luis: backported to 3.16: adjusted context ] Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
The current "mask" policy option matches files opened as MAY_READ,
MAY_WRITE, MAY_APPEND or MAY_EXEC. This patch extends the "mask"
option to match files opened containing one of these modes. For
example, "mask=^MAY_READ" would match files opened read-write.
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com> Signed-off-by: Dr. Greg Wettstein <gw@idfusion.org> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
The new "euid" policy condition measures files with the specified
effective uid (euid). In addition, for CAP_SETUID files it measures
files with the specified uid or suid.
Changelog:
- fixed checkpatch.pl warnings
- fixed avc denied {setuid} messages - based on Roberto's feedback
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com> Signed-off-by: Dr. Greg Wettstein <gw@idfusion.org> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
I received a report from an user of following mouse which needs this quirk:
usb 1-1.6: USB disconnect, device number 58
usb 1-1.6: new low speed USB device number 59 using ehci_hcd
usb 1-1.6: New USB device found, idVendor=04f2, idProduct=1053
usb 1-1.6: New USB device strings: Mfr=1, Product=2, SerialNumber=0
usb 1-1.6: Product: USB Optical Mouse
usb 1-1.6: Manufacturer: PixArt
usb 1-1.6: configuration #1 chosen from 1 choice
input: PixArt USB Optical Mouse as /devices/pci0000:00/0000:00:1a.0/usb1/1-1/1-1.6/1-1.6:1.0/input/input5887
generic-usb 0003:04F2:1053.16FE: input,hidraw2: USB HID v1.11 Mouse [PixArt USB Optical Mouse] on usb-0000:00:1a.0-1.6/input0
The quirk was tested by the reporter and it fixed the frequent disconnections etc.
[jkosina@suse.cz: reorder the position in hid-ids.h] Signed-off-by: Herton R. Krzesinski <herton@redhat.com> Reviewed-by: Benjamin Tissoires <benjamin.tissoires@redhat.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
sb_getblk() is used during ext4 (and possibly other FSes) writeback
paths. Sometimes such path require allocating memory and guaranteeing
that such allocation won't block. Currently, however, there is no way
to provide user flags for sb_getblk which could lead to deadlocks.
This patch implements a sb_getblk_gfp with the only difference it can
accept user-provided GFP flags.
Signed-off-by: Nikolay Borisov <kernel@kyup.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
A buffer cache is allocated from movable area because it is referred
for a while and released soon. But some filesystems are taking buffer
cache for a long time and it can disturb page migration.
New APIs are introduced to allocate buffer cache with user specific
flag. *_gfp APIs are for user want to set page allocation flag for
page cache allocation. And *_unmovable APIs are for the user wants to
allocate page cache from non-movable area.
Signed-off-by: Gioh Kim <gioh.kim@lge.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Reviewed-by: Jan Kara <jack@suse.cz>
[ luis: 3.16 prereq for: bd7ade3cd9b0 "bufferhead: Add _gfp version for sb_getblk()" ] Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
We only want to steer the I/O completion towards a queue, but don't
actually access any per-CPU data, so the raw_ version is fine to use
and avoids the warnings when using smp_processor_id().
Signed-off-by: Christoph Hellwig <hch@lst.de> Reported-by: Andy Lutomirski <luto@kernel.org> Tested-by: Andy Lutomirski <luto@kernel.org> Acked-by: Sumit Saxena <sumit.saxena@avagotech.com> Signed-off-by: James Bottomley <JBottomley@Odin.com>
[ luis: backported to 3.16:
- dropped changes to megasas_build_dcdb_fusion() ] Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
The firmware class uevent function accessed the "fw_priv->buf" buffer
without the proper locking and testing for NULL. This is an old bug
(looks like it goes back to 2012 and commit 1244691c73b2: "firmware
loader: introduce firmware_buf"), but for some reason it's triggering
only now in 4.2-rc1.
Shuah Khan is trying to bisect what it is that causes this to trigger
more easily, but in the meantime let's just fix the bug since others are
hitting it too (at least Ingo reports having seen it as well).
Reported-and-tested-by: Shuah Khan <shuahkh@osg.samsung.com> Acked-by: Ming Lei <ming.lei@canonical.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
skb_gso_segment() has a 'features' argument representing offload features
available to the output path.
A few handlers, e.g. GRE, instead re-fetch the features of skb->dev and use
those instead of the provided ones when handing encapsulation/tunnels.
Depending on dev->hw_enc_features of the output device skb_gso_segment() can
then return NULL even when the caller has disabled all GSO feature bits,
as segmentation of inner header thinks device will take care of segmentation.
This e.g. affects the tbf scheduler, which will silently drop GRE-encap GSO skbs
that did not fit the remaining token quota as the segmentation does not work
when device supports corresponding hw offload capabilities.
Cc: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
[ luis: backported to 3.16: used davem's backport to 3.14 ] Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
The commit "e29aa33 bna: Enable Multi Buffer RX" moved packets counter
increment from the beginning of the NAPI processing loop after the check
for erroneous packets so they are never accounted. This counter is used
to inform firmware about number of processed completions (packets).
As these packets are never acked the firmware fires IRQs for them again
and again.
Fixes: e29aa33 ("bna: Enable Multi Buffer RX") Signed-off-by: Ivan Vecera <ivecera@redhat.com> Acked-by: Rasesh Mody <rasesh.mody@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Multicast dst are not cached. They carry DST_NOCACHE.
As mentioned in commit f8864972126899 ("ipv4: fix dst race in
sk_dst_get()"), these dst need special care before caching them
into a socket.
Caching them is allowed only if their refcnt was not 0, ie we
must use atomic_inc_not_zero()
Also, we must use READ_ONCE() to fetch sk->sk_rx_dst, as mentioned
in commit d0c294c53a771 ("tcp: prevent fetching dst twice in early demux
code")
Fixes: 421b3885bf6d ("udp: ipv4: Add udp early demux") Tested-by: Gregory Hoggarth <Gregory.Hoggarth@alliedtelesis.co.nz> Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Gregory Hoggarth <Gregory.Hoggarth@alliedtelesis.co.nz> Reported-by: Alex Gartrell <agartrell@fb.com> Cc: Michal Kubeček <mkubecek@suse.cz> Signed-off-by: David S. Miller <davem@davemloft.net>
[ luis: backported to 3.16: used davem's backport to 3.14 ] Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
"len" is a signed integer. We check that len is not negative, so it
goes from zero to INT_MAX. PAGE_SIZE is unsigned long so the comparison
is type promoted to unsigned long. ULONG_MAX - 4095 is a higher than
INT_MAX so the condition can never be true.
I don't know if this is harmful but it seems safe to limit "len" to
INT_MAX - 4095.
Fixes: a8c879a7ee98 ('RDS: Info and stats') Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
The socket should be dead at this point. It might be simpler to
add a netlink_release_ring() function which doesn't require
locking at all.
Reported-by: "Kirill A. Shutemov" <kirill@shutemov.name> Diagnosed-by: Cong Wang <cwang@twopensource.com> Suggested-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
The "follow" fail_over_mac policy is useful for multiport devices that
either become confused or incur a performance penalty when multiple
ports are programmed with the same MAC address, but the same MAC
address still may happened by this steps for this policy:
1) echo +eth0 > /sys/class/net/bond0/bonding/slaves
bond0 has the same mac address with eth0, it is MAC1.
2) echo +eth1 > /sys/class/net/bond0/bonding/slaves
eth1 is backup, eth1 has MAC2.
3) ifconfig eth0 down
eth1 became active slave, bond will swap MAC for eth0 and eth1,
so eth1 has MAC1, and eth0 has MAC2.
4) ifconfig eth1 down
there is no active slave, and eth1 still has MAC1, eth2 has MAC2.
5) ifconfig eth0 up
the eth0 became active slave again, the bond set eth0 to MAC1.
Something wrong here, then if you set eth1 up, the eth0 and eth1 will have the same
MAC address, it will break this policy for ACTIVE_BACKUP mode.
This patch will fix this problem by finding the old active slave and
swap them MAC address before change active slave.
Signed-off-by: Ding Tianhong <dingtianhong@huawei.com> Tested-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
[ luis: backported to 3.16: used davem's backport to 3.14 ] Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
ip6_datagram_connect() is doing a lot of socket changes without
socket being locked.
This looks wrong, at least for udp_lib_rehash() which could corrupt
lists because of concurrent udp_sk(sk)->udp_portaddr_hash accesses.
Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
The commit 738ac1ebb96d02e0d23bc320302a6ea94c612dec ("net: Clone
skb before setting peeked flag") introduced a use-after-free bug
in skb_recv_datagram. This is because skb_set_peeked may create
a new skb and free the existing one. As it stands the caller will
continue to use the old freed skb.
This patch fixes it by making skb_set_peeked return the new skb
(or the old one if unchanged).
Fixes: 738ac1ebb96d ("net: Clone skb before setting peeked flag") Reported-by: Brenden Blanco <bblanco@plumgrid.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Tested-by: Brenden Blanco <bblanco@plumgrid.com> Reviewed-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
When we calculate the checksum on the recv path, we store the
result in the skb as an optimisation in case we need the checksum
again down the line.
This is in fact bogus for the MSG_PEEK case as this is done without
any locking. So multiple threads can peek and then store the result
to the same skb, potentially resulting in bogus skb states.
This patch fixes this by only storing the result if the skb is not
shared. This preserves the optimisations for the few cases where
it can be done safely due to locking or other reasons, e.g., SIOCINQ.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Shared skbs must not be modified and this is crucial for broadcast
and/or multicast paths where we use it as an optimisation to avoid
unnecessary cloning.
The function skb_recv_datagram breaks this rule by setting peeked
without cloning the skb first. This causes funky races which leads
to double-free.
This patch fixes this by cloning the skb and replacing the skb
in the list when setting skb->peeked.
Fixes: a59322be07c9 ("[UDP]: Only increment counter on first peek/recv") Reported-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Incoming packet should be either in backlog queue or
in RCU read-side section. Otherwise, the final sequence of
flush_backlog() and synchronize_net() may miss packets
that can run without device reference:
CPU 1 CPU 2
skb->dev: no reference
process_backlog:__skb_dequeue
process_backlog:local_irq_enable
on_each_cpu for
flush_backlog => IPI(hardirq): flush_backlog
- packet not found in backlog
CPU delayed ...
synchronize_net
- no ongoing RCU
read-side sections
netdev_run_todo,
rcu_barrier: no
ongoing callbacks
__netif_receive_skb_core:rcu_read_lock
- too late
free dev
process packet for freed dev
Fixes: 6e583ce5242f ("net: eliminate refcounting in backlog queue") Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: David S. Miller <davem@davemloft.net>
[ luis: backported to 3.16: used davem's backport to 3.18 ] Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
pktgen_thread_worker() is obviously racy, kthread_stop() can come
between the kthread_should_stop() check and set_current_state().
Signed-off-by: Oleg Nesterov <oleg@redhat.com> Reported-by: Jan Stancek <jstancek@redhat.com> Reported-by: Marcelo Leitner <mleitner@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>