Linux guests on Haswell (and also SandyBridge and Broadwell, at least)
would crash if you decided to run a host command that uses PEBS, like
perf record -e 'cpu/mem-stores/pp' -a
This happens because KVM is using VMX MSR switching to disable PEBS, but
SDM [2015-12] 18.4.4.4 Re-configuring PEBS Facilities explains why it
isn't safe:
When software needs to reconfigure PEBS facilities, it should allow a
quiescent period between stopping the prior event counting and setting
up a new PEBS event. The quiescent period is to allow any latent
residual PEBS records to complete its capture at their previously
specified buffer address (provided by IA32_DS_AREA).
There might not be a quiescent period after the MSR switch, so a CPU
ends up using host's MSR_IA32_DS_AREA to access an area in guest's
memory. (Or MSR switching is just buggy on some models.)
The guest can learn something about the host this way:
If the guest doesn't map address pointed by MSR_IA32_DS_AREA, it results
in #PF where we leak host's MSR_IA32_DS_AREA through CR2.
After that, a malicious guest can map and configure memory where
MSR_IA32_DS_AREA is pointing and can therefore get an output from
host's tracing.
This is not a critical leak as the host must initiate with PEBS tracing
and I have not been able to get a record from more than one instruction
before vmentry in vmx_vcpu_run() (that place has most registers already
overwritten with guest's).
We could disable PEBS just few instructions before vmentry, but
disabling it earlier shouldn't affect host tracing too much.
We also don't need to switch MSR_IA32_PEBS_ENABLE on VMENTRY, but that
optimization isn't worth its code, IMO.
(If you are implementing PEBS for guests, be sure to handle the case
where both host and guest enable PEBS, because this patch doesn't.)
Fixes: 26a4f3c08de4 ("perf/x86: disable PEBS on a guest entry.") Reported-by: Jiří Olša <jolsa@redhat.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Thomas Huth discovered that a guest could cause a hard hang of a
host CPU by setting the Instruction Authority Mask Register (IAMR)
to a suitable value. It turns out that this is because when the
code was added to context-switch the new special-purpose registers
(SPRs) that were added in POWER8, we forgot to add code to ensure
that they were restored to a sane value on guest exit.
This adds code to set those registers where a bad value could
compromise the execution of the host kernel to a suitable neutral
value on guest exit.
Fixes: b005255e12a3 Reported-by: Thomas Huth <thuth@redhat.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
It turns out that commit can cause problems for systems with multiple
GPUs, and causes X to hang on at least a HP Pavilion dv7 with hybrid
graphics.
This got noticed originally in 4.4.4, where this patch had already
gotten back-ported, but 4.5-rc7 was verified to have the same problem.
Alexander Deucher says:
"It looks like you have a muxed system so I suspect what's happening is
that one of the display is being reported as connected for both the
IGP and the dGPU and then the desktop environment gets confused or
there some sort problem in the detect functions since the mux is not
switched to the dGPU. I don't see an easy fix unless Dave has any
ideas. I'd say just revert for now"
Reported-by: Jörg-Volker Peetz <jvpeetz@web.de> Acked-by: Alexander Deucher <Alexander.Deucher@amd.com> Cc: Dave Airlie <airlied@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
This patch fixes a recent ABORT_TASK regression associated
with commit febe562c, where a left-over target_put_sess_cmd()
would still be called when __target_check_io_state() detected
a command has already been completed, and explicit ABORT must
be avoided.
Note commit febe562c dropped the local kref_get_unless_zero()
check in core_tmr_abort_task(), but did not drop this extra
corresponding target_put_sess_cmd() in the failure path.
So go ahead and drop this now bogus target_put_sess_cmd(),
and avoid this potential use-after-free.
Reported-by: Dan Lane <dracodan@gmail.com> Cc: Quinn Tran <quinn.tran@qlogic.com> Cc: Himanshu Madhani <himanshu.madhani@qlogic.com> Cc: Sagi Grimberg <sagig@mellanox.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Hannes Reinecke <hare@suse.de> Cc: Andy Grover <agrover@redhat.com> Cc: Mike Christie <mchristi@redhat.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
[ luis: backported to 3.16: adjusted context ] Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
The firmware ctls like "DSP1 Firmware" in wm_adsp codec driver are
enum, while the current driver accesses wrongly via
value.integer.value[]. They have to be via value.enumerated.item[]
instead.
Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Mark Brown <broonie@kernel.org>
[ luis: backported to 3.16: adjusted context ] Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
The DRC Mode like "AIF1DRC1 Mode" and EQ Mode like "AIF1.1 EQ Mode" in
wm8994 codec driver are enum ctls, while the current driver accesses
wrongly via value.integer.value[]. They have to be via
value.enumerated.item[] instead.
Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
"MBC Mode", "VSS Mode", "VSS HPF Mode" and "Enhanced EQ Mode" ctls in
wm8958 codec driver are enum, while the current driver accesses
wrongly via value.integer.value[]. They have to be via
value.enumerated.item[] instead.
Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
gs_destroy_candev() erroneously calls kfree() on a struct gs_can *, which is
allocated through alloc_candev() and should instead be freed using
free_candev() alone.
The inappropriate use of kfree() causes the kernel to hang when
gs_destroy_candev() is called.
Only the struct gs_usb * which is allocated through kzalloc() should be freed
using kfree() when the device is disconnected.
Signed-off-by: Maximilian Schneider <max@schneidersoft.net> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
The value 5000 was put here with the addition of the timeout field to
ieee80211_start_tx_ba_session. It was originally added in mac80211 to
save resources for drivers like iwlwifi, which only supports a limited
number of concurrent aggregation sessions.
Since iwlwifi does not use minstrel_ht and other drivers don't need
this, 0 is a better default - especially since there have been
recent reports of aggregation setup related issues reproduced with
ath9k. This should improve stability without causing any adverse
effects.
Acked-by: Avery Pennarun <apenwarr@gmail.com> Signed-off-by: Felix Fietkau <nbd@openwrt.org> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
[ luis: backported to 3.16: adjusted context ] Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Since cfg80211 frequently takes actions from its netdev notifier
call, wireless extensions messages could still be ordered badly
since the wext netdev notifier, since wext is built into the
kernel, runs before the cfg80211 netdev notifier. For example,
the following can happen:
5: wlan1: <BROADCAST,MULTICAST> mtu 1500 qdisc mq state DOWN group default
link/ether 02:00:00:00:01:00 brd ff:ff:ff:ff:ff:ff
5: wlan1: <BROADCAST,MULTICAST,UP>
link/ether
when setting the interface down causes the wext message.
To also fix this, export the wireless_nlevent_flush() function
and also call it from the cfg80211 notifier.
Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Beniamino reported that he was getting an RTM_NEWLINK message for a
given interface, after the RTM_DELLINK for it. It turns out that the
message is a wireless extensions message, which was sent because the
interface had been connected and disconnection while it was deleted
caused a wext message.
For its netlink messages, wext uses RTM_NEWLINK, but the message is
without all the regular rtnetlink attributes, so "ip monitor link"
prints just rudimentary information:
5: wlan1: <BROADCAST,MULTICAST> mtu 1500 qdisc mq state DOWN group default
link/ether 02:00:00:00:01:00 brd ff:ff:ff:ff:ff:ff
Deleted 5: wlan1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default
link/ether 02:00:00:00:01:00 brd ff:ff:ff:ff:ff:ff
5: wlan1: <BROADCAST,MULTICAST,UP>
link/ether
(from my hwsim reproduction)
This can cause userspace to get confused since it doesn't expect an
RTM_NEWLINK message after RTM_DELLINK.
The reason for this is that wext schedules a worker to send out the
messages, and the scheduling delay can cause the messages to get out
to userspace in different order.
To fix this, have wext register a netdevice notifier and flush out
any pending messages when netdevice state changes. This fixes any
ordering whenever the original message wasn't sent by a notifier
itself.
Reported-by: Beniamino Galvani <bgalvani@redhat.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
According to IBTA spec v1.3 section 12.7.19, QPs should use GRH when
the path returned by the SA has hop-limit > 0. Currently, we do that
only for the > 1 case, fix that.
Fixes: 6d969a471ba1 ('IB/sa: Add ib_init_ah_from_path()') Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
[ luis: backported to 3.16: adjusted context ] Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
The R-Car GPIO driver handles Runtime PM for requested GPIOs only.
When using a GPIO purely as an interrupt source, no Runtime PM handling
is done, and the GPIO module's clock may not be enabled.
To fix this:
- Add .irq_request_resources() and .irq_release_resources() callbacks
to handle Runtime PM when an interrupt is requested,
- Add irq_bus_lock() and sync_unlock() callbacks to handle Runtime PM
when e.g. disabling/enabling an interrupt, or configuring the
interrupt type.
Fixes: d5c3d84657db57bd "net: phy: Avoid polling PHY with PHY_IGNORE_INTERRUPTS" Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
[ luis: backported to 3.16:
- use gpio_to_priv() instead of gpiochip_get_data()
- adjusted context ] Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
ubi_start_leb_change() allocates too few bytes.
ubi_more_leb_change_data() will write up to req->upd_bytes +
ubi->min_io_size bytes.
Signed-off-by: Richard Weinberger <richard@nod.at> Reviewed-by: Boris Brezillon <boris.brezillon@free-electrons.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Avoid sending a partially initialised `siginfo_t' structure along SIGFPE
signals issued from `do_ov' and `do_trap_or_bp', leading to information
leaking from the kernel stack.
Signed-off-by: Maciej W. Rozycki <macro@imgtec.com> Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
This patch applies the microphone-related fix created for the Acer
Aspire E1-572 to the E1-472 as well, as it uses the same Realtek ALC282
CODEC and demonstrates the same issues.
This patch allows an external, headset microphone to be used and limits
the gain on the (quite noisy) internal microphone.
Signed-off-by: Simon South <simon@simonsouth.com> Signed-off-by: Takashi Iwai <tiwai@suse.de>
[ luis: backported to 3.16: adjusted context ] Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Pause/unpause graph tracing around do_suspend_lowlevel as it has
inconsistent call/return info after it jumps to the wakeup vector.
The graph trace buffer will otherwise become misaligned and
may eventually crash and hang on suspend.
To reproduce the issue and test the fix:
Run a function_graph trace over suspend/resume and set the graph
function to suspend_devices_and_enter. This consistently hangs the
system without this fix.
Signed-off-by: Todd Brandt <todd.e.brandt@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
On CI, we need to see if the number of crtcs changes to determine
whether or not we need to upload the mclk table again. In practice
we don't currently upload the mclk table again after the initial load.
The only reason you would would be to add new states, e.g., for
arbitrary mclk setting which is not currently supported.
Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
During DRAM initialization on certain ASpeed devices, an incorrect
bit (bit 10) was checked in the "SDRAM Bus Width Status" register
to determine DRAM width.
Query bit 6 instead in accordance with the Aspeed AST2050 datasheet v1.05.
Signed-off-by: Timothy Pearson <tpearson@raptorengineeringinc.com> Signed-off-by: Dave Airlie <airlied@redhat.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
The OSS sequencer client tries to drain the pending events at
releasing. Unfortunately, as spotted by syzkaller fuzzer, this may
lead to an unkillable process state when the event has been queued at
the far future. Since the process being released can't be signaled
any longer, it remains and waits for the echo-back event in that far
future.
Back to history, the draining feature was implemented at the time we
misinterpreted POSIX definition for blocking file operation.
Actually, such a behavior is superfluous at release, and we should
just release the device as is instead of keeping it up forever.
This patch just removes the draining call that may block the release
for too long time unexpectedly.
The Parrot NMEA GPS Flight Recorder is a USB composite device
consisting of hub, flash storage, and cp210x usb to serial chip.
It is an accessory to the mass-produced Parrot AR Drone 2.
The device emits standard NMEA messages which make the it compatible
with NMEA compatible software. It was tested using gpsd version 3.11-3
as an NMEA interpreter and using the official Parrot Flight Recorder.
Signed-off-by: Vittorio Alfieri <vittorio88@gmail.com> Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
HDSPM driver contains a code issuing zero-division potentially in
system sample rate ctl code. This patch fixes it by not processing
a zero or invalid rate value as a divisor, as well as excluding the
invalid value to be passed via the given ctl element.
Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Calling return copy_to_user(...) in an ioctl will not
do the right thing if there's a pagefault:
copy_to_user returns the number of bytes not copied
in this case.
Fix up kvm to do
return copy_to_user(...)) ? -EFAULT : 0;
everywhere.
Acked-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
[ luis: backported to 3.16:
- dropped changes to arch/arm64/kvm/guest.c ] Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
This issue is caused by commit 02323db17e3a7 ("cifs: fix
cifs_uniqueid_to_ino_t not to ever return 0"), when BITS_PER_LONG
is 64 on s390x, the corresponding cifs_uniqueid_to_ino_t()
function will cast 64-bit fileid to 32-bit by using (ino_t)fileid,
because ino_t (typdefed __kernel_ino_t) is int type.
It's defined in arch/s390/include/uapi/asm/posix_types.h
#ifndef __s390x__
typedef unsigned long __kernel_ino_t;
...
#else /* __s390x__ */
typedef unsigned int __kernel_ino_t;
So the #ifdef condition is wrong for s390x, we can just still use
one cifs_uniqueid_to_ino_t() function with comparing sizeof(ino_t)
and sizeof(u64) to choose the correct execution accordingly.
Signed-off-by: Yadan Fan <ydfan@suse.com> Signed-off-by: Steve French <smfrench@gmail.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
For interim responses we only need to parse a header and update
a number credits. Now it is done for all SMB2+ command except
SMB2_READ which is wrong. Fix this by adding such processing.
Signed-off-by: Pavel Shilovsky <pshilovsky@samba.org> Tested-by: Shirish Pargaonkar <shirishpargaonkar@gmail.com> Signed-off-by: Steve French <smfrench@gmail.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
When opening a file, SMB2_open() attempts to parse the lease state from the
SMB2 CREATE Response. However, the parsing code was not careful to ensure
that the create contexts are not empty or invalid, which can lead to out-
of-bounds memory access. This can be seen easily by trying
to read a file from a OSX 10.11 SMB3 server. Here is sample crash output:
Signed-off-by: Justin Maggard <jmaggard@netgear.com> Signed-off-by: Steve French <smfrench@gmail.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
X32 ABI takes the 64bit timespec, thus the timer user status ioctl becomes
incompatible with IA32. This results in NOTTY error when the ioctl is
issued.
Meanwhile, this struct in X32 is essentially identical with the one in
X86-64, so we can just bypassing to the existing code for this
specific compat ioctl.
Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
The timer user status compat ioctl returned the bogus struct used for
64bit architectures instead of the 32bit one. This patch addresses
it to return the proper struct.
Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Like the previous fixes for ctl and PCM, we need a fix for
incompatible X32 ABI regarding the rawmidi: namely, struct
snd_rawmidi_status has the timespec, and the size and the alignment on
X32 differ from IA32.
This patch fixes the incompatible ioctl for X32.
Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
The X32 ABI takes the same alignment like x86-64, and this may result
in the incompatible struct size from ia32. Unfortunately, we hit this
in some control ABI: struct snd_ctl_elem_value differs between them
due to the position of 64bit variable array. This ends up with the
unknown ioctl (ENOTTY) error.
The fix is to add the compat entries for the new aligned struct.
Reported-and-tested-by: Steven Newbury <steve@snewbury.org.uk> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Calling return copy_to_user(...) in an ioctl will not
do the right thing if there's a pagefault:
copy_to_user returns the number of bytes not copied
in this case.
Fix up vfio to do
return copy_to_user(...)) ?
-EFAULT : 0;
everywhere.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
[ luis: backported to 3.16:
- dropped changes to vfio_platform_common.c ] Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
The id buffer in ata_device is a DMA target, but it isn't explicitly
cacheline aligned. Due to this, adjacent fields can be overwritten with
stale data from memory on non coherent architectures. As a result, the
kernel is sometimes unable to communicate with an ATA device.
Fix this by ensuring that the id buffer is cacheline aligned.
This issue is similar to that fixed by Commit 84bda12af31f
("libata: align ap->sector_buf").
The setup code for the performance counters in the AMD IOMMU driver
tests whether the counters can be written. It tests to setup a counter
for device 00:00.0, which fails on systems where this particular device
is not covered by the IOMMU.
Fix this by not relying on device 00:00.0 but only on the IOMMU being
present.
When a directory is deleted, we don't take too much care about killing off
all the dirents that belong to it — on the basis that on remount, the scan
will conclude that the directory is dead anyway.
This doesn't work though, when the deleted directory contained a child
directory which was moved *out*. In the early stages of the fs build
we can then end up with an apparent hard link, with the child directory
appearing both in its true location, and as a child of the original
directory which are this stage of the mount process we don't *yet* know
is defunct.
To resolve this, take out the early special-casing of the "directories
shall not have hard links" rule in jffs2_build_inode_pass1(), and let the
normal nlink processing happen for directories as well as other inodes.
Then later in the build process we can set ic->pino_nlink to the parent
inode#, as is required for directories during normal operaton, instead
of the nlink. And complain only *then* about hard links which are still
in evidence even after killing off all the unreachable paths.
Reported-by: Liu Song <liu.song11@zte.com.cn> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
With this fix, all code paths should now be obtaining the page lock before
f->sem.
Reported-by: Szabó Tamás <sztomi89@gmail.com> Tested-by: Thomas Betker <thomas.betker@rohde-schwarz.com> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
This reverts commit 5ffd3412ae55
("jffs2: Fix lock acquisition order bug in jffs2_write_begin").
The commit modified jffs2_write_begin() to remove a deadlock with
jffs2_garbage_collect_live(), but this introduced new deadlocks found
by multiple users. page_lock() actually has to be called before
mutex_lock(&c->alloc_sem) or mutex_lock(&f->sem) because
jffs2_write_end() and jffs2_readpage() are called with the page locked,
and they acquire c->alloc_sem and f->sem, resp.
In other words, the lock order in jffs2_write_begin() was correct, and
it is the jffs2_garbage_collect_live() path that has to be changed.
Revert the commit to get rid of the new deadlocks, and to clear the way
for a better fix of the original deadlock.
Reported-by: Deng Chao <deng.chao1@zte.com.cn> Reported-by: Ming Liu <liu.ming50@gmail.com> Reported-by: wangzaiwei <wangzaiwei@top-vision.cn> Signed-off-by: Thomas Betker <thomas.betker@rohde-schwarz.com> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
If we use USB ID pin as wakeup source, and there is a USB block
device on this USB OTG (ID) cable, the system will be deadlock
after system resume.
The root cause for this problem is: the workqueue ci_otg may try
to remove hcd before the driver resume has finished, and hcd will
disconnect the device on it, then, it will call device_release_driver,
and holds the device lock "dev->mutex", but it is never unlocked since
it waits workqueue writeback to run to flush the block information, but
the workqueue writeback is freezable, it is not thawed before driver
resume has finished.
When the driver (device: sd 0:0:0:0:) resume goes to dpm_complete, it
tries to get its device lock "dev->mutex", but it can't get it forever,
then the deadlock occurs. Below call stacks show the situation.
So, in order to fix this problem, we need to change workqueue ci_otg
as freezable, then the work item in this workqueue will be run after
driver's resume, this workqueue will not be blocked forever like above
case since the workqueue writeback has been thawed too.
The 5 volt detect functionality broke in 3.14: the code reads IO register 0x70
again after it has already been cleared. Instead it should use the cached
irq_reg_0x70 value and the io_write to 0x71 to clear 0x70 can be dropped since
this has already been done.
Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
[ luis: backported to 3.16: adjusted context ] Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
As reported by Soohoon Lee, the HDIO_GET_32BIT ioctl does not
work correctly in compat mode with libata.
I have investigated the issue further and found multiple problems
that all appeared with the same commit that originally introduced
HDIO_GET_32BIT handling in libata back in linux-2.6.8 and presumably
also linux-2.4, as the code uses "copy_to_user(arg, &val, 1)" to copy
a 'long' variable containing either 0 or 1 to user space.
The problems with this are:
* On big-endian machines, this will always write a zero because it
stores the wrong byte into user space.
* In compat mode, the upper three bytes of the variable are updated
by the compat_hdio_ioctl() function, but they now contain
uninitialized stack data.
* The hdparm tool calling this ioctl uses a 'static long' variable
to store the result. This means at least the upper bytes are
initialized to zero, but calling another ioctl like HDIO_GET_MULTCOUNT
would fill them with data that remains stale when the low byte
is overwritten. Fortunately libata doesn't implement any of the
affected ioctl commands, so this would only happen when we query
both an IDE and an ATA device in the same command such as
"hdparm -N -c /dev/hda /dev/sda"
* The libata code for unknown reasons started using ATA_IOC_GET_IO32
and ATA_IOC_SET_IO32 as aliases for HDIO_GET_32BIT and HDIO_SET_32BIT,
while the ioctl commands that were added later use the normal
HDIO_* names. This is harmless but rather confusing.
This addresses all four issues by changing the code to use put_user()
on an 'unsigned long' variable in HDIO_GET_32BIT, like the IDE subsystem
does, and by clarifying the names of the ioctl commands.
Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reported-by: Soohoon Lee <Soohoon.Lee@f5.com> Tested-by: Soohoon Lee <Soohoon.Lee@f5.com> Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
This patch was triggering a Oops in stable kernel 3.10.99. Christian
agrees that the patch is correct but "assumes that radeon_fence_unref()
can safely take NULL as the fence which is not the case for older
kernels."
Reported-by: Erik Andersen <andersen@codepoet.org> Acked-by: Christian König <christian.koenig@amd.com> Cc: Nicolai Hähnle <nicolai.haehnle@amd.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
This patch fixes a race between setting of SCF_SEND_DELAYED_TAS
in transport_send_task_abort(), and check of the same bit in
transport_check_aborted_status().
It adds a __transport_check_aborted_status() version that is
used by target_execute_cmd() when se_cmd->t_state_lock is
held, and a transport_check_aborted_status() wrapper for
all other existing callers.
Also, it handles the case where the check happens before
transport_send_task_abort() gets called. For this, go
ahead and set SCF_SEND_DELAYED_TAS early when necessary,
and have transport_send_task_abort() send the abort.
Cc: Quinn Tran <quinn.tran@qlogic.com> Cc: Himanshu Madhani <himanshu.madhani@qlogic.com> Cc: Sagi Grimberg <sagig@mellanox.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Hannes Reinecke <hare@suse.de> Cc: Andy Grover <agrover@redhat.com> Cc: Mike Christie <mchristi@redhat.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
[ luis: backported to 3.16: based on Nicholas' backport to 3.14 ] Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
To address the bug where fabric driver level shutdown
of se_cmd occurs at the same time when TMR CMD_T_ABORTED
is happening resulting in a -1 ->cmd_kref, this patch
adds a CMD_T_FABRIC_STOP bit that is used to determine
when TMR + driver I_T nexus shutdown is happening
concurrently.
It changes target_sess_cmd_list_set_waiting() to obtain
se_cmd->cmd_kref + set CMD_T_FABRIC_STOP, and drop local
reference in target_wait_for_sess_cmds() and invoke extra
target_put_sess_cmd() during Task Aborted Status (TAS)
when necessary.
Also, it adds a new target_wait_free_cmd() wrapper around
transport_wait_for_tasks() for the special case within
transport_generic_free_cmd() to set CMD_T_FABRIC_STOP,
and is now aware of CMD_T_ABORTED + CMD_T_TAS status
bits to know when an extra transport_put_cmd() during
TAS is required.
Note transport_generic_free_cmd() is expected to block on
cmd->cmd_wait_comp in order to follow what iscsi-target
expects during iscsi_conn context se_cmd shutdown.
Cc: Quinn Tran <quinn.tran@qlogic.com> Cc: Himanshu Madhani <himanshu.madhani@qlogic.com> Cc: Sagi Grimberg <sagig@mellanox.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Hannes Reinecke <hare@suse.de> Cc: Andy Grover <agrover@redhat.com> Cc: Mike Christie <mchristi@redhat.com> Signed-off-by: Nicholas Bellinger <nab@daterainc.com>
[ luis: backported to 3.16: used Nicholas' backport to 3.14 ] Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
This patch fixes a bug in TMR task aborted status (TAS)
handling when multiple sessions are connected to the
same target WWPN endpoint and se_node_acl descriptor,
resulting in TASK_ABORTED status to not be generated
for aborted se_cmds on the remote port.
This is due to core_tmr_handle_tas_abort() incorrectly
comparing se_node_acl instead of se_session, for which
the multi-session case is expected to be sharing the
same se_node_acl.
Instead, go ahead and update core_tmr_handle_tas_abort()
to compare tmr_sess + cmd->se_sess in order to determine
if the LUN_RESET was received on a different I_T nexus,
and TASK_ABORTED status response needs to be generated.
Reviewed-by: Christoph Hellwig <hch@lst.de> Cc: Quinn Tran <quinn.tran@qlogic.com> Cc: Himanshu Madhani <himanshu.madhani@qlogic.com> Cc: Sagi Grimberg <sagig@mellanox.com> Cc: Hannes Reinecke <hare@suse.de> Cc: Andy Grover <agrover@redhat.com> Cc: Mike Christie <mchristi@redhat.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
[ luis: backported to 3.16: used Nicholas' backport to 3.14 ] Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
This patch fixes a NULL pointer se_cmd->cmd_kref < 0
refcount bug during TMR LUN_RESET with active se_cmd
I/O, that can be triggered during se_cmd descriptor
shutdown + release via core_tmr_drain_state_list() code.
To address this bug, add common __target_check_io_state()
helper for ABORT_TASK + LUN_RESET w/ CMD_T_COMPLETE
checking, and set CMD_T_ABORTED + obtain ->cmd_kref for
both cases ahead of last target_put_sess_cmd() after
TFO->aborted_task() -> transport_cmd_finish_abort()
callback has completed.
It also introduces SCF_ACK_KREF to determine when
transport_cmd_finish_abort() needs to drop the second
extra reference, ahead of calling target_put_sess_cmd()
for the final kref_put(&se_cmd->cmd_kref).
It also updates transport_cmd_check_stop() to avoid
holding se_cmd->t_state_lock while dropping se_cmd
device state via target_remove_from_state_list(), now
that core_tmr_drain_state_list() is holding the
se_device lock while checking se_cmd state from
within TMR logic.
Finally, move transport_put_cmd() release of SGL +
TMR + extended CDB memory into target_free_cmd_mem()
in order to avoid potential resource leaks in TMR
ABORT_TASK + LUN_RESET code-paths. Also update
target_release_cmd_kref() accordingly.
Reviewed-by: Quinn Tran <quinn.tran@qlogic.com> Cc: Himanshu Madhani <himanshu.madhani@qlogic.com> Cc: Sagi Grimberg <sagig@mellanox.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Hannes Reinecke <hare@suse.de> Cc: Andy Grover <agrover@redhat.com> Cc: Mike Christie <mchristi@redhat.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
[ luis: backported to 3.16: used Nicholas' backport to 3.14 ] Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
This patch fixes a NULL pointer se_cmd->cmd_kref < 0
refcount bug during TMR LUN_RESET with active TMRs,
triggered during se_cmd + se_tmr_req descriptor
shutdown + release via core_tmr_drain_tmr_list().
To address this bug, go ahead and obtain a local
kref_get_unless_zero(&se_cmd->cmd_kref) for active I/O
to set CMD_T_ABORTED, and transport_wait_for_tasks()
followed by the final target_put_sess_cmd() to drop
the local ->cmd_kref.
Also add two new checks within target_tmr_work() to
avoid CMD_T_ABORTED -> TFO->queue_tm_rsp() callbacks
ahead of invoking the backend -> fabric put in
transport_cmd_check_stop_to_fabric().
For good measure, also change core_tmr_release_req()
to use list_del_init() ahead of se_tmr_req memory
free.
Reviewed-by: Quinn Tran <quinn.tran@qlogic.com> Cc: Himanshu Madhani <himanshu.madhani@qlogic.com> Cc: Sagi Grimberg <sagig@mellanox.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Hannes Reinecke <hare@suse.de> Cc: Andy Grover <agrover@redhat.com> Cc: Mike Christie <mchristi@redhat.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
[ luis: backported to 3.16: used Nicholas' backport to 3.14 ] Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Dmitry Vyukov noted recently that the sctp_port_hashtable had an error in
its size computation, observing that the current method never guaranteed
that the hashsize (measured in number of entries) would be a power of two,
which the input hash function for that table requires. The root cause of
the problem is that two values need to be computed (one, the allocation
order of the storage requries, as passed to __get_free_pages, and two the
number of entries for the hash table). Both need to be ^2, but for
different reasons, and the existing code is simply computing one order
value, and using it as the basis for both, which is wrong (i.e. it assumes
that ((1<<order)*PAGE_SIZE)/sizeof(bucket) is still ^2 when its not).
To fix this, we change the logic slightly. We start by computing a goal
allocation order (which is limited by the maximum size hash table we want
to support. Then we attempt to allocate that size table, decreasing the
order until a successful allocation is made. Then, with the resultant
successful order we compute the number of buckets that hash table supports,
which we then round down to the nearest power of two, giving us the number
of entries the table actually supports.
I've tested this locally here, using non-debug and spinlock-debug kernels,
and the number of entries in the hashtable consistently work out to be
powers of two in all cases.
Signed-off-by: Neil Horman <nhorman@tuxdriver.com> Reported-by: Dmitry Vyukov <dvyukov@google.com> CC: Dmitry Vyukov <dvyukov@google.com> CC: Vladislav Yasevich <vyasevich@gmail.com> CC: "David S. Miller" <davem@davemloft.net> Signed-off-by: David S. Miller <davem@davemloft.net>
[ luis: backported to 3.16: adjusted context ] Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
An error response from a RTM_GETNETCONF request can return the positive
error value EINVAL in the struct nlmsgerr that can mislead userspace.
Signed-off-by: Anton Protopopov <a.s.protopopov@gmail.com> Acked-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Since the gc of ipv4 route was removed, the route cached would has
no chance to be removed, and even it has been timeout, it still could
be used, cause no code to check it's expires.
Fix this issue by checking and removing route cache when we get route.
Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
[ luis: backported to 3.16: adjusted context ] Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Drop reference on the relay_po socket when __pppoe_xmit() succeeds.
This is already handled correctly in the error path.
Signed-off-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Reported-by: Thomas Schäfer <tschaefer@t-online.de> Signed-off-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Dmitry reported memory leaks of IP options allocated in
ip_cmsg_send() when/if this function returns an error.
Callers are responsible for the freeing.
Many thanks to Dmitry for the report and diagnostic.
Reported-by: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
[ luis: backported to 3.16: adjusted context ] Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
tg3_tso_bug() can hit a condition where the entire tx ring is not big
enough to segment the GSO packet. For example, if MSS is very small,
gso_segs can exceed the tx ring size. When we hit the condition, it
will cause tx timeout.
tg3_tso_bug() is called to handle TSO and DMA hardware bugs.
For TSO bugs, if tg3_tso_bug() cannot succeed, we have to drop the packet.
For DMA bugs, we can still fall back to linearize the SKB and let the
hardware transmit the TSO packet.
This patch adds a function tg3_tso_bug_gso_check() to check if there
are enough tx descriptors for GSO before calling tg3_tso_bug().
The caller will then handle the error appropriately - drop or
lineraize the SKB.
v2: Corrected patch description to avoid confusion.
Signed-off-by: Siva Reddy Kallam <siva.kallam@broadcom.com> Signed-off-by: Michael Chan <mchan@broadcom.com> Acked-by: Prashant Sreedharan <prashant@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Devices may have limits on the number of fragments in an skb they support.
Current codebase uses a constant as maximum for number of fragments one
skb can hold and use.
When enabling scatter/gather and running traffic with many small messages
the codebase uses the maximum number of fragments and may thereby violate
the max for certain devices.
The patch introduces a global variable as max number of fragments.
Signed-off-by: Hans Westgaard Ry <hans.westgaard.ry@oracle.com> Reviewed-by: Håkon Bugge <haakon.bugge@oracle.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
[ luis: backported to 3.16: adjusted context ] Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Commit 6fd99094de2b ("ipv6: Don't reduce hop limit for an interface")
disabled accept hop limit from RA if it is smaller than the current hop
limit for security stuff. But this behavior kind of break the RFC definition.
RFC 4861, 6.3.4. Processing Received Router Advertisements
A Router Advertisement field (e.g., Cur Hop Limit, Reachable Time,
and Retrans Timer) may contain a value denoting that it is
unspecified. In such cases, the parameter should be ignored and the
host should continue using whatever value it is already using.
If the received Cur Hop Limit value is non-zero, the host SHOULD set
its CurHopLimit variable to the received value.
So add sysctl option accept_ra_min_hop_limit to let user choose the minimum
hop limit value they can accept from RA. And set default to 1 to meet RFC
standards.
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Acked-by: YOSHIFUJI Hideaki <hideaki.yoshifuji@miraclelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>
[ luis: backported to 3.16:
- added missing DEVCONF_* as suggested by Yoshfuji so that uapi contains
the same values as mainline
- adjusted context ] Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Currently, the egress interface index specified via IPV6_PKTINFO
is ignored by __ip6_datagram_connect(), so that RFC 3542 section 6.7
can be subverted when the user space application calls connect()
before sendmsg().
Fix it by initializing properly flowi6_oif in connect() before
performing the route lookup.
Signed-off-by: Paolo Abeni <pabeni@redhat.com> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
The problem here is the skb we provide to tcp_v4_send_ack() had to
be parked in the backlog of a new TCP fastopen child because this child
was owned by the user at the time an out of window packet arrived.
Before queuing a packet, TCP has to set skb->dev to NULL as the device
could disappear before packet is removed from the queue.
Fix this issue by using the net pointer provided by the socket (being a
timewait or a request socket).
IPv6 is immune to the bug : tcp_v6_send_response() already gets the net
pointer from the socket if provided.
Fixes: 168a8f58059a ("tcp: TCP Fast Open Server - main code path") Reported-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Jerry Chu <hkchu@google.com> Cc: Yuchung Cheng <ycheng@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
[ luis: backported to 3.16: adjusted context ] Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
PHY status frames are not reliable, the PHY may not be able to send them
during heavy receive traffic. This overflow condition is signaled by the
PHY in the next status frame, but the driver did not make use of it.
Instead it always reported wrong tx timestamps to user space after an
overflow happened because it assigned newly received tx timestamps to old
packets in the queue.
This commit fixes this issue by clearing the tx timestamp queue every time
an overflow happens, so that no timestamps are delivered for overflow
packets. This way time stamping will continue correctly after an overflow.
Signed-off-by: Manfred Rudigier <manfred.rudigier@omicron.at> Acked-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
The value passed by unix_diag_get_exact to unix_lookup_by_ino has type
__u32, but unix_lookup_by_ino's argument ino has type int, which is not
a problem yet.
However, when ino is compared with sock_i_ino return value of type
unsigned long, ino is sign extended to signed long, and this results
to incorrect comparison on 64-bit architectures for inode numbers
greater than INT_MAX.
This bug was found by strace test suite.
Fixes: 5d3cae8bc39d ("unix_diag: Dumping exact socket core") Signed-off-by: Dmitry V. Levin <ldv@altlinux.org> Acked-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
It's forbidden to manually change dev->features in run-time. Currently, this is
done in the driver to make sure that GSO_UDP_TUNNEL is advertized only when
VXLAN tunnel is set. However, since the stack actually does features intersection
with hw_enc_features, we can safely revert to advertizing features early when
registering the netdevice.
Fixes: f4a1edd56120 ('net/mlx4_en: Advertize encapsulation offloads [...]') Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
[ luis: backported to 3.16: adjusted context ] Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Previously, the shift value used for time-stamping was constant and didn't
depend on the HW chip frequency. Change that to take the frequency into account
and calculate the maximal value in cycles per wraparound of ten seconds. This
time slot was chosen since it gives a good accuracy in time synchronization.
Algorithm for shift value calculation:
* Round up the maximal value in cycles to nearest power of two
* Calculate maximal multiplier by division of all 64 bits set
to above result
* Then, invert the function clocksource_khz2mult() to get the shift from
maximal mult value
Fixes: ec693d47010e ('net/mlx4_en: Add HW timestamping (TS) support') Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com> Reviewed-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
RdropOvflw counts overrun of HW buffer, therefore should
be used for rx_fifo_errors only.
Currently RdropOvflw counter is mistakenly also set into
rx_missed_errors and rx_over_errors too, which makes the
device total dropped packets accounting to show wrong results.
Fix that. Use it for rx_fifo_errors only.
Fixes: c27a02cd94d6 ('mlx4_en: Add driver for Mellanox ConnectX 10GbE NIC') Signed-off-by: Amir Vadai <amir@vadai.me> Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
The clear and set masks in the call to phy_set_clr_bits() called from
bcm7xxx_config_init() are inverted. We need to fix this by swapping the two
arguments, that is, set 0 bits, but clear the shade mode 2 enable bit.
Fixes: b560a58c45c66 ("net: phy: add Broadcom BCM7xxx internal PHY driver") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
The unix_dgram_sendmsg routine use the following test
if (unlikely(unix_peer(other) != sk && unix_recvq_full(other))) {
to determine if sk and other are in an n:1 association (either
established via connect or by using sendto to send messages to an
unrelated socket identified by address). This isn't correct as the
specified address could have been bound to the sending socket itself or
because this socket could have been connected to itself by the time of
the unix_peer_get but disconnected before the unix_state_lock(other). In
both cases, the if-block would be entered despite other == sk which
might either block the sender unintentionally or lead to trying to unlock
the same spin lock twice for a non-blocking send. Add a other != sk
check to guard against this.
Fixes: 7d267278a9ec ("unix: avoid use-after-free in ep_remove_wait_queue") Reported-By: Philipp Hahn <pmhahn@pmhahn.de> Signed-off-by: Rainer Weikusat <rweikusat@mobileactivedefense.com> Tested-by: Philipp Hahn <pmhahn@pmhahn.de> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
The current logic in bond_arp_rcv will accept an incoming ARP for
validation if (a) the receiving slave is either "active" (which includes
the currently active slave, or the current ARP slave) or, (b) there is a
currently active slave, and it has received an ARP since it became active.
For case (b), the receiving slave isn't the currently active slave, and is
receiving the original broadcast ARP request, not an ARP reply from the
target.
This logic can fail if there is no currently active slave. In
this situation, the ARP probe logic cycles through all slaves, assigning
each in turn as the "current_arp_slave" for one arp_interval, then setting
that one as "active," and sending an ARP probe from that slave. The
current logic expects the ARP reply to arrive on the sending
current_arp_slave, however, due to switch FDB updating delays, the reply
may be directed to another slave.
This can arise if the bonding slaves and switch are working, but
the ARP target is not responding. When the ARP target recovers, a
condition may result wherein the ARP target host replies faster than the
switch can update its forwarding table, causing each ARP reply to be sent
to the previous current_arp_slave. This will never pass the logic in
bond_arp_rcv, as neither of the above conditions (a) or (b) are met.
Some experimentation on a LAN shows ARP reply round trips in the
200 usec range, but my available switches never update their FDB in less
than 4000 usec.
This patch changes the logic in bond_arp_rcv to additionally
accept an ARP reply for validation on any slave if there is a current ARP
slave and it sent an ARP probe during the previous arp_interval.
Fixes: aeea64ac717a ("bonding: don't trust arp requests unless active slave really works") Cc: Veaceslav Falico <vfalico@gmail.com> Cc: Andy Gospodarek <gospo@cumulusnetworks.com> Signed-off-by: Jay Vosburgh <jay.vosburgh@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>
[ luis: backported to 3.16: adjusted context ] Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Downstream packages like Debian flash-kernel use
/proc/device-tree/model
to determine which dtb file to install.
Hence each dts in the Linux kernel should provide a unique model
identifier.
Commit 2d0a7addbd10 ("ARM: Kirkwood: Add support for many Synology NAS
devices") created the new files kirkwood-ds111.dts and kirkwood-ds112.dts
using the same model identifier.
This patch provides a unique model identifier for the
Synology DiskStation DS112.
Fixes: 2d0a7addbd10 ("ARM: Kirkwood: Add support for many Synology NAS devices") Signed-off-by: Heinrich Schuchardt <xypron.glpk@gmx.de> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
... into returning a positive to path_openat(), which would interpret that
as "symlink had been encountered" and proceed to corrupt memory, etc.
It can only happen due to a bug in some ->open() instance or in some LSM
hook, etc., so we report any such event *and* make sure it doesn't trick
us into further unpleasantness.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
The delete opration can allocate additional space on the HPFS filesystem
due to btree split. The HPFS driver checks in advance if there is
available space, so that it won't corrupt the btree if we run out of space
during splitting.
If there is not enough available space, the HPFS driver attempted to
truncate the file, but this results in a deadlock since the commit 7dd29d8d865efdb00c0542a5d2c87af8c52ea6c7 ("HPFS: Introduce a global mutex
and lock it on every callback from VFS").
This patch removes the code that tries to truncate the file and -ENOSPC is
returned instead. If the user hits -ENOSPC on delete, he should try to
delete other files (that are stored in a leaf btree node), so that the
delete operation will make some space for deleting the file stored in
non-leaf btree node.
Reported-by: Al Viro <viro@ZenIV.linux.org.uk> Signed-off-by: Mikulas Patocka <mikulas@artax.karlin.mff.cuni.cz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
HP EliteBook 755 G2 with ALC3228 (ALC280) codec [103c:221c] requires
the known fixup (ALC269_FIXUP_HEADSET_MIC) for making the headset mic
working. Also, it suffers from the loopback noise problem, so we
should disable aamix path as well.
Reported-by: Derick Eddington <derick.eddington@gmail.com> Signed-off-by: Takashi Iwai <tiwai@suse.de>
[ luis: backported to 3.16: adjusted context ] Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
After login to the desktop on Dell Inspiron 3162,
there's a very loud background noise comes from the builtin speaker.
The noise does not go away even if the speaker is muted.
The noise disappears after using the aamix fixup.
Codec: Realtek ALC3234
Address: 0
AFG Function Id: 0x1 (unsol 1)
Vendor Id: 0x10ec0255
Subsystem Id: 0x10280725
Revision Id: 0x100002
No Modem Function Group found
The contract between try_read() and try_write() is that when called
each processes as much data as possible. When instructed by osd_client
to skip a message, try_read() is violating this contract by returning
after receiving and discarding a single message instead of checking for
more. try_write() then gets a chance to write out more requests,
generating more replies/skips for try_read() to handle, forcing the
messenger into a starvation loop.
Reported-by: Varada Kari <Varada.Kari@sandisk.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Tested-by: Varada Kari <Varada.Kari@sandisk.com> Reviewed-by: Alex Elder <elder@linaro.org> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
The ftrace:function event is only displayed for parsing the function tracer
data. It is not used to enable function tracing, and does not include an
"enable" file in its event directory.
Originally, this event was kept separate from other events because it did
not have a ->reg parameter. But perf added a "reg" parameter for its use
which caused issues, because it made the event available to functions where
it was not compatible for.
Commit 9b63776fa3ca9 "tracing: Do not enable function event with enable"
added a TRACE_EVENT_FL_IGNORE_ENABLE flag that prevented the function event
from being enabled by normal trace events. But this commit missed keeping
the function event from being displayed by the "available_events" directory,
which is used to show what events can be enabled by set_event.
One documented way to enable all events is to:
cat available_events > set_event
But because the function event is displayed in the available_events, this
now causes an INVALID error:
cat: write error: Invalid argument
Reported-by: Chunyu Hu <chuhu@redhat.com> Fixes: 9b63776fa3ca9 "tracing: Do not enable function event with enable" Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
In async_pf we try to allocate with NOWAIT to get an element quickly
or fail. This code also handle failures gracefully. Lets silence
potential page allocation failures under load.
The qword_get() function NUL-terminates its output buffer. If the input
string is in hex format \xXXXX... and the same length as the output
buffer, there is an off-by-one:
This patch ensures the NUL terminator doesn't fall outside the output
buffer.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
set_power_state defaults to no displays, so we need to update
the display configuration after setting up the powerstate on the
first call. In most cases this is not an issue since ends up
getting called multiple times at any given modeset and the proper
order is achieved in the display changed handling at the top of
the function.
Reviewed-by: Christian König <christian.koenig@amd.com> Acked-by: Jordan Lazare <Jordan.Lazare@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
This patch fixes the problem that more CAN messages could be sent to the
interface as could be send on the CAN bus. This was more likely for slow baud
rates. The sleeping _start_xmit was woken up in the _write_bulk_callback. Under
heavy TX load this produced another bulk transfer without checking the
free_slots variable and hence caused the overflow in the interface.
Signed-off-by: Gerhard Uttenthaler <uttenthaler@ems-wuensche.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
There's one point was missed in the patch commit da49889deb34 ("staging:
binder: Support concurrent 32 bit and 64 bit processes."). When configure
BINDER_IPC_32BIT, the size of binder_uintptr_t was 32bits, but size of
void * is 64bit on 64bit system. Correct it here.
Signed-off-by: Lisa Du <cldu@marvell.com> Signed-off-by: Nicolas Boichat <drinkcat@chromium.org> Fixes: da49889deb34 ("staging: binder: Support concurrent 32 bit and 64 bit processes.") Acked-by: Olof Johansson <olof@lixom.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ luis: backported to 3.16:
- binder is still in staging in the 3.16 kernel] Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Add USB ID for cp2104/5 devices on GE B650v3 and B850v3 boards.
Signed-off-by: Ken Lin <ken.lin@advantech.com.tw> Signed-off-by: Akshay Bhat <akshay.bhat@timesys.com> Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
In the case where d_add_unique() finds an appropriate alias to use it will
have already incremented the reference count. An additional dget() to swap
the open context's dentry is unnecessary and will leak a reference.
Signed-off-by: Benjamin Coddington <bcodding@redhat.com> Fixes: 275bb307865a3 ("NFSv4: Move dentry instantiation into the NFSv4-...") Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
The assignement of EP transfer resources was not handled properly in the
dwc3 driver. Commit aebda6187181 ("usb: dwc3: Reset the transfer
resource index on SET_INTERFACE") previously fixed one aspect of this
where resources may be exhausted with multiple calls to SET_INTERFACE.
However, it introduced an issue where composite devices with multiple
interfaces can be assigned the same transfer resources for different
endpoints. This patch solves both issues.
The assignment of transfer resources cannot perfectly follow the data
book due to the fact that the controller driver does not have all
knowledge of the configuration in advance. It is given this information
piecemeal by the composite gadget framework after every
SET_CONFIGURATION and SET_INTERFACE. Trying to follow the databook
programming model in this scenario can cause errors. For two reasons:
1) The databook says to do DEPSTARTCFG for every SET_CONFIGURATION and
SET_INTERFACE (8.1.5). This is incorrect in the scenario of multiple
interfaces.
2) The databook does not mention doing more DEPXFERCFG for new endpoint
on alt setting (8.1.6).
The following simplified method is used instead:
All hardware endpoints can be assigned a transfer resource and this
setting will stay persistent until either a core reset or hibernation.
So whenever we do a DEPSTARTCFG(0) we can go ahead and do DEPXFERCFG for
every hardware endpoint as well. We are guaranteed that there are as
many transfer resources as endpoints.
This patch triggers off of the calling dwc3_gadget_start_config() for
EP0-out, which always happens first, and which should only happen in one
of the above conditions.
Fixes: aebda6187181 ("usb: dwc3: Reset the transfer resource index on SET_INTERFACE") Reported-by: Ravi Babu <ravibabu@ti.com> Signed-off-by: John Youn <johnyoun@synopsys.com> Signed-off-by: Felipe Balbi <balbi@kernel.org>
[ luis: backported to 3.16: adjusted context ] Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
When we call pci_scan_root_bus we would pass in 'sd' (sysdata)
pointer which was an 'pcifront_sd' structure. However in the
pci_device_add it expects that the 'sd' is 'struct sysdata' and
sets the dev->node to what is in sd->node (offset 4):
That is an hole - filled with garbage as we used kmalloc instead of
kzalloc (the second problem).
This patch fixes the issue by:
1) Use kzalloc to initialize to a well known state.
2) Put 'struct pci_sysdata' at the start of 'pcifront_sd'. That
way access to the 'node' will access the right offset.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Commit 8135cf8b092723dbfcc611fe6fdcb3a36c9951c5 (xen/pciback: Save
xen_pci_op commands before processing it) broke enabling MSI-X because
it would never copy the resulting vectors into the response. The
number of vectors requested was being overwritten by the return value
(typically zero for success).
Save the number of vectors before processing the op, so the correct
number of vectors are copied afterwards.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Commit 408fb0e5aa7fda0059db282ff58c3b2a4278baa0 (xen/pciback: Don't
allow MSI-X ops if PCI_COMMAND_MEMORY is not set) prevented enabling
MSI-X on passed-through virtual functions, because it checked the VF
for PCI_COMMAND_MEMORY but this is not a valid bit for VFs.
Instead, check the physical function for PCI_COMMAND_MEMORY.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Ben Hutchings [Mon, 7 Mar 2016 03:40:02 +0000 (03:40 +0000)]
crypto: {blk,giv}cipher: Set has_setkey
Commit a1383cd86a06 ("crypto: skcipher - Add crypto_skcipher_has_setkey")
was incorrectly backported to the 3.2.y and 3.16.y stable branches.
We need to set ablkcipher_tfm::has_setkey in the
crypto_init_blkcipher_ops_async() and crypto_init_givcipher_ops()
functions as well as crypto_init_ablkcipher_ops().
Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
This doesn't seem to fix a regression -- I don't think the CLAC was
ever there.
I double-checked in a debugger: entries through the int80 gate do
not automatically clear AC.
Stable maintainers: I can provide a backport to 4.3 and earlier if
needed. This needs to be backported all the way to 3.10.
Reported-by: Brian Gerst <brgerst@gmail.com> Signed-off-by: Andy Lutomirski <luto@kernel.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: 63bcff2a307b ("x86, smap: Add STAC and CLAC instructions to control user space access") Link: http://lkml.kernel.org/r/b02b7e71ae54074be01fc171cbd4b72517055c0e.1456345086.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
[ kamal: backport to 3.10 through 3.19-stable: file rename; context ] Signed-off-by: Kamal Mostafa <kamal@canonical.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
In __request_region, if a conflict with a BUSY and MUXED resource is
detected, then the caller goes to sleep and waits for the resource to be
released. A pointer on the conflicting resource is kept. At wake-up
this pointer is used as a parent to retry to request the region.
A first problem is that this pointer might well be invalid (if for
example the conflicting resource have already been freed). Another
problem is that the next call to __request_region() fails to detect a
remaining conflict. The previously conflicting resource is passed as a
parameter and __request_region() will look for a conflict among the
children of this resource and not at the resource itself. It is likely
to succeed anyway, even if there is still a conflict.
Instead, the parent of the conflicting resource should be passed to
__request_region().
As a fix, this patch doesn't update the parent resource pointer in the
case we have to wait for a muxed region right after.
Reported-and-tested-by: Vincent Pelletier <plr.vincent@gmail.com> Signed-off-by: Simon Guinot <simon.guinot@sequanux.org> Tested-by: Vincent Donnefort <vdonnefort@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Competing overwrite DIO in dioread_nolock mode will just overwrite
pointer to io_end in the inode. This may result in data corruption or
extent conversion happening from IO completion interrupt because we
don't properly set buffer_defer_completion() when unlocked DIO races
with locked DIO to unwritten extent.
Since unlocked DIO doesn't need io_end for anything, just avoid
allocating it and corrupting pointer from inode for locked DIO.
A cleaner fix would be to avoid these games with io_end pointer from the
inode but that requires more intrusive changes so we leave that for
later.
Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>