Fixes an error in having the iosf build as 'default m'. On X86 SoC's the iosf
sideband is the only way to access information for some registers, as opposed to
through MSR's on other Intel architectures. While selecting IOSF_MBI is
preferred, it does mean carrying extra code on non-SoC architectures. This
exports the selection to the user, allowing those driver writers to compile out
iosf code if it's not being built.
bitmap_parselist("", &mask, nmaskbits) will erroneously set bit zero in
the mask. The same bug is visible in cpumask_parselist() since it is
layered on top of the bitmask code, e.g. if you boot with "isolcpus=",
you will actually end up with cpu zero isolated.
The bug was introduced in commit 4b060420a596 ("bitmap, irq: add
smp_affinity_list interface to /proc/irq") when bitmap_parselist() was
generalized to support userspace as well as kernelspace.
Fixes: 4b060420a596 ("bitmap, irq: add smp_affinity_list interface to /proc/irq") Signed-off-by: Chris Metcalf <cmetcalf@ezchip.com> Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
The final version of commit 637241a900cb ("kmsg: honor dmesg_restrict
sysctl on /dev/kmsg") lost few hooks, as result security_syslog() are
processed incorrectly:
- open of /dev/kmsg checks syslog access permissions by using
check_syslog_permissions() where security_syslog() is not called if
dmesg_restrict is set.
- syslog syscall and /proc/kmsg calls do_syslog() where security_syslog
can be executed twice (inside check_syslog_permissions() and then
directly in do_syslog())
With this patch security_syslog() is called once only in all
syslog-related operations regardless of dmesg_restrict value.
Fixes: 637241a900cb ("kmsg: honor dmesg_restrict sysctl on /dev/kmsg") Signed-off-by: Vasily Averin <vvs@virtuozzo.com> Cc: Kees Cook <keescook@chromium.org> Cc: Josh Boyer <jwboyer@redhat.com> Cc: Eric Paris <eparis@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
When a port goes through a link down/up the multicast router configuration
is not restored.
Signed-off-by: Satish Ashok <sashok@cumulusnetworks.com> Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Fixes: 0909e11758bd ("bridge: Add multicast_router sysfs entries") Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
The omap watchdog has the annoying behaviour that writes to most
registers don't have any effect when the watchdog is already running.
Quoting the AM335x reference manual:
To modify the timer counter value (the WDT_WCRR register),
prescaler ratio (the WDT_WCLR[4:2] PTV bit field), delay
configuration value (the WDT_WDLY[31:0] DLY_VALUE bit field), or
the load value (the WDT_WLDR[31:0] TIMER_LOAD bit field), the
watchdog timer must be disabled by using the start/stop sequence
(the WDT_WSPR register).
Currently the timer is stopped in the .probe callback but still there
are possibilities that yield to a situation where omap_wdt_start is
entered with the timer running (e.g. when /dev/watchdog is closed
without stopping and then reopened). In such a case programming the
timeout silently fails!
To circumvent this stop the timer before reprogramming.
Assuming one of the first things the watchdog user does is setting the
timeout explicitly nothing too bad should happen because this explicit
setting works fine.
The current handler of MMC_BLK_CMD_ERR in mmc_blk_issue_rw_rq function
may cause new coming request permanent missing when the ongoing
request (previoulsy started) complete end.
The problem scenario is as follows:
(1) Request A is ongoing;
(2) Request B arrived, and finally mmc_blk_issue_rw_rq() is called;
(3) Request A encounters the MMC_BLK_CMD_ERR error;
(4) In the error handling of MMC_BLK_CMD_ERR, suppose mmc_blk_cmd_err()
end request A completed and return zero. Continue the error handling,
suppose mmc_blk_reset() reset device success;
(5) Continue the execution, while loop completed because variable ret
is zero now;
(6) Finally, mmc_blk_issue_rw_rq() return without processing request B.
The process related to the missing request may wait that IO request
complete forever, possibly crashing the application or hanging the system.
Fix this issue by starting new request when reset success.
The DPAUX read/write FIFO registers aren't sequential in the register
space, causing transfers larger than 4 bytes to cause accesses to non-
existing FIFO registers.
Commit 73f7d1ca3263 "ACPI / init: Run acpi_early_init() before
timekeeping_init()" moved the ACPI subsystem initialization,
including the ACPI mode enabling, to an earlier point in the
initialization sequence, to allow the timekeeping subsystem
use ACPI early. Unfortunately, that resulted in boot regressions
on some systems and the early ACPI initialization was moved toward
its original position in the kernel initialization code by commit c4e1acbb35e4 "ACPI / init: Invoke early ACPI initialization later".
However, that turns out to be insufficient, as boot is still broken
on the Tyan S8812 mainboard.
To fix that issue, split the ACPI early initialization code into
two pieces so the majority of it still located in acpi_early_init()
and the part switching over the platform into the ACPI mode goes into
a new function, acpi_subsystem_init(), executed at the original early
ACPI initialization spot.
That fixes the Tyan S8812 boot problem, but still allows ACPI
tables to be loaded earlier which is useful to the EFI code in
efi_enter_virtual_mode().
Link: https://bugzilla.kernel.org/show_bug.cgi?id=97141 Fixes: 73f7d1ca3263 "ACPI / init: Run acpi_early_init() before timekeeping_init()" Reported-and-tested-by: Marius Tolzmann <tolzmann@molgen.mpg.de> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Toshi Kani <toshi.kani@hp.com> Reviewed-by: Hanjun Guo <hanjun.guo@linaro.org> Reviewed-by: Lee, Chun-Yi <jlee@suse.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
The optimized task selection logic optimistically selects a new task
to run without first doing a full put_prev_task(). This is so that we
can avoid a put/set on the common ancestors of the old and new task.
Similarly, we should only call check_cfs_rq_runtime() to throttle
eligible groups if they're part of the common ancestry, otherwise it
is possible to end up with no eligible task in the simple task
selection.
Imagine:
/root
/prev /next
/A /B
If our optimistic selection ends up throttling /next, we goto simple
and our put_prev_task() ends up throttling /prev, after which we're
going to bug out in set_next_entity() because there aren't any tasks
left.
Avoid this scenario by only throttling common ancestors.
Reported-by: Mohammed Naser <mnaser@vexxhost.com> Reported-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Signed-off-by: Ben Segall <bsegall@google.com>
[ munged Changelog ] Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Roman Gushchin <klamm@yandex-team.ru> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: pjt@google.com Fixes: 678d5718d8d0 ("sched/fair: Optimize cgroup pick_next_task_fair()") Link: http://lkml.kernel.org/r/xm26wq1oswoq.fsf@sword-of-the-dawn.mtv.corp.google.com Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Node 0 might be offline as well as any other numa node,
in this case kernel cannot handle memory allocation and crashes.
Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Fixes: 0c3f061c195c ("of: implement of_node_to_nid as a weak function") Signed-off-by: Grant Likely <grant.likely@linaro.org> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Since commit 0723a0473fb4 ("btrfs: allow mounting btrfs subvolumes with
different ro/rw options"), when mounting a subvolume read/write when
another subvolume has previously been mounted read-only, we first do a
remount. However, this should be done with the superblock locked, as per
sync_filesystem():
/*
* We need to be protected against the filesystem going from
* r/o to r/w or vice versa.
*/
WARN_ON(!rwsem_is_locked(&sb->s_umount));
This WARN_ON can easily be hit with:
mkfs.btrfs -f /dev/vdb
mount /dev/vdb /mnt
btrfs subvol create /mnt/vol1
btrfs subvol create /mnt/vol2
umount /mnt
mount -oro,subvol=/vol1 /dev/vdb /mnt
mount -orw,subvol=/vol2 /dev/vdb /mnt2
Fixes: 0723a0473fb4 ("btrfs: allow mounting btrfs subvolumes with different ro/rw options") Reviewed-by: David Sterba <dsterba@suse.cz> Signed-off-by: Omar Sandoval <osandov@osandov.com> Signed-off-by: Chris Mason <clm@fb.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
When encoding the NFSACL SETACL operation, reserve just the estimated
size of the ACL rather than a fixed maximum. This eliminates needless
zero padding on the wire that the server ignores.
Fixes: ee5dc7732bd5 ('NFS: Fix "kernel BUG at fs/nfs/nfs3xdr.c:1338!"') Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
The USB phy should initialize with power-off, and will be powered on
by the USB system when a cable connection is detected.
Having this pm_runtime_get_sync() during probe causes the phy to
*always* be powered on.
Removing it returns to sensible power management.
Fixes: 96be39ab34b77c6f6f5cd6ae03aac6c6449ee5c4 Signed-off-by: NeilBrown <neil@brown.name> Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
It was possible for mac80211 to be coerced into an
unexpected flow causing sdata union to become
corrupted. Station pointer was put into
sdata->u.vlan.sta memory location while it was
really master AP's sdata->u.ap.next_beacon. This
led to station entry being later freed as
next_beacon before __sta_info_flush() in
ieee80211_stop_ap() and a subsequent invalid
pointer dereference crash.
The problem was that ieee80211_ptr->use_4addr
wasn't cleared on interface type changes.
This could be reproduced with the following steps:
# host A and host B have just booted; no
# wpa_s/hostapd running; all vifs are down
host A> iw wlan0 set type station
host A> iw wlan0 set 4addr on
host A> printf 'interface=wlan0\nssid=4addrcrash\nchannel=1\nwds_sta=1' > /tmp/hconf
host A> hostapd -B /tmp/conf
host B> iw wlan0 set 4addr on
host B> ifconfig wlan0 up
host B> iw wlan0 connect -w hostAssid
host A> pkill hostapd
# host A crashed:
Note: This isn't a revert of f8cdddb8d61d
("cfg80211: check iface combinations only when
iface is running") as far as functionality is
considered because b6a550156bc ("cfg80211/mac80211:
move more combination checks to mac80211") moved
the logic somewhere else already.
Fixes: f8cdddb8d61d ("cfg80211: check iface combinations only when iface is running") Signed-off-by: Michal Kazior <michal.kazior@tieto.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
There was a possible race between
ieee80211_reconfig() and
ieee80211_delayed_tailroom_dec(). This could
result in inability to transmit data if driver
crashed during roaming or rekeying and subsequent
skbs with insufficient tailroom appeared.
This race was probably never seen in the wild
because a device driver would have to crash AND
recover within 0.5s which is very unlikely.
I was able to prove this race exists after
changing the delay to 10s locally and crashing
ath10k via debugfs immediately after GTK
rekeying. In case of ath10k the counter went below
0. This was harmless but other drivers which
actually require tailroom (e.g. for WEP ICV or
MMIC) could end up with the counter at 0 instead
of >0 and introduce insufficient skb tailroom
failures because mac80211 would not resize skbs
appropriately anymore.
Fixes: 8d1f7ecd2af5 ("mac80211: defer tailroom counter manipulation when roaming") Signed-off-by: Michal Kazior <michal.kazior@tieto.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Since commit bd31b85960a7 (which is in 3.2-rc1) nw_gpio_lock is a raw spinlock
that needs usage of the corresponding raw functions.
This fixes:
drivers/mtd/maps/dc21285.c: In function 'nw_en_write':
drivers/mtd/maps/dc21285.c:41:340: warning: passing argument 1 of 'spinlock_check' from incompatible pointer type
spin_lock_irqsave(&nw_gpio_lock, flags);
In file included from include/linux/seqlock.h:35:0,
from include/linux/time.h:5,
from include/linux/stat.h:18,
from include/linux/module.h:10,
from drivers/mtd/maps/dc21285.c:8:
include/linux/spinlock.h:299:102: note: expected 'struct spinlock_t *' but argument is of type 'struct raw_spinlock_t *'
static inline raw_spinlock_t *spinlock_check(spinlock_t *lock)
^
drivers/mtd/maps/dc21285.c:43:25: warning: passing argument 1 of 'spin_unlock_irqrestore' from incompatible pointer type
spin_unlock_irqrestore(&nw_gpio_lock, flags);
^
In file included from include/linux/seqlock.h:35:0,
from include/linux/time.h:5,
from include/linux/stat.h:18,
from include/linux/module.h:10,
from drivers/mtd/maps/dc21285.c:8:
include/linux/spinlock.h:370:91: note: expected 'struct spinlock_t *' but argument is of type 'struct raw_spinlock_t *'
static inline void spin_unlock_irqrestore(spinlock_t *lock, unsigned long flags)
Fixes: bd31b85960a7 ("locking, ARM: Annotate low level hw locks as raw") Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Signed-off-by: Brian Norris <computersforpeace@gmail.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
The problem is that set_bit() takes the actual bit number and not a mask
so static checkers get upset. It doesn't affect run time because we do
it consistently, but we may as well clean it up.
Fixes: 6010ce07a66c ('rndis_wlan: do link-down state change in worker thread') Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
In d8a2c51cdcae ('ath9k_htc: Use atomic operations for op_flags') we
changed things like this:
- if (priv->op_flags & OP_TSF_RESET) {
+ if (test_bit(OP_TSF_RESET, &priv->op_flags)) {
The problem is that test_bit() takes a bit number and not a mask. It
means that when we do:
set_bit(OP_TSF_RESET, &priv->op_flags);
Then it sets the (1 << 6) bit instead of the 6 bit so we are setting a
bit which is past the end of the unsigned long.
Fixes: d8a2c51cdcae ('ath9k_htc: Use atomic operations for op_flags') Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
When multiplexling a MAD sent from VF, we should convert the port used
by the guest to send the packet to the actual physical port which will be
used to transmit the packet, before building the relevant address-handle (AH).
This is needed under VPI for single ported VFs, since the code that builds
the AH (mlx4_ib_query_ah()) makes decisions based on the input port. If we
use the port number provided by the guest, it might have different protocol
vs. the one this packat has to go from, and hence the result could be wrong.
So far, the conversion was done after the AH was built and it worked for
single ported Eth VFs which were not enabled under VPI. When adding support
for single ported IB VFs and VPI, we hit that.
Fixes: 449fc48866f7 ('net/mlx4: Adapt code for N-Port VF') Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Single port VFs always provide port = 1 (even if the actual physical
port used is port 2). As such, we need to convert the port provided
by the VF to the physical port before calling into the firmware.
It turns out that the Linux mlx4 VF RoCE driver maintains a copy of
the GID table and hence this change became critical only for single
ported IB VFs, but it could be needed for other RoCE VF drivers too.
Fixes: 449fc48866f7 ('net/mlx4: Adapt code for N-Port VF') Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
The platform_sysrq_reset_seq code was intended as a way for an embedded
platform to provide its own sysrq sequence at compile time. After over
two years, nobody has started using it in an upstream kernel, and
the platforms that were interested in it have moved on to devicetree,
which can be used to configure the sequence without requiring kernel
changes. The method is also incompatible with the way that most
architectures build support for multiple platforms into a single
kernel.
Now the code is producing warnings when built with gcc-5.1:
drivers/tty/sysrq.c: In function 'sysrq_init':
drivers/tty/sysrq.c:959:33: warning: array subscript is above array bounds [-Warray-bounds]
key = platform_sysrq_reset_seq[i];
We could fix this, but it seems unlikely that it will ever be used,
so let's just remove the code instead. We still have the option to
pass the sequence either in DT, using the kernel command line,
or using the /sys/module/sysrq/parameters/reset_seq file.
Signed-off-by: Arnd Bergmann <arnd@arndb.de> Fixes: 154b7a489a ("Input: sysrq - allow specifying alternate reset sequence")
----
v2: moved sysrq_reset_downtime_ms variable to avoid introducing a compile
warning when CONFIG_INPUT is disabled Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Static checkers complain that the current condition is never true. It
seems pretty likely that it's a typo and "URB" was intended instead of
"USB".
Fixes: 3d97ff63f899 ('usbdevfs: Use scatter-gather lists for large bulk transfers') Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Too many spaces were introduced in commit 63adc6fb8ac0 ("pktgen: cleanup
checkpatch warnings"), thus misaligning "src_min:" to other columns.
Fixes: 63adc6fb8ac0 ("pktgen: cleanup checkpatch warnings") Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
The intent was to use bits 0, 1, and 2 but because of the extra shifts
we're using bits 1, 2, and 4. It's harmless becuase it's done
consistently but it's not the intent and static checkers will complain.
Fixes: 4a200c3b9a40 ('HID: i2c-hid: introduce HID over i2c specification implementation') Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Benjamin Tissoires <benjamin.tissoires@redhat.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
So the intent was to use bits 0, 1 and 2 but because of the extra BIT()
shifts we're actually using 1, 2 and 4. It's harmless because it's done
consistently but static checkers will complain.
Fixes: 9fb6bf02e3ad ('HID: rmi: introduce RMI driver for Synaptics touchpads') Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Benjamin Tissoires <benjamin.tissoires@redhat.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
make C=2 CF=-D__CHECK_ENDIAN__ net/ipv4/ipip.o
CHECK net/ipv4/ipip.c
net/ipv4/ipip.c:254:27: warning: incorrect type in assignment (different base types)
net/ipv4/ipip.c:254:27: expected restricted __be32 [addressable] [usertype] o_key
net/ipv4/ipip.c:254:27: got restricted __be16 [addressable] [usertype] i_flags
Fixes: 3b7b514f44bf ("ipip: fix a regression in ioctl") Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
of_clk_get_from_provider() returns ERR_PTR on failure. The
dra7-atl-clock driver was not checking its return value and
immediately used it in __clk_get_hw(). __clk_get_hw()
dereferences supplied clock, if it is not NULL, so in that case
it would dereference an ERR_PTR.
Fixes: 9ac33b0ce81f ("CLK: TI: Driver for DRA7 ATL (Audio Tracking Logic)") Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com> Signed-off-by: Stephen Boyd <sboyd@codeaurora.org> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Failure return from dlpar_configure_connector when dlpar adding cpus
results in leaking references to the cpus parent device node. Move the
call to of_node_put() prior to checking the result of
dlpar_configure_connector.
Fixes: 8d5ff320766f ("powerpc/pseries: Make dlpar_configure_connector parent node aware") Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
[ luis: backported to 3.16: adjusted context ] Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
When the VLAN_HLEN was added to the calculation for the maximum frame size
there seems to have been a number of issues added to the driver.
The first issue is that in some cases the maximum frame size for a device
never really reached the actual maximum frame size as the VLAN header
length was not included the calculation for that value. As a result some
parts only supported a maximum frame size of either 1496 in the case of
parts that didn't support jumbo frames, and 8996 in the case of the parts
that do.
The second issue is the fact that there were several checks that weren't
updated so as a result setting an MTU of 1500 was treated as enabling jumbo
frames as the calculated value was 1522 instead of 1518. I have addressed
those by replacing ETH_FRAME_LEN with VLAN_ETH_FRAME_LEN where appropriate.
The final issue was the fact that lowering the MTU below 1500 would cause
the driver to allocate 2K buffers for the rings. This is an old issue that
was fixed several years ago in igb/ixgbe and I am addressing now by just
replacing == with a <= so that we always just round up to 1522 for anything
that isn't a jumbo frame.
Fixes: c751a3d58cf2d ("e1000e: Correctly include VLAN_HLEN when changing interface MTU") Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
[ luis: backported to 3.16:
- dropped changes to struct e1000_pch_spt_info as i219 isn't supported ] Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
key/value pairs in a JSON object must be separated by a comma.
After adding the properties "accuracy" and "phase" the JSON output
of /sys/kernel/debug/clk/clk_dump is invalid.
So add the missing commas to fix it.
Fixes: 5279fc402ae5 ("clk: add clk accuracy retrieval support") Signed-off-by: Stefan Wahren <stefan.wahren@i2se.com>
[sboyd@codeaurora.org: Added comment in function] Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
[ luis: backported to 3.16: adjusted context ] Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
If we'd already sent a request and decide to abort it, we *must*
issue TFLUSH properly and not just blindly reuse the tag, or
we'll get seriously screwed when response eventually arrives
and we confuse it for response to later request that had reused
the same tag.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Eric noticed problems with vhost-scsi and virtio-ccw: vhost-scsi
complained about overwriting values in the config space, which
was triggered by a broken implementation of virtio-ccw's config
get/set routines. It was probably sheer luck that we did not hit
this before.
When writing a value to the config space, the WRITE_CONF ccw will
always write from the beginning of the config space up to and
including the value to be set. If the config space up to the value
has not yet been retrieved from the device, however, we'll end up
overwriting values. Keep track of the known config space and update
if needed to avoid this.
Moreover, READ_CONF will only read the number of bytes it has been
instructed to retrieve, so we must not copy more than that to the
buffer, or we might overwrite trailing values.
Reported-by: Eric Farman <farman@linux.vnet.ibm.com> Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com> Reviewed-by: Eric Farman <farman@linux.vnet.ibm.com> Tested-by: Eric Farman <farman@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Writes were a bit racy, but hard to turn into a bug at the same time.
(Particularly because modern Linux doesn't use this feature anymore.)
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
[Actually the next patch makes it much, much easier to trigger the race
so I'm including this one for stable@ as well. - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
The following commit temporarily disables correct 64-bit FADT addresses
favor during the period the root cause of the bug is not fixed:
Commit: 85dbd5801f62b66e2aa7826aaefcaebead44c8a6
ACPICA: Tables: Restore old behavor to favor 32-bit FADT addresses.
With enough protections, this patch re-enables 64-bit FADT addresses by
default. If regressions are reported against such change, this patch should
be bisected and reverted.
Note that 64-bit FACS favor and 64-bit firmware waking vector favor are
excluded by this commit in order not to break OSPMs. Lv Zheng.
This patch adds a new FACS initialization flag for acpi_tb_initialize().
acpi_enable_subsystem() might be invoked several times in OS bootup process,
and we don't want FACS initialization to be invoked twice. Lv Zheng.
Link: https://github.com/acpica/acpica/commit/90f5332a Signed-off-by: Lv Zheng <lv.zheng@intel.com> Signed-off-by: Bob Moore <robert.moore@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
The following commit is reported to have broken s2ram on some platforms:
Commit: 0249ed2444d65d65fc3f3f64f398f1ad0b7e54cd
ACPICA: Add option to favor 32-bit FADT addresses.
The platform reports 2 FACS tables (which is not allowed by ACPI
specification) and the new 32-bit address favor rule forces OSPMs to use
the FACS table reported via FADT's X_FIRMWARE_CTRL field.
The root cause of the reported bug might be one of the followings:
1. BIOS may favor the 64-bit firmware waking vector address when the
version of the FACS is greater than 0 and Linux currently only supports
resuming from the real mode, so the 64-bit firmware waking vector has
never been set and might be invalid to BIOS while the commit enables
higher version FACS.
2. BIOS may favor the FACS reported via the "FIRMWARE_CTRL" field in the
FADT while the commit doesn't set the firmware waking vector address of
the FACS reported by "FIRMWARE_CTRL", it only sets the firware waking
vector address of the FACS reported by "X_FIRMWARE_CTRL".
This patch excludes the cases that can trigger the bugs caused by the root
cause 2.
There is no handshaking mechanism can be used by OSPM to tell BIOS which
FACS is currently used. Thus the FACS reported by "FIRMWARE_CTRL" may still
be used by BIOS and the 0 value of the 32-bit firmware waking vector might
trigger such failure.
This patch tries to favor 32bit FACS address in another way where both the
FACS reported by "FIRMWARE_CTRL" and the FACS reported by "X_FIRMWARE_CTRL"
are loaded so that further commit can set firmware waking vector in the
both tables to ensure we can exclude the cases that trigger the bugs caused
by the root cause 2. The exclusion is split into 2 commits as this commit
is also useful for dumping more ACPI tables, it won't get reverted when
such exclusion is no longer necessary. Lv Zheng.
The mcp3021 scaling code is dividing the VDD (full-scale) value in
millivolts by the A2D resolution to obtain the scaling factor. When VDD
is 3300mV (the standard value) and the resolution is 12-bit (4096
divisions), the result is a scale factor of 3300/4096, which is always
one. Effectively, the raw A2D reading is always being returned because
no scaling is applied.
This patch fixes the issue and simplifies the register-to-volts
calculation, removing the unneeded "output_scale" struct member.
Signed-off-by: Nick Stevens <Nick.Stevens@digi.com>
[Guenter Roeck: Dropped unnecessary value check] Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
The warning message in prepend_path is unclear and outdated. It was
added as a warning that the mechanism for generating names of pseudo
files had been removed from prepend_path and d_dname should be used
instead. Unfortunately the warning reads like a general warning,
making it unclear what to do with it.
Remove the warning. The transition it was added to warn about is long
over, and I added code several years ago which in rare cases causes
the warning to fire on legitimate code, and the warning is now firing
and scaring people for no good reason.
Reported-by: Ivan Delalande <colona@arista.com> Reported-by: Omar Sandoval <osandov@osandov.com> Fixes: f48cfddc6729e ("vfs: In d_path don't call d_dname on a mount point") Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
fs_fully_visible attempts to make fresh mounts of proc and sysfs give
the mounter no more access to proc and sysfs than if they could have
by creating a bind mount. One aspect of proc and sysfs that makes
this particularly tricky is that there are other filesystems that
typically mount on top of proc and sysfs. As those filesystems are
mounted on empty directories in practice it is safe to ignore them.
However testing to ensure filesystems are mounted on empty directories
has not been something the in kernel data structures have supported so
the current test for an empty directory which checks to see
if nlink <= 2 is a bit lacking.
proc and sysfs have recently been modified to use the new empty_dir
infrastructure to create all of their dedicated mount points. Instead
of testing for S_ISDIR(inode->i_mode) && i_nlink <= 2 to see if a
directory is empty, test for is_empty_dir_inode(inode). That small
change guaranteess mounts found on proc and sysfs really are safe to
ignore, because the directories are not only empty but nothing can
ever be added to them. This guarantees there is nothing to worry
about when mounting proc and sysfs.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Add two functions sysfs_create_mount_point and
sysfs_remove_mount_point that hang a permanently empty directory off
of a kobject or remove a permanently emptpy directory hanging from a
kobject. Export these new functions so modular filesystems can use
them.
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
To ensure it is safe to mount proc and sysfs I need to check if
filesystems that are mounted on top of them are mounted on truly empty
directories. Given that some directories can gain entries over time,
knowing that a directory is empty right now is insufficient.
Therefore add supporting infrastructure for permantently empty
directories that proc and sysfs can use when they create mount points
for filesystems and fs_fully_visible can use to test for permanently
empty directories to ensure that nothing will be gained by mounting a
fresh copy of proc or sysfs.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
[ luis: backported to 3.16: adjusted context ] Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Introduce some function for getting the inode (and also the dentry) in an
environment where layered/unioned filesystems are in operation.
The problem is that we have places where we need *both* the union dentry and
the lower source or workspace inode or dentry available, but we can only have
a handle on one of them. Therefore we need to derive the handle to the other
from that.
The idea is to introduce an extra field in struct dentry that allows the union
dentry to refer to and pin the lower dentry.
Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
[ luis: 3.16 prereq for: fbabfd0f4ee2 "fs: Add helper functions for permanently empty directories." ] Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Limit the mounts fs_fully_visible considers to locked mounts.
Unlocked can always be unmounted so considering them adds hassle
but no security benefit.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
fc->release is called from fuse_conn_put() which was used in the error
cleanup before fc->release was initialized.
[Jeremiah Mahler <jmmahler@gmail.com>: assign fc->release after calling
fuse_conn_init(fc) instead of before.]
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Fixes: a325f9b92273 ("fuse: update fuse_conn_init() and separate out fuse_conn_kill()") Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
The current pmd_huge() and pud_huge() functions simply check if the table
bit is not set and reports the entries as huge in that case. This is
counter-intuitive as a clear pmd/pud cannot also be a huge pmd/pud, and
it is inconsistent with at least arm and x86.
To prevent others from making the same mistake as me in looking at code
that calls these functions and to fix an issue with KVM on arm64 that
causes memory corruption due to incorrect page reference counting
resulting from this mistake, let's change the behavior.
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org> Reviewed-by: Steve Capper <steve.capper@linaro.org> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Fixes: 084bd29810a5 ("ARM64: mm: HugeTLB support.") Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
rbd_obj_request_create() is called on the main I/O path, so we need to
use GFP_NOIO to make sure allocation doesn't blow back on us. Not all
callers need this, but I'm still hardcoding the flag inside rather than
making it a parameter because a) this is going to stable, and b) those
callers shouldn't really use rbd_obj_request_create() and will be fixed
in the future.
More memory allocation fixes will follow.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Alex Elder <elder@linaro.org> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
struct crush_bucket_tree::num_nodes is u8, so ceph_decode_8_safe()
should be used. -Wconversion catches this, but I guess it went
unnoticed in all the noise it spews. The actual problem (at least for
common crushmaps) isn't the u32 -> u8 truncation though - it's the
advancement by 4 bytes instead of 1 in the crushmap buffer.
The Ethernet controller found in the Armada 370, 380 and 385 SoCs don't
support TCP/IP checksumming with frame sizes larger than 1600 bytes.
This patch fixes the issue by disabling the features NETIF_F_IP_CSUM and
NETIF_F_TSO for the Armada 370 and compatibles SoCs when the MTU is set
to a value greater than 1600 bytes.
Signed-off-by: Simon Guinot <simon.guinot@sequanux.org> Fixes: c5aff18204da ("net: mvneta: driver for Marvell Armada 370/XP network unit") Acked-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Signed-off-by: David S. Miller <davem@davemloft.net>
[ luis: backported to 3.16: adjusted context ] Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
This patch updates the Ethernet DT nodes for Armada XP SoCs with the
compatible string "marvell,armada-xp-neta".
Signed-off-by: Simon Guinot <simon.guinot@sequanux.org> Fixes: 77916519cba3 ("arm: mvebu: Armada XP MV78230 has only three Ethernet interfaces") Acked-by: Gregory CLEMENT <gregory.clement@free-electrons.com> Reviewed-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
The mvneta driver supports the Ethernet IP found in the Armada 370, XP,
380 and 385 SoCs. Since at least one more hardware feature is available
for the Armada XP SoCs then a way to identify them is needed.
This patch introduces a new compatible string "marvell,armada-xp-neta".
Signed-off-by: Simon Guinot <simon.guinot@sequanux.org> Fixes: c5aff18204da ("net: mvneta: driver for Marvell Armada 370/XP network unit") Acked-by: Gregory CLEMENT <gregory.clement@free-electrons.com> Acked-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
In order for hibernation to reliably work we need to properly turn
off the SDMA block, sadly after numerous attemps i haven't not found
proper sequence for clean and full shutdown. So simply reset both
SDMA block, this makes hibernation works reliably on sea island GPU
family (CI)
Hibernation and suspend to ram were tested (several times) on :
Bonaire
Hawaii
Mullins
Kaveri
Kabini
Signed-off-by: Jérôme Glisse <jglisse@redhat.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
In order for hibernation to reliably work we need to cleanup more
thoroughly the compute ring. Hibernation is different from suspend
resume as when we resume from hibernation the hardware is first
fully initialize by regular kernel then freeze callback happens
(which correspond to a suspend inside the radeon kernel driver)
and turn off each of the block. It turns out we were not cleanly
shutting down the compute ring. This patch fix that.
Hibernation and suspend to ram were tested (several times) on :
Bonaire
Hawaii
Mullins
Kaveri
Kabini
Changed since v1:
- Factor the ring stop logic into a function taking ring as arg.
Signed-off-by: Jérôme Glisse <jglisse@redhat.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
In needs_ilk_vtd_wa(), we pass in the GPU device but compared it against
the ids for the mobile GPU and the mobile host bridge. That latter is
impossible and so likely was just a typo for the desktop GPU device id
(which is also buggy).
drm/i915: Disable WC PTE updates to w/a buggy IOMMU on ILK
Reported-by: Ting-Wei Lan <lantw44@gmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91127
References: https://bugzilla.freedesktop.org/show_bug.cgi?id=60391 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Fujitsu Lifebook E780 sets the sequence number 0x0f to only only of
the two headphones, thus the driver tries to assign another as the
line-out, and this results in the inconsistent mapping between the
created jack ctl and the actual I/O. Due to this, PulseAudio doesn't
handle it properly and gets the silent output.
The fix is to ignore the non-HP sequencer checks.
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=99681 Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Acer Aspire V5 with ALC282 codec needs the similar quirk like Dell
laptops to support the headset mic. The headset mic pin is 0x19 and
it's not exposed by BIOS, thus we need to fix the pincfg as well.
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=96201 Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Whilst testing cpu hotplug events on kernel configured with
DEBUG_PREEMPT and DEBUG_ATOMIC_SLEEP we get following BUG message,
caused by calling request_irq() and free_irq() in the context of
hotplug notification (which is in this case atomic context).
[ 40.785859] CPU1: Software reset
[ 40.786660] BUG: sleeping function called from invalid context at mm/slub.c:1241
[ 40.786668] in_atomic(): 1, irqs_disabled(): 128, pid: 0, name: swapper/1
[ 40.786678] Preemption disabled at:[< (null)>] (null)
[ 40.786681]
[ 40.786692] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 3.19.0-rc4-00024-g7dca860 #36
[ 40.786698] Hardware name: SAMSUNG EXYNOS (Flattened Device Tree)
[ 40.786728] [<c0014a00>] (unwind_backtrace) from [<c0011980>] (show_stack+0x10/0x14)
[ 40.786747] [<c0011980>] (show_stack) from [<c0449ba0>] (dump_stack+0x70/0xbc)
[ 40.786767] [<c0449ba0>] (dump_stack) from [<c00c6124>] (kmem_cache_alloc+0xd8/0x170)
[ 40.786785] [<c00c6124>] (kmem_cache_alloc) from [<c005d6f8>] (request_threaded_irq+0x64/0x128)
[ 40.786804] [<c005d6f8>] (request_threaded_irq) from [<c0350b8c>] (exynos4_local_timer_setup+0xc0/0x13c)
[ 40.786820] [<c0350b8c>] (exynos4_local_timer_setup) from [<c0350ca8>] (exynos4_mct_cpu_notify+0x30/0xa8)
[ 40.786838] [<c0350ca8>] (exynos4_mct_cpu_notify) from [<c003b330>] (notifier_call_chain+0x44/0x84)
[ 40.786857] [<c003b330>] (notifier_call_chain) from [<c0022fd4>] (__cpu_notify+0x28/0x44)
[ 40.786873] [<c0022fd4>] (__cpu_notify) from [<c0013714>] (secondary_start_kernel+0xec/0x150)
[ 40.786886] [<c0013714>] (secondary_start_kernel) from [<40008764>] (0x40008764)
Interrupts cannot be requested/freed in the CPU_STARTING/CPU_DYING
notifications which run on the hotplugged cpu with interrupts and
preemption disabled.
To avoid the issue, request the interrupts for all possible cpus in
the boot code. The interrupts are marked NO_AUTOENABLE to avoid a racy
request_irq/disable_irq() sequence. The flag prevents the
request_irq() code from enabling the interrupt immediately.
The interrupt is then enabled in the CPU_STARTING notifier of the
hotplugged cpu and again disabled with disable_irq_nosync() in the
CPU_DYING notifier.
[ tglx: Massaged changelog to match the patch ]
Fixes: 7114cd749a12 ("clocksource: exynos_mct: use (request/free)_irq calls for local timer registration") Reported-by: Krzysztof Kozlowski <k.kozlowski@samsung.com> Reviewed-by: Krzysztof Kozlowski <k.kozlowski@samsung.com> Tested-by: Krzysztof Kozlowski <k.kozlowski@samsung.com> Tested-by: Marcin Jabrzyk <m.jabrzyk@samsung.com> Signed-off-by: Damian Eppel <d.eppel@samsung.com> Cc: m.szyprowski@samsung.com Cc: kyungmin.park@samsung.com Cc: daniel.lezcano@linaro.org Cc: kgene@kernel.org Cc: linux-arm-kernel@lists.infradead.org Link: http://lkml.kernel.org/r/1435324984-7328-1-git-send-email-d.eppel@samsung.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
The filter_parse() function can call infix_get_op() which calls
infix_advance() that updates the infix filter pointers for the cnt
and tail without checking if the filter is already at the end, which
will put the cnt to zero and the tail beyond the end. The loop then calls
infix_next() that has
The cnt will now be below zero, and the tail that is returned is
already passed the end of the filter string. So far the allocation
of the filter string usually has some buffer that is zeroed out, but
if the filter string is of the exact size of the allocated buffer
there's no guarantee that the charater after the nul terminating
character will be zero.
Luckily, only root can write to the filter.
Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
When testing the fix for the trace filter, I could not come up with
a scenario where the operand count goes below zero, so I added a
WARN_ON_ONCE(cnt < 0) to the logic. But there is legitimate case
that it can happen (although the filter would be wrong).
That is, a single operation without any operands will hit the path
where the WARN_ON_ONCE() can trigger. Although this is harmless,
and the filter is reported as a error. But instead of spitting out
a warning to the kernel dmesg, just fail nicely and report it via
the proper channels.
Link: http://lkml.kernel.org/r/558C6082.90608@oracle.com Reported-by: Vince Weaver <vincent.weaver@maine.edu> Reported-by: Sasha Levin <sasha.levin@oracle.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Thinkpad X250, when attached to a dock, has two headphone outs but
no line out. Make sure we don't try to turn this into one headphone
and one line out (since that disables the headphone amp on the dock).
This commit fix kernel crash when probing for rfkill devices in dell-laptop
driver failed. Function free_page() was incorrectly used on struct page *
instead of virtual address of SMI buffer.
This commit also simplify allocating page for SMI buffer by using
__get_free_page() function instead of sequential call of functions
alloc_page() and page_address().
Signed-off-by: Pali Rohár <pali.rohar@gmail.com> Acked-by: Michal Hocko <mhocko@suse.cz> Signed-off-by: Darren Hart <dvhart@linux.intel.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
The kmemleak scanning thread can run for minutes. Callbacks like
kmemleak_free() are allowed during this time, the race being taken care
of by the object->lock spinlock. Such lock also prevents a memory block
from being freed or unmapped while it is being scanned by blocking the
kmemleak_free() -> ... -> __delete_object() function until the lock is
released in scan_object().
When a kmemleak error occurs (e.g. it fails to allocate its metadata),
kmemleak_enabled is set and __delete_object() is no longer called on
freed objects. If kmemleak_scan is running at the same time,
kmemleak_free() no longer waits for the object scanning to complete,
allowing the corresponding memory block to be freed or unmapped (in the
case of vfree()). This leads to kmemleak_scan potentially triggering a
page fault.
This patch separates the kmemleak_free() enabling/disabling from the
overall kmemleak_enabled nob so that we can defer the disabling of the
object freeing tracking until the scanning thread completed. The
kmemleak_free_part() is deliberately ignored by this patch since this is
only called during boot before the scanning thread started.
- arch_spin_lock/unlock were lacking the ACQUIRE/RELEASE barriers
Since ARCv2 only provides load/load, store/store and all/all, we need
the full barrier
- LLOCK/SCOND based atomics, bitops, cmpxchg, which return modified
values were lacking the explicit smp barriers.
- Non LLOCK/SCOND varaints don't need the explicit barriers since that
is implicity provided by the spin locks used to implement the
critical section (the spin lock barriers in turn are also fixed in
this commit as explained above
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Vineet Gupta <vgupta@synopsys.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Many harddisks (mostly WD ones) have firmware problems and take too
long, more than 10 seconds, to resume from suspend. And this often
exceeds the default DPM watchdog timeout (12 seconds), resulting in a
kernel panic out of sudden.
Since most distros just take the default as is, we should give a bit
more safer value. This patch increases the default value from 12
seconds to one minute, which has been confirmed to be long enough for
such problematic disks.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=91921 Fixes: 70fea60d888d (PM / Sleep: Detect device suspend/resume lockup and log event) Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Current implementation of descriptor init procedure only takes
care about setting/clearing ownership flag in "des0"/"des1"
fields while it is perfectly possible to get unexpected bits
set because of the following factors:
[1] On driver probe underlying memory allocated with
dma_alloc_coherent() might not be zeroed and so
it will be filled with garbage.
[2] During driver operation some bits could be set by SD/MMC
controller (for example error flags etc).
And unexpected and/or randomly set flags in "des0"/"des1"
fields may lead to unpredictable behavior of GMAC DMA block.
This change addresses both items above with:
[1] Use of dma_zalloc_coherent() instead of simple
dma_alloc_coherent() to make sure allocated memory is
zeroed. That shouldn't affect performance because
this allocation only happens once on driver probe.
[2] Do explicit zeroing of both "des0" and "des1" fields
of all buffer descriptors during initialization of
DMA transfer.
And while at it fixed identation of dma_free_coherent()
counterpart as well.
Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com> Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com> Cc: arc-linux-dev@synopsys.com Cc: linux-kernel@vger.kernel.org Cc: David Miller <davem@davemloft.net> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
file_remove_suid() could mistakenly set S_NOSEC inode bit when root was
modifying the file. As a result following writes to the file by ordinary
user would avoid clearing suid or sgid bits.
Fix the bug by checking actual mode bits before setting S_NOSEC.
Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Add code to nf_unregister_hook to flush the nf_queue when a hook is
unregistered. This guarantees that the pointer that the nf_queue code
retains into the nf_hook list will remain valid while a packet is
queued.
I tested what would happen if we do not flush queued packets and was
trivially able to obtain the oops below. All that was required was
to stop the nf_queue listening process, to delete all of the nf_tables,
and to awaken the nf_queue listening process.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
[ luis: backported to 3.16: adjusted context ] Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
A ROSE socket doesn't necessarily always have a neighbour pointer so check
if the neighbour pointer is valid before dereferencing it.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Tested-by: Bernard Pidoux <f6bvp@free.fr> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
The xfs_attr3_root_inactive() call from xfs_attr_inactive() assumes that
attribute blocks exist to invalidate. It is possible to have an
attribute fork without extents, however. Consider the case where the
attribute fork is created towards the beginning of xfs_attr_set() but
some part of the subsequent attribute set fails.
If an inode in such a state hits xfs_attr_inactive(), it eventually
calls xfs_dabuf_map() and possibly xfs_bmapi_read(). The former emits a
filesystem corruption warning, returns an error that bubbles back up to
xfs_attr_inactive(), and leads to destruction of the in-core attribute
fork without an on-disk reset. If the inode happens to make it back
through xfs_inactive() in this state (e.g., via a concurrent bulkstat
that cycles the inode from the reclaim state and releases it), i_afp
might not exist when xfs_bmapi_read() is called and causes a NULL
dereference panic.
A '-p 2' fsstress run to ENOSPC on a relatively small fs (1GB)
reproduces these problems. The behavior is a regression caused by:
6dfe5a0 xfs: xfs_attr_inactive leaves inconsistent attr fork state behind
... which removed logic that avoided the attribute extent truncate when
no extents exist. Restore this logic to ensure the attribute fork is
destroyed and reset correctly if it exists without any allocated
extents.
Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
ext4 isn't willing to map clusters to a non-extent file. Don't signal
this with an out of space error, since the FS will retry the
allocation (which didn't fail) forever. Instead, return EUCLEAN so
that the operation will fail immediately all the way back to userspace.
(The fix is either to run e2fsck -E bmap2extent, or to chattr +e the file.)
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
If we create a CRC filesystem, mount it, and create a symlink with
a path long enough that it can't live in the inode, we get a very
strange result upon remount:
# ls -l mnt
total 4
lrwxrwxrwx. 1 root root 929 Jun 15 16:58 link -> XSLM
XSLM is the V5 symlink block header magic (which happens to be
followed by a NUL, so the string looks terminated).
xfs_readlink_bmap() advanced cur_chunk by the size of the header
for CRC filesystems, but never actually used that pointer; it
kept reading from bp->b_addr, which is the start of the block,
rather than the start of the symlink data after the header.
Looks like this problem goes back to v3.10.
Fixing this gets us reading the proper link target, again.
Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
KVM guest kernels for trap & emulate run in user mode, with a modified
set of kernel memory segments. However the fixmap address is still in
the normal KSeg3 region at 0xfffe0000 regardless, causing problems when
cache alias handling makes use of them when handling copy on write.
Therefore define FIXADDR_TOP as 0x7ffe0000 in the guest kernel mapped
region when CONFIG_KVM_GUEST is defined.
Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/9887/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Normally all of the buffers will have been forced out to disk before
we call invalidate_bdev(), but there will be some cases, where a file
system operation was aborted due to an ext4_error(), where there may
still be some dirty buffers in the buffer cache for the device. So
try to force them out to memory before calling invalidate_bdev().
This fixes a warning triggered by generic/081:
WARNING: CPU: 1 PID: 3473 at /usr/projects/linux/ext4/fs/block_dev.c:56 __blkdev_put+0xb5/0x16f()
Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
When building the kernel with a bare-metal (ELF) toolchain, the -shared
option may not be passed down to collect2, resulting in silent corruption
of the vDSO image (in particular, the DYNAMIC section is omitted).
The effect of this corruption is that the dynamic linker fails to find
the vDSO symbols and libc is instead used for the syscalls that we
intended to optimise (e.g. gettimeofday). Functionally, there is no
issue as the sigreturn trampoline is still intact and located by the
kernel.
This patch fixes the problem by explicitly passing -shared to the linker
when building the vDSO.
Reported-by: Szabolcs Nagy <Szabolcs.Nagy@arm.com> Reported-by: James Greenlaigh <james.greenhalgh@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Queued TRIM got disabled on Micron M500DC drives thanks to the
"Micron_M500*" pattern we had in place to accommodate the previous
generation of this drive family. Tweak the blacklist entry slightly so
we only disable queued TRIM for the non-DC variants of M500 drives.
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Tejun Heo <tj@kernel.org>
[ luis: backported to 3.16: adjusted context ] Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
BugLink: https://bugs.launchpad.net/bugs/1394368
This device requires new firmware files
AthrBT_0x11020100.dfu and ramps_0x11020100_40.dfu added to
/lib/firmware/ar3k/ that are not included in linux-firmware yet.
Commit b9a5e5e18fbf "ACPI / init: Fix the ordering of
acpi_reserve_resources()" overlooked the fact that the memory
and/or I/O regions reserved by acpi_reserve_resources() may
conflict with those reserved by the PNP "system" driver.
If that conflict actually takes place, it causes the reservations
made by the "system" driver to fail while before commit b9a5e5e18fbf
all reservations made by it and by acpi_reserve_resources() would be
successful. In turn, that allows the resources that haven't been
reserved by the "system" driver to be used by others (e.g. PCI) which
sometimes leads to functional problems (up to and including boot
failures).
To fix that issue, introduce a common resource reservation routine,
acpi_reserve_region(), to be used by both acpi_reserve_resources()
and the "system" driver, that will track all resources reserved by
it and avoid making conflicting requests.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=99831 Link: http://marc.info/?t=143389402600001&r=1&w=2 Fixes: b9a5e5e18fbf "ACPI / init: Fix the ordering of acpi_reserve_resources()" Reported-by: Roland Dreier <roland@purestorage.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
[ luis: backported to 3.16: adjusted context ] Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
pnfs_do_write() expects the call to pnfs_write_through_mds() to free the
pgio header and to release the layout segment before exiting. The problem
is that nfs_pgio_data_destroy() doesn't actually do this; it only frees
the memory allocated by nfs_generic_pgio().
Ditto for pnfs_do_read()...
Fix in both cases is to add a call to hdr->release(hdr).
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
We enable _CRS on all systems from 2008 and later. On older systems, we
ignore _CRS and assume the whole physical address space (excluding RAM and
other devices) is available for PCI devices, but on systems that support
physical address spaces larger than 4GB, it's doubtful that the area above
4GB is really available for PCI.
After d56dbf5bab8c ("PCI: Allocate 64-bit BARs above 4G when possible"), we
try to use that space above 4GB *first*, so we're more likely to put a
device there.
On Juan's Toshiba Satellite Pro U200, BIOS left the graphics, sound, 1394,
and card reader devices unassigned (but only after Windows had been
booted). Only the sound device had a 64-bit BAR, so it was the only device
placed above 4GB, and hence the only device that didn't work.
Keep _CRS enabled even on pre-2008 systems if they support physical address
space larger than 4GB.
Fixes: d56dbf5bab8c ("PCI: Allocate 64-bit BARs above 4G when possible") Reported-and-tested-by: Juan Dayer <jdayer@outlook.com> Reported-and-tested-by: Alan Horsfield <alan@hazelgarth.co.uk> Link: https://bugzilla.kernel.org/show_bug.cgi?id=99221 Link: https://bugzilla.opensuse.org/show_bug.cgi?id=907092 Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
The metadata space map has a simplified 'bootstrap' mode that is
operational when extending the space maps. Whilst in this mode it's
possible for some refcount decrement operations to become queued (eg, as
a result of shadowing one of the bitmap indexes). These decrements were
not being applied when switching out of bootstrap mode.
The effect of this bug was the leaking of a 4k metadata block. This is
detected by the latest version of thin_check as a non fatal error.
Signed-off-by: Joe Thornber <ejt@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
The memmap freeing code in free_unused_memmap() computes the end of
each memblock by adding the memblock size onto the base. However,
if SPARSEMEM is enabled then the value (start) used for the base
may already have been rounded downwards to work out which memmap
entries to free after the previous memblock.
This may cause memmap entries that are in use to get freed.
In general, you're not likely to hit this problem unless there
are at least 2 memblocks and one of them is not aligned to a
sparsemem section boundary. Note that carve-outs can increase
the number of memblocks by splitting the regions listed in the
device tree.
This problem doesn't occur with SPARSEMEM_VMEMMAP, because the
vmemmap code deals with freeing the unused regions of the memmap
instead of requiring the arch code to do it.
This patch gets the memblock base out of the memblock directly when
computing the block end address to ensure the correct value is used.
Signed-off-by: Dave Martin <Dave.Martin@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
On VM entry, we disable access to the VFP registers in order to
perform a lazy save/restore of these registers.
On VM exit, we restore access, test if we did enable them before,
and save/restore the guest/host registers if necessary. In this
sequence, the FPEXC register is always accessed, irrespective
of the trapping configuration.
If the guest didn't touch the VFP registers, then the HCPTR access
has now enabled such access, but we're missing a barrier to ensure
architectural execution of the new HCPTR configuration. If the HCPTR
access has been delayed/reordered, the subsequent access to FPEXC
will cause a trap, which we aren't prepared to handle at all.
The same condition exists when trapping to enable VFP for the guest.
The fix is to introduce a barrier after enabling VFP access. In the
vmexit case, it can be relaxed to only takes place if the guest hasn't
accessed its view of the VFP registers, making the access to FPEXC safe.
The set_hcptr macro is modified to deal with both vmenter/vmexit and
vmtrap operations, and now takes an optional label that is branched to
when the guest hasn't touched the VFP registers.
Reported-by: Vikram Sethi <vikrams@codeaurora.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
tpm_ibmvtpm_probe() calls ibmvtpm_reset_crq(ibmvtpm) without having yet
set the virtual device in the ibmvtpm structure. So in ibmvtpm_reset_crq,
the phype call contains empty unit addresses, ibmvtpm->vdev->unit_address.
Signed-off-by: Hon Ching(Vicky) Lo <honclo@linux.vnet.ibm.com> Signed-off-by: Joy Latten <jmlatten@linux.vnet.ibm.com> Reviewed-by: Ashley Lai <ashley@ahsleylai.com> Fixes: 132f76294744 ("drivers/char/tpm: Add new device driver to support IBM vTPM") Signed-off-by: Peter Huewe <peterhuewe@gmx.de> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
The way the mask is generated in regmap_field_init() is wrong.
Indeed, a field initialized with msb = 31 and lsb = 0 provokes a shift
overflow while calculating the mask field.
On some 32 bits architectures, such as x86, the generated mask is 0,
instead of the expected 0xffffffff.
This patch uses GENMASK() to fix the problem, as this macro is already safe
regarding shift overflow.
Signed-off-by: Maxime Coquelin <maxime.coquelin@st.com> Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Commit 0244756edc4b98c ("ufs: sb mutex merge + mutex_destroy") generated
deadlocks in read/write mode on mkdir.
This patch partially reverts it keeping fixes by Andrew Morton and
mutex_destroy()
[AV: fixed a missing bit in ufs_remount()]
Signed-off-by: Fabian Frederick <fabf@skynet.be> Reported-by: Ian Campbell <ian.campbell@citrix.com> Suggested-by: Jan Kara <jack@suse.cz> Cc: Ian Campbell <ian.campbell@citrix.com> Cc: Evgeniy Dushistov <dushistov@mail.ru> Cc: Alexey Khoroshilov <khoroshilov@ispras.ru> Cc: Roger Pau Monne <roger.pau@citrix.com> Cc: Ian Jackson <Ian.Jackson@eu.citrix.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
[ luis: backported to 3.16: adjusted context ] Signed-off-by: Luis Henriques <luis.henriques@canonical.com>