For 64bit kernels the lmmio_space_offset of the host bridge window
isn't set correctly on systems with dino/cujo PCI host bridges.
This leads to not assigned memory bars and failing drivers, which
need to use these bars.
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Acked-by: Helge Deller <deller@gmx.de> Signed-off-by: Helge Deller <deller@gmx.de> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
audit_remove_watch_rule() drops watch's reference to parent but then
continues to work with it. That is not safe as parent can get freed once
we drop our reference. The following is a trivial reproducer:
mount -o loop image /mnt
touch /mnt/file
auditctl -w /mnt/file -p wax
umount /mnt
auditctl -D
<crash in fsnotify_destroy_mark()>
Grab our own reference in audit_remove_watch_rule() earlier to make sure
mark does not get freed under us.
Reported-by: Tony Jones <tonyj@suse.de> Signed-off-by: Jan Kara <jack@suse.cz> Tested-by: Tony Jones <tonyj@suse.de> Signed-off-by: Paul Moore <paul@paul-moore.com>
[bwh: Backported to 3.2: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Fixes: ba51b6be38c1 ("net: Fix RCU splat in af_key") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Dmitry Vyukov <dvyukov@google.com> Cc: David Ahern <dsa@cumulusnetworks.com> Acked-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
[bwh: Backported to 3.2: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
When we try to allocate a free inode by searching the inobt, we try to
find the inode nearest the parent inode by searching chunks both left
and right of the chunk containing the parent. As an optimization, we
cache the leftmost and rightmost records that we previously searched; if
we do another allocation with the same parent inode, we'll pick up the
search where it last left off.
There's a bug in the case where we found a free inode to the left of the
parent's chunk: we need to update the cached left and right records, but
because we already reassigned the right record to point to the left, we
end up assigning the left record to both the cached left and right
records.
This isn't a correctness problem strictly, but it can result in the next
allocation rechecking chunks unnecessarily or allocating inodes further
away from the parent than it needs to. Fix it by swapping the record
pointer after we update the cached left and right records.
Fixes: bd169565993b ("xfs: speed up free inode search") Signed-off-by: Omar Sandoval <osandov@fb.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
[bwh: Backported to 3.2: adjust filename] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Uverbs device should be cleaned up only when there is no
potential usage of.
As part of ib_uverbs_remove_one which might be triggered upon reset flow
the device reference count is decreased as expected and leave the final
cleanup to the FDs that were opened.
Current code increases reference count upon opening a new command FD and
decreases it upon closing the file. The event FD is opened internally
and rely on the command FD by taking on it a reference count.
In case that the command FD was closed and just later the event FD we
may ensure that the device resources as of srcu are still alive as they
are still in use.
Fixing the above by moving the reference count decreasing to the place
where the command FD is really freed instead of doing that when it was
just closed.
fixes: 036b10635739 ("IB/uverbs: Enable device removal when there are active user space applications") Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Reviewed-by: Matan Barak <matanb@mellanox.com> Reviewed-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Tested-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
initialize to zero the response structure to prevent
the leakage of "resp.reserved" field.
drivers/infiniband/core/uverbs_cmd.c:1178 ib_uverbs_resize_cq() warn:
check that 'resp.reserved' doesn't leak information
Fixes: 33b9b3ee9709 ("IB: Add userspace support for resizing CQs") Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
When new directory 'DIR1' is created in a directory 'DIR0' with SGID bit
set, DIR1 is expected to have SGID bit set (and owning group equal to
the owning group of 'DIR0'). However when 'DIR0' also has some default
ACLs that 'DIR1' inherits, setting these ACLs will result in SGID bit on
'DIR1' to get cleared if user is not member of the owning group.
Fix the problem by moving posix_acl_update_mode() out of ocfs2_set_acl()
into ocfs2_iop_set_acl(). That way the function will not be called when
inheriting ACLs which is what we want as it prevents SGID bit clearing
and the mode has been properly set by posix_acl_create() anyway. Also
posix_acl_chmod() that is calling ocfs2_set_acl() takes care of updating
mode itself.
Fixes: 073931017b4 ("posix_acl: Clear SGID bit when setting file permissions") Link: http://lkml.kernel.org/r/20170801141252.19675-3-jack@suse.cz Signed-off-by: Jan Kara <jack@suse.cz> Cc: Mark Fasheh <mfasheh@versity.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Joseph Qi <jiangqi903@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
[bwh: Backported to 3.2: Move the call to posix_acl_update_mode() into
ocfs2_xattr_set_acl(). Pass NULL as the bh argument to
ocfs2_acl_set_mode(). Reuse the existing cleanup label.] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Currently when WoL is supported but disabled, ethtool reports:
"Supports Wake-on: d".
Fix the indication of Wol support, so that the indication
remains "g" all the time if the NIC supports WoL.
Tested:
As accepted, when NIC supports WoL- ethtool reports:
Supports Wake-on: g
Wake-on: d
when NIC doesn't support WoL- ethtool reports:
Supports Wake-on: d
Wake-on: d
Fixes: 14c07b1358ed ("mlx4: Wake on LAN support") Signed-off-by: Inbar Karmy <inbark@mellanox.com> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
[bwh: Backported to 3.2: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
When more than one GPIO IRQs are triggered simultaneously,
tegra_gpio_irq_handler() called chained_irq_exit() multiple
times for one chained_irq_enter().
Fixes: 3c92db9ac0ca3eee8e46e2424b6c074e2e394ad9 Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
[Also changed the variable to a bool] Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
[bwh: Backported to 3.2: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Functions clear_user_highpage, copy_user_highpage, flush_dcache_page,
local_flush_cache_range and local_flush_cache_page may be used from
modules. Export them.
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
[bwh: Backported to 3.2: drop exports of {clear,copy}_user_highpage()] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
csum_partial and csum_partial_copy_generic are defined unconditionally
and are available even when CONFIG_NET is disabled. They are used not
only by the network drivers, but also by scsi and media.
Don't limit these functions export by CONFIG_NET.
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com> Signed-off-by: Chris Zankel <chris@zankel.net>
[bwh: Backported to 3.2: drop exports of some functions that aren't defined here] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Make usb_hc_died() clear the HCD_FLAG_RH_RUNNING flag for the shared
HCD and set HCD_FLAG_DEAD for it, in analogy with what is done for
the primary one.
Among other thigs, this prevents check_root_hub_suspended() from
returning -EBUSY for dead HCDs which helps to work around system
suspend issues in some situations.
This actually fixes occasional suspend failures on one of my test
machines.
Suggested-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Currently building kernel for xtensa core with aliasing WT cache fails
with the following messages:
mm/memory.c:2152: undefined reference to `flush_dcache_page'
mm/memory.c:2332: undefined reference to `local_flush_cache_page'
mm/memory.c:1919: undefined reference to `local_flush_cache_range'
mm/memory.c:4179: undefined reference to `copy_to_user_page'
mm/memory.c:4183: undefined reference to `copy_from_user_page'
This happens because implementation of these functions is only compiled
when data cache is WB, which looks wrong: even when data cache doesn't
need flushing it still needs invalidation. The functions like
__flush_[invalidate_]dcache_* are correctly defined for both WB and WT
caches (and even if they weren't that'd still be ok, just slower).
Fix this by providing the same implementation of the above functions for
both WB and WT cache.
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
[bwh: Backported to 3.2: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
We get a link error trying to access the w100fb_gpio_read/write
functions from the platform when the driver is a loadable module
or not built-in, so the platform already uses 'select' to hard-enable
the driver.
However, that fails if the framebuffer subsystem is disabled
altogether.
I've considered various ways to fix this properly, but they
all seem like too much work or too risky, so this simply
adds another 'select' to force the subsystem on as well.
Fixes: 82427de2c7c3 ("ARM: pxa: PXA_ESERIES depends on FB_W100.") Signed-off-by: Arnd Bergmann <arnd@arndb.de>
[bwh: Backported to 3.2: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Commit b1f5bfc27a19 ("sctp: don't dereference ptr before leaving
_sctp_walk_{params, errors}()") tried to fix the issue that it
may overstep the chunk end for _sctp_walk_{params, errors} with
'chunk_end > offset(length) + sizeof(length)'.
But it introduced a side effect: When processing INIT, it verifies
the chunks with 'param.v == chunk_end' after iterating all params
by sctp_walk_params(). With the check 'chunk_end > offset(length)
+ sizeof(length)', it would return when the last param is not yet
accessed. Because the last param usually is fwdtsn supported param
whose size is 4 and 'chunk_end == offset(length) + sizeof(length)'
This is a badly issue even causing sctp couldn't process 4-shakes.
Client would always get abort when connecting to server, due to
the failure of INIT chunk verification on server.
The patch is to use 'chunk_end <= offset(length) + sizeof(length)'
instead of 'chunk_end < offset(length) + sizeof(length)' for both
_sctp_walk_params and _sctp_walk_errors.
Fixes: b1f5bfc27a19 ("sctp: don't dereference ptr before leaving _sctp_walk_{params, errors}()") Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
[bwh: Backported to 3.2: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Alexander Potapenko <glider@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
[bwh: Backported to 3.2: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
This patch fixes the following smatch warning:
drivers/infiniband/ulp/ipoib/ipoib_cm.c:517 ipoib_cm_rx_handler() warn:
missing break? reassigning 'p->id'
Fixes: 839fcaba355a ("IPoIB: Connected mode experimental support") Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Don't allow negative values to max_nonsrq_conn_qp. There is no functional
impact on a negative value but it is logicically incorrect.
Fixes: 68e995a29572 ("IPoIB/cm: Add connected mode support for devices without SRQs") Signed-off-by: Alex Vesker <valex@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
----> child is removed from sibling_list without any sync
with T1 path above
...
free_event(child)
Before the child is removed from the leader's child_list,
(and thus is omitted from perf_read_group processing), we
need to ensure that perf_read_group touches child's
siblings under its ctx->lock.
Peter further notes:
| One additional note; this bug got exposed by commit:
|
| ba5213ae6b88 ("perf/core: Correct event creation with PERF_FORMAT_GROUP")
|
| which made it possible to actually trigger this code-path.
Tested-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: ba5213ae6b88 ("perf/core: Correct event creation with PERF_FORMAT_GROUP") Link: http://lkml.kernel.org/r/20170720141455.2106-1-jolsa@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
[bwh: Backported to 3.2: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
In order to enable the use of perf_event_read(.group = true), we need
to invert the sibling-child loop nesting of perf_read_group().
Currently we iterate the child list for each sibling, this precludes
using group reads. Flip things around so we iterate each group for
each child.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
[ Made the patch compile and things. ] Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Link: http://lkml.kernel.org/r/1441336073-22750-7-git-send-email-sukadev@linux.vnet.ibm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
[bwh: Backported to 3.2 as a dependency of commit 2aeb18835476 ("perf/core: Fix
locking for children siblings group read"):
- Keep the function name perf_event_read_group()
- Keep using perf_event_read_value()] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Net stack initialization currently initializes fib-trie after the
first call to netdevice_notifier() call. In fact fib_trie initialization
needs to happen before first rtnl_register(). It does not cause any problem
since there are no devices UP at this moment, but trying to bring 'lo'
UP at initialization would make this assumption wrong and exposes the issue.
Fixes: 7b1a74fdbb9e ("[NETNS]: Refactor fib initialization so it can handle multiple namespaces.") Fixes: 7f9b80529b8a ("[IPV4]: fib hash|trie initialization") Signed-off-by: Mahesh Bandewar <maheshb@google.com> Acked-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Initialize the port_num for iWARP in rdma_init_qp_attr.
Fixes: 5ecce4c9b17b("Check port number supplied by user verbs cmds") Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com> Tested-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
[bwh: Backported to 3.2: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
The port number is only valid if IB_QP_PORT is set in the mask.
So only check port number if it is valid to prevent modify_qp from
failing due to an invalid port number.
Fixes: 5ecce4c9b17b("Check port number supplied by user verbs cmds") Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com> Tested-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
[bwh: Backported to 3.2: command structure is cmd not cmd->base] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
We accidentally don't set the error code on some error paths. It means
return ERR_PTR(0) which is NULL and results in a NULL dereference in the
caller.
Fixes: 13a239330abd ("RDMA/cxgb3: Don't ignore insert_handle() failures") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
[bwh: Backported to 3.2: drop inapplicable hunk] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
The bus_irq argument of mp_override_legacy_irq() is used as the index into
the isa_irq_to_gsi[] array. The bus_irq argument originates from
ACPI_MADT_TYPE_IO_APIC and ACPI_MADT_TYPE_INTERRUPT items in the ACPI
tables, but is nowhere sanity checked.
That allows broken or malicious ACPI tables to overwrite memory, which
might cause malfunction, panic or arbitrary code execution.
Add a sanity check and emit a warning when that triggers.
[ tglx: Added warning and rewrote changelog ]
Signed-off-by: Seunghun Han <kkamagui@gmail.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: security@kernel.org Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Doing this copy eliminates the "port=0" entry in
the /proc/mounts entries
Fixes: https://bugzilla.kernel.org/show_bug.cgi?id=69241 Signed-off-by: Steve Dickson <steved@redhat.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
[bwh: Backported to 3.2: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
My static checker complains that "devno" can be negative, meaning that
we read before the start of the loop. I've looked at the code, and I
think the warning is right. This come from /proc so it's root only or
it would be quite a quite a serious bug. The call tree looks like this:
proc_scsi_write() <- gets id and channel from simple_strtoul()
-> scsi_add_single_device() <- calls shost->transportt->user_scan()
-> ata_scsi_user_scan()
-> ata_find_dev()
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
This patch fixes an issue that some registers may be not initialized
after resume if the USBHSF_RUNTIME_PWCTRL is not set. Otherwise,
if a cable is not connected, the driver will not enable INTENB0.VBSE
after resume. And then, the driver cannot detect the VBUS.
Fixes: ca8a282a5373 ("usb: gadget: renesas_usbhs: add suspend/resume support") Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com> Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
When us->extra is null the driver is not initialized, however, a
later call to osd200_scsi_to_ata is made that dereferences
us->extra, causing a null pointer dereference. The code
currently detects and reports that the driver is not initialized;
add a return to avoid the subsequent dereference issue in this
check.
Thanks to Alan Stern for pointing out that srb->result needs setting
to DID_ERROR << 16
Detected by CoverityScan, CID#100308 ("Dereference after null check")
Signed-off-by: Colin Ian King <colin.king@canonical.com> Acked-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[bwh: Backported to 3.2: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Add device-id entry for DATECS FP-2000 fiscal printer needing the
NO_UNION_NORMAL quirk.
Reported-by: Anton Avramov <lukav@lukav.com> Signed-off-by: Johan Hovold <johan@kernel.org> Acked-by: Oliver Neukum <oneukum@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
The German Telekom offers a ZigBee USB Stick under the brand name Qivicon
for their SmartHome Home Base in its 1. Generation. The productId is not
known by the according kernel module, this patch adds support for it.
Signed-off-by: Stefan Triller <github@stefantriller.de> Reviewed-by: Frans Klaver <fransklaver@gmail.com> Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
drivers/staging/iio/resolver/ad2s1210.c: In function 'ad2s1210_read_raw':
drivers/staging/iio/resolver/ad2s1210.c:515:42: error: '<<' in boolean context, did you mean '<' ? [-Werror=int-in-bool-context]
The original code had 'unsigned short' here, but incorrectly got
converted to 'bool'. This reverts the regression and uses a normal
type instead.
Fixes: 29148543c521 ("staging:iio:resolver:ad2s1210 minimal chan spec conversion.") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
The TSL2563 driver provides three iio channels, two of which are raw ADC
channels (channel 0 and channel 1) in the device and the remaining one
is calculated by the two. The ADC channel 0 only supports programmable
interrupt with threshold settings and this driver supports the event but
the generated event code does not contain the corresponding iio channel
type.
This is going to change userspace ABI. Hopefully fixing this to be
what it should always have been won't break any userspace code.
Cc: Jonathan Cameron <jic23@kernel.org> Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Signed-off-by: Jonathan Cameron <jic23@kernel.org>
[bwh: Backported to 3.2: adjust filename] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Before the patch, the flock flag could remain uninitialized for the
lifespan of the fuse_file allocation. Unless set to true in
fuse_file_flock(), it would remain in an indeterminate state until read in
an if statement in fuse_release_common(). This could consequently lead to
taking an unexpected branch in the code.
The bug was discovered by a runtime instrumentation designed to detect use
of uninitialized memory in the kernel.
When we change cpuset.memory_spread_{page,slab}, cpuset will flip
PF_SPREAD_{PAGE,SLAB} bit of tsk->flags for each task in that cpuset.
This should be done using atomic bitops, but currently we don't,
which is broken.
Tetsuo reported a hard-to-reproduce kernel crash on RHEL6, which happened
when one thread tried to clear PF_USED_MATH while at the same time another
thread tried to flip PF_SPREAD_PAGE/PF_SPREAD_SLAB. They both operate on
the same task.
Here's the full report:
https://lkml.org/lkml/2014/9/19/230
To fix this, we make PF_SPREAD_PAGE and PF_SPREAD_SLAB atomic flags.
v3:
- Kees pointed out that no_new_privs should never be cleared, so we
shouldn't define task_clear_no_new_privs(). we define 3 macros instead
of a single one.
v2:
- updated scripts/tags.sh, suggested by Peter
Cc: Ingo Molnar <mingo@kernel.org> Cc: Miao Xie <miaox@cn.fujitsu.com> Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Kees Cook <keescook@chromium.org> Signed-off-by: Zefan Li <lizefan@huawei.com> Signed-off-by: Tejun Heo <tj@kernel.org>
[lizf: Backported to 3.4:
- adjust context
- remove no_new_priv code
- add atomic_flags to struct task_struct]
[bwh: Backported to 3.2:
- Drop changes in scripts/tags.sh
- Adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
While trying out [1][2], I noticed that tc monitor doesn't show the
correct handle on delete:
$ tc monitor
qdisc clsact ffff: dev eno1 parent ffff:fff1
filter dev eno1 ingress protocol all pref 49152 bpf handle 0x2a [...]
deleted filter dev eno1 ingress protocol all pref 49152 bpf handle 0xf3be0c80
some context to explain the above:
The user identity of any tc filter is represented by a 32-bit
identifier encoded in tcm->tcm_handle. Example 0x2a in the bpf filter
above. A user wishing to delete, get or even modify a specific filter
uses this handle to reference it.
Every classifier is free to provide its own semantics for the 32 bit handle.
Example: classifiers like u32 use schemes like 800:1:801 to describe
the semantics of their filters represented as hash table, bucket and
node ids etc.
Classifiers also have internal per-filter representation which is different
from this externally visible identity. Most classifiers set this
internal representation to be a pointer address (which allows fast retrieval
of said filters in their implementations). This internal representation
is referenced with the "fh" variable in the kernel control code.
When a user successfuly deletes a specific filter, by specifying the correct
tcm->tcm_handle, an event is generated to user space which indicates
which specific filter was deleted.
Before this patch, the "fh" value was sent to user space as the identity.
As an example what is shown in the sample bpf filter delete event above
is 0xf3be0c80. This is infact a 32-bit truncation of 0xffff8807f3be0c80
which happens to be a 64-bit memory address of the internal filter
representation (address of the corresponding filter's struct cls_bpf_prog);
After this patch the appropriate user identifiable handle as encoded
in the originating request tcm->tcm_handle is generated in the event.
One of the cardinal rules of netlink rules is to be able to take an
event (such as a delete in this case) and reflect it back to the
kernel and successfully delete the filter. This patch achieves that.
Note, this issue has existed since the original TC action
infrastructure code patch back in 2004 as found in:
https://git.kernel.org/cgit/linux/kernel/git/history/history.git/commit/
Fixes: 4e54c4816bfe ("[NET]: Add tc extensions infrastructure.") Reported-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
[bwh: Backported to 3.2: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Remove usage of the '__attribute__((alias("...")))' hack that aliased
to integer arrays containing micro-assembled instructions. This hack
breaks when building a microMIPS kernel. It also makes the code much
easier to understand.
[ralf@linux-mips.org: Added back export of the clear_page and copy_page
symbols so certain modules will work again. Also fixed build with
CONFIG_SIBYTE_DMA_PAGEOPS enabled.]
Signed-off-by: Steven J. Hill <sjhill@mips.com> Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/3866/ Acked-by: David Daney <david.daney@cavium.com> Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
[bwh: Backported to 3.2: adjust context] Cc: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
net->ct.nf_conntrack_cachep is created with SLAB_DESTROY_BY_RCU.
The hash is protected by rcu, so readers look up conntracks without
locks.
A conntrack is removed from the hash, but in this moment a few readers
still can use the conntrack. Then this conntrack is released and another
thread creates conntrack with the same address and the equal tuple.
After this a reader starts to validate the conntrack:
* It's not dying, because a new conntrack was created
* nf_ct_tuple_equal() returns true.
But this conntrack is not initialized yet, so it can not be used by two
threads concurrently. In this case BUG_ON may be triggered from
nf_nat_setup_info().
Florian Westphal suggested to check the confirm bit too. I think it's
right.
task 1 task 2 task 3
nf_conntrack_find_get
____nf_conntrack_find
destroy_conntrack
hlist_nulls_del_rcu
nf_conntrack_free
kmem_cache_free
__nf_conntrack_alloc
kmem_cache_alloc
memset(&ct->tuplehash[IP_CT_DIR_MAX],
if (nf_ct_is_dying(ct))
if (!nf_ct_tuple_equal()
I'm not sure, that I have ever seen this race condition in a real life.
Currently we are investigating a bug, which is reproduced on a few nodes.
In our case one conntrack is initialized from a few tasks concurrently,
we don't have any other explanation for this.
l2tp_ip_backlog_recv may not return -1 if the packet gets dropped.
The return value is passed up to ip_local_deliver_finish, which treats
negative values as an IP protocol number for resubmission.
Signed-off-by: Paul Hüber <phueber@kernsp.in> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Validate the output buffer length for L2CAP config requests and responses
to avoid overflowing the stack buffer used for building the option blocks.
Signed-off-by: Ben Seri <ben@armis.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
[bwh: Backported to 3.2:
- Drop changes to handling of L2CAP_CONF_EFS, L2CAP_CONF_EWS
- Drop changes to l2cap_do_create(), l2cap_security_cfm(), and L2CAP_CONF_PENDING
case in l2cap_config_rsp()
- In l2cap_config_rsp(), s/buf/req/
- Adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
It's caused by skb_shared_info at the end of sk_buff was overwritten by
ISCSI_KEVENT_IF_ERROR when parsing nlmsg info from skb in iscsi_if_rx.
During the loop if skb->len == nlh->nlmsg_len and both are sizeof(*nlh),
ev = nlmsg_data(nlh) will acutally get skb_shinfo(SKB) instead and set a
new value to skb_shinfo(SKB)->nr_frags by ev->type.
This patch is to fix it by checking nlh->nlmsg_len properly there to
avoid over accessing sk_buff.
Reported-by: ChunYu Wang <chunwang@redhat.com> Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Chris Leech <cleech@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
If using a kernel with CONFIG_XFS_RT=y and we set the RHINHERIT flag on
a directory in a filesystem that does not have a realtime device and
create a new file in that directory, it gets marked as a real time file.
When data is written and a fsync is issued, the filesystem attempts to
flush a non-existent rt device during the fsync process.
This results in a crash dereferencing a null buftarg pointer in
xfs_blkdev_issue_flush():
Setting RT inode flags does not require special privileges so any
unprivileged user can cause this oops to occur. To reproduce, confirm
kernel is compiled with CONFIG_XFS_RT=y and run:
'clk' is copied to a userland with padding byte(s) after 'vclk_post_div'
field unitialized, leaking data from the stack. Fix this ensuring all of
'clk' is initialized to zero.
If L1 does not specify the "use TPR shadow" VM-execution control in
vmcs12, then L0 must specify the "CR8-load exiting" and "CR8-store
exiting" VM-execution controls in vmcs02. Failure to do so will give
the L2 VM unrestricted read/write access to the hardware CR8.
This fixes CVE-2017-12154.
Signed-off-by: Jim Mattson <jmattson@google.com> Reviewed-by: David Hildenbrand <david@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
[bwh: Backported to 3.2: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
nl80211_set_rekey_data() does not check if the required attributes
NL80211_REKEY_DATA_{REPLAY_CTR,KEK,KCK} are present when processing
NL80211_CMD_SET_REKEY_OFFLOAD request. This request can be issued by
users with CAP_NET_ADMIN privilege and may result in NULL dereference
and a system crash. Add a check for the required attributes presence.
This patch is based on the patch by bo Zhang.
This fixes CVE-2017-12153.
References: https://bugzilla.redhat.com/show_bug.cgi?id=1491046 Fixes: e5497d766ad ("cfg80211/nl80211: support GTK rekey offload") Reported-by: bo Zhang <zhangbo5891001@gmail.com> Signed-off-by: Vladis Dronov <vdronov@redhat.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
drivers/media/pci/saa7164/saa7164-core.c:97:18: warning: cast removes address space of expression
drivers/media/pci/saa7164/saa7164-core.c:122:31: warning: cast removes address space of expression
drivers/media/pci/saa7164/saa7164-core.c:122:31: warning: incorrect type in initializer (different address spaces)
drivers/media/pci/saa7164/saa7164-core.c:122:31: expected unsigned char [noderef] [usertype] <asn:2>*bufcpu
drivers/media/pci/saa7164/saa7164-core.c:122:31: got unsigned char [usertype] *<noident>
drivers/media/pci/saa7164/saa7164-core.c:282:44: warning: cast removes address space of expression
drivers/media/pci/saa7164/saa7164-core.c:286:38: warning: cast removes address space of expression
drivers/media/pci/saa7164/saa7164-core.c:286:35: warning: incorrect type in assignment (different address spaces)
drivers/media/pci/saa7164/saa7164-core.c:286:35: expected unsigned char [noderef] [usertype] <asn:2>*p
drivers/media/pci/saa7164/saa7164-core.c:286:35: got unsigned char [usertype] *<noident>
drivers/media/pci/saa7164/saa7164-core.c:352:44: warning: cast removes address space of expression
drivers/media/pci/saa7164/saa7164-core.c:527:53: warning: cast removes address space of expression
drivers/media/pci/saa7164/saa7164-core.c:129:30: warning: dereference of noderef expression
drivers/media/pci/saa7164/saa7164-core.c:133:38: warning: dereference of noderef expression
drivers/media/pci/saa7164/saa7164-core.c:133:72: warning: dereference of noderef expression
drivers/media/pci/saa7164/saa7164-core.c:134:35: warning: dereference of noderef expression
drivers/media/pci/saa7164/saa7164-core.c:287:61: warning: dereference of noderef expression
drivers/media/pci/saa7164/saa7164-core.c:288:65: warning: dereference of noderef expression
drivers/media/pci/saa7164/saa7164-core.c:289:65: warning: dereference of noderef expression
drivers/media/pci/saa7164/saa7164-core.c:290:65: warning: dereference of noderef expression
drivers/media/pci/saa7164/saa7164-core.c:291:65: warning: dereference of noderef expression
drivers/media/pci/saa7164/saa7164-core.c:292:65: warning: dereference of noderef expression
drivers/media/pci/saa7164/saa7164-core.c:293:65: warning: dereference of noderef expression
drivers/media/pci/saa7164/saa7164-core.c:294:65: warning: dereference of noderef expression
drivers/media/pci/saa7164/saa7164-fw.c:548:52: warning: incorrect type in argument 5 (different address spaces)
drivers/media/pci/saa7164/saa7164-fw.c:548:52: expected unsigned char [usertype] *dst
drivers/media/pci/saa7164/saa7164-fw.c:548:52: got unsigned char [noderef] [usertype] <asn:2>*
drivers/media/pci/saa7164/saa7164-fw.c:579:44: warning: incorrect type in argument 5 (different address spaces)
drivers/media/pci/saa7164/saa7164-fw.c:579:44: expected unsigned char [usertype] *dst
drivers/media/pci/saa7164/saa7164-fw.c:579:44: got unsigned char [noderef] [usertype] <asn:2>*
drivers/media/pci/saa7164/saa7164-fw.c:597:44: warning: incorrect type in argument 5 (different address spaces)
drivers/media/pci/saa7164/saa7164-fw.c:597:44: expected unsigned char [usertype] *dst
drivers/media/pci/saa7164/saa7164-fw.c:597:44: got unsigned char [noderef] [usertype] <asn:2>*
drivers/media/pci/saa7164/saa7164-bus.c:36:36: warning: cast removes address space of expression
drivers/media/pci/saa7164/saa7164-bus.c:41:36: warning: cast removes address space of expression
drivers/media/pci/saa7164/saa7164-bus.c:151:19: warning: incorrect type in assignment (different base types)
drivers/media/pci/saa7164/saa7164-bus.c:151:19: expected unsigned short [unsigned] [usertype] size
drivers/media/pci/saa7164/saa7164-bus.c:151:19: got restricted __le16 [usertype] <noident>
drivers/media/pci/saa7164/saa7164-bus.c:152:22: warning: incorrect type in assignment (different base types)
drivers/media/pci/saa7164/saa7164-bus.c:152:22: expected unsigned int [unsigned] [usertype] command
drivers/media/pci/saa7164/saa7164-bus.c:152:22: got restricted __le32 [usertype] <noident>
drivers/media/pci/saa7164/saa7164-bus.c:153:30: warning: incorrect type in assignment (different base types)
drivers/media/pci/saa7164/saa7164-bus.c:153:30: expected unsigned short [unsigned] [usertype] controlselector
drivers/media/pci/saa7164/saa7164-bus.c:153:30: got restricted __le16 [usertype] <noident>
drivers/media/pci/saa7164/saa7164-bus.c:172:20: warning: cast to restricted __le32
drivers/media/pci/saa7164/saa7164-bus.c:173:20: warning: cast to restricted __le32
drivers/media/pci/saa7164/saa7164-bus.c:206:28: warning: cast to restricted __le32
drivers/media/pci/saa7164/saa7164-bus.c:287:9: warning: incorrect type in argument 1 (different base types)
drivers/media/pci/saa7164/saa7164-bus.c:287:9: expected unsigned int [unsigned] val
drivers/media/pci/saa7164/saa7164-bus.c:287:9: got restricted __le32 [usertype] <noident>
drivers/media/pci/saa7164/saa7164-bus.c:339:20: warning: cast to restricted __le32
drivers/media/pci/saa7164/saa7164-bus.c:340:20: warning: cast to restricted __le32
drivers/media/pci/saa7164/saa7164-bus.c:463:9: warning: incorrect type in argument 1 (different base types)
drivers/media/pci/saa7164/saa7164-bus.c:463:9: expected unsigned int [unsigned] val
drivers/media/pci/saa7164/saa7164-bus.c:463:9: got restricted __le32 [usertype] <noident>
drivers/media/pci/saa7164/saa7164-bus.c:466:21: warning: cast to restricted __le16
drivers/media/pci/saa7164/saa7164-bus.c:467:24: warning: cast to restricted __le32
drivers/media/pci/saa7164/saa7164-bus.c:468:32: warning: cast to restricted __le16
drivers/media/pci/saa7164/saa7164-buffer.c:122:18: warning: incorrect type in assignment (different address spaces)
drivers/media/pci/saa7164/saa7164-buffer.c:122:18: expected unsigned long long [noderef] [usertype] <asn:2>*cpu
drivers/media/pci/saa7164/saa7164-buffer.c:122:18: got void *
drivers/media/pci/saa7164/saa7164-buffer.c:127:21: warning: incorrect type in assignment (different address spaces)
drivers/media/pci/saa7164/saa7164-buffer.c:127:21: expected unsigned long long [noderef] [usertype] <asn:2>*pt_cpu
drivers/media/pci/saa7164/saa7164-buffer.c:127:21: got void *
drivers/media/pci/saa7164/saa7164-buffer.c:134:20: warning: cast removes address space of expression
drivers/media/pci/saa7164/saa7164-buffer.c:156:63: warning: incorrect type in argument 3 (different address spaces)
drivers/media/pci/saa7164/saa7164-buffer.c:156:63: expected void *vaddr
drivers/media/pci/saa7164/saa7164-buffer.c:156:63: got unsigned long long [noderef] [usertype] <asn:2>*cpu
drivers/media/pci/saa7164/saa7164-buffer.c:179:57: warning: incorrect type in argument 3 (different address spaces)
drivers/media/pci/saa7164/saa7164-buffer.c:179:57: expected void *vaddr
drivers/media/pci/saa7164/saa7164-buffer.c:179:57: got unsigned long long [noderef] [usertype] <asn:2>*cpu
drivers/media/pci/saa7164/saa7164-buffer.c:180:56: warning: incorrect type in argument 3 (different address spaces)
drivers/media/pci/saa7164/saa7164-buffer.c:180:56: expected void *vaddr
drivers/media/pci/saa7164/saa7164-buffer.c:180:56: got unsigned long long [noderef] [usertype] <asn:2>*pt_cpu
drivers/media/pci/saa7164/saa7164-buffer.c:84:17: warning: dereference of noderef expression
drivers/media/pci/saa7164/saa7164-buffer.c:147:31: warning: dereference of noderef expression
drivers/media/pci/saa7164/saa7164-buffer.c:148:17: warning: dereference of noderef expression
Most are caused by pointers marked as __iomem when they aren't or not marked as
__iomem when they should.
Also note that readl/writel already do endian conversion, so there is no need to
do it again.
saa7164_bus_set/get were a bit tricky: you have to make sure the msg endian
conversion is done at the right time, and that the code isn't using fields that
are still little endian instead of cpu-endianness.
The approach chosen is to convert just before writing to the ring buffer
and to convert it back right after reading from the ring buffer.
Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com> Cc: Steven Toth <stoth@kernellabs.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
[bwh: Backported to 3.2: adjust filenames, context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
The msg->command field is 32 bits, and we should fill it with a call
to cpu_to_le32(). The current code is broke on big endian systems.
On little endian systems it truncates the 32 bit value to 16 bits
which probably still works fine.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Cc: Steven Toth <stoth@kernellabs.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
When changing a file's acl mask, btrfs_set_acl() will first set the
group bits of i_mode to the value of the mask, and only then set the
actual extended attribute representing the new acl.
If the second part fails (due to lack of space, for example) and the
file had no acl attribute to begin with, the system will from now on
assume that the mask permission bits are actual group permission bits,
potentially granting access to the wrong users.
Prevent this by restoring the original mode bits if __btrfs_set_acl
fails.
Signed-off-by: Ernesto A. Fernández <ernesto.mnd.fernandez@gmail.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
[bwh: Backported to 3.2: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
When new directory 'DIR1' is created in a directory 'DIR0' with SGID bit
set, DIR1 is expected to have SGID bit set (and owning group equal to
the owning group of 'DIR0'). However when 'DIR0' also has some default
ACLs that 'DIR1' inherits, setting these ACLs will result in SGID bit on
'DIR1' to get cleared if user is not member of the owning group.
Fix the problem by moving posix_acl_update_mode() out of
__ext4_set_acl() into ext4_set_acl(). That way the function will not be
called when inheriting ACLs which is what we want as it prevents SGID
bit clearing and the mode has been properly set by posix_acl_create()
anyway.
Fixes: 073931017b49d9458aa351605b43a7e34598caef Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Jan Kara <jack@suse.cz> Reviewed-by: Andreas Gruenbacher <agruenba@redhat.com>
[bwh: Backported to 3.2: the __ext4_set_acl() function didn't exist,
so added it] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
When changing a file's acl mask, __ext4_set_acl() will first set the group
bits of i_mode to the value of the mask, and only then set the actual
extended attribute representing the new acl.
If the second part fails (due to lack of space, for example) and the file
had no acl attribute to begin with, the system will from now on assume
that the mask permission bits are actual group permission bits, potentially
granting access to the wrong users.
Prevent this by only changing the inode mode after the acl has been set.
Signed-off-by: Ernesto A. Fernández <ernesto.mnd.fernandez@gmail.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Reviewed-by: Jan Kara <jack@suse.cz>
[bwh: Backported to 3.2: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
When changing a file's acl mask, reiserfs_set_acl() will first set the
group bits of i_mode to the value of the mask, and only then set the
actual extended attribute representing the new acl.
If the second part fails (due to lack of space, for example) and the
file had no acl attribute to begin with, the system will from now on
assume that the mask permission bits are actual group permission bits,
potentially granting access to the wrong users.
Prevent this by only changing the inode mode after the acl has been set.
Signed-off-by: Ernesto A. Fernández <ernesto.mnd.fernandez@gmail.com> Signed-off-by: Jan Kara <jack@suse.cz>
[bwh: Backported to 3.2: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
When new directory 'DIR1' is created in a directory 'DIR0' with SGID bit
set, DIR1 is expected to have SGID bit set (and owning group equal to
the owning group of 'DIR0'). However when 'DIR0' also has some default
ACLs that 'DIR1' inherits, setting these ACLs will result in SGID bit on
'DIR1' to get cleared if user is not member of the owning group.
Fix the problem by moving posix_acl_update_mode() out of
__reiserfs_set_acl() into reiserfs_set_acl(). That way the function will
not be called when inheriting ACLs which is what we want as it prevents
SGID bit clearing and the mode has been properly set by
posix_acl_create() anyway.
Fixes: 073931017b49d9458aa351605b43a7e34598caef CC: reiserfs-devel@vger.kernel.org Signed-off-by: Jan Kara <jack@suse.cz>
[bwh: Backported to 3.2: the __reiserfs_set_acl() function didn't exist,
so added it] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Ben Hutchings [Sun, 8 Oct 2017 13:48:44 +0000 (14:48 +0100)]
ext3: preserve i_mode if ext2_set_acl() fails
Based on Ernesto A. Fernández's fix for ext2 (commit fe26569eb919), from
which the following description is taken:
> When changing a file's acl mask, ext2_set_acl() will first set the group
> bits of i_mode to the value of the mask, and only then set the actual
> extended attribute representing the new acl.
>
> If the second part fails (due to lack of space, for example) and the file
> had no acl attribute to begin with, the system will from now on assume
> that the mask permission bits are actual group permission bits, potentially
> granting access to the wrong users.
>
> Prevent this by only changing the inode mode after the acl has been set.
Cc: Ernesto A. Fernández <ernesto.mnd.fernandez@gmail.com> Cc: Jan Kara <jack@suse.cz> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Ben Hutchings [Fri, 6 Oct 2017 02:18:40 +0000 (03:18 +0100)]
ext3: Don't clear SGID when inheriting ACLs
Based on Jan Kara's fix for ext2 (commit a992f2d38e4c), from which the
following description is taken:
> When new directory 'DIR1' is created in a directory 'DIR0' with SGID bit
> set, DIR1 is expected to have SGID bit set (and owning group equal to
> the owning group of 'DIR0'). However when 'DIR0' also has some default
> ACLs that 'DIR1' inherits, setting these ACLs will result in SGID bit on
> 'DIR1' to get cleared if user is not member of the owning group.
>
> Fix the problem by creating __ext2_set_acl() function that does not call
> posix_acl_update_mode() and use it when inheriting ACLs. That prevents
> SGID bit clearing and the mode has been properly set by
> posix_acl_create() anyway.
Fixes: 073931017b49 ("posix_acl: Clear SGID bit when setting file permissions") Cc: linux-ext4@vger.kernel.org Cc: Jan Kara <jack@suse.cz> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
When changing a file's acl mask, ext2_set_acl() will first set the group
bits of i_mode to the value of the mask, and only then set the actual
extended attribute representing the new acl.
If the second part fails (due to lack of space, for example) and the file
had no acl attribute to begin with, the system will from now on assume
that the mask permission bits are actual group permission bits, potentially
granting access to the wrong users.
Prevent this by only changing the inode mode after the acl has been set.
[JK: Rebased on top of "ext2: Don't clear SGID when inheriting ACLs"] Signed-off-by: Ernesto A. Fernández <ernesto.mnd.fernandez@gmail.com> Signed-off-by: Jan Kara <jack@suse.cz>
[bwh: Backported to 3.2: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
When new directory 'DIR1' is created in a directory 'DIR0' with SGID bit
set, DIR1 is expected to have SGID bit set (and owning group equal to
the owning group of 'DIR0'). However when 'DIR0' also has some default
ACLs that 'DIR1' inherits, setting these ACLs will result in SGID bit on
'DIR1' to get cleared if user is not member of the owning group.
Fix the problem by creating __ext2_set_acl() function that does not call
posix_acl_update_mode() and use it when inheriting ACLs. That prevents
SGID bit clearing and the mode has been properly set by
posix_acl_create() anyway.
Fixes: 073931017b49d9458aa351605b43a7e34598caef CC: linux-ext4@vger.kernel.org Signed-off-by: Jan Kara <jack@suse.cz>
[bwh: Backported to 3.2:
- Keep using CURRENT_TIME_SEC
- Change parameter order of ext2_set_acl() to match upstream
- Adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Actually, at the moment this is not an issue, as every in-tree arch does
the same manual checks for VERIFY_READ vs VERIFY_WRITE, relying on the MMU
to tell them apart, but this wasn't the case in the past and may happen
again on some odd arch in the future.
If anyone cares about 3.7 and earlier, this is a security hole (untested)
on real 80386 CPUs.
Signed-off-by: Adam Borowski <kilobyte@angband.pl> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[bwh: Backported to 3.2: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Jörn Engel noticed that the expand_upwards() function might not return
-ENOMEM in case the requested address is (unsigned long)-PAGE_SIZE and
if the architecture didn't defined TASK_SIZE as multiple of PAGE_SIZE.
Affected architectures are arm, frv, m68k, blackfin, h8300 and xtensa
which all define TASK_SIZE as 0xffffffff, but since none of those have
an upwards-growing stack we currently have no actual issue.
Nevertheless let's fix this just in case any of the architectures with
an upward-growing stack (currently parisc, metag and partly ia64) define
TASK_SIZE similar.
Link: http://lkml.kernel.org/r/20170702192452.GA11868@p100.box Fixes: bd726c90b6b8 ("Allow stack to grow up to address space limit") Signed-off-by: Helge Deller <deller@gmx.de> Reported-by: Jörn Engel <joern@purestorage.com> Cc: Hugh Dickins <hughd@google.com> Cc: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
When UBIFS prepares data structures which will be written to the MTD it
ensues that their lengths are multiple of 8. Since it uses kmalloc() the
padded bytes are left uninitialized and we leak a few bytes of kernel
memory to the MTD.
To make sure that all bytes are initialized, let's switch to kzalloc().
Kzalloc() is fine in this case because the buffers are not huge and in
the IO path the performance bottleneck is anyway the MTD.
Fixes: 1e51764a3c2a ("UBIFS: add new flash file system") Signed-off-by: Richard Weinberger <richard@nod.at> Reviewed-by: Boris Brezillon <boris.brezillon@free-electrons.com> Signed-off-by: Richard Weinberger <richard@nod.at>
[bwh: Backported to 3.2:
- Drop change in ubifs_jnl_xrename()
- Adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
UBIFS handles extended attributes just like files, as consequence of
that, they also have inodes.
Therefore UBIFS does all the inode machinery also for xattrs. Since new
inodes have i_nlink of 1, a file or xattr inode will be evicted
if i_nlink goes down to 0 after an unlink. UBIFS assumes this model also
for xattrs, which is not correct.
One can create a file "foo" with xattr "user.test". By reading
"user.test" an inode will be created, and by deleting "user.test" it
will get evicted later. The assumption breaks if the file "foo", which
hosts the xattrs, will be removed. VFS nor UBIFS does not remove each
xattr via ubifs_xattr_remove(), it just removes the host inode from
the TNC and all underlying xattr nodes too and the inode will remain
in the cache and wastes memory.
To solve this problem, remove xattr inodes from the VFS inode cache in
ubifs_xattr_remove() to make sure that they get evicted.
Fixes: 1e51764a3c2ac05a ("UBIFS: add new flash file system") Signed-off-by: Richard Weinberger <richard@nod.at>
[bwh: Backported to 3.2:
- xattr support is optional, so add an #ifdef around the call
- Adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
The driver checks port->exists twice in i8042_interrupt(), first when
trying to assign temporary "serio" variable, and second time when deciding
whether it should call serio_interrupt(). The value of port->exists may
change between the 2 checks, and we may end up calling serio_interrupt()
with a NULL pointer:
BUG: unable to handle kernel NULL pointer dereference at 0000000000000050
IP: [<ffffffff8150feaf>] _spin_lock_irqsave+0x1f/0x40
PGD 0
Oops: 0002 [#1] SMP
last sysfs file:
CPU 0
Modules linked in:
To avoid the issue let's change the second check to test whether serio is
NULL or not.
Also, let's take i8042_lock in i8042_start() and i8042_stop() instead of
trying to be overly smart and using memory barriers.
Signed-off-by: Chen Hong <chenhong3@huawei.com>
[dtor: take lock in i8042_start()/i8042_stop()] Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
From POWER4 onwards, mfocrf() only places the specified CR field into
the destination GPR, and the rest of it is set to 0. The PowerPC AS
from version 3.0 now requires this behaviour.
The emulation code currently puts the entire CR into the destination GPR.
Fix it.
Fixes: 6888199f7fe5 ("[POWERPC] Emulate more instructions in software") Signed-off-by: Anton Blanchard <anton@samba.org> Acked-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
The workaround for the CELL timebase bug does not correctly mark cr0 as
being clobbered. This means GCC doesn't know that the asm block changes cr0 and
might leave the result of an unrelated comparison in cr0 across the block, which
we then trash, leading to basically random behaviour.
Fixes: 859deea949c3 ("[POWERPC] Cell timebase bug workaround") Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
[mpe: Tweak change log and flag for stable] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
[bwh: Backported to 3.2: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
__list_lru_walk_one() acquires nlru spin lock (nlru->lock) for longer
duration if there are more number of items in the lru list. As per the
current code, it can hold the spin lock for upto maximum UINT_MAX
entries at a time. So if there are more number of items in the lru
list, then "BUG: spinlock lockup suspected" is observed in the below
path:
Fix this lockup by reducing the number of entries to be shrinked from
the lru list to 1024 at once. Also, add cond_resched() before
processing the lru list again.
Link: http://marc.info/?t=149722864900001&r=1&w=2 Link: http://lkml.kernel.org/r/1498707575-2472-1-git-send-email-stummala@codeaurora.org Signed-off-by: Sahitya Tummala <stummala@codeaurora.org> Suggested-by: Jan Kara <jack@suse.cz> Suggested-by: Vladimir Davydov <vdavydov.dev@gmail.com> Acked-by: Vladimir Davydov <vdavydov.dev@gmail.com> Cc: Alexander Polakov <apolyakov@beget.ru> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
[bwh: Backported to 3.2: we don't hold the spin-lock for long, but adding
cond_resched() looks like a good idea] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Commit 1be7107fbe18 ("mm: larger stack guard gap, between vmas") has
introduced a regression in some rust and Java environments which are
trying to implement their own stack guard page. They are punching a new
MAP_FIXED mapping inside the existing stack Vma.
This will confuse expand_{downwards,upwards} into thinking that the
stack expansion would in fact get us too close to an existing non-stack
vma which is a correct behavior wrt safety. It is a real regression on
the other hand.
Let's work around the problem by considering PROT_NONE mapping as a part
of the stack. This is a gros hack but overflowing to such a mapping
would trap anyway an we only can hope that usespace knows what it is
doing and handle it propely.
Fixes: 1be7107fbe18 ("mm: larger stack guard gap, between vmas") Link: http://lkml.kernel.org/r/20170705182849.GA18027@dhcp22.suse.cz Signed-off-by: Michal Hocko <mhocko@suse.com> Debugged-by: Vlastimil Babka <vbabka@suse.cz> Cc: Ben Hutchings <ben@decadent.org.uk> Cc: Willy Tarreau <w@1wt.eu> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Rik van Riel <riel@redhat.com> Cc: Hugh Dickins <hughd@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
validate_scan_freqs() retrieves frequencies from attributes
nested in the attribute NL80211_ATTR_SCAN_FREQUENCIES with
nla_get_u32(), which reads 4 bytes from each attribute
without validating the size of data received. Attributes
nested in NL80211_ATTR_SCAN_FREQUENCIES don't have an nla policy.
Validate size of each attribute before parsing to avoid potential buffer
overread.
Fixes: 2a519311926 ("cfg80211/nl80211: scanning (and mac80211 update to use it)") Signed-off-by: Srinivas Dasari <dasaris@qti.qualcomm.com> Signed-off-by: Jouni Malinen <jouni@qca.qualcomm.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
nla policy checks for only maximum length of the attribute data
when the attribute type is NLA_BINARY. If userspace sends less
data than specified, the wireless drivers may access illegal
memory. When type is NLA_UNSPEC, nla policy check ensures that
userspace sends minimum specified length number of bytes.
Remove type assignment to NLA_BINARY from nla_policy of
NL80211_ATTR_PMKID to make this NLA_UNSPEC and to make sure minimum
WLAN_PMKID_LEN bytes are received from userspace with
NL80211_ATTR_PMKID.
While cleaning up sysfs callback that prints EK we discovered a kernel
memory leak. This commit fixes the issue by zeroing the buffer used for
TPM command/response.
The leak happen when we use either tpm_vtpm_proxy, tpm_ibmvtpm or
xen-tpmfront.
Fixes: 0883743825e3 ("TPM: sysfs functions consolidation") Reported-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Tested-by: Stefan Berger <stefanb@linux.vnet.ibm.com> Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> Signed-off-by: James Morris <james.l.morris@oracle.com>
[bwh: Backported to 3.2: adjust filename, context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
The crash happens in syscall_get_arguments function for
syscalls with zero arguments, that will try to access
first argument (args[0]) in event entry, but it's not
allocated.
Bail out of there are no arguments.
Reported-by: Zorro Lang <zlang@redhat.com> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
The ib_uverbs_create_ah() ind ib_uverbs_modify_qp() calls receive
the port number from user input as part of its attributes and assumes
it is valid. Down on the stack, that parameter is used to access kernel
data structures. If the value is invalid, the kernel accesses memory
it should not. To prevent this, verify the port number before using it.
BUG: KASAN: use-after-free in ib_uverbs_create_ah+0x6d5/0x7b0
Read of size 4 at addr ffff880018d67ab8 by task syz-executor/313
BUG: KASAN: slab-out-of-bounds in modify_qp.isra.4+0x19d0/0x1ef0
Read of size 4 at addr ffff88006c40ec58 by task syz-executor/819
Fixes: 67cdb40ca444 ("[IB] uverbs: Implement more commands") Fixes: 189aba99e70 ("IB/uverbs: Extend modify_qp and support packet pacing") Cc: <security@kernel.org> Cc: Yevgeny Kliteynik <kliteyn@mellanox.com> Cc: Tziporet Koren <tziporet@mellanox.com> Cc: Alex Polak <alexpo@mellanox.com> Signed-off-by: Boris Pismenny <borisp@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
[bwh: Backported to 3.2:
- In modify_qp(), command structure is cmd not cmd->base
- In modify_qp(), add release_qp label
- In ib_uverbs_create_ah(), add definition of ib_dev
- Adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Previously start_port and end_port were defined in 2 places, cache.c and
device.c and this prevented their use in other modules.
Make these common functions, change the name to reflect the rdma
name space, and update existing users.
Signed-off-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
[bwh: Backported to 3.2: drop one inapplicable change] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
We have pretty clear evidence that MSIs are getting lost on g4x and
somehow the interrupt logic doesn't seem to recover from that state
even if we try hard to clear the IIR.
Disabling IER around the normal IIR clearing in the irq handler isn't
sufficient to avoid this, so the problem really seems to be further
up the interrupt chain. This should guarantee that there's always
an edge if any IIR bits are set after the interrupt handler is done,
which should normally guarantee that the CPU interrupt is generated.
That approach seems to work perfectly on VLV/CHV, but apparently
not on g4x.
MSI is documented to be broken on 965gm at least. The chipset spec
says MSI is defeatured because interrupts can be delayed or lost,
which fits well with what we're seeing on g4x. Previously we've
already disabled GMBUS interrupts on g4x because somehow GMBUS
manages to raise legacy interrupts even when MSI is enabled.
Since there's such widespread MSI breakahge all over in the pre-gen5
land let's just give up on MSI on these platforms.
Seqno reporting might be negatively affected by this since the legcy
interrupts aren't guaranteed to be ordered with the seqno writes,
whereas MSI interrupts may be? But an occasioanlly missed seqno
seems like a small price to pay for generally working interrupts.
Cc: Diego Viola <diego.viola@gmail.com> Tested-by: Diego Viola <diego.viola@gmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101261 Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170626203051.28480-1-ville.syrjala@linux.intel.com Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
(cherry picked from commit e38c2da01f76cca82b59ca612529b81df82a7cc7) Signed-off-by: Jani Nikula <jani.nikula@intel.com>
[bwh: Backported to 3.2:
- Open-code INTEL_GEN()
- Adjust filename, context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Currently, when the link for $DEV is down, this command succeeds but the
address is removed immediately by DAD (1):
ip addr add 1111::12/64 dev $DEV valid_lft 3600 preferred_lft 1800
In the same situation, this will succeed and not remove the address (2):
ip addr add 1111::12/64 dev $DEV
ip addr change 1111::12/64 dev $DEV valid_lft 3600 preferred_lft 1800
The comment in addrconf_dad_begin() when !IF_READY makes it look like
this is the intended behavior, but doesn't explain why:
* If the device is not ready:
* - keep it tentative if it is a permanent address.
* - otherwise, kill it.
We clearly cannot prevent userspace from doing (2), but we can make (1)
work consistently with (2).
addrconf_dad_stop() is only called in two cases: if DAD failed, or to
skip DAD when the link is down. In that second case, the fix is to avoid
deleting the address, like we already do for permanent addresses.
Fixes: 3c21edbd1137 ("[IPV6]: Defer IPv6 device initialization until the link becomes ready.") Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: David S. Miller <davem@davemloft.net>
[bwh: Backported to 3.2: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
The userspace needs to know why is the address being removed so that it can
perhaps obtain a new address.
Without the DADFAILED flag it's impossible to distinguish removal of a
temporary and tentative address due to DAD failure from other reasons (device
removed, manual address removal).
Signed-off-by: Lubomir Rintel <lkundrak@v3.sk> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
[bwh: Backported to 3.2: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
The enclosure_add_device() function should fail if it can't create the
relevant sysfs links.
Signed-off-by: Maurizio Lombardi <mlombard@redhat.com> Tested-by: Douglas Miller <dougmill@linux.vnet.ibm.com> Acked-by: James Bottomley <jejb@linux.vnet.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Currently we saw a lot of "No irq handler" errors during hibernation, which
caused the system hang finally:
ata4.00: qc timeout (cmd 0xec)
ata4.00: failed to IDENTIFY (I/O error, err_mask=0x4)
ata4.00: revalidation failed (errno=-5)
ata4: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
do_IRQ: 31.151 No irq handler for vector
According to above logs, there is an interrupt triggered and it is
dispatched to CPU31 with a vector number 151, but there is no handler for
it, thus this IRQ will not get acked and will cause an IRQ flood which
kills the system. To be more specific, the 31.151 is an interrupt from the
AHCI host controller.
After some investigation, the reason why this issue is triggered is because
the thaw_noirq() function does not restore the MSI/MSI-X settings across
hibernation.
The scenario is illustrated below:
1. Before hibernation, IRQ 34 is the handler for the AHCI device, which
is bound to CPU31.
2. Hibernation starts, the AHCI device is put into low power state.
3. All the nonboot CPUs are put offline, so IRQ 34 has to be migrated to
the last alive one - CPU0.
4. After the snapshot has been created, all the nonboot CPUs are brought
up again; IRQ 34 remains bound to CPU0.
5. AHCI devices are put into D0.
6. The snapshot is written to the disk.
The issue is triggered in step 6. The AHCI interrupt should be delivered
to CPU0, however it is delivered to the original CPU31 instead, which
causes the "No irq handler" issue.
Ying Huang has provided a clue that, in step 3 it is possible that writing
to the register might not take effect as the PCI devices have been
suspended.
In step 3, the IRQ 34 affinity should be modified from CPU31 to CPU0, but
in fact it is not. In __pci_write_msi_msg(), if the device is already in
low power state, the low level MSI message entry will not be updated but
cached. During the device restore process after a normal suspend/resume,
pci_restore_msi_state() writes the cached MSI back to the hardware.
But this is not the case for hibernation. pci_restore_msi_state() is not
currently called in pci_pm_thaw_noirq(), although pci_save_state() has
saved the necessary PCI cached information in pci_pm_freeze_noirq().
Restore the PCI status for the device during hibernation. Otherwise the
status might be lost across hibernation (for example, settings for MSI,
MSI-X, ATS, ACS, IOV, etc.), which might cause problems during hibernation.
Suggested-by: Ying Huang <ying.huang@intel.com> Suggested-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Chen Yu <yu.c.chen@intel.com>
[bhelgaas: changelog] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: Len Brown <len.brown@intel.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Rui Zhang <rui.zhang@intel.com> Cc: Ying Huang <ying.huang@intel.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
When new directory 'DIR1' is created in a directory 'DIR0' with SGID bit
set, DIR1 is expected to have SGID bit set (and owning group equal to
the owning group of 'DIR0'). However when 'DIR0' also has some default
ACLs that 'DIR1' inherits, setting these ACLs will result in SGID bit on
'DIR1' to get cleared if user is not member of the owning group.
Fix the problem by moving posix_acl_update_mode() out of
__btrfs_set_acl() into btrfs_set_acl(). That way the function will not be
called when inheriting ACLs which is what we want as it prevents SGID
bit clearing and the mode has been properly set by posix_acl_create()
anyway.
Fixes: 073931017b49d9458aa351605b43a7e34598caef CC: linux-btrfs@vger.kernel.org CC: David Sterba <dsterba@suse.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: David Sterba <dsterba@suse.com>
[bwh: Backported to 3.2: move the call to posix_acl_update_mode() into
btrfs_xattr_acl_set()] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
The USB core and sysfs will attempt to enumerate certain parameters
which are unsupported by the au0828 - causing inconsistent behavior
and sometimes causing the chip to reset. Avoid making these calls.
This problem manifested as intermittent cases where the au8522 would
be reset on analog video startup, in particular when starting up ALSA
audio streaming in parallel - the sysfs entries created by
snd-usb-audio on streaming startup would result in unsupported control
messages being sent during tuning which would put the chip into an
unknown state.
Fix commit e50c0a8fa60d ("Support the MIPS32 / MIPS64 DSP ASE.") and
send SIGILL rather than SIGBUS whenever an unimplemented BPOSGE32 DSP
ASE instruction has been encountered in `__compute_return_epc_for_insn'
as our Reserved Instruction exception handler would in response to an
attempt to actually execute the instruction. Sending SIGBUS only makes
sense for the unaligned PC case, since moved to `__compute_return_epc'.
Adjust function documentation accordingly, correct formatting and use
`pr_info' rather than `printk' as the other exit path already does.
Fixes: e50c0a8fa60d ("Support the MIPS32 / MIPS64 DSP ASE.") Signed-off-by: Maciej W. Rozycki <macro@imgtec.com> Cc: James Hogan <james.hogan@imgtec.com> Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/16396/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
[bwh: Backported to 3.2:
- Drop the comment change
- Adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
pm_genpd_remove_subdomain() iterates over domain's master_links list and
removes matching element thus it has to use safe version of list
iteration.
Fixes: f721889ff65a ("PM / Domains: Support for generic I/O PM domains (v8)") Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org> Acked-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Neither soft poweroff (transition to ACPI power state S5) nor
suspend-to-RAM (transition to state S3) works on the Macbook Pro 11,4 and
11,5.
The problem is related to the [mem 0x7fa00000-0x7fbfffff] space. When we
use that space, e.g., by assigning it to the 00:1c.0 Root Port, the ACPI
Power Management 1 Control Register (PM1_CNT) at [io 0x1804] doesn't work
anymore.
Linux does a soft poweroff (transition to S5) by writing to PM1_CNT. The
theory about why this doesn't work is:
- The write to PM1_CNT causes an SMI
- The BIOS SMI handler depends on something in
[mem 0x7fa00000-0x7fbfffff]
- When Linux assigns [mem 0x7fa00000-0x7fbfffff] to the 00:1c.0 Port, it
covers up whatever the SMI handler uses, so the SMI handler no longer
works correctly
Reserve the [mem 0x7fa00000-0x7fbfffff] space so we don't assign it to
anything.
This is voodoo programming, since we don't know what the real conflict is,
but we've failed to find the root cause.
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=103211 Tested-by: thejoe@gmail.com Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: Rafael J. Wysocki <rafael@kernel.org> Cc: Lukas Wunner <lukas@wunner.de> Cc: Chen Yu <yu.c.chen@intel.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
The Haswell Power Control Unit has a non-PCI register (CONFIG_TDP_NOMINAL)
where BAR 0 is supposed to be. This is erratum HSE43 in the spec update
referenced below:
The PCIe* Base Specification indicates that Configuration Space Headers
have a base address register at offset 0x10. Due to this erratum, the
Power Control Unit's CONFIG_TDP_NOMINAL CSR (Bus 1; Device 30; Function
3; Offset 0x10) is located where a base register is expected.
Mark the PCU as having non-compliant BARs so we don't try to probe any of
them. There are no other BARs on this device.
The inline asm retry check in the MIPS_ATOMIC_SET operation of the
sysmips system call has been backwards since commit f1e39a4a616c ("MIPS:
Rewrite sysmips(MIPS_ATOMIC_SET, ...) in C with inline assembler")
merged in v2.6.32, resulting in the non R10000_LLSC_WAR case retrying
until the operation was inatomic, before returning the new value that
was probably just written multiple times instead of the old value.
Invert the branch condition to fix that particular issue.
Fixes: f1e39a4a616c ("MIPS: Rewrite sysmips(MIPS_ATOMIC_SET, ...) in C with inline assembler") Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/16148/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Make sure to drop the reference to the dma device taken by
of_find_device_by_node() on probe errors and on driver unbind.
Fixes: 334ae614772b ("sparc: Kill SBUS DVMA layer.") Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
If bnx2i_map_ep_dbell_regs() then we accidentally return NULL instead of
an error pointer. It results in a NULL dereference in
iscsi_if_ep_connect().
Fixes: cf4e6363859d ("[SCSI] bnx2i: Add bnx2i iSCSI driver.") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Verify that the caller-provided sockaddr structure is large enough to
contain the sa_family field, before accessing it in bind() and connect()
handlers of the AF_IUCV socket. Since neither syscall enforces a minimum
size of the corresponding memory region, very short sockaddrs (zero or
one byte long) result in operating on uninitialized memory while
referencing .sa_family.
Fixes: 52a82e23b9f2 ("af_iucv: Validate socket address length in iucv_sock_bind()") Signed-off-by: Mateusz Jurczyk <mjurczyk@google.com>
[jwi: removed unneeded null-check for addr] Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
For AMD Promontory xHCI host, although you can disable USB 2.0 ports in
BIOS settings, those ports will be enabled anyway after you remove a
device on that port and re-plug it in again. It's a known limitation of
the chip. As a workaround we can clear the PORT_WAKE_BITS.
This will disable wake on connect, disconnect and overcurrent on
AMD Promontory USB2 ports
[checkpatch cleanup and commit message reword -Mathias] Cc: Tsai Nicholas <nicholas.tsai@amd.com> Signed-off-by: Jiahau Chang <Lars_Chang@asmedia.com.tw> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[bwh: Backported to 3.2: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>