Mount filesystem
Create subvol with ID 257
Unmount filesystem
Mount filesystem
Delete subvol with ID 257
btrfs_drop_snapshot()
Add root corresponding to subvol 257 into
btrfs_transaction->dropped_roots list
Create new subvol (i.e. create_subvol())
257 is returned as the next free objectid
btrfs_read_fs_root_no_name()
Finds the btrfs_root instance corresponding to the old subvol with ID 257
in btrfs_fs_info->fs_roots_radix.
Returns error since btrfs_root_item->refs has the value of 0.
To fix the issue the commit initializes tree root's and subvolume root's
highest_objectid when loading the roots from disk.
If we failed to create a hard link we were not always releasing the
the transaction handle we got before, resulting in a memory leak and
preventing any other tasks from being able to commit the current
transaction.
Fix this by always releasing our transaction handle.
Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: Liu Bo <bo.li.liu@oracle.com> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
We weren't accounting for the insertion of an inline extent item for the
symlink inode nor that we need to update the parent inode item (through
the call to btrfs_add_nondir()). So fix this by including two more
transaction units.
Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
When a symlink is successfully created it always has an inline extent
containing the source path. However if an error happens when creating
the symlink, we can leave in the subvolume's tree a symlink inode without
any such inline extent item - this happens if after btrfs_symlink() calls
btrfs_end_transaction() and before it calls the inode eviction handler
(through the final iput() call), the transaction gets committed and a
crash happens before the eviction handler gets called, or if a snapshot
of the subvolume is made before the eviction handler gets called. Sadly
we can't just avoid this by making btrfs_symlink() call
btrfs_end_transaction() after it calls the eviction handler, because the
later can commit the current transaction before it removes any items from
the subvolume tree (if it encounters ENOSPC errors while reserving space
for removing all the items).
So make send fail more gracefully, with an -EIO error, and print a
message to dmesg/syslog informing that there's an empty symlink inode,
so that the user can delete the empty symlink or do something else
about it.
Reported-by: Stephen R. van den Berg <srb@cuci.nl> Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
There is one ENOSPC case that's very confusing. There's Available
greater than zero but no file operation succeds (besides removing
files). This happens when the metadata are exhausted and there's no
possibility to allocate another chunk.
In this scenario it's normal that there's still some space in the data
chunk and the calculation in df reflects that in the Avail value.
To at least give some clue about the ENOSPC situation, let statfs report
zero value in Avail, even if there's still data space available.
Current:
/dev/sdb1 4.0G 3.3G 719M 83% /mnt/test
New:
/dev/sdb1 4.0G 3.3G 0 100% /mnt/test
We calculate the remaining metadata space minus global reserve. If this
is (supposedly) smaller than zero, there's no space. But this does not
hold in practice, the exhausted state happens where's still some
positive delta. So we apply some guesswork and compare the delta to a 4M
threshold. (Practically observed delta was 2M.)
We probably cannot calculate the exact threshold value because this
depends on the internal reservations requested by various operations, so
some operations that consume a few metadata will succeed even if the
Avail is zero. But this is better than the other way around.
Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
We hit this panic on a few of our boxes this week where we have an
ordered_extent with an NULL inode. We do an igrab() of the inode in writepages,
but weren't doing it in writepage which can be called directly from the VM on
dirty pages. If the inode has been unlinked then we could have I_FREEING set
which means igrab() would return NULL and we get this panic. Fix this by trying
to igrab in btrfs_writepage, and if it returns NULL then just redirty the page
and return AOP_WRITEPAGE_ACTIVATE; so the VM knows it wasn't successful. Thanks,
Signed-off-by: Josef Bacik <jbacik@fb.com> Reviewed-by: Liu Bo <bo.li.liu@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Looks like oversight, call brelse() when checksum fails. Further down the
code, in the non error path, we do call brelse() and so we don't see
brelse() in the goto error paths.
Signed-off-by: Anand Jain <anand.jain@oracle.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
'This is because the pstore filesystem can be backed by UEFI variables,
and (for example) a crash might dump the last kilobytes of the dmesg
into a number of pstore entries, each entry backed by a separate UEFI
variable in the above GUID namespace, and with a variable name
according to the above pattern.
Please see "drivers/firmware/efi/efi-pstore.c".
While this patch series will not prevent the user from deleting those
UEFI variables via the pstore filesystem (i.e., deleting a pstore fs
entry will continue to delete the backing UEFI variable), I think it
would be nice to preserve the possibility for the sysadmin to delete
Linux-created UEFI variables that carry portions of the crash log,
*without* having to mount the pstore filesystem.'
There's also no chance of causing machines to become bricked by
deleting these variables, which is the whole purpose of excluding
things from the whitelist.
Use the LINUX_EFI_CRASH_GUID guid and a wildcard '*' for the match so
that we don't have to update the string in the future if new variable
name formats are created for crash dump variables.
Reported-by: Laszlo Ersek <lersek@redhat.com> Acked-by: Peter Jones <pjones@redhat.com> Tested-by: Peter Jones <pjones@redhat.com> Cc: Matthew Garrett <mjg59@srcf.ucam.org> Cc: "Lee, Chun-Yi" <jlee@suse.com> Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
"rm -rf" is bricking some peoples' laptops because of variables being
used to store non-reinitializable firmware driver data that's required
to POST the hardware.
These are 100% bugs, and they need to be fixed, but in the mean time it
shouldn't be easy to *accidentally* brick machines.
We have to have delete working, and picking which variables do and don't
work for deletion is quite intractable, so instead make everything
immutable by default (except for a whitelist), and make tools that
aren't quite so broad-spectrum unset the immutable flag.
Signed-off-by: Peter Jones <pjones@redhat.com> Tested-by: Lee, Chun-Yi <jlee@suse.com> Acked-by: Matthew Garrett <mjg59@coreos.com> Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
All the variables in this list so far are defined to be in the global
namespace in the UEFI spec, so this just further ensures we're
validating the variables we think we are.
Including the guid for entries will become more important in future
patches when we decide whether or not to allow deletion of variables
based on presence in this list.
Signed-off-by: Peter Jones <pjones@redhat.com> Tested-by: Lee, Chun-Yi <jlee@suse.com> Acked-by: Matthew Garrett <mjg59@coreos.com> Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Actually translate from ucs2 to utf8 before doing the test, and then
test against our other utf8 data, instead of fudging it.
Signed-off-by: Peter Jones <pjones@redhat.com> Acked-by: Matthew Garrett <mjg59@coreos.com> Tested-by: Lee, Chun-Yi <jlee@suse.com> Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Translate EFI's UCS-2 variable names to UTF-8 instead of just assuming
all variable names fit in ASCII.
Signed-off-by: Peter Jones <pjones@redhat.com> Acked-by: Matthew Garrett <mjg59@coreos.com> Tested-by: Lee, Chun-Yi <jlee@suse.com> Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
It's not very normal to return 1 on failure and 0 on success. There
isn't a reason for it here, the callers don't care so long as it's
non-zero on failure.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Matt Fleming <matt.fleming@intel.com> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
This adds ucs2_utf8size(), which tells us how big our ucs2 string is in
bytes, and ucs2_as_utf8, which translates from ucs2 to utf8..
Signed-off-by: Peter Jones <pjones@redhat.com> Tested-by: Lee, Chun-Yi <jlee@suse.com> Acked-by: Matthew Garrett <mjg59@coreos.com> Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
The PSCI SMP implementation is built only when both CONFIG_SMP and
CONFIG_ARM_PSCI are set, so a configuration that has the latter
but not the former can get a link error when it tries to call
psci_smp_available().
arch/arm/mach-tegra/built-in.o: In function `tegra114_cpuidle_init':
cpuidle-tegra114.c:(.init.text+0x52a): undefined reference to `psci_smp_available'
This corrects the #ifdef in the psci.h header file to match the
Makefile conditional we have for building that function.
Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
ext4 can update bh->b_state non-atomically in _ext4_get_block() and
ext4_da_get_block_prep(). Usually this is fine since bh is just a
temporary storage for mapping information on stack but in some cases it
can be fully living bh attached to a page. In such case non-atomic
update of bh->b_state can race with an atomic update which then gets
lost. Usually when we are mapping bh and thus updating bh->b_state
non-atomically, nobody else touches the bh and so things work out fine
but there is one case to especially worry about: ext4_finish_bio() uses
BH_Uptodate_Lock on the first bh in the page to synchronize handling of
PageWriteback state. So when blocksize < pagesize, we can be atomically
modifying bh->b_state of a buffer that actually isn't under IO and thus
can race e.g. with delalloc trying to map that buffer. The result is
that we can mistakenly set / clear BH_Uptodate_Lock bit resulting in the
corruption of PageWriteback state or missed unlock of BH_Uptodate_Lock.
Fix the problem by always updating bh->b_state bits atomically.
CC: stable@vger.kernel.org Reported-by: Nikolay Borisov <kernel@kyup.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
dax_fault() currently relies on the get_block callback to attach an
io completion callback to the mapping buffer head so that it can
run unwritten extent conversion after zeroing allocated blocks.
Instead of this hack, pass the conversion callback directly into
dax_fault() similar to the get_block callback. When the filesystem
allocates unwritten extents, it will set the buffer_unwritten()
flag, and hence the dax_fault code can call the completion function
in the contexts where it is necessary without overloading the
mapping buffer head.
Note: The changes to ext4 to use this interface are suspect at best.
In fact, the way ext4 did this end_io assignment in the first place
looks suspect because it only set a completion callback when there
wasn't already some other write() call taking place on the same
inode. The ext4 end_io code looks rather intricate and fragile with
all it's reference counting and passing to different contexts for
modification via inode private pointers that aren't protected by
locks...
Signed-off-by: Dave Chinner <dchinner@redhat.com> Acked-by: Jan Kara <jack@suse.cz> Signed-off-by: Dave Chinner <david@fromorbit.com> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
According to the datasheet, the resolusion of temperature sensor is
-5.35 counts/C. Temperature ADC is 472 counts at 25C.
(https://www.sparkfun.com/datasheets/Sensors/Pressure/MPL115A1.pdf
NOTE: This is older revision, but this information is removed from the
latest datasheet from nxp somehow)
The SPI tx and rx buffers are both supposed to be scan_bytes amount of
bytes large and a common allocation is used to allocate both buffers. This
puts the beginning of the tx buffer scan_bytes bytes after the rx buffer.
The initialization of the tx buffer pointer is done adding scan_bytes to
the beginning of the rx buffer, but since the rx buffer is of type __be16
this will actually add two times as much and the tx buffer ends up pointing
after the allocated buffer.
Fix this by using scan_count, which is scan_bytes / 2, instead of
scan_bytes when initializing the tx buffer pointer.
Fixes: aacff892cbd5 ("staging:iio:adis: Preallocate transfer message") Signed-off-by: Lars-Peter Clausen <lars@metafoo.de> Cc: <Stable@vger.kernel.org> Signed-off-by: Jonathan Cameron <jic23@kernel.org> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
By checking the effective credentials instead of the real UID / permitted
capabilities, ensure that the calling process actually intended to use its
credentials.
To ensure that all ptrace checks use the correct caller credentials (e.g.
in case out-of-tree code or newly added code omits the PTRACE_MODE_*CREDS
flag), use two new flags and require one of them to be set.
The problem was that when a privileged task had temporarily dropped its
privileges, e.g. by calling setreuid(0, user_uid), with the intent to
perform following syscalls with the credentials of a user, it still passed
ptrace access checks that the user would not be able to pass.
While an attacker should not be able to convince the privileged task to
perform a ptrace() syscall, this is a problem because the ptrace access
check is reused for things in procfs.
In particular, the following somewhat interesting procfs entries only rely
on ptrace access checks:
/proc/$pid/stat - uses the check for determining whether pointers
should be visible, useful for bypassing ASLR
/proc/$pid/maps - also useful for bypassing ASLR
/proc/$pid/cwd - useful for gaining access to restricted
directories that contain files with lax permissions, e.g. in
this scenario:
lrwxrwxrwx root root /proc/13020/cwd -> /root/foobar
drwx------ root root /root
drwxr-xr-x root root /root/foobar
-rw-r--r-- root root /root/foobar/secret
Therefore, on a system where a root-owned mode 6755 binary changes its
effective credentials as described and then dumps a user-specified file,
this could be used by an attacker to reveal the memory layout of root's
processes or reveal the contents of files he is not allowed to access
(through /proc/$pid/cwd).
[akpm@linux-foundation.org: fix warning] Signed-off-by: Jann Horn <jann@thejh.net> Acked-by: Kees Cook <keescook@chromium.org> Cc: Casey Schaufler <casey@schaufler-ca.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Morris <james.l.morris@oracle.com> Cc: "Serge E. Hallyn" <serge.hallyn@ubuntu.com> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Willy Tarreau <w@1wt.eu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
as reported by https://bugzilla.kernel.org/show_bug.cgi?id=108481
This bug reports mentions 6d4f5440 ("HID: multitouch: Fetch feature
reports on demand for Win8 devices") as the origin of the problem but this
commit actually masked 2 firmware bugs that are annihilating each other:
The report descriptor declares two features in reports 3 and 5:
The report ID 3 presents 2 input mode features, while only the first one
is handled by the device. Given that we did not checked if one was
previously assigned, we were dealing with the ignored featured and we
should never have been able to switch this panel into the multitouch mode.
However, the firmware presents an other bugs which allowed 6d4f5440
to counteract the faulty report descriptor. When we request the values
of the feature 5, the firmware answers "03 03 00". The fields are correct
but the report id is wrong. Before 6d4f5440, we retrieved all the features
and injected them in the system. So when we called report 5, we injected
in the system the report 3 with the values "03 00".
Setting the second input mode to 03 in this report changed it to "03 03"
and the touchpad switched to the mt mode. We could have set anything
in the second field because the actual value (the first 03 in this report)
was given by the query of report ID 5.
To sum up: 2 bugs in the firmware were hiding that we were accessing the
wrong feature.
c118baf80256 ("arch/powerpc/mm/numa.c: do not allocate bootmem memory for non existing nodes")
avoids allocating bootmem memory for non existent nodes.
But when DEBUG_PER_CPU_MAPS=y is enabled, my powerNV system failed to boot
because in sched_init_numa(), cpumask_or() operation was done on
unallocated nodes.
Fix that by making cpumask_or() operation only on existing nodes.
[ Tested with and w/o DEBUG_PER_CPU_MAPS=y on x86 and PowerPC. ]
Many codecs, typically found on Realtek codecs, have the analog
loopback path merged to the secondary input of the middle of the
output paths. Currently, we don't offer the dynamic switching in such
configuration but let each loopback path mute by itself.
This should work well in theory, but in reality, we often see that
such a dead loopback path causes some background noises even if all
the elements get muted. Such a problem has been fixed by adding the
quirk accordingly to disable aamix, and it's the right fix, per se.
The only problem is that it's not so trivial to achieve it; user needs
to pass a hint string via patch module option or sysfs.
This patch gives a bit improvement on the situation: it adds "Loopback
Mixing" control element for such codecs like other codecs (e.g. IDT or
VIA codecs) with the individual loopback paths. User can turn on/off
the loopback path simply via a mixer app.
For keeping the compatibility, the loopback is still enabled on these
codecs. But user can try to turn it off if experiencing a suspicious
background or click noise on the fly, then build a static fixup later
once after the problem is addressed.
Other than the addition of the loopback enable/disablement control,
there should be no changes.
The critical section protected by usbhid->lock in hid_ctrl() is too
big and because of this it causes a recursive deadlock. "Too big" means
the case statement and the call to hid_input_report() do not need to be
protected by the spinlock (no URB operations are done inside them).
The deadlock happens because in certain rare cases drivers try to grab
the lock while handling the ctrl irq which grabs the lock before them
as described above. For example newer wacom tablets like 056a:033c try
to reschedule proximity reads from wacom_intuos_schedule_prox_event()
calling hid_hw_request() -> usbhid_request() -> usbhid_submit_report()
which tries to grab the usbhid lock already held by hid_ctrl().
There are two ways to get out of this deadlock:
1. Make the drivers work "around" the ctrl critical region, in the
wacom case for ex. by delaying the scheduling of the proximity read
request itself to a workqueue.
2. Shrink the critical region so the usbhid lock protects only the
instructions which modify usbhid state, calling hid_input_report()
with the spinlock unlocked, allowing the device driver to grab the
lock first, finish and then grab the lock afterwards in hid_ctrl().
Dell Latitude E6440 (1028:05bd) needs the same fixup as applied to
other Latitude E7xxx models for the click noise due to the recent
power-saving changes.
ALSA PCM may still have a leftover instance after disconnection and
it delays its release. The problem is that the PCM close code path of
USB-audio driver has a call of snd_usb_autosuspend(). This involves
with the call of usb_autopm_put_interface() and it may lead to a
kernel Oops due to the NULL object like:
We have already a check of disconnection in snd_usb_autoresume(), but
the check is missing its counterpart. The fix is just to put the same
check in snd_usb_autosuspend(), too.
After the recent fix of runtime PM for USB-audio driver, we got a
lockdep warning like:
=============================================
[ INFO: possible recursive locking detected ]
4.2.0-rc8+ #61 Not tainted
---------------------------------------------
pulseaudio/980 is trying to acquire lock:
(&chip->shutdown_rwsem){.+.+.+}, at: [<ffffffffa0355dac>] snd_usb_autoresume+0x1d/0x52 [snd_usb_audio]
but task is already holding lock:
(&chip->shutdown_rwsem){.+.+.+}, at: [<ffffffffa0355dac>] snd_usb_autoresume+0x1d/0x52 [snd_usb_audio]
This comes from snd_usb_autoresume() invoking down_read() and it's
used in a nested way. Although it's basically safe, per se (as these
are read locks), it's better to reduce such spurious warnings.
The read lock is needed to guarantee the execution of "shutdown"
(cleanup at disconnection) task after all concurrent tasks are
finished. This can be implemented in another better way.
Also, the current check of chip->in_pm isn't good enough for
protecting the racy execution of multiple auto-resumes.
This patch rewrites the logic of snd_usb_autoresume() & co; namely,
- The recursive call of autopm is avoided by the new refcount,
chip->active. The chip->in_pm flag is removed accordingly.
- Instead of rwsem, another refcount, chip->usage_count, is introduced
for tracking the period to delay the shutdown procedure. At
the last clear of this refcount, wake_up() to the shutdown waiter is
called.
- The shutdown flag is replaced with shutdown atomic count; this is
for reducing the lock.
- Two new helpers are introduced to simplify the management of these
refcounts; snd_usb_lock_shutdown() increases the usage_count, checks
the shutdown state, and does autoresume. snd_usb_unlock_shutdown()
does the opposite. Most of mixer and other codes just need this,
and simply returns an error if it receives an error from lock.
USB Audio Class version 2.0 supports three different parameter block sizes for
CUR requests, which are 1 byte (5.2.3.1 Layout 1 Parameter Block), 2 bytes
(5.2.3.2 Layout 2 Parameter Block) and 4 bytes (5.2.3.3 Layout 3 Parameter
Block). Use the correct size according to the specific control as it was
already done for UACv1. The allocated block size for control requests is
increased to support the 4 byte worst case.
Changed ctl type for Input Gain Control and Input Gain Pad Control to
USB_MIXER_S16 as per section 5.2.5.7.11-12 in the USB Audio Class 2.0
definition.
Inform userspace that one channel of the internal mic has reversed
polarity, so it does not attempt to add both channels together and
end up with silence.
The error path in perf_event_open() is such that asking for a sampling
event on a PMU that doesn't generate interrupts will end up in dropping
the perf_sched_count even though it hasn't been incremented for this
event yet.
Given a sufficient amount of these calls, we'll end up disabling
scheduler's jump label even though we'd still have active events in the
system, thereby facilitating the arrival of the infernal regions upon us.
I'm fixing this by moving account_event() inside perf_event_alloc().
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: <stable@vger.kernel.org> Cc: Arnaldo Carvalho de Melo <acme@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: vince@deater.net Link: http://lkml.kernel.org/r/1456917854-29427-1-git-send-email-alexander.shishkin@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: He Kuang <hekuang@huawei.com> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Because event_sched_out() checks event->pending_disable _before_
actually disabling the event, it can happen that the event fires after
it checks but before it gets disabled.
This would leave event->pending_disable set and the queued irq_work
will try and process it.
However, if the event trigger was during schedule(), the event might
have been de-scheduled by the time the irq_work runs, and
perf_event_disable_local() will fail.
Fix this by checking event->pending_disable _after_ we call
event->pmu->del(). This depends on the latter being a compiler
barrier, such that the compiler does not lift the load and re-creates
the problem.
Tested-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: dvyukov@google.com Cc: eranian@google.com Cc: oleg@redhat.com Cc: panand@redhat.com Cc: sasha.levin@oracle.com Cc: vince@deater.net Link: http://lkml.kernel.org/r/20160224174948.040469884@infradead.org Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: He Kuang <hekuang@huawei.com> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
BUG: sleeping function called from invalid context at kernel/locking/rtmutex.c:917
in_atomic(): 1, irqs_disabled(): 128, pid: 383, name: sh
Preemption disabled at:[<ffff800000124c18>] kgdb_cpu_enter+0x158/0x6b8
This issue is similar with 62c6c61("arm64: replace read_lock to rcu lock in
call_break_hook"), but comes to single_step_handler.
This also solves kgdbts boot test silent hang issue on 4.4 -rt kernel.
Signed-off-by: Yang Shi <yang.shi@linaro.org> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: He Kuang <hekuang@huawei.com> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
call_break_hook is called in atomic context (hard irq disabled), so replace
the sleepable lock to rcu lock, replace relevant list operations to rcu
version and call synchronize_rcu() in unregister_break_hook().
And, replace write lock to spinlock in {un}register_break_hook.
Signed-off-by: Yang Shi <yang.shi@linaro.org> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: He Kuang <hekuang@huawei.com> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Jan Kara [Mon, 7 Dec 2015 19:34:49 +0000 (14:34 -0500)]
ext4: fix races of writeback with punch hole and zero range
When doing delayed allocation, update of on-disk inode size is postponed
until IO submission time. However hole punch or zero range fallocate
calls can end up discarding the tail page cache page and thus on-disk
inode size would never be properly updated.
Make sure the on-disk inode size is updated before truncating page
cache.
Signed-off-by: Jan Kara <jack@suse.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Reviewed-by: Mingming Cao <mingming.cao@oracle.com> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Jan Kara [Mon, 7 Dec 2015 19:31:11 +0000 (14:31 -0500)]
ext4: fix races between buffered IO and collapse / insert range
Current code implementing FALLOC_FL_COLLAPSE_RANGE and
FALLOC_FL_INSERT_RANGE is prone to races with buffered writes and page
faults. If buffered write or write via mmap manages to squeeze between
filemap_write_and_wait_range() and truncate_pagecache() in the fallocate
implementations, the written data is simply discarded by
truncate_pagecache() although it should have been shifted.
Fix the problem by moving filemap_write_and_wait_range() call inside
i_mutex and i_mmap_sem. That way we are protected against races with
both buffered writes and page faults.
Signed-off-by: Jan Kara <jack@suse.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Reviewed-by: Mingming Cao <mingming.cao@oracle.com> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Jan Kara [Mon, 7 Dec 2015 19:29:17 +0000 (14:29 -0500)]
ext4: move unlocked dio protection from ext4_alloc_file_blocks()
Currently ext4_alloc_file_blocks() was handling protection against
unlocked DIO. However we now need to sometimes call it under i_mmap_sem
and sometimes not and DIO protection ranks above it (although strictly
speaking this cannot currently create any deadlocks). Also
ext4_zero_range() was actually getting & releasing unlocked DIO
protection twice in some cases. Luckily it didn't introduce any real bug
but it was a land mine waiting to be stepped on. So move DIO protection
out from ext4_alloc_file_blocks() into the two callsites.
Signed-off-by: Jan Kara <jack@suse.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Reviewed-by: Mingming Cao <mingming.cao@oracle.com> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Jan Kara [Mon, 7 Dec 2015 19:28:03 +0000 (14:28 -0500)]
ext4: fix races between page faults and hole punching
Currently, page faults and hole punching are completely unsynchronized.
This can result in page fault faulting in a page into a range that we
are punching after truncate_pagecache_range() has been called and thus
we can end up with a page mapped to disk blocks that will be shortly
freed. Filesystem corruption will shortly follow. Note that the same
race is avoided for truncate by checking page fault offset against
i_size but there isn't similar mechanism available for punching holes.
Fix the problem by creating new rw semaphore i_mmap_sem in inode and
grab it for writing over truncate, hole punching, and other functions
removing blocks from extent tree and for read over page faults. We
cannot easily use i_data_sem for this since that ranks below transaction
start and we need something ranking above it so that it can be held over
the whole truncate / hole punching operation. Also remove various
workarounds we had in the code to reduce race window when page fault
could have created pages with stale mapping information.
Signed-off-by: Jan Kara <jack@suse.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Reviewed-by: Mingming Cao <mingming.cao@oracle.com> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
The MIPS_GIC_IPI should only be selected when MIPS_GIC is also
selected, otherwise it results in a compile error. smp-gic.c uses some
functions from include/linux/irqchip/mips-gic.h like
plat_ipi_call_int_xlate() which are only added to the header file when
MIPS_GIC is set. The Lantiq SoC does not use the GIC, but supports SMP.
The calls top the functions from smp-gic.c are already protected by
some #ifdefs
The first part of this was introduced in commit 72e20142b2bf ("MIPS:
Move GIC IPI functions out of smp-cmp.c")
R6 does not support the MIPS MT ASE and the CMP/SMP options so
restrict them in order to prevent users from selecting incompatible
SMP configuration for R6 cores. We also disable the CPS/SMP option
because its support hasn't been added to the CPS code yet.
The ld-version.sh script fails on some versions of awk with the
following error, resulting in build failures for MIPS:
awk: scripts/ld-version.sh: line 4: regular expression compile failed (missing '(')
This is due to the regular expression ".*)", meant to strip off the
beginning of the ld version string up to the close bracket, however
brackets have a meaning in regular expressions, so lets escape it so
that awk doesn't expect a corresponding open bracket.
Fixes: ccbef1674a15 ("Kbuild, lto: add ld-version and ld-ifversion ...") Reported-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: James Hogan <james.hogan@imgtec.com> Tested-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Tested-by: Sudip Mukherjee <sudip.mukherjee@codethink.co.uk> Cc: Michal Marek <mmarek@suse.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: linux-mips@linux-mips.org Cc: linux-kbuild@vger.kernel.org Cc: linux-kernel@vger.kernel.org Cc: stable@vger.kernel.org # 4.4.x-
Patchwork: https://patchwork.linux-mips.org/patch/12838/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
When computing the residue we need two pieces of information: the current
descriptor and the remaining data of the current descriptor. To get
that information, we need to read consecutively two registers but we
can't do it in an atomic way. For that reason, we have to check manually
that current descriptor has not changed.
Signed-off-by: Ludovic Desroches <ludovic.desroches@atmel.com> Suggested-by: Cyrille Pitchen <cyrille.pitchen@atmel.com> Reported-by: David Engraf <david.engraf@sysgo.com> Tested-by: David Engraf <david.engraf@sysgo.com> Fixes: e1f7c9eee707 ("dmaengine: at_xdmac: creation of the atmel
eXtended DMA Controller driver") Cc: stable@vger.kernel.org #4.1 and later Signed-off-by: Vinod Koul <vinod.koul@intel.com> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Yes, all of these are needed. :) This is admittedly a bit odd, but
kvm-unit-tests access.flat tests this if you run it with "-cpu host"
and of course ept=0.
KVM runs the guest with CR0.WP=1, so it must handle supervisor writes
specially when pte.u=1/pte.w=0/CR0.WP=0. Such writes cause a fault
when U=1 and W=0 in the SPTE, but they must succeed because CR0.WP=0.
When KVM gets the fault, it sets U=0 and W=1 in the shadow PTE and
restarts execution. This will still cause a user write to fault, while
supervisor writes will succeed. User reads will fault spuriously now,
and KVM will then flip U and W again in the SPTE (U=1, W=0). User reads
will be enabled and supervisor writes disabled, going back to the
originary situation where supervisor writes fault spuriously.
When SMEP is in effect, however, U=0 will enable kernel execution of
this page. To avoid this, KVM also sets NX=1 in the shadow PTE together
with U=0. If the guest has not enabled NX, the result is a continuous
stream of page faults due to the NX bit being reserved.
The fix is to force EFER.NX=1 even if the CPU is taking care of the EFER
switch. (All machines with SMEP have the CPU_LOAD_IA32_EFER vm-entry
control, so they do not use user-return notifiers for EFER---if they did,
EFER.NX would be forced to the same value as the host).
There is another bug in the reserved bit check, which I've split to a
separate patch for easier application to stable kernels.
Cc: stable@vger.kernel.org Cc: Andy Lutomirski <luto@amacapital.net> Reviewed-by: Xiao Guangrong <guangrong.xiao@linux.intel.com> Fixes: f6577a5fa15d82217ca73c74cd2dcbc0f6c781dd Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
The fork of a process with four page table levels is broken since
git commit 6252d702c5311ce9 "[S390] dynamic page tables."
All new mm contexts are created with three page table levels and
an asce limit of 4TB. If the parent has four levels dup_mmap will
add vmas to the new context which are outside of the asce limit.
The subsequent call to copy_page_range will walk the three level
page table structure of the new process with non-zero pgd and pud
indexes. This leads to memory clobbers as the pgd_index *and* the
pud_index is added to the mm->pgd pointer without a pgd_deref
in between.
The init_new_context() function is selecting the number of page
table levels for a new context. The function is used by mm_init()
which in turn is called by dup_mm() and mm_alloc(). These two are
used by fork() and exec(). The init_new_context() function can
distinguish the two cases by looking at mm->context.asce_limit,
for fork() the mm struct has been copied and the number of page
table levels may not change. For exec() the mm_alloc() function
set the new mm structure to zero, in this case a three-level page
table is created as the temporary stack space is located at
STACK_TOP_MAX = 4TB.
This fixes CVE-2016-2143.
Reported-by: Marcin Kościelnicki <koriakin@0x04.net> Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: stable@vger.kernel.org Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Commit f37755490fe9b ("tracepoints: Do not trace when cpu is offline") added
a check to make sure that tracepoints only get called when the cpu is
online, as it uses rcu_read_lock_sched() for protection.
Commit 3a630178fd5f3 ("tracing: generate RCU warnings even when tracepoints
are disabled") added lockdep checks (including rcu checks) for events that
are not enabled to catch possible RCU issues that would only be triggered if
a trace event was enabled. Commit f37755490fe9b only stopped the warnings
when the trace event was enabled but did not prevent warnings if the trace
event was called when disabled.
To fix this, the cpu online check is moved to where the condition is added
to the trace event. This will place the cpu online check in all places that
it may be used now and in the future.
Cc: stable@vger.kernel.org # v3.18+ Fixes: f37755490fe9b ("tracepoints: Do not trace when cpu is offline") Fixes: 3a630178fd5f3 ("tracing: generate RCU warnings even when tracepoints are disabled") Reported-by: Sudeep Holla <sudeep.holla@arm.com> Tested-by: Sudeep Holla <sudeep.holla@arm.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
When I fixed the dp rate selection in: 092c96a8ab9d1bd60ada2ed385cc364ce084180e
drm/radeon: fix dp link rate selection (v2)
I accidently dropped the special handling for NUTMEG
DP bridge chips. They require a fixed link rate.
Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Ken Wang <Qingqing.Wang@amd.com> Reviewed-by: Harry Wentland <harry.wentland@amd.com> Tested-by: Ken Moffat <zarniwhoop@ntlworld.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Need to properly handle the max link rate in the dpcd.
This prevents some cases where 5.4 Ghz is selected when
it shouldn't be.
v2: simplify logic, add array bounds check
Reviewed-by: Tom St Denis <tom.stdenis@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Linux guests on Haswell (and also SandyBridge and Broadwell, at least)
would crash if you decided to run a host command that uses PEBS, like
perf record -e 'cpu/mem-stores/pp' -a
This happens because KVM is using VMX MSR switching to disable PEBS, but
SDM [2015-12] 18.4.4.4 Re-configuring PEBS Facilities explains why it
isn't safe:
When software needs to reconfigure PEBS facilities, it should allow a
quiescent period between stopping the prior event counting and setting
up a new PEBS event. The quiescent period is to allow any latent
residual PEBS records to complete its capture at their previously
specified buffer address (provided by IA32_DS_AREA).
There might not be a quiescent period after the MSR switch, so a CPU
ends up using host's MSR_IA32_DS_AREA to access an area in guest's
memory. (Or MSR switching is just buggy on some models.)
The guest can learn something about the host this way:
If the guest doesn't map address pointed by MSR_IA32_DS_AREA, it results
in #PF where we leak host's MSR_IA32_DS_AREA through CR2.
After that, a malicious guest can map and configure memory where
MSR_IA32_DS_AREA is pointing and can therefore get an output from
host's tracing.
This is not a critical leak as the host must initiate with PEBS tracing
and I have not been able to get a record from more than one instruction
before vmentry in vmx_vcpu_run() (that place has most registers already
overwritten with guest's).
We could disable PEBS just few instructions before vmentry, but
disabling it earlier shouldn't affect host tracing too much.
We also don't need to switch MSR_IA32_PEBS_ENABLE on VMENTRY, but that
optimization isn't worth its code, IMO.
(If you are implementing PEBS for guests, be sure to handle the case
where both host and guest enable PEBS, because this patch doesn't.)
Fixes: 26a4f3c08de4 ("perf/x86: disable PEBS on a guest entry.") Cc: <stable@vger.kernel.org> Reported-by: Jiří Olša <jolsa@redhat.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
d_instantiate(new_dentry, old_inode) is absolutely wrong thing to
do - it will oops if new_dentry used to be positive, for starters.
What we need is d_invalidate() the target and be done with that.
Failing to allocate an inode for child means that cache for *parent* is
incompletely populated. So it's parent directory inode ('dir') that
needs NCPI_DIR_CACHE flag removed, *not* the child inode ('inode', which
is what we'd failed to allocate in the first place).
Fucked-up-in: commit 5e993e25 ("ncpfs: get rid of d_validate() nonsense") Fucked-up-by: Al Viro <viro@zeniv.linux.org.uk> Cc: stable@vger.kernel.org # v3.19 Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Thomas Huth discovered that a guest could cause a hard hang of a
host CPU by setting the Instruction Authority Mask Register (IAMR)
to a suitable value. It turns out that this is because when the
code was added to context-switch the new special-purpose registers
(SPRs) that were added in POWER8, we forgot to add code to ensure
that they were restored to a sane value on guest exit.
This adds code to set those registers where a bad value could
compromise the execution of the host kernel to a suitable neutral
value on guest exit.
Cc: stable@vger.kernel.org # v3.14+ Fixes: b005255e12a3 Reported-by: Thomas Huth <thuth@redhat.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Description:
------------
The RGMII 1000 Mbps Transmit timing is based on the output clock
(rgmiin_txc) being driven relative to the rising edge of an internal
clock and the output control/data (rgmiin_txctl/txd) being driven relative
to the falling edge of an internal clock source. If the internal clock
source is allowed to be static low (i.e., disabled) for an extended period
of time then when the clock is actually enabled the timing delta between
the rising edge and falling edge can change over the lifetime of the
device. This can result in the device switching characteristics degrading
over time, and eventually failing to meet the Data Manual Delay Time/Skew
specs.
To maintain RGMII 1000 Mbps IO Timings, SW should minimize the
duration that the Ethernet internal clock source is disabled. Note that
the device reset state for the Ethernet clock is "disabled".
Other RGMII modes (10 Mbps, 100Mbps) are not affected
Workaround:
-----------
If the SoC Ethernet interface(s) are used in RGMII mode at 1000 Mbps,
SW should minimize the time the Ethernet internal clock source is disabled
to a maximum of 200 hours in a device life cycle. This is done by enabling
the clock as early as possible in IPL (QNX) or SPL/u-boot (Linux/Android)
by setting the register CM_GMAC_CLKSTCTRL[1:0]CLKTRCTRL = 0x2:SW_WKUP.
So, do not allow to gate the cpsw clocks using ti,no-idle property in
cpsw node assuming 1000 Mbps is being used all the time. If someone does
not need 1000 Mbps and wants to gate clocks to cpsw, this property needs
to be deleted in their respective board files.
Signed-off-by: Mugunthan V N <mugunthanvnm@ti.com> Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com> Signed-off-by: Lokesh Vutla <lokeshvutla@ti.com> Cc: <stable@vger.kernel.org> Signed-off-by: Paul Walmsley <paul@pwsan.com> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Introduce a dt property, ti,no-idle, that prevents an IP to idle at any
point. This is to handle Errata i877, which tells that GMAC clocks
cannot be disabled.
Acked-by: Roger Quadros <rogerq@ti.com> Tested-by: Mugunthan V N <mugunthanvnm@ti.com> Signed-off-by: Lokesh Vutla <lokeshvutla@ti.com> Signed-off-by: Sekhar Nori <nsekhar@ti.com> Signed-off-by: Dave Gerlach <d-gerlach@ti.com> Acked-by: Rob Herring <robh@kernel.org> Signed-off-by: Paul Walmsley <paul@pwsan.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Some module needs more than one functional clock in order to be accessible,
like the McASPs found in DRA7xx family.
This flag will indicate that the opt_clks need to be handled at the same
time as the main_clk for the given hwmod, ensuring that all needed clocks
are enabled before we try to access the module's address space.
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com> Acked-by: Paul Walmsley <paul@pwsan.com> Tested-by: Felipe Balbi <balbi@ti.com> Signed-off-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
This patch fixes a recent ABORT_TASK regression associated
with commit febe562c, where a left-over target_put_sess_cmd()
would still be called when __target_check_io_state() detected
a command has already been completed, and explicit ABORT must
be avoided.
Note commit febe562c dropped the local kref_get_unless_zero()
check in core_tmr_abort_task(), but did not drop this extra
corresponding target_put_sess_cmd() in the failure path.
So go ahead and drop this now bogus target_put_sess_cmd(),
and avoid this potential use-after-free.
Reported-by: Dan Lane <dracodan@gmail.com> Cc: Quinn Tran <quinn.tran@qlogic.com> Cc: Himanshu Madhani <himanshu.madhani@qlogic.com> Cc: Sagi Grimberg <sagig@mellanox.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Hannes Reinecke <hare@suse.de> Cc: Andy Grover <agrover@redhat.com> Cc: Mike Christie <mchristi@redhat.com> Cc: stable@vger.kernel.org # 3.14+ Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Overlayfs must update uid/gid after chown, otherwise functions
like inode_owner_or_capable() will check user against stale uid.
Catched by xfstests generic/087, it chowns file and calls utimes.
After rename file dentry still holds reference to lower dentry from
previous location. This doesn't matter for data access because data comes
from upper dentry. But this stale lower dentry taints dentry at new
location and turns it into non-pure upper. Such file leaves visible
whiteout entry after remove in directory which shouldn't have whiteouts at
all.
Overlayfs already tracks pureness of file location in oe->opaque. This
patch just uses that for detecting actual path type.
Comment from Vivek Goyal's patch:
Here are the details of the problem. Do following.
$ mkdir upper lower work merged upper/dir/
$ touch lower/test
$ sudo mount -t overlay overlay -olowerdir=lower,upperdir=upper,workdir=
work merged
$ mv merged/test merged/dir/
$ rm merged/dir/test
$ ls -l merged/dir/
/usr/bin/ls: cannot access merged/dir/test: No such file or directory
total 0
c????????? ? ? ? ? ? test
Basic problem seems to be that once a file has been unlinked, a whiteout
has been left behind which was not needed and hence it becomes visible.
Whiteout is visible because parent dir is of not type MERGE, hence
od->is_real is set during ovl_dir_open(). And that means ovl_iterate()
passes on iterate handling directly to underlying fs. Underlying fs does
not know/filter whiteouts so it becomes visible to user.
Why did we leave a whiteout to begin with when we should not have.
ovl_do_remove() checks for OVL_TYPE_PURE_UPPER() and does not leave
whiteout if file is pure upper. In this case file is not found to be pure
upper hence whiteout is left.
So why file was not PURE_UPPER in this case? I think because dentry is
still carrying some leftover state which was valid before rename. For
example, od->numlower was set to 1 as it was a lower file. After rename,
this state is not valid anymore as there is no such file in lower.
ovl_remove_upper() should do d_drop() only after it successfully
removes the dir, otherwise a subsequent getcwd() system call will
fail, breaking userspace programs.
This is to fix: https://bugzilla.kernel.org/show_bug.cgi?id=110491
Signed-off-by: Rui Wang <rui.y.wang@intel.com> Reviewed-by: Konstantin Khlebnikov <koct9i@gmail.com> Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Cc: <stable@vger.kernel.org> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Public Action frames use special rules for how the BSSID field (Address
3) is set. A wildcard BSSID is used in cases where the transmitter and
recipient are not members of the same BSS. As such, we need to accept
Public Action frames with wildcard BSSID.
Commit db8e17324553 ("mac80211: ignore frames between TDLS peers when
operating as AP") added a rule that drops Action frames to TDLS-peers
based on an Action frame having different DA (Address 1) and BSSID
(Address 3) values. This is not correct since it misses the possibility
of BSSID being a wildcard BSSID in which case the Address 1 would not
necessarily match.
Fix this by allowing mac80211 to accept wildcard BSSID in an Action
frame when in AP mode.
Fixes: db8e17324553 ("mac80211: ignore frames between TDLS peers when operating as AP") Cc: stable@vger.kernel.org Signed-off-by: Jouni Malinen <jouni@qca.qualcomm.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Just like for CCMP we need to check that for GCMP the fragments
have PNs that increment by one; the spec was updated to fix this
security issue and now has the following text:
The receiver shall discard MSDUs and MMPDUs whose constituent
MPDU PN values are not incrementing in steps of 1.
Adapt the code for CCMP to work for GCMP as well, luckily the
relevant fields already alias each other so no code duplication
is needed (just check the aliasing with BUILD_BUG_ON.)
Cc: stable@vger.kernel.org Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
The firmware ctls like "DSP1 Firmware" in wm_adsp codec driver are
enum, while the current driver accesses wrongly via
value.integer.value[]. They have to be via value.enumerated.item[]
instead.
Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Mark Brown <broonie@kernel.org> Cc: stable@vger.kernel.org Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
The DRC Mode like "AIF1DRC1 Mode" and EQ Mode like "AIF1.1 EQ Mode" in
wm8994 codec driver are enum ctls, while the current driver accesses
wrongly via value.integer.value[]. They have to be via
value.enumerated.item[] instead.
Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Mark Brown <broonie@kernel.org> Cc: stable@vger.kernel.org Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
"MBC Mode", "VSS Mode", "VSS HPF Mode" and "Enhanced EQ Mode" ctls in
wm8958 codec driver are enum, while the current driver accesses
wrongly via value.integer.value[]. They have to be via
value.enumerated.item[] instead.
Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Mark Brown <broonie@kernel.org> Cc: stable@vger.kernel.org Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
snd_soc_dapm_dai_link_get() and _put() access the associated ctl
values as value.integer.value[]. However, this is an enum ctl, and it
has to be accessed via value.enumerated.item[]. The former is long
while the latter is unsigned int, so they don't align.
Fixes: c66150824b8a ('ASoC: dapm: add code to configure dai link parameters') Cc: <stable@vger.kernel.org> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
gs_destroy_candev() erroneously calls kfree() on a struct gs_can *, which is
allocated through alloc_candev() and should instead be freed using
free_candev() alone.
The inappropriate use of kfree() causes the kernel to hang when
gs_destroy_candev() is called.
Only the struct gs_usb * which is allocated through kzalloc() should be freed
using kfree() when the device is disconnected.
Signed-off-by: Maximilian Schneider <max@schneidersoft.net> Cc: linux-stable <stable@vger.kernel.org> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
The value 5000 was put here with the addition of the timeout field to
ieee80211_start_tx_ba_session. It was originally added in mac80211 to
save resources for drivers like iwlwifi, which only supports a limited
number of concurrent aggregation sessions.
Since iwlwifi does not use minstrel_ht and other drivers don't need
this, 0 is a better default - especially since there have been
recent reports of aggregation setup related issues reproduced with
ath9k. This should improve stability without causing any adverse
effects.
Cc: stable@vger.kernel.org Acked-by: Avery Pennarun <apenwarr@gmail.com> Signed-off-by: Felix Fietkau <nbd@openwrt.org> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Lockdep warns of a potential lock inversion, i2s->lock is held numerous
times whilst we are under the substream lock (snd_pcm_stream_lock). If
we use the IRQ unsafe spin lock calls, you can also end up locking
snd_pcm_stream_lock whilst under i2s->lock (if an IRQ happens whilst we
are holding i2s->lock). This could result in deadlock.
This patch changes to using the irq safe spinlock calls, to avoid this
issue.
Fixes: ce8bcdbb61d9 ("ASoC: samsung: i2s: Protect more registers with a spinlock") Signed-off-by: Charles Keepax <ckeepax@opensource.wolfsonmicro.com> Tested-by: Anand Moon <linux.amoon@gmail.com> Signed-off-by: Mark Brown <broonie@kernel.org> Cc: stable@vger.kernel.org Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Until this patch, when TXing non-sta the pending_frames counter
wasn't increased, but it WAS decreased in
iwl_mvm_rx_tx_cmd_single(), what makes it negative in certain
conditions. This in turn caused much trouble when we need to
remove the station since we won't be waiting forever until
pending_frames gets 0. In certain cases, we were exhausting
the station table even in BSS mode, because we had a lot of
stale stations.
Increase the counter also in iwl_mvm_tx_skb_non_sta() after a
successful TX to avoid this outcome.
The change from cur_tp to the function
minstrel_get_tp_avg/minstrel_ht_get_tp_avg changed the unit used for the
current throughput. For example in minstrel_ht the correct
conversion between them would be:
mrs->cur_tp / 10 == minstrel_ht_get_tp_avg(..).
This factor 10 must also be included in the calculation of
minstrel_get_expected_throughput and minstrel_ht_get_expected_throughput to
return values with the unit [Kbps] instead of [10Kbps]. Otherwise routing
algorithms like B.A.T.M.A.N. V will make incorrect decision based on these
values. Its kernel based implementation expects expected_throughput always
to have the unit [Kbps] and not sometimes [10Kbps] and sometimes [Kbps].
The same requirement has iw or olsrdv2's nl80211 based statistics module
which retrieve the same data via NL80211_STA_INFO_TX_BITRATE.
Cc: stable@vger.kernel.org Fixes: 6a27b2c40b48 ("mac80211: restructure per-rate throughput calculation into function") Signed-off-by: Sven Eckelmann <sven@open-mesh.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Since cfg80211 frequently takes actions from its netdev notifier
call, wireless extensions messages could still be ordered badly
since the wext netdev notifier, since wext is built into the
kernel, runs before the cfg80211 netdev notifier. For example,
the following can happen:
5: wlan1: <BROADCAST,MULTICAST> mtu 1500 qdisc mq state DOWN group default
link/ether 02:00:00:00:01:00 brd ff:ff:ff:ff:ff:ff
5: wlan1: <BROADCAST,MULTICAST,UP>
link/ether
when setting the interface down causes the wext message.
To also fix this, export the wireless_nlevent_flush() function
and also call it from the cfg80211 notifier.
Cc: stable@vger.kernel.org Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Beniamino reported that he was getting an RTM_NEWLINK message for a
given interface, after the RTM_DELLINK for it. It turns out that the
message is a wireless extensions message, which was sent because the
interface had been connected and disconnection while it was deleted
caused a wext message.
For its netlink messages, wext uses RTM_NEWLINK, but the message is
without all the regular rtnetlink attributes, so "ip monitor link"
prints just rudimentary information:
5: wlan1: <BROADCAST,MULTICAST> mtu 1500 qdisc mq state DOWN group default
link/ether 02:00:00:00:01:00 brd ff:ff:ff:ff:ff:ff
Deleted 5: wlan1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default
link/ether 02:00:00:00:01:00 brd ff:ff:ff:ff:ff:ff
5: wlan1: <BROADCAST,MULTICAST,UP>
link/ether
(from my hwsim reproduction)
This can cause userspace to get confused since it doesn't expect an
RTM_NEWLINK message after RTM_DELLINK.
The reason for this is that wext schedules a worker to send out the
messages, and the scheduling delay can cause the messages to get out
to userspace in different order.
To fix this, have wext register a netdevice notifier and flush out
any pending messages when netdevice state changes. This fixes any
ordering whenever the original message wasn't sent by a notifier
itself.
This is a clone of commit 2ab957492d13b ("ip_forward: Drop frames with
attached skb->sk") for ipv6.
This commit has exactly the same reasons as the above mentioned commit,
namely to prevent panics during netfilter reload or a misconfigured stack.
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
As reported at https://bugs.launchpad.net/qemu/+bug/1494350,
it is possible to have vcpu->arch.st.last_steal initialized
from a thread other than vcpu thread, say the iothread, via
KVM_SET_MSRS.
Which can cause an overflow later (when subtracting from vcpu threads
sched_info.run_delay).
To avoid that, move steal time accumulation to vcpu entry time,
before copying steal time data to guest.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Reviewed-by: David Matlack <dmatlack@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Avoid sending a partially initialised `siginfo_t' structure along SIGFPE
signals issued from `do_ov' and `do_trap_or_bp', leading to information
leaking from the kernel stack.
Signed-off-by: Maciej W. Rozycki <macro@imgtec.com> Cc: stable@vger.kernel.org Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
This patch applies the microphone-related fix created for the Acer
Aspire E1-572 to the E1-472 as well, as it uses the same Realtek ALC282
CODEC and demonstrates the same issues.
This patch allows an external, headset microphone to be used and limits
the gain on the (quite noisy) internal microphone.
Signed-off-by: Simon South <simon@simonsouth.com> Cc: <stable@vger.kernel.org> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Pause/unpause graph tracing around do_suspend_lowlevel as it has
inconsistent call/return info after it jumps to the wakeup vector.
The graph trace buffer will otherwise become misaligned and
may eventually crash and hang on suspend.
To reproduce the issue and test the fix:
Run a function_graph trace over suspend/resume and set the graph
function to suspend_devices_and_enter. This consistently hangs the
system without this fix.
Signed-off-by: Todd Brandt <todd.e.brandt@linux.intel.com> Cc: All applicable <stable@vger.kernel.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
On CI, we need to see if the number of crtcs changes to determine
whether or not we need to upload the mclk table again. In practice
we don't currently upload the mclk table again after the initial load.
The only reason you would would be to add new states, e.g., for
arbitrary mclk setting which is not currently supported.
Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
During DRAM initialization on certain ASpeed devices, an incorrect
bit (bit 10) was checked in the "SDRAM Bus Width Status" register
to determine DRAM width.
Query bit 6 instead in accordance with the Aspeed AST2050 datasheet v1.05.
Mike Frysinger reported that his ptrace testcase showed strange
behaviour on parisc: It was not possible to avoid a syscall and the
return value of a syscall couldn't be changed.
To modify a syscall number, we were missing to save the new syscall
number to gr20 which is then picked up later in assembly again.
The effect that the return value couldn't be changed is a side-effect of
another bug in the assembly code. When a process is ptraced, userspace
expects each syscall to report entrance and exit of a syscall. If a
syscall number was given which doesn't exist, we jumped to the normal
syscall exit code instead of informing userspace that the (non-existant)
syscall exits. This unexpected behaviour confuses userspace and thus the
bug was misinterpreted as if we can't change the return value.
This patch fixes both problems and was tested on 64bit kernel with
32bit userspace.
Signed-off-by: Helge Deller <deller@gmx.de> Cc: Mike Frysinger <vapier@gentoo.org> Cc: stable@vger.kernel.org # v4.0+ Tested-by: Mike Frysinger <vapier@gentoo.org> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
The OSS sequencer client tries to drain the pending events at
releasing. Unfortunately, as spotted by syzkaller fuzzer, this may
lead to an unkillable process state when the event has been queued at
the far future. Since the process being released can't be signaled
any longer, it remains and waits for the echo-back event in that far
future.
Back to history, the draining feature was implemented at the time we
misinterpreted POSIX definition for blocking file operation.
Actually, such a behavior is superfluous at release, and we should
just release the device as is instead of keeping it up forever.
This patch just removes the draining call that may block the release
for too long time unexpectedly.
Plantronics DA45 does not support reading the sample rate which leads
to many lines of "cannot get freq at ep 0x4" and "cannot get freq at
ep 0x84". This patch adds the USB ID of the DA45 to quirks.c and
avoids those error messages.