It turned out that many HP laptops suffer from the same problem as
fixed in commit [c932b98c1e47: ALSA: hda - Apply pin fixup for HP
ProBook 6550b]. But, it's tiresome to list up all such PCI SSIDs, as
there are really lots of HP machines.
Instead, we do a bit more clever, try to check the supposedly dock and
built-in headphone pins, and apply the fixup when both seem valid.
This rule can be applied generically to all models using the same
quirk, so we'll fix all in a shot.
When committed to upstream, these four modules had wrong entries for
Makefile. This forces them to be loadable modules even if they're set
as built-in.
This commit fixes this bug.
Fixes: b5b04336015e('ALSA: fireworks: Add skelton for Fireworks based devices') Fixes: fd6f4b0dc167('ALSA: bebob: Add skelton for BeBoB based devices') Fixes: 1a4e39c2e5ca('ALSA: oxfw: Move to its own directory') Fixes: 14ff6a094815('ALSA: dice: Move file to its own directory') Signed-off-by: Takashi Sakamoto <o-takashi@sakamocchi.jp> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
HP ProBook 6550b needs the same pin fixup applied to other HP B-series
laptops with docks for making its headphone and dock headphone jacks
working properly. We just need to add the codec SSID to the list.
During the migration to HDA core code, we lost the workaround for 4k
BDL boundary. The flag exists in the new hdac_bus, but it's never
set. This resulted in the sudden sound stall on some controllers that
require this workaround like Creative Recon3D.
This patch fixes the issue by setting the flag for such controllers
properly.
We've had many reports that some Creative sound cards with CA0132
don't work well. Some reported that it starts working after reloading
the module, while some reported it starts working when a 32bit kernel
is used. All these facts seem implying that the chip fails to
communicate when the buffer is located in 64bit address.
This patch addresses these issues by just adding AZX_DCAPS_NO_64BIT
flag to the corresponding PCI entries. I casually had a chance to
test an SB Recon3D board, and indeed this seems helping.
Although this hasn't been tested on all Creative devices, it's safer
to assume that this restriction applies to the rest of them, too. So
the flag is applied to all Creative entries.
We encountered a panic on boot in ipmi_si on a dell per320 due to an
uninitialized timer as follows.
static int smi_start_processing(void *send_info,
ipmi_smi_t intf)
{
/* Try to claim any interrupts. */
if (new_smi->irq_setup)
new_smi->irq_setup(new_smi);
--> IRQ arrives here and irq handler tries to modify uninitialized timer
which triggers BUG_ON(!timer->function) in __mod_timer().
The timer and thread were not being started for internal messages,
so in interrupt mode if something hung the timer would never go
off and clean things up. Factor out the internal message sending
and start the timer for those messages, too.
Regardless of the previous CPU a timer was on, add_timer_on()
currently simply sets timer->flags to the new CPU. As the caller must
be seeing the timer as idle, this is locally fine, but the timer
leaving the old base while unlocked can lead to race conditions as
follows.
Let's say timer was on cpu 0.
cpu 0 cpu 1
-----------------------------------------------------------------------------
del_timer(timer) succeeds
del_timer(timer)
lock_timer_base(timer) locks cpu_0_base
add_timer_on(timer, 1)
spin_lock(&cpu_1_base->lock)
timer->flags set to cpu_1_base
operates on @timer operates on @timer
This triggered with mod_delayed_work_on() which contains
"if (del_timer()) add_timer_on()" sequence eventually leading to the
following oops.
When switch_mm() activates a new PGD, it also sets a bit that
tells other CPUs that the PGD is in use so that TLB flush IPIs
will be sent. In order for that to work correctly, the bit
needs to be visible prior to loading the PGD and therefore
starting to fill the local TLB.
Document all the barriers that make this work correctly and add
a couple that were missing.
Signed-off-by: Andy Lutomirski <luto@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-mm@kvack.org Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When decompressing kernel image during x86 bootup, malloc memory
for ELF program headers may run out of heap space, which leads
to system halt. This patch doubles BOOT_HEAP_SIZE to 64KB.
Tested with 32-bit kernel which failed to boot without this patch.
Signed-off-by: H.J. Lu <hjl.tools@gmail.com> Acked-by: H. Peter Anvin <hpa@zytor.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
If we do not do this, it is not properly saved and restored across
migration. Windows notices due to its self-protection mechanisms,
and is very upset about it (blue screen of death).
Currently it is possible for userspace (e.g. QEMU) to set a value
for the MSR for a guest VCPU which has both of the TS bits set,
which is an illegal combination. The result of this is that when
we execute a hrfid (hypervisor return from interrupt doubleword)
instruction to enter the guest, the CPU will take a TM Bad Thing
type of program interrupt (vector 0x700).
Now, if PR KVM is configured in the kernel along with HV KVM, we
actually handle this without crashing the host or giving hypervisor
privilege to the guest; instead what happens is that we deliver a
program interrupt to the guest, with SRR0 reflecting the address
of the hrfid instruction and SRR1 containing the MSR value at that
point. If PR KVM is not configured in the kernel, then we try to
run the host's program interrupt handler with the MMU set to the
guest context, which almost certainly causes a host crash.
This closes the hole by making kvmppc_set_msr_hv() check for the
illegal combination and force the TS field to a safe value (00,
meaning non-transactional).
Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In static micro-threading modes, the dynamic micro-threading code
is supposed to be disabled, because subcores can't make independent
decisions about what micro-threading mode to put the core in - there is
only one micro-threading mode for the whole core. The code that
implements dynamic micro-threading checks for this, except that the
check was missed in one case. This means that it is possible for a
subcore in static 2-way micro-threading mode to try to put the core
into 4-way micro-threading mode, which usually leads to stuck CPUs,
spinlock lockups, and other stalls in the host.
The problem was in the can_split_piggybacked_subcores() function, which
should always return false if the system is in a static micro-threading
mode. This fixes the problem by making can_split_piggybacked_subcores()
use subcore_config_ok() for its checks, as subcore_config_ok() includes
the necessary check for the static micro-threading modes.
Credit to Gautham Shenoy for working out that the reason for the hangs
and stalls we were seeing was that we were trying to do dynamic 4-way
micro-threading while we were in static 2-way mode.
Fixes: b4deba5c41e9 Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
On a cancelled suspend the vcpu_info location does not change (it's
still in the per-cpu area registered by xen_vcpu_setup()). So do not
call xen_hvm_init_shared_info() which would make the kernel think its
back in the shared info. With the wrong vcpu_info, events cannot be
received and the domain will hang after a cancelled suspend.
Signed-off-by: Charles Ouyang <ouyangzhaowei@huawei.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Doing so will cause the grant to be unmapped and then, during
fault handling, the fault to be mistakenly treated as NUMA hint
fault.
In addition, even if those maps could partcipate in NUMA
balancing, it wouldn't provide any benefit since we are unable
to determine physical page's node (even if/when VNUMA is
implemented).
Marking grant maps' VMAs as VM_IO will exclude them from being
part of NUMA balancing.
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Intel's MCA implementation broadcasts MCEs to all CPUs on the
node. This poses a problem for offlined CPUs which cannot
participate in the rendezvous process:
Kernel panic - not syncing: Timeout: Not all CPUs entered broadcast exception handler
Kernel Offset: disabled
Rebooting in 100 seconds..
More specifically, Linux does a soft offline of a CPU when
writing a 0 to /sys/devices/system/cpu/cpuX/online, which
doesn't prevent the #MC exception from being broadcasted to that
CPU.
Ensure that offline CPUs don't participate in the MCE rendezvous
and clear the RIP valid status bit so that a second MCE won't
cause a shutdown.
Without the patch, mce_start() will increment mce_callin and
wait for all CPUs. Offlined CPUs should avoid participating in
the rendezvous process altogether.
Signed-off-by: Ashok Raj <ashok.raj@intel.com>
[ Massage commit message. ] Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Tony Luck <tony.luck@intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1449742346-21470-2-git-send-email-bp@alien8.de Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Adding the rtc platform device in non-privileged Xen PV guests causes
an IRQ conflict because these guests do not have legacy PIC and may
allocate irqs in the legacy range.
But hvc_console cannot get its interrupt because it is already in use
by rtc0 and the console does not work.
genirq: Flags mismatch irq 8. 00000000 (hvc_console) vs. 00000000 (rtc0)
We can avoid this problem by realizing that unprivileged PV guests (both
Xen and lguests) are not supposed to have rtc_cmos device and so
adding it is not necessary.
Privileged guests (i.e. Xen's dom0) do use it but they should not have
irq conflicts since they allocate irqs above legacy range (above
gsi_top, in fact).
Instead of explicitly testing whether the guest is privileged we can
extend pv_info structure to include information about guest's RTC
support.
Reported-and-tested-by: Sander Eikelenboom <linux@eikelenboom.it> Signed-off-by: David Vrabel <david.vrabel@citrix.com> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: vkuznets@redhat.com Cc: xen-devel@lists.xenproject.org Cc: konrad.wilk@oracle.com Link: http://lkml.kernel.org/r/1449842873-2613-1-git-send-email-boris.ostrovsky@oracle.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When restarting a syscall with regs->ax == -ERESTART_RESTARTBLOCK,
regs->ax is assigned to a restart_syscall number. For x32 tasks, this
syscall number must have __X32_SYSCALL_BIT set, otherwise it will be
an x86_64 syscall number instead of a valid x32 syscall number. This
issue has been there since the introduction of x32.
MPX decodes instructions in order to tell which bounds register
was violated. Part of this decoding involves looking at the "REX
prefix" which is a special instrucion prefix used to retrofit
support for new registers in to old instructions.
The X86_REX_*() macros are defined to return actual bit values:
#define X86_REX_R(rex) ((rex) & 4)
*not* boolean values. However, the MPX code was checking for
them like they were booleans. This might have led to us
mis-decoding the "REX prefix" and giving false information out to
userspace about bounds violations. X86_REX_B() actually is bit 1,
so this is really only broken for the X86_REX_X() case.
Fix the conditionals up to tolerate the non-boolean values.
Fixes: fcc7ffd67991 "x86, mpx: Decode MPX instruction to get bound violation information" Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Cc: x86@kernel.org Cc: Dave Hansen <dave@sr71.net> Link: http://lkml.kernel.org/r/20151201003113.D800C1E0@viggo.jf.intel.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
with a usage count of 100 * the number of times the program has been run,
then the kernel is malfunctioning. If leaked-keyring has zero usages or
has been garbage collected, then the problem is fixed.
Reported-by: Yevgeny Pats <yevgeny@perception-point.io> Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Don Zickus <dzickus@redhat.com> Acked-by: Prarit Bhargava <prarit@redhat.com> Acked-by: Jarod Wilson <jarod@redhat.com> Signed-off-by: James Morris <james.l.morris@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
There's a race between keyctl_read() and keyctl_revoke(). If the revoke
happens between keyctl_read() checking the validity of a key and the key's
semaphore being taken, then the key type read method will see a revoked key.
This causes a problem for the user-defined key type because it assumes in
its read method that there will always be a payload in a non-revoked key
and doesn't check for a NULL pointer.
Fix this by making keyctl_read() check the validity of a key after taking
semaphore instead of before.
I think the bug was introduced with the original keyrings code.
This was discovered by a multithreaded test program generated by syzkaller
(http://github.com/google/syzkaller). Here's a cleaned up version:
The commit ba7c95ea3870fe7b847466d39a049ab6f156aa2c ("rhashtable:
Fix sleeping inside RCU critical section in walk_stop") introduced
a new spinlock for the walker list. However, it did not convert
all existing users of the list over to the new spin lock. Some
continued to use the old mutext for this purpose. This obviously
led to corruption of the list.
The fix is to use the spin lock everywhere where we touch the list.
This also allows us to do rcu_rad_lock before we take the lock in
rhashtable_walk_start. With the old mutex this would've deadlocked
but it's safe with the new spin lock.
Fixes: ba7c95ea3870 ("rhashtable: Fix sleeping inside RCU...") Reported-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
With b3ca9b02b00704053a38bfe4c31dbbb9c13595d0, the AF_UNIX SOCK_STREAM
receive code was changed from using mutex_lock(&u->readlock) to
mutex_lock_interruptible(&u->readlock) to prevent signals from being
delayed for an indefinite time if a thread sleeping on the mutex
happened to be selected for handling the signal. But this was never a
problem with the stream receive code (as opposed to its datagram
counterpart) as that never went to sleep waiting for new messages with the
mutex held and thus, wouldn't cause secondary readers to block on the
mutex waiting for the sleeping primary reader. As the interruptible
locking makes the code more complicated in exchange for no benefit,
change it back to using mutex_lock.
Signed-off-by: Rainer Weikusat <rweikusat@mobileactivedefense.com> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
fou->udp_offloads is managed by RCU. As it is actually included inside
the fou sockets, we cannot let the memory go out of scope before a grace
period. We either can synchronize_rcu or switch over to kfree_rcu to
manage the sockets. kfree_rcu seems appropriate as it is used by vxlan
and geneve.
Fixes: 23461551c00628c ("fou: Support for foo-over-udp RX path") Cc: Tom Herbert <tom@herbertland.com> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
After commit 15bf176db1fb ("gianfar: Don't enable the Filer w/o the
Parser"), 'TSEC' model controllers (for example as seen on MPC8541E)
always have 8 bytes stripped from the front of received frames.
Only 'eTSEC' gianfar controllers have the RX Filer capability (amongst
other enhancements). Previously this was treated as always enabled
for both 'TSEC' and 'eTSEC' controllers.
In commit 15bf176db1fb ("gianfar: Don't enable the Filer w/o the Parser")
a subtle change was made to the setting of 'uses_rxfcb' to effectively
always set it (since 'rx_filer_enable' was always true). This had the
side-effect of always stripping 8 bytes from the front of received frames
on 'TSEC' type controllers.
We now only enable the RX Filer capability on controller types that
support it, thereby avoiding the issue for 'TSEC' type controllers.
Reviewed-by: Chris Packham <chris.packham@alliedtelesis.co.nz> Reviewed-by: Mark Tomlinson <mark.tomlinson@alliedtelesis.co.nz> Signed-off-by: Hamish Martin <hamish.martin@alliedtelesis.co.nz> Reviewed-by: Claudiu Manoil <claudiu.manoil@freescale.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
William Hua <william.hua@canonical.com> wrote:
>
> I wasn't aware there was an enforced minimum size. I simply set the
> nelem_hint in the rhastable_params struct to 1, expecting it to grow as
> needed. This caused a segfault afterwards when trying to insert an
> element.
OK we're doing the size computation before we enforce the limit
on min_size.
---8<---
We need to do the initial hash table size computation after we
have obtained the correct min_size/max_size parameters. Otherwise
we may end up with a hash table whose size is outside the allowed
envelope.
Fixes: a998f712f77e ("rhashtable: Round up/down min/max_size to...") Reported-by: William Hua <william.hua@canonical.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
David Ahern added a vif field in the a4 part of inetpeer_addr struct.
This broke IPv4 TCP fast open client side and more generally tcp metrics
cache, because inetpeer_addr_cmp() is now comparing two u32 instead of
one.
inetpeer_set_addr_v4() needs to properly init vif field, otherwise
the comparison result depends on uninitialized data.
Fixes: 192132b9a034 ("net: Add support for VRFs to inetpeer cache") Reported-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Bjørn reported that while we switch all interfaces to privacy stable mode
when setting the secret, we don't set this mode for new interfaces. This
does not make sense, so change this behaviour.
Fixes: 622c81d57b392cc ("ipv6: generation of stable privacy addresses for link-local and autoconf") Reported-by: Bjørn Mork <bjorn@mork.no> Cc: Bjørn Mork <bjorn@mork.no> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
msg_iocb needs to be initialized on the recv/recvfrom path.
Otherwise afalg will wrongly interpret it as an async call.
Cc: stable@vger.kernel.org Reported-by: Harald Freudenberger <freude@linux.vnet.ibm.com> Signed-off-by: Tadeusz Struk <tadeusz.struk@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Stas Nichiporovich reported a regression in his HFSC qdisc setup
on a non multi queue device.
It turns out I mistakenly added a TCQ_F_NOPARENT flag on all qdisc
allocated in qdisc_create() for non multi queue devices, which was
rather buggy. I was clearly mislead by the TCQ_F_ONETXQUEUE that is
also set here for no good reason, since it only matters for the root
qdisc.
Fixes: 4eaf3b84f288 ("net_sched: fix qdisc_tree_decrease_qlen() races") Reported-by: Stas Nichiporovich <stasn77@gmail.com> Tested-by: Stas Nichiporovich <stasn77@gmail.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This is because we mistake a raw socket as a tcp socket.
We should check both sk->sk_type and sk->sk_protocol to ensure
it is a tcp socket.
Willem points out __skb_complete_tx_timestamp() needs to fix as well.
Reported-by: Dmitry Vyukov <dvyukov@google.com> Cc: Willem de Bruijn <willemdebruijn.kernel@gmail.com> Cc: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Acked-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
skb_reorder_vlan_header is called after the vlan header has
been pulled. As a result the offset of the begining of
the mac header has been incrased by 4 bytes (VLAN_HLEN).
When moving the mac addresses, include this incrase in
the offset calcualation so that the mac addresses are
copied correctly.
Fixes: a6e18ff1117 (vlan: Fix untag operations of stacked vlans with REORDER_HEADER off) CC: Nicolas Dichtel <nicolas.dichtel@6wind.com> CC: Patrick McHardy <kaber@trash.net> Signed-off-by: Vladislav Yasevich <vyasevich@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When we have multiple stacked vlan devices all of which have
turned off REORDER_HEADER flag, the untag operation does not
locate the ethernet addresses correctly for nested vlans.
The reason is that in case of REORDER_HEADER flag being off,
the outer vlan headers are put back and the mac_len is adjusted
to account for the presense of the header. Then, the subsequent
untag operation, for the next level vlan, always use VLAN_ETH_HLEN
to locate the begining of the ethernet header and that ends up
being a multiple of 4 bytes short of the actuall beginning
of the mac header (the multiple depending on the how many vlan
encapsulations ethere are).
As a reslult, if there are multiple levles of vlan devices
with REODER_HEADER being off, the recevied packets end up
being dropped.
To solve this, we use skb->mac_len as the offset. The value
is always set on receive path and starts out as a ETH_HLEN.
The value is also updated when the vlan header manupations occur
so we know it will be correct.
Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reported-by: Dmitry Vyukov <dvyukov@gmail.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
David Wilder reported crashes caused by dst reuse.
<quote David>
I am seeing a crash on a distro V4.2.3 kernel caused by a double
release of a dst_entry. In ipv4_dst_destroy() the call to
list_empty() finds a poisoned next pointer, indicating the dst_entry
has already been removed from the list and freed. The crash occurs
18 to 24 hours into a run of a network stress exerciser.
</quote>
Thanks to his detailed report and analysis, we were able to understand
the core issue.
IP early demux can associate a dst to skb, after a lookup in TCP/UDP
sockets.
When socket cache is not properly set, we want to store into
sk->sk_dst_cache the dst for future IP early demux lookups,
by acquiring a stable refcount on the dst.
Problem is this acquisition is simply using an atomic_inc(),
which works well, unless the dst was queued for destruction from
dst_release() noticing dst refcount went to zero, if DST_NOCACHE
was set on dst.
We need to make sure current refcount is not zero before incrementing
it, or risk double free as David reported.
This patch, being a stable candidate, adds two new helpers, and use
them only from IP early demux problematic paths.
It might be possible to merge in net-next skb_dst_force() and
skb_dst_force_safe(), but I prefer having the smallest patch for stable
kernels : Maybe some skb_dst_force() callers do not expect skb->dst
can suddenly be cleared.
Can probably be backported back to linux-3.6 kernels
Reported-by: David J. Wilder <dwilder@us.ibm.com> Tested-by: David J. Wilder <dwilder@us.ibm.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This is because netdev_alloc_skb() fails and 'mdp->rx_skbuff[entry]' is left
NULL but sh_eth_rx() later uses it without checking. Add such check...
Reported-by: Yasushi SHOJI <yashi@atmark-techno.com> Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
AF_INET, AF_INET6 sockets actually only support 8-bit protocol
identifiers. inet_sock's skc_protocol field thus is sized accordingly,
thus larger protocol identifiers simply cut off the higher bits and
store a zero in the protocol fields.
This could lead to e.g. NULL function pointer because as a result of
the cut off inet_num is zero and we call down to inet_autobind, which
is NULL for raw sockets.
I found no particular commit which introduced this problem.
CVE: CVE-2015-8543 Cc: Cong Wang <cwang@twopensource.com> Reported-by: 郭永刚 <guoyonggang@360.cn> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The file ila.h used for lightweight tunnels is being used by iproute2
but is not exported yet.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
If userspace executes ct(zone=1), and the connection tracker determines
that the packet is invalid, then the ct_zone flow key field is populated
with the default zone rather than the zone that was specified. Even
though connection tracking failed, this field should be updated with the
value that the action specified. Fix the issue.
Fixes: 7f8a436eaa2c ("openvswitch: Add conntrack action") Signed-off-by: Joe Stringer <joe@ovn.org> Acked-by: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
If the actions (re)allocation fails, or the actions list is larger than the
maximum size, and the conntrack action is the last action when these
problems are hit, then references to helper modules may be leaked. Fix
the issue.
Fixes: cae3a2627520 ("openvswitch: Allow attaching helpers to ct action") Signed-off-by: Joe Stringer <joe@ovn.org> Acked-by: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8b63ec1837fa ("phylib: Make PHYs children of their MDIO bus,
not the bus' parent.") changed the parenting of PHY devices, making
them a child of the MDIO bus, instead of the MAC device. This broken
the Micrel PHY driver which has a deprecated feature of allowing PHY
properties to be placed into the MAC node.
In order to find the MAC node, we need to walk up the tree of devices
until we find one with an OF node attached.
Reported-by: Dinh Nguyen <dinguyen@opensource.altera.com> Suggested-by: David Daney <david.daney@cavium.com> Acked-by: David Daney <david.daney@cavium.com> Fixes: 8b63ec1837fa ("phylib: Make PHYs children of their MDIO bus, not the bus' parent.") Signed-off-by: Andrew Lunn <andrew@lunn.ch> Tested-by: Dinh Nguyen <dinguyen@opensource.altera.com> Acked-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When an interface is brought up which was previously suspended (via
runtime PM), it would hang. This happens because napi_disable is called
before napi_enable.
Solve this by avoiding napi_enable in the resume during open function
(netif_running is true when open is called, IFF_UP is set after a
successful open; netif_running is false when close is called, but IFF_UP
is then still set).
While at it, remove WORK_ENABLE check from rtl8152_open (introduced with
the original change) because it cannot happen:
- After this patch, runtime resume will not set it during rtl8152_open.
- When link is up, rtl8152_open is not called.
- When link is down during system/auto suspend/resume, it is not set.
Fixes: 41cec84cf285 ("r8152: don't enable napi before rx ready") Link: https://lkml.kernel.org/r/20151205105912.GA1766@al Signed-off-by: Peter Wu <peter@lekensteyn.nl> Acked-by: Hayes Wang <hayeswang@realtek.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In case of a tx queue timeout every transmit is blocked until the
QCA7000 resets himself and triggers a sync which makes the driver
flushs the tx ring. So avoid this blocking situation by triggering
the sync immediately after the timeout. Waking the queue doesn't
make sense in this situation.
Signed-off-by: Stefan Wahren <stefan.wahren@i2se.com> Fixes: 291ab06ecf67 ("net: qualcomm: new Ethernet over SPI driver for QCA7000") Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The Huawei E3372 (12d1:157d) needs this quirk in MBIM mode
as well. Allow this by forcing the NTB to contain only a
single NDP, and add a device specific entry for this ID.
Due to the way Huawei use device IDs, this might be applied
to other modems as well. It is assumed that those modems
will be based on the same firmware and will need this quirk
too. If not, it will still not harm normal usage, although
multiplexing performance could be impacted.
Cc: Enrico Mioso <mrkiko.rs@gmail.com> Reported-by: Sami Farin <hvtaifwkbgefbaei@gmail.com> Signed-off-by: Bjørn Mork <bjorn@mork.no> Acked-By: Enrico Mioso <mrkiko.rs@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Dmitry Vyukov reported that SCTP was triggering a WARN on socket destroy
related to disabling sock timestamp.
When SCTP accepts an association or peel one off, it copies sock flags
but forgot to call net_enable_timestamp() if a packet timestamping flag
was copied, leading to extra calls to net_disable_timestamp() whenever
such clones were closed.
The fix is to call net_enable_timestamp() whenever we copy a sock with
that flag on, like tcp does.
Reported-by: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Acked-by: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
SCTP echoes a cookie o INIT ACK chunks that contains a timestamp, for
detecting stale cookies. This cookie is echoed back to the server by the
client and then that timestamp is checked.
Thing is, if the listening socket is using packet timestamping, the
cookie is encoded with ktime_get() value and checked against
ktime_get_real(), as done by __net_timestamp().
The fix is to sctp also use ktime_get_real(), so we can compare bananas
with bananas later no matter if packet timestamping was enabled or not.
Fixes: 52db882f3fc2 ("net: sctp: migrate cookie life from timeval to ktime") Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Acked-by: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Commit 3511494ce2f3d ("vxlan: Group Policy extension") changed definition of
VXLAN_HF_RCO from 0x00200000 to BIT(24). This is obviously incorrect. It's
also in violation with the RFC draft.
Fixes: 3511494ce2f3d ("vxlan: Group Policy extension") Cc: Thomas Graf <tgraf@suug.ch> Cc: Tom Herbert <therbert@google.com> Signed-off-by: Jiri Benc <jbenc@redhat.com> Acked-by: Tom Herbert <tom@herbertland.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Commit 64236f3f3d74 ("ipv6: introduce IFA_F_STABLE_PRIVACY flag")
failed to update the setting of the IFA_F_OPTIMISTIC flag, causing
the IFA_F_STABLE_PRIVACY flag to be lost if IFA_F_OPTIMISTIC is set.
Cc: Erik Kline <ek@google.com> Cc: Fernando Gont <fgont@si6networks.com> Cc: Lorenzo Colitti <lorenzo@google.com> Cc: YOSHIFUJI Hideaki/吉藤英明 <hideaki.yoshifuji@miraclelinux.com> Cc: Hannes Frederic Sowa <hannes@stressinduktion.org> Fixes: 64236f3f3d74 ("ipv6: introduce IFA_F_STABLE_PRIVACY flag") Signed-off-by: Bjørn Mork <bjorn@mork.no> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
atl1c driver is doing order-4 allocation with GFP_ATOMIC
priority. That often breaks networking after resume. Switch to
GFP_KERNEL. Still not ideal, but should be significantly better.
atl1c_setup_ring_resources() is called from .open() function, and
already uses GFP_KERNEL, so this change is safe.
Signed-off-by: Pavel Machek <pavel@ucw.cz> Acked-by: Michal Hocko <mhocko@suse.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Parameters were updated only if the kernel was unable to find the tunnel
with the new parameters, ie only if core pamareters were updated (keys,
addr, link, type).
Now it's possible to update ttl, hoplimit, flowinfo and flags.
Fixes: c12b395a4664 ("gre: Support GRE over IPv6") Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reported-by: Andrew <nitr0@seti.kr.ua> Fixes: 287f3a943fef ("pppoe: Use workqueue to die properly when a PADT is received") Signed-off-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
usb_parse_ss_endpoint_companion() now decodes the burst multiplier
correctly in order to check that it's <= 3, but still uses the wrong
expression if warning that it's > 3.
Fixes: ff30cbc8da42 ("usb: Use the USB_SS_MULT() macro to get the ...") Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When a USB 3.0 mass storage device is disconnected in transporting
state, storage device driver may handle it as a transport error and
reset the device by invoking usb_reset_and_verify_device()
and following could happen:
in usb_reset_and_verify_device():
udev->bos = NULL;
For U1/U2 enabled devices, driver will disable LPM, and in some
conditions:
from usb_unlocked_disable_lpm()
--> usb_disable_lpm()
--> usb_enable_lpm()
udev->bos->ss_cap->bU1devExitLat;
And it causes 'NULL pointer' and 'kernel panic':
[ 157.976257] Unable to handle kernel NULL pointer dereference
at virtual address 00000010
...
[ 158.026400] PC is at usb_enable_link_state+0x34/0x2e0
[ 158.031442] LR is at usb_enable_lpm+0x98/0xac
...
[ 158.137368] [<ffffffc0006a1cac>] usb_enable_link_state+0x34/0x2e0
[ 158.143451] [<ffffffc0006a1fec>] usb_enable_lpm+0x94/0xac
[ 158.148840] [<ffffffc0006a20e8>] usb_disable_lpm+0xa8/0xb4
...
[ 158.214954] Kernel panic - not syncing: Fatal exception
This commit moves 'udev->bos = NULL' behind usb_unlocked_disable_lpm()
to prevent from NULL pointer access.
Issue can be reproduced by following setup:
1) A SS pen drive behind a SS hub connected to the host.
2) Transporting data between the pen drive and the host.
3) Abruptly disconnect hub and pen drive from host.
4) With a chance it crashes.
Signed-off-by: Hans Yang <hansy@nvidia.com> Acked-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The CPPI-4.1 driver selects TI_CPPI41, which is a dmaengine
driver and that may not be available when CONFIG_DMADEVICES
is not set:
warning: (USB_TI_CPPI41_DMA) selects TI_CPPI41 which has unmet direct dependencies (DMADEVICES && ARCH_OMAP)
This adds an extra dependency to avoid generating warnings in randconfig
builds. Ideally we'd remove the 'select' statement, but that has the
potential to break defconfig files.
Signed-off-by: Arnd Bergmann <arnd@arndb.de> Fixes: 411dd19c682d ("usb: musb: Kconfig: Select the DMA driver if DMA mode of MUSB is enabled") Signed-off-by: Felipe Balbi <balbi@ti.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The interrupt handler, ohci_hcd_at91_overcurrent_irq may be called right
after registration. At that time, pdev->dev.platform_data is not yet set,
leading to a NULL pointer dereference.
Fixes: e4df92279fd9 (USB: host: ohci-at91: merge loops in ohci_hcd_at91_drv_probe) Reported-by: Peter Rosin <peda@axentia.se> Tested-by: Peter Rosin <peda@axentia.se> Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com> Acked-by: Nicolas Ferre <nicolas.ferre@atmel.com> Acked-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Some USB device / host controller combinations seem to have problems
with Link Power Management. For example, Steinar found that his xHCI
controller wouldn't handle bandwidth calculations correctly for two
video cards simultaneously when LPM was enabled, even though the bus
had plenty of bandwidth available.
This patch introduces a new quirk flag for devices that should remain
disabled for LPM, and creates quirk entries for Steinar's devices.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Reported-by: Steinar H. Gunderson <sgunderson@bigfoot.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The patch extends the family of SATA-to-USB JMicron adapters that need
FUA to be disabled and applies the same policy for uas driver.
See details in http://unix.stackexchange.com/questions/237204/
The flash loader has been seen on a Telit UE910 modem. The flash loader
is a bit special, it presents both an ACM and CDC Data interface but
only the latter is useful. Unless a magic string is sent to the device
it will disappear and the regular modem device appears instead.
Signed-off-by: Jonas Jonsson <jonas@ludd.ltu.se> Tested-by: Daniele Palmas <dnlplm@gmail.com> Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Some modems, such as the Telit UE910, are using an Infineon Flash Loader
utility. It has two interfaces, 2/2/0 (Abstract Modem) and 10/0/0 (CDC
Data). The latter can be used as a serial interface to upgrade the
firmware of the modem. However, that isn't possible when the cdc-acm
driver takes control of the device.
The following is an explanation of the behaviour by Daniele Palmas during
discussion on linux-usb.
"This is what happens when the device is turned on (without modifying
the drivers):
[155492.352031] usb 1-3: new high-speed USB device number 27 using ehci-pci
[155492.485429] usb 1-3: config 1 interface 0 altsetting 0 endpoint 0x81 has an invalid bInterval 255, changing to 11
[155492.485436] usb 1-3: New USB device found, idVendor=058b, idProduct=0041
[155492.485439] usb 1-3: New USB device strings: Mfr=0, Product=0, SerialNumber=0
[155492.485952] cdc_acm 1-3:1.0: ttyACM0: USB ACM device
This is the flashing device that is caught by the cdc-acm driver. Once
the ttyACM appears, the application starts sending a magic string
(simple write on the file descriptor) to keep the device in flashing
mode. If this magic string is not properly received in a certain time
interval, the modem goes on in normal operative mode:
[155493.748094] usb 1-3: USB disconnect, device number 27
[155494.916025] usb 1-3: new high-speed USB device number 28 using ehci-pci
[155495.059978] usb 1-3: New USB device found, idVendor=1bc7, idProduct=0021
[155495.059983] usb 1-3: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[155495.059986] usb 1-3: Product: 6 CDC-ACM + 1 CDC-ECM
[155495.059989] usb 1-3: Manufacturer: Telit
[155495.059992] usb 1-3: SerialNumber: 359658044004697
[155495.138958] cdc_acm 1-3:1.0: ttyACM0: USB ACM device
[155495.140832] cdc_acm 1-3:1.2: ttyACM1: USB ACM device
[155495.142827] cdc_acm 1-3:1.4: ttyACM2: USB ACM device
[155495.144462] cdc_acm 1-3:1.6: ttyACM3: USB ACM device
[155495.145967] cdc_acm 1-3:1.8: ttyACM4: USB ACM device
[155495.147588] cdc_acm 1-3:1.10: ttyACM5: USB ACM device
[155495.154322] cdc_ether 1-3:1.12 wwan0: register 'cdc_ether' at usb-0000:00:1a.7-3, Mobile Broadband Network Device, 00:00:11:12:13:14
Using the cdc-acm driver, the string, though being sent in the same way
than using the usb-serial-simple driver (I can confirm that the data is
passing properly since I used an hw usb sniffer), does not make the
device to stay in flashing mode."
Signed-off-by: Jonas Jonsson <jonas@ludd.ltu.se> Tested-by: Daniele Palmas <dnlplm@gmail.com> Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Commit 53147b6cabee5e8d1997b5682fcc0c3b72ddf9c2 ("toshiba_acpi: Fix
hotkeys registration on some toshiba models") fixed an issue on some
laptops regarding hotkeys registration, however, if failed to address
the initialization of the hotkey_event_type variable, and thus, it can
lead to potential unwanted effects as the variable is being checked.
This patch initializes such variable to avoid such unwanted effects.
Both for FIFO and CRB interface TCG has decided to use the same HID
MSFT0101. They can be differentiated by looking at the start method from
TPM2 ACPI table. This patches makes necessary fixes to tpm_tis and
tpm_crb modules in order to correctly detect, which module should be
used.
For MSFT0101 we must use struct acpi_driver because struct pnp_driver
has a 7 character limitation.
It turned out that the root cause in b371616b8 was not correct for
https://bugzilla.kernel.org/show_bug.cgi?id=98181.
v2:
* One fixup was missing from v1: is_tpm2_fifo -> is_fifo
v3:
* Use pnp_driver for existing HIDs and acpi_driver only for MSFT0101 in
order ensure backwards compatibility.
v4:
* Check for FIFO before doing *anything* in crb_acpi_add().
* There was return immediately after acpi_bus_unregister_driver() in
cleanup_tis(). This caused pnp_unregister_driver() not to be called.
Reported-by: Michael Saunders <mick.saunders@gmail.com> Reported-by: Michael Marley <michael@michaelmarley.com> Reported-by: Jethro Beekman <kernel@jbeekman.nl> Reported-by: Matthew Garrett <mjg59@srcf.ucam.org> Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> Tested-by: Michael Marley <michael@michaelmarley.com> Tested-by: Mimi Zohar <zohar@linux.vnet.ibm.com> (on TPM 1.2) Reviewed-by: Peter Huewe <peterhuewe@gmx.de> Signed-off-by: Peter Huewe <peterhuewe@gmx.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
For an ACPI compatible system, the SCI (ACPI System Control
Interrupt) is used to wake the system up from suspend-to-idle.
Once the CPU is woken up by the SCI, the interrupt handler will
first check if the current IRQ has been configured for system
wakeup, so irq_pm_check_wakeup() is invoked to validate the IRQ
number. However, during suspend-to-idle, enable_irq_wake() is
called for acpi_gbl_FADT.sci_interrupt, although the IRQ number
that the SCI handler has been installed for should be passed to
it instead. Thus, if acpi_gbl_FADT.sci_interrupt happens to be
different from that number, ACPI interrupts will not be able to
wake up the system from sleep.
Fix this problem by passing the IRQ number returned by
acpi_gsi_to_irq() to enable_irq_wake() instead of
acpi_gbl_FADT.sci_interrupt.
Acked-by: Lv Zheng <lv.zheng@intel.com> Signed-off-by: Chen Yu <yu.c.chen@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When the system is waiting for GPE/fixed event handler to finish,
it uses acpi_gbl_FADT.sci_interrupt directly as the IRQ number.
However, the remapped IRQ returned by acpi_gsi_to_irq() should be
passed to synchronize_hardirq() instead of it.
Acked-by: Lv Zheng <lv.zheng@intel.com> Signed-off-by: Chen Yu <yu.c.chen@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Currently when the system is trying to uninstall the ACPI interrupt
handler, it uses acpi_gbl_FADT.sci_interrupt as the IRQ number.
However, the IRQ number that the ACPI interrupt handled is installed
for comes from acpi_gsi_to_irq() and that is the number that should
be used for the handler removal.
Fix this problem by using the mapped IRQ returned from acpi_gsi_to_irq()
as appropriate.
Acked-by: Lv Zheng <lv.zheng@intel.com> Signed-off-by: Chen Yu <yu.c.chen@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Ben Hutchings [Tue, 15 Dec 2015 21:21:57 +0000 (21:21 +0000)]
tipc: Fix kfree_skb() of uninitialised pointer
Commit 7098356baca7 ("tipc: fix error handling of expanding buffer
headroom") added a "goto tx_error". This is fine upstream, but
when backported to 4.3 it results in attempting to free the clone
before it has been allocated. In this early error case, no
cleanup is needed.
Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When vrf's ->newlink is called, if register_netdevice() fails then it
does free_netdev(), but that's also done by rtnl_newlink() so a second
free happens and memory gets corrupted, to reproduce execute the
following line a couple of times (1 - 5 usually is enough):
$ for i in `seq 1 5`; do ip link add vrf: type vrf table 1; done;
This works because we fail in register_netdevice() because of the wrong
name "vrf:".
Fixes: 193125dbd8eb ("net: Introduce VRF device driver") Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Acked-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
[bwh: For 4.3, retain the kfree() on failure] Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Acked-by: David S. Miller <davem@davemloft.net> Acked-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Acked-by: David S. Miller <davem@davemloft.net> Acked-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In the kernel 4.2 merge window we had a big changes to the implementation
of delayed references and qgroups which made the no_quota field of delayed
references not used anymore. More specifically the no_quota field is not
used anymore as of:
commit 0ed4792af0e8 ("btrfs: qgroup: Switch to new extent-oriented qgroup mechanism.")
Leaving the no_quota field actually prevents delayed references from
getting merged, which in turn cause the following BUG_ON(), at
fs/btrfs/extent-tree.c, to be hit when qgroups are enabled:
static int run_delayed_tree_ref(...)
{
(...)
BUG_ON(node->ref_mod != 1);
(...)
}
2) Ref2 bytenr X, action = BTRFS_DROP_DELAYED_REF, no_quota = 0, added.
It's not merged with Ref1 because Ref1->no_quota != Ref2->no_quota.
3) Ref3 bytenr X, action = BTRFS_ADD_DELAYED_REF, no_quota = 1, added.
It's not merged with the reference at the tail of the list of refs
for bytenr X because the reference at the tail, Ref2 is incompatible
due to Ref2->no_quota != Ref3->no_quota.
4) Ref4 bytenr X, action = BTRFS_DROP_DELAYED_REF, no_quota = 0, added.
It's not merged with the reference at the tail of the list of refs
for bytenr X because the reference at the tail, Ref3 is incompatible
due to Ref3->no_quota != Ref4->no_quota.
5) We run delayed references, trigger merging of delayed references,
through __btrfs_run_delayed_refs() -> btrfs_merge_delayed_refs().
6) Ref1 and Ref3 are merged as Ref1->no_quota = Ref3->no_quota and
all other conditions are satisfied too. So Ref1 gets a ref_mod
value of 2.
7) Ref2 and Ref4 are merged as Ref2->no_quota = Ref4->no_quota and
all other conditions are satisfied too. So Ref2 gets a ref_mod
value of 2.
8) Ref1 and Ref2 aren't merged, because they have different values
for their no_quota field.
9) Delayed reference Ref1 is picked for running (select_delayed_ref()
always prefers references with an action == BTRFS_ADD_DELAYED_REF).
So run_delayed_tree_ref() is called for Ref1 which triggers the
BUG_ON because Ref1->red_mod != 1 (equals 2).
So fix this by removing the no_quota field, as it's not used anymore as
of commit 0ed4792af0e8 ("btrfs: qgroup: Switch to new extent-oriented
qgroup mechanism.").
The use of no_quota was also buggy in at least two places:
1) At delayed-refs.c:btrfs_add_delayed_tree_ref() - we were setting
no_quota to 0 instead of 1 when the following condition was true:
is_fstree(ref_root) || !fs_info->quota_enabled
2) At extent-tree.c:__btrfs_inc_extent_ref() - we were attempting to
reset a node's no_quota when the condition "!is_fstree(root_objectid)
|| !root->fs_info->quota_enabled" was true but we did it only in
an unused local stack variable, that is, we never reset the no_quota
value in the node itself.
This fixes the remainder of problems several people have been having when
running delayed references, mostly while a balance is running in parallel,
on a 4.2+ kernel.
Very special thanks to Stéphane Lesimple for helping debugging this issue
and testing this fix on his multi terabyte filesystem (which took more
than one day to balance alone, plus fsck, etc).
Also, this fixes deadlock issue when using the clone ioctl with qgroups
enabled, as reported by Elias Probst in the mailing list. The deadlock
happens because after calling btrfs_insert_empty_item we have our path
holding a write lock on a leaf of the fs/subvol tree and then before
releasing the path we called check_ref() which did backref walking, when
qgroups are enabled, and tried to read lock the same leaf. The trace for
this case is the following:
The problem goes away by eleminating check_ref(), which no longer is
needed as its purpose was to get a value for the no_quota field of
a delayed reference (this patch removes the no_quota field as mentioned
earlier).
Reported-by: Stéphane Lesimple <stephane_btrfs@lesimple.fr> Tested-by: Stéphane Lesimple <stephane_btrfs@lesimple.fr> Reported-by: Elias Probst <mail@eliasprobst.eu> Reported-by: Peter Becker <floyd.net@gmail.com> Reported-by: Malte Schröder <malte@tnxip.de> Reported-by: Derek Dongray <derek@valedon.co.uk> Reported-by: Erkki Seppala <flux-btrfs@inside.org> Cc: stable@vger.kernel.org # 4.2+ Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
drivers/media/i2c/adv7604.c: In function 'adv76xx_get_format':
>> drivers/media/i2c/adv7604.c:1853:9: error: implicit declaration of function 'v4l2_subdev_get_try_format' [-Werror=implicit-function-declaration]
fmt = v4l2_subdev_get_try_format(sd, cfg, format->pad);
^
drivers/media/i2c/adv7604.c:1853:7: warning: assignment makes pointer from integer without a cast [-Wint-conversion]
fmt = v4l2_subdev_get_try_format(sd, cfg, format->pad);
^
drivers/media/i2c/adv7604.c: In function 'adv76xx_set_format':
drivers/media/i2c/adv7604.c:1882:7: warning: assignment makes pointer from integer without a cast [-Wint-conversion]
fmt = v4l2_subdev_get_try_format(sd, cfg, format->pad);
^
cc1: some warnings being treated as errors
There are several sound drivers that 'select ZONE_DMA'. This is
backwards as ZONE_DMA is an architecture capability exported to drivers.
Switch the polarity of the dependency to disable these drivers when the
architecture does not support ZONE_DMA. This was discovered in the
context of testing/enabling devm_memremap_pages() which depends on
ZONE_DEVICE. ZONE_DEVICE in turn depends on !ZONE_DMA.
Reported-by: Jeff Moyer <jmoyer@redhat.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
create_request_message() computes the maximum length of a message,
but uses the wrong type for the time stamp: sizeof(struct timespec)
may be 8 or 16 depending on the architecture, while sizeof(struct
ceph_timespec) is always 8, and that is what gets put into the
message.
Found while auditing the uses of timespec for y2038 problems.
Fixes: b8e69066d8af ("ceph: include time stamp in every MDS request") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Yan, Zheng <zyan@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Inside blk_bio_segment_split(), previous bvec pointer(bvprvp)
always points to the iterator local variable, which is obviously
wrong, so fix it by pointing to the local variable of 'bvprv'.
Fixes: 5014c311baa2b(block: fix bogus compiler warnings in blk-merge.c) Reported-by: Michael Ellerman <mpe@ellerman.id.au> Reported-by: Mark Salter <msalter@redhat.com> Tested-by: Laurent Dufour <ldufour@linux.vnet.ibm.com> Tested-by: Mark Salter <msalter@redhat.com> Signed-off-by: Ming Lei <ming.lei@canonical.com> Signed-off-by: Jens Axboe <axboe@fb.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
If we pass in an empty nfs_fattr struct to nfs_update_inode, it will
(correctly) not update any of the attributes, but it then clears the
NFS_INO_INVALID_ATTR flag, which indicates that the attributes are
up to date. Don't clear the flag if the fattr struct has no valid
attrs to apply.
Reviewed-by: Steve French <steve.french@primarydata.com> Signed-off-by: Jeff Layton <jeff.layton@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
pnfs_layout_process will check the returned layout stateid against what
the kernel has in-core. If it turns out that the stateid we received is
older, then we should resend the LAYOUTGET instead of falling back to
MDS I/O.
If clp->cl_cb_ident is zero, then nfs_cb_idr_remove_locked() skips removing
it when the nfs_client is freed. A decoding or server bug can then find
and try to put that first nfs_client which would lead to a crash.
Signed-off-by: Benjamin Coddington <bcodding@redhat.com> Fixes: d6870312659d ("nfs4client: convert to idr_alloc()") Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In debugfs' start_creating(), we pin the file system to safely access
its root. When we failed to create a file, we unpin the file system via
failed_creating() to release the mount count and eventually the reference
of the vfsmount.
However, when we run into an error during lookup_one_len() when still
in start_creating(), we only release the parent's mutex but not so the
reference on the mount. Looks like it was done in the past, but after
splitting portions of __create_file() into start_creating() and
end_creating() via 190afd81e4a5 ("debugfs: split the beginning and the
end of __create_file() off"), this seemed missed. Noticed during code
review.
Fixes: 190afd81e4a5 ("debugfs: split the beginning and the end of __create_file() off") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
We've observed the nfsd server in a state where there are
multiple delegations on the same nfs4_file for the same client.
The nfs client does attempt to DELEGRETURN these when they are presented to
it - but apparently under some (unknown) circumstances the client does not
manage to return all of them. This leads to the eventual
attempt to CB_RECALL more than one delegation with the same nfs
filehandle to the same client. The first recall will succeed, but the
next recall will fail with NFS4ERR_BADHANDLE. This leads to the server
having delegations on cl_revoked that the client has no way to FREE
or DELEGRETURN, with resulting inability to recover. The state manager
on the server will continually assert SEQ4_STATUS_RECALLABLE_STATE_REVOKED,
and the state manager on the client will be looping unable to satisfy
the server.
List discussion also reports a race between OPEN and DELEGRETURN that
will be avoided by only sending the delegation once to the
client. This is also logically in accordance with RFC5561 9.1.1 and 10.2.
So, let's:
1.) Not hand out duplicate delegations.
2.) Only send them to the client once.
RFC 5561:
9.1.1:
"Delegations and layouts, on the other hand, are not associated with a
specific owner but are associated with the client as a whole
(identified by a client ID)."
10.2:
"...the stateid for a delegation is associated with a client ID and may be
used on behalf of all the open-owners for the given client. A
delegation is made to the client as a whole and not to any specific
process or thread of control within it."
Reported-by: Eric Meddaugh <etmsys@rit.edu> Cc: Trond Myklebust <trond.myklebust@primarydata.com> Cc: Olga Kornievskaia <aglo@umich.edu> Signed-off-by: Andrew Elble <aweits@rit.edu> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Andrew was seeing a race occur when an OPEN and OPEN_DOWNGRADE were
running in parallel. The server would receive the OPEN_DOWNGRADE first
and check its seqid, but then an OPEN would race in and bump it. The
OPEN_DOWNGRADE would then complete and bump the seqid again. The result
was that the OPEN_DOWNGRADE would be applied after the OPEN, even though
it should have been rejected since the seqid changed.
The only recourse we have here I think is to serialize operations that
bump the seqid in a stateid, particularly when we're given a seqid in
the call. To address this, we add a new rw_semaphore to the
nfs4_ol_stateid struct. We do a down_write prior to checking the seqid
after looking up the stateid to ensure that nothing else is going to
bump it while we're operating on it.
In the case of OPEN, we do a down_read, as the call doesn't contain a
seqid. Those can run in parallel -- we just need to serialize them when
there is a concurrent OPEN_DOWNGRADE or CLOSE.
LOCK and LOCKU however always take the write lock as there is no
opportunity for parallelizing those.
Reported-and-Tested-by: Andrew W Elble <aweits@rit.edu> Signed-off-by: Jeff Layton <jeff.layton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reported by Clifford and Craig for JMicron OHCI-1394 + SDHCI combo
controllers: Often or even most of the time, the controller is
initialized with the message "added OHCI v1.10 device as card 0, 4 IR +
0 IT contexts, quirks 0x10". With 0 isochronous transmit DMA contexts
(IT contexts), applications like audio output are impossible.
However, OHCI-1394 demands that at least 4 IT contexts are implemented
by the link layer controller, and indeed JMicron JMB38x do implement
four of them. Only their IsoXmitIntMask register is unreliable at early
access.
With my own JMB381 single function controller I found:
- I can reproduce the problem with a lower probability than Craig's.
- If I put a loop around the section which clears and reads
IsoXmitIntMask, then either the first or the second attempt will
return the correct initial mask of 0x0000000f. I never encountered
a case of needing more than a second attempt.
- Consequently, if I put a dummy reg_read(...IsoXmitIntMaskSet)
before the first write, the subsequent read will return the correct
result.
- If I merely ignore a wrong read result and force the known real
result, later isochronous transmit DMA usage works just fine.
So let's just fix this chip bug up by the latter method. Tested with
JMB381 on kernel 3.13 and 4.3.
Since OHCI-1394 generally requires 4 IT contexts at a minium, this
workaround is simply applied whenever the initial read of IsoXmitIntMask
returns 0, regardless whether it's a JMicron chip or not. I never heard
of this issue together with any other chip though.
I am not 100% sure that this fix works on the OHCI-1394 part of JMB380
and JMB388 combo controllers exactly the same as on the JMB381 single-
function controller, but so far I haven't had a chance to let an owner
of a combo chip run a patched kernel.
Strangely enough, IsoRecvIntMask is always reported correctly, even
though it is probed right before IsoXmitIntMask.
Reported-by: Clifford Dunn Reported-by: Craig Moore <craig.moore@qenos.com> Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
If a EXT4 filesystem utilizes JBD2 journaling and an error occurs, the
journaling will be aborted first and the error number will be recorded
into JBD2 superblock and, finally, the system will enter into the
panic state in "errors=panic" option. But, in the rare case, this
sequence is little twisted like the below figure and it will happen
that the system enters into panic state, which means the system reset
in mobile environment, before completion of recording an error in the
journal superblock. In this case, e2fsck cannot recognize that the
filesystem failure occurred in the previous run and the corruption
wouldn't be fixed.
There is a use-after-free possibility in __ext4_journal_stop() in the
case that we free the handle in the first jbd2_journal_stop() because
we're referencing handle->h_err afterwards. This was introduced in 9705acd63b125dee8b15c705216d7186daea4625 and it is wrong. Fix it by
storing the handle->h_err value beforehand and avoid referencing
potentially freed handle.