The PLL impose a certain input range to work correctly, but it appears that
this input range does not apply on the input clock (or parent clock) but
on the input clock after it has passed the PLL divisor.
Fix the implementation accordingly.
Cc: <stable@vger.kernel.org> # v3.14+ Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com> Reported-by: Jonas Andersson <jonas@microbit.se> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
The hwrng output buffers (2) are cast inside of a a struct (caam_rng_ctx)
allocated in one DMA-tagged region. While the kernel's heap allocator
should place the overall struct on a cacheline aligned boundary, the 2
buffers contained within may not necessarily align. Consenquently, the ends
of unaligned buffers may not fully flush, and if so, stale data will be left
behind, resulting in small repeating patterns.
This fix aligns the buffers inside the struct.
Note that not all of the data inside caam_rng_ctx necessarily needs to be
DMA-tagged, only the buffers themselves require this. However, a fix would
incur the expense of error-handling bloat in the case of allocation failure.
Cc: stable@vger.kernel.org Signed-off-by: Steve Cornelius <steve.cornelius@freescale.com> Signed-off-by: Victoria Milhoan <vicki.milhoan@freescale.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Multiple function in asynchronous hashing use a saved-state block,
a.k.a. struct caam_hash_state, which holds a stash of information
between requests (init/update/final). Certain values in this state
block are loaded for processing using an inline-if, and when this
is done, the potential for uninitialized data can pose conflicts.
Therefore, this patch improves initialization of state data to
prevent false assignments using uninitialized data in the state block.
This patch addresses the following traceback, originating in
ahash_final_ctx(), although a problem like this could certainly
exhibit other symptoms:
Cc: stable@vger.kernel.org Signed-off-by: Steve Cornelius <steve.cornelius@freescale.com> Signed-off-by: Victoria Milhoan <vicki.milhoan@freescale.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
lib/rhashtable: fix race between rhashtable_lookup_compare and hashtable resize
Hash value passed as argument into rhashtable_lookup_compare could be
computed using different hash table than rhashtable_lookup_compare sees.
This patch passes key into rhashtable_lookup_compare() instead of hash and
compures hash value right in place using the same table as for lookup.
Also it adds comment for rhashtable_hashfn and rhashtable_obj_hashfn:
user must prevent concurrent insert/remove otherwise returned hash value
could be invalid.
Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Fixes: e341694e3eb5 ("netlink: Convert netlink_lookup() to use RCU protected hash table") Link: http://lkml.kernel.org/r/20150514042151.GA5482@gondor.apana.org.au Cc: Stable <stable@vger.kernel.org> (v3.17 .. v3.19) Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Haswell significantly improved the performance of sampler_c messages,
but the optimization appears to be off by default. Later platforms
remove this bit, and apparently always enable the optimization.
Improves performance in "Counter Strike: Global Offensive" by 18%
at default settings on Iris Pro.
This may break sampling of paletted formats (P8/A8P8/P8A8). It's
unclear whether it affects sampling of paletted formats in general,
or just the sample_c message (which is never used).
While libva does have support for using paletted formats (primarily
for OSDs), that support appears to have been broken for at least a
year, so I couldn't observe a regression from this:
I tried to get libva-intel to use paletted formats, and observe a
regression...but the only thing I found that used it was mplayer's OSD
(on screen display). Even without my patch, the colors were totally
wrong with that, and it's according to a few distro wikis, that's been
the case for over a year.
If libva's code for paletted formats /is/ broken, they could always
add code to disable this bit using the command validator when fixing
it.
Further investigation from Haihao shows that libva mplayer OSD seems
to work at least on his setup (still unclear what's wron with Ken's),
and that it's not affected by this patch. Quoting the discussion
between Haihao and Ken:
> > > If you use "-vo gl" or "-vo xv", the OSD is solid white text with a black
> > > border around it. I presume that it's supposed to be white with vaapi as
> > > well, but I guess I'm not entirely sure.
> > >
> > > It's possible that the optimization doesn't affect the palette as long as
> > > you never use sample_c with the paletted textures.
> >
> > I verified the palette takes effect in the following way:
> >
> > 1. Only support P8A8 format in the driver
> >
> > 2. ran the above command and I saw white OSD text
> >
> > 3. Only support P4A4 format in the driver and don't use
> > 3DSTATE_SAMPLER_PALETTE_LOAD0 to load the value to the texture palette,
> > so the palette keeps unchanged.
> >
> > 4. ran the above command and I saw black OSD text.
> >
> > 5. Load the right value to the texture palette and ran the above command
> > again, I saw white OSD text.
> >
> > Hence I think sample_c with the paletted textures is used in the driver.
>
> That sounds like the palette is actually working, then. Great :)
>
> I doubt that libva would use sample_c - sampling with a shadow comparison?
> It looks like it just uses sample and sample+killpix.
You are right, libva driver doesn't use sample_c message.
> I'm pretty sure the sample_c optimization just uses the palette memory as
> storage for some stuff, so it's quite possible it just works if you're
> only using sample and sample+killpix.
Thanks for the explanation, it makes sense to me.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
[danvet: Add wa name from Ville's review to the comment and copypaste
the explanation why we don't care about libva (already broken) from
Ken. Also add conclusion from libva devs that&why this is all fine.] Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: "Xiang, Haihao" <haihao.xiang@intel.com> Cc: libva@lists.freedesktop.org Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
The use of READ_ONCE() causes lots of warnings witht he pending paravirt
spinlock fixes, because those ends up having passing a member to a
'const' structure to READ_ONCE().
There should certainly be nothing wrong with using READ_ONCE() with a
const source, but the helper function __read_once_size() would cause
warnings because it would drop the 'const' qualifier, but also because
the destination would be marked 'const' too due to the use of 'typeof'.
Use a union of types in READ_ONCE() to avoid this issue.
Also make sure to use parenthesis around the macro arguments to avoid
possible operator precedence issues.
The host's decision to enable machine check exceptions should remain
in force during non-root mode. KVM was writing 0 to cr4 on VCPU reset
and passed a slightly-modified 0 to the vmcs.guest_cr4 value.
Tested: Built.
On earlier version, tested by injecting machine check
while a guest is spinning.
Before the change, if guest CR4.MCE==0, then the machine check is
escalated to Catastrophic Error (CATERR) and the machine dies.
If guest CR4.MCE==1, then the machine check causes VMEXIT and is
handled normally by host Linux. After the change, injecting a machine
check causes normal Linux machine check handling.
Signed-off-by: Ben Serebrin <serebrin@google.com> Reviewed-by: Venkatesh Srinivas <venkateshs@google.com> Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Context switches and TLB flushes can change individual bits of CR4.
CR4 reads take several cycles, so store a shadow copy of CR4 in a
per-cpu variable.
To avoid wasting a cache line, I added the CR4 shadow to
cpu_tlbstate, which is already touched in switch_mm. The heaviest
users of the cr4 shadow will be switch_mm and __switch_to_xtra, and
__switch_to_xtra is called shortly after switch_mm during context
switch, so the cacheline is likely to be hot.
Signed-off-by: Andy Lutomirski <luto@amacapital.net> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Kees Cook <keescook@chromium.org> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Vince Weaver <vince@deater.net> Cc: "hillf.zj" <hillf.zj@alibaba-inc.com> Cc: Valdis Kletnieks <Valdis.Kletnieks@vt.edu> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/3a54dd3353fffbf84804398e00dfdc5b7c1afd7d.1414190806.git.luto@amacapital.net Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
CR4 manipulation was split, seemingly at random, between direct
(write_cr4) and using a helper (set/clear_in_cr4). Unfortunately,
the set_in_cr4 and clear_in_cr4 helpers also poke at the boot code,
which only a small subset of users actually wanted.
This patch replaces all cr4 access in functions that don't leave cr4
exactly the way they found it with new helpers cr4_set_bits,
cr4_clear_bits, and cr4_set_bits_and_update_boot.
Signed-off-by: Andy Lutomirski <luto@amacapital.net> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Vince Weaver <vince@deater.net> Cc: "hillf.zj" <hillf.zj@alibaba-inc.com> Cc: Valdis Kletnieks <Valdis.Kletnieks@vt.edu> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/495a10bdc9e67016b8fd3945700d46cfd5c12c2f.1414190806.git.luto@amacapital.net Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Some controller drivers have no need of this callback (spi-altera even
causes a NULL pointer dereference because it doesn't register the callback,
falsely assuming that it is already optional).
Fixes: 30af9b558a56 ("spi/bitbang: Drop empty setup() functions") Signed-off-by: Pelle Nilsson <per.nilsson@xelmo.com> Reviewed-by: Ezequiel Garcia <ezequiel.garcia@vanguardiasur.com.ar> Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
This driver uses '#ifdef CONFIG_ARCH_SHMOBILE' and '#ifdef CONFIG_ARM'
interchangeably in its sh_dmae_probe function, which causes a build
warning when building for ARM without also enabling shmobile:
dma/sh/shdmac.c: In function sh_dmae_probe:
dma/sh/shdmac.c:696:6: warning: unused variable errirq [-Wunused-variable]
dma/sh/shdmac.c:695:16: warning: unused variable irqflags [-Wunused-variable]
dma/sh/shdmac.c: At top level:
dma/sh/shdmac.c:447:20: warning: sh_dmae_err defined but not used [-Wunused-function]
This changes all the #ifdef to test for CONFIG_ARCH_SHMOBILE to
avoid that warning. An earlier patch from Laurent had fixed the warning
for non-ARM case, but it still remained present in ARM randconfig builds.
Signed-off-by: Arnd Bergmann <arnd@arndb.de> Fixes: 52d6a5ee101bf ("DMA: shdma: Fix warnings due to declared but unused symbols") Acked-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Signed-off-by: Vinod Koul <vinod.koul@intel.com> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
tcp_get_info() can be called without holding socket lock,
so any socket fields can change under us.
Use READ_ONCE() to fetch sk_pacing_rate and sk_max_pacing_rate
Fixes: 977cb0ecf82e ("tcp: add pacing_rate information into tcp_info") Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Until now, the EFI stub was only setting the 32 bit cmd_line_ptr in
the setup_header structure, so on 64 bit platforms this could be truncated.
This patch adds setting the upper bits of the buffer address in
ext_cmd_line_ptr. This case was likely never hit, as the allocation
for this buffer is done at the lowest available address. Only
x86_64 kernels have this problem, as the 1-1 mapping mandated
by EFI ensures that all memory is 32 bit addressable on 32 bit
platforms. The EFI stub does not support mixed mode, so the
32 bit kernel on 64 bit firmware case does not need to be handled.
Signed-off-by: Roy Franz <roy.franz@linaro.org> Cc: <stable@vger.kernel.org> Signed-off-by: Matt Fleming <matt.fleming@intel.com> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Some buggy firmware implementations update VariableNameSize on success
such that it does not include the final NUL character which results in
garbage in the efivarfs name entries. Use kzalloc on the efivar_entry
(as is done in efivars.c) to ensure that the name is always
NUL-terminated.
The buggy firmware is:
BIOS Information
Vendor: Intel Corp.
Version: S1200RP.86B.02.02.0005.102320140911
Release Date: 10/23/2014
BIOS Revision: 4.6
System Information
Manufacturer: Intel Corporation
Product Name: S1200RP_SE
Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com> Acked-by: Matthew Garrett <mjg59@coreos.com> Cc: Jeremy Kerr <jk@ozlabs.org> Cc: <stable@vger.kernel.org> Signed-off-by: Matt Fleming <matt.fleming@intel.com> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Commit 2473238eac95 ("ihex: add support for CS:IP/EIP records") removes
the "default:" statement in the switch block, making the "return
usage();" line dead code and ihex2fw silently ignoring unknown options.
Restore this statement.
This bug was found by building with HOSTCC=clang and adding
-Wunreachable-code-return to HOSTCFLAGS.
Fixes: 2473238eac95 ("ihex: add support for CS:IP/EIP records") Signed-off-by: Nicolas Iooss <nicolas.iooss_linux@m4x.org> Cc: Mark Brown <broonie@opensource.wolfsonmicro.com> Cc: David Woodhouse <dwmw2@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
If CONFIG_ARCH_DMA_ADDR_T_64BIT enabled for x86 systems and physical
memory is more than 4GB, dma_map_page may return a valid memory
address which greater than 0xffffffff. As a result, the mlx5 device page
allocator RB tree will be initialized with valid addresses greater than
0xfffffff.
However, (addr & PAGE_MASK) set the high four bytes to zeros. So, it's
impossible for the function, free_4k, to release the pages whose
addresses greater than 4GB. Memory leaks. And mlx5_ib module can't
release the pages when user try to remove the module, as a result,
system hang.
[root@rdma05 root]# dmesg | grep addr | head
addr = 3fe384000
addr & PAGE_MASK = fe384000
[root@rdma05 root]# rmmod mlx5_ib <---- hang on
---------------------- cosnole log -----------------
mlx5_ib 0000:04:00.0: irq 138 for MSI/MSI-X
alloc irq_desc for 139 on node -1
alloc kstat_irqs on node -1
mlx5_ib 0000:04:00.0: irq 139 for MSI/MSI-X
0000:04:00.0:free_4k:221:(pid 1519): page not found
0000:04:00.0:free_4k:221:(pid 1519): page not found
0000:04:00.0:free_4k:221:(pid 1519): page not found
0000:04:00.0:free_4k:221:(pid 1519): page not found
---------------------- cosnole log -----------------
Fixes: bf0bf77f6519 ('mlx5: Support communicating arbitrary host page size to firmware') Signed-off-by: Honggang Li <honli@redhat.com> Signed-off-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Fixes: 3a2dfbe8acb1 ("xfrm: Notify changes in UDP encapsulation via netlink") CC: Martin Willi <martin@strongswan.org> Reported-by: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Fixes: 5c79de6e79cd ("[XFRM]: User interface for handling XFRM_MSG_MIGRATE") Reported-by: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Fixes: 97a64b4577ae ("[XFRM]: Introduce XFRM_MSG_REPORT.") Reported-by: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
lookup_mountpoint can return either NULL or an error value.
Update the test in __detach_mounts to test for an error value
to avoid pathological cases causing a NULL pointer dereferences.
The callers of __detach_mounts should prevent it from ever being
called on an unlinked dentry but don't take any chances.
Fixes: 28d8909bc790 ("[XFRM]: Export SAD info.") Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Fixes: ecfd6b183780 ("[XFRM]: Export SPD info") Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
In the media drivers, the v4l2 core knows about all submodules
and calls into them from a common function. However this cannot
work if the modules that get called are loadable and the
core is built-in. In that case we get
drivers/built-in.o: In function `set_type':
drivers/media/v4l2-core/tuner-core.c:301: undefined reference to `tea5767_attach'
drivers/media/v4l2-core/tuner-core.c:307: undefined reference to `tea5761_attach'
drivers/media/v4l2-core/tuner-core.c:349: undefined reference to `tda9887_attach'
drivers/media/v4l2-core/tuner-core.c:405: undefined reference to `xc4000_attach'
This was working previously, until the IS_ENABLED() macro was used
to replace the construct like
with the difference that the new code no longer checks whether it is being
built as a loadable module itself.
To fix this, this new patch adds an 'IS_REACHABLE' macro, which evaluates
true in exactly the condition that was used previously. The downside
of this is that this trades an obvious link error for a more subtle
runtime failure, but it is clear that the change that introduced the
link error was unintentional and it seems better to revert it for
now. Also, a similar change was originally created by Trent Piepho
and then reverted by teh change to the IS_ENABLED macro.
Ideally Kconfig would be used to avoid the case of a broken dependency,
or the code restructured in a way to turn around the dependency, but either
way would require much larger changes here.
Fixes: 7b34be71db53 ("[media] use IS_ENABLED() macro")
See-also: c5dec9fb248e ("V4L/DVB (4751): Fix DBV_FE_CUSTOMISE for card drivers compiled into kernel")
When the kernel deleted a vti6 interface, this interface was not removed from
the tunnels list. Thus, when the ip6_vti module was removed, this old interface
was found and the kernel tried to delete it again. This was leading to a kernel
panic.
Looking over the implementation for jhash2 and comparing it to jhash_3words
I realized that the two hashes were in fact very different. Doing a bit of
digging led me to "The new jhash implementation" in which lookup2 was
supposed to have been replaced with lookup3.
In reviewing the patch I noticed that jhash2 had originally initialized a
and b to JHASH_GOLDENRATIO and c to initval, but after the patch a, b, and
c were initialized to initval + (length << 2) + JHASH_INITVAL. However the
changes in jhash_3words simply replaced the initialization of a and b with
JHASH_INITVAL.
This change corrects what I believe was an oversight so that a, b, and c in
jhash_3words all have the same value added consisting of initval + (length
<< 2) + JHASH_INITVAL so that jhash2 and jhash_3words will now produce the
same hash result given the same inputs.
Fixes: 60d509c823cca ("The new jhash implementation") Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Release references to buffer-heads if ext4_journal_start() fails.
Fixes: 5b61de757535 ("ext4: start handle at least possible moment when renaming files") Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
While originally only being intended for outgoing traffic, commit a00e76349f35 ("netfilter: x_tables: allow to use cgroup match for
LOCAL_IN nf hooks") enabled xt_cgroups for the NF_INET_LOCAL_IN hook
as well, in order to allow for nfacct accounting.
Besides being currently limited to early demuxes only, commit a00e76349f35 forgot to add a check if we deal with full sockets,
i.e. in this case not with time wait sockets. TCP time wait sockets
do not have the same memory layout as full sockets, a lower memory
footprint and consequently also don't have a sk_classid member;
probing for sk_classid member there could potentially lead to a
crash.
Fixes: a00e76349f35 ("netfilter: x_tables: allow to use cgroup match for LOCAL_IN nf hooks") Cc: Alexey Perevalov <a.perevalov@samsung.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
We have many places where we want to check if a socket is
not a timewait or request socket. Use a helper to avoid
hard coding this.
Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
TCP_SYN_RECV state is currently used by fast open sockets.
Initial TCP requests (the pseudo sockets created when a SYN is received)
are not yet associated to a state. They are attached to their parent,
and the parent is in TCP_LISTEN state.
This commit adds TCP_NEW_SYN_RECV state, so that we can convert
TCP stack to a different schem gradually.
This state is not exported to user space.
Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
If a provider advertizes a zero max_fast_reg_page_list_len, FRWR
depth detection loops forever. Instead of just failing the mount,
try other memory registration modes.
Fixes: 0fc6c4e7bb28 ("xprtrdma: mind the device's max fast . . .") Reported-by: Devesh Sharma <Devesh.Sharma@Emulex.Com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Tested-by: Devesh Sharma <Devesh.Sharma@Emulex.Com> Tested-by: Meghana Cheripady <Meghana.Cheripady@Emulex.Com> Tested-by: Veeresh U. Kokatnur <veereshuk@chelsio.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Add a uniform interface by which device drivers can request device
properties from the platform firmware by providing a property name
and the corresponding data type. The purpose of it is to help to
write portable code that won't depend on any particular platform
firmware interface.
The first one allows the caller to check if the given property is
present. The next 5 of them allow single-valued properties of
various types to be retrieved in a uniform way. The remaining 5 are
for reading properties with multiple values (arrays of either numbers
or strings).
The interface covers both ACPI and Device Trees.
This change set includes material from Mika Westerberg and Aaron Lu.
Signed-off-by: Aaron Lu <aaron.lu@intel.com> Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Acked-by: Grant Likely <grant.likely@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Device Tree is used in many embedded systems to describe the system
configuration to the OS. It supports attaching properties or name-value
pairs to the devices it describe. With these properties one can pass
additional information to the drivers that would not be available
otherwise.
ACPI is another configuration mechanism (among other things) typically
seen, but not limited to, x86 machines. ACPI allows passing arbitrary
data from methods but there has not been mechanism equivalent to Device
Tree until the introduction of _DSD in the recent publication of the
ACPI 5.1 specification.
In order to facilitate ACPI usage in systems where Device Tree is
typically used, it would be beneficial to standardize a way to retrieve
Device Tree style properties from ACPI devices, which is what we do in
this patch.
If a given device described in ACPI namespace wants to export properties it
must implement _DSD method (Device Specific Data, introduced with ACPI 5.1)
that returns the properties in a package of packages. For example:
The UUID reserved for properties is daffd814-6eba-4d8c-8a91-bc9bbf4aa301
and is documented in the ACPI 5.1 companion document called "_DSD
Implementation Guide" [1], [2].
We add several helper functions that can be used to extract these
properties and convert them to different Linux data types.
The ultimate goal is that we only have one device property API that
retrieves the requested properties from Device Tree or from ACPI
transparent to the caller.
Reviewed-by: Hanjun Guo <hanjun.guo@linaro.org> Reviewed-by: Josh Triplett <josh@joshtriplett.org> Reviewed-by: Grant Likely <grant.likely@linaro.org> Signed-off-by: Darren Hart <dvhart@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Commit c875d2c1 ("iommu/vt-d: Exclude devices using RMRRs from IOMMU API
domains") prevents certain options for devices with RMRRs. This even
prevents those devices from getting a 1:1 mapping with 'iommu=pt',
because we don't have the code to handle *preserving* the RMRR regions
when moving the device between domains.
There's already an exclusion for USB devices, because we know the only
reason for RMRRs there is a misguided desire to keep legacy
keyboard/mouse emulation running in some theoretical OS which doesn't
have support for USB in its own right... but which *does* enable the
IOMMU.
Add an exclusion for graphics devices too, so that 'iommu=pt' works
there. We should be able to successfully assign graphics devices to
guests too, as long as the initial handling of stolen memory is
reconfigured appropriately. This has certainly worked in the past.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com> Cc: stable@vger.kernel.org Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Like other KVM switches, the Aten DVI KVM switch needs a quirk to avoid spewing
errors:
[791759.606542] usb 1-5.4: input irq status -75 received
[791759.614537] usb 1-5.4: input irq status -75 received
[791759.622542] usb 1-5.4: input irq status -75 received
The raphnet.net 4nes4snes and 2nes2snes multi-joystick adapters use a single
HID report descriptor with one report ID per controller. This has the effect
that the inputs of otherwise independent game controllers get packed in one
large joystick device.
With this patch each controller gets its own /dev/input/jsX device, which is
more natural and less confusing than having all inputs going to the same place.
Logitech devices use a vendor protocol to communicate various
information with the device. This protocol is called HID++,
and an exerpt can be found here:
https://drive.google.com/folderview?id=0BxbRzx7vEV7eWmgwazJ3NUFfQ28&usp=shar
The main difficulty which is related to this protocol is that
it is a synchronous protocol using the input reports.
So when we want to get some information from the device, we need
to wait for a matching input report.
This driver introduce this capabilities to be able to support
the multitouch mode of the Logitech Wireless Touchpad T651
(the bluetooth one). The multitouch data is available directly
from the mouse input reports, and we just need to query the device
on connect about its caracteristics.
HID++ and the touchpad features has a specific reporting mode
which uses pure HID++ reports, but Logitech told us not to use
it for this specific device. During QA, they detected that
some bluetooth input reports where lost, and so the only supported
mode is the pointer mode.
Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com> Tested-by: Andrew de los Reyes <adlr@chromium.org> Signed-off-by: Jiri Kosina <jkosina@suse.cz> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Microsoft Natural Wireless Ergonomic Keyboard 7000 has special My
Favorites 1..5 keys which are handled through a vendor-defined usage
page (0xff05).
Apply MS_ERGONOMY quirks handling to USB PID 0x071d (Microsoft Microsoft
2.4GHz Transceiver V1.0) so that the My Favorites 1..5 keys are reported
as KEY_F14..18 events.
Based on a patch from: Nikolai Kondrashov <Nikolai.Kondrashov@redhat.com>
Most of the tablets handled by hid-uclogic already use MULTI_INPUT.
For the ones which are not quirked in usbhid/hidquirks, they have a
custom report descriptor which contains only one report per HID
interface. For those tablets HID_QUIRK_MULTI_INPUT is transparent.
According to https://github.com/DIGImend/tablets, the only problematic
tablet currently handled by hid-uclogic is the TWHA60 v3. This tablet
presents different report descriptors from the ones currently quirked.
This is not a problem per se, given that this tablet is not supported
currently in this version (it needs the same command as a Huion to
start forwarding events).
Reviewed-by: Nikolai Kondrashov <spbnick@gmail.com> Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
This API has changed in commit 6e5e959dde0 (pinctrl: API changes to support
multiple states per device).
Fixes: 6e5e959dde0 ('pinctrl: API changes to support multiple states per device') Cc: Stephen Warren <swarren@nvidia.com> Signed-off-by: Baruch Siach <baruch@tkos.co.il> Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Commit 03e9f0cac5d (pinctrl: clean up after enable refactoring) updated the
documentation to remove mention of disable(), and rename enable() to set_mux().
One in-text mention was forgotten. Fix this.
Fixes: 03e9f0cac5d ('pinctrl: clean up after enable refactoring') Signed-off-by: Baruch Siach <baruch@tkos.co.il> Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Set the SYSCIER as per the values indicated in the documentation.
The value previously used appears to been copied from the r8a7779
implementation but on closer inspection is not correct for the r8a7791.
Set the SYSCIER as per the values indicated in the documentation.
The value previously used appears to been copied from the r8a7779
implementation but on closer inspection is not correct for the r8a7790.
LCD driver is always built for the XTFPGA platform, but its base address
is not configurable, and is wrong for ML605/KC705. Its initialization
locks up KC705 board hardware.
Make the whole driver optional, and its base address and bus width
configurable. Implement 4-bit bus access method.
Cc: stable@vger.kernel.org Signed-off-by: Max Filippov <jcmvbkbc@gmail.com> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
There appears to have been some inconsistency and confusion here as on
the r8a7790 these clocks are referred to as SD(HI)1 and SD(HI)2 while on
the r8a7791 and r8a7794 they are referred to as SD(HI)2 and SD(HI)3.
Compound pages cannot be migrated and it was not expected that such pages
be marked for NUMA balancing. This did not take into account that drivers
such as net/packet/af_packet.c may insert compound pages into userspace
with vm_insert_page. This patch tells the NUMA balancing protection
scanner to skip all VM_MIXEDMAP mappings which avoids the possibility that
compound pages are marked for migration.
Crush temporary buffers are allocated as per replica size configured
by the user. When there are more final osds (to be selected as per
rule) than the replicas, buffer overlaps and it causes crash. Now, it
ensures that at most num-rep osds are selected even if more number of
osds are allowed by the rule.
br_fdb_update() can be called in process context in the following way:
br_fdb_add() -> __br_fdb_add() -> br_fdb_update() (if NTF_USE flag is set)
so we need to disable softirqs because there are softirq users of the
hash_lock. One easy way to reproduce this is to modify the bridge utility
to set NTF_USE, enable stp and then set maxageing to a low value so
br_fdb_cleanup() is called frequently and then just add new entries in
a loop. This happens because br_fdb_cleanup() is called from timer/softirq
context. The spin locks in br_fdb_update were _bh before commit f8ae737deea1
("[BRIDGE]: forwarding remove unneeded preempt and bh diasables")
and at the time that commit was correct because br_fdb_update() couldn't be
called from process context, but that changed after commit: 292d1398983f ("bridge: add NTF_USE support")
Using local_bh_disable/enable around br_fdb_update() allows us to keep
using the spin_lock/unlock in br_fdb_update for the fast-path.
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Fixes: 292d1398983f ("bridge: add NTF_USE support") Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
There are several places in the driver (all in control paths) where
coherent dma memory is being allocated using either dma_alloc_coherent()
or the deprecated pci_alloc_consistent(). All these calls should be
changed to use dma_zalloc_coherent() to avoid uninitialized fields in
data structures backed by this memory.
Reported-by: Joerg Roedel <jroedel@suse.de> Tested-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Sriharsha Basavapatna <sriharsha.basavapatna@avagotech.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
421b3885bf6d56391297844f43fb7154a6396e12 "udp: ipv4: Add udp early
demux" introduced a regression that allowed sockets bound to INADDR_ANY
to receive packets from multicast groups that the socket had not joined.
For example a socket that had joined 224.168.2.9 could also receive
packets from 225.168.2.9 despite not having joined that group if
ip_early_demux is enabled.
Fix this by calling ip_check_mc_rcu() in udp_v4_early_demux() to verify
that the multicast packet is indeed ours.
Signed-off-by: Shawn Bohrer <sbohrer@rgmadvisors.com> Reported-by: Yurij M. Plotnikov <Yurij.Plotnikov@oktetlabs.ru> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
When we come to tear things down in netback_remove() and generate the
uevent it is possible that the xenstore directory has already been
removed (details below).
In such cases netback_uevent() won't be able to read the hotplug
script and will write a xenstore error node.
A recent change to the hypervisor exposed this race such that we now
sometimes lose it (where apparently we didn't ever before).
Instead read the hotplug script configuration during setup and use it
for the lifetime of the backend device.
The apparently more obvious fix of moving the transition to
state=Closed in netback_remove() to after the uevent does not work
because it is possible that we are already in state=Closed (in
reaction to the guest having disconnected as it shutdown). Being
already in Closed means the toolstack is at liberty to start tearing
down the xenstore directories. In principal it might be possible to
arrange to unregister the device sooner (e.g on transition to Closing)
such that xenstore would still be there but this state machine is
fragile and prone to anger...
A modern Xen system only relies on the hotplug uevent for driver
domains, when the backend is in the same domain as the toolstack it
will run the necessary setup/teardown directly in the correct sequence
wrt xenstore changes.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Wei Liu <wei.liu2@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Linux 3.17 and earlier are explicitly engineered so that if the app
doesn't specifically request a CC module on a listener before the SYN
arrives, then the child gets the system default CC when the connection
is established. See tcp_init_congestion_control() in 3.17 or earlier,
which says "if no choice made yet assign the current value set as
default". The change ("net: tcp: assign tcp cong_ops when tcp sk is
created") altered these semantics, so that children got their parent
listener's congestion control even if the system default had changed
after the listener was created.
This commit returns to those original semantics from 3.17 and earlier,
since they are the original semantics from 2007 in 4d4d3d1e8 ("[TCP]:
Congestion control initialization."), and some Linux congestion
control workflows depend on that.
In summary, if a listener socket specifically sets TCP_CONGESTION to
"x", or the route locks the CC module to "x", then the child gets
"x". Otherwise the child gets current system default from
net.ipv4.tcp_congestion_control. That's the behavior in 3.17 and
earlier, and this commit restores that.
Fixes: 55d8694fa82c ("net: tcp: assign tcp cong_ops when tcp sk is created") Cc: Florian Westphal <fw@strlen.de> Cc: Daniel Borkmann <dborkman@redhat.com> Cc: Glenn Judd <glenn.judd@morganstanley.com> Cc: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Yuchung Cheng <ycheng@google.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
We have two problems in UDP stack related to bogus checksums :
1) We return -EAGAIN to application even if receive queue is not empty.
This breaks applications using edge trigger epoll()
2) Under UDP flood, we can loop forever without yielding to other
processes, potentially hanging the host, especially on non SMP.
This patch is an attempt to make things better.
We might in the future add extra support for rt applications
wanting to better control time spent doing a recv() in a hostile
environment. For example we could validate checksums before queuing
packets in socket receive queue.
Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
sctp_v4_map_v6 was subtly writing and reading from members
of a union in a way the clobbered data it needed to read before
it read it.
Zeroing the v6 flowinfo overwrites the v4 sin_addr with 0, meaning
that every place that calls sctp_v4_map_v6 gets ::ffff:0.0.0.0 as the
result.
Reorder things to guarantee correct behaviour no matter what the
union layout is.
This impacts user space clients that open an IPv6 SCTP socket and
receive IPv4 connections. Prior to 299ee user space would see a
sockaddr with AF_INET and a correct address, after 299ee the sockaddr
is AF_INET6, but the address is wrong.
Fixes: 299ee123e198 (sctp: Fixup v4mapped behaviour to comply with Sock API) Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
For mq qdisc, we add per tx queue qdisc to root qdisc
for display purpose, however, that happens too early,
before the new dev->qdisc is finally set, this causes
q->list points to an old root qdisc which is going to be
freed right before assigning with a new one.
Fix this by moving ->attach() after setting dev->qdisc.
For long term, we probably need to clean up the qdisc_graft() code
in case it hides other bugs like this.
Fixes: 95dc19299f74 ("pkt_sched: give visibility to mq slave qdiscs") Cc: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Commit e9ce7cb6b107 ("xen-netback: Factor queue-specific data into queue
struct") introduced a regression when moving queue-specific data into
the queue struct by failing to set the credit_bytes field. This
prevented bandwidth limiting from working. Initialize the field as it
was done before multiqueue support was added.
Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com> Acked-by: Wei Liu <wei.liu2@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: Mark Salyzyn <salyzyn@android.com> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
----
v2: switch to sock_flag(sk, SOCK_DEAD) and added net/caif/caif_socket.c
v3: return -ECONNRESET in upstream caller of wait function for SOCK_DEAD Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
A pair of nested spin locks was introduced in commit 63502b8d0
"dp83640: Fix receive timestamp race condition".
Unfortunately the 'flags' parameter was reused for the inner lock,
clobbering the originally saved IRQ state. This patch fixes the issue
by changing the inner lock to plain spin_lock without irqsave.
Signed-off-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Callers of the ext_write function are supposed to hold a mutex that
protects the state of the dialed page, but one caller was missing the
lock from the very start, and over time the code has been changed
without following the rule. This patch cleans up the call sites in
violation of the rule.
Signed-off-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Currently, the calibration function that corrects the initial offsets
among multiple devices only works the first time. If the function is
called more than once, the calibration fails and bogus offsets will be
programmed into the devices.
In a well hidden spot, the device documentation tells that trigger indexes
0 and 1 are special in allowing the TRIG_IF_LATE flag to actually work.
This patch fixes the issue by using one of the special triggers during the
recalibration method.
Signed-off-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
When more than a multicast address is present in a MLDv2 report, all but
the first address is ignored, because the code breaks out of the loop if
there has not been an error adding that address.
This has caused failures when two guests connected through the bridge
tried to communicate using IPv6. Neighbor discoveries would not be
transmitted to the other guest when both used a link-local address and a
static address.
This only happens when there is a MLDv2 querier in the network.
The fix will only break out of the loop when there is a failure adding a
multicast address.
The mdb before the patch:
dev ovirtmgmt port vnet0 grp ff02::1:ff7d:6603 temp
dev ovirtmgmt port vnet1 grp ff02::1:ff7d:6604 temp
dev ovirtmgmt port bond0.86 grp ff02::2 temp
After the patch:
dev ovirtmgmt port vnet0 grp ff02::1:ff7d:6603 temp
dev ovirtmgmt port vnet1 grp ff02::1:ff7d:6604 temp
dev ovirtmgmt port bond0.86 grp ff02::fb temp
dev ovirtmgmt port bond0.86 grp ff02::2 temp
dev ovirtmgmt port bond0.86 grp ff02::d temp
dev ovirtmgmt port vnet0 grp ff02::1:ff00:76 temp
dev ovirtmgmt port bond0.86 grp ff02::16 temp
dev ovirtmgmt port vnet1 grp ff02::1:ff00:77 temp
dev ovirtmgmt port bond0.86 grp ff02::1:ff00:def temp
dev ovirtmgmt port bond0.86 grp ff02::1:ffa1:40bf temp
Fixes: 08b202b67264 ("bridge br_multicast: IPv6 MLD support.") Reported-by: Rik Theys <Rik.Theys@esat.kuleuven.be> Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@redhat.com> Tested-by: Rik Theys <Rik.Theys@esat.kuleuven.be> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
The tx_curr_frame_payload field is u32. When we try to calculate a
small negative delta based on it, we end up with a positive integer
close to 2^32 instead. So the tx_bytes pointer increases by about
2^32 for every transmitted frame.
Fix by calculating the delta as a signed long.
Cc: Ben Hutchings <ben.hutchings@codethink.co.uk> Reported-by: Florian Bruhin <me@the-compiler.org> Fixes: 7a1e890e2168 ("usbnet: Fix tx_bytes statistic running backward in cdc_ncm") Signed-off-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
ip_error does not check if in_dev is NULL before dereferencing it.
IThe following sequence of calls is possible:
CPU A CPU B
ip_rcv_finish
ip_route_input_noref()
ip_route_input_slow()
inetdev_destroy()
dst_input()
With the result that a network device can be destroyed while processing
an input packet.
A crash was triggered with only unicast packets in flight, and
forwarding enabled on the only network device. The error condition
was created by the removal of the network device.
As such it is likely the that error code was -EHOSTUNREACH, and the
action taken by ip_error (if in_dev had been accessible) would have
been to not increment any counters and to have tried and likely failed
to send an icmp error as the network device is going away.
Therefore handle this weird case by just dropping the packet if
!in_dev. It will result in dropping the packet sooner, and will not
result in an actual change of behavior.
Fixes: 251da4130115b ("ipv4: Cache ip_error() routes even when not forwarding.") Reported-by: Vittorio Gambaletta <linuxbugs@vittgam.net> Tested-by: Vittorio Gambaletta <linuxbugs@vittgam.net> Signed-off-by: Vittorio Gambaletta <linuxbugs@vittgam.net> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
while true; do
tc qdisc add dev foo root handle 1: prio
tc filter add dev foo parent 1: u32 match u32 0 0 flowid 1
tc qdisc del dev foo root
rmmod cls_u32
done
... will panic the kernel. Moreover, he bisected the change
apparently introducing it to 78fd1d0ab072 ("netlink: Re-add
locking to netlink_lookup() and seq walker").
The removal of synchronize_net() from the netlink socket
triggering the qdisc to be removed, seems to have uncovered
an RCU resp. module reference count race from the tc API.
Given that RCU conversion was done after e341694e3eb5 ("netlink:
Convert netlink_lookup() to use RCU protected hash table")
which added the synchronize_net() originally, occasion of
hitting the bug was less likely (not impossible though):
When qdiscs that i) support attaching classifiers and,
ii) have at least one of them attached, get deleted, they
invoke tcf_destroy_chain(), and thus call into ->destroy()
handler from a classifier module.
After RCU conversion, all classifier that have an internal
prio list, unlink them and initiate freeing via call_rcu()
deferral.
Meanhile, tcf_destroy() releases already reference to the
tp->ops->owner module before the queued RCU callback handler
has been invoked.
Subsequent rmmod on the classifier module is then not prevented
since all module references are already dropped.
By the time, the kernel invokes the RCU callback handler from
the module, that function address is then invalid.
One way to fix it would be to add an rcu_barrier() to
unregister_tcf_proto_ops() to wait for all pending call_rcu()s
to complete.
synchronize_rcu() is not appropriate as under heavy RCU
callback load, registered call_rcu()s could be deferred
longer than a grace period. In case we don't have any pending
call_rcu()s, the barrier is allowed to return immediately.
Since we came here via unregister_tcf_proto_ops(), there
are no users of a given classifier anymore. Further nested
call_rcu()s pointing into the module space are not being
done anywhere.
Only cls_bpf_delete_prog() may schedule a work item, to
unlock pages eventually, but that is not in the range/context
of cls_bpf anymore.
Fixes: 25d8c0d55f24 ("net: rcu-ify tcf_proto") Fixes: 9888faefe132 ("net: sched: cls_basic use RCU") Reported-by: Vijay Subramanian <subramanian.vijay@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Cc: John Fastabend <john.r.fastabend@intel.com> Cc: Eric Dumazet <edumazet@google.com> Cc: Thomas Graf <tgraf@suug.ch> Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: Alexei Starovoitov <ast@plumgrid.com> Tested-by: Vijay Subramanian <subramanian.vijay@gmail.com> Acked-by: Alexei Starovoitov <ast@plumgrid.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Commit <5cf3d46192fc> ("udp: Simplify__udp*_lib_mcast_deliver")
simplified the filter for incoming IPv6 multicast but removed
the check of the local socket address and the UDP destination
address.
This patch restores the filter to prevent sockets bound to a IPv6
multicast IP to receive other UDP traffic link unicast.
Signed-off-by: Henning Rogge <hrogge@gmail.com> Fixes: 5cf3d46192fc ("udp: Simplify__udp*_lib_mcast_deliver") Cc: "David S. Miller" <davem@davemloft.net> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
commit 1d13a96c74fc ("ipv6: tcp: fix flowlabel value in ACK messages
send from TIME_WAIT") added the flow label in the last TCP packets.
Unfortunately, it was not casted properly.
This patch replace the buggy shift with be32_to_cpu/cpu_to_be32.
Fixes: 1d13a96c74fc ("ipv6: tcp: fix flowlabel value in ACK messages") Reported-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Florent Fourcot <florent.fourcot@enst-bretagne.fr> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Before the patch, the command 'ip link add bond2 type bond mode 802.3ad'
causes the kernel to send a rtnl message for the bond2 interface, with an
ifindex 0.
'ip monitor' shows:
0: bond2: <BROADCAST,MULTICAST,MASTER> mtu 1500 state DOWN group default
link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff
9: bond2@NONE: <BROADCAST,MULTICAST,MASTER> mtu 1500 qdisc noop state DOWN group default
link/ether ea:3e:1f:53:92:7b brd ff:ff:ff:ff:ff:ff
[snip]
The patch fixes the spotted bug by checking in bond driver if the interface
is registered before calling the notifier chain.
It also adds a check in rtmsg_ifinfo() to prevent this kind of bug in the
future.
Fixes: d4261e565000 ("bonding: create netlink event when bonding option is changed") CC: Jiri Pirko <jiri@resnulli.us> Reported-by: Julien Meunier <julien.meunier@6wind.com> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
RGMII interfaces come in multiple flavors: RGMII with transmit or
receive internal delay, no delays at all, or delays in both direction.
This change extends the initial check for PHY_INTERFACE_MODE_RGMII to
cover all of these variants since EEE should be allowed for any of these
modes, since it is a property of the RGMII, hence Gigabit PHY capability
more than the RGMII electrical interface and its delays.
Fixes: a59a4d192166 ("phy: add the EEE support and the way to access to the MMD registers") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
x86 has variable length encoding. x86 JIT compiler is trying
to pick the shortest encoding for given bpf instruction.
While doing so the jump targets are changing, so JIT is doing
multiple passes over the program. Typical program needs 3 passes.
Some very short programs converge with 2 passes. Large programs
may need 4 or 5. But specially crafted bpf programs may hit the
pass limit and if the program converges on the last iteration
the JIT compiler will be producing an image full of 'int 3' insns.
Fix this corner case by doing final iteration over bpf program.
Fixes: 0a14842f5a3c ("net: filter: Just In Time compiler for x86-64") Reported-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Alexei Starovoitov <ast@plumgrid.com> Tested-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
FROM_BE16:
'ror %reg, 8' doesn't clear upper bits of the register,
so use additional 'movzwl' insn to zero extend 16 bits into 64
FROM_LE16:
should zero extend lower 16 bits into 64 bit
FROM_LE32:
should zero extend lower 32 bits into 64 bit
Fixes: 89aa075832b0 ("net: sock: allow eBPF programs to be attached to sockets") Signed-off-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
The code in __netdev_upper_dev_link() has an over-stringent
loop detection logic that actually prevents valid configurations
from working correctly.
In particular, the logic returns an error if an upper device
is already in the list of all upper devices for a given dev.
This particular check seems to be a overzealous as it disallows
perfectly valid configurations. For example:
# ip l a link eth0 name eth0.10 type vlan id 10
# ip l a dev br0 typ bridge
# ip l s eth0.10 master br0
# ip l s eth0 master br0 <--- Will fail
If you switch the last two commands (add eth0 first), then both
will succeed. If after that, you remove eth0 and try to re-add
it, it will fail!
It appears to be enough to simply check adj_list to keeps things
safe.
I've tried stacking multiple devices multiple times in all different
combinations, and either rx_handler registration prevented the stacking
of the device linking cought the error.
Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com> Acked-by: Jiri Pirko <jiri@resnulli.us> Acked-by: Veaceslav Falico <vfalico@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Ard Biesheuvel [Thu, 28 May 2015 23:31:28 +0000 (16:31 -0700)]
ARM: 8221/1: PJ4: allow building in Thumb-2 mode
Two files that get included when building the multi_v7_defconfig target
fail to build when selecting THUMB2_KERNEL for this configuration.
In both cases, we can just build the file as ARM code, as none of its
symbols are exported to modules, so there are no interworking concerns.
In the iwmmxt.S case, add ENDPROC() declarations so the symbols are
annotated as functions, resulting in the linker to emit the appropriate
mode switches.
Acked-by: Nicolas Pitre <nico@linaro.org> Tested-by: Olof Johansson <olof@lixom.net> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
(cherry picked from commit 13d1b9575ac2c2da143cd2236b6cf0fc314570f8) Cc: <stable@vger.kernel.org> # v3.18+ Signed-off-by: Kevin Hilman <khilman@linaro.org> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
arch/x86/kvm/mmu.c: In function 'kvm_mmu_pte_write':
arch/x86/kvm/mmu.c:4256: error: unknown field 'cr0_wp' specified in initializer
arch/x86/kvm/mmu.c:4257: error: unknown field 'cr4_pae' specified in initializer
arch/x86/kvm/mmu.c:4257: warning: excess elements in union initializer
...
gcc-4.4.4 (at least) has issues when using anonymous unions in
initializers.