Jason Gunthorpe [Wed, 27 Jul 2022 19:09:25 +0000 (16:09 -0300)]
Merge branch 'erdma' into rdma.git for-next
Cheng Xu says
====================
This v14 patch set introduces the Elastic RDMA Adapter (ERDMA) driver,
which released in Apsara Conference 2021 by Alibaba. The PR of ERDMA
userspace provider has already been created [1].
ERDMA enables large-scale RDMA acceleration capability in Alibaba ECS
environment, initially offered in g7re instance. It can improve the
efficiency of large-scale distributed computing and communication
significantly and expand dynamically with the cluster scale of Alibaba
Cloud.
ERDMA is a RDMA networking adapter based on the Alibaba MOC hardware. It
works in the VPC network environment (overlay network), and uses iWarp
transport protocol. ERDMA supports reliable connection (RC). ERDMA also
supports both kernel space and user space verbs. Now we have already
supported HPC/AI applications with libfabric, NoF and some other internal
verbs libraries, such as xrdma, epsl, etc,.
For the ECS instance with RDMA enabled, our MOC hardware generates two
kinds of PCI devices: one for ERDMA, and one for the original net device
(virtio-net). They are separated PCI devices.
====================
* branch 'erdma':
RDMA/erdma: Add driver to kernel build environment
RDMA/erdma: Add the ABI definitions
RDMA/erdma: Add the erdma module
RDMA/erdma: Add connection management (CM) support
RDMA/erdma: Add verbs implementation
RDMA/erdma: Add verbs header file
RDMA/erdma: Add event queue implementation
RDMA/erdma: Add cmdq implementation
RDMA/erdma: Add main include file
RDMA/erdma: Add the hardware related definitions
RDMA: Add ERDMA to rdma_driver_id definition
Currently, the driver tries to validat the HEVC SPS
against the CAPTURE queue format (i.e. the decoded format).
This is not correct, because typically the SPS control is set
before the CAPTURE queue is negotiated.
Fixes: 135ad96cb4d6b ("media: hantro: Be more accurate on pixel formats step_width constraints") Signed-off-by: Ezequiel Garcia <ezequiel@vanguardiasur.com.ar> Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl> Signed-off-by: Mauro Carvalho Chehab <mchehab@kernel.org>
Add the main erdma module, which provides interface to infiniband
subsystem.
This commit includes a modification from Christophe, that using the bitmap
API to allocate bitmaps instead of hand-writing. And the commit also fixes
warnings reported by static checkers.
Link: https://lore.kernel.org/r/20220727014927.76564-10-chengyou@linux.alibaba.com Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Cheng Xu <chengyou@linux.alibaba.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
RDMA/erdma: Add connection management (CM) support
ERDMA's transport protocol is iWarp, so the driver must support CM
interface. In CM part, we use the same way as SoftiWarp: using kernel
socket to set up the connection, then performing MPA negotiation in
kernel. So, this part of code mainly comes from SoftiWarp, base on it,
we add some more features, such as non-blocking iw_connect implementation.
This commit also fixes a duplicated include issue reported by Abaci Robot.
The RDMA verbs implementation of erdma is divided into three files:
erdma_qp.c, erdma_cq.c, and erdma_verbs.c. Internal used functions and
datapath functions of QP/CQ are put in erdma_qp.c and erdma_cq.c, the rest
is in erdma_verbs.c.
This commit also fixes some static check warnings.
Link: https://lore.kernel.org/r/20220727014927.76564-8-chengyou@linux.alibaba.com Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Reported-by: Abaci Robot <abaci@linux.alibaba.com> Signed-off-by: Yang Li <yang.lee@linux.alibaba.com> Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Signed-off-by: Cheng Xu <chengyou@linux.alibaba.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Event queue (EQ) is the main notification way from erdma hardware to its
driver. Each erdma device contains 2 kinds EQs: asynchronous EQ (AEQ) and
completion EQ (CEQ). Per device has 1 AEQ, which used for RDMA async event
report, and max to 32 CEQs (numbered for CEQ0 to CEQ31). CEQ0 is used for
cmdq completion event report, and the rest CEQs are used for RDMA
completion event report.
Cmdq is the main control plane channel between erdma driver and hardware.
After erdma device is initialized, the cmdq channel will be active in the
whole lifecycle of this driver.
This commit also includes two modifications from Christophe, one is using
the bitmap API to allocate bitmaps instead of hand-writing, and another
is using the non-atomic bitmap API when applicable.
Add ERDMA driver main header file, defining internal used data structures
and operations. The defined data structures includes *cmdq*, which is used
as the communication channel between ERDMA driver and hardware.
ERDMA is a PCIe device, and this file provides ERDMA hardware related
definitions, mainly including PCIe device capabilities and restrictions,
device registers definitions, doorbell space, doorbell structure
definitions and WQE definitions.
media: cedrus: hevc: Add check for invalid timestamp
Not all DPB entries will be used most of the time. Unused entries will
thus have invalid timestamps. They will produce negative buffer index
which is not specifically handled. This works just by chance in current
code. It will even produce bogus pointer, but since it's not used, it
won't do any harm.
Let's fix that brittle design by skipping writing DPB entry altogether
if timestamp is invalid.
Both sun6i_mipi_csi2.c and sun8i_a83t_mipi_csi2.c have the same issue:
the comment before the ret = 0 assignment is incorrect, drop it and
always assign the result of the v4l2_subdev_call(..., 0) to ret.
In the disable label check for !on and set ret to 0 in that case.
media: uvcvideo: Fix invalid pointer in uvc_ctrl_init_ctrl()
The handling of per-device mappings introduced in commit 86f7ef773156
("media: uvcvideo: Add support for per-device control mapping
overrides") overwrote the mapping variable after it was initialized and
before it was used, leading to usage of an invalid pointer for devices
with per-device mappings. Fix it.
Fixes: 86f7ef773156 ("media: uvcvideo: Add support for per-device control mapping overrides") Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl> Signed-off-by: Mauro Carvalho Chehab <mchehab@kernel.org>
Fix a typo in the mc-core.rst media driver API documentation. Due to its
nature, the typo unfortunately caused a warning during documentation
build.
Fixes: 03b282861ca7 ("media: mc-entity: Add a new helper function to get a remote pad for a pad") Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl> Signed-off-by: Mauro Carvalho Chehab <mchehab@kernel.org>
30312730bd02 ("cgroup: Add "no" prefixed mount options") added "no" prefixed
mount options to allow turning them off and 6a010a49b63a ("cgroup: Make
!percpu threadgroup_rwsem operations optional") added one more "no" prefixed
mount option. However, Michal pointed out that the "no" prefixed options
aren't necessary in allowing mount options to be turned off:
While a bit confusing, given that there is a way to turn off the options,
there's no reason to have the explicit "no" prefixed options. Let's remove
them.
RDMA/mlx5: Store in the cache mkeys instead of mrs
Currently, the driver stores mlx5_ib_mr struct in the cache entries,
although the only use of the cached MR is the mkey. Store only the mkey in
the cache.
RDMA/mlx5: Store the number of in_use cache mkeys instead of total_mrs
total_mrs is used only to calculate the number of mkeys currently in
use. To simplify things, replace it with a new member called "in_use" and
directly store the number of mkeys currently in use.
The Xarray allows us to store the cached mkeys in memory efficient way.
Entries are reserved in the Xarray using xa_cmpxchg before calling to the
upcoming callbacks to avoid allocations in interrupt context. The
xa_cmpxchg can sleep when using GFP_KERNEL, so we call it in a loop to
ensure one reserved entry for each process trying to reserve.
In the next patch, ent->list will be replaced with an xarray. The xarray
uses an internal lock to protect the indexes. Use it to protect all the
entry fields, and get rid of ent->lock.
Marc Zyngier [Wed, 27 Jul 2022 17:33:27 +0000 (18:33 +0100)]
Merge branch kvm-arm64/nvhe-stacktrace into kvmarm-master/next
* kvm-arm64/nvhe-stacktrace: (27 commits)
: .
: Add an overflow stack to the nVHE EL2 code, allowing
: the implementation of an unwinder, courtesy of
: Kalesh Singh. From the cover letter (slightly edited):
:
: "nVHE has two modes of operation: protected (pKVM) and unprotected
: (conventional nVHE). Depending on the mode, a slightly different approach
: is used to dump the hypervisor stacktrace but the core unwinding logic
: remains the same.
:
: * Protected nVHE (pKVM) stacktraces:
:
: In protected nVHE mode, the host cannot directly access hypervisor memory.
:
: The hypervisor stack unwinding happens in EL2 and is made accessible to
: the host via a shared buffer. Symbolizing and printing the stacktrace
: addresses is delegated to the host and happens in EL1.
:
: * Non-protected (Conventional) nVHE stacktraces:
:
: In non-protected mode, the host is able to directly access the hypervisor
: stack pages.
:
: The hypervisor stack unwinding and dumping of the stacktrace is performed
: by the host in EL1, as this avoids the memory overhead of setting up
: shared buffers between the host and hypervisor."
:
: Additional patches from Oliver Upton and Marc Zyngier, tidying up
: the initial series.
: .
arm64: Update 'unwinder howto'
KVM: arm64: Don't open code ARRAY_SIZE()
KVM: arm64: Move nVHE-only helpers into kvm/stacktrace.c
KVM: arm64: Make unwind()/on_accessible_stack() per-unwinder functions
KVM: arm64: Move nVHE stacktrace unwinding into its own compilation unit
KVM: arm64: Move PROTECTED_NVHE_STACKTRACE around
KVM: arm64: Introduce pkvm_dump_backtrace()
KVM: arm64: Implement protected nVHE hyp stack unwinder
KVM: arm64: Save protected-nVHE (pKVM) hyp stacktrace
KVM: arm64: Stub implementation of pKVM HYP stack unwinder
KVM: arm64: Allocate shared pKVM hyp stacktrace buffers
KVM: arm64: Add PROTECTED_NVHE_STACKTRACE Kconfig
KVM: arm64: Introduce hyp_dump_backtrace()
KVM: arm64: Implement non-protected nVHE hyp stack unwinder
KVM: arm64: Prepare non-protected nVHE hypervisor stacktrace
KVM: arm64: Stub implementation of non-protected nVHE HYP stack unwinder
KVM: arm64: On stack overflow switch to hyp overflow_stack
arm64: stacktrace: Add description of stacktrace/common.h
arm64: stacktrace: Factor out common unwind()
arm64: stacktrace: Handle frame pointer from different address spaces
...
Marc Zyngier [Wed, 27 Jul 2022 14:29:06 +0000 (15:29 +0100)]
arm64: Update 'unwinder howto'
Implementing a new unwinder is a bit more involved than writing
a couple of helpers, so let's not lure the reader into a false
sense of comfort. Instead, let's point out what they should
call into, and what sort of parameter they need to provide.
Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Kalesh Singh <kaleshsingh@google.com> Tested-by: Kalesh Singh <kaleshsingh@google.com> Reviewed-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20220727142906.1856759-7-maz@kernel.org
Eric Dumazet [Tue, 26 Jul 2022 11:57:43 +0000 (11:57 +0000)]
tcp: md5: fix IPv4-mapped support
After the blamed commit, IPv4 SYN packets handled
by a dual stack IPv6 socket are dropped, even if
perfectly valid.
$ nstat | grep MD5
TcpExtTCPMD5Failure 5 0.0
For a dual stack listener, an incoming IPv4 SYN packet
would call tcp_inbound_md5_hash() with @family == AF_INET,
while tp->af_specific is pointing to tcp_sock_ipv6_specific.
Only later when an IPv4-mapped child is created, tp->af_specific
is changed to tcp_sock_ipv6_mapped_specific.
Fixes: 7bbb765b7349 ("net/tcp: Merge TCP-MD5 inbound callbacks") Reported-by: Brian Vazquez <brianvv@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: David Ahern <dsahern@kernel.org> Reviewed-by: Dmitry Safonov <dima@arista.com> Tested-by: Leonard Crestez <cdleonard@gmail.com> Link: https://lore.kernel.org/r/20220726115743.2759832-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Marc Zyngier [Wed, 27 Jul 2022 14:29:03 +0000 (15:29 +0100)]
KVM: arm64: Make unwind()/on_accessible_stack() per-unwinder functions
Having multiple versions of on_accessible_stack() (one per unwinder)
makes it very hard to reason about what is used where due to the
complexity of the various includes, the forward declarations, and
the reliance on everything being 'inline'.
Instead, move the code back where it should be. Each unwinder
implements:
- on_accessible_stack() as well as the helpers it depends on,
- unwind()/unwind_next(), as they pass on_accessible_stack as
a parameter to unwind_next_common() (which is the only common
code here)
This hardly results in any duplication, and makes it much
easier to reason about the code.
Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Kalesh Singh <kaleshsingh@google.com> Tested-by: Kalesh Singh <kaleshsingh@google.com> Reviewed-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20220727142906.1856759-4-maz@kernel.org
Marc Zyngier [Wed, 27 Jul 2022 14:29:02 +0000 (15:29 +0100)]
KVM: arm64: Move nVHE stacktrace unwinding into its own compilation unit
The unwinding code doesn't really belong to the exit handling
code. Instead, move it to a file (conveniently named stacktrace.c
to confuse the reviewer), and move all the stacktrace-related
stuff there.
It will be joined by more code very soon.
Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Kalesh Singh <kaleshsingh@google.com> Tested-by: Kalesh Singh <kaleshsingh@google.com> Reviewed-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20220727142906.1856759-3-maz@kernel.org
Replace SET_*_PM_OPS with *_PM_OPS, which which have the advantage that the
compiler always sees the PM callbacks as referenced, so they don't need to
be wrapped with "#ifdef CONFIG_PM_SLEEP" or tagged with "__maybe_unused" to
avoid "defined but not used" warnings.
See 1a3c7bb08826 ("PM: core: Add new *_PM_OPS macros, deprecate old ones").
Commit 26f09e9b3a06 ("mm/memblock: add memblock memory allocation apis")
added a check to determine whether arm_dma_zone_size is exceeding the
amount of kernel virtual address space available between the upper 4GB
virtual address limit and PAGE_OFFSET in order to provide a suitable
definition of MAX_DMA_ADDRESS that should fit within the 32-bit virtual
address space. The quantity used for comparison was off by a missing
trailing 0, leading to MAX_DMA_ADDRESS to be overflowing a 32-bit
quantity.
This was caught thanks to CONFIG_DEBUG_VIRTUAL on the bcm2711 platform
where we define a dma_zone_size of 1GB and we have a PAGE_OFFSET value
of 0xc000_0000 (CONFIG_VMSPLIT_3G) leading to MAX_DMA_ADDRESS being
0x1_0000_0000 which overflows the unsigned long type used throughout
__pa() and then __virt_addr_valid(). Because the virtual address passed
to __virt_addr_valid() would now be 0, the function would loudly warn
and flood the kernel log, thus making the platform unable to boot
properly.
Jim Quinlan [Tue, 26 Jul 2022 17:39:11 +0000 (12:39 -0500)]
PCI: brcmstb: Rename .map_bus() functions to end with 'map_bus'
Rename the .map_bus() functions to end with 'map_bus' so they're easy to
find with, e.g., 'git grep "^static.*_map_bus" drivers/pci/'.
[bhelgaas: rename brcm_pcie_map_bus32() to brcm7425_pcie_map_bus() for
better cscope-ability (".*_map_bus" is not the same as ".*_map_bus.*")] Link: https://lore.kernel.org/r/20220725151258.42574-8-jim2101024@gmail.com Signed-off-by: Jim Quinlan <jim2101024@gmail.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Tested-by: Florian Fainelli <f.fainelli@gmail.com>
Jim Quinlan [Mon, 25 Jul 2022 15:12:53 +0000 (11:12 -0400)]
PCI: brcmstb: Enable child bus device regulators from DT
Some platforms have power regulators for slots or devices below Root Ports.
On platforms like Raspberry Pi 4, these regulators are described in the
Root Port device tree node, since they logically belong to the Root Port,
not to the host bridge itself.
Add an .add_bus() hook (called when pci_alloc_child_bus() allocates the
secondary ("child") bus for a bridge), and look for such regulators. If we
find some, enable them before bringing up the link and enumerating devices
on the child bus.
Similarly, when pci_remove_bus() calls the ops->remove_bus() hook, disable
the regulators.
The regulators that may be described in a Root Port DT device are:
vpcie3v3
vpcie3v3aux
vpcie12v
These control power to the device downstream from the Root Port.
[bhelgaas: commit log, name hooks brcm_pcie_add_bus(), etc, since we only
support one set of subregulator info, save info in struct brcm_pcie instead
of dev->driver_data, move brcm_pcie_start_link() from probe to .add_bus()
(from subsequent patch)] Link: https://lore.kernel.org/r/20220725151258.42574-5-jim2101024@gmail.com Signed-off-by: Jim Quinlan <jim2101024@gmail.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Tested-by: Florian Fainelli <f.fainelli@gmail.com>
Jim Quinlan [Mon, 25 Jul 2022 15:12:52 +0000 (11:12 -0400)]
PCI: brcmstb: Prevent config space access when link is down
When the link is down, config accesses to downstream devices cause CPU
aborts. Allow config accesses only when the link is up.
As the following scenario shows, this check is racy and cannot completely
avoid CPU aborts, but it makes them less likely:
pci_generic_config_read
addr = brcm_pcie_map_conf # bus->ops->map_bus()
brcm_pcie_link_up # returns "true"; link is up
<link goes down>
*val = readb(addr) # link is now down
<CPU abort>
Note that config space accesses to the Root Port are not affected by link
status.
[bhelgaas: commit log, use PCIE_ECAM_REG() instead of magic 0xfff masks;
note that pci_generic_config_read32() masks low two bits already] Link: https://lore.kernel.org/r/20220725151258.42574-4-jim2101024@gmail.com Signed-off-by: Jim Quinlan <jim2101024@gmail.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Tested-by: Florian Fainelli <f.fainelli@gmail.com>
Remove forward function declarations in this driver. Also move some
constant structure definitions lower in the file. There are no changes to
the code that has been moved.
Merge tag 'asm-generic-fixes-5.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic
Pull asm-generic fixes from Arnd Bergmann:
"Two more bug fixes for asm-generic, one addressing an incorrect
Kconfig symbol reference and another one fixing a build failure for
the perf tool on mips and possibly others"
* tag 'asm-generic-fixes-5.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic:
asm-generic: remove a broken and needless ifdef conditional
tools: Fixed MIPS builds due to struct flock re-definition
SUNRPC: Reinitialise the backchannel request buffers before reuse
When we're reusing the backchannel requests instead of freeing them,
then we should reinitialise any values of the send/receive xdr_bufs so
that they reflect the available space.
Daniel Müller [Wed, 27 Jul 2022 00:11:56 +0000 (00:11 +0000)]
selftests/bpf: Adjust vmtest.sh to use local kernel configuration
So far the vmtest.sh script, which can be used as a convenient way to
run bpf selftests, has obtained the kernel config safe to use for
testing from the libbpf/libbpf GitHub repository [0].
Given that we now have included this configuration into this very
repository, we can just consume it from here as well, eliminating the
necessity of remote accesses.
With this change we adjust the logic in the script to use the
configuration from below tools/testing/selftests/bpf/configs/ instead
of pulling it over the network.
Daniel Müller [Wed, 27 Jul 2022 00:11:55 +0000 (00:11 +0000)]
selftests/bpf: Copy over libbpf configs
This change integrates libbpf maintained configurations and black/white
lists [0] into the repository, co-located with the BPF selftests themselves.
We minimize the kernel configurations to keep future updates as small as
possible [1].
Furthermore, we make both kernel configurations build on top of the existing
configuration tools/testing/selftests/bpf/config (to be concatenated before
build). Lastly, we replaced the terms blacklist & whitelist with denylist and
allowlist, respectively.
Daniel Müller [Wed, 27 Jul 2022 00:11:54 +0000 (00:11 +0000)]
selftests/bpf: Sort configuration
This change makes sure to sort the existing minimal kernel configuration
containing options required for running BPF selftests alphabetically.
Doing so will make it easier to diff it against other configurations,
which in turn helps with maintaining disjunct config files that build on
top of each other. It also helped identify the CONFIG_IPV6_GRE being set
twice and removes one of the occurrences.
Lastly, we change NET_CLS_BPF from 'm' to 'y'. Having this option as 'm'
will cause failures of the btf_skc_cls_ingress selftest.
Tony Lindgren [Fri, 8 Apr 2022 10:17:13 +0000 (13:17 +0300)]
clocksource/drivers/timer-ti-dm: Move inline functions to driver for am6
The __omap_dm_timer_* inline functions in the header are no longer needed
outside the driver, and the header ifdefs prevent the driver working for
ARCH_K3.
Let's move the inline functions to the driver and drop the ifdefs and
drop the unused functions __omap_dm_timer_override_errata() and
__omap_dm_timer_load_start().
Rob Herring [Tue, 19 Jul 2022 21:51:52 +0000 (15:51 -0600)]
dt-bindings: iio/dac: adi,ad5766: Add missing type to 'output-range-microvolts'
'output-range-microvolts' is missing a type definition. '-microvolts' is
not a standard unit (should be '-microvolt'). As the property is already
in use, add a type reference.
Pavel Begunkov [Wed, 27 Jul 2022 09:30:41 +0000 (10:30 +0100)]
io_uring: notification completion optimisation
We want to use all optimisations that we have for io_uring requests like
completion batching, memory caching and more but for zc notifications.
Fortunately, notification perfectly fit the request model so we can
overlay them onto struct io_kiocb and use all the infratructure.
Most of the fields of struct io_notif natively fits into io_kiocb, so we
replace struct io_notif with struct io_kiocb carrying struct
io_notif_data in the cmd cache line. Then we adapt io_alloc_notif() to
use io_alloc_req()/io_alloc_req_refill(), and kill leftovers of hand
coded caching. __io_notif_complete_tw() is converted to use io_uring's
tw infra.
Yang Li [Wed, 1 Jun 2022 02:28:14 +0000 (10:28 +0800)]
ovl: fix some kernel-doc comments
Remove warnings found by running scripts/kernel-doc,
which is caused by using 'make W=1'.
fs/overlayfs/super.c:311: warning: Function parameter or member 'dentry'
not described in 'ovl_statfs'
fs/overlayfs/super.c:311: warning: Excess function parameter 'sb'
description in 'ovl_statfs'
fs/overlayfs/super.c:357: warning: Function parameter or member 'm' not
described in 'ovl_show_options'
fs/overlayfs/super.c:357: warning: Function parameter or member 'dentry'
not described in 'ovl_show_options'
Reported-by: Abaci Robot <abaci@linux.alibaba.com> Signed-off-by: Yang Li <yang.lee@linux.alibaba.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
When mounting overlayfs in an unprivileged user namespace, trusted xattr
creation will fail. This will lead to failures in some file operations,
e.g. in the following situation:
mkdir lower upper work merged
mkdir lower/directory
mount -toverlay -olowerdir=lower,upperdir=upper,workdir=work none merged
rmdir merged/directory
mkdir merged/directory
The cause for these failures is currently extremely non-obvious and hard to
debug. Hence, warn the user and suggest using the userxattr mount option,
if it is not already supplied and xattr creation fails during the
self-check.
Ian Rogers [Tue, 26 Jul 2022 22:09:21 +0000 (15:09 -0700)]
perf bpf: Remove undefined behavior from bpf_perf_object__next()
bpf_perf_object__next() folded the last element in the list test with the
empty list test. However, this meant that offsets were computed against
null and that a struct list_head was compared against a 'struct
bpf_perf_object'.
Working around this with clang's undefined behavior sanitizer required
-fno-sanitize=null and -fno-sanitize=object-size.
Remove the undefined behavior by using the regular Linux list APIs and
handling the starting case separately from the end testing case.
Looking at uses like bpf_perf_object__for_each(), as the constant NULL
or non-NULL argument can be constant propagated, the code is no less
efficient.
Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Andrii Nakryiko <andrii@kernel.org> Cc: Christy Lee <christylee@fb.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Miaoqian Lin <linmq006@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Tom Rix <trix@redhat.com> Cc: bpf@vger.kernel.org Cc: llvm@lists.linux.dev Link: https://lore.kernel.org/r/20220726220921.2567761-1-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Leo Yan [Sun, 24 Jul 2022 06:00:13 +0000 (14:00 +0800)]
perf symbol: Skip symbols if SHF_ALLOC flag is not set
Some symbols are observed with the 'st_value' field zeroed. E.g.
libc.so.6 in Ubuntu contains a symbol '__evoke_link_warning_getwd' which
resides in the '.gnu.warning.getwd' section.
Unlike normal sections, such kind of sections are used for linker
warning when a file calls deprecated functions, but they are not part of
memory images, the symbols in these sections should be dropped.
This patch checks the section attribute SHF_ALLOC bit, if the bit is not
set, it skips symbols to avoid spurious ones.
Suggested-by: Fangrui Song <maskray@google.com> Signed-off-by: Leo Yan <leo.yan@linaro.org> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Chang Rui <changruinj@gmail.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20220724060013.171050-3-leo.yan@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Leo Yan [Sun, 24 Jul 2022 06:00:12 +0000 (14:00 +0800)]
perf symbol: Correct address for bss symbols
When using 'perf mem' and 'perf c2c', an issue is observed that tool
reports the wrong offset for global data symbols. This is a common
issue on both x86 and Arm64 platforms.
Let's see an example, for a test program, below is the disassembly for
its .bss section which is dumped with objdump:
First we used 'perf mem record' to run the test program and then used
'perf --debug verbose=4 mem report' to observe what's the symbol info
for 'buf1' and 'buf2' structures.
The perf tool relies on libelf to parse symbols, in executable and
shared object files, 'st_value' holds a virtual address; 'sh_addr' is
the address at which section's first byte should reside in memory, and
'sh_offset' is the byte offset from the beginning of the file to the
first byte in the section. The perf tool uses below formula to convert
a symbol's memory address to a file address:
We can see the final adjusted address ranges for buf1 and buf2 are
[0x30a8-0x30e8) and [0x3068-0x30a8) respectively, apparently this is
incorrect, in the code, the structure for 'buf1' and 'buf2' specifies
compiler attribute with 64-byte alignment.
The problem happens for 'sh_offset', libelf returns it as 0x3028 which
is not 64-byte aligned, combining with disassembly, it's likely libelf
doesn't respect the alignment for .bss section, therefore, it doesn't
return the aligned value for 'sh_offset'.
Suggested by Fangrui Song, ELF file contains program header which
contains PT_LOAD segments, the fields p_vaddr and p_offset in PT_LOAD
segments contain the execution info. A better choice for converting
memory address to file address is using the formula:
file_address = st_value - p_vaddr + p_offset
This patch introduces elf_read_program_header() which returns the
program header based on the passed 'st_value', then it uses the formula
above to calculate the symbol file address; and the debugging log is
updated respectively.
Leo Yan [Mon, 25 Jul 2022 10:42:20 +0000 (18:42 +0800)]
perf scripts python: Let script to be python2 compliant
The mainline kernel can be used for relative old distros, e.g. RHEL 7.
The distro doesn't upgrade from python2 to python3, this causes the
building error that the python script is not python2 compliant.
To fix the building failure, this patch changes from the python f-string
format to traditional string format.
Fixes: 12fdd6c009da0d02 ("perf scripts python: Support Arm CoreSight trace data disassembly") Reported-by: Akemi Yagi <toracat@elrepo.org> Signed-off-by: Leo Yan <leo.yan@linaro.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: ElRepo <contact@elrepo.org> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20220725104220.1106663-1-leo.yan@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
CC /tmp/build/perf/bench/mem-memcpy-x86-64-asm.o
CC /tmp/build/perf/bench/mem-memset-x86-64-asm.o
And addresses this perf build warning:
Warning: Kernel ABI header at 'tools/arch/x86/include/asm/cpufeatures.h' differs from latest version at 'arch/x86/include/asm/cpufeatures.h'
diff -u tools/arch/x86/include/asm/cpufeatures.h arch/x86/include/asm/cpufeatures.h
Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Borislav Petkov <bp@suse.de> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org Link: https://lore.kernel.org/lkml/Yt6oWce9UDAmBAtX@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
A client should be able to handle getting an EACCES error while doing
a mount operation to reclaim state due to NFS4CLNT_RECLAIM_REBOOT
being set. If the server returns RPC_AUTH_BADCRED because authentication
failed when we execute "exportfs -au", then RECLAIM_COMPLETE will go a
wrong way. After mount succeeds, all OPEN call will fail due to an
NFS4ERR_GRACE error being returned. This patch is to fix it by resending
a RPC request.
Signed-off-by: Zhang Xianwei <zhang.xianwei8@zte.com.cn> Signed-off-by: Yi Wang <wang.yi59@zte.com.cn> Fixes: aa5190d0ed7d ("NFSv4: Kill nfs4_async_handle_error() abuses by NFSv4.1") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Merge tag 'at91-dt-5.20-4' of git://git.kernel.org/pub/scm/linux/kernel/git/at91/linux into arm/dt
AT91 DT for v5.20 #4
It contains one new LAN966 based board, namely pcb8309, a cleanup
on Makefile to sort alphabetically LAN966 entries and 2 cleanups
on bindings.
* tag 'at91-dt-5.20-4' of git://git.kernel.org/pub/scm/linux/kernel/git/at91/linux:
dt-bindings: soc: microchip: use absolute path to other schema
dt-bindings: soc: microchip: drop quotes when not needed
ARM: dts: lan966x: keep lan966 entries alphabetically sorted
ARM: dts: lan966x: add support for pcb8309
dt-bindings: arm: at91: add lan966 pcb8309 board
VA Macro fsgen clock is supplied to other LPASS Macros using proper
clock apis, however the internal user uses the registers directly without
clk apis. This approch has race condition where in external users of
the clock might cut the clock while VA macro is actively using this.
Moving the internal usage to clk apis would provide a proper refcounting
and avoid such race conditions.
This issue was noticed while headset was pulled out while recording is
in progress and shifting record patch to DMIC.
Reported-by: Srinivasa Rao Mandadapu <quic_srivasam@quicinc.com> Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org> Tested-by: Srinivasa Rao Mandadapu <quic_srivasam@quicinc.com> Link: https://lore.kernel.org/r/20220727124749.4604-1-srinivas.kandagatla@linaro.org Signed-off-by: Mark Brown <broonie@kernel.org>
John Garry [Tue, 26 Jul 2022 09:24:33 +0000 (17:24 +0800)]
arm64: defconfig: Sync some configs with savedefconfig
Some configs can obviously be removed when sync'ing with savedefconfig, as
follows:
- config SECCOMP was changed to def_bool y in commit 282a181b1a0d ("
seccomp: Move config option SECCOMP to arch/Kconfig"), so no need to
explicitly enable in the defconfig.
- config MAILBOX is already selected by some drivers enabled in the
defconfig, so no need to explicitly enable.
- config QRTR was enabled in the defconfig from commit 1bdf91fd2ae82 ("
arm64: defconfig: Enable Qualcomm QRTR"). However until many kernel
versions later in commit 231a136fdf46 ("arm64: defconfig: enable ath11k
driver"), no driver depended on config QRTR - not for building anyway.
In commit 231a136fdf46, config ATH11K_PCI was enabled and this selects
config QRTR, so there is no need to explicitly enable in the defconfig.
hwmon: (aquacomputer_d5next) Add support for Aquacomputer Quadro fan controller
Extend aquacomputer_d5next driver to expose hardware temperature sensors
and fans of the Aquacomputer Quadro fan controller, which communicates
through a proprietary USB HID protocol. Implemented by Jack Doan [1].
Four temperature sensors and PWM controllable fans are available. The
liquid flow sensor is also exposed, implemented by Leonard Anderweit [2].
Additionally, serial number, firmware version and power-on count are
exposed through debugfs.
wifi: wilc1000: use existing iftype variable to store the interface type
For consistency, use an existing 'iftype' element which was already
having the interface type. Replace 'mode' with 'iftype' as it was used
for the same purpose.
wifi: wilc1000: add 'isinit' flag for SDIO bus similar to SPI
Similar to SPI priv data, add 'isinit' variable in SDIO priv. Make use
of the state to invoke hif_init() once, and acquire the lock before
accessing hif function.
wifi: wilc1000: cancel the connect operation during interface down
Cancel the ongoing connection request to avoid any issue if the
interface is set down before the connection request is completed.
host_int_handle_disconnect was already available, so renamed it and used
the same API for 'ndio_close' cb.
wifi: wl12xx: Drop if with an always false condition
The remove callback is only called after probe completed successfully.
In this case platform_set_drvdata() was called with a non-NULL argument
(in wlcore_probe()) and so wl is never NULL.
This is a preparation for making platform remove callbacks return void.
Douglas Anderson [Tue, 26 Jul 2022 17:38:23 +0000 (10:38 -0700)]
regulator: core: Allow drivers to define their init data as const
Drivers tend to want to define the names of their regulators somewhere
in their source file as "static const". This means, inevitable, that
every driver out there open codes something like this:
Let's make this more convenient by doing providing a helper that does
the copy.
I have chosen to have the "const" input structure here be the exact
same structure as the normal one passed to
devm_regulator_bulk_get(). This is slightly inefficent since the input
data can't possibly have anything useful for "ret" or consumer and
thus we waste 8 bytes per structure. This seems an OK tradeoff for not
introducing an extra structure.