git.ipfire.org Git - people/ms/linux.git/log

KVM: selftests: Use kvm_cpu_has() for nested VMX checks

Use kvm_cpu_has() to check for nested VMX support, and drop the helpers
now that their functionality is trivial to implement.

No functional change intended.

Signed-off-by: Sean Christopherson <seanjc@google.com>
Link: https://lore.kernel.org/r/20220614200707.3315957-7-seanjc@google.com

KVM: selftests: Use kvm_cpu_has() for nested SVM checks

Use kvm_cpu_has() to check for nested SVM support, and drop the helpers
now that their functionality is trivial to implement.

No functional change intended.

Signed-off-by: Sean Christopherson <seanjc@google.com>
Link: https://lore.kernel.org/r/20220614200707.3315957-6-seanjc@google.com

KVM: selftests: Use kvm_cpu_has() in the SEV migration test

Use kvm_cpu_has() in the SEV migration test instead of open coding
equivalent functionality using kvm_get_supported_cpuid_entry().

No functional change intended.

Signed-off-by: Sean Christopherson <seanjc@google.com>
Link: https://lore.kernel.org/r/20220614200707.3315957-5-seanjc@google.com

KVM: selftests: Add framework to query KVM CPUID bits

Add X86_FEATURE_* magic in the style of KVM-Unit-Tests' implementation,
where the CPUID function, index, output register, and output bit position
are embedded in the macro value. Add kvm_cpu_has() to query KVM's
supported CPUID and use it set_sregs_test, which is the most prolific
user of manual feature querying.

Opportunstically rename calc_cr4_feature_bits() to
calc_supported_cr4_feature_bits() to better capture how the CR4 bits are
chosen.

Link: https://lore.kernel.org/all/20210422005626.564163-1-ricarkol@google.com
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Suggested-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Link: https://lore.kernel.org/r/20220614200707.3315957-4-seanjc@google.com

KVM: sefltests: Use CPUID_* instead of X86_FEATURE_* for one-off usage

Rename X86_FEATURE_* macros to CPUID_* in various tests to free up the
X86_FEATURE_* names for KVM-Unit-Tests style CPUID automagic where the
function, leaf, register, and bit for the feature is embedded in its
macro value.

No functional change intended.

Signed-off-by: Sean Christopherson <seanjc@google.com>
Link: https://lore.kernel.org/r/20220614200707.3315957-3-seanjc@google.com

KVM: selftests: Set KVM's supported CPUID as vCPU's CPUID during recreate

On x86-64, set KVM's supported CPUID as the vCPU's CPUID when recreating
a VM+vCPU to deduplicate code for state save/restore tests, and to
provide symmetry of sorts with respect to vm_create_with_one_vcpu(). The
extra KVM_SET_CPUID2 call is wasteful for Hyper-V, but ultimately is
nothing more than an expensive nop, and overriding the vCPU's CPUID with
the Hyper-V CPUID information is the only known scenario where a state
save/restore test wouldn't need/want the default CPUID.

Opportunistically use __weak for the default vm_compute_max_gfn(), it's
provided by tools' compiler.h.

Signed-off-by: Sean Christopherson <seanjc@google.com>
Link: https://lore.kernel.org/r/20220614200707.3315957-2-seanjc@google.com

KVM: selftests: Fix filename reporting in guest asserts

Fix filename reporting in guest asserts by ensuring the GUEST_ASSERT
macro records __FILE__ and substituting REPORT_GUEST_ASSERT for many
repetitive calls to TEST_FAIL.

Previously filename was reported by using __FILE__ directly in the
selftest, wrongly assuming it would always be the same as where the
assertion failed.

Signed-off-by: Colton Lewis <coltonlewis@google.com>
Reported-by: Ricardo Koller <ricarkol@google.com>
Fixes: 4e18bccc2e5544f0be28fc1c4e6be47a469d6c60
Link: https://lore.kernel.org/r/20220615193116.806312-5-coltonlewis@google.com
[sean: convert more TEST_FAIL => REPORT_GUEST_ASSERT instances]
Signed-off-by: Sean Christopherson <seanjc@google.com>

KVM: selftests: Write REPORT_GUEST_ASSERT macros to pair with GUEST_ASSERT

Write REPORT_GUEST_ASSERT macros to pair with GUEST_ASSERT to abstract
and make consistent all guest assertion reporting. Every report
includes an explanatory string, a filename, and a line number.

Signed-off-by: Colton Lewis <coltonlewis@google.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Link: https://lore.kernel.org/r/20220615193116.806312-4-coltonlewis@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>

KVM: selftests: Increase UCALL_MAX_ARGS to 7

Increase UCALL_MAX_ARGS to 7 to allow GUEST_ASSERT_4 to pass 3 builtin
ucall arguments specified in guest_assert_builtin_args plus 4
user-specified arguments.

Signed-off-by: Colton Lewis <coltonlewis@google.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Link: https://lore.kernel.org/r/20220615193116.806312-3-coltonlewis@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>

KVM: selftests: enumerate GUEST_ASSERT arguments

Enumerate GUEST_ASSERT arguments to avoid magic indices to ucall.args.

Signed-off-by: Colton Lewis <coltonlewis@google.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Link: https://lore.kernel.org/r/20220615193116.806312-2-coltonlewis@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>

KVM: x86: WARN only once if KVM leaves a dangling userspace I/O request

Change a WARN_ON() to separate WARN_ON_ONCE() if KVM has an outstanding
PIO or MMIO request without an associated callback, i.e. if KVM queued a
userspace I/O exit but didn't actually exit to userspace before moving
on to something else.  Warning on every KVM_RUN risks spamming the kernel
if KVM gets into a bad state.  Opportunistically split the WARNs so that
it's easier to triage failures when a WARN fires.

Deliberately do not use KVM_BUG_ON(), i.e. don't kill the VM.  While the
WARN is all but guaranteed to fire if and only if there's a KVM bug, a
dangling I/O request does not present a danger to KVM (that flag is truly
truly consumed only in a single emulator path), and any such bug is
unlikely to be fatal to the VM (KVM essentially failed to do something it
shouldn't have tried to do in the first place).  In other words, note the
bug, but let the VM keep running.

Signed-off-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Link: https://lore.kernel.org/r/20220711232750.1092012-4-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>

KVM: x86: Set error code to segment selector on LLDT/LTR non-canonical #GP

When injecting a #GP on LLDT/LTR due to a non-canonical LDT/TSS base, set
the error code to the selector. Intel SDM's says nothing about the #GP,
but AMD's APM explicitly states that both LLDT and LTR set the error code
to the selector, not zero.

Note, a non-canonical memory operand on LLDT/LTR does generate a #GP(0),
but the KVM code in question is specific to the base from the descriptor.

Fixes: e37a75a13cda ("KVM: x86: Emulator ignores LDTR/TR extended base on LLDT/LTR")
Cc: stable@vger.kernel.org
Signed-off-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Link: https://lore.kernel.org/r/20220711232750.1092012-3-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>

KVM: x86: Mark TSS busy during LTR emulation _after_ all fault checks

Wait to mark the TSS as busy during LTR emulation until after all fault
checks for the LTR have passed. Specifically, don't mark the TSS busy if
the new TSS base is non-canonical.

Opportunistically drop the one-off !seg_desc.PRESENT check for TR as the
only reason for the early check was to avoid marking a !PRESENT TSS as
busy, i.e. the common !PRESENT is now done before setting the busy bit.

Fixes: e37a75a13cda ("KVM: x86: Emulator ignores LDTR/TR extended base on LLDT/LTR")
Reported-by: syzbot+760a73552f47a8cd0fd9@syzkaller.appspotmail.com
Cc: stable@vger.kernel.org
Cc: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Cc: Hou Wenlong <houwenlong.hwl@antgroup.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Link: https://lore.kernel.org/r/20220711232750.1092012-2-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>

KVM: x86: Tweak name of MONITOR/MWAIT #UD quirk to make it #UD specific

Add a "UD" clause to KVM_X86_QUIRK_MWAIT_NEVER_FAULTS to make it clear
that the quirk only controls the #UD behavior of MONITOR/MWAIT. KVM
doesn't currently enforce fault checks when MONITOR/MWAIT are supported,
but that could change in the future. SVM also has a virtualization hole
in that it checks all faults before intercepts, and so "never faults" is
already a lie when running on SVM.

Fixes: bfbcc81bb82c ("KVM: x86: Add a quirk for KVM's "MONITOR/MWAIT are NOPs!" behavior")
Signed-off-by: Sean Christopherson <seanjc@google.com>
Link: https://lore.kernel.org/r/20220711225753.1073989-4-seanjc@google.com

KVM: selftests: Use "a" and "d" to set EAX/EDX for wrmsr_safe()

Do not use GCC's "A" constraint to load EAX:EDX in wrmsr_safe().  Per
GCC's documenation on x86-specific constraints, "A" will not actually
load a 64-bit value into EAX:EDX on x86-64.

  The a and d registers. This class is used for instructions that return
  double word results in the ax:dx register pair. Single word values will
  be allocated either in ax or dx. For example on i386 the following
  implements rdtsc:

  unsigned long long rdtsc (void)
  {
    unsigned long long tick;
    __asm__ __volatile__("rdtsc":"=A"(tick));
    return tick;
  }

  This is not correct on x86-64 as it would allocate tick in either ax or
  dx. You have to use the following variant instead:

  unsigned long long rdtsc (void)
  {
    unsigned int tickl, tickh;
    __asm__ __volatile__("rdtsc":"=a"(tickl),"=d"(tickh));
    return ((unsigned long long)tickh << 32)|tickl;
  }

Because a u64 fits in a single 64-bit register, using "A" for selftests,
which are 64-bit only, results in GCC loading the value into either RAX
or RDX instead of splitting it across EAX:EDX.

E.g.:

  kvm_exit:             reason MSR_WRITE rip 0x402919 info 0 0
  kvm_msr:              msr_write 40000118 = 0x60000000001 (#GP)
...

With "A":

  48 8b 43 08           mov    0x8(%rbx),%rax
  49 b9 ba da ca ba 0a movabs $0xabacadaba,%r9
  00 00 00
  4c 8d 15 07 00 00 00 lea    0x7(%rip),%r10        # 402f44 <guest_msr+0x34>
  4c 8d 1d 06 00 00 00 lea    0x6(%rip),%r11        # 402f4a <guest_msr+0x3a>
  0f 30                 wrmsr

With "a"/"d":

  48 8b 53 08             mov    0x8(%rbx),%rdx
  89 d0                   mov    %edx,%eax
  48 c1 ea 20             shr    $0x20,%rdx
  49 b9 ba da ca ba 0a    movabs $0xabacadaba,%r9
  00 00 00
  4c 8d 15 07 00 00 00    lea    0x7(%rip),%r10        # 402fc3 <guest_msr+0xb3>
  4c 8d 1d 06 00 00 00    lea    0x6(%rip),%r11        # 402fc9 <guest_msr+0xb9>
  0f 30                   wrmsr

Fixes: 3b23054cd3f5 ("KVM: selftests: Add x86-64 support for exception fixup")
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Link: https://gcc.gnu.org/onlinedocs/gcc/Machine-Constraints.html#Machine-Constraints
[sean: use "& -1u", provide GCC blurb and link to documentation]
Signed-off-by: Sean Christopherson <seanjc@google.com>
Link: https://lore.kernel.org/r/20220714011115.3135828-1-seanjc@google.com

smb3: workaround negprot bug in some Samba servers

Mount can now fail to older Samba servers due to a server
bug handling padding at the end of the last negotiate
context (negotiate contexts typically are rounded up to 8
bytes by adding padding if needed). This server bug can
be avoided by switching the order of negotiate contexts,
placing a negotiate context at the end that does not
require padding (prior to the recent netname context fix
this was the case on the client).

Fixes: 73130a7b1ac9 ("smb3: fix empty netname context on secondary channels")
Reported-by: Julian Sikorski <belegdol@gmail.com>
Tested-by: Julian Sikorski <belegdol+github@gmail.com>
Reviewed-by: Shyam Prasad N <sprasad@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>

drm/amd/display: remove duplicate dcn314 includes

Several headers were included twice. Fix that.

Reported-by: kernel test robot <yujie.liu@intel.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Reviewed-by: André Almeida <andrealmeid@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: Enable DCN314 in DM

Add support for DCN 3.1.4 in Display Manager

Signed-off-by: Roman Li <roman.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: Add DMUB support for DCN314

Initialize DMUB for DCN 3.1.4.
Use same funcs as DCN31.

Signed-off-by: Roman Li <roman.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: Enable DCN314 in DC

Add support for DCN 3.1.4 in Display Core

Signed-off-by: Roman Li <roman.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: Add DCN314 version identifiers

DCN 3.1.4 version and family ids

Signed-off-by: Roman Li <roman.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: Add DCN314 DML calculation support

Display mode library for DCN 3.1.4

v2: squash in checkpatch fix (Alex)

Signed-off-by: Roman Li <roman.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: Add DCN314 DC resources

Display Core support for DCN 3.1.4

v2:(squash)fix non-x86 in dc/dcn314/Makefile
Properly handle PPC as well. (Alex)
v3: minor cleanup (Alex)
v4: fix comment (Alex)

Signed-off-by: Roman Li <roman.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: Add DCN314 clock manager

Clock and SMU interfaces for DCN 3.1.4

Signed-off-by: Roman Li <roman.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: Add DCN314 IRQ services

IRQ services to support DCN 3.1.4 interrupts.

v2: make to_dal_irq_source_dcn314 static (Alex)

Signed-off-by: Roman Li <roman.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: Add reg headers for DCN314

Register headers for the following IPs:
- DCN 3.1.4
- DPCS 3.1.4

v2:(squash) clean up (Alex)

Signed-off-by: Roman Li <roman.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: Ensure valid event timestamp for cursor-only commits

Requires enabling the vblank machinery for them.

Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/2030
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Michel Dänzer <mdaenzer@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: Check BO's requested pinning domains against its preferred_domains

When pinning a buffer, we should check to see if there are any
additional restrictions imposed by bo->preferred_domains. This will
prevent the BO from being moved to an invalid domain when pinning.

For example, this can happen if the user requests to create a BO in GTT
domain for display scanout. amdgpu_dm will allow pinning to either VRAM
or GTT domains, since DCN can scanout from either or. However, in
amdgpu_bo_pin_restricted(), pinning to VRAM is preferred if there is
adequate carveout. This can lead to pinning to VRAM despite the user
requesting GTT placement for the BO.

v2: Allow the kernel to override the domain, which can happen when
exporting a BO to a V4L camera (for example).

Signed-off-by: Leo Li <sunpeng.li@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org

apparmor: disable showing the mode as part of a secid to secctx

Displaying the mode as part of the seectx takes up unnecessary memory,
makes it so we can't use refcounted secctx so we need to alloc/free on
every conversion from secid to secctx and introduces a space that
could be potentially mishandled by tooling.

Eg. In an audit record we get

subj_type=firefix (enforce)

Having the mode reported is not necessary, and might even be confusing
eg. when writing an audit rule to match the above record field you
would use

-F subj_type=firefox

ie. the mode is not included. AppArmor provides ways to find the mode
without reporting as part of the secctx. So disable this by default
before its use is wide spread and we can't. For now we add a sysctl
to control the behavior as we can't guarantee no one is using this.

Acked-by: Andrea Righi <andrea.righi@canonical.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>

apparmor: Convert secid mapping to XArrays instead of IDR

XArrays are a better match than IDR for how AppArmor is mapping
secids. Specifically AppArmor is trying to keep the allocation
dense. XArrays also have the advantage of avoiding the complexity IDRs
preallocation.

In addition this avoids/fixes a lockdep issue raised in the LKML thread
"Linux 5.18-rc4"

where there is a report of an interaction between apparmor and IPC,
this warning may have been spurious as the reported issue is in a
per-cpu local lock taken by the IDR. With the one side in the IPC id
allocation and the other in AppArmor's secid allocation.

Description by John Johansen <john.johansen@canonical.com>

Message-Id: <226cee6a-6ca1-b603-db08-8500cd8f77b7@gnuweeb.org>
Signed-off-by: Matthew Wilcox <willy@infradead.org>
Signed-off-by: John Johansen <john.johansen@canonical.com>

apparmor: add a kernel label to use on kernel objects

Separate kernel objects from unconfined. This is done so we can
distinguish between the two in debugging, auditing and in preparation
for being able to replace unconfined, which is not appropriate for the
kernel.

The kernel label will continue to behave similar to unconfined.

Acked-by: Jon Tourville <jon.tourville@canonical.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>

net/mlx5e: Remove the duplicating check for striding RQ when enabling LRO

LRO requires striding RQ and checks that it's enabled at two places:
mlx5e_fix_features and set_feature_lro. This commit keeps only one check
at mlx5e_fix_features and removes the duplicating one in
set_feature_lro.

Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

net/mlx5e: Move the LRO-XSK check to mlx5e_fix_features

LRO is mutually exclusive with XSK. When LRO is enabled, it checks
whether XSK is active. This commit moves this check to a more correct
place at mlx5e_fix_features.

Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

net/mlx5e: Extend flower police validation

Recent net commit 4d1e07d83ccc ("net/mlx5e: Fix matchall police parameters
validation") removed notexceed action id validation from
mlx5e_police_validate() and left it up to callers. However, since
tc_act_can_offload_police() only exists in net-next its validation is
extended in this dedicated followup patch.

Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
Reviewed-by: Jianbo Liu <jianbol@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

net/mlx5e: configure meter in flow action

After police action is parsed, set meter data in flow action,
so they can be used when adding FTE.

Signed-off-by: Jianbo Liu <jianbol@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

net/mlx5e: Removed useless code in function

Comparison of eth_ft->ft with NULL is useless, because
get_flow_table() returns either pointer 'eth_ft'
such that eth_ft->ft != NULL, or an erroneous value that is
handled on return, causing mlx5e_ethtool_flow_replace()
to terminate before checking whether eth_ft->ft equals NULL.

Found by Linux Verification Center (linuxtesting.org) with SVACE.

Signed-off-by: Rustam Subkhankulov <subkhankulov@ispras.ru>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

net/mlx5: Bridge, implement QinQ support

Implement support for new 802.1ad VLAN protocol type. Create new flow
groups that handle svlan tags. Create FDB flows with svlan tag match when
bridge VLAN is set to QinQ.

Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

net/mlx5: Bridge, implement infrastructure for VLAN protocol change

Current implementation only supports 802.1Q VLAN Ethernet protocol. That
protocol type is assumed by default and
SWITCHDEV_ATTR_ID_BRIDGE_VLAN_PROTOCOL notification is ignored. To prepare
for supporting 802.1ad protocol in following patches implement the
necessary infrastructure to allow the user to dynamically change the VLAN
protocol:

- Handle SWITCHDEV_ATTR_ID_BRIDGE_VLAN_PROTOCOL notification by flushing
FDB and re-creating VLAN modify header actions with new protocol. In this
patch the only allowed dynamic VLAN protocol value is ETH_P_8021Q.

- Save current VLAN protocol in per-bridge instance variable. Use the
dynamic variable instead of hardcoded values in mlx5 bridge code. Create
VLAN flow groups and flows based on current mlx5_esw_bridge->vlan_proto
value instead of assuming 802.1Q ethertype.

- Extract common flow group creation code into dedicated functions in order
to be reused for creating QinQ groups in following patches.

Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

net/mlx5: Bridge, extract VLAN push/pop actions creation

Following patches in series need to re-create VLAN actions when user
changes VLAN protocol. Extract the code that creates VLAN push/pop actions
into dedicated function in order to be reused in next patch.

Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

net/mlx5: Bridge, rename filter fg to vlan_filter

Following patches in series introduce new qinq filtering group. To improve
readability rename the existing group in function, variable and definition
names to include "vlan" in order to make it easy to distinguish from
upcoming qinq group.

Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

net/mlx5: Bridge, refactor groups sizes and indices

Following patches in the series introduce additional flow groups for QinQ
support. With increased number of groups it becomes cumbersome to calculate
groups sizes as fractions of the table size. Instead, manually define sizes
of specific group types and ensure that totals are still correct by static
assertions. Having specific table size is important for firmware resource
management.

This commit doesn't change functionality.

Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

net/mlx5: debugfs, Add num of in-use FW command interface slots

Expose the number of busy / in-use slots in the FW command interface via
a read-only debugfs entry. This improves observability and helps in the
performance bottleneck analysis.

Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

net/mlx5: Expose vnic diagnostic counters for eswitch managed vports

Expose on vport group managers debug counters for their managed vports.

Counters are exposed through debugfs, the directory will be present only
for functions that are eswitch managers and only counters that are
supported on their specific HW/FW will be exposed.

Example:
$ ls /sys/kernel/debug/mlx5/0000:08:00.0/esw/
pf sf_8 vf_0 vf_1

$ ls -l /sys/kernel/debug/mlx5/0000:08:00.0/esw/vf_0/vnic_diag/
cq_overrun
quota_exceeded_command
total_q_under_processor_handle
invalid_command
send_queue_priority_update_flow

List of all counter added:
total_q_under_processor_handle - number of queues in error state due to an
async error or errored command.
send_queue_priority_update_flow - number of QP/SQ priority/SL update
events.
cq_overrun - number of times CQ entered an error state due to an
overflow.
async_eq_overrun -number of time an EQ mapped to async events was
overrun.
comp_eq_overrun - number of time an EQ mapped to completion events was
overrun.
quota_exceeded_command - number of commands issued and failed due to quota
exceeded.
invalid_command - number of commands issued and failed dues to any reason
other than quota exceeded.

Signed-off-by: Michael Guralnik <michaelgur@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

net/mlx5: Use software VHCA id when it's supported

Use software VHCA id when it's supported by the firmware.

A unique id is allocated upon mlx5_mdev_init() and freed upon
mlx5_mdev_uninit(), as such it stays the same during the full life cycle
of the device including upon health recovery if occurred.

The conjunction of sw_vhca_id with sw_owner_id will be a global unique
id per function which uses mlx5_core.

The sw_vhca_id is set upon init_hca command and is used to specify the
VHCA that the NIC vport is affiliated with.

This functionality is needed upon migration of VM which is MPV based.
(i.e. multi port device).

Signed-off-by: Yishai Hadas <yishaih@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

net/mlx5: Introduce ifc bits for using software vhca id

Introduce ifc related stuff to enable using software vhca id
functionality.

Signed-off-by: Yishai Hadas <yishaih@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

net/mlx5: Use the bitmap API to allocate bitmaps

Use bitmap_zalloc()/bitmap_free() instead of hand-writing them.

It is less verbose and it improves the semantic.

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

netfilter: nf_nat: in nf_nat_initialized(), use const struct nf_conn *

nf_nat_initialized() doesn't modify passed struct nf_conn,
so declare as const.

This is helpful for code readability and makes it possible
to call nf_nat_initialized() with a const struct nf_conn *.

Signed-off-by: James Yonan <james@openvpn.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

NFSv4: Fix races in the legacy idmapper upcall

nfs_idmap_instantiate() will cause the process that is waiting in
request_key_with_auxdata() to wake up and exit. If there is a second
process waiting for the idmap->idmap_mutex, then it may wake up and
start a new call to request_key_with_auxdata(). If the call to
idmap_pipe_downcall() from the first process has not yet finished
calling nfs_idmap_complete_pipe_upcall_locked(), then we may end up
triggering the WARN_ON_ONCE() in nfs_idmap_prepare_pipe_upcall().

The fix is to ensure that we clear idmap->idmap_upcall_data before
calling nfs_idmap_instantiate().

Fixes: e9ab41b620e4 ("NFSv4: Clean up the legacy idmapper upcall")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

bpf: Tidy up verifier check_func_arg()

This patch does two things:

1. For matching against the arg type, the match should be against the
base type of the arg type, since the arg type can have different
bpf_type_flags set on it.

2. Uses switch casing to improve readability + efficiency.

Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
Acked-by: Hao Luo <haoluo@google.com>
Link: https://lore.kernel.org/r/20220712210603.123791-1-joannelkoong@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

ARM: dts: qcom: apq8064: create tsens device node

Create separate device node for thermal sensors on apq8064 platform.
Move related properties to the newly created device tree node.
This harmonizes apq8064 and ipq8064 device trees and allows gcc device
to be probed earlier by removing dependency on QFPROM nodes.

Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Link: https://lore.kernel.org/r/20220521151437.1489111-5-dmitry.baryshkov@linaro.org

arm64: defconfig: Enable Qualcomm SC8280XP providers

The Qualcomm SC8280XP need the global clock controller, interconnect
provider and TLMM pinctrl in order to boot. Enable these as builtin, as
they are needed in order to provide e.g. UART.

Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Link: https://lore.kernel.org/r/20220707161014.3178798-1-bjorn.andersson@linaro.org

arm64: dts: qcom: sc8280xp: Add lost ranges for timer

The timer node needs ranges specified to map the 1-cell children to the
2-cell address range used in /soc. This addition never made it into the
patch that was posted and merged, so add it now.

Fixes: 152d1faf1e2f ("arm64: dts: qcom: add SC8280XP platform")
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Reviewed-by: Johan Hovold <johan+linaro@kernel.org>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Link: https://lore.kernel.org/r/20220707160858.3178771-1-bjorn.andersson@linaro.org

Merge tag 'qcom-dts-for-5.20' of git://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux into arm/dt

Qualcomm DTS updates for v5.20

This adds USB, NAND, QPIC BAM, CPUfreq, remoteprocs, SMEM, SCM,
watchdog, interconnect providers to the SDX65 5G modem platform and
enables relevant devices for the MTP.

The BAM DMUX interface used to exchange Ethernet/IP data with the modem
is described on the MSM8974 platform.

It fixes up the PXO supply clock to L2CC on IPQ6084, as the platform is
transitioned away from global clock lookup.

SDX55 has it's debug UART interrupt level corrected.

Lastly it contains a wide variety of fixes for DeviceTree validation
issues across most of the platforms.

* tag 'qcom-dts-for-5.20' of git://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux: (48 commits)
  ARM: dts: qcom: msm8974: rename GPU's OPP table node
  ARM: dts: qcom: apq8064: disable DSI and DSI PHY by default
  ARM: dts: qcom: apq8064: rename DSI PHY iface clock
  ARM: dts: qcom: extend scm compatible to match dt-schema
  ARM: dts: qcom: Fix sdhci node names - use 'mmc@'
  ARM: dts: qcom: apq8064: drop phy-names from HDMI device node
  ARM: dts: qcom: apq8064-ifc6410: drop hdmi-mux-supply
  ARM: dts: qcom: pm8841: add required thermal-sensor-cells
  ARM: dts: qcom: msm8974: add required ranges to OCMEM
  ARM: dts: qcom: sdx55: add dedicated IMEM and syscon compatibles
  ARM: dts: qcom: msm8974: add dedicated IMEM compatible
  ARM: dts: qcom: apq8064-asus-nexus7: add dedicated IMEM compatible
  ARM: dts: qcom: use generic sram as name for imem and ocmem nodes
  ARM: dts: qcom: ipq8064: add function to LED nodes
  ARM: dts: qcom: ipq8064-rb3011: add color to LED node
  ARM: dts: qcom: ipq4018-ap120c-ac: add function and color to LED nodes
  ARM: dts: qcom: apq8060-ifc6410: add color to LED node
  ARM: dts: qcom: apq8060-dragonboard: add function and color to LED nodes
  ARM: dts: qcom: sdx55: Fix the IRQ trigger type for UART
  ARM: dts: qcom-msm8974: fix irq type on blsp2_uart1
  ...

Link: https://lore.kernel.org/r/20220713032024.1372427-1-bjorn.andersson@linaro.org
Signed-off-by: Arnd Bergmann <arnd@arndb.de>

Merge tag 'at91-dt-5.20-2' of git://git.kernel.org/pub/scm/linux/kernel/git/at91/linux into arm/dt

AT91 DT for v5.20 #2

It contains only the enablement of USB device port.

* tag 'at91-dt-5.20-2' of git://git.kernel.org/pub/scm/linux/kernel/git/at91/linux:
ARM: dts: kswitch-d10: enable the USB device port

Link: https://lore.kernel.org/r/20220713070602.1652118-1-claudiu.beznea@microchip.com
Signed-off-by: Arnd Bergmann <arnd@arndb.de>

Merge tag 'imx-soc-5.20' of git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux into arm/soc

i.MX SoC update for 5.20:

- Update the mx25_read_cpu_rev() function to support silicon revision 1.2
  for i.MX25 SoC.
- A minor indentation fix in Kconfig file.

* tag 'imx-soc-5.20' of git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux:
  ARM: imx25: support silicon revision 1.2
  ARM: imx: Kconfig: Fix indentation

Link: https://lore.kernel.org/r/20220709082951.15123-2-shawnguo@kernel.org
Signed-off-by: Arnd Bergmann <arnd@arndb.de>

Merge tag 'arm-soc/for-5.20/maintainers' of https://github.com/Broadcom/stblinux into arm/soc

This pull request contains Broadcom ARM-SoC MAINTAINERS file updates for
5.20, please pull the following:

- William and Anand updates the MAINTAINERS files with new BCA SoCs
  entries for a variety of PON/DSL SoCs: 63178, 63158, 4912, 6858, 6878,
  6846, 6855, 6756, 63146, 6856, 63148, 6813. 63138 is now listed as a
  supported BCA platform and joins the families just added.

* tag 'arm-soc/for-5.20/maintainers' of https://github.com/Broadcom/stblinux:
  MAINTAINERS: Move BCM63138 to bcmbca arch entry
  MAINTAINERS: Add BCM6813 to bcmbca arch entry
  MAINTAINERS: Add BCM63148 to bcmbca arch entry
  MAINTAINERS: Add BCM6856 to bcmbca arch entry
  MAINTAINERS: Add BCM63146 to bcmbca arch entry
  MAINTAINERS: Add BCM6756 to bcmbca arch entry
  MAINTAINERS: Add BCM6855 to bcmbca arch entry
  MAINTAINERS: Add BCM6846 to bcmbca arch entry
  MAINTAINERS: Add BCM6878 to bcmbca arch entry
  MAINTAINERS: Add BCM6858 to bcmbca arch entry
  MAINTAINERS: Add BCM4912 to bcmbca arch entry
  MAINTAINERS: Add BCM63158 to bcmbca arch entry
  MAINTAINERS: Add BCM63178 to bcmbca arch entry

Link: https://lore.kernel.org/r/20220711164451.3542127-7-f.fainelli@gmail.com
Signed-off-by: Arnd Bergmann <arnd@arndb.de>

docs/zh_CN: Add a new translation of reporting-regressions.rst

Last English version used:

commit d2b40ba2cce2 ("docs: *-regressions.rst: explain how quickly
issues should be handled")

Signed-off-by: Wu XiangCheng <bobwxc@email.cn>
Reviewed-by: Yanteng Si<siyanteng@loongson.cn>
Link: https://lore.kernel.org/r/YsbuDGIpUjOzfAAh@bobwxc.mipc
Signed-off-by: Jonathan Corbet <corbet@lwn.net>

Merge tag 'arm-soc/for-5.20/soc' of https://github.com/Broadcom/stblinux into arm/soc

This pull request contains Broadcom SoC drivers changes for 5.20, please
pull the following:

- Miaoqian fixes a device_node reference count leak in the Kona SMC
  initialization code

- William moves the 63138 support code to use CONFIG_ARCH_BCMBCA which
  is how all of those similar SoCs from the BCA division are supported
  moving forward. This includes machine code and UART low-level debug
  code.

* tag 'arm-soc/for-5.20/soc' of https://github.com/Broadcom/stblinux:
  ARM: debug: bcmbca: Replace ARCH_BCM_63XX with ARCH_BCMBCA
  arm: bcmbca: Add BCMBCA sub platforms
  arm: bcmbca: Move BCM63138 ARCH_BCM_63XX to ARCH_BCMBCA
  ARM: bcm: Fix refcount leak in bcm_kona_smc_init

Link: https://lore.kernel.org/r/20220711164451.3542127-8-f.fainelli@gmail.com
Signed-off-by: Arnd Bergmann <arnd@arndb.de>

Documentation: hyperv: Add overview of clocks and timers

Add documentation topic for clocks and timers when running as a
guest on Hyper-V.

Signed-off-by: Michael Kelley <mikelley@microsoft.com>
Link: https://lore.kernel.org/r/1657561704-12631-4-git-send-email-mikelley@microsoft.com
Signed-off-by: Jonathan Corbet <corbet@lwn.net>

Documentation: hyperv: Add overview of VMbus

Add documentation topic for using VMbus when running as a guest
on Hyper-V.

Signed-off-by: Michael Kelley <mikelley@microsoft.com>
Link: https://lore.kernel.org/r/1657561704-12631-3-git-send-email-mikelley@microsoft.com
Signed-off-by: Jonathan Corbet <corbet@lwn.net>

Documentation: hyperv: Add overview of Hyper-V enlightenments

Add an initial documentation topic for Linux enlightenments to
run as a guest on Microsoft's Hyper-V hypervisor, linked under
the "virt" documentation area. Update the virt doc index.rst
and the MAINTAINERS file.

Signed-off-by: Michael Kelley <mikelley@microsoft.com>
Link: https://lore.kernel.org/r/1657561704-12631-2-git-send-email-mikelley@microsoft.com
Signed-off-by: Jonathan Corbet <corbet@lwn.net>

Documentation/translations/zh_CN/mm/page_owner.rst: adjust some words

I noticed that there are some Chinese words that can be more accurate.
So I fix them as follows.

首先，英文原文中的"release" 在这个语境下
是物理页面“释放”的意思，而不是“发布”。
其次，标准表的第一列和第二列，
表达的是“长短键”的意思，第一列是“短键”，
而第二列是“长键”。这样翻译或会更清晰一些。

Signed-off-by: Yixuan Cao <caoyixuan2019@email.szu.edu.cn>
Acked-by: Yanteng Si<siyanteng@loongson.cn>
Link: https://lore.kernel.org/r/20220708172351.20928-1-caoyixuan2019@email.szu.edu.cn
Signed-off-by: Jonathan Corbet <corbet@lwn.net>

docs/zh_CN: core-api: Add watch_queue Chinese translation

Translate core-api/watch_queue.rst into Chinese.

Last English version used:
commit f5461124d59b ("Documentation: move watch_queue to core-api").

Signed-off-by: Binbin Zhou <zhoubinbin@loongson.cn>
Reviewed-by: Wu XiangCheng <bobwxc@email.cn>
Link: https://lore.kernel.org/r/20220710133604.31382-1-zhoubinbin@loongson.cn
Signed-off-by: Jonathan Corbet <corbet@lwn.net>

Documentation: siphash: Fix typo in the name of offsetofend macro

The siphash documentation misspelled "offsetendof" instead of
"offsetofend".

Fixes: 2c956a60778cbb ("siphash: add cryptographically secure PRF")
Signed-off-by: Dov Murik <dovmurik@linux.ibm.com>
Acked-by: Jason A. Donenfeld <Jason@zx2c4.com>
Link: https://lore.kernel.org/r/20220712104455.1408150-1-dovmurik@linux.ibm.com
Signed-off-by: Jonathan Corbet <corbet@lwn.net>

MAINTAINERS: mark linux-doc-tw-discuss mailing list moderated

After sending a patch to linux-doc-tw-discuss@lists.sourceforge.net, I got
the typical response for a moderated list:

  Your mail to 'linux-doc-tw-discuss' with the subject
  ....
  Is being held until the list moderator can review it for approval.

  The reason it is being held:

      Post by non-member to a members-only list

Mark this mailing list moderated in MAINTAINERS.

Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Link: https://lore.kernel.org/r/20220713043516.19290-1-lukas.bulwahn@gmail.com
Signed-off-by: Jonathan Corbet <corbet@lwn.net>

PCI: qcom: Power on PHY before DBI register accesses

IPQ8074 requires the PHY to be powered on before accessing DBI registers.
It's not clear whether other variants have the same dependency, but there
seems to be no reason for them to be different, so move all the DBI
accesses from .init() to .post_init() so they are all after phy_power_on().

[bhelgaas: commit log]
Link: https://lore.kernel.org/r/20220623155004.688090-2-robimarko@gmail.com
Signed-off-by: Robert Marko <robimarko@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>

PCI: qcom: Power on PHY before IPQ8074 DBI register accesses

Currently the Gen2 port in IPQ8074 will cause the system to hang as it
accesses DBI registers in qcom_pcie_init_2_3_3(), and those are only
accesible after phy_power_on().

Move the DBI read/writes to a new qcom_pcie_post_init_2_3_3(), which is
executed after phy_power_on().

Link: https://lore.kernel.org/r/20220623155004.688090-1-robimarko@gmail.com
Fixes: a0fd361db8e5 ("PCI: dwc: Move "dbi", "dbi2", and "addr_space" resource setup into common code")
Signed-off-by: Robert Marko <robimarko@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Cc: stable@vger.kernel.org # v5.11+

PCI: qcom: Set up rev 2.1.0 PARF_PHY before enabling clocks

We currently enable clocks BEFORE we write to PARF_PHY_CTRL reg to enable
clocks and resets. This causes the driver to never set to a ready state
with the error 'Phy link never came up'.

This is caused by the PHY clock getting enabled before setting the required
bits in the PARF regs.

A workaround for this was set but with this new discovery we can drop
the workaround and use a proper solution to the problem by just enabling
the clock only AFTER the PARF_PHY_CTRL bit is set.

This correctly sets up the PCIe link and makes it usable even when a
bootloader leaves the PCIe link in an undefined state.

Fixes: 82a823833f4e ("PCI: qcom: Add Qualcomm PCIe controller driver")
Link: https://lore.kernel.org/r/20220708222743.27019-1-ansuelsmth@gmail.com
Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>

PCI/AER: Iterate over error counters instead of error strings

Previously we iterated over AER stat *names*, e.g.,
aer_correctable_error_string[32], but the actual stat *counters* may not be
that large, e.g., pdev->aer_stats->dev_cor_errs[16], which means that we
printed junk in the sysfs stats files.

Iterate over the stat counter arrays instead of the names to avoid this
junk.

Also, added a build time check to make sure all
counters have entries in strings array.

Fixes: 0678e3109a3c ("PCI/AER: Simplify __aer_print_error()")
Link: https://lore.kernel.org/r/20220509181441.31884-1-mkhalfella@purestorage.com
Reported-by: Meeta Saggi <msaggi@purestorage.com>
Signed-off-by: Mohamed Khalfella <mkhalfella@purestorage.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Meeta Saggi <msaggi@purestorage.com>
Reviewed-by: Eric Badger <ebadger@purestorage.com>
Cc: stable@vger.kernel.org

PCI/AER: Enable error reporting when AER is native

If we have native control of AER, set the following error reporting enable
bits:

  - Correctable Error Reporting Enable
  - Non-Fatal Error Reporting Enable
  - Fatal Error Reporting Enable
  - Unsupported Request Reporting Enable

Note that these bits are all in the Device Control register and are not
AER-specific.

This affects all devices with an AER capability, including hot-added
devices.

Please note that this change is quite invasive, as error reporting now will
be enabled for all available PCIe Endpoints, which was previously not the
case.

When "pci=noaer" is selected, error reporting stays disabled of course.

[bhelgaas: commit log, note error reporting is not AER-specific]
Link: https://lore.kernel.org/r/20220125071820.2247260-4-sr@denx.de
Signed-off-by: Stefan Roese <sr@denx.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Pali Rohár <pali@kernel.org>
Cc: Bharat Kumar Gogada <bharat.kumar.gogada@xilinx.com>
Cc: Michal Simek <michal.simek@xilinx.com>
Cc: Yao Hongbo <yaohongbo@linux.alibaba.com>
Cc: Naveen Naidu <naveennaidu479@gmail.com>

PCI/portdrv: Don't disable AER reporting in get_port_device_capability()

AER reporting is currently disabled in the DevCtl registers of all non Root
Port PCIe devices on systems using pcie_ports_native || host->native_aer,
disabling AER completely in such systems. This is because 2bd50dd800b5
("PCI: PCIe: Disable PCIe port services during port initialization"), added
a call to pci_disable_pcie_error_reporting() *after* the AER setup was
completed for the PCIe device tree.

Here a longer analysis about the current status of AER enabling /
disabling upon bootup provided by Bjorn:

  pcie_portdrv_probe
    pcie_port_device_register
      get_port_device_capability
        pci_disable_pcie_error_reporting
          clear CERE NFERE FERE URRE               # <-- disable for RP USP DSP
      pcie_device_init
        device_register                            # new AER service device
          aer_probe
            aer_enable_rootport                    # RP only
              set_downstream_devices_error_reporting
                set_device_error_reporting         # self (RP)
                  if (RP || USP || DSP)
                    pci_enable_pcie_error_reporting
                      set CERE NFERE FERE URRE     # <-- enable for RP
                pci_walk_bus
                  set_device_error_reporting
                    if (RP || USP || DSP)
                      pci_enable_pcie_error_reporting
                        set CERE NFERE FERE URRE   # <-- enable for USP DSP

In a typical Root Port -> Endpoint hierarchy, the above:
  - Disables Error Reporting for the Root Port,
  - Enables Error Reporting for the Root Port,
  - Does NOT enable Error Reporting for the Endpoint because it is not a
    Root Port or Switch Port.

In a deeper Root Port -> Upstream Switch Port -> Downstream Switch
Port -> Endpoint hierarchy:
  - Disables Error Reporting for the Root Port,
  - Enables Error Reporting for the Root Port,
  - Enables Error Reporting for both Switch Ports,
  - Does NOT enable Error Reporting for the Endpoint because it is not a
    Root Port or Switch Port,
  - Disables Error Reporting for the Switch Ports when pcie_portdrv_probe()
    claims them.  AER does not re-enable it because these are not Root
    Ports.

Remove this call to pci_disable_pcie_error_reporting() from
get_port_device_capability(), leaving the already enabled AER configuration
intact. With this change, AER is enabled in the Root Port and the PCIe
switch upstream and downstream ports. Only the PCIe Endpoints don't have
AER enabled yet. A follow-up patch will take care of this Endpoint
enabling.

Fixes: 2bd50dd800b5 ("PCI: PCIe: Disable PCIe port services during port initialization")
Link: https://lore.kernel.org/r/20220125071820.2247260-3-sr@denx.de
Signed-off-by: Stefan Roese <sr@denx.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Pali Rohár <pali@kernel.org>
Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
Cc: Bharat Kumar Gogada <bharat.kumar.gogada@xilinx.com>
Cc: Michal Simek <michal.simek@xilinx.com>
Cc: Yao Hongbo <yaohongbo@linux.alibaba.com>
Cc: Naveen Naidu <naveennaidu479@gmail.com>

ACPI: CPPC: Fix enabling CPPC on AMD systems with shared memory

When commit 72f2ecb7ece7 ("ACPI: bus: Set CPPC _OSC bits for all
and when CPPC_LIB is supported") was introduced, we found collateral
damage that a number of AMD systems that supported CPPC but
didn't advertise support in _OSC stopped having a functional
amd-pstate driver. The _OSC was only enforced on Intel systems at that
time.

This was fixed for the MSR based designs by commit 8b356e536e69f
("ACPI: CPPC: Don't require _OSC if X86_FEATURE_CPPC is supported")
but some shared memory based designs also support CPPC but haven't
advertised support in the _OSC. Add support for those designs as well by
hardcoding the list of systems.

Fixes: 72f2ecb7ece7 ("ACPI: bus: Set CPPC _OSC bits for all and when CPPC_LIB is supported")
Fixes: 8b356e536e69f ("ACPI: CPPC: Don't require _OSC if X86_FEATURE_CPPC is supported")
Link: https://lore.kernel.org/all/3559249.JlDtxWtqDm@natalenko.name/
Cc: 5.18+ <stable@vger.kernel.org> # 5.18+
Reported-and-tested-by: Oleksandr Natalenko <oleksandr@natalenko.name>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

vf/remap: return the amount of bytes actually deduplicated

When using the FIDEDUPRANGE ioctl, in case of success the requested size
is returned. In some cases this might not be the actual amount of bytes
deduplicated.

This change modifies vfs_dedupe_file_range() to report the actual amount
of bytes deduplicated, instead of the requested amount.

Link: https://lore.kernel.org/linux-fsdevel/5548ef63-62f9-4f46-5793-03165ceccacc@tu-darmstadt.de/
Reported-by: Ansgar Lößer <ansgar.loesser@kom.tu-darmstadt.de>
Reported-by: Max Schlecht <max.schlecht@informatik.hu-berlin.de>
Reported-by: Björn Scheuermann <scheuermann@kom.tu-darmstadt.de>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Darrick J Wong <djwong@kernel.org>
Signed-off-by: Ansgar Lößer <ansgar.loesser@kom.tu-darmstadt.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

libbpf: Error out when binary_path is NULL for uprobe and USDT

binary_path is a required non-null parameter for bpf_program__attach_usdt
and bpf_program__attach_uprobe_opts. Check it against NULL to prevent
coredump on strchr.

Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220712025745.2703995-1-hengqi.chen@gmail.com

Merge tag 'cgroup-for-5.19-rc6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup

Pull cgroup fix from Tejun Heo:
"Fix an old and subtle bug in the migration path.

  css_sets are used to track tasks and migrations are tasks moving from
  a group of css_sets to another group of css_sets. The migration path
  pins all source and destination css_sets in the prep stage.

  Unfortunately, it was overloading the same list_head entry to track
  sources and destinations, which got confused for migrations which are
  partially identity leading to use-after-frees.

  Fixed by using dedicated list_heads for tracking sources and
  destinations"

* tag 'cgroup-for-5.19-rc6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
  cgroup: Use separate src/dst nodes when preloading css_sets for migration

Merge tag 'dt-fixes-for-palmer-5.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/conor/linux.git into fixes

Microchip RISC-V devicetree fixes for 5.19-rc6

A single fix for mpfs.dtsi:
- The l2 cache controller was never hooked up in the dt, so userspace
  is presented with the wrong topology information, so it has been
  hooked up.

* tag 'dt-fixes-for-palmer-5.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/conor/linux.git:
  riscv: dts: microchip: hook up the mpfs' l2cache

platform/chrome: cros_ec_typec: Use dev_err_probe on port register fail

The typec_register_port() can fail with EPROBE_DEFER if the endpoint
node hasn't probed yet. In order to avoid spamming the log with errors
in that case, log using dev_err_probe().

Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>
Reviewed-by: Guenter Roeck <groeck@chromium.org>
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Link: https://lore.kernel.org/r/20220712214554.545035-1-nfraprado@collabora.com
Signed-off-by: Prashant Malani <pmalani@chromium.org>

fs/remap: constrain dedupe of EOF blocks

If dedupe of an EOF block is not constrainted to match against only
other EOF blocks with the same EOF offset into the block, it can
match against any other block that has the same matching initial
bytes in it, even if the bytes beyond EOF in the source file do
not match.

Fix this by constraining the EOF block matching to only match
against other EOF blocks that have identical EOF offsets and data.
This allows "whole file dedupe" to continue to work without allowing
eof blocks to randomly match against partial full blocks with the
same data.

Reported-by: Ansgar Lößer <ansgar.loesser@tu-darmstadt.de>
Fixes: 1383a7ed6749 ("vfs: check file ranges before cloning files")
Link: https://lore.kernel.org/linux-fsdevel/a7c93559-4ba1-df2f-7a85-55a143696405@tu-darmstadt.de/
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

drm/i915: Add lmem_bar_size modparam

For testing purposes, support forcing the lmem_bar_size through a new
modparam. In CI we only have a limited number of configurations for DG2,
but we still need to be reasonably sure we get a usable device (also
verifying we report the correct values for things like
probed_cpu_visible_size etc) with all the potential lmem_bar sizes that
we might expect see in the wild.

v2: Update commit message and a minor modification.(Matt)

v3: Optimised lmem bar size code and modified code to resize
bar maximum upto lmem_size instead of maximum supported size.(Nirmoy)

v4: Optimised lmem bar size code.(Nirmoy)

Signed-off-by: Priyanka Dandamudi <priyanka.dandamudi@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Nirmoy Das <nirmoy.das@intel.com>
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220713130209.2573233-3-priyanka.dandamudi@intel.com

drm/i915: Add support for LMEM PCIe resizable bar

Add support for the local memory PICe resizable bar, so that
local memory can be resized to the maximum size supported by the device,
and mapped correctly to the PCIe memory bar. It is usual that GPU
devices expose only 256MB BARs primarily to be compatible with 32-bit
systems. So, those devices cannot claim larger memory BAR windows size due
to the system BIOS limitation. With this change, it would be possible to
reprogram the windows of the bridge directly above the requesting device
on the same BAR type.

v2:Moved code to gt/intel_region_lmem.c and used only
single underscore for function names.(Jani)

v3: Optimised code.

Signed-off-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com>
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
Cc: Stuart Summers <stuart.summers@intel.com>
Cc: Michael J Ruhl <michael.j.ruhl@intel.com>
Cc: Prathap Kumar Valsan <prathap.kumar.valsan@intel.com>
Cc: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Priyanka Dandamudi <priyanka.dandamudi@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220713130209.2573233-2-priyanka.dandamudi@intel.com

drm/i915: Correct ss -> steering calculation for pre-Xe_HP platforms

Accidental use of a "SLICE" macro where a "SUBSLICE" macro was intended
causes the group ID for steering to be calculated incorrectly on
pre-Xe_HP platforms.

Fixes: 9a92732f040a ("drm/i915/gt: Add general DSS steering iterator to intel_gt_mcr")
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220712220513.3451794-1-matthew.d.roper@intel.com

drm/amd/display: Ensure valid event timestamp for cursor-only commits

Requires enabling the vblank machinery for them.

Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/2030
Signed-off-by: Michel Dänzer <mdaenzer@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org

drm/amd/display: correct check of coverage blend mode

Check the value of per_pixel_alpha to decide whether the Coverage pixel
blend mode is applicable or not.

Fixes: 76818cdd11a2 ("drm/amd/display: add Coverage blend mode for overlay plane")
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Melissa Wen <mwen@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/pm: Prevent divide by zero

divide error: 0000 [#1] SMP PTI
CPU: 3 PID: 78925 Comm: tee Not tainted 5.15.50-1-lts #1
Hardware name: MSI MS-7A59/Z270 SLI PLUS (MS-7A59), BIOS 1.90 01/30/2018
RIP: 0010:smu_v11_0_set_fan_speed_rpm+0x11/0x110 [amdgpu]

Speed is user-configurable through a file.
I accidentally set it to zero, and the driver crashed.

Reviewed-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: André Almeida <andrealmeid@igalia.com>
Signed-off-by: Yefim Barashkin <mr.b34r@kolabnow.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org

drm/amd/display: Only use depth 36 bpp linebuffers on DCN display engines.

Various DCE versions had trouble with 36 bpp lb depth, requiring fixes,
last time in commit 353ca0fa5630 ("drm/amd/display: Fix 10bit 4K display
on CIK GPUs") for DCE-8. So far >= DCE-11.2 was considered ok, but now I
found out that on DCE-11.2 it causes dithering when there shouldn't be
any, so identity pixel passthrough with identity gamma LUTs doesn't work
when it should. This breaks various important neuroscience applications,
as reported to me by scientific users of Polaris cards under Ubuntu 22.04
with Linux 5.15, and confirmed by testing it myself on DCE-11.2.

Lets only use depth 36 for DCN engines, where my testing showed that it
is both necessary for high color precision output, e.g., RGBA16 fb's,
and not harmful, as far as more than one year in real-world use showed.

DCE engines seem to work fine for high precision output at 30 bpp, so
this ("famous last words") depth 30 should hopefully fix all known problems
without introducing new ones.

Successfully retested on DCE-11.2 Polaris and DCN-1.0 Raven Ridge on
top of Linux 5.19.0-rc2 + drm-next.

Fixes: 353ca0fa5630 ("drm/amd/display: Fix 10bit 4K display on CIK GPUs")
Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Tested-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Cc: stable@vger.kernel.org # 5.14.0
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdkfd: correct the MEC atomic support firmware checking for GC 10.3.7

On the GC 10.3.7 platform the initial MEC release version #3 can support
atomic operation,so need correct and set its MEC atomic support version to #3.

Signed-off-by: Prike Liang <Prike.Liang@amd.com>
Reviewed-by: Aaron Liu <aaron.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org # 5.18.x

drm/amd/display: Ignore First MST Sideband Message Return Error

[why]
First MST sideband message returns AUX_RET_ERROR_HPD_DISCON
on certain intel platform. Aux transaction considered failure
if HPD unexpected pulled low. The actual aux transaction success
in such case, hence do not return error.

[how]
Not returning error when AUX_RET_ERROR_HPD_DISCON detected
on the first sideband message.

v2: squash in additional DMI entries
v3: squash in static fix

Signed-off-by: Fangzhi Zuo <Jerry.Zuo@amd.com>
Acked-by: Solomon Chiu <solomon.chiu@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org

iio: adc: xilinx-xadc: Drop duplicate NULL check in xadc_parse_dt()

The fwnode_for_each_child_node() is NULL-aware, no need to check
its parameters outside. Drop duplicate NULL check in xadc_parse_dt().

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20220531141118.64540-2-andriy.shevchenko@linux.intel.com
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>

iio: adc: xilinx-xadc: Make use of device properties

Convert the module to be property provider agnostic and allow
it to be used on non-OF platforms.

Add mod_devicetable.h include.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Michal Simek <michal.simek@amd.com>
Link: https://lore.kernel.org/r/20220531141118.64540-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>

iommu/vt-d: avoid invalid memory access via node_online(NUMA_NO_NODE)

KASAN reports:

[ 4.668325][ T0] BUG: KASAN: wild-memory-access in dmar_parse_one_rhsa (arch/x86/include/asm/bitops.h:214 arch/x86/include/asm/bitops.h:226 include/asm-generic/bitops/instrumented-non-atomic.h:142 include/linux/nodemask.h:415 drivers/iommu/intel/dmar.c:497)
[    4.676149][    T0] Read of size 8 at addr 1fffffff85115558 by task swapper/0/0
[    4.683454][    T0]
[    4.685638][    T0] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.19.0-rc3-00004-g0e862838f290 #1
[    4.694331][    T0] Hardware name: Supermicro SYS-5018D-FN4T/X10SDV-8C-TLN4F, BIOS 1.1 03/02/2016
[    4.703196][    T0] Call Trace:
[    4.706334][    T0]  <TASK>
[ 4.709133][ T0] ? dmar_parse_one_rhsa (arch/x86/include/asm/bitops.h:214 arch/x86/include/asm/bitops.h:226 include/asm-generic/bitops/instrumented-non-atomic.h:142 include/linux/nodemask.h:415 drivers/iommu/intel/dmar.c:497)

after converting the type of the first argument (@nr, bit number)
of arch_test_bit() from `long` to `unsigned long`[0].

Under certain conditions (for example, when ACPI NUMA is disabled
via command line), pxm_to_node() can return %NUMA_NO_NODE (-1).
It is valid 'magic' number of NUMA node, but not valid bit number
to use in bitops.
node_online() eventually descends to test_bit() without checking
for the input, assuming it's on caller side (which might be good
for perf-critical tasks). There, -1 becomes %ULONG_MAX which leads
to an insane array index when calculating bit position in memory.

For now, add an explicit check for @node being not %NUMA_NO_NODE
before calling test_bit(). The actual logics didn't change here
at all.

[0] https://github.com/norov/linux/commit/0e862838f290147ea9c16db852d8d494b552d38d

Fixes: ee34b32d8c29 ("dmar: support for parsing Remapping Hardware Static Affinity structure")
Cc: stable@vger.kernel.org # 2.6.33+
Reported-by: kernel test robot <oliver.sang@intel.com>
Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Yury Norov <yury.norov@gmail.com>

ASoC/SoundWire: Intel: add sdw BE dai trigger

Merge series from Bard Liao <yung-chuan.liao@linux.intel.com>:

For SOF IPC4, we need to set pipeline state in BE DAI trigger.

hwmon: (asus-ec-sensors) add definitions for ROG ZENITH II EXTREME

Add definitions for ROG ZENITH II EXTREME and some unknown yet
temperature sensors in the second EC bank. Details are available at
[1, 2].

[1] https://github.com/zeule/asus-ec-sensors/pull/26
[2] https://github.com/zeule/asus-ec-sensors/issues/16

Signed-off-by: Urs Schroffenegger <nabajour@lampshade.ch>
Signed-off-by: Eugene Shalygin <eugene.shalygin@gmail.com>
Link: https://lore.kernel.org/r/20220710202639.1812058-2-eugene.shalygin@gmail.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>

hwmon: (aquacomputer_d5next) Move device-specific data into struct aqc_data

As preparation for adding support for more devices in upcoming patches,
move device-specific data, such as number of fans, temperature sensors,
register offsets etc. to struct aqc_data. This is made possible by
the fact that the supported Aquacomputer devices share the same layouts
of sensor substructures. This allows aqc_raw_event() and others to stay
general and not be cluttered with similar loops for each device.

Signed-off-by: Jack Doan <me@jackdoan.com>
Signed-off-by: Aleksa Savic <savicaleksa83@gmail.com>
Link: https://lore.kernel.org/r/20220707115050.90021-1-savicaleksa83@gmail.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>

hwmon: (asus-ec-sensors) add missing sensors for X570-I GAMING

VRM and chipset temperature for ROG STRIX X570-I GAMING were missing
according to a user contribution to the LHM project [1].

[1] https://github.com/LibreHardwareMonitor/LibreHardwareMonitor/pull/767

Signed-off-by: Eugene Shalygin <eugene.shalygin@gmail.com>
Link: https://lore.kernel.org/r/20220710085539.1682869-1-eugene.shalygin@gmail.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>

hwmon: (drivetemp) Add module alias

Adding a MODULE_ALIAS() to drivetemp will make the driver easier
for modprobe to autoprobe.

Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Link: https://lore.kernel.org/r/20220712214624.1845158-1-linus.walleij@linaro.org
Fixes: 5b46903d8bf3 ("hwmon: Driver for disk and solid state drives with temperature sensors")
Signed-off-by: Guenter Roeck <linux@roeck-us.net>

hwmon: (asus_wmi_sensors) Save a few bytes of memory

The first 'for' loop of asus_wmi_configure_sensor_setup() only computes
the number and type of sensors that exist in the system.

Here, the 'temp_sensor' structure is only used to store the data collected
by asus_wmi_sensor_info(). There is no point in using a devm_ variant for
this allocation. This wastes some memory for no good reason.

Use the stack instead.

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Link: https://lore.kernel.org/r/e23cea6c489fabb109a61e8a33d146a6b74c0529.1656741926.git.christophe.jaillet@wanadoo.fr
Signed-off-by: Guenter Roeck <linux@roeck-us.net>

hwmon: (lm90) Use worker for alarm notifications

Reporting alarms using hwmon_notify_event() may result in a callback
from the thermal subsystem. This means that such notifications must
not hold the update lock to avoid a deadlock. To avoid this situation,
use a worker to handle notifications.

Reported-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
Tested-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
Fixes: f6d0775119fb ("hwmon: (lm90) Rework alarm/status handling")
Signed-off-by: Guenter Roeck <linux@roeck-us.net>

hwmon: (asus-ec-sensors) add support for Maximus XI Hero

Add definitions for ROG MAXIMUS XI HERO and ROG MAXIMUS XI HERO (WI-FI)
boards.

Signed-off-by: Michael Carns <mike@carns.com>
Signed-off-by: Eugene Shalygin <eugene.shalygin@gmail.com>
Link: https://lore.kernel.org/r/20220627225437.87462-1-eugene.shalygin@gmail.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>

hwmon: (dell-smm) Improve assembly code

The new assembly code works on both 32 bit and 64 bit
cpus and allows for more compiler optimisations.
Since clang runs out of registers on 32 bit x86 when
using CC_OUT, we need to execute "setc" ourself.
Also modify the debug message so we can still see
the result (eax) when the carry flag was set.

Tested with 32 bit and 64 bit kernels on a Dell Inspiron 3505.

Signed-off-by: Armin Wolf <W_Armin@gmx.de>
Link: https://lore.kernel.org/r/20220220190851.17965-1-W_Armin@gmx.de
[groeck: Rebased to v5.19-rc3]
Signed-off-by: Guenter Roeck <linux@roeck-us.net>

hwmon: (pmbus/ltc2978) Set voltage resolution

The LTC2977 regulator does not set the regulator_desc .n_voltages value
which is needed in order to let the regulator core list the regulator
voltage range.

This patch defines a regulator_desc with a voltage range, and uses it
for defining voltage resolution for regulators LTC2972/LTC2974/LTC2975/
LTC2977/LTC2978/LTC2979/LTC2980/LTM2987 based on that they all have a 16
bit ADC with the same stepwise 122.07uV resolution. It also scales the
resolution to a 1mV resolution which is easier to handle.

Signed-off-by: Mårten Lindahl <marten.lindahl@axis.com>
Link: https://lore.kernel.org/r/20220614095144.3472305-1-marten.lindahl@axis.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>