Marc Zyngier [Wed, 20 Jul 2022 10:52:19 +0000 (11:52 +0100)]
arm64: Delay initialisation of cpuinfo_arm64::reg_{zcr,smcr}
Even if we are now able to tell the kernel to avoid exposing SVE/SME
from the command line, we still have a couple of places where we
unconditionally access the ZCR_EL1 (resp. SMCR_EL1) registers.
On systems with broken firmwares, this results in a crash even if
arm64.nosve (resp. arm64.nosme) was passed on the command-line.
To avoid this, only update cpuinfo_arm64::reg_{zcr,smcr} once
we have computed the sanitised version for the corresponding
feature registers (ID_AA64PFR0 for SVE, and ID_AA64PFR1 for
SME). This results in some minor refactoring.
Reviewed-by: Mark Brown <broonie@kernel.org> Reviewed-by: Peter Collingbourne <pcc@google.com> Tested-by: Peter Collingbourne <pcc@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220720105219.1755096-1-maz@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
Will Deacon [Mon, 25 Jul 2022 09:59:15 +0000 (10:59 +0100)]
Merge branch 'for-next/boot' into for-next/core
* for-next/boot: (34 commits)
arm64: fix KASAN_INLINE
arm64: Add an override for ID_AA64SMFR0_EL1.FA64
arm64: Add the arm64.nosve command line option
arm64: Add the arm64.nosme command line option
arm64: Expose a __check_override primitive for oddball features
arm64: Allow the idreg override to deal with variable field width
arm64: Factor out checking of a feature against the override into a macro
arm64: Allow sticky E2H when entering EL1
arm64: Save state of HCR_EL2.E2H before switch to EL1
arm64: Rename the VHE switch to "finalise_el2"
arm64: mm: fix booting with 52-bit address space
arm64: head: remove __PHYS_OFFSET
arm64: lds: use PROVIDE instead of conditional definitions
arm64: setup: drop early FDT pointer helpers
arm64: head: avoid relocating the kernel twice for KASLR
arm64: kaslr: defer initialization to initcall where permitted
arm64: head: record CPU boot mode after enabling the MMU
arm64: head: populate kernel page tables with MMU and caches on
arm64: head: factor out TTBR1 assignment into a macro
arm64: idreg-override: use early FDT mapping in ID map
...
Will Deacon [Mon, 25 Jul 2022 09:58:10 +0000 (10:58 +0100)]
Merge branch 'for-next/cpufeature' into for-next/core
* for-next/cpufeature:
arm64/hwcap: Support FEAT_EBF16
arm64/cpufeature: Store elf_hwcaps as a bitmap rather than unsigned long
arm64/hwcap: Document allocation of upper bits of AT_HWCAP
arm64: trap implementation defined functionality in userspace
Will Deacon [Mon, 25 Jul 2022 09:57:26 +0000 (10:57 +0100)]
Merge branch 'for-next/stacktrace' into for-next/core
* for-next/stacktrace:
arm64: Copy the task argument to unwind_state
arm64: Split unwind_init()
arm64: stacktrace: use non-atomic __set_bit
arm64: kasan: do not instrument stacktrace.c
Will Deacon [Mon, 25 Jul 2022 09:57:14 +0000 (10:57 +0100)]
Merge branch 'for-next/perf' into for-next/core
* for-next/perf:
drivers/perf: arm_spe: Fix consistency of SYS_PMSCR_EL1.CX
perf: RISC-V: Add of_node_put() when breaking out of for_each_of_cpu_node()
docs: perf: Include hns3-pmu.rst in toctree to fix 'htmldocs' WARNING
drivers/perf: hisi: add driver for HNS3 PMU
drivers/perf: hisi: Add description for HNS3 PMU driver
drivers/perf: riscv_pmu_sbi: perf format
perf/arm-cci: Use the bitmap API to allocate bitmaps
drivers/perf: riscv_pmu: Add riscv pmu pm notifier
perf: hisi: Extract hisi_pmu_init
perf/marvell_cn10k: Fix TAD PMU register offset
perf/marvell_cn10k: Remove useless license text when SPDX-License-Identifier is already used
arm64: cpufeature: Allow different PMU versions in ID_DFR0_EL1
perf/arm-cci: fix typo in comment
drivers/perf:Directly use ida_alloc()/free()
drivers/perf: Directly use ida_alloc()/free()
Will Deacon [Mon, 25 Jul 2022 09:57:08 +0000 (10:57 +0100)]
Merge branch 'for-next/mte' into for-next/core
* for-next/mte:
arm64: kasan: Revert "arm64: mte: reset the page tag in page->flags"
mm: kasan: Skip page unpoisoning only if __GFP_SKIP_KASAN_UNPOISON
mm: kasan: Skip unpoisoning of user pages
mm: kasan: Ensure the tags are visible before the tag in page->flags
Will Deacon [Mon, 25 Jul 2022 09:56:57 +0000 (10:56 +0100)]
Merge branch 'for-next/misc' into for-next/core
* for-next/misc:
arm64/mm: use GENMASK_ULL for TTBR_BADDR_MASK_52
arm64: numa: Don't check node against MAX_NUMNODES
arm64: mm: Remove assembly DMA cache maintenance wrappers
arm64/mm: Define defer_reserve_crashkernel()
arm64: fix oops in concurrently setting insn_emulation sysctls
arm64: Do not forget syscall when starting a new thread.
arm64: boot: add zstd support
Will Deacon [Mon, 25 Jul 2022 09:56:49 +0000 (10:56 +0100)]
Merge branch 'for-next/kpti' into for-next/core
* for-next/kpti:
arm64: correct the effect of mitigations off on kpti
arm64: entry: simplify trampoline data page
arm64: mm: install KPTI nG mappings with MMU enabled
arm64: kpti-ng: simplify page table traversal logic
Will Deacon [Mon, 25 Jul 2022 09:56:16 +0000 (10:56 +0100)]
Merge branch 'for-next/extable' into for-next/core
* for-next/extable:
arm64: extable: cleanup redundant extable type EX_TYPE_FIXUP
arm64: extable: move _cond_extable to _cond_uaccess_extable
arm64: extable: make uaaccess helper use extable type EX_TYPE_UACCESS_ERR_ZERO
arm64: asm-extable: add asm uacess helpers
arm64: asm-extable: move data fields
arm64: extable: add new extable type EX_TYPE_KACCESS_ERR_ZERO support
Mark Rutland [Wed, 13 Jul 2022 14:09:49 +0000 (15:09 +0100)]
arm64: fix KASAN_INLINE
Since commit:
a004393f45d9a55e ("arm64: idreg-override: use early FDT mapping in ID map")
Kernels built with KASAN_INLINE=y die early in boot before producing any
console output. This is because the accesses made to the FDT (e.g. in
generic string processing functions) are instrumented with KASAN, and
with KASAN_INLINE=y any access to an address in TTBR0 results in a bogus
shadow VA, resulting in a data abort.
This patch fixes this by reverting commits:
7559d9f97581654f ("arm64: setup: drop early FDT pointer helpers") bd0c3fa21878b6d0 ("arm64: idreg-override: use early FDT mapping in ID map")
... and using the TTBR1 fixmap mapping of the FDT.
Note that due to a later commit:
b65e411d6cc2f12a ("arm64: Save state of HCR_EL2.E2H before switch to EL1")
... which altered the prototype of init_feature_override() (and
invocation from head.S), commit bd0c3fa21878b6d0 does not revert
cleanly, and I've fixed that up manually.
Fixes: a004393f45d9 ("arm64: idreg-override: use early FDT mapping in ID map") Cc: Ard Biesheuvel <ardb@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Will Deacon <will@kernel.org> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20220713140949.45440-1-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org>
Mark Brown [Thu, 7 Jul 2022 10:36:32 +0000 (11:36 +0100)]
arm64/hwcap: Support FEAT_EBF16
The v9.2 feature FEAT_EBF16 provides support for an extended BFloat16 mode.
Allow userspace to discover system support for this feature by adding a
hwcap for it.
Mark Brown [Thu, 7 Jul 2022 10:36:31 +0000 (11:36 +0100)]
arm64/cpufeature: Store elf_hwcaps as a bitmap rather than unsigned long
When we added support for AT_HWCAP2 we took advantage of the fact that we
have limited hwcaps to the low 32 bits and stored it along with AT_HWCAP
in a single unsigned integer. Thanks to the ever expanding capabilities of
the architecture we have now allocated all 64 of the bits in an unsigned
long so in preparation for adding more hwcaps convert elf_hwcap to be a
bitmap instead, with 64 bits allocated to each AT_HWCAP.
There should be no functional change from this patch.
Mark Brown [Thu, 7 Jul 2022 10:36:30 +0000 (11:36 +0100)]
arm64/hwcap: Document allocation of upper bits of AT_HWCAP
The top two bits of AT_HWCAP are reserved for use by glibc and the rest of
the top 32 bits are being kept unallocated for potential use by glibc.
Document this in the header.
Barry Song [Wed, 20 Jul 2022 09:37:37 +0000 (21:37 +1200)]
arm64: enable THP_SWAP for arm64
THP_SWAP has been proven to improve the swap throughput significantly
on x86_64 according to commit bd4c82c22c367e ("mm, THP, swap: delay
splitting THP after swapped out").
As long as arm64 uses 4K page size, it is quite similar with x86_64
by having 2MB PMD THP. THP_SWAP is architecture-independent, thus,
enabling it on arm64 will benefit arm64 as well.
A corner case is that MTE has an assumption that only base pages
can be swapped. We won't enable THP_SWAP for ARM64 hardware with
MTE support until MTE is reworked to coexist with THP_SWAP.
A micro-benchmark is written to measure thp swapout throughput as
below,
unsigned long long tv_to_ms(struct timeval tv)
{
return tv.tv_sec * 1000 + tv.tv_usec / 1000;
}
printf("swp out bandwidth: %ld bytes/ms\n",
SIZE/(tv_to_ms(tv_e) - tv_to_ms(tv_b)));
}
Testing is done on rk3568 64bit Quad Core Cortex-A55 platform -
ROCK 3A.
thp swp throughput w/o patch: 2734bytes/ms (mean of 10 tests)
thp swp throughput w/ patch: 3331bytes/ms (mean of 10 tests)
Cc: "Huang, Ying" <ying.huang@intel.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Hugh Dickins <hughd@google.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Steven Price <steven.price@arm.com> Cc: Yang Shi <shy828301@gmail.com> Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com> Signed-off-by: Barry Song <v-songbaohua@oppo.com> Link: https://lore.kernel.org/r/20220720093737.133375-1-21cnbao@gmail.com Signed-off-by: Will Deacon <will@kernel.org>
James Morse [Thu, 14 Jul 2022 16:15:23 +0000 (17:15 +0100)]
arm64: errata: Remove AES hwcap for COMPAT tasks
Cortex-A57 and Cortex-A72 have an erratum where an interrupt that
occurs between a pair of AES instructions in aarch32 mode may corrupt
the ELR. The task will subsequently produce the wrong AES result.
The AES instructions are part of the cryptographic extensions, which are
optional. User-space software will detect the support for these
instructions from the hwcaps. If the platform doesn't support these
instructions a software implementation should be used.
Remove the hwcap bits on affected parts to indicate user-space should
not use the AES instructions.
arm64: numa: Don't check node against MAX_NUMNODES
When the NUMA nodes are sorted by checking ACPI SRAT (GICC AFFINITY)
sub-table, it's impossible for acpi_map_pxm_to_node() to return
any value, which is greater than or equal to MAX_NUMNODES. Lets drop
the unnecessary check in acpi_numa_gicc_affinity_init().
drivers/perf: arm_spe: Fix consistency of SYS_PMSCR_EL1.CX
The arm_spe_pmu driver will enable SYS_PMSCR_EL1.CX in order to add CONTEXT
packets into the traces, if the owner of the perf event runs with required
capabilities i.e CAP_PERFMON or CAP_SYS_ADMIN via perfmon_capable() helper.
The value of this bit is computed in the arm_spe_event_to_pmscr() function
but the check for capabilities happens in the pmu event init callback i.e
arm_spe_pmu_event_init(). This suggests that the value of the CX bit should
remain consistent for the duration of the perf session.
However, the function arm_spe_event_to_pmscr() may be called later during
the event start callback i.e arm_spe_pmu_start() when the "current" process
is not the owner of the perf session, hence the CX bit setting is currently
not consistent.
One way to fix this, is by caching the required value of the CX bit during
the initialization of the PMU event, so that it remains consistent for the
duration of the session. It uses currently unused 'event->hw.flags' element
to cache perfmon_capable() value, which can be referred during event start
callback to compute SYS_PMSCR_EL1.CX. This ensures consistent availability
of context packets in the trace as per event owner capabilities.
Drop BIT(SYS_PMSCR_EL1_CX_SHIFT) check in arm_spe_pmu_event_init(), because
now CX bit cannot be set in arm_spe_event_to_pmscr() with perfmon_capable()
disabled.
Cc: Will Deacon <will@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Alexey Budankov <alexey.budankov@linux.intel.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-kernel@vger.kernel.org Fixes: d5d9696b0380 ("drivers/perf: Add support for ARMv8.2 Statistical Profiling Extension") Reported-by: German Gomez <german.gomez@arm.com> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com> Link: https://lore.kernel.org/r/20220714061302.2715102-1-anshuman.khandual@arm.com Signed-off-by: Will Deacon <will@kernel.org>
Liang He [Fri, 15 Jul 2022 13:03:30 +0000 (21:03 +0800)]
perf: RISC-V: Add of_node_put() when breaking out of for_each_of_cpu_node()
In pmu_sbi_setup_irqs(), we should call of_node_put() for the 'cpu'
when breaking out of for_each_of_cput_node() as its refcount will
be automatically increased and decreased during the iteration.
Fixes: 4905ec2fb7e6 ("RISC-V: Add sscofpmf extension support") Signed-off-by: Liang He <windhl@126.com> Reviewed-by: Atish Patra <atishp@rivosinc.com> Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Link: https://lore.kernel.org/r/20220715130330.443363-1-windhl@126.com Signed-off-by: Will Deacon <will@kernel.org>
Pages mapped in user-space with PROT_MTE have the allocation tags either
zeroed or copied/restored to some user values. In order for the kernel
to access such pages via page_address(), resetting the tag in
page->flags was necessary. This tag resetting was deferred to
set_pte_at() -> mte_sync_page_tags() but it can race with another CPU
reading the flags (via page_to_virt()):
Since now the post_alloc_hook() function resets the page->flags tag when
unpoisoning is skipped for user pages (including the __GFP_ZEROTAGS
case), revert the arm64 commit calling page_kasan_tag_reset().
Catalin Marinas [Fri, 10 Jun 2022 15:21:40 +0000 (16:21 +0100)]
mm: kasan: Skip page unpoisoning only if __GFP_SKIP_KASAN_UNPOISON
Currently post_alloc_hook() skips the kasan unpoisoning if the tags will
be zeroed (__GFP_ZEROTAGS) or __GFP_SKIP_KASAN_UNPOISON is passed. Since
__GFP_ZEROTAGS is now accompanied by __GFP_SKIP_KASAN_UNPOISON, remove
the extra check.
Catalin Marinas [Fri, 10 Jun 2022 15:21:39 +0000 (16:21 +0100)]
mm: kasan: Skip unpoisoning of user pages
Commit c275c5c6d50a ("kasan: disable freed user page poisoning with HW
tags") added __GFP_SKIP_KASAN_POISON to GFP_HIGHUSER_MOVABLE. A similar
argument can be made about unpoisoning, so also add
__GFP_SKIP_KASAN_UNPOISON to user pages. To ensure the user page is
still accessible via page_address() without a kasan fault, reset the
page->flags tag.
With the above changes, there is no need for the arm64
tag_clear_highpage() to reset the page->flags tag.
Catalin Marinas [Fri, 10 Jun 2022 15:21:38 +0000 (16:21 +0100)]
mm: kasan: Ensure the tags are visible before the tag in page->flags
__kasan_unpoison_pages() colours the memory with a random tag and stores
it in page->flags in order to re-create the tagged pointer via
page_to_virt() later. When the tag from the page->flags is read, ensure
that the in-memory tags are already visible by re-ordering the
page_kasan_tag_set() after kasan_unpoison(). The former already has
barriers in place through try_cmpxchg(). On the reader side, the order
is ensured by the address dependency between page->flags and the memory
access.
Guangbin Huang [Tue, 28 Jun 2022 06:34:19 +0000 (14:34 +0800)]
drivers/perf: hisi: add driver for HNS3 PMU
HNS3(HiSilicon Network System 3) PMU is RCiEP device in HiSilicon SoC NIC,
supports collection of performance statistics such as bandwidth, latency,
packet rate and interrupt rate.
NIC of each SICL has one PMU device for it. Driver registers each PMU
device to perf, and exports information of supported events, filter mode of
each event, bdf range, hardware clock frequency, identifier and so on via
sysfs.
Each PMU device has its own registers of control, counters and interrupt,
and it supports 8 hardware events, each hardward event has its own
registers for configuration, counters and interrupt.
Filter options contains:
config - select event
port - select physical port of nic
tc - select tc(must be used with port)
func - select PF/VF
queue - select queue of PF/VF(must be used with func)
intr - select interrupt number(must be used with func)
global - select all functions of IO DIE
Currently, when the CPU is doing suspend to ram, we don't
save pmu counter register and its content will be lost.
To ensure perf profiling is not affected by suspend to ram,
this patch is based on arm_pmu CPU_PM notifier and implements riscv
pmu pm notifier. In the pm notifier, we stop the counter and update
the counter value before suspend and start the counter after resume.
Remove the __dma_{flush,map,unmap}_area assembly wrappers and call the
appropriate cache maintenance functions directly from the DMA mapping
callbacks.
James Morse [Mon, 4 Jul 2022 15:57:32 +0000 (16:57 +0100)]
arm64: errata: Add Cortex-A510 to the repeat tlbi list
Cortex-A510 is affected by an erratum where in rare circumstances the
CPUs may not handle a race between a break-before-make sequence on one
CPU, and another CPU accessing the same page. This could allow a store
to a page that has been unmapped.
Work around this by adding the affected CPUs to the list that needs
TLB sequences to be done twice.
Mark Brown [Mon, 4 Jul 2022 17:02:50 +0000 (18:02 +0100)]
arm64/sysreg: Add _EL1 into ID_AA64ISAR2_EL1 definition names
Normally we include the full register name in the defines for fields within
registers but this has not been followed for ID registers. In preparation
for automatic generation of defines add the _EL1s into the defines for
ID_AA64ISAR2_EL1 to follow the convention. No functional changes.
Mark Brown [Mon, 4 Jul 2022 17:02:49 +0000 (18:02 +0100)]
arm64/sysreg: Add _EL1 into ID_AA64ISAR1_EL1 definition names
Normally we include the full register name in the defines for fields within
registers but this has not been followed for ID registers. In preparation
for automatic generation of defines add the _EL1s into the defines for
ID_AA64ISAR1_EL1 to follow the convention. No functional changes.
Mark Brown [Mon, 4 Jul 2022 17:02:48 +0000 (18:02 +0100)]
arm64/sysreg: Remove defines for RPRES enumeration
We have defines for the RPRES enumeration in ID_AA64ISAR2 which do not
follow our normal conventions. Since these defines are never used just
remove them. No functional changes.
Mark Brown [Mon, 4 Jul 2022 17:02:47 +0000 (18:02 +0100)]
arm64/sysreg: Standardise naming for ID_AA64ZFR0_EL1 fields
The various defines for bitfields in ID_AA64ZFR0_EL1 do not follow our
conventions for register field names, they omit the _EL1, they don't use
specific defines for enumeration values and they don't follow the naming
in the architecture in some cases. In preparation for automatic generation
bring them into line with convention. No functional changes.
Mark Brown [Mon, 4 Jul 2022 17:02:46 +0000 (18:02 +0100)]
arm64/sysreg: Standardise naming for ID_AA64SMFR0_EL1 enums
We have a series of defines for enumeration values we test for in the
fields in ID_AA64SMFR0_EL1 which do not follow our usual convention of
including the EL1 in the name and having _IMP at the end of the basic
"feature present" define. In preparation for automatic register
generation bring the defines into sync with convention, no functional
change.
Mark Brown [Mon, 4 Jul 2022 17:02:45 +0000 (18:02 +0100)]
arm64/sysreg: Standardise naming for WFxT defines
The defines for WFxT refer to the feature as WFXT and use SUPPORTED rather
than IMP. In preparation for automatic generation of defines update these
to be more standard. No functional changes.
Mark Brown [Mon, 4 Jul 2022 17:02:44 +0000 (18:02 +0100)]
arm64/sysreg: Make BHB clear feature defines match the architecture
The architecture refers to the field identifying support for BHB clear as
BC but the kernel has called it CLEARBHB. In preparation for generation of
defines for ID_AA64ISAR2_EL1 rename to use the architecture's naming. No
functional changes.
Mark Brown [Mon, 4 Jul 2022 17:02:43 +0000 (18:02 +0100)]
arm64/sysreg: Align pointer auth enumeration defines with architecture
The defines used for the pointer authentication feature enumerations do not
follow the naming convention we've decided to use where we name things
after the architecture feature that introduced. Prepare for generating the
defines for the ISA ID registers by updating to use the feature names.
No functional changes.
Mark Brown [Mon, 4 Jul 2022 17:02:42 +0000 (18:02 +0100)]
arm64/mte: Standardise GMID field name definitions
Usually our defines for bitfields in system registers do not include a SYS_
prefix but those for GMID do. In preparation for automatic generation of
defines remove that prefix. No functional change.
Mark Brown [Mon, 4 Jul 2022 17:02:41 +0000 (18:02 +0100)]
arm64/sysreg: Standardise naming for DCZID_EL0 field names
The constants defining field names for DCZID_EL0 do not include the _EL0
that is included as part of our standard naming scheme. In preparation
for automatic generation of the defines add the _EL0 in. No functional
change.
Mark Brown [Mon, 4 Jul 2022 17:02:40 +0000 (18:02 +0100)]
arm64/sysreg: Standardise naming for CTR_EL0 fields
cache.h contains some defines which are used to represent fields and
enumeration values which do not follow the standard naming convention used for
when we automatically generate defines for system registers. Update the
names of the constants to reflect standardised naming and move them to
sysreg.h.
There is also a helper CTR_L1IP() which was open coded and has been
converted to use SYS_FIELD_GET().
Mark Brown [Mon, 4 Jul 2022 17:02:39 +0000 (18:02 +0100)]
arm64/cache: Restrict which headers are included in __ASSEMBLY__
Future changes to generate register definitions automatically will cause
this header to be included in a linker script. This will mean that headers
it in turn includes that are not safe for use in such a context (eg, due
to the use of assembler macros) cause build problems. Avoid these issues by
moving the affected includes and associated defines to the section of the
file already guarded by ifndef __ASSEMBLY__.
Mark Brown [Mon, 4 Jul 2022 17:02:38 +0000 (18:02 +0100)]
arm64/sysreg: Add SYS_FIELD_GET() helper
Add a SYS_FIELD_GET() helper to match SYS_FIELD_PREP(), providing a
simplified interface to FIELD_GET() when using the generated defines
with standardized naming.
Mark Brown [Mon, 4 Jul 2022 17:02:37 +0000 (18:02 +0100)]
arm64/sysreg: Allow leading blanks on comments in sysreg file
Currently we only accept comments where the # is placed at the start of a
line, allow leading blanks so we can format comments inside definitions in
a more pleasing manner.
Mark Brown [Mon, 4 Jul 2022 17:02:35 +0000 (18:02 +0100)]
arm64/cpuinfo: Remove references to reserved cache type
In 155433cb365ee466 ("arm64: cache: Remove support for ASID-tagged VIVT
I-caches") we removed all the support fir AIVIVT cache types and renamed
all references to the field to say "unknown" since support for AIVIVT
caches was removed from the architecture. Some confusion has resulted since
the corresponding change to the architecture left the value named as
AIVIVT but documented it as reserved in v8, refactor the code so we don't
define the constant instead. This will help with automatic generation of
this register field since it means we care less about the correspondence
with the ARM.
No functional change, the value displayed to userspace is unchanged.
Crash kernel memory reservation gets deferred, when either CONFIG_ZONE_DMA
or CONFIG_ZONE_DMA32 config is enabled on the platform. This deferral also
impacts overall linear mapping creation including the crash kernel itself.
Just encapsulate this deferral check in a new helper for better clarity.
To fix this issue, keep the table->data as &insn->current_mode and
use container_of() to retrieve the insn pointer. Another mutex is
used to protect against the current_mode update but not for retrieving
insn_emulation as table->data is no longer changing.
Marc Zyngier [Thu, 30 Jun 2022 16:04:59 +0000 (17:04 +0100)]
arm64: Add the arm64.nosve command line option
In order to be able to completely disable SVE even if the HW
seems to support it (most likely because the FW is broken),
move the SVE setup into the EL2 finalisation block, and
use a new idreg override to deal with it.
Note that we also nuke id_aa64zfr0_el1 as a byproduct, and
that SME also gets disabled, due to the dependency between the
two features.
Marc Zyngier [Thu, 30 Jun 2022 16:04:58 +0000 (17:04 +0100)]
arm64: Add the arm64.nosme command line option
In order to be able to completely disable SME even if the HW
seems to support it (most likely because the FW is broken),
move the SME setup into the EL2 finalisation block, and
use a new idreg override to deal with it.
Note that we also nuke id_aa64smfr0_el1 as a byproduct.
Marc Zyngier [Thu, 30 Jun 2022 16:04:57 +0000 (17:04 +0100)]
arm64: Expose a __check_override primitive for oddball features
In order to feal with early override of features that are not
classically encoded in a standard ID register with a 4 bit wide
field, add a primitive that takes a sysreg value as an input
(instead of the usual sysreg name) as well as a bit field
width (usually 4).
Marc Zyngier [Thu, 30 Jun 2022 16:04:56 +0000 (17:04 +0100)]
arm64: Allow the idreg override to deal with variable field width
Currently, the override mechanism can only deal with 4bit fields,
which is the most common case. However, we now have a bunch of
ID registers that have more diverse field widths, such as
ID_AA64SMFR0_EL1, which has fields that are a single bit wide.
Add the support for variable width, and a macro that encodes
a feature width of 4 for all existing override.
Marc Zyngier [Thu, 30 Jun 2022 16:04:55 +0000 (17:04 +0100)]
arm64: Factor out checking of a feature against the override into a macro
Checking for a feature being supported from assembly code is
a bit tedious if we need to factor in the idreg override.
Since we already have such code written for forcing nVHE, move
the whole thing into a macro. This heavily relies on the override
structure being called foo_override for foo_el1.
Marc Zyngier [Thu, 30 Jun 2022 16:04:54 +0000 (17:04 +0100)]
arm64: Allow sticky E2H when entering EL1
For CPUs that have the unfortunate mis-feature to be stuck in
VHE mode, we perform a funny dance where we completely shortcut
the normal boot process to enable VHE and run the kernel at EL2,
and only then start booting the kernel.
Not only this is pretty ugly, but it means that the EL2 finalisation
occurs before we have processed the sysreg override.
Instead, start executing the kernel as if it was an EL1 guest and
rely on the normal EL2 finalisation to go back to EL2.
Marc Zyngier [Thu, 30 Jun 2022 16:04:53 +0000 (17:04 +0100)]
arm64: Save state of HCR_EL2.E2H before switch to EL1
As we're about to switch the way E2H-stuck CPUs boot, save
the boot CPU E2H state as a flag tied to the boot mode
that can then be checked by the idreg override code.
This allows us to replace the is_kernel_in_hyp_mode() check
with a simple comparison with this state, even when running
at EL1. Note that this flag isn't saved in __boot_cpu_mode,
and is only kept in a register in the assembly code.
Marc Zyngier [Thu, 30 Jun 2022 16:04:52 +0000 (17:04 +0100)]
arm64: Rename the VHE switch to "finalise_el2"
as we are about to perform a lot more in 'mutate_to_vhe' than
we currently do, this function really becomes the point where
we finalise the basic EL2 configuration.
Reflect this into the code by renaming a bunch of things:
- HVC_VHE_RESTART -> HVC_FINALISE_EL2
- switch_to_vhe --> finalise_el2
- mutate_to_vhe -> __finalise_el2
Joey reports that booting 52-bit VA capable builds on 52-bit VA capable
CPUs is broken since commit 0d9b1ffefabe ("arm64: mm: make vabits_actual
a build time constant if possible"). This is due to the fact that the
primary CPU reads the vabits_actual variable before it has been
assigned.
The reason for deferring the assignment of vabits_actual was that we try
to perform as few stores to memory as we can with the MMU and caches
off, due to the cache coherency issues it creates.
Since __cpu_setup() [which is where the read of vabits_actual occurs] is
also called on the secondary boot path, we cannot just read the CPU ID
registers directly, given that the size of the VA space is decided by
the capabilities of the primary CPU. So let's read vabits_actual only on
the secondary boot path, and read the CPU ID registers directly on the
primary boot path, by making it a function parameter of __cpu_setup().
To ensure that all users of vabits_actual (including kasan_early_init())
observe the correct value, move the assignment of vabits_actual back
into asm code, but still defer it to after the MMU and caches have been
enabled.
Cc: Will Deacon <will@kernel.org> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Fixes: 0d9b1ffefabe ("arm64: mm: make vabits_actual a build time constant if possible") Reported-by: Joey Gouly <joey.gouly@arm.com> Co-developed-by: Joey Gouly <joey.gouly@arm.com> Signed-off-by: Joey Gouly <joey.gouly@arm.com> Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20220701111045.2944309-1-ardb@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
Francis Laniel [Wed, 8 Jun 2022 16:24:46 +0000 (17:24 +0100)]
arm64: Do not forget syscall when starting a new thread.
Enable tracing of the execve*() system calls with the
syscalls:sys_exit_execve tracepoint by removing the call to
forget_syscall() when starting a new thread and preserving the value of
regs->syscallno across exec.
When building the 32-bit vDSO with LLVM 15 and CONFIG_DEBUG_INFO, there
are the following orphan section warnings:
ld.lld: warning: arch/arm64/kernel/vdso32/note.o:(.debug_abbrev) is being placed in '.debug_abbrev'
ld.lld: warning: arch/arm64/kernel/vdso32/note.o:(.debug_info) is being placed in '.debug_info'
ld.lld: warning: arch/arm64/kernel/vdso32/note.o:(.debug_str_offsets) is being placed in '.debug_str_offsets'
ld.lld: warning: arch/arm64/kernel/vdso32/note.o:(.debug_str) is being placed in '.debug_str'
ld.lld: warning: arch/arm64/kernel/vdso32/note.o:(.debug_addr) is being placed in '.debug_addr'
ld.lld: warning: arch/arm64/kernel/vdso32/note.o:(.debug_line) is being placed in '.debug_line'
ld.lld: warning: arch/arm64/kernel/vdso32/note.o:(.debug_line_str) is being placed in '.debug_line_str'
ld.lld: warning: arch/arm64/kernel/vdso32/vgettimeofday.o:(.debug_loclists) is being placed in '.debug_loclists'
ld.lld: warning: arch/arm64/kernel/vdso32/vgettimeofday.o:(.debug_abbrev) is being placed in '.debug_abbrev'
ld.lld: warning: arch/arm64/kernel/vdso32/vgettimeofday.o:(.debug_info) is being placed in '.debug_info'
ld.lld: warning: arch/arm64/kernel/vdso32/vgettimeofday.o:(.debug_rnglists) is being placed in '.debug_rnglists'
ld.lld: warning: arch/arm64/kernel/vdso32/vgettimeofday.o:(.debug_str_offsets) is being placed in '.debug_str_offsets'
ld.lld: warning: arch/arm64/kernel/vdso32/vgettimeofday.o:(.debug_str) is being placed in '.debug_str'
ld.lld: warning: arch/arm64/kernel/vdso32/vgettimeofday.o:(.debug_addr) is being placed in '.debug_addr'
ld.lld: warning: arch/arm64/kernel/vdso32/vgettimeofday.o:(.debug_frame) is being placed in '.debug_frame'
ld.lld: warning: arch/arm64/kernel/vdso32/vgettimeofday.o:(.debug_line) is being placed in '.debug_line'
ld.lld: warning: arch/arm64/kernel/vdso32/vgettimeofday.o:(.debug_line_str) is being placed in '.debug_line_str'
These are DWARF5 sections, as that is the implicit default version for
clang-14 and newer when just '-g' is used. All DWARF sections are
handled by the DWARF_DEBUG macro from include/asm-generic/vmlinux.lds.h
so use that macro here to fix the warnings regardless of DWARF version.
Fixes: 9d4775b332e1 ("arm64: vdso32: enable orphan handling for VDSO") Signed-off-by: Nathan Chancellor <nathan@kernel.org> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Link: https://lore.kernel.org/r/20220630153121.1317045-3-nathan@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
When building the 32-bit vDSO after commit 5c4fb60816ea ("arm64: vdso32:
add ARM.exidx* sections"), ld.lld 11 fails to link:
ld.lld: error: could not allocate headers
ld.lld: error: unable to place section .text at file offset [0x2A0, 0xBB1]; check your linker script for overflows
ld.lld: error: unable to place section .comment at file offset [0xBB2, 0xC8A]; check your linker script for overflows
ld.lld: error: unable to place section .symtab at file offset [0xC8C, 0xE0B]; check your linker script for overflows
ld.lld: error: unable to place section .strtab at file offset [0xE0C, 0xF1C]; check your linker script for overflows
ld.lld: error: unable to place section .shstrtab at file offset [0xF1D, 0xFAA]; check your linker script for overflows
ld.lld: error: section .ARM.exidx file range overlaps with .hash
>>> .ARM.exidx range is [0x90, 0xCF]
>>> .hash range is [0xB4, 0xE3]
ld.lld: error: section .hash file range overlaps with .ARM.attributes
>>> .hash range is [0xB4, 0xE3]
>>> .ARM.attributes range is [0xD0, 0x10B]
ld.lld: error: section .ARM.attributes file range overlaps with .dynsym
>>> .ARM.attributes range is [0xD0, 0x10B]
>>> .dynsym range is [0xE4, 0x133]
ld.lld: error: section .ARM.exidx virtual address range overlaps with .hash
>>> .ARM.exidx range is [0x90, 0xCF]
>>> .hash range is [0xB4, 0xE3]
ld.lld: error: section .ARM.exidx load address range overlaps with .hash
>>> .ARM.exidx range is [0x90, 0xCF]
>>> .hash range is [0xB4, 0xE3]
This was fixed in ld.lld 12 with a change to match GNU ld's semantics of
placing non-SHF_ALLOC sections after SHF_ALLOC sections.
To workaround this issue, move the .ARM.exidx section before the
.comment, .symtab, .strtab, and .shstrtab sections (ELF_DETAILS) so that
those sections remain contiguous with the .ARM.attributes section.
Mark Rutland [Wed, 29 Jun 2022 04:12:07 +0000 (09:42 +0530)]
arm64: head: remove __PHYS_OFFSET
It's very easy to confuse __PHYS_OFFSET and PHYS_OFFSET. To clarify
things, let's remove __PHYS_OFFSET and use KERNEL_START directly, with
comments to show that we're using physical address, as we do for other
objects.
At the same time, update the comment regarding the kernel entry address
to mention __pa(KERNEL_START) rather than __pa(PAGE_OFFSET).
There should be no functional change as a result of this patch.
Cc: Ard Biesheuvel <ardb@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Acked-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> Link: https://lore.kernel.org/r/20220629041207.1670133-1-anshuman.khandual@arm.com Signed-off-by: Will Deacon <will@kernel.org>
Ard Biesheuvel [Wed, 29 Jun 2022 08:32:46 +0000 (10:32 +0200)]
arm64: lds: use PROVIDE instead of conditional definitions
Currently, a build with CONFIG_EFI=n and CONFIG_KASAN=y will not
complete successfully because of missing symbols. This is due to the
fact that the __pi_ prefixed aliases for __memcpy/__memmove were put
inside a #ifdef CONFIG_EFI block inadvertently, and are therefore
missing from the build in question.
These definitions should only be provided when needed, as they will
otherwise clutter up the symbol table, kallsyms etc for no reason.
Fortunately, instead of using CPP conditionals, we can achieve the same
result by using the linker's PROVIDE() directive, which only defines a
symbol if it is required to complete the link. So let's use that for all
symbols alias definitions.
Tong Tiangen [Tue, 21 Jun 2022 07:26:33 +0000 (07:26 +0000)]
arm64: extable: move _cond_extable to _cond_uaccess_extable
Currently, We use _cond_extable for cache maintenance uaccess helper
caches_clean_inval_user_pou(), so this should be moved over to
EX_TYPE_UACCESS_ERR_ZERO and rename _cond_extable to _cond_uaccess_extable
for clarity.
Mark Rutland [Tue, 21 Jun 2022 07:26:31 +0000 (07:26 +0000)]
arm64: asm-extable: add asm uacess helpers
In subsequent patches we want to explciitly annotate uaccess fixups in
assembly files.
We have existing helpers for this for inline assembly, but due to
differing stringification requirements it's not possible to have a
single definition that we can use for both inline asm and plain asm
files. So as with other cases (e.g. gpr-regnum.h), we must prove
separate helprs for plain asm and inline asm.
So that we can do so, this patch adds helpers to define
EX_TYPE_UACCESS_ERR_ZERO fixups in plain assembly. These correspond 1-1
with the inline assembly versions except for the absence of
stringification. No plain assmebly heleprs are added for
EX_TYPE_LOAD_UNALIGNED_ZEROPAD fixups as these only exist for a single C
function.
For copy_{to,from}_user() we'll need fixups with regs and err, so I've
added _ASM_EXTABLE_UACCESS(insn, fixup), where both the error and zero
registers are WZR.
For clarity, the existing `_asm_extable` assemgbly maco is now defined
in terms of the _ASM_EXTABLE() CPP macro, making the CPP macros
canonical in all cases.
There should be no functional change as a result of this patch.
Mark Rutland [Tue, 21 Jun 2022 07:26:30 +0000 (07:26 +0000)]
arm64: asm-extable: move data fields
In subsequent patches we'll need to fill in extable data fields in
regular assembly files. In preparation for this, move the definitions of
the extable data fields earlier in asm-extable.h so that they are
defined for both assembly and C files.
There should be no functional change as a result of this patch.
Tong Tiangen [Tue, 21 Jun 2022 07:26:29 +0000 (07:26 +0000)]
arm64: extable: add new extable type EX_TYPE_KACCESS_ERR_ZERO support
Currently, The extable type EX_TYPE_UACCESS_ERR_ZERO is used by
__get/put_kernel_nofault(), but those helpers are not uaccess type, so we
add a new extable type EX_TYPE_KACCESS_ERR_ZERO which can be used by
__get/put_kernel_no_fault().
This is also to prepare for distinguishing the two types in machine check
safe process.
Kefeng Wang [Tue, 7 Jun 2022 12:50:27 +0000 (20:50 +0800)]
arm64: Add HAVE_IOREMAP_PROT support
With ioremap_prot() definition from generic ioremap, also move
pte_pgprot() from hugetlbpage.c into pgtable.h, then arm64 could
have HAVE_IOREMAP_PROT, which will enable generic_access_phys()
code, it is useful for debug, eg, gdb.
Kefeng Wang [Tue, 7 Jun 2022 12:50:26 +0000 (20:50 +0800)]
arm64: mm: Convert to GENERIC_IOREMAP
Add hook for arm64's special operation when ioremap(), then
ioremap_wc/np/cache is converted to use ioremap_prot() from
GENERIC_IOREMAP, update the Copyright and kill the unused
inclusions.
Kefeng Wang [Tue, 7 Jun 2022 12:50:25 +0000 (20:50 +0800)]
mm: ioremap: Add ioremap/iounmap_allowed()
Add special hook for architecture to verify addr, size or prot
when ioremap() or iounmap(), which will make the generic ioremap
more useful.
ioremap_allowed() return a bool,
- true means continue to remap
- false means skip remap and return directly
iounmap_allowed() return a bool,
- true means continue to vunmap
- false code means skip vunmap and return directly
Meanwhile, only vunmap the address when it is in vmalloc area
as the generic ioremap only returns vmalloc addresses.
Acked-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Baoquan He <bhe@redhat.com> Link: https://lore.kernel.org/r/20220607125027.44946-5-wangkefeng.wang@huawei.com Signed-off-by: Will Deacon <will@kernel.org>