Michael Brown [Fri, 9 May 2025 14:02:47 +0000 (15:02 +0100)]
[riscv] Use load and store pseudo-instructions where possible
The pattern of "load address to register" followed by "load value from
address in register" generally results in three instructions: two to
load the address and one to load the value.
This can be reduced to two instructions by allowing the assembler to
incorporate the low bits of the address within the load (or store)
instruction itself. In the case of a store, this requires specifying
a second register that can be temporarily used to hold the high bits
of the address. (In the case of a load, the destination register is
reused for this purpose.)
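As a rough illustration of the address split that the assembler performs for these pseudo-instructions, the arithmetic can be sketched in C. The function names are hypothetical and this is not code from the commit: the upper 20 bits are loaded into a register, and the signed 12-bit remainder is carried by the load or store instruction itself.

```c
#include <stdint.h>

/* Upper 20 bits of the address, rounded up when the low 12 bits would
 * be negative once sign-extended by the load or store instruction */
uint32_t addr_hi20(uint32_t addr)
{
    return (addr + 0x800u) >> 12;
}

/* Signed 12-bit remainder folded into the load or store instruction */
int32_t addr_lo12(uint32_t addr)
{
    return (int32_t)(addr - (addr_hi20(addr) << 12));
}
```

For example, 0x12345fff splits into a high part of 0x12346 and a low part of -1, which is why only two instructions are needed.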
Michael Brown [Fri, 9 May 2025 13:25:59 +0000 (14:25 +0100)]
[build] Formalise mechanism for accessing absolute symbols
In a position-dependent executable, where all addresses are fixed
at link time, we can use the standard technique as documented by
GNU ld to get the value of an absolute symbol, e.g.:
extern char _my_symbol[];
printf ( "Absolute symbol value is %x\n", ( ( int ) _my_symbol ) );
This technique may not work in a position-independent executable.
When dynamic relocations are applied, the runtime addresses will no
longer be equal to the link-time addresses. If the code to obtain the
address of _my_symbol uses PC-relative addressing, then it will
calculate the runtime "address" of the absolute symbol, which will no
longer be equal to the link-time "address" (i.e. the correct value)
of the absolute symbol.
Define macros ABS_SYMBOL(), ABS_VALUE_INIT(), and ABS_VALUE() that
provide access to the correct values of absolute symbols even in
position-independent code, and use these macros wherever absolute
symbols are accessed.
Michael Brown [Fri, 9 May 2025 11:03:29 +0000 (12:03 +0100)]
[libc] Display assertion failure message before incrementing counter
During early initialisation on some platforms, the .data and .bss
sections may not yet be writable.
Display the assertion message before attempting to increment the
assertion failure counter, since writing to the assertion counter may
trigger a CPU exception that ends up resetting the system.
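The ordering described above can be sketched as follows. This is a minimal illustration, assuming a hypothetical helper assertion_failed() and macro my_assert(); the actual iPXE assertion macro may be structured differently.

```c
#include <stdio.h>

/* Hypothetical failure counter: lives in .bss, which may not be writable yet */
unsigned int assertion_failures;

/* Hypothetical helper: report first, then touch writable memory */
void assertion_failed(const char *file, unsigned int line, const char *expr)
{
    /* The message is emitted before any store to .data or .bss */
    fprintf(stderr, "%s:%u: assertion failed: %s\n", file, line, expr);
    /* If this store traps, the message has already been displayed */
    assertion_failures++;
}

#define my_assert(cond)                                         \
    do {                                                        \
        if (!(cond))                                            \
            assertion_failed(__FILE__, __LINE__, #cond);        \
    } while (0)
```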
Michael Brown [Thu, 8 May 2025 13:22:13 +0000 (14:22 +0100)]
[riscv] Return virtual address offset from enable_paging()
Once paging has been enabled, there is no direct way to determine the
virtual address offset without external knowledge. (The paging mode,
if needed, can be read directly from the SATP CSR.)
Change the return value from enable_paging() to provide the virtual
address offset.
Michael Brown [Thu, 8 May 2025 10:03:38 +0000 (11:03 +0100)]
[riscv] Restore temporarily modified PTE within 32-bit transition code
If the virtual address offset is precisely one page (i.e. each virtual
address maps to a physical address one page higher), and if the 32-bit
transition code happens to end up at the end of a page (which would
require an unrealistic 2MB of content in .prefix), then it would be
possible for the program counter to cross into the portion of the
virtual address space still borrowed for use as the temporary physical
map.
Avoid this remote possibility by moving the restoration of the
temporarily modified PTE within the transition code block (which is
guaranteed to remain within a single page since it is aligned on its
own size).
This unfortunately requires increasing the alignment of the transition
code (and hence the maximum number of NOPs inserted). The assembler
syntax theoretically allows us to avoid inserting any NOPs via a
directive such as:
.balign PAGE_SIZE, , enable_paging_32_max_len
(i.e. relying on the fact that if the transition code is already
sufficiently far away from the end of a page, then no padding needs to
be inserted). However, alignment on RISC-V is implemented using the
R_RISCV_ALIGN relaxing relocation, which doesn't encode any concept of
a maximum padding length, and so the maximum padding length value is
effectively ignored.
Michael Brown [Wed, 7 May 2025 22:02:40 +0000 (23:02 +0100)]
[uaccess] Generalise librm's virt_offset mechanism for RISC-V
The virtual offset memory model used for i386-pcbios and x86_64-pcbios
can be generalised to also cover riscv32-sbi and riscv64-sbi. In both
architectures, the 32-bit builds will use a circular map of the 32-bit
address space, and the 64-bit builds will use an identity map for the
relevant portion of the physical address space, with iPXE itself
placed in the negative (kernel) address space.
Generalise and document the virt_offset mechanism, and set it as the
default for both PCBIOS and SBI platforms.
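For the 32-bit builds, the circular map means that address conversion is a simple addition that wraps modulo 2^32. A minimal sketch, assuming hypothetical names (virt_offset32, virt_to_phys32(), phys_to_virt32()) rather than the real iPXE definitions:

```c
#include <stdint.h>

/* Hypothetical 32-bit model: physical = virtual + offset (mod 2^32) */
uint32_t virt_offset32;

/* Convert a virtual address to a physical address, wrapping around */
uint32_t virt_to_phys32(uint32_t virt)
{
    return virt + virt_offset32;
}

/* Convert a physical address back to a virtual address */
uint32_t phys_to_virt32(uint32_t phys)
{
    return phys - virt_offset32;
}
```

Because unsigned arithmetic wraps, every physical address remains reachable through some virtual address, whatever the offset.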
Michael Brown [Wed, 7 May 2025 21:57:40 +0000 (22:57 +0100)]
[build] Constrain PHYS_CODE() and REAL_CODE() to use i386 registers
Inline assembly using PHYS_CODE() or REAL_CODE() must use the "R"
constraint rather than the "r" constraint to ensure that the compiler
chooses registers that will be valid for the 32-bit or 16-bit assembly
code fragment.
Michael Brown [Wed, 7 May 2025 11:56:20 +0000 (12:56 +0100)]
[riscv] Provide a millicode variant of print_message()
RISC-V has a millicode calling convention that allows for the use of
an alternative link register x5/t0. With sufficient care, this allows
for two levels of subroutine call even when no stack is available.
Provide both standard and millicode entry points for print_message(),
and use the millicode entry point to allow for printing debug messages
from libprefix.S itself.
Michael Brown [Tue, 6 May 2025 15:35:19 +0000 (16:35 +0100)]
[riscv] Move prefix debug message printing to libprefix.S
Create a prefix library function print_message() to print text to the
SBI debug console. Use the "write byte" SBI call (rather than "write
string") so that the function remains usable even after enabling
paging.
Michael Brown [Tue, 6 May 2025 13:53:29 +0000 (14:53 +0100)]
[riscv] Use compressed relocation records
Use compressed relocation records instead of raw Elf_Rela records.
This saves around 15% of the total binary size for the all-drivers
image bin-riscv64/ipxe.sbi.
Michael Brown [Tue, 6 May 2025 11:11:56 +0000 (12:11 +0100)]
[zbin] Allow for constructing compressed dynamic relocation records
Define a new "ZREL" compressor information block, describing a block
of Elf_Rel or Elf_Rela runtime relocations to be converted to an
iPXE-specific compressed relocation format.
The compressed relocation format is based loosely on the Elf_Relr
bitmap+offset format, with some optimisations for use in iPXE. In
particular:
- a relative "skip" value is used instead of an absolute offset
- the width of the skip value is reduced to 19 bits (when present)
- an explicit skip value of zero is used to terminate the list
- unaligned relocations are prohibited
The layout of bits within the compressed relocation record is also
adjusted to make assembly code implementations simpler: the skip flag
bit is placed in the MSB so that it can be tested using "bltz" or
similar instructions, and the skip value is placed above the
relocation flag bits so that a typical shifting implementation will
naturally end up with a zero value in its accumulator if and only if
the record was a terminator.
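A decoder for a format of this general shape might look like the sketch below. The exact layout is an assumption made for illustration only: 32-bit records, 12 relocation flag bits when a skip value is present and 31 otherwise, flag bits consumed from the least significant bit upwards, and positions counted in machine words. The function name is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define ZREL_SKIP_FLAG  0x80000000u              /* MSB: skip value present */
#define ZREL_SKIP_BITS  19                       /* width of the skip value */
#define ZREL_SKIP_SHIFT (31 - ZREL_SKIP_BITS)    /* skip sits above the flags */
#define ZREL_SKIP_MASK  ((1u << ZREL_SKIP_BITS) - 1)

/* Decode records into word positions needing relocation.
 * Returns the number of relocations found (assumed layout, see above). */
size_t zrel_decode_sketch(const uint32_t *rec, size_t *out, size_t max)
{
    size_t pos = 0;
    size_t count = 0;

    for (;; rec++) {
        uint32_t flags = *rec;
        unsigned int bits = 31;

        if (flags & ZREL_SKIP_FLAG) {
            uint32_t skip = (flags >> ZREL_SKIP_SHIFT) & ZREL_SKIP_MASK;
            if (skip == 0)
                break;              /* explicit zero skip terminates the list */
            pos += skip;            /* relative skip, not an absolute offset */
            bits = ZREL_SKIP_SHIFT;
        }
        for (unsigned int i = 0; i < bits; i++, pos++) {
            if ((flags >> i) & 1) {
                if (count < max)
                    out[count] = pos;
                count++;
            }
        }
    }
    return count;
}
```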
Michael Brown [Sun, 4 May 2025 20:29:06 +0000 (21:29 +0100)]
[riscv] Add support for enabling 32-bit paging
Add code to construct a 32-bit page table to map the whole of the
32-bit address space with a fixed offset selected to map iPXE itself
at its link-time address, and to return with paging enabled and the
program counter updated to a virtual address.
Michael Brown [Fri, 2 May 2025 13:10:41 +0000 (14:10 +0100)]
[riscv] Add support for enabling 64-bit paging
Paging provides an alternative to using relocations: instead of
applying relocation fixups to the runtime addresses, we can set up
virtual addressing so that the runtime addresses match the link-time
addresses.
This opens up the possibility of running portions of iPXE directly
from read-only memory (such as a memory-mapped flash device), subject
to the caveats that .data is not yet writable and .bss is not yet
zeroed. This should allow us to run enough code to parse the memory
map from the FDT, identify a suitable RAM block, and physically
relocate ourselves there.
Add code to construct a 64-bit page table (in a single 4kB buffer) to
identity-map as much of the physical address space as possible, to map
iPXE itself at its link-time address, and to return with paging
enabled and the program counter updated to a virtual address. We use
the highest paging level supported by the CPU, to maximise the amount
of the physical address space covered by the identity map.
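The benefit of choosing the highest paging level can be seen from the size of each top-level entry. The sketch below fills a single 4kB table with identity-mapping leaf entries, using the PTE bit positions from the RISC-V privileged specification. Mapping exactly the lower half of the table is an assumption made for illustration, as are the function names.

```c
#include <stdint.h>

#define PTE_COUNT 512   /* one 4kB page of 8-byte entries */
#define PTE_V 0x01      /* valid */
#define PTE_R 0x02      /* readable */
#define PTE_W 0x04      /* writable */
#define PTE_X 0x08      /* executable */
#define PTE_A 0x40      /* accessed */
#define PTE_D 0x80      /* dirty */

/* Bytes covered by one top-level entry (3 = Sv39, 4 = Sv48, 5 = Sv57) */
uint64_t top_level_span(unsigned int levels)
{
    return 1ULL << (12 + 9 * (levels - 1));
}

/* Identity-map physical memory using the lower half of the table */
void identity_map_sketch(uint64_t *table, unsigned int levels)
{
    uint64_t span = top_level_span(levels);

    for (unsigned int i = 0; i < PTE_COUNT / 2; i++) {
        uint64_t phys = i * span;
        /* Leaf PTE: physical page number starts at bit 10 */
        table[i] = ((phys >> 12) << 10) |
                   PTE_V | PTE_R | PTE_W | PTE_X | PTE_A | PTE_D;
    }
}
```

Under these assumptions, Sv39 covers 256GB with the identity map, and each additional level multiplies the coverage by 512.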
Michael Brown [Thu, 1 May 2025 13:24:33 +0000 (14:24 +0100)]
[riscv] Allow for a non-zero link-time address
Using paging (rather than relocation records) will be easier on 64-bit
RISC-V if we place iPXE within the negative (kernel) virtual address
space.
Allow the link-time address to be non-zero and to vary between 32-bit
and 64-bit builds. Choose addresses that are expected to be amenable
to the use of paging.
There is no particular need to use a non-zero address in the 32-bit
builds, but doing so allows us to validate that the relocation code is
handling this case correctly.
Michael Brown [Thu, 1 May 2025 13:04:27 +0000 (14:04 +0100)]
[riscv] Split out runtime relocator to libprefix.S
Split out the runtime relocation logic from sbiprefix.S to a new
library libprefix.S.
Since this logically decouples the process of runtime relocation from
the _sbi_start symbol (currently used to determine the base address
for applying relocations), provide an alternative mechanism for the
relocator to determine the base address.
Michael Brown [Wed, 30 Apr 2025 15:07:04 +0000 (16:07 +0100)]
[uaccess] Remove redundant virt_to_user() and userptr_t
Remove the last remaining traces of the concept of a user pointer,
leaving iPXE with a simpler and cleaner memory model that implicitly
assumes that all memory locations can be reached through pointer
dereferences.
Michael Brown [Wed, 30 Apr 2025 13:33:57 +0000 (14:33 +0100)]
[uaccess] Reduce scope of included uaccess.h header
The uaccess.h header is no longer required for any code that touches
external ("user") memory, since such memory accesses are now performed
through pointer dereferences. Reduce the number of files including
this header.
Michael Brown [Wed, 30 Apr 2025 13:14:51 +0000 (14:14 +0100)]
[image] Make image data read-only to most consumers
Almost all image consumers do not need to modify the content of the
image. Now that the image data is a pointer type (rather than the
opaque userptr_t type), we can rely on the compiler to enforce this at
build time.
Change the .data field to be a const pointer, so that the compiler can
verify that image consumers do not modify the image content. Provide
a transparent .rwdata field for consumers who have a legitimate (and
now explicit) reason to modify the image content.
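One way to picture the const and writable views is an anonymous union, sketched below. The structure is a hypothetical cut-down stand-in rather than the real struct image, and the use of a union is an assumption about how the "transparent" field is provided.

```c
#include <stddef.h>

/* Hypothetical cut-down image structure */
struct image_sketch {
    union {
        const void *data;   /* read-only view for most consumers */
        void *rwdata;       /* writable view for explicit modifiers */
    };
    size_t len;
};

/* A typical consumer only reads, so it goes through .data */
unsigned int image_sum(const struct image_sketch *image)
{
    const unsigned char *bytes = image->data;
    unsigned int sum = 0;

    for (size_t i = 0; i < image->len; i++)
        sum += bytes[i];
    return sum;
}

/* A consumer with a legitimate reason to modify content uses .rwdata */
void image_wipe(struct image_sketch *image)
{
    unsigned char *bytes = image->rwdata;

    for (size_t i = 0; i < image->len; i++)
        bytes[i] = 0;
}
```

Any attempt to write through .data fails at build time, which is the compiler enforcement described above.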
We do not attempt to impose any runtime restriction on checking
whether or not an image is writable. The only existing instances of
genuinely read-only images are the various unit test images, and it is
acceptable for defective test cases to result in a segfault rather
than a runtime error.
Michael Brown [Wed, 30 Apr 2025 12:22:54 +0000 (13:22 +0100)]
[image] Add the concept of a static image
Not all images are allocated via alloc_image(). For example: embedded
images, the static images created to hold a runtime command line, and
the images used by unit tests are all static structures.
Using image_set_cmdline() (via e.g. the "imgargs" command) to set the
command-line arguments of a static image will succeed but will leak
memory, since nothing will ever free the allocated command line.
There are no code paths that can lead to calling image_set_len() on a
static image, but there is no safety check against future code paths
attempting this.
Define a flag IMAGE_STATIC to mark an image as statically allocated,
generalise free_image() to also handle freeing dynamically allocated
portions of static images (such as the command line), and expose
free_image() for use by static images.
Define a related flag IMAGE_STATIC_NAME to mark the name as statically
allocated. Allow a statically allocated name to be replaced with a
dynamically allocated name since this is a potentially valid use case
(e.g. if "imgdecrypt --name <name>" is used on an embedded image).
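The generalised freeing logic can be sketched as below. The structure, flag value, and function name are hypothetical; the point is only that dynamically allocated portions are always released while the structure itself is released only when it was not statically allocated.

```c
#include <stdlib.h>

#define IMG_STATIC_SKETCH 0x01  /* hypothetical flag value */

/* Hypothetical cut-down image structure */
struct static_image_sketch {
    unsigned int flags;
    char *cmdline;              /* always dynamically allocated */
};

/* Returns 1 if the structure itself was freed, 0 otherwise */
int free_image_sketch(struct static_image_sketch *image)
{
    /* Dynamic portions are freed for static and dynamic images alike */
    free(image->cmdline);
    if (image->flags & IMG_STATIC_SKETCH) {
        image->cmdline = NULL;  /* structure lives on, so clear the pointer */
        return 0;
    }
    free(image);
    return 1;
}
```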
Michael Brown [Tue, 29 Apr 2025 12:39:12 +0000 (13:39 +0100)]
[bofm] Allow BOFM tests to be run without a BOFM-capable device driver
The BOFM tests are not part of the standard unit test suite, since
they are designed to allow for exercising real BOFM driver code
outside of the context of a real IBM blade server.
Allow for the BOFM tests to be run without a real BOFM driver, by
providing a dummy driver for the specified PCI test device.
The peerdist_msg_blk() macro seems to have been introduced in the
original commit that added pccrr.h, but this macro was never used by
the version of the code present in that commit.
Remove this unused macro and the corresponding nonexistent external
function declaration.
Michael Brown [Tue, 29 Apr 2025 08:16:41 +0000 (09:16 +0100)]
[xferbuf] Simplify and generalise data transfer buffers
Since all data transfer buffer contents are now accessible via direct
pointer dereferences, remove the unnecessary abstractions for read and
write operations and create two new data transfer buffer types: a
fixed-size buffer, and a void buffer that records its size but can
never receive non-zero lengths of data. These replace the custom data
buffer types currently implemented for EFI PXE TFTP downloads and for
block device translations.
A new operation xferbuf_detach() is required to take ownership of the
data accumulated in the data transfer buffer, since we no longer rely
on the existence of an independently owned external data pointer for
data transfer buffers allocated via umalloc().
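The ownership transfer can be sketched as follows, assuming a hypothetical minimal buffer structure. The real xferbuf_detach() may differ in signature and in what else it resets.

```c
#include <stddef.h>

/* Hypothetical cut-down data transfer buffer */
struct xferbuf_sketch {
    void *data;     /* accumulated data, owned by the buffer */
    size_t len;     /* length of accumulated data */
};

/* Hand the accumulated data to the caller and leave the buffer empty */
void *xferbuf_detach_sketch(struct xferbuf_sketch *xferbuf)
{
    void *data = xferbuf->data;

    /* The buffer no longer owns the data, so it must forget the pointer */
    xferbuf->data = NULL;
    xferbuf->len = 0;
    return data;
}
```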
Michael Brown [Mon, 28 Apr 2025 14:20:43 +0000 (15:20 +0100)]
[initrd] Use physical addresses for calculations on initrd locations
Commit ef03849 ("[uaccess] Remove redundant userptr_add() and
userptr_diff()") exposed a signedness bug in the comparison of initrd
locations, since the expression (initrd->data - current) was
effectively no longer coerced to a signed type.
In particular, the common case will be that the top of the initrd
region is the start of the iPXE .textdata region, which has virtual
address zero. This causes initrd->data to compare as being above the
top of the initrd region for all images, when this bug would
previously have been limited to affecting only initrds placed 2GB or
more below the start of .textdata.
Fix by using physical addresses for all comparisons on initrd
locations.
Reported-by: Sven Dreyer <sven@dreyer-net.de>
Reported-by: Harald Jensås <hjensas@redhat.com>
Reported-by: Jan ONDREJ (SAL) <ondrejj@salstar.sk>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
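The class of bug described in this commit can be reproduced in isolation. The sketch below is not the iPXE expression itself: it uses hypothetical function names and plain integers to show how an unsigned difference behaves when it is no longer coerced to a signed type.

```c
#include <stdint.h>

/* Buggy form: an unsigned difference is positive for any unequal inputs */
int is_above_unsigned(uintptr_t data, uintptr_t current)
{
    return (data - current) > 0;
}

/* Signed form: the difference is reinterpreted as a signed quantity */
int is_above_signed(uintptr_t data, uintptr_t current)
{
    return (intptr_t)(data - current) > 0;
}

/* Direct comparison of addresses, with no subtraction involved */
int is_above_direct(uintptr_t data, uintptr_t current)
{
    return data > current;
}
```

With data below current, the unsigned form still reports "above", while the other two forms give the expected answer.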
Michael Brown [Mon, 28 Apr 2025 10:20:16 +0000 (11:20 +0100)]
[multiboot] Remove userptr_t from Multiboot and ELF image parsing
Simplify Multiboot and ELF image parsing by assuming that the
Multiboot and ELF headers are directly accessible via pointer
dereferences, and add some missing header validations.
GCC 15 generates a warning when a string initializer is too large to
allow for a trailing NUL terminator byte. This type of initializer is
fairly common in signature strings such as ACPI table identifiers.
Michael Brown [Sun, 27 Apr 2025 16:37:44 +0000 (17:37 +0100)]
[build] Remove unsafe disable function wrapper from legacy NIC drivers
The legacy NIC drivers do not consistently take a second parameter in
their disable function. We currently use an unsafe function wrapper
that declares no parameters, and rely on the ABI allowing a second
parameter to be silently ignored if not expected by the callee. As of
GCC 15, this hack results in an incompatible pointer type warning.
Fix by removing the hack, and instead updating all relevant legacy NIC
drivers to take an unused second parameter in their disable function.
Michael Brown [Fri, 25 Apr 2025 12:24:21 +0000 (13:24 +0100)]
[fbcon] Avoid redrawing unchanged characters when scrolling
Scrolling currently involves redrawing every character cell, which can
be frustratingly slow on large framebuffer consoles. Accelerate this
operation by skipping the redraw for any unchanged character cells.
In the common case that large areas of the screen contain whitespace,
this optimises away the vast majority of the redrawing operations.
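The optimisation can be sketched on a plain character array. The names are hypothetical and the draw hook below only counts calls; a real framebuffer console would render a glyph there.

```c
#include <stddef.h>

/* Number of cells actually redrawn (stand-in for real drawing work) */
unsigned int redraw_count;

/* Hypothetical draw hook: here it only counts invocations */
static void draw_cell_sketch(size_t index, char ch)
{
    (void)index;
    (void)ch;
    redraw_count++;
}

/* Scroll a width x height cell array up one row, skipping unchanged cells */
void scroll_up_sketch(char *cells, size_t width, size_t height)
{
    size_t total = width * height;

    for (size_t i = 0; i < total; i++) {
        /* New content comes from the row below, or blank for the last row */
        char next = (i + width < total) ? cells[i + width] : ' ';

        if (cells[i] == next)
            continue;           /* already correct on screen: no redraw */
        cells[i] = next;
        draw_cell_sketch(i, next);
    }
}
```

On a mostly blank screen almost every cell takes the early "continue" path, which matches the common case described above.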
Michael Brown [Fri, 25 Apr 2025 09:52:26 +0000 (10:52 +0100)]
[fbcon] Remove userptr_t from framebuffer console drivers
Simplify the framebuffer console drivers by assuming that the raw
framebuffer, character cell array, background picture, and glyph data
are all directly accessible via pointer dereferences.
In particular, this avoids the need to copy each glyph during drawing:
the VESA framebuffer driver can simply return a pointer to the glyph
data stored in the video ROM.
Michael Brown [Thu, 24 Apr 2025 22:36:32 +0000 (23:36 +0100)]
[pxe] Remove userptr_t from PXE API call dispatcher
Simplify the PXE API call dispatcher code by assuming that the PXE
parameter block is accessible via a direct pointer dereference. This
avoids the need for the API call dispatcher to know the size of the
parameter block.
Michael Brown [Wed, 23 Apr 2025 11:47:53 +0000 (12:47 +0100)]
[umalloc] Remove userptr_t from user memory allocations
Use standard void pointers for umalloc(), urealloc(), and ufree(),
with the "u" prefix retained to indicate that these allocations are
made from external ("user") memory rather than from the internal heap.
Michael Brown [Wed, 23 Apr 2025 08:53:38 +0000 (09:53 +0100)]
[smbios] Remove userptr_t from SMBIOS structure parsing
Simplify the SMBIOS structure parsing code by assuming that all
structure content is fully accessible via pointer dereferences.
In particular, this allows the convoluted find_smbios_structure() and
read_smbios_structure() to be combined into a single function
smbios_structure() that just returns a direct pointer to the SMBIOS
structure, with smbios_string() similarly now returning a direct
pointer to the relevant string.
Michael Brown [Mon, 21 Apr 2025 23:28:07 +0000 (00:28 +0100)]
[crypto] Remove userptr_t from CMS verification and decryption
Simplify the CMS code by assuming that all content is fully accessible
via pointer dereferences. This avoids the need to use fragment loops
for calculating digests and decrypting (or reencrypting) data.
Michael Brown [Mon, 21 Apr 2025 21:40:59 +0000 (22:40 +0100)]
[crypto] Remove userptr_t from ASN.1 parsers
Simplify the ASN.1 code by assuming that all objects are fully
accessible via pointer dereferences. This allows the concept of
"additional data beyond the end of the cursor" to be removed, and
simplifies parsing of all ASN.1 image formats.
Michael Brown [Mon, 21 Apr 2025 15:16:01 +0000 (16:16 +0100)]
[uaccess] Remove user_to_phys() and phys_to_user()
Remove the intermediate concept of a user pointer from physical
address conversions, leaving virt_to_phys() and phys_to_virt() as the
directly implemented functions.
Michael Brown [Sun, 20 Apr 2025 17:29:48 +0000 (18:29 +0100)]
[uaccess] Remove redundant memcpy_user() and related string functions
The memcpy_user(), memmove_user(), memcmp_user(), memset_user(), and
strlen_user() functions are now just straightforward wrappers around
the corresponding standard library functions.
Michael Brown [Sun, 20 Apr 2025 16:26:48 +0000 (17:26 +0100)]
[uaccess] Change userptr_t to be a pointer type
The original motivation for the userptr_t type was to be able to
support a pure 16-bit real-mode memory model in which a segment:offset
value could be encoded as an unsigned long, with corresponding
copy_from_user() and copy_to_user() functions used to perform
real-mode segmented memory accesses.
Since this memory model was first created almost twenty years ago, no
serious effort has been made to support a pure 16-bit mode of
operation for iPXE. The constraints imposed by the memory model are
becoming increasingly cumbersome to work within: for example, the
parsing of devicetree structures is hugely simplified by being able to
use and return direct pointers to the names and property values. The
devicetree code therefore relies upon virt_to_user(), which is
nominally illegal under the userptr_t memory model.
Drop support for the concept of a memory location that cannot be
reached through a straightforward pointer dereference, by redefining
userptr_t to be a simple pointer type.
Michael Brown [Sun, 20 Apr 2025 16:18:06 +0000 (17:18 +0100)]
[uaccess] Rename userptr_sub() to userptr_diff()
Clarify the intended usage of userptr_sub() by renaming it to
userptr_diff() (to avoid confusion with userptr_add()), and fix the
existing call sites that erroneously use userptr_sub() to subtract an
offset from a userptr_t value.
Michael Brown [Sat, 19 Apr 2025 12:35:23 +0000 (13:35 +0100)]
[time] Use currticks() to provide the null system time
For platforms with no real-time clock (such as RISC-V SBI) we use the
null time source, which currently just returns a constant zero.
Switch to using currticks() to provide a clock that does not represent
the real current time, but does at least advance at approximately the
correct rate. In conjunction with the "ntp" command, this allows
these platforms to use time-dependent features such as X.509
certificate verification for HTTPS connections.
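A tick-driven null time source can be sketched as below. The tick counter, the tick rate, and the function names are hypothetical stand-ins so that the example is self-contained.

```c
#include <time.h>

#define TICKS_PER_SEC_SKETCH 1024   /* hypothetical tick rate */

/* Stand-in for the platform tick counter */
unsigned long fake_ticks;

unsigned long currticks_sketch(void)
{
    return fake_ticks;
}

/* Null time source: not the real time, but it advances at the right rate */
time_t null_now_sketch(void)
{
    return (time_t)(currticks_sketch() / TICKS_PER_SEC_SKETCH);
}
```

A clock of this kind starts near zero, which is why it is useful only once something such as the "ntp" command has supplied an offset to the real time.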
Michael Brown [Wed, 16 Apr 2025 23:29:41 +0000 (00:29 +0100)]
[efi] Inhibit calls to Shutdown() for wireless SNP devices
The UEFI model for wireless network configuration is somewhat
underdefined. At the time of writing, the EDK2 "UEFI WiFi Connection
Manager" driver provides only one way to configure wireless network
credentials, which is to enter them interactively via an HII form.
Credentials are not stored (or exposed via any protocol interface),
and so any temporary disconnection from the wireless network will
inevitably leave the interface in an unusable state that cannot be
recovered without user intervention.
Experimentation shows that at least some wireless network drivers
(observed with an HP Elitebook 840 G10) will disconnect from the
wireless network when the SNP Shutdown() method is called, or if the
device is not polled sufficiently frequently to maintain its
association to the network. We therefore inhibit calls to Shutdown()
and Stop() for any such SNP protocol interfaces, and mark our network
device as insomniac so that it will be polled even when closed.
Note that we need to inhibit not only our own calls to Shutdown() and
Stop(), but also those that will be attempted by MnpDxe when we
disconnect it from the SNP handle. We do this by patching the
installed SNP protocol interface structure to modify the Shutdown()
and Stop() method pointers, which is ugly but unavoidable.
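The patching approach can be sketched with a hypothetical two-method interface structure. The real SNP protocol has more methods and EFI-specific types, and returning success from the stub is an assumption.

```c
/* Hypothetical method type and cut-down protocol interface */
typedef int (*snp_method_sketch_t)(void *snp);

struct snp_sketch {
    snp_method_sketch_t stop;
    snp_method_sketch_t shutdown;
};

/* Counts calls that reach the original driver methods */
int real_method_calls;

/* Stand-in for the original driver method */
int real_method_sketch(void *snp)
{
    (void)snp;
    real_method_calls++;
    return 0;
}

/* Replacement method: do nothing and report success (assumed behaviour) */
static int inhibited_method_sketch(void *snp)
{
    (void)snp;
    return 0;
}

/* Patch the installed interface so that all callers are inhibited */
void snp_inhibit_sketch(struct snp_sketch *snp)
{
    snp->stop = inhibited_method_sketch;
    snp->shutdown = inhibited_method_sketch;
}
```

Patching the shared structure, rather than avoiding the calls locally, is what also covers callers such as MnpDxe.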
Michael Brown [Wed, 16 Apr 2025 23:27:13 +0000 (00:27 +0100)]
[netdevice] Add the concept of an insomniac network device
Some network devices (observed with the SNP interface to the wireless
network card on an HP Elitebook 840 G10) will stop working if they are
left for too long without being polled.
Add the concept of an insomniac network device, that must continue to
be polled even when closed.
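The polling rule can be sketched as below, assuming hypothetical flag values and a cut-down device structure rather than the real struct net_device.

```c
#define NETDEV_OPEN_SKETCH      0x01    /* hypothetical flag values */
#define NETDEV_INSOMNIAC_SKETCH 0x02

/* Hypothetical cut-down network device */
struct netdev_sketch {
    unsigned int state;
    unsigned int polls;     /* number of times the driver was polled */
};

/* Poll the device if it is open, or if it must never be left unpolled */
void netdev_poll_sketch(struct netdev_sketch *netdev)
{
    if (!(netdev->state &
          (NETDEV_OPEN_SKETCH | NETDEV_INSOMNIAC_SKETCH)))
        return;             /* closed and not insomniac: leave it alone */
    netdev->polls++;        /* stand-in for calling the driver poll method */
}
```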
Note that drivers are already permitted to call netdev_rx() et al even
when closed: this will already be happening for USB devices since
polling operates at the level of the whole USB bus, rather than at the
level of individual USB devices.
Michael Brown [Wed, 16 Apr 2025 20:26:45 +0000 (21:26 +0100)]
[efi] Allow for custom methods for disconnecting existing drivers
Allow for greater control over the process used to disconnect existing
drivers from a device handle, by converting the "exclude" field from a
simple protocol GUID to a per-driver method.