git.ipfire.org Git - thirdparty/ipxe.git/log

[fdt] Provide ability to locate the parent device node

Signed-off-by: Michael Brown <mcb30@ipxe.org>

[fdt] Add tests for device tree creation

Signed-off-by: Michael Brown <mcb30@ipxe.org>

[riscv] Add support for a SiFive-compatible early UART

Signed-off-by: Michael Brown <mcb30@ipxe.org>

[riscv] Support mapping early UARTs outside of the identity map

Some platforms (such as the Sipeed Lichee Pi 4A) choose to make early
debugging entertainingly cumbersome for the programmer. These
platforms not only fail to provide a functional SBI debug console, but
also choose to place the UART at a physical address that cannot be
identity-mapped under the only paging model supported by the CPU.

Support such platforms by creating a virtual address mapping for the
early UART (in the 2MB megapage immediately below iPXE itself), and
using this as the UART base address whenever paging is enabled.

Signed-off-by: Michael Brown <mcb30@ipxe.org>

[riscv] Add support for writing prefix debug messages direct to a UART

Some platforms (such as the Sipeed Lichee Pi 4A) do not provide a
functional SBI debug console.  We can obtain early debug messages on
these systems by writing directly to the UART used by the vendor
firmware.

There is no viable way to parse the UART address from the device tree,
since the prefix debug messages occur extremely early, before the C
runtime environment is available and therefore before any information
has been parsed from the device tree.  The early UART model and
register addresses must be configured by editing config/serial.h if
needed.  (This is an acceptable limitation, since prefix debugging is
an extremely specialised use case.)

Signed-off-by: Michael Brown <mcb30@ipxe.org>

[riscv] Create macros for writing characters to the debug console

Abstract out the SBI debug console calls into macros that can be
shared between print_message and print_hex_value.

Signed-off-by: Michael Brown <mcb30@ipxe.org>

[riscv] Ignore riscv,isa property in favour of direct CSR testing

The riscv,isa devicetree property appears not to be fully populated on
some real-world systems. For example, the Sipeed Lichee Pi 4A
(running the vendor U-Boot) reports itself as "rv64imafdcvsu", which
does not include the "zicntr" extension even though the time CSR is
present and functional.

Ignore the riscv,isa property and rely solely on CSR testing to
determine whether or not extensions are present.

Signed-off-by: Michael Brown <mcb30@ipxe.org>

[image] Use image name rather than pointer value in all debug messages

Signed-off-by: Michael Brown <mcb30@ipxe.org>

[riscv] Support mapping I/O devices outside of the identity map

With the 64-bit paging schemes (Sv39, Sv48, and Sv57), we identity-map
as much of the physical address space as is possible. Experimentation
shows that this is not sufficient to provide access to all I/O
devices. For example: the Sipeed Lichee Pi 4A includes a CPU that
supports only Sv39, but places I/O devices at the top of a 40-bit
address space.

Add support for creating I/O page table entries on demand to map I/O
devices, based on the existing design used for x86_64 BIOS.

Signed-off-by: Michael Brown <mcb30@ipxe.org>

[fdtmem] Ignore reservation regions with no fixed addresses

Do not print an error message for unused reservation regions that have
no fixed reserved address ranges.

Signed-off-by: Michael Brown <mcb30@ipxe.org>

[riscv] Include carriage returns in libprefix.S debug messages

Support debug consoles that do not automatically convert LF to CRLF by
including the CR character within the debug message strings.

Signed-off-by: Michael Brown <mcb30@ipxe.org>

[memmap] Allow explicit colour selection for memory map debug messages

Provide DBGC_MEMMAP() as a replacement for memmap_dump(), allowing the
colour used to match other messages within the same message group.

Retain a dedicated colour for output from memmap_dump_all(), on the
basis that it is generally most useful to visually compare full memory
dumps against previous full memory dumps.

Signed-off-by: Michael Brown <mcb30@ipxe.org>

[riscv] Support older SBI implementations

Fall back to attempting the legacy SBI console and shutdown calls if
the standard calls fail.

Signed-off-by: Michael Brown <mcb30@ipxe.org>

[memmap] Rename addr/last fields to min/max for clarity

Use the terminology "min" and "max" for addresses covered by a memory
region descriptor, since this is sufficiently intuitive to generally
not require further explanation.

Signed-off-by: Michael Brown <mcb30@ipxe.org>

[lkrn] Support initrd construction for RISC-V bare-metal kernels

Use the shared initrd reshuffling and CPIO header construction code
for RISC-V bare-metal kernels.  This allows for files to be injected
into the constructed ("magic") initrd image in exactly the same way as
is done for bzImage and UEFI kernels.

We append a dummy image encompassing the FDT to the end of the
reshuffle list, so that it ends up directly following the constructed
initrd in memory (but excluded from the initrd length, which was
recorded before constructing the FDT).

We also temporarily prepend the kernel binary itself to the reshuffle
list.  This is guaranteed to be safe (since reshuffling is designed to
be unable to fail), and avoids the requirement for the kernel segment
to be available before reshuffling.  This is useful since current
RISC-V bare-metal kernels tend to be distributed as EFI zboot images,
which require large temporary allocations from the external heap for
the intermediate images created during archive extraction.

Signed-off-by: Michael Brown <mcb30@ipxe.org>

[initrd] Squash and shuffle only initrds within the external heap

Any initrd images that are not within the external heap (e.g. embedded
images) do not need to be copied to the external heap for reshuffling,
and can just be left in their original locations.

Ignore any images that are not already within the external heap (or,
more precisely, that are wholly outside of the reshuffle region within
the external heap) when squashing and swapping images.

This reduces the maximum additional storage required by squashing and
swapping to zero, and so ensures that the reshuffling step is
guaranteed to succeed under all circumstances. (This is unrelated to
the post-reshuffle load region check, which is still required.)

Signed-off-by: Michael Brown <mcb30@ipxe.org>

[initrd] Split out initrd construction from bzimage.c

Provide a reusable function initrd_load_all() to load all initrds
(including any constructed CPIO headers) into a contiguous memory
region, and support functions to find the constructed total length and
permissible post-reshuffling load address range.

Signed-off-by: Michael Brown <mcb30@ipxe.org>

[initrd] Allow for images straddling the top of the reshuffle region

It is hypothetically possible for external heap memory allocated
during driver startup to have been freed before an image was
downloaded, which could therefore leave an image straddling the
address recorded as the top of the reshuffle region.

Allow for this possibility by skipping squashing for any images
already straddling (or touching) the top of the reshuffle region.

Signed-off-by: Michael Brown <mcb30@ipxe.org>

[initrd] Rename bzimage_align() to initrd_align()

Alignment of initrd lengths is applicable to all Linux kernels, not
just those in the x86 bzImage format.

Signed-off-by: Michael Brown <mcb30@ipxe.org>

[initrd] Swap initrds entirely in-place via triple reversal

Eliminate the requirement for free space when reshuffling initrds by
swapping adjacent initrds using an in-place triple reversal.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
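
As an illustration of the triple reversal named in the commit above, here is a minimal sketch, not the code from initrd.c. It swaps two adjacent byte blocks without any scratch buffer. The function names reverse_bytes() and swap_adjacent() are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Reverse a byte buffer in place (hypothetical helper) */
static void reverse_bytes ( uint8_t *data, size_t len ) {
	size_t i;
	size_t j;
	uint8_t tmp;

	if ( len < 2 )
		return;
	for ( i = 0, j = ( len - 1 ) ; i < j ; i++, j-- ) {
		tmp = data[i];
		data[i] = data[j];
		data[j] = tmp;
	}
}

/* Swap adjacent blocks A (len_a bytes) and B (len_b bytes) in place.
 *
 * Reversing A, then B, then the whole of AB leaves B followed by A,
 * with each block back in its original byte order.
 */
static void swap_adjacent ( uint8_t *data, size_t len_a, size_t len_b ) {
	reverse_bytes ( data, len_a );
	reverse_bytes ( ( data + len_a ), len_b );
	reverse_bytes ( data, ( len_a + len_b ) );
}
```

Each byte is moved exactly twice and only a single temporary byte is needed, which is why no free space is required.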

[uheap] Expose external heap region directly

We currently rely on implicit detection of the external heap region.
The INT 15 memory map mangler relies on examining the corresponding
in-use memory region, and the initrd reshuffler relies on performing a
separate detection of the largest free memory block after startup has
completed.

Replace these with explicit public symbols to describe the external
heap region.

Signed-off-by: Michael Brown <mcb30@ipxe.org>

[uheap] Prevent allocation of blocks with zero physical addresses

If the external heap ends up at the top of the system memory map then
leave a gap after the heap to ensure that no block ends up being
allocated with either a start or end address of zero, since this is
frequently confusing to both code and humans.

Signed-off-by: Michael Brown <mcb30@ipxe.org>

[fdtmem] Allow iPXE to be relocated to the top of the address space

Allow for relocation to a region at the very end of the physical
address space (where the next address wraps to zero).

Signed-off-by: Michael Brown <mcb30@ipxe.org>

[riscv] Speed up memmove() when copying in forwards direction

Use the word-at-a-time variable-length memcpy() implementation when
performing an overlapping copy in the forwards direction, since this
is guaranteed to be safe and likely to be substantially faster than
the existing bytewise copy.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
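
The reasoning in the commit above can be sketched in portable C as follows. This is an illustrative sketch only: the real implementation is RISC-V specific, and memmove_sketch() is a hypothetical name. When the destination lies below the source, a forwards copy never overwrites source bytes that have not yet been read, so whole words can be copied at a time.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Overlap-safe copy (hypothetical sketch, not the iPXE implementation) */
static void * memmove_sketch ( void *dest, const void *src, size_t len ) {
	unsigned char *d = dest;
	const unsigned char *s = src;
	unsigned long word;

	if ( ( uintptr_t ) d <= ( uintptr_t ) s ) {
		/* Forwards: each word is read in full before it is
		 * written, and every later read is from an address
		 * above everything written so far.
		 */
		while ( len >= sizeof ( word ) ) {
			memcpy ( &word, s, sizeof ( word ) );
			memcpy ( d, &word, sizeof ( word ) );
			s += sizeof ( word );
			d += sizeof ( word );
			len -= sizeof ( word );
		}
		while ( len-- )
			*(d++) = *(s++);
	} else {
		/* Backwards: bytewise copy from the end */
		while ( len-- )
			d[len] = s[len];
	}
	return dest;
}
```

The fixed-size memcpy() calls stand in for unaligned word loads and stores, which portable C cannot express directly.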

[lkrn] Shut down devices before jumping to kernel entry point

Signed-off-by: Michael Brown <mcb30@ipxe.org>

[lkrn] Allow a single initrd to be passed to the booted kernel

Allow a single initrd image to be passed verbatim to the booted RISC-V
kernel, as a proof of concept.

We do not yet support reshuffling to make optimal use of available
memory, or dynamic construction of CPIO headers, but this is
sufficient to allow iPXE to start up the Fedora 42 kernel with its
matching initrd image.

Signed-off-by: Michael Brown <mcb30@ipxe.org>

[fdt] Allow an initrd to be specified when creating a device tree

Allow an initrd location to be specified in our constructed device
tree via the "linux,initrd-start" and "linux,initrd-end" properties.

Signed-off-by: Michael Brown <mcb30@ipxe.org>

[initrd] Move initrd reshuffling to be architecture-independent code

There is nothing x86-specific in initrd.c, and a variant of the
reshuffling logic will be required for executing bare-metal kernels on
RISC-V and AArch64.

Signed-off-by: Michael Brown <mcb30@ipxe.org>

[image] Use image replacement when executing extracted images

Use image_replace() to transfer execution to the extracted image,
rather than calling image_exec() directly. This allows the original
archive image to be freed immediately if it was marked as an
automatically freeable image (e.g. via "chain --autofree").

In particular, this ensures that in the case of an archive image
containing another archive image (such as an EFI zboot kernel wrapper
image containing a gzip-compressed kernel image), the intermediate
extracted image will be freed as early as possible, since extracted
images are always marked as automatically freeable.

Signed-off-by: Michael Brown <mcb30@ipxe.org>

[lkrn] Add support for EFI zboot compressed kernel images

Current RISC-V and AArch64 kernels found in the wild tend not to be in
the documented kernel format, but are instead "EFI zboot" kernels
comprising a small EFI executable that decompresses and executes the
inner payload (which is a kernel in the expected format).

The EFI zboot header includes a recognisable magic value "zimg" along
with two fields describing the offset and length of the compressed
payload. We can therefore treat this as an archive image format,
extracting the payload as-is and then relying on our existing ability
to execute compressed images.

This is sufficient to allow iPXE to execute the Fedora 42 RISC-V
kernel binary as currently published.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
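
A rough sketch of how such a header might be checked is shown below. The commit message above states only that there is a "zimg" magic value plus an offset field and a length field. The byte offsets used here (magic at 4, payload offset at 8, payload length at 12, all little-endian) are assumptions, and the function name zboot_find_payload() is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed field positions within the header (not taken from the log) */
#define ZBOOT_MAGIC_OFFSET	4
#define ZBOOT_PAYLOAD_OFFSET	8
#define ZBOOT_PAYLOAD_LEN	12
#define ZBOOT_MIN_HDR_LEN	16

/* Read a little-endian 32-bit value from a byte buffer */
static uint32_t le32 ( const uint8_t *p ) {
	return ( ( ( uint32_t ) p[0] ) | ( ( ( uint32_t ) p[1] ) << 8 ) |
		 ( ( ( uint32_t ) p[2] ) << 16 ) |
		 ( ( ( uint32_t ) p[3] ) << 24 ) );
}

/* Locate the compressed payload; returns 0 on success, -1 on failure */
static int zboot_find_payload ( const uint8_t *image, size_t image_len,
				size_t *offset, size_t *len ) {
	uint32_t off;
	uint32_t plen;

	/* Check for a complete header with the expected magic value */
	if ( image_len < ZBOOT_MIN_HDR_LEN )
		return -1;
	if ( memcmp ( ( image + ZBOOT_MAGIC_OFFSET ), "zimg", 4 ) != 0 )
		return -1;

	/* Check that the payload lies wholly within the image */
	off = le32 ( image + ZBOOT_PAYLOAD_OFFSET );
	plen = le32 ( image + ZBOOT_PAYLOAD_LEN );
	if ( ( off > image_len ) || ( plen > ( image_len - off ) ) )
		return -1;

	*offset = off;
	*len = plen;
	return 0;
}
```

The payload found this way would then be handed, unmodified, to whatever code already handles compressed images.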

[lkrn] Add basic support for the RISC-V Linux kernel image format

The RISC-V and AArch64 bare-metal kernel images share a common header
format, and require essentially the same execution environment: loaded
close to the start of RAM, entered with paging disabled, and passed a
pointer to a flattened device tree that describes the hardware and any
boot arguments.

Implement basic support for executing bare-metal RISC-V and AArch64
kernel images.  The (trivial) AArch64-specific code path is untested
since we do not yet have the ability to build for any bare-metal
AArch64 platforms.  Constructing and passing an initramfs image is not
yet supported.

Rename the IMAGE_BZIMAGE build configuration option to IMAGE_LKRN,
since "bzImage" is specific to x86.  To retain backwards compatibility
with existing local build configurations, we leave IMAGE_BZIMAGE as
the enabled option in config/default/pcbios.h and treat IMAGE_LKRN as
a synonym for IMAGE_BZIMAGE when building for x86 BIOS.

Signed-off-by: Michael Brown <mcb30@ipxe.org>

[bios] Use generic external heap based on the system memory map

Signed-off-by: Michael Brown <mcb30@ipxe.org>

[riscv] Use generic external heap based on the system memory map

Signed-off-by: Michael Brown <mcb30@ipxe.org>

[uheap] Add a generic external heap based on the system memory map

Add an implementation of umalloc() using the generalised model of a
heap, placing the external heap in the largest usable region obtained
from the system memory map.

Signed-off-by: Michael Brown <mcb30@ipxe.org>

[malloc] Allow heap to specify block and pointer alignments

Size-tracked pointers allocated via umalloc() have historically been
aligned to a page boundary, as have the edges of the hidden memory
region covering the external heap.

Allow the block and size-tracked pointer alignments to be specified as
heap configuration parameters.

Signed-off-by: Michael Brown <mcb30@ipxe.org>

[malloc] Allow for the existence of multiple heaps

Create a generic model of a heap as a list of free blocks with
optional methods for growing and shrinking the heap.

Signed-off-by: Michael Brown <mcb30@ipxe.org>

[memmap] Remove now-obsolete get_memmap()

All memory map users have been updated to use the new system memory
map API. Remove get_memmap() and its associated definitions.

Signed-off-by: Michael Brown <mcb30@ipxe.org>

[bios] Use memmap_describe() to find an external heap location

Signed-off-by: Michael Brown <mcb30@ipxe.org>

[settings] Use memmap_describe() to construct memory map settings

Signed-off-by: Michael Brown <mcb30@ipxe.org>

[bios] Use memmap_describe() to find a relocation address

Signed-off-by: Michael Brown <mcb30@ipxe.org>

[comboot] Use memmap_describe() to obtain available memory

Signed-off-by: Michael Brown <mcb30@ipxe.org>

[multiboot] Use memmap_describe() to construct Multiboot memory map

Signed-off-by: Michael Brown <mcb30@ipxe.org>

[image] Use memmap_describe() to check loadable image segments

Signed-off-by: Michael Brown <mcb30@ipxe.org>

[memmap] Use memmap_dump_all() to dump debug memory maps

There are several places where get_memmap() is called solely to
produce debug output. Replace these with calls to memmap_dump_all()
(which will be a no-op unless debugging is enabled).

Signed-off-by: Michael Brown <mcb30@ipxe.org>

[bios] Describe umalloc() heap as an in-use memory area

Use the concept of an in-use memory region defined as part of the
system memory map API to describe the umalloc() heap.

Signed-off-by: Michael Brown <mcb30@ipxe.org>

[bios] Update to use the generic system memory map API

Provide an implementation of the system memory map API based on the
assorted BIOS INT 15 calls, and a temporary implementation of the
legacy get_memmap() function using the new API.

Signed-off-by: Michael Brown <mcb30@ipxe.org>

[fdtmem] Update to use the generic system memory map API

Provide an implementation of the system memory map API based on the
system device tree, excluding any memory outside the size of the
accessible physical address space and defining an in-use region to
cover the relocated copy of iPXE and the system device tree.

Signed-off-by: Michael Brown <mcb30@ipxe.org>

[memmap] Define an API for managing the system memory map

Define a generic system memory map API, based on the abstraction
created for parsing the FDT memory map and adding a concept of hidden
in-use memory regions as required to support patching the BIOS INT 15
memory map.

Signed-off-by: Michael Brown <mcb30@ipxe.org>

[tests] Remove prehistoric umalloc() test code

Signed-off-by: Michael Brown <mcb30@ipxe.org>

[fdtmem] Record size of accessible physical address space

The size of accessible physical address space will be required for the
runtime memory map, not just at relocation time. Make this size an
additional parameter to fdt_register() (matching the prototype for
fdt_relocate()), and record the value for future reference.

Note that we cannot simply store the limit in fdt_relocate() since it
is called before .data is writable and before .bss is zeroed.

Signed-off-by: Michael Brown <mcb30@ipxe.org>

[bios] Rename memmap.c to int15.c

Create namespace for an architecture-independent memmap.c by renaming
the BIOS-specific memmap.c to int15.c.

Signed-off-by: Michael Brown <mcb30@ipxe.org>

[bnxt] Use updated DMA APIs

Replace malloc_phys with dma_alloc, free_phys with dma_free, alloc_iob
with alloc_rx_iob, free_iob with free_rx_iob, virt_to_bus with dma or
iob_dma. Replace dma_addr_t with physaddr_t.

Signed-off-by: Joseph Wong <joseph.wong@broadcom.com>

[bnxt] Return proper error codes in probe

Return the proper error codes in bnxt_init_one, to indicate the
correct return status upon completion. Failure paths could
incorrectly indicate a success. Correct assertion condition to check
for non-NULL pointer.

Signed-off-by: Joseph Wong <joseph.wong@broadcom.com>

[crypto] Remove redundant null pointer check

Coverity reports a spurious potential null pointer dereference in
cms_decrypt(), since the null pointer check takes place after the
pointer has already been dereferenced. The pointer can never be null,
since it is initialised to point to cipher_null at the point that the
containing structure is allocated.

Remove the redundant null pointer check, and for symmetry ensure that
the digest and public-key algorithm pointers are similarly initialised
at the point of allocation.

Signed-off-by: Michael Brown <mcb30@ipxe.org>

[riscv] Add a .pf32 build target for padded parallel flash images

QEMU's -pflash option requires an image that has been padded to the
exact expected size (32MB for all of the supported RISC-V virtual
machines).

Add a .pf32 build target which is simply the equivalent .sbi target
padded to 32MB in size, to simplify testing.

Signed-off-by: Michael Brown <mcb30@ipxe.org>

[riscv] Perform a writability test before applying relocations

If paging is not supported, then we will attempt to apply dynamic
relocations to fix up the runtime addresses. If the image is
currently executing directly from flash memory, this can result in
effectively sending an undefined sequence of commands to the flash
device, which can cause unwanted side effects.

Perform an explicit writability test before applying relocations,
using a write value chosen to be safe for at least any devices
conforming to the JEDEC Common Flash Interface (CFI01).

Signed-off-by: Michael Brown <mcb30@ipxe.org>

[riscv] Avoid potentially overwriting the scratch area during relocation

We do not currently describe the temporary page table or the temporary
stack as areas to be avoided during relocation of the iPXE image to a
new physical address.

Perform the copy of the iPXE image and zeroing of the .bss within
libprefix.S, after we have no further use for the temporary page table
or the temporary initial stack. Perform the copy and registration of
the system device tree in C code after relocation is complete and the
new stack (within .bss) has been set up.

This provides a clean separation of responsibilities between the
RISC-V libprefix.S and the architecture-independent fdtmem.c. The
prefix is responsible only for relocating iPXE to the new physical
address returned from fdtmem_relocate(), and doesn't need to know or
care where fdtmem.c is planning to place the copy of the device tree.

Signed-off-by: Michael Brown <mcb30@ipxe.org>

[riscv] Add a .lkrn build target resembling a Linux kernel binary

On x86 BIOS, it has been useful to be able to build iPXE to resemble a
Linux kernel, so that it can be loaded by programs such as syslinux
which already know how to handle Linux kernel binaries.

Add an equivalent .lkrn build target for RISC-V SBI, allowing for
build targets such as:

  make bin-riscv64/ipxe.lkrn

  make bin-riscv64/cgem.lkrn

The Linux kernel header format allows us to specify a required length
(including uninitialised-data portions) and defines that the image
will be loaded at a fixed offset from the start of RAM.  We can
therefore use known-safe areas of memory (within our own .bss) for the
initial temporary page table and stack.

Signed-off-by: Michael Brown <mcb30@ipxe.org>

[riscv] Relocate to a safe physical address on startup

On startup, we may be running from read-only memory.  We need to parse
the devicetree to obtain the system memory map, and identify a safe
location to which we can copy our own binary image along with a
stashed copy of the devicetree, and then transfer execution to this
new location.

Parsing the system memory map realistically requires running C code.
This in turn requires a small temporary stack, and some way to ensure
that symbol references are valid.

We first attempt to enable paging, to make the runtime virtual
addresses equal to the link-time virtual addresses.  If this fails,
then we attempt to apply the compressed relocation records.

Assuming that one of these has worked (i.e. that either the CPU
supports paging or that our image started execution in writable
memory), then we call fdtmem_relocate() to parse the system memory map
to find a suitable relocation target address.

After the copy we disable paging, jump to the relocated copy,
re-enable paging, and reapply relocation records (if needed).  At this
point, we have a full runtime environment, and can transfer control to
normal C code.

Provide this functionality as part of libprefix.S, since it is likely
to be shared by multiple prefixes.

Signed-off-by: Michael Brown <mcb30@ipxe.org>

[riscv] Construct page tables based on link-time virtual addresses

Always construct the page tables based on the link-time address values
even if relocations have already been applied, on the assumption that
relocations will be reapplied after paging has been enabled.

Signed-off-by: Michael Brown <mcb30@ipxe.org>

[riscv] Allow apply_relocs() to use non-inline relocation records

The address of the compressed relocation records is currently
calculated implicitly relative to the program counter. This requires
the relocation records to be copied as part of relocation to a new
physical address, so that they can be reapplied (if needed) after
copying iPXE to the new physical address.

Since the relocation destination will never overlap the original iPXE
image, and since the relocation records will not be needed further
after completing relocation, we can avoid the need to copy the records
by passing in a pointer to the relocation records present in the
original iPXE image.

Pass the compressed relocation record address as an explicit parameter
to apply_relocs(), rather than being implicit in the program counter.

Signed-off-by: Michael Brown <mcb30@ipxe.org>

[riscv] Return accessible physical address space size from enable_paging()

Relocation requires knowledge of the size of the accessible physical
address space, which for 64-bit CPUs will vary according to the paging
level supported by the processor.

Update enable_paging_64() and enable_paging_32() to calculate and
return the size of the accessible physical address space.

Signed-off-by: Michael Brown <mcb30@ipxe.org>

[fdtmem] Add ability to parse FDT memory map for a relocation address

Add code to parse the devicetree memory nodes, memory reservations
block, and reserved memory nodes to construct an ordered and
non-overlapping description of the system memory map, and use this to
identify a suitable address to which iPXE may be relocated at runtime.

We choose to place iPXE on a superpage boundary (as required by the
paging code), and to use the highest available address within
accessible memory. This mirrors the approach taken for x86 BIOS
builds, where we have long assumed that any image format that we might
need to support may require specific fixed addresses towards the
bottom of the memory map, but is very unlikely to require specific
fixed addresses towards the top of the memory map (since those
addresses may not exist, depending on the amount of installed RAM).

Signed-off-by: Michael Brown <mcb30@ipxe.org>
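
The placement rule described in the commit above (highest available address, on a superpage boundary) can be sketched for a single usable region as follows. This is a simplified illustration under stated assumptions, not the fdtmem.c logic: it ignores reservations, takes the region as an inclusive [min,max] range, and assumes the alignment is a power of two. The name highest_aligned_start() is hypothetical.

```c
#include <stdint.h>

/* Find the highest aligned start address at which an image of the
 * given length fits within the inclusive region [min,max].
 *
 * Returns 0 on success, -1 if the image does not fit.
 */
static int highest_aligned_start ( uint64_t min, uint64_t max,
				   uint64_t len, uint64_t align,
				   uint64_t *start ) {
	uint64_t candidate;

	/* Reject empty images and regions that are too small */
	if ( ( len == 0 ) || ( max < min ) || ( ( max - min ) < ( len - 1 ) ) )
		return -1;

	/* Highest start address that still fits, rounded down */
	candidate = ( ( max - ( len - 1 ) ) & ~( align - 1 ) );
	if ( candidate < min )
		return -1;

	*start = candidate;
	return 0;
}
```

Working with an inclusive maximum address rather than an exclusive end address means that the arithmetic does not wrap even for a region ending at the very top of the address space.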

[riscv] Ensure that prefix_virt is aligned on an xlen boundary

Ensure that the prefix_virt dynamic relocation ends up on a suitably
aligned boundary for a compressed relocation.

Signed-off-by: Michael Brown <mcb30@ipxe.org>

[riscv] Hold virtual address offset in the thread pointer register

iPXE does not make use of any thread-local storage. Use the otherwise
unused thread pointer register ("tp") to hold the current value of
the virtual address offset, rather than using a global variable.

This ensures that virt_offset can be made valid even during very early
initialisation (when iPXE may be executing directly from read-only
memory and so cannot update a global variable).

Signed-off-by: Michael Brown <mcb30@ipxe.org>

[fdt] Generalise access to "reg" property

The "reg" property is also used by non-device nodes, such as the nodes
describing the system memory map.

Provide generalised functionality for parsing the "#address-cells",
"#size-cells", and "reg" properties.

Signed-off-by: Michael Brown <mcb30@ipxe.org>

[riscv] Use load and store pseudo-instructions where possible

The pattern of "load address to register" followed by "load value from
address in register" generally results in three instructions: two to
load the address and one to load the value.

This can be reduced to two instructions by allowing the assembler to
incorporate the low bits of the address within the load (or store)
instruction itself. In the case of a store, this requires specifying
a second register that can be temporarily used to hold the high bits
of the address. (In the case of a load, the destination register is
reused for this purpose.)

Signed-off-by: Michael Brown <mcb30@ipxe.org>

[build] Formalise mechanism for accessing absolute symbols

In a position-dependent executable, where all addresses are fixed
at link time, we can use the standard technique as documented by
GNU ld to get the value of an absolute symbol, e.g.:

    extern char _my_symbol[];

    printf ( "Absolute symbol value is %x\n", ( ( int ) _my_symbol ) );

This technique may not work in a position-independent executable.
When dynamic relocations are applied, the runtime addresses will no
longer be equal to the link-time addresses.  If the code to obtain the
address of _my_symbol uses PC-relative addressing, then it will
calculate the runtime "address" of the absolute symbol, which will no
longer be equal to the link-time "address" (i.e. the correct value)
of the absolute symbol.

Define macros ABS_SYMBOL(), ABS_VALUE_INIT(), and ABS_VALUE() that
provide access to the correct values of absolute symbols even in
position-independent code, and use these macros wherever absolute
symbols are accessed.

Signed-off-by: Michael Brown <mcb30@ipxe.org>

[libc] Display assertion failure message before incrementing counter

During early initialisation on some platforms, the .data and .bss
sections may not yet be writable.

Display the assertion message before attempting to increment the
assertion failure counter, since writing to the assertion counter may
trigger a CPU exception that ends up resetting the system.

Signed-off-by: Michael Brown <mcb30@ipxe.org>

[riscv] Add support for disabling 64-bit and 32-bit paging

Signed-off-by: Michael Brown <mcb30@ipxe.org>

[riscv] Return virtual address offset from enable_paging()

Once paging has been enabled, there is no direct way to determine the
virtual address offset without external knowledge. (The paging mode,
if needed, can be read directly from the SATP CSR.)

Change the return value from enable_paging() to provide the virtual
address offset.

Signed-off-by: Michael Brown <mcb30@ipxe.org>

[riscv] Restore temporarily modified PTE within 32-bit transition code

If the virtual address offset is precisely one page (i.e. each virtual
address maps to a physical address one page higher), and if the 32-bit
transition code happens to end up at the end of a page (which would
require an unrealistic 2MB of content in .prefix), then it would be
possible for the program counter to cross into the portion of the
virtual address space still borrowed for use as the temporary physical
map.

Avoid this remote possibility by moving the restoration of the
temporarily modified PTE within the transition code block (which is
guaranteed to remain within a single page since it is aligned on its
own size).

This unfortunately requires increasing the alignment of the transition
code (and hence the maximum number of NOPs inserted).  The assembler
syntax theoretically allows us to avoid inserting any NOPs via a
directive such as:

   .balign PAGE_SIZE, , enable_paging_32_max_len

(i.e. relying on the fact that if the transition code is already
sufficiently far away from the end of a page, then no padding needs to
be inserted).  However, alignment on RISC-V is implemented using the
R_RISCV_ALIGN relaxing relocation, which doesn't encode any concept of
a maximum padding length, and so the maximum padding length value is
effectively ignored.

Signed-off-by: Michael Brown <mcb30@ipxe.org>

[uaccess] Generalise librm's virt_offset mechanism for RISC-V

The virtual offset memory model used for i386-pcbios and x86_64-pcbios
can be generalised to also cover riscv32-sbi and riscv64-sbi. In both
architectures, the 32-bit builds will use a circular map of the 32-bit
address space, and the 64-bit builds will use an identity map for the
relevant portion of the physical address space, with iPXE itself
placed in the negative (kernel) address space.

Generalise and document the virt_offset mechanism, and set it as the
default for both PCBIOS and SBI platforms.

Signed-off-by: Michael Brown <mcb30@ipxe.org>

[build] Constrain PHYS_CODE() and REAL_CODE() to use i386 registers

Inline assembly using PHYS_CODE() or REAL_CODE() must use the "R"
constraint rather than the "r" constraint to ensure that the compiler
chooses registers that will be valid for the 32-bit or 16-bit assembly
code fragment.

Signed-off-by: Michael Brown <mcb30@ipxe.org>

[riscv] Add debug printing of hexadecimal values in libprefix.S

Add millicode routines to print hexadecimal values (with any number of
digits), and macros to print register contents or symbol addresses.

Signed-off-by: Michael Brown <mcb30@ipxe.org>

[riscv] Move prefix system reset code to libprefix.S

Signed-off-by: Michael Brown <mcb30@ipxe.org>

[riscv] Add basic debug progress messages in libprefix.S

Signed-off-by: Michael Brown <mcb30@ipxe.org>

[riscv] Provide a millicode variant of print_message()

RISC-V has a millicode calling convention that allows for the use of
an alternative link register x5/t0. With sufficient care, this allows
for two levels of subroutine call even when no stack is available.

Provide both standard and millicode entry points for print_message(),
and use the millicode entry point to allow for printing debug messages
from libprefix.S itself.

Signed-off-by: Michael Brown <mcb30@ipxe.org>

[riscv] Move prefix debug message printing to libprefix.S

Create a prefix library function print_message() to print text to the
SBI debug console. Use the "write byte" SBI call (rather than "write
string") so that the function remains usable even after enabling
paging.

Signed-off-by: Michael Brown <mcb30@ipxe.org>

[riscv] Place prefix debug strings in .rodata

The GNU assembler does not seem to automatically assume alignment to
an instruction boundary for sections containing assembled code.

Place the prefix debug strings (if present) in .rodata rather than in
.prefix, to avoid potentially creating misaligned code sections.

Signed-off-by: Michael Brown <mcb30@ipxe.org>

[riscv] Use compressed relocation records

Use compressed relocation records instead of raw Elf_Rela records.
This saves around 15% of the total binary size for the all-drivers
image bin-riscv64/ipxe.sbi.

Signed-off-by: Michael Brown <mcb30@ipxe.org>

[riscv] Place .got and .got.plt in .data

Even though we build with -mno-plt, redundant .got and .got.plt
sections are still generated.

Include these redundant sections within .data (which has identical
section attributes) to simplify the section list.

Signed-off-by: Michael Brown <mcb30@ipxe.org>

[riscv] Discard ELF hash tables

The ELF hash table is generated when building a position-independent
executable even though it is not required (since we have no dynamic
linker).

Explicitly discard these unneeded sections.

Signed-off-by: Michael Brown <mcb30@ipxe.org>

[zbin] Allow for constructing compressed dynamic relocation records

Define a new "ZREL" compressor information block, describing a block
of Elf_Rel or Elf_Rela runtime relocations to be converted to an
iPXE-specific compressed relocation format.

The compressed relocation format is based loosely on the Elf_Relr
bitmap+offset format, with some optimisations for use in iPXE.  In
particular:

  - a relative "skip" value is used instead of an absolute offset

  - the width of the skip value is reduced to 19 bits (when present)

  - an explicit skip value of zero is used to terminate the list

  - unaligned relocations are prohibited

The layout of bits within the compressed relocation record is also
adjusted to make assembly code implementations simpler: the skip flag
bit is placed in the MSB so that it can be tested using "bltz" or
similar instructions, and the skip value is placed above the
relocation flag bits so that a typical shifting implementation will
naturally end up with a zero value in its accumulator if and only if
the record was a terminator.
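
The following is a minimal sketch of how a decoder for a skip-plus-bitmap record of this kind might look.  The commit message does not give the exact field widths, so the layout here is an assumption made for illustration only: 32-bit records, a 12-bit relocation bitmap below the 19-bit skip value when the skip flag is set, a 31-bit bitmap otherwise, and skip values counted in words.  The names zrel_apply(), ZREL_SKIP_FLAG, ZREL_SKIP_SHIFT and ZREL_SKIP_MASK are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ZREL_SKIP_FLAG	0x80000000U	/* MSB: record carries a skip value */
#define ZREL_SKIP_SHIFT	12		/* assumed: 12 flag bits below the skip */
#define ZREL_SKIP_MASK	0x7ffffU	/* 19-bit skip value */

/* Apply relocations to an array of words, adding "delta" to each
 * relocated word.  Returns the number of relocations applied.
 */
static unsigned int zrel_apply ( const uint32_t *recs, uint32_t *words,
				 size_t count, uint32_t delta ) {
	size_t pos = 0;
	unsigned int applied = 0;
	unsigned int nbits;
	uint32_t rec, bits, skip;

	for ( ; ; recs++ ) {
		rec = *recs;
		if ( rec & ZREL_SKIP_FLAG ) {
			skip = ( ( rec >> ZREL_SKIP_SHIFT ) & ZREL_SKIP_MASK );
			/* An explicit skip value of zero ends the list */
			if ( skip == 0 )
				return applied;
			/* Skip is relative to the current position */
			pos += skip;
			bits = ( rec & ( ( 1U << ZREL_SKIP_SHIFT ) - 1 ) );
			nbits = ZREL_SKIP_SHIFT;
		} else {
			/* No skip: all remaining bits are relocation flags */
			bits = rec;
			nbits = 31;
		}
		/* Each flag bit covers one aligned word */
		for ( ; nbits ; nbits--, bits >>= 1, pos++ ) {
			if ( bits & 1 ) {
				assert ( pos < count );
				words[pos] += delta;
				applied++;
			}
		}
	}
}
```

For example, under the assumed layout, the record 0x80000000 | ( 2 << 12 ) | 0x5 followed by the terminator 0x80000000 would skip two words and then relocate the words at indices 2 and 4.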

Signed-off-by: Michael Brown <mcb30@ipxe.org>

[build] Allow for 32-bit and 64-bit versions of util/zbin

Parsing ELF data is simpler if we don't have to build a single binary
to handle both 32-bit and 64-bit ELF formats.

Allow for separate 32-bit and 64-bit binaries built from util/zbin.c
(as is already done for util/elf2efi.c).

Signed-off-by: Michael Brown <mcb30@ipxe.org>

[riscv] Add support for enabling 32-bit paging

Add code to construct a 32-bit page table to map the whole of the
32-bit address space with a fixed offset selected to map iPXE itself
at its link-time address, and to return with paging enabled and the
program counter updated to a virtual address.
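
A fixed-offset mapping of this kind can be sketched in C using Sv32 4MB megapages, with one top-level entry per megapage.  This is only an illustration under stated assumptions: the real code is written in assembly, and the function name sv32_build_table(), the choice of permission flags, and the direction of the offset (physical = virtual + offset, modulo 4GB) are assumptions rather than details taken from the commit.

```c
#include <assert.h>
#include <stdint.h>

#define SV32_ENTRIES		1024	/* top-level page table entries */
#define SV32_MEGAPAGE_SHIFT	22	/* each entry maps a 4MB megapage */

/* Page table entry flag bits (RISC-V privileged specification) */
#define PTE_V	0x01	/* valid */
#define PTE_R	0x02	/* readable */
#define PTE_W	0x04	/* writable */
#define PTE_X	0x08	/* executable */
#define PTE_A	0x40	/* accessed */
#define PTE_D	0x80	/* dirty */

/* Fill a top-level Sv32 table so that every virtual megapage maps to
 * the physical megapage at (virtual + offset), wrapping at 4GB.
 */
static void sv32_build_table ( uint32_t table[SV32_ENTRIES],
			       uint32_t offset ) {
	uint32_t flags = ( PTE_V | PTE_R | PTE_W | PTE_X | PTE_A | PTE_D );
	uint32_t virt, phys;
	unsigned int i;

	/* The offset must itself be a whole number of megapages */
	assert ( ( offset & ( ( 1U << SV32_MEGAPAGE_SHIFT ) - 1 ) ) == 0 );

	for ( i = 0 ; i < SV32_ENTRIES ; i++ ) {
		virt = ( ( ( uint32_t ) i ) << SV32_MEGAPAGE_SHIFT );
		phys = ( virt + offset );	/* wraps modulo 4GB */
		/* PPN occupies PTE bits 31:10, i.e. (phys >> 12) << 10 */
		table[i] = ( ( phys >> 2 ) | flags );
	}
}
```

The offset would be chosen as the load-time physical address minus the link-time virtual address, so that the entry covering the link-time address points at the memory actually holding the image.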

Signed-off-by: Michael Brown <mcb30@ipxe.org>

[riscv] Add support for enabling 64-bit paging

Paging provides an alternative to using relocations: instead of
applying relocation fixups to the runtime addresses, we can set up
virtual addressing so that the runtime addresses match the link-time
addresses.

This opens up the possibility of running portions of iPXE directly
from read-only memory (such as a memory-mapped flash device), subject
to the caveats that .data is not yet writable and .bss is not yet
zeroed. This should allow us to run enough code to parse the memory
map from the FDT, identify a suitable RAM block, and physically
relocate ourselves there.

Add code to construct a 64-bit page table (in a single 4kB buffer) to
identity-map as much of the physical address space as possible, to map
iPXE itself at its link-time address, and to return with paging
enabled and the program counter updated to a virtual address. We use
the highest paging level supported by the CPU, to maximise the amount
of the physical address space covered by the identity map.
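
To see why the paging level matters, the arithmetic can be sketched as follows.  A single 4kB table holds 512 eight-byte entries, and each top-level entry covers a region whose size depends on the number of levels (3 for Sv39, 4 for Sv48, 5 for Sv57).  The helper names are hypothetical, and treating the lower half of the table as the ceiling for the identity map is an upper bound derived from the sign-extended virtual address format, not a statement of exactly how many entries the real code uses.

```c
#include <assert.h>
#include <stdint.h>

#define PTE_COUNT 512	/* 4kB table / 8 bytes per entry */

/* Bytes covered by one top-level entry for a given number of levels:
 * a 4kB base page (12 bits) plus 9 index bits per lower level.
 */
static uint64_t top_level_page_size ( unsigned int levels ) {
	assert ( ( levels >= 3 ) && ( levels <= 5 ) );
	return ( 1ULL << ( 12 + ( 9 * ( levels - 1 ) ) ) );
}

/* Upper bound on the identity-mapped region: only the lower half of
 * the table corresponds to non-negative virtual addresses.
 */
static uint64_t max_identity_map_size ( unsigned int levels ) {
	return ( ( PTE_COUNT / 2 ) * top_level_page_size ( levels ) );
}
```

Under these assumptions the bound is 256GB for Sv39, 128TB for Sv48, and 64PB for Sv57, which is why choosing the highest supported level maximises identity-map coverage.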

Signed-off-by: Michael Brown <mcb30@ipxe.org>

[riscv] Allow for a non-zero link-time address

Using paging (rather than relocation records) will be easier on 64-bit
RISC-V if we place iPXE within the negative (kernel) virtual address
space.

Allow the link-time address to be non-zero and to vary between 32-bit
and 64-bit builds. Choose addresses that are expected to be amenable
to the use of paging.

There is no particular need to use a non-zero address in the 32-bit
builds, but doing so allows us to validate that the relocation code is
handling this case correctly.

Signed-off-by: Michael Brown <mcb30@ipxe.org>

[riscv] Split out runtime relocator to libprefix.S

Split out the runtime relocation logic from sbiprefix.S to a new
library libprefix.S.

Since this logically decouples the process of runtime relocation from
the _sbi_start symbol (currently used to determine the base address
for applying relocations), provide an alternative mechanism for the
relocator to determine the base address.

Signed-off-by: Michael Brown <mcb30@ipxe.org>

[uaccess] Remove redundant virt_to_user() and userptr_t

Remove the last remaining traces of the concept of a user pointer,
leaving iPXE with a simpler and cleaner memory model that implicitly
assumes that all memory locations can be reached through pointer
dereferences.

Signed-off-by: Michael Brown <mcb30@ipxe.org>

[uaccess] Reduce scope of included uaccess.h header

The uaccess.h header is no longer required for any code that touches
external ("user") memory, since such memory accesses are now performed
through pointer dereferences. Reduce the number of files including
this header.

Signed-off-by: Michael Brown <mcb30@ipxe.org>

[image] Make image data read-only to most consumers

Almost all image consumers do not need to modify the content of the
image.  Now that the image data is a pointer type (rather than the
opaque userptr_t type), we can rely on the compiler to enforce this at
build time.

Change the .data field to be a const pointer, so that the compiler can
verify that image consumers do not modify the image content.  Provide
a transparent .rwdata field for consumers who have a legitimate (and
now explicit) reason to modify the image content.
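
One way to obtain a const view and a writable view of the same pointer is an anonymous union, sketched below.  The structure and function names (demo_image, demo_checksum(), demo_wipe()) are hypothetical and the structure is heavily cut down; this shows the general technique and is not a copy of the real definition.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, cut-down image structure */
struct demo_image {
	union {
		/* Read-only view, used by most consumers */
		const void *data;
		/* Writable view, for consumers that must modify content */
		void *rwdata;
	};
	size_t len;
};

/* A read-only consumer: any write through image->data would be
 * rejected by the compiler.
 */
static unsigned int demo_checksum ( const struct demo_image *image ) {
	const unsigned char *bytes = image->data;
	unsigned int sum = 0;
	size_t i;

	for ( i = 0 ; i < image->len ; i++ )
		sum += bytes[i];
	return sum;
}

/* A consumer that modifies the content, and says so explicitly by
 * going through the rwdata field.
 */
static void demo_wipe ( struct demo_image *image ) {
	assert ( image->rwdata != NULL );
	memset ( image->rwdata, 0, image->len );
}
```

Because both fields alias the same storage, no runtime cost is added; the distinction exists purely so that an accidental write shows up as a build error.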

We do not attempt to enforce at runtime whether or not an image is
writable.  The only existing instances of genuinely read-only images
are the various unit test images, and it is acceptable for defective
test cases to result in a segfault rather than a runtime error.

Signed-off-by: Michael Brown <mcb30@ipxe.org>

[image] Add the concept of a static image

Not all images are allocated via alloc_image(). For example: embedded
images, the static images created to hold a runtime command line, and
the images used by unit tests are all static structures.

Using image_set_cmdline() (via e.g. the "imgargs" command) to set the
command-line arguments of a static image will succeed but will leak
memory, since nothing will ever free the allocated command line.
There are no code paths that can lead to calling image_set_len() on a
static image, but there is no safety check against future code paths
attempting this.

Define a flag IMAGE_STATIC to mark an image as statically allocated,
generalise free_image() to also handle freeing dynamically allocated
portions of static images (such as the command line), and expose
free_image() for use by static images.

Define a related flag IMAGE_STATIC_NAME to mark the name as statically
allocated. Allow a statically allocated name to be replaced with a
dynamically allocated name since this is a potentially valid use case
(e.g. if "imgdecrypt --name <name>" is used on an embedded image).
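
The interaction between the two flags can be sketched as below.  All names here (demo_simage, DEMO_IMAGE_STATIC, DEMO_IMAGE_STATIC_NAME, demo_set_name(), demo_free_image()) are hypothetical stand-ins, and the structure omits everything except the fields needed to show the idea; the real implementation may differ in detail.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define DEMO_IMAGE_STATIC	0x01	/* structure is not heap-allocated */
#define DEMO_IMAGE_STATIC_NAME	0x02	/* name is not heap-allocated */

/* Hypothetical, cut-down image structure */
struct demo_simage {
	unsigned int flags;
	char *name;
	char *cmdline;
};

/* Replace the name with a dynamically allocated copy */
static int demo_set_name ( struct demo_simage *image, const char *name ) {
	char *copy = malloc ( strlen ( name ) + 1 );

	if ( ! copy )
		return -1;
	strcpy ( copy, name );
	/* Only a dynamically allocated old name may be freed */
	if ( ! ( image->flags & DEMO_IMAGE_STATIC_NAME ) )
		free ( image->name );
	image->name = copy;
	image->flags &= ~DEMO_IMAGE_STATIC_NAME;
	return 0;
}

/* Free the dynamically allocated portions of an image, and the image
 * structure itself only if it was not statically allocated.
 */
static void demo_free_image ( struct demo_simage *image ) {
	free ( image->cmdline );
	image->cmdline = NULL;
	if ( ! ( image->flags & DEMO_IMAGE_STATIC_NAME ) ) {
		free ( image->name );
		image->name = NULL;
	}
	if ( ! ( image->flags & DEMO_IMAGE_STATIC ) )
		free ( image );
}
```

With this arrangement, a static image whose name has been replaced still has the replacement name released by demo_free_image(), while the static structure itself is left alone.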

Signed-off-by: Michael Brown <mcb30@ipxe.org>

[image] Move embedded images from .rodata to .data

Decrypting a CMS-encrypted image will overwrite the existing image
data in place, and using an encrypted embedded image is a valid use
case.

Move embedded images from .rodata to .data to reflect the fact that
they are intended to be writable.

Signed-off-by: Michael Brown <mcb30@ipxe.org>

[test] Separate read-only and writable CMS test images

Signed-off-by: Michael Brown <mcb30@ipxe.org>

[uaccess] Remove redundant copy_from_user() and copy_to_user()

Remove the now-redundant copy_from_user() and copy_to_user() wrapper
functions.

Signed-off-by: Michael Brown <mcb30@ipxe.org>

[image] Clear recorded replacement image immediately after consuming

If an embedded script uses "chain --replace", the embedded image will
retain a reference to the replacement image in perpetuity.

Fix by clearing any recorded replacement image immediately in
image_exec(), instead of relying upon free_image() to drop the
reference.

Signed-off-by: Michael Brown <mcb30@ipxe.org>

[bofm] Remove userptr_t from BOFM table parsing and updating

Signed-off-by: Michael Brown <mcb30@ipxe.org>

[bofm] Allow BOFM tests to be run without a BOFM-capable device driver

The BOFM tests are not part of the standard unit test suite, since
they are designed to allow for exercising real BOFM driver code
outside of the context of a real IBM blade server.

Allow for the BOFM tests to be run without a real BOFM driver, by
providing a dummy driver for the specified PCI test device.

Signed-off-by: Michael Brown <mcb30@ipxe.org>

[build] Remove some long-obsolete unused header files

Signed-off-by: Michael Brown <mcb30@ipxe.org>