Michael Brown [Mon, 1 Jun 2026 13:12:39 +0000 (14:12 +0100)]
[linux] Disable implicit linking against libatomic
GCC 16 attempts to link against -latomic_asneeded by default, and
expects that this library will be provided by the installed build
toolchain alongside libgcc.
The Fedora cross-gcc packages do not include libatomic, which causes
the build to fail.
We do not require any functions provided by libatomic. Work around
the missing packaged files in Fedora by disabling gcc's implicit
linking via the -fno-link-libatomic build option.
Joseph Wong [Thu, 28 May 2026 05:45:59 +0000 (22:45 -0700)]
[tg3] Use updated DMA APIs
Replace malloc_phys with dma_alloc, free_phys with dma_free, alloc_iob
with alloc_rx_iob, free_iob with free_rx_iob, virt_to_bus with dma or
iob_dma. Replace dma_addr_t with physaddr_t.
Signed-off-by: Joseph Wong <joseph.wong@broadcom.com>
Michael Brown [Sun, 24 May 2026 22:00:08 +0000 (23:00 +0100)]
[loong64] Port the RISC-V optimised TCP/IP checksum implementation
As with most other assembly code in iPXE, LoongArch64 is sufficiently
close to RISC-V that a straightforward transcription of the assembly
language code generally works.
Copy the RISC-V implementation for TCP/IP checksumming, retain the
register names and ABI, and just adjust the syntax to match
LoongArch64 requirements.
Rework the code to eliminate some unnecessary folding steps, and hold the folded value
in the most significant bits of the register rather than the least
significant bits so that the final one's complement negation can be
accomplished naturally without requiring an explicit 0xffff constant.
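The effect of holding the folded sum in the most significant bits can be illustrated with a short Python sketch. This models only the arithmetic, not the LoongArch64 assembly: the function names and the 64-bit register model are assumptions made for illustration.

```python
MASK64 = (1 << 64) - 1  # models the width of a 64-bit register


def ones_complement_sum(data: bytes) -> int:
    """Sum big-endian 16-bit words, folding carries back in."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = sum(int.from_bytes(data[i:i + 2], "big")
                for i in range(0, len(data), 2))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def negate_low(folded: int) -> int:
    """Folded value in the low bits: the NOT needs an explicit mask."""
    return ~folded & 0xFFFF


def negate_high(folded: int) -> int:
    """Folded value in the top 16 bits of a 64-bit register."""
    reg = (folded << 48) & MASK64
    reg = ~reg & MASK64  # whole-register NOT, no 0xffff constant
    return reg >> 48     # logical shift discards the inverted low bits
```

Both variants produce the same checksum. The second needs only a register-wide NOT followed by a shift.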
Michael Brown [Thu, 21 May 2026 14:41:08 +0000 (15:41 +0100)]
[ci] Include UEFI Secure Boot builds for RISC-V 64 and LoongArch64
The usage pattern for UEFI Secure Boot on RISC-V 64 and LoongArch64 is
not yet well defined: there is no equivalent on those architectures
for the UEFI shim or the Microsoft signing submission infrastructure.
Include signed binaries for these architectures within the release
artifacts. Users may choose to enrol the iPXE Secure Boot CA
certificate on their own systems in order to use these binaries with
UEFI Secure Boot enabled.
OEMs such as Loongson may choose to include the iPXE Secure Boot CA
certificate within their default enrolled certificate list, or to
issue a cross-signed version of the iPXE Secure Boot CA certificate
(which could then be included within the official iPXE binaries in
future releases).
Michael Brown [Thu, 21 May 2026 14:19:10 +0000 (15:19 +0100)]
[loong64] Replace optimised string operations
The current implementation of the optimised string operations appears
to have been ported from the (old) arm64 implementation, and does not
cleanly match the LoongArch64 instruction set.
Replace with code derived from the riscv64 implementation, modified to
use indexed load and store instructions.
Michael Brown [Thu, 21 May 2026 11:43:47 +0000 (12:43 +0100)]
[neighbour] Discard deferred packets before discarding complete entries
Discarding neighbour cache entries for active connections is known to
be extremely disruptive, and is therefore done only as a last resort
when attempting to free up memory for a new allocation attempt.
There is currently no way to discard the deferred packet queue
separately from discarding the complete neighbour cache entry. Under
some conditions (such as a sustained ICMP echo request packet flood
from an IP address that will never complete neighbour resolution),
this can lead to the deferred packet queue growing without limit,
which will eventually lead to complete neighbour cache entries being
discarded.
Split out the logic in neighbour_destroy() for dropping deferred
packets to a separate neighbour_drop() function, and add a separate
cache discarder that will use this to free up memory without requiring
the complete neighbour cache entry to be discarded.
Reported-by: Daniel Kiper <daniel.kiper@oracle.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
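The two-stage discard order described above can be sketched in Python. This is a simplified model under stated assumptions, not the iPXE code: the class and method names are hypothetical, and only drop() is named after the neighbour_drop() function mentioned in the commit.

```python
from collections import deque


class Neighbour:
    def __init__(self, address: str):
        self.address = address
        self.deferred = deque()  # packets awaiting neighbour resolution

    def drop(self) -> int:
        """Drop all deferred packets and return how many were freed."""
        count = len(self.deferred)
        self.deferred.clear()
        return count


class NeighbourCache:
    def __init__(self):
        self.entries = []

    def discard_deferred(self) -> int:
        """Cheap discarder: free deferred packets, keep every entry."""
        return sum(entry.drop() for entry in self.entries)

    def discard_entry(self) -> int:
        """Disruptive discarder: destroy the oldest complete entry."""
        if not self.entries:
            return 0
        victim = self.entries.pop(0)
        victim.drop()
        return 1

    def free_memory(self) -> str:
        """Try the cheap discarder first, the disruptive one last."""
        if self.discard_deferred():
            return "deferred"
        if self.discard_entry():
            return "entry"
        return "nothing"
```

With a flood of unresolvable traffic queued on one entry, the first call to free_memory() releases the queued packets and leaves all cache entries in place.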
Michael Brown [Wed, 13 May 2026 14:32:17 +0000 (15:32 +0100)]
[virtio] Replace the virtio core and network device driver
The existing virtio network driver has been somewhat hacked together
over the past two decades by multiple contributors, and includes a
substantial amount of logic that is almost but not quite duplicated
between the "legacy" and "modern" code paths.
Rip out the existing driver and replace with a completely new driver
written based on the Virtual I/O Device specification document, not
derived from the Linux kernel driver.
Michael Brown [Tue, 12 May 2026 11:09:28 +0000 (12:09 +0100)]
[lacp] Use the same system identifier for all ports
Commit 3d43789 ("[lacp] Detect and ignore erroneously looped back LACP
packets") added protection against LACP packet storms that arise when
our own transmitted packets are somehow looped back to the same port,
but does not protect against a situation in which we have two
different ports that are externally bridged to each other.
This situation is unlikely to arise in practice since a properly
configured link partner should not be both sending and forwarding LACP
packets. Triggering this situation essentially requires our two ports
to be connected to a non-LACP-capable switch, while another port on
the same switch is connected to a separate device that is sending out
LACP packets.
Guard against this situation by using the MAC address of the first
network device as the LACP system identifier, thereby allowing the
loopback detection to reject any packets that were sent from any of
our ports.
Since the system identifier is no longer unique between ports, use the
guaranteed-unique network device scope ID as the group key to indicate
that we do not support aggregation.
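A minimal Python model shows why a shared system identifier catches the bridged-port case. The MAC addresses, port names, and function names are hypothetical and chosen only for illustration.

```python
PORT_MACS = {
    "net0": bytes.fromhex("525400000001"),
    "net1": bytes.fromhex("525400000002"),
}


def per_port_system_id(port: str) -> bytes:
    """Old behaviour: each port advertises its own MAC address."""
    return PORT_MACS[port]


def shared_system_id(port: str) -> bytes:
    """New behaviour: every port advertises the first device's MAC."""
    return PORT_MACS["net0"]


def is_looped_back(actor_system: bytes, receiving_port: str,
                   system_id) -> bool:
    """Reject a received LACP packet whose actor system is our own."""
    return actor_system == system_id(receiving_port)


def bridged_packet_rejected(system_id) -> bool:
    """Transmit on net0 and receive on net1 via an external bridge."""
    packet_actor = system_id("net0")
    return is_looped_back(packet_actor, "net1", system_id)
```

With per-port identifiers the packet sent from net0 looks foreign when it arrives on net1. With the shared identifier it is recognised as our own and rejected.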
Michael Brown [Wed, 6 May 2026 21:06:16 +0000 (22:06 +0100)]
[tls] Add support for RSA-PSS signature scheme
The RSA-PSS signature scheme is crowbarred somewhat awkwardly into TLS
version 1.2. Certificates with the standard rsaEncryption OID in the
public key may be used with either PKCS#1 or RSA-PSS, which breaks the
straightforward mapping between the OID and the signature algorithm.
Extend the definition of a TLS signature hash algorithm to include a
required OID-identified algorithm in the certificate's public key.
This allows us to define signature schemes such as rsa_pss_rsae_sha256
where the signature scheme uses an algorithm that differs from the
algorithm identified in the certificate's public key.
Michael Brown [Wed, 6 May 2026 21:14:17 +0000 (22:14 +0100)]
[crypto] Add support for RSA-PSS signature scheme
Add support for the RSA-PSS signature scheme as defined in RFC 8017
and required for TLS version 1.3.
Signature verification is deliberately implemented by first deriving
the salt value and then reconstructing the entire expected signature.
This is arguably inefficient since it involves two invocations of the
mask generation function when only one is required. However, this
implementation approach keeps the code size minimal (since there is no
need to implement separate verification logic), and makes it provably
impossible to accidentally omit a verification step (such as checking
the leading zero bits or the fixed 0x01 or 0xbc bytes). Since
signature verification is not a fast-path operation, the guaranteed
correctness is more valuable than a marginally faster execution.
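The verify-by-reconstruction approach can be sketched in Python for the encoding step alone (EMSA-PSS from RFC 8017), leaving out the RSA operation itself. This is an illustrative sketch, not the iPXE implementation: the function names are hypothetical, SHA-256 is assumed, and the salt length is assumed to equal the digest length, as TLS 1.3 requires.

```python
import hashlib
import hmac

HASH = hashlib.sha256
HLEN = HASH().digest_size


def mgf1(seed: bytes, length: int) -> bytes:
    """MGF1 mask generation function from RFC 8017."""
    out = b""
    counter = 0
    while len(out) < length:
        out += HASH(seed + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:length]


def pss_encode(message: bytes, salt: bytes, em_bits: int) -> bytes:
    """Construct the encoded message for a known salt."""
    em_len = (em_bits + 7) // 8
    h = HASH(bytes(8) + HASH(message).digest() + salt).digest()
    db = bytes(em_len - len(salt) - HLEN - 2) + b"\x01" + salt
    mask = mgf1(h, em_len - HLEN - 1)
    masked = bytearray(a ^ b for a, b in zip(db, mask))
    masked[0] &= 0xFF >> (8 * em_len - em_bits)  # clear leading bits
    return bytes(masked) + h + b"\xbc"


def pss_verify(message: bytes, em: bytes, em_bits: int,
               salt_len: int = HLEN) -> bool:
    """Recover the salt, rebuild the whole encoding, then compare."""
    em_len = (em_bits + 7) // 8
    if len(em) != em_len or em_len < salt_len + HLEN + 2:
        return False
    h = em[-HLEN - 1:-1]
    mask = mgf1(h, em_len - HLEN - 1)  # first MGF invocation
    db = bytes(a ^ b for a, b in zip(em[:em_len - HLEN - 1], mask))
    salt = db[len(db) - salt_len:]
    # Second MGF invocation happens inside pss_encode()
    return hmac.compare_digest(pss_encode(message, salt, em_bits), em)
```

The leading zero bits, the 0x01 separator, and the 0xbc trailer are never checked individually. Any deviation in them makes the reconstructed encoding differ from the received one, so the single comparison covers every check.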
Michael Brown [Wed, 6 May 2026 20:49:30 +0000 (21:49 +0100)]
[crypto] Allow for alternative RSA signature schemes
The RSA-PSS signature scheme has the same basic structure as the
existing PKCS#1 signature scheme, with a difference only in how the
digest value is encoded before being enciphered.
Abstract out the digest encoding from the signature and verification
methods, and add an explicit "pkcs1" to the relevant method names.
Joseph Wong [Thu, 30 Apr 2026 21:03:29 +0000 (14:03 -0700)]
[bnxt] Do not abort teardown on command failure
Modify bnxt_hwrm_run() to accept a flag indicating whether to abort
immediately upon a command failure. During initialization, the driver
will continue to abort on the first error. During teardown, the
sequence will continue executing subsequent cleanup commands even if
one fails. This ensures a best-effort cleanup.
Signed-off-by: Joseph Wong <joseph.wong@broadcom.com>
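The control flow can be sketched in Python as follows. The function name run_commands() and the choice to return the first error seen are assumptions for illustration. The actual bnxt_hwrm_run() signature and return convention are not shown in this log.

```python
def run_commands(commands, abort_on_failure: bool) -> int:
    """Run commands in order and return the first non-zero status.

    With abort_on_failure set (initialization), stop at the first
    error. Without it (teardown), keep going so that later cleanup
    commands still get a chance to run.
    """
    first_error = 0
    for command in commands:
        rc = command()
        if rc != 0:
            if first_error == 0:
                first_error = rc
            if abort_on_failure:
                break
    return first_error
```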
Joseph Wong [Mon, 27 Apr 2026 22:07:56 +0000 (15:07 -0700)]
[bnxt] Improve code readability and debug output
Improve readability of the completion queue servicing logic by using
explicit function calls per case statement, rather than falling
through to the next statement. Add a debug print in the ring
allocation path. Fix a typo in the PCI ROM entry.
Signed-off-by: Joseph Wong <joseph.wong@broadcom.com>
Michael Brown [Thu, 30 Apr 2026 13:25:18 +0000 (14:25 +0100)]
[efi] Register EFI IPv6 device path settings as netX.ndp
The EFI device path settings are currently registered as the
"netX.dhcp" settings block, in order that they will be automatically
overridden if a real DHCP configuration takes place. This does not
work as expected in an IPv6-only network, since the IPv6 configurator
will register "netX.ndp" rather than "netX.dhcp".
Fix by registering the EFI device path settings as either "netX.dhcp"
or "netX.ndp" based on the first address family encountered within the
device path.
Michael Brown [Thu, 30 Apr 2026 10:56:41 +0000 (11:56 +0100)]
[tls] Treat signature algorithm identifiers as opaque 16-bit values
RFC 5246 defines the signature_algorithm extension values for TLS
version 1.2 as being tuples of {HashAlgorithm, SignatureAlgorithm}
pairs. RFC 8446 redefines the signature_algorithm extension values
for TLS version 1.3 in a backwards-compatible way as opaque 16-bit
SignatureScheme values, and RFC 8447 updates RFC 5246 to allow these
values to be used with TLS version 1.2.
Redefine our concept of a signature algorithm identifier to remove the
internal structure that no longer exists.
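A short Python illustration of why the pair structure no longer holds. The two code points are the values assigned in RFC 8446. The helper names are hypothetical.

```python
import struct

RSA_PKCS1_SHA256 = 0x0401     # RFC 5246 view: hash 4 (sha256), sig 1 (rsa)
RSA_PSS_RSAE_SHA256 = 0x0804  # has no meaningful {hash, signature} split


def as_rfc5246_pair(code: int) -> tuple:
    """Split a 16-bit value into the old {hash, signature} byte pair."""
    return code >> 8, code & 0xFF


def wire_format(code: int) -> bytes:
    """Encode the identifier as an opaque big-endian 16-bit value."""
    return struct.pack(">H", code)
```

Splitting 0x0401 gives the familiar (4, 1) pair, but splitting 0x0804 gives (8, 4), which does not correspond to any hash or signature algorithm defined in RFC 5246. Treating the value as opaque gives the same bytes on the wire in both cases.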
Michael Brown [Wed, 29 Apr 2026 14:05:20 +0000 (15:05 +0100)]
[crypto] Fail all operations for the null public-key algorithm
The null crypto algorithms are intended to do nothing: the null digest
algorithm accepts all input and generates a zero-length digest, and
the null cipher algorithm simply copies the input unmodified to the
output.
The null public-key algorithm currently does nothing successfully.
Unlike the null digest and cipher algorithms, the null public-key
algorithm's methods are never called.
Change the null public-key algorithm to fail all operations, thereby
allowing its methods to be used as stubs by algorithms such as ECDSA
that do not implement all of the possible public-key operations.
[efi] Fix operator precedence in autoexec network download
The != operator has higher precedence than = in C, so the expressions:
    rc = imgacquire ( ..., image ) != 0
are parsed as:
    rc = ( imgacquire ( ..., image ) != 0 )
This assigns the boolean result (0 or 1) to rc instead of the actual
return code from imgacquire(). As a result, strerror(rc) reports an
incorrect error message when debugging is enabled.
Add parentheses around each assignment to ensure rc captures the
actual return value, matching the pattern already used in
efi_autoexec_filesystem() within the same file.
Modified-by: Michael Brown <mcb30@ipxe.org>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
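The same pitfall can be reproduced with Python's assignment expressions, where := binds less tightly than comparison operators. This is an analogy only: acquire() is a hypothetical stand-in for imgacquire(), and the error value is made up.

```python
ENOENT = 2


def acquire() -> int:
    """Hypothetical stand-in for imgacquire(): always fails."""
    return -ENOENT


# Without parentheses, rc_wrong receives the result of the comparison.
if rc_wrong := acquire() != 0:
    pass

# With parentheses, rc_right receives the actual return value.
if (rc_right := acquire()) != 0:
    pass
```

Here rc_wrong ends up holding a boolean rather than the error code, which is the behaviour that the added parentheses fix in the C code.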
Michael Brown [Thu, 23 Apr 2026 22:03:26 +0000 (23:03 +0100)]
[virtio] Ensure that device is closed before unmapping regions
Commit 988243c ("[virtio] Add virtio-net 1.0 support") erroneously
placed the code to unmap the device regions before the code to
unregister the network device. In the common case that the network
device is still open at the time that we shut down to boot the OS,
this results in the regions being accessed after having been unmapped.
For 32-bit BIOS or for UEFI with no IOMMU enabled, the iounmap()
operation is a no-op and so the driver still happens to work despite
the ordering bug. For 64-bit BIOS or for UEFI with an IOMMU enabled,
the iounmap() operation is not a no-op, and the driver will trigger a
page fault.
Fix by moving the call to unregister_netdev() to before the code that
unmaps the device regions.
Michael Brown [Thu, 23 Apr 2026 13:57:20 +0000 (14:57 +0100)]
[virtio] Fix assertion failures when interface is closed
The unused RX I/O buffers are currently freed without being deleted
from the list, with the list head being reinitialised only after all
buffers have been deleted. This triggers assertion failures due to
the list integrity checks when debugging is enabled.
Fix by deleting each buffer individually, so that the list structure
remains valid at all times.
Michael Brown [Thu, 23 Apr 2026 13:12:00 +0000 (14:12 +0100)]
[virtio] Set MTU for both modern and legacy devices
Commit b9d68b9 ("[ethernet] Use standard 1500 byte MTU unless
explicitly overridden") added code to explicitly set the MTU for
virtio-net devices, but only on the legacy probe path.
Make the behaviour consistent by setting the MTU on both legacy and
modern probe paths.
Michael Brown [Wed, 22 Apr 2026 22:09:42 +0000 (23:09 +0100)]
[cloud] Delete underlying snapshots when deleting Alibaba Cloud images
The underlying snapshots are not automatically deleted along with the
image, and there is no flag that can be set to cause them to be
automatically deleted.
Tag the underlying snapshots for deletion before deleting the image,
delete the image, and then delete any such tagged snapshots (including
any that may remain from a previous failed deletion attempt).
Michael Brown [Tue, 21 Apr 2026 22:44:06 +0000 (23:44 +0100)]
[ci] Add a workflow to import images to Alibaba Cloud
Add a workflow to build and import the official iPXE images for
Alibaba Cloud. As with the AWS and Google Cloud imports, treat this
as a workflow that must be triggered manually.
Michael Brown [Tue, 21 Apr 2026 15:31:52 +0000 (16:31 +0100)]
[cloud] Retry all Alibaba Cloud API calls
Experimentation suggests Alibaba Cloud API calls are extremely
unreliable, with a failure rate around 1%. It is therefore necessary
to allow for retrying basically every API call.
Some API calls (e.g. DescribeImages or ModifyImageAttribute) are
naturally idempotent and so safe to retry. Some non-idempotent API
calls (e.g. CopyImage) support explicit idempotence tokens. The
remaining API calls may simply fail on a retry, if the original
request happened to succeed but failed to return a response.
We could write convoluted retry logic around the non-idempotent calls,
but this would substantially increase the complexity of the already
unnecessarily complex code. For now, we assume that retrying
non-idempotent requests is probably more likely to fix transient
failures than to cause additional problems.
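A generic retry wrapper of the kind described might look like the following Python sketch. The names retry() and FlakyCall, the attempt count, and the injectable sleep function are assumptions for illustration and are not taken from the Alibaba Cloud utilities.

```python
import time


def retry(call, attempts: int = 5, delay: float = 0.0, sleep=time.sleep):
    """Invoke call() until it succeeds or the attempts are exhausted."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(attempts):
        try:
            return call()
        except Exception as error:  # retry on any failure
            last_error = error
            if attempt + 1 < attempts:
                sleep(delay)
    raise last_error


class FlakyCall:
    """Test helper: fail a fixed number of times, then succeed."""

    def __init__(self, failures: int):
        self.failures = failures
        self.calls = 0

    def __call__(self) -> str:
        self.calls += 1
        if self.calls <= self.failures:
            raise RuntimeError("transient failure")
        return "ok"
```

Note that this wrapper makes no distinction between idempotent and non-idempotent calls, which matches the trade-off accepted above: a retried non-idempotent request may fail if the original request actually succeeded.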
Michael Brown [Tue, 21 Apr 2026 15:06:40 +0000 (16:06 +0100)]
[cloud] Do not rely on CopyImage to import images to Alibaba Cloud
The CopyImage API call does work, but is unacceptably slow due to rate
limiting. Importing a full set of images to all regions can take
several hours (and is likely to fail at some point due to transient
errors in making API calls).
Resort to a mixture of strategies to get images imported to all
regions:
- For regions with working OSS that are not blocked by Chinese state
  censorship laws, upload the image files to an OSS bucket and then
  import the images.
- For regions with working OSS that are blocked by Chinese state
  censorship laws but that have working FC, use a temporary FC
  function to copy the image files from the uncensored OSS buckets
  and then import the images. Attempt downloads from a variety of
  uncensored buckets, since cross-region OSS traffic tends to
  experience a failure rate of around 10% of requests.
- For regions that have working OSS but are blocked by Chinese state
  censorship laws and do not have working FC, or for regions that
  don't even have working OSS, resort to using CopyImage to copy the
  previously imported images from another region. Spread the
  imports across as many source regions as possible to minimise the
  effect of the CopyImage rate limiting.
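The three-way decision above can be expressed as a small Python function. The Region fields and the strategy labels are hypothetical names invented for this sketch. The actual importer may represent regional capabilities quite differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    name: str
    has_oss: bool   # region has working OSS
    censored: bool  # bucket contents blocked by censorship laws
    has_fc: bool    # region has working Function Compute


def import_strategy(region: Region) -> str:
    """Pick the import strategy for a region, cheapest option first."""
    if region.has_oss and not region.censored:
        return "upload"      # upload to an OSS bucket, then import
    if region.has_oss and region.has_fc:
        return "fc-copy"     # temporary FC function copies the files
    return "copy-image"      # fall back to the slow CopyImage call
```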
Michael Brown [Fri, 17 Apr 2026 12:54:49 +0000 (13:54 +0100)]
[cloud] Do not rely on ECS instances to import images to Alibaba Cloud
Spinning up ECS instances is supported in all ECS regions (unlike
Function Compute), but turns out to be unacceptably unreliable since
Alibaba Cloud has a very irritating tendency to fail to launch ECS
instances for a variety of spurious and unpredictable reasons.
Rewrite the censorship bypass mechanism to use the (extremely slow)
CopyImage API call to copy an imported image from an uncensored region
to a censored region.
Michael Brown [Wed, 15 Apr 2026 15:02:23 +0000 (16:02 +0100)]
[cloud] Do not rely on Function Compute to import images to Alibaba Cloud
Function Compute is unsupported in several Alibaba Cloud regions.
Rewrite the censorship bypass mechanism to access OSS buckets using a
temporary ECS instance instead of a temporary Function Compute
function.
Importing images now requires that the account has been prepared using
the "ali-setup" script, which creates the necessary role, VPCs, and
vSwitches to allow ECS instances to be launched in each region.
Michael Brown [Tue, 14 Apr 2026 12:59:38 +0000 (13:59 +0100)]
[cloud] Support creation of a censorship bypass role for Alibaba Cloud
Importing images into Alibaba Cloud currently relies upon using a
temporary Function Compute function to work around Chinese state
censorship laws that prevent direct access to OSS bucket contents in
mainland China regions.
Unfortunately, Alibaba Cloud regions are extremely asymmetric in terms
of feature support. (For example, some regions do not even support
IPv6 networking.) Several mainland China regions do not support
Function Compute, and so this workaround is not available for those
regions.
A possible alternative censorship workaround is to create temporary
ECS virtual machine instances instead of temporary Function Compute
functions. This requires the existence of a role that can be used by
ECS instances to access OSS. We cannot use the AliyunFcDefaultRole
that is currently used by Function Compute, since this role cannot be
assumed by ECS instances.
Creating roles is a privileged operation, and it would be sensible to
assume that the image importer (which may be running as part of a
GitHub Actions workflow) may not have permission to itself create a
suitable temporary role. The censorship bypass role must therefore be
set up once in advance by a suitably privileged user.
Add the ability to create a suitable censorship bypass role to the
Alibaba Cloud setup utility.
Michael Brown [Fri, 10 Apr 2026 14:59:24 +0000 (15:59 +0100)]
[cloud] Add utility to set up VPCs and security groups in Alibaba Cloud
Creating ad hoc instances in Alibaba Cloud is extremely cumbersome and
tedious due to the need to specify an explicit vSwitch and security
group, with no defaults being available.
Add a utility that will create a VPC within each region, a vSwitch
within each zone within each region, and a security group within each
region.
Michael Brown [Thu, 9 Apr 2026 09:47:55 +0000 (10:47 +0100)]
[cloud] Update disk log console tool descriptions
Update the descriptive text for the disk log console tools to remove
references to INT13, since these now work for both BIOS and UEFI disk
log consoles.
Leave the script names as {aws,gce,ali}-int13con, to avoid breaking
any existing tooling that might use these names.
Michael Brown [Wed, 8 Apr 2026 12:55:47 +0000 (13:55 +0100)]
[cloud] Fix architecture detection for partitioned disk images
Allow the UEFI CPU architecture to be detected for the partitioned
disk images generated by genfsimg as of commit 2c84b68 ("[build] Use a
partition table in generated USB disk images").
Michael Brown [Sun, 29 Mar 2026 12:37:55 +0000 (13:37 +0100)]
[disklog] Generalise CONSOLE_INT13 to CONSOLE_DISKLOG
The name "int13" is intrinsically specific to a BIOS environment.
Generalise the build configuration option CONSOLE_INT13 to
CONSOLE_DISKLOG, in preparation for adding EFI disk log console
support.
Existing configurations using CONSOLE_INT13 will continue to work.
Michael Brown [Mon, 6 Apr 2026 23:37:00 +0000 (00:37 +0100)]
[undi] Pad transmit buffer length to work around vendor driver bugs
The workaround used for UEFI in commit 926816c ("[efi] Pad transmit
buffer length to work around vendor driver bugs") is also applicable
to the BIOS UNDI driver.
Apply the same workaround of padding the transmit I/O buffers to the
minimum Ethernet frame length before passing them to the underlying
UNDI driver's transmit function.
Reported-by: Alexander Patrakov <patrakov@gmail.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
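The padding step amounts to the following, shown here as a Python sketch. The 60-byte minimum is the standard minimum Ethernet frame length excluding the frame check sequence. The function name is hypothetical.

```python
ETH_ZLEN = 60  # minimum Ethernet frame length, excluding the 4-byte FCS


def pad_frame(frame: bytes, min_len: int = ETH_ZLEN) -> bytes:
    """Zero-pad a short frame up to the minimum length.

    Frames that are already long enough are returned unchanged.
    """
    return frame.ljust(min_len, b"\x00")
```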
Michael Brown [Mon, 6 Apr 2026 21:24:08 +0000 (22:24 +0100)]
[efi] Remove the Dhcp6Dxe driver veto
Commit cb95b5b ("[efi] Veto the Dhcp6Dxe driver on all platforms")
vetoed the Dhcp6Dxe driver to work around the bug described at
https://github.com/tianocore/edk2/issues/10506 that results in
EfiDhcp6Stop() getting stuck in a tight loop waiting for an event that
will never occur.
Since we now call UnloadImage() at TPL_APPLICATION, we no longer
trigger the bug in Dhcp6Dxe, and so the veto may be removed.
Michael Brown [Mon, 6 Apr 2026 21:17:58 +0000 (22:17 +0100)]
[efi] Drop to external TPL when unloading vetoed images
As of commit c3376f8 ("[efi] Drop to external TPL for calls to
ConnectController()"), the veto mechanism will drop to TPL_APPLICATION
for calls to DisconnectController().
Match this behaviour for calls to UnloadImage(), since that is likely
to result in calls to DisconnectController(). For example, any EDK2
driver using NetLibDefaultUnload() as its unload handler will call
DisconnectController() to disconnect itself from all handles.
On Ubuntu/Debian, syslinux-common installs mbr.bin to
/usr/lib/syslinux/mbr/mbr.bin. This path is not currently searched by
find_syslinux_file(), causing USB disk image generation to fail with
"could not find mbr.bin".
Add /usr/lib/syslinux/mbr, /usr/share/syslinux/mbr, and
/usr/local/share/syslinux/mbr to the search paths.
Michael Brown [Thu, 26 Mar 2026 14:46:26 +0000 (14:46 +0000)]
[build] Allow binaries to request a log partition from genfsimg
For UEFI, the USB disk image is constructed from the built EFI binary
(e.g. bin-x86_64-efi/ipxe.efi) by genfsimg, which does not itself have
any way to access the build configuration. We therefore need a way to
annotate the binary such that genfsimg can determine whether or not to
include a log partition within the USB disk image.
The "OEM ID" and "OEM information" fields within the PE header can be
used for this, since they are easily accessed and serve no other
purpose. We define bit 0 of "OEM information" as a flag indicating
that a log partition should be included. If this bit is set, genfsimg
will create a log partition with a layout matching that of the BIOS
build (i.e. using partition 3 and at an offset of 16kB from the start
of the disk).
The PE header is constructed by elf2efi.c, which takes as an input the
linked ELF form of the binary. We use an ELF .note section to allow
any linked-in object to communicate the log partition request through
to elf2efi.c, which then populates the OEM information field
accordingly.
We choose to use the same field locations within the BIOS bzImage
header, since this allows genfsimg to use the same logic for both BIOS
and UEFI binaries. In a BIOS build, there is no external processing
equivalent to elf2efi.c, and so we construct the field value directly
using absolute symbols and explicit relocation records.
(Note that the bzImage header is relevant only when using genfsimg to
construct a combined BIOS/UEFI image. In the common case of building
a BIOS-only image such as bin/ipxe.usb, the partition table is
manually constructed by usbdisk.S and genfsimg is not involved.)
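A consumer of this flag could test it roughly as follows. This Python sketch rests on an assumption not stated in the commit: that "OEM ID" and "OEM information" are the 16-bit little-endian fields at offsets 0x24 and 0x26 of the conventional MZ header layout. The function and constant names are hypothetical, and genfsimg itself may extract the value differently.

```python
import struct

OEM_ID_OFFSET = 0x24        # assumed offset of the "OEM ID" field
OEM_INFO_OFFSET = 0x26      # assumed offset of "OEM information"
LOG_PARTITION_FLAG = 0x0001  # bit 0: include a log partition


def wants_log_partition(image: bytes) -> bool:
    """Return True if bit 0 of the OEM information field is set."""
    if len(image) < OEM_INFO_OFFSET + 2:
        raise ValueError("image too short to contain the header fields")
    (oem_info,) = struct.unpack_from("<H", image, OEM_INFO_OFFSET)
    return bool(oem_info & LOG_PARTITION_FLAG)
```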
Michael Brown [Thu, 26 Mar 2026 12:51:20 +0000 (12:51 +0000)]
[build] Work around syslinux bugs in FAT cluster counting
The syslinux function check_fat_bootsect() performs some sanity checks
to ensure that the filesystem type string (e.g. "FAT12") is correct
for the total number of clusters in the FAT. There is unfortunately a
bug in its calculation of the number of sectors occupied by the root
directory, which causes it to underestimate the number of sectors by a
factor of 32.
When the total number of clusters is close to the FAT12 limit of 4096,
this bug can cause syslinux to erroneously report that the filesystem
has "more than 4084 clusters but claims FAT12".
Work around this bug by selecting an explicit cluster size in order to
avoid potentially problematic cluster counts. We default to using 4kB
clusters, doubling to 8kB if using 4kB would result in a total cluster
count near 4096 (the FAT12 limit) or near 65536 (the FAT16 limit).
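The cluster size selection can be sketched in Python. The margin used to decide what counts as "near" a limit is a hypothetical value chosen for illustration, as is the function name. The real thresholds live in genfsimg.

```python
SECTOR_SIZE = 512


def cluster_sectors(total_sectors: int, margin: int = 64) -> int:
    """Return the cluster size in sectors.

    Default to 4kB clusters, doubling to 8kB if the resulting cluster
    count would fall within `margin` of a FAT type boundary.
    """
    size = 4096 // SECTOR_SIZE
    clusters = total_sectors // size
    if any(abs(clusters - limit) <= margin for limit in (4096, 65536)):
        size *= 2
    return size
```

Doubling the cluster size halves the cluster count, which moves it well away from the boundary that triggers the syslinux miscalculation.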
Michael Brown [Thu, 26 Mar 2026 12:40:40 +0000 (12:40 +0000)]
[build] Use sector count values consistently in genfsimg
The calculations around the FAT filesystem layout currently use a
mixture of kilobytes and sector counts. Switch to using sector counts
throughout the calculation, to make the code easier to read.
Michael Brown [Mon, 23 Mar 2026 16:08:09 +0000 (16:08 +0000)]
[build] Use a partition table in generated USB disk images
The USB disk image constructed by util/genfsimg is currently a raw FAT
filesystem, with no containing partition. This makes it incompatible
with the use of CONSOLE_INT13, since there is no way to add a
dedicated log partition without a partition table.
Add a partition table when building a non-ISO image, using the mbr.bin
provided by syslinux (since we are already using syslinux to invoke
the ipxe.lkrn within the FAT filesystem).
The BIOS .usb targets are built using a manually constructed partition
table with C/H/S geometry x/64/32. Match this geometry to minimise
the differences between genfsimg and non-genfsimg USB disk images.
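With 64 heads and 32 sectors per track, each cylinder covers exactly 1MB of 512-byte sectors, which keeps the arithmetic simple. A Python sketch of the conversion, with hypothetical names:

```python
HEADS = 64
SECTORS_PER_TRACK = 32


def lba_to_chs(lba: int) -> tuple:
    """Convert a logical block address to (cylinder, head, sector).

    Sector numbers are 1-based, as is conventional for C/H/S.
    """
    cylinder, rest = divmod(lba, HEADS * SECTORS_PER_TRACK)
    head, sector = divmod(rest, SECTORS_PER_TRACK)
    return cylinder, head, sector + 1
```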
Michael Brown [Tue, 24 Mar 2026 16:41:30 +0000 (16:41 +0000)]
[build] Ensure that generated filesystem images retain no stale content
We use mformat to ensure that the FAT filesystem starts as empty.
However, formatting the filesystem can still leave old data blocks
present (though unreferenced) within the disk image.
Truncate the image to a zero length before extending, to ensure that
no stale content is retained.
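The truncate-then-extend step is equivalent to the following Python sketch. genfsimg is a shell script, so this is an illustration of the idea rather than its code. The function names and file sizes are hypothetical.

```python
import os
import tempfile


def reset_image(path: str, size: int) -> None:
    """Discard all existing content, then extend to the target size."""
    with open(path, "wb") as image:  # "wb" truncates to zero length
        image.truncate(size)         # the extended range reads as zeros


def demo() -> bytes:
    """Create an image with stale data, reset it, return the result."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "disk.img")
        with open(path, "wb") as image:
            image.write(b"\xaa" * 4096)  # stale content
        reset_image(path, 8192)
        with open(path, "rb") as image:
            return image.read()
```

Extending an existing file without first truncating it would leave the old 0xaa bytes in place, even though a freshly formatted filesystem would no longer reference them.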
Michael Brown [Fri, 20 Mar 2026 13:48:54 +0000 (13:48 +0000)]
[cloud] Add utility to read INT13CON partition in Alibaba Cloud
Following the examples of aws-int13con and gce-int13con, add a utility
that can be used to read the INT13 console log from a used iPXE boot
disk in Alibaba Cloud Elastic Compute Service (ECS).
We cannot reliably access the used iPXE boot disk (or a snapshot
created from it) since OSS buckets in mainland China cannot be
accessed due to Chinese laws. We therefore create a snapshot and
attach this snapshot as a data disk to a temporary Linux instance, as
we do in Google Compute Engine.
Unlike in Google Compute Engine, we cannot reliably capture serial
port output from the temporary Linux instance. Issuing the relevant
GetInstanceConsoleOutput API call will cause the output to be captured
once and (unpredictably) cached. Without knowing in advance precisely
when the output is complete, we cannot use this approach to capture
the relevant part of the output.
We therefore use an Alibaba Cloud Linux image that includes the Cloud
Assistant Agent. This allows us to use the RunCommand API call to run
a command on the instance and capture the output, all done via the
control plane so that we are not dependent on having direct network
access to the temporary instance.
Joseph Wong [Thu, 19 Mar 2026 13:08:55 +0000 (13:08 +0000)]
[bnxt] Update conditions for invoking short commands
Add a condition to invoke the short command logic when the firmware
indicates that it is required. Replace the 100ms delay with wmb() to
ensure that the DMA buffer is ready when a short command is invoked.
Signed-off-by: Joseph Wong <joseph.wong@broadcom.com>
Michael Brown [Wed, 18 Mar 2026 16:01:54 +0000 (16:01 +0000)]
[cloud] Add utility for importing images to Alibaba Cloud
Following the examples of aws-import and gce-import, add a utility
that can be used to upload an iPXE disk image to Alibaba Cloud Elastic
Compute Service (ECS) as a bootable image.
The iPXE disk image is first uploaded to a temporary Object Storage
Service (OSS) bucket and then imported as an ECS image. The temporary
bucket is deleted after use.
As with Google Compute Engine, an appropriate image family name is
identified automatically: "ipxe" for BIOS images, "ipxe-uefi-x86-64"
for x86_64 UEFI images, and "ipxe-uefi-arm64" for AArch64 UEFI images.
This allows the latest image within each family to be launched without
needing to know the precise image name.
Copies of the images are uploaded to all selected regions. One major
complication is that OSS buckets in mainland China can be created but
cannot be accessed due to Chinese laws, which require an ICP filing
for any bucket hosted in mainland China. We work around this
restriction by first uploading the image to a region outside mainland
China and then using a temporary Function Compute function running in
each region to copy the images to the OSS bucket via the internal OSS
endpoints, which are not subject to the same restrictions.
Michael Brown [Wed, 18 Mar 2026 15:12:11 +0000 (15:12 +0000)]
[undi] Drag in PCI-specific configuration
The undionly.kpxe binary does not need the full PCI bus support.
However, the overwhelming majority of UNDI devices are PCI-based and
we already end up dragging in PCI configuration space support in order
to be able to test for devices with broken interrupts.
Dragging in the PCI configuration allows the PCI settings mechanism to
also be present, which is often useful for end users. The total cost
is around 200 bytes in the final binary, which is acceptable for a
generally very useful feature.
Users wanting to minimise the binary size can choose to explicitly
disable PCI_SETTINGS via config/settings.h.
Michael Brown [Fri, 13 Mar 2026 15:40:27 +0000 (15:40 +0000)]
[efi] Add a dummy SBOM PE section
Since October 2025, the Microsoft UEFI Signing Requirements have
included a clause stating that "submissions must contain a valid
signed SPDX SBOM in a custom '.sbom' PE section". A list of required
fields is provided, and a link is given to "the Microsoft SBOM tool to
aid SBOM generation". So far, so promising.
The Microsoft SBOM tool has no support for handling a .sbom PE
section. There is no published document that specifies what is
supposed to appear within this PE section. An educated guess is that
it should probably contain the raw JSON data in the same format that
the Microsoft SBOM tool produces.
The list of required fields does not map to identifiable fields within
the JSON. In particular:
- "file name / software"
  This might be the top-level "name" field. It's hard to tell. The
  SPDX SBOM specification is not particularly informative either: the
  only definition it appears to give for "name" is "This field
  identifies the name of an Element as designated by the creator",
  which is a spectacularly useless definition.
- "software version / component generation (shim)"
  This may refer to the "packages[].versionInfo" field. There is no
  obvious relevance for the words "component", "generation", or
  "shim". The proximity of "generation" and "shim" suggests that this
  might be related in some way to the SBAT security generation, which
  is absolutely not the same thing as the software version.
- "vendor / company name (this must exactly match the verified company
  name in the submitter's EV certificate on the Microsoft HDC partner
  center account)"
  This is clearly written as though it has some significance for the
  UEFI signing submission process. Unfortunately there is no obvious
  map to any defined SBOM field. An educated guess is that this might
  be referring to "packages[].supplier", since experiments show that
  the Microsoft SBOM tool will fail validation unless this field is
  present.
- "product-name"
  This might also be the top-level "name" field. There is no
  indication given as to how this might differ from "file name /
  software".
- "OEM Name" and "OEM ID"
  These seem to be terms made up on the spur of the moment. The
  three-letter sequence "OEM" does not appear anywhere within the
  codebase of the Microsoft SBOM tool.
In the absence of any meaningful specification, we choose not to
engage in good faith with this requirement. Instead, we construct a
best guess at the contents of a .sbom section that has some chance of
being accepted by the UEFI signing submission process. We assume that
anything that passes "sbom-tool validate" will probably be accepted,
with the only actual check being that the supplier name must match the
registered EV code signing certificate.
To anyone who actually cares about the arguably valuable benefits of
having a software bill of materials: please stop creating junk
requirements. If you want people to actually make the effort to
produce useful SBOM data, then make it clear what data you want.
Provide unambiguous specifications. Provide example files. Provide
tools that actually do the job they are claimed to do. Don't just
throw out another piece of "MUST HAS THING BECAUSE IS MORE SECURITY"
garbage and call it a day.
Michael Brown [Tue, 10 Mar 2026 13:41:48 +0000 (13:41 +0000)]
[ci] Add a workflow to import images to Google Cloud
Add a workflow to build and import the official iPXE images for Google
Cloud. As with the AWS import, treat this as a workflow that must be
triggered manually.
Michael Brown [Tue, 10 Mar 2026 12:34:10 +0000 (12:34 +0000)]
[cloud] Specify Google Cloud project explicitly for storage client
The storage client is currently constructed with the project inferred
from the environment, rather than using the project specified via the
command line arguments.
Fix by passing the project name to the storage client constructor.
Michael Brown [Mon, 9 Mar 2026 22:52:38 +0000 (22:52 +0000)]
[test] Assign unique MAC addresses for test network devices
Commit 19dffdc ("[efi] Allow for creating devices with no EFI parent
device") relaxed the restriction on attempting to create SNP devices
when no EFI parent device is available, with the result that the test
network devices created when running the IPv4 tests are now registered
as SNP devices.
Since the dummy EFI parent device path is fixed and the test network
device MAC addresses are empty, the SNP devices end up with identical
constructed device paths and registration of the second and subsequent
devices will fail since device paths must be unique.
Fix by assigning MAC addresses to the test network devices.
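As a minimal sketch of one way to do this, each test device could be
given a locally administered unicast address derived from its index.
The helper name test_mac() and the address scheme are hypothetical and
are not taken from the iPXE test suite.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Length of an Ethernet MAC address */
#define MAC_LEN 6

/*
 * Hypothetical helper: fill in a distinct MAC address for the test
 * device with the given index.  The address scheme (02:00:00:00:00:xx)
 * is an assumption chosen purely for illustration.
 */
static void test_mac ( uint8_t mac[MAC_LEN], unsigned int index ) {

	/* Only 256 distinct addresses are available in this scheme */
	assert ( index < 256 );

	memset ( mac, 0, MAC_LEN );
	mac[0] = 0x02; /* Locally administered, unicast */
	mac[MAC_LEN - 1] = ( uint8_t ) index;
}
```

Any scheme would do, provided that no two test devices end up with the
same address and hence the same constructed device path.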
Reported-by: Miao Wang <shankerwangmiao@gmail.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Michael Brown [Sat, 7 Mar 2026 23:32:03 +0000 (23:32 +0000)]
[ci] Add a workflow to import images to AWS EC2
Add a workflow to build and import the official iPXE images for AWS
EC2. Treat this as a workflow that must be triggered manually, since
importing is prone to failure for reasons unrelated to the state of
the codebase (e.g. the creation of new regions, or an explosion at a
data centre) and so should not result in CI failures being reported
against specific commits.
Michael Brown [Thu, 5 Mar 2026 15:56:07 +0000 (15:56 +0000)]
[efi] Do not unconditionally raise back to internal TPL
Most TPL manipulation is handled by efi_raise_tpl()/efi_restore_tpl()
pairs. The exceptions are the places where we need to temporarily
drop to a lower TPL in order to allow a timer interrupt to occur.
These currently assume that they are called only from code that is
already running at the internal TPL (generally TPL_CALLBACK). This
assumption is not always correct. In particular, the call from
_efi_start() to efi_driver_reconnect_all() takes place after the SNP
devices have been released and so will be running at the external TPL.
Create an efi_drop_tpl()/efi_undrop_tpl() pair to abstract away the
temporary lowering of the TPL, and ensure that the TPL is always
raised back to its original level rather than being unconditionally
raised to the internal TPL.
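The idea can be sketched with a simulated TPL variable in place of the
real UEFI boot services. Everything here is illustrative: the
sim_raise_tpl(), sim_restore_tpl(), drop_tpl() and undrop_tpl()
functions and the saved_tpl structure are hypothetical stand-ins, and
the real efi_drop_tpl()/efi_undrop_tpl() signatures may well differ.

```c
#include <assert.h>

/* Task priority levels (values as defined by the UEFI specification) */
#define TPL_APPLICATION 4
#define TPL_CALLBACK 8
#define TPL_HIGH_LEVEL 31

/* Simulated current TPL, standing in for the firmware's own state */
static unsigned int current_tpl = TPL_APPLICATION;

/* Hypothetical record of the TPL that was in force before a drop */
struct saved_tpl {
	unsigned int original;
};

/* Simulated RaiseTPL(): raise the TPL and return the previous level */
static unsigned int sim_raise_tpl ( unsigned int tpl ) {
	unsigned int old = current_tpl;

	assert ( tpl >= old );
	current_tpl = tpl;
	return old;
}

/* Simulated RestoreTPL(): lower the TPL to a previous level */
static void sim_restore_tpl ( unsigned int tpl ) {
	assert ( tpl <= current_tpl );
	current_tpl = tpl;
}

/* Temporarily drop to TPL_APPLICATION, recording the original level */
static void drop_tpl ( struct saved_tpl *saved ) {

	/* There is no "get current TPL" call: raise and note the old value */
	saved->original = sim_raise_tpl ( TPL_HIGH_LEVEL );
	sim_restore_tpl ( TPL_APPLICATION );
}

/* Undo the drop by raising back to the recorded level */
static void undrop_tpl ( struct saved_tpl *saved ) {
	sim_raise_tpl ( saved->original );
}
```

A caller already running at TPL_APPLICATION ends up back at
TPL_APPLICATION, and a caller running at TPL_CALLBACK ends up back at
TPL_CALLBACK, since the level to return to is recorded rather than
assumed.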
Michael Brown [Thu, 5 Mar 2026 12:24:35 +0000 (12:24 +0000)]
[efi] Allow creating an image device handle with no parent device
When we fall back to using our own loaded image's device handle
(instead of the most recently opened SNP device handle), we may find
that the device handle is no longer valid since we have disconnected
the driver that originally provided it.
Check for existence of the device path protocol on the identified
parent handle, and choose not to attempt to set a parent-child
relationship if the parent handle appears to no longer be valid.
Michael Brown [Thu, 5 Mar 2026 12:10:15 +0000 (12:10 +0000)]
[efi] Install protocols onto a dedicated device handle
When we fall back to using our own loaded image's device handle
(instead of the most recently opened SNP device handle), we may find
that some protocols (e.g. EFI_SIMPLE_FILE_SYSTEM_PROTOCOL) are already
present on this handle.
Fix by creating a child device handle with an added Uri() device path
component, and installing EFI_SIMPLE_FILE_SYSTEM_PROTOCOL and others
onto this handle instead of onto the identified parent device handle.
This also provides a way for us to communicate the image URI to a
chainloaded iPXE, so that that iPXE can set its current working URI
appropriately.
A side effect of this change is that the EFI_SIMPLE_NETWORK_PROTOCOL
will be found on the parent of the loaded image's device handle,
rather than directly on the loaded image's device handle. This will
not cause problems for a chainloaded iPXE, since that will already use
LocateDevicePath() to find EFI_SIMPLE_NETWORK_PROTOCOL (or the
EFI_MANAGED_NETWORK_SERVICE_BINDING_PROTOCOL created by MnpDxe) and so
will already find the instance on the parent device handle. If other
UEFI executables are found to exist that do require the protocols to
be installed directly on the loaded image's device handle, then we
could potentially install copies of these protocol instances on the
device handle.
Michael Brown [Thu, 5 Mar 2026 11:29:41 +0000 (11:29 +0000)]
[efi] Allow executing images even with no open network devices
We need a device handle from which to nominally load an EFI image. We
currently rely on using the most recently opened network device's SNP
device handle, in the same way that we use the most recently opened
network device when loading a BIOS PXE NBP image. If there is no most
recently opened network device, then we cannot execute an EFI image.
We use three aspects of the SNP device handle: the handle itself
(giving us something on which to install protocols), the associated
device path (giving us a base path from which to construct the new
image's file path), and the associated network device (giving us an
interface for the PXE base code protocol installation).
Make the network device optional by simply choosing not to install the
PXE base code protocols when no network device is defined. This
allows us to fall back to using our own loaded image's device handle
and device path for the other two purposes.
Michael Brown [Thu, 5 Mar 2026 12:55:37 +0000 (12:55 +0000)]
[efi] Try all supported autoexec protocols
When chainloaded from another iPXE, there will be both a virtual
filesystem and a managed network protocol available through which we
could attempt to load autoexec.ipxe.
Try both of these, with the virtual filesystem attempted first so that
an autoexec.ipxe that was explicitly downloaded by the chainloading
iPXE will have the highest priority.
Michael Brown [Thu, 5 Mar 2026 12:37:00 +0000 (12:37 +0000)]
[efi] Treat a URI device path as higher priority than a cached DHCP packet
We currently expect to find either a cached DHCP packet (from a UEFI
PXE boot) or a URI device path (from a UEFI HTTP boot), but not both
simultaneously. When both are present, the cached DHCP packet will
currently override any current working URI that was previously derived
from a URI device path.
Treat the URI device path as being more informative than the cached
DHCP packet by swapping the order in which these are processed.
Leave the boot option device path as being a lower priority than a
cached DHCP packet, since the boot option device path may well refer
to an earlier boot stage.
Michael Brown [Tue, 3 Mar 2026 14:57:17 +0000 (14:57 +0000)]
[github] Add organization to sponsorship links
There is no viable way to link to the list of sponsorship recipients.
Add the organization itself as a sponsorship recipient, solely in
order to enable the use of https://github.com/sponsors/ipxe as a
central link for sponsorship information.
Michael Brown [Tue, 3 Mar 2026 11:04:18 +0000 (11:04 +0000)]
[cachedhcp] Set current working URI to cached DHCP filename
For a UEFI HTTP boot, we set the current working URI based on the
loaded image device path. The autoexec.ipxe script will be fetched
from the same directory as the iPXE binary itself.
For a BIOS or UEFI PXE boot, we do not explicitly set a current
working URI, but rely on the fact that registering the cached DHCP
settings block will cause the TFTP code to set the current working URI
to "tftp://${next-server}/". The autoexec.ipxe script will therefore
be fetched from the default directory (which is most probably the root
directory) of the TFTP server.
When using a UEFI shim, the shim will always fetch iPXE from the same
directory as the shim itself. This leads to a somewhat unintuitive
requirement for a UEFI PXE boot: the shim and iPXE must be placed in
the same directory, but the corresponding autoexec.ipxe script must be
placed in the root directory.
As with the loaded image device path for a UEFI HTTP boot, the
existence of a cached DHCP packet gives us a way to construct the URI
of our own binary. We can therefore choose to use this to set the
current working URI, so that the autoexec.ipxe script may be placed in
the same directory as the iPXE binary itself. This is the least
surprising location, and avoids the need for lengthy explanations in
documentation.
Choose to set the current working URI at the point that the cached
DHCP packet is recorded, rather than the point at which it is applied
and registered as a settings block. This avoids some awkward corner
cases (such as failing to find a matching network device for the
DHCPACK), and naturally ensures that we retrieve the next-server
address and filename from the same DHCP packet. We rely on the order
in which cached DHCP packets are recorded to impose a priority
ordering: later packets (e.g. PxeBSACK) will override earlier ones.
To avoid breaking existing setups that do place the autoexec.ipxe
script in the root directory, we modify the fetching logic to first
attempt to retrieve autoexec.ipxe from the current working URI, then
from the root directory of that URI.
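The resulting search order can be sketched as plain string handling.
This is an illustration only: autoexec_candidates() is a hypothetical
helper, the example paths are made up, and the sketch ignores query
strings and the other subtleties of real relative URI resolution.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Name of the script to be fetched */
#define AUTOEXEC_NAME "autoexec.ipxe"

/*
 * Hypothetical helper: build the two candidate URIs in the order in
 * which they would be tried.  Returns 0 on success, or -1 if the URI
 * has no "scheme://host/path" shape or a buffer is too small.
 */
static int autoexec_candidates ( const char *cwuri, char *first,
				 char *second, size_t len ) {
	const char *authority;
	const char *root;
	const char *dir_end;
	int n;

	assert ( cwuri != NULL );

	/* The path starts at the first '/' after "scheme://" */
	authority = strstr ( cwuri, "://" );
	if ( ! authority )
		return -1;
	root = strchr ( ( authority + 3 ), '/' );
	if ( ! root )
		return -1;

	/* The directory portion ends at the last '/' in the path */
	dir_end = strrchr ( root, '/' );

	/* First candidate: same directory as the current working URI */
	n = snprintf ( first, len, "%.*s/%s", ( int ) ( dir_end - cwuri ),
		       cwuri, AUTOEXEC_NAME );
	if ( ( n < 0 ) || ( ( size_t ) n >= len ) )
		return -1;

	/* Second candidate: root directory of the same server */
	n = snprintf ( second, len, "%.*s/%s", ( int ) ( root - cwuri ),
		       cwuri, AUTOEXEC_NAME );
	if ( ( n < 0 ) || ( ( size_t ) n >= len ) )
		return -1;

	return 0;
}
```

With a current working URI of "tftp://192.168.0.1/boot/ipxe.efi", the
first candidate is "tftp://192.168.0.1/boot/autoexec.ipxe" and the
second is "tftp://192.168.0.1/autoexec.ipxe".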
As with commit a69afd7 ("[tftp] Use TFTP server URI only if no other
working URI is set"), this is technically a breaking change in
behaviour, but the new behaviour is almost certainly less surprising
than the existing behaviour. Scripts that rely on the current working
URI being set to the root of the TFTP server can use absolute URIs
(i.e. add an initial slash): this is more explicit and will work on
iPXE builds both before and after this change.
Michael Brown [Mon, 2 Mar 2026 16:10:49 +0000 (16:10 +0000)]
[build] Add support for including a UEFI shim in filesystem images
Add support for loading iPXE via a UEFI shim in ISO and USB images.
Since the iPXE shim's default loader filename is currently "ipxe.efi"
for all CPU architectures, at most one architecture within an image
may use a shim. (This limitation should be removed in the next signed
release of the iPXE shim.)
Michael Brown [Mon, 2 Mar 2026 00:08:18 +0000 (00:08 +0000)]
[efi] Automatically open network device matching loaded image device path
It is unintuitive to have to include an "ifopen" at the start of an
autoexec.ipxe script. Commit efe8126 ("[cachedhcp] Automatically open
network device matching cached DHCPACK") causes the chainloaded device
to be opened automatically, using the cached DHCPACK to identify the
chainloaded device.
In the case of a UEFI HTTP(S) boot, the firmware does not provide
access to the DHCPACK and we are forced to instead extract the very
limited amount of information encoded into the loaded image's device
path.
Mark the device matching the loaded image's device path to be opened
automatically, so that the chainloaded device will be opened in the
same way for both TFTP and HTTP(S) boots.
Michael Brown [Sun, 1 Mar 2026 16:37:29 +0000 (16:37 +0000)]
[tftp] Use TFTP server URI only if no other working URI is set
We currently set the working URI to "tftp://${next-server}/" whenever
the value of the next-server setting changes.
Many years ago this was required for the default boot sequence, which
would treat the boot filename as a potentially relative URI. Since
commit 481a217 ("[autoboot] Retain initial-slash (if present) when
constructing TFTP URIs"), the default boot sequence has always
constructed an absolute URI.
There is still a valid use case for setting the default working URI
based on the value of next-server: it allows command sequences such as
  dhcp && chain ${filename}
or
  set next-server 192.168.0.1
  chain myscript.ipxe
to work as expected. Note that since "${filename}" may be a relative
path, it is necessary for the current working URI to be the root of
the TFTP server, i.e. "tftp://${next-server}/", rather than the full
path "tftp://${next-server}/${filename}".
In the case of a UEFI HTTP(S) boot, we already have a working URI set
on entry (to be the URI of the iPXE binary itself). Running "dhcp"
would change this current working URI, which is quite unintuitive.
Similarly, once we start executing an image (e.g. a script), the
current working URI is set to the image's own URI, so that relative
URIs may be used in a script to download files relative to the
location of the script itself. Running "dhcp" within the script may
or may not change the current working URI: it will happen to do so
only if the TFTP server address happens to change. This is also
somewhat unintuitive.
Change the behaviour of the TFTP settings applicator to treat the TFTP
server URI as a fallback, to be used only if nothing else has already
set a current working URI. This is technically a breaking change in
behaviour, but the new behaviour is almost certainly much less
surprising than the existing behaviour. (Scripts that do genuinely
expect to acquire a new TFTP server address can use full URIs of the
form "tftp://${next-server}/...": this is more explicit and will work
on iPXE builds both before and after this change.)
Michael Brown [Fri, 27 Feb 2026 13:16:51 +0000 (13:16 +0000)]
[tls] Respond to received closure alerts
TLS defines a mechanism for gracefully closing a connection via a
closure alert. We currently ignore this alert since it is a warning
rather than an error, and warnings are allowed to be ignored.
In almost all cases, a higher-level protocol such as HTTP will already
give us the information required to know when the connection should be
closed. In the very rare case of an HTTPS server that does not send a
Content-Length header and does not close the TCP connection, only the
closure alert indicates that the whole file has been retrieved.
Handle a received closure alert by gracefully closing the connection.
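The decision can be sketched as follows. The alert level and
description values are those defined by the TLS RFCs; the
tls_alert_action() function and the alert_action enumeration are
hypothetical names used only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Alert levels and closure alert description (values from the TLS RFCs) */
#define TLS_ALERT_WARNING 1
#define TLS_ALERT_FATAL 2
#define TLS_ALERT_CLOSE_NOTIFY 0

/* Hypothetical outcomes of processing a received alert */
enum alert_action {
	ALERT_IGNORE,	/* Warning: carry on */
	ALERT_CLOSE,	/* Closure alert: close the connection gracefully */
	ALERT_FAIL,	/* Fatal alert: close the connection with an error */
};

/* Decide what to do with a received alert */
static enum alert_action tls_alert_action ( uint8_t level,
					    uint8_t description ) {

	/* The closure alert is sent as a warning, so it must be
	 * checked before warnings are ignored.
	 */
	if ( description == TLS_ALERT_CLOSE_NOTIFY )
		return ALERT_CLOSE;

	/* Other warnings may be ignored */
	if ( level == TLS_ALERT_WARNING )
		return ALERT_IGNORE;

	/* Anything else is treated as fatal */
	return ALERT_FAIL;
}
```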
Reported-by: Tuomo Tanskanen <tuomo.tanskanen@est.tech>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Michael Brown [Thu, 26 Feb 2026 13:11:57 +0000 (13:11 +0000)]
[cachedhcp] Automatically open network device matching cached DHCPACK
It is unintuitive to have to include an "ifopen" at the start of an
autoexec.ipxe script. Provide a mechanism for upper-layer drivers to
mark a network device to be opened automatically upon registration,
and do so for the device to which the cached DHCPACK is applied.
Michael Brown [Thu, 26 Feb 2026 12:28:50 +0000 (12:28 +0000)]
[dynui] Allow for duplicate shortcut keys
When searching for a shortcut key, search first from the currently
selected menu item and then from the start of the list.
This allows several ways for a shortcut key to be meaningfully used
multiple times within the same menu. For example, two sections may
have the same shortcut key:
  item --key s --gap (S)ection 1
  item ...
  item ...
  item --key s --gap (S)ection 2
  item ...
With the above menu, repeated "s" keypresses would cycle through the
sections.
As another example, entries within different sections may have the
same shortcut keys. For example:
  item --key d --gap (D)ebian
  item --key s debst Debian (s)table release
  item --key u debun Debian (u)nstable release
  item --key f --gap (F)edora
  item --key s fedst Fedora (s)table release
  item --key u fedun Fedora (u)nstable release
With the above menu, a shortcut key sequence such as "f", "s" can be
used to select an entry within a specific section, avoiding the need
to choose shortcut keys that are globally unique within the menu.
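The search order can be sketched as a wrapping scan over the item
list. This is an illustration rather than the dynui code: the
menu_item structure and find_shortcut() are hypothetical, the scan is
assumed to begin with the item after the current selection, and the
sketch does not distinguish gap items from selectable items.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical menu item: display text plus shortcut key (0 for none) */
struct menu_item {
	const char *text;
	int key;
};

/*
 * Find the next item with the given shortcut key, scanning forwards
 * from the item after the current selection and wrapping around to
 * the start of the list.  Returns the item index, or -1 if not found.
 */
static int find_shortcut ( const struct menu_item *items, size_t count,
			   size_t selected, int key ) {
	size_t index;
	size_t i;

	assert ( ( count == 0 ) || ( selected < count ) );

	/* The currently selected item itself is checked last */
	for ( i = 1 ; i <= count ; i++ ) {
		index = ( ( selected + i ) % count );
		if ( items[index].key == key )
			return ( ( int ) index );
	}

	return -1;
}
```

Applied to the Debian and Fedora menu above with the first stable
entry selected, "f" finds the Fedora section and a subsequent "s"
finds the Fedora stable entry rather than the Debian one.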
Michael Brown [Wed, 25 Feb 2026 17:05:59 +0000 (17:05 +0000)]
[efi] Allow for the existence of multiple shim lock protocols
When multiple shims are present in the system (e.g. in a boot chain
such as UEFI -> iPXE shim -> iPXE -> distro shim -> distro kernel),
there may be more than one installed shim lock protocol.
There is no sensible way to identify which shim lock protocol belongs
to which shim. The shim lock protocol is installed on an anonymous
handle that has no device path, no other form of identifier, and no
connection to any other handle or protocol instance installed by the
shim.
The shim does include some extremely convoluted logic whereby a second
shim will attempt to uninstall a shim lock protocol installed by an
earlier shim. However, this logic is broken: the second shim calls
UninstallProtocolInterface() with the wrong handle and the wrong
protocol interface pointer. This logic error is silently ignored
since shim does not bother to check the return status.
Experience shows that there is unfortunately no point in trying to get
a fix for this upstreamed into shim, or even in raising the issue with
the shim project. We therefore work around the shim bug by calling
all instances of the shim lock protocol, rather than relying on shim
itself to ensure that only one such instance exists.
Michael Brown [Wed, 25 Feb 2026 00:14:26 +0000 (00:14 +0000)]
[xferbuf] Silently discard data written to a void data transfer buffer
Allow data to be successfully written (and discarded) to a void data
transfer buffer, rather than throwing an error. This allows a void
data transfer buffer to be used when determining the length of a file
downloaded from a TFTP server that does not support the "tsize" option
defined in RFC 2349.
Michael Brown [Wed, 25 Feb 2026 00:00:28 +0000 (00:00 +0000)]
[xferbuf] Record maximum required size
Record the maximum size required when writing into a data transfer
buffer. This allows the maximum size to be determined even if
allocation fails (e.g. due to a fixed-size buffer or an out-of-memory
condition).
In the case of a fixed-size buffer (which may already be larger than
required), this allows the caller to determine the actual size used
for written data.
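A minimal sketch of the idea for a fixed-size buffer follows. The
fixed_buffer structure and fixed_write() function are hypothetical and
are not the xferbuf API; overflow of the offset arithmetic is ignored
for brevity.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical fixed-size buffer that tracks the maximum required size */
struct fixed_buffer {
	char *data;	/* Underlying storage */
	size_t len;	/* Capacity of the storage */
	size_t max;	/* Maximum size required by any write so far */
};

/*
 * Write data at the given offset.  The maximum required size is
 * recorded before the capacity check, so that it remains meaningful
 * even when the write itself fails.  Returns 0 on success, or -1 if
 * the data does not fit.
 */
static int fixed_write ( struct fixed_buffer *buf, size_t offset,
			 const void *src, size_t len ) {
	size_t end = ( offset + len );

	assert ( src != NULL );

	/* Record the requirement regardless of the outcome */
	if ( end > buf->max )
		buf->max = end;

	/* Fail if the data does not fit */
	if ( end > buf->len )
		return -1;

	memcpy ( ( buf->data + offset ), src, len );
	return 0;
}
```

After a failed write, the caller can inspect "max" to learn how large
the buffer would have needed to be. After only successful writes,
"max" gives the number of bytes actually used within a possibly larger
buffer.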