From: Sean Christopherson Date: Tue, 2 Jun 2026 17:09:19 +0000 (-0700) Subject: KVM: guest_memfd: Treat memslot binding offset+size as unsigned values X-Git-Url: http://git.ipfire.org/gitweb/?a=commitdiff_plain;h=eba85fee7fc6cf28fec38a5bf3c378bef9a79ca6;p=thirdparty%2Flinux.git KVM: guest_memfd: Treat memslot binding offset+size as unsigned values When binding a memslot to a guest_memfd file, treat the offset and size as unsigned values to fix a bug where the sum of the two can result in a false negative when checking for overflow against the size of the file. Passing unsigned values also avoids relying on somewhat obscure checks in other flows for safety, and tracks the offset and size as they are intended to be tracked, as unsigned values. On 64-bit kernels, the number of pages a memslot contains and thus the size (and offset) of its guest_memfd binding are unsigned 64-bit values. Taking the offset+size as an loff_t instead of a uoff_t inadvertently converts the unsigned value to a signed value if the offset and/or size is massive. Locally storing the offset and size as signed values is benign in and of itself (though even that is *extremely* difficult to discern), but operating on their sum is not. For the offset, KVM explicitly checks against a negative value, which might seem like a bug as KVM could incorrectly reject a legitimate binding, but that's not actually the case as KVM_CREATE_GUEST_MEMFD takes a signed value for its size, i.e. a would-be-negative offset is also greater than the maximum possible size of any guest_memfd file. Regarding the size, while KVM lacks an explicit check for a negative value, i.e. seemingly has a flawed overflow check, KVM restricts the number of pages in a single memslot to the largest positive signed 32-bit value: if (id < KVM_USER_MEM_SLOTS && (mem->memory_size >> PAGE_SHIFT) > KVM_MEM_MAX_NR_PAGES) return -EINVAL; and so that maximum "size" will ever be is 0x7fffffff000. The sum of the two is, however, problematic. While the size is restricted by KVM's memslot logic, the offset is not, i.e. the offset is completely unchecked until the "offset + size > i_size_read(inode)" check. If the offset is the (nearly) largest possible _positive_ value, then adding size to the offset can result in a signed, negative 64-bit value. When compared against the size of the file (guaranteed to be positive), the negative sum is always smaller, and KVM incorrectly allows the absurd offset. Opportunistically add missing includes in kvm_mm.h (instead of relying on its parents). Fixes: a7800aa80ea4 ("KVM: Add KVM_CREATE_GUEST_MEMFD ioctl() for guest-specific backing memory") Cc: stable@vger.kernel.org Cc: Ackerley Tng Reviewed-by: Michael Roth Reviewed-by: Ackerley Tng Link: https://patch.msgid.link/20260602170921.1304394-2-seanjc@google.com Signed-off-by: Sean Christopherson --- diff --git a/virt/kvm/guest_memfd.c b/virt/kvm/guest_memfd.c index 46727539d08a..f54502640b08 100644 --- a/virt/kvm/guest_memfd.c +++ b/virt/kvm/guest_memfd.c @@ -640,15 +640,16 @@ int kvm_gmem_create(struct kvm *kvm, struct kvm_create_guest_memfd *args) } int kvm_gmem_bind(struct kvm *kvm, struct kvm_memory_slot *slot, - unsigned int fd, loff_t offset) + unsigned int fd, uoff_t offset) { - loff_t size = slot->npages << PAGE_SHIFT; + uoff_t size = slot->npages << PAGE_SHIFT; unsigned long start, end; struct gmem_file *f; struct inode *inode; struct file *file; int r = -EINVAL; + BUILD_BUG_ON(sizeof(gpa_t) != sizeof(offset)); BUILD_BUG_ON(sizeof(gfn_t) != sizeof(slot->gmem.pgoff)); file = fget(fd); @@ -664,8 +665,7 @@ int kvm_gmem_bind(struct kvm *kvm, struct kvm_memory_slot *slot, inode = file_inode(file); - if (offset < 0 || !PAGE_ALIGNED(offset) || - offset + size > i_size_read(inode)) + if (!PAGE_ALIGNED(offset) || offset + size > i_size_read(inode)) goto err; filemap_invalidate_lock(inode->i_mapping); diff --git a/virt/kvm/kvm_mm.h b/virt/kvm/kvm_mm.h index 9fcc5d5b7f8d..7510ca915dd1 100644 --- a/virt/kvm/kvm_mm.h +++ b/virt/kvm/kvm_mm.h @@ -3,6 +3,9 @@ #ifndef __KVM_MM_H__ #define __KVM_MM_H__ 1 +#include +#include + /* * Architectures can choose whether to use an rwlock or spinlock * for the mmu_lock. These macros, for use in common code @@ -72,7 +75,7 @@ int kvm_gmem_init(struct module *module); void kvm_gmem_exit(void); int kvm_gmem_create(struct kvm *kvm, struct kvm_create_guest_memfd *args); int kvm_gmem_bind(struct kvm *kvm, struct kvm_memory_slot *slot, - unsigned int fd, loff_t offset); + unsigned int fd, uoff_t offset); void kvm_gmem_unbind(struct kvm_memory_slot *slot); #else static inline int kvm_gmem_init(struct module *module) @@ -82,7 +85,7 @@ static inline int kvm_gmem_init(struct module *module) static inline void kvm_gmem_exit(void) {}; static inline int kvm_gmem_bind(struct kvm *kvm, struct kvm_memory_slot *slot, - unsigned int fd, loff_t offset) + unsigned int fd, uoff_t offset) { WARN_ON_ONCE(1); return -EIO;