git.ipfire.org Git - thirdparty/kernel/stable-queue.git/commitdiff
6.12-stable patches
author     Greg Kroah-Hartman <gregkh@linuxfoundation.org>
           Sun, 14 Sep 2025 08:02:24 +0000 (10:02 +0200)
committer  Greg Kroah-Hartman <gregkh@linuxfoundation.org>
           Sun, 14 Sep 2025 08:02:24 +0000 (10:02 +0200)
added patches:
btrfs-fix-corruption-reading-compressed-range-when-block-size-is-smaller-than-page-size.patch
btrfs-use-readahead_expand-on-compressed-extents.patch
kasan-avoid-sleepable-page-allocation-from-atomic-context.patch
mm-damon-reclaim-avoid-divide-by-zero-in-damon_reclaim_apply_parameters.patch
mm-damon-sysfs-fix-use-after-free-in-state_show.patch
mm-hugetlb-add-missing-hugetlb_lock-in-__unmap_hugepage_range.patch
mm-vmalloc-mm-kasan-respect-gfp-mask-in-kasan_populate_vmalloc.patch
mtd-spinand-winbond-fix-oob_layout-for-w25n01jw.patch

queue-6.12/btrfs-fix-corruption-reading-compressed-range-when-block-size-is-smaller-than-page-size.patch [new file with mode: 0644]
queue-6.12/btrfs-use-readahead_expand-on-compressed-extents.patch [new file with mode: 0644]
queue-6.12/kasan-avoid-sleepable-page-allocation-from-atomic-context.patch [new file with mode: 0644]
queue-6.12/mm-damon-reclaim-avoid-divide-by-zero-in-damon_reclaim_apply_parameters.patch [new file with mode: 0644]
queue-6.12/mm-damon-sysfs-fix-use-after-free-in-state_show.patch [new file with mode: 0644]
queue-6.12/mm-hugetlb-add-missing-hugetlb_lock-in-__unmap_hugepage_range.patch [new file with mode: 0644]
queue-6.12/mm-vmalloc-mm-kasan-respect-gfp-mask-in-kasan_populate_vmalloc.patch [new file with mode: 0644]
queue-6.12/mtd-spinand-winbond-fix-oob_layout-for-w25n01jw.patch [new file with mode: 0644]
queue-6.12/series

diff --git a/queue-6.12/btrfs-fix-corruption-reading-compressed-range-when-block-size-is-smaller-than-page-size.patch b/queue-6.12/btrfs-fix-corruption-reading-compressed-range-when-block-size-is-smaller-than-page-size.patch
new file mode 100644 (file)
index 0000000..d56e125
--- /dev/null
@@ -0,0 +1,232 @@
+From stable+bounces-179510-greg=kroah.com@vger.kernel.org Sat Sep 13 18:14:04 2025
+From: Sasha Levin <sashal@kernel.org>
+Date: Sat, 13 Sep 2025 12:13:53 -0400
+Subject: btrfs: fix corruption reading compressed range when block size is smaller than page size
+To: stable@vger.kernel.org
+Cc: Qu Wenruo <wqu@suse.com>, Filipe Manana <fdmanana@suse.com>, David Sterba <dsterba@suse.com>, Sasha Levin <sashal@kernel.org>
+Message-ID: <20250913161353.1443138-2-sashal@kernel.org>
+
+From: Qu Wenruo <wqu@suse.com>
+
+[ Upstream commit 9786531399a679fc2f4630d2c0a186205282ab2f ]
+
+[BUG]
+With 64K page size (aarch64 with 64K page size config) and 4K btrfs
+block size, the following workload can easily lead to a corrupted read:
+
+        mkfs.btrfs -f -s 4k $dev > /dev/null
+        mount -o compress $dev $mnt
+        xfs_io -f -c "pwrite -S 0xff 0 64k" $mnt/base > /dev/null
+       echo "correct result:"
+        od -Ad -t x1 $mnt/base
+        xfs_io -f -c "reflink $mnt/base 32k 0 32k" \
+                 -c "reflink $mnt/base 0 32k 32k" \
+                 -c "pwrite -S 0xff 60k 4k" $mnt/new > /dev/null
+       echo "incorrect result:"
+        od -Ad -t x1 $mnt/new
+        umount $mnt
+
+This shows the following result:
+
+correct result:
+0000000 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
+*
+0065536
+incorrect result:
+0000000 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
+*
+0032768 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
+*
+0061440 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
+*
+0065536
+
+Notice the zero in the range [32K, 60K), which is incorrect.
+
+[CAUSE]
+With extra trace printk, it shows the following events during od:
+(some unrelated info removed like CPU and context)
+
+ od-3457   btrfs_do_readpage: enter r/i=5/258 folio=0(65536) prev_em_start=0000000000000000
+
+The "r/i" is indicating the root and inode number. In our case the file
+"new" is using ino 258 from fs tree (root 5).
+
+Here notice the @prev_em_start pointer is NULL. This means the
+btrfs_do_readpage() is called from btrfs_read_folio(), not from
+btrfs_readahead().
+
+ od-3457   btrfs_do_readpage: r/i=5/258 folio=0(65536) cur=0 got em start=0 len=32768
+ od-3457   btrfs_do_readpage: r/i=5/258 folio=0(65536) cur=4096 got em start=0 len=32768
+ od-3457   btrfs_do_readpage: r/i=5/258 folio=0(65536) cur=8192 got em start=0 len=32768
+ od-3457   btrfs_do_readpage: r/i=5/258 folio=0(65536) cur=12288 got em start=0 len=32768
+ od-3457   btrfs_do_readpage: r/i=5/258 folio=0(65536) cur=16384 got em start=0 len=32768
+ od-3457   btrfs_do_readpage: r/i=5/258 folio=0(65536) cur=20480 got em start=0 len=32768
+ od-3457   btrfs_do_readpage: r/i=5/258 folio=0(65536) cur=24576 got em start=0 len=32768
+ od-3457   btrfs_do_readpage: r/i=5/258 folio=0(65536) cur=28672 got em start=0 len=32768
+
+These above 32K blocks will be read from the first half of the
+compressed data extent.
+
+ od-3457   btrfs_do_readpage: r/i=5/258 folio=0(65536) cur=32768 got em start=32768 len=32768
+
+Note that there is no btrfs_submit_compressed_read() call here, which is
+incorrect. Although both extent maps at 0 and 32K point to the same
+compressed data, their offsets are different and thus they can not be
+merged into the same read.
+
+So this means the compressed data read merge check is doing something
+wrong.
+
+ od-3457   btrfs_do_readpage: r/i=5/258 folio=0(65536) cur=36864 got em start=32768 len=32768
+ od-3457   btrfs_do_readpage: r/i=5/258 folio=0(65536) cur=40960 got em start=32768 len=32768
+ od-3457   btrfs_do_readpage: r/i=5/258 folio=0(65536) cur=45056 got em start=32768 len=32768
+ od-3457   btrfs_do_readpage: r/i=5/258 folio=0(65536) cur=49152 got em start=32768 len=32768
+ od-3457   btrfs_do_readpage: r/i=5/258 folio=0(65536) cur=53248 got em start=32768 len=32768
+ od-3457   btrfs_do_readpage: r/i=5/258 folio=0(65536) cur=57344 got em start=32768 len=32768
+ od-3457   btrfs_do_readpage: r/i=5/258 folio=0(65536) cur=61440 skip uptodate
+ od-3457   btrfs_submit_compressed_read: cb orig_bio: file off=0 len=61440
+
+The function btrfs_submit_compressed_read() is only called at the end of
+the folio read. The compressed bio will only have an extent map of range
+[0, 32K), but the original bio passed in is for the whole 64K folio.
+
+This will cause the decompression part to only fill the first 32K,
+leaving the rest untouched (aka, filled with zero).
+
+This incorrect compressed read merge leads to the above data corruption.
+
+A similar problem happened in the past: commit 808f80b46790
+("Btrfs: update fix for read corruption of compressed and shared
+extents") applies pretty much the same fix for readahead.
+
+But that was back in 2015, when btrfs only supported the bs (block size)
+== ps (page size) case.
+This meant btrfs_do_readpage() only needed to handle a folio which
+contains exactly one block.
+
+Only btrfs_readahead() can lead to a read covering multiple blocks.
+Thus only btrfs_readahead() passes a non-NULL @prev_em_start pointer.
+
+With the v5.15 kernel, btrfs introduced bs < ps support. This breaks the
+above assumption that a folio can only contain one block.
+
+Now btrfs_read_folio() can also read multiple blocks in one go.
+But btrfs_read_folio() doesn't pass a @prev_em_start pointer, thus the
+existing bio force submission check will never be triggered.
+
+In theory, this can also happen for btrfs with large folios, but since
+large folio support is still experimental we do not need to bother with
+it, thus only the bs < ps case is affected for now.
+
+[FIX]
+Instead of passing @prev_em_start to do the proper compressed extent
+check, introduce one new member, btrfs_bio_ctrl::last_em_start, so that
+the existing bio force submission logic will always be triggered.
+
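+For the reproducer above, the two extent maps of the file "new" end up
+looking like this (illustrative values derived from the reflink layout;
+the extent name X is made up for the example):
+
+        /*
+         * file range [0, 32K)   -> compressed extent X, extent offset 32K
+         * file range [32K, 64K) -> compressed extent X, extent offset 0
+         *
+         * Same disk_bytenr, but different offsets: each range needs its own
+         * btrfs_submit_compressed_read(), so the bio has to be force
+         * submitted whenever the extent map changes, which is exactly what
+         * the new btrfs_bio_ctrl::last_em_start member tracks.
+         */
+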
+CC: stable@vger.kernel.org # 5.15+
+Reviewed-by: Filipe Manana <fdmanana@suse.com>
+Signed-off-by: Qu Wenruo <wqu@suse.com>
+Signed-off-by: David Sterba <dsterba@suse.com>
+[ Adjust context ]
+Signed-off-by: Sasha Levin <sashal@kernel.org>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+---
+ fs/btrfs/extent_io.c |   40 ++++++++++++++++++++++++++++++----------
+ 1 file changed, 30 insertions(+), 10 deletions(-)
+
+--- a/fs/btrfs/extent_io.c
++++ b/fs/btrfs/extent_io.c
+@@ -109,6 +109,24 @@ struct btrfs_bio_ctrl {
+        */
+       unsigned long submit_bitmap;
+       struct readahead_control *ractl;
++
++      /*
++       * The start offset of the last used extent map by a read operation.
++       *
++       * This is for proper compressed read merge.
++       * U64_MAX means we are starting the read and have made no progress yet.
++       *
++       * The current btrfs_bio_is_contig() only uses disk_bytenr as
++       * the condition to check if the read can be merged with previous
++       * bio, which is not correct. E.g. two file extents pointing to the
++       * same extent but with different offset.
++       *
++       * So here we need to do extra checks to only merge reads that are
++       * covered by the same extent map.
++       * Just extent_map::start will be enough, as they are unique
++       * inside the same inode.
++       */
++      u64 last_em_start;
+ };
+ static void submit_one_bio(struct btrfs_bio_ctrl *bio_ctrl)
+@@ -955,7 +973,7 @@ static void btrfs_readahead_expand(struc
+  * return 0 on success, otherwise return error
+  */
+ static int btrfs_do_readpage(struct folio *folio, struct extent_map **em_cached,
+-                    struct btrfs_bio_ctrl *bio_ctrl, u64 *prev_em_start)
++                           struct btrfs_bio_ctrl *bio_ctrl)
+ {
+       struct inode *inode = folio->mapping->host;
+       struct btrfs_fs_info *fs_info = inode_to_fs_info(inode);
+@@ -1066,12 +1084,11 @@ static int btrfs_do_readpage(struct foli
+                * non-optimal behavior (submitting 2 bios for the same extent).
+                */
+               if (compress_type != BTRFS_COMPRESS_NONE &&
+-                  prev_em_start && *prev_em_start != (u64)-1 &&
+-                  *prev_em_start != em->start)
++                  bio_ctrl->last_em_start != U64_MAX &&
++                  bio_ctrl->last_em_start != em->start)
+                       force_bio_submit = true;
+-              if (prev_em_start)
+-                      *prev_em_start = em->start;
++              bio_ctrl->last_em_start = em->start;
+               free_extent_map(em);
+               em = NULL;
+@@ -1115,12 +1132,15 @@ int btrfs_read_folio(struct file *file,
+       const u64 start = folio_pos(folio);
+       const u64 end = start + folio_size(folio) - 1;
+       struct extent_state *cached_state = NULL;
+-      struct btrfs_bio_ctrl bio_ctrl = { .opf = REQ_OP_READ };
++      struct btrfs_bio_ctrl bio_ctrl = {
++              .opf = REQ_OP_READ,
++              .last_em_start = U64_MAX,
++      };
+       struct extent_map *em_cached = NULL;
+       int ret;
+       btrfs_lock_and_flush_ordered_range(inode, start, end, &cached_state);
+-      ret = btrfs_do_readpage(folio, &em_cached, &bio_ctrl, NULL);
++      ret = btrfs_do_readpage(folio, &em_cached, &bio_ctrl);
+       unlock_extent(&inode->io_tree, start, end, &cached_state);
+       free_extent_map(em_cached);
+@@ -2391,7 +2411,8 @@ void btrfs_readahead(struct readahead_co
+ {
+       struct btrfs_bio_ctrl bio_ctrl = {
+               .opf = REQ_OP_READ | REQ_RAHEAD,
+-              .ractl = rac
++              .ractl = rac,
++              .last_em_start = U64_MAX,
+       };
+       struct folio *folio;
+       struct btrfs_inode *inode = BTRFS_I(rac->mapping->host);
+@@ -2399,12 +2420,11 @@ void btrfs_readahead(struct readahead_co
+       const u64 end = start + readahead_length(rac) - 1;
+       struct extent_state *cached_state = NULL;
+       struct extent_map *em_cached = NULL;
+-      u64 prev_em_start = (u64)-1;
+       btrfs_lock_and_flush_ordered_range(inode, start, end, &cached_state);
+       while ((folio = readahead_folio(rac)) != NULL)
+-              btrfs_do_readpage(folio, &em_cached, &bio_ctrl, &prev_em_start);
++              btrfs_do_readpage(folio, &em_cached, &bio_ctrl);
+       unlock_extent(&inode->io_tree, start, end, &cached_state);
diff --git a/queue-6.12/btrfs-use-readahead_expand-on-compressed-extents.patch b/queue-6.12/btrfs-use-readahead_expand-on-compressed-extents.patch
new file mode 100644 (file)
index 0000000..d31f7e3
--- /dev/null
@@ -0,0 +1,204 @@
+From stable+bounces-179509-greg=kroah.com@vger.kernel.org Sat Sep 13 18:14:01 2025
+From: Sasha Levin <sashal@kernel.org>
+Date: Sat, 13 Sep 2025 12:13:52 -0400
+Subject: btrfs: use readahead_expand() on compressed extents
+To: stable@vger.kernel.org
+Cc: Boris Burkov <boris@bur.io>, Dimitrios Apostolou <jimis@gmx.net>, Filipe Manana <fdmanana@suse.com>, David Sterba <dsterba@suse.com>, Sasha Levin <sashal@kernel.org>
+Message-ID: <20250913161353.1443138-1-sashal@kernel.org>
+
+From: Boris Burkov <boris@bur.io>
+
+[ Upstream commit 9e9ff875e4174be939371667d2cc81244e31232f ]
+
+We recently received a report of poor performance doing sequential
+buffered reads of a file with compressed extents. With bs=128k, a naive
+sequential dd ran as fast on a compressed file as on an uncompressed one
+(1.2GB/s on my reproducing system), while with bs<32k this performance
+tanked to ~300MB/s.
+
+i.e., slow:
+
+  dd if=some-compressed-file of=/dev/null bs=4k count=X
+
+vs fast:
+
+  dd if=some-compressed-file of=/dev/null bs=128k count=Y
+
+The cause of this slowness is overhead to do with looking up extent_maps
+to enable readahead pre-caching on compressed extents
+(add_ra_bio_pages()), as well as some overhead in the generic VFS
+readahead code we hit more in the slow case. Notably, the main
+difference between the two read sizes is that in the large sized request
+case, we call btrfs_readahead() relatively rarely while in the smaller
+request we call it for every compressed extent. So the fast case stays
+in the btrfs readahead loop:
+
+    while ((folio = readahead_folio(rac)) != NULL)
+           btrfs_do_readpage(folio, &em_cached, &bio_ctrl, &prev_em_start);
+
+where the slower one breaks out of that loop every time. This results in
+calling add_ra_bio_pages a lot, doing lots of extent_map lookups,
+extent_map locking, etc.
+
+This happens because although add_ra_bio_pages() does add the
+appropriate un-compressed file pages to the cache, it does not
+communicate back to the ractl in any way. To solve this, we should be
+using readahead_expand() to signal to readahead to expand the readahead
+window.
+
+This change passes the readahead_control into the btrfs_bio_ctrl and in
+the case of compressed reads sets the expansion to the size of the
+extent_map we already looked up anyway. It skips the subpage case as
+that one already doesn't do add_ra_bio_pages().
+
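+Reduced to a sketch, the hot loop of btrfs_do_readpage() now does the
+following (condensing the two hunks below into one place):
+
+        /* Only compressed extents in the non-subpage case pre-populate the
+         * page cache via add_ra_bio_pages(), so only those grow the ra
+         * window, and only up to the end of the extent map we already
+         * looked up anyway. */
+        if (bio_ctrl->ractl &&
+            !btrfs_is_subpage(fs_info, folio->mapping) &&
+            compress_type != BTRFS_COMPRESS_NONE)
+                btrfs_readahead_expand(bio_ctrl->ractl, em);
+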
+With this change, whether we use bs=4k or bs=128k, btrfs expands the
+readahead window up to the largest compressed extent we have seen so far
+(in the trivial example: 128k) and the call stacks of the two modes look
+identical. Notably, we barely call add_ra_bio_pages at all. And the
+performance becomes identical as well. So this change certainly "fixes"
+this performance problem.
+
+Of course, it does seem to raise a few questions:
+
+1. Will this waste too much page cache with a too large ra window?
+2. Will this somehow cause bugs prevented by the more thoughtful
+   checking in add_ra_bio_pages?
+3. Should we delete add_ra_bio_pages?
+
+My stabs at some answers:
+
+1. Hard to say. See attempts at generic performance testing below. Is
+   there a "readahead_shrink" we should be using? Should we expand more
+   slowly, by half the remaining em size each time?
+2. I don't think so. Since the new behavior is indistinguishable from
+   reading the file with a larger read size passed in, I don't see why
+   one would be safe but not the other.
+3. Probably! I tested that and it was fine in fstests, and it seems like
+   the pages would get re-used just as well in the readahead case.
+   However, it is possible some reads that use page cache but not
+   btrfs_readahead() could suffer. I will investigate this further as a
+   follow up.
+
+I tested the performance implications of this change in 3 ways (using
+compress-force=zstd:3 for compression):
+
+Directly test the affected workload of small sequential reads on a
+compressed file (improved from ~250MB/s to ~1.2GB/s)
+
+==========for-next==========
+  dd /mnt/lol/non-cmpr 4k
+  1048576+0 records in
+  1048576+0 records out
+  4294967296 bytes (4.3 GB, 4.0 GiB) copied, 6.02983 s, 712 MB/s
+  dd /mnt/lol/non-cmpr 128k
+  32768+0 records in
+  32768+0 records out
+  4294967296 bytes (4.3 GB, 4.0 GiB) copied, 5.92403 s, 725 MB/s
+  dd /mnt/lol/cmpr 4k
+  1048576+0 records in
+  1048576+0 records out
+  4294967296 bytes (4.3 GB, 4.0 GiB) copied, 17.8832 s, 240 MB/s
+  dd /mnt/lol/cmpr 128k
+  32768+0 records in
+  32768+0 records out
+  4294967296 bytes (4.3 GB, 4.0 GiB) copied, 3.71001 s, 1.2 GB/s
+
+==========ra-expand==========
+  dd /mnt/lol/non-cmpr 4k
+  1048576+0 records in
+  1048576+0 records out
+  4294967296 bytes (4.3 GB, 4.0 GiB) copied, 6.09001 s, 705 MB/s
+  dd /mnt/lol/non-cmpr 128k
+  32768+0 records in
+  32768+0 records out
+  4294967296 bytes (4.3 GB, 4.0 GiB) copied, 6.07664 s, 707 MB/s
+  dd /mnt/lol/cmpr 4k
+  1048576+0 records in
+  1048576+0 records out
+  4294967296 bytes (4.3 GB, 4.0 GiB) copied, 3.79531 s, 1.1 GB/s
+  dd /mnt/lol/cmpr 128k
+  32768+0 records in
+  32768+0 records out
+  4294967296 bytes (4.3 GB, 4.0 GiB) copied, 3.69533 s, 1.2 GB/s
+
+Built the linux kernel from clean (no change)
+
+Ran fsperf. Mostly neutral results with some improvements and
+regressions here and there.
+
+Reported-by: Dimitrios Apostolou <jimis@gmx.net>
+Link: https://lore.kernel.org/linux-btrfs/34601559-6c16-6ccc-1793-20a97ca0dbba@gmx.net/
+Reviewed-by: Filipe Manana <fdmanana@suse.com>
+Signed-off-by: Boris Burkov <boris@bur.io>
+Signed-off-by: David Sterba <dsterba@suse.com>
+[ Assert doesn't take a format string ]
+Signed-off-by: Sasha Levin <sashal@kernel.org>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+---
+ fs/btrfs/extent_io.c |   34 +++++++++++++++++++++++++++++++++-
+ 1 file changed, 33 insertions(+), 1 deletion(-)
+
+--- a/fs/btrfs/extent_io.c
++++ b/fs/btrfs/extent_io.c
+@@ -108,6 +108,7 @@ struct btrfs_bio_ctrl {
+        * This is to avoid touching ranges covered by compression/inline.
+        */
+       unsigned long submit_bitmap;
++      struct readahead_control *ractl;
+ };
+ static void submit_one_bio(struct btrfs_bio_ctrl *bio_ctrl)
+@@ -929,6 +930,23 @@ static struct extent_map *get_extent_map
+       return em;
+ }
++
++static void btrfs_readahead_expand(struct readahead_control *ractl,
++                                 const struct extent_map *em)
++{
++      const u64 ra_pos = readahead_pos(ractl);
++      const u64 ra_end = ra_pos + readahead_length(ractl);
++      const u64 em_end = em->start + em->ram_bytes;
++
++      /* No expansion for holes and inline extents. */
++      if (em->disk_bytenr > EXTENT_MAP_LAST_BYTE)
++              return;
++
++      ASSERT(em_end >= ra_pos);
++      if (em_end > ra_end)
++              readahead_expand(ractl, ra_pos, em_end - ra_pos);
++}
++
+ /*
+  * basic readpage implementation.  Locked extent state structs are inserted
+  * into the tree that are removed when the IO is done (by the end_io
+@@ -994,6 +1012,17 @@ static int btrfs_do_readpage(struct foli
+               iosize = min(extent_map_end(em) - cur, end - cur + 1);
+               iosize = ALIGN(iosize, blocksize);
++
++              /*
++               * Only expand readahead for extents which are already creating
++               * the pages anyway in add_ra_bio_pages, which is compressed
++               * extents in the non subpage case.
++               */
++              if (bio_ctrl->ractl &&
++                  !btrfs_is_subpage(fs_info, folio->mapping) &&
++                  compress_type != BTRFS_COMPRESS_NONE)
++                      btrfs_readahead_expand(bio_ctrl->ractl, em);
++
+               if (compress_type != BTRFS_COMPRESS_NONE)
+                       disk_bytenr = em->disk_bytenr;
+               else
+@@ -2360,7 +2389,10 @@ int btrfs_writepages(struct address_spac
+ void btrfs_readahead(struct readahead_control *rac)
+ {
+-      struct btrfs_bio_ctrl bio_ctrl = { .opf = REQ_OP_READ | REQ_RAHEAD };
++      struct btrfs_bio_ctrl bio_ctrl = {
++              .opf = REQ_OP_READ | REQ_RAHEAD,
++              .ractl = rac
++      };
+       struct folio *folio;
+       struct btrfs_inode *inode = BTRFS_I(rac->mapping->host);
+       const u64 start = readahead_pos(rac);
diff --git a/queue-6.12/kasan-avoid-sleepable-page-allocation-from-atomic-context.patch b/queue-6.12/kasan-avoid-sleepable-page-allocation-from-atomic-context.patch
new file mode 100644 (file)
index 0000000..14e0b0e
--- /dev/null
@@ -0,0 +1,199 @@
+From stable+bounces-179517-greg=kroah.com@vger.kernel.org Sat Sep 13 20:59:54 2025
+From: Sasha Levin <sashal@kernel.org>
+Date: Sat, 13 Sep 2025 14:59:44 -0400
+Subject: kasan: avoid sleepable page allocation from atomic context
+To: stable@vger.kernel.org
+Cc: Alexander Gordeev <agordeev@linux.ibm.com>, Andrey Ryabinin <ryabinin.a.a@gmail.com>, Harry Yoo <harry.yoo@oracle.com>, Daniel Axtens <dja@axtens.net>, Andrew Morton <akpm@linux-foundation.org>, Sasha Levin <sashal@kernel.org>
+Message-ID: <20250913185945.1514830-1-sashal@kernel.org>
+
+From: Alexander Gordeev <agordeev@linux.ibm.com>
+
+[ Upstream commit b6ea95a34cbd014ab6ade4248107b86b0aaf2d6c ]
+
+apply_to_pte_range() enters the lazy MMU mode and then invokes
+kasan_populate_vmalloc_pte() callback on each page table walk iteration.
+However, the callback can sleep when trying to allocate a single page,
+e.g. if an architecture disables preemption on lazy MMU mode enter.
+
+On s390, if arch_enter_lazy_mmu_mode() is made to disable preemption and
+arch_leave_lazy_mmu_mode() to re-enable it, the following crash occurs:
+
+[    0.663336] BUG: sleeping function called from invalid context at ./include/linux/sched/mm.h:321
+[    0.663348] in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 2, name: kthreadd
+[    0.663358] preempt_count: 1, expected: 0
+[    0.663366] RCU nest depth: 0, expected: 0
+[    0.663375] no locks held by kthreadd/2.
+[    0.663383] Preemption disabled at:
+[    0.663386] [<0002f3284cbb4eda>] apply_to_pte_range+0xfa/0x4a0
+[    0.663405] CPU: 0 UID: 0 PID: 2 Comm: kthreadd Not tainted 6.15.0-rc5-gcc-kasan-00043-gd76bb1ebb558-dirty #162 PREEMPT
+[    0.663408] Hardware name: IBM 3931 A01 701 (KVM/Linux)
+[    0.663409] Call Trace:
+[    0.663410]  [<0002f3284c385f58>] dump_stack_lvl+0xe8/0x140
+[    0.663413]  [<0002f3284c507b9e>] __might_resched+0x66e/0x700
+[    0.663415]  [<0002f3284cc4f6c0>] __alloc_frozen_pages_noprof+0x370/0x4b0
+[    0.663419]  [<0002f3284ccc73c0>] alloc_pages_mpol+0x1a0/0x4a0
+[    0.663421]  [<0002f3284ccc8518>] alloc_frozen_pages_noprof+0x88/0xc0
+[    0.663424]  [<0002f3284ccc8572>] alloc_pages_noprof+0x22/0x120
+[    0.663427]  [<0002f3284cc341ac>] get_free_pages_noprof+0x2c/0xc0
+[    0.663429]  [<0002f3284cceba70>] kasan_populate_vmalloc_pte+0x50/0x120
+[    0.663433]  [<0002f3284cbb4ef8>] apply_to_pte_range+0x118/0x4a0
+[    0.663435]  [<0002f3284cbc7c14>] apply_to_pmd_range+0x194/0x3e0
+[    0.663437]  [<0002f3284cbc99be>] __apply_to_page_range+0x2fe/0x7a0
+[    0.663440]  [<0002f3284cbc9e88>] apply_to_page_range+0x28/0x40
+[    0.663442]  [<0002f3284ccebf12>] kasan_populate_vmalloc+0x82/0xa0
+[    0.663445]  [<0002f3284cc1578c>] alloc_vmap_area+0x34c/0xc10
+[    0.663448]  [<0002f3284cc1c2a6>] __get_vm_area_node+0x186/0x2a0
+[    0.663451]  [<0002f3284cc1e696>] __vmalloc_node_range_noprof+0x116/0x310
+[    0.663454]  [<0002f3284cc1d950>] __vmalloc_node_noprof+0xd0/0x110
+[    0.663457]  [<0002f3284c454b88>] alloc_thread_stack_node+0xf8/0x330
+[    0.663460]  [<0002f3284c458d56>] dup_task_struct+0x66/0x4d0
+[    0.663463]  [<0002f3284c45be90>] copy_process+0x280/0x4b90
+[    0.663465]  [<0002f3284c460940>] kernel_clone+0xd0/0x4b0
+[    0.663467]  [<0002f3284c46115e>] kernel_thread+0xbe/0xe0
+[    0.663469]  [<0002f3284c4e440e>] kthreadd+0x50e/0x7f0
+[    0.663472]  [<0002f3284c38c04a>] __ret_from_fork+0x8a/0xf0
+[    0.663475]  [<0002f3284ed57ff2>] ret_from_fork+0xa/0x38
+
+Instead of allocating single pages per-PTE, bulk-allocate the shadow
+memory prior to applying kasan_populate_vmalloc_pte() callback on a page
+range.
+
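+Reduced to a sketch, the new populate path looks as follows (condensed from
+the helpers added in the hunk below; error handling omitted):
+
+        while (nr_total) {
+                nr_pages = min(nr_total, PAGE_SIZE / sizeof(data.pages[0]));
+
+                /* The sleepable GFP_KERNEL allocation now happens here,
+                 * outside of the (possibly non-preemptible) lazy MMU
+                 * section ... */
+                ret = ___alloc_pages_bulk(data.pages, nr_pages);
+
+                /* ... while the atomic pte callback only installs the page
+                 * pre-allocated for its index, data.pages[index]. */
+                ret = apply_to_page_range(&init_mm, start, nr_pages * PAGE_SIZE,
+                                          kasan_populate_vmalloc_pte, &data);
+
+                ___free_pages_bulk(data.pages, nr_pages); /* unused leftovers */
+                start += nr_pages * PAGE_SIZE;
+                nr_total -= nr_pages;
+        }
+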
+Link: https://lkml.kernel.org/r/c61d3560297c93ed044f0b1af085610353a06a58.1747316918.git.agordeev@linux.ibm.com
+Fixes: 3c5c3cfb9ef4 ("kasan: support backing vmalloc space with real shadow memory")
+Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
+Suggested-by: Andrey Ryabinin <ryabinin.a.a@gmail.com>
+Reviewed-by: Harry Yoo <harry.yoo@oracle.com>
+Cc: Daniel Axtens <dja@axtens.net>
+Cc: <stable@vger.kernel.org>
+Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
+Stable-dep-of: 79357cd06d41 ("mm/vmalloc, mm/kasan: respect gfp mask in kasan_populate_vmalloc()")
+Signed-off-by: Sasha Levin <sashal@kernel.org>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+---
+ mm/kasan/shadow.c |   92 +++++++++++++++++++++++++++++++++++++++++++++---------
+ 1 file changed, 78 insertions(+), 14 deletions(-)
+
+--- a/mm/kasan/shadow.c
++++ b/mm/kasan/shadow.c
+@@ -292,33 +292,99 @@ void __init __weak kasan_populate_early_
+ {
+ }
++struct vmalloc_populate_data {
++      unsigned long start;
++      struct page **pages;
++};
++
+ static int kasan_populate_vmalloc_pte(pte_t *ptep, unsigned long addr,
+-                                    void *unused)
++                                    void *_data)
+ {
+-      unsigned long page;
++      struct vmalloc_populate_data *data = _data;
++      struct page *page;
+       pte_t pte;
++      int index;
+       if (likely(!pte_none(ptep_get(ptep))))
+               return 0;
+-      page = __get_free_page(GFP_KERNEL);
+-      if (!page)
+-              return -ENOMEM;
+-
+-      __memset((void *)page, KASAN_VMALLOC_INVALID, PAGE_SIZE);
+-      pte = pfn_pte(PFN_DOWN(__pa(page)), PAGE_KERNEL);
++      index = PFN_DOWN(addr - data->start);
++      page = data->pages[index];
++      __memset(page_to_virt(page), KASAN_VMALLOC_INVALID, PAGE_SIZE);
++      pte = pfn_pte(page_to_pfn(page), PAGE_KERNEL);
+       spin_lock(&init_mm.page_table_lock);
+       if (likely(pte_none(ptep_get(ptep)))) {
+               set_pte_at(&init_mm, addr, ptep, pte);
+-              page = 0;
++              data->pages[index] = NULL;
+       }
+       spin_unlock(&init_mm.page_table_lock);
+-      if (page)
+-              free_page(page);
++
++      return 0;
++}
++
++static void ___free_pages_bulk(struct page **pages, int nr_pages)
++{
++      int i;
++
++      for (i = 0; i < nr_pages; i++) {
++              if (pages[i]) {
++                      __free_pages(pages[i], 0);
++                      pages[i] = NULL;
++              }
++      }
++}
++
++static int ___alloc_pages_bulk(struct page **pages, int nr_pages)
++{
++      unsigned long nr_populated, nr_total = nr_pages;
++      struct page **page_array = pages;
++
++      while (nr_pages) {
++              nr_populated = alloc_pages_bulk(GFP_KERNEL, nr_pages, pages);
++              if (!nr_populated) {
++                      ___free_pages_bulk(page_array, nr_total - nr_pages);
++                      return -ENOMEM;
++              }
++              pages += nr_populated;
++              nr_pages -= nr_populated;
++      }
++
+       return 0;
+ }
++static int __kasan_populate_vmalloc(unsigned long start, unsigned long end)
++{
++      unsigned long nr_pages, nr_total = PFN_UP(end - start);
++      struct vmalloc_populate_data data;
++      int ret = 0;
++
++      data.pages = (struct page **)__get_free_page(GFP_KERNEL | __GFP_ZERO);
++      if (!data.pages)
++              return -ENOMEM;
++
++      while (nr_total) {
++              nr_pages = min(nr_total, PAGE_SIZE / sizeof(data.pages[0]));
++              ret = ___alloc_pages_bulk(data.pages, nr_pages);
++              if (ret)
++                      break;
++
++              data.start = start;
++              ret = apply_to_page_range(&init_mm, start, nr_pages * PAGE_SIZE,
++                                        kasan_populate_vmalloc_pte, &data);
++              ___free_pages_bulk(data.pages, nr_pages);
++              if (ret)
++                      break;
++
++              start += nr_pages * PAGE_SIZE;
++              nr_total -= nr_pages;
++      }
++
++      free_page((unsigned long)data.pages);
++
++      return ret;
++}
++
+ int kasan_populate_vmalloc(unsigned long addr, unsigned long size)
+ {
+       unsigned long shadow_start, shadow_end;
+@@ -348,9 +414,7 @@ int kasan_populate_vmalloc(unsigned long
+       shadow_start = PAGE_ALIGN_DOWN(shadow_start);
+       shadow_end = PAGE_ALIGN(shadow_end);
+-      ret = apply_to_page_range(&init_mm, shadow_start,
+-                                shadow_end - shadow_start,
+-                                kasan_populate_vmalloc_pte, NULL);
++      ret = __kasan_populate_vmalloc(shadow_start, shadow_end);
+       if (ret)
+               return ret;
diff --git a/queue-6.12/mm-damon-reclaim-avoid-divide-by-zero-in-damon_reclaim_apply_parameters.patch b/queue-6.12/mm-damon-reclaim-avoid-divide-by-zero-in-damon_reclaim_apply_parameters.patch
new file mode 100644 (file)
index 0000000..fc0e38f
--- /dev/null
@@ -0,0 +1,42 @@
+From e6b543ca9806d7bced863f43020e016ee996c057 Mon Sep 17 00:00:00 2001
+From: Quanmin Yan <yanquanmin1@huawei.com>
+Date: Wed, 27 Aug 2025 19:58:58 +0800
+Subject: mm/damon/reclaim: avoid divide-by-zero in damon_reclaim_apply_parameters()
+
+From: Quanmin Yan <yanquanmin1@huawei.com>
+
+commit e6b543ca9806d7bced863f43020e016ee996c057 upstream.
+
+When creating a new DAMON_RECLAIM scheme, the calculation of
+'min_age_region' uses 'aggr_interval' as the divisor, which may lead to
+a division-by-zero error.  Fix it by directly returning -EINVAL when such
+a case occurs.
+
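+For reference, the division the paragraph above refers to is of this shape
+(an illustrative sketch, not the exact upstream code):
+
+        /*
+         * DAMON_RECLAIM converts the scheme's minimum age from microseconds
+         * into a number of aggregation intervals, roughly:
+         *
+         *     .min_age_region = min_age / damon_reclaim_mon_attrs.aggr_interval,
+         *
+         * so an aggr_interval of 0 must be rejected with -EINVAL before the
+         * scheme is applied, which is what the hunk below does.
+         */
+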
+Link: https://lkml.kernel.org/r/20250827115858.1186261-3-yanquanmin1@huawei.com
+Fixes: f5a79d7c0c87 ("mm/damon: introduce struct damos_access_pattern")
+Signed-off-by: Quanmin Yan <yanquanmin1@huawei.com>
+Reviewed-by: SeongJae Park <sj@kernel.org>
+Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
+Cc: ze zuo <zuoze1@huawei.com>
+Cc: <stable@vger.kernel.org>   [6.1+]
+Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
+Signed-off-by: SeongJae Park <sj@kernel.org>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+---
+ mm/damon/reclaim.c |    5 +++++
+ 1 file changed, 5 insertions(+)
+
+--- a/mm/damon/reclaim.c
++++ b/mm/damon/reclaim.c
+@@ -194,6 +194,11 @@ static int damon_reclaim_apply_parameter
+       if (err)
+               return err;
++      if (!damon_reclaim_mon_attrs.aggr_interval) {
++              err = -EINVAL;
++              goto out;
++      }
++
+       err = damon_set_attrs(ctx, &damon_reclaim_mon_attrs);
+       if (err)
+               goto out;
diff --git a/queue-6.12/mm-damon-sysfs-fix-use-after-free-in-state_show.patch b/queue-6.12/mm-damon-sysfs-fix-use-after-free-in-state_show.patch
new file mode 100644 (file)
index 0000000..684014d
--- /dev/null
@@ -0,0 +1,70 @@
+From 3260a3f0828e06f5f13fac69fb1999a6d60d9cff Mon Sep 17 00:00:00 2001
+From: Stanislav Fort <stanislav.fort@aisle.com>
+Date: Fri, 5 Sep 2025 13:10:46 +0300
+Subject: mm/damon/sysfs: fix use-after-free in state_show()
+
+From: Stanislav Fort <stanislav.fort@aisle.com>
+
+commit 3260a3f0828e06f5f13fac69fb1999a6d60d9cff upstream.
+
+state_show() reads kdamond->damon_ctx without holding damon_sysfs_lock.
+This allows a use-after-free race:
+
+CPU 0                         CPU 1
+-----                         -----
+state_show()                  damon_sysfs_turn_damon_on()
+ctx = kdamond->damon_ctx;     mutex_lock(&damon_sysfs_lock);
+                              damon_destroy_ctx(kdamond->damon_ctx);
+                              kdamond->damon_ctx = NULL;
+                              mutex_unlock(&damon_sysfs_lock);
+damon_is_running(ctx);        /* ctx is freed */
+mutex_lock(&ctx->kdamond_lock); /* UAF */
+
+(The race can also occur with damon_sysfs_kdamonds_rm_dirs() and
+damon_sysfs_kdamond_release(), which free or replace the context under
+damon_sysfs_lock.)
+
+Fix by taking damon_sysfs_lock before dereferencing the context, mirroring
+the locking used in pid_show().
+
+The bug has existed since state_show() first accessed kdamond->damon_ctx.
+
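+Reduced to a sketch, the fixed reader takes the same lock the writer paths
+hold (this mirrors the hunk below; mutex_trylock() is used so a sysfs read
+never sleeps on a busy kdamond and simply returns -EBUSY instead):
+
+        if (!mutex_trylock(&damon_sysfs_lock))
+                return -EBUSY;
+        ctx = kdamond->damon_ctx;       /* stable while the lock is held */
+        if (ctx)
+                running = damon_sysfs_ctx_running(ctx);
+        mutex_unlock(&damon_sysfs_lock);
+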
+Link: https://lkml.kernel.org/r/20250905101046.2288-1-disclosure@aisle.com
+Fixes: a61ea561c871 ("mm/damon/sysfs: link DAMON for virtual address spaces monitoring")
+Signed-off-by: Stanislav Fort <disclosure@aisle.com>
+Reported-by: Stanislav Fort <disclosure@aisle.com>
+Reviewed-by: SeongJae Park <sj@kernel.org>
+Cc: <stable@vger.kernel.org>
+Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
+Signed-off-by: SeongJae Park <sj@kernel.org>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+---
+ mm/damon/sysfs.c |   14 +++++++++-----
+ 1 file changed, 9 insertions(+), 5 deletions(-)
+
+--- a/mm/damon/sysfs.c
++++ b/mm/damon/sysfs.c
+@@ -1067,14 +1067,18 @@ static ssize_t state_show(struct kobject
+ {
+       struct damon_sysfs_kdamond *kdamond = container_of(kobj,
+                       struct damon_sysfs_kdamond, kobj);
+-      struct damon_ctx *ctx = kdamond->damon_ctx;
+-      bool running;
++      struct damon_ctx *ctx;
++      bool running = false;
+-      if (!ctx)
+-              running = false;
+-      else
++      if (!mutex_trylock(&damon_sysfs_lock))
++              return -EBUSY;
++
++      ctx = kdamond->damon_ctx;
++      if (ctx)
+               running = damon_sysfs_ctx_running(ctx);
++      mutex_unlock(&damon_sysfs_lock);
++
+       return sysfs_emit(buf, "%s\n", running ?
+                       damon_sysfs_cmd_strs[DAMON_SYSFS_CMD_ON] :
+                       damon_sysfs_cmd_strs[DAMON_SYSFS_CMD_OFF]);
diff --git a/queue-6.12/mm-hugetlb-add-missing-hugetlb_lock-in-__unmap_hugepage_range.patch b/queue-6.12/mm-hugetlb-add-missing-hugetlb_lock-in-__unmap_hugepage_range.patch
new file mode 100644 (file)
index 0000000..3a0c385
--- /dev/null
@@ -0,0 +1,89 @@
+From stable+bounces-179521-greg=kroah.com@vger.kernel.org Sat Sep 13 21:10:40 2025
+From: Sasha Levin <sashal@kernel.org>
+Date: Sat, 13 Sep 2025 15:10:32 -0400
+Subject: mm/hugetlb: add missing hugetlb_lock in __unmap_hugepage_range()
+To: stable@vger.kernel.org
+Cc: Jeongjun Park <aha310510@gmail.com>, syzbot+417aeb05fd190f3a6da9@syzkaller.appspotmail.com, Sidhartha Kumar <sidhartha.kumar@oracle.com>, Breno Leitao <leitao@debian.org>, David Hildenbrand <david@redhat.com>, Muchun Song <muchun.song@linux.dev>, Oscar Salvador <osalvador@suse.de>, Andrew Morton <akpm@linux-foundation.org>, Sasha Levin <sashal@kernel.org>
+Message-ID: <20250913191032.1527419-1-sashal@kernel.org>
+
+From: Jeongjun Park <aha310510@gmail.com>
+
+[ Upstream commit 21cc2b5c5062a256ae9064442d37ebbc23f5aef7 ]
+
+When restoring a reservation for an anonymous page, we need to check
+whether we are freeing a surplus page.  However, __unmap_hugepage_range()
+causes a data race because it reads h->surplus_huge_pages without the
+protection of hugetlb_lock.
+
+And adjust_reservation is a boolean variable that indicates whether
+reservations for anonymous pages in each folio should be restored.
+Therefore, it should be initialized to false for each round of the loop.
+However, this variable is currently only initialized once, where it is
+defined, and never reset inside the loop.
+
+This means that once adjust_reservation is set to true within the loop,
+reservations for anonymous pages will be restored unconditionally in all
+subsequent rounds, regardless of the folio's state.
+
+To fix this, we need to add the missing hugetlb_lock, unlock the
+page_table_lock earlier so that we do not take the hugetlb_lock while
+holding the page_table_lock, and initialize adjust_reservation to false
+on each round within the loop.
+
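+Put together, each loop iteration now follows this order (a sketch
+assembled from the hunks below):
+
+        spin_unlock(ptl);                      /* drop page_table_lock first */
+        adjust_reservation = false;            /* reset on every iteration */
+
+        spin_lock_irq(&hugetlb_lock);          /* protects surplus_huge_pages */
+        if (!h->surplus_huge_pages && __vma_private_lock(vma) &&
+            folio_test_anon(page_folio(page))) {
+                folio_set_hugetlb_restore_reserve(page_folio(page));
+                adjust_reservation = true;
+        }
+        spin_unlock_irq(&hugetlb_lock);
+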
+Link: https://lkml.kernel.org/r/20250823182115.1193563-1-aha310510@gmail.com
+Fixes: df7a6d1f6405 ("mm/hugetlb: restore the reservation if needed")
+Signed-off-by: Jeongjun Park <aha310510@gmail.com>
+Reported-by: syzbot+417aeb05fd190f3a6da9@syzkaller.appspotmail.com
+Closes: https://syzkaller.appspot.com/bug?extid=417aeb05fd190f3a6da9
+Reviewed-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>
+Cc: Breno Leitao <leitao@debian.org>
+Cc: David Hildenbrand <david@redhat.com>
+Cc: Muchun Song <muchun.song@linux.dev>
+Cc: Oscar Salvador <osalvador@suse.de>
+Cc: <stable@vger.kernel.org>
+Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
+[ Page vs folio differences ]
+Signed-off-by: Sasha Levin <sashal@kernel.org>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+---
+ mm/hugetlb.c |    9 ++++++---
+ 1 file changed, 6 insertions(+), 3 deletions(-)
+
+--- a/mm/hugetlb.c
++++ b/mm/hugetlb.c
+@@ -5512,7 +5512,7 @@ void __unmap_hugepage_range(struct mmu_g
+       struct page *page;
+       struct hstate *h = hstate_vma(vma);
+       unsigned long sz = huge_page_size(h);
+-      bool adjust_reservation = false;
++      bool adjust_reservation;
+       unsigned long last_addr_mask;
+       bool force_flush = false;
+@@ -5604,6 +5604,7 @@ void __unmap_hugepage_range(struct mmu_g
+                                       sz);
+               hugetlb_count_sub(pages_per_huge_page(h), mm);
+               hugetlb_remove_rmap(page_folio(page));
++              spin_unlock(ptl);
+               /*
+                * Restore the reservation for anonymous page, otherwise the
+@@ -5611,14 +5612,16 @@ void __unmap_hugepage_range(struct mmu_g
+                * If there we are freeing a surplus, do not set the restore
+                * reservation bit.
+                */
++              adjust_reservation = false;
++
++              spin_lock_irq(&hugetlb_lock);
+               if (!h->surplus_huge_pages && __vma_private_lock(vma) &&
+                   folio_test_anon(page_folio(page))) {
+                       folio_set_hugetlb_restore_reserve(page_folio(page));
+                       /* Reservation to be adjusted after the spin lock */
+                       adjust_reservation = true;
+               }
+-
+-              spin_unlock(ptl);
++              spin_unlock_irq(&hugetlb_lock);
+               /*
+                * Adjust the reservation for the region that will have the
diff --git a/queue-6.12/mm-vmalloc-mm-kasan-respect-gfp-mask-in-kasan_populate_vmalloc.patch b/queue-6.12/mm-vmalloc-mm-kasan-respect-gfp-mask-in-kasan_populate_vmalloc.patch
new file mode 100644 (file)
index 0000000..90fc734
--- /dev/null
@@ -0,0 +1,204 @@
+From stable+bounces-179518-greg=kroah.com@vger.kernel.org Sat Sep 13 21:00:00 2025
+From: Sasha Levin <sashal@kernel.org>
+Date: Sat, 13 Sep 2025 14:59:45 -0400
+Subject: mm/vmalloc, mm/kasan: respect gfp mask in kasan_populate_vmalloc()
+To: stable@vger.kernel.org
+Cc: "Uladzislau Rezki (Sony)" <urezki@gmail.com>, syzbot+3470c9ffee63e4abafeb@syzkaller.appspotmail.com, Andrey Ryabinin <ryabinin.a.a@gmail.com>, Baoquan He <bhe@redhat.com>, Michal Hocko <mhocko@kernel.org>, Alexander Potapenko <glider@google.com>, Andrey Konovalov <andreyknvl@gmail.com>, Dmitry Vyukov <dvyukov@google.com>, Vincenzo Frascino <vincenzo.frascino@arm.com>, Andrew Morton <akpm@linux-foundation.org>, Sasha Levin <sashal@kernel.org>
+Message-ID: <20250913185945.1514830-2-sashal@kernel.org>
+
+From: "Uladzislau Rezki (Sony)" <urezki@gmail.com>
+
+[ Upstream commit 79357cd06d41d0f5a11b17d7c86176e395d10ef2 ]
+
+kasan_populate_vmalloc() and its helpers ignore the caller's gfp_mask and
+always allocate memory using the hardcoded GFP_KERNEL flag.  This makes
+them inconsistent with vmalloc(), which was recently extended to support
+GFP_NOFS and GFP_NOIO allocations.
+
+Page table allocations performed during shadow population also ignore the
+external gfp_mask.  To preserve the intended semantics of GFP_NOFS and
+GFP_NOIO, wrap the apply_to_page_range() calls into the appropriate
+memalloc scope.
+
+xfs calls vmalloc with GFP_NOFS, so this bug could lead to a deadlock.
+
+There was a report here
+https://lkml.kernel.org/r/686ea951.050a0220.385921.0016.GAE@google.com
+
+This patch:
+ - Extends kasan_populate_vmalloc() and helpers to take gfp_mask;
+ - Passes gfp_mask down to alloc_pages_bulk() and __get_free_page();
+ - Enforces GFP_NOFS/NOIO semantics with memalloc_*_save()/restore()
+   around apply_to_page_range();
+ - Updates vmalloc.c and percpu allocator call sites accordingly.
+
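+For reference, the scope selection depends on which reclaim bits the caller
+cleared (a sketch of the rule the hunk below implements):
+
+        /*
+         * GFP_NOFS keeps __GFP_IO but clears __GFP_FS -> memalloc_nofs_save()
+         * GFP_NOIO clears both __GFP_IO and __GFP_FS  -> memalloc_noio_save()
+         * GFP_KERNEL keeps both                       -> no scope needed
+         */
+        if ((gfp_mask & (__GFP_FS | __GFP_IO)) == __GFP_IO)
+                flags = memalloc_nofs_save();
+        else if ((gfp_mask & (__GFP_FS | __GFP_IO)) == 0)
+                flags = memalloc_noio_save();
+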
+Link: https://lkml.kernel.org/r/20250831121058.92971-1-urezki@gmail.com
+Fixes: 451769ebb7e7 ("mm/vmalloc: alloc GFP_NO{FS,IO} for vmalloc")
+Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
+Reported-by: syzbot+3470c9ffee63e4abafeb@syzkaller.appspotmail.com
+Reviewed-by: Andrey Ryabinin <ryabinin.a.a@gmail.com>
+Cc: Baoquan He <bhe@redhat.com>
+Cc: Michal Hocko <mhocko@kernel.org>
+Cc: Alexander Potapenko <glider@google.com>
+Cc: Andrey Konovalov <andreyknvl@gmail.com>
+Cc: Dmitry Vyukov <dvyukov@google.com>
+Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
+Cc: <stable@vger.kernel.org>
+Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
+Signed-off-by: Sasha Levin <sashal@kernel.org>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+---
+ include/linux/kasan.h |    6 +++---
+ mm/kasan/shadow.c     |   31 ++++++++++++++++++++++++-------
+ mm/vmalloc.c          |    8 ++++----
+ 3 files changed, 31 insertions(+), 14 deletions(-)
+
+--- a/include/linux/kasan.h
++++ b/include/linux/kasan.h
+@@ -564,7 +564,7 @@ static inline void kasan_init_hw_tags(vo
+ #if defined(CONFIG_KASAN_GENERIC) || defined(CONFIG_KASAN_SW_TAGS)
+ void kasan_populate_early_vm_area_shadow(void *start, unsigned long size);
+-int kasan_populate_vmalloc(unsigned long addr, unsigned long size);
++int kasan_populate_vmalloc(unsigned long addr, unsigned long size, gfp_t gfp_mask);
+ void kasan_release_vmalloc(unsigned long start, unsigned long end,
+                          unsigned long free_region_start,
+                          unsigned long free_region_end,
+@@ -576,7 +576,7 @@ static inline void kasan_populate_early_
+                                                      unsigned long size)
+ { }
+ static inline int kasan_populate_vmalloc(unsigned long start,
+-                                      unsigned long size)
++                                      unsigned long size, gfp_t gfp_mask)
+ {
+       return 0;
+ }
+@@ -612,7 +612,7 @@ static __always_inline void kasan_poison
+ static inline void kasan_populate_early_vm_area_shadow(void *start,
+                                                      unsigned long size) { }
+ static inline int kasan_populate_vmalloc(unsigned long start,
+-                                      unsigned long size)
++                                      unsigned long size, gfp_t gfp_mask)
+ {
+       return 0;
+ }
+--- a/mm/kasan/shadow.c
++++ b/mm/kasan/shadow.c
+@@ -335,13 +335,13 @@ static void ___free_pages_bulk(struct pa
+       }
+ }
+-static int ___alloc_pages_bulk(struct page **pages, int nr_pages)
++static int ___alloc_pages_bulk(struct page **pages, int nr_pages, gfp_t gfp_mask)
+ {
+       unsigned long nr_populated, nr_total = nr_pages;
+       struct page **page_array = pages;
+       while (nr_pages) {
+-              nr_populated = alloc_pages_bulk(GFP_KERNEL, nr_pages, pages);
++              nr_populated = alloc_pages_bulk(gfp_mask, nr_pages, pages);
+               if (!nr_populated) {
+                       ___free_pages_bulk(page_array, nr_total - nr_pages);
+                       return -ENOMEM;
+@@ -353,25 +353,42 @@ static int ___alloc_pages_bulk(struct pa
+       return 0;
+ }
+-static int __kasan_populate_vmalloc(unsigned long start, unsigned long end)
++static int __kasan_populate_vmalloc(unsigned long start, unsigned long end, gfp_t gfp_mask)
+ {
+       unsigned long nr_pages, nr_total = PFN_UP(end - start);
+       struct vmalloc_populate_data data;
++      unsigned int flags;
+       int ret = 0;
+-      data.pages = (struct page **)__get_free_page(GFP_KERNEL | __GFP_ZERO);
++      data.pages = (struct page **)__get_free_page(gfp_mask | __GFP_ZERO);
+       if (!data.pages)
+               return -ENOMEM;
+       while (nr_total) {
+               nr_pages = min(nr_total, PAGE_SIZE / sizeof(data.pages[0]));
+-              ret = ___alloc_pages_bulk(data.pages, nr_pages);
++              ret = ___alloc_pages_bulk(data.pages, nr_pages, gfp_mask);
+               if (ret)
+                       break;
+               data.start = start;
++
++              /*
++               * page tables allocations ignore external gfp mask, enforce it
++               * by the scope API
++               */
++              if ((gfp_mask & (__GFP_FS | __GFP_IO)) == __GFP_IO)
++                      flags = memalloc_nofs_save();
++              else if ((gfp_mask & (__GFP_FS | __GFP_IO)) == 0)
++                      flags = memalloc_noio_save();
++
+               ret = apply_to_page_range(&init_mm, start, nr_pages * PAGE_SIZE,
+                                         kasan_populate_vmalloc_pte, &data);
++
++              if ((gfp_mask & (__GFP_FS | __GFP_IO)) == __GFP_IO)
++                      memalloc_nofs_restore(flags);
++              else if ((gfp_mask & (__GFP_FS | __GFP_IO)) == 0)
++                      memalloc_noio_restore(flags);
++
+               ___free_pages_bulk(data.pages, nr_pages);
+               if (ret)
+                       break;
+@@ -385,7 +402,7 @@ static int __kasan_populate_vmalloc(unsi
+       return ret;
+ }
+-int kasan_populate_vmalloc(unsigned long addr, unsigned long size)
++int kasan_populate_vmalloc(unsigned long addr, unsigned long size, gfp_t gfp_mask)
+ {
+       unsigned long shadow_start, shadow_end;
+       int ret;
+@@ -414,7 +431,7 @@ int kasan_populate_vmalloc(unsigned long
+       shadow_start = PAGE_ALIGN_DOWN(shadow_start);
+       shadow_end = PAGE_ALIGN(shadow_end);
+-      ret = __kasan_populate_vmalloc(shadow_start, shadow_end);
++      ret = __kasan_populate_vmalloc(shadow_start, shadow_end, gfp_mask);
+       if (ret)
+               return ret;
+--- a/mm/vmalloc.c
++++ b/mm/vmalloc.c
+@@ -1977,6 +1977,8 @@ static struct vmap_area *alloc_vmap_area
+       if (unlikely(!vmap_initialized))
+               return ERR_PTR(-EBUSY);
++      /* Only reclaim behaviour flags are relevant. */
++      gfp_mask = gfp_mask & GFP_RECLAIM_MASK;
+       might_sleep();
+       /*
+@@ -1989,8 +1991,6 @@ static struct vmap_area *alloc_vmap_area
+        */
+       va = node_alloc(size, align, vstart, vend, &addr, &vn_id);
+       if (!va) {
+-              gfp_mask = gfp_mask & GFP_RECLAIM_MASK;
+-
+               va = kmem_cache_alloc_node(vmap_area_cachep, gfp_mask, node);
+               if (unlikely(!va))
+                       return ERR_PTR(-ENOMEM);
+@@ -2040,7 +2040,7 @@ retry:
+       BUG_ON(va->va_start < vstart);
+       BUG_ON(va->va_end > vend);
+-      ret = kasan_populate_vmalloc(addr, size);
++      ret = kasan_populate_vmalloc(addr, size, gfp_mask);
+       if (ret) {
+               free_vmap_area(va);
+               return ERR_PTR(ret);
+@@ -4789,7 +4789,7 @@ retry:
+       /* populate the kasan shadow space */
+       for (area = 0; area < nr_vms; area++) {
+-              if (kasan_populate_vmalloc(vas[area]->va_start, sizes[area]))
++              if (kasan_populate_vmalloc(vas[area]->va_start, sizes[area], GFP_KERNEL))
+                       goto err_free_shadow;
+       }
diff --git a/queue-6.12/mtd-spinand-winbond-fix-oob_layout-for-w25n01jw.patch b/queue-6.12/mtd-spinand-winbond-fix-oob_layout-for-w25n01jw.patch
new file mode 100644 (file)
index 0000000..1b380d3
--- /dev/null
@@ -0,0 +1,81 @@
+From stable+bounces-179511-greg=kroah.com@vger.kernel.org Sat Sep 13 18:19:32 2025
+From: Sasha Levin <sashal@kernel.org>
+Date: Sat, 13 Sep 2025 12:19:24 -0400
+Subject: mtd: spinand: winbond: Fix oob_layout for W25N01JW
+To: stable@vger.kernel.org
+Cc: Santhosh Kumar K <s-k6@ti.com>, Sridharan S N <quic_sridsn@quicinc.com>, Miquel Raynal <miquel.raynal@bootlin.com>, Sasha Levin <sashal@kernel.org>
+Message-ID: <20250913161924.1449549-1-sashal@kernel.org>
+
+From: Santhosh Kumar K <s-k6@ti.com>
+
+[ Upstream commit 4550d33e18112a11a740424c4eec063cd58e918c ]
+
+Fix the W25N01JW's oob_layout according to the datasheet [1]
+
+[1] https://www.winbond.com/hq/product/code-storage-flash-memory/qspinand-flash/?__locale=en&partNo=W25N01JW
+
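+Spelled out, the OOB layout these callbacks describe is (derived from the
+hunk below):
+
+        /*
+         * 64-byte OOB area, split into 4 sections of 16 bytes each:
+         *   bytes  0..11 of every section: free (user OOB data)
+         *   bytes 12..15 of every section: ECC
+         * Exception: bytes 0..1 of section 0 hold the bad block marker, so
+         * the first free region is only bytes 2..11.
+         */
+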
+Fixes: 6a804fb72de5 ("mtd: spinand: winbond: add support for serial NAND flash")
+Cc: Sridharan S N <quic_sridsn@quicinc.com>
+Cc: stable@vger.kernel.org
+Signed-off-by: Santhosh Kumar K <s-k6@ti.com>
+Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
+[ Adjust context ]
+Signed-off-by: Sasha Levin <sashal@kernel.org>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+---
+ drivers/mtd/nand/spi/winbond.c |   37 ++++++++++++++++++++++++++++++++++++-
+ 1 file changed, 36 insertions(+), 1 deletion(-)
+
+--- a/drivers/mtd/nand/spi/winbond.c
++++ b/drivers/mtd/nand/spi/winbond.c
+@@ -122,6 +122,41 @@ static const struct mtd_ooblayout_ops w2
+       .free = w25n02kv_ooblayout_free,
+ };
++static int w25n01jw_ooblayout_ecc(struct mtd_info *mtd, int section,
++                                struct mtd_oob_region *region)
++{
++      if (section > 3)
++              return -ERANGE;
++
++      region->offset = (16 * section) + 12;
++      region->length = 4;
++
++      return 0;
++}
++
++static int w25n01jw_ooblayout_free(struct mtd_info *mtd, int section,
++                                 struct mtd_oob_region *region)
++{
++      if (section > 3)
++              return -ERANGE;
++
++      region->offset = (16 * section);
++      region->length = 12;
++
++      /* Extract BBM */
++      if (!section) {
++              region->offset += 2;
++              region->length -= 2;
++      }
++
++      return 0;
++}
++
++static const struct mtd_ooblayout_ops w25n01jw_ooblayout = {
++      .ecc = w25n01jw_ooblayout_ecc,
++      .free = w25n01jw_ooblayout_free,
++};
++
+ static int w25n02kv_ecc_get_status(struct spinand_device *spinand,
+                                  u8 status)
+ {
+@@ -206,7 +241,7 @@ static const struct spinand_info winbond
+                                             &write_cache_variants,
+                                             &update_cache_variants),
+                    0,
+-                   SPINAND_ECCINFO(&w25m02gv_ooblayout, NULL)),
++                   SPINAND_ECCINFO(&w25n01jw_ooblayout, NULL)),
+       SPINAND_INFO("W25N02JWZEIF",
+                    SPINAND_ID(SPINAND_READID_METHOD_OPCODE_DUMMY, 0xbf, 0x22),
+                    NAND_MEMORG(1, 2048, 64, 64, 1024, 20, 1, 2, 1),
diff --git a/queue-6.12/series b/queue-6.12/series
index 916ac5fa232228c5cb2bf8c87b9f1cc421838f3e..09f7ab618f41cd29cde41eddfacde9c636732502 100644 (file)
@@ -76,3 +76,11 @@ kernfs-fix-uaf-in-polling-when-open-file-is-released.patch
 libceph-fix-invalid-accesses-to-ceph_connection_v1_info.patch
 ceph-fix-race-condition-validating-r_parent-before-applying-state.patch
 ceph-fix-race-condition-where-r_parent-becomes-stale-before-sending-message.patch
+mm-damon-sysfs-fix-use-after-free-in-state_show.patch
+mm-damon-reclaim-avoid-divide-by-zero-in-damon_reclaim_apply_parameters.patch
+mm-hugetlb-add-missing-hugetlb_lock-in-__unmap_hugepage_range.patch
+kasan-avoid-sleepable-page-allocation-from-atomic-context.patch
+mm-vmalloc-mm-kasan-respect-gfp-mask-in-kasan_populate_vmalloc.patch
+mtd-spinand-winbond-fix-oob_layout-for-w25n01jw.patch
+btrfs-use-readahead_expand-on-compressed-extents.patch
+btrfs-fix-corruption-reading-compressed-range-when-block-size-is-smaller-than-page-size.patch