From: Linus Torvalds
Date: Tue, 22 Mar 2022 23:11:53 +0000 (-0700)
Subject: Merge branch 'akpm' (patches from Andrew)
X-Git-Tag: v5.18-rc1~168
X-Git-Url: http://git.ipfire.org/?a=commitdiff_plain;h=3bf03b9a0839c9fb06927ae53ebd0f960b19d408;p=thirdparty%2Fkernel%2Flinux.git

Merge branch 'akpm' (patches from Andrew)

Merge updates from Andrew Morton:

 - A few misc subsystems: kthread, scripts, ntfs, ocfs2, block, and vfs

 - Most the MM patches which precede the patches in Willy's tree: kasan,
   pagecache, gup, swap, shmem, memcg, selftests, pagemap, mremap,
   sparsemem, vmalloc, pagealloc, memory-failure, mlock, hugetlb,
   userfaultfd, vmscan, compaction, mempolicy, oom-kill, migration, thp,
   cma, autonuma, psi, ksm, page-poison, madvise, memory-hotplug, rmap,
   zswap, uaccess, ioremap, highmem, cleanups, kfence, hmm, and damon.

* emailed patches from Andrew Morton : (227 commits)
  mm/damon/sysfs: remove repeat container_of() in damon_sysfs_kdamond_release()
  Docs/ABI/testing: add DAMON sysfs interface ABI document
  Docs/admin-guide/mm/damon/usage: document DAMON sysfs interface
  selftests/damon: add a test for DAMON sysfs interface
  mm/damon/sysfs: support DAMOS stats
  mm/damon/sysfs: support DAMOS watermarks
  mm/damon/sysfs: support schemes prioritization
  mm/damon/sysfs: support DAMOS quotas
  mm/damon/sysfs: support DAMON-based Operation Schemes
  mm/damon/sysfs: support the physical address space monitoring
  mm/damon/sysfs: link DAMON for virtual address spaces monitoring
  mm/damon: implement a minimal stub for sysfs-based DAMON interface
  mm/damon/core: add number of each enum type values
  mm/damon/core: allow non-exclusive DAMON start/stop
  Docs/damon: update outdated term 'regions update interval'
  Docs/vm/damon/design: update DAMON-Idle Page Tracking interference handling
  Docs/vm/damon: call low level monitoring primitives the operations
  mm/damon: remove unnecessary CONFIG_DAMON option
  mm/damon/paddr,vaddr: remove damon_{p,v}a_{target_valid,set_operations}()
  mm/damon/dbgfs-test: fix is_target_id() change
  ...
---

3bf03b9a0839c9fb06927ae53ebd0f960b19d408
diff --cc Documentation/admin-guide/sysctl/kernel.rst
index 2d1b8c1ea2e81,fdfd2b6848220..803c60bf21d3b
--- a/Documentation/admin-guide/sysctl/kernel.rst
+++ b/Documentation/admin-guide/sysctl/kernel.rst
@@@ -609,8 -616,56 +616,13 @@@ being accessed should be migrated to a
  The unmapping of pages and trapping faults incur additional overhead that
  ideally is offset by improved memory locality but there is no universal
  guarantee. If the target workload is already bound to NUMA nodes then this
 -feature should be disabled. Otherwise, if the system overhead from the
 -feature is too high then the rate the kernel samples for NUMA hinting
 -faults may be controlled by the `numa_balancing_scan_period_min_ms,
 -numa_balancing_scan_delay_ms, numa_balancing_scan_period_max_ms,
 -numa_balancing_scan_size_mb`_, and numa_balancing_settle_count sysctls.
 +feature should be disabled.

+ Or NUMA_BALANCING_MEMORY_TIERING to optimize page placement among
+ different types of memory (represented as different NUMA nodes) to
+ place the hot pages in the fast memory. This is implemented based on
+ unmapping and page fault too.
+
 -numa_balancing_scan_period_min_ms, numa_balancing_scan_delay_ms, numa_balancing_scan_period_max_ms, numa_balancing_scan_size_mb
 -===============================================================================================================================
 -
 -
 -Automatic NUMA balancing scans tasks address space and unmaps pages to
 -detect if pages are properly placed or if the data should be migrated to a
 -memory node local to where the task is running. Every "scan delay" the task
 -scans the next "scan size" number of pages in its address space. When the
 -end of the address space is reached the scanner restarts from the beginning.
 -
 -In combination, the "scan delay" and "scan size" determine the scan rate.
 -When "scan delay" decreases, the scan rate increases. The scan delay and
 -hence the scan rate of every task is adaptive and depends on historical
 -behaviour. If pages are properly placed then the scan delay increases,
 -otherwise the scan delay decreases. The "scan size" is not adaptive but
 -the higher the "scan size", the higher the scan rate.
 -
 -Higher scan rates incur higher system overhead as page faults must be
 -trapped and potentially data must be migrated. However, the higher the scan
 -rate, the more quickly a tasks memory is migrated to a local node if the
 -workload pattern changes and minimises performance impact due to remote
 -memory accesses. These sysctls control the thresholds for scan delays and
 -the number of pages scanned.
 -
 -``numa_balancing_scan_period_min_ms`` is the minimum time in milliseconds to
 -scan a tasks virtual memory. It effectively controls the maximum scanning
 -rate for each task.
 -
 -``numa_balancing_scan_delay_ms`` is the starting "scan delay" used for a task
 -when it initially forks.
 -
 -``numa_balancing_scan_period_max_ms`` is the maximum time in milliseconds to
 -scan a tasks virtual memory. It effectively controls the minimum scanning
 -rate for each task.
 -
 -``numa_balancing_scan_size_mb`` is how many megabytes worth of pages are
 -scanned for a given scan.
 -
 -

  oops_all_cpu_backtrace
  ======================
diff --cc arch/x86/Kconfig
index 42a3a736436f9,37372cd5c9a71..61e511c518903
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@@ -119,9 -118,9 +119,10 @@@ config X8
  	select ARCH_WANT_DEFAULT_BPF_JIT	if X86_64
  	select ARCH_WANTS_DYNAMIC_TASK_STRUCT
  	select ARCH_WANTS_NO_INSTR
+ 	select ARCH_WANT_GENERAL_HUGETLB
  	select ARCH_WANT_HUGE_PMD_SHARE
  	select ARCH_WANT_LD_ORPHAN_WARN
 +	select ARCH_WANTS_RT_DELAYED_SIGNALS
  	select ARCH_WANTS_THP_SWAP		if X86_64
  	select ARCH_HAS_PARANOID_L1D_FLUSH
  	select BUILDTIME_TABLE_SORT
diff --cc fs/nilfs2/segbuf.c
index a3bb0c856ec80,f4b57bc0c5861..1362ccb64ec7d
--- a/fs/nilfs2/segbuf.c
+++ b/fs/nilfs2/segbuf.c
@@@ -337,21 -337,10 +337,9 @@@ static void nilfs_end_bio_write(struct
  }

  static int nilfs_segbuf_submit_bio(struct nilfs_segment_buffer *segbuf,
 -				   struct nilfs_write_info *wi, int mode,
 -				   int mode_flags)
 +				   struct nilfs_write_info *wi)
  {
  	struct bio *bio = wi->bio;
- 	int err;
-
- 	if (segbuf->sb_nbio > 0 &&
- 	    bdi_write_congested(segbuf->sb_super->s_bdi)) {
- 		wait_for_completion(&segbuf->sb_bio_event);
- 		segbuf->sb_nbio--;
- 		if (unlikely(atomic_read(&segbuf->sb_err))) {
- 			bio_put(bio);
- 			err = -EIO;
- 			goto failed;
- 		}
- 	}

  	bio->bi_end_io = nilfs_end_bio_write;
  	bio->bi_private = segbuf;
@@@ -363,12 -353,31 +351,8 @@@
  	wi->nr_vecs = min(wi->max_pages, wi->rest_blocks);
  	wi->start = wi->end;
  	return 0;
-
-  failed:
- 	wi->bio = NULL;
- 	return err;
  }

 -/**
 - * nilfs_alloc_seg_bio - allocate a new bio for writing log
 - * @nilfs: nilfs object
 - * @start: start block number of the bio
 - * @nr_vecs: request size of page vector.
 - *
 - * Return Value: On success, pointer to the struct bio is returned.
 - * On error, NULL is returned.
 - */
 -static struct bio *nilfs_alloc_seg_bio(struct the_nilfs *nilfs, sector_t start,
 -				       int nr_vecs)
 -{
 -	struct bio *bio;
 -
 -	bio = bio_alloc(GFP_NOIO, nr_vecs);
 -	if (likely(bio)) {
 -		bio_set_dev(bio, nilfs->ns_bdev);
 -		bio->bi_iter.bi_sector =
 -			start << (nilfs->ns_blocksize_bits - 9);
 -	}
 -	return bio;
 -}
 -
  static void nilfs_segbuf_prepare_write(struct nilfs_segment_buffer *segbuf,
  				       struct nilfs_write_info *wi)
  {