When PAGE_SIZE is 64K, if read_log_page is called by log_read_rst for
the first time, the size of *buffer would be equal to
DefaultLogPageSize(4K).But for *buffer operations like memcpy,
if the memory area size(n) which being assigned to buffer is larger
than 4K (log->page_size(64K) or bytes(64K-page_off)), it will cause
an out of boundary error.
Call trace:
[...]
kasan_report+0x44/0x130
check_memory_region+0xf8/0x1a0
memcpy+0xc8/0x100
ntfs_read_run_nb+0x20c/0x460
read_log_page+0xd0/0x1f4
log_read_rst+0x110/0x75c
log_replay+0x1e8/0x4aa0
ntfs_loadlog_and_replay+0x290/0x2d0
ntfs_fill_super+0x508/0xec0
get_tree_bdev+0x1fc/0x34c
[...]
Fix this by setting variable r_page to NULL in log_read_rst.
syzbot is reporting too large allocation at ntfs_fill_super() [1], for a
crafted filesystem can contain bogus inode->i_size. Add __GFP_NOWARN in
order to avoid too large allocation warning, than exhausting memory by
using kvmalloc().
syzbot is reporting too large allocation at wnd_init() [1], for a crafted
filesystem can become wnd->nwnd close to UINT_MAX. Add __GFP_NOWARN in
order to avoid too large allocation warning, than exhausting memory by
using kvcalloc().
Signed-off-by: Edward Lo <edward.lo@ambergroup.io> Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Syzkaller reports slab-out-of-bounds bug as follows:
==================================================================
BUG: KASAN: slab-out-of-bounds in run_unpack+0x8b7/0x970 fs/ntfs3/run.c:944
Read of size 1 at addr ffff88801bbdff02 by task syz-executor131/3611
The buggy address belongs to the physical page:
page:ffffea00006ef600 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x1bbd8
head:ffffea00006ef600 order:3 compound_mapcount:0 compound_pincount:0
flags: 0xfff00000010200(slab|head|node=0|zone=1|lastcpupid=0x7ff)
page dumped because: kasan: bad access detected
Memory state around the buggy address: ffff88801bbdfe00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ffff88801bbdfe80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>ffff88801bbdff00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
^ ffff88801bbdff80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ffff88801bbe0000: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
==================================================================
Kernel will tries to read record and parse MFT from disk in
ntfs_read_mft().
Yet the problem is that during enumerating attributes in record,
kernel doesn't check whether run_off field loading from the disk
is a valid value.
To be more specific, if attr->nres.run_off is larger than attr->size,
kernel will passes an invalid argument run_buf_size in
run_unpack_ex(), which having an integer overflow. Then this invalid
argument will triggers the slab-out-of-bounds Read bug as above.
This patch solves it by adding the sanity check between
the offset to packed runs and attribute size.
Though we already have some sanity checks while enumerating attributes,
resident attribute names aren't included. This patch checks the resident
attribute names are in the valid ranges.
[ 259.209031] BUG: KASAN: slab-out-of-bounds in ni_create_attr_list+0x1e1/0x850
[ 259.210770] Write of size 426 at addr ffff88800632f2b2 by task exp/255
[ 259.211551]
[ 259.212035] CPU: 0 PID: 255 Comm: exp Not tainted 6.0.0-rc6 #37
[ 259.212955] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
[ 259.214387] Call Trace:
[ 259.214640] <TASK>
[ 259.214895] dump_stack_lvl+0x49/0x63
[ 259.215284] print_report.cold+0xf5/0x689
[ 259.215565] ? kasan_poison+0x3c/0x50
[ 259.215778] ? kasan_unpoison+0x28/0x60
[ 259.215991] ? ni_create_attr_list+0x1e1/0x850
[ 259.216270] kasan_report+0xa7/0x130
[ 259.216481] ? ni_create_attr_list+0x1e1/0x850
[ 259.216719] kasan_check_range+0x15a/0x1d0
[ 259.216939] memcpy+0x3c/0x70
[ 259.217136] ni_create_attr_list+0x1e1/0x850
[ 259.217945] ? __rcu_read_unlock+0x5b/0x280
[ 259.218384] ? ni_remove_attr+0x2e0/0x2e0
[ 259.218712] ? kernel_text_address+0xcf/0xe0
[ 259.219064] ? __kernel_text_address+0x12/0x40
[ 259.219434] ? arch_stack_walk+0x9e/0xf0
[ 259.219668] ? __this_cpu_preempt_check+0x13/0x20
[ 259.219904] ? sysvec_apic_timer_interrupt+0x57/0xc0
[ 259.220140] ? asm_sysvec_apic_timer_interrupt+0x1b/0x20
[ 259.220561] ni_ins_attr_ext+0x52c/0x5c0
[ 259.220984] ? ni_create_attr_list+0x850/0x850
[ 259.221532] ? run_deallocate+0x120/0x120
[ 259.221972] ? vfs_setxattr+0x128/0x300
[ 259.222688] ? setxattr+0x126/0x140
[ 259.222921] ? path_setxattr+0x164/0x180
[ 259.223431] ? __x64_sys_setxattr+0x6d/0x80
[ 259.223828] ? entry_SYSCALL_64_after_hwframe+0x63/0xcd
[ 259.224417] ? mi_find_attr+0x3c/0xf0
[ 259.224772] ni_insert_attr+0x1ba/0x420
[ 259.225216] ? ni_ins_attr_ext+0x5c0/0x5c0
[ 259.225504] ? ntfs_read_ea+0x119/0x450
[ 259.225775] ni_insert_resident+0xc0/0x1c0
[ 259.226316] ? ni_insert_nonresident+0x400/0x400
[ 259.227001] ? __kasan_kmalloc+0x88/0xb0
[ 259.227468] ? __kmalloc+0x192/0x320
[ 259.227773] ntfs_set_ea+0x6bf/0xb30
[ 259.228216] ? ftrace_graph_ret_addr+0x2a/0xb0
[ 259.228494] ? entry_SYSCALL_64_after_hwframe+0x63/0xcd
[ 259.228838] ? ntfs_read_ea+0x450/0x450
[ 259.229098] ? is_bpf_text_address+0x24/0x40
[ 259.229418] ? kernel_text_address+0xcf/0xe0
[ 259.229681] ? __kernel_text_address+0x12/0x40
[ 259.229948] ? unwind_get_return_address+0x3a/0x60
[ 259.230271] ? write_profile+0x270/0x270
[ 259.230537] ? arch_stack_walk+0x9e/0xf0
[ 259.230836] ntfs_setxattr+0x114/0x5c0
[ 259.231099] ? ntfs_set_acl_ex+0x2e0/0x2e0
[ 259.231529] ? evm_protected_xattr_common+0x6d/0x100
[ 259.231817] ? posix_xattr_acl+0x13/0x80
[ 259.232073] ? evm_protect_xattr+0x1f7/0x440
[ 259.232351] __vfs_setxattr+0xda/0x120
[ 259.232635] ? xattr_resolve_name+0x180/0x180
[ 259.232912] __vfs_setxattr_noperm+0x93/0x300
[ 259.233219] __vfs_setxattr_locked+0x141/0x160
[ 259.233492] ? kasan_poison+0x3c/0x50
[ 259.233744] vfs_setxattr+0x128/0x300
[ 259.234002] ? __vfs_setxattr_locked+0x160/0x160
[ 259.234837] do_setxattr+0xb8/0x170
[ 259.235567] ? vmemdup_user+0x53/0x90
[ 259.236212] setxattr+0x126/0x140
[ 259.236491] ? do_setxattr+0x170/0x170
[ 259.236791] ? debug_smp_processor_id+0x17/0x20
[ 259.237232] ? kasan_quarantine_put+0x57/0x180
[ 259.237605] ? putname+0x80/0xa0
[ 259.237870] ? __kasan_slab_free+0x11c/0x1b0
[ 259.238234] ? putname+0x80/0xa0
[ 259.238500] ? preempt_count_sub+0x18/0xc0
[ 259.238775] ? __mnt_want_write+0xaa/0x100
[ 259.238990] ? mnt_want_write+0x8b/0x150
[ 259.239290] path_setxattr+0x164/0x180
[ 259.239605] ? setxattr+0x140/0x140
[ 259.239849] ? debug_smp_processor_id+0x17/0x20
[ 259.240174] ? fpregs_assert_state_consistent+0x67/0x80
[ 259.240411] __x64_sys_setxattr+0x6d/0x80
[ 259.240715] do_syscall_64+0x3b/0x90
[ 259.240934] entry_SYSCALL_64_after_hwframe+0x63/0xcd
[ 259.241697] RIP: 0033:0x7fc6b26e4469
[ 259.242647] Code: 00 f3 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 088
[ 259.244512] RSP: 002b:00007ffc3c7841f8 EFLAGS: 00000217 ORIG_RAX: 00000000000000bc
[ 259.245086] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fc6b26e4469
[ 259.246025] RDX: 00007ffc3c784380 RSI: 00007ffc3c7842e0 RDI: 00007ffc3c784238
[ 259.246961] RBP: 00007ffc3c788410 R08: 0000000000000001 R09: 00007ffc3c7884f8
[ 259.247775] R10: 000000000000007f R11: 0000000000000217 R12: 00000000004004e0
[ 259.248534] R13: 00007ffc3c7884f0 R14: 0000000000000000 R15: 0000000000000000
[ 259.249368] </TASK>
[ 259.249644]
[ 259.249888] Allocated by task 255:
[ 259.250283] kasan_save_stack+0x26/0x50
[ 259.250957] __kasan_kmalloc+0x88/0xb0
[ 259.251826] __kmalloc+0x192/0x320
[ 259.252745] ni_create_attr_list+0x11e/0x850
[ 259.253298] ni_ins_attr_ext+0x52c/0x5c0
[ 259.253685] ni_insert_attr+0x1ba/0x420
[ 259.253974] ni_insert_resident+0xc0/0x1c0
[ 259.254311] ntfs_set_ea+0x6bf/0xb30
[ 259.254629] ntfs_setxattr+0x114/0x5c0
[ 259.254859] __vfs_setxattr+0xda/0x120
[ 259.255155] __vfs_setxattr_noperm+0x93/0x300
[ 259.255445] __vfs_setxattr_locked+0x141/0x160
[ 259.255862] vfs_setxattr+0x128/0x300
[ 259.256251] do_setxattr+0xb8/0x170
[ 259.256522] setxattr+0x126/0x140
[ 259.256911] path_setxattr+0x164/0x180
[ 259.257308] __x64_sys_setxattr+0x6d/0x80
[ 259.257637] do_syscall_64+0x3b/0x90
[ 259.257970] entry_SYSCALL_64_after_hwframe+0x63/0xcd
[ 259.258550]
[ 259.258772] The buggy address belongs to the object at ffff88800632f000
[ 259.258772] which belongs to the cache kmalloc-1k of size 1024
[ 259.260190] The buggy address is located 690 bytes inside of
[ 259.260190] 1024-byte region [ffff88800632f000, ffff88800632f400)
[ 259.261412]
[ 259.261743] The buggy address belongs to the physical page:
[ 259.262354] page:0000000081e8cac9 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x632c
[ 259.263722] head:0000000081e8cac9 order:2 compound_mapcount:0 compound_pincount:0
[ 259.264284] flags: 0xfffffc0010200(slab|head|node=0|zone=1|lastcpupid=0x1fffff)
[ 259.265312] raw: 000fffffc0010200ffffea0000060d00dead000000000004ffff888001041dc0
[ 259.265772] raw: 0000000000000000000000008008000800000001ffffffff0000000000000000
[ 259.266305] page dumped because: kasan: bad access detected
[ 259.266588]
[ 259.266728] Memory state around the buggy address:
[ 259.267225] ffff88800632f300: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 259.267841] ffff88800632f380: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 259.269111] >ffff88800632f400: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 259.269626] ^
[ 259.270162] ffff88800632f480: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 259.270810] ffff88800632f500: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
Signed-off-by: Edward Lo <edward.lo@ambergroup.io> Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
indx_read is called when we have some NTFS directory operations that
need more information from the index buffers. This adds a sanity check
to make sure the returned index buffer length is legit, or we may have
some out-of-bound memory accesses.
[ 560.897595] BUG: KASAN: slab-out-of-bounds in hdr_find_e.isra.0+0x10c/0x320
[ 560.898321] Read of size 2 at addr ffff888009497238 by task exp/245
[ 560.898760]
[ 560.899129] CPU: 0 PID: 245 Comm: exp Not tainted 6.0.0-rc6 #37
[ 560.899505] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
[ 560.900170] Call Trace:
[ 560.900407] <TASK>
[ 560.900732] dump_stack_lvl+0x49/0x63
[ 560.901108] print_report.cold+0xf5/0x689
[ 560.901395] ? hdr_find_e.isra.0+0x10c/0x320
[ 560.901716] kasan_report+0xa7/0x130
[ 560.901950] ? hdr_find_e.isra.0+0x10c/0x320
[ 560.902208] __asan_load2+0x68/0x90
[ 560.902427] hdr_find_e.isra.0+0x10c/0x320
[ 560.902846] ? cmp_uints+0xe0/0xe0
[ 560.903363] ? cmp_sdh+0x90/0x90
[ 560.903883] ? ntfs_bread_run+0x190/0x190
[ 560.904196] ? rwsem_down_read_slowpath+0x750/0x750
[ 560.904969] ? ntfs_fix_post_read+0xe0/0x130
[ 560.905259] ? __kasan_check_write+0x14/0x20
[ 560.905599] ? up_read+0x1a/0x90
[ 560.905853] ? indx_read+0x22c/0x380
[ 560.906096] indx_find+0x2ef/0x470
[ 560.906352] ? indx_find_buffer+0x2d0/0x2d0
[ 560.906692] ? __kasan_kmalloc+0x88/0xb0
[ 560.906977] dir_search_u+0x196/0x2f0
[ 560.907220] ? ntfs_nls_to_utf16+0x450/0x450
[ 560.907464] ? __kasan_check_write+0x14/0x20
[ 560.907747] ? mutex_lock+0x8f/0xe0
[ 560.907970] ? __mutex_lock_slowpath+0x20/0x20
[ 560.908214] ? kmem_cache_alloc+0x143/0x4b0
[ 560.908459] ntfs_lookup+0xe0/0x100
[ 560.908788] __lookup_slow+0x116/0x220
[ 560.909050] ? lookup_fast+0x1b0/0x1b0
[ 560.909309] ? lookup_fast+0x13f/0x1b0
[ 560.909601] walk_component+0x187/0x230
[ 560.909944] link_path_walk.part.0+0x3f0/0x660
[ 560.910285] ? handle_lookup_down+0x90/0x90
[ 560.910618] ? path_init+0x642/0x6e0
[ 560.911084] ? percpu_counter_add_batch+0x6e/0xf0
[ 560.912559] ? __alloc_file+0x114/0x170
[ 560.913008] path_openat+0x19c/0x1d10
[ 560.913419] ? getname_flags+0x73/0x2b0
[ 560.913815] ? kasan_save_stack+0x3a/0x50
[ 560.914125] ? kasan_save_stack+0x26/0x50
[ 560.914542] ? __kasan_slab_alloc+0x6d/0x90
[ 560.914924] ? kmem_cache_alloc+0x143/0x4b0
[ 560.915339] ? getname_flags+0x73/0x2b0
[ 560.915647] ? getname+0x12/0x20
[ 560.916114] ? __x64_sys_open+0x4c/0x60
[ 560.916460] ? path_lookupat.isra.0+0x230/0x230
[ 560.916867] ? __isolate_free_page+0x2e0/0x2e0
[ 560.917194] do_filp_open+0x15c/0x1f0
[ 560.917448] ? may_open_dev+0x60/0x60
[ 560.917696] ? expand_files+0xa4/0x3a0
[ 560.917923] ? __kasan_check_write+0x14/0x20
[ 560.918185] ? _raw_spin_lock+0x88/0xdb
[ 560.918409] ? _raw_spin_lock_irqsave+0x100/0x100
[ 560.918783] ? _find_next_bit+0x4a/0x130
[ 560.919026] ? _raw_spin_unlock+0x19/0x40
[ 560.919276] ? alloc_fd+0x14b/0x2d0
[ 560.919635] do_sys_openat2+0x32a/0x4b0
[ 560.920035] ? file_open_root+0x230/0x230
[ 560.920336] ? __rcu_read_unlock+0x5b/0x280
[ 560.920813] do_sys_open+0x99/0xf0
[ 560.921208] ? filp_open+0x60/0x60
[ 560.921482] ? exit_to_user_mode_prepare+0x49/0x180
[ 560.921867] __x64_sys_open+0x4c/0x60
[ 560.922128] do_syscall_64+0x3b/0x90
[ 560.922369] entry_SYSCALL_64_after_hwframe+0x63/0xcd
[ 560.923030] RIP: 0033:0x7f7dff2e4469
[ 560.923681] Code: 00 f3 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 088
[ 560.924451] RSP: 002b:00007ffd41a210b8 EFLAGS: 00000206 ORIG_RAX: 0000000000000002
[ 560.925168] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f7dff2e4469
[ 560.925655] RDX: 0000000000000000 RSI: 0000000000000002 RDI: 00007ffd41a211f0
[ 560.926085] RBP: 00007ffd41a252a0 R08: 00007f7dff60fba0 R09: 00007ffd41a25388
[ 560.926405] R10: 0000000000400b80 R11: 0000000000000206 R12: 00000000004004e0
[ 560.926867] R13: 00007ffd41a25380 R14: 0000000000000000 R15: 0000000000000000
[ 560.927241] </TASK>
[ 560.927491]
[ 560.927755] Allocated by task 245:
[ 560.928409] kasan_save_stack+0x26/0x50
[ 560.929271] __kasan_kmalloc+0x88/0xb0
[ 560.929778] __kmalloc+0x192/0x320
[ 560.930023] indx_read+0x249/0x380
[ 560.930224] indx_find+0x2a2/0x470
[ 560.930695] dir_search_u+0x196/0x2f0
[ 560.930892] ntfs_lookup+0xe0/0x100
[ 560.931115] __lookup_slow+0x116/0x220
[ 560.931323] walk_component+0x187/0x230
[ 560.931570] link_path_walk.part.0+0x3f0/0x660
[ 560.931791] path_openat+0x19c/0x1d10
[ 560.932008] do_filp_open+0x15c/0x1f0
[ 560.932226] do_sys_openat2+0x32a/0x4b0
[ 560.932413] do_sys_open+0x99/0xf0
[ 560.932709] __x64_sys_open+0x4c/0x60
[ 560.933417] do_syscall_64+0x3b/0x90
[ 560.933776] entry_SYSCALL_64_after_hwframe+0x63/0xcd
[ 560.934235]
[ 560.934486] The buggy address belongs to the object at ffff888009497000
[ 560.934486] which belongs to the cache kmalloc-512 of size 512
[ 560.935239] The buggy address is located 56 bytes to the right of
[ 560.935239] 512-byte region [ffff888009497000, ffff888009497200)
[ 560.936153]
[ 560.937326] The buggy address belongs to the physical page:
[ 560.938228] page:0000000062a3dfae refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x9496
[ 560.939616] head:0000000062a3dfae order:1 compound_mapcount:0 compound_pincount:0
[ 560.940219] flags: 0xfffffc0010200(slab|head|node=0|zone=1|lastcpupid=0x1fffff)
[ 560.942702] raw: 000fffffc0010200ffffea0000164f80dead000000000005ffff888001041c80
[ 560.943932] raw: 0000000000000000000000008008000800000001ffffffff0000000000000000
[ 560.944568] page dumped because: kasan: bad access detected
[ 560.945735]
[ 560.946112] Memory state around the buggy address:
[ 560.946870] ffff888009497100: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 560.947242] ffff888009497180: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 560.947611] >ffff888009497200: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 560.947915] ^
[ 560.948249] ffff888009497280: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 560.948687] ffff888009497300: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
Signed-off-by: Edward Lo <edward.lo@ambergroup.io> Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Although the attribute name length is checked before comparing it to
some common names (e.g., $I30), the offset isn't. This adds a sanity
check for the attribute name offset, guarantee the validity and prevent
possible out-of-bound memory accesses.
Signed-off-by: Edward Lo <edward.lo@ambergroup.io> Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
This adds a sanity check for the i_op pointer of the inode which is
returned after reading Root directory MFT record. We should check the
i_op is valid before trying to create the root dentry, otherwise we may
encounter a NPD while mounting a image with a funny Root directory MFT
record.
Signed-off-by: Edward Lo <edward.lo@ambergroup.io> Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Some metadata files are handled before MFT. This adds a null pointer
check for some corner cases that could lead to NPD while reading these
metadata files for a malformed NTFS image.
Signed-off-by: Edward Lo <edward.lo@ambergroup.io> Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
This adds sanity checks for data run offset. We should make sure data
run offset is legit before trying to unpack them, otherwise we may
encounter use-after-free or some unexpected memory access behaviors.
Signed-off-by: Edward Lo <edward.lo@ambergroup.io> Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
The offset addition could overflow and pass the used size check given an
attribute with very large size (e.g., 0xffffff7f) while parsing MFT
attributes. This could lead to out-of-bound memory R/W if we try to
access the next attribute derived by Add2Ptr(attr, asize)
Signed-off-by: edward lo <edward.lo@ambergroup.io> Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
When the NTFS BOOT record_size field < 0, it represents a
shift value. However, there is no sanity check on the shift result
and the sbi->record_bits calculation through blksize_bits() assumes
the size always > 256, which could lead to NPD while mounting a
malformed NTFS image.
Signed-off-by: edward lo <edward.lo@ambergroup.io> Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Mask out the "Command Supported" and "Logical Block Content Change" bits
and only defer execution of commands that have non-trivial effects to
the workqueue for synchronous execution. This allows to execute admin
commands asynchronously on controllers that provide a Command Supported
and Effects log page, and will keep allowing to execute Write commands
asynchronously once command effects on I/O commands are taken into
account.
Fixes: c1fef73f793b ("nvmet: add passthru code to process commands") Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Kanchan Joshi <joshi.k@samsung.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Since kernel 5.3.4 my laptop (ICH8M controller) does not see Kingston
SV300S37A60G SSD disk connected into a SATA connector on wake from
suspend. The problem was introduced in c312ef176399 ("libata/ahci: Drop
PCS quirk for Denverton and beyond"): the quirk is not applied on wake
from suspend as it originally was.
It is worth to mention the commit contained another bug: the quirk is
not applied at all to controllers which require it. The fix commit 09d6ac8dc51a ("libata/ahci: Fix PCS quirk application") landed in 5.3.8.
So testing my patch anywhere between commits c312ef176399 and 09d6ac8dc51a is pointless.
Not all disks trigger the problem. For example nothing bad happens with
Western Digital WD5000LPCX HDD.
Test hardware:
- Acer 5920G with ICH8M SATA controller
- sda: some SATA HDD connnected into the DVD drive IDE port with a
SATA-IDE caddy. It is a boot disk
- sdb: Kingston SV300S37A60G SSD connected into the only SATA port
sd 0:0:0:0: [sda] Starting disk
sd 2:0:0:0: [sdb] Starting disk
ata4: SATA link down (SStatus 4 SControl 300)
ata3: SATA link down (SStatus 4 SControl 300)
ata1.00: ACPI cmd ef/03:0c:00:00:00:a0 (SET FEATURES) filtered out
ata1.00: ACPI cmd ef/03:42:00:00:00:a0 (SET FEATURES) filtered out
ata1: FORCE: cable set to 80c
ata5: SATA link down (SStatus 0 SControl 300)
ata3: SATA link down (SStatus 4 SControl 300)
ata3: SATA link down (SStatus 4 SControl 300)
ata3.00: disabled
sd 2:0:0:0: rejecting I/O to offline device
ata3.00: detaching (SCSI 2:0:0:0)
sd 2:0:0:0: [sdb] Start/Stop Unit failed: Result: hostbyte=DID_NO_CONNECT
driverbyte=DRIVER_OK
sd 2:0:0:0: [sdb] Synchronizing SCSI cache
sd 2:0:0:0: [sdb] Synchronize Cache(10) failed: Result:
hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
sd 2:0:0:0: [sdb] Stopping disk
sd 2:0:0:0: [sdb] Start/Stop Unit failed: Result: hostbyte=DID_BAD_TARGET
driverbyte=DRIVER_OK
Commit c312ef176399 dropped ahci_pci_reset_controller() which internally
calls ahci_reset_controller() and applies the PCS quirk if needed after
that. It was called each time a reset was required instead of just
ahci_reset_controller(). This patch puts the function back in place.
Fixes: c312ef176399 ("libata/ahci: Drop PCS quirk for Denverton and beyond") Signed-off-by: Adam Vodopjan <grozzly@protonmail.com> Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Commit 64dc8c732f5c ("block, bfq: fix possible uaf for 'bfqq->bic'")
will access 'bic->bfqq' in bic_set_bfqq(), however, bfq_exit_icq_bfqq()
can free bfqq first, and then call bic_set_bfqq(), which will cause uaf.
Fix the problem by moving bfq_exit_bfqq() behind bic_set_bfqq().
Fixes: 64dc8c732f5c ("block, bfq: fix possible uaf for 'bfqq->bic'") Reported-by: Yi Zhang <yi.zhang@redhat.com> Signed-off-by: Yu Kuai <yukuai3@huawei.com> Link: https://lore.kernel.org/r/20221226030605.1437081-1-yukuai1@huaweicloud.com Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org>
Commit bfcdf58380b1 ("ACPI: resource: do IRQ override on LENOVO IdeaPad")
added an override for Lenovo IdeaPad 5 16ALC7. The 14ALC7 variant also
suffers from a broken touchscreen and trackpad.
Fixes: 9946e39fe8d0 ("ACPI: resource: skip IRQ override on AMD Zen platforms") Link: https://bugzilla.kernel.org/show_bug.cgi?id=216804 Signed-off-by: Adrian Freund <adrian@freund.io> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
The Schenker XMG CORE 15 (M22) is Ryzen-6 based and needs IRQ overriding
for the keyboard to work. Adding an entry for this laptop to the
override_table makes the internal keyboard functional again.
Signed-off-by: Erik Schumacher <ofenfisch@googlemail.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
LENOVO IdeaPad Flex 5 is ryzen-5 based and the commit below removed IRQ
overriding for those. This broke touchscreen and trackpad:
i2c_designware AMDI0010:00: controller timed out
i2c_designware AMDI0010:03: controller timed out
i2c_hid_acpi i2c-MSFT0001:00: failed to reset device: -61
i2c_designware AMDI0010:03: controller timed out
...
i2c_hid_acpi i2c-MSFT0001:00: can't add hid device: -61
i2c_hid_acpi: probe of i2c-MSFT0001:00 failed with error -61
White-list this specific model in the override_table.
For this to work, the ZEN test needs to be put below the table walk.
Fixes: 37c81d9f1d1b (ACPI: resource: skip IRQ override on AMD Zen platforms) Link: https://bugzilla.suse.com/show_bug.cgi?id=1203794 Signed-off-by: Jiri Slaby (SUSE) <jirislaby@kernel.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
In the ACPI DSDT table for Asus VivoBook K3402ZA/K3502ZA
IRQ 1 is described as ActiveLow; however, the kernel overrides
it to Edge_High. This prevents the internal keyboard from working
on these laptops. In order to fix this add these laptops to the
skip_override_table so that the kernel does not override IRQ 1 to
Edge_High.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=216158 Reviewed-by: Hui Wang <hui.wang@canonical.com> Tested-by: Tamim Khan <tamim@fusetak.com> Tested-by: Sunand <sunandchakradhar@gmail.com> Signed-off-by: Tamim Khan <tamim@fusetak.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Stable-dep-of: f3cb9b740869 ("ACPI: resource: do IRQ override on Lenovo 14ALC7") Signed-off-by: Sasha Levin <sashal@kernel.org>
Convert the max size to bytes to match the units of the divisor that
calculates the worst-case number of PRP entries.
The result is used to determine how many PRP Lists are required. The
code was previously rounding this to 1 list, but we can require 2 in the
worst case. In that scenario, the driver would corrupt memory beyond the
size provided by the mempool.
While unlikely to occur (you'd need a 4MB in exactly 127 phys segments
on a queue that doesn't support SGLs), this memory corruption has been
observed by kfence.
Cc: Jens Axboe <axboe@kernel.dk> Fixes: 943e942e6266f ("nvme-pci: limit max IO size and segments to avoid high order allocations") Signed-off-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Jens Axboe <axboe@kernel.dk> Reviewed-by: Kanchan Joshi <joshi.k@samsung.com> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
When using shadow doorbells, the event index and the doorbell values are
written to host memory. Prior to this patch, the values written would
erroneously be written in host endianness. This causes trouble on
big-endian platforms. Fix this by adding missing endian conversions.
This issue was noticed by Guenter while testing various big-endian
platforms under QEMU[1]. A similar fix required for hw/nvme in QEMU is
up for review as well[2].
A NULL error response might be a valid case where smb2_reconnect()
failed to reconnect the session and tcon due to a disconnected server
prior to issuing the I/O operation, so don't leak -ENOMEM to userspace
on such occasions.
Fixes: 76894f3e2f71 ("cifs: improve symlink handling for smb2+") Signed-off-by: Paulo Alcantara (SUSE) <pc@cjr.nz> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Remove unnecessary NULL check of oparam->cifs_sb when parsing symlink
error response as it's already set by all smb2_open_file() callers and
deferenced earlier.
This fixes below report:
fs/cifs/smb2file.c:126 smb2_open_file()
warn: variable dereferenced before check 'oparms->cifs_sb' (see line 112)
Link: https://lore.kernel.org/r/Y0kt42j2tdpYakRu@kili Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Paulo Alcantara (SUSE) <pc@cjr.nz> Signed-off-by: Steve French <stfrench@microsoft.com>
Stable-dep-of: f60ffa662d14 ("cifs: don't leak -ENOMEM in smb2_open_file()") Signed-off-by: Sasha Levin <sashal@kernel.org>
When a gendisk is successfully initialized but add_disk() fails such as when
a loop device has invalid number of minor device numbers specified,
blkcg_init_disk() is called during init and then blkcg_exit_disk() during
error handling. Unfortunately, iolatency gets initialized in the former but
doesn't get cleaned up in the latter.
This is because, in non-error cases, the cleanup is performed by
del_gendisk() calling rq_qos_exit(), the assumption being that rq_qos
policies, iolatency being one of them, can only be activated once the disk
is fully registered and visible. That assumption is true for wbt and iocost,
but not so for iolatency as it gets initialized before add_disk() is called.
It is desirable to lazy-init rq_qos policies because they are optional
features and add to hot path overhead once initialized - each IO has to walk
all the registered rq_qos policies. So, we want to switch iolatency to lazy
init too. However, that's a bigger change. As a fix for the immediate
problem, let's just add an extra call to rq_qos_exit() in blkcg_exit_disk().
This is safe because duplicate calls to rq_qos_exit() become noop's.
Pass the gendisk to blkcg_init_disk and blkcg_exit_disk as part of moving
the blk-cgroup infrastructure to be gendisk based. Also remove the
rather pointless kerneldoc comments for these internal functions with a
single caller each.
Add a fully inlined blkg_lookup as the extra two checks aren't going
to generated a lot more code vs the call to the slowpath routine, and
open code the hint update in the two callers that care.
of_icc_get() alloc resources for path handle, we should release it when not
need anymore. Like the release in dwc3_qcom_interconnect_exit() function.
Add icc_put() in error handling to fix this.
The value of NSEC_PER_SEC << PWM_DUTY_WIDTH doesn't fix within a 32 bit
integer causing a build warning/error (and the value truncated):
drivers/pwm/pwm-tegra.c: In function ‘tegra_pwm_config’:
drivers/pwm/pwm-tegra.c:148:53: error: result of ‘1000000000 << 8’ requires 39 bits to represent, but ‘long int’ only has 32 bits [-Werror=shift-overflow=]
148 | required_clk_rate = DIV_ROUND_UP_ULL(NSEC_PER_SEC << PWM_DUTY_WIDTH,
| ^~
Explicitly cast to a u64 to ensure the correct result.
When opened a symlink, link name is from 'inode->i_link', but it may be
reset to a new value when revalidate the dentry. If some processes get the
link name on the race scenario, then UAF will happen on link name.
Fix this by implementing 'get_link' interface to duplicate the link name.
Fixes: 76894f3e2f71 ("cifs: improve symlink handling for smb2+") Signed-off-by: ChenXiaoSong <chenxiaosong2@huawei.com> Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
We were only zeroing out the ntlmssp blob but forgot to free the
allocated buffer in the end of SMB2_sess_auth_rawntlmssp_negotiate()
and SMB2_sess_auth_rawntlmssp_authenticate() functions.
Fixes: a4e430c8c8ba ("cifs: replace kfree() with kfree_sensitive() for sensitive data") Signed-off-by: Paulo Alcantara (SUSE) <pc@cjr.nz> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Commit bf7571c00dca ("extcon: usbc-tusb320: Add USB TYPE-C support")
added an optional Type-C interface to the driver but missed to check
if it is in use when calling the IRQ handler. This causes an oops on
devices currently using the old extcon interface. Check if a Type-C
port is registered before calling the Type-C IRQ handler.
Fixes: bf7571c00dca ("extcon: usbc-tusb320: Add USB TYPE-C support") Signed-off-by: Yassine Oudjana <y.oudjana@protonmail.com> Reviewed-by: Marek Vasut <marex@denx.de> Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> Link: https://lore.kernel.org/r/20221107153317.657803-1-y.oudjana@protonmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
We should set the return value to -ENOMEM explicitly when
create_singlethread_workqueue() fails in stmmac_dvr_probe(),
otherwise we'll lose the error value.
Fixes: a137f3f27f92 ("net: stmmac: fix possible memory leak in stmmac_dvr_probe()") Signed-off-by: Gaosheng Cui <cuigaosheng1@huawei.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Link: https://lore.kernel.org/r/20221214080117.3514615-1-cuigaosheng1@huawei.com Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Read cq_timeouts in io_flush_timeouts() only after taking the
timeout_lock, as it's protected by it. There are many places where we
also grab ->completion_lock, but for instance io_timeout_fn() doesn't
and still modifies cq_timeouts.
hugetlb does not support fake write-faults (write faults without write
permissions). However, we are currently able to trigger a
FAULT_FLAG_WRITE fault on a VMA without VM_WRITE.
If we'd ever want to support FOLL_FORCE|FOLL_WRITE, we'd have to teach
hugetlb to:
(1) Leave the page mapped R/O after the fake write-fault, like
maybe_mkwrite() does.
(2) Allow writing to an exclusive anon page that's mapped R/O when
FOLL_FORCE is set, like can_follow_write_pte(). E.g.,
__follow_hugetlb_must_fault() needs adjustment.
For now, it's not clear if that added complexity is really required.
History tolds us that FOLL_FORCE is dangerous and that we better limit its
use to a bare minimum.
if (pwrite(mem_fd, "0", 1, (uintptr_t) map) == 1) {
fprintf(stderr, "write() succeeded, which is unexpected\n");
return 1;
}
printf("write() failed as expected: %d\n", errno);
return 0;
}
--------------------------------------------------------------------------
Fortunately, we have a sanity check in hugetlb_wp() in place ever since
commit 1d8d14641fd9 ("mm/hugetlb: support write-faults in shared
mappings"), that bails out instead of silently mapping a page writable in
a !PROT_WRITE VMA.
Consequently, above reproducer triggers a warning, similar to the one
reported by szsbot:
So let's silence that warning by teaching GUP code that FOLL_FORCE -- so
far -- does not apply to hugetlb.
Note that FOLL_FORCE for read-access seems to be working as expected. The
assumption is that this has been broken forever, only ever since above
commit, we actually detect the wrong handling and WARN_ON_ONCE().
I assume this has been broken at least since 2014, when mm/gup.c came to
life. I failed to come up with a suitable Fixes tag quickly.
Link: https://lkml.kernel.org/r/20221031152524.173644-1-david@redhat.com Fixes: 1d8d14641fd9 ("mm/hugetlb: support write-faults in shared mappings") Signed-off-by: David Hildenbrand <david@redhat.com> Reported-by: <syzbot+f0b97304ef90f0d0b1dc@syzkaller.appspotmail.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Peter Xu <peterx@redhat.com> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Jason Gunthorpe <jgg@nvidia.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
If we get -ENOMEM while dropping file extent items in a given range, at
btrfs_drop_extents(), due to failure to allocate memory when attempting to
increment the reference count for an extent or drop the reference count,
we handle it with a BUG_ON(). This is excessive, instead we can simply
abort the transaction and return the error to the caller. In fact most
callers of btrfs_drop_extents(), directly or indirectly, already abort
the transaction if btrfs_drop_extents() returns any error.
Also, we already have error paths at btrfs_drop_extents() that may return
-ENOMEM and in those cases we abort the transaction, like for example
anything that changes the b+tree may return -ENOMEM due to a failure to
allocate a new extent buffer when COWing an existing extent buffer, such
as a call to btrfs_duplicate_item() for example.
So replace the BUG_ON() calls with proper logic to abort the transaction
and return the error.
Reported-by: syzbot+0b1fb6b0108c27419f9f@syzkaller.appspotmail.com Link: https://lore.kernel.org/linux-btrfs/00000000000089773e05ee4b9cb4@google.com/ CC: stable@vger.kernel.org # 5.4+ Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
ovl_dentry_revalidate_common() can be called in rcu-walk mode. As document
said, "in rcu-walk mode, d_parent and d_inode should not be used without
care".
Check inode here to protect access under rcu-walk mode.
syzbot is reporting memory leak at fbcon_do_set_font() [1], for
commit a5a923038d70 ("fbdev: fbcon: Properly revert changes when
vc_resize() failed") missed that the buffer might be newly allocated
by fbcon_set_font().
If the floppy_alloc_disk() failed, disks of current drive will not be set,
thus the lastest allocated set->tag cannot be freed in the error handling
path. A simple call graph shown as below:
->out_put_disk:
for (drive = 0; drive < N_DRIVE; drive++)
if (!disks[drive][0]) # the last disks is not set and loop break
break;
blk_mq_free_tag_set() # the latest allocated set->tag leaked
Fix this problem by free the set->tag of current drive before jump to
error handling path.
Cc: stable@vger.kernel.org Fixes: 302cfee15029 ("floppy: use a separate gendisk for each media format") Signed-off-by: Yuan Can <yuancan@huawei.com>
[efremov: added stable list, changed title] Signed-off-by: Denis Efremov <efremov@linux.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When updating the operating mode as part of regulator enable, the caller
has already locked the regulator tree and drms_uA_update() must not try
to do the same in order not to trigger a deadlock.
The lock inversion is reported by lockdep as:
======================================================
WARNING: possible circular locking dependency detected
6.1.0-next-20221215 #142 Not tainted
------------------------------------------------------
udevd/154 is trying to acquire lock: ffffc11f123d7e50 (regulator_list_mutex){+.+.}-{3:3}, at: regulator_lock_dependent+0x54/0x280
but task is already holding lock: ffff80000e4c36e8 (regulator_ww_class_acquire){+.+.}-{0:0}, at: regulator_enable+0x34/0x80
The constant AD74413R_ADC_RESULT_MAX is defined via GENMASK, so its
type is "unsigned long".
Hence in the expression voltage_offset * AD74413R_ADC_RESULT_MAX,
voltage_offset is first promoted to unsigned long, and since it may be
negative, that results in a garbage value. For example, when range is
AD74413R_ADC_RANGE_5V_BI_DIR, voltage_offset is -2500 and
voltage_range is 5000, so the RHS of this assignment is, depending on
sizeof(long), either 826225UL or 3689348814709142UL, which after
truncation to int then results in either 826225 or 1972216214 being
the output from in_currentX_offset.
Casting to int avoids that promotion and results in the correct -32767
output.
Prior to commit bd5d54e4d49d ("iio: adc128s052: add ACPI _HID
AANT1280"), the driver unconditionally used spi_get_device_id() to get
the index into the adc128_config array.
However, with that commit, OF-based boards now incorrectly treat all
supported sensors as if they are an adc128s052, because all the .data
members of the adc128_of_match table are implicitly 0. Our board,
which has an adc122s021, thus exposes 8 channels whereas it really
only has two.
Drop 'mlock' usage by making use of iio_device_claim_direct_mode().
This change actually makes sure we cannot do a single conversion while
buffering is enable. Note there was a potential race in the previous
code since we were only acquiring the lock after checking if the bus is
enabled.
Fixes: af3008485ea0 ("iio:adc: Add common code for ADI Sigma Delta devices") Signed-off-by: Nuno Sá <nuno.sa@analog.com> Reviewed-by: Miquel Raynal <miquel.raynal@bootlin.com> Cc: <Stable@vger.kernel.org> #No rush as race is very old. Link: https://lore.kernel.org/r/20220920112821.975359-2-nuno.sa@analog.com Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Commit 57fe60df6241 ("reiserfs: add atomic addition of selinux attributes
during inode creation") defined reiserfs_security_free() to free the name
and value of a security xattr allocated by the active LSM through
security_old_inode_init_security(). However, this function is not called
in the reiserfs code.
Thus, add a call to reiserfs_security_free() whenever
reiserfs_security_init() is called, and initialize value to NULL, to avoid
to call kfree() on an uninitialized pointer.
Finally, remove the kfree() for the xattr name, as it is not allocated
anymore.
Fixes: 57fe60df6241 ("reiserfs: add atomic addition of selinux attributes during inode creation") Cc: stable@vger.kernel.org Cc: Jeff Mahoney <jeffm@suse.com> Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Reported-by: Mimi Zohar <zohar@linux.ibm.com> Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com> Reviewed-by: Mimi Zohar <zohar@linux.ibm.com> Signed-off-by: Paul Moore <paul@paul-moore.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
A bad bug in clang's implementation of -fzero-call-used-regs can result
in NULL pointer dereferences (see the links above the check for more
information). Restrict CONFIG_CC_HAS_ZERO_CALL_USED_REGS to either a
supported GCC version or a clang newer than 15.0.6, which will catch
both a theoretical 15.0.7 and the upcoming 16.0.0, which will both have
the bug fixed.
When a new request is allocated, the refcount will be zero if it is
reused, but if the request is newly allocated from slab, it is not fully
initialized before being added to idr.
If the p9_read_work got a response before the refcount initiated. It will
use a uninitialized req, which will result in a bad request data struct.
refcount_set(&req->refcount, 2);
/* cb drops one ref */
p9_client_cb(req)
/* reader thread drops its ref:
request is incorrectly freed */
p9_req_put(req)
/* use after free and ref underflow */
p9_req_put(req)
To fix it, we can initialize the refcount to zero before add to idr.
Link: https://lkml.kernel.org/r/20221201033310.18589-1-schspa@gmail.com Cc: stable@vger.kernel.org # 6.0+ due to 6cda12864cb0 ("9p: Drop kref usage") Fixes: 728356dedeff ("9p: Add refcount to p9_req_t") Reported-by: syzbot+8f1060e2aaf8ca55220b@syzkaller.appspotmail.com Signed-off-by: Schspa Shi <schspa@gmail.com> Reviewed-by: Christian Schoenebeck <linux_oss@crudebyte.com> Signed-off-by: Dominique Martinet <asmadeus@codewreck.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Currently, the max_loop commandline argument can be used to specify how
many loop block devices are created at init time. If it is not
specified on the commandline, CONFIG_BLK_DEV_LOOP_MIN_COUNT loop block
devices will be created.
The max_loop commandline argument can be used to override the value of
CONFIG_BLK_DEV_LOOP_MIN_COUNT. However, when max_loop is set to 0
through the commandline, the current logic treats it as if it had not
been set, and creates CONFIG_BLK_DEV_LOOP_MIN_COUNT devices anyway.
Fix this by starting max_loop off as set to CONFIG_BLK_DEV_LOOP_MIN_COUNT.
This preserves the intended behavior of creating
CONFIG_BLK_DEV_LOOP_MIN_COUNT loop block devices if the max_loop
commandline parameter is not specified, and allowing max_loop to
be respected for all values, including 0.
This allows environments that can create all of their required loop
block devices on demand to not have to unnecessarily preallocate loop
block devices.
Fixes: 732850827450 ("remove artificial software max_loop limit") Cc: stable@vger.kernel.org Cc: Ken Chen <kenchen@google.com> Signed-off-by: Isaac J. Manjarres <isaacmanjarres@google.com> Link: https://lore.kernel.org/r/20221208212902.765781-1-isaacmanjarres@google.com Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The MCP2221 driver should not connect to the hidraw userspace interface,
as it needs exclusive access to the chip.
If you want to use /dev/hidrawX with the MCP2221, you need to avoid
binding this driver to the device and use the hid generic driver instead
(e.g. using udev rules).
Cc: stable@vger.kernel.org Reported-by: Sven Zühlsdorf <sven.zuehlsdorf@vigem.de> Signed-off-by: Enrik Berkhan <Enrik.Berkhan@inka.de> Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com> Link: https://lore.kernel.org/r/20221103222714.21566-2-Enrik.Berkhan@inka.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Some Wacom devices have a special "bootloader" mode that is used for
firmware flashing. When operating in this mode, the device cannot be
used for input, and the HID descriptor is not able to be processed by
the driver. The driver generates an "Unknown device_type" warning and
then returns an error code from wacom_probe(). This is a problem because
userspace still needs to be able to interact with the device via hidraw
to perform the firmware flash.
This commit adds a non-generic device definition for 056a:0094 which
is used when devices are in "bootloader" mode. It marks the devices
with a special BOOTLOADER type that is recognized by wacom_probe() and
wacom_raw_event(). When we see this type we ensure a hidraw device is
created and otherwise keep our hands off so that userspace is in full
control.
Make sure to also limit the amount of soft reset retries for transaction
errors on streams in cases where the transaction error event doesn't point
to any specific TRB.
In these cases we don't know the TRB or stream ring, but we do know which
endpoint had the error.
To keep error counting simple and functional, move the current err_count
from ring structure to endpoint structure.
Since commit 0f0101719138 ("usb: dwc3: Don't switch OTG -> peripheral
if extcon is present"), Dual Role support on Intel Merrifield platform
broke due to rearranging the call to dwc3_get_extcon().
It appears to be caused by ulpi_read_id() masking the timeout on the first
test write. In the past dwc3 probe continued by calling dwc3_core_soft_reset()
followed by dwc3_get_extcon() which happend to return -EPROBE_DEFER.
On deferred probe ulpi_read_id() finally succeeded. Due to above mentioned
rearranging -EPROBE_DEFER is not returned and probe completes without phy.
On Intel Merrifield the timeout on the first test write issue is reproducible
but it is difficult to find the root cause. Using a mainline kernel and
rootfs with buildroot ulpi_read_id() succeeds. As soon as adding
ftrace / bootconfig to find out why, ulpi_read_id() fails and we can't
analyze the flow. Using another rootfs ulpi_read_id() fails even without
adding ftrace. We suspect the issue is some kind of timing / race, but
merely retrying ulpi_read_id() does not resolve the issue.
As we now changed ulpi_read_id() to return -ETIMEDOUT in this case, we
need to handle the error by calling dwc3_core_soft_reset() and request
-EPROBE_DEFER. On deferred probe ulpi_read_id() is retried and succeeds.
dwc->desired_dr_role is changed by dwc3_set_mode inside a spinlock but
then read by __dwc3_set_mode outside of that lock. This can lead to a
race condition when very quick successive role switch events happen:
CPU A
dwc3_set_mode(DWC3_GCTL_PRTCAP_HOST) // first role switch event
spin_lock_irqsave(&dwc->lock, flags);
dwc->desired_dr_role = mode; // DWC3_GCTL_PRTCAP_HOST
spin_unlock_irqrestore(&dwc->lock, flags);
queue_work(system_freezable_wq, &dwc->drd_work);
CPU B
__dwc3_set_mode
// ....
spin_lock_irqsave(&dwc->lock, flags);
// desired_dr_role is DWC3_GCTL_PRTCAP_HOST
dwc3_set_prtcap(dwc, dwc->desired_dr_role);
spin_unlock_irqrestore(&dwc->lock, flags);
CPU A
dwc3_set_mode(DWC3_GCTL_PRTCAP_DEVICE) // second event
spin_lock_irqsave(&dwc->lock, flags);
dwc->desired_dr_role = mode; // DWC3_GCTL_PRTCAP_DEVICE
spin_unlock_irqrestore(&dwc->lock, flags);
CPU B (continues running __dwc3_set_mode)
switch (dwc->desired_dr_role) { // DWC3_GCTL_PRTCAP_DEVICE
// ....
case DWC3_GCTL_PRTCAP_DEVICE:
// ....
ret = dwc3_gadget_init(dwc);
We then have DWC3_GCTL.DWC3_GCTL_PRTCAPDIR = DWC3_GCTL_PRTCAP_HOST and
dwc->current_dr_role = DWC3_GCTL_PRTCAP_HOST but initialized the
controller in device mode. It's also possible to get into a state
where both host and device are intialized at the same time.
Fix this race by creating a local copy of desired_dr_role inside
__dwc3_set_mode while holding dwc->lock.
Fixes: 41ce1456e1db ("usb: dwc3: core: make dwc3_set_mode() work properly") Cc: stable <stable@kernel.org> Acked-by: Thinh Nguyen <Thinh.Nguyen@synopsys.com> Signed-off-by: Sven Peter <sven@svenpeter.dev> Link: https://lore.kernel.org/r/20221128161526.79730-1-sven@svenpeter.dev Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
32K usb suspend clock gate is shared with usb_root_clk, this
shared clock gate was initially defined only for usb suspend
clock, usb suspend clk is kept on while system is active or
system sleep with usb wakeup enabled, so usb root clock is
fine with this situation; with the commit cf7f3f4fa9e5
("clk: imx8mp: fix usb_root_clk parent"), this clock gate is
changed to be for usb root clock, but usb root clock will
be off while usb is suspended, so usb suspend clock will be
gated too, this cause some usb functionalities will not work,
so define this clock to be a shared clock gate to conform with
the real HW status.
Fixes: 9c140d9926761 ("clk: imx: Add support for i.MX8MP clock driver") Cc: stable@vger.kernel.org # v5.19+ Tested-by: Alexander Stein <alexander.stein@ew.tq-group.com> Signed-off-by: Li Jun <jun.li@nxp.com> Signed-off-by: Abel Vesa <abel.vesa@linaro.org> Link: https://lore.kernel.org/r/1664549663-20364-2-git-send-email-jun.li@nxp.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
usb suspend clock has a gate shared with usb_root_clk.
Fixes: 9c140d9926761 ("clk: imx: Add support for i.MX8MP clock driver") Cc: stable@vger.kernel.org # v5.19+ Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Tested-by: Alexander Stein <alexander.stein@ew.tq-group.com> Signed-off-by: Li Jun <jun.li@nxp.com> Signed-off-by: Abel Vesa <abel.vesa@linaro.org> Link: https://lore.kernel.org/r/1664549663-20364-1-git-send-email-jun.li@nxp.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When adding support for the DisplayPort part of the QMP PHY the binding
(and devicetree parser) for the (USB) child node was simply reused and
this has lead to some confusion.
The third DP register region is really the DP_PHY region, not "PCS" as
the binding claims, and lie at offset 0x2a00 (not 0x2c00).
Similarly, there likely are no "RX", "RX2" or "PCS_MISC" regions as
there are for the USB part of the PHY (and in any case the Linux driver
does not use them).
Note that the sixth "PCS_MISC" region is not even in the binding.
When adding support for the DisplayPort part of the QMP PHY the binding
(and devicetree parser) for the (USB) child node was simply reused and
this has lead to some confusion.
The third DP register region is really the DP_PHY region, not "PCS" as
the binding claims, and lie at offset 0x2a00 (not 0x2c00).
Similarly, there likely are no "RX", "RX2" or "PCS_MISC" regions as
there are for the USB part of the PHY (and in any case the Linux driver
does not use them).
Note that the sixth "PCS_MISC" region is not even in the binding.
Patch implements the handling of ZLP for control transfer.
To send the ZLP driver must prepare the extra TRB in TD with
length set to zero and TRB type to TRB_NORMAL.
The first TRB must have set TRB_CHAIN flag, TD_SIZE = 1
and TRB type to TRB_DATA.
Fixes: 3d82904559f4 ("usb: cdnsp: cdns3 Add main part of Cadence USBSSP DRD Driver")
cc: <stable@vger.kernel.org> Reviewed-by: Peter Chen <peter.chen@kernel.org> Signed-off-by: Pawel Laszczak <pawell@cadence.com> Link: https://lore.kernel.org/r/20221122085138.332434-1-pawell@cadence.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The Force Feedback code assumes that all the devices passed to it will
be USB devices, but that might not be the case for emulated devices.
Guard against a crash by checking the device type before poking at USB
properties.
HDMI audio is not working on the HP EliteDesk 800 G6 because the pin is
unconnected. This issue can be resolved by using the 'hdajackretask'
tool to override the unconnected pin to force it to connect.
IQS7222A revisions 1.13 and later widen the gesture multiplier from
x4 ms to x16 ms. Add a means to scale the gesture timings specified
in the device tree based on the revision of the device.
Fixes: e505edaedcb9 ("Input: add support for Azoteq IQS7222A/B/C") Signed-off-by: Jeff LaBundy <jeff@labundy.com> Reviewed-by: Mattijs Korpershoek <mkorpershoek@baylibre.com> Link: https://lore.kernel.org/r/Y1SRdbK1Dp2q7O8o@nixie71 Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
According to the datasheets, writing only 0xFF is sufficient to
elicit a communication window. Remove the superfluous 0x00 from
the force communication command.
Fixes: e505edaedcb9 ("Input: add support for Azoteq IQS7222A/B/C") Signed-off-by: Jeff LaBundy <jeff@labundy.com> Link: https://lore.kernel.org/r/20220908131548.48120-6-jeff@labundy.com Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Following a recent refactor of the driver to properly drop unused
device nodes, the 'linux,code' property is now optional. This can
be useful for applications that define GPIO-mapped events that do
not correspond to any keycode.
Fixes: 44dc42d254bf ("dt-bindings: input: Add bindings for Azoteq IQS7222A/B/C") Signed-off-by: Jeff LaBundy <jeff@labundy.com> Reviewed-by: Rob Herring <robh@kernel.org> Link: https://lore.kernel.org/r/Y1SROIrrC1LwX0Sd@nixie71 Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Currently ima_lsm_copy_rule() set the arg_p field of the source rule to
NULL, so that the source rule could be freed afterward. It does not make
sense for this behavior to be inside a "copy" function. So move it
outside and let the caller handle this field.
ima_lsm_copy_rule() now produce a shallow copy of the original entry
including args_p field. Meaning only the lsm.rule and the rule itself
should be freed for the original rule. Thus, instead of calling
ima_lsm_free_rule() which frees lsm.rule as well as args_p field, free
the lsm.rule directly.
In commit 76d62f24db07 ("pstore: Switch pmsg_lock to an rt_mutex
to avoid priority inversion") I changed a lock to an rt_mutex.
However, its possible that CONFIG_RT_MUTEXES is not enabled,
which then results in a build failure, as the 0day bot detected:
https://lore.kernel.org/linux-mm/202212211244.TwzWZD3H-lkp@intel.com/
Thus this patch changes CONFIG_PSTORE_PMSG to select
CONFIG_RT_MUTEXES, which ensures the build will not fail.
Cc: Wei Wang <wvw@google.com> Cc: Midas Chien<midaschieh@google.com> Cc: Connor O'Brien <connoro@google.com> Cc: Kees Cook <keescook@chromium.org> Cc: Anton Vorontsov <anton@enomsg.org> Cc: Colin Cross <ccross@android.com> Cc: Tony Luck <tony.luck@intel.com> Cc: kernel test robot <lkp@intel.com> Cc: kernel-team@android.com Fixes: 76d62f24db07 ("pstore: Switch pmsg_lock to an rt_mutex to avoid priority inversion") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: John Stultz <jstultz@google.com> Signed-off-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20221221051855.15761-1-jstultz@google.com Signed-off-by: Sasha Levin <sashal@kernel.org>
The afs_fs_probe_dispatcher() work function is passed a count on
net->servers_outstanding when it is scheduled (which may come via its
timer). This is passed back to the work_item, passed to the timer or
dropped at the end of the dispatcher function.
But, at the top of the dispatcher function, there are two checks which
skip the rest of the function: if the network namespace is being destroyed
or if there are no fileservers to probe. These two return paths, however,
do not drop the count passed to the dispatcher, and so, sometimes, the
destruction of a network namespace, such as induced by rmmod of the kafs
module, may get stuck in afs_purge_servers(), waiting for
net->servers_outstanding to become zero.
Fix this by adding the missing decrements in afs_fs_probe_dispatcher().
Fixes: f6cbb368bcb0 ("afs: Actively poll fileservers to maintain NAT or firewall openings") Reported-by: Marc Dionne <marc.dionne@auristor.com> Signed-off-by: David Howells <dhowells@redhat.com> Tested-by: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org Link: https://lore.kernel.org/r/167164544917.2072364.3759519569649459359.stgit@warthog.procyon.org.uk/ Signed-off-by: Sasha Levin <sashal@kernel.org>
Parametrized events are not only a powerpc domain. They occur on other
platforms too (e.g. aarch64). They should be ignored in this testcase,
since proper setup of the parameters is out of scope of this script.
Let's not filter them out by PMU name, but rather based on the fact that
they expect a parameter.
Fixes: 451ed8058c69a3fe ("perf test: Fix "all PMU test" to skip hv_24x7/hv_gpci tests on powerpc") Signed-off-by: Michael Petlan <mpetlan@redhat.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Disha Goel <disgoel@linux.vnet.ibm.com> Cc: Ian Rogers <irogers@google.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Nageswara R Sastry <rnsastry@linux.ibm.com> Link: https://lore.kernel.org/r/20221219163008.9691-1-mpetlan@redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>