Fix cgroup v1 interference when non-root cgroup v2 BPF programs are used.
Back in the days, commit bd1060a1d671 ("sock, cgroup: add sock->sk_cgroup")
embedded per-socket cgroup information into sock->sk_cgrp_data and in order
to save 8 bytes in struct sock made both mutually exclusive, that is, when
cgroup v1 socket tagging (e.g. net_cls/net_prio) is used, then cgroup v2
falls back to the root cgroup in sock_cgroup_ptr() (&cgrp_dfl_root.cgrp).
The assumption made was "there is no reason to mix the two and this is in line
with how legacy and v2 compatibility is handled" as stated in bd1060a1d671.
However, with Kubernetes more widely supporting cgroups v2 as well nowadays,
this assumption no longer holds, and the possibility of the v1/v2 mixed mode
with the v2 root fallback being hit becomes a real security issue.
Many of the cgroup v2 BPF programs are also used for policy enforcement, just
to pick _one_ example, that is, to programmatically deny socket related system
calls like connect(2) or bind(2). A v2 root fallback would implicitly cause
a policy bypass for the affected Pods.
In production environments, we have recently seen this case due to various
circumstances: i) a different 3rd party agent and/or ii) a container runtime
such as [0] in the user's environment configuring legacy cgroup v1 net_cls
tags, which triggered implicitly mentioned root fallback. Another case is
Kubernetes projects like kind [1] which create Kubernetes nodes in a container
and also add cgroup namespaces to the mix, meaning programs which are attached
to the cgroup v2 root of the cgroup namespace get attached to a non-root
cgroup v2 path from init namespace point of view. And the latter's root is
out of reach for agents on a kind Kubernetes node to configure. Meaning, any
entity on the node setting cgroup v1 net_cls tag will trigger the bypass
despite cgroup v2 BPF programs attached to the namespace root.
Generally, this mutual exclusiveness does not hold anymore in today's user
environments and makes cgroup v2 usage from BPF side fragile and unreliable.
This fix adds proper struct cgroup pointer for the cgroup v2 case to struct
sock_cgroup_data in order to address these issues; this implicitly also fixes
the tradeoffs being made back then with regards to races and refcount leaks
as stated in bd1060a1d671, and removes the fallback, so that cgroup v2 BPF
programs always operate as expected.
Metadata added by bio_integrity_prep is using plain kmalloc, which leads
to random kernel memory being written media. For PI metadata this is
limited to the app tag that isn't used by kernel generated metadata,
but for non-PI metadata the entire buffer leaks kernel memory.
Fix this by adding the __GFP_ZERO flag to allocations for writes.
Fixes: 7ba1ba12eeef ("block: Block layer data integrity support") Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Kanchan Joshi <joshi.k@samsung.com> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Link: https://lore.kernel.org/r/20240613084839.1044015-2-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Shivani Agarwal <shivani.agarwal@broadcom.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Struct uvc_frame and interval (u32*) are packaged together on
streaming->formats on a single contiguous allocation.
Right now they are allocated right after uvc_format, without taking into
consideration their required alignment.
This is working fine because both structures have a field with a
pointer, but it will stop working when the sizeof() of any of those
structs is not a multiple of the sizeof(void*).
Callers can pass null in filter (i.e. from returned from the function
wbscl_get_filter_coeffs_16p) and a null check is added to ensure that is
not the case.
This fixes 4 NULL_RETURNS issues reported by Coverity.
Reviewed-by: Harry Wentland <harry.wentland@amd.com> Acked-by: Hamza Mahfooz <hamza.mahfooz@amd.com> Signed-off-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Now that there are no indirect calls for PI processing there is no
way to dereference a NULL pointer here. Additionally drivers now always
freeze the queue (or in case of stacking drivers use their internal
equivalent) around changing the integrity profile.
This is effectively a revert of commit 3df49967f6f1 ("block: flush the
integrity workqueue in blk_integrity_unregister").
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Link: https://lore.kernel.org/r/20240613084839.1044015-7-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org>
Jiazi Li reported that they occasionally see hash table duplicates
as evidenced by the WARN_ON() in rb_insert_bss() in this code. It
isn't clear how that happens, nor have I been able to reproduce it,
but if it does happen, the kernel crashes later, when it tries to
unhash the entry that's now not hashed.
Try to make this situation more survivable by removing the BSS from
the list(s) as well, that way it's fully leaked here (as had been
the intent in the hash insert error path), and no longer reachable
through the list(s) so it shouldn't be unhashed again later.
This patch adds robust error handling to the meson_plane_create
function in drivers/gpu/drm/meson/meson_plane.c. The function
previously lacked proper handling for potential failure scenarios
of the drm_universal_plane_init call.
Currently, Smack mirrors the label of incoming tcp/ipv4 connections:
when a label 'foo' connects to a label 'bar' with tcp/ipv4,
'foo' always gets 'foo' in returned ipv4 packets. So,
1) returned packets are incorrectly labeled ('foo' instead of 'bar')
2) 'bar' can write to 'foo' without being authorized to write.
Here is a scenario how to see this:
* Take two machines, let's call them C and S,
with active Smack in the default state
(no settings, no rules, no labeled hosts, only builtin labels)
* At S, add Smack rule 'foo bar w'
(labels 'foo' and 'bar' are instantiated at S at this moment)
* At S, at label 'bar', launch a program
that listens for incoming tcp/ipv4 connections
* From C, at label 'foo', connect to the listener at S.
(label 'foo' is instantiated at C at this moment)
Connection succeedes and works.
* Send some data in both directions.
* Collect network traffic of this connection.
All packets in both directions are labeled with the CIPSO
of the label 'foo'. Hence, label 'bar' writes to 'foo' without
being authorized, and even without ever being known at C.
If anybody cares: exactly the same happens with DCCP.
This behavior 1st manifested in release 2.6.29.4 (see Fixes below)
and it looks unintentional. At least, no explanation was provided.
I changed returned packes label into the 'bar',
to bring it into line with the Smack documentation claims.
In some setups directories can have many (usually negative) dentries.
Hence __fsnotify_update_child_dentry_flags() function can take a
significant amount of time. Since the bulk of this function happens
under inode->i_lock this causes a significant contention on the lock
when we remove the watch from the directory as the
__fsnotify_update_child_dentry_flags() call from fsnotify_recalc_mask()
races with __fsnotify_update_child_dentry_flags() calls from
__fsnotify_parent() happening on children. This can lead upto softlockup
reports reported by users.
Fix the problem by calling fsnotify_update_children_dentry_flags() to
set PARENT_WATCHED flags only when parent starts watching children.
When parent stops watching children, clear false positive PARENT_WATCHED
flags lazily in __fsnotify_parent() for each accessed child.
Suggested-by: Jan Kara <jack@suse.cz> Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Stephen Brennan <stephen.s.brennan@oracle.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Sasha Levin <sashal@kernel.org>
ucsi_register_altmode checks IS_ERR for the alt pointer and treats
NULL as valid. When CONFIG_TYPEC_DP_ALTMODE is not enabled,
ucsi_register_displayport returns NULL which causes a NULL pointer
dereference in trace. Rather than return NULL, call
typec_port_register_altmode to register DisplayPort alternate mode
as a non-controllable mode when CONFIG_TYPEC_DP_ALTMODE is not enabled.
Skip submitting URBs, when identical requests were already sent in
tweak_special_requests(). Instead call the completion handler directly
to return the result of the URB.
Even though submitting those requests twice should be harmless, there
are USB devices that react poorly to some duplicated requests.
One example is the ChipIdea controller implementation in U-Boot: The
second SET_CONFIGURATION request makes U-Boot disable and re-enable all
endpoints. Re-enabling an endpoint in the ChipIdea controller, however,
was broken until U-Boot commit b272c8792502 ("usb: ci: Fix gadget
reinit").
Address a warning about potential string truncation based on the
string buffer sizes. We can add some hints to the string format
specifier to set limits on the resulting possible string to
squelch the complaints.
When a remoteproc crashes or goes down unexpectedly this can result in
a state where locks held by the remoteproc will remain locked possibly
resulting in deadlock. This new API hwspin_lock_bust() allows
hwspinlock implementers to define a bust operation for freeing previously
acquired hwspinlocks after verifying ownership of the acquired lock.
If IORESOURCE_BUS is not provided in Device Tree it will be fabricated in
of_pci_parse_bus_range(), so NULL pointer dereference should not happen
here.
But that's hard to verify, so check for NULL anyway.
Found by Linux Verification Center (linuxtesting.org) with SVACE.
fw_running assumes that memory can be retrieved only after alive.
This assumption is no longer true as we support dump before alive.
To avoid invalid access to the NIC, check that STATUS_DEVICE_ENABLED
bit in trans status is set before dumping instead of the prior check.
Signed-off-by: Shahar S Matityahu <shahar.s.matityahu@intel.com> Reviewed-by: Luciano Coelho <luciano.coelho@intel.com> Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Link: https://msgid.link/20240510170500.ca07138cedeb.I090e31d3eaeb4ba19f5f84aba997ccd36927e9ac@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Check the input value for CUSTOM profile mode setting on legacy
SOCs. Otherwise we may use uninitalized value of input[]
Signed-off-by: Ma Jun <Jun.Ma2@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
profile->parent->dents[AAFS_PROF_DIR] could be NULL only if its parent is made
from __create_missing_ancestors(..) and 'ent->old' is NULL in
aa_replace_profiles(..).
In that case, it must return an error code and the code, -ENOENT represents
its state that the path of its parent is not existed yet.
Currently oem_id is defined as uint8_t[6] and casted to uint64_t*
in some use case. This would lead code scanner to complain about
access beyond. Re-define it in union to enforce 8-byte size and
alignment to avoid potential issue.
Signed-off-by: Michael Chen <michael.chen@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Check the fb_channel_number range to avoid the array out-of-bounds
read error
Signed-off-by: Ma Jun <Jun.Ma2@amd.com> Reviewed-by: Tim Huang <Tim.Huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Check the ring type value to fix the out-of-bounds
write warning
Signed-off-by: Ma Jun <Jun.Ma2@amd.com> Suggested-by: Christian König <christian.koenig@amd.com> Reviewed-by: Tim Huang <Tim.Huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Assign an default value to agc_btc_response in failed case
Signed-off-by: Ma Jun <Jun.Ma2@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
[WHY & HOW]
HDCP_MESSAGE_ID_INVALID (-1) is not a valid msg_id nor is it a valid
array index, and it needs checking before used.
This fixes 4 OVERRUN issues reported by Coverity.
Reviewed-by: Harry Wentland <harry.wentland@amd.com> Acked-by: Tom Chung <chiahsuan.chung@amd.com> Signed-off-by: Alex Hung <alex.hung@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
[WHY & HOW]
num_valid_sets needs to be checked to avoid a negative index when
accessing reader_wm_sets[num_valid_sets - 1].
This fixes an OVERRUN issue reported by Coverity.
Reviewed-by: Harry Wentland <harry.wentland@amd.com> Acked-by: Tom Chung <chiahsuan.chung@amd.com> Signed-off-by: Alex Hung <alex.hung@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
[WHY & HOW]
GPIO_ID_UNKNOWN (-1) is not a valid value for array index and therefore
should be checked in advance.
This fixes 5 OVERRUN issues reported by Coverity.
Reviewed-by: Harry Wentland <harry.wentland@amd.com> Acked-by: Tom Chung <chiahsuan.chung@amd.com> Signed-off-by: Alex Hung <alex.hung@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Clear warnings that using uninitialized variable when fails
to get the valid value from SMU.
Signed-off-by: Tim Huang <Tim.Huang@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Clear overflowed array index read warning by cast operation.
Signed-off-by: Tim Huang <Tim.Huang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Assign value to clock to fix the warning below:
"Using uninitialized value res. Field res.clock is uninitialized"
Signed-off-by: Ma Jun <Jun.Ma2@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Currently the dma debugging code can end up indirectly calling printk
under the radix_lock. This happens when a radix tree node allocation
fails.
This is a problem because the printk code, when used together with
netconsole, can end up inside the dma debugging code while trying to
transmit a message over netcons.
This creates the possibility of either a circular deadlock on the same
CPU, with that CPU trying to grab the radix_lock twice, or an ABBA
deadlock between different CPUs, where one CPU grabs the console lock
first and then waits for the radix_lock, while the other CPU is holding
the radix_lock and is waiting for the console lock.
The trace captured by lockdep is of the ABBA variant.
Add IS_ENABLED(CONFIG_I2C) to the conditional around a bunch of ACPI
functions.
The conditional around these functions depended only on CONFIG_ACPI.
But the functions are implemented in I2C core, so are only present if
CONFIG_I2C is enabled.
Signed-off-by: Richard Fitzgerald <rf@opensource.cirrus.com> Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Some devices indicate click noises at suspend or shutdown when the
speakers are unmuted. This patch adds a helper,
snd_hda_gen_shutup_speakers(), to work around it. The new function is
supposed to be called at suspend or shutdown by the codec driver, and
it mutes the speakers.
The mute status isn't cached, hence the original mute state will be
restored at resume again.
policy_unpack_test fails on big endian systems because data byte order
is expected to be little endian but is generated in host byte order.
This results in test failures such as:
# policy_unpack_test_unpack_array_with_null_name: EXPECTATION FAILED at security/apparmor/policy_unpack_test.c:150
Expected array_size == (u16)16, but
array_size == 4096 (0x1000)
(u16)16 == 16 (0x10)
# policy_unpack_test_unpack_array_with_null_name: pass:0 fail:1 skip:0 total:1
not ok 3 policy_unpack_test_unpack_array_with_null_name
# policy_unpack_test_unpack_array_with_name: EXPECTATION FAILED at security/apparmor/policy_unpack_test.c:164
Expected array_size == (u16)16, but
array_size == 4096 (0x1000)
(u16)16 == 16 (0x10)
# policy_unpack_test_unpack_array_with_name: pass:0 fail:1 skip:0 total:1
Add the missing endianness conversions when generating test data.
aac_probe_one() calls hardware-specific init functions through the
aac_driver_ident::init pointer, all of which eventually call down to
aac_init_adapter().
If aac_init_adapter() fails after allocating memory for aac_dev::queues,
it frees the memory but does not clear that member.
After the hardware-specific init function returns an error,
aac_probe_one() goes down an error path that frees the memory pointed to
by aac_dev::queues, resulting.in a double-free.
Reported-by: Michael Gordon <m.gordon.zelenoborsky@gmail.com> Link: https://bugs.debian.org/1075855 Fixes: 8e0c5ebde82b ("[SCSI] aacraid: Newer adapter communication iterface support") Signed-off-by: Ben Hutchings <benh@debian.org> Link: https://lore.kernel.org/r/ZsZvfqlQMveoL5KQ@decadent.org.uk Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Device attribute group @usb3_hardware_lpm_attr_group is merged by
add_power_attributes(), but it is not unmerged explicitly, fixed by
unmerging it in remove_power_attributes().
Fixes: 655fe4effe0f ("usbcore: add sysfs support to xHCI usb3 hardware LPM") Cc: stable@vger.kernel.org Signed-off-by: Zijun Hu <quic_zijuhu@quicinc.com> Link: https://lore.kernel.org/r/20240820-sysfs_fix-v2-1-a9441487077e@quicinc.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The probe function never performs any paltform device allocation, thus
error path "undo_platform_dev_alloc" is entirely bogus. It drops the
reference count from the platform device being probed. If error path is
triggered, this will lead to unbalanced device reference counts and
premature release of device resources, thus possible use-after-free when
releasing remaining devm-managed resources.
This commit addresses an issue where the USB core could access an
invalid event buffer address during runtime suspend, potentially causing
SMMU faults and other memory issues in Exynos platforms. The problem
arises from the following sequence.
1. In dwc3_gadget_suspend, there is a chance of a timeout when
moving the USB core to the halt state after clearing the
run/stop bit by software.
2. In dwc3_core_exit, the event buffer is cleared regardless of
the USB core's status, which may lead to an SMMU faults and
other memory issues. if the USB core tries to access the event
buffer address.
To prevent this hardware quirk on Exynos platforms, this commit ensures
that the event buffer address is not cleared by software when the USB
core is active during runtime suspend by checking its status before
clearing the buffer address.
Linux does not write into cmd-db region. This region of memory is write
protected by XPU. XPU may sometime falsely detect clean cache eviction
as "write" into the write protected region leading to secure interrupt
which causes an endless loop somewhere in Trust Zone.
The only reason it is working right now is because Qualcomm Hypervisor
maps the same region as Non-Cacheable memory in Stage 2 translation
tables. The issue manifests if we want to use another hypervisor (like
Xen or KVM), which does not know anything about those specific mappings.
Changing the mapping of cmd-db memory from MEMREMAP_WB to MEMREMAP_WT/WC
removes dependency on correct mappings in Stage 2 tables. This patch
fixes the issue by updating the mapping to MEMREMAP_WC.
In case of im_protocols value is 1 and tm_protocols value is 0 this
combination successfully passes the check
'if (!im_protocols && !tm_protocols)' in the nfc_start_poll().
But then after pn533_poll_create_mod_list() call in pn533_start_poll()
poll mod list will remain empty and dev->poll_mod_count will remain 0
which lead to division by zero.
Normally no im protocol has value 1 in the mask, so this combination is
not expected by driver. But these protocol values actually come from
userspace via Netlink interface (NFC_CMD_START_POLL operation). So a
broken or malicious program may pass a message containing a "bad"
combination of protocol parameter values so that dev->poll_mod_count
is not incremented inside pn533_poll_create_mod_list(), thus leading
to division by zero.
Call trace looks like:
nfc_genl_start_poll()
nfc_start_poll()
->start_poll()
pn533_start_poll()
Add poll mod list filling check.
Found by Linux Verification Center (linuxtesting.org) with SVACE.
Fixes: dfccd0f58044 ("NFC: pn533: Add some polling entropy") Signed-off-by: Aleksandr Mishin <amishin@t-argos.ru> Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Link: https://patch.msgid.link/20240827084822.18785-1-amishin@t-argos.ru Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Typically, busy-polling durations are below 100 usec.
When/if the busy-poller thread migrates to another cpu,
local_clock() can be off by +/-2msec or more for small
values of HZ, depending on the platform.
Use ktimer_get_ns() to ensure deterministic behavior,
which is the whole point of busy-polling.
Fixes: 060212928670 ("net: add low latency socket poll") Fixes: 9a3c71aa8024 ("net: convert low latency sockets to sched_clock()") Fixes: 37089834528b ("sched, net: Fixup busy_loop_us_clock()") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Mina Almasry <almasrymina@google.com> Cc: Willem de Bruijn <willemb@google.com> Reviewed-by: Joe Damato <jdamato@fastly.com> Link: https://patch.msgid.link/20240827114916.223377-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
When sockfd_lookup() fails, gtp_encap_enable_socket() returns a
NULL pointer, but its callers only check for error pointers thus miss
the NULL pointer case.
Fix it by returning an error pointer with the error code carried from
sockfd_lookup().
(I found this bug during code inspection.)
Fixes: 1e3a3abd8b28 ("gtp: make GTP sockets in gtp_newlink optional") Cc: Andreas Schultz <aschultz@tpip.net> Cc: Harald Welte <laforge@gnumonks.org> Signed-off-by: Cong Wang <cong.wang@bytedance.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Pablo Neira Ayuso <pablo@netfilter.org> Link: https://patch.msgid.link/20240825191638.146748-1-xiyou.wangcong@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Currently in case of the DEV_TO_MEM or MEM_TO_DEV DMA transfers the memory
data width (single transfer width) is determined based on the buffer
length, buffer base address or DMA master-channel max address width
capability. It isn't enough in case of the channel disabling prior the
block transfer is finished. Here is what DW AHB DMA IP-core databook says
regarding the port suspension (DMA-transfer pause) implementation in the
controller:
"When CTLx.SRC_TR_WIDTH < CTLx.DST_TR_WIDTH and the CFGx.CH_SUSP bit is
high, the CFGx.FIFO_EMPTY is asserted once the contents of the FIFO do not
permit a single word of CTLx.DST_TR_WIDTH to be formed. However, there may
still be data in the channel FIFO, but not enough to form a single
transfer of CTLx.DST_TR_WIDTH. In this scenario, once the channel is
disabled, the remaining data in the channel FIFO is not transferred to the
destination peripheral."
So in case if the port gets to be suspended and then disabled it's
possible to have the data silently discarded even though the controller
reported that FIFO is empty and the CTLx.BLOCK_TS indicated the dropped
data already received from the source device. This looks as if the data
somehow got lost on a way from the peripheral device to memory and causes
problems for instance in the DW APB UART driver, which pauses and disables
the DMA-transfer as soon as the recv data timeout happens. Here is the way
it looks:
Memory <------- DMA FIFO <------ UART FIFO <---------------- UART
DST_TR_WIDTH -+--------| | |
| | | | No more data
Current lvl -+--------| |---------+- DMA-burst lvl
| | |---------+- Leftover data
| | |---------+- SRC_TR_WIDTH
-+--------+-------+---------+
In the example above: no more data is getting received over the UART port
and BLOCK_TS is not even close to be fully received; some data is left in
the UART FIFO, but not enough to perform a bursted DMA-xfer to the DMA
FIFO; some data is left in the DMA FIFO, but not enough to be passed
further to the system memory in a single transfer. In this situation the
8250 UART driver catches the recv timeout interrupt, pauses the
DMA-transfer and terminates it completely, after which the IRQ handler
manually fetches the leftover data from the UART FIFO into the
recv-buffer. But since the DMA-channel has been disabled with the data
left in the DMA FIFO, that data will be just discarded and the recv-buffer
will have a gap of the "current lvl" size in the recv-buffer at the tail
of the lately received data portion. So the data will be lost just due to
the misconfigured DMA transfer.
Note this is only relevant for the case of the transfer suspension and
_disabling_. No problem will happen if the transfer will be re-enabled
afterwards or the block transfer is fully completed. In the later case the
"FIFO flush mode" will be executed at the transfer final stage in order to
push out the data left in the DMA FIFO.
In order to fix the denoted problem the DW AHB DMA-engine driver needs to
make sure that the _bursted_ source transfer width is greater or equal to
the single destination transfer (note the HW databook describes more
strict constraint than actually required). Since the peripheral-device
side is prescribed by the client driver logic, the memory-side can be only
used for that. The solution can be easily implemented for the DEV_TO_MEM
transfers just by adjusting the memory-channel address width. Sadly it's
not that easy for the MEM_TO_DEV transfers since the mem-to-dma burst size
is normally dynamically determined by the controller. So the only thing
that can be done is to make sure that memory-side address width is greater
than the peripheral device address width.
Fixes: a09820043c9e ("dw_dmac: autoconfigure data_width or get it via platform data") Signed-off-by: Serge Semin <fancer.lancer@gmail.com> Acked-by: Andy Shevchenko <andy@kernel.org> Link: https://lore.kernel.org/r/20240802075100.6475-3-fancer.lancer@gmail.com Signed-off-by: Vinod Koul <vkoul@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Currently the src_addr_width and dst_addr_width fields of the
dma_slave_config structure are mapped to the CTLx.SRC_TR_WIDTH and
CTLx.DST_TR_WIDTH fields of the peripheral bus side in order to have the
properly aligned data passed to the target device. It's done just by
converting the passed peripheral bus width to the encoded value using the
__ffs() function. This implementation has several problematic sides:
1. __ffs() is undefined if no bit exist in the passed value. Thus if the
specified addr-width is DMA_SLAVE_BUSWIDTH_UNDEFINED, __ffs() may return
unexpected value depending on the platform-specific implementation.
2. DW AHB DMA-engine permits having the power-of-2 transfer width limited
by the DMAH_Mk_HDATA_WIDTH IP-core synthesize parameter. Specifying
bus-width out of that constraints scope will definitely cause unexpected
result since the destination reg will be only partly touched than the
client driver implied.
Let's fix all of that by adding the peripheral bus width verification
method and calling it in dwc_config() which is supposed to be executed
before preparing any transfer. The new method will make sure that the
passed source or destination address width is valid and if undefined then
the driver will just fallback to the 1-byte width transfer.
Two bitmasks in 'struct sdw_slave_prop' - 'source_ports' and
'sink_ports' - define which ports to program in
sdw_program_slave_port_params(). The masks are used to get the
appropriate data port properties ('struct sdw_get_slave_dpn_prop') from
an array.
Bitmasks can be non-continuous or can start from index different than 0,
thus when looking for matching port property for given port, we must
iterate over mask bits, not from 0 up to number of ports.
This fixes allocation and programming slave ports, when a source or sink
masks start from further index.
Fixes: f8101c74aa54 ("soundwire: Add Master and Slave port programming") Cc: stable@vger.kernel.org Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Reviewed-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Link: https://lore.kernel.org/r/20240729140157.326450-1-krzysztof.kozlowski@linaro.org Signed-off-by: Vinod Koul <vkoul@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Functions rds_still_queued and rds_clear_recv_queue lock a given socket
in order to safely iterate over the incoming rds messages. However
calling rds_inc_put while under this lock creates a potential deadlock.
rds_inc_put may eventually call rds_message_purge, which will lock
m_rs_lock. This is the incorrect locking order since m_rs_lock is
meant to be locked before the socket. To fix this, we move the message
item to a local list or variable that wont need rs_recv_lock protection.
Then we can safely call rds_inc_put on any item stored locally after
rs_recv_lock is released.
An UAF can happen when /proc/cpuset is read as reported in [1].
This can be reproduced by the following methods:
1.add an mdelay(1000) before acquiring the cgroup_lock In the
cgroup_path_ns function.
2.$cat /proc/<pid>/cpuset repeatly.
3.$mount -t cgroup -o cpuset cpuset /sys/fs/cgroup/cpuset/
$umount /sys/fs/cgroup/cpuset/ repeatly.
The race that cause this bug can be shown as below:
When the cpuset is initialized, the root node top_cpuset.css.cgrp
will point to &cgrp_dfl_root.cgrp. In cgroup v1, the mount operation will
allocate cgroup_root, and top_cpuset.css.cgrp will point to the allocated
&cgroup_root.cgrp. When the umount operation is executed,
top_cpuset.css.cgrp will be rebound to &cgrp_dfl_root.cgrp.
The problem is that when rebinding to cgrp_dfl_root, there are cases
where the cgroup_root allocated by setting up the root for cgroup v1
is cached. This could lead to a Use-After-Free (UAF) if it is
subsequently freed. The descendant cgroups of cgroup v1 can only be
freed after the css is released. However, the css of the root will never
be released, yet the cgroup_root should be freed when it is unmounted.
This means that obtaining a reference to the css of the root does
not guarantee that css.cgrp->root will not be freed.
Fix this problem by using rcu_read_lock in proc_cpuset_show().
As cgroup_root is kfree_rcu after commit d23b5c577715
("cgroup: Make operations on the cgroup root_list RCU safe"),
css->cgroup won't be freed during the critical section.
To call cgroup_path_ns_locked, css_set_lock is needed, so it is safe to
replace task_get_css with task_css.
The function uvc_video_clock_update() supports a single SOF overflow. Or
in other words, the maximum difference between the first ant the last
timestamp can be 4096 ticks or 4.096 seconds.
This results in a maximum value for y2 of: 0x12FBECA00, that overflows
32bits.
y2 = (u32)ktime_to_ns(ktime_sub(last->host_time, first->host_time)) + y1;
Extend the size of y2 to u64 to support all its values.
We don't get the right offset in that case. The GPU has
an unused 4K area of the register BAR space into which you can
remap registers. We remap the HDP flush registers into this
space to allow userspace (CPU or GPU) to flush the HDP when it
updates VRAM. However, on systems with >4K pages, we end up
exposing PAGE_SIZE of MMIO space.
Fixes: d8e408a82704 ("drm/amdkfd: Expose HDP registers to user space") Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
sysvipc_find_ipc() was left with a costly way to check if the offset
position fed to it is bigger than the total number of IPC IDs in use. So
much so that the time it takes to iterate over /proc/sysvipc/* files grows
exponentially for a custom benchmark that creates "N" SYSV shm segments
and then times the read of /proc/sysvipc/shm (milliseconds):
12 msecs to read 1024 segs from /proc/sysvipc/shm
18 msecs to read 2048 segs from /proc/sysvipc/shm
65 msecs to read 4096 segs from /proc/sysvipc/shm
325 msecs to read 8192 segs from /proc/sysvipc/shm
1303 msecs to read 16384 segs from /proc/sysvipc/shm
5182 msecs to read 32768 segs from /proc/sysvipc/shm
The root problem lies with the loop that computes the total amount of ids
in use to check if the "pos" feeded to sysvipc_find_ipc() grew bigger than
"ids->in_use". That is a quite inneficient way to get to the maximum
index in the id lookup table, specially when that value is already
provided by struct ipc_ids.max_idx.
This patch follows up on the optimization introduced via commit 15df03c879836 ("sysvipc: make get_maxid O(1) again") and gets rid of the
aforementioned costly loop replacing it by a simpler checkpoint based on
ipc_get_maxidx() returned value, which allows for a smooth linear increase
in time complexity for the same custom benchmark:
2 msecs to read 1024 segs from /proc/sysvipc/shm
2 msecs to read 2048 segs from /proc/sysvipc/shm
4 msecs to read 4096 segs from /proc/sysvipc/shm
9 msecs to read 8192 segs from /proc/sysvipc/shm
19 msecs to read 16384 segs from /proc/sysvipc/shm
39 msecs to read 32768 segs from /proc/sysvipc/shm
Link: https://lkml.kernel.org/r/20210809203554.1562989-1-aquini@redhat.com Signed-off-by: Rafael Aquini <aquini@redhat.com> Acked-by: Davidlohr Bueso <dbueso@suse.de> Acked-by: Manfred Spraul <manfred@colorfullife.com> Cc: Waiman Long <llong@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Hugo SIMELIERE <hsimeliere.opensource@witekio.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The 'mptcp_subflow_context' structure has two items related to the
backup flags:
- 'backup': the subflow has been marked as backup by the other peer
- 'request_bkup': the backup flag has been set by the host
Looking only at the 'backup' flag can make sense in some cases, but it
is not the behaviour of the default packet scheduler when selecting
paths.
As explained in the commit b6a66e521a20 ("mptcp: sched: check both
directions for backup"), the packet scheduler should look at both flags,
because that was the behaviour from the beginning: the 'backup' flag was
set by accident instead of the 'request_bkup' one. Now that the latter
has been fixed, get_retrans() needs to be adapted as well.
Fixes: b6a66e521a20 ("mptcp: sched: check both directions for backup") Cc: stable@vger.kernel.org Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20240826-net-mptcp-close-extra-sf-fin-v1-3-905199fe1172@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
mwifiex_band_2ghz and mwifiex_band_5ghz are statically allocated, but
used and modified in driver instances. Duplicate them before using
them in driver instances so that different driver instances do not
influence each other.
This was observed on a board which has one PCIe and one SDIO mwifiex
adapter. It blew up in mwifiex_setup_ht_caps(). This was called with
the statically allocated struct which is modified in this function.
Cc: stable@vger.kernel.org Fixes: d6bffe8bb520 ("mwifiex: support for creation of AP interface") Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de> Reviewed-by: Francesco Dolcini <francesco.dolcini@toradex.com> Acked-by: Brian Norris <briannorris@chromium.org> Signed-off-by: Kalle Valo <kvalo@kernel.org> Link: https://patch.msgid.link/20240809-mwifiex-duplicate-static-structs-v1-1-6837b903b1a4@pengutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
pinmux_generic_get_function() can return NULL and the pointer 'function'
was dereferenced without checking against NULL. Add checking of pointer
'function' in pcs_get_function().
Found by code review.
Cc: stable@vger.kernel.org Fixes: 571aec4df5b7 ("pinctrl: single: Use generic pinmux helpers for managing functions") Signed-off-by: Ma Ke <make24@iscas.ac.cn> Link: https://lore.kernel.org/20240808041355.2766009-1-make24@iscas.ac.cn Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The base iomux offsets for each GPIO pin line are accumulatively
calculated based off iomux width flag in rockchip_pinctrl_get_soc_data.
If the iomux width flag is one of IOMUX_WIDTH_4BIT, IOMUX_WIDTH_3BIT or
IOMUX_WIDTH_2BIT, the base offset for next pin line would increase by 8
bytes, otherwise it would increase by 4 bytes.
Despite most of GPIO2-B iomux have 2-bit data width, which can be fit
into 4 bytes space with write mask, it actually take 8 bytes width for
whole GPIO2-B line.
Commit e8448a6c817c ("pinctrl: rockchip: fix pinmux bits for RK3328
GPIO2-B pins") wrongly set iomux width flag to 0, causing all base
iomux offset for line after GPIO2-B to be calculated wrong. Fix the
iomux width flag to IOMUX_WIDTH_2BIT so the offset after GPIO2-B is
correctly increased by 8, matching the actual width of GPIO2-B iomux.
Fixes: e8448a6c817c ("pinctrl: rockchip: fix pinmux bits for RK3328 GPIO2-B pins") Cc: stable@vger.kernel.org Reported-by: Richard Kojedzinszky <richard@kojedz.in> Closes: https://lore.kernel.org/linux-rockchip/4f29b743202397d60edfb3c725537415@kojedz.in/ Tested-by: Richard Kojedzinszky <richard@kojedz.in> Signed-off-by: Huang-Huang Bao <i@eh5.me> Reviewed-by: Heiko Stuebner <heiko@sntech.de> Tested-by: Daniel Golle <daniel@makrotopia.org> Tested-by: Trevor Woerner <twoerner@gmail.com> Link: https://lore.kernel.org/20240709105428.1176375-1-i@eh5.me Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
allmodconfig + CONFIG_LTO_CLANG_THIN=y fails to build due to following
linker errors:
ld.lld: error: irqbypass.c:(function __guest_enter: .text+0x21CC):
relocation R_AARCH64_CONDBR19 out of range: 2031220 is not in
[-1048576, 1048575]; references hyp_panic
>>> defined in vmlinux.o
ld.lld: error: irqbypass.c:(function __guest_enter: .text+0x21E0):
relocation R_AARCH64_ADR_PREL_LO21 out of range: 2031200 is not in
[-1048576, 1048575]; references hyp_panic
>>> defined in vmlinux.o
This is because with LTO, the compiler ends up placing hyp_panic()
more than 1MB away from __guest_enter(). Use an unconditional branch
and adr_l instead to fix the issue.
Link: https://github.com/ClangBuiltLinux/linux/issues/1317 Reported-by: Nathan Chancellor <nathan@kernel.org> Suggested-by: Marc Zyngier <maz@kernel.org> Suggested-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Sami Tolvanen <samitolvanen@google.com> Reviewed-by: Kees Cook <keescook@chromium.org> Acked-by: Will Deacon <will@kernel.org> Tested-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210305202124.3768527-1-samitolvanen@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Initialize the size before calling amdgpu_vce_cs_reloc, such as case 0x03000001.
V2: To really improve the handling we would actually
need to have a separate value of 0xffffffff.(Christian)
Signed-off-by: Jesse Zhang <jesse.zhang@amd.com> Suggested-by: Christian König <christian.koenig@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Vamsi Krishna Brahmajosyula <vamsi-krishna.brahmajosyula@broadcom.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Currently, tools have *ALIGN*() macros scattered across the unrelated
headers, as there are only 3 of them and they were added separately
each time on an as-needed basis.
Anyway, let's make it more consistent with the kernel headers and allow
using those macros outside of the mentioned headers. Create
<linux/align.h> inside the tools/ folder and include it where needed.
Signed-off-by: Yury Norov <yury.norov@gmail.com> Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This patch adds code to check HCI_UART_PROTO_READY flag before
accessing hci_uart->proto. It fixes the race condition in
hci_uart_tty_ioctl() between HCIUARTSETPROTO and HCIUARTGETPROTO.
This issue bug found by Yu Hao and Weiteng Chen:
BUG: general protection fault in hci_uart_tty_ioctl [1]
The information of C reproducer can also reference the link [2]
nfsd: Don't call freezable_schedule_timeout() after each successful page allocation in svc_alloc_arg().
When commit 390390240145 ("nfsd: don't allow nfsd threads to be
signalled.") is backported to 5.10, it was adjusted considering commit 3feac2b55293 ("sunrpc: exclude from freezer when waiting for requests:").
However, 3feac2b55293 is based on commit f6e70aab9dfe ("SUNRPC: refresh
rq_pages using a bulk page allocator"), which converted page-by-page
allocation to a batch allocation, so schedule_timeout() is placed
un-nested.
As a result, the backported commit 7229200f6866 ("nfsd: don't allow nfsd
threads to be signalled.") placed freezable_schedule_timeout() in the wrong
place.
Now, freezable_schedule_timeout() is called after every successful page
allocation, and we see 30%+ performance regression on 5.10.220 in our
test suite.
Let's move it to the correct place so that freezable_schedule_timeout()
is called only when page allocation fails.
Fixes: 7229200f6866 ("nfsd: don't allow nfsd threads to be signalled.") Reported-by: Hughdan Liu <hughliu@amazon.com> Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Reviewed-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The recent addition of a sanity check for a too low start tick time
seems breaking some applications that uses aloop with a certain slave
timer setup. They may have the initial resolution 0, hence it's
treated as if it were a too low value.
Relax and skip the check for the slave timer instance for addressing
the regression.
Fix a race condition if the clock provider comes up after mmc is probed,
this causes mmc to fail without retrying.
When given the DEFER error from the clk source, pass it on up the chain.
Fixes: f90a0612f0e1 ("mmc: dw_mmc: lookup for optional biu and ciu clocks") Signed-off-by: Ben Whitten <ben.whitten@gmail.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20240811212212.123255-1-ben.whitten@gmail.com Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
On a system with a GICv3, if a guest hasn't been configured with
GICv3 and that the host is not capable of GICv2 emulation,
a write to any of the ICC_*SGI*_EL1 registers is trapped to EL2.
We therefore try to emulate the SGI access, only to hit a NULL
pointer as no private interrupt is allocated (no GIC, remember?).
The obvious fix is to give the guest what it deserves, in the
shape of a UNDEF exception.
Reported-by: Alexander Potapenko <glider@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20240820100349.3544850-2-maz@kernel.org Signed-off-by: Oliver Upton <oliver.upton@linux.dev> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Currently, rumble is only supported via bluetooth on a single xbox
controller, called 'model 1708'. On the back of the device, it's named
'wireless controller for xbox one'. However, in 2021, Microsoft released
a firmware update for this controller. As part of this update, the HID
descriptor of the device changed. The product ID was also changed from
0x02fd to 0x0b20. On this controller, rumble was supported via
hid-microsoft, which matched against the old product id (0x02fd). As a
result, the firmware update broke rumble support on this controller.
The hid-microsoft driver actually supports rumble on the new firmware,
as well. So simply adding new product id is sufficient to bring back
this support.
After discussing further with the xbox team, it was pointed out that
another xbox controller, xbox elite series 2, can be supported in a
similar way.
Add rumble support for all of these devices in this patch. Two of the
devices have received firmware updates that caused their product id's to
change. Both old and new firmware versions of these devices were tested.
The tested controllers are:
1. 'wireless controller for xbox one', model 1708
2. 'xbox wireless controller', model 1914. This is also sometimes
referred to as 'xbox series S|X'.
3. 'elite series 2', model 1797.
The tested configurations are:
1. model 1708, pid 0x02fd (old firmware)
2. model 1708, pid 0x0b20 (new firmware)
3. model 1914, pid 0x0b13
4. model 1797, pid 0x0b05 (old firmware)
5. model 1797, pid 0x0b22 (new firmware)
I verified rumble support on both bluetooth and usb.
The Wacom driver maps the HID_DG_TWIST usage to ABS_Z (rather than ABS_RZ)
for historic reasons. When the code to support twist was introduced in
commit 50066a042da5 ("HID: wacom: generic: Add support for height, tilt,
and twist usages"), we were careful to write it in such a way that it had
HID calculate the resolution of the twist axis assuming ABS_RZ instead
(so that we would get correct angular behavior). This was broken with
the introduction of commit 08a46b4190d3 ("HID: wacom: Set a default
resolution for older tablets"), which moved the resolution calculation
to occur *before* the adjustment from ABS_Z to ABS_RZ occurred.
This commit moves the calculation of resolution after the point that
we are finished setting things up for its proper use.
Signed-off-by: Jason Gerecke <jason.gerecke@wacom.com> Fixes: 08a46b4190d3 ("HID: wacom: Set a default resolution for older tablets") Cc: stable@vger.kernel.org Signed-off-by: Jiri Kosina <jkosina@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Loongson64 C and G processors have EXTIMER feature which
is conflicting with CP0 counter.
Although the processor resets in EXTIMER disabled & INTIMER
enabled mode, which is compatible with MIPS CP0 compare, firmware
may attempt to enable EXTIMER and interfere CP0 compare.
Set timer mode back to MIPS compatible mode to fix booting on
systems with such firmware before we have an actual driver for
EXTIMER.
Cc: stable@vger.kernel.org Signed-off-by: Jiaxun Yang <jiaxun.yang@flygoat.com> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
If the "test->highmem = alloc_pages()" allocation fails then calling
__free_pages(test->highmem) will result in a NULL dereference. Also
change the error code to -ENOMEM instead of returning success.
Before re-starting link training reset the link phy params namely
the pre-emphasis and voltage swing levels otherwise the next
link training begins at the previously cached levels which can result
in link training failures.
DPU debugging macros need to be converted to a proper drm_debug_*
macros, however this is a going an intrusive patch, not suitable for a
fix. Wire DPU_DEBUG and DPU_DEBUG_DRIVER to always use DRM_DEBUG_DRIVER
to make sure that DPU debugging messages always end up in the drm debug
messages and are controlled via the usual drm.debug mask.
I don't think that it is a good idea for a generic DPU_DEBUG macro to be
tied to DRM_UT_KMS. It is used to report a debug message from driver, so by
default it should go to the DRM_UT_DRIVER channel. While refactoring
debug macros later on we might end up with particular messages going to
ATOMIC or KMS, but DRIVER should be the default.
If a multicast address is removed but there are still some multicast
addresses, that address would remain programmed into the frame filter.
Fix this by explicitly setting the enable bit for each filter.
Fixes: 8a3b7a252dca ("drivers/net/ethernet/xilinx: added Xilinx AXI Ethernet driver") Signed-off-by: Sean Anderson <sean.anderson@linux.dev> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20240822154059.1066595-3-sean.anderson@linux.dev Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
If promiscuous mode is disabled when there are fewer than four multicast
addresses, then it will not be reflected in the hardware. Fix this by
always clearing the promiscuous mode flag even when we program multicast
addresses.
Fixes: 8a3b7a252dca ("drivers/net/ethernet/xilinx: added Xilinx AXI Ethernet driver") Signed-off-by: Sean Anderson <sean.anderson@linux.dev> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20240822154059.1066595-2-sean.anderson@linux.dev Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
After ip6_local_out() has returned, we no longer can safely
dereference rt, unless we hold rcu_read_lock().
A similar issue has been fixed in commit a688caa34beb ("ipv6: take rcu lock in rawv6_send_hdrinc()")
Another potential issue in ip6_finish_output2() is handled in a
separate patch.
[1]
BUG: KASAN: slab-use-after-free in ip6_send_skb+0x18d/0x230 net/ipv6/ip6_output.c:1964
Read of size 8 at addr ffff88806dde4858 by task syz.1.380/6530
There is a bug in netem_enqueue() introduced by
commit 5845f706388a ("net: netem: fix skb length BUG_ON in __skb_to_sgvec")
that can lead to a use-after-free.
This commit made netem_enqueue() always return NET_XMIT_SUCCESS
when a packet is duplicated, which can cause the parent qdisc's q.qlen
to be mistakenly incremented. When this happens qlen_notify() may be
skipped on the parent during destruction, leaving a dangling pointer
for some classful qdiscs like DRR.
There are two ways for the bug happen:
- If the duplicated packet is dropped by rootq->enqueue() and then
the original packet is also dropped.
- If rootq->enqueue() sends the duplicated packet to a different qdisc
and the original packet is dropped.
In both cases NET_XMIT_SUCCESS is returned even though no packets
are enqueued at the netem qdisc.
The fix is to defer the enqueue of the duplicate packet until after
the original packet has been guaranteed to return NET_XMIT_SUCCESS.
Fixes: 5845f706388a ("net: netem: fix skb length BUG_ON in __skb_to_sgvec") Reported-by: Budimir Markovic <markovicbudimir@gmail.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20240819175753.5151-1-stephen@networkplumber.org Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>