Merge patch series "Optimize the hot path in the UFS driver"
Bart Van Assche <bvanassche@acm.org> says:
Hi Martin,
This patch series optimizes the hot path of the UFS driver by making
struct scsi_cmnd and struct ufshcd_lrb adjacent. Making these two data
structures adjacent is realized as follows:
The following changes had to be made prior to making these two data
structures adjacent:
* Add support for driver-internal and reserved commands in the SCSI core.
* Instead of making the reserved command slot (hba->reserved_slot)
invisible to the SCSI core, let the SCSI core allocate a reserved command.
* Remove all UFS data structure members that are no longer needed
because struct scsi_cmnd and struct ufshcd_lrb are now adjacent
* Call ufshcd_init_lrb() from inside the code for queueing a command instead of
calling this function before I/O starts. This is necessary because
ufshcd_memory_alloc() allocates fewer instances than the block layer
allocates requests. See also the following code in the block layer
core:
if (blk_mq_init_request(set, hctx->fq->flush_rq, hctx_idx,
hctx->numa_node))
Although the UFS driver could be modified such that ufshcd_init_lrb()
is called from ufshcd_init_cmd_priv(), realizing this would require
moving the memory allocations that happen from inside
ufshcd_memory_alloc() into ufshcd_init_cmd_priv(). That would make
this patch series even larger. Although ufshcd_init_lrb() is called for each
command, the benefits of reduced indirection and better cache efficiency
outweigh the small overhead of per-command lrb initialization.
* ufshcd_add_scsi_host() happens now before any device management
commands are submitted. This change is necessary because this patch
makes device management command allocation happen when the SCSI host
is allocated.
* Allocate as many command slots as the host controller supports. Decrease
host->cmds_per_lun if necessary once it is clear whether or not the UFS
device supports less command slots than the host controller.
Please consider this patch series for the next merge window.
Bart Van Assche [Fri, 31 Oct 2025 20:39:36 +0000 (13:39 -0700)]
scsi: ufs: core: Switch to scsi_get_internal_cmd()
Instead of storing the tag of the reserved command in hba->reserved_slot,
use scsi_get_internal_cmd() and scsi_put_internal_cmd() to allocate the
tag for the reserved command dynamically. Add
ufshcd_queue_reserved_command() for submitting reserved commands. Add
support in ufshcd_abort() for device management commands. Use
blk_execute_rq() for submitting reserved commands. Remove the code and
data structures that became superfluous. This includes
ufshcd_wait_for_dev_cmd(), hba->reserved_slot and ufs_dev_cmd.complete.
Bart Van Assche [Fri, 31 Oct 2025 20:39:35 +0000 (13:39 -0700)]
scsi: ufs: core: Move code out of ufshcd_wait_for_dev_cmd()
The ufshcd_dev_cmd_completion() call is useful for some but not for all
ufshcd_wait_for_dev_cmd() callers. Hence, remove the
ufshcd_dev_cmd_completion() call from ufshcd_wait_for_dev_cmd() and move
it past the ufshcd_issue_dev_cmd() calls where appropriate. This makes
it easier to detect timeout errors for UPIU frames submitted through the
BSG interface.
Bart Van Assche [Fri, 31 Oct 2025 20:39:34 +0000 (13:39 -0700)]
scsi: ufs: core: Make blk_mq_tagset_busy_iter() skip reserved requests
A later patch will convert hba->reserved_slot into a reserved tag. Make
blk_mq_tagset_busy_iter() skip reserved requests such that device
management commands are skipped.
Bart Van Assche [Fri, 31 Oct 2025 20:39:33 +0000 (13:39 -0700)]
scsi: ufs: core: Remove the ufshcd_lrb task_tag member
Remove the ufshcd_lrb task_tag member and use scsi_cmd_to_rq(cmd)->tag
instead. Use rq->tag instead of lrbp->task_tag. This patch reduces the
size of struct ufshcd_lrb.
Bart Van Assche [Fri, 31 Oct 2025 20:39:31 +0000 (13:39 -0700)]
scsi: ufs: core: Optimize the hot path
Set .cmd_size in the SCSI host template such that the SCSI core makes
struct scsi_cmnd and struct ufshcd_lrb adjacent. Convert the cmd->lrbp
and lrbp->cmd memory loads into pointer offset calculations. Remove the
data structure members that became superfluous, namely ufshcd_lrb.cmd
and ufs_hba.lrb. Since ufshcd_lrb.cmd is removed, this pointer cannot be
used anymore to test whether or not a command is a SCSI command.
Introduce a new function for this purpose, namely ufshcd_is_scsi_cmd().
Bart Van Assche [Fri, 31 Oct 2025 20:39:29 +0000 (13:39 -0700)]
scsi: ufs: core: Make the reserved slot a reserved request
Instead of letting the SCSI core allocate hba->nutrs - 1 commands, let
the SCSI core allocate hba->nutrs commands, set the number of reserved
tags to 1 and use the reserved tag for device management commands. This
patch changes the 'reserved slot' from hba->nutrs - 1 into 0 because the
block layer reserves the smallest tags for reserved commands.
Bart Van Assche [Fri, 31 Oct 2025 20:39:27 +0000 (13:39 -0700)]
scsi: ufs: core: Call ufshcd_init_lrb() later
Call ufshcd_init_lrb() from inside ufshcd_setup_dev_cmd() instead of
ufshcd_host_memory_configure(). This patch prepares for calling
ufshcd_host_memory_configure() before the information is available that
is required to call ufshcd_setup_dev_cmd().
Bart Van Assche [Fri, 31 Oct 2025 20:39:26 +0000 (13:39 -0700)]
scsi: ufs: core: Allocate the SCSI host earlier
Call ufshcd_add_scsi_host() before any UPIU commands are sent to the UFS
device. This patch prepares for letting ufshcd_add_scsi_host() allocate
memory for both SCSI and UPIU commands.
Prepare for allocating the SCSI host earlier by making the SCSI host
queue depth independent of the queue depth supported by the UFS device.
This patch may increase the queue depth of the UFS SCSI host.
Merge the MCQ mode and legacy mode loops into a single loop. This patch
prepares for optimizing the hot path by removing the direct hba->lrb[]
accesses from ufshcd_eh_device_reset_handler().
Bart Van Assche [Fri, 31 Oct 2025 20:39:22 +0000 (13:39 -0700)]
scsi: ufs: core: Change the monitor function argument types
Pass a SCSI command pointer instead of a struct ufshcd_lrb pointer. This
patch prepares for combining the SCSI command and ufshcd_lrb data
structures into a single data structure.
Bart Van Assche [Fri, 31 Oct 2025 20:39:21 +0000 (13:39 -0700)]
scsi: ufs: core: Only call ufshcd_should_inform_monitor() for SCSI commands
ufshcd_should_inform_monitor() only returns 'true' for SCSI commands.
Instead of checking inside ufshcd_should_inform_monitor() whether its
second argument represents a SCSI command, only call this function for
SCSI commands. This patch prepares for removing the lrbp->cmd member.
Bart Van Assche [Fri, 31 Oct 2025 20:39:20 +0000 (13:39 -0700)]
scsi: ufs: core: Change the type of one ufshcd_send_command() argument
Change the 'task_tag' argument into an LRB pointer. This patch prepares
for the removal of the hba->lrb[] array.
Reviewed-by: Avri Altman <avri.altman@sandisk.com> Reviewed-by: Peter Wang <peter.wang@mediatek.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Link: https://patch.msgid.link/20251031204029.2883185-13-bvanassche@acm.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Bart Van Assche [Fri, 31 Oct 2025 20:39:19 +0000 (13:39 -0700)]
scsi: ufs: core: Change the type of one ufshcd_add_command_trace() argument
Change the 'tag' argument into a SCSI command pointer. This patch
prepares for the removal of the hba->lrb[] array.
Reviewed-by: Avri Altman <avri.altman@sandisk.com> Reviewed-by: Peter Wang <peter.wang@mediatek.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Link: https://patch.msgid.link/20251031204029.2883185-12-bvanassche@acm.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Bart Van Assche [Fri, 31 Oct 2025 20:39:18 +0000 (13:39 -0700)]
scsi: ufs: core: Only call ufshcd_add_command_trace() for SCSI commands
Instead of checking inside ufshcd_add_command_trace() whether 'cmd'
points at a SCSI command, let the caller perform that check. This patch
prepares for removing the lrbp->cmd pointer.
Reviewed-by: Avri Altman <avri.altman@sandisk.com> Reviewed-by: Peter Wang <peter.wang@mediatek.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Link: https://patch.msgid.link/20251031204029.2883185-11-bvanassche@acm.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Bart Van Assche [Fri, 31 Oct 2025 20:39:17 +0000 (13:39 -0700)]
scsi: ufs: core: Change the type of one ufshcd_add_cmd_upiu_trace() argument
Change the 'tag' argument into an LRB pointer. This patch prepares for the
removal of the hba->lrb[] array.
Reviewed-by: Avri Altman <avri.altman@sandisk.com> Reviewed-by: Peter Wang <peter.wang@mediatek.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Link: https://patch.msgid.link/20251031204029.2883185-10-bvanassche@acm.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Bart Van Assche [Fri, 31 Oct 2025 20:39:16 +0000 (13:39 -0700)]
scsi: ufs: core: Move an assignment in ufshcd_mcq_process_cqe()
Since 'tag' is only used inside the if-statement, move the 'tag'
assignment into the if-statement. This patch prepares for introducing a
WARN_ON_ONCE() call in ufshcd_mcq_get_tag() if the tag lookup fails.
Bart Van Assche [Fri, 31 Oct 2025 20:39:15 +0000 (13:39 -0700)]
scsi: scsi_debug: Abort SCSI commands via an internal command
Add a .queue_reserved_command() implementation and call it from the code
path that aborts SCSI commands. This ensures that the code for
allocating a pseudo SCSI device and also the code for allocating and
processing reserved commands gets triggered while running blktests.
Most of the code in this patch is a modified version of code from John
Garry. See also
https://lore.kernel.org/linux-scsi/75018e17-4dea-4e1b-8c92-7a224a1e13b9@oracle.com/
Reviewed-by: John Garry <john.g.garry@oracle.com> Suggested-by: John Garry <john.g.garry@oracle.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Link: https://patch.msgid.link/20251031204029.2883185-8-bvanassche@acm.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Add helper functions to allow LLDDs to allocate and free internal commands.
[ bvanassche: changed the 'nowait' argument into a 'flags' argument. See also
https://lore.kernel.org/linux-scsi/20211125151048.103910-3-hare@suse.de/ ]
Reviewed-by: John Garry <john.g.garry@oracle.com> Signed-off-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Link: https://patch.msgid.link/20251031204029.2883185-7-bvanassche@acm.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
John Garry [Fri, 31 Oct 2025 20:39:13 +0000 (13:39 -0700)]
scsi: core: Introduce .queue_reserved_command()
Reserved commands will be used by SCSI LLDs for submitting internal
commands. Since the SCSI host, target and device limits do not apply to
the reserved command use cases, bypass the SCSI host limit checks for
reserved commands. Introduce the .queue_reserved_command() callback for
reserved commands. Additionally, do not activate the SCSI error handler
if a reserved command fails such that reserved commands can be submitted
from inside the SCSI error handler.
[ bvanassche: modified patch title and patch description. Renamed
.reserved_queuecommand() into .queue_reserved_command(). Changed
the second argument of __blk_mq_end_request() from 0 into error
code in the completion path if cmd->result != 0. Rewrote the
scsi_queue_rq() changes. See also
https://lore.kernel.org/linux-scsi/1666693096-180008-5-git-send-email-john.garry@huawei.com/ ]
Cc: Hannes Reinecke <hare@suse.de> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Hannes Reinecke <hare@suse.de> Link: https://patch.msgid.link/20251031204029.2883185-6-bvanassche@acm.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Hannes Reinecke [Fri, 31 Oct 2025 20:39:12 +0000 (13:39 -0700)]
scsi: core: Support allocating a pseudo SCSI device
Allocate a pseudo SCSI device if 'nr_reserved_cmds' has been set. Pseudo
SCSI devices have the SCSI ID <max_id>:U64_MAX so they won't clash with
any devices the LLD might create. Pseudo SCSI devices are excluded from
scanning and will not show up in sysfs. Additionally, pseudo SCSI
devices are skipped by shost_for_each_device(). This prevents that the
SCSI error handler tries to submit a reset to a non-existent logical
unit.
Do not allocate a budget map for pseudo SCSI devices since the
cmd_per_lun limit does not apply to pseudo SCSI devices.
Do not perform queue depth ramp up / ramp down for pseudo SCSI devices.
Pseudo SCSI devices will be used to send internal commands to a storage
device.
[ bvanassche: edited patch description / renamed host_sdev into
pseudo_sdev / unexported scsi_get_host_dev() / modified error path in
scsi_get_pseudo_dev() / skip pseudo devices in __scsi_iterate_devices()
and also when calling sdev_init(), sdev_configure() and sdev_destroy().
See also
https://lore.kernel.org/linux-scsi/20211125151048.103910-2-hare@suse.de/ ]
Reviewed-by: John Garry <john.g.garry@oracle.com> Signed-off-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Link: https://patch.msgid.link/20251031204029.2883185-5-bvanassche@acm.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Bart Van Assche [Fri, 31 Oct 2025 20:39:11 +0000 (13:39 -0700)]
scsi: core: Make the budget map optional
Prepare for not allocating a budget map for pseudo SCSI devices by
checking whether a budget map has been allocated before using it.
Reviewed-by: Hannes Reinecke <hare@suse.de> Cc: John Garry <john.g.garry@oracle.com> Cc: Ming Lei <ming.lei@redhat.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Link: https://patch.msgid.link/20251031204029.2883185-4-bvanassche@acm.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Bart Van Assche [Fri, 31 Oct 2025 20:39:10 +0000 (13:39 -0700)]
scsi: core: Move two statements
Move two statements that will be needed for pseudo SCSI devices in front
of code that won't be needed for pseudo SCSI devices. No functionality
has been changed.
Reviewed-by: John Garry <john.g.garry@oracle.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Link: https://patch.msgid.link/20251031204029.2883185-3-bvanassche@acm.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Hannes Reinecke [Fri, 31 Oct 2025 20:39:09 +0000 (13:39 -0700)]
scsi: core: Support allocating reserved commands
Quite some drivers are using management commands internally. These
commands typically use the same tag pool as regular SCSI commands. Tags
for these management commands are set aside before allocating the
block-mq tag bitmap for regular SCSI commands. The block layer already
supports this via the reserved tag mechanism. Add a new field
'nr_reserved_cmds' to the SCSI host template to instruct the block layer
to set aside a tag space for these management commands by using reserved
tags. Exclude reserved commands from .can_queue because .can_queue is
visible in sysfs.
[ bvanassche: modified patch title and patch description. Left out the
following statements: "if (sht->nr_reserved_cmds)" and also
"if (sdev->host->nr_reserved_cmds) flags |= BLK_MQ_REQ_RESERVED;". Moved
nr_reserved_cmds declarations and statements close to the
corresponding can_queue declarations and statements. See also
https://lore.kernel.org/linux-scsi/20210503150333.130310-11-hare@suse.de/ ]
Signed-off-by: Hannes Reinecke <hare@suse.de> Reviewed-by: John Garry <john.g.garry@oracle.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Link: https://patch.msgid.link/20251031204029.2883185-2-bvanassche@acm.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Merge patch series "Support power resources defined in acpi on ata"
Markus Probst <markus.probst@posteo.de> says:
This series adds support for power resources defined in acpi on ata
ports/devices. A device can define a power resource in an ata port/device,
which then gets powered on right before the port is probed. This can be
useful for devices, which have sata power connectors that are:
a: powered down by default
b: can be individually powered on
like in some synology nas devices. If thats the case it will be assumed,
that the power resource won't survive reboots and therefore the disk will
be stopped.
Markus Probst [Tue, 4 Nov 2025 14:24:34 +0000 (14:24 +0000)]
scsi: ata: Stop disk on restart if ACPI power resources are found
Some embedded devices have the ability to control whether power is
provided to the disks via the SATA power connector or not. ACPI power
resources are usually off by default, thus making it unclear if the
specific power resource will retain its state after a restart. If power
resources are defined on ATA ports / devices in ACPI, we should stop the
disk on SYSTEM_RESTART, to ensure the disk will not lose power while
active.
Add a new function, ata_acpi_dev_manage_restart(), that will be used to
determine if a disk should be stopped before restarting the system. If a
usable ACPI power resource has been found, it is assumed that the disk
will lose power after a restart and should be stopped to avoid unclean
shutdown due to power loss.
Markus Probst [Tue, 4 Nov 2025 14:24:33 +0000 (14:24 +0000)]
scsi: ata: Use ACPI methods to power on disks
Some embedded devices have the ability to control whether power is
provided to the disks via the SATA power connector or not. If power
resources are defined on ATA ports / devices in ACPI, we should try to
set the power state to D0 before probing the disk to ensure that any
power supply or power gate that may exist is providing power to the
disk.
An example for such devices would be newer synology NAS devices. Every
disk slot has its own SATA power connector. Whether the connector is
providing power is controlled via an gpio, which is *off by default*.
Also the disk loses power on reboots.
Add a new function, ata_acpi_port_power_on(), that will be used to power
on the SATA power connector if usable ACPI power resources on the
associated ATA port / device are found. It will be called right before
probing the port, therefore the disk will be powered on just in time.
Markus Probst [Tue, 4 Nov 2025 14:24:32 +0000 (14:24 +0000)]
scsi: sd: Add manage_restart device attribute to scsi_disk
In addition to the already existing manage_shutdown,
manage_system_start_stop and manage_runtime_start_stop device scsi_disk
attributes, add manage_restart, which allows the high-level device
driver (sd) to manage the device power state for SYSTEM_RESTART if set
to 1.
This attribute is necessary for the following commit "ata: stop disk on
restart if ACPI power resources are found" to avoid a potential disk
power failure in the case the SATA power connector does not retain the
power state after a restart.
Merge patch series "Update lpfc to revision 14.4.0.12"
Justin Tee <justin.tee@broadcom.com> says:
Update lpfc to revision 14.4.0.12
This patch set contains updates to log messaging, revision of outdated
comment descriptions, fixes to kref accounting, support for BB credit
recovery in point-to-point mode, and introduction of registering unique
platform name identifiers with fabrics.
The patches were cut against Martin's 6.19/scsi-queue tree.
Justin Tee [Thu, 6 Nov 2025 22:46:38 +0000 (14:46 -0800)]
scsi: lpfc: Add capability to register Platform Name ID to fabric
FC-LS and FC-GS specifications outline fields for registering a platform
name identifier (PNI) to the fabric. The PNI assists fabrics with
identifying the physical server source of frames in the fabric.
lpfc generates a PNI based partially on the uuid specific for the
system. Initial attempts to extract a uuid are made from SMBIOS's
System Information 08h uuid entry. If SMBIOS DMI does not exist, a PNI
is not generated and PNI registration with the fabric is skipped.
The PNI is submitted in FLOGI and FDISC frames. After successful fabric
login, the RSPNI_PNI CT frame is submitted to the fabric to register the
OS host name tying it to the PNI.
Justin Tee [Thu, 6 Nov 2025 22:46:37 +0000 (14:46 -0800)]
scsi: lpfc: Allow support for BB credit recovery in point-to-point topology
Currently, BB credit recovery is excluded to fabric topology mode. This
patch allows setting of BB_SC_N in PLOGIs and PLOGI_ACCs when in
point-to-point mode so that BB credit recovery can operate in
point-to-point topology as well.
Justin Tee [Thu, 6 Nov 2025 22:46:36 +0000 (14:46 -0800)]
scsi: lpfc: Fix reusing an ndlp that is marked NLP_DROPPED during FLOGI
It's possible for an unstable link to repeatedly bounce allowing a FLOGI
retry, but then bounce again forcing an abort of the FLOGI. Ensure that
the initial reference count on the FLOGI ndlp is restored in this faulty
link scenario.
Justin Tee [Thu, 6 Nov 2025 22:46:35 +0000 (14:46 -0800)]
scsi: lpfc: Modify kref handling for Fabric Controller ndlps
Currently, there is a kref put in the lpfc_cleanup() routine that takes
care of outstanding references on fabric controller ndlps in UNUSED
state. While typically there is a state change from UNUSED -> REGLOGIN
when the ndlp successfully logs into the fabric, there may be cases when
FLOGI is unsuccessful and the ndlp will remain in UNUSED state without a
registered rpi, yet the ndlp incorrectly has a kref count of one.
To address this, handling of Fabric Controller ndlps are moved into the
routines: lpfc_issue_els_scr(), lpfc_issue_els_rdf(),
lpfc_cmpl_els_disc_cmd().
In both lpfc_issue_els_scr() and lpfc_issue_els_rdf(), if there does not
exist a previously created fabric controller ndlp, an ndlp will be
created. Otherwise, we can reuse the pre-existing ndlp object.
In lpfc_cmpl_els_disc_cmd(), if the SCR or RDF are not successfully
issued, the initial reference on the ndlp that is not registered with
upper layers will be decremented with a kref_put().
Justin Tee [Thu, 6 Nov 2025 22:46:34 +0000 (14:46 -0800)]
scsi: lpfc: Fix leaked ndlp krefs when in point-to-point topology
In point-to-point topology, the driver sometimes defers the unsolicited
FLOGI LS_ACC until it sends its FLOGI to the remote port. When this
happens, lpfc neglects to release the ndlp allocated for the unsolicited
FLOGI. This patch adds code to release the ndlp for the deferred
unsolicited FLOGI LS_ACC.
An NLP_FLOGI_DFR_ACC flag is introduced to facilitate identifying an
ndlp with an expected deferred FLOGI LS_ACC completion. When
lpfc_cmpl_els_rsp() detects the correct qualifiers, it releases the
initial reference on the ndlp object. And when lpfc_cmpl_els_rsp()
exits, the remaining put for the deferred action is executed and the
ndlp is released.
Justin Tee [Thu, 6 Nov 2025 22:46:33 +0000 (14:46 -0800)]
scsi: lpfc: Ensure unregistration of rpis for received PLOGIs
Unregistration of an rpi object should be done when a PLOGI is received
as PLOGI receipt implies an implicit LOGO. Previously, the driver would
continue using the same, already registered, rpi and ACC the received
PLOGI.
Replace the ACC and early return statement with break to execute the
rest of the lpfc_rcv_plogi logic outside the switch case statement.
This ensures unregistration and reregistration of an rpi after PLOGI_ACC
completion.
Sets a timeout value for requests sent to physical devices via the
RAID path. This prevents the controller firmware from waiting
indefinitely and allows it to time out requests a few seconds before
the OS issues Target Management Function (TMF) commands.
The timeout value sent to the firmware is set to 3 seconds less than
the OS timeout, which significantly reduces TMF invocations.
2. smartpqi-fix-Device-resources-accessed-after-device-removal
Fixes a race condition during device removal by:
* Checking for device removal in the reset handler and canceling any
pending reset work if the device is no longer present.
* Canceling outstanding TMF work items in the .sdev_destroy handler.
Together, these changes eliminate races between reset operations
and device removal.
The other two patches:
3. smartpqi-add-new-Hurray-Data-pci-device
Adds support for new Hurray Data PCI device.
No functional changes.
4. smartpqi-update-driver-version-to-2.1.36-026
Updates the driver version string.
No functional changes.
Don Brace [Thu, 6 Nov 2025 16:38:22 +0000 (10:38 -0600)]
scsi: smartpqi: Update version to 2.1.36-026
Update driver version to 2.1.36-026
Reviewed-by: Scott Benesh <scott.benesh@microchip.com> Reviewed-by: Scott Teel <scott.teel@microchip.com> Reviewed-by: Gerry Morong <gerry.morong@microchip.com> Signed-off-by: Don Brace <don.brace@microchip.com> Link: https://patch.msgid.link/20251106163823.786828-5-don.brace@microchip.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
David Strahan [Thu, 6 Nov 2025 16:38:21 +0000 (10:38 -0600)]
scsi: smartpqi: Add support for Hurray Data new controller PCI device
Add support for new Hurray Data controller.
All entries are in HEX.
Add PCI IDs for Hurray Data controllers:
VID / DID / SVID / SDID
---- ---- ---- ----
9005 028f 207d 4840
Reviewed-by: Scott Benesh <scott.benesh@microchip.com> Reviewed-by: Scott Teel <scott.teel@microchip.com> Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com> Signed-off-by: David Strahan <David.Strahan@microchip.com> Signed-off-by: Don Brace <don.brace@microchip.com> Link: https://patch.msgid.link/20251106163823.786828-4-don.brace@microchip.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Mike McGowen [Thu, 6 Nov 2025 16:38:20 +0000 (10:38 -0600)]
scsi: smartpqi: Fix device resources accessed after device removal
Correct possible race conditions during device removal.
Previously, a scheduled work item to reset a LUN could still execute
after the device was removed, leading to use-after-free and other
resource access issues.
This race condition occurs because the abort handler may schedule a LUN
reset concurrently with device removal via sdev_destroy(), leading to
use-after-free and improper access to freed resources.
- Check in the device reset handler if the device is still present in
the controller's SCSI device list before running; if not, the reset
is skipped.
- Cancel any pending TMF work that has not started in sdev_destroy().
- Ensure device freeing in sdev_destroy() is done while holding the
LUN reset mutex to avoid races with ongoing resets.
Fixes: 2d80f4054f7f ("scsi: smartpqi: Update deleting a LUN via sysfs") Reviewed-by: Scott Teel <scott.teel@microchip.com> Reviewed-by: Scott Benesh <scott.benesh@microchip.com> Signed-off-by: Mike McGowen <mike.mcgowen@microchip.com> Signed-off-by: Don Brace <don.brace@microchip.com> Link: https://patch.msgid.link/20251106163823.786828-3-don.brace@microchip.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Mike McGowen [Thu, 6 Nov 2025 16:38:19 +0000 (10:38 -0600)]
scsi: smartpqi: Add timeout value to RAID path requests to physical devices
Add a timeout value to requests sent to physical devices via the RAID
path.
A timeout value of zero means wait indefinitely, which may cause the OS
to issue Target Management Function (TMF) commands if the device does
not respond.
For input timeouts of 8 seconds or greater, the value sent to firmware
is reduced by 3 seconds to provide an earlier firmware timeout and allow
the OS additional time before timing out.
This change improves timeout handling between the driver, firmware, and
OS, helping to better manage device responsiveness and avoid indefinite
waits.
Reviewed-by: David Strahan <david.strahan@microchip.com> Reviewed-by: Scott Benesh <scott.benesh@microchip.com> Reviewed-by: Scott Teel <scott.teel@microchip.com> Signed-off-by: Mike McGowen <Mike.McGowen@microchip.com> Signed-off-by: Don Brace <don.brace@microchip.com> Link: https://patch.msgid.link/20251106163823.786828-2-don.brace@microchip.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
scsi: fcoe: Add WQ_PERCPU to alloc_workqueue() users
Currently if a user enqueues a work item using schedule_delayed_work()
the used wq is "system_wq" (per-cpu wq) while queue_delayed_work() uses
WORK_CPU_UNBOUND (used when a CPU is not specified). The same applies to
schedule_work() that is using system_wq and queue_work(), that makes use
again of WORK_CPU_UNBOUND. This lack of consistentcy cannot be
addressed without refactoring the API.
alloc_workqueue() treats all queues as per-CPU by default, while unbound
workqueues must opt-in via WQ_UNBOUND.
This default is suboptimal: most workloads benefit from unbound queues,
allowing the scheduler to place worker threads where they’re needed and
reducing noise when CPUs are isolated.
Continue the effort to refactor workqueue APIs, which has begun with the
change introducing new workqueues and a new alloc_workqueue flag:
commit 128ea9f6ccfb ("workqueue: Add system_percpu_wq and system_dfl_wq")
commit 930c2ea566af ("workqueue: Add new WQ_PERCPU flag")
Adds a new WQ_PERCPU flag to explicitly request alloc_workqueue() to be
per-CPU when WQ_UNBOUND has not been specified.
With the introduction of the WQ_PERCPU flag (equivalent to !WQ_UNBOUND),
any alloc_workqueue() caller that doesn’t explicitly specify WQ_UNBOUND
must now use WQ_PERCPU.
Once migration is complete, WQ_UNBOUND can be removed and unbound will
become the implicit default.
David Jeffery [Tue, 4 Nov 2025 15:46:23 +0000 (10:46 -0500)]
scsi: st: Skip buffer flush for information ioctls
With commit 9604eea5bd3a ("scsi: st: Add third party poweron reset
handling") some customer tape applications fail from being unable to
complete ioctls to verify ID information for the device when there has
been any type of reset event to their tape devices.
The st driver currently will fail all standard SCSI ioctls if a call to
flush_buffer() fails in st_ioctl(). This causes ioctls which otherwise
have no effect on tape state to succeed or fail based on events
unrelated to the requested ioctl.
This makes SCSI information ioctls unreliable after a reset even if no
buffering is in use. With a reset setting the pos_unknown field,
flush_buffer() will report failure and fail all ioctls. So any
application expecting to use ioctls to check the identify the device
will be unable to do so in such a state.
For SCSI information ioctls, avoid the need for a buffer flush and allow
the ioctls to execute regardless of buffer state.
Signed-off-by: David Jeffery <djeffery@redhat.com> Tested-by: Laurence Oberman <loberman@redhat.com> Acked-by: Kai Mäkisara <kai.makisara@kolumbus.fi> Reviewed-by: John Meneghini <jmeneghi@redhat.com> Tested-by: John Meneghini <jmeneghi@redhat.com> Link: https://patch.msgid.link/20251104154709.6436-2-djeffery@redhat.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
David Jeffery [Tue, 4 Nov 2025 15:46:22 +0000 (10:46 -0500)]
scsi: st: Separate st-unique ioctl handling from SCSI common ioctl handling
The st ioctl function currently interleaves code for handling various st
specific ioctls with parts of code needed for handling ioctls common to
all SCSI devices. Separate out st's code for the common ioctls into a
more manageable, separate function.
Signed-off-by: David Jeffery <djeffery@redhat.com> Tested-by: Laurence Oberman <loberman@redhat.com> Acked-by: Kai Mäkisara <kai.makisara@kolumbus.fi> Reviewed-by: John Meneghini <jmeneghi@redhat.com> Tested-by: John Meneghini <jmeneghi@redhat.com> Link: https://patch.msgid.link/20251104154709.6436-1-djeffery@redhat.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Haotian Zhang [Tue, 4 Nov 2025 09:48:47 +0000 (17:48 +0800)]
scsi: stex: Fix reboot_notifier leak in probe error path
In stex_probe(), register_reboot_notifier() is called at the beginning,
but if any subsequent initialization step fails, the function returns
without unregistering the notifier, resulting in a resource leak.
Add unregister_reboot_notifier() in the out_disable error path to ensure
proper cleanup on all failure paths.
Merge patch series "target: RW/num_cmds stats improvements"
Mike Christie <michael.christie@oracle.com> says:
The following patches were made over Linus tree. They fix/improve the
stats used in the main IO path. The first patch fixes an issue where
I made some stats u32 when they should have stayed u64. The rest of
the patches improve the handling of RW/num_cmds stats to reduce code
duplication and improve performance.
Mike Christie [Wed, 17 Sep 2025 22:12:55 +0000 (17:12 -0500)]
scsi: target: Move LUN stats to per-CPU
The atomic use in the main I/O path is causing perf issues when using
higher performance backend devices and multiple queues (more than
10 when using vhost-scsi) like with this fio workload:
Note: I forgot to include this patch with the delayed/ordered per CPU
tracking and per device/device entry per CPU stats. With this patch you
get the full 33% improvements when using fast backends, multiple queues
and multiple IO submiters.
Mike Christie [Wed, 17 Sep 2025 22:12:54 +0000 (17:12 -0500)]
scsi: target: Create and use macro helpers for per-CPU stats
Create some macros to reduce code duplication for when we handle per-CPU
stats. Convert the existing LUN and auth cases.
Note: This is similar to percpu_counters but they only support s64 since
they are also used for non-stat counters where you need to handle/prevent
rollover more gracefully. Our use is just for stats where the spec
defines u64 use plus we already have some files exporting u64 values so
it's not possible to directly use percpu_counters.
Mike Christie [Wed, 17 Sep 2025 22:12:53 +0000 (17:12 -0500)]
scsi: target: Fix LUN/device R/W and total command stats
In commit 9cf2317b795d ("scsi: target: Move I/O path stats to per CPU")
I saw we sometimes use %u and also misread the spec. As a result I
thought all the stats were supposed to be 32-bit only. However, for the
majority of cases we support currently, the spec specifies u64 bit
stats. This patch converts the stats changed in the commit above to u64.
Fixes: 9cf2317b795d ("scsi: target: Move I/O path stats to per CPU") Signed-off-by: Mike Christie <michael.christie@oracle.com> Reviewed-by: Dmitry Bogdanov <d.bogdanov@yadro.com> Link: https://patch.msgid.link/20250917221338.14813-2-michael.christie@oracle.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Remove the stack buffer 'buf'. This patch does not modify the output
produced by target_lu_gp_members_show().
An example of the output that is produced with this patch applied:
$ cat /sys/kernel/config/target/core/alua/lu_gps/default_lu_gp/members
fileio_0/vdev0
fileio_1/vdev1
iblock_0/vdev2
$ od -c /sys/kernel/config/target/core/alua/lu_gps/default_lu_gp/members 0000000 f i l e i o _ 0 / v d e v 0 \n f 0000020 i l e i o _ 1 / v d e v 1 \n i b 0000040 l o c k _ 0 / v d e v 2 \n 0000055
Merge patch series "scsi: target: Add WRITE_ATOMIC_16 support"
John Garry <john.g.garry@oracle.com> says:
This is a reposting of Mike's atomic writes support for the SCSI
target.
Again, we are now only supporting target_core_iblock. It's implemented
similar to UNMAP where we do not do any emulation and instead pass the
operation to the block layer.
Mike Christie [Mon, 20 Oct 2025 10:38:20 +0000 (10:38 +0000)]
scsi: target: Add atomic support to target_core_iblock
Make target_core_iblock use the LIO helper function to translate its
block_device atomic settings to LIO settings. If we then get a write
that LIO has indicated is atomic via the SCF_ATOMIC flag, we use the
REQ_ATOMIC flag to tell the block layer to perform an atomic write.
Signed-off-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: John Garry <john.g.garry@oracle.com> Reviewed-by: John Garry <john.g.garry@oracle.com> Link: https://patch.msgid.link/20251020103820.2917593-8-john.g.garry@oracle.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Mike Christie [Mon, 20 Oct 2025 10:38:18 +0000 (10:38 +0000)]
scsi: target: Report atomic values in INQUIRY
Report the atomic values in the Block Limits VPD page.
Signed-off-by: Mike Christie <michael.christie@oracle.com>
jpg: handle not having atomic_supported attribute Signed-off-by: John Garry <john.g.garry@oracle.com> Link: https://patch.msgid.link/20251020103820.2917593-6-john.g.garry@oracle.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Mike Christie [Mon, 20 Oct 2025 10:38:17 +0000 (10:38 +0000)]
scsi: target: Add WRITE_ATOMIC_16 handler
Add the core LIO code to process the WRITE_ATOMIC_16 command.
Signed-off-by: Mike Christie <michael.christie@oracle.com>
jpg: fix return code from sbc_check_atomic, reformat Signed-off-by: John Garry <john.g.garry@oracle.com> Link: https://patch.msgid.link/20251020103820.2917593-5-john.g.garry@oracle.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Mike Christie [Mon, 20 Oct 2025 10:38:16 +0000 (10:38 +0000)]
scsi: target: Add helper to set up atomic values from block_device
Add a helper function that sets up the atomic value based on a
block_device similar to what we do for unmap.
Signed-off-by: Mike Christie <michael.christie@oracle.com>
jpg: Set atomic alignment, drop atomic_supported reference Signed-off-by: John Garry <john.g.garry@oracle.com> Link: https://patch.msgid.link/20251020103820.2917593-4-john.g.garry@oracle.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Mike Christie [Mon, 20 Oct 2025 10:38:15 +0000 (10:38 +0000)]
scsi: target: Add atomic se_device fields
Add atomic fields to the se_device and export them in configfs.
Initially only target_core_iblock will be supported and we will inherit
all the settings from the block layer.
Signed-off-by: Mike Christie <michael.christie@oracle.com>
jpg: Stop being allowed to configure atomic write alignment, Signed-off-by: John Garry <john.g.garry@oracle.com> Link: https://patch.msgid.link/20251020103820.2917593-3-john.g.garry@oracle.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Rename target_configure_unmap_from_queue() to
target_configure_unmap_from_bdev() since it now takes a bdev.
Signed-off-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: John Garry <john.g.garry@oracle.com> Reviewed-by: John Garry <john.g.garry@oracle.com> Link: https://patch.msgid.link/20251020103820.2917593-2-john.g.garry@oracle.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
scsi: ufs: amd-versal2: Add UFS support for AMD Versal Gen 2 SoC
Add support for the UFS host controller on the AMD Versal Gen 2 SoC,
built on the Synopsys DWC UFS architecture, using the UFSHCD DWC and
UFSHCD platform driver. This controller requires specific configurations
like M-PHY/RMMI/UniPro and vendor specific registers programming before
doing the UIC_LINKSTARTUP.
Signed-off-by: Sai Krishna Potthuri <sai.krishna.potthuri@amd.com> Signed-off-by: Ajay Neeli <ajay.neeli@amd.com> Acked-by: Bart Van Assche <bvanassche@acm.org> Link: https://patch.msgid.link/20251021113003.13650-5-ajay.neeli@amd.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
scsi: firmware: xilinx: Add support for secure read/write ioctl interface
Add support for a generic ioctl read/write interface using which users
can request firmware to perform read/write operations on a protected and
secure address space.
The functionality is introduced through the means of two new IOCTL IDs
which extend the existing PM_IOCTL EEMI API:
- IOCTL_READ_REG
- IOCTL_MASK_WRITE_REG
The caller only passes the node id of the given device and an offset.
The base address is not exposed to the caller and internally retrieved
by the firmware. Firmware will enforce an access policy on the incoming
read/write request.
Signed-off-by: Izhar Ameer Shaikh <izhar.ameer.shaikh@amd.com> Reviewed-by: Tanmay Shah <tanmay.shah@amd.com> Signed-off-by: Radhey Shyam Pandey <radhey.shyam.pandey@amd.com> Signed-off-by: Ajay Neeli <ajay.neeli@amd.com> Acked-by: Senthil Nathan Thangaraj <senthilnathan.thangaraj@amd.com> Acked-by: Michal Simek <michal.simek@amd.com> Acked-by: Bart Van Assche <bvanassche@acm.org> Link: https://patch.msgid.link/20251021113003.13650-3-ajay.neeli@amd.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Alok Tiwari [Tue, 21 Oct 2025 09:03:52 +0000 (02:03 -0700)]
scsi: qla4xxx: Use correct variable in memset for clarity
Both mbox_cmd and mbox_sts have the same size, so using sizeof(mbox_cmd)
when clearing mbox_sts did not cause any functional issue. However, it
is misleading and reduces code readability.
Update the memset() calls to use sizeof(mbox_sts) to make the intent
clear
Bart Van Assche [Tue, 21 Oct 2025 20:17:43 +0000 (13:17 -0700)]
scsi: aacraid: Improve code readability
aac_queuecommand() is a scsi_host_template.queuecommand()
implementation. Any value returned by this function other than one of
the following values is translated into SCSI_MLQUEUE_HOST_BUSY:
The driver calls asc_prt_scsi_host() -> scsi_host_busy() prior to
calling scsi_add_host(). This should not be done, and has raised issues
for other drivers, like [0].
Function asc_prt_scsi_host() only has a single callsite, as above, where
the shost busy count would always be 0.
Avoid printing the shost busy count to avoid this problem.
Reported-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: John Garry <john.g.garry@oracle.com> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Link: https://patch.msgid.link/20251023085451.3933666-1-john.g.garry@oracle.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
This patch series includes two bug fixes for this development cycle
and six small patches that are intended for the next merge window. If
applying the first two patches only during the current development
cycle would be inconvenient, postponing all patches until the next
merge window is fine with me.
Please consider including these patches in the upstream kernel.
Thanks,
Bart.
[mkp: Applied patches #1 and #2 to 6.18/scsi-fixes]
Magnus Lindholm [Thu, 2 Oct 2025 05:25:24 +0000 (07:25 +0200)]
scsi: qla1280: Fix compiler warnings (DEBUG mode)
Building the qla1280 driver with DEBUG_QLA1280 set will emit compiler
warnings. Fix some print formatting strings to reflect the correct type
of printed variables as well as remove unused code. (static function
ql1280_dump_device) in order to avoid compiler warnings.
Improves the UFS Mediatek driver by correcting clock scaling with PM
QoS, and adjusting power management flows. It addresses
shutdown/suspend race conditions, and removes redundant
functions. Support for new platforms is added with the MMIO_OTSD_CTRL
register, and MT6991 performance is optimized with MRTT and random
improvements. These changes collectively enhance driver performance,
stability, and compatibility.
Changes since v1:
1. Remove two patches that will be fixed in UFS core.
- ufs: host: mediatek: Fix runtime suspend error deadlock
- ufs: host: mediatek: Enable interrupts for MCQ mode
2. Use hba->shutting_down instead of ufshcd_is_user_access_allowed
scsi: ufs: host: mediatek: Support new features for MT6991
Add support for the MT6991 platform by enabling MRTT settings and random
performance improvements. These enhancements aim to optimize performance
and efficiency on the MT6991 hardware.
Enable multi-Round Trip Time (MRTT) for improved data handling. Enable
random performance improvement features to boost overall system
responsiveness.
Signed-off-by: Naomi Chu <naomi.chu@mediatek.com> Reviewed-by: Peter Wang <peter.wang@mediatek.com> Acked-by: Chun-Hung Wu <chun-hung.wu@mediatek.com> Signed-off-by: Peter Wang <peter.wang@mediatek.com> Link: https://patch.msgid.link/20250924094527.2992256-9-peter.wang@mediatek.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Peter Wang [Wed, 24 Sep 2025 09:43:29 +0000 (17:43 +0800)]
scsi: ufs: host: mediatek: Add support for new platform with MMIO_OTSD_CTR
Introduce support for a new UFS Mediatek platform by adding the
REG_UFS_UFS_MMIO_OTSD_CTRL register. This update includes checks for
legacy platforms and uses the new register to replace debug selection
and handle specific operations. The changes ensure compatibility across
different hardware versions and prevent potential issues with debug
usage on newer platforms.
Additional updates include error logging improvements during link setup
for newer and legacy platforms, ensuring proper event logging and
debugging.
Peter Wang [Wed, 24 Sep 2025 09:43:28 +0000 (17:43 +0800)]
scsi: ufs: host: mediatek: Remove duplicate function
Remove the duplicate ufs_mtk_us_to_ahit() function in the UFS Mediatek
driver and export the existing ufshcd_us_to_ahit() function for shared
use. This change reduces redundancy and maintains consistency across the
codebase.
Signed-off-by: Peter Wang <peter.wang@mediatek.com> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Acked-by: Chun-Hung Wu <chun-hung.wu@mediatek.com> Link: https://patch.msgid.link/20250924094527.2992256-7-peter.wang@mediatek.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Address a race condition between shutdown and suspend operations in the
UFS Mediatek driver. Before entering suspend, check if a shutdown is in
progress to prevent conflicts and ensure system stability.
Peter Wang [Wed, 24 Sep 2025 09:43:26 +0000 (17:43 +0800)]
scsi: ufs: host: mediatek: Adjust sync length for FASTAUTO mode
Set the sync length for FASTAUTO G1 mode in the UFS Mediatek
driver. This ensures the sync length meets minimum values for high-speed
gears, improving stability during power mode changes.
Signed-off-by: Peter Wang <peter.wang@mediatek.com> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Acked-by: Chun-Hung Wu <chun-hung.wu@mediatek.com> Link: https://patch.msgid.link/20250924094527.2992256-5-peter.wang@mediatek.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Peter Wang [Wed, 24 Sep 2025 09:43:25 +0000 (17:43 +0800)]
scsi: ufs: host: mediatek: Handle clock scaling for high gear in PM flow
Add clock scaling down for power management flow in the UFS Mediatek
driver. If clock scaling is disabled and fixed in high gear, ensure the
clock scales down during suspend and scales up again after resume to
support high gear. This adjustment maintains proper power management.
Peter Wang [Wed, 24 Sep 2025 09:43:24 +0000 (17:43 +0800)]
scsi: ufs: host: mediatek: Adjust clock scaling for PM flow
Adjust clock scaling during suspend and resume in the UFS Mediatek
driver. Ensure that the clock scales down during suspend if it was
scaled up, and scales up again after resume. This adjustment maintains
proper power management.
Correct clock scaling with PM QoS during suspend and resume. Ensure PM
QoS is released during suspend if scaling up and re-applied after
resume. This prevents performance issues and maintains proper power
management.
Signed-off-by: Peter Wang <peter.wang@mediatek.com> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Acked-by: Chun-Hung Wu <chun-hung.wu@mediatek.com> Link: https://patch.msgid.link/20250924094527.2992256-2-peter.wang@mediatek.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>