Kai Mäkisara [Mon, 16 Dec 2024 11:37:55 +0000 (13:37 +0200)]
scsi: st: Don't set pos_unknown just after device recognition
Commit 9604eea5bd3a ("scsi: st: Add third party poweron reset handling") in
v6.6 added new code to handle the Power On/Reset Unit Attention (POR UA)
sense data. This was in addition to the existing method. When this Unit
Attention is received, the driver blocks attempts to read, write and some
other operations because the reset may have rewound the tape. Because of
the added code, the initial POR UA also resulted in blocking operations,
including those that are used to set the driver options after the device is
recognized. Reads and writes were also refused, whereas they succeeded
before this commit.
Add code to not set pos_unknown to block operations if the POR UA is
received from the first test_ready() call after the st device has been
created. This restores the behavior before v6.6.
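
The rule described above can be sketched in user-space C. This is only a
model under stated assumptions: struct tape_state, its first_tur field and
handle_test_ready() are hypothetical names for illustration, not the
driver's actual code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-device tape state. */
struct tape_state {
    bool first_tur;   /* true until the first test-ready call has run */
    bool pos_unknown; /* when set, reads, writes and some other ops are refused */
};

/*
 * Process the result of one test-ready call. por_ua is true when the
 * device reported a Power On/Reset Unit Attention.
 */
static void handle_test_ready(struct tape_state *st, bool por_ua)
{
    assert(st != NULL);
    /* Only a POR UA seen after the very first call blocks operations. */
    if (por_ua && !st->first_tur)
        st->pos_unknown = true;
    st->first_tur = false;
}
```

In this model the initial POR UA is ignored, while any later one still
marks the position as unknown.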
Signed-off-by: Kai Mäkisara <Kai.Makisara@kolumbus.fi> Link: https://lore.kernel.org/r/20241216113755.30415-1-Kai.Makisara@kolumbus.fi Fixes: 9604eea5bd3a ("scsi: st: Add third party poweron reset handling") CC: stable@vger.kernel.org Closes: https://lore.kernel.org/linux-scsi/2201CF73-4795-4D3B-9A79-6EE5215CF58D@kolumbus.fi/ Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Although ext_iid was added a while ago, to date nothing makes use of it; in
particular, nothing incorporates it in the UPIU header. Therefore, remove
it, as it is currently unused and serves no purpose.
Signed-off-by: Avri Altman <avri.altman@wdc.com> Link: https://lore.kernel.org/r/20250103080204.63951-1-avri.altman@wdc.com Cc: Can Guo <quic_cang@quicinc.com> Cc: Asutosh Das <quic_asutoshd@quicinc.com> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
scsi: storvsc: Ratelimit warning logs to prevent VM denial of service
If there's a persistent error in the hypervisor, the SCSI warning for
failed I/O can flood the kernel log and max out CPU utilization,
preventing troubleshooting from the VM side. Ratelimit the warning so
it doesn't DoS the VM.
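
The idea of ratelimiting can be illustrated with a small user-space model.
The kernel has its own ratelimit helpers; the struct and function below are
hypothetical and only show the principle of allowing a burst of messages
per time window and suppressing the rest.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical ratelimit state: at most 'burst' messages per interval. */
struct ratelimit {
    uint64_t interval_ns;  /* length of one window */
    unsigned int burst;    /* messages allowed per window */
    uint64_t window_start; /* start time of the current window */
    unsigned int printed;  /* messages emitted in the current window */
    unsigned int suppressed;
};

/* Return true if a message may be emitted at time now_ns. */
static bool ratelimit_ok(struct ratelimit *rl, uint64_t now_ns)
{
    assert(rl != NULL && rl->burst > 0);
    if (now_ns - rl->window_start >= rl->interval_ns) {
        /* A new window begins: reset the per-window counter. */
        rl->window_start = now_ns;
        rl->printed = 0;
    }
    if (rl->printed < rl->burst) {
        rl->printed++;
        return true;
    }
    rl->suppressed++;
    return false;
}
```

A caller would wrap each warning in a check of ratelimit_ok(), so a
persistent error costs a bounded amount of logging per window.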
Randy Dunlap [Sat, 21 Dec 2024 21:25:39 +0000 (13:25 -0800)]
scsi: documentation: Corrections for struct updates
Update scsi_mid_low_api.rst for changes to struct scsi_host and
struct scsi_cmnd.
struct scsi_host:
- no_async_abort is gone
- drop sh_list w/ no replacement
- change my_devices to __devices
struct scsi_cmnd:
- removed 'done' (now in struct scsi_driver); use scsi_done() or
scsi_done_direct() callbacks
- change previous request_bufflen to scsi_bufflen()
- change previous use_sg field to scsi_dma_map() or scsi_sg_count()
- change previous request_buffer field to reference to 'usg_sg' text
[mkp: removed more obsolete stuff]
Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Link: https://lore.kernel.org/r/20241221212539.1314560-1-rdunlap@infradead.org Cc: James E.J. Bottomley <James.Bottomley@HansenPartnership.com> Cc: Martin K. Petersen <martin.petersen@oracle.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: linux-doc@vger.kernel.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Randy Dunlap [Wed, 18 Dec 2024 00:07:48 +0000 (16:07 -0800)]
scsi: driver-api: documentation: Change what is added to docbook
For scsi_devinfo.c, use :export: so that exported symbols are put into the
docbook. Drop :internal: -- they aren't needed in the docbook.
For scsi_proc.c, drop :internal:. This will cause all documented private
(as is already done) and exported symbols to be added to the docbook.
For scsi_scan.c, switch from :internal: to :export: so that exported
symbols are put into the generated docbook.
Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Link: https://lore.kernel.org/r/20241218000748.932850-1-rdunlap@infradead.org Cc: James E.J. Bottomley <James.Bottomley@HansenPartnership.com> Cc: Martin K. Petersen <martin.petersen@oracle.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: linux-doc@vger.kernel.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Randy Dunlap [Thu, 12 Dec 2024 20:52:17 +0000 (12:52 -0800)]
scsi: transport: sas: spi: Fix kernel-doc for exported functions
Fix kernel-doc for sas_port_alloc(), sas_port_alloc_num(), and
spi_dv_device(). This allows them to be part of the SCSI driver-api
docbook.
Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Link: https://lore.kernel.org/r/20241212205217.597844-6-rdunlap@infradead.org CC: James E.J. Bottomley <James.Bottomley@HansenPartnership.com> CC: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Randy Dunlap [Thu, 12 Dec 2024 20:52:16 +0000 (12:52 -0800)]
scsi: scsi_scan: Add kernel-doc for exported function
Add kernel-doc for scsi_add_device() since it is exported. This allows it
to be part of the SCSI driver-api docbook.
Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Link: https://lore.kernel.org/r/20241212205217.597844-5-rdunlap@infradead.org CC: James E.J. Bottomley <James.Bottomley@HansenPartnership.com> CC: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Randy Dunlap [Thu, 12 Dec 2024 20:52:15 +0000 (12:52 -0800)]
scsi: scsi_lib: Add kernel-doc for exported functions
Add kernel-doc for scsi_failures_reset_retries() and scsi_alloc_request()
since these are exported. This allows them to be part of the SCSI
driver-api docbook.
Fix kernel-doc comments for scsi_vpd_tpg_id() [add kernel-doc for one
parameter and fix a typo].
Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Link: https://lore.kernel.org/r/20241212205217.597844-4-rdunlap@infradead.org CC: James E.J. Bottomley <James.Bottomley@HansenPartnership.com> CC: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Randy Dunlap [Thu, 12 Dec 2024 20:52:14 +0000 (12:52 -0800)]
scsi: scsi_ioctl: Add kernel-doc for exported functions
Add kernel-doc for scsi_set_medium_removal(), scsi_cmd_allowed(), and
scsi_ioctl_block_when_processing_errors() since these are exported. This
allows them to be part of the SCSI driver-api docbook.
Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Link: https://lore.kernel.org/r/20241212205217.597844-3-rdunlap@infradead.org CC: James E.J. Bottomley <James.Bottomley@HansenPartnership.com> CC: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Randy Dunlap [Thu, 12 Dec 2024 20:52:13 +0000 (12:52 -0800)]
scsi: scsi_error: Add kernel-doc for exported functions
Convert scsi_report_bus_reset() and scsi_report_device_reset() to
kernel-doc since they are exported. This allows them to be part of the
driver-api/scsi.rst docbook.
Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Link: https://lore.kernel.org/r/20241212205217.597844-2-rdunlap@infradead.org CC: James E.J. Bottomley <James.Bottomley@HansenPartnership.com> CC: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Merge patch series "scsi: Constify 'struct bin_attribute'"
Thomas Weißschuh <linux@weissschuh.net> says:
The sysfs core now allows instances of 'struct bin_attribute' to be
moved into read-only memory. Make use of that to protect them against
accidental or malicious modifications.
Thomas Weißschuh [Mon, 16 Dec 2024 11:29:18 +0000 (12:29 +0100)]
scsi: qla4xxx: Constify 'struct bin_attribute'
The sysfs core now allows instances of 'struct bin_attribute' to be moved
into read-only memory. Make use of that to protect them against accidental
or malicious modifications.
Thomas Weißschuh [Mon, 16 Dec 2024 11:29:17 +0000 (12:29 +0100)]
scsi: qla2xxx: Constify 'struct bin_attribute'
The sysfs core now allows instances of 'struct bin_attribute' to be moved
into read-only memory. Make use of that to protect them against accidental
or malicious modifications.
Thomas Weißschuh [Mon, 16 Dec 2024 11:29:16 +0000 (12:29 +0100)]
scsi: qedi: Constify 'struct bin_attribute'
The sysfs core now allows instances of 'struct bin_attribute' to be moved
into read-only memory. Make use of that to protect them against accidental
or malicious modifications.
Thomas Weißschuh [Mon, 16 Dec 2024 11:29:15 +0000 (12:29 +0100)]
scsi: qedf: Constify 'struct bin_attribute'
The sysfs core now allows instances of 'struct bin_attribute' to be moved
into read-only memory. Make use of that to protect them against accidental
or malicious modifications.
Thomas Weißschuh [Mon, 16 Dec 2024 11:29:14 +0000 (12:29 +0100)]
scsi: ipr: Constify 'struct bin_attribute'
The sysfs core now allows instances of 'struct bin_attribute' to be moved
into read-only memory. Make use of that to protect them against accidental
or malicious modifications.
Thomas Weißschuh [Mon, 16 Dec 2024 11:29:13 +0000 (12:29 +0100)]
scsi: lpfc: Constify 'struct bin_attribute'
The sysfs core now allows instances of 'struct bin_attribute' to be moved
into read-only memory. Make use of that to protect them against accidental
or malicious modifications.
Thomas Weißschuh [Mon, 16 Dec 2024 11:29:12 +0000 (12:29 +0100)]
scsi: ibmvfc: Constify 'struct bin_attribute'
The sysfs core now allows instances of 'struct bin_attribute' to be moved
into read-only memory. Make use of that to protect them against accidental
or malicious modifications.
Thomas Weißschuh [Mon, 16 Dec 2024 11:29:11 +0000 (12:29 +0100)]
scsi: esas2r: Constify 'struct bin_attribute'
The sysfs core now allows instances of 'struct bin_attribute' to be moved
into read-only memory. Make use of that to protect them against accidental
or malicious modifications.
Thomas Weißschuh [Mon, 16 Dec 2024 11:29:10 +0000 (12:29 +0100)]
scsi: arcmsr: Constify 'struct bin_attribute'
The sysfs core now allows instances of 'struct bin_attribute' to be moved
into read-only memory. Make use of that to protect them against accidental
or malicious modifications.
Thomas Weißschuh [Mon, 16 Dec 2024 11:29:09 +0000 (12:29 +0100)]
scsi: 3w-sas: Constify 'struct bin_attribute'
The sysfs core now allows instances of 'struct bin_attribute' to be moved
into read-only memory. Make use of that to protect them against accidental
or malicious modifications.
Thomas Weißschuh [Mon, 16 Dec 2024 11:29:08 +0000 (12:29 +0100)]
scsi: core: Constify 'struct bin_attribute'
The sysfs core now allows instances of 'struct bin_attribute' to be moved
into read-only memory. Make use of that to protect them against accidental
or malicious modifications.
Merge patch series "Update lpfc to revision 14.4.0.7"
Justin Tee <justintee8345@gmail.com> says:
Update lpfc to revision 14.4.0.7
This patch set contains fixes related to smatch, clean up of obsolete code
and global spinlocks, changes to ADISC and LS_RJT handling, and support for
large fw object reads used in proprietary applications.
The patches were cut against Martin's 6.14/scsi-queue tree.
Justin Tee [Thu, 12 Dec 2024 23:33:07 +0000 (15:33 -0800)]
scsi: lpfc: Add support for large fw object application layer reads
Current lpfc bsg implementation allows a maximum fw read object size of
30KB. Implementation and support for read object mailbox commands for fw
objects larger than 30KB are now required for proprietary applications.
Thus, update the lpfc_sli_config_emb0_subsys structure and its associated
submission and completion paths to accommodate an alternative form of
read object command that supports large fw objects.
Justin Tee [Thu, 12 Dec 2024 23:33:06 +0000 (15:33 -0800)]
scsi: lpfc: Update definition of firmware configuration mbox cmds
There are unused fields in mailbox commands that query for firmware
configuration information. As such, update the struct definitions by
correcting the name of certain fields and removing the unused fields.
Justin Tee [Thu, 12 Dec 2024 23:33:05 +0000 (15:33 -0800)]
scsi: lpfc: Change lpfc_nodelist save_flags member into a bitmask
In an attempt to reduce the number of unnecessary ndlp->lock acquisitions in
the lpfc driver, change save_flags into an unsigned long bitmask and use
clear_bit/test_bit bitwise atomic APIs instead of relying on ndlp->lock
for synchronization.
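
As a rough user-space analogue, the same pattern can be written with C11
atomics. The helper names below are hypothetical; the kernel's own bit
operations are not used here, and this is only a sketch of updating flag
bits without taking a lock.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>

#define BITS_PER_ULONG (sizeof(unsigned long) * CHAR_BIT)

/* Atomically set one bit in the flag word. */
static void flag_set(atomic_ulong *flags, unsigned int bit)
{
    assert(bit < BITS_PER_ULONG);
    atomic_fetch_or(flags, 1UL << bit);
}

/* Atomically clear one bit in the flag word. */
static void flag_clear(atomic_ulong *flags, unsigned int bit)
{
    assert(bit < BITS_PER_ULONG);
    atomic_fetch_and(flags, ~(1UL << bit));
}

/* Read one bit without holding any lock. */
static bool flag_test(atomic_ulong *flags, unsigned int bit)
{
    assert(bit < BITS_PER_ULONG);
    return (atomic_load(flags) >> bit) & 1UL;
}
```

Each helper touches a single bit atomically, so independent flags can be
changed concurrently without serializing on a shared lock.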
Justin Tee [Thu, 12 Dec 2024 23:33:04 +0000 (15:33 -0800)]
scsi: lpfc: Add handling for LS_RJT reason explanation authentication required
When an LS_RJT is received with reason explanation authentication required,
current driver logic is to retry the PLOGI up to 48 times. In the worst-case
scenario, 48 retries can take longer than dev_loss_tmo, and if an RSCN is
received indicating an authentication requirement change, the driver may
miss processing it. Fix by adding logic to specifically handle reason
explanation authentication required and set the max retry count to 8.
Justin Tee [Thu, 12 Dec 2024 23:33:03 +0000 (15:33 -0800)]
scsi: lpfc: Modify handling of ADISC based on ndlp state and RPI registration
In lpfc_check_adisc, remove the requirement that the ndlp object must have
been RPI registered. Whether or not the ndlp is RPI registered is
unrelated to verifying that the received ADISC is intended for that ndlp
rport object.
After ADISC receipt, there's no need to put the ndlp state into NPR. Let
the cmpl routines from the actions taken earlier in ADISC handling set the
proper ndlp state.
Also, refactor when a RESUME_RPI mailbox command should be sent. It should
only be sent if the RPI registered flag is set.
Justin Tee [Thu, 12 Dec 2024 23:33:02 +0000 (15:33 -0800)]
scsi: lpfc: Delete NLP_TARGET_REMOVE flag due to obsolete usage
Remove the NLP_TARGET_REMOVE flag as its usage is obsolete. The current
framework relies on lpfc_dev_loss_tmo_callbk from the upper layer to
notify the final ndlp kref release. There's no need to specifically set
NLP_EVT_DEVICE_RM when a LOGO completes. The dev_loss_tmo_callbk is
responsible for the final kref put.
Justin Tee [Thu, 12 Dec 2024 23:33:00 +0000 (15:33 -0800)]
scsi: lpfc: Redefine incorrect type in lpfc_create_device_data()
Fix smatch warning by redefining local variable memory_flags from int to
gfp_t.
lpfc_scsi.c: warning: incorrect type in argument 2 (different base types)
lpfc_scsi.c: expected restricted gfp_t [usertype] gfp_mask
lpfc_scsi.c: got int memory_flags
Eric Biggers [Fri, 13 Dec 2024 04:19:46 +0000 (20:19 -0800)]
scsi: ufs: qcom: Convert to use UFSHCD_QUIRK_CUSTOM_CRYPTO_PROFILE
By default the UFS core is responsible for initializing the
blk_crypto_profile, but Qualcomm platforms have their own way of
programming and evicting crypto keys. So currently
ufs_hba_variant_ops::program_key is used to redirect control flow from
ufshcd_program_key(). This has worked until now, but it's a bit of a hack,
given that the key (and algorithm ID etc.) ends up being converted from
blk_crypto_key => ufs_crypto_cfg_entry => SCM call parameters, where the
intermediate ufs_crypto_cfg_entry step is unnecessary. Taking a similar
approach with the upcoming wrapped key support, the implementation of which
is similarly platform-specific, would require adding four new methods to
ufs_hba_variant_ops, changing program_key to take the struct
blk_crypto_key, and adding a new UFSHCD_CAP_* flag to indicate support for
wrapped keys.
This patch takes a different approach. It changes ufs-qcom to use the
existing UFSHCD_QUIRK_CUSTOM_CRYPTO_PROFILE which was recently added for
ufs-exynos. This allows it to override the full blk_crypto_profile,
eliminating the need for the existing ufs_hba_variant_ops::program_key and
the hooks that would have been needed for wrapped key support. It does
require a bit of duplicated code to read the crypto capability registers,
but it's worth the simplification in design with ufs-qcom and ufs-exynos
now using the same method to customize the crypto profile, and it makes it
much easier to add wrapped key support.
Eric Biggers [Tue, 10 Dec 2024 03:08:39 +0000 (19:08 -0800)]
scsi: ufs: qcom: Fix crypto key eviction
Commit 56541c7c4468 ("scsi: ufs: ufs-qcom: Switch to the new ICE API")
introduced an incorrect check of the algorithm ID into the key eviction
path, and thus qcom_ice_evict_key() is no longer ever called. Fix it.
Fixes: 56541c7c4468 ("scsi: ufs: ufs-qcom: Switch to the new ICE API") Cc: stable@vger.kernel.org Cc: Abel Vesa <abel.vesa@linaro.org> Signed-off-by: Eric Biggers <ebiggers@google.com> Link: https://lore.kernel.org/r/20241210030839.1118805-1-ebiggers@kernel.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Randy Dunlap [Thu, 19 Dec 2024 21:49:28 +0000 (13:49 -0800)]
scsi: documentation: scsi_eh: updates for EH changes
SCSI_SOFTIRQ and scsi_softirq() are no longer used. Change to block layer
equivalents.
scsi_setup_cmd_retry() has been deleted. Remove references to it.
SCSI_EH_CANCEL_CMD has been deleted. Remove references to it.
scsi_eh_abort_cmds() has been deleted. Remove references to it.
[mkp: fixed START STOP UNIT]
Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Link: https://lore.kernel.org/r/20241219214928.1170302-1-rdunlap@infradead.org Cc: Jonathan Corbet <corbet@lwn.net> Cc: linux-doc@vger.kernel.org Cc: James E.J. Bottomley <James.Bottomley@HansenPartnership.com> Cc: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
scsi: ufs: qcom: Power down the controller/device during system suspend for SM8550/SM8650 SoCs
SM8550 and SM8650 SoCs don't support UFS PHY retention. So once these SoCs
reach the low power state (CX power collapse) during system suspend, all
the PHY hardware state gets lost. This leads to the UFS resume failure:
With the default system suspend level of UFS_PM_LVL_3, the power domain for
UFS PHY needs to be kept always ON to retain the state. But this would
prevent these SoCs from reaching the CX power collapse state, leading to
poor power saving during system suspend.
So to fix this issue without affecting the power saving, set
'ufs_qcom_drvdata::no_phy_retention' to true which sets 'hba->spm_lvl' to
UFS_PM_LVL_5 to allow both the controller and device (in turn the PHY) to be
powered down during system suspend for these SoCs by default.
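
The selection of the default suspend level can be modelled as follows.
This is a sketch under assumptions: enum pm_lvl, struct soc_drvdata and
default_spm_lvl() are hypothetical names, and only the two levels named
above are represented.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative levels; only the two mentioned in the text are modelled. */
enum pm_lvl { PM_LVL_3 = 3, PM_LVL_5 = 5 };

/* Hypothetical per-SoC driver data. */
struct soc_drvdata {
    bool no_phy_retention; /* SoC loses PHY state in its low power state */
};

/* Pick the default system suspend level for a SoC. */
static enum pm_lvl default_spm_lvl(const struct soc_drvdata *d)
{
    assert(d != NULL);
    if (d->no_phy_retention)
        return PM_LVL_5; /* power down controller and device, hence the PHY */
    return PM_LVL_3;     /* default level named in the text */
}
```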
Cc: stable@vger.kernel.org # 6.3 Fixes: 35cf1aaab169 ("arm64: dts: qcom: sm8550: Add UFS host controller and phy nodes") Fixes: 10e024671295 ("arm64: dts: qcom: sm8650: add interconnect dependent device nodes") Reported-by: Neil Armstrong <neil.armstrong@linaro.org> Tested-by: Amit Pundir <amit.pundir@linaro.org> # on SM8550-HDK Reviewed-by: Bart Van Assche <bvanassche@acm.org> Tested-by: Neil Armstrong <neil.armstrong@linaro.org> # on SM8550-QRD Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Link: https://lore.kernel.org/r/20241219-ufs-qcom-suspend-fix-v3-4-63c4b95a70b9@linaro.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
scsi: ufs: qcom: Allow passing platform specific OF data
In order to allow platform specific flags and configurations, introduce the
platform specific OF data and move the existing quirk
UFSHCD_QUIRK_BROKEN_LSDBS_CAP for SM8550 and SM8650 SoCs.
Reviewed-by: Avri Altman <avri.altman@wdc.com> Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org> Tested-by: Amit Pundir <amit.pundir@linaro.org> # on SM8550-HDK Reviewed-by: Bart Van Assche <bvanassche@acm.org> Tested-by: Neil Armstrong <neil.armstrong@linaro.org> # on SM8550-QRD Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Link: https://lore.kernel.org/r/20241219-ufs-qcom-suspend-fix-v3-3-63c4b95a70b9@linaro.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
scsi: ufs: qcom: Power off the PHY if it was already powered on in ufs_qcom_power_up_sequence()
PHY might already be powered on during ufs_qcom_power_up_sequence() in a
couple of cases:
1. During UFSHCD_QUIRK_REINIT_AFTER_MAX_GEAR_SWITCH quirk
2. Resuming from spm_lvl = 5 suspend
In those cases, it is necessary to call phy_power_off() and phy_exit() in
ufs_qcom_power_up_sequence() function to power off the PHY before calling
phy_init() and phy_power_on().
Case (1) is doing it via ufs_qcom_reinit_notify() callback, but case (2) is
not handled. So to satisfy both cases, call phy_power_off() and phy_exit()
if the phy_count is non-zero. And with this change, the reinit_notify()
callback is no longer needed.
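
The ordering described above can be sketched with a small model. struct
fake_phy and the fake_phy_*() helpers are hypothetical stand-ins that only
count calls; they are not the PHY framework's functions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical PHY model that just tracks init and power counts. */
struct fake_phy {
    unsigned int init_count;
    unsigned int power_count;
};

static void fake_phy_init(struct fake_phy *p)      { p->init_count++; }
static void fake_phy_exit(struct fake_phy *p)      { p->init_count--; }
static void fake_phy_power_on(struct fake_phy *p)  { p->power_count++; }
static void fake_phy_power_off(struct fake_phy *p) { p->power_count--; }

/* Power-up sequence that tolerates a PHY that is already powered on. */
static void power_up_sequence(struct fake_phy *p)
{
    assert(p != NULL);
    if (p->power_count) {
        /* Already on (e.g. re-init or resume): bring it fully down first. */
        fake_phy_power_off(p);
        fake_phy_exit(p);
    }
    fake_phy_init(p);
    fake_phy_power_on(p);
}
```

Running the sequence twice leaves both counts at one, which is the
property that makes a separate re-init notification unnecessary in this
model.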
This fixes the below UFS resume failure with spm_lvl = 5:
ufshcd-qcom 1d84000.ufshc: Enabling the controller failed
ufshcd-qcom 1d84000.ufshc: Enabling the controller failed
ufshcd-qcom 1d84000.ufshc: Enabling the controller failed
ufshcd-qcom 1d84000.ufshc: ufshcd_host_reset_and_restore: Host init failed -5
ufshcd-qcom 1d84000.ufshc: Enabling the controller failed
ufshcd-qcom 1d84000.ufshc: Enabling the controller failed
ufshcd-qcom 1d84000.ufshc: Enabling the controller failed
ufshcd-qcom 1d84000.ufshc: ufshcd_host_reset_and_restore: Host init failed -5
ufshcd-qcom 1d84000.ufshc: Enabling the controller failed
ufshcd-qcom 1d84000.ufshc: Enabling the controller failed
ufshcd-qcom 1d84000.ufshc: Enabling the controller failed
ufshcd-qcom 1d84000.ufshc: ufshcd_host_reset_and_restore: Host init failed -5
ufshcd-qcom 1d84000.ufshc: Enabling the controller failed
ufshcd-qcom 1d84000.ufshc: Enabling the controller failed
ufshcd-qcom 1d84000.ufshc: Enabling the controller failed
ufshcd-qcom 1d84000.ufshc: ufshcd_host_reset_and_restore: Host init failed -5
ufshcd-qcom 1d84000.ufshc: Enabling the controller failed
ufshcd-qcom 1d84000.ufshc: Enabling the controller failed
ufshcd-qcom 1d84000.ufshc: Enabling the controller failed
ufshcd-qcom 1d84000.ufshc: ufshcd_host_reset_and_restore: Host init failed -5
ufs_device_wlun 0:0:0:49488: ufshcd_wl_resume failed: -5
ufs_device_wlun 0:0:0:49488: PM: dpm_run_callback(): scsi_bus_resume returns -5
ufs_device_wlun 0:0:0:49488: PM: failed to resume async: error -5
Cc: stable@vger.kernel.org # 6.3 Fixes: baf5ddac90dc ("scsi: ufs: ufs-qcom: Add support for reinitializing the UFS device") Reported-by: Ram Kumar Dwivedi <quic_rdwivedi@quicinc.com> Tested-by: Amit Pundir <amit.pundir@linaro.org> # on SM8550-HDK Reviewed-by: Bart Van Assche <bvanassche@acm.org> Tested-by: Neil Armstrong <neil.armstrong@linaro.org> # on SM8550-QRD Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Link: https://lore.kernel.org/r/20241219-ufs-qcom-suspend-fix-v3-1-63c4b95a70b9@linaro.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Michael Kelley [Thu, 3 Oct 2024 03:53:32 +0000 (20:53 -0700)]
scsi: storvsc: Don't assume cpu_possible_mask is dense
Current code allocates the stor_chns array with size num_possible_cpus().
This code assumes cpu_possible_mask is dense, which is not true in the
general case per [1]. If cpu_possible_mask is sparse, the array might be
indexed by a value beyond the size of the array.
However, the configurations that Hyper-V provides to guest VMs on x86 and
ARM64 hardware, in combination with how architecture specific code assigns
Linux CPU numbers, *do* always produce a dense cpu_possible_mask. So the
dense assumption is not currently causing failures. But for robustness
against future changes in how cpu_possible_mask is populated, update the
code to no longer assume dense.
The correct approach is to allocate and initialize the array using size
"nr_cpu_ids". While this leaves unused array entries corresponding to holes
in cpu_possible_mask, the holes are assumed to be minimal and hence the
amount of memory wasted by unused entries is minimal.
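
A small user-space model shows why the count of possible CPUs is the
wrong array size when the mask has holes. The mask is represented as a
plain bool array and both helpers are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Count the set entries; models the number of possible CPUs. */
static unsigned int num_possible(const bool *possible, unsigned int nr_ids)
{
    unsigned int i, n = 0;

    assert(possible != NULL);
    for (i = 0; i < nr_ids; i++)
        if (possible[i])
            n++;
    return n;
}

/* True when indexing an array of array_len entries by cpu is in bounds. */
static bool index_in_bounds(unsigned int array_len, unsigned int cpu)
{
    return cpu < array_len;
}
```

With CPUs 0, 1 and 4 possible, the count is 3 but the highest index is 4,
so an array sized by the count would be overrun, while one sized by the
number of CPU ids (5 here) is safe at the cost of two unused slots.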
Steffen Maier [Thu, 5 Dec 2024 14:19:31 +0000 (15:19 +0100)]
scsi: zfcp: Clarify zfcp_port refcount ownership during "link" test
Reviewed-by: Benjamin Block <bblock@linux.ibm.com> Signed-off-by: Steffen Maier <maier@linux.ibm.com> Signed-off-by: Nihar Panda <niharp@linux.ibm.com> Link: https://lore.kernel.org/r/20241205141932.1227039-3-niharp@linux.ibm.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Fedor Loshakov [Thu, 5 Dec 2024 14:19:30 +0000 (15:19 +0100)]
scsi: zfcp: Correct kdoc parameter description for sending ELS and CT
Since commit 7c7dc196814b ("[SCSI] zfcp: Simplify handling of ct and els
requests") there are no more such structures as zfcp_send_els and
zfcp_send_ct. Instead there is now one common fsf structure to hold zfcp
data for ct and els requests. Fix parameter description for
zfcp_fsf_send_ct() and zfcp_fsf_send_els() accordingly.
Signed-off-by: Fedor Loshakov <loshakov@linux.ibm.com> Reviewed-by: Steffen Maier <maier@linux.ibm.com> Reviewed-by: Benjamin Block <bblock@linux.ibm.com> Signed-off-by: Nihar Panda <niharp@linux.ibm.com> Link: https://lore.kernel.org/r/20241205141932.1227039-2-niharp@linux.ibm.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Randy Dunlap [Thu, 5 Dec 2024 04:18:39 +0000 (20:18 -0800)]
scsi: Eliminate scsi_register() and scsi_unregister() usage & docs
scsi_mid_low_api.rst refers to scsi_register() and scsi_unregister() but
these functions don't exist. They have been replaced by more meaningful
names.
Update one driver (megaraid_mbox.c) that uses "scsi_unregister" in a
warning message. Update scsi_mid_low_api.rst to eliminate references to
scsi_register() and scsi_unregister().
Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Link: https://lore.kernel.org/r/20241205041839.164404-1-rdunlap@infradead.org Cc: James E.J. Bottomley <James.Bottomley@HansenPartnership.com> Cc: Martin K. Petersen <martin.petersen@oracle.com> Cc: Bart Van Assche <bvanassche@acm.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: linux-doc@vger.kernel.org Cc: Kashyap Desai <kashyap.desai@broadcom.com> Cc: Sumit Saxena <sumit.saxena@broadcom.com> Cc: Shivasharan S <shivasharan.srikanteshwara@broadcom.com> Cc: Chandrakanth patil <chandrakanth.patil@broadcom.com> Cc: megaraidlinux.pdl@broadcom.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Randy Dunlap [Thu, 5 Dec 2024 03:13:07 +0000 (19:13 -0800)]
scsi: docs: Remove init_this_scsi_driver()
Finish removing mention of init_this_scsi_driver() that was removed ages
ago.
Fixes: 83c9f08e6c6a ("scsi: remove the old scsi_module.c initialization model") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Link: https://lore.kernel.org/r/20241205031307.130441-1-rdunlap@infradead.org Cc: Christoph Hellwig <hch@lst.de> Cc: Martin K. Petersen <martin.petersen@oracle.com> Cc: James E.J. Bottomley <James.Bottomley@HansenPartnership.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: linux-doc@vger.kernel.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
liuderong [Fri, 6 Dec 2024 07:29:42 +0000 (15:29 +0800)]
scsi: ufs: core: Update compl_time_stamp_local_clock after completing a cqe
lrbp->compl_time_stamp_local_clock is set to zero after sending a sqe
but it is not updated after completing a cqe. Thus the printed
information in ufshcd_print_tr() will always be zero.
Update lrbp->compl_time_stamp_local_clock after completing a cqe.
Log sample:
ufshcd-qcom 1d84000.ufshc: UPIU[8] - issue time 8750227249 us
ufshcd-qcom 1d84000.ufshc: UPIU[8] - complete time 0 us
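
The bookkeeping at issue can be modelled like this. struct fake_lrb and
the two helpers are hypothetical; the sketch only shows that the
completion path has to write the timestamp that the send path cleared.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical request bookkeeping with issue and completion times. */
struct fake_lrb {
    uint64_t issue_us;
    uint64_t compl_us;
};

/* Send path: record the issue time and clear the completion time. */
static void on_send(struct fake_lrb *l, uint64_t now_us)
{
    assert(l != NULL);
    l->issue_us = now_us;
    l->compl_us = 0;
}

/* Completion path: without this update the completion time stays zero. */
static void on_complete(struct fake_lrb *l, uint64_t now_us)
{
    assert(l != NULL);
    l->compl_us = now_us;
}
```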
Fixes: c30d8d010b5e ("scsi: ufs: core: Prepare for completion in MCQ") Reviewed-by: Bean Huo <beanhuo@micron.com> Reviewed-by: Peter Wang <peter.wang@mediatek.com> Signed-off-by: liuderong <liuderong@oppo.com> Link: https://lore.kernel.org/r/1733470182-220841-1-git-send-email-liuderong@oppo.com Reviewed-by: Avri Altman <avri.altman@wdc.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Avri Altman [Thu, 28 Nov 2024 07:25:42 +0000 (09:25 +0200)]
scsi: ufs: core: Do not hold any lock in ufshcd_hba_stop()
This change is motivated by Bart's suggestion in [1], which makes it
possible to further reduce the SCSI host lock usage in the UFS driver. It
makes sense because, although the legacy interrupt is disabled by some
but not all ufshcd_hba_stop() callers, it is safe to nest disable_irq()
calls as it checks the irq depth.
Merge patch series "Replace the "slave_*" function names"
Bart Van Assche <bvanassche@acm.org> says:
Hi Martin,
The text "slave_" in multiple function names does not make it clear what
the purpose of these functions is. Hence this patch series that renames all
SCSI functions that have the word "slave" in their function name. Please
consider this patch series for the next merge window.
Bart Van Assche [Tue, 22 Oct 2024 18:07:57 +0000 (11:07 -0700)]
scsi: core: Update API documentation
Since the .slave_alloc(), .slave_destroy() and .slave_configure() methods
have been renamed in struct scsi_host_template, also rename these in the
API documentation.
Bart Van Assche [Tue, 22 Oct 2024 18:07:55 +0000 (11:07 -0700)]
scsi: Convert SCSI drivers to .sdev_configure()
The only difference between the .sdev_configure() and .slave_configure()
methods is that the former accepts an additional 'limits' argument.
Convert all SCSI drivers that define a .slave_configure() method to
.sdev_configure(). This patch prepares for removing the
.slave_configure() method. No functionality has been changed.
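
A conversion of this kind might look like the sketch below. All types
here (struct fake_sdev, struct fake_limits, struct fake_host_template) and
both callbacks are hypothetical stand-ins rather than the SCSI core's
definitions; the point is only that the new method takes an extra
argument which a converted driver is free to ignore.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the device, limits and host template. */
struct fake_sdev   { unsigned int queue_depth; };
struct fake_limits { unsigned int max_sectors; };

struct fake_host_template {
    int (*sdev_configure)(struct fake_sdev *sdev, struct fake_limits *lim);
};

/* Body of a driver's old single-argument configure method. */
static int my_old_configure(struct fake_sdev *sdev)
{
    assert(sdev != NULL);
    sdev->queue_depth = 32;
    return 0;
}

/* Converted method: same behaviour, the limits argument is unused. */
static int my_sdev_configure(struct fake_sdev *sdev, struct fake_limits *lim)
{
    (void)lim;
    return my_old_configure(sdev);
}
```

A driver would then register my_sdev_configure in its template in place
of the old method.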
Acked-by: Geoff Levand <geoff@infradead.org> # for ps3rom Acked-by: Khalid Aziz <khalid@gonehiking.org> # for the BusLogic driver Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Link: https://lore.kernel.org/r/20241022180839.2712439-4-bvanassche@acm.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Bart Van Assche [Tue, 22 Oct 2024 18:07:53 +0000 (11:07 -0700)]
scsi: Rename .slave_alloc() and .slave_destroy()
Rename .slave_alloc() into .sdev_init() and .slave_destroy() into
.sdev_destroy(). The new names make it clear that these are actions on
SCSI devices. Make this change in the SCSI core, SCSI drivers and also
in the ATA drivers. No functionality has been changed.
This patch has been created as follows:
* Change the text "slave_alloc" into "sdev_init" in all source files
except those in drivers/net/ and Documentation/.
* Change the text "slave_destroy" into "sdev_destroy" in all source
files except those in drivers/net/ and Documentation/.
* Rename lpfc_no_slave() into lpfc_no_sdev().
* Manually adjust whitespace where necessary to restore vertical
alignment (dc395x driver and include/linux/libata.h).
Igor Pylypiv [Tue, 26 Nov 2024 22:49:23 +0000 (22:49 +0000)]
scsi: pm80xx: Increase reserved tags from 8 to 128
Increase the number of reserved tags to prevent command processing failures
when the driver is under stress. The 8 reserved tags are quickly used up,
leading to errors when command completions are delayed.
The driver needs ~512 ccbs/tags for maximum I/O utilization:
John Garry [Mon, 2 Dec 2024 13:00:45 +0000 (13:00 +0000)]
scsi: scsi_debug: Fix hrtimer support for ndelay
Since commit 771f712ba5b0 ("scsi: scsi_debug: Fix cmd duration
calculation"), ns_from_boot value is only evaluated in schedule_resp()
for polled requests.
However, ns_from_boot is also required for hrtimer support for when
ndelay is less than INCLUSIVE_TIMING_MAX_NS, so fix up the logic to
decide when to evaluate ns_from_boot.
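
The decision can be written as a small predicate. This is a sketch under
assumptions: need_ns_from_boot() is a hypothetical helper, the threshold
is passed in as a parameter rather than taken from the driver, and the
"ndelay greater than zero" check is an assumption about when an ndelay is
in effect.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Decide whether the boot-time timestamp has to be sampled.
 * inclusive_max_ns models the INCLUSIVE_TIMING_MAX_NS threshold.
 */
static bool need_ns_from_boot(bool polled, int64_t ndelay_ns,
                              int64_t inclusive_max_ns)
{
    assert(inclusive_max_ns > 0);
    if (polled)
        return true; /* polled requests always need it */
    /* hrtimer path: short delays are timed inclusively (assumed ndelay > 0) */
    return ndelay_ns > 0 && ndelay_ns < inclusive_max_ns;
}
```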
Cathy Avery [Wed, 27 Nov 2024 18:13:24 +0000 (13:13 -0500)]
scsi: storvsc: Do not flag MAINTENANCE_IN return of SRB_STATUS_DATA_OVERRUN as an error
This partially reverts commit 812fe6420a6e ("scsi: storvsc: Handle
additional SRB status values").
Hyper-V does not support MAINTENANCE_IN, resulting in FC passthrough
returning the SRB_STATUS_DATA_OVERRUN value. Now that
SRB_STATUS_DATA_OVERRUN is treated as an error, multipath ALUA paths go
into a faulty state as multipath ALUA submits RTPG commands via
MAINTENANCE_IN.
Make MAINTENANCE_IN return success to avoid the error path as is
currently done with INQUIRY and MODE_SENSE.
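
The special-casing can be sketched as below. The opcode values are the
standard SCSI ones; enum fake_srb_status, struct fake_req and
fixup_status() are hypothetical, and the model simplifies the driver by
ignoring any other conditions it may check.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Standard SCSI operation codes. */
#define OP_INQUIRY        0x12
#define OP_MODE_SENSE     0x1a
#define OP_MAINTENANCE_IN 0xa3

/* Hypothetical, reduced set of host status values. */
enum fake_srb_status { SRB_OK, SRB_DATA_OVERRUN, SRB_OTHER_ERROR };

struct fake_req {
    uint8_t opcode;
    enum fake_srb_status status;
};

/* Report success for commands the host is known not to support. */
static void fixup_status(struct fake_req *r)
{
    assert(r != NULL);
    switch (r->opcode) {
    case OP_INQUIRY:
    case OP_MODE_SENSE:
    case OP_MAINTENANCE_IN:
        r->status = SRB_OK; /* keep these off the error path */
        break;
    default:
        break;              /* other commands keep their status */
    }
}
```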
Suggested-by: Michael Kelley <mhklinux@outlook.com> Signed-off-by: Cathy Avery <cavery@redhat.com> Link: https://lore.kernel.org/r/20241127181324.3318443-1-cavery@redhat.com Reviewed-by: Michael Kelley <mhklinux@outlook.com> Reviewed-by: Ewan D. Milne <emilne@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Peter Wang [Fri, 22 Nov 2024 02:49:43 +0000 (10:49 +0800)]
scsi: ufs: core: Add missing post notify for power mode change
When the power mode change is successful but the power mode hasn't
actually changed, the post notification was missed. Similar to the
approach with hibernate/clock scale/hce enable, having pre/post
notifications in the same function will make it easier to maintain.
Additionally, supplement the description of power parameters for the
pwr_change_notify callback.
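The pre/post pairing can be illustrated with a small userspace model.
All names below (change_power_mode(), struct pwr_mode, count_notify())
are hypothetical stand-ins and do not reproduce the ufshcd code; the
point is only that both notifications live in one function, so the path
where nothing changes cannot skip the post notification.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum notify_stage { PRE_CHANGE, POST_CHANGE };

struct pwr_mode {
	int gear;
	int lanes;
};

/* Hypothetical callback type, loosely modelled on pwr_change_notify. */
typedef void (*notify_fn)(enum notify_stage stage, const struct pwr_mode *req);

static int pre_count, post_count;

/* Sample callback that only counts how often each stage is reported. */
static void count_notify(enum notify_stage stage, const struct pwr_mode *req)
{
	(void)req;
	if (stage == PRE_CHANGE)
		pre_count++;
	else
		post_count++;
}

static int change_power_mode(struct pwr_mode *cur, const struct pwr_mode *req,
			     notify_fn notify)
{
	assert(cur != NULL && req != NULL && notify != NULL);

	notify(PRE_CHANGE, req);

	/* Only touch the (modelled) hardware when the mode really differs. */
	if (cur->gear != req->gear || cur->lanes != req->lanes)
		*cur = *req;

	/* Sent on every successful path, including the no-op one. */
	notify(POST_CHANGE, req);
	return 0;
}
```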
Fixes: 7eb584db73be ("ufs: refactor configuring power mode") Cc: stable@vger.kernel.org #6.11.x Signed-off-by: Peter Wang <peter.wang@mediatek.com> Link: https://lore.kernel.org/r/20241122024943.30589-1-peter.wang@mediatek.com Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Suraj Sonawane [Wed, 20 Nov 2024 12:59:44 +0000 (18:29 +0530)]
scsi: sg: Fix slab-use-after-free read in sg_release()
Fix a use-after-free bug in sg_release(), detected by syzbot with KASAN:
BUG: KASAN: slab-use-after-free in lock_release+0x151/0xa30
kernel/locking/lockdep.c:5838
__mutex_unlock_slowpath+0xe2/0x750 kernel/locking/mutex.c:912
sg_release+0x1f4/0x2e0 drivers/scsi/sg.c:407
In sg_release(), the function kref_put(&sfp->f_ref, sg_remove_sfp) is
called before releasing the open_rel_lock mutex. The kref_put() call may
decrement the reference count of sfp to zero, triggering its cleanup
through sg_remove_sfp(). This cleanup includes scheduling deferred work
via sg_remove_sfp_usercontext(), which ultimately frees sfp.
After kref_put(), sg_release() continues to unlock open_rel_lock and may
reference sfp or sdp. If sfp has already been freed, this results in a
slab-use-after-free error.
Move the kref_put(&sfp->f_ref, sg_remove_sfp) call after unlocking the
open_rel_lock mutex. This ensures:
- No references to sfp or sdp occur after the reference count is
decremented.
- Cleanup functions such as sg_remove_sfp() and
sg_remove_sfp_usercontext() can safely execute without impacting the
mutex handling in sg_release().
The fix has been tested and validated by syzbot. This patch closes the
bug reported at the following syzkaller link and ensures proper
sequencing of resource cleanup and mutex operations, eliminating the
risk of use-after-free errors in sg_release().
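The ordering can be sketched with a toy userspace model. The types and
helpers here (struct file_model, file_put(), release_model()) are
hypothetical, the mutex is reduced to a boolean, and unlike the real
sdp the modelled device is owned by the caller. Only the sequence
matters: unlock first, drop the reference last.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Stand-in for the device and its open_rel_lock mutex. */
struct device_model {
	bool open_rel_locked;
};

/* Stand-in for sfp: a reference-counted per-open object. */
struct file_model {
	int refs;
	struct device_model *parent;
};

/* Drop one reference; free the object and return true at zero. */
static bool file_put(struct file_model *f)
{
	assert(f->refs > 0);
	if (--f->refs > 0)
		return false;
	free(f);
	return true;
}

static bool release_model(struct file_model *f)
{
	struct device_model *d = f->parent;

	d->open_rel_locked = true;	/* models mutex_lock() */
	/* ... bookkeeping that needs the lock ... */
	d->open_rel_locked = false;	/* models mutex_unlock() */

	/* Last step: f must not be dereferenced after this call. */
	return file_put(f);
}
```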
Reported-by: syzbot+7efb5850a17ba6ce098b@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=7efb5850a17ba6ce098b Tested-by: syzbot+7efb5850a17ba6ce098b@syzkaller.appspotmail.com Fixes: cc833acbee9d ("sg: O_EXCL and other lock handling") Signed-off-by: Suraj Sonawane <surajsonawane0215@gmail.com> Link: https://lore.kernel.org/r/20241120125944.88095-1-surajsonawane0215@gmail.com Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Quinn Tran [Fri, 15 Nov 2024 13:03:11 +0000 (18:33 +0530)]
scsi: qla2xxx: Fix NVMe and NPIV connect issue
The NVMe controller fails to send the connect command because it cannot
locate the hw context buffer for NVMe queue 0 (blk_mq_hw_ctx,
hctx_idx=0). The cause is that the NPIV host did not initialize the
vha->irq_offset field. This field is passed to blk-mq
(blk_mq_pci_map_queues) to help locate the beginning of the I/O queues,
which in turn helps locate NVMe queue 0.
Initialize this field to allow NVMe to work properly with an NPIV host.
Saurav Kashyap [Fri, 15 Nov 2024 13:03:10 +0000 (18:33 +0530)]
scsi: qla2xxx: Remove check req_sg_cnt should be equal to rsp_sg_cnt
Firmware supports multiple sg_cnt for request and response for CT
commands, so remove the redundant check. The existing check requires the
sg_cnt for request and response to be the same. This is not required, as
the driver and FW have code to handle multiple and different sg_cnt on
request and response.
Merge patch series "Untie the host lock entanglement - part 2"
Avri Altman <avri.altman@wdc.com> says:
Here is the 2nd part in the sequel, watering down the scsi host lock
usage in the ufs driver. This work is motivated by a comment made by
Bart [1] about the abuse of the scsi host lock in the ufs driver. Its
precursor [2] removed the host lock around some of the host register
accesses.
This part replaces the scsi host lock by dedicated locks serializing
access to the clock gating and clock scaling members.
Changes compared to v4:
- split patch 1 into 2 parts (Bart)
- use scoped_guard() for the host_lock as well (Bart)
- remove irrelevant comment and use lockdep_assert_held instead (Bart)
- improve @lock documentation (Bart)
Changes compared to v3:
- Keep the host lock when checking ufshcd_state (Bean)
Changes compared to v2:
- Use clang-format to fix formatting (Bart)
- Use guard() in ufshcd_clkgate_enable_store (Bart)
- Elaborate commit log (Bart)
Changes compared to v1:
- use the guard() & scoped_guard() macros (Bart)
- re-order struct ufs_clk_scaling and struct ufs_clk_gating (Bart)
Avri Altman [Sun, 24 Nov 2024 07:08:08 +0000 (09:08 +0200)]
scsi: ufs: core: Introduce a new clock_scaling lock
Introduce a new clock scaling lock to serialize access to some of the clock
scaling members instead of the host_lock. Here also, simplify the code with
the guard() macro and co.
Avri Altman [Sun, 24 Nov 2024 07:08:07 +0000 (09:08 +0200)]
scsi: ufs: core: Introduce a new clock_gating lock
Introduce a new clock gating lock to serialize access to some of the clock
gating members instead of the host_lock.
While at it, simplify the code with the guard() macro and co for automatic
cleanup of the new lock. There are some explicit
spin_lock_irqsave()/spin_unlock_irqrestore() snaking instances I left
behind because I couldn't make heads or tails of them.
Additionally, move the trace_ufshcd_clk_gating() call out of the
region protected by the lock as it doesn't need protection.
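For readers unfamiliar with scope-based locking, the idea can be imitated
in userspace with the GCC/Clang cleanup attribute. TOY_GUARD, struct
toy_lock and struct gating_model are hypothetical names invented for this
sketch; it is not the kernel's guard() implementation and the toy lock
is just a boolean.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy lock; a real driver would use a spinlock. */
struct toy_lock {
	bool held;
};

static void toy_lock_acquire(struct toy_lock *l)
{
	assert(!l->held);
	l->held = true;
}

/* Runs automatically when the guard variable goes out of scope. */
static void toy_unlock_cleanup(struct toy_lock **lp)
{
	(*lp)->held = false;
}

/* Hypothetical macro imitating the scope-based guard pattern. */
#define TOY_GUARD(lockp)						\
	struct toy_lock *toy_guard_					\
		__attribute__((cleanup(toy_unlock_cleanup))) = (lockp);	\
	toy_lock_acquire(toy_guard_)

struct gating_model {
	struct toy_lock lock;
	int active_reqs;
	bool is_enabled;
};

static bool gating_try_hold(struct gating_model *g)
{
	TOY_GUARD(&g->lock);

	if (!g->is_enabled)
		return false;	/* lock dropped automatically here */

	g->active_reqs++;
	return true;		/* and here */
}
```

Every return path releases the lock without an explicit unlock call,
which is what removes the save/restore boilerplate from the callers.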
Quinn Tran [Fri, 15 Nov 2024 13:03:08 +0000 (18:33 +0530)]
scsi: qla2xxx: Fix use after free on unload
A system crash is observed with a stack trace warning of use after
free. There are 2 signals that tell dpc_thread to terminate (the
UNLOADING flag and kthread_stop).
If dpc_thread happens to be running when the UNLOADING flag is set and
sees the flag, it exits and cleans itself up. When kthread_stop is then
called for final cleanup, this causes a use after free.
Remove the UNLOADING signal as a way to terminate dpc_thread. Use
kthread_stop as the main signal to exit dpc_thread.
Quinn Tran [Fri, 15 Nov 2024 13:03:07 +0000 (18:33 +0530)]
scsi: qla2xxx: Fix abort in bsg timeout
The current abort of a bsg on timeout prematurely clears
outstanding_cmds[]. The abort does not allow FW to return the IOCB/SRB.
In addition, bsg_job_done() is not called to return the BSG (i.e. a leak).
Abort the outstanding bsg/SRB and wait for the completion. The
completion IOCB will wake up the bsg_timeout thread. If the abort is not
successful, then the driver will forcibly call bsg_job_done() and free
the srb.
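The two outcomes can be modelled as a small state sketch. The names
(struct bsg_model, on_timeout(), on_completion(), finish_job()) are
hypothetical and the model ignores locking and the actual wait; it only
shows that the job is finished exactly once on either path.

```c
#include <assert.h>
#include <stdbool.h>

struct bsg_model {
	int done_calls;		/* how often the job was returned */
	bool srb_freed;
};

/* Return the job to its submitter and free the request, once. */
static void finish_job(struct bsg_model *j)
{
	assert(!j->srb_freed);
	j->done_calls++;
	j->srb_freed = true;
}

/* Completion path: firmware returned the aborted request. */
static void on_completion(struct bsg_model *j)
{
	finish_job(j);
}

/*
 * Timeout path. Returns true when the caller must wait for the
 * completion; on a failed abort the job is finished right here.
 */
static bool on_timeout(struct bsg_model *j, bool abort_ok)
{
	if (abort_ok)
		return true;

	finish_job(j);
	return false;
}
```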
Err Inject:
- qaucli -z
- assign CT Passthru IOCB's NportHandle with another initiator
nport handle to trigger timeout. Remote port will drop CT request.
- bsg_job_done is properly called as part of cleanup
Ranjan Kumar [Sun, 10 Nov 2024 19:44:04 +0000 (01:14 +0530)]
scsi: mpi3mr: Handling of fault code for insufficient power
Before retrying initialization, check and abort if the fault code
indicates insufficient power. Also mark the controller as unrecoverable
instead of issuing a reset in the watchdog timer if the fault code
indicates insufficient power.
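The two decision points can be summarized in a short model. The enum
values and function names below are hypothetical placeholders and do not
correspond to the controller's real fault codes or to driver functions.

```c
#include <assert.h>

/* Hypothetical fault codes; real values come from the controller. */
enum fault_code {
	FAULT_NONE,
	FAULT_OTHER,
	FAULT_INSUFFICIENT_POWER,
};

enum action {
	ACT_NONE,
	ACT_RETRY_INIT,
	ACT_ABORT_INIT,
	ACT_RESET,
	ACT_MARK_UNRECOVERABLE,
};

/* Decision taken before retrying a failed initialization. */
static enum action init_retry_action(enum fault_code fc)
{
	if (fc == FAULT_INSUFFICIENT_POWER)
		return ACT_ABORT_INIT;	/* retrying cannot help */
	return ACT_RETRY_INIT;
}

/* Decision taken by the watchdog when it inspects the fault state. */
static enum action watchdog_action(enum fault_code fc)
{
	switch (fc) {
	case FAULT_NONE:
		return ACT_NONE;
	case FAULT_INSUFFICIENT_POWER:
		return ACT_MARK_UNRECOVERABLE;	/* no reset is issued */
	default:
		return ACT_RESET;
	}
}
```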
Ranjan Kumar [Sun, 10 Nov 2024 19:44:02 +0000 (01:14 +0530)]
scsi: mpi3mr: Fix corrupt config pages PHY state is switched in sysfs
The driver, through the SAS transport, exposes a sysfs interface to
enable/disable PHYs in a controller/expander setup. When multiple PHYs
are disabled and enabled in rapid succession, the persistent and current
config pages related to SAS IO unit/SAS Expander pages could get
corrupted.