Douglas Gilbert [Thu, 15 Apr 2021 01:50:31 +0000 (21:50 -0400)]
scsi: scsi_debug: Fix cmd_per_lun, set to max_queue
Make sure that the cmd_per_lun value placed in the host template never
exceeds the can_queue value. If the max_queue driver parameter is not
specified then both cmd_per_lun and can_queue are set to CAN_QUEUE.
CAN_QUEUE is a compile time constant and is used to dimension an array to
hold queued requests. If the max_queue driver parameter is given it must
be less than or equal to CAN_QUEUE and if so, the host template values are
adjusted.
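The clamping rule described above can be sketched as follows. This is a simplified user-space illustration, not the driver code: the struct host_limits, the function apply_max_queue() and the value 128 for CAN_QUEUE are assumptions made for the example.

```c
#include <stdbool.h>

/* Illustrative value only; in the driver CAN_QUEUE is a compile-time constant. */
#define CAN_QUEUE 128

/* Hypothetical stand-in for the two host template fields discussed above. */
struct host_limits {
    int can_queue;
    int cmd_per_lun;
};

/*
 * max_queue <= 0 models "parameter not given". Returns false when the
 * requested value is out of range; the limits then keep the defaults.
 */
bool apply_max_queue(struct host_limits *h, int max_queue)
{
    h->can_queue = CAN_QUEUE;
    h->cmd_per_lun = CAN_QUEUE;

    if (max_queue <= 0)
        return true;            /* not specified: keep defaults */
    if (max_queue > CAN_QUEUE)
        return false;           /* would overrun the request array */

    h->can_queue = max_queue;
    h->cmd_per_lun = max_queue; /* never exceeds can_queue */
    return true;
}
```

In this sketch cmd_per_lun is always written together with can_queue, so it can never end up larger.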
Remove undocumented code that allowed queue_depth to exceed CAN_QUEUE and
cause stack full type errors. There is a documented way to do that with
every_nth and
Tweak some formatting, and add a suggestion to the "trim poll_queues"
warning.
Link: https://lore.kernel.org/r/20210415015031.607153-1-dgilbert@interlog.com Reported-by: Kashyap Desai <kashyap.desai@broadcom.com> Reviewed-by: John Garry <john.garry@huawei.com> Signed-off-by: Douglas Gilbert <dgilbert@interlog.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Can Guo [Mon, 26 Apr 2021 03:48:40 +0000 (20:48 -0700)]
scsi: ufs: core: Narrow down fast path in system suspend path
If spm_lvl is set to 0 or 1, when system suspend starts and the HBA is
runtime active, system suspend may just bail without doing anything (the
fast path), leaving other contexts still running, e.g., clock gating and
clock scaling. When system resume starts, concurrency can happen
between ufshcd_resume() and these contexts, leading to various stability
issues.
Add a check against the HBA's runtime state and allow the fast path only if
the HBA is runtime suspended; otherwise let system suspend go ahead and call
ufshcd_suspend(). This guarantees that these contexts are stopped by
either runtime suspend or system suspend.
Can Guo [Mon, 26 Apr 2021 03:48:39 +0000 (20:48 -0700)]
scsi: ufs: core: Cancel rpm_dev_flush_recheck_work during system suspend
During ufs system suspend, leaving rpm_dev_flush_recheck_work running or
pending is risky because concurrency may happen between system
suspend/resume and runtime resume routine. Fix this by cancelling
rpm_dev_flush_recheck_work synchronously during system suspend.
Link: https://lore.kernel.org/r/1619408921-30426-3-git-send-email-cang@codeaurora.org Fixes: 51dd905bd2f6 ("scsi: ufs: Fix WriteBooster flush during runtime suspend") Reviewed-by: Daejun Park <daejun7.park@samsung.com> Signed-off-by: Can Guo <cang@codeaurora.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Can Guo [Mon, 26 Apr 2021 03:48:38 +0000 (20:48 -0700)]
scsi: ufs: core: Do not put UFS power into LPM if link is broken
During resume, if link is broken due to AH8 failure, make sure
ufshcd_resume() does not put UFS power back into LPM.
Link: https://lore.kernel.org/r/1619408921-30426-2-git-send-email-cang@codeaurora.org Fixes: 4db7a2360597 ("scsi: ufs: Fix concurrency of error handler and other error recovery paths") Reviewed-by: Daejun Park <daejun7.park@samsung.com> Signed-off-by: Can Guo <cang@codeaurora.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
If, for some reason, the initiator in P2P mode does not send PRLI and the
target port's WWPN is less than the initiator's, the target changes the
discovery state to DSC_GNL. When GNL completes, the target sends PRLI to
the initiator.
Usually the initiator in P2P mode always sends PRLI. We caught this issue
on Linux stable v5.4.6 https://www.spinics.net/lists/stable/msg458515.html.
Fix this particular corner case in the behaviour of the P2P mode target
login state machine.
Link: https://lore.kernel.org/r/20210422153414.4022-1-a.kovaleva@yadro.com Fixes: a9ed06d4e640 ("scsi: qla2xxx: Allow PLOGI in target mode") Reviewed-by: Roman Bolshakov <r.bolshakov@yadro.com> Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Anastasia Kovaleva <a.kovaleva@yadro.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Bodo Stroesser [Fri, 23 Apr 2021 15:01:23 +0000 (17:01 +0200)]
scsi: target: tcmu: Return from tcmu_handle_completions() if cmd_id not found
If tcmu_handle_completions() finds an invalid cmd_id while looping over cmd
responses from userspace, it sets TCMU_DEV_BIT_BROKEN and breaks out of the
loop. This means that it still does further handling for the tcmu device.
Skip that handling by replacing 'break' with 'return'.
Additionally change the return type of tcmu_handle_completions() from
unsigned int to bool, since the value used in return already is bool.
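The difference between 'break' and 'return' can be illustrated with a small stand-alone sketch. The names used here (handle_completions(), struct dev_state, the id range check) are hypothetical and only model the control flow, not the tcmu code.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical device state: 'broken' models TCMU_DEV_BIT_BROKEN. */
struct dev_state {
    bool broken;
    int  post_loop_runs;   /* counts how often the post-loop handling ran */
};

/*
 * Walk a list of command ids; ids outside [0, max_id) are treated as
 * invalid. Returning from inside the loop skips the post-loop handling,
 * which a plain 'break' would still reach.
 */
bool handle_completions(struct dev_state *dev, const int *ids, size_t n,
                        int max_id)
{
    bool handled = false;

    for (size_t i = 0; i < n; i++) {
        if (ids[i] < 0 || ids[i] >= max_id) {
            dev->broken = true;
            return false;      /* not 'break': skip further handling */
        }
        handled = true;
    }

    dev->post_loop_runs++;     /* further handling for a healthy device */
    return handled;
}
```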
James Smart [Wed, 21 Apr 2021 23:45:11 +0000 (16:45 -0700)]
scsi: lpfc: Fix bad memory access during VPD DUMP mailbox command
The dump command for reading a region passes a requested read length
specified in words (4-byte units). The response overwrites the same field
with the actual number of bytes read.
The mailbox handler for DUMP which reads VPD data (region 23) is treating
the response field as if it were still a word_cnt, thus multiplying it by 4
to set the read's "length". Given the read value was calculated based on
the size of the read buffer, the longer response length runs off the end of
the buffer.
Fix by reworking the code to use the response field as a byte count.
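The unit mismatch can be sketched in a few lines. This is an illustration under assumptions, not the lpfc code: the function names are hypothetical, and the clamp to the buffer size in dump_len_fixed() is an extra defensive step added for the example.

```c
#include <stddef.h>
#include <stdint.h>

#define WORD_SIZE 4u   /* the request length is given in 4-byte words */

/* Request side: buffer size in bytes -> number of words to ask for. */
uint32_t dump_request_words(size_t buf_bytes)
{
    return (uint32_t)(buf_bytes / WORD_SIZE);
}

/* Buggy interpretation: treats the response as words and scales it again. */
uint32_t dump_len_buggy(uint32_t resp_field)
{
    return resp_field * WORD_SIZE;
}

/* Fixed interpretation: response is already a byte count; cap to buffer. */
uint32_t dump_len_fixed(uint32_t resp_field, size_t buf_bytes)
{
    return resp_field > buf_bytes ? (uint32_t)buf_bytes : resp_field;
}
```

With a 256-byte buffer the request asks for 64 words. A full response reports 256 bytes, which the buggy interpretation turns into 1024, four times the buffer size.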
Link: https://lore.kernel.org/r/20210421234511.102206-1-jsmart2021@gmail.com Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
James Smart [Wed, 21 Apr 2021 23:44:48 +0000 (16:44 -0700)]
scsi: lpfc: Fix DMA virtual address ptr assignment in bsg
The lpfc_bsg_ct_unsol_event() routine assigns a ct_request to the wrong
structure address, resulting in a bad address that causes bsg related
timeouts.
Correct the ct_request assignment to use the kernel virtual buffer address
(not the control structure address).
Link: https://lore.kernel.org/r/20210421234448.102132-1-jsmart2021@gmail.com Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
James Smart [Wed, 21 Apr 2021 23:44:33 +0000 (16:44 -0700)]
scsi: lpfc: Fix illegal memory access on Abort IOCBs
In devloss timer handler and in backend calls to terminate remote port I/O,
there is logic to walk through all active IOCBs and validate them to
potentially trigger an abort request. This logic is causing illegal memory
accesses which lead to a crash. Abort IOCBs, which may be on the list, do
not have an associated lpfc_io_buf struct. The driver tries to map an
lpfc_io_buf struct onto the IOCB, which results in a bogus address and
thus the crash.
Fix by skipping over ABORT IOCBs (CLOSE IOCBs are ABORTS that don't send
ABTS) in the IOCB scan logic.
Link: https://lore.kernel.org/r/20210421234433.102079-1-jsmart2021@gmail.com Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Ming Lei [Wed, 21 Apr 2021 15:45:26 +0000 (23:45 +0800)]
scsi: blk-mq: Fix build warning when making htmldocs
Fixes the following warning when running 'make htmldocs':
include/linux/blk-mq.h:395: warning: Function parameter or member
'set_rq_budget_token' not described in 'blk_mq_ops'
include/linux/blk-mq.h:395: warning: Function parameter or member
'get_rq_budget_token' not described in 'blk_mq_ops'
[mkp: added warning messages]
Link: https://lore.kernel.org/r/20210421154526.1954174-1-ming.lei@redhat.com Fixes: d022d18c045f ("scsi: blk-mq: Add callbacks for storing & retrieving budget token") Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Do not print tg_pt_gp->tg_pt_gp_valid_id if we already know that it is zero.
Link: https://lore.kernel.org/r/20210415220826.29438-20-bvanassche@acm.org Cc: Mike Christie <michael.christie@oracle.com> Reviewed-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Use format specifier '%u' to format the u32 data type instead of '%hu'.
Link: https://lore.kernel.org/r/20210415220826.29438-19-bvanassche@acm.org Cc: Mike Christie <michael.christie@oracle.com> Reviewed-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
scsi: sd: Introduce a new local variable in sd_check_events()
Instead of using 'retval' to represent first a SCSI status and later
whether or not a disk change event occurred, introduce a new variable for
the latter purpose.
Link: https://lore.kernel.org/r/20210415220826.29438-17-bvanassche@acm.org Cc: Christoph Hellwig <hch@lst.de> Cc: Johannes Thumshirn <johannes.thumshirn@wdc.com> Cc: Hannes Reinecke <hare@suse.de> Cc: Daniel Wagner <dwagner@suse.de> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The dc395x driver is one of the two drivers that pass a u8 argument to
status_byte() instead of an s32 argument. Open-code status_byte() in
preparation for changing SCSI status values into a structure.
The 53c700 driver is one of the two drivers that pass a u8 argument to
status_byte() instead of an s32 argument. Open-code status_byte() in
preparation for changing SCSI status values into a structure.
drivers/scsi/mpt3sas/mpt3sas_base.c:5430: warning: Excess function parameter 'ct' description in '_base_allocate_pcie_sgl_pool'
drivers/scsi/mpt3sas/mpt3sas_base.c:5493: warning: Excess function parameter 'ctr' description in '_base_allocate_chain_dma_pool'
Link: https://lore.kernel.org/r/20210415220826.29438-10-bvanassche@acm.org Fixes: d6adc251dd2f ("scsi: mpt3sas: Force PCIe scatterlist allocations to be within same 4 GB region") Fixes: 7dd847dae1c4 ("scsi: mpt3sas: Force chain buffer allocations to be within same 4 GB region") Cc: Sathya Prakash <sathya.prakash@broadcom.com> Cc: Sreekanth Reddy <sreekanth.reddy@broadcom.com> Cc: Suganath Prabu Subramani <suganath-prabu.subramani@broadcom.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Since the 'mfs' member has been declared as 'u32' in include/scsi/libfc.h,
use the %u format specifier instead of %hu. This patch fixes the following
clang compiler warning:
warning: format specifies type 'unsigned short' but the argument has type 'u32' (aka 'unsigned int') [-Wformat]
    "lport->mfs:%hu\n", mfs, lport->mfs);
                ~~~          ^~~~~~~~~~
                %u
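A user-space sketch of the corrected format string, assuming a hypothetical helper format_mfs() and using uint32_t in place of the kernel's u32:

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Format a 32-bit frame size into buf. "%u" matches unsigned int, which
 * is what the value is cast to here; "%hu" would describe an unsigned
 * short and trigger -Wformat.
 */
int format_mfs(char *buf, size_t len, uint32_t mfs)
{
    return snprintf(buf, len, "lport->mfs:%u\n", (unsigned int)mfs);
}
```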
Improve readability of the code in the SCSI core by introducing an
enumeration type for the values used internally that decide how to continue
processing a SCSI command. The eh_*_handler return values have not been
changed because that would involve modifying all SCSI drivers.
The output of the following command has been inspected to verify that no
out-of-range values are assigned to a variable of type enum
scsi_disposition:
KCFLAGS=-Wassign-enum make CC=clang W=1 drivers/scsi/
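To illustrate why an enumeration type helps here, consider the following sketch. The enum cmd_disposition and the function command_is_finished() are hypothetical; the kernel's enum scsi_disposition has more values than the three shown.

```c
#include <stdbool.h>

/* Hypothetical enumeration used only for this illustration. */
enum cmd_disposition {
    DISP_NEEDS_RETRY,
    DISP_SUCCESS,
    DISP_FAILED,
};

/* With an enum parameter the compiler can warn about unhandled cases. */
bool command_is_finished(enum cmd_disposition d)
{
    switch (d) {
    case DISP_SUCCESS:
    case DISP_FAILED:
        return true;
    case DISP_NEEDS_RETRY:
        return false;
    }
    return false;   /* not reached for valid enum values */
}
```

Compared with a plain int, the type documents which values a function may return, and tools such as clang's -Wassign-enum can flag assignments of out-of-range constants.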
Link: https://lore.kernel.org/r/20210415220826.29438-6-bvanassche@acm.org Cc: Christoph Hellwig <hch@lst.de> Cc: Johannes Thumshirn <johannes.thumshirn@wdc.com> Cc: Hannes Reinecke <hare@suse.de> Cc: Daniel Wagner <dwagner@suse.de> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
scsi: core: Modify the scsi_send_eh_cmnd() return value for the SDEV_BLOCK case
The comment above scsi_send_eh_cmnd() says: "Returns SUCCESS or FAILED or
NEEDS_RETRY". This patch makes all values returned by scsi_send_eh_cmnd()
match the documentation of this function. This change does not affect the
behavior of scsi_eh_tur() nor of scsi_eh_try_stu() nor of the
scsi_request_sense() callers.
See also commit bbe9fb0d04b9 ("scsi: Avoid that .queuecommand() gets called
for a blocked SCSI device"; v5.3).
Link: https://lore.kernel.org/r/20210415220826.29438-5-bvanassche@acm.org Cc: Christoph Hellwig <hch@lst.de> Cc: Johannes Thumshirn <johannes.thumshirn@wdc.com> Cc: Hannes Reinecke <hare@suse.de> Cc: Daniel Wagner <dwagner@suse.de> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
scsi: core: Rename scsi_softirq_done() into scsi_complete()
Commit 320ae51feed5 ("blk-mq: new multi-queue block IO queueing mechanism";
v3.13) introduced a code path that calls the blk-mq completion function
from interrupt context. scsi-mq was introduced by commit d285203cf647
("scsi: add support for a blk-mq based I/O path."; v3.17).
Since the introduction of scsi-mq, scsi_softirq_done() can be called from
interrupt context. That made the name of the function misleading, so rename
it to scsi_complete().
Link: https://lore.kernel.org/r/20210415220826.29438-4-bvanassche@acm.org Cc: Christoph Hellwig <hch@lst.de> Cc: Johannes Thumshirn <johannes.thumshirn@wdc.com> Cc: Hannes Reinecke <hare@suse.de> Cc: Daniel Wagner <dwagner@suse.de> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
scsi_device.sdev_target is used in more code than the single_lun code,
hence remove the comment next to the definition of the sdev_target member.
Link: https://lore.kernel.org/r/20210415220826.29438-3-bvanassche@acm.org Cc: Christoph Hellwig <hch@lst.de> Cc: Johannes Thumshirn <johannes.thumshirn@wdc.com> Cc: Hannes Reinecke <hare@suse.de> Cc: Daniel Wagner <dwagner@suse.de> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
scsi: core: Make the scsi_alloc_sgtables() documentation more accurate
The current scsi_alloc_sgtables() documentation does not accurately explain
what this function does. Hence improve the documentation of this function.
Link: https://lore.kernel.org/r/20210415220826.29438-2-bvanassche@acm.org Cc: Christoph Hellwig <hch@lst.de> Cc: Johannes Thumshirn <johannes.thumshirn@wdc.com> Cc: Hannes Reinecke <hare@suse.de> Cc: Daniel Wagner <dwagner@suse.de> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Viswas G [Thu, 15 Apr 2021 10:33:52 +0000 (16:03 +0530)]
scsi: pm80xx: Remove global lock from outbound queue processing
Introduce a spin lock for the outbound queue. With this, the driver need
not acquire the HBA global lock for outbound queue processing.
Link: https://lore.kernel.org/r/20210415103352.3580-9-Viswas.G@microchip.com Acked-by: Jack Wang <jinpu.wang@cloud.ionos.com> Signed-off-by: Viswas G <Viswas.G@microchip.com> Signed-off-by: Ruksar Devadi <Ruksar.devadi@microchip.com> Signed-off-by: Ashokkumar N <Ashokkumar.N@microchip.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Viswas G [Thu, 15 Apr 2021 10:33:51 +0000 (16:03 +0530)]
scsi: pm80xx: Reset PI and CI memory during re-initialization
The producer index (PI) and the consumer index (CI) for the outbound queue
are in DMA memory. During resume(), stale PI and CI values will lead to
unexpected behavior. These values should be reset to 0 during driver
reinitialization.
Link: https://lore.kernel.org/r/20210415103352.3580-8-Viswas.G@microchip.com Acked-by: Jack Wang <jinpu.wang@cloud.ionos.com> Signed-off-by: Viswas G <Viswas.G@microchip.com> Signed-off-by: Ruksar Devadi <Ruksar.devadi@microchip.com> Signed-off-by: Ashokkumar N <Ashokkumar.N@microchip.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
scsi: pm80xx: Completing pending I/O after fatal error
When the controller runs into a fatal error, I/Os get stuck with no
response. Define a handler event to complete the pending I/Os (SAS task and
internal task) and also perform the cleanup for the drives.
Link: https://lore.kernel.org/r/20210415103352.3580-7-Viswas.G@microchip.com Acked-by: Jack Wang <jinpu.wang@cloud.ionos.com> Signed-off-by: Ruksar Devadi <Ruksar.devadi@microchip.com> Signed-off-by: Viswas G <Viswas.G@microchip.com> Signed-off-by: Ashokkumar N <Ashokkumar.N@microchip.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
scsi: pm80xx: Add sysfs attribute to track iop1 count
A new sysfs variable 'ctl_iop1_count' is being introduced that tells if
the controller is alive by indicating controller ticks. If on a subsequent
run we see the ticks changing, that indicates that the controller is not
dead.
Using the 'ctl_iop1_count' sysfs variable we can see ticks incrementing:
scsi: pm80xx: Add sysfs attribute to track iop0 count
A new sysfs variable 'ctl_iop0_count' is being introduced that tells if
the controller is alive by indicating controller ticks. If on a subsequent
run we see the ticks changing, that indicates that the controller is not
dead.
Using the 'ctl_iop0_count' sysfs variable we can see ticks incrementing:
scsi: pm80xx: Add sysfs attribute to track RAAE count
A new sysfs variable 'ctl_raae_count' is being introduced that tells if the
controller is alive by indicating controller ticks. If on a subsequent run
we see the ticks changing in the RAAE count, that indicates that the
controller is not dead.
Using the 'ctl_raae_count' sysfs variable we can see ticks incrementing:
scsi: zfcp: Lift Request Queue tasklet & timer from qdio
The qdio layer currently provides its own infrastructure to scan for
Request Queue completions & to report them to the device driver. This
comes with several drawbacks - having an async tasklet & timer construct in
qdio introduces additional lifetime complexity, and makes it harder to
integrate them with the rest of the device driver. The timeouts are also
currently hard-coded, and can't be tweaked without affecting other qdio
drivers (i.e. qeth).
But due to recent enhancements to the qdio layer, zfcp can actually take
full control of the Request Queue completion processing. It merely needs to
opt-out from the qdio layer mechanisms by setting the scan_threshold to 0,
and then use qdio_inspect_queue() to scan for completions.
So re-implement the tasklet & timer mechanism in zfcp, while initially
copying the scan conditions from qdio's handle_outbound() and
qdio_outbound_tasklet(). One minor behavioural change is that
zfcp_qdio_send() will unconditionally reduce the timeout to 1 HZ, rather
than leaving it at 10 HZ if it was last armed by the tasklet. This just
makes things more consistent. Also note that we can drop a lot of the
accumulated cruft in qdio_outbound_tasklet(), as zfcp doesn't even use PCI
interrupt requests any longer.
This also slightly touches the Response Queue processing, as
qdio_get_next_buffers() will no longer implicitly scan for Request Queue
completions. So complete the migration to qdio_inspect_queue() here as well
and make the tasklet_schedule() visible.
Place the put_device() call after device_unregister() in both
zfcp_unit_remove() and zfcp_sysfs_port_remove_store() to make it more
natural. put_device() ought to be the last time we touch the object in both
functions.
Add comments after put_device() to make code clearer.
scsi: zfcp: Fix sysfs roll-back on error in zfcp_adapter_enqueue()
When zfcp_adapter_enqueue() fails to create the zfcp_sysfs_adapter_attrs
group, it calls zfcp_adapter_unregister() to tear down the adapter state
again. This then unconditionally attempts to remove the
zfcp_sysfs_adapter_attrs group, resulting in a "group not found" WARN from
sysfs code.
Avoid this by copying most of zfcp_adapter_unregister() into the error
path, allowing for more fine-grained roll-back. Then skip the sysfs
tear-down steps if we haven't progressed this far in the initialization.
scsi: zfcp: Remove unneeded INIT_LIST_HEAD() for FSF requests
INIT_LIST_HEAD() is only needed for actual list heads, while req->list is
used as a list entry.
Note that when the error path in zfcp_fsf_req_send() removes the request
from the adapter's list of pending requests, it actually looks up the
request from the zfcp_reqlist - rather than just calling list_del(). So
there's no risk of us calling list_del() on a request that hasn't been
added to any list yet.
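Why a list entry needs no initialization before being added can be seen in a minimal user-space model of a circular doubly linked list. The type and function names below are hypothetical and only modelled on the kernel's list_head API.

```c
#include <stdbool.h>

/* Minimal circular doubly linked list, modelled on the kernel's list_head. */
struct list_node {
    struct list_node *next, *prev;
};

/* A list head must point at itself so that the empty list is well formed. */
void list_init_head(struct list_node *head)
{
    head->next = head;
    head->prev = head;
}

/* Adding overwrites both pointers of 'entry', so it needs no prior init. */
void list_add_entry(struct list_node *entry, struct list_node *head)
{
    entry->next = head->next;
    entry->prev = head;
    head->next->prev = entry;
    head->next = entry;
}

bool list_is_empty(const struct list_node *head)
{
    return head->next == head;
}
```

Only the head is read before it is written, which is why initialization matters for heads but not for entries.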
Roman Bolshakov [Mon, 12 Apr 2021 16:57:40 +0000 (19:57 +0300)]
scsi: qla2xxx: Reserve extra IRQ vectors
Commit a6dcfe08487e ("scsi: qla2xxx: Limit interrupt vectors to number of
CPUs") lowers the number of allocated MSI-X vectors to the number of CPUs.
That breaks vector allocation assumptions in qla83xx_iospace_config(),
qla24xx_enable_msix() and qla2x00_iospace_config(). Each of these functions
computes the maximum number of qpairs as:
ha->max_qpairs = ha->msix_count - 1 (MB interrupt) - 1 (default
response queue) - 1 (ATIO, in dual or pure target mode)
max_qpairs is set to zero in case of two CPUs and initiator mode. The
number is then used to allocate ha->queue_pair_map inside
qla2x00_alloc_queues(). No allocation happens and ha->queue_pair_map is
left NULL but the driver thinks there are queue pairs available.
qla2xxx_queuecommand() tries to find a qpair in the map and crashes.
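The vector arithmetic above can be checked with a short sketch. The function max_qpairs() is a hypothetical stand-alone helper, not driver code; it only reproduces the subtraction quoted in the message.

```c
#include <stdbool.h>

/*
 * Number of queue pairs left after reserving the vectors listed above:
 * one for the mailbox (MB) interrupt, one for the default response queue
 * and, in dual or pure target mode, one for ATIO.
 */
int max_qpairs(int msix_count, bool target_mode)
{
    int reserved = 2 + (target_mode ? 1 : 0);

    return msix_count - reserved;
}
```

With two CPUs, and therefore two vectors, initiator mode yields max_qpairs(2, false) == 0, which is the case described above.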
Dan Carpenter found a possible NULL pointer dereference issue in function
pqi_sas_port_add_rphy():
drivers/scsi/smartpqi/smartpqi_sas_transport.c:97
pqi_sas_port_add_rphy() warn: variable dereferenced before
check 'pqi_sas_port->device' (see line 95)
Correct issue by moving reference of pqi_sas_port->device after the check
for the device pointer being non-NULL.
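The pattern of the fix, checking the pointer before the first dereference, can be sketched as follows. The structures and the function port_phy_id() are hypothetical simplifications, not the smartpqi definitions.

```c
#include <stddef.h>

struct device_info { int phy_id; };                  /* hypothetical */
struct sas_port    { struct device_info *device; };  /* hypothetical */

/* Returns the phy id, or -1 when no device is attached to the port. */
int port_phy_id(const struct sas_port *port)
{
    int phy_id;

    if (!port->device)               /* check first ... */
        return -1;

    phy_id = port->device->phy_id;   /* ... dereference afterwards */
    return phy_id;
}
```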
Link: https://www.mail-archive.com/kbuild@lists.01.org/msg06329.html Link: https://lore.kernel.org/r/161850493026.7302.10032784239320437353.stgit@brunhilda Fixes: ec504b23df9d ("scsi: smartpqi: Add phy ID support for the physical drives") Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Reported-by: kernel test robot <lkp@intel.com> Reviewed-by: Scott Benesh <scott.benesh@microchip.com> Reviewed-by: Scott Teel <scott.teel@microchip.com> Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com> Reviewed-by: Kevin Barnett <kevin.barnett@microchip.com> Signed-off-by: Don Brace <don.brace@microchip.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Dan Carpenter found a possible divide by 0 issue in the smartpqi driver in
functions pci_get_aio_common_raid_map_values() and pqi_calc_aio_r5_or_r6().
The variable rmd->blocks_per_row is used as a divisor without first being
checked for 0.
Correct these possible divide by 0 conditions by ensuring that
rmd->blocks_per_row is not zero before usage. The existing check for non-0
came too late to prevent a divide by 0 condition. Add a comment to explain
why the check for non-zero is necessary. If the member is 0, return
PQI_RAID_BYPASS_INELIGIBLE before any division is performed.
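A minimal sketch of the ordering that matters here, with the zero check placed ahead of the division. The function first_row() and the two status macros are hypothetical stand-ins chosen for the example.

```c
#include <stdint.h>

#define BYPASS_ELIGIBLE    0   /* hypothetical status codes */
#define BYPASS_INELIGIBLE  1

/* Compute the row containing first_block; refuse when the divisor is 0. */
int first_row(uint64_t first_block, uint32_t blocks_per_row, uint64_t *row)
{
    if (blocks_per_row == 0)
        return BYPASS_INELIGIBLE;   /* checked before any division */

    *row = first_block / blocks_per_row;
    return BYPASS_ELIGIBLE;
}
```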
Link: https://lore.kernel.org/linux-scsi/YG%2F5kWHHAr7w5dU5@mwanda/ Link: https://lore.kernel.org/r/161850492435.7302.392780350442938047.stgit@brunhilda Fixes: 6702d2c40f31 ("scsi: smartpqi: Add support for RAID5 and RAID6 writes") Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Reported-by: kernel test robot <lkp@intel.com> Reviewed-by: Scott Benesh <scott.benesh@microchip.com> Reviewed-by: Scott Teel <scott.teel@microchip.com> Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com> Reviewed-by: Kevin Barnett <kevin.barnett@microchip.com> Signed-off-by: Don Brace <don.brace@microchip.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Brian King [Tue, 13 Apr 2021 00:10:09 +0000 (18:10 -0600)]
scsi: ibmvfc: Fix invalid state machine BUG_ON()
This fixes an issue hitting the BUG_ON() in ibmvfc_do_work(). When going
through a host action of IBMVFC_HOST_ACTION_RESET, we change the action to
IBMVFC_HOST_ACTION_TGT_DEL, then drop the host lock, and reset the CRQ,
which changes the host state to IBMVFC_NO_CRQ. If, prior to setting the
host state to IBMVFC_NO_CRQ, ibmvfc_init_host() is called, it can then end
up changing the host action to IBMVFC_HOST_ACTION_INIT. If we then change
the host state to IBMVFC_NO_CRQ, we will then hit the BUG_ON().
Make a couple of changes to avoid this. Leave the host action to be
IBMVFC_HOST_ACTION_RESET or IBMVFC_HOST_ACTION_REENABLE until after we drop
the host lock and reset or reenable the CRQ. Also harden the host state
machine to ensure we cannot leave the reset / reenable state until we've
finished processing the reset or reenable.
Link: https://lore.kernel.org/r/20210413001009.902400-1-tyreld@linux.ibm.com Fixes: 73ee5d867287 ("[SCSI] ibmvfc: Fix soft lockup on resume") Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
[tyreld: added fixes tag] Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
[mkp: fix comment checkpatch warnings] Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
James Smart [Mon, 12 Apr 2021 01:31:27 +0000 (18:31 -0700)]
scsi: lpfc: Copyright updates for 12.8.0.9 patches
Update copyrights to 2021 for files modified in the 12.8.0.9 patch set.
Link: https://lore.kernel.org/r/20210412013127.2387-17-jsmart2021@gmail.com Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
James Smart [Mon, 12 Apr 2021 01:31:26 +0000 (18:31 -0700)]
scsi: lpfc: Update lpfc version to 12.8.0.9
Update lpfc version to 12.8.0.9
Link: https://lore.kernel.org/r/20210412013127.2387-16-jsmart2021@gmail.com Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
James Smart [Mon, 12 Apr 2021 01:31:25 +0000 (18:31 -0700)]
scsi: lpfc: Eliminate use of LPFC_DRIVER_NAME in lpfc_attr.c
During code inspection, several cases of creating dynamic attribute names
in log messages using a define were found. This is unnecessary.
Place the native symbol name in the log messages.
Link: https://lore.kernel.org/r/20210412013127.2387-15-jsmart2021@gmail.com Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
James Smart [Mon, 12 Apr 2021 01:31:24 +0000 (18:31 -0700)]
scsi: lpfc: Standardize discovery object logging format
Code inspection showed lpfc was using three different pointer formats when
logging discovery object pointers.
Standardize the pointer format to x%px.
Note: %px use is limited to discovery objects in order to aid core
analysis.
Link: https://lore.kernel.org/r/20210412013127.2387-14-jsmart2021@gmail.com Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
James Smart [Mon, 12 Apr 2021 01:31:23 +0000 (18:31 -0700)]
scsi: lpfc: Fix various trivial errors in comments and log messages
Clean up minor issues spotted by tools and code review:
- Spelling Errors
- Spurious characters and errors in function headers
- nvme_info wqerr and err fields source data reversed
- Extraneous new line in log message 0466
- Spacing error in log message 0109
- Messages 0140 and 0141 have portname and nodename reversed
- Incorrect function labelling in comment
Link: https://lore.kernel.org/r/20210412013127.2387-13-jsmart2021@gmail.com Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
SLI-4 does not contain a PORT_CAPABILITIES mailbox command (only SLI-3
does, and SLI-3 doesn't use it), yet there are SLI-4 code paths that have
code to issue the command. The command will always fail.
Remove the code for the mailbox command and leave only the resulting
"failure path" logic.
Link: https://lore.kernel.org/r/20210412013127.2387-12-jsmart2021@gmail.com Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
James Smart [Mon, 12 Apr 2021 01:31:21 +0000 (18:31 -0700)]
scsi: lpfc: Fix lpfc_hdw_queue attribute being ignored
The lpfc_hdw_queue attribute is to set the number of hardware queues to be
created on the adapter. Normally, the value is set to a default, which
allows the hw queue count to be sized dynamically based on adapter
capabilities, CPU/platform architecture, or CPU type. Currently, when
lpfc_hdw_queue is set to a specific value, it has no effect and the dynamic
sizing occurs.
The routine checking whether parameters are default or not ignores the
lpfc_hdw_queue setting and invokes the dynamic logic.
Fix the routine to additionally check the lpfc_hdw_queue attribute value
before using dynamic scaling. Additionally, SLI-3 supports only a small
number of queues with dedicated functions, thus it needs to be exempted
from the variable scaling and set to the expected values.
Link: https://lore.kernel.org/r/20210412013127.2387-11-jsmart2021@gmail.com Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
James Smart [Mon, 12 Apr 2021 01:31:20 +0000 (18:31 -0700)]
scsi: lpfc: Fix missing FDMI registrations after Mgmt Svc login
FDMI registration needs to be performed after every login with the FC Mgmt
service. The flag the driver is using to track registration is cleared on
link up, but never on Mgmt service logout/re-login.
Fix by clearing the flag whenever a new login is completed with the FC Mgmt
service.
While perusing the flag use, it was noticed that logging was performed as
if FDMI registration occurred on vports. However, it is limited to the
physical port only. Revise the logging to reflect that it is physical port
based.
Link: https://lore.kernel.org/r/20210412013127.2387-10-jsmart2021@gmail.com Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
James Smart [Mon, 12 Apr 2021 01:31:19 +0000 (18:31 -0700)]
scsi: lpfc: Fix silent memory allocation failure in lpfc_sli4_bsg_link_diag_test()
In the unlikely case of a failure to allocate an LPFC_MBOXQ_t structure, no
return status is set, thus the routine never logs an error and returns
success to the caller.
Fix by setting a return code on failure.
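The failure mode can be sketched in user space as follows. Everything here is an assumption made for illustration: struct mbox stands in for LPFC_MBOXQ_t, run_diag_test() is a hypothetical function with an injectable allocator, and -ENOMEM is chosen as the error code for the example only.

```c
#include <errno.h>
#include <stdlib.h>

struct mbox { int cmd; };   /* hypothetical stand-in for LPFC_MBOXQ_t */

/*
 * 'alloc' is injectable so the failure path can be exercised.
 * Returns 0 on success or -ENOMEM when the allocation fails.
 */
int run_diag_test(void *(*alloc)(size_t))
{
    int rc = 0;
    struct mbox *mb = alloc(sizeof(*mb));

    if (!mb) {
        rc = -ENOMEM;       /* without this, rc would stay 0 (success) */
        goto out;
    }

    mb->cmd = 1;
    free(mb);
out:
    return rc;
}

/* Allocator that always fails, for demonstration. */
void *failing_alloc(size_t size)
{
    (void)size;
    return NULL;
}
```

Calling run_diag_test(failing_alloc) reports the error, whereas a version without the assignment to rc would return 0 on the same path.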
Link: https://lore.kernel.org/r/20210412013127.2387-9-jsmart2021@gmail.com Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
James Smart [Mon, 12 Apr 2021 01:31:18 +0000 (18:31 -0700)]
scsi: lpfc: Fix use-after-free on unused nodes after port swap
During target port swap, the swap logic ignores the DROPPED flag in the
nodes. As a node then moves into the UNUSED state, the reference count will
be dropped. If a node is later reused and moved out of the UNUSED state, an
access can result in a use-after-free assert.
Fix by having the port swap logic propagate the DROPPED flag when switching
nodes. This prevents the reference from being dropped.
Link: https://lore.kernel.org/r/20210412013127.2387-8-jsmart2021@gmail.com Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
James Smart [Mon, 12 Apr 2021 01:31:17 +0000 (18:31 -0700)]
scsi: lpfc: Fix error handling for mailboxes completed in MBX_POLL mode
In SLI-4, when performing a mailbox command with MBX_POLL, the driver uses
the BMBX register to send the command rather than the MQ. A flag is set
indicating the BMBX register is active and saves the mailbox job struct
(mboxq) in the mbox_active element of the adapter. The routine then waits
for completion or timeout. The mailbox job struct is not freed by the
routine. In cases of timeout, the adapter will be reset. The
lpfc_sli_mbox_sys_flush() routine will clean up the mbox in preparation for
the reset. It clears the BMBX active flag and marks the job structure as
MBX_NOT_FINISHED. But, it never frees the mboxq job structure. Expectation
in both normal completion and timeout cases is that the issuer of the mbx
command will free the structure. Unfortunately, not all calling paths are
freeing the memory in cases of error.
All calling paths were looked at and updated, if missing, to free the mboxq
memory regardless of completion status.
Link: https://lore.kernel.org/r/20210412013127.2387-7-jsmart2021@gmail.com Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
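The ownership rule described above (the issuer frees the mailbox job whether the polled command completed or timed out) can be sketched as follows. All names here are hypothetical stand-ins, the status values are placeholders rather than the driver's constants, and live_mboxes exists only so the sketch can show that nothing leaks.

```c
#include <assert.h>
#include <stdlib.h>

/* Placeholder status values, not the driver's real constants. */
#define SKETCH_MBX_SUCCESS      0
#define SKETCH_MBX_NOT_FINISHED 1

struct mboxq {
	int status;
};

int live_mboxes;	/* number of mailbox jobs currently allocated */

struct mboxq *mbox_alloc(void)
{
	struct mboxq *m = calloc(1, sizeof(*m));

	if (m)
		live_mboxes++;
	return m;
}

void mbox_free(struct mboxq *m)
{
	if (m) {
		live_mboxes--;
		free(m);
	}
}

/* Simulated polled issue: records a status but never frees the job. */
int issue_mbox_poll(struct mboxq *m, int simulate_timeout)
{
	assert(m != NULL);
	m->status = simulate_timeout ? SKETCH_MBX_NOT_FINISHED
				     : SKETCH_MBX_SUCCESS;
	return m->status;
}

/* The issuer owns the job, so it frees it on success and on error alike. */
int do_command(int simulate_timeout)
{
	struct mboxq *m = mbox_alloc();
	int rc;

	if (!m)
		return -1;
	rc = issue_mbox_poll(m, simulate_timeout);
	mbox_free(m);
	return rc;
}
```

A caller that freed the job only on the success path would leave live_mboxes at 1 after a timeout.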
James Smart [Mon, 12 Apr 2021 01:31:16 +0000 (18:31 -0700)]
scsi: lpfc: Fix lack of device removal on port swaps with PRLIs
During target port-swap testing with link flips, the initiator could
encounter PRLI errors. If the target node disappears permanently, the ndlp
is found stuck in UNUSED state with ref count of 1. The rmmod of the driver
will hang waiting for this node to be freed.
While handling a link error in PRLI completion path, the code intends to
skip triggering the discovery state machine. However, this causes the
final reference release path to be skipped, which leaves the node stuck
with a ref count of 1.
Fix by ensuring the code path triggers the device removal event on the node
state machine.
Link: https://lore.kernel.org/r/20210412013127.2387-6-jsmart2021@gmail.com Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
James Smart [Mon, 12 Apr 2021 01:31:15 +0000 (18:31 -0700)]
scsi: lpfc: Fix NMI crash during rmmod due to circular hbalock dependency
Remove hbalock dependency for lpfc_abts_els_sgl_list and
lpfc_abts_nvmet_ctx_list. The lists are adequately synchronized with the
sgl_list_lock and abts_nvmet_buf_list_lock.
Link: https://lore.kernel.org/r/20210412013127.2387-5-jsmart2021@gmail.com Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
James Smart [Mon, 12 Apr 2021 01:31:14 +0000 (18:31 -0700)]
scsi: lpfc: Fix reference counting errors in lpfc_cmpl_els_rsp()
Call traces are being seen that result from a nodelist structure ref
counting error. They are typically seen after transmission of an LS_RJT ELS
response.
Aged code in lpfc_cmpl_els_rsp() calls lpfc_nlp_not_used() which, if the
ndlp reference count is exactly 1, will decrement the reference count.
Previously lpfc_nlp_put() was within lpfc_els_free_iocb(), and the 'put'
within the free would only be invoked if cmdiocb->context1 was not NULL.
Since the nodelist structure reference count is decremented when exiting
lpfc_cmpl_els_rsp() the lpfc_nlp_not_used() calls are no longer required.
Calling them is causing the reference count issue.
Fix by removing the lpfc_nlp_not_used() calls.
Link: https://lore.kernel.org/r/20210412013127.2387-4-jsmart2021@gmail.com Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
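The double put described above can be reproduced with a toy reference counter. This is a sketch under stated assumptions: struct node, node_put(), node_not_used() and complete_rsp() are hypothetical user-space stand-ins, and the underflow counter replaces the crash that a real double put would cause.

```c
#include <assert.h>

/* Toy node with a reference count and a counter for detected double puts. */
struct node {
	int refcnt;
	int underflows;
};

void node_put(struct node *n)
{
	assert(n != 0);
	if (n->refcnt == 0) {
		/* A put without a matching reference: record it. */
		n->underflows++;
		return;
	}
	n->refcnt--;
}

/* Mimics the described helper: drops the reference only if it is the last. */
void node_not_used(struct node *n)
{
	if (n->refcnt == 1)
		node_put(n);
}

/*
 * Simulated completion handler. 'call_not_used' selects the old behaviour
 * (extra helper call) or the fixed one. Returns the number of double puts.
 */
int complete_rsp(int call_not_used)
{
	struct node n = { .refcnt = 1, .underflows = 0 };

	if (call_not_used)
		node_not_used(&n);
	node_put(&n);	/* the put that the completion path performs on exit */
	return n.underflows;
}
```

With the helper call removed, the single put on exit balances the single reference.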
James Smart [Mon, 12 Apr 2021 01:31:13 +0000 (18:31 -0700)]
scsi: lpfc: Fix crash when a REG_RPI mailbox fails triggering a LOGO response
Fix a crash caused by a double put on the node when the driver completed an
ACC for an unsolicited abort on the same node. The second put was executed
by lpfc_nlp_not_used() and is wrong because the completion routine executes
the nlp_put when the iocbq was released. Additionally, the driver is
issuing a LOGO then immediately calls lpfc_nlp_set_state to put the node
into NPR. This call does nothing.
Remove the lpfc_nlp_not_used call and additional set_state in the
completion routine. Also remove the lpfc_nlp_set_state call after
issue_logo, since it isn't necessary.
Link: https://lore.kernel.org/r/20210412013127.2387-3-jsmart2021@gmail.com Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
James Smart [Mon, 12 Apr 2021 01:31:12 +0000 (18:31 -0700)]
scsi: lpfc: Fix rmmod crash due to bad ring pointers to abort_iotag
Rmmod on SLI-4 adapters is sometimes hitting a bad ptr dereference in
lpfc_els_free_iocb().
A prior patch refactored the lpfc_sli_abort_iocb() routine. One of the
changes was to convert from building/sending an abort within the routine to
using a common routine. The reworked routine passes, without modification,
the pring ptr to the new common routine. The older routine had logic to
check SLI-3 vs SLI-4 and adapt the pring ptr if necessary as callers were
passing SLI-3 pointers even when on an SLI-4 adapter. The new routine
is missing this check and adapt, so the SLI-3 ring pointers are being used
in SLI-4 paths.
Fix by cleaning up the calling routines. In review, there is no need to
pass the ring ptr argument to abort_iocb at all. The routine can look at
the adapter type itself and reference the proper ring.
Link: https://lore.kernel.org/r/20210412013127.2387-2-jsmart2021@gmail.com Fixes: db7531d2b377 ("scsi: lpfc: Convert abort handling to SLI-3 and SLI-4 handlers") Cc: <stable@vger.kernel.org> # v5.11+ Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Zhen Lei [Thu, 8 Apr 2021 06:18:50 +0000 (14:18 +0800)]
scsi: message: fusion: Remove unused local variable 'port'
Fixes the following W=1 kernel build warning:
drivers/message/fusion/mptctl.c: In function ‘mptctl_gettargetinfo’:
drivers/message/fusion/mptctl.c:1372:7: warning: variable ‘port’ set but not used [-Wunused-but-set-variable]
Zhen Lei [Thu, 8 Apr 2021 06:18:49 +0000 (14:18 +0800)]
scsi: message: fusion: Remove unused local variable 'time_count'
Fixes the following W=1 kernel build warning:
drivers/message/fusion/mptctl.c: In function ‘mptctl_do_taskmgmt’:
drivers/message/fusion/mptctl.c:324:17: warning: variable ‘time_count’ set but not used [-Wunused-but-set-variable]
Link: https://lore.kernel.org/r/20210408061851.3089-2-thunder.leizhen@huawei.com Fixes: 7d757f185540 ("[SCSI] mptfusion: Updated SCSI IO IOCTL error handling.") Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Qiheng Lin [Fri, 9 Apr 2021 12:03:45 +0000 (20:03 +0800)]
scsi: qla4xxx: Remove unneeded if-null-free check
Eliminate the following coccicheck warning:
drivers/scsi/qla4xxx/ql4_os.c:4175:2-7: WARNING:
NULL check before some freeing functions is not needed.
drivers/scsi/qla4xxx/ql4_os.c:4196:2-7: WARNING:
NULL check before some freeing functions is not needed.
drivers/scsi/qla4xxx/ql4_os.c:4215:2-7: WARNING:
NULL check before some freeing functions is not needed.
drivers/scsi/qla4xxx/ql4_os.c:6400:2-7: WARNING:
NULL check before some freeing functions is not needed.
drivers/scsi/qla4xxx/ql4_os.c:6402:2-7: WARNING:
NULL check before some freeing functions is not needed.
drivers/scsi/qla4xxx/ql4_os.c:6555:2-7: WARNING:
NULL check before some freeing functions is not needed.
drivers/scsi/qla4xxx/ql4_os.c:6557:2-7: WARNING:
NULL check before some freeing functions is not needed.
drivers/scsi/qla4xxx/ql4_os.c:7838:2-7: WARNING:
NULL check before some freeing functions is not needed.
drivers/scsi/qla4xxx/ql4_os.c:7840:2-7: WARNING:
NULL check before some freeing functions is not needed.
Qiheng Lin [Fri, 9 Apr 2021 12:09:25 +0000 (20:09 +0800)]
scsi: qla2xxx: Remove unneeded if-null-free check
Eliminate the following coccicheck warning:
drivers/scsi/qla2xxx/qla_os.c:4622:2-7:
WARNING: NULL check before some freeing functions is not needed.
drivers/scsi/qla2xxx/qla_os.c:4637:3-8:
WARNING: NULL check before some freeing functions is not needed.
scsi: mpt3sas: Fix out-of-bounds warnings in _ctl_addnl_diag_query
Fix the following out-of-bounds warnings by embedding existing struct
htb_rel_query into struct mpt3_addnl_diag_query, instead of duplicating its
members:
include/linux/fortify-string.h:20:29: warning: '__builtin_memcpy' offset [19, 32] from the object at 'karg' is out of the bounds of referenced subobject 'buffer_rel_condition' with type 'short unsigned int' at offset 16 [-Warray-bounds]
include/linux/fortify-string.h:22:29: warning: '__builtin_memset' offset [19, 32] from the object at 'karg' is out of the bounds of referenced subobject 'buffer_rel_condition' with type 'short unsigned int' at offset 16 [-Warray-bounds]
The problem is that the original code is trying to copy data into a bunch
of struct members adjacent to each other in a single call to memcpy(). All
those members are exactly the same as the ones contained in struct htb_rel_query, so
instead of duplicating them into struct mpt3_addnl_diag_query, replace them
with new member rel_query of type struct htb_rel_query. So, now that this
new object is introduced, memcpy() doesn't overrun the length of
&karg.buffer_rel_condition, because the address of the new struct object
_rel_query_ is used as destination, instead. The same issue is present when
calling memset(), and it is fixed with this same approach.
Below is a comparison of struct mpt3_addnl_diag_query, before and after
this change (the size and cachelines remain the same):
Also, this helps with the ongoing efforts to globally enable -Warray-bounds
and get us closer to being able to tighten the FORTIFY_SOURCE routines on
memcpy().
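The embedding approach can be sketched in user space as follows. The layout is an assumption made for illustration: only buffer_rel_condition is named in the warnings above, while the other rel_query fields, the 16-byte header placeholder and fill_query() are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed layout of the embedded query; 16 bytes in total. */
struct rel_query {
	uint16_t buffer_rel_condition;
	uint16_t reserved;
	uint32_t trigger_type;
	uint32_t trigger_info_dwords[2];
};

/* Outer structure: the members are embedded as one object, not duplicated. */
struct diag_query {
	uint8_t header[16];		/* placeholder for the leading fields */
	struct rel_query rel_query;
};

/*
 * The destination is the whole embedded object, so the length passed to
 * memset()/memcpy() matches the size of the object being written.
 */
void fill_query(struct diag_query *karg, const struct rel_query *src)
{
	assert(karg != NULL && src != NULL);
	memset(&karg->rel_query, 0, sizeof(karg->rel_query));
	memcpy(&karg->rel_query, src, sizeof(karg->rel_query));
}
```

Copying to &karg->buffer_rel_condition with the same length would write past a 2-byte member, which is what the compiler flags.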
Javed Hasan [Wed, 31 Mar 2021 16:49:17 +0000 (09:49 -0700)]
scsi: qedf: Use devlink to report errors and recovery
Use devlink_health_report() to push error indications.
Implement this in qede via a callback function to make it possible to reuse
it for other drivers sitting on top of qed in future. Also remove forcible
recovery trigger and put it as a normal devlink callback in qed module.
This allows the user to enable/disable it via:
devlink health set pci/xxxx:xx:xx.x reporter fw_fatal auto_recover false
Sergey Shtylyov [Tue, 30 Mar 2021 17:45:12 +0000 (20:45 +0300)]
scsi: sni_53c710: Add IRQ check
The driver neglects to check the result of platform_get_irq()'s call and
blithely passes the negative error codes to request_irq() (which takes
*unsigned* IRQ #s), causing it to fail with -EINVAL (overridden by -ENODEV
further below). Stop calling request_irq() with the invalid IRQ #s.
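The effect of passing a negative error code to a function that takes an unsigned IRQ number can be sketched in user space. fake_get_irq(), fake_request_irq() and FAKE_NR_IRQS are hypothetical stand-ins, not the kernel functions, and the checked variant here returns the original error code, which is one possible choice.

```c
#include <assert.h>
#include <errno.h>

#define FAKE_NR_IRQS 1024u	/* hypothetical upper bound on valid IRQ numbers */

/* Stand-in for an IRQ lookup: negative errno on failure, IRQ number otherwise. */
int fake_get_irq(int simulate_error)
{
	return simulate_error ? -ENXIO : 42;
}

/* Stand-in for an IRQ request that takes an unsigned IRQ number. */
int fake_request_irq(unsigned int irq)
{
	return irq >= FAKE_NR_IRQS ? -EINVAL : 0;
}

/* Unchecked: a negative value converts to a huge unsigned IRQ number. */
int probe_unchecked(int simulate_error)
{
	int irq = fake_get_irq(simulate_error);

	return fake_request_irq((unsigned int)irq);
}

/* Checked: stop before the request and keep the original error code. */
int probe_checked(int simulate_error)
{
	int irq = fake_get_irq(simulate_error);

	if (irq < 0)
		return irq;
	assert(irq >= 0);
	return fake_request_irq((unsigned int)irq);
}
```

In the unchecked variant the lookup error is replaced by -EINVAL from the request.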
Sergey Shtylyov [Tue, 30 Mar 2021 17:44:08 +0000 (20:44 +0300)]
scsi: sun3x_esp: Add IRQ check
The driver neglects to check the result of platform_get_irq()'s call and
blithely passes the negative error codes to request_irq() (which takes
*unsigned* IRQ #), causing it to fail with -EINVAL, overriding the real
error code. Stop calling request_irq() with the invalid IRQ #s.
Sergey Shtylyov [Tue, 30 Mar 2021 17:43:23 +0000 (20:43 +0300)]
scsi: jazz_esp: Add IRQ check
The driver neglects to check the result of platform_get_irq()'s call and
blithely passes the negative error codes to request_irq() (which takes
*unsigned* IRQ #), causing it to fail with -EINVAL, overriding the real
error code. Stop calling request_irq() with the invalid IRQ #s.
Commit df2d8213d9e3 ("hisi_sas: use platform_get_irq()") failed to take
into account that irq_of_parse_and_map() and platform_get_irq() have a
different way of indicating an error: the former returns 0 and the latter
returns a negative error code. Fix up the IRQ checks!
Link: https://lore.kernel.org/r/810f26d3-908b-1d6b-dc5c-40019726baca@omprussia.ru Fixes: df2d8213d9e3 ("hisi_sas: use platform_get_irq()") Acked-by: John Garry <john.garry@huawei.com> Signed-off-by: Sergey Shtylyov <s.shtylyov@omprussia.ru> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
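The two error conventions named above need different checks. A minimal sketch, using hypothetical helper names rather than driver code:

```c
#include <assert.h>
#include <stdbool.h>

/* Convention where failure is reported as 0 (as for irq_of_parse_and_map()). */
bool zero_means_error_ok(int irq)
{
	return irq != 0;
}

/* Convention where failure is a negative error code (as for platform_get_irq()). */
bool negative_means_error_ok(int irq)
{
	assert(irq != 0);	/* this sketch does not model a zero return */
	return irq > 0;
}
```

Applying zero_means_error_ok() to a negative error code such as -6 returns true, so a check left over from the old convention silently accepts a failed lookup.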
Sergey Shtylyov [Mon, 29 Mar 2021 20:50:58 +0000 (23:50 +0300)]
scsi: ufs: ufshcd-pltfrm: Fix deferred probing
The driver overrides the error codes returned by platform_get_irq() to
-ENODEV, so if it returns -EPROBE_DEFER, the driver would fail the probe
permanently instead of the deferred probing. Propagate the error code
upstream as it should have been done from the start...
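Why overriding the error code defeats deferred probing can be sketched as follows. The probe functions and core_should_retry() are hypothetical. EPROBE_DEFER is defined locally because user-space errno.h does not provide it, and the value 517 mirrors the kernel's internal definition.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

#ifndef EPROBE_DEFER
#define EPROBE_DEFER 517	/* kernel-internal code, absent from user-space errno.h */
#endif

/* Old behaviour: every IRQ lookup failure becomes -ENODEV. */
int probe_override(int irq)
{
	if (irq < 0)
		return -ENODEV;
	return 0;
}

/* Fixed behaviour: the lookup's own error code is passed up. */
int probe_propagate(int irq)
{
	if (irq < 0)
		return irq;
	return 0;
}

/* Stand-in for the core's decision to retry the probe later. */
bool core_should_retry(int rc)
{
	assert(rc <= 0);
	return rc == -EPROBE_DEFER;
}
```

Only the propagating variant lets the caller see -EPROBE_DEFER and schedule a retry.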
Ye Bin [Fri, 9 Apr 2021 07:55:22 +0000 (15:55 +0800)]
scsi: ufs: ufs-qcom: Remove redundant dev_err() call in ufs_qcom_init()
There is an error message within devm_ioremap_resource() already, so remove
the dev_err() call to avoid a redundant error message.
Link: https://lore.kernel.org/r/20210409075522.2111083-1-yebin10@huawei.com Reported-by: Hulk Robot <hulkci@huawei.com> Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Signed-off-by: Ye Bin <yebin10@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Luo Jiaxing [Tue, 6 Apr 2021 11:48:30 +0000 (19:48 +0800)]
scsi: hisi_sas: Warn in v3 hw channel interrupt handler when status reg cleared
If a channel interrupt occurs without any status bit set, the handler will
return directly. However, if such redundant interrupts are received, it's
better to check what happened, so add logs for this.
scsi: hisi_sas: Call sas_unregister_ha() to roll back if .hw_init() fails
Function sas_unregister_ha() needs to be called to roll back if
hisi_hba->hw->hw_init() fails in function hisi_sas_probe() or
hisi_sas_v3_probe(). Make that change.
Colin Ian King [Wed, 7 Apr 2021 13:58:40 +0000 (14:58 +0100)]
scsi: pm80xx: Fix potential infinite loop
The for-loop iterates with a u8 loop counter i and compares this with the
loop upper limit of pm8001_ha->max_q_num which is a u32 type. There is a
potential infinite loop if pm8001_ha->max_q_num is larger than the u8 loop
counter. Fix this by making the loop counter the same type as
pm8001_ha->max_q_num.
[mkp: this is purely theoretical, max_q_num is currently limited to 64]
Link: https://lore.kernel.org/r/20210407135840.494747-1-colin.king@canonical.com Fixes: 65df7d1986a1 ("scsi: pm80xx: Fix chip initialization failure")
Addresses-Coverity: ("Infinite loop") Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
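The wraparound behind this warning can be shown with a bounded sketch. The function names and the max_steps guard are inventions of this example. The guard exists only so that the demonstration itself halts.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Returns true if a loop with a u8 counter ends within max_steps iterations. */
bool u8_loop_terminates(uint32_t max_q_num, uint32_t max_steps)
{
	uint32_t steps = 0;
	uint8_t i;

	assert(max_steps > 0);
	/* i wraps from 255 back to 0, so i < max_q_num stays true for limits above 255. */
	for (i = 0; i < max_q_num; i++) {
		if (++steps > max_steps)
			return false;	/* guard tripped: the loop would not end */
	}
	return true;
}

/* Same loop with a counter of the same type as the limit. */
bool u32_loop_terminates(uint32_t max_q_num, uint32_t max_steps)
{
	uint32_t steps = 0;
	uint32_t i;

	assert(max_steps > 0);
	for (i = 0; i < max_q_num; i++) {
		if (++steps > max_steps)
			return false;
	}
	return true;
}
```

A limit of 64 ends with either counter type. A limit of 300 ends only with the 32-bit counter.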
There was a similar fix for mpi_init_check() in commit d71023af4bec ("scsi: pm80xx: Do not busy wait in MPI init check")
Link: https://lore.kernel.org/r/20210406180534.1924345-3-ipylypiv@google.com Reviewed-by: Vishakha Channapattan <vishakhavc@google.com> Acked-by: Jack Wang <jinpu.wang@ionos.com> Signed-off-by: Igor Pylypiv <ipylypiv@google.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Igor Pylypiv [Tue, 6 Apr 2021 18:05:33 +0000 (11:05 -0700)]
scsi: pm80xx: Increase timeout for pm80xx mpi_uninit_check()
The mpi_uninit_check() takes longer for the inbound doorbell register to be
cleared. Increase the timeout substantially so that the driver does not
fail to load.
Previously, the inbound doorbell wait time was mistakenly increased in the
mpi_init_check() instead of mpi_uninit_check(). It is okay to leave the
mpi_init_check() wait time as-is as these are timeout values and if there
is a failure, waiting longer is not an issue.
Link: https://lore.kernel.org/r/20210406180534.1924345-2-ipylypiv@google.com Fixes: e90e236250e9 ("scsi: pm80xx: Increase timeout for pm80xx mpi_uninit_check") Reviewed-by: Vishakha Channapattan <vishakhavc@google.com> Acked-by: Jack Wang <jinpu.wang@ionos.com> Signed-off-by: Igor Pylypiv <ipylypiv@google.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Bodo Stroesser [Wed, 24 Mar 2021 19:57:57 +0000 (20:57 +0100)]
scsi: target: tcmu: Replace block size definitions with new udev members
Replace DATA_PAGES_PER_BLK and DATA_BLOCK_SIZE with new struct elements
tcmu_dev->data_pages_per_blk and tcmu_dev->data_blk_size. These new
variables are still loaded with constant definition DATA_PAGES_PER_BLK_DEF
(= 1) and DATA_PAGES_PER_BLK_DEF * PAGE_SIZE.
There is no way yet to set the values via configfs.
Bodo Stroesser [Wed, 24 Mar 2021 19:57:55 +0000 (20:57 +0100)]
scsi: target: tcmu: Support DATA_BLOCK_SIZE = N * PAGE_SIZE
Change tcmu to support DATA_BLOCK_SIZE being a multiple of PAGE_SIZE. There
are two reasons why one would like to have a bigger DATA_BLOCK_SIZE:
1) If userspace - e.g. due to data compression, encryption or
deduplication - needs to have receive or transmit data in a consecutive
buffer, we can define DATA_BLOCK_SIZE to the maximum size of a SCSI
READ/WRITE to enforce that userspace sees just one consecutive
buffer. That way we can avoid the need for doing data copy in
userspace.
2) Using a bigger data block size can speed up command processing in
tcmu. The number of free data blocks to look up in bitmap is reduced
substantially. The lookup for data pages in radix_tree can be done more
efficiently if there are multiple pages in a data block. The maximum
number of IOVs to set up is lower so cmd entries in the ring become
smaller.
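The effect of a larger data block on the number of blocks per command can be sketched with simple arithmetic. The helper names are hypothetical, and a 4096-byte page is an assumption of this example, not something the commit states.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumed page size in bytes */

/* Data block size as a multiple of the page size. */
size_t data_blk_size(size_t pages_per_blk)
{
	assert(pages_per_blk > 0);
	return pages_per_blk * SKETCH_PAGE_SIZE;
}

/* Number of data blocks needed to hold 'len' bytes, rounded up. */
size_t blocks_needed(size_t len, size_t pages_per_blk)
{
	size_t blk = data_blk_size(pages_per_blk);

	return (len + blk - 1) / blk;
}
```

Under these assumptions a 1 MiB transfer needs 256 blocks at one page per block and 4 blocks at 64 pages per block, so far fewer free blocks have to be located per command.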
Bodo Stroesser [Wed, 24 Mar 2021 19:57:54 +0000 (20:57 +0100)]
scsi: target: tcmu: Prepare for PAGE_SIZE != DATA_BLOCK_SIZE
Rename some variables and definitions as a first preparation for
DATA_BLOCK_SIZE != PAGE_SIZE and add the new DATA_PAGES_PER_BLK definition
containing the number of pages per data block.
Rename tcmu_try_get_block_page() to tcmu_try_get_data_page(). Keep name
tcmu_get_block_page() since it will go away in a following commit when
there is only one caller left. Subsequent commits will then add full
support for DATA_PAGES_PER_BLK != 1, which also means DATA_BLOCK_SIZE =
DATA_PAGES_PER_BLK * PAGE_SIZE