git.ipfire.org Git - thirdparty/kernel/linux.git/log

scsi: scsi_debug: Fix cmd_per_lun, set to max_queue

Make sure that the cmd_per_lun value placed in the host template never
exceeds the can_queue value. If the max_queue driver parameter is not
specified then both cmd_per_lun and can_queue are set to CAN_QUEUE.
CAN_QUEUE is a compile time constant and is used to dimension an array to
hold queued requests. If the max_queue driver parameter is given it is must
be less than or equal to CAN_QUEUE and if so, the host template values are
adjusted.

Remove undocumented code that allowed queue_depth to exceed CAN_QUEUE and
cause stack full type errors. There is a documented way to do that with
every_nth and

echo 0x8000 > /sys/bus/pseudo/drivers/scsi_debug/opts

See: https://sg.danny.cz/sg/scsi_debug.html

Tweak some formatting, and add a suggestion to the "trim poll_queues"
warning.

Link: https://lore.kernel.org/r/20210415015031.607153-1-dgilbert@interlog.com
Reported-by: Kashyap Desai <kashyap.desai@broadcom.com>
Reviewed-by: John Garry <john.garry@hauwei.com>
Signed-off-by: Douglas Gilbert <dgilbert@interlog.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: ufs: core: Narrow down fast path in system suspend path

If spm_lvl is set to 0 or 1, when system suspend kicks start and HBA is
runtime active, system suspend may just bail without doing anything (the
fast path), leaving other contexts still running, e.g., clock gating and
clock scaling. When system resume kicks start, concurrency can happen
between ufshcd_resume() and these contexts, leading to various stability
issues.

Add a check against HBA's runtime state and allowing fast path only if HBA
is runtime suspended, otherwise let system suspend go ahead call
ufshcd_suspend(). This will guarantee that these contexts are stopped by
either runtime suspend or system suspend.

Link: https://lore.kernel.org/r/1619408921-30426-4-git-send-email-cang@codeaurora.org
Fixes: 0b257734344a ("scsi: ufs: optimize system suspend handling")
Reviewed-by: Daejun Park <daejun7.park@samsung.com>
Signed-off-by: Can Guo <cang@codeaurora.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: ufs: core: Cancel rpm_dev_flush_recheck_work during system suspend

During ufs system suspend, leaving rpm_dev_flush_recheck_work running or
pending is risky because concurrency may happen between system
suspend/resume and runtime resume routine. Fix this by cancelling
rpm_dev_flush_recheck_work synchronously during system suspend.

Link: https://lore.kernel.org/r/1619408921-30426-3-git-send-email-cang@codeaurora.org
Fixes: 51dd905bd2f6 ("scsi: ufs: Fix WriteBooster flush during runtime suspend")
Reviewed-by: Daejun Park <daejun7.park@samsung.com>
Signed-off-by: Can Guo <cang@codeaurora.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: ufs: core: Do not put UFS power into LPM if link is broken

During resume, if link is broken due to AH8 failure, make sure
ufshcd_resume() does not put UFS power back into LPM.

Link: https://lore.kernel.org/r/1619408921-30426-2-git-send-email-cang@codeaurora.org
Fixes: 4db7a2360597 ("scsi: ufs: Fix concurrency of error handler and other error recovery paths")
Reviewed-by: Daejun Park <daejun7.park@samsung.com>
Signed-off-by: Can Guo <cang@codeaurora.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: qla2xxx: Prevent PRLI in target mode

In a case when the initiator in P2P mode by some circumstances does not
send PRLI, the target, in a case when the target port's WWPN is less than
initiator's, changes the discovery state in DSC_GNL. When gnl completes it
sends PRLI to the initiator.

Usually the initiator in P2P mode always sends PRLI. We caught this issue
on Linux stable v5.4.6 https://www.spinics.net/lists/stable/msg458515.html.

Fix this particular corner case in the behaviour of the P2P mod target
login state machine.

Link: https://lore.kernel.org/r/20210422153414.4022-1-a.kovaleva@yadro.com
Fixes: a9ed06d4e640 ("scsi: qla2xxx: Allow PLOGI in target mode")
Reviewed-by: Roman Bolshakov <r.bolshakov@yadro.com>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Anastasia Kovaleva <a.kovaleva@yadro.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: qla2xxx: Add marginal path handling support

Add support for eh_should_retry_cmd callback in qla2xxx host template.

Link: https://lore.kernel.org/r/20210427050914.7270-1-njavali@marvell.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Bikash Hazarika <bhazarika@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: target: tcmu: Return from tcmu_handle_completions() if cmd_id not found

If tcmu_handle_completions() finds an invalid cmd_id while looping over cmd
responses from userspace it sets TCMU_DEV_BIT_BROKEN and breaks the
loop. This means that it does further handling for the tcmu device.

Skip that handling by replacing 'break' with 'return'.

Additionally change tcmu_handle_completions() from unsigned int to bool,
since the value used in return already is bool.

Link: https://lore.kernel.org/r/20210423150123.24468-1-bostroesser@gmail.com
Signed-off-by: Bodo Stroesser <bostroesser@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: ufs: core: Fix a typo in ufs-sysfs.c

Fix the following typo:

ufschd_uic_link_state_to_string() -> ufshcd_uic_link_state_to_string()
ufschd_ufs_dev_pwr_mode_to_string() -> ufshcd_ufs_dev_pwr_mode_to_string()

Link: https://lore.kernel.org/r/1381713434.61619509208911.JavaMail.epsvc@epcpadp3
Signed-off-by: Keoseong Park <keosung.park@samsung.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: lpfc: Fix bad memory access during VPD DUMP mailbox command

The dump command for reading a region passes a requested read length
specified in words (4-byte units). The response overwrites the same field
with the actual number of bytes read.

The mailbox handler for DUMP which reads VPD data (region 23) is treating
the response field as if it were still a word_cnt, thus multiplying it by 4
to set the read's "length". Given the read value was calculated based on
the size of the read buffer, the longer response length runs off the end of
the buffer.

Fix by reworking the code to use the response field as a byte count.

Link: https://lore.kernel.org/r/20210421234511.102206-1-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: lpfc: Fix DMA virtual address ptr assignment in bsg

lpfc_bsg_ct_unsol_event() routine acts assigns a ct_request to the wrong
structure address, resulting in a bad address that results in bsg related
timeouts.

Correct the ct_request assignment to use the kernel virtual buffer address
(not the control structure address).

Link: https://lore.kernel.org/r/20210421234448.102132-1-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: lpfc: Fix illegal memory access on Abort IOCBs

In devloss timer handler and in backend calls to terminate remote port I/O,
there is logic to walk through all active IOCBs and validate them to
potentially trigger an abort request. This logic is causing illegal memory
accesses which leads to a crash. Abort IOCBs, which may be on the list, do
not have an associated lpfc_io_buf struct. The driver is trying to map an
lpfc_io_buf struct on the IOCB and which results in a bogus address thus
the issue.

Fix by skipping over ABORT IOCBs (CLOSE IOCBs are ABORTS that don't send
ABTS) in the IOCB scan logic.

Link: https://lore.kernel.org/r/20210421234433.102079-1-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: blk-mq: Fix build warning when making htmldocs

Fixes the following warning when running 'make htmldocs':

  include/linux/blk-mq.h:395: warning: Function parameter or member
  'set_rq_budget_token' not described in 'blk_mq_ops'
  include/linux/blk-mq.h:395: warning: Function parameter or member
  'get_rq_budget_token' not described in 'blk_mq_ops'

[mkp: added warning messages]

Link: https://lore.kernel.org/r/20210421154526.1954174-1-ming.lei@redhat.com
Fixes: d022d18c045f ("scsi: blk-mq: Add callbacks for storing & retrieving budget token")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: target: tcm_fc: Fix a kernel-doc header

Fix the function name in the kernel-doc header above ft_prli().

Link: https://lore.kernel.org/r/20210415220826.29438-21-bvanassche@acm.org
Cc: Hannes Reinecke <hare@suse.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: target: Shorten ALUA error messages

Do not print tg_pt_gp->tg_pt_gp_valid_id if we already know that it is zero.

Link: https://lore.kernel.org/r/20210415220826.29438-20-bvanassche@acm.org
Cc: Mike Christie <michael.christie@oracle.com>
Reviewed-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: target: Fix two format specifiers

Use format specifier '%u' to format the u32 data type instead of '%hu'.

Link: https://lore.kernel.org/r/20210415220826.29438-19-bvanassche@acm.org
Cc: Mike Christie <michael.christie@oracle.com>
Reviewed-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: target: Compare explicitly with SAM_STAT_GOOD

Instead of leaving it implicit that SAM_STAT_GOOD == 0, compare explicitly
with SAM_STAT_GOOD.

Link: https://lore.kernel.org/r/20210415220826.29438-18-bvanassche@acm.org
Reviewed-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: sd: Introduce a new local variable in sd_check_events()

Instead of using 'retval' to represent first a SCSI status and later
whether or not a disk change event occurred, introduce a new variable for
the latter purpose.

Link: https://lore.kernel.org/r/20210415220826.29438-17-bvanassche@acm.org
Cc: Christoph Hellwig <hch@lst.de>
Cc: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Daniel Wagner <dwagner@suse.de>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: dc395x: Open-code status_byte(u8) calls

The dc395x driver is one of the two drivers that passes an u8 argument to
status_byte() instead of an s32 argument. Open-code status_byte() in
preparation of changing SCSI status values into a structure.

Link: https://lore.kernel.org/r/20210415220826.29438-16-bvanassche@acm.org
Cc: Hannes Reinecke <hare@suse.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: 53c700: Open-code status_byte(u8) calls

The 53c700 driver is one of the two drivers that passes an u8 argument to
status_byte() instead of an s32 argument. Open-code status_byte in
preparation of changing SCSI status values into a structure.

Link: https://lore.kernel.org/r/20210415220826.29438-15-bvanassche@acm.org
Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: smartpqi: Remove unused functions

This was detected by building the kernel with clang and W=1.

Link: https://lore.kernel.org/r/20210415220826.29438-14-bvanassche@acm.org
Cc: Don Brace <don.brace@microchip.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: qla4xxx: Remove an unused function

This was detected by building the kernel with clang and W=1.

Link: https://lore.kernel.org/r/20210415220826.29438-13-bvanassche@acm.org
Cc: Nilesh Javali <njavali@marvell.com>
Acked-by: Manish Rangankar <mrangankar@marvell.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: myrs: Remove unused functions

This was detected by building the kernel with clang and W=1.

Link: https://lore.kernel.org/r/20210415220826.29438-12-bvanassche@acm.org
Cc: Hannes Reinecke <hare@kernel.org>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: myrb: Remove unused functions

This was detected by building the kernel with clang and W=1.

Link: https://lore.kernel.org/r/20210415220826.29438-11-bvanassche@acm.org
Cc: Hannes Reinecke <hare@kernel.org>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: mpt3sas: Fix two kernel-doc headers

Fix the following warnings:

drivers/scsi/mpt3sas/mpt3sas_base.c:5430: warning: Excess function parameter 'ct' description in '_base_allocate_pcie_sgl_pool'
drivers/scsi/mpt3sas/mpt3sas_base.c:5493: warning: Excess function parameter 'ctr' description in '_base_allocate_chain_dma_pool'

Link: https://lore.kernel.org/r/20210415220826.29438-10-bvanassche@acm.org
Fixes: d6adc251dd2f ("scsi: mpt3sas: Force PCIe scatterlist allocations to be within same 4 GB region")
Fixes: 7dd847dae1c4 ("scsi: mpt3sas: Force chain buffer allocations to be within same 4 GB region")
Cc: Sathya Prakash <sathya.prakash@broadcom.com>
Cc: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Cc: Suganath Prabu Subramani <suganath-prabu.subramani@broadcom.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: fcoe: Suppress a compiler warning

Suppress the following compiler warning:

warning: cast to smaller integer type
      'enum fip_mode' from 'void *' [-Wvoid-pointer-to-enum-cast]
        enum fip_mode fip_mode = (enum fip_mode)kp->arg;
                                 ^~~~~~~~~~~~~~~~~~~~~~

Link: https://lore.kernel.org/r/20210415220826.29438-9-bvanassche@acm.org
Cc: Hannes Reinecke <hare@suse.de>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: libfc: Fix a format specifier

Since the 'mfs' member has been declared as 'u32' in include/scsi/libfc.h,
use the %u format specifier instead of %hu. This patch fixes the following
clang compiler warning:

warning: format specifies type
      'unsigned short' but the argument has type 'u32' (aka 'unsigned int')
      [-Wformat]
                             "lport->mfs:%hu\n", mfs, lport->mfs);
                                         ~~~          ^~~~~~~~~~
                                         %u

Link: https://lore.kernel.org/r/20210415220826.29438-8-bvanassche@acm.org
Cc: Hannes Reinecke <hare@suse.de>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: aacraid: Remove an unused function

This was detected by building the kernel with clang and W=1.

Link: https://lore.kernel.org/r/20210415220826.29438-7-bvanassche@acm.org
Cc: aacraid@microsemi.com
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: core: Introduce enum scsi_disposition

Improve readability of the code in the SCSI core by introducing an
enumeration type for the values used internally that decide how to continue
processing a SCSI command. The eh_*_handler return values have not been
changed because that would involve modifying all SCSI drivers.

The output of the following command has been inspected to verify that no
out-of-range values are assigned to a variable of type enum
scsi_disposition:

KCFLAGS=-Wassign-enum make CC=clang W=1 drivers/scsi/

Link: https://lore.kernel.org/r/20210415220826.29438-6-bvanassche@acm.org
Cc: Christoph Hellwig <hch@lst.de>
Cc: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Daniel Wagner <dwagner@suse.de>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: core: Modify the scsi_send_eh_cmnd() return value for the SDEV_BLOCK case

The comment above scsi_send_eh_cmnd() says: "Returns SUCCESS or FAILED or
NEEDS_RETRY". This patch makes all values returned by scsi_send_eh_cmnd()
match the documentation of this function. This change does not affect the
behavior of scsi_eh_tur() nor of scsi_eh_try_stu() nor of the
scsi_request_sense() callers.

See also commit bbe9fb0d04b9 ("scsi: Avoid that .queuecommand() gets called
for a blocked SCSI device"; v5.3).

Link: https://lore.kernel.org/r/20210415220826.29438-5-bvanassche@acm.org
Cc: Christoph Hellwig <hch@lst.de>
Cc: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Daniel Wagner <dwagner@suse.de>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: core: Rename scsi_softirq_done() into scsi_complete()

Commit 320ae51feed5 ("blk-mq: new multi-queue block IO queueing mechanism";
v3.13) introduced a code path that calls the blk-mq completion function
from interrupt context. scsi-mq was introduced by commit d285203cf647
("scsi: add support for a blk-mq based I/O path."; v3.17).

Since the introduction of scsi-mq, scsi_softirq_done() can be called from
interrupt context. That made the name of the function misleading, rename it
to scsi_complete().

Link: https://lore.kernel.org/r/20210415220826.29438-4-bvanassche@acm.org
Cc: Christoph Hellwig <hch@lst.de>
Cc: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Daniel Wagner <dwagner@suse.de>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: core: Remove an incorrect comment

scsi_device.sdev_target is used in more code than the single_lun code,
hence remove the comment next to the definition of the sdev_target member.

Link: https://lore.kernel.org/r/20210415220826.29438-3-bvanassche@acm.org
Cc: Christoph Hellwig <hch@lst.de>
Cc: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Daniel Wagner <dwagner@suse.de>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: core: Make the scsi_alloc_sgtables() documentation more accurate

The current scsi_alloc_sgtables() documentation does not accurately explain
what this function does. Hence improve the documentation of this function.

Link: https://lore.kernel.org/r/20210415220826.29438-2-bvanassche@acm.org
Cc: Christoph Hellwig <hch@lst.de>
Cc: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Daniel Wagner <dwagner@suse.de>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: pm80xx: Remove global lock from outbound queue processing

Introduce spin lock for outbound queue. With this, driver need not acquire
HBA global lock for outbound queue processing.

Link: https://lore.kernel.org/r/20210415103352.3580-9-Viswas.G@microchip.com
Acked-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Viswas G <Viswas.G@microchip.com>
Signed-off-by: Ruksar Devadi <Ruksar.devadi@microchip.com>
Signed-off-by: Ashokkumar N <Ashokkumar.N@microchip.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: pm80xx: Reset PI and CI memory during re-initialization

Producer index(PI) outbound queue and consumer index(CI) for Outbound queue
are in DMA memory. During resume(), the stale PI and CI Values will lead to
unexpected behavior. These values should be reset to 0 during driver
reinitialization.

Link: https://lore.kernel.org/r/20210415103352.3580-8-Viswas.G@microchip.com
Acked-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Viswas G <Viswas.G@microchip.com>
Signed-off-by: Ruksar Devadi <Ruksar.devadi@microchip.com>
Signed-off-by: Ashokkumar N <Ashokkumar.N@microchip.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: pm80xx: Completing pending I/O after fatal error

When controller runs into fatal error, I/Os get stuck with no response,
handler event is defined to complete the pending I/Os (SAS task and
internal task) and also perform the cleanup for the drives.

Link: https://lore.kernel.org/r/20210415103352.3580-7-Viswas.G@microchip.com
Acked-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Ruksar Devadi <Ruksar.devadi@microchip.com>
Signed-off-by: Viswas G <Viswas.G@microchip.com>
Signed-off-by: Ashokkumar N <Ashokkumar.N@microchip.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: pm80xx: Add sysfs attribute to track iop1 count

A new sysfs variable 'ctl_iop1_count' is being introduced that tells if
the controller is alive by indicating controller ticks. If on subsequent
run we see the ticks changing that indicates that controller is not
dead.

Using the 'ctl_iop1_count' sysfs variable we can see ticks incrementing:

    linux-9saw:~# cat  /sys/class/scsi_host/host*/ctl_iop1_count
    0x00000069
    0x0000006b
    0x0000006d
    0x00000072

Link: https://lore.kernel.org/r/20210415103352.3580-6-Viswas.G@microchip.com
Acked-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Vishakha Channapattan <vishakhavc@google.com>
Signed-off-by: Viswas G <Viswas.G@microchip.com>
Signed-off-by: Ruksar Devadi <Ruksar.devadi@microchip.com>
Signed-off-by: Ashokkumar N <Ashokkumar.N@microchip.com>
Signed-off-by: Radha Ramachandran <radha@google.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: pm80xx: Add sysfs attribute to track iop0 count

A new sysfs variable 'ctl_iop0_count' is being introduced that tells if
the controller is alive by indicating controller ticks. If on subsequent
run we see the ticks changing that indicates that controller is not
dead.

Using the 'ctl_iop0_count' sysfs variable we can see ticks incrementing:

    linux-9saw:~# cat /sys/class/scsi_host/host*/ctl_iop0_count
    0x000000a3
    0x000001db
    0x000001e4
    0x000001e7

Link: https://lore.kernel.org/r/20210415103352.3580-5-Viswas.G@microchip.com
Acked-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Vishakha Channapattan <vishakhavc@google.com>
Signed-off-by: Viswas G <Viswas.G@microchip.com>
Signed-off-by: Ruksar Devadi <Ruksar.devadi@microchip.com>
Signed-off-by: Ashokkumar N <Ashokkumar.N@microchip.com>
Signed-off-by: Radha Ramachandran <radha@google.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: pm80xx: Add sysfs attribute to track RAAE count

A new sysfs variable 'ctl_raae_count' is being introduced that tells if the
controller is alive by indicating controller ticks. If on subsequent run we
see the ticks changing in RAAE count that indicates that controller is not
dead.

Using the 'ctl_raae_count' sysfs variable we can see ticks incrementing:

    linux-9saw:~# cat /sys/class/scsi_host/host*/ctl_raae_count
    0x00002245
    0x00002253
    0x0000225e

Link: https://lore.kernel.org/r/20210415103352.3580-4-Viswas.G@microchip.com
Acked-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Vishakha Channapattan <vishakhavc@google.com>
Signed-off-by: Viswas G <Viswas.G@microchip.com>
Signed-off-by: Ruksar Devadi <Ruksar.devadi@microchip.com>
Signed-off-by: Ashokkumar N <Ashokkumar.N@microchip.com>
Signed-off-by: Radha Ramachandran <radha@google.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: pm80xx: Add sysfs attribute to check controller hmi error

A new sysfs variable 'ctl_hmi_error' is being introduced to give the error
details if the MPI initialization fails

Using the 'ctl_hmi_error' sysfs variable we can check the error details:

    linux-2dq0:~# cat /sys/class/scsi_host/host*/ctl_hmi_error
    0x00000000
    0x00000000
    0x00000000

Link: https://lore.kernel.org/r/20210415103352.3580-3-Viswas.G@microchip.com
Signed-off-by: Vishakha Channapattan <vishakhavc@google.com>
Signed-off-by: Viswas G <Viswas.G@microchip.com>
Signed-off-by: Ruksar Devadi <Ruksar.devadi@microchip.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: pm80xx: Add sysfs attribute to check MPI state

A new sysfs variable 'ctl_mpi_state' is being introduced to check the state
of MPI.

Using the 'ctl_mpi_state' sysfs variable we can check the MPI state:

linux-2dq0:~# cat /sys/class/scsi_host/host*/ctl_mpi_state
MPI is successfully initialized

Link: https://lore.kernel.org/r/20210415103352.3580-2-Viswas.G@microchip.com
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Vishakha Channapattan <vishakhavc@google.com>
Signed-off-by: Viswas G <Viswas.G@microchip.com>
Signed-off-by: Ruksar Devadi <Ruksar.devadi@microchip.com>
Signed-off-by: Ashokkumar N <Ashokkumar.N@microchip.com>
Signed-off-by: Radha Ramachandran <radha@google.com>
Signed-off-by: kernel test robot <lkp@intel.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: zfcp: Lift Request Queue tasklet & timer from qdio

The qdio layer currently provides its own infrastructure to scan for
Request Queue completions & to report them to the device driver. This
comes with several drawbacks - having an async tasklet & timer construct in
qdio introduces additional lifetime complexity, and makes it harder to
integrate them with the rest of the device driver. The timeouts are also
currently hard-coded, and can't be tweaked without affecting other qdio
drivers (ie. qeth).

But due to recent enhancements to the qdio layer, zfcp can actually take
full control of the Request Queue completion processing. It merely needs to
opt-out from the qdio layer mechanisms by setting the scan_threshold to 0,
and then use qdio_inspect_queue() to scan for completions.

So re-implement the tasklet & timer mechanism in zfcp, while initially
copying the scan conditions from qdio's handle_outbound() and
qdio_outbound_tasklet(). One minor behavioural change is that
zfcp_qdio_send() will unconditionally reduce the timeout to 1 HZ, rather
than leaving it at 10 Hz if it was last armed by the tasklet. This just
makes things more consistent. Also note that we can drop a lot of the
accumulated cruft in qdio_outbound_tasklet(), as zfcp doesn't even use PCI
interrupt requests any longer.

This also slightly touches the Response Queue processing, as
qdio_get_next_buffers() will no longer implicitly scan for Request Queue
completions. So complete the migration to qdio_inspect_queue() here as well
and make the tasklet_schedule() visible.

Link: https://lore.kernel.org/r/018d3ddd029f8d6ac00cf4184880288c637c4fd1.1618417667.git.bblock@linux.ibm.com
Reviewed-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: zfcp: Move the position of put_device()

Place the put_device() call after device_unregister() in both
zfcp_unit_remove() and zfcp_sysfs_port_remove_store() to make it more
natural. put_device() ought to be the last time we touch the object in both
functions.

Add comments after put_device() to make code clearer.

Link: https://lore.kernel.org/r/0a568c7733ba0f1dde28b0c663b90270d44dd540.1618417667.git.bblock@linux.ibm.com
Suggested-by: Steffen Maier <maier@linux.ibm.com>
Suggested-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Qinglang Miao <miaoqinglang@huawei.com>
Signed-off-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: zfcp: Clean up sysfs code for SFP diagnostics

The error path from zfcp_adapter_enqueue() no longer attempts to remove the
diagnostics attributes if they haven't been created yet.

So remove the manual 'sysfs_established' guard for this case, and use
device_add_groups() to add all adapter-related sysfs attributes in one go.

Link: https://lore.kernel.org/r/37a97537f675d643006271f37723c346189b6eec.1618417667.git.bblock@linux.ibm.com
Reviewed-by: Benjamin Block <bblock@linux.ibm.com>
Reviewed-by: Steffen Maier <maier@linux.ibm.com>
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: zfcp: Fix sysfs roll-back on error in zfcp_adapter_enqueue()

When zfcp_adapter_enqueue() fails to create the zfcp_sysfs_adapter_attrs
group, it calls zfcp_adapter_unregister() to tear down the adapter state
again. This then unconditionally attempts to remove the
zfcp_sysfs_adapter_attrs group, resulting in a "group not found" WARN from
sysfs code.

Avoid this by copying most of zfcp_adapter_unregister() into the error
path, allowing for more fine-granular roll-back. Then skip the sysfs
tear-down steps if we haven't progressed this far in the initialization.

Link: https://lore.kernel.org/r/790922cc3af075795fff9a4b787e6bda19bdb3be.1618417667.git.bblock@linux.ibm.com
Reviewed-by: Benjamin Block <bblock@linux.ibm.com>
Reviewed-by: Steffen Maier <maier@linux.ibm.com>
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: zfcp: Fix indentation coding style issue

Code indentation should use tabs where possible.

Link: https://lore.kernel.org/r/e8a15a2f3d64e2e76a214647cfd4fe23d370b165.1618417667.git.bblock@linux.ibm.com
Signed-off-by: Yevhen Viktorov <yevhen.viktorov@virginmedia.com>
Signed-off-by: Steffen Maier <maier@linux.ibm.com>
Signed-off-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: zfcp: Remove unneeded INIT_LIST_HEAD() for FSF requests

INIT_LIST_HEAD() is only needed for actual list heads, while req->list is
used as a list entry.

Note that when the error path in zfcp_fsf_req_send() removes the request
from the adapter's list of pending requests, it actually looks up the
request from the zfcp_reqlist - rather than just calling list_del(). So
there's no risk of us calling list_del() on a request that hasn't been
added to any list yet.

Link: https://lore.kernel.org/r/254dc0ae28dccc43ab0b1079ef2c8dcb5fe1d2e4.1618417667.git.bblock@linux.ibm.com
Reviewed-by: Benjamin Block <bblock@linux.ibm.com>
Reviewed-by: Steffen Maier <maier@linux.ibm.com>
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: qla2xxx: Reserve extra IRQ vectors

Commit a6dcfe08487e ("scsi: qla2xxx: Limit interrupt vectors to number of
CPUs") lowers the number of allocated MSI-X vectors to the number of CPUs.

That breaks vector allocation assumptions in qla83xx_iospace_config(),
qla24xx_enable_msix() and qla2x00_iospace_config(). Either of the functions
computes maximum number of qpairs as:

  ha->max_qpairs = ha->msix_count - 1 (MB interrupt) - 1 (default
                   response queue) - 1 (ATIO, in dual or pure target mode)

max_qpairs is set to zero in case of two CPUs and initiator mode. The
number is then used to allocate ha->queue_pair_map inside
qla2x00_alloc_queues(). No allocation happens and ha->queue_pair_map is
left NULL but the driver thinks there are queue pairs available.

qla2xxx_queuecommand() tries to find a qpair in the map and crashes:

  if (ha->mqenable) {
          uint32_t tag;
          uint16_t hwq;
          struct qla_qpair *qpair = NULL;

          tag = blk_mq_unique_tag(cmd->request);
          hwq = blk_mq_unique_tag_to_hwq(tag);
          qpair = ha->queue_pair_map[hwq]; # <- HERE

          if (qpair)
                  return qla2xxx_mqueuecommand(host, cmd, qpair);
  }

  BUG: kernel NULL pointer dereference, address: 0000000000000000
  #PF: supervisor read access in kernel mode
  #PF: error_code(0x0000) - not-present page
  PGD 0 P4D 0
  Oops: 0000 [#1] SMP PTI
  CPU: 0 PID: 72 Comm: kworker/u4:3 Tainted: G        W         5.10.0-rc1+ #25
  Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.0.0-prebuilt.qemu-project.org 04/01/2014
  Workqueue: scsi_wq_7 fc_scsi_scan_rport [scsi_transport_fc]
  RIP: 0010:qla2xxx_queuecommand+0x16b/0x3f0 [qla2xxx]
  Call Trace:
   scsi_queue_rq+0x58c/0xa60
   blk_mq_dispatch_rq_list+0x2b7/0x6f0
   ? __sbitmap_get_word+0x2a/0x80
   __blk_mq_sched_dispatch_requests+0xb8/0x170
   blk_mq_sched_dispatch_requests+0x2b/0x50
   __blk_mq_run_hw_queue+0x49/0xb0
   __blk_mq_delay_run_hw_queue+0xfb/0x150
   blk_mq_sched_insert_request+0xbe/0x110
   blk_execute_rq+0x45/0x70
   __scsi_execute+0x10e/0x250
   scsi_probe_and_add_lun+0x228/0xda0
   __scsi_scan_target+0xf4/0x620
   ? __pm_runtime_resume+0x4f/0x70
   scsi_scan_target+0x100/0x110
   fc_scsi_scan_rport+0xa1/0xb0 [scsi_transport_fc]
   process_one_work+0x1ea/0x3b0
   worker_thread+0x28/0x3b0
   ? process_one_work+0x3b0/0x3b0
   kthread+0x112/0x130
   ? kthread_park+0x80/0x80
   ret_from_fork+0x22/0x30

The driver should allocate enough vectors to provide every CPU it's own HW
queue and still handle reserved (MB, RSP, ATIO) interrupts.

The change fixes the crash on dual core VM and prevents unbalanced QP
allocation where nr_hw_queues is two less than the number of CPUs.

Link: https://lore.kernel.org/r/20210412165740.39318-1-r.bolshakov@yadro.com
Fixes: a6dcfe08487e ("scsi: qla2xxx: Limit interrupt vectors to number of CPUs")
Cc: Daniel Wagner <daniel.wagner@suse.com>
Cc: Himanshu Madhani <himanshu.madhani@oracle.com>
Cc: Quinn Tran <qutran@marvell.com>
Cc: Nilesh Javali <njavali@marvell.com>
Cc: Martin K. Petersen <martin.petersen@oracle.com>
Cc: stable@vger.kernel.org # 5.11+
Reported-by: Aleksandr Volkov <a.y.volkov@yadro.com>
Reported-by: Aleksandr Miloserdov <a.miloserdov@yadro.com>
Reviewed-by: Daniel Wagner <dwagner@suse.de>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Roman Bolshakov <r.bolshakov@yadro.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: smartpqi: Fix device pointer variable reference static checker issue

Dan Carpenter found a possible NULL pointer dereference issue in function
pqi_sas_port_add_rphy():

   drivers/scsi/smartpqi/smartpqi_sas_transport.c:97
   pqi_sas_port_add_rphy() warn: variable dereferenced before
   check 'pqi_sas_port->device' (see line 95)

Correct issue by moving reference of pqi_sas_port->device after the check
for the device pointer being non-NULL.

Link: https://www.mail-archive.com/kbuild@lists.01.org/msg06329.html
Link: https://lore.kernel.org/r/161850493026.7302.10032784239320437353.stgit@brunhilda
Fixes: ec504b23df9d ("scsi: smartpqi: Add phy ID support for the physical drives")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Reported-by: kernel test robot <lkp@intel.com>
Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
Reviewed-by: Scott Teel <scott.teel@microchip.com>
Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com>
Reviewed-by: Kevin Barnett <kevin.barnett@microchip.com>
Signed-off-by: Don Brace <don.brace@microchip.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: smartpqi: Fix blocks_per_row static checker issue

Dan Carpenter found a possible divide by 0 issue in the smartpqi driver in
functions pci_get_aio_common_raid_map_values() and pqi_calc_aio_r5_or_r6().
The variable rmd->blocks_per_row is used as a divisor and could be 0.

       Using rmd->blocks_per_row as a divisor without checking
       it for 0 first.

Correct these possible divide by 0 conditions by insuring that
rmd->blocks_per_row is not zero before usage.  The check for non-0 was too
late to prevent a divide by 0 condition.  Add in a comment to explain why
the check for non-zero is necessary. If the member is 0, return
PQI_RAID_BYPASS_INELIGIBLE before any division is performed.

Link: https://lore.kernel.org/linux-scsi/YG%2F5kWHHAr7w5dU5@mwanda/
Link: https://lore.kernel.org/r/161850492435.7302.392780350442938047.stgit@brunhilda
Fixes: 6702d2c40f31 ("scsi: smartpqi: Add support for RAID5 and RAID6 writes")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Reported-by: kernel test robot <lkp@intel.com>
Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
Reviewed-by: Scott Teel <scott.teel@microchip.com>
Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com>
Reviewed-by: Kevin Barnett <kevin.barnett@microchip.com>
Signed-off-by: Don Brace <don.brace@microchip.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: ibmvfc: Fix invalid state machine BUG_ON()

This fixes an issue hitting the BUG_ON() in ibmvfc_do_work(). When going
through a host action of IBMVFC_HOST_ACTION_RESET, we change the action to
IBMVFC_HOST_ACTION_TGT_DEL, then drop the host lock, and reset the CRQ,
which changes the host state to IBMVFC_NO_CRQ. If, prior to setting the
host state to IBMVFC_NO_CRQ, ibmvfc_init_host() is called, it can then end
up changing the host action to IBMVFC_HOST_ACTION_INIT. If we then change
the host state to IBMVFC_NO_CRQ, we will then hit the BUG_ON().

Make a couple of changes to avoid this. Leave the host action to be
IBMVFC_HOST_ACTION_RESET or IBMVFC_HOST_ACTION_REENABLE until after we drop
the host lock and reset or reenable the CRQ. Also harden the host state
machine to ensure we cannot leave the reset / reenable state until we've
finished processing the reset or reenable.

Link: https://lore.kernel.org/r/20210413001009.902400-1-tyreld@linux.ibm.com
Fixes: 73ee5d867287 ("[SCSI] ibmvfc: Fix soft lockup on resume")
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
[tyreld: added fixes tag]
Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
[mkp: fix comment checkpatch warnings]
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: lpfc: Copyright updates for 12.8.0.9 patches

Update copyrights to 2021 for files modified in the 12.8.0.9 patch set.

Link: https://lore.kernel.org/r/20210412013127.2387-17-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: lpfc: Update lpfc version to 12.8.0.9

Update lpfc version to 12.8.0.9

Link: https://lore.kernel.org/r/20210412013127.2387-16-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: lpfc: Eliminate use of LPFC_DRIVER_NAME in lpfc_attr.c

During code inspection, several cases of creating a dynamic attribute names
in logs messages using a define was found. This is unnecessary.

Place the native symbol name in the log messages.

Link: https://lore.kernel.org/r/20210412013127.2387-15-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: lpfc: Standardize discovery object logging format

Code inspection showed lpfc was using three different pointer formats when
logging discovery object pointers.

Standardize the pointer format to x%px.

Note: %px use is limited to discovery objects in order to aid core
analysis.

Link: https://lore.kernel.org/r/20210412013127.2387-14-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: lpfc: Fix various trivial errors in comments and log messages

Clean up minor issues spotted by tools and code review:

- Spelling Errors

- Spurious characters and errors in function headers

- nvme_info wqerr and err fields source data reversed

- Extraneous new line in log message 0466

- Spacing error in log message 0109

- Messages 0140 and 0141 have portname and nodename reversed

- Incorrect function labelling in comment

Link: https://lore.kernel.org/r/20210412013127.2387-13-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: lpfc: Remove unsupported mbox PORT_CAPABILITIES logic

SLI-4 does not contain a PORT_CAPABILITIES mailbox command (only SLI-3
does, and SLI-3 doesn't use it), yet there are SLI-4 code paths that have
code to issue the command. The command will always fail.

Remove the code for the mailbox command and leave only the resulting
"failure path" logic.

Link: https://lore.kernel.org/r/20210412013127.2387-12-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: lpfc: Fix lpfc_hdw_queue attribute being ignored

The lpfc_hdw_queue attribute is to set the number of hardware queues to be
created on the adapter. Normally, the value is set to a default, which
allows the hw queue count to be sized dynamically based on adapter
capabilities, CPU/platform architecture, or CPU type. Currently, when
lpfc_hdw_queue is set to a specific value, is has no effect and the dynamic
sizing occurs.

The routine checking whether parameters are default or not ignores the
lpfc_hdw_queue setting and invokes the dynamic logic.

Fix the routine to additionally check the lpfc_hdw_queue attribute value
before using dynamic scaling. Additionally, SLI-3 supports only a small
number of queues with dedicated functions, thus it needs to be exempted
from the variable scaling and set to the expected values.

Link: https://lore.kernel.org/r/20210412013127.2387-11-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: lpfc: Fix missing FDMI registrations after Mgmt Svc login

FDMI registration needs to be performed after every login with the FC Mgmt
service. The flag the driver is using to track registration is cleared on
link up, but never on Mgmt service logout/re-login.

Fix by clearing the flag whenever a new login is completed with the FC Mgmt
service.

While perusing the flag use, logging was performed as if FDMI registration
occurred on vports. However, it is limited to the physical port only.
Revise the logging to reflect physical port based.

Link: https://lore.kernel.org/r/20210412013127.2387-10-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: lpfc: Fix silent memory allocation failure in lpfc_sli4_bsg_link_diag_test()

In the unlikely case of a failure to allocate an LPFC_MBOXQ_t structure, no
return status is set, thus the routine never logs an error and returns
success to the callee.

Fix by setting a return code on failure.

Link: https://lore.kernel.org/r/20210412013127.2387-9-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: lpfc: Fix use-after-free on unused nodes after port swap

During target port swap, the swap logic ignores the DROPPED flag in the
nodes. As a node then moves into the UNUSED state, the reference count will
be dropped. If a node is later reused and moved out of the UNUSED state, an
access can result in a use-after-free assert.

Fix by having the port swap logic propagate the DROPPED flag when switching
nodes. This will avoid reference from being dropped.

Link: https://lore.kernel.org/r/20210412013127.2387-8-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: lpfc: Fix error handling for mailboxes completed in MBX_POLL mode

In SLI-4, when performing a mailbox command with MBX_POLL, the driver uses
the BMBX register to send the command rather than the MQ. A flag is set
indicating the BMBX register is active and saves the mailbox job struct
(mboxq) in the mbox_active element of the adapter. The routine then waits
for completion or timeout. The mailbox job struct is not freed by the
routine. In cases of timeout, the adapter will be reset. The
lpfc_sli_mbox_sys_flush() routine will clean up the mbox in preparation for
the reset. It clears the BMBX active flag and marks the job structure as
MBX_NOT_FINISHED. But, it never frees the mboxq job structure. Expectation
in both normal completion and timeout cases is that the issuer of the mbx
command will free the structure. Unfortunately, not all calling paths are
freeing the memory in cases of error.

All calling paths were looked at and updated, if missing, to free the mboxq
memory regardless of completion status.

Link: https://lore.kernel.org/r/20210412013127.2387-7-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: lpfc: Fix lack of device removal on port swaps with PRLIs

During target port-swap testing with link flips, the initiator could
encounter PRLI errors. If the target node disappears permanently, the ndlp
is found stuck in UNUSED state with ref count of 1. The rmmod of the driver
will hang waiting for this node to be freed.

While handling a link error in PRLI completion path, the code intends to
skip triggering the discovery state machine. However this is causing the
final reference release path to be skipped. This causes the node to be
stuck with ref count of 1

Fix by ensuring the code path triggers the device removal event on the node
state machine.

Link: https://lore.kernel.org/r/20210412013127.2387-6-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: lpfc: Fix NMI crash during rmmod due to circular hbalock dependency

Remove hbalock dependency for lpfc_abts_els_sgl_list and
lpfc_abts_nvmet_ctx_list. The lists are adaquately synchronized with the
sgl_list_lock and abts_nvmet_buf_list_lock.

Link: https://lore.kernel.org/r/20210412013127.2387-5-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: lpfc: Fix reference counting errors in lpfc_cmpl_els_rsp()

Call traces are being seen that result from a nodelist structure ref
counting error. They are typically seen after transmission of an LS_RJT ELS
response.

Aged code in lpfc_cmpl_els_rsp() calls lpfc_nlp_not_used() which, if the
ndlp reference count is exactly 1, will decrement the reference count.
Previously lpfc_nlp_put() was within lpfc_els_free_iocb(), and the 'put'
within the free would only be invoked if cmdiocb->context1 was not NULL.
Since the nodelist structure reference count is decremented when exiting
lpfc_cmpl_els_rsp() the lpfc_nlp_not_used() calls are no longer required.
Calling them is causing the reference count issue.

Fix by removing the lpfc_nlp_not_used() calls.

Link: https://lore.kernel.org/r/20210412013127.2387-4-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: lpfc: Fix crash when a REG_RPI mailbox fails triggering a LOGO response

Fix a crash caused by a double put on the node when the driver completed an
ACC for an unsolicted abort on the same node.  The second put was executed
by lpfc_nlp_not_used() and is wrong because the completion routine executes
the nlp_put when the iocbq was released.  Additionally, the driver is
issuing a LOGO then immediately calls lpfc_nlp_set_state to put the node
into NPR.  This call does nothing.

Remove the lpfc_nlp_not_used call and additional set_state in the
completion routine.  Remove the lpfc_nlp_set_state post issue_logo.  Isn't
necessary.

Link: https://lore.kernel.org/r/20210412013127.2387-3-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: lpfc: Fix rmmod crash due to bad ring pointers to abort_iotag

Rmmod on SLI-4 adapters is sometimes hitting a bad ptr dereference in
lpfc_els_free_iocb().

A prior patch refactored the lpfc_sli_abort_iocb() routine. One of the
changes was to convert from building/sending an abort within the routine to
using a common routine. The reworked routine passes, without modification,
the pring ptr to the new common routine. The older routine had logic to
check SLI-3 vs SLI-4 and adapt the pring ptr if necessary as callers were
passing SLI-3 pointers even when not on an SLI-4 adapter. The new routine
is missing this check and adapt, so the SLI-3 ring pointers are being used
in SLI-4 paths.

Fix by cleaning up the calling routines. In review, there is no need to
pass the ring ptr argument to abort_iocb at all. The routine can look at
the adapter type itself and reference the proper ring.

Link: https://lore.kernel.org/r/20210412013127.2387-2-jsmart2021@gmail.com
Fixes: db7531d2b377 ("scsi: lpfc: Convert abort handling to SLI-3 and SLI-4 handlers")
Cc: <stable@vger.kernel.org> # v5.11+
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: isci: Remove unnecessary struct declaration

struct sci_phy_proto was already defined on line 142. The declaration here
is unnecessary. Remove it.

Link: https://lore.kernel.org/r/20210406105913.676746-1-wanjiabing@vivo.com
Signed-off-by: Wan Jiabing <wanjiabing@vivo.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: message: fusion: Remove unused local variable 'vtarget'

Fix the following gcc warning:

drivers/message/fusion/mptsas.c:783:14: warning: variable ‘vtarget’ set
but not used [-Wunused-but-set-variable].

Link: https://lore.kernel.org/r/1618207146-96542-1-git-send-email-jiapeng.chong@linux.alibaba.com
Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: message: fusion: Remove unused local variable 'status'

Fix the following gcc warning:

drivers/message/fusion/mptbase.c:3087:9: warning: variable ‘status’ set
but not used [-Wunused-but-set-variable].

Link: https://lore.kernel.org/r/1617872780-126448-1-git-send-email-jiapeng.chong@linux.alibaba.com
Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: message: fusion: Remove unused local variable 'port'

Fixes the following W=1 kernel build warning:

drivers/message/fusion/mptctl.c: In function ‘mptctl_gettargetinfo
drivers/message/fusion/mptctl.c:1372:7: warning: variable ‘port’ set but not used [-Wunused-but-set-variable]

Link: https://lore.kernel.org/r/20210408061851.3089-3-thunder.leizhen@huawei.com
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: message: fusion: Remove unused local variable 'time_count'

Fixes the following W=1 kernel build warning:

drivers/message/fusion/mptctl.c: In function ‘mptctl_do_taskmgmt:
drivers/message/fusion/mptctl.c:324:17: warning: variable ‘time_count’ set but not used [-Wunused-but-set-variable]

Link: https://lore.kernel.org/r/20210408061851.3089-2-thunder.leizhen@huawei.com
Fixes: 7d757f185540 ("[SCSI] mptfusion: Updated SCSI IO IOCTL error handling.")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: qla4xxx: Remove unneeded if-null-free check

Eliminate the following coccicheck warning:

drivers/scsi/qla4xxx/ql4_os.c:4175:2-7: WARNING:
NULL check before some freeing functions is not needed.
drivers/scsi/qla4xxx/ql4_os.c:4196:2-7: WARNING:
NULL check before some freeing functions is not needed.
drivers/scsi/qla4xxx/ql4_os.c:4215:2-7: WARNING:
NULL check before some freeing functions is not needed.
drivers/scsi/qla4xxx/ql4_os.c:6400:2-7: WARNING:
NULL check before some freeing functions is not needed.
drivers/scsi/qla4xxx/ql4_os.c:6402:2-7: WARNING:
NULL check before some freeing functions is not needed.
drivers/scsi/qla4xxx/ql4_os.c:6555:2-7: WARNING:
NULL check before some freeing functions is not needed.
drivers/scsi/qla4xxx/ql4_os.c:6557:2-7: WARNING:
NULL check before some freeing functions is not needed.
drivers/scsi/qla4xxx/ql4_os.c:7838:2-7: WARNING:
NULL check before some freeing functions is not needed.
drivers/scsi/qla4xxx/ql4_os.c:7840:2-7: WARNING:
NULL check before some freeing functions is not needed.

Link: https://lore.kernel.org/r/20210409120345.6447-1-linqiheng@huawei.com
Signed-off-by: Qiheng Lin <linqiheng@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: qla2xxx: Reuse existing error handling path

There is no need to duplicate code, use the existing error handling path to
free resources. This is more future-proof.

Link: https://lore.kernel.org/r/6973844a1532ec2dc8e86f3533362e79d78ed774.1618132821.git.christophe.jaillet@wanadoo.fr
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: qla2xxx: Remove unneeded if-null-free check

Eliminate the following coccicheck warning:

drivers/scsi/qla2xxx/qla_os.c:4622:2-7:
WARNING: NULL check before some freeing functions is not needed.
drivers/scsi/qla2xxx/qla_os.c:4637:3-8:
WARNING: NULL check before some freeing functions is not needed.

Link: https://lore.kernel.org/r/20210409120925.7122-1-linqiheng@huawei.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Qiheng Lin <linqiheng@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: mpt3sas: Fix out-of-bounds warnings in _ctl_addnl_diag_query

Fix the following out-of-bounds warnings by embedding existing struct
htb_rel_query into struct mpt3_addnl_diag_query, instead of duplicating its
members:

include/linux/fortify-string.h:20:29: warning: '__builtin_memcpy' offset [19, 32] from the object at 'karg' is out of the bounds of referenced subobject 'buffer_rel_condition' with type 'short unsigned int' at offset 16 [-Warray-bounds]
include/linux/fortify-string.h:22:29: warning: '__builtin_memset' offset [19, 32] from the object at 'karg' is out of the bounds of referenced subobject 'buffer_rel_condition' with type 'short unsigned int' at offset 16 [-Warray-bounds]

The problem is that the original code is trying to copy data into a bunch
of struct members adjacent to each other in a single call to memcpy(). All
those members are exactly the same contained in struct htb_rel_query, so
instead of duplicating them into struct mpt3_addnl_diag_query, replace them
with new member rel_query of type struct htb_rel_query. So, now that this
new object is introduced, memcpy() doesn't overrun the length of
&karg.buffer_rel_condition, because the address of the new struct object
_rel_query_ is used as destination, instead. The same issue is present when
calling memset(), and it is fixed with this same approach.

Below is a comparison of struct mpt3_addnl_diag_query, before and after
this change (the size and cachelines remain the same):

$ pahole -C mpt3_addnl_diag_query drivers/scsi/mpt3sas/mpt3sas_ctl.o
struct mpt3_addnl_diag_query {
struct mpt3_ioctl_header   hdr;                  /*     0    12 */
uint32_t                   unique_id;            /*    12     4 */
uint16_t                   buffer_rel_condition; /*    16     2 */
uint16_t                   reserved1;            /*    18     2 */
uint32_t                   trigger_type;         /*    20     4 */
uint32_t                   trigger_info_dwords[2]; /*    24     8 */
uint32_t                   reserved2[2];         /*    32     8 */

/* size: 40, cachelines: 1, members: 7 */
/* last cacheline: 40 bytes */
};

$ pahole -C mpt3_addnl_diag_query drivers/scsi/mpt3sas/mpt3sas_ctl.o
struct mpt3_addnl_diag_query {
struct mpt3_ioctl_header   hdr;                  /*     0    12 */
uint32_t                   unique_id;            /*    12     4 */
struct htb_rel_query       rel_query;            /*    16    16 */
uint32_t                   reserved2[2];         /*    32     8 */

/* size: 40, cachelines: 1, members: 4 */
/* last cacheline: 40 bytes */
};

Also, this helps with the ongoing efforts to globally enable -Warray-bounds
and get us closer to being able to tighten the FORTIFY_SOURCE routines on
memcpy().

Link: https://github.com/KSPP/linux/issues/109
Link: https://lore.kernel.org/lkml/60659889.bJJILx2THu3hlpxW%25lkp@intel.com/
Link: https://lore.kernel.org/r/20210401162054.GA397186@embeddedor
Build-tested-by: kernel test robot <lkp@intel.com>
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: qedf: Use devlink to report errors and recovery

Use devlink_health_report() to push error indications.

Implement this in qede via a callback function to make it possible to reuse
it for other drivers sitting on top of qed in future. Also remove forcible
recovery trigger and put it as a normal devlink callback in qed module.

This allows user to enable/disable it via:

devlink health set pci/xxxx:xx:xx.x reporter fw_fatal auto_recover false

Link: https://lore.kernel.org/r/20210331164917.24662-3-jhasan@marvell.com
Signed-off-by: Javed Hasan <jhasan@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: qedf: Enable devlink support

Devlink instance lifetime was linked to qed_dev object. That caused devlink
to be recreated on each recovery.

Change it by making higher level driver (qede) responsible for lifetime
management. This way devlink survives recoveries.

qede now stores devlink structure pointer as a part of its device object,
devlink private data contains a linkage structure, qed_devlink.

Link: https://lore.kernel.org/r/20210331164917.24662-2-jhasan@marvell.com
Signed-off-by: Javed Hasan <jhasan@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: sni_53c710: Add IRQ check

The driver neglects to check the result of platform_get_irq()'s call and
blithely passes the negative error codes to request_irq() (which takes
*unsigned* IRQ #s), causing it to fail with -EINVAL (overridden by -ENODEV
further below). Stop calling request_irq() with the invalid IRQ #s.

Link: https://lore.kernel.org/r/8f4b8fa5-8251-b977-70a1-9099bcb4bb17@omprussia.ru
Fixes: c27d85f3f3c5 ("[SCSI] SNI RM 53c710 driver")
Signed-off-by: Sergey Shtylyov <s.shtylyov@omprussia.ru>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: sun3x_esp: Add IRQ check

The driver neglects to check the result of platform_get_irq()'s call and
blithely passes the negative error codes to request_irq() (which takes
*unsigned* IRQ #), causing it to fail with -EINVAL, overriding the real
error code. Stop calling request_irq() with the invalid IRQ #s.

Link: https://lore.kernel.org/r/363eb4c8-a3bf-4dc9-2a9e-90f349030a15@omprussia.ru
Fixes: 0bb67f181834 ("[SCSI] sun3x_esp: convert to esp_scsi")
Signed-off-by: Sergey Shtylyov <s.shtylyov@omprussia.ru>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: jazz_esp: Add IRQ check

The driver neglects to check the result of platform_get_irq()'s call and
blithely passes the negative error codes to request_irq() (which takes
*unsigned* IRQ #), causing it to fail with -EINVAL, overriding the real
error code. Stop calling request_irq() with the invalid IRQ #s.

Link: https://lore.kernel.org/r/594aa9ae-2215-49f6-f73c-33bd38989912@omprussia.ru
Fixes: 352e921f0dd4 ("[SCSI] jazz_esp: converted to use esp_core")
Signed-off-by: Sergey Shtylyov <s.shtylyov@omprussia.ru>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: hisi_sas: Fix IRQ checks

Commit df2d8213d9e3 ("hisi_sas: use platform_get_irq()") failed to take
into account that irq_of_parse_and_map() and platform_get_irq() have a
different way of indicating an error: the former returns 0 and the latter
returns a negative error code. Fix up the IRQ checks!

Link: https://lore.kernel.org/r/810f26d3-908b-1d6b-dc5c-40019726baca@omprussia.ru
Fixes: df2d8213d9e3 ("hisi_sas: use platform_get_irq()")
Acked-by: John Garry <john.garry@huawei.com>
Signed-off-by: Sergey Shtylyov <s.shtylyov@omprussia.ru>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: ufs: ufshcd-pltfrm: Fix deferred probing

The driver overrides the error codes returned by platform_get_irq() to
-ENODEV, so if it returns -EPROBE_DEFER, the driver would fail the probe
permanently instead of the deferred probing. Propagate the error code
upstream as it should have been done from the start...

Link: https://lore.kernel.org/r/420364ca-614a-45e3-4e35-0e0653c7bc53@omprussia.ru
Fixes: 2953f850c3b8 ("[SCSI] ufs: use devres functions for ufshcd")
Signed-off-by: Sergey Shtylyov <s.shtylyov@omprussia.ru>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: snic: Convert to DEFINE_SHOW_ATTRIBUTE()

Use DEFINE_SHOW_ATTRIBUTE() macro to simplify the code.

Link: https://lore.kernel.org/r/20210331065326.18804-1-dingsenjie@163.com
Signed-off-by: dingsenjie <dingsenjie@yulong.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: ufs: ufs-qcom: Remove redundant dev_err() call in ufs_qcom_init()

There is a error message within devm_ioremap_resource() already, so remove
the dev_err() call to avoid redundant error message.

Link: https://lore.kernel.org/r/20210409075522.2111083-1-yebin10@huawei.com
Reported-by: Hulk Robot <hulkci@huawei.com>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Ye Bin <yebin10@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: hisi_sas: Print SATA device SAS address for soft reset failure

Add (pseudo) SAS address for ATA software reset failure log to assist in
debugging.

Link: https://lore.kernel.org/r/1617709711-195853-7-git-send-email-john.garry@huawei.com
Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: hisi_sas: Warn in v3 hw channel interrupt handler when status reg cleared

If a channel interrupt occurs without any status bit set, the handler will
return directly. However, if such redundant interrupts are received, it's
better to check what happen, so add logs for this.

Link: https://lore.kernel.org/r/1617709711-195853-6-git-send-email-john.garry@huawei.com
Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com>
Signed-off-by: Yihang Li <liyihang6@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: hisi_sas: Directly snapshot registers when executing a reset

The debugfs snapshot should be executed before the reset occurs to ensure
that the register contents are saved properly.

As such, it is incorrect to queue the debugfs dump when running a reset as
the reset will occur prior to the snapshot work item is handler.

Therefore, directly snapshot registers in the reset work handler.

Link: https://lore.kernel.org/r/1617709711-195853-5-git-send-email-john.garry@huawei.com
Signed-off-by: Jianqin Xie <xiejianqin@hisilicon.com>
Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: hisi_sas: Call sas_unregister_ha() to roll back if .hw_init() fails

Function sas_unregister_ha() needs to be called to roll back if
hisi_hba->hw->hw_init() fails in function hisi_sas_probe() or
hisi_sas_v3_probe(). Make that change.

Link: https://lore.kernel.org/r/1617709711-195853-4-git-send-email-john.garry@huawei.com
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: hisi_sas: Print SAS address for v3 hw erroneous completion print

To help debugging efforts, print the device SAS address for v3 hw erroneous
completion log.

Here is an example print:

hisi_sas_v3_hw 0000:b4:02.0: erroneous completion iptt=2193 task=000000002b0c13f8 dev id=17 addr=570fd45f9d17b001

Link: https://lore.kernel.org/r/1617709711-195853-3-git-send-email-john.garry@huawei.com
Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: hisi_sas: Delete some unused callbacks

The debugfs code has been relocated to v3 hw driver, so delete unused
struct hisi_sas_hw function pointers snapshot_{prepare, restore}.

Link: https://lore.kernel.org/r/1617709711-195853-2-git-send-email-john.garry@huawei.com
Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: pm8001: Clean up open braces

checkpatch reports the following:

    ERROR: that open brace { should be on the previous line
    +static struct error_fw flash_error_table[] =
    +{

Fix a couple of instances of misplaced open bracket.

Link: https://lore.kernel.org/r/1617886593-36421-3-git-send-email-luojiaxing@huawei.com
Acked-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com>
Signed-off-by: Jianqin Xie <xiejianqin@hisilicon.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: pm8001: Clean up white space

checkpatch reports the following:

ERROR: space prohibited before that ',' (ctx:WxW)
+int pm8001_mpi_general_event(struct pm8001_hba_info *pm8001_ha , void *piomb);

Remove unnecessary whitespace.

Link: https://lore.kernel.org/r/1617886593-36421-2-git-send-email-luojiaxing@huawei.com
Acked-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com>
Signed-off-by: Jianqin Xie <xiejianqin@hisilicon.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: pm80xx: Fix potential infinite loop

The for-loop iterates with a u8 loop counter i and compares this with the
loop upper limit of pm8001_ha->max_q_num which is a u32 type. There is a
potential infinite loop if pm8001_ha->max_q_num is larger than the u8 loop
counter. Fix this by making the loop counter the same type as
pm8001_ha->max_q_num.

[mkp: this is purely theoretical, max_q_num is currently limited to 64]

Link: https://lore.kernel.org/r/20210407135840.494747-1-colin.king@canonical.com
Fixes: 65df7d1986a1 ("scsi: pm80xx: Fix chip initialization failure")
Addresses-Coverity: ("Infinite loop")
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: pm80xx: Remove busy wait from mpi_uninit_check()

mpi_uninit_check() is not being called in an atomic context. The only
caller of mpi_uninit_check() is pm80xx_chip_soft_rst().

Callers of pm80xx_chip_soft_rst():

- pm8001_ioctl_soft_reset()
- pm8001_pci_probe()
- pm8001_pci_remove()
- pm8001_pci_suspend()
- pm8001_pci_resume()

There was a similar fix for mpi_init_check() in commit
d71023af4bec ("scsi: pm80xx: Do not busy wait in MPI init check")

Link: https://lore.kernel.org/r/20210406180534.1924345-3-ipylypiv@google.com
Reviewed-by: Vishakha Channapattan <vishakhavc@google.com>
Acked-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Igor Pylypiv <ipylypiv@google.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: pm80xx: Increase timeout for pm80xx mpi_uninit_check()

The mpi_uninit_check() takes longer for inbound doorbell register to be
cleared. Increase the timeout substantially so that the driver does not
fail to load.

Previously, the inbound doorbell wait time was mistakenly increased in the
mpi_init_check() instead of mpi_uninit_check(). It is okay to leave the
mpi_init_check() wait time as-is as these are timeout values and if there
is a failure, waiting longer is not an issue.

Link: https://lore.kernel.org/r/20210406180534.1924345-2-ipylypiv@google.com
Fixes: e90e236250e9 ("scsi: pm80xx: Increase timeout for pm80xx mpi_uninit_check")
Reviewed-by: Vishakha Channapattan <vishakhavc@google.com>
Acked-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Igor Pylypiv <ipylypiv@google.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: target: tcmu: Make data_pages_per_blk changeable via configfs

Make data_pages_per_blk changeable similar to the way it is done for
max_data_area_mb. One can change the value by typing:

echo "data_pages_per_blk=N" >control

The value is printed when doing:

cat info

In addition, a new readonly attribute 'data_pages_per_blk' returns the
value on read.

Link: https://lore.kernel.org/r/20210324195758.2021-7-bostroesser@gmail.com
Signed-off-by: Bodo Stroesser <bostroesser@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: target: tcmu: Replace block size definitions with new udev members

Replace DATA_PAGES_PER_BLK and DATA_BLOCK_SIZE with new struct elements
tcmu_dev->data_pages_per_blk and tcmu_dev->data_blk_size. These new
variables are still loaded with constant definition DATA_PAGES_PER_BLK_DEF
(= 1) and DATA_PAGES_PER_BLK_DEF * PAGE_SIZE.

There is no way yet to set the values via configfs.

Link: https://lore.kernel.org/r/20210324195758.2021-6-bostroesser@gmail.com
Signed-off-by: Bodo Stroesser <bostroesser@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: target: tcmu: Remove function tcmu_get_block_page()

There is only one caller of tcmu_get_block_page left. Since it is a
one-liner, we can remove the function.

Link: https://lore.kernel.org/r/20210324195758.2021-5-bostroesser@gmail.com
Signed-off-by: Bodo Stroesser <bostroesser@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: target: tcmu: Support DATA_BLOCK_SIZE = N * PAGE_SIZE

Change tcmu to support DATA_BLOCK_SIZE being a multiple of PAGE_SIZE. There
are two reasons why one would like to have a bigger DATA_BLOCK_SIZE:

1) If userspace - e.g. due to data compression, encryption or
    deduplication - needs to have receive or transmit data in a consecutive
    buffer, we can define DATA_BLOCK_SIZE to the maximum size of a SCSI
    READ/WRITE to enforce that userspace sees just one consecutive
    buffer. That way we can avoid the need for doing data copy in
    userspace.

2) Using a bigger data block size can speed up command processing in
    tcmu. The number of free data blocks to look up in bitmap is reduced
    substantially. The lookup for data pages in radix_tree can be done more
    efficiently if there are multiple pages in a data block. The maximum
    number of IOVs to set up is lower so cmd entries in the ring become
    smaller.

Link: https://lore.kernel.org/r/20210324195758.2021-4-bostroesser@gmail.com
Signed-off-by: Bodo Stroesser <bostroesser@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: target: tcmu: Prepare for PAGE_SIZE != DATA_BLOCK_SIZE

Rename some variables and definitions as a first preparation for
DATA_BLOCK_SIZE != PAGE_SIZE and add the new DATA_PAGES_PER_BLK definition
containing the number of pages per data block.

Rename tcmu_try_get_block_page() to tcmu_try_get_data_page(). Keep name
tcmu_get_block_page() since it will go away in a following commit when
there is only one caller left. Subsequent commits will then add full
support for DATA_PAGES_PER_BLK != 1, which also means DATA_BLOCK_SIZE =
DATA_PAGES_PER_BLK * PAGE_SIZE

Link: https://lore.kernel.org/r/20210324195758.2021-3-bostroesser@gmail.com
Signed-off-by: Bodo Stroesser <bostroesser@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>