git.ipfire.org Git - thirdparty/kernel/linux.git/log

]> git.ipfire.org Git - thirdparty/kernel/linux.git/log

projects / thirdparty / kernel / linux.git / log

Christoph Hellwig [Mon, 5 Oct 2020 08:41:21 +0000 (10:41 +0200)]

scsi: core: Don't export scsi_device_from_queue()

This function is only used by code built into scsi_mod.ko.

Link: https://lore.kernel.org/r/20201005084130.143273-2-hch@lst.de
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

commit | commitdiff | tree

Xiang Chen [Fri, 2 Oct 2020 14:30:38 +0000 (22:30 +0800)]

scsi: hisi_sas: Recover PHY state according to the status before reset

Currently the PHY state is set according to the state of the PHYs after
reset. This is invalid as the PHYs are already re-initialized.

Set PHY state according to the state before the reset instead of after.

Link: https://lore.kernel.org/r/1601649038-25534-8-git-send-email-john.garry@huawei.com
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

commit | commitdiff | tree

Xiang Chen [Fri, 2 Oct 2020 14:30:37 +0000 (22:30 +0800)]

scsi: hisi_sas: Filter out new PHY up events during suspend

Currently sas_resume_ha() is called while resuming the controller to wait
for all suspended PHYs to come up and all the libsas events to be
completed.

There is a scenario which will cause task hung: For direct attach with two
disks connected with two PHYs, disable phy0 before suspending the disk on
phy1 and the controller, then enable phy0 and resume the controller, and
task hung occurs as follows:

[  591.901463] hisi_sas_v3_hw 0000:b4:02.0: resuming from operating state [D0]
[  593.113525] hisi_sas_v3_hw 0000:b4:02.0: neither _PS0 nor _PR0 is defined
[  593.120301] hisi_sas_v3_hw 0000:b4:02.0: waiting up to 25 seconds for 1 phy to resume
[  593.120836] hisi_sas_v3_hw 0000:b4:02.0: phyup: phy0 link_rate=10(sata)
[  593.134680] hisi_sas_v3_hw 0000:b4:02.0: phyup: phy1 link_rate=10(sata)
[  593.134733] sas: phy-2:0 added to port-2:0, phy_mask:0x1 (5000000000000200)
[  593.148350] sas: DOING DISCOVERY on port 0, pid:948
[  593.153227] hisi_sas_v3_hw 0000:b4:02.0: dev[3:5] found
[  593.159840] sas: Enter sas_scsi_recover_host busy: 0 failed: 0
[  593.165663] sas: ata7: end_device-2:0: dev error handler
[  593.165730] sas: ata2: end_device-2:1: dev error handler
[  593.172532] hisi_sas_v3_hw 0000:b4:02.0: phydown: phy0 phy_state=0x2
[  593.182570] hisi_sas_v3_hw 0000:b4:02.0: ignore flutter phy0 down
[  593.331277] hisi_sas_v3_hw 0000:b4:02.0: phyup: phy0 link_rate=10(sata)
[  593.498956] ata7.00: ATA-11: SAMSUNG MZ7LH960HAJR-00005, HXT7404Q, max UDMA/133
[  593.506235] ata7.00: 1875385008 sectors, multi 16: LBA48 NCQ (depth 32)
[  593.514295] ata7.00: configured for UDMA/133
[  593.518557] sas: --- Exit sas_scsi_recover_host: busy: 0 failed: 0 tries: 1
[  593.528613] sas: ata7: end_device-2:0: model:SAMSUNG MZ7LH960HAJR-00005
serial:S45NNA0M712225
[  593.537520] device_link_add 316: dev=2:0:2:0 supplier:2 consumer:0
[  593.543674] device_link_add 324
[  593.546801] device_link_add 352
[  593.549930] device_link_add 406
[  593.553058] device_link_add 440: dev=2:0:2:0 supplier:2 consumer:0
[  593.559208] device_link_add 444
[  593.562335] device_link_add 455
[  593.565517] scsi 2:0:2:0: Direct-Access     ATA      SAMSUNG MZ7LH960 404Q PQ: 0
ANSI: 5
[  620.057464]  phy-2:1: resume timeout
[  738.841445] INFO: task kworker/u256:0:8 blocked for more than 120 seconds.
[  738.848295]       Not tainted 5.8.0-rc1-76154-g0d52b59-dirty #744
[  738.854361] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  738.862155] kworker/u256:0  D    0     8      2 0x00000028
[  738.867626] Workqueue: 0000:b4:02.0_event_q sas_port_event_worker
[  738.873693] Call trace:
[  738.876133]  __switch_to+0xf4/0x148
[  738.879613]  __schedule+0x270/0x5d8
[  738.883091]  schedule+0x78/0x110
[  738.886307]  schedule_timeout+0x1ac/0x280
[  738.890299]  wait_for_completion+0x94/0x138
[  738.894472]  flush_workqueue+0x114/0x438
[  738.898377]  sas_porte_bytes_dmaed+0x400/0x500
[  738.902801]  sas_port_event_worker+0x28/0x40
[  738.907053]  process_one_work+0x1e8/0x360
[  738.911046]  worker_thread+0x44/0x478
[  738.914698]  kthread+0x150/0x158
[  738.917915]  ret_from_fork+0x10/0x1c
[  738.921534] INFO: task kworker/u256:1:948 blocked for more than 120 seconds.
[  738.928550]       Not tainted 5.8.0-rc1-76154-g0d52b59-dirty #744
[  738.934614] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  738.942408] kworker/u256:1  D    0   948      2 0x00000028
[  738.947873] Workqueue: 0000:b4:02.0_disco_q sas_discover_domain
[  738.953766] Call trace:
[  738.956203]  __switch_to+0xf4/0x148
[  738.959678]  __schedule+0x270/0x5d8
[  738.963152]  schedule+0x78/0x110
[  738.966368]  rpm_resume+0xcc/0x550
[  738.969757]  __pm_runtime_resume+0x3c/0x88
[  738.973836]  rpm_get_suppliers+0x50/0x148
[  738.977829]  __pm_runtime_set_status+0x124/0x2f0
[  738.982427]  scsi_sysfs_add_sdev+0x1a0/0x2a8
[  738.986679]  scsi_probe_and_add_lun+0x888/0xab0
[  738.991190]  __scsi_scan_target+0xec/0x520
[  738.995268]  scsi_scan_target+0x11c/0x128
[  738.999261]  sas_rphy_add+0x15c/0x1e8
[  739.002907]  sas_probe_devices+0xe4/0x150
[  739.006899]  sas_discover_domain+0x33c/0x588
[  739.011150]  process_one_work+0x1e8/0x360
[  739.015143]  worker_thread+0x44/0x478
[  739.018789]  kthread+0x150/0x158
[  739.022003]  ret_from_fork+0x10/0x1c
...

If an extra phy0 up happens during resume of the SAS controller, it will
emit a new libsas event (event PORTE_BYTES_DMAED and event
DISCE_DISCOVER_DOMAIN). We will call function scsi_sysfs_add_sdev() in
event DISCE_DISCOVER_DOMAIN, which will call __pm_runtime_set_status() to
resume supplier (host controller). For runtime PM core, if device is in the
resuming state, the later resume request of the device will wait for
previous resume request to complete synchronously. At that point in time
the state of the controller is still resuming as it waits for all libsas
events to be completed, while libsas event DISCE_DISCOVER_DOMAIN is blocked
as the state of the controller is resuming which causes a deadlock.

To avoid the issue, filter out new PHY up events while the controller is
suspended.

Link: https://lore.kernel.org/r/1601649038-25534-7-git-send-email-john.garry@huawei.com
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

commit | commitdiff | tree

Xiang Chen [Fri, 2 Oct 2020 14:30:36 +0000 (22:30 +0800)]

scsi: hisi_sas: Add device link between SCSI devices and hisi_hba

Runtime PM of SCSI devices is already supported in SCSI layer, we can
suspend/resume every SCSI device separately. But if there is no link
between hisi_hba and SCSI devices or SCSI targets it will cause issues if
the controller is suspended while SCSI devices are still resuming. Only
when all the SCSI devices under the controller are suspended, the
controller can be suspended. Add the device link between SCSI devices
and the controller.

Link: https://lore.kernel.org/r/1601649038-25534-6-git-send-email-john.garry@huawei.com
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

commit | commitdiff | tree

Xiang Chen [Fri, 2 Oct 2020 14:30:35 +0000 (22:30 +0800)]

scsi: hisi_sas: Add check for methods _PS0 and _PR0

To support system suspend/resume or runtime suspend/resume, need to use the
function pci_set_power_state() to change the power state which requires at
least method _PS0 or _PR0 be filled by platform for v3 hw. So check whether
the method is supported, if not, print a warning.

A Kconfig dependency is added as there is no stub for
acpi_device_power_manageable().

Link: https://lore.kernel.org/r/1601649038-25534-5-git-send-email-john.garry@huawei.com
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

commit | commitdiff | tree

Xiang Chen [Fri, 2 Oct 2020 14:30:34 +0000 (22:30 +0800)]

scsi: hisi_sas: Add controller runtime PM support for v3 hw

Add controller runtime PM support for v3 hw.

Link: https://lore.kernel.org/r/1601649038-25534-4-git-send-email-john.garry@huawei.com
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

commit | commitdiff | tree

Xiang Chen [Fri, 2 Oct 2020 14:30:33 +0000 (22:30 +0800)]

scsi: hisi_sas: Switch to new framework to support suspend and resume

For v3 hw we will add support for runtime PM which is only supported in new
framework. Legacy PM support and new framework are not allowed to be used
together. Switch to new framework to support suspend and resume.

Link: https://lore.kernel.org/r/1601649038-25534-3-git-send-email-john.garry@huawei.com
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

commit | commitdiff | tree

Luo Jiaxing [Fri, 2 Oct 2020 14:30:32 +0000 (22:30 +0800)]

scsi: hisi_sas: Use hisi_hba->cq_nvecs for calling calling synchronize_irq()

A call trace is observed when running function level reset with online CPUs
less than 16 and MSI auto-affinity enabled.

[16538.348038] Call trace:
[16538.348422]  pci_irq_vector+0x98/0xc0
[16538.348947]  disable_host_v3_hw+0x8c/0x288 [hisi_sas_v3_hw]
[16538.349706]  hisi_sas_reset_prepare_v3_hw+0x60/0x88 [hisi_sas_v3_hw]
[16538.350631]  pci_dev_save_and_disable+0x38/0x68
[16538.351290]  pci_reset_function+0x44/0x88
[16538.351846]  reset_store+0x6c/0xb8
[16538.352429]  dev_attr_store+0x44/0x60
[16538.353035]  sysfs_kf_write+0x58/0x80
[16538.353558]  kernfs_fop_write+0x140/0x230
[16538.354175]  __vfs_write+0x48/0x80
[16538.354675]  vfs_write+0xb8/0x1d8
[16538.355145]  ksys_write+0x74/0xf8
[16538.355615]  __arm64_sys_write+0x24/0x30
[16538.356240]  el0_svc_common.constprop.4+0x80/0x1f0
[16538.356905]  do_el0_svc+0x2c/0x38
[16538.357408]  el0_svc+0x14/0x40
[16538.357848]  el0_sync_handler+0xbc/0x2ec
[16538.358388]  el0_sync+0x140/0x180

The reason is that if we use pci_alloc_irq_vectors_affinity() to allocate
IRQs, the number of CQ IRQs can only be less than or equal to the number of
online CPUs, but we use hisi_hba->queue_count (always 16) to iterate during
interrupt_disable_v3_hw().

Use hisi_hba->cq_nvecs to replace hisi_hba->queue_count to avoid
synchronize IRQ on a CPU which does not exist.

Link: https://lore.kernel.org/r/1601649038-25534-2-git-send-email-john.garry@huawei.com
Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

commit | commitdiff | tree

Jing Xiangfeng [Thu, 17 Sep 2020 02:19:06 +0000 (10:19 +0800)]

scsi: qedf: Remove redundant assignment to variable 'rc'

This assignment is meaningless. Remove it.

Link: https://lore.kernel.org/r/20200917021906.175933-1-jingxiangfeng@huawei.com
Signed-off-by: Jing Xiangfeng <jingxiangfeng@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

commit | commitdiff | tree

Ye Bin [Wed, 16 Sep 2020 02:28:59 +0000 (10:28 +0800)]

scsi: lpfc: Remove unneeded variable 'status' in lpfc_fcp_cpu_map_store()

Fixes coccicheck warning:
drivers/scsi/lpfc/lpfc_attr.c:5341:5-11: Unneeded variable: "status". Return "- EINVAL" on line 5342

Link: https://lore.kernel.org/r/20200916022859.349089-1-yebin10@huawei.com
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Ye Bin <yebin10@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

commit | commitdiff | tree

Liu Shixin [Wed, 16 Sep 2020 02:50:30 +0000 (10:50 +0800)]

scsi: snic: Convert to use DEFINE_SEQ_ATTRIBUTE macro

Use DEFINE_SEQ_ATTRIBUTE macro to simplify the code.

Link: https://lore.kernel.org/r/20200916025030.3992991-1-liushixin2@huawei.com
Signed-off-by: Liu Shixin <liushixin2@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

commit | commitdiff | tree

Ye Bin [Wed, 16 Sep 2020 02:27:49 +0000 (10:27 +0800)]

scsi: qla4xxx: Delete unneeded variable 'status' in qla4xxx_process_ddb_changed

Fixes coccicheck warning:
drivers/scsi/qla4xxx/ql4_init.c:1173:5-11: Unneeded variable: "status". Return "QLA_ERROR" on line 1195

Link: https://lore.kernel.org/r/20200916022749.348923-1-yebin10@huawei.com
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Ye Bin <yebin10@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

commit | commitdiff | tree