Changbin Du <changbin.du@intel.com> <changbin.du@intel.com>
Chao Yu <chao@kernel.org> <chao2.yu@samsung.com>
Chao Yu <chao@kernel.org> <yuchao0@huawei.com>
+Chester Lin <chester62515@gmail.com> <clin@suse.com>
Chris Chiu <chris.chiu@canonical.com> <chiu@endlessm.com>
Chris Chiu <chris.chiu@canonical.com> <chiu@endlessos.org>
Chris Lew <quic_clew@quicinc.com> <clew@codeaurora.org>
Gao Xiang <xiang@kernel.org> <hsiangkao@aol.com>
Gao Xiang <xiang@kernel.org> <hsiangkao@linux.alibaba.com>
Gao Xiang <xiang@kernel.org> <hsiangkao@redhat.com>
+Geliang Tang <geliang.tang@linux.dev> <geliang.tang@suse.com>
+Geliang Tang <geliang.tang@linux.dev> <geliangtang@xiaomi.com>
+Geliang Tang <geliang.tang@linux.dev> <geliangtang@gmail.com>
+Geliang Tang <geliang.tang@linux.dev> <geliangtang@163.com>
Georgi Djakov <djakov@kernel.org> <georgi.djakov@linaro.org>
Gerald Schaefer <gerald.schaefer@linux.ibm.com> <geraldsc@de.ibm.com>
Gerald Schaefer <gerald.schaefer@linux.ibm.com> <gerald.schaefer@de.ibm.com>
Jernej Skrabec <jernej.skrabec@gmail.com> <jernej.skrabec@siol.net>
Jessica Zhang <quic_jesszhan@quicinc.com> <jesszhan@codeaurora.org>
Jilai Wang <quic_jilaiw@quicinc.com> <jilaiw@codeaurora.org>
+Jiri Kosina <jikos@kernel.org> <jikos@jikos.cz>
+Jiri Kosina <jikos@kernel.org> <jkosina@suse.cz>
+Jiri Kosina <jikos@kernel.org> <jkosina@suse.com>
Jiri Pirko <jiri@resnulli.us> <jiri@nvidia.com>
Jiri Pirko <jiri@resnulli.us> <jiri@mellanox.com>
Jiri Pirko <jiri@resnulli.us> <jpirko@redhat.com>
N: Venkatesh Pallipadi (Venki)
D: x86/HPET
+N: Antti Palosaari
+E: crope@iki.fi
+D: Various DVB drivers
+W: https://palosaari.fi/linux/
+S: Yliopistokatu 1 D 513
+S: FI-90570 Oulu
+S: FINLAND
+
N: Kyungmin Park
E: kyungmin.park@samsung.com
D: Samsung S5Pv210 and Exynos4210 mobile platforms
OP-TEE bus provides reference to registered drivers under this directory. The <uuid>
matches Trusted Application (TA) driver and corresponding TA in secure OS. Drivers
are free to create needed API under optee-ta-<uuid> directory.
+
+What: /sys/bus/tee/devices/optee-ta-<uuid>/need_supplicant
+Date: November 2023
+KernelVersion: 6.7
+Contact: op-tee@lists.trustedfirmware.org
+Description:
+ Allows to distinguish whether an OP-TEE based TA/device requires user-space
+ tee-supplicant to function properly or not. This attribute will be present for
+ devices which depend on tee-supplicant to be running.
brightness. Reading this file when no hw brightness change
event has happened will return an ENODATA error.
-What: /sys/class/leds/<led>/color
-Date: June 2023
-KernelVersion: 6.5
-Description:
- Color of the LED.
-
- This is a read-only file. Reading this file returns the color
- of the LED as a string (e.g: "red", "green", "multicolor").
-
What: /sys/class/leds/<led>/trigger
Date: March 2006
KernelVersion: 2.6.17
vulnerability. System may allow data leaks with this
option.
- no-steal-acc [X86,PV_OPS,ARM64,PPC/PSERIES] Disable paravirtualized
- steal time accounting. steal time is computed, but
- won't influence scheduler behaviour
+ no-steal-acc [X86,PV_OPS,ARM64,PPC/PSERIES,RISCV] Disable
+ paravirtualized steal time accounting. steal time is
+ computed, but won't influence scheduler behaviour
nosync [HW,M68K] Disables sync negotiation for all devices.
Documentation of LoongArch ISA:
- https://github.com/loongson/LoongArch-Documentation/releases/latest/download/LoongArch-Vol1-v1.02-CN.pdf (in Chinese)
+ https://github.com/loongson/LoongArch-Documentation/releases/latest/download/LoongArch-Vol1-v1.10-CN.pdf (in Chinese)
- https://github.com/loongson/LoongArch-Documentation/releases/latest/download/LoongArch-Vol1-v1.02-EN.pdf (in English)
+ https://github.com/loongson/LoongArch-Documentation/releases/latest/download/LoongArch-Vol1-v1.10-EN.pdf (in English)
Documentation of LoongArch ELF psABI:
Protocol 2.15 (Kernel 5.5) Added the kernel_info and kernel_info.setup_type_max.
============= ============================================================
-.. note::
+ .. note::
The protocol version number should be changed only if the setup header
is changed. There is no need to update the version number if boot_params
or kernel_info are changed. Additionally, it is recommended to use
maintainers:
- Laurent Pinchart <laurent.pinchart@ideasonboard.com>
+allOf:
+ - $ref: /schemas/sound/dai-common.yaml#
+
description: |
The ADV7533 and ADV7535 are HDMI audio and video transmitters
compatible with HDMI 1.4 and DVI 1.0. They support color space
$ref: /schemas/types.yaml#/definitions/uint32
enum: [ 1, 2, 3, 4 ]
+ "#sound-dai-cells":
+ const: 0
+
ports:
description:
The ADV7533/35 has two video ports and one audio port.
minItems: 1
interrupts:
- maxItems: 1
+ items:
+ - description: LCDIF DMA interrupt
+ - description: LCDIF Error interrupt
+ minItems: 1
power-domains:
maxItems: 1
then:
required:
- power-domains
+ - if:
+ properties:
+ compatible:
+ contains:
+ enum:
+ - fsl,imx23-lcdif
+ then:
+ properties:
+ interrupts:
+ minItems: 2
+ maxItems: 2
+ else:
+ properties:
+ interrupts:
+ maxItems: 1
examples:
- |
- Chun-Kuang Hu <chunkuang.hu@kernel.org>
- Philipp Zabel <p.zabel@pengutronix.de>
- Jitao Shi <jitao.shi@mediatek.com>
- - Xinlei Lee <xinlei.lee@mediatek.com>
description: |
The MediaTek DSI function block is a sink of the display subsystem and can
- lg,acx467akm-7
# LG Corporation 7" WXGA TFT LCD panel
- lg,ld070wx3-sl01
+ # LG Corporation 5" HD TFT LCD panel
+ - lg,lh500wx1-sd03
# One Stop Displays OSD101T2587-53TS 10.1" 1920x1200 panel
- osddisplays,osd101t2587-53ts
# Panasonic 10" WUXGA TFT LCD panel
- lemaker,bl035-rgb-002
# LG 7" (800x480 pixels) TFT LCD panel
- lg,lb070wv8
- # LG Corporation 5" HD TFT LCD panel
- - lg,lh500wx1-sd03
# LG LP079QX1-SP0V 7.9" (1536x2048 pixels) TFT LCD panel
- lg,lp079qx1-sp0v
# LG 9.7" (2048x1536 pixels) TFT LCD panel
- description: MPM pin number
- description: GIC SPI number for the MPM pin
+ '#power-domain-cells':
+ const: 0
+
required:
- compatible
- reg
<86 183>,
<90 260>,
<91 260>;
+ #power-domain-cells = <0>;
};
properties:
rx-internal-delay-ps:
description:
- RGMII Receive Clock Delay defined in pico seconds.This is used for
+ RGMII Receive Clock Delay defined in pico seconds. This is used for
controllers that have configurable RX internal delays. If this
property is present then the MAC applies the RX delay.
tx-internal-delay-ps:
description:
- RGMII Transmit Clock Delay defined in pico seconds.This is used for
+ RGMII Transmit Clock Delay defined in pico seconds. This is used for
controllers that have configurable TX internal delays. If this
property is present then the MAC applies the TX delay.
properties:
compatible:
- enum:
- - fsl,imx23-ocotp
- - fsl,imx28-ocotp
+ items:
+ - enum:
+ - fsl,imx23-ocotp
+ - fsl,imx28-ocotp
+ - const: fsl,ocotp
reg:
maxItems: 1
examples:
- |
ocotp: efuse@8002c000 {
- compatible = "fsl,imx28-ocotp";
+ compatible = "fsl,imx28-ocotp", "fsl,ocotp";
#address-cells = <1>;
#size-cells = <1>;
reg = <0x8002c000 0x2000>;
bitmap of all MHPMCOUNTERx that can monitor the range of events
dependencies:
- "riscv,event-to-mhpmevent": [ "riscv,event-to-mhpmcounters" ]
+ riscv,event-to-mhpmevent: [ "riscv,event-to-mhpmcounters" ]
required:
- compatible
maintainers:
- Ghennadi Procopciuc <Ghennadi.Procopciuc@oss.nxp.com>
- - Chester Lin <clin@suse.com>
+ - Chester Lin <chester62515@gmail.com>
description: |
S32G2 pinmux is implemented in SIUL2 (System Integration Unit Lite2),
properties:
"#pwm-cells":
- description: |
- Should be 2 for i.MX1 and 3 for i.MX27 and newer SoCs. See pwm.yaml
- in this directory for a description of the cells format.
- enum:
- - 2
- - 3
+ description:
+ The only third cell flag supported by this binding is
+ PWM_POLARITY_INVERTED. fsl,imx1-pwm does not support this flags.
+ const: 3
compatible:
oneOf:
- rockchip,rk3399-grf
- rockchip,rk3399-pmugrf
- rockchip,rk3568-pmugrf
+ - rockchip,rk3588-pmugrf
- rockchip,rv1108-grf
- rockchip,rv1108-pmugrf
- qcom,sm8350-ufshc
- qcom,sm8450-ufshc
- qcom,sm8550-ufshc
+ - qcom,sm8650-ufshc
- const: qcom,ufshc
- const: jedec,ufs-2.0
- qcom,sm8350-ufshc
- qcom,sm8450-ufshc
- qcom,sm8550-ufshc
+ - qcom,sm8650-ufshc
then:
properties:
clocks:
vdd-supply:
description:
- VDD power supply to the hub
+ 3V3 power supply to the hub
+
+ vdd2-supply:
+ description:
+ 1V2 power supply to the hub
peer-hub:
$ref: /schemas/types.yaml#/definitions/phandle
properties:
reset-gpios: false
vdd-supply: false
+ vdd2-supply: false
peer-hub: false
i2c-bus: false
else:
interrupts = <GIC_SPI 131 IRQ_TYPE_LEVEL_HIGH>,
<GIC_SPI 486 IRQ_TYPE_LEVEL_HIGH>,
- <GIC_SPI 488 IRQ_TYPE_LEVEL_HIGH>,
- <GIC_SPI 489 IRQ_TYPE_LEVEL_HIGH>;
+ <GIC_SPI 488 IRQ_TYPE_EDGE_BOTH>,
+ <GIC_SPI 489 IRQ_TYPE_EDGE_BOTH>;
interrupt-names = "hs_phy_irq", "ss_phy_irq",
"dm_hs_phy_irq", "dp_hs_phy_irq";
- |
usb {
phys = <&usb2_phy1>, <&usb3_phy1>;
- phy-names = "usb";
+ phy-names = "usb2", "usb3";
#address-cells = <1>;
#size-cells = <0>;
- git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs-utils.git
+For more information, please also refer to the documentation site:
+
+- https://erofs.docs.kernel.org
+
Bugs and patches are welcome, please kindly help us and send to the following
linux-erofs mailing list:
FUSE_OPEN reply.
In direct-io mode the page cache is completely bypassed for reads and writes.
-No read-ahead takes place. Shared mmap is disabled.
+No read-ahead takes place. Shared mmap is disabled by default. To allow shared
+mmap, the FUSE_DIRECT_IO_ALLOW_MMAP flag may be enabled in the FUSE_INIT reply.
In cached mode reads may be satisfied from the page cache, and data may be
read-ahead by the kernel to fill the cache. The cache is always kept consistent
when it is no longer considered permitted.
Linux TCP-AO will try its best to prevent you from removing a key that's
-being used, considering it a key management failure. But sine keeping
+being used, considering it a key management failure. But since keeping
an outdated key may become a security issue and as a peer may
unintentionally prevent the removal of an old key by always setting
it as RNextKeyID - a forced key removal mechanism is provided, where
Generally speaking, the patches get triaged quickly (in less than
48h). But be patient, if your patch is active in patchwork (i.e. it's
listed on the project's patch list) the chances it was missed are close to zero.
-Asking the maintainer for status updates on your
-patch is a good way to ensure your patch is ignored or pushed to the
-bottom of the priority list.
+
+The high volume of development on netdev makes reviewers move on
+from discussions relatively quickly. New comments and replies
+are very unlikely to arrive after a week of silence. If a patch
+is no longer active in patchwork and the thread went idle for more
+than a week - clarify the next steps and/or post the next version.
+
+For RFC postings specifically, if nobody responded in a week - reviewers
+either missed the posting or have no strong opinions. If the code is ready,
+repost as a PATCH.
+
+Emails saying just "ping" or "bump" are considered rude. If you can't figure
+out the status of the patch from patchwork or where the discussion has
+landed - describe your best guess and ask if it's correct. For example::
+
+ I don't understand what the next steps are. Person X seems to be unhappy
+ with A, should I do B and repost the patches?
.. _Changes requested:
Device Tree Bindings
--------------------
-See Documentation/devicetree/bindings/arm/arm,coresight-\*.yaml for details.
+See ``Documentation/devicetree/bindings/arm/arm,coresight-*.yaml`` for details.
As of this writing drivers for ITM, STMs and CTIs are not provided but are
expected to be added as the solution matures.
LoongArch指令集架构的文档:
- https://github.com/loongson/LoongArch-Documentation/releases/latest/download/LoongArch-Vol1-v1.02-CN.pdf (中文版)
+ https://github.com/loongson/LoongArch-Documentation/releases/latest/download/LoongArch-Vol1-v1.10-CN.pdf (中文版)
- https://github.com/loongson/LoongArch-Documentation/releases/latest/download/LoongArch-Vol1-v1.02-EN.pdf (英文版)
+ https://github.com/loongson/LoongArch-Documentation/releases/latest/download/LoongArch-Vol1-v1.10-EN.pdf (英文版)
LoongArch的ELF psABI文档:
This is an asynchronous vcpu ioctl and can be invoked from any thread.
-4.17 KVM_DEBUG_GUEST
---------------------
-
-:Capability: basic
-:Architectures: none
-:Type: vcpu ioctl
-:Parameters: none)
-:Returns: -1 on error
-
-Support for this has been removed. Use KVM_SET_GUEST_DEBUG instead.
-
-
4.18 KVM_GET_MSRS
-----------------
F: drivers/soc/fujitsu/a64fx-diag.c
A8293 MEDIA DRIVER
-M: Antti Palosaari <crope@iki.fi>
L: linux-media@vger.kernel.org
-S: Maintained
+S: Orphan
W: https://linuxtv.org
-W: http://palosaari.fi/linux/
Q: http://patchwork.linuxtv.org/project/linux-media/list/
-T: git git://linuxtv.org/anttip/media_tree.git
F: drivers/media/dvb-frontends/a8293*
AACRAID SCSI RAID DRIVER
F: drivers/iio/accel/adxl372_spi.c
AF9013 MEDIA DRIVER
-M: Antti Palosaari <crope@iki.fi>
L: linux-media@vger.kernel.org
-S: Maintained
+S: Orphan
W: https://linuxtv.org
-W: http://palosaari.fi/linux/
Q: http://patchwork.linuxtv.org/project/linux-media/list/
-T: git git://linuxtv.org/anttip/media_tree.git
F: drivers/media/dvb-frontends/af9013*
AF9033 MEDIA DRIVER
-M: Antti Palosaari <crope@iki.fi>
L: linux-media@vger.kernel.org
-S: Maintained
+S: Orphan
W: https://linuxtv.org
-W: http://palosaari.fi/linux/
Q: http://patchwork.linuxtv.org/project/linux-media/list/
-T: git git://linuxtv.org/anttip/media_tree.git
F: drivers/media/dvb-frontends/af9033*
AFFS FILE SYSTEM
F: include/linux/*aio*.h
AIRSPY MEDIA DRIVER
-M: Antti Palosaari <crope@iki.fi>
L: linux-media@vger.kernel.org
-S: Maintained
+S: Orphan
W: https://linuxtv.org
-W: http://palosaari.fi/linux/
Q: http://patchwork.linuxtv.org/project/linux-media/list/
-T: git git://linuxtv.org/anttip/media_tree.git
F: drivers/media/usb/airspy/
ALACRITECH GIGABIT ETHERNET DRIVER
T: git git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux.git
F: arch/arm/boot/dts/nxp/imx/
F: arch/arm/boot/dts/nxp/mxs/
+F: arch/arm64/boot/dts/freescale/
X: arch/arm64/boot/dts/freescale/fsl-*
X: arch/arm64/boot/dts/freescale/qoriq-*
X: drivers/media/i2c/
F: drivers/*/*wpcm*
ARM/NXP S32G ARCHITECTURE
-M: Chester Lin <clin@suse.com>
+M: Chester Lin <chester62515@gmail.com>
R: Andreas Färber <afaerber@suse.de>
R: Matthias Brugger <mbrugger@suse.com>
R: NXP S32 Linux Team <s32@nxp.com>
M: Sami Tolvanen <samitolvanen@google.com>
M: Kees Cook <keescook@chromium.org>
R: Nathan Chancellor <nathan@kernel.org>
-R: Nick Desaulniers <ndesaulniers@google.com>
L: llvm@lists.linux.dev
S: Supported
B: https://github.com/ClangBuiltLinux/linux/issues
CLANG/LLVM BUILD SUPPORT
M: Nathan Chancellor <nathan@kernel.org>
-M: Nick Desaulniers <ndesaulniers@google.com>
-R: Tom Rix <trix@redhat.com>
+R: Nick Desaulniers <ndesaulniers@google.com>
+R: Bill Wendling <morbo@google.com>
+R: Justin Stitt <justinstitt@google.com>
L: llvm@lists.linux.dev
S: Supported
W: https://clangbuiltlinux.github.io/
COMPILER ATTRIBUTES
M: Miguel Ojeda <ojeda@kernel.org>
-R: Nick Desaulniers <ndesaulniers@google.com>
S: Maintained
F: include/linux/compiler_attributes.h
F: drivers/media/pci/cx88/
CXD2820R MEDIA DRIVER
-M: Antti Palosaari <crope@iki.fi>
L: linux-media@vger.kernel.org
-S: Maintained
+S: Orphan
W: https://linuxtv.org
-W: http://palosaari.fi/linux/
Q: http://patchwork.linuxtv.org/project/linux-media/list/
-T: git git://linuxtv.org/anttip/media_tree.git
F: drivers/media/dvb-frontends/cxd2820r*
CXGB3 ETHERNET DRIVER (CXGB3)
F: drivers/input/keyboard/cypress-sf.c
CYPRESS_FIRMWARE MEDIA DRIVER
-M: Antti Palosaari <crope@iki.fi>
L: linux-media@vger.kernel.org
-S: Maintained
+S: Orphan
W: https://linuxtv.org
-W: http://palosaari.fi/linux/
Q: http://patchwork.linuxtv.org/project/linux-media/list/
-T: git git://linuxtv.org/anttip/media_tree.git
F: drivers/media/common/cypress_firmware*
CYTTSP TOUCHSCREEN DRIVER
M: dm-devel@lists.linux.dev
L: dm-devel@lists.linux.dev
S: Maintained
-W: http://sources.redhat.com/dm
Q: http://patchwork.kernel.org/project/dm-devel/list/
T: git git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm.git
-T: quilt http://people.redhat.com/agk/patches/linux/editing/
F: Documentation/admin-guide/device-mapper/
F: drivers/md/Kconfig
F: drivers/md/Makefile
F: drivers/media/pci/dt3155/
DVB_USB_AF9015 MEDIA DRIVER
-M: Antti Palosaari <crope@iki.fi>
L: linux-media@vger.kernel.org
-S: Maintained
+S: Orphan
W: https://linuxtv.org
-W: http://palosaari.fi/linux/
Q: http://patchwork.linuxtv.org/project/linux-media/list/
-T: git git://linuxtv.org/anttip/media_tree.git
F: drivers/media/usb/dvb-usb-v2/af9015*
DVB_USB_AF9035 MEDIA DRIVER
-M: Antti Palosaari <crope@iki.fi>
L: linux-media@vger.kernel.org
-S: Maintained
+S: Orphan
W: https://linuxtv.org
-W: http://palosaari.fi/linux/
Q: http://patchwork.linuxtv.org/project/linux-media/list/
-T: git git://linuxtv.org/anttip/media_tree.git
F: drivers/media/usb/dvb-usb-v2/af9035*
DVB_USB_ANYSEE MEDIA DRIVER
-M: Antti Palosaari <crope@iki.fi>
L: linux-media@vger.kernel.org
-S: Maintained
+S: Orphan
W: https://linuxtv.org
-W: http://palosaari.fi/linux/
Q: http://patchwork.linuxtv.org/project/linux-media/list/
-T: git git://linuxtv.org/anttip/media_tree.git
F: drivers/media/usb/dvb-usb-v2/anysee*
DVB_USB_AU6610 MEDIA DRIVER
-M: Antti Palosaari <crope@iki.fi>
L: linux-media@vger.kernel.org
-S: Maintained
+S: Orphan
W: https://linuxtv.org
-W: http://palosaari.fi/linux/
Q: http://patchwork.linuxtv.org/project/linux-media/list/
-T: git git://linuxtv.org/anttip/media_tree.git
F: drivers/media/usb/dvb-usb-v2/au6610*
DVB_USB_CE6230 MEDIA DRIVER
-M: Antti Palosaari <crope@iki.fi>
L: linux-media@vger.kernel.org
-S: Maintained
+S: Orphan
W: https://linuxtv.org
-W: http://palosaari.fi/linux/
Q: http://patchwork.linuxtv.org/project/linux-media/list/
-T: git git://linuxtv.org/anttip/media_tree.git
F: drivers/media/usb/dvb-usb-v2/ce6230*
DVB_USB_CXUSB MEDIA DRIVER
F: drivers/media/usb/dvb-usb/cxusb*
DVB_USB_EC168 MEDIA DRIVER
-M: Antti Palosaari <crope@iki.fi>
L: linux-media@vger.kernel.org
-S: Maintained
+S: Orphan
W: https://linuxtv.org
-W: http://palosaari.fi/linux/
Q: http://patchwork.linuxtv.org/project/linux-media/list/
-T: git git://linuxtv.org/anttip/media_tree.git
F: drivers/media/usb/dvb-usb-v2/ec168*
DVB_USB_GL861 MEDIA DRIVER
-M: Antti Palosaari <crope@iki.fi>
L: linux-media@vger.kernel.org
-S: Maintained
+S: Orphan
W: https://linuxtv.org
Q: http://patchwork.linuxtv.org/project/linux-media/list/
-T: git git://linuxtv.org/anttip/media_tree.git
F: drivers/media/usb/dvb-usb-v2/gl861*
DVB_USB_MXL111SF MEDIA DRIVER
F: drivers/media/usb/dvb-usb-v2/mxl111sf*
DVB_USB_RTL28XXU MEDIA DRIVER
-M: Antti Palosaari <crope@iki.fi>
L: linux-media@vger.kernel.org
-S: Maintained
+S: Orphan
W: https://linuxtv.org
-W: http://palosaari.fi/linux/
Q: http://patchwork.linuxtv.org/project/linux-media/list/
-T: git git://linuxtv.org/anttip/media_tree.git
F: drivers/media/usb/dvb-usb-v2/rtl28xxu*
DVB_USB_V2 MEDIA DRIVER
-M: Antti Palosaari <crope@iki.fi>
L: linux-media@vger.kernel.org
-S: Maintained
+S: Orphan
W: https://linuxtv.org
W: http://palosaari.fi/linux/
Q: http://patchwork.linuxtv.org/project/linux-media/list/
-T: git git://linuxtv.org/anttip/media_tree.git
F: drivers/media/usb/dvb-usb-v2/dvb_usb*
F: drivers/media/usb/dvb-usb-v2/usb_urb.c
F: drivers/input/misc/e3x0-button.c
E4000 MEDIA DRIVER
-M: Antti Palosaari <crope@iki.fi>
L: linux-media@vger.kernel.org
-S: Maintained
+S: Orphan
W: https://linuxtv.org
-W: http://palosaari.fi/linux/
Q: http://patchwork.linuxtv.org/project/linux-media/list/
-T: git git://linuxtv.org/anttip/media_tree.git
F: drivers/media/tuners/e4000*
EARTH_PT1 MEDIA DRIVER
F: drivers/media/pci/pt3/
EC100 MEDIA DRIVER
-M: Antti Palosaari <crope@iki.fi>
L: linux-media@vger.kernel.org
-S: Maintained
+S: Orphan
W: https://linuxtv.org
-W: http://palosaari.fi/linux/
Q: http://patchwork.linuxtv.org/project/linux-media/list/
-T: git git://linuxtv.org/anttip/media_tree.git
F: drivers/media/dvb-frontends/ec100*
ECRYPT FILE SYSTEM
R: Jeffle Xu <jefflexu@linux.alibaba.com>
L: linux-erofs@lists.ozlabs.org
S: Maintained
+W: https://erofs.docs.kernel.org
T: git git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs.git
F: Documentation/ABI/testing/sysfs-fs-erofs
F: Documentation/filesystems/erofs.rst
F: drivers/media/tuners/fc0011.h
FC2580 MEDIA DRIVER
-M: Antti Palosaari <crope@iki.fi>
L: linux-media@vger.kernel.org
-S: Maintained
+S: Orphan
W: https://linuxtv.org
-W: http://palosaari.fi/linux/
Q: http://patchwork.linuxtv.org/project/linux-media/list/
-T: git git://linuxtv.org/anttip/media_tree.git
F: drivers/media/tuners/fc2580*
FCOE SUBSYSTEM (libfc, libfcoe, fcoe)
F: scripts/get_maintainer.pl
GFS2 FILE SYSTEM
-M: Bob Peterson <rpeterso@redhat.com>
M: Andreas Gruenbacher <agruenba@redhat.com>
L: gfs2@lists.linux.dev
S: Supported
F: include/uapi/drm/habanalabs_accel.h
HACKRF MEDIA DRIVER
-M: Antti Palosaari <crope@iki.fi>
L: linux-media@vger.kernel.org
-S: Maintained
+S: Orphan
W: https://linuxtv.org
-W: http://palosaari.fi/linux/
Q: http://patchwork.linuxtv.org/project/linux-media/list/
-T: git git://linuxtv.org/anttip/media_tree.git
F: drivers/media/usb/hackrf/
HANDSHAKE UPCALL FOR TRANSPORT LAYER SECURITY
HISILICON NETWORK SUBSYSTEM 3 DRIVER (HNS3)
M: Yisen Zhuang <yisen.zhuang@huawei.com>
M: Salil Mehta <salil.mehta@huawei.com>
+M: Jijie Shao <shaojijie@huawei.com>
L: netdev@vger.kernel.org
S: Maintained
W: http://www.hisilicon.com
F: include/linux/hisi_acc_qm.h
HISILICON ROCE DRIVER
+M: Chengchang Tang <tangchengchang@huawei.com>
M: Junxian Huang <huangjunxian6@hisilicon.com>
L: linux-rdma@vger.kernel.org
S: Maintained
INTEL WMI SLIM BOOTLOADER (SBL) FIRMWARE UPDATE DRIVER
M: Jithu Joseph <jithu.joseph@intel.com>
-R: Maurice Ma <maurice.ma@intel.com>
S: Maintained
W: https://slimbootloader.github.io/security/firmware-update.html
F: drivers/platform/x86/intel/wmi/sbl-fw-update.c
F: drivers/hwmon/it87.c
IT913X MEDIA DRIVER
-M: Antti Palosaari <crope@iki.fi>
L: linux-media@vger.kernel.org
-S: Maintained
+S: Orphan
W: https://linuxtv.org
-W: http://palosaari.fi/linux/
Q: http://patchwork.linuxtv.org/project/linux-media/list/
-T: git git://linuxtv.org/anttip/media_tree.git
F: drivers/media/tuners/it913x*
ITE IT66121 HDMI BRIDGE DRIVER
KERNEL BUILD + files below scripts/ (unless maintained elsewhere)
M: Masahiro Yamada <masahiroy@kernel.org>
R: Nathan Chancellor <nathan@kernel.org>
-R: Nick Desaulniers <ndesaulniers@google.com>
R: Nicolas Schier <nicolas@fjasle.eu>
L: linux-kbuild@vger.kernel.org
S: Maintained
F: include/uapi/linux/ndctl.h
F: tools/testing/nvdimm/
+LIBRARY CODE
+M: Andrew Morton <akpm@linux-foundation.org>
+L: linux-kernel@vger.kernel.org
+S: Supported
+T: git git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git mm-nonmm-unstable
+F: lib/*
+
LICENSES and SPDX stuff
M: Thomas Gleixner <tglx@linutronix.de>
M: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
M: Michael Ellerman <mpe@ellerman.id.au>
R: Nicholas Piggin <npiggin@gmail.com>
R: Christophe Leroy <christophe.leroy@csgroup.eu>
+R: Aneesh Kumar K.V <aneesh.kumar@kernel.org>
+R: Naveen N. Rao <naveen.n.rao@linux.ibm.com>
L: linuxppc-dev@lists.ozlabs.org
S: Supported
W: https://github.com/linuxppc/wiki/wiki
F: arch/m68k/hp300/
M88DS3103 MEDIA DRIVER
-M: Antti Palosaari <crope@iki.fi>
L: linux-media@vger.kernel.org
-S: Maintained
+S: Orphan
W: https://linuxtv.org
-W: http://palosaari.fi/linux/
Q: http://patchwork.linuxtv.org/project/linux-media/list/
-T: git git://linuxtv.org/anttip/media_tree.git
F: drivers/media/dvb-frontends/m88ds3103*
M88RS2000 MEDIA DRIVER
MELLANOX HARDWARE PLATFORM SUPPORT
M: Hans de Goede <hdegoede@redhat.com>
M: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
-M: Mark Gross <markgross@kernel.org>
M: Vadim Pasternak <vadimp@nvidia.com>
L: platform-driver-x86@vger.kernel.org
S: Supported
MICROSOFT SURFACE HARDWARE PLATFORM SUPPORT
M: Hans de Goede <hdegoede@redhat.com>
M: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
-M: Mark Gross <markgross@kernel.org>
M: Maximilian Luz <luzmaximilian@gmail.com>
L: platform-driver-x86@vger.kernel.org
S: Maintained
F: mm/mmu_gather.c
MN88472 MEDIA DRIVER
-M: Antti Palosaari <crope@iki.fi>
L: linux-media@vger.kernel.org
-S: Maintained
+S: Orphan
W: https://linuxtv.org
-W: http://palosaari.fi/linux/
Q: http://patchwork.linuxtv.org/project/linux-media/list/
F: drivers/media/dvb-frontends/mn88472*
MN88473 MEDIA DRIVER
-M: Antti Palosaari <crope@iki.fi>
L: linux-media@vger.kernel.org
-S: Maintained
+S: Orphan
W: https://linuxtv.org
-W: http://palosaari.fi/linux/
Q: http://patchwork.linuxtv.org/project/linux-media/list/
F: drivers/media/dvb-frontends/mn88473*
F: drivers/platform/x86/msi-wmi.c
MSI001 MEDIA DRIVER
-M: Antti Palosaari <crope@iki.fi>
L: linux-media@vger.kernel.org
-S: Maintained
+S: Orphan
W: https://linuxtv.org
-W: http://palosaari.fi/linux/
Q: http://patchwork.linuxtv.org/project/linux-media/list/
-T: git git://linuxtv.org/anttip/media_tree.git
F: drivers/media/tuners/msi001*
MSI2500 MEDIA DRIVER
-M: Antti Palosaari <crope@iki.fi>
L: linux-media@vger.kernel.org
-S: Maintained
+S: Orphan
W: https://linuxtv.org
-W: http://palosaari.fi/linux/
Q: http://patchwork.linuxtv.org/project/linux-media/list/
-T: git git://linuxtv.org/anttip/media_tree.git
F: drivers/media/usb/msi2500/
MSTAR INTERRUPT CONTROLLER DRIVER
M: Paolo Abeni <pabeni@redhat.com>
L: netdev@vger.kernel.org
S: Maintained
+P: Documentation/process/maintainer-netdev.rst
Q: https://patchwork.kernel.org/project/netdevbpf/list/
T: git git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net.git
T: git git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git
M: Paolo Abeni <pabeni@redhat.com>
L: netdev@vger.kernel.org
S: Maintained
+P: Documentation/process/maintainer-netdev.rst
Q: https://patchwork.kernel.org/project/netdevbpf/list/
B: mailto:netdev@vger.kernel.org
T: git git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net.git
F: Documentation/process/maintainer-netdev.rst
F: Documentation/userspace-api/netlink/
F: include/linux/in.h
+F: include/linux/indirect_call_wrapper.h
F: include/linux/net.h
F: include/linux/netdevice.h
F: include/net/
F: net/
F: tools/net/
F: tools/testing/selftests/net/
+X: net/9p/
X: net/bluetooth/
NETWORKING [IPSEC]
F: include/uapi/linux/fsl_mc.h
QT1010 MEDIA DRIVER
-M: Antti Palosaari <crope@iki.fi>
L: linux-media@vger.kernel.org
-S: Maintained
+S: Orphan
W: https://linuxtv.org
-W: http://palosaari.fi/linux/
Q: http://patchwork.linuxtv.org/project/linux-media/list/
-T: git git://linuxtv.org/anttip/media_tree.git
F: drivers/media/tuners/qt1010*
QUALCOMM ATH12K WIRELESS DRIVER
L: linux-arm-msm@vger.kernel.org
S: Maintained
F: drivers/iommu/arm/arm-smmu/qcom_iommu.c
+F: drivers/iommu/arm/arm-smmu/arm-smmu-qcom*
+F: drivers/iommu/msm_iommu*
QUALCOMM IPC ROUTER (QRTR) DRIVER
M: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
F: drivers/tty/rpmsg_tty.c
RTL2830 MEDIA DRIVER
-M: Antti Palosaari <crope@iki.fi>
L: linux-media@vger.kernel.org
-S: Maintained
+S: Orphan
W: https://linuxtv.org
-W: http://palosaari.fi/linux/
Q: http://patchwork.linuxtv.org/project/linux-media/list/
-T: git git://linuxtv.org/anttip/media_tree.git
F: drivers/media/dvb-frontends/rtl2830*
RTL2832 MEDIA DRIVER
-M: Antti Palosaari <crope@iki.fi>
L: linux-media@vger.kernel.org
-S: Maintained
+S: Orphan
W: https://linuxtv.org
-W: http://palosaari.fi/linux/
Q: http://patchwork.linuxtv.org/project/linux-media/list/
-T: git git://linuxtv.org/anttip/media_tree.git
F: drivers/media/dvb-frontends/rtl2832*
RTL2832_SDR MEDIA DRIVER
-M: Antti Palosaari <crope@iki.fi>
L: linux-media@vger.kernel.org
-S: Maintained
+S: Orphan
W: https://linuxtv.org
-W: http://palosaari.fi/linux/
Q: http://patchwork.linuxtv.org/project/linux-media/list/
-T: git git://linuxtv.org/anttip/media_tree.git
F: drivers/media/dvb-frontends/rtl2832_sdr*
RTL8180 WIRELESS DRIVER
F: drivers/misc/sgi-xp/
SHARED MEMORY COMMUNICATIONS (SMC) SOCKETS
-M: Karsten Graul <kgraul@linux.ibm.com>
M: Wenjia Zhang <wenjia@linux.ibm.com>
M: Jan Karcher <jaka@linux.ibm.com>
R: D. Wythe <alibuda@linux.alibaba.com>
F: include/media/drv-intf/sh_vou.h
SI2157 MEDIA DRIVER
-M: Antti Palosaari <crope@iki.fi>
L: linux-media@vger.kernel.org
-S: Maintained
+S: Orphan
W: https://linuxtv.org
-W: http://palosaari.fi/linux/
Q: http://patchwork.linuxtv.org/project/linux-media/list/
-T: git git://linuxtv.org/anttip/media_tree.git
F: drivers/media/tuners/si2157*
SI2165 MEDIA DRIVER
F: drivers/media/dvb-frontends/si2165*
SI2168 MEDIA DRIVER
-M: Antti Palosaari <crope@iki.fi>
L: linux-media@vger.kernel.org
-S: Maintained
+S: Orphan
W: https://linuxtv.org
-W: http://palosaari.fi/linux/
Q: http://patchwork.linuxtv.org/project/linux-media/list/
-T: git git://linuxtv.org/anttip/media_tree.git
F: drivers/media/dvb-frontends/si2168*
SI470X FM RADIO RECEIVER I2C DRIVER
F: net/ipv4/tcp_lp.c
TDA10071 MEDIA DRIVER
-M: Antti Palosaari <crope@iki.fi>
L: linux-media@vger.kernel.org
-S: Maintained
+S: Orphan
W: https://linuxtv.org
-W: http://palosaari.fi/linux/
Q: http://patchwork.linuxtv.org/project/linux-media/list/
-T: git git://linuxtv.org/anttip/media_tree.git
F: drivers/media/dvb-frontends/tda10071*
TDA18212 MEDIA DRIVER
-M: Antti Palosaari <crope@iki.fi>
L: linux-media@vger.kernel.org
-S: Maintained
+S: Orphan
W: https://linuxtv.org
-W: http://palosaari.fi/linux/
Q: http://patchwork.linuxtv.org/project/linux-media/list/
-T: git git://linuxtv.org/anttip/media_tree.git
F: drivers/media/tuners/tda18212*
TDA18218 MEDIA DRIVER
-M: Antti Palosaari <crope@iki.fi>
L: linux-media@vger.kernel.org
-S: Maintained
+S: Orphan
W: https://linuxtv.org
-W: http://palosaari.fi/linux/
Q: http://patchwork.linuxtv.org/project/linux-media/list/
-T: git git://linuxtv.org/anttip/media_tree.git
F: drivers/media/tuners/tda18218*
TDA18250 MEDIA DRIVER
F: drivers/counter/ti-eqep.c
TI ETHERNET SWITCH DRIVER (CPSW)
-R: Grygorii Strashko <grygorii.strashko@ti.com>
+R: Siddharth Vadapalli <s-vadapalli@ti.com>
+R: Ravi Gunasekaran <r-gunasekaran@ti.com>
+R: Roger Quadros <rogerq@kernel.org>
L: linux-omap@vger.kernel.org
L: netdev@vger.kernel.org
S: Maintained
F: drivers/media/i2c/ds90*
F: include/media/i2c/ds90*
+TI ICSSG ETHERNET DRIVER (ICSSG)
+R: MD Danish Anwar <danishanwar@ti.com>
+R: Roger Quadros <rogerq@kernel.org>
+L: linux-arm-kernel@lists.infradead.org (moderated for non-subscribers)
+L: netdev@vger.kernel.org
+S: Maintained
+F: Documentation/devicetree/bindings/net/ti,icss*.yaml
+F: drivers/net/ethernet/ti/icssg/*
+
TI J721E CSI2RX DRIVER
M: Jai Luthra <j-luthra@ti.com>
L: linux-media@vger.kernel.org
TRACING
M: Steven Rostedt <rostedt@goodmis.org>
M: Masami Hiramatsu <mhiramat@kernel.org>
+R: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
L: linux-kernel@vger.kernel.org
L: linux-trace-kernel@vger.kernel.org
S: Maintained
F: include/uapi/linux/tty.h
TUA9001 MEDIA DRIVER
-M: Antti Palosaari <crope@iki.fi>
L: linux-media@vger.kernel.org
-S: Maintained
+S: Orphan
W: https://linuxtv.org
-W: http://palosaari.fi/linux/
Q: http://patchwork.linuxtv.org/project/linux-media/list/
-T: git git://linuxtv.org/anttip/media_tree.git
F: drivers/media/tuners/tua9001*
TULIP NETWORK DRIVERS
X86 PLATFORM DRIVERS
M: Hans de Goede <hdegoede@redhat.com>
M: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
-M: Mark Gross <markgross@kernel.org>
L: platform-driver-x86@vger.kernel.org
S: Maintained
Q: https://patchwork.kernel.org/project/platform-driver-x86/list/
F: arch/x86/kernel/stacktrace.c
F: arch/x86/kernel/unwind_*.c
+X86 TRUST DOMAIN EXTENSIONS (TDX)
+M: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
+R: Dave Hansen <dave.hansen@linux.intel.com>
+L: x86@kernel.org
+L: linux-coco@lists.linux.dev
+S: Supported
+T: git git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git x86/tdx
+F: arch/x86/boot/compressed/tdx*
+F: arch/x86/coco/tdx/
+F: arch/x86/include/asm/shared/tdx.h
+F: arch/x86/include/asm/tdx.h
+F: arch/x86/virt/vmx/tdx/
+F: drivers/virt/coco/tdx-guest
+
X86 VDSO
M: Andy Lutomirski <luto@kernel.org>
L: linux-kernel@vger.kernel.org
P: Documentation/filesystems/xfs-maintainer-entry-profile.rst
F: Documentation/ABI/testing/sysfs-fs-xfs
F: Documentation/admin-guide/xfs.rst
-F: Documentation/filesystems/xfs-delayed-logging-design.rst
-F: Documentation/filesystems/xfs-self-describing-metadata.rst
+F: Documentation/filesystems/xfs-*
F: fs/xfs/
F: include/uapi/linux/dqblk_xfs.h
F: include/uapi/linux/fsmap.h
F: drivers/net/wireless/zydas/zd1211rw/
ZD1301 MEDIA DRIVER
-M: Antti Palosaari <crope@iki.fi>
L: linux-media@vger.kernel.org
-S: Maintained
+S: Orphan
W: https://linuxtv.org/
-W: http://palosaari.fi/linux/
Q: https://patchwork.linuxtv.org/project/linux-media/list/
F: drivers/media/usb/dvb-usb-v2/zd1301*
ZD1301_DEMOD MEDIA DRIVER
-M: Antti Palosaari <crope@iki.fi>
L: linux-media@vger.kernel.org
-S: Maintained
+S: Orphan
W: https://linuxtv.org/
-W: http://palosaari.fi/linux/
Q: https://patchwork.linuxtv.org/project/linux-media/list/
F: drivers/media/dvb-frontends/zd1301_demod*
VERSION = 6
PATCHLEVEL = 7
SUBLEVEL = 0
-EXTRAVERSION = -rc1
+EXTRAVERSION = -rc7
NAME = Hurr durr I'ma ninja sloth
# *DOCUMENTATION*
select OF
select OF_EARLY_FLATTREE
select PCI_SYSCALL if PCI
- select PERF_USE_VMALLOC if ARC_CACHE_VIPT_ALIASING
select HAVE_ARCH_JUMP_LABEL if ISA_ARCV2 && !CPU_ENDIAN_BE32
select TRACE_IRQFLAGS_SUPPORT
Note that Global I/D ENABLE + Per Page DISABLE works but corollary
Global DISABLE + Per Page ENABLE won't work
-config ARC_CACHE_VIPT_ALIASING
- bool "Support VIPT Aliasing D$"
- depends on ARC_HAS_DCACHE && ISA_ARCOMPACT
-
endif #ARC_CACHE
config ARC_HAS_ICCM
#define flush_cache_dup_mm(mm) /* called on fork (VIVT only) */
-#ifndef CONFIG_ARC_CACHE_VIPT_ALIASING
-
#define flush_cache_mm(mm) /* called on munmap/exit */
#define flush_cache_range(mm, u_vstart, u_vend)
#define flush_cache_page(vma, u_vaddr, pfn) /* PF handling/COW-break */
-#else /* VIPT aliasing dcache */
-
-/* To clear out stale userspace mappings */
-void flush_cache_mm(struct mm_struct *mm);
-void flush_cache_range(struct vm_area_struct *vma,
- unsigned long start,unsigned long end);
-void flush_cache_page(struct vm_area_struct *vma,
- unsigned long user_addr, unsigned long page);
-
-/*
- * To make sure that userspace mapping is flushed to memory before
- * get_user_pages() uses a kernel mapping to access the page
- */
-#define ARCH_HAS_FLUSH_ANON_PAGE
-void flush_anon_page(struct vm_area_struct *vma,
- struct page *page, unsigned long u_vaddr);
-
-#endif /* CONFIG_ARC_CACHE_VIPT_ALIASING */
-
/*
* A new pagecache page has PG_arch_1 clear - thus dcache dirty by default
* This works around some PIO based drivers which don't call flush_dcache_page
*/
#define PG_dc_clean PG_arch_1
-#define CACHE_COLORS_NUM 4
-#define CACHE_COLORS_MSK (CACHE_COLORS_NUM - 1)
-#define CACHE_COLOR(addr) (((unsigned long)(addr) >> (PAGE_SHIFT)) & CACHE_COLORS_MSK)
-
-/*
- * Simple wrapper over config option
- * Bootup code ensures that hardware matches kernel configuration
- */
-static inline int cache_is_vipt_aliasing(void)
-{
- return IS_ENABLED(CONFIG_ARC_CACHE_VIPT_ALIASING);
-}
-
-/*
- * checks if two addresses (after page aligning) index into same cache set
- */
-#define addr_not_cache_congruent(addr1, addr2) \
-({ \
- cache_is_vipt_aliasing() ? \
- (CACHE_COLOR(addr1) != CACHE_COLOR(addr2)) : 0; \
-})
-
#define copy_to_user_page(vma, page, vaddr, dst, src, len) \
do { \
memcpy(dst, src, len); \
/* M = 8-1 N = 8 */
.endm
+.macro SAVE_ABI_CALLEE_REGS
+ push r13
+ push r14
+ push r15
+ push r16
+ push r17
+ push r18
+ push r19
+ push r20
+ push r21
+ push r22
+ push r23
+ push r24
+ push r25
+.endm
+
+.macro RESTORE_ABI_CALLEE_REGS
+ pop r25
+ pop r24
+ pop r23
+ pop r22
+ pop r21
+ pop r20
+ pop r19
+ pop r18
+ pop r17
+ pop r16
+ pop r15
+ pop r14
+ pop r13
+.endm
+
#endif
#include <asm/irqflags-compact.h>
#include <asm/thread_info.h> /* For THREAD_SIZE */
+/* Note on the LD/ST addr modes with addr reg wback
+ *
+ * LD.a same as LD.aw
+ *
+ * LD.a reg1, [reg2, x] => Pre Incr
+ * Eff Addr for load = [reg2 + x]
+ *
+ * LD.ab reg1, [reg2, x] => Post Incr
+ * Eff Addr for load = [reg2]
+ */
+
+.macro PUSHAX aux
+ lr r9, [\aux]
+ push r9
+.endm
+
+.macro POPAX aux
+ pop r9
+ sr r9, [\aux]
+.endm
+
+.macro SAVE_R0_TO_R12
+ push r0
+ push r1
+ push r2
+ push r3
+ push r4
+ push r5
+ push r6
+ push r7
+ push r8
+ push r9
+ push r10
+ push r11
+ push r12
+.endm
+
+.macro RESTORE_R12_TO_R0
+ pop r12
+ pop r11
+ pop r10
+ pop r9
+ pop r8
+ pop r7
+ pop r6
+ pop r5
+ pop r4
+ pop r3
+ pop r2
+ pop r1
+ pop r0
+.endm
+
+.macro SAVE_ABI_CALLEE_REGS
+ push r13
+ push r14
+ push r15
+ push r16
+ push r17
+ push r18
+ push r19
+ push r20
+ push r21
+ push r22
+ push r23
+ push r24
+ push r25
+.endm
+
+.macro RESTORE_ABI_CALLEE_REGS
+ pop r25
+ pop r24
+ pop r23
+ pop r22
+ pop r21
+ pop r20
+ pop r19
+ pop r18
+ pop r17
+ pop r16
+ pop r15
+ pop r14
+ pop r13
+.endm
+
/*--------------------------------------------------------------
* Switch to Kernel Mode stack if SP points to User Mode stack
*
SWITCH_TO_KERNEL_STK
- PUSH 0x003\LVL\()abcd /* Dummy ECR */
+ st.a 0x003\LVL\()abcd, [sp, -4] /* Dummy ECR */
sub sp, sp, 8 /* skip orig_r0 (not needed)
skip pt_regs->sp, already saved above */
#include <asm/entry-arcv2.h>
#endif
-/* Note on the LD/ST addr modes with addr reg wback
- *
- * LD.a same as LD.aw
- *
- * LD.a reg1, [reg2, x] => Pre Incr
- * Eff Addr for load = [reg2 + x]
- *
- * LD.ab reg1, [reg2, x] => Post Incr
- * Eff Addr for load = [reg2]
- */
-
-.macro PUSH reg
- st.a \reg, [sp, -4]
-.endm
-
-.macro PUSHAX aux
- lr r9, [\aux]
- PUSH r9
-.endm
-
-.macro POP reg
- ld.ab \reg, [sp, 4]
-.endm
-
-.macro POPAX aux
- POP r9
- sr r9, [\aux]
-.endm
-
-/*--------------------------------------------------------------
- * Helpers to save/restore Scratch Regs:
- * used by Interrupt/Exception Prologue/Epilogue
- *-------------------------------------------------------------*/
-.macro SAVE_R0_TO_R12
- PUSH r0
- PUSH r1
- PUSH r2
- PUSH r3
- PUSH r4
- PUSH r5
- PUSH r6
- PUSH r7
- PUSH r8
- PUSH r9
- PUSH r10
- PUSH r11
- PUSH r12
-.endm
-
-.macro RESTORE_R12_TO_R0
- POP r12
- POP r11
- POP r10
- POP r9
- POP r8
- POP r7
- POP r6
- POP r5
- POP r4
- POP r3
- POP r2
- POP r1
- POP r0
-
-.endm
-
-/*--------------------------------------------------------------
- * Helpers to save/restore callee-saved regs:
- * used by several macros below
- *-------------------------------------------------------------*/
-.macro SAVE_R13_TO_R25
- PUSH r13
- PUSH r14
- PUSH r15
- PUSH r16
- PUSH r17
- PUSH r18
- PUSH r19
- PUSH r20
- PUSH r21
- PUSH r22
- PUSH r23
- PUSH r24
- PUSH r25
-.endm
-
-.macro RESTORE_R25_TO_R13
- POP r25
- POP r24
- POP r23
- POP r22
- POP r21
- POP r20
- POP r19
- POP r18
- POP r17
- POP r16
- POP r15
- POP r14
- POP r13
-.endm
-
/*
* save user mode callee regs as struct callee_regs
* - needed by fork/do_signal/unaligned-access-emulation.
*/
.macro SAVE_CALLEE_SAVED_USER
- SAVE_R13_TO_R25
+ SAVE_ABI_CALLEE_REGS
.endm
/*
* - could have been changed by ptrace tracer or unaligned-access fixup
*/
.macro RESTORE_CALLEE_SAVED_USER
- RESTORE_R25_TO_R13
+ RESTORE_ABI_CALLEE_REGS
.endm
/*
* save/restore kernel mode callee regs at the time of context switch
*/
.macro SAVE_CALLEE_SAVED_KERNEL
- SAVE_R13_TO_R25
+ SAVE_ABI_CALLEE_REGS
.endm
.macro RESTORE_CALLEE_SAVED_KERNEL
- RESTORE_R25_TO_R13
+ RESTORE_ABI_CALLEE_REGS
.endm
/*--------------------------------------------------------------
#include <linux/types.h>
#include <asm-generic/pgtable-nopmd.h>
+/*
+ * Hugetlb definitions.
+ */
+#define HPAGE_SHIFT PMD_SHIFT
+#define HPAGE_SIZE (_AC(1, UL) << HPAGE_SHIFT)
+#define HPAGE_MASK (~(HPAGE_SIZE - 1))
+
static inline pte_t pmd_pte(pmd_t pmd)
{
return __pte(pmd_val(pmd));
ecr_reg ecr;
};
+struct callee_regs {
+ unsigned long r25, r24, r23, r22, r21, r20, r19, r18, r17, r16, r15, r14, r13;
+};
+
#define MAX_REG_OFFSET offsetof(struct pt_regs, ecr)
#else
unsigned long status32;
};
-#define MAX_REG_OFFSET offsetof(struct pt_regs, status32)
-
-#endif
-
-/* Callee saved registers - need to be saved only when you are scheduled out */
-
struct callee_regs {
unsigned long r25, r24, r23, r22, r21, r20, r19, r18, r17, r16, r15, r14, r13;
};
+#define MAX_REG_OFFSET offsetof(struct pt_regs, status32)
+
+#endif
+
#define instruction_pointer(regs) ((regs)->ret)
#define profile_pc(regs) instruction_pointer(regs)
{
int n = 0;
#ifdef CONFIG_ISA_ARCV2
- const char *release, *cpu_nm, *isa_nm = "ARCv2";
+ const char *release = "", *cpu_nm = "HS38", *isa_nm = "ARCv2";
int dual_issue = 0, dual_enb = 0, mpy_opt, present;
int bpu_full, bpu_cache, bpu_pred, bpu_ret_stk;
char mpy_nm[16], lpb_nm[32];
* releases only update it.
*/
- cpu_nm = "HS38";
-
if (info->arcver > 0x50 && info->arcver <= 0x53) {
release = arc_hs_rel[info->arcver - 0x51].str;
} else {
unsigned int sigret_magic;
};
-static int save_arcv2_regs(struct sigcontext *mctx, struct pt_regs *regs)
+static int save_arcv2_regs(struct sigcontext __user *mctx, struct pt_regs *regs)
{
int err = 0;
#ifndef CONFIG_ISA_ARCOMPACT
#else
v2abi.r58 = v2abi.r59 = 0;
#endif
- err = __copy_to_user(&mctx->v2abi, &v2abi, sizeof(v2abi));
+ err = __copy_to_user(&mctx->v2abi, (void const *)&v2abi, sizeof(v2abi));
#endif
return err;
}
-static int restore_arcv2_regs(struct sigcontext *mctx, struct pt_regs *regs)
+static int restore_arcv2_regs(struct sigcontext __user *mctx, struct pt_regs *regs)
{
int err = 0;
#ifndef CONFIG_ISA_ARCOMPACT
p_dc->sz_k = 1 << (dbcr.sz - 1);
n += scnprintf(buf + n, len - n,
- "D-Cache\t\t: %uK, %dway/set, %uB Line, %s%s%s\n",
+ "D-Cache\t\t: %uK, %dway/set, %uB Line, %s%s\n",
p_dc->sz_k, assoc, p_dc->line_len,
vipt ? "VIPT" : "PIPT",
- p_dc->colors > 1 ? " aliasing" : "",
IS_USED_CFG(CONFIG_ARC_HAS_DCACHE));
slc_chk:
* Exported APIs
*/
-/*
- * Handle cache congruency of kernel and userspace mappings of page when kernel
- * writes-to/reads-from
- *
- * The idea is to defer flushing of kernel mapping after a WRITE, possible if:
- * -dcache is NOT aliasing, hence any U/K-mappings of page are congruent
- * -U-mapping doesn't exist yet for page (finalised in update_mmu_cache)
- * -In SMP, if hardware caches are coherent
- *
- * There's a corollary case, where kernel READs from a userspace mapped page.
- * If the U-mapping is not congruent to K-mapping, former needs flushing.
- */
void flush_dcache_folio(struct folio *folio)
{
- struct address_space *mapping;
-
- if (!cache_is_vipt_aliasing()) {
- clear_bit(PG_dc_clean, &folio->flags);
- return;
- }
-
- /* don't handle anon pages here */
- mapping = folio_flush_mapping(folio);
- if (!mapping)
- return;
-
- /*
- * pagecache page, file not yet mapped to userspace
- * Make a note that K-mapping is dirty
- */
- if (!mapping_mapped(mapping)) {
- clear_bit(PG_dc_clean, &folio->flags);
- } else if (folio_mapped(folio)) {
- /* kernel reading from page with U-mapping */
- phys_addr_t paddr = (unsigned long)folio_address(folio);
- unsigned long vaddr = folio_pos(folio);
-
- /*
- * vaddr is not actually the virtual address, but is
- * congruent to every user mapping.
- */
- if (addr_not_cache_congruent(paddr, vaddr))
- __flush_dcache_pages(paddr, vaddr,
- folio_nr_pages(folio));
- }
+ clear_bit(PG_dc_clean, &folio->flags);
+ return;
}
EXPORT_SYMBOL(flush_dcache_folio);
}
-#ifdef CONFIG_ARC_CACHE_VIPT_ALIASING
-
-void flush_cache_mm(struct mm_struct *mm)
-{
- flush_cache_all();
-}
-
-void flush_cache_page(struct vm_area_struct *vma, unsigned long u_vaddr,
- unsigned long pfn)
-{
- phys_addr_t paddr = pfn << PAGE_SHIFT;
-
- u_vaddr &= PAGE_MASK;
-
- __flush_dcache_pages(paddr, u_vaddr, 1);
-
- if (vma->vm_flags & VM_EXEC)
- __inv_icache_pages(paddr, u_vaddr, 1);
-}
-
-void flush_cache_range(struct vm_area_struct *vma, unsigned long start,
- unsigned long end)
-{
- flush_cache_all();
-}
-
-void flush_anon_page(struct vm_area_struct *vma, struct page *page,
- unsigned long u_vaddr)
-{
- /* TBD: do we really need to clear the kernel mapping */
- __flush_dcache_pages((phys_addr_t)page_address(page), u_vaddr, 1);
- __flush_dcache_pages((phys_addr_t)page_address(page),
- (phys_addr_t)page_address(page), 1);
-
-}
-
-#endif
-
void copy_user_highpage(struct page *to, struct page *from,
unsigned long u_vaddr, struct vm_area_struct *vma)
{
struct folio *dst = page_folio(to);
void *kfrom = kmap_atomic(from);
void *kto = kmap_atomic(to);
- int clean_src_k_mappings = 0;
-
- /*
- * If SRC page was already mapped in userspace AND it's U-mapping is
- * not congruent with K-mapping, sync former to physical page so that
- * K-mapping in memcpy below, sees the right data
- *
- * Note that while @u_vaddr refers to DST page's userspace vaddr, it is
- * equally valid for SRC page as well
- *
- * For !VIPT cache, all of this gets compiled out as
- * addr_not_cache_congruent() is 0
- */
- if (page_mapcount(from) && addr_not_cache_congruent(kfrom, u_vaddr)) {
- __flush_dcache_pages((unsigned long)kfrom, u_vaddr, 1);
- clean_src_k_mappings = 1;
- }
copy_page(kto, kfrom);
- /*
- * Mark DST page K-mapping as dirty for a later finalization by
- * update_mmu_cache(). Although the finalization could have been done
- * here as well (given that both vaddr/paddr are available).
- * But update_mmu_cache() already has code to do that for other
- * non copied user pages (e.g. read faults which wire in pagecache page
- * directly).
- */
clear_bit(PG_dc_clean, &dst->flags);
-
- /*
- * if SRC was already usermapped and non-congruent to kernel mapping
- * sync the kernel mapping back to physical page
- */
- if (clean_src_k_mappings) {
- __flush_dcache_pages((unsigned long)kfrom,
- (unsigned long)kfrom, 1);
- } else {
- clear_bit(PG_dc_clean, &src->flags);
- }
+ clear_bit(PG_dc_clean, &src->flags);
kunmap_atomic(kto);
kunmap_atomic(kfrom);
dc->line_len, L1_CACHE_BYTES);
/* check for D-Cache aliasing on ARCompact: ARCv2 has PIPT */
- if (is_isa_arcompact()) {
- int handled = IS_ENABLED(CONFIG_ARC_CACHE_VIPT_ALIASING);
-
- if (dc->colors > 1) {
- if (!handled)
- panic("Enable CONFIG_ARC_CACHE_VIPT_ALIASING\n");
- if (CACHE_COLORS_NUM != dc->colors)
- panic("CACHE_COLORS_NUM not optimized for config\n");
- } else if (handled && dc->colors == 1) {
- panic("Disable CONFIG_ARC_CACHE_VIPT_ALIASING\n");
- }
+ if (is_isa_arcompact() && dc->colors > 1) {
+ panic("Aliasing VIPT cache not supported\n");
}
}
#include <asm/cacheflush.h>
-#define COLOUR_ALIGN(addr, pgoff) \
- ((((addr) + SHMLBA - 1) & ~(SHMLBA - 1)) + \
- (((pgoff) << PAGE_SHIFT) & (SHMLBA - 1)))
-
/*
* Ensure that shared mappings are correctly aligned to
* avoid aliasing issues with VIPT caches.
{
struct mm_struct *mm = current->mm;
struct vm_area_struct *vma;
- int do_align = 0;
- int aliasing = cache_is_vipt_aliasing();
struct vm_unmapped_area_info info;
- /*
- * We only need to do colour alignment if D cache aliases.
- */
- if (aliasing)
- do_align = filp || (flags & MAP_SHARED);
-
/*
* We enforce the MAP_FIXED case.
*/
if (flags & MAP_FIXED) {
- if (aliasing && flags & MAP_SHARED &&
+ if (flags & MAP_SHARED &&
(addr - (pgoff << PAGE_SHIFT)) & (SHMLBA - 1))
return -EINVAL;
return addr;
return -ENOMEM;
if (addr) {
- if (do_align)
- addr = COLOUR_ALIGN(addr, pgoff);
- else
- addr = PAGE_ALIGN(addr);
+ addr = PAGE_ALIGN(addr);
vma = find_vma(mm, addr);
if (TASK_SIZE - len >= addr &&
info.length = len;
info.low_limit = mm->mmap_base;
info.high_limit = TASK_SIZE;
- info.align_mask = do_align ? (PAGE_MASK & (SHMLBA - 1)) : 0;
+ info.align_mask = 0;
info.align_offset = pgoff << PAGE_SHIFT;
return vm_unmapped_area(&info);
}
create_tlb(vma, vaddr, ptep);
- if (page == ZERO_PAGE(0)) {
+ if (page == ZERO_PAGE(0))
return;
- }
/*
- * Exec page : Independent of aliasing/page-color considerations,
- * since icache doesn't snoop dcache on ARC, any dirty
- * K-mapping of a code page needs to be wback+inv so that
- * icache fetch by userspace sees code correctly.
- * !EXEC page: If K-mapping is NOT congruent to U-mapping, flush it
- * so userspace sees the right data.
- * (Avoids the flush for Non-exec + congruent mapping case)
+ * For executable pages, since icache doesn't snoop dcache, any
+ * dirty K-mapping of a code page needs to be wback+inv so that
+ * icache fetch by userspace sees code correctly.
*/
- if ((vma->vm_flags & VM_EXEC) ||
- addr_not_cache_congruent(paddr, vaddr)) {
+ if (vma->vm_flags & VM_EXEC) {
struct folio *folio = page_folio(page);
int dirty = !test_and_set_bit(PG_dc_clean, &folio->flags);
if (dirty) {
gpios = <&gpio 42 GPIO_ACTIVE_HIGH>;
};
-&leds {
- /delete-node/ led_act;
-};
+/delete-node/ &led_act;
&pm {
/delete-property/ system-power-controller;
&clks {
assigned-clocks = <&clks IMX6QDL_CLK_LDB_DI0_SEL>,
- <&clks IMX6QDL_CLK_LDB_DI1_SEL>;
+ <&clks IMX6QDL_CLK_LDB_DI1_SEL>, <&clks IMX6QDL_CLK_ENET_REF_SEL>;
assigned-clock-parents = <&clks IMX6QDL_CLK_PLL5_VIDEO_DIV>,
- <&clks IMX6QDL_CLK_PLL5_VIDEO_DIV>;
+ <&clks IMX6QDL_CLK_PLL5_VIDEO_DIV>, <&clk50m_phy>;
};
&hdmi {
max-speed = <100>;
interrupt-parent = <&gpio5>;
interrupts = <6 IRQ_TYPE_LEVEL_LOW>;
+ clocks = <&clks IMX6UL_CLK_ENET_REF>;
+ clock-names = "rmii-ref";
};
};
};
};
gpt1: timer@302d0000 {
- compatible = "fsl,imx7d-gpt", "fsl,imx6sx-gpt";
+ compatible = "fsl,imx7d-gpt", "fsl,imx6dl-gpt";
reg = <0x302d0000 0x10000>;
interrupts = <GIC_SPI 55 IRQ_TYPE_LEVEL_HIGH>;
clocks = <&clks IMX7D_GPT1_ROOT_CLK>,
};
gpt2: timer@302e0000 {
- compatible = "fsl,imx7d-gpt", "fsl,imx6sx-gpt";
+ compatible = "fsl,imx7d-gpt", "fsl,imx6dl-gpt";
reg = <0x302e0000 0x10000>;
interrupts = <GIC_SPI 54 IRQ_TYPE_LEVEL_HIGH>;
clocks = <&clks IMX7D_GPT2_ROOT_CLK>,
};
gpt3: timer@302f0000 {
- compatible = "fsl,imx7d-gpt", "fsl,imx6sx-gpt";
+ compatible = "fsl,imx7d-gpt", "fsl,imx6dl-gpt";
reg = <0x302f0000 0x10000>;
interrupts = <GIC_SPI 53 IRQ_TYPE_LEVEL_HIGH>;
clocks = <&clks IMX7D_GPT3_ROOT_CLK>,
};
gpt4: timer@30300000 {
- compatible = "fsl,imx7d-gpt", "fsl,imx6sx-gpt";
+ compatible = "fsl,imx7d-gpt", "fsl,imx6dl-gpt";
reg = <0x30300000 0x10000>;
interrupts = <GIC_SPI 52 IRQ_TYPE_LEVEL_HIGH>;
clocks = <&clks IMX7D_GPT4_ROOT_CLK>,
#include "imx28-lwe.dtsi"
/ {
+ model = "Liebherr XEA board";
compatible = "lwn,imx28-xea", "fsl,imx28";
};
};
sdmmc_pwren: sdmmc-pwren {
- rockchip,pins = <1 RK_PB6 1 &pcfg_pull_default>;
+ rockchip,pins = <1 RK_PB6 RK_FUNC_GPIO &pcfg_pull_default>;
};
sdmmc_bus4: sdmmc-bus4 {
power-domain@RK3228_PD_VOP {
reg = <RK3228_PD_VOP>;
- clocks =<&cru ACLK_VOP>,
- <&cru DCLK_VOP>,
- <&cru HCLK_VOP>;
+ clocks = <&cru ACLK_VOP>,
+ <&cru DCLK_VOP>,
+ <&cru HCLK_VOP>;
pm_qos = <&qos_vop>;
#power-domain-cells = <0>;
};
<SYSC_IDLE_NO>,
<SYSC_IDLE_SMART>,
<SYSC_IDLE_SMART_WKUP>;
+ ti,sysc-delay-us = <2>;
clocks = <&l3s_clkctrl AM3_L3S_USB_OTG_HS_CLKCTRL 0>;
clock-names = "fck";
#address-cells = <1>;
l3-noc@44000000 {
compatible = "ti,dra7-l3-noc";
- reg = <0x44000000 0x1000>,
+ reg = <0x44000000 0x1000000>,
<0x45000000 0x1000>;
interrupts-extended = <&crossbar_mpu GIC_SPI 4 IRQ_TYPE_LEVEL_HIGH>,
<&wakeupgen GIC_SPI 10 IRQ_TYPE_LEVEL_HIGH>;
#ifndef _ARM_KEXEC_H
#define _ARM_KEXEC_H
-#ifdef CONFIG_KEXEC
-
/* Maximum physical address we can use pages from */
#define KEXEC_SOURCE_MEMORY_LIMIT (-1UL)
/* Maximum address we can reach in physical address mode */
#endif /* __ASSEMBLY__ */
-#endif /* CONFIG_KEXEC */
-
#endif /* _ARM_KEXEC_H */
obj-$(CONFIG_DYNAMIC_FTRACE) += ftrace.o insn.o patch.o
obj-$(CONFIG_FUNCTION_GRAPH_TRACER) += ftrace.o insn.o patch.o
obj-$(CONFIG_JUMP_LABEL) += jump_label.o insn.o patch.o
-obj-$(CONFIG_KEXEC) += machine_kexec.o relocate_kernel.o
+obj-$(CONFIG_KEXEC_CORE) += machine_kexec.o relocate_kernel.o
# Main staffs in KPROBES are in arch/arm/probes/ .
obj-$(CONFIG_KPROBES) += patch.o insn.o
obj-$(CONFIG_OABI_COMPAT) += sys_oabi-compat.o
name = devm_kasprintf(&pdev->dev,
GFP_KERNEL, "mmdc%d", ret);
+ if (!name) {
+ ret = -ENOMEM;
+ goto pmu_release_id;
+ }
pmu_mmdc->mmdc_ipg_clk = mmdc_ipg_clk;
pmu_mmdc->devtype_data = (struct fsl_mmdc_devtype_data *)of_id->data;
pmu_register_err:
pr_warn("MMDC Perf PMU failed (%d), disabled\n", ret);
- ida_simple_remove(&mmdc_ida, pmu_mmdc->id);
cpuhp_state_remove_instance_nocalls(cpuhp_mmdc_state, &pmu_mmdc->node);
hrtimer_cancel(&pmu_mmdc->hrtimer);
+pmu_release_id:
+ ida_simple_remove(&mmdc_ida, pmu_mmdc->id);
pmu_free:
kfree(pmu_mmdc);
return ret;
soc_dev_attr->machine = soc_name;
soc_dev_attr->family = omap_get_family();
+ if (!soc_dev_attr->family) {
+ kfree(soc_dev_attr);
+ return;
+ }
soc_dev_attr->revision = soc_rev;
soc_dev_attr->custom_attr_group = omap_soc_groups[0];
soc_dev = soc_device_register(soc_dev_attr);
if (IS_ERR(soc_dev)) {
+ kfree(soc_dev_attr->family);
kfree(soc_dev_attr);
return;
}
* for secondary CPUs as they are brought up.
* For uniformity we use VCPUOP_register_vcpu_info even on cpu0.
*/
- xen_vcpu_info = alloc_percpu(struct vcpu_info);
+ xen_vcpu_info = __alloc_percpu(sizeof(struct vcpu_info),
+ 1 << fls(sizeof(struct vcpu_info) - 1));
if (xen_vcpu_info == NULL)
return -ENOMEM;
all: $(notdir $(KBUILD_IMAGE))
-
+vmlinuz.efi: Image
Image vmlinuz.efi: vmlinux
$(Q)$(MAKE) $(build)=$(boot) $(boot)/$@
&emac0 {
pinctrl-names = "default";
pinctrl-0 = <&ext_rgmii_pins>;
- phy-mode = "rgmii";
phy-handle = <&ext_rgmii_phy>;
- allwinner,rx-delay-ps = <3100>;
- allwinner,tx-delay-ps = <700>;
status = "okay";
};
};
&emac0 {
+ allwinner,rx-delay-ps = <3100>;
+ allwinner,tx-delay-ps = <700>;
+ phy-mode = "rgmii";
phy-supply = <®_dcdce>;
};
};
&emac0 {
+ allwinner,tx-delay-ps = <700>;
+ phy-mode = "rgmii-rxid";
phy-supply = <®_dldo1>;
};
pinctrl-0 = <&pinctrl_wifi_pdn>;
gpio = <&lsio_gpio1 28 GPIO_ACTIVE_HIGH>;
enable-active-high;
+ regulator-always-on;
regulator-name = "wifi_pwrdn_fake_regulator";
regulator-settling-time-us = <100>;
-
- regulator-state-mem {
- regulator-off-in-suspend;
- };
};
reg_pcie_switch: regulator-pcie-switch {
clock-names = "ipg", "per";
assigned-clocks = <&clk IMX_SC_R_LCD_0_PWM_0 IMX_SC_PM_CLK_PER>;
assigned-clock-rates = <24000000>;
- #pwm-cells = <2>;
+ #pwm-cells = <3>;
power-domains = <&pd IMX_SC_R_LCD_0_PWM_0>;
};
<&pwm0_lpcg 1>;
assigned-clocks = <&clk IMX_SC_R_PWM_0 IMX_SC_PM_CLK_PER>;
assigned-clock-rates = <24000000>;
- #pwm-cells = <2>;
+ #pwm-cells = <3>;
interrupts = <GIC_SPI 94 IRQ_TYPE_LEVEL_HIGH>;
status = "disabled";
};
<&pwm1_lpcg 1>;
assigned-clocks = <&clk IMX_SC_R_PWM_1 IMX_SC_PM_CLK_PER>;
assigned-clock-rates = <24000000>;
- #pwm-cells = <2>;
+ #pwm-cells = <3>;
interrupts = <GIC_SPI 95 IRQ_TYPE_LEVEL_HIGH>;
status = "disabled";
};
<&pwm2_lpcg 1>;
assigned-clocks = <&clk IMX_SC_R_PWM_2 IMX_SC_PM_CLK_PER>;
assigned-clock-rates = <24000000>;
- #pwm-cells = <2>;
+ #pwm-cells = <3>;
interrupts = <GIC_SPI 96 IRQ_TYPE_LEVEL_HIGH>;
status = "disabled";
};
<&pwm3_lpcg 1>;
assigned-clocks = <&clk IMX_SC_R_PWM_3 IMX_SC_PM_CLK_PER>;
assigned-clock-rates = <24000000>;
- #pwm-cells = <2>;
+ #pwm-cells = <3>;
interrupts = <GIC_SPI 97 IRQ_TYPE_LEVEL_HIGH>;
status = "disabled";
};
phys = <&usb3_phy0>, <&usb3_phy0>;
phy-names = "usb2-phy", "usb3-phy";
snps,gfladj-refclk-lpm-sel-quirk;
+ snps,parkmode-disable-ss-quirk;
};
};
phys = <&usb3_phy1>, <&usb3_phy1>;
phy-names = "usb2-phy", "usb3-phy";
snps,gfladj-refclk-lpm-sel-quirk;
+ snps,parkmode-disable-ss-quirk;
};
};
phys = <&usb3_phy0>, <&usb3_phy0>;
phy-names = "usb2-phy", "usb3-phy";
power-domains = <&pgc_otg1>;
+ snps,parkmode-disable-ss-quirk;
status = "disabled";
};
phys = <&usb3_phy1>, <&usb3_phy1>;
phy-names = "usb2-phy", "usb3-phy";
power-domains = <&pgc_otg2>;
+ snps,parkmode-disable-ss-quirk;
status = "disabled";
};
status = "okay";
};
+&edma3 {
+ power-domains = <&pd IMX_SC_R_DMA_1_CH0>,
+ <&pd IMX_SC_R_DMA_1_CH1>,
+ <&pd IMX_SC_R_DMA_1_CH2>,
+ <&pd IMX_SC_R_DMA_1_CH3>,
+ <&pd IMX_SC_R_DMA_1_CH4>,
+ <&pd IMX_SC_R_DMA_1_CH5>,
+ <&pd IMX_SC_R_DMA_1_CH6>,
+ <&pd IMX_SC_R_DMA_1_CH7>;
+};
+
&flexcan1 {
fsl,clk-source = /bits/ 8 <1>;
};
};
};
- gpioe: gpio@2d000080 {
+ gpioe: gpio@2d000000 {
compatible = "fsl,imx8ulp-gpio";
reg = <0x2d000000 0x1000>;
gpio-controller;
gpio-ranges = <&iomuxc1 0 32 24>;
};
- gpiof: gpio@2d010080 {
+ gpiof: gpio@2d010000 {
compatible = "fsl,imx8ulp-gpio";
reg = <0x2d010000 0x1000>;
gpio-controller;
};
};
- gpiod: gpio@2e200080 {
+ gpiod: gpio@2e200000 {
compatible = "fsl,imx8ulp-gpio";
reg = <0x2e200000 0x1000>;
gpio-controller;
fsl,pins = <
MX93_PAD_UART2_TXD__LPUART2_TX 0x31e
MX93_PAD_UART2_RXD__LPUART2_RX 0x31e
- MX93_PAD_SAI1_TXD0__LPUART2_RTS_B 0x31e
+ MX93_PAD_SAI1_TXD0__LPUART2_RTS_B 0x51e
>;
};
compatible = "fsl,imx93-src-slice";
reg = <0x44462400 0x400>, <0x44465800 0x400>;
#power-domain-cells = <0>;
- clocks = <&clk IMX93_CLK_MEDIA_AXI>,
+ clocks = <&clk IMX93_CLK_NIC_MEDIA_GATE>,
<&clk IMX93_CLK_MEDIA_APB>;
};
};
};
};
- gpio2: gpio@43810080 {
+ gpio2: gpio@43810000 {
compatible = "fsl,imx93-gpio", "fsl,imx8ulp-gpio";
reg = <0x43810000 0x1000>;
gpio-controller;
gpio-ranges = <&iomuxc 0 4 30>;
};
- gpio3: gpio@43820080 {
+ gpio3: gpio@43820000 {
compatible = "fsl,imx93-gpio", "fsl,imx8ulp-gpio";
reg = <0x43820000 0x1000>;
gpio-controller;
<&iomuxc 26 34 2>, <&iomuxc 28 0 4>;
};
- gpio4: gpio@43830080 {
+ gpio4: gpio@43830000 {
compatible = "fsl,imx93-gpio", "fsl,imx8ulp-gpio";
reg = <0x43830000 0x1000>;
gpio-controller;
gpio-ranges = <&iomuxc 0 38 28>, <&iomuxc 28 36 2>;
};
- gpio1: gpio@47400080 {
+ gpio1: gpio@47400000 {
compatible = "fsl,imx93-gpio", "fsl,imx8ulp-gpio";
reg = <0x47400000 0x1000>;
gpio-controller;
};
};
- memory {
+ memory@40000000 {
reg = <0 0x40000000 0 0x40000000>;
};
};
};
- memory {
+ memory@40000000 {
reg = <0 0x40000000 0 0x20000000>;
};
compatible = "sff,sfp";
i2c-bus = <&i2c_sfp1>;
los-gpios = <&pio 46 GPIO_ACTIVE_HIGH>;
+ maximum-power-milliwatt = <3000>;
mod-def0-gpios = <&pio 49 GPIO_ACTIVE_LOW>;
tx-disable-gpios = <&pio 20 GPIO_ACTIVE_HIGH>;
tx-fault-gpios = <&pio 7 GPIO_ACTIVE_HIGH>;
i2c-bus = <&i2c_sfp2>;
los-gpios = <&pio 31 GPIO_ACTIVE_HIGH>;
mod-def0-gpios = <&pio 47 GPIO_ACTIVE_LOW>;
+ maximum-power-milliwatt = <3000>;
tx-disable-gpios = <&pio 15 GPIO_ACTIVE_HIGH>;
tx-fault-gpios = <&pio 48 GPIO_ACTIVE_HIGH>;
};
trip = <&cpu_trip_active_high>;
};
- cpu-active-low {
+ cpu-active-med {
/* active: set fan to cooling level 1 */
cooling-device = <&fan 1 1>;
- trip = <&cpu_trip_active_low>;
+ trip = <&cpu_trip_active_med>;
};
- cpu-passive {
- /* passive: set fan to cooling level 0 */
+ cpu-active-low {
+ /* active: set fan to cooling level 0 */
cooling-device = <&fan 0 0>;
- trip = <&cpu_trip_passive>;
+ trip = <&cpu_trip_active_low>;
};
};
};
reg = <0 0x11230000 0 0x1000>,
<0 0x11c20000 0 0x1000>;
interrupts = <GIC_SPI 143 IRQ_TYPE_LEVEL_HIGH>;
+ assigned-clocks = <&topckgen CLK_TOP_EMMC_416M_SEL>,
+ <&topckgen CLK_TOP_EMMC_250M_SEL>;
+ assigned-clock-parents = <&apmixedsys CLK_APMIXED_MPLL>,
+ <&topckgen CLK_TOP_NET1PLL_D5_D2>;
clocks = <&topckgen CLK_TOP_EMMC_416M_SEL>,
<&infracfg CLK_INFRA_MSDC_HCK_CK>,
<&infracfg CLK_INFRA_MSDC_CK>,
thermal-sensors = <&thermal 0>;
trips {
+ cpu_trip_crit: crit {
+ temperature = <125000>;
+ hysteresis = <2000>;
+ type = "critical";
+ };
+
+ cpu_trip_hot: hot {
+ temperature = <120000>;
+ hysteresis = <2000>;
+ type = "hot";
+ };
+
cpu_trip_active_high: active-high {
temperature = <115000>;
hysteresis = <2000>;
type = "active";
};
- cpu_trip_active_low: active-low {
+ cpu_trip_active_med: active-med {
temperature = <85000>;
hysteresis = <2000>;
type = "active";
};
- cpu_trip_passive: passive {
- temperature = <40000>;
+ cpu_trip_active_low: active-low {
+ temperature = <60000>;
hysteresis = <2000>;
- type = "passive";
+ type = "active";
};
};
};
id-gpio = <&pio 16 GPIO_ACTIVE_HIGH>;
};
- usb_p1_vbus: regulator@0 {
+ usb_p1_vbus: regulator-usb-p1 {
compatible = "regulator-fixed";
regulator-name = "usb_vbus";
regulator-min-microvolt = <5000000>;
enable-active-high;
};
- usb_p0_vbus: regulator@1 {
+ usb_p0_vbus: regulator-usb-p0 {
compatible = "regulator-fixed";
regulator-name = "vbus";
regulator-min-microvolt = <5000000>;
#address-cells = <2>;
#size-cells = <2>;
ranges;
- scp_mem_reserved: scp_mem_region {
+ scp_mem_reserved: memory@50000000 {
compatible = "shared-dma-pool";
reg = <0 0x50000000 0 0x2900000>;
no-map;
};
};
- ntc@0 {
+ thermal-sensor {
compatible = "murata,ncp03wf104";
pullup-uv = <1800000>;
pullup-ohm = <390000>;
&dsi0 {
status = "okay";
+ /delete-property/#size-cells;
+ /delete-property/#address-cells;
/delete-node/panel@0;
ports {
port {
};
touchscreen_pins: touchscreen-pins {
- touch_int_odl {
+ touch-int-odl {
pinmux = <PINMUX_GPIO155__FUNC_GPIO155>;
input-enable;
bias-pull-up;
};
- touch_rst_l {
+ touch-rst-l {
pinmux = <PINMUX_GPIO156__FUNC_GPIO156>;
output-high;
};
};
trackpad_pins: trackpad-pins {
- trackpad_int {
+ trackpad-int {
pinmux = <PINMUX_GPIO7__FUNC_GPIO7>;
input-enable;
bias-disable; /* pulled externally */
#size-cells = <2>;
ranges;
- scp_mem_reserved: scp_mem_region {
+ scp_mem_reserved: memory@50000000 {
compatible = "shared-dma-pool";
reg = <0 0x50000000 0 0x2900000>;
no-map;
&pio {
aud_pins_default: audiopins {
- pins_bus {
+ pins-bus {
pinmux = <PINMUX_GPIO97__FUNC_I2S2_MCK>,
<PINMUX_GPIO98__FUNC_I2S2_BCK>,
<PINMUX_GPIO101__FUNC_I2S2_LRCK>,
};
aud_pins_tdm_out_on: audiotdmouton {
- pins_bus {
+ pins-bus {
pinmux = <PINMUX_GPIO169__FUNC_TDM_BCK_2ND>,
<PINMUX_GPIO170__FUNC_TDM_LRCK_2ND>,
<PINMUX_GPIO171__FUNC_TDM_DATA0_2ND>,
};
aud_pins_tdm_out_off: audiotdmoutoff {
- pins_bus {
+ pins-bus {
pinmux = <PINMUX_GPIO169__FUNC_GPIO169>,
<PINMUX_GPIO170__FUNC_GPIO170>,
<PINMUX_GPIO171__FUNC_GPIO171>,
};
bt_pins: bt-pins {
- pins_bt_en {
+ pins-bt-en {
pinmux = <PINMUX_GPIO120__FUNC_GPIO120>;
output-low;
};
};
- ec_ap_int_odl: ec_ap_int_odl {
+ ec_ap_int_odl: ec-ap-int-odl {
pins1 {
pinmux = <PINMUX_GPIO151__FUNC_GPIO151>;
input-enable;
};
};
- h1_int_od_l: h1_int_od_l {
+ h1_int_od_l: h1-int-od-l {
pins1 {
pinmux = <PINMUX_GPIO153__FUNC_GPIO153>;
input-enable;
};
i2c0_pins: i2c0 {
- pins_bus {
+ pins-bus {
pinmux = <PINMUX_GPIO82__FUNC_SDA0>,
<PINMUX_GPIO83__FUNC_SCL0>;
mediatek,pull-up-adv = <3>;
};
i2c1_pins: i2c1 {
- pins_bus {
+ pins-bus {
pinmux = <PINMUX_GPIO81__FUNC_SDA1>,
<PINMUX_GPIO84__FUNC_SCL1>;
mediatek,pull-up-adv = <3>;
};
i2c2_pins: i2c2 {
- pins_bus {
+ pins-bus {
pinmux = <PINMUX_GPIO103__FUNC_SCL2>,
<PINMUX_GPIO104__FUNC_SDA2>;
bias-disable;
};
i2c3_pins: i2c3 {
- pins_bus {
+ pins-bus {
pinmux = <PINMUX_GPIO50__FUNC_SCL3>,
<PINMUX_GPIO51__FUNC_SDA3>;
mediatek,pull-up-adv = <3>;
};
i2c4_pins: i2c4 {
- pins_bus {
+ pins-bus {
pinmux = <PINMUX_GPIO105__FUNC_SCL4>,
<PINMUX_GPIO106__FUNC_SDA4>;
bias-disable;
};
i2c5_pins: i2c5 {
- pins_bus {
+ pins-bus {
pinmux = <PINMUX_GPIO48__FUNC_SCL5>,
<PINMUX_GPIO49__FUNC_SDA5>;
mediatek,pull-up-adv = <3>;
};
i2c6_pins: i2c6 {
- pins_bus {
+ pins-bus {
pinmux = <PINMUX_GPIO11__FUNC_SCL6>,
<PINMUX_GPIO12__FUNC_SDA6>;
bias-disable;
};
mmc0_pins_default: mmc0-pins-default {
- pins_cmd_dat {
+ pins-cmd-dat {
pinmux = <PINMUX_GPIO123__FUNC_MSDC0_DAT0>,
<PINMUX_GPIO128__FUNC_MSDC0_DAT1>,
<PINMUX_GPIO125__FUNC_MSDC0_DAT2>,
mediatek,pull-up-adv = <01>;
};
- pins_clk {
+ pins-clk {
pinmux = <PINMUX_GPIO124__FUNC_MSDC0_CLK>;
drive-strength = <MTK_DRIVE_14mA>;
mediatek,pull-down-adv = <10>;
};
- pins_rst {
+ pins-rst {
pinmux = <PINMUX_GPIO133__FUNC_MSDC0_RSTB>;
drive-strength = <MTK_DRIVE_14mA>;
mediatek,pull-down-adv = <01>;
};
mmc0_pins_uhs: mmc0-pins-uhs {
- pins_cmd_dat {
+ pins-cmd-dat {
pinmux = <PINMUX_GPIO123__FUNC_MSDC0_DAT0>,
<PINMUX_GPIO128__FUNC_MSDC0_DAT1>,
<PINMUX_GPIO125__FUNC_MSDC0_DAT2>,
mediatek,pull-up-adv = <01>;
};
- pins_clk {
+ pins-clk {
pinmux = <PINMUX_GPIO124__FUNC_MSDC0_CLK>;
drive-strength = <MTK_DRIVE_14mA>;
mediatek,pull-down-adv = <10>;
};
- pins_ds {
+ pins-ds {
pinmux = <PINMUX_GPIO131__FUNC_MSDC0_DSL>;
drive-strength = <MTK_DRIVE_14mA>;
mediatek,pull-down-adv = <10>;
};
- pins_rst {
+ pins-rst {
pinmux = <PINMUX_GPIO133__FUNC_MSDC0_RSTB>;
drive-strength = <MTK_DRIVE_14mA>;
mediatek,pull-up-adv = <01>;
};
mmc1_pins_default: mmc1-pins-default {
- pins_cmd_dat {
+ pins-cmd-dat {
pinmux = <PINMUX_GPIO31__FUNC_MSDC1_CMD>,
<PINMUX_GPIO32__FUNC_MSDC1_DAT0>,
<PINMUX_GPIO34__FUNC_MSDC1_DAT1>,
mediatek,pull-up-adv = <10>;
};
- pins_clk {
+ pins-clk {
pinmux = <PINMUX_GPIO29__FUNC_MSDC1_CLK>;
input-enable;
mediatek,pull-down-adv = <10>;
};
mmc1_pins_uhs: mmc1-pins-uhs {
- pins_cmd_dat {
+ pins-cmd-dat {
pinmux = <PINMUX_GPIO31__FUNC_MSDC1_CMD>,
<PINMUX_GPIO32__FUNC_MSDC1_DAT0>,
<PINMUX_GPIO34__FUNC_MSDC1_DAT1>,
mediatek,pull-up-adv = <10>;
};
- pins_clk {
+ pins-clk {
pinmux = <PINMUX_GPIO29__FUNC_MSDC1_CLK>;
drive-strength = <MTK_DRIVE_8mA>;
mediatek,pull-down-adv = <10>;
};
};
- panel_pins_default: panel_pins_default {
- panel_reset {
+ panel_pins_default: panel-pins-default {
+ panel-reset {
pinmux = <PINMUX_GPIO45__FUNC_GPIO45>;
output-low;
bias-pull-up;
};
};
- pwm0_pin_default: pwm0_pin_default {
+ pwm0_pin_default: pwm0-pin-default {
pins1 {
pinmux = <PINMUX_GPIO176__FUNC_GPIO176>;
output-high;
};
scp_pins: scp {
- pins_scp_uart {
+ pins-scp-uart {
pinmux = <PINMUX_GPIO110__FUNC_TP_URXD1_AO>,
<PINMUX_GPIO112__FUNC_TP_UTXD1_AO>;
};
};
spi0_pins: spi0 {
- pins_spi {
+ pins-spi {
pinmux = <PINMUX_GPIO85__FUNC_SPI0_MI>,
<PINMUX_GPIO86__FUNC_GPIO86>,
<PINMUX_GPIO87__FUNC_SPI0_MO>,
};
spi1_pins: spi1 {
- pins_spi {
+ pins-spi {
pinmux = <PINMUX_GPIO161__FUNC_SPI1_A_MI>,
<PINMUX_GPIO162__FUNC_SPI1_A_CSB>,
<PINMUX_GPIO163__FUNC_SPI1_A_MO>,
};
spi2_pins: spi2 {
- pins_spi {
+ pins-spi {
pinmux = <PINMUX_GPIO0__FUNC_SPI2_CSB>,
<PINMUX_GPIO1__FUNC_SPI2_MO>,
<PINMUX_GPIO2__FUNC_SPI2_CLK>;
bias-disable;
};
- pins_spi_mi {
+ pins-spi-mi {
pinmux = <PINMUX_GPIO94__FUNC_SPI2_MI>;
mediatek,pull-down-adv = <00>;
};
};
spi3_pins: spi3 {
- pins_spi {
+ pins-spi {
pinmux = <PINMUX_GPIO21__FUNC_SPI3_MI>,
<PINMUX_GPIO22__FUNC_SPI3_CSB>,
<PINMUX_GPIO23__FUNC_SPI3_MO>,
};
spi4_pins: spi4 {
- pins_spi {
+ pins-spi {
pinmux = <PINMUX_GPIO17__FUNC_SPI4_MI>,
<PINMUX_GPIO18__FUNC_SPI4_CSB>,
<PINMUX_GPIO19__FUNC_SPI4_MO>,
};
spi5_pins: spi5 {
- pins_spi {
+ pins-spi {
pinmux = <PINMUX_GPIO13__FUNC_SPI5_MI>,
<PINMUX_GPIO14__FUNC_SPI5_CSB>,
<PINMUX_GPIO15__FUNC_SPI5_MO>,
};
uart0_pins_default: uart0-pins-default {
- pins_rx {
+ pins-rx {
pinmux = <PINMUX_GPIO95__FUNC_URXD0>;
input-enable;
bias-pull-up;
};
- pins_tx {
+ pins-tx {
pinmux = <PINMUX_GPIO96__FUNC_UTXD0>;
};
};
uart1_pins_default: uart1-pins-default {
- pins_rx {
+ pins-rx {
pinmux = <PINMUX_GPIO121__FUNC_URXD1>;
input-enable;
bias-pull-up;
};
- pins_tx {
+ pins-tx {
pinmux = <PINMUX_GPIO115__FUNC_UTXD1>;
};
- pins_rts {
+ pins-rts {
pinmux = <PINMUX_GPIO47__FUNC_URTS1>;
output-enable;
};
- pins_cts {
+ pins-cts {
pinmux = <PINMUX_GPIO46__FUNC_UCTS1>;
input-enable;
};
};
uart1_pins_sleep: uart1-pins-sleep {
- pins_rx {
+ pins-rx {
pinmux = <PINMUX_GPIO121__FUNC_GPIO121>;
input-enable;
bias-pull-up;
};
- pins_tx {
+ pins-tx {
pinmux = <PINMUX_GPIO115__FUNC_UTXD1>;
};
- pins_rts {
+ pins-rts {
pinmux = <PINMUX_GPIO47__FUNC_URTS1>;
output-enable;
};
- pins_cts {
+ pins-cts {
pinmux = <PINMUX_GPIO46__FUNC_UCTS1>;
input-enable;
};
};
wifi_pins_pwrseq: wifi-pins-pwrseq {
- pins_wifi_enable {
+ pins-wifi-enable {
pinmux = <PINMUX_GPIO119__FUNC_GPIO119>;
output-low;
};
};
wifi_pins_wakeup: wifi-pins-wakeup {
- pins_wifi_wakeup {
+ pins-wifi-wakeup {
pinmux = <PINMUX_GPIO113__FUNC_GPIO113>;
input-enable;
};
nvmem-cell-names = "calibration-data";
};
- thermal_zones: thermal-zones {
- cpu_thermal: cpu-thermal {
- polling-delay-passive = <100>;
- polling-delay = <500>;
- thermal-sensors = <&thermal 0>;
- sustainable-power = <5000>;
-
- trips {
- threshold: trip-point0 {
- temperature = <68000>;
- hysteresis = <2000>;
- type = "passive";
- };
-
- target: trip-point1 {
- temperature = <80000>;
- hysteresis = <2000>;
- type = "passive";
- };
-
- cpu_crit: cpu-crit {
- temperature = <115000>;
- hysteresis = <2000>;
- type = "critical";
- };
- };
-
- cooling-maps {
- map0 {
- trip = <&target>;
- cooling-device = <&cpu0
- THERMAL_NO_LIMIT
- THERMAL_NO_LIMIT>,
- <&cpu1
- THERMAL_NO_LIMIT
- THERMAL_NO_LIMIT>,
- <&cpu2
- THERMAL_NO_LIMIT
- THERMAL_NO_LIMIT>,
- <&cpu3
- THERMAL_NO_LIMIT
- THERMAL_NO_LIMIT>;
- contribution = <3072>;
- };
- map1 {
- trip = <&target>;
- cooling-device = <&cpu4
- THERMAL_NO_LIMIT
- THERMAL_NO_LIMIT>,
- <&cpu5
- THERMAL_NO_LIMIT
- THERMAL_NO_LIMIT>,
- <&cpu6
- THERMAL_NO_LIMIT
- THERMAL_NO_LIMIT>,
- <&cpu7
- THERMAL_NO_LIMIT
- THERMAL_NO_LIMIT>;
- contribution = <1024>;
- };
- };
- };
-
- /* The tzts1 ~ tzts6 don't need to polling */
- /* The tzts1 ~ tzts6 don't need to thermal throttle */
-
- tzts1: tzts1 {
- polling-delay-passive = <0>;
- polling-delay = <0>;
- thermal-sensors = <&thermal 1>;
- sustainable-power = <5000>;
- trips {};
- cooling-maps {};
- };
-
- tzts2: tzts2 {
- polling-delay-passive = <0>;
- polling-delay = <0>;
- thermal-sensors = <&thermal 2>;
- sustainable-power = <5000>;
- trips {};
- cooling-maps {};
- };
-
- tzts3: tzts3 {
- polling-delay-passive = <0>;
- polling-delay = <0>;
- thermal-sensors = <&thermal 3>;
- sustainable-power = <5000>;
- trips {};
- cooling-maps {};
- };
-
- tzts4: tzts4 {
- polling-delay-passive = <0>;
- polling-delay = <0>;
- thermal-sensors = <&thermal 4>;
- sustainable-power = <5000>;
- trips {};
- cooling-maps {};
- };
-
- tzts5: tzts5 {
- polling-delay-passive = <0>;
- polling-delay = <0>;
- thermal-sensors = <&thermal 5>;
- sustainable-power = <5000>;
- trips {};
- cooling-maps {};
- };
-
- tztsABB: tztsABB {
- polling-delay-passive = <0>;
- polling-delay = <0>;
- thermal-sensors = <&thermal 6>;
- sustainable-power = <5000>;
- trips {};
- cooling-maps {};
- };
- };
-
pwm0: pwm@1100e000 {
compatible = "mediatek,mt8183-disp-pwm";
reg = <0 0x1100e000 0 0x1000>;
power-domains = <&spm MT8183_POWER_DOMAIN_CAM>;
};
};
+
+ thermal_zones: thermal-zones {
+ cpu_thermal: cpu-thermal {
+ polling-delay-passive = <100>;
+ polling-delay = <500>;
+ thermal-sensors = <&thermal 0>;
+ sustainable-power = <5000>;
+
+ trips {
+ threshold: trip-point0 {
+ temperature = <68000>;
+ hysteresis = <2000>;
+ type = "passive";
+ };
+
+ target: trip-point1 {
+ temperature = <80000>;
+ hysteresis = <2000>;
+ type = "passive";
+ };
+
+ cpu_crit: cpu-crit {
+ temperature = <115000>;
+ hysteresis = <2000>;
+ type = "critical";
+ };
+ };
+
+ cooling-maps {
+ map0 {
+ trip = <&target>;
+ cooling-device = <&cpu0
+ THERMAL_NO_LIMIT
+ THERMAL_NO_LIMIT>,
+ <&cpu1
+ THERMAL_NO_LIMIT
+ THERMAL_NO_LIMIT>,
+ <&cpu2
+ THERMAL_NO_LIMIT
+ THERMAL_NO_LIMIT>,
+ <&cpu3
+ THERMAL_NO_LIMIT
+ THERMAL_NO_LIMIT>;
+ contribution = <3072>;
+ };
+ map1 {
+ trip = <&target>;
+ cooling-device = <&cpu4
+ THERMAL_NO_LIMIT
+ THERMAL_NO_LIMIT>,
+ <&cpu5
+ THERMAL_NO_LIMIT
+ THERMAL_NO_LIMIT>,
+ <&cpu6
+ THERMAL_NO_LIMIT
+ THERMAL_NO_LIMIT>,
+ <&cpu7
+ THERMAL_NO_LIMIT
+ THERMAL_NO_LIMIT>;
+ contribution = <1024>;
+ };
+ };
+ };
+
+ /* The tzts1 ~ tzts6 don't need to polling */
+ /* The tzts1 ~ tzts6 don't need to thermal throttle */
+
+ tzts1: tzts1 {
+ polling-delay-passive = <0>;
+ polling-delay = <0>;
+ thermal-sensors = <&thermal 1>;
+ sustainable-power = <5000>;
+ trips {};
+ cooling-maps {};
+ };
+
+ tzts2: tzts2 {
+ polling-delay-passive = <0>;
+ polling-delay = <0>;
+ thermal-sensors = <&thermal 2>;
+ sustainable-power = <5000>;
+ trips {};
+ cooling-maps {};
+ };
+
+ tzts3: tzts3 {
+ polling-delay-passive = <0>;
+ polling-delay = <0>;
+ thermal-sensors = <&thermal 3>;
+ sustainable-power = <5000>;
+ trips {};
+ cooling-maps {};
+ };
+
+ tzts4: tzts4 {
+ polling-delay-passive = <0>;
+ polling-delay = <0>;
+ thermal-sensors = <&thermal 4>;
+ sustainable-power = <5000>;
+ trips {};
+ cooling-maps {};
+ };
+
+ tzts5: tzts5 {
+ polling-delay-passive = <0>;
+ polling-delay = <0>;
+ thermal-sensors = <&thermal 5>;
+ sustainable-power = <5000>;
+ trips {};
+ cooling-maps {};
+ };
+
+ tztsABB: tztsABB {
+ polling-delay-passive = <0>;
+ polling-delay = <0>;
+ thermal-sensors = <&thermal 6>;
+ sustainable-power = <5000>;
+ trips {};
+ cooling-maps {};
+ };
+ };
};
reg = <MT8186_POWER_DOMAIN_CSIRX_TOP>;
clocks = <&topckgen CLK_TOP_SENINF>,
<&topckgen CLK_TOP_SENINF1>;
- clock-names = "csirx_top0", "csirx_top1";
+ clock-names = "subsys-csirx-top0",
+ "subsys-csirx-top1";
#power-domain-cells = <0>;
};
reg = <MT8186_POWER_DOMAIN_ADSP_AO>;
clocks = <&topckgen CLK_TOP_AUDIODSP>,
<&topckgen CLK_TOP_ADSP_BUS>;
- clock-names = "audioadsp", "adsp_bus";
+ clock-names = "audioadsp",
+ "subsys-adsp-bus";
#address-cells = <1>;
#size-cells = <0>;
#power-domain-cells = <1>;
<&mmsys CLK_MM_SMI_COMMON>,
<&mmsys CLK_MM_SMI_GALS>,
<&mmsys CLK_MM_SMI_IOMMU>;
- clock-names = "disp", "mdp", "smi_infra", "smi_common",
- "smi_gals", "smi_iommu";
+ clock-names = "disp", "mdp",
+ "subsys-smi-infra",
+ "subsys-smi-common",
+ "subsys-smi-gals",
+ "subsys-smi-iommu";
mediatek,infracfg = <&infracfg_ao>;
#address-cells = <1>;
#size-cells = <0>;
power-domain@MT8186_POWER_DOMAIN_CAM {
reg = <MT8186_POWER_DOMAIN_CAM>;
- clocks = <&topckgen CLK_TOP_CAM>,
- <&topckgen CLK_TOP_SENINF>,
+ clocks = <&topckgen CLK_TOP_SENINF>,
<&topckgen CLK_TOP_SENINF1>,
<&topckgen CLK_TOP_SENINF2>,
<&topckgen CLK_TOP_SENINF3>,
+ <&camsys CLK_CAM2MM_GALS>,
<&topckgen CLK_TOP_CAMTM>,
- <&camsys CLK_CAM2MM_GALS>;
- clock-names = "cam-top", "cam0", "cam1", "cam2",
- "cam3", "cam-tm", "gals";
+ <&topckgen CLK_TOP_CAM>;
+ clock-names = "cam0", "cam1", "cam2",
+ "cam3", "gals",
+ "subsys-cam-tm",
+ "subsys-cam-top";
mediatek,infracfg = <&infracfg_ao>;
#address-cells = <1>;
#size-cells = <0>;
power-domain@MT8186_POWER_DOMAIN_IMG {
reg = <MT8186_POWER_DOMAIN_IMG>;
- clocks = <&topckgen CLK_TOP_IMG1>,
- <&imgsys1 CLK_IMG1_GALS_IMG1>;
- clock-names = "img-top", "gals";
+ clocks = <&imgsys1 CLK_IMG1_GALS_IMG1>,
+ <&topckgen CLK_TOP_IMG1>;
+ clock-names = "gals", "subsys-img-top";
mediatek,infracfg = <&infracfg_ao>;
#address-cells = <1>;
#size-cells = <0>;
<&ipesys CLK_IPE_LARB20>,
<&ipesys CLK_IPE_SMI_SUBCOM>,
<&ipesys CLK_IPE_GALS_IPE>;
- clock-names = "ipe-top", "ipe-larb0", "ipe-larb1",
- "ipe-smi", "ipe-gals";
+ clock-names = "subsys-ipe-top",
+ "subsys-ipe-larb0",
+ "subsys-ipe-larb1",
+ "subsys-ipe-smi",
+ "subsys-ipe-gals";
mediatek,infracfg = <&infracfg_ao>;
#power-domain-cells = <0>;
};
clocks = <&topckgen CLK_TOP_WPE>,
<&wpesys CLK_WPE_SMI_LARB8_CK_EN>,
<&wpesys CLK_WPE_SMI_LARB8_PCLK_EN>;
- clock-names = "wpe0", "larb-ck", "larb-pclk";
+ clock-names = "wpe0",
+ "subsys-larb-ck",
+ "subsys-larb-pclk";
mediatek,infracfg = <&infracfg_ao>;
#power-domain-cells = <0>;
};
#address-cells = <1>;
#size-cells = <1>;
- gpu_speedbin: gpu-speed-bin@59c {
+ gpu_speedbin: gpu-speedbin@59c {
reg = <0x59c 0x4>;
bits = <0 3>;
};
pinctrl-0 = <&i2c7_pins>;
pmic@34 {
- #interrupt-cells = <1>;
+ #interrupt-cells = <2>;
compatible = "mediatek,mt6360";
reg = <0x34>;
interrupt-controller;
power-domain@MT8195_POWER_DOMAIN_VENC_CORE1 {
reg = <MT8195_POWER_DOMAIN_VENC_CORE1>;
+ clocks = <&vencsys_core1 CLK_VENC_CORE1_LARB>;
+ clock-names = "venc1-larb";
mediatek,infracfg = <&infracfg_ao>;
#power-domain-cells = <0>;
};
power-domain@MT8195_POWER_DOMAIN_VENC {
reg = <MT8195_POWER_DOMAIN_VENC>;
+ clocks = <&vencsys CLK_VENC_LARB>;
+ clock-names = "venc0-larb";
mediatek,infracfg = <&infracfg_ao>;
#power-domain-cells = <0>;
};
reg = <0 0x1b010000 0 0x1000>;
mediatek,larb-id = <20>;
mediatek,smi = <&smi_common_vpp>;
- clocks = <&vencsys_core1 CLK_VENC_CORE1_LARB>,
+ clocks = <&vencsys_core1 CLK_VENC_CORE1_VENC>,
<&vencsys_core1 CLK_VENC_CORE1_GALS>,
<&vppsys0 CLK_VPP0_GALS_VDO0_VDO1_VENCSYS_CORE1>;
clock-names = "apb", "smi", "gals";
mt6360: pmic@34 {
compatible = "mediatek,mt6360";
reg = <0x34>;
+ interrupt-parent = <&pio>;
interrupts = <128 IRQ_TYPE_EDGE_FALLING>;
interrupt-names = "IRQB";
interrupt-controller;
sgtl5000_clk: sgtl5000-oscillator {
compatible = "fixed-clock";
#clock-cells = <0>;
- clock-frequency = <24576000>;
+ clock-frequency = <24576000>;
};
dc_12v: dc-12v-regulator {
vdec: video-codec@ff360000 {
compatible = "rockchip,rk3328-vdec", "rockchip,rk3399-vdec";
- reg = <0x0 0xff360000 0x0 0x400>;
+ reg = <0x0 0xff360000 0x0 0x480>;
interrupts = <GIC_SPI 7 IRQ_TYPE_LEVEL_HIGH>;
clocks = <&cru ACLK_RKVDEC>, <&cru HCLK_RKVDEC>,
<&cru SCLK_VDEC_CABAC>, <&cru SCLK_VDEC_CORE>;
&pci_rootport {
mvl_wifi: wifi@0,0 {
compatible = "pci1b4b,2b42";
- reg = <0x83010000 0x0 0x00000000 0x0 0x00100000
- 0x83010000 0x0 0x00100000 0x0 0x00100000>;
+ reg = <0x0000 0x0 0x0 0x0 0x0>;
interrupt-parent = <&gpio0>;
interrupts = <8 IRQ_TYPE_LEVEL_LOW>;
pinctrl-names = "default";
&pci_rootport {
wifi@0,0 {
compatible = "qcom,ath10k";
- reg = <0x00010000 0x0 0x00000000 0x0 0x00000000>,
- <0x03010010 0x0 0x00000000 0x0 0x00200000>;
+ reg = <0x00000000 0x0 0x00000000 0x0 0x00000000>,
+ <0x03000010 0x0 0x00000000 0x0 0x00200000>;
qcom,ath10k-calibration-variant = "GO_DUMO";
};
};
#address-cells = <3>;
#size-cells = <2>;
ranges;
+ device_type = "pci";
};
};
power-domain@RK3399_PD_VDU {
reg = <RK3399_PD_VDU>;
clocks = <&cru ACLK_VDU>,
- <&cru HCLK_VDU>;
+ <&cru HCLK_VDU>,
+ <&cru SCLK_VDU_CA>,
+ <&cru SCLK_VDU_CORE>;
pm_qos = <&qos_video_m1_r>,
<&qos_video_m1_w>;
#power-domain-cells = <0>;
vdec: video-codec@ff660000 {
compatible = "rockchip,rk3399-vdec";
- reg = <0x0 0xff660000 0x0 0x400>;
+ reg = <0x0 0xff660000 0x0 0x480>;
interrupts = <GIC_SPI 116 IRQ_TYPE_LEVEL_HIGH 0>;
clocks = <&cru ACLK_VDU>, <&cru HCLK_VDU>,
<&cru SCLK_VDU_CA>, <&cru SCLK_VDU_CORE>;
<GIC_SPI 73 IRQ_TYPE_LEVEL_HIGH>,
<GIC_SPI 72 IRQ_TYPE_LEVEL_HIGH>,
<GIC_SPI 71 IRQ_TYPE_LEVEL_HIGH>;
- interrupt-names = "sys", "pmc", "msi", "legacy", "err";
+ interrupt-names = "sys", "pmc", "msg", "legacy", "err";
bus-range = <0x0 0xf>;
clocks = <&cru ACLK_PCIE20_MST>, <&cru ACLK_PCIE20_SLV>,
<&cru ACLK_PCIE20_DBI>, <&cru PCLK_PCIE20>,
&pinctrl {
fan {
fan_int: fan-int {
- rockchip,pins = <0 RK_PA4 RK_FUNC_GPIO &pcfg_pull_none>;
+ rockchip,pins = <0 RK_PA4 RK_FUNC_GPIO &pcfg_pull_up>;
};
};
hym8563 {
hym8563_int: hym8563-int {
- rockchip,pins = <0 RK_PB0 RK_FUNC_GPIO &pcfg_pull_none>;
+ rockchip,pins = <0 RK_PB0 RK_FUNC_GPIO &pcfg_pull_up>;
};
};
leds {
compatible = "gpio-leds";
pinctrl-names = "default";
- pinctrl-0 =<&leds_gpio>;
+ pinctrl-0 = <&leds_gpio>;
led-1 {
gpios = <&gpio1 RK_PA2 GPIO_ACTIVE_HIGH>;
emmc_data_strobe: emmc-data-strobe {
rockchip,pins =
/* emmc_data_strobe */
- <2 RK_PA2 1 &pcfg_pull_none>;
+ <2 RK_PA2 1 &pcfg_pull_down>;
};
};
<GIC_SPI 38 IRQ_TYPE_LEVEL_HIGH 0>,
<GIC_SPI 48 IRQ_TYPE_LEVEL_HIGH 0>,
<GIC_SPI 58 IRQ_TYPE_LEVEL_HIGH 0>;
- interrupt-names = "ch0", "ch1", "ch2", "ch3";
rockchip,pmu = <&pmu1grf>;
};
return alternative_has_cap_unlikely(ARM64_HAS_TLB_RANGE);
}
+static inline bool system_supports_lpa2(void)
+{
+ return cpus_have_final_cap(ARM64_HAS_LPA2);
+}
+
int do_emulate_mrs(struct pt_regs *regs, u32 sys_reg, u32 rt);
bool try_emulate_mrs(struct pt_regs *regs, u32 isn);
return ec == ESR_ELx_EC_DABT_LOW || ec == ESR_ELx_EC_DABT_CUR;
}
+static inline bool esr_fsc_is_translation_fault(unsigned long esr)
+{
+ return (esr & ESR_ELx_FSC_TYPE) == ESR_ELx_FSC_FAULT;
+}
+
+static inline bool esr_fsc_is_permission_fault(unsigned long esr)
+{
+ return (esr & ESR_ELx_FSC_TYPE) == ESR_ELx_FSC_PERM;
+}
+
+static inline bool esr_fsc_is_access_flag_fault(unsigned long esr)
+{
+ return (esr & ESR_ELx_FSC_TYPE) == ESR_ELx_FSC_ACCESS;
+}
+
const char *esr_get_class_string(unsigned long esr);
#endif /* __ASSEMBLY */
#define HCRX_HOST_FLAGS (HCRX_EL2_MSCEn | HCRX_EL2_TCR2En)
/* TCR_EL2 Registers bits */
+#define TCR_EL2_DS (1UL << 32)
#define TCR_EL2_RES1 ((1U << 31) | (1 << 23))
#define TCR_EL2_TBI (1 << 20)
#define TCR_EL2_PS_SHIFT 16
TCR_EL2_ORGN0_MASK | TCR_EL2_IRGN0_MASK | TCR_EL2_T0SZ_MASK)
/* VTCR_EL2 Registers bits */
+#define VTCR_EL2_DS TCR_EL2_DS
#define VTCR_EL2_RES1 (1U << 31)
#define VTCR_EL2_HD (1 << 22)
#define VTCR_EL2_HA (1 << 21)
* Once we get to a point where the two describe the same thing, we'll
* merge the definitions. One day.
*/
-#define __HFGRTR_EL2_RES0 (GENMASK(63, 56) | GENMASK(53, 51))
+#define __HFGRTR_EL2_RES0 HFGxTR_EL2_RES0
#define __HFGRTR_EL2_MASK GENMASK(49, 0)
-#define __HFGRTR_EL2_nMASK (GENMASK(58, 57) | GENMASK(55, 54) | BIT(50))
+#define __HFGRTR_EL2_nMASK ~(__HFGRTR_EL2_RES0 | __HFGRTR_EL2_MASK)
-#define __HFGWTR_EL2_RES0 (GENMASK(63, 56) | GENMASK(53, 51) | \
- BIT(46) | BIT(42) | BIT(40) | BIT(28) | \
- GENMASK(26, 25) | BIT(21) | BIT(18) | \
+/*
+ * The HFGWTR bits are a subset of HFGRTR bits. To ensure we don't miss any
+ * future additions, define __HFGWTR* macros relative to __HFGRTR* ones.
+ */
+#define __HFGRTR_ONLY_MASK (BIT(46) | BIT(42) | BIT(40) | BIT(28) | \
+ GENMASK(26, 25) | BIT(21) | BIT(18) | \
GENMASK(15, 14) | GENMASK(10, 9) | BIT(2))
-#define __HFGWTR_EL2_MASK GENMASK(49, 0)
-#define __HFGWTR_EL2_nMASK (GENMASK(58, 57) | GENMASK(55, 54) | BIT(50))
-
-#define __HFGITR_EL2_RES0 GENMASK(63, 57)
-#define __HFGITR_EL2_MASK GENMASK(54, 0)
-#define __HFGITR_EL2_nMASK GENMASK(56, 55)
-
-#define __HDFGRTR_EL2_RES0 (BIT(49) | BIT(42) | GENMASK(39, 38) | \
- GENMASK(21, 20) | BIT(8))
-#define __HDFGRTR_EL2_MASK ~__HDFGRTR_EL2_nMASK
-#define __HDFGRTR_EL2_nMASK GENMASK(62, 59)
-
-#define __HDFGWTR_EL2_RES0 (BIT(63) | GENMASK(59, 58) | BIT(51) | BIT(47) | \
- BIT(43) | GENMASK(40, 38) | BIT(34) | BIT(30) | \
- BIT(22) | BIT(9) | BIT(6))
-#define __HDFGWTR_EL2_MASK ~__HDFGWTR_EL2_nMASK
-#define __HDFGWTR_EL2_nMASK GENMASK(62, 60)
+#define __HFGWTR_EL2_RES0 (__HFGRTR_EL2_RES0 | __HFGRTR_ONLY_MASK)
+#define __HFGWTR_EL2_MASK (__HFGRTR_EL2_MASK & ~__HFGRTR_ONLY_MASK)
+#define __HFGWTR_EL2_nMASK ~(__HFGWTR_EL2_RES0 | __HFGWTR_EL2_MASK)
+
+#define __HFGITR_EL2_RES0 HFGITR_EL2_RES0
+#define __HFGITR_EL2_MASK (BIT(62) | BIT(60) | GENMASK(54, 0))
+#define __HFGITR_EL2_nMASK ~(__HFGITR_EL2_RES0 | __HFGITR_EL2_MASK)
+
+#define __HDFGRTR_EL2_RES0 HDFGRTR_EL2_RES0
+#define __HDFGRTR_EL2_MASK (BIT(63) | GENMASK(58, 50) | GENMASK(48, 43) | \
+ GENMASK(41, 40) | GENMASK(37, 22) | \
+ GENMASK(19, 9) | GENMASK(7, 0))
+#define __HDFGRTR_EL2_nMASK ~(__HDFGRTR_EL2_RES0 | __HDFGRTR_EL2_MASK)
+
+#define __HDFGWTR_EL2_RES0 HDFGWTR_EL2_RES0
+#define __HDFGWTR_EL2_MASK (GENMASK(57, 52) | GENMASK(50, 48) | \
+ GENMASK(46, 44) | GENMASK(42, 41) | \
+ GENMASK(37, 35) | GENMASK(33, 31) | \
+ GENMASK(29, 23) | GENMASK(21, 10) | \
+ GENMASK(8, 7) | GENMASK(5, 0))
+#define __HDFGWTR_EL2_nMASK ~(__HDFGWTR_EL2_RES0 | __HDFGWTR_EL2_MASK)
+
+#define __HAFGRTR_EL2_RES0 HAFGRTR_EL2_RES0
+#define __HAFGRTR_EL2_MASK (GENMASK(49, 17) | GENMASK(4, 0))
+#define __HAFGRTR_EL2_nMASK ~(__HAFGRTR_EL2_RES0 | __HAFGRTR_EL2_MASK)
/* Similar definitions for HCRX_EL2 */
-#define __HCRX_EL2_RES0 (GENMASK(63, 16) | GENMASK(13, 12))
-#define __HCRX_EL2_MASK (0)
-#define __HCRX_EL2_nMASK (GENMASK(15, 14) | GENMASK(4, 0))
+#define __HCRX_EL2_RES0 HCRX_EL2_RES0
+#define __HCRX_EL2_MASK (BIT(6))
+#define __HCRX_EL2_nMASK ~(__HCRX_EL2_RES0 | __HCRX_EL2_MASK)
/* Hyp Prefetch Fault Address Register (HPFAR/HDFAR) */
#define HPFAR_MASK (~UL(0xf))
#include <asm/esr.h>
#include <asm/kvm_arm.h>
#include <asm/kvm_hyp.h>
+#include <asm/kvm_nested.h>
#include <asm/ptrace.h>
#include <asm/cputype.h>
#include <asm/virt.h>
int kvm_inject_nested_sync(struct kvm_vcpu *vcpu, u64 esr_el2);
int kvm_inject_nested_irq(struct kvm_vcpu *vcpu);
-static inline bool vcpu_has_feature(const struct kvm_vcpu *vcpu, int feature)
-{
- return test_bit(feature, vcpu->kvm->arch.vcpu_features);
-}
-
#if defined(__KVM_VHE_HYPERVISOR__) || defined(__KVM_NVHE_HYPERVISOR__)
static __always_inline bool vcpu_el1_is_32bit(struct kvm_vcpu *vcpu)
{
static inline bool is_hyp_ctxt(const struct kvm_vcpu *vcpu)
{
- return __is_hyp_ctxt(&vcpu->arch.ctxt);
+ return vcpu_has_nv(vcpu) && __is_hyp_ctxt(&vcpu->arch.ctxt);
}
/*
return kvm_vcpu_get_esr(vcpu) & ESR_ELx_FSC;
}
-static __always_inline u8 kvm_vcpu_trap_get_fault_type(const struct kvm_vcpu *vcpu)
+static inline
+bool kvm_vcpu_trap_is_permission_fault(const struct kvm_vcpu *vcpu)
{
- return kvm_vcpu_get_esr(vcpu) & ESR_ELx_FSC_TYPE;
+ return esr_fsc_is_permission_fault(kvm_vcpu_get_esr(vcpu));
}
-static __always_inline u8 kvm_vcpu_trap_get_fault_level(const struct kvm_vcpu *vcpu)
+static inline
+bool kvm_vcpu_trap_is_translation_fault(const struct kvm_vcpu *vcpu)
{
- return kvm_vcpu_get_esr(vcpu) & ESR_ELx_FSC_LEVEL;
+ return esr_fsc_is_translation_fault(kvm_vcpu_get_esr(vcpu));
+}
+
+static inline
+u64 kvm_vcpu_trap_get_perm_fault_granule(const struct kvm_vcpu *vcpu)
+{
+ unsigned long esr = kvm_vcpu_get_esr(vcpu);
+
+ BUG_ON(!esr_fsc_is_permission_fault(esr));
+ return BIT(ARM64_HW_PGTABLE_LEVEL_SHIFT(esr & ESR_ELx_FSC_LEVEL));
}
static __always_inline bool kvm_vcpu_abt_issea(const struct kvm_vcpu *vcpu)
* first), then a permission fault to allow the flags
* to be set.
*/
- switch (kvm_vcpu_trap_get_fault_type(vcpu)) {
- case ESR_ELx_FSC_PERM:
- return true;
- default:
- return false;
- }
+ return kvm_vcpu_trap_is_permission_fault(vcpu);
}
if (kvm_vcpu_trap_is_iabt(vcpu))
#include <asm/fpsimd.h>
#include <asm/kvm.h>
#include <asm/kvm_asm.h>
+#include <asm/vncr_mapping.h>
#define __KVM_HAVE_ARCH_INTC_INITIALIZED
* Atomic access to multiple idregs are guarded by kvm_arch.config_lock.
*/
#define IDREG_IDX(id) (((sys_reg_CRm(id) - 1) << 3) | sys_reg_Op2(id))
+#define IDX_IDREG(idx) sys_reg(3, 0, 0, ((idx) >> 3) + 1, (idx) & Op2_mask)
#define IDREG(kvm, id) ((kvm)->arch.id_regs[IDREG_IDX(id)])
#define KVM_ARM_ID_REG_NUM (IDREG_IDX(sys_reg(3, 0, 0, 7, 7)) + 1)
u64 id_regs[KVM_ARM_ID_REG_NUM];
u64 disr_el1; /* Deferred [SError] Status Register */
};
+/*
+ * VNCR() just places the VNCR_capable registers in the enum after
+ * __VNCR_START__, and the value (after correction) to be an 8-byte offset
+ * from the VNCR base. As we don't require the enum to be otherwise ordered,
+ * we need the terrible hack below to ensure that we correctly size the
+ * sys_regs array, no matter what.
+ *
+ * The __MAX__ macro has been lifted from Sean Eron Anderson's wonderful
+ * treasure trove of bit hacks:
+ * https://graphics.stanford.edu/~seander/bithacks.html#IntegerMinOrMax
+ */
+#define __MAX__(x,y) ((x) ^ (((x) ^ (y)) & -((x) < (y))))
+#define VNCR(r) \
+ __before_##r, \
+ r = __VNCR_START__ + ((VNCR_ ## r) / 8), \
+ __after_##r = __MAX__(__before_##r - 1, r)
+
enum vcpu_sysreg {
__INVALID_SYSREG__, /* 0 is reserved as an invalid value */
MPIDR_EL1, /* MultiProcessor Affinity Register */
CLIDR_EL1, /* Cache Level ID Register */
CSSELR_EL1, /* Cache Size Selection Register */
- SCTLR_EL1, /* System Control Register */
- ACTLR_EL1, /* Auxiliary Control Register */
- CPACR_EL1, /* Coprocessor Access Control */
- ZCR_EL1, /* SVE Control */
- TTBR0_EL1, /* Translation Table Base Register 0 */
- TTBR1_EL1, /* Translation Table Base Register 1 */
- TCR_EL1, /* Translation Control Register */
- TCR2_EL1, /* Extended Translation Control Register */
- ESR_EL1, /* Exception Syndrome Register */
- AFSR0_EL1, /* Auxiliary Fault Status Register 0 */
- AFSR1_EL1, /* Auxiliary Fault Status Register 1 */
- FAR_EL1, /* Fault Address Register */
- MAIR_EL1, /* Memory Attribute Indirection Register */
- VBAR_EL1, /* Vector Base Address Register */
- CONTEXTIDR_EL1, /* Context ID Register */
TPIDR_EL0, /* Thread ID, User R/W */
TPIDRRO_EL0, /* Thread ID, User R/O */
TPIDR_EL1, /* Thread ID, Privileged */
- AMAIR_EL1, /* Aux Memory Attribute Indirection Register */
CNTKCTL_EL1, /* Timer Control Register (EL1) */
PAR_EL1, /* Physical Address Register */
- MDSCR_EL1, /* Monitor Debug System Control Register */
MDCCINT_EL1, /* Monitor Debug Comms Channel Interrupt Enable Reg */
OSLSR_EL1, /* OS Lock Status Register */
DISR_EL1, /* Deferred Interrupt Status Register */
APGAKEYLO_EL1,
APGAKEYHI_EL1,
- ELR_EL1,
- SP_EL1,
- SPSR_EL1,
-
- CNTVOFF_EL2,
- CNTV_CVAL_EL0,
- CNTV_CTL_EL0,
- CNTP_CVAL_EL0,
- CNTP_CTL_EL0,
-
/* Memory Tagging Extension registers */
RGSR_EL1, /* Random Allocation Tag Seed Register */
GCR_EL1, /* Tag Control Register */
- TFSR_EL1, /* Tag Fault Status Register (EL1) */
TFSRE0_EL1, /* Tag Fault Status Register (EL0) */
- /* Permission Indirection Extension registers */
- PIR_EL1, /* Permission Indirection Register 1 (EL1) */
- PIRE0_EL1, /* Permission Indirection Register 0 (EL1) */
-
/* 32bit specific registers. */
DACR32_EL2, /* Domain Access Control Register */
IFSR32_EL2, /* Instruction Fault Status Register */
DBGVCR32_EL2, /* Debug Vector Catch Register */
/* EL2 registers */
- VPIDR_EL2, /* Virtualization Processor ID Register */
- VMPIDR_EL2, /* Virtualization Multiprocessor ID Register */
SCTLR_EL2, /* System Control Register (EL2) */
ACTLR_EL2, /* Auxiliary Control Register (EL2) */
- HCR_EL2, /* Hypervisor Configuration Register */
MDCR_EL2, /* Monitor Debug Configuration Register (EL2) */
CPTR_EL2, /* Architectural Feature Trap Register (EL2) */
- HSTR_EL2, /* Hypervisor System Trap Register */
HACR_EL2, /* Hypervisor Auxiliary Control Register */
- HCRX_EL2, /* Extended Hypervisor Configuration Register */
TTBR0_EL2, /* Translation Table Base Register 0 (EL2) */
TTBR1_EL2, /* Translation Table Base Register 1 (EL2) */
TCR_EL2, /* Translation Control Register (EL2) */
- VTTBR_EL2, /* Virtualization Translation Table Base Register */
- VTCR_EL2, /* Virtualization Translation Control Register */
SPSR_EL2, /* EL2 saved program status register */
ELR_EL2, /* EL2 exception link register */
AFSR0_EL2, /* Auxiliary Fault Status Register 0 (EL2) */
VBAR_EL2, /* Vector Base Address Register (EL2) */
RVBAR_EL2, /* Reset Vector Base Address Register */
CONTEXTIDR_EL2, /* Context ID Register (EL2) */
- TPIDR_EL2, /* EL2 Software Thread ID Register */
CNTHCTL_EL2, /* Counter-timer Hypervisor Control register */
SP_EL2, /* EL2 Stack Pointer */
- HFGRTR_EL2,
- HFGWTR_EL2,
- HFGITR_EL2,
- HDFGRTR_EL2,
- HDFGWTR_EL2,
CNTHP_CTL_EL2,
CNTHP_CVAL_EL2,
CNTHV_CTL_EL2,
CNTHV_CVAL_EL2,
+ __VNCR_START__, /* Any VNCR-capable reg goes after this point */
+
+ VNCR(SCTLR_EL1),/* System Control Register */
+ VNCR(ACTLR_EL1),/* Auxiliary Control Register */
+ VNCR(CPACR_EL1),/* Coprocessor Access Control */
+ VNCR(ZCR_EL1), /* SVE Control */
+ VNCR(TTBR0_EL1),/* Translation Table Base Register 0 */
+ VNCR(TTBR1_EL1),/* Translation Table Base Register 1 */
+ VNCR(TCR_EL1), /* Translation Control Register */
+ VNCR(TCR2_EL1), /* Extended Translation Control Register */
+ VNCR(ESR_EL1), /* Exception Syndrome Register */
+ VNCR(AFSR0_EL1),/* Auxiliary Fault Status Register 0 */
+ VNCR(AFSR1_EL1),/* Auxiliary Fault Status Register 1 */
+ VNCR(FAR_EL1), /* Fault Address Register */
+ VNCR(MAIR_EL1), /* Memory Attribute Indirection Register */
+ VNCR(VBAR_EL1), /* Vector Base Address Register */
+ VNCR(CONTEXTIDR_EL1), /* Context ID Register */
+ VNCR(AMAIR_EL1),/* Aux Memory Attribute Indirection Register */
+ VNCR(MDSCR_EL1),/* Monitor Debug System Control Register */
+ VNCR(ELR_EL1),
+ VNCR(SP_EL1),
+ VNCR(SPSR_EL1),
+ VNCR(TFSR_EL1), /* Tag Fault Status Register (EL1) */
+ VNCR(VPIDR_EL2),/* Virtualization Processor ID Register */
+ VNCR(VMPIDR_EL2),/* Virtualization Multiprocessor ID Register */
+ VNCR(HCR_EL2), /* Hypervisor Configuration Register */
+ VNCR(HSTR_EL2), /* Hypervisor System Trap Register */
+ VNCR(VTTBR_EL2),/* Virtualization Translation Table Base Register */
+ VNCR(VTCR_EL2), /* Virtualization Translation Control Register */
+ VNCR(TPIDR_EL2),/* EL2 Software Thread ID Register */
+ VNCR(HCRX_EL2), /* Extended Hypervisor Configuration Register */
+
+ /* Permission Indirection Extension registers */
+ VNCR(PIR_EL1), /* Permission Indirection Register 1 (EL1) */
+ VNCR(PIRE0_EL1), /* Permission Indirection Register 0 (EL1) */
+
+ VNCR(HFGRTR_EL2),
+ VNCR(HFGWTR_EL2),
+ VNCR(HFGITR_EL2),
+ VNCR(HDFGRTR_EL2),
+ VNCR(HDFGWTR_EL2),
+ VNCR(HAFGRTR_EL2),
+
+ VNCR(CNTVOFF_EL2),
+ VNCR(CNTV_CVAL_EL0),
+ VNCR(CNTV_CTL_EL0),
+ VNCR(CNTP_CVAL_EL0),
+ VNCR(CNTP_CTL_EL0),
+
NR_SYS_REGS /* Nothing after this line! */
};
u64 sys_regs[NR_SYS_REGS];
struct kvm_vcpu *__hyp_running_vcpu;
+
+ /* This pointer has to be 4kB aligned. */
+ u64 *vncr_array;
};
struct kvm_host_data {
* accessed by a running VCPU. For example, for userspace access or
* for system registers that are never context switched, but only
* emulated.
+ *
+ * Don't bother with VNCR-based accesses in the nVHE code, it has no
+ * business dealing with NV.
*/
-#define __ctxt_sys_reg(c,r) (&(c)->sys_regs[(r)])
+static inline u64 *__ctxt_sys_reg(const struct kvm_cpu_context *ctxt, int r)
+{
+#if !defined (__KVM_NVHE_HYPERVISOR__)
+ if (unlikely(cpus_have_final_cap(ARM64_HAS_NESTED_VIRT) &&
+ r >= __VNCR_START__ && ctxt->vncr_array))
+ return &ctxt->vncr_array[r - __VNCR_START__];
+#endif
+ return (u64 *)&ctxt->sys_regs[r];
+}
#define ctxt_sys_reg(c,r) (*__ctxt_sys_reg(c,r))
case AMAIR_EL1: *val = read_sysreg_s(SYS_AMAIR_EL12); break;
case CNTKCTL_EL1: *val = read_sysreg_s(SYS_CNTKCTL_EL12); break;
case ELR_EL1: *val = read_sysreg_s(SYS_ELR_EL12); break;
+ case SPSR_EL1: *val = read_sysreg_s(SYS_SPSR_EL12); break;
case PAR_EL1: *val = read_sysreg_par(); break;
case DACR32_EL2: *val = read_sysreg_s(SYS_DACR32_EL2); break;
case IFSR32_EL2: *val = read_sysreg_s(SYS_IFSR32_EL2); break;
case AMAIR_EL1: write_sysreg_s(val, SYS_AMAIR_EL12); break;
case CNTKCTL_EL1: write_sysreg_s(val, SYS_CNTKCTL_EL12); break;
case ELR_EL1: write_sysreg_s(val, SYS_ELR_EL12); break;
+ case SPSR_EL1: write_sysreg_s(val, SYS_SPSR_EL12); break;
case PAR_EL1: write_sysreg_s(val, SYS_PAR_EL1); break;
case DACR32_EL2: write_sysreg_s(val, SYS_DACR32_EL2); break;
case IFSR32_EL2: write_sysreg_s(val, SYS_IFSR32_EL2); break;
#define kvm_vm_has_ran_once(kvm) \
(test_bit(KVM_ARCH_FLAG_HAS_RAN_ONCE, &(kvm)->arch.flags))
+static inline bool __vcpu_has_feature(const struct kvm_arch *ka, int feature)
+{
+ return test_bit(feature, ka->vcpu_features);
+}
+
+#define vcpu_has_feature(v, f) __vcpu_has_feature(&(v)->kvm->arch, (f))
+
int kvm_trng_call(struct kvm_vcpu *vcpu);
#ifdef CONFIG_KVM
extern phys_addr_t hyp_mem_base;
#ifndef __ARM64_KVM_NESTED_H
#define __ARM64_KVM_NESTED_H
-#include <asm/kvm_emulate.h>
+#include <linux/bitfield.h>
#include <linux/kvm_host.h>
+#include <asm/kvm_emulate.h>
static inline bool vcpu_has_nv(const struct kvm_vcpu *vcpu)
{
vcpu_has_feature(vcpu, KVM_ARM_VCPU_HAS_EL2));
}
-extern bool __check_nv_sr_forward(struct kvm_vcpu *vcpu);
+/* Translation helpers from non-VHE EL2 to EL1 */
+static inline u64 tcr_el2_ps_to_tcr_el1_ips(u64 tcr_el2)
+{
+ return (u64)FIELD_GET(TCR_EL2_PS_MASK, tcr_el2) << TCR_IPS_SHIFT;
+}
+
+static inline u64 translate_tcr_el2_to_tcr_el1(u64 tcr)
+{
+ return TCR_EPD1_MASK | /* disable TTBR1_EL1 */
+ ((tcr & TCR_EL2_TBI) ? TCR_TBI0 : 0) |
+ tcr_el2_ps_to_tcr_el1_ips(tcr) |
+ (tcr & TCR_EL2_TG0_MASK) |
+ (tcr & TCR_EL2_ORGN0_MASK) |
+ (tcr & TCR_EL2_IRGN0_MASK) |
+ (tcr & TCR_EL2_T0SZ_MASK);
+}
+
+static inline u64 translate_cptr_el2_to_cpacr_el1(u64 cptr_el2)
+{
+ u64 cpacr_el1 = 0;
+
+ if (cptr_el2 & CPTR_EL2_TTA)
+ cpacr_el1 |= CPACR_ELx_TTA;
+ if (!(cptr_el2 & CPTR_EL2_TFP))
+ cpacr_el1 |= CPACR_ELx_FPEN;
+ if (!(cptr_el2 & CPTR_EL2_TZ))
+ cpacr_el1 |= CPACR_ELx_ZEN;
-struct sys_reg_params;
-struct sys_reg_desc;
+ return cpacr_el1;
+}
+
+static inline u64 translate_sctlr_el2_to_sctlr_el1(u64 val)
+{
+ /* Only preserve the minimal set of bits we support */
+ val &= (SCTLR_ELx_M | SCTLR_ELx_A | SCTLR_ELx_C | SCTLR_ELx_SA |
+ SCTLR_ELx_I | SCTLR_ELx_IESB | SCTLR_ELx_WXN | SCTLR_ELx_EE);
+ val |= SCTLR_EL1_RES1;
+
+ return val;
+}
+
+static inline u64 translate_ttbr0_el2_to_ttbr0_el1(u64 ttbr0)
+{
+ /* Clear the ASID field */
+ return ttbr0 & ~GENMASK_ULL(63, 48);
+}
+
+extern bool __check_nv_sr_forward(struct kvm_vcpu *vcpu);
-void access_nested_id_reg(struct kvm_vcpu *v, struct sys_reg_params *p,
- const struct sys_reg_desc *r);
+int kvm_init_nv_sysregs(struct kvm *kvm);
#endif /* __ARM64_KVM_NESTED_H */
#include <linux/kvm_host.h>
#include <linux/types.h>
-#define KVM_PGTABLE_MAX_LEVELS 4U
+#define KVM_PGTABLE_FIRST_LEVEL -1
+#define KVM_PGTABLE_LAST_LEVEL 3
/*
* The largest supported block sizes for KVM (no 52-bit PA support):
* - 64K (level 2): 512MB
*/
#ifdef CONFIG_ARM64_4K_PAGES
-#define KVM_PGTABLE_MIN_BLOCK_LEVEL 1U
+#define KVM_PGTABLE_MIN_BLOCK_LEVEL 1
#else
-#define KVM_PGTABLE_MIN_BLOCK_LEVEL 2U
+#define KVM_PGTABLE_MIN_BLOCK_LEVEL 2
#endif
+#define kvm_lpa2_is_enabled() system_supports_lpa2()
+
+static inline u64 kvm_get_parange_max(void)
+{
+ if (kvm_lpa2_is_enabled() ||
+ (IS_ENABLED(CONFIG_ARM64_PA_BITS_52) && PAGE_SHIFT == 16))
+ return ID_AA64MMFR0_EL1_PARANGE_52;
+ else
+ return ID_AA64MMFR0_EL1_PARANGE_48;
+}
+
static inline u64 kvm_get_parange(u64 mmfr0)
{
+ u64 parange_max = kvm_get_parange_max();
u64 parange = cpuid_feature_extract_unsigned_field(mmfr0,
ID_AA64MMFR0_EL1_PARANGE_SHIFT);
- if (parange > ID_AA64MMFR0_EL1_PARANGE_MAX)
- parange = ID_AA64MMFR0_EL1_PARANGE_MAX;
+ if (parange > parange_max)
+ parange = parange_max;
return parange;
}
#define KVM_PTE_ADDR_MASK GENMASK(47, PAGE_SHIFT)
#define KVM_PTE_ADDR_51_48 GENMASK(15, 12)
+#define KVM_PTE_ADDR_MASK_LPA2 GENMASK(49, PAGE_SHIFT)
+#define KVM_PTE_ADDR_51_50_LPA2 GENMASK(9, 8)
#define KVM_PHYS_INVALID (-1ULL)
static inline u64 kvm_pte_to_phys(kvm_pte_t pte)
{
- u64 pa = pte & KVM_PTE_ADDR_MASK;
-
- if (PAGE_SHIFT == 16)
- pa |= FIELD_GET(KVM_PTE_ADDR_51_48, pte) << 48;
+ u64 pa;
+
+ if (kvm_lpa2_is_enabled()) {
+ pa = pte & KVM_PTE_ADDR_MASK_LPA2;
+ pa |= FIELD_GET(KVM_PTE_ADDR_51_50_LPA2, pte) << 50;
+ } else {
+ pa = pte & KVM_PTE_ADDR_MASK;
+ if (PAGE_SHIFT == 16)
+ pa |= FIELD_GET(KVM_PTE_ADDR_51_48, pte) << 48;
+ }
return pa;
}
static inline kvm_pte_t kvm_phys_to_pte(u64 pa)
{
- kvm_pte_t pte = pa & KVM_PTE_ADDR_MASK;
-
- if (PAGE_SHIFT == 16) {
- pa &= GENMASK(51, 48);
- pte |= FIELD_PREP(KVM_PTE_ADDR_51_48, pa >> 48);
+ kvm_pte_t pte;
+
+ if (kvm_lpa2_is_enabled()) {
+ pte = pa & KVM_PTE_ADDR_MASK_LPA2;
+ pa &= GENMASK(51, 50);
+ pte |= FIELD_PREP(KVM_PTE_ADDR_51_50_LPA2, pa >> 50);
+ } else {
+ pte = pa & KVM_PTE_ADDR_MASK;
+ if (PAGE_SHIFT == 16) {
+ pa &= GENMASK(51, 48);
+ pte |= FIELD_PREP(KVM_PTE_ADDR_51_48, pa >> 48);
+ }
}
return pte;
return __phys_to_pfn(kvm_pte_to_phys(pte));
}
-static inline u64 kvm_granule_shift(u32 level)
+static inline u64 kvm_granule_shift(s8 level)
{
- /* Assumes KVM_PGTABLE_MAX_LEVELS is 4 */
+ /* Assumes KVM_PGTABLE_LAST_LEVEL is 3 */
return ARM64_HW_PGTABLE_LEVEL_SHIFT(level);
}
-static inline u64 kvm_granule_size(u32 level)
+static inline u64 kvm_granule_size(s8 level)
{
return BIT(kvm_granule_shift(level));
}
-static inline bool kvm_level_supports_block_mapping(u32 level)
+static inline bool kvm_level_supports_block_mapping(s8 level)
{
return level >= KVM_PGTABLE_MIN_BLOCK_LEVEL;
}
static inline u32 kvm_supported_block_sizes(void)
{
- u32 level = KVM_PGTABLE_MIN_BLOCK_LEVEL;
+ s8 level = KVM_PGTABLE_MIN_BLOCK_LEVEL;
u32 r = 0;
- for (; level < KVM_PGTABLE_MAX_LEVELS; level++)
+ for (; level <= KVM_PGTABLE_LAST_LEVEL; level++)
r |= BIT(kvm_granule_shift(level));
return r;
void* (*zalloc_page)(void *arg);
void* (*zalloc_pages_exact)(size_t size);
void (*free_pages_exact)(void *addr, size_t size);
- void (*free_unlinked_table)(void *addr, u32 level);
+ void (*free_unlinked_table)(void *addr, s8 level);
void (*get_page)(void *addr);
void (*put_page)(void *addr);
int (*page_count)(void *addr);
u64 start;
u64 addr;
u64 end;
- u32 level;
+ s8 level;
enum kvm_pgtable_walk_flags flags;
};
*/
struct kvm_pgtable {
u32 ia_bits;
- u32 start_level;
+ s8 start_level;
kvm_pteref_t pgd;
struct kvm_pgtable_mm_ops *mm_ops;
* The page-table is assumed to be unreachable by any hardware walkers prior to
* freeing and therefore no TLB invalidation is performed.
*/
-void kvm_pgtable_stage2_free_unlinked(struct kvm_pgtable_mm_ops *mm_ops, void *pgtable, u32 level);
+void kvm_pgtable_stage2_free_unlinked(struct kvm_pgtable_mm_ops *mm_ops, void *pgtable, s8 level);
/**
* kvm_pgtable_stage2_create_unlinked() - Create an unlinked stage-2 paging structure.
* an ERR_PTR(error) on failure.
*/
kvm_pte_t *kvm_pgtable_stage2_create_unlinked(struct kvm_pgtable *pgt,
- u64 phys, u32 level,
+ u64 phys, s8 level,
enum kvm_pgtable_prot prot,
void *mc, bool force_pte);
* Return: 0 on success, negative error code on failure.
*/
int kvm_pgtable_get_leaf(struct kvm_pgtable *pgt, u64 addr,
- kvm_pte_t *ptep, u32 *level);
+ kvm_pte_t *ptep, s8 *level);
/**
* kvm_pgtable_stage2_pte_prot() - Retrieve the protection attributes of a
static inline unsigned long __hyp_pgtable_max_pages(unsigned long nr_pages)
{
- unsigned long total = 0, i;
+ unsigned long total = 0;
+ int i;
/* Provision the worst case scenario */
- for (i = 0; i < KVM_PGTABLE_MAX_LEVELS; i++) {
+ for (i = KVM_PGTABLE_FIRST_LEVEL; i <= KVM_PGTABLE_LAST_LEVEL; i++) {
nr_pages = DIV_ROUND_UP(nr_pages, PTRS_PER_PTE);
total += nr_pages;
}
#define PTE_MAYBE_NG (arm64_use_ng_mappings ? PTE_NG : 0)
#define PMD_MAYBE_NG (arm64_use_ng_mappings ? PMD_SECT_NG : 0)
+#define lpa2_is_enabled() false
+
/*
* If we have userspace only BTI we don't want to mark kernel pages
* guarded even if the system does support BTI.
pte = set_pte_bit(pte, __pgprot(PTE_DIRTY));
pte_val(pte) = (pte_val(pte) & ~mask) | (pgprot_val(newprot) & mask);
+ /*
+ * If we end up clearing hw dirtiness for a sw-dirty PTE, set hardware
+ * dirtiness again.
+ */
+ if (pte_sw_dirty(pte))
+ pte = pte_mkdirty(pte);
return pte;
}
extern bool rodata_enabled;
extern bool rodata_full;
- if (arg && !strcmp(arg, "full")) {
+ if (!arg)
+ return false;
+
+ if (!strcmp(arg, "full")) {
+ rodata_enabled = rodata_full = true;
+ return true;
+ }
+
+ if (!strcmp(arg, "off")) {
+ rodata_enabled = rodata_full = false;
+ return true;
+ }
+
+ if (!strcmp(arg, "on")) {
rodata_enabled = true;
- rodata_full = true;
+ rodata_full = false;
return true;
}
return sys_ni_syscall(); \
}
-#define COMPAT_SYS_NI(name) \
- SYSCALL_ALIAS(__arm64_compat_sys_##name, sys_ni_posix_timers);
-
#endif /* CONFIG_COMPAT */
#define __SYSCALL_DEFINEx(x, name, ...) \
}
asmlinkage long __arm64_sys_ni_syscall(const struct pt_regs *__unused);
-#define SYS_NI(name) SYSCALL_ALIAS(__arm64_sys_##name, sys_ni_posix_timers);
#endif /* __ASM_SYSCALL_WRAPPER_H */
#define OP_AT_S1E0W sys_insn(AT_Op0, 0, AT_CRn, 8, 3)
#define OP_AT_S1E1RP sys_insn(AT_Op0, 0, AT_CRn, 9, 0)
#define OP_AT_S1E1WP sys_insn(AT_Op0, 0, AT_CRn, 9, 1)
+#define OP_AT_S1E1A sys_insn(AT_Op0, 0, AT_CRn, 9, 2)
#define OP_AT_S1E2R sys_insn(AT_Op0, 4, AT_CRn, 8, 0)
#define OP_AT_S1E2W sys_insn(AT_Op0, 4, AT_CRn, 8, 1)
#define OP_AT_S12E1R sys_insn(AT_Op0, 4, AT_CRn, 8, 4)
#define OP_TLBI_VMALLS12E1NXS sys_insn(1, 4, 9, 7, 6)
/* Misc instructions */
+#define OP_GCSPUSHX sys_insn(1, 0, 7, 7, 4)
+#define OP_GCSPOPCX sys_insn(1, 0, 7, 7, 5)
+#define OP_GCSPOPX sys_insn(1, 0, 7, 7, 6)
+#define OP_GCSPUSHM sys_insn(1, 3, 7, 7, 0)
+
#define OP_BRB_IALL sys_insn(1, 1, 7, 2, 4)
#define OP_BRB_INJ sys_insn(1, 1, 7, 2, 5)
#define OP_CFP_RCTX sys_insn(1, 3, 7, 3, 4)
#define OP_DVP_RCTX sys_insn(1, 3, 7, 3, 5)
+#define OP_COSP_RCTX sys_insn(1, 3, 7, 3, 6)
#define OP_CPP_RCTX sys_insn(1, 3, 7, 3, 7)
/* Common SCTLR_ELx flags. */
/* id_aa64mmfr0 */
#define ID_AA64MMFR0_EL1_TGRAN4_SUPPORTED_MIN 0x0
+#define ID_AA64MMFR0_EL1_TGRAN4_LPA2 ID_AA64MMFR0_EL1_TGRAN4_52_BIT
#define ID_AA64MMFR0_EL1_TGRAN4_SUPPORTED_MAX 0x7
#define ID_AA64MMFR0_EL1_TGRAN64_SUPPORTED_MIN 0x0
#define ID_AA64MMFR0_EL1_TGRAN64_SUPPORTED_MAX 0x7
#define ID_AA64MMFR0_EL1_TGRAN16_SUPPORTED_MIN 0x1
+#define ID_AA64MMFR0_EL1_TGRAN16_LPA2 ID_AA64MMFR0_EL1_TGRAN16_52_BIT
#define ID_AA64MMFR0_EL1_TGRAN16_SUPPORTED_MAX 0xf
#define ARM64_MIN_PARANGE_BITS 32
#define ID_AA64MMFR0_EL1_TGRAN_2_SUPPORTED_DEFAULT 0x0
#define ID_AA64MMFR0_EL1_TGRAN_2_SUPPORTED_NONE 0x1
#define ID_AA64MMFR0_EL1_TGRAN_2_SUPPORTED_MIN 0x2
+#define ID_AA64MMFR0_EL1_TGRAN_2_SUPPORTED_LPA2 0x3
#define ID_AA64MMFR0_EL1_TGRAN_2_SUPPORTED_MAX 0x7
#ifdef CONFIG_ARM64_PA_BITS_52
#if defined(CONFIG_ARM64_4K_PAGES)
#define ID_AA64MMFR0_EL1_TGRAN_SHIFT ID_AA64MMFR0_EL1_TGRAN4_SHIFT
+#define ID_AA64MMFR0_EL1_TGRAN_LPA2 ID_AA64MMFR0_EL1_TGRAN4_52_BIT
#define ID_AA64MMFR0_EL1_TGRAN_SUPPORTED_MIN ID_AA64MMFR0_EL1_TGRAN4_SUPPORTED_MIN
#define ID_AA64MMFR0_EL1_TGRAN_SUPPORTED_MAX ID_AA64MMFR0_EL1_TGRAN4_SUPPORTED_MAX
#define ID_AA64MMFR0_EL1_TGRAN_2_SHIFT ID_AA64MMFR0_EL1_TGRAN4_2_SHIFT
#elif defined(CONFIG_ARM64_16K_PAGES)
#define ID_AA64MMFR0_EL1_TGRAN_SHIFT ID_AA64MMFR0_EL1_TGRAN16_SHIFT
+#define ID_AA64MMFR0_EL1_TGRAN_LPA2 ID_AA64MMFR0_EL1_TGRAN16_52_BIT
#define ID_AA64MMFR0_EL1_TGRAN_SUPPORTED_MIN ID_AA64MMFR0_EL1_TGRAN16_SUPPORTED_MIN
#define ID_AA64MMFR0_EL1_TGRAN_SUPPORTED_MAX ID_AA64MMFR0_EL1_TGRAN16_SUPPORTED_MAX
#define ID_AA64MMFR0_EL1_TGRAN_2_SHIFT ID_AA64MMFR0_EL1_TGRAN16_2_SHIFT
#define PIRx_ELx_PERM(idx, perm) ((perm) << ((idx) * 4))
+/*
+ * Permission Overlay Extension (POE) permission encodings.
+ */
+#define POE_NONE UL(0x0)
+#define POE_R UL(0x1)
+#define POE_X UL(0x2)
+#define POE_RX UL(0x3)
+#define POE_W UL(0x4)
+#define POE_RW UL(0x5)
+#define POE_XW UL(0x6)
+#define POE_RXW UL(0x7)
+#define POE_MASK UL(0xf)
+
#define ARM64_FEATURE_FIELD_BITS 4
/* Defined for compatibility only, do not add new users. */
#include <asm-generic/tlb.h>
/*
- * get the tlbi levels in arm64. Default value is 0 if more than one
- * of cleared_* is set or neither is set.
- * Arm64 doesn't support p4ds now.
+ * get the tlbi levels in arm64. Default value is TLBI_TTL_UNKNOWN if more than
+ * one of cleared_* is set or neither is set - this elides the level hinting to
+ * the hardware.
*/
static inline int tlb_get_level(struct mmu_gather *tlb)
{
/* The TTL field is only valid for the leaf entry. */
if (tlb->freed_tables)
- return 0;
+ return TLBI_TTL_UNKNOWN;
if (tlb->cleared_ptes && !(tlb->cleared_pmds ||
tlb->cleared_puds ||
tlb->cleared_p4ds))
return 1;
- return 0;
+ if (tlb->cleared_p4ds && !(tlb->cleared_ptes ||
+ tlb->cleared_pmds ||
+ tlb->cleared_puds))
+ return 0;
+
+ return TLBI_TTL_UNKNOWN;
}
static inline void tlb_flush(struct mmu_gather *tlb)
* When ARMv8.4-TTL exists, TLBI operations take an additional hint for
* the level at which the invalidation must take place. If the level is
* wrong, no invalidation may take place. In the case where the level
- * cannot be easily determined, a 0 value for the level parameter will
- * perform a non-hinted invalidation.
+ * cannot be easily determined, the value TLBI_TTL_UNKNOWN will perform
+ * a non-hinted invalidation. Any provided level outside the hint range
+ * will also cause fall-back to non-hinted invalidation.
*
* For Stage-2 invalidation, use the level values provided to that effect
* in asm/stage2_pgtable.h.
*/
#define TLBI_TTL_MASK GENMASK_ULL(47, 44)
+#define TLBI_TTL_UNKNOWN INT_MAX
+
#define __tlbi_level(op, addr, level) do { \
u64 arg = addr; \
\
if (alternative_has_cap_unlikely(ARM64_HAS_ARMv8_4_TTL) && \
- level) { \
+ level >= 0 && level <= 3) { \
u64 ttl = level & 3; \
ttl |= get_trans_granule() << 2; \
arg &= ~TLBI_TTL_MASK; \
} while (0)
/*
- * This macro creates a properly formatted VA operand for the TLB RANGE.
- * The value bit assignments are:
+ * This macro creates a properly formatted VA operand for the TLB RANGE. The
+ * value bit assignments are:
*
* +----------+------+-------+-------+-------+----------------------+
* | ASID | TG | SCALE | NUM | TTL | BADDR |
* +-----------------+-------+-------+-------+----------------------+
* |63 48|47 46|45 44|43 39|38 37|36 0|
*
- * The address range is determined by below formula:
- * [BADDR, BADDR + (NUM + 1) * 2^(5*SCALE + 1) * PAGESIZE)
+ * The address range is determined by below formula: [BADDR, BADDR + (NUM + 1) *
+ * 2^(5*SCALE + 1) * PAGESIZE)
+ *
+ * Note that the first argument, baddr, is pre-shifted; If LPA2 is in use, BADDR
+ * holds addr[52:16]. Else BADDR holds page number. See for example ARM DDI
+ * 0487J.a section C5.5.60 "TLBI VAE1IS, TLBI VAE1ISNXS, TLB Invalidate by VA,
+ * EL1, Inner Shareable".
*
*/
-#define __TLBI_VADDR_RANGE(addr, asid, scale, num, ttl) \
- ({ \
- unsigned long __ta = (addr) >> PAGE_SHIFT; \
- __ta &= GENMASK_ULL(36, 0); \
- __ta |= (unsigned long)(ttl) << 37; \
- __ta |= (unsigned long)(num) << 39; \
- __ta |= (unsigned long)(scale) << 44; \
- __ta |= get_trans_granule() << 46; \
- __ta |= (unsigned long)(asid) << 48; \
- __ta; \
+#define __TLBI_VADDR_RANGE(baddr, asid, scale, num, ttl) \
+ ({ \
+ unsigned long __ta = (baddr); \
+ unsigned long __ttl = (ttl >= 1 && ttl <= 3) ? ttl : 0; \
+ __ta &= GENMASK_ULL(36, 0); \
+ __ta |= __ttl << 37; \
+ __ta |= (unsigned long)(num) << 39; \
+ __ta |= (unsigned long)(scale) << 44; \
+ __ta |= get_trans_granule() << 46; \
+ __ta |= (unsigned long)(asid) << 48; \
+ __ta; \
})
/* These macros are used by the TLBI RANGE feature. */
* CPUs, ensuring that any walk-cache entries associated with the
* translation are also invalidated.
*
- * __flush_tlb_range(vma, start, end, stride, last_level)
+ * __flush_tlb_range(vma, start, end, stride, last_level, tlb_level)
* Invalidate the virtual-address range '[start, end)' on all
* CPUs for the user address space corresponding to 'vma->mm'.
* The invalidation operations are issued at a granularity
* determined by 'stride' and only affect any walk-cache entries
- * if 'last_level' is equal to false.
+ * if 'last_level' is equal to false. tlb_level is the level at
+ * which the invalidation must take place. If the level is wrong,
+ * no invalidation may take place. In the case where the level
+ * cannot be easily determined, the value TLBI_TTL_UNKNOWN will
+ * perform a non-hinted invalidation.
*
*
* Finally, take a look at asm/tlb.h to see how tlb_flush() is implemented
* @tlb_level: Translation Table level hint, if known
* @tlbi_user: If 'true', call an additional __tlbi_user()
* (typically for user ASIDs). 'flase' for IPA instructions
+ * @lpa2: If 'true', the lpa2 scheme is used as set out below
*
* When the CPU does not support TLB range operations, flush the TLB
* entries one by one at the granularity of 'stride'. If the TLB
* range ops are supported, then:
*
- * 1. If 'pages' is odd, flush the first page through non-range
- * operations;
+ * 1. If FEAT_LPA2 is in use, the start address of a range operation must be
+ * 64KB aligned, so flush pages one by one until the alignment is reached
+ * using the non-range operations. This step is skipped if LPA2 is not in
+ * use.
+ *
+ * 2. The minimum range granularity is decided by 'scale', so multiple range
+ * TLBI operations may be required. Start from scale = 3, flush the largest
+ * possible number of pages ((num+1)*2^(5*scale+1)) that fit into the
+ * requested range, then decrement scale and continue until one or zero pages
+ * are left. We must start from highest scale to ensure 64KB start alignment
+ * is maintained in the LPA2 case.
*
- * 2. For remaining pages: the minimum range granularity is decided
- * by 'scale', so multiple range TLBI operations may be required.
- * Start from scale = 0, flush the corresponding number of pages
- * ((num+1)*2^(5*scale+1) starting from 'addr'), then increase it
- * until no pages left.
+ * 3. If there is 1 page remaining, flush it through non-range operations. Range
+ * operations can only span an even number of pages. We save this for last to
+ * ensure 64KB start alignment is maintained for the LPA2 case.
*
* Note that certain ranges can be represented by either num = 31 and
* scale or num = 0 and scale + 1. The loop below favours the latter
* since num is limited to 30 by the __TLBI_RANGE_NUM() macro.
*/
#define __flush_tlb_range_op(op, start, pages, stride, \
- asid, tlb_level, tlbi_user) \
+ asid, tlb_level, tlbi_user, lpa2) \
do { \
int num = 0; \
- int scale = 0; \
+ int scale = 3; \
+ int shift = lpa2 ? 16 : PAGE_SHIFT; \
unsigned long addr; \
\
while (pages > 0) { \
if (!system_supports_tlb_range() || \
- pages % 2 == 1) { \
+ pages == 1 || \
+ (lpa2 && start != ALIGN(start, SZ_64K))) { \
addr = __TLBI_VADDR(start, asid); \
__tlbi_level(op, addr, tlb_level); \
if (tlbi_user) \
\
num = __TLBI_RANGE_NUM(pages, scale); \
if (num >= 0) { \
- addr = __TLBI_VADDR_RANGE(start, asid, scale, \
- num, tlb_level); \
+ addr = __TLBI_VADDR_RANGE(start >> shift, asid, \
+ scale, num, tlb_level); \
__tlbi(r##op, addr); \
if (tlbi_user) \
__tlbi_user(r##op, addr); \
start += __TLBI_RANGE_PAGES(num, scale) << PAGE_SHIFT; \
pages -= __TLBI_RANGE_PAGES(num, scale); \
} \
- scale++; \
+ scale--; \
} \
} while (0)
#define __flush_s2_tlb_range_op(op, start, pages, stride, tlb_level) \
- __flush_tlb_range_op(op, start, pages, stride, 0, tlb_level, false)
+ __flush_tlb_range_op(op, start, pages, stride, 0, tlb_level, false, kvm_lpa2_is_enabled());
static inline void __flush_tlb_range(struct vm_area_struct *vma,
unsigned long start, unsigned long end,
asid = ASID(vma->vm_mm);
if (last_level)
- __flush_tlb_range_op(vale1is, start, pages, stride, asid, tlb_level, true);
+ __flush_tlb_range_op(vale1is, start, pages, stride, asid,
+ tlb_level, true, lpa2_is_enabled());
else
- __flush_tlb_range_op(vae1is, start, pages, stride, asid, tlb_level, true);
+ __flush_tlb_range_op(vae1is, start, pages, stride, asid,
+ tlb_level, true, lpa2_is_enabled());
dsb(ish);
mmu_notifier_arch_invalidate_secondary_tlbs(vma->vm_mm, start, end);
/*
* We cannot use leaf-only invalidation here, since we may be invalidating
* table entries as part of collapsing hugepages or moving page tables.
- * Set the tlb_level to 0 because we can not get enough information here.
+ * Set the tlb_level to TLBI_TTL_UNKNOWN because we can not get enough
+ * information here.
*/
- __flush_tlb_range(vma, start, end, PAGE_SIZE, false, 0);
+ __flush_tlb_range(vma, start, end, PAGE_SIZE, false, TLBI_TTL_UNKNOWN);
}
static inline void flush_tlb_kernel_range(unsigned long start, unsigned long end)
--- /dev/null
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * System register offsets in the VNCR page
+ * All offsets are *byte* displacements!
+ */
+
+#ifndef __ARM64_VNCR_MAPPING_H__
+#define __ARM64_VNCR_MAPPING_H__
+
+#define VNCR_VTTBR_EL2 0x020
+#define VNCR_VTCR_EL2 0x040
+#define VNCR_VMPIDR_EL2 0x050
+#define VNCR_CNTVOFF_EL2 0x060
+#define VNCR_HCR_EL2 0x078
+#define VNCR_HSTR_EL2 0x080
+#define VNCR_VPIDR_EL2 0x088
+#define VNCR_TPIDR_EL2 0x090
+#define VNCR_HCRX_EL2 0x0A0
+#define VNCR_VNCR_EL2 0x0B0
+#define VNCR_CPACR_EL1 0x100
+#define VNCR_CONTEXTIDR_EL1 0x108
+#define VNCR_SCTLR_EL1 0x110
+#define VNCR_ACTLR_EL1 0x118
+#define VNCR_TCR_EL1 0x120
+#define VNCR_AFSR0_EL1 0x128
+#define VNCR_AFSR1_EL1 0x130
+#define VNCR_ESR_EL1 0x138
+#define VNCR_MAIR_EL1 0x140
+#define VNCR_AMAIR_EL1 0x148
+#define VNCR_MDSCR_EL1 0x158
+#define VNCR_SPSR_EL1 0x160
+#define VNCR_CNTV_CVAL_EL0 0x168
+#define VNCR_CNTV_CTL_EL0 0x170
+#define VNCR_CNTP_CVAL_EL0 0x178
+#define VNCR_CNTP_CTL_EL0 0x180
+#define VNCR_SCXTNUM_EL1 0x188
+#define VNCR_TFSR_EL1 0x190
+#define VNCR_HFGRTR_EL2 0x1B8
+#define VNCR_HFGWTR_EL2 0x1C0
+#define VNCR_HFGITR_EL2 0x1C8
+#define VNCR_HDFGRTR_EL2 0x1D0
+#define VNCR_HDFGWTR_EL2 0x1D8
+#define VNCR_ZCR_EL1 0x1E0
+#define VNCR_HAFGRTR_EL2 0x1E8
+#define VNCR_TTBR0_EL1 0x200
+#define VNCR_TTBR1_EL1 0x210
+#define VNCR_FAR_EL1 0x220
+#define VNCR_ELR_EL1 0x230
+#define VNCR_SP_EL1 0x240
+#define VNCR_VBAR_EL1 0x250
+#define VNCR_TCR2_EL1 0x270
+#define VNCR_PIRE0_EL1 0x290
+#define VNCR_PIRE0_EL2 0x298
+#define VNCR_PIR_EL1 0x2A0
+#define VNCR_ICH_LR0_EL2 0x400
+#define VNCR_ICH_LR1_EL2 0x408
+#define VNCR_ICH_LR2_EL2 0x410
+#define VNCR_ICH_LR3_EL2 0x418
+#define VNCR_ICH_LR4_EL2 0x420
+#define VNCR_ICH_LR5_EL2 0x428
+#define VNCR_ICH_LR6_EL2 0x430
+#define VNCR_ICH_LR7_EL2 0x438
+#define VNCR_ICH_LR8_EL2 0x440
+#define VNCR_ICH_LR9_EL2 0x448
+#define VNCR_ICH_LR10_EL2 0x450
+#define VNCR_ICH_LR11_EL2 0x458
+#define VNCR_ICH_LR12_EL2 0x460
+#define VNCR_ICH_LR13_EL2 0x468
+#define VNCR_ICH_LR14_EL2 0x470
+#define VNCR_ICH_LR15_EL2 0x478
+#define VNCR_ICH_AP0R0_EL2 0x480
+#define VNCR_ICH_AP0R1_EL2 0x488
+#define VNCR_ICH_AP0R2_EL2 0x490
+#define VNCR_ICH_AP0R3_EL2 0x498
+#define VNCR_ICH_AP1R0_EL2 0x4A0
+#define VNCR_ICH_AP1R1_EL2 0x4A8
+#define VNCR_ICH_AP1R2_EL2 0x4B0
+#define VNCR_ICH_AP1R3_EL2 0x4B8
+#define VNCR_ICH_HCR_EL2 0x4C0
+#define VNCR_ICH_VMCR_EL2 0x4C8
+#define VNCR_VDISR_EL2 0x500
+#define VNCR_PMBLIMITR_EL1 0x800
+#define VNCR_PMBPTR_EL1 0x810
+#define VNCR_PMBSR_EL1 0x820
+#define VNCR_PMSCR_EL1 0x828
+#define VNCR_PMSEVFR_EL1 0x830
+#define VNCR_PMSICR_EL1 0x838
+#define VNCR_PMSIRR_EL1 0x840
+#define VNCR_PMSLATFR_EL1 0x848
+#define VNCR_TRFCR_EL1 0x880
+#define VNCR_MPAM1_EL1 0x900
+#define VNCR_MPAMHCR_EL2 0x930
+#define VNCR_MPAMVPMV_EL2 0x938
+#define VNCR_MPAMVPM0_EL2 0x940
+#define VNCR_MPAMVPM1_EL2 0x948
+#define VNCR_MPAMVPM2_EL2 0x950
+#define VNCR_MPAMVPM3_EL2 0x958
+#define VNCR_MPAMVPM4_EL2 0x960
+#define VNCR_MPAMVPM5_EL2 0x968
+#define VNCR_MPAMVPM6_EL2 0x970
+#define VNCR_MPAMVPM7_EL2 0x978
+
+#endif /* __ARM64_VNCR_MAPPING_H__ */
return !meltdown_safe;
}
+#if defined(ID_AA64MMFR0_EL1_TGRAN_LPA2) && defined(ID_AA64MMFR0_EL1_TGRAN_2_SUPPORTED_LPA2)
+static bool has_lpa2_at_stage1(u64 mmfr0)
+{
+ unsigned int tgran;
+
+ tgran = cpuid_feature_extract_unsigned_field(mmfr0,
+ ID_AA64MMFR0_EL1_TGRAN_SHIFT);
+ return tgran == ID_AA64MMFR0_EL1_TGRAN_LPA2;
+}
+
+static bool has_lpa2_at_stage2(u64 mmfr0)
+{
+ unsigned int tgran;
+
+ tgran = cpuid_feature_extract_unsigned_field(mmfr0,
+ ID_AA64MMFR0_EL1_TGRAN_2_SHIFT);
+ return tgran == ID_AA64MMFR0_EL1_TGRAN_2_SUPPORTED_LPA2;
+}
+
+static bool has_lpa2(const struct arm64_cpu_capabilities *entry, int scope)
+{
+ u64 mmfr0;
+
+ mmfr0 = read_sanitised_ftr_reg(SYS_ID_AA64MMFR0_EL1);
+ return has_lpa2_at_stage1(mmfr0) && has_lpa2_at_stage2(mmfr0);
+}
+#else
+static bool has_lpa2(const struct arm64_cpu_capabilities *entry, int scope)
+{
+ return false;
+}
+#endif
+
#ifdef CONFIG_UNMAP_KERNEL_AT_EL0
#define KPTI_NG_TEMP_VA (-(1UL << PMD_SHIFT))
static void __init kpti_install_ng_mappings(void)
{
+ /* Check whether KPTI is going to be used */
+ if (!cpus_have_cap(ARM64_UNMAP_KERNEL_AT_EL0))
+ return;
+
/*
* We don't need to rewrite the page-tables if either we've done
* it already or we have KASLR enabled and therefore have not
.capability = ARM64_HAS_NESTED_VIRT,
.type = ARM64_CPUCAP_SYSTEM_FEATURE,
.matches = has_nested_virt_support,
- ARM64_CPUID_FIELDS(ID_AA64MMFR2_EL1, NV, IMP)
+ ARM64_CPUID_FIELDS(ID_AA64MMFR2_EL1, NV, NV2)
},
{
.capability = ARM64_HAS_32BIT_EL0_DO_NOT_USE,
.matches = has_cpuid_feature,
ARM64_CPUID_FIELDS(ID_AA64MMFR2_EL1, EVT, IMP)
},
+ {
+ .desc = "52-bit Virtual Addressing for KVM (LPA2)",
+ .capability = ARM64_HAS_LPA2,
+ .type = ARM64_CPUCAP_SYSTEM_FEATURE,
+ .matches = has_lpa2,
+ },
{},
};
menuconfig KVM
bool "Kernel-based Virtual Machine (KVM) support"
depends on HAVE_KVM
+ select KVM_COMMON
select KVM_GENERIC_HARDWARE_ENABLING
select KVM_GENERIC_MMU_NOTIFIER
- select PREEMPT_NOTIFIERS
select HAVE_KVM_CPU_RELAX_INTERCEPT
select KVM_MMIO
select KVM_GENERIC_DIRTYLOG_READ_PROTECT
select KVM_XFER_TO_GUEST_WORK
select KVM_VFIO
- select HAVE_KVM_EVENTFD
- select HAVE_KVM_IRQFD
select HAVE_KVM_DIRTY_RING_ACQ_REL
select NEED_KVM_DIRTY_RING_WITH_BITMAP
select HAVE_KVM_MSI
select HAVE_KVM_VCPU_RUN_PID_CHANGE
select SCHED_INFO
select GUEST_PERF_EVENTS if PERF_EVENTS
- select INTERVAL_TREE
select XARRAY_MULTI
help
Support hosting virtualized guest machines.
u64 val = vcpu_get_reg(vcpu, kvm_vcpu_sys_get_rt(vcpu));
struct arch_timer_context *ctx;
- ctx = (vcpu_has_nv(vcpu) && is_hyp_ctxt(vcpu)) ? vcpu_hvtimer(vcpu)
- : vcpu_vtimer(vcpu);
+ ctx = is_hyp_ctxt(vcpu) ? vcpu_hvtimer(vcpu) : vcpu_vtimer(vcpu);
return kvm_counter_compute_delta(ctx, val);
}
kvm_mmu_free_memory_cache(&vcpu->arch.mmu_page_cache);
kvm_timer_vcpu_terminate(vcpu);
kvm_pmu_vcpu_destroy(vcpu);
-
+ kvm_vgic_vcpu_destroy(vcpu);
kvm_arm_vcpu_destroy(vcpu);
}
return ret;
}
+ if (vcpu_has_nv(vcpu)) {
+ ret = kvm_init_nv_sysregs(vcpu->kvm);
+ if (ret)
+ return ret;
+ }
+
ret = kvm_timer_enable(vcpu);
if (ret)
return ret;
static void __init cpu_prepare_hyp_mode(int cpu, u32 hyp_va_bits)
{
struct kvm_nvhe_init_params *params = per_cpu_ptr_nvhe_sym(kvm_init_params, cpu);
+ u64 mmfr0 = read_sanitised_ftr_reg(SYS_ID_AA64MMFR0_EL1);
unsigned long tcr;
/*
}
tcr &= ~TCR_T0SZ_MASK;
tcr |= TCR_T0SZ(hyp_va_bits);
+ tcr &= ~TCR_EL2_PS_MASK;
+ tcr |= FIELD_PREP(TCR_EL2_PS_MASK, kvm_get_parange(mmfr0));
+ if (kvm_lpa2_is_enabled())
+ tcr |= TCR_EL2_DS;
params->tcr_el2 = tcr;
params->pgd_pa = kvm_mmu_get_httbr();
HDFGRTR_GROUP,
HDFGWTR_GROUP,
HFGITR_GROUP,
+ HAFGRTR_GROUP,
/* Must be last */
__NR_FGT_GROUP_IDS__
static const struct encoding_to_trap_config encoding_to_fgt[] __initconst = {
/* HFGRTR_EL2, HFGWTR_EL2 */
+ SR_FGT(SYS_AMAIR2_EL1, HFGxTR, nAMAIR2_EL1, 0),
+ SR_FGT(SYS_MAIR2_EL1, HFGxTR, nMAIR2_EL1, 0),
+ SR_FGT(SYS_S2POR_EL1, HFGxTR, nS2POR_EL1, 0),
+ SR_FGT(SYS_POR_EL1, HFGxTR, nPOR_EL1, 0),
+ SR_FGT(SYS_POR_EL0, HFGxTR, nPOR_EL0, 0),
SR_FGT(SYS_PIR_EL1, HFGxTR, nPIR_EL1, 0),
SR_FGT(SYS_PIRE0_EL1, HFGxTR, nPIRE0_EL1, 0),
+ SR_FGT(SYS_RCWMASK_EL1, HFGxTR, nRCWMASK_EL1, 0),
SR_FGT(SYS_TPIDR2_EL0, HFGxTR, nTPIDR2_EL0, 0),
SR_FGT(SYS_SMPRI_EL1, HFGxTR, nSMPRI_EL1, 0),
+ SR_FGT(SYS_GCSCR_EL1, HFGxTR, nGCS_EL1, 0),
+ SR_FGT(SYS_GCSPR_EL1, HFGxTR, nGCS_EL1, 0),
+ SR_FGT(SYS_GCSCRE0_EL1, HFGxTR, nGCS_EL0, 0),
+ SR_FGT(SYS_GCSPR_EL0, HFGxTR, nGCS_EL0, 0),
SR_FGT(SYS_ACCDATA_EL1, HFGxTR, nACCDATA_EL1, 0),
SR_FGT(SYS_ERXADDR_EL1, HFGxTR, ERXADDR_EL1, 1),
SR_FGT(SYS_ERXPFGCDN_EL1, HFGxTR, ERXPFGCDN_EL1, 1),
SR_FGT(SYS_AFSR1_EL1, HFGxTR, AFSR1_EL1, 1),
SR_FGT(SYS_AFSR0_EL1, HFGxTR, AFSR0_EL1, 1),
/* HFGITR_EL2 */
+ SR_FGT(OP_AT_S1E1A, HFGITR, ATS1E1A, 1),
+ SR_FGT(OP_COSP_RCTX, HFGITR, COSPRCTX, 1),
+ SR_FGT(OP_GCSPUSHX, HFGITR, nGCSEPP, 0),
+ SR_FGT(OP_GCSPOPX, HFGITR, nGCSEPP, 0),
+ SR_FGT(OP_GCSPUSHM, HFGITR, nGCSPUSHM_EL1, 0),
SR_FGT(OP_BRB_IALL, HFGITR, nBRBIALL, 0),
SR_FGT(OP_BRB_INJ, HFGITR, nBRBINJ, 0),
SR_FGT(SYS_DC_CVAC, HFGITR, DCCVAC, 1),
SR_FGT(SYS_PMCR_EL0, HDFGWTR, PMCR_EL0, 1),
SR_FGT(SYS_PMSWINC_EL0, HDFGWTR, PMSWINC_EL0, 1),
SR_FGT(SYS_OSLAR_EL1, HDFGWTR, OSLAR_EL1, 1),
+ /*
+ * HAFGRTR_EL2
+ */
+ SR_FGT(SYS_AMEVTYPER1_EL0(15), HAFGRTR, AMEVTYPER115_EL0, 1),
+ SR_FGT(SYS_AMEVTYPER1_EL0(14), HAFGRTR, AMEVTYPER114_EL0, 1),
+ SR_FGT(SYS_AMEVTYPER1_EL0(13), HAFGRTR, AMEVTYPER113_EL0, 1),
+ SR_FGT(SYS_AMEVTYPER1_EL0(12), HAFGRTR, AMEVTYPER112_EL0, 1),
+ SR_FGT(SYS_AMEVTYPER1_EL0(11), HAFGRTR, AMEVTYPER111_EL0, 1),
+ SR_FGT(SYS_AMEVTYPER1_EL0(10), HAFGRTR, AMEVTYPER110_EL0, 1),
+ SR_FGT(SYS_AMEVTYPER1_EL0(9), HAFGRTR, AMEVTYPER19_EL0, 1),
+ SR_FGT(SYS_AMEVTYPER1_EL0(8), HAFGRTR, AMEVTYPER18_EL0, 1),
+ SR_FGT(SYS_AMEVTYPER1_EL0(7), HAFGRTR, AMEVTYPER17_EL0, 1),
+ SR_FGT(SYS_AMEVTYPER1_EL0(6), HAFGRTR, AMEVTYPER16_EL0, 1),
+ SR_FGT(SYS_AMEVTYPER1_EL0(5), HAFGRTR, AMEVTYPER15_EL0, 1),
+ SR_FGT(SYS_AMEVTYPER1_EL0(4), HAFGRTR, AMEVTYPER14_EL0, 1),
+ SR_FGT(SYS_AMEVTYPER1_EL0(3), HAFGRTR, AMEVTYPER13_EL0, 1),
+ SR_FGT(SYS_AMEVTYPER1_EL0(2), HAFGRTR, AMEVTYPER12_EL0, 1),
+ SR_FGT(SYS_AMEVTYPER1_EL0(1), HAFGRTR, AMEVTYPER11_EL0, 1),
+ SR_FGT(SYS_AMEVTYPER1_EL0(0), HAFGRTR, AMEVTYPER10_EL0, 1),
+ SR_FGT(SYS_AMEVCNTR1_EL0(15), HAFGRTR, AMEVCNTR115_EL0, 1),
+ SR_FGT(SYS_AMEVCNTR1_EL0(14), HAFGRTR, AMEVCNTR114_EL0, 1),
+ SR_FGT(SYS_AMEVCNTR1_EL0(13), HAFGRTR, AMEVCNTR113_EL0, 1),
+ SR_FGT(SYS_AMEVCNTR1_EL0(12), HAFGRTR, AMEVCNTR112_EL0, 1),
+ SR_FGT(SYS_AMEVCNTR1_EL0(11), HAFGRTR, AMEVCNTR111_EL0, 1),
+ SR_FGT(SYS_AMEVCNTR1_EL0(10), HAFGRTR, AMEVCNTR110_EL0, 1),
+ SR_FGT(SYS_AMEVCNTR1_EL0(9), HAFGRTR, AMEVCNTR19_EL0, 1),
+ SR_FGT(SYS_AMEVCNTR1_EL0(8), HAFGRTR, AMEVCNTR18_EL0, 1),
+ SR_FGT(SYS_AMEVCNTR1_EL0(7), HAFGRTR, AMEVCNTR17_EL0, 1),
+ SR_FGT(SYS_AMEVCNTR1_EL0(6), HAFGRTR, AMEVCNTR16_EL0, 1),
+ SR_FGT(SYS_AMEVCNTR1_EL0(5), HAFGRTR, AMEVCNTR15_EL0, 1),
+ SR_FGT(SYS_AMEVCNTR1_EL0(4), HAFGRTR, AMEVCNTR14_EL0, 1),
+ SR_FGT(SYS_AMEVCNTR1_EL0(3), HAFGRTR, AMEVCNTR13_EL0, 1),
+ SR_FGT(SYS_AMEVCNTR1_EL0(2), HAFGRTR, AMEVCNTR12_EL0, 1),
+ SR_FGT(SYS_AMEVCNTR1_EL0(1), HAFGRTR, AMEVCNTR11_EL0, 1),
+ SR_FGT(SYS_AMEVCNTR1_EL0(0), HAFGRTR, AMEVCNTR10_EL0, 1),
+ SR_FGT(SYS_AMCNTENCLR1_EL0, HAFGRTR, AMCNTEN1, 1),
+ SR_FGT(SYS_AMCNTENSET1_EL0, HAFGRTR, AMCNTEN1, 1),
+ SR_FGT(SYS_AMCNTENCLR0_EL0, HAFGRTR, AMCNTEN0, 1),
+ SR_FGT(SYS_AMCNTENSET0_EL0, HAFGRTR, AMCNTEN0, 1),
+ SR_FGT(SYS_AMEVCNTR0_EL0(3), HAFGRTR, AMEVCNTR03_EL0, 1),
+ SR_FGT(SYS_AMEVCNTR0_EL0(2), HAFGRTR, AMEVCNTR02_EL0, 1),
+ SR_FGT(SYS_AMEVCNTR0_EL0(1), HAFGRTR, AMEVCNTR01_EL0, 1),
+ SR_FGT(SYS_AMEVCNTR0_EL0(0), HAFGRTR, AMEVCNTR00_EL0, 1),
};
static union trap_config get_trap_config(u32 sysreg)
val = sanitised_sys_reg(vcpu, HDFGWTR_EL2);
break;
+ case HAFGRTR_GROUP:
+ val = sanitised_sys_reg(vcpu, HAFGRTR_EL2);
+ break;
+
case HFGITR_GROUP:
val = sanitised_sys_reg(vcpu, HFGITR_EL2);
switch (tc.fgf) {
*/
if (!(esr & ESR_ELx_S1PTW) &&
(cpus_have_final_cap(ARM64_WORKAROUND_834220) ||
- (esr & ESR_ELx_FSC_TYPE) == ESR_ELx_FSC_PERM)) {
+ esr_fsc_is_permission_fault(esr))) {
if (!__translate_far_to_hpfar(far, &hpfar))
return false;
} else {
clr |= ~hfg & __ ## reg ## _nMASK; \
} while(0)
+#define update_fgt_traps_cs(vcpu, reg, clr, set) \
+ do { \
+ struct kvm_cpu_context *hctxt = \
+ &this_cpu_ptr(&kvm_host_data)->host_ctxt; \
+ u64 c = 0, s = 0; \
+ \
+ ctxt_sys_reg(hctxt, reg) = read_sysreg_s(SYS_ ## reg); \
+ compute_clr_set(vcpu, reg, c, s); \
+ s |= set; \
+ c |= clr; \
+ if (c || s) { \
+ u64 val = __ ## reg ## _nMASK; \
+ val |= s; \
+ val &= ~c; \
+ write_sysreg_s(val, SYS_ ## reg); \
+ } \
+ } while(0)
+
+#define update_fgt_traps(vcpu, reg) \
+ update_fgt_traps_cs(vcpu, reg, 0, 0)
+
+/*
+ * Validate the fine grain trap masks.
+ * Check that the masks do not overlap and that all bits are accounted for.
+ */
+#define CHECK_FGT_MASKS(reg) \
+ do { \
+ BUILD_BUG_ON((__ ## reg ## _MASK) & (__ ## reg ## _nMASK)); \
+ BUILD_BUG_ON(~((__ ## reg ## _RES0) ^ (__ ## reg ## _MASK) ^ \
+ (__ ## reg ## _nMASK))); \
+ } while(0)
+
+static inline bool cpu_has_amu(void)
+{
+ u64 pfr0 = read_sysreg_s(SYS_ID_AA64PFR0_EL1);
+
+ return cpuid_feature_extract_unsigned_field(pfr0,
+ ID_AA64PFR0_EL1_AMU_SHIFT);
+}
static inline void __activate_traps_hfgxtr(struct kvm_vcpu *vcpu)
{
u64 r_clr = 0, w_clr = 0, r_set = 0, w_set = 0, tmp;
u64 r_val, w_val;
+ CHECK_FGT_MASKS(HFGRTR_EL2);
+ CHECK_FGT_MASKS(HFGWTR_EL2);
+ CHECK_FGT_MASKS(HFGITR_EL2);
+ CHECK_FGT_MASKS(HDFGRTR_EL2);
+ CHECK_FGT_MASKS(HDFGWTR_EL2);
+ CHECK_FGT_MASKS(HAFGRTR_EL2);
+ CHECK_FGT_MASKS(HCRX_EL2);
+
if (!cpus_have_final_cap(ARM64_HAS_FGT))
return;
compute_clr_set(vcpu, HFGWTR_EL2, w_clr, w_set);
}
- /* The default is not to trap anything but ACCDATA_EL1 */
- r_val = __HFGRTR_EL2_nMASK & ~HFGxTR_EL2_nACCDATA_EL1;
+ /* The default to trap everything not handled or supported in KVM. */
+ tmp = HFGxTR_EL2_nAMAIR2_EL1 | HFGxTR_EL2_nMAIR2_EL1 | HFGxTR_EL2_nS2POR_EL1 |
+ HFGxTR_EL2_nPOR_EL1 | HFGxTR_EL2_nPOR_EL0 | HFGxTR_EL2_nACCDATA_EL1;
+
+ r_val = __HFGRTR_EL2_nMASK & ~tmp;
r_val |= r_set;
r_val &= ~r_clr;
- w_val = __HFGWTR_EL2_nMASK & ~HFGxTR_EL2_nACCDATA_EL1;
+ w_val = __HFGWTR_EL2_nMASK & ~tmp;
w_val |= w_set;
w_val &= ~w_clr;
if (!vcpu_has_nv(vcpu) || is_hyp_ctxt(vcpu))
return;
- ctxt_sys_reg(hctxt, HFGITR_EL2) = read_sysreg_s(SYS_HFGITR_EL2);
+ update_fgt_traps(vcpu, HFGITR_EL2);
+ update_fgt_traps(vcpu, HDFGRTR_EL2);
+ update_fgt_traps(vcpu, HDFGWTR_EL2);
- r_set = r_clr = 0;
- compute_clr_set(vcpu, HFGITR_EL2, r_clr, r_set);
- r_val = __HFGITR_EL2_nMASK;
- r_val |= r_set;
- r_val &= ~r_clr;
-
- write_sysreg_s(r_val, SYS_HFGITR_EL2);
-
- ctxt_sys_reg(hctxt, HDFGRTR_EL2) = read_sysreg_s(SYS_HDFGRTR_EL2);
- ctxt_sys_reg(hctxt, HDFGWTR_EL2) = read_sysreg_s(SYS_HDFGWTR_EL2);
-
- r_clr = r_set = w_clr = w_set = 0;
-
- compute_clr_set(vcpu, HDFGRTR_EL2, r_clr, r_set);
- compute_clr_set(vcpu, HDFGWTR_EL2, w_clr, w_set);
-
- r_val = __HDFGRTR_EL2_nMASK;
- r_val |= r_set;
- r_val &= ~r_clr;
-
- w_val = __HDFGWTR_EL2_nMASK;
- w_val |= w_set;
- w_val &= ~w_clr;
-
- write_sysreg_s(r_val, SYS_HDFGRTR_EL2);
- write_sysreg_s(w_val, SYS_HDFGWTR_EL2);
+ if (cpu_has_amu())
+ update_fgt_traps(vcpu, HAFGRTR_EL2);
}
static inline void __deactivate_traps_hfgxtr(struct kvm_vcpu *vcpu)
write_sysreg_s(ctxt_sys_reg(hctxt, HFGITR_EL2), SYS_HFGITR_EL2);
write_sysreg_s(ctxt_sys_reg(hctxt, HDFGRTR_EL2), SYS_HDFGRTR_EL2);
write_sysreg_s(ctxt_sys_reg(hctxt, HDFGWTR_EL2), SYS_HDFGWTR_EL2);
+
+ if (cpu_has_amu())
+ write_sysreg_s(ctxt_sys_reg(hctxt, HAFGRTR_EL2), SYS_HAFGRTR_EL2);
}
static inline void __activate_traps_common(struct kvm_vcpu *vcpu)
if (static_branch_unlikely(&vgic_v2_cpuif_trap)) {
bool valid;
- valid = kvm_vcpu_trap_get_fault_type(vcpu) == ESR_ELx_FSC_FAULT &&
+ valid = kvm_vcpu_trap_is_translation_fault(vcpu) &&
kvm_vcpu_dabt_isvalid(vcpu) &&
!kvm_vcpu_abt_issea(vcpu) &&
!kvm_vcpu_abt_iss1tw(vcpu);
ARM64_FEATURE_MASK(ID_AA64PFR1_EL1_SSBS) \
)
+#define PVM_ID_AA64PFR2_ALLOW 0ULL
+
/*
* Allow for protected VMs:
* - Mixed-endian
* - Privileged Access Never
* - SError interrupt exceptions from speculative reads
* - Enhanced Translation Synchronization
+ * - Control for cache maintenance permission
*/
#define PVM_ID_AA64MMFR1_ALLOW (\
ARM64_FEATURE_MASK(ID_AA64MMFR1_EL1_HAFDBS) | \
ARM64_FEATURE_MASK(ID_AA64MMFR1_EL1_HPDS) | \
ARM64_FEATURE_MASK(ID_AA64MMFR1_EL1_PAN) | \
ARM64_FEATURE_MASK(ID_AA64MMFR1_EL1_SpecSEI) | \
- ARM64_FEATURE_MASK(ID_AA64MMFR1_EL1_ETS) \
+ ARM64_FEATURE_MASK(ID_AA64MMFR1_EL1_ETS) | \
+ ARM64_FEATURE_MASK(ID_AA64MMFR1_EL1_CMOW) \
)
/*
ARM64_FEATURE_MASK(ID_AA64MMFR2_EL1_E0PD) \
)
+#define PVM_ID_AA64MMFR3_ALLOW (0ULL)
+
/*
* No support for Scalable Vectors for protected VMs:
* Requires additional support from KVM, e.g., context-switching and
ARM64_FEATURE_MASK(ID_AA64ISAR0_EL1_RNDR) \
)
+/* Restrict pointer authentication to the basic version. */
+#define PVM_ID_AA64ISAR1_RESTRICT_UNSIGNED (\
+ FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64ISAR1_EL1_APA), ID_AA64ISAR1_EL1_APA_PAuth) | \
+ FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64ISAR1_EL1_API), ID_AA64ISAR1_EL1_API_PAuth) \
+ )
+
+#define PVM_ID_AA64ISAR2_RESTRICT_UNSIGNED (\
+ FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64ISAR2_EL1_APA3), ID_AA64ISAR2_EL1_APA3_PAuth) \
+ )
+
#define PVM_ID_AA64ISAR1_ALLOW (\
ARM64_FEATURE_MASK(ID_AA64ISAR1_EL1_DPB) | \
- ARM64_FEATURE_MASK(ID_AA64ISAR1_EL1_APA) | \
- ARM64_FEATURE_MASK(ID_AA64ISAR1_EL1_API) | \
ARM64_FEATURE_MASK(ID_AA64ISAR1_EL1_JSCVT) | \
ARM64_FEATURE_MASK(ID_AA64ISAR1_EL1_FCMA) | \
ARM64_FEATURE_MASK(ID_AA64ISAR1_EL1_LRCPC) | \
)
#define PVM_ID_AA64ISAR2_ALLOW (\
+ ARM64_FEATURE_MASK(ID_AA64ISAR2_EL1_ATS1A)| \
ARM64_FEATURE_MASK(ID_AA64ISAR2_EL1_GPA3) | \
- ARM64_FEATURE_MASK(ID_AA64ISAR2_EL1_APA3) | \
ARM64_FEATURE_MASK(ID_AA64ISAR2_EL1_MOPS) \
)
alternative_else_nop_endif
msr ttbr0_el2, x2
- /*
- * Set the PS bits in TCR_EL2.
- */
ldr x0, [x0, #NVHE_INIT_TCR_EL2]
- tcr_compute_pa_size x0, #TCR_EL2_PS_SHIFT, x1, x2
msr tcr_el2, x0
isb
mov sp, x0
/* And turn the MMU back on! */
+ dsb nsh
+ isb
set_sctlr_el2 x2
ret x1
SYM_FUNC_END(__pkvm_init_switch_pgd)
hyp_put_page(&host_s2_pool, addr);
}
-static void host_s2_free_unlinked_table(void *addr, u32 level)
+static void host_s2_free_unlinked_table(void *addr, s8 level)
{
kvm_pgtable_stage2_free_unlinked(&host_mmu.mm_ops, addr, level);
}
{
struct kvm_mem_range cur;
kvm_pte_t pte;
- u32 level;
+ s8 level;
int ret;
hyp_assert_lock_held(&host_mmu.lock);
cur.start = ALIGN_DOWN(addr, granule);
cur.end = cur.start + granule;
level++;
- } while ((level < KVM_PGTABLE_MAX_LEVELS) &&
+ } while ((level <= KVM_PGTABLE_LAST_LEVEL) &&
!(kvm_level_supports_block_mapping(level) &&
range_included(&cur, range)));
* https://lore.kernel.org/kvm/20221017115209.2099-1-will@kernel.org/T/#mf10dfbaf1eaef9274c581b81c53758918c1d0f03
*/
dsb(ishst);
- __tlbi_level(vale2is, __TLBI_VADDR(addr, 0), (KVM_PGTABLE_MAX_LEVELS - 1));
+ __tlbi_level(vale2is, __TLBI_VADDR(addr, 0), KVM_PGTABLE_LAST_LEVEL);
dsb(ish);
isb();
}
{
struct hyp_fixmap_slot *slot = per_cpu_ptr(&fixmap_slots, (u64)ctx->arg);
- if (!kvm_pte_valid(ctx->old) || ctx->level != KVM_PGTABLE_MAX_LEVELS - 1)
+ if (!kvm_pte_valid(ctx->old) || ctx->level != KVM_PGTABLE_LAST_LEVEL)
return -EINVAL;
slot->addr = ctx->addr;
cptr_set |= CPTR_EL2_TTA;
}
+ /* Trap External Trace */
+ if (!FIELD_GET(ARM64_FEATURE_MASK(ID_AA64DFR0_EL1_ExtTrcBuff), feature_ids))
+ mdcr_clear |= MDCR_EL2_E2TB_MASK << MDCR_EL2_E2TB_SHIFT;
+
vcpu->arch.mdcr_el2 |= mdcr_set;
vcpu->arch.mdcr_el2 &= ~mdcr_clear;
vcpu->arch.cptr_el2 |= cptr_set;
if (!kvm_pte_valid(ctx->old))
return 0;
- if (ctx->level != (KVM_PGTABLE_MAX_LEVELS - 1))
+ if (ctx->level != KVM_PGTABLE_LAST_LEVEL)
return -EINVAL;
phys = kvm_pte_to_phys(ctx->old);
static bool kvm_phys_is_valid(u64 phys)
{
- return phys < BIT(id_aa64mmfr0_parange_to_phys_shift(ID_AA64MMFR0_EL1_PARANGE_MAX));
+ u64 parange_max = kvm_get_parange_max();
+ u8 shift = id_aa64mmfr0_parange_to_phys_shift(parange_max);
+
+ return phys < BIT(shift);
}
static bool kvm_block_mapping_supported(const struct kvm_pgtable_visit_ctx *ctx, u64 phys)
return IS_ALIGNED(ctx->addr, granule);
}
-static u32 kvm_pgtable_idx(struct kvm_pgtable_walk_data *data, u32 level)
+static u32 kvm_pgtable_idx(struct kvm_pgtable_walk_data *data, s8 level)
{
u64 shift = kvm_granule_shift(level);
u64 mask = BIT(PAGE_SHIFT - 3) - 1;
return (addr & mask) >> shift;
}
-static u32 kvm_pgd_pages(u32 ia_bits, u32 start_level)
+static u32 kvm_pgd_pages(u32 ia_bits, s8 start_level)
{
struct kvm_pgtable pgt = {
.ia_bits = ia_bits,
return kvm_pgd_page_idx(&pgt, -1ULL) + 1;
}
-static bool kvm_pte_table(kvm_pte_t pte, u32 level)
+static bool kvm_pte_table(kvm_pte_t pte, s8 level)
{
- if (level == KVM_PGTABLE_MAX_LEVELS - 1)
+ if (level == KVM_PGTABLE_LAST_LEVEL)
return false;
if (!kvm_pte_valid(pte))
return pte;
}
-static kvm_pte_t kvm_init_valid_leaf_pte(u64 pa, kvm_pte_t attr, u32 level)
+static kvm_pte_t kvm_init_valid_leaf_pte(u64 pa, kvm_pte_t attr, s8 level)
{
kvm_pte_t pte = kvm_phys_to_pte(pa);
- u64 type = (level == KVM_PGTABLE_MAX_LEVELS - 1) ? KVM_PTE_TYPE_PAGE :
- KVM_PTE_TYPE_BLOCK;
+ u64 type = (level == KVM_PGTABLE_LAST_LEVEL) ? KVM_PTE_TYPE_PAGE :
+ KVM_PTE_TYPE_BLOCK;
pte |= attr & (KVM_PTE_LEAF_ATTR_LO | KVM_PTE_LEAF_ATTR_HI);
pte |= FIELD_PREP(KVM_PTE_TYPE, type);
}
static int __kvm_pgtable_walk(struct kvm_pgtable_walk_data *data,
- struct kvm_pgtable_mm_ops *mm_ops, kvm_pteref_t pgtable, u32 level);
+ struct kvm_pgtable_mm_ops *mm_ops, kvm_pteref_t pgtable, s8 level);
static inline int __kvm_pgtable_visit(struct kvm_pgtable_walk_data *data,
struct kvm_pgtable_mm_ops *mm_ops,
- kvm_pteref_t pteref, u32 level)
+ kvm_pteref_t pteref, s8 level)
{
enum kvm_pgtable_walk_flags flags = data->walker->flags;
kvm_pte_t *ptep = kvm_dereference_pteref(data->walker, pteref);
}
static int __kvm_pgtable_walk(struct kvm_pgtable_walk_data *data,
- struct kvm_pgtable_mm_ops *mm_ops, kvm_pteref_t pgtable, u32 level)
+ struct kvm_pgtable_mm_ops *mm_ops, kvm_pteref_t pgtable, s8 level)
{
u32 idx;
int ret = 0;
- if (WARN_ON_ONCE(level >= KVM_PGTABLE_MAX_LEVELS))
+ if (WARN_ON_ONCE(level < KVM_PGTABLE_FIRST_LEVEL ||
+ level > KVM_PGTABLE_LAST_LEVEL))
return -EINVAL;
for (idx = kvm_pgtable_idx(data, level); idx < PTRS_PER_PTE; ++idx) {
struct leaf_walk_data {
kvm_pte_t pte;
- u32 level;
+ s8 level;
};
static int leaf_walker(const struct kvm_pgtable_visit_ctx *ctx,
}
int kvm_pgtable_get_leaf(struct kvm_pgtable *pgt, u64 addr,
- kvm_pte_t *ptep, u32 *level)
+ kvm_pte_t *ptep, s8 *level)
{
struct leaf_walk_data data;
struct kvm_pgtable_walker walker = {
}
attr |= FIELD_PREP(KVM_PTE_LEAF_ATTR_LO_S1_AP, ap);
- attr |= FIELD_PREP(KVM_PTE_LEAF_ATTR_LO_S1_SH, sh);
+ if (!kvm_lpa2_is_enabled())
+ attr |= FIELD_PREP(KVM_PTE_LEAF_ATTR_LO_S1_SH, sh);
attr |= KVM_PTE_LEAF_ATTR_LO_S1_AF;
attr |= prot & KVM_PTE_LEAF_ATTR_HI_SW;
*ptep = attr;
if (hyp_map_walker_try_leaf(ctx, data))
return 0;
- if (WARN_ON(ctx->level == KVM_PGTABLE_MAX_LEVELS - 1))
+ if (WARN_ON(ctx->level == KVM_PGTABLE_LAST_LEVEL))
return -EINVAL;
childp = (kvm_pte_t *)mm_ops->zalloc_page(NULL);
int kvm_pgtable_hyp_init(struct kvm_pgtable *pgt, u32 va_bits,
struct kvm_pgtable_mm_ops *mm_ops)
{
- u64 levels = ARM64_HW_PGTABLE_LEVELS(va_bits);
+ s8 start_level = KVM_PGTABLE_LAST_LEVEL + 1 -
+ ARM64_HW_PGTABLE_LEVELS(va_bits);
+
+ if (start_level < KVM_PGTABLE_FIRST_LEVEL ||
+ start_level > KVM_PGTABLE_LAST_LEVEL)
+ return -EINVAL;
pgt->pgd = (kvm_pteref_t)mm_ops->zalloc_page(NULL);
if (!pgt->pgd)
return -ENOMEM;
pgt->ia_bits = va_bits;
- pgt->start_level = KVM_PGTABLE_MAX_LEVELS - levels;
+ pgt->start_level = start_level;
pgt->mm_ops = mm_ops;
pgt->mmu = NULL;
pgt->force_pte_cb = NULL;
u64 kvm_get_vtcr(u64 mmfr0, u64 mmfr1, u32 phys_shift)
{
u64 vtcr = VTCR_EL2_FLAGS;
- u8 lvls;
+ s8 lvls;
vtcr |= kvm_get_parange(mmfr0) << VTCR_EL2_PS_SHIFT;
vtcr |= VTCR_EL2_T0SZ(phys_shift);
lvls = stage2_pgtable_levels(phys_shift);
if (lvls < 2)
lvls = 2;
+
+ /*
+ * When LPA2 is enabled, the HW supports an extra level of translation
+ * (for 5 in total) when using 4K pages. It also introduces VTCR_EL2.SL2
+ * to as an addition to SL0 to enable encoding this extra start level.
+ * However, since we always use concatenated pages for the first level
+ * lookup, we will never need this extra level and therefore do not need
+ * to touch SL2.
+ */
vtcr |= VTCR_EL2_LVLS_TO_SL0(lvls);
#ifdef CONFIG_ARM64_HW_AFDBM
vtcr |= VTCR_EL2_HA;
#endif /* CONFIG_ARM64_HW_AFDBM */
+ if (kvm_lpa2_is_enabled())
+ vtcr |= VTCR_EL2_DS;
+
/* Set the vmid bits */
vtcr |= (get_vmid_bits(mmfr1) == 16) ?
VTCR_EL2_VS_16BIT :
if (prot & KVM_PGTABLE_PROT_W)
attr |= KVM_PTE_LEAF_ATTR_LO_S2_S2AP_W;
- attr |= FIELD_PREP(KVM_PTE_LEAF_ATTR_LO_S2_SH, sh);
+ if (!kvm_lpa2_is_enabled())
+ attr |= FIELD_PREP(KVM_PTE_LEAF_ATTR_LO_S2_SH, sh);
+
attr |= KVM_PTE_LEAF_ATTR_LO_S2_AF;
attr |= prot & KVM_PTE_LEAF_ATTR_HI_SW;
*ptep = attr;
{
u64 phys = stage2_map_walker_phys_addr(ctx, data);
- if (data->force_pte && (ctx->level < (KVM_PGTABLE_MAX_LEVELS - 1)))
+ if (data->force_pte && ctx->level < KVM_PGTABLE_LAST_LEVEL)
return false;
return kvm_block_mapping_supported(ctx, phys);
if (ret != -E2BIG)
return ret;
- if (WARN_ON(ctx->level == KVM_PGTABLE_MAX_LEVELS - 1))
+ if (WARN_ON(ctx->level == KVM_PGTABLE_LAST_LEVEL))
return -EINVAL;
if (!data->memcache)
kvm_pte_t attr_set;
kvm_pte_t attr_clr;
kvm_pte_t pte;
- u32 level;
+ s8 level;
};
static int stage2_attr_walker(const struct kvm_pgtable_visit_ctx *ctx,
static int stage2_update_leaf_attrs(struct kvm_pgtable *pgt, u64 addr,
u64 size, kvm_pte_t attr_set,
kvm_pte_t attr_clr, kvm_pte_t *orig_pte,
- u32 *level, enum kvm_pgtable_walk_flags flags)
+ s8 *level, enum kvm_pgtable_walk_flags flags)
{
int ret;
kvm_pte_t attr_mask = KVM_PTE_LEAF_ATTR_LO | KVM_PTE_LEAF_ATTR_HI;
enum kvm_pgtable_prot prot)
{
int ret;
- u32 level;
+ s8 level;
kvm_pte_t set = 0, clr = 0;
if (prot & KVM_PTE_LEAF_ATTR_HI_SW)
}
kvm_pte_t *kvm_pgtable_stage2_create_unlinked(struct kvm_pgtable *pgt,
- u64 phys, u32 level,
+ u64 phys, s8 level,
enum kvm_pgtable_prot prot,
void *mc, bool force_pte)
{
* fully populated tree up to the PTE entries. Note that @level is
* interpreted as in "level @level entry".
*/
-static int stage2_block_get_nr_page_tables(u32 level)
+static int stage2_block_get_nr_page_tables(s8 level)
{
switch (level) {
case 1:
return 0;
default:
WARN_ON_ONCE(level < KVM_PGTABLE_MIN_BLOCK_LEVEL ||
- level >= KVM_PGTABLE_MAX_LEVELS);
+ level > KVM_PGTABLE_LAST_LEVEL);
return -EINVAL;
};
}
struct kvm_s2_mmu *mmu;
kvm_pte_t pte = ctx->old, new, *childp;
enum kvm_pgtable_prot prot;
- u32 level = ctx->level;
+ s8 level = ctx->level;
bool force_pte;
int nr_pages;
u64 phys;
/* No huge-pages exist at the last level */
- if (level == KVM_PGTABLE_MAX_LEVELS - 1)
+ if (level == KVM_PGTABLE_LAST_LEVEL)
return 0;
/* We only split valid block mappings */
u64 vtcr = mmu->vtcr;
u32 ia_bits = VTCR_EL2_IPA(vtcr);
u32 sl0 = FIELD_GET(VTCR_EL2_SL0_MASK, vtcr);
- u32 start_level = VTCR_EL2_TGRAN_SL0_BASE - sl0;
+ s8 start_level = VTCR_EL2_TGRAN_SL0_BASE - sl0;
pgd_sz = kvm_pgd_pages(ia_bits, start_level) * PAGE_SIZE;
pgt->pgd = (kvm_pteref_t)mm_ops->zalloc_pages_exact(pgd_sz);
{
u32 ia_bits = VTCR_EL2_IPA(vtcr);
u32 sl0 = FIELD_GET(VTCR_EL2_SL0_MASK, vtcr);
- u32 start_level = VTCR_EL2_TGRAN_SL0_BASE - sl0;
+ s8 start_level = VTCR_EL2_TGRAN_SL0_BASE - sl0;
return kvm_pgd_pages(ia_bits, start_level) * PAGE_SIZE;
}
pgt->pgd = NULL;
}
-void kvm_pgtable_stage2_free_unlinked(struct kvm_pgtable_mm_ops *mm_ops, void *pgtable, u32 level)
+void kvm_pgtable_stage2_free_unlinked(struct kvm_pgtable_mm_ops *mm_ops, void *pgtable, s8 level)
{
kvm_pteref_t ptep = (kvm_pteref_t)pgtable;
struct kvm_pgtable_walker walker = {
{
struct page *page = container_of(head, struct page, rcu_head);
void *pgtable = page_to_virt(page);
- u32 level = page_private(page);
+ s8 level = page_private(page);
kvm_pgtable_stage2_free_unlinked(&kvm_s2_mm_ops, pgtable, level);
}
-static void stage2_free_unlinked_table(void *addr, u32 level)
+static void stage2_free_unlinked_table(void *addr, s8 level)
{
struct page *page = virt_to_page(addr);
struct kvm_pgtable pgt = {
.pgd = (kvm_pteref_t)kvm->mm->pgd,
.ia_bits = vabits_actual,
- .start_level = (KVM_PGTABLE_MAX_LEVELS -
- CONFIG_PGTABLE_LEVELS),
+ .start_level = (KVM_PGTABLE_LAST_LEVEL -
+ CONFIG_PGTABLE_LEVELS + 1),
.mm_ops = &kvm_user_mm_ops,
};
unsigned long flags;
kvm_pte_t pte = 0; /* Keep GCC quiet... */
- u32 level = ~0;
+ s8 level = S8_MAX;
int ret;
/*
* Not seeing an error, but not updating level? Something went
* deeply wrong...
*/
- if (WARN_ON(level >= KVM_PGTABLE_MAX_LEVELS))
+ if (WARN_ON(level > KVM_PGTABLE_LAST_LEVEL))
+ return -EFAULT;
+ if (WARN_ON(level < KVM_PGTABLE_FIRST_LEVEL))
return -EFAULT;
/* Oops, the userspace PTs are gone... Replay the fault */
static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
struct kvm_memory_slot *memslot, unsigned long hva,
- unsigned long fault_status)
+ bool fault_is_perm)
{
int ret = 0;
bool write_fault, writable, force_pte = false;
gfn_t gfn;
kvm_pfn_t pfn;
bool logging_active = memslot_is_logging(memslot);
- unsigned long fault_level = kvm_vcpu_trap_get_fault_level(vcpu);
long vma_pagesize, fault_granule;
enum kvm_pgtable_prot prot = KVM_PGTABLE_PROT_R;
struct kvm_pgtable *pgt;
- fault_granule = 1UL << ARM64_HW_PGTABLE_LEVEL_SHIFT(fault_level);
+ if (fault_is_perm)
+ fault_granule = kvm_vcpu_trap_get_perm_fault_granule(vcpu);
write_fault = kvm_is_write_fault(vcpu);
exec_fault = kvm_vcpu_trap_is_exec_fault(vcpu);
VM_BUG_ON(write_fault && exec_fault);
- if (fault_status == ESR_ELx_FSC_PERM && !write_fault && !exec_fault) {
+ if (fault_is_perm && !write_fault && !exec_fault) {
kvm_err("Unexpected L2 read permission error\n");
return -EFAULT;
}
* only exception to this is when dirty logging is enabled at runtime
* and a write fault needs to collapse a block entry into a table.
*/
- if (fault_status != ESR_ELx_FSC_PERM ||
- (logging_active && write_fault)) {
+ if (!fault_is_perm || (logging_active && write_fault)) {
ret = kvm_mmu_topup_memory_cache(memcache,
kvm_mmu_cache_min_pages(vcpu->arch.hw_mmu));
if (ret)
* backed by a THP and thus use block mapping if possible.
*/
if (vma_pagesize == PAGE_SIZE && !(force_pte || device)) {
- if (fault_status == ESR_ELx_FSC_PERM &&
- fault_granule > PAGE_SIZE)
+ if (fault_is_perm && fault_granule > PAGE_SIZE)
vma_pagesize = fault_granule;
else
vma_pagesize = transparent_hugepage_adjust(kvm, memslot,
}
}
- if (fault_status != ESR_ELx_FSC_PERM && !device && kvm_has_mte(kvm)) {
+ if (!fault_is_perm && !device && kvm_has_mte(kvm)) {
/* Check the VMM hasn't introduced a new disallowed VMA */
if (mte_allowed) {
sanitise_mte_tags(kvm, pfn, vma_pagesize);
* permissions only if vma_pagesize equals fault_granule. Otherwise,
* kvm_pgtable_stage2_map() should be called to change block size.
*/
- if (fault_status == ESR_ELx_FSC_PERM && vma_pagesize == fault_granule)
+ if (fault_is_perm && vma_pagesize == fault_granule)
ret = kvm_pgtable_stage2_relax_perms(pgt, fault_ipa, prot);
else
ret = kvm_pgtable_stage2_map(pgt, fault_ipa, vma_pagesize,
*/
int kvm_handle_guest_abort(struct kvm_vcpu *vcpu)
{
- unsigned long fault_status;
+ unsigned long esr;
phys_addr_t fault_ipa;
struct kvm_memory_slot *memslot;
unsigned long hva;
gfn_t gfn;
int ret, idx;
- fault_status = kvm_vcpu_trap_get_fault_type(vcpu);
+ esr = kvm_vcpu_get_esr(vcpu);
fault_ipa = kvm_vcpu_get_fault_ipa(vcpu);
is_iabt = kvm_vcpu_trap_is_iabt(vcpu);
- if (fault_status == ESR_ELx_FSC_FAULT) {
+ if (esr_fsc_is_permission_fault(esr)) {
/* Beyond sanitised PARange (which is the IPA limit) */
if (fault_ipa >= BIT_ULL(get_kvm_ipa_limit())) {
kvm_inject_size_fault(vcpu);
kvm_vcpu_get_hfar(vcpu), fault_ipa);
/* Check the stage-2 fault is trans. fault or write fault */
- if (fault_status != ESR_ELx_FSC_FAULT &&
- fault_status != ESR_ELx_FSC_PERM &&
- fault_status != ESR_ELx_FSC_ACCESS) {
+ if (!esr_fsc_is_translation_fault(esr) &&
+ !esr_fsc_is_permission_fault(esr) &&
+ !esr_fsc_is_access_flag_fault(esr)) {
kvm_err("Unsupported FSC: EC=%#x xFSC=%#lx ESR_EL2=%#lx\n",
kvm_vcpu_trap_get_class(vcpu),
(unsigned long)kvm_vcpu_trap_get_fault(vcpu),
/* Userspace should not be able to register out-of-bounds IPAs */
VM_BUG_ON(fault_ipa >= kvm_phys_size(vcpu->arch.hw_mmu));
- if (fault_status == ESR_ELx_FSC_ACCESS) {
+ if (esr_fsc_is_access_flag_fault(esr)) {
handle_access_fault(vcpu, fault_ipa);
ret = 1;
goto out_unlock;
}
- ret = user_mem_abort(vcpu, fault_ipa, memslot, hva, fault_status);
+ ret = user_mem_abort(vcpu, fault_ipa, memslot, hva,
+ esr_fsc_is_permission_fault(esr));
if (ret == 0)
ret = 1;
out:
* This list should get updated as new features get added to the NV
* support, and new extension to the architecture.
*/
-void access_nested_id_reg(struct kvm_vcpu *v, struct sys_reg_params *p,
- const struct sys_reg_desc *r)
+static u64 limit_nv_id_reg(u32 id, u64 val)
{
- u32 id = reg_to_encoding(r);
- u64 val, tmp;
-
- val = p->regval;
+ u64 tmp;
switch (id) {
case SYS_ID_AA64ISAR0_EL1:
break;
}
- p->regval = val;
+ return val;
+}
+int kvm_init_nv_sysregs(struct kvm *kvm)
+{
+ mutex_lock(&kvm->arch.config_lock);
+
+ for (int i = 0; i < KVM_ARM_ID_REG_NUM; i++)
+ kvm->arch.id_regs[i] = limit_nv_id_reg(IDX_IDREG(i),
+ kvm->arch.id_regs[i]);
+
+ mutex_unlock(&kvm->arch.config_lock);
+
+ return 0;
}
parange = cpuid_feature_extract_unsigned_field(mmfr0,
ID_AA64MMFR0_EL1_PARANGE_SHIFT);
/*
- * IPA size beyond 48 bits could not be supported
- * on either 4K or 16K page size. Hence let's cap
- * it to 48 bits, in case it's reported as larger
- * on the system.
+ * IPA size beyond 48 bits for 4K and 16K page size is only supported
+ * when LPA2 is available. So if we have LPA2, enable it, else cap to 48
+ * bits, in case it's reported as larger on the system.
*/
- if (PAGE_SIZE != SZ_64K)
+ if (!kvm_lpa2_is_enabled() && PAGE_SIZE != SZ_64K)
parange = min(parange, (unsigned int)ID_AA64MMFR0_EL1_PARANGE_48);
/*
static int set_id_reg(struct kvm_vcpu *vcpu, const struct sys_reg_desc *rd,
u64 val);
-static bool read_from_write_only(struct kvm_vcpu *vcpu,
- struct sys_reg_params *params,
- const struct sys_reg_desc *r)
+static bool bad_trap(struct kvm_vcpu *vcpu,
+ struct sys_reg_params *params,
+ const struct sys_reg_desc *r,
+ const char *msg)
{
- WARN_ONCE(1, "Unexpected sys_reg read to write-only register\n");
+ WARN_ONCE(1, "Unexpected %s\n", msg);
print_sys_reg_instr(params);
kvm_inject_undefined(vcpu);
return false;
}
+static bool read_from_write_only(struct kvm_vcpu *vcpu,
+ struct sys_reg_params *params,
+ const struct sys_reg_desc *r)
+{
+ return bad_trap(vcpu, params, r,
+ "sys_reg read to write-only register");
+}
+
static bool write_to_read_only(struct kvm_vcpu *vcpu,
struct sys_reg_params *params,
const struct sys_reg_desc *r)
{
- WARN_ONCE(1, "Unexpected sys_reg write to read-only register\n");
- print_sys_reg_instr(params);
- kvm_inject_undefined(vcpu);
- return false;
+ return bad_trap(vcpu, params, r,
+ "sys_reg write to read-only register");
+}
+
+#define PURE_EL2_SYSREG(el2) \
+ case el2: { \
+ *el1r = el2; \
+ return true; \
+ }
+
+#define MAPPED_EL2_SYSREG(el2, el1, fn) \
+ case el2: { \
+ *xlate = fn; \
+ *el1r = el1; \
+ return true; \
+ }
+
+static bool get_el2_to_el1_mapping(unsigned int reg,
+ unsigned int *el1r, u64 (**xlate)(u64))
+{
+ switch (reg) {
+ PURE_EL2_SYSREG( VPIDR_EL2 );
+ PURE_EL2_SYSREG( VMPIDR_EL2 );
+ PURE_EL2_SYSREG( ACTLR_EL2 );
+ PURE_EL2_SYSREG( HCR_EL2 );
+ PURE_EL2_SYSREG( MDCR_EL2 );
+ PURE_EL2_SYSREG( HSTR_EL2 );
+ PURE_EL2_SYSREG( HACR_EL2 );
+ PURE_EL2_SYSREG( VTTBR_EL2 );
+ PURE_EL2_SYSREG( VTCR_EL2 );
+ PURE_EL2_SYSREG( RVBAR_EL2 );
+ PURE_EL2_SYSREG( TPIDR_EL2 );
+ PURE_EL2_SYSREG( HPFAR_EL2 );
+ PURE_EL2_SYSREG( CNTHCTL_EL2 );
+ MAPPED_EL2_SYSREG(SCTLR_EL2, SCTLR_EL1,
+ translate_sctlr_el2_to_sctlr_el1 );
+ MAPPED_EL2_SYSREG(CPTR_EL2, CPACR_EL1,
+ translate_cptr_el2_to_cpacr_el1 );
+ MAPPED_EL2_SYSREG(TTBR0_EL2, TTBR0_EL1,
+ translate_ttbr0_el2_to_ttbr0_el1 );
+ MAPPED_EL2_SYSREG(TTBR1_EL2, TTBR1_EL1, NULL );
+ MAPPED_EL2_SYSREG(TCR_EL2, TCR_EL1,
+ translate_tcr_el2_to_tcr_el1 );
+ MAPPED_EL2_SYSREG(VBAR_EL2, VBAR_EL1, NULL );
+ MAPPED_EL2_SYSREG(AFSR0_EL2, AFSR0_EL1, NULL );
+ MAPPED_EL2_SYSREG(AFSR1_EL2, AFSR1_EL1, NULL );
+ MAPPED_EL2_SYSREG(ESR_EL2, ESR_EL1, NULL );
+ MAPPED_EL2_SYSREG(FAR_EL2, FAR_EL1, NULL );
+ MAPPED_EL2_SYSREG(MAIR_EL2, MAIR_EL1, NULL );
+ MAPPED_EL2_SYSREG(AMAIR_EL2, AMAIR_EL1, NULL );
+ MAPPED_EL2_SYSREG(ELR_EL2, ELR_EL1, NULL );
+ MAPPED_EL2_SYSREG(SPSR_EL2, SPSR_EL1, NULL );
+ default:
+ return false;
+ }
}
u64 vcpu_read_sys_reg(const struct kvm_vcpu *vcpu, int reg)
{
u64 val = 0x8badf00d8badf00d;
+ u64 (*xlate)(u64) = NULL;
+ unsigned int el1r;
+
+ if (!vcpu_get_flag(vcpu, SYSREGS_ON_CPU))
+ goto memory_read;
- if (vcpu_get_flag(vcpu, SYSREGS_ON_CPU) &&
- __vcpu_read_sys_reg_from_cpu(reg, &val))
+ if (unlikely(get_el2_to_el1_mapping(reg, &el1r, &xlate))) {
+ if (!is_hyp_ctxt(vcpu))
+ goto memory_read;
+
+ /*
+ * If this register does not have an EL1 counterpart,
+ * then read the stored EL2 version.
+ */
+ if (reg == el1r)
+ goto memory_read;
+
+ /*
+ * If we have a non-VHE guest and that the sysreg
+ * requires translation to be used at EL1, use the
+ * in-memory copy instead.
+ */
+ if (!vcpu_el2_e2h_is_set(vcpu) && xlate)
+ goto memory_read;
+
+ /* Get the current version of the EL1 counterpart. */
+ WARN_ON(!__vcpu_read_sys_reg_from_cpu(el1r, &val));
return val;
+ }
+ /* EL1 register can't be on the CPU if the guest is in vEL2. */
+ if (unlikely(is_hyp_ctxt(vcpu)))
+ goto memory_read;
+
+ if (__vcpu_read_sys_reg_from_cpu(reg, &val))
+ return val;
+
+memory_read:
return __vcpu_sys_reg(vcpu, reg);
}
void vcpu_write_sys_reg(struct kvm_vcpu *vcpu, u64 val, int reg)
{
- if (vcpu_get_flag(vcpu, SYSREGS_ON_CPU) &&
- __vcpu_write_sys_reg_to_cpu(val, reg))
+ u64 (*xlate)(u64) = NULL;
+ unsigned int el1r;
+
+ if (!vcpu_get_flag(vcpu, SYSREGS_ON_CPU))
+ goto memory_write;
+
+ if (unlikely(get_el2_to_el1_mapping(reg, &el1r, &xlate))) {
+ if (!is_hyp_ctxt(vcpu))
+ goto memory_write;
+
+ /*
+ * Always store a copy of the write to memory to avoid having
+ * to reverse-translate virtual EL2 system registers for a
+ * non-VHE guest hypervisor.
+ */
+ __vcpu_sys_reg(vcpu, reg) = val;
+
+ /* No EL1 counterpart? We're done here.? */
+ if (reg == el1r)
+ return;
+
+ if (!vcpu_el2_e2h_is_set(vcpu) && xlate)
+ val = xlate(val);
+
+ /* Redirect this to the EL1 version of the register. */
+ WARN_ON(!__vcpu_write_sys_reg_to_cpu(val, el1r));
+ return;
+ }
+
+ /* EL1 register can't be on the CPU if the guest is in vEL2. */
+ if (unlikely(is_hyp_ctxt(vcpu)))
+ goto memory_write;
+
+ if (__vcpu_write_sys_reg_to_cpu(val, reg))
return;
- __vcpu_sys_reg(vcpu, reg) = val;
+memory_write:
+ __vcpu_sys_reg(vcpu, reg) = val;
}
/* CSSELR values; used to index KVM_REG_ARM_DEMUX_ID_CCSIDR */
return write_to_read_only(vcpu, p, r);
p->regval = read_id_reg(vcpu, r);
- if (vcpu_has_nv(vcpu))
- access_nested_id_reg(vcpu, p, r);
return true;
}
return REG_HIDDEN;
}
+static bool bad_vncr_trap(struct kvm_vcpu *vcpu,
+ struct sys_reg_params *p,
+ const struct sys_reg_desc *r)
+{
+ /*
+ * We really shouldn't be here, and this is likely the result
+ * of a misconfigured trap, as this register should target the
+ * VNCR page, and nothing else.
+ */
+ return bad_trap(vcpu, p, r,
+ "trap of VNCR-backed register");
+}
+
+static bool bad_redir_trap(struct kvm_vcpu *vcpu,
+ struct sys_reg_params *p,
+ const struct sys_reg_desc *r)
+{
+ /*
+ * We really shouldn't be here, and this is likely the result
+ * of a misconfigured trap, as this register should target the
+ * corresponding EL1, and nothing else.
+ */
+ return bad_trap(vcpu, p, r,
+ "trap of EL2 register redirected to EL1");
+}
+
#define EL2_REG(name, acc, rst, v) { \
SYS_DESC(SYS_##name), \
.access = acc, \
.val = v, \
}
+#define EL2_REG_VNCR(name, rst, v) EL2_REG(name, bad_vncr_trap, rst, v)
+#define EL2_REG_REDIR(name, rst, v) EL2_REG(name, bad_redir_trap, rst, v)
+
/*
* EL{0,1}2 registers are the EL2 view on an EL0 or EL1 register when
* HCR_EL2.E2H==1, and only in the sysreg table for convenience of
{ PMU_SYS_REG(PMCCFILTR_EL0), .access = access_pmu_evtyper,
.reset = reset_val, .reg = PMCCFILTR_EL0, .val = 0 },
- EL2_REG(VPIDR_EL2, access_rw, reset_unknown, 0),
- EL2_REG(VMPIDR_EL2, access_rw, reset_unknown, 0),
+ EL2_REG_VNCR(VPIDR_EL2, reset_unknown, 0),
+ EL2_REG_VNCR(VMPIDR_EL2, reset_unknown, 0),
EL2_REG(SCTLR_EL2, access_rw, reset_val, SCTLR_EL2_RES1),
EL2_REG(ACTLR_EL2, access_rw, reset_val, 0),
- EL2_REG(HCR_EL2, access_rw, reset_val, 0),
+ EL2_REG_VNCR(HCR_EL2, reset_val, 0),
EL2_REG(MDCR_EL2, access_rw, reset_val, 0),
EL2_REG(CPTR_EL2, access_rw, reset_val, CPTR_NVHE_EL2_RES1),
- EL2_REG(HSTR_EL2, access_rw, reset_val, 0),
- EL2_REG(HFGRTR_EL2, access_rw, reset_val, 0),
- EL2_REG(HFGWTR_EL2, access_rw, reset_val, 0),
- EL2_REG(HFGITR_EL2, access_rw, reset_val, 0),
- EL2_REG(HACR_EL2, access_rw, reset_val, 0),
+ EL2_REG_VNCR(HSTR_EL2, reset_val, 0),
+ EL2_REG_VNCR(HFGRTR_EL2, reset_val, 0),
+ EL2_REG_VNCR(HFGWTR_EL2, reset_val, 0),
+ EL2_REG_VNCR(HFGITR_EL2, reset_val, 0),
+ EL2_REG_VNCR(HACR_EL2, reset_val, 0),
- EL2_REG(HCRX_EL2, access_rw, reset_val, 0),
+ EL2_REG_VNCR(HCRX_EL2, reset_val, 0),
EL2_REG(TTBR0_EL2, access_rw, reset_val, 0),
EL2_REG(TTBR1_EL2, access_rw, reset_val, 0),
EL2_REG(TCR_EL2, access_rw, reset_val, TCR_EL2_RES1),
- EL2_REG(VTTBR_EL2, access_rw, reset_val, 0),
- EL2_REG(VTCR_EL2, access_rw, reset_val, 0),
+ EL2_REG_VNCR(VTTBR_EL2, reset_val, 0),
+ EL2_REG_VNCR(VTCR_EL2, reset_val, 0),
{ SYS_DESC(SYS_DACR32_EL2), trap_undef, reset_unknown, DACR32_EL2 },
- EL2_REG(HDFGRTR_EL2, access_rw, reset_val, 0),
- EL2_REG(HDFGWTR_EL2, access_rw, reset_val, 0),
- EL2_REG(SPSR_EL2, access_rw, reset_val, 0),
- EL2_REG(ELR_EL2, access_rw, reset_val, 0),
+ EL2_REG_VNCR(HDFGRTR_EL2, reset_val, 0),
+ EL2_REG_VNCR(HDFGWTR_EL2, reset_val, 0),
+ EL2_REG_VNCR(HAFGRTR_EL2, reset_val, 0),
+ EL2_REG_REDIR(SPSR_EL2, reset_val, 0),
+ EL2_REG_REDIR(ELR_EL2, reset_val, 0),
{ SYS_DESC(SYS_SP_EL1), access_sp_el1},
/* AArch32 SPSR_* are RES0 if trapped from a NV guest */
{ SYS_DESC(SYS_IFSR32_EL2), trap_undef, reset_unknown, IFSR32_EL2 },
EL2_REG(AFSR0_EL2, access_rw, reset_val, 0),
EL2_REG(AFSR1_EL2, access_rw, reset_val, 0),
- EL2_REG(ESR_EL2, access_rw, reset_val, 0),
+ EL2_REG_REDIR(ESR_EL2, reset_val, 0),
{ SYS_DESC(SYS_FPEXC32_EL2), trap_undef, reset_val, FPEXC32_EL2, 0x700 },
- EL2_REG(FAR_EL2, access_rw, reset_val, 0),
+ EL2_REG_REDIR(FAR_EL2, reset_val, 0),
EL2_REG(HPFAR_EL2, access_rw, reset_val, 0),
EL2_REG(MAIR_EL2, access_rw, reset_val, 0),
EL2_REG(CONTEXTIDR_EL2, access_rw, reset_val, 0),
EL2_REG(TPIDR_EL2, access_rw, reset_val, 0),
- EL2_REG(CNTVOFF_EL2, access_rw, reset_val, 0),
+ EL2_REG_VNCR(CNTVOFF_EL2, reset_val, 0),
EL2_REG(CNTHCTL_EL2, access_rw, reset_val, 0),
- EL12_REG(SCTLR, access_vm_reg, reset_val, 0x00C50078),
- EL12_REG(CPACR, access_rw, reset_val, 0),
- EL12_REG(TTBR0, access_vm_reg, reset_unknown, 0),
- EL12_REG(TTBR1, access_vm_reg, reset_unknown, 0),
- EL12_REG(TCR, access_vm_reg, reset_val, 0),
- { SYS_DESC(SYS_SPSR_EL12), access_spsr},
- { SYS_DESC(SYS_ELR_EL12), access_elr},
- EL12_REG(AFSR0, access_vm_reg, reset_unknown, 0),
- EL12_REG(AFSR1, access_vm_reg, reset_unknown, 0),
- EL12_REG(ESR, access_vm_reg, reset_unknown, 0),
- EL12_REG(FAR, access_vm_reg, reset_unknown, 0),
- EL12_REG(MAIR, access_vm_reg, reset_unknown, 0),
- EL12_REG(AMAIR, access_vm_reg, reset_amair_el1, 0),
- EL12_REG(VBAR, access_rw, reset_val, 0),
- EL12_REG(CONTEXTIDR, access_vm_reg, reset_val, 0),
EL12_REG(CNTKCTL, access_rw, reset_val, 0),
EL2_REG(SP_EL2, NULL, reset_unknown, 0),
vgic_v4_teardown(kvm);
}
-void kvm_vgic_vcpu_destroy(struct kvm_vcpu *vcpu)
+static void __kvm_vgic_vcpu_destroy(struct kvm_vcpu *vcpu)
{
struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
vgic_flush_pending_lpis(vcpu);
INIT_LIST_HEAD(&vgic_cpu->ap_list_head);
- vgic_cpu->rd_iodev.base_addr = VGIC_ADDR_UNDEF;
+ if (vcpu->kvm->arch.vgic.vgic_model == KVM_DEV_TYPE_ARM_VGIC_V3) {
+ vgic_unregister_redist_iodev(vcpu);
+ vgic_cpu->rd_iodev.base_addr = VGIC_ADDR_UNDEF;
+ }
}
-static void __kvm_vgic_destroy(struct kvm *kvm)
+void kvm_vgic_vcpu_destroy(struct kvm_vcpu *vcpu)
+{
+ struct kvm *kvm = vcpu->kvm;
+
+ mutex_lock(&kvm->slots_lock);
+ __kvm_vgic_vcpu_destroy(vcpu);
+ mutex_unlock(&kvm->slots_lock);
+}
+
+void kvm_vgic_destroy(struct kvm *kvm)
{
struct kvm_vcpu *vcpu;
unsigned long i;
- lockdep_assert_held(&kvm->arch.config_lock);
+ mutex_lock(&kvm->slots_lock);
vgic_debug_destroy(kvm);
kvm_for_each_vcpu(i, vcpu, kvm)
- kvm_vgic_vcpu_destroy(vcpu);
+ __kvm_vgic_vcpu_destroy(vcpu);
+
+ mutex_lock(&kvm->arch.config_lock);
kvm_vgic_dist_destroy(kvm);
-}
-void kvm_vgic_destroy(struct kvm *kvm)
-{
- mutex_lock(&kvm->arch.config_lock);
- __kvm_vgic_destroy(kvm);
mutex_unlock(&kvm->arch.config_lock);
+ mutex_unlock(&kvm->slots_lock);
}
/**
type = VGIC_V3;
}
- if (ret) {
- __kvm_vgic_destroy(kvm);
+ if (ret)
goto out;
- }
+
dist->ready = true;
dist_base = dist->vgic_dist_base;
mutex_unlock(&kvm->arch.config_lock);
ret = vgic_register_dist_iodev(kvm, dist_base, type);
- if (ret) {
+ if (ret)
kvm_err("Unable to register VGIC dist MMIO regions\n");
- kvm_vgic_destroy(kvm);
- }
- mutex_unlock(&kvm->slots_lock);
- return ret;
+ goto out_slots;
out:
mutex_unlock(&kvm->arch.config_lock);
+out_slots:
mutex_unlock(&kvm->slots_lock);
+
+ if (ret)
+ kvm_vgic_destroy(kvm);
+
return ret;
}
unsigned long flags;
raw_spin_lock_irqsave(&dist->lpi_list_lock, flags);
+
irq = __vgic_its_check_cache(dist, db, devid, eventid);
+ if (irq)
+ vgic_get_irq_kref(irq);
+
raw_spin_unlock_irqrestore(&dist->lpi_list_lock, flags);
return irq;
raw_spin_lock_irqsave(&irq->irq_lock, flags);
irq->pending_latch = true;
vgic_queue_irq_unlock(kvm, irq, flags);
+ vgic_put_irq(kvm, irq);
return 0;
}
gpa_t addr, unsigned int len,
unsigned long val)
{
- u32 intid = VGIC_ADDR_TO_INTID(addr, 1);
- int i;
- unsigned long flags;
-
- for (i = 0; i < len * 8; i++) {
- struct vgic_irq *irq = vgic_get_irq(vcpu->kvm, vcpu, intid + i);
-
- raw_spin_lock_irqsave(&irq->irq_lock, flags);
- if (test_bit(i, &val)) {
- /*
- * pending_latch is set irrespective of irq type
- * (level or edge) to avoid dependency that VM should
- * restore irq config before pending info.
- */
- irq->pending_latch = true;
- vgic_queue_irq_unlock(vcpu->kvm, irq, flags);
- } else {
- irq->pending_latch = false;
- raw_spin_unlock_irqrestore(&irq->irq_lock, flags);
- }
+ int ret;
- vgic_put_irq(vcpu->kvm, irq);
- }
+ ret = vgic_uaccess_write_spending(vcpu, addr, len, val);
+ if (ret)
+ return ret;
- return 0;
+ return vgic_uaccess_write_cpending(vcpu, addr, len, ~val);
}
/* We want to avoid outer shareable. */
return ret;
}
-static void vgic_unregister_redist_iodev(struct kvm_vcpu *vcpu)
+void vgic_unregister_redist_iodev(struct kvm_vcpu *vcpu)
{
struct vgic_io_device *rd_dev = &vcpu->arch.vgic_cpu.rd_iodev;
unsigned long c;
int ret = 0;
+ lockdep_assert_held(&kvm->slots_lock);
+
kvm_for_each_vcpu(c, vcpu, kvm) {
ret = vgic_register_redist_iodev(vcpu);
if (ret)
vcpu->kvm->arch.vgic.vgic_model == KVM_DEV_TYPE_ARM_VGIC_V2);
}
-void vgic_mmio_write_spending(struct kvm_vcpu *vcpu,
- gpa_t addr, unsigned int len,
- unsigned long val)
+static void __set_pending(struct kvm_vcpu *vcpu, gpa_t addr, unsigned int len,
+ unsigned long val, bool is_user)
{
u32 intid = VGIC_ADDR_TO_INTID(addr, 1);
int i;
for_each_set_bit(i, &val, len * 8) {
struct vgic_irq *irq = vgic_get_irq(vcpu->kvm, vcpu, intid + i);
- /* GICD_ISPENDR0 SGI bits are WI */
- if (is_vgic_v2_sgi(vcpu, irq)) {
+ /* GICD_ISPENDR0 SGI bits are WI when written from the guest. */
+ if (is_vgic_v2_sgi(vcpu, irq) && !is_user) {
vgic_put_irq(vcpu->kvm, irq);
continue;
}
raw_spin_lock_irqsave(&irq->irq_lock, flags);
+ /*
+ * GICv2 SGIs are terribly broken. We can't restore
+ * the source of the interrupt, so just pick the vcpu
+ * itself as the source...
+ */
+ if (is_vgic_v2_sgi(vcpu, irq))
+ irq->source |= BIT(vcpu->vcpu_id);
+
if (irq->hw && vgic_irq_is_sgi(irq->intid)) {
/* HW SGI? Ask the GIC to inject it */
int err;
}
irq->pending_latch = true;
- if (irq->hw)
+ if (irq->hw && !is_user)
vgic_irq_set_phys_active(irq, true);
vgic_queue_irq_unlock(vcpu->kvm, irq, flags);
}
}
+void vgic_mmio_write_spending(struct kvm_vcpu *vcpu,
+ gpa_t addr, unsigned int len,
+ unsigned long val)
+{
+ __set_pending(vcpu, addr, len, val, false);
+}
+
int vgic_uaccess_write_spending(struct kvm_vcpu *vcpu,
gpa_t addr, unsigned int len,
unsigned long val)
{
- u32 intid = VGIC_ADDR_TO_INTID(addr, 1);
- int i;
- unsigned long flags;
-
- for_each_set_bit(i, &val, len * 8) {
- struct vgic_irq *irq = vgic_get_irq(vcpu->kvm, vcpu, intid + i);
-
- raw_spin_lock_irqsave(&irq->irq_lock, flags);
- irq->pending_latch = true;
-
- /*
- * GICv2 SGIs are terribly broken. We can't restore
- * the source of the interrupt, so just pick the vcpu
- * itself as the source...
- */
- if (is_vgic_v2_sgi(vcpu, irq))
- irq->source |= BIT(vcpu->vcpu_id);
-
- vgic_queue_irq_unlock(vcpu->kvm, irq, flags);
-
- vgic_put_irq(vcpu->kvm, irq);
- }
-
+ __set_pending(vcpu, addr, len, val, true);
return 0;
}
vgic_irq_set_phys_active(irq, false);
}
-void vgic_mmio_write_cpending(struct kvm_vcpu *vcpu,
- gpa_t addr, unsigned int len,
- unsigned long val)
+static void __clear_pending(struct kvm_vcpu *vcpu,
+ gpa_t addr, unsigned int len,
+ unsigned long val, bool is_user)
{
u32 intid = VGIC_ADDR_TO_INTID(addr, 1);
int i;
for_each_set_bit(i, &val, len * 8) {
struct vgic_irq *irq = vgic_get_irq(vcpu->kvm, vcpu, intid + i);
- /* GICD_ICPENDR0 SGI bits are WI */
- if (is_vgic_v2_sgi(vcpu, irq)) {
+ /* GICD_ICPENDR0 SGI bits are WI when written from the guest. */
+ if (is_vgic_v2_sgi(vcpu, irq) && !is_user) {
vgic_put_irq(vcpu->kvm, irq);
continue;
}
raw_spin_lock_irqsave(&irq->irq_lock, flags);
+ /*
+ * More fun with GICv2 SGIs! If we're clearing one of them
+ * from userspace, which source vcpu to clear? Let's not
+ * even think of it, and blow the whole set.
+ */
+ if (is_vgic_v2_sgi(vcpu, irq))
+ irq->source = 0;
+
if (irq->hw && vgic_irq_is_sgi(irq->intid)) {
/* HW SGI? Ask the GIC to clear its pending bit */
int err;
continue;
}
- if (irq->hw)
+ if (irq->hw && !is_user)
vgic_hw_irq_cpending(vcpu, irq);
else
irq->pending_latch = false;
}
}
+void vgic_mmio_write_cpending(struct kvm_vcpu *vcpu,
+ gpa_t addr, unsigned int len,
+ unsigned long val)
+{
+ __clear_pending(vcpu, addr, len, val, false);
+}
+
int vgic_uaccess_write_cpending(struct kvm_vcpu *vcpu,
gpa_t addr, unsigned int len,
unsigned long val)
{
- u32 intid = VGIC_ADDR_TO_INTID(addr, 1);
- int i;
- unsigned long flags;
-
- for_each_set_bit(i, &val, len * 8) {
- struct vgic_irq *irq = vgic_get_irq(vcpu->kvm, vcpu, intid + i);
-
- raw_spin_lock_irqsave(&irq->irq_lock, flags);
- /*
- * More fun with GICv2 SGIs! If we're clearing one of them
- * from userspace, which source vcpu to clear? Let's not
- * even think of it, and blow the whole set.
- */
- if (is_vgic_v2_sgi(vcpu, irq))
- irq->source = 0;
-
- irq->pending_latch = false;
-
- raw_spin_unlock_irqrestore(&irq->irq_lock, flags);
-
- vgic_put_irq(vcpu->kvm, irq);
- }
-
+ __clear_pending(vcpu, addr, len, val, true);
return 0;
}
if (ret)
goto out;
+ /* Silently exit if the vLPI is already mapped */
+ if (irq->hw)
+ goto out;
+
/*
* Emit the mapping request. If it fails, the ITS probably
* isn't v4 compatible, so let's silently bail out. Holding
int vgic_v3_save_pending_tables(struct kvm *kvm);
int vgic_v3_set_redist_base(struct kvm *kvm, u32 index, u64 addr, u32 count);
int vgic_register_redist_iodev(struct kvm_vcpu *vcpu);
+void vgic_unregister_redist_iodev(struct kvm_vcpu *vcpu);
bool vgic_v3_check_base(struct kvm *kvm);
void vgic_v3_load(struct kvm_vcpu *vcpu);
*
* KFENCE pool requires page-granular mapping if initialized late.
*/
- return (rodata_enabled && rodata_full) || debug_pagealloc_enabled() ||
- arm64_kfence_can_set_direct_map();
+ return rodata_full || debug_pagealloc_enabled() ||
+ arm64_kfence_can_set_direct_map();
}
static int change_page_range(pte_t *ptep, unsigned long addr, void *data)
* If we are manipulating read-only permissions, apply the same
* change to the linear mapping of the pages that back this VM area.
*/
- if (rodata_enabled &&
- rodata_full && (pgprot_val(set_mask) == PTE_RDONLY ||
+ if (rodata_full && (pgprot_val(set_mask) == PTE_RDONLY ||
pgprot_val(clear_mask) == PTE_RDONLY)) {
for (i = 0; i < area->nr_pages; i++) {
__change_memory_common((u64)page_address(area->pages[i]),
HAS_GIC_PRIO_RELAXED_SYNC
HAS_HCX
HAS_LDAPR
+HAS_LPA2
HAS_LSE_ATOMICS
HAS_MOPS
HAS_NESTED_VIRT
EndEnum
EndSysreg
+Sysreg ID_AA64PFR2_EL1 3 0 0 4 2
+Res0 63:36
+UnsignedEnum 35:32 FPMR
+ 0b0000 NI
+ 0b0001 IMP
+EndEnum
+Res0 31:12
+UnsignedEnum 11:8 MTEFAR
+ 0b0000 NI
+ 0b0001 IMP
+EndEnum
+UnsignedEnum 7:4 MTESTOREONLY
+ 0b0000 NI
+ 0b0001 IMP
+EndEnum
+UnsignedEnum 3:0 MTEPERM
+ 0b0000 NI
+ 0b0001 IMP
+EndEnum
+EndSysreg
+
Sysreg ID_AA64ZFR0_EL1 3 0 0 4 4
Res0 63:60
UnsignedEnum 59:56 F64MM
0b0 NI
0b1 IMP
EndEnum
-Res0 62:60
+Res0 62:61
+UnsignedEnum 60 LUTv2
+ 0b0 NI
+ 0b1 IMP
+EndEnum
UnsignedEnum 59:56 SMEver
0b0000 SME
0b0001 SME2
0b0 NI
0b1 IMP
EndEnum
-Res0 41:40
+UnsignedEnum 41 F8F16
+ 0b0 NI
+ 0b1 IMP
+EndEnum
+UnsignedEnum 40 F8F32
+ 0b0 NI
+ 0b1 IMP
+EndEnum
UnsignedEnum 39:36 I8I32
0b0000 NI
0b1111 IMP
0b0 NI
0b1 IMP
EndEnum
-Res0 31:0
+Res0 31
+UnsignedEnum 30 SF8FMA
+ 0b0 NI
+ 0b1 IMP
+EndEnum
+UnsignedEnum 29 SF8DP4
+ 0b0 NI
+ 0b1 IMP
+EndEnum
+UnsignedEnum 28 SF8DP2
+ 0b0 NI
+ 0b1 IMP
+EndEnum
+Res0 27:0
+EndSysreg
+
+Sysreg ID_AA64FPFR0_EL1 3 0 0 4 7
+Res0 63:32
+UnsignedEnum 31 F8CVT
+ 0b0 NI
+ 0b1 IMP
+EndEnum
+UnsignedEnum 30 F8FMA
+ 0b0 NI
+ 0b1 IMP
+EndEnum
+UnsignedEnum 29 F8DP4
+ 0b0 NI
+ 0b1 IMP
+EndEnum
+UnsignedEnum 28 F8DP2
+ 0b0 NI
+ 0b1 IMP
+EndEnum
+Res0 27:2
+UnsignedEnum 1 F8E4M3
+ 0b0 NI
+ 0b1 IMP
+EndEnum
+UnsignedEnum 0 F8E5M2
+ 0b0 NI
+ 0b1 IMP
+EndEnum
EndSysreg
Sysreg ID_AA64DFR0_EL1 3 0 0 5 0
0b0000 UNPREDICTABLE
0b0001 DEF
EndEnum
-Res0 59:56
+UnsignedEnum 59:56 ExtTrcBuff
+ 0b0000 NI
+ 0b0001 IMP
+EndEnum
UnsignedEnum 55:52 BRBE
0b0000 NI
0b0001 IMP
0b0011 PAuth2
0b0100 FPAC
0b0101 FPACCOMBINE
+ 0b0110 PAuth_LR
EndEnum
UnsignedEnum 7:4 APA
0b0000 NI
0b0011 PAuth2
0b0100 FPAC
0b0101 FPACCOMBINE
+ 0b0110 PAuth_LR
EndEnum
UnsignedEnum 3:0 DPB
0b0000 NI
EndSysreg
Sysreg ID_AA64ISAR2_EL1 3 0 0 6 2
-Res0 63:56
+UnsignedEnum 63:60 ATS1A
+ 0b0000 NI
+ 0b0001 IMP
+EndEnum
+UnsignedEnum 59:56 LUT
+ 0b0000 NI
+ 0b0001 IMP
+EndEnum
UnsignedEnum 55:52 CSSC
0b0000 NI
0b0001 IMP
0b0000 NI
0b0001 IMP
EndEnum
-Res0 47:32
+Res0 47:44
+UnsignedEnum 43:40 PRFMSLC
+ 0b0000 NI
+ 0b0001 IMP
+EndEnum
+UnsignedEnum 39:36 SYSINSTR_128
+ 0b0000 NI
+ 0b0001 IMP
+EndEnum
+UnsignedEnum 35:32 SYSREG_128
+ 0b0000 NI
+ 0b0001 IMP
+EndEnum
UnsignedEnum 31:28 CLRBHB
0b0000 NI
0b0001 IMP
0b0011 PAuth2
0b0100 FPAC
0b0101 FPACCOMBINE
+ 0b0110 PAuth_LR
EndEnum
UnsignedEnum 11:8 GPA3
0b0000 NI
EndEnum
EndSysreg
+Sysreg ID_AA64ISAR3_EL1 3 0 0 6 3
+Res0 63:12
+UnsignedEnum 11:8 TLBIW
+ 0b0000 NI
+ 0b0001 IMP
+EndEnum
+UnsignedEnum 7:4 FAMINMAX
+ 0b0000 NI
+ 0b0001 IMP
+EndEnum
+UnsignedEnum 3:0 CPA
+ 0b0000 NI
+ 0b0001 IMP
+ 0b0010 CPA2
+EndEnum
+EndSysreg
+
Sysreg ID_AA64MMFR0_EL1 3 0 0 7 0
UnsignedEnum 63:60 ECV
0b0000 NI
Field 62 SPINTMASK
Field 61 NMI
Field 60 EnTP2
-Res0 59:58
+Field 59 TCSO
+Field 58 TCSO0
Field 57 EPAN
Field 56 EnALS
Field 55 EnAS0
Field 37 ITFSB
Field 36 BT1
Field 35 BT0
-Res0 34
+Field 34 EnFPM
Field 33 MSCEn
Field 32 CMOW
Field 31 EnIA
EndSysreg
SysregFields CPACR_ELx
-Res0 63:29
+Res0 63:30
+Field 29 E0POE
Field 28 TTA
Res0 27:26
Field 25:24 SMEN
Fields SMCR_ELx
EndSysreg
+SysregFields GCSCR_ELx
+Res0 63:10
+Field 9 STREn
+Field 8 PUSHMEn
+Res0 7
+Field 6 EXLOCKEN
+Field 5 RVCHKEN
+Res0 4:1
+Field 0 PCRSEL
+EndSysregFields
+
+Sysreg GCSCR_EL1 3 0 2 5 0
+Fields GCSCR_ELx
+EndSysreg
+
+SysregFields GCSPR_ELx
+Field 63:3 PTR
+Res0 2:0
+EndSysregFields
+
+Sysreg GCSPR_EL1 3 0 2 5 1
+Fields GCSPR_ELx
+EndSysreg
+
+Sysreg GCSCRE0_EL1 3 0 2 5 2
+Res0 63:11
+Field 10 nTR
+Field 9 STREn
+Field 8 PUSHMEn
+Res0 7:6
+Field 5 RVCHKEN
+Res0 4:1
+Field 0 PCRSEL
+EndSysreg
+
Sysreg ALLINT 3 0 4 3 0
Res0 63:14
Field 13 ALLINT
Fields CONTEXTIDR_ELx
EndSysreg
+Sysreg RCWSMASK_EL1 3 0 13 0 3
+Field 63:0 RCWSMASK
+EndSysreg
+
Sysreg TPIDR_EL1 3 0 13 0 4
Field 63:0 ThreadID
EndSysreg
+Sysreg RCWMASK_EL1 3 0 13 0 6
+Field 63:0 RCWMASK
+EndSysreg
+
Sysreg SCXTNUM_EL1 3 0 13 0 7
Field 63:0 SoftwareContextNumber
EndSysreg
Field 3:0 BS
EndSysreg
+Sysreg GCSPR_EL0 3 3 2 5 1
+Fields GCSPR_ELx
+EndSysreg
+
Sysreg SVCR 3 3 4 2 2
Res0 63:2
Field 1 ZA
Field 0 SM
EndSysreg
+Sysreg FPMR 3 3 4 4 2
+Res0 63:38
+Field 37:32 LSCALE2
+Field 31:24 NSCALE
+Res0 23
+Field 22:16 LSCALE
+Field 15 OSC
+Field 14 OSM
+Res0 13:9
+UnsignedEnum 8:6 F8D
+ 0b000 E5M2
+ 0b001 E4M3
+EndEnum
+UnsignedEnum 5:3 F8S2
+ 0b000 E5M2
+ 0b001 E4M3
+EndEnum
+UnsignedEnum 2:0 F8S1
+ 0b000 E5M2
+ 0b001 E4M3
+EndEnum
+EndSysreg
+
SysregFields HFGxTR_EL2
Field 63 nAMAIR2_EL1
Field 62 nMAIR2_EL1
EndSysreg
Sysreg HFGITR_EL2 3 4 1 1 6
-Res0 63:61
+Res0 63
+Field 62 ATS1E1A
+Res0 61
Field 60 COSPRCTX
Field 59 nGCSEPP
Field 58 nGCSSTR_EL1
Field 0 DBGBCRn_EL1
EndSysreg
+Sysreg HAFGRTR_EL2 3 4 3 1 6
+Res0 63:50
+Field 49 AMEVTYPER115_EL0
+Field 48 AMEVCNTR115_EL0
+Field 47 AMEVTYPER114_EL0
+Field 46 AMEVCNTR114_EL0
+Field 45 AMEVTYPER113_EL0
+Field 44 AMEVCNTR113_EL0
+Field 43 AMEVTYPER112_EL0
+Field 42 AMEVCNTR112_EL0
+Field 41 AMEVTYPER111_EL0
+Field 40 AMEVCNTR111_EL0
+Field 39 AMEVTYPER110_EL0
+Field 38 AMEVCNTR110_EL0
+Field 37 AMEVTYPER19_EL0
+Field 36 AMEVCNTR19_EL0
+Field 35 AMEVTYPER18_EL0
+Field 34 AMEVCNTR18_EL0
+Field 33 AMEVTYPER17_EL0
+Field 32 AMEVCNTR17_EL0
+Field 31 AMEVTYPER16_EL0
+Field 30 AMEVCNTR16_EL0
+Field 29 AMEVTYPER15_EL0
+Field 28 AMEVCNTR15_EL0
+Field 27 AMEVTYPER14_EL0
+Field 26 AMEVCNTR14_EL0
+Field 25 AMEVTYPER13_EL0
+Field 24 AMEVCNTR13_EL0
+Field 23 AMEVTYPER12_EL0
+Field 22 AMEVCNTR12_EL0
+Field 21 AMEVTYPER11_EL0
+Field 20 AMEVCNTR11_EL0
+Field 19 AMEVTYPER10_EL0
+Field 18 AMEVCNTR10_EL0
+Field 17 AMCNTEN1
+Res0 16:5
+Field 4 AMEVCNTR03_EL0
+Field 3 AMEVCNTR02_EL0
+Field 2 AMEVCNTR01_EL0
+Field 1 AMEVCNTR00_EL0
+Field 0 AMCNTEN0
+EndSysreg
+
Sysreg ZCR_EL2 3 4 1 2 0
Fields ZCR_ELx
EndSysreg
Sysreg HCRX_EL2 3 4 1 2 2
-Res0 63:23
+Res0 63:25
+Field 24 PACMEn
+Field 23 EnFPM
Field 22 GCSEn
Field 21 EnIDCP128
Field 20 EnSDERR
Fields SMCR_ELx
EndSysreg
+Sysreg GCSCR_EL2 3 4 2 5 0
+Fields GCSCR_ELx
+EndSysreg
+
+Sysreg GCSPR_EL2 3 4 2 5 1
+Fields GCSPR_ELx
+EndSysreg
+
Sysreg DACR32_EL2 3 4 3 0 0
Res0 63:32
Field 31:30 D15
Fields SMCR_ELx
EndSysreg
+Sysreg GCSCR_EL12 3 5 2 5 0
+Fields GCSCR_ELx
+EndSysreg
+
+Sysreg GCSPR_EL12 3 5 2 5 1
+Fields GCSPR_ELx
+EndSysreg
+
Sysreg FAR_EL12 3 5 6 0 0
Field 63:0 ADDR
EndSysreg
Field 0 PnCH
EndSysreg
+SysregFields MAIR2_ELx
+Field 63:56 Attr7
+Field 55:48 Attr6
+Field 47:40 Attr5
+Field 39:32 Attr4
+Field 31:24 Attr3
+Field 23:16 Attr2
+Field 15:8 Attr1
+Field 7:0 Attr0
+EndSysregFields
+
+Sysreg MAIR2_EL1 3 0 10 2 1
+Fields MAIR2_ELx
+EndSysreg
+
+Sysreg MAIR2_EL2 3 4 10 1 1
+Fields MAIR2_ELx
+EndSysreg
+
+Sysreg AMAIR2_EL1 3 0 10 3 1
+Field 63:0 ImpDef
+EndSysreg
+
+Sysreg AMAIR2_EL2 3 4 10 3 1
+Field 63:0 ImpDef
+EndSysreg
+
SysregFields PIRx_ELx
Field 63:60 Perm15
Field 59:56 Perm14
Fields PIRx_ELx
EndSysreg
+Sysreg POR_EL0 3 3 10 2 4
+Fields PIRx_ELx
+EndSysreg
+
+Sysreg POR_EL1 3 0 10 2 4
+Fields PIRx_ELx
+EndSysreg
+
+Sysreg POR_EL12 3 5 10 2 4
+Fields PIRx_ELx
+EndSysreg
+
+Sysreg S2POR_EL1 3 0 10 2 5
+Fields PIRx_ELx
+EndSysreg
+
+Sysreg S2PIR_EL2 3 4 10 2 5
+Fields PIRx_ELx
+EndSysreg
+
Sysreg LORSA_EL1 3 0 10 4 0
Res0 63:52
Field 51:16 SA
ifdef CONFIG_AS_HAS_EXPLICIT_RELOCS
cflags-y += $(call cc-option,-mexplicit-relocs)
KBUILD_CFLAGS_KERNEL += $(call cc-option,-mdirect-extern-access)
+KBUILD_CFLAGS_KERNEL += $(call cc-option,-fdirect-access-external-data)
KBUILD_AFLAGS_MODULE += $(call cc-option,-fno-direct-access-external-data)
KBUILD_CFLAGS_MODULE += $(call cc-option,-fno-direct-access-external-data)
KBUILD_AFLAGS_MODULE += $(call cc-option,-mno-relax) $(call cc-option,-Wa$(comma)-mno-relax)
ifeq ($(CONFIG_RELOCATABLE),y)
KBUILD_CFLAGS_KERNEL += -fPIE
-LDFLAGS_vmlinux += -static -pie --no-dynamic-linker -z notext
+LDFLAGS_vmlinux += -static -pie --no-dynamic-linker -z notext $(call ld-option, --apply-dynamic-relocs)
endif
cflags-y += $(call cc-option, -mno-check-zero-division)
all: $(notdir $(KBUILD_IMAGE))
+vmlinuz.efi: vmlinux.efi
+
vmlinux.elf vmlinux.efi vmlinuz.efi: vmlinux
$(Q)$(MAKE) $(build)=$(boot) $(bootvars-y) $(boot)/$@
lu32i.d \reg, 0
lu52i.d \reg, \reg, 0
.pushsection ".la_abs", "aw", %progbits
- 768:
- .dword 768b-766b
+ .dword 766b
.dword \sym
.popsection
#endif
#define EFI_KIMG_PREFERRED_ADDRESS PHYSADDR(VMLINUX_LOAD_ADDRESS)
-unsigned long kernel_entry_address(void);
+unsigned long kernel_entry_address(unsigned long kernel_addr);
#endif /* _ASM_LOONGARCH_EFI_H */
#define ELF_PLAT_INIT(_r, load_addr) do { \
_r->regs[1] = _r->regs[2] = _r->regs[3] = _r->regs[4] = 0; \
_r->regs[5] = _r->regs[6] = _r->regs[7] = _r->regs[8] = 0; \
- _r->regs[9] = _r->regs[10] = _r->regs[11] = _r->regs[12] = 0; \
+ _r->regs[9] = _r->regs[10] /* syscall n */ = _r->regs[12] = 0; \
_r->regs[13] = _r->regs[14] = _r->regs[15] = _r->regs[16] = 0; \
_r->regs[17] = _r->regs[18] = _r->regs[19] = _r->regs[20] = 0; \
_r->regs[21] = _r->regs[22] = _r->regs[23] = _r->regs[24] = 0; \
u64 signal_exits;
};
+#define KVM_MEM_HUGEPAGE_CAPABLE (1UL << 0)
+#define KVM_MEM_HUGEPAGE_INCAPABLE (1UL << 1)
struct kvm_arch_memory_slot {
+ unsigned long flags;
};
struct kvm_context {
};
#define KVM_LARCH_FPU (0x1 << 0)
-#define KVM_LARCH_SWCSR_LATEST (0x1 << 1)
-#define KVM_LARCH_HWCSR_USABLE (0x1 << 2)
+#define KVM_LARCH_LSX (0x1 << 1)
+#define KVM_LARCH_LASX (0x1 << 2)
+#define KVM_LARCH_SWCSR_LATEST (0x1 << 3)
+#define KVM_LARCH_HWCSR_USABLE (0x1 << 4)
struct kvm_vcpu_arch {
/*
csr->csrs[reg] = val;
}
+static inline bool kvm_guest_has_fpu(struct kvm_vcpu_arch *arch)
+{
+ return arch->cpucfg[2] & CPUCFG2_FP;
+}
+
+static inline bool kvm_guest_has_lsx(struct kvm_vcpu_arch *arch)
+{
+ return arch->cpucfg[2] & CPUCFG2_LSX;
+}
+
+static inline bool kvm_guest_has_lasx(struct kvm_vcpu_arch *arch)
+{
+ return arch->cpucfg[2] & CPUCFG2_LASX;
+}
+
/* Debug: dump vcpu state */
int kvm_arch_vcpu_dump_regs(struct kvm_vcpu *vcpu);
void kvm_restore_fpu(struct loongarch_fpu *fpu);
void kvm_restore_fcsr(struct loongarch_fpu *fpu);
-void kvm_acquire_timer(struct kvm_vcpu *vcpu);
+#ifdef CONFIG_CPU_HAS_LSX
+int kvm_own_lsx(struct kvm_vcpu *vcpu);
+void kvm_save_lsx(struct loongarch_fpu *fpu);
+void kvm_restore_lsx(struct loongarch_fpu *fpu);
+#else
+static inline int kvm_own_lsx(struct kvm_vcpu *vcpu) { }
+static inline void kvm_save_lsx(struct loongarch_fpu *fpu) { }
+static inline void kvm_restore_lsx(struct loongarch_fpu *fpu) { }
+#endif
+
+#ifdef CONFIG_CPU_HAS_LASX
+int kvm_own_lasx(struct kvm_vcpu *vcpu);
+void kvm_save_lasx(struct loongarch_fpu *fpu);
+void kvm_restore_lasx(struct loongarch_fpu *fpu);
+#else
+static inline int kvm_own_lasx(struct kvm_vcpu *vcpu) { }
+static inline void kvm_save_lasx(struct loongarch_fpu *fpu) { }
+static inline void kvm_restore_lasx(struct loongarch_fpu *fpu) { }
+#endif
+
void kvm_init_timer(struct kvm_vcpu *vcpu, unsigned long hz);
void kvm_reset_timer(struct kvm_vcpu *vcpu);
void kvm_save_timer(struct kvm_vcpu *vcpu);
static __always_inline u64 drdtime(void)
{
- int rID = 0;
u64 val = 0;
__asm__ __volatile__(
- "rdtime.d %0, %1 \n\t"
- : "=r"(val), "=r"(rID)
+ "rdtime.d %0, $zero\n\t"
+ : "=r"(val)
:
);
return val;
switch (size) { \
case 4: \
__asm__ __volatile__( \
- "am"#asm_op".w" " %[ret], %[val], %[ptr] \n" \
+ "am"#asm_op".w" " %[ret], %[val], %[ptr] \n" \
: [ret] "=&r" (ret), [ptr] "+ZB"(*(u32 *)ptr) \
: [val] "r" (val)); \
break; \
case 8: \
__asm__ __volatile__( \
- "am"#asm_op".d" " %[ret], %[val], %[ptr] \n" \
+ "am"#asm_op".d" " %[ret], %[val], %[ptr] \n" \
: [ret] "=&r" (ret), [ptr] "+ZB"(*(u64 *)ptr) \
: [val] "r" (val)); \
break; \
PERCPU_OP(or, or, |)
#undef PERCPU_OP
-static __always_inline unsigned long __percpu_read(void *ptr, int size)
+static __always_inline unsigned long __percpu_read(void __percpu *ptr, int size)
{
unsigned long ret;
return ret;
}
-static __always_inline void __percpu_write(void *ptr, unsigned long val, int size)
+static __always_inline void __percpu_write(void __percpu *ptr, unsigned long val, int size)
{
switch (size) {
case 1:
}
}
-static __always_inline unsigned long __percpu_xchg(void *ptr, unsigned long val,
- int size)
+static __always_inline unsigned long __percpu_xchg(void *ptr, unsigned long val, int size)
{
switch (size) {
case 1:
#ifdef CONFIG_RELOCATABLE
struct rela_la_abs {
- long offset;
+ long pc;
long symvalue;
};
#define LOONGARCH_REG_64(TYPE, REG) (TYPE | KVM_REG_SIZE_U64 | (REG << LOONGARCH_REG_SHIFT))
#define KVM_IOC_CSRID(REG) LOONGARCH_REG_64(KVM_REG_LOONGARCH_CSR, REG)
#define KVM_IOC_CPUCFG(REG) LOONGARCH_REG_64(KVM_REG_LOONGARCH_CPUCFG, REG)
+#define KVM_LOONGARCH_VCPU_CPUCFG 0
struct kvm_debug_exit_arch {
};
obj-$(CONFIG_RELOCATABLE) += relocate.o
-obj-$(CONFIG_KEXEC) += machine_kexec.o relocate_kernel.o
+obj-$(CONFIG_KEXEC_CORE) += machine_kexec.o relocate_kernel.o
obj-$(CONFIG_CRASH_DUMP) += crash_dump.o
obj-$(CONFIG_UNWINDER_GUESS) += unwind_guess.o
lsx_restore_all_upper a0 t0 t1
jr ra
SYM_FUNC_END(_restore_lsx_upper)
+EXPORT_SYMBOL(_restore_lsx_upper)
SYM_FUNC_START(_init_lsx_upper)
lsx_init_all_upper t1
lasx_restore_all_upper a0 t0 t1
jr ra
SYM_FUNC_END(_restore_lasx_upper)
+EXPORT_SYMBOL(_restore_lasx_upper)
SYM_FUNC_START(_init_lasx_upper)
lasx_init_all_upper t1
for (p = begin; (void *)p < end; p++) {
long v = p->symvalue;
uint32_t lu12iw, ori, lu32id, lu52id;
- union loongarch_instruction *insn = (void *)p - p->offset;
+ union loongarch_instruction *insn = (void *)p->pc;
lu12iw = (v >> 12) & 0xfffff;
ori = v & 0xfff;
return hash;
}
+static int __init nokaslr(char *p)
+{
+ pr_info("KASLR is disabled.\n");
+
+ return 0; /* Print a notice and silence the boot warning */
+}
+early_param("nokaslr", nokaslr);
+
static inline __init bool kaslr_disabled(void)
{
char *str;
}
for (unwind_start(&state, task, regs);
- !unwind_done(&state) && !unwind_error(&state); unwind_next_frame(&state)) {
+ !unwind_done(&state); unwind_next_frame(&state)) {
addr = unwind_get_return_address(&state);
if (!addr || !consume_entry(cookie, addr))
break;
return 0;
}
-static int constant_set_state_oneshot_stopped(struct clock_event_device *evt)
+static int constant_set_state_periodic(struct clock_event_device *evt)
{
+ unsigned long period;
unsigned long timer_config;
raw_spin_lock(&state_lock);
- timer_config = csr_read64(LOONGARCH_CSR_TCFG);
- timer_config &= ~CSR_TCFG_EN;
+ period = const_clock_freq / HZ;
+ timer_config = period & CSR_TCFG_VAL;
+ timer_config |= (CSR_TCFG_PERIOD | CSR_TCFG_EN);
csr_write64(timer_config, LOONGARCH_CSR_TCFG);
raw_spin_unlock(&state_lock);
return 0;
}
-static int constant_set_state_periodic(struct clock_event_device *evt)
+static int constant_set_state_shutdown(struct clock_event_device *evt)
{
- unsigned long period;
unsigned long timer_config;
raw_spin_lock(&state_lock);
- period = const_clock_freq / HZ;
- timer_config = period & CSR_TCFG_VAL;
- timer_config |= (CSR_TCFG_PERIOD | CSR_TCFG_EN);
+ timer_config = csr_read64(LOONGARCH_CSR_TCFG);
+ timer_config &= ~CSR_TCFG_EN;
csr_write64(timer_config, LOONGARCH_CSR_TCFG);
raw_spin_unlock(&state_lock);
return 0;
}
-static int constant_set_state_shutdown(struct clock_event_device *evt)
-{
- return 0;
-}
-
static int constant_timer_next_event(unsigned long delta, struct clock_event_device *evt)
{
unsigned long timer_config;
cd->rating = 320;
cd->cpumask = cpumask_of(cpu);
cd->set_state_oneshot = constant_set_state_oneshot;
- cd->set_state_oneshot_stopped = constant_set_state_oneshot_stopped;
+ cd->set_state_oneshot_stopped = constant_set_state_shutdown;
cd->set_state_periodic = constant_set_state_periodic;
cd->set_state_shutdown = constant_set_state_shutdown;
cd->set_next_event = constant_timer_next_event;
} while (!get_stack_info(state->sp, state->task, info));
- state->error = true;
return false;
}
} while (!get_stack_info(state->sp, state->task, info));
out:
- state->error = true;
+ state->stack_info.type = STACK_TYPE_UNKNOWN;
return false;
}
depends on AS_HAS_LVZ_EXTENSION
depends on HAVE_KVM
select HAVE_KVM_DIRTY_RING_ACQ_REL
- select HAVE_KVM_EVENTFD
select HAVE_KVM_VCPU_ASYNC_IOCTL
+ select KVM_COMMON
select KVM_GENERIC_DIRTYLOG_READ_PROTECT
select KVM_GENERIC_HARDWARE_ENABLING
select KVM_GENERIC_MMU_NOTIFIER
select KVM_MMIO
select KVM_XFER_TO_GUEST_WORK
- select PREEMPT_NOTIFIERS
help
Support hosting virtualized guest machines using
hardware virtualization extensions. You will need
++vcpu->stat.idle_exits;
trace_kvm_exit_idle(vcpu, KVM_TRACE_EXIT_IDLE);
- if (!kvm_arch_vcpu_runnable(vcpu)) {
- /*
- * Switch to the software timer before halt-polling/blocking as
- * the guest's timer may be a break event for the vCPU, and the
- * hypervisor timer runs only when the CPU is in guest mode.
- * Switch before halt-polling so that KVM recognizes an expired
- * timer before blocking.
- */
- kvm_save_timer(vcpu);
- kvm_vcpu_block(vcpu);
- }
+ if (!kvm_arch_vcpu_runnable(vcpu))
+ kvm_vcpu_halt(vcpu);
return EMULATE_DONE;
}
{
struct kvm_run *run = vcpu->run;
+ if (!kvm_guest_has_fpu(&vcpu->arch)) {
+ kvm_queue_exception(vcpu, EXCCODE_INE, 0);
+ return RESUME_GUEST;
+ }
+
/*
* If guest FPU not present, the FPU operation should have been
* treated as a reserved instruction!
return RESUME_GUEST;
}
+/*
+ * kvm_handle_lsx_disabled() - Guest used LSX while disabled in root.
+ * @vcpu: Virtual CPU context.
+ *
+ * Handle when the guest attempts to use LSX when it is disabled in the root
+ * context.
+ */
+static int kvm_handle_lsx_disabled(struct kvm_vcpu *vcpu)
+{
+ if (kvm_own_lsx(vcpu))
+ kvm_queue_exception(vcpu, EXCCODE_INE, 0);
+
+ return RESUME_GUEST;
+}
+
+/*
+ * kvm_handle_lasx_disabled() - Guest used LASX while disabled in root.
+ * @vcpu: Virtual CPU context.
+ *
+ * Handle when the guest attempts to use LASX when it is disabled in the root
+ * context.
+ */
+static int kvm_handle_lasx_disabled(struct kvm_vcpu *vcpu)
+{
+ if (kvm_own_lasx(vcpu))
+ kvm_queue_exception(vcpu, EXCCODE_INE, 0);
+
+ return RESUME_GUEST;
+}
+
/*
* LoongArch KVM callback handling for unimplemented guest exiting
*/
[EXCCODE_TLBS] = kvm_handle_write_fault,
[EXCCODE_TLBM] = kvm_handle_write_fault,
[EXCCODE_FPDIS] = kvm_handle_fpu_disabled,
+ [EXCCODE_LSXDIS] = kvm_handle_lsx_disabled,
+ [EXCCODE_LASXDIS] = kvm_handle_lasx_disabled,
[EXCCODE_GSPR] = kvm_handle_gspr,
};
if (env & CSR_GCFG_MATC_ROOT)
gcfg |= CSR_GCFG_MATC_ROOT;
- gcfg |= CSR_GCFG_TIT;
write_csr_gcfg(gcfg);
kvm_flush_tlb_all();
#include <asm/tlb.h>
#include <asm/kvm_mmu.h>
+static inline bool kvm_hugepage_capable(struct kvm_memory_slot *slot)
+{
+ return slot->arch.flags & KVM_MEM_HUGEPAGE_CAPABLE;
+}
+
+static inline bool kvm_hugepage_incapable(struct kvm_memory_slot *slot)
+{
+ return slot->arch.flags & KVM_MEM_HUGEPAGE_INCAPABLE;
+}
+
static inline void kvm_ptw_prepare(struct kvm *kvm, kvm_ptw_ctx *ctx)
{
ctx->level = kvm->arch.root_level;
kvm_ptw_top(kvm->arch.pgd, start << PAGE_SHIFT, end << PAGE_SHIFT, &ctx);
}
+int kvm_arch_prepare_memory_region(struct kvm *kvm, const struct kvm_memory_slot *old,
+ struct kvm_memory_slot *new, enum kvm_mr_change change)
+{
+ gpa_t gpa_start;
+ hva_t hva_start;
+ size_t size, gpa_offset, hva_offset;
+
+ if ((change != KVM_MR_MOVE) && (change != KVM_MR_CREATE))
+ return 0;
+ /*
+ * Prevent userspace from creating a memory region outside of the
+ * VM GPA address space
+ */
+ if ((new->base_gfn + new->npages) > (kvm->arch.gpa_size >> PAGE_SHIFT))
+ return -ENOMEM;
+
+ new->arch.flags = 0;
+ size = new->npages * PAGE_SIZE;
+ gpa_start = new->base_gfn << PAGE_SHIFT;
+ hva_start = new->userspace_addr;
+ if (IS_ALIGNED(size, PMD_SIZE) && IS_ALIGNED(gpa_start, PMD_SIZE)
+ && IS_ALIGNED(hva_start, PMD_SIZE))
+ new->arch.flags |= KVM_MEM_HUGEPAGE_CAPABLE;
+ else {
+ /*
+ * Pages belonging to memslots that don't have the same
+ * alignment within a PMD for userspace and GPA cannot be
+ * mapped with PMD entries, because we'll end up mapping
+ * the wrong pages.
+ *
+ * Consider a layout like the following:
+ *
+ * memslot->userspace_addr:
+ * +-----+--------------------+--------------------+---+
+ * |abcde|fgh Stage-1 block | Stage-1 block tv|xyz|
+ * +-----+--------------------+--------------------+---+
+ *
+ * memslot->base_gfn << PAGE_SIZE:
+ * +---+--------------------+--------------------+-----+
+ * |abc|def Stage-2 block | Stage-2 block |tvxyz|
+ * +---+--------------------+--------------------+-----+
+ *
+ * If we create those stage-2 blocks, we'll end up with this
+ * incorrect mapping:
+ * d -> f
+ * e -> g
+ * f -> h
+ */
+ gpa_offset = gpa_start & (PMD_SIZE - 1);
+ hva_offset = hva_start & (PMD_SIZE - 1);
+ if (gpa_offset != hva_offset) {
+ new->arch.flags |= KVM_MEM_HUGEPAGE_INCAPABLE;
+ } else {
+ if (gpa_offset == 0)
+ gpa_offset = PMD_SIZE;
+ if ((size + gpa_offset) < (PMD_SIZE * 2))
+ new->arch.flags |= KVM_MEM_HUGEPAGE_INCAPABLE;
+ }
+ }
+
+ return 0;
+}
+
void kvm_arch_commit_memory_region(struct kvm *kvm,
struct kvm_memory_slot *old,
const struct kvm_memory_slot *new,
}
static bool fault_supports_huge_mapping(struct kvm_memory_slot *memslot,
- unsigned long hva, unsigned long map_size, bool write)
+ unsigned long hva, bool write)
{
- size_t size;
- gpa_t gpa_start;
- hva_t uaddr_start, uaddr_end;
+ hva_t start, end;
/* Disable dirty logging on HugePages */
if (kvm_slot_dirty_track_enabled(memslot) && write)
return false;
- size = memslot->npages * PAGE_SIZE;
- gpa_start = memslot->base_gfn << PAGE_SHIFT;
- uaddr_start = memslot->userspace_addr;
- uaddr_end = uaddr_start + size;
+ if (kvm_hugepage_capable(memslot))
+ return true;
- /*
- * Pages belonging to memslots that don't have the same alignment
- * within a PMD for userspace and GPA cannot be mapped with stage-2
- * PMD entries, because we'll end up mapping the wrong pages.
- *
- * Consider a layout like the following:
- *
- * memslot->userspace_addr:
- * +-----+--------------------+--------------------+---+
- * |abcde|fgh Stage-1 block | Stage-1 block tv|xyz|
- * +-----+--------------------+--------------------+---+
- *
- * memslot->base_gfn << PAGE_SIZE:
- * +---+--------------------+--------------------+-----+
- * |abc|def Stage-2 block | Stage-2 block |tvxyz|
- * +---+--------------------+--------------------+-----+
- *
- * If we create those stage-2 blocks, we'll end up with this incorrect
- * mapping:
- * d -> f
- * e -> g
- * f -> h
- */
- if ((gpa_start & (map_size - 1)) != (uaddr_start & (map_size - 1)))
+ if (kvm_hugepage_incapable(memslot))
return false;
+ start = memslot->userspace_addr;
+ end = start + memslot->npages * PAGE_SIZE;
+
/*
* Next, let's make sure we're not trying to map anything not covered
* by the memslot. This means we have to prohibit block size mappings
* userspace_addr or the base_gfn, as both are equally aligned (per
* the check above) and equally sized.
*/
- return (hva & ~(map_size - 1)) >= uaddr_start &&
- (hva & ~(map_size - 1)) + map_size <= uaddr_end;
+ return (hva >= ALIGN(start, PMD_SIZE)) && (hva < ALIGN_DOWN(end, PMD_SIZE));
}
/*
/* Disable dirty logging on HugePages */
level = 0;
- if (!fault_supports_huge_mapping(memslot, hva, PMD_SIZE, write)) {
+ if (!fault_supports_huge_mapping(memslot, hva, write)) {
level = 0;
} else {
level = host_pfn_mapping_level(kvm, gfn, memslot);
{
}
-int kvm_arch_prepare_memory_region(struct kvm *kvm, const struct kvm_memory_slot *old,
- struct kvm_memory_slot *new, enum kvm_mr_change change)
-{
- return 0;
-}
-
void kvm_arch_flush_remote_tlbs_memslot(struct kvm *kvm,
const struct kvm_memory_slot *memslot)
{
jr ra
SYM_FUNC_END(kvm_restore_fpu)
+#ifdef CONFIG_CPU_HAS_LSX
+SYM_FUNC_START(kvm_save_lsx)
+ fpu_save_csr a0 t1
+ fpu_save_cc a0 t1 t2
+ lsx_save_data a0 t1
+ jr ra
+SYM_FUNC_END(kvm_save_lsx)
+
+SYM_FUNC_START(kvm_restore_lsx)
+ lsx_restore_data a0 t1
+ fpu_restore_cc a0 t1 t2
+ fpu_restore_csr a0 t1 t2
+ jr ra
+SYM_FUNC_END(kvm_restore_lsx)
+#endif
+
+#ifdef CONFIG_CPU_HAS_LASX
+SYM_FUNC_START(kvm_save_lasx)
+ fpu_save_csr a0 t1
+ fpu_save_cc a0 t1 t2
+ lasx_save_data a0 t1
+ jr ra
+SYM_FUNC_END(kvm_save_lasx)
+
+SYM_FUNC_START(kvm_restore_lasx)
+ lasx_restore_data a0 t1
+ fpu_restore_cc a0 t1 t2
+ fpu_restore_csr a0 t1 t2
+ jr ra
+SYM_FUNC_END(kvm_restore_lasx)
+#endif
.section ".rodata"
SYM_DATA(kvm_exception_size, .quad kvm_exc_entry_end - kvm_exc_entry)
SYM_DATA(kvm_enter_guest_size, .quad kvm_enter_guest_end - kvm_enter_guest)
kvm_write_sw_gcsr(vcpu->arch.csr, LOONGARCH_CSR_TVAL, 0);
}
-/*
- * Restore hard timer state and enable guest to access timer registers
- * without trap, should be called with irq disabled
- */
-void kvm_acquire_timer(struct kvm_vcpu *vcpu)
-{
- unsigned long cfg;
-
- cfg = read_csr_gcfg();
- if (!(cfg & CSR_GCFG_TIT))
- return;
-
- /* Enable guest access to hard timer */
- write_csr_gcfg(cfg & ~CSR_GCFG_TIT);
-
- /*
- * Freeze the soft-timer and sync the guest stable timer with it. We do
- * this with interrupts disabled to avoid latency.
- */
- hrtimer_cancel(&vcpu->arch.swtimer);
-}
-
/*
* Restore soft timer state from saved context.
*/
void kvm_restore_timer(struct kvm_vcpu *vcpu)
{
- unsigned long cfg, delta, period;
+ unsigned long cfg, estat;
+ unsigned long ticks, delta, period;
ktime_t expire, now;
struct loongarch_csrs *csr = vcpu->arch.csr;
/*
* Set guest stable timer cfg csr
+ * Disable timer before restore estat CSR register, avoid to
+ * get invalid timer interrupt for old timer cfg
*/
cfg = kvm_read_sw_gcsr(csr, LOONGARCH_CSR_TCFG);
+
+ write_gcsr_timercfg(0);
kvm_restore_hw_gcsr(csr, LOONGARCH_CSR_ESTAT);
kvm_restore_hw_gcsr(csr, LOONGARCH_CSR_TCFG);
if (!(cfg & CSR_TCFG_EN)) {
return;
}
+ /*
+ * Freeze the soft-timer and sync the guest stable timer with it.
+ */
+ hrtimer_cancel(&vcpu->arch.swtimer);
+
+ /*
+ * From LoongArch Reference Manual Volume 1 Chapter 7.6.2
+ * If oneshot timer is fired, CSR TVAL will be -1, there are two
+ * conditions:
+ * 1) timer is fired during exiting to host
+ * 2) timer is fired and vm is doing timer irq, and then exiting to
+ * host. Host should not inject timer irq to avoid spurious
+ * timer interrupt again
+ */
+ ticks = kvm_read_sw_gcsr(csr, LOONGARCH_CSR_TVAL);
+ estat = kvm_read_sw_gcsr(csr, LOONGARCH_CSR_ESTAT);
+ if (!(cfg & CSR_TCFG_PERIOD) && (ticks > cfg)) {
+ /*
+ * Writing 0 to LOONGARCH_CSR_TVAL will inject timer irq
+ * and set CSR TVAL with -1
+ */
+ write_gcsr_timertick(0);
+
+ /*
+ * Writing CSR_TINTCLR_TI to LOONGARCH_CSR_TINTCLR will clear
+ * timer interrupt, and CSR TVAL keeps unchanged with -1, it
+ * avoids spurious timer interrupt
+ */
+ if (!(estat & CPU_TIMER))
+ gcsr_write(CSR_TINTCLR_TI, LOONGARCH_CSR_TINTCLR);
+ return;
+ }
+
/*
* Set remainder tick value if not expired
*/
+ delta = 0;
now = ktime_get();
expire = vcpu->arch.expire;
if (ktime_before(now, expire))
delta = ktime_to_tick(vcpu, ktime_sub(expire, now));
- else {
- if (cfg & CSR_TCFG_PERIOD) {
- period = cfg & CSR_TCFG_VAL;
- delta = ktime_to_tick(vcpu, ktime_sub(now, expire));
- delta = period - (delta % period);
- } else
- delta = 0;
+ else if (cfg & CSR_TCFG_PERIOD) {
+ period = cfg & CSR_TCFG_VAL;
+ delta = ktime_to_tick(vcpu, ktime_sub(now, expire));
+ delta = period - (delta % period);
+
/*
* Inject timer here though sw timer should inject timer
* interrupt async already, since sw timer may be cancelled
- * during injecting intr async in function kvm_acquire_timer
+ * during injecting intr async
*/
kvm_queue_irq(vcpu, INT_TI);
}
*/
static void _kvm_save_timer(struct kvm_vcpu *vcpu)
{
- unsigned long ticks, delta;
+ unsigned long ticks, delta, cfg;
ktime_t expire;
struct loongarch_csrs *csr = vcpu->arch.csr;
+ cfg = kvm_read_sw_gcsr(csr, LOONGARCH_CSR_TCFG);
ticks = kvm_read_sw_gcsr(csr, LOONGARCH_CSR_TVAL);
- delta = tick_to_ns(vcpu, ticks);
- expire = ktime_add_ns(ktime_get(), delta);
- vcpu->arch.expire = expire;
- if (ticks) {
+
+ /*
+ * From LoongArch Reference Manual Volume 1 Chapter 7.6.2
+ * If period timer is fired, CSR TVAL will be reloaded from CSR TCFG
+ * If oneshot timer is fired, CSR TVAL will be -1
+ * Here judge one-shot timer fired by checking whether TVAL is larger
+ * than TCFG
+ */
+ if (ticks < cfg) {
+ delta = tick_to_ns(vcpu, ticks);
+ expire = ktime_add_ns(ktime_get(), delta);
+ vcpu->arch.expire = expire;
+
/*
- * Update hrtimer to use new timeout
* HRTIMER_MODE_PINNED is suggested since vcpu may run in
* the same physical cpu in next time
*/
- hrtimer_cancel(&vcpu->arch.swtimer);
hrtimer_start(&vcpu->arch.swtimer, expire, HRTIMER_MODE_ABS_PINNED);
- } else
+ } else if (vcpu->stat.generic.blocking) {
/*
- * Inject timer interrupt so that hall polling can dectect and exit
+ * Inject timer interrupt so that halt polling can dectect and exit.
+ * VCPU is scheduled out already and sleeps in rcuwait queue and
+ * will not poll pending events again. kvm_queue_irq() is not enough,
+ * hrtimer swtimer should be used here.
*/
- kvm_queue_irq(vcpu, INT_TI);
+ expire = ktime_add_ns(ktime_get(), 10);
+ vcpu->arch.expire = expire;
+ hrtimer_start(&vcpu->arch.swtimer, expire, HRTIMER_MODE_ABS_PINNED);
+ }
}
/*
*/
void kvm_save_timer(struct kvm_vcpu *vcpu)
{
- unsigned long cfg;
struct loongarch_csrs *csr = vcpu->arch.csr;
preempt_disable();
- cfg = read_csr_gcfg();
- if (!(cfg & CSR_GCFG_TIT)) {
- /* Disable guest use of hard timer */
- write_csr_gcfg(cfg | CSR_GCFG_TIT);
-
- /* Save hard timer state */
- kvm_save_hw_gcsr(csr, LOONGARCH_CSR_TCFG);
- kvm_save_hw_gcsr(csr, LOONGARCH_CSR_TVAL);
- if (kvm_read_sw_gcsr(csr, LOONGARCH_CSR_TCFG) & CSR_TCFG_EN)
- _kvm_save_timer(vcpu);
- }
+
+ /* Save hard timer state */
+ kvm_save_hw_gcsr(csr, LOONGARCH_CSR_TCFG);
+ kvm_save_hw_gcsr(csr, LOONGARCH_CSR_TVAL);
+ if (kvm_read_sw_gcsr(csr, LOONGARCH_CSR_TCFG) & CSR_TCFG_EN)
+ _kvm_save_timer(vcpu);
/* Save timer-related state to vCPU context */
kvm_save_hw_gcsr(csr, LOONGARCH_CSR_ESTAT);
#define KVM_TRACE_AUX_DISCARD 4
#define KVM_TRACE_AUX_FPU 1
+#define KVM_TRACE_AUX_LSX 2
+#define KVM_TRACE_AUX_LASX 3
#define kvm_trace_symbol_aux_op \
{ KVM_TRACE_AUX_SAVE, "save" }, \
{ KVM_TRACE_AUX_DISCARD, "discard" }
#define kvm_trace_symbol_aux_state \
- { KVM_TRACE_AUX_FPU, "FPU" }
+ { KVM_TRACE_AUX_FPU, "FPU" }, \
+ { KVM_TRACE_AUX_LSX, "LSX" }, \
+ { KVM_TRACE_AUX_LASX, "LASX" }
TRACE_EVENT(kvm_aux,
TP_PROTO(struct kvm_vcpu *vcpu, unsigned int op,
* check vmid before vcpu enter guest
*/
local_irq_disable();
- kvm_acquire_timer(vcpu);
kvm_deliver_intr(vcpu);
kvm_deliver_exception(vcpu);
/* Make sure the vcpu mode has been written */
int kvm_cpu_has_pending_timer(struct kvm_vcpu *vcpu)
{
- return kvm_pending_timer(vcpu) ||
+ int ret;
+
+ /* Protect from TOD sync and vcpu_load/put() */
+ preempt_disable();
+ ret = kvm_pending_timer(vcpu) ||
kvm_read_hw_gcsr(LOONGARCH_CSR_ESTAT) & (1 << INT_TI);
+ preempt_enable();
+
+ return ret;
}
int kvm_arch_vcpu_dump_regs(struct kvm_vcpu *vcpu)
return -EINVAL;
}
-/**
- * kvm_migrate_count() - Migrate timer.
- * @vcpu: Virtual CPU.
- *
- * Migrate hrtimer to the current CPU by cancelling and restarting it
- * if the hrtimer is active.
- *
- * Must be called when the vCPU is migrated to a different CPU, so that
- * the timer can interrupt the guest at the new CPU, and the timer irq can
- * be delivered to the vCPU.
- */
-static void kvm_migrate_count(struct kvm_vcpu *vcpu)
-{
- if (hrtimer_cancel(&vcpu->arch.swtimer))
- hrtimer_restart(&vcpu->arch.swtimer);
-}
-
static int _kvm_getcsr(struct kvm_vcpu *vcpu, unsigned int id, u64 *val)
{
unsigned long gintc;
return ret;
}
+static int _kvm_get_cpucfg(int id, u64 *v)
+{
+ int ret = 0;
+
+ if (id < 0 && id >= KVM_MAX_CPUCFG_REGS)
+ return -EINVAL;
+
+ switch (id) {
+ case 2:
+ /* Return CPUCFG2 features which have been supported by KVM */
+ *v = CPUCFG2_FP | CPUCFG2_FPSP | CPUCFG2_FPDP |
+ CPUCFG2_FPVERS | CPUCFG2_LLFTP | CPUCFG2_LLFTPREV |
+ CPUCFG2_LAM;
+ /*
+ * If LSX is supported by CPU, it is also supported by KVM,
+ * as we implement it.
+ */
+ if (cpu_has_lsx)
+ *v |= CPUCFG2_LSX;
+ /*
+ * if LASX is supported by CPU, it is also supported by KVM,
+ * as we implement it.
+ */
+ if (cpu_has_lasx)
+ *v |= CPUCFG2_LASX;
+
+ break;
+ default:
+ ret = -EINVAL;
+ break;
+ }
+ return ret;
+}
+
+static int kvm_check_cpucfg(int id, u64 val)
+{
+ u64 mask;
+ int ret = 0;
+
+ if (id < 0 && id >= KVM_MAX_CPUCFG_REGS)
+ return -EINVAL;
+
+ if (_kvm_get_cpucfg(id, &mask))
+ return ret;
+
+ switch (id) {
+ case 2:
+ /* CPUCFG2 features checking */
+ if (val & ~mask)
+ /* The unsupported features should not be set */
+ ret = -EINVAL;
+ else if (!(val & CPUCFG2_LLFTP))
+ /* The LLFTP must be set, as guest must has a constant timer */
+ ret = -EINVAL;
+ else if ((val & CPUCFG2_FP) && (!(val & CPUCFG2_FPSP) || !(val & CPUCFG2_FPDP)))
+ /* Single and double float point must both be set when enable FP */
+ ret = -EINVAL;
+ else if ((val & CPUCFG2_LSX) && !(val & CPUCFG2_FP))
+ /* FP should be set when enable LSX */
+ ret = -EINVAL;
+ else if ((val & CPUCFG2_LASX) && !(val & CPUCFG2_LSX))
+ /* LSX, FP should be set when enable LASX, and FP has been checked before. */
+ ret = -EINVAL;
+ break;
+ default:
+ break;
+ }
+ return ret;
+}
+
static int kvm_get_one_reg(struct kvm_vcpu *vcpu,
const struct kvm_one_reg *reg, u64 *v)
{
break;
case KVM_REG_LOONGARCH_CPUCFG:
id = KVM_GET_IOC_CPUCFG_IDX(reg->id);
- if (id >= 0 && id < KVM_MAX_CPUCFG_REGS)
- vcpu->arch.cpucfg[id] = (u32)v;
- else
- ret = -EINVAL;
+ ret = kvm_check_cpucfg(id, v);
+ if (ret)
+ break;
+ vcpu->arch.cpucfg[id] = (u32)v;
break;
case KVM_REG_LOONGARCH_KVM:
switch (reg->id) {
return -EINVAL;
}
+static int kvm_loongarch_cpucfg_has_attr(struct kvm_vcpu *vcpu,
+ struct kvm_device_attr *attr)
+{
+ switch (attr->attr) {
+ case 2:
+ return 0;
+ default:
+ return -ENXIO;
+ }
+
+ return -ENXIO;
+}
+
+static int kvm_loongarch_vcpu_has_attr(struct kvm_vcpu *vcpu,
+ struct kvm_device_attr *attr)
+{
+ int ret = -ENXIO;
+
+ switch (attr->group) {
+ case KVM_LOONGARCH_VCPU_CPUCFG:
+ ret = kvm_loongarch_cpucfg_has_attr(vcpu, attr);
+ break;
+ default:
+ break;
+ }
+
+ return ret;
+}
+
+static int kvm_loongarch_get_cpucfg_attr(struct kvm_vcpu *vcpu,
+ struct kvm_device_attr *attr)
+{
+ int ret = 0;
+ uint64_t val;
+ uint64_t __user *uaddr = (uint64_t __user *)attr->addr;
+
+ ret = _kvm_get_cpucfg(attr->attr, &val);
+ if (ret)
+ return ret;
+
+ put_user(val, uaddr);
+
+ return ret;
+}
+
+static int kvm_loongarch_vcpu_get_attr(struct kvm_vcpu *vcpu,
+ struct kvm_device_attr *attr)
+{
+ int ret = -ENXIO;
+
+ switch (attr->group) {
+ case KVM_LOONGARCH_VCPU_CPUCFG:
+ ret = kvm_loongarch_get_cpucfg_attr(vcpu, attr);
+ break;
+ default:
+ break;
+ }
+
+ return ret;
+}
+
+static int kvm_loongarch_cpucfg_set_attr(struct kvm_vcpu *vcpu,
+ struct kvm_device_attr *attr)
+{
+ return -ENXIO;
+}
+
+static int kvm_loongarch_vcpu_set_attr(struct kvm_vcpu *vcpu,
+ struct kvm_device_attr *attr)
+{
+ int ret = -ENXIO;
+
+ switch (attr->group) {
+ case KVM_LOONGARCH_VCPU_CPUCFG:
+ ret = kvm_loongarch_cpucfg_set_attr(vcpu, attr);
+ break;
+ default:
+ break;
+ }
+
+ return ret;
+}
+
long kvm_arch_vcpu_ioctl(struct file *filp,
unsigned int ioctl, unsigned long arg)
{
long r;
+ struct kvm_device_attr attr;
void __user *argp = (void __user *)arg;
struct kvm_vcpu *vcpu = filp->private_data;
r = kvm_vcpu_ioctl_enable_cap(vcpu, &cap);
break;
}
+ case KVM_HAS_DEVICE_ATTR: {
+ r = -EFAULT;
+ if (copy_from_user(&attr, argp, sizeof(attr)))
+ break;
+ r = kvm_loongarch_vcpu_has_attr(vcpu, &attr);
+ break;
+ }
+ case KVM_GET_DEVICE_ATTR: {
+ r = -EFAULT;
+ if (copy_from_user(&attr, argp, sizeof(attr)))
+ break;
+ r = kvm_loongarch_vcpu_get_attr(vcpu, &attr);
+ break;
+ }
+ case KVM_SET_DEVICE_ATTR: {
+ r = -EFAULT;
+ if (copy_from_user(&attr, argp, sizeof(attr)))
+ break;
+ r = kvm_loongarch_vcpu_set_attr(vcpu, &attr);
+ break;
+ }
default:
r = -ENOIOCTLCMD;
break;
preempt_enable();
}
+#ifdef CONFIG_CPU_HAS_LSX
+/* Enable LSX and restore context */
+int kvm_own_lsx(struct kvm_vcpu *vcpu)
+{
+ if (!kvm_guest_has_fpu(&vcpu->arch) || !kvm_guest_has_lsx(&vcpu->arch))
+ return -EINVAL;
+
+ preempt_disable();
+
+ /* Enable LSX for guest */
+ set_csr_euen(CSR_EUEN_LSXEN | CSR_EUEN_FPEN);
+ switch (vcpu->arch.aux_inuse & KVM_LARCH_FPU) {
+ case KVM_LARCH_FPU:
+ /*
+ * Guest FPU state already loaded,
+ * only restore upper LSX state
+ */
+ _restore_lsx_upper(&vcpu->arch.fpu);
+ break;
+ default:
+ /* Neither FP or LSX already active,
+ * restore full LSX state
+ */
+ kvm_restore_lsx(&vcpu->arch.fpu);
+ break;
+ }
+
+ trace_kvm_aux(vcpu, KVM_TRACE_AUX_RESTORE, KVM_TRACE_AUX_LSX);
+ vcpu->arch.aux_inuse |= KVM_LARCH_LSX | KVM_LARCH_FPU;
+ preempt_enable();
+
+ return 0;
+}
+#endif
+
+#ifdef CONFIG_CPU_HAS_LASX
+/* Enable LASX and restore context */
+int kvm_own_lasx(struct kvm_vcpu *vcpu)
+{
+ if (!kvm_guest_has_fpu(&vcpu->arch) || !kvm_guest_has_lsx(&vcpu->arch) || !kvm_guest_has_lasx(&vcpu->arch))
+ return -EINVAL;
+
+ preempt_disable();
+
+ set_csr_euen(CSR_EUEN_FPEN | CSR_EUEN_LSXEN | CSR_EUEN_LASXEN);
+ switch (vcpu->arch.aux_inuse & (KVM_LARCH_FPU | KVM_LARCH_LSX)) {
+ case KVM_LARCH_LSX:
+ case KVM_LARCH_LSX | KVM_LARCH_FPU:
+ /* Guest LSX state already loaded, only restore upper LASX state */
+ _restore_lasx_upper(&vcpu->arch.fpu);
+ break;
+ case KVM_LARCH_FPU:
+ /* Guest FP state already loaded, only restore upper LSX & LASX state */
+ _restore_lsx_upper(&vcpu->arch.fpu);
+ _restore_lasx_upper(&vcpu->arch.fpu);
+ break;
+ default:
+ /* Neither FP or LSX already active, restore full LASX state */
+ kvm_restore_lasx(&vcpu->arch.fpu);
+ break;
+ }
+
+ trace_kvm_aux(vcpu, KVM_TRACE_AUX_RESTORE, KVM_TRACE_AUX_LASX);
+ vcpu->arch.aux_inuse |= KVM_LARCH_LASX | KVM_LARCH_LSX | KVM_LARCH_FPU;
+ preempt_enable();
+
+ return 0;
+}
+#endif
+
/* Save context and disable FPU */
void kvm_lose_fpu(struct kvm_vcpu *vcpu)
{
preempt_disable();
- if (vcpu->arch.aux_inuse & KVM_LARCH_FPU) {
+ if (vcpu->arch.aux_inuse & KVM_LARCH_LASX) {
+ kvm_save_lasx(&vcpu->arch.fpu);
+ vcpu->arch.aux_inuse &= ~(KVM_LARCH_LSX | KVM_LARCH_FPU | KVM_LARCH_LASX);
+ trace_kvm_aux(vcpu, KVM_TRACE_AUX_SAVE, KVM_TRACE_AUX_LASX);
+
+ /* Disable LASX & LSX & FPU */
+ clear_csr_euen(CSR_EUEN_FPEN | CSR_EUEN_LSXEN | CSR_EUEN_LASXEN);
+ } else if (vcpu->arch.aux_inuse & KVM_LARCH_LSX) {
+ kvm_save_lsx(&vcpu->arch.fpu);
+ vcpu->arch.aux_inuse &= ~(KVM_LARCH_LSX | KVM_LARCH_FPU);
+ trace_kvm_aux(vcpu, KVM_TRACE_AUX_SAVE, KVM_TRACE_AUX_LSX);
+
+ /* Disable LSX & FPU */
+ clear_csr_euen(CSR_EUEN_FPEN | CSR_EUEN_LSXEN);
+ } else if (vcpu->arch.aux_inuse & KVM_LARCH_FPU) {
kvm_save_fpu(&vcpu->arch.fpu);
vcpu->arch.aux_inuse &= ~KVM_LARCH_FPU;
trace_kvm_aux(vcpu, KVM_TRACE_AUX_SAVE, KVM_TRACE_AUX_FPU);
unsigned long flags;
local_irq_save(flags);
- if (vcpu->arch.last_sched_cpu != cpu) {
- kvm_debug("[%d->%d]KVM vCPU[%d] switch\n",
- vcpu->arch.last_sched_cpu, cpu, vcpu->vcpu_id);
- /*
- * Migrate the timer interrupt to the current CPU so that it
- * always interrupts the guest and synchronously triggers a
- * guest timer interrupt.
- */
- kvm_migrate_count(vcpu);
- }
-
/* Restore guest state to registers */
_kvm_vcpu_load(vcpu, cpu);
local_irq_restore(flags);
{
return pfn_to_page(virt_to_pfn(kaddr));
}
-EXPORT_SYMBOL_GPL(dmw_virt_to_page);
+EXPORT_SYMBOL(dmw_virt_to_page);
struct page *tlb_virt_to_page(unsigned long kaddr)
{
return pfn_to_page(pte_pfn(*virt_to_kpte(kaddr)));
}
-EXPORT_SYMBOL_GPL(tlb_virt_to_page);
+EXPORT_SYMBOL(tlb_virt_to_page);
pgd_t *pgd_alloc(struct mm_struct *mm)
{
case 8:
move_reg(ctx, t1, src);
emit_insn(ctx, extwb, dst, t1);
+ emit_zext_32(ctx, dst, is32);
break;
case 16:
move_reg(ctx, t1, src);
emit_insn(ctx, extwh, dst, t1);
+ emit_zext_32(ctx, dst, is32);
break;
case 32:
emit_insn(ctx, addw, dst, src, LOONGARCH_GPR_ZERO);
break;
case 32:
emit_insn(ctx, revb2w, dst, dst);
- /* zero-extend 32 bits into 64 bits */
- emit_zext_32(ctx, dst, is32);
+ /* clear the upper 32 bits */
+ emit_zext_32(ctx, dst, true);
break;
case 64:
emit_insn(ctx, revbd, dst, dst);
/* function return */
case BPF_JMP | BPF_EXIT:
- emit_sext_32(ctx, regmap[BPF_REG_0], true);
-
if (i == ctx->prog->len - 1)
break;
}
break;
case BPF_DW:
- if (is_signed_imm12(off)) {
- emit_insn(ctx, ldd, dst, src, off);
- } else if (is_signed_imm14(off)) {
- emit_insn(ctx, ldptrd, dst, src, off);
- } else {
- move_imm(ctx, t1, off, is32);
- emit_insn(ctx, ldxd, dst, src, t1);
- }
+ move_imm(ctx, t1, off, is32);
+ emit_insn(ctx, ldxd, dst, src, t1);
break;
}
#ifndef _ASM_M68K_KEXEC_H
#define _ASM_M68K_KEXEC_H
-#ifdef CONFIG_KEXEC
+#ifdef CONFIG_KEXEC_CORE
/* Maximum physical address we can use pages from */
#define KEXEC_SOURCE_MEMORY_LIMIT (-1UL)
#endif /* __ASSEMBLY__ */
-#endif /* CONFIG_KEXEC */
+#endif /* CONFIG_KEXEC_CORE */
#endif /* _ASM_M68K_KEXEC_H */
obj-$(CONFIG_M68K_NONCOHERENT_DMA) += dma.o
-obj-$(CONFIG_KEXEC) += machine_kexec.o relocate_kernel.o
+obj-$(CONFIG_KEXEC_CORE) += machine_kexec.o relocate_kernel.o
obj-$(CONFIG_BOOTINFO_PROC) += bootinfo_proc.o
obj-$(CONFIG_UBOOT) += uboot.o
config MACH_LOONGSON64
bool "Loongson 64-bit family of machines"
+ select ARCH_DMA_DEFAULT_COHERENT
select ARCH_SPARSEMEM_ENABLE
select ARCH_MIGHT_HAVE_PC_PARPORT
select ARCH_MIGHT_HAVE_PC_SERIO
select CPU_SUPPORTS_MSA
select CPU_DIEI_BROKEN if !LOONGSON3_ENHANCEMENT
select CPU_MIPSR2_IRQ_VI
+ select DMA_NONCOHERENT
select WEAK_ORDERING
select WEAK_REORDERING_BEYOND_LLSC
select MIPS_ASID_BITS_VARIABLE
compatible = "pci0014,7a03.0",
"pci0014,7a03",
"pciclass0c0320",
- "pciclass0c03",
- "loongson, pci-gmac";
+ "pciclass0c03";
reg = <0x1800 0x0 0x0 0x0 0x0>;
interrupts = <12 IRQ_TYPE_LEVEL_LOW>,
compatible = "pci0014,7a03.0",
"pci0014,7a03",
"pciclass020000",
- "pciclass0200",
- "loongson, pci-gmac";
+ "pciclass0200";
reg = <0x1800 0x0 0x0 0x0 0x0>;
interrupts = <12 IRQ_TYPE_LEVEL_HIGH>,
.cpu_disable = octeon_cpu_disable,
.cpu_die = octeon_cpu_die,
#endif
-#ifdef CONFIG_KEXEC
+#ifdef CONFIG_KEXEC_CORE
.kexec_nonboot_cpu = kexec_nonboot_cpu_jump,
#endif
};
.cpu_disable = octeon_cpu_disable,
.cpu_die = octeon_cpu_die,
#endif
-#ifdef CONFIG_KEXEC
+#ifdef CONFIG_KEXEC_CORE
.kexec_nonboot_cpu = kexec_nonboot_cpu_jump,
#endif
};
prepare_frametrace(newregs);
}
-#ifdef CONFIG_KEXEC
+#ifdef CONFIG_KEXEC_CORE
struct kimage;
extern unsigned long kexec_args[4];
extern int (*_machine_kexec_prepare)(struct kimage *);
#define ADAPTER_ROM 8
#define ACPI_TABLE 9
#define SMBIOS_TABLE 10
-#define MAX_MEMORY_TYPE 11
+#define UMA_VIDEO_RAM 11
+#define VUMA_VIDEO_RAM 12
+#define MAX_MEMORY_TYPE 13
+
+#define MEM_SIZE_IS_IN_BYTES (1 << 31)
#define LOONGSON3_BOOT_MEM_MAP_MAX 128
struct efi_memory_map_loongson {
u64 pci_io_start_addr;
u64 pci_io_end_addr;
u64 pci_config_addr;
- u32 dma_mask_bits;
+ u16 dma_mask_bits;
+ u16 dma_noncoherent;
} __packed;
struct interface_info {
void (*cpu_die)(unsigned int cpu);
void (*cleanup_dead_cpu)(unsigned cpu);
#endif
-#ifdef CONFIG_KEXEC
+#ifdef CONFIG_KEXEC_CORE
void (*kexec_nonboot_cpu)(void);
#endif
};
extern void __noreturn play_dead(void);
#endif
-#ifdef CONFIG_KEXEC
+#ifdef CONFIG_KEXEC_CORE
static inline void kexec_nonboot_cpu(void)
{
extern const struct plat_smp_ops *mp_ops; /* private */
obj-$(CONFIG_RELOCATABLE) += relocate.o
-obj-$(CONFIG_KEXEC) += machine_kexec.o relocate_kernel.o crash.o
+obj-$(CONFIG_KEXEC_CORE) += machine_kexec.o relocate_kernel.o crash.o
obj-$(CONFIG_CRASH_DUMP) += crash_dump.o
obj-$(CONFIG_EARLY_PRINTK) += early_printk.o
obj-$(CONFIG_EARLY_PRINTK_8250) += early_printk_8250.o
/* Put the stack after the struct pt_regs. */
childksp = (unsigned long) childregs;
p->thread.cp0_status = (read_c0_status() & ~(ST0_CU2|ST0_CU1)) | ST0_KERNEL_CUMASK;
+
+ /*
+ * New tasks lose permission to use the fpu. This accelerates context
+ * switching for most programs since they don't use the fpu.
+ */
+ clear_tsk_thread_flag(p, TIF_USEDFPU);
+ clear_tsk_thread_flag(p, TIF_USEDMSA);
+ clear_tsk_thread_flag(p, TIF_MSA_CTX_LIVE);
+
+#ifdef CONFIG_MIPS_MT_FPAFF
+ clear_tsk_thread_flag(p, TIF_FPUBOUND);
+#endif /* CONFIG_MIPS_MT_FPAFF */
+
if (unlikely(args->fn)) {
/* kernel thread */
unsigned long status = p->thread.cp0_status;
p->thread.reg29 = (unsigned long) childregs;
p->thread.reg31 = (unsigned long) ret_from_fork;
- /*
- * New tasks lose permission to use the fpu. This accelerates context
- * switching for most programs since they don't use the fpu.
- */
childregs->cp0_status &= ~(ST0_CU2|ST0_CU1);
- clear_tsk_thread_flag(p, TIF_USEDFPU);
- clear_tsk_thread_flag(p, TIF_USEDMSA);
- clear_tsk_thread_flag(p, TIF_MSA_CTX_LIVE);
-
-#ifdef CONFIG_MIPS_MT_FPAFF
- clear_tsk_thread_flag(p, TIF_FPUBOUND);
-#endif /* CONFIG_MIPS_MT_FPAFF */
-
#ifdef CONFIG_MIPS_FP_SUPPORT
atomic_set(&p->thread.bd_emu_frame, BD_EMUFRAME_NONE);
#endif
.cpu_disable = bmips_cpu_disable,
.cpu_die = bmips_cpu_die,
#endif
-#ifdef CONFIG_KEXEC
+#ifdef CONFIG_KEXEC_CORE
.kexec_nonboot_cpu = kexec_nonboot_cpu_jump,
#endif
};
.cpu_disable = bmips_cpu_disable,
.cpu_die = bmips_cpu_die,
#endif
-#ifdef CONFIG_KEXEC
+#ifdef CONFIG_KEXEC_CORE
.kexec_nonboot_cpu = kexec_nonboot_cpu_jump,
#endif
};
local_irq_enable();
}
-#if defined(CONFIG_HOTPLUG_CPU) || defined(CONFIG_KEXEC)
+#if defined(CONFIG_HOTPLUG_CPU) || defined(CONFIG_KEXEC_CORE)
enum cpu_death {
CPU_DEATH_HALT,
}
}
-#ifdef CONFIG_KEXEC
+#ifdef CONFIG_KEXEC_CORE
static void cps_kexec_nonboot_cpu(void)
{
cps_shutdown_this_cpu(CPU_DEATH_POWER);
}
-#endif /* CONFIG_KEXEC */
+#endif /* CONFIG_KEXEC_CORE */
-#endif /* CONFIG_HOTPLUG_CPU || CONFIG_KEXEC */
+#endif /* CONFIG_HOTPLUG_CPU || CONFIG_KEXEC_CORE */
#ifdef CONFIG_HOTPLUG_CPU
.cpu_die = cps_cpu_die,
.cleanup_dead_cpu = cps_cleanup_dead_cpu,
#endif
-#ifdef CONFIG_KEXEC
+#ifdef CONFIG_KEXEC_CORE
.kexec_nonboot_cpu = cps_kexec_nonboot_cpu,
#endif
};
*/
asmlinkage void start_secondary(void)
{
- unsigned int cpu;
+ unsigned int cpu = raw_smp_processor_id();
cpu_probe();
per_cpu_trap_init(false);
+ rcutree_report_cpu_starting(cpu);
mips_clockevent_init();
mp_ops->init_secondary();
cpu_report();
*/
calibrate_delay();
- cpu = smp_processor_id();
cpu_data[cpu].udelay_val = loops_per_jiffy;
set_cpu_sibling_map(cpu);
depends on HAVE_KVM
depends on MIPS_FP_SUPPORT
select EXPORT_UASM
- select PREEMPT_NOTIFIERS
+ select KVM_COMMON
select KVM_GENERIC_DIRTYLOG_READ_PROTECT
- select HAVE_KVM_EVENTFD
select HAVE_KVM_VCPU_ASYNC_IOCTL
select KVM_MMIO
select KVM_GENERIC_MMU_NOTIFIER
- select INTERVAL_TREE
select KVM_GENERIC_HARDWARE_ENABLING
help
Support for hosting Guest kernels.
* Copyright (C) 2009 Lemote Inc.
* Author: Wu Zhangjin, wuzhangjin@gmail.com
*/
+
+#include <linux/dma-map-ops.h>
#include <linux/export.h>
#include <linux/pci_ids.h>
#include <asm/bootinfo.h>
loongson_sysconf.dma_mask_bits = eirq_source->dma_mask_bits;
if (loongson_sysconf.dma_mask_bits < 32 ||
- loongson_sysconf.dma_mask_bits > 64)
+ loongson_sysconf.dma_mask_bits > 64) {
loongson_sysconf.dma_mask_bits = 32;
+ dma_default_coherent = true;
+ } else {
+ dma_default_coherent = !eirq_source->dma_noncoherent;
+ }
+
+ pr_info("Firmware: Coherent DMA: %s\n", dma_default_coherent ? "on" : "off");
loongson_sysconf.restart_addr = boot_p->reset_system.ResetWarm;
loongson_sysconf.poweroff_addr = boot_p->reset_system.Shutdown;
void __init szmem(unsigned int node)
{
u32 i, mem_type;
- static unsigned long num_physpages;
- u64 node_id, node_psize, start_pfn, end_pfn, mem_start, mem_size;
+ phys_addr_t node_id, mem_start, mem_size;
/* Otherwise come from DTB */
if (loongson_sysconf.fw_interface != LOONGSON_LEFI)
mem_type = loongson_memmap->map[i].mem_type;
mem_size = loongson_memmap->map[i].mem_size;
- mem_start = loongson_memmap->map[i].mem_start;
+
+ /* Memory size comes in MB if MEM_SIZE_IS_IN_BYTES not set */
+ if (mem_size & MEM_SIZE_IS_IN_BYTES)
+ mem_size &= ~MEM_SIZE_IS_IN_BYTES;
+ else
+ mem_size = mem_size << 20;
+
+ mem_start = (node_id << 44) | loongson_memmap->map[i].mem_start;
switch (mem_type) {
case SYSTEM_RAM_LOW:
case SYSTEM_RAM_HIGH:
- start_pfn = ((node_id << 44) + mem_start) >> PAGE_SHIFT;
- node_psize = (mem_size << 20) >> PAGE_SHIFT;
- end_pfn = start_pfn + node_psize;
- num_physpages += node_psize;
- pr_info("Node%d: mem_type:%d, mem_start:0x%llx, mem_size:0x%llx MB\n",
- (u32)node_id, mem_type, mem_start, mem_size);
- pr_info(" start_pfn:0x%llx, end_pfn:0x%llx, num_physpages:0x%lx\n",
- start_pfn, end_pfn, num_physpages);
- memblock_add_node(PFN_PHYS(start_pfn),
- PFN_PHYS(node_psize), node,
+ case UMA_VIDEO_RAM:
+ pr_info("Node %d, mem_type:%d\t[%pa], %pa bytes usable\n",
+ (u32)node_id, mem_type, &mem_start, &mem_size);
+ memblock_add_node(mem_start, mem_size, node,
MEMBLOCK_NONE);
break;
case SYSTEM_RAM_RESERVED:
- pr_info("Node%d: mem_type:%d, mem_start:0x%llx, mem_size:0x%llx MB\n",
- (u32)node_id, mem_type, mem_start, mem_size);
- memblock_reserve(((node_id << 44) + mem_start), mem_size << 20);
+ case VIDEO_ROM:
+ case ADAPTER_ROM:
+ case ACPI_TABLE:
+ case SMBIOS_TABLE:
+ pr_info("Node %d, mem_type:%d\t[%pa], %pa bytes reserved\n",
+ (u32)node_id, mem_type, &mem_start, &mem_size);
+ memblock_reserve(mem_start, mem_size);
+ break;
+ /* We should not reserve VUMA_VIDEO_RAM as it overlaps with MMIO */
+ case VUMA_VIDEO_RAM:
+ default:
+ pr_info("Node %d, mem_type:%d\t[%pa], %pa bytes unhandled\n",
+ (u32)node_id, mem_type, &mem_start, &mem_size);
break;
}
}
+
+ /* Reserve vgabios if it comes from firmware */
+ if (loongson_sysconf.vgabios_addr)
+ memblock_reserve(virt_to_phys((void *)loongson_sysconf.vgabios_addr),
+ SZ_256K);
}
#ifndef CONFIG_NUMA
}
}
-#ifdef CONFIG_KEXEC
+#ifdef CONFIG_KEXEC_CORE
/* 0X80000000~0X80200000 is safe */
#define MAX_ARGS 64
_machine_halt = loongson_halt;
pm_power_off = loongson_poweroff;
-#ifdef CONFIG_KEXEC
+#ifdef CONFIG_KEXEC_CORE
kexec_argv = kmalloc(KEXEC_ARGV_SIZE, GFP_KERNEL);
if (WARN_ON(!kexec_argv))
return -ENOMEM;
.cpu_disable = loongson3_cpu_disable,
.cpu_die = loongson3_cpu_die,
#endif
-#ifdef CONFIG_KEXEC
+#ifdef CONFIG_KEXEC_CORE
.kexec_nonboot_cpu = kexec_nonboot_cpu_jump,
#endif
};
default n
config GENERIC_BUG
- bool
- default y
+ def_bool y
depends on BUG
+ select GENERIC_BUG_RELATIVE_POINTERS if 64BIT
+
+config GENERIC_BUG_RELATIVE_POINTERS
+ bool
config GENERIC_HWEIGHT
bool
default 8
config ARCH_MMAP_RND_BITS_MAX
- default 24 if 64BIT
- default 17
+ default 18 if 64BIT
+ default 13
config ARCH_MMAP_RND_COMPAT_BITS_MAX
- default 17
+ default 13
# unless you want to implement ACPI on PA-RISC ... ;-)
config PM
/* Alternative SMP implementation. */
#define ALTERNATIVE(cond, replacement) "!0:" \
- ".section .altinstructions, \"aw\" !" \
+ ".section .altinstructions, \"a\" !" \
+ ".align 4 !" \
".word (0b-4-.) !" \
".hword 1, " __stringify(cond) " !" \
".word " __stringify(replacement) " !" \
/* to replace one single instructions by a new instruction */
#define ALTERNATIVE(from, to, cond, replacement)\
- .section .altinstructions, "aw" ! \
+ .section .altinstructions, "a" ! \
+ .align 4 ! \
.word (from - .) ! \
.hword (to - from)/4, cond ! \
.word replacement ! \
/* to replace multiple instructions by new code */
#define ALTERNATIVE_CODE(from, num_instructions, cond, new_instr_ptr)\
- .section .altinstructions, "aw" ! \
+ .section .altinstructions, "a" ! \
+ .align 4 ! \
.word (from - .) ! \
.hword -num_instructions, cond ! \
.word (new_instr_ptr - .) ! \
*/
#define ASM_EXCEPTIONTABLE_ENTRY(fault_addr, except_addr) \
.section __ex_table,"aw" ! \
+ .align 4 ! \
.word (fault_addr - .), (except_addr - .) ! \
.previous
#define PARISC_BUG_BREAK_ASM "break 0x1f, 0x1fff"
#define PARISC_BUG_BREAK_INSN 0x03ffe01f /* PARISC_BUG_BREAK_ASM */
-#if defined(CONFIG_64BIT)
-#define ASM_WORD_INSN ".dword\t"
+#ifdef CONFIG_GENERIC_BUG_RELATIVE_POINTERS
+# define __BUG_REL(val) ".word " __stringify(val) " - ."
#else
-#define ASM_WORD_INSN ".word\t"
+# define __BUG_REL(val) ".word " __stringify(val)
#endif
+
#ifdef CONFIG_DEBUG_BUGVERBOSE
#define BUG() \
do { \
asm volatile("\n" \
"1:\t" PARISC_BUG_BREAK_ASM "\n" \
- "\t.pushsection __bug_table,\"aw\"\n" \
- "2:\t" ASM_WORD_INSN "1b, %c0\n" \
- "\t.short %c1, %c2\n" \
- "\t.org 2b+%c3\n" \
+ "\t.pushsection __bug_table,\"a\"\n" \
+ "\t.align 4\n" \
+ "2:\t" __BUG_REL(1b) "\n" \
+ "\t" __BUG_REL(%c0) "\n" \
+ "\t.short %1, %2\n" \
+ "\t.blockz %3-2*4-2*2\n" \
"\t.popsection" \
: : "i" (__FILE__), "i" (__LINE__), \
- "i" (0), "i" (sizeof(struct bug_entry)) ); \
+ "i" (0), "i" (sizeof(struct bug_entry)) ); \
unreachable(); \
} while(0)
do { \
asm volatile("\n" \
"1:\t" PARISC_BUG_BREAK_ASM "\n" \
- "\t.pushsection __bug_table,\"aw\"\n" \
- "2:\t" ASM_WORD_INSN "1b, %c0\n" \
- "\t.short %c1, %c2\n" \
- "\t.org 2b+%c3\n" \
+ "\t.pushsection __bug_table,\"a\"\n" \
+ "\t.align 4\n" \
+ "2:\t" __BUG_REL(1b) "\n" \
+ "\t" __BUG_REL(%c0) "\n" \
+ "\t.short %1, %2\n" \
+ "\t.blockz %3-2*4-2*2\n" \
"\t.popsection" \
: : "i" (__FILE__), "i" (__LINE__), \
"i" (BUGFLAG_WARNING|(flags)), \
do { \
asm volatile("\n" \
"1:\t" PARISC_BUG_BREAK_ASM "\n" \
- "\t.pushsection __bug_table,\"aw\"\n" \
- "2:\t" ASM_WORD_INSN "1b\n" \
- "\t.short %c0\n" \
- "\t.org 2b+%c1\n" \
+ "\t.pushsection __bug_table,\"a\"\n" \
+ "\t.align 4\n" \
+ "2:\t" __BUG_REL(1b) "\n" \
+ "\t.short %0\n" \
+ "\t.blockz %1-4-2\n" \
"\t.popsection" \
: : "i" (BUGFLAG_WARNING|(flags)), \
"i" (sizeof(struct bug_entry)) ); \
#define ELF_HWCAP 0
-/* Masks for stack and mmap randomization */
-#define BRK_RND_MASK (is_32bit_task() ? 0x07ffUL : 0x3ffffUL)
-#define MMAP_RND_MASK (is_32bit_task() ? 0x1fffUL : 0x3ffffUL)
-#define STACK_RND_MASK MMAP_RND_MASK
-
-struct mm_struct;
-extern unsigned long arch_randomize_brk(struct mm_struct *);
-#define arch_randomize_brk arch_randomize_brk
-
+#define STACK_RND_MASK 0x7ff /* 8MB of VA */
#define ARCH_HAS_SETUP_ADDITIONAL_PAGES 1
struct linux_binprm;
asm_volatile_goto("1:\n\t"
"nop\n\t"
".pushsection __jump_table, \"aw\"\n\t"
+ ".align %1\n\t"
".word 1b - ., %l[l_yes] - .\n\t"
__stringify(ASM_ULONG_INSN) " %c0 - .\n\t"
".popsection\n\t"
- : : "i" (&((char *)key)[branch]) : : l_yes);
+ : : "i" (&((char *)key)[branch]), "i" (sizeof(long))
+ : : l_yes);
return false;
l_yes:
asm_volatile_goto("1:\n\t"
"b,n %l[l_yes]\n\t"
".pushsection __jump_table, \"aw\"\n\t"
+ ".align %1\n\t"
".word 1b - ., %l[l_yes] - .\n\t"
__stringify(ASM_ULONG_INSN) " %c0 - .\n\t"
".popsection\n\t"
- : : "i" (&((char *)key)[branch]) : : l_yes);
+ : : "i" (&((char *)key)[branch]), "i" (sizeof(long))
+ : : l_yes);
return false;
l_yes:
})
#ifdef CONFIG_SMP
-# define __lock_aligned __section(".data..lock_aligned")
+# define __lock_aligned __section(".data..lock_aligned") __aligned(16)
#endif
#endif /* __PARISC_LDCW_H */
#ifndef __ASSEMBLY__
+struct rlimit;
+unsigned long mmap_upper_limit(struct rlimit *rlim_stack);
unsigned long calc_max_stack_size(unsigned long stack_max);
/*
#define ASM_EXCEPTIONTABLE_ENTRY( fault_addr, except_addr )\
".section __ex_table,\"aw\"\n" \
+ ".align 4\n" \
".word (" #fault_addr " - .), (" #except_addr " - .)\n\t" \
".previous\n"
/* We now return you to your regularly scheduled HPUX. */
-#define ENOSYM 215 /* symbol does not exist in executable */
#define ENOTSOCK 216 /* Socket operation on non-socket */
#define EDESTADDRREQ 217 /* Destination address required */
#define EMSGSIZE 218 /* Message too long */
#define ETIMEDOUT 238 /* Connection timed out */
#define ECONNREFUSED 239 /* Connection refused */
#define EREFUSED ECONNREFUSED /* for HP's NFS apparently */
-#define EREMOTERELEASE 240 /* Remote peer released connection */
#define EHOSTDOWN 241 /* Host is down */
#define EHOSTUNREACH 242 /* No route to host */
char cpu_name[60], *p;
/* strip PA path from CPU name to not confuse lscpu */
- strlcpy(cpu_name, per_cpu(cpu_data, 0).dev->name, sizeof(cpu_name));
+ strscpy(cpu_name, per_cpu(cpu_data, 0).dev->name, sizeof(cpu_name));
p = strrchr(cpu_name, '[');
if (p)
*(--p) = 0;
* indicating that "current" should be used instead of a passed-in
* value from the exec bprm as done with arch_pick_mmap_layout().
*/
-static unsigned long mmap_upper_limit(struct rlimit *rlim_stack)
+unsigned long mmap_upper_limit(struct rlimit *rlim_stack)
{
unsigned long stack_base;
RO_DATA(8)
/* unwind info */
+ . = ALIGN(4);
.PARISC.unwind : {
__start___unwind = .;
*(.PARISC.unwind)
CONFIG_DEBUG_SG=y
CONFIG_DEBUG_NOTIFIERS=y
CONFIG_BUG_ON_DATA_CORRUPTION=y
-CONFIG_DEBUG_CREDENTIALS=y
# CONFIG_FTRACE is not set
CONFIG_XMON=y
# CONFIG_RUNTIME_TESTING_MENU is not set
#include <asm/feature-fixups.h>
#ifdef CONFIG_VSX
+#define __REST_1FPVSR(n,c,base) \
+BEGIN_FTR_SECTION \
+ b 2f; \
+END_FTR_SECTION_IFSET(CPU_FTR_VSX); \
+ REST_FPR(n,base); \
+ b 3f; \
+2: REST_VSR(n,c,base); \
+3:
+
#define __REST_32FPVSRS(n,c,base) \
BEGIN_FTR_SECTION \
b 2f; \
2: SAVE_32VSRS(n,c,base); \
3:
#else
+#define __REST_1FPVSR(n,b,base) REST_FPR(n, base)
#define __REST_32FPVSRS(n,b,base) REST_32FPRS(n, base)
#define __SAVE_32FPVSRS(n,b,base) SAVE_32FPRS(n, base)
#endif
+#define REST_1FPVSR(n,c,base) __REST_1FPVSR(n,__REG_##c,__REG_##base)
#define REST_32FPVSRS(n,c,base) __REST_32FPVSRS(n,__REG_##c,__REG_##base)
#define SAVE_32FPVSRS(n,c,base) __SAVE_32FPVSRS(n,__REG_##c,__REG_##base)
SAVE_32FPVSRS(0, R4, R3)
mffs fr0
stfd fr0,FPSTATE_FPSCR(r3)
+ REST_1FPVSR(0, R4, R3)
blr
EXPORT_SYMBOL(store_fp_state)
2: SAVE_32FPVSRS(0, R4, R6)
mffs fr0
stfd fr0,FPSTATE_FPSCR(r6)
+ REST_1FPVSR(0, R4, R6)
blr
usermsr = current->thread.regs->msr;
+ /* Caller has enabled FP/VEC/VSX/TM in MSR */
if (usermsr & MSR_FP)
- save_fpu(current);
-
+ __giveup_fpu(current);
if (usermsr & MSR_VEC)
- save_altivec(current);
+ __giveup_altivec(current);
#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
if (usermsr & MSR_TM) {
.endif
/* Save previous stack pointer (r1) */
- addi r8, r1, SWITCH_FRAME_SIZE
+ addi r8, r1, SWITCH_FRAME_SIZE+STACK_FRAME_MIN_SIZE
PPC_STL r8, GPR1(r1)
.if \allregs == 1
mflr r3
mtctr r3
REST_GPR(3, r1)
- addi r1, r1, SWITCH_FRAME_SIZE
+ addi r1, r1, SWITCH_FRAME_SIZE+STACK_FRAME_MIN_SIZE
mtlr r0
bctr
#endif
mfvscr v0
li r4, VRSTATE_VSCR
stvx v0, r4, r3
+ lvx v0, 0, r3
blr
EXPORT_SYMBOL(store_vr_state)
mfvscr v0
li r4,VRSTATE_VSCR
stvx v0,r4,r7
+ lvx v0,0,r7
blr
#ifdef CONFIG_VSX
config KVM
bool
- select PREEMPT_NOTIFIERS
- select HAVE_KVM_EVENTFD
+ select KVM_COMMON
select HAVE_KVM_VCPU_ASYNC_IOCTL
select KVM_VFIO
select IRQ_BYPASS_MANAGER
select HAVE_KVM_IRQ_BYPASS
- select INTERVAL_TREE
config KVM_BOOK3S_HANDLER
bool
bool "KVM in-kernel MPIC emulation"
depends on KVM && PPC_E500
select HAVE_KVM_IRQCHIP
- select HAVE_KVM_IRQFD
select HAVE_KVM_IRQ_ROUTING
select HAVE_KVM_MSI
help
bool "KVM in-kernel XICS emulation"
depends on KVM_BOOK3S_64 && !KVM_MPIC
select HAVE_KVM_IRQCHIP
- select HAVE_KVM_IRQFD
default y
help
Include support for the XICS (eXternal Interrupt Controller
break;
#endif
-#ifdef CONFIG_HAVE_KVM_IRQFD
+#ifdef CONFIG_HAVE_KVM_IRQCHIP
case KVM_CAP_IRQFD_RESAMPLE:
r = !xive_enabled();
break;
* same fault IRQ is not freed by the OS before.
*/
mutex_lock(&vas_pseries_mutex);
- if (migration_in_progress)
+ if (migration_in_progress) {
rc = -EBUSY;
- else
+ } else {
rc = allocate_setup_window(txwin, (u64 *)&domain[0],
cop_feat_caps->win_type);
+ if (!rc)
+ caps->nr_open_wins_progress++;
+ }
+
mutex_unlock(&vas_pseries_mutex);
if (rc)
goto out;
goto out_free;
txwin->win_type = cop_feat_caps->win_type;
- mutex_lock(&vas_pseries_mutex);
+
/*
+ * The migration SUSPEND thread sets migration_in_progress and
+ * closes all open windows from the list. But the window is
+ * added to the list after open and modify HCALLs. So possible
+ * that migration_in_progress is set before modify HCALL which
+ * may cause some windows are still open when the hypervisor
+ * initiates the migration.
+ * So checks the migration_in_progress flag again and close all
+ * open windows.
+ *
* Possible to lose the acquired credit with DLPAR core
* removal after the window is opened. So if there are any
* closed windows (means with lost credits), do not give new
* after the existing windows are reopened when credits are
* available.
*/
- if (!caps->nr_close_wins) {
+ mutex_lock(&vas_pseries_mutex);
+ if (!caps->nr_close_wins && !migration_in_progress) {
list_add(&txwin->win_list, &caps->list);
caps->nr_open_windows++;
+ caps->nr_open_wins_progress--;
mutex_unlock(&vas_pseries_mutex);
vas_user_win_add_mm_context(&txwin->vas_win.task_ref);
return &txwin->vas_win;
*/
free_irq_setup(txwin);
h_deallocate_vas_window(txwin->vas_win.winid);
+ /*
+ * Hold mutex and reduce nr_open_wins_progress counter.
+ */
+ mutex_lock(&vas_pseries_mutex);
+ caps->nr_open_wins_progress--;
+ mutex_unlock(&vas_pseries_mutex);
out:
atomic_dec(&cop_feat_caps->nr_used_credits);
kfree(txwin);
struct vas_caps *vcaps;
int i, rc = 0;
+ pr_info("VAS migration event %d\n", action);
+
/*
* NX-GZIP is not enabled. Nothing to do for migration.
*/
if (!copypaste_feat)
return rc;
- mutex_lock(&vas_pseries_mutex);
-
if (action == VAS_SUSPEND)
migration_in_progress = true;
else
switch (action) {
case VAS_SUSPEND:
+ mutex_lock(&vas_pseries_mutex);
rc = reconfig_close_windows(vcaps, vcaps->nr_open_windows,
true);
+ /*
+ * Windows are included in the list after successful
+ * open. So wait for closing these in-progress open
+ * windows in vas_allocate_window() which will be
+ * done if the migration_in_progress is set.
+ */
+ while (vcaps->nr_open_wins_progress) {
+ mutex_unlock(&vas_pseries_mutex);
+ msleep(10);
+ mutex_lock(&vas_pseries_mutex);
+ }
+ mutex_unlock(&vas_pseries_mutex);
break;
case VAS_RESUME:
+ mutex_lock(&vas_pseries_mutex);
atomic_set(&caps->nr_total_credits, new_nr_creds);
rc = reconfig_open_windows(vcaps, new_nr_creds, true);
+ mutex_unlock(&vas_pseries_mutex);
break;
default:
/* should not happen */
goto out;
}
+ pr_info("VAS migration event (%d) successful\n", action);
+
out:
- mutex_unlock(&vas_pseries_mutex);
return rc;
}
struct vas_caps {
struct vas_cop_feat_caps caps;
struct list_head list; /* List of open windows */
+ int nr_open_wins_progress; /* Number of open windows in */
+ /* progress. Used in migration */
int nr_close_wins; /* closed windows in the hypervisor for DLPAR */
int nr_open_windows; /* Number of successful open windows */
u8 feat; /* Feature type */
If unsure what to do here, say N.
config ARCH_SUPPORTS_KEXEC
- def_bool MMU
+ def_bool y
config ARCH_SELECTS_KEXEC
def_bool y
select HOTPLUG_CPU if SMP
config ARCH_SUPPORTS_KEXEC_FILE
- def_bool 64BIT && MMU
+ def_bool 64BIT
config ARCH_SELECTS_KEXEC_FILE
def_bool y
If you want to execute 32-bit userspace applications, say Y.
+config PARAVIRT
+ bool "Enable paravirtualization code"
+ depends on RISCV_SBI
+ help
+ This changes the kernel so it can modify itself when it is run
+ under a hypervisor, potentially improving performance significantly
+ over full virtualization.
+
+config PARAVIRT_TIME_ACCOUNTING
+ bool "Paravirtual steal time accounting"
+ depends on PARAVIRT
+ help
+ Select this option to enable fine granularity task steal time
+ accounting. Time spent executing other tasks in parallel with
+ the current vCPU is discounted from the vCPU power. To account for
+ that, there can be a small performance impact.
+
+ If in doubt, say N here.
+
config RELOCATABLE
bool "Build a relocatable kernel"
depends on MMU && 64BIT && !XIP_KERNEL
#include <dt-bindings/gpio/gpio.h>
#include <dt-bindings/leds/common.h>
-/* Clock frequency (in Hz) of the rtcclk */
-#define RTCCLK_FREQ 1000000
-
/ {
model = "Microchip PolarFire-SoC Icicle Kit";
compatible = "microchip,mpfs-icicle-reference-rtlv2210", "microchip,mpfs-icicle-kit",
stdout-path = "serial1:115200n8";
};
- cpus {
- timebase-frequency = <RTCCLK_FREQ>;
- };
-
leds {
compatible = "gpio-leds";
#include "mpfs.dtsi"
#include "mpfs-m100pfs-fabric.dtsi"
-/* Clock frequency (in Hz) of the rtcclk */
-#define MTIMER_FREQ 1000000
-
/ {
model = "Aries Embedded M100PFEVPS";
compatible = "aries,m100pfsevp", "microchip,mpfs";
stdout-path = "serial1:115200n8";
};
- cpus {
- timebase-frequency = <MTIMER_FREQ>;
- };
-
ddrc_cache_lo: memory@80000000 {
device_type = "memory";
reg = <0x0 0x80000000 0x0 0x40000000>;
#include "mpfs.dtsi"
#include "mpfs-polarberry-fabric.dtsi"
-/* Clock frequency (in Hz) of the rtcclk */
-#define MTIMER_FREQ 1000000
-
/ {
model = "Sundance PolarBerry";
compatible = "sundance,polarberry", "microchip,mpfs";
stdout-path = "serial0:115200n8";
};
- cpus {
- timebase-frequency = <MTIMER_FREQ>;
- };
-
ddrc_cache_lo: memory@80000000 {
device_type = "memory";
reg = <0x0 0x80000000 0x0 0x2e000000>;
#include "mpfs.dtsi"
#include "mpfs-sev-kit-fabric.dtsi"
-/* Clock frequency (in Hz) of the rtcclk */
-#define MTIMER_FREQ 1000000
-
/ {
#address-cells = <2>;
#size-cells = <2>;
stdout-path = "serial1:115200n8";
};
- cpus {
- timebase-frequency = <MTIMER_FREQ>;
- };
-
reserved-memory {
#address-cells = <2>;
#size-cells = <2>;
#include "mpfs.dtsi"
#include "mpfs-tysom-m-fabric.dtsi"
-/* Clock frequency (in Hz) of the rtcclk */
-#define MTIMER_FREQ 1000000
-
/ {
model = "Aldec TySOM-M-MPFS250T-REV2";
compatible = "aldec,tysom-m-mpfs250t-rev2", "microchip,mpfs";
stdout-path = "serial1:115200n8";
};
- cpus {
- timebase-frequency = <MTIMER_FREQ>;
- };
-
ddrc_cache_lo: memory@80000000 {
device_type = "memory";
reg = <0x0 0x80000000 0x0 0x30000000>;
cpus {
#address-cells = <1>;
#size-cells = <0>;
+ timebase-frequency = <1000000>;
cpu0: cpu@0 {
compatible = "sifive,e51", "sifive,rocket0", "riscv";
cpu0_intc: interrupt-controller {
compatible = "riscv,cpu-intc";
interrupt-controller;
- #address-cells = <0>;
#interrupt-cells = <1>;
};
};
return ret.error ? 0 : ret.value;
}
-static bool errata_probe_iocp(unsigned int stage, unsigned long arch_id, unsigned long impid)
+static void errata_probe_iocp(unsigned int stage, unsigned long arch_id, unsigned long impid)
{
+ static bool done;
+
if (!IS_ENABLED(CONFIG_ERRATA_ANDES_CMO))
- return false;
+ return;
+
+ if (done)
+ return;
+
+ done = true;
if (arch_id != ANDESTECH_AX45MP_MARCHID || impid != ANDESTECH_AX45MP_MIMPID)
- return false;
+ return;
if (!ax45mp_iocp_sw_workaround())
- return false;
+ return;
/* Set this just to make core cbo code happy */
riscv_cbom_block_size = 1;
riscv_noncoherent_supported();
-
- return true;
}
void __init_or_module andes_errata_patch_func(struct alt_entry *begin, struct alt_entry *end,
unsigned long archid, unsigned long impid,
unsigned int stage)
{
- errata_probe_iocp(stage, archid, impid);
+ if (stage == RISCV_ALTERNATIVES_BOOT)
+ errata_probe_iocp(stage, archid, impid);
/* we have nothing to patch here ATM so just return back */
}
KVM_ARCH_REQ_FLAGS(4, KVM_REQUEST_WAIT | KVM_REQUEST_NO_WAKEUP)
#define KVM_REQ_HFENCE \
KVM_ARCH_REQ_FLAGS(5, KVM_REQUEST_WAIT | KVM_REQUEST_NO_WAKEUP)
+#define KVM_REQ_STEAL_UPDATE KVM_ARCH_REQ(6)
enum kvm_riscv_hfence_type {
KVM_RISCV_HFENCE_UNKNOWN = 0,
/* 'static' configurations which are set only once */
struct kvm_vcpu_config cfg;
+
+ /* SBI steal-time accounting */
+ struct {
+ gpa_t shmem;
+ u64 last_steal;
+ } sta;
};
static inline void kvm_arch_sync_events(struct kvm *kvm) {}
void kvm_riscv_vcpu_power_off(struct kvm_vcpu *vcpu);
void kvm_riscv_vcpu_power_on(struct kvm_vcpu *vcpu);
+void kvm_riscv_vcpu_sbi_sta_reset(struct kvm_vcpu *vcpu);
+void kvm_riscv_vcpu_record_steal_time(struct kvm_vcpu *vcpu);
+
#endif /* __RISCV_KVM_HOST_H__ */
#define KVM_SBI_VERSION_MINOR 0
enum kvm_riscv_sbi_ext_status {
- KVM_RISCV_SBI_EXT_UNINITIALIZED,
- KVM_RISCV_SBI_EXT_AVAILABLE,
- KVM_RISCV_SBI_EXT_UNAVAILABLE,
+ KVM_RISCV_SBI_EXT_STATUS_UNINITIALIZED,
+ KVM_RISCV_SBI_EXT_STATUS_UNAVAILABLE,
+ KVM_RISCV_SBI_EXT_STATUS_ENABLED,
+ KVM_RISCV_SBI_EXT_STATUS_DISABLED,
};
struct kvm_vcpu_sbi_context {
unsigned long extid_start;
unsigned long extid_end;
- bool default_unavail;
+ bool default_disabled;
/**
* SBI extension handler. It can be defined for a given extension or group of
const struct kvm_one_reg *reg);
int kvm_riscv_vcpu_get_reg_sbi_ext(struct kvm_vcpu *vcpu,
const struct kvm_one_reg *reg);
+int kvm_riscv_vcpu_set_reg_sbi(struct kvm_vcpu *vcpu,
+ const struct kvm_one_reg *reg);
+int kvm_riscv_vcpu_get_reg_sbi(struct kvm_vcpu *vcpu,
+ const struct kvm_one_reg *reg);
const struct kvm_vcpu_sbi_extension *kvm_vcpu_sbi_find_ext(
struct kvm_vcpu *vcpu, unsigned long extid);
+bool riscv_vcpu_supports_sbi_ext(struct kvm_vcpu *vcpu, int idx);
int kvm_riscv_vcpu_sbi_ecall(struct kvm_vcpu *vcpu, struct kvm_run *run);
void kvm_riscv_vcpu_sbi_init(struct kvm_vcpu *vcpu);
+int kvm_riscv_vcpu_get_reg_sbi_sta(struct kvm_vcpu *vcpu, unsigned long reg_num,
+ unsigned long *reg_val);
+int kvm_riscv_vcpu_set_reg_sbi_sta(struct kvm_vcpu *vcpu, unsigned long reg_num,
+ unsigned long reg_val);
+
#ifdef CONFIG_RISCV_SBI_V01
extern const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_v01;
#endif
extern const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_srst;
extern const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_hsm;
extern const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_dbcn;
+extern const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_sta;
extern const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_experimental;
extern const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_vendor;
--- /dev/null
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_RISCV_PARAVIRT_H
+#define _ASM_RISCV_PARAVIRT_H
+
+#ifdef CONFIG_PARAVIRT
+#include <linux/static_call_types.h>
+
+struct static_key;
+extern struct static_key paravirt_steal_enabled;
+extern struct static_key paravirt_steal_rq_enabled;
+
+u64 dummy_steal_clock(int cpu);
+
+DECLARE_STATIC_CALL(pv_steal_clock, dummy_steal_clock);
+
+static inline u64 paravirt_steal_clock(int cpu)
+{
+ return static_call(pv_steal_clock)(cpu);
+}
+
+int __init pv_time_init(void);
+
+#else
+
+#define pv_time_init() do {} while (0)
+
+#endif /* CONFIG_PARAVIRT */
+#endif /* _ASM_RISCV_PARAVIRT_H */
--- /dev/null
+#include <asm/paravirt.h>
#define PAGE_KERNEL __pgprot(0)
#define swapper_pg_dir NULL
#define TASK_SIZE 0xffffffffUL
-#define VMALLOC_START 0
+#define VMALLOC_START _AC(0, UL)
#define VMALLOC_END TASK_SIZE
#endif /* !CONFIG_MMU */
SBI_EXT_SRST = 0x53525354,
SBI_EXT_PMU = 0x504D55,
SBI_EXT_DBCN = 0x4442434E,
+ SBI_EXT_STA = 0x535441,
/* Experimentals extensions must lie within this range */
SBI_EXT_EXPERIMENTAL_START = 0x08000000,
SBI_EXT_DBCN_CONSOLE_WRITE_BYTE = 2,
};
+/* SBI STA (steal-time accounting) extension */
+enum sbi_ext_sta_fid {
+ SBI_EXT_STA_STEAL_TIME_SET_SHMEM = 0,
+};
+
+struct sbi_sta_struct {
+ __le32 sequence;
+ __le32 flags;
+ __le64 steal;
+ u8 preempted;
+ u8 pad[47];
+} __packed;
+
+#define SBI_STA_SHMEM_DISABLE -1
+
+/* SBI spec version fields */
#define SBI_SPEC_VERSION_DEFAULT 0x1
#define SBI_SPEC_VERSION_MAJOR_SHIFT 24
#define SBI_SPEC_VERSION_MAJOR_MASK 0x7f
return sys_ni_syscall(); \
}
-#define COMPAT_SYS_NI(name) \
- SYSCALL_ALIAS(__riscv_compat_sys_##name, sys_ni_posix_timers);
-
#endif /* CONFIG_COMPAT */
#define __SYSCALL_DEFINEx(x, name, ...) \
return sys_ni_syscall(); \
}
-#define SYS_NI(name) SYSCALL_ALIAS(__riscv_sys_##name, sys_ni_posix_timers);
-
#endif /* __ASM_SYSCALL_WRAPPER_H */
KVM_RISCV_SBI_EXT_EXPERIMENTAL,
KVM_RISCV_SBI_EXT_VENDOR,
KVM_RISCV_SBI_EXT_DBCN,
+ KVM_RISCV_SBI_EXT_STA,
KVM_RISCV_SBI_EXT_MAX,
};
+/* SBI STA extension registers for KVM_GET_ONE_REG and KVM_SET_ONE_REG */
+struct kvm_riscv_sbi_sta {
+ unsigned long shmem_lo;
+ unsigned long shmem_hi;
+};
+
/* Possible states for kvm_riscv_timer */
#define KVM_RISCV_TIMER_STATE_OFF 0
#define KVM_RISCV_TIMER_STATE_ON 1
#define KVM_REG_RISCV_VECTOR_REG(n) \
((n) + sizeof(struct __riscv_v_ext_state) / sizeof(unsigned long))
+/* Registers for specific SBI extensions are mapped as type 10 */
+#define KVM_REG_RISCV_SBI_STATE (0x0a << KVM_REG_RISCV_TYPE_SHIFT)
+#define KVM_REG_RISCV_SBI_STA (0x0 << KVM_REG_RISCV_SUBTYPE_SHIFT)
+#define KVM_REG_RISCV_SBI_STA_REG(name) \
+ (offsetof(struct kvm_riscv_sbi_sta, name) / sizeof(unsigned long))
+
/* Device Control API: RISC-V AIA */
#define KVM_DEV_RISCV_APLIC_ALIGN 0x1000
#define KVM_DEV_RISCV_APLIC_SIZE 0x4000
obj-$(CONFIG_SMP) += cpu_ops_sbi.o
endif
obj-$(CONFIG_HOTPLUG_CPU) += cpu-hotplug.o
+obj-$(CONFIG_PARAVIRT) += paravirt.o
obj-$(CONFIG_KGDB) += kgdb.o
obj-$(CONFIG_KEXEC_CORE) += kexec_relocate.o crash_save_regs.o machine_kexec.o
obj-$(CONFIG_KEXEC_FILE) += elf_kexec.o machine_kexec_file.o
void arch_crash_save_vmcoreinfo(void)
{
- VMCOREINFO_NUMBER(VA_BITS);
VMCOREINFO_NUMBER(phys_ram_base);
vmcoreinfo_append_str("NUMBER(PAGE_OFFSET)=0x%lx\n", PAGE_OFFSET);
vmcoreinfo_append_str("NUMBER(VMALLOC_START)=0x%lx\n", VMALLOC_START);
vmcoreinfo_append_str("NUMBER(VMALLOC_END)=0x%lx\n", VMALLOC_END);
+#ifdef CONFIG_MMU
+ VMCOREINFO_NUMBER(VA_BITS);
vmcoreinfo_append_str("NUMBER(VMEMMAP_START)=0x%lx\n", VMEMMAP_START);
vmcoreinfo_append_str("NUMBER(VMEMMAP_END)=0x%lx\n", VMEMMAP_END);
#ifdef CONFIG_64BIT
vmcoreinfo_append_str("NUMBER(MODULES_VADDR)=0x%lx\n", MODULES_VADDR);
vmcoreinfo_append_str("NUMBER(MODULES_END)=0x%lx\n", MODULES_END);
+#endif
#endif
vmcoreinfo_append_str("NUMBER(KERNEL_LINK_ADDR)=0x%lx\n", KERNEL_LINK_ADDR);
vmcoreinfo_append_str("NUMBER(va_kernel_pa_offset)=0x%lx\n",
XIP_FIXUP_OFFSET a3
add a3, a3, a1
REG_L sp, (a3)
- scs_load_current
.Lsecondary_start_common:
call relocate_enable_mmu
#endif
call .Lsetup_trap_vector
+ scs_load_current
tail smp_callin
#endif /* CONFIG_SMP */
long buffer);
};
-unsigned int initialize_relocation_hashtable(unsigned int num_relocations);
-void process_accumulated_relocations(struct module *me);
-int add_relocation_to_accumulate(struct module *me, int type, void *location,
- unsigned int hashtable_bits, Elf_Addr v);
-
-struct hlist_head *relocation_hashtable;
-
-struct list_head used_buckets_list;
-
/*
* The auipc+jalr instruction pair can reach any PC-relative offset
* in the range [-2^31 - 2^11, 2^31 - 2^11)
static int riscv_insn_rmw(void *location, u32 keep, u32 set)
{
- u16 *parcel = location;
+ __le16 *parcel = location;
u32 insn = (u32)le16_to_cpu(parcel[0]) | (u32)le16_to_cpu(parcel[1]) << 16;
insn &= keep;
static int riscv_insn_rvc_rmw(void *location, u16 keep, u16 set)
{
- u16 *parcel = location;
+ __le16 *parcel = location;
u16 insn = le16_to_cpu(*parcel);
insn &= keep;
/* 192-255 nonstandard ABI extensions */
};
-void process_accumulated_relocations(struct module *me)
+static void
+process_accumulated_relocations(struct module *me,
+ struct hlist_head **relocation_hashtable,
+ struct list_head *used_buckets_list)
{
/*
* Only ADD/SUB/SET/ULEB128 should end up here.
* - Each relocation entry for a location address
*/
struct used_bucket *bucket_iter;
+ struct used_bucket *bucket_iter_tmp;
struct relocation_head *rel_head_iter;
+ struct hlist_node *rel_head_iter_tmp;
struct relocation_entry *rel_entry_iter;
+ struct relocation_entry *rel_entry_iter_tmp;
int curr_type;
void *location;
long buffer;
- list_for_each_entry(bucket_iter, &used_buckets_list, head) {
- hlist_for_each_entry(rel_head_iter, bucket_iter->bucket, node) {
+ list_for_each_entry_safe(bucket_iter, bucket_iter_tmp,
+ used_buckets_list, head) {
+ hlist_for_each_entry_safe(rel_head_iter, rel_head_iter_tmp,
+ bucket_iter->bucket, node) {
buffer = 0;
location = rel_head_iter->location;
- list_for_each_entry(rel_entry_iter,
- rel_head_iter->rel_entry, head) {
+ list_for_each_entry_safe(rel_entry_iter,
+ rel_entry_iter_tmp,
+ rel_head_iter->rel_entry,
+ head) {
curr_type = rel_entry_iter->type;
reloc_handlers[curr_type].reloc_handler(
me, &buffer, rel_entry_iter->value);
kfree(bucket_iter);
}
- kfree(relocation_hashtable);
+ kfree(*relocation_hashtable);
}
-int add_relocation_to_accumulate(struct module *me, int type, void *location,
- unsigned int hashtable_bits, Elf_Addr v)
+static int add_relocation_to_accumulate(struct module *me, int type,
+ void *location,
+ unsigned int hashtable_bits, Elf_Addr v,
+ struct hlist_head *relocation_hashtable,
+ struct list_head *used_buckets_list)
{
struct relocation_entry *entry;
struct relocation_head *rel_head;
unsigned long hash;
entry = kmalloc(sizeof(*entry), GFP_KERNEL);
+
+ if (!entry)
+ return -ENOMEM;
+
INIT_LIST_HEAD(&entry->head);
entry->type = type;
entry->value = v;
current_head = &relocation_hashtable[hash];
- /* Find matching location (if any) */
+ /*
+ * Search for the relocation_head for the relocations that happen at the
+ * provided location
+ */
bool found = false;
struct relocation_head *rel_head_iter;
}
}
+ /*
+ * If there has not yet been any relocations at the provided location,
+ * create a relocation_head for that location and populate it with this
+ * relocation_entry.
+ */
if (!found) {
rel_head = kmalloc(sizeof(*rel_head), GFP_KERNEL);
+
+ if (!rel_head) {
+ kfree(entry);
+ return -ENOMEM;
+ }
+
rel_head->rel_entry =
kmalloc(sizeof(struct list_head), GFP_KERNEL);
+
+ if (!rel_head->rel_entry) {
+ kfree(entry);
+ kfree(rel_head);
+ return -ENOMEM;
+ }
+
INIT_LIST_HEAD(rel_head->rel_entry);
rel_head->location = location;
INIT_HLIST_NODE(&rel_head->node);
if (!current_head->first) {
bucket =
kmalloc(sizeof(struct used_bucket), GFP_KERNEL);
+
+ if (!bucket) {
+ kfree(entry);
+ kfree(rel_head);
+ kfree(rel_head->rel_entry);
+ return -ENOMEM;
+ }
+
INIT_LIST_HEAD(&bucket->head);
bucket->bucket = current_head;
- list_add(&bucket->head, &used_buckets_list);
+ list_add(&bucket->head, used_buckets_list);
}
hlist_add_head(&rel_head->node, current_head);
}
return 0;
}
-unsigned int initialize_relocation_hashtable(unsigned int num_relocations)
+static unsigned int
+initialize_relocation_hashtable(unsigned int num_relocations,
+ struct hlist_head **relocation_hashtable)
{
/* Can safely assume that bits is not greater than sizeof(long) */
unsigned long hashtable_size = roundup_pow_of_two(num_relocations);
hashtable_size <<= should_double_size;
- relocation_hashtable = kmalloc_array(hashtable_size,
- sizeof(*relocation_hashtable),
- GFP_KERNEL);
- __hash_init(relocation_hashtable, hashtable_size);
+ *relocation_hashtable = kmalloc_array(hashtable_size,
+ sizeof(*relocation_hashtable),
+ GFP_KERNEL);
+ if (!*relocation_hashtable)
+ return -ENOMEM;
- INIT_LIST_HEAD(&used_buckets_list);
+ __hash_init(*relocation_hashtable, hashtable_size);
return hashtable_bits;
}
Elf_Addr v;
int res;
unsigned int num_relocations = sechdrs[relsec].sh_size / sizeof(*rel);
- unsigned int hashtable_bits = initialize_relocation_hashtable(num_relocations);
+ struct hlist_head *relocation_hashtable;
+ struct list_head used_buckets_list;
+ unsigned int hashtable_bits;
+
+ hashtable_bits = initialize_relocation_hashtable(num_relocations,
+ &relocation_hashtable);
+
+ if (hashtable_bits < 0)
+ return hashtable_bits;
+
+ INIT_LIST_HEAD(&used_buckets_list);
pr_debug("Applying relocate section %u to %u\n", relsec,
sechdrs[relsec].sh_info);
}
if (reloc_handlers[type].accumulate_handler)
- res = add_relocation_to_accumulate(me, type, location, hashtable_bits, v);
+ res = add_relocation_to_accumulate(me, type, location,
+ hashtable_bits, v,
+ relocation_hashtable,
+ &used_buckets_list);
else
res = handler(me, location, v);
if (res)
return res;
}
- process_accumulated_relocations(me);
+ process_accumulated_relocations(me, &relocation_hashtable,
+ &used_buckets_list);
return 0;
}
--- /dev/null
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (c) 2023 Ventana Micro Systems Inc.
+ */
+
+#define pr_fmt(fmt) "riscv-pv: " fmt
+
+#include <linux/cpuhotplug.h>
+#include <linux/compiler.h>
+#include <linux/errno.h>
+#include <linux/init.h>
+#include <linux/jump_label.h>
+#include <linux/kconfig.h>
+#include <linux/kernel.h>
+#include <linux/percpu-defs.h>
+#include <linux/printk.h>
+#include <linux/static_call.h>
+#include <linux/types.h>
+
+#include <asm/barrier.h>
+#include <asm/page.h>
+#include <asm/paravirt.h>
+#include <asm/sbi.h>
+
+struct static_key paravirt_steal_enabled;
+struct static_key paravirt_steal_rq_enabled;
+
+static u64 native_steal_clock(int cpu)
+{
+ return 0;
+}
+
+DEFINE_STATIC_CALL(pv_steal_clock, native_steal_clock);
+
+static bool steal_acc = true;
+static int __init parse_no_stealacc(char *arg)
+{
+ steal_acc = false;
+ return 0;
+}
+
+early_param("no-steal-acc", parse_no_stealacc);
+
+DEFINE_PER_CPU(struct sbi_sta_struct, steal_time) __aligned(64);
+
+static bool __init has_pv_steal_clock(void)
+{
+ if (sbi_spec_version >= sbi_mk_version(2, 0) &&
+ sbi_probe_extension(SBI_EXT_STA) > 0) {
+ pr_info("SBI STA extension detected\n");
+ return true;
+ }
+
+ return false;
+}
+
+static int sbi_sta_steal_time_set_shmem(unsigned long lo, unsigned long hi,
+ unsigned long flags)
+{
+ struct sbiret ret;
+
+ ret = sbi_ecall(SBI_EXT_STA, SBI_EXT_STA_STEAL_TIME_SET_SHMEM,
+ lo, hi, flags, 0, 0, 0);
+ if (ret.error) {
+ if (lo == SBI_STA_SHMEM_DISABLE && hi == SBI_STA_SHMEM_DISABLE)
+ pr_warn("Failed to disable steal-time shmem");
+ else
+ pr_warn("Failed to set steal-time shmem");
+ return sbi_err_map_linux_errno(ret.error);
+ }
+
+ return 0;
+}
+
+static int pv_time_cpu_online(unsigned int cpu)
+{
+ struct sbi_sta_struct *st = this_cpu_ptr(&steal_time);
+ phys_addr_t pa = __pa(st);
+ unsigned long lo = (unsigned long)pa;
+ unsigned long hi = IS_ENABLED(CONFIG_32BIT) ? upper_32_bits((u64)pa) : 0;
+
+ return sbi_sta_steal_time_set_shmem(lo, hi, 0);
+}
+
+static int pv_time_cpu_down_prepare(unsigned int cpu)
+{
+ return sbi_sta_steal_time_set_shmem(SBI_STA_SHMEM_DISABLE,
+ SBI_STA_SHMEM_DISABLE, 0);
+}
+
+static u64 pv_time_steal_clock(int cpu)
+{
+ struct sbi_sta_struct *st = per_cpu_ptr(&steal_time, cpu);
+ u32 sequence;
+ u64 steal;
+
+ /*
+ * Check the sequence field before and after reading the steal
+ * field. Repeat the read if it is different or odd.
+ */
+ do {
+ sequence = READ_ONCE(st->sequence);
+ virt_rmb();
+ steal = READ_ONCE(st->steal);
+ virt_rmb();
+ } while ((le32_to_cpu(sequence) & 1) ||
+ sequence != READ_ONCE(st->sequence));
+
+ return le64_to_cpu(steal);
+}
+
+int __init pv_time_init(void)
+{
+ int ret;
+
+ if (!has_pv_steal_clock())
+ return 0;
+
+ ret = cpuhp_setup_state(CPUHP_AP_ONLINE_DYN,
+ "riscv/pv_time:online",
+ pv_time_cpu_online,
+ pv_time_cpu_down_prepare);
+ if (ret < 0)
+ return ret;
+
+ static_call_update(pv_steal_clock, pv_time_steal_clock);
+
+ static_key_slow_inc(¶virt_steal_enabled);
+ if (steal_acc)
+ static_key_slow_inc(¶virt_steal_rq_enabled);
+
+ pr_info("Computing paravirt steal-time\n");
+
+ return 0;
+}
pair->value &= ~missing;
}
-static bool hwprobe_ext0_has(const struct cpumask *cpus, unsigned long ext)
+static bool hwprobe_ext0_has(const struct cpumask *cpus, u64 ext)
{
struct riscv_hwprobe pair;
.text
.global test_uleb_basic
test_uleb_basic:
- ld a0, second
+ lw a0, second
addi a0, a0, -127
ret
.global test_uleb_large
test_uleb_large:
- ld a0, fourth
+ lw a0, fourth
addi a0, a0, -0x07e8
ret
second:
.reloc second, R_RISCV_SET_ULEB128, second
.reloc second, R_RISCV_SUB_ULEB128, first
- .dword 0
+ .word 0
third:
.space 1000
fourth:
.reloc fourth, R_RISCV_SET_ULEB128, fourth
.reloc fourth, R_RISCV_SUB_ULEB128, third
- .dword 0
+ .word 0
#include <asm/sbi.h>
#include <asm/processor.h>
#include <asm/timex.h>
+#include <asm/paravirt.h>
unsigned long riscv_timebase __ro_after_init;
EXPORT_SYMBOL_GPL(riscv_timebase);
timer_probe();
tick_setup_hrtimer_broadcast();
+
+ pv_time_init();
}
} else if ((insn & INSN_MASK_C_SD) == INSN_MATCH_C_SD) {
len = 8;
val.data_ulong = GET_RS2S(insn, regs);
- } else if ((insn & INSN_MASK_C_SDSP) == INSN_MATCH_C_SDSP &&
- ((insn >> SH_RD) & 0x1f)) {
+ } else if ((insn & INSN_MASK_C_SDSP) == INSN_MATCH_C_SDSP) {
len = 8;
val.data_ulong = GET_RS2C(insn, regs);
#endif
} else if ((insn & INSN_MASK_C_SW) == INSN_MATCH_C_SW) {
len = 4;
val.data_ulong = GET_RS2S(insn, regs);
- } else if ((insn & INSN_MASK_C_SWSP) == INSN_MATCH_C_SWSP &&
- ((insn >> SH_RD) & 0x1f)) {
+ } else if ((insn & INSN_MASK_C_SWSP) == INSN_MATCH_C_SWSP) {
len = 4;
val.data_ulong = GET_RS2C(insn, regs);
} else if ((insn & INSN_MASK_C_FSD) == INSN_MATCH_C_FSD) {
config KVM
tristate "Kernel-based Virtual Machine (KVM) support (EXPERIMENTAL)"
depends on RISCV_SBI && MMU
- select HAVE_KVM_EVENTFD
select HAVE_KVM_IRQCHIP
- select HAVE_KVM_IRQFD
select HAVE_KVM_IRQ_ROUTING
select HAVE_KVM_MSI
select HAVE_KVM_VCPU_ASYNC_IOCTL
+ select KVM_COMMON
select KVM_GENERIC_DIRTYLOG_READ_PROTECT
select KVM_GENERIC_HARDWARE_ENABLING
select KVM_MMIO
select KVM_XFER_TO_GUEST_WORK
select KVM_GENERIC_MMU_NOTIFIER
- select PREEMPT_NOTIFIERS
+ select SCHED_INFO
help
Support hosting virtualized guest machines.
kvm-y += vcpu_sbi_base.o
kvm-y += vcpu_sbi_replace.o
kvm-y += vcpu_sbi_hsm.o
+kvm-y += vcpu_sbi_sta.o
kvm-y += vcpu_timer.o
kvm-$(CONFIG_RISCV_PMU_SBI) += vcpu_pmu.o vcpu_sbi_pmu.o
kvm-y += aia.o
/* IMSIC SW-file */
struct imsic_mrif *swfile;
phys_addr_t swfile_pa;
+ spinlock_t swfile_extirq_lock;
};
#define imsic_vs_csr_read(__c) \
{
struct imsic *imsic = vcpu->arch.aia_context.imsic_state;
struct imsic_mrif *mrif = imsic->swfile;
+ unsigned long flags;
+
+ /*
+ * The critical section is necessary during external interrupt
+ * updates to avoid the risk of losing interrupts due to potential
+ * interruptions between reading topei and updating pending status.
+ */
+
+ spin_lock_irqsave(&imsic->swfile_extirq_lock, flags);
if (imsic_mrif_atomic_read(mrif, &mrif->eidelivery) &&
imsic_mrif_topei(mrif, imsic->nr_eix, imsic->nr_msis))
kvm_riscv_vcpu_set_interrupt(vcpu, IRQ_VS_EXT);
else
kvm_riscv_vcpu_unset_interrupt(vcpu, IRQ_VS_EXT);
+
+ spin_unlock_irqrestore(&imsic->swfile_extirq_lock, flags);
}
static void imsic_swfile_read(struct kvm_vcpu *vcpu, bool clear,
}
imsic->swfile = page_to_virt(swfile_page);
imsic->swfile_pa = page_to_phys(swfile_page);
+ spin_lock_init(&imsic->swfile_extirq_lock);
/* Setup IO device */
kvm_iodevice_init(&imsic->iodev, &imsic_iodoev_ops);
vcpu->arch.hfence_tail = 0;
memset(vcpu->arch.hfence_queue, 0, sizeof(vcpu->arch.hfence_queue));
+ kvm_riscv_vcpu_sbi_sta_reset(vcpu);
+
/* Reset the guest CSRs for hotplug usecase */
if (loaded)
kvm_arch_vcpu_load(vcpu, smp_processor_id());
kvm_riscv_vcpu_aia_load(vcpu, cpu);
+ kvm_make_request(KVM_REQ_STEAL_UPDATE, vcpu);
+
vcpu->cpu = cpu;
}
if (kvm_check_request(KVM_REQ_HFENCE, vcpu))
kvm_riscv_hfence_process(vcpu);
+
+ if (kvm_check_request(KVM_REQ_STEAL_UPDATE, vcpu))
+ kvm_riscv_vcpu_record_steal_time(vcpu);
}
}
/* Update HVIP CSR for current CPU */
kvm_riscv_update_hvip(vcpu);
- if (ret <= 0 ||
- kvm_riscv_gstage_vmid_ver_changed(&vcpu->kvm->arch.vmid) ||
+ if (kvm_riscv_gstage_vmid_ver_changed(&vcpu->kvm->arch.vmid) ||
kvm_request_pending(vcpu) ||
xfer_to_guest_mode_work_pending()) {
vcpu->mode = OUTSIDE_GUEST_MODE;
if (riscv_has_extension_unlikely(RISCV_ISA_EXT_SMSTATEEN))
rc = kvm_riscv_vcpu_smstateen_set_csr(vcpu, reg_num,
reg_val);
-break;
+ break;
default:
rc = -ENOENT;
break;
return copy_isa_ext_reg_indices(vcpu, NULL);;
}
-static inline unsigned long num_sbi_ext_regs(void)
+static int copy_sbi_ext_reg_indices(struct kvm_vcpu *vcpu, u64 __user *uindices)
{
- /*
- * number of KVM_REG_RISCV_SBI_SINGLE +
- * 2 x (number of KVM_REG_RISCV_SBI_MULTI)
- */
- return KVM_RISCV_SBI_EXT_MAX + 2*(KVM_REG_RISCV_SBI_MULTI_REG_LAST+1);
-}
-
-static int copy_sbi_ext_reg_indices(u64 __user *uindices)
-{
- int n;
+ unsigned int n = 0;
- /* copy KVM_REG_RISCV_SBI_SINGLE */
- n = KVM_RISCV_SBI_EXT_MAX;
- for (int i = 0; i < n; i++) {
+ for (int i = 0; i < KVM_RISCV_SBI_EXT_MAX; i++) {
u64 size = IS_ENABLED(CONFIG_32BIT) ?
KVM_REG_SIZE_U32 : KVM_REG_SIZE_U64;
u64 reg = KVM_REG_RISCV | size | KVM_REG_RISCV_SBI_EXT |
KVM_REG_RISCV_SBI_SINGLE | i;
+ if (!riscv_vcpu_supports_sbi_ext(vcpu, i))
+ continue;
+
if (uindices) {
if (put_user(reg, uindices))
return -EFAULT;
uindices++;
}
+
+ n++;
}
- /* copy KVM_REG_RISCV_SBI_MULTI */
- n = KVM_REG_RISCV_SBI_MULTI_REG_LAST + 1;
- for (int i = 0; i < n; i++) {
- u64 size = IS_ENABLED(CONFIG_32BIT) ?
- KVM_REG_SIZE_U32 : KVM_REG_SIZE_U64;
- u64 reg = KVM_REG_RISCV | size | KVM_REG_RISCV_SBI_EXT |
- KVM_REG_RISCV_SBI_MULTI_EN | i;
+ return n;
+}
+
+static unsigned long num_sbi_ext_regs(struct kvm_vcpu *vcpu)
+{
+ return copy_sbi_ext_reg_indices(vcpu, NULL);
+}
+
+static int copy_sbi_reg_indices(struct kvm_vcpu *vcpu, u64 __user *uindices)
+{
+ struct kvm_vcpu_sbi_context *scontext = &vcpu->arch.sbi_context;
+ int total = 0;
+
+ if (scontext->ext_status[KVM_RISCV_SBI_EXT_STA] == KVM_RISCV_SBI_EXT_STATUS_ENABLED) {
+ u64 size = IS_ENABLED(CONFIG_32BIT) ? KVM_REG_SIZE_U32 : KVM_REG_SIZE_U64;
+ int n = sizeof(struct kvm_riscv_sbi_sta) / sizeof(unsigned long);
+
+ for (int i = 0; i < n; i++) {
+ u64 reg = KVM_REG_RISCV | size |
+ KVM_REG_RISCV_SBI_STATE |
+ KVM_REG_RISCV_SBI_STA | i;
+
+ if (uindices) {
+ if (put_user(reg, uindices))
+ return -EFAULT;
+ uindices++;
+ }
+ }
+
+ total += n;
+ }
+
+ return total;
+}
+
+static inline unsigned long num_sbi_regs(struct kvm_vcpu *vcpu)
+{
+ return copy_sbi_reg_indices(vcpu, NULL);
+}
+
+static inline unsigned long num_vector_regs(const struct kvm_vcpu *vcpu)
+{
+ if (!riscv_isa_extension_available(vcpu->arch.isa, v))
+ return 0;
+
+ /* vstart, vl, vtype, vcsr, vlenb and 32 vector regs */
+ return 37;
+}
+
+static int copy_vector_reg_indices(const struct kvm_vcpu *vcpu,
+ u64 __user *uindices)
+{
+ const struct kvm_cpu_context *cntx = &vcpu->arch.guest_context;
+ int n = num_vector_regs(vcpu);
+ u64 reg, size;
+ int i;
+
+ if (n == 0)
+ return 0;
+
+ /* copy vstart, vl, vtype, vcsr and vlenb */
+ size = IS_ENABLED(CONFIG_32BIT) ? KVM_REG_SIZE_U32 : KVM_REG_SIZE_U64;
+ for (i = 0; i < 5; i++) {
+ reg = KVM_REG_RISCV | size | KVM_REG_RISCV_VECTOR | i;
if (uindices) {
if (put_user(reg, uindices))
return -EFAULT;
uindices++;
}
+ }
- reg = KVM_REG_RISCV | size | KVM_REG_RISCV_SBI_EXT |
- KVM_REG_RISCV_SBI_MULTI_DIS | i;
+ /* vector_regs have a variable 'vlenb' size */
+ size = __builtin_ctzl(cntx->vector.vlenb);
+ size <<= KVM_REG_SIZE_SHIFT;
+ for (i = 0; i < 32; i++) {
+ reg = KVM_REG_RISCV | KVM_REG_RISCV_VECTOR | size |
+ KVM_REG_RISCV_VECTOR_REG(i);
if (uindices) {
if (put_user(reg, uindices))
}
}
- return num_sbi_ext_regs();
+ return n;
}
/*
res += num_timer_regs();
res += num_fp_f_regs(vcpu);
res += num_fp_d_regs(vcpu);
+ res += num_vector_regs(vcpu);
res += num_isa_ext_regs(vcpu);
- res += num_sbi_ext_regs();
+ res += num_sbi_ext_regs(vcpu);
+ res += num_sbi_regs(vcpu);
return res;
}
return ret;
uindices += ret;
+ ret = copy_vector_reg_indices(vcpu, uindices);
+ if (ret < 0)
+ return ret;
+ uindices += ret;
+
ret = copy_isa_ext_reg_indices(vcpu, uindices);
if (ret < 0)
return ret;
uindices += ret;
- ret = copy_sbi_ext_reg_indices(uindices);
+ ret = copy_sbi_ext_reg_indices(vcpu, uindices);
if (ret < 0)
return ret;
+ uindices += ret;
+
+ ret = copy_sbi_reg_indices(vcpu, uindices);
+ if (ret < 0)
+ return ret;
+ uindices += ret;
return 0;
}
case KVM_REG_RISCV_FP_D:
return kvm_riscv_vcpu_set_reg_fp(vcpu, reg,
KVM_REG_RISCV_FP_D);
+ case KVM_REG_RISCV_VECTOR:
+ return kvm_riscv_vcpu_set_reg_vector(vcpu, reg);
case KVM_REG_RISCV_ISA_EXT:
return kvm_riscv_vcpu_set_reg_isa_ext(vcpu, reg);
case KVM_REG_RISCV_SBI_EXT:
return kvm_riscv_vcpu_set_reg_sbi_ext(vcpu, reg);
- case KVM_REG_RISCV_VECTOR:
- return kvm_riscv_vcpu_set_reg_vector(vcpu, reg);
+ case KVM_REG_RISCV_SBI_STATE:
+ return kvm_riscv_vcpu_set_reg_sbi(vcpu, reg);
default:
break;
}
case KVM_REG_RISCV_FP_D:
return kvm_riscv_vcpu_get_reg_fp(vcpu, reg,
KVM_REG_RISCV_FP_D);
+ case KVM_REG_RISCV_VECTOR:
+ return kvm_riscv_vcpu_get_reg_vector(vcpu, reg);
case KVM_REG_RISCV_ISA_EXT:
return kvm_riscv_vcpu_get_reg_isa_ext(vcpu, reg);
case KVM_REG_RISCV_SBI_EXT:
return kvm_riscv_vcpu_get_reg_sbi_ext(vcpu, reg);
- case KVM_REG_RISCV_VECTOR:
- return kvm_riscv_vcpu_get_reg_vector(vcpu, reg);
+ case KVM_REG_RISCV_SBI_STATE:
+ return kvm_riscv_vcpu_get_reg_sbi(vcpu, reg);
default:
break;
}
.ext_idx = KVM_RISCV_SBI_EXT_DBCN,
.ext_ptr = &vcpu_sbi_ext_dbcn,
},
+ {
+ .ext_idx = KVM_RISCV_SBI_EXT_STA,
+ .ext_ptr = &vcpu_sbi_ext_sta,
+ },
{
.ext_idx = KVM_RISCV_SBI_EXT_EXPERIMENTAL,
.ext_ptr = &vcpu_sbi_ext_experimental,
},
};
+static const struct kvm_riscv_sbi_extension_entry *
+riscv_vcpu_get_sbi_ext(struct kvm_vcpu *vcpu, unsigned long idx)
+{
+ const struct kvm_riscv_sbi_extension_entry *sext = NULL;
+
+ if (idx >= KVM_RISCV_SBI_EXT_MAX)
+ return NULL;
+
+ for (int i = 0; i < ARRAY_SIZE(sbi_ext); i++) {
+ if (sbi_ext[i].ext_idx == idx) {
+ sext = &sbi_ext[i];
+ break;
+ }
+ }
+
+ return sext;
+}
+
+bool riscv_vcpu_supports_sbi_ext(struct kvm_vcpu *vcpu, int idx)
+{
+ struct kvm_vcpu_sbi_context *scontext = &vcpu->arch.sbi_context;
+ const struct kvm_riscv_sbi_extension_entry *sext;
+
+ sext = riscv_vcpu_get_sbi_ext(vcpu, idx);
+
+ return sext && scontext->ext_status[sext->ext_idx] != KVM_RISCV_SBI_EXT_STATUS_UNAVAILABLE;
+}
+
void kvm_riscv_vcpu_sbi_forward(struct kvm_vcpu *vcpu, struct kvm_run *run)
{
struct kvm_cpu_context *cp = &vcpu->arch.guest_context;
unsigned long reg_num,
unsigned long reg_val)
{
- unsigned long i;
- const struct kvm_riscv_sbi_extension_entry *sext = NULL;
struct kvm_vcpu_sbi_context *scontext = &vcpu->arch.sbi_context;
-
- if (reg_num >= KVM_RISCV_SBI_EXT_MAX)
- return -ENOENT;
+ const struct kvm_riscv_sbi_extension_entry *sext;
if (reg_val != 1 && reg_val != 0)
return -EINVAL;
- for (i = 0; i < ARRAY_SIZE(sbi_ext); i++) {
- if (sbi_ext[i].ext_idx == reg_num) {
- sext = &sbi_ext[i];
- break;
- }
- }
- if (!sext)
+ sext = riscv_vcpu_get_sbi_ext(vcpu, reg_num);
+ if (!sext || scontext->ext_status[sext->ext_idx] == KVM_RISCV_SBI_EXT_STATUS_UNAVAILABLE)
return -ENOENT;
scontext->ext_status[sext->ext_idx] = (reg_val) ?
- KVM_RISCV_SBI_EXT_AVAILABLE :
- KVM_RISCV_SBI_EXT_UNAVAILABLE;
+ KVM_RISCV_SBI_EXT_STATUS_ENABLED :
+ KVM_RISCV_SBI_EXT_STATUS_DISABLED;
return 0;
}
unsigned long reg_num,
unsigned long *reg_val)
{
- unsigned long i;
- const struct kvm_riscv_sbi_extension_entry *sext = NULL;
struct kvm_vcpu_sbi_context *scontext = &vcpu->arch.sbi_context;
+ const struct kvm_riscv_sbi_extension_entry *sext;
- if (reg_num >= KVM_RISCV_SBI_EXT_MAX)
- return -ENOENT;
-
- for (i = 0; i < ARRAY_SIZE(sbi_ext); i++) {
- if (sbi_ext[i].ext_idx == reg_num) {
- sext = &sbi_ext[i];
- break;
- }
- }
- if (!sext)
+ sext = riscv_vcpu_get_sbi_ext(vcpu, reg_num);
+ if (!sext || scontext->ext_status[sext->ext_idx] == KVM_RISCV_SBI_EXT_STATUS_UNAVAILABLE)
return -ENOENT;
*reg_val = scontext->ext_status[sext->ext_idx] ==
- KVM_RISCV_SBI_EXT_AVAILABLE;
+ KVM_RISCV_SBI_EXT_STATUS_ENABLED;
+
return 0;
}
return 0;
}
+int kvm_riscv_vcpu_set_reg_sbi(struct kvm_vcpu *vcpu,
+ const struct kvm_one_reg *reg)
+{
+ unsigned long __user *uaddr =
+ (unsigned long __user *)(unsigned long)reg->addr;
+ unsigned long reg_num = reg->id & ~(KVM_REG_ARCH_MASK |
+ KVM_REG_SIZE_MASK |
+ KVM_REG_RISCV_SBI_STATE);
+ unsigned long reg_subtype, reg_val;
+
+ if (KVM_REG_SIZE(reg->id) != sizeof(unsigned long))
+ return -EINVAL;
+
+ if (copy_from_user(®_val, uaddr, KVM_REG_SIZE(reg->id)))
+ return -EFAULT;
+
+ reg_subtype = reg_num & KVM_REG_RISCV_SUBTYPE_MASK;
+ reg_num &= ~KVM_REG_RISCV_SUBTYPE_MASK;
+
+ switch (reg_subtype) {
+ case KVM_REG_RISCV_SBI_STA:
+ return kvm_riscv_vcpu_set_reg_sbi_sta(vcpu, reg_num, reg_val);
+ default:
+ return -EINVAL;
+ }
+
+ return 0;
+}
+
+int kvm_riscv_vcpu_get_reg_sbi(struct kvm_vcpu *vcpu,
+ const struct kvm_one_reg *reg)
+{
+ unsigned long __user *uaddr =
+ (unsigned long __user *)(unsigned long)reg->addr;
+ unsigned long reg_num = reg->id & ~(KVM_REG_ARCH_MASK |
+ KVM_REG_SIZE_MASK |
+ KVM_REG_RISCV_SBI_STATE);
+ unsigned long reg_subtype, reg_val;
+ int ret;
+
+ if (KVM_REG_SIZE(reg->id) != sizeof(unsigned long))
+ return -EINVAL;
+
+ reg_subtype = reg_num & KVM_REG_RISCV_SUBTYPE_MASK;
+ reg_num &= ~KVM_REG_RISCV_SUBTYPE_MASK;
+
+ switch (reg_subtype) {
+ case KVM_REG_RISCV_SBI_STA:
+ ret = kvm_riscv_vcpu_get_reg_sbi_sta(vcpu, reg_num, ®_val);
+ break;
+ default:
+ return -EINVAL;
+ }
+
+ if (ret)
+ return ret;
+
+ if (copy_to_user(uaddr, ®_val, KVM_REG_SIZE(reg->id)))
+ return -EFAULT;
+
+ return 0;
+}
+
const struct kvm_vcpu_sbi_extension *kvm_vcpu_sbi_find_ext(
struct kvm_vcpu *vcpu, unsigned long extid)
{
if (ext->extid_start <= extid && ext->extid_end >= extid) {
if (entry->ext_idx >= KVM_RISCV_SBI_EXT_MAX ||
scontext->ext_status[entry->ext_idx] ==
- KVM_RISCV_SBI_EXT_AVAILABLE)
+ KVM_RISCV_SBI_EXT_STATUS_ENABLED)
return ext;
return NULL;
if (ext->probe && !ext->probe(vcpu)) {
scontext->ext_status[entry->ext_idx] =
- KVM_RISCV_SBI_EXT_UNAVAILABLE;
+ KVM_RISCV_SBI_EXT_STATUS_UNAVAILABLE;
continue;
}
- scontext->ext_status[entry->ext_idx] = ext->default_unavail ?
- KVM_RISCV_SBI_EXT_UNAVAILABLE :
- KVM_RISCV_SBI_EXT_AVAILABLE;
+ scontext->ext_status[entry->ext_idx] = ext->default_disabled ?
+ KVM_RISCV_SBI_EXT_STATUS_DISABLED :
+ KVM_RISCV_SBI_EXT_STATUS_ENABLED;
}
}
const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_dbcn = {
.extid_start = SBI_EXT_DBCN,
.extid_end = SBI_EXT_DBCN,
- .default_unavail = true,
+ .default_disabled = true,
.handler = kvm_sbi_ext_dbcn_handler,
};
--- /dev/null
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (c) 2023 Ventana Micro Systems Inc.
+ */
+
+#include <linux/kconfig.h>
+#include <linux/kernel.h>
+#include <linux/kvm_host.h>
+#include <linux/mm.h>
+#include <linux/sizes.h>
+
+#include <asm/bug.h>
+#include <asm/current.h>
+#include <asm/kvm_vcpu_sbi.h>
+#include <asm/page.h>
+#include <asm/sbi.h>
+#include <asm/uaccess.h>
+
+void kvm_riscv_vcpu_sbi_sta_reset(struct kvm_vcpu *vcpu)
+{
+ vcpu->arch.sta.shmem = INVALID_GPA;
+ vcpu->arch.sta.last_steal = 0;
+}
+
+void kvm_riscv_vcpu_record_steal_time(struct kvm_vcpu *vcpu)
+{
+ gpa_t shmem = vcpu->arch.sta.shmem;
+ u64 last_steal = vcpu->arch.sta.last_steal;
+ u32 *sequence_ptr, sequence;
+ u64 *steal_ptr, steal;
+ unsigned long hva;
+ gfn_t gfn;
+
+ if (shmem == INVALID_GPA)
+ return;
+
+ /*
+ * shmem is 64-byte aligned (see the enforcement in
+ * kvm_sbi_sta_steal_time_set_shmem()) and the size of sbi_sta_struct
+ * is 64 bytes, so we know all its offsets are in the same page.
+ */
+ gfn = shmem >> PAGE_SHIFT;
+ hva = kvm_vcpu_gfn_to_hva(vcpu, gfn);
+
+ if (WARN_ON(kvm_is_error_hva(hva))) {
+ vcpu->arch.sta.shmem = INVALID_GPA;
+ return;
+ }
+
+ sequence_ptr = (u32 *)(hva + offset_in_page(shmem) +
+ offsetof(struct sbi_sta_struct, sequence));
+ steal_ptr = (u64 *)(hva + offset_in_page(shmem) +
+ offsetof(struct sbi_sta_struct, steal));
+
+ if (WARN_ON(get_user(sequence, sequence_ptr)))
+ return;
+
+ sequence = le32_to_cpu(sequence);
+ sequence += 1;
+
+ if (WARN_ON(put_user(cpu_to_le32(sequence), sequence_ptr)))
+ return;
+
+ if (!WARN_ON(get_user(steal, steal_ptr))) {
+ steal = le64_to_cpu(steal);
+ vcpu->arch.sta.last_steal = READ_ONCE(current->sched_info.run_delay);
+ steal += vcpu->arch.sta.last_steal - last_steal;
+ WARN_ON(put_user(cpu_to_le64(steal), steal_ptr));
+ }
+
+ sequence += 1;
+ WARN_ON(put_user(cpu_to_le32(sequence), sequence_ptr));
+
+ kvm_vcpu_mark_page_dirty(vcpu, gfn);
+}
+
+static int kvm_sbi_sta_steal_time_set_shmem(struct kvm_vcpu *vcpu)
+{
+ struct kvm_cpu_context *cp = &vcpu->arch.guest_context;
+ unsigned long shmem_phys_lo = cp->a0;
+ unsigned long shmem_phys_hi = cp->a1;
+ u32 flags = cp->a2;
+ struct sbi_sta_struct zero_sta = {0};
+ unsigned long hva;
+ bool writable;
+ gpa_t shmem;
+ int ret;
+
+ if (flags != 0)
+ return SBI_ERR_INVALID_PARAM;
+
+ if (shmem_phys_lo == SBI_STA_SHMEM_DISABLE &&
+ shmem_phys_hi == SBI_STA_SHMEM_DISABLE) {
+ vcpu->arch.sta.shmem = INVALID_GPA;
+ return 0;
+ }
+
+ if (shmem_phys_lo & (SZ_64 - 1))
+ return SBI_ERR_INVALID_PARAM;
+
+ shmem = shmem_phys_lo;
+
+ if (shmem_phys_hi != 0) {
+ if (IS_ENABLED(CONFIG_32BIT))
+ shmem |= ((gpa_t)shmem_phys_hi << 32);
+ else
+ return SBI_ERR_INVALID_ADDRESS;
+ }
+
+ hva = kvm_vcpu_gfn_to_hva_prot(vcpu, shmem >> PAGE_SHIFT, &writable);
+ if (kvm_is_error_hva(hva) || !writable)
+ return SBI_ERR_INVALID_ADDRESS;
+
+ ret = kvm_vcpu_write_guest(vcpu, shmem, &zero_sta, sizeof(zero_sta));
+ if (ret)
+ return SBI_ERR_FAILURE;
+
+ vcpu->arch.sta.shmem = shmem;
+ vcpu->arch.sta.last_steal = current->sched_info.run_delay;
+
+ return 0;
+}
+
+static int kvm_sbi_ext_sta_handler(struct kvm_vcpu *vcpu, struct kvm_run *run,
+ struct kvm_vcpu_sbi_return *retdata)
+{
+ struct kvm_cpu_context *cp = &vcpu->arch.guest_context;
+ unsigned long funcid = cp->a6;
+ int ret;
+
+ switch (funcid) {
+ case SBI_EXT_STA_STEAL_TIME_SET_SHMEM:
+ ret = kvm_sbi_sta_steal_time_set_shmem(vcpu);
+ break;
+ default:
+ ret = SBI_ERR_NOT_SUPPORTED;
+ break;
+ }
+
+ retdata->err_val = ret;
+
+ return 0;
+}
+
+static unsigned long kvm_sbi_ext_sta_probe(struct kvm_vcpu *vcpu)
+{
+ return !!sched_info_on();
+}
+
+const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_sta = {
+ .extid_start = SBI_EXT_STA,
+ .extid_end = SBI_EXT_STA,
+ .handler = kvm_sbi_ext_sta_handler,
+ .probe = kvm_sbi_ext_sta_probe,
+};
+
+int kvm_riscv_vcpu_get_reg_sbi_sta(struct kvm_vcpu *vcpu,
+ unsigned long reg_num,
+ unsigned long *reg_val)
+{
+ switch (reg_num) {
+ case KVM_REG_RISCV_SBI_STA_REG(shmem_lo):
+ *reg_val = (unsigned long)vcpu->arch.sta.shmem;
+ break;
+ case KVM_REG_RISCV_SBI_STA_REG(shmem_hi):
+ if (IS_ENABLED(CONFIG_32BIT))
+ *reg_val = upper_32_bits(vcpu->arch.sta.shmem);
+ else
+ *reg_val = 0;
+ break;
+ default:
+ return -EINVAL;
+ }
+
+ return 0;
+}
+
+int kvm_riscv_vcpu_set_reg_sbi_sta(struct kvm_vcpu *vcpu,
+ unsigned long reg_num,
+ unsigned long reg_val)
+{
+ switch (reg_num) {
+ case KVM_REG_RISCV_SBI_STA_REG(shmem_lo):
+ if (IS_ENABLED(CONFIG_32BIT)) {
+ gpa_t hi = upper_32_bits(vcpu->arch.sta.shmem);
+
+ vcpu->arch.sta.shmem = reg_val;
+ vcpu->arch.sta.shmem |= hi << 32;
+ } else {
+ vcpu->arch.sta.shmem = reg_val;
+ }
+ break;
+ case KVM_REG_RISCV_SBI_STA_REG(shmem_hi):
+ if (IS_ENABLED(CONFIG_32BIT)) {
+ gpa_t lo = lower_32_bits(vcpu->arch.sta.shmem);
+
+ vcpu->arch.sta.shmem = ((gpa_t)reg_val << 32);
+ vcpu->arch.sta.shmem |= lo;
+ } else if (reg_val != 0) {
+ return -EINVAL;
+ }
+ break;
+ default:
+ return -EINVAL;
+ }
+
+ return 0;
+}
.altmacro
.option norelax
-ENTRY(__kvm_riscv_switch_to)
+SYM_FUNC_START(__kvm_riscv_switch_to)
/* Save Host GPRs (except A0 and T0-T6) */
REG_S ra, (KVM_ARCH_HOST_RA)(a0)
REG_S sp, (KVM_ARCH_HOST_SP)(a0)
REG_L t0, (KVM_ARCH_GUEST_SSTATUS)(a0)
REG_L t1, (KVM_ARCH_GUEST_HSTATUS)(a0)
REG_L t2, (KVM_ARCH_GUEST_SCOUNTEREN)(a0)
- la t4, __kvm_switch_return
+ la t4, .Lkvm_switch_return
REG_L t5, (KVM_ARCH_GUEST_SEPC)(a0)
/* Save Host and Restore Guest SSTATUS */
/* Back to Host */
.align 2
-__kvm_switch_return:
+.Lkvm_switch_return:
/* Swap Guest A0 with SSCRATCH */
csrrw a0, CSR_SSCRATCH, a0
/* Return to C code */
ret
-ENDPROC(__kvm_riscv_switch_to)
+SYM_FUNC_END(__kvm_riscv_switch_to)
-ENTRY(__kvm_riscv_unpriv_trap)
+SYM_CODE_START(__kvm_riscv_unpriv_trap)
/*
* We assume that faulting unpriv load/store instruction is
* 4-byte long and blindly increment SEPC by 4.
csrr a1, CSR_HTINST
REG_S a1, (KVM_ARCH_TRAP_HTINST)(a0)
sret
-ENDPROC(__kvm_riscv_unpriv_trap)
+SYM_CODE_END(__kvm_riscv_unpriv_trap)
#ifdef CONFIG_FPU
- .align 3
- .global __kvm_riscv_fp_f_save
-__kvm_riscv_fp_f_save:
+SYM_FUNC_START(__kvm_riscv_fp_f_save)
csrr t2, CSR_SSTATUS
li t1, SR_FS
csrs CSR_SSTATUS, t1
sw t0, KVM_ARCH_FP_F_FCSR(a0)
csrw CSR_SSTATUS, t2
ret
+SYM_FUNC_END(__kvm_riscv_fp_f_save)
- .align 3
- .global __kvm_riscv_fp_d_save
-__kvm_riscv_fp_d_save:
+SYM_FUNC_START(__kvm_riscv_fp_d_save)
csrr t2, CSR_SSTATUS
li t1, SR_FS
csrs CSR_SSTATUS, t1
sw t0, KVM_ARCH_FP_D_FCSR(a0)
csrw CSR_SSTATUS, t2
ret
+SYM_FUNC_END(__kvm_riscv_fp_d_save)
- .align 3
- .global __kvm_riscv_fp_f_restore
-__kvm_riscv_fp_f_restore:
+SYM_FUNC_START(__kvm_riscv_fp_f_restore)
csrr t2, CSR_SSTATUS
li t1, SR_FS
lw t0, KVM_ARCH_FP_F_FCSR(a0)
fscsr t0
csrw CSR_SSTATUS, t2
ret
+SYM_FUNC_END(__kvm_riscv_fp_f_restore)
- .align 3
- .global __kvm_riscv_fp_d_restore
-__kvm_riscv_fp_d_restore:
+SYM_FUNC_START(__kvm_riscv_fp_d_restore)
csrr t2, CSR_SSTATUS
li t1, SR_FS
lw t0, KVM_ARCH_FP_D_FCSR(a0)
fscsr t0
csrw CSR_SSTATUS, t2
ret
+SYM_FUNC_END(__kvm_riscv_fp_d_restore)
#endif
cntx->vector.datap = kmalloc(riscv_v_vsize, GFP_KERNEL);
if (!cntx->vector.datap)
return -ENOMEM;
+ cntx->vector.vlenb = riscv_v_vsize / 32;
vcpu->arch.host_context.vector.datap = kzalloc(riscv_v_vsize, GFP_KERNEL);
if (!vcpu->arch.host_context.vector.datap)
case KVM_REG_RISCV_VECTOR_CSR_REG(vcsr):
*reg_addr = &cntx->vector.vcsr;
break;
+ case KVM_REG_RISCV_VECTOR_CSR_REG(vlenb):
+ *reg_addr = &cntx->vector.vlenb;
+ break;
case KVM_REG_RISCV_VECTOR_CSR_REG(datap):
default:
return -ENOENT;
if (!riscv_isa_extension_available(isa, v))
return -ENOENT;
+ if (reg_num == KVM_REG_RISCV_VECTOR_CSR_REG(vlenb)) {
+ struct kvm_cpu_context *cntx = &vcpu->arch.guest_context;
+ unsigned long reg_val;
+
+ if (copy_from_user(®_val, uaddr, reg_size))
+ return -EFAULT;
+ if (reg_val != cntx->vector.vlenb)
+ return -EINVAL;
+
+ return 0;
+ }
+
rc = kvm_riscv_vcpu_vreg_addr(vcpu, reg_num, reg_size, ®_addr);
if (rc)
return rc;
CONFIG_KEXEC_SIG=y
CONFIG_CRASH_DUMP=y
CONFIG_LIVEPATCH=y
-CONFIG_MARCH_ZEC12=y
-CONFIG_TUNE_ZEC12=y
+CONFIG_MARCH_Z13=y
CONFIG_NR_CPUS=512
CONFIG_NUMA=y
CONFIG_HZ_100=y
CONFIG_MODULE_UNLOAD_TAINT_TRACKING=y
CONFIG_MODVERSIONS=y
CONFIG_MODULE_SRCVERSION_ALL=y
-CONFIG_MODULE_SIG_SHA256=y
CONFIG_BLK_DEV_THROTTLING=y
CONFIG_BLK_WBT=y
CONFIG_BLK_CGROUP_IOLATENCY=y
CONFIG_IOSCHED_BFQ=y
CONFIG_BINFMT_MISC=m
CONFIG_ZSWAP=y
+CONFIG_ZSWAP_ZPOOL_DEFAULT_ZBUD=y
CONFIG_ZSMALLOC_STAT=y
CONFIG_SLUB_STATS=y
# CONFIG_COMPAT_BRK is not set
CONFIG_BTRFS_DEBUG=y
CONFIG_BTRFS_ASSERT=y
CONFIG_NILFS2_FS=m
+CONFIG_BCACHEFS_FS=y
+CONFIG_BCACHEFS_QUOTA=y
+CONFIG_BCACHEFS_POSIX_ACL=y
CONFIG_FS_DAX=y
CONFIG_EXPORTFS_BLOCK_OPS=y
CONFIG_FS_ENCRYPTION=y
CONFIG_ENCRYPTED_KEYS=m
CONFIG_KEY_NOTIFICATIONS=y
CONFIG_SECURITY=y
-CONFIG_SECURITY_NETWORK=y
CONFIG_HARDENED_USERCOPY=y
CONFIG_FORTIFY_SOURCE=y
CONFIG_SECURITY_SELINUX=y
CONFIG_DEBUG_LIST=y
CONFIG_DEBUG_SG=y
CONFIG_DEBUG_NOTIFIERS=y
-CONFIG_DEBUG_CREDENTIALS=y
CONFIG_RCU_TORTURE_TEST=m
CONFIG_RCU_REF_SCALE_TEST=m
CONFIG_RCU_CPU_STALL_TIMEOUT=300
CONFIG_KEXEC_SIG=y
CONFIG_CRASH_DUMP=y
CONFIG_LIVEPATCH=y
-CONFIG_MARCH_ZEC12=y
-CONFIG_TUNE_ZEC12=y
+CONFIG_MARCH_Z13=y
CONFIG_NR_CPUS=512
CONFIG_NUMA=y
CONFIG_HZ_100=y
CONFIG_MODULE_UNLOAD_TAINT_TRACKING=y
CONFIG_MODVERSIONS=y
CONFIG_MODULE_SRCVERSION_ALL=y
-CONFIG_MODULE_SIG_SHA256=y
CONFIG_BLK_DEV_THROTTLING=y
CONFIG_BLK_WBT=y
CONFIG_BLK_CGROUP_IOLATENCY=y
CONFIG_IOSCHED_BFQ=y
CONFIG_BINFMT_MISC=m
CONFIG_ZSWAP=y
+CONFIG_ZSWAP_ZPOOL_DEFAULT_ZBUD=y
CONFIG_ZSMALLOC_STAT=y
# CONFIG_COMPAT_BRK is not set
CONFIG_MEMORY_HOTPLUG=y
CONFIG_BTRFS_FS=y
CONFIG_BTRFS_FS_POSIX_ACL=y
CONFIG_NILFS2_FS=m
+CONFIG_BCACHEFS_FS=m
+CONFIG_BCACHEFS_QUOTA=y
+CONFIG_BCACHEFS_POSIX_ACL=y
CONFIG_FS_DAX=y
CONFIG_EXPORTFS_BLOCK_OPS=y
CONFIG_FS_ENCRYPTION=y
CONFIG_ENCRYPTED_KEYS=m
CONFIG_KEY_NOTIFICATIONS=y
CONFIG_SECURITY=y
-CONFIG_SECURITY_NETWORK=y
CONFIG_SECURITY_SELINUX=y
CONFIG_SECURITY_SELINUX_BOOTPARAM=y
CONFIG_SECURITY_LOCKDOWN_LSM=y
CONFIG_BLK_DEV_INITRD=y
CONFIG_CC_OPTIMIZE_FOR_SIZE=y
CONFIG_CRASH_DUMP=y
-CONFIG_MARCH_ZEC12=y
-CONFIG_TUNE_ZEC12=y
+CONFIG_MARCH_Z13=y
# CONFIG_COMPAT is not set
CONFIG_NR_CPUS=2
CONFIG_HZ_100=y
preempt_enable();
}
+/**
+ * stfle_size - Actual size of the facility list as specified by stfle
+ * (number of double words)
+ */
+unsigned int stfle_size(void);
+
#endif /* __ASM_FACILITY_H */
#define KERNEL_VXR_HIGH (KERNEL_VXR_V16V23|KERNEL_VXR_V24V31)
#define KERNEL_VXR (KERNEL_VXR_LOW|KERNEL_VXR_HIGH)
-#define KERNEL_FPR (KERNEL_FPC|KERNEL_VXR_V0V7)
+#define KERNEL_FPR (KERNEL_FPC|KERNEL_VXR_LOW)
struct kernel_fpu;
struct kvm_s390_cpu_model {
/* facility mask supported by kvm & hosting machine */
- __u64 fac_mask[S390_ARCH_FAC_LIST_SIZE_U64];
+ __u64 fac_mask[S390_ARCH_FAC_MASK_SIZE_U64];
struct kvm_s390_vm_cpu_subfunc subfuncs;
/* facility list requested by guest (in dma page) */
__u64 *fac_list;
execve_tail(); \
} while (0)
-/* Forward declaration, a strange C thing */
struct task_struct;
struct mm_struct;
struct seq_file;
cond_syscall(__s390x_sys_##name); \
cond_syscall(__s390_sys_##name)
-#define SYS_NI(name) \
- SYSCALL_ALIAS(__s390x_sys_##name, sys_ni_posix_timers); \
- SYSCALL_ALIAS(__s390_sys_##name, sys_ni_posix_timers)
-
#define COMPAT_SYSCALL_DEFINEx(x, name, ...) \
long __s390_compat_sys##name(struct pt_regs *regs); \
ALLOW_ERROR_INJECTION(__s390_compat_sys##name, ERRNO); \
/*
* As some compat syscalls may not be implemented, we need to expand
- * COND_SYSCALL_COMPAT in kernel/sys_ni.c and COMPAT_SYS_NI in
- * kernel/time/posix-stubs.c to cover this case as well.
+ * COND_SYSCALL_COMPAT in kernel/sys_ni.c to cover this case as well.
*/
#define COND_SYSCALL_COMPAT(name) \
cond_syscall(__s390_compat_sys_##name)
-#define COMPAT_SYS_NI(name) \
- SYSCALL_ALIAS(__s390_compat_sys_##name, sys_ni_posix_timers)
-
#define __S390_SYS_STUBx(x, name, ...) \
long __s390_sys##name(struct pt_regs *regs); \
ALLOW_ERROR_INJECTION(__s390_sys##name, ERRNO); \
#define COND_SYSCALL(name) \
cond_syscall(__s390x_sys_##name)
-#define SYS_NI(name) \
- SYSCALL_ALIAS(__s390x_sys_##name, sys_ni_posix_timers)
-
#define __S390_SYS_STUBx(x, fullname, name, ...)
#endif /* CONFIG_COMPAT */
obj-y += runtime_instr.o cache.o fpu.o dumpstack.o guarded_storage.o sthyi.o
obj-y += entry.o reipl.o kdebugfs.o alternative.o
obj-y += nospec-branch.o ipl_vmparm.o machine_kexec_reloc.o unwind_bc.o
-obj-y += smp.o text_amode31.o stacktrace.o abs_lowcore.o
+obj-y += smp.o text_amode31.o stacktrace.o abs_lowcore.o facility.o
extra-y += vmlinux.lds
--- /dev/null
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright IBM Corp. 2023
+ */
+
+#include <asm/facility.h>
+
+unsigned int stfle_size(void)
+{
+ static unsigned int size;
+ unsigned int r;
+ u64 dummy;
+
+ r = READ_ONCE(size);
+ if (!r) {
+ r = __stfle_asm(&dummy, 1) + 1;
+ WRITE_ONCE(size, r);
+ }
+ return r;
+}
+EXPORT_SYMBOL(stfle_size);
&ipl_ccw_attr_group_lpar);
break;
case IPL_TYPE_ECKD:
+ case IPL_TYPE_ECKD_DUMP:
rc = sysfs_create_group(&ipl_kset->kobj, &ipl_eckd_attr_group);
break;
case IPL_TYPE_FCP:
if (IS_ERR(cpump))
return PTR_ERR(cpump);
- /* Event initialization sets last_tag to 0. When later on the events
- * are deleted and re-added, do not reset the event count value to zero.
- * Events are added, deleted and re-added when 2 or more events
- * are active at the same time.
- */
- event->hw.last_tag = 0;
event->destroy = paicrypt_event_destroy;
if (a->sample_period) {
{
u64 sum;
+ /* Event initialization sets last_tag to 0. When later on the events
+ * are deleted and re-added, do not reset the event count value to zero.
+ * Events are added, deleted and re-added when 2 or more events
+ * are active at the same time.
+ */
if (!event->hw.last_tag) {
event->hw.last_tag = 1;
sum = paicrypt_getall(event); /* Get current value */
rc = paiext_alloc(a, event);
if (rc)
return rc;
- event->hw.last_tag = 0;
event->destroy = paiext_event_destroy;
if (a->sample_period) {
def_tristate y
prompt "Kernel-based Virtual Machine (KVM) support"
depends on HAVE_KVM
- select PREEMPT_NOTIFIERS
select HAVE_KVM_CPU_RELAX_INTERCEPT
select HAVE_KVM_VCPU_ASYNC_IOCTL
- select HAVE_KVM_EVENTFD
select KVM_ASYNC_PF
select KVM_ASYNC_PF_SYNC
+ select KVM_COMMON
select HAVE_KVM_IRQCHIP
- select HAVE_KVM_IRQFD
select HAVE_KVM_IRQ_ROUTING
select HAVE_KVM_INVALID_WAKEUPS
select HAVE_KVM_NO_POLL
select KVM_VFIO
- select INTERVAL_TREE
select MMU_NOTIFIER
help
Support hosting paravirtualized guest machines using the SIE
#include <asm/nmi.h>
#include <asm/dis.h>
#include <asm/fpu/api.h>
+#include <asm/facility.h>
#include "kvm-s390.h"
#include "gaccess.h"
if (!gmap_is_shadow(gmap))
return;
- if (start >= 1UL << 31)
- /* We are only interested in prefix pages */
- return;
-
/*
* Only new shadow blocks are added to the list during runtime,
* therefore we can safely reference them all the time.
static int handle_stfle(struct kvm_vcpu *vcpu, struct vsie_page *vsie_page)
{
struct kvm_s390_sie_block *scb_s = &vsie_page->scb_s;
- __u32 fac = READ_ONCE(vsie_page->scb_o->fac) & 0x7ffffff8U;
+ __u32 fac = READ_ONCE(vsie_page->scb_o->fac);
+ /*
+ * Alternate-STFLE-Interpretive-Execution facilities are not supported
+ * -> format-0 flcb
+ */
if (fac && test_kvm_facility(vcpu->kvm, 7)) {
retry_vsie_icpt(vsie_page);
+ /*
+ * The facility list origin (FLO) is in bits 1 - 28 of the FLD
+ * so we need to mask here before reading.
+ */
+ fac = fac & 0x7ffffff8U;
+ /*
+ * format-0 -> size of nested guest's facility list == guest's size
+ * guest's size == host's size, since STFLE is interpretatively executed
+ * using a format-0 for the guest, too.
+ */
if (read_guest_real(vcpu, fac, &vsie_page->fac,
- sizeof(vsie_page->fac)))
+ stfle_size() * sizeof(u64)))
return set_validity_icpt(scb_s, 0x1090U);
scb_s->fac = (__u32)(__u64) &vsie_page->fac;
}
pte_clear(mm, addr, ptep);
}
if (reset)
- pgste_val(pgste) &= ~_PGSTE_GPS_USAGE_MASK;
+ pgste_val(pgste) &= ~(_PGSTE_GPS_USAGE_MASK | _PGSTE_GPS_NODAT);
pgste_set_unlock(ptep, pgste);
preempt_enable();
}
/* The native architecture */
#define KEXEC_ARCH KEXEC_ARCH_SH
-#ifdef CONFIG_KEXEC
+#ifdef CONFIG_KEXEC_CORE
/* arch/sh/kernel/machine_kexec.c */
void reserve_crashkernel(void);
}
#else
static inline void reserve_crashkernel(void) { }
-#endif /* CONFIG_KEXEC */
+#endif /* CONFIG_KEXEC_CORE */
#endif /* __ASM_SH_KEXEC_H */
obj-$(CONFIG_SH_STANDARD_BIOS) += sh_bios.o
obj-$(CONFIG_KGDB) += kgdb.o
obj-$(CONFIG_MODULES) += sh_ksyms_32.o module.o
-obj-$(CONFIG_KEXEC) += machine_kexec.o relocate_kernel.o
+obj-$(CONFIG_KEXEC_CORE) += machine_kexec.o relocate_kernel.o
obj-$(CONFIG_CRASH_DUMP) += crash_dump.o
obj-$(CONFIG_STACKTRACE) += stacktrace.o
obj-$(CONFIG_IO_TRAPPED) += io_trapped.o
.shutdown = native_machine_shutdown,
.restart = native_machine_restart,
.halt = native_machine_halt,
-#ifdef CONFIG_KEXEC
+#ifdef CONFIG_KEXEC_CORE
.crash_shutdown = native_machine_crash_shutdown,
#endif
};
machine_ops.halt();
}
-#ifdef CONFIG_KEXEC
+#ifdef CONFIG_KEXEC_CORE
void machine_crash_shutdown(struct pt_regs *regs)
{
machine_ops.crash_shutdown(regs);
request_resource(res, &code_resource);
request_resource(res, &data_resource);
request_resource(res, &bss_resource);
-#ifdef CONFIG_KEXEC
+#ifdef CONFIG_KEXEC_CORE
request_resource(res, &crashk_res);
#endif
{
unsigned long addr = 0;
-#ifdef CONFIG_KEXEC
+#ifdef CONFIG_KEXEC_CORE
char val[MAX_ADDR_LEN] = { };
int ret;
#include <asm/coco.h>
#include <asm/tdx.h>
#include <asm/vmx.h>
+#include <asm/ia32.h>
#include <asm/insn.h>
#include <asm/insn-eval.h>
#include <asm/pgtable.h>
#include <xen/events.h>
#endif
+#include <asm/apic.h>
#include <asm/desc.h>
#include <asm/traps.h>
#include <asm/vdso.h>
}
}
-/* Handles int $0x80 */
+#ifdef CONFIG_IA32_EMULATION
+static __always_inline bool int80_is_external(void)
+{
+ const unsigned int offs = (0x80 / 32) * 0x10;
+ const u32 bit = BIT(0x80 % 32);
+
+ /* The local APIC on XENPV guests is fake */
+ if (cpu_feature_enabled(X86_FEATURE_XENPV))
+ return false;
+
+ /*
+ * If vector 0x80 is set in the APIC ISR then this is an external
+ * interrupt. Either from broken hardware or injected by a VMM.
+ *
+ * Note: In guest mode this is only valid for secure guests where
+ * the secure module fully controls the vAPIC exposed to the guest.
+ */
+ return apic_read(APIC_ISR + offs) & bit;
+}
+
+/**
+ * int80_emulation - 32-bit legacy syscall entry
+ *
+ * This entry point can be used by 32-bit and 64-bit programs to perform
+ * 32-bit system calls. Instances of INT $0x80 can be found inline in
+ * various programs and libraries. It is also used by the vDSO's
+ * __kernel_vsyscall fallback for hardware that doesn't support a faster
+ * entry method. Restarted 32-bit system calls also fall back to INT
+ * $0x80 regardless of what instruction was originally used to do the
+ * system call.
+ *
+ * This is considered a slow path. It is not used by most libc
+ * implementations on modern hardware except during process startup.
+ *
+ * The arguments for the INT $0x80 based syscall are on stack in the
+ * pt_regs structure:
+ * eax: system call number
+ * ebx, ecx, edx, esi, edi, ebp: arg1 - arg 6
+ */
+DEFINE_IDTENTRY_RAW(int80_emulation)
+{
+ int nr;
+
+ /* Kernel does not use INT $0x80! */
+ if (unlikely(!user_mode(regs))) {
+ irqentry_enter(regs);
+ instrumentation_begin();
+ panic("Unexpected external interrupt 0x80\n");
+ }
+
+ /*
+ * Establish kernel context for instrumentation, including for
+ * int80_is_external() below which calls into the APIC driver.
+ * Identical for soft and external interrupts.
+ */
+ enter_from_user_mode(regs);
+
+ instrumentation_begin();
+ add_random_kstack_offset();
+
+ /* Validate that this is a soft interrupt to the extent possible */
+ if (unlikely(int80_is_external()))
+ panic("Unexpected external interrupt 0x80\n");
+
+ /*
+ * The low level idtentry code pushed -1 into regs::orig_ax
+ * and regs::ax contains the syscall number.
+ *
+ * User tracing code (ptrace or signal handlers) might assume
+ * that the regs::orig_ax contains a 32-bit number on invoking
+ * a 32-bit syscall.
+ *
+ * Establish the syscall convention by saving the 32bit truncated
+ * syscall number in regs::orig_ax and by invalidating regs::ax.
+ */
+ regs->orig_ax = regs->ax & GENMASK(31, 0);
+ regs->ax = -ENOSYS;
+
+ nr = syscall_32_enter(regs);
+
+ local_irq_enable();
+ nr = syscall_enter_from_user_mode_work(regs, nr);
+ do_syscall_32_irqs_on(regs, nr);
+
+ instrumentation_end();
+ syscall_exit_to_user_mode(regs);
+}
+#else /* CONFIG_IA32_EMULATION */
+
+/* Handles int $0x80 on a 32bit kernel */
__visible noinstr void do_int80_syscall_32(struct pt_regs *regs)
{
int nr = syscall_32_enter(regs);
instrumentation_end();
syscall_exit_to_user_mode(regs);
}
+#endif /* !CONFIG_IA32_EMULATION */
static noinstr bool __do_fast_syscall_32(struct pt_regs *regs)
{
ANNOTATE_NOENDBR
int3
SYM_CODE_END(entry_SYSCALL_compat)
-
-/*
- * 32-bit legacy system call entry.
- *
- * 32-bit x86 Linux system calls traditionally used the INT $0x80
- * instruction. INT $0x80 lands here.
- *
- * This entry point can be used by 32-bit and 64-bit programs to perform
- * 32-bit system calls. Instances of INT $0x80 can be found inline in
- * various programs and libraries. It is also used by the vDSO's
- * __kernel_vsyscall fallback for hardware that doesn't support a faster
- * entry method. Restarted 32-bit system calls also fall back to INT
- * $0x80 regardless of what instruction was originally used to do the
- * system call.
- *
- * This is considered a slow path. It is not used by most libc
- * implementations on modern hardware except during process startup.
- *
- * Arguments:
- * eax system call number
- * ebx arg1
- * ecx arg2
- * edx arg3
- * esi arg4
- * edi arg5
- * ebp arg6
- */
-SYM_CODE_START(entry_INT80_compat)
- UNWIND_HINT_ENTRY
- ENDBR
- /*
- * Interrupts are off on entry.
- */
- ASM_CLAC /* Do this early to minimize exposure */
- ALTERNATIVE "swapgs", "", X86_FEATURE_XENPV
-
- /*
- * User tracing code (ptrace or signal handlers) might assume that
- * the saved RAX contains a 32-bit number when we're invoking a 32-bit
- * syscall. Just in case the high bits are nonzero, zero-extend
- * the syscall number. (This could almost certainly be deleted
- * with no ill effects.)
- */
- movl %eax, %eax
-
- /* switch to thread stack expects orig_ax and rdi to be pushed */
- pushq %rax /* pt_regs->orig_ax */
-
- /* Need to switch before accessing the thread stack. */
- SWITCH_TO_KERNEL_CR3 scratch_reg=%rax
-
- /* In the Xen PV case we already run on the thread stack. */
- ALTERNATIVE "", "jmp .Lint80_keep_stack", X86_FEATURE_XENPV
-
- movq %rsp, %rax
- movq PER_CPU_VAR(pcpu_hot + X86_top_of_stack), %rsp
-
- pushq 5*8(%rax) /* regs->ss */
- pushq 4*8(%rax) /* regs->rsp */
- pushq 3*8(%rax) /* regs->eflags */
- pushq 2*8(%rax) /* regs->cs */
- pushq 1*8(%rax) /* regs->ip */
- pushq 0*8(%rax) /* regs->orig_ax */
-.Lint80_keep_stack:
-
- PUSH_AND_CLEAR_REGS rax=$-ENOSYS
- UNWIND_HINT_REGS
-
- cld
-
- IBRS_ENTER
- UNTRAIN_RET
-
- movq %rsp, %rdi
- call do_int80_syscall_32
- jmp swapgs_restore_regs_and_return_to_usermode
-SYM_CODE_END(entry_INT80_compat)
if (pmu->intel_cap.pebs_output_pt_available)
pmu->pmu.capabilities |= PERF_PMU_CAP_AUX_OUTPUT;
else
- pmu->pmu.capabilities |= ~PERF_PMU_CAP_AUX_OUTPUT;
+ pmu->pmu.capabilities &= ~PERF_PMU_CAP_AUX_OUTPUT;
intel_pmu_check_event_constraints(pmu->event_constraints,
pmu->num_counters,
#include <linux/io.h>
#include <asm/apic.h>
#include <asm/desc.h>
+#include <asm/e820/api.h>
#include <asm/sev.h>
#include <asm/ibt.h>
#include <asm/hypervisor.h>
static int __init hv_pci_init(void)
{
- int gen2vm = efi_enabled(EFI_BOOT);
+ bool gen2vm = efi_enabled(EFI_BOOT);
/*
- * For Generation-2 VM, we exit from pci_arch_init() by returning 0.
- * The purpose is to suppress the harmless warning:
+ * A Generation-2 VM doesn't support legacy PCI/PCIe, so both
+ * raw_pci_ops and raw_pci_ext_ops are NULL, and pci_subsys_init() ->
+ * pcibios_init() doesn't call pcibios_resource_survey() ->
+ * e820__reserve_resources_late(); as a result, any emulated persistent
+ * memory of E820_TYPE_PRAM (12) via the kernel parameter
+ * memmap=nn[KMG]!ss is not added into iomem_resource and hence can't be
+ * detected by register_e820_pmem(). Fix this by directly calling
+ * e820__reserve_resources_late() here: e820__reserve_resources_late()
+ * depends on e820__reserve_resources(), which has been called earlier
+ * from setup_arch(). Note: e820__reserve_resources_late() also adds
+ * any memory of E820_TYPE_PMEM (7) into iomem_resource, and
+ * acpi_nfit_register_region() -> acpi_nfit_insert_resource() ->
+ * region_intersects() returns REGION_INTERSECTS, so the memory of
+ * E820_TYPE_PMEM won't get added twice.
+ *
+ * We return 0 here so that pci_arch_init() won't print the warning:
* "PCI: Fatal: No config space access function found"
*/
- if (gen2vm)
+ if (gen2vm) {
+ e820__reserve_resources_late();
return 0;
+ }
/* For Generation-1 VM, we'll proceed in pci_arch_init(). */
return 1;
#include <asm/x86_init.h>
#include <asm/cpufeature.h>
#include <asm/irq_vectors.h>
+#include <asm/xen/hypervisor.h>
+
+#include <xen/xen.h>
#ifdef CONFIG_ACPI_APEI
# include <asm/pgtable_types.h>
if (!cpu_has(c, X86_FEATURE_MWAIT) ||
boot_option_idle_override == IDLE_NOMWAIT)
*cap &= ~(ACPI_PROC_CAP_C_C1_FFH | ACPI_PROC_CAP_C_C2C3_FFH);
+
+ if (xen_initial_domain()) {
+ /*
+ * When Linux is running as Xen dom0, the hypervisor is the
+ * entity in charge of the processor power management, and so
+ * Xen needs to check the OS capabilities reported in the
+ * processor capabilities buffer matches what the hypervisor
+ * driver supports.
+ */
+ xen_sanitize_proc_cap_bits(cap);
+ }
}
static inline bool acpi_has_cpu_in_madt(void)
return __ia32_enabled;
}
+static inline void ia32_disable(void)
+{
+ __ia32_enabled = false;
+}
+
#else /* !CONFIG_IA32_EMULATION */
static inline bool ia32_enabled(void)
return IS_ENABLED(CONFIG_X86_32);
}
+static inline void ia32_disable(void) {}
+
#endif
#endif /* _ASM_X86_IA32_H */
DECLARE_IDTENTRY_RAW(X86_TRAP_BP, exc_int3);
DECLARE_IDTENTRY_RAW_ERRORCODE(X86_TRAP_PF, exc_page_fault);
+#if defined(CONFIG_IA32_EMULATION)
+DECLARE_IDTENTRY_RAW(IA32_SYSCALL_VECTOR, int80_emulation);
+#endif
+
#ifdef CONFIG_X86_MCE
#ifdef CONFIG_X86_64
DECLARE_IDTENTRY_MCE(X86_TRAP_MC, exc_machine_check);
void entry_SYSCALL_compat_safe_stack(void);
void entry_SYSRETL_compat_unsafe_stack(void);
void entry_SYSRETL_compat_end(void);
-void entry_INT80_compat(void);
-#ifdef CONFIG_XEN_PV
-void xen_entry_INT80_compat(void);
-#endif
#else /* !CONFIG_IA32_EMULATION */
#define entry_SYSCALL_compat NULL
#define entry_SYSENTER_compat NULL
return sys_ni_syscall(); \
}
-#define __SYS_NI(abi, name) \
- SYSCALL_ALIAS(__##abi##_##name, sys_ni_posix_timers);
-
#ifdef CONFIG_X86_64
#define __X64_SYS_STUB0(name) \
__SYS_STUB0(x64, sys_##name)
#define __X64_COND_SYSCALL(name) \
__COND_SYSCALL(x64, sys_##name)
-#define __X64_SYS_NI(name) \
- __SYS_NI(x64, sys_##name)
#else /* CONFIG_X86_64 */
#define __X64_SYS_STUB0(name)
#define __X64_SYS_STUBx(x, name, ...)
#define __X64_COND_SYSCALL(name)
-#define __X64_SYS_NI(name)
#endif /* CONFIG_X86_64 */
#if defined(CONFIG_X86_32) || defined(CONFIG_IA32_EMULATION)
#define __IA32_COND_SYSCALL(name) \
__COND_SYSCALL(ia32, sys_##name)
-#define __IA32_SYS_NI(name) \
- __SYS_NI(ia32, sys_##name)
#else /* CONFIG_X86_32 || CONFIG_IA32_EMULATION */
#define __IA32_SYS_STUB0(name)
#define __IA32_SYS_STUBx(x, name, ...)
#define __IA32_COND_SYSCALL(name)
-#define __IA32_SYS_NI(name)
#endif /* CONFIG_X86_32 || CONFIG_IA32_EMULATION */
#ifdef CONFIG_IA32_EMULATION
* additional wrappers (aptly named __ia32_sys_xyzzy) which decode the
* ia32 regs in the proper order for shared or "common" syscalls. As some
* syscalls may not be implemented, we need to expand COND_SYSCALL in
- * kernel/sys_ni.c and SYS_NI in kernel/time/posix-stubs.c to cover this
- * case as well.
+ * kernel/sys_ni.c to cover this case as well.
*/
#define __IA32_COMPAT_SYS_STUB0(name) \
__SYS_STUB0(ia32, compat_sys_##name)
#define __IA32_COMPAT_COND_SYSCALL(name) \
__COND_SYSCALL(ia32, compat_sys_##name)
-#define __IA32_COMPAT_SYS_NI(name) \
- __SYS_NI(ia32, compat_sys_##name)
-
#else /* CONFIG_IA32_EMULATION */
#define __IA32_COMPAT_SYS_STUB0(name)
#define __IA32_COMPAT_SYS_STUBx(x, name, ...)
#define __IA32_COMPAT_COND_SYSCALL(name)
-#define __IA32_COMPAT_SYS_NI(name)
#endif /* CONFIG_IA32_EMULATION */
#define __X32_COMPAT_COND_SYSCALL(name) \
__COND_SYSCALL(x64, compat_sys_##name)
-#define __X32_COMPAT_SYS_NI(name) \
- __SYS_NI(x64, compat_sys_##name)
#else /* CONFIG_X86_X32_ABI */
#define __X32_COMPAT_SYS_STUB0(name)
#define __X32_COMPAT_SYS_STUBx(x, name, ...)
#define __X32_COMPAT_COND_SYSCALL(name)
-#define __X32_COMPAT_SYS_NI(name)
#endif /* CONFIG_X86_X32_ABI */
/*
* As some compat syscalls may not be implemented, we need to expand
- * COND_SYSCALL_COMPAT in kernel/sys_ni.c and COMPAT_SYS_NI in
- * kernel/time/posix-stubs.c to cover this case as well.
+ * COND_SYSCALL_COMPAT in kernel/sys_ni.c to cover this case as well.
*/
#define COND_SYSCALL_COMPAT(name) \
__IA32_COMPAT_COND_SYSCALL(name) \
__X32_COMPAT_COND_SYSCALL(name)
-#define COMPAT_SYS_NI(name) \
- __IA32_COMPAT_SYS_NI(name) \
- __X32_COMPAT_SYS_NI(name)
-
#endif /* CONFIG_COMPAT */
#define __SYSCALL_DEFINEx(x, name, ...) \
* As the generic SYSCALL_DEFINE0() macro does not decode any parameters for
* obvious reasons, and passing struct pt_regs *regs to it in %rdi does not
* hurt, we only need to re-define it here to keep the naming congruent to
- * SYSCALL_DEFINEx() -- which is essential for the COND_SYSCALL() and SYS_NI()
- * macros to work correctly.
+ * SYSCALL_DEFINEx() -- which is essential for the COND_SYSCALL() macro
+ * to work correctly.
*/
#define SYSCALL_DEFINE0(sname) \
SYSCALL_METADATA(_##sname, 0); \
__X64_COND_SYSCALL(name) \
__IA32_COND_SYSCALL(name)
-#define SYS_NI(name) \
- __X64_SYS_NI(name) \
- __IA32_SYS_NI(name)
-
/*
* For VSYSCALLS, we need to declare these three syscalls with the new
enum xen_lazy_mode xen_get_lazy_mode(void);
+#if defined(CONFIG_XEN_DOM0) && defined(CONFIG_ACPI)
+void xen_sanitize_proc_cap_bits(uint32_t *buf);
+#else
+static inline void xen_sanitize_proc_cap_bits(uint32_t *buf)
+{
+ BUG();
+}
+#endif
+
#endif /* _ASM_X86_XEN_HYPERVISOR_H */
#ifdef CONFIG_X86_LOCAL_APIC
static u64 acpi_lapic_addr __initdata = APIC_DEFAULT_PHYS_BASE;
+static bool has_lapic_cpus __initdata;
static bool acpi_support_online_capable;
#endif
if (!acpi_is_processor_usable(processor->lapic_flags))
return 0;
+ /*
+ * According to https://uefi.org/specs/ACPI/6.5/05_ACPI_Software_Programming_Model.html#processor-local-x2apic-structure
+ * when MADT provides both valid LAPIC and x2APIC entries, the APIC ID
+ * in x2APIC must be equal or greater than 0xff.
+ */
+ if (has_lapic_cpus && apic_id < 0xff)
+ return 0;
+
/*
* We need to register disabled CPU as well to permit
* counting disabled CPUs. This allows us to size
processor->processor_id, /* ACPI ID */
processor->lapic_flags & ACPI_MADT_ENABLED);
+ has_lapic_cpus = true;
return 0;
}
static int __init acpi_parse_madt_lapic_entries(void)
{
- int count;
- int x2count = 0;
- int ret;
- struct acpi_subtable_proc madt_proc[2];
+ int count, x2count = 0;
if (!boot_cpu_has(X86_FEATURE_APIC))
return -ENODEV;
acpi_parse_sapic, MAX_LOCAL_APIC);
if (!count) {
- memset(madt_proc, 0, sizeof(madt_proc));
- madt_proc[0].id = ACPI_MADT_TYPE_LOCAL_APIC;
- madt_proc[0].handler = acpi_parse_lapic;
- madt_proc[1].id = ACPI_MADT_TYPE_LOCAL_X2APIC;
- madt_proc[1].handler = acpi_parse_x2apic;
- ret = acpi_table_parse_entries_array(ACPI_SIG_MADT,
- sizeof(struct acpi_table_madt),
- madt_proc, ARRAY_SIZE(madt_proc), MAX_LOCAL_APIC);
- if (ret < 0) {
- pr_err("Error parsing LAPIC/X2APIC entries\n");
- return ret;
- }
-
- count = madt_proc[0].count;
- x2count = madt_proc[1].count;
+ count = acpi_table_parse_madt(ACPI_MADT_TYPE_LOCAL_APIC,
+ acpi_parse_lapic, MAX_LOCAL_APIC);
+ x2count = acpi_table_parse_madt(ACPI_MADT_TYPE_LOCAL_X2APIC,
+ acpi_parse_x2apic, MAX_LOCAL_APIC);
}
if (!count && !x2count) {
pr_err("No LAPIC entries present\n");
}
}
+static void __init_or_module noinline optimize_nops_inplace(u8 *instr, size_t len)
+{
+ unsigned long flags;
+
+ local_irq_save(flags);
+ optimize_nops(instr, len);
+ sync_core();
+ local_irq_restore(flags);
+}
+
/*
* In this context, "source" is where the instructions are placed in the
* section .altinstr_replacement, for example during kernel build by the
* patch if feature is *NOT* present.
*/
if (!boot_cpu_has(a->cpuid) == !(a->flags & ALT_FLAG_NOT)) {
- optimize_nops(instr, a->instrlen);
+ optimize_nops_inplace(instr, a->instrlen);
continue;
}
} else {
local_irq_save(flags);
memcpy(addr, opcode, len);
- local_irq_restore(flags);
sync_core();
+ local_irq_restore(flags);
/*
* Could also do a CLFLUSH here to speed up CPU recovery; but
void amd_check_microcode(void)
{
+ if (boot_cpu_data.x86_vendor != X86_VENDOR_AMD)
+ return;
+
on_each_cpu(zenbleed_check_cpu, NULL, 1);
}
size_t size;
};
-static u32 ucode_new_rev;
-
/*
* Microcode patch container file is prepended to the initrd in cpio
* format. See Documentation/arch/x86/microcode.rst
*
* Returns true if container found (sets @desc), false otherwise.
*/
-static bool early_apply_microcode(u32 cpuid_1_eax, void *ucode, size_t size)
+static bool early_apply_microcode(u32 cpuid_1_eax, u32 old_rev, void *ucode, size_t size)
{
struct cont_desc desc = { 0 };
struct microcode_amd *mc;
bool ret = false;
- u32 rev, dummy;
desc.cpuid_1_eax = cpuid_1_eax;
if (!mc)
return ret;
- native_rdmsr(MSR_AMD64_PATCH_LEVEL, rev, dummy);
-
/*
* Allow application of the same revision to pick up SMT-specific
* changes even if the revision of the other SMT thread is already
* up-to-date.
*/
- if (rev > mc->hdr.patch_id)
+ if (old_rev > mc->hdr.patch_id)
return ret;
- if (!__apply_microcode_amd(mc)) {
- ucode_new_rev = mc->hdr.patch_id;
- ret = true;
- }
-
- return ret;
+ return !__apply_microcode_amd(mc);
}
static bool get_builtin_microcode(struct cpio_data *cp, unsigned int family)
*ret = cp;
}
-void __init load_ucode_amd_bsp(unsigned int cpuid_1_eax)
+void __init load_ucode_amd_bsp(struct early_load_data *ed, unsigned int cpuid_1_eax)
{
struct cpio_data cp = { };
+ u32 dummy;
+
+ native_rdmsr(MSR_AMD64_PATCH_LEVEL, ed->old_rev, dummy);
/* Needed in load_microcode_amd() */
ucode_cpu_info[0].cpu_sig.sig = cpuid_1_eax;
if (!(cp.data && cp.size))
return;
- early_apply_microcode(cpuid_1_eax, cp.data, cp.size);
+ if (early_apply_microcode(cpuid_1_eax, ed->old_rev, cp.data, cp.size))
+ native_rdmsr(MSR_AMD64_PATCH_LEVEL, ed->new_rev, dummy);
}
static enum ucode_state load_microcode_amd(u8 family, const u8 *data, size_t size);
rdmsr(MSR_AMD64_PATCH_LEVEL, rev, dummy);
if (rev < mc->hdr.patch_id) {
- if (!__apply_microcode_amd(mc)) {
- ucode_new_rev = mc->hdr.patch_id;
- pr_info("reload patch_level=0x%08x\n", ucode_new_rev);
- }
+ if (!__apply_microcode_amd(mc))
+ pr_info_once("reload revision: 0x%08x\n", mc->hdr.patch_id);
}
}
if (p && (p->patch_id == csig->rev))
uci->mc = p->data;
- pr_info("CPU%d: patch_level=0x%08x\n", cpu, csig->rev);
-
return 0;
}
rev = mc_amd->hdr.patch_id;
ret = UCODE_UPDATED;
- pr_info("CPU%d: new patch_level=0x%08x\n", cpu, rev);
-
out:
uci->cpu_sig.rev = rev;
c->microcode = rev;
pr_warn("AMD CPU family 0x%x not supported\n", c->x86);
return NULL;
}
-
- if (ucode_new_rev)
- pr_info_once("microcode updated early to new patch_level=0x%08x\n",
- ucode_new_rev);
-
return µcode_amd_ops;
}
#include "internal.h"
-#define DRIVER_VERSION "2.2"
-
static struct microcode_ops *microcode_ops;
bool dis_ucode_ldr = true;
0, /* T-101 terminator */
};
+struct early_load_data early_data;
+
/*
* Check the current patch level on this CPU.
*
return;
if (intel)
- load_ucode_intel_bsp();
+ load_ucode_intel_bsp(&early_data);
else
- load_ucode_amd_bsp(cpuid_1_eax);
+ load_ucode_amd_bsp(&early_data, cpuid_1_eax);
}
void load_ucode_ap(void)
if (!microcode_ops)
return -ENODEV;
+ pr_info_once("Current revision: 0x%08x\n", (early_data.new_rev ?: early_data.old_rev));
+
+ if (early_data.new_rev)
+ pr_info_once("Updated early from: 0x%08x\n", early_data.old_rev);
+
microcode_pdev = platform_device_register_simple("microcode", -1, NULL, 0);
if (IS_ERR(microcode_pdev))
return PTR_ERR(microcode_pdev);
cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "x86/microcode:online",
mc_cpu_online, mc_cpu_down_prep);
- pr_info("Microcode Update Driver: v%s.", DRIVER_VERSION);
-
return 0;
out_pdev:
static enum ucode_state apply_microcode_early(struct ucode_cpu_info *uci)
{
struct microcode_intel *mc = uci->mc;
- enum ucode_state ret;
- u32 cur_rev, date;
+ u32 cur_rev;
- ret = __apply_microcode(uci, mc, &cur_rev);
- if (ret == UCODE_UPDATED) {
- date = mc->hdr.date;
- pr_info_once("updated early: 0x%x -> 0x%x, date = %04x-%02x-%02x\n",
- cur_rev, mc->hdr.rev, date & 0xffff, date >> 24, (date >> 16) & 0xff);
- }
- return ret;
+ return __apply_microcode(uci, mc, &cur_rev);
}
static __init bool load_builtin_intel_microcode(struct cpio_data *cp)
early_initcall(save_builtin_microcode);
/* Load microcode on BSP from initrd or builtin blobs */
-void __init load_ucode_intel_bsp(void)
+void __init load_ucode_intel_bsp(struct early_load_data *ed)
{
struct ucode_cpu_info uci;
+ ed->old_rev = intel_get_microcode_revision();
+
uci.mc = get_microcode_blob(&uci, false);
if (uci.mc && apply_microcode_early(&uci) == UCODE_UPDATED)
ucode_patch_va = UCODE_BSP_LOADED;
+
+ ed->new_rev = uci.cpu_sig.rev;
}
void load_ucode_intel_ap(void)
use_nmi : 1;
};
+struct early_load_data {
+ u32 old_rev;
+ u32 new_rev;
+};
+
+extern struct early_load_data early_data;
extern struct ucode_cpu_info ucode_cpu_info[];
struct cpio_data find_microcode_in_initrd(const char *path);
extern bool force_minrev;
#ifdef CONFIG_CPU_SUP_AMD
-void load_ucode_amd_bsp(unsigned int family);
+void load_ucode_amd_bsp(struct early_load_data *ed, unsigned int family);
void load_ucode_amd_ap(unsigned int family);
int save_microcode_in_initrd_amd(unsigned int family);
void reload_ucode_amd(unsigned int cpu);
struct microcode_ops *init_amd_microcode(void);
void exit_amd_microcode(void);
#else /* CONFIG_CPU_SUP_AMD */
-static inline void load_ucode_amd_bsp(unsigned int family) { }
+static inline void load_ucode_amd_bsp(struct early_load_data *ed, unsigned int family) { }
static inline void load_ucode_amd_ap(unsigned int family) { }
static inline int save_microcode_in_initrd_amd(unsigned int family) { return -EINVAL; }
static inline void reload_ucode_amd(unsigned int cpu) { }
#endif /* !CONFIG_CPU_SUP_AMD */
#ifdef CONFIG_CPU_SUP_INTEL
-void load_ucode_intel_bsp(void);
+void load_ucode_intel_bsp(struct early_load_data *ed);
void load_ucode_intel_ap(void);
void reload_ucode_intel(void);
struct microcode_ops *init_intel_microcode(void);
#else /* CONFIG_CPU_SUP_INTEL */
-static inline void load_ucode_intel_bsp(void) { }
+static inline void load_ucode_intel_bsp(struct early_load_data *ed) { }
static inline void load_ucode_intel_ap(void) { }
static inline void reload_ucode_intel(void) { }
static inline struct microcode_ops *init_intel_microcode(void) { return NULL; }
static int hv_nmi_unknown(unsigned int val, struct pt_regs *regs)
{
static atomic_t nmi_cpu = ATOMIC_INIT(-1);
+ unsigned int old_cpu, this_cpu;
if (!unknown_nmi_panic)
return NMI_DONE;
- if (atomic_cmpxchg(&nmi_cpu, -1, raw_smp_processor_id()) != -1)
+ old_cpu = -1;
+ this_cpu = raw_smp_processor_id();
+ if (!atomic_try_cmpxchg(&nmi_cpu, &old_cpu, this_cpu))
return NMI_HANDLED;
return NMI_DONE;
testl $X2APIC_ENABLE, %eax
jnz .Lread_apicid_msr
+#ifdef CONFIG_X86_X2APIC
+ /*
+ * If system is in X2APIC mode then MMIO base might not be
+ * mapped causing the MMIO read below to fault. Faults can't
+ * be handled at that point.
+ */
+ cmpl $0, x2apic_mode(%rip)
+ jz .Lread_apicid_mmio
+
+ /* Force the AP into X2APIC mode. */
+ orl $X2APIC_ENABLE, %eax
+ wrmsr
+ jmp .Lread_apicid_msr
+#endif
+
+.Lread_apicid_mmio:
/* Read the APIC ID from the fix-mapped MMIO space. */
movq apic_mmio_base(%rip), %rcx
addq $APIC_ID, %rcx
static const struct idt_data ia32_idt[] __initconst = {
#if defined(CONFIG_IA32_EMULATION)
- SYSG(IA32_SYSCALL_VECTOR, entry_INT80_compat),
+ SYSG(IA32_SYSCALL_VECTOR, asm_int80_emulation),
#elif defined(CONFIG_X86_32)
SYSG(IA32_SYSCALL_VECTOR, entry_INT80_32),
#endif
if (!cc_platform_has(CC_ATTR_GUEST_STATE_ENCRYPT))
return;
- /* First make sure the hypervisor talks a supported protocol. */
- if (!sev_es_negotiate_protocol())
- sev_es_terminate(SEV_TERM_SET_GEN, GHCB_SEV_ES_GEN_REQ);
-
/*
* Check whether the runtime #VC exception handler is active. It uses
* the per-CPU GHCB page which is set up by sev_es_init_vc_handling().
return;
}
+ /*
+ * Make sure the hypervisor talks a supported protocol.
+ * This gets called only in the BSP boot phase.
+ */
+ if (!sev_es_negotiate_protocol())
+ sev_es_terminate(SEV_TERM_SET_GEN, GHCB_SEV_ES_GEN_REQ);
+
/*
* Clear the boot_ghcb. The first exception comes in before the bss
* section is cleared.
frame = get_sigframe(ksig, regs, sizeof(struct rt_sigframe), &fp);
uc_flags = frame_uc_flags(regs);
- if (setup_signal_shadow_stack(ksig))
- return -EFAULT;
-
if (!user_access_begin(frame, sizeof(*frame)))
return -EFAULT;
return -EFAULT;
}
+ if (setup_signal_shadow_stack(ksig))
+ return -EFAULT;
+
/* Set up registers for signal handler */
regs->di = ksig->sig;
/* In case the signal handler was declared without prototypes */
depends on HAVE_KVM
depends on HIGH_RES_TIMERS
depends on X86_LOCAL_APIC
- select PREEMPT_NOTIFIERS
+ select KVM_COMMON
select KVM_GENERIC_MMU_NOTIFIER
select HAVE_KVM_IRQCHIP
select HAVE_KVM_PFNCACHE
- select HAVE_KVM_IRQFD
select HAVE_KVM_DIRTY_RING_TSO
select HAVE_KVM_DIRTY_RING_ACQ_REL
select IRQ_BYPASS_MANAGER
select HAVE_KVM_IRQ_BYPASS
select HAVE_KVM_IRQ_ROUTING
- select HAVE_KVM_EVENTFD
select KVM_ASYNC_PF
select USER_RETURN_NOTIFIER
select KVM_MMIO
select KVM_XFER_TO_GUEST_WORK
select KVM_GENERIC_DIRTYLOG_READ_PROTECT
select KVM_VFIO
- select INTERVAL_TREE
select HAVE_KVM_PM_NOTIFIER if PM
select KVM_GENERIC_HARDWARE_ENABLING
help
config KVM_SW_PROTECTED_VM
bool "Enable support for KVM software-protected VMs"
depends on EXPERT
- depends on X86_64
+ depends on KVM && X86_64
select KVM_GENERIC_PRIVATE_MEM
help
Enable support for KVM software-protected VMs. Currently "protected"
}
static const struct file_operations mmu_rmaps_stat_fops = {
+ .owner = THIS_MODULE,
.open = kvm_mmu_rmaps_stat_open,
.read = seq_read,
.llseek = seq_lseek,
set_msr_interception(vcpu, svm->msrpm, MSR_TSC_AUX, v_tsc_aux, v_tsc_aux);
}
+
+ /*
+ * For SEV-ES, accesses to MSR_IA32_XSS should not be intercepted if
+ * the host/guest supports its use.
+ *
+ * guest_can_use() checks a number of requirements on the host/guest to
+ * ensure that MSR_IA32_XSS is available, but it might report true even
+ * if X86_FEATURE_XSAVES isn't configured in the guest to ensure host
+ * MSR_IA32_XSS is always properly restored. For SEV-ES, it is better
+ * to further check that the guest CPUID actually supports
+ * X86_FEATURE_XSAVES so that accesses to MSR_IA32_XSS by misbehaved
+ * guests will still get intercepted and caught in the normal
+ * kvm_emulate_rdmsr()/kvm_emulated_wrmsr() paths.
+ */
+ if (guest_can_use(vcpu, X86_FEATURE_XSAVES) &&
+ guest_cpuid_has(vcpu, X86_FEATURE_XSAVES))
+ set_msr_interception(vcpu, svm->msrpm, MSR_IA32_XSS, 1, 1);
+ else
+ set_msr_interception(vcpu, svm->msrpm, MSR_IA32_XSS, 0, 0);
}
void sev_vcpu_after_set_cpuid(struct vcpu_svm *svm)
{ .index = MSR_IA32_LASTBRANCHTOIP, .always = false },
{ .index = MSR_IA32_LASTINTFROMIP, .always = false },
{ .index = MSR_IA32_LASTINTTOIP, .always = false },
+ { .index = MSR_IA32_XSS, .always = false },
{ .index = MSR_EFER, .always = false },
{ .index = MSR_IA32_CR_PAT, .always = false },
{ .index = MSR_AMD64_SEV_ES_GHCB, .always = true },
bool old_paging = is_paging(vcpu);
#ifdef CONFIG_X86_64
- if (vcpu->arch.efer & EFER_LME && !vcpu->arch.guest_state_protected) {
+ if (vcpu->arch.efer & EFER_LME) {
if (!is_paging(vcpu) && (cr0 & X86_CR0_PG)) {
vcpu->arch.efer |= EFER_LMA;
- svm->vmcb->save.efer |= EFER_LMA | EFER_LME;
+ if (!vcpu->arch.guest_state_protected)
+ svm->vmcb->save.efer |= EFER_LMA | EFER_LME;
}
if (is_paging(vcpu) && !(cr0 & X86_CR0_PG)) {
vcpu->arch.efer &= ~EFER_LMA;
- svm->vmcb->save.efer &= ~(EFER_LMA | EFER_LME);
+ if (!vcpu->arch.guest_state_protected)
+ svm->vmcb->save.efer &= ~(EFER_LMA | EFER_LME);
}
}
#endif
#define IOPM_SIZE PAGE_SIZE * 3
#define MSRPM_SIZE PAGE_SIZE * 2
-#define MAX_DIRECT_ACCESS_MSRS 46
+#define MAX_DIRECT_ACCESS_MSRS 47
#define MSRPM_OFFSETS 32
extern u32 msrpm_offsets[MSRPM_OFFSETS] __read_mostly;
extern bool npt_enabled;
static void kvm_vcpu_ioctl_x86_get_xsave(struct kvm_vcpu *vcpu,
struct kvm_xsave *guest_xsave)
{
- return kvm_vcpu_ioctl_x86_get_xsave2(vcpu, (void *)guest_xsave->region,
- sizeof(guest_xsave->region));
+ kvm_vcpu_ioctl_x86_get_xsave2(vcpu, (void *)guest_xsave->region,
+ sizeof(guest_xsave->region));
}
static int kvm_vcpu_ioctl_x86_set_xsave(struct kvm_vcpu *vcpu,
if (vcpu->arch.guest_state_protected)
return true;
- return vcpu->arch.preempted_in_kernel;
+ if (vcpu != kvm_get_running_vcpu())
+ return vcpu->arch.preempted_in_kernel;
+
+ return static_call(kvm_x86_get_cpl)(vcpu) == 0;
}
unsigned long kvm_arch_vcpu_get_ip(struct kvm_vcpu *vcpu)
#include <asm/msr.h>
#include <asm/cmdline.h>
#include <asm/sev.h>
+#include <asm/ia32.h>
#include "mm_internal.h"
*/
if (sev_status & MSR_AMD64_SEV_ES_ENABLED)
x86_cpuinit.parallel_bringup = false;
+
+ /*
+ * The VMM is capable of injecting interrupt 0x80 and triggering the
+ * compatibility syscall path.
+ *
+ * By default, the 32-bit emulation is disabled in order to ensure
+ * the safety of the VM.
+ */
+ if (sev_status & MSR_AMD64_SEV_ENABLED)
+ ia32_disable();
}
void __init mem_encrypt_free_decrypted_mem(void)
#endif
WARN(1, "verification of programs using bpf_throw should have failed\n");
}
+
+void bpf_arch_poke_desc_update(struct bpf_jit_poke_descriptor *poke,
+ struct bpf_prog *new, struct bpf_prog *old)
+{
+ u8 *old_addr, *new_addr, *old_bypass_addr;
+ int ret;
+
+ old_bypass_addr = old ? NULL : poke->bypass_addr;
+ old_addr = old ? (u8 *)old->bpf_func + poke->adj_off : NULL;
+ new_addr = new ? (u8 *)new->bpf_func + poke->adj_off : NULL;
+
+ /*
+ * On program loading or teardown, the program's kallsym entry
+ * might not be in place, so we use __bpf_arch_text_poke to skip
+ * the kallsyms check.
+ */
+ if (new) {
+ ret = __bpf_arch_text_poke(poke->tailcall_target,
+ BPF_MOD_JUMP,
+ old_addr, new_addr);
+ BUG_ON(ret < 0);
+ if (!old) {
+ ret = __bpf_arch_text_poke(poke->tailcall_bypass,
+ BPF_MOD_JUMP,
+ poke->bypass_addr,
+ NULL);
+ BUG_ON(ret < 0);
+ }
+ } else {
+ ret = __bpf_arch_text_poke(poke->tailcall_bypass,
+ BPF_MOD_JUMP,
+ old_bypass_addr,
+ poke->bypass_addr);
+ BUG_ON(ret < 0);
+ /* let other CPUs finish the execution of program
+ * so that it will not possible to expose them
+ * to invalid nop, stack unwind, nop state
+ */
+ if (!ret)
+ synchronize_rcu();
+ ret = __bpf_arch_text_poke(poke->tailcall_target,
+ BPF_MOD_JUMP,
+ old_addr, NULL);
+ BUG_ON(ret < 0);
+ }
+}
select PARAVIRT_CLOCK
select X86_HV_CALLBACK_VECTOR
depends on X86_64 || (X86_32 && X86_PAE)
+ depends on X86_64 || (X86_GENERIC || MPENTIUM4 || MCORE2 || MATOM || MK8)
depends on X86_LOCAL_APIC && X86_TSC
help
This is the Linux Xen port. Enabling this will allow the
* and xen_vcpu_setup for details. By default it points to share_info->vcpu_info
* but during boot it is switched to point to xen_vcpu_info.
* The pointer is used in xen_evtchn_do_upcall to acknowledge pending events.
+ * Make sure that xen_vcpu_info doesn't cross a page boundary by making it
+ * cache-line aligned (the struct is guaranteed to have a size of 64 bytes,
+ * which matches the cache line size of 64-bit x86 processors).
*/
DEFINE_PER_CPU(struct vcpu_info *, xen_vcpu);
-DEFINE_PER_CPU(struct vcpu_info, xen_vcpu_info);
+DEFINE_PER_CPU_ALIGNED(struct vcpu_info, xen_vcpu_info);
/* Linux <-> Xen vCPU id mapping */
DEFINE_PER_CPU(uint32_t, xen_vcpu_id);
int err;
struct vcpu_info *vcpup;
+ BUILD_BUG_ON(sizeof(*vcpup) > SMP_CACHE_BYTES);
BUG_ON(HYPERVISOR_shared_info == &xen_dummy_shared_info);
/*
TRAP_ENTRY(exc_int3, false ),
TRAP_ENTRY(exc_overflow, false ),
#ifdef CONFIG_IA32_EMULATION
- { entry_INT80_compat, xen_entry_INT80_compat, false },
+ TRAP_ENTRY(int80_emulation, false ),
#endif
TRAP_ENTRY(exc_page_fault, false ),
TRAP_ENTRY(exc_divide_error, false ),
#endif /* CONFIG_X86_MCE */
xen_pv_trap asm_exc_simd_coprocessor_error
#ifdef CONFIG_IA32_EMULATION
-xen_pv_trap entry_INT80_compat
+xen_pv_trap asm_int80_emulation
#endif
xen_pv_trap asm_exc_xen_unknown_trap
xen_pv_trap asm_exc_xen_hypervisor_callback
struct trap_info;
void xen_copy_trap_info(struct trap_info *traps);
-DECLARE_PER_CPU(struct vcpu_info, xen_vcpu_info);
+DECLARE_PER_CPU_ALIGNED(struct vcpu_info, xen_vcpu_info);
DECLARE_PER_CPU(unsigned long, xen_cr3);
DECLARE_PER_CPU(unsigned long, xen_current_cr3);
void bdev_add(struct block_device *bdev, dev_t dev)
{
+ if (bdev_stable_writes(bdev))
+ mapping_set_stable_writes(bdev->bd_inode->i_mapping);
bdev->bd_dev = dev;
bdev->bd_inode->i_rdev = dev;
bdev->bd_inode->i_ino = dev;
struct request_queue *q = disk->queue;
struct blkcg_gq *blkg, *n;
int count = BLKG_DESTROY_BATCH_SIZE;
+ int i;
restart:
spin_lock_irq(&q->queue_lock);
}
}
+ /*
+ * Mark policy deactivated since policy offline has been done, and
+ * the free is scheduled, so future blkcg_deactivate_policy() can
+ * be bypassed
+ */
+ for (i = 0; i < BLKCG_MAX_POLS; i++) {
+ struct blkcg_policy *pol = blkcg_policy[i];
+
+ if (pol)
+ __clear_bit(pol->plid, q->blkcg_pols);
+ }
+
q->root_blkg = NULL;
spin_unlock_irq(&q->queue_lock);
}
{
struct blkcg_gq *blkg;
- WARN_ON_ONCE(!rcu_read_lock_held());
-
if (blkcg == &blkcg_root)
return q->root_blkg;
if (op_is_write(bio_op(bio)) && bdev_read_only(bio->bi_bdev)) {
if (op_is_flush(bio->bi_opf) && !bio_sectors(bio))
return;
- pr_warn_ratelimited("Trying to write to read-only block-device %pg\n",
- bio->bi_bdev);
- /* Older lvm-tools actually trigger this */
+
+ if (bio->bi_bdev->bd_ro_warned)
+ return;
+
+ bio->bi_bdev->bd_ro_warned = true;
+ /*
+ * Use ioctl to set underlying disk of raid/dm to read-only
+ * will trigger this.
+ */
+ pr_warn("Trying to write to read-only block-device %pg\n",
+ bio->bi_bdev);
}
}
}
EXPORT_SYMBOL(blk_mq_delay_kick_requeue_list);
+static bool blk_is_flush_data_rq(struct request *rq)
+{
+ return (rq->rq_flags & RQF_FLUSH_SEQ) && !is_flush_rq(rq);
+}
+
static bool blk_mq_rq_inflight(struct request *rq, void *priv)
{
/*
* If we find a request that isn't idle we know the queue is busy
* as it's checked in the iter.
* Return false to stop the iteration.
+ *
+ * In case of queue quiesce, if one flush data request is completed,
+ * don't count it as inflight given the flush sequence is suspended,
+ * and the original flush data request is invisible to driver, just
+ * like other pending requests because of quiesce
*/
- if (blk_mq_request_started(rq)) {
+ if (blk_mq_request_started(rq) && !(blk_queue_quiesced(rq->q) &&
+ blk_is_flush_data_rq(rq) &&
+ blk_mq_request_completed(rq))) {
bool *busy = priv;
*busy = true;
};
struct request *rq;
- if (unlikely(bio_queue_enter(bio)))
- return NULL;
-
if (blk_mq_attempt_bio_merge(q, bio, nsegs))
- goto queue_exit;
+ return NULL;
rq_qos_throttle(q, bio);
rq_qos_cleanup(q, bio);
if (bio->bi_opf & REQ_NOWAIT)
bio_wouldblock_error(bio);
-queue_exit:
- blk_queue_exit(q);
return NULL;
}
-static inline struct request *blk_mq_get_cached_request(struct request_queue *q,
- struct blk_plug *plug, struct bio **bio, unsigned int nsegs)
+/* return true if this @rq can be used for @bio */
+static bool blk_mq_can_use_cached_rq(struct request *rq, struct blk_plug *plug,
+ struct bio *bio)
{
- struct request *rq;
- enum hctx_type type, hctx_type;
+ enum hctx_type type = blk_mq_get_hctx_type(bio->bi_opf);
+ enum hctx_type hctx_type = rq->mq_hctx->type;
- if (!plug)
- return NULL;
- rq = rq_list_peek(&plug->cached_rq);
- if (!rq || rq->q != q)
- return NULL;
-
- if (blk_mq_attempt_bio_merge(q, *bio, nsegs)) {
- *bio = NULL;
- return NULL;
- }
+ WARN_ON_ONCE(rq_list_peek(&plug->cached_rq) != rq);
- type = blk_mq_get_hctx_type((*bio)->bi_opf);
- hctx_type = rq->mq_hctx->type;
if (type != hctx_type &&
!(type == HCTX_TYPE_READ && hctx_type == HCTX_TYPE_DEFAULT))
- return NULL;
- if (op_is_flush(rq->cmd_flags) != op_is_flush((*bio)->bi_opf))
- return NULL;
+ return false;
+ if (op_is_flush(rq->cmd_flags) != op_is_flush(bio->bi_opf))
+ return false;
/*
* If any qos ->throttle() end up blocking, we will have flushed the
* before we throttle.
*/
plug->cached_rq = rq_list_next(rq);
- rq_qos_throttle(q, *bio);
+ rq_qos_throttle(rq->q, bio);
blk_mq_rq_time_init(rq, 0);
- rq->cmd_flags = (*bio)->bi_opf;
+ rq->cmd_flags = bio->bi_opf;
INIT_LIST_HEAD(&rq->queuelist);
- return rq;
+ return true;
}
static void bio_set_ioprio(struct bio *bio)
struct blk_plug *plug = blk_mq_plug(bio);
const int is_sync = op_is_sync(bio->bi_opf);
struct blk_mq_hw_ctx *hctx;
- struct request *rq;
+ struct request *rq = NULL;
unsigned int nr_segs = 1;
blk_status_t ret;
return;
}
- if (!bio_integrity_prep(bio))
- return;
-
bio_set_ioprio(bio);
- rq = blk_mq_get_cached_request(q, plug, &bio, nr_segs);
- if (!rq) {
- if (!bio)
+ if (plug) {
+ rq = rq_list_peek(&plug->cached_rq);
+ if (rq && rq->q != q)
+ rq = NULL;
+ }
+ if (rq) {
+ if (!bio_integrity_prep(bio))
return;
- rq = blk_mq_get_new_requests(q, plug, bio, nr_segs);
- if (unlikely(!rq))
+ if (blk_mq_attempt_bio_merge(q, bio, nr_segs))
return;
+ if (blk_mq_can_use_cached_rq(rq, plug, bio))
+ goto done;
+ percpu_ref_get(&q->q_usage_counter);
+ } else {
+ if (unlikely(bio_queue_enter(bio)))
+ return;
+ if (!bio_integrity_prep(bio))
+ goto fail;
}
+ rq = blk_mq_get_new_requests(q, plug, bio, nr_segs);
+ if (unlikely(!rq)) {
+fail:
+ blk_queue_exit(q);
+ return;
+ }
+
+done:
trace_block_getrq(bio);
rq_qos_track(q, rq, bio);
* @q: the queue of the device
*
* Description:
- * For historical reasons, this routine merely calls blk_set_runtime_active()
- * to do the real work of restarting the queue. It does this regardless of
- * whether the device's runtime-resume succeeded; even if it failed the
+ * Restart the queue of a runtime suspended device. It does this regardless
+ * of whether the device's runtime-resume succeeded; even if it failed the
* driver or error handler will need to communicate with the device.
*
* This function should be called near the end of the device's
- * runtime_resume callback.
+ * runtime_resume callback to correct queue runtime PM status and re-enable
+ * peeking requests from the queue.
*/
void blk_post_runtime_resume(struct request_queue *q)
-{
- blk_set_runtime_active(q);
-}
-EXPORT_SYMBOL(blk_post_runtime_resume);
-
-/**
- * blk_set_runtime_active - Force runtime status of the queue to be active
- * @q: the queue of the device
- *
- * If the device is left runtime suspended during system suspend the resume
- * hook typically resumes the device and corrects runtime status
- * accordingly. However, that does not affect the queue runtime PM status
- * which is still "suspended". This prevents processing requests from the
- * queue.
- *
- * This function can be used in driver's resume hook to correct queue
- * runtime PM status and re-enable peeking requests from the queue. It
- * should be called before first request is added to the queue.
- *
- * This function is also called by blk_post_runtime_resume() for
- * runtime resumes. It does everything necessary to restart the queue.
- */
-void blk_set_runtime_active(struct request_queue *q)
{
int old_status;
if (old_status != RPM_ACTIVE)
blk_clear_pm_only(q);
}
-EXPORT_SYMBOL(blk_set_runtime_active);
+EXPORT_SYMBOL(blk_post_runtime_resume);
QUEUE_RW_ENTRY(queue_wb_lat, "wbt_lat_usec");
#endif
+/* Common attributes for bio-based and request-based queues. */
static struct attribute *queue_attrs[] = {
&queue_ra_entry.attr,
&queue_max_hw_sectors_entry.attr,
NULL,
};
+/* Request-based queue attributes that are not relevant for bio-based queues. */
static struct attribute *blk_mq_queue_attrs[] = {
&queue_requests_entry.attr,
&elv_iosched_entry.attr,
tg_bps_limit(tg, READ), tg_bps_limit(tg, WRITE),
tg_iops_limit(tg, READ), tg_iops_limit(tg, WRITE));
+ rcu_read_lock();
/*
* Update has_rules[] flags for the updated tg's subtree. A tg is
* considered to have rules if either the tg itself or any of its
this_tg->latency_target = max(this_tg->latency_target,
parent_tg->latency_target);
}
+ rcu_read_unlock();
/*
* We're already holding queue_lock and know @tg is valid. Let's
#define ICB_0_1_IRQ_MASK ((((u64)ICB_1_IRQ_MASK) << 32) | ICB_0_IRQ_MASK)
-#define BUTTRESS_IRQ_MASK ((REG_FLD(VPU_37XX_BUTTRESS_INTERRUPT_STAT, FREQ_CHANGE)) | \
- (REG_FLD(VPU_37XX_BUTTRESS_INTERRUPT_STAT, ATS_ERR)) | \
+#define BUTTRESS_IRQ_MASK ((REG_FLD(VPU_37XX_BUTTRESS_INTERRUPT_STAT, ATS_ERR)) | \
(REG_FLD(VPU_37XX_BUTTRESS_INTERRUPT_STAT, UFI_ERR)))
+#define BUTTRESS_ALL_IRQ_MASK (BUTTRESS_IRQ_MASK | \
+ (REG_FLD(VPU_37XX_BUTTRESS_INTERRUPT_STAT, FREQ_CHANGE)))
+
#define BUTTRESS_IRQ_ENABLE_MASK ((u32)~BUTTRESS_IRQ_MASK)
#define BUTTRESS_IRQ_DISABLE_MASK ((u32)-1)
vdev->wa.clear_runtime_mem = false;
vdev->wa.d3hot_after_power_off = true;
- if (ivpu_device_id(vdev) == PCI_DEVICE_ID_MTL && ivpu_revision(vdev) < 4)
+ REGB_WR32(VPU_37XX_BUTTRESS_INTERRUPT_STAT, BUTTRESS_ALL_IRQ_MASK);
+ if (REGB_RD32(VPU_37XX_BUTTRESS_INTERRUPT_STAT) == BUTTRESS_ALL_IRQ_MASK) {
+ /* Writing 1s does not clear the interrupt status register */
vdev->wa.interrupt_clear_with_0 = true;
+ REGB_WR32(VPU_37XX_BUTTRESS_INTERRUPT_STAT, 0x0);
+ }
IVPU_PRINT_WA(punit_disabled);
IVPU_PRINT_WA(clear_runtime_mem);
return ret;
}
+static int ivpu_boot_pwr_domain_disable(struct ivpu_device *vdev)
+{
+ ivpu_boot_dpu_active_drive(vdev, false);
+ ivpu_boot_pwr_island_isolation_drive(vdev, true);
+ ivpu_boot_pwr_island_trickle_drive(vdev, false);
+ ivpu_boot_pwr_island_drive(vdev, false);
+
+ return ivpu_boot_wait_for_pwr_island_status(vdev, 0x0);
+}
+
static void ivpu_boot_no_snoop_enable(struct ivpu_device *vdev)
{
u32 val = REGV_RD32(VPU_37XX_HOST_IF_TCU_PTW_OVERRIDES);
static int ivpu_hw_37xx_reset(struct ivpu_device *vdev)
{
- int ret;
- u32 val;
-
- if (IVPU_WA(punit_disabled))
- return 0;
+ int ret = 0;
- ret = REGB_POLL_FLD(VPU_37XX_BUTTRESS_VPU_IP_RESET, TRIGGER, 0, TIMEOUT_US);
- if (ret) {
- ivpu_err(vdev, "Timed out waiting for TRIGGER bit\n");
- return ret;
+ if (ivpu_boot_pwr_domain_disable(vdev)) {
+ ivpu_err(vdev, "Failed to disable power domain\n");
+ ret = -EIO;
}
- val = REGB_RD32(VPU_37XX_BUTTRESS_VPU_IP_RESET);
- val = REG_SET_FLD(VPU_37XX_BUTTRESS_VPU_IP_RESET, TRIGGER, val);
- REGB_WR32(VPU_37XX_BUTTRESS_VPU_IP_RESET, val);
-
- ret = REGB_POLL_FLD(VPU_37XX_BUTTRESS_VPU_IP_RESET, TRIGGER, 0, TIMEOUT_US);
- if (ret)
- ivpu_err(vdev, "Timed out waiting for RESET completion\n");
+ if (ivpu_pll_disable(vdev)) {
+ ivpu_err(vdev, "Failed to disable PLL\n");
+ ret = -EIO;
+ }
return ret;
}
{
int ret;
- ret = ivpu_hw_37xx_reset(vdev);
- if (ret)
- ivpu_warn(vdev, "Failed to reset HW: %d\n", ret);
-
ret = ivpu_hw_37xx_d0i3_disable(vdev);
if (ret)
ivpu_warn(vdev, "Failed to disable D0I3: %d\n", ret);
{
int ret = 0;
- if (!ivpu_hw_37xx_is_idle(vdev) && ivpu_hw_37xx_reset(vdev))
- ivpu_err(vdev, "Failed to reset the VPU\n");
+ if (!ivpu_hw_37xx_is_idle(vdev))
+ ivpu_warn(vdev, "VPU not idle during power down\n");
- if (ivpu_pll_disable(vdev)) {
- ivpu_err(vdev, "Failed to disable PLL\n");
+ if (ivpu_hw_37xx_reset(vdev)) {
+ ivpu_err(vdev, "Failed to reset VPU\n");
ret = -EIO;
}
{
int ret;
- ivpu_dbg(vdev, RPM, "rpm_get_if_active count %d\n",
- atomic_read(&vdev->drm.dev->power.usage_count));
-
ret = pm_runtime_get_if_active(vdev->drm.dev, false);
drm_WARN_ON(&vdev->drm, ret < 0);
static int video_get_max_state(struct thermal_cooling_device *cooling_dev,
unsigned long *state)
{
- struct acpi_device *device = cooling_dev->devdata;
- struct acpi_video_device *video = acpi_driver_data(device);
+ struct acpi_video_device *video = cooling_dev->devdata;
*state = video->brightness->count - ACPI_VIDEO_FIRST_LEVEL - 1;
return 0;
static int video_get_cur_state(struct thermal_cooling_device *cooling_dev,
unsigned long *state)
{
- struct acpi_device *device = cooling_dev->devdata;
- struct acpi_video_device *video = acpi_driver_data(device);
+ struct acpi_video_device *video = cooling_dev->devdata;
unsigned long long level;
int offset;
static int
video_set_cur_state(struct thermal_cooling_device *cooling_dev, unsigned long state)
{
- struct acpi_device *device = cooling_dev->devdata;
- struct acpi_video_device *video = acpi_driver_data(device);
+ struct acpi_video_device *video = cooling_dev->devdata;
int level;
if (state >= video->brightness->count - ACPI_VIDEO_FIRST_LEVEL)
strcpy(acpi_device_name(device), ACPI_VIDEO_DEVICE_NAME);
strcpy(acpi_device_class(device), ACPI_VIDEO_CLASS);
- device->driver_data = data;
data->device_id = device_id;
data->video = video;
device->backlight->props.brightness =
acpi_video_get_brightness(device->backlight);
- device->cooling_dev = thermal_cooling_device_register("LCD",
- device->dev, &video_cooling_ops);
+ device->cooling_dev = thermal_cooling_device_register("LCD", device,
+ &video_cooling_ops);
if (IS_ERR(device->cooling_dev)) {
/*
* Set cooling_dev to NULL so we don't crash trying to free it.
* HP ZBook Fury 16 G10 requires ACPI video's child devices have _PS0
* evaluated to have functional panel brightness control.
*/
- acpi_device_fix_up_power_extended(device);
+ acpi_device_fix_up_power_children(device);
pr_info("%s [%s] (multi-head: %s rom: %s post: %s)\n",
ACPI_VIDEO_DEVICE_NAME, acpi_device_bid(device),
}
EXPORT_SYMBOL_GPL(acpi_device_fix_up_power_extended);
+/**
+ * acpi_device_fix_up_power_children - Force a device's children into D0.
+ * @adev: Parent device object whose children's power state is to be fixed up.
+ *
+ * Call acpi_device_fix_up_power() for @adev's children so long as they
+ * are reported as present and enabled.
+ */
+void acpi_device_fix_up_power_children(struct acpi_device *adev)
+{
+ acpi_dev_for_each_child(adev, fix_up_power_if_applicable, NULL);
+}
+EXPORT_SYMBOL_GPL(acpi_device_fix_up_power_children);
+
int acpi_device_update_power(struct acpi_device *device, int *state_p)
{
int state;
while (1) {
if (cx->entry_method == ACPI_CSTATE_HALT)
- safe_halt();
+ raw_safe_halt();
else if (cx->entry_method == ACPI_CSTATE_SYSTEMIO) {
io_idle(cx->address);
} else
DMI_MATCH(DMI_BOARD_NAME, "B1402CBA"),
},
},
+ {
+ /* Asus ExpertBook B1402CVA */
+ .matches = {
+ DMI_MATCH(DMI_SYS_VENDOR, "ASUSTeK COMPUTER INC."),
+ DMI_MATCH(DMI_BOARD_NAME, "B1402CVA"),
+ },
+ },
{
/* Asus ExpertBook B1502CBA */
.matches = {
int err;
const struct iommu_ops *ops;
+ /* Serialise to make dev->iommu stable under our potential fwspec */
+ mutex_lock(&iommu_probe_device_lock);
/*
* If we already translated the fwspec there is nothing left to do,
* return the iommu_ops.
*/
ops = acpi_iommu_fwspec_ops(dev);
- if (ops)
+ if (ops) {
+ mutex_unlock(&iommu_probe_device_lock);
return ops;
+ }
err = iort_iommu_configure_id(dev, id_in);
if (err && err != -EPROBE_DEFER)
err = viot_iommu_configure(dev);
+ mutex_unlock(&iommu_probe_device_lock);
/*
* If we have reason to believe the IOMMU driver missed the initial
acpi_handle_debug(list->handles[i], "Found in reference list\n");
}
-end:
if (ACPI_FAILURE(status)) {
list->count = 0;
kfree(list->handles);
list->handles = NULL;
}
+end:
kfree(buffer.pointer);
return status;
* Ask the sd driver to issue START STOP UNIT on runtime suspend
* and resume and shutdown only. For system level suspend/resume,
* devices power state is handled directly by libata EH.
+ * Given that disks are always spun up on system resume, also
+ * make sure that the sd driver forces runtime suspended disks
+ * to be resumed to correctly reflect the power state of the
+ * device.
*/
- sdev->manage_runtime_start_stop = true;
- sdev->manage_shutdown = true;
+ sdev->manage_runtime_start_stop = 1;
+ sdev->manage_shutdown = 1;
+ sdev->force_runtime_start_on_system_start = 1;
}
/*
if (pnp_port_valid(idev, 1)) {
ctl_addr = devm_ioport_map(&idev->dev,
pnp_port_start(idev, 1), 1);
+ if (!ctl_addr)
+ return -ENOMEM;
+
ap->ioaddr.altstatus_addr = ctl_addr;
ap->ioaddr.ctl_addr = ctl_addr;
ap->ops = &isapnp_port_ops;
struct sk_buff *skb;
unsigned int len;
- spin_lock(&card->cli_queue_lock);
+ spin_lock_bh(&card->cli_queue_lock);
skb = skb_dequeue(&card->cli_queue[SOLOS_CHAN(atmdev)]);
- spin_unlock(&card->cli_queue_lock);
+ spin_unlock_bh(&card->cli_queue_lock);
if(skb == NULL)
return sprintf(buf, "No data.\n");
struct pkt_hdr *header;
/* Remove any yet-to-be-transmitted packets from the pending queue */
- spin_lock(&card->tx_queue_lock);
+ spin_lock_bh(&card->tx_queue_lock);
skb_queue_walk_safe(&card->tx_queue[port], skb, tmpskb) {
if (SKB_CB(skb)->vcc == vcc) {
skb_unlink(skb, &card->tx_queue[port]);
solos_pop(vcc, skb);
}
}
- spin_unlock(&card->tx_queue_lock);
+ spin_unlock_bh(&card->tx_queue_lock);
skb = alloc_skb(sizeof(*header), GFP_KERNEL);
if (!skb) {
#endif /* CONFIG_ARCH_CPU_PROBE_RELEASE */
#endif /* CONFIG_HOTPLUG_CPU */
-#ifdef CONFIG_KEXEC
+#ifdef CONFIG_KEXEC_CORE
#include <linux/kexec.h>
static ssize_t crash_notes_show(struct device *dev,
#endif
static const struct attribute_group *common_cpu_attr_groups[] = {
-#ifdef CONFIG_KEXEC
+#ifdef CONFIG_KEXEC_CORE
&crash_note_cpu_attr_group,
#endif
NULL
};
static const struct attribute_group *hotplugable_cpu_attr_groups[] = {
-#ifdef CONFIG_KEXEC
+#ifdef CONFIG_KEXEC_CORE
&crash_note_cpu_attr_group,
#endif
NULL
devcd->devcd_dev.class = &devcd_class;
mutex_lock(&devcd->mutex);
+ dev_set_uevent_suppress(&devcd->devcd_dev, true);
if (device_add(&devcd->devcd_dev))
goto put_device;
"devcoredump"))
dev_warn(dev, "devcoredump create_link failed\n");
+ dev_set_uevent_suppress(&devcd->devcd_dev, false);
+ kobject_uevent(&devcd->devcd_dev.kobj, KOBJ_ADD);
INIT_DELAYED_WORK(&devcd->del_wk, devcd_del);
schedule_delayed_work(&devcd->del_wk, DEVCD_TIMEOUT);
mutex_unlock(&devcd->mutex);
}
#endif
+/*
+ * Must acquire mem_hotplug_lock in write mode.
+ */
static int memory_block_online(struct memory_block *mem)
{
unsigned long start_pfn = section_nr_to_pfn(mem->start_section_nr);
if (mem->altmap)
nr_vmemmap_pages = mem->altmap->free;
+ mem_hotplug_begin();
if (nr_vmemmap_pages) {
ret = mhp_init_memmap_on_memory(start_pfn, nr_vmemmap_pages, zone);
if (ret)
- return ret;
+ goto out;
}
ret = online_pages(start_pfn + nr_vmemmap_pages,
if (ret) {
if (nr_vmemmap_pages)
mhp_deinit_memmap_on_memory(start_pfn, nr_vmemmap_pages);
- return ret;
+ goto out;
}
/*
nr_vmemmap_pages);
mem->zone = zone;
+out:
+ mem_hotplug_done();
return ret;
}
+/*
+ * Must acquire mem_hotplug_lock in write mode.
+ */
static int memory_block_offline(struct memory_block *mem)
{
unsigned long start_pfn = section_nr_to_pfn(mem->start_section_nr);
if (mem->altmap)
nr_vmemmap_pages = mem->altmap->free;
+ mem_hotplug_begin();
if (nr_vmemmap_pages)
adjust_present_page_count(pfn_to_page(start_pfn), mem->group,
-nr_vmemmap_pages);
if (nr_vmemmap_pages)
adjust_present_page_count(pfn_to_page(start_pfn),
mem->group, nr_vmemmap_pages);
- return ret;
+ goto out;
}
if (nr_vmemmap_pages)
mhp_deinit_memmap_on_memory(start_pfn, nr_vmemmap_pages);
mem->zone = NULL;
+out:
+ mem_hotplug_done();
return ret;
}
rb_entry(node, struct regmap_range_node, node);
/* If there's nothing in the cache there's nothing to sync */
- ret = regcache_read(map, this->selector_reg, &i);
- if (ret != 0)
+ if (regcache_read(map, this->selector_reg, &i) != 0)
continue;
ret = _regmap_write(map, this->selector_reg, i);
struct recv_thread_args {
struct work_struct work;
struct nbd_device *nbd;
+ struct nbd_sock *nsock;
int index;
};
}
}
+static struct nbd_config *nbd_get_config_unlocked(struct nbd_device *nbd)
+{
+ if (refcount_inc_not_zero(&nbd->config_refs)) {
+ /*
+ * Add smp_mb__after_atomic to ensure that reading nbd->config_refs
+ * and reading nbd->config is ordered. The pair is the barrier in
+ * nbd_alloc_and_init_config(), avoid nbd->config_refs is set
+ * before nbd->config.
+ */
+ smp_mb__after_atomic();
+ return nbd->config;
+ }
+
+ return NULL;
+}
+
static enum blk_eh_timer_return nbd_xmit_timeout(struct request *req)
{
struct nbd_cmd *cmd = blk_mq_rq_to_pdu(req);
return BLK_EH_DONE;
}
- if (!refcount_inc_not_zero(&nbd->config_refs)) {
+ config = nbd_get_config_unlocked(nbd);
+ if (!config) {
cmd->status = BLK_STS_TIMEOUT;
__clear_bit(NBD_CMD_INFLIGHT, &cmd->flags);
mutex_unlock(&cmd->lock);
goto done;
}
- config = nbd->config;
if (config->num_connections > 1 ||
(config->num_connections == 1 && nbd->tag_set.timeout)) {
return BLK_EH_DONE;
}
-/*
- * Send or receive packet. Return a positive value on success and
- * negtive value on failue, and never return 0.
- */
-static int sock_xmit(struct nbd_device *nbd, int index, int send,
- struct iov_iter *iter, int msg_flags, int *sent)
+static int __sock_xmit(struct nbd_device *nbd, struct socket *sock, int send,
+ struct iov_iter *iter, int msg_flags, int *sent)
{
- struct nbd_config *config = nbd->config;
- struct socket *sock = config->socks[index]->sock;
int result;
struct msghdr msg;
unsigned int noreclaim_flag;
return result;
}
+/*
+ * Send or receive packet. Return a positive value on success and
+ * negtive value on failure, and never return 0.
+ */
+static int sock_xmit(struct nbd_device *nbd, int index, int send,
+ struct iov_iter *iter, int msg_flags, int *sent)
+{
+ struct nbd_config *config = nbd->config;
+ struct socket *sock = config->socks[index]->sock;
+
+ return __sock_xmit(nbd, sock, send, iter, msg_flags, sent);
+}
+
/*
* Different settings for sk->sk_sndtimeo can result in different return values
* if there is a signal pending when we enter sendmsg, because reasons?
return 0;
}
-static int nbd_read_reply(struct nbd_device *nbd, int index,
+static int nbd_read_reply(struct nbd_device *nbd, struct socket *sock,
struct nbd_reply *reply)
{
struct kvec iov = {.iov_base = reply, .iov_len = sizeof(*reply)};
reply->magic = 0;
iov_iter_kvec(&to, ITER_DEST, &iov, 1, sizeof(*reply));
- result = sock_xmit(nbd, index, 0, &to, MSG_WAITALL, NULL);
+ result = __sock_xmit(nbd, sock, 0, &to, MSG_WAITALL, NULL);
if (result < 0) {
if (!nbd_disconnected(nbd->config))
dev_err(disk_to_dev(nbd->disk),
struct nbd_device *nbd = args->nbd;
struct nbd_config *config = nbd->config;
struct request_queue *q = nbd->disk->queue;
- struct nbd_sock *nsock;
+ struct nbd_sock *nsock = args->nsock;
struct nbd_cmd *cmd;
struct request *rq;
while (1) {
struct nbd_reply reply;
- if (nbd_read_reply(nbd, args->index, &reply))
+ if (nbd_read_reply(nbd, nsock->sock, &reply))
break;
/*
percpu_ref_put(&q->q_usage_counter);
}
- nsock = config->socks[args->index];
mutex_lock(&nsock->tx_lock);
nbd_mark_nsock_dead(nbd, nsock, 1);
mutex_unlock(&nsock->tx_lock);
struct nbd_sock *nsock;
int ret;
- if (!refcount_inc_not_zero(&nbd->config_refs)) {
+ config = nbd_get_config_unlocked(nbd);
+ if (!config) {
dev_err_ratelimited(disk_to_dev(nbd->disk),
"Socks array is empty\n");
return -EINVAL;
}
- config = nbd->config;
if (index >= config->num_connections) {
dev_err_ratelimited(disk_to_dev(nbd->disk),
INIT_WORK(&args->work, recv_work);
args->index = i;
args->nbd = nbd;
+ args->nsock = nsock;
nsock->cookie++;
mutex_unlock(&nsock->tx_lock);
sockfd_put(old);
refcount_inc(&nbd->config_refs);
INIT_WORK(&args->work, recv_work);
args->nbd = nbd;
+ args->nsock = config->socks[i];
args->index = i;
queue_work(nbd->recv_workq, &args->work);
}
return error;
}
-static struct nbd_config *nbd_alloc_config(void)
+static int nbd_alloc_and_init_config(struct nbd_device *nbd)
{
struct nbd_config *config;
+ if (WARN_ON(nbd->config))
+ return -EINVAL;
+
if (!try_module_get(THIS_MODULE))
- return ERR_PTR(-ENODEV);
+ return -ENODEV;
config = kzalloc(sizeof(struct nbd_config), GFP_NOFS);
if (!config) {
module_put(THIS_MODULE);
- return ERR_PTR(-ENOMEM);
+ return -ENOMEM;
}
atomic_set(&config->recv_threads, 0);
init_waitqueue_head(&config->conn_wait);
config->blksize_bits = NBD_DEF_BLKSIZE_BITS;
atomic_set(&config->live_connections, 0);
- return config;
+
+ nbd->config = config;
+ /*
+ * Order refcount_set(&nbd->config_refs, 1) and nbd->config assignment,
+ * its pair is the barrier in nbd_get_config_unlocked().
+ * So nbd_get_config_unlocked() won't see nbd->config as null after
+ * refcount_inc_not_zero() succeed.
+ */
+ smp_mb__before_atomic();
+ refcount_set(&nbd->config_refs, 1);
+
+ return 0;
}
static int nbd_open(struct gendisk *disk, blk_mode_t mode)
{
struct nbd_device *nbd;
+ struct nbd_config *config;
int ret = 0;
mutex_lock(&nbd_index_mutex);
ret = -ENXIO;
goto out;
}
- if (!refcount_inc_not_zero(&nbd->config_refs)) {
- struct nbd_config *config;
+ config = nbd_get_config_unlocked(nbd);
+ if (!config) {
mutex_lock(&nbd->config_lock);
if (refcount_inc_not_zero(&nbd->config_refs)) {
mutex_unlock(&nbd->config_lock);
goto out;
}
- config = nbd_alloc_config();
- if (IS_ERR(config)) {
- ret = PTR_ERR(config);
+ ret = nbd_alloc_and_init_config(nbd);
+ if (ret) {
mutex_unlock(&nbd->config_lock);
goto out;
}
- nbd->config = config;
- refcount_set(&nbd->config_refs, 1);
+
refcount_inc(&nbd->refs);
mutex_unlock(&nbd->config_lock);
if (max_part)
set_bit(GD_NEED_PART_SCAN, &disk->state);
- } else if (nbd_disconnected(nbd->config)) {
+ } else if (nbd_disconnected(config)) {
if (max_part)
set_bit(GD_NEED_PART_SCAN, &disk->state);
}
pr_err("nbd%d already in use\n", index);
return -EBUSY;
}
- if (WARN_ON(nbd->config)) {
- mutex_unlock(&nbd->config_lock);
- nbd_put(nbd);
- return -EINVAL;
- }
- config = nbd_alloc_config();
- if (IS_ERR(config)) {
+
+ ret = nbd_alloc_and_init_config(nbd);
+ if (ret) {
mutex_unlock(&nbd->config_lock);
nbd_put(nbd);
pr_err("couldn't allocate config\n");
- return PTR_ERR(config);
+ return ret;
}
- nbd->config = config;
- refcount_set(&nbd->config_refs, 1);
- set_bit(NBD_RT_BOUND, &config->runtime_flags);
+ config = nbd->config;
+ set_bit(NBD_RT_BOUND, &config->runtime_flags);
ret = nbd_genl_size_set(info, nbd);
if (ret)
goto out;
}
mutex_unlock(&nbd_index_mutex);
- if (!refcount_inc_not_zero(&nbd->config_refs)) {
+ config = nbd_get_config_unlocked(nbd);
+ if (!config) {
dev_err(nbd_to_dev(nbd),
"not configured, cannot reconfigure\n");
nbd_put(nbd);
}
mutex_lock(&nbd->config_lock);
- config = nbd->config;
if (!test_bit(NBD_RT_BOUND, &config->runtime_flags) ||
!nbd->pid) {
dev_err(nbd_to_dev(nbd),
return BLK_STS_OK;
}
-static blk_status_t null_handle_cmd(struct nullb_cmd *cmd, sector_t sector,
- sector_t nr_sectors, enum req_op op)
+static void null_handle_cmd(struct nullb_cmd *cmd, sector_t sector,
+ sector_t nr_sectors, enum req_op op)
{
struct nullb_device *dev = cmd->nq->dev;
struct nullb *nullb = dev->nullb;
blk_status_t sts;
- if (test_bit(NULLB_DEV_FL_THROTTLED, &dev->flags)) {
- sts = null_handle_throttled(cmd);
- if (sts != BLK_STS_OK)
- return sts;
- }
-
if (op == REQ_OP_FLUSH) {
cmd->error = errno_to_blk_status(null_handle_flush(nullb));
goto out;
out:
nullb_complete_cmd(cmd);
- return BLK_STS_OK;
}
static enum hrtimer_restart nullb_bwtimer_fn(struct hrtimer *timer)
cmd->fake_timeout = should_timeout_request(rq) ||
blk_should_fake_timeout(rq->q);
- blk_mq_start_request(rq);
-
if (should_requeue_request(rq)) {
/*
* Alternate between hitting the core BUSY path, and the
return BLK_STS_OK;
}
+ if (test_bit(NULLB_DEV_FL_THROTTLED, &nq->dev->flags)) {
+ blk_status_t sts = null_handle_throttled(cmd);
+
+ if (sts != BLK_STS_OK)
+ return sts;
+ }
+
+ blk_mq_start_request(rq);
+
if (is_poll) {
spin_lock(&nq->poll_lock);
list_add_tail(&rq->queuelist, &nq->poll_list);
if (cmd->fake_timeout)
return BLK_STS_OK;
- return null_handle_cmd(cmd, sector, nr_sectors, req_op(rq));
+ null_handle_cmd(cmd, sector, nr_sectors, req_op(rq));
+ return BLK_STS_OK;
}
static void null_queue_rqs(struct request **rqlist)
#include <linux/module.h>
#include <asm/unaligned.h>
+#include <linux/atomic.h>
#include <linux/kernel.h>
#include <linux/init.h>
#include <linux/slab.h>
bool wakeup;
__u16 msft_opcode;
bool aosp_capable;
+ atomic_t initialized;
};
static int vhci_open_dev(struct hci_dev *hdev)
memcpy(skb_push(skb, 1), &hci_skb_pkt_type(skb), 1);
- mutex_lock(&data->open_mutex);
skb_queue_tail(&data->readq, skb);
- mutex_unlock(&data->open_mutex);
- wake_up_interruptible(&data->read_wait);
+ if (atomic_read(&data->initialized))
+ wake_up_interruptible(&data->read_wait);
return 0;
}
skb_put_u8(skb, 0xff);
skb_put_u8(skb, opcode);
put_unaligned_le16(hdev->id, skb_put(skb, 2));
- skb_queue_tail(&data->readq, skb);
+ skb_queue_head(&data->readq, skb);
+ atomic_inc(&data->initialized);
wake_up_interruptible(&data->read_wait);
return 0;
sysc_val = sysc_read_sysconfig(ddata);
sysc_val |= sysc_mask;
sysc_write(ddata, sysc_offset, sysc_val);
- /* Flush posted write */
+
+ /*
+ * Some devices need a delay before reading registers
+ * after reset. Presumably a srst_udelay is not needed
+ * for devices that use a rstctrl register reset.
+ */
+ if (ddata->cfg.srst_udelay)
+ fsleep(ddata->cfg.srst_udelay);
+
+ /*
+ * Flush posted write. For devices needing srst_udelay
+ * this should trigger an interconnect error if the
+ * srst_udelay value is needed but not configured.
+ */
sysc_val = sysc_read_sysconfig(ddata);
}
- if (ddata->cfg.srst_udelay)
- fsleep(ddata->cfg.srst_udelay);
-
if (ddata->post_reset_quirk)
ddata->post_reset_quirk(ddata);
config SM_CAMCC_8550
tristate "SM8550 Camera Clock Controller"
+ depends on ARM64 || COMPILE_TEST
select SM_GCC_8550
help
Support for the camera clock controller on SM8550 devices.
PNAME(mux_pll_src_4plls_p) = { "cpll", "gpll", "gpll_div2", "usb480m" };
PNAME(mux_pll_src_3plls_p) = { "cpll", "gpll", "gpll_div2" };
-PNAME(mux_aclk_peri_src_p) = { "gpll_peri", "cpll_peri", "gpll_div2_peri", "gpll_div3_peri" };
+PNAME(mux_clk_peri_src_p) = { "gpll", "cpll", "gpll_div2", "gpll_div3" };
PNAME(mux_mmc_src_p) = { "cpll", "gpll", "gpll_div2", "xin24m" };
PNAME(mux_clk_cif_out_src_p) = { "clk_cif_src", "xin24m" };
PNAME(mux_sclk_vop_src_p) = { "cpll", "gpll", "gpll_div2", "gpll_div3" };
RK2928_CLKGATE_CON(0), 11, GFLAGS),
/* PD_PERI */
- GATE(0, "gpll_peri", "gpll", CLK_IGNORE_UNUSED,
+ COMPOSITE(0, "clk_peri_src", mux_clk_peri_src_p, 0,
+ RK2928_CLKSEL_CON(10), 14, 2, MFLAGS, 0, 5, DFLAGS,
RK2928_CLKGATE_CON(2), 0, GFLAGS),
- GATE(0, "cpll_peri", "cpll", CLK_IGNORE_UNUSED,
- RK2928_CLKGATE_CON(2), 0, GFLAGS),
- GATE(0, "gpll_div2_peri", "gpll_div2", CLK_IGNORE_UNUSED,
- RK2928_CLKGATE_CON(2), 0, GFLAGS),
- GATE(0, "gpll_div3_peri", "gpll_div3", CLK_IGNORE_UNUSED,
- RK2928_CLKGATE_CON(2), 0, GFLAGS),
- COMPOSITE_NOGATE(0, "aclk_peri_src", mux_aclk_peri_src_p, 0,
- RK2928_CLKSEL_CON(10), 14, 2, MFLAGS, 0, 5, DFLAGS),
- COMPOSITE_NOMUX(PCLK_PERI, "pclk_peri", "aclk_peri_src", 0,
+
+ COMPOSITE_NOMUX(PCLK_PERI, "pclk_peri", "clk_peri_src", 0,
RK2928_CLKSEL_CON(10), 12, 2, DFLAGS | CLK_DIVIDER_POWER_OF_TWO,
RK2928_CLKGATE_CON(2), 3, GFLAGS),
- COMPOSITE_NOMUX(HCLK_PERI, "hclk_peri", "aclk_peri_src", 0,
+ COMPOSITE_NOMUX(HCLK_PERI, "hclk_peri", "clk_peri_src", 0,
RK2928_CLKSEL_CON(10), 8, 2, DFLAGS | CLK_DIVIDER_POWER_OF_TWO,
RK2928_CLKGATE_CON(2), 2, GFLAGS),
- GATE(ACLK_PERI, "aclk_peri", "aclk_peri_src", 0,
+ GATE(ACLK_PERI, "aclk_peri", "clk_peri_src", 0,
RK2928_CLKGATE_CON(2), 1, GFLAGS),
GATE(SCLK_TIMER0, "sclk_timer0", "xin24m", 0,
GATE(SCLK_MIPI_24M, "clk_mipi_24m", "xin24m", CLK_IGNORE_UNUSED,
RK2928_CLKGATE_CON(2), 15, GFLAGS),
- COMPOSITE(SCLK_SDMMC, "sclk_sdmmc0", mux_mmc_src_p, 0,
+ COMPOSITE(SCLK_SDMMC, "sclk_sdmmc", mux_mmc_src_p, 0,
RK2928_CLKSEL_CON(11), 6, 2, MFLAGS, 0, 6, DFLAGS,
RK2928_CLKGATE_CON(2), 11, GFLAGS),
GATE(HCLK_I2S_2CH, "hclk_i2s_2ch", "hclk_peri", 0, RK2928_CLKGATE_CON(7), 2, GFLAGS),
GATE(0, "hclk_usb_peri", "hclk_peri", CLK_IGNORE_UNUSED, RK2928_CLKGATE_CON(9), 13, GFLAGS),
GATE(HCLK_HOST2, "hclk_host2", "hclk_peri", 0, RK2928_CLKGATE_CON(7), 3, GFLAGS),
- GATE(HCLK_OTG, "hclk_otg", "hclk_peri", 0, RK2928_CLKGATE_CON(3), 13, GFLAGS),
+ GATE(HCLK_OTG, "hclk_otg", "hclk_peri", 0, RK2928_CLKGATE_CON(5), 13, GFLAGS),
GATE(0, "hclk_peri_ahb", "hclk_peri", CLK_IGNORE_UNUSED, RK2928_CLKGATE_CON(9), 14, GFLAGS),
GATE(HCLK_SPDIF, "hclk_spdif", "hclk_peri", 0, RK2928_CLKGATE_CON(10), 9, GFLAGS),
GATE(HCLK_TSP, "hclk_tsp", "hclk_peri", 0, RK2928_CLKGATE_CON(10), 12, GFLAGS),
RK3036_PLL_RATE(408000000, 1, 68, 2, 2, 1, 0),
RK3036_PLL_RATE(312000000, 1, 78, 6, 1, 1, 0),
RK3036_PLL_RATE(297000000, 2, 99, 4, 1, 1, 0),
+ RK3036_PLL_RATE(292500000, 1, 195, 4, 4, 1, 0),
RK3036_PLL_RATE(241500000, 2, 161, 4, 2, 1, 0),
RK3036_PLL_RATE(216000000, 1, 72, 4, 2, 1, 0),
RK3036_PLL_RATE(200000000, 1, 100, 3, 4, 1, 0),
highest_perf = AMD_CPPC_HIGHEST_PERF(cap1);
WRITE_ONCE(cpudata->highest_perf, highest_perf);
-
+ WRITE_ONCE(cpudata->max_limit_perf, highest_perf);
WRITE_ONCE(cpudata->nominal_perf, AMD_CPPC_NOMINAL_PERF(cap1));
WRITE_ONCE(cpudata->lowest_nonlinear_perf, AMD_CPPC_LOWNONLIN_PERF(cap1));
WRITE_ONCE(cpudata->lowest_perf, AMD_CPPC_LOWEST_PERF(cap1));
-
+ WRITE_ONCE(cpudata->min_limit_perf, AMD_CPPC_LOWEST_PERF(cap1));
return 0;
}
highest_perf = cppc_perf.highest_perf;
WRITE_ONCE(cpudata->highest_perf, highest_perf);
-
+ WRITE_ONCE(cpudata->max_limit_perf, highest_perf);
WRITE_ONCE(cpudata->nominal_perf, cppc_perf.nominal_perf);
WRITE_ONCE(cpudata->lowest_nonlinear_perf,
cppc_perf.lowest_nonlinear_perf);
WRITE_ONCE(cpudata->lowest_perf, cppc_perf.lowest_perf);
+ WRITE_ONCE(cpudata->min_limit_perf, cppc_perf.lowest_perf);
if (cppc_state == AMD_PSTATE_ACTIVE)
return 0;
u64 prev = READ_ONCE(cpudata->cppc_req_cached);
u64 value = prev;
+ min_perf = clamp_t(unsigned long, min_perf, cpudata->min_limit_perf,
+ cpudata->max_limit_perf);
+ max_perf = clamp_t(unsigned long, max_perf, cpudata->min_limit_perf,
+ cpudata->max_limit_perf);
des_perf = clamp_t(unsigned long, des_perf, min_perf, max_perf);
if ((cppc_state == AMD_PSTATE_GUIDED) && (gov_flags & CPUFREQ_GOV_DYNAMIC_SWITCHING)) {
return 0;
}
+static int amd_pstate_update_min_max_limit(struct cpufreq_policy *policy)
+{
+ u32 max_limit_perf, min_limit_perf;
+ struct amd_cpudata *cpudata = policy->driver_data;
+
+ max_limit_perf = div_u64(policy->max * cpudata->highest_perf, cpudata->max_freq);
+ min_limit_perf = div_u64(policy->min * cpudata->highest_perf, cpudata->max_freq);
+
+ WRITE_ONCE(cpudata->max_limit_perf, max_limit_perf);
+ WRITE_ONCE(cpudata->min_limit_perf, min_limit_perf);
+ WRITE_ONCE(cpudata->max_limit_freq, policy->max);
+ WRITE_ONCE(cpudata->min_limit_freq, policy->min);
+
+ return 0;
+}
+
static int amd_pstate_update_freq(struct cpufreq_policy *policy,
unsigned int target_freq, bool fast_switch)
{
if (!cpudata->max_freq)
return -ENODEV;
+ if (policy->min != cpudata->min_limit_freq || policy->max != cpudata->max_limit_freq)
+ amd_pstate_update_min_max_limit(policy);
+
cap_perf = READ_ONCE(cpudata->highest_perf);
min_perf = READ_ONCE(cpudata->lowest_perf);
max_perf = cap_perf;
static unsigned int amd_pstate_fast_switch(struct cpufreq_policy *policy,
unsigned int target_freq)
{
- return amd_pstate_update_freq(policy, target_freq, true);
+ if (!amd_pstate_update_freq(policy, target_freq, true))
+ return target_freq;
+ return policy->cur;
}
static void amd_pstate_adjust_perf(unsigned int cpu,
struct amd_cpudata *cpudata = policy->driver_data;
unsigned int target_freq;
+ if (policy->min != cpudata->min_limit_freq || policy->max != cpudata->max_limit_freq)
+ amd_pstate_update_min_max_limit(policy);
+
+
cap_perf = READ_ONCE(cpudata->highest_perf);
lowest_nonlinear_perf = READ_ONCE(cpudata->lowest_nonlinear_perf);
max_freq = READ_ONCE(cpudata->max_freq);
/* Initial processor data capability frequencies */
cpudata->max_freq = max_freq;
cpudata->min_freq = min_freq;
+ cpudata->max_limit_freq = max_freq;
+ cpudata->min_limit_freq = min_freq;
cpudata->nominal_freq = nominal_freq;
cpudata->lowest_nonlinear_freq = lowest_nonlinear_freq;
{
int i = 0;
int offset = 0;
+ struct amd_cpudata *cpudata = policy->driver_data;
+
+ if (cpudata->policy == CPUFREQ_POLICY_PERFORMANCE)
+ return sysfs_emit_at(buf, offset, "%s\n",
+ energy_perf_strings[EPP_INDEX_PERFORMANCE]);
while (energy_perf_strings[i] != NULL)
offset += sysfs_emit_at(buf, offset, "%s ", energy_perf_strings[i++]);
- sysfs_emit_at(buf, offset, "\n");
+ offset += sysfs_emit_at(buf, offset, "\n");
return offset;
}
return 0;
}
-static void amd_pstate_epp_init(unsigned int cpu)
+static void amd_pstate_epp_update_limit(struct cpufreq_policy *policy)
{
- struct cpufreq_policy *policy = cpufreq_cpu_get(cpu);
struct amd_cpudata *cpudata = policy->driver_data;
- u32 max_perf, min_perf;
+ u32 max_perf, min_perf, min_limit_perf, max_limit_perf;
u64 value;
s16 epp;
max_perf = READ_ONCE(cpudata->highest_perf);
min_perf = READ_ONCE(cpudata->lowest_perf);
+ max_limit_perf = div_u64(policy->max * cpudata->highest_perf, cpudata->max_freq);
+ min_limit_perf = div_u64(policy->min * cpudata->highest_perf, cpudata->max_freq);
+
+ max_perf = clamp_t(unsigned long, max_perf, cpudata->min_limit_perf,
+ cpudata->max_limit_perf);
+ min_perf = clamp_t(unsigned long, min_perf, cpudata->min_limit_perf,
+ cpudata->max_limit_perf);
+
+ WRITE_ONCE(cpudata->max_limit_perf, max_limit_perf);
+ WRITE_ONCE(cpudata->min_limit_perf, min_limit_perf);
value = READ_ONCE(cpudata->cppc_req_cached);
value &= ~AMD_CPPC_DES_PERF(~0L);
value |= AMD_CPPC_DES_PERF(0);
- if (cpudata->epp_policy == cpudata->policy)
- goto skip_epp;
-
cpudata->epp_policy = cpudata->policy;
/* Get BIOS pre-defined epp value */
* This return value can only be negative for shared_memory
* systems where EPP register read/write not supported.
*/
- goto skip_epp;
+ return;
}
if (cpudata->policy == CPUFREQ_POLICY_PERFORMANCE)
WRITE_ONCE(cpudata->cppc_req_cached, value);
amd_pstate_set_epp(cpudata, epp);
-skip_epp:
- cpufreq_cpu_put(policy);
}
static int amd_pstate_epp_set_policy(struct cpufreq_policy *policy)
cpudata->policy = policy->policy;
- amd_pstate_epp_init(policy->cpu);
+ amd_pstate_epp_update_limit(policy);
return 0;
}
imx6x_disable_freq_in_opp(dev, 696000000);
if (of_machine_is_compatible("fsl,imx6ull")) {
- if (val != OCOTP_CFG3_6ULL_SPEED_792MHZ)
+ if (val < OCOTP_CFG3_6ULL_SPEED_792MHZ)
imx6x_disable_freq_in_opp(dev, 792000000);
if (val != OCOTP_CFG3_6ULL_SPEED_900MHZ)
#include <linux/nvmem-consumer.h>
#include <linux/of.h>
#include <linux/platform_device.h>
+#include <linux/pm.h>
#include <linux/pm_domain.h>
#include <linux/pm_opp.h>
+#include <linux/pm_runtime.h>
#include <linux/slab.h>
#include <linux/soc/qcom/smem.h>
struct qcom_cpufreq_drv_cpu {
int opp_token;
+ struct device **virt_devs;
};
struct qcom_cpufreq_drv {
.get_version = qcom_cpufreq_ipq8074_name_version,
};
+static void qcom_cpufreq_suspend_virt_devs(struct qcom_cpufreq_drv *drv, unsigned int cpu)
+{
+ const char * const *name = drv->data->genpd_names;
+ int i;
+
+ if (!drv->cpus[cpu].virt_devs)
+ return;
+
+ for (i = 0; *name; i++, name++)
+ device_set_awake_path(drv->cpus[cpu].virt_devs[i]);
+}
+
+static void qcom_cpufreq_put_virt_devs(struct qcom_cpufreq_drv *drv, unsigned int cpu)
+{
+ const char * const *name = drv->data->genpd_names;
+ int i;
+
+ if (!drv->cpus[cpu].virt_devs)
+ return;
+
+ for (i = 0; *name; i++, name++)
+ pm_runtime_put(drv->cpus[cpu].virt_devs[i]);
+}
+
static int qcom_cpufreq_probe(struct platform_device *pdev)
{
struct qcom_cpufreq_drv *drv;
of_node_put(np);
for_each_possible_cpu(cpu) {
+ struct device **virt_devs = NULL;
struct dev_pm_opp_config config = {
.supported_hw = NULL,
};
if (drv->data->genpd_names) {
config.genpd_names = drv->data->genpd_names;
- config.virt_devs = NULL;
+ config.virt_devs = &virt_devs;
}
if (config.supported_hw || config.genpd_names) {
goto free_opp;
}
}
+
+ if (virt_devs) {
+ const char * const *name = config.genpd_names;
+ int i, j;
+
+ for (i = 0; *name; i++, name++) {
+ ret = pm_runtime_resume_and_get(virt_devs[i]);
+ if (ret) {
+ dev_err(cpu_dev, "failed to resume %s: %d\n",
+ *name, ret);
+
+ /* Rollback previous PM runtime calls */
+ name = config.genpd_names;
+ for (j = 0; *name && j < i; j++, name++)
+ pm_runtime_put(virt_devs[j]);
+
+ goto free_opp;
+ }
+ }
+ drv->cpus[cpu].virt_devs = virt_devs;
+ }
}
cpufreq_dt_pdev = platform_device_register_simple("cpufreq-dt", -1,
dev_err(cpu_dev, "Failed to register platform device\n");
free_opp:
- for_each_possible_cpu(cpu)
+ for_each_possible_cpu(cpu) {
+ qcom_cpufreq_put_virt_devs(drv, cpu);
dev_pm_opp_clear_config(drv->cpus[cpu].opp_token);
+ }
return ret;
}
platform_device_unregister(cpufreq_dt_pdev);
- for_each_possible_cpu(cpu)
+ for_each_possible_cpu(cpu) {
+ qcom_cpufreq_put_virt_devs(drv, cpu);
dev_pm_opp_clear_config(drv->cpus[cpu].opp_token);
+ }
}
+static int qcom_cpufreq_suspend(struct device *dev)
+{
+ struct qcom_cpufreq_drv *drv = dev_get_drvdata(dev);
+ unsigned int cpu;
+
+ for_each_possible_cpu(cpu)
+ qcom_cpufreq_suspend_virt_devs(drv, cpu);
+
+ return 0;
+}
+
+static DEFINE_SIMPLE_DEV_PM_OPS(qcom_cpufreq_pm_ops, qcom_cpufreq_suspend, NULL);
+
static struct platform_driver qcom_cpufreq_driver = {
.probe = qcom_cpufreq_probe,
.remove_new = qcom_cpufreq_remove,
.driver = {
.name = "qcom-cpufreq-nvmem",
+ .pm = pm_sleep_ptr(&qcom_cpufreq_pm_ops),
},
};
{
resource_size_t base = -1;
- down_read(&cxl_dpa_rwsem);
+ lockdep_assert_held(&cxl_dpa_rwsem);
if (cxled->dpa_res)
base = cxled->dpa_res->start;
- up_read(&cxl_dpa_rwsem);
return base;
}
cxld->target_type = CXL_DECODER_HOSTONLYMEM;
else
cxld->target_type = CXL_DECODER_DEVMEM;
+
+ guard(rwsem_write)(&cxl_region_rwsem);
if (cxld->id != cxl_num_decoders_committed(port)) {
dev_warn(&port->dev,
"decoder%d.%d: Committed out of order\n",
if (!port || !is_cxl_endpoint(port))
return -EINVAL;
- rc = down_read_interruptible(&cxl_dpa_rwsem);
+ rc = down_read_interruptible(&cxl_region_rwsem);
if (rc)
return rc;
+ rc = down_read_interruptible(&cxl_dpa_rwsem);
+ if (rc) {
+ up_read(&cxl_region_rwsem);
+ return rc;
+ }
+
if (cxl_num_decoders_committed(port) == 0) {
/* No regions mapped to this memdev */
rc = cxl_get_poison_by_memdev(cxlmd);
rc = cxl_get_poison_by_endpoint(port);
}
up_read(&cxl_dpa_rwsem);
+ up_read(&cxl_region_rwsem);
return rc;
}
if (!IS_ENABLED(CONFIG_DEBUG_FS))
return 0;
- rc = down_read_interruptible(&cxl_dpa_rwsem);
+ rc = down_read_interruptible(&cxl_region_rwsem);
if (rc)
return rc;
+ rc = down_read_interruptible(&cxl_dpa_rwsem);
+ if (rc) {
+ up_read(&cxl_region_rwsem);
+ return rc;
+ }
+
rc = cxl_validate_poison_dpa(cxlmd, dpa);
if (rc)
goto out;
trace_cxl_poison(cxlmd, cxlr, &record, 0, 0, CXL_POISON_TRACE_INJECT);
out:
up_read(&cxl_dpa_rwsem);
+ up_read(&cxl_region_rwsem);
return rc;
}
if (!IS_ENABLED(CONFIG_DEBUG_FS))
return 0;
- rc = down_read_interruptible(&cxl_dpa_rwsem);
+ rc = down_read_interruptible(&cxl_region_rwsem);
if (rc)
return rc;
+ rc = down_read_interruptible(&cxl_dpa_rwsem);
+ if (rc) {
+ up_read(&cxl_region_rwsem);
+ return rc;
+ }
+
rc = cxl_validate_poison_dpa(cxlmd, dpa);
if (rc)
goto out;
trace_cxl_poison(cxlmd, cxlr, &record, 0, 0, CXL_POISON_TRACE_CLEAR);
out:
up_read(&cxl_dpa_rwsem);
+ up_read(&cxl_region_rwsem);
return rc;
}
struct pci_dev *pdev = NULL;
struct cxl_memdev *cxlmd;
size_t cdat_length;
- void *cdat_table;
+ void *cdat_table, *cdat_buf;
int rc;
if (is_cxl_memdev(uport)) {
return;
}
- cdat_table = devm_kzalloc(dev, cdat_length + sizeof(__le32),
- GFP_KERNEL);
- if (!cdat_table)
+ cdat_buf = devm_kzalloc(dev, cdat_length + sizeof(__le32), GFP_KERNEL);
+ if (!cdat_buf)
return;
- rc = cxl_cdat_read_table(dev, cdat_doe, cdat_table, &cdat_length);
+ rc = cxl_cdat_read_table(dev, cdat_doe, cdat_buf, &cdat_length);
if (rc)
goto err;
- cdat_table = cdat_table + sizeof(__le32);
+ cdat_table = cdat_buf + sizeof(__le32);
if (cdat_checksum(cdat_table, cdat_length))
goto err;
err:
/* Don't leave table data allocated on error */
- devm_kfree(dev, cdat_table);
+ devm_kfree(dev, cdat_buf);
dev_err(dev, "Failed to read/validate CDAT.\n");
}
EXPORT_SYMBOL_NS_GPL(read_cdat_data, CXL);
static void remove_dev(void *dev)
{
- device_del(dev);
+ device_unregister(dev);
}
int devm_cxl_pmu_add(struct device *parent, struct cxl_pmu_regs *regs,
char *buf)
{
struct cxl_endpoint_decoder *cxled = to_cxl_endpoint_decoder(dev);
- u64 base = cxl_dpa_resource_start(cxled);
- return sysfs_emit(buf, "%#llx\n", base);
+ guard(rwsem_read)(&cxl_dpa_rwsem);
+ return sysfs_emit(buf, "%#llx\n", (u64)cxl_dpa_resource_start(cxled));
}
static DEVICE_ATTR_RO(dpa_resource);
struct cxl_poison_context ctx;
int rc = 0;
- rc = down_read_interruptible(&cxl_region_rwsem);
- if (rc)
- return rc;
-
ctx = (struct cxl_poison_context) {
.port = port
};
rc = cxl_get_poison_unmapped(to_cxl_memdev(port->uport_dev),
&ctx);
- up_read(&cxl_region_rwsem);
return rc;
}
dma_resv_list_entry(fobj, i, obj, &old, &old_usage);
if ((old->context == fence->context && old_usage >= usage &&
- dma_fence_is_later(fence, old)) ||
+ dma_fence_is_later_or_same(fence, old)) ||
dma_fence_is_signaled(old)) {
dma_resv_list_set(fobj, i, fence, usage);
dma_fence_put(old);
dma_pool_destroy(fsl_chan->tcd_pool);
fsl_chan->tcd_pool = NULL;
fsl_chan->is_sw = false;
+ fsl_chan->srcid = 0;
}
void fsl_edma_cleanup_vchan(struct dma_device *dmadev)
link = device_link_add(dev, pd_chan, DL_FLAG_STATELESS |
DL_FLAG_PM_RUNTIME |
DL_FLAG_RPM_ACTIVE);
- if (IS_ERR(link)) {
- dev_err(dev, "Failed to add device_link to %d: %ld\n", i,
- PTR_ERR(link));
+ if (!link) {
+ dev_err(dev, "Failed to add device_link to %d\n", i);
return -EINVAL;
}
for (i = 0; i < fsl_edma->n_chans; i++) {
fsl_chan = &fsl_edma->chans[i];
+ if (fsl_edma->chan_masked & BIT(i))
+ continue;
spin_lock_irqsave(&fsl_chan->vchan.lock, flags);
/* Make sure chan is idle or will force disable. */
if (unlikely(!fsl_chan->idle)) {
for (i = 0; i < fsl_edma->n_chans; i++) {
fsl_chan = &fsl_edma->chans[i];
+ if (fsl_edma->chan_masked & BIT(i))
+ continue;
fsl_chan->pm_state = RUNNING;
edma_write_tcdreg(fsl_chan, 0, csr);
if (fsl_chan->slave_id != 0)
fsl_edma_chan_mux(fsl_chan, fsl_chan->slave_id, true);
}
- edma_writel(fsl_edma, EDMA_CR_ERGA | EDMA_CR_ERCA, regs->cr);
+ if (!(fsl_edma->drvdata->flags & FSL_EDMA_DRV_SPLIT_REG))
+ edma_writel(fsl_edma, EDMA_CR_ERGA | EDMA_CR_ERCA, regs->cr);
return 0;
}
/*
* This macro calculates the offset into the GRPCFG register
* idxd - struct idxd *
- * n - wq id
- * ofs - the index of the 32b dword for the config register
+ * n - group id
+ * ofs - the index of the 64b qword for the config register
*
- * The WQCFG register block is divided into groups per each wq. The n index
- * allows us to move to the register group that's for that particular wq.
- * Each register is 32bits. The ofs gives us the number of register to access.
+ * The GRPCFG register block is divided into three sub-registers, which
+ * are GRPWQCFG, GRPENGCFG and GRPFLGCFG. The n index allows us to move
+ * to the register block that contains the three sub-registers.
+ * Each register block is 64bits. And the ofs gives us the offset
+ * within the GRPWQCFG register to access.
*/
#define GRPWQCFG_OFFSET(idxd_dev, n, ofs) ((idxd_dev)->grpcfg_offset +\
(n) * GRPCFG_SIZE + sizeof(u64) * (ofs))
portal = idxd_wq_portal_addr(wq);
- /*
- * The wmb() flushes writes to coherent DMA data before
- * possibly triggering a DMA read. The wmb() is necessary
- * even on UP because the recipient is a device.
- */
- wmb();
-
/*
* Pending the descriptor to the lockless list for the irq_entry
* that we designated the descriptor to.
llist_add(&desc->llnode, &ie->pending_llist);
}
+ /*
+ * The wmb() flushes writes to coherent DMA data before
+ * possibly triggering a DMA read. The wmb() is necessary
+ * even on UP because the recipient is a device.
+ */
+ wmb();
+
if (wq_dedicated(wq)) {
iosubmit_cmds512(portal, desc->hw, 1);
} else {
enum dma_slave_buswidth max_width;
struct stm32_dma_desc *desc;
size_t xfer_count, offset;
- u32 num_sgs, best_burst, dma_burst, threshold;
- int i;
+ u32 num_sgs, best_burst, threshold;
+ int dma_burst, i;
num_sgs = DIV_ROUND_UP(len, STM32_DMA_ALIGNED_MAX_DATA_ITEMS);
desc = kzalloc(struct_size(desc, sg_req, num_sgs), GFP_NOWAIT);
best_burst = stm32_dma_get_best_burst(len, STM32_DMA_MAX_BURST,
threshold, max_width);
dma_burst = stm32_dma_get_burst(chan, best_burst);
+ if (dma_burst < 0) {
+ kfree(desc);
+ return NULL;
+ }
stm32_dma_clear_reg(&desc->sg_req[i].chan_reg);
desc->sg_req[i].chan_reg.dma_scr =
PSIL_SAUL(0x7505, 21, 35, 8, 36, 0),
PSIL_SAUL(0x7506, 22, 43, 8, 43, 0),
PSIL_SAUL(0x7507, 23, 43, 8, 44, 0),
- /* PDMA_MAIN0 - SPI0-3 */
+ /* PDMA_MAIN0 - SPI0-2 */
+ PSIL_PDMA_XY_PKT(0x4300),
+ PSIL_PDMA_XY_PKT(0x4301),
PSIL_PDMA_XY_PKT(0x4302),
PSIL_PDMA_XY_PKT(0x4303),
PSIL_PDMA_XY_PKT(0x4304),
PSIL_PDMA_XY_PKT(0x4309),
PSIL_PDMA_XY_PKT(0x430a),
PSIL_PDMA_XY_PKT(0x430b),
- PSIL_PDMA_XY_PKT(0x430c),
- PSIL_PDMA_XY_PKT(0x430d),
/* PDMA_MAIN1 - UART0-6 */
PSIL_PDMA_XY_PKT(0x4400),
PSIL_PDMA_XY_PKT(0x4401),
/* SAUL */
PSIL_SAUL(0xf500, 27, 83, 8, 83, 1),
PSIL_SAUL(0xf501, 28, 91, 8, 91, 1),
- /* PDMA_MAIN0 - SPI0-3 */
+ /* PDMA_MAIN0 - SPI0-2 */
+ PSIL_PDMA_XY_PKT(0xc300),
+ PSIL_PDMA_XY_PKT(0xc301),
PSIL_PDMA_XY_PKT(0xc302),
PSIL_PDMA_XY_PKT(0xc303),
PSIL_PDMA_XY_PKT(0xc304),
PSIL_PDMA_XY_PKT(0xc309),
PSIL_PDMA_XY_PKT(0xc30a),
PSIL_PDMA_XY_PKT(0xc30b),
- PSIL_PDMA_XY_PKT(0xc30c),
- PSIL_PDMA_XY_PKT(0xc30d),
/* PDMA_MAIN1 - UART0-6 */
PSIL_PDMA_XY_PKT(0xc400),
PSIL_PDMA_XY_PKT(0xc401),
PSIL_SAUL(0x7505, 21, 35, 8, 36, 0),
PSIL_SAUL(0x7506, 22, 43, 8, 43, 0),
PSIL_SAUL(0x7507, 23, 43, 8, 44, 0),
- /* PDMA_MAIN0 - SPI0-3 */
+ /* PDMA_MAIN0 - SPI0-2 */
+ PSIL_PDMA_XY_PKT(0x4300),
+ PSIL_PDMA_XY_PKT(0x4301),
PSIL_PDMA_XY_PKT(0x4302),
PSIL_PDMA_XY_PKT(0x4303),
PSIL_PDMA_XY_PKT(0x4304),
PSIL_PDMA_XY_PKT(0x4309),
PSIL_PDMA_XY_PKT(0x430a),
PSIL_PDMA_XY_PKT(0x430b),
- PSIL_PDMA_XY_PKT(0x430c),
- PSIL_PDMA_XY_PKT(0x430d),
/* PDMA_MAIN1 - UART0-6 */
PSIL_PDMA_XY_PKT(0x4400),
PSIL_PDMA_XY_PKT(0x4401),
/* SAUL */
PSIL_SAUL(0xf500, 27, 83, 8, 83, 1),
PSIL_SAUL(0xf501, 28, 91, 8, 91, 1),
- /* PDMA_MAIN0 - SPI0-3 */
+ /* PDMA_MAIN0 - SPI0-2 */
+ PSIL_PDMA_XY_PKT(0xc300),
+ PSIL_PDMA_XY_PKT(0xc301),
PSIL_PDMA_XY_PKT(0xc302),
PSIL_PDMA_XY_PKT(0xc303),
PSIL_PDMA_XY_PKT(0xc304),
PSIL_PDMA_XY_PKT(0xc309),
PSIL_PDMA_XY_PKT(0xc30a),
PSIL_PDMA_XY_PKT(0xc30b),
- PSIL_PDMA_XY_PKT(0xc30c),
- PSIL_PDMA_XY_PKT(0xc30d),
/* PDMA_MAIN1 - UART0-6 */
PSIL_PDMA_XY_PKT(0xc400),
PSIL_PDMA_XY_PKT(0xc401),
struct netlink_ext_ack *extack)
{
struct nlattr *tb[DPLL_A_PIN_MAX + 1];
- enum dpll_pin_state state;
u32 ppin_idx;
int ret;
return -EINVAL;
}
ppin_idx = nla_get_u32(tb[DPLL_A_PIN_PARENT_ID]);
- state = nla_get_u32(tb[DPLL_A_PIN_STATE]);
- ret = dpll_pin_on_pin_state_set(pin, ppin_idx, state, extack);
- if (ret)
- return ret;
+
+ if (tb[DPLL_A_PIN_STATE]) {
+ enum dpll_pin_state state = nla_get_u32(tb[DPLL_A_PIN_STATE]);
+
+ ret = dpll_pin_on_pin_state_set(pin, ppin_idx, state, extack);
+ if (ret)
+ return ret;
+ }
return 0;
}
return -ENOMEM;
hdr = genlmsg_put_reply(msg, info, &dpll_nl_family, 0,
DPLL_CMD_PIN_ID_GET);
- if (!hdr)
+ if (!hdr) {
+ nlmsg_free(msg);
return -EMSGSIZE;
-
+ }
pin = dpll_pin_find_from_nlattr(info);
if (!IS_ERR(pin)) {
ret = dpll_msg_add_pin_handle(msg, pin);
return -ENOMEM;
hdr = genlmsg_put_reply(msg, info, &dpll_nl_family, 0,
DPLL_CMD_PIN_GET);
- if (!hdr)
+ if (!hdr) {
+ nlmsg_free(msg);
return -EMSGSIZE;
+ }
ret = dpll_cmd_pin_get_one(msg, pin, info->extack);
if (ret) {
nlmsg_free(msg);
return -ENOMEM;
hdr = genlmsg_put_reply(msg, info, &dpll_nl_family, 0,
DPLL_CMD_DEVICE_ID_GET);
- if (!hdr)
+ if (!hdr) {
+ nlmsg_free(msg);
return -EMSGSIZE;
+ }
dpll = dpll_device_find_from_nlattr(info);
if (!IS_ERR(dpll)) {
return -ENOMEM;
hdr = genlmsg_put_reply(msg, info, &dpll_nl_family, 0,
DPLL_CMD_DEVICE_GET);
- if (!hdr)
+ if (!hdr) {
+ nlmsg_free(msg);
return -EMSGSIZE;
+ }
ret = dpll_device_get_one(dpll, msg, info->extack);
if (ret) {
edac_mc_id = emif_get_id(pdev->dev.of_node);
regval = readl(ddrmc_baseaddr + XDDR_REG_CONFIG0_OFFSET);
- num_chans = FIELD_PREP(XDDR_REG_CONFIG0_NUM_CHANS_MASK, regval);
+ num_chans = FIELD_GET(XDDR_REG_CONFIG0_NUM_CHANS_MASK, regval);
num_chans++;
- num_csrows = FIELD_PREP(XDDR_REG_CONFIG0_NUM_RANKS_MASK, regval);
+ num_csrows = FIELD_GET(XDDR_REG_CONFIG0_NUM_RANKS_MASK, regval);
num_csrows *= 2;
if (!num_csrows)
num_csrows = 1;
fw_unit_attributes,
&unit->attribute_group);
- if (device_register(&unit->device) < 0)
- goto skip_unit;
-
fw_device_get(device);
- continue;
-
- skip_unit:
- kfree(unit);
+ if (device_register(&unit->device) < 0) {
+ put_device(&unit->device);
+ continue;
+ }
}
}
sdev->use_10_for_rw = 1;
if (sbp2_param_exclusive_login) {
- sdev->manage_system_start_stop = true;
- sdev->manage_runtime_start_stop = true;
- sdev->manage_shutdown = true;
+ sdev->manage_system_start_stop = 1;
+ sdev->manage_runtime_start_stop = 1;
+ sdev->manage_shutdown = 1;
}
if (sdev->type == TYPE_ROM)
config FW_CFG_SYSFS
tristate "QEMU fw_cfg device support in sysfs"
- depends on SYSFS && (ARM || ARM64 || PARISC || PPC_PMAC || SPARC || X86)
+ depends on SYSFS && (ARM || ARM64 || PARISC || PPC_PMAC || RISCV || SPARC || X86)
depends on HAS_IOPORT_MAP
default n
help
void *tx_buffer;
bool mem_ops_native;
bool bitmap_created;
+ bool notif_enabled;
unsigned int sched_recv_irq;
unsigned int cpuhp_state;
struct ffa_pcpu_irq __percpu *irq_pcpu;
if (ids_processed >= max_ids - 1)
break;
- part_id = packed_id_list[++ids_processed];
+ part_id = packed_id_list[ids_processed++];
if (!ids_count[list]) { /* Global Notification */
__do_sched_recv_cb(part_id, 0, false);
if (ids_processed >= max_ids - 1)
break;
- vcpu_id = packed_id_list[++ids_processed];
+ vcpu_id = packed_id_list[ids_processed++];
__do_sched_recv_cb(part_id, vcpu_id, true);
}
#define FFA_SECURE_PARTITION_ID_FLAG BIT(15)
+#define ffa_notifications_disabled() (!drv_info->notif_enabled)
+
enum notify_type {
NON_SECURE_VM,
SECURE_PARTITION,
struct ffa_dev_part_info *partition;
bool cb_valid;
+ if (ffa_notifications_disabled())
+ return -EOPNOTSUPP;
+
partition = xa_load(&drv_info->partition_info, part_id);
write_lock(&partition->rw_lock);
int rc;
enum notify_type type = ffa_notify_type_get(dev->vm_id);
+ if (ffa_notifications_disabled())
+ return -EOPNOTSUPP;
+
if (notify_id >= FFA_MAX_NOTIFICATIONS)
return -EINVAL;
u32 flags = 0;
enum notify_type type = ffa_notify_type_get(dev->vm_id);
+ if (ffa_notifications_disabled())
+ return -EOPNOTSUPP;
+
if (notify_id >= FFA_MAX_NOTIFICATIONS)
return -EINVAL;
{
u32 flags = 0;
+ if (ffa_notifications_disabled())
+ return -EOPNOTSUPP;
+
if (is_per_vcpu)
flags |= (PER_VCPU_NOTIFICATION_FLAG | vcpu << 16);
if (!count)
return;
- info = kcalloc(count, sizeof(**info), GFP_KERNEL);
+ info = kcalloc(count, sizeof(*info), GFP_KERNEL);
if (!info)
return;
static void ffa_sched_recv_irq_unmap(void)
{
- if (drv_info->sched_recv_irq)
+ if (drv_info->sched_recv_irq) {
irq_dispose_mapping(drv_info->sched_recv_irq);
+ drv_info->sched_recv_irq = 0;
+ }
}
static int ffa_cpuhp_pcpu_irq_enable(unsigned int cpu)
static void ffa_uninit_pcpu_irq(void)
{
- if (drv_info->cpuhp_state)
+ if (drv_info->cpuhp_state) {
cpuhp_remove_state(drv_info->cpuhp_state);
+ drv_info->cpuhp_state = 0;
+ }
- if (drv_info->notif_pcpu_wq)
+ if (drv_info->notif_pcpu_wq) {
destroy_workqueue(drv_info->notif_pcpu_wq);
+ drv_info->notif_pcpu_wq = NULL;
+ }
if (drv_info->sched_recv_irq)
free_percpu_irq(drv_info->sched_recv_irq, drv_info->irq_pcpu);
- if (drv_info->irq_pcpu)
+ if (drv_info->irq_pcpu) {
free_percpu(drv_info->irq_pcpu);
+ drv_info->irq_pcpu = NULL;
+ }
}
static int ffa_init_pcpu_irq(unsigned int irq)
ffa_notification_bitmap_destroy();
drv_info->bitmap_created = false;
}
+ drv_info->notif_enabled = false;
}
-static int ffa_notifications_setup(void)
+static void ffa_notifications_setup(void)
{
int ret, irq;
ret = ffa_features(FFA_NOTIFICATION_BITMAP_CREATE, 0, NULL, NULL);
if (ret) {
- pr_err("Notifications not supported, continuing with it ..\n");
- return 0;
+ pr_info("Notifications not supported, continuing with it ..\n");
+ return;
}
ret = ffa_notification_bitmap_create();
if (ret) {
- pr_err("notification_bitmap_create error %d\n", ret);
- return ret;
+ pr_info("Notification bitmap create error %d\n", ret);
+ return;
}
drv_info->bitmap_created = true;
hash_init(drv_info->notifier_hash);
mutex_init(&drv_info->notify_lock);
- /* Register internal scheduling callback */
- ret = ffa_sched_recv_cb_update(drv_info->vm_id, ffa_self_notif_handle,
- drv_info, true);
- if (!ret)
- return ret;
+ drv_info->notif_enabled = true;
+ return;
cleanup:
+ pr_info("Notification setup failed %d, not enabled\n", ret);
ffa_notifications_cleanup();
- return ret;
}
static int __init ffa_init(void)
mutex_init(&drv_info->rx_lock);
mutex_init(&drv_info->tx_lock);
- ffa_setup_partitions();
-
ffa_set_up_mem_ops_native_flag();
- ret = ffa_notifications_setup();
+ ffa_notifications_setup();
+
+ ffa_setup_partitions();
+
+ ret = ffa_sched_recv_cb_update(drv_info->vm_id, ffa_self_notif_handle,
+ drv_info, true);
if (ret)
- goto partitions_cleanup;
+ pr_info("Failed to register driver sched callback %d\n", ret);
return 0;
-partitions_cleanup:
- ffa_partitions_cleanup();
free_pages:
if (drv_info->tx_buffer)
free_pages_exact(drv_info->tx_buffer, RXTX_BUFFER_SIZE);
u32 opp_count;
u32 sustained_freq_khz;
u32 sustained_perf_level;
- u32 mult_factor;
+ unsigned long mult_factor;
struct scmi_perf_domain_info info;
struct scmi_opp opp[MAX_OPPS];
struct scmi_fc_info *fc_info;
dom_info->sustained_perf_level =
le32_to_cpu(attr->sustained_perf_level);
if (!dom_info->sustained_freq_khz ||
- !dom_info->sustained_perf_level)
+ !dom_info->sustained_perf_level ||
+ dom_info->level_indexing_mode)
/* CPUFreq converts to kHz, hence default 1000 */
dom_info->mult_factor = 1000;
else
dom_info->mult_factor =
- (dom_info->sustained_freq_khz * 1000) /
- dom_info->sustained_perf_level;
+ (dom_info->sustained_freq_khz * 1000UL)
+ / dom_info->sustained_perf_level;
strscpy(dom_info->info.name, attr->name,
SCMI_SHORT_NAME_MAX_SIZE);
}
if (!dom->level_indexing_mode)
freq = dom->opp[idx].perf * dom->mult_factor;
else
- freq = dom->opp[idx].indicative_freq * 1000;
+ freq = dom->opp[idx].indicative_freq * dom->mult_factor;
data.level = dom->opp[idx].perf;
data.freq = freq;
} else {
struct scmi_opp *opp;
- opp = LOOKUP_BY_FREQ(dom->opps_by_freq, freq / 1000);
+ opp = LOOKUP_BY_FREQ(dom->opps_by_freq,
+ freq / dom->mult_factor);
if (!opp)
return -EIO;
if (!opp)
return -EIO;
- *freq = opp->indicative_freq * 1000;
+ *freq = opp->indicative_freq * dom->mult_factor;
}
return ret;
if (!dom->level_indexing_mode)
opp_freq = opp->perf * dom->mult_factor;
else
- opp_freq = opp->indicative_freq * 1000;
+ opp_freq = opp->indicative_freq * dom->mult_factor;
if (opp_freq < *freq)
continue;
return status;
}
-unsigned long kernel_entry_address(void)
+unsigned long kernel_entry_address(unsigned long kernel_addr)
{
unsigned long base = (unsigned long)&kernel_offset - kernel_offset;
- return (unsigned long)&kernel_entry - base + VMLINUX_LOAD_ADDRESS;
+ return (unsigned long)&kernel_entry - base + kernel_addr;
}
return EFI_SUCCESS;
}
-unsigned long __weak kernel_entry_address(void)
+unsigned long __weak kernel_entry_address(unsigned long kernel_addr)
{
- return *(unsigned long *)(PHYSADDR(VMLINUX_LOAD_ADDRESS) + 8);
+ return *(unsigned long *)(kernel_addr + 8) - VMLINUX_LOAD_ADDRESS + kernel_addr;
}
efi_status_t efi_boot_kernel(void *handle, efi_loaded_image_t *image,
csr_write64(CSR_DMW0_INIT, LOONGARCH_CSR_DMWIN0);
csr_write64(CSR_DMW1_INIT, LOONGARCH_CSR_DMWIN1);
- real_kernel_entry = (void *)kernel_entry_address();
+ real_kernel_entry = (void *)kernel_entry_address(kernel_addr);
real_kernel_entry(true, (unsigned long)cmdline_ptr,
(unsigned long)efi_system_table);
efi_err("Memory acceptance protocol failed\n");
}
+static efi_char16_t *efistub_fw_vendor(void)
+{
+ unsigned long vendor = efi_table_attr(efi_system_table, fw_vendor);
+
+ return (efi_char16_t *)vendor;
+}
+
static const efi_char16_t apple[] = L"Apple";
static void setup_quirks(struct boot_params *boot_params)
{
- efi_char16_t *fw_vendor = (efi_char16_t *)(unsigned long)
- efi_table_attr(efi_system_table, fw_vendor);
-
- if (!memcmp(fw_vendor, apple, sizeof(apple))) {
- if (IS_ENABLED(CONFIG_APPLE_PROPERTIES))
- retrieve_apple_device_properties(boot_params);
- }
+ if (IS_ENABLED(CONFIG_APPLE_PROPERTIES) &&
+ !memcmp(efistub_fw_vendor(), apple, sizeof(apple)))
+ retrieve_apple_device_properties(boot_params);
}
/*
if (IS_ENABLED(CONFIG_RANDOMIZE_BASE) && !efi_nokaslr) {
u64 range = KERNEL_IMAGE_SIZE - LOAD_PHYSICAL_ADDR - kernel_total_size;
+ static const efi_char16_t ami[] = L"American Megatrends";
efi_get_seed(seed, sizeof(seed));
virt_addr += (range * seed[1]) >> 32;
virt_addr &= ~(CONFIG_PHYSICAL_ALIGN - 1);
+
+ /*
+ * Older Dell systems with AMI UEFI firmware v2.0 may hang
+ * while decompressing the kernel if physical address
+ * randomization is enabled.
+ *
+ * https://bugzilla.kernel.org/show_bug.cgi?id=218173
+ */
+ if (efi_system_table->hdr.revision <= EFI_2_00_SYSTEM_TABLE_REVISION &&
+ !memcmp(efistub_fw_vendor(), ami, sizeof(ami))) {
+ efi_debug("AMI firmware v2.0 or older detected - disabling physical KASLR\n");
+ seed[0] = 0;
+ }
}
status = efi_random_alloc(alloc_size, CONFIG_PHYSICAL_ALIGN, &addr,
* overlap on physical address level.
*/
list_for_each_entry(entry, &accepting_list, list) {
- if (entry->end < range.start)
+ if (entry->end <= range.start)
continue;
if (entry->start >= range.end)
continue;
/* arch-specific ctrl & data register offsets are not available in ACPI, DT */
#if !(defined(FW_CFG_CTRL_OFF) && defined(FW_CFG_DATA_OFF))
-# if (defined(CONFIG_ARM) || defined(CONFIG_ARM64))
+# if (defined(CONFIG_ARM) || defined(CONFIG_ARM64) || defined(CONFIG_RISCV))
# define FW_CFG_CTRL_OFF 0x08
# define FW_CFG_DATA_OFF 0x00
# define FW_CFG_DMA_OFF 0x10
{
struct gpio_chip *gc = irq_data_get_irq_chip_data(d);
struct dwapb_gpio *gpio = to_dwapb_gpio(gc);
+ irq_hw_number_t hwirq = irqd_to_hwirq(d);
unsigned long flags;
u32 val;
raw_spin_lock_irqsave(&gc->bgpio_lock, flags);
- val = dwapb_read(gpio, GPIO_INTEN);
- val |= BIT(irqd_to_hwirq(d));
+ val = dwapb_read(gpio, GPIO_INTEN) | BIT(hwirq);
dwapb_write(gpio, GPIO_INTEN, val);
+ val = dwapb_read(gpio, GPIO_INTMASK) & ~BIT(hwirq);
+ dwapb_write(gpio, GPIO_INTMASK, val);
raw_spin_unlock_irqrestore(&gc->bgpio_lock, flags);
}
{
struct gpio_chip *gc = irq_data_get_irq_chip_data(d);
struct dwapb_gpio *gpio = to_dwapb_gpio(gc);
+ irq_hw_number_t hwirq = irqd_to_hwirq(d);
unsigned long flags;
u32 val;
raw_spin_lock_irqsave(&gc->bgpio_lock, flags);
- val = dwapb_read(gpio, GPIO_INTEN);
- val &= ~BIT(irqd_to_hwirq(d));
+ val = dwapb_read(gpio, GPIO_INTMASK) | BIT(hwirq);
+ dwapb_write(gpio, GPIO_INTMASK, val);
+ val = dwapb_read(gpio, GPIO_INTEN) & ~BIT(hwirq);
dwapb_write(gpio, GPIO_INTEN, val);
raw_spin_unlock_irqrestore(&gc->bgpio_lock, flags);
}
return 0;
}
-/*
- * gpio_ioctl() - ioctl handler for the GPIO chardev
- */
-static long gpio_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
+static long gpio_ioctl_unlocked(struct file *file, unsigned int cmd, unsigned long arg)
{
struct gpio_chardev_data *cdev = file->private_data;
struct gpio_device *gdev = cdev->gdev;
}
}
+/*
+ * gpio_ioctl() - ioctl handler for the GPIO chardev
+ */
+static long gpio_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
+{
+ struct gpio_chardev_data *cdev = file->private_data;
+
+ return call_ioctl_locked(file, cmd, arg, cdev->gdev,
+ gpio_ioctl_unlocked);
+}
+
#ifdef CONFIG_COMPAT
static long gpio_ioctl_compat(struct file *file, unsigned int cmd,
unsigned long arg)
goto done;
status = gpiod_set_transitory(desc, false);
- if (!status) {
- status = gpiod_export(desc, true);
- if (status < 0)
- gpiod_free(desc);
- else
- set_bit(FLAG_SYSFS, &desc->flags);
+ if (status) {
+ gpiod_free(desc);
+ goto done;
}
+ status = gpiod_export(desc, true);
+ if (status < 0)
+ gpiod_free(desc);
+ else
+ set_bit(FLAG_SYSFS, &desc->flags);
+
done:
if (status)
pr_debug("%s: status %d\n", __func__, status);
extern int amdgpu_seamless;
extern int amdgpu_user_partt_mode;
+extern int amdgpu_agp;
#define AMDGPU_VM_MAX_NUM_CTX 4096
#define AMDGPU_SG_THRESHOLD (256*1024*1024)
struct amdgpu_device *adev = dst, *peer_adev;
int num_links;
- if (adev->asic_type != CHIP_ALDEBARAN)
+ if (amdgpu_ip_version(adev, GC_HWIP, 0) < IP_VERSION(9, 4, 2))
return 0;
if (src)
}
for (i = 0; i < p->nchunks; i++) {
- struct drm_amdgpu_cs_chunk __user **chunk_ptr = NULL;
+ struct drm_amdgpu_cs_chunk __user *chunk_ptr = NULL;
struct drm_amdgpu_cs_chunk user_chunk;
uint32_t __user *cdata;
if (size & 0x3 || *pos & 0x3)
return -EINVAL;
+ if (!adev->didt_rreg)
+ return -EOPNOTSUPP;
+
r = pm_runtime_get_sync(adev_to_drm(adev)->dev);
if (r < 0) {
pm_runtime_put_autosuspend(adev_to_drm(adev)->dev);
if (size & 0x3 || *pos & 0x3)
return -EINVAL;
+ if (!adev->didt_wreg)
+ return -EOPNOTSUPP;
+
r = pm_runtime_get_sync(adev_to_drm(adev)->dev);
if (r < 0) {
pm_runtime_put_autosuspend(adev_to_drm(adev)->dev);
adev->gfx.mcbp = true;
else if (amdgpu_mcbp == 0)
adev->gfx.mcbp = false;
- else if ((amdgpu_ip_version(adev, GC_HWIP, 0) >= IP_VERSION(9, 0, 0)) &&
- (amdgpu_ip_version(adev, GC_HWIP, 0) < IP_VERSION(10, 0, 0)) &&
- adev->gfx.num_gfx_rings)
- adev->gfx.mcbp = true;
if (amdgpu_sriov_vf(adev))
adev->gfx.mcbp = true;
amdgpu_ras_suspend(adev);
- amdgpu_ttm_set_buffer_funcs_status(adev, false);
-
amdgpu_device_ip_suspend_phase1(adev);
if (!adev->in_s0ix)
if (r)
return r;
+ amdgpu_ttm_set_buffer_funcs_status(adev, false);
+
amdgpu_fence_driver_hw_fini(adev);
amdgpu_device_ip_suspend_phase2(adev);
if (amdgpu_sriov_vf(adev))
amdgpu_virt_release_full_gpu(adev, false);
+ r = amdgpu_dpm_notify_rlc_state(adev, false);
+ if (r)
+ return r;
+
return 0;
}
adev->have_disp_power_ref = true;
return ret;
}
- /* if we have no active crtcs, then drop the power ref
- * we got before
+ /* if we have no active crtcs, then go to
+ * drop the power ref we got before
*/
- if (!active && adev->have_disp_power_ref) {
- pm_runtime_put_autosuspend(dev->dev);
+ if (!active && adev->have_disp_power_ref)
adev->have_disp_power_ref = false;
- }
-
out:
/* drop the power reference we got coming in here */
pm_runtime_put_autosuspend(dev->dev);
int amdgpu_umsch_mm;
int amdgpu_seamless = -1; /* auto */
uint amdgpu_debug_mask;
+int amdgpu_agp = -1; /* auto */
static void amdgpu_drv_delayed_reset_work_handler(struct work_struct *work);
MODULE_PARM_DESC(debug_mask, "debug options for amdgpu, disabled by default");
module_param_named(debug_mask, amdgpu_debug_mask, uint, 0444);
+/**
+ * DOC: agp (int)
+ * Enable the AGP aperture. This provides an aperture in the GPU's internal
+ * address space for direct access to system memory. Note that these accesses
+ * are non-snooped, so they are only used for access to uncached memory.
+ */
+MODULE_PARM_DESC(agp, "AGP (-1 = auto (default), 0 = disable, 1 = enable)");
+module_param_named(agp, amdgpu_agp, int, 0444);
+
/* These devices are not supported by amdgpu.
* They are supported by the mach64, r128, radeon drivers
*/
pm_runtime_mark_last_busy(ddev->dev);
pm_runtime_put_autosuspend(ddev->dev);
+ pci_wake_from_d3(pdev, TRUE);
+
/*
* For runpm implemented via BACO, PMFW will handle the
* timing for BACO in and out:
{
struct amdgpu_device *adev = amdgpu_ttm_adev(bo->bdev);
+ if (!bo->ttm)
+ return AMDGPU_BO_INVALID_OFFSET;
+
if (bo->ttm->num_pages != 1 || bo->ttm->caching == ttm_cached)
return AMDGPU_BO_INVALID_OFFSET;
#define MCA_REG__STATUS__ERRORCODEEXT(x) MCA_REG_FIELD(x, 21, 16)
#define MCA_REG__STATUS__ERRORCODE(x) MCA_REG_FIELD(x, 15, 0)
+#define MCA_REG__SYND__ERRORINFORMATION(x) MCA_REG_FIELD(x, 17, 0)
+
enum amdgpu_mca_ip {
AMDGPU_MCA_IP_UNKNOW = -1,
AMDGPU_MCA_IP_PSP = 0,
abo = ttm_to_amdgpu_bo(bo);
+ WARN_ON(abo->vm_bo);
+
if (abo->kfd_bo)
amdgpu_amdkfd_release_notify(abo);
u64 amdgpu_bo_gpu_offset_no_check(struct amdgpu_bo *bo)
{
struct amdgpu_device *adev = amdgpu_ttm_adev(bo->tbo.bdev);
- uint64_t offset;
+ uint64_t offset = AMDGPU_BO_INVALID_OFFSET;
+
+ if (bo->tbo.resource->mem_type == TTM_PL_TT)
+ offset = amdgpu_gmc_agp_addr(&bo->tbo);
- offset = (bo->tbo.resource->start << PAGE_SHIFT) +
- amdgpu_ttm_domain_start(adev, bo->tbo.resource->mem_type);
+ if (offset == AMDGPU_BO_INVALID_OFFSET)
+ offset = (bo->tbo.resource->start << PAGE_SHIFT) +
+ amdgpu_ttm_domain_start(adev, bo->tbo.resource->mem_type);
return amdgpu_gmc_sign_extend(offset);
}
topology->nodes[i].num_links = (requires_reflection && topology->nodes[i].num_links) ?
topology->nodes[i].num_links : node_num_links;
}
+ /* popluate the connected port num info if supported and available */
+ if (ta_port_num_support && topology->nodes[i].num_links) {
+ memcpy(topology->nodes[i].port_num, link_extend_info_output->nodes[i].port_num,
+ sizeof(struct xgmi_connected_port_num) * TA_XGMI__MAX_PORT_NUM);
+ }
/* reflect the topology information for bi-directionality */
if (requires_reflection && topology->nodes[i].num_hops)
uint8_t is_sharing_enabled;
enum ta_xgmi_assigned_sdma_engine sdma_engine;
uint8_t num_links;
+ struct xgmi_connected_port_num port_num[TA_XGMI__MAX_PORT_NUM];
};
struct psp_xgmi_topology_info {
#include <linux/reboot.h>
#include <linux/syscalls.h>
#include <linux/pm_runtime.h>
+#include <linux/list_sort.h>
#include "amdgpu.h"
#include "amdgpu_ras.h"
}
if (block_obj->hw_ops->query_ras_error_count)
- block_obj->hw_ops->query_ras_error_count(adev, &err_data);
+ block_obj->hw_ops->query_ras_error_count(adev, err_data);
if ((info->head.block == AMDGPU_RAS_BLOCK__SDMA) ||
(info->head.block == AMDGPU_RAS_BLOCK__GFX) ||
return err_node;
}
+static int ras_err_info_cmp(void *priv, const struct list_head *a, const struct list_head *b)
+{
+ struct ras_err_node *nodea = container_of(a, struct ras_err_node, node);
+ struct ras_err_node *nodeb = container_of(b, struct ras_err_node, node);
+ struct amdgpu_smuio_mcm_config_info *infoa = &nodea->err_info.mcm_info;
+ struct amdgpu_smuio_mcm_config_info *infob = &nodeb->err_info.mcm_info;
+
+ if (unlikely(infoa->socket_id != infob->socket_id))
+ return infoa->socket_id - infob->socket_id;
+ else
+ return infoa->die_id - infob->die_id;
+
+ return 0;
+}
+
static struct ras_err_info *amdgpu_ras_error_get_info(struct ras_err_data *err_data,
struct amdgpu_smuio_mcm_config_info *mcm_info)
{
err_data->err_list_count++;
list_add_tail(&err_node->node, &err_data->err_node_list);
+ list_sort(NULL, &err_data->err_node_list, ras_err_info_cmp);
return &err_node->err_info;
}
control->i2c_address = EEPROM_I2C_MADDR_0;
return true;
case IP_VERSION(13, 0, 0):
+ if (strnstr(atom_ctx->vbios_pn, "D707",
+ sizeof(atom_ctx->vbios_pn)))
+ control->i2c_address = EEPROM_I2C_MADDR_0;
+ else
+ control->i2c_address = EEPROM_I2C_MADDR_4;
+ return true;
case IP_VERSION(13, 0, 6):
case IP_VERSION(13, 0, 10):
control->i2c_address = EEPROM_I2C_MADDR_4;
return 0;
addr = amdgpu_gmc_agp_addr(bo);
- if (addr != AMDGPU_BO_INVALID_OFFSET) {
- bo->resource->start = addr >> PAGE_SHIFT;
+ if (addr != AMDGPU_BO_INVALID_OFFSET)
return 0;
- }
/* allocate GART space */
placement.num_placement = 1;
* amdgpu_uvd_entity_init - init entity
*
* @adev: amdgpu_device pointer
+ * @ring: amdgpu_ring pointer to check
*
* Initialize the entity used for handle management in the kernel driver.
*/
* amdgpu_vce_entity_init - init entity
*
* @adev: amdgpu_device pointer
+ * @ring: amdgpu_ring pointer to check
*
* Initialize the entity used for handle management in the kernel driver.
*/
list_for_each_entry_safe(vm_bo, tmp, &vm->idle, vm_status) {
struct amdgpu_bo *bo = vm_bo->bo;
+ vm_bo->moved = true;
if (!bo || bo->tbo.type != ttm_bo_type_kernel)
list_move(&vm_bo->vm_status, &vm_bo->vm->moved);
else if (bo->parent)
if (!entry->bo)
return;
+
+ entry->bo->vm_bo = NULL;
shadow = amdgpu_bo_shadowed(entry->bo);
if (shadow) {
ttm_bo_set_bulk_move(&shadow->tbo, NULL);
amdgpu_bo_unref(&shadow);
}
ttm_bo_set_bulk_move(&entry->bo->tbo, NULL);
- entry->bo->vm_bo = NULL;
spin_lock(&entry->vm->status_lock);
list_del(&entry->vm_status);
MODULE_FIRMWARE("amdgpu/gc_11_5_0_mec.bin");
MODULE_FIRMWARE("amdgpu/gc_11_5_0_rlc.bin");
+static const struct soc15_reg_golden golden_settings_gc_11_0[] = {
+ SOC15_REG_GOLDEN_VALUE(GC, 0, regTCP_CNTL, 0x20000000, 0x20000000)
+};
+
static const struct soc15_reg_golden golden_settings_gc_11_0_1[] =
{
SOC15_REG_GOLDEN_VALUE(GC, 0, regCGTT_GS_NGG_CLK_CTRL, 0x9fff8fff, 0x00000010),
default:
break;
}
+ soc15_program_register_sequence(adev,
+ golden_settings_gc_11_0,
+ (const u32)ARRAY_SIZE(golden_settings_gc_11_0));
+
}
static void gfx_v11_0_write_data_to_reg(struct amdgpu_ring *ring, int eng_sel,
adev->wb.wb[index] = cpu_to_le32(0xCAFEDEAD);
cpu_ptr = &adev->wb.wb[index];
- r = amdgpu_ib_get(adev, NULL, 16, AMDGPU_IB_POOL_DIRECT, &ib);
+ r = amdgpu_ib_get(adev, NULL, 20, AMDGPU_IB_POOL_DIRECT, &ib);
if (r) {
DRM_ERROR("amdgpu: failed to get ib (%ld).\n", r);
goto err1;
gpu_addr = adev->wb.gpu_addr + (index * 4);
adev->wb.wb[index] = cpu_to_le32(0xCAFEDEAD);
memset(&ib, 0, sizeof(ib));
- r = amdgpu_ib_get(adev, NULL, 16,
- AMDGPU_IB_POOL_DIRECT, &ib);
+
+ r = amdgpu_ib_get(adev, NULL, 20, AMDGPU_IB_POOL_DIRECT, &ib);
if (r)
goto err1;
gpu_addr = adev->wb.gpu_addr + (index * 4);
adev->wb.wb[index] = cpu_to_le32(0xCAFEDEAD);
memset(&ib, 0, sizeof(ib));
- r = amdgpu_ib_get(adev, NULL, 16,
- AMDGPU_IB_POOL_DIRECT, &ib);
+
+ r = amdgpu_ib_get(adev, NULL, 20, AMDGPU_IB_POOL_DIRECT, &ib);
if (r)
goto err1;
gpu_addr = adev->wb.gpu_addr + (index * 4);
adev->wb.wb[index] = cpu_to_le32(0xCAFEDEAD);
memset(&ib, 0, sizeof(ib));
- r = amdgpu_ib_get(adev, NULL, 16,
- AMDGPU_IB_POOL_DIRECT, &ib);
+
+ r = amdgpu_ib_get(adev, NULL, 20, AMDGPU_IB_POOL_DIRECT, &ib);
if (r)
goto err1;
amdgpu_gmc_set_agp_default(adev, mc);
amdgpu_gmc_vram_location(adev, &adev->gmc, base);
amdgpu_gmc_gart_location(adev, mc, AMDGPU_GART_PLACEMENT_BEST_FIT);
- if (!amdgpu_sriov_vf(adev))
+ if (!amdgpu_sriov_vf(adev) && (amdgpu_agp == 1))
amdgpu_gmc_agp_location(adev, mc);
/* base offset of vram pages */
amdgpu_gmc_set_agp_default(adev, mc);
amdgpu_gmc_vram_location(adev, &adev->gmc, base);
amdgpu_gmc_gart_location(adev, mc, AMDGPU_GART_PLACEMENT_HIGH);
- if (!amdgpu_sriov_vf(adev) ||
- (amdgpu_ip_version(adev, GC_HWIP, 0) < IP_VERSION(11, 5, 0)))
+ if (!amdgpu_sriov_vf(adev) &&
+ (amdgpu_ip_version(adev, GC_HWIP, 0) < IP_VERSION(11, 5, 0)) &&
+ (amdgpu_agp == 1))
amdgpu_gmc_agp_location(adev, mc);
/* base offset of vram pages */
} else {
amdgpu_gmc_vram_location(adev, mc, base);
amdgpu_gmc_gart_location(adev, mc, AMDGPU_GART_PLACEMENT_BEST_FIT);
- if (!amdgpu_sriov_vf(adev))
+ if (!amdgpu_sriov_vf(adev) && (amdgpu_agp == 1))
amdgpu_gmc_agp_location(adev, mc);
}
/* base offset of vram pages */
if (amdgpu_ip_version(adev, GC_HWIP, 0) == IP_VERSION(9, 4, 3))
amdgpu_gmc_sysfs_fini(adev);
- adev->gmc.num_mem_partitions = 0;
- kfree(adev->gmc.mem_partitions);
amdgpu_gmc_ras_fini(adev);
amdgpu_gem_force_release(adev);
amdgpu_bo_free_kernel(&adev->gmc.pdb0_bo, NULL, &adev->gmc.ptr_pdb0);
amdgpu_bo_fini(adev);
+ adev->gmc.num_mem_partitions = 0;
+ kfree(adev->gmc.mem_partitions);
+
return 0;
}
{
int data;
+ if (amdgpu_ip_version(adev, HDP_HWIP, 0) == IP_VERSION(4, 4, 2)) {
+ /* Default enabled */
+ *flags |= AMD_CG_SUPPORT_HDP_MGCG;
+ return;
+ }
/* AMD_CG_SUPPORT_HDP_LS */
data = RREG32(SOC15_REG_OFFSET(HDP, 0, mmHDP_MEM_POWER_LS));
if (data & HDP_MEM_POWER_LS__LS_ENABLE_MASK)
struct amdgpu_ring *ring = adev->jpeg.inst->ring_dec;
int r;
- adev->nbio.funcs->vcn_doorbell_range(adev, ring->use_doorbell,
- (adev->doorbell_index.vcn.vcn_ring0_1 << 1), 0);
-
- WREG32_SOC15(VCN, 0, regVCN_JPEG_DB_CTRL,
- ring->doorbell_index << VCN_JPEG_DB_CTRL__OFFSET__SHIFT |
- VCN_JPEG_DB_CTRL__EN_MASK);
-
r = amdgpu_ring_test_helper(ring);
if (r)
return r;
if (adev->pm.dpm_enabled)
amdgpu_dpm_enable_jpeg(adev, true);
+ /* doorbell programming is done for every playback */
+ adev->nbio.funcs->vcn_doorbell_range(adev, ring->use_doorbell,
+ (adev->doorbell_index.vcn.vcn_ring0_1 << 1), 0);
+
+ WREG32_SOC15(VCN, 0, regVCN_JPEG_DB_CTRL,
+ ring->doorbell_index << VCN_JPEG_DB_CTRL__OFFSET__SHIFT |
+ VCN_JPEG_DB_CTRL__EN_MASK);
+
/* disable power gating */
r = jpeg_v4_0_5_disable_static_power_gating(adev);
if (r)
uint64_t value;
int i;
+ if (amdgpu_sriov_vf(adev))
+ return;
+
inst_mask = adev->aid_mask;
for_each_inst(i, inst_mask) {
/* Program the AGP BAR */
WREG32_SOC15(MMHUB, i, regMC_VM_AGP_TOP,
adev->gmc.agp_end >> 24);
- if (amdgpu_sriov_vf(adev))
- return;
-
/* Program the system aperture low logical page number. */
WREG32_SOC15(MMHUB, i, regMC_VM_SYSTEM_APERTURE_LOW_ADDR,
min(adev->gmc.fb_start, adev->gmc.agp_start) >> 18);
static void nbio_v7_11_init_registers(struct amdgpu_device *adev)
{
-/* uint32_t def, data;
+ uint32_t def, data;
+
+ def = data = RREG32_SOC15(NBIO, 0, regBIF_BIF256_CI256_RC3X4_USB4_PCIE_MST_CTRL_3);
+ data = REG_SET_FIELD(data, BIF_BIF256_CI256_RC3X4_USB4_PCIE_MST_CTRL_3,
+ CI_SWUS_MAX_READ_REQUEST_SIZE_MODE, 1);
+ data = REG_SET_FIELD(data, BIF_BIF256_CI256_RC3X4_USB4_PCIE_MST_CTRL_3,
+ CI_SWUS_MAX_READ_REQUEST_SIZE_PRIV, 1);
- def = data = RREG32_SOC15(NBIO, 0, regBIF_BIF256_CI256_RC3X4_USB4_PCIE_MST_CTRL_3);
- data = REG_SET_FIELD(data, BIF_BIF256_CI256_RC3X4_USB4_PCIE_MST_CTRL_3,
- CI_SWUS_MAX_READ_REQUEST_SIZE_MODE, 1);
- data = REG_SET_FIELD(data, BIF_BIF256_CI256_RC3X4_USB4_PCIE_MST_CTRL_3,
- CI_SWUS_MAX_READ_REQUEST_SIZE_PRIV, 1);
+ if (def != data)
+ WREG32_SOC15(NBIO, 0, regBIF_BIF256_CI256_RC3X4_USB4_PCIE_MST_CTRL_3, data);
- if (def != data)
- WREG32_SOC15(NBIO, 0, regBIF_BIF256_CI256_RC3X4_USB4_PCIE_MST_CTRL_3, data);
-*/
}
static void nbio_v7_11_update_medium_grain_clock_gating(struct amdgpu_device *adev,
dev_info(adev->dev, "RAS controller interrupt triggered "
"by NBIF error\n");
-
- /* ras_controller_int is dedicated for nbif ras error,
- * not the global interrupt for sync flood
- */
- amdgpu_ras_reset_gpu(adev);
}
amdgpu_ras_error_data_fini(&err_data);
#define GFX_CMD_USB_PD_USE_LFB 0x480
/* Retry times for vmbx ready wait */
-#define PSP_VMBX_POLLING_LIMIT 20000
+#define PSP_VMBX_POLLING_LIMIT 3000
/* VBIOS gfl defines */
#define MBOX_READY_MASK 0x80000000
static int psp_v13_0_wait_for_bootloader(struct psp_context *psp)
{
struct amdgpu_device *adev = psp->adev;
- int retry_loop, ret;
+ int retry_loop, retry_cnt, ret;
+ retry_cnt =
+ (amdgpu_ip_version(adev, MP0_HWIP, 0) == IP_VERSION(13, 0, 6)) ?
+ PSP_VMBX_POLLING_LIMIT :
+ 10;
/* Wait for bootloader to signify that it is ready having bit 31 of
* C2PMSG_35 set to 1. All other bits are expected to be cleared.
* If there is an error in processing command, bits[7:0] will be set.
* This is applicable for PSP v13.0.6 and newer.
*/
- for (retry_loop = 0; retry_loop < PSP_VMBX_POLLING_LIMIT; retry_loop++) {
+ for (retry_loop = 0; retry_loop < retry_cnt; retry_loop++) {
ret = psp_wait_for(
psp, SOC15_REG_OFFSET(MP0, 0, regMP0_SMN_C2PMSG_35),
0x80000000, 0xffffffff, false);
if (amdgpu_ip_version(adev, MP0_HWIP, 0) != IP_VERSION(13, 0, 6))
return 0;
- if (RREG32_SOC15(MP0, 0, regMP0_SMN_C2PMSG_59) < 0x00a10007)
+ if (RREG32_SOC15(MP0, 0, regMP0_SMN_C2PMSG_59) < 0x00a10109)
return 0;
for_each_inst(i, inst_mask) {
struct amdgpu_device *adev = (struct amdgpu_device *)handle;
int r;
+ adev->sdma.num_instances = SDMA_MAX_INSTANCE;
+
r = sdma_v2_4_init_microcode(adev);
if (r)
return r;
- adev->sdma.num_instances = SDMA_MAX_INSTANCE;
-
sdma_v2_4_set_ring_funcs(adev);
sdma_v2_4_set_buffer_funcs(adev);
sdma_v2_4_set_vm_pte_funcs(adev);
*flags |= AMD_CG_SUPPORT_SDMA_LS;
}
+static void sdma_v5_2_ring_begin_use(struct amdgpu_ring *ring)
+{
+ struct amdgpu_device *adev = ring->adev;
+
+ /* SDMA 5.2.3 (RMB) FW doesn't seem to properly
+ * disallow GFXOFF in some cases leading to
+ * hangs in SDMA. Disallow GFXOFF while SDMA is active.
+ * We can probably just limit this to 5.2.3,
+ * but it shouldn't hurt for other parts since
+ * this GFXOFF will be disallowed anyway when SDMA is
+ * active, this just makes it explicit.
+ */
+ amdgpu_gfx_off_ctrl(adev, false);
+}
+
+static void sdma_v5_2_ring_end_use(struct amdgpu_ring *ring)
+{
+ struct amdgpu_device *adev = ring->adev;
+
+ /* SDMA 5.2.3 (RMB) FW doesn't seem to properly
+ * disallow GFXOFF in some cases leading to
+ * hangs in SDMA. Allow GFXOFF when SDMA is complete.
+ */
+ amdgpu_gfx_off_ctrl(adev, true);
+}
+
const struct amd_ip_funcs sdma_v5_2_ip_funcs = {
.name = "sdma_v5_2",
.early_init = sdma_v5_2_early_init,
.test_ib = sdma_v5_2_ring_test_ib,
.insert_nop = sdma_v5_2_ring_insert_nop,
.pad_ib = sdma_v5_2_ring_pad_ib,
+ .begin_use = sdma_v5_2_ring_begin_use,
+ .end_use = sdma_v5_2_ring_end_use,
.emit_wreg = sdma_v5_2_ring_emit_wreg,
.emit_reg_wait = sdma_v5_2_ring_emit_reg_wait,
.emit_reg_write_reg_wait = sdma_v5_2_ring_emit_reg_write_reg_wait,
AMD_PG_SUPPORT_VCN_DPG |
AMD_PG_SUPPORT_JPEG;
adev->external_rev_id = adev->rev_id + 0x46;
+ /* GC 9.4.3 uses MMIO register region hole at a different offset */
+ if (!amdgpu_sriov_vf(adev)) {
+ adev->rmmio_remap.reg_offset = 0x1A000;
+ adev->rmmio_remap.bus_addr = adev->rmmio_base + 0x1A000;
+ }
break;
default:
/* FIXME: not supported yet */
if (amdgpu_sriov_vf(adev))
*flags = 0;
- adev->nbio.funcs->get_clockgating_state(adev, flags);
+ if (adev->nbio.funcs && adev->nbio.funcs->get_clockgating_state)
+ adev->nbio.funcs->get_clockgating_state(adev, flags);
- adev->hdp.funcs->get_clock_gating_state(adev, flags);
+ if (adev->hdp.funcs && adev->hdp.funcs->get_clock_gating_state)
+ adev->hdp.funcs->get_clock_gating_state(adev, flags);
- if (amdgpu_ip_version(adev, MP0_HWIP, 0) != IP_VERSION(13, 0, 2)) {
+ if ((amdgpu_ip_version(adev, MP0_HWIP, 0) != IP_VERSION(13, 0, 2)) &&
+ (amdgpu_ip_version(adev, MP0_HWIP, 0) != IP_VERSION(13, 0, 6))) {
/* AMD_CG_SUPPORT_DRM_MGCG */
data = RREG32(SOC15_REG_OFFSET(MP0, 0, mmMP0_MISC_CGTT_CTRL0));
if (!(data & 0x01000000))
}
/* AMD_CG_SUPPORT_ROM_MGCG */
- adev->smuio.funcs->get_clock_gating_state(adev, flags);
+ if (adev->smuio.funcs && adev->smuio.funcs->get_clock_gating_state)
+ adev->smuio.funcs->get_clock_gating_state(adev, flags);
- adev->df.funcs->get_clockgating_state(adev, flags);
+ if (adev->df.funcs && adev->df.funcs->get_clockgating_state)
+ adev->df.funcs->get_clockgating_state(adev, flags);
}
static int soc15_common_set_powergating_state(void *handle,
struct kfd_dev *dev = adev->kfd.dev;
uint32_t i;
- if (adev->ip_versions[GC_HWIP][0] != IP_VERSION(9, 4, 3))
+ if (KFD_GC_VERSION(dev) != IP_VERSION(9, 4, 3))
return dev->nodes[0];
for (i = 0; i < dev->num_nodes; i++)
return 0;
}
+static void pqm_clean_queue_resource(struct process_queue_manager *pqm,
+ struct process_queue_node *pqn)
+{
+ struct kfd_node *dev;
+ struct kfd_process_device *pdd;
+
+ dev = pqn->q->device;
+
+ pdd = kfd_get_process_device_data(dev, pqm->process);
+ if (!pdd) {
+ pr_err("Process device data doesn't exist\n");
+ return;
+ }
+
+ if (pqn->q->gws) {
+ if (KFD_GC_VERSION(pqn->q->device) != IP_VERSION(9, 4, 3) &&
+ !dev->kfd->shared_resources.enable_mes)
+ amdgpu_amdkfd_remove_gws_from_process(
+ pqm->process->kgd_process_info, pqn->q->gws);
+ pdd->qpd.num_gws = 0;
+ }
+
+ if (dev->kfd->shared_resources.enable_mes) {
+ amdgpu_amdkfd_free_gtt_mem(dev->adev, pqn->q->gang_ctx_bo);
+ if (pqn->q->wptr_bo)
+ amdgpu_amdkfd_free_gtt_mem(dev->adev, pqn->q->wptr_bo);
+ }
+}
+
void pqm_uninit(struct process_queue_manager *pqm)
{
struct process_queue_node *pqn, *next;
list_for_each_entry_safe(pqn, next, &pqm->queues, process_queue_list) {
- if (pqn->q && pqn->q->gws &&
- KFD_GC_VERSION(pqn->q->device) != IP_VERSION(9, 4, 3) &&
- !pqn->q->device->kfd->shared_resources.enable_mes)
- amdgpu_amdkfd_remove_gws_from_process(pqm->process->kgd_process_info,
- pqn->q->gws);
+ if (pqn->q)
+ pqm_clean_queue_resource(pqm, pqn);
+
kfd_procfs_del_queue(pqn->q);
uninit_queue(pqn->q);
list_del(&pqn->process_queue_list);
goto err_destroy_queue;
}
- if (pqn->q->gws) {
- if (KFD_GC_VERSION(pqn->q->device) != IP_VERSION(9, 4, 3) &&
- !dev->kfd->shared_resources.enable_mes)
- amdgpu_amdkfd_remove_gws_from_process(
- pqm->process->kgd_process_info,
- pqn->q->gws);
- pdd->qpd.num_gws = 0;
- }
-
- if (dev->kfd->shared_resources.enable_mes) {
- amdgpu_amdkfd_free_gtt_mem(dev->adev,
- pqn->q->gang_ctx_bo);
- if (pqn->q->wptr_bo)
- amdgpu_amdkfd_free_gtt_mem(dev->adev, pqn->q->wptr_bo);
-
- }
+ pqm_clean_queue_resource(pqm, pqn);
uninit_queue(pqn->q);
}
if (test_bit(gpuidx, prange->bitmap_access))
bitmap_set(ctx->bitmap, gpuidx, 1);
}
+
+ /*
+ * If prange is already mapped or with always mapped flag,
+ * update mapping on GPUs with ACCESS attribute
+ */
+ if (bitmap_empty(ctx->bitmap, MAX_GPU_INSTANCE)) {
+ if (prange->mapped_to_gpu ||
+ prange->flags & KFD_IOCTL_SVM_FLAG_GPU_ALWAYS_MAPPED)
+ bitmap_copy(ctx->bitmap, prange->bitmap_access, MAX_GPU_INSTANCE);
+ }
} else {
bitmap_or(ctx->bitmap, prange->bitmap_access,
prange->bitmap_aip, MAX_GPU_INSTANCE);
}
if (bitmap_empty(ctx->bitmap, MAX_GPU_INSTANCE)) {
- bitmap_copy(ctx->bitmap, prange->bitmap_access, MAX_GPU_INSTANCE);
- if (!prange->mapped_to_gpu ||
- bitmap_empty(ctx->bitmap, MAX_GPU_INSTANCE)) {
- r = 0;
- goto free_ctx;
- }
+ r = 0;
+ goto free_ctx;
}
if (prange->actual_loc && !prange->ttm_res) {
struct dmub_srv_create_params create_params;
struct dmub_srv_region_params region_params;
struct dmub_srv_region_info region_info;
- struct dmub_srv_fb_params fb_params;
+ struct dmub_srv_memory_params memory_params;
struct dmub_srv_fb_info *fb_info;
struct dmub_srv *dmub_srv;
const struct dmcub_firmware_header_v1_0 *hdr;
adev->dm.dmub_fw->data +
le32_to_cpu(hdr->header.ucode_array_offset_bytes) +
PSP_HEADER_BYTES;
+ region_params.is_mailbox_in_inbox = false;
status = dmub_srv_calc_region_info(dmub_srv, ®ion_params,
®ion_info);
return r;
/* Rebase the regions on the framebuffer address. */
- memset(&fb_params, 0, sizeof(fb_params));
- fb_params.cpu_addr = adev->dm.dmub_bo_cpu_addr;
- fb_params.gpu_addr = adev->dm.dmub_bo_gpu_addr;
- fb_params.region_info = ®ion_info;
+ memset(&memory_params, 0, sizeof(memory_params));
+ memory_params.cpu_fb_addr = adev->dm.dmub_bo_cpu_addr;
+ memory_params.gpu_fb_addr = adev->dm.dmub_bo_gpu_addr;
+ memory_params.region_info = ®ion_info;
adev->dm.dmub_fb_info =
kzalloc(sizeof(*adev->dm.dmub_fb_info), GFP_KERNEL);
return -ENOMEM;
}
- status = dmub_srv_calc_fb_info(dmub_srv, &fb_params, fb_info);
+ status = dmub_srv_calc_mem_info(dmub_srv, &memory_params, fb_info);
if (status != DMUB_STATUS_OK) {
DRM_ERROR("Error calculating DMUB FB info: %d\n", status);
return -EINVAL;
if (plane->type == DRM_PLANE_TYPE_CURSOR)
return;
+ if (new_plane_state->rotation != DRM_MODE_ROTATE_0)
+ goto ffu;
+
num_clips = drm_plane_get_damage_clips_count(new_plane_state);
clips = drm_plane_get_damage_clips(new_plane_state);
dm_new_state->underscan_enable = val;
ret = 0;
} else if (property == adev->mode_info.abm_level_property) {
- dm_new_state->abm_level = val;
+ dm_new_state->abm_level = val ?: ABM_LEVEL_IMMEDIATE_DISABLE;
ret = 0;
}
*val = dm_state->underscan_enable;
ret = 0;
} else if (property == adev->mode_info.abm_level_property) {
- *val = dm_state->abm_level;
+ *val = (dm_state->abm_level != ABM_LEVEL_IMMEDIATE_DISABLE) ?
+ dm_state->abm_level : 0;
ret = 0;
}
state->pbn = 0;
if (connector->connector_type == DRM_MODE_CONNECTOR_eDP)
- state->abm_level = amdgpu_dm_abm_level;
+ state->abm_level = amdgpu_dm_abm_level ?:
+ ABM_LEVEL_IMMEDIATE_DISABLE;
__drm_atomic_helper_connector_reset(connector, &state->base);
}
int i;
int result = -EIO;
+ if (!ddc_service->ddc_pin || !ddc_service->ddc_pin->hw_info.hw_supported)
+ return result;
+
cmd.payloads = kcalloc(num, sizeof(struct i2c_payload), GFP_KERNEL);
if (!cmd.payloads)
struct drm_plane *other;
struct drm_plane_state *old_other_state, *new_other_state;
struct drm_crtc_state *new_crtc_state;
+ struct amdgpu_device *adev = drm_to_adev(plane->dev);
int i;
/*
- * TODO: Remove this hack once the checks below are sufficient
- * enough to determine when we need to reset all the planes on
- * the stream.
+ * TODO: Remove this hack for all asics once it proves that the
+ * fast updates works fine on DCN3.2+.
*/
- if (state->allow_modeset)
+ if (adev->ip_versions[DCE_HWIP][0] < IP_VERSION(3, 2, 0) && state->allow_modeset)
return true;
/* Exit early if we know that we're adding or removing the plane. */
DRM_DEBUG_DRIVER("Disabling FAMS on monitor with panel id %X\n", panel_id);
edid_caps->panel_patch.disable_fams = true;
break;
+ /* Workaround for some monitors that do not clear DPCD 0x317 if FreeSync is unsupported */
+ case drm_edid_encode_panel_id('A', 'U', 'O', 0xA7AB):
+ case drm_edid_encode_panel_id('A', 'U', 'O', 0xE69B):
+ DRM_DEBUG_DRIVER("Clearing DPCD 0x317 on monitor with panel id %X\n", panel_id);
+ edid_caps->panel_patch.remove_sink_ext_caps = true;
+ break;
default:
return;
}
struct amdgpu_dm_connector *aconnector = link->priv;
- if (!aconnector) {
- drm_dbg_dp(aconnector->base.dev,
- "Failed to find connector for link!\n");
+ if (!aconnector)
return false;
- }
return drm_dp_dpcd_read(&aconnector->dm_dp_aux.aux, address, data,
size) == size;
unsigned int upper_link_bw_in_kbps = 0, down_link_bw_in_kbps = 0;
unsigned int max_compressed_bw_in_kbps = 0;
struct dc_dsc_bw_range bw_range = {0};
- struct drm_dp_mst_topology_mgr *mst_mgr;
+ uint16_t full_pbn = aconnector->mst_output_port->full_pbn;
/*
- * check if the mode could be supported if DSC pass-through is supported
- * AND check if there enough bandwidth available to support the mode
- * with DSC enabled.
+ * Consider the case with the depth of the mst topology tree is equal or less than 2
+ * A. When dsc bitstream can be transmitted along the entire path
+ * 1. dsc is possible between source and branch/leaf device (common dsc params is possible), AND
+ * 2. dsc passthrough supported at MST branch, or
+ * 3. dsc decoding supported at leaf MST device
+ * Use maximum dsc compression as bw constraint
+ * B. When dsc bitstream cannot be transmitted along the entire path
+ * Use native bw as bw constraint
*/
if (is_dsc_common_config_possible(stream, &bw_range) &&
- aconnector->mst_output_port->passthrough_aux) {
- mst_mgr = aconnector->mst_output_port->mgr;
- mutex_lock(&mst_mgr->lock);
-
+ (aconnector->mst_output_port->passthrough_aux ||
+ aconnector->dsc_aux == &aconnector->mst_output_port->aux)) {
cur_link_settings = stream->link->verified_link_cap;
upper_link_bw_in_kbps = dc_link_bandwidth_kbps(aconnector->dc_link,
- &cur_link_settings
- );
- down_link_bw_in_kbps = kbps_from_pbn(aconnector->mst_output_port->full_pbn);
+ &cur_link_settings);
+ down_link_bw_in_kbps = kbps_from_pbn(full_pbn);
/* pick the bottleneck */
end_to_end_bw_in_kbps = min(upper_link_bw_in_kbps,
down_link_bw_in_kbps);
- mutex_unlock(&mst_mgr->lock);
-
/*
* use the maximum dsc compression bandwidth as the required
* bandwidth for the mode
/* check if mode could be supported within full_pbn */
bpp = convert_dc_color_depth_into_bpc(stream->timing.display_color_depth) * 3;
pbn = drm_dp_calc_pbn_mode(stream->timing.pix_clk_100hz / 10, bpp, false);
-
- if (pbn > aconnector->mst_output_port->full_pbn)
+ if (pbn > full_pbn)
return DC_FAIL_BANDWIDTH_VALIDATE;
}
DC_LOG_BIOS("AS_SIGNAL_TYPE_HDMI ss_percentage: %d\n", ss_info->spread_spectrum_percentage);
break;
case AS_SIGNAL_TYPE_DISPLAY_PORT:
- ss_info->spread_spectrum_percentage =
+ if (bp->base.integrated_info) {
+ DC_LOG_BIOS("gpuclk_ss_percentage (unit of 0.001 percent): %d\n", bp->base.integrated_info->gpuclk_ss_percentage);
+ ss_info->spread_spectrum_percentage =
+ bp->base.integrated_info->gpuclk_ss_percentage;
+ ss_info->type.CENTER_MODE =
+ bp->base.integrated_info->gpuclk_ss_type;
+ } else {
+ ss_info->spread_spectrum_percentage =
disp_cntl_tbl->dp_ss_percentage;
- ss_info->spread_spectrum_range =
+ ss_info->spread_spectrum_range =
disp_cntl_tbl->dp_ss_rate_10hz * 10;
- if (disp_cntl_tbl->dp_ss_mode & ATOM_SS_CENTRE_SPREAD_MODE)
- ss_info->type.CENTER_MODE = true;
-
+ if (disp_cntl_tbl->dp_ss_mode & ATOM_SS_CENTRE_SPREAD_MODE)
+ ss_info->type.CENTER_MODE = true;
+ }
DC_LOG_BIOS("AS_SIGNAL_TYPE_DISPLAY_PORT ss_percentage: %d\n", ss_info->spread_spectrum_percentage);
break;
case AS_SIGNAL_TYPE_GPU_PLL:
info->ma_channel_number = info_v2_2->umachannelnumber;
info->dp_ss_control =
le16_to_cpu(info_v2_2->reserved1);
+ info->gpuclk_ss_percentage = info_v2_2->gpuclk_ss_percentage;
+ info->gpuclk_ss_type = info_v2_2->gpuclk_ss_type;
for (i = 0; i < NUMBER_OF_UCHAR_FOR_GUID; ++i) {
info->ext_disp_conn_info.gu_id[i] =
{
.wm_inst = WM_A,
.wm_type = WM_TYPE_PSTATE_CHG,
- .pstate_latency_us = 11.65333,
+ .pstate_latency_us = 129.0,
.sr_exit_time_us = 11.5,
.sr_enter_plus_exit_time_us = 14.5,
.valid = true,
{
.wm_inst = WM_B,
.wm_type = WM_TYPE_PSTATE_CHG,
- .pstate_latency_us = 11.65333,
+ .pstate_latency_us = 129.0,
.sr_exit_time_us = 11.5,
.sr_enter_plus_exit_time_us = 14.5,
.valid = true,
{
.wm_inst = WM_C,
.wm_type = WM_TYPE_PSTATE_CHG,
- .pstate_latency_us = 11.65333,
+ .pstate_latency_us = 129.0,
.sr_exit_time_us = 11.5,
.sr_enter_plus_exit_time_us = 14.5,
.valid = true,
{
.wm_inst = WM_D,
.wm_type = WM_TYPE_PSTATE_CHG,
- .pstate_latency_us = 11.65333,
+ .pstate_latency_us = 129.0,
.sr_exit_time_us = 11.5,
.sr_enter_plus_exit_time_us = 14.5,
.valid = true,
if (dc->work_arounds.skip_clock_update)
return;
+ /* DTBCLK is fixed, so set a default if unspecified. */
+ if (new_clocks->dtbclk_en && !new_clocks->ref_dtbclk_khz)
+ new_clocks->ref_dtbclk_khz = 600000;
+
/*
* if it is safe to lower, but we are already in the lower state, we don't have to do anything
* also if safe to lower is false, we just go in the higher state
if (!clk_mgr_base->clks.dtbclk_en && new_clocks->dtbclk_en) {
dcn35_smu_set_dtbclk(clk_mgr, true);
- dcn35_update_clocks_update_dtb_dto(clk_mgr, context, clk_mgr_base->clks.ref_dtbclk_khz);
clk_mgr_base->clks.dtbclk_en = new_clocks->dtbclk_en;
+
+ dcn35_update_clocks_update_dtb_dto(clk_mgr, context, new_clocks->ref_dtbclk_khz);
+ clk_mgr_base->clks.ref_dtbclk_khz = new_clocks->ref_dtbclk_khz;
}
/* check that we're not already in D0 */
update_dispclk = true;
}
- if (!new_clocks->dtbclk_en) {
- new_clocks->ref_dtbclk_khz = 600000;
- }
-
/* clock limits are received with MHz precision, divide by 1000 to prevent setting clocks at every call */
if (!dc->debug.disable_dtb_ref_clk_switch &&
- should_set_clock(safe_to_lower, new_clocks->ref_dtbclk_khz / 1000, clk_mgr_base->clks.ref_dtbclk_khz / 1000)) {
- /* DCCG requires KHz precision for DTBCLK */
- dcn35_smu_set_dtbclk(clk_mgr, true);
-
- dcn35_update_clocks_update_dtb_dto(clk_mgr, context, clk_mgr_base->clks.ref_dtbclk_khz);
+ should_set_clock(safe_to_lower, new_clocks->ref_dtbclk_khz / 1000,
+ clk_mgr_base->clks.ref_dtbclk_khz / 1000)) {
+ dcn35_update_clocks_update_dtb_dto(clk_mgr, context, new_clocks->ref_dtbclk_khz);
+ clk_mgr_base->clks.ref_dtbclk_khz = new_clocks->ref_dtbclk_khz;
}
if (dpp_clock_lowered) {
.wm_inst = WM_A,
.wm_type = WM_TYPE_PSTATE_CHG,
.pstate_latency_us = 11.72,
- .sr_exit_time_us = 9,
- .sr_enter_plus_exit_time_us = 11,
+ .sr_exit_time_us = 14.0,
+ .sr_enter_plus_exit_time_us = 16.0,
.valid = true,
},
{
.wm_inst = WM_B,
.wm_type = WM_TYPE_PSTATE_CHG,
.pstate_latency_us = 11.72,
- .sr_exit_time_us = 9,
- .sr_enter_plus_exit_time_us = 11,
+ .sr_exit_time_us = 14.0,
+ .sr_enter_plus_exit_time_us = 16.0,
.valid = true,
},
{
.wm_inst = WM_C,
.wm_type = WM_TYPE_PSTATE_CHG,
.pstate_latency_us = 11.72,
- .sr_exit_time_us = 9,
- .sr_enter_plus_exit_time_us = 11,
+ .sr_exit_time_us = 14.0,
+ .sr_enter_plus_exit_time_us = 16.0,
.valid = true,
},
{
.wm_inst = WM_D,
.wm_type = WM_TYPE_PSTATE_CHG,
.pstate_latency_us = 11.72,
- .sr_exit_time_us = 9,
- .sr_enter_plus_exit_time_us = 11,
+ .sr_exit_time_us = 14.0,
+ .sr_enter_plus_exit_time_us = 16.0,
.valid = true,
},
}
.wm_inst = WM_A,
.wm_type = WM_TYPE_PSTATE_CHG,
.pstate_latency_us = 11.65333,
- .sr_exit_time_us = 11.5,
- .sr_enter_plus_exit_time_us = 14.5,
+ .sr_exit_time_us = 14.0,
+ .sr_enter_plus_exit_time_us = 16.0,
.valid = true,
},
{
.wm_inst = WM_B,
.wm_type = WM_TYPE_PSTATE_CHG,
.pstate_latency_us = 11.65333,
- .sr_exit_time_us = 11.5,
- .sr_enter_plus_exit_time_us = 14.5,
+ .sr_exit_time_us = 14.0,
+ .sr_enter_plus_exit_time_us = 16.0,
.valid = true,
},
{
.wm_inst = WM_C,
.wm_type = WM_TYPE_PSTATE_CHG,
.pstate_latency_us = 11.65333,
- .sr_exit_time_us = 11.5,
- .sr_enter_plus_exit_time_us = 14.5,
+ .sr_exit_time_us = 14.0,
+ .sr_enter_plus_exit_time_us = 16.0,
.valid = true,
},
{
.wm_inst = WM_D,
.wm_type = WM_TYPE_PSTATE_CHG,
.pstate_latency_us = 11.65333,
- .sr_exit_time_us = 11.5,
- .sr_enter_plus_exit_time_us = 14.5,
+ .sr_exit_time_us = 14.0,
+ .sr_enter_plus_exit_time_us = 16.0,
.valid = true,
},
}
static struct dcn35_watermarks dummy_wms = { 0 };
-static struct dcn35_ss_info_table ss_info_table = {
- .ss_divider = 1000,
- .ss_percentage = {0, 0, 375, 375, 375}
-};
-
static void dcn35_build_watermark_ranges(struct clk_bw_params *bw_params, struct dcn35_watermarks *table)
{
int i, num_valid_sets;
return 1;
}
+static inline uint32_t calc_dram_speed_mts(const MemPstateTable_t *entry)
+{
+ return entry->UClk * convert_wck_ratio(entry->WckRatio) * 2;
+}
+
static void dcn35_clk_mgr_helper_populate_bw_params(struct clk_mgr_internal *clk_mgr,
struct integrated_info *bios_info,
DpmClocks_t_dcn35 *clock_table)
{
struct clk_bw_params *bw_params = clk_mgr->base.bw_params;
struct clk_limit_table_entry def_max = bw_params->clk_table.entries[bw_params->clk_table.num_entries - 1];
- uint32_t max_pstate = 0, max_uclk = 0, max_fclk = 0;
- uint32_t min_pstate = 0, max_dispclk = 0, max_dppclk = 0;
+ uint32_t max_fclk = 0, min_pstate = 0, max_dispclk = 0, max_dppclk = 0;
+ uint32_t max_pstate = 0, max_dram_speed_mts = 0, min_dram_speed_mts = 0;
int i;
+ /* Determine min/max p-state values. */
for (i = 0; i < clock_table->NumMemPstatesEnabled; i++) {
- if (is_valid_clock_value(clock_table->MemPstateTable[i].UClk) &&
- clock_table->MemPstateTable[i].UClk > max_uclk) {
- max_uclk = clock_table->MemPstateTable[i].UClk;
+ uint32_t dram_speed_mts = calc_dram_speed_mts(&clock_table->MemPstateTable[i]);
+
+ if (is_valid_clock_value(dram_speed_mts) && dram_speed_mts > max_dram_speed_mts) {
+ max_dram_speed_mts = dram_speed_mts;
max_pstate = i;
}
}
- /* We expect the table to contain at least one valid Uclk entry. */
- ASSERT(is_valid_clock_value(max_uclk));
+ min_dram_speed_mts = max_dram_speed_mts;
+ min_pstate = max_pstate;
+
+ for (i = 0; i < clock_table->NumMemPstatesEnabled; i++) {
+ uint32_t dram_speed_mts = calc_dram_speed_mts(&clock_table->MemPstateTable[i]);
+
+ if (is_valid_clock_value(dram_speed_mts) && dram_speed_mts < min_dram_speed_mts) {
+ min_dram_speed_mts = dram_speed_mts;
+ min_pstate = i;
+ }
+ }
+ /* We expect the table to contain at least one valid P-state entry. */
+ ASSERT(clock_table->NumMemPstatesEnabled &&
+ is_valid_clock_value(max_dram_speed_mts) &&
+ is_valid_clock_value(min_dram_speed_mts));
/* dispclk and dppclk can be max at any voltage, same number of levels for both */
if (clock_table->NumDispClkLevelsEnabled <= NUM_DISPCLK_DPM_LEVELS &&
max_dppclk = find_max_clk_value(clock_table->DppClocks,
clock_table->NumDispClkLevelsEnabled);
} else {
+ /* Invalid number of entries in the table from PMFW. */
ASSERT(0);
}
- if (clock_table->NumFclkLevelsEnabled <= NUM_FCLK_DPM_LEVELS)
- max_fclk = find_max_clk_value(clock_table->FclkClocks_Freq,
- clock_table->NumFclkLevelsEnabled);
- for (i = 0; i < clock_table->NumMemPstatesEnabled; i++) {
- uint32_t min_uclk = clock_table->MemPstateTable[0].UClk;
- int j;
+ /* Base the clock table on dcfclk, need at least one entry regardless of pmfw table */
+ ASSERT(clock_table->NumDcfClkLevelsEnabled > 0);
- for (j = 1; j < clock_table->NumMemPstatesEnabled; j++) {
- if (is_valid_clock_value(clock_table->MemPstateTable[j].UClk) &&
- clock_table->MemPstateTable[j].UClk < min_uclk &&
- clock_table->MemPstateTable[j].Voltage <= clock_table->SocVoltage[i]) {
- min_uclk = clock_table->MemPstateTable[j].UClk;
- min_pstate = j;
- }
- }
+ max_fclk = find_max_clk_value(clock_table->FclkClocks_Freq, clock_table->NumFclkLevelsEnabled);
+
+ for (i = 0; i < clock_table->NumDcfClkLevelsEnabled; i++) {
+ int j;
+ /* First search defaults for the clocks we don't read using closest lower or equal default dcfclk */
for (j = bw_params->clk_table.num_entries - 1; j > 0; j--)
if (bw_params->clk_table.entries[j].dcfclk_mhz <= clock_table->DcfClocks[i])
- break;
+ break;
bw_params->clk_table.entries[i].phyclk_mhz = bw_params->clk_table.entries[j].phyclk_mhz;
bw_params->clk_table.entries[i].phyclk_d18_mhz = bw_params->clk_table.entries[j].phyclk_d18_mhz;
bw_params->clk_table.entries[i].dtbclk_mhz = bw_params->clk_table.entries[j].dtbclk_mhz;
- bw_params->clk_table.entries[i].fclk_mhz = max_fclk;
+
+ /* Now update clocks we do read */
bw_params->clk_table.entries[i].memclk_mhz = clock_table->MemPstateTable[min_pstate].MemClk;
bw_params->clk_table.entries[i].voltage = clock_table->MemPstateTable[min_pstate].Voltage;
bw_params->clk_table.entries[i].dcfclk_mhz = clock_table->DcfClocks[i];
bw_params->clk_table.entries[i].socclk_mhz = clock_table->SocClocks[i];
bw_params->clk_table.entries[i].dispclk_mhz = max_dispclk;
bw_params->clk_table.entries[i].dppclk_mhz = max_dppclk;
- bw_params->clk_table.entries[i].wck_ratio = convert_wck_ratio(
- clock_table->MemPstateTable[min_pstate].WckRatio);
- }
+ bw_params->clk_table.entries[i].wck_ratio =
+ convert_wck_ratio(clock_table->MemPstateTable[min_pstate].WckRatio);
+
+ /* Dcfclk and Fclk are tied, but at a different ratio */
+ bw_params->clk_table.entries[i].fclk_mhz = min(max_fclk, 2 * clock_table->DcfClocks[i]);
+ }
/* Make sure to include at least one entry at highest pstate */
if (max_pstate != min_pstate || i == 0) {
if (i > MAX_NUM_DPM_LVL - 1)
i = MAX_NUM_DPM_LVL - 1;
+
bw_params->clk_table.entries[i].fclk_mhz = max_fclk;
bw_params->clk_table.entries[i].memclk_mhz = clock_table->MemPstateTable[max_pstate].MemClk;
bw_params->clk_table.entries[i].voltage = clock_table->MemPstateTable[max_pstate].Voltage;
}
bw_params->clk_table.num_entries = i--;
+ /* Make sure all highest clocks are included*/
bw_params->clk_table.entries[i].socclk_mhz =
find_max_clk_value(clock_table->SocClocks, NUM_SOCCLK_DPM_LEVELS);
bw_params->clk_table.entries[i].dispclk_mhz =
bw_params->clk_table.num_entries_per_clk.num_fclk_levels = clock_table->NumFclkLevelsEnabled;
bw_params->clk_table.num_entries_per_clk.num_memclk_levels = clock_table->NumMemPstatesEnabled;
bw_params->clk_table.num_entries_per_clk.num_socclk_levels = clock_table->NumSocClkLevelsEnabled;
+
+ /*
+ * Set any 0 clocks to max default setting. Not an issue for
+ * power since we aren't doing switching in such case anyway
+ */
for (i = 0; i < bw_params->clk_table.num_entries; i++) {
if (!bw_params->clk_table.entries[i].fclk_mhz) {
bw_params->clk_table.entries[i].fclk_mhz = def_max.fclk_mhz;
if (dc->config.disable_ips == DMUB_IPS_ENABLE ||
dc->config.disable_ips == DMUB_IPS_DISABLE_DYNAMIC) {
- val |= DMUB_IPS1_ALLOW_MASK;
- val |= DMUB_IPS2_ALLOW_MASK;
- } else if (dc->config.disable_ips == DMUB_IPS_DISABLE_IPS1) {
val = val & ~DMUB_IPS1_ALLOW_MASK;
val = val & ~DMUB_IPS2_ALLOW_MASK;
- } else if (dc->config.disable_ips == DMUB_IPS_DISABLE_IPS2) {
- val |= DMUB_IPS1_ALLOW_MASK;
- val = val & ~DMUB_IPS2_ALLOW_MASK;
- } else if (dc->config.disable_ips == DMUB_IPS_DISABLE_IPS2_Z10) {
+ } else if (dc->config.disable_ips == DMUB_IPS_DISABLE_IPS1) {
val |= DMUB_IPS1_ALLOW_MASK;
val |= DMUB_IPS2_ALLOW_MASK;
+ } else if (dc->config.disable_ips == DMUB_IPS_DISABLE_IPS2) {
+ val = val & ~DMUB_IPS1_ALLOW_MASK;
+ val |= DMUB_IPS2_ALLOW_MASK;
+ } else if (dc->config.disable_ips == DMUB_IPS_DISABLE_IPS2_Z10) {
+ val = val & ~DMUB_IPS1_ALLOW_MASK;
+ val = val & ~DMUB_IPS2_ALLOW_MASK;
}
if (!allow_idle) {
- val = val & ~DMUB_IPS1_ALLOW_MASK;
- val = val & ~DMUB_IPS2_ALLOW_MASK;
+ val |= DMUB_IPS1_ALLOW_MASK;
+ val |= DMUB_IPS2_ALLOW_MASK;
}
dcn35_smu_write_ips_scratch(clk_mgr, val);
.get_dtb_ref_clk_frequency = dcn31_get_dtb_ref_freq_khz,
};
-static void dcn35_read_ss_info_from_lut(struct clk_mgr_internal *clk_mgr)
-{
- uint32_t clock_source;
- struct dc_context *ctx = clk_mgr->base.ctx;
-
- REG_GET(CLK1_CLK2_BYPASS_CNTL, CLK2_BYPASS_SEL, &clock_source);
-
- clk_mgr->dprefclk_ss_percentage = ss_info_table.ss_percentage[clock_source];
-
- if (clk_mgr->dprefclk_ss_percentage != 0) {
- clk_mgr->ss_on_dprefclk = true;
- clk_mgr->dprefclk_ss_divider = ss_info_table.ss_divider;
- }
-}
-
void dcn35_clk_mgr_construct(
struct dc_context *ctx,
struct clk_mgr_dcn35 *clk_mgr,
dcn35_dump_clk_registers(&clk_mgr->base.base.boot_snapshot, &clk_mgr->base.base, &log_info);
clk_mgr->base.base.dprefclk_khz = dcn35_smu_get_dprefclk(&clk_mgr->base);
- clk_mgr->base.base.clks.ref_dtbclk_khz = dcn35_smu_get_dtbclk(&clk_mgr->base);
-
- if (!clk_mgr->base.base.clks.ref_dtbclk_khz)
- dcn35_smu_set_dtbclk(&clk_mgr->base, true);
+ clk_mgr->base.base.clks.ref_dtbclk_khz = 600000;
- clk_mgr->base.base.clks.dtbclk_en = true;
dce_clock_read_ss_info(&clk_mgr->base);
/*when clk src is from FCH, it could have ss, same clock src as DPREF clk*/
- dcn35_read_ss_info_from_lut(&clk_mgr->base);
-
clk_mgr->base.base.bw_params = &dcn35_bw_params;
if (clk_mgr->base.base.ctx->dc->debug.pstate_enabled) {
ctx->dc->debug.disable_dpp_power_gate = false;
ctx->dc->debug.disable_hubp_power_gate = false;
ctx->dc->debug.disable_dsc_power_gate = false;
- ctx->dc->debug.disable_hpo_power_gate = false;
} else {
/*let's reset the config control flag*/
ctx->dc->config.disable_ips = DMUB_IPS_DISABLE_ALL; /*pmfw not support it, disable it all*/
struct pipe_ctx *otg_master = resource_get_otg_master_for_stream(&context->res_ctx,
context->streams[i]);
- if (otg_master->stream->test_pattern.type != DP_TEST_PATTERN_VIDEO_MODE)
+ if (otg_master && otg_master->stream->test_pattern.type != DP_TEST_PATTERN_VIDEO_MODE)
resource_build_test_pattern_params(&context->res_ctx, otg_master);
}
}
if (dc->hwss.get_idle_state)
idle_state = dc->hwss.get_idle_state(dc);
- if ((idle_state & DMUB_IPS1_ALLOW_MASK) ||
- (idle_state & DMUB_IPS2_ALLOW_MASK))
+ if (!(idle_state & DMUB_IPS1_ALLOW_MASK) ||
+ !(idle_state & DMUB_IPS2_ALLOW_MASK))
return true;
return false;
*/
bool dc_is_dmub_outbox_supported(struct dc *dc)
{
- /* DCN31 B0 USB4 DPIA needs dmub notifications for interrupts */
- if (dc->ctx->asic_id.chip_family == FAMILY_YELLOW_CARP &&
- dc->ctx->asic_id.hw_internal_rev == YELLOW_CARP_B0 &&
- !dc->debug.dpia_debug.bits.disable_dpia)
- return true;
+ switch (dc->ctx->asic_id.chip_family) {
- if (dc->ctx->asic_id.chip_family == AMDGPU_FAMILY_GC_11_0_1 &&
- !dc->debug.dpia_debug.bits.disable_dpia)
- return true;
+ case FAMILY_YELLOW_CARP:
+ /* DCN31 B0 USB4 DPIA needs dmub notifications for interrupts */
+ if (dc->ctx->asic_id.hw_internal_rev == YELLOW_CARP_B0 &&
+ !dc->debug.dpia_debug.bits.disable_dpia)
+ return true;
+ break;
+
+ case AMDGPU_FAMILY_GC_11_0_1:
+ case AMDGPU_FAMILY_GC_11_5_0:
+ if (!dc->debug.dpia_debug.bits.disable_dpia)
+ return true;
+ break;
+
+ default:
+ break;
+ }
/* dmub aux needs dmub notifications to be enabled */
return dc->debug.enable_dmub_aux_for_legacy_ddc;
+
}
/**
sec_next = sec_pipe->next_odm_pipe;
sec_prev = sec_pipe->prev_odm_pipe;
+ if (pri_pipe == NULL)
+ return false;
+
*sec_pipe = *pri_pipe;
sec_pipe->top_pipe = sec_top;
unsigned int seamless_boot_odm_combine;
unsigned int force_odm_combine_4to1; //bit vector based on otg inst
int minimum_z8_residency_time;
+ int minimum_z10_residency_time;
bool disable_z9_mpc;
unsigned int force_fclk_khz;
bool enable_tri_buf;
enum edp_revision edp_revision;
union dpcd_sink_ext_caps dpcd_sink_ext_caps;
- struct backlight_settings backlight_settings;
struct psr_settings psr_settings;
struct replay_settings replay_settings;
allow_state = dc->hwss.get_idle_state(dc);
dc->hwss.set_idle_state(dc, false);
- if (allow_state & DMUB_IPS2_ALLOW_MASK) {
+ if (!(allow_state & DMUB_IPS2_ALLOW_MASK)) {
// Wait for evaluation time
udelay(dc->debug.ips2_eval_delay_us);
commit_state = dc->hwss.get_idle_state(dc);
- if (commit_state & DMUB_IPS2_COMMIT_MASK) {
+ if (!(commit_state & DMUB_IPS2_COMMIT_MASK)) {
// Tell PMFW to exit low power state
dc->clk_mgr->funcs->exit_low_power_state(dc->clk_mgr);
for (i = 0; i < max_num_polls; ++i) {
commit_state = dc->hwss.get_idle_state(dc);
- if (!(commit_state & DMUB_IPS2_COMMIT_MASK))
+ if (commit_state & DMUB_IPS2_COMMIT_MASK)
break;
udelay(1);
}
dc_dmub_srv_notify_idle(dc, false);
- if (allow_state & DMUB_IPS1_ALLOW_MASK) {
+ if (!(allow_state & DMUB_IPS1_ALLOW_MASK)) {
for (i = 0; i < max_num_polls; ++i) {
commit_state = dc->hwss.get_idle_state(dc);
- if (!(commit_state & DMUB_IPS1_COMMIT_MASK))
+ if (commit_state & DMUB_IPS1_COMMIT_MASK)
break;
udelay(1);
struct fixed31_32 v_scale_ratio;
enum dc_rotation_angle rotation;
bool mirror;
+ struct dc_stream_state *stream;
};
/* IPP related types */
unsigned int disable_fams;
unsigned int skip_avmute;
unsigned int mst_start_top_delay;
+ unsigned int remove_sink_ext_caps;
};
struct dc_edid_caps {
struct link_mst_stream_allocation stream_allocations[MAX_CONTROLLER_NUM];
};
-struct backlight_settings {
- uint32_t backlight_millinits;
-};
-
/* PSR feature flags */
struct psr_settings {
bool psr_feature_enabled; // PSR is supported by sink
if (src_y_offset < 0)
src_y_offset = 0;
/* Save necessary cursor info x, y position. w, h is saved in attribute func. */
- hubp->cur_rect.x = src_x_offset + param->viewport.x;
- hubp->cur_rect.y = src_y_offset + param->viewport.y;
+ if (param->stream->link->psr_settings.psr_version >= DC_PSR_VERSION_SU_1 &&
+ param->rotation != ROTATION_ANGLE_0) {
+ hubp->cur_rect.x = 0;
+ hubp->cur_rect.y = 0;
+ hubp->cur_rect.w = param->stream->timing.h_addressable;
+ hubp->cur_rect.h = param->stream->timing.v_addressable;
+ } else {
+ hubp->cur_rect.x = src_x_offset + param->viewport.x;
+ hubp->cur_rect.y = src_y_offset + param->viewport.y;
+ }
}
void hubp2_clk_cntl(struct hubp *hubp, bool enable)
static const struct dc_debug_options debug_defaults_drv = {
.disable_z10 = false,
.enable_z9_disable_interface = true,
- .minimum_z8_residency_time = 2000,
+ .minimum_z8_residency_time = 2100,
.psr_skip_crtc_disable = true,
.replay_skip_crtc_disabled = true,
.disable_dmcu = true,
/* invalid mode ! */
ASSERT_CRITICAL(false);
}
-
- REG_UPDATE(DIG_FE_CLK_CNTL, DIG_FE_CLK_EN, 1);
- REG_UPDATE(DIG_FE_EN_CNTL, DIG_FE_ENABLE, 1);
- } else {
- REG_UPDATE(DIG_FE_EN_CNTL, DIG_FE_ENABLE, 0);
- REG_UPDATE(DIG_FE_CLK_CNTL, DIG_FE_CLK_EN, 0);
}
}
struct dcn10_stream_encoder *enc1 = DCN10STRENC_FROM_STRENC(enc);
REG_UPDATE(DIG_FIFO_CTRL0, DIG_FIFO_ENABLE, 0);
+ REG_UPDATE(DIG_FE_EN_CNTL, DIG_FE_ENABLE, 0);
+ REG_UPDATE(DIG_FE_CLK_CNTL, DIG_FE_CLK_EN, 0);
}
static void enc35_enable_fifo(struct stream_encoder *enc)
struct dcn10_stream_encoder *enc1 = DCN10STRENC_FROM_STRENC(enc);
REG_UPDATE(DIG_FIFO_CTRL0, DIG_FIFO_READ_START_LEVEL, 0x7);
+ REG_UPDATE(DIG_FE_CLK_CNTL, DIG_FE_CLK_EN, 1);
+ REG_UPDATE(DIG_FE_EN_CNTL, DIG_FE_ENABLE, 1);
enc35_reset_fifo(enc, true);
enc35_reset_fifo(enc, false);
uint32_t power_gate = power_on ? 0 : 1;
uint32_t pwr_status = power_on ? 0 : 2;
uint32_t org_ip_request_cntl;
+ uint32_t power_forceon;
bool block_enabled;
if (pg_cntl->ctx->dc->debug.ignore_pg ||
return;
}
+ REG_GET(DOMAIN25_PG_CONFIG, DOMAIN_POWER_FORCEON, &power_forceon);
+ if (power_forceon)
+ return;
+
REG_GET(DC_IP_REQUEST_CNTL, IP_REQUEST_EN, &org_ip_request_cntl);
if (org_ip_request_cntl == 0)
REG_SET(DC_IP_REQUEST_CNTL, 0, IP_REQUEST_EN, 1);
uint32_t power_gate = power_on ? 0 : 1;
uint32_t pwr_status = power_on ? 0 : 2;
uint32_t org_ip_request_cntl;
+ uint32_t power_forceon;
bool block_enabled;
if (pg_cntl->ctx->dc->debug.ignore_pg ||
return;
}
+ REG_GET(DOMAIN22_PG_CONFIG, DOMAIN_POWER_FORCEON, &power_forceon);
+ if (power_forceon)
+ return;
+
REG_GET(DC_IP_REQUEST_CNTL, IP_REQUEST_EN, &org_ip_request_cntl);
if (org_ip_request_cntl == 0)
REG_SET(DC_IP_REQUEST_CNTL, 0, IP_REQUEST_EN, 1);
out = dml2_validate(dc, context, fast_validate);
+ if (fast_validate)
+ return out;
+
+ DC_FP_START();
+ dcn35_decide_zstate_support(dc, context);
+ DC_FP_END();
+
return out;
}
/* Use pipe context based otg sync logic */
dc->config.use_pipe_ctx_sync_logic = true;
- dc->config.use_default_clock_table = false;
+
/* read VBIOS LTTPR caps */
{
if (ctx->dc_bios->funcs->get_lttpr_caps) {
endif
ifneq ($(CONFIG_FRAME_WARN),0)
+ifeq ($(filter y,$(CONFIG_KASAN)$(CONFIG_KCSAN)),y)
+frame_warn_flag := -Wframe-larger-than=3072
+else
frame_warn_flag := -Wframe-larger-than=2048
endif
+endif
CFLAGS_$(AMDDALPATH)/dc/dml/display_mode_lib.o := $(dml_ccflags)
CFLAGS_$(AMDDALPATH)/dc/dml/display_mode_vba.o := $(dml_ccflags)
* Define the maximum amount of states supported by the ASIC. Every ASIC has a
* specific number of states; this macro defines the maximum number of states.
*/
-#define DC__VOLTAGE_STATES 20
+#define DC__VOLTAGE_STATES 40
#define DC__NUM_DPP__4 1
#define DC__NUM_DPP__0_PRESENT 1
#define DC__NUM_DPP__1_PRESENT 1
{
int plane_count;
int i;
- unsigned int min_dst_y_next_start_us;
plane_count = 0;
- min_dst_y_next_start_us = 0;
for (i = 0; i < dc->res_pool->pipe_count; i++) {
if (context->res_ctx.pipe_ctx[i].plane_state)
plane_count++;
else if (context->stream_count == 1 && context->streams[0]->signal == SIGNAL_TYPE_EDP) {
struct dc_link *link = context->streams[0]->sink->link;
struct dc_stream_status *stream_status = &context->stream_status[0];
- struct dc_stream_state *current_stream = context->streams[0];
int minmum_z8_residency = dc->debug.minimum_z8_residency_time > 0 ? dc->debug.minimum_z8_residency_time : 1000;
bool allow_z8 = context->bw_ctx.dml.vba.StutterPeriod > (double)minmum_z8_residency;
bool is_pwrseq0 = link->link_index == 0;
- bool isFreesyncVideo;
-
- isFreesyncVideo = current_stream->adjust.v_total_min == current_stream->adjust.v_total_max;
- isFreesyncVideo = isFreesyncVideo && current_stream->timing.v_total < current_stream->adjust.v_total_min;
- for (i = 0; i < dc->res_pool->pipe_count; i++) {
- if (context->res_ctx.pipe_ctx[i].stream == current_stream && isFreesyncVideo) {
- min_dst_y_next_start_us = context->res_ctx.pipe_ctx[i].dlg_regs.min_dst_y_next_start_us;
- break;
- }
- }
/* Don't support multi-plane configurations */
if (stream_status->plane_count > 1)
return DCN_ZSTATE_SUPPORT_DISALLOW;
- if (is_pwrseq0 && (context->bw_ctx.dml.vba.StutterPeriod > 5000.0 || min_dst_y_next_start_us > 5000))
+ if (is_pwrseq0 && context->bw_ctx.dml.vba.StutterPeriod > 5000.0)
return DCN_ZSTATE_SUPPORT_ALLOW;
else if (is_pwrseq0 && link->psr_settings.psr_version == DC_PSR_VERSION_1 && !link->panel_config.psr.disable_psr)
return allow_z8 ? DCN_ZSTATE_SUPPORT_ALLOW_Z8_Z10_ONLY : DCN_ZSTATE_SUPPORT_ALLOW_Z10_ONLY;
*/
struct pipe_ctx *pipe;
bool odm;
- int i;
+ int dc_pipe_idx, dml_pipe_idx = 0;
bool updated = false;
- for (i = 0; i < dc->res_pool->pipe_count; i++) {
- pipe = &context->res_ctx.pipe_ctx[i];
+ for (dc_pipe_idx = 0;
+ dc_pipe_idx < dc->res_pool->pipe_count; dc_pipe_idx++) {
+ pipe = &context->res_ctx.pipe_ctx[dc_pipe_idx];
+ if (resource_is_pipe_type(pipe, FREE_PIPE))
+ continue;
- if (merge[i]) {
+ if (merge[dc_pipe_idx]) {
if (resource_is_pipe_type(pipe, OPP_HEAD))
/* merging OPP head means reducing ODM slice
* count by 1
updated = true;
}
- if (split[i]) {
- odm = vba->ODMCombineEnabled[vba->pipe_plane[i]] !=
+ if (split[dc_pipe_idx]) {
+ odm = vba->ODMCombineEnabled[vba->pipe_plane[dml_pipe_idx]] !=
dm_odm_combine_mode_disabled;
if (odm && resource_is_pipe_type(pipe, OPP_HEAD))
update_slice_table_for_stream(
- table, pipe->stream, split[i] - 1);
+ table, pipe->stream, split[dc_pipe_idx] - 1);
else if (!odm && resource_is_pipe_type(pipe, DPP_PIPE))
update_slice_table_for_plane(table, pipe,
- pipe->plane_state, split[i] - 1);
+ pipe->plane_state, split[dc_pipe_idx] - 1);
updated = true;
}
+ dml_pipe_idx++;
}
return updated;
}
int i, pipe_idx, vlevel_temp = 0;
double dcfclk = dcn3_2_soc.clock_limits[0].dcfclk_mhz;
double dcfclk_from_validation = context->bw_ctx.dml.vba.DCFCLKState[vlevel][context->bw_ctx.dml.vba.maxMpcComb];
+ double dram_speed_from_validation = context->bw_ctx.dml.vba.DRAMSpeed;
double dcfclk_from_fw_based_mclk_switching = dcfclk_from_validation;
bool pstate_en = context->bw_ctx.dml.vba.DRAMClockChangeSupport[vlevel][context->bw_ctx.dml.vba.maxMpcComb] !=
dm_dram_clock_change_unsupported;
}
if (dc->clk_mgr->bw_params->wm_table.nv_entries[WM_C].valid) {
- min_dram_speed_mts = context->bw_ctx.dml.vba.DRAMSpeed;
+ min_dram_speed_mts = dram_speed_from_validation;
min_dram_speed_mts_margin = 160;
context->bw_ctx.dml.soc.dram_clock_change_latency_us =
.phyclk_mhz = 600.0,
.phyclk_d18_mhz = 667.0,
.dscclk_mhz = 186.0,
- .dtbclk_mhz = 625.0,
+ .dtbclk_mhz = 600.0,
},
{
.state = 1,
.phyclk_mhz = 810.0,
.phyclk_d18_mhz = 667.0,
.dscclk_mhz = 209.0,
- .dtbclk_mhz = 625.0,
+ .dtbclk_mhz = 600.0,
},
{
.state = 2,
.phyclk_mhz = 810.0,
.phyclk_d18_mhz = 667.0,
.dscclk_mhz = 209.0,
- .dtbclk_mhz = 625.0,
+ .dtbclk_mhz = 600.0,
},
{
.state = 3,
.phyclk_mhz = 810.0,
.phyclk_d18_mhz = 667.0,
.dscclk_mhz = 371.0,
- .dtbclk_mhz = 625.0,
+ .dtbclk_mhz = 600.0,
},
{
.state = 4,
.phyclk_mhz = 810.0,
.phyclk_d18_mhz = 667.0,
.dscclk_mhz = 417.0,
- .dtbclk_mhz = 625.0,
+ .dtbclk_mhz = 600.0,
},
},
.num_states = 5,
- .sr_exit_time_us = 9.0,
- .sr_enter_plus_exit_time_us = 11.0,
- .sr_exit_z8_time_us = 50.0, /*changed from 442.0*/
- .sr_enter_plus_exit_z8_time_us = 50.0,/*changed from 560.0*/
+ .sr_exit_time_us = 14.0,
+ .sr_enter_plus_exit_time_us = 16.0,
+ .sr_exit_z8_time_us = 525.0,
+ .sr_enter_plus_exit_z8_time_us = 715.0,
.fclk_change_latency_us = 20.0,
.usr_retraining_latency_us = 2,
.writeback_latency_us = 12.0,
/*temp till dml2 fully work without dml1*/
dml_init_instance(&dc->dml, &dcn3_5_soc, &dcn3_5_ip,
DML_PROJECT_DCN31);
+
+ /*copy to dml2, before dml2_create*/
+ if (clk_table->num_entries > 2) {
+
+ for (i = 0; i < clk_table->num_entries; i++) {
+ dc->dml2_options.bbox_overrides.clks_table.num_states =
+ clk_table->num_entries;
+ dc->dml2_options.bbox_overrides.clks_table.clk_entries[i].dcfclk_mhz =
+ clock_limits[i].dcfclk_mhz;
+ dc->dml2_options.bbox_overrides.clks_table.clk_entries[i].fclk_mhz =
+ clock_limits[i].fabricclk_mhz;
+ dc->dml2_options.bbox_overrides.clks_table.clk_entries[i].dispclk_mhz =
+ clock_limits[i].dispclk_mhz;
+ dc->dml2_options.bbox_overrides.clks_table.clk_entries[i].dppclk_mhz =
+ clock_limits[i].dppclk_mhz;
+ dc->dml2_options.bbox_overrides.clks_table.clk_entries[i].socclk_mhz =
+ clock_limits[i].socclk_mhz;
+ dc->dml2_options.bbox_overrides.clks_table.clk_entries[i].memclk_mhz =
+ clk_table->entries[i].memclk_mhz * clk_table->entries[i].wck_ratio;
+ dc->dml2_options.bbox_overrides.clks_table.clk_entries[i].dtbclk_mhz =
+ clock_limits[i].dtbclk_mhz;
+ dc->dml2_options.bbox_overrides.clks_table.num_entries_per_clk.num_dcfclk_levels =
+ clk_table->num_entries;
+ dc->dml2_options.bbox_overrides.clks_table.num_entries_per_clk.num_fclk_levels =
+ clk_table->num_entries;
+ dc->dml2_options.bbox_overrides.clks_table.num_entries_per_clk.num_dispclk_levels =
+ clk_table->num_entries;
+ dc->dml2_options.bbox_overrides.clks_table.num_entries_per_clk.num_dppclk_levels =
+ clk_table->num_entries;
+ dc->dml2_options.bbox_overrides.clks_table.num_entries_per_clk.num_socclk_levels =
+ clk_table->num_entries;
+ dc->dml2_options.bbox_overrides.clks_table.num_entries_per_clk.num_memclk_levels =
+ clk_table->num_entries;
+ dc->dml2_options.bbox_overrides.clks_table.num_entries_per_clk.num_dtbclk_levels =
+ clk_table->num_entries;
+ }
+ }
+
+ /* Update latency values */
+ dc->dml2_options.bbox_overrides.dram_clock_change_latency_us = dcn3_5_soc.dram_clock_change_latency_us;
+
+ dc->dml2_options.bbox_overrides.sr_exit_latency_us = dcn3_5_soc.sr_exit_time_us;
+ dc->dml2_options.bbox_overrides.sr_enter_plus_exit_latency_us = dcn3_5_soc.sr_enter_plus_exit_time_us;
+
+ dc->dml2_options.bbox_overrides.sr_exit_z8_time_us = dcn3_5_soc.sr_exit_z8_time_us;
+ dc->dml2_options.bbox_overrides.sr_enter_plus_exit_z8_time_us = dcn3_5_soc.sr_enter_plus_exit_z8_time_us;
}
static bool is_dual_plane(enum surface_pixel_format format)
return pipe_cnt;
}
+
+void dcn35_decide_zstate_support(struct dc *dc, struct dc_state *context)
+{
+ enum dcn_zstate_support_state support = DCN_ZSTATE_SUPPORT_DISALLOW;
+ unsigned int i, plane_count = 0;
+
+ for (i = 0; i < dc->res_pool->pipe_count; i++) {
+ if (context->res_ctx.pipe_ctx[i].plane_state)
+ plane_count++;
+ }
+
+ if (plane_count == 0) {
+ support = DCN_ZSTATE_SUPPORT_ALLOW;
+ } else if (plane_count == 1 && context->stream_count == 1 && context->streams[0]->signal == SIGNAL_TYPE_EDP) {
+ struct dc_link *link = context->streams[0]->sink->link;
+ bool is_pwrseq0 = link && link->link_index == 0;
+ bool is_psr1 = link && link->psr_settings.psr_version == DC_PSR_VERSION_1 && !link->panel_config.psr.disable_psr;
+ int minmum_z8_residency =
+ dc->debug.minimum_z8_residency_time > 0 ? dc->debug.minimum_z8_residency_time : 1000;
+ bool allow_z8 = context->bw_ctx.dml.vba.StutterPeriod > (double)minmum_z8_residency;
+ int minmum_z10_residency =
+ dc->debug.minimum_z10_residency_time > 0 ? dc->debug.minimum_z10_residency_time : 5000;
+ bool allow_z10 = context->bw_ctx.dml.vba.StutterPeriod > (double)minmum_z10_residency;
+
+ if (is_pwrseq0 && allow_z10)
+ support = DCN_ZSTATE_SUPPORT_ALLOW;
+ else if (is_pwrseq0 && is_psr1)
+ support = allow_z8 ? DCN_ZSTATE_SUPPORT_ALLOW_Z8_Z10_ONLY : DCN_ZSTATE_SUPPORT_ALLOW_Z10_ONLY;
+ else if (allow_z8)
+ support = DCN_ZSTATE_SUPPORT_ALLOW_Z8_ONLY;
+ }
+
+ context->bw_ctx.bw.dcn.clk.zstate_support = support;
+}
display_e2e_pipe_params_st *pipes,
bool fast_validate);
+void dcn35_decide_zstate_support(struct dc *dc, struct dc_state *context);
+
#endif
*OutBpp = TruncToValidBPP((1 - Downspreading / 100) * 13500, OutputLinkDPLanes, HTotal, HActive, PixelClockBackEnd, ForcedOutputLinkBPP, LinkDSCEnable, Output,
OutputFormat, DSCInputBitPerComponent, NumberOfDSCSlices, (dml_uint_t)AudioSampleRate, AudioSampleLayout, ODMModeNoDSC, ODMModeDSC, RequiredSlots);
- if (OutBpp == 0 && PHYCLKD32PerState < 20000 / 32 && DSCEnable == dml_dsc_enable_if_necessary && ForcedOutputLinkBPP == 0) {
+ if (*OutBpp == 0 && PHYCLKD32PerState < 20000 / 32 && DSCEnable == dml_dsc_enable_if_necessary && ForcedOutputLinkBPP == 0) {
*RequiresDSC = true;
LinkDSCEnable = true;
*OutBpp = TruncToValidBPP((1 - Downspreading / 100) * 13500, OutputLinkDPLanes, HTotal, HActive, PixelClockBackEnd, ForcedOutputLinkBPP, LinkDSCEnable, Output,
// Output
CalculateWatermarks_params->Watermark = &s->dummy_watermark; // Watermarks *Watermark
- CalculateWatermarks_params->DRAMClockChangeSupport = &mode_lib->ms.support.DRAMClockChangeSupport[j];
+ CalculateWatermarks_params->DRAMClockChangeSupport = &mode_lib->ms.support.DRAMClockChangeSupport[0];
CalculateWatermarks_params->MaxActiveDRAMClockChangeLatencySupported = &s->dummy_single_array[0][0]; // dml_float_t *MaxActiveDRAMClockChangeLatencySupported[]
CalculateWatermarks_params->SubViewportLinesNeededInMALL = &mode_lib->ms.SubViewportLinesNeededInMALL[j]; // dml_uint_t SubViewportLinesNeededInMALL[]
- CalculateWatermarks_params->FCLKChangeSupport = &mode_lib->ms.support.FCLKChangeSupport[j];
+ CalculateWatermarks_params->FCLKChangeSupport = &mode_lib->ms.support.FCLKChangeSupport[0];
CalculateWatermarks_params->MaxActiveFCLKChangeLatencySupported = &s->dummy_single[0]; // dml_float_t *MaxActiveFCLKChangeLatencySupported
- CalculateWatermarks_params->USRRetrainingSupport = &mode_lib->ms.support.USRRetrainingSupport[j];
+ CalculateWatermarks_params->USRRetrainingSupport = &mode_lib->ms.support.USRRetrainingSupport[0];
CalculateWatermarksMALLUseAndDRAMSpeedChangeSupport(
&mode_lib->scratch,
break;
}
- /* Override from passed values, mainly for debugging purposes, if available */
- if (dml2->config.bbox_overrides.sr_exit_latency_us) {
- p->in_states->state_array[0].sr_exit_time_us = dml2->config.bbox_overrides.sr_exit_latency_us;
- }
+ /* Override from passed values, if available */
+ for (i = 0; i < p->in_states->num_states; i++) {
+ if (dml2->config.bbox_overrides.sr_exit_latency_us) {
+ p->in_states->state_array[i].sr_exit_time_us =
+ dml2->config.bbox_overrides.sr_exit_latency_us;
+ }
- if (dml2->config.bbox_overrides.sr_enter_plus_exit_latency_us) {
- p->in_states->state_array[0].sr_enter_plus_exit_time_us = dml2->config.bbox_overrides.sr_enter_plus_exit_latency_us;
- }
+ if (dml2->config.bbox_overrides.sr_enter_plus_exit_latency_us) {
+ p->in_states->state_array[i].sr_enter_plus_exit_time_us =
+ dml2->config.bbox_overrides.sr_enter_plus_exit_latency_us;
+ }
- if (dml2->config.bbox_overrides.urgent_latency_us) {
- p->in_states->state_array[0].urgent_latency_pixel_data_only_us = dml2->config.bbox_overrides.urgent_latency_us;
- }
+ if (dml2->config.bbox_overrides.sr_exit_z8_time_us) {
+ p->in_states->state_array[i].sr_exit_z8_time_us =
+ dml2->config.bbox_overrides.sr_exit_z8_time_us;
+ }
- if (dml2->config.bbox_overrides.dram_clock_change_latency_us) {
- p->in_states->state_array[0].dram_clock_change_latency_us = dml2->config.bbox_overrides.dram_clock_change_latency_us;
- }
+ if (dml2->config.bbox_overrides.sr_enter_plus_exit_z8_time_us) {
+ p->in_states->state_array[i].sr_enter_plus_exit_z8_time_us =
+ dml2->config.bbox_overrides.sr_enter_plus_exit_z8_time_us;
+ }
+
+ if (dml2->config.bbox_overrides.urgent_latency_us) {
+ p->in_states->state_array[i].urgent_latency_pixel_data_only_us =
+ dml2->config.bbox_overrides.urgent_latency_us;
+ }
- if (dml2->config.bbox_overrides.fclk_change_latency_us) {
- p->in_states->state_array[0].fclk_change_latency_us = dml2->config.bbox_overrides.fclk_change_latency_us;
+ if (dml2->config.bbox_overrides.dram_clock_change_latency_us) {
+ p->in_states->state_array[i].dram_clock_change_latency_us =
+ dml2->config.bbox_overrides.dram_clock_change_latency_us;
+ }
+
+ if (dml2->config.bbox_overrides.fclk_change_latency_us) {
+ p->in_states->state_array[i].fclk_change_latency_us =
+ dml2->config.bbox_overrides.fclk_change_latency_us;
+ }
}
/* DCFCLK stas values are project specific */
}
for (i = 0; i < dml2->config.bbox_overrides.clks_table.num_entries_per_clk.num_dtbclk_levels; i++) {
- p->in_states->state_array[i].dtbclk_mhz =
- dml2->config.bbox_overrides.clks_table.clk_entries[i].dtbclk_mhz;
+ if (dml2->config.bbox_overrides.clks_table.clk_entries[i].dtbclk_mhz > 0)
+ p->in_states->state_array[i].dtbclk_mhz =
+ dml2->config.bbox_overrides.clks_table.clk_entries[i].dtbclk_mhz;
}
for (i = 0; i < dml2->config.bbox_overrides.clks_table.num_entries_per_clk.num_dispclk_levels; i++) {
out->do_urgent_latency_adjustment = in_soc_params->do_urgent_latency_adjustment;
out->dram_channel_width_bytes = (dml_uint_t)in_soc_params->dram_channel_width_bytes;
out->fabric_datapath_to_dcn_data_return_bytes = (dml_uint_t)in_soc_params->fabric_datapath_to_dcn_data_return_bytes;
- out->gpuvm_min_page_size_kbytes = in_soc_params->gpuvm_min_page_size_bytes * 1024;
- out->hostvm_min_page_size_kbytes = in_soc_params->hostvm_min_page_size_bytes * 1024;
+ out->gpuvm_min_page_size_kbytes = in_soc_params->gpuvm_min_page_size_bytes / 1024;
+ out->hostvm_min_page_size_kbytes = in_soc_params->hostvm_min_page_size_bytes / 1024;
out->mall_allocated_for_dcn_mbytes = (dml_uint_t)in_soc_params->mall_allocated_for_dcn_mbytes;
out->max_avg_dram_bw_use_normal_percent = in_soc_params->max_avg_dram_bw_use_normal_percent;
out->max_avg_fabric_bw_use_normal_percent = in_soc_params->max_avg_fabric_bw_use_normal_percent;
}
//Generally these are set by referencing our latest BB/IP params in dcn32_resource.c file
- dml_dispcfg->plane.GPUVMEnable = true;
- dml_dispcfg->plane.GPUVMMaxPageTableLevels = 4;
- dml_dispcfg->plane.HostVMEnable = false;
+ dml_dispcfg->plane.GPUVMEnable = dml2->v20.dml_core_ctx.ip.gpuvm_enable;
+ dml_dispcfg->plane.GPUVMMaxPageTableLevels = dml2->v20.dml_core_ctx.ip.gpuvm_max_page_table_levels;
+ dml_dispcfg->plane.HostVMEnable = dml2->v20.dml_core_ctx.ip.hostvm_enable;
+ dml_dispcfg->plane.HostVMMaxPageTableLevels = dml2->v20.dml_core_ctx.ip.hostvm_max_page_table_levels;
+ if (dml2->v20.dml_core_ctx.ip.hostvm_enable)
+ dml2->v20.dml_core_ctx.policy.AllowForPStateChangeOrStutterInVBlankFinal = dml_prefetch_support_uclk_fclk_and_stutter;
dml2_populate_pipe_to_plane_index_mapping(dml2, context);
double urgent_latency_us;
double sr_exit_latency_us;
double sr_enter_plus_exit_latency_us;
+ double sr_exit_z8_time_us;
+ double sr_enter_plus_exit_z8_time_us;
double dram_clock_change_latency_us;
double fclk_change_latency_us;
unsigned int dram_num_chan;
.h_scale_ratio = pipe_ctx->plane_res.scl_data.ratios.horz,
.v_scale_ratio = pipe_ctx->plane_res.scl_data.ratios.vert,
.rotation = pipe_ctx->plane_state->rotation,
- .mirror = pipe_ctx->plane_state->horizontal_mirror
+ .mirror = pipe_ctx->plane_state->horizontal_mirror,
+ .stream = pipe_ctx->stream,
};
bool pipe_split_on = false;
bool odm_combine_on = (pipe_ctx->next_odm_pipe != NULL) ||
if (plane_state->blend_tf->type == TF_TYPE_HWPWL)
lut_params = &plane_state->blend_tf->pwl;
else if (plane_state->blend_tf->type == TF_TYPE_DISTRIBUTED_POINTS) {
- cm_helper_translate_curve_to_hw_format(plane_state->ctx,
- plane_state->blend_tf,
+ cm3_helper_translate_curve_to_hw_format(plane_state->blend_tf,
&dpp_base->regamma_params, false);
lut_params = &dpp_base->regamma_params;
}
else if (plane_state->in_shaper_func->type == TF_TYPE_DISTRIBUTED_POINTS) {
// TODO: dpp_base replace
ASSERT(false);
- cm_helper_translate_curve_to_hw_format(plane_state->ctx,
- plane_state->in_shaper_func,
+ cm3_helper_translate_curve_to_hw_format(plane_state->in_shaper_func,
&dpp_base->shaper_params, true);
lut_params = &dpp_base->shaper_params;
}
dc->caps.dmub_caps.subvp_psr = dc->ctx->dmub_srv->dmub->feature_caps.subvp_psr_support;
dc->caps.dmub_caps.gecc_enable = dc->ctx->dmub_srv->dmub->feature_caps.gecc_enable;
dc->caps.dmub_caps.mclk_sw = dc->ctx->dmub_srv->dmub->feature_caps.fw_assisted_mclk_switch;
+
+ if (dc->ctx->dmub_srv->dmub->fw_version <
+ DMUB_FW_VERSION(7, 0, 35)) {
+ dc->debug.force_disable_subvp = true;
+ dc->debug.disable_fpo_optimizations = true;
+ }
}
}
(link->dpcd_sink_ext_caps.bits.oled == 1)) {
dpcd_set_source_specific_data(link);
msleep(post_oui_delay);
- set_cached_brightness_aux(link);
+ set_default_brightness_aux(link);
}
return true;
if (sink->edid_caps.panel_patch.skip_scdc_overwrite)
link->ctx->dc->debug.hdmi20_disable = true;
+ if (sink->edid_caps.panel_patch.remove_sink_ext_caps)
+ link->dpcd_sink_ext_caps.raw = 0;
+
if (dc_is_hdmi_signal(link->connector_signal))
read_scdc_caps(link->ddc, link->local_sink);
if (link->dpcd_sink_ext_caps.bits.oled == 1 ||
link->dpcd_sink_ext_caps.bits.sdr_aux_backlight_control == 1 ||
link->dpcd_sink_ext_caps.bits.hdr_aux_backlight_control == 1) {
- set_cached_brightness_aux(link);
-
+ set_default_brightness_aux(link);
if (link->dpcd_sink_ext_caps.bits.oled == 1)
msleep(bl_oled_enable_delay);
edp_backlight_enable_aux(link, true);
lt_settings->cr_pattern_time = 16000;
/* Fixed VS/PE specific: Toggle link rate */
- apply_toggle_rate_wa = (link->vendor_specific_lttpr_link_rate_wa == target_rate);
+ apply_toggle_rate_wa = ((link->vendor_specific_lttpr_link_rate_wa == target_rate) || (link->vendor_specific_lttpr_link_rate_wa == 0));
target_rate = get_dpcd_link_rate(<_settings->link_settings);
toggle_rate = (target_rate == 0x6) ? 0xA : 0x6;
/* Vendor specific: Toggle link rate */
toggle_rate = (rate == 0x6) ? 0xA : 0x6;
- if (link->vendor_specific_lttpr_link_rate_wa == rate) {
+ if (link->vendor_specific_lttpr_link_rate_wa == rate || link->vendor_specific_lttpr_link_rate_wa == 0) {
core_link_write_dpcd(
link,
DP_LINK_BW_SET,
/* Vendor specific: Toggle link rate */
toggle_rate = (rate == 0x6) ? 0xA : 0x6;
- if (link->vendor_specific_lttpr_link_rate_wa == rate) {
+ if (link->vendor_specific_lttpr_link_rate_wa == rate || link->vendor_specific_lttpr_link_rate_wa == 0) {
core_link_write_dpcd(
link,
DP_LINK_BW_SET,
*(uint32_t *)&dpcd_backlight_set.backlight_level_millinits = backlight_millinits;
*(uint16_t *)&dpcd_backlight_set.backlight_transition_time_ms = (uint16_t)transition_time_in_ms;
- link->backlight_settings.backlight_millinits = backlight_millinits;
if (!link->dpcd_caps.panel_luminance_control) {
if (core_link_write_dpcd(link, DP_SOURCE_BACKLIGHT_LEVEL,
default_backlight = 150000;
// if < 1 nits or > 5000, it might be wrong readback
if (default_backlight < 1000 || default_backlight > 5000000)
- default_backlight = 150000; //
+ default_backlight = 150000;
return edp_set_backlight_level_nits(link, true,
default_backlight, 0);
return false;
}
-bool set_cached_brightness_aux(struct dc_link *link)
-{
- if (link->backlight_settings.backlight_millinits)
- return edp_set_backlight_level_nits(link, true,
- link->backlight_settings.backlight_millinits, 0);
- else
- return set_default_brightness_aux(link);
- return false;
-}
bool edp_is_ilr_optimization_enabled(struct dc_link *link)
{
if (link->dpcd_caps.edp_supported_link_rates_count == 0 || !link->panel_config.ilr.optimize_edp_link_rate)
enum dp_panel_mode dp_get_panel_mode(struct dc_link *link);
void dp_set_panel_mode(struct dc_link *link, enum dp_panel_mode panel_mode);
bool set_default_brightness_aux(struct dc_link *link);
-bool set_cached_brightness_aux(struct dc_link *link);
void edp_panel_backlight_power_on(struct dc_link *link, bool wait_for_hpd);
int edp_get_backlight_level(const struct dc_link *link);
bool edp_get_backlight_level_nits(struct dc_link *link,
uint32_t vbios_size;
const uint8_t *fw_inst_const;
const uint8_t *fw_bss_data;
+ bool is_mailbox_in_inbox;
};
/**
*/
struct dmub_srv_region_info {
uint32_t fb_size;
+ uint32_t inbox_size;
uint8_t num_regions;
struct dmub_region regions[DMUB_WINDOW_TOTAL];
};
/**
- * struct dmub_srv_fb_params - parameters used for driver fb setup
+ * struct dmub_srv_memory_params - parameters used for driver fb setup
* @region_info: region info calculated by dmub service
- * @cpu_addr: base cpu address for the framebuffer
- * @gpu_addr: base gpu virtual address for the framebuffer
+ * @cpu_fb_addr: base cpu address for the framebuffer
+ * @cpu_inbox_addr: base cpu address for the gart
+ * @gpu_fb_addr: base gpu virtual address for the framebuffer
+ * @gpu_inbox_addr: base gpu virtual address for the gart
*/
-struct dmub_srv_fb_params {
+struct dmub_srv_memory_params {
const struct dmub_srv_region_info *region_info;
- void *cpu_addr;
- uint64_t gpu_addr;
+ void *cpu_fb_addr;
+ void *cpu_inbox_addr;
+ uint64_t gpu_fb_addr;
+ uint64_t gpu_inbox_addr;
};
/**
* DMUB_STATUS_OK - success
* DMUB_STATUS_INVALID - unspecified error
*/
-enum dmub_status dmub_srv_calc_fb_info(struct dmub_srv *dmub,
- const struct dmub_srv_fb_params *params,
+enum dmub_status dmub_srv_calc_mem_info(struct dmub_srv *dmub,
+ const struct dmub_srv_memory_params *params,
struct dmub_srv_fb_info *out);
/**
uint32_t fw_state_size = DMUB_FW_STATE_SIZE;
uint32_t trace_buffer_size = DMUB_TRACE_BUFFER_SIZE;
uint32_t scratch_mem_size = DMUB_SCRATCH_MEM_SIZE;
-
+ uint32_t previous_top = 0;
if (!dmub->sw_init)
return DMUB_STATUS_INVALID;
bios->base = dmub_align(stack->top, 256);
bios->top = bios->base + params->vbios_size;
- mail->base = dmub_align(bios->top, 256);
- mail->top = mail->base + DMUB_MAILBOX_SIZE;
+ if (params->is_mailbox_in_inbox) {
+ mail->base = 0;
+ mail->top = mail->base + DMUB_MAILBOX_SIZE;
+ previous_top = bios->top;
+ } else {
+ mail->base = dmub_align(bios->top, 256);
+ mail->top = mail->base + DMUB_MAILBOX_SIZE;
+ previous_top = mail->top;
+ }
fw_info = dmub_get_fw_meta_info(params);
dmub->fw_version = fw_info->fw_version;
}
- trace_buff->base = dmub_align(mail->top, 256);
+ trace_buff->base = dmub_align(previous_top, 256);
trace_buff->top = trace_buff->base + dmub_align(trace_buffer_size, 64);
fw_state->base = dmub_align(trace_buff->top, 256);
out->fb_size = dmub_align(scratch_mem->top, 4096);
+ if (params->is_mailbox_in_inbox)
+ out->inbox_size = dmub_align(mail->top, 4096);
+
return DMUB_STATUS_OK;
}
-enum dmub_status dmub_srv_calc_fb_info(struct dmub_srv *dmub,
- const struct dmub_srv_fb_params *params,
+enum dmub_status dmub_srv_calc_mem_info(struct dmub_srv *dmub,
+ const struct dmub_srv_memory_params *params,
struct dmub_srv_fb_info *out)
{
uint8_t *cpu_base;
if (params->region_info->num_regions != DMUB_NUM_WINDOWS)
return DMUB_STATUS_INVALID;
- cpu_base = (uint8_t *)params->cpu_addr;
- gpu_base = params->gpu_addr;
+ cpu_base = (uint8_t *)params->cpu_fb_addr;
+ gpu_base = params->gpu_fb_addr;
for (i = 0; i < DMUB_NUM_WINDOWS; ++i) {
const struct dmub_region *reg =
out->fb[i].cpu_addr = cpu_base + reg->base;
out->fb[i].gpu_addr = gpu_base + reg->base;
+
+ if (i == DMUB_WINDOW_4_MAILBOX && params->cpu_inbox_addr != 0) {
+ out->fb[i].cpu_addr = (uint8_t *)params->cpu_inbox_addr + reg->base;
+ out->fb[i].gpu_addr = params->gpu_inbox_addr + reg->base;
+ }
+
out->fb[i].size = reg->top - reg->base;
}
return DMUB_STATUS_INVALID;
if (dmub->hw_funcs.get_inbox1_rptr && dmub->hw_funcs.get_inbox1_wptr) {
- dmub->inbox1_rb.rptr = dmub->hw_funcs.get_inbox1_rptr(dmub);
- dmub->inbox1_rb.wrpt = dmub->hw_funcs.get_inbox1_wptr(dmub);
- dmub->inbox1_last_wptr = dmub->inbox1_rb.wrpt;
+ uint32_t rptr = dmub->hw_funcs.get_inbox1_rptr(dmub);
+ uint32_t wptr = dmub->hw_funcs.get_inbox1_wptr(dmub);
+
+ if (rptr > dmub->inbox1_rb.capacity || wptr > dmub->inbox1_rb.capacity) {
+ return DMUB_STATUS_HW_FAILURE;
+ } else {
+ dmub->inbox1_rb.rptr = rptr;
+ dmub->inbox1_rb.wrpt = wptr;
+ dmub->inbox1_last_wptr = dmub->inbox1_rb.wrpt;
+ }
}
return DMUB_STATUS_OK;
if (!dmub->hw_init)
return DMUB_STATUS_INVALID;
+ if (dmub->inbox1_rb.rptr > dmub->inbox1_rb.capacity ||
+ dmub->inbox1_rb.wrpt > dmub->inbox1_rb.capacity) {
+ return DMUB_STATUS_HW_FAILURE;
+ }
+
if (dmub_rb_push_front(&dmub->inbox1_rb, cmd))
return DMUB_STATUS_OK;
ack = dmub->hw_funcs.read_inbox0_ack_register(dmub);
if (ack)
return DMUB_STATUS_OK;
+ udelay(1);
}
return DMUB_STATUS_TIMEOUT;
}
/* V2.1 */
struct edp_info edp1_info;
struct edp_info edp2_info;
+ uint32_t gpuclk_ss_percentage;
+ uint32_t gpuclk_ss_type;
};
/*
((dpcd_caps->sink_dev_id_str[1] == 0x08 && dpcd_caps->sink_dev_id_str[0] == 0x08) ||
(dpcd_caps->sink_dev_id_str[1] == 0x08 && dpcd_caps->sink_dev_id_str[0] == 0x07)))
isPSRSUSupported = false;
+ else if (dpcd_caps->sink_dev_id_str[1] == 0x08 && dpcd_caps->sink_dev_id_str[0] == 0x03)
+ isPSRSUSupported = false;
else if (dpcd_caps->psr_info.force_psrsu_cap == 0x1)
isPSRSUSupported = true;
}
#define regTCP_INVALIDATE_BASE_IDX 1
#define regTCP_STATUS 0x19a1
#define regTCP_STATUS_BASE_IDX 1
+#define regTCP_CNTL 0x19a2
+#define regTCP_CNTL_BASE_IDX 1
#define regTCP_CNTL2 0x19a3
#define regTCP_CNTL2_BASE_IDX 1
#define regTCP_DEBUG_INDEX 0x19a5
#define regBIF_BIF256_CI256_RC3X4_USB4_PCIE_CNTL2_BASE_IDX 5
#define regBIF_BIF256_CI256_RC3X4_USB4_PCIE_TX_POWER_CTRL_1 0x420187
#define regBIF_BIF256_CI256_RC3X4_USB4_PCIE_TX_POWER_CTRL_1_BASE_IDX 5
+#define regBIF_BIF256_CI256_RC3X4_USB4_PCIE_MST_CTRL_3 0x4201c6
+#define regBIF_BIF256_CI256_RC3X4_USB4_PCIE_MST_CTRL_3_BASE_IDX 5
// addressBlock: nbio_nbif0_bif_cfg_dev0_rc_bifcfgdecp
//BIF_BIF256_CI256_RC3X4_USB4_PCIE_TX_POWER_CTRL_1
#define BIF_BIF256_CI256_RC3X4_USB4_PCIE_TX_POWER_CTRL_1__MST_MEM_LS_EN_MASK 0x00000001L
#define BIF_BIF256_CI256_RC3X4_USB4_PCIE_TX_POWER_CTRL_1__REPLAY_MEM_LS_EN_MASK 0x00000008L
+//BIF_BIF256_CI256_RC3X4_USB4_PCIE_MST_CTRL_3
+#define BIF_BIF256_CI256_RC3X4_USB4_PCIE_MST_CTRL_3__CI_SWUS_MAX_PAYLOAD_SIZE_MODE__SHIFT 0x8
+#define BIF_BIF256_CI256_RC3X4_USB4_PCIE_MST_CTRL_3__CI_SWUS_PRIV_MAX_PAYLOAD_SIZE__SHIFT 0x9
+#define BIF_BIF256_CI256_RC3X4_USB4_PCIE_MST_CTRL_3__CI_10BIT_TAG_EN_OVERRIDE__SHIFT 0xb
+#define BIF_BIF256_CI256_RC3X4_USB4_PCIE_MST_CTRL_3__CI_SWUS_10BIT_TAG_EN_OVERRIDE__SHIFT 0xd
+#define BIF_BIF256_CI256_RC3X4_USB4_PCIE_MST_CTRL_3__MST_DROP_SYNC_FLOOD_EN__SHIFT 0xf
+#define BIF_BIF256_CI256_RC3X4_USB4_PCIE_MST_CTRL_3__CI_MAX_PAYLOAD_SIZE_MODE__SHIFT 0x10
+#define BIF_BIF256_CI256_RC3X4_USB4_PCIE_MST_CTRL_3__CI_PRIV_MAX_PAYLOAD_SIZE__SHIFT 0x11
+#define BIF_BIF256_CI256_RC3X4_USB4_PCIE_MST_CTRL_3__CI_MAX_READ_REQUEST_SIZE_MODE__SHIFT 0x14
+#define BIF_BIF256_CI256_RC3X4_USB4_PCIE_MST_CTRL_3__CI_PRIV_MAX_READ_REQUEST_SIZE__SHIFT 0x15
+#define BIF_BIF256_CI256_RC3X4_USB4_PCIE_MST_CTRL_3__CI_MAX_READ_SAFE_MODE__SHIFT 0x18
+#define BIF_BIF256_CI256_RC3X4_USB4_PCIE_MST_CTRL_3__CI_EXTENDED_TAG_EN_OVERRIDE__SHIFT 0x19
+#define BIF_BIF256_CI256_RC3X4_USB4_PCIE_MST_CTRL_3__CI_SWUS_MAX_READ_REQUEST_SIZE_MODE__SHIFT 0x1b
+#define BIF_BIF256_CI256_RC3X4_USB4_PCIE_MST_CTRL_3__CI_SWUS_MAX_READ_REQUEST_SIZE_PRIV__SHIFT 0x1c
+#define BIF_BIF256_CI256_RC3X4_USB4_PCIE_MST_CTRL_3__CI_SWUS_EXTENDED_TAG_EN_OVERRIDE__SHIFT 0x1e
+#define BIF_BIF256_CI256_RC3X4_USB4_PCIE_MST_CTRL_3__CI_SWUS_MAX_PAYLOAD_SIZE_MODE_MASK 0x00000100L
+#define BIF_BIF256_CI256_RC3X4_USB4_PCIE_MST_CTRL_3__CI_SWUS_PRIV_MAX_PAYLOAD_SIZE_MASK 0x00000600L
+#define BIF_BIF256_CI256_RC3X4_USB4_PCIE_MST_CTRL_3__CI_10BIT_TAG_EN_OVERRIDE_MASK 0x00001800L
+#define BIF_BIF256_CI256_RC3X4_USB4_PCIE_MST_CTRL_3__CI_SWUS_10BIT_TAG_EN_OVERRIDE_MASK 0x00006000L
+#define BIF_BIF256_CI256_RC3X4_USB4_PCIE_MST_CTRL_3__MST_DROP_SYNC_FLOOD_EN_MASK 0x00008000L
+#define BIF_BIF256_CI256_RC3X4_USB4_PCIE_MST_CTRL_3__CI_MAX_PAYLOAD_SIZE_MODE_MASK 0x00010000L
+#define BIF_BIF256_CI256_RC3X4_USB4_PCIE_MST_CTRL_3__CI_PRIV_MAX_PAYLOAD_SIZE_MASK 0x000E0000L
+#define BIF_BIF256_CI256_RC3X4_USB4_PCIE_MST_CTRL_3__CI_MAX_READ_REQUEST_SIZE_MODE_MASK 0x00100000L
+#define BIF_BIF256_CI256_RC3X4_USB4_PCIE_MST_CTRL_3__CI_PRIV_MAX_READ_REQUEST_SIZE_MASK 0x00E00000L
+#define BIF_BIF256_CI256_RC3X4_USB4_PCIE_MST_CTRL_3__CI_MAX_READ_SAFE_MODE_MASK 0x01000000L
+#define BIF_BIF256_CI256_RC3X4_USB4_PCIE_MST_CTRL_3__CI_EXTENDED_TAG_EN_OVERRIDE_MASK 0x06000000L
+#define BIF_BIF256_CI256_RC3X4_USB4_PCIE_MST_CTRL_3__CI_SWUS_MAX_READ_REQUEST_SIZE_MODE_MASK 0x08000000L
+#define BIF_BIF256_CI256_RC3X4_USB4_PCIE_MST_CTRL_3__CI_SWUS_MAX_READ_REQUEST_SIZE_PRIV_MASK 0x30000000L
+#define BIF_BIF256_CI256_RC3X4_USB4_PCIE_MST_CTRL_3__CI_SWUS_EXTENDED_TAG_EN_OVERRIDE_MASK 0xC0000000L
// addressBlock: nbio_nbif0_bif_cfg_dev0_rc_bifcfgdecp
//BIF_CFG_DEV0_RC0_VENDOR_ID
struct dpm_clocks *clock_table);
int (*get_smu_prv_buf_details)(void *handle, void **addr, size_t *size);
void (*pm_compute_clocks)(void *handle);
+ int (*notify_rlc_state)(void *handle, bool en);
};
struct metrics_table_header {
uint16_t average_dram_reads;
/* time filtered DRAM write bandwidth [MB/sec] */
uint16_t average_dram_writes;
+ /* time filtered IPU read bandwidth [MB/sec] */
+ uint16_t average_ipu_reads;
+ /* time filtered IPU write bandwidth [MB/sec] */
+ uint16_t average_ipu_writes;
/* Driver attached timestamp (in ns) */
uint64_t system_clock_counter;
uint32_t average_all_core_power;
/* calculated core power [mW] */
uint16_t average_core_power[16];
+ /* time filtered total system power [mW] */
+ uint16_t average_sys_power;
/* maximum IRM defined STAPM power limit [mW] */
uint16_t stapm_power_limit;
/* time filtered STAPM power limit [mW] */
uint16_t average_ipuclk_frequency;
uint16_t average_fclk_frequency;
uint16_t average_vclk_frequency;
+ uint16_t average_uclk_frequency;
+ uint16_t average_mpipu_frequency;
/* Current clocks */
/* target core frequency [MHz] */
/* GFXCLK frequency limit enforced on GFX [MHz] */
uint16_t current_gfx_maxfreq;
+ /* Throttle Residency (ASIC dependent) */
+ uint32_t throttle_residency_prochot;
+ uint32_t throttle_residency_spl;
+ uint32_t throttle_residency_fppt;
+ uint32_t throttle_residency_sppt;
+ uint32_t throttle_residency_thm_core;
+ uint32_t throttle_residency_thm_gfx;
+ uint32_t throttle_residency_thm_soc;
+
/* Metrics table alpha filter time constant [us] */
uint32_t time_filter_alphavalue;
};
return ret;
}
+int amdgpu_dpm_notify_rlc_state(struct amdgpu_device *adev, bool en)
+{
+ int ret = 0;
+ const struct amd_pm_funcs *pp_funcs = adev->powerplay.pp_funcs;
+
+ if (pp_funcs && pp_funcs->notify_rlc_state) {
+ mutex_lock(&adev->pm.mutex);
+
+ ret = pp_funcs->notify_rlc_state(
+ adev->powerplay.pp_handle,
+ en);
+
+ mutex_unlock(&adev->pm.mutex);
+ }
+
+ return ret;
+}
+
bool amdgpu_dpm_is_baco_supported(struct amdgpu_device *adev)
{
const struct amd_pm_funcs *pp_funcs = adev->powerplay.pp_funcs;
} else if (DEVICE_ATTR_IS(xgmi_plpd_policy)) {
if (amdgpu_dpm_get_xgmi_plpd_mode(adev, NULL) == XGMI_PLPD_NONE)
*states = ATTR_STATE_UNSUPPORTED;
- } else if (DEVICE_ATTR_IS(pp_dpm_mclk_od)) {
+ } else if (DEVICE_ATTR_IS(pp_mclk_od)) {
if (amdgpu_dpm_get_mclk_od(adev) == -EOPNOTSUPP)
*states = ATTR_STATE_UNSUPPORTED;
- } else if (DEVICE_ATTR_IS(pp_dpm_sclk_od)) {
+ } else if (DEVICE_ATTR_IS(pp_sclk_od)) {
if (amdgpu_dpm_get_sclk_od(adev) == -EOPNOTSUPP)
*states = ATTR_STATE_UNSUPPORTED;
} else if (DEVICE_ATTR_IS(apu_thermal_cap)) {
int amdgpu_dpm_set_mp1_state(struct amdgpu_device *adev,
enum pp_mp1_state mp1_state);
+int amdgpu_dpm_notify_rlc_state(struct amdgpu_device *adev, bool en);
+
int amdgpu_dpm_set_gfx_power_up_by_imu(struct amdgpu_device *adev);
int amdgpu_dpm_baco_exit(struct amdgpu_device *adev);
}
}
+ /* Notify SMU RLC is going to be off, stop RLC and SMU interaction.
+ * otherwise SMU will hang while interacting with RLC if RLC is halted
+ * this is a WA for Vangogh asic which fix the SMU hang issue.
+ */
+ ret = smu_notify_rlc_state(smu, false);
+ if (ret) {
+ dev_err(adev->dev, "Fail to notify rlc status!\n");
+ return ret;
+ }
+
if (amdgpu_ip_version(adev, GC_HWIP, 0) >= IP_VERSION(9, 4, 2) &&
!((adev->flags & AMD_IS_APU) && adev->gfx.imu.funcs) &&
!amdgpu_sriov_vf(adev) && adev->gfx.rlc.funcs->stop)
* management.
*/
int (*dpm_set_umsch_mm_enable)(struct smu_context *smu, bool enable);
+
+ /**
+ * @notify_rlc_state: Notify RLC power state to SMU.
+ */
+ int (*notify_rlc_state)(struct smu_context *smu, bool en);
};
typedef enum {
METRICS_PCIE_WIDTH,
METRICS_CURR_FANPWM,
METRICS_CURR_SOCKETPOWER,
+ METRICS_AVERAGE_VPECLK,
+ METRICS_AVERAGE_IPUCLK,
+ METRICS_AVERAGE_MPIPUCLK,
+ METRICS_THROTTLER_RESIDENCY_PROCHOT,
+ METRICS_THROTTLER_RESIDENCY_SPL,
+ METRICS_THROTTLER_RESIDENCY_FPPT,
+ METRICS_THROTTLER_RESIDENCY_SPPT,
+ METRICS_THROTTLER_RESIDENCY_THM_CORE,
+ METRICS_THROTTLER_RESIDENCY_THM_GFX,
+ METRICS_THROTTLER_RESIDENCY_THM_SOC,
} MetricsMember_t;
enum smu_cmn2asic_mapping_type {
// *** IMPORTANT ***
// SMU TEAM: Always increment the interface version if
// any structure is changed in this file
-#define PMFW_DRIVER_IF_VERSION 6
+#define PMFW_DRIVER_IF_VERSION 7
typedef struct {
int32_t value;
} DpmClocks_t;
typedef struct {
- uint16_t CoreFrequency[16]; //Target core frequency [MHz]
- uint16_t CorePower[16]; //CAC calculated core power [mW]
- uint16_t CoreTemperature[16]; //TSEN measured core temperature [centi-C]
- uint16_t GfxTemperature; //TSEN measured GFX temperature [centi-C]
- uint16_t SocTemperature; //TSEN measured SOC temperature [centi-C]
- uint16_t StapmOpnLimit; //Maximum IRM defined STAPM power limit [mW]
- uint16_t StapmCurrentLimit; //Time filtered STAPM power limit [mW]
- uint16_t InfrastructureCpuMaxFreq; //CCLK frequency limit enforced on classic cores [MHz]
- uint16_t InfrastructureGfxMaxFreq; //GFXCLK frequency limit enforced on GFX [MHz]
- uint16_t SkinTemp; //Maximum skin temperature reported by APU and HS2 chassis sensors [centi-C]
- uint16_t GfxclkFrequency; //Time filtered target GFXCLK frequency [MHz]
- uint16_t FclkFrequency; //Time filtered target FCLK frequency [MHz]
- uint16_t GfxActivity; //Time filtered GFX busy % [0-100]
- uint16_t SocclkFrequency; //Time filtered target SOCCLK frequency [MHz]
- uint16_t VclkFrequency; //Time filtered target VCLK frequency [MHz]
- uint16_t VcnActivity; //Time filtered VCN busy % [0-100]
- uint16_t VpeclkFrequency; //Time filtered target VPECLK frequency [MHz]
- uint16_t IpuclkFrequency; //Time filtered target IPUCLK frequency [MHz]
- uint16_t IpuBusy[8]; //Time filtered IPU per-column busy % [0-100]
- uint16_t DRAMReads; //Time filtered DRAM read bandwidth [MB/sec]
- uint16_t DRAMWrites; //Time filtered DRAM write bandwidth [MB/sec]
- uint16_t CoreC0Residency[16]; //Time filtered per-core C0 residency % [0-100]
- uint16_t IpuPower; //Time filtered IPU power [mW]
- uint32_t ApuPower; //Time filtered APU power [mW]
- uint32_t GfxPower; //Time filtered GFX power [mW]
- uint32_t dGpuPower; //Time filtered dGPU power [mW]
- uint32_t SocketPower; //Time filtered power used for PPT/STAPM [APU+dGPU] [mW]
- uint32_t AllCorePower; //Time filtered sum of core power across all cores in the socket [mW]
- uint32_t FilterAlphaValue; //Metrics table alpha filter time constant [us]
- uint32_t MetricsCounter; //Counter that is incremented on every metrics table update [PM_TIMER cycles]
- uint32_t spare[16];
+ uint16_t CoreFrequency[16]; //Target core frequency [MHz]
+ uint16_t CorePower[16]; //CAC calculated core power [mW]
+ uint16_t CoreTemperature[16]; //TSEN measured core temperature [centi-C]
+ uint16_t GfxTemperature; //TSEN measured GFX temperature [centi-C]
+ uint16_t SocTemperature; //TSEN measured SOC temperature [centi-C]
+ uint16_t StapmOpnLimit; //Maximum IRM defined STAPM power limit [mW]
+ uint16_t StapmCurrentLimit; //Time filtered STAPM power limit [mW]
+ uint16_t InfrastructureCpuMaxFreq; //CCLK frequency limit enforced on classic cores [MHz]
+ uint16_t InfrastructureGfxMaxFreq; //GFXCLK frequency limit enforced on GFX [MHz]
+ uint16_t SkinTemp; //Maximum skin temperature reported by APU and HS2 chassis sensors [centi-C]
+ uint16_t GfxclkFrequency; //Time filtered target GFXCLK frequency [MHz]
+ uint16_t FclkFrequency; //Time filtered target FCLK frequency [MHz]
+ uint16_t GfxActivity; //Time filtered GFX busy % [0-100]
+ uint16_t SocclkFrequency; //Time filtered target SOCCLK frequency [MHz]
+ uint16_t VclkFrequency; //Time filtered target VCLK frequency [MHz]
+ uint16_t VcnActivity; //Time filtered VCN busy % [0-100]
+ uint16_t VpeclkFrequency; //Time filtered target VPECLK frequency [MHz]
+ uint16_t IpuclkFrequency; //Time filtered target IPUCLK frequency [MHz]
+ uint16_t IpuBusy[8]; //Time filtered IPU per-column busy % [0-100]
+ uint16_t DRAMReads; //Time filtered DRAM read bandwidth [MB/sec]
+ uint16_t DRAMWrites; //Time filtered DRAM write bandwidth [MB/sec]
+ uint16_t CoreC0Residency[16]; //Time filtered per-core C0 residency % [0-100]
+ uint16_t IpuPower; //Time filtered IPU power [mW]
+ uint32_t ApuPower; //Time filtered APU power [mW]
+ uint32_t GfxPower; //Time filtered GFX power [mW]
+ uint32_t dGpuPower; //Time filtered dGPU power [mW]
+ uint32_t SocketPower; //Time filtered power used for PPT/STAPM [APU+dGPU] [mW]
+ uint32_t AllCorePower; //Time filtered sum of core power across all cores in the socket [mW]
+ uint32_t FilterAlphaValue; //Metrics table alpha filter time constant [us]
+ uint32_t MetricsCounter; //Counter that is incremented on every metrics table update [PM_TIMER cycles]
+ uint16_t MemclkFrequency; //Time filtered target MEMCLK frequency [MHz]
+ uint16_t MpipuclkFrequency; //Time filtered target MPIPUCLK frequency [MHz]
+ uint16_t IpuReads; //Time filtered IPU read bandwidth [MB/sec]
+ uint16_t IpuWrites; //Time filtered IPU write bandwidth [MB/sec]
+ uint32_t ThrottleResidency_PROCHOT; //Counter that is incremented on every metrics table update when PROCHOT was engaged [PM_TIMER cycles]
+ uint32_t ThrottleResidency_SPL; //Counter that is incremented on every metrics table update when SPL was engaged [PM_TIMER cycles]
+ uint32_t ThrottleResidency_FPPT; //Counter that is incremented on every metrics table update when fast PPT was engaged [PM_TIMER cycles]
+ uint32_t ThrottleResidency_SPPT; //Counter that is incremented on every metrics table update when slow PPT was engaged [PM_TIMER cycles]
+ uint32_t ThrottleResidency_THM_CORE; //Counter that is incremented on every metrics table update when CORE thermal throttling was engaged [PM_TIMER cycles]
+ uint32_t ThrottleResidency_THM_GFX; //Counter that is incremented on every metrics table update when GFX thermal throttling was engaged [PM_TIMER cycles]
+ uint32_t ThrottleResidency_THM_SOC; //Counter that is incremented on every metrics table update when SOC thermal throttling was engaged [PM_TIMER cycles]
+ uint16_t Psys; //Time filtered Psys power [mW]
+ uint16_t spare1;
+ uint32_t spare[6];
} SmuMetrics_t;
//ISP tile definitions
VOLTAGE_GUARDBAND_COUNT
} GFX_GUARDBAND_e;
-#define SMU_METRICS_TABLE_VERSION 0x8
+#define SMU_METRICS_TABLE_VERSION 0x9
typedef struct __attribute__((packed, aligned(4))) {
uint32_t AccumulationCounter;
//XGMI Data tranfser size
uint64_t XgmiReadDataSizeAcc[8];//in KByte
uint64_t XgmiWriteDataSizeAcc[8];//in KByte
+
+ //PCIE BW Data and error count
+ uint32_t PcieBandwidth[4];
+ uint32_t PCIeL0ToRecoveryCountAcc; // The Pcie counter itself is accumulated
+ uint32_t PCIenReplayAAcc; // The Pcie counter itself is accumulated
+ uint32_t PCIenReplayARolloverCountAcc; // The Pcie counter itself is accumulated
+ uint32_t PCIeNAKSentCountAcc; // The Pcie counter itself is accumulated
+ uint32_t PCIeNAKReceivedCountAcc; // The Pcie counter itself is accumulated
} MetricsTable_t;
#define SMU_VF_METRICS_TABLE_VERSION 0x3
return 0;
}
-
-static int vangogh_system_features_control(struct smu_context *smu, bool en)
+static int vangogh_notify_rlc_state(struct smu_context *smu, bool en)
{
struct amdgpu_device *adev = smu->adev;
int ret = 0;
.print_clk_levels = vangogh_common_print_clk_levels,
.set_default_dpm_table = vangogh_set_default_dpm_tables,
.set_fine_grain_gfx_freq_parameters = vangogh_set_fine_grain_gfx_freq_parameters,
- .system_features_control = vangogh_system_features_control,
+ .notify_rlc_state = vangogh_notify_rlc_state,
.feature_is_enabled = smu_cmn_feature_is_enabled,
.set_power_profile_mode = vangogh_set_power_profile_mode,
.get_power_profile_mode = vangogh_get_power_profile_mode,
}
smu_table->ecc_table = kzalloc(tables[SMU_TABLE_ECCINFO].size, GFP_KERNEL);
- if (!smu_table->ecc_table)
+ if (!smu_table->ecc_table) {
+ kfree(smu_table->metrics_table);
+ kfree(smu_table->gpu_metrics_table);
return -ENOMEM;
+ }
return 0;
}
static int smu_v13_0_6_notify_unload(struct smu_context *smu)
{
- if (smu->smc_fw_version <= 0x553500)
+ if (amdgpu_in_reset(smu->adev))
return 0;
dev_dbg(smu->adev->dev, "Notify PMFW about driver unload");
smu_v13_0_6_get_current_pcie_link_speed(smu);
gpu_metrics->pcie_bandwidth_acc =
SMUQ10_ROUND(metrics->PcieBandwidthAcc[0]);
+ gpu_metrics->pcie_bandwidth_inst =
+ SMUQ10_ROUND(metrics->PcieBandwidth[0]);
+ gpu_metrics->pcie_l0_to_recov_count_acc =
+ metrics->PCIeL0ToRecoveryCountAcc;
+ gpu_metrics->pcie_replay_count_acc =
+ metrics->PCIenReplayAAcc;
+ gpu_metrics->pcie_replay_rover_count_acc =
+ metrics->PCIenReplayARolloverCountAcc;
}
gpu_metrics->system_clock_counter = ktime_get_boottime_ns();
static bool mca_smu_bank_is_valid(const struct mca_ras_info *mca_ras, struct amdgpu_device *adev,
enum amdgpu_mca_error_type type, struct mca_bank_entry *entry)
{
+ struct smu_context *smu = adev->powerplay.pp_handle;
uint32_t errcode, instlo;
instlo = REG_GET_FIELD(entry->regs[MCA_REG_IDX_IPID], MCMP1_IPIDT0, InstanceIdLo);
if (instlo != 0x03b30400)
return false;
- errcode = REG_GET_FIELD(entry->regs[MCA_REG_IDX_STATUS], MCMP1_STATUST0, ErrorCode);
+ if (!(adev->flags & AMD_IS_APU) && smu->smc_fw_version >= 0x00555600) {
+ errcode = MCA_REG__SYND__ERRORINFORMATION(entry->regs[MCA_REG_IDX_SYND]);
+ errcode &= 0xff;
+ } else {
+ errcode = REG_GET_FIELD(entry->regs[MCA_REG_IDX_STATUS], MCMP1_STATUST0, ErrorCode);
+ }
+
return mca_smu_check_error_code(adev, mca_ras, errcode);
}
*value = 0;
break;
case METRICS_AVERAGE_UCLK:
- *value = 0;
+ *value = metrics->MemclkFrequency;
break;
case METRICS_AVERAGE_FCLK:
*value = metrics->FclkFrequency;
break;
+ case METRICS_AVERAGE_VPECLK:
+ *value = metrics->VpeclkFrequency;
+ break;
+ case METRICS_AVERAGE_IPUCLK:
+ *value = metrics->IpuclkFrequency;
+ break;
+ case METRICS_AVERAGE_MPIPUCLK:
+ *value = metrics->MpipuclkFrequency;
+ break;
case METRICS_AVERAGE_GFXACTIVITY:
*value = metrics->GfxActivity / 100;
break;
*value = metrics->SocTemperature / 100 *
SMU_TEMPERATURE_UNITS_PER_CENTIGRADES;
break;
- case METRICS_THROTTLER_STATUS:
- *value = 0;
+ case METRICS_THROTTLER_RESIDENCY_PROCHOT:
+ *value = metrics->ThrottleResidency_PROCHOT;
+ break;
+ case METRICS_THROTTLER_RESIDENCY_SPL:
+ *value = metrics->ThrottleResidency_SPL;
+ break;
+ case METRICS_THROTTLER_RESIDENCY_FPPT:
+ *value = metrics->ThrottleResidency_FPPT;
+ break;
+ case METRICS_THROTTLER_RESIDENCY_SPPT:
+ *value = metrics->ThrottleResidency_SPPT;
+ break;
+ case METRICS_THROTTLER_RESIDENCY_THM_CORE:
+ *value = metrics->ThrottleResidency_THM_CORE;
+ break;
+ case METRICS_THROTTLER_RESIDENCY_THM_GFX:
+ *value = metrics->ThrottleResidency_THM_GFX;
+ break;
+ case METRICS_THROTTLER_RESIDENCY_THM_SOC:
+ *value = metrics->ThrottleResidency_THM_SOC;
break;
case METRICS_VOLTAGE_VDDGFX:
*value = 0;
sizeof(uint16_t) * 16);
gpu_metrics->average_dram_reads = metrics.DRAMReads;
gpu_metrics->average_dram_writes = metrics.DRAMWrites;
+ gpu_metrics->average_ipu_reads = metrics.IpuReads;
+ gpu_metrics->average_ipu_writes = metrics.IpuWrites;
gpu_metrics->average_socket_power = metrics.SocketPower;
gpu_metrics->average_ipu_power = metrics.IpuPower;
gpu_metrics->average_gfx_power = metrics.GfxPower;
gpu_metrics->average_dgpu_power = metrics.dGpuPower;
gpu_metrics->average_all_core_power = metrics.AllCorePower;
+ gpu_metrics->average_sys_power = metrics.Psys;
memcpy(&gpu_metrics->average_core_power[0],
&metrics.CorePower[0],
sizeof(uint16_t) * 16);
gpu_metrics->average_fclk_frequency = metrics.FclkFrequency;
gpu_metrics->average_vclk_frequency = metrics.VclkFrequency;
gpu_metrics->average_ipuclk_frequency = metrics.IpuclkFrequency;
+ gpu_metrics->average_uclk_frequency = metrics.MemclkFrequency;
+ gpu_metrics->average_mpipu_frequency = metrics.MpipuclkFrequency;
memcpy(&gpu_metrics->current_coreclk[0],
&metrics.CoreFrequency[0],
gpu_metrics->current_core_maxfreq = metrics.InfrastructureCpuMaxFreq;
gpu_metrics->current_gfx_maxfreq = metrics.InfrastructureGfxMaxFreq;
+ gpu_metrics->throttle_residency_prochot = metrics.ThrottleResidency_PROCHOT;
+ gpu_metrics->throttle_residency_spl = metrics.ThrottleResidency_SPL;
+ gpu_metrics->throttle_residency_fppt = metrics.ThrottleResidency_FPPT;
+ gpu_metrics->throttle_residency_sppt = metrics.ThrottleResidency_SPPT;
+ gpu_metrics->throttle_residency_thm_core = metrics.ThrottleResidency_THM_CORE;
+ gpu_metrics->throttle_residency_thm_gfx = metrics.ThrottleResidency_THM_GFX;
+ gpu_metrics->throttle_residency_thm_soc = metrics.ThrottleResidency_THM_SOC;
+
gpu_metrics->time_filter_alphavalue = metrics.FilterAlphaValue;
gpu_metrics->system_clock_counter = ktime_get_boottime_ns();
#define smu_get_default_config_table_settings(smu, config_table) smu_ppt_funcs(get_default_config_table_settings, -EOPNOTSUPP, smu, config_table)
#define smu_set_config_table(smu, config_table) smu_ppt_funcs(set_config_table, -EOPNOTSUPP, smu, config_table)
#define smu_init_pptable_microcode(smu) smu_ppt_funcs(init_pptable_microcode, 0, smu)
+#define smu_notify_rlc_state(smu, en) smu_ppt_funcs(notify_rlc_state, 0, smu, en)
#endif
#endif
return container_of(connector, struct ast_sil164_connector, base);
}
+struct ast_bmc_connector {
+ struct drm_connector base;
+ struct drm_connector *physical_connector;
+};
+
+static inline struct ast_bmc_connector *
+to_ast_bmc_connector(struct drm_connector *connector)
+{
+ return container_of(connector, struct ast_bmc_connector, base);
+}
+
/*
* Device
*/
} astdp;
struct {
struct drm_encoder encoder;
- struct drm_connector connector;
+ struct ast_bmc_connector bmc_connector;
} bmc;
} output;
.destroy = drm_encoder_cleanup,
};
+static int ast_bmc_connector_helper_detect_ctx(struct drm_connector *connector,
+ struct drm_modeset_acquire_ctx *ctx,
+ bool force)
+{
+ struct ast_bmc_connector *bmc_connector = to_ast_bmc_connector(connector);
+ struct drm_connector *physical_connector = bmc_connector->physical_connector;
+
+ /*
+ * Most user-space compositors cannot handle more than one connected
+ * connector per CRTC. Hence, we only mark the BMC as connected if the
+ * physical connector is disconnected. If the physical connector's status
+ * is connected or unknown, the BMC remains disconnected. This has no
+ * effect on the output of the BMC.
+ *
+ * FIXME: Remove this logic once user-space compositors can handle more
+ * than one connector per CRTC. The BMC should always be connected.
+ */
+
+ if (physical_connector && physical_connector->status == connector_status_disconnected)
+ return connector_status_connected;
+
+ return connector_status_disconnected;
+}
+
static int ast_bmc_connector_helper_get_modes(struct drm_connector *connector)
{
return drm_add_modes_noedid(connector, 4096, 4096);
static const struct drm_connector_helper_funcs ast_bmc_connector_helper_funcs = {
.get_modes = ast_bmc_connector_helper_get_modes,
+ .detect_ctx = ast_bmc_connector_helper_detect_ctx,
};
static const struct drm_connector_funcs ast_bmc_connector_funcs = {
.atomic_destroy_state = drm_atomic_helper_connector_destroy_state,
};
-static int ast_bmc_output_init(struct ast_device *ast)
+static int ast_bmc_connector_init(struct drm_device *dev,
+ struct ast_bmc_connector *bmc_connector,
+ struct drm_connector *physical_connector)
+{
+ struct drm_connector *connector = &bmc_connector->base;
+ int ret;
+
+ ret = drm_connector_init(dev, connector, &ast_bmc_connector_funcs,
+ DRM_MODE_CONNECTOR_VIRTUAL);
+ if (ret)
+ return ret;
+
+ drm_connector_helper_add(connector, &ast_bmc_connector_helper_funcs);
+
+ bmc_connector->physical_connector = physical_connector;
+
+ return 0;
+}
+
+static int ast_bmc_output_init(struct ast_device *ast,
+ struct drm_connector *physical_connector)
{
struct drm_device *dev = &ast->base;
struct drm_crtc *crtc = &ast->crtc;
struct drm_encoder *encoder = &ast->output.bmc.encoder;
- struct drm_connector *connector = &ast->output.bmc.connector;
+ struct ast_bmc_connector *bmc_connector = &ast->output.bmc.bmc_connector;
+ struct drm_connector *connector = &bmc_connector->base;
int ret;
ret = drm_encoder_init(dev, encoder,
return ret;
encoder->possible_crtcs = drm_crtc_mask(crtc);
- ret = drm_connector_init(dev, connector, &ast_bmc_connector_funcs,
- DRM_MODE_CONNECTOR_VIRTUAL);
+ ret = ast_bmc_connector_init(dev, bmc_connector, physical_connector);
if (ret)
return ret;
- drm_connector_helper_add(connector, &ast_bmc_connector_helper_funcs);
-
ret = drm_connector_attach_encoder(connector, encoder);
if (ret)
return ret;
int ast_mode_config_init(struct ast_device *ast)
{
struct drm_device *dev = &ast->base;
+ struct drm_connector *physical_connector = NULL;
int ret;
ret = drmm_mode_config_init(dev);
ret = ast_vga_output_init(ast);
if (ret)
return ret;
+ physical_connector = &ast->output.vga.vga_connector.base;
}
if (ast->tx_chip_types & AST_TX_SIL164_BIT) {
ret = ast_sil164_output_init(ast);
if (ret)
return ret;
+ physical_connector = &ast->output.sil164.sil164_connector.base;
}
if (ast->tx_chip_types & AST_TX_DP501_BIT) {
ret = ast_dp501_output_init(ast);
if (ret)
return ret;
+ physical_connector = &ast->output.dp501.connector;
}
if (ast->tx_chip_types & AST_TX_ASTDP_BIT) {
ret = ast_astdp_output_init(ast);
if (ret)
return ret;
+ physical_connector = &ast->output.astdp.connector;
}
- ret = ast_bmc_output_init(ast);
+ ret = ast_bmc_output_init(ast, physical_connector);
if (ret)
return ret;
select REGMAP_I2C
select DRM_PANEL
select DRM_MIPI_DSI
+ select VIDEOMODE_HELPERS
help
Toshiba TC358768AXBG/TC358778XBG DSI bridge chip driver.
* Copyright (C) 2017 Broadcom
*/
-#include <linux/device.h>
-
#include <drm/drm_atomic_helper.h>
#include <drm/drm_bridge.h>
#include <drm/drm_connector.h>
struct drm_bridge bridge;
struct drm_connector connector;
struct drm_panel *panel;
- struct device_link *link;
u32 connector_type;
};
{
struct panel_bridge *panel_bridge = drm_bridge_to_panel_bridge(bridge);
struct drm_connector *connector = &panel_bridge->connector;
- struct drm_panel *panel = panel_bridge->panel;
- struct drm_device *drm_dev = bridge->dev;
int ret;
- panel_bridge->link = device_link_add(drm_dev->dev, panel->dev,
- DL_FLAG_STATELESS);
- if (!panel_bridge->link) {
- DRM_ERROR("Failed to add device link between %s and %s\n",
- dev_name(drm_dev->dev), dev_name(panel->dev));
- return -EINVAL;
- }
-
if (flags & DRM_BRIDGE_ATTACH_NO_CONNECTOR)
return 0;
if (!bridge->encoder) {
DRM_ERROR("Missing encoder\n");
- device_link_del(panel_bridge->link);
return -ENODEV;
}
panel_bridge->connector_type);
if (ret) {
DRM_ERROR("Failed to initialize connector\n");
- device_link_del(panel_bridge->link);
return ret;
}
struct panel_bridge *panel_bridge = drm_bridge_to_panel_bridge(bridge);
struct drm_connector *connector = &panel_bridge->connector;
- device_link_del(panel_bridge->link);
-
/*
* Cleanup the connector if we know it was initialized.
*
certifi==2023.7.22
charset-normalizer==3.2.0
idna==3.4
-pip==23.2.1
+pip==23.3
python-gitlab==3.15.0
requests==2.31.0
requests-toolbelt==1.0.0
ruamel.yaml.clib==0.2.7
setuptools==68.0.0
tenacity==8.2.3
-urllib3==2.0.4
-wheel==0.41.1
\ No newline at end of file
+urllib3==2.0.7
+wheel==0.41.1
return ret;
drm_atomic_helper_async_commit(dev, state);
- drm_atomic_helper_cleanup_planes(dev, state);
+ drm_atomic_helper_unprepare_planes(dev, state);
return 0;
}
return 0;
err:
- drm_atomic_helper_cleanup_planes(dev, state);
+ drm_atomic_helper_unprepare_planes(dev, state);
return ret;
}
EXPORT_SYMBOL(drm_atomic_helper_commit);
}
EXPORT_SYMBOL(drm_atomic_helper_prepare_planes);
+/**
+ * drm_atomic_helper_unprepare_planes - release plane resources on aborts
+ * @dev: DRM device
+ * @state: atomic state object with old state structures
+ *
+ * This function cleans up plane state, specifically framebuffers, from the
+ * atomic state. It undoes the effects of drm_atomic_helper_prepare_planes()
+ * when aborting an atomic commit. For cleaning up after a successful commit
+ * use drm_atomic_helper_cleanup_planes().
+ */
+void drm_atomic_helper_unprepare_planes(struct drm_device *dev,
+ struct drm_atomic_state *state)
+{
+ struct drm_plane *plane;
+ struct drm_plane_state *new_plane_state;
+ int i;
+
+ for_each_new_plane_in_state(state, plane, new_plane_state, i) {
+ const struct drm_plane_helper_funcs *funcs = plane->helper_private;
+
+ if (funcs->end_fb_access)
+ funcs->end_fb_access(plane, new_plane_state);
+ }
+
+ for_each_new_plane_in_state(state, plane, new_plane_state, i) {
+ const struct drm_plane_helper_funcs *funcs = plane->helper_private;
+
+ if (funcs->cleanup_fb)
+ funcs->cleanup_fb(plane, new_plane_state);
+ }
+}
+EXPORT_SYMBOL(drm_atomic_helper_unprepare_planes);
+
static bool plane_crtc_active(const struct drm_plane_state *state)
{
return state->crtc && state->crtc->state->active;
funcs->atomic_flush(crtc, old_state);
}
+
+ /*
+ * Signal end of framebuffer access here before hw_done. After hw_done,
+ * a later commit might have already released the plane state.
+ */
+ for_each_old_plane_in_state(old_state, plane, old_plane_state, i) {
+ const struct drm_plane_helper_funcs *funcs = plane->helper_private;
+
+ if (funcs->end_fb_access)
+ funcs->end_fb_access(plane, old_plane_state);
+ }
}
EXPORT_SYMBOL(drm_atomic_helper_commit_planes);
* configuration. Hence the old configuration must be perserved in @old_state to
* be able to call this function.
*
- * This function must also be called on the new state when the atomic update
- * fails at any point after calling drm_atomic_helper_prepare_planes().
+ * This function may not be called on the new state when the atomic update
+ * fails at any point after calling drm_atomic_helper_prepare_planes(). Use
+ * drm_atomic_helper_unprepare_planes() in this case.
*/
void drm_atomic_helper_cleanup_planes(struct drm_device *dev,
struct drm_atomic_state *old_state)
{
struct drm_plane *plane;
- struct drm_plane_state *old_plane_state, *new_plane_state;
+ struct drm_plane_state *old_plane_state;
int i;
- for_each_oldnew_plane_in_state(old_state, plane, old_plane_state, new_plane_state, i) {
+ for_each_old_plane_in_state(old_state, plane, old_plane_state, i) {
const struct drm_plane_helper_funcs *funcs = plane->helper_private;
- if (funcs->end_fb_access)
- funcs->end_fb_access(plane, new_plane_state);
- }
-
- for_each_oldnew_plane_in_state(old_state, plane, old_plane_state, new_plane_state, i) {
- const struct drm_plane_helper_funcs *funcs;
- struct drm_plane_state *plane_state;
-
- /*
- * This might be called before swapping when commit is aborted,
- * in which case we have to cleanup the new state.
- */
- if (old_plane_state == plane->state)
- plane_state = new_plane_state;
- else
- plane_state = old_plane_state;
-
- funcs = plane->helper_private;
-
if (funcs->cleanup_fb)
- funcs->cleanup_fb(plane, plane_state);
+ funcs->cleanup_fb(plane, old_plane_state);
}
}
EXPORT_SYMBOL(drm_atomic_helper_cleanup_planes);
drm_master_check_perm(struct drm_device *dev, struct drm_file *file_priv)
{
if (file_priv->was_master &&
- rcu_access_pointer(file_priv->pid) == task_pid(current))
+ rcu_access_pointer(file_priv->pid) == task_tgid(current))
return 0;
if (!capable(CAP_SYS_ADMIN))
struct drm_mode_set set;
uint32_t __user *set_connectors_ptr;
struct drm_modeset_acquire_ctx ctx;
- int ret;
- int i;
+ int ret, i, num_connectors = 0;
if (!drm_core_check_feature(dev, DRIVER_MODESET))
return -EOPNOTSUPP;
connector->name);
connector_set[i] = connector;
+ num_connectors++;
}
}
set.y = crtc_req->y;
set.mode = mode;
set.connectors = connector_set;
- set.num_connectors = crtc_req->count_connectors;
+ set.num_connectors = num_connectors;
set.fb = fb;
if (drm_drv_uses_atomic_modeset(dev))
drm_framebuffer_put(fb);
if (connector_set) {
- for (i = 0; i < crtc_req->count_connectors; i++) {
+ for (i = 0; i < num_connectors; i++) {
if (connector_set[i])
drm_connector_put(connector_set[i]);
}
override = drm_edid_override_get(connector);
if (override) {
- num_modes = drm_edid_connector_update(connector, override);
+ if (drm_edid_connector_update(connector, override) == 0)
+ num_modes = drm_edid_connector_add_modes(connector);
drm_edid_free(override);
-// SPDX-License-Identifier: GPL-2.0 OR MIT
+// SPDX-License-Identifier: GPL-2.0-only OR MIT
/*
* Copyright (c) 2022 Red Hat.
*
DMI_EXACT_MATCH(DMI_PRODUCT_VERSION, "IdeaPad Duet 3 10IGL5"),
},
.driver_data = (void *)&lcd1200x1920_rightside_up,
+ }, { /* Lenovo Legion Go 8APU1 */
+ .matches = {
+ DMI_EXACT_MATCH(DMI_SYS_VENDOR, "LENOVO"),
+ DMI_EXACT_MATCH(DMI_PRODUCT_VERSION, "Legion Go 8APU1"),
+ },
+ .driver_data = (void *)&lcd1600x2560_leftside_up,
}, { /* Lenovo Yoga Book X90F / X90L */
.matches = {
DMI_EXACT_MATCH(DMI_SYS_VENDOR, "Intel Corporation"),
}
EXPORT_SYMBOL(drm_gem_dmabuf_release);
-/*
+/**
* drm_gem_prime_fd_to_handle - PRIME import function for GEM drivers
* @dev: drm_device to import into
* @file_priv: drm file-private structure
*
* Returns 0 on success or a negative error code on failure.
*/
-static int drm_gem_prime_fd_to_handle(struct drm_device *dev,
- struct drm_file *file_priv, int prime_fd,
- uint32_t *handle)
+int drm_gem_prime_fd_to_handle(struct drm_device *dev,
+ struct drm_file *file_priv, int prime_fd,
+ uint32_t *handle)
{
struct dma_buf *dma_buf;
struct drm_gem_object *obj;
dma_buf_put(dma_buf);
return ret;
}
+EXPORT_SYMBOL(drm_gem_prime_fd_to_handle);
int drm_prime_fd_to_handle_ioctl(struct drm_device *dev, void *data,
struct drm_file *file_priv)
return dmabuf;
}
-/*
+/**
* drm_gem_prime_handle_to_fd - PRIME export function for GEM drivers
* @dev: dev to export the buffer from
* @file_priv: drm file-private structure
* The actual exporting from GEM object to a dma-buf is done through the
* &drm_gem_object_funcs.export callback.
*/
-static int drm_gem_prime_handle_to_fd(struct drm_device *dev,
- struct drm_file *file_priv, uint32_t handle,
- uint32_t flags,
- int *prime_fd)
+int drm_gem_prime_handle_to_fd(struct drm_device *dev,
+ struct drm_file *file_priv, uint32_t handle,
+ uint32_t flags,
+ int *prime_fd)
{
struct drm_gem_object *obj;
int ret = 0;
return ret;
}
+EXPORT_SYMBOL(drm_gem_prime_handle_to_fd);
int drm_prime_handle_to_fd_ioctl(struct drm_device *dev, void *data,
struct drm_file *file_priv)
* @obj: GEM object to export
* @flags: flags like DRM_CLOEXEC and DRM_RDWR
*
- * This is the implementation of the &drm_gem_object_funcs.export functions
- * for GEM drivers using the PRIME helpers. It is used as the default for
- * drivers that do not set their own.
+ * This is the implementation of the &drm_gem_object_funcs.export functions for GEM drivers
+ * using the PRIME helpers. It is used as the default in
+ * drm_gem_prime_handle_to_fd().
*/
struct dma_buf *drm_gem_prime_export(struct drm_gem_object *obj,
int flags)
* @dev: drm_device to import into
* @dma_buf: dma-buf object to import
*
- * This is the implementation of the gem_prime_import functions for GEM
- * drivers using the PRIME helpers. It is the default for drivers that do
- * not set their own &drm_driver.gem_prime_import.
+ * This is the implementation of the gem_prime_import functions for GEM drivers
+ * using the PRIME helpers. Drivers can use this as their
+ * &drm_driver.gem_prime_import implementation. It is used as the default
+ * implementation in drm_gem_prime_fd_to_handle().
*
* Drivers must arrange to call drm_prime_gem_destroy() from their
* &drm_gem_object_funcs.free hook when using this function.
return 0;
if (!priv->mapping) {
- void *mapping;
+ void *mapping = NULL;
if (IS_ENABLED(CONFIG_ARM_DMA_USE_IOMMU))
mapping = arm_iommu_create_mapping(&platform_bus_type,
EXYNOS_DEV_ADDR_START, EXYNOS_DEV_ADDR_SIZE);
else if (IS_ENABLED(CONFIG_IOMMU_DMA))
mapping = iommu_get_domain_for_dev(priv->dma_dev);
- else
- mapping = ERR_PTR(-ENODEV);
- if (IS_ERR(mapping))
- return PTR_ERR(mapping);
+ if (!mapping)
+ return -ENODEV;
priv->mapping = mapping;
}
return ret;
crtc = exynos_drm_crtc_get_by_type(drm_dev, EXYNOS_DISPLAY_TYPE_HDMI);
+ if (IS_ERR(crtc))
+ return PTR_ERR(crtc);
crtc->pipe_clk = &hdata->phy_clk;
ret = hdmi_create_connector(encoder);
static enum drm_mode_status gen11_dsi_mode_valid(struct drm_connector *connector,
struct drm_display_mode *mode)
{
+ struct drm_i915_private *i915 = to_i915(connector->dev);
+ enum drm_mode_status status;
+
+ status = intel_cpu_transcoder_mode_valid(i915, mode);
+ if (status != MODE_OK)
+ return status;
+
/* FIXME: DSC? */
return intel_dsi_mode_valid(connector, mode);
}
struct drm_device *dev = connector->dev;
struct drm_i915_private *dev_priv = to_i915(dev);
int max_dotclk = dev_priv->max_dotclk_freq;
+ enum drm_mode_status status;
int max_clock;
+ status = intel_cpu_transcoder_mode_valid(dev_priv, mode);
+ if (status != MODE_OK)
+ return status;
+
if (mode->flags & DRM_MODE_FLAG_DBLSCAN)
return MODE_NO_DBLESCAN;
val |= XELPDP_FORWARD_CLOCK_UNGATE;
- if (is_hdmi_frl(crtc_state->port_clock))
+ if (intel_crtc_has_type(crtc_state, INTEL_OUTPUT_HDMI) &&
+ is_hdmi_frl(crtc_state->port_clock))
val |= XELPDP_DDI_CLOCK_SELECT(XELPDP_DDI_CLOCK_SELECT_DIV18CLK);
else
val |= XELPDP_DDI_CLOCK_SELECT(XELPDP_DDI_CLOCK_SELECT_MAXPCLK);
static bool planes_enabling(const struct intel_crtc_state *old_crtc_state,
const struct intel_crtc_state *new_crtc_state)
{
+ if (!new_crtc_state->hw.active)
+ return false;
+
return is_enabling(active_planes, old_crtc_state, new_crtc_state);
}
static bool planes_disabling(const struct intel_crtc_state *old_crtc_state,
const struct intel_crtc_state *new_crtc_state)
{
+ if (!old_crtc_state->hw.active)
+ return false;
+
return is_disabling(active_planes, old_crtc_state, new_crtc_state);
}
static bool vrr_enabling(const struct intel_crtc_state *old_crtc_state,
const struct intel_crtc_state *new_crtc_state)
{
+ if (!new_crtc_state->hw.active)
+ return false;
+
return is_enabling(vrr.enable, old_crtc_state, new_crtc_state) ||
(new_crtc_state->vrr.enable &&
(new_crtc_state->update_m_n || new_crtc_state->update_lrr ||
static bool vrr_disabling(const struct intel_crtc_state *old_crtc_state,
const struct intel_crtc_state *new_crtc_state)
{
+ if (!old_crtc_state->hw.active)
+ return false;
+
return is_disabling(vrr.enable, old_crtc_state, new_crtc_state) ||
(old_crtc_state->vrr.enable &&
(new_crtc_state->update_m_n || new_crtc_state->update_lrr ||
if (!active)
goto out;
- intel_dsc_get_config(pipe_config);
intel_bigjoiner_get_config(pipe_config);
+ intel_dsc_get_config(pipe_config);
if (!transcoder_is_dsi(pipe_config->cpu_transcoder) ||
DISPLAY_VER(dev_priv) >= 11)
return -EINVAL;
}
+ /*
+ * FIXME: Bigjoiner+async flip is busted currently.
+ * Remove this check once the issues are fixed.
+ */
+ if (new_crtc_state->bigjoiner_pipes) {
+ drm_dbg_kms(&i915->drm,
+ "[CRTC:%d:%s] async flip disallowed with bigjoiner\n",
+ crtc->base.base.id, crtc->base.name);
+ return -EINVAL;
+ }
+
for_each_oldnew_intel_plane_in_state(state, plane, old_plane_state,
new_plane_state, i) {
if (plane->pipe != crtc->pipe)
if (!intel_crtc_needs_modeset(new_crtc_state))
continue;
+ intel_pre_plane_update(state, crtc);
+
if (!old_crtc_state->hw.active)
continue;
- intel_pre_plane_update(state, crtc);
intel_crtc_disable_planes(state, crtc);
}
for_each_new_intel_crtc_in_state(state, crtc, new_crtc_state, i)
intel_color_cleanup_commit(new_crtc_state);
- drm_atomic_helper_cleanup_planes(dev, &state->base);
+ drm_atomic_helper_unprepare_planes(dev, &state->base);
intel_runtime_pm_put(&dev_priv->runtime_pm, state->wakeref);
return ret;
}
mode->vtotal > vtotal_max)
return MODE_V_ILLEGAL;
+ return MODE_OK;
+}
+
+enum drm_mode_status intel_cpu_transcoder_mode_valid(struct drm_i915_private *dev_priv,
+ const struct drm_display_mode *mode)
+{
+ /*
+ * Additional transcoder timing limits,
+ * excluding BXT/GLK DSI transcoders.
+ */
if (DISPLAY_VER(dev_priv) >= 5) {
if (mode->hdisplay < 64 ||
mode->htotal - mode->hdisplay < 32)
intel_mode_valid_max_plane_size(struct drm_i915_private *dev_priv,
const struct drm_display_mode *mode,
bool bigjoiner);
+enum drm_mode_status
+intel_cpu_transcoder_mode_valid(struct drm_i915_private *i915,
+ const struct drm_display_mode *mode);
enum phy intel_port_to_phy(struct drm_i915_private *i915, enum port port);
bool is_trans_port_sync_mode(const struct intel_crtc_state *state);
bool is_trans_port_sync_master(const struct intel_crtc_state *state);
enum intel_dmc_id dmc_id;
/* TODO: check if the following applies to all D13+ platforms. */
- if (!IS_DG2(i915) && !IS_TIGERLAKE(i915))
+ if (!IS_TIGERLAKE(i915))
return;
for_each_dmc_id(dmc_id) {
intel_de_rmw(i915, PIPEDMC_CONTROL(pipe), PIPEDMC_ENABLE, 0);
}
+static bool is_dmc_evt_ctl_reg(struct drm_i915_private *i915,
+ enum intel_dmc_id dmc_id, i915_reg_t reg)
+{
+ u32 offset = i915_mmio_reg_offset(reg);
+ u32 start = i915_mmio_reg_offset(DMC_EVT_CTL(i915, dmc_id, 0));
+ u32 end = i915_mmio_reg_offset(DMC_EVT_CTL(i915, dmc_id, DMC_EVENT_HANDLER_COUNT_GEN12));
+
+ return offset >= start && offset < end;
+}
+
+static bool disable_dmc_evt(struct drm_i915_private *i915,
+ enum intel_dmc_id dmc_id,
+ i915_reg_t reg, u32 data)
+{
+ if (!is_dmc_evt_ctl_reg(i915, dmc_id, reg))
+ return false;
+
+ /* keep all pipe DMC events disabled by default */
+ if (dmc_id != DMC_FW_MAIN)
+ return true;
+
+ return false;
+}
+
+static u32 dmc_mmiodata(struct drm_i915_private *i915,
+ struct intel_dmc *dmc,
+ enum intel_dmc_id dmc_id, int i)
+{
+ if (disable_dmc_evt(i915, dmc_id,
+ dmc->dmc_info[dmc_id].mmioaddr[i],
+ dmc->dmc_info[dmc_id].mmiodata[i]))
+ return REG_FIELD_PREP(DMC_EVT_CTL_TYPE_MASK,
+ DMC_EVT_CTL_TYPE_EDGE_0_1) |
+ REG_FIELD_PREP(DMC_EVT_CTL_EVENT_ID_MASK,
+ DMC_EVT_CTL_EVENT_ID_FALSE);
+ else
+ return dmc->dmc_info[dmc_id].mmiodata[i];
+}
+
/**
* intel_dmc_load_program() - write the firmware from memory to register.
* @i915: i915 drm device.
for_each_dmc_id(dmc_id) {
for (i = 0; i < dmc->dmc_info[dmc_id].mmio_count; i++) {
intel_de_write(i915, dmc->dmc_info[dmc_id].mmioaddr[i],
- dmc->dmc_info[dmc_id].mmiodata[i]);
+ dmc_mmiodata(i915, dmc, dmc_id, i));
}
}
enum drm_mode_status status;
bool dsc = false, bigjoiner = false;
+ status = intel_cpu_transcoder_mode_valid(dev_priv, mode);
+ if (status != MODE_OK)
+ return status;
+
if (mode->flags & DRM_MODE_FLAG_DBLCLK)
return MODE_H_ILLEGAL;
* (eg. Acer Chromebook C710), so we'll check it only if multiple
* ports are attempting to use the same AUX CH, according to VBT.
*/
- if (intel_bios_dp_has_shared_aux_ch(encoder->devdata) &&
- !intel_digital_port_connected(encoder)) {
+ if (intel_bios_dp_has_shared_aux_ch(encoder->devdata)) {
/*
* If this fails, presume the DPCD answer came
* from some other port using the same AUX CH.
* FIXME maybe cleaner to check this before the
* DPCD read? Would need sort out the VDD handling...
*/
- drm_info(&dev_priv->drm,
- "[ENCODER:%d:%s] HPD is down, disabling eDP\n",
- encoder->base.base.id, encoder->base.name);
- goto out_vdd_off;
+ if (!intel_digital_port_connected(encoder)) {
+ drm_info(&dev_priv->drm,
+ "[ENCODER:%d:%s] HPD is down, disabling eDP\n",
+ encoder->base.base.id, encoder->base.name);
+ goto out_vdd_off;
+ }
+
+ /*
+ * Unfortunately even the HPD based detection fails on
+ * eg. Asus B360M-A (CFL+CNP), so as a last resort fall
+ * back to checking for a VGA branch device. Only do this
+ * on known affected platforms to minimize false positives.
+ */
+ if (DISPLAY_VER(dev_priv) == 9 && drm_dp_is_branch(intel_dp->dpcd) &&
+ (intel_dp->dpcd[DP_DOWNSTREAMPORT_PRESENT] & DP_DWN_STRM_PORT_TYPE_MASK) ==
+ DP_DWN_STRM_PORT_TYPE_ANALOG) {
+ drm_info(&dev_priv->drm,
+ "[ENCODER:%d:%s] VGA converter detected, disabling eDP\n",
+ encoder->base.base.id, encoder->base.name);
+ goto out_vdd_off;
+ }
}
mutex_lock(&dev_priv->drm.mode_config.mutex);
const struct intel_crtc_state *crtc_state,
u8 link_bw, u8 rate_select)
{
- u8 link_config[2];
+ u8 lane_count = crtc_state->lane_count;
- /* Write the link configuration data */
- link_config[0] = link_bw;
- link_config[1] = crtc_state->lane_count;
if (crtc_state->enhanced_framing)
- link_config[1] |= DP_LANE_COUNT_ENHANCED_FRAME_EN;
- drm_dp_dpcd_write(&intel_dp->aux, DP_LINK_BW_SET, link_config, 2);
+ lane_count |= DP_LANE_COUNT_ENHANCED_FRAME_EN;
+
+ if (link_bw) {
+ /* DP and eDP v1.3 and earlier link bw set method. */
+ u8 link_config[] = { link_bw, lane_count };
- /* eDP 1.4 rate select method. */
- if (!link_bw)
- drm_dp_dpcd_write(&intel_dp->aux, DP_LINK_RATE_SET,
- &rate_select, 1);
+ drm_dp_dpcd_write(&intel_dp->aux, DP_LINK_BW_SET, link_config,
+ ARRAY_SIZE(link_config));
+ } else {
+ /*
+ * eDP v1.4 and later link rate set method.
+ *
+ * eDP v1.4x sinks shall ignore DP_LINK_RATE_SET if
+ * DP_LINK_BW_SET is set. Avoid writing DP_LINK_BW_SET.
+ *
+ * eDP v1.5 sinks allow choosing either, and the last choice
+ * shall be active.
+ */
+ drm_dp_dpcd_writeb(&intel_dp->aux, DP_LANE_COUNT_SET, lane_count);
+ drm_dp_dpcd_writeb(&intel_dp->aux, DP_LINK_RATE_SET, rate_select);
+ }
}
/*
return 0;
}
+ *status = intel_cpu_transcoder_mode_valid(dev_priv, mode);
+ if (*status != MODE_OK)
+ return 0;
+
if (mode->flags & DRM_MODE_FLAG_DBLSCAN) {
*status = MODE_NO_DBLESCAN;
return 0;
if (intel_dp_need_bigjoiner(intel_dp, mode->hdisplay, target_clock)) {
bigjoiner = true;
max_dotclk *= 2;
+
+ /* TODO: add support for bigjoiner */
+ *status = MODE_CLOCK_HIGH;
+ return 0;
}
if (DISPLAY_VER(dev_priv) >= 10 &&
* Big joiner configuration needs DSC for TGL which is not true for
* XE_LPD where uncompressed joiner is supported.
*/
- if (DISPLAY_VER(dev_priv) < 13 && bigjoiner && !dsc)
- return MODE_CLOCK_HIGH;
+ if (DISPLAY_VER(dev_priv) < 13 && bigjoiner && !dsc) {
+ *status = MODE_CLOCK_HIGH;
+ return 0;
+ }
- if (mode_rate > max_rate && !dsc)
- return MODE_CLOCK_HIGH;
+ if (mode_rate > max_rate && !dsc) {
+ *status = MODE_CLOCK_HIGH;
+ return 0;
+ }
*status = intel_mode_valid_max_plane_size(dev_priv, mode, false);
return 0;
intel_connector->port = port;
drm_dp_mst_get_port_malloc(port);
+ /*
+ * TODO: set the AUX for the actual MST port decompressing the stream.
+ * At the moment the driver only supports enabling this globally in the
+ * first downstream MST branch, via intel_dp's (root port) AUX.
+ */
+ intel_connector->dp.dsc_decompression_aux = &intel_dp->aux;
+ intel_dp_mst_read_decompression_port_dsc_caps(intel_dp, intel_connector);
+
connector = &intel_connector->base;
ret = drm_connector_init(dev, connector, &intel_dp_mst_connector_funcs,
DRM_MODE_CONNECTOR_DisplayPort);
drm_connector_helper_add(connector, &intel_dp_mst_connector_helper_funcs);
- /*
- * TODO: set the AUX for the actual MST port decompressing the stream.
- * At the moment the driver only supports enabling this globally in the
- * first downstream MST branch, via intel_dp's (root port) AUX.
- */
- intel_connector->dp.dsc_decompression_aux = &intel_dp->aux;
- intel_dp_mst_read_decompression_port_dsc_caps(intel_dp, intel_connector);
-
for_each_pipe(dev_priv, pipe) {
struct drm_encoder *enc =
&intel_dp->mst_encoders[pipe]->base.base;
}
static void _intel_dsb_commit(struct intel_dsb *dsb, u32 ctrl,
- unsigned int dewake_scanline)
+ int dewake_scanline)
{
struct intel_crtc *crtc = dsb->crtc;
struct drm_i915_private *dev_priv = to_i915(crtc->base.dev);
struct drm_display_mode *mode)
{
struct intel_connector *connector = to_intel_connector(_connector);
+ struct drm_i915_private *i915 = to_i915(connector->base.dev);
struct intel_dvo *intel_dvo = intel_attached_dvo(connector);
const struct drm_display_mode *fixed_mode =
intel_panel_fixed_mode(connector, mode);
int max_dotclk = to_i915(connector->base.dev)->max_dotclk_freq;
int target_clock = mode->clock;
+ enum drm_mode_status status;
+
+ status = intel_cpu_transcoder_mode_valid(i915, mode);
+ if (status != MODE_OK)
+ return status;
if (mode->flags & DRM_MODE_FLAG_DBLSCAN)
return MODE_NO_DBLESCAN;
struct drm_i915_private *i915 = to_i915(fb->base.dev);
unsigned int stride_tiles;
- if (IS_ALDERLAKE_P(i915) || DISPLAY_VER(i915) >= 14)
+ if ((IS_ALDERLAKE_P(i915) || DISPLAY_VER(i915) >= 14) &&
+ src_stride_tiles < dst_stride_tiles)
stride_tiles = src_stride_tiles;
else
stride_tiles = dst_stride_tiles;
size += remap_info->size;
} else {
- unsigned int dst_stride = plane_view_dst_stride_tiles(fb, color_plane,
- remap_info->width);
+ unsigned int dst_stride;
+
+ /*
+ * The hardware automagically calculates the CCS AUX surface
+ * stride from the main surface stride so can't really remap a
+ * smaller subset (unless we'd remap in whole AUX page units).
+ */
+ if (intel_fb_needs_pot_stride_remap(fb) &&
+ intel_fb_is_ccs_modifier(fb->base.modifier))
+ dst_stride = remap_info->src_stride;
+ else
+ dst_stride = remap_info->width;
+
+ dst_stride = plane_view_dst_stride_tiles(fb, color_plane, dst_stride);
assign_chk_ovf(i915, remap_info->dst_stride, dst_stride);
color_plane_info->mapping_stride = dst_stride *
bool ycbcr_420_only;
enum intel_output_format sink_format;
+ status = intel_cpu_transcoder_mode_valid(dev_priv, mode);
+ if (status != MODE_OK)
+ return status;
+
if ((mode->flags & DRM_MODE_FLAG_3D_MASK) == DRM_MODE_FLAG_3D_FRAME_PACKING)
clock *= 2;
struct drm_display_mode *mode)
{
struct intel_connector *connector = to_intel_connector(_connector);
+ struct drm_i915_private *i915 = to_i915(connector->base.dev);
const struct drm_display_mode *fixed_mode =
intel_panel_fixed_mode(connector, mode);
int max_pixclk = to_i915(connector->base.dev)->max_dotclk_freq;
enum drm_mode_status status;
+ status = intel_cpu_transcoder_mode_valid(i915, mode);
+ if (status != MODE_OK)
+ return status;
+
if (mode->flags & DRM_MODE_FLAG_DBLSCAN)
return MODE_NO_DBLESCAN;
intel_sdvo_mode_valid(struct drm_connector *connector,
struct drm_display_mode *mode)
{
+ struct drm_i915_private *i915 = to_i915(connector->dev);
struct intel_sdvo *intel_sdvo = intel_attached_sdvo(to_intel_connector(connector));
struct intel_sdvo_connector *intel_sdvo_connector =
to_intel_sdvo_connector(connector);
- int max_dotclk = to_i915(connector->dev)->max_dotclk_freq;
bool has_hdmi_sink = intel_has_hdmi_sink(intel_sdvo_connector, connector->state);
+ int max_dotclk = i915->max_dotclk_freq;
+ enum drm_mode_status status;
int clock = mode->clock;
+ status = intel_cpu_transcoder_mode_valid(i915, mode);
+ if (status != MODE_OK)
+ return status;
+
if (mode->flags & DRM_MODE_FLAG_DBLSCAN)
return MODE_NO_DBLESCAN;
intel_tv_mode_valid(struct drm_connector *connector,
struct drm_display_mode *mode)
{
+ struct drm_i915_private *i915 = to_i915(connector->dev);
const struct tv_mode *tv_mode = intel_tv_mode_find(connector->state);
- int max_dotclk = to_i915(connector->dev)->max_dotclk_freq;
+ int max_dotclk = i915->max_dotclk_freq;
+ enum drm_mode_status status;
+
+ status = intel_cpu_transcoder_mode_valid(i915, mode);
+ if (status != MODE_OK)
+ return status;
if (mode->flags & DRM_MODE_FLAG_DBLSCAN)
return MODE_NO_DBLESCAN;
{
struct drm_plane *plane = NULL;
struct intel_plane *intel_plane;
- struct intel_plane_state *plane_state = NULL;
struct intel_crtc_scaler_state *scaler_state =
&crtc_state->scaler_state;
struct drm_atomic_state *drm_state = crtc_state->uapi.state;
/* walkthrough scaler_users bits and start assigning scalers */
for (i = 0; i < sizeof(scaler_state->scaler_users) * 8; i++) {
+ struct intel_plane_state *plane_state = NULL;
int *scaler_id;
const char *name;
int idx, ret;
.destroy = intel_dsi_encoder_destroy,
};
+static enum drm_mode_status vlv_dsi_mode_valid(struct drm_connector *connector,
+ struct drm_display_mode *mode)
+{
+ struct drm_i915_private *i915 = to_i915(connector->dev);
+
+ if (IS_VALLEYVIEW(i915) || IS_CHERRYVIEW(i915)) {
+ enum drm_mode_status status;
+
+ status = intel_cpu_transcoder_mode_valid(i915, mode);
+ if (status != MODE_OK)
+ return status;
+ }
+
+ return intel_dsi_mode_valid(connector, mode);
+}
+
static const struct drm_connector_helper_funcs intel_dsi_connector_helper_funcs = {
.get_modes = intel_dsi_get_modes,
- .mode_valid = intel_dsi_mode_valid,
+ .mode_valid = vlv_dsi_mode_valid,
.atomic_check = intel_digital_connector_atomic_check,
};
llist_add(&engine->uabi_llist, &engine->i915->uabi_engines_llist);
}
-static const u8 uabi_classes[] = {
+#define I915_NO_UABI_CLASS ((u16)(-1))
+
+static const u16 uabi_classes[] = {
[RENDER_CLASS] = I915_ENGINE_CLASS_RENDER,
[COPY_ENGINE_CLASS] = I915_ENGINE_CLASS_COPY,
[VIDEO_DECODE_CLASS] = I915_ENGINE_CLASS_VIDEO,
[VIDEO_ENHANCEMENT_CLASS] = I915_ENGINE_CLASS_VIDEO_ENHANCE,
[COMPUTE_CLASS] = I915_ENGINE_CLASS_COMPUTE,
+ [OTHER_CLASS] = I915_NO_UABI_CLASS, /* Not exposed to users, no uabi class. */
};
static int engine_cmp(void *priv, const struct list_head *A,
void intel_engines_driver_register(struct drm_i915_private *i915)
{
+ u16 name_instance, other_instance = 0;
struct legacy_ring ring = {};
struct list_head *it, *next;
struct rb_node **p, *prev;
if (intel_gt_has_unrecoverable_error(engine->gt))
continue; /* ignore incomplete engines */
- /*
- * We don't want to expose the GSC engine to the users, but we
- * still rename it so it is easier to identify in the debug logs
- */
- if (engine->id == GSC0) {
- engine_rename(engine, "gsc", 0);
- continue;
- }
-
GEM_BUG_ON(engine->class >= ARRAY_SIZE(uabi_classes));
engine->uabi_class = uabi_classes[engine->class];
+ if (engine->uabi_class == I915_NO_UABI_CLASS) {
+ name_instance = other_instance++;
+ } else {
+ GEM_BUG_ON(engine->uabi_class >=
+ ARRAY_SIZE(i915->engine_uabi_class_count));
+ name_instance =
+ i915->engine_uabi_class_count[engine->uabi_class]++;
+ }
+ engine->uabi_instance = name_instance;
- GEM_BUG_ON(engine->uabi_class >=
- ARRAY_SIZE(i915->engine_uabi_class_count));
- engine->uabi_instance =
- i915->engine_uabi_class_count[engine->uabi_class]++;
-
- /* Replace the internal name with the final user facing name */
+ /*
+ * Replace the internal name with the final user and log facing
+ * name.
+ */
engine_rename(engine,
intel_engine_class_repr(engine->class),
- engine->uabi_instance);
+ name_instance);
+
+ if (engine->uabi_class == I915_NO_UABI_CLASS)
+ continue;
rb_link_node(&engine->uabi_node, prev, p);
rb_insert_color(&engine->uabi_node, &i915->uabi_engines);
err:
i915_probe_error(i915, "Failed to initialize %s! (%d)\n", gtdef->name, ret);
- intel_gt_release_all(i915);
-
return ret;
}
return 0;
}
-void intel_gt_release_all(struct drm_i915_private *i915)
-{
- struct intel_gt *gt;
- unsigned int id;
-
- for_each_gt(gt, i915, id)
- i915->gt[id] = NULL;
-}
-
void intel_gt_info_print(const struct intel_gt_info *info,
struct drm_printer *p)
{
if (msg)
drm_notice(&engine->i915->drm,
"Resetting %s for %s\n", engine->name, msg);
- atomic_inc(&engine->i915->gpu_error.reset_engine_count[engine->uabi_class]);
+ i915_increase_reset_engine_count(&engine->i915->gpu_error, engine);
ret = intel_gt_reset_engine(engine);
if (ret) {
if (match) {
intel_engine_set_hung_context(e, ce);
engine_mask |= e->mask;
- atomic_inc(&i915->gpu_error.reset_engine_count[e->uabi_class]);
+ i915_increase_reset_engine_count(&i915->gpu_error,
+ e);
}
}
} else {
intel_engine_set_hung_context(ce->engine, ce);
engine_mask = ce->engine->mask;
- atomic_inc(&i915->gpu_error.reset_engine_count[ce->engine->uabi_class]);
+ i915_increase_reset_engine_count(&i915->gpu_error, ce->engine);
}
with_intel_runtime_pm(&i915->runtime_pm, wakeref)
ret = i915_driver_mmio_probe(i915);
if (ret < 0)
- goto out_tiles_cleanup;
+ goto out_runtime_pm_put;
ret = i915_driver_hw_probe(i915);
if (ret < 0)
i915_ggtt_driver_late_release(i915);
out_cleanup_mmio:
i915_driver_mmio_release(i915);
-out_tiles_cleanup:
- intel_gt_release_all(i915);
out_runtime_pm_put:
enable_rpm_wakeref_asserts(&i915->runtime_pm);
i915_driver_late_release(i915);
#include "display/intel_display_device.h"
#include "gt/intel_engine.h"
+#include "gt/intel_engine_types.h"
#include "gt/intel_gt_types.h"
#include "gt/uc/intel_uc_fw.h"
atomic_t reset_count;
/** Number of times an engine has been reset */
- atomic_t reset_engine_count[I915_NUM_ENGINES];
+ atomic_t reset_engine_count[MAX_ENGINE_CLASS];
};
struct drm_i915_error_state_buf {
static inline u32 i915_reset_engine_count(struct i915_gpu_error *error,
const struct intel_engine_cs *engine)
{
- return atomic_read(&error->reset_engine_count[engine->uabi_class]);
+ return atomic_read(&error->reset_engine_count[engine->class]);
+}
+
+static inline void
+i915_increase_reset_engine_count(struct i915_gpu_error *error,
+ const struct intel_engine_cs *engine)
+{
+ atomic_inc(&error->reset_engine_count[engine->class]);
}
#define CORE_DUMP_FLAG_NONE 0x0
* tau4 = (4 | x) << y
* but add 2 when doing the final right shift to account for units
*/
- tau4 = ((1 << x_w) | x) << y;
+ tau4 = (u64)((1 << x_w) | x) << y;
/* val in hwmon interface units (millisec) */
out = mul_u64_u32_shr(tau4, SF_TIME, hwmon->scl_shift_time + x_w);
r = FIELD_PREP(PKG_MAX_WIN, PKG_MAX_WIN_DEFAULT);
x = REG_FIELD_GET(PKG_MAX_WIN_X, r);
y = REG_FIELD_GET(PKG_MAX_WIN_Y, r);
- tau4 = ((1 << x_w) | x) << y;
+ tau4 = (u64)((1 << x_w) | x) << y;
max_win = mul_u64_u32_shr(tau4, SF_TIME, hwmon->scl_shift_time + x_w);
if (val > max_win)
}
for_each_engine(engine, gt, id)
- t->reset_engine[id] =
- i915_reset_engine_count(&i915->gpu_error, engine);
+ t->reset_engine[i][id] =
+ i915_reset_engine_count(&i915->gpu_error,
+ engine);
}
t->reset_global = i915_reset_count(&i915->gpu_error);
for_each_gt(gt, i915, i) {
for_each_engine(engine, gt, id) {
- if (t->reset_engine[id] ==
+ if (t->reset_engine[i][id] ==
i915_reset_engine_count(&i915->gpu_error, engine))
continue;
gt_err(gt, "%s(%s): engine '%s' was reset %d times!\n",
t->func, t->name, engine->name,
i915_reset_engine_count(&i915->gpu_error, engine) -
- t->reset_engine[id]);
+ t->reset_engine[i][id]);
return -EIO;
}
}
#ifndef IGT_LIVE_TEST_H
#define IGT_LIVE_TEST_H
+#include "gt/intel_gt_defines.h" /* for I915_MAX_GT */
#include "gt/intel_engine.h" /* for I915_NUM_ENGINES */
struct drm_i915_private;
const char *name;
unsigned int reset_global;
- unsigned int reset_engine[I915_NUM_ENGINES];
+ unsigned int reset_engine[I915_MAX_GT][I915_NUM_ENGINES];
};
/*
/* Disable RELAY mode to pass the processed image */
cfg_val &= ~GAMMA_RELAY_MODE;
- cfg_val = readl(gamma->regs + DISP_GAMMA_CFG);
+ writel(cfg_val, gamma->regs + DISP_GAMMA_CFG);
}
void mtk_gamma_config(struct device *dev, unsigned int w,
crtc);
struct mtk_crtc_state *mtk_crtc_state = to_mtk_crtc_state(crtc_state);
struct mtk_drm_crtc *mtk_crtc = to_mtk_crtc(crtc);
+ unsigned long flags;
if (mtk_crtc->event && mtk_crtc_state->base.event)
DRM_ERROR("new event while there is still a pending event\n");
if (mtk_crtc_state->base.event) {
mtk_crtc_state->base.event->pipe = drm_crtc_index(crtc);
WARN_ON(drm_crtc_vblank_get(crtc) != 0);
+
+ spin_lock_irqsave(&crtc->dev->event_lock, flags);
mtk_crtc->event = mtk_crtc_state->base.event;
+ spin_unlock_irqrestore(&crtc->dev->event_lock, flags);
+
mtk_crtc_state->base.event = NULL;
}
}
struct device *mtk_drm_crtc_dma_dev_get(struct drm_crtc *crtc)
{
- struct mtk_drm_crtc *mtk_crtc = to_mtk_crtc(crtc);
+ struct mtk_drm_crtc *mtk_crtc = NULL;
+
+ if (!crtc)
+ return NULL;
+
+ mtk_crtc = to_mtk_crtc(crtc);
+ if (!mtk_crtc)
+ return NULL;
return mtk_crtc->dma_dev;
}
struct mtk_drm_private *private = drm->dev_private;
struct mtk_drm_private *priv_n;
struct device *dma_dev = NULL;
+ struct drm_crtc *crtc;
int ret, i, j;
if (drm_firmware_drivers_only())
}
/* Use OVL device for all DMA memory allocations */
- dma_dev = mtk_drm_crtc_dma_dev_get(drm_crtc_from_index(drm, 0));
+ crtc = drm_crtc_from_index(drm, 0);
+ if (crtc)
+ dma_dev = mtk_drm_crtc_dma_dev_get(crtc);
if (!dma_dev) {
ret = -ENODEV;
dev_err(drm->dev, "Need at least one OVL device\n");
.min_llcc_ib = 0,
.min_dram_ib = 800000,
.danger_lut_tbl = {0xf, 0xffff, 0x0},
+ .safe_lut_tbl = {0xfe00, 0xfe00, 0xffff},
.qos_lut_tbl = {
{.nentry = ARRAY_SIZE(sc8180x_qos_linear),
.entries = sc8180x_qos_linear
return 0;
fail:
- if (mdp5_kms)
- mdp5_destroy(mdp5_kms);
+ mdp5_destroy(mdp5_kms);
return ret;
}
/* reset video pattern flag on disconnect */
if (!hpd) {
dp->panel->video_test = false;
- drm_dp_set_subconnector_property(dp->dp_display.connector,
- connector_status_disconnected,
- dp->panel->dpcd, dp->panel->downstream_ports);
+ if (!dp->dp_display.is_edp)
+ drm_dp_set_subconnector_property(dp->dp_display.connector,
+ connector_status_disconnected,
+ dp->panel->dpcd,
+ dp->panel->downstream_ports);
}
dp->dp_display.is_connected = hpd;
dp_link_process_request(dp->link);
- drm_dp_set_subconnector_property(dp->dp_display.connector, connector_status_connected,
- dp->panel->dpcd, dp->panel->downstream_ports);
+ if (!dp->dp_display.is_edp)
+ drm_dp_set_subconnector_property(dp->dp_display.connector,
+ connector_status_connected,
+ dp->panel->dpcd,
+ dp->panel->downstream_ports);
edid = dp->panel->edid;
if (IS_ERR(connector))
return connector;
+ if (!dp_display->is_edp)
+ drm_connector_attach_dp_subconnector_property(connector);
+
drm_connector_attach_encoder(connector, encoder);
return connector;
if ((phy->cfg->quirks & DSI_PHY_7NM_QUIRK_V5_2)) {
if (phy->cphy_mode) {
vreg_ctrl_0 = 0x45;
- vreg_ctrl_1 = 0x45;
+ vreg_ctrl_1 = 0x41;
glbl_rescode_top_ctrl = 0x00;
glbl_rescode_bot_ctrl = 0x00;
} else {
if (ret)
goto err_msm_uninit;
- drm_kms_helper_poll_init(ddev);
-
if (priv->kms_init) {
drm_kms_helper_poll_init(ddev);
msm_fbdev_setup(ddev);
err_cleanup:
if (ret)
- drm_atomic_helper_cleanup_planes(dev, state);
+ drm_atomic_helper_unprepare_planes(dev, state);
done:
pm_runtime_put_autosuspend(dev->dev);
return ret;
int index_nr;
spinlock_t refs_lock;
- spinlock_t list_lock;
+ rwlock_t list_lock;
int *refs;
struct list_head ntfy;
int types_nr, int index_nr, struct nvkm_event *event)
{
spin_lock_init(&event->refs_lock);
- spin_lock_init(&event->list_lock);
+ rwlock_init(&event->list_lock);
return __nvkm_event_init(func, subdev, types_nr, index_nr, event);
}
* DEALINGS IN THE SOFTWARE.
*/
+/**
+ * msgqTxHeader -- TX queue data structure
+ * @version: the version of this structure, must be 0
+ * @size: the size of the entire queue, including this header
+ * @msgSize: the padded size of queue element, 16 is minimum
+ * @msgCount: the number of elements in this queue
+ * @writePtr: head index of this queue
+ * @flags: 1 = swap the RX pointers
+ * @rxHdrOff: offset of readPtr in this structure
+ * @entryOff: offset of beginning of queue (msgqRxHeader), relative to
+ * beginning of this structure
+ *
+ * The command queue is a queue of RPCs that are sent from the driver to the
+ * GSP. The status queue is a queue of messages/responses from GSP-RM to the
+ * driver. Although the driver allocates memory for both queues, the command
+ * queue is owned by the driver and the status queue is owned by GSP-RM. In
+ * addition, the headers of the two queues must not share the same 4K page.
+ *
+ * Each queue is prefixed with this data structure. The idea is that a queue
+ * and its header are written to only by their owner. That is, only the
+ * driver writes to the command queue and command queue header, and only the
+ * GSP writes to the status (receive) queue and its header.
+ *
+ * This is enforced by the concept of "swapping" the RX pointers. This is
+ * why the 'flags' field must be set to 1. 'rxHdrOff' is how the GSP knows
+ * where the where the tail pointer of its status queue.
+ *
+ * When the driver writes a new RPC to the command queue, it updates writePtr.
+ * When it reads a new message from the status queue, it updates readPtr. In
+ * this way, the GSP knows when a new command is in the queue (it polls
+ * writePtr) and it knows how much free space is in the status queue (it
+ * checks readPtr). The driver never cares about how much free space is in
+ * the status queue.
+ *
+ * As usual, producers write to the head pointer, and consumers read from the
+ * tail pointer. When head == tail, the queue is empty.
+ *
+ * So to summarize:
+ * command.writePtr = head of command queue
+ * command.readPtr = tail of status queue
+ * status.writePtr = head of status queue
+ * status.readPtr = tail of command queue
+ */
typedef struct
{
NvU32 version; // queue version
NvU32 entryOff; // Offset of entries from start of backing store.
} msgqTxHeader;
+/**
+ * msgqRxHeader - RX queue data structure
+ * @readPtr: tail index of the other queue
+ *
+ * Although this is a separate struct, it could easily be merged into
+ * msgqTxHeader. msgqTxHeader.rxHdrOff is simply the offset of readPtr
+ * from the beginning of msgqTxHeader.
+ */
typedef struct
{
NvU32 readPtr; // message id of last message read
{
NvU32 size;
NvU32 numEntries;
- PACKED_REGISTRY_ENTRY entries[0];
+ PACKED_REGISTRY_ENTRY entries[] __counted_by(numEntries);
} PACKED_REGISTRY_TABLE;
#endif
(!vmm->page[i].host || vmm->page[i].shift > PAGE_SHIFT))
continue;
- if (pi < 0)
- pi = i;
+ /* pick the last one as it will be smallest. */
+ pi = i;
+
/* Stop once the buffer is larger than the current page size. */
if (*size >= 1ULL << vmm->page[i].shift)
break;
if (nouveau_modeset != 2) {
ret = nvif_disp_ctor(&drm->client.device, "kmsDisp", 0, &disp->disp);
+ /* no display hw */
+ if (ret == -ENODEV) {
+ ret = 0;
+ goto disp_create_err;
+ }
if (!ret && (disp->disp.outp_mask || drm->vbios.dcb.entries)) {
nouveau_display_create_properties(dev);
static void
nvkm_event_ntfy_remove(struct nvkm_event_ntfy *ntfy)
{
- spin_lock_irq(&ntfy->event->list_lock);
+ write_lock_irq(&ntfy->event->list_lock);
list_del_init(&ntfy->head);
- spin_unlock_irq(&ntfy->event->list_lock);
+ write_unlock_irq(&ntfy->event->list_lock);
}
static void
nvkm_event_ntfy_insert(struct nvkm_event_ntfy *ntfy)
{
- spin_lock_irq(&ntfy->event->list_lock);
+ write_lock_irq(&ntfy->event->list_lock);
list_add_tail(&ntfy->head, &ntfy->event->ntfy);
- spin_unlock_irq(&ntfy->event->list_lock);
+ write_unlock_irq(&ntfy->event->list_lock);
}
static void
return;
nvkm_trace(event->subdev, "event: ntfy %08x on %d\n", bits, id);
- spin_lock_irqsave(&event->list_lock, flags);
+ read_lock_irqsave(&event->list_lock, flags);
list_for_each_entry_safe(ntfy, ntmp, &event->ntfy, head) {
if (ntfy->id == id && ntfy->bits & bits) {
}
}
- spin_unlock_irqrestore(&event->list_lock, flags);
+ read_unlock_irqrestore(&event->list_lock, flags);
}
void
/* Ensure an ior is hooked up to this outp already */
ior = outp->func->inherit(outp);
- if (!ior)
+ if (!ior || !ior->arm.head)
return -ENODEV;
/* With iors, there will be a separate output path for each type of connector - and all of
struct nvkm_runl *runl;
struct nvkm_engn *engn;
u32 cgids = 2048;
- u32 chids = 2048 / CHID_PER_USERD;
+ u32 chids = 2048;
int ret;
NV2080_CTRL_FIFO_GET_DEVICE_INFO_TABLE_PARAMS *ctrl;
}
ret = r535_gsp_cmdq_push(gsp, rpc);
- if (ret) {
- mutex_unlock(&gsp->cmdq.mutex);
+ if (ret)
return ERR_PTR(ret);
- }
if (wait) {
msg = r535_gsp_msg_recv(gsp, fn, repc);
struct nvfw_gsp_rpc *rpc;
rpc = r535_gsp_cmdq_get(gsp, ALIGN(sizeof(*rpc) + argc, sizeof(u64)));
- if (!rpc)
- return NULL;
+ if (IS_ERR(rpc))
+ return ERR_CAST(rpc);
rpc->header_version = 0x03000000;
rpc->signature = ('C' << 24) | ('P' << 16) | ('R' << 8) | 'V';
char *strings;
int str_offset;
int i;
- size_t rpc_size = sizeof(*rpc) + sizeof(rpc->entries[0]) * NV_GSP_REG_NUM_ENTRIES;
+ size_t rpc_size = struct_size(rpc, entries, NV_GSP_REG_NUM_ENTRIES);
/* add strings + null terminator */
for (i = 0; i < NV_GSP_REG_NUM_ENTRIES; i++)
r535_gsp_acpi_mux_id(acpi_handle handle, u32 id, MUX_METHOD_DATA_ELEMENT *mode,
MUX_METHOD_DATA_ELEMENT *part)
{
- acpi_handle iter = NULL, handle_mux;
+ acpi_handle iter = NULL, handle_mux = NULL;
acpi_status status;
unsigned long long value;
return 0;
}
+/**
+ * r535_gsp_msg_run_cpu_sequencer() -- process I/O commands from the GSP
+ *
+ * The GSP sequencer is a list of I/O commands that the GSP can send to
+ * the driver to perform for various purposes. The most common usage is to
+ * perform a special mid-initialization reset.
+ */
static int
r535_gsp_msg_run_cpu_sequencer(void *priv, u32 fn, void *repv, u32 repc)
{
return id;
}
+/**
+ * create_pte_array() - creates a PTE array of a physically contiguous buffer
+ * @ptes: pointer to the array
+ * @addr: base address of physically contiguous buffer (GSP_PAGE_SIZE aligned)
+ * @size: size of the buffer
+ *
+ * GSP-RM sometimes expects physically-contiguous buffers to have an array of
+ * "PTEs" for each page in that buffer. Although in theory that allows for
+ * the buffer to be physically discontiguous, GSP-RM does not currently
+ * support that.
+ *
+ * In this case, the PTEs are DMA addresses of each page of the buffer. Since
+ * the buffer is physically contiguous, calculating all the PTEs is simple
+ * math.
+ *
+ * See memdescGetPhysAddrsForGpu()
+ */
static void create_pte_array(u64 *ptes, dma_addr_t addr, size_t size)
{
unsigned int num_pages = DIV_ROUND_UP_ULL(size, GSP_PAGE_SIZE);
ptes[i] = (u64)addr + (i << GSP_PAGE_SHIFT);
}
+/**
+ * r535_gsp_libos_init() -- create the libos arguments structure
+ *
+ * The logging buffers are byte queues that contain encoded printf-like
+ * messages from GSP-RM. They need to be decoded by a special application
+ * that can parse the buffers.
+ *
+ * The 'loginit' buffer contains logs from early GSP-RM init and
+ * exception dumps. The 'logrm' buffer contains the subsequent logs. Both are
+ * written to directly by GSP-RM and can be any multiple of GSP_PAGE_SIZE.
+ *
+ * The physical address map for the log buffer is stored in the buffer
+ * itself, starting with offset 1. Offset 0 contains the "put" pointer.
+ *
+ * The GSP only understands 4K pages (GSP_PAGE_SIZE), so even if the kernel is
+ * configured for a larger page size (e.g. 64K pages), we need to give
+ * the GSP an array of 4K pages. Fortunately, since the buffer is
+ * physically contiguous, it's simple math to calculate the addresses.
+ *
+ * The buffers must be a multiple of GSP_PAGE_SIZE. GSP-RM also currently
+ * ignores the @kind field for LOGINIT, LOGINTR, and LOGRM, but expects the
+ * buffers to be physically contiguous anyway.
+ *
+ * The memory allocated for the arguments must remain until the GSP sends the
+ * init_done RPC.
+ *
+ * See _kgspInitLibosLoggingStructures (allocates memory for buffers)
+ * See kgspSetupLibosInitArgs_IMPL (creates pLibosInitArgs[] array)
+ */
static int
r535_gsp_libos_init(struct nvkm_gsp *gsp)
{
nvkm_gsp_mem_dtor(gsp, &rx3->mem[i]);
}
+/**
+ * nvkm_gsp_radix3_sg - build a radix3 table from a S/G list
+ *
+ * The GSP uses a three-level page table, called radix3, to map the firmware.
+ * Each 64-bit "pointer" in the table is either the bus address of an entry in
+ * the next table (for levels 0 and 1) or the bus address of the next page in
+ * the GSP firmware image itself.
+ *
+ * Level 0 contains a single entry in one page that points to the first page
+ * of level 1.
+ *
+ * Level 1, since it's also only one page in size, contains up to 512 entries,
+ * one for each page in Level 2.
+ *
+ * Level 2 can be up to 512 pages in size, and each of those entries points to
+ * the next page of the firmware image. Since there can be up to 512*512
+ * pages, that limits the size of the firmware to 512*512*GSP_PAGE_SIZE = 1GB.
+ *
+ * Internally, the GSP has its window into system memory, but the base
+ * physical address of the aperture is not 0. In fact, it varies depending on
+ * the GPU architecture. Since the GPU is a PCI device, this window is
+ * accessed via DMA and is therefore bound by IOMMU translation. The end
+ * result is that GSP-RM must translate the bus addresses in the table to GSP
+ * physical addresses. All this should happen transparently.
+ *
+ * Returns 0 on success, or negative error code
+ *
+ * See kgspCreateRadix3_IMPL
+ */
static int
nvkm_gsp_radix3_sg(struct nvkm_device *device, struct sg_table *sgt, u64 size,
struct nvkm_gsp_radix3 *rx3)
#include <subdev/mmu.h>
struct gk20a_instobj {
- struct nvkm_memory memory;
+ struct nvkm_instobj base;
struct nvkm_mm_node *mn;
struct gk20a_instmem *imem;
/* CPU mapping */
u32 *vaddr;
};
-#define gk20a_instobj(p) container_of((p), struct gk20a_instobj, memory)
+#define gk20a_instobj(p) container_of((p), struct gk20a_instobj, base.memory)
/*
* Used for objects allocated using the DMA API
list_del(&obj->vaddr_node);
vunmap(obj->base.vaddr);
obj->base.vaddr = NULL;
- imem->vaddr_use -= nvkm_memory_size(&obj->base.memory);
+ imem->vaddr_use -= nvkm_memory_size(&obj->base.base.memory);
nvkm_debug(&imem->base.subdev, "vaddr used: %x/%x\n", imem->vaddr_use,
imem->vaddr_max);
}
{
struct gk20a_instobj *node = gk20a_instobj(memory);
struct nvkm_vmm_map map = {
- .memory = &node->memory,
+ .memory = &node->base.memory,
.offset = offset,
.mem = node->mn,
};
return -ENOMEM;
*_node = &node->base;
- nvkm_memory_ctor(&gk20a_instobj_func_dma, &node->base.memory);
- node->base.memory.ptrs = &gk20a_instobj_ptrs;
+ nvkm_memory_ctor(&gk20a_instobj_func_dma, &node->base.base.memory);
+ node->base.base.memory.ptrs = &gk20a_instobj_ptrs;
node->base.vaddr = dma_alloc_attrs(dev, npages << PAGE_SHIFT,
&node->handle, GFP_KERNEL,
*_node = &node->base;
node->dma_addrs = (void *)(node->pages + npages);
- nvkm_memory_ctor(&gk20a_instobj_func_iommu, &node->base.memory);
- node->base.memory.ptrs = &gk20a_instobj_ptrs;
+ nvkm_memory_ctor(&gk20a_instobj_func_iommu, &node->base.base.memory);
+ node->base.base.memory.ptrs = &gk20a_instobj_ptrs;
/* Allocate backing memory */
for (i = 0; i < npages; i++) {
else
ret = gk20a_instobj_ctor_dma(imem, size >> PAGE_SHIFT,
align, &node);
- *pmemory = node ? &node->memory : NULL;
+ *pmemory = node ? &node->base.memory : NULL;
if (ret)
return ret;
type |= 0x00000001; /* PAGE_ALL */
if (atomic_read(&vmm->engref[NVKM_SUBDEV_BAR]))
- type |= 0x00000004; /* HUB_ONLY */
+ type |= 0x00000006; /* HUB_ONLY | ALL PDB (hack) */
mutex_lock(&vmm->mmu->mutex);
.mode_flags = MIPI_DSI_MODE_VIDEO | MIPI_DSI_MODE_VIDEO_SYNC_PULSE |
MIPI_DSI_MODE_LPM,
.init_cmds = auo_b101uan08_3_init_cmd,
+ .lp11_before_reset = true,
};
static const struct drm_display_mode boe_tv105wum_nw0_default_mode = {
.mode_flags = MIPI_DSI_MODE_VIDEO | MIPI_DSI_MODE_VIDEO_SYNC_PULSE |
MIPI_DSI_MODE_LPM,
.init_cmds = starry_qfh032011_53g_init_cmd,
+ .lp11_before_reset = true,
};
static const struct drm_display_mode starry_himax83102_j02_default_mode = {
- .clock = 161600,
+ .clock = 162850,
.hdisplay = 1200,
- .hsync_start = 1200 + 40,
- .hsync_end = 1200 + 40 + 20,
- .htotal = 1200 + 40 + 20 + 40,
+ .hsync_start = 1200 + 50,
+ .hsync_end = 1200 + 50 + 20,
+ .htotal = 1200 + 50 + 20 + 50,
.vdisplay = 1920,
.vsync_start = 1920 + 116,
.vsync_end = 1920 + 116 + 8,
static const struct ltk050h3146w_desc ltk050h3148w_data = {
.mode = <k050h3148w_mode,
.init = ltk050h3148w_init_sequence,
- .mode_flags = MIPI_DSI_MODE_VIDEO_SYNC_PULSE,
+ .mode_flags = MIPI_DSI_MODE_VIDEO_SYNC_PULSE | MIPI_DSI_MODE_VIDEO_BURST,
};
static int ltk050h3146w_init_sequence(struct ltk050h3146w *ctx)
return dev_err_probe(dev, -EPROBE_DEFER, "cannot get secondary DSI host\n");
pinfo->dsi[1] = mipi_dsi_device_register_full(dsi1_host, info);
- if (!pinfo->dsi[1]) {
+ if (IS_ERR(pinfo->dsi[1])) {
dev_err(dev, "cannot get secondary DSI device\n");
- return -ENODEV;
+ return PTR_ERR(pinfo->dsi[1]);
}
}
static const struct display_timing innolux_g101ice_l01_timing = {
.pixelclock = { 60400000, 71100000, 74700000 },
.hactive = { 1280, 1280, 1280 },
- .hfront_porch = { 41, 80, 100 },
- .hback_porch = { 40, 79, 99 },
- .hsync_len = { 1, 1, 1 },
+ .hfront_porch = { 30, 60, 70 },
+ .hback_porch = { 30, 60, 70 },
+ .hsync_len = { 22, 40, 60 },
.vactive = { 800, 800, 800 },
- .vfront_porch = { 5, 11, 14 },
- .vback_porch = { 4, 11, 14 },
- .vsync_len = { 1, 1, 1 },
+ .vfront_porch = { 3, 8, 14 },
+ .vback_porch = { 3, 8, 14 },
+ .vsync_len = { 4, 7, 12 },
.flags = DISPLAY_FLAGS_DE_HIGH,
};
.disable = 200,
},
.bus_format = MEDIA_BUS_FMT_RGB888_1X7X4_SPWG,
+ .bus_flags = DRM_BUS_FLAG_DE_HIGH,
.connector_type = DRM_MODE_CONNECTOR_LVDS,
};
static int panfrost_devfreq_target(struct device *dev, unsigned long *freq,
u32 flags)
{
+ struct panfrost_device *ptdev = dev_get_drvdata(dev);
struct dev_pm_opp *opp;
+ int err;
opp = devfreq_recommended_opp(dev, freq, flags);
if (IS_ERR(opp))
return PTR_ERR(opp);
dev_pm_opp_put(opp);
- return dev_pm_opp_set_rate(dev, *freq);
+ err = dev_pm_opp_set_rate(dev, *freq);
+ if (!err)
+ ptdev->pfdevfreq.current_frequency = *freq;
+
+ return err;
}
static void panfrost_devfreq_reset(struct panfrost_devfreq *pfdevfreq)
spin_lock_irqsave(&pfdevfreq->lock, irqflags);
panfrost_devfreq_update_utilization(pfdevfreq);
- pfdevfreq->current_frequency = status->current_frequency;
status->total_time = ktime_to_ns(ktime_add(pfdevfreq->busy_time,
pfdevfreq->idle_time));
panfrost_devfreq_profile.initial_freq = cur_freq;
+ /*
+ * We could wait until panfrost_devfreq_target() to set this value, but
+ * since the simple_ondemand governor works asynchronously, there's a
+ * chance by the time someone opens the device's fdinfo file, current
+ * frequency hasn't been updated yet, so let's just do an early set.
+ */
+ pfdevfreq->current_frequency = cur_freq;
+
/*
* Set the recommend OPP this will enable and configure the regulator
* if any and will avoid a switch off by regulator_late_cleanup()
struct panfrost_gem_object *bo = to_panfrost_bo(obj);
enum drm_gem_object_status res = 0;
- if (bo->base.pages)
+ if (bo->base.base.import_attach || bo->base.pages)
res |= DRM_GEM_OBJECT_RESIDENT;
if (bo->base.madv == PANFROST_MADV_DONTNEED)
VOP_REG_SET(vop, common, cfg_done, 1);
}
-static bool has_rb_swapped(uint32_t format)
+static bool has_rb_swapped(uint32_t version, uint32_t format)
{
switch (format) {
case DRM_FORMAT_XBGR8888:
case DRM_FORMAT_ABGR8888:
- case DRM_FORMAT_BGR888:
case DRM_FORMAT_BGR565:
return true;
+ /*
+ * full framework (IP version 3.x) only need rb swapped for RGB888 and
+ * little framework (IP version 2.x) only need rb swapped for BGR888,
+ * check for 3.x to also only rb swap BGR888 for unknown vop version
+ */
+ case DRM_FORMAT_RGB888:
+ return VOP_MAJOR(version) == 3;
+ case DRM_FORMAT_BGR888:
+ return VOP_MAJOR(version) != 3;
default:
return false;
}
VOP_WIN_SET(vop, win, dsp_info, dsp_info);
VOP_WIN_SET(vop, win, dsp_st, dsp_st);
- rb_swap = has_rb_swapped(fb->format->format);
+ rb_swap = has_rb_swapped(vop->data->version, fb->format->format);
VOP_WIN_SET(vop, win, rb_swap, rb_swap);
/*
config GREYBUS_BEAGLEPLAY
tristate "Greybus BeaglePlay driver"
depends on SERIAL_DEV_BUS
+ select CRC_CCITT
help
Select this option if you have a BeaglePlay where CC1352
co-processor acts as Greybus SVC.
{ "AONE" },
{ "GANSS" },
{ "Hailuck" },
+ { "Jamesdonkey" },
+ { "A3R" },
+ { "hfd.cn" },
+ { "WKB603" },
};
static bool apple_is_non_apple_keyboard(struct hid_device *hdev)
return 0;
}
-static int asus_kbd_set_report(struct hid_device *hdev, u8 *buf, size_t buf_size)
+static int asus_kbd_set_report(struct hid_device *hdev, const u8 *buf, size_t buf_size)
{
unsigned char *dmabuf;
int ret;
static int asus_kbd_init(struct hid_device *hdev)
{
- u8 buf[] = { FEATURE_KBD_REPORT_ID, 0x41, 0x53, 0x55, 0x53, 0x20, 0x54,
+ const u8 buf[] = { FEATURE_KBD_REPORT_ID, 0x41, 0x53, 0x55, 0x53, 0x20, 0x54,
0x65, 0x63, 0x68, 0x2e, 0x49, 0x6e, 0x63, 0x2e, 0x00 };
int ret;
static int asus_kbd_get_functions(struct hid_device *hdev,
unsigned char *kbd_func)
{
- u8 buf[] = { FEATURE_KBD_REPORT_ID, 0x05, 0x20, 0x31, 0x00, 0x08 };
+ const u8 buf[] = { FEATURE_KBD_REPORT_ID, 0x05, 0x20, 0x31, 0x00, 0x08 };
u8 *readbuf;
int ret;
static int rog_nkey_led_init(struct hid_device *hdev)
{
- u8 buf_init_start[] = { FEATURE_KBD_LED_REPORT_ID1, 0xB9 };
+ const u8 buf_init_start[] = { FEATURE_KBD_LED_REPORT_ID1, 0xB9 };
u8 buf_init2[] = { FEATURE_KBD_LED_REPORT_ID1, 0x41, 0x53, 0x55, 0x53, 0x20,
0x54, 0x65, 0x63, 0x68, 0x2e, 0x49, 0x6e, 0x63, 0x2e, 0x00 };
u8 buf_init3[] = { FEATURE_KBD_LED_REPORT_ID1,
return 0;
}
+static int __maybe_unused asus_resume(struct hid_device *hdev) {
+ struct asus_drvdata *drvdata = hid_get_drvdata(hdev);
+ int ret = 0;
+
+ if (drvdata->kbd_backlight) {
+ const u8 buf[] = { FEATURE_KBD_REPORT_ID, 0xba, 0xc5, 0xc4,
+ drvdata->kbd_backlight->cdev.brightness };
+ ret = asus_kbd_set_report(hdev, buf, sizeof(buf));
+ if (ret < 0) {
+ hid_err(hdev, "Asus failed to set keyboard backlight: %d\n", ret);
+ goto asus_resume_err;
+ }
+ }
+
+asus_resume_err:
+ return ret;
+}
+
static int __maybe_unused asus_reset_resume(struct hid_device *hdev)
{
struct asus_drvdata *drvdata = hid_get_drvdata(hdev);
.input_configured = asus_input_configured,
#ifdef CONFIG_PM
.reset_resume = asus_reset_resume,
+ .resume = asus_resume,
#endif
.event = asus_event,
.raw_event = asus_raw_event
* Free a device structure, all reports, and all fields.
*/
-static void hid_device_release(struct device *dev)
+void hiddev_free(struct kref *ref)
{
- struct hid_device *hid = to_hid_device(dev);
+ struct hid_device *hid = container_of(ref, struct hid_device, ref);
hid_close_report(hid);
kfree(hid->dev_rdesc);
kfree(hid);
}
+static void hid_device_release(struct device *dev)
+{
+ struct hid_device *hid = to_hid_device(dev);
+
+ kref_put(&hid->ref, hiddev_free);
+}
+
/*
* Fetch a report description item from the data stream. We support long
* items, though they are not used yet.
spin_lock_init(&hdev->debug_list_lock);
sema_init(&hdev->driver_input_lock, 1);
mutex_init(&hdev->ll_open_lock);
+ kref_init(&hdev->ref);
hid_bpf_device_init(hdev);
goto out;
}
list->hdev = (struct hid_device *) inode->i_private;
+ kref_get(&list->hdev->ref);
file->private_data = list;
mutex_init(&list->read_mutex);
list_del(&list->node);
spin_unlock_irqrestore(&list->hdev->debug_list_lock, flags);
kfifo_free(&list->hid_debug_fifo);
+
+ kref_put(&list->hdev->ref, hiddev_free);
kfree(list);
return 0;
* Glorious Model O and O- specify the const flag in the consumer input
* report descriptor, which leads to inputs being ignored. Fix this
* by patching the descriptor.
+ *
+ * Glorious Model I incorrectly specifes the Usage Minimum for its
+ * keyboard HID report, causing keycodes to be misinterpreted.
+ * Fix this by setting Usage Minimum to 0 in that report.
*/
static __u8 *glorious_report_fixup(struct hid_device *hdev, __u8 *rdesc,
unsigned int *rsize)
rdesc[85] = rdesc[113] = rdesc[141] = \
HID_MAIN_ITEM_VARIABLE | HID_MAIN_ITEM_RELATIVE;
}
+ if (*rsize == 156 && rdesc[41] == 1) {
+ hid_info(hdev, "patching Glorious Model I keyboard report descriptor\n");
+ rdesc[41] = 0;
+ }
return rdesc;
}
model = "Model O"; break;
case USB_DEVICE_ID_GLORIOUS_MODEL_D:
model = "Model D"; break;
+ case USB_DEVICE_ID_GLORIOUS_MODEL_I:
+ model = "Model I"; break;
}
snprintf(hdev->name, sizeof(hdev->name), "%s %s", "Glorious", model);
}
static const struct hid_device_id glorious_devices[] = {
- { HID_USB_DEVICE(USB_VENDOR_ID_GLORIOUS,
+ { HID_USB_DEVICE(USB_VENDOR_ID_SINOWEALTH,
USB_DEVICE_ID_GLORIOUS_MODEL_O) },
- { HID_USB_DEVICE(USB_VENDOR_ID_GLORIOUS,
+ { HID_USB_DEVICE(USB_VENDOR_ID_SINOWEALTH,
USB_DEVICE_ID_GLORIOUS_MODEL_D) },
+ { HID_USB_DEVICE(USB_VENDOR_ID_LAVIEW,
+ USB_DEVICE_ID_GLORIOUS_MODEL_I) },
{ }
};
MODULE_DEVICE_TABLE(hid, glorious_devices);
#define USB_DEVICE_ID_GENERAL_TOUCH_WIN8_PIT_010A 0x010a
#define USB_DEVICE_ID_GENERAL_TOUCH_WIN8_PIT_E100 0xe100
-#define USB_VENDOR_ID_GLORIOUS 0x258a
-#define USB_DEVICE_ID_GLORIOUS_MODEL_D 0x0033
-#define USB_DEVICE_ID_GLORIOUS_MODEL_O 0x0036
-
#define I2C_VENDOR_ID_GOODIX 0x27c6
#define I2C_DEVICE_ID_GOODIX_01F0 0x01f0
#define USB_VENDOR_ID_LABTEC 0x1020
#define USB_DEVICE_ID_LABTEC_WIRELESS_KEYBOARD 0x0006
+#define USB_DEVICE_ID_LABTEC_ODDOR_HANDBRAKE 0x8888
+
+#define USB_VENDOR_ID_LAVIEW 0x22D4
+#define USB_DEVICE_ID_GLORIOUS_MODEL_I 0x1503
#define USB_VENDOR_ID_LCPOWER 0x1241
#define USB_DEVICE_ID_LCPOWER_LC1000 0xf767
#define USB_DEVICE_ID_LOGITECH_NANO_RECEIVER_2 0xc534
#define USB_DEVICE_ID_LOGITECH_NANO_RECEIVER_LIGHTSPEED_1 0xc539
#define USB_DEVICE_ID_LOGITECH_NANO_RECEIVER_LIGHTSPEED_1_1 0xc53f
-#define USB_DEVICE_ID_LOGITECH_NANO_RECEIVER_LIGHTSPEED_1_2 0xc547
#define USB_DEVICE_ID_LOGITECH_NANO_RECEIVER_POWERPLAY 0xc53a
#define USB_DEVICE_ID_SPACETRAVELLER 0xc623
#define USB_DEVICE_ID_SPACENAVIGATOR 0xc626
#define USB_VENDOR_ID_SIGMATEL 0x066F
#define USB_DEVICE_ID_SIGMATEL_STMP3780 0x3780
+#define USB_VENDOR_ID_SINOWEALTH 0x258a
+#define USB_DEVICE_ID_GLORIOUS_MODEL_D 0x0033
+#define USB_DEVICE_ID_GLORIOUS_MODEL_O 0x0036
+
#define USB_VENDOR_ID_SIS_TOUCH 0x0457
#define USB_DEVICE_ID_SIS9200_TOUCH 0x9200
#define USB_DEVICE_ID_SIS817_TOUCH 0x0817
* so set middlebutton_state to 3
* to never apply workaround anymore
*/
- if (cptkbd_data->middlebutton_state == 1 &&
+ if (hdev->product == USB_DEVICE_ID_LENOVO_CUSBKBD &&
+ cptkbd_data->middlebutton_state == 1 &&
usage->type == EV_REL &&
(usage->code == REL_X || usage->code == REL_Y)) {
cptkbd_data->middlebutton_state = 3;
}
/*
* Mouse-only receivers send unnumbered mouse data. The 27 MHz
- * receiver uses 6 byte packets, the nano receiver 8 bytes,
- * the lightspeed receiver (Pro X Superlight) 13 bytes.
+ * receiver uses 6 byte packets, the nano receiver 8 bytes.
*/
if (djrcv_dev->unnumbered_application == HID_GD_MOUSE &&
- size <= 13){
- u8 mouse_report[14];
+ size <= 8) {
+ u8 mouse_report[9];
/* Prepend report id */
mouse_report[0] = REPORT_TYPE_MOUSE;
HID_USB_DEVICE(USB_VENDOR_ID_LOGITECH,
USB_DEVICE_ID_LOGITECH_NANO_RECEIVER_LIGHTSPEED_1_1),
.driver_data = recvr_type_gaming_hidpp},
- { /* Logitech lightspeed receiver (0xc547) */
- HID_USB_DEVICE(USB_VENDOR_ID_LOGITECH,
- USB_DEVICE_ID_LOGITECH_NANO_RECEIVER_LIGHTSPEED_1_2),
- .driver_data = recvr_type_gaming_hidpp},
{ /* Logitech 27 MHz HID++ 1.0 receiver (0xc513) */
HID_USB_DEVICE(USB_VENDOR_ID_LOGITECH, USB_DEVICE_ID_MX3000_RECEIVER),
if (ret)
return ret;
+ hid_device_io_start(hdev);
+
/* Set I2C bus clock diviser */
if (i2c_clk_freq > 400)
i2c_clk_freq = 400;
snprintf(mcp->adapter.name, sizeof(mcp->adapter.name),
"MCP2221 usb-i2c bridge");
+ i2c_set_adapdata(&mcp->adapter, mcp);
ret = devm_i2c_add_adapter(&hdev->dev, &mcp->adapter);
if (ret) {
hid_err(hdev, "can't add usb-i2c adapter: %d\n", ret);
return ret;
}
- i2c_set_adapdata(&mcp->adapter, mcp);
#if IS_REACHABLE(CONFIG_GPIOLIB)
/* Setup GPIO chip */
MT_USB_DEVICE(USB_VENDOR_ID_HANVON_ALT,
USB_DEVICE_ID_HANVON_ALT_MULTITOUCH) },
+ /* HONOR GLO-GXXX panel */
+ { .driver_data = MT_CLS_VTL,
+ HID_DEVICE(BUS_I2C, HID_GROUP_MULTITOUCH_WIN_8,
+ 0x347d, 0x7853) },
+
/* Ilitek dual touch panel */
{ .driver_data = MT_CLS_NSMU,
MT_USB_DEVICE(USB_VENDOR_ID_ILITEK,
* All the controller's button values are stored in a u32.
* They can be accessed with bitwise ANDs.
*/
-static const u32 JC_BTN_Y = BIT(0);
-static const u32 JC_BTN_X = BIT(1);
-static const u32 JC_BTN_B = BIT(2);
-static const u32 JC_BTN_A = BIT(3);
-static const u32 JC_BTN_SR_R = BIT(4);
-static const u32 JC_BTN_SL_R = BIT(5);
-static const u32 JC_BTN_R = BIT(6);
-static const u32 JC_BTN_ZR = BIT(7);
-static const u32 JC_BTN_MINUS = BIT(8);
-static const u32 JC_BTN_PLUS = BIT(9);
-static const u32 JC_BTN_RSTICK = BIT(10);
-static const u32 JC_BTN_LSTICK = BIT(11);
-static const u32 JC_BTN_HOME = BIT(12);
-static const u32 JC_BTN_CAP = BIT(13); /* capture button */
-static const u32 JC_BTN_DOWN = BIT(16);
-static const u32 JC_BTN_UP = BIT(17);
-static const u32 JC_BTN_RIGHT = BIT(18);
-static const u32 JC_BTN_LEFT = BIT(19);
-static const u32 JC_BTN_SR_L = BIT(20);
-static const u32 JC_BTN_SL_L = BIT(21);
-static const u32 JC_BTN_L = BIT(22);
-static const u32 JC_BTN_ZL = BIT(23);
+#define JC_BTN_Y BIT(0)
+#define JC_BTN_X BIT(1)
+#define JC_BTN_B BIT(2)
+#define JC_BTN_A BIT(3)
+#define JC_BTN_SR_R BIT(4)
+#define JC_BTN_SL_R BIT(5)
+#define JC_BTN_R BIT(6)
+#define JC_BTN_ZR BIT(7)
+#define JC_BTN_MINUS BIT(8)
+#define JC_BTN_PLUS BIT(9)
+#define JC_BTN_RSTICK BIT(10)
+#define JC_BTN_LSTICK BIT(11)
+#define JC_BTN_HOME BIT(12)
+#define JC_BTN_CAP BIT(13) /* capture button */
+#define JC_BTN_DOWN BIT(16)
+#define JC_BTN_UP BIT(17)
+#define JC_BTN_RIGHT BIT(18)
+#define JC_BTN_LEFT BIT(19)
+#define JC_BTN_SR_L BIT(20)
+#define JC_BTN_SL_L BIT(21)
+#define JC_BTN_L BIT(22)
+#define JC_BTN_ZL BIT(23)
enum joycon_msg_type {
JOYCON_MSG_TYPE_NONE,
*/
static void joycon_calc_imu_cal_divisors(struct joycon_ctlr *ctlr)
{
- int i;
+ int i, divz = 0;
for (i = 0; i < 3; i++) {
ctlr->imu_cal_accel_divisor[i] = ctlr->accel_cal.scale[i] -
ctlr->accel_cal.offset[i];
ctlr->imu_cal_gyro_divisor[i] = ctlr->gyro_cal.scale[i] -
ctlr->gyro_cal.offset[i];
+
+ if (ctlr->imu_cal_accel_divisor[i] == 0) {
+ ctlr->imu_cal_accel_divisor[i] = 1;
+ divz++;
+ }
+
+ if (ctlr->imu_cal_gyro_divisor[i] == 0) {
+ ctlr->imu_cal_gyro_divisor[i] = 1;
+ divz++;
+ }
}
+
+ if (divz)
+ hid_warn(ctlr->hdev, "inaccurate IMU divisors (%d)\n", divz);
}
static const s16 DFLT_ACCEL_OFFSET /*= 0*/;
JC_IMU_SAMPLES_PER_DELTA_AVG) {
ctlr->imu_avg_delta_ms = ctlr->imu_delta_samples_sum /
ctlr->imu_delta_samples_count;
- /* don't ever want divide by zero shenanigans */
- if (ctlr->imu_avg_delta_ms == 0) {
- ctlr->imu_avg_delta_ms = 1;
- hid_warn(ctlr->hdev,
- "calculated avg imu delta of 0\n");
- }
ctlr->imu_delta_samples_count = 0;
ctlr->imu_delta_samples_sum = 0;
}
+ /* don't ever want divide by zero shenanigans */
+ if (ctlr->imu_avg_delta_ms == 0) {
+ ctlr->imu_avg_delta_ms = 1;
+ hid_warn(ctlr->hdev, "calculated avg imu delta of 0\n");
+ }
+
/* useful for debugging IMU sample rate */
hid_dbg(ctlr->hdev,
"imu_report: ms=%u last_ms=%u delta=%u avg_delta=%u\n",
{ HID_USB_DEVICE(USB_VENDOR_ID_AKAI, USB_DEVICE_ID_AKAI_MPKMINI2), HID_QUIRK_NO_INIT_REPORTS },
{ HID_USB_DEVICE(USB_VENDOR_ID_ALPS, USB_DEVICE_ID_IBM_GAMEPAD), HID_QUIRK_BADPAD },
{ HID_USB_DEVICE(USB_VENDOR_ID_AMI, USB_DEVICE_ID_AMI_VIRT_KEYBOARD_AND_MOUSE), HID_QUIRK_ALWAYS_POLL },
+ { HID_USB_DEVICE(USB_VENDOR_ID_APPLE, USB_DEVICE_ID_APPLE_ALU_REVB_ANSI), HID_QUIRK_ALWAYS_POLL },
{ HID_USB_DEVICE(USB_VENDOR_ID_ATEN, USB_DEVICE_ID_ATEN_2PORTKVM), HID_QUIRK_NOGET },
{ HID_USB_DEVICE(USB_VENDOR_ID_ATEN, USB_DEVICE_ID_ATEN_4PORTKVMC), HID_QUIRK_NOGET },
{ HID_USB_DEVICE(USB_VENDOR_ID_ATEN, USB_DEVICE_ID_ATEN_4PORTKVM), HID_QUIRK_NOGET },
{ HID_USB_DEVICE(USB_VENDOR_ID_KYE, USB_DEVICE_ID_KYE_EASYPEN_M406XE), HID_QUIRK_MULTI_INPUT },
{ HID_USB_DEVICE(USB_VENDOR_ID_KYE, USB_DEVICE_ID_KYE_MOUSEPEN_I608X_V2), HID_QUIRK_MULTI_INPUT },
{ HID_USB_DEVICE(USB_VENDOR_ID_KYE, USB_DEVICE_ID_KYE_PENSKETCH_T609A), HID_QUIRK_MULTI_INPUT },
+ { HID_USB_DEVICE(USB_VENDOR_ID_LABTEC, USB_DEVICE_ID_LABTEC_ODDOR_HANDBRAKE), HID_QUIRK_ALWAYS_POLL },
{ HID_USB_DEVICE(USB_VENDOR_ID_LENOVO, USB_DEVICE_ID_LENOVO_OPTICAL_USB_MOUSE_600E), HID_QUIRK_ALWAYS_POLL },
{ HID_USB_DEVICE(USB_VENDOR_ID_LENOVO, USB_DEVICE_ID_LENOVO_PIXART_USB_MOUSE_608D), HID_QUIRK_ALWAYS_POLL },
{ HID_USB_DEVICE(USB_VENDOR_ID_LENOVO, USB_DEVICE_ID_LENOVO_PIXART_USB_MOUSE_6019), HID_QUIRK_ALWAYS_POLL },
* ICN8505 controller, has a _CID of PNP0C50 but is not HID compatible.
*/
{ "CHPN0001" },
+ /*
+ * The IDEA5002 ACPI device causes high interrupt usage and spurious
+ * wakeups from suspend.
+ */
+ { "IDEA5002" },
{ }
};
#define POWER_METER_CAN_NOTIFY (1 << 3)
#define POWER_METER_IS_BATTERY (1 << 8)
#define UNKNOWN_HYSTERESIS 0xFFFFFFFF
+#define UNKNOWN_POWER 0xFFFFFFFF
#define METER_NOTIFY_CONFIG 0x80
#define METER_NOTIFY_TRIP 0x81
update_meter(resource);
mutex_unlock(&resource->lock);
+ if (resource->power == UNKNOWN_POWER)
+ return -ENODATA;
+
return sprintf(buf, "%llu\n", resource->power * 1000);
}
.reset_resume = corsairpsu_resume,
#endif
};
-module_hid_driver(corsairpsu_driver);
+
+static int __init corsair_init(void)
+{
+ return hid_register_driver(&corsairpsu_driver);
+}
+
+static void __exit corsair_exit(void)
+{
+ hid_unregister_driver(&corsairpsu_driver);
+}
+
+/*
+ * With module_init() the driver would load before the HID bus when
+ * built-in, so use late_initcall() instead.
+ */
+late_initcall(corsair_init);
+module_exit(corsair_exit);
MODULE_LICENSE("GPL");
MODULE_AUTHOR("Wilken Gottwalt <wilken.gottwalt@posteo.net>");
LTC2991_REPEAT_ACQ_EN);
if (ret)
return dev_err_probe(st->dev, ret,
- "Error: Failed to set contiuous mode.\n");
+ "Error: Failed to set continuous mode.\n");
/* Enable all channels and trigger conversions */
return regmap_write(st->regmap, LTC2991_CH_EN_TRIGGER,
#include <linux/i2c.h>
#include <linux/mutex.h>
#include <linux/regmap.h>
+#include <linux/regulator/consumer.h>
#define MAX31827_T_REG 0x0
#define MAX31827_CONFIGURATION_REG 0x2
ret = hid_hw_start(hdev, HID_CONNECT_HIDRAW);
if (ret) {
hid_err(hdev, "hid hw start failed with %d\n", ret);
- goto fail_and_stop;
+ return ret;
}
ret = hid_hw_open(hdev);
if (ret) {
hid_err(hdev, "hid hw open failed with %d\n", ret);
- goto fail_and_close;
+ goto fail_and_stop;
}
priv->hwmon_dev = hwmon_device_register_with_info(&hdev->dev, "kraken2",
goto fail_end_stop;
/* Finally enable the tracer */
- if (coresight_enable_source(csdev, CS_MODE_PERF, event))
+ if (source_ops(csdev)->enable(csdev, event, CS_MODE_PERF))
goto fail_disable_path;
/*
return;
/* stop tracer */
- coresight_disable_source(csdev, event);
+ source_ops(csdev)->disable(csdev, event);
/* tell the core */
event->hw.state = PERF_HES_STOPPED;
per_cpu(delayed_probe, cpu) = NULL;
}
-static void __exit etm4_remove_dev(struct etmv4_drvdata *drvdata)
+static void etm4_remove_dev(struct etmv4_drvdata *drvdata)
{
bool had_delayed_probe;
/*
}
}
-static void __exit etm4_remove_amba(struct amba_device *adev)
+static void etm4_remove_amba(struct amba_device *adev)
{
struct etmv4_drvdata *drvdata = dev_get_drvdata(&adev->dev);
etm4_remove_dev(drvdata);
}
-static int __exit etm4_remove_platform_dev(struct platform_device *pdev)
+static int etm4_remove_platform_dev(struct platform_device *pdev)
{
struct etmv4_drvdata *drvdata = dev_get_drvdata(&pdev->dev);
struct smb_drv_data, miscdev);
int ret = 0;
- mutex_lock(&drvdata->mutex);
+ spin_lock(&drvdata->spinlock);
if (drvdata->reading) {
ret = -EBUSY;
drvdata->reading = true;
out:
- mutex_unlock(&drvdata->mutex);
+ spin_unlock(&drvdata->spinlock);
return ret;
}
if (!len)
return 0;
- mutex_lock(&drvdata->mutex);
-
if (!sdb->data_size)
- goto out;
+ return 0;
to_copy = min(sdb->data_size, len);
if (copy_to_user(data, sdb->buf_base + sdb->buf_rdptr, to_copy)) {
dev_dbg(dev, "Failed to copy data to user\n");
- to_copy = -EFAULT;
- goto out;
+ return -EFAULT;
}
*ppos += to_copy;
-
smb_update_read_ptr(drvdata, to_copy);
-
- dev_dbg(dev, "%zu bytes copied\n", to_copy);
-out:
if (!sdb->data_size)
smb_reset_buffer(drvdata);
- mutex_unlock(&drvdata->mutex);
+ dev_dbg(dev, "%zu bytes copied\n", to_copy);
return to_copy;
}
struct smb_drv_data *drvdata = container_of(file->private_data,
struct smb_drv_data, miscdev);
- mutex_lock(&drvdata->mutex);
+ spin_lock(&drvdata->spinlock);
drvdata->reading = false;
- mutex_unlock(&drvdata->mutex);
+ spin_unlock(&drvdata->spinlock);
return 0;
}
struct smb_drv_data *drvdata = dev_get_drvdata(csdev->dev.parent);
int ret = 0;
- mutex_lock(&drvdata->mutex);
+ spin_lock(&drvdata->spinlock);
/* Do nothing, the trace data is reading by other interface now */
if (drvdata->reading) {
dev_dbg(&csdev->dev, "Ultrasoc SMB enabled\n");
out:
- mutex_unlock(&drvdata->mutex);
+ spin_unlock(&drvdata->spinlock);
return ret;
}
struct smb_drv_data *drvdata = dev_get_drvdata(csdev->dev.parent);
int ret = 0;
- mutex_lock(&drvdata->mutex);
+ spin_lock(&drvdata->spinlock);
if (drvdata->reading) {
ret = -EBUSY;
dev_dbg(&csdev->dev, "Ultrasoc SMB disabled\n");
out:
- mutex_unlock(&drvdata->mutex);
+ spin_unlock(&drvdata->spinlock);
return ret;
}
if (!buf)
return 0;
- mutex_lock(&drvdata->mutex);
+ spin_lock(&drvdata->spinlock);
/* Don't do anything if another tracer is using this sink. */
if (atomic_read(&csdev->refcnt) != 1)
if (!buf->snapshot && lost)
perf_aux_output_flag(handle, PERF_AUX_FLAG_TRUNCATED);
out:
- mutex_unlock(&drvdata->mutex);
+ spin_unlock(&drvdata->spinlock);
return data_size;
}
static void smb_init_hw(struct smb_drv_data *drvdata)
{
smb_disable_hw(drvdata);
- smb_reset_buffer(drvdata);
writel(SMB_LB_CFG_LO_DEFAULT, drvdata->base + SMB_LB_CFG_LO_REG);
writel(SMB_LB_CFG_HI_DEFAULT, drvdata->base + SMB_LB_CFG_HI_REG);
return ret;
}
- mutex_init(&drvdata->mutex);
+ ret = smb_config_inport(dev, true);
+ if (ret)
+ return ret;
+
+ smb_reset_buffer(drvdata);
+ platform_set_drvdata(pdev, drvdata);
+ spin_lock_init(&drvdata->spinlock);
drvdata->pid = -1;
ret = smb_register_sink(pdev, drvdata);
if (ret) {
+ smb_config_inport(&pdev->dev, false);
dev_err(dev, "Failed to register SMB sink\n");
return ret;
}
- ret = smb_config_inport(dev, true);
- if (ret) {
- smb_unregister_sink(drvdata);
- return ret;
- }
-
- platform_set_drvdata(pdev, drvdata);
-
return 0;
}
static int smb_remove(struct platform_device *pdev)
{
struct smb_drv_data *drvdata = platform_get_drvdata(pdev);
- int ret;
-
- ret = smb_config_inport(&pdev->dev, false);
- if (ret)
- return ret;
smb_unregister_sink(drvdata);
+ smb_config_inport(&pdev->dev, false);
+
return 0;
}
#define _ULTRASOC_SMB_H
#include <linux/miscdevice.h>
-#include <linux/mutex.h>
+#include <linux/spinlock.h>
/* Offset of SMB global registers */
#define SMB_GLB_CFG_REG 0x00
* @csdev: Component vitals needed by the framework.
* @sdb: Data buffer for SMB.
* @miscdev: Specifics to handle "/dev/xyz.smb" entry.
- * @mutex: Control data access to one at a time.
+ * @spinlock: Control data access to one at a time.
* @reading: Synchronise user space access to SMB buffer.
* @pid: Process ID of the process being monitored by the
* session that is using this component.
struct coresight_device *csdev;
struct smb_data_buffer sdb;
struct miscdevice miscdev;
- struct mutex mutex;
+ spinlock_t spinlock;
bool reading;
pid_t pid;
enum cs_mode mode;
return ret;
hisi_ptt->trace_irq = pci_irq_vector(pdev, HISI_PTT_TRACE_DMA_IRQ);
- ret = devm_request_threaded_irq(&pdev->dev, hisi_ptt->trace_irq,
- NULL, hisi_ptt_isr, 0,
- DRV_NAME, hisi_ptt);
+ ret = devm_request_irq(&pdev->dev, hisi_ptt->trace_irq, hisi_ptt_isr,
+ IRQF_NOBALANCING | IRQF_NO_THREAD, DRV_NAME,
+ hisi_ptt);
if (ret) {
pci_err(pdev, "failed to request irq %d, ret = %d\n",
hisi_ptt->trace_irq, ret);
return -EOPNOTSUPP;
}
+ if (event->attach_state & PERF_ATTACH_TASK)
+ return -EOPNOTSUPP;
+
if (event->attr.type != hisi_ptt->hisi_ptt_pmu.type)
return -ENOENT;
hisi_ptt_pmu_stop(event, PERF_EF_UPDATE);
}
+static void hisi_ptt_pmu_read(struct perf_event *event)
+{
+}
+
static void hisi_ptt_remove_cpuhp_instance(void *hotplug_node)
{
cpuhp_state_remove_instance_nocalls(hisi_ptt_pmu_online, hotplug_node);
.stop = hisi_ptt_pmu_stop,
.add = hisi_ptt_pmu_add,
.del = hisi_ptt_pmu_del,
+ .read = hisi_ptt_pmu_read,
};
reg = readl(hisi_ptt->iobase + HISI_PTT_LOCATION);
if (!slave)
return 0;
- command = readl(bus->base + ASPEED_I2C_CMD_REG);
+ /*
+ * Handle stop conditions early, prior to SLAVE_MATCH. Some masters may drive
+ * transfers with low enough latency between the nak/stop phase of the current
+ * command and the start/address phase of the following command that the
+ * interrupts are coalesced by the time we process them.
+ */
+ if (irq_status & ASPEED_I2CD_INTR_NORMAL_STOP) {
+ irq_handled |= ASPEED_I2CD_INTR_NORMAL_STOP;
+ bus->slave_state = ASPEED_I2C_SLAVE_STOP;
+ }
+
+ if (irq_status & ASPEED_I2CD_INTR_TX_NAK &&
+ bus->slave_state == ASPEED_I2C_SLAVE_READ_PROCESSED) {
+ irq_handled |= ASPEED_I2CD_INTR_TX_NAK;
+ bus->slave_state = ASPEED_I2C_SLAVE_STOP;
+ }
+
+ /* Propagate any stop conditions to the slave implementation. */
+ if (bus->slave_state == ASPEED_I2C_SLAVE_STOP) {
+ i2c_slave_event(slave, I2C_SLAVE_STOP, &value);
+ bus->slave_state = ASPEED_I2C_SLAVE_INACTIVE;
+ }
- /* Slave was requested, restart state machine. */
+ /*
+ * Now that we've dealt with any potentially coalesced stop conditions,
+ * address any start conditions.
+ */
if (irq_status & ASPEED_I2CD_INTR_SLAVE_MATCH) {
irq_handled |= ASPEED_I2CD_INTR_SLAVE_MATCH;
bus->slave_state = ASPEED_I2C_SLAVE_START;
}
- /* Slave is not currently active, irq was for someone else. */
+ /*
+ * If the slave has been stopped and not started then slave interrupt
+ * handling is complete.
+ */
if (bus->slave_state == ASPEED_I2C_SLAVE_INACTIVE)
return irq_handled;
+ command = readl(bus->base + ASPEED_I2C_CMD_REG);
dev_dbg(bus->dev, "slave irq status 0x%08x, cmd 0x%08x\n",
irq_status, command);
irq_handled |= ASPEED_I2CD_INTR_RX_DONE;
}
- /* Slave was asked to stop. */
- if (irq_status & ASPEED_I2CD_INTR_NORMAL_STOP) {
- irq_handled |= ASPEED_I2CD_INTR_NORMAL_STOP;
- bus->slave_state = ASPEED_I2C_SLAVE_STOP;
- }
- if (irq_status & ASPEED_I2CD_INTR_TX_NAK &&
- bus->slave_state == ASPEED_I2C_SLAVE_READ_PROCESSED) {
- irq_handled |= ASPEED_I2CD_INTR_TX_NAK;
- bus->slave_state = ASPEED_I2C_SLAVE_STOP;
- }
-
switch (bus->slave_state) {
case ASPEED_I2C_SLAVE_READ_REQUESTED:
if (unlikely(irq_status & ASPEED_I2CD_INTR_TX_ACK))
i2c_slave_event(slave, I2C_SLAVE_WRITE_RECEIVED, &value);
break;
case ASPEED_I2C_SLAVE_STOP:
- i2c_slave_event(slave, I2C_SLAVE_STOP, &value);
- bus->slave_state = ASPEED_I2C_SLAVE_INACTIVE;
+ /* Stop event handling is done early. Unreachable. */
break;
case ASPEED_I2C_SLAVE_START:
/* Slave was just started. Waiting for the next event. */;
{
struct dw_i2c_dev *dev = context;
- *val = readl_relaxed(dev->base + reg);
+ *val = readl(dev->base + reg);
return 0;
}
{
struct dw_i2c_dev *dev = context;
- writel_relaxed(val, dev->base + reg);
+ writel(val, dev->base + reg);
return 0;
}
{
struct dw_i2c_dev *dev = context;
- *val = swab32(readl_relaxed(dev->base + reg));
+ *val = swab32(readl(dev->base + reg));
return 0;
}
{
struct dw_i2c_dev *dev = context;
- writel_relaxed(swab32(val), dev->base + reg);
+ writel(swab32(val), dev->base + reg);
return 0;
}
{
struct dw_i2c_dev *dev = context;
- *val = readw_relaxed(dev->base + reg) |
- (readw_relaxed(dev->base + reg + 2) << 16);
+ *val = readw(dev->base + reg) |
+ (readw(dev->base + reg + 2) << 16);
return 0;
}
{
struct dw_i2c_dev *dev = context;
- writew_relaxed(val, dev->base + reg);
- writew_relaxed(val >> 16, dev->base + reg + 2);
+ writew(val, dev->base + reg);
+ writew(val >> 16, dev->base + reg + 2);
return 0;
}
return ocores_init(dev, i2c);
}
-static DEFINE_SIMPLE_DEV_PM_OPS(ocores_i2c_pm,
- ocores_i2c_suspend, ocores_i2c_resume);
+static DEFINE_NOIRQ_DEV_PM_OPS(ocores_i2c_pm,
+ ocores_i2c_suspend, ocores_i2c_resume);
static struct platform_driver ocores_i2c_driver = {
.probe = ocores_i2c_probe,
u32 hs_mask;
struct i2c_bus_recovery_info recovery;
+ struct pinctrl *pinctrl;
+ struct pinctrl_state *pinctrl_default;
+ struct pinctrl_state *pinctrl_recovery;
};
#define _IBMR(i2c) ((i2c)->reg_ibmr)
*/
gpiod_set_value(i2c->recovery.scl_gpiod, ibmr & IBMR_SCLS);
gpiod_set_value(i2c->recovery.sda_gpiod, ibmr & IBMR_SDAS);
+
+ WARN_ON(pinctrl_select_state(i2c->pinctrl, i2c->pinctrl_recovery));
}
static void i2c_pxa_unprepare_recovery(struct i2c_adapter *adap)
{
struct pxa_i2c *i2c = adap->algo_data;
- struct i2c_bus_recovery_info *bri = adap->bus_recovery_info;
u32 isr;
/*
i2c_pxa_do_reset(i2c);
}
- WARN_ON(pinctrl_select_state(bri->pinctrl, bri->pins_default));
+ WARN_ON(pinctrl_select_state(i2c->pinctrl, i2c->pinctrl_default));
dev_dbg(&i2c->adap.dev, "recovery: IBMR 0x%08x ISR 0x%08x\n",
readl(_IBMR(i2c)), readl(_ISR(i2c)));
if (IS_ENABLED(CONFIG_I2C_PXA_SLAVE))
return 0;
- bri->pinctrl = devm_pinctrl_get(dev);
- if (PTR_ERR(bri->pinctrl) == -ENODEV) {
- bri->pinctrl = NULL;
+ i2c->pinctrl = devm_pinctrl_get(dev);
+ if (PTR_ERR(i2c->pinctrl) == -ENODEV)
+ i2c->pinctrl = NULL;
+ if (IS_ERR(i2c->pinctrl))
+ return PTR_ERR(i2c->pinctrl);
+
+ if (!i2c->pinctrl)
+ return 0;
+
+ i2c->pinctrl_default = pinctrl_lookup_state(i2c->pinctrl,
+ PINCTRL_STATE_DEFAULT);
+ i2c->pinctrl_recovery = pinctrl_lookup_state(i2c->pinctrl, "recovery");
+
+ if (IS_ERR(i2c->pinctrl_default) || IS_ERR(i2c->pinctrl_recovery)) {
+ dev_info(dev, "missing pinmux recovery information: %ld %ld\n",
+ PTR_ERR(i2c->pinctrl_default),
+ PTR_ERR(i2c->pinctrl_recovery));
+ return 0;
+ }
+
+ /*
+ * Claiming GPIOs can influence the pinmux state, and may glitch the
+ * I2C bus. Do this carefully.
+ */
+ bri->scl_gpiod = devm_gpiod_get(dev, "scl", GPIOD_OUT_HIGH_OPEN_DRAIN);
+ if (bri->scl_gpiod == ERR_PTR(-EPROBE_DEFER))
+ return -EPROBE_DEFER;
+ if (IS_ERR(bri->scl_gpiod)) {
+ dev_info(dev, "missing scl gpio recovery information: %pe\n",
+ bri->scl_gpiod);
+ return 0;
+ }
+
+ /*
+ * We have SCL. Pull SCL low and wait a bit so that SDA glitches
+ * have no effect.
+ */
+ gpiod_direction_output(bri->scl_gpiod, 0);
+ udelay(10);
+ bri->sda_gpiod = devm_gpiod_get(dev, "sda", GPIOD_OUT_HIGH_OPEN_DRAIN);
+
+ /* Wait a bit in case of a SDA glitch, and then release SCL. */
+ udelay(10);
+ gpiod_direction_output(bri->scl_gpiod, 1);
+
+ if (bri->sda_gpiod == ERR_PTR(-EPROBE_DEFER))
+ return -EPROBE_DEFER;
+
+ if (IS_ERR(bri->sda_gpiod)) {
+ dev_info(dev, "missing sda gpio recovery information: %pe\n",
+ bri->sda_gpiod);
return 0;
}
- if (IS_ERR(bri->pinctrl))
- return PTR_ERR(bri->pinctrl);
bri->prepare_recovery = i2c_pxa_prepare_recovery;
bri->unprepare_recovery = i2c_pxa_unprepare_recovery;
+ bri->recover_bus = i2c_generic_scl_recovery;
i2c->adap.bus_recovery_info = bri;
- return 0;
+ /*
+ * Claiming GPIOs can change the pinmux state, which confuses the
+ * pinctrl since pinctrl's idea of the current setting is unaffected
+ * by the pinmux change caused by claiming the GPIO. Work around that
+ * by switching pinctrl to the GPIO state here. We do it this way to
+ * avoid glitching the I2C bus.
+ */
+ pinctrl_select_state(i2c->pinctrl, i2c->pinctrl_recovery);
+
+ return pinctrl_select_state(i2c->pinctrl, i2c->pinctrl_default);
}
static int i2c_pxa_probe(struct platform_device *dev)
ret = geni_se_resources_on(&gi2c->se);
if (ret) {
dev_err(dev, "Error turning on resources %d\n", ret);
+ clk_disable_unprepare(gi2c->core_clk);
return ret;
}
proto = geni_se_read_proto(&gi2c->se);
/* FIFO is disabled, so we can only use GPI DMA */
gi2c->gpi_mode = true;
ret = setup_gpi_dma(gi2c);
- if (ret)
+ if (ret) {
+ geni_se_resources_off(&gi2c->se);
+ clk_disable_unprepare(gi2c->core_clk);
return dev_err_probe(dev, ret, "Failed to setup GPI DMA mode\n");
+ }
dev_dbg(dev, "Using GPI DMA mode for I2C\n");
} else {
if (!tx_depth) {
dev_err(dev, "Invalid TX FIFO depth\n");
+ geni_se_resources_off(&gi2c->se);
+ clk_disable_unprepare(gi2c->core_clk);
return -EINVAL;
}
* @clk: function clk for rk3399 or function & Bus clks for others
* @pclk: Bus clk for rk3399
* @clk_rate_nb: i2c clk rate change notify
+ * @irq: irq number
* @t: I2C known timing information
* @lock: spinlock for the i2c bus
* @wait: the waitqueue to wait for i2c transfer
struct clk *clk;
struct clk *pclk;
struct notifier_block clk_rate_nb;
+ int irq;
/* Settings */
struct i2c_timings t;
spin_unlock_irqrestore(&i2c->lock, flags);
- rk3x_i2c_start(i2c);
-
if (!polling) {
+ rk3x_i2c_start(i2c);
+
timeout = wait_event_timeout(i2c->wait, !i2c->busy,
msecs_to_jiffies(WAIT_TIMEOUT));
} else {
+ disable_irq(i2c->irq);
+ rk3x_i2c_start(i2c);
+
timeout = rk3x_i2c_wait_xfer_poll(i2c);
+
+ enable_irq(i2c->irq);
}
spin_lock_irqsave(&i2c->lock, flags);
return ret;
}
+ i2c->irq = irq;
+
platform_set_drvdata(pdev, i2c);
if (i2c->soc_data->calc_timings == rk3x_i2c_v0_calc_timings) {
* (range / 2^bits) * g = (range / 2^bits) * 9.80665 m/s^2
* => KX022A uses 16 bit (HiRes mode - assume the low 8 bits are zeroed
* in low-power mode(?) )
- * => +/-2G => 4 / 2^16 * 9,80665 * 10^6 (to scale to micro)
- * => +/-2G - 598.550415
- * +/-4G - 1197.10083
- * +/-8G - 2394.20166
- * +/-16G - 4788.40332
+ * => +/-2G => 4 / 2^16 * 9,80665
+ * => +/-2G - 0.000598550415
+ * +/-4G - 0.00119710083
+ * +/-8G - 0.00239420166
+ * +/-16G - 0.00478840332
*/
static const int kx022a_scale_table[][2] = {
- { 598, 550415 },
- { 1197, 100830 },
- { 2394, 201660 },
- { 4788, 403320 },
+ { 0, 598550 },
+ { 0, 1197101 },
+ { 0, 2394202 },
+ { 0, 4788403 },
};
static int kx022a_read_avail(struct iio_dev *indio_dev,
*vals = (const int *)kx022a_scale_table;
*length = ARRAY_SIZE(kx022a_scale_table) *
ARRAY_SIZE(kx022a_scale_table[0]);
- *type = IIO_VAL_INT_PLUS_MICRO;
+ *type = IIO_VAL_INT_PLUS_NANO;
return IIO_AVAIL_LIST;
default:
return -EINVAL;
return ret;
}
+static int kx022a_write_raw_get_fmt(struct iio_dev *idev,
+ struct iio_chan_spec const *chan,
+ long mask)
+{
+ switch (mask) {
+ case IIO_CHAN_INFO_SCALE:
+ return IIO_VAL_INT_PLUS_NANO;
+ case IIO_CHAN_INFO_SAMP_FREQ:
+ return IIO_VAL_INT_PLUS_MICRO;
+ default:
+ return -EINVAL;
+ }
+}
+
static int kx022a_write_raw(struct iio_dev *idev,
struct iio_chan_spec const *chan,
int val, int val2, long mask)
kx022a_reg2scale(regval, val, val2);
- return IIO_VAL_INT_PLUS_MICRO;
+ return IIO_VAL_INT_PLUS_NANO;
}
return -EINVAL;
static const struct iio_info kx022a_info = {
.read_raw = &kx022a_read_raw,
.write_raw = &kx022a_write_raw,
+ .write_raw_get_fmt = &kx022a_write_raw_get_fmt,
.read_avail = &kx022a_read_avail,
.validate_trigger = iio_validate_own_trigger,
IMX93_ADC_CHAN(1),
IMX93_ADC_CHAN(2),
IMX93_ADC_CHAN(3),
+ IMX93_ADC_CHAN(4),
+ IMX93_ADC_CHAN(5),
+ IMX93_ADC_CHAN(6),
+ IMX93_ADC_CHAN(7),
};
static void imx93_adc_power_down(struct imx93_adc *adc)
mutex_unlock(&adc->lock);
return ret;
case IIO_CHAN_INFO_CALIBBIAS:
- if (val < mcp3564_calib_bias[0] && val > mcp3564_calib_bias[2])
+ if (val < mcp3564_calib_bias[0] || val > mcp3564_calib_bias[2])
return -EINVAL;
mutex_lock(&adc->lock);
mutex_unlock(&adc->lock);
return ret;
case IIO_CHAN_INFO_CALIBSCALE:
- if (val < mcp3564_calib_scale[0] && val > mcp3564_calib_scale[2])
+ if (val < mcp3564_calib_scale[0] || val > mcp3564_calib_scale[2])
return -EINVAL;
if (adc->calib_scale == val)
enum mcp3564_ids ids;
int ret = 0;
unsigned int tmp = 0x01;
- bool err = true;
+ bool err = false;
/*
* The address is set on a per-device basis by fuses in the factory,
module_spi_driver(mcp3564_driver);
MODULE_AUTHOR("Marius Cristea <marius.cristea@microchip.com>");
-MODULE_DESCRIPTION("Microchip MCP346x/MCP346xR and MCP356x/MCP346xR ADCs");
+MODULE_DESCRIPTION("Microchip MCP346x/MCP346xR and MCP356x/MCP356xR ADCs");
MODULE_LICENSE("GPL v2");
.cmv_select = 1,
};
+static const struct meson_sar_adc_param meson_sar_adc_axg_param = {
+ .has_bl30_integration = true,
+ .clock_rate = 1200000,
+ .bandgap_reg = MESON_SAR_ADC_REG11,
+ .regmap_config = &meson_sar_adc_regmap_config_gxbb,
+ .resolution = 12,
+ .disable_ring_counter = 1,
+ .has_reg11 = true,
+ .vref_volatge = 1,
+ .has_vref_select = true,
+ .vref_select = VREF_VDDA,
+ .cmv_select = 1,
+};
+
static const struct meson_sar_adc_param meson_sar_adc_g12a_param = {
.has_bl30_integration = false,
.clock_rate = 1200000,
};
static const struct meson_sar_adc_data meson_sar_adc_axg_data = {
- .param = &meson_sar_adc_gxl_param,
+ .param = &meson_sar_adc_axg_param,
.name = "meson-axg-saradc",
};
platform_set_drvdata(pdev, indio_dev);
err = tiadc_request_dma(pdev, adc_dev);
- if (err && err == -EPROBE_DEFER)
+ if (err && err != -ENODEV) {
+ dev_err_probe(&pdev->dev, err, "DMA request failed\n");
goto err_dma;
+ }
return 0;
struct iio_buffer *buffer;
int ret;
+ /*
+ * iio_triggered_buffer_cleanup() assumes that the buffer allocated here
+ * is assigned to indio_dev->buffer but this is only the case if this
+ * function is the first caller to iio_device_attach_buffer(). If
+ * indio_dev->buffer is already set then we can't proceed otherwise the
+ * cleanup function will try to free a buffer that was not allocated here.
+ */
+ if (indio_dev->buffer)
+ return -EADDRINUSE;
+
buffer = iio_kfifo_allocate();
if (!buffer) {
ret = -ENOMEM;
/* Conversion times in us */
static const u16 ms_sensors_ht_t_conversion_time[] = { 50000, 25000,
13000, 7000 };
-static const u16 ms_sensors_ht_h_conversion_time[] = { 16000, 3000,
- 5000, 8000 };
+static const u16 ms_sensors_ht_h_conversion_time[] = { 16000, 5000,
+ 3000, 8000 };
static const u16 ms_sensors_tp_conversion_time[] = { 500, 1100, 2100,
4100, 8220, 16440 };
#define ADIS16475_MAX_SCAN_DATA 20
/* spi max speed in brust mode */
#define ADIS16475_BURST_MAX_SPEED 1000000
-#define ADIS16475_LSB_DEC_MASK BIT(0)
-#define ADIS16475_LSB_FIR_MASK BIT(1)
+#define ADIS16475_LSB_DEC_MASK 0
+#define ADIS16475_LSB_FIR_MASK 1
#define ADIS16500_BURST_DATA_SEL_0_CHN_MASK GENMASK(5, 0)
#define ADIS16500_BURST_DATA_SEL_1_CHN_MASK GENMASK(12, 7)
return 0;
}
-static const struct of_device_id adis16475_of_match[] = {
- { .compatible = "adi,adis16470",
- .data = &adis16475_chip_info[ADIS16470] },
- { .compatible = "adi,adis16475-1",
- .data = &adis16475_chip_info[ADIS16475_1] },
- { .compatible = "adi,adis16475-2",
- .data = &adis16475_chip_info[ADIS16475_2] },
- { .compatible = "adi,adis16475-3",
- .data = &adis16475_chip_info[ADIS16475_3] },
- { .compatible = "adi,adis16477-1",
- .data = &adis16475_chip_info[ADIS16477_1] },
- { .compatible = "adi,adis16477-2",
- .data = &adis16475_chip_info[ADIS16477_2] },
- { .compatible = "adi,adis16477-3",
- .data = &adis16475_chip_info[ADIS16477_3] },
- { .compatible = "adi,adis16465-1",
- .data = &adis16475_chip_info[ADIS16465_1] },
- { .compatible = "adi,adis16465-2",
- .data = &adis16475_chip_info[ADIS16465_2] },
- { .compatible = "adi,adis16465-3",
- .data = &adis16475_chip_info[ADIS16465_3] },
- { .compatible = "adi,adis16467-1",
- .data = &adis16475_chip_info[ADIS16467_1] },
- { .compatible = "adi,adis16467-2",
- .data = &adis16475_chip_info[ADIS16467_2] },
- { .compatible = "adi,adis16467-3",
- .data = &adis16475_chip_info[ADIS16467_3] },
- { .compatible = "adi,adis16500",
- .data = &adis16475_chip_info[ADIS16500] },
- { .compatible = "adi,adis16505-1",
- .data = &adis16475_chip_info[ADIS16505_1] },
- { .compatible = "adi,adis16505-2",
- .data = &adis16475_chip_info[ADIS16505_2] },
- { .compatible = "adi,adis16505-3",
- .data = &adis16475_chip_info[ADIS16505_3] },
- { .compatible = "adi,adis16507-1",
- .data = &adis16475_chip_info[ADIS16507_1] },
- { .compatible = "adi,adis16507-2",
- .data = &adis16475_chip_info[ADIS16507_2] },
- { .compatible = "adi,adis16507-3",
- .data = &adis16475_chip_info[ADIS16507_3] },
- { },
-};
-MODULE_DEVICE_TABLE(of, adis16475_of_match);
static int adis16475_probe(struct spi_device *spi)
{
st = iio_priv(indio_dev);
- st->info = device_get_match_data(&spi->dev);
+ st->info = spi_get_device_match_data(spi);
if (!st->info)
return -EINVAL;
return 0;
}
+static const struct of_device_id adis16475_of_match[] = {
+ { .compatible = "adi,adis16470",
+ .data = &adis16475_chip_info[ADIS16470] },
+ { .compatible = "adi,adis16475-1",
+ .data = &adis16475_chip_info[ADIS16475_1] },
+ { .compatible = "adi,adis16475-2",
+ .data = &adis16475_chip_info[ADIS16475_2] },
+ { .compatible = "adi,adis16475-3",
+ .data = &adis16475_chip_info[ADIS16475_3] },
+ { .compatible = "adi,adis16477-1",
+ .data = &adis16475_chip_info[ADIS16477_1] },
+ { .compatible = "adi,adis16477-2",
+ .data = &adis16475_chip_info[ADIS16477_2] },
+ { .compatible = "adi,adis16477-3",
+ .data = &adis16475_chip_info[ADIS16477_3] },
+ { .compatible = "adi,adis16465-1",
+ .data = &adis16475_chip_info[ADIS16465_1] },
+ { .compatible = "adi,adis16465-2",
+ .data = &adis16475_chip_info[ADIS16465_2] },
+ { .compatible = "adi,adis16465-3",
+ .data = &adis16475_chip_info[ADIS16465_3] },
+ { .compatible = "adi,adis16467-1",
+ .data = &adis16475_chip_info[ADIS16467_1] },
+ { .compatible = "adi,adis16467-2",
+ .data = &adis16475_chip_info[ADIS16467_2] },
+ { .compatible = "adi,adis16467-3",
+ .data = &adis16475_chip_info[ADIS16467_3] },
+ { .compatible = "adi,adis16500",
+ .data = &adis16475_chip_info[ADIS16500] },
+ { .compatible = "adi,adis16505-1",
+ .data = &adis16475_chip_info[ADIS16505_1] },
+ { .compatible = "adi,adis16505-2",
+ .data = &adis16475_chip_info[ADIS16505_2] },
+ { .compatible = "adi,adis16505-3",
+ .data = &adis16475_chip_info[ADIS16505_3] },
+ { .compatible = "adi,adis16507-1",
+ .data = &adis16475_chip_info[ADIS16507_1] },
+ { .compatible = "adi,adis16507-2",
+ .data = &adis16475_chip_info[ADIS16507_2] },
+ { .compatible = "adi,adis16507-3",
+ .data = &adis16475_chip_info[ADIS16507_3] },
+ { },
+};
+MODULE_DEVICE_TABLE(of, adis16475_of_match);
+
+static const struct spi_device_id adis16475_ids[] = {
+ { "adis16470", (kernel_ulong_t)&adis16475_chip_info[ADIS16470] },
+ { "adis16475-1", (kernel_ulong_t)&adis16475_chip_info[ADIS16475_1] },
+ { "adis16475-2", (kernel_ulong_t)&adis16475_chip_info[ADIS16475_2] },
+ { "adis16475-3", (kernel_ulong_t)&adis16475_chip_info[ADIS16475_3] },
+ { "adis16477-1", (kernel_ulong_t)&adis16475_chip_info[ADIS16477_1] },
+ { "adis16477-2", (kernel_ulong_t)&adis16475_chip_info[ADIS16477_2] },
+ { "adis16477-3", (kernel_ulong_t)&adis16475_chip_info[ADIS16477_3] },
+ { "adis16465-1", (kernel_ulong_t)&adis16475_chip_info[ADIS16465_1] },
+ { "adis16465-2", (kernel_ulong_t)&adis16475_chip_info[ADIS16465_2] },
+ { "adis16465-3", (kernel_ulong_t)&adis16475_chip_info[ADIS16465_3] },
+ { "adis16467-1", (kernel_ulong_t)&adis16475_chip_info[ADIS16467_1] },
+ { "adis16467-2", (kernel_ulong_t)&adis16475_chip_info[ADIS16467_2] },
+ { "adis16467-3", (kernel_ulong_t)&adis16475_chip_info[ADIS16467_3] },
+ { "adis16500", (kernel_ulong_t)&adis16475_chip_info[ADIS16500] },
+ { "adis16505-1", (kernel_ulong_t)&adis16475_chip_info[ADIS16505_1] },
+ { "adis16505-2", (kernel_ulong_t)&adis16475_chip_info[ADIS16505_2] },
+ { "adis16505-3", (kernel_ulong_t)&adis16475_chip_info[ADIS16505_3] },
+ { "adis16507-1", (kernel_ulong_t)&adis16475_chip_info[ADIS16507_1] },
+ { "adis16507-2", (kernel_ulong_t)&adis16475_chip_info[ADIS16507_2] },
+ { "adis16507-3", (kernel_ulong_t)&adis16475_chip_info[ADIS16507_3] },
+ { }
+};
+MODULE_DEVICE_TABLE(spi, adis16475_ids);
+
static struct spi_driver adis16475_driver = {
.driver = {
.name = "adis16475",
.of_match_table = adis16475_of_match,
},
.probe = adis16475_probe,
+ .id_table = adis16475_ids,
};
module_spi_driver(adis16475_driver);
ret = inv_mpu6050_sensor_show(st, st->reg->gyro_offset,
chan->channel2, val);
mutex_unlock(&st->lock);
- return IIO_VAL_INT;
+ return ret;
case IIO_ACCEL:
mutex_lock(&st->lock);
ret = inv_mpu6050_sensor_show(st, st->reg->accl_offset,
chan->channel2, val);
mutex_unlock(&st->lock);
- return IIO_VAL_INT;
+ return ret;
default:
return -EINVAL;
#include "../common/hid-sensors/hid-sensor-trigger.h"
enum {
- CHANNEL_SCAN_INDEX_INTENSITY,
- CHANNEL_SCAN_INDEX_ILLUM,
- CHANNEL_SCAN_INDEX_COLOR_TEMP,
- CHANNEL_SCAN_INDEX_CHROMATICITY_X,
- CHANNEL_SCAN_INDEX_CHROMATICITY_Y,
+ CHANNEL_SCAN_INDEX_INTENSITY = 0,
+ CHANNEL_SCAN_INDEX_ILLUM = 1,
CHANNEL_SCAN_INDEX_MAX
};
BIT(IIO_CHAN_INFO_HYSTERESIS_RELATIVE),
.scan_index = CHANNEL_SCAN_INDEX_ILLUM,
},
- {
- .type = IIO_COLORTEMP,
- .info_mask_separate = BIT(IIO_CHAN_INFO_RAW),
- .info_mask_shared_by_type = BIT(IIO_CHAN_INFO_OFFSET) |
- BIT(IIO_CHAN_INFO_SCALE) |
- BIT(IIO_CHAN_INFO_SAMP_FREQ) |
- BIT(IIO_CHAN_INFO_HYSTERESIS) |
- BIT(IIO_CHAN_INFO_HYSTERESIS_RELATIVE),
- .scan_index = CHANNEL_SCAN_INDEX_COLOR_TEMP,
- },
- {
- .type = IIO_CHROMATICITY,
- .modified = 1,
- .channel2 = IIO_MOD_X,
- .info_mask_separate = BIT(IIO_CHAN_INFO_RAW),
- .info_mask_shared_by_type = BIT(IIO_CHAN_INFO_OFFSET) |
- BIT(IIO_CHAN_INFO_SCALE) |
- BIT(IIO_CHAN_INFO_SAMP_FREQ) |
- BIT(IIO_CHAN_INFO_HYSTERESIS) |
- BIT(IIO_CHAN_INFO_HYSTERESIS_RELATIVE),
- .scan_index = CHANNEL_SCAN_INDEX_CHROMATICITY_X,
- },
- {
- .type = IIO_CHROMATICITY,
- .modified = 1,
- .channel2 = IIO_MOD_Y,
- .info_mask_separate = BIT(IIO_CHAN_INFO_RAW),
- .info_mask_shared_by_type = BIT(IIO_CHAN_INFO_OFFSET) |
- BIT(IIO_CHAN_INFO_SCALE) |
- BIT(IIO_CHAN_INFO_SAMP_FREQ) |
- BIT(IIO_CHAN_INFO_HYSTERESIS) |
- BIT(IIO_CHAN_INFO_HYSTERESIS_RELATIVE),
- .scan_index = CHANNEL_SCAN_INDEX_CHROMATICITY_Y,
- },
IIO_CHAN_SOFT_TIMESTAMP(CHANNEL_SCAN_INDEX_TIMESTAMP)
};
min = als_state->als[chan->scan_index].logical_minimum;
address = HID_USAGE_SENSOR_LIGHT_ILLUM;
break;
- case CHANNEL_SCAN_INDEX_COLOR_TEMP:
- report_id = als_state->als[chan->scan_index].report_id;
- min = als_state->als[chan->scan_index].logical_minimum;
- address = HID_USAGE_SENSOR_LIGHT_COLOR_TEMPERATURE;
- break;
- case CHANNEL_SCAN_INDEX_CHROMATICITY_X:
- report_id = als_state->als[chan->scan_index].report_id;
- min = als_state->als[chan->scan_index].logical_minimum;
- address = HID_USAGE_SENSOR_LIGHT_CHROMATICITY_X;
- break;
- case CHANNEL_SCAN_INDEX_CHROMATICITY_Y:
- report_id = als_state->als[chan->scan_index].report_id;
- min = als_state->als[chan->scan_index].logical_minimum;
- address = HID_USAGE_SENSOR_LIGHT_CHROMATICITY_Y;
- break;
default:
report_id = -1;
break;
als_state->scan.illum[CHANNEL_SCAN_INDEX_ILLUM] = sample_data;
ret = 0;
break;
- case HID_USAGE_SENSOR_LIGHT_COLOR_TEMPERATURE:
- als_state->scan.illum[CHANNEL_SCAN_INDEX_COLOR_TEMP] = sample_data;
- ret = 0;
- break;
- case HID_USAGE_SENSOR_LIGHT_CHROMATICITY_X:
- als_state->scan.illum[CHANNEL_SCAN_INDEX_CHROMATICITY_X] = sample_data;
- ret = 0;
- break;
- case HID_USAGE_SENSOR_LIGHT_CHROMATICITY_Y:
- als_state->scan.illum[CHANNEL_SCAN_INDEX_CHROMATICITY_Y] = sample_data;
- ret = 0;
- break;
case HID_USAGE_SENSOR_TIME_TIMESTAMP:
als_state->timestamp = hid_sensor_convert_timestamp(&als_state->common_attributes,
*(s64 *)raw_data);
st->als[i].report_id);
}
- ret = sensor_hub_input_get_attribute_info(hsdev, HID_INPUT_REPORT,
- usage_id,
- HID_USAGE_SENSOR_LIGHT_COLOR_TEMPERATURE,
- &st->als[CHANNEL_SCAN_INDEX_COLOR_TEMP]);
- if (ret < 0)
- return ret;
- als_adjust_channel_bit_mask(channels, CHANNEL_SCAN_INDEX_COLOR_TEMP,
- st->als[CHANNEL_SCAN_INDEX_COLOR_TEMP].size);
-
- dev_dbg(&pdev->dev, "als %x:%x\n",
- st->als[CHANNEL_SCAN_INDEX_COLOR_TEMP].index,
- st->als[CHANNEL_SCAN_INDEX_COLOR_TEMP].report_id);
-
- for (i = 0; i < 2; i++) {
- int next_scan_index = CHANNEL_SCAN_INDEX_CHROMATICITY_X + i;
-
- ret = sensor_hub_input_get_attribute_info(hsdev,
- HID_INPUT_REPORT, usage_id,
- HID_USAGE_SENSOR_LIGHT_CHROMATICITY_X + i,
- &st->als[next_scan_index]);
- if (ret < 0)
- return ret;
-
- als_adjust_channel_bit_mask(channels,
- CHANNEL_SCAN_INDEX_CHROMATICITY_X + i,
- st->als[next_scan_index].size);
-
- dev_dbg(&pdev->dev, "als %x:%x\n",
- st->als[next_scan_index].index,
- st->als[next_scan_index].report_id);
- }
-
st->scale_precision = hid_sensor_format_scale(usage_id,
&st->als[CHANNEL_SCAN_INDEX_INTENSITY],
&st->scale_pre_decml, &st->scale_post_decml);
case IIO_CHAN_INFO_OFFSET:
switch (chan->type) {
case IIO_TEMP:
- *val = -266314;
+ *val = -16005;
return IIO_VAL_INT;
default:
return -EINVAL;
return page_size;
}
- /* rdma_for_each_block() has a bug if the page size is smaller than the
- * page size used to build the umem. For now prevent smaller page sizes
- * from being returned.
- */
- pgsz_bitmap &= GENMASK(BITS_PER_LONG - 1, PAGE_SHIFT);
-
/* The best result is the smallest page size that results in the minimum
* number of required pages. Compute the largest page size that could
* work based on VA address bits that don't change.
int rc;
u32 netdev_speed;
struct net_device *netdev;
- struct ethtool_link_ksettings lksettings;
+ struct ethtool_link_ksettings lksettings = {};
if (rdma_port_get_link_layer(dev, port_num) != IB_LINK_LAYER_ETHERNET)
return -EINVAL;
BNXT_RE_DESC "\n";
MODULE_AUTHOR("Eddie Wai <eddie.wai@broadcom.com>");
-MODULE_DESCRIPTION(BNXT_RE_DESC " Driver");
+MODULE_DESCRIPTION(BNXT_RE_DESC);
MODULE_LICENSE("Dual BSD/GPL");
/* globals */
cong_alg->wnd_mode_sel = WND_LIMIT;
break;
default:
- ibdev_err(&hr_dev->ib_dev,
- "error type(%u) for congestion selection.\n",
- hr_dev->caps.cong_type);
- return -EINVAL;
+ ibdev_warn(&hr_dev->ib_dev,
+ "invalid type(%u) for congestion selection.\n",
+ hr_dev->caps.cong_type);
+ hr_dev->caps.cong_type = CONG_TYPE_DCQCN;
+ cong_alg->alg_sel = CONG_DCQCN;
+ cong_alg->alg_sub_sel = UNSUPPORT_CONG_LEVEL;
+ cong_alg->dip_vld = DIP_INVALID;
+ cong_alg->wnd_mode_sel = WND_LIMIT;
+ break;
}
return 0;
break;
case IRDMA_AE_QP_SUSPEND_COMPLETE:
if (iwqp->iwdev->vsi.tc_change_pending) {
- atomic_dec(&iwqp->sc_qp.vsi->qp_suspend_reqs);
+ if (!atomic_dec_return(&qp->vsi->qp_suspend_reqs))
+ wake_up(&iwqp->iwdev->suspend_wq);
+ }
+ if (iwqp->suspend_pending) {
+ iwqp->suspend_pending = false;
wake_up(&iwqp->iwdev->suspend_wq);
}
break;
struct irdma_cqp *cqp = &rf->cqp;
int status = 0;
- if (rf->cqp_cmpl_wq)
- destroy_workqueue(rf->cqp_cmpl_wq);
-
status = irdma_sc_cqp_destroy(dev->cqp);
if (status)
ibdev_dbg(to_ibdev(dev), "ERR: Destroy CQP failed %d\n", status);
struct irdma_ccq *ccq = &rf->ccq;
int status = 0;
+ if (rf->cqp_cmpl_wq)
+ destroy_workqueue(rf->cqp_cmpl_wq);
+
if (!rf->reset)
status = irdma_sc_ccq_destroy(dev->ccq, 0, true);
if (status)
int status;
struct irdma_ceq_init_info info = {};
struct irdma_sc_dev *dev = &rf->sc_dev;
- u64 scratch;
u32 ceq_size;
info.ceq_id = ceq_id;
iwceq->sc_ceq.ceq_id = ceq_id;
info.dev = dev;
info.vsi = vsi;
- scratch = (uintptr_t)&rf->cqp.sc_cqp;
status = irdma_sc_ceq_init(&iwceq->sc_ceq, &info);
if (!status) {
if (dev->ceq_valid)
status = irdma_cqp_ceq_cmd(&rf->sc_dev, &iwceq->sc_ceq,
IRDMA_OP_CEQ_CREATE);
else
- status = irdma_sc_cceq_create(&iwceq->sc_ceq, scratch);
+ status = irdma_sc_cceq_create(&iwceq->sc_ceq, 0);
}
if (status) {
/* Wait for all qp's to suspend */
wait_event_timeout(iwdev->suspend_wq,
!atomic_read(&iwdev->vsi.qp_suspend_reqs),
- IRDMA_EVENT_TIMEOUT);
+ msecs_to_jiffies(IRDMA_EVENT_TIMEOUT_MS));
irdma_ws_reset(&iwdev->vsi);
}
#define MAX_DPC_ITERATIONS 128
-#define IRDMA_EVENT_TIMEOUT 50000
+#define IRDMA_EVENT_TIMEOUT_MS 5000
#define IRDMA_VCHNL_EVENT_TIMEOUT 100000
#define IRDMA_RST_TIMEOUT_HZ 4
return prio;
}
+static int irdma_wait_for_suspend(struct irdma_qp *iwqp)
+{
+ if (!wait_event_timeout(iwqp->iwdev->suspend_wq,
+ !iwqp->suspend_pending,
+ msecs_to_jiffies(IRDMA_EVENT_TIMEOUT_MS))) {
+ iwqp->suspend_pending = false;
+ ibdev_warn(&iwqp->iwdev->ibdev,
+ "modify_qp timed out waiting for suspend. qp_id = %d, last_ae = 0x%x\n",
+ iwqp->ibqp.qp_num, iwqp->last_aeq);
+ return -EBUSY;
+ }
+
+ return 0;
+}
+
/**
* irdma_modify_qp_roce - modify qp request
* @ibqp: qp's pointer for modify
info.next_iwarp_state = IRDMA_QP_STATE_SQD;
issue_modify_qp = 1;
+ iwqp->suspend_pending = true;
break;
case IB_QPS_SQE:
case IB_QPS_ERR:
case IB_QPS_RESET:
- if (iwqp->iwarp_state == IRDMA_QP_STATE_RTS) {
- spin_unlock_irqrestore(&iwqp->lock, flags);
- info.next_iwarp_state = IRDMA_QP_STATE_SQD;
- irdma_hw_modify_qp(iwdev, iwqp, &info, true);
- spin_lock_irqsave(&iwqp->lock, flags);
- }
-
if (iwqp->iwarp_state == IRDMA_QP_STATE_ERROR) {
spin_unlock_irqrestore(&iwqp->lock, flags);
if (udata && udata->inlen) {
ctx_info->rem_endpoint_idx = udp_info->arp_idx;
if (irdma_hw_modify_qp(iwdev, iwqp, &info, true))
return -EINVAL;
+ if (info.next_iwarp_state == IRDMA_QP_STATE_SQD) {
+ ret = irdma_wait_for_suspend(iwqp);
+ if (ret)
+ return ret;
+ }
spin_lock_irqsave(&iwqp->lock, flags);
if (iwqp->iwarp_state == info.curr_iwarp_state) {
iwqp->iwarp_state = info.next_iwarp_state;
iwmr->type = reg_type;
pgsz_bitmap = (reg_type == IRDMA_MEMREG_TYPE_MEM) ?
- iwdev->rf->sc_dev.hw_attrs.page_size_cap : PAGE_SIZE;
+ iwdev->rf->sc_dev.hw_attrs.page_size_cap : SZ_4K;
iwmr->page_size = ib_umem_find_best_pgsz(region, pgsz_bitmap, virt);
if (unlikely(!iwmr->page_size)) {
int err;
u8 lvl;
+ /* iWarp: Catch page not starting on OS page boundary */
+ if (!rdma_protocol_roce(&iwdev->ibdev, 1) &&
+ ib_umem_offset(iwmr->region))
+ return -EINVAL;
+
total = req.sq_pages + req.rq_pages + 1;
if (total > iwmr->page_cnt)
return -EINVAL;
u8 flush_issued : 1;
u8 sig_all : 1;
u8 pau_mode : 1;
+ u8 suspend_pending : 1;
u8 rsvd : 1;
u8 iwarp_state;
u16 term_sq_flush_code;
struct rtrs_clt_path *clt_path;
int err;
- if (WARN_ON(!req->in_use))
+ if (!req->in_use)
return;
if (WARN_ON(!req->con))
return;
clt_path->s.dev_ref++;
max_send_wr = min_t(int, wr_limit,
/* QD * (REQ + RSP + FR REGS or INVS) + drain */
- clt_path->queue_depth * 3 + 1);
+ clt_path->queue_depth * 4 + 1);
max_recv_wr = min_t(int, wr_limit,
clt_path->queue_depth * 3 + 1);
max_send_sge = 2;
if (err)
goto destroy;
- rtrs_start_hb(&clt_path->s);
-
return 0;
destroy:
goto out;
}
rtrs_clt_path_up(clt_path);
+ rtrs_start_hb(&clt_path->s);
out:
mutex_unlock(&clt_path->init_mutex);
{
enum rtrs_srv_state old_state;
bool changed = false;
+ unsigned long flags;
- spin_lock_irq(&srv_path->state_lock);
+ spin_lock_irqsave(&srv_path->state_lock, flags);
old_state = srv_path->state;
switch (new_state) {
case RTRS_SRV_CONNECTED:
}
if (changed)
srv_path->state = new_state;
- spin_unlock_irq(&srv_path->state_lock);
+ spin_unlock_irqrestore(&srv_path->state_lock, flags);
return changed;
}
struct rtrs_srv_mr *srv_mr;
srv_mr = &srv_path->mrs[i];
- rtrs_iu_free(srv_mr->iu, srv_path->s.dev->ib_dev, 1);
+
+ if (always_invalidate)
+ rtrs_iu_free(srv_mr->iu, srv_path->s.dev->ib_dev, 1);
+
ib_dereg_mr(srv_mr->mr);
ib_dma_unmap_sg(srv_path->s.dev->ib_dev, srv_mr->sgt.sgl,
srv_mr->sgt.nents, DMA_BIDIRECTIONAL);
WARN_ON(wc->opcode != IB_WC_SEND);
}
-static void rtrs_srv_path_up(struct rtrs_srv_path *srv_path)
+static int rtrs_srv_path_up(struct rtrs_srv_path *srv_path)
{
struct rtrs_srv_sess *srv = srv_path->srv;
struct rtrs_srv_ctx *ctx = srv->ctx;
- int up;
+ int up, ret = 0;
mutex_lock(&srv->paths_ev_mutex);
up = ++srv->paths_up;
if (up == 1)
- ctx->ops.link_ev(srv, RTRS_SRV_LINK_EV_CONNECTED, NULL);
+ ret = ctx->ops.link_ev(srv, RTRS_SRV_LINK_EV_CONNECTED, NULL);
mutex_unlock(&srv->paths_ev_mutex);
/* Mark session as established */
- srv_path->established = true;
+ if (!ret)
+ srv_path->established = true;
+
+ return ret;
}
static void rtrs_srv_path_down(struct rtrs_srv_path *srv_path)
goto iu_free;
kobject_get(&srv_path->kobj);
get_device(&srv_path->srv->dev);
- rtrs_srv_change_state(srv_path, RTRS_SRV_CONNECTED);
+ err = rtrs_srv_change_state(srv_path, RTRS_SRV_CONNECTED);
+ if (!err) {
+ rtrs_err(s, "rtrs_srv_change_state(), err: %d\n", err);
+ goto iu_free;
+ }
+
rtrs_srv_start_hb(srv_path);
/*
* all connections are successfully established. Thus, simply notify
* listener with a proper event if we are the first path.
*/
- rtrs_srv_path_up(srv_path);
+ err = rtrs_srv_path_up(srv_path);
+ if (err) {
+ rtrs_err(s, "rtrs_srv_path_up(), err: %d\n", err);
+ goto iu_free;
+ }
ib_dma_sync_single_for_device(srv_path->s.dev->ib_dev,
tx_iu->dma_addr,
srv_path = container_of(work, typeof(*srv_path), close_work);
- rtrs_srv_destroy_path_files(srv_path);
rtrs_srv_stop_hb(srv_path);
for (i = 0; i < srv_path->s.con_num; i++) {
/* Wait for all completion */
wait_for_completion(&srv_path->complete_done);
+ rtrs_srv_destroy_path_files(srv_path);
+
/* Notify upper layer if we are the last path */
rtrs_srv_path_down(srv_path);
{ 0x146b, 0x0604, "Bigben Interactive DAIJA Arcade Stick", MAP_TRIGGERS_TO_BUTTONS, XTYPE_XBOX360 },
{ 0x1532, 0x0a00, "Razer Atrox Arcade Stick", MAP_TRIGGERS_TO_BUTTONS, XTYPE_XBOXONE },
{ 0x1532, 0x0a03, "Razer Wildcat", 0, XTYPE_XBOXONE },
+ { 0x1532, 0x0a29, "Razer Wolverine V2", 0, XTYPE_XBOXONE },
{ 0x15e4, 0x3f00, "Power A Mini Pro Elite", 0, XTYPE_XBOX360 },
{ 0x15e4, 0x3f0a, "Xbox Airflo wired controller", 0, XTYPE_XBOX360 },
{ 0x15e4, 0x3f10, "Batarang Xbox 360 controller", 0, XTYPE_XBOX360 },
ps2dev->serio->phys);
}
+#ifdef CONFIG_X86
+static bool atkbd_is_portable_device(void)
+{
+ static const char * const chassis_types[] = {
+ "8", /* Portable */
+ "9", /* Laptop */
+ "10", /* Notebook */
+ "14", /* Sub-Notebook */
+ "31", /* Convertible */
+ "32", /* Detachable */
+ };
+ int i;
+
+ for (i = 0; i < ARRAY_SIZE(chassis_types); i++)
+ if (dmi_match(DMI_CHASSIS_TYPE, chassis_types[i]))
+ return true;
+
+ return false;
+}
+
+/*
+ * On many modern laptops ATKBD_CMD_GETID may cause problems, on these laptops
+ * the controller is always in translated mode. In this mode mice/touchpads will
+ * not work. So in this case simply assume a keyboard is connected to avoid
+ * confusing some laptop keyboards.
+ *
+ * Skipping ATKBD_CMD_GETID ends up using a fake keyboard id. Using a fake id is
+ * ok in translated mode, only atkbd_select_set() checks atkbd->id and in
+ * translated mode that is a no-op.
+ */
+static bool atkbd_skip_getid(struct atkbd *atkbd)
+{
+ return atkbd->translated && atkbd_is_portable_device();
+}
+#else
+static inline bool atkbd_skip_getid(struct atkbd *atkbd) { return false; }
+#endif
+
/*
* atkbd_probe() probes for an AT keyboard on a serio port.
*/
*/
param[0] = param[1] = 0xa5; /* initialize with invalid values */
- if (ps2_command(ps2dev, param, ATKBD_CMD_GETID)) {
+ if (atkbd_skip_getid(atkbd) || ps2_command(ps2dev, param, ATKBD_CMD_GETID)) {
/*
- * If the get ID command failed, we check if we can at least set the LEDs on
- * the keyboard. This should work on every keyboard out there. It also turns
- * the LEDs off, which we want anyway.
+ * If the get ID command was skipped or failed, we check if we can at least set
+ * the LEDs on the keyboard. This should work on every keyboard out there.
+ * It also turns the LEDs off, which we want anyway.
*/
param[0] = 0;
if (ps2_command(ps2dev, param, ATKBD_CMD_SETLEDS))
keys->codes = devm_kmemdup(&pdev->dev, micro_keycodes,
keys->input->keycodesize * keys->input->keycodemax,
GFP_KERNEL);
+ if (!keys->codes)
+ return -ENOMEM;
+
keys->input->keycode = keys->codes;
__set_bit(EV_KEY, keys->input->evbit);
info->name = "power";
info->event_code = KEY_POWER;
info->wakeup = true;
+ } else if (upage == 0x01 && usage == 0xc6) {
+ info->name = "airplane mode switch";
+ info->event_type = EV_SW;
+ info->event_code = SW_RFKILL_ALL;
+ info->active_low = false;
} else if (upage == 0x01 && usage == 0xca) {
info->name = "rotation lock switch";
info->event_type = EV_SW;
return 0;
}
-static int __exit amimouse_remove(struct platform_device *pdev)
+static void __exit amimouse_remove(struct platform_device *pdev)
{
struct input_dev *dev = platform_get_drvdata(pdev);
input_unregister_device(dev);
- return 0;
}
static struct platform_driver amimouse_driver = {
- .remove = __exit_p(amimouse_remove),
+ .remove_new = __exit_p(amimouse_remove),
.driver = {
.name = "amiga-mouse",
},
"LEN009b", /* T580 */
"LEN0402", /* X1 Extreme Gen 2 / P1 Gen 2 */
"LEN040f", /* P1 Gen 3 */
+ "LEN0411", /* L14 Gen 1 */
"LEN200f", /* T450s */
"LEN2044", /* L470 */
"LEN2054", /* E480 */
},
.driver_data = (void *)(SERIO_QUIRK_DRITEK)
},
+ {
+ /* Acer TravelMate P459-G2-M */
+ .matches = {
+ DMI_MATCH(DMI_SYS_VENDOR, "Acer"),
+ DMI_MATCH(DMI_PRODUCT_NAME, "TravelMate P459-G2-M"),
+ },
+ .driver_data = (void *)(SERIO_QUIRK_NOMUX)
+ },
{
/* Amoi M636/A737 */
.matches = {
}
mutex_unlock(&icc_lock);
+ if (!node)
+ return ERR_PTR(-EINVAL);
+
if (IS_ERR(node))
return ERR_CAST(node);
if (qn->ib_coeff) {
agg_peak_rate = qn->max_peak[ctx] * 100;
- agg_peak_rate = div_u64(qn->max_peak[ctx], qn->ib_coeff);
+ agg_peak_rate = div_u64(agg_peak_rate, qn->ib_coeff);
} else {
agg_peak_rate = qn->max_peak[ctx];
}
.driver = {
.name = "qnoc-sm8250",
.of_match_table = qnoc_of_match,
+ .sync_state = icc_sync_state,
},
};
module_platform_driver(qnoc_driver);
{
struct qi_desc desc;
+ /*
+ * VT-d spec, section 4.3:
+ *
+ * Software is recommended to not submit any Device-TLB invalidation
+ * requests while address remapping hardware is disabled.
+ */
+ if (!(iommu->gcmd & DMA_GCMD_TE))
+ return;
+
if (mask) {
addr |= (1ULL << (VTD_PAGE_SHIFT + mask - 1)) - 1;
desc.qw1 = QI_DEV_IOTLB_ADDR(addr) | QI_DEV_IOTLB_SIZE;
unsigned long mask = 1UL << (VTD_PAGE_SHIFT + size_order - 1);
struct qi_desc desc = {.qw1 = 0, .qw2 = 0, .qw3 = 0};
+ /*
+ * VT-d spec, section 4.3:
+ *
+ * Software is recommended to not submit any Device-TLB invalidation
+ * requests while address remapping hardware is disabled.
+ */
+ if (!(iommu->gcmd & DMA_GCMD_TE))
+ return;
+
desc.qw0 = QI_DEV_EIOTLB_PASID(pasid) | QI_DEV_EIOTLB_SID(sid) |
QI_DEV_EIOTLB_QDEP(qdep) | QI_DEIOTLB_TYPE |
QI_DEV_IOTLB_PFSID(pfsid);
#define IDENTMAP_AZALIA 4
const struct iommu_ops intel_iommu_ops;
-const struct iommu_dirty_ops intel_dirty_ops;
+static const struct iommu_dirty_ops intel_dirty_ops;
static bool translation_pre_enabled(struct intel_iommu *iommu)
{
attr |= DMA_FL_PTE_DIRTY;
}
+ domain->has_mappings = true;
+
pteval = ((phys_addr_t)phys_pfn << VTD_PAGE_SHIFT) | attr;
while (nr_pages > 0) {
return ret;
}
- iommu_enable_pci_caps(info);
+ if (sm_supported(info->iommu) || !domain_type_is_si(info->domain))
+ iommu_enable_pci_caps(info);
return 0;
}
*/
static void domain_context_clear(struct device_domain_info *info)
{
- if (!info->iommu || !info->dev || !dev_is_pci(info->dev))
- return;
+ if (!dev_is_pci(info->dev))
+ domain_context_clear_one(info, info->bus, info->devfn);
pci_for_each_dma_alias(to_pci_dev(info->dev),
&domain_context_clear_one_cb, info);
return true;
spin_lock_irqsave(&dmar_domain->lock, flags);
- if (!domain_support_force_snooping(dmar_domain)) {
+ if (!domain_support_force_snooping(dmar_domain) ||
+ (!dmar_domain->use_first_level && dmar_domain->has_mappings)) {
spin_unlock_irqrestore(&dmar_domain->lock, flags);
return false;
}
return 0;
}
-const struct iommu_dirty_ops intel_dirty_ops = {
+static const struct iommu_dirty_ops intel_dirty_ops = {
.set_dirty_tracking = intel_iommu_set_dirty_tracking,
.read_and_clear_dirty = intel_iommu_read_and_clear_dirty,
};
ver = (dev->device >> 8) & 0xff;
if (ver != 0x45 && ver != 0x46 && ver != 0x4c &&
ver != 0x4e && ver != 0x8a && ver != 0x98 &&
- ver != 0x9a && ver != 0xa7)
+ ver != 0x9a && ver != 0xa7 && ver != 0x7d)
return;
if (risky_device(dev))
*/
u8 dirty_tracking:1; /* Dirty tracking is enabled */
u8 nested_parent:1; /* Has other domains nested on it */
+ u8 has_mappings:1; /* Has mappings configured through
+ * iommu_map() interface.
+ */
spinlock_t lock; /* Protect device tracking lists */
struct list_head devices; /* all devices' list */
rcu_read_unlock();
}
+static void intel_flush_svm_all(struct intel_svm *svm)
+{
+ struct device_domain_info *info;
+ struct intel_svm_dev *sdev;
+
+ rcu_read_lock();
+ list_for_each_entry_rcu(sdev, &svm->devs, list) {
+ info = dev_iommu_priv_get(sdev->dev);
+
+ qi_flush_piotlb(sdev->iommu, sdev->did, svm->pasid, 0, -1UL, 0);
+ if (info->ats_enabled) {
+ qi_flush_dev_iotlb_pasid(sdev->iommu, sdev->sid, info->pfsid,
+ svm->pasid, sdev->qdep,
+ 0, 64 - VTD_PAGE_SHIFT);
+ quirk_extra_dev_tlb_flush(info, 0, 64 - VTD_PAGE_SHIFT,
+ svm->pasid, sdev->qdep);
+ }
+ }
+ rcu_read_unlock();
+}
+
/* Pages have been freed at this point */
static void intel_arch_invalidate_secondary_tlbs(struct mmu_notifier *mn,
struct mm_struct *mm,
{
struct intel_svm *svm = container_of(mn, struct intel_svm, notifier);
+ if (start == 0 && end == -1UL) {
+ intel_flush_svm_all(svm);
+ return;
+ }
+
intel_flush_svm_range(svm, start,
(end - start + PAGE_SIZE - 1) >> VTD_PAGE_SHIFT, 0);
}
dev_iommu_free(dev);
}
+DEFINE_MUTEX(iommu_probe_device_lock);
+
static int __iommu_probe_device(struct device *dev, struct list_head *group_list)
{
const struct iommu_ops *ops = dev->bus->iommu_ops;
struct iommu_group *group;
- static DEFINE_MUTEX(iommu_probe_device_lock);
struct group_device *gdev;
int ret;
* probably be able to use device_lock() here to minimise the scope,
* but for now enforcing a simple global ordering is fine.
*/
- mutex_lock(&iommu_probe_device_lock);
+ lockdep_assert_held(&iommu_probe_device_lock);
/* Device is probed already if in a group */
- if (dev->iommu_group) {
- ret = 0;
- goto out_unlock;
- }
+ if (dev->iommu_group)
+ return 0;
ret = iommu_init_device(dev, ops);
if (ret)
- goto out_unlock;
+ return ret;
group = dev->iommu_group;
gdev = iommu_group_alloc_device(group, dev);
list_add_tail(&group->entry, group_list);
}
mutex_unlock(&group->mutex);
- mutex_unlock(&iommu_probe_device_lock);
if (dev_is_pci(dev))
iommu_dma_set_pci_32bit_workaround(dev);
iommu_deinit_device(dev);
mutex_unlock(&group->mutex);
iommu_group_put(group);
-out_unlock:
- mutex_unlock(&iommu_probe_device_lock);
return ret;
}
const struct iommu_ops *ops;
int ret;
+ mutex_lock(&iommu_probe_device_lock);
ret = __iommu_probe_device(dev, NULL);
+ mutex_unlock(&iommu_probe_device_lock);
if (ret)
return ret;
*/
if (ops->default_domain) {
if (req_type)
- return NULL;
+ return ERR_PTR(-EINVAL);
return ops->default_domain;
}
/* The driver gave no guidance on what type to use, try the default */
dom = __iommu_group_alloc_default_domain(group, iommu_def_domain_type);
- if (dom)
+ if (!IS_ERR(dom))
return dom;
/* Otherwise IDENTITY and DMA_FQ defaults will try DMA */
if (iommu_def_domain_type == IOMMU_DOMAIN_DMA)
- return NULL;
+ return ERR_PTR(-EINVAL);
dom = __iommu_group_alloc_default_domain(group, IOMMU_DOMAIN_DMA);
- if (!dom)
- return NULL;
+ if (IS_ERR(dom))
+ return dom;
pr_warn("Failed to allocate default IOMMU domain of type %u for group %s - Falling back to IOMMU_DOMAIN_DMA",
iommu_def_domain_type, group->name);
struct list_head *group_list = data;
int ret;
+ mutex_lock(&iommu_probe_device_lock);
ret = __iommu_probe_device(dev, group_list);
+ mutex_unlock(&iommu_probe_device_lock);
if (ret == -ENODEV)
ret = 0;
else if (ops->domain_alloc)
domain = ops->domain_alloc(alloc_type);
else
- return NULL;
+ return ERR_PTR(-EOPNOTSUPP);
+ /*
+ * Many domain_alloc ops now return ERR_PTR, make things easier for the
+ * driver by accepting ERR_PTR from all domain_alloc ops instead of
+ * having two rules.
+ */
+ if (IS_ERR(domain))
+ return domain;
if (!domain)
- return NULL;
+ return ERR_PTR(-ENOMEM);
domain->type = type;
/*
if (!domain->ops)
domain->ops = ops->default_domain_ops;
- if (iommu_is_dma_domain(domain) && iommu_get_dma_cookie(domain)) {
- iommu_domain_free(domain);
- domain = NULL;
+ if (iommu_is_dma_domain(domain)) {
+ int rc;
+
+ rc = iommu_get_dma_cookie(domain);
+ if (rc) {
+ iommu_domain_free(domain);
+ return ERR_PTR(rc);
+ }
}
return domain;
}
struct iommu_domain *iommu_domain_alloc(const struct bus_type *bus)
{
+ struct iommu_domain *domain;
+
if (bus == NULL || bus->iommu_ops == NULL)
return NULL;
- return __iommu_domain_alloc(bus->iommu_ops, NULL,
+ domain = __iommu_domain_alloc(bus->iommu_ops, NULL,
IOMMU_DOMAIN_UNMANAGED);
+ if (IS_ERR(domain))
+ return NULL;
+ return domain;
}
EXPORT_SYMBOL_GPL(iommu_domain_alloc);
return -EINVAL;
dom = iommu_group_alloc_default_domain(group, req_type);
- if (!dom)
- return -ENODEV;
+ if (IS_ERR(dom))
+ return PTR_ERR(dom);
if (group->default_domain == dom)
return 0;
static int __iommu_group_alloc_blocking_domain(struct iommu_group *group)
{
+ struct iommu_domain *domain;
+
if (group->blocking_domain)
return 0;
- group->blocking_domain =
- __iommu_group_domain_alloc(group, IOMMU_DOMAIN_BLOCKED);
- if (!group->blocking_domain) {
+ domain = __iommu_group_domain_alloc(group, IOMMU_DOMAIN_BLOCKED);
+ if (IS_ERR(domain)) {
/*
* For drivers that do not yet understand IOMMU_DOMAIN_BLOCKED
* create an empty domain instead.
*/
- group->blocking_domain = __iommu_group_domain_alloc(
- group, IOMMU_DOMAIN_UNMANAGED);
- if (!group->blocking_domain)
- return -EINVAL;
+ domain = __iommu_group_domain_alloc(group,
+ IOMMU_DOMAIN_UNMANAGED);
+ if (IS_ERR(domain))
+ return PTR_ERR(domain);
}
+ group->blocking_domain = domain;
return 0;
}
continue;
destroy_hwpt = (*do_attach)(idev, hwpt);
if (IS_ERR(destroy_hwpt)) {
- iommufd_put_object(&hwpt->obj);
+ iommufd_put_object(idev->ictx, &hwpt->obj);
/*
* -EINVAL means the domain is incompatible with the
* device. Other error codes should propagate to
goto out_unlock;
}
*pt_id = hwpt->obj.id;
- iommufd_put_object(&hwpt->obj);
+ iommufd_put_object(idev->ictx, &hwpt->obj);
goto out_unlock;
}
destroy_hwpt = ERR_PTR(-EINVAL);
goto out_put_pt_obj;
}
- iommufd_put_object(pt_obj);
+ iommufd_put_object(idev->ictx, pt_obj);
/* This destruction has to be after we unlock everything */
if (destroy_hwpt)
return 0;
out_put_pt_obj:
- iommufd_put_object(pt_obj);
+ iommufd_put_object(idev->ictx, pt_obj);
return PTR_ERR(destroy_hwpt);
}
if (IS_ERR(ioas))
return PTR_ERR(ioas);
rc = iommufd_access_change_ioas(access, ioas);
- iommufd_put_object(&ioas->obj);
+ iommufd_put_object(access->ictx, &ioas->obj);
return rc;
}
access->ops->unmap(access->data, iova, length);
- iommufd_put_object(&access->obj);
+ iommufd_put_object(access->ictx, &access->obj);
xa_lock(&ioas->iopt.access_list);
}
xa_unlock(&ioas->iopt.access_list);
out_free:
kfree(data);
out_put:
- iommufd_put_object(&idev->obj);
+ iommufd_put_object(ucmd->ictx, &idev->obj);
return rc;
}
if (ioas)
mutex_unlock(&ioas->mutex);
out_put_pt:
- iommufd_put_object(pt_obj);
+ iommufd_put_object(ucmd->ictx, pt_obj);
out_put_idev:
- iommufd_put_object(&idev->obj);
+ iommufd_put_object(ucmd->ictx, &idev->obj);
return rc;
}
rc = iopt_set_dirty_tracking(&ioas->iopt, hwpt_paging->common.domain,
enable);
- iommufd_put_object(&hwpt_paging->common.obj);
+ iommufd_put_object(ucmd->ictx, &hwpt_paging->common.obj);
return rc;
}
rc = iopt_read_and_clear_dirty_data(
&ioas->iopt, hwpt_paging->common.domain, cmd->flags, cmd);
- iommufd_put_object(&hwpt_paging->common.obj);
+ iommufd_put_object(ucmd->ictx, &hwpt_paging->common.obj);
return rc;
}
rc = -EMSGSIZE;
out_put:
up_read(&ioas->iopt.iova_rwsem);
- iommufd_put_object(&ioas->obj);
+ iommufd_put_object(ucmd->ictx, &ioas->obj);
return rc;
}
interval_tree_remove(node, &allowed_iova);
kfree(container_of(node, struct iopt_allowed, node));
}
- iommufd_put_object(&ioas->obj);
+ iommufd_put_object(ucmd->ictx, &ioas->obj);
return rc;
}
cmd->iova = iova;
rc = iommufd_ucmd_respond(ucmd, sizeof(*cmd));
out_put:
- iommufd_put_object(&ioas->obj);
+ iommufd_put_object(ucmd->ictx, &ioas->obj);
return rc;
}
return PTR_ERR(src_ioas);
rc = iopt_get_pages(&src_ioas->iopt, cmd->src_iova, cmd->length,
&pages_list);
- iommufd_put_object(&src_ioas->obj);
+ iommufd_put_object(ucmd->ictx, &src_ioas->obj);
if (rc)
return rc;
cmd->dst_iova = iova;
rc = iommufd_ucmd_respond(ucmd, sizeof(*cmd));
out_put_dst:
- iommufd_put_object(&dst_ioas->obj);
+ iommufd_put_object(ucmd->ictx, &dst_ioas->obj);
out_pages:
iopt_free_pages_list(&pages_list);
return rc;
rc = iommufd_ucmd_respond(ucmd, sizeof(*cmd));
out_put:
- iommufd_put_object(&ioas->obj);
+ iommufd_put_object(ucmd->ictx, &ioas->obj);
return rc;
}
rc = -EOPNOTSUPP;
}
- iommufd_put_object(&ioas->obj);
+ iommufd_put_object(ucmd->ictx, &ioas->obj);
return rc;
}
struct file *file;
struct xarray objects;
struct xarray groups;
+ wait_queue_head_t destroy_wait;
u8 account_mode;
/* Compatibility with VFIO no iommu */
/* Base struct for all objects with a userspace ID handle. */
struct iommufd_object {
- struct rw_semaphore destroy_rwsem;
+ refcount_t shortterm_users;
refcount_t users;
enum iommufd_object_type type;
unsigned int id;
static inline bool iommufd_lock_obj(struct iommufd_object *obj)
{
- if (!down_read_trylock(&obj->destroy_rwsem))
+ if (!refcount_inc_not_zero(&obj->users))
return false;
- if (!refcount_inc_not_zero(&obj->users)) {
- up_read(&obj->destroy_rwsem);
+ if (!refcount_inc_not_zero(&obj->shortterm_users)) {
+ /*
+ * If the caller doesn't already have a ref on obj this must be
+ * called under the xa_lock. Otherwise the caller is holding a
+ * ref on users. Thus it cannot be one before this decrement.
+ */
+ refcount_dec(&obj->users);
return false;
}
return true;
struct iommufd_object *iommufd_get_object(struct iommufd_ctx *ictx, u32 id,
enum iommufd_object_type type);
-static inline void iommufd_put_object(struct iommufd_object *obj)
+static inline void iommufd_put_object(struct iommufd_ctx *ictx,
+ struct iommufd_object *obj)
{
+ /*
+ * Users first, then shortterm so that REMOVE_WAIT_SHORTTERM never sees
+ * a spurious !0 users with a 0 shortterm_users.
+ */
refcount_dec(&obj->users);
- up_read(&obj->destroy_rwsem);
+ if (refcount_dec_and_test(&obj->shortterm_users))
+ wake_up_interruptible_all(&ictx->destroy_wait);
}
void iommufd_object_abort(struct iommufd_ctx *ictx, struct iommufd_object *obj);
struct iommufd_object *obj);
void iommufd_object_finalize(struct iommufd_ctx *ictx,
struct iommufd_object *obj);
-void __iommufd_object_destroy_user(struct iommufd_ctx *ictx,
- struct iommufd_object *obj, bool allow_fail);
+
+enum {
+ REMOVE_WAIT_SHORTTERM = 1,
+};
+int iommufd_object_remove(struct iommufd_ctx *ictx,
+ struct iommufd_object *to_destroy, u32 id,
+ unsigned int flags);
+
+/*
+ * The caller holds a users refcount and wants to destroy the object. At this
+ * point the caller has no shortterm_users reference and at least the xarray
+ * will be holding one.
+ */
static inline void iommufd_object_destroy_user(struct iommufd_ctx *ictx,
struct iommufd_object *obj)
{
- __iommufd_object_destroy_user(ictx, obj, false);
+ int ret;
+
+ ret = iommufd_object_remove(ictx, obj, obj->id, REMOVE_WAIT_SHORTTERM);
+
+ /*
+ * If there is a bug and we couldn't destroy the object then we did put
+ * back the caller's users refcount and will eventually try to free it
+ * again during close.
+ */
+ WARN_ON(ret);
}
-static inline void iommufd_object_deref_user(struct iommufd_ctx *ictx,
- struct iommufd_object *obj)
+
+/*
+ * The HWPT allocated by autodomains is used in possibly many devices and
+ * is automatically destroyed when its refcount reaches zero.
+ *
+ * If userspace uses the HWPT manually, even for a short term, then it will
+ * disrupt this refcounting and the auto-free in the kernel will not work.
+ * Userspace that tries to use the automatically allocated HWPT must be careful
+ * to ensure that it is consistently destroyed, eg by not racing accesses
+ * and by not attaching an automatic HWPT to a device manually.
+ */
+static inline void
+iommufd_object_put_and_try_destroy(struct iommufd_ctx *ictx,
+ struct iommufd_object *obj)
{
- __iommufd_object_destroy_user(ictx, obj, true);
+ iommufd_object_remove(ictx, obj, obj->id, 0);
}
struct iommufd_object *_iommufd_object_alloc(struct iommufd_ctx *ictx,
lockdep_assert_not_held(&hwpt_paging->ioas->mutex);
if (hwpt_paging->auto_domain) {
- iommufd_object_deref_user(ictx, &hwpt->obj);
+ iommufd_object_put_and_try_destroy(ictx, &hwpt->obj);
return;
}
}
size_t size,
enum iommufd_object_type type)
{
- static struct lock_class_key obj_keys[IOMMUFD_OBJ_MAX];
struct iommufd_object *obj;
int rc;
if (!obj)
return ERR_PTR(-ENOMEM);
obj->type = type;
- /*
- * In most cases the destroy_rwsem is obtained with try so it doesn't
- * interact with lockdep, however on destroy we have to sleep. This
- * means if we have to destroy an object while holding a get on another
- * object it triggers lockdep. Using one locking class per object type
- * is a simple and reasonable way to avoid this.
- */
- __init_rwsem(&obj->destroy_rwsem, "iommufd_object::destroy_rwsem",
- &obj_keys[type]);
+ /* Starts out bias'd by 1 until it is removed from the xarray */
+ refcount_set(&obj->shortterm_users, 1);
refcount_set(&obj->users, 1);
/*
return obj;
}
+static int iommufd_object_dec_wait_shortterm(struct iommufd_ctx *ictx,
+ struct iommufd_object *to_destroy)
+{
+ if (refcount_dec_and_test(&to_destroy->shortterm_users))
+ return 0;
+
+ if (wait_event_timeout(ictx->destroy_wait,
+ refcount_read(&to_destroy->shortterm_users) ==
+ 0,
+ msecs_to_jiffies(10000)))
+ return 0;
+
+ pr_crit("Time out waiting for iommufd object to become free\n");
+ refcount_inc(&to_destroy->shortterm_users);
+ return -EBUSY;
+}
+
/*
* Remove the given object id from the xarray if the only reference to the
- * object is held by the xarray. The caller must call ops destroy().
+ * object is held by the xarray.
*/
-static struct iommufd_object *iommufd_object_remove(struct iommufd_ctx *ictx,
- u32 id, bool extra_put)
+int iommufd_object_remove(struct iommufd_ctx *ictx,
+ struct iommufd_object *to_destroy, u32 id,
+ unsigned int flags)
{
struct iommufd_object *obj;
XA_STATE(xas, &ictx->objects, id);
-
- xa_lock(&ictx->objects);
- obj = xas_load(&xas);
- if (xa_is_zero(obj) || !obj) {
- obj = ERR_PTR(-ENOENT);
- goto out_xa;
- }
+ bool zerod_shortterm = false;
+ int ret;
/*
- * If the caller is holding a ref on obj we put it here under the
- * spinlock.
+ * The purpose of the shortterm_users is to ensure deterministic
+ * destruction of objects used by external drivers and destroyed by this
+ * function. Any temporary increment of the refcount must increment
+ * shortterm_users, such as during ioctl execution.
*/
- if (extra_put)
+ if (flags & REMOVE_WAIT_SHORTTERM) {
+ ret = iommufd_object_dec_wait_shortterm(ictx, to_destroy);
+ if (ret) {
+ /*
+ * We have a bug. Put back the callers reference and
+ * defer cleaning this object until close.
+ */
+ refcount_dec(&to_destroy->users);
+ return ret;
+ }
+ zerod_shortterm = true;
+ }
+
+ xa_lock(&ictx->objects);
+ obj = xas_load(&xas);
+ if (to_destroy) {
+ /*
+ * If the caller is holding a ref on obj we put it here under
+ * the spinlock.
+ */
refcount_dec(&obj->users);
+ if (WARN_ON(obj != to_destroy)) {
+ ret = -ENOENT;
+ goto err_xa;
+ }
+ } else if (xa_is_zero(obj) || !obj) {
+ ret = -ENOENT;
+ goto err_xa;
+ }
+
if (!refcount_dec_if_one(&obj->users)) {
- obj = ERR_PTR(-EBUSY);
- goto out_xa;
+ ret = -EBUSY;
+ goto err_xa;
}
xas_store(&xas, NULL);
if (ictx->vfio_ioas == container_of(obj, struct iommufd_ioas, obj))
ictx->vfio_ioas = NULL;
-
-out_xa:
xa_unlock(&ictx->objects);
- /* The returned object reference count is zero */
- return obj;
-}
-
-/*
- * The caller holds a users refcount and wants to destroy the object. Returns
- * true if the object was destroyed. In all cases the caller no longer has a
- * reference on obj.
- */
-void __iommufd_object_destroy_user(struct iommufd_ctx *ictx,
- struct iommufd_object *obj, bool allow_fail)
-{
- struct iommufd_object *ret;
-
/*
- * The purpose of the destroy_rwsem is to ensure deterministic
- * destruction of objects used by external drivers and destroyed by this
- * function. Any temporary increment of the refcount must hold the read
- * side of this, such as during ioctl execution.
- */
- down_write(&obj->destroy_rwsem);
- ret = iommufd_object_remove(ictx, obj->id, true);
- up_write(&obj->destroy_rwsem);
-
- if (allow_fail && IS_ERR(ret))
- return;
-
- /*
- * If there is a bug and we couldn't destroy the object then we did put
- * back the caller's refcount and will eventually try to free it again
- * during close.
+ * Since users is zero any positive users_shortterm must be racing
+ * iommufd_put_object(), or we have a bug.
*/
- if (WARN_ON(IS_ERR(ret)))
- return;
+ if (!zerod_shortterm) {
+ ret = iommufd_object_dec_wait_shortterm(ictx, obj);
+ if (WARN_ON(ret))
+ return ret;
+ }
iommufd_object_ops[obj->type].destroy(obj);
kfree(obj);
+ return 0;
+
+err_xa:
+ if (zerod_shortterm) {
+ /* Restore the xarray owned reference */
+ refcount_set(&obj->shortterm_users, 1);
+ }
+ xa_unlock(&ictx->objects);
+
+ /* The returned object reference count is zero */
+ return ret;
}
static int iommufd_destroy(struct iommufd_ucmd *ucmd)
{
struct iommu_destroy *cmd = ucmd->cmd;
- struct iommufd_object *obj;
- obj = iommufd_object_remove(ucmd->ictx, cmd->id, false);
- if (IS_ERR(obj))
- return PTR_ERR(obj);
- iommufd_object_ops[obj->type].destroy(obj);
- kfree(obj);
- return 0;
+ return iommufd_object_remove(ucmd->ictx, NULL, cmd->id, 0);
}
static int iommufd_fops_open(struct inode *inode, struct file *filp)
xa_init_flags(&ictx->objects, XA_FLAGS_ALLOC1 | XA_FLAGS_ACCOUNT);
xa_init(&ictx->groups);
ictx->file = filp;
+ init_waitqueue_head(&ictx->destroy_wait);
filp->private_data = ictx;
return 0;
}
if (IS_ERR(ioas))
return;
*iova = iommufd_test_syz_conv_iova(&ioas->iopt, iova);
- iommufd_put_object(&ioas->obj);
+ iommufd_put_object(ucmd->ictx, &ioas->obj);
}
struct mock_iommu_domain {
return hwpt;
if (hwpt->domain->type != IOMMU_DOMAIN_UNMANAGED ||
hwpt->domain->ops != mock_ops.default_domain_ops) {
- iommufd_put_object(&hwpt->obj);
+ iommufd_put_object(ucmd->ictx, &hwpt->obj);
return ERR_PTR(-EINVAL);
}
*mock = container_of(hwpt->domain, struct mock_iommu_domain, domain);
return hwpt;
if (hwpt->domain->type != IOMMU_DOMAIN_NESTED ||
hwpt->domain->ops != &domain_nested_ops) {
- iommufd_put_object(&hwpt->obj);
+ iommufd_put_object(ucmd->ictx, &hwpt->obj);
return ERR_PTR(-EINVAL);
}
*mock_nested = container_of(hwpt->domain,
rc = iommufd_ucmd_respond(ucmd, sizeof(*cmd));
out_dev_obj:
- iommufd_put_object(dev_obj);
+ iommufd_put_object(ucmd->ictx, dev_obj);
return rc;
}
down_write(&ioas->iopt.iova_rwsem);
rc = iopt_reserve_iova(&ioas->iopt, start, start + length - 1, NULL);
up_write(&ioas->iopt.iova_rwsem);
- iommufd_put_object(&ioas->obj);
+ iommufd_put_object(ucmd->ictx, &ioas->obj);
return rc;
}
rc = 0;
out_put:
- iommufd_put_object(&hwpt->obj);
+ iommufd_put_object(ucmd->ictx, &hwpt->obj);
return rc;
}
out_free:
kvfree(tmp);
out_put:
- iommufd_put_object(&hwpt->obj);
+ iommufd_put_object(ucmd->ictx, &hwpt->obj);
return rc;
}
if (IS_ERR(ioas))
return PTR_ERR(ioas);
*out_ioas_id = ioas->obj.id;
- iommufd_put_object(&ioas->obj);
+ iommufd_put_object(ictx, &ioas->obj);
return 0;
}
EXPORT_SYMBOL_NS_GPL(iommufd_vfio_compat_ioas_get_id, IOMMUFD_VFIO);
if (ictx->vfio_ioas && iommufd_lock_obj(&ictx->vfio_ioas->obj)) {
ret = 0;
- iommufd_put_object(&ictx->vfio_ioas->obj);
+ iommufd_put_object(ictx, &ictx->vfio_ioas->obj);
goto out_abort;
}
ictx->vfio_ioas = ioas;
if (IS_ERR(ioas))
return PTR_ERR(ioas);
cmd->ioas_id = ioas->obj.id;
- iommufd_put_object(&ioas->obj);
+ iommufd_put_object(ucmd->ictx, &ioas->obj);
return iommufd_ucmd_respond(ucmd, sizeof(*cmd));
case IOMMU_VFIO_IOAS_SET:
xa_lock(&ucmd->ictx->objects);
ucmd->ictx->vfio_ioas = ioas;
xa_unlock(&ucmd->ictx->objects);
- iommufd_put_object(&ioas->obj);
+ iommufd_put_object(ucmd->ictx, &ioas->obj);
return 0;
case IOMMU_VFIO_IOAS_CLEAR:
iova = map.iova;
rc = iopt_map_user_pages(ictx, &ioas->iopt, &iova, u64_to_user_ptr(map.vaddr),
map.size, iommu_prot, 0);
- iommufd_put_object(&ioas->obj);
+ iommufd_put_object(ictx, &ioas->obj);
return rc;
}
rc = -EFAULT;
err_put:
- iommufd_put_object(&ioas->obj);
+ iommufd_put_object(ictx, &ioas->obj);
return rc;
}
}
mutex_unlock(&ioas->mutex);
- iommufd_put_object(&ioas->obj);
+ iommufd_put_object(ictx, &ioas->obj);
return rc;
}
*/
if (type == VFIO_TYPE1_IOMMU)
rc = iopt_disable_large_pages(&ioas->iopt);
- iommufd_put_object(&ioas->obj);
+ iommufd_put_object(ictx, &ioas->obj);
return rc;
}
out_put:
up_read(&ioas->iopt.iova_rwsem);
- iommufd_put_object(&ioas->obj);
+ iommufd_put_object(ictx, &ioas->obj);
return rc;
}
const u32 *id)
{
const struct iommu_ops *ops = NULL;
- struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
+ struct iommu_fwspec *fwspec;
int err = NO_IOMMU;
if (!master_np)
return NULL;
+ /* Serialise to make dev->iommu stable under our potential fwspec */
+ mutex_lock(&iommu_probe_device_lock);
+ fwspec = dev_iommu_fwspec_get(dev);
if (fwspec) {
- if (fwspec->ops)
+ if (fwspec->ops) {
+ mutex_unlock(&iommu_probe_device_lock);
return fwspec->ops;
-
+ }
/* In the deferred case, start again from scratch */
iommu_fwspec_free(dev);
}
fwspec = dev_iommu_fwspec_get(dev);
ops = fwspec->ops;
}
+ mutex_unlock(&iommu_probe_device_lock);
+
/*
* If we have reason to believe the IOMMU driver missed the initial
* probe for dev, replay it to get things in order.
if (start == phys->start && end == phys->end)
return IOMMU_RESV_DIRECT;
- dev_warn(dev, "treating non-direct mapping [%pr] -> [%pap-%pap] as reservation\n", &phys,
+ dev_warn(dev, "treating non-direct mapping [%pr] -> [%pap-%pap] as reservation\n", phys,
&start, &end);
return IOMMU_RESV_RESERVED;
}
break;
}
+ if (!shr)
+ gic_flush_dcache_to_poc(base, PAGE_ORDER_TO_SIZE(order));
+
its_write_baser(its, baser, val);
tmp = baser->val;
- if (its->flags & ITS_FLAGS_FORCE_NON_SHAREABLE)
- tmp &= ~GITS_BASER_SHAREABILITY_MASK;
-
if ((val ^ tmp) & GITS_BASER_SHAREABILITY_MASK) {
/*
* Shareability didn't stick. Just use
* non-cacheable as well.
*/
shr = tmp & GITS_BASER_SHAREABILITY_MASK;
- if (!shr) {
+ if (!shr)
cache = GITS_BASER_nC;
- gic_flush_dcache_to_poc(base, PAGE_ORDER_TO_SIZE(order));
- }
+
goto retry_baser;
}
/* erratum 24313: ignore memory access type */
cache = GITS_BASER_nCnB;
+ if (its->flags & ITS_FLAGS_FORCE_NON_SHAREABLE) {
+ cache = GITS_BASER_nC;
+ shr = 0;
+ }
+
for (i = 0; i < GITS_BASER_NR_REGS; i++) {
struct its_baser *baser = its->tables + i;
u64 val = its_read_baser(its, baser);
}
static DEVICE_ATTR_RO(max_brightness);
-static ssize_t color_show(struct device *dev,
- struct device_attribute *attr, char *buf)
-{
- const char *color_text = "invalid";
- struct led_classdev *led_cdev = dev_get_drvdata(dev);
-
- if (led_cdev->color < LED_COLOR_ID_MAX)
- color_text = led_colors[led_cdev->color];
-
- return sysfs_emit(buf, "%s\n", color_text);
-}
-static DEVICE_ATTR_RO(color);
-
#ifdef CONFIG_LEDS_TRIGGERS
static BIN_ATTR(trigger, 0644, led_trigger_read, led_trigger_write, 0);
static struct bin_attribute *led_trigger_bin_attrs[] = {
static struct attribute *led_class_attrs[] = {
&dev_attr_brightness.attr,
&dev_attr_max_brightness.attr,
- &dev_attr_color.attr,
NULL,
};
cancel_delayed_work_sync(&trigger_data->work);
+ /*
+ * Take RTNL lock before trigger_data lock to prevent potential
+ * deadlock with netdev notifier registration.
+ */
+ rtnl_lock();
mutex_lock(&trigger_data->lock);
if (trigger_data->net_dev) {
trigger_data->carrier_link_up = false;
trigger_data->link_speed = SPEED_UNKNOWN;
trigger_data->duplex = DUPLEX_UNKNOWN;
- if (trigger_data->net_dev != NULL) {
- rtnl_lock();
+ if (trigger_data->net_dev)
get_device_state(trigger_data);
- rtnl_unlock();
- }
trigger_data->last_activity = 0;
set_baseline_state(trigger_data);
mutex_unlock(&trigger_data->lock);
+ rtnl_unlock();
return 0;
}
config DM_AUDIT
bool "DM audit events"
+ depends on BLK_DEV_DM
depends on AUDIT
help
Generate audit events for device-mapper.
#define BCACHE_DEV_WB_RUNNING 3
#define BCACHE_DEV_RATE_DW_RUNNING 4
int nr_stripes;
+#define BCH_MIN_STRIPE_SZ ((4 << 20) >> SECTOR_SHIFT)
unsigned int stripe_size;
atomic_t *stripe_sectors_dirty;
unsigned long *full_dirty_stripes;
w->journal = NULL;
}
-static void btree_node_write_unlock(struct closure *cl)
+static CLOSURE_CALLBACK(btree_node_write_unlock)
{
- struct btree *b = container_of(cl, struct btree, io);
+ closure_type(b, struct btree, io);
up(&b->io_mutex);
}
-static void __btree_node_write_done(struct closure *cl)
+static CLOSURE_CALLBACK(__btree_node_write_done)
{
- struct btree *b = container_of(cl, struct btree, io);
+ closure_type(b, struct btree, io);
struct btree_write *w = btree_prev_write(b);
bch_bbio_free(b->bio, b->c);
closure_return_with_destructor(cl, btree_node_write_unlock);
}
-static void btree_node_write_done(struct closure *cl)
+static CLOSURE_CALLBACK(btree_node_write_done)
{
- struct btree *b = container_of(cl, struct btree, io);
+ closure_type(b, struct btree, io);
bio_free_pages(b->bio);
- __btree_node_write_done(cl);
+ __btree_node_write_done(&cl->work);
}
static void btree_node_write_endio(struct bio *bio)
*
* The btree node will have either a read or a write lock held, depending on
* level and op->lock.
+ *
+ * Note: Only error code or btree pointer will be returned, it is unncessary
+ * for callers to check NULL pointer.
*/
struct btree *bch_btree_node_get(struct cache_set *c, struct btree_op *op,
struct bkey *k, int level, bool write,
mutex_unlock(&b->c->bucket_lock);
}
+/*
+ * Only error code or btree pointer will be returned, it is unncessary for
+ * callers to check NULL pointer.
+ */
struct btree *__bch_btree_node_alloc(struct cache_set *c, struct btree_op *op,
int level, bool wait,
struct btree *parent)
memset(new_nodes, 0, sizeof(new_nodes));
closure_init_stack(&cl);
- while (nodes < GC_MERGE_NODES && !IS_ERR(r[nodes].b))
+ while (nodes < GC_MERGE_NODES && !IS_ERR_OR_NULL(r[nodes].b))
keys += r[nodes++].keys;
blocks = btree_default_blocks(b->c) * 2 / 3;
bch_keylist_free(&keylist);
for (i = 0; i < nodes; i++)
- if (!IS_ERR(new_nodes[i])) {
+ if (!IS_ERR_OR_NULL(new_nodes[i])) {
btree_node_free(new_nodes[i]);
rw_unlock(true, new_nodes[i]);
}
return 0;
n = btree_node_alloc_replacement(replace, NULL);
+ if (IS_ERR(n))
+ return 0;
/* recheck reserve after allocating replacement node */
if (btree_check_reserve(b, NULL)) {
closure_put(&w->c->journal.io);
}
-static void journal_write(struct closure *cl);
+static CLOSURE_CALLBACK(journal_write);
-static void journal_write_done(struct closure *cl)
+static CLOSURE_CALLBACK(journal_write_done)
{
- struct journal *j = container_of(cl, struct journal, io);
+ closure_type(j, struct journal, io);
struct journal_write *w = (j->cur == j->w)
? &j->w[1]
: &j->w[0];
continue_at_nobarrier(cl, journal_write, bch_journal_wq);
}
-static void journal_write_unlock(struct closure *cl)
+static CLOSURE_CALLBACK(journal_write_unlock)
__releases(&c->journal.lock)
{
- struct cache_set *c = container_of(cl, struct cache_set, journal.io);
+ closure_type(c, struct cache_set, journal.io);
c->journal.io_in_flight = 0;
spin_unlock(&c->journal.lock);
}
-static void journal_write_unlocked(struct closure *cl)
+static CLOSURE_CALLBACK(journal_write_unlocked)
__releases(c->journal.lock)
{
- struct cache_set *c = container_of(cl, struct cache_set, journal.io);
+ closure_type(c, struct cache_set, journal.io);
struct cache *ca = c->cache;
struct journal_write *w = c->journal.cur;
struct bkey *k = &c->journal.key;
continue_at(cl, journal_write_done, NULL);
}
-static void journal_write(struct closure *cl)
+static CLOSURE_CALLBACK(journal_write)
{
- struct cache_set *c = container_of(cl, struct cache_set, journal.io);
+ closure_type(c, struct cache_set, journal.io);
spin_lock(&c->journal.lock);
- journal_write_unlocked(cl);
+ journal_write_unlocked(&cl->work);
}
static void journal_try_write(struct cache_set *c)
/* Moving GC - IO loop */
-static void moving_io_destructor(struct closure *cl)
+static CLOSURE_CALLBACK(moving_io_destructor)
{
- struct moving_io *io = container_of(cl, struct moving_io, cl);
+ closure_type(io, struct moving_io, cl);
kfree(io);
}
-static void write_moving_finish(struct closure *cl)
+static CLOSURE_CALLBACK(write_moving_finish)
{
- struct moving_io *io = container_of(cl, struct moving_io, cl);
+ closure_type(io, struct moving_io, cl);
struct bio *bio = &io->bio.bio;
bio_free_pages(bio);
bch_bio_map(bio, NULL);
}
-static void write_moving(struct closure *cl)
+static CLOSURE_CALLBACK(write_moving)
{
- struct moving_io *io = container_of(cl, struct moving_io, cl);
+ closure_type(io, struct moving_io, cl);
struct data_insert_op *op = &io->op;
if (!op->status) {
continue_at(cl, write_moving_finish, op->wq);
}
-static void read_moving_submit(struct closure *cl)
+static CLOSURE_CALLBACK(read_moving_submit)
{
- struct moving_io *io = container_of(cl, struct moving_io, cl);
+ closure_type(io, struct moving_io, cl);
struct bio *bio = &io->bio.bio;
bch_submit_bbio(bio, io->op.c, &io->w->key, 0);
struct kmem_cache *bch_search_cache;
-static void bch_data_insert_start(struct closure *cl);
+static CLOSURE_CALLBACK(bch_data_insert_start);
static unsigned int cache_mode(struct cached_dev *dc)
{
/* Insert data into cache */
-static void bch_data_insert_keys(struct closure *cl)
+static CLOSURE_CALLBACK(bch_data_insert_keys)
{
- struct data_insert_op *op = container_of(cl, struct data_insert_op, cl);
+ closure_type(op, struct data_insert_op, cl);
atomic_t *journal_ref = NULL;
struct bkey *replace_key = op->replace ? &op->replace_key : NULL;
int ret;
continue_at(cl, bch_data_insert_keys, op->wq);
}
-static void bch_data_insert_error(struct closure *cl)
+static CLOSURE_CALLBACK(bch_data_insert_error)
{
- struct data_insert_op *op = container_of(cl, struct data_insert_op, cl);
+ closure_type(op, struct data_insert_op, cl);
/*
* Our data write just errored, which means we've got a bunch of keys to
op->insert_keys.top = dst;
- bch_data_insert_keys(cl);
+ bch_data_insert_keys(&cl->work);
}
static void bch_data_insert_endio(struct bio *bio)
bch_bbio_endio(op->c, bio, bio->bi_status, "writing data to cache");
}
-static void bch_data_insert_start(struct closure *cl)
+static CLOSURE_CALLBACK(bch_data_insert_start)
{
- struct data_insert_op *op = container_of(cl, struct data_insert_op, cl);
+ closure_type(op, struct data_insert_op, cl);
struct bio *bio = op->bio, *n;
if (op->bypass)
* If op->bypass is true, instead of inserting the data it invalidates the
* region of the cache represented by op->bio and op->inode.
*/
-void bch_data_insert(struct closure *cl)
+CLOSURE_CALLBACK(bch_data_insert)
{
- struct data_insert_op *op = container_of(cl, struct data_insert_op, cl);
+ closure_type(op, struct data_insert_op, cl);
trace_bcache_write(op->c, op->inode, op->bio,
op->writeback, op->bypass);
bch_keylist_init(&op->insert_keys);
bio_get(op->bio);
- bch_data_insert_start(cl);
+ bch_data_insert_start(&cl->work);
}
/*
return n == bio ? MAP_DONE : MAP_CONTINUE;
}
-static void cache_lookup(struct closure *cl)
+static CLOSURE_CALLBACK(cache_lookup)
{
- struct search *s = container_of(cl, struct search, iop.cl);
+ closure_type(s, struct search, iop.cl);
struct bio *bio = &s->bio.bio;
struct cached_dev *dc;
int ret;
bio_cnt_set(bio, 3);
}
-static void search_free(struct closure *cl)
+static CLOSURE_CALLBACK(search_free)
{
- struct search *s = container_of(cl, struct search, cl);
+ closure_type(s, struct search, cl);
atomic_dec(&s->iop.c->search_inflight);
/* Cached devices */
-static void cached_dev_bio_complete(struct closure *cl)
+static CLOSURE_CALLBACK(cached_dev_bio_complete)
{
- struct search *s = container_of(cl, struct search, cl);
+ closure_type(s, struct search, cl);
struct cached_dev *dc = container_of(s->d, struct cached_dev, disk);
cached_dev_put(dc);
- search_free(cl);
+ search_free(&cl->work);
}
/* Process reads */
-static void cached_dev_read_error_done(struct closure *cl)
+static CLOSURE_CALLBACK(cached_dev_read_error_done)
{
- struct search *s = container_of(cl, struct search, cl);
+ closure_type(s, struct search, cl);
if (s->iop.replace_collision)
bch_mark_cache_miss_collision(s->iop.c, s->d);
if (s->iop.bio)
bio_free_pages(s->iop.bio);
- cached_dev_bio_complete(cl);
+ cached_dev_bio_complete(&cl->work);
}
-static void cached_dev_read_error(struct closure *cl)
+static CLOSURE_CALLBACK(cached_dev_read_error)
{
- struct search *s = container_of(cl, struct search, cl);
+ closure_type(s, struct search, cl);
struct bio *bio = &s->bio.bio;
/*
continue_at(cl, cached_dev_read_error_done, NULL);
}
-static void cached_dev_cache_miss_done(struct closure *cl)
+static CLOSURE_CALLBACK(cached_dev_cache_miss_done)
{
- struct search *s = container_of(cl, struct search, cl);
+ closure_type(s, struct search, cl);
struct bcache_device *d = s->d;
if (s->iop.replace_collision)
if (s->iop.bio)
bio_free_pages(s->iop.bio);
- cached_dev_bio_complete(cl);
+ cached_dev_bio_complete(&cl->work);
closure_put(&d->cl);
}
-static void cached_dev_read_done(struct closure *cl)
+static CLOSURE_CALLBACK(cached_dev_read_done)
{
- struct search *s = container_of(cl, struct search, cl);
+ closure_type(s, struct search, cl);
struct cached_dev *dc = container_of(s->d, struct cached_dev, disk);
/*
continue_at(cl, cached_dev_cache_miss_done, NULL);
}
-static void cached_dev_read_done_bh(struct closure *cl)
+static CLOSURE_CALLBACK(cached_dev_read_done_bh)
{
- struct search *s = container_of(cl, struct search, cl);
+ closure_type(s, struct search, cl);
struct cached_dev *dc = container_of(s->d, struct cached_dev, disk);
bch_mark_cache_accounting(s->iop.c, s->d,
/* Process writes */
-static void cached_dev_write_complete(struct closure *cl)
+static CLOSURE_CALLBACK(cached_dev_write_complete)
{
- struct search *s = container_of(cl, struct search, cl);
+ closure_type(s, struct search, cl);
struct cached_dev *dc = container_of(s->d, struct cached_dev, disk);
up_read_non_owner(&dc->writeback_lock);
- cached_dev_bio_complete(cl);
+ cached_dev_bio_complete(&cl->work);
}
static void cached_dev_write(struct cached_dev *dc, struct search *s)
continue_at(cl, cached_dev_write_complete, NULL);
}
-static void cached_dev_nodata(struct closure *cl)
+static CLOSURE_CALLBACK(cached_dev_nodata)
{
- struct search *s = container_of(cl, struct search, cl);
+ closure_type(s, struct search, cl);
struct bio *bio = &s->bio.bio;
if (s->iop.flush_journal)
return MAP_CONTINUE;
}
-static void flash_dev_nodata(struct closure *cl)
+static CLOSURE_CALLBACK(flash_dev_nodata)
{
- struct search *s = container_of(cl, struct search, cl);
+ closure_type(s, struct search, cl);
if (s->iop.flush_journal)
bch_journal_meta(s->iop.c, cl);
};
unsigned int bch_get_congested(const struct cache_set *c);
-void bch_data_insert(struct closure *cl);
+CLOSURE_CALLBACK(bch_data_insert);
void bch_cached_dev_request_init(struct cached_dev *dc);
void cached_dev_submit_bio(struct bio *bio);
submit_bio(bio);
}
-static void bch_write_bdev_super_unlock(struct closure *cl)
+static CLOSURE_CALLBACK(bch_write_bdev_super_unlock)
{
- struct cached_dev *dc = container_of(cl, struct cached_dev, sb_write);
+ closure_type(dc, struct cached_dev, sb_write);
up(&dc->sb_write_mutex);
}
closure_put(&ca->set->sb_write);
}
-static void bcache_write_super_unlock(struct closure *cl)
+static CLOSURE_CALLBACK(bcache_write_super_unlock)
{
- struct cache_set *c = container_of(cl, struct cache_set, sb_write);
+ closure_type(c, struct cache_set, sb_write);
up(&c->sb_write_mutex);
}
closure_put(cl);
}
-static void uuid_io_unlock(struct closure *cl)
+static CLOSURE_CALLBACK(uuid_io_unlock)
{
- struct cache_set *c = container_of(cl, struct cache_set, uuid_write);
+ closure_type(c, struct cache_set, uuid_write);
up(&c->uuid_write_mutex);
}
if (!d->stripe_size)
d->stripe_size = 1 << 31;
+ else if (d->stripe_size < BCH_MIN_STRIPE_SZ)
+ d->stripe_size = roundup(BCH_MIN_STRIPE_SZ, d->stripe_size);
n = DIV_ROUND_UP_ULL(sectors, d->stripe_size);
if (!n || n > max_stripes) {
module_put(THIS_MODULE);
}
-static void cached_dev_free(struct closure *cl)
+static CLOSURE_CALLBACK(cached_dev_free)
{
- struct cached_dev *dc = container_of(cl, struct cached_dev, disk.cl);
+ closure_type(dc, struct cached_dev, disk.cl);
if (test_and_clear_bit(BCACHE_DEV_WB_RUNNING, &dc->disk.flags))
cancel_writeback_rate_update_dwork(dc);
kobject_put(&dc->disk.kobj);
}
-static void cached_dev_flush(struct closure *cl)
+static CLOSURE_CALLBACK(cached_dev_flush)
{
- struct cached_dev *dc = container_of(cl, struct cached_dev, disk.cl);
+ closure_type(dc, struct cached_dev, disk.cl);
struct bcache_device *d = &dc->disk;
mutex_lock(&bch_register_lock);
kfree(d);
}
-static void flash_dev_free(struct closure *cl)
+static CLOSURE_CALLBACK(flash_dev_free)
{
- struct bcache_device *d = container_of(cl, struct bcache_device, cl);
+ closure_type(d, struct bcache_device, cl);
mutex_lock(&bch_register_lock);
atomic_long_sub(bcache_dev_sectors_dirty(d),
kobject_put(&d->kobj);
}
-static void flash_dev_flush(struct closure *cl)
+static CLOSURE_CALLBACK(flash_dev_flush)
{
- struct bcache_device *d = container_of(cl, struct bcache_device, cl);
+ closure_type(d, struct bcache_device, cl);
mutex_lock(&bch_register_lock);
bcache_device_unlink(d);
module_put(THIS_MODULE);
}
-static void cache_set_free(struct closure *cl)
+static CLOSURE_CALLBACK(cache_set_free)
{
- struct cache_set *c = container_of(cl, struct cache_set, cl);
+ closure_type(c, struct cache_set, cl);
struct cache *ca;
debugfs_remove(c->debug);
kobject_put(&c->kobj);
}
-static void cache_set_flush(struct closure *cl)
+static CLOSURE_CALLBACK(cache_set_flush)
{
- struct cache_set *c = container_of(cl, struct cache_set, caching);
+ closure_type(c, struct cache_set, caching);
struct cache *ca = c->cache;
struct btree *b;
}
}
-static void __cache_set_unregister(struct closure *cl)
+static CLOSURE_CALLBACK(__cache_set_unregister)
{
- struct cache_set *c = container_of(cl, struct cache_set, caching);
+ closure_type(c, struct cache_set, caching);
struct cached_dev *dc;
struct bcache_device *d;
size_t i;
c->root = bch_btree_node_get(c, NULL, k,
j->btree_level,
true, NULL);
- if (IS_ERR_OR_NULL(c->root))
+ if (IS_ERR(c->root))
goto err;
list_del_init(&c->root->list);
sum += INITIAL_PRIO - cached[i];
if (n)
- do_div(sum, n);
+ sum = div64_u64(sum, n);
for (i = 0; i < ARRAY_SIZE(q); i++)
q[i] = INITIAL_PRIO - cached[n * (i + 1) /
bch_bio_map(bio, NULL);
}
-static void dirty_io_destructor(struct closure *cl)
+static CLOSURE_CALLBACK(dirty_io_destructor)
{
- struct dirty_io *io = container_of(cl, struct dirty_io, cl);
+ closure_type(io, struct dirty_io, cl);
kfree(io);
}
-static void write_dirty_finish(struct closure *cl)
+static CLOSURE_CALLBACK(write_dirty_finish)
{
- struct dirty_io *io = container_of(cl, struct dirty_io, cl);
+ closure_type(io, struct dirty_io, cl);
struct keybuf_key *w = io->bio.bi_private;
struct cached_dev *dc = io->dc;
closure_put(&io->cl);
}
-static void write_dirty(struct closure *cl)
+static CLOSURE_CALLBACK(write_dirty)
{
- struct dirty_io *io = container_of(cl, struct dirty_io, cl);
+ closure_type(io, struct dirty_io, cl);
struct keybuf_key *w = io->bio.bi_private;
struct cached_dev *dc = io->dc;
dirty_endio(bio);
}
-static void read_dirty_submit(struct closure *cl)
+static CLOSURE_CALLBACK(read_dirty_submit)
{
- struct dirty_io *io = container_of(cl, struct dirty_io, cl);
+ closure_type(io, struct dirty_io, cl);
closure_bio_submit(io->dc->disk.c, &io->bio, cl);
int cur_idx, prev_idx, skip_nr;
k = p = NULL;
- cur_idx = prev_idx = 0;
+ prev_idx = 0;
bch_btree_iter_init(&c->root->keys, &iter, NULL);
k = bch_btree_iter_next_filter(&iter, &c->root->keys, bch_ptr_bad);
void bch_sectors_dirty_init(struct bcache_device *d)
{
int i;
+ struct btree *b = NULL;
struct bkey *k = NULL;
struct btree_iter iter;
struct sectors_dirty_init op;
struct cache_set *c = d->c;
struct bch_dirty_init_state state;
+retry_lock:
+ b = c->root;
+ rw_lock(0, b, b->level);
+ if (b != c->root) {
+ rw_unlock(0, b);
+ goto retry_lock;
+ }
+
/* Just count root keys if no leaf node */
- rw_lock(0, c->root, c->root->level);
if (c->root->level == 0) {
bch_btree_op_init(&op.op, -1);
op.inode = d->id;
op.count = 0;
for_each_key_filter(&c->root->keys,
- k, &iter, bch_ptr_invalid)
+ k, &iter, bch_ptr_invalid) {
+ if (KEY_INODE(k) != op.inode)
+ continue;
sectors_dirty_init_fn(&op.op, c->root, k);
+ }
- rw_unlock(0, c->root);
+ rw_unlock(0, b);
return;
}
if (atomic_read(&state.enough))
break;
+ atomic_inc(&state.started);
state.infos[i].state = &state;
state.infos[i].thread =
kthread_run(bch_dirty_init_thread, &state.infos[i],
"bch_dirtcnt[%d]", i);
if (IS_ERR(state.infos[i].thread)) {
pr_err("fails to run thread bch_dirty_init[%d]\n", i);
+ atomic_dec(&state.started);
for (--i; i >= 0; i--)
kthread_stop(state.infos[i].thread);
goto out;
}
- atomic_inc(&state.started);
}
out:
/* Must wait for all threads to stop. */
wait_event(state.wait, atomic_read(&state.started) == 0);
- rw_unlock(0, c->root);
+ rw_unlock(0, b);
}
void bch_cached_dev_writeback_init(struct cached_dev *dc)
typedef enum evict_result (*le_predicate)(struct lru_entry *le, void *context);
-static struct lru_entry *lru_evict(struct lru *lru, le_predicate pred, void *context)
+static struct lru_entry *lru_evict(struct lru *lru, le_predicate pred, void *context, bool no_sleep)
{
unsigned long tested = 0;
struct list_head *h = lru->cursor;
h = h->next;
- cond_resched();
+ if (!no_sleep)
+ cond_resched();
}
return NULL;
*/
struct buffer_tree {
- struct rw_semaphore lock;
+ union {
+ struct rw_semaphore lock;
+ rwlock_t spinlock;
+ } u;
struct rb_root root;
} ____cacheline_aligned_in_smp;
* on the locks.
*/
unsigned int num_locks;
+ bool no_sleep;
struct buffer_tree trees[];
};
+static DEFINE_STATIC_KEY_FALSE(no_sleep_enabled);
+
static inline unsigned int cache_index(sector_t block, unsigned int num_locks)
{
return dm_hash_locks_index(block, num_locks);
static inline void cache_read_lock(struct dm_buffer_cache *bc, sector_t block)
{
- down_read(&bc->trees[cache_index(block, bc->num_locks)].lock);
+ if (static_branch_unlikely(&no_sleep_enabled) && bc->no_sleep)
+ read_lock_bh(&bc->trees[cache_index(block, bc->num_locks)].u.spinlock);
+ else
+ down_read(&bc->trees[cache_index(block, bc->num_locks)].u.lock);
}
static inline void cache_read_unlock(struct dm_buffer_cache *bc, sector_t block)
{
- up_read(&bc->trees[cache_index(block, bc->num_locks)].lock);
+ if (static_branch_unlikely(&no_sleep_enabled) && bc->no_sleep)
+ read_unlock_bh(&bc->trees[cache_index(block, bc->num_locks)].u.spinlock);
+ else
+ up_read(&bc->trees[cache_index(block, bc->num_locks)].u.lock);
}
static inline void cache_write_lock(struct dm_buffer_cache *bc, sector_t block)
{
- down_write(&bc->trees[cache_index(block, bc->num_locks)].lock);
+ if (static_branch_unlikely(&no_sleep_enabled) && bc->no_sleep)
+ write_lock_bh(&bc->trees[cache_index(block, bc->num_locks)].u.spinlock);
+ else
+ down_write(&bc->trees[cache_index(block, bc->num_locks)].u.lock);
}
static inline void cache_write_unlock(struct dm_buffer_cache *bc, sector_t block)
{
- up_write(&bc->trees[cache_index(block, bc->num_locks)].lock);
+ if (static_branch_unlikely(&no_sleep_enabled) && bc->no_sleep)
+ write_unlock_bh(&bc->trees[cache_index(block, bc->num_locks)].u.spinlock);
+ else
+ up_write(&bc->trees[cache_index(block, bc->num_locks)].u.lock);
}
/*
static void __lh_lock(struct lock_history *lh, unsigned int index)
{
- if (lh->write)
- down_write(&lh->cache->trees[index].lock);
- else
- down_read(&lh->cache->trees[index].lock);
+ if (lh->write) {
+ if (static_branch_unlikely(&no_sleep_enabled) && lh->cache->no_sleep)
+ write_lock_bh(&lh->cache->trees[index].u.spinlock);
+ else
+ down_write(&lh->cache->trees[index].u.lock);
+ } else {
+ if (static_branch_unlikely(&no_sleep_enabled) && lh->cache->no_sleep)
+ read_lock_bh(&lh->cache->trees[index].u.spinlock);
+ else
+ down_read(&lh->cache->trees[index].u.lock);
+ }
}
static void __lh_unlock(struct lock_history *lh, unsigned int index)
{
- if (lh->write)
- up_write(&lh->cache->trees[index].lock);
- else
- up_read(&lh->cache->trees[index].lock);
+ if (lh->write) {
+ if (static_branch_unlikely(&no_sleep_enabled) && lh->cache->no_sleep)
+ write_unlock_bh(&lh->cache->trees[index].u.spinlock);
+ else
+ up_write(&lh->cache->trees[index].u.lock);
+ } else {
+ if (static_branch_unlikely(&no_sleep_enabled) && lh->cache->no_sleep)
+ read_unlock_bh(&lh->cache->trees[index].u.spinlock);
+ else
+ up_read(&lh->cache->trees[index].u.lock);
+ }
}
/*
return le_to_buffer(le);
}
-static void cache_init(struct dm_buffer_cache *bc, unsigned int num_locks)
+static void cache_init(struct dm_buffer_cache *bc, unsigned int num_locks, bool no_sleep)
{
unsigned int i;
bc->num_locks = num_locks;
+ bc->no_sleep = no_sleep;
for (i = 0; i < bc->num_locks; i++) {
- init_rwsem(&bc->trees[i].lock);
+ if (no_sleep)
+ rwlock_init(&bc->trees[i].u.spinlock);
+ else
+ init_rwsem(&bc->trees[i].u.lock);
bc->trees[i].root = RB_ROOT;
}
struct lru_entry *le;
struct dm_buffer *b;
- le = lru_evict(&bc->lru[list_mode], __evict_pred, &w);
+ le = lru_evict(&bc->lru[list_mode], __evict_pred, &w, bc->no_sleep);
if (!le)
return NULL;
struct evict_wrapper w = {.lh = lh, .pred = pred, .context = context};
while (true) {
- le = lru_evict(&bc->lru[old_mode], __evict_pred, &w);
+ le = lru_evict(&bc->lru[old_mode], __evict_pred, &w, bc->no_sleep);
if (!le)
break;
{
unsigned int i;
+ BUG_ON(bc->no_sleep);
for (i = 0; i < bc->num_locks; i++) {
- down_write(&bc->trees[i].lock);
+ down_write(&bc->trees[i].u.lock);
__remove_range(bc, &bc->trees[i].root, begin, end, pred, release);
- up_write(&bc->trees[i].lock);
+ up_write(&bc->trees[i].u.lock);
}
}
struct dm_buffer_cache cache; /* must be last member */
};
-static DEFINE_STATIC_KEY_FALSE(no_sleep_enabled);
-
/*----------------------------------------------------------------*/
#define dm_bufio_in_request() (!!current->bio_list)
if (need_submit)
submit_io(b, REQ_OP_READ, read_endio);
- wait_on_bit_io(&b->state, B_READING, TASK_UNINTERRUPTIBLE);
+ if (nf != NF_GET) /* we already tested this condition above */
+ wait_on_bit_io(&b->state, B_READING, TASK_UNINTERRUPTIBLE);
if (b->read_error) {
int error = blk_status_to_errno(b->read_error);
r = -ENOMEM;
goto bad_client;
}
- cache_init(&c->cache, num_locks);
+ cache_init(&c->cache, num_locks, (flags & DM_BUFIO_CLIENT_NO_SLEEP) != 0);
c->bdev = bdev;
c->block_size = block_size;
unsigned int nr_iovecs = (size + PAGE_SIZE - 1) >> PAGE_SHIFT;
gfp_t gfp_mask = GFP_NOWAIT | __GFP_HIGHMEM;
unsigned int remaining_size;
- unsigned int order = MAX_ORDER - 1;
+ unsigned int order = MAX_ORDER;
retry:
if (unlikely(gfp_mask & __GFP_DIRECT_RECLAIM))
struct work_struct flush_expired_bios;
struct list_head delayed_bios;
struct task_struct *worker;
- atomic_t may_delay;
+ bool may_delay;
struct delay_class read;
struct delay_class write;
return !!dc->worker;
}
-static void flush_delayed_bios_fast(struct delay_c *dc, bool flush_all)
-{
- struct dm_delay_info *delayed, *next;
-
- mutex_lock(&delayed_bios_lock);
- list_for_each_entry_safe(delayed, next, &dc->delayed_bios, list) {
- if (flush_all || time_after_eq(jiffies, delayed->expires)) {
- struct bio *bio = dm_bio_from_per_bio_data(delayed,
- sizeof(struct dm_delay_info));
- list_del(&delayed->list);
- dm_submit_bio_remap(bio, NULL);
- delayed->class->ops--;
- }
- }
- mutex_unlock(&delayed_bios_lock);
-}
-
-static int flush_worker_fn(void *data)
-{
- struct delay_c *dc = data;
-
- while (1) {
- flush_delayed_bios_fast(dc, false);
- if (unlikely(list_empty(&dc->delayed_bios))) {
- set_current_state(TASK_INTERRUPTIBLE);
- schedule();
- } else
- cond_resched();
- }
-
- return 0;
-}
-
static void flush_bios(struct bio *bio)
{
struct bio *n;
}
}
-static struct bio *flush_delayed_bios(struct delay_c *dc, bool flush_all)
+static void flush_delayed_bios(struct delay_c *dc, bool flush_all)
{
struct dm_delay_info *delayed, *next;
+ struct bio_list flush_bio_list;
unsigned long next_expires = 0;
- unsigned long start_timer = 0;
- struct bio_list flush_bios = { };
+ bool start_timer = false;
+ bio_list_init(&flush_bio_list);
mutex_lock(&delayed_bios_lock);
list_for_each_entry_safe(delayed, next, &dc->delayed_bios, list) {
+ cond_resched();
if (flush_all || time_after_eq(jiffies, delayed->expires)) {
struct bio *bio = dm_bio_from_per_bio_data(delayed,
sizeof(struct dm_delay_info));
list_del(&delayed->list);
- bio_list_add(&flush_bios, bio);
+ bio_list_add(&flush_bio_list, bio);
delayed->class->ops--;
continue;
}
- if (!start_timer) {
- start_timer = 1;
- next_expires = delayed->expires;
- } else
- next_expires = min(next_expires, delayed->expires);
+ if (!delay_is_fast(dc)) {
+ if (!start_timer) {
+ start_timer = true;
+ next_expires = delayed->expires;
+ } else {
+ next_expires = min(next_expires, delayed->expires);
+ }
+ }
}
mutex_unlock(&delayed_bios_lock);
if (start_timer)
queue_timeout(dc, next_expires);
- return bio_list_get(&flush_bios);
+ flush_bios(bio_list_get(&flush_bio_list));
+}
+
+static int flush_worker_fn(void *data)
+{
+ struct delay_c *dc = data;
+
+ while (!kthread_should_stop()) {
+ flush_delayed_bios(dc, false);
+ mutex_lock(&delayed_bios_lock);
+ if (unlikely(list_empty(&dc->delayed_bios))) {
+ set_current_state(TASK_INTERRUPTIBLE);
+ mutex_unlock(&delayed_bios_lock);
+ schedule();
+ } else {
+ mutex_unlock(&delayed_bios_lock);
+ cond_resched();
+ }
+ }
+
+ return 0;
}
static void flush_expired_bios(struct work_struct *work)
struct delay_c *dc;
dc = container_of(work, struct delay_c, flush_expired_bios);
- if (delay_is_fast(dc))
- flush_delayed_bios_fast(dc, false);
- else
- flush_bios(flush_delayed_bios(dc, false));
+ flush_delayed_bios(dc, false);
}
static void delay_dtr(struct dm_target *ti)
if (dc->worker)
kthread_stop(dc->worker);
- if (!delay_is_fast(dc))
- mutex_destroy(&dc->timer_lock);
+ mutex_destroy(&dc->timer_lock);
kfree(dc);
}
ti->private = dc;
INIT_LIST_HEAD(&dc->delayed_bios);
- atomic_set(&dc->may_delay, 1);
+ mutex_init(&dc->timer_lock);
+ dc->may_delay = true;
dc->argc = argc;
ret = delay_class_ctr(ti, &dc->read, argv);
"dm-delay-flush-worker");
if (IS_ERR(dc->worker)) {
ret = PTR_ERR(dc->worker);
+ dc->worker = NULL;
goto bad;
}
} else {
timer_setup(&dc->delay_timer, handle_delayed_timer, 0);
INIT_WORK(&dc->flush_expired_bios, flush_expired_bios);
- mutex_init(&dc->timer_lock);
dc->kdelayd_wq = alloc_workqueue("kdelayd", WQ_MEM_RECLAIM, 0);
if (!dc->kdelayd_wq) {
ret = -EINVAL;
struct dm_delay_info *delayed;
unsigned long expires = 0;
- if (!c->delay || !atomic_read(&dc->may_delay))
+ if (!c->delay)
return DM_MAPIO_REMAPPED;
delayed = dm_per_bio_data(bio, sizeof(struct dm_delay_info));
delayed->expires = expires = jiffies + msecs_to_jiffies(c->delay);
mutex_lock(&delayed_bios_lock);
+ if (unlikely(!dc->may_delay)) {
+ mutex_unlock(&delayed_bios_lock);
+ return DM_MAPIO_REMAPPED;
+ }
c->ops++;
list_add_tail(&delayed->list, &dc->delayed_bios);
mutex_unlock(&delayed_bios_lock);
{
struct delay_c *dc = ti->private;
- atomic_set(&dc->may_delay, 0);
+ mutex_lock(&delayed_bios_lock);
+ dc->may_delay = false;
+ mutex_unlock(&delayed_bios_lock);
- if (delay_is_fast(dc))
- flush_delayed_bios_fast(dc, true);
- else {
+ if (!delay_is_fast(dc))
del_timer_sync(&dc->delay_timer);
- flush_bios(flush_delayed_bios(dc, true));
- }
+ flush_delayed_bios(dc, true);
}
static void delay_resume(struct dm_target *ti)
{
struct delay_c *dc = ti->private;
- atomic_set(&dc->may_delay, 1);
+ dc->may_delay = true;
}
static int delay_map(struct dm_target *ti, struct bio *bio)
remaining_size = size;
- order = MAX_ORDER - 1;
+ order = MAX_ORDER;
while (remaining_size) {
struct page *pages;
unsigned size_to_add, to_copy;
sectors_to_process = dio->range.n_sectors;
__bio_for_each_segment(bv, bio, iter, dio->bio_details.bi_iter) {
+ struct bio_vec bv_copy = bv;
unsigned int pos;
char *mem, *checksums_ptr;
again:
- mem = bvec_kmap_local(&bv);
+ mem = bvec_kmap_local(&bv_copy);
pos = 0;
checksums_ptr = checksums;
do {
sectors_to_process -= ic->sectors_per_block;
pos += ic->sectors_per_block << SECTOR_SHIFT;
sector += ic->sectors_per_block;
- } while (pos < bv.bv_len && sectors_to_process && checksums != checksums_onstack);
+ } while (pos < bv_copy.bv_len && sectors_to_process && checksums != checksums_onstack);
kunmap_local(mem);
r = dm_integrity_rw_tag(ic, checksums, &dio->metadata_block, &dio->metadata_offset,
if (!sectors_to_process)
break;
- if (unlikely(pos < bv.bv_len)) {
- bv.bv_offset += pos;
- bv.bv_len -= pos;
+ if (unlikely(pos < bv_copy.bv_len)) {
+ bv_copy.bv_offset += pos;
+ bv_copy.bv_len -= pos;
goto again;
}
}
mddev_lock_nointr(&rs->md);
md_stop(&rs->md);
mddev_unlock(&rs->md);
+
+ if (work_pending(&rs->md.event_work))
+ flush_work(&rs->md.event_work);
raid_set_free(rs);
}
*/
static inline struct dm_verity_fec_io *fec_io(struct dm_verity_io *io)
{
- return (struct dm_verity_fec_io *) verity_io_digest_end(io->v, io);
+ return (struct dm_verity_fec_io *)
+ ((char *)io + io->v->ti->per_io_data_size - sizeof(struct dm_verity_fec_io));
}
/*
{
if (unlikely(verity_hash(v, verity_io_hash_req(v, io),
data, 1 << v->data_dev_block_bits,
- verity_io_real_digest(v, io))))
+ verity_io_real_digest(v, io), true)))
return 0;
return memcmp(verity_io_real_digest(v, io), want_digest,
/* Always re-validate the corrected block against the expected hash */
r = verity_hash(v, verity_io_hash_req(v, io), fio->output,
1 << v->data_dev_block_bits,
- verity_io_real_digest(v, io));
+ verity_io_real_digest(v, io), true);
if (unlikely(r < 0))
return r;
* Wrapper for crypto_ahash_init, which handles verity salting.
*/
static int verity_hash_init(struct dm_verity *v, struct ahash_request *req,
- struct crypto_wait *wait)
+ struct crypto_wait *wait, bool may_sleep)
{
int r;
ahash_request_set_tfm(req, v->tfm);
- ahash_request_set_callback(req, CRYPTO_TFM_REQ_MAY_SLEEP |
- CRYPTO_TFM_REQ_MAY_BACKLOG,
- crypto_req_done, (void *)wait);
+ ahash_request_set_callback(req,
+ may_sleep ? CRYPTO_TFM_REQ_MAY_SLEEP | CRYPTO_TFM_REQ_MAY_BACKLOG : 0,
+ crypto_req_done, (void *)wait);
crypto_init_wait(wait);
r = crypto_wait_req(crypto_ahash_init(req), wait);
if (unlikely(r < 0)) {
- DMERR("crypto_ahash_init failed: %d", r);
+ if (r != -ENOMEM)
+ DMERR("crypto_ahash_init failed: %d", r);
return r;
}
}
int verity_hash(struct dm_verity *v, struct ahash_request *req,
- const u8 *data, size_t len, u8 *digest)
+ const u8 *data, size_t len, u8 *digest, bool may_sleep)
{
int r;
struct crypto_wait wait;
- r = verity_hash_init(v, req, &wait);
+ r = verity_hash_init(v, req, &wait, may_sleep);
if (unlikely(r < 0))
goto out;
r = verity_hash(v, verity_io_hash_req(v, io),
data, 1 << v->hash_dev_block_bits,
- verity_io_real_digest(v, io));
+ verity_io_real_digest(v, io), !io->in_tasklet);
if (unlikely(r < 0))
goto release_ret_r;
continue;
}
- r = verity_hash_init(v, req, &wait);
+ r = verity_hash_init(v, req, &wait, !io->in_tasklet);
if (unlikely(r < 0))
return r;
io->in_tasklet = false;
- verity_fec_init_io(io);
verity_finish_io(io, errno_to_blk_status(verity_verify_io(io)));
}
io->in_tasklet = true;
err = verity_verify_io(io);
- if (err == -EAGAIN) {
+ if (err == -EAGAIN || err == -ENOMEM) {
/* fallback to retrying with work-queue */
INIT_WORK(&io->work, verity_work);
queue_work(io->v->verify_wq, &io->work);
struct dm_verity_io *io = bio->bi_private;
if (bio->bi_status &&
- (!verity_fec_is_enabled(io->v) || verity_is_system_shutting_down())) {
+ (!verity_fec_is_enabled(io->v) ||
+ verity_is_system_shutting_down() ||
+ (bio->bi_opf & REQ_RAHEAD))) {
verity_finish_io(io, bio->bi_status);
return;
}
bio->bi_private = io;
io->iter = bio->bi_iter;
+ verity_fec_init_io(io);
+
verity_submit_prefetch(v, io);
submit_bio_noacct(bio);
goto out;
r = verity_hash(v, req, zero_data, 1 << v->data_dev_block_bits,
- v->zero_digest);
+ v->zero_digest, true);
out:
kfree(req);
return (u8 *)(io + 1) + v->ahash_reqsize + v->digest_size;
}
-static inline u8 *verity_io_digest_end(struct dm_verity *v,
- struct dm_verity_io *io)
-{
- return verity_io_want_digest(v, io) + v->digest_size;
-}
-
extern int verity_for_bv_block(struct dm_verity *v, struct dm_verity_io *io,
struct bvec_iter *iter,
int (*process)(struct dm_verity *v,
u8 *data, size_t len));
extern int verity_hash(struct dm_verity *v, struct ahash_request *req,
- const u8 *data, size_t len, u8 *digest);
+ const u8 *data, size_t len, u8 *digest, bool may_sleep);
extern int verity_hash_for_block(struct dm_verity *v, struct dm_verity_io *io,
sector_t block, u8 *digest, bool *is_zero);
static DECLARE_WAIT_QUEUE_HEAD(resync_wait);
static struct workqueue_struct *md_wq;
+
+/*
+ * This workqueue is used for sync_work to register new sync_thread, and for
+ * del_work to remove rdev, and for event_work that is only set by dm-raid.
+ *
+ * Noted that sync_work will grab reconfig_mutex, hence never flush this
+ * workqueue whith reconfig_mutex grabbed.
+ */
static struct workqueue_struct *md_misc_wq;
struct workqueue_struct *md_bitmap_wq;
}
EXPORT_SYMBOL_GPL(mddev_suspend);
-void mddev_resume(struct mddev *mddev)
+static void __mddev_resume(struct mddev *mddev, bool recovery_needed)
{
lockdep_assert_not_held(&mddev->reconfig_mutex);
percpu_ref_resurrect(&mddev->active_io);
wake_up(&mddev->sb_wait);
- set_bit(MD_RECOVERY_NEEDED, &mddev->recovery);
+ if (recovery_needed)
+ set_bit(MD_RECOVERY_NEEDED, &mddev->recovery);
md_wakeup_thread(mddev->thread);
md_wakeup_thread(mddev->sync_thread); /* possibly kick off a reshape */
mutex_unlock(&mddev->suspend_mutex);
}
+
+void mddev_resume(struct mddev *mddev)
+{
+ return __mddev_resume(mddev, true);
+}
EXPORT_SYMBOL_GPL(mddev_resume);
/*
return sprintf(page, "%s\n", type);
}
-static void stop_sync_thread(struct mddev *mddev)
+/**
+ * stop_sync_thread() - wait for sync_thread to stop if it's running.
+ * @mddev: the array.
+ * @locked: if set, reconfig_mutex will still be held after this function
+ * return; if not set, reconfig_mutex will be released after this
+ * function return.
+ * @check_seq: if set, only wait for curent running sync_thread to stop, noted
+ * that new sync_thread can still start.
+ */
+static void stop_sync_thread(struct mddev *mddev, bool locked, bool check_seq)
{
- if (!test_bit(MD_RECOVERY_RUNNING, &mddev->recovery))
- return;
+ int sync_seq;
- if (mddev_lock(mddev))
- return;
+ if (check_seq)
+ sync_seq = atomic_read(&mddev->sync_seq);
- /*
- * Check again in case MD_RECOVERY_RUNNING is cleared before lock is
- * held.
- */
if (!test_bit(MD_RECOVERY_RUNNING, &mddev->recovery)) {
- mddev_unlock(mddev);
+ if (!locked)
+ mddev_unlock(mddev);
return;
}
- if (work_pending(&mddev->del_work))
- flush_workqueue(md_misc_wq);
+ mddev_unlock(mddev);
set_bit(MD_RECOVERY_INTR, &mddev->recovery);
/*
* never happen
*/
md_wakeup_thread_directly(mddev->sync_thread);
+ if (work_pending(&mddev->sync_work))
+ flush_work(&mddev->sync_work);
- mddev_unlock(mddev);
+ wait_event(resync_wait,
+ !test_bit(MD_RECOVERY_RUNNING, &mddev->recovery) ||
+ (check_seq && sync_seq != atomic_read(&mddev->sync_seq)));
+
+ if (locked)
+ mddev_lock_nointr(mddev);
}
static void idle_sync_thread(struct mddev *mddev)
{
- int sync_seq = atomic_read(&mddev->sync_seq);
-
mutex_lock(&mddev->sync_mutex);
clear_bit(MD_RECOVERY_FROZEN, &mddev->recovery);
- stop_sync_thread(mddev);
- wait_event(resync_wait, sync_seq != atomic_read(&mddev->sync_seq) ||
- !test_bit(MD_RECOVERY_RUNNING, &mddev->recovery));
+ if (mddev_lock(mddev)) {
+ mutex_unlock(&mddev->sync_mutex);
+ return;
+ }
+ stop_sync_thread(mddev, false, true);
mutex_unlock(&mddev->sync_mutex);
}
{
mutex_lock(&mddev->sync_mutex);
set_bit(MD_RECOVERY_FROZEN, &mddev->recovery);
- stop_sync_thread(mddev);
- wait_event(resync_wait, mddev->sync_thread == NULL &&
- !test_bit(MD_RECOVERY_RUNNING, &mddev->recovery));
+ if (mddev_lock(mddev)) {
+ mutex_unlock(&mddev->sync_mutex);
+ return;
+ }
+ stop_sync_thread(mddev, false, false);
mutex_unlock(&mddev->sync_mutex);
}
static void __md_stop_writes(struct mddev *mddev)
{
- set_bit(MD_RECOVERY_FROZEN, &mddev->recovery);
- if (work_pending(&mddev->del_work))
- flush_workqueue(md_misc_wq);
- if (mddev->sync_thread) {
- set_bit(MD_RECOVERY_INTR, &mddev->recovery);
- md_reap_sync_thread(mddev);
- }
-
+ stop_sync_thread(mddev, true, false);
del_timer_sync(&mddev->safemode_timer);
if (mddev->pers && mddev->pers->quiesce) {
struct md_personality *pers = mddev->pers;
md_bitmap_destroy(mddev);
mddev_detach(mddev);
- /* Ensure ->event_work is done */
- if (mddev->event_work.func)
- flush_workqueue(md_misc_wq);
spin_lock(&mddev->lock);
mddev->pers = NULL;
spin_unlock(&mddev->lock);
int err = 0;
int did_freeze = 0;
+ if (mddev->external && test_bit(MD_SB_CHANGE_PENDING, &mddev->sb_flags))
+ return -EBUSY;
+
if (!test_bit(MD_RECOVERY_FROZEN, &mddev->recovery)) {
did_freeze = 1;
set_bit(MD_RECOVERY_FROZEN, &mddev->recovery);
md_wakeup_thread(mddev->thread);
}
- if (test_bit(MD_RECOVERY_RUNNING, &mddev->recovery))
- set_bit(MD_RECOVERY_INTR, &mddev->recovery);
- /*
- * Thread might be blocked waiting for metadata update which will now
- * never happen
- */
- md_wakeup_thread_directly(mddev->sync_thread);
-
- if (mddev->external && test_bit(MD_SB_CHANGE_PENDING, &mddev->sb_flags))
- return -EBUSY;
- mddev_unlock(mddev);
- wait_event(resync_wait, !test_bit(MD_RECOVERY_RUNNING,
- &mddev->recovery));
+ stop_sync_thread(mddev, false, false);
wait_event(mddev->sb_wait,
!test_bit(MD_SB_CHANGE_PENDING, &mddev->sb_flags));
mddev_lock_nointr(mddev);
mddev->sync_thread ||
test_bit(MD_RECOVERY_RUNNING, &mddev->recovery)) {
pr_warn("md: %s still in use.\n",mdname(mddev));
- if (did_freeze) {
- clear_bit(MD_RECOVERY_FROZEN, &mddev->recovery);
- set_bit(MD_RECOVERY_NEEDED, &mddev->recovery);
- md_wakeup_thread(mddev->thread);
- }
err = -EBUSY;
goto out;
}
+
if (mddev->pers) {
__md_stop_writes(mddev);
- err = -ENXIO;
- if (mddev->ro == MD_RDONLY)
+ if (mddev->ro == MD_RDONLY) {
+ err = -ENXIO;
goto out;
+ }
+
mddev->ro = MD_RDONLY;
set_disk_ro(mddev->gendisk, 1);
+ }
+
+out:
+ if ((mddev->pers && !err) || did_freeze) {
clear_bit(MD_RECOVERY_FROZEN, &mddev->recovery);
set_bit(MD_RECOVERY_NEEDED, &mddev->recovery);
md_wakeup_thread(mddev->thread);
sysfs_notify_dirent_safe(mddev->sysfs_state);
- err = 0;
}
-out:
+
mutex_unlock(&mddev->open_mutex);
return err;
}
set_bit(MD_RECOVERY_FROZEN, &mddev->recovery);
md_wakeup_thread(mddev->thread);
}
- if (test_bit(MD_RECOVERY_RUNNING, &mddev->recovery))
- set_bit(MD_RECOVERY_INTR, &mddev->recovery);
- /*
- * Thread might be blocked waiting for metadata update which will now
- * never happen
- */
- md_wakeup_thread_directly(mddev->sync_thread);
-
- mddev_unlock(mddev);
- wait_event(resync_wait, (mddev->sync_thread == NULL &&
- !test_bit(MD_RECOVERY_RUNNING,
- &mddev->recovery)));
- mddev_lock_nointr(mddev);
+ stop_sync_thread(mddev, true, false);
mutex_lock(&mddev->open_mutex);
if ((mddev->pers && atomic_read(&mddev->openers) > !!bdev) ||
struct bio *orig_bio = md_io_clone->orig_bio;
struct mddev *mddev = md_io_clone->mddev;
- orig_bio->bi_status = bio->bi_status;
+ if (bio->bi_status && !orig_bio->bi_status)
+ orig_bio->bi_status = bio->bi_status;
if (md_io_clone->start_time)
bio_end_io_acct(orig_bio, md_io_clone->start_time);
goto not_running;
}
- suspend ? mddev_unlock_and_resume(mddev) : mddev_unlock(mddev);
+ mddev_unlock(mddev);
+ /*
+ * md_start_sync was triggered by MD_RECOVERY_NEEDED, so we should
+ * not set it again. Otherwise, we may cause issue like this one:
+ * https://bugzilla.kernel.org/show_bug.cgi?id=218200
+ * Therefore, use __mddev_resume(mddev, false).
+ */
+ if (suspend)
+ __mddev_resume(mddev, false);
md_wakeup_thread(mddev->sync_thread);
sysfs_notify_dirent_safe(mddev->sysfs_action);
md_new_event();
clear_bit(MD_RECOVERY_REQUESTED, &mddev->recovery);
clear_bit(MD_RECOVERY_CHECK, &mddev->recovery);
clear_bit(MD_RECOVERY_RUNNING, &mddev->recovery);
- suspend ? mddev_unlock_and_resume(mddev) : mddev_unlock(mddev);
+ mddev_unlock(mddev);
+ /*
+ * md_start_sync was triggered by MD_RECOVERY_NEEDED, so we should
+ * not set it again. Otherwise, we may cause issue like this one:
+ * https://bugzilla.kernel.org/show_bug.cgi?id=218200
+ * Therefore, use __mddev_resume(mddev, false).
+ */
+ if (suspend)
+ __mddev_resume(mddev, false);
wake_up(&resync_wait);
if (test_and_clear_bit(MD_RECOVERY_RECOVER, &mddev->recovery) &&
int dd_idx;
for (dd_idx = 0; dd_idx < sh->disks; dd_idx++) {
- if (dd_idx == sh->pd_idx)
+ if (dd_idx == sh->pd_idx || dd_idx == sh->qd_idx)
continue;
min_sector = min(min_sector, sh->dev[dd_idx].sector);
- max_sector = min(max_sector, sh->dev[dd_idx].sector);
+ max_sector = max(max_sector, sh->dev[dd_idx].sector);
}
spin_lock_irq(&conf->device_lock);
config VIDEO_MGB4
tristate "Digiteq Automotive MGB4 support"
depends on VIDEO_DEV && PCI && I2C && DMADEVICES && SPI && MTD && IIO
+ depends on COMMON_CLK
select VIDEOBUF2_DMA_SG
select IIO_BUFFER
select IIO_TRIGGERED_BUFFER
#define MGB4_USER_IRQS 16
+#define DIGITEQ_VID 0x1ed8
+#define T100_DID 0x0101
+#define T200_DID 0x0201
+
ATTRIBUTE_GROUPS(mgb4_pci);
static int flashid;
return dev ? container_of(dev, struct spi_master, dev) : NULL;
}
-static int init_spi(struct mgb4_dev *mgbdev)
+static int init_spi(struct mgb4_dev *mgbdev, u32 devid)
{
struct resource spi_resources[] = {
{
snprintf(mgbdev->fw_part_name, sizeof(mgbdev->fw_part_name),
"mgb4-fw.%d", flashid);
mgbdev->partitions[0].name = mgbdev->fw_part_name;
- mgbdev->partitions[0].size = 0x400000;
- mgbdev->partitions[0].offset = 0x400000;
+ if (devid == T200_DID) {
+ mgbdev->partitions[0].size = 0x950000;
+ mgbdev->partitions[0].offset = 0x1000000;
+ } else {
+ mgbdev->partitions[0].size = 0x400000;
+ mgbdev->partitions[0].offset = 0x400000;
+ }
mgbdev->partitions[0].mask_flags = 0;
snprintf(mgbdev->data_part_name, sizeof(mgbdev->data_part_name),
goto err_video_regs;
/* SPI FLASH */
- rv = init_spi(mgbdev);
+ rv = init_spi(mgbdev, id->device);
if (rv < 0)
goto err_cmt_regs;
}
static const struct pci_device_id mgb4_pci_ids[] = {
- { PCI_DEVICE(0x1ed8, 0x0101), },
+ { PCI_DEVICE(DIGITEQ_VID, T100_DID), },
+ { PCI_DEVICE(DIGITEQ_VID, T200_DID), },
{ 0, }
};
MODULE_DEVICE_TABLE(pci, mgb4_pci_ids);
(7 << VI6_DPR_SMPPT_TGW_SHIFT) |
(VI6_DPR_NODE_UNUSED << VI6_DPR_SMPPT_PT_SHIFT));
- v4l2_subdev_call(&pipe->output->entity.subdev, video, s_stream, 0);
+ vsp1_wpf_stop(pipe->output);
return ret;
}
data);
}
-/* -----------------------------------------------------------------------------
- * V4L2 Subdevice Operations
- */
-
-static const struct v4l2_subdev_ops rpf_ops = {
- .pad = &vsp1_rwpf_pad_ops,
-};
-
/* -----------------------------------------------------------------------------
* VSP1 Entity Operations
*/
rpf->entity.index = index;
sprintf(name, "rpf.%u", index);
- ret = vsp1_entity_init(vsp1, &rpf->entity, name, 2, &rpf_ops,
+ ret = vsp1_entity_init(vsp1, &rpf->entity, name, 2, &vsp1_rwpf_subdev_ops,
MEDIA_ENT_F_PROC_VIDEO_PIXEL_FORMATTER);
if (ret < 0)
return ERR_PTR(ret);
}
/* -----------------------------------------------------------------------------
- * V4L2 Subdevice Pad Operations
+ * V4L2 Subdevice Operations
*/
static int vsp1_rwpf_enum_mbus_code(struct v4l2_subdev *subdev,
return ret;
}
-const struct v4l2_subdev_pad_ops vsp1_rwpf_pad_ops = {
+static const struct v4l2_subdev_pad_ops vsp1_rwpf_pad_ops = {
.init_cfg = vsp1_entity_init_cfg,
.enum_mbus_code = vsp1_rwpf_enum_mbus_code,
.enum_frame_size = vsp1_rwpf_enum_frame_size,
.set_selection = vsp1_rwpf_set_selection,
};
+const struct v4l2_subdev_ops vsp1_rwpf_subdev_ops = {
+ .pad = &vsp1_rwpf_pad_ops,
+};
+
/* -----------------------------------------------------------------------------
* Controls
*/
struct vsp1_rwpf *vsp1_rpf_create(struct vsp1_device *vsp1, unsigned int index);
struct vsp1_rwpf *vsp1_wpf_create(struct vsp1_device *vsp1, unsigned int index);
+void vsp1_wpf_stop(struct vsp1_rwpf *wpf);
+
int vsp1_rwpf_init_ctrls(struct vsp1_rwpf *rwpf, unsigned int ncontrols);
-extern const struct v4l2_subdev_pad_ops vsp1_rwpf_pad_ops;
+extern const struct v4l2_subdev_ops vsp1_rwpf_subdev_ops;
struct v4l2_rect *vsp1_rwpf_get_crop(struct vsp1_rwpf *rwpf,
struct v4l2_subdev_state *sd_state);
}
/* -----------------------------------------------------------------------------
- * V4L2 Subdevice Core Operations
+ * VSP1 Entity Operations
*/
-static int wpf_s_stream(struct v4l2_subdev *subdev, int enable)
+void vsp1_wpf_stop(struct vsp1_rwpf *wpf)
{
- struct vsp1_rwpf *wpf = to_rwpf(subdev);
struct vsp1_device *vsp1 = wpf->entity.vsp1;
- if (enable)
- return 0;
-
/*
* Write to registers directly when stopping the stream as there will be
* no pipeline run to apply the display list.
vsp1_write(vsp1, VI6_WPF_IRQ_ENB(wpf->entity.index), 0);
vsp1_write(vsp1, wpf->entity.index * VI6_WPF_OFFSET +
VI6_WPF_SRCRPF, 0);
-
- return 0;
}
-/* -----------------------------------------------------------------------------
- * V4L2 Subdevice Operations
- */
-
-static const struct v4l2_subdev_video_ops wpf_video_ops = {
- .s_stream = wpf_s_stream,
-};
-
-static const struct v4l2_subdev_ops wpf_ops = {
- .video = &wpf_video_ops,
- .pad = &vsp1_rwpf_pad_ops,
-};
-
-/* -----------------------------------------------------------------------------
- * VSP1 Entity Operations
- */
-
static void vsp1_wpf_destroy(struct vsp1_entity *entity)
{
struct vsp1_rwpf *wpf = entity_to_rwpf(entity);
wpf->entity.index = index;
sprintf(name, "wpf.%u", index);
- ret = vsp1_entity_init(vsp1, &wpf->entity, name, 2, &wpf_ops,
+ ret = vsp1_entity_init(vsp1, &wpf->entity, name, 2, &vsp1_rwpf_subdev_ops,
MEDIA_ENT_F_PROC_VIDEO_PIXEL_FORMATTER);
if (ret < 0)
return ERR_PTR(ret);
mei_hdr = mei_msg_hdr_init(cb);
if (IS_ERR(mei_hdr)) {
- rets = -PTR_ERR(mei_hdr);
+ rets = PTR_ERR(mei_hdr);
mei_hdr = NULL;
goto err;
}
hbuf_slots = mei_hbuf_empty_slots(dev);
if (hbuf_slots < 0) {
- rets = -EOVERFLOW;
+ buf_len = -EOVERFLOW;
goto out;
}
byte = ret;
break;
}
+ return byte;
}
- return byte;
+ return 0;
}
/**
blk_mq_requeue_request(req, true);
else
__blk_mq_end_request(req, BLK_STS_OK);
+ } else if (mq->in_recovery) {
+ blk_mq_requeue_request(req, true);
} else {
blk_mq_end_request(req, BLK_STS_OK);
}
cmd.flags = MMC_RSP_R1B | MMC_CMD_AC;
cmd.flags &= ~MMC_RSP_CRC; /* Ignore CRC */
cmd.busy_timeout = MMC_CQE_RECOVERY_TIMEOUT;
- mmc_wait_for_cmd(host, &cmd, 0);
+ mmc_wait_for_cmd(host, &cmd, MMC_CMD_RETRIES);
+
+ mmc_poll_for_busy(host->card, MMC_CQE_RECOVERY_TIMEOUT, true, MMC_BUSY_IO);
memset(&cmd, 0, sizeof(cmd));
cmd.opcode = MMC_CMDQ_TASK_MGMT;
cmd.flags = MMC_RSP_R1B | MMC_CMD_AC;
cmd.flags &= ~MMC_RSP_CRC; /* Ignore CRC */
cmd.busy_timeout = MMC_CQE_RECOVERY_TIMEOUT;
- err = mmc_wait_for_cmd(host, &cmd, 0);
+ err = mmc_wait_for_cmd(host, &cmd, MMC_CMD_RETRIES);
host->cqe_ops->cqe_recovery_finish(host);
+ if (err)
+ err = mmc_wait_for_cmd(host, &cmd, MMC_CMD_RETRIES);
+
mmc_retune_release(host);
return err;
ret = cqhci_tasks_cleared(cq_host);
if (!ret)
- pr_debug("%s: cqhci: Failed to clear tasks\n",
- mmc_hostname(mmc));
+ pr_warn("%s: cqhci: Failed to clear tasks\n",
+ mmc_hostname(mmc));
return ret;
}
ret = cqhci_halted(cq_host);
if (!ret)
- pr_debug("%s: cqhci: Failed to halt\n", mmc_hostname(mmc));
+ pr_warn("%s: cqhci: Failed to halt\n", mmc_hostname(mmc));
return ret;
}
/*
* After halting we expect to be able to use the command line. We interpret the
* failure to halt to mean the data lines might still be in use (and the upper
- * layers will need to send a STOP command), so we set the timeout based on a
- * generous command timeout.
+ * layers will need to send a STOP command), however failing to halt complicates
+ * the recovery, so set a timeout that would reasonably allow I/O to complete.
*/
-#define CQHCI_START_HALT_TIMEOUT 5
+#define CQHCI_START_HALT_TIMEOUT 500
static void cqhci_recovery_start(struct mmc_host *mmc)
{
ok = cqhci_halt(mmc, CQHCI_FINISH_HALT_TIMEOUT);
- if (!cqhci_clear_all_tasks(mmc, CQHCI_CLEAR_TIMEOUT))
- ok = false;
-
/*
* The specification contradicts itself, by saying that tasks cannot be
* cleared if CQHCI does not halt, but if CQHCI does not halt, it should
* be disabled/re-enabled, but not to disable before clearing tasks.
* Have a go anyway.
*/
- if (!ok) {
- pr_debug("%s: cqhci: disable / re-enable\n", mmc_hostname(mmc));
- cqcfg = cqhci_readl(cq_host, CQHCI_CFG);
- cqcfg &= ~CQHCI_ENABLE;
- cqhci_writel(cq_host, cqcfg, CQHCI_CFG);
- cqcfg |= CQHCI_ENABLE;
- cqhci_writel(cq_host, cqcfg, CQHCI_CFG);
- /* Be sure that there are no tasks */
- ok = cqhci_halt(mmc, CQHCI_FINISH_HALT_TIMEOUT);
- if (!cqhci_clear_all_tasks(mmc, CQHCI_CLEAR_TIMEOUT))
- ok = false;
- WARN_ON(!ok);
- }
+ if (!cqhci_clear_all_tasks(mmc, CQHCI_CLEAR_TIMEOUT))
+ ok = false;
+
+ /* Disable to make sure tasks really are cleared */
+ cqcfg = cqhci_readl(cq_host, CQHCI_CFG);
+ cqcfg &= ~CQHCI_ENABLE;
+ cqhci_writel(cq_host, cqcfg, CQHCI_CFG);
+
+ cqcfg = cqhci_readl(cq_host, CQHCI_CFG);
+ cqcfg |= CQHCI_ENABLE;
+ cqhci_writel(cq_host, cqcfg, CQHCI_CFG);
+
+ cqhci_halt(mmc, CQHCI_FINISH_HALT_TIMEOUT);
+
+ if (!ok)
+ cqhci_clear_all_tasks(mmc, CQHCI_CLEAR_TIMEOUT);
cqhci_recover_mrqs(cq_host);
sdhci_writel(host, val, SDHCI_GLI_9763E_HS400_ES_REG);
}
+static void gl9763e_set_low_power_negotiation(struct sdhci_pci_slot *slot,
+ bool enable)
+{
+ struct pci_dev *pdev = slot->chip->pdev;
+ u32 value;
+
+ pci_read_config_dword(pdev, PCIE_GLI_9763E_VHS, &value);
+ value &= ~GLI_9763E_VHS_REV;
+ value |= FIELD_PREP(GLI_9763E_VHS_REV, GLI_9763E_VHS_REV_W);
+ pci_write_config_dword(pdev, PCIE_GLI_9763E_VHS, value);
+
+ pci_read_config_dword(pdev, PCIE_GLI_9763E_CFG, &value);
+
+ if (enable)
+ value &= ~GLI_9763E_CFG_LPSN_DIS;
+ else
+ value |= GLI_9763E_CFG_LPSN_DIS;
+
+ pci_write_config_dword(pdev, PCIE_GLI_9763E_CFG, value);
+
+ pci_read_config_dword(pdev, PCIE_GLI_9763E_VHS, &value);
+ value &= ~GLI_9763E_VHS_REV;
+ value |= FIELD_PREP(GLI_9763E_VHS_REV, GLI_9763E_VHS_REV_R);
+ pci_write_config_dword(pdev, PCIE_GLI_9763E_VHS, value);
+}
+
static void sdhci_set_gl9763e_signaling(struct sdhci_host *host,
unsigned int timing)
{
if (ret)
goto cleanup;
+ /* Disable LPM negotiation to avoid entering L1 state. */
+ gl9763e_set_low_power_negotiation(slot, false);
+
return 0;
cleanup:
}
#ifdef CONFIG_PM
-static void gl9763e_set_low_power_negotiation(struct sdhci_pci_slot *slot, bool enable)
-{
- struct pci_dev *pdev = slot->chip->pdev;
- u32 value;
-
- pci_read_config_dword(pdev, PCIE_GLI_9763E_VHS, &value);
- value &= ~GLI_9763E_VHS_REV;
- value |= FIELD_PREP(GLI_9763E_VHS_REV, GLI_9763E_VHS_REV_W);
- pci_write_config_dword(pdev, PCIE_GLI_9763E_VHS, value);
-
- pci_read_config_dword(pdev, PCIE_GLI_9763E_CFG, &value);
-
- if (enable)
- value &= ~GLI_9763E_CFG_LPSN_DIS;
- else
- value |= GLI_9763E_CFG_LPSN_DIS;
-
- pci_write_config_dword(pdev, PCIE_GLI_9763E_CFG, value);
-
- pci_read_config_dword(pdev, PCIE_GLI_9763E_VHS, &value);
- value &= ~GLI_9763E_VHS_REV;
- value |= FIELD_PREP(GLI_9763E_VHS_REV, GLI_9763E_VHS_REV_R);
- pci_write_config_dword(pdev, PCIE_GLI_9763E_VHS, value);
-}
-
static int gl9763e_runtime_suspend(struct sdhci_pci_chip *chip)
{
struct sdhci_pci_slot *slot = chip->slots[0];
mmc_request_done(host->mmc, mrq);
}
+static void sdhci_sprd_set_power(struct sdhci_host *host, unsigned char mode,
+ unsigned short vdd)
+{
+ struct mmc_host *mmc = host->mmc;
+
+ switch (mode) {
+ case MMC_POWER_OFF:
+ mmc_regulator_set_ocr(host->mmc, mmc->supply.vmmc, 0);
+
+ mmc_regulator_disable_vqmmc(mmc);
+ break;
+ case MMC_POWER_ON:
+ mmc_regulator_enable_vqmmc(mmc);
+ break;
+ case MMC_POWER_UP:
+ mmc_regulator_set_ocr(host->mmc, mmc->supply.vmmc, vdd);
+ break;
+ }
+}
+
static struct sdhci_ops sdhci_sprd_ops = {
.read_l = sdhci_sprd_readl,
.write_l = sdhci_sprd_writel,
.write_w = sdhci_sprd_writew,
.write_b = sdhci_sprd_writeb,
.set_clock = sdhci_sprd_set_clock,
+ .set_power = sdhci_sprd_set_power,
.get_max_clock = sdhci_sprd_get_max_clock,
.get_min_clock = sdhci_sprd_get_min_clock,
.set_bus_width = sdhci_set_bus_width,
host->caps1 &= ~(SDHCI_SUPPORT_SDR50 | SDHCI_SUPPORT_SDR104 |
SDHCI_SUPPORT_DDR50);
+ ret = mmc_regulator_get_supply(host->mmc);
+ if (ret)
+ goto pm_runtime_disable;
+
ret = sdhci_setup_host(host);
if (ret)
goto pm_runtime_disable;
#define ARC_IS_5MBIT 1 /* card default speed is 5MBit */
#define ARC_CAN_10MBIT 2 /* card uses COM20022, supporting 10MBit,
but default is 2.5MBit. */
+#define ARC_HAS_LED 4 /* card has software controlled LEDs */
+#define ARC_HAS_ROTARY 8 /* card has rotary encoder */
/* information needed to define an encapsulation driver */
struct ArcProto {
if (!strncmp(ci->name, "EAE PLX-PCI FB2", 15))
lp->backplane = 1;
- /* Get the dev_id from the PLX rotary coder */
- if (!strncmp(ci->name, "EAE PLX-PCI MA1", 15))
- dev_id_mask = 0x3;
- dev->dev_id = (inb(priv->misc + ci->rotary) >> 4) & dev_id_mask;
-
- snprintf(dev->name, sizeof(dev->name), "arc%d-%d", dev->dev_id, i);
+ if (ci->flags & ARC_HAS_ROTARY) {
+ /* Get the dev_id from the PLX rotary coder */
+ if (!strncmp(ci->name, "EAE PLX-PCI MA1", 15))
+ dev_id_mask = 0x3;
+ dev->dev_id = (inb(priv->misc + ci->rotary) >> 4) & dev_id_mask;
+ snprintf(dev->name, sizeof(dev->name), "arc%d-%d", dev->dev_id, i);
+ }
if (arcnet_inb(ioaddr, COM20020_REG_R_STATUS) == 0xFF) {
pr_err("IO address %Xh is empty!\n", ioaddr);
goto err_free_arcdev;
}
+ ret = com20020_found(dev, IRQF_SHARED);
+ if (ret)
+ goto err_free_arcdev;
+
card = devm_kzalloc(&pdev->dev, sizeof(struct com20020_dev),
GFP_KERNEL);
if (!card) {
card->index = i;
card->pci_priv = priv;
- card->tx_led.brightness_set = led_tx_set;
- card->tx_led.default_trigger = devm_kasprintf(&pdev->dev,
- GFP_KERNEL, "arc%d-%d-tx",
- dev->dev_id, i);
- card->tx_led.name = devm_kasprintf(&pdev->dev, GFP_KERNEL,
- "pci:green:tx:%d-%d",
- dev->dev_id, i);
-
- card->tx_led.dev = &dev->dev;
- card->recon_led.brightness_set = led_recon_set;
- card->recon_led.default_trigger = devm_kasprintf(&pdev->dev,
- GFP_KERNEL, "arc%d-%d-recon",
- dev->dev_id, i);
- card->recon_led.name = devm_kasprintf(&pdev->dev, GFP_KERNEL,
- "pci:red:recon:%d-%d",
- dev->dev_id, i);
- card->recon_led.dev = &dev->dev;
- card->dev = dev;
-
- ret = devm_led_classdev_register(&pdev->dev, &card->tx_led);
- if (ret)
- goto err_free_arcdev;
- ret = devm_led_classdev_register(&pdev->dev, &card->recon_led);
- if (ret)
- goto err_free_arcdev;
-
- dev_set_drvdata(&dev->dev, card);
-
- ret = com20020_found(dev, IRQF_SHARED);
- if (ret)
- goto err_free_arcdev;
-
- devm_arcnet_led_init(dev, dev->dev_id, i);
+ if (ci->flags & ARC_HAS_LED) {
+ card->tx_led.brightness_set = led_tx_set;
+ card->tx_led.default_trigger = devm_kasprintf(&pdev->dev,
+ GFP_KERNEL, "arc%d-%d-tx",
+ dev->dev_id, i);
+ card->tx_led.name = devm_kasprintf(&pdev->dev, GFP_KERNEL,
+ "pci:green:tx:%d-%d",
+ dev->dev_id, i);
+
+ card->tx_led.dev = &dev->dev;
+ card->recon_led.brightness_set = led_recon_set;
+ card->recon_led.default_trigger = devm_kasprintf(&pdev->dev,
+ GFP_KERNEL, "arc%d-%d-recon",
+ dev->dev_id, i);
+ card->recon_led.name = devm_kasprintf(&pdev->dev, GFP_KERNEL,
+ "pci:red:recon:%d-%d",
+ dev->dev_id, i);
+ card->recon_led.dev = &dev->dev;
+
+ ret = devm_led_classdev_register(&pdev->dev, &card->tx_led);
+ if (ret)
+ goto err_free_arcdev;
+
+ ret = devm_led_classdev_register(&pdev->dev, &card->recon_led);
+ if (ret)
+ goto err_free_arcdev;
+
+ dev_set_drvdata(&dev->dev, card);
+ devm_arcnet_led_init(dev, dev->dev_id, i);
+ }
+ card->dev = dev;
list_add(&card->list, &priv->list_dev);
continue;
};
static struct com20020_pci_card_info card_info_sohard = {
- .name = "PLX-PCI",
+ .name = "SOHARD SH ARC-PCI",
.devcount = 1,
/* SOHARD needs PCI base addr 4 */
.chan_map_tbl = {
},
},
.rotary = 0x0,
- .flags = ARC_CAN_10MBIT,
+ .flags = ARC_HAS_ROTARY | ARC_HAS_LED | ARC_CAN_10MBIT,
};
static struct com20020_pci_card_info card_info_eae_ma1 = {
},
},
.rotary = 0x0,
- .flags = ARC_CAN_10MBIT,
+ .flags = ARC_HAS_ROTARY | ARC_HAS_LED | ARC_CAN_10MBIT,
};
static struct com20020_pci_card_info card_info_eae_fb2 = {
},
},
.rotary = 0x0,
- .flags = ARC_CAN_10MBIT,
+ .flags = ARC_HAS_ROTARY | ARC_HAS_LED | ARC_CAN_10MBIT,
};
static const struct pci_device_id com20020pci_id_table[] = {
static void bond_setup_by_slave(struct net_device *bond_dev,
struct net_device *slave_dev)
{
+ bool was_up = !!(bond_dev->flags & IFF_UP);
+
+ dev_close(bond_dev);
+
bond_dev->header_ops = slave_dev->header_ops;
bond_dev->type = slave_dev->type;
bond_dev->flags &= ~(IFF_BROADCAST | IFF_MULTICAST);
bond_dev->flags |= (IFF_POINTOPOINT | IFF_NOARP);
}
+ if (was_up)
+ dev_open(bond_dev, NULL);
}
/* On bonding slaves other than the currently active slave, suppress
{
struct ksz_tagger_data *tagger_data;
- tagger_data = ksz_tagger_data(ds);
- tagger_data->xmit_work_fn = ksz_port_deferred_xmit;
-
- return 0;
+ switch (proto) {
+ case DSA_TAG_PROTO_KSZ8795:
+ return 0;
+ case DSA_TAG_PROTO_KSZ9893:
+ case DSA_TAG_PROTO_KSZ9477:
+ case DSA_TAG_PROTO_LAN937X:
+ tagger_data = ksz_tagger_data(ds);
+ tagger_data->xmit_work_fn = ksz_port_deferred_xmit;
+ return 0;
+ default:
+ return -EPROTONOSUPPORT;
+ }
}
static int ksz_port_vlan_filtering(struct dsa_switch *ds, int port,
config->mac_capabilities = MAC_SYM_PAUSE | MAC_10 | MAC_100;
}
+static void mv88e6351_phylink_get_caps(struct mv88e6xxx_chip *chip, int port,
+ struct phylink_config *config)
+{
+ unsigned long *supported = config->supported_interfaces;
+
+ /* Translate the default cmode */
+ mv88e6xxx_translate_cmode(chip->ports[port].cmode, supported);
+
+ config->mac_capabilities = MAC_SYM_PAUSE | MAC_10 | MAC_100 |
+ MAC_1000FD;
+}
+
static int mv88e6352_get_port4_serdes_cmode(struct mv88e6xxx_chip *chip)
{
u16 reg, val;
struct mv88e6xxx_chip *chip = ds->priv;
int err;
- if (chip->info->ops->pcs_ops->pcs_init) {
+ if (chip->info->ops->pcs_ops &&
+ chip->info->ops->pcs_ops->pcs_init) {
err = chip->info->ops->pcs_ops->pcs_init(chip, port);
if (err)
return err;
mv88e6xxx_teardown_devlink_regions_port(ds, port);
- if (chip->info->ops->pcs_ops->pcs_teardown)
+ if (chip->info->ops->pcs_ops &&
+ chip->info->ops->pcs_ops->pcs_teardown)
chip->info->ops->pcs_ops->pcs_teardown(chip, port);
}
.vtu_loadpurge = mv88e6352_g1_vtu_loadpurge,
.stu_getnext = mv88e6352_g1_stu_getnext,
.stu_loadpurge = mv88e6352_g1_stu_loadpurge,
- .phylink_get_caps = mv88e6185_phylink_get_caps,
+ .phylink_get_caps = mv88e6351_phylink_get_caps,
};
static const struct mv88e6xxx_ops mv88e6172_ops = {
.vtu_loadpurge = mv88e6352_g1_vtu_loadpurge,
.stu_getnext = mv88e6352_g1_stu_getnext,
.stu_loadpurge = mv88e6352_g1_stu_loadpurge,
- .phylink_get_caps = mv88e6185_phylink_get_caps,
+ .phylink_get_caps = mv88e6351_phylink_get_caps,
};
static const struct mv88e6xxx_ops mv88e6176_ops = {
.vtu_loadpurge = mv88e6352_g1_vtu_loadpurge,
.stu_getnext = mv88e6352_g1_stu_getnext,
.stu_loadpurge = mv88e6352_g1_stu_loadpurge,
- .phylink_get_caps = mv88e6185_phylink_get_caps,
+ .phylink_get_caps = mv88e6351_phylink_get_caps,
};
static const struct mv88e6xxx_ops mv88e6351_ops = {
.stu_loadpurge = mv88e6352_g1_stu_loadpurge,
.avb_ops = &mv88e6352_avb_ops,
.ptp_ops = &mv88e6352_ptp_ops,
- .phylink_get_caps = mv88e6185_phylink_get_caps,
+ .phylink_get_caps = mv88e6351_phylink_get_caps,
};
static const struct mv88e6xxx_ops mv88e6352_ops = {
case PHY_INTERFACE_MODE_10GBASER:
case PHY_INTERFACE_MODE_XAUI:
case PHY_INTERFACE_MODE_RXAUI:
+ case PHY_INTERFACE_MODE_USXGMII:
return &mpcs->xg_pcs;
default:
struct mv88e639x_pcs *mpcs = xg_pcs_to_mv88e639x_pcs(pcs);
int err;
- if (interface == PHY_INTERFACE_MODE_10GBASER) {
+ if (interface == PHY_INTERFACE_MODE_10GBASER ||
+ interface == PHY_INTERFACE_MODE_USXGMII) {
err = mv88e6393x_erratum_5_2(mpcs);
if (err)
return err;
return mv88e639x_xg_pcs_enable(mpcs);
}
+static void mv88e6393x_xg_pcs_get_state(struct phylink_pcs *pcs,
+ struct phylink_link_state *state)
+{
+ struct mv88e639x_pcs *mpcs = xg_pcs_to_mv88e639x_pcs(pcs);
+ u16 status, lp_status;
+ int err;
+
+ if (state->interface != PHY_INTERFACE_MODE_USXGMII)
+ return mv88e639x_xg_pcs_get_state(pcs, state);
+
+ state->link = false;
+
+ err = mv88e639x_read(mpcs, MV88E6390_USXGMII_PHY_STATUS, &status);
+ err = err ? : mv88e639x_read(mpcs, MV88E6390_USXGMII_LP_STATUS, &lp_status);
+ if (err) {
+ dev_err(mpcs->mdio.dev.parent,
+ "can't read USXGMII status: %pe\n", ERR_PTR(err));
+ return;
+ }
+
+ state->link = !!(status & MDIO_USXGMII_LINK);
+ state->an_complete = state->link;
+ phylink_decode_usxgmii_word(state, lp_status);
+}
+
static const struct phylink_pcs_ops mv88e6393x_xg_pcs_ops = {
.pcs_enable = mv88e6393x_xg_pcs_enable,
.pcs_disable = mv88e6393x_xg_pcs_disable,
.pcs_pre_config = mv88e6393x_xg_pcs_pre_config,
.pcs_post_config = mv88e6393x_xg_pcs_post_config,
- .pcs_get_state = mv88e639x_xg_pcs_get_state,
+ .pcs_get_state = mv88e6393x_xg_pcs_get_state,
.pcs_config = mv88e639x_xg_pcs_config,
};
* compare it to the stored version, just create the meta
*/
if (io_sq->disable_meta_caching) {
- if (unlikely(!ena_tx_ctx->meta_valid))
- return -EINVAL;
-
*have_meta = true;
return ena_com_create_meta(io_sq, ena_meta);
}
struct ena_tx_buffer *tx_info);
static int ena_create_io_tx_queues_in_range(struct ena_adapter *adapter,
int first_index, int count);
+static void ena_free_all_io_tx_resources_in_range(struct ena_adapter *adapter,
+ int first_index, int count);
/* Increase a stat by cnt while holding syncp seqlock on 32bit machines */
static void ena_increase_stat(u64 *statp, u64 cnt,
static int ena_setup_and_create_all_xdp_queues(struct ena_adapter *adapter)
{
+ u32 xdp_first_ring = adapter->xdp_first_ring;
+ u32 xdp_num_queues = adapter->xdp_num_queues;
int rc = 0;
- rc = ena_setup_tx_resources_in_range(adapter, adapter->xdp_first_ring,
- adapter->xdp_num_queues);
+ rc = ena_setup_tx_resources_in_range(adapter, xdp_first_ring, xdp_num_queues);
if (rc)
goto setup_err;
- rc = ena_create_io_tx_queues_in_range(adapter,
- adapter->xdp_first_ring,
- adapter->xdp_num_queues);
+ rc = ena_create_io_tx_queues_in_range(adapter, xdp_first_ring, xdp_num_queues);
if (rc)
goto create_err;
return 0;
create_err:
- ena_free_all_io_tx_resources(adapter);
+ ena_free_all_io_tx_resources_in_range(adapter, xdp_first_ring, xdp_num_queues);
setup_err:
return rc;
}
if (unlikely(!skb))
return NULL;
- /* sync this buffer for CPU use */
- dma_sync_single_for_cpu(rx_ring->dev,
- dma_unmap_addr(&rx_info->ena_buf, paddr) + pkt_offset,
- len,
- DMA_FROM_DEVICE);
skb_copy_to_linear_data(skb, buf_addr + buf_offset, len);
dma_sync_single_for_device(rx_ring->dev,
dma_unmap_addr(&rx_info->ena_buf, paddr) + pkt_offset,
buf_len = SKB_DATA_ALIGN(len + buf_offset + tailroom);
- pre_reuse_paddr = dma_unmap_addr(&rx_info->ena_buf, paddr);
-
/* If XDP isn't loaded try to reuse part of the RX buffer */
reuse_rx_buf_page = !is_xdp_loaded &&
ena_try_rx_buf_page_reuse(rx_info, buf_len, len, pkt_offset);
- dma_sync_single_for_cpu(rx_ring->dev,
- pre_reuse_paddr + pkt_offset,
- len,
- DMA_FROM_DEVICE);
-
if (!reuse_rx_buf_page)
ena_unmap_rx_buff_attrs(rx_ring, rx_info, DMA_ATTR_SKIP_CPU_SYNC);
}
}
-static int ena_xdp_handle_buff(struct ena_ring *rx_ring, struct xdp_buff *xdp)
+static int ena_xdp_handle_buff(struct ena_ring *rx_ring, struct xdp_buff *xdp, u16 num_descs)
{
struct ena_rx_buffer *rx_info;
int ret;
+ /* XDP multi-buffer packets not supported */
+ if (unlikely(num_descs > 1)) {
+ netdev_err_once(rx_ring->adapter->netdev,
+ "xdp: dropped unsupported multi-buffer packets\n");
+ ena_increase_stat(&rx_ring->rx_stats.xdp_drop, 1, &rx_ring->syncp);
+ return ENA_XDP_DROP;
+ }
+
rx_info = &rx_ring->rx_buffer_info[rx_ring->ena_bufs[0].req_id];
xdp_prepare_buff(xdp, page_address(rx_info->page),
rx_info->buf_offset,
rx_ring->ena_bufs[0].len, false);
- /* If for some reason we received a bigger packet than
- * we expect, then we simply drop it
- */
- if (unlikely(rx_ring->ena_bufs[0].len > ENA_XDP_MAX_MTU))
- return ENA_XDP_DROP;
ret = ena_xdp_execute(rx_ring, xdp);
int xdp_flags = 0;
int total_len = 0;
int xdp_verdict;
+ u8 pkt_offset;
int rc = 0;
int i;
/* First descriptor might have an offset set by the device */
rx_info = &rx_ring->rx_buffer_info[rx_ring->ena_bufs[0].req_id];
- rx_info->buf_offset += ena_rx_ctx.pkt_offset;
+ pkt_offset = ena_rx_ctx.pkt_offset;
+ rx_info->buf_offset += pkt_offset;
netif_dbg(rx_ring->adapter, rx_status, rx_ring->netdev,
"rx_poll: q %d got packet from ena. descs #: %d l3 proto %d l4 proto %d hash: %x\n",
rx_ring->qid, ena_rx_ctx.descs, ena_rx_ctx.l3_proto,
ena_rx_ctx.l4_proto, ena_rx_ctx.hash);
+ dma_sync_single_for_cpu(rx_ring->dev,
+ dma_unmap_addr(&rx_info->ena_buf, paddr) + pkt_offset,
+ rx_ring->ena_bufs[0].len,
+ DMA_FROM_DEVICE);
+
if (ena_xdp_present_ring(rx_ring))
- xdp_verdict = ena_xdp_handle_buff(rx_ring, &xdp);
+ xdp_verdict = ena_xdp_handle_buff(rx_ring, &xdp, ena_rx_ctx.descs);
/* allocate skb and fill it */
if (xdp_verdict == ENA_XDP_PASS)
if (xdp_verdict & ENA_XDP_FORWARDED) {
ena_unmap_rx_buff_attrs(rx_ring,
&rx_ring->rx_buffer_info[req_id],
- 0);
+ DMA_ATTR_SKIP_CPU_SYNC);
rx_ring->rx_buffer_info[req_id].page = NULL;
}
}
}
queue_work(pdsc->wq, &qcq->work);
- pds_core_intr_mask(&pdsc->intr_ctrl[irq], PDS_CORE_INTR_MASK_CLEAR);
+ pds_core_intr_mask(&pdsc->intr_ctrl[qcq->intx], PDS_CORE_INTR_MASK_CLEAR);
return IRQ_HANDLED;
}
#define PDSC_DRV_DESCRIPTION "AMD/Pensando Core Driver"
#define PDSC_WATCHDOG_SECS 5
-#define PDSC_QUEUE_NAME_MAX_SZ 32
+#define PDSC_QUEUE_NAME_MAX_SZ 16
#define PDSC_ADMINQ_MIN_LENGTH 16 /* must be a power of two */
#define PDSC_NOTIFYQ_LENGTH 64 /* must be a power of two */
#define PDSC_TEARDOWN_RECOVERY false
struct pds_core_drv_identity drv = {};
size_t sz;
int err;
+ int n;
drv.drv_type = cpu_to_le32(PDS_DRIVER_LINUX);
- snprintf(drv.driver_ver_str, sizeof(drv.driver_ver_str),
- "%s %s", PDS_CORE_DRV_NAME, utsname()->release);
+ /* Catching the return quiets a Wformat-truncation complaint */
+ n = snprintf(drv.driver_ver_str, sizeof(drv.driver_ver_str),
+ "%s %s", PDS_CORE_DRV_NAME, utsname()->release);
+ if (n > sizeof(drv.driver_ver_str))
+ dev_dbg(pdsc->dev, "release name truncated, don't care\n");
/* Next let's get some info about the device
* We use the devcmd_lock at this level in order to
struct pds_core_fw_list_info fw_list;
struct pdsc *pdsc = devlink_priv(dl);
union pds_core_dev_comp comp;
- char buf[16];
+ char buf[32];
int listlen;
int err;
int i;
static void xgbe_service_timer(struct timer_list *t)
{
struct xgbe_prv_data *pdata = from_timer(pdata, t, service_timer);
+ struct xgbe_channel *channel;
+ unsigned int i;
queue_work(pdata->dev_workqueue, &pdata->service_work);
mod_timer(&pdata->service_timer, jiffies + HZ);
+
+ if (!pdata->tx_usecs)
+ return;
+
+ for (i = 0; i < pdata->channel_count; i++) {
+ channel = pdata->channel[i];
+ if (!channel->tx_ring || channel->tx_timer_active)
+ break;
+ channel->tx_timer_active = 1;
+ mod_timer(&channel->tx_timer,
+ jiffies + usecs_to_jiffies(pdata->tx_usecs));
+ }
}
static void xgbe_init_timers(struct xgbe_prv_data *pdata)
cmd->base.phy_address = pdata->phy.address;
- cmd->base.autoneg = pdata->phy.autoneg;
- cmd->base.speed = pdata->phy.speed;
- cmd->base.duplex = pdata->phy.duplex;
+ if (netif_carrier_ok(netdev)) {
+ cmd->base.speed = pdata->phy.speed;
+ cmd->base.duplex = pdata->phy.duplex;
+ } else {
+ cmd->base.speed = SPEED_UNKNOWN;
+ cmd->base.duplex = DUPLEX_UNKNOWN;
+ }
+ cmd->base.autoneg = pdata->phy.autoneg;
cmd->base.port = PORT_NONE;
XGBE_LM_COPY(cmd, supported, lks, supported);
if (pdata->phy.duplex != DUPLEX_FULL)
return -EINVAL;
- xgbe_set_mode(pdata, mode);
+ /* Force the mode change for SFI in Fixed PHY config.
+ * Fixed PHY configs needs PLL to be enabled while doing mode set.
+ * When the SFP module isn't connected during boot, driver assumes
+ * AN is ON and attempts autonegotiation. However, if the connected
+ * SFP comes up in Fixed PHY config, the link will not come up as
+ * PLL isn't enabled while the initial mode set command is issued.
+ * So, force the mode change for SFI in Fixed PHY configuration to
+ * fix link issues.
+ */
+ if (mode == XGBE_MODE_SFI)
+ xgbe_change_mode(pdata, mode);
+ else
+ xgbe_set_mode(pdata, mode);
return 0;
}
/* aq_ptp_rx_hwtstamp - utility function which checks for RX time stamp
* @adapter: pointer to adapter struct
- * @skb: particular skb to send timestamp with
+ * @shhwtstamps: particular skb_shared_hwtstamps to save timestamp
*
* if the timestamp is valid, we convert it into the timecounter ns
* value, then store that result into the hwtstamps structure which
* is passed up the network stack
*/
-static void aq_ptp_rx_hwtstamp(struct aq_ptp_s *aq_ptp, struct sk_buff *skb,
+static void aq_ptp_rx_hwtstamp(struct aq_ptp_s *aq_ptp, struct skb_shared_hwtstamps *shhwtstamps,
u64 timestamp)
{
timestamp -= atomic_read(&aq_ptp->offset_ingress);
- aq_ptp_convert_to_hwtstamp(aq_ptp, skb_hwtstamps(skb), timestamp);
+ aq_ptp_convert_to_hwtstamp(aq_ptp, shhwtstamps, timestamp);
}
void aq_ptp_hwtstamp_config_get(struct aq_ptp_s *aq_ptp,
&aq_ptp->ptp_rx == ring || &aq_ptp->hwts_rx == ring;
}
-u16 aq_ptp_extract_ts(struct aq_nic_s *aq_nic, struct sk_buff *skb, u8 *p,
+u16 aq_ptp_extract_ts(struct aq_nic_s *aq_nic, struct skb_shared_hwtstamps *shhwtstamps, u8 *p,
unsigned int len)
{
struct aq_ptp_s *aq_ptp = aq_nic->aq_ptp;
p, len, ×tamp);
if (ret > 0)
- aq_ptp_rx_hwtstamp(aq_ptp, skb, timestamp);
+ aq_ptp_rx_hwtstamp(aq_ptp, shhwtstamps, timestamp);
return ret;
}
/* Return either ring is belong to PTP or not*/
bool aq_ptp_ring(struct aq_nic_s *aq_nic, struct aq_ring_s *ring);
-u16 aq_ptp_extract_ts(struct aq_nic_s *aq_nic, struct sk_buff *skb, u8 *p,
+u16 aq_ptp_extract_ts(struct aq_nic_s *aq_nic, struct skb_shared_hwtstamps *shhwtstamps, u8 *p,
unsigned int len);
struct ptp_clock *aq_ptp_get_ptp_clock(struct aq_ptp_s *aq_ptp);
}
static inline u16 aq_ptp_extract_ts(struct aq_nic_s *aq_nic,
- struct sk_buff *skb, u8 *p,
+ struct skb_shared_hwtstamps *shhwtstamps, u8 *p,
unsigned int len)
{
return 0;
}
if (is_ptp_ring)
buff->len -=
- aq_ptp_extract_ts(self->aq_nic, skb,
+ aq_ptp_extract_ts(self->aq_nic, skb_hwtstamps(skb),
aq_buf_vaddr(&buff->rxdata),
buff->len);
struct aq_ring_buff_s *buff = &rx_ring->buff_ring[rx_ring->sw_head];
bool is_ptp_ring = aq_ptp_ring(rx_ring->aq_nic, rx_ring);
struct aq_ring_buff_s *buff_ = NULL;
+ u16 ptp_hwtstamp_len = 0;
+ struct skb_shared_hwtstamps shhwtstamps;
struct sk_buff *skb = NULL;
unsigned int next_ = 0U;
struct xdp_buff xdp;
hard_start = page_address(buff->rxdata.page) +
buff->rxdata.pg_off - rx_ring->page_offset;
- if (is_ptp_ring)
- buff->len -=
- aq_ptp_extract_ts(rx_ring->aq_nic, skb,
- aq_buf_vaddr(&buff->rxdata),
- buff->len);
+ if (is_ptp_ring) {
+ ptp_hwtstamp_len = aq_ptp_extract_ts(rx_ring->aq_nic, &shhwtstamps,
+ aq_buf_vaddr(&buff->rxdata),
+ buff->len);
+ buff->len -= ptp_hwtstamp_len;
+ }
xdp_init_buff(&xdp, frame_sz, &rx_ring->xdp_rxq);
xdp_prepare_buff(&xdp, hard_start, rx_ring->page_offset,
if (IS_ERR(skb) || !skb)
continue;
+ if (ptp_hwtstamp_len > 0)
+ *skb_hwtstamps(skb) = shhwtstamps;
+
if (buff->is_vlan)
__vlan_hwaccel_put_tag(skb, htons(ETH_P_8021Q),
buff->vlan_rx_tag);
return;
kfree(self->buff_ring);
+ self->buff_ring = NULL;
- if (self->dx_ring)
+ if (self->dx_ring) {
dma_free_coherent(aq_nic_get_dev(self->aq_nic),
self->size * self->dx_size, self->dx_ring,
self->dx_ring_pa);
+ self->dx_ring = NULL;
+ }
}
unsigned int aq_ring_fill_stats_data(struct aq_ring_s *self, u64 *data)
netdev_err(adapter->netdev, "offset(%d) > ring size(%d) !!\n",
offset, adapter->ring_size);
err = -1;
- goto failed;
+ goto free_buffer;
}
return 0;
+free_buffer:
+ kfree(tx_ring->tx_buffer);
+ tx_ring->tx_buffer = NULL;
failed:
if (adapter->ring_vir_addr != NULL) {
dma_free_coherent(&pdev->dev, adapter->ring_size,
static void bnxt_deliver_skb(struct bnxt *bp, struct bnxt_napi *bnapi,
struct sk_buff *skb)
{
+ skb_mark_for_recycle(skb);
+
if (skb->dev != bp->dev) {
/* this packet belongs to a vf-rep */
bnxt_vf_rep_rx(bp, skb);
return;
}
skb_record_rx_queue(skb, bnapi->index);
- skb_mark_for_recycle(skb);
napi_gro_receive(&bnapi->napi, skb);
}
+static bool bnxt_rx_ts_valid(struct bnxt *bp, u32 flags,
+ struct rx_cmp_ext *rxcmp1, u32 *cmpl_ts)
+{
+ u32 ts = le32_to_cpu(rxcmp1->rx_cmp_timestamp);
+
+ if (BNXT_PTP_RX_TS_VALID(flags))
+ goto ts_valid;
+ if (!bp->ptp_all_rx_tstamp || !ts || !BNXT_ALL_RX_TS_VALID(flags))
+ return false;
+
+ts_valid:
+ *cmpl_ts = ts;
+ return true;
+}
+
/* returns the following:
* 1 - 1 packet successfully received
* 0 - successful TPA_START, packet not completed yet
struct sk_buff *skb;
struct xdp_buff xdp;
u32 flags, misc;
+ u32 cmpl_ts;
void *data;
int rc = 0;
}
}
- if (unlikely((flags & RX_CMP_FLAGS_ITYPES_MASK) ==
- RX_CMP_FLAGS_ITYPE_PTP_W_TS) || bp->ptp_all_rx_tstamp) {
+ if (bnxt_rx_ts_valid(bp, flags, rxcmp1, &cmpl_ts)) {
if (bp->flags & BNXT_FLAG_CHIP_P5) {
- u32 cmpl_ts = le32_to_cpu(rxcmp1->rx_cmp_timestamp);
u64 ns, ts;
if (!bnxt_get_rx_ts_p5(bp, &ts, cmpl_ts)) {
bnxt_free_mem(bp, irq_re_init);
}
-int bnxt_close_nic(struct bnxt *bp, bool irq_re_init, bool link_re_init)
+void bnxt_close_nic(struct bnxt *bp, bool irq_re_init, bool link_re_init)
{
- int rc = 0;
-
if (test_bit(BNXT_STATE_IN_FW_RESET, &bp->state)) {
/* If we get here, it means firmware reset is in progress
* while we are trying to close. We can safely proceed with
#ifdef CONFIG_BNXT_SRIOV
if (bp->sriov_cfg) {
+ int rc;
+
rc = wait_event_interruptible_timeout(bp->sriov_cfg_wait,
!bp->sriov_cfg,
BNXT_SRIOV_CFG_WAIT_TMO);
- if (rc)
- netdev_warn(bp->dev, "timeout waiting for SRIOV config operation to complete!\n");
+ if (!rc)
+ netdev_warn(bp->dev, "timeout waiting for SRIOV config operation to complete, proceeding to close!\n");
+ else if (rc < 0)
+ netdev_warn(bp->dev, "SRIOV config operation interrupted, proceeding to close!\n");
}
#endif
__bnxt_close_nic(bp, irq_re_init, link_re_init);
- return rc;
}
static int bnxt_close(struct net_device *dev)
if (rc)
goto resume_exit;
+ bnxt_clear_reservations(bp, true);
+
if (bnxt_hwrm_func_drv_rgtr(bp, NULL, 0, false)) {
rc = -ENODEV;
goto resume_exit;
#define RX_CMP_FLAGS_ERROR (1 << 6)
#define RX_CMP_FLAGS_PLACEMENT (7 << 7)
#define RX_CMP_FLAGS_RSS_VALID (1 << 10)
- #define RX_CMP_FLAGS_UNUSED (1 << 11)
+ #define RX_CMP_FLAGS_PKT_METADATA_PRESENT (1 << 11)
#define RX_CMP_FLAGS_ITYPES_SHIFT 12
#define RX_CMP_FLAGS_ITYPES_MASK 0xf000
#define RX_CMP_FLAGS_ITYPE_UNKNOWN (0 << 12)
__le32 rx_cmp_rss_hash;
};
+#define BNXT_PTP_RX_TS_VALID(flags) \
+ (((flags) & RX_CMP_FLAGS_ITYPES_MASK) == RX_CMP_FLAGS_ITYPE_PTP_W_TS)
+
+#define BNXT_ALL_RX_TS_VALID(flags) \
+ !((flags) & RX_CMP_FLAGS_PKT_METADATA_PRESENT)
+
#define RX_CMP_HASH_VALID(rxcmp) \
((rxcmp)->rx_cmp_len_flags_type & cpu_to_le32(RX_CMP_FLAGS_RSS_VALID))
int bnxt_half_open_nic(struct bnxt *bp);
void bnxt_half_close_nic(struct bnxt *bp);
void bnxt_reenable_sriov(struct bnxt *bp);
-int bnxt_close_nic(struct bnxt *, bool, bool);
+void bnxt_close_nic(struct bnxt *, bool, bool);
void bnxt_get_ring_err_stats(struct bnxt *bp,
struct bnxt_total_ring_err_stats *stats);
int bnxt_dbg_hwrm_rd_reg(struct bnxt *bp, u32 reg_off, u16 num_words,
return -ENODEV;
}
bnxt_ulp_stop(bp);
- if (netif_running(bp->dev)) {
- rc = bnxt_close_nic(bp, true, true);
- if (rc) {
- NL_SET_ERR_MSG_MOD(extack, "Failed to close");
- dev_close(bp->dev);
- rtnl_unlock();
- break;
- }
- }
+ if (netif_running(bp->dev))
+ bnxt_close_nic(bp, true, true);
bnxt_vf_reps_free(bp);
rc = bnxt_hwrm_func_drv_unrgtr(bp);
if (rc) {
reset_coalesce:
if (test_bit(BNXT_STATE_OPEN, &bp->state)) {
if (update_stats) {
- rc = bnxt_close_nic(bp, true, false);
- if (!rc)
- rc = bnxt_open_nic(bp, true, false);
+ bnxt_close_nic(bp, true, false);
+ rc = bnxt_open_nic(bp, true, false);
} else {
rc = bnxt_hwrm_set_coal(bp);
}
* before PF unload
*/
}
- rc = bnxt_close_nic(bp, true, false);
- if (rc) {
- netdev_err(bp->dev, "Set channel failure rc :%x\n",
- rc);
- return rc;
- }
+ bnxt_close_nic(bp, true, false);
}
if (sh) {
bnxt_run_fw_tests(bp, test_mask, &test_results);
} else {
bnxt_ulp_stop(bp);
- rc = bnxt_close_nic(bp, true, false);
- if (rc) {
- etest->flags |= ETH_TEST_FL_FAILED;
- bnxt_ulp_start(bp, rc);
- return;
- }
+ bnxt_close_nic(bp, true, false);
bnxt_run_fw_tests(bp, test_mask, &test_results);
buf[BNXT_MACLPBK_TEST_IDX] = 1;
if (netif_running(bp->dev)) {
if (ptp->rx_filter == HWTSTAMP_FILTER_ALL) {
- rc = bnxt_close_nic(bp, false, false);
- if (!rc)
- rc = bnxt_open_nic(bp, false, false);
+ bnxt_close_nic(bp, false, false);
+ rc = bnxt_open_nic(bp, false, false);
} else {
bnxt_ptp_cfg_tstamp_filters(bp);
}
rhashtable_destroy(&tc_info->flow_table);
free_tc_info:
kfree(tc_info);
+ bp->tc_info = NULL;
return rc;
}
for (i = 0; i < num_frags ; i++) {
skb_frag_t *frag = &sinfo->frags[i];
struct bnxt_sw_tx_bd *frag_tx_buf;
- struct pci_dev *pdev = bp->pdev;
dma_addr_t frag_mapping;
int frag_len;
txbd = &txr->tx_desc_ring[TX_RING(prod)][TX_IDX(prod)];
frag_len = skb_frag_size(frag);
- frag_mapping = skb_frag_dma_map(&pdev->dev, frag, 0,
- frag_len, DMA_TO_DEVICE);
-
- if (unlikely(dma_mapping_error(&pdev->dev, frag_mapping)))
- return NULL;
-
- dma_unmap_addr_set(frag_tx_buf, mapping, frag_mapping);
-
flags = frag_len << TX_BD_LEN_SHIFT;
txbd->tx_bd_len_flags_type = cpu_to_le32(flags);
+ frag_mapping = page_pool_get_dma_addr(skb_frag_page(frag)) +
+ skb_frag_off(frag);
txbd->tx_bd_haddr = cpu_to_le64(frag_mapping);
len = frag_len;
int i;
u32 *regs;
+ /* If it is a PCI error, all registers will be 0xffff,
+ * we don't dump them out, just report the error and return
+ */
+ if (tp->pdev->error_state != pci_channel_io_normal) {
+ netdev_err(tp->dev, "PCI channel ERROR!\n");
+ return;
+ }
+
regs = kzalloc(TG3_REG_BLK_SIZE, GFP_ATOMIC);
if (!regs)
return;
desc_idx, *post_ptr);
drop_it_no_recycle:
/* Other statistics kept track of by card. */
- tp->rx_dropped++;
+ tnapi->rx_dropped++;
goto next_pkt;
}
segs = skb_gso_segment(skb, tp->dev->features &
~(NETIF_F_TSO | NETIF_F_TSO6));
- if (IS_ERR(segs) || !segs)
+ if (IS_ERR(segs) || !segs) {
+ tnapi->tx_dropped++;
goto tg3_tso_bug_end;
+ }
skb_list_walk_safe(segs, seg, next) {
skb_mark_not_on_list(seg);
drop:
dev_kfree_skb_any(skb);
drop_nofree:
- tp->tx_dropped++;
+ tnapi->tx_dropped++;
return NETDEV_TX_OK;
}
/* tp->lock is held. */
static int tg3_halt(struct tg3 *tp, int kind, bool silent)
{
- int err;
+ int err, i;
tg3_stop_fw(tp);
/* And make sure the next sample is new data */
memset(tp->hw_stats, 0, sizeof(struct tg3_hw_stats));
+
+ for (i = 0; i < TG3_IRQ_MAX_VECS; ++i) {
+ struct tg3_napi *tnapi = &tp->napi[i];
+
+ tnapi->rx_dropped = 0;
+ tnapi->tx_dropped = 0;
+ }
}
return err;
rtnl_lock();
tg3_full_lock(tp, 0);
- if (tp->pcierr_recovery || !netif_running(tp->dev)) {
+ if (tp->pcierr_recovery || !netif_running(tp->dev) ||
+ tp->pdev->error_state != pci_channel_io_normal) {
tg3_flag_clear(tp, RESET_TASK_PENDING);
tg3_full_unlock(tp);
rtnl_unlock();
{
struct rtnl_link_stats64 *old_stats = &tp->net_stats_prev;
struct tg3_hw_stats *hw_stats = tp->hw_stats;
+ unsigned long rx_dropped;
+ unsigned long tx_dropped;
+ int i;
stats->rx_packets = old_stats->rx_packets +
get_stat64(&hw_stats->rx_ucast_packets) +
stats->rx_missed_errors = old_stats->rx_missed_errors +
get_stat64(&hw_stats->rx_discards);
- stats->rx_dropped = tp->rx_dropped;
- stats->tx_dropped = tp->tx_dropped;
+ /* Aggregate per-queue counters. The per-queue counters are updated
+ * by a single writer, race-free. The result computed by this loop
+ * might not be 100% accurate (counters can be updated in the middle of
+ * the loop) but the next tg3_get_nstats() will recompute the current
+ * value so it is acceptable.
+ *
+ * Note that these counters wrap around at 4G on 32bit machines.
+ */
+ rx_dropped = (unsigned long)(old_stats->rx_dropped);
+ tx_dropped = (unsigned long)(old_stats->tx_dropped);
+
+ for (i = 0; i < tp->irq_cnt; i++) {
+ struct tg3_napi *tnapi = &tp->napi[i];
+
+ rx_dropped += tnapi->rx_dropped;
+ tx_dropped += tnapi->tx_dropped;
+ }
+
+ stats->rx_dropped = rx_dropped;
+ stats->tx_dropped = tx_dropped;
}
static int tg3_get_regs_len(struct net_device *dev)
u16 *rx_rcb_prod_idx;
struct tg3_rx_prodring_set prodring;
struct tg3_rx_buffer_desc *rx_rcb;
+ unsigned long rx_dropped;
u32 tx_prod ____cacheline_aligned;
u32 tx_cons;
u32 prodmbox;
struct tg3_tx_buffer_desc *tx_ring;
struct tg3_tx_ring_info *tx_buffers;
+ unsigned long tx_dropped;
dma_addr_t status_mapping;
dma_addr_t rx_rcb_mapping;
/* begin "everything else" cacheline(s) section */
- unsigned long rx_dropped;
- unsigned long tx_dropped;
struct rtnl_link_stats64 net_stats_prev;
struct tg3_ethtool_stats estats_prev;
.val = CONFIG0_MAXLEN_1536,
},
{
- .max_l3_len = 1542,
- .val = CONFIG0_MAXLEN_1542,
+ .max_l3_len = 1548,
+ .val = CONFIG0_MAXLEN_1548,
},
{
.max_l3_len = 9212,
dma_addr_t mapping;
unsigned short mtu;
void *buffer;
+ int ret;
mtu = ETH_HLEN;
mtu += netdev->mtu;
word3 |= mtu;
}
- if (skb->ip_summed != CHECKSUM_NONE) {
+ if (skb->len >= ETH_FRAME_LEN) {
+ /* Hardware offloaded checksumming isn't working on frames
+ * bigger than 1514 bytes. A hypothesis about this is that the
+ * checksum buffer is only 1518 bytes, so when the frames get
+ * bigger they get truncated, or the last few bytes get
+ * overwritten by the FCS.
+ *
+ * Just use software checksumming and bypass on bigger frames.
+ */
+ if (skb->ip_summed == CHECKSUM_PARTIAL) {
+ ret = skb_checksum_help(skb);
+ if (ret)
+ return ret;
+ }
+ word1 |= TSS_BYPASS_BIT;
+ } else if (skb->ip_summed == CHECKSUM_PARTIAL) {
int tcp = 0;
+ /* We do not switch off the checksumming on non TCP/UDP
+ * frames: as is shown from tests, the checksumming engine
+ * is smart enough to see that a frame is not actually TCP
+ * or UDP and then just pass it through without any changes
+ * to the frame.
+ */
if (skb->protocol == htons(ETH_P_IP)) {
word1 |= TSS_IP_CHKSUM_BIT;
tcp = ip_hdr(skb)->protocol == IPPROTO_TCP;
return 0;
}
-static netdev_features_t gmac_fix_features(struct net_device *netdev,
- netdev_features_t features)
-{
- if (netdev->mtu + ETH_HLEN + VLAN_HLEN > MTU_SIZE_BIT_MASK)
- features &= ~GMAC_OFFLOAD_FEATURES;
-
- return features;
-}
-
static int gmac_set_features(struct net_device *netdev,
netdev_features_t features)
{
.ndo_set_mac_address = gmac_set_mac_address,
.ndo_get_stats64 = gmac_get_stats64,
.ndo_change_mtu = gmac_change_mtu,
- .ndo_fix_features = gmac_fix_features,
.ndo_set_features = gmac_set_features,
};
netdev->hw_features = GMAC_OFFLOAD_FEATURES;
netdev->features |= GMAC_OFFLOAD_FEATURES | NETIF_F_GRO;
- /* We can handle jumbo frames up to 10236 bytes so, let's accept
- * payloads of 10236 bytes minus VLAN and ethernet header
+ /* We can receive jumbo frames up to 10236 bytes but only
+ * transmit 2047 bytes so, let's accept payloads of 2047
+ * bytes minus VLAN and ethernet header
*/
netdev->min_mtu = ETH_MIN_MTU;
- netdev->max_mtu = 10236 - VLAN_ETH_HLEN;
+ netdev->max_mtu = MTU_SIZE_BIT_MASK - VLAN_ETH_HLEN;
port->freeq_refill = 0;
netif_napi_add(netdev, &port->napi, gmac_napi_poll);
#define SOF_BIT 0x80000000
#define EOF_BIT 0x40000000
#define EOFIE_BIT BIT(29)
-#define MTU_SIZE_BIT_MASK 0x1fff
+#define MTU_SIZE_BIT_MASK 0x7ff /* Max MTU 2047 bytes */
/* GMAC Tx Descriptor */
struct gmac_txdesc {
#define CONFIG0_MAXLEN_1536 0
#define CONFIG0_MAXLEN_1518 1
#define CONFIG0_MAXLEN_1522 2
-#define CONFIG0_MAXLEN_1542 3
+#define CONFIG0_MAXLEN_1548 3
#define CONFIG0_MAXLEN_9k 4 /* 9212 */
#define CONFIG0_MAXLEN_10k 5 /* 10236 */
#define CONFIG0_MAXLEN_1518__6 6
memcpy(skb->data, fd_vaddr + fd_offset, fd_length);
- dpaa2_eth_recycle_buf(priv, ch, dpaa2_fd_get_addr(fd));
-
return skb;
}
struct rtnl_link_stats64 *percpu_stats;
struct dpaa2_eth_drv_stats *percpu_extras;
struct device *dev = priv->net_dev->dev.parent;
+ bool recycle_rx_buf = false;
void *buf_data;
u32 xdp_act;
dma_unmap_page(dev, addr, priv->rx_buf_size,
DMA_BIDIRECTIONAL);
skb = dpaa2_eth_build_linear_skb(ch, fd, vaddr);
+ } else {
+ recycle_rx_buf = true;
}
} else if (fd_format == dpaa2_fd_sg) {
WARN_ON(priv->xdp_prog);
goto err_build_skb;
dpaa2_eth_receive_skb(priv, ch, fd, vaddr, fq, percpu_stats, skb);
+
+ if (recycle_rx_buf)
+ dpaa2_eth_recycle_buf(priv, ch, dpaa2_fd_get_addr(fd));
return;
err_build_skb:
dma_addr_t addr;
buffer_start = skb->data - dpaa2_eth_needed_headroom(skb);
-
- /* If there's enough room to align the FD address, do it.
- * It will help hardware optimize accesses.
- */
aligned_start = PTR_ALIGN(buffer_start - DPAA2_ETH_TX_BUF_ALIGN,
DPAA2_ETH_TX_BUF_ALIGN);
if (aligned_start >= skb->head)
buffer_start = aligned_start;
+ else
+ return -ENOMEM;
/* Store a backpointer to the skb at the beginning of the buffer
* (in the private data area) such that we can release it
if (err)
goto err_dl_port_add;
+ net_dev->needed_headroom = DPAA2_ETH_SWA_SIZE + DPAA2_ETH_TX_BUF_ALIGN;
+
err = register_netdev(net_dev);
if (err < 0) {
dev_err(dev, "register_netdev() failed\n");
static inline unsigned int dpaa2_eth_needed_headroom(struct sk_buff *skb)
{
- unsigned int headroom = DPAA2_ETH_SWA_SIZE;
+ unsigned int headroom = DPAA2_ETH_SWA_SIZE + DPAA2_ETH_TX_BUF_ALIGN;
/* If we don't have an skb (e.g. XDP buffer), we only need space for
* the software annotation area
err = dpsw_acl_add_entry(ethsw->mc_io, 0, ethsw->dpsw_handle,
filter_block->acl_id, acl_entry_cfg);
- dma_unmap_single(dev, acl_entry_cfg->key_iova, sizeof(cmd_buff),
+ dma_unmap_single(dev, acl_entry_cfg->key_iova,
+ DPAA2_ETHSW_PORT_ACL_CMD_BUF_SIZE,
DMA_TO_DEVICE);
if (err) {
dev_err(dev, "dpsw_acl_add_entry() failed %d\n", err);
err = dpsw_acl_remove_entry(ethsw->mc_io, 0, ethsw->dpsw_handle,
block->acl_id, acl_entry_cfg);
- dma_unmap_single(dev, acl_entry_cfg->key_iova, sizeof(cmd_buff),
- DMA_TO_DEVICE);
+ dma_unmap_single(dev, acl_entry_cfg->key_iova,
+ DPAA2_ETHSW_PORT_ACL_CMD_BUF_SIZE, DMA_TO_DEVICE);
if (err) {
dev_err(dev, "dpsw_acl_remove_entry() failed %d\n", err);
kfree(cmd_buff);
return notifier_from_errno(err);
}
-static struct notifier_block dpaa2_switch_port_switchdev_nb;
-static struct notifier_block dpaa2_switch_port_switchdev_blocking_nb;
-
static int dpaa2_switch_port_bridge_join(struct net_device *netdev,
struct net_device *upper_dev,
struct netlink_ext_ack *extack)
goto err_egress_flood;
err = switchdev_bridge_port_offload(netdev, netdev, NULL,
- &dpaa2_switch_port_switchdev_nb,
- &dpaa2_switch_port_switchdev_blocking_nb,
- false, extack);
+ NULL, NULL, false, extack);
if (err)
goto err_switchdev_offload;
static void dpaa2_switch_port_pre_bridge_leave(struct net_device *netdev)
{
- switchdev_bridge_port_unoffload(netdev, NULL,
- &dpaa2_switch_port_switchdev_nb,
- &dpaa2_switch_port_switchdev_blocking_nb);
+ switchdev_bridge_port_unoffload(netdev, NULL, NULL, NULL);
}
static int dpaa2_switch_port_bridge_leave(struct net_device *netdev)
return 0;
}
-static u16 fec_enet_get_raw_vlan_tci(struct sk_buff *skb)
-{
- struct vlan_ethhdr *vhdr;
- unsigned short vlan_TCI = 0;
-
- if (skb->protocol == htons(ETH_P_ALL)) {
- vhdr = (struct vlan_ethhdr *)(skb->data);
- vlan_TCI = ntohs(vhdr->h_vlan_TCI);
- }
-
- return vlan_TCI;
-}
-
static u16 fec_enet_select_queue(struct net_device *ndev, struct sk_buff *skb,
struct net_device *sb_dev)
{
struct fec_enet_private *fep = netdev_priv(ndev);
- u16 vlan_tag;
+ u16 vlan_tag = 0;
if (!(fep->quirks & FEC_QUIRK_HAS_AVB))
return netdev_pick_tx(ndev, skb, NULL);
- vlan_tag = fec_enet_get_raw_vlan_tci(skb);
- if (!vlan_tag)
+ /* VLAN is present in the payload.*/
+ if (eth_type_vlan(skb->protocol)) {
+ struct vlan_ethhdr *vhdr = skb_vlan_eth_hdr(skb);
+
+ vlan_tag = ntohs(vhdr->h_vlan_TCI);
+ /* VLAN is present in the skb but not yet pushed in the payload.*/
+ } else if (skb_vlan_tag_present(skb)) {
+ vlan_tag = skb->vlan_tci;
+ } else {
return vlan_tag;
+ }
return fec_enet_vlan_pri_to_queue[vlan_tag >> 13];
}
if (block->tx) {
if (block->tx->q_num < priv->tx_cfg.num_queues)
reschedule |= gve_tx_poll(block, budget);
- else
+ else if (budget)
reschedule |= gve_xdp_poll(block, budget);
}
+ if (!budget)
+ return 0;
+
if (block->rx) {
work_done = gve_rx_poll(block, budget);
reschedule |= work_done == budget;
if (block->tx)
reschedule |= gve_tx_poll_dqo(block, /*do_clean=*/true);
+ if (!budget)
+ return 0;
+
if (block->rx) {
work_done = gve_rx_poll_dqo(block, budget);
reschedule |= work_done == budget;
feat = block->napi.dev->features;
- /* If budget is 0, do all the work */
- if (budget == 0)
- budget = INT_MAX;
-
if (budget > 0)
work_done = gve_clean_rx_done(rx, budget, feat);
bool repoll;
u32 to_do;
- /* If budget is 0, do all the work */
- if (budget == 0)
- budget = INT_MAX;
-
/* Find out how much work there is to be done */
nic_done = gve_tx_load_event_counter(priv, tx);
to_do = min_t(u32, (nic_done - tx->done), budget);
}
}
+static u32 hns_mac_link_anti_shake(struct mac_driver *mac_ctrl_drv)
+{
+#define HNS_MAC_LINK_WAIT_TIME 5
+#define HNS_MAC_LINK_WAIT_CNT 40
+
+ u32 link_status = 0;
+ int i;
+
+ if (!mac_ctrl_drv->get_link_status)
+ return link_status;
+
+ for (i = 0; i < HNS_MAC_LINK_WAIT_CNT; i++) {
+ msleep(HNS_MAC_LINK_WAIT_TIME);
+ mac_ctrl_drv->get_link_status(mac_ctrl_drv, &link_status);
+ if (!link_status)
+ break;
+ }
+
+ return link_status;
+}
+
void hns_mac_get_link_status(struct hns_mac_cb *mac_cb, u32 *link_status)
{
struct mac_driver *mac_ctrl_drv;
&sfp_prsnt);
if (!ret)
*link_status = *link_status && sfp_prsnt;
+
+ /* for FIBER port, it may have a fake link up.
+ * when the link status changes from down to up, we need to do
+ * anti-shake. the anti-shake time is base on tests.
+ * only FIBER port need to do this.
+ */
+ if (*link_status && !mac_cb->link)
+ *link_status = hns_mac_link_anti_shake(mac_ctrl_drv);
}
mac_cb->link = *link_status;
static void fill_desc(struct hnae_ring *ring, void *priv,
int size, dma_addr_t dma, int frag_end,
- int buf_num, enum hns_desc_type type, int mtu)
+ int buf_num, enum hns_desc_type type, int mtu,
+ bool is_gso)
{
struct hnae_desc *desc = &ring->desc[ring->next_to_use];
struct hnae_desc_cb *desc_cb = &ring->desc_cb[ring->next_to_use];
return 0;
}
+static int hns_nic_maybe_stop_tx_v2(struct sk_buff **out_skb, int *bnum,
+ struct hnae_ring *ring)
+{
+ if (skb_is_gso(*out_skb))
+ return hns_nic_maybe_stop_tso(out_skb, bnum, ring);
+ else
+ return hns_nic_maybe_stop_tx(out_skb, bnum, ring);
+}
+
static void fill_tso_desc(struct hnae_ring *ring, void *priv,
int size, dma_addr_t dma, int frag_end,
int buf_num, enum hns_desc_type type, int mtu)
mtu);
}
+static void fill_desc_v2(struct hnae_ring *ring, void *priv,
+ int size, dma_addr_t dma, int frag_end,
+ int buf_num, enum hns_desc_type type, int mtu,
+ bool is_gso)
+{
+ if (is_gso)
+ fill_tso_desc(ring, priv, size, dma, frag_end, buf_num, type,
+ mtu);
+ else
+ fill_v2_desc(ring, priv, size, dma, frag_end, buf_num, type,
+ mtu);
+}
+
netdev_tx_t hns_nic_net_xmit_hw(struct net_device *ndev,
struct sk_buff *skb,
struct hns_nic_ring_data *ring_data)
int seg_num;
dma_addr_t dma;
int size, next_to_use;
+ bool is_gso;
int i;
switch (priv->ops.maybe_stop_tx(&skb, &buf_num, ring)) {
ring->stats.sw_err_cnt++;
goto out_err_tx_ok;
}
+ is_gso = skb_is_gso(skb);
priv->ops.fill_desc(ring, skb, size, dma, seg_num == 1 ? 1 : 0,
- buf_num, DESC_TYPE_SKB, ndev->mtu);
+ buf_num, DESC_TYPE_SKB, ndev->mtu, is_gso);
/* fill the fragments */
for (i = 1; i < seg_num; i++) {
}
priv->ops.fill_desc(ring, skb_frag_page(frag), size, dma,
seg_num - 1 == i ? 1 : 0, buf_num,
- DESC_TYPE_PAGE, ndev->mtu);
+ DESC_TYPE_PAGE, ndev->mtu, is_gso);
}
/*complete translate all packets*/
netdev_info(netdev, "enet v1 do not support tso!\n");
break;
default:
- if (features & (NETIF_F_TSO | NETIF_F_TSO6)) {
- priv->ops.fill_desc = fill_tso_desc;
- priv->ops.maybe_stop_tx = hns_nic_maybe_stop_tso;
- /* The chip only support 7*4096 */
- netif_set_tso_max_size(netdev, 7 * 4096);
- } else {
- priv->ops.fill_desc = fill_v2_desc;
- priv->ops.maybe_stop_tx = hns_nic_maybe_stop_tx;
- }
break;
}
netdev->features = features;
priv->ops.maybe_stop_tx = hns_nic_maybe_stop_tx;
} else {
priv->ops.get_rxd_bnum = get_v2rx_desc_bnum;
- if ((netdev->features & NETIF_F_TSO) ||
- (netdev->features & NETIF_F_TSO6)) {
- priv->ops.fill_desc = fill_tso_desc;
- priv->ops.maybe_stop_tx = hns_nic_maybe_stop_tso;
- /* This chip only support 7*4096 */
- netif_set_tso_max_size(netdev, 7 * 4096);
- } else {
- priv->ops.fill_desc = fill_v2_desc;
- priv->ops.maybe_stop_tx = hns_nic_maybe_stop_tx;
- }
+ priv->ops.fill_desc = fill_desc_v2;
+ priv->ops.maybe_stop_tx = hns_nic_maybe_stop_tx_v2;
+ netif_set_tso_max_size(netdev, 7 * 4096);
/* enable tso when init
* control tso on/off through TSE bit in bd
*/
struct hns_nic_ops {
void (*fill_desc)(struct hnae_ring *ring, void *priv,
int size, dma_addr_t dma, int frag_end,
- int buf_num, enum hns_desc_type type, int mtu);
+ int buf_num, enum hns_desc_type type, int mtu,
+ bool is_gso);
int (*maybe_stop_tx)(struct sk_buff **out_skb,
int *bnum, struct hnae_ring *ring);
void (*get_rxd_bnum)(u32 bnum_flag, int *out_bnum);
}
sprintf(result[j++], "%d", i);
- sprintf(result[j++], "%s", dim_state_str[dim->state]);
+ sprintf(result[j++], "%s", dim->state < ARRAY_SIZE(dim_state_str) ?
+ dim_state_str[dim->state] : "unknown");
sprintf(result[j++], "%u", dim->profile_ix);
- sprintf(result[j++], "%s", dim_cqe_mode_str[dim->mode]);
+ sprintf(result[j++], "%s", dim->mode < ARRAY_SIZE(dim_cqe_mode_str) ?
+ dim_cqe_mode_str[dim->mode] : "unknown");
sprintf(result[j++], "%s",
- dim_tune_stat_str[dim->tune_state]);
+ dim->tune_state < ARRAY_SIZE(dim_tune_stat_str) ?
+ dim_tune_stat_str[dim->tune_state] : "unknown");
sprintf(result[j++], "%u", dim->steps_left);
sprintf(result[j++], "%u", dim->steps_right);
sprintf(result[j++], "%u", dim->tired);
struct hns3_nic_priv *priv = netdev_priv(netdev);
char format_mac_addr[HNAE3_FORMAT_MAC_ADDR_LEN];
struct hnae3_handle *h = priv->ae_handle;
- u8 mac_addr_temp[ETH_ALEN];
+ u8 mac_addr_temp[ETH_ALEN] = {0};
int ret = 0;
if (h->ae_algo->ops->get_mac_addr)
static void hclge_update_fec_stats(struct hclge_dev *hdev);
static int hclge_mac_link_status_wait(struct hclge_dev *hdev, int link_ret,
int wait_cnt);
+static int hclge_update_port_info(struct hclge_dev *hdev);
static struct hnae3_ae_algo ae_algo;
if (state != hdev->hw.mac.link) {
hdev->hw.mac.link = state;
+ if (state == HCLGE_LINK_STATUS_UP)
+ hclge_update_port_info(hdev);
+
client->ops->link_status_change(handle, state);
hclge_config_mac_tnl_int(hdev, state);
if (rclient && rclient->ops->link_status_change)
struct hclge_vport_vlan_cfg *vlan, *tmp;
struct hclge_dev *hdev = vport->back;
- mutex_lock(&hdev->vport_lock);
-
list_for_each_entry_safe(vlan, tmp, &vport->vlan_list, node) {
if (vlan->vlan_id == vlan_id) {
if (is_write_tbl && vlan->hd_tbl_status)
break;
}
}
-
- mutex_unlock(&hdev->vport_lock);
}
void hclge_rm_vport_all_vlan_table(struct hclge_vport *vport, bool is_del_list)
* handle mailbox. Just record the vlan id, and remove it after
* reset finished.
*/
+ mutex_lock(&hdev->vport_lock);
if ((test_bit(HCLGE_STATE_RST_HANDLING, &hdev->state) ||
test_bit(HCLGE_STATE_RST_FAIL, &hdev->state)) && is_kill) {
set_bit(vlan_id, vport->vlan_del_fail_bmap);
+ mutex_unlock(&hdev->vport_lock);
return -EBUSY;
+ } else if (!is_kill && test_bit(vlan_id, vport->vlan_del_fail_bmap)) {
+ clear_bit(vlan_id, vport->vlan_del_fail_bmap);
}
+ mutex_unlock(&hdev->vport_lock);
/* when port base vlan enabled, we use port base vlan as the vlan
* filter entry. In this case, we don't update vlan filter table
}
if (!ret) {
- if (!is_kill)
+ if (!is_kill) {
hclge_add_vport_vlan_table(vport, vlan_id,
writen_to_tbl);
- else if (is_kill && vlan_id != 0)
+ } else if (is_kill && vlan_id != 0) {
+ mutex_lock(&hdev->vport_lock);
hclge_rm_vport_vlan_table(vport, vlan_id, false);
+ mutex_unlock(&hdev->vport_lock);
+ }
} else if (is_kill) {
/* when remove hw vlan filter failed, record the vlan id,
* and try to remove it from hw later, to be consistence
* with stack
*/
+ mutex_lock(&hdev->vport_lock);
set_bit(vlan_id, vport->vlan_del_fail_bmap);
+ mutex_unlock(&hdev->vport_lock);
}
hclge_set_vport_vlan_fltr_change(vport);
int i, ret, sync_cnt = 0;
u16 vlan_id;
+ mutex_lock(&hdev->vport_lock);
/* start from vport 1 for PF is always alive */
for (i = 0; i < hdev->num_alloc_vport; i++) {
struct hclge_vport *vport = &hdev->vport[i];
ret = hclge_set_vlan_filter_hw(hdev, htons(ETH_P_8021Q),
vport->vport_id, vlan_id,
true);
- if (ret && ret != -EINVAL)
+ if (ret && ret != -EINVAL) {
+ mutex_unlock(&hdev->vport_lock);
return;
+ }
clear_bit(vlan_id, vport->vlan_del_fail_bmap);
hclge_rm_vport_vlan_table(vport, vlan_id, false);
hclge_set_vport_vlan_fltr_change(vport);
sync_cnt++;
- if (sync_cnt >= HCLGE_MAX_SYNC_COUNT)
+ if (sync_cnt >= HCLGE_MAX_SYNC_COUNT) {
+ mutex_unlock(&hdev->vport_lock);
return;
+ }
vlan_id = find_first_bit(vport->vlan_del_fail_bmap,
VLAN_N_VID);
}
}
+ mutex_unlock(&hdev->vport_lock);
hclge_sync_vlan_fltr_state(hdev);
}
goto err_msi_irq_uninit;
if (hdev->hw.mac.media_type == HNAE3_MEDIA_TYPE_COPPER) {
+ clear_bit(HNAE3_DEV_SUPPORT_FEC_B, ae_dev->caps);
if (hnae3_dev_phy_imp_supported(hdev))
ret = hclge_update_tp_port_info(hdev);
else
test_bit(HCLGEVF_STATE_RST_FAIL, &hdev->state)) && is_kill) {
set_bit(vlan_id, hdev->vlan_del_fail_bmap);
return -EBUSY;
+ } else if (!is_kill && test_bit(vlan_id, hdev->vlan_del_fail_bmap)) {
+ clear_bit(vlan_id, hdev->vlan_del_fail_bmap);
}
hclgevf_build_send_msg(&send_msg, HCLGE_MBX_SET_VLAN,
int ret, sync_cnt = 0;
u16 vlan_id;
+ if (bitmap_empty(hdev->vlan_del_fail_bmap, VLAN_N_VID))
+ return;
+
+ rtnl_lock();
vlan_id = find_first_bit(hdev->vlan_del_fail_bmap, VLAN_N_VID);
while (vlan_id != VLAN_N_VID) {
ret = hclgevf_set_vlan_filter(handle, htons(ETH_P_8021Q),
vlan_id, true);
if (ret)
- return;
+ break;
clear_bit(vlan_id, hdev->vlan_del_fail_bmap);
sync_cnt++;
if (sync_cnt >= HCLGEVF_MAX_SYNC_COUNT)
- return;
+ break;
vlan_id = find_first_bit(hdev->vlan_del_fail_bmap, VLAN_N_VID);
}
+ rtnl_unlock();
}
static int hclgevf_en_hw_strip_rxvtag(struct hnae3_handle *handle, bool enable)
return HCLGEVF_VECTOR0_EVENT_OTHER;
}
+static void hclgevf_reset_timer(struct timer_list *t)
+{
+ struct hclgevf_dev *hdev = from_timer(hdev, t, reset_timer);
+
+ hclgevf_clear_event_cause(hdev, HCLGEVF_VECTOR0_EVENT_RST);
+ hclgevf_reset_task_schedule(hdev);
+}
+
static irqreturn_t hclgevf_misc_irq_handle(int irq, void *data)
{
+#define HCLGEVF_RESET_DELAY 5
+
enum hclgevf_evt_cause event_cause;
struct hclgevf_dev *hdev = data;
u32 clearval;
switch (event_cause) {
case HCLGEVF_VECTOR0_EVENT_RST:
- hclgevf_reset_task_schedule(hdev);
+ mod_timer(&hdev->reset_timer,
+ jiffies + msecs_to_jiffies(HCLGEVF_RESET_DELAY));
break;
case HCLGEVF_VECTOR0_EVENT_MBX:
hclgevf_mbx_handler(hdev);
HCLGEVF_DRIVER_NAME);
hclgevf_task_schedule(hdev, round_jiffies_relative(HZ));
+ timer_setup(&hdev->reset_timer, hclgevf_reset_timer, 0);
return 0;
enum hnae3_reset_type reset_level;
unsigned long reset_pending;
enum hnae3_reset_type reset_type;
+ struct timer_list reset_timer;
#define HCLGEVF_RESET_REQUESTED 0
#define HCLGEVF_RESET_PENDING 1
i++;
}
+ /* ensure additional_info will be seen after received_resp */
+ smp_rmb();
+
if (i >= HCLGEVF_MAX_TRY_TIMES) {
dev_err(&hdev->pdev->dev,
"VF could not get mbx(%u,%u) resp(=%d) from PF in %d tries\n",
resp->resp_status = hclgevf_resp_to_errno(resp_status);
memcpy(resp->additional_info, req->msg.resp_data,
HCLGE_MBX_MAX_RESP_DATA_SIZE * sizeof(u8));
+
+ /* ensure additional_info will be seen before setting received_resp */
+ smp_wmb();
+
if (match_id) {
/* If match_id is not zero, it means PF support match_id.
* if the match_id is right, VF get the right response, or
I40E_PRTGL_SAH_MFS_MASK) >> I40E_PRTGL_SAH_MFS_SHIFT;
if (val < MAX_FRAME_SIZE_DEFAULT)
dev_warn(&pdev->dev, "MFS for port %x has been set below the default: %x\n",
- i, val);
+ pf->hw.port, val);
/* Add a filter to drop all Flow control frames from any VSI from being
* transmitted. By doing so we stop a malicious VF from sending out
#define I40E_GLGEN_MSCA_OPCODE_SHIFT 26
#define I40E_GLGEN_MSCA_OPCODE_MASK(_i) I40E_MASK(_i, I40E_GLGEN_MSCA_OPCODE_SHIFT)
#define I40E_GLGEN_MSCA_STCODE_SHIFT 28
-#define I40E_GLGEN_MSCA_STCODE_MASK I40E_MASK(0x1, I40E_GLGEN_MSCA_STCODE_SHIFT)
+#define I40E_GLGEN_MSCA_STCODE_MASK(_i) I40E_MASK(_i, I40E_GLGEN_MSCA_STCODE_SHIFT)
#define I40E_GLGEN_MSCA_MDICMD_SHIFT 30
#define I40E_GLGEN_MSCA_MDICMD_MASK I40E_MASK(0x1, I40E_GLGEN_MSCA_MDICMD_SHIFT)
#define I40E_GLGEN_MSCA_MDIINPROGEN_SHIFT 31
#define I40E_QTX_CTL_VM_QUEUE 0x1
#define I40E_QTX_CTL_PF_QUEUE 0x2
-#define I40E_MDIO_CLAUSE22_STCODE_MASK I40E_GLGEN_MSCA_STCODE_MASK
+#define I40E_MDIO_CLAUSE22_STCODE_MASK I40E_GLGEN_MSCA_STCODE_MASK(1)
#define I40E_MDIO_CLAUSE22_OPCODE_WRITE_MASK I40E_GLGEN_MSCA_OPCODE_MASK(1)
#define I40E_MDIO_CLAUSE22_OPCODE_READ_MASK I40E_GLGEN_MSCA_OPCODE_MASK(2)
-#define I40E_MDIO_CLAUSE45_STCODE_MASK I40E_GLGEN_MSCA_STCODE_MASK
+#define I40E_MDIO_CLAUSE45_STCODE_MASK I40E_GLGEN_MSCA_STCODE_MASK(0)
#define I40E_MDIO_CLAUSE45_OPCODE_ADDRESS_MASK I40E_GLGEN_MSCA_OPCODE_MASK(0)
#define I40E_MDIO_CLAUSE45_OPCODE_WRITE_MASK I40E_GLGEN_MSCA_OPCODE_MASK(1)
#define I40E_MDIO_CLAUSE45_OPCODE_READ_MASK I40E_GLGEN_MSCA_OPCODE_MASK(3)
struct i40e_pf *pf = vf->pf;
struct i40e_vsi *vsi = NULL;
int aq_ret = 0;
- int i, ret;
+ int i;
if (!i40e_sync_vf_state(vf, I40E_VF_STATE_ACTIVE)) {
aq_ret = -EINVAL;
}
cfilter = kzalloc(sizeof(*cfilter), GFP_KERNEL);
- if (!cfilter)
- return -ENOMEM;
+ if (!cfilter) {
+ aq_ret = -ENOMEM;
+ goto err_out;
+ }
/* parse destination mac address */
for (i = 0; i < ETH_ALEN; i++)
/* Adding cloud filter programmed as TC filter */
if (tcf.dst_port)
- ret = i40e_add_del_cloud_filter_big_buf(vsi, cfilter, true);
+ aq_ret = i40e_add_del_cloud_filter_big_buf(vsi, cfilter, true);
else
- ret = i40e_add_del_cloud_filter(vsi, cfilter, true);
- if (ret) {
+ aq_ret = i40e_add_del_cloud_filter(vsi, cfilter, true);
+ if (aq_ret) {
dev_err(&pf->pdev->dev,
"VF %d: Failed to add cloud filter, err %pe aq_err %s\n",
- vf->vf_id, ERR_PTR(ret),
+ vf->vf_id, ERR_PTR(aq_ret),
i40e_aq_str(&pf->hw, pf->hw.aq.asq_last_status));
goto err_free;
}
#define IAVF_FLAG_QUEUES_DISABLED BIT(17)
#define IAVF_FLAG_SETUP_NETDEV_FEATURES BIT(18)
#define IAVF_FLAG_REINIT_MSIX_NEEDED BIT(20)
+#define IAVF_FLAG_FDIR_ENABLED BIT(21)
/* duplicates for common code */
#define IAVF_FLAG_DCB_ENABLED 0
/* flags for admin queue service task */
struct iavf_adapter *adapter = netdev_priv(netdev);
int i;
- if (ec->rx_coalesce_usecs == 0) {
- if (ec->use_adaptive_rx_coalesce)
- netif_info(adapter, drv, netdev, "rx-usecs=0, need to disable adaptive-rx for a complete disable\n");
- } else if ((ec->rx_coalesce_usecs < IAVF_MIN_ITR) ||
- (ec->rx_coalesce_usecs > IAVF_MAX_ITR)) {
+ if (ec->rx_coalesce_usecs > IAVF_MAX_ITR) {
netif_info(adapter, drv, netdev, "Invalid value, rx-usecs range is 0-8160\n");
return -EINVAL;
- } else if (ec->tx_coalesce_usecs == 0) {
- if (ec->use_adaptive_tx_coalesce)
- netif_info(adapter, drv, netdev, "tx-usecs=0, need to disable adaptive-tx for a complete disable\n");
- } else if ((ec->tx_coalesce_usecs < IAVF_MIN_ITR) ||
- (ec->tx_coalesce_usecs > IAVF_MAX_ITR)) {
+ } else if (ec->tx_coalesce_usecs > IAVF_MAX_ITR) {
netif_info(adapter, drv, netdev, "Invalid value, tx-usecs range is 0-8160\n");
return -EINVAL;
}
struct iavf_fdir_fltr *rule = NULL;
int ret = 0;
- if (!FDIR_FLTR_SUPPORT(adapter))
+ if (!(adapter->flags & IAVF_FLAG_FDIR_ENABLED))
return -EOPNOTSUPP;
spin_lock_bh(&adapter->fdir_fltr_lock);
unsigned int cnt = 0;
int val = 0;
- if (!FDIR_FLTR_SUPPORT(adapter))
+ if (!(adapter->flags & IAVF_FLAG_FDIR_ENABLED))
return -EOPNOTSUPP;
cmd->data = IAVF_MAX_FDIR_FILTERS;
int count = 50;
int err;
- if (!FDIR_FLTR_SUPPORT(adapter))
+ if (!(adapter->flags & IAVF_FLAG_FDIR_ENABLED))
return -EOPNOTSUPP;
if (fsp->flow_type & FLOW_MAC_EXT)
spin_lock_bh(&adapter->fdir_fltr_lock);
iavf_fdir_list_add_fltr(adapter, fltr);
adapter->fdir_active_fltr++;
- fltr->state = IAVF_FDIR_FLTR_ADD_REQUEST;
- adapter->aq_required |= IAVF_FLAG_AQ_ADD_FDIR_FILTER;
+ if (adapter->link_up) {
+ fltr->state = IAVF_FDIR_FLTR_ADD_REQUEST;
+ adapter->aq_required |= IAVF_FLAG_AQ_ADD_FDIR_FILTER;
+ } else {
+ fltr->state = IAVF_FDIR_FLTR_INACTIVE;
+ }
spin_unlock_bh(&adapter->fdir_fltr_lock);
- mod_delayed_work(adapter->wq, &adapter->watchdog_task, 0);
-
+ if (adapter->link_up)
+ mod_delayed_work(adapter->wq, &adapter->watchdog_task, 0);
ret:
if (err && fltr)
kfree(fltr);
struct iavf_fdir_fltr *fltr = NULL;
int err = 0;
- if (!FDIR_FLTR_SUPPORT(adapter))
+ if (!(adapter->flags & IAVF_FLAG_FDIR_ENABLED))
return -EOPNOTSUPP;
spin_lock_bh(&adapter->fdir_fltr_lock);
if (fltr->state == IAVF_FDIR_FLTR_ACTIVE) {
fltr->state = IAVF_FDIR_FLTR_DEL_REQUEST;
adapter->aq_required |= IAVF_FLAG_AQ_DEL_FDIR_FILTER;
+ } else if (fltr->state == IAVF_FDIR_FLTR_INACTIVE) {
+ list_del(&fltr->list);
+ kfree(fltr);
+ adapter->fdir_active_fltr--;
+ fltr = NULL;
} else {
err = -EBUSY;
}
ret = 0;
break;
case ETHTOOL_GRXCLSRLCNT:
- if (!FDIR_FLTR_SUPPORT(adapter))
+ if (!(adapter->flags & IAVF_FLAG_FDIR_ENABLED))
break;
spin_lock_bh(&adapter->fdir_fltr_lock);
cmd->rule_cnt = adapter->fdir_active_fltr;
struct iavf_adapter;
-/* State of Flow Director filter */
+/* State of Flow Director filter
+ *
+ * *_REQUEST states are used to mark filter to be sent to PF driver to perform
+ * an action (either add or delete filter). *_PENDING states are an indication
+ * that request was sent to PF and the driver is waiting for response.
+ *
+ * Both DELETE and DISABLE states are being used to delete a filter in PF.
+ * The difference is that after a successful response filter in DEL_PENDING
+ * state is being deleted from VF driver as well and filter in DIS_PENDING state
+ * is being changed to INACTIVE state.
+ */
enum iavf_fdir_fltr_state_t {
IAVF_FDIR_FLTR_ADD_REQUEST, /* User requests to add filter */
IAVF_FDIR_FLTR_ADD_PENDING, /* Filter pending add by the PF */
IAVF_FDIR_FLTR_DEL_REQUEST, /* User requests to delete filter */
IAVF_FDIR_FLTR_DEL_PENDING, /* Filter pending delete by the PF */
+ IAVF_FDIR_FLTR_DIS_REQUEST, /* Filter scheduled to be disabled */
+ IAVF_FDIR_FLTR_DIS_PENDING, /* Filter pending disable by the PF */
+ IAVF_FDIR_FLTR_INACTIVE, /* Filter inactive on link down */
IAVF_FDIR_FLTR_ACTIVE, /* Filter is active */
};
kfree(mem->va);
}
-/**
- * iavf_lock_timeout - try to lock mutex but give up after timeout
- * @lock: mutex that should be locked
- * @msecs: timeout in msecs
- *
- * Returns 0 on success, negative on failure
- **/
-static int iavf_lock_timeout(struct mutex *lock, unsigned int msecs)
-{
- unsigned int wait, delay = 10;
-
- for (wait = 0; wait < msecs; wait += delay) {
- if (mutex_trylock(lock))
- return 0;
-
- msleep(delay);
- }
-
- return -1;
-}
-
/**
* iavf_schedule_reset - Set the flags and schedule a reset event
* @adapter: board private structure
**/
static void iavf_clear_fdir_filters(struct iavf_adapter *adapter)
{
- struct iavf_fdir_fltr *fdir, *fdirtmp;
+ struct iavf_fdir_fltr *fdir;
/* remove all Flow Director filters */
spin_lock_bh(&adapter->fdir_fltr_lock);
- list_for_each_entry_safe(fdir, fdirtmp, &adapter->fdir_list_head,
- list) {
+ list_for_each_entry(fdir, &adapter->fdir_list_head, list) {
if (fdir->state == IAVF_FDIR_FLTR_ADD_REQUEST) {
- list_del(&fdir->list);
- kfree(fdir);
- adapter->fdir_active_fltr--;
- } else {
- fdir->state = IAVF_FDIR_FLTR_DEL_REQUEST;
+ /* Cancel a request, keep filter as inactive */
+ fdir->state = IAVF_FDIR_FLTR_INACTIVE;
+ } else if (fdir->state == IAVF_FDIR_FLTR_ADD_PENDING ||
+ fdir->state == IAVF_FDIR_FLTR_ACTIVE) {
+ /* Disable filters which are active or have a pending
+ * request to PF to be added
+ */
+ fdir->state = IAVF_FDIR_FLTR_DIS_REQUEST;
}
}
spin_unlock_bh(&adapter->fdir_fltr_lock);
}
}
+/**
+ * iavf_restore_fdir_filters
+ * @adapter: board private structure
+ *
+ * Restore existing FDIR filters when VF netdev comes back up.
+ **/
+static void iavf_restore_fdir_filters(struct iavf_adapter *adapter)
+{
+ struct iavf_fdir_fltr *f;
+
+ spin_lock_bh(&adapter->fdir_fltr_lock);
+ list_for_each_entry(f, &adapter->fdir_list_head, list) {
+ if (f->state == IAVF_FDIR_FLTR_DIS_REQUEST) {
+ /* Cancel a request, keep filter as active */
+ f->state = IAVF_FDIR_FLTR_ACTIVE;
+ } else if (f->state == IAVF_FDIR_FLTR_DIS_PENDING ||
+ f->state == IAVF_FDIR_FLTR_INACTIVE) {
+ /* Add filters which are inactive or have a pending
+ * request to PF to be deleted
+ */
+ f->state = IAVF_FDIR_FLTR_ADD_REQUEST;
+ adapter->aq_required |= IAVF_FLAG_AQ_ADD_FDIR_FILTER;
+ }
+ }
+ spin_unlock_bh(&adapter->fdir_fltr_lock);
+}
+
/**
* iavf_open - Called when a network interface is made active
* @netdev: network interface device structure
spin_unlock_bh(&adapter->mac_vlan_list_lock);
- /* Restore VLAN filters that were removed with IFF_DOWN */
+ /* Restore filters that were removed with IFF_DOWN */
iavf_restore_filters(adapter);
+ iavf_restore_fdir_filters(adapter);
iavf_configure(adapter);
return ret;
}
+/**
+ * iavf_disable_fdir - disable Flow Director and clear existing filters
+ * @adapter: board private structure
+ **/
+static void iavf_disable_fdir(struct iavf_adapter *adapter)
+{
+ struct iavf_fdir_fltr *fdir, *fdirtmp;
+ bool del_filters = false;
+
+ adapter->flags &= ~IAVF_FLAG_FDIR_ENABLED;
+
+ /* remove all Flow Director filters */
+ spin_lock_bh(&adapter->fdir_fltr_lock);
+ list_for_each_entry_safe(fdir, fdirtmp, &adapter->fdir_list_head,
+ list) {
+ if (fdir->state == IAVF_FDIR_FLTR_ADD_REQUEST ||
+ fdir->state == IAVF_FDIR_FLTR_INACTIVE) {
+ /* Delete filters not registered in PF */
+ list_del(&fdir->list);
+ kfree(fdir);
+ adapter->fdir_active_fltr--;
+ } else if (fdir->state == IAVF_FDIR_FLTR_ADD_PENDING ||
+ fdir->state == IAVF_FDIR_FLTR_DIS_REQUEST ||
+ fdir->state == IAVF_FDIR_FLTR_ACTIVE) {
+ /* Filters registered in PF, schedule their deletion */
+ fdir->state = IAVF_FDIR_FLTR_DEL_REQUEST;
+ del_filters = true;
+ } else if (fdir->state == IAVF_FDIR_FLTR_DIS_PENDING) {
+ /* Request to delete filter already sent to PF, change
+ * state to DEL_PENDING to delete filter after PF's
+ * response, not set as INACTIVE
+ */
+ fdir->state = IAVF_FDIR_FLTR_DEL_PENDING;
+ }
+ }
+ spin_unlock_bh(&adapter->fdir_fltr_lock);
+
+ if (del_filters) {
+ adapter->aq_required |= IAVF_FLAG_AQ_DEL_FDIR_FILTER;
+ mod_delayed_work(adapter->wq, &adapter->watchdog_task, 0);
+ }
+}
+
#define NETIF_VLAN_OFFLOAD_FEATURES (NETIF_F_HW_VLAN_CTAG_RX | \
NETIF_F_HW_VLAN_CTAG_TX | \
NETIF_F_HW_VLAN_STAG_RX | \
((netdev->features & NETIF_F_RXFCS) ^ (features & NETIF_F_RXFCS)))
iavf_schedule_reset(adapter, IAVF_FLAG_RESET_NEEDED);
+ if ((netdev->features & NETIF_F_NTUPLE) ^ (features & NETIF_F_NTUPLE)) {
+ if (features & NETIF_F_NTUPLE)
+ adapter->flags |= IAVF_FLAG_FDIR_ENABLED;
+ else
+ iavf_disable_fdir(adapter);
+ }
+
return 0;
}
features = iavf_fix_netdev_vlan_features(adapter, features);
+ if (!FDIR_FLTR_SUPPORT(adapter))
+ features &= ~NETIF_F_NTUPLE;
+
return iavf_fix_strip_features(adapter, features);
}
if (vfres->vf_cap_flags & VIRTCHNL_VF_OFFLOAD_VLAN)
netdev->features |= NETIF_F_HW_VLAN_CTAG_FILTER;
+ if (FDIR_FLTR_SUPPORT(adapter)) {
+ netdev->hw_features |= NETIF_F_NTUPLE;
+ netdev->features |= NETIF_F_NTUPLE;
+ adapter->flags |= IAVF_FLAG_FDIR_ENABLED;
+ }
+
netdev->priv_flags |= IFF_UNICAST_FLT;
/* Do not turn on offloads when they are requested to be turned off.
return 0;
}
-/**
- * iavf_shutdown - Shutdown the device in preparation for a reboot
- * @pdev: pci device structure
- **/
-static void iavf_shutdown(struct pci_dev *pdev)
-{
- struct iavf_adapter *adapter = iavf_pdev_to_adapter(pdev);
- struct net_device *netdev = adapter->netdev;
-
- netif_device_detach(netdev);
-
- if (netif_running(netdev))
- iavf_close(netdev);
-
- if (iavf_lock_timeout(&adapter->crit_lock, 5000))
- dev_warn(&adapter->pdev->dev, "%s: failed to acquire crit_lock\n", __func__);
- /* Prevent the watchdog from running. */
- iavf_change_state(adapter, __IAVF_REMOVE);
- adapter->aq_required = 0;
- mutex_unlock(&adapter->crit_lock);
-
-#ifdef CONFIG_PM
- pci_save_state(pdev);
-
-#endif
- pci_disable_device(pdev);
-}
-
/**
* iavf_probe - Device Initialization Routine
* @pdev: PCI device information struct
**/
static void iavf_remove(struct pci_dev *pdev)
{
- struct iavf_adapter *adapter = iavf_pdev_to_adapter(pdev);
struct iavf_fdir_fltr *fdir, *fdirtmp;
struct iavf_vlan_filter *vlf, *vlftmp;
struct iavf_cloud_filter *cf, *cftmp;
struct iavf_adv_rss *rss, *rsstmp;
struct iavf_mac_filter *f, *ftmp;
+ struct iavf_adapter *adapter;
struct net_device *netdev;
struct iavf_hw *hw;
- netdev = adapter->netdev;
+ /* Don't proceed with remove if netdev is already freed */
+ netdev = pci_get_drvdata(pdev);
+ if (!netdev)
+ return;
+
+ adapter = iavf_pdev_to_adapter(pdev);
hw = &adapter->hw;
if (test_and_set_bit(__IAVF_IN_REMOVE_TASK, &adapter->crit_section))
destroy_workqueue(adapter->wq);
+ pci_set_drvdata(pdev, NULL);
+
free_netdev(netdev);
pci_disable_device(pdev);
}
+/**
+ * iavf_shutdown - Shutdown the device in preparation for a reboot
+ * @pdev: pci device structure
+ **/
+static void iavf_shutdown(struct pci_dev *pdev)
+{
+ iavf_remove(pdev);
+
+ if (system_state == SYSTEM_POWER_OFF)
+ pci_set_power_state(pdev, PCI_D3hot);
+}
+
static SIMPLE_DEV_PM_OPS(iavf_pm_ops, iavf_suspend, iavf_resume);
static struct pci_driver iavf_driver = {
*/
#define IAVF_ITR_DYNAMIC 0x8000 /* use top bit as a flag */
#define IAVF_ITR_MASK 0x1FFE /* mask for ITR register value */
-#define IAVF_MIN_ITR 2 /* reg uses 2 usec resolution */
#define IAVF_ITR_100K 10 /* all values below must be even */
#define IAVF_ITR_50K 20
#define IAVF_ITR_20K 50
**/
void iavf_del_fdir_filter(struct iavf_adapter *adapter)
{
+ struct virtchnl_fdir_del f = {};
struct iavf_fdir_fltr *fdir;
- struct virtchnl_fdir_del f;
bool process_fltr = false;
int len;
list_for_each_entry(fdir, &adapter->fdir_list_head, list) {
if (fdir->state == IAVF_FDIR_FLTR_DEL_REQUEST) {
process_fltr = true;
- memset(&f, 0, len);
f.vsi_id = fdir->vc_add_msg.vsi_id;
f.flow_id = fdir->flow_id;
fdir->state = IAVF_FDIR_FLTR_DEL_PENDING;
break;
+ } else if (fdir->state == IAVF_FDIR_FLTR_DIS_REQUEST) {
+ process_fltr = true;
+ f.vsi_id = fdir->vc_add_msg.vsi_id;
+ f.flow_id = fdir->flow_id;
+ fdir->state = IAVF_FDIR_FLTR_DIS_PENDING;
+ break;
}
}
spin_unlock_bh(&adapter->fdir_fltr_lock);
netdev->features &= ~NETIF_F_HW_VLAN_CTAG_RX;
}
+/**
+ * iavf_activate_fdir_filters - Reactivate all FDIR filters after a reset
+ * @adapter: private adapter structure
+ *
+ * Called after a reset to re-add all FDIR filters and delete some of them
+ * if they were pending to be deleted.
+ */
+static void iavf_activate_fdir_filters(struct iavf_adapter *adapter)
+{
+ struct iavf_fdir_fltr *f, *ftmp;
+ bool add_filters = false;
+
+ spin_lock_bh(&adapter->fdir_fltr_lock);
+ list_for_each_entry_safe(f, ftmp, &adapter->fdir_list_head, list) {
+ if (f->state == IAVF_FDIR_FLTR_ADD_REQUEST ||
+ f->state == IAVF_FDIR_FLTR_ADD_PENDING ||
+ f->state == IAVF_FDIR_FLTR_ACTIVE) {
+ /* All filters and requests have been removed in PF,
+ * restore them
+ */
+ f->state = IAVF_FDIR_FLTR_ADD_REQUEST;
+ add_filters = true;
+ } else if (f->state == IAVF_FDIR_FLTR_DIS_REQUEST ||
+ f->state == IAVF_FDIR_FLTR_DIS_PENDING) {
+ /* Link down state, leave filters as inactive */
+ f->state = IAVF_FDIR_FLTR_INACTIVE;
+ } else if (f->state == IAVF_FDIR_FLTR_DEL_REQUEST ||
+ f->state == IAVF_FDIR_FLTR_DEL_PENDING) {
+ /* Delete filters that were pending to be deleted, the
+ * list on PF is already cleared after a reset
+ */
+ list_del(&f->list);
+ kfree(f);
+ adapter->fdir_active_fltr--;
+ }
+ }
+ spin_unlock_bh(&adapter->fdir_fltr_lock);
+
+ if (add_filters)
+ adapter->aq_required |= IAVF_FLAG_AQ_ADD_FDIR_FILTER;
+}
+
/**
* iavf_virtchnl_completion
* @adapter: adapter structure
spin_lock_bh(&adapter->fdir_fltr_lock);
list_for_each_entry(fdir, &adapter->fdir_list_head,
list) {
- if (fdir->state == IAVF_FDIR_FLTR_DEL_PENDING) {
+ if (fdir->state == IAVF_FDIR_FLTR_DEL_PENDING ||
+ fdir->state == IAVF_FDIR_FLTR_DIS_PENDING) {
fdir->state = IAVF_FDIR_FLTR_ACTIVE;
dev_info(&adapter->pdev->dev, "Failed to del Flow Director filter, error %s\n",
iavf_stat_str(&adapter->hw,
spin_unlock_bh(&adapter->mac_vlan_list_lock);
+ iavf_activate_fdir_filters(adapter);
+
iavf_parse_vf_resource_msg(adapter);
/* negotiated VIRTCHNL_VF_OFFLOAD_VLAN_V2, so wait for the
list_for_each_entry_safe(fdir, fdir_tmp, &adapter->fdir_list_head,
list) {
if (fdir->state == IAVF_FDIR_FLTR_DEL_PENDING) {
- if (del_fltr->status == VIRTCHNL_FDIR_SUCCESS) {
+ if (del_fltr->status == VIRTCHNL_FDIR_SUCCESS ||
+ del_fltr->status ==
+ VIRTCHNL_FDIR_FAILURE_RULE_NONEXIST) {
dev_info(&adapter->pdev->dev, "Flow Director filter with location %u is deleted\n",
fdir->loc);
list_del(&fdir->list);
del_fltr->status);
iavf_print_fdir_fltr(adapter, fdir);
}
+ } else if (fdir->state == IAVF_FDIR_FLTR_DIS_PENDING) {
+ if (del_fltr->status == VIRTCHNL_FDIR_SUCCESS ||
+ del_fltr->status ==
+ VIRTCHNL_FDIR_FAILURE_RULE_NONEXIST) {
+ fdir->state = IAVF_FDIR_FLTR_INACTIVE;
+ } else {
+ fdir->state = IAVF_FDIR_FLTR_ACTIVE;
+ dev_info(&adapter->pdev->dev, "Failed to disable Flow Director filter with status: %d\n",
+ del_fltr->status);
+ iavf_print_fdir_fltr(adapter, fdir);
+ }
}
}
spin_unlock_bh(&adapter->fdir_fltr_lock);
}
/**
- * ice_download_pkg
+ * ice_download_pkg_with_sig_seg
* @hw: pointer to the hardware structure
* @pkg_hdr: pointer to package header
*
* Handles the download of a complete package.
*/
static enum ice_ddp_state
-ice_download_pkg(struct ice_hw *hw, struct ice_pkg_hdr *pkg_hdr)
+ice_download_pkg_with_sig_seg(struct ice_hw *hw, struct ice_pkg_hdr *pkg_hdr)
{
enum ice_aq_err aq_err = hw->adminq.sq_last_status;
enum ice_ddp_state state = ICE_DDP_PKG_ERR;
state = ice_post_dwnld_pkg_actions(hw);
ice_release_global_cfg_lock(hw);
+
+ return state;
+}
+
+/**
+ * ice_dwnld_cfg_bufs
+ * @hw: pointer to the hardware structure
+ * @bufs: pointer to an array of buffers
+ * @count: the number of buffers in the array
+ *
+ * Obtains global config lock and downloads the package configuration buffers
+ * to the firmware.
+ */
+static enum ice_ddp_state
+ice_dwnld_cfg_bufs(struct ice_hw *hw, struct ice_buf *bufs, u32 count)
+{
+ enum ice_ddp_state state;
+ struct ice_buf_hdr *bh;
+ int status;
+
+ if (!bufs || !count)
+ return ICE_DDP_PKG_ERR;
+
+ /* If the first buffer's first section has its metadata bit set
+ * then there are no buffers to be downloaded, and the operation is
+ * considered a success.
+ */
+ bh = (struct ice_buf_hdr *)bufs;
+ if (le32_to_cpu(bh->section_entry[0].type) & ICE_METADATA_BUF)
+ return ICE_DDP_PKG_SUCCESS;
+
+ status = ice_acquire_global_cfg_lock(hw, ICE_RES_WRITE);
+ if (status) {
+ if (status == -EALREADY)
+ return ICE_DDP_PKG_ALREADY_LOADED;
+ return ice_map_aq_err_to_ddp_state(hw->adminq.sq_last_status);
+ }
+
+ state = ice_dwnld_cfg_bufs_no_lock(hw, bufs, 0, count, true);
+ if (!state)
+ state = ice_post_dwnld_pkg_actions(hw);
+
+ ice_release_global_cfg_lock(hw);
+
+ return state;
+}
+
+/**
+ * ice_download_pkg_without_sig_seg
+ * @hw: pointer to the hardware structure
+ * @ice_seg: pointer to the segment of the package to be downloaded
+ *
+ * Handles the download of a complete package without signature segment.
+ */
+static enum ice_ddp_state
+ice_download_pkg_without_sig_seg(struct ice_hw *hw, struct ice_seg *ice_seg)
+{
+ struct ice_buf_table *ice_buf_tbl;
+
+ ice_debug(hw, ICE_DBG_PKG, "Segment format version: %d.%d.%d.%d\n",
+ ice_seg->hdr.seg_format_ver.major,
+ ice_seg->hdr.seg_format_ver.minor,
+ ice_seg->hdr.seg_format_ver.update,
+ ice_seg->hdr.seg_format_ver.draft);
+
+ ice_debug(hw, ICE_DBG_PKG, "Seg: type 0x%X, size %d, name %s\n",
+ le32_to_cpu(ice_seg->hdr.seg_type),
+ le32_to_cpu(ice_seg->hdr.seg_size), ice_seg->hdr.seg_id);
+
+ ice_buf_tbl = ice_find_buf_table(ice_seg);
+
+ ice_debug(hw, ICE_DBG_PKG, "Seg buf count: %d\n",
+ le32_to_cpu(ice_buf_tbl->buf_count));
+
+ return ice_dwnld_cfg_bufs(hw, ice_buf_tbl->buf_array,
+ le32_to_cpu(ice_buf_tbl->buf_count));
+}
+
+/**
+ * ice_download_pkg
+ * @hw: pointer to the hardware structure
+ * @pkg_hdr: pointer to package header
+ * @ice_seg: pointer to the segment of the package to be downloaded
+ *
+ * Handles the download of a complete package.
+ */
+static enum ice_ddp_state
+ice_download_pkg(struct ice_hw *hw, struct ice_pkg_hdr *pkg_hdr,
+ struct ice_seg *ice_seg)
+{
+ enum ice_ddp_state state;
+
+ if (hw->pkg_has_signing_seg)
+ state = ice_download_pkg_with_sig_seg(hw, pkg_hdr);
+ else
+ state = ice_download_pkg_without_sig_seg(hw, ice_seg);
+
ice_post_pkg_dwnld_vlan_mode_cfg(hw);
return state;
/* initialize package hints and then download package */
ice_init_pkg_hints(hw, seg);
- state = ice_download_pkg(hw, pkg);
+ state = ice_download_pkg(hw, pkg, seg);
if (state == ICE_DDP_PKG_ALREADY_LOADED) {
ice_debug(hw, ICE_DBG_INIT,
"package previously loaded - no work.\n");
struct ice_pf *pf = d->pf;
int ret;
- if (prio > ICE_DPLL_PRIO_MAX) {
- NL_SET_ERR_MSG_FMT(extack, "prio out of supported range 0-%d",
- ICE_DPLL_PRIO_MAX);
- return -EINVAL;
- }
-
mutex_lock(&pf->dplls.lock);
ret = ice_dpll_hw_input_prio_set(pf, d, p, prio, extack);
mutex_unlock(&pf->dplls.lock);
}
d->pf = pf;
if (cgu) {
+ ice_dpll_update_state(pf, d, true);
ret = dpll_device_register(d->dpll, type, &ice_dpll_ops, d);
if (ret) {
dpll_device_put(d->dpll);
struct ice_dplls *d = &pf->dplls;
struct kthread_worker *kworker;
- ice_dpll_update_state(pf, &d->eec, true);
- ice_dpll_update_state(pf, &d->pps, true);
kthread_init_delayed_work(&d->work, ice_dpll_periodic_work);
kworker = kthread_create_worker(0, "ice-dplls-%s",
dev_name(ice_pf_to_dev(pf)));
int num_pins, i, ret = -EINVAL;
struct ice_hw *hw = &pf->hw;
struct ice_dpll_pin *pins;
+ unsigned long caps;
u8 freq_supp_num;
bool input;
}
for (i = 0; i < num_pins; i++) {
+ caps = 0;
pins[i].idx = i;
pins[i].prop.board_label = ice_cgu_get_pin_name(hw, i, input);
pins[i].prop.type = ice_cgu_get_pin_type(hw, i, input);
&dp->input_prio[i]);
if (ret)
return ret;
- pins[i].prop.capabilities |=
- DPLL_PIN_CAPABILITIES_PRIORITY_CAN_CHANGE;
+ caps |= (DPLL_PIN_CAPABILITIES_PRIORITY_CAN_CHANGE |
+ DPLL_PIN_CAPABILITIES_STATE_CAN_CHANGE);
pins[i].prop.phase_range.min =
pf->dplls.input_phase_adj_max;
pins[i].prop.phase_range.max =
pf->dplls.output_phase_adj_max;
pins[i].prop.phase_range.max =
-pf->dplls.output_phase_adj_max;
+ ret = ice_cgu_get_output_pin_state_caps(hw, i, &caps);
+ if (ret)
+ return ret;
}
- pins[i].prop.capabilities |=
- DPLL_PIN_CAPABILITIES_STATE_CAN_CHANGE;
+ pins[i].prop.capabilities = caps;
ret = ice_dpll_pin_state_update(pf, &pins[i], pin_type, NULL);
if (ret)
return ret;
#include "ice.h"
-#define ICE_DPLL_PRIO_MAX 0xF
#define ICE_DPLL_RCLK_NUM_MAX 4
/** ice_dpll_pin - store info about pins
linkmode_zero(ks->link_modes.supported);
linkmode_zero(ks->link_modes.advertising);
- for (i = 0; i < BITS_PER_TYPE(u64); i++) {
+ for (i = 0; i < ARRAY_SIZE(phy_type_low_lkup); i++) {
if (phy_types_low & BIT_ULL(i))
ice_linkmode_set_bit(&phy_type_low_lkup[i], ks,
req_speeds, advert_phy_type_lo,
i);
}
- for (i = 0; i < BITS_PER_TYPE(u64); i++) {
+ for (i = 0; i < ARRAY_SIZE(phy_type_high_lkup); i++) {
if (phy_types_high & BIT_ULL(i))
ice_linkmode_set_bit(&phy_type_high_lkup[i], ks,
req_speeds, advert_phy_type_hi,
dev_dbg(dev, "Problem restarting traffic for LAG node move\n");
}
+/**
+ * ice_lag_build_netdev_list - populate the lag struct's netdev list
+ * @lag: local lag struct
+ * @ndlist: pointer to netdev list to populate
+ */
+static void ice_lag_build_netdev_list(struct ice_lag *lag,
+ struct ice_lag_netdev_list *ndlist)
+{
+ struct ice_lag_netdev_list *nl;
+ struct net_device *tmp_nd;
+
+ INIT_LIST_HEAD(&ndlist->node);
+ rcu_read_lock();
+ for_each_netdev_in_bond_rcu(lag->upper_netdev, tmp_nd) {
+ nl = kzalloc(sizeof(*nl), GFP_ATOMIC);
+ if (!nl)
+ break;
+
+ nl->netdev = tmp_nd;
+ list_add(&nl->node, &ndlist->node);
+ }
+ rcu_read_unlock();
+ lag->netdev_head = &ndlist->node;
+}
+
+/**
+ * ice_lag_destroy_netdev_list - free lag struct's netdev list
+ * @lag: pointer to local lag struct
+ * @ndlist: pointer to lag struct netdev list
+ */
+static void ice_lag_destroy_netdev_list(struct ice_lag *lag,
+ struct ice_lag_netdev_list *ndlist)
+{
+ struct ice_lag_netdev_list *entry, *n;
+
+ rcu_read_lock();
+ list_for_each_entry_safe(entry, n, &ndlist->node, node) {
+ list_del(&entry->node);
+ kfree(entry);
+ }
+ rcu_read_unlock();
+ lag->netdev_head = NULL;
+}
+
/**
* ice_lag_move_single_vf_nodes - Move Tx scheduling nodes for single VF
* @lag: primary interface LAG struct
void ice_lag_move_new_vf_nodes(struct ice_vf *vf)
{
struct ice_lag_netdev_list ndlist;
- struct list_head *tmp, *n;
u8 pri_port, act_port;
struct ice_lag *lag;
struct ice_vsi *vsi;
pri_port = pf->hw.port_info->lport;
act_port = lag->active_port;
- if (lag->upper_netdev) {
- struct ice_lag_netdev_list *nl;
- struct net_device *tmp_nd;
-
- INIT_LIST_HEAD(&ndlist.node);
- rcu_read_lock();
- for_each_netdev_in_bond_rcu(lag->upper_netdev, tmp_nd) {
- nl = kzalloc(sizeof(*nl), GFP_ATOMIC);
- if (!nl)
- break;
-
- nl->netdev = tmp_nd;
- list_add(&nl->node, &ndlist.node);
- }
- rcu_read_unlock();
- }
-
- lag->netdev_head = &ndlist.node;
+ if (lag->upper_netdev)
+ ice_lag_build_netdev_list(lag, &ndlist);
if (ice_is_feature_supported(pf, ICE_F_SRIOV_LAG) &&
lag->bonded && lag->primary && pri_port != act_port &&
!list_empty(lag->netdev_head))
ice_lag_move_single_vf_nodes(lag, pri_port, act_port, vsi->idx);
- list_for_each_safe(tmp, n, &ndlist.node) {
- struct ice_lag_netdev_list *entry;
-
- entry = list_entry(tmp, struct ice_lag_netdev_list, node);
- list_del(&entry->node);
- kfree(entry);
- }
- lag->netdev_head = NULL;
+ ice_lag_destroy_netdev_list(lag, &ndlist);
new_vf_unlock:
mutex_unlock(&pf->lag_mutex);
ice_lag_move_single_vf_nodes(lag, oldport, newport, i);
}
+/**
+ * ice_lag_move_vf_nodes_cfg - move vf nodes outside LAG netdev event context
+ * @lag: local lag struct
+ * @src_prt: lport value for source port
+ * @dst_prt: lport value for destination port
+ *
+ * This function is used to move nodes during an out-of-netdev-event situation,
+ * primarily when the driver needs to reconfigure or recreate resources.
+ *
+ * Must be called while holding the lag_mutex to avoid lag events from
+ * processing while out-of-sync moves are happening. Also, paired moves,
+ * such as used in a reset flow, should both be called under the same mutex
+ * lock to avoid changes between start of reset and end of reset.
+ */
+void ice_lag_move_vf_nodes_cfg(struct ice_lag *lag, u8 src_prt, u8 dst_prt)
+{
+ struct ice_lag_netdev_list ndlist;
+
+ ice_lag_build_netdev_list(lag, &ndlist);
+ ice_lag_move_vf_nodes(lag, src_prt, dst_prt);
+ ice_lag_destroy_netdev_list(lag, &ndlist);
+}
+
#define ICE_LAG_SRIOV_CP_RECIPE 10
#define ICE_LAG_SRIOV_TRAIN_PKT_LEN 16
int n, err;
ice_lag_init_feature_support_flag(pf);
+ if (!ice_is_feature_supported(pf, ICE_F_SRIOV_LAG))
+ return 0;
pf->lag = kzalloc(sizeof(*lag), GFP_KERNEL);
if (!pf->lag)
{
struct ice_lag_netdev_list ndlist;
struct ice_lag *lag, *prim_lag;
- struct list_head *tmp, *n;
u8 act_port, loc_port;
if (!pf->lag || !pf->lag->bonded)
if (lag->primary) {
prim_lag = lag;
} else {
- struct ice_lag_netdev_list *nl;
- struct net_device *tmp_nd;
-
- INIT_LIST_HEAD(&ndlist.node);
- rcu_read_lock();
- for_each_netdev_in_bond_rcu(lag->upper_netdev, tmp_nd) {
- nl = kzalloc(sizeof(*nl), GFP_ATOMIC);
- if (!nl)
- break;
-
- nl->netdev = tmp_nd;
- list_add(&nl->node, &ndlist.node);
- }
- rcu_read_unlock();
- lag->netdev_head = &ndlist.node;
+ ice_lag_build_netdev_list(lag, &ndlist);
prim_lag = ice_lag_find_primary(lag);
}
ice_clear_rdma_cap(pf);
lag_rebuild_out:
- list_for_each_safe(tmp, n, &ndlist.node) {
- struct ice_lag_netdev_list *entry;
-
- entry = list_entry(tmp, struct ice_lag_netdev_list, node);
- list_del(&entry->node);
- kfree(entry);
- }
+ ice_lag_destroy_netdev_list(lag, &ndlist);
mutex_unlock(&pf->lag_mutex);
}
void ice_deinit_lag(struct ice_pf *pf);
void ice_lag_rebuild(struct ice_pf *pf);
bool ice_lag_is_switchdev_running(struct ice_pf *pf);
+void ice_lag_move_vf_nodes_cfg(struct ice_lag *lag, u8 src_prt, u8 dst_prt);
#endif /* _ICE_LAG_H_ */
} else {
max_txqs[i] = vsi->alloc_txq;
}
+
+ if (vsi->type == ICE_VSI_PF)
+ max_txqs[i] += vsi->num_xdp_txq;
}
dev_dbg(dev, "vsi->tc_cfg.ena_tc = %d\n", vsi->tc_cfg.ena_tc);
if (vsi->type == ICE_VSI_VF &&
vsi->agg_node && vsi->agg_node->valid)
vsi->agg_node->num_vsis--;
- if (vsi->agg_node) {
- vsi->agg_node->valid = false;
- vsi->agg_node->agg_id = 0;
- }
}
/**
goto err_vsi_rebuild;
}
- /* configure PTP timestamping after VSI rebuild */
- if (test_bit(ICE_FLAG_PTP_SUPPORTED, pf->flags)) {
- if (pf->ptp.tx_interrupt_mode == ICE_PTP_TX_INTERRUPT_SELF)
- ice_ptp_cfg_timestamp(pf, false);
- else if (pf->ptp.tx_interrupt_mode == ICE_PTP_TX_INTERRUPT_ALL)
- /* for E82x PHC owner always need to have interrupts */
- ice_ptp_cfg_timestamp(pf, true);
- }
-
err = ice_vsi_rebuild_by_type(pf, ICE_VSI_SWITCHDEV_CTRL);
if (err) {
dev_err(dev, "Switchdev CTRL VSI rebuild failed: %d\n", err);
ice_plug_aux_dev(pf);
if (ice_is_feature_supported(pf, ICE_F_SRIOV_LAG))
ice_lag_rebuild(pf);
+
+ /* Restore timestamp mode settings after VSI rebuild */
+ ice_ptp_restore_timestamp_mode(pf);
return;
err_vsi_rebuild:
}
/**
- * ice_ptp_configure_tx_tstamp - Enable or disable Tx timestamp interrupt
- * @pf: The PF pointer to search in
- * @on: bool value for whether timestamp interrupt is enabled or disabled
+ * ice_ptp_cfg_tx_interrupt - Configure Tx timestamp interrupt for the device
+ * @pf: Board private structure
+ *
+ * Program the device to respond appropriately to the Tx timestamp interrupt
+ * cause.
*/
-static void ice_ptp_configure_tx_tstamp(struct ice_pf *pf, bool on)
+static void ice_ptp_cfg_tx_interrupt(struct ice_pf *pf)
{
+ struct ice_hw *hw = &pf->hw;
+ bool enable;
u32 val;
+ switch (pf->ptp.tx_interrupt_mode) {
+ case ICE_PTP_TX_INTERRUPT_ALL:
+ /* React to interrupts across all quads. */
+ wr32(hw, PFINT_TSYN_MSK + (0x4 * hw->pf_id), (u32)0x1f);
+ enable = true;
+ break;
+ case ICE_PTP_TX_INTERRUPT_NONE:
+ /* Do not react to interrupts on any quad. */
+ wr32(hw, PFINT_TSYN_MSK + (0x4 * hw->pf_id), (u32)0x0);
+ enable = false;
+ break;
+ case ICE_PTP_TX_INTERRUPT_SELF:
+ default:
+ enable = pf->ptp.tstamp_config.tx_type == HWTSTAMP_TX_ON;
+ break;
+ }
+
/* Configure the Tx timestamp interrupt */
- val = rd32(&pf->hw, PFINT_OICR_ENA);
- if (on)
+ val = rd32(hw, PFINT_OICR_ENA);
+ if (enable)
val |= PFINT_OICR_TSYN_TX_M;
else
val &= ~PFINT_OICR_TSYN_TX_M;
- wr32(&pf->hw, PFINT_OICR_ENA, val);
-}
-
-/**
- * ice_set_tx_tstamp - Enable or disable Tx timestamping
- * @pf: The PF pointer to search in
- * @on: bool value for whether timestamps are enabled or disabled
- */
-static void ice_set_tx_tstamp(struct ice_pf *pf, bool on)
-{
- struct ice_vsi *vsi;
- u16 i;
-
- vsi = ice_get_main_vsi(pf);
- if (!vsi)
- return;
-
- /* Set the timestamp enable flag for all the Tx rings */
- ice_for_each_txq(vsi, i) {
- if (!vsi->tx_rings[i])
- continue;
- vsi->tx_rings[i]->ptp_tx = on;
- }
-
- if (pf->ptp.tx_interrupt_mode == ICE_PTP_TX_INTERRUPT_SELF)
- ice_ptp_configure_tx_tstamp(pf, on);
-
- pf->ptp.tstamp_config.tx_type = on ? HWTSTAMP_TX_ON : HWTSTAMP_TX_OFF;
+ wr32(hw, PFINT_OICR_ENA, val);
}
/**
u16 i;
vsi = ice_get_main_vsi(pf);
- if (!vsi)
+ if (!vsi || !vsi->rx_rings)
return;
/* Set the timestamp flag for all the Rx rings */
continue;
vsi->rx_rings[i]->ptp_rx = on;
}
+}
+
+/**
+ * ice_ptp_disable_timestamp_mode - Disable current timestamp mode
+ * @pf: Board private structure
+ *
+ * Called during preparation for reset to temporarily disable timestamping on
+ * the device. Called during remove to disable timestamping while cleaning up
+ * driver resources.
+ */
+static void ice_ptp_disable_timestamp_mode(struct ice_pf *pf)
+{
+ struct ice_hw *hw = &pf->hw;
+ u32 val;
+
+ val = rd32(hw, PFINT_OICR_ENA);
+ val &= ~PFINT_OICR_TSYN_TX_M;
+ wr32(hw, PFINT_OICR_ENA, val);
- pf->ptp.tstamp_config.rx_filter = on ? HWTSTAMP_FILTER_ALL :
- HWTSTAMP_FILTER_NONE;
+ ice_set_rx_tstamp(pf, false);
}
/**
- * ice_ptp_cfg_timestamp - Configure timestamp for init/deinit
+ * ice_ptp_restore_timestamp_mode - Restore timestamp configuration
* @pf: Board private structure
- * @ena: bool value to enable or disable time stamp
*
- * This function will configure timestamping during PTP initialization
- * and deinitialization
+ * Called at the end of rebuild to restore timestamp configuration after
+ * a device reset.
*/
-void ice_ptp_cfg_timestamp(struct ice_pf *pf, bool ena)
+void ice_ptp_restore_timestamp_mode(struct ice_pf *pf)
{
- ice_set_tx_tstamp(pf, ena);
- ice_set_rx_tstamp(pf, ena);
+ struct ice_hw *hw = &pf->hw;
+ bool enable_rx;
+
+ ice_ptp_cfg_tx_interrupt(pf);
+
+ enable_rx = pf->ptp.tstamp_config.rx_filter == HWTSTAMP_FILTER_ALL;
+ ice_set_rx_tstamp(pf, enable_rx);
+
+ /* Trigger an immediate software interrupt to ensure that timestamps
+ * which occurred during reset are handled now.
+ */
+ wr32(hw, PFINT_OICR, PFINT_OICR_TSYN_TX_M);
+ ice_flush(hw);
}
/**
{
switch (config->tx_type) {
case HWTSTAMP_TX_OFF:
- ice_set_tx_tstamp(pf, false);
+ pf->ptp.tstamp_config.tx_type = HWTSTAMP_TX_OFF;
break;
case HWTSTAMP_TX_ON:
- ice_set_tx_tstamp(pf, true);
+ pf->ptp.tstamp_config.tx_type = HWTSTAMP_TX_ON;
break;
default:
return -ERANGE;
switch (config->rx_filter) {
case HWTSTAMP_FILTER_NONE:
- ice_set_rx_tstamp(pf, false);
+ pf->ptp.tstamp_config.rx_filter = HWTSTAMP_FILTER_NONE;
break;
case HWTSTAMP_FILTER_PTP_V1_L4_EVENT:
case HWTSTAMP_FILTER_PTP_V1_L4_SYNC:
case HWTSTAMP_FILTER_PTP_V2_L4_DELAY_REQ:
case HWTSTAMP_FILTER_NTP_ALL:
case HWTSTAMP_FILTER_ALL:
- ice_set_rx_tstamp(pf, true);
+ pf->ptp.tstamp_config.rx_filter = HWTSTAMP_FILTER_ALL;
break;
default:
return -ERANGE;
}
+ /* Immediately update the device timestamping mode */
+ ice_ptp_restore_timestamp_mode(pf);
+
return 0;
}
clear_bit(ICE_FLAG_PTP, pf->flags);
/* Disable timestamping for both Tx and Rx */
- ice_ptp_cfg_timestamp(pf, false);
+ ice_ptp_disable_timestamp_mode(pf);
kthread_cancel_delayed_work_sync(&ptp->work);
/* Release the global hardware lock */
ice_ptp_unlock(hw);
- if (pf->ptp.tx_interrupt_mode == ICE_PTP_TX_INTERRUPT_ALL) {
- /* The clock owner for this device type handles the timestamp
- * interrupt for all ports.
- */
- ice_ptp_configure_tx_tstamp(pf, true);
-
- /* React on all quads interrupts for E82x */
- wr32(hw, PFINT_TSYN_MSK + (0x4 * hw->pf_id), (u32)0x1f);
-
+ if (!ice_is_e810(hw)) {
/* Enable quad interrupts */
err = ice_ptp_tx_ena_intr(pf, true, itr);
if (err)
case ICE_PHY_E810:
return ice_ptp_init_tx_e810(pf, &ptp_port->tx);
case ICE_PHY_E822:
- /* Non-owner PFs don't react to any interrupts on E82x,
- * neither on own quad nor on others
- */
- if (!ice_ptp_pf_handles_tx_interrupt(pf)) {
- ice_ptp_configure_tx_tstamp(pf, false);
- wr32(hw, PFINT_TSYN_MSK + (0x4 * hw->pf_id), (u32)0x0);
- }
kthread_init_delayed_work(&ptp_port->ov_work,
ice_ptp_wait_for_offsets);
/* Start the PHY timestamping block */
ice_ptp_reset_phy_timestamping(pf);
+ /* Configure initial Tx interrupt settings */
+ ice_ptp_cfg_tx_interrupt(pf);
+
set_bit(ICE_FLAG_PTP, pf->flags);
err = ice_ptp_init_work(pf, ptp);
if (err)
return;
/* Disable timestamping for both Tx and Rx */
- ice_ptp_cfg_timestamp(pf, false);
+ ice_ptp_disable_timestamp_mode(pf);
ice_ptp_remove_auxbus_device(pf);
struct ice_pf;
int ice_ptp_set_ts_config(struct ice_pf *pf, struct ifreq *ifr);
int ice_ptp_get_ts_config(struct ice_pf *pf, struct ifreq *ifr);
-void ice_ptp_cfg_timestamp(struct ice_pf *pf, bool ena);
+void ice_ptp_restore_timestamp_mode(struct ice_pf *pf);
void ice_ptp_extts_event(struct ice_pf *pf);
s8 ice_ptp_request_ts(struct ice_ptp_tx *tx, struct sk_buff *skb);
return -EOPNOTSUPP;
}
-static inline void ice_ptp_cfg_timestamp(struct ice_pf *pf, bool ena) { }
-
+static inline void ice_ptp_restore_timestamp_mode(struct ice_pf *pf) { }
static inline void ice_ptp_extts_event(struct ice_pf *pf) { }
static inline s8
ice_ptp_request_ts(struct ice_ptp_tx *tx, struct sk_buff *skb)
return ret;
}
+
+/**
+ * ice_cgu_get_output_pin_state_caps - get output pin state capabilities
+ * @hw: pointer to the hw struct
+ * @pin_id: id of a pin
+ * @caps: capabilities to modify
+ *
+ * Return:
+ * * 0 - success, state capabilities were modified
+ * * negative - failure, capabilities were not modified
+ */
+int ice_cgu_get_output_pin_state_caps(struct ice_hw *hw, u8 pin_id,
+ unsigned long *caps)
+{
+ bool can_change = true;
+
+ switch (hw->device_id) {
+ case ICE_DEV_ID_E810C_SFP:
+ if (pin_id == ZL_OUT2 || pin_id == ZL_OUT3)
+ can_change = false;
+ break;
+ case ICE_DEV_ID_E810C_QSFP:
+ if (pin_id == ZL_OUT2 || pin_id == ZL_OUT3 || pin_id == ZL_OUT4)
+ can_change = false;
+ break;
+ case ICE_DEV_ID_E823L_10G_BASE_T:
+ case ICE_DEV_ID_E823L_1GBE:
+ case ICE_DEV_ID_E823L_BACKPLANE:
+ case ICE_DEV_ID_E823L_QSFP:
+ case ICE_DEV_ID_E823L_SFP:
+ case ICE_DEV_ID_E823C_10G_BASE_T:
+ case ICE_DEV_ID_E823C_BACKPLANE:
+ case ICE_DEV_ID_E823C_QSFP:
+ case ICE_DEV_ID_E823C_SFP:
+ case ICE_DEV_ID_E823C_SGMII:
+ if (hw->cgu_part_number ==
+ ICE_AQC_GET_LINK_TOPO_NODE_NR_ZL30632_80032 &&
+ pin_id == ZL_OUT2)
+ can_change = false;
+ else if (hw->cgu_part_number ==
+ ICE_AQC_GET_LINK_TOPO_NODE_NR_SI5383_5384 &&
+ pin_id == SI_OUT1)
+ can_change = false;
+ break;
+ default:
+ return -EINVAL;
+ }
+ if (can_change)
+ *caps |= DPLL_PIN_CAPABILITIES_STATE_CAN_CHANGE;
+ else
+ *caps &= ~DPLL_PIN_CAPABILITIES_STATE_CAN_CHANGE;
+
+ return 0;
+}
int ice_get_cgu_rclk_pin_info(struct ice_hw *hw, u8 *base_idx, u8 *pin_num);
void ice_ptp_init_phy_model(struct ice_hw *hw);
+int ice_cgu_get_output_pin_state_caps(struct ice_hw *hw, u8 pin_id,
+ unsigned long *caps);
#define PFTSYN_SEM_BYTES 4
*/
int ice_calc_vf_reg_idx(struct ice_vf *vf, struct ice_q_vector *q_vector)
{
- struct ice_pf *pf;
-
if (!vf || !q_vector)
return -EINVAL;
- pf = vf->pf;
-
/* always add one to account for the OICR being the first MSIX */
- return pf->sriov_base_vector + pf->vfs.num_msix_per * vf->vf_id +
- q_vector->v_idx + 1;
+ return vf->first_vector_idx + q_vector->v_idx + 1;
}
/**
if (likely(!(skb_shinfo(skb)->tx_flags & SKBTX_HW_TSTAMP)))
return;
- if (!tx_ring->ptp_tx)
- return;
-
/* Tx timestamps cannot be sampled when doing TSO */
if (first->tx_flags & ICE_TX_FLAGS_TSO)
return;
#define ICE_TX_FLAGS_RING_VLAN_L2TAG2 BIT(2)
u8 flags;
u8 dcb_tc; /* Traffic class of ring */
- u8 ptp_tx;
} ____cacheline_internodealigned_in_smp;
static inline bool ice_ring_uses_build_skb(struct ice_rx_ring *ring)
int ice_reset_vf(struct ice_vf *vf, u32 flags)
{
struct ice_pf *pf = vf->pf;
+ struct ice_lag *lag;
struct ice_vsi *vsi;
+ u8 act_prt, pri_prt;
struct device *dev;
int err = 0;
bool rsd;
dev = ice_pf_to_dev(pf);
+ act_prt = ICE_LAG_INVALID_PORT;
+ pri_prt = pf->hw.port_info->lport;
if (flags & ICE_VF_RESET_NOTIFY)
ice_notify_vf_reset(vf);
return 0;
}
+ lag = pf->lag;
+ mutex_lock(&pf->lag_mutex);
+ if (lag && lag->bonded && lag->primary) {
+ act_prt = lag->active_port;
+ if (act_prt != pri_prt && act_prt != ICE_LAG_INVALID_PORT &&
+ lag->upper_netdev)
+ ice_lag_move_vf_nodes_cfg(lag, act_prt, pri_prt);
+ else
+ act_prt = ICE_LAG_INVALID_PORT;
+ }
+
if (flags & ICE_VF_RESET_LOCK)
mutex_lock(&vf->cfg_lock);
else
if (flags & ICE_VF_RESET_LOCK)
mutex_unlock(&vf->cfg_lock);
+ if (lag && lag->bonded && lag->primary &&
+ act_prt != ICE_LAG_INVALID_PORT)
+ ice_lag_move_vf_nodes_cfg(lag, pri_prt, act_prt);
+ mutex_unlock(&pf->lag_mutex);
+
return err;
}
/* setup outer VLAN ops */
vlan_ops->set_port_vlan = ice_vsi_set_outer_port_vlan;
vlan_ops->clear_port_vlan = ice_vsi_clear_outer_port_vlan;
- vlan_ops->clear_port_vlan = ice_vsi_clear_outer_port_vlan;
/* setup inner VLAN ops */
vlan_ops = &vsi->inner_vlan_ops;
vlan_ops->set_port_vlan = ice_vsi_set_inner_port_vlan;
vlan_ops->clear_port_vlan = ice_vsi_clear_inner_port_vlan;
- vlan_ops->clear_port_vlan = ice_vsi_clear_inner_port_vlan;
}
+
+ /* all Rx traffic should be in the domain of the assigned port VLAN,
+ * so prevent disabling Rx VLAN filtering
+ */
+ vlan_ops->dis_rx_filtering = noop_vlan;
+
vlan_ops->ena_rx_filtering = ice_vsi_ena_rx_vlan_filtering;
}
vlan_ops->del_vlan = ice_vsi_del_vlan;
}
+ vlan_ops->dis_rx_filtering = ice_vsi_dis_rx_vlan_filtering;
+
if (!test_bit(ICE_FLAG_VF_VLAN_PRUNING, pf->flags))
vlan_ops->ena_rx_filtering = noop_vlan;
else
&vsi->outer_vlan_ops : &vsi->inner_vlan_ops;
vlan_ops->add_vlan = ice_vsi_add_vlan;
- vlan_ops->dis_rx_filtering = ice_vsi_dis_rx_vlan_filtering;
vlan_ops->ena_tx_filtering = ice_vsi_ena_tx_vlan_filtering;
vlan_ops->dis_tx_filtering = ice_vsi_dis_tx_vlan_filtering;
}
u16 num_q_vectors_mapped, vsi_id, vector_id;
struct virtchnl_irq_map_info *irqmap_info;
struct virtchnl_vector_map *map;
- struct ice_pf *pf = vf->pf;
struct ice_vsi *vsi;
int i;
* there is actually at least a single VF queue vector mapped
*/
if (!test_bit(ICE_VF_STATE_ACTIVE, vf->vf_states) ||
- pf->vfs.num_msix_per < num_q_vectors_mapped ||
+ vf->num_msix < num_q_vectors_mapped ||
!num_q_vectors_mapped) {
v_ret = VIRTCHNL_STATUS_ERR_PARAM;
goto error_param;
/* vector_id is always 0-based for each VF, and can never be
* larger than or equal to the max allowed interrupts per VF
*/
- if (!(vector_id < pf->vfs.num_msix_per) ||
+ if (!(vector_id < vf->num_msix) ||
!ice_vc_isvalid_vsi_id(vf, vsi_id) ||
(!vector_id && (map->rxq_map || map->txq_map))) {
v_ret = VIRTCHNL_STATUS_ERR_PARAM;
(struct virtchnl_vsi_queue_config_info *)msg;
struct virtchnl_queue_pair_info *qpi;
struct ice_pf *pf = vf->pf;
+ struct ice_lag *lag;
struct ice_vsi *vsi;
+ u8 act_prt, pri_prt;
int i = -1, q_idx;
+ lag = pf->lag;
+ mutex_lock(&pf->lag_mutex);
+ act_prt = ICE_LAG_INVALID_PORT;
+ pri_prt = pf->hw.port_info->lport;
+ if (lag && lag->bonded && lag->primary) {
+ act_prt = lag->active_port;
+ if (act_prt != pri_prt && act_prt != ICE_LAG_INVALID_PORT &&
+ lag->upper_netdev)
+ ice_lag_move_vf_nodes_cfg(lag, act_prt, pri_prt);
+ else
+ act_prt = ICE_LAG_INVALID_PORT;
+ }
+
if (!test_bit(ICE_VF_STATE_ACTIVE, vf->vf_states))
goto error_param;
}
}
+ if (lag && lag->bonded && lag->primary &&
+ act_prt != ICE_LAG_INVALID_PORT)
+ ice_lag_move_vf_nodes_cfg(lag, pri_prt, act_prt);
+ mutex_unlock(&pf->lag_mutex);
+
/* send the response to the VF */
return ice_vc_send_msg_to_vf(vf, VIRTCHNL_OP_CONFIG_VSI_QUEUES,
VIRTCHNL_STATUS_SUCCESS, NULL, 0);
vf->vf_id, i);
}
+ if (lag && lag->bonded && lag->primary &&
+ act_prt != ICE_LAG_INVALID_PORT)
+ ice_lag_move_vf_nodes_cfg(lag, pri_prt, act_prt);
+ mutex_unlock(&pf->lag_mutex);
+
ice_lag_move_new_vf_nodes(vf);
/* send the response to the VF */
u8 *data)
{
if (sset == ETH_SS_STATS) {
+ struct mvneta_port *pp = netdev_priv(netdev);
int i;
for (i = 0; i < ARRAY_SIZE(mvneta_statistics); i++)
memcpy(data + i * ETH_GSTRING_LEN,
mvneta_statistics[i].name, ETH_GSTRING_LEN);
- data += ETH_GSTRING_LEN * ARRAY_SIZE(mvneta_statistics);
- page_pool_ethtool_stats_get_strings(data);
+ if (!pp->bm_priv) {
+ data += ETH_GSTRING_LEN * ARRAY_SIZE(mvneta_statistics);
+ page_pool_ethtool_stats_get_strings(data);
+ }
}
}
struct page_pool_stats stats = {};
int i;
- for (i = 0; i < rxq_number; i++)
- page_pool_get_stats(pp->rxqs[i].page_pool, &stats);
+ for (i = 0; i < rxq_number; i++) {
+ if (pp->rxqs[i].page_pool)
+ page_pool_get_stats(pp->rxqs[i].page_pool, &stats);
+ }
page_pool_ethtool_stats_get(data, &stats);
}
for (i = 0; i < ARRAY_SIZE(mvneta_statistics); i++)
*data++ = pp->ethtool_stats[i];
- mvneta_ethtool_pp_stats(pp, data);
+ if (!pp->bm_priv)
+ mvneta_ethtool_pp_stats(pp, data);
}
static int mvneta_ethtool_get_sset_count(struct net_device *dev, int sset)
{
- if (sset == ETH_SS_STATS)
- return ARRAY_SIZE(mvneta_statistics) +
- page_pool_ethtool_stats_get_count();
+ if (sset == ETH_SS_STATS) {
+ int count = ARRAY_SIZE(mvneta_statistics);
+ struct mvneta_port *pp = netdev_priv(dev);
+
+ if (!pp->bm_priv)
+ count += page_pool_ethtool_stats_get_count();
+
+ return count;
+ }
return -EOPNOTSUPP;
}
if (ret)
return ret;
+ INIT_WORK(&oct->tx_timeout_task, octep_tx_timeout_task);
+ INIT_WORK(&oct->ctrl_mbox_task, octep_ctrl_mbox_task);
+ INIT_DELAYED_WORK(&oct->intr_poll_task, octep_intr_poll_task);
+ oct->poll_non_ioq_intr = true;
+ queue_delayed_work(octep_wq, &oct->intr_poll_task,
+ msecs_to_jiffies(OCTEP_INTR_POLL_TIME_MSECS));
+
atomic_set(&oct->hb_miss_cnt, 0);
INIT_DELAYED_WORK(&oct->hb_task, octep_hb_timeout_task);
pci_read_config_byte(pdev, (pos + 8), &status);
dev_info(&pdev->dev, "Firmware ready status = %u\n", status);
- return status;
+#define FW_STATUS_READY 1ULL
+ return status == FW_STATUS_READY;
}
return false;
}
goto err_octep_config;
}
- octep_ctrl_net_get_info(octep_dev, OCTEP_CTRL_NET_INVALID_VFID,
- &octep_dev->conf->fw_info);
+ err = octep_ctrl_net_get_info(octep_dev, OCTEP_CTRL_NET_INVALID_VFID,
+ &octep_dev->conf->fw_info);
+ if (err) {
+ dev_err(&pdev->dev, "Failed to get firmware info\n");
+ goto register_dev_err;
+ }
dev_info(&octep_dev->pdev->dev, "Heartbeat interval %u msecs Heartbeat miss count %u\n",
octep_dev->conf->fw_info.hb_interval,
octep_dev->conf->fw_info.hb_miss_count);
queue_delayed_work(octep_wq, &octep_dev->hb_task,
msecs_to_jiffies(octep_dev->conf->fw_info.hb_interval));
- INIT_WORK(&octep_dev->tx_timeout_task, octep_tx_timeout_task);
- INIT_WORK(&octep_dev->ctrl_mbox_task, octep_ctrl_mbox_task);
- INIT_DELAYED_WORK(&octep_dev->intr_poll_task, octep_intr_poll_task);
- octep_dev->poll_non_ioq_intr = true;
- queue_delayed_work(octep_wq, &octep_dev->intr_poll_task,
- msecs_to_jiffies(OCTEP_INTR_POLL_TIME_MSECS));
-
netdev->netdev_ops = &octep_netdev_ops;
octep_set_ethtool_ops(netdev);
netif_carrier_off(netdev);
u8 tcam_entries; /* RX/TX Tcam entries per mcs block */
u8 secy_entries; /* RX/TX SECY entries per mcs block */
u8 sc_entries; /* RX/TX SC CAM entries per mcs block */
- u8 sa_entries; /* PN table entries = SA entries */
+ u16 sa_entries; /* PN table entries = SA entries */
u64 rsvd[16];
};
reg = MCSX_CSE_RX_MEM_SLAVE_INPKTSSECYTAGGEDCTLX(id);
stats->pkt_tagged_ctl_cnt = mcs_reg_read(mcs, reg);
- reg = MCSX_CSE_RX_MEM_SLAVE_INPKTSSECYUNTAGGEDORNOTAGX(id);
+ reg = MCSX_CSE_RX_MEM_SLAVE_INPKTSSECYUNTAGGEDX(id);
stats->pkt_untaged_cnt = mcs_reg_read(mcs, reg);
reg = MCSX_CSE_RX_MEM_SLAVE_INPKTSSECYCTLX(id);
reg = MCSX_CSE_RX_MEM_SLAVE_INPKTSSCNOTVALIDX(id);
stats->pkt_notvalid_cnt = mcs_reg_read(mcs, reg);
- reg = MCSX_CSE_RX_MEM_SLAVE_INPKTSSCUNCHECKEDOROKX(id);
+ reg = MCSX_CSE_RX_MEM_SLAVE_INPKTSSCUNCHECKEDX(id);
stats->pkt_unchecked_cnt = mcs_reg_read(mcs, reg);
if (mcs->hw->mcs_blks > 1) {
return NULL;
}
+bool is_mcs_bypass(int mcs_id)
+{
+ struct mcs *mcs_dev;
+
+ list_for_each_entry(mcs_dev, &mcs_list, mcs_list) {
+ if (mcs_dev->mcs_id == mcs_id)
+ return mcs_dev->bypass;
+ }
+ return true;
+}
+
void mcs_set_port_cfg(struct mcs *mcs, struct mcs_port_cfg_set_req *req)
{
u64 val = 0;
return err;
}
-static void mcs_set_external_bypass(struct mcs *mcs, u8 bypass)
+static void mcs_set_external_bypass(struct mcs *mcs, bool bypass)
{
u64 val;
else
val &= ~BIT_ULL(6);
mcs_reg_write(mcs, MCSX_MIL_GLOBAL, val);
+ mcs->bypass = bypass;
}
static void mcs_global_cfg(struct mcs *mcs)
u16 num_vec;
void *rvu;
u16 *tx_sa_active;
+ bool bypass;
};
struct mcs_ops {
int mcs_alloc_ctrlpktrule(struct rsrc_bmap *rsrc, u16 *pf_map, u16 offset, u16 pcifunc);
int mcs_free_ctrlpktrule(struct mcs *mcs, struct mcs_free_ctrl_pkt_rule_req *req);
int mcs_ctrlpktrule_write(struct mcs *mcs, struct mcs_ctrl_pkt_rule_write_req *req);
+bool is_mcs_bypass(int mcs_id);
/* CN10K-B APIs */
void cn10kb_mcs_set_hw_capabilities(struct mcs *mcs);
offset = 0x9d8ull; \
offset; })
+#define MCSX_CSE_RX_MEM_SLAVE_INPKTSSCUNCHECKEDX(a) ({ \
+ u64 offset; \
+ \
+ offset = 0xee80ull; \
+ if (mcs->hw->mcs_blks > 1) \
+ offset = 0xe818ull; \
+ offset += (a) * 0x8ull; \
+ offset; })
+
+#define MCSX_CSE_RX_MEM_SLAVE_INPKTSSECYUNTAGGEDX(a) ({ \
+ u64 offset; \
+ \
+ offset = 0xa680ull; \
+ if (mcs->hw->mcs_blks > 1) \
+ offset = 0xd018ull; \
+ offset += (a) * 0x8ull; \
+ offset; })
+
+#define MCSX_CSE_RX_MEM_SLAVE_INPKTSSCLATEORDELAYEDX(a) ({ \
+ u64 offset; \
+ \
+ offset = 0xf680ull; \
+ if (mcs->hw->mcs_blks > 1) \
+ offset = 0xe018ull; \
+ offset += (a) * 0x8ull; \
+ offset; })
+
#define MCSX_CSE_RX_MEM_SLAVE_INOCTETSSCDECRYPTEDX(a) (0xe680ull + (a) * 0x8ull)
#define MCSX_CSE_RX_MEM_SLAVE_INOCTETSSCVALIDATEX(a) (0xde80ull + (a) * 0x8ull)
-#define MCSX_CSE_RX_MEM_SLAVE_INPKTSSECYUNTAGGEDORNOTAGX(a) (0xa680ull + (a) * 0x8ull)
#define MCSX_CSE_RX_MEM_SLAVE_INPKTSSECYNOTAGX(a) (0xd218 + (a) * 0x8ull)
-#define MCSX_CSE_RX_MEM_SLAVE_INPKTSSECYUNTAGGEDX(a) (0xd018ull + (a) * 0x8ull)
-#define MCSX_CSE_RX_MEM_SLAVE_INPKTSSCUNCHECKEDOROKX(a) (0xee80ull + (a) * 0x8ull)
#define MCSX_CSE_RX_MEM_SLAVE_INPKTSSECYCTLX(a) (0xb680ull + (a) * 0x8ull)
-#define MCSX_CSE_RX_MEM_SLAVE_INPKTSSCLATEORDELAYEDX(a) (0xf680ull + (a) * 0x8ull)
#define MCSX_CSE_RX_MEM_SLAVE_INPKTSSAINVALIDX(a) (0x12680ull + (a) * 0x8ull)
#define MCSX_CSE_RX_MEM_SLAVE_INPKTSSANOTUSINGSAERRORX(a) (0x15680ull + (a) * 0x8ull)
#define MCSX_CSE_RX_MEM_SLAVE_INPKTSSANOTVALIDX(a) (0x13680ull + (a) * 0x8ull)
cfg |= RPMX_MTI_MAC100X_COMMAND_CONFIG_TX_P_DISABLE;
rpm_write(rpm, lmac_id, RPMX_MTI_MAC100X_COMMAND_CONFIG, cfg);
+ /* Disable forward pause to driver */
+ cfg = rpm_read(rpm, lmac_id, RPMX_MTI_MAC100X_COMMAND_CONFIG);
+ cfg &= ~RPMX_MTI_MAC100X_COMMAND_CONFIG_PAUSE_FWD;
+ rpm_write(rpm, lmac_id, RPMX_MTI_MAC100X_COMMAND_CONFIG, cfg);
+
/* Enable channel mask for all LMACS */
if (is_dev_rpm2(rpm))
rpm_write(rpm, lmac_id, RPM2_CMR_CHAN_MSK_OR, 0xffff);
if (rx_pause) {
cfg &= ~(RPMX_MTI_MAC100X_COMMAND_CONFIG_RX_P_DISABLE |
- RPMX_MTI_MAC100X_COMMAND_CONFIG_PAUSE_IGNORE |
- RPMX_MTI_MAC100X_COMMAND_CONFIG_PAUSE_FWD);
+ RPMX_MTI_MAC100X_COMMAND_CONFIG_PAUSE_IGNORE);
} else {
cfg |= (RPMX_MTI_MAC100X_COMMAND_CONFIG_RX_P_DISABLE |
- RPMX_MTI_MAC100X_COMMAND_CONFIG_PAUSE_IGNORE |
- RPMX_MTI_MAC100X_COMMAND_CONFIG_PAUSE_FWD);
+ RPMX_MTI_MAC100X_COMMAND_CONFIG_PAUSE_IGNORE);
}
if (tx_pause) {
rvu_npc_free_mcam_entries(rvu, pcifunc, -1);
rvu_mac_reset(rvu, pcifunc);
+ if (rvu->mcs_blk_cnt)
+ rvu_mcs_flr_handler(rvu, pcifunc);
+
mutex_unlock(&rvu->flr_lock);
}
struct nix_txvlan txvlan;
struct nix_ipolicer *ipolicer;
u64 *tx_credits;
+ u8 cc_mcs_cnt;
};
/* RVU block's capabilities or functionality,
rvu_dl->devlink_wq = create_workqueue("rvu_devlink_wq");
if (!rvu_dl->devlink_wq)
- goto err;
+ return -ENOMEM;
INIT_WORK(&rvu_reporters->intr_work, rvu_nix_intr_work);
INIT_WORK(&rvu_reporters->gen_work, rvu_nix_gen_work);
INIT_WORK(&rvu_reporters->ras_work, rvu_nix_ras_work);
return 0;
-err:
- rvu_nix_health_reporters_destroy(rvu_dl);
- return -ENOMEM;
}
static int rvu_nix_health_reporters_create(struct rvu_devlink *rvu_dl)
rvu_dl->devlink_wq = create_workqueue("rvu_devlink_wq");
if (!rvu_dl->devlink_wq)
- goto err;
+ return -ENOMEM;
INIT_WORK(&rvu_reporters->intr_work, rvu_npa_intr_work);
INIT_WORK(&rvu_reporters->err_work, rvu_npa_err_work);
INIT_WORK(&rvu_reporters->ras_work, rvu_npa_ras_work);
return 0;
-err:
- rvu_npa_health_reporters_destroy(rvu_dl);
- return -ENOMEM;
}
static int rvu_npa_health_reporters_create(struct rvu_devlink *rvu_dl)
#include "rvu_reg.h"
#include "rvu.h"
#include "npc.h"
+#include "mcs.h"
#include "cgx.h"
#include "lmac_common.h"
#include "rvu_npc_hash.h"
SDP_HW_MAX_FRS << 16 | NIC_HW_MIN_FRS);
}
+ /* Get MCS external bypass status for CN10K-B */
+ if (mcs_get_blkcnt() == 1) {
+ /* Adjust for 2 credits when external bypass is disabled */
+ nix_hw->cc_mcs_cnt = is_mcs_bypass(0) ? 0 : 2;
+ }
+
/* Set credits for Tx links assuming max packet length allowed.
* This will be reconfigured based on MTU set for PF/VF.
*/
tx_credits = (lmac_fifo_len - lmac_max_frs) / 16;
/* Enable credits and set credit pkt count to max allowed */
cfg = (tx_credits << 12) | (0x1FF << 2) | BIT_ULL(1);
+ cfg |= FIELD_PREP(NIX_AF_LINKX_MCS_CNT_MASK, nix_hw->cc_mcs_cnt);
link = iter + slink;
nix_hw->tx_credits[link] = tx_credits;
ipolicer = &nix_hw->ipolicer[layer];
for (idx = 0; idx < req->prof_count[layer]; idx++) {
+ if (idx == MAX_BANDPROF_PER_PFFUNC)
+ break;
prof_idx = req->prof_idx[layer][idx];
if (prof_idx >= ipolicer->band_prof.max ||
ipolicer->pfvf_map[prof_idx] != pcifunc)
ipolicer->pfvf_map[prof_idx] = 0x00;
ipolicer->match_id[prof_idx] = 0;
rvu_free_rsrc(&ipolicer->band_prof, prof_idx);
- if (idx == MAX_BANDPROF_PER_PFFUNC)
- break;
}
}
mutex_unlock(&rvu->rsrc_lock);
int bank, nixlf, index;
/* get ucast entry rule entry index */
- nix_get_nixlf(rvu, pf_func, &nixlf, NULL);
+ if (nix_get_nixlf(rvu, pf_func, &nixlf, NULL)) {
+ dev_err(rvu->dev, "%s: nixlf not attached to pcifunc:0x%x\n",
+ __func__, pf_func);
+ /* Action 0 is drop */
+ return 0;
+ }
+
index = npc_get_nixlf_mcam_index(mcam, pf_func, nixlf,
NIXLF_UCAST_ENTRY);
bank = npc_get_bank(mcam, index);
int blkaddr, ucast_idx, index;
struct nix_rx_action action = { 0 };
u64 relaxed_mask;
+ u8 flow_key_alg;
if (!hw->cap.nix_rx_multicast && is_cgx_vf(rvu, pcifunc))
return;
action.op = NIX_RX_ACTIONOP_UCAST;
}
+ flow_key_alg = action.flow_key_alg;
+
/* RX_ACTION set to MCAST for CGX PF's */
if (hw->cap.nix_rx_multicast && pfvf->use_mce_list &&
is_pf_cgxmapped(rvu, rvu_get_pf(pcifunc))) {
req.vf = pcifunc;
req.index = action.index;
req.match_id = action.match_id;
- req.flow_key_alg = action.flow_key_alg;
+ req.flow_key_alg = flow_key_alg;
rvu_mbox_handler_npc_install_flow(rvu, &req, &rsp);
}
u8 mac_addr[ETH_ALEN] = { 0 };
struct nix_rx_action action = { 0 };
struct rvu_pfvf *pfvf;
+ u8 flow_key_alg;
u16 vf_func;
/* Only CGX PF/VF can add allmulticast entry */
*(u64 *)&action = npc_get_mcam_action(rvu, mcam,
blkaddr, ucast_idx);
+ flow_key_alg = action.flow_key_alg;
if (action.op != NIX_RX_ACTIONOP_RSS) {
*(u64 *)&action = 0;
action.op = NIX_RX_ACTIONOP_UCAST;
req.vf = pcifunc | vf_func;
req.index = action.index;
req.match_id = action.match_id;
- req.flow_key_alg = action.flow_key_alg;
+ req.flow_key_alg = flow_key_alg;
rvu_mbox_handler_npc_install_flow(rvu, &req, &rsp);
}
mutex_unlock(&mcam->lock);
}
+static void npc_update_rx_action_with_alg_idx(struct rvu *rvu, struct nix_rx_action action,
+ struct rvu_pfvf *pfvf, int mcam_index, int blkaddr,
+ int alg_idx)
+
+{
+ struct npc_mcam *mcam = &rvu->hw->mcam;
+ struct rvu_hwinfo *hw = rvu->hw;
+ int bank, op_rss;
+
+ if (!is_mcam_entry_enabled(rvu, mcam, blkaddr, mcam_index))
+ return;
+
+ op_rss = (!hw->cap.nix_rx_multicast || !pfvf->use_mce_list);
+
+ bank = npc_get_bank(mcam, mcam_index);
+ mcam_index &= (mcam->banksize - 1);
+
+ /* If Rx action is MCAST update only RSS algorithm index */
+ if (!op_rss) {
+ *(u64 *)&action = rvu_read64(rvu, blkaddr,
+ NPC_AF_MCAMEX_BANKX_ACTION(mcam_index, bank));
+
+ action.flow_key_alg = alg_idx;
+ }
+ rvu_write64(rvu, blkaddr,
+ NPC_AF_MCAMEX_BANKX_ACTION(mcam_index, bank), *(u64 *)&action);
+}
+
void rvu_npc_update_flowkey_alg_idx(struct rvu *rvu, u16 pcifunc, int nixlf,
int group, int alg_idx, int mcam_index)
{
struct npc_mcam *mcam = &rvu->hw->mcam;
- struct rvu_hwinfo *hw = rvu->hw;
struct nix_rx_action action;
int blkaddr, index, bank;
struct rvu_pfvf *pfvf;
/* If PF's promiscuous entry is enabled,
* Set RSS action for that entry as well
*/
- if ((!hw->cap.nix_rx_multicast || !pfvf->use_mce_list) &&
- is_mcam_entry_enabled(rvu, mcam, blkaddr, index)) {
- bank = npc_get_bank(mcam, index);
- index &= (mcam->banksize - 1);
+ npc_update_rx_action_with_alg_idx(rvu, action, pfvf, index, blkaddr,
+ alg_idx);
- rvu_write64(rvu, blkaddr,
- NPC_AF_MCAMEX_BANKX_ACTION(index, bank),
- *(u64 *)&action);
- }
+ index = npc_get_nixlf_mcam_index(mcam, pcifunc,
+ nixlf, NIXLF_ALLMULTI_ENTRY);
+ /* If PF's allmulti entry is enabled,
+ * Set RSS action for that entry as well
+ */
+ npc_update_rx_action_with_alg_idx(rvu, action, pfvf, index, blkaddr,
+ alg_idx);
}
void npc_enadis_default_mce_entry(struct rvu *rvu, u16 pcifunc,
{NIX_TXSCH_LVL_TL4, 3, 0xFFFF, {{0x0B00, 0x0B08}, {0x0B10, 0x0B18},
{0x1200, 0x12E0} } },
{NIX_TXSCH_LVL_TL3, 4, 0xFFFF, {{0x1000, 0x10E0}, {0x1600, 0x1608},
- {0x1610, 0x1618}, {0x1700, 0x17B0} } },
- {NIX_TXSCH_LVL_TL2, 2, 0xFFFF, {{0x0E00, 0x0EE0}, {0x1700, 0x17B0} } },
+ {0x1610, 0x1618}, {0x1700, 0x17C8} } },
+ {NIX_TXSCH_LVL_TL2, 2, 0xFFFF, {{0x0E00, 0x0EE0}, {0x1700, 0x17C8} } },
{NIX_TXSCH_LVL_TL1, 1, 0xFFFF, {{0x0C00, 0x0D98} } },
};
#define NIX_AF_LINKX_BASE_MASK GENMASK_ULL(11, 0)
#define NIX_AF_LINKX_RANGE_MASK GENMASK_ULL(19, 16)
+#define NIX_AF_LINKX_MCS_CNT_MASK GENMASK_ULL(33, 32)
/* SSO */
#define SSO_AF_CONST (0x1000)
aq->prof.pebs_mantissa = 0;
aq->prof_mask.pebs_mantissa = 0xFF;
+ aq->prof.hl_en = 0;
+ aq->prof_mask.hl_en = 1;
+
/* Fill AQ info */
aq->qidx = profile;
aq->ctype = NIX_AQ_CTYPE_BANDPROF;
void otx2_shutdown_tc(struct otx2_nic *nic);
int otx2_setup_tc(struct net_device *netdev, enum tc_setup_type type,
void *type_data);
+void otx2_tc_apply_ingress_police_rules(struct otx2_nic *nic);
+
/* CGX/RPM DMAC filters support */
int otx2_dmacflt_get_max_cnt(struct otx2_nic *pf);
int otx2_dmacflt_add(struct otx2_nic *pf, const u8 *mac, u32 bit_pos);
static int otx2_dcbnl_ieee_setpfc(struct net_device *dev, struct ieee_pfc *pfc)
{
struct otx2_nic *pfvf = netdev_priv(dev);
+ u8 old_pfc_en;
int err;
- /* Save PFC configuration to interface */
+ old_pfc_en = pfvf->pfc_en;
pfvf->pfc_en = pfc->pfc_en;
if (pfvf->hw.tx_queues >= NIX_PF_PFC_PRIO_MAX)
* supported by the tx queue configuration
*/
err = otx2_check_pfc_config(pfvf);
- if (err)
+ if (err) {
+ pfvf->pfc_en = old_pfc_en;
return err;
+ }
process_pfc:
err = otx2_config_priority_flow_ctrl(pfvf);
- if (err)
+ if (err) {
+ pfvf->pfc_en = old_pfc_en;
return err;
+ }
/* Request Per channel Bpids */
if (pfc->pfc_en)
err = otx2_pfc_txschq_update(pfvf);
if (err) {
+ if (pfc->pfc_en)
+ otx2_nix_config_bp(pfvf, false);
+
+ otx2_pfc_txschq_stop(pfvf);
+ pfvf->pfc_en = old_pfc_en;
+ otx2_config_priority_flow_ctrl(pfvf);
dev_err(pfvf->dev, "%s failed to update TX schedulers\n", __func__);
return err;
}
if (is_otx2_lbkvf(pfvf->pdev))
return;
+ mutex_lock(&pfvf->mbox.lock);
req = otx2_mbox_alloc_msg_cgx_cfg_pause_frm(&pfvf->mbox);
- if (!req)
+ if (!req) {
+ mutex_unlock(&pfvf->mbox.lock);
return;
+ }
if (!otx2_sync_mbox_msg(&pfvf->mbox)) {
rsp = (struct cgx_pause_frm_cfg *)
pause->rx_pause = rsp->rx_pause;
pause->tx_pause = rsp->tx_pause;
}
+ mutex_unlock(&pfvf->mbox.lock);
}
static int otx2_set_pauseparam(struct net_device *netdev,
struct ethhdr *eth_hdr;
bool new = false;
int err = 0;
+ u64 vf_num;
u32 ring;
if (!flow_cfg->max_flows) {
if (!(pfvf->flags & OTX2_FLAG_NTUPLE_SUPPORT))
return -ENOMEM;
- if (ring >= pfvf->hw.rx_queues && fsp->ring_cookie != RX_CLS_FLOW_DISC)
+ /* Number of queues on a VF can be greater or less than
+ * the PF's queue. Hence no need to check for the
+ * queue count. Hence no need to check queue count if PF
+ * is installing for its VF. Below is the expected vf_num value
+ * based on the ethtool commands.
+ *
+ * e.g.
+ * 1. ethtool -U <netdev> ... action -1 ==> vf_num:255
+ * 2. ethtool -U <netdev> ... action <queue_num> ==> vf_num:0
+ * 3. ethtool -U <netdev> ... vf <vf_idx> queue <queue_num> ==>
+ * vf_num:vf_idx+1
+ */
+ vf_num = ethtool_get_flow_spec_ring_vf(fsp->ring_cookie);
+ if (!is_otx2_vf(pfvf->pcifunc) && !vf_num &&
+ ring >= pfvf->hw.rx_queues && fsp->ring_cookie != RX_CLS_FLOW_DISC)
return -EINVAL;
if (fsp->location >= otx2_get_maxflows(flow_cfg))
flow_cfg->nr_flows++;
}
+ if (flow->is_vf)
+ netdev_info(pfvf->netdev,
+ "Make sure that VF's queue number is within its queue limit\n");
return 0;
}
otx2_write64(pf, RVU_PF_VFPF_MBOX_INTX(1), intr);
otx2_queue_work(mbox, pf->mbox_pfvf_wq, 64, vfs, intr,
TYPE_PFVF);
- vfs -= 64;
+ if (intr)
+ trace_otx2_msg_interrupt(mbox->mbox.pdev, "VF(s) to PF", intr);
+ vfs = 64;
}
intr = otx2_read64(pf, RVU_PF_VFPF_MBOX_INTX(0));
otx2_queue_work(mbox, pf->mbox_pfvf_wq, 0, vfs, intr, TYPE_PFVF);
- trace_otx2_msg_interrupt(mbox->mbox.pdev, "VF(s) to PF", intr);
+ if (intr)
+ trace_otx2_msg_interrupt(mbox->mbox.pdev, "VF(s) to PF", intr);
return IRQ_HANDLED;
}
mutex_unlock(&mbox->lock);
}
+static bool otx2_promisc_use_mce_list(struct otx2_nic *pfvf)
+{
+ int vf;
+
+ /* The AF driver will determine whether to allow the VF netdev or not */
+ if (is_otx2_vf(pfvf->pcifunc))
+ return true;
+
+ /* check if there are any trusted VFs associated with the PF netdev */
+ for (vf = 0; vf < pci_num_vf(pfvf->pdev); vf++)
+ if (pfvf->vf_configs[vf].trusted)
+ return true;
+ return false;
+}
+
static void otx2_do_set_rx_mode(struct otx2_nic *pf)
{
struct net_device *netdev = pf->netdev;
if (netdev->flags & (IFF_ALLMULTI | IFF_MULTICAST))
req->mode |= NIX_RX_MODE_ALLMULTI;
- req->mode |= NIX_RX_MODE_USE_MCE;
+ if (otx2_promisc_use_mce_list(pf))
+ req->mode |= NIX_RX_MODE_USE_MCE;
otx2_sync_mbox_msg(&pf->mbox);
mutex_unlock(&pf->mbox.lock);
}
+static void otx2_set_irq_coalesce(struct otx2_nic *pfvf)
+{
+ int cint;
+
+ for (cint = 0; cint < pfvf->hw.cint_cnt; cint++)
+ otx2_config_irq_coalescing(pfvf, cint);
+}
+
static void otx2_dim_work(struct work_struct *w)
{
struct dim_cq_moder cur_moder;
CQ_TIMER_THRESH_MAX : cur_moder.usec;
pfvf->hw.cq_ecount_wait = (cur_moder.pkts > NAPI_POLL_WEIGHT) ?
NAPI_POLL_WEIGHT : cur_moder.pkts;
+ otx2_set_irq_coalesce(pfvf);
dim->state = DIM_START_MEASURE;
}
if (pf->flags & OTX2_FLAG_DMACFLTR_SUPPORT)
otx2_dmacflt_reinstall_flows(pf);
+ otx2_tc_apply_ingress_police_rules(pf);
+
err = otx2_rxtx_enable(pf, true);
/* If a mbox communication error happens at this point then interface
* will end up in a state such that it is in down state but hardware
/* Clear RSS enable flag */
rss = &pf->hw.rss_info;
rss->enable = false;
+ if (!netif_is_rxfh_configured(netdev))
+ kfree(rss->rss_ctx[DEFAULT_RSS_CONTEXT_GROUP]);
/* Cleanup Queue IRQ */
vec = pci_irq_vector(pf->pdev,
pf->vf_configs[vf].trusted = enable;
rc = otx2_set_vf_permissions(pf, vf, OTX2_TRUSTED_VF);
- if (rc)
+ if (rc) {
pf->vf_configs[vf].trusted = !enable;
- else
+ } else {
netdev_info(pf->netdev, "VF %d is %strusted\n",
vf, enable ? "" : "not ");
+ otx2_set_rx_mode(netdev);
+ }
+
return rc;
}
bool is_act_police;
u32 prio;
struct npc_install_flow_req req;
+ u64 rate;
+ u32 burst;
+ bool is_pps;
};
static void otx2_get_egress_burst_cfg(struct otx2_nic *nic, u32 burst,
return err;
}
-static int otx2_tc_act_set_police(struct otx2_nic *nic,
- struct otx2_tc_flow *node,
- struct flow_cls_offload *f,
- u64 rate, u32 burst, u32 mark,
- struct npc_install_flow_req *req, bool pps)
+static int otx2_tc_act_set_hw_police(struct otx2_nic *nic,
+ struct otx2_tc_flow *node)
{
- struct netlink_ext_ack *extack = f->common.extack;
- struct otx2_hw *hw = &nic->hw;
- int rq_idx, rc;
-
- rq_idx = find_first_zero_bit(&nic->rq_bmap, hw->rx_queues);
- if (rq_idx >= hw->rx_queues) {
- NL_SET_ERR_MSG_MOD(extack, "Police action rules exceeded");
- return -EINVAL;
- }
+ int rc;
mutex_lock(&nic->mbox.lock);
return rc;
}
- rc = cn10k_set_ipolicer_rate(nic, node->leaf_profile, burst, rate, pps);
+ rc = cn10k_set_ipolicer_rate(nic, node->leaf_profile,
+ node->burst, node->rate, node->is_pps);
if (rc)
goto free_leaf;
- rc = cn10k_map_unmap_rq_policer(nic, rq_idx, node->leaf_profile, true);
+ rc = cn10k_map_unmap_rq_policer(nic, node->rq, node->leaf_profile, true);
if (rc)
goto free_leaf;
mutex_unlock(&nic->mbox.lock);
- req->match_id = mark & 0xFFFFULL;
- req->index = rq_idx;
- req->op = NIX_RX_ACTIONOP_UCAST;
- set_bit(rq_idx, &nic->rq_bmap);
- node->is_act_police = true;
- node->rq = rq_idx;
-
return 0;
free_leaf:
return rc;
}
+static int otx2_tc_act_set_police(struct otx2_nic *nic,
+ struct otx2_tc_flow *node,
+ struct flow_cls_offload *f,
+ u64 rate, u32 burst, u32 mark,
+ struct npc_install_flow_req *req, bool pps)
+{
+ struct netlink_ext_ack *extack = f->common.extack;
+ struct otx2_hw *hw = &nic->hw;
+ int rq_idx, rc;
+
+ rq_idx = find_first_zero_bit(&nic->rq_bmap, hw->rx_queues);
+ if (rq_idx >= hw->rx_queues) {
+ NL_SET_ERR_MSG_MOD(extack, "Police action rules exceeded");
+ return -EINVAL;
+ }
+
+ req->match_id = mark & 0xFFFFULL;
+ req->index = rq_idx;
+ req->op = NIX_RX_ACTIONOP_UCAST;
+
+ node->is_act_police = true;
+ node->rq = rq_idx;
+ node->burst = burst;
+ node->rate = rate;
+ node->is_pps = pps;
+
+ rc = otx2_tc_act_set_hw_police(nic, node);
+ if (!rc)
+ set_bit(rq_idx, &nic->rq_bmap);
+
+ return rc;
+}
+
static int otx2_tc_parse_actions(struct otx2_nic *nic,
struct flow_action *flow_action,
struct npc_install_flow_req *req,
}
if (flow_node->is_act_police) {
+ __clear_bit(flow_node->rq, &nic->rq_bmap);
+
+ if (nic->flags & OTX2_FLAG_INTF_DOWN)
+ goto free_mcam_flow;
+
mutex_lock(&nic->mbox.lock);
err = cn10k_map_unmap_rq_policer(nic, flow_node->rq,
"Unable to free leaf bandwidth profile(%d)\n",
flow_node->leaf_profile);
- __clear_bit(flow_node->rq, &nic->rq_bmap);
-
mutex_unlock(&nic->mbox.lock);
}
+free_mcam_flow:
otx2_del_mcam_flow_entry(nic, flow_node->entry, NULL);
otx2_tc_update_mcam_table(nic, flow_cfg, flow_node, false);
kfree_rcu(flow_node, rcu);
if (!(nic->flags & OTX2_FLAG_TC_FLOWER_SUPPORT))
return -ENOMEM;
+ if (nic->flags & OTX2_FLAG_INTF_DOWN) {
+ NL_SET_ERR_MSG_MOD(extack, "Interface not initialized");
+ return -EINVAL;
+ }
+
if (flow_cfg->nr_flows == flow_cfg->max_flows) {
NL_SET_ERR_MSG_MOD(extack,
"Free MCAM entry not available to add the flow");
otx2_destroy_tc_flow_list(nic);
}
EXPORT_SYMBOL(otx2_shutdown_tc);
+
+static void otx2_tc_config_ingress_rule(struct otx2_nic *nic,
+ struct otx2_tc_flow *node)
+{
+ struct npc_install_flow_req *req;
+
+ if (otx2_tc_act_set_hw_police(nic, node))
+ return;
+
+ mutex_lock(&nic->mbox.lock);
+
+ req = otx2_mbox_alloc_msg_npc_install_flow(&nic->mbox);
+ if (!req)
+ goto err;
+
+ memcpy(req, &node->req, sizeof(struct npc_install_flow_req));
+
+ if (otx2_sync_mbox_msg(&nic->mbox))
+ netdev_err(nic->netdev,
+ "Failed to install MCAM flow entry for ingress rule");
+err:
+ mutex_unlock(&nic->mbox.lock);
+}
+
+void otx2_tc_apply_ingress_police_rules(struct otx2_nic *nic)
+{
+ struct otx2_flow_config *flow_cfg = nic->flow_cfg;
+ struct otx2_tc_flow *node;
+
+ /* If any ingress policer rules exist for the interface then
+ * apply those rules. Ingress policer rules depend on bandwidth
+ * profiles linked to the receive queues. Since no receive queues
+ * exist when interface is down, ingress policer rules are stored
+ * and configured in hardware after all receive queues are allocated
+ * in otx2_open.
+ */
+ list_for_each_entry(node, &flow_cfg->flow_list_tc, list) {
+ if (node->is_act_police)
+ otx2_tc_config_ingress_rule(nic, node);
+ }
+}
+EXPORT_SYMBOL(otx2_tc_apply_ingress_police_rules);
{
struct dim_sample dim_sample;
u64 rx_frames, rx_bytes;
+ u64 tx_frames, tx_bytes;
rx_frames = OTX2_GET_RX_STATS(RX_BCAST) + OTX2_GET_RX_STATS(RX_MCAST) +
OTX2_GET_RX_STATS(RX_UCAST);
rx_bytes = OTX2_GET_RX_STATS(RX_OCTS);
- dim_update_sample(pfvf->napi_events, rx_frames, rx_bytes, &dim_sample);
+ tx_bytes = OTX2_GET_TX_STATS(TX_OCTS);
+ tx_frames = OTX2_GET_TX_STATS(TX_UCAST);
+
+ dim_update_sample(pfvf->napi_events,
+ rx_frames + tx_frames,
+ rx_bytes + tx_bytes,
+ &dim_sample);
net_dim(&cq_poll->dim, dim_sample);
}
if (pfvf->flags & OTX2_FLAG_INTF_DOWN)
return workdone;
- /* Check for adaptive interrupt coalesce */
- if (workdone != 0 &&
- ((pfvf->flags & OTX2_FLAG_ADPTV_INT_COAL_ENABLED) ==
- OTX2_FLAG_ADPTV_INT_COAL_ENABLED)) {
- /* Adjust irq coalese using net_dim */
+ /* Adjust irq coalese using net_dim */
+ if (pfvf->flags & OTX2_FLAG_ADPTV_INT_COAL_ENABLED)
otx2_adjust_adaptive_coalese(pfvf, cq_poll);
- /* Update irq coalescing */
- for (i = 0; i < pfvf->hw.cint_cnt; i++)
- otx2_config_irq_coalescing(pfvf, i);
- }
if (unlikely(!filled_cnt)) {
struct refill_work *work;
for (i = 0; i < q->n_desc; i++) {
struct mtk_wed_wo_queue_entry *entry = &q->entry[i];
+ if (!entry->buf)
+ continue;
+
dma_unmap_single(wo->hw->dev, entry->addr, entry->len,
DMA_TO_DEVICE);
skb_free_frag(entry->buf);
return token;
}
-static int cmd_alloc_index(struct mlx5_cmd *cmd)
+static int cmd_alloc_index(struct mlx5_cmd *cmd, struct mlx5_cmd_work_ent *ent)
{
unsigned long flags;
int ret;
spin_lock_irqsave(&cmd->alloc_lock, flags);
ret = find_first_bit(&cmd->vars.bitmask, cmd->vars.max_reg_cmds);
- if (ret < cmd->vars.max_reg_cmds)
+ if (ret < cmd->vars.max_reg_cmds) {
clear_bit(ret, &cmd->vars.bitmask);
+ ent->idx = ret;
+ cmd->ent_arr[ent->idx] = ent;
+ }
spin_unlock_irqrestore(&cmd->alloc_lock, flags);
return ret < cmd->vars.max_reg_cmds ? ret : -ENOMEM;
sem = ent->page_queue ? &cmd->vars.pages_sem : &cmd->vars.sem;
down(sem);
if (!ent->page_queue) {
- alloc_ret = cmd_alloc_index(cmd);
+ alloc_ret = cmd_alloc_index(cmd, ent);
if (alloc_ret < 0) {
mlx5_core_err_rl(dev, "failed to allocate command entry\n");
if (ent->callback) {
up(sem);
return;
}
- ent->idx = alloc_ret;
} else {
ent->idx = cmd->vars.max_reg_cmds;
spin_lock_irqsave(&cmd->alloc_lock, flags);
clear_bit(ent->idx, &cmd->vars.bitmask);
+ cmd->ent_arr[ent->idx] = ent;
spin_unlock_irqrestore(&cmd->alloc_lock, flags);
}
- cmd->ent_arr[ent->idx] = ent;
lay = get_inst(cmd, ent->idx);
ent->lay = lay;
memset(lay, 0, sizeof(*lay));
while (block_timestamp > tracer->last_timestamp) {
/* Check block override if it's not the first block */
- if (!tracer->last_timestamp) {
+ if (tracer->last_timestamp) {
u64 *ts_event;
/* To avoid block override be the HW in case of buffer
* wraparound, the time stamp of the previous block
MLX5E_STATE_DESTROYING,
MLX5E_STATE_XDP_TX_ENABLED,
MLX5E_STATE_XDP_ACTIVE,
+ MLX5E_STATE_CHANNELS_ACTIVE,
};
struct mlx5e_modify_sq_param {
in = kvzalloc(inlen, GFP_KERNEL);
if (!in || !ft->g) {
kfree(ft->g);
+ ft->g = NULL;
kvfree(in);
return -ENOMEM;
}
static void mlx5e_ptp_handle_ts_cqe(struct mlx5e_ptpsq *ptpsq,
struct mlx5_cqe64 *cqe,
+ u8 *md_buff,
+ u8 *md_buff_sz,
int budget)
{
struct mlx5e_ptp_port_ts_cqe_list *pending_cqe_list = ptpsq->ts_cqe_pending_list;
mlx5e_ptpsq_mark_ts_cqes_undelivered(ptpsq, hwtstamp);
out:
napi_consume_skb(skb, budget);
- mlx5e_ptp_metadata_fifo_push(&ptpsq->metadata_freelist, metadata_id);
+ md_buff[*md_buff_sz++] = metadata_id;
if (unlikely(mlx5e_ptp_metadata_map_unhealthy(&ptpsq->metadata_map)) &&
!test_and_set_bit(MLX5E_SQ_STATE_RECOVERING, &sq->state))
queue_work(ptpsq->txqsq.priv->wq, &ptpsq->report_unhealthy_work);
}
-static bool mlx5e_ptp_poll_ts_cq(struct mlx5e_cq *cq, int budget)
+static bool mlx5e_ptp_poll_ts_cq(struct mlx5e_cq *cq, int napi_budget)
{
struct mlx5e_ptpsq *ptpsq = container_of(cq, struct mlx5e_ptpsq, ts_cq);
- struct mlx5_cqwq *cqwq = &cq->wq;
+ int budget = min(napi_budget, MLX5E_TX_CQ_POLL_BUDGET);
+ u8 metadata_buff[MLX5E_TX_CQ_POLL_BUDGET];
+ u8 metadata_buff_sz = 0;
+ struct mlx5_cqwq *cqwq;
struct mlx5_cqe64 *cqe;
int work_done = 0;
+ cqwq = &cq->wq;
+
if (unlikely(!test_bit(MLX5E_SQ_STATE_ENABLED, &ptpsq->txqsq.state)))
return false;
do {
mlx5_cqwq_pop(cqwq);
- mlx5e_ptp_handle_ts_cqe(ptpsq, cqe, budget);
+ mlx5e_ptp_handle_ts_cqe(ptpsq, cqe,
+ metadata_buff, &metadata_buff_sz, napi_budget);
} while ((++work_done < budget) && (cqe = mlx5_cqwq_get_cqe(cqwq)));
mlx5_cqwq_update_db_record(cqwq);
/* ensure cq space is freed before enabling more cqes */
wmb();
+ while (metadata_buff_sz > 0)
+ mlx5e_ptp_metadata_fifo_push(&ptpsq->metadata_freelist,
+ metadata_buff[--metadata_buff_sz]);
+
mlx5e_txqsq_wake(&ptpsq->txqsq);
return work_done == budget;
void mlx5e_reporter_rx_timeout(struct mlx5e_rq *rq)
{
- char icosq_str[MLX5E_REPORTER_PER_Q_MAX_LEN] = {};
char err_str[MLX5E_REPORTER_PER_Q_MAX_LEN];
struct mlx5e_icosq *icosq = rq->icosq;
struct mlx5e_priv *priv = rq->priv;
struct mlx5e_err_ctx err_ctx = {};
+ char icosq_str[32] = {};
err_ctx.ctx = rq;
err_ctx.recover = mlx5e_rx_reporter_timeout_recover;
if (icosq)
snprintf(icosq_str, sizeof(icosq_str), "ICOSQ: 0x%x, ", icosq->sqn);
snprintf(err_str, sizeof(err_str),
- "RX timeout on channel: %d, %sRQ: 0x%x, CQ: 0x%x",
+ "RX timeout on channel: %d, %s RQ: 0x%x, CQ: 0x%x",
rq->ix, icosq_str, rq->rqn, rq->cq.mcq.cqn);
mlx5e_health_report(priv, priv->rx_reporter, err_str, &err_ctx);
}
esw_attr->dests[esw_attr->out_count].flags |= MLX5_ESW_DEST_ENCAP;
esw_attr->out_count++;
- /* attr->dests[].rep is resolved when we handle encap */
+ /* attr->dests[].vport is resolved when we handle encap */
return 0;
}
out_priv = netdev_priv(out_dev);
rpriv = out_priv->ppriv;
- esw_attr->dests[esw_attr->out_count].rep = rpriv->rep;
+ esw_attr->dests[esw_attr->out_count].vport_valid = true;
+ esw_attr->dests[esw_attr->out_count].vport = rpriv->rep->vport;
esw_attr->dests[esw_attr->out_count].mdev = out_priv->mdev;
esw_attr->out_count++;
struct mlx5_flow_spec *spec;
int err;
+ if (IS_ERR(post_act))
+ return PTR_ERR(post_act);
+
spec = kvzalloc(sizeof(*spec), GFP_KERNEL);
if (!spec)
return -ENOMEM;
struct mlx5e_post_act_handle *handle;
int err;
+ if (IS_ERR(post_act))
+ return ERR_CAST(post_act);
+
handle = kzalloc(sizeof(*handle), GFP_KERNEL);
if (!handle)
return ERR_PTR(-ENOMEM);
e->encap_size = ipv4_encap_size;
e->encap_header = encap_header;
+ encap_header = NULL;
if (!(nud_state & NUD_VALID)) {
neigh_event_send(attr.n, NULL);
memset(&reformat_params, 0, sizeof(reformat_params));
reformat_params.type = e->reformat_type;
- reformat_params.size = ipv4_encap_size;
- reformat_params.data = encap_header;
+ reformat_params.size = e->encap_size;
+ reformat_params.data = e->encap_header;
e->pkt_reformat = mlx5_packet_reformat_alloc(priv->mdev, &reformat_params,
MLX5_FLOW_NAMESPACE_FDB);
if (IS_ERR(e->pkt_reformat)) {
e->encap_size = ipv4_encap_size;
kfree(e->encap_header);
e->encap_header = encap_header;
+ encap_header = NULL;
if (!(nud_state & NUD_VALID)) {
neigh_event_send(attr.n, NULL);
memset(&reformat_params, 0, sizeof(reformat_params));
reformat_params.type = e->reformat_type;
- reformat_params.size = ipv4_encap_size;
- reformat_params.data = encap_header;
+ reformat_params.size = e->encap_size;
+ reformat_params.data = e->encap_header;
e->pkt_reformat = mlx5_packet_reformat_alloc(priv->mdev, &reformat_params,
MLX5_FLOW_NAMESPACE_FDB);
if (IS_ERR(e->pkt_reformat)) {
e->encap_size = ipv6_encap_size;
e->encap_header = encap_header;
+ encap_header = NULL;
if (!(nud_state & NUD_VALID)) {
neigh_event_send(attr.n, NULL);
memset(&reformat_params, 0, sizeof(reformat_params));
reformat_params.type = e->reformat_type;
- reformat_params.size = ipv6_encap_size;
- reformat_params.data = encap_header;
+ reformat_params.size = e->encap_size;
+ reformat_params.data = e->encap_header;
e->pkt_reformat = mlx5_packet_reformat_alloc(priv->mdev, &reformat_params,
MLX5_FLOW_NAMESPACE_FDB);
if (IS_ERR(e->pkt_reformat)) {
e->encap_size = ipv6_encap_size;
kfree(e->encap_header);
e->encap_header = encap_header;
+ encap_header = NULL;
if (!(nud_state & NUD_VALID)) {
neigh_event_send(attr.n, NULL);
memset(&reformat_params, 0, sizeof(reformat_params));
reformat_params.type = e->reformat_type;
- reformat_params.size = ipv6_encap_size;
- reformat_params.data = encap_header;
+ reformat_params.size = e->encap_size;
+ reformat_params.data = e->encap_header;
e->pkt_reformat = mlx5_packet_reformat_alloc(priv->mdev, &reformat_params,
MLX5_FLOW_NAMESPACE_FDB);
if (IS_ERR(e->pkt_reformat)) {
out_priv = netdev_priv(encap_dev);
rpriv = out_priv->ppriv;
- esw_attr->dests[out_index].rep = rpriv->rep;
+ esw_attr->dests[out_index].vport_valid = true;
+ esw_attr->dests[out_index].vport = rpriv->rep->vport;
esw_attr->dests[out_index].mdev = out_priv->mdev;
}
dma_addr_t dma_addr = xdptxd->dma_addr;
u32 dma_len = xdptxd->len;
u16 ds_cnt, inline_hdr_sz;
+ unsigned int frags_size;
u8 num_wqebbs = 1;
int num_frags = 0;
bool inline_ok;
inline_ok = sq->min_inline_mode == MLX5_INLINE_MODE_NONE ||
dma_len >= MLX5E_XDP_MIN_INLINE;
+ frags_size = xdptxd->has_frags ? xdptxdf->sinfo->xdp_frags_size : 0;
- if (unlikely(!inline_ok || sq->hw_mtu < dma_len)) {
+ if (unlikely(!inline_ok || sq->hw_mtu < dma_len + frags_size)) {
stats->err++;
return false;
}
if (x->xso.type == XFRM_DEV_OFFLOAD_CRYPTO)
esn_msb = xfrm_replay_seqhi(x, htonl(seq_bottom));
- sa_entry->esn_state.esn = esn;
+ if (sa_entry->esn_state.esn_msb)
+ sa_entry->esn_state.esn = esn;
+ else
+ /* According to RFC4303, section "3.3.3. Sequence Number Generation",
+ * the first packet sent using a given SA will contain a sequence
+ * number of 1.
+ */
+ sa_entry->esn_state.esn = max_t(u32, esn, 1);
sa_entry->esn_state.esn_msb = esn_msb;
if (unlikely(overlap && seq_bottom < MLX5E_IPSEC_ESN_SCOPE_MID)) {
attrs->replay_esn.esn = sa_entry->esn_state.esn;
attrs->replay_esn.esn_msb = sa_entry->esn_state.esn_msb;
attrs->replay_esn.overlap = sa_entry->esn_state.overlap;
+ switch (x->replay_esn->replay_window) {
+ case 32:
+ attrs->replay_esn.replay_window =
+ MLX5_IPSEC_ASO_REPLAY_WIN_32BIT;
+ break;
+ case 64:
+ attrs->replay_esn.replay_window =
+ MLX5_IPSEC_ASO_REPLAY_WIN_64BIT;
+ break;
+ case 128:
+ attrs->replay_esn.replay_window =
+ MLX5_IPSEC_ASO_REPLAY_WIN_128BIT;
+ break;
+ case 256:
+ attrs->replay_esn.replay_window =
+ MLX5_IPSEC_ASO_REPLAY_WIN_256BIT;
+ break;
+ default:
+ WARN_ON(true);
+ return;
+ }
}
attrs->dir = x->xso.dir;
return;
mlx5e_accel_ipsec_fs_cleanup(ipsec);
- if (mlx5_ipsec_device_caps(priv->mdev) & MLX5_IPSEC_CAP_TUNNEL)
+ if (ipsec->netevent_nb.notifier_call) {
unregister_netevent_notifier(&ipsec->netevent_nb);
- if (mlx5_ipsec_device_caps(priv->mdev) & MLX5_IPSEC_CAP_PACKET_OFFLOAD)
+ ipsec->netevent_nb.notifier_call = NULL;
+ }
+ if (ipsec->aso)
mlx5e_ipsec_aso_cleanup(ipsec);
destroy_workqueue(ipsec->wq);
kfree(ipsec);
}
}
+ if (x->xdo.type == XFRM_DEV_OFFLOAD_PACKET &&
+ !(mlx5_ipsec_device_caps(mdev) & MLX5_IPSEC_CAP_PACKET_OFFLOAD)) {
+ NL_SET_ERR_MSG_MOD(extack, "Packet offload is not supported");
+ return -EINVAL;
+ }
+
return 0;
}
.xdo_dev_state_free = mlx5e_xfrm_free_state,
.xdo_dev_offload_ok = mlx5e_ipsec_offload_ok,
.xdo_dev_state_advance_esn = mlx5e_xfrm_advance_esn_state,
-};
-
-static const struct xfrmdev_ops mlx5e_ipsec_packet_xfrmdev_ops = {
- .xdo_dev_state_add = mlx5e_xfrm_add_state,
- .xdo_dev_state_delete = mlx5e_xfrm_del_state,
- .xdo_dev_state_free = mlx5e_xfrm_free_state,
- .xdo_dev_offload_ok = mlx5e_ipsec_offload_ok,
- .xdo_dev_state_advance_esn = mlx5e_xfrm_advance_esn_state,
.xdo_dev_state_update_curlft = mlx5e_xfrm_update_curlft,
.xdo_dev_policy_add = mlx5e_xfrm_add_policy,
mlx5_core_info(mdev, "mlx5e: IPSec ESP acceleration enabled\n");
- if (mlx5_ipsec_device_caps(mdev) & MLX5_IPSEC_CAP_PACKET_OFFLOAD)
- netdev->xfrmdev_ops = &mlx5e_ipsec_packet_xfrmdev_ops;
- else
- netdev->xfrmdev_ops = &mlx5e_ipsec_xfrmdev_ops;
-
+ netdev->xfrmdev_ops = &mlx5e_ipsec_xfrmdev_ops;
netdev->features |= NETIF_F_HW_ESP;
netdev->hw_enc_features |= NETIF_F_HW_ESP;
u32 refcnt;
};
+struct mlx5e_ipsec_drop {
+ struct mlx5_flow_handle *rule;
+ struct mlx5_fc *fc;
+};
+
struct mlx5e_ipsec_rule {
struct mlx5_flow_handle *rule;
struct mlx5_modify_hdr *modify_hdr;
struct mlx5_pkt_reformat *pkt_reformat;
struct mlx5_fc *fc;
+ struct mlx5e_ipsec_drop replay;
+ struct mlx5e_ipsec_drop auth;
+ struct mlx5e_ipsec_drop trailer;
};
struct mlx5e_ipsec_miss {
struct mlx5_flow_handle *rule;
};
-struct mlx5e_ipsec_rx {
- struct mlx5e_ipsec_ft ft;
- struct mlx5e_ipsec_miss pol;
- struct mlx5e_ipsec_miss sa;
- struct mlx5e_ipsec_rule status;
- struct mlx5e_ipsec_miss status_drop;
- struct mlx5_fc *status_drop_cnt;
- struct mlx5e_ipsec_fc *fc;
- struct mlx5_fs_chains *chains;
- u8 allow_tunnel_mode : 1;
- struct xarray ipsec_obj_id_map;
-};
-
struct mlx5e_ipsec_tx_create_attr {
int prio;
int pol_level;
struct mlx5_ipsec_fs *roce;
u8 is_uplink_rep: 1;
struct mlx5e_ipsec_mpv_work mpv_work;
+ struct xarray ipsec_obj_id_map;
};
struct mlx5e_ipsec_esn_state {
u8 allow_tunnel_mode : 1;
};
+struct mlx5e_ipsec_status_checks {
+ struct mlx5_flow_group *drop_all_group;
+ struct mlx5e_ipsec_drop all;
+};
+
+struct mlx5e_ipsec_rx {
+ struct mlx5e_ipsec_ft ft;
+ struct mlx5e_ipsec_miss pol;
+ struct mlx5e_ipsec_miss sa;
+ struct mlx5e_ipsec_rule status;
+ struct mlx5e_ipsec_status_checks status_drops;
+ struct mlx5e_ipsec_fc *fc;
+ struct mlx5_fs_chains *chains;
+ u8 allow_tunnel_mode : 1;
+};
+
/* IPsec RX flow steering */
static enum mlx5_traffic_types family2tt(u32 family)
{
return mlx5_create_auto_grouped_flow_table(ns, &ft_attr);
}
-static int ipsec_status_rule(struct mlx5_core_dev *mdev,
- struct mlx5e_ipsec_rx *rx,
- struct mlx5_flow_destination *dest)
+static void ipsec_rx_status_drop_destroy(struct mlx5e_ipsec *ipsec,
+ struct mlx5e_ipsec_rx *rx)
+{
+ mlx5_del_flow_rules(rx->status_drops.all.rule);
+ mlx5_fc_destroy(ipsec->mdev, rx->status_drops.all.fc);
+ mlx5_destroy_flow_group(rx->status_drops.drop_all_group);
+}
+
+static void ipsec_rx_status_pass_destroy(struct mlx5e_ipsec *ipsec,
+ struct mlx5e_ipsec_rx *rx)
{
- u8 action[MLX5_UN_SZ_BYTES(set_add_copy_action_in_auto)] = {};
+ mlx5_del_flow_rules(rx->status.rule);
+
+ if (rx != ipsec->rx_esw)
+ return;
+
+#ifdef CONFIG_MLX5_ESWITCH
+ mlx5_chains_put_table(esw_chains(ipsec->mdev->priv.eswitch), 0, 1, 0);
+#endif
+}
+
+static int rx_add_rule_drop_auth_trailer(struct mlx5e_ipsec_sa_entry *sa_entry,
+ struct mlx5e_ipsec_rx *rx)
+{
+ struct mlx5e_ipsec *ipsec = sa_entry->ipsec;
+ struct mlx5_flow_table *ft = rx->ft.status;
+ struct mlx5_core_dev *mdev = ipsec->mdev;
+ struct mlx5_flow_destination dest = {};
struct mlx5_flow_act flow_act = {};
- struct mlx5_modify_hdr *modify_hdr;
- struct mlx5_flow_handle *fte;
+ struct mlx5_flow_handle *rule;
+ struct mlx5_fc *flow_counter;
struct mlx5_flow_spec *spec;
int err;
if (!spec)
return -ENOMEM;
- /* Action to copy 7 bit ipsec_syndrome to regB[24:30] */
- MLX5_SET(copy_action_in, action, action_type, MLX5_ACTION_TYPE_COPY);
- MLX5_SET(copy_action_in, action, src_field, MLX5_ACTION_IN_FIELD_IPSEC_SYNDROME);
- MLX5_SET(copy_action_in, action, src_offset, 0);
- MLX5_SET(copy_action_in, action, length, 7);
- MLX5_SET(copy_action_in, action, dst_field, MLX5_ACTION_IN_FIELD_METADATA_REG_B);
- MLX5_SET(copy_action_in, action, dst_offset, 24);
+ flow_counter = mlx5_fc_create(mdev, true);
+ if (IS_ERR(flow_counter)) {
+ err = PTR_ERR(flow_counter);
+ mlx5_core_err(mdev,
+ "Failed to add ipsec rx status drop rule counter, err=%d\n", err);
+ goto err_cnt;
+ }
+ sa_entry->ipsec_rule.auth.fc = flow_counter;
- modify_hdr = mlx5_modify_header_alloc(mdev, MLX5_FLOW_NAMESPACE_KERNEL,
- 1, action);
+ flow_act.action = MLX5_FLOW_CONTEXT_ACTION_DROP | MLX5_FLOW_CONTEXT_ACTION_COUNT;
+ flow_act.flags = FLOW_ACT_NO_APPEND;
+ dest.type = MLX5_FLOW_DESTINATION_TYPE_COUNTER;
+ dest.counter_id = mlx5_fc_id(flow_counter);
+ if (rx == ipsec->rx_esw)
+ spec->flow_context.flow_source = MLX5_FLOW_CONTEXT_FLOW_SOURCE_UPLINK;
- if (IS_ERR(modify_hdr)) {
- err = PTR_ERR(modify_hdr);
+ MLX5_SET_TO_ONES(fte_match_param, spec->match_criteria, misc_parameters_2.ipsec_syndrome);
+ MLX5_SET(fte_match_param, spec->match_value, misc_parameters_2.ipsec_syndrome, 1);
+ MLX5_SET_TO_ONES(fte_match_param, spec->match_criteria, misc_parameters_2.metadata_reg_c_2);
+ MLX5_SET(fte_match_param, spec->match_value,
+ misc_parameters_2.metadata_reg_c_2,
+ sa_entry->ipsec_obj_id | BIT(31));
+ spec->match_criteria_enable = MLX5_MATCH_MISC_PARAMETERS_2;
+ rule = mlx5_add_flow_rules(ft, spec, &flow_act, &dest, 1);
+ if (IS_ERR(rule)) {
+ err = PTR_ERR(rule);
mlx5_core_err(mdev,
- "fail to alloc ipsec copy modify_header_id err=%d\n", err);
- goto out_spec;
+ "Failed to add ipsec rx status drop rule, err=%d\n", err);
+ goto err_rule;
}
+ sa_entry->ipsec_rule.auth.rule = rule;
- /* create fte */
- flow_act.action = MLX5_FLOW_CONTEXT_ACTION_MOD_HDR |
- MLX5_FLOW_CONTEXT_ACTION_FWD_DEST |
+ flow_counter = mlx5_fc_create(mdev, true);
+ if (IS_ERR(flow_counter)) {
+ err = PTR_ERR(flow_counter);
+ mlx5_core_err(mdev,
+ "Failed to add ipsec rx status drop rule counter, err=%d\n", err);
+ goto err_cnt_2;
+ }
+ sa_entry->ipsec_rule.trailer.fc = flow_counter;
+
+ dest.counter_id = mlx5_fc_id(flow_counter);
+ MLX5_SET(fte_match_param, spec->match_value, misc_parameters_2.ipsec_syndrome, 2);
+ rule = mlx5_add_flow_rules(ft, spec, &flow_act, &dest, 1);
+ if (IS_ERR(rule)) {
+ err = PTR_ERR(rule);
+ mlx5_core_err(mdev,
+ "Failed to add ipsec rx status drop rule, err=%d\n", err);
+ goto err_rule_2;
+ }
+ sa_entry->ipsec_rule.trailer.rule = rule;
+
+ kvfree(spec);
+ return 0;
+
+err_rule_2:
+ mlx5_fc_destroy(mdev, sa_entry->ipsec_rule.trailer.fc);
+err_cnt_2:
+ mlx5_del_flow_rules(sa_entry->ipsec_rule.auth.rule);
+err_rule:
+ mlx5_fc_destroy(mdev, sa_entry->ipsec_rule.auth.fc);
+err_cnt:
+ kvfree(spec);
+ return err;
+}
+
+static int rx_add_rule_drop_replay(struct mlx5e_ipsec_sa_entry *sa_entry, struct mlx5e_ipsec_rx *rx)
+{
+ struct mlx5e_ipsec *ipsec = sa_entry->ipsec;
+ struct mlx5_flow_table *ft = rx->ft.status;
+ struct mlx5_core_dev *mdev = ipsec->mdev;
+ struct mlx5_flow_destination dest = {};
+ struct mlx5_flow_act flow_act = {};
+ struct mlx5_flow_handle *rule;
+ struct mlx5_fc *flow_counter;
+ struct mlx5_flow_spec *spec;
+ int err;
+
+ spec = kvzalloc(sizeof(*spec), GFP_KERNEL);
+ if (!spec)
+ return -ENOMEM;
+
+ flow_counter = mlx5_fc_create(mdev, true);
+ if (IS_ERR(flow_counter)) {
+ err = PTR_ERR(flow_counter);
+ mlx5_core_err(mdev,
+ "Failed to add ipsec rx status drop rule counter, err=%d\n", err);
+ goto err_cnt;
+ }
+
+ flow_act.action = MLX5_FLOW_CONTEXT_ACTION_DROP | MLX5_FLOW_CONTEXT_ACTION_COUNT;
+ flow_act.flags = FLOW_ACT_NO_APPEND;
+ dest.type = MLX5_FLOW_DESTINATION_TYPE_COUNTER;
+ dest.counter_id = mlx5_fc_id(flow_counter);
+ if (rx == ipsec->rx_esw)
+ spec->flow_context.flow_source = MLX5_FLOW_CONTEXT_FLOW_SOURCE_UPLINK;
+
+ MLX5_SET_TO_ONES(fte_match_param, spec->match_criteria, misc_parameters_2.metadata_reg_c_4);
+ MLX5_SET(fte_match_param, spec->match_value, misc_parameters_2.metadata_reg_c_4, 1);
+ MLX5_SET_TO_ONES(fte_match_param, spec->match_criteria, misc_parameters_2.metadata_reg_c_2);
+ MLX5_SET(fte_match_param, spec->match_value, misc_parameters_2.metadata_reg_c_2,
+ sa_entry->ipsec_obj_id | BIT(31));
+ spec->match_criteria_enable = MLX5_MATCH_MISC_PARAMETERS_2;
+ rule = mlx5_add_flow_rules(ft, spec, &flow_act, &dest, 1);
+ if (IS_ERR(rule)) {
+ err = PTR_ERR(rule);
+ mlx5_core_err(mdev,
+ "Failed to add ipsec rx status drop rule, err=%d\n", err);
+ goto err_rule;
+ }
+
+ sa_entry->ipsec_rule.replay.rule = rule;
+ sa_entry->ipsec_rule.replay.fc = flow_counter;
+
+ kvfree(spec);
+ return 0;
+
+err_rule:
+ mlx5_fc_destroy(mdev, flow_counter);
+err_cnt:
+ kvfree(spec);
+ return err;
+}
+
+static int ipsec_rx_status_drop_all_create(struct mlx5e_ipsec *ipsec,
+ struct mlx5e_ipsec_rx *rx)
+{
+ int inlen = MLX5_ST_SZ_BYTES(create_flow_group_in);
+ struct mlx5_flow_table *ft = rx->ft.status;
+ struct mlx5_core_dev *mdev = ipsec->mdev;
+ struct mlx5_flow_destination dest = {};
+ struct mlx5_flow_act flow_act = {};
+ struct mlx5_flow_handle *rule;
+ struct mlx5_fc *flow_counter;
+ struct mlx5_flow_spec *spec;
+ struct mlx5_flow_group *g;
+ u32 *flow_group_in;
+ int err = 0;
+
+ flow_group_in = kvzalloc(inlen, GFP_KERNEL);
+ spec = kvzalloc(sizeof(*spec), GFP_KERNEL);
+ if (!flow_group_in || !spec) {
+ err = -ENOMEM;
+ goto err_out;
+ }
+
+ MLX5_SET(create_flow_group_in, flow_group_in, start_flow_index, ft->max_fte - 1);
+ MLX5_SET(create_flow_group_in, flow_group_in, end_flow_index, ft->max_fte - 1);
+ g = mlx5_create_flow_group(ft, flow_group_in);
+ if (IS_ERR(g)) {
+ err = PTR_ERR(g);
+ mlx5_core_err(mdev,
+ "Failed to add ipsec rx status drop flow group, err=%d\n", err);
+ goto err_out;
+ }
+
+ flow_counter = mlx5_fc_create(mdev, false);
+ if (IS_ERR(flow_counter)) {
+ err = PTR_ERR(flow_counter);
+ mlx5_core_err(mdev,
+ "Failed to add ipsec rx status drop rule counter, err=%d\n", err);
+ goto err_cnt;
+ }
+
+ flow_act.action = MLX5_FLOW_CONTEXT_ACTION_DROP | MLX5_FLOW_CONTEXT_ACTION_COUNT;
+ dest.type = MLX5_FLOW_DESTINATION_TYPE_COUNTER;
+ dest.counter_id = mlx5_fc_id(flow_counter);
+ if (rx == ipsec->rx_esw)
+ spec->flow_context.flow_source = MLX5_FLOW_CONTEXT_FLOW_SOURCE_UPLINK;
+ rule = mlx5_add_flow_rules(ft, spec, &flow_act, &dest, 1);
+ if (IS_ERR(rule)) {
+ err = PTR_ERR(rule);
+ mlx5_core_err(mdev,
+ "Failed to add ipsec rx status drop rule, err=%d\n", err);
+ goto err_rule;
+ }
+
+ rx->status_drops.drop_all_group = g;
+ rx->status_drops.all.rule = rule;
+ rx->status_drops.all.fc = flow_counter;
+
+ kvfree(flow_group_in);
+ kvfree(spec);
+ return 0;
+
+err_rule:
+ mlx5_fc_destroy(mdev, flow_counter);
+err_cnt:
+ mlx5_destroy_flow_group(g);
+err_out:
+ kvfree(flow_group_in);
+ kvfree(spec);
+ return err;
+}
+
+static int ipsec_rx_status_pass_create(struct mlx5e_ipsec *ipsec,
+ struct mlx5e_ipsec_rx *rx,
+ struct mlx5_flow_destination *dest)
+{
+ struct mlx5_flow_act flow_act = {};
+ struct mlx5_flow_handle *rule;
+ struct mlx5_flow_spec *spec;
+ int err;
+
+ spec = kvzalloc(sizeof(*spec), GFP_KERNEL);
+ if (!spec)
+ return -ENOMEM;
+
+ MLX5_SET_TO_ONES(fte_match_param, spec->match_criteria,
+ misc_parameters_2.ipsec_syndrome);
+ MLX5_SET_TO_ONES(fte_match_param, spec->match_criteria,
+ misc_parameters_2.metadata_reg_c_4);
+ MLX5_SET(fte_match_param, spec->match_value,
+ misc_parameters_2.ipsec_syndrome, 0);
+ MLX5_SET(fte_match_param, spec->match_value,
+ misc_parameters_2.metadata_reg_c_4, 0);
+ if (rx == ipsec->rx_esw)
+ spec->flow_context.flow_source = MLX5_FLOW_CONTEXT_FLOW_SOURCE_UPLINK;
+ spec->match_criteria_enable = MLX5_MATCH_MISC_PARAMETERS_2;
+ flow_act.flags = FLOW_ACT_NO_APPEND;
+ flow_act.action = MLX5_FLOW_CONTEXT_ACTION_FWD_DEST |
MLX5_FLOW_CONTEXT_ACTION_COUNT;
- flow_act.modify_hdr = modify_hdr;
- fte = mlx5_add_flow_rules(rx->ft.status, spec, &flow_act, dest, 2);
- if (IS_ERR(fte)) {
- err = PTR_ERR(fte);
- mlx5_core_err(mdev, "fail to add ipsec rx err copy rule err=%d\n", err);
- goto out;
+ rule = mlx5_add_flow_rules(rx->ft.status, spec, &flow_act, dest, 2);
+ if (IS_ERR(rule)) {
+ err = PTR_ERR(rule);
+ mlx5_core_warn(ipsec->mdev,
+ "Failed to add ipsec rx status pass rule, err=%d\n", err);
+ goto err_rule;
}
+ rx->status.rule = rule;
kvfree(spec);
- rx->status.rule = fte;
- rx->status.modify_hdr = modify_hdr;
return 0;
-out:
- mlx5_modify_header_dealloc(mdev, modify_hdr);
-out_spec:
+err_rule:
kvfree(spec);
return err;
}
+static void mlx5_ipsec_rx_status_destroy(struct mlx5e_ipsec *ipsec,
+ struct mlx5e_ipsec_rx *rx)
+{
+ ipsec_rx_status_pass_destroy(ipsec, rx);
+ ipsec_rx_status_drop_destroy(ipsec, rx);
+}
+
+static int mlx5_ipsec_rx_status_create(struct mlx5e_ipsec *ipsec,
+ struct mlx5e_ipsec_rx *rx,
+ struct mlx5_flow_destination *dest)
+{
+ int err;
+
+ err = ipsec_rx_status_drop_all_create(ipsec, rx);
+ if (err)
+ return err;
+
+ err = ipsec_rx_status_pass_create(ipsec, rx, dest);
+ if (err)
+ goto err_pass_create;
+
+ return 0;
+
+err_pass_create:
+ ipsec_rx_status_drop_destroy(ipsec, rx);
+ return err;
+}
+
static int ipsec_miss_create(struct mlx5_core_dev *mdev,
struct mlx5_flow_table *ft,
struct mlx5e_ipsec_miss *miss,
mlx5_destroy_flow_table(rx->ft.sa);
if (rx->allow_tunnel_mode)
mlx5_eswitch_unblock_encap(mdev);
- if (rx == ipsec->rx_esw) {
- mlx5_esw_ipsec_rx_status_destroy(ipsec, rx);
- } else {
- mlx5_del_flow_rules(rx->status.rule);
- mlx5_modify_header_dealloc(mdev, rx->status.modify_hdr);
- }
+ mlx5_ipsec_rx_status_destroy(ipsec, rx);
mlx5_destroy_flow_table(rx->ft.status);
mlx5_ipsec_fs_roce_rx_destroy(ipsec->roce, family, mdev);
if (err)
return err;
- ft = ipsec_ft_create(attr.ns, attr.status_level, attr.prio, 1, 0);
+ ft = ipsec_ft_create(attr.ns, attr.status_level, attr.prio, 3, 0);
if (IS_ERR(ft)) {
err = PTR_ERR(ft);
goto err_fs_ft_status;
dest[1].type = MLX5_FLOW_DESTINATION_TYPE_COUNTER;
dest[1].counter_id = mlx5_fc_id(rx->fc->cnt);
- if (rx == ipsec->rx_esw)
- err = mlx5_esw_ipsec_rx_status_create(ipsec, rx, dest);
- else
- err = ipsec_status_rule(mdev, rx, dest);
+ err = mlx5_ipsec_rx_status_create(ipsec, rx, dest);
if (err)
goto err_add;
MLX5_SET(fte_match_param, spec->match_value, outer_headers.ip_protocol, IPPROTO_ESP);
}
-static void setup_fte_spi(struct mlx5_flow_spec *spec, u32 spi)
+static void setup_fte_spi(struct mlx5_flow_spec *spec, u32 spi, bool encap)
{
/* SPI number */
spec->match_criteria_enable |= MLX5_MATCH_MISC_PARAMETERS;
- MLX5_SET_TO_ONES(fte_match_param, spec->match_criteria, misc_parameters.outer_esp_spi);
- MLX5_SET(fte_match_param, spec->match_value, misc_parameters.outer_esp_spi, spi);
+ if (encap) {
+ MLX5_SET_TO_ONES(fte_match_param, spec->match_criteria,
+ misc_parameters.inner_esp_spi);
+ MLX5_SET(fte_match_param, spec->match_value,
+ misc_parameters.inner_esp_spi, spi);
+ } else {
+ MLX5_SET_TO_ONES(fte_match_param, spec->match_criteria,
+ misc_parameters.outer_esp_spi);
+ MLX5_SET(fte_match_param, spec->match_value,
+ misc_parameters.outer_esp_spi, spi);
+ }
}
static void setup_fte_no_frags(struct mlx5_flow_spec *spec)
struct mlx5_flow_act *flow_act)
{
enum mlx5_flow_namespace_type ns_type = ipsec_fs_get_ns(ipsec, type, dir);
- u8 action[MLX5_UN_SZ_BYTES(set_add_copy_action_in_auto)] = {};
+ u8 action[3][MLX5_UN_SZ_BYTES(set_add_copy_action_in_auto)] = {};
struct mlx5_core_dev *mdev = ipsec->mdev;
struct mlx5_modify_hdr *modify_hdr;
+ u8 num_of_actions = 1;
- MLX5_SET(set_action_in, action, action_type, MLX5_ACTION_TYPE_SET);
+ MLX5_SET(set_action_in, action[0], action_type, MLX5_ACTION_TYPE_SET);
switch (dir) {
case XFRM_DEV_OFFLOAD_IN:
- MLX5_SET(set_action_in, action, field,
+ MLX5_SET(set_action_in, action[0], field,
MLX5_ACTION_IN_FIELD_METADATA_REG_B);
+
+ num_of_actions++;
+ MLX5_SET(set_action_in, action[1], action_type, MLX5_ACTION_TYPE_SET);
+ MLX5_SET(set_action_in, action[1], field, MLX5_ACTION_IN_FIELD_METADATA_REG_C_2);
+ MLX5_SET(set_action_in, action[1], data, val);
+ MLX5_SET(set_action_in, action[1], offset, 0);
+ MLX5_SET(set_action_in, action[1], length, 32);
+
+ if (type == XFRM_DEV_OFFLOAD_CRYPTO) {
+ num_of_actions++;
+ MLX5_SET(set_action_in, action[2], action_type,
+ MLX5_ACTION_TYPE_SET);
+ MLX5_SET(set_action_in, action[2], field,
+ MLX5_ACTION_IN_FIELD_METADATA_REG_C_4);
+ MLX5_SET(set_action_in, action[2], data, 0);
+ MLX5_SET(set_action_in, action[2], offset, 0);
+ MLX5_SET(set_action_in, action[2], length, 32);
+ }
break;
case XFRM_DEV_OFFLOAD_OUT:
- MLX5_SET(set_action_in, action, field,
+ MLX5_SET(set_action_in, action[0], field,
MLX5_ACTION_IN_FIELD_METADATA_REG_C_4);
break;
default:
return -EINVAL;
}
- MLX5_SET(set_action_in, action, data, val);
- MLX5_SET(set_action_in, action, offset, 0);
- MLX5_SET(set_action_in, action, length, 32);
+ MLX5_SET(set_action_in, action[0], data, val);
+ MLX5_SET(set_action_in, action[0], offset, 0);
+ MLX5_SET(set_action_in, action[0], length, 32);
- modify_hdr = mlx5_modify_header_alloc(mdev, ns_type, 1, action);
+ modify_hdr = mlx5_modify_header_alloc(mdev, ns_type, num_of_actions, action);
if (IS_ERR(modify_hdr)) {
mlx5_core_err(mdev, "Failed to allocate modify_header %ld\n",
PTR_ERR(modify_hdr));
else
setup_fte_addr6(spec, attrs->saddr.a6, attrs->daddr.a6);
- setup_fte_spi(spec, attrs->spi);
- setup_fte_esp(spec);
+ setup_fte_spi(spec, attrs->spi, attrs->encap);
+ if (!attrs->encap)
+ setup_fte_esp(spec);
setup_fte_no_frags(spec);
setup_fte_upper_proto_match(spec, &attrs->upspec);
mlx5_core_err(mdev, "fail to add RX ipsec rule err=%d\n", err);
goto err_add_flow;
}
+ if (attrs->type == XFRM_DEV_OFFLOAD_PACKET)
+ err = rx_add_rule_drop_replay(sa_entry, rx);
+ if (err)
+ goto err_add_replay;
+
+ err = rx_add_rule_drop_auth_trailer(sa_entry, rx);
+ if (err)
+ goto err_drop_reason;
+
kvfree(spec);
sa_entry->ipsec_rule.rule = rule;
sa_entry->ipsec_rule.pkt_reformat = flow_act.pkt_reformat;
return 0;
+err_drop_reason:
+ if (sa_entry->ipsec_rule.replay.rule) {
+ mlx5_del_flow_rules(sa_entry->ipsec_rule.replay.rule);
+ mlx5_fc_destroy(mdev, sa_entry->ipsec_rule.replay.fc);
+ }
+err_add_replay:
+ mlx5_del_flow_rules(rule);
err_add_flow:
mlx5_fc_destroy(mdev, counter);
err_add_cnt:
switch (attrs->type) {
case XFRM_DEV_OFFLOAD_CRYPTO:
- setup_fte_spi(spec, attrs->spi);
+ setup_fte_spi(spec, attrs->spi, false);
setup_fte_esp(spec);
setup_fte_reg_a(spec);
break;
struct mlx5_eswitch *esw = mdev->priv.eswitch;
int err = 0;
- if (esw)
- down_write(&esw->mode_lock);
+ if (esw) {
+ err = mlx5_esw_lock(esw);
+ if (err)
+ return err;
+ }
if (mdev->num_block_ipsec) {
err = -EBUSY;
unlock:
if (esw)
- up_write(&esw->mode_lock);
+ mlx5_esw_unlock(esw);
return err;
}
static void mlx5e_ipsec_unblock_tc_offload(struct mlx5_core_dev *mdev)
{
- mdev->num_block_tc++;
+ mdev->num_block_tc--;
}
int mlx5e_accel_ipsec_fs_add_rule(struct mlx5e_ipsec_sa_entry *sa_entry)
if (ipsec_rule->modify_hdr)
mlx5_modify_header_dealloc(mdev, ipsec_rule->modify_hdr);
+
+ mlx5_del_flow_rules(ipsec_rule->trailer.rule);
+ mlx5_fc_destroy(mdev, ipsec_rule->trailer.fc);
+
+ mlx5_del_flow_rules(ipsec_rule->auth.rule);
+ mlx5_fc_destroy(mdev, ipsec_rule->auth.fc);
+
+ if (ipsec_rule->replay.rule) {
+ mlx5_del_flow_rules(ipsec_rule->replay.rule);
+ mlx5_fc_destroy(mdev, ipsec_rule->replay.fc);
+ }
mlx5_esw_ipsec_rx_id_mapping_remove(sa_entry);
rx_ft_put(sa_entry->ipsec, sa_entry->attrs.family, sa_entry->attrs.type);
}
kfree(ipsec->rx_ipv6);
if (ipsec->is_uplink_rep) {
- xa_destroy(&ipsec->rx_esw->ipsec_obj_id_map);
+ xa_destroy(&ipsec->ipsec_obj_id_map);
mutex_destroy(&ipsec->tx_esw->ft.mutex);
WARN_ON(ipsec->tx_esw->ft.refcnt);
mutex_init(&ipsec->tx_esw->ft.mutex);
mutex_init(&ipsec->rx_esw->ft.mutex);
ipsec->tx_esw->ns = ns_esw;
- xa_init_flags(&ipsec->rx_esw->ipsec_obj_id_map, XA_FLAGS_ALLOC1);
+ xa_init_flags(&ipsec->ipsec_obj_id_map, XA_FLAGS_ALLOC1);
} else if (mlx5_ipsec_device_caps(mdev) & MLX5_IPSEC_CAP_ROCE) {
ipsec->roce = mlx5_ipsec_fs_roce_init(mdev, devcom);
} else {
#include "ipsec.h"
#include "lib/crypto.h"
#include "lib/ipsec_fs_roce.h"
+#include "fs_core.h"
+#include "eswitch.h"
enum {
MLX5_IPSEC_ASO_REMOVE_FLOW_PKT_CNT_OFFSET,
MLX5_CAP_ETH(mdev, insert_trailer) && MLX5_CAP_ETH(mdev, swp))
caps |= MLX5_IPSEC_CAP_CRYPTO;
- if (MLX5_CAP_IPSEC(mdev, ipsec_full_offload)) {
+ if (MLX5_CAP_IPSEC(mdev, ipsec_full_offload) &&
+ (mdev->priv.steering->mode == MLX5_FLOW_STEERING_MODE_DMFS ||
+ (mdev->priv.steering->mode == MLX5_FLOW_STEERING_MODE_SMFS &&
+ is_mdev_legacy_mode(mdev)))) {
if (MLX5_CAP_FLOWTABLE_NIC_TX(mdev,
reformat_add_esp_trasport) &&
MLX5_CAP_FLOWTABLE_NIC_RX(mdev,
if (attrs->dir == XFRM_DEV_OFFLOAD_IN) {
MLX5_SET(ipsec_aso, aso_ctx, window_sz,
- attrs->replay_esn.replay_window / 64);
+ attrs->replay_esn.replay_window);
MLX5_SET(ipsec_aso, aso_ctx, mode,
MLX5_IPSEC_ASO_REPLAY_PROTECTION);
}
dma_unmap_single(pdev, aso->dma_addr, sizeof(aso->ctx),
DMA_BIDIRECTIONAL);
kfree(aso);
+ ipsec->aso = NULL;
}
static void mlx5e_ipsec_aso_copy(struct mlx5_wqe_aso_ctrl_seg *ctrl,
struct ethtool_drvinfo *drvinfo)
{
struct mlx5_core_dev *mdev = priv->mdev;
+ int count;
strscpy(drvinfo->driver, KBUILD_MODNAME, sizeof(drvinfo->driver));
- snprintf(drvinfo->fw_version, sizeof(drvinfo->fw_version),
- "%d.%d.%04d (%.16s)",
- fw_rev_maj(mdev), fw_rev_min(mdev), fw_rev_sub(mdev),
- mdev->board_id);
+ count = snprintf(drvinfo->fw_version, sizeof(drvinfo->fw_version),
+ "%d.%d.%04d (%.16s)", fw_rev_maj(mdev),
+ fw_rev_min(mdev), fw_rev_sub(mdev), mdev->board_id);
+ if (count >= sizeof(drvinfo->fw_version))
+ snprintf(drvinfo->fw_version, sizeof(drvinfo->fw_version),
+ "%d.%d.%04d", fw_rev_maj(mdev),
+ fw_rev_min(mdev), fw_rev_sub(mdev));
+
strscpy(drvinfo->bus_info, dev_name(mdev->device),
sizeof(drvinfo->bus_info));
}
{
int i;
+ ASSERT_RTNL();
if (chs->ptp) {
mlx5e_ptp_close(chs->ptp);
chs->ptp = NULL;
if (mlx5e_is_vport_rep(priv))
mlx5e_rep_activate_channels(priv);
+ set_bit(MLX5E_STATE_CHANNELS_ACTIVE, &priv->state);
+
mlx5e_wait_channels_min_rx_wqes(&priv->channels);
if (priv->rx_res)
mlx5e_rx_res_channels_activate(priv->rx_res, &priv->channels);
}
+static void mlx5e_cancel_tx_timeout_work(struct mlx5e_priv *priv)
+{
+ WARN_ON_ONCE(test_bit(MLX5E_STATE_CHANNELS_ACTIVE, &priv->state));
+ if (current_work() != &priv->tx_timeout_work)
+ cancel_work_sync(&priv->tx_timeout_work);
+}
+
void mlx5e_deactivate_priv_channels(struct mlx5e_priv *priv)
{
if (priv->rx_res)
mlx5e_rx_res_channels_deactivate(priv->rx_res);
+ clear_bit(MLX5E_STATE_CHANNELS_ACTIVE, &priv->state);
+ mlx5e_cancel_tx_timeout_work(priv);
+
if (mlx5e_is_vport_rep(priv))
mlx5e_rep_deactivate_channels(priv);
struct net_device *netdev = priv->netdev;
int i;
- rtnl_lock();
- mutex_lock(&priv->state_lock);
+ /* Take rtnl_lock to ensure no change in netdev->real_num_tx_queues
+ * through this flow. However, channel closing flows have to wait for
+ * this work to finish while holding rtnl lock too. So either get the
+ * lock or find that channels are being closed for other reason and
+ * this work is not relevant anymore.
+ */
+ while (!rtnl_trylock()) {
+ if (!test_bit(MLX5E_STATE_CHANNELS_ACTIVE, &priv->state))
+ return;
+ msleep(20);
+ }
if (!test_bit(MLX5E_STATE_OPENED, &priv->state))
goto unlock;
}
unlock:
- mutex_unlock(&priv->state_lock);
rtnl_unlock();
}
{
struct mlx5e_priv *priv = netdev_priv(dev);
struct mlx5_core_dev *mdev = priv->mdev;
+ int count;
strscpy(drvinfo->driver, mlx5e_rep_driver_name,
sizeof(drvinfo->driver));
- snprintf(drvinfo->fw_version, sizeof(drvinfo->fw_version),
- "%d.%d.%04d (%.16s)",
- fw_rev_maj(mdev), fw_rev_min(mdev),
- fw_rev_sub(mdev), mdev->board_id);
+ count = snprintf(drvinfo->fw_version, sizeof(drvinfo->fw_version),
+ "%d.%d.%04d (%.16s)", fw_rev_maj(mdev),
+ fw_rev_min(mdev), fw_rev_sub(mdev), mdev->board_id);
+ if (count >= sizeof(drvinfo->fw_version))
+ snprintf(drvinfo->fw_version, sizeof(drvinfo->fw_version),
+ "%d.%d.%04d", fw_rev_maj(mdev),
+ fw_rev_min(mdev), fw_rev_sub(mdev));
}
static const struct counter_desc sw_rep_stats_desc[] = {
dl_port = mlx5_esw_offloads_devlink_port(dev->priv.eswitch,
rpriv->rep->vport);
- if (dl_port) {
+ if (!IS_ERR(dl_port)) {
SET_NETDEV_DEVLINK_PORT(netdev, dl_port);
mlx5e_rep_vnic_reporter_create(priv, dl_port);
}
struct mlx5e_flow_meter_handle *meter;
enum mlx5e_post_meter_type type;
+ if (IS_ERR(post_act))
+ return PTR_ERR(post_act);
+
meter = mlx5e_tc_meter_replace(priv->mdev, &attr->meter_attr.params);
if (IS_ERR(meter)) {
mlx5_core_err(priv->mdev, "Failed to get flow meter\n");
OFFLOAD(DIPV6_31_0, 32, U32_MAX, ip6.daddr.s6_addr32[3], 0,
dst_ipv4_dst_ipv6.ipv6_layout.ipv6[12]),
OFFLOAD(IPV6_HOPLIMIT, 8, U8_MAX, ip6.hop_limit, 0, ttl_hoplimit),
- OFFLOAD(IP_DSCP, 16, 0xc00f, ip6, 0, ip_dscp),
+ OFFLOAD(IP_DSCP, 16, 0x0fc0, ip6, 0, ip_dscp),
OFFLOAD(TCP_SPORT, 16, U16_MAX, tcp.source, 0, tcp_sport),
OFFLOAD(TCP_DPORT, 16, U16_MAX, tcp.dest, 0, tcp_dport),
OFFLOAD(UDP_DPORT, 16, U16_MAX, udp.dest, 0, udp_dport),
};
-static unsigned long mask_to_le(unsigned long mask, int size)
+static u32 mask_field_get(void *mask, struct mlx5_fields *f)
{
- __be32 mask_be32;
- __be16 mask_be16;
-
- if (size == 32) {
- mask_be32 = (__force __be32)(mask);
- mask = (__force unsigned long)cpu_to_le32(be32_to_cpu(mask_be32));
- } else if (size == 16) {
- mask_be32 = (__force __be32)(mask);
- mask_be16 = *(__be16 *)&mask_be32;
- mask = (__force unsigned long)cpu_to_le16(be16_to_cpu(mask_be16));
+ switch (f->field_bsize) {
+ case 32:
+ return be32_to_cpu(*(__be32 *)mask) & f->field_mask;
+ case 16:
+ return be16_to_cpu(*(__be16 *)mask) & (u16)f->field_mask;
+ default:
+ return *(u8 *)mask & (u8)f->field_mask;
}
+}
- return mask;
+static void mask_field_clear(void *mask, struct mlx5_fields *f)
+{
+ switch (f->field_bsize) {
+ case 32:
+ *(__be32 *)mask &= ~cpu_to_be32(f->field_mask);
+ break;
+ case 16:
+ *(__be16 *)mask &= ~cpu_to_be16((u16)f->field_mask);
+ break;
+ default:
+ *(u8 *)mask &= ~(u8)f->field_mask;
+ break;
+ }
}
static int offload_pedit_fields(struct mlx5e_priv *priv,
struct pedit_headers *set_masks, *add_masks, *set_vals, *add_vals;
struct pedit_headers_action *hdrs = parse_attr->hdrs;
void *headers_c, *headers_v, *action, *vals_p;
- u32 *s_masks_p, *a_masks_p, s_mask, a_mask;
struct mlx5e_tc_mod_hdr_acts *mod_acts;
- unsigned long mask, field_mask;
+ void *s_masks_p, *a_masks_p;
int i, first, last, next_z;
struct mlx5_fields *f;
+ unsigned long mask;
+ u32 s_mask, a_mask;
u8 cmd;
mod_acts = &parse_attr->mod_hdr_acts;
bool skip;
f = &fields[i];
- /* avoid seeing bits set from previous iterations */
- s_mask = 0;
- a_mask = 0;
-
s_masks_p = (void *)set_masks + f->offset;
a_masks_p = (void *)add_masks + f->offset;
- s_mask = *s_masks_p & f->field_mask;
- a_mask = *a_masks_p & f->field_mask;
+ s_mask = mask_field_get(s_masks_p, f);
+ a_mask = mask_field_get(a_masks_p, f);
if (!s_mask && !a_mask) /* nothing to offload here */
continue;
match_mask, f->field_bsize))
skip = true;
/* clear to denote we consumed this field */
- *s_masks_p &= ~f->field_mask;
+ mask_field_clear(s_masks_p, f);
} else {
cmd = MLX5_ACTION_TYPE_ADD;
mask = a_mask;
vals_p = (void *)add_vals + f->offset;
/* add 0 is no change */
- if ((*(u32 *)vals_p & f->field_mask) == 0)
+ if (!mask_field_get(vals_p, f))
skip = true;
/* clear to denote we consumed this field */
- *a_masks_p &= ~f->field_mask;
+ mask_field_clear(a_masks_p, f);
}
if (skip)
continue;
- mask = mask_to_le(mask, f->field_bsize);
-
first = find_first_bit(&mask, f->field_bsize);
next_z = find_next_zero_bit(&mask, f->field_bsize, first);
last = find_last_bit(&mask, f->field_bsize);
MLX5_SET(set_action_in, action, field, f->field);
if (cmd == MLX5_ACTION_TYPE_SET) {
+ unsigned long field_mask = f->field_mask;
int start;
- field_mask = mask_to_le(f->field_mask, f->field_bsize);
-
/* if field is bit sized it can start not from first bit */
start = find_first_bit(&field_mask, f->field_bsize);
return err;
}
+static int
+set_branch_dest_ft(struct mlx5e_priv *priv, struct mlx5_flow_attr *attr)
+{
+ struct mlx5e_post_act *post_act = get_post_action(priv);
+
+ if (IS_ERR(post_act))
+ return PTR_ERR(post_act);
+
+ attr->action |= MLX5_FLOW_CONTEXT_ACTION_FWD_DEST;
+ attr->dest_ft = mlx5e_tc_post_act_get_ft(post_act);
+
+ return 0;
+}
+
static int
alloc_branch_attr(struct mlx5e_tc_flow *flow,
struct mlx5e_tc_act_branch_ctrl *cond,
break;
case FLOW_ACTION_ACCEPT:
case FLOW_ACTION_PIPE:
- attr->action |= MLX5_FLOW_CONTEXT_ACTION_FWD_DEST;
- attr->dest_ft = mlx5e_tc_post_act_get_ft(get_post_action(flow->priv));
+ err = set_branch_dest_ft(flow->priv, attr);
+ if (err)
+ goto out_err;
break;
case FLOW_ACTION_JUMP:
if (*jump_count) {
goto out_err;
}
*jump_count = cond->extval;
- attr->action |= MLX5_FLOW_CONTEXT_ACTION_FWD_DEST;
- attr->dest_ft = mlx5e_tc_post_act_get_ft(get_post_action(flow->priv));
+ err = set_branch_dest_ft(flow->priv, attr);
+ if (err)
+ goto out_err;
break;
default:
err = -EOPNOTSUPP;
esw = priv->mdev->priv.eswitch;
attr->act_id_restore_rule = esw_add_restore_rule(esw, *act_miss_mapping);
- if (IS_ERR(attr->act_id_restore_rule))
+ if (IS_ERR(attr->act_id_restore_rule)) {
+ err = PTR_ERR(attr->act_id_restore_rule);
goto err_rule;
+ }
return 0;
u8 metadata_index = be32_to_cpu(eseg->flow_table_metadata);
mlx5e_skb_cb_hwtstamp_init(skb);
- mlx5e_ptpsq_track_metadata(sq->ptpsq, metadata_index);
mlx5e_ptp_metadata_map_put(&sq->ptpsq->metadata_map, skb,
metadata_index);
+ mlx5e_ptpsq_track_metadata(sq->ptpsq, metadata_index);
if (!netif_tx_queue_stopped(sq->txq) &&
mlx5e_ptpsq_metadata_freelist_empty(sq->ptpsq)) {
netif_tx_stop_queue(sq->txq);
err_drop:
stats->dropped++;
- dev_kfree_skb_any(skb);
if (unlikely(sq->ptpsq && (skb_shinfo(skb)->tx_flags & SKBTX_HW_TSTAMP)))
mlx5e_ptp_metadata_fifo_push(&sq->ptpsq->metadata_freelist,
be32_to_cpu(eseg->flow_table_metadata));
+ dev_kfree_skb_any(skb);
mlx5e_tx_flush(sq);
}
{
struct mlx5_eq_table *table = dev->priv.eq_table;
struct mlx5_irq *irq;
+ int cpu;
irq = xa_load(&table->comp_irqs, vecidx);
if (!irq)
return;
+ cpu = cpumask_first(mlx5_irq_get_affinity_mask(irq));
+ cpumask_clear_cpu(cpu, &table->used_cpus);
xa_erase(&table->comp_irqs, vecidx);
mlx5_irq_affinity_irq_release(dev, irq);
}
static int comp_irq_request_sf(struct mlx5_core_dev *dev, u16 vecidx)
{
struct mlx5_eq_table *table = dev->priv.eq_table;
+ struct mlx5_irq_pool *pool = mlx5_irq_pool_get(dev);
+ struct irq_affinity_desc af_desc = {};
struct mlx5_irq *irq;
- irq = mlx5_irq_affinity_irq_request_auto(dev, &table->used_cpus, vecidx);
- if (IS_ERR(irq)) {
- /* In case SF irq pool does not exist, fallback to the PF irqs*/
- if (PTR_ERR(irq) == -ENOENT)
- return comp_irq_request_pci(dev, vecidx);
+ /* In case SF irq pool does not exist, fallback to the PF irqs*/
+ if (!mlx5_irq_pool_is_sf_pool(pool))
+ return comp_irq_request_pci(dev, vecidx);
+ af_desc.is_managed = 1;
+ cpumask_copy(&af_desc.mask, cpu_online_mask);
+ cpumask_andnot(&af_desc.mask, &af_desc.mask, &table->used_cpus);
+ irq = mlx5_irq_affinity_request(pool, &af_desc);
+ if (IS_ERR(irq))
return PTR_ERR(irq);
- }
+
+ cpumask_or(&table->used_cpus, &table->used_cpus, mlx5_irq_get_affinity_mask(irq));
+ mlx5_core_dbg(pool->dev, "IRQ %u mapped to cpu %*pbl, %u EQs on this irq\n",
+ pci_irq_vector(dev->pdev, mlx5_irq_get_index(irq)),
+ cpumask_pr_args(mlx5_irq_get_affinity_mask(irq)),
+ mlx5_irq_read_locked(irq) / MLX5_EQ_REFS_PER_IRQ);
return xa_err(xa_store(&table->comp_irqs, vecidx, irq, GFP_KERNEL));
}
MLX5_ESW_IPSEC_TX_ESP_FT_CNT_LEVEL,
};
-static void esw_ipsec_rx_status_drop_destroy(struct mlx5e_ipsec *ipsec,
- struct mlx5e_ipsec_rx *rx)
-{
- mlx5_del_flow_rules(rx->status_drop.rule);
- mlx5_destroy_flow_group(rx->status_drop.group);
- mlx5_fc_destroy(ipsec->mdev, rx->status_drop_cnt);
-}
-
-static void esw_ipsec_rx_status_pass_destroy(struct mlx5e_ipsec *ipsec,
- struct mlx5e_ipsec_rx *rx)
-{
- mlx5_del_flow_rules(rx->status.rule);
- mlx5_chains_put_table(esw_chains(ipsec->mdev->priv.eswitch), 0, 1, 0);
-}
-
-static int esw_ipsec_rx_status_drop_create(struct mlx5e_ipsec *ipsec,
- struct mlx5e_ipsec_rx *rx)
-{
- int inlen = MLX5_ST_SZ_BYTES(create_flow_group_in);
- struct mlx5_flow_table *ft = rx->ft.status;
- struct mlx5_core_dev *mdev = ipsec->mdev;
- struct mlx5_flow_destination dest = {};
- struct mlx5_flow_act flow_act = {};
- struct mlx5_flow_handle *rule;
- struct mlx5_fc *flow_counter;
- struct mlx5_flow_spec *spec;
- struct mlx5_flow_group *g;
- u32 *flow_group_in;
- int err = 0;
-
- flow_group_in = kvzalloc(inlen, GFP_KERNEL);
- spec = kvzalloc(sizeof(*spec), GFP_KERNEL);
- if (!flow_group_in || !spec) {
- err = -ENOMEM;
- goto err_out;
- }
-
- MLX5_SET(create_flow_group_in, flow_group_in, start_flow_index, ft->max_fte - 1);
- MLX5_SET(create_flow_group_in, flow_group_in, end_flow_index, ft->max_fte - 1);
- g = mlx5_create_flow_group(ft, flow_group_in);
- if (IS_ERR(g)) {
- err = PTR_ERR(g);
- mlx5_core_err(mdev,
- "Failed to add ipsec rx status drop flow group, err=%d\n", err);
- goto err_out;
- }
-
- flow_counter = mlx5_fc_create(mdev, false);
- if (IS_ERR(flow_counter)) {
- err = PTR_ERR(flow_counter);
- mlx5_core_err(mdev,
- "Failed to add ipsec rx status drop rule counter, err=%d\n", err);
- goto err_cnt;
- }
-
- flow_act.action = MLX5_FLOW_CONTEXT_ACTION_DROP | MLX5_FLOW_CONTEXT_ACTION_COUNT;
- dest.type = MLX5_FLOW_DESTINATION_TYPE_COUNTER;
- dest.counter_id = mlx5_fc_id(flow_counter);
- spec->flow_context.flow_source = MLX5_FLOW_CONTEXT_FLOW_SOURCE_UPLINK;
- rule = mlx5_add_flow_rules(ft, spec, &flow_act, &dest, 1);
- if (IS_ERR(rule)) {
- err = PTR_ERR(rule);
- mlx5_core_err(mdev,
- "Failed to add ipsec rx status drop rule, err=%d\n", err);
- goto err_rule;
- }
-
- rx->status_drop.group = g;
- rx->status_drop.rule = rule;
- rx->status_drop_cnt = flow_counter;
-
- kvfree(flow_group_in);
- kvfree(spec);
- return 0;
-
-err_rule:
- mlx5_fc_destroy(mdev, flow_counter);
-err_cnt:
- mlx5_destroy_flow_group(g);
-err_out:
- kvfree(flow_group_in);
- kvfree(spec);
- return err;
-}
-
-static int esw_ipsec_rx_status_pass_create(struct mlx5e_ipsec *ipsec,
- struct mlx5e_ipsec_rx *rx,
- struct mlx5_flow_destination *dest)
-{
- struct mlx5_flow_act flow_act = {};
- struct mlx5_flow_handle *rule;
- struct mlx5_flow_spec *spec;
- int err;
-
- spec = kvzalloc(sizeof(*spec), GFP_KERNEL);
- if (!spec)
- return -ENOMEM;
-
- MLX5_SET_TO_ONES(fte_match_param, spec->match_criteria,
- misc_parameters_2.ipsec_syndrome);
- MLX5_SET(fte_match_param, spec->match_value,
- misc_parameters_2.ipsec_syndrome, 0);
- spec->flow_context.flow_source = MLX5_FLOW_CONTEXT_FLOW_SOURCE_UPLINK;
- spec->match_criteria_enable = MLX5_MATCH_MISC_PARAMETERS_2;
- flow_act.flags = FLOW_ACT_NO_APPEND;
- flow_act.action = MLX5_FLOW_CONTEXT_ACTION_FWD_DEST |
- MLX5_FLOW_CONTEXT_ACTION_COUNT;
- rule = mlx5_add_flow_rules(rx->ft.status, spec, &flow_act, dest, 2);
- if (IS_ERR(rule)) {
- err = PTR_ERR(rule);
- mlx5_core_warn(ipsec->mdev,
- "Failed to add ipsec rx status pass rule, err=%d\n", err);
- goto err_rule;
- }
-
- rx->status.rule = rule;
- kvfree(spec);
- return 0;
-
-err_rule:
- kvfree(spec);
- return err;
-}
-
-void mlx5_esw_ipsec_rx_status_destroy(struct mlx5e_ipsec *ipsec,
- struct mlx5e_ipsec_rx *rx)
-{
- esw_ipsec_rx_status_pass_destroy(ipsec, rx);
- esw_ipsec_rx_status_drop_destroy(ipsec, rx);
-}
-
-int mlx5_esw_ipsec_rx_status_create(struct mlx5e_ipsec *ipsec,
- struct mlx5e_ipsec_rx *rx,
- struct mlx5_flow_destination *dest)
-{
- int err;
-
- err = esw_ipsec_rx_status_drop_create(ipsec, rx);
- if (err)
- return err;
-
- err = esw_ipsec_rx_status_pass_create(ipsec, rx, dest);
- if (err)
- goto err_pass_create;
-
- return 0;
-
-err_pass_create:
- esw_ipsec_rx_status_drop_destroy(ipsec, rx);
- return err;
-}
-
void mlx5_esw_ipsec_rx_create_attr_set(struct mlx5e_ipsec *ipsec,
struct mlx5e_ipsec_rx_create_attr *attr)
{
u32 mapped_id;
int err;
- err = xa_alloc_bh(&ipsec->rx_esw->ipsec_obj_id_map, &mapped_id,
+ err = xa_alloc_bh(&ipsec->ipsec_obj_id_map, &mapped_id,
xa_mk_value(sa_entry->ipsec_obj_id),
XA_LIMIT(1, ESW_IPSEC_RX_MAPPED_ID_MASK), 0);
if (err)
return 0;
err_header_alloc:
- xa_erase_bh(&ipsec->rx_esw->ipsec_obj_id_map, mapped_id);
+ xa_erase_bh(&ipsec->ipsec_obj_id_map, mapped_id);
return err;
}
struct mlx5e_ipsec *ipsec = sa_entry->ipsec;
if (sa_entry->rx_mapped_id)
- xa_erase_bh(&ipsec->rx_esw->ipsec_obj_id_map,
+ xa_erase_bh(&ipsec->ipsec_obj_id_map,
sa_entry->rx_mapped_id);
}
struct mlx5e_ipsec *ipsec = priv->ipsec;
void *val;
- val = xa_load(&ipsec->rx_esw->ipsec_obj_id_map, id);
+ val = xa_load(&ipsec->ipsec_obj_id_map, id);
if (!val)
return -ENOENT;
xa_for_each(&esw->offloads.vport_reps, i, rep) {
rpriv = rep->rep_data[REP_ETH].priv;
- if (!rpriv || !rpriv->netdev)
+ if (!rpriv || !rpriv->netdev || !atomic_read(&rpriv->tc_ht.nelems))
continue;
rhashtable_walk_enter(&rpriv->tc_ht, &iter);
struct mlx5e_ipsec_sa_entry;
#ifdef CONFIG_MLX5_ESWITCH
-void mlx5_esw_ipsec_rx_status_destroy(struct mlx5e_ipsec *ipsec,
- struct mlx5e_ipsec_rx *rx);
-int mlx5_esw_ipsec_rx_status_create(struct mlx5e_ipsec *ipsec,
- struct mlx5e_ipsec_rx *rx,
- struct mlx5_flow_destination *dest);
void mlx5_esw_ipsec_rx_create_attr_set(struct mlx5e_ipsec *ipsec,
struct mlx5e_ipsec_rx_create_attr *attr);
int mlx5_esw_ipsec_rx_status_pass_dest_get(struct mlx5e_ipsec *ipsec,
struct mlx5e_ipsec_tx_create_attr *attr);
void mlx5_esw_ipsec_restore_dest_uplink(struct mlx5_core_dev *mdev);
#else
-static inline void mlx5_esw_ipsec_rx_status_destroy(struct mlx5e_ipsec *ipsec,
- struct mlx5e_ipsec_rx *rx) {}
-
-static inline int mlx5_esw_ipsec_rx_status_create(struct mlx5e_ipsec *ipsec,
- struct mlx5e_ipsec_rx *rx,
- struct mlx5_flow_destination *dest)
-{
- return -EINVAL;
-}
-
static inline void mlx5_esw_ipsec_rx_create_attr_set(struct mlx5e_ipsec *ipsec,
struct mlx5e_ipsec_rx_create_attr *attr) {}
{
int err;
- lockdep_assert_held(&esw->mode_lock);
+ devl_assert_locked(priv_to_devlink(esw->dev));
if (!MLX5_CAP_ESW_FLOWTABLE_FDB(esw->dev, ft_support)) {
esw_warn(esw->dev, "FDB is not supported, aborting ...\n");
if (toggle_lag)
mlx5_lag_disable_change(esw->dev);
- down_write(&esw->mode_lock);
if (!mlx5_esw_is_fdb_created(esw)) {
ret = mlx5_eswitch_enable_locked(esw, num_vfs);
} else {
}
}
- up_write(&esw->mode_lock);
-
if (toggle_lag)
mlx5_lag_enable_change(esw->dev);
return;
devl_assert_locked(priv_to_devlink(esw->dev));
- down_write(&esw->mode_lock);
/* If driver is unloaded, this function is called twice by remove_one()
* and mlx5_unload(). Prevent the second call.
*/
if (!esw->esw_funcs.num_vfs && !esw->esw_funcs.num_ec_vfs && !clear_vf)
- goto unlock;
+ return;
esw_info(esw->dev, "Unload vfs: mode(%s), nvfs(%d), necvfs(%d), active vports(%d)\n",
esw->mode == MLX5_ESWITCH_LEGACY ? "LEGACY" : "OFFLOADS",
esw->esw_funcs.num_vfs = 0;
else
esw->esw_funcs.num_ec_vfs = 0;
-
-unlock:
- up_write(&esw->mode_lock);
}
/* Free resources for corresponding eswitch mode. It is called by devlink
devl_assert_locked(priv_to_devlink(esw->dev));
mlx5_lag_disable_change(esw->dev);
- down_write(&esw->mode_lock);
mlx5_eswitch_disable_locked(esw);
esw->mode = MLX5_ESWITCH_LEGACY;
- up_write(&esw->mode_lock);
mlx5_lag_enable_change(esw->dev);
}
if (!mlx5_esw_allowed(esw))
return true;
- if (down_read_trylock(&esw->mode_lock) != 0)
+ if (down_read_trylock(&esw->mode_lock) != 0) {
+ if (esw->eswitch_operation_in_progress) {
+ up_read(&esw->mode_lock);
+ return false;
+ }
return true;
+ }
return false;
}
if (down_write_trylock(&esw->mode_lock) == 0)
return -EINVAL;
- if (atomic64_read(&esw->user_count) > 0) {
+ if (esw->eswitch_operation_in_progress ||
+ atomic64_read(&esw->user_count) > 0) {
up_write(&esw->mode_lock);
return -EBUSY;
}
return esw->mode;
}
+int mlx5_esw_lock(struct mlx5_eswitch *esw)
+{
+ down_write(&esw->mode_lock);
+
+ if (esw->eswitch_operation_in_progress) {
+ up_write(&esw->mode_lock);
+ return -EBUSY;
+ }
+
+ return 0;
+}
+
/**
* mlx5_esw_unlock() - Release write lock on esw mode lock
* @esw: eswitch device.
struct xarray paired;
struct mlx5_devcom_comp_dev *devcom;
u16 enabled_ipsec_vf_count;
+ bool eswitch_operation_in_progress;
};
void esw_offloads_disable(struct mlx5_eswitch *esw);
u8 total_vlan;
struct {
u32 flags;
- struct mlx5_eswitch_rep *rep;
+ bool vport_valid;
+ u16 vport;
struct mlx5_pkt_reformat *pkt_reformat;
struct mlx5_core_dev *mdev;
struct mlx5_termtbl_handle *termtbl;
void mlx5_esw_get(struct mlx5_core_dev *dev);
void mlx5_esw_put(struct mlx5_core_dev *dev);
int mlx5_esw_try_lock(struct mlx5_eswitch *esw);
+int mlx5_esw_lock(struct mlx5_eswitch *esw);
void mlx5_esw_unlock(struct mlx5_eswitch *esw);
void esw_vport_change_handle_locked(struct mlx5_vport *vport);
for (i = from; i < to; i++)
if (esw_attr->dests[i].flags & MLX5_ESW_DEST_CHAIN_WITH_SRC_PORT_CHANGE)
mlx5_chains_put_table(chains, 0, 1, 0);
- else if (mlx5_esw_indir_table_needed(esw, attr, esw_attr->dests[i].rep->vport,
+ else if (mlx5_esw_indir_table_needed(esw, attr, esw_attr->dests[i].vport,
esw_attr->dests[i].mdev))
- mlx5_esw_indir_table_put(esw, esw_attr->dests[i].rep->vport,
- false);
+ mlx5_esw_indir_table_put(esw, esw_attr->dests[i].vport, false);
}
static bool
* this criteria.
*/
for (i = esw_attr->split_count; i < esw_attr->out_count; i++) {
- if (esw_attr->dests[i].rep &&
- mlx5_esw_indir_table_needed(esw, attr, esw_attr->dests[i].rep->vport,
+ if (esw_attr->dests[i].vport_valid &&
+ mlx5_esw_indir_table_needed(esw, attr, esw_attr->dests[i].vport,
esw_attr->dests[i].mdev)) {
result = true;
} else {
dest[*i].type = MLX5_FLOW_DESTINATION_TYPE_FLOW_TABLE;
dest[*i].ft = mlx5_esw_indir_table_get(esw, attr,
- esw_attr->dests[j].rep->vport, false);
+ esw_attr->dests[j].vport, false);
if (IS_ERR(dest[*i].ft)) {
err = PTR_ERR(dest[*i].ft);
goto err_indir_tbl_get;
int attr_idx)
{
if (esw->offloads.ft_ipsec_tx_pol &&
- esw_attr->dests[attr_idx].rep &&
- esw_attr->dests[attr_idx].rep->vport == MLX5_VPORT_UPLINK &&
+ esw_attr->dests[attr_idx].vport_valid &&
+ esw_attr->dests[attr_idx].vport == MLX5_VPORT_UPLINK &&
/* To be aligned with software, encryption is needed only for tunnel device */
(esw_attr->dests[attr_idx].flags & MLX5_ESW_DEST_ENCAP_VALID) &&
- esw_attr->dests[attr_idx].rep != esw_attr->in_rep &&
+ esw_attr->dests[attr_idx].vport != esw_attr->in_rep->vport &&
esw_same_vhca_id(esw_attr->dests[attr_idx].mdev, esw->dev))
return true;
int attr_idx, int dest_idx, bool pkt_reformat)
{
dest[dest_idx].type = MLX5_FLOW_DESTINATION_TYPE_VPORT;
- dest[dest_idx].vport.num = esw_attr->dests[attr_idx].rep->vport;
+ dest[dest_idx].vport.num = esw_attr->dests[attr_idx].vport;
if (MLX5_CAP_ESW(esw->dev, merged_eswitch)) {
dest[dest_idx].vport.vhca_id =
MLX5_CAP_GEN(esw_attr->dests[attr_idx].mdev, vhca_id);
dest.vport.flags |= MLX5_FLOW_DEST_VPORT_VHCA_ID;
flow_act.action = MLX5_FLOW_CONTEXT_ACTION_FWD_DEST;
- if (rep->vport == MLX5_VPORT_UPLINK && on_esw->offloads.ft_ipsec_tx_pol) {
+ if (rep->vport == MLX5_VPORT_UPLINK &&
+ on_esw == from_esw && on_esw->offloads.ft_ipsec_tx_pol) {
dest.ft = on_esw->offloads.ft_ipsec_tx_pol;
flow_act.flags = FLOW_ACT_IGNORE_FLOW_LEVEL;
dest.type = MLX5_FLOW_DESTINATION_TYPE_FLOW_TABLE;
struct mlx5_flow_handle *flow;
struct mlx5_flow_spec *spec;
struct mlx5_vport *vport;
+ int err, pfindex;
unsigned long i;
void *misc;
- int err;
if (!MLX5_VPORT_MANAGER(esw->dev) && !mlx5_core_is_ecpf_esw_manager(esw->dev))
return 0;
flows[vport->index] = flow;
}
}
- esw->fdb_table.offloads.peer_miss_rules[mlx5_get_dev_index(peer_dev)] = flows;
+
+ pfindex = mlx5_get_dev_index(peer_dev);
+ if (pfindex >= MLX5_MAX_PORTS) {
+ esw_warn(esw->dev, "Peer dev index(%d) is over the max num defined(%d)\n",
+ pfindex, MLX5_MAX_PORTS);
+ err = -EINVAL;
+ goto add_ec_vf_flow_err;
+ }
+ esw->fdb_table.offloads.peer_miss_rules[pfindex] = flows;
kvfree(spec);
return 0;
static bool esw_offloads_devlink_ns_eq_netdev_ns(struct devlink *devlink)
{
+ struct mlx5_core_dev *dev = devlink_priv(devlink);
struct net *devl_net, *netdev_net;
- struct mlx5_eswitch *esw;
-
- esw = mlx5_devlink_eswitch_nocheck_get(devlink);
- netdev_net = dev_net(esw->dev->mlx5e_res.uplink_netdev);
- devl_net = devlink_net(devlink);
+ bool ret = false;
- return net_eq(devl_net, netdev_net);
+ mutex_lock(&dev->mlx5e_res.uplink_netdev_lock);
+ if (dev->mlx5e_res.uplink_netdev) {
+ netdev_net = dev_net(dev->mlx5e_res.uplink_netdev);
+ devl_net = devlink_net(devlink);
+ ret = net_eq(devl_net, netdev_net);
+ }
+ mutex_unlock(&dev->mlx5e_res.uplink_netdev_lock);
+ return ret;
}
int mlx5_eswitch_block_mode(struct mlx5_core_dev *dev)
goto unlock;
}
+ esw->eswitch_operation_in_progress = true;
+ up_write(&esw->mode_lock);
+
mlx5_eswitch_disable_locked(esw);
if (mode == DEVLINK_ESWITCH_MODE_SWITCHDEV) {
if (mlx5_devlink_trap_get_num_active(esw->dev)) {
NL_SET_ERR_MSG_MOD(extack,
"Can't change mode while devlink traps are active");
err = -EOPNOTSUPP;
- goto unlock;
+ goto skip;
}
err = esw_offloads_start(esw, extack);
} else if (mode == DEVLINK_ESWITCH_MODE_LEGACY) {
err = -EINVAL;
}
+skip:
+ down_write(&esw->mode_lock);
+ esw->eswitch_operation_in_progress = false;
unlock:
mlx5_esw_unlock(esw);
enable_lag:
int mlx5_devlink_eswitch_mode_get(struct devlink *devlink, u16 *mode)
{
struct mlx5_eswitch *esw;
- int err;
esw = mlx5_devlink_eswitch_get(devlink);
if (IS_ERR(esw))
return PTR_ERR(esw);
- down_read(&esw->mode_lock);
- err = esw_mode_to_devlink(esw->mode, mode);
- up_read(&esw->mode_lock);
- return err;
+ return esw_mode_to_devlink(esw->mode, mode);
}
static int mlx5_esw_vports_inline_set(struct mlx5_eswitch *esw, u8 mlx5_mode,
if (err)
goto out;
+ esw->eswitch_operation_in_progress = true;
+ up_write(&esw->mode_lock);
+
err = mlx5_esw_vports_inline_set(esw, mlx5_mode, extack);
- if (err)
- goto out;
+ if (!err)
+ esw->offloads.inline_mode = mlx5_mode;
- esw->offloads.inline_mode = mlx5_mode;
+ down_write(&esw->mode_lock);
+ esw->eswitch_operation_in_progress = false;
up_write(&esw->mode_lock);
return 0;
int mlx5_devlink_eswitch_inline_mode_get(struct devlink *devlink, u8 *mode)
{
struct mlx5_eswitch *esw;
- int err;
esw = mlx5_devlink_eswitch_get(devlink);
if (IS_ERR(esw))
return PTR_ERR(esw);
- down_read(&esw->mode_lock);
- err = esw_inline_mode_to_devlink(esw->offloads.inline_mode, mode);
- up_read(&esw->mode_lock);
- return err;
+ return esw_inline_mode_to_devlink(esw->offloads.inline_mode, mode);
}
bool mlx5_eswitch_block_encap(struct mlx5_core_dev *dev)
goto unlock;
}
+ esw->eswitch_operation_in_progress = true;
+ up_write(&esw->mode_lock);
+
esw_destroy_offloads_fdb_tables(esw);
esw->offloads.encap = encap;
(void)esw_create_offloads_fdb_tables(esw);
}
+ down_write(&esw->mode_lock);
+ esw->eswitch_operation_in_progress = false;
+
unlock:
up_write(&esw->mode_lock);
return err;
if (IS_ERR(esw))
return PTR_ERR(esw);
- down_read(&esw->mode_lock);
*encap = esw->offloads.encap;
- up_read(&esw->mode_lock);
return 0;
}
/* hairpin */
for (i = esw_attr->split_count; i < esw_attr->out_count; i++)
- if (!esw_attr->dest_int_port && esw_attr->dests[i].rep &&
- esw_attr->dests[i].rep->vport == MLX5_VPORT_UPLINK)
+ if (!esw_attr->dest_int_port && esw_attr->dests[i].vport_valid &&
+ esw_attr->dests[i].vport == MLX5_VPORT_UPLINK)
return true;
return false;
mlx5_core_err(dev, "Failed to reload FW tracer\n");
}
+#if IS_ENABLED(CONFIG_HOTPLUG_PCI_PCIE)
+static int mlx5_check_hotplug_interrupt(struct mlx5_core_dev *dev)
+{
+ struct pci_dev *bridge = dev->pdev->bus->self;
+ u16 reg16;
+ int err;
+
+ if (!bridge)
+ return -EOPNOTSUPP;
+
+ err = pcie_capability_read_word(bridge, PCI_EXP_SLTCTL, ®16);
+ if (err)
+ return err;
+
+ if ((reg16 & PCI_EXP_SLTCTL_HPIE) && (reg16 & PCI_EXP_SLTCTL_DLLSCE)) {
+ mlx5_core_warn(dev, "FW reset is not supported as HotPlug is enabled\n");
+ return -EOPNOTSUPP;
+ }
+
+ return 0;
+}
+#endif
+
static int mlx5_check_dev_ids(struct mlx5_core_dev *dev, u16 dev_id)
{
struct pci_bus *bridge_bus = dev->pdev->bus;
return false;
}
+#if IS_ENABLED(CONFIG_HOTPLUG_PCI_PCIE)
+ err = mlx5_check_hotplug_interrupt(dev);
+ if (err)
+ return false;
+#endif
+
err = pci_read_config_word(dev->pdev, PCI_DEVICE_ID, &dev_id);
if (err)
return false;
if (pool->irqs_per_cpu)
cpu_put(pool, cpu);
}
-
-/**
- * mlx5_irq_affinity_irq_request_auto - request one IRQ for mlx5 device.
- * @dev: mlx5 device that is requesting the IRQ.
- * @used_cpus: cpumask of bounded cpus by the device
- * @vecidx: vector index to request an IRQ for.
- *
- * Each IRQ is bounded to at most 1 CPU.
- * This function is requesting an IRQ according to the default assignment.
- * The default assignment policy is:
- * - request the least loaded IRQ which is not bound to any
- * CPU of the previous IRQs requested.
- *
- * On success, this function updates used_cpus mask and returns an irq pointer.
- * In case of an error, an appropriate error pointer is returned.
- */
-struct mlx5_irq *mlx5_irq_affinity_irq_request_auto(struct mlx5_core_dev *dev,
- struct cpumask *used_cpus, u16 vecidx)
-{
- struct mlx5_irq_pool *pool = mlx5_irq_pool_get(dev);
- struct irq_affinity_desc af_desc = {};
- struct mlx5_irq *irq;
-
- if (!mlx5_irq_pool_is_sf_pool(pool))
- return ERR_PTR(-ENOENT);
-
- af_desc.is_managed = 1;
- cpumask_copy(&af_desc.mask, cpu_online_mask);
- cpumask_andnot(&af_desc.mask, &af_desc.mask, used_cpus);
- irq = mlx5_irq_affinity_request(pool, &af_desc);
-
- if (IS_ERR(irq))
- return irq;
-
- cpumask_or(used_cpus, used_cpus, mlx5_irq_get_affinity_mask(irq));
- mlx5_core_dbg(pool->dev, "IRQ %u mapped to cpu %*pbl, %u EQs on this irq\n",
- pci_irq_vector(dev->pdev, mlx5_irq_get_index(irq)),
- cpumask_pr_args(mlx5_irq_get_affinity_mask(irq)),
- mlx5_irq_read_locked(irq) / MLX5_EQ_REFS_PER_IRQ);
-
- return irq;
-}
static int mlx5_ptp_adjphase(struct ptp_clock_info *ptp, s32 delta)
{
- return mlx5_ptp_adjtime(ptp, delta);
+ struct mlx5_clock *clock = container_of(ptp, struct mlx5_clock, ptp_info);
+ struct mlx5_core_dev *mdev;
+
+ mdev = container_of(clock, struct mlx5_core_dev, clock);
+
+ return mlx5_ptp_adjtime_real_time(mdev, delta);
}
static int mlx5_ptp_freq_adj_real_time(struct mlx5_core_dev *mdev, long scaled_ppm)
struct mlx5_irq {
struct atomic_notifier_head nh;
cpumask_var_t mask;
- char name[MLX5_MAX_IRQ_NAME];
+ char name[MLX5_MAX_IRQ_FORMATTED_NAME];
struct mlx5_irq_pool *pool;
int refcount;
struct msi_map map;
else
irq_sf_set_name(pool, name, i);
ATOMIC_INIT_NOTIFIER_HEAD(&irq->nh);
- snprintf(irq->name, MLX5_MAX_IRQ_NAME,
- "%s@pci:%s", name, pci_name(dev->pdev));
+ snprintf(irq->name, MLX5_MAX_IRQ_FORMATTED_NAME,
+ MLX5_IRQ_NAME_FORMAT_STR, name, pci_name(dev->pdev));
err = request_irq(irq->map.virq, irq_int_handler, 0, irq->name,
&irq->nh);
if (err) {
#include <linux/mlx5/driver.h>
#define MLX5_MAX_IRQ_NAME (32)
+#define MLX5_IRQ_NAME_FORMAT_STR ("%s@pci:%s")
+#define MLX5_MAX_IRQ_FORMATTED_NAME \
+ (MLX5_MAX_IRQ_NAME + sizeof(MLX5_IRQ_NAME_FORMAT_STR))
/* max irq_index is 2047, so four chars */
#define MLX5_MAX_IRQ_IDX_CHARS (4)
#define MLX5_EQ_REFS_PER_IRQ (2)
static bool mlx5dr_action_supp_fwd_fdb_multi_ft(struct mlx5_core_dev *dev)
{
- return (MLX5_CAP_ESW_FLOWTABLE(dev, fdb_multi_path_any_table_limit_regc) ||
+ return (MLX5_CAP_GEN(dev, steering_format_version) < MLX5_STEERING_FORMAT_CONNECTX_6DX ||
+ MLX5_CAP_ESW_FLOWTABLE(dev, fdb_multi_path_any_table_limit_regc) ||
MLX5_CAP_ESW_FLOWTABLE(dev, fdb_multi_path_any_table));
}
u32 cqn;
u32 pdn;
u32 max_send_wr;
- u32 max_send_sge;
struct mlx5_uars_page *uar;
u8 isolate_vl_tc:1;
};
return err == CQ_POLL_ERR ? err : npolled;
}
-static int dr_qp_get_args_update_send_wqe_size(struct dr_qp_init_attr *attr)
-{
- return roundup_pow_of_two(sizeof(struct mlx5_wqe_ctrl_seg) +
- sizeof(struct mlx5_wqe_flow_update_ctrl_seg) +
- sizeof(struct mlx5_wqe_header_modify_argument_update_seg));
-}
-
-/* We calculate for specific RC QP with the required functionality */
-static int dr_qp_calc_rc_send_wqe(struct dr_qp_init_attr *attr)
-{
- int update_arg_size;
- int inl_size = 0;
- int tot_size;
- int size;
-
- update_arg_size = dr_qp_get_args_update_send_wqe_size(attr);
-
- size = sizeof(struct mlx5_wqe_ctrl_seg) +
- sizeof(struct mlx5_wqe_raddr_seg);
- inl_size = size + ALIGN(sizeof(struct mlx5_wqe_inline_seg) +
- DR_STE_SIZE, 16);
-
- size += attr->max_send_sge * sizeof(struct mlx5_wqe_data_seg);
-
- size = max(size, update_arg_size);
-
- tot_size = max(size, inl_size);
-
- return ALIGN(tot_size, MLX5_SEND_WQE_BB);
-}
-
static struct mlx5dr_qp *dr_create_rc_qp(struct mlx5_core_dev *mdev,
struct dr_qp_init_attr *attr)
{
u32 temp_qpc[MLX5_ST_SZ_DW(qpc)] = {};
struct mlx5_wq_param wqp;
struct mlx5dr_qp *dr_qp;
- int wqe_size;
int inlen;
void *qpc;
void *in;
if (err)
goto err_in;
dr_qp->uar = attr->uar;
- wqe_size = dr_qp_calc_rc_send_wqe(attr);
- dr_qp->max_inline_data = min(wqe_size -
- (sizeof(struct mlx5_wqe_ctrl_seg) +
- sizeof(struct mlx5_wqe_raddr_seg) +
- sizeof(struct mlx5_wqe_inline_seg)),
- (2 * MLX5_SEND_WQE_BB -
- (sizeof(struct mlx5_wqe_ctrl_seg) +
- sizeof(struct mlx5_wqe_raddr_seg) +
- sizeof(struct mlx5_wqe_inline_seg))));
return dr_qp;
MLX5_SEND_WQE_DS;
}
-static int dr_set_data_inl_seg(struct mlx5dr_qp *dr_qp,
- struct dr_data_seg *data_seg, void *wqe)
-{
- int inline_header_size = sizeof(struct mlx5_wqe_ctrl_seg) +
- sizeof(struct mlx5_wqe_raddr_seg) +
- sizeof(struct mlx5_wqe_inline_seg);
- struct mlx5_wqe_inline_seg *seg;
- int left_space;
- int inl = 0;
- void *addr;
- int len;
- int idx;
-
- seg = wqe;
- wqe += sizeof(*seg);
- addr = (void *)(unsigned long)(data_seg->addr);
- len = data_seg->length;
- inl += len;
- left_space = MLX5_SEND_WQE_BB - inline_header_size;
-
- if (likely(len > left_space)) {
- memcpy(wqe, addr, left_space);
- len -= left_space;
- addr += left_space;
- idx = (dr_qp->sq.pc + 1) & (dr_qp->sq.wqe_cnt - 1);
- wqe = mlx5_wq_cyc_get_wqe(&dr_qp->wq.sq, idx);
- }
-
- memcpy(wqe, addr, len);
-
- if (likely(inl)) {
- seg->byte_count = cpu_to_be32(inl | MLX5_INLINE_SEG);
- return DIV_ROUND_UP(inl + sizeof(seg->byte_count),
- MLX5_SEND_WQE_DS);
- } else {
- return 0;
- }
-}
-
static void
-dr_rdma_handle_icm_write_segments(struct mlx5dr_qp *dr_qp,
- struct mlx5_wqe_ctrl_seg *wq_ctrl,
+dr_rdma_handle_icm_write_segments(struct mlx5_wqe_ctrl_seg *wq_ctrl,
u64 remote_addr,
u32 rkey,
struct dr_data_seg *data_seg,
wq_raddr->reserved = 0;
wq_dseg = (void *)(wq_raddr + 1);
- /* WQE ctrl segment + WQE remote addr segment */
- *size = (sizeof(*wq_ctrl) + sizeof(*wq_raddr)) / MLX5_SEND_WQE_DS;
- if (data_seg->send_flags & IB_SEND_INLINE) {
- *size += dr_set_data_inl_seg(dr_qp, data_seg, wq_dseg);
- } else {
- wq_dseg->byte_count = cpu_to_be32(data_seg->length);
- wq_dseg->lkey = cpu_to_be32(data_seg->lkey);
- wq_dseg->addr = cpu_to_be64(data_seg->addr);
- *size += sizeof(*wq_dseg) / MLX5_SEND_WQE_DS; /* WQE data segment */
- }
+ wq_dseg->byte_count = cpu_to_be32(data_seg->length);
+ wq_dseg->lkey = cpu_to_be32(data_seg->lkey);
+ wq_dseg->addr = cpu_to_be64(data_seg->addr);
+
+ *size = (sizeof(*wq_ctrl) + /* WQE ctrl segment */
+ sizeof(*wq_dseg) + /* WQE data segment */
+ sizeof(*wq_raddr)) / /* WQE remote addr segment */
+ MLX5_SEND_WQE_DS;
}
static void dr_set_ctrl_seg(struct mlx5_wqe_ctrl_seg *wq_ctrl,
switch (opcode) {
case MLX5_OPCODE_RDMA_READ:
case MLX5_OPCODE_RDMA_WRITE:
- dr_rdma_handle_icm_write_segments(dr_qp, wq_ctrl, remote_addr,
+ dr_rdma_handle_icm_write_segments(wq_ctrl, remote_addr,
rkey, data_seg, &size);
break;
case MLX5_OPCODE_FLOW_TBL_ACCESS:
if (send_ring->pending_wqe % send_ring->signal_th == 0)
send_info->write.send_flags |= IB_SEND_SIGNALED;
else
- send_info->write.send_flags &= ~IB_SEND_SIGNALED;
+ send_info->write.send_flags = 0;
}
static void dr_fill_write_icm_segs(struct mlx5dr_domain *dmn,
}
send_ring->pending_wqe++;
- if (!send_info->write.lkey)
- send_info->write.send_flags |= IB_SEND_INLINE;
if (send_ring->pending_wqe % send_ring->signal_th == 0)
send_info->write.send_flags |= IB_SEND_SIGNALED;
- else
- send_info->write.send_flags &= ~IB_SEND_SIGNALED;
send_ring->pending_wqe++;
send_info->read.length = send_info->write.length;
send_info->read.lkey = send_ring->sync_mr->mkey;
if (send_ring->pending_wqe % send_ring->signal_th == 0)
- send_info->read.send_flags |= IB_SEND_SIGNALED;
+ send_info->read.send_flags = IB_SEND_SIGNALED;
else
- send_info->read.send_flags &= ~IB_SEND_SIGNALED;
+ send_info->read.send_flags = 0;
}
static void dr_fill_data_segs(struct mlx5dr_domain *dmn,
dmn->send_ring->cq->qp = dmn->send_ring->qp;
dmn->info.max_send_wr = QUEUE_SIZE;
- init_attr.max_send_sge = 1;
dmn->info.max_inline_size = min(dmn->send_ring->qp->max_inline_data,
DR_STE_SIZE);
req_list_size = max_list_size;
}
- out_sz = MLX5_ST_SZ_BYTES(query_nic_vport_context_in) +
+ out_sz = MLX5_ST_SZ_BYTES(query_nic_vport_context_out) +
req_list_size * MLX5_ST_SZ_BYTES(mac_address_layout);
out = kvzalloc(out_sz, GFP_KERNEL);
* @rxd: Space for receiving SPI data, in DMA-able space.
* @txd: Space for transmitting SPI data, in DMA-able space.
* @msg_enable: The message flags controlling driver output (see ethtool).
+ * @tx_space: Free space in the hardware TX buffer (cached copy of KS_TXMIR).
+ * @queued_len: Space required in hardware TX buffer for queued packets in txq.
* @fid: Incrementing frame id tag.
* @rc_ier: Cached copy of KS_IER.
* @rc_ccr: Cached copy of KS_CCR.
struct work_struct rxctrl_work;
struct sk_buff_head txq;
+ unsigned int queued_len;
struct eeprom_93cx6 eeprom;
struct regulator *vdd_reg;
handled |= IRQ_RXPSI;
if (status & IRQ_TXI) {
- handled |= IRQ_TXI;
+ unsigned short tx_space = ks8851_rdreg16(ks, KS_TXMIR);
- /* no lock here, tx queue should have been stopped */
+ netif_dbg(ks, intr, ks->netdev,
+ "%s: txspace %d\n", __func__, tx_space);
- /* update our idea of how much tx space is available to the
- * system */
- ks->tx_space = ks8851_rdreg16(ks, KS_TXMIR);
+ spin_lock(&ks->statelock);
+ ks->tx_space = tx_space;
+ if (netif_queue_stopped(ks->netdev))
+ netif_wake_queue(ks->netdev);
+ spin_unlock(&ks->statelock);
- netif_dbg(ks, intr, ks->netdev,
- "%s: txspace %d\n", __func__, ks->tx_space);
+ handled |= IRQ_TXI;
}
if (status & IRQ_RXI)
if (status & IRQ_LCI)
mii_check_link(&ks->mii);
- if (status & IRQ_TXI)
- netif_wake_queue(ks->netdev);
-
return IRQ_HANDLED;
}
ks8851_wrreg16(ks, KS_ISR, ks->rc_ier);
ks8851_wrreg16(ks, KS_IER, ks->rc_ier);
+ ks->queued_len = 0;
netif_start_queue(ks->netdev);
netif_dbg(ks, ifup, ks->netdev, "network device up\n");
netdev_err(ks->netdev, "%s: spi_sync() failed\n", __func__);
}
+/**
+ * calc_txlen - calculate size of message to send packet
+ * @len: Length of data
+ *
+ * Returns the size of the TXFIFO message needed to send
+ * this packet.
+ */
+static unsigned int calc_txlen(unsigned int len)
+{
+ return ALIGN(len + 4, 4);
+}
+
/**
* ks8851_rx_skb_spi - receive skbuff
* @ks: The device state
*/
static void ks8851_tx_work(struct work_struct *work)
{
+ unsigned int dequeued_len = 0;
struct ks8851_net_spi *kss;
+ unsigned short tx_space;
struct ks8851_net *ks;
unsigned long flags;
struct sk_buff *txb;
last = skb_queue_empty(&ks->txq);
if (txb) {
+ dequeued_len += calc_txlen(txb->len);
+
ks8851_wrreg16_spi(ks, KS_RXQCR,
ks->rc_rxqcr | RXQCR_SDA);
ks8851_wrfifo_spi(ks, txb, last);
}
}
+ tx_space = ks8851_rdreg16_spi(ks, KS_TXMIR);
+
+ spin_lock(&ks->statelock);
+ ks->queued_len -= dequeued_len;
+ ks->tx_space = tx_space;
+ spin_unlock(&ks->statelock);
+
ks8851_unlock_spi(ks, &flags);
}
flush_work(&kss->tx_work);
}
-/**
- * calc_txlen - calculate size of message to send packet
- * @len: Length of data
- *
- * Returns the size of the TXFIFO message needed to send
- * this packet.
- */
-static unsigned int calc_txlen(unsigned int len)
-{
- return ALIGN(len + 4, 4);
-}
-
/**
* ks8851_start_xmit_spi - transmit packet using SPI
* @skb: The buffer to transmit
spin_lock(&ks->statelock);
- if (needed > ks->tx_space) {
+ if (ks->queued_len + needed > ks->tx_space) {
netif_stop_queue(dev);
ret = NETDEV_TX_BUSY;
} else {
- ks->tx_space -= needed;
+ ks->queued_len += needed;
skb_queue_tail(&ks->txq, skb);
}
spin_unlock(&ks->statelock);
- schedule_work(&kss->tx_work);
+ if (ret == NETDEV_TX_OK)
+ schedule_work(&kss->tx_work);
return ret;
}
depends on PCI_MSI && X86_64
depends on PCI_HYPERV
select AUXILIARY_BUS
+ select PAGE_POOL
help
This driver supports Microsoft Azure Network Adapter (MANA).
So far, the driver is only supported on X86_64.
rmon_stats->hist_tx[0] = s[OCELOT_STAT_TX_64];
rmon_stats->hist_tx[1] = s[OCELOT_STAT_TX_65_127];
rmon_stats->hist_tx[2] = s[OCELOT_STAT_TX_128_255];
- rmon_stats->hist_tx[3] = s[OCELOT_STAT_TX_128_255];
- rmon_stats->hist_tx[4] = s[OCELOT_STAT_TX_256_511];
- rmon_stats->hist_tx[5] = s[OCELOT_STAT_TX_512_1023];
- rmon_stats->hist_tx[6] = s[OCELOT_STAT_TX_1024_1526];
+ rmon_stats->hist_tx[3] = s[OCELOT_STAT_TX_256_511];
+ rmon_stats->hist_tx[4] = s[OCELOT_STAT_TX_512_1023];
+ rmon_stats->hist_tx[5] = s[OCELOT_STAT_TX_1024_1526];
+ rmon_stats->hist_tx[6] = s[OCELOT_STAT_TX_1527_MAX];
}
static void ocelot_port_pmac_rmon_stats_cb(struct ocelot *ocelot, int port,
rmon_stats->hist_tx[0] = s[OCELOT_STAT_TX_PMAC_64];
rmon_stats->hist_tx[1] = s[OCELOT_STAT_TX_PMAC_65_127];
rmon_stats->hist_tx[2] = s[OCELOT_STAT_TX_PMAC_128_255];
- rmon_stats->hist_tx[3] = s[OCELOT_STAT_TX_PMAC_128_255];
- rmon_stats->hist_tx[4] = s[OCELOT_STAT_TX_PMAC_256_511];
- rmon_stats->hist_tx[5] = s[OCELOT_STAT_TX_PMAC_512_1023];
- rmon_stats->hist_tx[6] = s[OCELOT_STAT_TX_PMAC_1024_1526];
+ rmon_stats->hist_tx[3] = s[OCELOT_STAT_TX_PMAC_256_511];
+ rmon_stats->hist_tx[4] = s[OCELOT_STAT_TX_PMAC_512_1023];
+ rmon_stats->hist_tx[5] = s[OCELOT_STAT_TX_PMAC_1024_1526];
+ rmon_stats->hist_tx[6] = s[OCELOT_STAT_TX_PMAC_1527_MAX];
}
void ocelot_port_get_rmon_stats(struct ocelot *ocelot, int port,
u8 addr[ETH_ALEN];
};
+/**
+ * struct nfp_neigh_update_work - update neighbour information to nfp
+ * @work: Work queue for writing neigh to the nfp
+ * @n: neighbour entry
+ * @app: Back pointer to app
+ */
+struct nfp_neigh_update_work {
+ struct work_struct work;
+ struct neighbour *n;
+ struct nfp_app *app;
+};
+
enum nfp_flower_mac_offload_cmd {
NFP_TUNNEL_MAC_OFFLOAD_ADD = 0,
NFP_TUNNEL_MAC_OFFLOAD_DEL = 1,
nfp_flower_cmsg_warn(app, "Neighbour configuration failed.\n");
}
-static int
-nfp_tun_neigh_event_handler(struct notifier_block *nb, unsigned long event,
- void *ptr)
+static void
+nfp_tun_release_neigh_update_work(struct nfp_neigh_update_work *update_work)
{
- struct nfp_flower_priv *app_priv;
- struct netevent_redirect *redir;
- struct neighbour *n;
+ neigh_release(update_work->n);
+ kfree(update_work);
+}
+
+static void nfp_tun_neigh_update(struct work_struct *work)
+{
+ struct nfp_neigh_update_work *update_work;
struct nfp_app *app;
+ struct neighbour *n;
bool neigh_invalid;
int err;
- switch (event) {
- case NETEVENT_REDIRECT:
- redir = (struct netevent_redirect *)ptr;
- n = redir->neigh;
- break;
- case NETEVENT_NEIGH_UPDATE:
- n = (struct neighbour *)ptr;
- break;
- default:
- return NOTIFY_DONE;
- }
-
- neigh_invalid = !(n->nud_state & NUD_VALID) || n->dead;
-
- app_priv = container_of(nb, struct nfp_flower_priv, tun.neigh_nb);
- app = app_priv->app;
+ update_work = container_of(work, struct nfp_neigh_update_work, work);
+ app = update_work->app;
+ n = update_work->n;
if (!nfp_flower_get_port_id_from_netdev(app, n->dev))
- return NOTIFY_DONE;
+ goto out;
#if IS_ENABLED(CONFIG_INET)
+ neigh_invalid = !(n->nud_state & NUD_VALID) || n->dead;
if (n->tbl->family == AF_INET6) {
#if IS_ENABLED(CONFIG_IPV6)
struct flowi6 flow6 = {};
dst = ip6_dst_lookup_flow(dev_net(n->dev), NULL,
&flow6, NULL);
if (IS_ERR(dst))
- return NOTIFY_DONE;
+ goto out;
dst_release(dst);
}
nfp_tun_write_neigh(n->dev, app, &flow6, n, true, false);
-#else
- return NOTIFY_DONE;
#endif /* CONFIG_IPV6 */
} else {
struct flowi4 flow4 = {};
rt = ip_route_output_key(dev_net(n->dev), &flow4);
err = PTR_ERR_OR_ZERO(rt);
if (err)
- return NOTIFY_DONE;
+ goto out;
ip_rt_put(rt);
}
nfp_tun_write_neigh(n->dev, app, &flow4, n, false, false);
}
-#else
- return NOTIFY_DONE;
#endif /* CONFIG_INET */
+out:
+ nfp_tun_release_neigh_update_work(update_work);
+}
- return NOTIFY_OK;
+static struct nfp_neigh_update_work *
+nfp_tun_alloc_neigh_update_work(struct nfp_app *app, struct neighbour *n)
+{
+ struct nfp_neigh_update_work *update_work;
+
+ update_work = kzalloc(sizeof(*update_work), GFP_ATOMIC);
+ if (!update_work)
+ return NULL;
+
+ INIT_WORK(&update_work->work, nfp_tun_neigh_update);
+ neigh_hold(n);
+ update_work->n = n;
+ update_work->app = app;
+
+ return update_work;
+}
+
+static int
+nfp_tun_neigh_event_handler(struct notifier_block *nb, unsigned long event,
+ void *ptr)
+{
+ struct nfp_neigh_update_work *update_work;
+ struct nfp_flower_priv *app_priv;
+ struct netevent_redirect *redir;
+ struct neighbour *n;
+ struct nfp_app *app;
+
+ switch (event) {
+ case NETEVENT_REDIRECT:
+ redir = (struct netevent_redirect *)ptr;
+ n = redir->neigh;
+ break;
+ case NETEVENT_NEIGH_UPDATE:
+ n = (struct neighbour *)ptr;
+ break;
+ default:
+ return NOTIFY_DONE;
+ }
+#if IS_ENABLED(CONFIG_IPV6)
+ if (n->tbl != ipv6_stub->nd_tbl && n->tbl != &arp_tbl)
+#else
+ if (n->tbl != &arp_tbl)
+#endif
+ return NOTIFY_DONE;
+
+ app_priv = container_of(nb, struct nfp_flower_priv, tun.neigh_nb);
+ app = app_priv->app;
+ update_work = nfp_tun_alloc_neigh_update_work(app, n);
+ if (!update_work)
+ return NOTIFY_DONE;
+
+ queue_work(system_highpri_wq, &update_work->work);
+
+ return NOTIFY_DONE;
}
void nfp_tunnel_request_route_v4(struct nfp_app *app, struct sk_buff *skb)
netdev = nfp_app_dev_get(app, be32_to_cpu(payload->ingress_port), NULL);
if (!netdev)
goto fail_rcu_unlock;
+ dev_hold(netdev);
flow.daddr = payload->ipv4_addr;
flow.flowi4_proto = IPPROTO_UDP;
ip_rt_put(rt);
if (!n)
goto fail_rcu_unlock;
+ rcu_read_unlock();
+
nfp_tun_write_neigh(n->dev, app, &flow, n, false, true);
neigh_release(n);
- rcu_read_unlock();
+ dev_put(netdev);
return;
fail_rcu_unlock:
rcu_read_unlock();
+ dev_put(netdev);
nfp_flower_cmsg_warn(app, "Requested route not found.\n");
}
netdev = nfp_app_dev_get(app, be32_to_cpu(payload->ingress_port), NULL);
if (!netdev)
goto fail_rcu_unlock;
+ dev_hold(netdev);
flow.daddr = payload->ipv6_addr;
flow.flowi6_proto = IPPROTO_UDP;
dst_release(dst);
if (!n)
goto fail_rcu_unlock;
+ rcu_read_unlock();
nfp_tun_write_neigh(n->dev, app, &flow, n, true, true);
neigh_release(n);
- rcu_read_unlock();
+ dev_put(netdev);
return;
fail_rcu_unlock:
rcu_read_unlock();
+ dev_put(netdev);
nfp_flower_cmsg_warn(app, "Requested IPv6 route not found.\n");
}
void *cb_arg;
};
-#define IONIC_QUEUE_NAME_MAX_SZ 32
+#define IONIC_QUEUE_NAME_MAX_SZ 16
struct ionic_queue {
struct device *dev;
static void ionic_dim_work(struct work_struct *work)
{
struct dim *dim = container_of(work, struct dim, work);
+ struct ionic_intr_info *intr;
struct dim_cq_moder cur_moder;
struct ionic_qcq *qcq;
+ struct ionic_lif *lif;
u32 new_coal;
cur_moder = net_dim_get_rx_moderation(dim->mode, dim->profile_ix);
qcq = container_of(dim, struct ionic_qcq, dim);
- new_coal = ionic_coal_usec_to_hw(qcq->q.lif->ionic, cur_moder.usec);
+ lif = qcq->q.lif;
+ new_coal = ionic_coal_usec_to_hw(lif->ionic, cur_moder.usec);
new_coal = new_coal ? new_coal : 1;
- if (qcq->intr.dim_coal_hw != new_coal) {
- unsigned int qi = qcq->cq.bound_q->index;
- struct ionic_lif *lif = qcq->q.lif;
-
- qcq->intr.dim_coal_hw = new_coal;
+ intr = &qcq->intr;
+ if (intr->dim_coal_hw != new_coal) {
+ intr->dim_coal_hw = new_coal;
ionic_intr_coal_init(lif->ionic->idev.intr_ctrl,
- lif->rxqcqs[qi]->intr.index,
- qcq->intr.dim_coal_hw);
+ intr->index, intr->dim_coal_hw);
}
dim->state = DIM_START_MEASURE;
p_dma->virt_addr = NULL;
}
kfree(p_mngr->ilt_shadow);
+ p_mngr->ilt_shadow = NULL;
}
static int qed_ilt_blk_alloc(struct qed_hwfn *p_hwfn,
#define QCASPI_MAX_REGS 0x20
+#define QCASPI_RX_MAX_FRAMES 4
+
static const u16 qcaspi_spi_regs[] = {
SPI_REG_BFR_SIZE,
SPI_REG_WRBUF_SPC_AVA,
{
struct qcaspi *qca = netdev_priv(dev);
- ring->rx_max_pending = 4;
+ ring->rx_max_pending = QCASPI_RX_MAX_FRAMES;
ring->tx_max_pending = TX_RING_MAX_LEN;
- ring->rx_pending = 4;
+ ring->rx_pending = QCASPI_RX_MAX_FRAMES;
ring->tx_pending = qca->txr.count;
}
struct kernel_ethtool_ringparam *kernel_ring,
struct netlink_ext_ack *extack)
{
- const struct net_device_ops *ops = dev->netdev_ops;
struct qcaspi *qca = netdev_priv(dev);
- if ((ring->rx_pending) ||
+ if (ring->rx_pending != QCASPI_RX_MAX_FRAMES ||
(ring->rx_mini_pending) ||
(ring->rx_jumbo_pending))
return -EINVAL;
- if (netif_running(dev))
- ops->ndo_stop(dev);
+ if (qca->spi_thread)
+ kthread_park(qca->spi_thread);
qca->txr.count = max_t(u32, ring->tx_pending, TX_RING_MIN_LEN);
qca->txr.count = min_t(u16, qca->txr.count, TX_RING_MAX_LEN);
- if (netif_running(dev))
- ops->ndo_open(dev);
+ if (qca->spi_thread)
+ kthread_unpark(qca->spi_thread);
return 0;
}
netdev_info(qca->net_dev, "SPI thread created\n");
while (!kthread_should_stop()) {
set_current_state(TASK_INTERRUPTIBLE);
+ if (kthread_should_park()) {
+ netif_tx_disable(qca->net_dev);
+ netif_carrier_off(qca->net_dev);
+ qcaspi_flush_tx_ring(qca);
+ kthread_parkme();
+ if (qca->sync == QCASPI_SYNC_READY) {
+ netif_carrier_on(qca->net_dev);
+ netif_wake_queue(qca->net_dev);
+ }
+ continue;
+ }
+
if ((qca->intr_req == qca->intr_svc) &&
!qca->txr.skb[qca->txr.head])
schedule();
if (intr_cause & SPI_INT_CPU_ON) {
qcaspi_qca7k_sync(qca, QCASPI_EVENT_CPUON);
+ /* Frame decoding in progress */
+ if (qca->frm_handle.state != qca->frm_handle.init)
+ qca->net_dev->stats.rx_dropped++;
+
+ qcafrm_fsm_init_spi(&qca->frm_handle);
+ qca->stats.device_reset++;
+
/* not synced. */
if (qca->sync != QCASPI_SYNC_READY)
continue;
- qca->stats.device_reset++;
netif_wake_queue(qca->net_dev);
netif_carrier_on(qca->net_dev);
}
/* No threshold before first PCI xfer */
#define RX_FIFO_THRESH (7 << RXCFG_FIFO_SHIFT)
#define RX_EARLY_OFF (1 << 11)
+#define RX_PAUSE_SLOT_ON (1 << 11) /* 8125b and later */
#define RXCFG_DMA_SHIFT 8
/* Unlimited maximum PCI burst. */
#define RX_DMA_BURST (7 << RXCFG_DMA_SHIFT)
enum rtl_flag {
RTL_FLAG_TASK_ENABLED = 0,
RTL_FLAG_TASK_RESET_PENDING,
+ RTL_FLAG_TASK_RESET_NO_QUEUE_WAKE,
RTL_FLAG_TASK_TX_TIMEOUT,
RTL_FLAG_MAX
};
unsigned supports_gmii:1;
unsigned aspm_manageable:1;
+ unsigned dash_enabled:1;
dma_addr_t counters_phys_addr;
struct rtl8169_counters *counters;
struct rtl8169_tc_offsets tc_offset;
return r8168ep_ocp_read(tp, 0x128) & BIT(0);
}
-static enum rtl_dash_type rtl_check_dash(struct rtl8169_private *tp)
+static bool rtl_dash_is_enabled(struct rtl8169_private *tp)
+{
+ switch (tp->dash_type) {
+ case RTL_DASH_DP:
+ return r8168dp_check_dash(tp);
+ case RTL_DASH_EP:
+ return r8168ep_check_dash(tp);
+ default:
+ return false;
+ }
+}
+
+static enum rtl_dash_type rtl_get_dash_type(struct rtl8169_private *tp)
{
switch (tp->mac_version) {
case RTL_GIGA_MAC_VER_28:
case RTL_GIGA_MAC_VER_31:
- return r8168dp_check_dash(tp) ? RTL_DASH_DP : RTL_DASH_NONE;
+ return RTL_DASH_DP;
case RTL_GIGA_MAC_VER_51 ... RTL_GIGA_MAC_VER_53:
- return r8168ep_check_dash(tp) ? RTL_DASH_EP : RTL_DASH_NONE;
+ return RTL_DASH_EP;
default:
return RTL_DASH_NONE;
}
device_set_wakeup_enable(tp_to_dev(tp), wolopts);
- if (tp->dash_type == RTL_DASH_NONE) {
+ if (!tp->dash_enabled) {
rtl_set_d3_pll_down(tp, !wolopts);
tp->dev->wol_enabled = wolopts ? 1 : 0;
}
case RTL_GIGA_MAC_VER_40 ... RTL_GIGA_MAC_VER_53:
RTL_W32(tp, RxConfig, RX128_INT_EN | RX_MULTI_EN | RX_DMA_BURST | RX_EARLY_OFF);
break;
- case RTL_GIGA_MAC_VER_61 ... RTL_GIGA_MAC_VER_63:
+ case RTL_GIGA_MAC_VER_61:
RTL_W32(tp, RxConfig, RX_FETCH_DFLT_8125 | RX_DMA_BURST);
break;
+ case RTL_GIGA_MAC_VER_63:
+ RTL_W32(tp, RxConfig, RX_FETCH_DFLT_8125 | RX_DMA_BURST |
+ RX_PAUSE_SLOT_ON);
+ break;
default:
RTL_W32(tp, RxConfig, RX128_INT_EN | RX_DMA_BURST);
break;
static void rtl_prepare_power_down(struct rtl8169_private *tp)
{
- if (tp->dash_type != RTL_DASH_NONE)
+ if (tp->dash_enabled)
return;
if (tp->mac_version == RTL_GIGA_MAC_VER_32 ||
rx_mode &= ~AcceptMulticast;
} else if (netdev_mc_count(dev) > MC_FILTER_LIMIT ||
dev->flags & IFF_ALLMULTI ||
- tp->mac_version == RTL_GIGA_MAC_VER_35 ||
- tp->mac_version == RTL_GIGA_MAC_VER_46 ||
- tp->mac_version == RTL_GIGA_MAC_VER_48) {
+ tp->mac_version == RTL_GIGA_MAC_VER_35) {
/* accept all multicasts */
} else if (netdev_mc_empty(dev)) {
rx_mode &= ~AcceptMulticast;
reset:
rtl_reset_work(tp);
netif_wake_queue(tp->dev);
+ } else if (test_and_clear_bit(RTL_FLAG_TASK_RESET_NO_QUEUE_WAKE, tp->wk.flags)) {
+ rtl_reset_work(tp);
}
out_unlock:
rtnl_unlock();
} else {
/* In few cases rx is broken after link-down otherwise */
if (rtl_is_8125(tp))
- rtl_reset_work(tp);
+ rtl_schedule_task(tp, RTL_FLAG_TASK_RESET_NO_QUEUE_WAKE);
pm_runtime_idle(d);
}
rtl8169_cleanup(tp);
rtl_disable_exit_l1(tp);
rtl_prepare_power_down(tp);
+
+ if (tp->dash_type != RTL_DASH_NONE)
+ rtl8168_driver_stop(tp);
}
static void rtl8169_up(struct rtl8169_private *tp)
{
+ if (tp->dash_type != RTL_DASH_NONE)
+ rtl8168_driver_start(tp);
+
pci_set_master(tp->pci_dev);
phy_init_hw(tp->phydev);
phy_resume(tp->phydev);
rtl8169_down(tp);
rtl8169_rx_clear(tp);
- cancel_work_sync(&tp->wk.work);
+ cancel_work(&tp->wk.work);
free_irq(tp->irq, tp);
{
struct rtl8169_private *tp = dev_get_drvdata(device);
- if (tp->dash_type != RTL_DASH_NONE)
+ if (tp->dash_enabled)
return -EBUSY;
if (!netif_running(tp->dev) || !netif_carrier_ok(tp->dev))
/* Restore original MAC address */
rtl_rar_set(tp, tp->dev->perm_addr);
- if (system_state == SYSTEM_POWER_OFF &&
- tp->dash_type == RTL_DASH_NONE) {
+ if (system_state == SYSTEM_POWER_OFF && !tp->dash_enabled) {
pci_wake_from_d3(pdev, tp->saved_wolopts);
pci_set_power_state(pdev, PCI_D3hot);
}
if (pci_dev_run_wake(pdev))
pm_runtime_get_noresume(&pdev->dev);
+ cancel_work_sync(&tp->wk.work);
+
unregister_netdev(tp->dev);
if (tp->dash_type != RTL_DASH_NONE)
rc = pci_disable_link_state(pdev, PCIE_LINK_STATE_L1);
tp->aspm_manageable = !rc;
- tp->dash_type = rtl_check_dash(tp);
+ tp->dash_type = rtl_get_dash_type(tp);
+ tp->dash_enabled = rtl_dash_is_enabled(tp);
tp->cp_cmd = RTL_R16(tp, CPlusCmd) & CPCMD_MASK;
/* configure chip for default features */
rtl8169_set_features(dev, dev->features);
- if (tp->dash_type == RTL_DASH_NONE) {
+ if (!tp->dash_enabled) {
rtl_set_d3_pll_down(tp, true);
} else {
rtl_set_d3_pll_down(tp, false);
"ok" : "ko");
if (tp->dash_type != RTL_DASH_NONE) {
- netdev_info(dev, "DASH enabled\n");
+ netdev_info(dev, "DASH %s\n",
+ tp->dash_enabled ? "enabled" : "disabled");
rtl8168_driver_start(tp);
}
{
struct ravb_private *priv = netdev_priv(ndev);
+ if (priv->phy_interface == PHY_INTERFACE_MODE_MII) {
+ ravb_write(ndev, (1000 << 16) | CXR35_SEL_XMII_MII, CXR35);
+ ravb_modify(ndev, CXR31, CXR31_SEL_LINK0 | CXR31_SEL_LINK1, 0);
+ } else {
+ ravb_write(ndev, (1000 << 16) | CXR35_SEL_XMII_RGMII, CXR35);
+ ravb_modify(ndev, CXR31, CXR31_SEL_LINK0 | CXR31_SEL_LINK1,
+ CXR31_SEL_LINK0);
+ }
+
/* Receive frame limit set register */
ravb_write(ndev, GBETH_RX_BUFF_MAX + ETH_FCS_LEN, RFLR);
/* E-MAC interrupt enable register */
ravb_write(ndev, ECSIPR_ICDIP, ECSIPR);
-
- if (priv->phy_interface == PHY_INTERFACE_MODE_MII) {
- ravb_modify(ndev, CXR31, CXR31_SEL_LINK0 | CXR31_SEL_LINK1, 0);
- ravb_write(ndev, (1000 << 16) | CXR35_SEL_XMII_MII, CXR35);
- } else {
- ravb_modify(ndev, CXR31, CXR31_SEL_LINK0 | CXR31_SEL_LINK1,
- CXR31_SEL_LINK0);
- }
}
static void ravb_emac_init_rcar(struct net_device *ndev)
if (info->gptp)
ravb_ptp_init(ndev, priv->pdev);
- netif_tx_start_all_queues(ndev);
-
/* PHY control start */
error = ravb_phy_start(ndev);
if (error)
goto out_ptp_stop;
+ netif_tx_start_all_queues(ndev);
+
return 0;
out_ptp_stop:
/* Stop PTP Clock driver */
if (info->gptp)
ravb_ptp_stop(ndev);
+ ravb_stop_dma(ndev);
out_free_irq_mgmta:
if (!info->multi_irqs)
goto out_free_irq;
struct net_device *ndev = priv->ndev;
int error;
+ if (!rtnl_trylock()) {
+ usleep_range(1000, 2000);
+ schedule_work(&priv->work);
+ return;
+ }
+
netif_tx_stop_all_queues(ndev);
/* Stop PTP Clock driver */
*/
netdev_err(ndev, "%s: ravb_dmac_init() failed, error %d\n",
__func__, error);
- return;
+ goto out_unlock;
}
ravb_emac_init(ndev);
ravb_ptp_init(ndev, priv->pdev);
netif_tx_start_all_queues(ndev);
+
+out_unlock:
+ rtnl_unlock();
}
/* Packet transmit function for Ethernet AVB */
ndev->features = info->net_features;
ndev->hw_features = info->net_hw_features;
- reset_control_deassert(rstc);
+ error = reset_control_deassert(rstc);
+ if (error)
+ goto out_free_netdev;
+
pm_runtime_enable(&pdev->dev);
- pm_runtime_get_sync(&pdev->dev);
+ error = pm_runtime_resume_and_get(&pdev->dev);
+ if (error < 0)
+ goto out_rpm_disable;
if (info->multi_irqs) {
if (info->err_mgmt_irqs)
out_disable_refclk:
clk_disable_unprepare(priv->refclk);
out_release:
- free_netdev(ndev);
-
pm_runtime_put(&pdev->dev);
+out_rpm_disable:
pm_runtime_disable(&pdev->dev);
reset_control_assert(rstc);
+out_free_netdev:
+ free_netdev(ndev);
return error;
}
struct ravb_private *priv = netdev_priv(ndev);
const struct ravb_hw_info *info = priv->info;
- /* Stop PTP Clock driver */
- if (info->ccc_gac)
- ravb_ptp_stop(ndev);
-
- clk_disable_unprepare(priv->gptp_clk);
- clk_disable_unprepare(priv->refclk);
-
- /* Set reset mode */
- ravb_write(ndev, CCC_OPC_RESET, CCC);
unregister_netdev(ndev);
if (info->nc_queues)
netif_napi_del(&priv->napi[RAVB_NC]);
netif_napi_del(&priv->napi[RAVB_BE]);
+
ravb_mdio_release(priv);
+
+ /* Stop PTP Clock driver */
+ if (info->ccc_gac)
+ ravb_ptp_stop(ndev);
+
dma_free_coherent(ndev->dev.parent, priv->desc_bat_size, priv->desc_bat,
priv->desc_bat_dma);
+
+ /* Set reset mode */
+ ravb_write(ndev, CCC_OPC_RESET, CCC);
+
+ clk_disable_unprepare(priv->gptp_clk);
+ clk_disable_unprepare(priv->refclk);
+
pm_runtime_put_sync(&pdev->dev);
pm_runtime_disable(&pdev->dev);
reset_control_assert(priv->rstc);
{
struct rswitch_device *rdev = netdev_priv(ndev);
struct rswitch_gwca_queue *gq = rdev->tx_queue;
+ netdev_tx_t ret = NETDEV_TX_OK;
struct rswitch_ext_desc *desc;
- int ret = NETDEV_TX_OK;
dma_addr_t dma_addr;
if (rswitch_get_num_cur_queues(gq) >= gq->ring_size - 1) {
return ret;
dma_addr = dma_map_single(ndev->dev.parent, skb->data, skb->len, DMA_TO_DEVICE);
- if (dma_mapping_error(ndev->dev.parent, dma_addr)) {
- dev_kfree_skb_any(skb);
- return ret;
- }
+ if (dma_mapping_error(ndev->dev.parent, dma_addr))
+ goto err_kfree;
gq->skbs[gq->cur] = skb;
desc = &gq->tx_ring[gq->cur];
struct rswitch_gwca_ts_info *ts_info;
ts_info = kzalloc(sizeof(*ts_info), GFP_ATOMIC);
- if (!ts_info) {
- dma_unmap_single(ndev->dev.parent, dma_addr, skb->len, DMA_TO_DEVICE);
- return -ENOMEM;
- }
+ if (!ts_info)
+ goto err_unmap;
skb_shinfo(skb)->tx_flags |= SKBTX_IN_PROGRESS;
rdev->ts_tag++;
gq->cur = rswitch_next_queue_index(gq, true, 1);
rswitch_modify(rdev->addr, GWTRC(gq->index), 0, BIT(gq->index % 32));
+ return ret;
+
+err_unmap:
+ dma_unmap_single(ndev->dev.parent, dma_addr, skb->len, DMA_TO_DEVICE);
+
+err_kfree:
+ dev_kfree_skb_any(skb);
+
return ret;
}
config DWMAC_LOONGSON
tristate "Loongson PCI DWMAC support"
default MACH_LOONGSON64
- depends on STMMAC_ETH && PCI
+ depends on (MACH_LOONGSON64 || COMPILE_TEST) && STMMAC_ETH && PCI
depends on COMMON_CLK
help
This selects the LOONGSON PCI bus support for the stmmac driver,
return -ENODEV;
}
- if (!of_device_is_compatible(np, "loongson, pci-gmac")) {
- pr_info("dwmac_loongson_pci: Incompatible OF node\n");
- return -ENODEV;
- }
-
plat = devm_kzalloc(&pdev->dev, sizeof(*plat), GFP_KERNEL);
if (!plat)
return -ENOMEM;
+ plat->mdio_bus_data = devm_kzalloc(&pdev->dev,
+ sizeof(*plat->mdio_bus_data),
+ GFP_KERNEL);
+ if (!plat->mdio_bus_data)
+ return -ENOMEM;
+
plat->mdio_node = of_get_child_by_name(np, "mdio");
if (plat->mdio_node) {
dev_info(&pdev->dev, "Found MDIO subnode\n");
-
- plat->mdio_bus_data = devm_kzalloc(&pdev->dev,
- sizeof(*plat->mdio_bus_data),
- GFP_KERNEL);
- if (!plat->mdio_bus_data) {
- ret = -ENOMEM;
- goto err_put_node;
- }
plat->mdio_bus_data->needs_reset = true;
}
#define RGMII_CONFIG_LOOPBACK_EN BIT(2)
#define RGMII_CONFIG_PROG_SWAP BIT(1)
#define RGMII_CONFIG_DDR_MODE BIT(0)
+#define RGMII_CONFIG_SGMII_CLK_DVDR GENMASK(18, 10)
/* SDCC_HC_REG_DLL_CONFIG fields */
#define SDCC_DLL_CONFIG_DLL_RST BIT(30)
#define ETHQOS_MAC_CTRL_SPEED_MODE BIT(14)
#define ETHQOS_MAC_CTRL_PORT_SEL BIT(15)
+#define SGMII_10M_RX_CLK_DVDR 0x31
+
struct ethqos_emac_por {
unsigned int offset;
unsigned int value;
return 0;
}
+/* On interface toggle MAC registers gets reset.
+ * Configure MAC block for SGMII on ethernet phy link up
+ */
static int ethqos_configure_sgmii(struct qcom_ethqos *ethqos)
{
int val;
case SPEED_10:
val |= ETHQOS_MAC_CTRL_PORT_SEL;
val &= ~ETHQOS_MAC_CTRL_SPEED_MODE;
+ rgmii_updatel(ethqos, RGMII_CONFIG_SGMII_CLK_DVDR,
+ FIELD_PREP(RGMII_CONFIG_SGMII_CLK_DVDR,
+ SGMII_10M_RX_CLK_DVDR),
+ RGMII_IO_MACRO_CONFIG);
break;
}
}
}
-void dwmac5_fpe_configure(void __iomem *ioaddr, u32 num_txq, u32 num_rxq,
+void dwmac5_fpe_configure(void __iomem *ioaddr, struct stmmac_fpe_cfg *cfg,
+ u32 num_txq, u32 num_rxq,
bool enable)
{
u32 value;
- if (!enable) {
- value = readl(ioaddr + MAC_FPE_CTRL_STS);
-
- value &= ~EFPE;
-
- writel(value, ioaddr + MAC_FPE_CTRL_STS);
- return;
+ if (enable) {
+ cfg->fpe_csr = EFPE;
+ value = readl(ioaddr + GMAC_RXQ_CTRL1);
+ value &= ~GMAC_RXQCTRL_FPRQ;
+ value |= (num_rxq - 1) << GMAC_RXQCTRL_FPRQ_SHIFT;
+ writel(value, ioaddr + GMAC_RXQ_CTRL1);
+ } else {
+ cfg->fpe_csr = 0;
}
-
- value = readl(ioaddr + GMAC_RXQ_CTRL1);
- value &= ~GMAC_RXQCTRL_FPRQ;
- value |= (num_rxq - 1) << GMAC_RXQCTRL_FPRQ_SHIFT;
- writel(value, ioaddr + GMAC_RXQ_CTRL1);
-
- value = readl(ioaddr + MAC_FPE_CTRL_STS);
- value |= EFPE;
- writel(value, ioaddr + MAC_FPE_CTRL_STS);
+ writel(cfg->fpe_csr, ioaddr + MAC_FPE_CTRL_STS);
}
int dwmac5_fpe_irq_status(void __iomem *ioaddr, struct net_device *dev)
status = FPE_EVENT_UNKNOWN;
+ /* Reads from the MAC_FPE_CTRL_STS register should only be performed
+ * here, since the status flags of MAC_FPE_CTRL_STS are "clear on read"
+ */
value = readl(ioaddr + MAC_FPE_CTRL_STS);
if (value & TRSP) {
return status;
}
-void dwmac5_fpe_send_mpacket(void __iomem *ioaddr, enum stmmac_mpacket_type type)
+void dwmac5_fpe_send_mpacket(void __iomem *ioaddr, struct stmmac_fpe_cfg *cfg,
+ enum stmmac_mpacket_type type)
{
- u32 value;
+ u32 value = cfg->fpe_csr;
- value = readl(ioaddr + MAC_FPE_CTRL_STS);
-
- if (type == MPACKET_VERIFY) {
- value &= ~SRSP;
+ if (type == MPACKET_VERIFY)
value |= SVER;
- } else {
- value &= ~SVER;
+ else if (type == MPACKET_RESPONSE)
value |= SRSP;
- }
writel(value, ioaddr + MAC_FPE_CTRL_STS);
}
unsigned int ptp_rate);
void dwmac5_est_irq_status(void __iomem *ioaddr, struct net_device *dev,
struct stmmac_extra_stats *x, u32 txqcnt);
-void dwmac5_fpe_configure(void __iomem *ioaddr, u32 num_txq, u32 num_rxq,
+void dwmac5_fpe_configure(void __iomem *ioaddr, struct stmmac_fpe_cfg *cfg,
+ u32 num_txq, u32 num_rxq,
bool enable);
void dwmac5_fpe_send_mpacket(void __iomem *ioaddr,
+ struct stmmac_fpe_cfg *cfg,
enum stmmac_mpacket_type type);
int dwmac5_fpe_irq_status(void __iomem *ioaddr, struct net_device *dev);
return 0;
}
-static void dwxgmac3_fpe_configure(void __iomem *ioaddr, u32 num_txq,
+static void dwxgmac3_fpe_configure(void __iomem *ioaddr, struct stmmac_fpe_cfg *cfg,
+ u32 num_txq,
u32 num_rxq, bool enable)
{
u32 value;
unsigned int ptp_rate);
void (*est_irq_status)(void __iomem *ioaddr, struct net_device *dev,
struct stmmac_extra_stats *x, u32 txqcnt);
- void (*fpe_configure)(void __iomem *ioaddr, u32 num_txq, u32 num_rxq,
+ void (*fpe_configure)(void __iomem *ioaddr, struct stmmac_fpe_cfg *cfg,
+ u32 num_txq, u32 num_rxq,
bool enable);
void (*fpe_send_mpacket)(void __iomem *ioaddr,
+ struct stmmac_fpe_cfg *cfg,
enum stmmac_mpacket_type type);
int (*fpe_irq_status)(void __iomem *ioaddr, struct net_device *dev);
};
#define MMC_XGMAC_RX_DISCARD_OCT_GB 0x1b4
#define MMC_XGMAC_RX_ALIGN_ERR_PKT 0x1bc
+#define MMC_XGMAC_TX_FPE_INTR_MASK 0x204
#define MMC_XGMAC_TX_FPE_FRAG 0x208
#define MMC_XGMAC_TX_HOLD_REQ 0x20c
+#define MMC_XGMAC_RX_FPE_INTR_MASK 0x224
#define MMC_XGMAC_RX_PKT_ASSEMBLY_ERR 0x228
#define MMC_XGMAC_RX_PKT_SMD_ERR 0x22c
#define MMC_XGMAC_RX_PKT_ASSEMBLY_OK 0x230
{
writel(0x0, mmcaddr + MMC_RX_INTR_MASK);
writel(0x0, mmcaddr + MMC_TX_INTR_MASK);
+ writel(MMC_DEFAULT_MASK, mmcaddr + MMC_XGMAC_TX_FPE_INTR_MASK);
+ writel(MMC_DEFAULT_MASK, mmcaddr + MMC_XGMAC_RX_FPE_INTR_MASK);
writel(MMC_DEFAULT_MASK, mmcaddr + MMC_XGMAC_RX_IPC_INTR_MASK);
}
*/
ts_status = readl(priv->ioaddr + GMAC_TIMESTAMP_STATUS);
- if (priv->plat->flags & STMMAC_FLAG_EXT_SNAPSHOT_EN)
+ if (!(priv->plat->flags & STMMAC_FLAG_EXT_SNAPSHOT_EN))
return;
num_snapshot = (ts_status & GMAC_TIMESTAMP_ATSNS_MASK) >>
bool *hs_enable = &fpe_cfg->hs_enable;
if (is_up && *hs_enable) {
- stmmac_fpe_send_mpacket(priv, priv->ioaddr, MPACKET_VERIFY);
+ stmmac_fpe_send_mpacket(priv, priv->ioaddr, fpe_cfg,
+ MPACKET_VERIFY);
} else {
*lo_state = FPE_STATE_OFF;
*lp_state = FPE_STATE_OFF;
dma_dir = page_pool_get_dma_dir(rx_q->page_pool);
buf_sz = DIV_ROUND_UP(priv->dma_conf.dma_buf_sz, PAGE_SIZE) * PAGE_SIZE;
+ limit = min(priv->dma_conf.dma_rx_size - 1, (unsigned int)limit);
if (netif_msg_rx_status(priv)) {
void *rx_head;
len = 0;
}
+read_again:
if (count >= limit)
break;
-read_again:
buf1_len = 0;
buf2_len = 0;
entry = next_entry;
/* If user has requested FPE enable, quickly response */
if (*hs_enable)
stmmac_fpe_send_mpacket(priv, priv->ioaddr,
+ fpe_cfg,
MPACKET_RESPONSE);
}
if (*lo_state == FPE_STATE_ENTERING_ON &&
*lp_state == FPE_STATE_ENTERING_ON) {
stmmac_fpe_configure(priv, priv->ioaddr,
+ fpe_cfg,
priv->plat->tx_queues_to_use,
priv->plat->rx_queues_to_use,
*enable);
netdev_info(priv->dev, SEND_VERIFY_MPAKCET_FMT,
*lo_state, *lp_state);
stmmac_fpe_send_mpacket(priv, priv->ioaddr,
+ fpe_cfg,
MPACKET_VERIFY);
}
/* Sleep then retry */
if (priv->plat->fpe_cfg->hs_enable != enable) {
if (enable) {
stmmac_fpe_send_mpacket(priv, priv->ioaddr,
+ priv->plat->fpe_cfg,
MPACKET_VERIFY);
} else {
priv->plat->fpe_cfg->lo_fpe_state = FPE_STATE_OFF;
if (priv->dma_cap.fpesel) {
/* Disable FPE */
stmmac_fpe_configure(priv, priv->ioaddr,
+ priv->plat->fpe_cfg,
priv->plat->tx_queues_to_use,
priv->plat->rx_queues_to_use, false);
new_bus->parent = priv->device;
err = of_mdiobus_register(new_bus, mdio_node);
- if (err != 0) {
+ if (err == -ENODEV) {
+ err = 0;
+ dev_info(dev, "MDIO bus is disabled\n");
+ goto bus_register_fail;
+ } else if (err) {
dev_err_probe(dev, err, "Cannot register the MDIO bus\n");
goto bus_register_fail;
}
priv->plat->fpe_cfg->enable = false;
stmmac_fpe_configure(priv, priv->ioaddr,
+ priv->plat->fpe_cfg,
priv->plat->tx_queues_to_use,
priv->plat->rx_queues_to_use,
false);
&prueth->shram);
if (ret) {
dev_err(dev, "unable to get PRUSS SHRD RAM2: %d\n", ret);
- pruss_put(prueth->pruss);
+ goto put_pruss;
}
prueth->sram_pool = of_gen_pool_get(np, "sram", 0);
prueth->iep1 = icss_iep_get_idx(np, 1);
if (IS_ERR(prueth->iep1)) {
ret = dev_err_probe(dev, PTR_ERR(prueth->iep1), "iep1 get failed\n");
- icss_iep_put(prueth->iep0);
- prueth->iep0 = NULL;
- prueth->iep1 = NULL;
- goto free_pool;
+ goto put_iep0;
}
if (prueth->pdata.quirk_10m_link_issue) {
exit_iep:
if (prueth->pdata.quirk_10m_link_issue)
icss_iep_exit_fw(prueth->iep1);
+ icss_iep_put(prueth->iep1);
+
+put_iep0:
+ icss_iep_put(prueth->iep0);
+ prueth->iep0 = NULL;
+ prueth->iep1 = NULL;
free_pool:
gen_pool_free(prueth->sram_pool,
put_mem:
pruss_release_mem_region(prueth->pruss, &prueth->shram);
+
+put_pruss:
pruss_put(prueth->pruss);
put_cores:
wx->subsystem_device_id = pdev->subsystem_device;
} else {
err = wx_flash_read_dword(wx, 0xfffdc, &ssid);
- if (!err)
- wx->subsystem_device_id = swab16((u16)ssid);
+ if (err < 0) {
+ wx_err(wx, "read of internal subsystem device id failed\n");
+ return err;
+ }
- return err;
+ wx->subsystem_device_id = swab16((u16)ssid);
}
wx->mac_table = kcalloc(wx->mac.num_rar_entries,
return rx_desc->wb.upper.status_error & cpu_to_le32(stat_err_bits);
}
-static bool wx_can_reuse_rx_page(struct wx_rx_buffer *rx_buffer,
- int rx_buffer_pgcnt)
-{
- unsigned int pagecnt_bias = rx_buffer->pagecnt_bias;
- struct page *page = rx_buffer->page;
-
- /* avoid re-using remote and pfmemalloc pages */
- if (!dev_page_is_reusable(page))
- return false;
-
-#if (PAGE_SIZE < 8192)
- /* if we are only owner of page we can reuse it */
- if (unlikely((rx_buffer_pgcnt - pagecnt_bias) > 1))
- return false;
-#endif
-
- /* If we have drained the page fragment pool we need to update
- * the pagecnt_bias and page count so that we fully restock the
- * number of references the driver holds.
- */
- if (unlikely(pagecnt_bias == 1)) {
- page_ref_add(page, USHRT_MAX - 1);
- rx_buffer->pagecnt_bias = USHRT_MAX;
- }
-
- return true;
-}
-
-/**
- * wx_reuse_rx_page - page flip buffer and store it back on the ring
- * @rx_ring: rx descriptor ring to store buffers on
- * @old_buff: donor buffer to have page reused
- *
- * Synchronizes page for reuse by the adapter
- **/
-static void wx_reuse_rx_page(struct wx_ring *rx_ring,
- struct wx_rx_buffer *old_buff)
-{
- u16 nta = rx_ring->next_to_alloc;
- struct wx_rx_buffer *new_buff;
-
- new_buff = &rx_ring->rx_buffer_info[nta];
-
- /* update, and store next to alloc */
- nta++;
- rx_ring->next_to_alloc = (nta < rx_ring->count) ? nta : 0;
-
- /* transfer page from old buffer to new buffer */
- new_buff->page = old_buff->page;
- new_buff->page_dma = old_buff->page_dma;
- new_buff->page_offset = old_buff->page_offset;
- new_buff->pagecnt_bias = old_buff->pagecnt_bias;
-}
-
static void wx_dma_sync_frag(struct wx_ring *rx_ring,
struct wx_rx_buffer *rx_buffer)
{
size,
DMA_FROM_DEVICE);
skip_sync:
- rx_buffer->pagecnt_bias--;
-
return rx_buffer;
}
struct sk_buff *skb,
int rx_buffer_pgcnt)
{
- if (wx_can_reuse_rx_page(rx_buffer, rx_buffer_pgcnt)) {
- /* hand second half of page back to the ring */
- wx_reuse_rx_page(rx_ring, rx_buffer);
- } else {
- if (!IS_ERR(skb) && WX_CB(skb)->dma == rx_buffer->dma)
- /* the page has been released from the ring */
- WX_CB(skb)->page_released = true;
- else
- page_pool_put_full_page(rx_ring->page_pool, rx_buffer->page, false);
-
- __page_frag_cache_drain(rx_buffer->page,
- rx_buffer->pagecnt_bias);
- }
+ if (!IS_ERR(skb) && WX_CB(skb)->dma == rx_buffer->dma)
+ /* the page has been released from the ring */
+ WX_CB(skb)->page_released = true;
/* clear contents of rx_buffer */
rx_buffer->page = NULL;
if (size <= WX_RXBUFFER_256) {
memcpy(__skb_put(skb, size), page_addr,
ALIGN(size, sizeof(long)));
- rx_buffer->pagecnt_bias++;
-
+ page_pool_put_full_page(rx_ring->page_pool, rx_buffer->page, true);
return skb;
}
+ skb_mark_for_recycle(skb);
+
if (!wx_test_staterr(rx_desc, WX_RXD_STAT_EOP))
WX_CB(skb)->dma = rx_buffer->dma;
bi->page_dma = dma;
bi->page = page;
bi->page_offset = 0;
- page_ref_add(page, USHRT_MAX - 1);
- bi->pagecnt_bias = USHRT_MAX;
return true;
}
/* exit if we failed to retrieve a buffer */
if (!skb) {
rx_ring->rx_stats.alloc_rx_buff_failed++;
- rx_buffer->pagecnt_bias++;
break;
}
if (!pdev->msi_enabled && !pdev->msix_enabled)
return;
- pci_free_irq_vectors(wx->pdev);
if (pdev->msix_enabled) {
kfree(wx->msix_entries);
wx->msix_entries = NULL;
}
+ pci_free_irq_vectors(wx->pdev);
}
EXPORT_SYMBOL(wx_reset_interrupt_capability);
/* free resources associated with mapping */
page_pool_put_full_page(rx_ring->page_pool, rx_buffer->page, false);
- __page_frag_cache_drain(rx_buffer->page,
- rx_buffer->pagecnt_bias);
i++;
rx_buffer++;
dma_addr_t page_dma;
struct page *page;
unsigned int page_offset;
- u16 pagecnt_bias;
};
struct wx_queue_stats {
/* PCI config space info */
err = wx_sw_init(wx);
- if (err < 0) {
- wx_err(wx, "read of internal subsystem device id failed\n");
+ if (err < 0)
return err;
- }
/* mac type, phy type , oem type */
ngbe_init_type_code(wx);
/* PCI config space info */
err = wx_sw_init(wx);
- if (err < 0) {
- wx_err(wx, "read of internal subsystem device id failed\n");
+ if (err < 0)
return err;
- }
txgbe_init_type_code(wx);
if (lp->features & XAE_FEATURE_FULL_TX_CSUM) {
/* Tx Full Checksum Offload Enabled */
cur_p->app0 |= 2;
- } else if (lp->features & XAE_FEATURE_PARTIAL_RX_CSUM) {
+ } else if (lp->features & XAE_FEATURE_PARTIAL_TX_CSUM) {
csum_start_off = skb_transport_offset(skb);
csum_index_off = csum_start_off + skb->csum_offset;
/* Tx Partial Checksum Offload Enabled */
tristate "Microsoft Hyper-V virtual network driver"
depends on HYPERV
select UCS2_STRING
+ select NLS
help
Select this option to enable the Hyper-V virtual network driver.
goto upper_link_failed;
}
- /* set slave flag before open to prevent IPv6 addrconf */
- vf_netdev->flags |= IFF_SLAVE;
-
schedule_delayed_work(&ndev_ctx->vf_takeover, VF_TAKEOVER_INT);
call_netdevice_notifiers(NETDEV_JOIN, vf_netdev);
}
- /* Fallback path to check synthetic vf with
- * help of mac addr
+ /* Fallback path to check synthetic vf with help of mac addr.
+ * Because this function can be called before vf_netdev is
+ * initialized (NETDEV_POST_INIT) when its perm_addr has not been copied
+ * from dev_addr, also try to match to its dev_addr.
+ * Note: On Hyper-V and Azure, it's not possible to set a MAC address
+ * on a VF that matches to the MAC of a unrelated NETVSC device.
*/
list_for_each_entry(ndev_ctx, &netvsc_dev_list, list) {
ndev = hv_get_drvdata(ndev_ctx->device_ctx);
- if (ether_addr_equal(vf_netdev->perm_addr, ndev->perm_addr)) {
- netdev_notice(vf_netdev,
- "falling back to mac addr based matching\n");
+ if (ether_addr_equal(vf_netdev->perm_addr, ndev->perm_addr) ||
+ ether_addr_equal(vf_netdev->dev_addr, ndev->perm_addr))
return ndev;
- }
}
netdev_notice(vf_netdev,
return NULL;
}
+static int netvsc_prepare_bonding(struct net_device *vf_netdev)
+{
+ struct net_device *ndev;
+
+ ndev = get_netvsc_byslot(vf_netdev);
+ if (!ndev)
+ return NOTIFY_DONE;
+
+ /* set slave flag before open to prevent IPv6 addrconf */
+ vf_netdev->flags |= IFF_SLAVE;
+ return NOTIFY_DONE;
+}
+
static int netvsc_register_vf(struct net_device *vf_netdev)
{
struct net_device_context *net_device_ctx;
goto devinfo_failed;
}
- nvdev = rndis_filter_device_add(dev, device_info);
- if (IS_ERR(nvdev)) {
- ret = PTR_ERR(nvdev);
- netdev_err(net, "unable to add netvsc device (ret %d)\n", ret);
- goto rndis_failed;
- }
-
- eth_hw_addr_set(net, device_info->mac_adr);
-
/* We must get rtnl lock before scheduling nvdev->subchan_work,
* otherwise netvsc_subchan_work() can get rtnl lock first and wait
* all subchannels to show up, but that may not happen because
* -> ... -> device_add() -> ... -> __device_attach() can't get
* the device lock, so all the subchannels can't be processed --
* finally netvsc_subchan_work() hangs forever.
+ *
+ * The rtnl lock also needs to be held before rndis_filter_device_add()
+ * which advertises nvsp_2_vsc_capability / sriov bit, and triggers
+ * VF NIC offering and registering. If VF NIC finished register_netdev()
+ * earlier it may cause name based config failure.
*/
rtnl_lock();
+ nvdev = rndis_filter_device_add(dev, device_info);
+ if (IS_ERR(nvdev)) {
+ ret = PTR_ERR(nvdev);
+ netdev_err(net, "unable to add netvsc device (ret %d)\n", ret);
+ goto rndis_failed;
+ }
+
+ eth_hw_addr_set(net, device_info->mac_adr);
+
if (nvdev->num_chn > 1)
schedule_work(&nvdev->subchan_work);
return 0;
register_failed:
- rtnl_unlock();
rndis_filter_device_remove(dev, nvdev);
rndis_failed:
+ rtnl_unlock();
netvsc_devinfo_put(device_info);
devinfo_failed:
free_percpu(net_device_ctx->vf_stats);
return NOTIFY_DONE;
switch (event) {
+ case NETDEV_POST_INIT:
+ return netvsc_prepare_bonding(event_dev);
case NETDEV_REGISTER:
return netvsc_register_vf(event_dev);
case NETDEV_UNREGISTER:
}
netvsc_ring_bytes = ring_size * PAGE_SIZE;
+ register_netdevice_notifier(&netvsc_netdev_notifier);
+
ret = vmbus_driver_register(&netvsc_drv);
if (ret)
- return ret;
+ goto err_vmbus_reg;
- register_netdevice_notifier(&netvsc_netdev_notifier);
return 0;
+
+err_vmbus_reg:
+ unregister_netdevice_notifier(&netvsc_netdev_notifier);
+ return ret;
}
MODULE_LICENSE("GPL");
0x0001c000 + 0x12000 * GSI_EE_AP, 0x80);
static const u32 reg_ev_ch_e_cntxt_1_fmask[] = {
- [R_LENGTH] = GENMASK(19, 0),
+ [R_LENGTH] = GENMASK(23, 0),
};
REG_STRIDE_FIELDS(EV_CH_E_CNTXT_1, ev_ch_e_cntxt_1,
return addr;
}
-static int ipvlan_process_v4_outbound(struct sk_buff *skb)
+static noinline_for_stack int ipvlan_process_v4_outbound(struct sk_buff *skb)
{
const struct iphdr *ip4h = ip_hdr(skb);
struct net_device *dev = skb->dev;
}
#if IS_ENABLED(CONFIG_IPV6)
-static int ipvlan_process_v6_outbound(struct sk_buff *skb)
+
+static noinline_for_stack int
+ipvlan_route_v6_outbound(struct net_device *dev, struct sk_buff *skb)
{
const struct ipv6hdr *ip6h = ipv6_hdr(skb);
- struct net_device *dev = skb->dev;
- struct net *net = dev_net(dev);
- struct dst_entry *dst;
- int err, ret = NET_XMIT_DROP;
struct flowi6 fl6 = {
.flowi6_oif = dev->ifindex,
.daddr = ip6h->daddr,
.flowi6_mark = skb->mark,
.flowi6_proto = ip6h->nexthdr,
};
+ struct dst_entry *dst;
+ int err;
- dst = ip6_route_output(net, NULL, &fl6);
- if (dst->error) {
- ret = dst->error;
+ dst = ip6_route_output(dev_net(dev), NULL, &fl6);
+ err = dst->error;
+ if (err) {
dst_release(dst);
- goto err;
+ return err;
}
skb_dst_set(skb, dst);
+ return 0;
+}
+
+static int ipvlan_process_v6_outbound(struct sk_buff *skb)
+{
+ struct net_device *dev = skb->dev;
+ int err, ret = NET_XMIT_DROP;
+
+ err = ipvlan_route_v6_outbound(dev, skb);
+ if (unlikely(err)) {
+ DEV_STATS_INC(dev, tx_errors);
+ kfree_skb(skb);
+ return err;
+ }
memset(IP6CB(skb), 0, sizeof(*IP6CB(skb)));
- err = ip6_local_out(net, skb->sk, skb);
+ err = ip6_local_out(dev_net(dev), skb->sk, skb);
if (unlikely(net_xmit_eval(err)))
DEV_STATS_INC(dev, tx_errors);
else
ret = NET_XMIT_SUCCESS;
- goto out;
-err:
- DEV_STATS_INC(dev, tx_errors);
- kfree_skb(skb);
-out:
return ret;
}
#else
if (dev->flags & IFF_UP) {
if (change & IFF_ALLMULTI)
dev_set_allmulti(lowerdev, dev->flags & IFF_ALLMULTI ? 1 : -1);
- if (change & IFF_PROMISC)
+ if (!macvlan_passthru(vlan->port) && change & IFF_PROMISC)
dev_set_promiscuity(lowerdev,
dev->flags & IFF_PROMISC ? 1 : -1);
{
struct nsim_bpf_bound_prog *state;
- if (!prog || !prog->aux->offload)
+ if (!prog || !bpf_prog_is_offloaded(prog->aux))
return;
state = prog->aux->offload->dev_priv;
if (!bpf->prog)
return 0;
- if (!bpf->prog->aux->offload) {
+ if (!bpf_prog_is_offloaded(bpf->prog->aux)) {
NSIM_EA(bpf->extack, "xdpoffload of non-bound program");
return -EINVAL;
}
#include <linux/filter.h>
#include <linux/netfilter_netdev.h>
#include <linux/bpf_mprog.h>
+#include <linux/indirect_call_wrapper.h>
#include <net/netkit.h>
#include <net/dst.h>
netdev_tx_t ret_dev = NET_XMIT_SUCCESS;
const struct bpf_mprog_entry *entry;
struct net_device *peer;
+ int len = skb->len;
rcu_read_lock();
peer = rcu_dereference(nk->peer);
case NETKIT_PASS:
skb->protocol = eth_type_trans(skb, skb->dev);
skb_postpull_rcsum(skb, eth_hdr(skb), ETH_HLEN);
- __netif_rx(skb);
+ if (likely(__netif_rx(skb) == NET_RX_SUCCESS)) {
+ dev_sw_netstats_tx_add(dev, 1, len);
+ dev_sw_netstats_rx_add(peer, len);
+ } else {
+ goto drop_stats;
+ }
break;
case NETKIT_REDIRECT:
+ dev_sw_netstats_tx_add(dev, 1, len);
skb_do_redirect(skb);
break;
case NETKIT_DROP:
default:
drop:
kfree_skb(skb);
+drop_stats:
dev_core_stats_tx_dropped_inc(dev);
ret_dev = NET_XMIT_DROP;
break;
rcu_read_unlock();
}
-static struct net_device *netkit_peer_dev(struct net_device *dev)
+INDIRECT_CALLABLE_SCOPE struct net_device *netkit_peer_dev(struct net_device *dev)
{
return rcu_dereference(netkit_priv(dev)->peer);
}
+static void netkit_get_stats(struct net_device *dev,
+ struct rtnl_link_stats64 *stats)
+{
+ dev_fetch_sw_netstats(stats, dev->tstats);
+ stats->tx_dropped = DEV_STATS_READ(dev, tx_dropped);
+}
+
static void netkit_uninit(struct net_device *dev);
static const struct net_device_ops netkit_netdev_ops = {
.ndo_set_rx_headroom = netkit_set_headroom,
.ndo_get_iflink = netkit_get_iflink,
.ndo_get_peer_dev = netkit_peer_dev,
+ .ndo_get_stats64 = netkit_get_stats,
.ndo_uninit = netkit_uninit,
.ndo_features_check = passthru_features_check,
};
ether_setup(dev);
dev->max_mtu = ETH_MAX_MTU;
+ dev->pcpu_stat_type = NETDEV_PCPU_STAT_TSTATS;
dev->flags |= IFF_NOARP;
dev->priv_flags &= ~IFF_TX_SKB_SHARING;
return -EACCES;
}
+ if (data[IFLA_NETKIT_PEER_INFO]) {
+ NL_SET_ERR_MSG_ATTR(extack, data[IFLA_NETKIT_PEER_INFO],
+ "netkit peer info cannot be changed after device creation");
+ return -EINVAL;
+ }
+
if (data[IFLA_NETKIT_POLICY]) {
attr = data[IFLA_NETKIT_POLICY];
policy = nla_get_u32(attr);
goto error;
phy_resume(phydev);
- phy_led_triggers_register(phydev);
+ if (!phydev->is_on_sfp_module)
+ phy_led_triggers_register(phydev);
/**
* If the external phy used by current mac interface is managed by
}
phydev->phylink = NULL;
- phy_led_triggers_unregister(phydev);
+ if (!phydev->is_on_sfp_module)
+ phy_led_triggers_unregister(phydev);
if (phydev->mdio.dev.driver)
module_put(phydev->mdio.dev.driver->owner);
case PPPIOCSMRU:
if (get_user(val, (int __user *) argp))
break;
+ if (val > U16_MAX) {
+ err = -EINVAL;
+ break;
+ }
if (val < PPP_MRU)
val = PPP_MRU;
ap->mru = val;
/* strip address/control field if present */
p = skb->data;
- if (p[0] == PPP_ALLSTATIONS && p[1] == PPP_UI) {
+ if (skb->len >= 2 && p[0] == PPP_ALLSTATIONS && p[1] == PPP_UI) {
/* chop off address/control */
if (skb->len < 3)
goto err;
return 0;
inst_rollback:
- for (i--; i >= 0; i--)
+ for (i--; i >= 0; i--) {
__team_option_inst_del_option(team, dst_opts[i]);
+ list_del(&dst_opts[i]->list);
+ }
i = option_count;
alloc_rollback:
u16 pkt_count = 0;
u64 desc_hdr = 0;
u16 vlan_tag = 0;
- u32 skb_len = 0;
+ u32 skb_len;
if (!skb)
goto err;
- if (skb->len == 0)
+ skb_len = skb->len;
+ if (skb_len < sizeof(desc_hdr))
goto err;
- skb_len = skb->len;
/* RX Descriptor Header */
- skb_trim(skb, skb->len - sizeof(desc_hdr));
+ skb_trim(skb, skb_len - sizeof(desc_hdr));
desc_hdr = le64_to_cpup((u64 *)skb_tail_pointer(skb));
/* Check these packets */
u8 in_pm;
u32 wol_supported;
u32 wolopts;
+ u8 disconnecting;
};
struct ax88179_int_data {
{
int ret;
int (*fn)(struct usbnet *, u8, u8, u16, u16, void *, u16);
+ struct ax88179_data *ax179_data = dev->driver_priv;
BUG_ON(!dev);
ret = fn(dev, cmd, USB_DIR_IN | USB_TYPE_VENDOR | USB_RECIP_DEVICE,
value, index, data, size);
- if (unlikely(ret < 0))
+ if (unlikely((ret < 0) && !(ret == -ENODEV && ax179_data->disconnecting)))
netdev_warn(dev->net, "Failed to read reg index 0x%04x: %d\n",
index, ret);
{
int ret;
int (*fn)(struct usbnet *, u8, u8, u16, u16, const void *, u16);
+ struct ax88179_data *ax179_data = dev->driver_priv;
BUG_ON(!dev);
ret = fn(dev, cmd, USB_DIR_OUT | USB_TYPE_VENDOR | USB_RECIP_DEVICE,
value, index, data, size);
- if (unlikely(ret < 0))
+ if (unlikely((ret < 0) && !(ret == -ENODEV && ax179_data->disconnecting)))
netdev_warn(dev->net, "Failed to write reg index 0x%04x: %d\n",
index, ret);
return usbnet_resume(intf);
}
+static void ax88179_disconnect(struct usb_interface *intf)
+{
+ struct usbnet *dev = usb_get_intfdata(intf);
+ struct ax88179_data *ax179_data;
+
+ if (!dev)
+ return;
+
+ ax179_data = dev->driver_priv;
+ ax179_data->disconnecting = 1;
+
+ usbnet_disconnect(intf);
+}
+
static void
ax88179_get_wol(struct net_device *net, struct ethtool_wolinfo *wolinfo)
{
*tmp16 = AX_PHYPWR_RSTCTL_IPRL;
ax88179_write_cmd(dev, AX_ACCESS_MAC, AX_PHYPWR_RSTCTL, 2, 2, tmp16);
- msleep(200);
+ msleep(500);
*tmp = AX_CLK_SELECT_ACS | AX_CLK_SELECT_BCS;
ax88179_write_cmd(dev, AX_ACCESS_MAC, AX_CLK_SELECT, 1, 1, tmp);
- msleep(100);
+ msleep(200);
/* Ethernet PHY Auto Detach*/
ax88179_auto_detach(dev);
.suspend = ax88179_suspend,
.resume = ax88179_resume,
.reset_resume = ax88179_resume,
- .disconnect = usbnet_disconnect,
+ .disconnect = ax88179_disconnect,
.supports_autosuspend = 1,
.disable_hub_initiated_lpm = 1,
};
{QMI_FIXED_INTF(0x19d2, 0x0168, 4)},
{QMI_FIXED_INTF(0x19d2, 0x0176, 3)},
{QMI_FIXED_INTF(0x19d2, 0x0178, 3)},
+ {QMI_FIXED_INTF(0x19d2, 0x0189, 4)}, /* ZTE MF290 */
{QMI_FIXED_INTF(0x19d2, 0x0191, 4)}, /* ZTE EuFi890 */
{QMI_FIXED_INTF(0x19d2, 0x0199, 1)}, /* ZTE MF820S */
{QMI_FIXED_INTF(0x19d2, 0x0200, 1)},
ocp_write_byte(tp, MCU_TYPE_PLA, PLA_CR, CR_RST);
for (i = 0; i < 1000; i++) {
+ if (test_bit(RTL8152_INACCESSIBLE, &tp->flags))
+ break;
if (!(ocp_read_byte(tp, MCU_TYPE_PLA, PLA_CR) & CR_RST))
break;
usleep_range(100, 400);
rxdy_gated_en(tp, true);
for (i = 0; i < 1000; i++) {
+ if (test_bit(RTL8152_INACCESSIBLE, &tp->flags))
+ break;
ocp_data = ocp_read_byte(tp, MCU_TYPE_PLA, PLA_OOB_CTRL);
if ((ocp_data & FIFO_EMPTY) == FIFO_EMPTY)
break;
}
for (i = 0; i < 1000; i++) {
+ if (test_bit(RTL8152_INACCESSIBLE, &tp->flags))
+ break;
if (ocp_read_word(tp, MCU_TYPE_PLA, PLA_TCR0) & TCR0_TX_EMPTY)
break;
usleep_range(1000, 2000);
int i;
for (i = 0; i < 1000; i++) {
+ if (test_bit(RTL8152_INACCESSIBLE, &tp->flags))
+ break;
ocp_data = ocp_read_byte(tp, MCU_TYPE_PLA, PLA_OOB_CTRL);
if (ocp_data & LINK_LIST_READY)
break;
int i;
for (i = 0; i < 100; i++) {
+ if (test_bit(RTL8152_INACCESSIBLE, &tp->flags))
+ break;
if (ocp_read_word(tp, MCU_TYPE_USB, USB_GPHY_CTRL) & GPHY_PATCH_DONE)
break;
usleep_range(1000, 2000);
for (i = 0; i < 104; i++) {
u32 ocp_data = ocp_read_byte(tp, MCU_TYPE_USB, USB_WDT1_CTRL);
+ if (test_bit(RTL8152_INACCESSIBLE, &tp->flags))
+ return -ENODEV;
if (!(ocp_data & WTD1_EN))
break;
usleep_range(1000, 2000);
data &= ~EN_ALDPS;
ocp_reg_write(tp, OCP_POWER_CFG, data);
for (i = 0; i < 20; i++) {
+ if (test_bit(RTL8152_INACCESSIBLE, &tp->flags))
+ return;
usleep_range(1000, 2000);
if (ocp_read_word(tp, MCU_TYPE_PLA, 0xe000) & 0x0100)
break;
struct r8152 *tp = usb_get_intfdata(intf);
struct net_device *netdev;
+ rtnl_lock();
+
if (!tp || !test_bit(PROBED_WITH_NO_ERRORS, &tp->flags))
return 0;
struct sockaddr sa;
if (!tp || !test_bit(PROBED_WITH_NO_ERRORS, &tp->flags))
- return 0;
+ goto exit;
rtl_set_accessible(tp);
/* reset the MAC address in case of policy change */
- if (determine_ethernet_addr(tp, &sa) >= 0) {
- rtnl_lock();
+ if (determine_ethernet_addr(tp, &sa) >= 0)
dev_set_mac_address (tp->netdev, &sa, NULL);
- rtnl_unlock();
- }
netdev = tp->netdev;
if (!netif_running(netdev))
- return 0;
+ goto exit;
set_bit(WORK_ENABLE, &tp->flags);
if (netif_carrier_ok(netdev)) {
if (!list_empty(&tp->rx_done))
napi_schedule(&tp->napi);
+exit:
+ rtnl_unlock();
return 0;
}
{ USB_DEVICE(VENDOR_ID_NVIDIA, 0x09ff) },
{ USB_DEVICE(VENDOR_ID_TPLINK, 0x0601) },
{ USB_DEVICE(VENDOR_ID_DLINK, 0xb301) },
+ { USB_DEVICE(VENDOR_ID_ASUS, 0x1976) },
{}
};
data[tx_idx + j] += *(u64 *)(base + offset);
}
} while (u64_stats_fetch_retry(&rq_stats->syncp, start));
- pp_idx = tx_idx + VETH_TQ_STATS_LEN;
}
+ pp_idx = idx + dev->real_num_tx_queues * VETH_TQ_STATS_LEN;
page_pool_stats:
veth_get_page_pool_stats(dev, &data[pp_idx]);
skb_tx_timestamp(skb);
if (likely(veth_forward_skb(rcv, skb, rq, use_napi) == NET_RX_SUCCESS)) {
if (!use_napi)
- dev_lstats_add(dev, length);
+ dev_sw_netstats_tx_add(dev, 1, length);
else
__veth_xdp_flush(rq);
} else {
return ret;
}
-static u64 veth_stats_tx(struct net_device *dev, u64 *packets, u64 *bytes)
-{
- struct veth_priv *priv = netdev_priv(dev);
-
- dev_lstats_read(dev, packets, bytes);
- return atomic64_read(&priv->dropped);
-}
-
static void veth_stats_rx(struct veth_stats *result, struct net_device *dev)
{
struct veth_priv *priv = netdev_priv(dev);
struct veth_priv *priv = netdev_priv(dev);
struct net_device *peer;
struct veth_stats rx;
- u64 packets, bytes;
- tot->tx_dropped = veth_stats_tx(dev, &packets, &bytes);
- tot->tx_bytes = bytes;
- tot->tx_packets = packets;
+ tot->tx_dropped = atomic64_read(&priv->dropped);
+ dev_fetch_sw_netstats(tot, dev->tstats);
veth_stats_rx(&rx, dev);
tot->tx_dropped += rx.xdp_tx_err;
tot->rx_dropped = rx.rx_drops + rx.peer_tq_xdp_xmit_err;
- tot->rx_bytes = rx.xdp_bytes;
- tot->rx_packets = rx.xdp_packets;
+ tot->rx_bytes += rx.xdp_bytes;
+ tot->rx_packets += rx.xdp_packets;
rcu_read_lock();
peer = rcu_dereference(priv->peer);
if (peer) {
- veth_stats_tx(peer, &packets, &bytes);
- tot->rx_bytes += bytes;
- tot->rx_packets += packets;
+ struct rtnl_link_stats64 tot_peer = {};
+
+ dev_fetch_sw_netstats(&tot_peer, peer->tstats);
+ tot->rx_bytes += tot_peer.tx_bytes;
+ tot->rx_packets += tot_peer.tx_packets;
veth_stats_rx(&rx, peer);
tot->tx_dropped += rx.peer_tq_xdp_xmit_err;
skb_add_rx_frag(nskb, i, page, page_offset, size,
truesize);
- if (skb_copy_bits(skb, off, page_address(page),
+ if (skb_copy_bits(skb, off,
+ page_address(page) + page_offset,
size)) {
consume_skb(nskb);
goto drop;
static int veth_dev_init(struct net_device *dev)
{
- int err;
-
- dev->lstats = netdev_alloc_pcpu_stats(struct pcpu_lstats);
- if (!dev->lstats)
- return -ENOMEM;
-
- err = veth_alloc_queues(dev);
- if (err) {
- free_percpu(dev->lstats);
- return err;
- }
-
- return 0;
+ return veth_alloc_queues(dev);
}
static void veth_dev_free(struct net_device *dev)
{
veth_free_queues(dev);
- free_percpu(dev->lstats);
}
#ifdef CONFIG_NET_POLL_CONTROLLER
NETIF_F_HW_VLAN_STAG_RX);
dev->needs_free_netdev = true;
dev->priv_destructor = veth_dev_free;
+ dev->pcpu_stat_type = NETDEV_PCPU_STAT_TSTATS;
dev->max_mtu = ETH_MAX_MTU;
dev->hw_features = VETH_FEATURES;
int ifindex;
};
-struct pcpu_dstats {
- u64 tx_pkts;
- u64 tx_bytes;
- u64 tx_drps;
- u64 rx_pkts;
- u64 rx_bytes;
- u64 rx_drps;
- struct u64_stats_sync syncp;
-};
-
static void vrf_rx_stats(struct net_device *dev, int len)
{
struct pcpu_dstats *dstats = this_cpu_ptr(dev->dstats);
u64_stats_update_begin(&dstats->syncp);
- dstats->rx_pkts++;
+ dstats->rx_packets++;
dstats->rx_bytes += len;
u64_stats_update_end(&dstats->syncp);
}
do {
start = u64_stats_fetch_begin(&dstats->syncp);
tbytes = dstats->tx_bytes;
- tpkts = dstats->tx_pkts;
- tdrops = dstats->tx_drps;
+ tpkts = dstats->tx_packets;
+ tdrops = dstats->tx_drops;
rbytes = dstats->rx_bytes;
- rpkts = dstats->rx_pkts;
+ rpkts = dstats->rx_packets;
} while (u64_stats_fetch_retry(&dstats->syncp, start));
stats->tx_bytes += tbytes;
stats->tx_packets += tpkts;
if (likely(__netif_rx(skb) == NET_RX_SUCCESS))
vrf_rx_stats(dev, len);
else
- this_cpu_inc(dev->dstats->rx_drps);
+ this_cpu_inc(dev->dstats->rx_drops);
return NETDEV_TX_OK;
}
struct pcpu_dstats *dstats = this_cpu_ptr(dev->dstats);
u64_stats_update_begin(&dstats->syncp);
- dstats->tx_pkts++;
+ dstats->tx_packets++;
dstats->tx_bytes += len;
u64_stats_update_end(&dstats->syncp);
} else {
- this_cpu_inc(dev->dstats->tx_drps);
+ this_cpu_inc(dev->dstats->tx_drops);
}
return ret;
vrf_rtable_release(dev, vrf);
vrf_rt6_release(dev, vrf);
-
- free_percpu(dev->dstats);
- dev->dstats = NULL;
}
static int vrf_dev_init(struct net_device *dev)
{
struct net_vrf *vrf = netdev_priv(dev);
- dev->dstats = netdev_alloc_pcpu_stats(struct pcpu_dstats);
- if (!dev->dstats)
- goto out_nomem;
-
/* create the default dst which points back to us */
if (vrf_rtable_create(dev) != 0)
- goto out_stats;
+ goto out_nomem;
if (vrf_rt6_create(dev) != 0)
goto out_rth;
out_rth:
vrf_rtable_release(dev, vrf);
-out_stats:
- free_percpu(dev->dstats);
- dev->dstats = NULL;
out_nomem:
return -ENOMEM;
}
dev->min_mtu = IPV6_MIN_MTU;
dev->max_mtu = IP6_MAX_MTU;
dev->mtu = dev->max_mtu;
+
+ dev->pcpu_stat_type = NETDEV_PCPU_STAT_DSTATS;
}
static int vrf_validate(struct nlattr *tb[], struct nlattr *data[],
*/
while (skb_queue_len(&peer->staged_packet_queue) > MAX_STAGED_PACKETS) {
dev_kfree_skb(__skb_dequeue(&peer->staged_packet_queue));
- ++dev->stats.tx_dropped;
+ DEV_STATS_INC(dev, tx_dropped);
}
skb_queue_splice_tail(&packets, &peer->staged_packet_queue);
spin_unlock_bh(&peer->staged_packet_queue.lock);
else if (skb->protocol == htons(ETH_P_IPV6))
icmpv6_ndo_send(skb, ICMPV6_DEST_UNREACH, ICMPV6_ADDR_UNREACH, 0);
err:
- ++dev->stats.tx_errors;
+ DEV_STATS_INC(dev, tx_errors);
kfree_skb(skb);
return ret;
}
net_dbg_skb_ratelimited("%s: Packet has unallowed src IP (%pISc) from peer %llu (%pISpfsc)\n",
dev->name, skb, peer->internal_id,
&peer->endpoint.addr);
- ++dev->stats.rx_errors;
- ++dev->stats.rx_frame_errors;
+ DEV_STATS_INC(dev, rx_errors);
+ DEV_STATS_INC(dev, rx_frame_errors);
goto packet_processed;
dishonest_packet_type:
net_dbg_ratelimited("%s: Packet is neither ipv4 nor ipv6 from peer %llu (%pISpfsc)\n",
dev->name, peer->internal_id, &peer->endpoint.addr);
- ++dev->stats.rx_errors;
- ++dev->stats.rx_frame_errors;
+ DEV_STATS_INC(dev, rx_errors);
+ DEV_STATS_INC(dev, rx_frame_errors);
goto packet_processed;
dishonest_packet_size:
net_dbg_ratelimited("%s: Packet has incorrect size from peer %llu (%pISpfsc)\n",
dev->name, peer->internal_id, &peer->endpoint.addr);
- ++dev->stats.rx_errors;
- ++dev->stats.rx_length_errors;
+ DEV_STATS_INC(dev, rx_errors);
+ DEV_STATS_INC(dev, rx_length_errors);
goto packet_processed;
packet_processed:
dev_kfree_skb(skb);
void wg_packet_purge_staged_packets(struct wg_peer *peer)
{
spin_lock_bh(&peer->staged_packet_queue.lock);
- peer->device->dev->stats.tx_dropped += peer->staged_packet_queue.qlen;
+ DEV_STATS_ADD(peer->device->dev, tx_dropped,
+ peer->staged_packet_queue.qlen);
__skb_queue_purge(&peer->staged_packet_queue);
spin_unlock_bh(&peer->staged_packet_queue.lock);
}
config ATH9K_DEBUGFS
bool "Atheros ath9k debugging"
- depends on ATH9K && DEBUG_FS
- select MAC80211_DEBUGFS
+ depends on ATH9K && DEBUG_FS && MAC80211_DEBUGFS
select ATH9K_COMMON_DEBUG
help
Say Y, if you need access to ath9k's statistics for
config ATH9K_STATION_STATISTICS
bool "Detailed station statistics"
depends on ATH9K && ATH9K_DEBUGFS && DEBUG_FS
- select MAC80211_DEBUGFS
default n
help
This option enables detailed statistics for association stations.
rcu_dereference_protected(mvm_sta->link[link_id],
lockdep_is_held(&mvm->mutex));
- if (WARN_ON(!link_conf || !mvm_link_sta))
+ if (WARN_ON(!link_conf || !mvm_link_sta)) {
+ ret = -EINVAL;
goto err;
+ }
ret = iwl_mvm_mld_cfg_sta(mvm, sta, vif, link_sta, link_conf,
mvm_link_sta);
* if it is true then one of the handlers took the page.
*/
- if (reclaim) {
+ if (reclaim && txq) {
u16 sequence = le16_to_cpu(pkt->hdr.sequence);
int index = SEQ_TO_INDEX(sequence);
int cmd_index = iwl_txq_get_cmd_index(txq, index);
struct iwl_rxq *rxq = &trans_pcie->rxq[0];
u32 i, r, j, rb_len = 0;
- spin_lock(&rxq->lock);
+ spin_lock_bh(&rxq->lock);
r = iwl_get_closed_rb_stts(trans, rxq);
*data = iwl_fw_error_next_data(*data);
}
- spin_unlock(&rxq->lock);
+ spin_unlock_bh(&rxq->lock);
return rb_len;
}
static void
mt76_add_fragment(struct mt76_dev *dev, struct mt76_queue *q, void *data,
- int len, bool more, u32 info)
+ int len, bool more, u32 info, bool allow_direct)
{
struct sk_buff *skb = q->rx_head;
struct skb_shared_info *shinfo = skb_shinfo(skb);
skb_add_rx_frag(skb, nr_frags, page, offset, len, q->buf_size);
} else {
- mt76_put_page_pool_buf(data, true);
+ mt76_put_page_pool_buf(data, allow_direct);
}
if (more)
struct sk_buff *skb;
unsigned char *data;
bool check_ddone = false;
+ bool allow_direct = !mt76_queue_is_wed_rx(q);
bool more;
if (IS_ENABLED(CONFIG_NET_MEDIATEK_SOC_WED) &&
}
if (q->rx_head) {
- mt76_add_fragment(dev, q, data, len, more, info);
+ mt76_add_fragment(dev, q, data, len, more, info,
+ allow_direct);
continue;
}
continue;
free_frag:
- mt76_put_page_pool_buf(data, true);
+ mt76_put_page_pool_buf(data, allow_direct);
}
mt76_dma_rx_fill(dev, q, true);
int ret, i, len, offset = 0;
u8 *clc_base = NULL, hw_encap = 0;
+ dev->phy.clc_chan_conf = 0xff;
if (mt7921_disable_clc ||
mt76_is_usb(&dev->mt76))
return 0;
static void
mt7925_init_he_caps(struct mt792x_phy *phy, enum nl80211_band band,
struct ieee80211_sband_iftype_data *data,
- enum nl80211_iftype iftype)
+ enum nl80211_iftype iftype)
{
struct ieee80211_sta_he_cap *he_cap = &data->he_cap;
struct ieee80211_he_cap_elem *he_cap_elem = &he_cap->he_cap_elem;
IEEE80211_HE_PHY_CAP2_UL_MU_FULL_MU_MIMO |
IEEE80211_HE_PHY_CAP2_UL_MU_PARTIAL_MU_MIMO;
- switch (i) {
+ switch (iftype) {
case NL80211_IFTYPE_AP:
he_cap_elem->mac_cap_info[2] |=
IEEE80211_HE_MAC_CAP2_BSR;
struct mutex mtx;
struct sk_buff *send_buff;
struct wait_queue_head wq;
+ bool running;
};
static int virtual_nci_open(struct nci_dev *ndev)
{
+ struct virtual_nci_dev *vdev = nci_get_drvdata(ndev);
+
+ vdev->running = true;
return 0;
}
mutex_lock(&vdev->mtx);
kfree_skb(vdev->send_buff);
vdev->send_buff = NULL;
+ vdev->running = false;
mutex_unlock(&vdev->mtx);
return 0;
struct virtual_nci_dev *vdev = nci_get_drvdata(ndev);
mutex_lock(&vdev->mtx);
- if (vdev->send_buff) {
+ if (vdev->send_buff || !vdev->running) {
mutex_unlock(&vdev->mtx);
kfree_skb(skb);
return -1;
If unsure, say N.
config NVME_HOST_AUTH
- bool "NVM Express over Fabrics In-Band Authentication"
+ bool "NVMe over Fabrics In-Band Authentication in host side"
depends on NVME_CORE
select NVME_AUTH
help
- This provides support for NVMe over Fabrics In-Band Authentication.
+ This provides support for NVMe over Fabrics In-Band Authentication in
+ host side.
If unsure, say N.
__func__, chap->qid);
mutex_lock(&ctrl->dhchap_auth_mutex);
ret = nvme_auth_dhchap_setup_host_response(ctrl, chap);
+ mutex_unlock(&ctrl->dhchap_auth_mutex);
if (ret) {
- mutex_unlock(&ctrl->dhchap_auth_mutex);
chap->error = ret;
goto fail2;
}
- mutex_unlock(&ctrl->dhchap_auth_mutex);
/* DH-HMAC-CHAP Step 3: send reply */
dev_dbg(ctrl->device, "%s: qid %d send reply\n",
}
fail2:
+ if (chap->status == 0)
+ chap->status = NVME_AUTH_DHCHAP_FAILURE_FAILED;
dev_dbg(ctrl->device, "%s: qid %d send failure2, status %x\n",
__func__, chap->qid, chap->status);
tl = nvme_auth_set_dhchap_failure2_data(ctrl, chap);
/*
* Only new queue scan work when admin and IO queues are both alive
*/
- if (ctrl->state == NVME_CTRL_LIVE && ctrl->tagset)
+ if (nvme_ctrl_state(ctrl) == NVME_CTRL_LIVE && ctrl->tagset)
queue_work(nvme_wq, &ctrl->scan_work);
}
*/
int nvme_try_sched_reset(struct nvme_ctrl *ctrl)
{
- if (ctrl->state != NVME_CTRL_RESETTING)
+ if (nvme_ctrl_state(ctrl) != NVME_CTRL_RESETTING)
return -EBUSY;
if (!queue_work(nvme_reset_wq, &ctrl->reset_work))
return -EBUSY;
struct nvme_ctrl *ctrl = container_of(to_delayed_work(work),
struct nvme_ctrl, failfast_work);
- if (ctrl->state != NVME_CTRL_CONNECTING)
+ if (nvme_ctrl_state(ctrl) != NVME_CTRL_CONNECTING)
return;
set_bit(NVME_CTRL_FAILFAST_EXPIRED, &ctrl->flags);
ret = nvme_reset_ctrl(ctrl);
if (!ret) {
flush_work(&ctrl->reset_work);
- if (ctrl->state != NVME_CTRL_LIVE)
+ if (nvme_ctrl_state(ctrl) != NVME_CTRL_LIVE)
ret = -ENETRESET;
}
void nvme_cancel_admin_tagset(struct nvme_ctrl *ctrl)
{
- nvme_stop_keep_alive(ctrl);
if (ctrl->admin_tagset) {
blk_mq_tagset_busy_iter(ctrl->admin_tagset,
nvme_cancel_request, ctrl);
spin_lock_irqsave(&ctrl->lock, flags);
- old_state = ctrl->state;
+ old_state = nvme_ctrl_state(ctrl);
switch (new_state) {
case NVME_CTRL_LIVE:
switch (old_state) {
}
if (changed) {
- ctrl->state = new_state;
+ WRITE_ONCE(ctrl->state, new_state);
wake_up_all(&ctrl->state_wq);
}
if (!changed)
return false;
- if (ctrl->state == NVME_CTRL_LIVE) {
+ if (new_state == NVME_CTRL_LIVE) {
if (old_state == NVME_CTRL_CONNECTING)
nvme_stop_failfast_work(ctrl);
nvme_kick_requeue_lists(ctrl);
- } else if (ctrl->state == NVME_CTRL_CONNECTING &&
+ } else if (new_state == NVME_CTRL_CONNECTING &&
old_state == NVME_CTRL_RESETTING) {
nvme_start_failfast_work(ctrl);
}
*/
static bool nvme_state_terminal(struct nvme_ctrl *ctrl)
{
- switch (ctrl->state) {
+ switch (nvme_ctrl_state(ctrl)) {
case NVME_CTRL_NEW:
case NVME_CTRL_LIVE:
case NVME_CTRL_RESETTING:
wait_event(ctrl->state_wq,
nvme_change_ctrl_state(ctrl, NVME_CTRL_RESETTING) ||
nvme_state_terminal(ctrl));
- return ctrl->state == NVME_CTRL_RESETTING;
+ return nvme_ctrl_state(ctrl) == NVME_CTRL_RESETTING;
}
EXPORT_SYMBOL_GPL(nvme_wait_reset);
blk_status_t nvme_fail_nonready_command(struct nvme_ctrl *ctrl,
struct request *rq)
{
- if (ctrl->state != NVME_CTRL_DELETING_NOIO &&
- ctrl->state != NVME_CTRL_DELETING &&
- ctrl->state != NVME_CTRL_DEAD &&
+ enum nvme_ctrl_state state = nvme_ctrl_state(ctrl);
+
+ if (state != NVME_CTRL_DELETING_NOIO &&
+ state != NVME_CTRL_DELETING &&
+ state != NVME_CTRL_DEAD &&
!test_bit(NVME_CTRL_FAILFAST_EXPIRED, &ctrl->flags) &&
!blk_noretry_request(rq) && !(rq->cmd_flags & REQ_NVME_MPATH))
return BLK_STS_RESOURCE;
* command, which is require to set the queue live in the
* appropinquate states.
*/
- switch (ctrl->state) {
+ switch (nvme_ctrl_state(ctrl)) {
case NVME_CTRL_CONNECTING:
if (blk_rq_is_passthrough(rq) && nvme_is_fabrics(req->cmd) &&
(req->cmd->fabrics.fctype == nvme_fabrics_type_connect ||
static void nvme_queue_keep_alive_work(struct nvme_ctrl *ctrl)
{
- queue_delayed_work(nvme_wq, &ctrl->ka_work,
- nvme_keep_alive_work_period(ctrl));
+ unsigned long now = jiffies;
+ unsigned long delay = nvme_keep_alive_work_period(ctrl);
+ unsigned long ka_next_check_tm = ctrl->ka_last_check_time + delay;
+
+ if (time_after(now, ka_next_check_tm))
+ delay = 0;
+ else
+ delay = ka_next_check_tm - now;
+
+ queue_delayed_work(nvme_wq, &ctrl->ka_work, delay);
}
static enum rq_end_io_ret nvme_keep_alive_end_io(struct request *rq,
if (id->ncap == 0) {
/* namespace not allocated or attached */
info->is_removed = true;
- return -ENODEV;
+ ret = -ENODEV;
+ goto error;
}
info->anagrpid = id->anagrpid;
!memchr_inv(ids->nguid, 0, sizeof(ids->nguid)))
memcpy(ids->nguid, id->nguid, sizeof(ids->nguid));
}
+
+error:
kfree(id);
- return 0;
+ return ret;
}
static int nvme_ns_info_from_id_cs_indep(struct nvme_ctrl *ctrl,
return ret;
}
-static void nvme_configure_metadata(struct nvme_ns *ns, struct nvme_id_ns *id)
+static int nvme_configure_metadata(struct nvme_ns *ns, struct nvme_id_ns *id)
{
struct nvme_ctrl *ctrl = ns->ctrl;
+ int ret;
- if (nvme_init_ms(ns, id))
- return;
+ ret = nvme_init_ms(ns, id);
+ if (ret)
+ return ret;
ns->features &= ~(NVME_NS_METADATA_SUPPORTED | NVME_NS_EXT_LBAS);
if (!ns->ms || !(ctrl->ops->flags & NVME_F_METADATA_SUPPORTED))
- return;
+ return 0;
if (ctrl->ops->flags & NVME_F_FABRICS) {
/*
* remap the separate metadata buffer from the block layer.
*/
if (WARN_ON_ONCE(!(id->flbas & NVME_NS_FLBAS_META_EXT)))
- return;
+ return 0;
ns->features |= NVME_NS_EXT_LBAS;
else
ns->features |= NVME_NS_METADATA_SUPPORTED;
}
+ return 0;
}
static void nvme_set_queue_limits(struct nvme_ctrl *ctrl,
/*
* The block layer can't support LBA sizes larger than the page size
- * yet, so catch this early and don't allow block I/O.
+ * or smaller than a sector size yet, so catch this early and don't
+ * allow block I/O.
*/
- if (ns->lba_shift > PAGE_SHIFT) {
+ if (ns->lba_shift > PAGE_SHIFT || ns->lba_shift < SECTOR_SHIFT) {
capacity = 0;
bs = (1 << 9);
}
if (ret)
return ret;
+ if (id->ncap == 0) {
+ /* namespace not allocated or attached */
+ info->is_removed = true;
+ ret = -ENODEV;
+ goto error;
+ }
+
blk_mq_freeze_queue(ns->disk->queue);
lbaf = nvme_lbaf_index(id->flbas);
ns->lba_shift = id->lbaf[lbaf].ds;
nvme_set_queue_limits(ns->ctrl, ns->queue);
- nvme_configure_metadata(ns, id);
+ ret = nvme_configure_metadata(ns, id);
+ if (ret < 0) {
+ blk_mq_unfreeze_queue(ns->disk->queue);
+ goto out;
+ }
nvme_set_chunk_sectors(ns, id);
nvme_update_disk_info(ns->disk, ns, id);
set_bit(NVME_NS_READY, &ns->flags);
ret = 0;
}
+
+error:
kfree(id);
return ret;
}
if (ctrl->ps_max_latency_us != latency) {
ctrl->ps_max_latency_us = latency;
- if (ctrl->state == NVME_CTRL_LIVE)
+ if (nvme_ctrl_state(ctrl) == NVME_CTRL_LIVE)
nvme_configure_apst(ctrl);
}
}
struct nvme_ctrl *ctrl =
container_of(inode->i_cdev, struct nvme_ctrl, cdev);
- switch (ctrl->state) {
+ switch (nvme_ctrl_state(ctrl)) {
case NVME_CTRL_LIVE:
break;
default:
goto out_unlink_ns;
down_write(&ctrl->namespaces_rwsem);
+ /*
+ * Ensure that no namespaces are added to the ctrl list after the queues
+ * are frozen, thereby avoiding a deadlock between scan and reset.
+ */
+ if (test_bit(NVME_CTRL_FROZEN, &ctrl->flags)) {
+ up_write(&ctrl->namespaces_rwsem);
+ goto out_unlink_ns;
+ }
nvme_ns_add_to_ctrl_list(ns);
up_write(&ctrl->namespaces_rwsem);
nvme_get_ctrl(ctrl);
int ret;
/* No tagset on a live ctrl means IO queues could not created */
- if (ctrl->state != NVME_CTRL_LIVE || !ctrl->tagset)
+ if (nvme_ctrl_state(ctrl) != NVME_CTRL_LIVE || !ctrl->tagset)
return;
/*
* removing the namespaces' disks; fail all the queues now to avoid
* potentially having to clean up the failed sync later.
*/
- if (ctrl->state == NVME_CTRL_DEAD)
+ if (nvme_ctrl_state(ctrl) == NVME_CTRL_DEAD)
nvme_mark_namespaces_dead(ctrl);
/* this is a no-op when called from the controller reset handler */
* flushing ctrl async_event_work after changing the controller state
* from LIVE and before freeing the admin queue.
*/
- if (ctrl->state == NVME_CTRL_LIVE)
+ if (nvme_ctrl_state(ctrl) == NVME_CTRL_LIVE)
ctrl->ops->submit_async_event(ctrl);
}
struct nvme_ctrl, fw_act_work);
unsigned long fw_act_timeout;
+ nvme_auth_stop(ctrl);
+
if (ctrl->mtfa)
fw_act_timeout = jiffies +
msecs_to_jiffies(ctrl->mtfa * 100);
* firmware activation.
*/
if (nvme_change_ctrl_state(ctrl, NVME_CTRL_RESETTING)) {
- nvme_auth_stop(ctrl);
requeue = false;
queue_work(nvme_wq, &ctrl->fw_act_work);
}
{
nvme_mpath_stop(ctrl);
nvme_auth_stop(ctrl);
+ nvme_stop_keep_alive(ctrl);
nvme_stop_failfast_work(ctrl);
flush_work(&ctrl->async_event_work);
cancel_work_sync(&ctrl->fw_act_work);
{
int ret;
- ctrl->state = NVME_CTRL_NEW;
+ WRITE_ONCE(ctrl->state, NVME_CTRL_NEW);
clear_bit(NVME_CTRL_FAILFAST_EXPIRED, &ctrl->flags);
spin_lock_init(&ctrl->lock);
mutex_init(&ctrl->scan_lock);
INIT_DELAYED_WORK(&ctrl->failfast_work, nvme_failfast_work);
memset(&ctrl->ka_cmd, 0, sizeof(ctrl->ka_cmd));
ctrl->ka_cmd.common.opcode = nvme_admin_keep_alive;
+ ctrl->ka_last_check_time = jiffies;
BUILD_BUG_ON(NVME_DSM_MAX_RANGES * sizeof(struct nvme_dsm_range) >
PAGE_SIZE);
list_for_each_entry(ns, &ctrl->namespaces, list)
blk_mq_unfreeze_queue(ns->queue);
up_read(&ctrl->namespaces_rwsem);
+ clear_bit(NVME_CTRL_FROZEN, &ctrl->flags);
}
EXPORT_SYMBOL_GPL(nvme_unfreeze);
{
struct nvme_ns *ns;
+ set_bit(NVME_CTRL_FROZEN, &ctrl->flags);
down_read(&ctrl->namespaces_rwsem);
list_for_each_entry(ns, &ctrl->namespaces, list)
blk_freeze_queue_start(ns->queue);
#endif
{ NVMF_OPT_FAIL_FAST_TMO, "fast_io_fail_tmo=%d" },
{ NVMF_OPT_DISCOVERY, "discovery" },
+#ifdef CONFIG_NVME_HOST_AUTH
{ NVMF_OPT_DHCHAP_SECRET, "dhchap_secret=%s" },
{ NVMF_OPT_DHCHAP_CTRL_SECRET, "dhchap_ctrl_secret=%s" },
+#endif
#ifdef CONFIG_NVME_TCP_TLS
{ NVMF_OPT_TLS, "tls" },
#endif
static void
nvme_fc_resume_controller(struct nvme_fc_ctrl *ctrl)
{
- switch (ctrl->ctrl.state) {
+ switch (nvme_ctrl_state(&ctrl->ctrl)) {
case NVME_CTRL_NEW:
case NVME_CTRL_CONNECTING:
/*
"NVME-FC{%d}: controller connectivity lost. Awaiting "
"Reconnect", ctrl->cnum);
- switch (ctrl->ctrl.state) {
+ switch (nvme_ctrl_state(&ctrl->ctrl)) {
case NVME_CTRL_NEW:
case NVME_CTRL_LIVE:
/*
* clean up the admin queue. Same thing as above.
*/
nvme_quiesce_admin_queue(&ctrl->ctrl);
-
- /*
- * Open-coding nvme_cancel_admin_tagset() as fc
- * is not using nvme_cancel_request().
- */
- nvme_stop_keep_alive(&ctrl->ctrl);
blk_sync_queue(ctrl->ctrl.admin_q);
blk_mq_tagset_busy_iter(&ctrl->admin_tag_set,
nvme_fc_terminate_exchange, &ctrl->ctrl);
* the controller. Abort any ios on the association and let the
* create_association error path resolve things.
*/
- enum nvme_ctrl_state state;
- unsigned long flags;
-
- spin_lock_irqsave(&ctrl->lock, flags);
- state = ctrl->ctrl.state;
- if (state == NVME_CTRL_CONNECTING) {
- set_bit(ASSOC_FAILED, &ctrl->flags);
- spin_unlock_irqrestore(&ctrl->lock, flags);
+ if (ctrl->ctrl.state == NVME_CTRL_CONNECTING) {
__nvme_fc_abort_outstanding_ios(ctrl, true);
+ set_bit(ASSOC_FAILED, &ctrl->flags);
dev_warn(ctrl->ctrl.device,
"NVME-FC{%d}: transport error during (re)connect\n",
ctrl->cnum);
return;
}
- spin_unlock_irqrestore(&ctrl->lock, flags);
/* Otherwise, only proceed if in LIVE state - e.g. on first error */
- if (state != NVME_CTRL_LIVE)
+ if (ctrl->ctrl.state != NVME_CTRL_LIVE)
return;
dev_warn(ctrl->ctrl.device,
nvme_unquiesce_admin_queue(&ctrl->ctrl);
ret = nvme_init_ctrl_finish(&ctrl->ctrl, false);
- if (!ret && test_bit(ASSOC_FAILED, &ctrl->flags))
- ret = -EIO;
if (ret)
goto out_disconnect_admin_queue;
-
+ if (test_bit(ASSOC_FAILED, &ctrl->flags)) {
+ ret = -EIO;
+ goto out_stop_keep_alive;
+ }
/* sanity checks */
/* FC-NVME does not have other data in the capsule */
dev_err(ctrl->ctrl.device, "icdoff %d is not supported!\n",
ctrl->ctrl.icdoff);
ret = NVME_SC_INVALID_FIELD | NVME_SC_DNR;
- goto out_disconnect_admin_queue;
+ goto out_stop_keep_alive;
}
/* FC-NVME supports normal SGL Data Block Descriptors */
dev_err(ctrl->ctrl.device,
"Mandatory sgls are not supported!\n");
ret = NVME_SC_INVALID_FIELD | NVME_SC_DNR;
- goto out_disconnect_admin_queue;
+ goto out_stop_keep_alive;
}
if (opts->queue_size > ctrl->ctrl.maxcmd) {
else
ret = nvme_fc_recreate_io_queues(ctrl);
}
-
- spin_lock_irqsave(&ctrl->lock, flags);
if (!ret && test_bit(ASSOC_FAILED, &ctrl->flags))
ret = -EIO;
- if (ret) {
- spin_unlock_irqrestore(&ctrl->lock, flags);
+ if (ret)
goto out_term_aen_ops;
- }
+
changed = nvme_change_ctrl_state(&ctrl->ctrl, NVME_CTRL_LIVE);
- spin_unlock_irqrestore(&ctrl->lock, flags);
ctrl->ctrl.nr_reconnects = 0;
out_term_aen_ops:
nvme_fc_term_aen_ops(ctrl);
+out_stop_keep_alive:
+ nvme_stop_keep_alive(&ctrl->ctrl);
out_disconnect_admin_queue:
dev_warn(ctrl->ctrl.device,
"NVME-FC{%d}: create_assoc failed, assoc_id %llx ret %d\n",
unsigned long recon_delay = ctrl->ctrl.opts->reconnect_delay * HZ;
bool recon = true;
- if (ctrl->ctrl.state != NVME_CTRL_CONNECTING)
+ if (nvme_ctrl_state(&ctrl->ctrl) != NVME_CTRL_CONNECTING)
return;
if (portptr->port_state == FC_OBJSTATE_ONLINE) {
{
u32 effects;
- if (capable(CAP_SYS_ADMIN))
- return true;
-
/*
* Do not allow unprivileged passthrough on partitions, as that allows an
* escape from the containment of the partition.
*/
if (flags & NVME_IOCTL_PARTITION)
- return false;
+ goto admin;
/*
* Do not allow unprivileged processes to send vendor specific or fabrics
*/
if (c->common.opcode >= nvme_cmd_vendor_start ||
c->common.opcode == nvme_fabrics_command)
- return false;
+ goto admin;
/*
* Do not allow unprivileged passthrough of admin commands except
return true;
}
}
- return false;
+ goto admin;
}
/*
*/
effects = nvme_command_effects(ns->ctrl, ns, c->common.opcode);
if (!(effects & NVME_CMD_EFFECTS_CSUPP))
- return false;
+ goto admin;
/*
* Don't allow passthrough for command that have intrusive (or unknown)
if (effects & ~(NVME_CMD_EFFECTS_CSUPP | NVME_CMD_EFFECTS_LBCC |
NVME_CMD_EFFECTS_UUID_SEL |
NVME_CMD_EFFECTS_SCOPE_MASK))
- return false;
+ goto admin;
/*
* Only allow I/O commands that transfer data to the controller or that
* change the logical block contents if the file descriptor is open for
* writing.
*/
- if (nvme_is_write(c) || (effects & NVME_CMD_EFFECTS_LBCC))
- return open_for_write;
+ if ((nvme_is_write(c) || (effects & NVME_CMD_EFFECTS_LBCC)) &&
+ !open_for_write)
+ goto admin;
+
return true;
+admin:
+ return capable(CAP_SYS_ADMIN);
}
/*
* No temperature thresholds for channels other than 0 (Composite).
*/
NVME_QUIRK_NO_SECONDARY_TEMP_THRESH = (1 << 19),
+
+ /*
+ * Disables simple suspend/resume path.
+ */
+ NVME_QUIRK_FORCE_NO_SIMPLE_SUSPEND = (1 << 20),
};
/*
NVME_CTRL_STOPPED = 3,
NVME_CTRL_SKIP_ID_CNS_CS = 4,
NVME_CTRL_DIRTY_CAPABILITY = 5,
+ NVME_CTRL_FROZEN = 6,
};
struct nvme_ctrl {
enum nvme_dctype dctype;
};
+static inline enum nvme_ctrl_state nvme_ctrl_state(struct nvme_ctrl *ctrl)
+{
+ return READ_ONCE(ctrl->state);
+}
+
enum nvme_iopolicy {
NVME_IOPOLICY_NUMA,
NVME_IOPOLICY_RR,
bool nssro = dev->subsystem && (csts & NVME_CSTS_NSSRO);
/* If there is a reset/reinit ongoing, we shouldn't reset again. */
- switch (dev->ctrl.state) {
+ switch (nvme_ctrl_state(&dev->ctrl)) {
case NVME_CTRL_RESETTING:
case NVME_CTRL_CONNECTING:
return false;
* cancellation error. All outstanding requests are completed on
* shutdown, so we return BLK_EH_DONE.
*/
- switch (dev->ctrl.state) {
+ switch (nvme_ctrl_state(&dev->ctrl)) {
case NVME_CTRL_CONNECTING:
nvme_change_ctrl_state(&dev->ctrl, NVME_CTRL_DELETING);
fallthrough;
/*
* Controller is in wrong state, fail early.
*/
- if (dev->ctrl.state != NVME_CTRL_CONNECTING) {
+ if (nvme_ctrl_state(&dev->ctrl) != NVME_CTRL_CONNECTING) {
mutex_unlock(&dev->shutdown_lock);
return -ENODEV;
}
static void nvme_dev_disable(struct nvme_dev *dev, bool shutdown)
{
+ enum nvme_ctrl_state state = nvme_ctrl_state(&dev->ctrl);
struct pci_dev *pdev = to_pci_dev(dev->dev);
bool dead;
mutex_lock(&dev->shutdown_lock);
dead = nvme_pci_ctrl_is_dead(dev);
- if (dev->ctrl.state == NVME_CTRL_LIVE ||
- dev->ctrl.state == NVME_CTRL_RESETTING) {
+ if (state == NVME_CTRL_LIVE || state == NVME_CTRL_RESETTING) {
if (pci_is_enabled(pdev))
nvme_start_freeze(&dev->ctrl);
/*
bool was_suspend = !!(dev->ctrl.ctrl_config & NVME_CC_SHN_NORMAL);
int result;
- if (dev->ctrl.state != NVME_CTRL_RESETTING) {
+ if (nvme_ctrl_state(&dev->ctrl) != NVME_CTRL_RESETTING) {
dev_warn(dev->ctrl.device, "ctrl state %d is not RESETTING\n",
dev->ctrl.state);
result = -ENODEV;
if ((dmi_match(DMI_BOARD_VENDOR, "LENOVO")) &&
dmi_match(DMI_BOARD_NAME, "LNVNB161216"))
return NVME_QUIRK_SIMPLE_SUSPEND;
+ } else if (pdev->vendor == 0x2646 && (pdev->device == 0x2263 ||
+ pdev->device == 0x500f)) {
+ /*
+ * Exclude some Kingston NV1 and A2000 devices from
+ * NVME_QUIRK_SIMPLE_SUSPEND. Do a full suspend to save a
+ * lot fo energy with s2idle sleep on some TUXEDO platforms.
+ */
+ if (dmi_match(DMI_BOARD_NAME, "NS5X_NS7XAU") ||
+ dmi_match(DMI_BOARD_NAME, "NS5x_7xAU") ||
+ dmi_match(DMI_BOARD_NAME, "NS5x_7xPU") ||
+ dmi_match(DMI_BOARD_NAME, "PH4PRX1_PH6PRX1"))
+ return NVME_QUIRK_FORCE_NO_SIMPLE_SUSPEND;
}
return 0;
dev->dev = get_device(&pdev->dev);
quirks |= check_vendor_combination_bug(pdev);
- if (!noacpi && acpi_storage_d3(&pdev->dev)) {
+ if (!noacpi &&
+ !(quirks & NVME_QUIRK_FORCE_NO_SIMPLE_SUSPEND) &&
+ acpi_storage_d3(&pdev->dev)) {
/*
* Some systems use a bios work around to ask for D3 on
* platforms that support kernel managed suspend.
nvme_wait_freeze(ctrl);
nvme_sync_queues(ctrl);
- if (ctrl->state != NVME_CTRL_LIVE)
+ if (nvme_ctrl_state(ctrl) != NVME_CTRL_LIVE)
goto unfreeze;
/*
static void nvme_rdma_reconnect_or_remove(struct nvme_rdma_ctrl *ctrl)
{
+ enum nvme_ctrl_state state = nvme_ctrl_state(&ctrl->ctrl);
+
/* If we are resetting/deleting then do nothing */
- if (ctrl->ctrl.state != NVME_CTRL_CONNECTING) {
- WARN_ON_ONCE(ctrl->ctrl.state == NVME_CTRL_NEW ||
- ctrl->ctrl.state == NVME_CTRL_LIVE);
+ if (state != NVME_CTRL_CONNECTING) {
+ WARN_ON_ONCE(state == NVME_CTRL_NEW || state == NVME_CTRL_LIVE);
return;
}
* unless we're during creation of a new controller to
* avoid races with teardown flow.
*/
- WARN_ON_ONCE(ctrl->ctrl.state != NVME_CTRL_DELETING &&
- ctrl->ctrl.state != NVME_CTRL_DELETING_NOIO);
+ enum nvme_ctrl_state state = nvme_ctrl_state(&ctrl->ctrl);
+
+ WARN_ON_ONCE(state != NVME_CTRL_DELETING &&
+ state != NVME_CTRL_DELETING_NOIO);
WARN_ON_ONCE(new);
ret = -EINVAL;
goto destroy_io;
nvme_rdma_free_io_queues(ctrl);
}
destroy_admin:
+ nvme_stop_keep_alive(&ctrl->ctrl);
nvme_quiesce_admin_queue(&ctrl->ctrl);
blk_sync_queue(ctrl->ctrl.admin_q);
nvme_rdma_stop_queue(&ctrl->queues[0]);
if (!nvme_change_ctrl_state(&ctrl->ctrl, NVME_CTRL_CONNECTING)) {
/* state change failure is ok if we started ctrl delete */
- WARN_ON_ONCE(ctrl->ctrl.state != NVME_CTRL_DELETING &&
- ctrl->ctrl.state != NVME_CTRL_DELETING_NOIO);
+ enum nvme_ctrl_state state = nvme_ctrl_state(&ctrl->ctrl);
+
+ WARN_ON_ONCE(state != NVME_CTRL_DELETING &&
+ state != NVME_CTRL_DELETING_NOIO);
return;
}
struct nvme_rdma_queue *queue = wc->qp->qp_context;
struct nvme_rdma_ctrl *ctrl = queue->ctrl;
- if (ctrl->ctrl.state == NVME_CTRL_LIVE)
+ if (nvme_ctrl_state(&ctrl->ctrl) == NVME_CTRL_LIVE)
dev_info(ctrl->ctrl.device,
"%s for CQE 0x%p failed with status %s (%d)\n",
op, wc->wr_cqe,
dev_warn(ctrl->ctrl.device, "I/O %d QID %d timeout\n",
rq->tag, nvme_rdma_queue_idx(queue));
- if (ctrl->ctrl.state != NVME_CTRL_LIVE) {
+ if (nvme_ctrl_state(&ctrl->ctrl) != NVME_CTRL_LIVE) {
/*
* If we are resetting, connecting or deleting we should
* complete immediately because we may block controller
module_param(so_priority, int, 0644);
MODULE_PARM_DESC(so_priority, "nvme tcp socket optimize priority");
-#ifdef CONFIG_NVME_TCP_TLS
/*
* TLS handshake timeout
*/
static int tls_handshake_timeout = 10;
+#ifdef CONFIG_NVME_TCP_TLS
module_param(tls_handshake_timeout, int, 0644);
MODULE_PARM_DESC(tls_handshake_timeout,
"nvme TLS handshake timeout in seconds (default 10)");
struct ahash_request *snd_hash;
__le32 exp_ddgst;
__le32 recv_ddgst;
-#ifdef CONFIG_NVME_TCP_TLS
struct completion tls_complete;
int tls_err;
-#endif
struct page_frag_cache pf_cache;
void (*state_change)(struct sock *);
return queue - queue->ctrl->queues;
}
+static inline bool nvme_tcp_tls(struct nvme_ctrl *ctrl)
+{
+ if (!IS_ENABLED(CONFIG_NVME_TCP_TLS))
+ return 0;
+
+ return ctrl->opts->tls;
+}
+
static inline struct blk_mq_tags *nvme_tcp_tagset(struct nvme_tcp_queue *queue)
{
u32 queue_idx = nvme_tcp_queue_id(queue);
memset(&msg, 0, sizeof(msg));
iov.iov_base = icresp;
iov.iov_len = sizeof(*icresp);
- if (queue->ctrl->ctrl.opts->tls) {
+ if (nvme_tcp_tls(&queue->ctrl->ctrl)) {
msg.msg_control = cbuf;
msg.msg_controllen = sizeof(cbuf);
}
goto free_icresp;
}
ret = -ENOTCONN;
- if (queue->ctrl->ctrl.opts->tls) {
+ if (nvme_tcp_tls(&queue->ctrl->ctrl)) {
ctype = tls_get_record_type(queue->sock->sk,
(struct cmsghdr *)cbuf);
if (ctype != TLS_RECORD_TYPE_DATA) {
queue->io_cpu = cpumask_next_wrap(n - 1, cpu_online_mask, -1, false);
}
-#ifdef CONFIG_NVME_TCP_TLS
static void nvme_tcp_tls_done(void *data, int status, key_serial_t pskid)
{
struct nvme_tcp_queue *queue = data;
}
return ret;
}
-#else
-static int nvme_tcp_start_tls(struct nvme_ctrl *nctrl,
- struct nvme_tcp_queue *queue,
- key_serial_t pskid)
-{
- return -EPROTONOSUPPORT;
-}
-#endif
static int nvme_tcp_alloc_queue(struct nvme_ctrl *nctrl, int qid,
key_serial_t pskid)
}
/* If PSKs are configured try to start TLS */
- if (pskid) {
+ if (IS_ENABLED(CONFIG_NVME_TCP_TLS) && pskid) {
ret = nvme_tcp_start_tls(nctrl, queue, pskid);
if (ret)
goto err_init_connect;
int ret;
key_serial_t pskid = 0;
- if (ctrl->opts->tls) {
+ if (nvme_tcp_tls(ctrl)) {
if (ctrl->opts->tls_key)
pskid = key_serial(ctrl->opts->tls_key);
else
{
int i, ret;
- if (ctrl->opts->tls && !ctrl->tls_key) {
+ if (nvme_tcp_tls(ctrl) && !ctrl->tls_key) {
dev_err(ctrl->device, "no PSK negotiated\n");
return -ENOKEY;
}
static void nvme_tcp_reconnect_or_remove(struct nvme_ctrl *ctrl)
{
+ enum nvme_ctrl_state state = nvme_ctrl_state(ctrl);
+
/* If we are resetting/deleting then do nothing */
- if (ctrl->state != NVME_CTRL_CONNECTING) {
- WARN_ON_ONCE(ctrl->state == NVME_CTRL_NEW ||
- ctrl->state == NVME_CTRL_LIVE);
+ if (state != NVME_CTRL_CONNECTING) {
+ WARN_ON_ONCE(state == NVME_CTRL_NEW || state == NVME_CTRL_LIVE);
return;
}
* unless we're during creation of a new controller to
* avoid races with teardown flow.
*/
- WARN_ON_ONCE(ctrl->state != NVME_CTRL_DELETING &&
- ctrl->state != NVME_CTRL_DELETING_NOIO);
+ enum nvme_ctrl_state state = nvme_ctrl_state(ctrl);
+
+ WARN_ON_ONCE(state != NVME_CTRL_DELETING &&
+ state != NVME_CTRL_DELETING_NOIO);
WARN_ON_ONCE(new);
ret = -EINVAL;
goto destroy_io;
nvme_tcp_destroy_io_queues(ctrl, new);
}
destroy_admin:
+ nvme_stop_keep_alive(ctrl);
nvme_tcp_teardown_admin_queue(ctrl, false);
return ret;
}
if (!nvme_change_ctrl_state(ctrl, NVME_CTRL_CONNECTING)) {
/* state change failure is ok if we started ctrl delete */
- WARN_ON_ONCE(ctrl->state != NVME_CTRL_DELETING &&
- ctrl->state != NVME_CTRL_DELETING_NOIO);
+ enum nvme_ctrl_state state = nvme_ctrl_state(ctrl);
+
+ WARN_ON_ONCE(state != NVME_CTRL_DELETING &&
+ state != NVME_CTRL_DELETING_NOIO);
return;
}
if (!nvme_change_ctrl_state(ctrl, NVME_CTRL_CONNECTING)) {
/* state change failure is ok if we started ctrl delete */
- WARN_ON_ONCE(ctrl->state != NVME_CTRL_DELETING &&
- ctrl->state != NVME_CTRL_DELETING_NOIO);
+ enum nvme_ctrl_state state = nvme_ctrl_state(ctrl);
+
+ WARN_ON_ONCE(state != NVME_CTRL_DELETING &&
+ state != NVME_CTRL_DELETING_NOIO);
return;
}
nvme_tcp_queue_id(req->queue), nvme_cid(rq), pdu->hdr.type,
opc, nvme_opcode_str(qid, opc, fctype));
- if (ctrl->state != NVME_CTRL_LIVE) {
+ if (nvme_ctrl_state(ctrl) != NVME_CTRL_LIVE) {
/*
* If we are resetting, connecting or deleting we should
* complete immediately because we may block controller
tristate "NVMe Target support"
depends on BLOCK
depends on CONFIGFS_FS
+ select NVME_KEYRING if NVME_TARGET_TCP_TLS
+ select KEYS if NVME_TARGET_TCP_TLS
select BLK_DEV_INTEGRITY_T10 if BLK_DEV_INTEGRITY
select SGL_ALLOC
help
config NVME_TARGET_TCP_TLS
bool "NVMe over Fabrics TCP target TLS encryption support"
depends on NVME_TARGET_TCP
- select NVME_KEYRING
select NET_HANDSHAKE
- select KEYS
help
Enables TLS encryption for the NVMe TCP target using the netlink handshake API.
If unsure, say N.
config NVME_TARGET_AUTH
- bool "NVMe over Fabrics In-band Authentication support"
+ bool "NVMe over Fabrics In-band Authentication in target side"
depends on NVME_TARGET
select NVME_AUTH
help
- This enables support for NVMe over Fabrics In-band Authentication
+ This enables support for NVMe over Fabrics In-band Authentication in
+ target side.
If unsure, say N.
#include <linux/nvme-keyring.h>
#include <crypto/hash.h>
#include <crypto/kpp.h>
+#include <linux/nospec.h>
#include "nvmet.h"
down_write(&nvmet_ana_sem);
oldgrpid = ns->anagrpid;
+ newgrpid = array_index_nospec(newgrpid, NVMET_MAX_ANAGRPS);
nvmet_ana_group_enabled[newgrpid]++;
ns->anagrpid = newgrpid;
nvmet_ana_group_enabled[oldgrpid]--;
grp->grpid = grpid;
down_write(&nvmet_ana_sem);
+ grpid = array_index_nospec(grpid, NVMET_MAX_ANAGRPS);
nvmet_ana_group_enabled[grpid]++;
up_write(&nvmet_ana_sem);
return ERR_PTR(-ENOMEM);
}
- if (nvme_keyring_id()) {
+ if (IS_ENABLED(CONFIG_NVME_TARGET_TCP_TLS) && nvme_keyring_id()) {
port->keyring = key_lookup(nvme_keyring_id());
if (IS_ERR(port->keyring)) {
pr_warn("NVMe keyring not available, disabling TLS\n");
goto out;
}
+ d->subsysnqn[NVMF_NQN_FIELD_LEN - 1] = '\0';
+ d->hostnqn[NVMF_NQN_FIELD_LEN - 1] = '\0';
status = nvmet_alloc_ctrl(d->subsysnqn, d->hostnqn, req,
le32_to_cpu(c->kato), &ctrl);
if (status)
goto out;
}
+ d->subsysnqn[NVMF_NQN_FIELD_LEN - 1] = '\0';
+ d->hostnqn[NVMF_NQN_FIELD_LEN - 1] = '\0';
ctrl = nvmet_ctrl_find_get(d->subsysnqn, d->hostnqn,
le16_to_cpu(d->cntlid), req);
if (!ctrl) {
}
return ret;
}
+#else
+static void nvmet_tcp_tls_handshake_timeout(struct work_struct *w) {}
#endif
static void nvmet_tcp_alloc_queue(struct nvmet_tcp_port *port,
list_add_tail(&queue->queue_list, &nvmet_tcp_queue_list);
mutex_unlock(&nvmet_tcp_queue_mutex);
-#ifdef CONFIG_NVME_TARGET_TCP_TLS
INIT_DELAYED_WORK(&queue->tls_handshake_tmo_work,
nvmet_tcp_tls_handshake_timeout);
+#ifdef CONFIG_NVME_TARGET_TCP_TLS
if (queue->state == NVMET_TCP_Q_TLS_HANDSHAKE) {
struct sock *sk = queue->sock->sk;
#define NVRAM_MAGIC "FLSH"
+/**
+ * struct brcm_nvram - driver state internal struct
+ *
+ * @dev: NVMEM device pointer
+ * @nvmem_size: Size of the whole space available for NVRAM
+ * @data: NVRAM data copy stored to avoid poking underlaying flash controller
+ * @data_len: NVRAM data size
+ * @padding_byte: Padding value used to fill remaining space
+ * @cells: Array of discovered NVMEM cells
+ * @ncells: Number of elements in cells
+ */
struct brcm_nvram {
struct device *dev;
- void __iomem *base;
+ size_t nvmem_size;
+ uint8_t *data;
+ size_t data_len;
+ uint8_t padding_byte;
struct nvmem_cell_info *cells;
int ncells;
};
size_t bytes)
{
struct brcm_nvram *priv = context;
- u8 *dst = val;
+ size_t to_copy;
+
+ if (offset + bytes > priv->data_len)
+ to_copy = max_t(ssize_t, (ssize_t)priv->data_len - offset, 0);
+ else
+ to_copy = bytes;
+
+ memcpy(val, priv->data + offset, to_copy);
+
+ memset((uint8_t *)val + to_copy, priv->padding_byte, bytes - to_copy);
+
+ return 0;
+}
+
+static int brcm_nvram_copy_data(struct brcm_nvram *priv, struct platform_device *pdev)
+{
+ struct resource *res;
+ void __iomem *base;
+
+ base = devm_platform_get_and_ioremap_resource(pdev, 0, &res);
+ if (IS_ERR(base))
+ return PTR_ERR(base);
+
+ priv->nvmem_size = resource_size(res);
+
+ priv->padding_byte = readb(base + priv->nvmem_size - 1);
+ for (priv->data_len = priv->nvmem_size;
+ priv->data_len;
+ priv->data_len--) {
+ if (readb(base + priv->data_len - 1) != priv->padding_byte)
+ break;
+ }
+ WARN(priv->data_len > SZ_128K, "Unexpected (big) NVRAM size: %zu B\n", priv->data_len);
+
+ priv->data = devm_kzalloc(priv->dev, priv->data_len, GFP_KERNEL);
+ if (!priv->data)
+ return -ENOMEM;
+
+ memcpy_fromio(priv->data, base, priv->data_len);
- while (bytes--)
- *dst++ = readb(priv->base + offset++);
+ bcm47xx_nvram_init_from_iomem(base, priv->data_len);
return 0;
}
size_t len)
{
struct device *dev = priv->dev;
- char *var, *value, *eq;
+ char *var, *value;
+ uint8_t tmp;
int idx;
+ int err = 0;
+
+ tmp = priv->data[len - 1];
+ priv->data[len - 1] = '\0';
priv->ncells = 0;
for (var = data + sizeof(struct brcm_nvram_header);
}
priv->cells = devm_kcalloc(dev, priv->ncells, sizeof(*priv->cells), GFP_KERNEL);
- if (!priv->cells)
- return -ENOMEM;
+ if (!priv->cells) {
+ err = -ENOMEM;
+ goto out;
+ }
for (var = data + sizeof(struct brcm_nvram_header), idx = 0;
var < (char *)data + len && *var;
var = value + strlen(value) + 1, idx++) {
+ char *eq, *name;
+
eq = strchr(var, '=');
if (!eq)
break;
*eq = '\0';
+ name = devm_kstrdup(dev, var, GFP_KERNEL);
+ *eq = '=';
+ if (!name) {
+ err = -ENOMEM;
+ goto out;
+ }
value = eq + 1;
- priv->cells[idx].name = devm_kstrdup(dev, var, GFP_KERNEL);
- if (!priv->cells[idx].name)
- return -ENOMEM;
+ priv->cells[idx].name = name;
priv->cells[idx].offset = value - (char *)data;
priv->cells[idx].bytes = strlen(value);
priv->cells[idx].np = of_get_child_by_name(dev->of_node, priv->cells[idx].name);
- if (!strcmp(var, "et0macaddr") ||
- !strcmp(var, "et1macaddr") ||
- !strcmp(var, "et2macaddr")) {
+ if (!strcmp(name, "et0macaddr") ||
+ !strcmp(name, "et1macaddr") ||
+ !strcmp(name, "et2macaddr")) {
priv->cells[idx].raw_len = strlen(value);
priv->cells[idx].bytes = ETH_ALEN;
priv->cells[idx].read_post_process = brcm_nvram_read_post_process_macaddr;
}
}
- return 0;
+out:
+ priv->data[len - 1] = tmp;
+ return err;
}
static int brcm_nvram_parse(struct brcm_nvram *priv)
{
+ struct brcm_nvram_header *header = (struct brcm_nvram_header *)priv->data;
struct device *dev = priv->dev;
- struct brcm_nvram_header header;
- uint8_t *data;
size_t len;
int err;
- memcpy_fromio(&header, priv->base, sizeof(header));
-
- if (memcmp(header.magic, NVRAM_MAGIC, 4)) {
+ if (memcmp(header->magic, NVRAM_MAGIC, 4)) {
dev_err(dev, "Invalid NVRAM magic\n");
return -EINVAL;
}
- len = le32_to_cpu(header.len);
-
- data = kzalloc(len, GFP_KERNEL);
- if (!data)
- return -ENOMEM;
-
- memcpy_fromio(data, priv->base, len);
- data[len - 1] = '\0';
-
- err = brcm_nvram_add_cells(priv, data, len);
- if (err) {
- dev_err(dev, "Failed to add cells: %d\n", err);
- return err;
+ len = le32_to_cpu(header->len);
+ if (len > priv->nvmem_size) {
+ dev_err(dev, "NVRAM length (%zd) exceeds mapped size (%zd)\n", len,
+ priv->nvmem_size);
+ return -EINVAL;
}
- kfree(data);
+ err = brcm_nvram_add_cells(priv, priv->data, len);
+ if (err)
+ dev_err(dev, "Failed to add cells: %d\n", err);
return 0;
}
.reg_read = brcm_nvram_read,
};
struct device *dev = &pdev->dev;
- struct resource *res;
struct brcm_nvram *priv;
int err;
return -ENOMEM;
priv->dev = dev;
- priv->base = devm_platform_get_and_ioremap_resource(pdev, 0, &res);
- if (IS_ERR(priv->base))
- return PTR_ERR(priv->base);
+ err = brcm_nvram_copy_data(priv, pdev);
+ if (err)
+ return err;
err = brcm_nvram_parse(priv);
if (err)
return err;
- bcm47xx_nvram_init_from_iomem(priv->base, resource_size(res));
-
config.dev = dev;
config.cells = priv->cells;
config.ncells = priv->ncells;
config.priv = priv;
- config.size = resource_size(res);
+ config.size = priv->nvmem_size;
return PTR_ERR_OR_ZERO(devm_nvmem_register(dev, &config));
}
if (!layout_np)
return NULL;
+ /* Fixed layouts don't have a matching driver */
+ if (of_device_is_compatible(layout_np, "fixed-layout")) {
+ of_node_put(layout_np);
+ return NULL;
+ }
+
/*
* In case the nvmem device was built-in while the layout was built as a
* module, we shall manually request the layout driver loading otherwise
*
* Returns the new state of a device based on the notifier used.
*
- * Return: 0 on device going from enabled to disabled, 1 on device
- * going from disabled to enabled and -1 on no change.
+ * Return: OF_RECONFIG_CHANGE_REMOVE on device going from enabled to
+ * disabled, OF_RECONFIG_CHANGE_ADD on device going from disabled to
+ * enabled and OF_RECONFIG_NO_CHANGE on no change.
*/
int of_reconfig_get_state_change(unsigned long action, struct of_reconfig_data *pr)
{
static int qemu_power_off(struct sys_off_data *data)
{
/* this turns the system off via SeaBIOS */
- *(int *)data->cb_data = 0;
+ gsc_writel(0, (unsigned long) data->cb_data);
pdc_soft_power_button(1);
return NOTIFY_DONE;
}
asix_ax99100,
quatech_sppxp100,
wch_ch382l,
+ brainboxes_uc146,
+ brainboxes_px203,
};
/* asix_ax99100 */ { 1, { { 0, 1 }, } },
/* quatech_sppxp100 */ { 1, { { 0, 1 }, } },
/* wch_ch382l */ { 1, { { 2, -1 }, } },
+ /* brainboxes_uc146 */ { 1, { { 3, -1 }, } },
+ /* brainboxes_px203 */ { 1, { { 0, -1 }, } },
};
static const struct pci_device_id parport_pc_pci_tbl[] = {
PCI_ANY_ID, PCI_ANY_ID, 0, 0, quatech_sppxp100 },
/* WCH CH382L PCI-E single parallel port card */
{ 0x1c00, 0x3050, 0x1c00, 0x3050, 0, 0, wch_ch382l },
+ /* Brainboxes IX-500/550 */
+ { PCI_VENDOR_ID_INTASHIELD, 0x402a,
+ PCI_ANY_ID, PCI_ANY_ID, 0, 0, oxsemi_pcie_pport },
+ /* Brainboxes UC-146/UC-157 */
+ { PCI_VENDOR_ID_INTASHIELD, 0x0be1,
+ PCI_ANY_ID, PCI_ANY_ID, 0, 0, brainboxes_uc146 },
+ { PCI_VENDOR_ID_INTASHIELD, 0x0be2,
+ PCI_ANY_ID, PCI_ANY_ID, 0, 0, brainboxes_uc146 },
+ /* Brainboxes PX-146/PX-257 */
+ { PCI_VENDOR_ID_INTASHIELD, 0x401c,
+ PCI_ANY_ID, PCI_ANY_ID, 0, 0, oxsemi_pcie_pport },
+ /* Brainboxes PX-203 */
+ { PCI_VENDOR_ID_INTASHIELD, 0x4007,
+ PCI_ANY_ID, PCI_ANY_ID, 0, 0, brainboxes_px203 },
+ /* Brainboxes PX-475 */
+ { PCI_VENDOR_ID_INTASHIELD, 0x401f,
+ PCI_ANY_ID, PCI_ANY_ID, 0, 0, oxsemi_pcie_pport },
{ 0, } /* terminate list */
};
MODULE_DEVICE_TABLE(pci, parport_pc_pci_tbl);
static int qcom_pcie_enable_aspm(struct pci_dev *pdev, void *userdata)
{
- /* Downstream devices need to be in D0 state before enabling PCI PM substates */
+ /*
+ * Downstream devices need to be in D0 state before enabling PCI PM
+ * substates.
+ */
pci_set_power_state(pdev, PCI_D0);
- pci_enable_link_state(pdev, PCIE_LINK_STATE_ALL);
+ pci_enable_link_state_locked(pdev, PCIE_LINK_STATE_ALL);
return 0;
}
DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_LOONGSON,
DEV_LS7A_LPC, system_bus_quirk);
+/*
+ * Some Loongson PCIe ports have hardware limitations on their Maximum Read
+ * Request Size. They can't handle anything larger than this. Sane
+ * firmware will set proper MRRS at boot, so we only need no_inc_mrrs for
+ * bridges. However, some MIPS Loongson firmware doesn't set MRRS properly,
+ * so we have to enforce maximum safe MRRS, which is 256 bytes.
+ */
+#ifdef CONFIG_MIPS
+static void loongson_set_min_mrrs_quirk(struct pci_dev *pdev)
+{
+ struct pci_bus *bus = pdev->bus;
+ struct pci_dev *bridge;
+ static const struct pci_device_id bridge_devids[] = {
+ { PCI_VDEVICE(LOONGSON, DEV_LS2K_PCIE_PORT0) },
+ { PCI_VDEVICE(LOONGSON, DEV_LS7A_PCIE_PORT0) },
+ { PCI_VDEVICE(LOONGSON, DEV_LS7A_PCIE_PORT1) },
+ { PCI_VDEVICE(LOONGSON, DEV_LS7A_PCIE_PORT2) },
+ { PCI_VDEVICE(LOONGSON, DEV_LS7A_PCIE_PORT3) },
+ { PCI_VDEVICE(LOONGSON, DEV_LS7A_PCIE_PORT4) },
+ { PCI_VDEVICE(LOONGSON, DEV_LS7A_PCIE_PORT5) },
+ { PCI_VDEVICE(LOONGSON, DEV_LS7A_PCIE_PORT6) },
+ { 0, },
+ };
+
+ /* look for the matching bridge */
+ while (!pci_is_root_bus(bus)) {
+ bridge = bus->self;
+ bus = bus->parent;
+
+ if (pci_match_id(bridge_devids, bridge)) {
+ if (pcie_get_readrq(pdev) > 256) {
+ pci_info(pdev, "limiting MRRS to 256\n");
+ pcie_set_readrq(pdev, 256);
+ }
+ break;
+ }
+ }
+}
+DECLARE_PCI_FIXUP_ENABLE(PCI_ANY_ID, PCI_ANY_ID, loongson_set_min_mrrs_quirk);
+#endif
+
static void loongson_mrrs_quirk(struct pci_dev *pdev)
{
- /*
- * Some Loongson PCIe ports have h/w limitations of maximum read
- * request size. They can't handle anything larger than this. So
- * force this limit on any devices attached under these ports.
- */
struct pci_host_bridge *bridge = pci_find_host_bridge(pdev->bus);
bridge->no_inc_mrrs = 1;
if (!(features & VMD_FEAT_BIOS_PM_QUIRK))
return 0;
- pci_enable_link_state(pdev, PCIE_LINK_STATE_ALL);
+ pci_enable_link_state_locked(pdev, PCIE_LINK_STATE_ALL);
pos = pci_find_ext_capability(pdev, PCI_EXT_CAP_ID_LTR);
if (!pos)
if (pass && dev->subordinate) {
check_hotplug_bridge(slot, dev);
pcibios_resource_survey_bus(dev->subordinate);
- if (pci_is_root_bus(bus))
- __pci_bus_size_bridges(dev->subordinate, &add_list);
+ __pci_bus_size_bridges(dev->subordinate,
+ &add_list);
}
}
}
- if (pci_is_root_bus(bus))
- __pci_bus_assign_resources(bus, &add_list, NULL);
- else
- pci_assign_unassigned_bridge_resources(bus->self);
+ __pci_bus_assign_resources(bus, &add_list, NULL);
}
acpiphp_sanitize_bus(bus);
return bridge->link_state;
}
-static int __pci_disable_link_state(struct pci_dev *pdev, int state, bool sem)
+static int __pci_disable_link_state(struct pci_dev *pdev, int state, bool locked)
{
struct pcie_link_state *link = pcie_aspm_get_link(pdev);
return -EPERM;
}
- if (sem)
+ if (!locked)
down_read(&pci_bus_sem);
mutex_lock(&aspm_lock);
if (state & PCIE_LINK_STATE_L0S)
link->clkpm_disable = 1;
pcie_set_clkpm(link, policy_to_clkpm_state(link));
mutex_unlock(&aspm_lock);
- if (sem)
+ if (!locked)
up_read(&pci_bus_sem);
return 0;
int pci_disable_link_state_locked(struct pci_dev *pdev, int state)
{
- return __pci_disable_link_state(pdev, state, false);
+ lockdep_assert_held_read(&pci_bus_sem);
+
+ return __pci_disable_link_state(pdev, state, true);
}
EXPORT_SYMBOL(pci_disable_link_state_locked);
*/
int pci_disable_link_state(struct pci_dev *pdev, int state)
{
- return __pci_disable_link_state(pdev, state, true);
+ return __pci_disable_link_state(pdev, state, false);
}
EXPORT_SYMBOL(pci_disable_link_state);
-/**
- * pci_enable_link_state - Clear and set the default device link state so that
- * the link may be allowed to enter the specified states. Note that if the
- * BIOS didn't grant ASPM control to the OS, this does nothing because we can't
- * touch the LNKCTL register. Also note that this does not enable states
- * disabled by pci_disable_link_state(). Return 0 or a negative errno.
- *
- * @pdev: PCI device
- * @state: Mask of ASPM link states to enable
- */
-int pci_enable_link_state(struct pci_dev *pdev, int state)
+static int __pci_enable_link_state(struct pci_dev *pdev, int state, bool locked)
{
struct pcie_link_state *link = pcie_aspm_get_link(pdev);
return -EPERM;
}
- down_read(&pci_bus_sem);
+ if (!locked)
+ down_read(&pci_bus_sem);
mutex_lock(&aspm_lock);
link->aspm_default = 0;
if (state & PCIE_LINK_STATE_L0S)
link->clkpm_default = (state & PCIE_LINK_STATE_CLKPM) ? 1 : 0;
pcie_set_clkpm(link, policy_to_clkpm_state(link));
mutex_unlock(&aspm_lock);
- up_read(&pci_bus_sem);
+ if (!locked)
+ up_read(&pci_bus_sem);
return 0;
}
+
+/**
+ * pci_enable_link_state - Clear and set the default device link state so that
+ * the link may be allowed to enter the specified states. Note that if the
+ * BIOS didn't grant ASPM control to the OS, this does nothing because we can't
+ * touch the LNKCTL register. Also note that this does not enable states
+ * disabled by pci_disable_link_state(). Return 0 or a negative errno.
+ *
+ * @pdev: PCI device
+ * @state: Mask of ASPM link states to enable
+ */
+int pci_enable_link_state(struct pci_dev *pdev, int state)
+{
+ return __pci_enable_link_state(pdev, state, false);
+}
EXPORT_SYMBOL(pci_enable_link_state);
+/**
+ * pci_enable_link_state_locked - Clear and set the default device link state
+ * so that the link may be allowed to enter the specified states. Note that if
+ * the BIOS didn't grant ASPM control to the OS, this does nothing because we
+ * can't touch the LNKCTL register. Also note that this does not enable states
+ * disabled by pci_disable_link_state(). Return 0 or a negative errno.
+ *
+ * @pdev: PCI device
+ * @state: Mask of ASPM link states to enable
+ *
+ * Context: Caller holds pci_bus_sem read lock.
+ */
+int pci_enable_link_state_locked(struct pci_dev *pdev, int state)
+{
+ lockdep_assert_held_read(&pci_bus_sem);
+
+ return __pci_enable_link_state(pdev, state, true);
+}
+EXPORT_SYMBOL(pci_enable_link_state_locked);
+
static int pcie_aspm_set_policy(const char *val,
const struct kernel_param *kp)
{
idx = 0;
while (cmn->dtc[j].counters[idx])
if (++idx == CMN_DT_NUM_COUNTERS)
- goto free_dtms;
+ return -ENOSPC;
}
hw->dtc_idx[j] = idx;
}
source "drivers/phy/mscc/Kconfig"
source "drivers/phy/qualcomm/Kconfig"
source "drivers/phy/ralink/Kconfig"
-source "drivers/phy/realtek/Kconfig"
source "drivers/phy/renesas/Kconfig"
source "drivers/phy/rockchip/Kconfig"
source "drivers/phy/samsung/Kconfig"
mscc/ \
qualcomm/ \
ralink/ \
- realtek/ \
renesas/ \
rockchip/ \
samsung/ \
static long mtk_mipi_tx_pll_round_rate(struct clk_hw *hw, unsigned long rate,
unsigned long *prate)
{
- return clamp_val(rate, 50000000, 1600000000);
+ return clamp_val(rate, 125000000, 1600000000);
}
static const struct clk_ops mtk_mipi_tx_pll_ops = {
+++ /dev/null
-# SPDX-License-Identifier: GPL-2.0
-#
-# Phy drivers for Realtek platforms
-#
-
-if ARCH_REALTEK || COMPILE_TEST
-
-config PHY_RTK_RTD_USB2PHY
- tristate "Realtek RTD USB2 PHY Transceiver Driver"
- depends on USB_SUPPORT
- select GENERIC_PHY
- select USB_PHY
- select USB_COMMON
- help
- Enable this to support Realtek SoC USB2 phy transceiver.
- The DHC (digital home center) RTD series SoCs used the Synopsys
- DWC3 USB IP. This driver will do the PHY initialization
- of the parameters.
-
-config PHY_RTK_RTD_USB3PHY
- tristate "Realtek RTD USB3 PHY Transceiver Driver"
- depends on USB_SUPPORT
- select GENERIC_PHY
- select USB_PHY
- select USB_COMMON
- help
- Enable this to support Realtek SoC USB3 phy transceiver.
- The DHC (digital home center) RTD series SoCs used the Synopsys
- DWC3 USB IP. This driver will do the PHY initialization
- of the parameters.
-
-endif # ARCH_REALTEK || COMPILE_TEST
+++ /dev/null
-# SPDX-License-Identifier: GPL-2.0
-obj-$(CONFIG_PHY_RTK_RTD_USB2PHY) += phy-rtk-usb2.o
-obj-$(CONFIG_PHY_RTK_RTD_USB3PHY) += phy-rtk-usb3.o
+++ /dev/null
-// SPDX-License-Identifier: GPL-2.0
-/*
- * phy-rtk-usb2.c RTK usb2.0 PHY driver
- *
- * Copyright (C) 2023 Realtek Semiconductor Corporation
- *
- */
-
-#include <linux/module.h>
-#include <linux/of.h>
-#include <linux/of_address.h>
-#include <linux/platform_device.h>
-#include <linux/uaccess.h>
-#include <linux/debugfs.h>
-#include <linux/nvmem-consumer.h>
-#include <linux/regmap.h>
-#include <linux/sys_soc.h>
-#include <linux/mfd/syscon.h>
-#include <linux/phy/phy.h>
-#include <linux/usb.h>
-#include <linux/usb/phy.h>
-#include <linux/usb/hcd.h>
-
-/* GUSB2PHYACCn register */
-#define PHY_NEW_REG_REQ BIT(25)
-#define PHY_VSTS_BUSY BIT(23)
-#define PHY_VCTRL_SHIFT 8
-#define PHY_REG_DATA_MASK 0xff
-
-#define GET_LOW_NIBBLE(addr) ((addr) & 0x0f)
-#define GET_HIGH_NIBBLE(addr) (((addr) & 0xf0) >> 4)
-
-#define EFUS_USB_DC_CAL_RATE 2
-#define EFUS_USB_DC_CAL_MAX 7
-
-#define EFUS_USB_DC_DIS_RATE 1
-#define EFUS_USB_DC_DIS_MAX 7
-
-#define MAX_PHY_DATA_SIZE 20
-#define OFFEST_PHY_READ 0x20
-
-#define MAX_USB_PHY_NUM 4
-#define MAX_USB_PHY_PAGE0_DATA_SIZE 16
-#define MAX_USB_PHY_PAGE1_DATA_SIZE 16
-#define MAX_USB_PHY_PAGE2_DATA_SIZE 8
-
-#define SET_PAGE_OFFSET 0xf4
-#define SET_PAGE_0 0x9b
-#define SET_PAGE_1 0xbb
-#define SET_PAGE_2 0xdb
-
-#define PAGE_START 0xe0
-#define PAGE0_0XE4 0xe4
-#define PAGE0_0XE6 0xe6
-#define PAGE0_0XE7 0xe7
-#define PAGE1_0XE0 0xe0
-#define PAGE1_0XE2 0xe2
-
-#define SENSITIVITY_CTRL (BIT(4) | BIT(5) | BIT(6))
-#define ENABLE_AUTO_SENSITIVITY_CALIBRATION BIT(2)
-#define DEFAULT_DC_DRIVING_VALUE (0x8)
-#define DEFAULT_DC_DISCONNECTION_VALUE (0x6)
-#define HS_CLK_SELECT BIT(6)
-
-struct phy_reg {
- void __iomem *reg_wrap_vstatus;
- void __iomem *reg_gusb2phyacc0;
- int vstatus_index;
-};
-
-struct phy_data {
- u8 addr;
- u8 data;
-};
-
-struct phy_cfg {
- int page0_size;
- struct phy_data page0[MAX_USB_PHY_PAGE0_DATA_SIZE];
- int page1_size;
- struct phy_data page1[MAX_USB_PHY_PAGE1_DATA_SIZE];
- int page2_size;
- struct phy_data page2[MAX_USB_PHY_PAGE2_DATA_SIZE];
-
- int num_phy;
-
- bool check_efuse;
- int check_efuse_version;
-#define CHECK_EFUSE_V1 1
-#define CHECK_EFUSE_V2 2
- int efuse_dc_driving_rate;
- int efuse_dc_disconnect_rate;
- int dc_driving_mask;
- int dc_disconnect_mask;
- bool usb_dc_disconnect_at_page0;
- int driving_updated_for_dev_dis;
-
- bool do_toggle;
- bool do_toggle_driving;
- bool use_default_parameter;
- bool is_double_sensitivity_mode;
-};
-
-struct phy_parameter {
- struct phy_reg phy_reg;
-
- /* Get from efuse */
- s8 efuse_usb_dc_cal;
- s8 efuse_usb_dc_dis;
-
- /* Get from dts */
- bool inverse_hstx_sync_clock;
- u32 driving_level;
- s32 driving_level_compensate;
- s32 disconnection_compensate;
-};
-
-struct rtk_phy {
- struct usb_phy phy;
- struct device *dev;
-
- struct phy_cfg *phy_cfg;
- int num_phy;
- struct phy_parameter *phy_parameter;
-
- struct dentry *debug_dir;
-};
-
-/* mapping 0xE0 to 0 ... 0xE7 to 7, 0xF0 to 8 ,,, 0xF7 to 15 */
-static inline int page_addr_to_array_index(u8 addr)
-{
- return (int)((((addr) - PAGE_START) & 0x7) +
- ((((addr) - PAGE_START) & 0x10) >> 1));
-}
-
-static inline u8 array_index_to_page_addr(int index)
-{
- return ((((index) + PAGE_START) & 0x7) +
- ((((index) & 0x8) << 1) + PAGE_START));
-}
-
-#define PHY_IO_TIMEOUT_USEC (50000)
-#define PHY_IO_DELAY_US (100)
-
-static inline int utmi_wait_register(void __iomem *reg, u32 mask, u32 result)
-{
- int ret;
- unsigned int val;
-
- ret = read_poll_timeout(readl, val, ((val & mask) == result),
- PHY_IO_DELAY_US, PHY_IO_TIMEOUT_USEC, false, reg);
- if (ret) {
- pr_err("%s can't program USB phy\n", __func__);
- return -ETIMEDOUT;
- }
-
- return 0;
-}
-
-static char rtk_phy_read(struct phy_reg *phy_reg, char addr)
-{
- void __iomem *reg_gusb2phyacc0 = phy_reg->reg_gusb2phyacc0;
- unsigned int val;
- int ret = 0;
-
- addr -= OFFEST_PHY_READ;
-
- /* polling until VBusy == 0 */
- ret = utmi_wait_register(reg_gusb2phyacc0, PHY_VSTS_BUSY, 0);
- if (ret)
- return (char)ret;
-
- /* VCtrl = low nibble of addr, and set PHY_NEW_REG_REQ */
- val = PHY_NEW_REG_REQ | (GET_LOW_NIBBLE(addr) << PHY_VCTRL_SHIFT);
- writel(val, reg_gusb2phyacc0);
- ret = utmi_wait_register(reg_gusb2phyacc0, PHY_VSTS_BUSY, 0);
- if (ret)
- return (char)ret;
-
- /* VCtrl = high nibble of addr, and set PHY_NEW_REG_REQ */
- val = PHY_NEW_REG_REQ | (GET_HIGH_NIBBLE(addr) << PHY_VCTRL_SHIFT);
- writel(val, reg_gusb2phyacc0);
- ret = utmi_wait_register(reg_gusb2phyacc0, PHY_VSTS_BUSY, 0);
- if (ret)
- return (char)ret;
-
- val = readl(reg_gusb2phyacc0);
-
- return (char)(val & PHY_REG_DATA_MASK);
-}
-
-static int rtk_phy_write(struct phy_reg *phy_reg, char addr, char data)
-{
- unsigned int val;
- void __iomem *reg_wrap_vstatus = phy_reg->reg_wrap_vstatus;
- void __iomem *reg_gusb2phyacc0 = phy_reg->reg_gusb2phyacc0;
- int shift_bits = phy_reg->vstatus_index * 8;
- int ret = 0;
-
- /* write data to VStatusOut2 (data output to phy) */
- writel((u32)data << shift_bits, reg_wrap_vstatus);
-
- ret = utmi_wait_register(reg_gusb2phyacc0, PHY_VSTS_BUSY, 0);
- if (ret)
- return ret;
-
- /* VCtrl = low nibble of addr, set PHY_NEW_REG_REQ */
- val = PHY_NEW_REG_REQ | (GET_LOW_NIBBLE(addr) << PHY_VCTRL_SHIFT);
-
- writel(val, reg_gusb2phyacc0);
- ret = utmi_wait_register(reg_gusb2phyacc0, PHY_VSTS_BUSY, 0);
- if (ret)
- return ret;
-
- /* VCtrl = high nibble of addr, set PHY_NEW_REG_REQ */
- val = PHY_NEW_REG_REQ | (GET_HIGH_NIBBLE(addr) << PHY_VCTRL_SHIFT);
-
- writel(val, reg_gusb2phyacc0);
- ret = utmi_wait_register(reg_gusb2phyacc0, PHY_VSTS_BUSY, 0);
- if (ret)
- return ret;
-
- return 0;
-}
-
-static int rtk_phy_set_page(struct phy_reg *phy_reg, int page)
-{
- switch (page) {
- case 0:
- return rtk_phy_write(phy_reg, SET_PAGE_OFFSET, SET_PAGE_0);
- case 1:
- return rtk_phy_write(phy_reg, SET_PAGE_OFFSET, SET_PAGE_1);
- case 2:
- return rtk_phy_write(phy_reg, SET_PAGE_OFFSET, SET_PAGE_2);
- default:
- pr_err("%s error page=%d\n", __func__, page);
- }
-
- return -EINVAL;
-}
-
-static u8 __updated_dc_disconnect_level_page0_0xe4(struct phy_cfg *phy_cfg,
- struct phy_parameter *phy_parameter, u8 data)
-{
- u8 ret;
- s32 val;
- s32 dc_disconnect_mask = phy_cfg->dc_disconnect_mask;
- int offset = 4;
-
- val = (s32)((data >> offset) & dc_disconnect_mask)
- + phy_parameter->efuse_usb_dc_dis
- + phy_parameter->disconnection_compensate;
-
- if (val > dc_disconnect_mask)
- val = dc_disconnect_mask;
- else if (val < 0)
- val = 0;
-
- ret = (data & (~(dc_disconnect_mask << offset))) |
- (val & dc_disconnect_mask) << offset;
-
- return ret;
-}
-
-/* updated disconnect level at page0 */
-static void update_dc_disconnect_level_at_page0(struct rtk_phy *rtk_phy,
- struct phy_parameter *phy_parameter, bool update)
-{
- struct phy_cfg *phy_cfg;
- struct phy_reg *phy_reg;
- struct phy_data *phy_data_page;
- struct phy_data *phy_data;
- u8 addr, data;
- int offset = 4;
- s32 dc_disconnect_mask;
- int i;
-
- phy_cfg = rtk_phy->phy_cfg;
- phy_reg = &phy_parameter->phy_reg;
-
- /* Set page 0 */
- phy_data_page = phy_cfg->page0;
- rtk_phy_set_page(phy_reg, 0);
-
- i = page_addr_to_array_index(PAGE0_0XE4);
- phy_data = phy_data_page + i;
- if (!phy_data->addr) {
- phy_data->addr = PAGE0_0XE4;
- phy_data->data = rtk_phy_read(phy_reg, PAGE0_0XE4);
- }
-
- addr = phy_data->addr;
- data = phy_data->data;
- dc_disconnect_mask = phy_cfg->dc_disconnect_mask;
-
- if (update)
- data = __updated_dc_disconnect_level_page0_0xe4(phy_cfg, phy_parameter, data);
- else
- data = (data & ~(dc_disconnect_mask << offset)) |
- (DEFAULT_DC_DISCONNECTION_VALUE << offset);
-
- if (rtk_phy_write(phy_reg, addr, data))
- dev_err(rtk_phy->dev,
- "%s: Error to set page1 parameter addr=0x%x value=0x%x\n",
- __func__, addr, data);
-}
-
-static u8 __updated_dc_disconnect_level_page1_0xe2(struct phy_cfg *phy_cfg,
- struct phy_parameter *phy_parameter, u8 data)
-{
- u8 ret;
- s32 val;
- s32 dc_disconnect_mask = phy_cfg->dc_disconnect_mask;
-
- if (phy_cfg->check_efuse_version == CHECK_EFUSE_V1) {
- val = (s32)(data & dc_disconnect_mask)
- + phy_parameter->efuse_usb_dc_dis
- + phy_parameter->disconnection_compensate;
- } else { /* for CHECK_EFUSE_V2 or no efuse */
- if (phy_parameter->efuse_usb_dc_dis)
- val = (s32)(phy_parameter->efuse_usb_dc_dis +
- phy_parameter->disconnection_compensate);
- else
- val = (s32)((data & dc_disconnect_mask) +
- phy_parameter->disconnection_compensate);
- }
-
- if (val > dc_disconnect_mask)
- val = dc_disconnect_mask;
- else if (val < 0)
- val = 0;
-
- ret = (data & (~dc_disconnect_mask)) | (val & dc_disconnect_mask);
-
- return ret;
-}
-
-/* updated disconnect level at page1 */
-static void update_dc_disconnect_level_at_page1(struct rtk_phy *rtk_phy,
- struct phy_parameter *phy_parameter, bool update)
-{
- struct phy_cfg *phy_cfg;
- struct phy_data *phy_data_page;
- struct phy_data *phy_data;
- struct phy_reg *phy_reg;
- u8 addr, data;
- s32 dc_disconnect_mask;
- int i;
-
- phy_cfg = rtk_phy->phy_cfg;
- phy_reg = &phy_parameter->phy_reg;
-
- /* Set page 1 */
- phy_data_page = phy_cfg->page1;
- rtk_phy_set_page(phy_reg, 1);
-
- i = page_addr_to_array_index(PAGE1_0XE2);
- phy_data = phy_data_page + i;
- if (!phy_data->addr) {
- phy_data->addr = PAGE1_0XE2;
- phy_data->data = rtk_phy_read(phy_reg, PAGE1_0XE2);
- }
-
- addr = phy_data->addr;
- data = phy_data->data;
- dc_disconnect_mask = phy_cfg->dc_disconnect_mask;
-
- if (update)
- data = __updated_dc_disconnect_level_page1_0xe2(phy_cfg, phy_parameter, data);
- else
- data = (data & ~dc_disconnect_mask) | DEFAULT_DC_DISCONNECTION_VALUE;
-
- if (rtk_phy_write(phy_reg, addr, data))
- dev_err(rtk_phy->dev,
- "%s: Error to set page1 parameter addr=0x%x value=0x%x\n",
- __func__, addr, data);
-}
-
-static void update_dc_disconnect_level(struct rtk_phy *rtk_phy,
- struct phy_parameter *phy_parameter, bool update)
-{
- struct phy_cfg *phy_cfg = rtk_phy->phy_cfg;
-
- if (phy_cfg->usb_dc_disconnect_at_page0)
- update_dc_disconnect_level_at_page0(rtk_phy, phy_parameter, update);
- else
- update_dc_disconnect_level_at_page1(rtk_phy, phy_parameter, update);
-}
-
-static u8 __update_dc_driving_page0_0xe4(struct phy_cfg *phy_cfg,
- struct phy_parameter *phy_parameter, u8 data)
-{
- s32 driving_level_compensate = phy_parameter->driving_level_compensate;
- s32 dc_driving_mask = phy_cfg->dc_driving_mask;
- s32 val;
- u8 ret;
-
- if (phy_cfg->check_efuse_version == CHECK_EFUSE_V1) {
- val = (s32)(data & dc_driving_mask) + driving_level_compensate
- + phy_parameter->efuse_usb_dc_cal;
- } else { /* for CHECK_EFUSE_V2 or no efuse */
- if (phy_parameter->efuse_usb_dc_cal)
- val = (s32)((phy_parameter->efuse_usb_dc_cal & dc_driving_mask)
- + driving_level_compensate);
- else
- val = (s32)(data & dc_driving_mask);
- }
-
- if (val > dc_driving_mask)
- val = dc_driving_mask;
- else if (val < 0)
- val = 0;
-
- ret = (data & (~dc_driving_mask)) | (val & dc_driving_mask);
-
- return ret;
-}
-
-static void update_dc_driving_level(struct rtk_phy *rtk_phy,
- struct phy_parameter *phy_parameter)
-{
- struct phy_cfg *phy_cfg;
- struct phy_reg *phy_reg;
-
- phy_reg = &phy_parameter->phy_reg;
- phy_cfg = rtk_phy->phy_cfg;
- if (!phy_cfg->page0[4].addr) {
- rtk_phy_set_page(phy_reg, 0);
- phy_cfg->page0[4].addr = PAGE0_0XE4;
- phy_cfg->page0[4].data = rtk_phy_read(phy_reg, PAGE0_0XE4);
- }
-
- if (phy_parameter->driving_level != DEFAULT_DC_DRIVING_VALUE) {
- u32 dc_driving_mask;
- u8 driving_level;
- u8 data;
-
- data = phy_cfg->page0[4].data;
- dc_driving_mask = phy_cfg->dc_driving_mask;
- driving_level = data & dc_driving_mask;
-
- dev_dbg(rtk_phy->dev, "%s driving_level=%d => dts driving_level=%d\n",
- __func__, driving_level, phy_parameter->driving_level);
-
- phy_cfg->page0[4].data = (data & (~dc_driving_mask)) |
- (phy_parameter->driving_level & dc_driving_mask);
- }
-
- phy_cfg->page0[4].data = __update_dc_driving_page0_0xe4(phy_cfg,
- phy_parameter,
- phy_cfg->page0[4].data);
-}
-
-static void update_hs_clk_select(struct rtk_phy *rtk_phy,
- struct phy_parameter *phy_parameter)
-{
- struct phy_cfg *phy_cfg;
- struct phy_reg *phy_reg;
-
- phy_cfg = rtk_phy->phy_cfg;
- phy_reg = &phy_parameter->phy_reg;
-
- if (phy_parameter->inverse_hstx_sync_clock) {
- if (!phy_cfg->page0[6].addr) {
- rtk_phy_set_page(phy_reg, 0);
- phy_cfg->page0[6].addr = PAGE0_0XE6;
- phy_cfg->page0[6].data = rtk_phy_read(phy_reg, PAGE0_0XE6);
- }
-
- phy_cfg->page0[6].data = phy_cfg->page0[6].data | HS_CLK_SELECT;
- }
-}
-
-static void do_rtk_phy_toggle(struct rtk_phy *rtk_phy,
- int index, bool connect)
-{
- struct phy_parameter *phy_parameter;
- struct phy_cfg *phy_cfg;
- struct phy_reg *phy_reg;
- struct phy_data *phy_data_page;
- u8 addr, data;
- int i;
-
- phy_cfg = rtk_phy->phy_cfg;
- phy_parameter = &((struct phy_parameter *)rtk_phy->phy_parameter)[index];
- phy_reg = &phy_parameter->phy_reg;
-
- if (!phy_cfg->do_toggle)
- goto out;
-
- if (phy_cfg->is_double_sensitivity_mode)
- goto do_toggle_driving;
-
- /* Set page 0 */
- rtk_phy_set_page(phy_reg, 0);
-
- addr = PAGE0_0XE7;
- data = rtk_phy_read(phy_reg, addr);
-
- if (connect)
- rtk_phy_write(phy_reg, addr, data & (~SENSITIVITY_CTRL));
- else
- rtk_phy_write(phy_reg, addr, data | (SENSITIVITY_CTRL));
-
-do_toggle_driving:
-
- if (!phy_cfg->do_toggle_driving)
- goto do_toggle;
-
- /* Page 0 addr 0xE4 driving capability */
-
- /* Set page 0 */
- phy_data_page = phy_cfg->page0;
- rtk_phy_set_page(phy_reg, 0);
-
- i = page_addr_to_array_index(PAGE0_0XE4);
- addr = phy_data_page[i].addr;
- data = phy_data_page[i].data;
-
- if (connect) {
- rtk_phy_write(phy_reg, addr, data);
- } else {
- u8 value;
- s32 tmp;
- s32 driving_updated =
- phy_cfg->driving_updated_for_dev_dis;
- s32 dc_driving_mask = phy_cfg->dc_driving_mask;
-
- tmp = (s32)(data & dc_driving_mask) + driving_updated;
-
- if (tmp > dc_driving_mask)
- tmp = dc_driving_mask;
- else if (tmp < 0)
- tmp = 0;
-
- value = (data & (~dc_driving_mask)) | (tmp & dc_driving_mask);
-
- rtk_phy_write(phy_reg, addr, value);
- }
-
-do_toggle:
- /* restore dc disconnect level before toggle */
- update_dc_disconnect_level(rtk_phy, phy_parameter, false);
-
- /* Set page 1 */
- rtk_phy_set_page(phy_reg, 1);
-
- addr = PAGE1_0XE0;
- data = rtk_phy_read(phy_reg, addr);
-
- rtk_phy_write(phy_reg, addr, data &
- (~ENABLE_AUTO_SENSITIVITY_CALIBRATION));
- mdelay(1);
- rtk_phy_write(phy_reg, addr, data |
- (ENABLE_AUTO_SENSITIVITY_CALIBRATION));
-
- /* update dc disconnect level after toggle */
- update_dc_disconnect_level(rtk_phy, phy_parameter, true);
-
-out:
- return;
-}
-
-static int do_rtk_phy_init(struct rtk_phy *rtk_phy, int index)
-{
- struct phy_parameter *phy_parameter;
- struct phy_cfg *phy_cfg;
- struct phy_data *phy_data_page;
- struct phy_reg *phy_reg;
- int i;
-
- phy_cfg = rtk_phy->phy_cfg;
- phy_parameter = &((struct phy_parameter *)rtk_phy->phy_parameter)[index];
- phy_reg = &phy_parameter->phy_reg;
-
- if (phy_cfg->use_default_parameter) {
- dev_dbg(rtk_phy->dev, "%s phy#%d use default parameter\n",
- __func__, index);
- goto do_toggle;
- }
-
- /* Set page 0 */
- phy_data_page = phy_cfg->page0;
- rtk_phy_set_page(phy_reg, 0);
-
- for (i = 0; i < phy_cfg->page0_size; i++) {
- struct phy_data *phy_data = phy_data_page + i;
- u8 addr = phy_data->addr;
- u8 data = phy_data->data;
-
- if (!addr)
- continue;
-
- if (rtk_phy_write(phy_reg, addr, data)) {
- dev_err(rtk_phy->dev,
- "%s: Error to set page0 parameter addr=0x%x value=0x%x\n",
- __func__, addr, data);
- return -EINVAL;
- }
- }
-
- /* Set page 1 */
- phy_data_page = phy_cfg->page1;
- rtk_phy_set_page(phy_reg, 1);
-
- for (i = 0; i < phy_cfg->page1_size; i++) {
- struct phy_data *phy_data = phy_data_page + i;
- u8 addr = phy_data->addr;
- u8 data = phy_data->data;
-
- if (!addr)
- continue;
-
- if (rtk_phy_write(phy_reg, addr, data)) {
- dev_err(rtk_phy->dev,
- "%s: Error to set page1 parameter addr=0x%x value=0x%x\n",
- __func__, addr, data);
- return -EINVAL;
- }
- }
-
- if (phy_cfg->page2_size == 0)
- goto do_toggle;
-
- /* Set page 2 */
- phy_data_page = phy_cfg->page2;
- rtk_phy_set_page(phy_reg, 2);
-
- for (i = 0; i < phy_cfg->page2_size; i++) {
- struct phy_data *phy_data = phy_data_page + i;
- u8 addr = phy_data->addr;
- u8 data = phy_data->data;
-
- if (!addr)
- continue;
-
- if (rtk_phy_write(phy_reg, addr, data)) {
- dev_err(rtk_phy->dev,
- "%s: Error to set page2 parameter addr=0x%x value=0x%x\n",
- __func__, addr, data);
- return -EINVAL;
- }
- }
-
-do_toggle:
- do_rtk_phy_toggle(rtk_phy, index, false);
-
- return 0;
-}
-
-static int rtk_phy_init(struct phy *phy)
-{
- struct rtk_phy *rtk_phy = phy_get_drvdata(phy);
- unsigned long phy_init_time = jiffies;
- int i, ret = 0;
-
- if (!rtk_phy)
- return -EINVAL;
-
- for (i = 0; i < rtk_phy->num_phy; i++)
- ret = do_rtk_phy_init(rtk_phy, i);
-
- dev_dbg(rtk_phy->dev, "Initialized RTK USB 2.0 PHY (take %dms)\n",
- jiffies_to_msecs(jiffies - phy_init_time));
- return ret;
-}
-
-static int rtk_phy_exit(struct phy *phy)
-{
- return 0;
-}
-
-static const struct phy_ops ops = {
- .init = rtk_phy_init,
- .exit = rtk_phy_exit,
- .owner = THIS_MODULE,
-};
-
-static void rtk_phy_toggle(struct usb_phy *usb2_phy, bool connect, int port)
-{
- int index = port;
- struct rtk_phy *rtk_phy = NULL;
-
- rtk_phy = dev_get_drvdata(usb2_phy->dev);
-
- if (index > rtk_phy->num_phy) {
- dev_err(rtk_phy->dev, "%s: The port=%d is not in usb phy (num_phy=%d)\n",
- __func__, index, rtk_phy->num_phy);
- return;
- }
-
- do_rtk_phy_toggle(rtk_phy, index, connect);
-}
-
-static int rtk_phy_notify_port_status(struct usb_phy *x, int port,
- u16 portstatus, u16 portchange)
-{
- bool connect = false;
-
- pr_debug("%s port=%d portstatus=0x%x portchange=0x%x\n",
- __func__, port, (int)portstatus, (int)portchange);
- if (portstatus & USB_PORT_STAT_CONNECTION)
- connect = true;
-
- if (portchange & USB_PORT_STAT_C_CONNECTION)
- rtk_phy_toggle(x, connect, port);
-
- return 0;
-}
-
-#ifdef CONFIG_DEBUG_FS
-static struct dentry *create_phy_debug_root(void)
-{
- struct dentry *phy_debug_root;
-
- phy_debug_root = debugfs_lookup("phy", usb_debug_root);
- if (!phy_debug_root)
- phy_debug_root = debugfs_create_dir("phy", usb_debug_root);
-
- return phy_debug_root;
-}
-
-static int rtk_usb2_parameter_show(struct seq_file *s, void *unused)
-{
- struct rtk_phy *rtk_phy = s->private;
- struct phy_cfg *phy_cfg;
- int i, index;
-
- phy_cfg = rtk_phy->phy_cfg;
-
- seq_puts(s, "Property:\n");
- seq_printf(s, " check_efuse: %s\n",
- phy_cfg->check_efuse ? "Enable" : "Disable");
- seq_printf(s, " check_efuse_version: %d\n",
- phy_cfg->check_efuse_version);
- seq_printf(s, " efuse_dc_driving_rate: %d\n",
- phy_cfg->efuse_dc_driving_rate);
- seq_printf(s, " dc_driving_mask: 0x%x\n",
- phy_cfg->dc_driving_mask);
- seq_printf(s, " efuse_dc_disconnect_rate: %d\n",
- phy_cfg->efuse_dc_disconnect_rate);
- seq_printf(s, " dc_disconnect_mask: 0x%x\n",
- phy_cfg->dc_disconnect_mask);
- seq_printf(s, " usb_dc_disconnect_at_page0: %s\n",
- phy_cfg->usb_dc_disconnect_at_page0 ? "true" : "false");
- seq_printf(s, " do_toggle: %s\n",
- phy_cfg->do_toggle ? "Enable" : "Disable");
- seq_printf(s, " do_toggle_driving: %s\n",
- phy_cfg->do_toggle_driving ? "Enable" : "Disable");
- seq_printf(s, " driving_updated_for_dev_dis: 0x%x\n",
- phy_cfg->driving_updated_for_dev_dis);
- seq_printf(s, " use_default_parameter: %s\n",
- phy_cfg->use_default_parameter ? "Enable" : "Disable");
- seq_printf(s, " is_double_sensitivity_mode: %s\n",
- phy_cfg->is_double_sensitivity_mode ? "Enable" : "Disable");
-
- for (index = 0; index < rtk_phy->num_phy; index++) {
- struct phy_parameter *phy_parameter;
- struct phy_reg *phy_reg;
- struct phy_data *phy_data_page;
-
- phy_parameter = &((struct phy_parameter *)rtk_phy->phy_parameter)[index];
- phy_reg = &phy_parameter->phy_reg;
-
- seq_printf(s, "PHY %d:\n", index);
-
- seq_puts(s, "Page 0:\n");
- /* Set page 0 */
- phy_data_page = phy_cfg->page0;
- rtk_phy_set_page(phy_reg, 0);
-
- for (i = 0; i < phy_cfg->page0_size; i++) {
- struct phy_data *phy_data = phy_data_page + i;
- u8 addr = array_index_to_page_addr(i);
- u8 data = phy_data->data;
- u8 value = rtk_phy_read(phy_reg, addr);
-
- if (phy_data->addr)
- seq_printf(s, " Page 0: addr=0x%x data=0x%02x ==> read value=0x%02x\n",
- addr, data, value);
- else
- seq_printf(s, " Page 0: addr=0x%x data=none ==> read value=0x%02x\n",
- addr, value);
- }
-
- seq_puts(s, "Page 1:\n");
- /* Set page 1 */
- phy_data_page = phy_cfg->page1;
- rtk_phy_set_page(phy_reg, 1);
-
- for (i = 0; i < phy_cfg->page1_size; i++) {
- struct phy_data *phy_data = phy_data_page + i;
- u8 addr = array_index_to_page_addr(i);
- u8 data = phy_data->data;
- u8 value = rtk_phy_read(phy_reg, addr);
-
- if (phy_data->addr)
- seq_printf(s, " Page 1: addr=0x%x data=0x%02x ==> read value=0x%02x\n",
- addr, data, value);
- else
- seq_printf(s, " Page 1: addr=0x%x data=none ==> read value=0x%02x\n",
- addr, value);
- }
-
- if (phy_cfg->page2_size == 0)
- goto out;
-
- seq_puts(s, "Page 2:\n");
- /* Set page 2 */
- phy_data_page = phy_cfg->page2;
- rtk_phy_set_page(phy_reg, 2);
-
- for (i = 0; i < phy_cfg->page2_size; i++) {
- struct phy_data *phy_data = phy_data_page + i;
- u8 addr = array_index_to_page_addr(i);
- u8 data = phy_data->data;
- u8 value = rtk_phy_read(phy_reg, addr);
-
- if (phy_data->addr)
- seq_printf(s, " Page 2: addr=0x%x data=0x%02x ==> read value=0x%02x\n",
- addr, data, value);
- else
- seq_printf(s, " Page 2: addr=0x%x data=none ==> read value=0x%02x\n",
- addr, value);
- }
-
-out:
- seq_puts(s, "PHY Property:\n");
- seq_printf(s, " efuse_usb_dc_cal: %d\n",
- (int)phy_parameter->efuse_usb_dc_cal);
- seq_printf(s, " efuse_usb_dc_dis: %d\n",
- (int)phy_parameter->efuse_usb_dc_dis);
- seq_printf(s, " inverse_hstx_sync_clock: %s\n",
- phy_parameter->inverse_hstx_sync_clock ? "Enable" : "Disable");
- seq_printf(s, " driving_level: %d\n",
- phy_parameter->driving_level);
- seq_printf(s, " driving_level_compensate: %d\n",
- phy_parameter->driving_level_compensate);
- seq_printf(s, " disconnection_compensate: %d\n",
- phy_parameter->disconnection_compensate);
- }
-
- return 0;
-}
-DEFINE_SHOW_ATTRIBUTE(rtk_usb2_parameter);
-
-static inline void create_debug_files(struct rtk_phy *rtk_phy)
-{
- struct dentry *phy_debug_root = NULL;
-
- phy_debug_root = create_phy_debug_root();
- if (!phy_debug_root)
- return;
-
- rtk_phy->debug_dir = debugfs_create_dir(dev_name(rtk_phy->dev),
- phy_debug_root);
-
- debugfs_create_file("parameter", 0444, rtk_phy->debug_dir, rtk_phy,
- &rtk_usb2_parameter_fops);
-
- return;
-}
-
-static inline void remove_debug_files(struct rtk_phy *rtk_phy)
-{
- debugfs_remove_recursive(rtk_phy->debug_dir);
-}
-#else
-static inline void create_debug_files(struct rtk_phy *rtk_phy) { }
-static inline void remove_debug_files(struct rtk_phy *rtk_phy) { }
-#endif /* CONFIG_DEBUG_FS */
-
-static int get_phy_data_by_efuse(struct rtk_phy *rtk_phy,
- struct phy_parameter *phy_parameter, int index)
-{
- struct phy_cfg *phy_cfg = rtk_phy->phy_cfg;
- u8 value = 0;
- struct nvmem_cell *cell;
- struct soc_device_attribute rtk_soc_groot[] = {
- { .family = "Realtek Groot",},
- { /* empty */ } };
-
- if (!phy_cfg->check_efuse)
- goto out;
-
- /* Read efuse for usb dc cal */
- cell = nvmem_cell_get(rtk_phy->dev, "usb-dc-cal");
- if (IS_ERR(cell)) {
- dev_dbg(rtk_phy->dev, "%s no usb-dc-cal: %ld\n",
- __func__, PTR_ERR(cell));
- } else {
- unsigned char *buf;
- size_t buf_size;
-
- buf = nvmem_cell_read(cell, &buf_size);
- if (!IS_ERR(buf)) {
- value = buf[0] & phy_cfg->dc_driving_mask;
- kfree(buf);
- }
- nvmem_cell_put(cell);
- }
-
- if (phy_cfg->check_efuse_version == CHECK_EFUSE_V1) {
- int rate = phy_cfg->efuse_dc_driving_rate;
-
- if (value <= EFUS_USB_DC_CAL_MAX)
- phy_parameter->efuse_usb_dc_cal = (int8_t)(value * rate);
- else
- phy_parameter->efuse_usb_dc_cal = -(int8_t)
- ((EFUS_USB_DC_CAL_MAX & value) * rate);
-
- if (soc_device_match(rtk_soc_groot)) {
- dev_dbg(rtk_phy->dev, "For groot IC we need a workaround to adjust efuse_usb_dc_cal\n");
-
- /* We don't multiple dc_cal_rate=2 for positive dc cal compensate */
- if (value <= EFUS_USB_DC_CAL_MAX)
- phy_parameter->efuse_usb_dc_cal = (int8_t)(value);
-
- /* We set max dc cal compensate is 0x8 if otp is 0x7 */
- if (value == 0x7)
- phy_parameter->efuse_usb_dc_cal = (int8_t)(value + 1);
- }
- } else { /* for CHECK_EFUSE_V2 */
- phy_parameter->efuse_usb_dc_cal = value & phy_cfg->dc_driving_mask;
- }
-
- /* Read efuse for usb dc disconnect level */
- value = 0;
- cell = nvmem_cell_get(rtk_phy->dev, "usb-dc-dis");
- if (IS_ERR(cell)) {
- dev_dbg(rtk_phy->dev, "%s no usb-dc-dis: %ld\n",
- __func__, PTR_ERR(cell));
- } else {
- unsigned char *buf;
- size_t buf_size;
-
- buf = nvmem_cell_read(cell, &buf_size);
- if (!IS_ERR(buf)) {
- value = buf[0] & phy_cfg->dc_disconnect_mask;
- kfree(buf);
- }
- nvmem_cell_put(cell);
- }
-
- if (phy_cfg->check_efuse_version == CHECK_EFUSE_V1) {
- int rate = phy_cfg->efuse_dc_disconnect_rate;
-
- if (value <= EFUS_USB_DC_DIS_MAX)
- phy_parameter->efuse_usb_dc_dis = (int8_t)(value * rate);
- else
- phy_parameter->efuse_usb_dc_dis = -(int8_t)
- ((EFUS_USB_DC_DIS_MAX & value) * rate);
- } else { /* for CHECK_EFUSE_V2 */
- phy_parameter->efuse_usb_dc_dis = value & phy_cfg->dc_disconnect_mask;
- }
-
-out:
- return 0;
-}
-
-static int parse_phy_data(struct rtk_phy *rtk_phy)
-{
- struct device *dev = rtk_phy->dev;
- struct device_node *np = dev->of_node;
- struct phy_parameter *phy_parameter;
- int ret = 0;
- int index;
-
- rtk_phy->phy_parameter = devm_kzalloc(dev, sizeof(struct phy_parameter) *
- rtk_phy->num_phy, GFP_KERNEL);
- if (!rtk_phy->phy_parameter)
- return -ENOMEM;
-
- for (index = 0; index < rtk_phy->num_phy; index++) {
- phy_parameter = &((struct phy_parameter *)rtk_phy->phy_parameter)[index];
-
- phy_parameter->phy_reg.reg_wrap_vstatus = of_iomap(np, 0);
- phy_parameter->phy_reg.reg_gusb2phyacc0 = of_iomap(np, 1) + index;
- phy_parameter->phy_reg.vstatus_index = index;
-
- if (of_property_read_bool(np, "realtek,inverse-hstx-sync-clock"))
- phy_parameter->inverse_hstx_sync_clock = true;
- else
- phy_parameter->inverse_hstx_sync_clock = false;
-
- if (of_property_read_u32_index(np, "realtek,driving-level",
- index, &phy_parameter->driving_level))
- phy_parameter->driving_level = DEFAULT_DC_DRIVING_VALUE;
-
- if (of_property_read_u32_index(np, "realtek,driving-level-compensate",
- index, &phy_parameter->driving_level_compensate))
- phy_parameter->driving_level_compensate = 0;
-
- if (of_property_read_u32_index(np, "realtek,disconnection-compensate",
- index, &phy_parameter->disconnection_compensate))
- phy_parameter->disconnection_compensate = 0;
-
- get_phy_data_by_efuse(rtk_phy, phy_parameter, index);
-
- update_dc_driving_level(rtk_phy, phy_parameter);
-
- update_hs_clk_select(rtk_phy, phy_parameter);
- }
-
- return ret;
-}
-
-static int rtk_usb2phy_probe(struct platform_device *pdev)
-{
- struct rtk_phy *rtk_phy;
- struct device *dev = &pdev->dev;
- struct phy *generic_phy;
- struct phy_provider *phy_provider;
- const struct phy_cfg *phy_cfg;
- int ret = 0;
-
- phy_cfg = of_device_get_match_data(dev);
- if (!phy_cfg) {
- dev_err(dev, "phy config are not assigned!\n");
- return -EINVAL;
- }
-
- rtk_phy = devm_kzalloc(dev, sizeof(*rtk_phy), GFP_KERNEL);
- if (!rtk_phy)
- return -ENOMEM;
-
- rtk_phy->dev = &pdev->dev;
- rtk_phy->phy.dev = rtk_phy->dev;
- rtk_phy->phy.label = "rtk-usb2phy";
- rtk_phy->phy.notify_port_status = rtk_phy_notify_port_status;
-
- rtk_phy->phy_cfg = devm_kzalloc(dev, sizeof(*phy_cfg), GFP_KERNEL);
-
- memcpy(rtk_phy->phy_cfg, phy_cfg, sizeof(*phy_cfg));
-
- rtk_phy->num_phy = phy_cfg->num_phy;
-
- ret = parse_phy_data(rtk_phy);
- if (ret)
- goto err;
-
- platform_set_drvdata(pdev, rtk_phy);
-
- generic_phy = devm_phy_create(rtk_phy->dev, NULL, &ops);
- if (IS_ERR(generic_phy))
- return PTR_ERR(generic_phy);
-
- phy_set_drvdata(generic_phy, rtk_phy);
-
- phy_provider = devm_of_phy_provider_register(rtk_phy->dev,
- of_phy_simple_xlate);
- if (IS_ERR(phy_provider))
- return PTR_ERR(phy_provider);
-
- ret = usb_add_phy_dev(&rtk_phy->phy);
- if (ret)
- goto err;
-
- create_debug_files(rtk_phy);
-
-err:
- return ret;
-}
-
-static void rtk_usb2phy_remove(struct platform_device *pdev)
-{
- struct rtk_phy *rtk_phy = platform_get_drvdata(pdev);
-
- remove_debug_files(rtk_phy);
-
- usb_remove_phy(&rtk_phy->phy);
-}
-
-static const struct phy_cfg rtd1295_phy_cfg = {
- .page0_size = MAX_USB_PHY_PAGE0_DATA_SIZE,
- .page0 = { [0] = {0xe0, 0x90},
- [3] = {0xe3, 0x3a},
- [4] = {0xe4, 0x68},
- [6] = {0xe6, 0x91},
- [13] = {0xf5, 0x81},
- [15] = {0xf7, 0x02}, },
- .page1_size = 8,
- .page1 = { /* default parameter */ },
- .page2_size = 0,
- .page2 = { /* no parameter */ },
- .num_phy = 1,
- .check_efuse = false,
- .check_efuse_version = CHECK_EFUSE_V1,
- .efuse_dc_driving_rate = 1,
- .dc_driving_mask = 0xf,
- .efuse_dc_disconnect_rate = EFUS_USB_DC_DIS_RATE,
- .dc_disconnect_mask = 0xf,
- .usb_dc_disconnect_at_page0 = true,
- .do_toggle = true,
- .do_toggle_driving = false,
- .driving_updated_for_dev_dis = 0xf,
- .use_default_parameter = false,
- .is_double_sensitivity_mode = false,
-};
-
-static const struct phy_cfg rtd1395_phy_cfg = {
- .page0_size = MAX_USB_PHY_PAGE0_DATA_SIZE,
- .page0 = { [4] = {0xe4, 0xac},
- [13] = {0xf5, 0x00},
- [15] = {0xf7, 0x02}, },
- .page1_size = 8,
- .page1 = { /* default parameter */ },
- .page2_size = 0,
- .page2 = { /* no parameter */ },
- .num_phy = 1,
- .check_efuse = false,
- .check_efuse_version = CHECK_EFUSE_V1,
- .efuse_dc_driving_rate = 1,
- .dc_driving_mask = 0xf,
- .efuse_dc_disconnect_rate = EFUS_USB_DC_DIS_RATE,
- .dc_disconnect_mask = 0xf,
- .usb_dc_disconnect_at_page0 = true,
- .do_toggle = true,
- .do_toggle_driving = false,
- .driving_updated_for_dev_dis = 0xf,
- .use_default_parameter = false,
- .is_double_sensitivity_mode = false,
-};
-
-static const struct phy_cfg rtd1395_phy_cfg_2port = {
- .page0_size = MAX_USB_PHY_PAGE0_DATA_SIZE,
- .page0 = { [4] = {0xe4, 0xac},
- [13] = {0xf5, 0x00},
- [15] = {0xf7, 0x02}, },
- .page1_size = 8,
- .page1 = { /* default parameter */ },
- .page2_size = 0,
- .page2 = { /* no parameter */ },
- .num_phy = 2,
- .check_efuse = false,
- .check_efuse_version = CHECK_EFUSE_V1,
- .efuse_dc_driving_rate = 1,
- .dc_driving_mask = 0xf,
- .efuse_dc_disconnect_rate = EFUS_USB_DC_DIS_RATE,
- .dc_disconnect_mask = 0xf,
- .usb_dc_disconnect_at_page0 = true,
- .do_toggle = true,
- .do_toggle_driving = false,
- .driving_updated_for_dev_dis = 0xf,
- .use_default_parameter = false,
- .is_double_sensitivity_mode = false,
-};
-
-static const struct phy_cfg rtd1619_phy_cfg = {
- .page0_size = MAX_USB_PHY_PAGE0_DATA_SIZE,
- .page0 = { [4] = {0xe4, 0x68}, },
- .page1_size = 8,
- .page1 = { /* default parameter */ },
- .page2_size = 0,
- .page2 = { /* no parameter */ },
- .num_phy = 1,
- .check_efuse = true,
- .check_efuse_version = CHECK_EFUSE_V1,
- .efuse_dc_driving_rate = 1,
- .dc_driving_mask = 0xf,
- .efuse_dc_disconnect_rate = EFUS_USB_DC_DIS_RATE,
- .dc_disconnect_mask = 0xf,
- .usb_dc_disconnect_at_page0 = true,
- .do_toggle = true,
- .do_toggle_driving = false,
- .driving_updated_for_dev_dis = 0xf,
- .use_default_parameter = false,
- .is_double_sensitivity_mode = false,
-};
-
-static const struct phy_cfg rtd1319_phy_cfg = {
- .page0_size = MAX_USB_PHY_PAGE0_DATA_SIZE,
- .page0 = { [0] = {0xe0, 0x18},
- [4] = {0xe4, 0x6a},
- [7] = {0xe7, 0x71},
- [13] = {0xf5, 0x15},
- [15] = {0xf7, 0x32}, },
- .page1_size = 8,
- .page1 = { [3] = {0xe3, 0x44}, },
- .page2_size = MAX_USB_PHY_PAGE2_DATA_SIZE,
- .page2 = { [0] = {0xe0, 0x01}, },
- .num_phy = 1,
- .check_efuse = true,
- .check_efuse_version = CHECK_EFUSE_V1,
- .efuse_dc_driving_rate = 1,
- .dc_driving_mask = 0xf,
- .efuse_dc_disconnect_rate = EFUS_USB_DC_DIS_RATE,
- .dc_disconnect_mask = 0xf,
- .usb_dc_disconnect_at_page0 = true,
- .do_toggle = true,
- .do_toggle_driving = true,
- .driving_updated_for_dev_dis = 0xf,
- .use_default_parameter = false,
- .is_double_sensitivity_mode = true,
-};
-
-static const struct phy_cfg rtd1312c_phy_cfg = {
- .page0_size = MAX_USB_PHY_PAGE0_DATA_SIZE,
- .page0 = { [0] = {0xe0, 0x14},
- [4] = {0xe4, 0x67},
- [5] = {0xe5, 0x55}, },
- .page1_size = 8,
- .page1 = { [3] = {0xe3, 0x23},
- [6] = {0xe6, 0x58}, },
- .page2_size = MAX_USB_PHY_PAGE2_DATA_SIZE,
- .page2 = { /* default parameter */ },
- .num_phy = 1,
- .check_efuse = true,
- .check_efuse_version = CHECK_EFUSE_V1,
- .efuse_dc_driving_rate = 1,
- .dc_driving_mask = 0xf,
- .efuse_dc_disconnect_rate = EFUS_USB_DC_DIS_RATE,
- .dc_disconnect_mask = 0xf,
- .usb_dc_disconnect_at_page0 = true,
- .do_toggle = true,
- .do_toggle_driving = true,
- .driving_updated_for_dev_dis = 0xf,
- .use_default_parameter = false,
- .is_double_sensitivity_mode = true,
-};
-
-static const struct phy_cfg rtd1619b_phy_cfg = {
- .page0_size = MAX_USB_PHY_PAGE0_DATA_SIZE,
- .page0 = { [0] = {0xe0, 0xa3},
- [4] = {0xe4, 0x88},
- [5] = {0xe5, 0x4f},
- [6] = {0xe6, 0x02}, },
- .page1_size = 8,
- .page1 = { [3] = {0xe3, 0x64}, },
- .page2_size = MAX_USB_PHY_PAGE2_DATA_SIZE,
- .page2 = { [7] = {0xe7, 0x45}, },
- .num_phy = 1,
- .check_efuse = true,
- .check_efuse_version = CHECK_EFUSE_V1,
- .efuse_dc_driving_rate = EFUS_USB_DC_CAL_RATE,
- .dc_driving_mask = 0x1f,
- .efuse_dc_disconnect_rate = EFUS_USB_DC_DIS_RATE,
- .dc_disconnect_mask = 0xf,
- .usb_dc_disconnect_at_page0 = false,
- .do_toggle = true,
- .do_toggle_driving = true,
- .driving_updated_for_dev_dis = 0x8,
- .use_default_parameter = false,
- .is_double_sensitivity_mode = true,
-};
-
-static const struct phy_cfg rtd1319d_phy_cfg = {
- .page0_size = MAX_USB_PHY_PAGE0_DATA_SIZE,
- .page0 = { [0] = {0xe0, 0xa3},
- [4] = {0xe4, 0x8e},
- [5] = {0xe5, 0x4f},
- [6] = {0xe6, 0x02}, },
- .page1_size = MAX_USB_PHY_PAGE1_DATA_SIZE,
- .page1 = { [14] = {0xf5, 0x1}, },
- .page2_size = MAX_USB_PHY_PAGE2_DATA_SIZE,
- .page2 = { [7] = {0xe7, 0x44}, },
- .check_efuse = true,
- .num_phy = 1,
- .check_efuse_version = CHECK_EFUSE_V1,
- .efuse_dc_driving_rate = EFUS_USB_DC_CAL_RATE,
- .dc_driving_mask = 0x1f,
- .efuse_dc_disconnect_rate = EFUS_USB_DC_DIS_RATE,
- .dc_disconnect_mask = 0xf,
- .usb_dc_disconnect_at_page0 = false,
- .do_toggle = true,
- .do_toggle_driving = false,
- .driving_updated_for_dev_dis = 0x8,
- .use_default_parameter = false,
- .is_double_sensitivity_mode = true,
-};
-
-static const struct phy_cfg rtd1315e_phy_cfg = {
- .page0_size = MAX_USB_PHY_PAGE0_DATA_SIZE,
- .page0 = { [0] = {0xe0, 0xa3},
- [4] = {0xe4, 0x8c},
- [5] = {0xe5, 0x4f},
- [6] = {0xe6, 0x02}, },
- .page1_size = MAX_USB_PHY_PAGE1_DATA_SIZE,
- .page1 = { [3] = {0xe3, 0x7f},
- [14] = {0xf5, 0x01}, },
- .page2_size = MAX_USB_PHY_PAGE2_DATA_SIZE,
- .page2 = { [7] = {0xe7, 0x44}, },
- .num_phy = 1,
- .check_efuse = true,
- .check_efuse_version = CHECK_EFUSE_V2,
- .efuse_dc_driving_rate = EFUS_USB_DC_CAL_RATE,
- .dc_driving_mask = 0x1f,
- .efuse_dc_disconnect_rate = EFUS_USB_DC_DIS_RATE,
- .dc_disconnect_mask = 0xf,
- .usb_dc_disconnect_at_page0 = false,
- .do_toggle = true,
- .do_toggle_driving = false,
- .driving_updated_for_dev_dis = 0x8,
- .use_default_parameter = false,
- .is_double_sensitivity_mode = true,
-};
-
-static const struct of_device_id usbphy_rtk_dt_match[] = {
- { .compatible = "realtek,rtd1295-usb2phy", .data = &rtd1295_phy_cfg },
- { .compatible = "realtek,rtd1312c-usb2phy", .data = &rtd1312c_phy_cfg },
- { .compatible = "realtek,rtd1315e-usb2phy", .data = &rtd1315e_phy_cfg },
- { .compatible = "realtek,rtd1319-usb2phy", .data = &rtd1319_phy_cfg },
- { .compatible = "realtek,rtd1319d-usb2phy", .data = &rtd1319d_phy_cfg },
- { .compatible = "realtek,rtd1395-usb2phy", .data = &rtd1395_phy_cfg },
- { .compatible = "realtek,rtd1395-usb2phy-2port", .data = &rtd1395_phy_cfg_2port },
- { .compatible = "realtek,rtd1619-usb2phy", .data = &rtd1619_phy_cfg },
- { .compatible = "realtek,rtd1619b-usb2phy", .data = &rtd1619b_phy_cfg },
- {},
-};
-MODULE_DEVICE_TABLE(of, usbphy_rtk_dt_match);
-
-static struct platform_driver rtk_usb2phy_driver = {
- .probe = rtk_usb2phy_probe,
- .remove_new = rtk_usb2phy_remove,
- .driver = {
- .name = "rtk-usb2phy",
- .of_match_table = usbphy_rtk_dt_match,
- },
-};
-
-module_platform_driver(rtk_usb2phy_driver);
-
-MODULE_LICENSE("GPL");
-MODULE_ALIAS("platform: rtk-usb2phy");
-MODULE_AUTHOR("Stanley Chang <stanley_chang@realtek.com>");
-MODULE_DESCRIPTION("Realtek usb 2.0 phy driver");
+++ /dev/null
-// SPDX-License-Identifier: GPL-2.0
-/*
- * phy-rtk-usb3.c RTK usb3.0 phy driver
- *
- * copyright (c) 2023 realtek semiconductor corporation
- *
- */
-
-#include <linux/module.h>
-#include <linux/of.h>
-#include <linux/of_address.h>
-#include <linux/platform_device.h>
-#include <linux/uaccess.h>
-#include <linux/debugfs.h>
-#include <linux/nvmem-consumer.h>
-#include <linux/regmap.h>
-#include <linux/sys_soc.h>
-#include <linux/mfd/syscon.h>
-#include <linux/phy/phy.h>
-#include <linux/usb.h>
-#include <linux/usb/hcd.h>
-#include <linux/usb/phy.h>
-
-#define USB_MDIO_CTRL_PHY_BUSY BIT(7)
-#define USB_MDIO_CTRL_PHY_WRITE BIT(0)
-#define USB_MDIO_CTRL_PHY_ADDR_SHIFT 8
-#define USB_MDIO_CTRL_PHY_DATA_SHIFT 16
-
-#define MAX_USB_PHY_DATA_SIZE 0x30
-#define PHY_ADDR_0X09 0x09
-#define PHY_ADDR_0X0B 0x0b
-#define PHY_ADDR_0X0D 0x0d
-#define PHY_ADDR_0X10 0x10
-#define PHY_ADDR_0X1F 0x1f
-#define PHY_ADDR_0X20 0x20
-#define PHY_ADDR_0X21 0x21
-#define PHY_ADDR_0X30 0x30
-
-#define REG_0X09_FORCE_CALIBRATION BIT(9)
-#define REG_0X0B_RX_OFFSET_RANGE_MASK 0xc
-#define REG_0X0D_RX_DEBUG_TEST_EN BIT(6)
-#define REG_0X10_DEBUG_MODE_SETTING 0x3c0
-#define REG_0X10_DEBUG_MODE_SETTING_MASK 0x3f8
-#define REG_0X1F_RX_OFFSET_CODE_MASK 0x1e
-
-#define USB_U3_TX_LFPS_SWING_TRIM_SHIFT 4
-#define USB_U3_TX_LFPS_SWING_TRIM_MASK 0xf
-#define AMPLITUDE_CONTROL_COARSE_MASK 0xff
-#define AMPLITUDE_CONTROL_FINE_MASK 0xffff
-#define AMPLITUDE_CONTROL_COARSE_DEFAULT 0xff
-#define AMPLITUDE_CONTROL_FINE_DEFAULT 0xffff
-
-#define PHY_ADDR_MAP_ARRAY_INDEX(addr) (addr)
-#define ARRAY_INDEX_MAP_PHY_ADDR(index) (index)
-
-struct phy_reg {
- void __iomem *reg_mdio_ctl;
-};
-
-struct phy_data {
- u8 addr;
- u16 data;
-};
-
-struct phy_cfg {
- int param_size;
- struct phy_data param[MAX_USB_PHY_DATA_SIZE];
-
- bool check_efuse;
- bool do_toggle;
- bool do_toggle_once;
- bool use_default_parameter;
- bool check_rx_front_end_offset;
-};
-
-struct phy_parameter {
- struct phy_reg phy_reg;
-
- /* Get from efuse */
- u8 efuse_usb_u3_tx_lfps_swing_trim;
-
- /* Get from dts */
- u32 amplitude_control_coarse;
- u32 amplitude_control_fine;
-};
-
-struct rtk_phy {
- struct usb_phy phy;
- struct device *dev;
-
- struct phy_cfg *phy_cfg;
- int num_phy;
- struct phy_parameter *phy_parameter;
-
- struct dentry *debug_dir;
-};
-
-#define PHY_IO_TIMEOUT_USEC (50000)
-#define PHY_IO_DELAY_US (100)
-
-static inline int utmi_wait_register(void __iomem *reg, u32 mask, u32 result)
-{
- int ret;
- unsigned int val;
-
- ret = read_poll_timeout(readl, val, ((val & mask) == result),
- PHY_IO_DELAY_US, PHY_IO_TIMEOUT_USEC, false, reg);
- if (ret) {
- pr_err("%s can't program USB phy\n", __func__);
- return -ETIMEDOUT;
- }
-
- return 0;
-}
-
-static int rtk_phy3_wait_vbusy(struct phy_reg *phy_reg)
-{
- return utmi_wait_register(phy_reg->reg_mdio_ctl, USB_MDIO_CTRL_PHY_BUSY, 0);
-}
-
-static u16 rtk_phy_read(struct phy_reg *phy_reg, char addr)
-{
- unsigned int tmp;
- u32 value;
-
- tmp = (addr << USB_MDIO_CTRL_PHY_ADDR_SHIFT);
-
- writel(tmp, phy_reg->reg_mdio_ctl);
-
- rtk_phy3_wait_vbusy(phy_reg);
-
- value = readl(phy_reg->reg_mdio_ctl);
- value = value >> USB_MDIO_CTRL_PHY_DATA_SHIFT;
-
- return (u16)value;
-}
-
-static int rtk_phy_write(struct phy_reg *phy_reg, char addr, u16 data)
-{
- unsigned int val;
-
- val = USB_MDIO_CTRL_PHY_WRITE |
- (addr << USB_MDIO_CTRL_PHY_ADDR_SHIFT) |
- (data << USB_MDIO_CTRL_PHY_DATA_SHIFT);
-
- writel(val, phy_reg->reg_mdio_ctl);
-
- rtk_phy3_wait_vbusy(phy_reg);
-
- return 0;
-}
-
-static void do_rtk_usb3_phy_toggle(struct rtk_phy *rtk_phy, int index, bool connect)
-{
- struct phy_cfg *phy_cfg = rtk_phy->phy_cfg;
- struct phy_reg *phy_reg;
- struct phy_parameter *phy_parameter;
- struct phy_data *phy_data;
- u8 addr;
- u16 data;
- int i;
-
- phy_parameter = &((struct phy_parameter *)rtk_phy->phy_parameter)[index];
- phy_reg = &phy_parameter->phy_reg;
-
- if (!phy_cfg->do_toggle)
- return;
-
- i = PHY_ADDR_MAP_ARRAY_INDEX(PHY_ADDR_0X09);
- phy_data = phy_cfg->param + i;
- addr = phy_data->addr;
- data = phy_data->data;
-
- if (!addr && !data) {
- addr = PHY_ADDR_0X09;
- data = rtk_phy_read(phy_reg, addr);
- phy_data->addr = addr;
- phy_data->data = data;
- }
-
- rtk_phy_write(phy_reg, addr, data & (~REG_0X09_FORCE_CALIBRATION));
- mdelay(1);
- rtk_phy_write(phy_reg, addr, data | REG_0X09_FORCE_CALIBRATION);
-}
-
-static int do_rtk_phy_init(struct rtk_phy *rtk_phy, int index)
-{
- struct phy_cfg *phy_cfg;
- struct phy_reg *phy_reg;
- struct phy_parameter *phy_parameter;
- int i = 0;
-
- phy_cfg = rtk_phy->phy_cfg;
- phy_parameter = &((struct phy_parameter *)rtk_phy->phy_parameter)[index];
- phy_reg = &phy_parameter->phy_reg;
-
- if (phy_cfg->use_default_parameter)
- goto do_toggle;
-
- for (i = 0; i < phy_cfg->param_size; i++) {
- struct phy_data *phy_data = phy_cfg->param + i;
- u8 addr = phy_data->addr;
- u16 data = phy_data->data;
-
- if (!addr && !data)
- continue;
-
- rtk_phy_write(phy_reg, addr, data);
- }
-
-do_toggle:
- if (phy_cfg->do_toggle_once)
- phy_cfg->do_toggle = true;
-
- do_rtk_usb3_phy_toggle(rtk_phy, index, false);
-
- if (phy_cfg->do_toggle_once) {
- u16 check_value = 0;
- int count = 10;
- u16 value_0x0d, value_0x10;
-
- /* Enable Debug mode by set 0x0D and 0x10 */
- value_0x0d = rtk_phy_read(phy_reg, PHY_ADDR_0X0D);
- value_0x10 = rtk_phy_read(phy_reg, PHY_ADDR_0X10);
-
- rtk_phy_write(phy_reg, PHY_ADDR_0X0D,
- value_0x0d | REG_0X0D_RX_DEBUG_TEST_EN);
- rtk_phy_write(phy_reg, PHY_ADDR_0X10,
- (value_0x10 & ~REG_0X10_DEBUG_MODE_SETTING_MASK) |
- REG_0X10_DEBUG_MODE_SETTING);
-
- check_value = rtk_phy_read(phy_reg, PHY_ADDR_0X30);
-
- while (!(check_value & BIT(15))) {
- check_value = rtk_phy_read(phy_reg, PHY_ADDR_0X30);
- mdelay(1);
- if (count-- < 0)
- break;
- }
-
- if (!(check_value & BIT(15)))
- dev_info(rtk_phy->dev, "toggle fail addr=0x%02x, data=0x%04x\n",
- PHY_ADDR_0X30, check_value);
-
- /* Disable Debug mode by set 0x0D and 0x10 to default*/
- rtk_phy_write(phy_reg, PHY_ADDR_0X0D, value_0x0d);
- rtk_phy_write(phy_reg, PHY_ADDR_0X10, value_0x10);
-
- phy_cfg->do_toggle = false;
- }
-
- if (phy_cfg->check_rx_front_end_offset) {
- u16 rx_offset_code, rx_offset_range;
- u16 code_mask = REG_0X1F_RX_OFFSET_CODE_MASK;
- u16 range_mask = REG_0X0B_RX_OFFSET_RANGE_MASK;
- bool do_update = false;
-
- rx_offset_code = rtk_phy_read(phy_reg, PHY_ADDR_0X1F);
- if (((rx_offset_code & code_mask) == 0x0) ||
- ((rx_offset_code & code_mask) == code_mask))
- do_update = true;
-
- rx_offset_range = rtk_phy_read(phy_reg, PHY_ADDR_0X0B);
- if (((rx_offset_range & range_mask) == range_mask) && do_update) {
- dev_warn(rtk_phy->dev, "Don't update rx_offset_range (rx_offset_code=0x%x, rx_offset_range=0x%x)\n",
- rx_offset_code, rx_offset_range);
- do_update = false;
- }
-
- if (do_update) {
- u16 tmp1, tmp2;
-
- tmp1 = rx_offset_range & (~range_mask);
- tmp2 = rx_offset_range & range_mask;
- tmp2 += (1 << 2);
- rx_offset_range = tmp1 | (tmp2 & range_mask);
- rtk_phy_write(phy_reg, PHY_ADDR_0X0B, rx_offset_range);
- goto do_toggle;
- }
- }
-
- return 0;
-}
-
-static int rtk_phy_init(struct phy *phy)
-{
- struct rtk_phy *rtk_phy = phy_get_drvdata(phy);
- int ret = 0;
- int i;
- unsigned long phy_init_time = jiffies;
-
- for (i = 0; i < rtk_phy->num_phy; i++)
- ret = do_rtk_phy_init(rtk_phy, i);
-
- dev_dbg(rtk_phy->dev, "Initialized RTK USB 3.0 PHY (take %dms)\n",
- jiffies_to_msecs(jiffies - phy_init_time));
-
- return ret;
-}
-
-static int rtk_phy_exit(struct phy *phy)
-{
- return 0;
-}
-
-static const struct phy_ops ops = {
- .init = rtk_phy_init,
- .exit = rtk_phy_exit,
- .owner = THIS_MODULE,
-};
-
-static void rtk_phy_toggle(struct usb_phy *usb3_phy, bool connect, int port)
-{
- int index = port;
- struct rtk_phy *rtk_phy = NULL;
-
- rtk_phy = dev_get_drvdata(usb3_phy->dev);
-
- if (index > rtk_phy->num_phy) {
- dev_err(rtk_phy->dev, "%s: The port=%d is not in usb phy (num_phy=%d)\n",
- __func__, index, rtk_phy->num_phy);
- return;
- }
-
- do_rtk_usb3_phy_toggle(rtk_phy, index, connect);
-}
-
-static int rtk_phy_notify_port_status(struct usb_phy *x, int port,
- u16 portstatus, u16 portchange)
-{
- bool connect = false;
-
- pr_debug("%s port=%d portstatus=0x%x portchange=0x%x\n",
- __func__, port, (int)portstatus, (int)portchange);
- if (portstatus & USB_PORT_STAT_CONNECTION)
- connect = true;
-
- if (portchange & USB_PORT_STAT_C_CONNECTION)
- rtk_phy_toggle(x, connect, port);
-
- return 0;
-}
-
-#ifdef CONFIG_DEBUG_FS
-static struct dentry *create_phy_debug_root(void)
-{
- struct dentry *phy_debug_root;
-
- phy_debug_root = debugfs_lookup("phy", usb_debug_root);
- if (!phy_debug_root)
- phy_debug_root = debugfs_create_dir("phy", usb_debug_root);
-
- return phy_debug_root;
-}
-
-static int rtk_usb3_parameter_show(struct seq_file *s, void *unused)
-{
- struct rtk_phy *rtk_phy = s->private;
- struct phy_cfg *phy_cfg;
- int i, index;
-
- phy_cfg = rtk_phy->phy_cfg;
-
- seq_puts(s, "Property:\n");
- seq_printf(s, " check_efuse: %s\n",
- phy_cfg->check_efuse ? "Enable" : "Disable");
- seq_printf(s, " do_toggle: %s\n",
- phy_cfg->do_toggle ? "Enable" : "Disable");
- seq_printf(s, " do_toggle_once: %s\n",
- phy_cfg->do_toggle_once ? "Enable" : "Disable");
- seq_printf(s, " use_default_parameter: %s\n",
- phy_cfg->use_default_parameter ? "Enable" : "Disable");
-
- for (index = 0; index < rtk_phy->num_phy; index++) {
- struct phy_reg *phy_reg;
- struct phy_parameter *phy_parameter;
-
- phy_parameter = &((struct phy_parameter *)rtk_phy->phy_parameter)[index];
- phy_reg = &phy_parameter->phy_reg;
-
- seq_printf(s, "PHY %d:\n", index);
-
- for (i = 0; i < phy_cfg->param_size; i++) {
- struct phy_data *phy_data = phy_cfg->param + i;
- u8 addr = ARRAY_INDEX_MAP_PHY_ADDR(i);
- u16 data = phy_data->data;
-
- if (!phy_data->addr && !data)
- seq_printf(s, " addr = 0x%02x, data = none ==> read value = 0x%04x\n",
- addr, rtk_phy_read(phy_reg, addr));
- else
- seq_printf(s, " addr = 0x%02x, data = 0x%04x ==> read value = 0x%04x\n",
- addr, data, rtk_phy_read(phy_reg, addr));
- }
-
- seq_puts(s, "PHY Property:\n");
- seq_printf(s, " efuse_usb_u3_tx_lfps_swing_trim: 0x%x\n",
- (int)phy_parameter->efuse_usb_u3_tx_lfps_swing_trim);
- seq_printf(s, " amplitude_control_coarse: 0x%x\n",
- (int)phy_parameter->amplitude_control_coarse);
- seq_printf(s, " amplitude_control_fine: 0x%x\n",
- (int)phy_parameter->amplitude_control_fine);
- }
-
- return 0;
-}
-DEFINE_SHOW_ATTRIBUTE(rtk_usb3_parameter);
-
-static inline void create_debug_files(struct rtk_phy *rtk_phy)
-{
- struct dentry *phy_debug_root = NULL;
-
- phy_debug_root = create_phy_debug_root();
-
- if (!phy_debug_root)
- return;
-
- rtk_phy->debug_dir = debugfs_create_dir(dev_name(rtk_phy->dev), phy_debug_root);
-
- debugfs_create_file("parameter", 0444, rtk_phy->debug_dir, rtk_phy,
- &rtk_usb3_parameter_fops);
-
- return;
-}
-
-static inline void remove_debug_files(struct rtk_phy *rtk_phy)
-{
- debugfs_remove_recursive(rtk_phy->debug_dir);
-}
-#else
-static inline void create_debug_files(struct rtk_phy *rtk_phy) { }
-static inline void remove_debug_files(struct rtk_phy *rtk_phy) { }
-#endif /* CONFIG_DEBUG_FS */
-
-static int get_phy_data_by_efuse(struct rtk_phy *rtk_phy,
- struct phy_parameter *phy_parameter, int index)
-{
- struct phy_cfg *phy_cfg = rtk_phy->phy_cfg;
- u8 value = 0;
- struct nvmem_cell *cell;
-
- if (!phy_cfg->check_efuse)
- goto out;
-
- cell = nvmem_cell_get(rtk_phy->dev, "usb_u3_tx_lfps_swing_trim");
- if (IS_ERR(cell)) {
- dev_dbg(rtk_phy->dev, "%s no usb_u3_tx_lfps_swing_trim: %ld\n",
- __func__, PTR_ERR(cell));
- } else {
- unsigned char *buf;
- size_t buf_size;
-
- buf = nvmem_cell_read(cell, &buf_size);
- if (!IS_ERR(buf)) {
- value = buf[0] & USB_U3_TX_LFPS_SWING_TRIM_MASK;
- kfree(buf);
- }
- nvmem_cell_put(cell);
- }
-
- if (value > 0 && value < 0x8)
- phy_parameter->efuse_usb_u3_tx_lfps_swing_trim = 0x8;
- else
- phy_parameter->efuse_usb_u3_tx_lfps_swing_trim = (u8)value;
-
-out:
- return 0;
-}
-
-static void update_amplitude_control_value(struct rtk_phy *rtk_phy,
- struct phy_parameter *phy_parameter)
-{
- struct phy_cfg *phy_cfg;
- struct phy_reg *phy_reg;
-
- phy_reg = &phy_parameter->phy_reg;
- phy_cfg = rtk_phy->phy_cfg;
-
- if (phy_parameter->amplitude_control_coarse != AMPLITUDE_CONTROL_COARSE_DEFAULT) {
- u16 val_mask = AMPLITUDE_CONTROL_COARSE_MASK;
- u16 data;
-
- if (!phy_cfg->param[PHY_ADDR_0X20].addr && !phy_cfg->param[PHY_ADDR_0X20].data) {
- phy_cfg->param[PHY_ADDR_0X20].addr = PHY_ADDR_0X20;
- data = rtk_phy_read(phy_reg, PHY_ADDR_0X20);
- } else {
- data = phy_cfg->param[PHY_ADDR_0X20].data;
- }
-
- data &= (~val_mask);
- data |= (phy_parameter->amplitude_control_coarse & val_mask);
-
- phy_cfg->param[PHY_ADDR_0X20].data = data;
- }
-
- if (phy_parameter->efuse_usb_u3_tx_lfps_swing_trim) {
- u8 efuse_val = phy_parameter->efuse_usb_u3_tx_lfps_swing_trim;
- u16 val_mask = USB_U3_TX_LFPS_SWING_TRIM_MASK;
- int val_shift = USB_U3_TX_LFPS_SWING_TRIM_SHIFT;
- u16 data;
-
- if (!phy_cfg->param[PHY_ADDR_0X20].addr && !phy_cfg->param[PHY_ADDR_0X20].data) {
- phy_cfg->param[PHY_ADDR_0X20].addr = PHY_ADDR_0X20;
- data = rtk_phy_read(phy_reg, PHY_ADDR_0X20);
- } else {
- data = phy_cfg->param[PHY_ADDR_0X20].data;
- }
-
- data &= ~(val_mask << val_shift);
- data |= ((efuse_val & val_mask) << val_shift);
-
- phy_cfg->param[PHY_ADDR_0X20].data = data;
- }
-
- if (phy_parameter->amplitude_control_fine != AMPLITUDE_CONTROL_FINE_DEFAULT) {
- u16 val_mask = AMPLITUDE_CONTROL_FINE_MASK;
-
- if (!phy_cfg->param[PHY_ADDR_0X21].addr && !phy_cfg->param[PHY_ADDR_0X21].data)
- phy_cfg->param[PHY_ADDR_0X21].addr = PHY_ADDR_0X21;
-
- phy_cfg->param[PHY_ADDR_0X21].data =
- phy_parameter->amplitude_control_fine & val_mask;
- }
-}
-
-static int parse_phy_data(struct rtk_phy *rtk_phy)
-{
- struct device *dev = rtk_phy->dev;
- struct phy_parameter *phy_parameter;
- int ret = 0;
- int index;
-
- rtk_phy->phy_parameter = devm_kzalloc(dev, sizeof(struct phy_parameter) *
- rtk_phy->num_phy, GFP_KERNEL);
- if (!rtk_phy->phy_parameter)
- return -ENOMEM;
-
- for (index = 0; index < rtk_phy->num_phy; index++) {
- phy_parameter = &((struct phy_parameter *)rtk_phy->phy_parameter)[index];
-
- phy_parameter->phy_reg.reg_mdio_ctl = of_iomap(dev->of_node, 0) + index;
-
- /* Amplitude control address 0x20 bit 0 to bit 7 */
- if (of_property_read_u32(dev->of_node, "realtek,amplitude-control-coarse-tuning",
- &phy_parameter->amplitude_control_coarse))
- phy_parameter->amplitude_control_coarse = AMPLITUDE_CONTROL_COARSE_DEFAULT;
-
- /* Amplitude control address 0x21 bit 0 to bit 16 */
- if (of_property_read_u32(dev->of_node, "realtek,amplitude-control-fine-tuning",
- &phy_parameter->amplitude_control_fine))
- phy_parameter->amplitude_control_fine = AMPLITUDE_CONTROL_FINE_DEFAULT;
-
- get_phy_data_by_efuse(rtk_phy, phy_parameter, index);
-
- update_amplitude_control_value(rtk_phy, phy_parameter);
- }
-
- return ret;
-}
-
-static int rtk_usb3phy_probe(struct platform_device *pdev)
-{
- struct rtk_phy *rtk_phy;
- struct device *dev = &pdev->dev;
- struct phy *generic_phy;
- struct phy_provider *phy_provider;
- const struct phy_cfg *phy_cfg;
- int ret;
-
- phy_cfg = of_device_get_match_data(dev);
- if (!phy_cfg) {
- dev_err(dev, "phy config are not assigned!\n");
- return -EINVAL;
- }
-
- rtk_phy = devm_kzalloc(dev, sizeof(*rtk_phy), GFP_KERNEL);
- if (!rtk_phy)
- return -ENOMEM;
-
- rtk_phy->dev = &pdev->dev;
- rtk_phy->phy.dev = rtk_phy->dev;
- rtk_phy->phy.label = "rtk-usb3phy";
- rtk_phy->phy.notify_port_status = rtk_phy_notify_port_status;
-
- rtk_phy->phy_cfg = devm_kzalloc(dev, sizeof(*phy_cfg), GFP_KERNEL);
-
- memcpy(rtk_phy->phy_cfg, phy_cfg, sizeof(*phy_cfg));
-
- rtk_phy->num_phy = 1;
-
- ret = parse_phy_data(rtk_phy);
- if (ret)
- goto err;
-
- platform_set_drvdata(pdev, rtk_phy);
-
- generic_phy = devm_phy_create(rtk_phy->dev, NULL, &ops);
- if (IS_ERR(generic_phy))
- return PTR_ERR(generic_phy);
-
- phy_set_drvdata(generic_phy, rtk_phy);
-
- phy_provider = devm_of_phy_provider_register(rtk_phy->dev, of_phy_simple_xlate);
- if (IS_ERR(phy_provider))
- return PTR_ERR(phy_provider);
-
- ret = usb_add_phy_dev(&rtk_phy->phy);
- if (ret)
- goto err;
-
- create_debug_files(rtk_phy);
-
-err:
- return ret;
-}
-
-static void rtk_usb3phy_remove(struct platform_device *pdev)
-{
- struct rtk_phy *rtk_phy = platform_get_drvdata(pdev);
-
- remove_debug_files(rtk_phy);
-
- usb_remove_phy(&rtk_phy->phy);
-}
-
-static const struct phy_cfg rtd1295_phy_cfg = {
- .param_size = MAX_USB_PHY_DATA_SIZE,
- .param = { [0] = {0x01, 0x4008}, [1] = {0x01, 0xe046},
- [2] = {0x02, 0x6046}, [3] = {0x03, 0x2779},
- [4] = {0x04, 0x72f5}, [5] = {0x05, 0x2ad3},
- [6] = {0x06, 0x000e}, [7] = {0x07, 0x2e00},
- [8] = {0x08, 0x3591}, [9] = {0x09, 0x525c},
- [10] = {0x0a, 0xa600}, [11] = {0x0b, 0xa904},
- [12] = {0x0c, 0xc000}, [13] = {0x0d, 0xef1c},
- [14] = {0x0e, 0x2000}, [15] = {0x0f, 0x0000},
- [16] = {0x10, 0x000c}, [17] = {0x11, 0x4c00},
- [18] = {0x12, 0xfc00}, [19] = {0x13, 0x0c81},
- [20] = {0x14, 0xde01}, [21] = {0x15, 0x0000},
- [22] = {0x16, 0x0000}, [23] = {0x17, 0x0000},
- [24] = {0x18, 0x0000}, [25] = {0x19, 0x4004},
- [26] = {0x1a, 0x1260}, [27] = {0x1b, 0xff00},
- [28] = {0x1c, 0xcb00}, [29] = {0x1d, 0xa03f},
- [30] = {0x1e, 0xc2e0}, [31] = {0x1f, 0x2807},
- [32] = {0x20, 0x947a}, [33] = {0x21, 0x88aa},
- [34] = {0x22, 0x0057}, [35] = {0x23, 0xab66},
- [36] = {0x24, 0x0800}, [37] = {0x25, 0x0000},
- [38] = {0x26, 0x040a}, [39] = {0x27, 0x01d6},
- [40] = {0x28, 0xf8c2}, [41] = {0x29, 0x3080},
- [42] = {0x2a, 0x3082}, [43] = {0x2b, 0x2078},
- [44] = {0x2c, 0xffff}, [45] = {0x2d, 0xffff},
- [46] = {0x2e, 0x0000}, [47] = {0x2f, 0x0040}, },
- .check_efuse = false,
- .do_toggle = true,
- .do_toggle_once = false,
- .use_default_parameter = false,
- .check_rx_front_end_offset = false,
-};
-
-static const struct phy_cfg rtd1619_phy_cfg = {
- .param_size = MAX_USB_PHY_DATA_SIZE,
- .param = { [8] = {0x08, 0x3591},
- [38] = {0x26, 0x840b},
- [40] = {0x28, 0xf842}, },
- .check_efuse = false,
- .do_toggle = true,
- .do_toggle_once = false,
- .use_default_parameter = false,
- .check_rx_front_end_offset = false,
-};
-
-static const struct phy_cfg rtd1319_phy_cfg = {
- .param_size = MAX_USB_PHY_DATA_SIZE,
- .param = { [1] = {0x01, 0xac86},
- [6] = {0x06, 0x0003},
- [9] = {0x09, 0x924c},
- [10] = {0x0a, 0xa608},
- [11] = {0x0b, 0xb905},
- [14] = {0x0e, 0x2010},
- [32] = {0x20, 0x705a},
- [33] = {0x21, 0xf645},
- [34] = {0x22, 0x0013},
- [35] = {0x23, 0xcb66},
- [41] = {0x29, 0xff00}, },
- .check_efuse = true,
- .do_toggle = true,
- .do_toggle_once = false,
- .use_default_parameter = false,
- .check_rx_front_end_offset = false,
-};
-
-static const struct phy_cfg rtd1619b_phy_cfg = {
- .param_size = MAX_USB_PHY_DATA_SIZE,
- .param = { [1] = {0x01, 0xac8c},
- [6] = {0x06, 0x0017},
- [9] = {0x09, 0x724c},
- [10] = {0x0a, 0xb610},
- [11] = {0x0b, 0xb90d},
- [13] = {0x0d, 0xef2a},
- [15] = {0x0f, 0x9050},
- [16] = {0x10, 0x000c},
- [32] = {0x20, 0x70ff},
- [34] = {0x22, 0x0013},
- [35] = {0x23, 0xdb66},
- [38] = {0x26, 0x8609},
- [41] = {0x29, 0xff13},
- [42] = {0x2a, 0x3070}, },
- .check_efuse = true,
- .do_toggle = false,
- .do_toggle_once = true,
- .use_default_parameter = false,
- .check_rx_front_end_offset = false,
-};
-
-static const struct phy_cfg rtd1319d_phy_cfg = {
- .param_size = MAX_USB_PHY_DATA_SIZE,
- .param = { [1] = {0x01, 0xac89},
- [4] = {0x04, 0xf2f5},
- [6] = {0x06, 0x0017},
- [9] = {0x09, 0x424c},
- [10] = {0x0a, 0x9610},
- [11] = {0x0b, 0x9901},
- [12] = {0x0c, 0xf000},
- [13] = {0x0d, 0xef2a},
- [14] = {0x0e, 0x1000},
- [15] = {0x0f, 0x9050},
- [32] = {0x20, 0x7077},
- [35] = {0x23, 0x0b62},
- [37] = {0x25, 0x10ec},
- [42] = {0x2a, 0x3070}, },
- .check_efuse = true,
- .do_toggle = false,
- .do_toggle_once = true,
- .use_default_parameter = false,
- .check_rx_front_end_offset = true,
-};
-
-static const struct of_device_id usbphy_rtk_dt_match[] = {
- { .compatible = "realtek,rtd1295-usb3phy", .data = &rtd1295_phy_cfg },
- { .compatible = "realtek,rtd1319-usb3phy", .data = &rtd1319_phy_cfg },
- { .compatible = "realtek,rtd1319d-usb3phy", .data = &rtd1319d_phy_cfg },
- { .compatible = "realtek,rtd1619-usb3phy", .data = &rtd1619_phy_cfg },
- { .compatible = "realtek,rtd1619b-usb3phy", .data = &rtd1619b_phy_cfg },
- {},
-};
-MODULE_DEVICE_TABLE(of, usbphy_rtk_dt_match);
-
-static struct platform_driver rtk_usb3phy_driver = {
- .probe = rtk_usb3phy_probe,
- .remove_new = rtk_usb3phy_remove,
- .driver = {
- .name = "rtk-usb3phy",
- .of_match_table = usbphy_rtk_dt_match,
- },
-};
-
-module_platform_driver(rtk_usb3phy_driver);
-
-MODULE_LICENSE("GPL");
-MODULE_ALIAS("platform: rtk-usb3phy");
-MODULE_AUTHOR("Stanley Chang <stanley_chang@realtek.com>");
-MODULE_DESCRIPTION("Realtek usb 3.0 phy driver");
phy = devm_phy_create(&pdev->dev, NULL, &sp_uphy_ops);
if (IS_ERR(phy)) {
- ret = -PTR_ERR(phy);
+ ret = PTR_ERR(phy);
return ret;
}
u32 num_ports;
u32 reg_offset;
u32 qsgmii_main_ports;
+ bool no_offset;
};
static int phy_gmii_sel_mode(struct phy *phy, enum phy_mode mode, int submode)
priv->num_ports = size / sizeof(u32);
if (!priv->num_ports)
return -EINVAL;
- priv->reg_offset = __be32_to_cpu(*offset);
+ if (!priv->no_offset)
+ priv->reg_offset = __be32_to_cpu(*offset);
}
if_phys = devm_kcalloc(dev, priv->num_ports,
dev_err(dev, "Failed to get syscon %d\n", ret);
return ret;
}
+ priv->no_offset = true;
}
ret = phy_gmii_sel_init_ports(priv);
config PINCTRL_LOCHNAGAR
tristate "Cirrus Logic Lochnagar pinctrl driver"
- depends on MFD_LOCHNAGAR
+ # Avoid clash caused by MIPS defining RST, which is used in the driver
+ depends on MFD_LOCHNAGAR && !MIPS
select GPIOLIB
select PINMUX
select PINCONF
static int pinctrl_commit_state(struct pinctrl *p, struct pinctrl_state *state)
{
struct pinctrl_setting *setting, *setting2;
- struct pinctrl_state *old_state = p->state;
+ struct pinctrl_state *old_state = READ_ONCE(p->state);
int ret;
- if (p->state) {
+ if (old_state) {
/*
* For each pinmux setting in the old state, forget SW's record
* of mux owner for that pingroup. Any pingroups which are
* still owned by the new state will be re-acquired by the call
* to pinmux_enable_setting() in the loop below.
*/
- list_for_each_entry(setting, &p->state->settings, node) {
+ list_for_each_entry(setting, &old_state->settings, node) {
if (setting->type != PIN_MAP_TYPE_MUX_GROUP)
continue;
pinmux_disable_setting(setting);
if (!np)
return -ENODEV;
- if (mem_regions == 0) {
- dev_err(&pdev->dev, "mem_regions is 0\n");
+ if (mem_regions == 0 || mem_regions >= 10000) {
+ dev_err(&pdev->dev, "mem_regions is invalid: %u\n", mem_regions);
return -EINVAL;
}
raw_spin_lock_irqsave(&gpio_dev->lock, flags);
gpio_dev->saved_regs[i] = readl(gpio_dev->base + pin * 4) & ~PIN_IRQ_PENDING;
+
+ /* mask any interrupts not intended to be a wake source */
+ if (!(gpio_dev->saved_regs[i] & WAKE_SOURCE)) {
+ writel(gpio_dev->saved_regs[i] & ~BIT(INTERRUPT_MASK_OFF),
+ gpio_dev->base + pin * 4);
+ pm_pr_dbg("Disabling GPIO #%d interrupt for suspend.\n",
+ pin);
+ }
+
raw_spin_unlock_irqrestore(&gpio_dev->lock, flags);
}
#define FUNCTION_MASK GENMASK(1, 0)
#define FUNCTION_INVALID GENMASK(7, 0)
+#define WAKE_SOURCE (BIT(WAKE_CNTRL_OFF_S0I3) | \
+ BIT(WAKE_CNTRL_OFF_S3) | \
+ BIT(WAKE_CNTRL_OFF_S4) | \
+ BIT(WAKECNTRL_Z_OFF))
+
struct amd_function {
const char *name;
const char * const groups[NSELECTS];
}
};
+/*
+ * This lock class allows to tell lockdep that parent IRQ and children IRQ do
+ * not share the same class so it does not raise false positive
+ */
+static struct lock_class_key atmel_lock_key;
+static struct lock_class_key atmel_request_key;
+
static int atmel_pinctrl_probe(struct platform_device *pdev)
{
struct device *dev = &pdev->dev;
irq_set_chip_and_handler(irq, &atmel_gpio_irq_chip,
handle_simple_irq);
irq_set_chip_data(irq, atmel_pioctrl);
+ irq_set_lockdep_class(irq, &atmel_lock_key, &atmel_request_key);
dev_dbg(dev,
"atmel gpio irq domain: hwirq: %d, linux irq: %d\n",
i, irq);
* @pinctrl_desc: pin controller description
* @name: Chip controller name
* @tpin: Total number of pins
+ * @gpio_reset: GPIO line handler that can reset the IC
*/
struct cy8c95x0_pinctrl {
struct regmap *regmap;
"gp77",
};
+static int cy8c95x0_pinmux_direction(struct cy8c95x0_pinctrl *chip,
+ unsigned int pin, bool input);
+
static inline u8 cypress_get_port(struct cy8c95x0_pinctrl *chip, unsigned int pin)
{
/* Account for GPORT2 which only has 4 bits */
ret = regmap_read(chip->regmap, reg, ®_val);
if (reg_val & bit)
arg = 1;
+ if (param == PIN_CONFIG_OUTPUT_ENABLE)
+ arg = !arg;
*config = pinconf_to_config_packed(param, (u16)arg);
out:
u8 port = cypress_get_port(chip, off);
u8 bit = cypress_get_pin_mask(chip, off);
unsigned long param = pinconf_to_config_param(config);
+ unsigned long arg = pinconf_to_config_argument(config);
unsigned int reg;
int ret;
case PIN_CONFIG_MODE_PWM:
reg = CY8C95X0_PWMSEL;
break;
+ case PIN_CONFIG_OUTPUT_ENABLE:
+ ret = cy8c95x0_pinmux_direction(chip, off, !arg);
+ goto out;
+ case PIN_CONFIG_INPUT_ENABLE:
+ ret = cy8c95x0_pinmux_direction(chip, off, arg);
+ goto out;
default:
ret = -ENOTSUPP;
goto out;
gc->get_direction = cy8c95x0_gpio_get_direction;
gc->get_multiple = cy8c95x0_gpio_get_multiple;
gc->set_multiple = cy8c95x0_gpio_set_multiple;
- gc->set_config = gpiochip_generic_config,
+ gc->set_config = gpiochip_generic_config;
gc->can_sleep = true;
gc->add_pin_ranges = cy8c95x0_add_pin_ranges;
static const struct rtd_pin_desc *rtd_pinctrl_find_mux(struct rtd_pinctrl *data, unsigned int pin)
{
- if (!data->info->muxes[pin].name)
+ if (data->info->muxes[pin].name)
return &data->info->muxes[pin];
return NULL;
static const struct rtd_pin_config_desc
*rtd_pinctrl_find_config(struct rtd_pinctrl *data, unsigned int pin)
{
- if (!data->info->configs[pin].name)
+ if (data->info->configs[pin].name)
return &data->info->configs[pin];
return NULL;
nmaps = 0;
ngroups = 0;
- for_each_child_of_node(np, child) {
+ for_each_available_child_of_node(np, child) {
int npinmux = of_property_count_u32_elems(child, "pinmux");
int npins = of_property_count_u32_elems(child, "pins");
nmaps = 0;
ngroups = 0;
mutex_lock(&sfp->mutex);
- for_each_child_of_node(np, child) {
+ for_each_available_child_of_node(np, child) {
int npins;
int i;
int ret;
ngroups = 0;
- for_each_child_of_node(np, child)
+ for_each_available_child_of_node(np, child)
ngroups += 1;
nmaps = 2 * ngroups;
nmaps = 0;
ngroups = 0;
mutex_lock(&sfp->mutex);
- for_each_child_of_node(np, child) {
+ for_each_available_child_of_node(np, child) {
int npins = of_property_count_u32_elems(child, "pinmux");
int *pins;
u32 *pinmux;
int i;
/* With few exceptions (e.g. bank 'Z'), pin number matches with pin index in array */
- pin_desc = pctl->pins + stm32_pin_nb;
- if (pin_desc->pin.number == stm32_pin_nb)
- return pin_desc;
+ if (stm32_pin_nb < pctl->npins) {
+ pin_desc = pctl->pins + stm32_pin_nb;
+ if (pin_desc->pin.number == stm32_pin_nb)
+ return pin_desc;
+ }
/* Otherwise, loop all array to find the pin with the right number */
for (i = 0; i < pctl->npins; i++) {
}
names = devm_kcalloc(dev, npins, sizeof(char *), GFP_KERNEL);
+ if (!names) {
+ err = -ENOMEM;
+ goto err_clk;
+ }
+
for (i = 0; i < npins; i++) {
stm32_pin = stm32_pctrl_get_desc_pin_from_gpio(pctl, bank, i);
if (stm32_pin && stm32_pin->pin.name)
#define MLXBF_BOOTCTL_SB_SECURE_MASK 0x03
#define MLXBF_BOOTCTL_SB_TEST_MASK 0x0c
+#define MLXBF_BOOTCTL_SB_DEV_MASK BIT(4)
#define MLXBF_SB_KEY_NUM 4
{ MLXBF_BOOTCTL_NONE, "none" },
};
+enum {
+ MLXBF_BOOTCTL_SB_LIFECYCLE_PRODUCTION = 0,
+ MLXBF_BOOTCTL_SB_LIFECYCLE_GA_SECURE = 1,
+ MLXBF_BOOTCTL_SB_LIFECYCLE_GA_NON_SECURE = 2,
+ MLXBF_BOOTCTL_SB_LIFECYCLE_RMA = 3
+};
+
static const char * const mlxbf_bootctl_lifecycle_states[] = {
- [0] = "Production",
- [1] = "GA Secured",
- [2] = "GA Non-Secured",
- [3] = "RMA",
+ [MLXBF_BOOTCTL_SB_LIFECYCLE_PRODUCTION] = "Production",
+ [MLXBF_BOOTCTL_SB_LIFECYCLE_GA_SECURE] = "GA Secured",
+ [MLXBF_BOOTCTL_SB_LIFECYCLE_GA_NON_SECURE] = "GA Non-Secured",
+ [MLXBF_BOOTCTL_SB_LIFECYCLE_RMA] = "RMA",
};
/* Log header format. */
static ssize_t lifecycle_state_show(struct device *dev,
struct device_attribute *attr, char *buf)
{
+ int status_bits;
+ int use_dev_key;
+ int test_state;
int lc_state;
- lc_state = mlxbf_bootctl_smc(MLXBF_BOOTCTL_GET_TBB_FUSE_STATUS,
- MLXBF_BOOTCTL_FUSE_STATUS_LIFECYCLE);
- if (lc_state < 0)
- return lc_state;
+ status_bits = mlxbf_bootctl_smc(MLXBF_BOOTCTL_GET_TBB_FUSE_STATUS,
+ MLXBF_BOOTCTL_FUSE_STATUS_LIFECYCLE);
+ if (status_bits < 0)
+ return status_bits;
- lc_state &=
- MLXBF_BOOTCTL_SB_TEST_MASK | MLXBF_BOOTCTL_SB_SECURE_MASK;
+ use_dev_key = status_bits & MLXBF_BOOTCTL_SB_DEV_MASK;
+ test_state = status_bits & MLXBF_BOOTCTL_SB_TEST_MASK;
+ lc_state = status_bits & MLXBF_BOOTCTL_SB_SECURE_MASK;
/*
* If the test bits are set, we specify that the current state may be
* due to using the test bits.
*/
- if (lc_state & MLXBF_BOOTCTL_SB_TEST_MASK) {
- lc_state &= MLXBF_BOOTCTL_SB_SECURE_MASK;
-
+ if (test_state) {
return sprintf(buf, "%s(test)\n",
mlxbf_bootctl_lifecycle_states[lc_state]);
+ } else if (use_dev_key &&
+ (lc_state == MLXBF_BOOTCTL_SB_LIFECYCLE_GA_SECURE)) {
+ return sprintf(buf, "Secured (development)\n");
}
return sprintf(buf, "%s\n", mlxbf_bootctl_lifecycle_states[lc_state]);
attr->dev_attr.show = mlxbf_pmc_event_list_show;
attr->nr = blk_num;
attr->dev_attr.attr.name = devm_kasprintf(dev, GFP_KERNEL, "event_list");
+ if (!attr->dev_attr.attr.name)
+ return -ENOMEM;
pmc->block[blk_num].block_attr[i] = &attr->dev_attr.attr;
attr = NULL;
attr->nr = blk_num;
attr->dev_attr.attr.name = devm_kasprintf(dev, GFP_KERNEL,
"enable");
+ if (!attr->dev_attr.attr.name)
+ return -ENOMEM;
pmc->block[blk_num].block_attr[++i] = &attr->dev_attr.attr;
attr = NULL;
}
attr->nr = blk_num;
attr->dev_attr.attr.name = devm_kasprintf(dev, GFP_KERNEL,
"counter%d", j);
+ if (!attr->dev_attr.attr.name)
+ return -ENOMEM;
pmc->block[blk_num].block_attr[++i] = &attr->dev_attr.attr;
attr = NULL;
attr->nr = blk_num;
attr->dev_attr.attr.name = devm_kasprintf(dev, GFP_KERNEL,
"event%d", j);
+ if (!attr->dev_attr.attr.name)
+ return -ENOMEM;
pmc->block[blk_num].block_attr[++i] = &attr->dev_attr.attr;
attr = NULL;
}
attr->nr = blk_num;
attr->dev_attr.attr.name = devm_kasprintf(dev, GFP_KERNEL,
events[j].evt_name);
+ if (!attr->dev_attr.attr.name)
+ return -ENOMEM;
pmc->block[blk_num].block_attr[i] = &attr->dev_attr.attr;
attr = NULL;
i++;
pmc->block[blk_num].block_attr_grp.attrs = pmc->block[blk_num].block_attr;
pmc->block[blk_num].block_attr_grp.name = devm_kasprintf(
dev, GFP_KERNEL, pmc->block_name[blk_num]);
+ if (!pmc->block[blk_num].block_attr_grp.name)
+ return -ENOMEM;
pmc->groups[pmc->group_num] = &pmc->block[blk_num].block_attr_grp;
pmc->group_num++;
pmc->hwmon_dev = devm_hwmon_device_register_with_groups(
dev, "bfperf", pmc, pmc->groups);
+ if (IS_ERR(pmc->hwmon_dev))
+ return PTR_ERR(pmc->hwmon_dev);
platform_set_drvdata(pdev, pmc);
return 0;
size_t n)
{
struct ssam_controller *ctrl;
+ int ret;
ctrl = serdev_device_get_drvdata(dev);
- return ssam_controller_receive_buf(ctrl, buf, n);
+ ret = ssam_controller_receive_buf(ctrl, buf, n);
+
+ return ret < 0 ? 0 : ret;
}
static void ssam_write_wakeup(struct serdev_device *dev)
depends on RFKILL || RFKILL = n
depends on HOTPLUG_PCI
depends on ACPI_VIDEO || ACPI_VIDEO = n
+ depends on SERIO_I8042 || SERIO_I8042 = n
select INPUT_SPARSEKMAP
select LEDS_CLASS
select NEW_LEDS
config ASUS_NB_WMI
tristate "Asus Notebook WMI Driver"
depends on ASUS_WMI
- depends on SERIO_I8042 || SERIO_I8042 = n
help
This is a driver for newer Asus notebooks. It adds extra features
like wireless radio and bluetooth control, leds, hotkeys, backlight...
struct quirk_entry {
u32 s2idle_bug_mmio;
+ bool spurious_8042;
};
static struct quirk_entry quirk_s2idle_bug = {
.s2idle_bug_mmio = 0xfed80380,
};
+static struct quirk_entry quirk_spurious_8042 = {
+ .spurious_8042 = true,
+};
+
static const struct dmi_system_id fwbug_list[] = {
{
.ident = "L14 Gen2 AMD",
DMI_MATCH(DMI_PRODUCT_NAME, "HP Laptop 15s-eq2xxx"),
}
},
+ /* https://community.frame.work/t/tracking-framework-amd-ryzen-7040-series-lid-wakeup-behavior-feedback/39128 */
+ {
+ .ident = "Framework Laptop 13 (Phoenix)",
+ .driver_data = &quirk_spurious_8042,
+ .matches = {
+ DMI_MATCH(DMI_SYS_VENDOR, "Framework"),
+ DMI_MATCH(DMI_PRODUCT_NAME, "Laptop 13 (AMD Ryzen 7040Series)"),
+ DMI_MATCH(DMI_BIOS_VERSION, "03.03"),
+ }
+ },
{}
};
{
const struct dmi_system_id *dmi_id;
+ if (dev->cpu_id == AMD_CPU_ID_CZN)
+ dev->disable_8042_wakeup = true;
+
dmi_id = dmi_first_match(fwbug_list);
if (!dmi_id)
return;
if (dev->quirks->s2idle_bug_mmio)
pr_info("Using s2idle quirk to avoid %s platform firmware bug\n",
dmi_id->ident);
+ if (dev->quirks->spurious_8042)
+ dev->disable_8042_wakeup = true;
}
#define SMU_MSG_LOG_RESET 0x07
#define SMU_MSG_LOG_DUMP_DATA 0x08
#define SMU_MSG_GET_SUP_CONSTRAINTS 0x09
-/* List of supported CPU ids */
-#define AMD_CPU_ID_RV 0x15D0
-#define AMD_CPU_ID_RN 0x1630
-#define AMD_CPU_ID_PCO AMD_CPU_ID_RV
-#define AMD_CPU_ID_CZN AMD_CPU_ID_RN
-#define AMD_CPU_ID_YC 0x14B5
-#define AMD_CPU_ID_CB 0x14D8
-#define AMD_CPU_ID_PS 0x14E8
-#define AMD_CPU_ID_SP 0x14A4
-#define PCI_DEVICE_ID_AMD_1AH_M20H_ROOT 0x1507
#define PMC_MSG_DELAY_MIN_US 50
#define RESPONSE_REGISTER_LOOP_MAX 20000
return -EINVAL;
}
-static int amd_pmc_czn_wa_irq1(struct amd_pmc_dev *pdev)
+static int amd_pmc_wa_irq1(struct amd_pmc_dev *pdev)
{
struct device *d;
int rc;
- if (!pdev->major) {
- rc = amd_pmc_get_smu_version(pdev);
- if (rc)
- return rc;
- }
+ /* cezanne platform firmware has a fix in 64.66.0 */
+ if (pdev->cpu_id == AMD_CPU_ID_CZN) {
+ if (!pdev->major) {
+ rc = amd_pmc_get_smu_version(pdev);
+ if (rc)
+ return rc;
+ }
- if (pdev->major > 64 || (pdev->major == 64 && pdev->minor > 65))
- return 0;
+ if (pdev->major > 64 || (pdev->major == 64 && pdev->minor > 65))
+ return 0;
+ }
d = bus_find_device_by_name(&serio_bus, NULL, "serio0");
if (!d)
{
struct amd_pmc_dev *pdev = dev_get_drvdata(dev);
- if (pdev->cpu_id == AMD_CPU_ID_CZN && !disable_workarounds) {
- int rc = amd_pmc_czn_wa_irq1(pdev);
+ if (pdev->disable_8042_wakeup && !disable_workarounds) {
+ int rc = amd_pmc_wa_irq1(pdev);
if (rc) {
dev_err(pdev->dev, "failed to adjust keyboard wakeup: %d\n", rc);
{ }
};
-static int amd_pmc_get_dram_size(struct amd_pmc_dev *dev)
-{
- int ret;
-
- switch (dev->cpu_id) {
- case AMD_CPU_ID_YC:
- if (!(dev->major > 90 || (dev->major == 90 && dev->minor > 39))) {
- ret = -EINVAL;
- goto err_dram_size;
- }
- break;
- default:
- ret = -EINVAL;
- goto err_dram_size;
- }
-
- ret = amd_pmc_send_cmd(dev, S2D_DRAM_SIZE, &dev->dram_size, dev->s2d_msg_id, true);
- if (ret || !dev->dram_size)
- goto err_dram_size;
-
- return 0;
-
-err_dram_size:
- dev_err(dev->dev, "DRAM size command not supported for this platform\n");
- return ret;
-}
-
static int amd_pmc_s2d_init(struct amd_pmc_dev *dev)
{
u32 phys_addr_low, phys_addr_hi;
return -EIO;
/* Get DRAM size */
- ret = amd_pmc_get_dram_size(dev);
- if (ret)
+ ret = amd_pmc_send_cmd(dev, S2D_DRAM_SIZE, &dev->dram_size, dev->s2d_msg_id, true);
+ if (ret || !dev->dram_size)
dev->dram_size = S2D_TELEMETRY_DRAMBYTES_MAX;
/* Get STB DRAM address */
struct mutex lock; /* generic mutex lock */
struct dentry *dbgfs_dir;
struct quirk_entry *quirks;
+ bool disable_8042_wakeup;
};
void amd_pmc_process_restore_quirks(struct amd_pmc_dev *dev);
void amd_pmc_quirks_init(struct amd_pmc_dev *dev);
+/* List of supported CPU ids */
+#define AMD_CPU_ID_RV 0x15D0
+#define AMD_CPU_ID_RN 0x1630
+#define AMD_CPU_ID_PCO AMD_CPU_ID_RV
+#define AMD_CPU_ID_CZN AMD_CPU_ID_RN
+#define AMD_CPU_ID_YC 0x14B5
+#define AMD_CPU_ID_CB 0x14D8
+#define AMD_CPU_ID_PS 0x14E8
+#define AMD_CPU_ID_SP 0x14A4
+#define PCI_DEVICE_ID_AMD_1AH_M20H_ROOT 0x1507
+
#endif /* PMC_H */
MODULE_PARM_DESC(tablet_mode_sw, "Tablet mode detect: -1:auto 0:disable 1:kbd-dock 2:lid-flip 3:lid-flip-rog");
static struct quirk_entry *quirks;
+static bool atkbd_reports_vol_keys;
-static bool asus_q500a_i8042_filter(unsigned char data, unsigned char str,
- struct serio *port)
+static bool asus_i8042_filter(unsigned char data, unsigned char str, struct serio *port)
{
- static bool extended;
- bool ret = false;
+ static bool extended_e0;
+ static bool extended_e1;
if (str & I8042_STR_AUXDATA)
return false;
- if (unlikely(data == 0xe1)) {
- extended = true;
- ret = true;
- } else if (unlikely(extended)) {
- extended = false;
- ret = true;
+ if (quirks->filter_i8042_e1_extended_codes) {
+ if (data == 0xe1) {
+ extended_e1 = true;
+ return true;
+ }
+
+ if (extended_e1) {
+ extended_e1 = false;
+ return true;
+ }
}
- return ret;
+ if (data == 0xe0) {
+ extended_e0 = true;
+ } else if (extended_e0) {
+ extended_e0 = false;
+
+ switch (data & 0x7f) {
+ case 0x20: /* e0 20 / e0 a0, Volume Mute press / release */
+ case 0x2e: /* e0 2e / e0 ae, Volume Down press / release */
+ case 0x30: /* e0 30 / e0 b0, Volume Up press / release */
+ atkbd_reports_vol_keys = true;
+ break;
+ }
+ }
+
+ return false;
}
static struct quirk_entry quirk_asus_unknown = {
};
static struct quirk_entry quirk_asus_q500a = {
- .i8042_filter = asus_q500a_i8042_filter,
+ .filter_i8042_e1_extended_codes = true,
.wmi_backlight_set_devstate = true,
};
static void asus_nb_wmi_quirks(struct asus_wmi_driver *driver)
{
- int ret;
-
quirks = &quirk_asus_unknown;
dmi_check_system(asus_quirks);
if (tablet_mode_sw != -1)
quirks->tablet_switch_mode = tablet_mode_sw;
-
- if (quirks->i8042_filter) {
- ret = i8042_install_filter(quirks->i8042_filter);
- if (ret) {
- pr_warn("Unable to install key filter\n");
- return;
- }
- pr_info("Using i8042 filter function for receiving events\n");
- }
}
static const struct key_entry asus_nb_wmi_keymap[] = {
if (acpi_video_handles_brightness_key_presses())
*code = ASUS_WMI_KEY_IGNORE;
+ break;
+ case 0x30: /* Volume Up */
+ case 0x31: /* Volume Down */
+ case 0x32: /* Volume Mute */
+ if (atkbd_reports_vol_keys)
+ *code = ASUS_WMI_KEY_IGNORE;
+
break;
}
}
.input_phys = ASUS_NB_WMI_FILE "/input0",
.detect_quirks = asus_nb_wmi_quirks,
.key_filter = asus_nb_wmi_key_filter,
+ .i8042_filter = asus_i8042_filter,
};
#include <linux/acpi.h>
#include <linux/backlight.h>
#include <linux/debugfs.h>
+#include <linux/delay.h>
#include <linux/dmi.h>
#include <linux/fb.h>
#include <linux/hwmon.h>
#define ASUS_SCREENPAD_BRIGHT_MAX 255
#define ASUS_SCREENPAD_BRIGHT_DEFAULT 60
+/* Controls the power state of the USB0 hub on ROG Ally which input is on */
+#define ASUS_USB0_PWR_EC0_CSEE "\\_SB.PCI0.SBRG.EC0.CSEE"
+/* 300ms so far seems to produce a reliable result on AC and battery */
+#define ASUS_USB0_PWR_EC0_CSEE_WAIT 300
+
static const char * const ashs_ids[] = { "ATK4001", "ATK4002", NULL };
static int throttle_thermal_policy_write(struct asus_wmi *);
bool fnlock_locked;
+ /* The ROG Ally device requires the MCU USB device be disconnected before suspend */
+ bool ally_mcu_usb_switch;
+
struct asus_wmi_debug debug;
struct asus_wmi_driver *driver;
asus->nv_temp_tgt_available = asus_wmi_dev_is_present(asus, ASUS_WMI_DEVID_NV_THERM_TARGET);
asus->panel_overdrive_available = asus_wmi_dev_is_present(asus, ASUS_WMI_DEVID_PANEL_OD);
asus->mini_led_mode_available = asus_wmi_dev_is_present(asus, ASUS_WMI_DEVID_MINI_LED_MODE);
+ asus->ally_mcu_usb_switch = acpi_has_method(NULL, ASUS_USB0_PWR_EC0_CSEE)
+ && dmi_match(DMI_BOARD_NAME, "RC71L");
err = fan_boost_mode_check_present(asus);
if (err)
goto fail_wmi_handler;
}
+ if (asus->driver->i8042_filter) {
+ err = i8042_install_filter(asus->driver->i8042_filter);
+ if (err)
+ pr_warn("Unable to install key filter - %d\n", err);
+ }
+
asus_wmi_battery_init(asus);
asus_wmi_debugfs_init(asus);
struct asus_wmi *asus;
asus = platform_get_drvdata(device);
+ if (asus->driver->i8042_filter)
+ i8042_remove_filter(asus->driver->i8042_filter);
wmi_remove_notify_handler(asus->driver->event_guid);
asus_wmi_backlight_exit(asus);
asus_screenpad_exit(asus);
asus_wmi_fnlock_update(asus);
asus_wmi_tablet_mode_get_state(asus);
+
+ return 0;
+}
+
+static int asus_hotk_resume_early(struct device *device)
+{
+ struct asus_wmi *asus = dev_get_drvdata(device);
+
+ if (asus->ally_mcu_usb_switch) {
+ if (ACPI_FAILURE(acpi_execute_simple_method(NULL, ASUS_USB0_PWR_EC0_CSEE, 0xB8)))
+ dev_err(device, "ROG Ally MCU failed to connect USB dev\n");
+ else
+ msleep(ASUS_USB0_PWR_EC0_CSEE_WAIT);
+ }
+ return 0;
+}
+
+static int asus_hotk_prepare(struct device *device)
+{
+ struct asus_wmi *asus = dev_get_drvdata(device);
+ int result, err;
+
+ if (asus->ally_mcu_usb_switch) {
+ /* When powersave is enabled it causes many issues with resume of USB hub */
+ result = asus_wmi_get_devstate_simple(asus, ASUS_WMI_DEVID_MCU_POWERSAVE);
+ if (result == 1) {
+ dev_warn(device, "MCU powersave enabled, disabling to prevent resume issues");
+ err = asus_wmi_set_devstate(ASUS_WMI_DEVID_MCU_POWERSAVE, 0, &result);
+ if (err || result != 1)
+ dev_err(device, "Failed to set MCU powersave mode: %d\n", err);
+ }
+ /* sleep required to ensure USB0 is disabled before sleep continues */
+ if (ACPI_FAILURE(acpi_execute_simple_method(NULL, ASUS_USB0_PWR_EC0_CSEE, 0xB7)))
+ dev_err(device, "ROG Ally MCU failed to disconnect USB dev\n");
+ else
+ msleep(ASUS_USB0_PWR_EC0_CSEE_WAIT);
+ }
return 0;
}
.thaw = asus_hotk_thaw,
.restore = asus_hotk_restore,
.resume = asus_hotk_resume,
+ .resume_early = asus_hotk_resume_early,
+ .prepare = asus_hotk_prepare,
};
/* Registration ***************************************************************/
bool wmi_backlight_set_devstate;
bool wmi_force_als_set;
bool wmi_ignore_fan;
+ bool filter_i8042_e1_extended_codes;
enum asus_wmi_tablet_switch_mode tablet_switch_mode;
int wapf;
/*
*/
int no_display_toggle;
u32 xusb2pr;
-
- bool (*i8042_filter)(unsigned char data, unsigned char str,
- struct serio *serio);
};
struct asus_wmi_driver {
* Return ASUS_WMI_KEY_IGNORE in code if event should be ignored. */
void (*key_filter) (struct asus_wmi_driver *driver, int *code,
unsigned int *value, bool *autorelease);
+ /* Optional standard i8042 filter */
+ bool (*i8042_filter)(unsigned char data, unsigned char str,
+ struct serio *serio);
int (*probe) (struct platform_device *device);
void (*detect_quirks) (struct asus_wmi_driver *driver);
static int hp_add_other_attributes(int attr_type)
{
struct kobject *attr_name_kobj;
- union acpi_object *obj = NULL;
int ret;
char *attr_name;
- mutex_lock(&bioscfg_drv.mutex);
-
attr_name_kobj = kzalloc(sizeof(*attr_name_kobj), GFP_KERNEL);
- if (!attr_name_kobj) {
- ret = -ENOMEM;
- goto err_other_attr_init;
- }
+ if (!attr_name_kobj)
+ return -ENOMEM;
+
+ mutex_lock(&bioscfg_drv.mutex);
/* Check if attribute type is supported */
switch (attr_type) {
default:
pr_err("Error: Unknown attr_type: %d\n", attr_type);
ret = -EINVAL;
- goto err_other_attr_init;
+ kfree(attr_name_kobj);
+ goto unlock_drv_mutex;
}
ret = kobject_init_and_add(attr_name_kobj, &attr_name_ktype,
NULL, "%s", attr_name);
if (ret) {
pr_err("Error encountered [%d]\n", ret);
- kobject_put(attr_name_kobj);
goto err_other_attr_init;
}
switch (attr_type) {
case HPWMI_SECURE_PLATFORM_TYPE:
ret = hp_populate_secure_platform_data(attr_name_kobj);
- if (ret)
- goto err_other_attr_init;
break;
case HPWMI_SURE_START_TYPE:
ret = hp_populate_sure_start_data(attr_name_kobj);
- if (ret)
- goto err_other_attr_init;
break;
default:
ret = -EINVAL;
- goto err_other_attr_init;
}
+ if (ret)
+ goto err_other_attr_init;
+
mutex_unlock(&bioscfg_drv.mutex);
return 0;
err_other_attr_init:
+ kobject_put(attr_name_kobj);
+unlock_drv_mutex:
mutex_unlock(&bioscfg_drv.mutex);
- kfree(obj);
return ret;
}
if (WARN_ON(priv->kbd_bl.initialized))
return -EEXIST;
- brightness = ideapad_kbd_bl_brightness_get(priv);
- if (brightness < 0)
- return brightness;
-
- priv->kbd_bl.last_brightness = brightness;
-
if (ideapad_kbd_bl_check_tristate(priv->kbd_bl.type)) {
priv->kbd_bl.led.max_brightness = 2;
} else {
priv->kbd_bl.led.max_brightness = 1;
}
+ brightness = ideapad_kbd_bl_brightness_get(priv);
+ if (brightness < 0)
+ return brightness;
+
+ priv->kbd_bl.last_brightness = brightness;
priv->kbd_bl.led.name = "platform::" LED_FUNCTION_KBD_BACKLIGHT;
priv->kbd_bl.led.brightness_get = ideapad_kbd_bl_led_cdev_brightness_get;
priv->kbd_bl.led.brightness_set_blocking = ideapad_kbd_bl_led_cdev_brightness_set;
* is based on the contiguous indexes from ltr_show output.
* pmc index and ltr index needs to be calculated from it.
*/
- for (pmc_index = 0; pmc_index < ARRAY_SIZE(pmcdev->pmcs) && ltr_index > 0; pmc_index++) {
+ for (pmc_index = 0; pmc_index < ARRAY_SIZE(pmcdev->pmcs) && ltr_index >= 0; pmc_index++) {
pmc = pmcdev->pmcs[pmc_index];
if (!pmc)
/**
* telemetry_update_events() - Update telemetry Configuration
* @pss_evtconfig: PSS related config. No change if num_evts = 0.
- * @pss_evtconfig: IOSS related config. No change if num_evts = 0.
+ * @ioss_evtconfig: IOSS related config. No change if num_evts = 0.
*
* This API updates the IOSS & PSS Telemetry configuration. Old config
* is overwritten. Call telemetry_reset_events when logging is over
/**
* telemetry_get_eventconfig() - Returns the pss and ioss events enabled
* @pss_evtconfig: Pointer to PSS related configuration.
- * @pss_evtconfig: Pointer to IOSS related configuration.
+ * @ioss_evtconfig: Pointer to IOSS related configuration.
* @pss_len: Number of u32 elements allocated for pss_evtconfig array
* @ioss_len: Number of u32 elements allocated for ioss_evtconfig array
*
bool wakeup_mode;
};
-static void detect_tablet_mode(struct platform_device *device)
+static void detect_tablet_mode(struct device *dev)
{
- struct intel_vbtn_priv *priv = dev_get_drvdata(&device->dev);
- acpi_handle handle = ACPI_HANDLE(&device->dev);
+ struct intel_vbtn_priv *priv = dev_get_drvdata(dev);
+ acpi_handle handle = ACPI_HANDLE(dev);
unsigned long long vgbs;
acpi_status status;
int m;
input_report_switch(priv->switches_dev, SW_TABLET_MODE, m);
m = (vgbs & VGBS_DOCK_MODE_FLAG) ? 1 : 0;
input_report_switch(priv->switches_dev, SW_DOCK, m);
+
+ input_sync(priv->switches_dev);
}
/*
priv->switches_dev->id.bustype = BUS_HOST;
if (priv->has_switches) {
- detect_tablet_mode(device);
+ detect_tablet_mode(&device->dev);
ret = input_register_device(priv->switches_dev);
if (ret)
autorelease = val && (!ke_rel || ke_rel->type == KE_IGNORE);
sparse_keymap_report_event(input_dev, event, val, autorelease);
+
+ /* Some devices need this to report further events */
+ acpi_evaluate_object(handle, "VBDL", NULL, NULL);
}
/*
static int intel_vbtn_pm_resume(struct device *dev)
{
+ struct intel_vbtn_priv *priv = dev_get_drvdata(dev);
+
intel_vbtn_pm_complete(dev);
+
+ if (priv->has_switches)
+ detect_tablet_mode(dev);
+
return 0;
}
* @ips: IPS driver struct
*
* Check whether the MCP is over its thermal or power budget.
+ *
+ * Returns: %true if the temp or power has exceeded its maximum, else %false
*/
static bool mcp_exceeded(struct ips_driver *ips)
{
* @cpu: CPU number to check
*
* Check a given CPU's average temp or power is over its limit.
+ *
+ * Returns: %true if the temp or power has exceeded its maximum, else %false
*/
static bool cpu_exceeded(struct ips_driver *ips, int cpu)
{
* @ips: IPS driver struct
*
* Check the MCH temp & power against their maximums.
+ *
+ * Returns: %true if the temp or power has exceeded its maximum, else %false
*/
static bool mch_exceeded(struct ips_driver *ips)
{
* - down (at TDP limit)
* - adjust both CPU and GPU down if possible
*
- cpu+ gpu+ cpu+gpu- cpu-gpu+ cpu-gpu-
-cpu < gpu < cpu+gpu+ cpu+ gpu+ nothing
-cpu < gpu >= cpu+gpu-(mcp<) cpu+gpu-(mcp<) gpu- gpu-
-cpu >= gpu < cpu-gpu+(mcp<) cpu- cpu-gpu+(mcp<) cpu-
-cpu >= gpu >= cpu-gpu- cpu-gpu- cpu-gpu- cpu-gpu-
+ * |cpu+ gpu+ cpu+gpu- cpu-gpu+ cpu-gpu-
+ * cpu < gpu < |cpu+gpu+ cpu+ gpu+ nothing
+ * cpu < gpu >= |cpu+gpu-(mcp<) cpu+gpu-(mcp<) gpu- gpu-
+ * cpu >= gpu < |cpu-gpu+(mcp<) cpu- cpu-gpu+(mcp<) cpu-
+ * cpu >= gpu >=|cpu-gpu- cpu-gpu- cpu-gpu- cpu-gpu-
*
+ * Returns: %0
*/
static int ips_adjust(void *data)
{
* @data: ips driver structure
*
* This is the main function for the IPS driver. It monitors power and
- * tempurature in the MCP and adjusts CPU and GPU power clams accordingly.
+ * temperature in the MCP and adjusts CPU and GPU power clamps accordingly.
*
- * We keep a 5s moving average of power consumption and tempurature. Using
+ * We keep a 5s moving average of power consumption and temperature. Using
* that data, along with CPU vs GPU preference, we adjust the power clamps
* up or down.
+ *
+ * Returns: %0 on success or -errno on error
*/
static int ips_monitor(void *data)
{
* Handle temperature limit trigger events, generally by lowering the clamps.
* If we're at a critical limit, we clamp back to the lowest possible value
* to prevent emergency shutdown.
+ *
+ * Returns: IRQ_NONE or IRQ_HANDLED
*/
static irqreturn_t ips_irq_handler(int irq, void *arg)
{
/**
* ips_detect_cpu - detect whether CPU supports IPS
+ * @ips: IPS driver struct
*
* Walk our list and see if we're on a supported CPU. If we find one,
* return the limits for it.
+ *
+ * Returns: the &ips_mcp_limits struct that matches the boot CPU or %NULL
*/
static struct ips_mcp_limits *ips_detect_cpu(struct ips_driver *ips)
{
* monitor and control graphics turbo mode. If we can find them, we can
* enable graphics turbo, otherwise we must disable it to avoid exceeding
* thermal and power limits in the MCP.
+ *
+ * Returns: %true if the required symbols are found, else %false
*/
static bool ips_get_i915_syms(struct ips_driver *ips)
{
* Iterates over a quirks list until one is found that matches the
* ThinkPad's vendor, BIOS and EC model.
*
- * Returns 0 if nothing matches, otherwise returns the quirks field of
+ * Returns: %0 if nothing matches, otherwise returns the quirks field of
* the matching &struct tpacpi_quirk entry.
*
- * The match criteria is: vendor, ec and bios much match.
+ * The match criteria is: vendor, ec and bios must match.
*/
static unsigned long __init tpacpi_check_quirks(
const struct tpacpi_quirk *qlist,
* TPACPI_FAN_WR_TPEC is also available and should be used to
* command the fan. The X31/X40/X41 seems to have 8 fan levels,
* but the ACPI tables just mention level 7.
+ *
+ * TPACPI_FAN_RD_TPEC_NS:
+ * This mode is used for a few ThinkPads (L13 Yoga Gen2, X13 Yoga Gen2 etc.)
+ * that are using non-standard EC locations for reporting fan speeds.
+ * Currently these platforms only provide fan rpm reporting.
+ *
*/
+#define FAN_RPM_CAL_CONST 491520 /* FAN RPM calculation offset for some non-standard ECFW */
+
+#define FAN_NS_CTRL_STATUS BIT(2) /* Bit which determines control is enabled or not */
+#define FAN_NS_CTRL BIT(4) /* Bit which determines control is by host or EC */
+
enum { /* Fan control constants */
fan_status_offset = 0x2f, /* EC register 0x2f */
fan_rpm_offset = 0x84, /* EC register 0x84: LSB, 0x85 MSB (RPM)
fan_select_offset = 0x31, /* EC register 0x31 (Firmware 7M)
bit 0 selects which fan is active */
+ fan_status_offset_ns = 0x93, /* Special status/control offset for non-standard EC Fan1 */
+ fan2_status_offset_ns = 0x96, /* Special status/control offset for non-standard EC Fan2 */
+ fan_rpm_status_ns = 0x95, /* Special offset for Fan1 RPM status for non-standard EC */
+ fan2_rpm_status_ns = 0x98, /* Special offset for Fan2 RPM status for non-standard EC */
+
TP_EC_FAN_FULLSPEED = 0x40, /* EC fan mode: full speed */
TP_EC_FAN_AUTO = 0x80, /* EC fan mode: auto fan control */
TPACPI_FAN_NONE = 0, /* No fan status or control */
TPACPI_FAN_RD_ACPI_GFAN, /* Use ACPI GFAN */
TPACPI_FAN_RD_TPEC, /* Use ACPI EC regs 0x2f, 0x84-0x85 */
+ TPACPI_FAN_RD_TPEC_NS, /* Use non-standard ACPI EC regs (eg: L13 Yoga gen2 etc.) */
};
enum fan_control_access_mode {
static u8 fan_control_resume_level;
static int fan_watchdog_maxinterval;
+static bool fan_with_ns_addr;
+
static struct mutex fan_mutex;
static void fan_watchdog_fire(struct work_struct *ignored);
}
break;
+ case TPACPI_FAN_RD_TPEC_NS:
+ /* Default mode is AUTO which means controlled by EC */
+ if (!acpi_ec_read(fan_status_offset_ns, &s))
+ return -EIO;
+
+ if (status)
+ *status = s;
+
+ break;
default:
return -ENXIO;
if (mutex_lock_killable(&fan_mutex))
return -ERESTARTSYS;
rc = fan_get_status(&s);
- if (!rc)
+ /* NS EC doesn't have register with level settings */
+ if (!rc && !fan_with_ns_addr)
fan_update_desired_level(s);
mutex_unlock(&fan_mutex);
if (likely(speed))
*speed = (hi << 8) | lo;
+ break;
+ case TPACPI_FAN_RD_TPEC_NS:
+ if (!acpi_ec_read(fan_rpm_status_ns, &lo))
+ return -EIO;
+ if (speed)
+ *speed = lo ? FAN_RPM_CAL_CONST / lo : 0;
break;
default:
static int fan2_get_speed(unsigned int *speed)
{
- u8 hi, lo;
+ u8 hi, lo, status;
bool rc;
switch (fan_status_access_mode) {
if (likely(speed))
*speed = (hi << 8) | lo;
+ break;
+ case TPACPI_FAN_RD_TPEC_NS:
+ rc = !acpi_ec_read(fan2_status_offset_ns, &status);
+ if (rc)
+ return -EIO;
+ if (!(status & FAN_NS_CTRL_STATUS)) {
+ pr_info("secondary fan control not supported\n");
+ return -EIO;
+ }
+ rc = !acpi_ec_read(fan2_rpm_status_ns, &lo);
+ if (rc)
+ return -EIO;
+ if (speed)
+ *speed = lo ? FAN_RPM_CAL_CONST / lo : 0;
break;
default:
#define TPACPI_FAN_2FAN 0x0002 /* EC 0x31 bit 0 selects fan2 */
#define TPACPI_FAN_2CTL 0x0004 /* selects fan2 control */
#define TPACPI_FAN_NOFAN 0x0008 /* no fan available */
+#define TPACPI_FAN_NS 0x0010 /* For EC with non-Standard register addresses */
static const struct tpacpi_quirk fan_quirk_table[] __initconst = {
TPACPI_QEC_IBM('1', 'Y', TPACPI_FAN_Q1),
TPACPI_Q_LNV3('N', '2', 'O', TPACPI_FAN_2CTL), /* P1 / X1 Extreme (2nd gen) */
TPACPI_Q_LNV3('N', '3', '0', TPACPI_FAN_2CTL), /* P15 (1st gen) / P15v (1st gen) */
TPACPI_Q_LNV3('N', '3', '7', TPACPI_FAN_2CTL), /* T15g (2nd gen) */
+ TPACPI_Q_LNV3('R', '1', 'F', TPACPI_FAN_NS), /* L13 Yoga Gen 2 */
+ TPACPI_Q_LNV3('N', '2', 'U', TPACPI_FAN_NS), /* X13 Yoga Gen 2*/
TPACPI_Q_LNV3('N', '1', 'O', TPACPI_FAN_NOFAN), /* X1 Tablet (2nd gen) */
};
return -ENODEV;
}
+ if (quirks & TPACPI_FAN_NS) {
+ pr_info("ECFW with non-standard fan reg control found\n");
+ fan_with_ns_addr = 1;
+ /* Fan ctrl support from host is undefined for now */
+ tp_features.fan_ctrl_status_undef = 1;
+ }
+
if (gfan_handle) {
/* 570, 600e/x, 770e, 770x */
fan_status_access_mode = TPACPI_FAN_RD_ACPI_GFAN;
} else {
/* all other ThinkPads: note that even old-style
* ThinkPad ECs supports the fan control register */
- if (likely(acpi_ec_read(fan_status_offset,
- &fan_control_initial_status))) {
+ if (fan_with_ns_addr ||
+ likely(acpi_ec_read(fan_status_offset, &fan_control_initial_status))) {
int res;
unsigned int speed;
- fan_status_access_mode = TPACPI_FAN_RD_TPEC;
+ fan_status_access_mode = fan_with_ns_addr ?
+ TPACPI_FAN_RD_TPEC_NS : TPACPI_FAN_RD_TPEC;
+
if (quirks & TPACPI_FAN_Q1)
fan_quirk1_setup();
/* Try and probe the 2nd fan */
if (res >= 0 && speed != FAN_NOT_PRESENT) {
/* It responded - so let's assume it's there */
tp_features.second_fan = 1;
- tp_features.second_fan_ctl = 1;
+ /* fan control not currently available for ns ECFW */
+ tp_features.second_fan_ctl = !fan_with_ns_addr;
pr_info("secondary fan control detected & enabled\n");
} else {
/* Fan not auto-detected */
str_enabled_disabled(status), status);
break;
+ case TPACPI_FAN_RD_TPEC_NS:
case TPACPI_FAN_RD_TPEC:
/* all except 570, 600e/x, 770e, 770x */
rc = fan_get_status_safe(&status);
seq_printf(m, "speed:\t\t%d\n", speed);
- if (status & TP_EC_FAN_FULLSPEED)
- /* Disengaged mode takes precedence */
- seq_printf(m, "level:\t\tdisengaged\n");
- else if (status & TP_EC_FAN_AUTO)
- seq_printf(m, "level:\t\tauto\n");
- else
- seq_printf(m, "level:\t\t%d\n", status);
+ if (fan_status_access_mode == TPACPI_FAN_RD_TPEC_NS) {
+ /*
+ * No full speed bit in NS EC
+ * EC Auto mode is set by default.
+ * No other levels settings available
+ */
+ seq_printf(m, "level:\t\t%s\n", status & FAN_NS_CTRL ? "unknown" : "auto");
+ } else {
+ if (status & TP_EC_FAN_FULLSPEED)
+ /* Disengaged mode takes precedence */
+ seq_printf(m, "level:\t\tdisengaged\n");
+ else if (status & TP_EC_FAN_AUTO)
+ seq_printf(m, "level:\t\tauto\n");
+ else
+ seq_printf(m, "level:\t\t%d\n", status);
+ }
break;
case TPACPI_FAN_NONE:
/* ACPI helpers/functions/probes */
-/**
+/*
* This evaluates a ACPI method call specific to the battery
* ACPI extension. The specifics are that an error is marked
* in the 32rd bit of the response, so we just check that here.
if (debug_dump_wdg)
wmi_dump_wdg(&gblock[i]);
+ if (!gblock[i].instance_count) {
+ dev_info(wmi_bus_dev, FW_INFO "%pUL has zero instances\n", &gblock[i].guid);
+ continue;
+ }
+
if (guid_already_parsed_for_legacy(device, &gblock[i].guid))
continue;
if (!state)
return -EINVAL;
- ret = pd->perf_ops->level_set(pd->ph, pd->domain_id, state, true);
+ ret = pd->perf_ops->level_set(pd->ph, pd->domain_id, state, false);
if (ret)
dev_warn(&genpd->dev, "Failed with %d when trying to set %d perf level",
ret, state);
rpmpds[i]->pd.power_off = rpmpd_power_off;
rpmpds[i]->pd.power_on = rpmpd_power_on;
rpmpds[i]->pd.set_performance_state = rpmpd_set_performance;
+ rpmpds[i]->pd.flags = GENPD_FLAG_ACTIVE_WAKEUP;
pm_genpd_init(&rpmpds[i]->pd, NULL, true);
data->domains[i] = &rpmpds[i]->pd;
#include <linux/of.h>
#include <linux/pm_qos.h>
#include <linux/slab.h>
-#include <linux/units.h>
struct dtpm_cpu {
struct dtpm dtpm;
if (pd->table[i].frequency < freq)
continue;
- return scale_pd_power_uw(pd_mask, pd->table[i].power *
- MICROWATT_PER_MILLIWATT);
+ return scale_pd_power_uw(pd_mask, pd->table[i].power);
}
return 0;
nr_cpus = cpumask_weight(&cpus);
dtpm->power_min = em->table[0].power;
- dtpm->power_min *= MICROWATT_PER_MILLIWATT;
dtpm->power_min *= nr_cpus;
dtpm->power_max = em->table[em->nr_perf_states - 1].power;
- dtpm->power_max *= MICROWATT_PER_MILLIWATT;
dtpm->power_max *= nr_cpus;
return 0;
if (policy) {
for_each_cpu(dtpm_cpu->cpu, policy->related_cpus)
per_cpu(dtpm_per_cpu, dtpm_cpu->cpu) = NULL;
+
+ cpufreq_cpu_put(policy);
}
kfree(dtpm_cpu);
return 0;
pd = em_cpu_get(cpu);
- if (!pd || em_is_artificial(pd))
- return -EINVAL;
+ if (!pd || em_is_artificial(pd)) {
+ ret = -EINVAL;
+ goto release_policy;
+ }
dtpm_cpu = kzalloc(sizeof(*dtpm_cpu), GFP_KERNEL);
- if (!dtpm_cpu)
- return -ENOMEM;
+ if (!dtpm_cpu) {
+ ret = -ENOMEM;
+ goto release_policy;
+ }
dtpm_init(&dtpm_cpu->dtpm, &dtpm_ops);
dtpm_cpu->cpu = cpu;
if (ret)
goto out_dtpm_unregister;
+ cpufreq_cpu_put(policy);
return 0;
out_dtpm_unregister:
per_cpu(dtpm_per_cpu, cpu) = NULL;
kfree(dtpm_cpu);
+release_policy:
+ cpufreq_cpu_put(policy);
return ret;
}
struct em_perf_domain *pd = em_pd_get(dev);
dtpm->power_min = pd->table[0].power;
- dtpm->power_min *= MICROWATT_PER_MILLIWATT;
dtpm->power_max = pd->table[pd->nr_perf_states - 1].power;
- dtpm->power_max *= MICROWATT_PER_MILLIWATT;
return 0;
}
struct device *dev = devfreq->dev.parent;
struct em_perf_domain *pd = em_pd_get(dev);
unsigned long freq;
- u64 power;
int i;
for (i = 0; i < pd->nr_perf_states; i++) {
-
- power = pd->table[i].power * MICROWATT_PER_MILLIWATT;
- if (power > power_limit)
+ if (pd->table[i].power > power_limit)
break;
}
dev_pm_qos_update_request(&dtpm_devfreq->qos_req, freq);
- power_limit = pd->table[i - 1].power * MICROWATT_PER_MILLIWATT;
+ power_limit = pd->table[i - 1].power;
return power_limit;
}
if (pd->table[i].frequency < freq)
continue;
- power = pd->table[i].power * MICROWATT_PER_MILLIWATT;
+ power = pd->table[i].power;
power *= status.busy_time;
power >>= 10;
for (i = 0; i < cnt; i++) {
event[i] = queue->buf[queue->head];
- queue->head = (queue->head + 1) % PTP_MAX_TIMESTAMPS;
+ /* Paired with READ_ONCE() in queue_cnt() */
+ WRITE_ONCE(queue->head, (queue->head + 1) % PTP_MAX_TIMESTAMPS);
}
spin_unlock_irqrestore(&queue->lock, flags);
dst->t.sec = seconds;
dst->t.nsec = remainder;
+ /* Both WRITE_ONCE() are paired with READ_ONCE() in queue_cnt() */
if (!queue_free(queue))
- queue->head = (queue->head + 1) % PTP_MAX_TIMESTAMPS;
+ WRITE_ONCE(queue->head, (queue->head + 1) % PTP_MAX_TIMESTAMPS);
- queue->tail = (queue->tail + 1) % PTP_MAX_TIMESTAMPS;
+ WRITE_ONCE(queue->tail, (queue->tail + 1) % PTP_MAX_TIMESTAMPS);
spin_unlock_irqrestore(&queue->lock, flags);
}
* that a writer might concurrently increment the tail does not
* matter, since the queue remains nonempty nonetheless.
*/
-static inline int queue_cnt(struct timestamp_event_queue *q)
+static inline int queue_cnt(const struct timestamp_event_queue *q)
{
- int cnt = q->tail - q->head;
+ /*
+ * Paired with WRITE_ONCE() in enqueue_external_timestamp(),
+ * ptp_read(), extts_fifo_show().
+ */
+ int cnt = READ_ONCE(q->tail) - READ_ONCE(q->head);
return cnt < 0 ? PTP_MAX_TIMESTAMPS + cnt : cnt;
}
qcnt = queue_cnt(queue);
if (qcnt) {
event = queue->buf[queue->head];
- queue->head = (queue->head + 1) % PTP_MAX_TIMESTAMPS;
+ /* Paired with READ_ONCE() in queue_cnt() */
+ WRITE_ONCE(queue->head, (queue->head + 1) % PTP_MAX_TIMESTAMPS);
}
spin_unlock_irqrestore(&queue->lock, flags);
pc->chip.ops = &bcm2835_pwm_ops;
pc->chip.npwm = 2;
+ platform_set_drvdata(pdev, pc);
+
ret = devm_pwmchip_add(&pdev->dev, &pc->chip);
if (ret < 0)
return dev_err_probe(&pdev->dev, ret,
{
lockdep_assert_held(&reset_list_mutex);
+ if (IS_ERR_OR_NULL(rstc))
+ return;
+
kref_put(&rstc->refcnt, __reset_control_release);
}
void reset_control_bulk_put(int num_rstcs, struct reset_control_bulk_data *rstcs)
{
mutex_lock(&reset_list_mutex);
- while (num_rstcs--) {
- if (IS_ERR_OR_NULL(rstcs[num_rstcs].rstc))
- continue;
+ while (num_rstcs--)
__reset_control_put_internal(rstcs[num_rstcs].rstc);
- }
mutex_unlock(&reset_list_mutex);
}
EXPORT_SYMBOL_GPL(reset_control_bulk_put);
if (!data)
return -ENOMEM;
- type = (enum hi6220_reset_ctrl_type)of_device_get_match_data(dev);
+ type = (uintptr_t)of_device_get_match_data(dev);
regmap = syscon_node_to_regmap(np);
if (IS_ERR(regmap)) {
* we count each request only once.
*/
device = cqr->startdev;
- if (device->profile.data) {
- counter = 1; /* request is not yet queued on the start device */
- list_for_each(l, &device->ccw_queue)
- if (++counter >= 31)
- break;
- }
+ if (!device->profile.data)
+ return;
+
+ spin_lock(get_ccwdev_lock(device->cdev));
+ counter = 1; /* request is not yet queued on the start device */
+ list_for_each(l, &device->ccw_queue)
+ if (++counter >= 31)
+ break;
+ spin_unlock(get_ccwdev_lock(device->cdev));
+
spin_lock(&device->profile.lock);
- if (device->profile.data) {
- device->profile.data->dasd_io_nr_req[counter]++;
- if (rq_data_dir(req) == READ)
- device->profile.data->dasd_read_nr_req[counter]++;
- }
+ device->profile.data->dasd_io_nr_req[counter]++;
+ if (rq_data_dir(req) == READ)
+ device->profile.data->dasd_read_nr_req[counter]++;
spin_unlock(&device->profile.lock);
}
__u8 secondary; /* 7 Secondary device address */
__u16 pprc_id; /* 8-9 Peer-to-Peer Remote Copy ID */
__u8 reserved2[12]; /* 10-21 reserved */
- __u16 prim_cu_ssid; /* 22-23 Pimary Control Unit SSID */
+ __u16 prim_cu_ssid; /* 22-23 Primary Control Unit SSID */
__u8 reserved3[12]; /* 24-35 reserved */
__u16 sec_cu_ssid; /* 36-37 Secondary Control Unit SSID */
__u8 reserved4[90]; /* 38-127 reserved */
#include <linux/blk-mq.h>
#include <linux/slab.h>
#include <linux/list.h>
+#include <linux/io.h>
#include <asm/eadm.h>
#include "scm_blk.h"
for (i = 0; i < nr_requests_per_io && scmrq->request[i]; i++) {
msb = &scmrq->aob->msb[i];
- aidaw = msb->data_addr;
+ aidaw = (u64)phys_to_virt(msb->data_addr);
if ((msb->flags & MSB_FLAG_IDA) && aidaw &&
IS_ALIGNED(aidaw, PAGE_SIZE))
msb->scm_addr = scmdev->address + ((u64) blk_rq_pos(req) << 9);
msb->oc = (rq_data_dir(req) == READ) ? MSB_OC_READ : MSB_OC_WRITE;
msb->flags |= MSB_FLAG_IDA;
- msb->data_addr = (u64) aidaw;
+ msb->data_addr = (u64)virt_to_phys(aidaw);
rq_for_each_segment(bv, req, iter) {
WARN_ON(bv.bv_offset);
msb->blk_count += bv.bv_len >> 12;
- aidaw->data_addr = (u64) page_address(bv.bv_page);
+ aidaw->data_addr = virt_to_phys(page_address(bv.bv_page));
aidaw++;
}
struct uvio_attest *uvio_attest)
{
struct uvio_attest __user *user_uvio_attest = (void __user *)uv_ioctl->argument_addr;
+ u32 __user *user_buf_add_len = (u32 __user *)&user_uvio_attest->add_data_len;
void __user *user_buf_add = (void __user *)uvio_attest->add_data_addr;
void __user *user_buf_meas = (void __user *)uvio_attest->meas_addr;
void __user *user_buf_uid = &user_uvio_attest->config_uid;
return -EFAULT;
if (add_data && copy_to_user(user_buf_add, add_data, uvio_attest->add_data_len))
return -EFAULT;
+ if (put_user(uvio_attest->add_data_len, user_buf_add_len))
+ return -EFAULT;
if (copy_to_user(user_buf_uid, uvcb_attest->config_uid, sizeof(uvcb_attest->config_uid)))
return -EFAULT;
return 0;
config ISM
tristate "Support for ISM vPCI Adapter"
depends on PCI
+ imply SMC
default n
help
Select this option if you want to use the Internal Shared Memory
- vPCI Adapter.
+ vPCI Adapter. The adapter can be used with the SMC network protocol.
To compile as a module choose M. The module name is ism.
If unsure, choose N.
MODULE_DEVICE_TABLE(pci, ism_device_table);
static debug_info_t *ism_debug_info;
-static const struct smcd_ops ism_ops;
#define NO_CLIENT 0xff /* must be >= MAX_CLIENTS */
static struct ism_client *clients[MAX_CLIENTS]; /* use an array rather than */
return ret;
}
-static int ism_query_rgid(struct ism_dev *ism, u64 rgid, u32 vid_valid,
- u32 vid)
-{
- union ism_query_rgid cmd;
-
- memset(&cmd, 0, sizeof(cmd));
- cmd.request.hdr.cmd = ISM_QUERY_RGID;
- cmd.request.hdr.len = sizeof(cmd.request);
-
- cmd.request.rgid = rgid;
- cmd.request.vlan_valid = vid_valid;
- cmd.request.vlan_id = vid;
-
- return ism_cmd(ism, &cmd);
-}
-
static void ism_free_dmb(struct ism_dev *ism, struct ism_dmb *dmb)
{
clear_bit(dmb->sba_idx, ism->sba_bitmap);
return ism_cmd(ism, &cmd);
}
-static int ism_signal_ieq(struct ism_dev *ism, u64 rgid, u32 trigger_irq,
- u32 event_code, u64 info)
-{
- union ism_sig_ieq cmd;
-
- memset(&cmd, 0, sizeof(cmd));
- cmd.request.hdr.cmd = ISM_SIGNAL_IEQ;
- cmd.request.hdr.len = sizeof(cmd.request);
-
- cmd.request.rgid = rgid;
- cmd.request.trigger_irq = trigger_irq;
- cmd.request.event_code = event_code;
- cmd.request.info = info;
-
- return ism_cmd(ism, &cmd);
-}
-
static unsigned int max_bytes(unsigned int start, unsigned int len,
unsigned int boundary)
{
}
EXPORT_SYMBOL_GPL(ism_get_seid);
-static u16 ism_get_chid(struct ism_dev *ism)
-{
- if (!ism || !ism->pdev)
- return 0;
-
- return to_zpci(ism->pdev)->pchid;
-}
-
static void ism_handle_event(struct ism_dev *ism)
{
struct ism_event *entry;
return IRQ_HANDLED;
}
-static u64 ism_get_local_gid(struct ism_dev *ism)
-{
- return ism->local_gid;
-}
-
static int ism_dev_init(struct ism_dev *ism)
{
struct pci_dev *pdev = ism->pdev;
/*************************** SMC-D Implementation *****************************/
#if IS_ENABLED(CONFIG_SMC)
+static int ism_query_rgid(struct ism_dev *ism, u64 rgid, u32 vid_valid,
+ u32 vid)
+{
+ union ism_query_rgid cmd;
+
+ memset(&cmd, 0, sizeof(cmd));
+ cmd.request.hdr.cmd = ISM_QUERY_RGID;
+ cmd.request.hdr.len = sizeof(cmd.request);
+
+ cmd.request.rgid = rgid;
+ cmd.request.vlan_valid = vid_valid;
+ cmd.request.vlan_id = vid;
+
+ return ism_cmd(ism, &cmd);
+}
+
static int smcd_query_rgid(struct smcd_dev *smcd, u64 rgid, u32 vid_valid,
u32 vid)
{
return ism_cmd_simple(smcd->priv, ISM_RESET_VLAN);
}
+static int ism_signal_ieq(struct ism_dev *ism, u64 rgid, u32 trigger_irq,
+ u32 event_code, u64 info)
+{
+ union ism_sig_ieq cmd;
+
+ memset(&cmd, 0, sizeof(cmd));
+ cmd.request.hdr.cmd = ISM_SIGNAL_IEQ;
+ cmd.request.hdr.len = sizeof(cmd.request);
+
+ cmd.request.rgid = rgid;
+ cmd.request.trigger_irq = trigger_irq;
+ cmd.request.event_code = event_code;
+ cmd.request.info = info;
+
+ return ism_cmd(ism, &cmd);
+}
+
static int smcd_signal_ieq(struct smcd_dev *smcd, u64 rgid, u32 trigger_irq,
u32 event_code, u64 info)
{
SYSTEM_EID.type[0] != '0';
}
+static u64 ism_get_local_gid(struct ism_dev *ism)
+{
+ return ism->local_gid;
+}
+
static u64 smcd_get_local_gid(struct smcd_dev *smcd)
{
return ism_get_local_gid(smcd->priv);
}
+static u16 ism_get_chid(struct ism_dev *ism)
+{
+ if (!ism || !ism->pdev)
+ return 0;
+
+ return to_zpci(ism->pdev)->pchid;
+}
+
static u16 smcd_get_chid(struct smcd_dev *smcd)
{
return ism_get_chid(smcd->priv);
u32 handle_pci_error;
bool init_reset;
u8 soft_reset_support;
- u8 use_map_queue;
};
#define aac_adapter_interrupt(dev) \
struct fib *aac_fib_alloc_tag(struct aac_dev *dev, struct scsi_cmnd *scmd)
{
struct fib *fibptr;
- u32 blk_tag;
- int i;
- blk_tag = blk_mq_unique_tag(scsi_cmd_to_rq(scmd));
- i = blk_mq_unique_tag_to_tag(blk_tag);
- fibptr = &dev->fibs[i];
+ fibptr = &dev->fibs[scsi_cmd_to_rq(scmd)->tag];
/*
* Null out fields that depend on being zero at the start of
* each I/O
#include <linux/compat.h>
#include <linux/blkdev.h>
-#include <linux/blk-mq-pci.h>
#include <linux/completion.h>
#include <linux/init.h>
#include <linux/interrupt.h>
return 0;
}
-static void aac_map_queues(struct Scsi_Host *shost)
-{
- struct aac_dev *aac = (struct aac_dev *)shost->hostdata;
-
- blk_mq_pci_map_queues(&shost->tag_set.map[HCTX_TYPE_DEFAULT],
- aac->pdev, 0);
- aac->use_map_queue = true;
-}
-
/**
* aac_change_queue_depth - alter queue depths
* @sdev: SCSI device we are considering
.bios_param = aac_biosparm,
.shost_groups = aac_host_groups,
.slave_configure = aac_slave_configure,
- .map_queues = aac_map_queues,
.change_queue_depth = aac_change_queue_depth,
.sdev_groups = aac_dev_groups,
.eh_abort_handler = aac_eh_abort,
shost->max_lun = AAC_MAX_LUN;
pci_set_drvdata(pdev, shost);
- shost->nr_hw_queues = aac->max_msix;
- shost->host_tagset = 1;
error = scsi_add_host(shost, &pdev->dev);
if (error)
struct aac_dev *aac = (struct aac_dev *)shost->hostdata;
aac_cancel_rescan_worker(aac);
- aac->use_map_queue = false;
scsi_remove_host(shost);
__aac_shutdown(aac);
#endif
u16 vector_no;
- struct scsi_cmnd *scmd;
- u32 blk_tag;
- struct Scsi_Host *shost = dev->scsi_host_ptr;
- struct blk_mq_queue_map *qmap;
atomic_inc(&q->numpending);
if ((dev->comm_interface == AAC_COMM_MESSAGE_TYPE3)
&& dev->sa_firmware)
vector_no = aac_get_vector(dev);
- else {
- if (!fib->vector_no || !fib->callback_data) {
- if (shost && dev->use_map_queue) {
- qmap = &shost->tag_set.map[HCTX_TYPE_DEFAULT];
- vector_no = qmap->mq_map[raw_smp_processor_id()];
- }
- /*
- * We hardcode the vector_no for
- * reserved commands as a valid shost is
- * absent during the init
- */
- else
- vector_no = 0;
- } else {
- scmd = (struct scsi_cmnd *)fib->callback_data;
- blk_tag = blk_mq_unique_tag(scsi_cmd_to_rq(scmd));
- vector_no = blk_mq_unique_tag_to_hwq(blk_tag);
- }
- }
+ else
+ vector_no = fib->vector_no;
if (native_hba) {
if (fib->flags & FIB_CONTEXT_FLAG_NATIVE_HBA_TMF) {
kfree(pwrb_context->pwrb_handle_base);
kfree(pwrb_context->pwrb_handle_basestd);
}
+ kfree(phwi_ctxt->be_wrbq);
return -ENOMEM;
}
struct fcoe_ctlr *ctlr;
struct fcoe_rcv_info *fr;
struct fcoe_percpu_s *bg;
- struct sk_buff *tmp_skb;
interface = container_of(ptype, struct bnx2fc_interface,
fcoe_packet_type);
goto err;
}
- tmp_skb = skb_share_check(skb, GFP_ATOMIC);
- if (!tmp_skb)
- goto err;
-
- skb = tmp_skb;
+ skb = skb_share_check(skb, GFP_ATOMIC);
+ if (!skb)
+ return -1;
if (unlikely(eth_hdr(skb)->h_proto != htons(ETH_P_FCOE))) {
printk(KERN_ERR PFX "bnx2fc_rcv: Wrong FC type frame\n");
}
spin_lock_irqsave(qp->qp_lock_ptr, *flags);
- if (ret_cmd && blk_mq_request_started(scsi_cmd_to_rq(cmd)))
- sp->done(sp, res);
+ switch (sp->type) {
+ case SRB_SCSI_CMD:
+ if (ret_cmd && blk_mq_request_started(scsi_cmd_to_rq(cmd)))
+ sp->done(sp, res);
+ break;
+ default:
+ if (ret_cmd)
+ sp->done(sp, res);
+ break;
+ }
} else {
sp->done(sp, res);
}
struct sdebug_err_inject *inject;
struct scsi_device *sdev = (struct scsi_device *)file->f_inode->i_private;
- buf = kmalloc(count, GFP_KERNEL);
+ buf = kzalloc(count + 1, GFP_KERNEL);
if (!buf)
return -ENOMEM;
static int sdebug_target_alloc(struct scsi_target *starget)
{
struct sdebug_target_info *targetip;
- struct dentry *dentry;
targetip = kzalloc(sizeof(struct sdebug_target_info), GFP_KERNEL);
if (!targetip)
targetip->debugfs_entry = debugfs_create_dir(dev_name(&starget->dev),
sdebug_debugfs_root);
- if (IS_ERR_OR_NULL(targetip->debugfs_entry))
- pr_info("%s: failed to create debugfs directory for target %s\n",
- __func__, dev_name(&starget->dev));
debugfs_create_file("fail_reset", 0600, targetip->debugfs_entry, starget,
&sdebug_target_reset_fail_fops);
- if (IS_ERR_OR_NULL(dentry))
- pr_info("%s: failed to create fail_reset file for target %s\n",
- __func__, dev_name(&starget->dev));
starget->hostdata = targetip;
scsi_log_send(scmd);
scmd->submitter = SUBMITTED_BY_SCSI_ERROR_HANDLER;
+ scmd->flags |= SCMD_LAST;
/*
* Lock sdev->state_mutex to avoid that scsi_device_quiesce() can
scsi_init_command(dev, scmd);
scmd->submitter = SUBMITTED_BY_SCSI_RESET_IOCTL;
+ scmd->flags |= SCMD_LAST;
memset(&scmd->sdb, 0, sizeof(scmd->sdb));
scmd->cmd_len = 0;
return disk_changed ? DISK_EVENT_MEDIA_CHANGE : 0;
}
-static int sd_sync_cache(struct scsi_disk *sdkp, struct scsi_sense_hdr *sshdr)
+static int sd_sync_cache(struct scsi_disk *sdkp)
{
int retries, res;
struct scsi_device *sdp = sdkp->device;
const int timeout = sdp->request_queue->rq_timeout
* SD_FLUSH_TIMEOUT_MULTIPLIER;
- struct scsi_sense_hdr my_sshdr;
+ struct scsi_sense_hdr sshdr;
const struct scsi_exec_args exec_args = {
.req_flags = BLK_MQ_REQ_PM,
- /* caller might not be interested in sense, but we need it */
- .sshdr = sshdr ? : &my_sshdr,
+ .sshdr = &sshdr,
};
if (!scsi_device_online(sdp))
return -ENODEV;
- sshdr = exec_args.sshdr;
-
for (retries = 3; retries > 0; --retries) {
unsigned char cmd[16] = { 0 };
return res;
if (scsi_status_is_check_condition(res) &&
- scsi_sense_valid(sshdr)) {
- sd_print_sense_hdr(sdkp, sshdr);
+ scsi_sense_valid(&sshdr)) {
+ sd_print_sense_hdr(sdkp, &sshdr);
/* we need to evaluate the error return */
- if (sshdr->asc == 0x3a || /* medium not present */
- sshdr->asc == 0x20 || /* invalid command */
- (sshdr->asc == 0x74 && sshdr->ascq == 0x71)) /* drive is password locked */
+ if (sshdr.asc == 0x3a || /* medium not present */
+ sshdr.asc == 0x20 || /* invalid command */
+ (sshdr.asc == 0x74 && sshdr.ascq == 0x71)) /* drive is password locked */
/* this is no error here */
return 0;
+ /*
+ * This drive doesn't support sync and there's not much
+ * we can do because this is called during shutdown
+ * or suspend so just return success so those operations
+ * can proceed.
+ */
+ if (sshdr.sense_key == ILLEGAL_REQUEST)
+ return 0;
}
switch (host_byte(res)) {
if (sdkp->WCE && sdkp->media_present) {
sd_printk(KERN_NOTICE, sdkp, "Synchronizing SCSI cache\n");
- sd_sync_cache(sdkp, NULL);
+ sd_sync_cache(sdkp);
}
if ((system_state != SYSTEM_RESTART &&
static int sd_suspend_common(struct device *dev, bool runtime)
{
struct scsi_disk *sdkp = dev_get_drvdata(dev);
- struct scsi_sense_hdr sshdr;
int ret = 0;
if (!sdkp) /* E.g.: runtime suspend following sd_remove() */
if (sdkp->WCE && sdkp->media_present) {
if (!sdkp->device->silence_suspend)
sd_printk(KERN_NOTICE, sdkp, "Synchronizing SCSI cache\n");
- ret = sd_sync_cache(sdkp, &sshdr);
-
- if (ret) {
- /* ignore OFFLINE device */
- if (ret == -ENODEV)
- return 0;
-
- if (!scsi_sense_valid(&sshdr) ||
- sshdr.sense_key != ILLEGAL_REQUEST)
- return ret;
+ ret = sd_sync_cache(sdkp);
+ /* ignore OFFLINE device */
+ if (ret == -ENODEV)
+ return 0;
- /*
- * sshdr.sense_key == ILLEGAL_REQUEST means this drive
- * doesn't support sync. There's not much to do and
- * suspend shouldn't fail.
- */
- ret = 0;
- }
+ if (ret)
+ return ret;
}
if (sd_do_start_stop(sdkp->device, runtime)) {
static int sd_resume_system(struct device *dev)
{
- if (pm_runtime_suspended(dev))
+ if (pm_runtime_suspended(dev)) {
+ struct scsi_disk *sdkp = dev_get_drvdata(dev);
+ struct scsi_device *sdp = sdkp ? sdkp->device : NULL;
+
+ if (sdp && sdp->force_runtime_start_on_system_start)
+ pm_request_resume(dev);
+
return 0;
+ }
return sd_resume(dev, false);
}
static void intel_shim_vs_init(struct sdw_intel *sdw)
{
void __iomem *shim_vs = sdw->link_res->shim_vs;
- u16 act = 0;
+ u16 act;
+ act = intel_readw(shim_vs, SDW_SHIM2_INTEL_VS_ACTMCTL);
u16p_replace_bits(&act, 0x1, SDW_SHIM2_INTEL_VS_ACTMCTL_DOAIS);
act |= SDW_SHIM2_INTEL_VS_ACTMCTL_DACTQE;
act |= SDW_SHIM2_INTEL_VS_ACTMCTL_DODS;
* sdw_ml_sync_bank_switch: Multilink register bank switch
*
* @bus: SDW bus instance
+ * @multi_link: whether this is a multi-link stream with hardware-based sync
*
* Caller function should free the buffers on error
*/
-static int sdw_ml_sync_bank_switch(struct sdw_bus *bus)
+static int sdw_ml_sync_bank_switch(struct sdw_bus *bus, bool multi_link)
{
unsigned long time_left;
- if (!bus->multi_link)
+ if (!multi_link)
return 0;
/* Wait for completion of transfer */
bus->bank_switch_timeout = DEFAULT_BANK_SWITCH_TIMEOUT;
/* Check if bank switch was successful */
- ret = sdw_ml_sync_bank_switch(bus);
+ ret = sdw_ml_sync_bank_switch(bus, multi_link);
if (ret < 0) {
dev_err(bus->dev,
"multi link bank switch failed: %d\n", ret);
#include <linux/gpio/consumer.h>
#include <linux/pinctrl/consumer.h>
#include <linux/pm_runtime.h>
+#include <linux/iopoll.h>
#include <trace/events/spi.h>
/* SPI register offsets */
*/
#define DMA_MIN_BYTES 16
-#define SPI_DMA_MIN_TIMEOUT (msecs_to_jiffies(1000))
-#define SPI_DMA_TIMEOUT_PER_10K (msecs_to_jiffies(4))
-
#define AUTOSUSPEND_TIMEOUT 2000
struct atmel_spi_caps {
bool keep_cs;
u32 fifo_size;
+ bool last_polarity;
u8 native_cs_free;
u8 native_cs_for_gpio;
};
#define SPI_MAX_DMA_XFER 65535 /* true for both PDC and DMA */
#define INVALID_DMA_ADDRESS 0xffffffff
+/*
+ * This frequency can be anything supported by the controller, but to avoid
+ * unnecessary delay, the highest possible frequency is chosen.
+ *
+ * This frequency is the highest possible which is not interfering with other
+ * chip select registers (see Note for Serial Clock Bit Rate configuration in
+ * Atmel-11121F-ATARM-SAMA5D3-Series-Datasheet_02-Feb-16, page 1283)
+ */
+#define DUMMY_MSG_FREQUENCY 0x02
+/*
+ * 8 bits is the minimum data the controller is capable of sending.
+ *
+ * This message can be anything as it should not be treated by any SPI device.
+ */
+#define DUMMY_MSG 0xAA
+
/*
* Version 2 of the SPI controller has
* - CR.LASTXFER
return as->caps.is_spi2;
}
+/*
+ * Send a dummy message.
+ *
+ * This is sometimes needed when using a CS GPIO to force clock transition when
+ * switching between devices with different polarities.
+ */
+static void atmel_spi_send_dummy(struct atmel_spi *as, struct spi_device *spi, int chip_select)
+{
+ u32 status;
+ u32 csr;
+
+ /*
+ * Set a clock frequency to allow sending message on SPI bus.
+ * The frequency here can be anything, but is needed for
+ * the controller to send the data.
+ */
+ csr = spi_readl(as, CSR0 + 4 * chip_select);
+ csr = SPI_BFINS(SCBR, DUMMY_MSG_FREQUENCY, csr);
+ spi_writel(as, CSR0 + 4 * chip_select, csr);
+
+ /*
+ * Read all data coming from SPI bus, needed to be able to send
+ * the message.
+ */
+ spi_readl(as, RDR);
+ while (spi_readl(as, SR) & SPI_BIT(RDRF)) {
+ spi_readl(as, RDR);
+ cpu_relax();
+ }
+
+ spi_writel(as, TDR, DUMMY_MSG);
+
+ readl_poll_timeout_atomic(as->regs + SPI_SR, status,
+ (status & SPI_BIT(TXEMPTY)), 1, 1000);
+}
+
+
/*
* Earlier SPI controllers (e.g. on at91rm9200) have a design bug whereby
* they assume that spi slave device state will not change on deselect, so
* Master on Chip Select 0.") No workaround exists for that ... so for
* nCS0 on that chip, we (a) don't use the GPIO, (b) can't support CS_HIGH,
* and (c) will trigger that first erratum in some cases.
+ *
+ * When changing the clock polarity, the SPI controller waits for the next
+ * transmission to enforce the default clock state. This may be an issue when
+ * using a GPIO as Chip Select: the clock level is applied only when the first
+ * packet is sent, once the CS has already been asserted. The workaround is to
+ * avoid this by sending a first (dummy) message before toggling the CS state.
*/
-
static void cs_activate(struct atmel_spi *as, struct spi_device *spi)
{
struct atmel_spi_device *asd = spi->controller_state;
+ bool new_polarity;
int chip_select;
u32 mr;
}
mr = spi_readl(as, MR);
+
+ /*
+ * Ensures the clock polarity is valid before we actually
+ * assert the CS to avoid spurious clock edges to be
+ * processed by the spi devices.
+ */
+ if (spi_get_csgpiod(spi, 0)) {
+ new_polarity = (asd->csr & SPI_BIT(CPOL)) != 0;
+ if (new_polarity != as->last_polarity) {
+ /*
+ * Need to disable the GPIO before sending the dummy
+ * message because it is already set by the spi core.
+ */
+ gpiod_set_value_cansleep(spi_get_csgpiod(spi, 0), 0);
+ atmel_spi_send_dummy(as, spi, chip_select);
+ as->last_polarity = new_polarity;
+ gpiod_set_value_cansleep(spi_get_csgpiod(spi, 0), 1);
+ }
+ }
} else {
u32 cpol = (spi->mode & SPI_CPOL) ? SPI_BIT(CPOL) : 0;
int i;
}
dma_timeout = msecs_to_jiffies(spi_controller_xfer_timeout(host, xfer));
- ret_timeout = wait_for_completion_interruptible_timeout(&as->xfer_completion,
- dma_timeout);
- if (ret_timeout <= 0) {
- dev_err(&spi->dev, "spi transfer %s\n",
- !ret_timeout ? "timeout" : "canceled");
- as->done_status = ret_timeout < 0 ? ret_timeout : -EIO;
+ ret_timeout = wait_for_completion_timeout(&as->xfer_completion, dma_timeout);
+ if (!ret_timeout) {
+ dev_err(&spi->dev, "spi transfer timeout\n");
+ as->done_status = -EIO;
}
if (as->done_status)
udelay(10);
cdns_spi_process_fifo(xspi, xspi->tx_fifo_depth, 0);
- spi_transfer_delay_exec(transfer);
cdns_spi_write(xspi, CDNS_SPI_IER, CDNS_SPI_IXR_DEFAULT);
return transfer->len;
ctrl |= (spi_imx->target_burst * 8 - 1)
<< MX51_ECSPI_CTRL_BL_OFFSET;
else {
- if (spi_imx->count >= 512)
- ctrl |= 0xFFF << MX51_ECSPI_CTRL_BL_OFFSET;
- else
- ctrl |= (spi_imx->count * spi_imx->bits_per_word - 1)
+ if (spi_imx->usedma) {
+ ctrl |= (spi_imx->bits_per_word *
+ spi_imx_bytes_per_word(spi_imx->bits_per_word) - 1)
<< MX51_ECSPI_CTRL_BL_OFFSET;
+ } else {
+ if (spi_imx->count >= MX51_ECSPI_CTRL_MAX_BURST)
+ ctrl |= (MX51_ECSPI_CTRL_MAX_BURST - 1)
+ << MX51_ECSPI_CTRL_BL_OFFSET;
+ else
+ ctrl |= (spi_imx->count * spi_imx->bits_per_word - 1)
+ << MX51_ECSPI_CTRL_BL_OFFSET;
+ }
}
/* set clock speed */
kfree(optee_device);
}
-static int optee_register_device(const uuid_t *device_uuid)
+static ssize_t need_supplicant_show(struct device *dev,
+ struct device_attribute *attr,
+ char *buf)
+{
+ return 0;
+}
+
+static DEVICE_ATTR_RO(need_supplicant);
+
+static int optee_register_device(const uuid_t *device_uuid, u32 func)
{
struct tee_client_device *optee_device = NULL;
int rc;
put_device(&optee_device->dev);
}
+ if (func == PTA_CMD_GET_DEVICES_SUPP)
+ device_create_file(&optee_device->dev,
+ &dev_attr_need_supplicant);
+
return rc;
}
num_devices = shm_size / sizeof(uuid_t);
for (idx = 0; idx < num_devices; idx++) {
- rc = optee_register_device(&device_uuid[idx]);
+ rc = optee_register_device(&device_uuid[idx], func);
if (rc)
goto out_shm;
}
snprintf(dir_name, sizeof(dir_name), "port%d", port->port);
parent = debugfs_lookup(dir_name, port->sw->debugfs_dir);
if (parent)
- debugfs_remove_recursive(debugfs_lookup("margining", parent));
+ debugfs_lookup_and_remove("margining", parent);
kfree(port->usb4->margining);
port->usb4->margining = NULL;
* Only set bonding if the link was not already bonded. This
* avoids the lane adapter to re-enter bonding state.
*/
- if (width == TB_LINK_WIDTH_SINGLE) {
+ if (width == TB_LINK_WIDTH_SINGLE && !tb_is_upstream_port(port)) {
ret = tb_port_set_lane_bonding(port, true);
if (ret)
goto err_lane1;
return tb_port_wait_for_link_width(down, TB_LINK_WIDTH_SINGLE, 100);
}
+/* Note updating sw->link_width done in tb_switch_update_link_attributes() */
static int tb_switch_asym_enable(struct tb_switch *sw, enum tb_link_width width)
{
struct tb_port *up, *down, *port;
return ret;
}
- sw->link_width = width;
return 0;
}
+/* Note updating sw->link_width done in tb_switch_update_link_attributes() */
static int tb_switch_asym_disable(struct tb_switch *sw)
{
struct tb_port *up, *down;
return ret;
}
- sw->link_width = TB_LINK_WIDTH_DUAL;
return 0;
}
if (!tb_switch_query_dp_resource(sw, port))
continue;
- list_add(&port->list, &tcm->dp_resources);
+ /*
+ * If DP IN on device router exist, position it at the
+ * beginning of the DP resources list, so that it is used
+ * before DP IN of the host router. This way external GPU(s)
+ * will be prioritized when pairing DP IN to a DP OUT.
+ */
+ if (tb_route(sw))
+ list_add(&port->list, &tcm->dp_resources);
+ else
+ list_add_tail(&port->list, &tcm->dp_resources);
+
tb_port_dbg(port, "DP IN resource available\n");
}
}
goto err_request;
/*
- * Always keep 1000 Mb/s to make sure xHCI has at least some
+ * Always keep 900 Mb/s to make sure xHCI has at least some
* bandwidth available for isochronous traffic.
*/
- if (consumed_up < 1000)
- consumed_up = 1000;
- if (consumed_down < 1000)
- consumed_down = 1000;
+ if (consumed_up < 900)
+ consumed_up = 900;
+ if (consumed_down < 900)
+ consumed_down = 900;
ret = usb4_usb3_port_write_allocated_bandwidth(port, consumed_up,
consumed_down);
{ "INT33C5", (kernel_ulong_t)&dw8250_dw_apb },
{ "INT3434", (kernel_ulong_t)&dw8250_dw_apb },
{ "INT3435", (kernel_ulong_t)&dw8250_dw_apb },
+ { "INTC10EE", (kernel_ulong_t)&dw8250_dw_apb },
{ },
};
MODULE_DEVICE_TABLE(acpi, dw8250_acpi_match);
OF_EARLYCON_DECLARE(omap8250, "ti,omap2-uart", early_omap8250_setup);
OF_EARLYCON_DECLARE(omap8250, "ti,omap3-uart", early_omap8250_setup);
OF_EARLYCON_DECLARE(omap8250, "ti,omap4-uart", early_omap8250_setup);
+OF_EARLYCON_DECLARE(omap8250, "ti,am654-uart", early_omap8250_setup);
#endif
if (priv->habit & UART_HAS_RHR_IT_DIS) {
reg = serial_in(p, UART_OMAP_IER2);
reg &= ~UART_OMAP_IER2_RHR_IT_DIS;
- serial_out(p, UART_OMAP_IER2, UART_OMAP_IER2_RHR_IT_DIS);
+ serial_out(p, UART_OMAP_IER2, reg);
}
dmaengine_tx_status(rxchan, cookie, &state);
if (priv->habit & UART_HAS_RHR_IT_DIS) {
reg = serial_in(p, UART_OMAP_IER2);
reg |= UART_OMAP_IER2_RHR_IT_DIS;
- serial_out(p, UART_OMAP_IER2, UART_OMAP_IER2_RHR_IT_DIS);
+ serial_out(p, UART_OMAP_IER2, reg);
}
dma_async_issue_pending(dma->rxchan);
status = serial_port_in(port, UART_LSR);
- if (priv->habit & UART_HAS_EFR2)
- am654_8250_handle_rx_dma(up, iir, status);
- else
- status = omap_8250_handle_rx_dma(up, iir, status);
+ if ((iir & 0x3f) != UART_IIR_THRI) {
+ if (priv->habit & UART_HAS_EFR2)
+ am654_8250_handle_rx_dma(up, iir, status);
+ else
+ status = omap_8250_handle_rx_dma(up, iir, status);
+ }
serial8250_modem_status(up);
if (status & UART_LSR_THRE && up->dma->tx_err) {
/* Deals with DMA transactions */
-struct pl011_sgbuf {
- struct scatterlist sg;
- char *buf;
+struct pl011_dmabuf {
+ dma_addr_t dma;
+ size_t len;
+ char *buf;
};
struct pl011_dmarx_data {
struct dma_chan *chan;
struct completion complete;
bool use_buf_b;
- struct pl011_sgbuf sgbuf_a;
- struct pl011_sgbuf sgbuf_b;
+ struct pl011_dmabuf dbuf_a;
+ struct pl011_dmabuf dbuf_b;
dma_cookie_t cookie;
bool running;
struct timer_list timer;
struct pl011_dmatx_data {
struct dma_chan *chan;
- struct scatterlist sg;
+ dma_addr_t dma;
+ size_t len;
char *buf;
bool queued;
};
#define PL011_DMA_BUFFER_SIZE PAGE_SIZE
-static int pl011_sgbuf_init(struct dma_chan *chan, struct pl011_sgbuf *sg,
+static int pl011_dmabuf_init(struct dma_chan *chan, struct pl011_dmabuf *db,
enum dma_data_direction dir)
{
- dma_addr_t dma_addr;
-
- sg->buf = dma_alloc_coherent(chan->device->dev,
- PL011_DMA_BUFFER_SIZE, &dma_addr, GFP_KERNEL);
- if (!sg->buf)
+ db->buf = dma_alloc_coherent(chan->device->dev, PL011_DMA_BUFFER_SIZE,
+ &db->dma, GFP_KERNEL);
+ if (!db->buf)
return -ENOMEM;
-
- sg_init_table(&sg->sg, 1);
- sg_set_page(&sg->sg, phys_to_page(dma_addr),
- PL011_DMA_BUFFER_SIZE, offset_in_page(dma_addr));
- sg_dma_address(&sg->sg) = dma_addr;
- sg_dma_len(&sg->sg) = PL011_DMA_BUFFER_SIZE;
+ db->len = PL011_DMA_BUFFER_SIZE;
return 0;
}
-static void pl011_sgbuf_free(struct dma_chan *chan, struct pl011_sgbuf *sg,
+static void pl011_dmabuf_free(struct dma_chan *chan, struct pl011_dmabuf *db,
enum dma_data_direction dir)
{
- if (sg->buf) {
+ if (db->buf) {
dma_free_coherent(chan->device->dev,
- PL011_DMA_BUFFER_SIZE, sg->buf,
- sg_dma_address(&sg->sg));
+ PL011_DMA_BUFFER_SIZE, db->buf, db->dma);
}
}
uart_port_lock_irqsave(&uap->port, &flags);
if (uap->dmatx.queued)
- dma_unmap_sg(dmatx->chan->device->dev, &dmatx->sg, 1,
- DMA_TO_DEVICE);
+ dma_unmap_single(dmatx->chan->device->dev, dmatx->dma,
+ dmatx->len, DMA_TO_DEVICE);
dmacr = uap->dmacr;
uap->dmacr = dmacr & ~UART011_TXDMAE;
memcpy(&dmatx->buf[first], &xmit->buf[0], second);
}
- dmatx->sg.length = count;
-
- if (dma_map_sg(dma_dev->dev, &dmatx->sg, 1, DMA_TO_DEVICE) != 1) {
+ dmatx->len = count;
+ dmatx->dma = dma_map_single(dma_dev->dev, dmatx->buf, count,
+ DMA_TO_DEVICE);
+ if (dmatx->dma == DMA_MAPPING_ERROR) {
uap->dmatx.queued = false;
dev_dbg(uap->port.dev, "unable to map TX DMA\n");
return -EBUSY;
}
- desc = dmaengine_prep_slave_sg(chan, &dmatx->sg, 1, DMA_MEM_TO_DEV,
+ desc = dmaengine_prep_slave_single(chan, dmatx->dma, dmatx->len, DMA_MEM_TO_DEV,
DMA_PREP_INTERRUPT | DMA_CTRL_ACK);
if (!desc) {
- dma_unmap_sg(dma_dev->dev, &dmatx->sg, 1, DMA_TO_DEVICE);
+ dma_unmap_single(dma_dev->dev, dmatx->dma, dmatx->len, DMA_TO_DEVICE);
uap->dmatx.queued = false;
/*
* If DMA cannot be used right now, we complete this
dmaengine_terminate_async(uap->dmatx.chan);
if (uap->dmatx.queued) {
- dma_unmap_sg(uap->dmatx.chan->device->dev, &uap->dmatx.sg, 1,
- DMA_TO_DEVICE);
+ dma_unmap_single(uap->dmatx.chan->device->dev, uap->dmatx.dma,
+ uap->dmatx.len, DMA_TO_DEVICE);
uap->dmatx.queued = false;
uap->dmacr &= ~UART011_TXDMAE;
pl011_write(uap->dmacr, uap, REG_DMACR);
struct dma_chan *rxchan = uap->dmarx.chan;
struct pl011_dmarx_data *dmarx = &uap->dmarx;
struct dma_async_tx_descriptor *desc;
- struct pl011_sgbuf *sgbuf;
+ struct pl011_dmabuf *dbuf;
if (!rxchan)
return -EIO;
/* Start the RX DMA job */
- sgbuf = uap->dmarx.use_buf_b ?
- &uap->dmarx.sgbuf_b : &uap->dmarx.sgbuf_a;
- desc = dmaengine_prep_slave_sg(rxchan, &sgbuf->sg, 1,
+ dbuf = uap->dmarx.use_buf_b ?
+ &uap->dmarx.dbuf_b : &uap->dmarx.dbuf_a;
+ desc = dmaengine_prep_slave_single(rxchan, dbuf->dma, dbuf->len,
DMA_DEV_TO_MEM,
DMA_PREP_INTERRUPT | DMA_CTRL_ACK);
/*
bool readfifo)
{
struct tty_port *port = &uap->port.state->port;
- struct pl011_sgbuf *sgbuf = use_buf_b ?
- &uap->dmarx.sgbuf_b : &uap->dmarx.sgbuf_a;
+ struct pl011_dmabuf *dbuf = use_buf_b ?
+ &uap->dmarx.dbuf_b : &uap->dmarx.dbuf_a;
int dma_count = 0;
u32 fifotaken = 0; /* only used for vdbg() */
if (uap->dmarx.poll_rate) {
/* The data can be taken by polling */
- dmataken = sgbuf->sg.length - dmarx->last_residue;
+ dmataken = dbuf->len - dmarx->last_residue;
/* Recalculate the pending size */
if (pending >= dmataken)
pending -= dmataken;
* Note that tty_insert_flip_buf() tries to take as many chars
* as it can.
*/
- dma_count = tty_insert_flip_string(port, sgbuf->buf + dmataken,
+ dma_count = tty_insert_flip_string(port, dbuf->buf + dmataken,
pending);
uap->port.icount.rx += dma_count;
/* Reset the last_residue for Rx DMA poll */
if (uap->dmarx.poll_rate)
- dmarx->last_residue = sgbuf->sg.length;
+ dmarx->last_residue = dbuf->len;
/*
* Only continue with trying to read the FIFO if all DMA chars have
{
struct pl011_dmarx_data *dmarx = &uap->dmarx;
struct dma_chan *rxchan = dmarx->chan;
- struct pl011_sgbuf *sgbuf = dmarx->use_buf_b ?
- &dmarx->sgbuf_b : &dmarx->sgbuf_a;
+ struct pl011_dmabuf *dbuf = dmarx->use_buf_b ?
+ &dmarx->dbuf_b : &dmarx->dbuf_a;
size_t pending;
struct dma_tx_state state;
enum dma_status dmastat;
pl011_write(uap->dmacr, uap, REG_DMACR);
uap->dmarx.running = false;
- pending = sgbuf->sg.length - state.residue;
+ pending = dbuf->len - state.residue;
BUG_ON(pending > PL011_DMA_BUFFER_SIZE);
/* Then we terminate the transfer - we now know our residue */
dmaengine_terminate_all(rxchan);
struct pl011_dmarx_data *dmarx = &uap->dmarx;
struct dma_chan *rxchan = dmarx->chan;
bool lastbuf = dmarx->use_buf_b;
- struct pl011_sgbuf *sgbuf = dmarx->use_buf_b ?
- &dmarx->sgbuf_b : &dmarx->sgbuf_a;
+ struct pl011_dmabuf *dbuf = dmarx->use_buf_b ?
+ &dmarx->dbuf_b : &dmarx->dbuf_a;
size_t pending;
struct dma_tx_state state;
int ret;
* the DMA irq handler. So we check the residue here.
*/
rxchan->device->device_tx_status(rxchan, dmarx->cookie, &state);
- pending = sgbuf->sg.length - state.residue;
+ pending = dbuf->len - state.residue;
BUG_ON(pending > PL011_DMA_BUFFER_SIZE);
/* Then we terminate the transfer - we now know our residue */
dmaengine_terminate_all(rxchan);
unsigned long flags;
unsigned int dmataken = 0;
unsigned int size = 0;
- struct pl011_sgbuf *sgbuf;
+ struct pl011_dmabuf *dbuf;
int dma_count;
struct dma_tx_state state;
- sgbuf = dmarx->use_buf_b ? &uap->dmarx.sgbuf_b : &uap->dmarx.sgbuf_a;
+ dbuf = dmarx->use_buf_b ? &uap->dmarx.dbuf_b : &uap->dmarx.dbuf_a;
rxchan->device->device_tx_status(rxchan, dmarx->cookie, &state);
if (likely(state.residue < dmarx->last_residue)) {
- dmataken = sgbuf->sg.length - dmarx->last_residue;
+ dmataken = dbuf->len - dmarx->last_residue;
size = dmarx->last_residue - state.residue;
- dma_count = tty_insert_flip_string(port, sgbuf->buf + dmataken,
+ dma_count = tty_insert_flip_string(port, dbuf->buf + dmataken,
size);
if (dma_count == size)
dmarx->last_residue = state.residue;
return;
}
- sg_init_one(&uap->dmatx.sg, uap->dmatx.buf, PL011_DMA_BUFFER_SIZE);
+ uap->dmatx.len = PL011_DMA_BUFFER_SIZE;
/* The DMA buffer is now the FIFO the TTY subsystem can use */
uap->port.fifosize = PL011_DMA_BUFFER_SIZE;
goto skip_rx;
/* Allocate and map DMA RX buffers */
- ret = pl011_sgbuf_init(uap->dmarx.chan, &uap->dmarx.sgbuf_a,
+ ret = pl011_dmabuf_init(uap->dmarx.chan, &uap->dmarx.dbuf_a,
DMA_FROM_DEVICE);
if (ret) {
dev_err(uap->port.dev, "failed to init DMA %s: %d\n",
goto skip_rx;
}
- ret = pl011_sgbuf_init(uap->dmarx.chan, &uap->dmarx.sgbuf_b,
+ ret = pl011_dmabuf_init(uap->dmarx.chan, &uap->dmarx.dbuf_b,
DMA_FROM_DEVICE);
if (ret) {
dev_err(uap->port.dev, "failed to init DMA %s: %d\n",
"RX buffer B", ret);
- pl011_sgbuf_free(uap->dmarx.chan, &uap->dmarx.sgbuf_a,
+ pl011_dmabuf_free(uap->dmarx.chan, &uap->dmarx.dbuf_a,
DMA_FROM_DEVICE);
goto skip_rx;
}
/* In theory, this should already be done by pl011_dma_flush_buffer */
dmaengine_terminate_all(uap->dmatx.chan);
if (uap->dmatx.queued) {
- dma_unmap_sg(uap->dmatx.chan->device->dev, &uap->dmatx.sg, 1,
- DMA_TO_DEVICE);
+ dma_unmap_single(uap->dmatx.chan->device->dev,
+ uap->dmatx.dma, uap->dmatx.len,
+ DMA_TO_DEVICE);
uap->dmatx.queued = false;
}
if (uap->using_rx_dma) {
dmaengine_terminate_all(uap->dmarx.chan);
/* Clean up the RX DMA */
- pl011_sgbuf_free(uap->dmarx.chan, &uap->dmarx.sgbuf_a, DMA_FROM_DEVICE);
- pl011_sgbuf_free(uap->dmarx.chan, &uap->dmarx.sgbuf_b, DMA_FROM_DEVICE);
+ pl011_dmabuf_free(uap->dmarx.chan, &uap->dmarx.dbuf_a, DMA_FROM_DEVICE);
+ pl011_dmabuf_free(uap->dmarx.chan, &uap->dmarx.dbuf_b, DMA_FROM_DEVICE);
if (uap->dmarx.poll_rate)
del_timer_sync(&uap->dmarx.timer);
uap->using_rx_dma = false;
*/
static void ma35d1serial_console_write(struct console *co, const char *s, u32 count)
{
- struct uart_ma35d1_port *up = &ma35d1serial_ports[co->index];
+ struct uart_ma35d1_port *up;
unsigned long flags;
int locked = 1;
u32 ier;
+ if ((co->index < 0) || (co->index >= MA35_UART_NR)) {
+ pr_warn("Failed to write on ononsole port %x, out of range\n",
+ co->index);
+ return;
+ }
+
+ up = &ma35d1serial_ports[co->index];
+
if (up->port.sysrq)
locked = 0;
else if (oops_in_progress)
case SC16IS7XX_IIR_RTOI_SRC:
case SC16IS7XX_IIR_XOFFI_SRC:
rxlen = sc16is7xx_port_read(port, SC16IS7XX_RXLVL_REG);
+
+ /*
+ * There is a silicon bug that makes the chip report a
+ * time-out interrupt but no data in the FIFO. This is
+ * described in errata section 18.1.4.
+ *
+ * When this happens, read one byte from the FIFO to
+ * clear the interrupt.
+ */
+ if (iir == SC16IS7XX_IIR_RTOI_SRC && !rxlen)
+ rxlen = 1;
+
if (rxlen)
sc16is7xx_handle_rx(port, rxlen, iir);
break;
for (i = 0; i < hba->nr_hw_queues; i++) {
hwq = &hba->uhq[i];
- hwq->max_entries = hba->nutrs;
+ hwq->max_entries = hba->nutrs + 1;
spin_lock_init(&hwq->sq_lock);
spin_lock_init(&hwq->cq_lock);
mutex_init(&hwq->sq_mutex);
int tag = scsi_cmd_to_rq(cmd)->tag;
struct ufshcd_lrb *lrbp = &hba->lrb[tag];
struct ufs_hw_queue *hwq;
+ unsigned long flags;
int err = FAILED;
if (!ufshcd_cmd_inflight(lrbp->cmd)) {
}
err = SUCCESS;
+ spin_lock_irqsave(&hwq->cq_lock, flags);
if (ufshcd_cmd_inflight(lrbp->cmd))
ufshcd_release_scsi_cmd(hba, lrbp);
+ spin_unlock_irqrestore(&hwq->cq_lock, flags);
out:
return err;
if (is_mcq_enabled(hba)) {
int utrd_size = sizeof(struct utp_transfer_req_desc);
struct utp_transfer_req_desc *src = lrbp->utr_descriptor_ptr;
- struct utp_transfer_req_desc *dest = hwq->sqe_base_addr + hwq->sq_tail_slot;
+ struct utp_transfer_req_desc *dest;
spin_lock(&hwq->sq_lock);
+ dest = hwq->sqe_base_addr + hwq->sq_tail_slot;
memcpy(dest, src, utrd_size);
ufshcd_inc_sq_tail(hwq);
spin_unlock(&hwq->sq_lock);
struct scsi_device *sdev = cmd->device;
struct Scsi_Host *shost = sdev->host;
struct ufs_hba *hba = shost_priv(shost);
+ struct ufshcd_lrb *lrbp = &hba->lrb[tag];
+ struct ufs_hw_queue *hwq;
+ unsigned long flags;
*ret = ufshcd_try_to_abort_task(hba, tag);
dev_err(hba->dev, "Aborting tag %d / CDB %#02x %s\n", tag,
hba->lrb[tag].cmd ? hba->lrb[tag].cmd->cmnd[0] : -1,
*ret ? "failed" : "succeeded");
+
+ /* Release cmd in MCQ mode if abort succeeds */
+ if (is_mcq_enabled(hba) && (*ret == 0)) {
+ hwq = ufshcd_mcq_req_to_hwq(hba, scsi_cmd_to_rq(lrbp->cmd));
+ spin_lock_irqsave(&hwq->cq_lock, flags);
+ if (ufshcd_cmd_inflight(lrbp->cmd))
+ ufshcd_release_scsi_cmd(hba, lrbp);
+ spin_unlock_irqrestore(&hwq->cq_lock, flags);
+ }
+
return *ret == 0;
}
err = ufs_qcom_clk_scale_up_pre_change(hba);
else
err = ufs_qcom_clk_scale_down_pre_change(hba);
- if (err)
- ufshcd_uic_hibern8_exit(hba);
+ if (err) {
+ ufshcd_uic_hibern8_exit(hba);
+ return err;
+ }
} else {
if (scale_up)
err = ufs_qcom_clk_scale_up_post_change(hba);
* Vinayak Holikatti <h.vinayak@samsung.com>
*/
+#include <linux/clk.h>
#include <linux/module.h>
#include <linux/platform_device.h>
#include <linux/pm_opp.h>
}
}
+/**
+ * ufshcd_parse_clock_min_max_freq - Parse MIN and MAX clocks freq
+ * @hba: per adapter instance
+ *
+ * This function parses MIN and MAX frequencies of all clocks required
+ * by the host drivers.
+ *
+ * Returns 0 for success and non-zero for failure
+ */
+static int ufshcd_parse_clock_min_max_freq(struct ufs_hba *hba)
+{
+ struct list_head *head = &hba->clk_list_head;
+ struct ufs_clk_info *clki;
+ struct dev_pm_opp *opp;
+ unsigned long freq;
+ u8 idx = 0;
+
+ list_for_each_entry(clki, head, list) {
+ if (!clki->name)
+ continue;
+
+ clki->clk = devm_clk_get(hba->dev, clki->name);
+ if (IS_ERR(clki->clk))
+ continue;
+
+ /* Find Max Freq */
+ freq = ULONG_MAX;
+ opp = dev_pm_opp_find_freq_floor_indexed(hba->dev, &freq, idx);
+ if (IS_ERR(opp)) {
+ dev_err(hba->dev, "Failed to find OPP for MAX frequency\n");
+ return PTR_ERR(opp);
+ }
+ clki->max_freq = dev_pm_opp_get_freq_indexed(opp, idx);
+ dev_pm_opp_put(opp);
+
+ /* Find Min Freq */
+ freq = 0;
+ opp = dev_pm_opp_find_freq_ceil_indexed(hba->dev, &freq, idx);
+ if (IS_ERR(opp)) {
+ dev_err(hba->dev, "Failed to find OPP for MIN frequency\n");
+ return PTR_ERR(opp);
+ }
+ clki->min_freq = dev_pm_opp_get_freq_indexed(opp, idx++);
+ dev_pm_opp_put(opp);
+ }
+
+ return 0;
+}
+
static int ufshcd_parse_operating_points(struct ufs_hba *hba)
{
struct device *dev = hba->dev;
return ret;
}
+ ret = ufshcd_parse_clock_min_max_freq(hba);
+ if (ret)
+ return ret;
+
hba->use_pm_opp = true;
return 0;
unsigned long flags;
int counter = 0;
+ local_bh_disable();
spin_lock_irqsave(&pdev->lock, flags);
if (pdev->cdnsp_state & (CDNSP_STATE_HALTED | CDNSP_STATE_DYING)) {
cdnsp_died(pdev);
spin_unlock_irqrestore(&pdev->lock, flags);
+ local_bh_enable();
return IRQ_HANDLED;
}
cdnsp_update_erst_dequeue(pdev, event_ring_deq, 1);
spin_unlock_irqrestore(&pdev->lock, flags);
+ local_bh_enable();
return IRQ_HANDLED;
}
if (cap->bDescriptorType != USB_DT_DEVICE_CAPABILITY) {
dev_notice(ddev, "descriptor type invalid, skip\n");
- continue;
+ goto skip_to_next_descriptor;
}
switch (cap_type) {
break;
}
+skip_to_next_descriptor:
total_len -= length;
buffer += length;
}
ret = 0;
}
mutex_unlock(&hub->status_mutex);
-
- /*
- * There is no need to lock status_mutex here, because status_mutex
- * protects hub->status, and the phy driver only checks the port
- * status without changing the status.
- */
- if (!ret) {
- struct usb_device *hdev = hub->hdev;
-
- /*
- * Only roothub will be notified of port state changes,
- * since the USB PHY only cares about changes at the next
- * level.
- */
- if (is_root_hub(hdev)) {
- struct usb_hcd *hcd = bus_to_hcd(hdev->bus);
-
- if (hcd->usb_phy)
- usb_phy_notify_port_status(hcd->usb_phy,
- port1 - 1, *status, *change);
- }
- }
-
return ret;
}
{
struct dwc2_qtd *qtd;
struct dwc2_host_chan *chan;
- u32 hcint, hcintmsk;
+ u32 hcint, hcintraw, hcintmsk;
chan = hsotg->hc_ptr_array[chnum];
- hcint = dwc2_readl(hsotg, HCINT(chnum));
+ hcintraw = dwc2_readl(hsotg, HCINT(chnum));
hcintmsk = dwc2_readl(hsotg, HCINTMSK(chnum));
+ hcint = hcintraw & hcintmsk;
+ dwc2_writel(hsotg, hcint, HCINT(chnum));
+
if (!chan) {
dev_err(hsotg->dev, "## hc_ptr_array for channel is NULL ##\n");
- dwc2_writel(hsotg, hcint, HCINT(chnum));
return;
}
chnum);
dev_vdbg(hsotg->dev,
" hcint 0x%08x, hcintmsk 0x%08x, hcint&hcintmsk 0x%08x\n",
- hcint, hcintmsk, hcint & hcintmsk);
+ hcintraw, hcintmsk, hcint);
}
- dwc2_writel(hsotg, hcint, HCINT(chnum));
-
/*
* If we got an interrupt after someone called
* dwc2_hcd_endpoint_disable() we don't want to crash below
return;
}
- chan->hcint = hcint;
- hcint &= hcintmsk;
+ chan->hcint = hcintraw;
/*
* If the channel was halted due to a dequeue, the qtd list might
pm_runtime_put(dev);
+ dma_set_max_seg_size(dev, UINT_MAX);
+
return 0;
err_exit_debugfs:
dwc->role_switch_default_mode = USB_DR_MODE_PERIPHERAL;
mode = DWC3_GCTL_PRTCAP_DEVICE;
}
+ dwc3_set_mode(dwc, mode);
dwc3_role_switch.fwnode = dev_fwnode(dwc->dev);
dwc3_role_switch.set = dwc3_usb_role_switch_set;
}
}
- dwc3_set_mode(dwc, mode);
return 0;
}
#else
pdata ? pdata->hs_phy_irq_index : -1);
if (irq > 0) {
/* Keep wakeup interrupts disabled until suspend */
- irq_set_status_flags(irq, IRQ_NOAUTOEN);
ret = devm_request_threaded_irq(qcom->dev, irq, NULL,
qcom_dwc3_resume_irq,
- IRQF_TRIGGER_HIGH | IRQF_ONESHOT,
+ IRQF_ONESHOT | IRQF_NO_AUTOEN,
"qcom_dwc3 HS", qcom);
if (ret) {
dev_err(qcom->dev, "hs_phy_irq failed: %d\n", ret);
irq = dwc3_qcom_get_irq(pdev, "dp_hs_phy_irq",
pdata ? pdata->dp_hs_phy_irq_index : -1);
if (irq > 0) {
- irq_set_status_flags(irq, IRQ_NOAUTOEN);
ret = devm_request_threaded_irq(qcom->dev, irq, NULL,
qcom_dwc3_resume_irq,
- IRQF_TRIGGER_HIGH | IRQF_ONESHOT,
+ IRQF_ONESHOT | IRQF_NO_AUTOEN,
"qcom_dwc3 DP_HS", qcom);
if (ret) {
dev_err(qcom->dev, "dp_hs_phy_irq failed: %d\n", ret);
irq = dwc3_qcom_get_irq(pdev, "dm_hs_phy_irq",
pdata ? pdata->dm_hs_phy_irq_index : -1);
if (irq > 0) {
- irq_set_status_flags(irq, IRQ_NOAUTOEN);
ret = devm_request_threaded_irq(qcom->dev, irq, NULL,
qcom_dwc3_resume_irq,
- IRQF_TRIGGER_HIGH | IRQF_ONESHOT,
+ IRQF_ONESHOT | IRQF_NO_AUTOEN,
"qcom_dwc3 DM_HS", qcom);
if (ret) {
dev_err(qcom->dev, "dm_hs_phy_irq failed: %d\n", ret);
irq = dwc3_qcom_get_irq(pdev, "ss_phy_irq",
pdata ? pdata->ss_phy_irq_index : -1);
if (irq > 0) {
- irq_set_status_flags(irq, IRQ_NOAUTOEN);
ret = devm_request_threaded_irq(qcom->dev, irq, NULL,
qcom_dwc3_resume_irq,
- IRQF_TRIGGER_HIGH | IRQF_ONESHOT,
+ IRQF_ONESHOT | IRQF_NO_AUTOEN,
"qcom_dwc3 SS", qcom);
if (ret) {
dev_err(qcom->dev, "ss_phy_irq failed: %d\n", ret);
if (!qcom->dwc3) {
ret = -ENODEV;
dev_err(dev, "failed to get dwc3 platform device\n");
+ of_platform_depopulate(dev);
}
node_put:
return ret;
}
-static struct platform_device *
-dwc3_qcom_create_urs_usb_platdev(struct device *dev)
+static struct platform_device *dwc3_qcom_create_urs_usb_platdev(struct device *dev)
{
+ struct platform_device *urs_usb = NULL;
struct fwnode_handle *fwh;
struct acpi_device *adev;
char name[8];
adev = to_acpi_device_node(fwh);
if (!adev)
- return NULL;
+ goto err_put_handle;
+
+ urs_usb = acpi_create_platform_device(adev, NULL);
+ if (IS_ERR_OR_NULL(urs_usb))
+ goto err_put_handle;
+
+ return urs_usb;
+
+err_put_handle:
+ fwnode_handle_put(fwh);
+
+ return urs_usb;
+}
- return acpi_create_platform_device(adev, NULL);
+static void dwc3_qcom_destroy_urs_usb_platdev(struct platform_device *urs_usb)
+{
+ struct fwnode_handle *fwh = urs_usb->dev.fwnode;
+
+ platform_device_unregister(urs_usb);
+ fwnode_handle_put(fwh);
}
static int dwc3_qcom_probe(struct platform_device *pdev)
qcom->qscratch_base = devm_ioremap_resource(dev, parent_res);
if (IS_ERR(qcom->qscratch_base)) {
ret = PTR_ERR(qcom->qscratch_base);
- goto clk_disable;
+ goto free_urs;
}
ret = dwc3_qcom_setup_irq(pdev);
if (ret) {
dev_err(dev, "failed to setup IRQs, err=%d\n", ret);
- goto clk_disable;
+ goto free_urs;
}
/*
if (ret) {
dev_err(dev, "failed to register DWC3 Core, err=%d\n", ret);
- goto depopulate;
+ goto free_urs;
}
ret = dwc3_qcom_interconnect_init(qcom);
interconnect_exit:
dwc3_qcom_interconnect_exit(qcom);
depopulate:
- if (np)
+ if (np) {
of_platform_depopulate(&pdev->dev);
- else
- platform_device_put(pdev);
+ } else {
+ device_remove_software_node(&qcom->dwc3->dev);
+ platform_device_del(qcom->dwc3);
+ }
+ platform_device_put(qcom->dwc3);
+free_urs:
+ if (qcom->urs_usb)
+ dwc3_qcom_destroy_urs_usb_platdev(qcom->urs_usb);
clk_disable:
for (i = qcom->num_clocks - 1; i >= 0; i--) {
clk_disable_unprepare(qcom->clks[i]);
struct device *dev = &pdev->dev;
int i;
- device_remove_software_node(&qcom->dwc3->dev);
- if (np)
+ if (np) {
of_platform_depopulate(&pdev->dev);
- else
- platform_device_put(pdev);
+ } else {
+ device_remove_software_node(&qcom->dwc3->dev);
+ platform_device_del(qcom->dwc3);
+ }
+ platform_device_put(qcom->dwc3);
+
+ if (qcom->urs_usb)
+ dwc3_qcom_destroy_urs_usb_platdev(qcom->urs_usb);
for (i = qcom->num_clocks - 1; i >= 0; i--) {
clk_disable_unprepare(qcom->clks[i]);
ret = of_property_read_string(dwc3_np, "maximum-speed", &maximum_speed);
if (ret < 0)
- return USB_SPEED_UNKNOWN;
+ goto out;
ret = match_string(speed_names, ARRAY_SIZE(speed_names), maximum_speed);
+out:
+ of_node_put(dwc3_np);
+
return (ret < 0) ? USB_SPEED_UNKNOWN : ret;
}
switch_usb2_role(rtk, rtk->cur_role);
+ platform_device_put(dwc3_pdev);
+ of_node_put(dwc3_node);
+
return 0;
err_pdev_put:
temp = size;
size -= temp;
next += temp;
- if (temp == size)
- goto done;
}
temp = snprintf(next, size, "\n");
size -= temp;
next += temp;
-done:
*sizep = size;
*nextp = next;
}
{
struct f_hidg *hidg = container_of(dev, struct f_hidg, dev);
+ kfree(hidg->report_desc);
kfree(hidg->set_report_buf);
kfree(hidg);
}
hidg->report_length = opts->report_length;
hidg->report_desc_length = opts->report_desc_length;
if (opts->report_desc) {
- hidg->report_desc = devm_kmemdup(&hidg->dev, opts->report_desc,
- opts->report_desc_length,
- GFP_KERNEL);
+ hidg->report_desc = kmemdup(opts->report_desc,
+ opts->report_desc_length,
+ GFP_KERNEL);
if (!hidg->report_desc) {
ret = -ENOMEM;
goto err_put_device;
dev_dbg(&udc->dev, "unbinding gadget driver [%s]\n", driver->function);
- kobject_uevent(&udc->dev.kobj, KOBJ_CHANGE);
-
udc->allow_connect = false;
cancel_work_sync(&udc->vbus_work);
mutex_lock(&udc->connect_lock);
driver->is_bound = false;
udc->driver = NULL;
mutex_unlock(&udc_lock);
+
+ kobject_uevent(&udc->dev.kobj, KOBJ_CHANGE);
}
/* ------------------------------------------------------------------------- */
if (sch_ep->ep_type == ISOC_OUT_EP) {
for (j = 0; j < sch_ep->num_budget_microframes; j++) {
- k = XHCI_MTK_BW_INDEX(base + j + CS_OFFSET);
- /* use cs to indicate existence of in-ss @(base+j) */
- if (tt->fs_bus_bw_in[k])
+ k = XHCI_MTK_BW_INDEX(base + j);
+ if (tt->in_ss_cnt[k])
return -ESCH_SS_OVERLAP;
}
} else if (sch_ep->ep_type == ISOC_IN_EP || sch_ep->ep_type == INT_IN_EP) {
tt->fs_frame_bw[f] -= (u16)sch_ep->bw_budget_table[j];
}
}
+
+ if (sch_ep->ep_type == ISOC_IN_EP || sch_ep->ep_type == INT_IN_EP) {
+ k = XHCI_MTK_BW_INDEX(base);
+ if (used)
+ tt->in_ss_cnt[k]++;
+ else
+ tt->in_ss_cnt[k]--;
+ }
}
if (used)
* @fs_bus_bw_in: save bandwidth used by FS/LS IN eps in each uframes
* @ls_bus_bw: save bandwidth used by LS eps in each uframes
* @fs_frame_bw: save bandwidth used by FS/LS eps in each FS frames
+ * @in_ss_cnt: the count of Start-Split for IN eps
* @ep_list: Endpoints using this TT
*/
struct mu3h_sch_tt {
u16 fs_bus_bw_in[XHCI_MTK_MAX_ESIT];
u8 ls_bus_bw[XHCI_MTK_MAX_ESIT];
u16 fs_frame_bw[XHCI_MTK_FRAMES_CNT];
+ u8 in_ss_cnt[XHCI_MTK_MAX_ESIT];
struct list_head ep_list;
};
/* xHC spec requires PCI devices to support D3hot and D3cold */
if (xhci->hci_version >= 0x120)
xhci->quirks |= XHCI_DEFAULT_PM_RUNTIME_ALLOW;
- else if (pdev->vendor == PCI_VENDOR_ID_AMD && xhci->hci_version >= 0x110)
- xhci->quirks |= XHCI_DEFAULT_PM_RUNTIME_ALLOW;
if (xhci->quirks & XHCI_RESET_ON_RESUME)
xhci_dbg_trace(xhci, trace_xhci_dbg_quirks,
#include <linux/module.h>
#include <linux/pci.h>
#include <linux/of.h>
+#include <linux/of_device.h>
#include <linux/platform_device.h>
#include <linux/usb/phy.h>
#include <linux/slab.h>
int ret;
int irq;
struct xhci_plat_priv *priv = NULL;
-
+ bool of_match;
if (usb_disabled())
return -ENODEV;
&xhci->imod_interval);
}
- hcd->usb_phy = devm_usb_get_phy_by_phandle(sysdev, "usb-phy", 0);
- if (IS_ERR(hcd->usb_phy)) {
- ret = PTR_ERR(hcd->usb_phy);
- if (ret == -EPROBE_DEFER)
- goto disable_clk;
- hcd->usb_phy = NULL;
- } else {
- ret = usb_phy_init(hcd->usb_phy);
- if (ret)
- goto disable_clk;
+ /*
+ * Drivers such as dwc3 manages PHYs themself (and rely on driver name
+ * matching for the xhci platform device).
+ */
+ of_match = of_match_device(pdev->dev.driver->of_match_table, &pdev->dev);
+ if (of_match) {
+ hcd->usb_phy = devm_usb_get_phy_by_phandle(sysdev, "usb-phy", 0);
+ if (IS_ERR(hcd->usb_phy)) {
+ ret = PTR_ERR(hcd->usb_phy);
+ if (ret == -EPROBE_DEFER)
+ goto disable_clk;
+ hcd->usb_phy = NULL;
+ } else {
+ ret = usb_phy_init(hcd->usb_phy);
+ if (ret)
+ goto disable_clk;
+ }
}
hcd->tpl_support = of_usb_host_tpl_support(sysdev->of_node);
goto dealloc_usb2_hcd;
}
- xhci->shared_hcd->usb_phy = devm_usb_get_phy_by_phandle(sysdev,
- "usb-phy", 1);
- if (IS_ERR(xhci->shared_hcd->usb_phy)) {
- xhci->shared_hcd->usb_phy = NULL;
- } else {
- ret = usb_phy_init(xhci->shared_hcd->usb_phy);
- if (ret)
- dev_err(sysdev, "%s init usb3phy fail (ret=%d)\n",
- __func__, ret);
+ if (of_match) {
+ xhci->shared_hcd->usb_phy = devm_usb_get_phy_by_phandle(sysdev,
+ "usb-phy", 1);
+ if (IS_ERR(xhci->shared_hcd->usb_phy)) {
+ xhci->shared_hcd->usb_phy = NULL;
+ } else {
+ ret = usb_phy_init(xhci->shared_hcd->usb_phy);
+ if (ret)
+ dev_err(sysdev, "%s init usb3phy fail (ret=%d)\n",
+ __func__, ret);
+ }
}
xhci->shared_hcd->tpl_support = hcd->tpl_support;
{ USB_DEVICE(VENDOR_ID_MICROCHIP, 0x2412) }, /* USB2412 USB 2.0 */
{ USB_DEVICE(VENDOR_ID_MICROCHIP, 0x2514) }, /* USB2514B USB 2.0 */
{ USB_DEVICE(VENDOR_ID_MICROCHIP, 0x2517) }, /* USB2517 USB 2.0 */
+ { USB_DEVICE(VENDOR_ID_MICROCHIP, 0x2744) }, /* USB5744 USB 2.0 */
+ { USB_DEVICE(VENDOR_ID_MICROCHIP, 0x5744) }, /* USB5744 USB 3.0 */
{ USB_DEVICE(VENDOR_ID_REALTEK, 0x0411) }, /* RTS5411 USB 3.1 */
{ USB_DEVICE(VENDOR_ID_REALTEK, 0x5411) }, /* RTS5411 USB 2.1 */
{ USB_DEVICE(VENDOR_ID_REALTEK, 0x0414) }, /* RTS5414 USB 3.2 */
.num_supplies = 1,
};
+static const struct onboard_hub_pdata microchip_usb5744_data = {
+ .reset_us = 0,
+ .num_supplies = 2,
+};
+
static const struct onboard_hub_pdata realtek_rts5411_data = {
.reset_us = 0,
.num_supplies = 1,
{ .compatible = "usb424,2412", .data = µchip_usb424_data, },
{ .compatible = "usb424,2514", .data = µchip_usb424_data, },
{ .compatible = "usb424,2517", .data = µchip_usb424_data, },
+ { .compatible = "usb424,2744", .data = µchip_usb5744_data, },
+ { .compatible = "usb424,5744", .data = µchip_usb5744_data, },
{ .compatible = "usb451,8140", .data = &ti_tusb8041_data, },
{ .compatible = "usb451,8142", .data = &ti_tusb8041_data, },
{ .compatible = "usb4b4,6504", .data = &cypress_hx3_data, },
u64 adr, u8 id)
{
struct ljca_match_ids_walk_data wd = { 0 };
- struct acpi_device *parent, *adev;
struct device *dev = adap->dev;
+ struct acpi_device *parent;
char uid[4];
parent = ACPI_COMPANION(dev);
return;
/*
- * get auxdev ACPI handle from the ACPI device directly
- * under the parent that matches _ADR.
- */
- adev = acpi_find_child_device(parent, adr, false);
- if (adev) {
- ACPI_COMPANION_SET(&auxdev->dev, adev);
- return;
- }
-
- /*
- * _ADR is a grey area in the ACPI specification, some
+ * Currently LJCA hw doesn't use _ADR instead the shipped
* platforms use _HID to distinguish children devices.
*/
switch (adr) {
unsigned int i;
int ret;
+ /* Not all LJCA chips implement SPI, a timeout reading the descriptors is normal */
ret = ljca_send(adap, LJCA_CLIENT_MNG, LJCA_MNG_ENUM_SPI, NULL, 0, buf,
sizeof(buf), true, LJCA_ENUM_CLIENT_TIMEOUT_MS);
if (ret < 0)
- return ret;
+ return (ret == -ETIMEDOUT) ? 0 : ret;
/* check firmware response */
desc = (struct ljca_spi_descriptor *)buf;
{ USB_DEVICE(FTDI_VID, ACTISENSE_USG_PID) },
{ USB_DEVICE(FTDI_VID, ACTISENSE_NGT_PID) },
{ USB_DEVICE(FTDI_VID, ACTISENSE_NGW_PID) },
- { USB_DEVICE(FTDI_VID, ACTISENSE_D9AC_PID) },
- { USB_DEVICE(FTDI_VID, ACTISENSE_D9AD_PID) },
- { USB_DEVICE(FTDI_VID, ACTISENSE_D9AE_PID) },
+ { USB_DEVICE(FTDI_VID, ACTISENSE_UID_PID) },
+ { USB_DEVICE(FTDI_VID, ACTISENSE_USA_PID) },
+ { USB_DEVICE(FTDI_VID, ACTISENSE_NGX_PID) },
{ USB_DEVICE(FTDI_VID, ACTISENSE_D9AF_PID) },
{ USB_DEVICE(FTDI_VID, CHETCO_SEAGAUGE_PID) },
{ USB_DEVICE(FTDI_VID, CHETCO_SEASWITCH_PID) },
#define ACTISENSE_USG_PID 0xD9A9 /* USG USB Serial Adapter */
#define ACTISENSE_NGT_PID 0xD9AA /* NGT NMEA2000 Interface */
#define ACTISENSE_NGW_PID 0xD9AB /* NGW NMEA2000 Gateway */
-#define ACTISENSE_D9AC_PID 0xD9AC /* Actisense Reserved */
-#define ACTISENSE_D9AD_PID 0xD9AD /* Actisense Reserved */
-#define ACTISENSE_D9AE_PID 0xD9AE /* Actisense Reserved */
+#define ACTISENSE_UID_PID 0xD9AC /* USB Isolating Device */
+#define ACTISENSE_USA_PID 0xD9AD /* USB to Serial Adapter */
+#define ACTISENSE_NGX_PID 0xD9AE /* NGX NMEA2000 Gateway */
#define ACTISENSE_D9AF_PID 0xD9AF /* Actisense Reserved */
#define CHETCO_SEAGAUGE_PID 0xA548 /* SeaGauge USB Adapter */
#define CHETCO_SEASWITCH_PID 0xA549 /* SeaSwitch USB Adapter */
#define DELL_PRODUCT_5829E_ESIM 0x81e4
#define DELL_PRODUCT_5829E 0x81e6
-#define DELL_PRODUCT_FM101R 0x8213
-#define DELL_PRODUCT_FM101R_ESIM 0x8215
+#define DELL_PRODUCT_FM101R_ESIM 0x8213
+#define DELL_PRODUCT_FM101R 0x8215
#define KYOCERA_VENDOR_ID 0x0c88
#define KYOCERA_PRODUCT_KPC650 0x17da
#define QUECTEL_PRODUCT_RM500Q 0x0800
#define QUECTEL_PRODUCT_RM520N 0x0801
#define QUECTEL_PRODUCT_EC200U 0x0901
+#define QUECTEL_PRODUCT_EG912Y 0x6001
#define QUECTEL_PRODUCT_EC200S_CN 0x6002
#define QUECTEL_PRODUCT_EC200A 0x6005
#define QUECTEL_PRODUCT_EM061K_LWW 0x6008
#define UNISOC_VENDOR_ID 0x1782
/* TOZED LT70-C based on UNISOC SL8563 uses UNISOC's vendor ID */
#define TOZED_PRODUCT_LT70C 0x4055
+/* Luat Air72*U series based on UNISOC UIS8910 uses UNISOC's vendor ID */
+#define LUAT_PRODUCT_AIR720U 0x4e00
/* Device flags */
{ USB_DEVICE_INTERFACE_CLASS(QUECTEL_VENDOR_ID, 0x0700, 0xff), /* BG95 */
.driver_info = RSVD(3) | ZLP },
{ USB_DEVICE_AND_INTERFACE_INFO(QUECTEL_VENDOR_ID, QUECTEL_PRODUCT_RM500Q, 0xff, 0xff, 0x30) },
+ { USB_DEVICE_AND_INTERFACE_INFO(QUECTEL_VENDOR_ID, QUECTEL_PRODUCT_RM500Q, 0xff, 0, 0x40) },
{ USB_DEVICE_AND_INTERFACE_INFO(QUECTEL_VENDOR_ID, QUECTEL_PRODUCT_RM500Q, 0xff, 0, 0) },
{ USB_DEVICE_AND_INTERFACE_INFO(QUECTEL_VENDOR_ID, QUECTEL_PRODUCT_RM500Q, 0xff, 0xff, 0x10),
.driver_info = ZLP },
{ USB_DEVICE_AND_INTERFACE_INFO(QUECTEL_VENDOR_ID, QUECTEL_PRODUCT_EC200U, 0xff, 0, 0) },
{ USB_DEVICE_AND_INTERFACE_INFO(QUECTEL_VENDOR_ID, QUECTEL_PRODUCT_EC200S_CN, 0xff, 0, 0) },
{ USB_DEVICE_AND_INTERFACE_INFO(QUECTEL_VENDOR_ID, QUECTEL_PRODUCT_EC200T, 0xff, 0, 0) },
+ { USB_DEVICE_AND_INTERFACE_INFO(QUECTEL_VENDOR_ID, QUECTEL_PRODUCT_EG912Y, 0xff, 0, 0) },
{ USB_DEVICE_AND_INTERFACE_INFO(QUECTEL_VENDOR_ID, QUECTEL_PRODUCT_RM500K, 0xff, 0x00, 0x00) },
{ USB_DEVICE(CMOTECH_VENDOR_ID, CMOTECH_PRODUCT_6001) },
{ USB_DEVICE_AND_INTERFACE_INFO(ZTE_VENDOR_ID, 0x0165, 0xff, 0xff, 0xff) },
{ USB_DEVICE_AND_INTERFACE_INFO(ZTE_VENDOR_ID, 0x0167, 0xff, 0xff, 0xff),
.driver_info = RSVD(4) },
- { USB_DEVICE_AND_INTERFACE_INFO(ZTE_VENDOR_ID, 0x0189, 0xff, 0xff, 0xff) },
+ { USB_DEVICE_AND_INTERFACE_INFO(ZTE_VENDOR_ID, 0x0189, 0xff, 0xff, 0xff),
+ .driver_info = RSVD(4) },
{ USB_DEVICE_AND_INTERFACE_INFO(ZTE_VENDOR_ID, 0x0191, 0xff, 0xff, 0xff), /* ZTE EuFi890 */
.driver_info = RSVD(4) },
{ USB_DEVICE_AND_INTERFACE_INFO(ZTE_VENDOR_ID, 0x0196, 0xff, 0xff, 0xff) },
.driver_info = RSVD(0) | RSVD(1) | RSVD(6) },
{ USB_DEVICE(0x0489, 0xe0b5), /* Foxconn T77W968 ESIM */
.driver_info = RSVD(0) | RSVD(1) | RSVD(6) },
+ { USB_DEVICE_INTERFACE_CLASS(0x0489, 0xe0da, 0xff), /* Foxconn T99W265 MBIM variant */
+ .driver_info = RSVD(3) | RSVD(5) },
{ USB_DEVICE_INTERFACE_CLASS(0x0489, 0xe0db, 0xff), /* Foxconn T99W265 MBIM */
.driver_info = RSVD(3) },
{ USB_DEVICE_INTERFACE_CLASS(0x0489, 0xe0ee, 0xff), /* Foxconn T99W368 MBIM */
.driver_info = RSVD(4) | RSVD(5) | RSVD(6) },
{ USB_DEVICE(0x1782, 0x4d10) }, /* Fibocom L610 (AT mode) */
{ USB_DEVICE_INTERFACE_CLASS(0x1782, 0x4d11, 0xff) }, /* Fibocom L610 (ECM/RNDIS mode) */
+ { USB_DEVICE_AND_INTERFACE_INFO(0x2cb7, 0x0001, 0xff, 0xff, 0xff) }, /* Fibocom L716-EU (ECM/RNDIS mode) */
{ USB_DEVICE(0x2cb7, 0x0104), /* Fibocom NL678 series */
.driver_info = RSVD(4) | RSVD(5) },
{ USB_DEVICE_INTERFACE_CLASS(0x2cb7, 0x0105, 0xff), /* Fibocom NL678 series */
{ USB_DEVICE_AND_INTERFACE_INFO(SIERRA_VENDOR_ID, SIERRA_PRODUCT_EM9191, 0xff, 0xff, 0x40) },
{ USB_DEVICE_AND_INTERFACE_INFO(SIERRA_VENDOR_ID, SIERRA_PRODUCT_EM9191, 0xff, 0, 0) },
{ USB_DEVICE_AND_INTERFACE_INFO(UNISOC_VENDOR_ID, TOZED_PRODUCT_LT70C, 0xff, 0, 0) },
+ { USB_DEVICE_AND_INTERFACE_INFO(UNISOC_VENDOR_ID, LUAT_PRODUCT_AIR720U, 0xff, 0, 0) },
{ } /* Terminating entry */
};
MODULE_DEVICE_TABLE(usb, option_ids);
USB_SC_DEVICE, USB_PR_DEVICE, NULL,
US_FL_INITIAL_READ10 ),
+/*
+ * Patch by Tasos Sahanidis <tasos@tasossah.com>
+ * This flash drive always shows up with write protect enabled
+ * during the first mode sense.
+ */
+UNUSUAL_DEV(0x0951, 0x1697, 0x0100, 0x0100,
+ "Kingston",
+ "DT Ultimate G3",
+ USB_SC_DEVICE, USB_PR_DEVICE, NULL,
+ US_FL_NO_WP_DETECT),
+
/*
* This Pentax still camera is not conformant
* to the USB storage specification: -
if (!partner)
return;
- adev = &partner->adev;
+ adev = &altmode->adev;
if (is_typec_plug(adev->dev.parent)) {
struct typec_plug *plug = to_typec_plug(adev->dev.parent);
{
struct altmode *alt = to_altmode(to_typec_altmode(dev));
- typec_altmode_put_partner(alt);
+ if (!is_typec_port(dev->parent))
+ typec_altmode_put_partner(alt);
altmode_id_remove(alt->adev.dev.parent, alt->id);
kfree(alt);
current_lim = PD_P_SNK_STDBY_MW / 5;
tcpm_set_current_limit(port, current_lim, 5000);
/* Not sink vbus if operational current is 0mA */
- tcpm_set_charge(port, !!pdo_max_current(port->snk_pdo[0]));
+ tcpm_set_charge(port, !port->pd_supported ||
+ pdo_max_current(port->snk_pdo[0]));
if (!port->pd_supported)
tcpm_set_state(port, SNK_READY, 0);
if (port->bist_request == BDO_MODE_TESTDATA && port->tcpc->set_bist_data)
port->tcpc->set_bist_data(port->tcpc, false);
+ switch (port->state) {
+ case ERROR_RECOVERY:
+ case PORT_RESET:
+ case PORT_RESET_WAIT_OFF:
+ return;
+ default:
+ break;
+ }
+
if (port->ams != NONE_AMS)
port->ams = NONE_AMS;
if (port->hard_reset_count < PD_N_HARD_RESET_COUNT)
ret = of_property_match_string(np, "reg-names", "patch-address");
if (ret < 0) {
dev_err(tps->dev, "failed to get patch-address %d\n", ret);
- return ret;
+ goto release_fw;
}
ret = of_property_read_u32_index(np, "reg", ret, &addr);
if (ret)
- return ret;
+ goto release_fw;
if (addr == 0 || (addr >= 0x20 && addr <= 0x23)) {
dev_err(tps->dev, "wrong patch address %u\n", addr);
- return -EINVAL;
+ ret = -EINVAL;
+ goto release_fw;
}
bpms_data.addr = (u8)addr;
TPS_REG_INT_PLUG_EVENT;
}
- tps->data = device_get_match_data(tps->dev);
+ if (dev_fwnode(tps->dev))
+ tps->data = device_get_match_data(tps->dev);
+ else
+ tps->data = i2c_get_match_data(client);
if (!tps->data)
return -EINVAL;
MODULE_DEVICE_TABLE(of, tps6598x_of_match);
static const struct i2c_device_id tps6598x_id[] = {
- { "tps6598x" },
+ { "tps6598x", (kernel_ulong_t)&tps6598x_data },
{ }
};
MODULE_DEVICE_TABLE(i2c, tps6598x_id);
con_num = UCSI_CCI_CONNECTOR(cci);
if (con_num) {
- if (con_num < PMIC_GLINK_MAX_PORTS &&
+ if (con_num <= PMIC_GLINK_MAX_PORTS &&
ucsi->port_orientation[con_num - 1]) {
int orientation = gpiod_get_value(ucsi->port_orientation[con_num - 1]);
struct mlx5_control_vq *cvq = &mvdev->cvq;
int err = 0;
- if (mvdev->actual_features & BIT_ULL(VIRTIO_NET_F_CTRL_VQ))
+ if (mvdev->actual_features & BIT_ULL(VIRTIO_NET_F_CTRL_VQ)) {
+ u16 idx = cvq->vring.last_avail_idx;
+
err = vringh_init_iotlb(&cvq->vring, mvdev->actual_features,
MLX5_CVQ_MAX_ENT, false,
(struct vring_desc *)(uintptr_t)cvq->desc_addr,
(struct vring_avail *)(uintptr_t)cvq->driver_addr,
(struct vring_used *)(uintptr_t)cvq->device_addr);
+ if (!err)
+ cvq->vring.last_avail_idx = cvq->vring.last_used_idx = idx;
+ }
return err;
}
debugfs_create_file("config", 0400, vdpa_aux->dentry, vdpa_aux->pdsv, &config_fops);
for (i = 0; i < vdpa_aux->pdsv->num_vqs; i++) {
- char name[8];
+ char name[16];
snprintf(name, sizeof(name), "vq%02d", i);
debugfs_create_file(name, 0400, vdpa_aux->dentry,
return -EOPNOTSUPP;
}
- pdsv->negotiated_features = nego_features;
-
driver_features = pds_vdpa_get_driver_features(vdpa_dev);
+ pdsv->negotiated_features = nego_features;
dev_dbg(dev, "%s: %#llx => %#llx\n",
__func__, driver_features, nego_features);
pds_vdpa_cmd_set_status(pdsv, status);
- /* Note: still working with FW on the need for this reset cmd */
if (status == 0) {
+ struct vdpa_callback null_cb = { };
+
+ pds_vdpa_set_config_cb(vdpa_dev, &null_cb);
pds_vdpa_cmd_reset(pdsv);
for (i = 0; i < pdsv->num_vqs; i++) {
if (blk->shared_backend) {
blk->buffer = shared_buffer;
} else {
- blk->buffer = kvmalloc(VDPASIM_BLK_CAPACITY << SECTOR_SHIFT,
+ blk->buffer = kvzalloc(VDPASIM_BLK_CAPACITY << SECTOR_SHIFT,
GFP_KERNEL);
if (!blk->buffer) {
ret = -ENOMEM;
goto parent_err;
if (shared_backend) {
- shared_buffer = kvmalloc(VDPASIM_BLK_CAPACITY << SECTOR_SHIFT,
+ shared_buffer = kvzalloc(VDPASIM_BLK_CAPACITY << SECTOR_SHIFT,
GFP_KERNEL);
if (!shared_buffer) {
ret = -ENOMEM;
* VFIO_DEVICE_STATE_RUNNING.
*/
if (deferred_reset_needed) {
- spin_lock(&pds_vfio->reset_lock);
+ mutex_lock(&pds_vfio->reset_mutex);
pds_vfio->deferred_reset = true;
pds_vfio->deferred_reset_state = VFIO_DEVICE_STATE_ERROR;
- spin_unlock(&pds_vfio->reset_lock);
+ mutex_unlock(&pds_vfio->reset_mutex);
}
}
void pds_vfio_state_mutex_unlock(struct pds_vfio_pci_device *pds_vfio)
{
again:
- spin_lock(&pds_vfio->reset_lock);
+ mutex_lock(&pds_vfio->reset_mutex);
if (pds_vfio->deferred_reset) {
pds_vfio->deferred_reset = false;
if (pds_vfio->state == VFIO_DEVICE_STATE_ERROR) {
}
pds_vfio->state = pds_vfio->deferred_reset_state;
pds_vfio->deferred_reset_state = VFIO_DEVICE_STATE_RUNNING;
- spin_unlock(&pds_vfio->reset_lock);
+ mutex_unlock(&pds_vfio->reset_mutex);
goto again;
}
mutex_unlock(&pds_vfio->state_mutex);
- spin_unlock(&pds_vfio->reset_lock);
+ mutex_unlock(&pds_vfio->reset_mutex);
}
void pds_vfio_reset(struct pds_vfio_pci_device *pds_vfio)
{
- spin_lock(&pds_vfio->reset_lock);
+ mutex_lock(&pds_vfio->reset_mutex);
pds_vfio->deferred_reset = true;
pds_vfio->deferred_reset_state = VFIO_DEVICE_STATE_RUNNING;
if (!mutex_trylock(&pds_vfio->state_mutex)) {
- spin_unlock(&pds_vfio->reset_lock);
+ mutex_unlock(&pds_vfio->reset_mutex);
return;
}
- spin_unlock(&pds_vfio->reset_lock);
+ mutex_unlock(&pds_vfio->reset_mutex);
pds_vfio_state_mutex_unlock(pds_vfio);
}
pds_vfio->vf_id = vf_id;
+ mutex_init(&pds_vfio->state_mutex);
+ mutex_init(&pds_vfio->reset_mutex);
+
vdev->migration_flags = VFIO_MIGRATION_STOP_COPY | VFIO_MIGRATION_P2P;
vdev->mig_ops = &pds_vfio_lm_ops;
vdev->log_ops = &pds_vfio_log_ops;
return 0;
}
+static void pds_vfio_release_device(struct vfio_device *vdev)
+{
+ struct pds_vfio_pci_device *pds_vfio =
+ container_of(vdev, struct pds_vfio_pci_device,
+ vfio_coredev.vdev);
+
+ mutex_destroy(&pds_vfio->state_mutex);
+ mutex_destroy(&pds_vfio->reset_mutex);
+ vfio_pci_core_release_dev(vdev);
+}
+
static int pds_vfio_open_device(struct vfio_device *vdev)
{
struct pds_vfio_pci_device *pds_vfio =
if (err)
return err;
- mutex_init(&pds_vfio->state_mutex);
pds_vfio->state = VFIO_DEVICE_STATE_RUNNING;
pds_vfio->deferred_reset_state = VFIO_DEVICE_STATE_RUNNING;
pds_vfio_put_save_file(pds_vfio);
pds_vfio_dirty_disable(pds_vfio, true);
mutex_unlock(&pds_vfio->state_mutex);
- mutex_destroy(&pds_vfio->state_mutex);
vfio_pci_core_close_device(vdev);
}
static const struct vfio_device_ops pds_vfio_ops = {
.name = "pds-vfio",
.init = pds_vfio_init_device,
- .release = vfio_pci_core_release_dev,
+ .release = pds_vfio_release_device,
.open_device = pds_vfio_open_device,
.close_device = pds_vfio_close_device,
.ioctl = vfio_pci_core_ioctl,
struct pds_vfio_dirty dirty;
struct mutex state_mutex; /* protect migration state */
enum vfio_device_mig_state state;
- spinlock_t reset_lock; /* protect reset_done flow */
+ struct mutex reset_mutex; /* protect reset_done flow */
u8 deferred_reset;
enum vfio_device_mig_state deferred_reset_state;
struct notifier_block nb;
err:
put_device(&v->dev);
- ida_simple_remove(&vhost_vdpa_ida, v->minor);
return r;
}
if (v != VIRTIO_MSI_NO_VECTOR) {
int irq = pci_irq_vector(vp_dev->pci_dev, v);
- irq_set_affinity_hint(irq, NULL);
+ irq_update_affinity_hint(irq, NULL);
free_irq(irq, vq);
}
}
mask = vp_dev->msix_affinity_masks[info->msix_vector];
irq = pci_irq_vector(vp_dev->pci_dev, info->msix_vector);
if (!cpu_mask)
- irq_set_affinity_hint(irq, NULL);
+ irq_update_affinity_hint(irq, NULL);
else {
cpumask_copy(mask, cpu_mask);
- irq_set_affinity_hint(irq, mask);
+ irq_set_affinity_and_hint(irq, mask);
}
}
return 0;
err = -EINVAL;
mdev->common = vp_modern_map_capability(mdev, common,
- sizeof(struct virtio_pci_common_cfg), 4,
- 0, sizeof(struct virtio_pci_modern_common_cfg),
- &mdev->common_len, NULL);
+ sizeof(struct virtio_pci_common_cfg), 4, 0,
+ offsetofend(struct virtio_pci_modern_common_cfg,
+ queue_reset),
+ &mdev->common_len, NULL);
if (!mdev->common)
goto err_map_common;
mdev->isr = vp_modern_map_capability(mdev, isr, sizeof(u8), 1,
int i;
struct shared_info *s = HYPERVISOR_shared_info;
struct vcpu_info *vcpu_info = __this_cpu_read(xen_vcpu);
+ evtchn_port_t evtchn;
/* Timer interrupt has highest priority. */
- irq = irq_from_virq(cpu, VIRQ_TIMER);
+ irq = irq_evtchn_from_virq(cpu, VIRQ_TIMER, &evtchn);
if (irq != -1) {
- evtchn_port_t evtchn = evtchn_from_irq(irq);
word_idx = evtchn / BITS_PER_LONG;
bit_idx = evtchn % BITS_PER_LONG;
if (active_evtchns(cpu, s, word_idx) & (1ULL << bit_idx))
for (i = 0; i < EVTCHN_2L_NR_CHANNELS; i++) {
if (sync_test_bit(i, BM(sh->evtchn_pending))) {
int word_idx = i / BITS_PER_EVTCHN_WORD;
- printk(" %d: event %d -> irq %d%s%s%s\n",
+ printk(" %d: event %d -> irq %u%s%s%s\n",
cpu_from_evtchn(i), i,
- get_evtchn_to_irq(i),
+ irq_from_evtchn(i),
sync_test_bit(word_idx, BM(&v->evtchn_pending_sel))
? "" : " l2-clear",
!sync_test_bit(i, BM(sh->evtchn_mask))
/* IRQ <-> IPI mapping */
static DEFINE_PER_CPU(int [XEN_NR_IPIS], ipi_to_irq) = {[0 ... XEN_NR_IPIS-1] = -1};
+/* Cache for IPI event channels - needed for hot cpu unplug (avoid RCU usage). */
+static DEFINE_PER_CPU(evtchn_port_t [XEN_NR_IPIS], ipi_to_evtchn) = {[0 ... XEN_NR_IPIS-1] = 0};
/* Event channel distribution data */
static atomic_t channels_on_cpu[NR_CPUS];
#ifdef CONFIG_X86
static unsigned long *pirq_eoi_map;
#endif
-static bool (*pirq_needs_eoi)(unsigned irq);
+static bool (*pirq_needs_eoi)(struct irq_info *info);
#define EVTCHN_ROW(e) (e / (PAGE_SIZE/sizeof(**evtchn_to_irq)))
#define EVTCHN_COL(e) (e % (PAGE_SIZE/sizeof(**evtchn_to_irq)))
static struct irq_chip xen_percpu_chip;
static struct irq_chip xen_pirq_chip;
static void enable_dynirq(struct irq_data *data);
-static void disable_dynirq(struct irq_data *data);
static DEFINE_PER_CPU(unsigned int, irq_epoch);
return 0;
}
-int get_evtchn_to_irq(evtchn_port_t evtchn)
-{
- if (evtchn >= xen_evtchn_max_channels())
- return -1;
- if (evtchn_to_irq[EVTCHN_ROW(evtchn)] == NULL)
- return -1;
- return READ_ONCE(evtchn_to_irq[EVTCHN_ROW(evtchn)][EVTCHN_COL(evtchn)]);
-}
-
/* Get info for IRQ */
static struct irq_info *info_for_irq(unsigned irq)
{
irq_set_chip_data(irq, info);
}
+static struct irq_info *evtchn_to_info(evtchn_port_t evtchn)
+{
+ int irq;
+
+ if (evtchn >= xen_evtchn_max_channels())
+ return NULL;
+ if (evtchn_to_irq[EVTCHN_ROW(evtchn)] == NULL)
+ return NULL;
+ irq = READ_ONCE(evtchn_to_irq[EVTCHN_ROW(evtchn)][EVTCHN_COL(evtchn)]);
+
+ return (irq < 0) ? NULL : info_for_irq(irq);
+}
+
/* Per CPU channel accounting */
static void channels_on_cpu_dec(struct irq_info *info)
{
info->is_accounted = 1;
}
+static void xen_irq_free_desc(unsigned int irq)
+{
+ /* Legacy IRQ descriptors are managed by the arch. */
+ if (irq >= nr_legacy_irqs())
+ irq_free_desc(irq);
+}
+
static void delayed_free_irq(struct work_struct *work)
{
struct irq_info *info = container_of(to_rcu_work(work), struct irq_info,
kfree(info);
- /* Legacy IRQ descriptors are managed by the arch. */
- if (irq >= nr_legacy_irqs())
- irq_free_desc(irq);
+ xen_irq_free_desc(irq);
}
/* Constructors for packed IRQ information. */
static int xen_irq_info_common_setup(struct irq_info *info,
- unsigned irq,
enum xen_irq_type type,
evtchn_port_t evtchn,
unsigned short cpu)
BUG_ON(info->type != IRQT_UNBOUND && info->type != type);
info->type = type;
- info->irq = irq;
info->evtchn = evtchn;
info->cpu = cpu;
info->mask_reason = EVT_MASK_REASON_EXPLICIT;
raw_spin_lock_init(&info->lock);
- ret = set_evtchn_to_irq(evtchn, irq);
+ ret = set_evtchn_to_irq(evtchn, info->irq);
if (ret < 0)
return ret;
- irq_clear_status_flags(irq, IRQ_NOREQUEST|IRQ_NOAUTOEN);
+ irq_clear_status_flags(info->irq, IRQ_NOREQUEST | IRQ_NOAUTOEN);
return xen_evtchn_port_setup(evtchn);
}
-static int xen_irq_info_evtchn_setup(unsigned irq,
+static int xen_irq_info_evtchn_setup(struct irq_info *info,
evtchn_port_t evtchn,
struct xenbus_device *dev)
{
- struct irq_info *info = info_for_irq(irq);
int ret;
- ret = xen_irq_info_common_setup(info, irq, IRQT_EVTCHN, evtchn, 0);
+ ret = xen_irq_info_common_setup(info, IRQT_EVTCHN, evtchn, 0);
info->u.interdomain = dev;
if (dev)
atomic_inc(&dev->event_channels);
return ret;
}
-static int xen_irq_info_ipi_setup(unsigned cpu,
- unsigned irq,
- evtchn_port_t evtchn,
- enum ipi_vector ipi)
+static int xen_irq_info_ipi_setup(struct irq_info *info, unsigned int cpu,
+ evtchn_port_t evtchn, enum ipi_vector ipi)
{
- struct irq_info *info = info_for_irq(irq);
-
info->u.ipi = ipi;
- per_cpu(ipi_to_irq, cpu)[ipi] = irq;
+ per_cpu(ipi_to_irq, cpu)[ipi] = info->irq;
+ per_cpu(ipi_to_evtchn, cpu)[ipi] = evtchn;
- return xen_irq_info_common_setup(info, irq, IRQT_IPI, evtchn, 0);
+ return xen_irq_info_common_setup(info, IRQT_IPI, evtchn, 0);
}
-static int xen_irq_info_virq_setup(unsigned cpu,
- unsigned irq,
- evtchn_port_t evtchn,
- unsigned virq)
+static int xen_irq_info_virq_setup(struct irq_info *info, unsigned int cpu,
+ evtchn_port_t evtchn, unsigned int virq)
{
- struct irq_info *info = info_for_irq(irq);
-
info->u.virq = virq;
- per_cpu(virq_to_irq, cpu)[virq] = irq;
+ per_cpu(virq_to_irq, cpu)[virq] = info->irq;
- return xen_irq_info_common_setup(info, irq, IRQT_VIRQ, evtchn, 0);
+ return xen_irq_info_common_setup(info, IRQT_VIRQ, evtchn, 0);
}
-static int xen_irq_info_pirq_setup(unsigned irq,
- evtchn_port_t evtchn,
- unsigned pirq,
- unsigned gsi,
- uint16_t domid,
- unsigned char flags)
+static int xen_irq_info_pirq_setup(struct irq_info *info, evtchn_port_t evtchn,
+ unsigned int pirq, unsigned int gsi,
+ uint16_t domid, unsigned char flags)
{
- struct irq_info *info = info_for_irq(irq);
-
info->u.pirq.pirq = pirq;
info->u.pirq.gsi = gsi;
info->u.pirq.domid = domid;
info->u.pirq.flags = flags;
- return xen_irq_info_common_setup(info, irq, IRQT_PIRQ, evtchn, 0);
+ return xen_irq_info_common_setup(info, IRQT_PIRQ, evtchn, 0);
}
static void xen_irq_info_cleanup(struct irq_info *info)
/*
* Accessors for packed IRQ information.
*/
-evtchn_port_t evtchn_from_irq(unsigned irq)
+static evtchn_port_t evtchn_from_irq(unsigned int irq)
{
const struct irq_info *info = NULL;
unsigned int irq_from_evtchn(evtchn_port_t evtchn)
{
- return get_evtchn_to_irq(evtchn);
+ struct irq_info *info = evtchn_to_info(evtchn);
+
+ return info ? info->irq : -1;
}
EXPORT_SYMBOL_GPL(irq_from_evtchn);
-int irq_from_virq(unsigned int cpu, unsigned int virq)
+int irq_evtchn_from_virq(unsigned int cpu, unsigned int virq,
+ evtchn_port_t *evtchn)
{
- return per_cpu(virq_to_irq, cpu)[virq];
+ int irq = per_cpu(virq_to_irq, cpu)[virq];
+
+ *evtchn = evtchn_from_irq(irq);
+
+ return irq;
}
-static enum ipi_vector ipi_from_irq(unsigned irq)
+static enum ipi_vector ipi_from_irq(struct irq_info *info)
{
- struct irq_info *info = info_for_irq(irq);
-
BUG_ON(info == NULL);
BUG_ON(info->type != IRQT_IPI);
return info->u.ipi;
}
-static unsigned virq_from_irq(unsigned irq)
+static unsigned int virq_from_irq(struct irq_info *info)
{
- struct irq_info *info = info_for_irq(irq);
-
BUG_ON(info == NULL);
BUG_ON(info->type != IRQT_VIRQ);
return info->u.virq;
}
-static unsigned pirq_from_irq(unsigned irq)
+static unsigned int pirq_from_irq(struct irq_info *info)
{
- struct irq_info *info = info_for_irq(irq);
-
BUG_ON(info == NULL);
BUG_ON(info->type != IRQT_PIRQ);
return info->u.pirq.pirq;
}
-static enum xen_irq_type type_from_irq(unsigned irq)
-{
- return info_for_irq(irq)->type;
-}
-
-static unsigned cpu_from_irq(unsigned irq)
-{
- return info_for_irq(irq)->cpu;
-}
-
unsigned int cpu_from_evtchn(evtchn_port_t evtchn)
{
- int irq = get_evtchn_to_irq(evtchn);
- unsigned ret = 0;
-
- if (irq != -1)
- ret = cpu_from_irq(irq);
+ struct irq_info *info = evtchn_to_info(evtchn);
- return ret;
+ return info ? info->cpu : 0;
}
static void do_mask(struct irq_info *info, u8 reason)
}
#ifdef CONFIG_X86
-static bool pirq_check_eoi_map(unsigned irq)
+static bool pirq_check_eoi_map(struct irq_info *info)
{
- return test_bit(pirq_from_irq(irq), pirq_eoi_map);
+ return test_bit(pirq_from_irq(info), pirq_eoi_map);
}
#endif
-static bool pirq_needs_eoi_flag(unsigned irq)
+static bool pirq_needs_eoi_flag(struct irq_info *info)
{
- struct irq_info *info = info_for_irq(irq);
BUG_ON(info->type != IRQT_PIRQ);
return info->u.pirq.flags & PIRQ_NEEDS_EOI;
}
-static void bind_evtchn_to_cpu(evtchn_port_t evtchn, unsigned int cpu,
+static void bind_evtchn_to_cpu(struct irq_info *info, unsigned int cpu,
bool force_affinity)
{
- int irq = get_evtchn_to_irq(evtchn);
- struct irq_info *info = info_for_irq(irq);
-
- BUG_ON(irq == -1);
-
if (IS_ENABLED(CONFIG_SMP) && force_affinity) {
- struct irq_data *data = irq_get_irq_data(irq);
+ struct irq_data *data = irq_get_irq_data(info->irq);
irq_data_update_affinity(data, cpumask_of(cpu));
irq_data_update_effective_affinity(data, cpumask_of(cpu));
}
- xen_evtchn_port_bind_to_cpu(evtchn, cpu, info->cpu);
+ xen_evtchn_port_bind_to_cpu(info->evtchn, cpu, info->cpu);
channels_on_cpu_dec(info);
info->cpu = cpu;
spin_lock_irqsave(&eoi->eoi_list_lock, flags);
- if (list_empty(&eoi->eoi_list)) {
+ elem = list_first_entry_or_null(&eoi->eoi_list, struct irq_info,
+ eoi_list);
+ if (!elem || info->eoi_time < elem->eoi_time) {
list_add(&info->eoi_list, &eoi->eoi_list);
mod_delayed_work_on(info->eoi_cpu, system_wq,
&eoi->delayed, delay);
}
EXPORT_SYMBOL_GPL(xen_irq_lateeoi);
-static void xen_irq_init(unsigned irq)
+static struct irq_info *xen_irq_init(unsigned int irq)
{
struct irq_info *info;
info = kzalloc(sizeof(*info), GFP_KERNEL);
- if (info == NULL)
- panic("Unable to allocate metadata for IRQ%d\n", irq);
+ if (info) {
+ info->irq = irq;
+ info->type = IRQT_UNBOUND;
+ info->refcnt = -1;
+ INIT_RCU_WORK(&info->rwork, delayed_free_irq);
- info->type = IRQT_UNBOUND;
- info->refcnt = -1;
- INIT_RCU_WORK(&info->rwork, delayed_free_irq);
+ set_info_for_irq(irq, info);
+ /*
+ * Interrupt affinity setting can be immediate. No point
+ * in delaying it until an interrupt is handled.
+ */
+ irq_set_status_flags(irq, IRQ_MOVE_PCNTXT);
- set_info_for_irq(irq, info);
- /*
- * Interrupt affinity setting can be immediate. No point
- * in delaying it until an interrupt is handled.
- */
- irq_set_status_flags(irq, IRQ_MOVE_PCNTXT);
+ INIT_LIST_HEAD(&info->eoi_list);
+ list_add_tail(&info->list, &xen_irq_list_head);
+ }
- INIT_LIST_HEAD(&info->eoi_list);
- list_add_tail(&info->list, &xen_irq_list_head);
+ return info;
}
-static int __must_check xen_allocate_irqs_dynamic(int nvec)
+static struct irq_info *xen_allocate_irq_dynamic(void)
{
- int i, irq = irq_alloc_descs(-1, 0, nvec, -1);
+ int irq = irq_alloc_desc_from(0, -1);
+ struct irq_info *info = NULL;
if (irq >= 0) {
- for (i = 0; i < nvec; i++)
- xen_irq_init(irq + i);
+ info = xen_irq_init(irq);
+ if (!info)
+ xen_irq_free_desc(irq);
}
- return irq;
-}
-
-static inline int __must_check xen_allocate_irq_dynamic(void)
-{
-
- return xen_allocate_irqs_dynamic(1);
+ return info;
}
-static int __must_check xen_allocate_irq_gsi(unsigned gsi)
+static struct irq_info *xen_allocate_irq_gsi(unsigned int gsi)
{
int irq;
+ struct irq_info *info;
/*
* A PV guest has no concept of a GSI (since it has no ACPI
else
irq = irq_alloc_desc_at(gsi, -1);
- xen_irq_init(irq);
+ info = xen_irq_init(irq);
+ if (!info)
+ xen_irq_free_desc(irq);
- return irq;
+ return info;
}
-static void xen_free_irq(unsigned irq)
+static void xen_free_irq(struct irq_info *info)
{
- struct irq_info *info = info_for_irq(irq);
-
if (WARN_ON(!info))
return;
clear_evtchn(info->evtchn);
}
-static void pirq_query_unmask(int irq)
+static void pirq_query_unmask(struct irq_info *info)
{
struct physdev_irq_status_query irq_status;
- struct irq_info *info = info_for_irq(irq);
-
- BUG_ON(info->type != IRQT_PIRQ);
- irq_status.irq = pirq_from_irq(irq);
+ irq_status.irq = pirq_from_irq(info);
if (HYPERVISOR_physdev_op(PHYSDEVOP_irq_status_query, &irq_status))
irq_status.flags = 0;
info->u.pirq.flags |= PIRQ_NEEDS_EOI;
}
-static void eoi_pirq(struct irq_data *data)
+static void do_eoi_pirq(struct irq_info *info)
{
- struct irq_info *info = info_for_irq(data->irq);
- evtchn_port_t evtchn = info ? info->evtchn : 0;
- struct physdev_eoi eoi = { .irq = pirq_from_irq(data->irq) };
+ struct physdev_eoi eoi = { .irq = pirq_from_irq(info) };
int rc = 0;
- if (!VALID_EVTCHN(evtchn))
+ if (!VALID_EVTCHN(info->evtchn))
return;
event_handler_exit(info);
- if (pirq_needs_eoi(data->irq)) {
+ if (pirq_needs_eoi(info)) {
rc = HYPERVISOR_physdev_op(PHYSDEVOP_eoi, &eoi);
WARN_ON(rc);
}
}
+static void eoi_pirq(struct irq_data *data)
+{
+ struct irq_info *info = info_for_irq(data->irq);
+
+ do_eoi_pirq(info);
+}
+
+static void do_disable_dynirq(struct irq_info *info)
+{
+ if (VALID_EVTCHN(info->evtchn))
+ do_mask(info, EVT_MASK_REASON_EXPLICIT);
+}
+
+static void disable_dynirq(struct irq_data *data)
+{
+ struct irq_info *info = info_for_irq(data->irq);
+
+ if (info)
+ do_disable_dynirq(info);
+}
+
static void mask_ack_pirq(struct irq_data *data)
{
- disable_dynirq(data);
- eoi_pirq(data);
+ struct irq_info *info = info_for_irq(data->irq);
+
+ if (info) {
+ do_disable_dynirq(info);
+ do_eoi_pirq(info);
+ }
}
-static unsigned int __startup_pirq(unsigned int irq)
+static unsigned int __startup_pirq(struct irq_info *info)
{
struct evtchn_bind_pirq bind_pirq;
- struct irq_info *info = info_for_irq(irq);
- evtchn_port_t evtchn = evtchn_from_irq(irq);
+ evtchn_port_t evtchn = info->evtchn;
int rc;
- BUG_ON(info->type != IRQT_PIRQ);
-
if (VALID_EVTCHN(evtchn))
goto out;
- bind_pirq.pirq = pirq_from_irq(irq);
+ bind_pirq.pirq = pirq_from_irq(info);
/* NB. We are happy to share unless we are probing. */
bind_pirq.flags = info->u.pirq.flags & PIRQ_SHAREABLE ?
BIND_PIRQ__WILL_SHARE : 0;
rc = HYPERVISOR_event_channel_op(EVTCHNOP_bind_pirq, &bind_pirq);
if (rc != 0) {
- pr_warn("Failed to obtain physical IRQ %d\n", irq);
+ pr_warn("Failed to obtain physical IRQ %d\n", info->irq);
return 0;
}
evtchn = bind_pirq.port;
- pirq_query_unmask(irq);
+ pirq_query_unmask(info);
- rc = set_evtchn_to_irq(evtchn, irq);
+ rc = set_evtchn_to_irq(evtchn, info->irq);
if (rc)
goto err;
info->evtchn = evtchn;
- bind_evtchn_to_cpu(evtchn, 0, false);
+ bind_evtchn_to_cpu(info, 0, false);
rc = xen_evtchn_port_setup(evtchn);
if (rc)
out:
do_unmask(info, EVT_MASK_REASON_EXPLICIT);
- eoi_pirq(irq_get_irq_data(irq));
+ do_eoi_pirq(info);
return 0;
err:
- pr_err("irq%d: Failed to set port to irq mapping (%d)\n", irq, rc);
+ pr_err("irq%d: Failed to set port to irq mapping (%d)\n", info->irq,
+ rc);
xen_evtchn_close(evtchn);
return 0;
}
static unsigned int startup_pirq(struct irq_data *data)
{
- return __startup_pirq(data->irq);
+ struct irq_info *info = info_for_irq(data->irq);
+
+ return __startup_pirq(info);
}
static void shutdown_pirq(struct irq_data *data)
{
- unsigned int irq = data->irq;
- struct irq_info *info = info_for_irq(irq);
- evtchn_port_t evtchn = evtchn_from_irq(irq);
+ struct irq_info *info = info_for_irq(data->irq);
+ evtchn_port_t evtchn = info->evtchn;
BUG_ON(info->type != IRQT_PIRQ);
}
EXPORT_SYMBOL_GPL(xen_irq_from_gsi);
-static void __unbind_from_irq(unsigned int irq)
+static void __unbind_from_irq(struct irq_info *info, unsigned int irq)
{
- evtchn_port_t evtchn = evtchn_from_irq(irq);
- struct irq_info *info = info_for_irq(irq);
+ evtchn_port_t evtchn;
+
+ if (!info) {
+ xen_irq_free_desc(irq);
+ return;
+ }
if (info->refcnt > 0) {
info->refcnt--;
return;
}
+ evtchn = info->evtchn;
+
if (VALID_EVTCHN(evtchn)) {
- unsigned int cpu = cpu_from_irq(irq);
+ unsigned int cpu = info->cpu;
struct xenbus_device *dev;
if (!info->is_static)
xen_evtchn_close(evtchn);
- switch (type_from_irq(irq)) {
+ switch (info->type) {
case IRQT_VIRQ:
- per_cpu(virq_to_irq, cpu)[virq_from_irq(irq)] = -1;
+ per_cpu(virq_to_irq, cpu)[virq_from_irq(info)] = -1;
break;
case IRQT_IPI:
- per_cpu(ipi_to_irq, cpu)[ipi_from_irq(irq)] = -1;
+ per_cpu(ipi_to_irq, cpu)[ipi_from_irq(info)] = -1;
+ per_cpu(ipi_to_evtchn, cpu)[ipi_from_irq(info)] = 0;
break;
case IRQT_EVTCHN:
dev = info->u.interdomain;
xen_irq_info_cleanup(info);
}
- xen_free_irq(irq);
+ xen_free_irq(info);
}
/*
int xen_bind_pirq_gsi_to_irq(unsigned gsi,
unsigned pirq, int shareable, char *name)
{
- int irq;
+ struct irq_info *info;
struct physdev_irq irq_op;
int ret;
mutex_lock(&irq_mapping_update_lock);
- irq = xen_irq_from_gsi(gsi);
- if (irq != -1) {
+ ret = xen_irq_from_gsi(gsi);
+ if (ret != -1) {
pr_info("%s: returning irq %d for gsi %u\n",
- __func__, irq, gsi);
+ __func__, ret, gsi);
goto out;
}
- irq = xen_allocate_irq_gsi(gsi);
- if (irq < 0)
+ info = xen_allocate_irq_gsi(gsi);
+ if (!info)
goto out;
- irq_op.irq = irq;
+ irq_op.irq = info->irq;
irq_op.vector = 0;
/* Only the privileged domain can do this. For non-priv, the pcifront
* this in the priv domain. */
if (xen_initial_domain() &&
HYPERVISOR_physdev_op(PHYSDEVOP_alloc_irq_vector, &irq_op)) {
- xen_free_irq(irq);
- irq = -ENOSPC;
+ xen_free_irq(info);
+ ret = -ENOSPC;
goto out;
}
- ret = xen_irq_info_pirq_setup(irq, 0, pirq, gsi, DOMID_SELF,
+ ret = xen_irq_info_pirq_setup(info, 0, pirq, gsi, DOMID_SELF,
shareable ? PIRQ_SHAREABLE : 0);
if (ret < 0) {
- __unbind_from_irq(irq);
- irq = ret;
+ __unbind_from_irq(info, info->irq);
goto out;
}
- pirq_query_unmask(irq);
+ pirq_query_unmask(info);
/* We try to use the handler with the appropriate semantic for the
* type of interrupt: if the interrupt is an edge triggered
* interrupt we use handle_edge_irq.
* is the right choice either way.
*/
if (shareable)
- irq_set_chip_and_handler_name(irq, &xen_pirq_chip,
+ irq_set_chip_and_handler_name(info->irq, &xen_pirq_chip,
handle_fasteoi_irq, name);
else
- irq_set_chip_and_handler_name(irq, &xen_pirq_chip,
+ irq_set_chip_and_handler_name(info->irq, &xen_pirq_chip,
handle_edge_irq, name);
+ ret = info->irq;
+
out:
mutex_unlock(&irq_mapping_update_lock);
- return irq;
+ return ret;
}
#ifdef CONFIG_PCI_MSI
int pirq, int nvec, const char *name, domid_t domid)
{
int i, irq, ret;
+ struct irq_info *info;
mutex_lock(&irq_mapping_update_lock);
- irq = xen_allocate_irqs_dynamic(nvec);
+ irq = irq_alloc_descs(-1, 0, nvec, -1);
if (irq < 0)
goto out;
for (i = 0; i < nvec; i++) {
+ info = xen_irq_init(irq + i);
+ if (!info) {
+ ret = -ENOMEM;
+ goto error_irq;
+ }
+
irq_set_chip_and_handler_name(irq + i, &xen_pirq_chip, handle_edge_irq, name);
- ret = xen_irq_info_pirq_setup(irq + i, 0, pirq + i, 0, domid,
+ ret = xen_irq_info_pirq_setup(info, 0, pirq + i, 0, domid,
i == 0 ? 0 : PIRQ_MSI_GROUP);
if (ret < 0)
goto error_irq;
out:
mutex_unlock(&irq_mapping_update_lock);
return irq;
+
error_irq:
- while (nvec--)
- __unbind_from_irq(irq + nvec);
+ while (nvec--) {
+ info = info_for_irq(irq + nvec);
+ __unbind_from_irq(info, irq + nvec);
+ }
mutex_unlock(&irq_mapping_update_lock);
return ret;
}
}
}
- xen_free_irq(irq);
+ xen_free_irq(info);
out:
mutex_unlock(&irq_mapping_update_lock);
return rc;
}
-int xen_irq_from_pirq(unsigned pirq)
-{
- int irq;
-
- struct irq_info *info;
-
- mutex_lock(&irq_mapping_update_lock);
-
- list_for_each_entry(info, &xen_irq_list_head, list) {
- if (info->type != IRQT_PIRQ)
- continue;
- irq = info->irq;
- if (info->u.pirq.pirq == pirq)
- goto out;
- }
- irq = -1;
-out:
- mutex_unlock(&irq_mapping_update_lock);
-
- return irq;
-}
-
-
int xen_pirq_from_irq(unsigned irq)
{
- return pirq_from_irq(irq);
+ struct irq_info *info = info_for_irq(irq);
+
+ return pirq_from_irq(info);
}
EXPORT_SYMBOL_GPL(xen_pirq_from_irq);
static int bind_evtchn_to_irq_chip(evtchn_port_t evtchn, struct irq_chip *chip,
struct xenbus_device *dev)
{
- int irq;
- int ret;
+ int ret = -ENOMEM;
+ struct irq_info *info;
if (evtchn >= xen_evtchn_max_channels())
return -ENOMEM;
mutex_lock(&irq_mapping_update_lock);
- irq = get_evtchn_to_irq(evtchn);
+ info = evtchn_to_info(evtchn);
- if (irq == -1) {
- irq = xen_allocate_irq_dynamic();
- if (irq < 0)
+ if (!info) {
+ info = xen_allocate_irq_dynamic();
+ if (!info)
goto out;
- irq_set_chip_and_handler_name(irq, chip,
+ irq_set_chip_and_handler_name(info->irq, chip,
handle_edge_irq, "event");
- ret = xen_irq_info_evtchn_setup(irq, evtchn, dev);
+ ret = xen_irq_info_evtchn_setup(info, evtchn, dev);
if (ret < 0) {
- __unbind_from_irq(irq);
- irq = ret;
+ __unbind_from_irq(info, info->irq);
goto out;
}
/*
* affinity setting is not invoked on them so nothing would
* bind the channel.
*/
- bind_evtchn_to_cpu(evtchn, 0, false);
- } else {
- struct irq_info *info = info_for_irq(irq);
- if (!WARN_ON(!info || info->type != IRQT_EVTCHN))
- info->refcnt++;
+ bind_evtchn_to_cpu(info, 0, false);
+ } else if (!WARN_ON(info->type != IRQT_EVTCHN)) {
+ info->refcnt++;
}
+ ret = info->irq;
+
out:
mutex_unlock(&irq_mapping_update_lock);
- return irq;
+ return ret;
}
int bind_evtchn_to_irq(evtchn_port_t evtchn)
{
struct evtchn_bind_ipi bind_ipi;
evtchn_port_t evtchn;
- int ret, irq;
+ struct irq_info *info;
+ int ret;
mutex_lock(&irq_mapping_update_lock);
- irq = per_cpu(ipi_to_irq, cpu)[ipi];
+ ret = per_cpu(ipi_to_irq, cpu)[ipi];
- if (irq == -1) {
- irq = xen_allocate_irq_dynamic();
- if (irq < 0)
+ if (ret == -1) {
+ info = xen_allocate_irq_dynamic();
+ if (!info)
goto out;
- irq_set_chip_and_handler_name(irq, &xen_percpu_chip,
+ irq_set_chip_and_handler_name(info->irq, &xen_percpu_chip,
handle_percpu_irq, "ipi");
bind_ipi.vcpu = xen_vcpu_nr(cpu);
BUG();
evtchn = bind_ipi.port;
- ret = xen_irq_info_ipi_setup(cpu, irq, evtchn, ipi);
+ ret = xen_irq_info_ipi_setup(info, cpu, evtchn, ipi);
if (ret < 0) {
- __unbind_from_irq(irq);
- irq = ret;
+ __unbind_from_irq(info, info->irq);
goto out;
}
/*
* Force the affinity mask to the target CPU so proc shows
* the correct target.
*/
- bind_evtchn_to_cpu(evtchn, cpu, true);
+ bind_evtchn_to_cpu(info, cpu, true);
+ ret = info->irq;
} else {
- struct irq_info *info = info_for_irq(irq);
+ info = info_for_irq(ret);
WARN_ON(info == NULL || info->type != IRQT_IPI);
}
out:
mutex_unlock(&irq_mapping_update_lock);
- return irq;
+ return ret;
}
static int bind_interdomain_evtchn_to_irq_chip(struct xenbus_device *dev,
{
struct evtchn_bind_virq bind_virq;
evtchn_port_t evtchn = 0;
- int irq, ret;
+ struct irq_info *info;
+ int ret;
mutex_lock(&irq_mapping_update_lock);
- irq = per_cpu(virq_to_irq, cpu)[virq];
+ ret = per_cpu(virq_to_irq, cpu)[virq];
- if (irq == -1) {
- irq = xen_allocate_irq_dynamic();
- if (irq < 0)
+ if (ret == -1) {
+ info = xen_allocate_irq_dynamic();
+ if (!info)
goto out;
if (percpu)
- irq_set_chip_and_handler_name(irq, &xen_percpu_chip,
+ irq_set_chip_and_handler_name(info->irq, &xen_percpu_chip,
handle_percpu_irq, "virq");
else
- irq_set_chip_and_handler_name(irq, &xen_dynamic_chip,
+ irq_set_chip_and_handler_name(info->irq, &xen_dynamic_chip,
handle_edge_irq, "virq");
bind_virq.virq = virq;
BUG_ON(ret < 0);
}
- ret = xen_irq_info_virq_setup(cpu, irq, evtchn, virq);
+ ret = xen_irq_info_virq_setup(info, cpu, evtchn, virq);
if (ret < 0) {
- __unbind_from_irq(irq);
- irq = ret;
+ __unbind_from_irq(info, info->irq);
goto out;
}
* Force the affinity mask for percpu interrupts so proc
* shows the correct target.
*/
- bind_evtchn_to_cpu(evtchn, cpu, percpu);
+ bind_evtchn_to_cpu(info, cpu, percpu);
+ ret = info->irq;
} else {
- struct irq_info *info = info_for_irq(irq);
+ info = info_for_irq(ret);
WARN_ON(info == NULL || info->type != IRQT_VIRQ);
}
out:
mutex_unlock(&irq_mapping_update_lock);
- return irq;
+ return ret;
}
static void unbind_from_irq(unsigned int irq)
{
+ struct irq_info *info;
+
mutex_lock(&irq_mapping_update_lock);
- __unbind_from_irq(irq);
+ info = info_for_irq(irq);
+ __unbind_from_irq(info, irq);
mutex_unlock(&irq_mapping_update_lock);
}
int evtchn_make_refcounted(evtchn_port_t evtchn, bool is_static)
{
- int irq = get_evtchn_to_irq(evtchn);
- struct irq_info *info;
-
- if (irq == -1)
- return -ENOENT;
-
- info = info_for_irq(irq);
+ struct irq_info *info = evtchn_to_info(evtchn);
if (!info)
return -ENOENT;
int evtchn_get(evtchn_port_t evtchn)
{
- int irq;
struct irq_info *info;
int err = -ENOENT;
mutex_lock(&irq_mapping_update_lock);
- irq = get_evtchn_to_irq(evtchn);
- if (irq == -1)
- goto done;
-
- info = info_for_irq(irq);
+ info = evtchn_to_info(evtchn);
if (!info)
goto done;
void evtchn_put(evtchn_port_t evtchn)
{
- int irq = get_evtchn_to_irq(evtchn);
- if (WARN_ON(irq == -1))
+ struct irq_info *info = evtchn_to_info(evtchn);
+
+ if (WARN_ON(!info))
return;
- unbind_from_irq(irq);
+ unbind_from_irq(info->irq);
}
EXPORT_SYMBOL_GPL(evtchn_put);
void xen_send_IPI_one(unsigned int cpu, enum ipi_vector vector)
{
- int irq;
+ evtchn_port_t evtchn;
#ifdef CONFIG_X86
if (unlikely(vector == XEN_NMI_VECTOR)) {
return;
}
#endif
- irq = per_cpu(ipi_to_irq, cpu)[vector];
- BUG_ON(irq < 0);
- notify_remote_via_irq(irq);
+ evtchn = per_cpu(ipi_to_evtchn, cpu)[vector];
+ BUG_ON(evtchn == 0);
+ notify_remote_via_evtchn(evtchn);
}
struct evtchn_loop_ctrl {
void handle_irq_for_port(evtchn_port_t port, struct evtchn_loop_ctrl *ctrl)
{
- int irq;
- struct irq_info *info;
+ struct irq_info *info = evtchn_to_info(port);
struct xenbus_device *dev;
- irq = get_evtchn_to_irq(port);
- if (irq == -1)
+ if (!info)
return;
/*
}
}
- info = info_for_irq(irq);
if (xchg_acquire(&info->is_active, 1))
return;
info->eoi_time = get_jiffies_64() + event_eoi_delay;
}
- generic_handle_irq(irq);
+ generic_handle_irq(info->irq);
}
int xen_evtchn_do_upcall(void)
mutex_lock(&irq_mapping_update_lock);
/* After resume the irq<->evtchn mappings are all cleared out */
- BUG_ON(get_evtchn_to_irq(evtchn) != -1);
+ BUG_ON(evtchn_to_info(evtchn));
/* Expect irq to have been bound before,
so there should be a proper type */
BUG_ON(info->type == IRQT_UNBOUND);
- (void)xen_irq_info_evtchn_setup(irq, evtchn, NULL);
+ info->irq = irq;
+ (void)xen_irq_info_evtchn_setup(info, evtchn, NULL);
mutex_unlock(&irq_mapping_update_lock);
- bind_evtchn_to_cpu(evtchn, info->cpu, false);
+ bind_evtchn_to_cpu(info, info->cpu, false);
/* Unmask the event channel. */
enable_irq(irq);
* it, but don't do the xenlinux-level rebind in that case.
*/
if (HYPERVISOR_event_channel_op(EVTCHNOP_bind_vcpu, &bind_vcpu) >= 0)
- bind_evtchn_to_cpu(evtchn, tcpu, false);
+ bind_evtchn_to_cpu(info, tcpu, false);
do_unmask(info, EVT_MASK_REASON_TEMPORARY);
do_unmask(info, EVT_MASK_REASON_EXPLICIT);
}
-static void disable_dynirq(struct irq_data *data)
+static void do_ack_dynirq(struct irq_info *info)
{
- struct irq_info *info = info_for_irq(data->irq);
- evtchn_port_t evtchn = info ? info->evtchn : 0;
+ evtchn_port_t evtchn = info->evtchn;
if (VALID_EVTCHN(evtchn))
- do_mask(info, EVT_MASK_REASON_EXPLICIT);
+ event_handler_exit(info);
}
static void ack_dynirq(struct irq_data *data)
{
struct irq_info *info = info_for_irq(data->irq);
- evtchn_port_t evtchn = info ? info->evtchn : 0;
- if (VALID_EVTCHN(evtchn))
- event_handler_exit(info);
+ if (info)
+ do_ack_dynirq(info);
}
static void mask_ack_dynirq(struct irq_data *data)
{
- disable_dynirq(data);
- ack_dynirq(data);
+ struct irq_info *info = info_for_irq(data->irq);
+
+ if (info) {
+ do_disable_dynirq(info);
+ do_ack_dynirq(info);
+ }
}
static void lateeoi_ack_dynirq(struct irq_data *data)
if (rc) {
pr_warn("xen map irq failed gsi=%d irq=%d pirq=%d rc=%d\n",
gsi, irq, pirq, rc);
- xen_free_irq(irq);
+ xen_free_irq(info);
continue;
}
printk(KERN_DEBUG "xen: --> irq=%d, pirq=%d\n", irq, map_irq.pirq);
- __startup_pirq(irq);
+ __startup_pirq(info);
}
}
{
struct evtchn_bind_virq bind_virq;
evtchn_port_t evtchn;
+ struct irq_info *info;
int virq, irq;
for (virq = 0; virq < NR_VIRQS; virq++) {
if ((irq = per_cpu(virq_to_irq, cpu)[virq]) == -1)
continue;
+ info = info_for_irq(irq);
- BUG_ON(virq_from_irq(irq) != virq);
+ BUG_ON(virq_from_irq(info) != virq);
/* Get a new binding from Xen. */
bind_virq.virq = virq;
evtchn = bind_virq.port;
/* Record the new mapping. */
- (void)xen_irq_info_virq_setup(cpu, irq, evtchn, virq);
+ xen_irq_info_virq_setup(info, cpu, evtchn, virq);
/* The affinity mask is still valid */
- bind_evtchn_to_cpu(evtchn, cpu, false);
+ bind_evtchn_to_cpu(info, cpu, false);
}
}
{
struct evtchn_bind_ipi bind_ipi;
evtchn_port_t evtchn;
+ struct irq_info *info;
int ipi, irq;
for (ipi = 0; ipi < XEN_NR_IPIS; ipi++) {
if ((irq = per_cpu(ipi_to_irq, cpu)[ipi]) == -1)
continue;
+ info = info_for_irq(irq);
- BUG_ON(ipi_from_irq(irq) != ipi);
+ BUG_ON(ipi_from_irq(info) != ipi);
/* Get a new binding from Xen. */
bind_ipi.vcpu = xen_vcpu_nr(cpu);
evtchn = bind_ipi.port;
/* Record the new mapping. */
- (void)xen_irq_info_ipi_setup(cpu, irq, evtchn, ipi);
+ xen_irq_info_ipi_setup(info, cpu, evtchn, ipi);
/* The affinity mask is still valid */
- bind_evtchn_to_cpu(evtchn, cpu, false);
+ bind_evtchn_to_cpu(info, cpu, false);
}
}
event_handler_exit(info);
}
EXPORT_SYMBOL(xen_clear_irq_pending);
-void xen_set_irq_pending(int irq)
-{
- evtchn_port_t evtchn = evtchn_from_irq(irq);
-
- if (VALID_EVTCHN(evtchn))
- set_evtchn(evtchn);
-}
bool xen_test_irq_pending(int irq)
{
extern const struct evtchn_ops *evtchn_ops;
-int get_evtchn_to_irq(evtchn_port_t evtchn);
void handle_irq_for_port(evtchn_port_t port, struct evtchn_loop_ctrl *ctrl);
unsigned int cpu_from_evtchn(evtchn_port_t evtchn);
#include <asm/xen/hypervisor.h>
#include <asm/xen/hypercall.h>
+#ifdef CONFIG_ACPI
+#include <acpi/processor.h>
+#endif
/*
* @cpu_id: Xen physical cpu logic number
return online;
}
+
+void xen_sanitize_proc_cap_bits(uint32_t *cap)
+{
+ struct xen_platform_op op = {
+ .cmd = XENPF_set_processor_pminfo,
+ .u.set_pminfo.id = -1,
+ .u.set_pminfo.type = XEN_PM_PDC,
+ };
+ u32 buf[3] = { ACPI_PDC_REVISION_ID, 1, *cap };
+ int ret;
+
+ set_xen_guest_handle(op.u.set_pminfo.pdc, buf);
+ ret = HYPERVISOR_platform_op(&op);
+ if (ret)
+ pr_err("sanitize of _PDC buffer bits from Xen failed: %d\n",
+ ret);
+ else
+ *cap = buf[2];
+}
#endif
spinlock_t lock; /* Protects ioeventfds list */
struct list_head ioeventfds;
struct list_head list;
- struct ioreq_port ports[0];
+ struct ioreq_port ports[] __counted_by(vcpus);
};
static irqreturn_t ioeventfd_interrupt(int irq, void *dev_id)
.get_sgtable = dma_common_get_sgtable,
.alloc_pages = dma_common_alloc_pages,
.free_pages = dma_common_free_pages,
+ .max_mapping_size = swiotlb_max_mapping_size,
};
#include <xen/xen-front-pgdir-shbuf.h>
-/**
+/*
* This structure represents the structure of a shared page
* that contains grant references to the pages of the shared
* buffer. This structure is common to many Xen para-virtualized
grant_ref_t gref[]; /* Variable length */
};
-/**
+/*
* Shared buffer ops which are differently implemented
* depending on the allocation mode, e.g. if the buffer
* is allocated by the corresponding backend or frontend.
int (*unmap)(struct xen_front_pgdir_shbuf *buf);
};
-/**
+/*
* Get granted reference to the very first page of the
* page directory. Usually this is passed to the backend,
* so it can find/fill the grant references to the buffer's
}
EXPORT_SYMBOL_GPL(xen_front_pgdir_shbuf_get_dir_start);
-/**
+/*
* Map granted references of the shared buffer.
*
* Depending on the shared buffer mode of allocation
}
EXPORT_SYMBOL_GPL(xen_front_pgdir_shbuf_map);
-/**
+/*
* Unmap granted references of the shared buffer.
*
* Depending on the shared buffer mode of allocation
}
EXPORT_SYMBOL_GPL(xen_front_pgdir_shbuf_unmap);
-/**
+/*
* Free all the resources of the shared buffer.
*
* \param buf shared buffer which resources to be freed.
offsetof(struct xen_page_directory, \
gref)) / sizeof(grant_ref_t))
-/**
+/*
* Get the number of pages the page directory consumes itself.
*
* \param buf shared buffer.
return DIV_ROUND_UP(buf->num_pages, XEN_NUM_GREFS_PER_PAGE);
}
-/**
+/*
* Calculate the number of grant references needed to share the buffer
* and its pages when backend allocates the buffer.
*
buf->num_grefs = get_num_pages_dir(buf);
}
-/**
+/*
* Calculate the number of grant references needed to share the buffer
* and its pages when frontend allocates the buffer.
*
#define xen_page_to_vaddr(page) \
((uintptr_t)pfn_to_kaddr(page_to_xen_pfn(page)))
-/**
+/*
* Unmap the buffer previously mapped with grant references
* provided by the backend.
*
return ret;
}
-/**
+/*
* Map the buffer with grant references provided by the backend.
*
* \param buf shared buffer.
return ret;
}
-/**
+/*
* Fill page directory with grant references to the pages of the
* page directory itself.
*
page_dir->gref_dir_next_page = XEN_GREF_LIST_END;
}
-/**
+/*
* Fill page directory with grant references to the pages of the
* page directory and the buffer we share with the backend.
*
}
}
-/**
+/*
* Grant references to the frontend's buffer pages.
*
* These will be shared with the backend, so it can
return 0;
}
-/**
+/*
* Grant all the references needed to share the buffer.
*
* Grant references to the page directory pages and, if
return 0;
}
-/**
+/*
* Allocate all required structures to mange shared buffer.
*
* \param buf shared buffer.
.grant_refs_for_buffer = guest_grant_refs_for_buffer,
};
-/**
+/*
* Allocate a new instance of a shared buffer.
*
* \param cfg configuration to be used while allocating a new shared buffer.
config HUGETLB_PAGE
def_bool HUGETLBFS
+ select XARRAY_MULTI
config HUGETLB_PAGE_OPTIMIZE_VMEMMAP
def_bool HUGETLB_PAGE
if (ret == -ENOMEM)
goto out_wake;
- ret = -ENOMEM;
vllist = afs_alloc_vlserver_list(0);
- if (!vllist)
+ if (!vllist) {
+ if (ret >= 0)
+ ret = -ENOMEM;
goto out_wake;
+ }
switch (ret) {
case -ENODATA:
struct afs_net *net = afs_d2net(dentry);
const char *name = dentry->d_name.name;
size_t len = dentry->d_name.len;
+ char *result = NULL;
int ret;
/* Names prefixed with a dot are R/W mounts. */
}
ret = dns_query(net->net, "afsdb", name, len, "srv=1",
- NULL, NULL, false);
- if (ret == -ENODATA)
- ret = -EDESTADDRREQ;
+ &result, NULL, false);
+ if (ret == -ENODATA || ret == -ENOKEY || ret == 0)
+ ret = -ENOENT;
+ if (ret > 0 && ret >= sizeof(struct dns_server_list_v1_header)) {
+ struct dns_server_list_v1_header *v1 = (void *)result;
+
+ if (v1->hdr.zero == 0 &&
+ v1->hdr.content == DNS_PAYLOAD_IS_SERVER_LIST &&
+ v1->hdr.version == 1 &&
+ (v1->status != DNS_LOOKUP_GOOD &&
+ v1->status != DNS_LOOKUP_GOOD_WITH_BAD))
+ return -ENOENT;
+
+ }
+
+ kfree(result);
return ret;
}
return 1;
}
-/*
- * Allow the VFS to enquire as to whether a dentry should be unhashed (mustn't
- * sleep)
- * - called from dput() when d_count is going to 0.
- * - return 1 to request dentry be unhashed, 0 otherwise
- */
-static int afs_dynroot_d_delete(const struct dentry *dentry)
-{
- return d_really_is_positive(dentry);
-}
-
const struct dentry_operations afs_dynroot_dentry_operations = {
.d_revalidate = afs_dynroot_d_revalidate,
- .d_delete = afs_dynroot_d_delete,
+ .d_delete = always_delete_dentry,
.d_release = afs_d_release,
.d_automount = afs_d_automount,
};
};
struct afs_server_list {
+ struct rcu_head rcu;
afs_volid_t vids[AFS_MAXTYPES]; /* Volume IDs */
refcount_t usage;
unsigned char nr_servers;
#define AFS_VOLUME_OFFLINE 4 /* - T if volume offline notice given */
#define AFS_VOLUME_BUSY 5 /* - T if volume busy notice given */
#define AFS_VOLUME_MAYBE_NO_IBULK 6 /* - T if some servers don't have InlineBulkStatus */
+#define AFS_VOLUME_RM_TREE 7 /* - Set if volume removed from cell->volumes */
#ifdef CONFIG_AFS_FSCACHE
struct fscache_volume *cache; /* Caching cookie */
#endif
extern struct afs_volume *afs_create_volume(struct afs_fs_context *);
extern int afs_activate_volume(struct afs_volume *);
extern void afs_deactivate_volume(struct afs_volume *);
+bool afs_try_get_volume(struct afs_volume *volume, enum afs_volume_trace reason);
extern struct afs_volume *afs_get_volume(struct afs_volume *, enum afs_volume_trace);
extern void afs_put_volume(struct afs_net *, struct afs_volume *, enum afs_volume_trace);
extern int afs_check_volume_status(struct afs_volume *, struct afs_operation *);
if (call->async) {
if (cancel_work_sync(&call->async_work))
afs_put_call(call);
- afs_put_call(call);
+ afs_set_call_complete(call, ret, 0);
}
ac->error = ret;
for (i = 0; i < slist->nr_servers; i++)
afs_unuse_server(net, slist->servers[i].server,
afs_server_trace_put_slist);
- kfree(slist);
+ kfree_rcu(slist, rcu);
}
}
return PTR_ERR(volume);
ctx->volume = volume;
+ if (volume->type != AFSVL_RWVOL) {
+ ctx->flock_mode = afs_flock_mode_local;
+ fc->sb_flags |= SB_RDONLY;
+ }
}
return 0;
}
/* Status load is ordered after lookup counter load */
+ if (cell->dns_status == DNS_LOOKUP_GOT_NOT_FOUND) {
+ pr_warn("No record of cell %s\n", cell->name);
+ vc->error = -ENOENT;
+ return false;
+ }
+
if (cell->dns_source == DNS_RECORD_UNAVAILABLE) {
vc->error = -EDESTADDRREQ;
return false;
*/
static void afs_vl_dump_edestaddrreq(const struct afs_vl_cursor *vc)
{
+ struct afs_cell *cell = vc->cell;
static int count;
int i;
rcu_read_lock();
pr_notice("EDESTADDR occurred\n");
+ pr_notice("CELL: %s err=%d\n", cell->name, cell->error);
+ pr_notice("DNS: src=%u st=%u lc=%x\n",
+ cell->dns_source, cell->dns_status, cell->dns_lookup_count);
pr_notice("VC: ut=%lx ix=%u ni=%hu fl=%hx err=%hd\n",
vc->untried, vc->index, vc->nr_iterations, vc->flags, vc->error);
} else if (p->vid > volume->vid) {
pp = &(*pp)->rb_right;
} else {
- volume = afs_get_volume(p, afs_volume_trace_get_cell_insert);
- goto found;
+ if (afs_try_get_volume(p, afs_volume_trace_get_cell_insert)) {
+ volume = p;
+ goto found;
+ }
+
+ set_bit(AFS_VOLUME_RM_TREE, &volume->flags);
+ rb_replace_node_rcu(&p->cell_node, &volume->cell_node, &cell->volumes);
}
}
afs_volume_trace_remove);
write_seqlock(&cell->volume_lock);
hlist_del_rcu(&volume->proc_link);
- rb_erase(&volume->cell_node, &cell->volumes);
+ if (!test_and_set_bit(AFS_VOLUME_RM_TREE, &volume->flags))
+ rb_erase(&volume->cell_node, &cell->volumes);
write_sequnlock(&cell->volume_lock);
}
}
_leave(" [destroyed]");
}
+/*
+ * Try to get a reference on a volume record.
+ */
+bool afs_try_get_volume(struct afs_volume *volume, enum afs_volume_trace reason)
+{
+ int r;
+
+ if (__refcount_inc_not_zero(&volume->ref, &r)) {
+ trace_afs_volume(volume->vid, r + 1, reason);
+ return true;
+ }
+ return false;
+}
+
/*
* Get a reference on a volume record.
*/
struct autofs_fs_context *ctx = fc->fs_private;
struct autofs_sb_info *sbi = s->s_fs_info;
struct inode *root_inode;
- struct dentry *root;
struct autofs_info *ino;
- int ret = -ENOMEM;
pr_debug("starting up, sbi = %p\n", sbi);
*/
ino = autofs_new_ino(sbi);
if (!ino)
- goto fail;
+ return -ENOMEM;
root_inode = autofs_get_inode(s, S_IFDIR | 0755);
+ if (!root_inode)
+ return -ENOMEM;
+
root_inode->i_uid = ctx->uid;
root_inode->i_gid = ctx->gid;
+ root_inode->i_fop = &autofs_root_operations;
+ root_inode->i_op = &autofs_dir_inode_operations;
- root = d_make_root(root_inode);
- if (!root)
- goto fail_ino;
-
- root->d_fsdata = ino;
+ s->s_root = d_make_root(root_inode);
+ if (unlikely(!s->s_root)) {
+ autofs_free_ino(ino);
+ return -ENOMEM;
+ }
+ s->s_root->d_fsdata = ino;
if (ctx->pgrp_set) {
sbi->oz_pgrp = find_get_pid(ctx->pgrp);
- if (!sbi->oz_pgrp) {
- ret = invalf(fc, "Could not find process group %d",
- ctx->pgrp);
- goto fail_dput;
- }
- } else {
+ if (!sbi->oz_pgrp)
+ return invalf(fc, "Could not find process group %d",
+ ctx->pgrp);
+ } else
sbi->oz_pgrp = get_task_pid(current, PIDTYPE_PGID);
- }
if (autofs_type_trigger(sbi->type))
- __managed_dentry_set_managed(root);
-
- root_inode->i_fop = &autofs_root_operations;
- root_inode->i_op = &autofs_dir_inode_operations;
+ /* s->s_root won't be contended so there's little to
+ * be gained by not taking the d_lock when setting
+ * d_flags, even when a lot mounts are being done.
+ */
+ managed_dentry_set_managed(s->s_root);
pr_debug("pipe fd = %d, pgrp = %u\n",
sbi->pipefd, pid_nr(sbi->oz_pgrp));
sbi->flags &= ~AUTOFS_SBI_CATATONIC;
-
- /*
- * Success! Install the root dentry now to indicate completion.
- */
- s->s_root = root;
return 0;
-
- /*
- * Failure ... clean up.
- */
-fail_dput:
- dput(root);
- goto fail;
-fail_ino:
- autofs_free_ino(ino);
-fail:
- return ret;
}
/*
depends on BCACHEFS_FS
select QUOTACTL
+config BCACHEFS_ERASURE_CODING
+ bool "bcachefs erasure coding (RAID5/6) support (EXPERIMENTAL)"
+ depends on BCACHEFS_FS
+ select QUOTACTL
+ help
+ This enables the "erasure_code" filesysystem and inode option, which
+ organizes data into reed-solomon stripes instead of ordinary
+ replication.
+
+ WARNING: this feature is still undergoing on disk format changes, and
+ should only be enabled for testing purposes.
+
config BCACHEFS_POSIX_ACL
bool "bcachefs POSIX ACL support"
depends on BCACHEFS_FS
return wp;
}
+static noinline void
+deallocate_extra_replicas(struct bch_fs *c,
+ struct open_buckets *ptrs,
+ struct open_buckets *ptrs_no_use,
+ unsigned extra_replicas)
+{
+ struct open_buckets ptrs2 = { 0 };
+ struct open_bucket *ob;
+ unsigned i;
+
+ open_bucket_for_each(c, ptrs, ob, i) {
+ unsigned d = bch_dev_bkey_exists(c, ob->dev)->mi.durability;
+
+ if (d && d <= extra_replicas) {
+ extra_replicas -= d;
+ ob_push(c, ptrs_no_use, ob);
+ } else {
+ ob_push(c, &ptrs2, ob);
+ }
+ }
+
+ *ptrs = ptrs2;
+}
+
/*
* Get us an open_bucket we can allocate from, return with it locked:
*/
int ret;
int i;
+ if (!IS_ENABLED(CONFIG_BCACHEFS_ERASURE_CODING))
+ erasure_code = false;
+
BUG_ON(flags & BCH_WRITE_ONLY_SPECIFIED_DEVS);
BUG_ON(!nr_replicas || !nr_replicas_required);
goto alloc_done;
/* Don't retry from all devices if we're out of open buckets: */
- if (bch2_err_matches(ret, BCH_ERR_open_buckets_empty))
- goto allocate_blocking;
+ if (bch2_err_matches(ret, BCH_ERR_open_buckets_empty)) {
+ int ret = open_bucket_add_buckets(trans, &ptrs, wp, devs_have,
+ target, erasure_code,
+ nr_replicas, &nr_effective,
+ &have_cache, watermark,
+ flags, cl);
+ if (!ret ||
+ bch2_err_matches(ret, BCH_ERR_transaction_restart) ||
+ bch2_err_matches(ret, BCH_ERR_open_buckets_empty))
+ goto alloc_done;
+ }
/*
* Only try to allocate cache (durability = 0 devices) from the
&have_cache, watermark,
flags, cl);
} else {
-allocate_blocking:
ret = open_bucket_add_buckets(trans, &ptrs, wp, devs_have,
target, erasure_code,
nr_replicas, &nr_effective,
if (ret)
goto err;
+ if (nr_effective > nr_replicas)
+ deallocate_extra_replicas(c, &ptrs, &wp->ptrs, nr_effective - nr_replicas);
+
/* Free buckets we didn't use: */
open_bucket_for_each(c, &wp->ptrs, ob, i)
open_bucket_free_unused(c, ob);
bp.level - 1,
0);
b = bch2_btree_iter_peek_node(iter);
- if (IS_ERR(b))
+ if (IS_ERR_OR_NULL(b))
goto err;
BUG_ON(b->c.level != bp.level - 1);
- if (b && extent_matches_bp(c, bp.btree_id, bp.level,
- bkey_i_to_s_c(&b->key),
- bucket, bp))
+ if (extent_matches_bp(c, bp.btree_id, bp.level,
+ bkey_i_to_s_c(&b->key),
+ bucket, bp))
return b;
- if (b && btree_node_will_make_reachable(b)) {
+ if (btree_node_will_make_reachable(b)) {
b = ERR_PTR(-BCH_ERR_backpointer_to_overwritten_btree_node);
} else {
backpointer_not_found(trans, bp_pos, bp, bkey_i_to_s_c(&b->key));
u64 start;
u64 end;
bool dirty;
- } entries[0];
+ } entries[];
};
struct journal_keys {
size_t gap;
size_t nr;
size_t size;
+ atomic_t ref;
+ bool initial_ref_held;
};
struct btree_trans_buf {
mempool_t compression_bounce[2];
mempool_t compress_workspace[BCH_COMPRESSION_TYPE_NR];
mempool_t decompress_workspace;
- ZSTD_parameters zstd_params;
+ size_t zstd_workspace_size;
struct crypto_shash *sha256;
struct crypto_sync_skcipher *chacha20;
#else
#error edit for your odd byteorder.
#endif
-} __packed __aligned(4);
+} __packed
+#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
+__aligned(4)
+#endif
+;
#define KEY_INODE_MAX ((__u64)~0ULL)
#define KEY_OFFSET_MAX ((__u64)~0ULL)
x(move_extent_write, 36) \
x(move_extent_finish, 37) \
x(move_extent_fail, 38) \
- x(move_extent_alloc_mem_fail, 39) \
+ x(move_extent_start_fail, 39) \
x(copygc, 40) \
x(copygc_wait, 41) \
x(gc_gens_end, 42) \
#include "debug.h"
#include "errcode.h"
#include "error.h"
+#include "journal.h"
#include "trace.h"
#include <linux/prefetch.h>
BUG_ON(btree_node_read_in_flight(b) ||
btree_node_write_in_flight(b));
- if (btree_node_dirty(b))
- bch2_btree_complete_write(c, b, btree_current_write(b));
- clear_btree_node_dirty_acct(c, b);
-
btree_node_data_free(c, b);
}
- BUG_ON(atomic_read(&c->btree_cache.dirty));
+ BUG_ON(!bch2_journal_error(&c->journal) &&
+ atomic_read(&c->btree_cache.dirty));
list_splice(&bc->freed_pcpu, &bc->freed_nonpcpu);
rcu_assign_pointer(ca->buckets_gc, buckets);
}
- for_each_btree_key(trans, iter, BTREE_ID_alloc, POS_MIN,
- BTREE_ITER_PREFETCH, k, ret) {
+ ret = for_each_btree_key2(trans, iter, BTREE_ID_alloc, POS_MIN,
+ BTREE_ITER_PREFETCH, k, ({
ca = bch_dev_bkey_exists(c, k.k->p.inode);
g = gc_bucket(ca, k.k->p.offset);
g->stripe = a->stripe;
g->stripe_redundancy = a->stripe_redundancy;
}
- }
- bch2_trans_iter_exit(trans, &iter);
+
+ 0;
+ }));
err:
bch2_trans_put(trans);
if (ret)
return offset;
}
-static void btree_node_read_all_replicas_done(struct closure *cl)
+static CLOSURE_CALLBACK(btree_node_read_all_replicas_done)
{
- struct btree_node_read_all *ra =
- container_of(cl, struct btree_node_read_all, cl);
+ closure_type(ra, struct btree_node_read_all, cl);
struct bch_fs *c = ra->c;
struct btree *b = ra->b;
struct printbuf buf = PRINTBUF;
if (sync) {
closure_sync(&ra->cl);
- btree_node_read_all_replicas_done(&ra->cl);
+ btree_node_read_all_replicas_done(&ra->cl.work);
} else {
continue_at(&ra->cl, btree_node_read_all_replicas_done,
c->io_complete_wq);
return bch2_trans_run(c, __bch2_btree_root_read(trans, id, k, level));
}
-void bch2_btree_complete_write(struct bch_fs *c, struct btree *b,
- struct btree_write *w)
+static void bch2_btree_complete_write(struct bch_fs *c, struct btree *b,
+ struct btree_write *w)
{
unsigned long old, new, v = READ_ONCE(b->will_make_reachable);
int bch2_btree_root_read(struct bch_fs *, enum btree_id,
const struct bkey_i *, unsigned);
-void bch2_btree_complete_write(struct bch_fs *, struct btree *,
- struct btree_write *);
-
bool bch2_btree_post_write_cleanup(struct bch_fs *, struct btree *);
enum btree_write_flags {
trans->fn_idx = fn_idx;
trans->locking_wait.task = current;
trans->journal_replay_not_finished =
- !test_bit(JOURNAL_REPLAY_DONE, &c->journal.flags);
+ unlikely(!test_bit(JOURNAL_REPLAY_DONE, &c->journal.flags)) &&
+ atomic_inc_not_zero(&c->journal_keys.ref);
closure_init_stack(&trans->ref);
s = btree_trans_stats(trans);
srcu_read_unlock(&c->btree_trans_barrier, trans->srcu_idx);
}
- bch2_journal_preres_put(&c->journal, &trans->journal_preres);
-
kfree(trans->extra_journal_entries.data);
if (trans->fs_usage_deltas) {
kfree(trans->fs_usage_deltas);
}
+ if (unlikely(trans->journal_replay_not_finished))
+ bch2_journal_keys_put(c);
+
if (trans->mem_bytes == BTREE_TRANS_MEM_MAX)
mempool_free(trans->mem, &c->btree_trans_mem_pool);
else
mempool_exit(&c->btree_trans_pool);
}
-int bch2_fs_btree_iter_init(struct bch_fs *c)
+void bch2_fs_btree_iter_init_early(struct bch_fs *c)
{
struct btree_transaction_stats *s;
- int ret;
for (s = c->btree_transaction_stats;
s < c->btree_transaction_stats + ARRAY_SIZE(c->btree_transaction_stats);
INIT_LIST_HEAD(&c->btree_trans_list);
seqmutex_init(&c->btree_trans_lock);
+}
+
+int bch2_fs_btree_iter_init(struct bch_fs *c)
+{
+ int ret;
c->btree_trans_bufs = alloc_percpu(struct btree_trans_buf);
if (!c->btree_trans_bufs)
void bch2_btree_trans_to_text(struct printbuf *, struct btree_trans *);
void bch2_fs_btree_iter_exit(struct bch_fs *);
+void bch2_fs_btree_iter_init_early(struct bch_fs *);
int bch2_fs_btree_iter_init(struct bch_fs *);
#endif /* _BCACHEFS_BTREE_ITER_H */
struct journal_keys *keys = &c->journal_keys;
unsigned iters = 0;
struct journal_key *k;
+
+ BUG_ON(*idx > keys->nr);
search:
if (!*idx)
*idx = __bch2_journal_key_search(keys, btree_id, level, pos);
/* Since @keys was full, there was no gap: */
memcpy(new_keys.d, keys->d, sizeof(keys->d[0]) * keys->nr);
kvfree(keys->d);
- *keys = new_keys;
+ keys->d = new_keys.d;
+ keys->nr = new_keys.nr;
+ keys->size = new_keys.size;
/* And now the gap is at the end: */
- keys->gap = keys->nr;
+ keys->gap = keys->nr;
}
journal_iters_move_gap(c, keys->gap, idx);
cmp_int(l->journal_offset, r->journal_offset);
}
-void bch2_journal_keys_free(struct journal_keys *keys)
+void bch2_journal_keys_put(struct bch_fs *c)
{
+ struct journal_keys *keys = &c->journal_keys;
struct journal_key *i;
+ BUG_ON(atomic_read(&keys->ref) <= 0);
+
+ if (!atomic_dec_and_test(&keys->ref))
+ return;
+
move_gap(keys->d, keys->nr, keys->size, keys->gap, keys->nr);
keys->gap = keys->nr;
kvfree(keys->d);
keys->d = NULL;
keys->nr = keys->gap = keys->size = 0;
+
+ bch2_journal_entries_free(c);
}
static void __journal_keys_sort(struct journal_keys *keys)
struct bch_fs *,
struct btree *);
-void bch2_journal_keys_free(struct journal_keys *);
+void bch2_journal_keys_put(struct bch_fs *);
+
+static inline void bch2_journal_keys_put_initial(struct bch_fs *c)
+{
+ if (c->journal_keys.initial_ref_held)
+ bch2_journal_keys_put(c);
+ c->journal_keys.initial_ref_held = false;
+}
+
void bch2_journal_entries_free(struct bch_fs *);
int bch2_journal_keys_sort(struct bch_fs *);
ck->btree_trans_barrier_seq =
start_poll_synchronize_srcu(&c->btree_trans_barrier);
- if (ck->c.lock.readers)
+ if (ck->c.lock.readers) {
list_move_tail(&ck->list, &bc->freed_pcpu);
- else
+ bc->nr_freed_pcpu++;
+ } else {
list_move_tail(&ck->list, &bc->freed_nonpcpu);
+ bc->nr_freed_nonpcpu++;
+ }
atomic_long_inc(&bc->nr_freed);
kfree(ck->k);
{
struct bkey_cached *pos;
+ bc->nr_freed_nonpcpu++;
+
list_for_each_entry_reverse(pos, &bc->freed_nonpcpu, list) {
if (ULONG_CMP_GE(ck->btree_trans_barrier_seq,
pos->btree_trans_barrier_seq)) {
#else
mutex_lock(&bc->lock);
list_move_tail(&ck->list, &bc->freed_nonpcpu);
+ bc->nr_freed_nonpcpu++;
mutex_unlock(&bc->lock);
#endif
} else {
f->nr < ARRAY_SIZE(f->objs) / 2) {
ck = list_last_entry(&bc->freed_nonpcpu, struct bkey_cached, list);
list_del_init(&ck->list);
+ bc->nr_freed_nonpcpu--;
f->objs[f->nr++] = ck;
}
if (!list_empty(&bc->freed_nonpcpu)) {
ck = list_last_entry(&bc->freed_nonpcpu, struct bkey_cached, list);
list_del_init(&ck->list);
+ bc->nr_freed_nonpcpu--;
}
mutex_unlock(&bc->lock);
#endif
goto out;
bch2_journal_pin_drop(j, &ck->journal);
- bch2_journal_preres_put(j, &ck->res);
BUG_ON(!btree_node_locked(c_iter.path, 0));
BUG_ON(insert->k.u64s > ck->u64s);
- if (likely(!(flags & BTREE_INSERT_JOURNAL_REPLAY))) {
- int difference;
-
- BUG_ON(jset_u64s(insert->k.u64s) > trans->journal_preres.u64s);
-
- difference = jset_u64s(insert->k.u64s) - ck->res.u64s;
- if (difference > 0) {
- trans->journal_preres.u64s -= difference;
- ck->res.u64s += difference;
- }
- }
-
bkey_copy(ck->k, insert);
ck->valid = true;
* Newest freed entries are at the end of the list - once we hit one
* that's too new to be freed, we can bail out:
*/
+ scanned += bc->nr_freed_nonpcpu;
+
list_for_each_entry_safe(ck, t, &bc->freed_nonpcpu, list) {
if (!poll_state_synchronize_srcu(&c->btree_trans_barrier,
ck->btree_trans_barrier_seq))
six_lock_exit(&ck->c.lock);
kmem_cache_free(bch2_key_cache, ck);
atomic_long_dec(&bc->nr_freed);
- scanned++;
freed++;
+ bc->nr_freed_nonpcpu--;
}
if (scanned >= nr)
goto out;
+ scanned += bc->nr_freed_pcpu;
+
list_for_each_entry_safe(ck, t, &bc->freed_pcpu, list) {
if (!poll_state_synchronize_srcu(&c->btree_trans_barrier,
ck->btree_trans_barrier_seq))
six_lock_exit(&ck->c.lock);
kmem_cache_free(bch2_key_cache, ck);
atomic_long_dec(&bc->nr_freed);
- scanned++;
freed++;
+ bc->nr_freed_pcpu--;
}
if (scanned >= nr)
}
#endif
+ BUG_ON(list_count_nodes(&bc->freed_pcpu) != bc->nr_freed_pcpu);
+ BUG_ON(list_count_nodes(&bc->freed_nonpcpu) != bc->nr_freed_nonpcpu);
+
list_splice(&bc->freed_pcpu, &items);
list_splice(&bc->freed_nonpcpu, &items);
list_for_each_entry_safe(ck, n, &items, list) {
cond_resched();
- bch2_journal_pin_drop(&c->journal, &ck->journal);
- bch2_journal_preres_put(&c->journal, &ck->res);
-
list_del(&ck->list);
kfree(ck->k);
six_lock_exit(&ck->c.lock);
--- /dev/null
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _BCACHEFS_BTREE_KEY_CACHE_TYPES_H
+#define _BCACHEFS_BTREE_KEY_CACHE_TYPES_H
+
+struct btree_key_cache_freelist {
+ struct bkey_cached *objs[16];
+ unsigned nr;
+};
+
+struct btree_key_cache {
+ struct mutex lock;
+ struct rhashtable table;
+ bool table_init_done;
+
+ struct list_head freed_pcpu;
+ size_t nr_freed_pcpu;
+ struct list_head freed_nonpcpu;
+ size_t nr_freed_nonpcpu;
+
+ struct shrinker *shrink;
+ unsigned shrink_iter;
+ struct btree_key_cache_freelist __percpu *pcpu_freed;
+
+ atomic_long_t nr_freed;
+ atomic_long_t nr_keys;
+ atomic_long_t nr_dirty;
+};
+
+struct bkey_cached_key {
+ u32 btree_id;
+ struct bpos pos;
+} __packed __aligned(4);
+
+#endif /* _BCACHEFS_BTREE_KEY_CACHE_TYPES_H */
bch2_btree_init_next(trans, b);
}
+static noinline int trans_lock_write_fail(struct btree_trans *trans, struct btree_insert_entry *i)
+{
+ while (--i >= trans->updates) {
+ if (same_leaf_as_prev(trans, i))
+ continue;
+
+ bch2_btree_node_unlock_write(trans, i->path, insert_l(i)->b);
+ }
+
+ trace_and_count(trans->c, trans_restart_would_deadlock_write, trans);
+ return btree_trans_restart(trans, BCH_ERR_transaction_restart_would_deadlock_write);
+}
+
+static inline int bch2_trans_lock_write(struct btree_trans *trans)
+{
+ struct btree_insert_entry *i;
+
+ EBUG_ON(trans->write_locked);
+
+ trans_for_each_update(trans, i) {
+ if (same_leaf_as_prev(trans, i))
+ continue;
+
+ if (bch2_btree_node_lock_write(trans, i->path, &insert_l(i)->b->c))
+ return trans_lock_write_fail(trans, i);
+
+ if (!i->cached)
+ bch2_btree_node_prep_for_write(trans, i->path, insert_l(i)->b);
+ }
+
+ trans->write_locked = true;
+ return 0;
+}
+
+static inline void bch2_trans_unlock_write(struct btree_trans *trans)
+{
+ if (likely(trans->write_locked)) {
+ struct btree_insert_entry *i;
+
+ trans_for_each_update(trans, i)
+ if (!same_leaf_as_prev(trans, i))
+ bch2_btree_node_unlock_write_inlined(trans, i->path,
+ insert_l(i)->b);
+ trans->write_locked = false;
+ }
+}
+
/* Inserting into a given leaf node (last stage of insert): */
/* Handle overwrites and do insert, for non extents: */
bch2_snapshot_is_internal_node(trans->c, i->k->k.p.snapshot));
}
-static noinline int
-bch2_trans_journal_preres_get_cold(struct btree_trans *trans, unsigned flags,
- unsigned long trace_ip)
-{
- return drop_locks_do(trans,
- bch2_journal_preres_get(&trans->c->journal,
- &trans->journal_preres,
- trans->journal_preres_u64s,
- (flags & BCH_WATERMARK_MASK)));
-}
-
static __always_inline int bch2_trans_journal_res_get(struct btree_trans *trans,
unsigned flags)
{
return 0;
}
+noinline static int
+btree_key_can_insert_cached_slowpath(struct btree_trans *trans, unsigned flags,
+ struct btree_path *path, unsigned new_u64s)
+{
+ struct bch_fs *c = trans->c;
+ struct btree_insert_entry *i;
+ struct bkey_cached *ck = (void *) path->l[0].b;
+ struct bkey_i *new_k;
+ int ret;
+
+ bch2_trans_unlock_write(trans);
+ bch2_trans_unlock(trans);
+
+ new_k = kmalloc(new_u64s * sizeof(u64), GFP_KERNEL);
+ if (!new_k) {
+ bch_err(c, "error allocating memory for key cache key, btree %s u64s %u",
+ bch2_btree_id_str(path->btree_id), new_u64s);
+ return -BCH_ERR_ENOMEM_btree_key_cache_insert;
+ }
+
+ ret = bch2_trans_relock(trans) ?:
+ bch2_trans_lock_write(trans);
+ if (unlikely(ret)) {
+ kfree(new_k);
+ return ret;
+ }
+
+ memcpy(new_k, ck->k, ck->u64s * sizeof(u64));
+
+ trans_for_each_update(trans, i)
+ if (i->old_v == &ck->k->v)
+ i->old_v = &new_k->v;
+
+ kfree(ck->k);
+ ck->u64s = new_u64s;
+ ck->k = new_k;
+ return 0;
+}
+
static int btree_key_can_insert_cached(struct btree_trans *trans, unsigned flags,
struct btree_path *path, unsigned u64s)
{
return 0;
new_u64s = roundup_pow_of_two(u64s);
- new_k = krealloc(ck->k, new_u64s * sizeof(u64), GFP_NOFS);
- if (!new_k) {
- bch_err(c, "error allocating memory for key cache key, btree %s u64s %u",
- bch2_btree_id_str(path->btree_id), new_u64s);
- return -BCH_ERR_ENOMEM_btree_key_cache_insert;
- }
+ new_k = krealloc(ck->k, new_u64s * sizeof(u64), GFP_NOWAIT);
+ if (unlikely(!new_k))
+ return btree_key_can_insert_cached_slowpath(trans, flags, path, new_u64s);
trans_for_each_update(trans, i)
if (i->old_v == &ck->k->v)
return ret;
}
-static noinline int trans_lock_write_fail(struct btree_trans *trans, struct btree_insert_entry *i)
-{
- while (--i >= trans->updates) {
- if (same_leaf_as_prev(trans, i))
- continue;
-
- bch2_btree_node_unlock_write(trans, i->path, insert_l(i)->b);
- }
-
- trace_and_count(trans->c, trans_restart_would_deadlock_write, trans);
- return btree_trans_restart(trans, BCH_ERR_transaction_restart_would_deadlock_write);
-}
-
-static inline int trans_lock_write(struct btree_trans *trans)
-{
- struct btree_insert_entry *i;
-
- trans_for_each_update(trans, i) {
- if (same_leaf_as_prev(trans, i))
- continue;
-
- if (bch2_btree_node_lock_write(trans, i->path, &insert_l(i)->b->c))
- return trans_lock_write_fail(trans, i);
-
- if (!i->cached)
- bch2_btree_node_prep_for_write(trans, i->path, insert_l(i)->b);
- }
-
- return 0;
-}
-
static noinline void bch2_drop_overwrites_from_journal(struct btree_trans *trans)
{
struct btree_insert_entry *i;
}
}
- ret = bch2_journal_preres_get(&c->journal,
- &trans->journal_preres, trans->journal_preres_u64s,
- (flags & BCH_WATERMARK_MASK)|JOURNAL_RES_GET_NONBLOCK);
- if (unlikely(ret == -BCH_ERR_journal_preres_get_blocked))
- ret = bch2_trans_journal_preres_get_cold(trans, flags, trace_ip);
- if (unlikely(ret))
- return ret;
-
- ret = trans_lock_write(trans);
+ ret = bch2_trans_lock_write(trans);
if (unlikely(ret))
return ret;
if (!ret && unlikely(trans->journal_replay_not_finished))
bch2_drop_overwrites_from_journal(trans);
- trans_for_each_update(trans, i)
- if (!same_leaf_as_prev(trans, i))
- bch2_btree_node_unlock_write_inlined(trans, i->path,
- insert_l(i)->b);
+ bch2_trans_unlock_write(trans);
if (!ret && trans->journal_pin)
bch2_journal_pin_add(&c->journal, trans->journal_res.seq,
struct bch_fs *c = trans->c;
struct btree_insert_entry *i = NULL;
struct btree_write_buffered_key *wb;
- unsigned u64s;
int ret = 0;
if (!trans->nr_updates &&
EBUG_ON(test_bit(BCH_FS_CLEAN_SHUTDOWN, &c->flags));
- memset(&trans->journal_preres, 0, sizeof(trans->journal_preres));
-
trans->journal_u64s = trans->extra_journal_entries.nr;
- trans->journal_preres_u64s = 0;
-
trans->journal_transaction_names = READ_ONCE(c->opts.journal_transaction_names);
-
if (trans->journal_transaction_names)
trans->journal_u64s += jset_u64s(JSET_ENTRY_LOG_U64s);
if (i->key_cache_already_flushed)
continue;
- /* we're going to journal the key being updated: */
- u64s = jset_u64s(i->k->k.u64s);
- if (i->cached &&
- likely(!(flags & BTREE_INSERT_JOURNAL_REPLAY)))
- trans->journal_preres_u64s += u64s;
-
if (i->flags & BTREE_UPDATE_NOJOURNAL)
continue;
- trans->journal_u64s += u64s;
+ /* we're going to journal the key being updated: */
+ trans->journal_u64s += jset_u64s(i->k->k.u64s);
/* and we're also going to log the overwrite: */
if (trans->journal_transaction_names)
trace_and_count(c, transaction_commit, trans, _RET_IP_);
out:
- bch2_journal_preres_put(&c->journal, &trans->journal_preres);
-
if (likely(!(flags & BTREE_INSERT_NOCHECK_RW)))
bch2_write_ref_put(c, BCH_WRITE_REF_trans);
out_reset:
#include <linux/list.h>
#include <linux/rhashtable.h>
-//#include "bkey_methods.h"
+#include "btree_key_cache_types.h"
#include "buckets_types.h"
#include "darray.h"
#include "errcode.h"
#endif
};
-struct btree_key_cache_freelist {
- struct bkey_cached *objs[16];
- unsigned nr;
-};
-
-struct btree_key_cache {
- struct mutex lock;
- struct rhashtable table;
- bool table_init_done;
- struct list_head freed_pcpu;
- struct list_head freed_nonpcpu;
- struct shrinker *shrink;
- unsigned shrink_iter;
- struct btree_key_cache_freelist __percpu *pcpu_freed;
-
- atomic_long_t nr_freed;
- atomic_long_t nr_keys;
- atomic_long_t nr_dirty;
-};
-
-struct bkey_cached_key {
- u32 btree_id;
- struct bpos pos;
-} __packed __aligned(4);
-
#define BKEY_CACHED_ACCESSED 0
#define BKEY_CACHED_DIRTY 1
struct rhash_head hash;
struct list_head list;
- struct journal_preres res;
struct journal_entry_pin journal;
u64 seq;
unsigned long ip_allocated;
};
-#ifndef CONFIG_LOCKDEP
#define BTREE_ITER_MAX 64
-#else
-#define BTREE_ITER_MAX 32
-#endif
struct btree_trans_commit_hook;
typedef int (btree_trans_commit_hook_fn)(struct btree_trans *, struct btree_trans_commit_hook *);
bool journal_transaction_names:1;
bool journal_replay_not_finished:1;
bool notrace_relock_fail:1;
+ bool write_locked:1;
enum bch_errcode restarted:16;
u32 restart_count;
unsigned long last_begin_ip;
struct journal_entry_pin *journal_pin;
struct journal_res journal_res;
- struct journal_preres journal_preres;
u64 *journal_seq;
struct disk_reservation *disk_res;
unsigned journal_u64s;
- unsigned journal_preres_u64s;
struct replicas_delta_list *fs_usage_deltas;
};
BTREE_UPDATE_PREJOURNAL);
}
+static noinline int bch2_btree_insert_clone_trans(struct btree_trans *trans,
+ enum btree_id btree,
+ struct bkey_i *k)
+{
+ struct bkey_i *n = bch2_trans_kmalloc(trans, bkey_bytes(&k->k));
+ int ret = PTR_ERR_OR_ZERO(n);
+ if (ret)
+ return ret;
+
+ bkey_copy(n, k);
+ return bch2_btree_insert_trans(trans, btree, n, 0);
+}
+
int __must_check bch2_trans_update_buffered(struct btree_trans *trans,
enum btree_id btree,
struct bkey_i *k)
EBUG_ON(trans->nr_wb_updates > trans->wb_updates_size);
EBUG_ON(k->k.u64s > BTREE_WRITE_BUFERED_U64s_MAX);
+ if (unlikely(trans->journal_replay_not_finished))
+ return bch2_btree_insert_clone_trans(trans, btree, k);
+
trans_for_each_wb_update(trans, i) {
if (i->btree == btree && bpos_eq(i->k.k.p, k->k.p)) {
bkey_copy(&i->k, k);
/* Calculate ideal packed bkey format for new btree nodes: */
-void __bch2_btree_calc_format(struct bkey_format_state *s, struct btree *b)
+static void __bch2_btree_calc_format(struct bkey_format_state *s, struct btree *b)
{
struct bkey_packed *k;
struct bset_tree *t;
return bch2_bkey_format_done(&s);
}
-static size_t btree_node_u64s_with_format(struct btree *b,
+static size_t btree_node_u64s_with_format(struct btree_nr_keys nr,
+ struct bkey_format *old_f,
struct bkey_format *new_f)
{
- struct bkey_format *old_f = &b->format;
-
/* stupid integer promotion rules */
ssize_t delta =
(((int) new_f->key_u64s - old_f->key_u64s) *
- (int) b->nr.packed_keys) +
+ (int) nr.packed_keys) +
(((int) new_f->key_u64s - BKEY_U64s) *
- (int) b->nr.unpacked_keys);
+ (int) nr.unpacked_keys);
- BUG_ON(delta + b->nr.live_u64s < 0);
+ BUG_ON(delta + nr.live_u64s < 0);
- return b->nr.live_u64s + delta;
+ return nr.live_u64s + delta;
}
/**
*
* @c: filesystem handle
* @b: btree node to rewrite
+ * @nr: number of keys for new node (i.e. b->nr)
* @new_f: bkey format to translate keys to
*
* Returns: true if all re-packed keys will be able to fit in a new node.
*
* Assumes all keys will successfully pack with the new format.
*/
-bool bch2_btree_node_format_fits(struct bch_fs *c, struct btree *b,
+static bool bch2_btree_node_format_fits(struct bch_fs *c, struct btree *b,
+ struct btree_nr_keys nr,
struct bkey_format *new_f)
{
- size_t u64s = btree_node_u64s_with_format(b, new_f);
+ size_t u64s = btree_node_u64s_with_format(nr, &b->format, new_f);
return __vstruct_bytes(struct btree_node, u64s) < btree_bytes(c);
}
* The keys might expand with the new format - if they wouldn't fit in
* the btree node anymore, use the old format for now:
*/
- if (!bch2_btree_node_format_fits(as->c, b, &format))
+ if (!bch2_btree_node_format_fits(as->c, b, b->nr, &format))
format = b->format;
SET_BTREE_NODE_SEQ(n->data, BTREE_NODE_SEQ(b->data) + 1);
up_read(&c->gc_lock);
as->took_gc_lock = false;
- bch2_journal_preres_put(&c->journal, &as->journal_preres);
-
bch2_journal_pin_drop(&c->journal, &as->journal);
bch2_journal_pin_flush(&c->journal, &as->journal);
bch2_disk_reservation_put(c, &as->disk_res);
bch2_journal_pin_drop(&c->journal, &as->journal);
- bch2_journal_preres_put(&c->journal, &as->journal_preres);
-
mutex_lock(&c->btree_interior_update_lock);
for (i = 0; i < as->nr_new_nodes; i++) {
b = as->new_nodes[i];
}
}
-static void btree_update_set_nodes_written(struct closure *cl)
+static CLOSURE_CALLBACK(btree_update_set_nodes_written)
{
- struct btree_update *as = container_of(cl, struct btree_update, cl);
+ closure_type(as, struct btree_update, cl);
struct bch_fs *c = as->c;
mutex_lock(&c->btree_interior_update_lock);
unsigned nr_nodes[2] = { 0, 0 };
unsigned update_level = level;
enum bch_watermark watermark = flags & BCH_WATERMARK_MASK;
- unsigned journal_flags = 0;
int ret = 0;
u32 restart_count = trans->restart_count;
flags &= ~BCH_WATERMARK_MASK;
flags |= watermark;
- if (flags & BTREE_INSERT_JOURNAL_RECLAIM)
- journal_flags |= JOURNAL_RES_GET_NONBLOCK;
- journal_flags |= watermark;
+ if (!(flags & BTREE_INSERT_JOURNAL_RECLAIM) &&
+ watermark < c->journal.watermark) {
+ struct journal_res res = { 0 };
+
+ ret = drop_locks_do(trans,
+ bch2_journal_res_get(&c->journal, &res, 1,
+ watermark|JOURNAL_RES_GET_CHECK));
+ if (ret)
+ return ERR_PTR(ret);
+ }
while (1) {
nr_nodes[!!update_level] += 1 + split;
break;
}
+ /*
+ * Always check for space for two keys, even if we won't have to
+ * split at prior level - it might have been a merge instead:
+ */
if (bch2_btree_node_insert_fits(c, path->l[update_level].b,
- BKEY_BTREE_PTR_U64s_MAX * (1 + split)))
+ BKEY_BTREE_PTR_U64s_MAX * 2))
break;
split = path->l[update_level].b->nr.live_u64s > BTREE_SPLIT_THRESHOLD(c);
if (ret)
goto err;
- ret = bch2_journal_preres_get(&c->journal, &as->journal_preres,
- BTREE_UPDATE_JOURNAL_RES,
- journal_flags|JOURNAL_RES_GET_NONBLOCK);
- if (ret) {
- if (flags & BTREE_INSERT_JOURNAL_RECLAIM) {
- ret = -BCH_ERR_journal_reclaim_would_deadlock;
- goto err;
- }
-
- ret = drop_locks_do(trans,
- bch2_journal_preres_get(&c->journal, &as->journal_preres,
- BTREE_UPDATE_JOURNAL_RES,
- journal_flags));
- if (ret == -BCH_ERR_journal_preres_get_blocked) {
- trace_and_count(c, trans_restart_journal_preres_get, trans, _RET_IP_, journal_flags);
- ret = btree_trans_restart(trans, BCH_ERR_transaction_restart_journal_preres_get);
- }
- if (ret)
- goto err;
- }
-
ret = bch2_disk_reservation_get(c, &as->disk_res,
(nr_nodes[0] + nr_nodes[1]) * btree_sectors(c),
c->opts.metadata_replicas,
struct bkey_packed *out[2];
struct bkey uk;
unsigned u64s, n1_u64s = (b->nr.live_u64s * 3) / 5;
+ struct { unsigned nr_keys, val_u64s; } nr_keys[2];
int i;
+ memset(&nr_keys, 0, sizeof(nr_keys));
+
for (i = 0; i < 2; i++) {
BUG_ON(n[i]->nsets != 1);
if (!i)
n1_pos = uk.p;
bch2_bkey_format_add_key(&format[i], &uk);
+
+ nr_keys[i].nr_keys++;
+ nr_keys[i].val_u64s += bkeyp_val_u64s(&b->format, k);
}
btree_set_min(n[0], b->data->min_key);
bch2_bkey_format_add_pos(&format[i], n[i]->data->max_key);
n[i]->data->format = bch2_bkey_format_done(&format[i]);
+
+ unsigned u64s = nr_keys[i].nr_keys * n[i]->data->format.key_u64s +
+ nr_keys[i].val_u64s;
+ if (__vstruct_bytes(struct btree_node, u64s) > btree_bytes(as->c))
+ n[i]->data->format = b->format;
+
btree_node_set_format(n[i], n[i]->data->format);
}
bch2_bkey_format_add_pos(&new_s, next->data->max_key);
new_f = bch2_bkey_format_done(&new_s);
- sib_u64s = btree_node_u64s_with_format(b, &new_f) +
- btree_node_u64s_with_format(m, &new_f);
+ sib_u64s = btree_node_u64s_with_format(b->nr, &b->format, &new_f) +
+ btree_node_u64s_with_format(m->nr, &m->format, &new_f);
if (sib_u64s > BTREE_FOREGROUND_MERGE_HYSTERESIS(c)) {
sib_u64s -= BTREE_FOREGROUND_MERGE_HYSTERESIS(c);
BUG_ON(!btree_node_hashed(b));
+ struct bch_extent_ptr *ptr;
+ bch2_bkey_drop_ptrs(bkey_i_to_s(new_key), ptr,
+ !bch2_bkey_has_device(bkey_i_to_s(&b->key), ptr->dev));
+
ret = bch2_btree_node_update_key(trans, &iter, b, new_key,
commit_flags, skip_triggers);
out:
#include "btree_locking.h"
#include "btree_update.h"
-void __bch2_btree_calc_format(struct bkey_format_state *, struct btree *);
-bool bch2_btree_node_format_fits(struct bch_fs *c, struct btree *,
- struct bkey_format *);
-
#define BTREE_UPDATE_NODES_MAX ((BTREE_MAX_DEPTH - 2) * 2 + GC_MERGE_NODES)
#define BTREE_UPDATE_JOURNAL_RES (BTREE_UPDATE_NODES_MAX * (BKEY_BTREE_PTR_U64s_MAX + 1))
unsigned update_level;
struct disk_reservation disk_res;
- struct journal_preres journal_preres;
/*
* BTREE_INTERIOR_UPDATING_NODE:
return ret;
*dst_sectors += sectors;
- *bucket_data_type = *dirty_sectors || *cached_sectors
- ? ptr_data_type : 0;
+
+ if (!*dirty_sectors && !*cached_sectors)
+ *bucket_data_type = 0;
+ else if (*bucket_data_type != BCH_DATA_stripe)
+ *bucket_data_type = ptr_data_type;
+
return 0;
}
bucket_gens->first_bucket = ca->mi.first_bucket;
bucket_gens->nbuckets = nbuckets;
- bch2_copygc_stop(c);
-
if (resize) {
down_write(&c->gc_lock);
down_write(&ca->bucket_lock);
*/
unsigned level = min((compression.level * 3) / 2, zstd_max_clevel());
ZSTD_parameters params = zstd_get_params(level, c->opts.encoded_extent_max);
- ZSTD_CCtx *ctx = zstd_init_cctx(workspace,
- zstd_cctx_workspace_bound(¶ms.cParams));
+ ZSTD_CCtx *ctx = zstd_init_cctx(workspace, c->zstd_workspace_size);
/*
* ZSTD requires that when we decompress we pass in the exact
size_t len = zstd_compress_cctx(ctx,
dst + 4, dst_len - 4 - 7,
src, src_len,
- &c->zstd_params);
+ ¶ms);
if (zstd_is_error(len))
return 0;
size_t decompress_workspace_size = 0;
ZSTD_parameters params = zstd_get_params(zstd_max_clevel(),
c->opts.encoded_extent_max);
+
+ /*
+ * ZSTD is lying: if we allocate the size of the workspace it says it
+ * requires, it returns memory allocation errors
+ */
+ c->zstd_workspace_size = zstd_cctx_workspace_bound(¶ms.cParams);
+
struct {
unsigned feature;
enum bch_compression_type type;
zlib_deflate_workspacesize(MAX_WBITS, DEF_MEM_LEVEL),
zlib_inflate_workspacesize(), },
{ BCH_FEATURE_zstd, BCH_COMPRESSION_TYPE_zstd,
- zstd_cctx_workspace_bound(¶ms.cParams),
+ c->zstd_workspace_size,
zstd_dctx_workspace_bound() },
}, *i;
bool have_compressed = false;
- c->zstd_params = params;
-
for (i = compression_types;
i < compression_types + ARRAY_SIZE(compression_types);
i++)
next_pos = insert->k.p;
+ /*
+ * Check for nonce offset inconsistency:
+ * This is debug code - we've been seeing this bug rarely, and
+ * it's been hard to reproduce, so this should give us some more
+ * information when it does occur:
+ */
+ struct printbuf err = PRINTBUF;
+ int invalid = bch2_bkey_invalid(c, bkey_i_to_s_c(insert), __btree_node_type(0, m->btree_id), 0, &err);
+ printbuf_exit(&err);
+
+ if (invalid) {
+ struct printbuf buf = PRINTBUF;
+
+ prt_str(&buf, "about to insert invalid key in data update path");
+ prt_str(&buf, "\nold: ");
+ bch2_bkey_val_to_text(&buf, c, old);
+ prt_str(&buf, "\nk: ");
+ bch2_bkey_val_to_text(&buf, c, k);
+ prt_str(&buf, "\nnew: ");
+ bch2_bkey_val_to_text(&buf, c, bkey_i_to_s_c(insert));
+
+ bch2_print_string_as_lines(KERN_ERR, buf.buf);
+ printbuf_exit(&buf);
+
+ bch2_fatal_error(c);
+ goto out;
+ }
+
ret = bch2_insert_snapshot_whiteouts(trans, m->btree_id,
k.k->p, bkey_start_pos(&insert->k)) ?:
bch2_insert_snapshot_whiteouts(trans, m->btree_id,
bch2_bio_free_pages_pool(c, &update->op.wbio.bio);
}
-void bch2_update_unwritten_extent(struct btree_trans *trans,
+static void bch2_update_unwritten_extent(struct btree_trans *trans,
struct data_update *update)
{
struct bch_fs *c = update->op.c;
}
}
+int bch2_extent_drop_ptrs(struct btree_trans *trans,
+ struct btree_iter *iter,
+ struct bkey_s_c k,
+ struct data_update_opts data_opts)
+{
+ struct bch_fs *c = trans->c;
+ struct bkey_i *n;
+ int ret;
+
+ n = bch2_bkey_make_mut_noupdate(trans, k);
+ ret = PTR_ERR_OR_ZERO(n);
+ if (ret)
+ return ret;
+
+ while (data_opts.kill_ptrs) {
+ unsigned i = 0, drop = __fls(data_opts.kill_ptrs);
+ struct bch_extent_ptr *ptr;
+
+ bch2_bkey_drop_ptrs(bkey_i_to_s(n), ptr, i++ == drop);
+ data_opts.kill_ptrs ^= 1U << drop;
+ }
+
+ /*
+ * If the new extent no longer has any pointers, bch2_extent_normalize()
+ * will do the appropriate thing with it (turning it into a
+ * KEY_TYPE_error key, or just a discard if it was a cached extent)
+ */
+ bch2_extent_normalize(c, bkey_i_to_s(n));
+
+ /*
+ * Since we're not inserting through an extent iterator
+ * (BTREE_ITER_ALL_SNAPSHOTS iterators aren't extent iterators),
+ * we aren't using the extent overwrite path to delete, we're
+ * just using the normal key deletion path:
+ */
+ if (bkey_deleted(&n->k) && !(iter->flags & BTREE_ITER_IS_EXTENTS))
+ n->k.size = 0;
+
+ return bch2_trans_relock(trans) ?:
+ bch2_trans_update(trans, iter, n, BTREE_UPDATE_INTERNAL_SNAPSHOT_NODE) ?:
+ bch2_trans_commit(trans, NULL, NULL, BTREE_INSERT_NOFAIL);
+}
+
int bch2_data_update_init(struct btree_trans *trans,
+ struct btree_iter *iter,
struct moving_context *ctxt,
struct data_update *m,
struct write_point_specifier wp,
const struct bch_extent_ptr *ptr;
unsigned i, reserve_sectors = k.k->size * data_opts.extra_replicas;
unsigned ptrs_locked = 0;
- int ret;
+ int ret = 0;
bch2_bkey_buf_init(&m->k);
bch2_bkey_buf_reassemble(&m->k, c, k);
bkey_for_each_ptr(ptrs, ptr)
percpu_ref_get(&bch_dev_bkey_exists(c, ptr->dev)->ref);
+ unsigned durability_have = 0, durability_removing = 0;
+
i = 0;
bkey_for_each_ptr_decode(k.k, ptrs, p, entry) {
bool locked;
reserve_sectors += k.k->size;
m->op.nr_replicas += bch2_extent_ptr_desired_durability(c, &p);
- } else if (!p.ptr.cached) {
+ durability_removing += bch2_extent_ptr_desired_durability(c, &p);
+ } else if (!p.ptr.cached &&
+ !((1U << i) & m->data_opts.kill_ptrs)) {
bch2_dev_list_add_dev(&m->op.devs_have, p.ptr.dev);
+ durability_have += bch2_extent_ptr_durability(c, &p);
}
/*
move_ctxt_wait_event(ctxt,
(locked = bch2_bucket_nocow_trylock(&c->nocow_locks,
PTR_BUCKET_POS(c, &p.ptr), 0)) ||
- !atomic_read(&ctxt->read_sectors));
+ (!atomic_read(&ctxt->read_sectors) &&
+ !atomic_read(&ctxt->write_sectors)));
if (!locked)
bch2_bucket_nocow_lock(&c->nocow_locks,
i++;
}
+ /*
+ * If current extent durability is less than io_opts.data_replicas,
+ * we're not trying to rereplicate the extent up to data_replicas here -
+ * unless extra_replicas was specified
+ *
+ * Increasing replication is an explicit operation triggered by
+ * rereplicate, currently, so that users don't get an unexpected -ENOSPC
+ */
+ if (durability_have >= io_opts.data_replicas) {
+ m->data_opts.kill_ptrs |= m->data_opts.rewrite_ptrs;
+ m->data_opts.rewrite_ptrs = 0;
+ /* if iter == NULL, it's just a promote */
+ if (iter)
+ ret = bch2_extent_drop_ptrs(trans, iter, k, m->data_opts);
+ goto done;
+ }
+
+ m->op.nr_replicas = min(durability_removing, io_opts.data_replicas - durability_have) +
+ m->data_opts.extra_replicas;
+ m->op.nr_replicas_required = m->op.nr_replicas;
+
+ BUG_ON(!m->op.nr_replicas);
+
if (reserve_sectors) {
ret = bch2_disk_reservation_add(c, &m->op.res, reserve_sectors,
m->data_opts.extra_replicas
goto err;
}
- m->op.nr_replicas += m->data_opts.extra_replicas;
- m->op.nr_replicas_required = m->op.nr_replicas;
-
- BUG_ON(!m->op.nr_replicas);
+ if (bkey_extent_is_unwritten(k)) {
+ bch2_update_unwritten_extent(trans, m);
+ goto done;
+ }
- /* Special handling required: */
- if (bkey_extent_is_unwritten(k))
- return -BCH_ERR_unwritten_extent_update;
return 0;
err:
i = 0;
bch2_bkey_buf_exit(&m->k, c);
bch2_bio_free_pages_pool(c, &m->op.wbio.bio);
return ret;
+done:
+ bch2_data_update_exit(m);
+ return ret ?: -BCH_ERR_data_update_done;
}
void bch2_data_update_opts_normalize(struct bkey_s_c k, struct data_update_opts *opts)
void bch2_data_update_read_done(struct data_update *,
struct bch_extent_crc_unpacked);
+int bch2_extent_drop_ptrs(struct btree_trans *,
+ struct btree_iter *,
+ struct bkey_s_c,
+ struct data_update_opts);
+
void bch2_data_update_exit(struct data_update *);
-void bch2_update_unwritten_extent(struct btree_trans *, struct data_update *);
-int bch2_data_update_init(struct btree_trans *, struct moving_context *,
+int bch2_data_update_init(struct btree_trans *, struct btree_iter *,
+ struct moving_context *,
struct data_update *,
struct write_point_specifier,
struct bch_io_opts, struct data_update_opts,
return ret;
}
-int bch2_empty_dir_trans(struct btree_trans *trans, subvol_inum dir)
+int bch2_empty_dir_snapshot(struct btree_trans *trans, u64 dir, u32 snapshot)
{
struct btree_iter iter;
struct bkey_s_c k;
- u32 snapshot;
int ret;
- ret = bch2_subvolume_get_snapshot(trans, dir.subvol, &snapshot);
- if (ret)
- return ret;
-
for_each_btree_key_upto_norestart(trans, iter, BTREE_ID_dirents,
- SPOS(dir.inum, 0, snapshot),
- POS(dir.inum, U64_MAX), 0, k, ret)
+ SPOS(dir, 0, snapshot),
+ POS(dir, U64_MAX), 0, k, ret)
if (k.k->type == KEY_TYPE_dirent) {
ret = -ENOTEMPTY;
break;
return ret;
}
+int bch2_empty_dir_trans(struct btree_trans *trans, subvol_inum dir)
+{
+ u32 snapshot;
+
+ return bch2_subvolume_get_snapshot(trans, dir.subvol, &snapshot) ?:
+ bch2_empty_dir_snapshot(trans, dir.inum, snapshot);
+}
+
int bch2_readdir(struct bch_fs *c, subvol_inum inum, struct dir_context *ctx)
{
struct btree_trans *trans = bch2_trans_get(c);
const struct bch_hash_info *,
const struct qstr *, subvol_inum *);
+int bch2_empty_dir_snapshot(struct btree_trans *, u64, u32);
int bch2_empty_dir_trans(struct btree_trans *, subvol_inum);
int bch2_readdir(struct bch_fs *, subvol_inum, struct dir_context *);
case TARGET_DEV: {
struct bch_dev *ca;
+ out->atomic++;
rcu_read_lock();
ca = t.dev < c->sb.nr_devices
? rcu_dereference(c->devs[t.dev])
}
rcu_read_unlock();
+ out->atomic--;
break;
}
case TARGET_GROUP:
}
}
-void bch2_target_to_text_sb(struct printbuf *out, struct bch_sb *sb, unsigned v)
+static void bch2_target_to_text_sb(struct printbuf *out, struct bch_sb *sb, unsigned v)
{
struct target t = target_decode(v);
h->nr_active_devs++;
rcu_read_unlock();
+
+ /*
+ * If we only have redundancy + 1 devices, we're better off with just
+ * replication:
+ */
+ if (h->nr_active_devs < h->redundancy + 2)
+ bch_err(c, "insufficient devices available to create stripe (have %u, need %u) - mismatched bucket sizes?",
+ h->nr_active_devs, h->redundancy + 2);
+
list_add(&h->list, &c->ec_stripe_head_list);
return h;
}
h = ec_new_stripe_head_alloc(c, target, algo, redundancy, watermark);
found:
+ if (!IS_ERR_OR_NULL(h) &&
+ h->nr_active_devs < h->redundancy + 2) {
+ mutex_unlock(&h->lock);
+ h = NULL;
+ }
mutex_unlock(&c->ec_stripe_head_lock);
return h;
}
int ret;
h = __bch2_ec_stripe_head_get(trans, target, algo, redundancy, watermark);
- if (!h)
- bch_err(c, "no stripe head");
if (IS_ERR_OR_NULL(h))
return h;
x(BCH_ERR_fsck, fsck_repair_unimplemented) \
x(BCH_ERR_fsck, fsck_repair_impossible) \
x(0, restart_recovery) \
- x(0, unwritten_extent_update) \
+ x(0, data_update_done) \
x(EINVAL, device_state_not_allowed) \
x(EINVAL, member_info_missing) \
x(EINVAL, mismatched_block_size) \
x(BCH_ERR_invalid_sb, invalid_sb_members) \
x(BCH_ERR_invalid_sb, invalid_sb_disk_groups) \
x(BCH_ERR_invalid_sb, invalid_sb_replicas) \
+ x(BCH_ERR_invalid_sb, invalid_replicas_entry) \
x(BCH_ERR_invalid_sb, invalid_sb_journal) \
x(BCH_ERR_invalid_sb, invalid_sb_journal_seq_blacklist) \
x(BCH_ERR_invalid_sb, invalid_sb_crypt) \
return replicas;
}
-unsigned bch2_extent_ptr_desired_durability(struct bch_fs *c, struct extent_ptr_decoded *p)
+static inline unsigned __extent_ptr_durability(struct bch_dev *ca, struct extent_ptr_decoded *p)
{
- struct bch_dev *ca;
-
if (p->ptr.cached)
return 0;
- ca = bch_dev_bkey_exists(c, p->ptr.dev);
-
- return ca->mi.durability +
- (p->has_ec
- ? p->ec.redundancy
- : 0);
+ return p->has_ec
+ ? p->ec.redundancy + 1
+ : ca->mi.durability;
}
-unsigned bch2_extent_ptr_durability(struct bch_fs *c, struct extent_ptr_decoded *p)
+unsigned bch2_extent_ptr_desired_durability(struct bch_fs *c, struct extent_ptr_decoded *p)
{
- struct bch_dev *ca;
+ struct bch_dev *ca = bch_dev_bkey_exists(c, p->ptr.dev);
- if (p->ptr.cached)
- return 0;
+ return __extent_ptr_durability(ca, p);
+}
- ca = bch_dev_bkey_exists(c, p->ptr.dev);
+unsigned bch2_extent_ptr_durability(struct bch_fs *c, struct extent_ptr_decoded *p)
+{
+ struct bch_dev *ca = bch_dev_bkey_exists(c, p->ptr.dev);
if (ca->mi.state == BCH_MEMBER_STATE_failed)
return 0;
- return ca->mi.durability +
- (p->has_ec
- ? p->ec.redundancy
- : 0);
+ return __extent_ptr_durability(ca, p);
}
unsigned bch2_bkey_durability(struct bch_fs *c, struct bkey_s_c k)
unsigned i = 0;
bkey_for_each_ptr_decode(k.k, ptrs, p, entry) {
- if (p.crc.compression_type == BCH_COMPRESSION_TYPE_incompressible) {
+ if (p.crc.compression_type == BCH_COMPRESSION_TYPE_incompressible ||
+ p.ptr.unwritten) {
rewrite_ptrs = 0;
goto incompressible;
}
}
}
-static void bch2_dio_read_complete(struct closure *cl)
+static CLOSURE_CALLBACK(bch2_dio_read_complete)
{
- struct dio_read *dio = container_of(cl, struct dio_read, cl);
+ closure_type(dio, struct dio_read, cl);
dio->req->ki_complete(dio->req, dio->ret);
bio_check_or_release(&dio->rbio.bio, dio->should_dirty);
return 0;
}
-static void bch2_dio_write_flush_done(struct closure *cl)
+static CLOSURE_CALLBACK(bch2_dio_write_flush_done)
{
- struct dio_write *dio = container_of(cl, struct dio_write, op.cl);
+ closure_type(dio, struct dio_write, op.cl);
struct bch_fs *c = dio->op.c;
closure_debug_destroy(cl);
int bch2_filemap_get_contig_folios_d(struct address_space *mapping,
loff_t start, u64 end,
- int fgp_flags, gfp_t gfp,
+ fgf_t fgp_flags, gfp_t gfp,
folios *fs)
{
struct folio *f;
typedef DARRAY(struct folio *) folios;
int bch2_filemap_get_contig_folios_d(struct address_space *, loff_t,
- u64, int, gfp_t, folios *);
+ u64, fgf_t, gfp_t, folios *);
int bch2_write_invalidate_inode_pages_range(struct address_space *, loff_t, loff_t);
/*
if ((arg.flags & BCH_SUBVOL_SNAPSHOT_CREATE) &&
!arg.src_ptr)
- snapshot_src.subvol = to_bch_ei(dir)->ei_inode.bi_subvol;
+ snapshot_src.subvol = inode_inum(to_bch_ei(dir)).subvol;
inode = __bch2_create(file_mnt_idmap(filp), to_bch_ei(dir),
dst_dentry, arg.mode|S_IFDIR,
{
struct bch_inode_info *inode = to_bch_ei(vinode);
struct bch_inode_info *dir = to_bch_ei(vdir);
-
- if (*len < sizeof(struct bcachefs_fid_with_parent) / sizeof(u32))
- return FILEID_INVALID;
+ int min_len;
if (!S_ISDIR(inode->v.i_mode) && dir) {
struct bcachefs_fid_with_parent *fid = (void *) fh;
+ min_len = sizeof(*fid) / sizeof(u32);
+ if (*len < min_len) {
+ *len = min_len;
+ return FILEID_INVALID;
+ }
+
fid->fid = bch2_inode_to_fid(inode);
fid->dir = bch2_inode_to_fid(dir);
- *len = sizeof(*fid) / sizeof(u32);
+ *len = min_len;
return FILEID_BCACHEFS_WITH_PARENT;
} else {
struct bcachefs_fid *fid = (void *) fh;
+ min_len = sizeof(*fid) / sizeof(u32);
+ if (*len < min_len) {
+ *len = min_len;
+ return FILEID_INVALID;
+ }
*fid = bch2_inode_to_fid(inode);
- *len = sizeof(*fid) / sizeof(u32);
+ *len = min_len;
return FILEID_BCACHEFS_WITHOUT_PARENT;
}
}
if (!first)
seq_putc(seq, ':');
first = false;
- seq_puts(seq, "/dev/");
- seq_puts(seq, ca->name);
+ seq_puts(seq, ca->disk_sb.sb_name);
}
return 0;
struct bch_fs *c = sb->s_fs_info;
int ret;
+ if (test_bit(BCH_FS_EMERGENCY_RO, &c->flags))
+ return 0;
+
down_write(&c->state_lock);
ret = bch2_fs_read_write(c);
up_write(&c->state_lock);
return dget(sb->s_root);
err_put_super:
- sb->s_fs_info = NULL;
- c->vfs_sb = NULL;
deactivate_locked_super(sb);
- bch2_fs_stop(c);
return ERR_PTR(bch2_err_class(ret));
}
{
struct bch_fs *c = sb->s_fs_info;
- if (c)
- c->vfs_sb = NULL;
generic_shutdown_super(sb);
- if (c)
- bch2_fs_free(c);
+ bch2_fs_free(c);
}
static struct file_system_type bcache_fs_type = {
const struct nlink *l = _l;
const struct nlink *r = _r;
- return cmp_int(l->inum, r->inum) ?: cmp_int(l->snapshot, r->snapshot);
+ return cmp_int(l->inum, r->inum);
}
static void inc_link(struct bch_fs *c, struct snapshots_seen *s,
#include "btree_update.h"
#include "buckets.h"
#include "compress.h"
+#include "dirent.h"
#include "error.h"
#include "extents.h"
#include "extent_update.h"
if (ret)
goto out;
- if (fsck_err_on(S_ISDIR(inode.bi_mode), c,
- deleted_inode_is_dir,
- "directory %llu:%u in deleted_inodes btree",
- pos.offset, pos.snapshot))
- goto delete;
+ if (S_ISDIR(inode.bi_mode)) {
+ ret = bch2_empty_dir_snapshot(trans, pos.offset, pos.snapshot);
+ if (fsck_err_on(ret == -ENOTEMPTY, c, deleted_inode_is_dir,
+ "non empty directory %llu:%u in deleted_inodes btree",
+ pos.offset, pos.snapshot))
+ goto delete;
+ if (ret)
+ goto out;
+ }
if (fsck_err_on(!(inode.bi_flags & BCH_INODE_unlinked), c,
deleted_inode_not_unlinked,
* unlinked inodes in the snapshot leaves:
*/
*need_another_pass = true;
- return 0;
+ goto out;
}
ret = 1;
*/
for_each_btree_key(trans, iter, BTREE_ID_deleted_inodes, POS_MIN,
BTREE_ITER_PREFETCH|BTREE_ITER_ALL_SNAPSHOTS, k, ret) {
- ret = lockrestart_do(trans, may_delete_deleted_inode(trans, &iter, k.k->p,
- &need_another_pass));
+ ret = commit_do(trans, NULL, NULL,
+ BTREE_INSERT_NOFAIL|
+ BTREE_INSERT_LAZY_RW,
+ may_delete_deleted_inode(trans, &iter, k.k->p, &need_another_pass));
if (ret < 0)
break;
bio = &op->write.op.wbio.bio;
bio_init(bio, NULL, bio->bi_inline_vecs, pages, 0);
- ret = bch2_data_update_init(trans, NULL, &op->write,
+ ret = bch2_data_update_init(trans, NULL, NULL, &op->write,
writepoint_hashed((unsigned long) current),
opts,
(struct data_update_opts) {
__wp_update_state(wp, state);
}
-static void bch2_write_index(struct closure *cl)
+static CLOSURE_CALLBACK(bch2_write_index)
{
- struct bch_write_op *op = container_of(cl, struct bch_write_op, cl);
+ closure_type(op, struct bch_write_op, cl);
struct write_point *wp = op->wp;
struct workqueue_struct *wq = index_update_wq(op);
unsigned long flags;
* checksum:
*/
csum = bch2_checksum_bio(c, op->crc.csum_type, nonce, &op->wbio.bio);
- if (bch2_crc_cmp(op->crc.csum, csum))
+ if (bch2_crc_cmp(op->crc.csum, csum) && !c->opts.no_data_io)
return -EIO;
ret = bch2_encrypt_bio(c, op->crc.csum_type, nonce, &op->wbio.bio);
bch2_nocow_write_convert_unwritten(op);
}
-static void bch2_nocow_write_done(struct closure *cl)
+static CLOSURE_CALLBACK(bch2_nocow_write_done)
{
- struct bch_write_op *op = container_of(cl, struct bch_write_op, cl);
+ closure_type(op, struct bch_write_op, cl);
__bch2_nocow_write_done(op);
bch2_write_done(cl);
op->insert_keys.top = op->insert_keys.keys;
} else if (op->flags & BCH_WRITE_SYNC) {
closure_sync(&op->cl);
- bch2_nocow_write_done(&op->cl);
+ bch2_nocow_write_done(&op->cl.work);
} else {
/*
* XXX
* If op->discard is true, instead of inserting the data it invalidates the
* region of the cache represented by op->bio and op->inode.
*/
-void bch2_write(struct closure *cl)
+CLOSURE_CALLBACK(bch2_write)
{
- struct bch_write_op *op = container_of(cl, struct bch_write_op, cl);
+ closure_type(op, struct bch_write_op, cl);
struct bio *bio = &op->wbio.bio;
struct bch_fs *c = op->c;
unsigned data_len;
op->devs_need_flush = NULL;
}
-void bch2_write(struct closure *);
-
+CLOSURE_CALLBACK(bch2_write);
void bch2_write_point_do_index_updates(struct work_struct *);
static inline struct bch_write_bio *wbio_init(struct bio *bio)
return ret;
}
-static bool journal_entry_close(struct journal *j)
+bool bch2_journal_entry_close(struct journal *j)
{
bool ret;
atomic64_inc(&j->seq);
journal_pin_list_init(fifo_push_ref(&j->pin), 1);
+ BUG_ON(j->pin.back - 1 != atomic64_read(&j->seq));
+
BUG_ON(j->buf + (journal_cur_seq(j) & JOURNAL_BUF_MASK) != buf);
bkey_extent_init(&buf->key);
bool ret = atomic64_read(&j->seq) == j->seq_ondisk;
if (!ret)
- journal_entry_close(j);
+ bch2_journal_entry_close(j);
return ret;
}
/*
* Recheck after taking the lock, so we don't race with another thread
- * that just did journal_entry_open() and call journal_entry_close()
+ * that just did journal_entry_open() and call bch2_journal_entry_close()
* unnecessarily
*/
if (journal_res_get_fast(j, res, flags)) {
return ret;
}
-/* journal_preres: */
-
-static bool journal_preres_available(struct journal *j,
- struct journal_preres *res,
- unsigned new_u64s,
- unsigned flags)
-{
- bool ret = bch2_journal_preres_get_fast(j, res, new_u64s, flags, true);
-
- if (!ret && mutex_trylock(&j->reclaim_lock)) {
- bch2_journal_reclaim(j);
- mutex_unlock(&j->reclaim_lock);
- }
-
- return ret;
-}
-
-int __bch2_journal_preres_get(struct journal *j,
- struct journal_preres *res,
- unsigned new_u64s,
- unsigned flags)
-{
- int ret;
-
- closure_wait_event(&j->preres_wait,
- (ret = bch2_journal_error(j)) ||
- journal_preres_available(j, res, new_u64s, flags));
- return ret;
-}
-
/* journal_entry_res: */
void bch2_journal_entry_res_resize(struct journal *j,
bch2_journal_reclaim_stop(j);
bch2_journal_flush_all_pins(j);
- wait_event(j->wait, journal_entry_close(j));
+ wait_event(j->wait, bch2_journal_entry_close(j));
/*
* Always write a new journal entry, to make sure the clock hands are up
prt_printf(out, "last_seq:\t\t%llu\n", journal_last_seq(j));
prt_printf(out, "last_seq_ondisk:\t%llu\n", j->last_seq_ondisk);
prt_printf(out, "flushed_seq_ondisk:\t%llu\n", j->flushed_seq_ondisk);
- prt_printf(out, "prereserved:\t\t%u/%u\n", j->prereserved.reserved, j->prereserved.remaining);
prt_printf(out, "watermark:\t\t%s\n", bch2_watermarks[j->watermark]);
prt_printf(out, "each entry reserved:\t%u\n", j->entry_u64s_reserved);
prt_printf(out, "nr flush writes:\t%llu\n", j->nr_flush_writes);
static inline u64 journal_cur_seq(struct journal *j)
{
- EBUG_ON(j->pin.back - 1 != atomic64_read(&j->seq));
-
- return j->pin.back - 1;
+ return atomic64_read(&j->seq);
}
static inline u64 journal_last_unwritten_seq(struct journal *j)
return s;
}
+bool bch2_journal_entry_close(struct journal *);
void bch2_journal_buf_put_final(struct journal *, u64, bool);
static inline void __bch2_journal_buf_put(struct journal *j, unsigned idx, u64 seq)
return 0;
}
-/* journal_preres: */
-
-static inline void journal_set_watermark(struct journal *j)
-{
- union journal_preres_state s = READ_ONCE(j->prereserved);
- unsigned watermark = BCH_WATERMARK_stripe;
-
- if (fifo_free(&j->pin) < j->pin.size / 4)
- watermark = max_t(unsigned, watermark, BCH_WATERMARK_copygc);
- if (fifo_free(&j->pin) < j->pin.size / 8)
- watermark = max_t(unsigned, watermark, BCH_WATERMARK_reclaim);
-
- if (s.reserved > s.remaining)
- watermark = max_t(unsigned, watermark, BCH_WATERMARK_copygc);
- if (!s.remaining)
- watermark = max_t(unsigned, watermark, BCH_WATERMARK_reclaim);
-
- if (watermark == j->watermark)
- return;
-
- swap(watermark, j->watermark);
- if (watermark > j->watermark)
- journal_wake(j);
-}
-
-static inline void bch2_journal_preres_put(struct journal *j,
- struct journal_preres *res)
-{
- union journal_preres_state s = { .reserved = res->u64s };
-
- if (!res->u64s)
- return;
-
- s.v = atomic64_sub_return(s.v, &j->prereserved.counter);
- res->u64s = 0;
-
- if (unlikely(s.waiting)) {
- clear_bit(ilog2((((union journal_preres_state) { .waiting = 1 }).v)),
- (unsigned long *) &j->prereserved.v);
- closure_wake_up(&j->preres_wait);
- }
-
- if (s.reserved <= s.remaining && j->watermark)
- journal_set_watermark(j);
-}
-
-int __bch2_journal_preres_get(struct journal *,
- struct journal_preres *, unsigned, unsigned);
-
-static inline int bch2_journal_preres_get_fast(struct journal *j,
- struct journal_preres *res,
- unsigned new_u64s,
- unsigned flags,
- bool set_waiting)
-{
- int d = new_u64s - res->u64s;
- union journal_preres_state old, new;
- u64 v = atomic64_read(&j->prereserved.counter);
- enum bch_watermark watermark = flags & BCH_WATERMARK_MASK;
- int ret;
-
- do {
- old.v = new.v = v;
- ret = 0;
-
- if (watermark == BCH_WATERMARK_reclaim ||
- new.reserved + d < new.remaining) {
- new.reserved += d;
- ret = 1;
- } else if (set_waiting && !new.waiting)
- new.waiting = true;
- else
- return 0;
- } while ((v = atomic64_cmpxchg(&j->prereserved.counter,
- old.v, new.v)) != old.v);
-
- if (ret)
- res->u64s += d;
- return ret;
-}
-
-static inline int bch2_journal_preres_get(struct journal *j,
- struct journal_preres *res,
- unsigned new_u64s,
- unsigned flags)
-{
- if (new_u64s <= res->u64s)
- return 0;
-
- if (bch2_journal_preres_get_fast(j, res, new_u64s, flags, false))
- return 0;
-
- if (flags & JOURNAL_RES_GET_NONBLOCK)
- return -BCH_ERR_journal_preres_get_blocked;
-
- return __bch2_journal_preres_get(j, res, new_u64s, flags);
-}
-
/* journal_entry_res: */
void bch2_journal_entry_res_resize(struct journal *,
struct jset_entry_data_usage *u =
container_of(entry, struct jset_entry_data_usage, entry);
unsigned bytes = jset_u64s(le16_to_cpu(entry->u64s)) * sizeof(u64);
+ struct printbuf err = PRINTBUF;
int ret = 0;
if (journal_entry_err_on(bytes < sizeof(*u) ||
journal_entry_data_usage_bad_size,
"invalid journal entry usage: bad size")) {
journal_entry_null_range(entry, vstruct_next(entry));
- return ret;
+ goto out;
}
+ if (journal_entry_err_on(bch2_replicas_entry_validate(&u->r, c->disk_sb.sb, &err),
+ c, version, jset, entry,
+ journal_entry_data_usage_bad_size,
+ "invalid journal entry usage: %s", err.buf)) {
+ journal_entry_null_range(entry, vstruct_next(entry));
+ goto out;
+ }
+out:
fsck_err:
+ printbuf_exit(&err);
return ret;
}
return 0;
}
-static void bch2_journal_read_device(struct closure *cl)
+static CLOSURE_CALLBACK(bch2_journal_read_device)
{
- struct journal_device *ja =
- container_of(cl, struct journal_device, read);
+ closure_type(ja, struct journal_device, read);
struct bch_dev *ca = container_of(ja, struct bch_dev, journal);
struct bch_fs *c = ca->fs;
struct journal_list *jlist =
if (ja->bucket_seq[ja->cur_idx] &&
ja->sectors_free == ca->mi.bucket_size) {
+#if 0
+ /*
+ * Debug code for ZNS support, where we (probably) want to be
+ * correlated where we stopped in the journal to the zone write
+ * points:
+ */
bch_err(c, "ja->sectors_free == ca->mi.bucket_size");
bch_err(c, "cur_idx %u/%u", ja->cur_idx, ja->nr);
for (i = 0; i < 3; i++) {
bch_err(c, "bucket_seq[%u] = %llu", idx, ja->bucket_seq[idx]);
}
+#endif
ja->sectors_free = 0;
}
return j->buf + (journal_last_unwritten_seq(j) & JOURNAL_BUF_MASK);
}
-static void journal_write_done(struct closure *cl)
+static CLOSURE_CALLBACK(journal_write_done)
{
- struct journal *j = container_of(cl, struct journal, io);
+ closure_type(j, struct journal, io);
struct bch_fs *c = container_of(j, struct bch_fs, journal);
struct journal_buf *w = journal_last_unwritten_buf(j);
struct bch_replicas_padded replicas;
} while ((v = atomic64_cmpxchg(&j->reservations.counter,
old.v, new.v)) != old.v);
+ bch2_journal_reclaim_fast(j);
bch2_journal_space_available(j);
closure_wake_up(&w->wait);
percpu_ref_put(&ca->io_ref);
}
-static void do_journal_write(struct closure *cl)
+static CLOSURE_CALLBACK(do_journal_write)
{
- struct journal *j = container_of(cl, struct journal, io);
+ closure_type(j, struct journal, io);
struct bch_fs *c = container_of(j, struct bch_fs, journal);
struct bch_dev *ca;
struct journal_buf *w = journal_last_unwritten_buf(j);
return 0;
}
-void bch2_journal_write(struct closure *cl)
+CLOSURE_CALLBACK(bch2_journal_write)
{
- struct journal *j = container_of(cl, struct journal, io);
+ closure_type(j, struct journal, io);
struct bch_fs *c = container_of(j, struct bch_fs, journal);
struct bch_dev *ca;
struct journal_buf *w = journal_last_unwritten_buf(j);
int bch2_journal_read(struct bch_fs *, u64 *, u64 *, u64 *);
-void bch2_journal_write(struct closure *);
+CLOSURE_CALLBACK(bch2_journal_write);
#endif /* _BCACHEFS_JOURNAL_IO_H */
return available;
}
-static void journal_set_remaining(struct journal *j, unsigned u64s_remaining)
+static inline void journal_set_watermark(struct journal *j, bool low_on_space)
{
- union journal_preres_state old, new;
- u64 v = atomic64_read(&j->prereserved.counter);
+ unsigned watermark = BCH_WATERMARK_stripe;
- do {
- old.v = new.v = v;
- new.remaining = u64s_remaining;
- } while ((v = atomic64_cmpxchg(&j->prereserved.counter,
- old.v, new.v)) != old.v);
+ if (low_on_space)
+ watermark = max_t(unsigned, watermark, BCH_WATERMARK_reclaim);
+ if (fifo_free(&j->pin) < j->pin.size / 4)
+ watermark = max_t(unsigned, watermark, BCH_WATERMARK_reclaim);
+
+ if (watermark == j->watermark)
+ return;
+
+ swap(watermark, j->watermark);
+ if (watermark > j->watermark)
+ journal_wake(j);
}
static struct journal_space
struct bch_fs *c = container_of(j, struct bch_fs, journal);
struct bch_dev *ca;
unsigned clean, clean_ondisk, total;
- s64 u64s_remaining = 0;
unsigned max_entry_size = min(j->buf[0].buf_size >> 9,
j->buf[1].buf_size >> 9);
unsigned i, nr_online = 0, nr_devs_want;
else
clear_bit(JOURNAL_MAY_SKIP_FLUSH, &j->flags);
- u64s_remaining = (u64) clean << 6;
- u64s_remaining -= (u64) total << 3;
- u64s_remaining = max(0LL, u64s_remaining);
- u64s_remaining /= 4;
- u64s_remaining = min_t(u64, u64s_remaining, U32_MAX);
+ journal_set_watermark(j, clean * 4 <= total);
out:
j->cur_entry_sectors = !ret ? j->space[journal_space_discarded].next_entry : 0;
j->cur_entry_error = ret;
- journal_set_remaining(j, u64s_remaining);
- journal_set_watermark(j);
if (!ret)
journal_wake(j);
/* Try to keep the journal at most half full: */
nr_buckets = ja->nr / 2;
- /* And include pre-reservations: */
- nr_buckets += DIV_ROUND_UP(j->prereserved.reserved,
- (ca->mi.bucket_size << 6) -
- journal_entry_overhead(j));
-
nr_buckets = min(nr_buckets, ja->nr);
bucket_to_flush = (ja->cur_idx + nr_buckets) % ja->nr;
msecs_to_jiffies(c->opts.journal_reclaim_delay)))
min_nr = 1;
- if (j->prereserved.reserved * 4 > j->prereserved.remaining)
- min_nr = 1;
-
- if (fifo_free(&j->pin) <= 32)
+ if (j->watermark != BCH_WATERMARK_stripe)
min_nr = 1;
if (atomic_read(&c->btree_cache.dirty) * 2 > c->btree_cache.used)
trace_and_count(c, journal_reclaim_start, c,
direct, kicked,
min_nr, min_key_cache,
- j->prereserved.reserved,
- j->prereserved.remaining,
atomic_read(&c->btree_cache.dirty),
c->btree_cache.used,
atomic_long_read(&c->btree_key_cache.nr_dirty),
(1U << JOURNAL_PIN_btree), 0, 0, 0))
*did_work = true;
+ if (seq_to_flush > journal_cur_seq(j))
+ bch2_journal_entry_close(j);
+
spin_lock(&j->lock);
/*
* If journal replay hasn't completed, the unreplayed journal entries
u64 seq;
};
-/*
- * For reserving space in the journal prior to getting a reservation on a
- * particular journal entry:
- */
-struct journal_preres {
- unsigned u64s;
-};
-
union journal_res_state {
struct {
atomic64_t counter;
};
};
-union journal_preres_state {
- struct {
- atomic64_t counter;
- };
-
- struct {
- u64 v;
- };
-
- struct {
- u64 waiting:1,
- reserved:31,
- remaining:32;
- };
-};
-
/* bytes: */
#define JOURNAL_ENTRY_SIZE_MIN (64U << 10) /* 64k */
#define JOURNAL_ENTRY_SIZE_MAX (4U << 20) /* 4M */
union journal_res_state reservations;
enum bch_watermark watermark;
- union journal_preres_state prereserved;
-
} __aligned(SMP_CACHE_BYTES);
unsigned long flags;
}
}
-static void trace_move_extent_alloc_mem_fail2(struct bch_fs *c, struct bkey_s_c k)
-{
- if (trace_move_extent_alloc_mem_fail_enabled()) {
- struct printbuf buf = PRINTBUF;
-
- bch2_bkey_val_to_text(&buf, c, k);
- trace_move_extent_alloc_mem_fail(c, buf.buf);
- printbuf_exit(&buf);
- }
-}
-
struct moving_io {
struct list_head read_list;
struct list_head io_list;
atomic_read(&ctxt->write_sectors) != sectors_pending);
}
+static void bch2_moving_ctxt_flush_all(struct moving_context *ctxt)
+{
+ move_ctxt_wait_event(ctxt, list_empty(&ctxt->reads));
+ bch2_trans_unlock_long(ctxt->trans);
+ closure_sync(&ctxt->cl);
+}
+
void bch2_moving_ctxt_exit(struct moving_context *ctxt)
{
struct bch_fs *c = ctxt->trans->c;
- move_ctxt_wait_event(ctxt, list_empty(&ctxt->reads));
- closure_sync(&ctxt->cl);
+ bch2_moving_ctxt_flush_all(ctxt);
EBUG_ON(atomic_read(&ctxt->write_sectors));
EBUG_ON(atomic_read(&ctxt->write_ios));
scnprintf(stats->name, sizeof(stats->name), "%s", name);
}
-static int bch2_extent_drop_ptrs(struct btree_trans *trans,
- struct btree_iter *iter,
- struct bkey_s_c k,
- struct data_update_opts data_opts)
-{
- struct bch_fs *c = trans->c;
- struct bkey_i *n;
- int ret;
-
- n = bch2_bkey_make_mut_noupdate(trans, k);
- ret = PTR_ERR_OR_ZERO(n);
- if (ret)
- return ret;
-
- while (data_opts.kill_ptrs) {
- unsigned i = 0, drop = __fls(data_opts.kill_ptrs);
- struct bch_extent_ptr *ptr;
-
- bch2_bkey_drop_ptrs(bkey_i_to_s(n), ptr, i++ == drop);
- data_opts.kill_ptrs ^= 1U << drop;
- }
-
- /*
- * If the new extent no longer has any pointers, bch2_extent_normalize()
- * will do the appropriate thing with it (turning it into a
- * KEY_TYPE_error key, or just a discard if it was a cached extent)
- */
- bch2_extent_normalize(c, bkey_i_to_s(n));
-
- /*
- * Since we're not inserting through an extent iterator
- * (BTREE_ITER_ALL_SNAPSHOTS iterators aren't extent iterators),
- * we aren't using the extent overwrite path to delete, we're
- * just using the normal key deletion path:
- */
- if (bkey_deleted(&n->k))
- n->k.size = 0;
-
- return bch2_trans_relock(trans) ?:
- bch2_trans_update(trans, iter, n, BTREE_UPDATE_INTERNAL_SNAPSHOT_NODE) ?:
- bch2_trans_commit(trans, NULL, NULL, BTREE_INSERT_NOFAIL);
-}
-
int bch2_move_extent(struct moving_context *ctxt,
struct move_bucket_in_flight *bucket_in_flight,
struct btree_iter *iter,
io->rbio.bio.bi_iter.bi_sector = bkey_start_offset(k.k);
io->rbio.bio.bi_end_io = move_read_endio;
- ret = bch2_data_update_init(trans, ctxt, &io->write, ctxt->wp,
+ ret = bch2_data_update_init(trans, iter, ctxt, &io->write, ctxt->wp,
io_opts, data_opts, iter->btree_id, k);
- if (ret && ret != -BCH_ERR_unwritten_extent_update)
+ if (ret)
goto err_free_pages;
- if (ret == -BCH_ERR_unwritten_extent_update) {
- bch2_update_unwritten_extent(trans, &io->write);
- move_free(io);
- return 0;
- }
-
- BUG_ON(ret);
-
io->write.op.end_io = move_write_done;
if (ctxt->rate)
err_free:
kfree(io);
err:
- this_cpu_inc(c->counters[BCH_COUNTER_move_extent_alloc_mem_fail]);
- trace_move_extent_alloc_mem_fail2(c, k);
+ if (ret == -BCH_ERR_data_update_done)
+ return 0;
+
+ if (bch2_err_matches(ret, EROFS) ||
+ bch2_err_matches(ret, BCH_ERR_transaction_restart))
+ return ret;
+
+ this_cpu_inc(c->counters[BCH_COUNTER_move_extent_start_fail]);
+ if (trace_move_extent_start_fail_enabled()) {
+ struct printbuf buf = PRINTBUF;
+
+ bch2_bkey_val_to_text(&buf, c, k);
+ prt_str(&buf, ": ");
+ prt_str(&buf, bch2_err_str(ret));
+ trace_move_extent_start_fail(c, buf.buf);
+ printbuf_exit(&buf);
+ }
return ret;
}
int bch2_move_ratelimit(struct moving_context *ctxt)
{
struct bch_fs *c = ctxt->trans->c;
+ bool is_kthread = current->flags & PF_KTHREAD;
u64 delay;
- if (ctxt->wait_on_copygc && !c->copygc_running) {
- bch2_trans_unlock_long(ctxt->trans);
+ if (ctxt->wait_on_copygc && c->copygc_running) {
+ bch2_moving_ctxt_flush_all(ctxt);
wait_event_killable(c->copygc_running_wq,
!c->copygc_running ||
- kthread_should_stop());
+ (is_kthread && kthread_should_stop()));
}
do {
delay = ctxt->rate ? bch2_ratelimit_delay(ctxt->rate) : 0;
-
- if (delay) {
- if (delay > HZ / 10)
- bch2_trans_unlock_long(ctxt->trans);
- else
- bch2_trans_unlock(ctxt->trans);
- set_current_state(TASK_INTERRUPTIBLE);
- }
-
- if ((current->flags & PF_KTHREAD) && kthread_should_stop()) {
- __set_current_state(TASK_RUNNING);
+ if (is_kthread && kthread_should_stop())
return 1;
- }
if (delay)
- schedule_timeout(delay);
+ move_ctxt_wait_event_timeout(ctxt,
+ freezing(current) ||
+ (is_kthread && kthread_should_stop()),
+ delay);
if (unlikely(freezing(current))) {
- move_ctxt_wait_event(ctxt, list_empty(&ctxt->reads));
+ bch2_moving_ctxt_flush_all(ctxt);
try_to_freeze();
}
} while (delay);
{
struct btree_trans *trans = ctxt->trans;
struct bch_fs *c = trans->c;
+ bool is_kthread = current->flags & PF_KTHREAD;
struct bch_io_opts io_opts = bch2_opts_to_inode_opts(c->opts);
struct btree_iter iter;
struct bkey_buf sk;
}
while (!(ret = bch2_move_ratelimit(ctxt))) {
+ if (is_kthread && kthread_should_stop())
+ break;
+
bch2_trans_begin(trans);
ret = bch2_get_next_backpointer(trans, bucket, gen,
wait_queue_head_t wait;
};
+#define move_ctxt_wait_event_timeout(_ctxt, _cond, _timeout) \
+({ \
+ int _ret = 0; \
+ while (true) { \
+ bool cond_finished = false; \
+ bch2_moving_ctxt_do_pending_writes(_ctxt); \
+ \
+ if (_cond) \
+ break; \
+ bch2_trans_unlock_long((_ctxt)->trans); \
+ _ret = __wait_event_timeout((_ctxt)->wait, \
+ bch2_moving_ctxt_next_pending_write(_ctxt) || \
+ (cond_finished = (_cond)), _timeout); \
+ if (_ret || ( cond_finished)) \
+ break; \
+ } \
+ _ret; \
+})
+
#define move_ctxt_wait_event(_ctxt, _cond) \
do { \
bool cond_finished = false; \
goto err;
darray_for_each(buckets, i) {
- if (unlikely(freezing(current)))
+ if (kthread_should_stop() || freezing(current))
break;
f = move_bucket_in_flight_add(buckets_in_flight, *i);
u64 start_seq = c->journal_replay_seq_start;
u64 end_seq = c->journal_replay_seq_start;
size_t i;
- int ret;
+ int ret = 0;
move_gap(keys->d, keys->nr, keys->size, keys->gap, keys->nr);
keys->gap = keys->nr;
goto err;
}
+ BUG_ON(!atomic_read(&keys->ref));
+
for (i = 0; i < keys->nr; i++) {
k = keys_sorted[i];
}
}
+ if (!c->opts.keep_journal)
+ bch2_journal_keys_put_initial(c);
+
replay_now_at(j, j->replay_journal_seq_end);
j->replay_journal_seq = 0;
bch2_flush_fsck_errs(c);
if (!c->opts.keep_journal &&
- test_bit(JOURNAL_REPLAY_DONE, &c->journal.flags)) {
- bch2_journal_keys_free(&c->journal_keys);
- bch2_journal_entries_free(c);
- }
+ test_bit(JOURNAL_REPLAY_DONE, &c->journal.flags))
+ bch2_journal_keys_put_initial(c);
kfree(clean);
if (!ret && test_bit(BCH_FS_NEED_DELETE_DEAD_SNAPSHOTS, &c->flags)) {
static inline int bch2_run_explicit_recovery_pass(struct bch_fs *c,
enum bch_recovery_pass pass)
{
+ if (c->recovery_passes_explicit & BIT_ULL(pass))
+ return 0;
+
bch_info(c, "running explicit recovery pass %s (%u), currently at %s (%u)",
bch2_recovery_passes[pass], pass,
bch2_recovery_passes[c->curr_recovery_pass], c->curr_recovery_pass);
{
check_indirect_extent_deleting(new, &flags);
+ if (old.k->type == KEY_TYPE_reflink_v &&
+ new->k.type == KEY_TYPE_reflink_v &&
+ old.k->u64s == new->k.u64s &&
+ !memcmp(bkey_s_c_to_reflink_v(old).v->start,
+ bkey_i_to_reflink_v(new)->v.start,
+ bkey_val_bytes(&new->k) - 8))
+ return 0;
+
return bch2_trans_mark_extent(trans, btree_id, level, old, new, flags);
}
prt_printf(out, "]");
}
+int bch2_replicas_entry_validate(struct bch_replicas_entry *r,
+ struct bch_sb *sb,
+ struct printbuf *err)
+{
+ if (!r->nr_devs) {
+ prt_printf(err, "no devices in entry ");
+ goto bad;
+ }
+
+ if (r->nr_required > 1 &&
+ r->nr_required >= r->nr_devs) {
+ prt_printf(err, "bad nr_required in entry ");
+ goto bad;
+ }
+
+ for (unsigned i = 0; i < r->nr_devs; i++)
+ if (!bch2_dev_exists(sb, r->devs[i])) {
+ prt_printf(err, "invalid device %u in entry ", r->devs[i]);
+ goto bad;
+ }
+
+ return 0;
+bad:
+ bch2_replicas_entry_to_text(err, r);
+ return -BCH_ERR_invalid_replicas_entry;
+}
+
void bch2_cpu_replicas_to_text(struct printbuf *out,
struct bch_replicas_cpu *r)
{
}
static struct bch_replicas_cpu
-cpu_replicas_add_entry(struct bch_replicas_cpu *old,
+cpu_replicas_add_entry(struct bch_fs *c,
+ struct bch_replicas_cpu *old,
struct bch_replicas_entry *new_entry)
{
unsigned i;
replicas_entry_bytes(new_entry)),
};
+ for (i = 0; i < new_entry->nr_devs; i++)
+ BUG_ON(!bch2_dev_exists2(c, new_entry->devs[i]));
+
BUG_ON(!new_entry->data_type);
verify_replicas_entry(new_entry);
if (c->replicas_gc.entries &&
!__replicas_has_entry(&c->replicas_gc, new_entry)) {
- new_gc = cpu_replicas_add_entry(&c->replicas_gc, new_entry);
+ new_gc = cpu_replicas_add_entry(c, &c->replicas_gc, new_entry);
if (!new_gc.entries) {
ret = -BCH_ERR_ENOMEM_cpu_replicas;
goto err;
}
if (!__replicas_has_entry(&c->replicas, new_entry)) {
- new_r = cpu_replicas_add_entry(&c->replicas, new_entry);
+ new_r = cpu_replicas_add_entry(c, &c->replicas, new_entry);
if (!new_r.entries) {
ret = -BCH_ERR_ENOMEM_cpu_replicas;
goto err;
if (idx < 0) {
struct bch_replicas_cpu n;
- n = cpu_replicas_add_entry(&c->replicas, r);
+ n = cpu_replicas_add_entry(c, &c->replicas, r);
if (!n.entries)
return -BCH_ERR_ENOMEM_cpu_replicas;
struct bch_sb *sb,
struct printbuf *err)
{
- unsigned i, j;
+ unsigned i;
sort_cmp_size(cpu_r->entries,
cpu_r->nr,
struct bch_replicas_entry *e =
cpu_replicas_entry(cpu_r, i);
- if (e->data_type >= BCH_DATA_NR) {
- prt_printf(err, "invalid data type in entry ");
- bch2_replicas_entry_to_text(err, e);
- return -BCH_ERR_invalid_sb_replicas;
- }
-
- if (!e->nr_devs) {
- prt_printf(err, "no devices in entry ");
- bch2_replicas_entry_to_text(err, e);
- return -BCH_ERR_invalid_sb_replicas;
- }
-
- if (e->nr_required > 1 &&
- e->nr_required >= e->nr_devs) {
- prt_printf(err, "bad nr_required in entry ");
- bch2_replicas_entry_to_text(err, e);
- return -BCH_ERR_invalid_sb_replicas;
- }
-
- for (j = 0; j < e->nr_devs; j++)
- if (!bch2_dev_exists(sb, e->devs[j])) {
- prt_printf(err, "invalid device %u in entry ", e->devs[j]);
- bch2_replicas_entry_to_text(err, e);
- return -BCH_ERR_invalid_sb_replicas;
- }
+ int ret = bch2_replicas_entry_validate(e, sb, err);
+ if (ret)
+ return ret;
if (i + 1 < cpu_r->nr) {
struct bch_replicas_entry *n =
void bch2_replicas_entry_sort(struct bch_replicas_entry *);
void bch2_replicas_entry_to_text(struct printbuf *,
struct bch_replicas_entry *);
+int bch2_replicas_entry_validate(struct bch_replicas_entry *,
+ struct bch_sb *, struct printbuf *);
void bch2_cpu_replicas_to_text(struct printbuf *, struct bch_replicas_cpu *);
static inline struct bch_replicas_entry *
this_cpu_sub(*lock->readers, !ret);
preempt_enable();
- if (!ret && (old & SIX_LOCK_WAITING_write))
- ret = -1 - SIX_LOCK_write;
+ if (!ret) {
+ smp_mb();
+ if (atomic_read(&lock->state) & SIX_LOCK_WAITING_write)
+ ret = -1 - SIX_LOCK_write;
+ }
} else if (type == SIX_LOCK_write && lock->readers) {
if (try) {
atomic_add(SIX_LOCK_HELD_write, &lock->state);
parent_id, id))
goto err;
- parent->v.children[i] = le32_to_cpu(child_id);
+ parent->v.children[i] = cpu_to_le32(child_id);
normalize_snapshot_child_pointers(&parent->v);
}
};
struct snapshot_table {
- struct snapshot_t s[0];
+ DECLARE_FLEX_ARRAY(struct snapshot_t, s);
};
typedef struct {
if (!IS_ERR_OR_NULL(sb->bdev))
blkdev_put(sb->bdev, sb->holder);
kfree(sb->holder);
+ kfree(sb->sb_name);
kfree(sb->sb);
memset(sb, 0, sizeof(*sb));
if (!sb->holder)
return -ENOMEM;
+ sb->sb_name = kstrdup(path, GFP_KERNEL);
+ if (!sb->sb_name)
+ return -ENOMEM;
+
#ifndef __KERNEL__
if (opt_get(*opts, direct_io) == false)
sb->mode |= BLK_OPEN_BUFFERED;
MODULE_LICENSE("GPL");
MODULE_AUTHOR("Kent Overstreet <kent.overstreet@gmail.com>");
MODULE_DESCRIPTION("bcachefs filesystem");
+MODULE_SOFTDEP("pre: crc32c");
+MODULE_SOFTDEP("pre: crc64");
+MODULE_SOFTDEP("pre: sha256");
+MODULE_SOFTDEP("pre: chacha20");
+MODULE_SOFTDEP("pre: poly1305");
+MODULE_SOFTDEP("pre: xxhash");
#define KTYPE(type) \
static const struct attribute_group type ## _group = { \
bch2_dev_allocator_add(c, ca);
bch2_recalc_capacity(c);
+ set_bit(BCH_FS_RW, &c->flags);
+ set_bit(BCH_FS_WAS_RW, &c->flags);
+
+#ifndef BCH_WRITE_REF_DEBUG
+ percpu_ref_reinit(&c->writes);
+#else
+ for (i = 0; i < BCH_WRITE_REF_NR; i++) {
+ BUG_ON(atomic_long_read(&c->writes[i]));
+ atomic_long_inc(&c->writes[i]);
+ }
+#endif
+
ret = bch2_gc_thread_start(c);
if (ret) {
bch_err(c, "error starting gc thread");
goto err;
}
-#ifndef BCH_WRITE_REF_DEBUG
- percpu_ref_reinit(&c->writes);
-#else
- for (i = 0; i < BCH_WRITE_REF_NR; i++) {
- BUG_ON(atomic_long_read(&c->writes[i]));
- atomic_long_inc(&c->writes[i]);
- }
-#endif
- set_bit(BCH_FS_RW, &c->flags);
- set_bit(BCH_FS_WAS_RW, &c->flags);
-
bch2_do_discards(c);
bch2_do_invalidates(c);
bch2_do_stripe_deletes(c);
bch2_do_pending_node_rewrites(c);
return 0;
err:
- __bch2_fs_read_only(c);
+ if (test_bit(BCH_FS_RW, &c->flags))
+ bch2_fs_read_only(c);
+ else
+ __bch2_fs_read_only(c);
return ret;
}
bch2_io_clock_exit(&c->io_clock[WRITE]);
bch2_io_clock_exit(&c->io_clock[READ]);
bch2_fs_compress_exit(c);
- bch2_journal_keys_free(&c->journal_keys);
- bch2_journal_entries_free(c);
+ bch2_journal_keys_put_initial(c);
+ BUG_ON(atomic_read(&c->journal_keys.ref));
bch2_fs_btree_write_buffer_exit(c);
percpu_free_rwsem(&c->mark_lock);
free_percpu(c->online_reserved);
init_rwsem(&c->gc_lock);
mutex_init(&c->gc_gens_lock);
+ atomic_set(&c->journal_keys.ref, 1);
+ c->journal_keys.initial_ref_held = true;
for (i = 0; i < BCH_TIME_STAT_NR; i++)
bch2_time_stats_init(&c->times[i]);
bch2_fs_copygc_init(c);
bch2_fs_btree_key_cache_init_early(&c->btree_key_cache);
+ bch2_fs_btree_iter_init_early(c);
bch2_fs_btree_interior_update_init_early(c);
bch2_fs_allocator_background_init(c);
bch2_fs_allocator_foreground_init(c);
struct bch_sb_handle {
struct bch_sb *sb;
struct block_device *bdev;
+ char *sb_name;
struct bio *bio;
void *holder;
size_t buffer_size;
if (!btree_type_has_ptrs(id))
continue;
- for_each_btree_key(trans, iter, id, POS_MIN,
- BTREE_ITER_ALL_SNAPSHOTS, k, ret) {
+ ret = for_each_btree_key2(trans, iter, id, POS_MIN,
+ BTREE_ITER_ALL_SNAPSHOTS, k, ({
struct bkey_ptrs_c ptrs = bch2_bkey_ptrs_c(k);
const union bch_extent_entry *entry;
struct extent_ptr_decoded p;
nr_uncompressed_extents++;
else if (compressed)
nr_compressed_extents++;
- }
- bch2_trans_iter_exit(trans, &iter);
+ 0;
+ }));
}
bch2_trans_put(trans);
TRACE_EVENT(journal_reclaim_start,
TP_PROTO(struct bch_fs *c, bool direct, bool kicked,
u64 min_nr, u64 min_key_cache,
- u64 prereserved, u64 prereserved_total,
u64 btree_cache_dirty, u64 btree_cache_total,
u64 btree_key_cache_dirty, u64 btree_key_cache_total),
- TP_ARGS(c, direct, kicked, min_nr, min_key_cache, prereserved, prereserved_total,
+ TP_ARGS(c, direct, kicked, min_nr, min_key_cache,
btree_cache_dirty, btree_cache_total,
btree_key_cache_dirty, btree_key_cache_total),
__field(bool, kicked )
__field(u64, min_nr )
__field(u64, min_key_cache )
- __field(u64, prereserved )
- __field(u64, prereserved_total )
__field(u64, btree_cache_dirty )
__field(u64, btree_cache_total )
__field(u64, btree_key_cache_dirty )
__entry->kicked = kicked;
__entry->min_nr = min_nr;
__entry->min_key_cache = min_key_cache;
- __entry->prereserved = prereserved;
- __entry->prereserved_total = prereserved_total;
__entry->btree_cache_dirty = btree_cache_dirty;
__entry->btree_cache_total = btree_cache_total;
__entry->btree_key_cache_dirty = btree_key_cache_dirty;
__entry->btree_key_cache_total = btree_key_cache_total;
),
- TP_printk("%d,%d direct %u kicked %u min %llu key cache %llu prereserved %llu/%llu btree cache %llu/%llu key cache %llu/%llu",
+ TP_printk("%d,%d direct %u kicked %u min %llu key cache %llu btree cache %llu/%llu key cache %llu/%llu",
MAJOR(__entry->dev), MINOR(__entry->dev),
__entry->direct,
__entry->kicked,
__entry->min_nr,
__entry->min_key_cache,
- __entry->prereserved,
- __entry->prereserved_total,
__entry->btree_cache_dirty,
__entry->btree_cache_total,
__entry->btree_key_cache_dirty,
TP_printk("%d:%d %s", MAJOR(__entry->dev), MINOR(__entry->dev), __get_str(msg))
);
-DEFINE_EVENT(bkey, move_extent_alloc_mem_fail,
- TP_PROTO(struct bch_fs *c, const char *k),
- TP_ARGS(c, k)
+DEFINE_EVENT(bkey, move_extent_start_fail,
+ TP_PROTO(struct bch_fs *c, const char *str),
+ TP_ARGS(c, str)
);
TRACE_EVENT(move_data,
s.v = v + 1;
s.defined = true;
} else {
+ /*
+ * Check if this option was set on the parent - if so, switched
+ * back to inheriting from the parent:
+ *
+ * rename() also has to deal with keeping inherited options up
+ * to date - see bch2_reinherit_attrs()
+ */
+ spin_lock(&dentry->d_lock);
if (!IS_ROOT(dentry)) {
struct bch_inode_info *dir =
to_bch_ei(d_inode(dentry->d_parent));
} else {
s.v = 0;
}
+ spin_unlock(&dentry->d_lock);
s.defined = false;
}
if (btrfs_block_can_be_shared(trans, root, buf)) {
ret = btrfs_lookup_extent_info(trans, fs_info, buf->start,
btrfs_header_level(buf), 1,
- &refs, &flags);
+ &refs, &flags, NULL);
if (ret)
return ret;
if (unlikely(refs == 0)) {
start = round_down(start, fs_info->sectorsize);
btrfs_free_reserved_data_space_noquota(fs_info, len);
- btrfs_qgroup_free_data(inode, reserved, start, len);
+ btrfs_qgroup_free_data(inode, reserved, start, len, NULL);
}
/*
return -ENOMEM;
}
- if (btrfs_qgroup_enabled(fs_info) && !generic_ref->skip_qgroup) {
+ if (btrfs_qgroup_full_accounting(fs_info) && !generic_ref->skip_qgroup) {
record = kzalloc(sizeof(*record), GFP_NOFS);
if (!record) {
kmem_cache_free(btrfs_delayed_tree_ref_cachep, ref);
return -ENOMEM;
}
- if (btrfs_qgroup_enabled(fs_info) && !generic_ref->skip_qgroup) {
+ if (btrfs_qgroup_full_accounting(fs_info) && !generic_ref->skip_qgroup) {
record = kzalloc(sizeof(*record), GFP_NOFS);
if (!record) {
kmem_cache_free(btrfs_delayed_data_ref_cachep, ref);
goto fail_alloc;
}
+ btrfs_info(fs_info, "first mount of filesystem %pU", disk_super->fsid);
/*
* Verify the type first, if that or the checksum value are
* corrupted, we'll find out
}
}
+static void btrfs_free_all_qgroup_pertrans(struct btrfs_fs_info *fs_info)
+{
+ struct btrfs_root *gang[8];
+ int i;
+ int ret;
+
+ spin_lock(&fs_info->fs_roots_radix_lock);
+ while (1) {
+ ret = radix_tree_gang_lookup_tag(&fs_info->fs_roots_radix,
+ (void **)gang, 0,
+ ARRAY_SIZE(gang),
+ BTRFS_ROOT_TRANS_TAG);
+ if (ret == 0)
+ break;
+ for (i = 0; i < ret; i++) {
+ struct btrfs_root *root = gang[i];
+
+ btrfs_qgroup_free_meta_all_pertrans(root);
+ radix_tree_tag_clear(&fs_info->fs_roots_radix,
+ (unsigned long)root->root_key.objectid,
+ BTRFS_ROOT_TRANS_TAG);
+ }
+ }
+ spin_unlock(&fs_info->fs_roots_radix_lock);
+}
+
void btrfs_cleanup_one_transaction(struct btrfs_transaction *cur_trans,
struct btrfs_fs_info *fs_info)
{
EXTENT_DIRTY);
btrfs_destroy_pinned_extent(fs_info, &cur_trans->pinned_extents);
+ btrfs_free_all_qgroup_pertrans(fs_info);
+
cur_trans->state =TRANS_STATE_COMPLETED;
wake_up(&cur_trans->commit_wait);
}
*/
int btrfs_lookup_extent_info(struct btrfs_trans_handle *trans,
struct btrfs_fs_info *fs_info, u64 bytenr,
- u64 offset, int metadata, u64 *refs, u64 *flags)
+ u64 offset, int metadata, u64 *refs, u64 *flags,
+ u64 *owning_root)
{
struct btrfs_root *extent_root;
struct btrfs_delayed_ref_head *head;
u32 item_size;
u64 num_refs;
u64 extent_flags;
+ u64 owner = 0;
int ret;
/*
struct btrfs_extent_item);
num_refs = btrfs_extent_refs(leaf, ei);
extent_flags = btrfs_extent_flags(leaf, ei);
+ owner = btrfs_get_extent_owner_root(fs_info, leaf,
+ path->slots[0]);
} else {
ret = -EUCLEAN;
btrfs_err(fs_info,
*refs = num_refs;
if (flags)
*flags = extent_flags;
+ if (owning_root)
+ *owning_root = owner;
out_free:
btrfs_free_path(path);
return ret;
return ret;
}
+static void free_head_ref_squota_rsv(struct btrfs_fs_info *fs_info,
+ struct btrfs_delayed_ref_head *href)
+{
+ u64 root = href->owning_root;
+
+ /*
+ * Don't check must_insert_reserved, as this is called from contexts
+ * where it has already been unset.
+ */
+ if (btrfs_qgroup_mode(fs_info) != BTRFS_QGROUP_MODE_SIMPLE ||
+ !href->is_data || !is_fstree(root))
+ return;
+
+ btrfs_qgroup_free_refroot(fs_info, root, href->reserved_bytes,
+ BTRFS_QGROUP_RSV_DATA);
+}
+
static int run_delayed_data_ref(struct btrfs_trans_handle *trans,
struct btrfs_delayed_ref_head *href,
struct btrfs_delayed_ref_node *node,
struct btrfs_squota_delta delta = {
.root = href->owning_root,
.num_bytes = node->num_bytes,
- .rsv_bytes = href->reserved_bytes,
.is_data = true,
.is_inc = true,
.generation = trans->transid,
flags, ref->objectid,
ref->offset, &key,
node->ref_mod, href->owning_root);
+ free_head_ref_squota_rsv(trans->fs_info, href);
if (!ret)
ret = btrfs_record_squota_delta(trans->fs_info, &delta);
- else
- btrfs_qgroup_free_refroot(trans->fs_info, delta.root,
- delta.rsv_bytes, BTRFS_QGROUP_RSV_DATA);
} else if (node->action == BTRFS_ADD_DELAYED_REF) {
ret = __btrfs_inc_extent_ref(trans, node, parent, ref->root,
ref->objectid, ref->offset,
struct btrfs_squota_delta delta = {
.root = href->owning_root,
.num_bytes = fs_info->nodesize,
- .rsv_bytes = 0,
.is_data = false,
.is_inc = true,
.generation = trans->transid,
int ret = 0;
if (TRANS_ABORTED(trans)) {
- if (insert_reserved)
+ if (insert_reserved) {
btrfs_pin_extent(trans, node->bytenr, node->num_bytes, 1);
+ free_head_ref_squota_rsv(trans->fs_info, href);
+ }
return 0;
}
struct btrfs_delayed_ref_root *delayed_refs,
struct btrfs_delayed_ref_head *head)
{
+ u64 ret = 0;
+
/*
* We had csum deletions accounted for in our delayed refs rsv, we need
* to drop the csum leaves for this update from our delayed_refs_rsv.
btrfs_delayed_refs_rsv_release(fs_info, 0, nr_csums);
- return btrfs_calc_delayed_ref_csum_bytes(fs_info, nr_csums);
+ ret = btrfs_calc_delayed_ref_csum_bytes(fs_info, nr_csums);
}
- if (btrfs_qgroup_mode(fs_info) == BTRFS_QGROUP_MODE_SIMPLE &&
- head->must_insert_reserved && head->is_data)
- btrfs_qgroup_free_refroot(fs_info, head->owning_root,
- head->reserved_bytes, BTRFS_QGROUP_RSV_DATA);
+ /* must_insert_reserved can be set only if we didn't run the head ref. */
+ if (head->must_insert_reserved)
+ free_head_ref_squota_rsv(fs_info, head);
- return 0;
+ return ret;
}
static int cleanup_ref_head(struct btrfs_trans_handle *trans,
* spin lock.
*/
must_insert_reserved = locked_ref->must_insert_reserved;
+ /*
+ * Unsetting this on the head ref relinquishes ownership of
+ * the rsv_bytes, so it is critical that every possible code
+ * path from here forward frees all reserves including qgroup
+ * reserve.
+ */
locked_ref->must_insert_reserved = false;
extent_op = locked_ref->extent_op;
struct btrfs_squota_delta delta = {
.root = delayed_ref_root,
.num_bytes = num_bytes,
- .rsv_bytes = 0,
.is_data = is_data,
.is_inc = false,
.generation = btrfs_extent_generation(leaf, ei),
.root = root_objectid,
.num_bytes = ins->offset,
.generation = trans->transid,
- .rsv_bytes = 0,
.is_data = true,
.is_inc = true,
};
/* We don't lock the tree block, it's OK to be racy here */
ret = btrfs_lookup_extent_info(trans, fs_info, bytenr,
wc->level - 1, 1, &refs,
- &flags);
+ &flags, NULL);
/* We don't care about errors in readahead. */
if (ret < 0)
continue;
ret = btrfs_lookup_extent_info(trans, fs_info,
eb->start, level, 1,
&wc->refs[level],
- &wc->flags[level]);
+ &wc->flags[level],
+ NULL);
BUG_ON(ret == -ENOMEM);
if (ret)
return ret;
u64 bytenr;
u64 generation;
u64 parent;
+ u64 owner_root = 0;
struct btrfs_tree_parent_check check = { 0 };
struct btrfs_key key;
struct btrfs_ref ref = { 0 };
ret = btrfs_lookup_extent_info(trans, fs_info, bytenr, level - 1, 1,
&wc->refs[level - 1],
- &wc->flags[level - 1]);
+ &wc->flags[level - 1],
+ &owner_root);
if (ret < 0)
goto out_unlock;
find_next_key(path, level, &wc->drop_progress);
btrfs_init_generic_ref(&ref, BTRFS_DROP_DELAYED_REF, bytenr,
- fs_info->nodesize, parent,
- btrfs_header_owner(next));
+ fs_info->nodesize, parent, owner_root);
btrfs_init_tree_ref(&ref, level - 1, root->root_key.objectid,
0, false);
ret = btrfs_free_extent(trans, &ref);
ret = btrfs_lookup_extent_info(trans, fs_info,
eb->start, level, 1,
&wc->refs[level],
- &wc->flags[level]);
+ &wc->flags[level],
+ NULL);
if (ret < 0) {
btrfs_tree_unlock_rw(eb, path->locks[level]);
path->locks[level] = 0;
ret = btrfs_lookup_extent_info(trans, fs_info,
path->nodes[level]->start,
level, 1, &wc->refs[level],
- &wc->flags[level]);
+ &wc->flags[level], NULL);
if (ret < 0) {
err = ret;
goto out_end_trans;
int btrfs_lookup_data_extent(struct btrfs_fs_info *fs_info, u64 start, u64 len);
int btrfs_lookup_extent_info(struct btrfs_trans_handle *trans,
struct btrfs_fs_info *fs_info, u64 bytenr,
- u64 offset, int metadata, u64 *refs, u64 *flags);
+ u64 offset, int metadata, u64 *refs, u64 *flags,
+ u64 *owner_root);
int btrfs_pin_extent(struct btrfs_trans_handle *trans, u64 bytenr, u64 num,
int reserved);
int btrfs_pin_extent_for_log_replay(struct btrfs_trans_handle *trans,
* the array will be skipped
*
* Return: 0 if all pages were able to be allocated;
- * -ENOMEM otherwise, and the caller is responsible for freeing all
- * non-null page pointers in the array.
+ * -ENOMEM otherwise, the partially allocated pages would be freed and
+ * the array slots zeroed
*/
int btrfs_alloc_page_array(unsigned int nr_pages, struct page **page_array)
{
* though alloc_pages_bulk_array() falls back to alloc_page()
* if it could not bulk-allocate. So we must be out of memory.
*/
- if (allocated == last)
+ if (allocated == last) {
+ for (int i = 0; i < allocated; i++) {
+ __free_page(page_array[i]);
+ page_array[i] = NULL;
+ }
return -ENOMEM;
+ }
memalloc_retry_wait(GFP_NOFS);
}
ret = 0;
} else {
u32 clear_bits = ~(EXTENT_LOCKED | EXTENT_NODATASUM |
- EXTENT_DELALLOC_NEW | EXTENT_CTLBITS);
+ EXTENT_DELALLOC_NEW | EXTENT_CTLBITS |
+ EXTENT_QGROUP_RESERVED);
/*
* At this point we can safely clear everything except the
qgroup_reserved -= range->len;
} else if (qgroup_reserved > 0) {
btrfs_qgroup_free_data(BTRFS_I(inode), data_reserved,
- range->start, range->len);
+ range->start, range->len, NULL);
qgroup_reserved -= range->len;
}
list_del(&range->list);
* And at reserve time, it's always aligned to page size, so
* just free one page here.
*/
- btrfs_qgroup_free_data(inode, NULL, 0, PAGE_SIZE);
+ btrfs_qgroup_free_data(inode, NULL, 0, PAGE_SIZE, NULL);
btrfs_free_path(path);
btrfs_end_transaction(trans);
return ret;
*/
if (state_flags & EXTENT_DELALLOC)
btrfs_qgroup_free_data(BTRFS_I(inode), NULL, start,
- end - start + 1);
+ end - start + 1, NULL);
clear_extent_bit(io_tree, start, end,
EXTENT_CLEAR_ALL_BITS | EXTENT_DO_ACCOUNTING,
int ret;
alloc_hint = get_extent_allocation_hint(inode, start, len);
+again:
ret = btrfs_reserve_extent(root, len, len, fs_info->sectorsize,
0, alloc_hint, &ins, 1, 1);
+ if (ret == -EAGAIN) {
+ ASSERT(btrfs_is_zoned(fs_info));
+ wait_on_bit_io(&inode->root->fs_info->flags, BTRFS_FS_NEED_ZONE_FINISH,
+ TASK_UNINTERRUPTIBLE);
+ goto again;
+ }
if (ret)
return ERR_PTR(ret);
* reserved data space.
* Since the IO will never happen for this page.
*/
- btrfs_qgroup_free_data(inode, NULL, cur, range_end + 1 - cur);
+ btrfs_qgroup_free_data(inode, NULL, cur, range_end + 1 - cur, NULL);
if (!inode_evicting) {
clear_extent_bit(tree, cur, range_end, EXTENT_LOCKED |
EXTENT_DELALLOC | EXTENT_UPTODATE |
struct btrfs_path *path;
u64 start = ins->objectid;
u64 len = ins->offset;
- int qgroup_released;
+ u64 qgroup_released = 0;
int ret;
memset(&stack_fi, 0, sizeof(stack_fi));
btrfs_set_stack_file_extent_compression(&stack_fi, BTRFS_COMPRESS_NONE);
/* Encryption and other encoding is reserved and all 0 */
- qgroup_released = btrfs_qgroup_release_data(inode, file_offset, len);
- if (qgroup_released < 0)
- return ERR_PTR(qgroup_released);
+ ret = btrfs_qgroup_release_data(inode, file_offset, len, &qgroup_released);
+ if (ret < 0)
+ return ERR_PTR(ret);
if (trans) {
ret = insert_reserved_file_extent(trans, inode,
btrfs_delalloc_release_metadata(inode, disk_num_bytes, ret < 0);
out_qgroup_free_data:
if (ret < 0)
- btrfs_qgroup_free_data(inode, data_reserved, start, num_bytes);
+ btrfs_qgroup_free_data(inode, data_reserved, start, num_bytes, NULL);
out_free_data_space:
/*
* If btrfs_reserve_extent() succeeded, then we already decremented
* are limited to own subvolumes only
*/
ret = -EPERM;
+ } else if (btrfs_ino(BTRFS_I(src_inode)) != BTRFS_FIRST_FREE_OBJECTID) {
+ /*
+ * Snapshots must be made with the src_inode referring
+ * to the subvolume inode, otherwise the permission
+ * checking above is useless because we may have
+ * permission on a lower directory but not the subvol
+ * itself.
+ */
+ ret = -EINVAL;
} else {
ret = btrfs_mksnapshot(&file->f_path, idmap,
name, namelen,
static noinline int copy_to_sk(struct btrfs_path *path,
struct btrfs_key *key,
struct btrfs_ioctl_search_key *sk,
- size_t *buf_size,
+ u64 *buf_size,
char __user *ubuf,
unsigned long *sk_offset,
int *num_found)
static noinline int search_ioctl(struct inode *inode,
struct btrfs_ioctl_search_key *sk,
- size_t *buf_size,
+ u64 *buf_size,
char __user *ubuf)
{
struct btrfs_fs_info *info = btrfs_sb(inode->i_sb);
struct btrfs_ioctl_search_args __user *uargs = argp;
struct btrfs_ioctl_search_key sk;
int ret;
- size_t buf_size;
+ u64 buf_size;
if (!capable(CAP_SYS_ADMIN))
return -EPERM;
struct btrfs_ioctl_search_args_v2 __user *uarg = argp;
struct btrfs_ioctl_search_args_v2 args;
int ret;
- size_t buf_size;
- const size_t buf_limit = SZ_16M;
+ u64 buf_size;
+ const u64 buf_limit = SZ_16M;
if (!capable(CAP_SYS_ADMIN))
return -EPERM;
arg->clone_sources = compat_ptr(args32.clone_sources);
arg->parent_root = args32.parent_root;
arg->flags = args32.flags;
+ arg->version = args32.version;
memcpy(arg->reserved, args32.reserved,
sizeof(args32.reserved));
#else
{
struct btrfs_ordered_extent *entry;
int ret;
+ u64 qgroup_rsv = 0;
if (flags &
((1 << BTRFS_ORDERED_NOCOW) | (1 << BTRFS_ORDERED_PREALLOC))) {
/* For nocow write, we can release the qgroup rsv right now */
- ret = btrfs_qgroup_free_data(inode, NULL, file_offset, num_bytes);
+ ret = btrfs_qgroup_free_data(inode, NULL, file_offset, num_bytes, &qgroup_rsv);
if (ret < 0)
return ERR_PTR(ret);
} else {
* The ordered extent has reserved qgroup space, release now
* and pass the reserved number for qgroup_record to free.
*/
- ret = btrfs_qgroup_release_data(inode, file_offset, num_bytes);
+ ret = btrfs_qgroup_release_data(inode, file_offset, num_bytes, &qgroup_rsv);
if (ret < 0)
return ERR_PTR(ret);
}
entry->inode = igrab(&inode->vfs_inode);
entry->compress_type = compress_type;
entry->truncated_len = (u64)-1;
- entry->qgroup_rsv = ret;
+ entry->qgroup_rsv = qgroup_rsv;
entry->flags = flags;
refcount_set(&entry->refs, 1);
init_waitqueue_head(&entry->wait);
release = entry->disk_num_bytes;
else
release = entry->num_bytes;
- btrfs_delalloc_release_metadata(btrfs_inode, release, false);
+ btrfs_delalloc_release_metadata(btrfs_inode, release,
+ test_bit(BTRFS_ORDERED_IOERR,
+ &entry->flags));
}
percpu_counter_add_batch(&fs_info->ordered_bytes, -entry->num_bytes,
u64 bytenr = record->bytenr;
if (!btrfs_qgroup_full_accounting(fs_info))
- return 0;
+ return 1;
lockdep_assert_held(&delayed_refs->lock);
trace_btrfs_qgroup_trace_extent(fs_info, record);
qgroup_update_counters(fs_info, &qgroups, nr_old_roots, nr_new_roots,
num_bytes, seq);
+ /*
+ * We're done using the iterator, release all its qgroups while holding
+ * fs_info->qgroup_lock so that we don't race with btrfs_remove_qgroup()
+ * and trigger use-after-free accesses to qgroups.
+ */
+ qgroup_iterator_nested_clean(&qgroups);
+
/*
* Bump qgroup_seq to avoid seq overlap
*/
fs_info->qgroup_seq += max(nr_old_roots, nr_new_roots) + 1;
spin_unlock(&fs_info->qgroup_lock);
out_free:
- qgroup_iterator_nested_clean(&qgroups);
ulist_free(old_roots);
ulist_free(new_roots);
return ret;
/* Free ranges specified by @reserved, normally in error path */
static int qgroup_free_reserved_data(struct btrfs_inode *inode,
- struct extent_changeset *reserved, u64 start, u64 len)
+ struct extent_changeset *reserved,
+ u64 start, u64 len, u64 *freed_ret)
{
struct btrfs_root *root = inode->root;
struct ulist_node *unode;
struct ulist_iterator uiter;
struct extent_changeset changeset;
- int freed = 0;
+ u64 freed = 0;
int ret;
extent_changeset_init(&changeset);
}
btrfs_qgroup_free_refroot(root->fs_info, root->root_key.objectid, freed,
BTRFS_QGROUP_RSV_DATA);
- ret = freed;
+ if (freed_ret)
+ *freed_ret = freed;
+ ret = 0;
out:
extent_changeset_release(&changeset);
return ret;
static int __btrfs_qgroup_release_data(struct btrfs_inode *inode,
struct extent_changeset *reserved, u64 start, u64 len,
- int free)
+ u64 *released, int free)
{
struct extent_changeset changeset;
int trace_op = QGROUP_RELEASE;
/* In release case, we shouldn't have @reserved */
WARN_ON(!free && reserved);
if (free && reserved)
- return qgroup_free_reserved_data(inode, reserved, start, len);
+ return qgroup_free_reserved_data(inode, reserved, start, len, released);
extent_changeset_init(&changeset);
ret = clear_record_extent_bits(&inode->io_tree, start, start + len -1,
EXTENT_QGROUP_RESERVED, &changeset);
btrfs_qgroup_free_refroot(inode->root->fs_info,
inode->root->root_key.objectid,
changeset.bytes_changed, BTRFS_QGROUP_RSV_DATA);
- ret = changeset.bytes_changed;
+ if (released)
+ *released = changeset.bytes_changed;
out:
extent_changeset_release(&changeset);
return ret;
* NOTE: This function may sleep for memory allocation.
*/
int btrfs_qgroup_free_data(struct btrfs_inode *inode,
- struct extent_changeset *reserved, u64 start, u64 len)
+ struct extent_changeset *reserved,
+ u64 start, u64 len, u64 *freed)
{
- return __btrfs_qgroup_release_data(inode, reserved, start, len, 1);
+ return __btrfs_qgroup_release_data(inode, reserved, start, len, freed, 1);
}
/*
*
* NOTE: This function may sleep for memory allocation.
*/
-int btrfs_qgroup_release_data(struct btrfs_inode *inode, u64 start, u64 len)
+int btrfs_qgroup_release_data(struct btrfs_inode *inode, u64 start, u64 len, u64 *released)
{
- return __btrfs_qgroup_release_data(inode, NULL, start, len, 0);
+ return __btrfs_qgroup_release_data(inode, NULL, start, len, released, 0);
}
static void add_root_meta_rsv(struct btrfs_root *root, int num_bytes,
qgroup_rsv_release(fs_info, qgroup, num_bytes,
BTRFS_QGROUP_RSV_META_PREALLOC);
- qgroup_rsv_add(fs_info, qgroup, num_bytes,
- BTRFS_QGROUP_RSV_META_PERTRANS);
+ if (!sb_rdonly(fs_info->sb))
+ qgroup_rsv_add(fs_info, qgroup, num_bytes,
+ BTRFS_QGROUP_RSV_META_PERTRANS);
list_for_each_entry(glist, &qgroup->groups, next_group)
qgroup_iterator_add(&qgroup_list, glist->group);
*root = RB_ROOT;
}
+void btrfs_free_squota_rsv(struct btrfs_fs_info *fs_info, u64 root, u64 rsv_bytes)
+{
+ if (btrfs_qgroup_mode(fs_info) != BTRFS_QGROUP_MODE_SIMPLE)
+ return;
+
+ if (!is_fstree(root))
+ return;
+
+ btrfs_qgroup_free_refroot(fs_info, root, rsv_bytes, BTRFS_QGROUP_RSV_DATA);
+}
+
int btrfs_record_squota_delta(struct btrfs_fs_info *fs_info,
struct btrfs_squota_delta *delta)
{
out:
spin_unlock(&fs_info->qgroup_lock);
- if (!ret && delta->rsv_bytes)
- btrfs_qgroup_free_refroot(fs_info, root, delta->rsv_bytes,
- BTRFS_QGROUP_RSV_DATA);
return ret;
}
u64 root;
/* The number of bytes in the extent being counted. */
u64 num_bytes;
- /* The number of bytes reserved for this extent. */
- u64 rsv_bytes;
/* The generation the extent was created in. */
u64 generation;
/* Whether we are using or freeing the extent. */
/* New io_tree based accurate qgroup reserve API */
int btrfs_qgroup_reserve_data(struct btrfs_inode *inode,
struct extent_changeset **reserved, u64 start, u64 len);
-int btrfs_qgroup_release_data(struct btrfs_inode *inode, u64 start, u64 len);
+int btrfs_qgroup_release_data(struct btrfs_inode *inode, u64 start, u64 len, u64 *released);
int btrfs_qgroup_free_data(struct btrfs_inode *inode,
struct extent_changeset *reserved, u64 start,
- u64 len);
+ u64 len, u64 *freed);
int btrfs_qgroup_reserve_meta(struct btrfs_root *root, int num_bytes,
enum btrfs_qgroup_rsv_type type, bool enforce);
int __btrfs_qgroup_reserve_meta(struct btrfs_root *root, int num_bytes,
struct btrfs_root *root, struct extent_buffer *eb);
void btrfs_qgroup_destroy_extent_records(struct btrfs_transaction *trans);
bool btrfs_check_quota_leak(struct btrfs_fs_info *fs_info);
+void btrfs_free_squota_rsv(struct btrfs_fs_info *fs_info, u64 root, u64 rsv_bytes);
int btrfs_record_squota_delta(struct btrfs_fs_info *fs_info,
struct btrfs_squota_delta *delta);
btrfs_put_bioc(bioc);
}
- return ret;
+ return 0;
}
int btrfs_get_raid_extent_offset(struct btrfs_fs_info *fs_info,
dump_ref_action(fs_info, ra);
kfree(ref);
kfree(ra);
+ kfree(re);
goto out_unlock;
} else if (be->num_refs == 0) {
btrfs_err(fs_info,
dump_ref_action(fs_info, ra);
kfree(ref);
kfree(ra);
+ kfree(re);
goto out_unlock;
}
*/
ASSERT(sctx->cur_stripe < SCRUB_TOTAL_STRIPES);
+ /* @found_logical_ret must be specified. */
+ ASSERT(found_logical_ret);
+
stripe = &sctx->stripes[sctx->cur_stripe];
scrub_reset_stripe(stripe);
ret = scrub_find_fill_first_stripe(bg, &sctx->extent_path,
/* Either >0 as no more extents or <0 for error. */
if (ret)
return ret;
- if (found_logical_ret)
- *found_logical_ret = stripe->logical;
+ *found_logical_ret = stripe->logical;
sctx->cur_stripe++;
/* We filled one group, submit it. */
/* Go through each extent items inside the logical range */
while (cur_logical < logical_end) {
- u64 found_logical;
+ u64 found_logical = U64_MAX;
u64 cur_physical = physical + cur_logical - logical_start;
/* Canceled? */
if (ret < 0)
break;
+ /* queue_scrub_stripe() returned 0, @found_logical must be updated. */
+ ASSERT(found_logical != U64_MAX);
cur_logical = found_logical + BTRFS_STRIPE_LEN;
/* Don't hold CPU for too long time */
}
sctx->send_filp = fget(arg->send_fd);
- if (!sctx->send_filp) {
+ if (!sctx->send_filp || !(sctx->send_filp->f_mode & FMODE_WRITE)) {
ret = -EBADF;
goto out;
}
static void btrfs_put_super(struct super_block *sb)
{
- close_ctree(btrfs_sb(sb));
+ struct btrfs_fs_info *fs_info = btrfs_sb(sb);
+
+ btrfs_info(fs_info, "last unmount of filesystem %pU", fs_info->fs_devices->fsid);
+ close_ctree(fs_info);
}
enum {
static struct kmem_cache *btrfs_trans_handle_cachep;
-#define BTRFS_ROOT_TRANS_TAG 0
-
/*
* Transaction states and transitions
*
btrfs_release_path(path);
ret = btrfs_create_qgroup(trans, objectid);
- if (ret) {
+ if (ret && ret != -EEXIST) {
btrfs_abort_transaction(trans, ret);
goto fail;
}
#include "ctree.h"
#include "misc.h"
+/* Radix-tree tag for roots that are part of the trasaction. */
+#define BTRFS_ROOT_TRANS_TAG 0
+
enum btrfs_trans_state {
TRANS_STATE_RUNNING,
TRANS_STATE_COMMIT_PREP,
#include "inode-item.h"
#include "dir-item.h"
#include "raid-stripe-tree.h"
+#include "extent-tree.h"
/*
* Error message should follow the following format:
unsigned long ptr; /* Current pointer inside inline refs */
unsigned long end; /* Extent item end */
const u32 item_size = btrfs_item_size(leaf, slot);
+ u8 last_type = 0;
+ u64 last_seq = U64_MAX;
u64 flags;
u64 generation;
u64 total_refs; /* Total refs in btrfs_extent_item */
* 2.2) Ref type specific data
* Either using btrfs_extent_inline_ref::offset, or specific
* data structure.
+ *
+ * All above inline items should follow the order:
+ *
+ * - All btrfs_extent_inline_ref::type should be in an ascending
+ * order
+ *
+ * - Within the same type, the items should follow a descending
+ * order by their sequence number. The sequence number is
+ * determined by:
+ * * btrfs_extent_inline_ref::offset for all types other than
+ * EXTENT_DATA_REF
+ * * hash_extent_data_ref() for EXTENT_DATA_REF
*/
if (unlikely(item_size < sizeof(*ei))) {
extent_err(leaf, slot,
struct btrfs_extent_inline_ref *iref;
struct btrfs_extent_data_ref *dref;
struct btrfs_shared_data_ref *sref;
+ u64 seq;
u64 dref_offset;
u64 inline_offset;
u8 inline_type;
iref = (struct btrfs_extent_inline_ref *)ptr;
inline_type = btrfs_extent_inline_ref_type(leaf, iref);
inline_offset = btrfs_extent_inline_ref_offset(leaf, iref);
+ seq = inline_offset;
if (unlikely(ptr + btrfs_extent_inline_ref_size(inline_type) > end)) {
extent_err(leaf, slot,
"inline ref item overflows extent item, ptr %lu iref size %u end %lu",
case BTRFS_EXTENT_DATA_REF_KEY:
dref = (struct btrfs_extent_data_ref *)(&iref->offset);
dref_offset = btrfs_extent_data_ref_offset(leaf, dref);
+ seq = hash_extent_data_ref(
+ btrfs_extent_data_ref_root(leaf, dref),
+ btrfs_extent_data_ref_objectid(leaf, dref),
+ btrfs_extent_data_ref_offset(leaf, dref));
if (unlikely(!IS_ALIGNED(dref_offset,
fs_info->sectorsize))) {
extent_err(leaf, slot,
inline_type);
return -EUCLEAN;
}
+ if (inline_type < last_type) {
+ extent_err(leaf, slot,
+ "inline ref out-of-order: has type %u, prev type %u",
+ inline_type, last_type);
+ return -EUCLEAN;
+ }
+ /* Type changed, allow the sequence starts from U64_MAX again. */
+ if (inline_type > last_type)
+ last_seq = U64_MAX;
+ if (seq > last_seq) {
+ extent_err(leaf, slot,
+"inline ref out-of-order: has type %u offset %llu seq 0x%llx, prev type %u seq 0x%llx",
+ inline_type, inline_offset, seq,
+ last_type, last_seq);
+ return -EUCLEAN;
+ }
+ last_type = inline_type;
+ last_seq = seq;
ptr += btrfs_extent_inline_ref_size(inline_type);
}
/* No padding is allowed */
if (!fs_devices) {
fs_devices = alloc_fs_devices(disk_super->fsid);
+ if (IS_ERR(fs_devices))
+ return ERR_CAST(fs_devices);
+
if (has_metadata_uuid)
memcpy(fs_devices->metadata_uuid,
disk_super->metadata_uuid, BTRFS_FSID_SIZE);
- if (IS_ERR(fs_devices))
- return ERR_CAST(fs_devices);
-
if (same_fsid_diff_dev) {
generate_random_uuid(fs_devices->fsid);
fs_devices->temp_fsid = true;
read_unlock(&em_tree->lock);
if (!em) {
- btrfs_crit(fs_info, "unable to find logical %llu length %llu",
+ btrfs_crit(fs_info,
+ "unable to find chunk map for logical %llu length %llu",
logical, length);
return ERR_PTR(-EINVAL);
}
- if (em->start > logical || em->start + em->len < logical) {
+ if (em->start > logical || em->start + em->len <= logical) {
btrfs_crit(fs_info,
- "found a bad mapping, wanted %llu-%llu, found %llu-%llu",
- logical, length, em->start, em->start + em->len);
+ "found a bad chunk map, wanted %llu-%llu, found %llu-%llu",
+ logical, logical + length, em->start, em->start + em->len);
free_extent_map(em);
return ERR_PTR(-EINVAL);
}
}
out:
- if (cache->alloc_offset > fs_info->zone_size) {
- btrfs_err(fs_info,
- "zoned: invalid write pointer %llu in block group %llu",
- cache->alloc_offset, cache->start);
- ret = -EIO;
- }
-
if (cache->alloc_offset > cache->zone_capacity) {
btrfs_err(fs_info,
"zoned: invalid write pointer %llu (larger than zone capacity %llu) in block group %llu",
struct debugfs_fsdata *fsd;
void *d_fsd;
+ /*
+ * This could only happen if some debugfs user erroneously calls
+ * debugfs_file_get() on a dentry that isn't even a file, let
+ * them know about it.
+ */
+ if (WARN_ON(!d_is_reg(dentry)))
+ return -EINVAL;
+
d_fsd = READ_ONCE(dentry->d_fsdata);
if (!((unsigned long)d_fsd & DEBUGFS_FSDATA_IS_REAL_FOPS_BIT)) {
fsd = d_fsd;
~DEBUGFS_FSDATA_IS_REAL_FOPS_BIT);
refcount_set(&fsd->active_users, 1);
init_completion(&fsd->active_users_drained);
+ INIT_LIST_HEAD(&fsd->cancellations);
+ mutex_init(&fsd->cancellations_mtx);
+
if (cmpxchg(&dentry->d_fsdata, d_fsd, fsd) != d_fsd) {
+ mutex_destroy(&fsd->cancellations_mtx);
kfree(fsd);
fsd = READ_ONCE(dentry->d_fsdata);
}
}
EXPORT_SYMBOL_GPL(debugfs_file_put);
+/**
+ * debugfs_enter_cancellation - enter a debugfs cancellation
+ * @file: the file being accessed
+ * @cancellation: the cancellation object, the cancel callback
+ * inside of it must be initialized
+ *
+ * When a debugfs file is removed it needs to wait for all active
+ * operations to complete. However, the operation itself may need
+ * to wait for hardware or completion of some asynchronous process
+ * or similar. As such, it may need to be cancelled to avoid long
+ * waits or even deadlocks.
+ *
+ * This function can be used inside a debugfs handler that may
+ * need to be cancelled. As soon as this function is called, the
+ * cancellation's 'cancel' callback may be called, at which point
+ * the caller should proceed to call debugfs_leave_cancellation()
+ * and leave the debugfs handler function as soon as possible.
+ * Note that the 'cancel' callback is only ever called in the
+ * context of some kind of debugfs_remove().
+ *
+ * This function must be paired with debugfs_leave_cancellation().
+ */
+void debugfs_enter_cancellation(struct file *file,
+ struct debugfs_cancellation *cancellation)
+{
+ struct debugfs_fsdata *fsd;
+ struct dentry *dentry = F_DENTRY(file);
+
+ INIT_LIST_HEAD(&cancellation->list);
+
+ if (WARN_ON(!d_is_reg(dentry)))
+ return;
+
+ if (WARN_ON(!cancellation->cancel))
+ return;
+
+ fsd = READ_ONCE(dentry->d_fsdata);
+ if (WARN_ON(!fsd ||
+ ((unsigned long)fsd & DEBUGFS_FSDATA_IS_REAL_FOPS_BIT)))
+ return;
+
+ mutex_lock(&fsd->cancellations_mtx);
+ list_add(&cancellation->list, &fsd->cancellations);
+ mutex_unlock(&fsd->cancellations_mtx);
+
+ /* if we're already removing wake it up to cancel */
+ if (d_unlinked(dentry))
+ complete(&fsd->active_users_drained);
+}
+EXPORT_SYMBOL_GPL(debugfs_enter_cancellation);
+
+/**
+ * debugfs_leave_cancellation - leave cancellation section
+ * @file: the file being accessed
+ * @cancellation: the cancellation previously registered with
+ * debugfs_enter_cancellation()
+ *
+ * See the documentation of debugfs_enter_cancellation().
+ */
+void debugfs_leave_cancellation(struct file *file,
+ struct debugfs_cancellation *cancellation)
+{
+ struct debugfs_fsdata *fsd;
+ struct dentry *dentry = F_DENTRY(file);
+
+ if (WARN_ON(!d_is_reg(dentry)))
+ return;
+
+ fsd = READ_ONCE(dentry->d_fsdata);
+ if (WARN_ON(!fsd ||
+ ((unsigned long)fsd & DEBUGFS_FSDATA_IS_REAL_FOPS_BIT)))
+ return;
+
+ mutex_lock(&fsd->cancellations_mtx);
+ if (!list_empty(&cancellation->list))
+ list_del(&cancellation->list);
+ mutex_unlock(&fsd->cancellations_mtx);
+}
+EXPORT_SYMBOL_GPL(debugfs_leave_cancellation);
+
/*
* Only permit access to world-readable files when the kernel is locked down.
* We also need to exclude any file that has ways to write or alter it as root
static void debugfs_release_dentry(struct dentry *dentry)
{
- void *fsd = dentry->d_fsdata;
+ struct debugfs_fsdata *fsd = dentry->d_fsdata;
- if (!((unsigned long)fsd & DEBUGFS_FSDATA_IS_REAL_FOPS_BIT))
- kfree(dentry->d_fsdata);
+ if ((unsigned long)fsd & DEBUGFS_FSDATA_IS_REAL_FOPS_BIT)
+ return;
+
+ /* check it wasn't a dir (no fsdata) or automount (no real_fops) */
+ if (fsd && fsd->real_fops) {
+ WARN_ON(!list_empty(&fsd->cancellations));
+ mutex_destroy(&fsd->cancellations_mtx);
+ }
+
+ kfree(fsd);
}
static struct vfsmount *debugfs_automount(struct path *path)
{
- debugfs_automount_t f;
- f = (debugfs_automount_t)path->dentry->d_fsdata;
- return f(path->dentry, d_inode(path->dentry)->i_private);
+ struct debugfs_fsdata *fsd = path->dentry->d_fsdata;
+
+ return fsd->automount(path->dentry, d_inode(path->dentry)->i_private);
}
static const struct dentry_operations debugfs_dops = {
void *data)
{
struct dentry *dentry = start_creating(name, parent);
+ struct debugfs_fsdata *fsd;
struct inode *inode;
if (IS_ERR(dentry))
return dentry;
+ fsd = kzalloc(sizeof(*fsd), GFP_KERNEL);
+ if (!fsd) {
+ failed_creating(dentry);
+ return ERR_PTR(-ENOMEM);
+ }
+
+ fsd->automount = f;
+
if (!(debugfs_allow & DEBUGFS_ALLOW_API)) {
failed_creating(dentry);
+ kfree(fsd);
return ERR_PTR(-EPERM);
}
if (unlikely(!inode)) {
pr_err("out of free dentries, can not create automount '%s'\n",
name);
+ kfree(fsd);
return failed_creating(dentry);
}
make_empty_dir_inode(inode);
inode->i_flags |= S_AUTOMOUNT;
inode->i_private = data;
- dentry->d_fsdata = (void *)f;
+ dentry->d_fsdata = fsd;
/* directory inodes start off with i_nlink == 2 (for "." entry) */
inc_nlink(inode);
d_instantiate(dentry, inode);
fsd = READ_ONCE(dentry->d_fsdata);
if ((unsigned long)fsd & DEBUGFS_FSDATA_IS_REAL_FOPS_BIT)
return;
- if (!refcount_dec_and_test(&fsd->active_users))
+
+ /* if we hit zero, just wait for all to finish */
+ if (!refcount_dec_and_test(&fsd->active_users)) {
wait_for_completion(&fsd->active_users_drained);
+ return;
+ }
+
+ /* if we didn't hit zero, try to cancel any we can */
+ while (refcount_read(&fsd->active_users)) {
+ struct debugfs_cancellation *c;
+
+ /*
+ * Lock the cancellations. Note that the cancellations
+ * structs are meant to be on the stack, so we need to
+ * ensure we either use them here or don't touch them,
+ * and debugfs_leave_cancellation() will wait for this
+ * to be finished processing before exiting one. It may
+ * of course win and remove the cancellation, but then
+ * chances are we never even got into this bit, we only
+ * do if the refcount isn't zero already.
+ */
+ mutex_lock(&fsd->cancellations_mtx);
+ while ((c = list_first_entry_or_null(&fsd->cancellations,
+ typeof(*c), list))) {
+ list_del_init(&c->list);
+ c->cancel(dentry, c->cancel_data);
+ }
+ mutex_unlock(&fsd->cancellations_mtx);
+
+ wait_for_completion(&fsd->active_users_drained);
+ }
}
static void remove_one(struct dentry *victim)
#ifndef _DEBUGFS_INTERNAL_H_
#define _DEBUGFS_INTERNAL_H_
+#include <linux/list.h>
struct file_operations;
struct debugfs_fsdata {
const struct file_operations *real_fops;
- refcount_t active_users;
- struct completion active_users_drained;
+ union {
+ /* automount_fn is used when real_fops is NULL */
+ debugfs_automount_t automount;
+ struct {
+ refcount_t active_users;
+ struct completion active_users_drained;
+
+ /* protect cancellations */
+ struct mutex cancellations_mtx;
+ struct list_head cancellations;
+ };
+ };
};
/*
return rc;
}
+static int ecryptfs_do_getattr(const struct path *path, struct kstat *stat,
+ u32 request_mask, unsigned int flags)
+{
+ if (flags & AT_GETATTR_NOSEC)
+ return vfs_getattr_nosec(path, stat, request_mask, flags);
+ return vfs_getattr(path, stat, request_mask, flags);
+}
+
static int ecryptfs_getattr(struct mnt_idmap *idmap,
const struct path *path, struct kstat *stat,
u32 request_mask, unsigned int flags)
struct kstat lower_stat;
int rc;
- rc = vfs_getattr(ecryptfs_dentry_to_lower_path(dentry), &lower_stat,
- request_mask, flags);
+ rc = ecryptfs_do_getattr(ecryptfs_dentry_to_lower_path(dentry),
+ &lower_stat, request_mask, flags);
if (!rc) {
fsstack_copy_attr_all(d_inode(dentry),
ecryptfs_inode_to_lower(d_inode(dentry)));
performance under extremely memory pressure without extra cost.
See the documentation at <file:Documentation/filesystems/erofs.rst>
- for more details.
+ and the web pages at <https://erofs.docs.kernel.org> for more details.
If unsure, say N.
up_read(&devs->rwsem);
return 0;
}
- map->m_bdev = dif->bdev_handle->bdev;
+ map->m_bdev = dif->bdev_handle ? dif->bdev_handle->bdev : NULL;
map->m_daxdev = dif->dax_dev;
map->m_dax_part_off = dif->dax_part_off;
map->m_fscache = dif->fscache;
if (map->m_pa >= startoff &&
map->m_pa < startoff + length) {
map->m_pa -= startoff;
- map->m_bdev = dif->bdev_handle->bdev;
+ map->m_bdev = dif->bdev_handle ?
+ dif->bdev_handle->bdev : NULL;
map->m_daxdev = dif->dax_dev;
map->m_dax_part_off = dif->dax_part_off;
map->m_fscache = dif->fscache;
struct erofs_sb_info *sbi = EROFS_SB(sb);
struct erofs_inode *vi = EROFS_I(inode);
const erofs_off_t inode_loc = erofs_iloc(inode);
-
erofs_blk_t blkaddr, nblks = 0;
void *kaddr;
struct erofs_inode_compact *dic;
struct erofs_inode_extended *die, *copied = NULL;
+ union erofs_inode_i_u iu;
unsigned int ifmt;
int err;
dic = kaddr + *ofs;
ifmt = le16_to_cpu(dic->i_format);
-
if (ifmt & ~EROFS_I_ALL) {
- erofs_err(inode->i_sb, "unsupported i_format %u of nid %llu",
+ erofs_err(sb, "unsupported i_format %u of nid %llu",
ifmt, vi->nid);
err = -EOPNOTSUPP;
goto err_out;
vi->datalayout = erofs_inode_datalayout(ifmt);
if (vi->datalayout >= EROFS_INODE_DATALAYOUT_MAX) {
- erofs_err(inode->i_sb, "unsupported datalayout %u of nid %llu",
+ erofs_err(sb, "unsupported datalayout %u of nid %llu",
vi->datalayout, vi->nid);
err = -EOPNOTSUPP;
goto err_out;
vi->xattr_isize = erofs_xattr_ibody_size(die->i_xattr_icount);
inode->i_mode = le16_to_cpu(die->i_mode);
- switch (inode->i_mode & S_IFMT) {
- case S_IFREG:
- case S_IFDIR:
- case S_IFLNK:
- vi->raw_blkaddr = le32_to_cpu(die->i_u.raw_blkaddr);
- break;
- case S_IFCHR:
- case S_IFBLK:
- inode->i_rdev =
- new_decode_dev(le32_to_cpu(die->i_u.rdev));
- break;
- case S_IFIFO:
- case S_IFSOCK:
- inode->i_rdev = 0;
- break;
- default:
- goto bogusimode;
- }
+ iu = die->i_u;
i_uid_write(inode, le32_to_cpu(die->i_uid));
i_gid_write(inode, le32_to_cpu(die->i_gid));
set_nlink(inode, le32_to_cpu(die->i_nlink));
-
- /* extended inode has its own timestamp */
+ /* each extended inode has its own timestamp */
inode_set_ctime(inode, le64_to_cpu(die->i_mtime),
le32_to_cpu(die->i_mtime_nsec));
inode->i_size = le64_to_cpu(die->i_size);
-
- /* total blocks for compressed files */
- if (erofs_inode_is_data_compressed(vi->datalayout))
- nblks = le32_to_cpu(die->i_u.compressed_blocks);
- else if (vi->datalayout == EROFS_INODE_CHUNK_BASED)
- /* fill chunked inode summary info */
- vi->chunkformat = le16_to_cpu(die->i_u.c.format);
kfree(copied);
copied = NULL;
break;
vi->xattr_isize = erofs_xattr_ibody_size(dic->i_xattr_icount);
inode->i_mode = le16_to_cpu(dic->i_mode);
- switch (inode->i_mode & S_IFMT) {
- case S_IFREG:
- case S_IFDIR:
- case S_IFLNK:
- vi->raw_blkaddr = le32_to_cpu(dic->i_u.raw_blkaddr);
- break;
- case S_IFCHR:
- case S_IFBLK:
- inode->i_rdev =
- new_decode_dev(le32_to_cpu(dic->i_u.rdev));
- break;
- case S_IFIFO:
- case S_IFSOCK:
- inode->i_rdev = 0;
- break;
- default:
- goto bogusimode;
- }
+ iu = dic->i_u;
i_uid_write(inode, le16_to_cpu(dic->i_uid));
i_gid_write(inode, le16_to_cpu(dic->i_gid));
set_nlink(inode, le16_to_cpu(dic->i_nlink));
-
/* use build time for compact inodes */
inode_set_ctime(inode, sbi->build_time, sbi->build_time_nsec);
inode->i_size = le32_to_cpu(dic->i_size);
- if (erofs_inode_is_data_compressed(vi->datalayout))
- nblks = le32_to_cpu(dic->i_u.compressed_blocks);
- else if (vi->datalayout == EROFS_INODE_CHUNK_BASED)
- vi->chunkformat = le16_to_cpu(dic->i_u.c.format);
break;
default:
- erofs_err(inode->i_sb,
- "unsupported on-disk inode version %u of nid %llu",
+ erofs_err(sb, "unsupported on-disk inode version %u of nid %llu",
erofs_inode_version(ifmt), vi->nid);
err = -EOPNOTSUPP;
goto err_out;
}
- if (vi->datalayout == EROFS_INODE_CHUNK_BASED) {
+ switch (inode->i_mode & S_IFMT) {
+ case S_IFREG:
+ case S_IFDIR:
+ case S_IFLNK:
+ vi->raw_blkaddr = le32_to_cpu(iu.raw_blkaddr);
+ break;
+ case S_IFCHR:
+ case S_IFBLK:
+ inode->i_rdev = new_decode_dev(le32_to_cpu(iu.rdev));
+ break;
+ case S_IFIFO:
+ case S_IFSOCK:
+ inode->i_rdev = 0;
+ break;
+ default:
+ erofs_err(sb, "bogus i_mode (%o) @ nid %llu", inode->i_mode,
+ vi->nid);
+ err = -EFSCORRUPTED;
+ goto err_out;
+ }
+
+ /* total blocks for compressed files */
+ if (erofs_inode_is_data_compressed(vi->datalayout)) {
+ nblks = le32_to_cpu(iu.compressed_blocks);
+ } else if (vi->datalayout == EROFS_INODE_CHUNK_BASED) {
+ /* fill chunked inode summary info */
+ vi->chunkformat = le16_to_cpu(iu.c.format);
if (vi->chunkformat & ~EROFS_CHUNK_FORMAT_ALL) {
- erofs_err(inode->i_sb,
- "unsupported chunk format %x of nid %llu",
+ erofs_err(sb, "unsupported chunk format %x of nid %llu",
vi->chunkformat, vi->nid);
err = -EOPNOTSUPP;
goto err_out;
inode->i_blocks = nblks << (sb->s_blocksize_bits - 9);
return kaddr;
-bogusimode:
- erofs_err(inode->i_sb, "bogus i_mode (%o) @ nid %llu",
- inode->i_mode, vi->nid);
- err = -EFSCORRUPTED;
err_out:
DBG_BUGON(1);
kfree(copied);
goto out_unlock;
}
- iocb->ki_pos += status;
ret += status;
endbyte = pos + status - 1;
ret2 = filemap_write_and_wait_range(inode->i_mapping, pos,
return;
}
/*
- * If i_disksize got extended due to writeback of delalloc blocks while
- * the DIO was running we could fail to cleanup the orphan list in
- * ext4_handle_inode_extension(). Do it now.
+ * If i_disksize got extended either due to writeback of delalloc
+ * blocks or extending truncate while the DIO was running we could fail
+ * to cleanup the orphan list in ext4_handle_inode_extension(). Do it
+ * now.
*/
if (!list_empty(&EXT4_I(inode)->i_orphan) && inode->i_nlink) {
handle_t *handle = ext4_journal_start(inode, EXT4_HT_INODE, 2);
* blocks. But the code in ext4_iomap_alloc() is careful to use
* zeroed/unwritten extents if this is possible; thus we won't leave
* uninitialized blocks in a file even if we didn't succeed in writing
- * as much as we intended.
+ * as much as we intended. Also we can race with truncate or write
+ * expanding the file so we have to be a bit careful here.
*/
- WARN_ON_ONCE(i_size_read(inode) < READ_ONCE(EXT4_I(inode)->i_disksize));
- if (pos + size <= READ_ONCE(EXT4_I(inode)->i_disksize))
+ if (pos + size <= READ_ONCE(EXT4_I(inode)->i_disksize) &&
+ pos + size <= i_size_read(inode))
return size;
return ext4_handle_inode_extension(inode, pos, size);
}
start = max(start, rounddown(ac->ac_o_ex.fe_logical,
(ext4_lblk_t)EXT4_BLOCKS_PER_GROUP(ac->ac_sb)));
+ /* avoid unnecessary preallocation that may trigger assertions */
+ if (start + size > EXT_MAX_BLOCKS)
+ size = EXT_MAX_BLOCKS - start;
+
/* don't cover already allocated blocks in selected range */
if (ar->pleft && start <= ar->lleft) {
size -= ar->lleft + 1 - start;
if (fc->dax) {
fuse_free_dax_mem_ranges(&fc->dax->free_ranges);
kfree(fc->dax);
+ fc->dax = NULL;
}
}
if (!ia)
return -ENOMEM;
- if (fopen_direct_io && fc->direct_io_relax) {
+ if (fopen_direct_io && fc->direct_io_allow_mmap) {
res = filemap_write_and_wait_range(mapping, pos, pos + count - 1);
if (res) {
fuse_io_free(ia);
ssize_t res;
bool exclusive_lock =
!(ff->open_flags & FOPEN_PARALLEL_DIRECT_WRITES) ||
+ get_fuse_conn(inode)->direct_io_allow_mmap ||
iocb->ki_flags & IOCB_APPEND ||
fuse_direct_write_extending_i_size(iocb, from);
* Take exclusive lock if
* - Parallel direct writes are disabled - a user space decision
* - Parallel direct writes are enabled and i_size is being extended.
+ * - Shared mmap on direct_io file is supported (FUSE_DIRECT_IO_ALLOW_MMAP).
* This might not be needed at all, but needs further investigation.
*/
if (exclusive_lock)
if (ff->open_flags & FOPEN_DIRECT_IO) {
/* Can't provide the coherency needed for MAP_SHARED
- * if FUSE_DIRECT_IO_RELAX isn't set.
+ * if FUSE_DIRECT_IO_ALLOW_MMAP isn't set.
*/
- if ((vma->vm_flags & VM_MAYSHARE) && !fc->direct_io_relax)
+ if ((vma->vm_flags & VM_MAYSHARE) && !fc->direct_io_allow_mmap)
return -ENODEV;
invalidate_inode_pages2(file->f_mapping);
struct fuse_forget_link *next;
};
+/* Submount lookup tracking */
+struct fuse_submount_lookup {
+ /** Refcount */
+ refcount_t count;
+
+ /** Unique ID, which identifies the inode between userspace
+ * and kernel */
+ u64 nodeid;
+
+ /** The request used for sending the FORGET message */
+ struct fuse_forget_link *forget;
+};
+
/** FUSE inode */
struct fuse_inode {
/** Inode data */
*/
struct fuse_inode_dax *dax;
#endif
+ /** Submount specific lookup tracking */
+ struct fuse_submount_lookup *submount_lookup;
};
/** FUSE inode state bits */
/* Is tmpfile not implemented by fs? */
unsigned int no_tmpfile:1;
- /* relax restrictions in FOPEN_DIRECT_IO mode */
- unsigned int direct_io_relax:1;
+ /* Relax restrictions to allow shared mmap in FOPEN_DIRECT_IO mode */
+ unsigned int direct_io_allow_mmap:1;
/* Is statx not implemented by fs? */
unsigned int no_statx:1;
return kzalloc(sizeof(struct fuse_forget_link), GFP_KERNEL_ACCOUNT);
}
+static struct fuse_submount_lookup *fuse_alloc_submount_lookup(void)
+{
+ struct fuse_submount_lookup *sl;
+
+ sl = kzalloc(sizeof(struct fuse_submount_lookup), GFP_KERNEL_ACCOUNT);
+ if (!sl)
+ return NULL;
+ sl->forget = fuse_alloc_forget();
+ if (!sl->forget)
+ goto out_free;
+
+ return sl;
+
+out_free:
+ kfree(sl);
+ return NULL;
+}
+
static struct inode *fuse_alloc_inode(struct super_block *sb)
{
struct fuse_inode *fi;
fi->attr_version = 0;
fi->orig_ino = 0;
fi->state = 0;
+ fi->submount_lookup = NULL;
mutex_init(&fi->mutex);
spin_lock_init(&fi->lock);
fi->forget = fuse_alloc_forget();
kmem_cache_free(fuse_inode_cachep, fi);
}
+static void fuse_cleanup_submount_lookup(struct fuse_conn *fc,
+ struct fuse_submount_lookup *sl)
+{
+ if (!refcount_dec_and_test(&sl->count))
+ return;
+
+ fuse_queue_forget(fc, sl->forget, sl->nodeid, 1);
+ sl->forget = NULL;
+ kfree(sl);
+}
+
static void fuse_evict_inode(struct inode *inode)
{
struct fuse_inode *fi = get_fuse_inode(inode);
fi->nlookup);
fi->forget = NULL;
}
+
+ if (fi->submount_lookup) {
+ fuse_cleanup_submount_lookup(fc, fi->submount_lookup);
+ fi->submount_lookup = NULL;
+ }
}
if (S_ISREG(inode->i_mode) && !fuse_is_bad(inode)) {
WARN_ON(!list_empty(&fi->write_files));
fuse_dax_dontcache(inode, attr->flags);
}
+static void fuse_init_submount_lookup(struct fuse_submount_lookup *sl,
+ u64 nodeid)
+{
+ sl->nodeid = nodeid;
+ refcount_set(&sl->count, 1);
+}
+
static void fuse_init_inode(struct inode *inode, struct fuse_attr *attr,
struct fuse_conn *fc)
{
*/
if (fc->auto_submounts && (attr->flags & FUSE_ATTR_SUBMOUNT) &&
S_ISDIR(attr->mode)) {
+ struct fuse_inode *fi;
+
inode = new_inode(sb);
if (!inode)
return NULL;
fuse_init_inode(inode, attr, fc);
- get_fuse_inode(inode)->nodeid = nodeid;
+ fi = get_fuse_inode(inode);
+ fi->nodeid = nodeid;
+ fi->submount_lookup = fuse_alloc_submount_lookup();
+ if (!fi->submount_lookup) {
+ iput(inode);
+ return NULL;
+ }
+ /* Sets nlookup = 1 on fi->submount_lookup->nlookup */
+ fuse_init_submount_lookup(fi->submount_lookup, nodeid);
inode->i_flags |= S_AUTOMOUNT;
goto done;
}
iput(inode);
goto retry;
}
-done:
fi = get_fuse_inode(inode);
spin_lock(&fi->lock);
fi->nlookup++;
spin_unlock(&fi->lock);
+done:
fuse_change_attributes(inode, attr, NULL, attr_valid, attr_version);
return inode;
fc->init_security = 1;
if (flags & FUSE_CREATE_SUPP_GROUP)
fc->create_supp_group = 1;
- if (flags & FUSE_DIRECT_IO_RELAX)
- fc->direct_io_relax = 1;
+ if (flags & FUSE_DIRECT_IO_ALLOW_MMAP)
+ fc->direct_io_allow_mmap = 1;
} else {
ra_pages = fc->max_read / PAGE_SIZE;
fc->no_lock = 1;
FUSE_NO_OPENDIR_SUPPORT | FUSE_EXPLICIT_INVAL_DATA |
FUSE_HANDLE_KILLPRIV_V2 | FUSE_SETXATTR_EXT | FUSE_INIT_EXT |
FUSE_SECURITY_CTX | FUSE_CREATE_SUPP_GROUP |
- FUSE_HAS_EXPIRE_ONLY | FUSE_DIRECT_IO_RELAX;
+ FUSE_HAS_EXPIRE_ONLY | FUSE_DIRECT_IO_ALLOW_MMAP;
#ifdef CONFIG_FUSE_DAX
if (fm->fc->dax)
flags |= FUSE_MAP_ALIGNMENT;
struct super_block *parent_sb = parent_fi->inode.i_sb;
struct fuse_attr root_attr;
struct inode *root;
+ struct fuse_submount_lookup *sl;
+ struct fuse_inode *fi;
fuse_sb_defaults(sb);
fm->sb = sb;
* its nlookup should not be incremented. fuse_iget() does
* that, though, so undo it here.
*/
- get_fuse_inode(root)->nlookup--;
+ fi = get_fuse_inode(root);
+ fi->nlookup--;
+
sb->s_d_op = &fuse_dentry_operations;
sb->s_root = d_make_root(root);
if (!sb->s_root)
return -ENOMEM;
+ /*
+ * Grab the parent's submount_lookup pointer and take a
+ * reference on the shared nlookup from the parent. This is to
+ * prevent the last forget for this nodeid from getting
+ * triggered until all users have finished with it.
+ */
+ sl = parent_fi->submount_lookup;
+ WARN_ON(!sl);
+ if (sl) {
+ refcount_inc(&sl->count);
+ fi->submount_lookup = sl;
+ }
+
return 0;
}
lockdep_set_class_and_name(&mapping->invalidate_lock,
&sb->s_type->invalidate_lock_key,
"mapping.invalidate_lock");
+ if (sb->s_iflags & SB_I_STABLE_WRITES)
+ mapping_set_stable_writes(mapping);
inode->i_private = NULL;
inode->i_mapping = mapping;
INIT_HLIST_HEAD(&inode->i_dentry); /* buggered by rcu freeing */
struct commit_header *tmp;
struct buffer_head *bh;
struct timespec64 now;
- blk_opf_t write_flags = REQ_OP_WRITE | REQ_SYNC;
+ blk_opf_t write_flags = REQ_OP_WRITE | JBD2_JOURNAL_REQ_FLAGS;
*cbh = NULL;
if (!ret)
ret = err;
}
+ cond_resched();
spin_lock(&journal->j_list_lock);
jinode->i_flags &= ~JI_COMMIT_RUNNING;
smp_mb();
*/
jbd2_journal_update_sb_log_tail(journal,
journal->j_tail_sequence,
- journal->j_tail,
- REQ_SYNC);
+ journal->j_tail, 0);
mutex_unlock(&journal->j_checkpoint_mutex);
} else {
jbd2_debug(3, "superblock not updated\n");
for (i = 0; i < bufs; i++) {
struct buffer_head *bh = wbuf[i];
+
/*
* Compute checksum.
*/
clear_buffer_dirty(bh);
set_buffer_uptodate(bh);
bh->b_end_io = journal_end_buffer_io_sync;
- submit_bh(REQ_OP_WRITE | REQ_SYNC, bh);
+ submit_bh(REQ_OP_WRITE | JBD2_JOURNAL_REQ_FLAGS,
+ bh);
}
cond_resched();
* space and if we lose sb update during power failure we'd replay
* old transaction with possibly newly overwritten data.
*/
- ret = jbd2_journal_update_sb_log_tail(journal, tid, block,
- REQ_SYNC | REQ_FUA);
+ ret = jbd2_journal_update_sb_log_tail(journal, tid, block, REQ_FUA);
if (ret)
goto out;
*/
jbd2_journal_update_sb_log_tail(journal,
journal->j_tail_sequence,
- journal->j_tail,
- REQ_SYNC | REQ_FUA);
+ journal->j_tail, REQ_FUA);
mutex_unlock(&journal->j_checkpoint_mutex);
}
return jbd2_journal_start_thread(journal);
return -EIO;
}
- trace_jbd2_write_superblock(journal, write_flags);
+ /*
+ * Always set high priority flags to exempt from block layer's
+ * QOS policies, e.g. writeback throttle.
+ */
+ write_flags |= JBD2_JOURNAL_REQ_FLAGS;
if (!(journal->j_flags & JBD2_BARRIER))
write_flags &= ~(REQ_FUA | REQ_PREFLUSH);
+
+ trace_jbd2_write_superblock(journal, write_flags);
+
if (buffer_write_io_error(bh)) {
/*
* Oh, dear. A previous attempt to write the journal
jbd2_debug(1, "JBD2: updating superblock error (errno %d)\n", errcode);
sb->s_errno = cpu_to_be32(errcode);
- jbd2_write_superblock(journal, REQ_SYNC | REQ_FUA);
+ jbd2_write_superblock(journal, REQ_FUA);
}
EXPORT_SYMBOL(jbd2_journal_update_sb_errno);
++journal->j_transaction_sequence;
write_unlock(&journal->j_state_lock);
- jbd2_mark_journal_empty(journal,
- REQ_SYNC | REQ_PREFLUSH | REQ_FUA);
+ jbd2_mark_journal_empty(journal, REQ_PREFLUSH | REQ_FUA);
mutex_unlock(&journal->j_checkpoint_mutex);
} else
err = -EIO;
* the magic code for a fully-recovered superblock. Any future
* commits of data to the journal will restore the current
* s_start value. */
- jbd2_mark_journal_empty(journal, REQ_SYNC | REQ_FUA);
+ jbd2_mark_journal_empty(journal, REQ_FUA);
if (flags)
err = __jbd2_journal_erase(journal, flags);
if (write) {
/* Lock to make assertions happy... */
mutex_lock_io(&journal->j_checkpoint_mutex);
- jbd2_mark_journal_empty(journal, REQ_SYNC | REQ_FUA);
+ jbd2_mark_journal_empty(journal, REQ_FUA);
mutex_unlock(&journal->j_checkpoint_mutex);
}
return -EINVAL;
}
+ /* In this case, ->private_data is protected by f_pos_lock */
+ file->private_data = NULL;
return vfs_setpos(file, offset, U32_MAX);
}
inode->i_ino, fs_umode_to_dtype(inode->i_mode));
}
-static void offset_iterate_dir(struct inode *inode, struct dir_context *ctx)
+static void *offset_iterate_dir(struct inode *inode, struct dir_context *ctx)
{
struct offset_ctx *so_ctx = inode->i_op->get_offset_ctx(inode);
XA_STATE(xas, &so_ctx->xa, ctx->pos);
while (true) {
dentry = offset_find_next(&xas);
if (!dentry)
- break;
+ return ERR_PTR(-ENOENT);
if (!offset_dir_emit(ctx, dentry)) {
dput(dentry);
dput(dentry);
ctx->pos = xas.xa_index + 1;
}
+ return NULL;
}
/**
if (!dir_emit_dots(file, ctx))
return 0;
- offset_iterate_dir(d_inode(dir), ctx);
+ /* In this case, ->private_data is protected by f_pos_lock */
+ if (ctx->pos == 2)
+ file->private_data = NULL;
+ else if (file->private_data == ERR_PTR(-ENOENT))
+ return 0;
+ file->private_data = offset_iterate_dir(d_inode(dir), ctx);
return 0;
}
int i;
int flags = nfsexp_flags(rqstp, exp);
- validate_process_creds();
-
/* discard any old override before preparing the new set */
revert_creds(get_cred(current_real_cred()));
new = prepare_creds();
else
new->cap_effective = cap_raise_nfsd_set(new->cap_effective,
new->cap_permitted);
- validate_process_creds();
put_cred(override_creds(new));
put_cred(new);
- validate_process_creds();
return 0;
oom:
void nfsd_net_reply_cache_destroy(struct nfsd_net *nn);
int nfsd_reply_cache_init(struct nfsd_net *);
void nfsd_reply_cache_shutdown(struct nfsd_net *);
-int nfsd_cache_lookup(struct svc_rqst *rqstp,
- struct nfsd_cacherep **cacherep);
+int nfsd_cache_lookup(struct svc_rqst *rqstp, unsigned int start,
+ unsigned int len, struct nfsd_cacherep **cacherep);
void nfsd_cache_update(struct svc_rqst *rqstp, struct nfsd_cacherep *rp,
int cachetype, __be32 *statp);
int nfsd_reply_cache_stats_show(struct seq_file *m, void *v);
static void encode_bitmap4(struct xdr_stream *xdr, const __u32 *bitmap,
size_t len)
{
- xdr_stream_encode_uint32_array(xdr, bitmap, len);
-}
-
-static int decode_cb_fattr4(struct xdr_stream *xdr, uint32_t *bitmap,
- struct nfs4_cb_fattr *fattr)
-{
- fattr->ncf_cb_change = 0;
- fattr->ncf_cb_fsize = 0;
- if (bitmap[0] & FATTR4_WORD0_CHANGE)
- if (xdr_stream_decode_u64(xdr, &fattr->ncf_cb_change) < 0)
- return -NFSERR_BAD_XDR;
- if (bitmap[0] & FATTR4_WORD0_SIZE)
- if (xdr_stream_decode_u64(xdr, &fattr->ncf_cb_fsize) < 0)
- return -NFSERR_BAD_XDR;
- return 0;
+ WARN_ON_ONCE(xdr_stream_encode_uint32_array(xdr, bitmap, len) < 0);
}
/*
hdr->nops++;
}
-/*
- * CB_GETATTR4args
- * struct CB_GETATTR4args {
- * nfs_fh4 fh;
- * bitmap4 attr_request;
- * };
- *
- * The size and change attributes are the only one
- * guaranteed to be serviced by the client.
- */
-static void
-encode_cb_getattr4args(struct xdr_stream *xdr, struct nfs4_cb_compound_hdr *hdr,
- struct nfs4_cb_fattr *fattr)
-{
- struct nfs4_delegation *dp =
- container_of(fattr, struct nfs4_delegation, dl_cb_fattr);
- struct knfsd_fh *fh = &dp->dl_stid.sc_file->fi_fhandle;
-
- encode_nfs_cb_opnum4(xdr, OP_CB_GETATTR);
- encode_nfs_fh4(xdr, fh);
- encode_bitmap4(xdr, fattr->ncf_cb_bmap, ARRAY_SIZE(fattr->ncf_cb_bmap));
- hdr->nops++;
-}
-
/*
* CB_SEQUENCE4args
*
xdr_reserve_space(xdr, 0);
}
-/*
- * 20.1. Operation 3: CB_GETATTR - Get Attributes
- */
-static void nfs4_xdr_enc_cb_getattr(struct rpc_rqst *req,
- struct xdr_stream *xdr, const void *data)
-{
- const struct nfsd4_callback *cb = data;
- struct nfs4_cb_fattr *ncf =
- container_of(cb, struct nfs4_cb_fattr, ncf_getattr);
- struct nfs4_cb_compound_hdr hdr = {
- .ident = cb->cb_clp->cl_cb_ident,
- .minorversion = cb->cb_clp->cl_minorversion,
- };
-
- encode_cb_compound4args(xdr, &hdr);
- encode_cb_sequence4args(xdr, cb, &hdr);
- encode_cb_getattr4args(xdr, &hdr, ncf);
- encode_cb_nops(&hdr);
-}
-
/*
* 20.2. Operation 4: CB_RECALL - Recall a Delegation
*/
return 0;
}
-/*
- * 20.1. Operation 3: CB_GETATTR - Get Attributes
- */
-static int nfs4_xdr_dec_cb_getattr(struct rpc_rqst *rqstp,
- struct xdr_stream *xdr,
- void *data)
-{
- struct nfsd4_callback *cb = data;
- struct nfs4_cb_compound_hdr hdr;
- int status;
- u32 bitmap[3] = {0};
- u32 attrlen;
- struct nfs4_cb_fattr *ncf =
- container_of(cb, struct nfs4_cb_fattr, ncf_getattr);
-
- status = decode_cb_compound4res(xdr, &hdr);
- if (unlikely(status))
- return status;
-
- status = decode_cb_sequence4res(xdr, cb);
- if (unlikely(status || cb->cb_seq_status))
- return status;
-
- status = decode_cb_op_status(xdr, OP_CB_GETATTR, &cb->cb_status);
- if (status)
- return status;
- if (xdr_stream_decode_uint32_array(xdr, bitmap, 3) < 0)
- return -NFSERR_BAD_XDR;
- if (xdr_stream_decode_u32(xdr, &attrlen) < 0)
- return -NFSERR_BAD_XDR;
- if (attrlen > (sizeof(ncf->ncf_cb_change) + sizeof(ncf->ncf_cb_fsize)))
- return -NFSERR_BAD_XDR;
- status = decode_cb_fattr4(xdr, bitmap, ncf);
- return status;
-}
-
/*
* 20.2. Operation 4: CB_RECALL - Recall a Delegation
*/
PROC(CB_NOTIFY_LOCK, COMPOUND, cb_notify_lock, cb_notify_lock),
PROC(CB_OFFLOAD, COMPOUND, cb_offload, cb_offload),
PROC(CB_RECALL_ANY, COMPOUND, cb_recall_any, cb_recall_any),
- PROC(CB_GETATTR, COMPOUND, cb_getattr, cb_getattr),
};
static unsigned int nfs4_cb_counts[ARRAY_SIZE(nfs4_cb_procedures)];
static const struct nfsd4_callback_ops nfsd4_cb_recall_ops;
static const struct nfsd4_callback_ops nfsd4_cb_notify_lock_ops;
-static const struct nfsd4_callback_ops nfsd4_cb_getattr_ops;
static struct workqueue_struct *laundry_wq;
dp->dl_recalled = false;
nfsd4_init_cb(&dp->dl_recall, dp->dl_stid.sc_client,
&nfsd4_cb_recall_ops, NFSPROC4_CLNT_CB_RECALL);
- nfsd4_init_cb(&dp->dl_cb_fattr.ncf_getattr, dp->dl_stid.sc_client,
- &nfsd4_cb_getattr_ops, NFSPROC4_CLNT_CB_GETATTR);
- dp->dl_cb_fattr.ncf_file_modified = false;
- dp->dl_cb_fattr.ncf_cb_bmap[0] = FATTR4_WORD0_CHANGE | FATTR4_WORD0_SIZE;
get_nfs4_file(fp);
dp->dl_stid.sc_file = fp;
return dp;
/* XXX: alternatively, we could get/drop in seq start/stop */
drop_client(clp);
- return 0;
+ return seq_release(inode, file);
}
static const struct file_operations client_states_fops = {
spin_unlock(&nn->client_lock);
}
-static int
-nfsd4_cb_getattr_done(struct nfsd4_callback *cb, struct rpc_task *task)
-{
- struct nfs4_cb_fattr *ncf =
- container_of(cb, struct nfs4_cb_fattr, ncf_getattr);
-
- ncf->ncf_cb_status = task->tk_status;
- switch (task->tk_status) {
- case -NFS4ERR_DELAY:
- rpc_delay(task, 2 * HZ);
- return 0;
- default:
- return 1;
- }
-}
-
-static void
-nfsd4_cb_getattr_release(struct nfsd4_callback *cb)
-{
- struct nfs4_cb_fattr *ncf =
- container_of(cb, struct nfs4_cb_fattr, ncf_getattr);
- struct nfs4_delegation *dp =
- container_of(ncf, struct nfs4_delegation, dl_cb_fattr);
-
- nfs4_put_stid(&dp->dl_stid);
- clear_bit(CB_GETATTR_BUSY, &ncf->ncf_cb_flags);
- wake_up_bit(&ncf->ncf_cb_flags, CB_GETATTR_BUSY);
-}
-
static const struct nfsd4_callback_ops nfsd4_cb_recall_any_ops = {
.done = nfsd4_cb_recall_any_done,
.release = nfsd4_cb_recall_any_release,
};
-static const struct nfsd4_callback_ops nfsd4_cb_getattr_ops = {
- .done = nfsd4_cb_getattr_done,
- .release = nfsd4_cb_getattr_release,
-};
-
-void nfs4_cb_getattr(struct nfs4_cb_fattr *ncf)
-{
- struct nfs4_delegation *dp =
- container_of(ncf, struct nfs4_delegation, dl_cb_fattr);
-
- if (test_and_set_bit(CB_GETATTR_BUSY, &ncf->ncf_cb_flags))
- return;
- refcount_inc(&dp->dl_stid.sc_count);
- nfsd4_run_cb(&ncf->ncf_getattr);
-}
-
static struct nfs4_client *create_client(struct xdr_netobj name,
struct svc_rqst *rqstp, nfs4_verifier *verf)
{
struct svc_fh *parent = NULL;
int cb_up;
int status = 0;
- struct kstat stat;
- struct path path;
cb_up = nfsd4_cb_channel_good(oo->oo_owner.so_client);
open->op_recall = false;
if (open->op_share_access & NFS4_SHARE_ACCESS_WRITE) {
open->op_delegate_type = NFS4_OPEN_DELEGATE_WRITE;
trace_nfsd_deleg_write(&dp->dl_stid.sc_stateid);
- path.mnt = currentfh->fh_export->ex_path.mnt;
- path.dentry = currentfh->fh_dentry;
- if (vfs_getattr(&path, &stat,
- (STATX_SIZE | STATX_CTIME | STATX_CHANGE_COOKIE),
- AT_STATX_SYNC_AS_STAT)) {
- nfs4_put_stid(&dp->dl_stid);
- destroy_delegation(dp);
- goto out_no_deleg;
- }
- dp->dl_cb_fattr.ncf_cur_fsize = stat.size;
- dp->dl_cb_fattr.ncf_initial_cinfo =
- nfsd4_change_attribute(&stat, d_inode(currentfh->fh_dentry));
} else {
open->op_delegate_type = NFS4_OPEN_DELEGATE_READ;
trace_nfsd_deleg_read(&dp->dl_stid.sc_stateid);
* nfsd4_deleg_getattr_conflict - Recall if GETATTR causes conflict
* @rqstp: RPC transaction context
* @inode: file to be checked for a conflict
- * @modified: return true if file was modified
- * @size: new size of file if modified is true
*
* This function is called when there is a conflict between a write
* delegation and a change/size GETATTR from another client. The server
* delegation before replying to the GETATTR. See RFC 8881 section
* 18.7.4.
*
+ * The current implementation does not support CB_GETATTR yet. However
+ * this can avoid recalling the delegation could be added in follow up
+ * work.
+ *
* Returns 0 if there is no conflict; otherwise an nfs_stat
* code is returned.
*/
__be32
-nfsd4_deleg_getattr_conflict(struct svc_rqst *rqstp, struct inode *inode,
- bool *modified, u64 *size)
+nfsd4_deleg_getattr_conflict(struct svc_rqst *rqstp, struct inode *inode)
{
+ __be32 status;
struct file_lock_context *ctx;
- struct nfs4_delegation *dp;
- struct nfs4_cb_fattr *ncf;
struct file_lock *fl;
- struct iattr attrs;
- __be32 status;
-
- might_sleep();
+ struct nfs4_delegation *dp;
- *modified = false;
ctx = locks_inode_context(inode);
if (!ctx)
return 0;
break_lease:
spin_unlock(&ctx->flc_lock);
nfsd_stats_wdeleg_getattr_inc();
-
- dp = fl->fl_owner;
- ncf = &dp->dl_cb_fattr;
- nfs4_cb_getattr(&dp->dl_cb_fattr);
- wait_on_bit(&ncf->ncf_cb_flags, CB_GETATTR_BUSY, TASK_INTERRUPTIBLE);
- if (ncf->ncf_cb_status) {
- status = nfserrno(nfsd_open_break_lease(inode, NFSD_MAY_READ));
- if (status != nfserr_jukebox ||
- !nfsd_wait_for_delegreturn(rqstp, inode))
- return status;
- }
- if (!ncf->ncf_file_modified &&
- (ncf->ncf_initial_cinfo != ncf->ncf_cb_change ||
- ncf->ncf_cur_fsize != ncf->ncf_cb_fsize))
- ncf->ncf_file_modified = true;
- if (ncf->ncf_file_modified) {
- /*
- * The server would not update the file's metadata
- * with the client's modified size.
- */
- attrs.ia_mtime = attrs.ia_ctime = current_time(inode);
- attrs.ia_valid = ATTR_MTIME | ATTR_CTIME;
- setattr_copy(&nop_mnt_idmap, inode, &attrs);
- mark_inode_dirty(inode);
- ncf->ncf_cur_fsize = ncf->ncf_cb_fsize;
- *size = ncf->ncf_cur_fsize;
- *modified = true;
- }
+ status = nfserrno(nfsd_open_break_lease(inode, NFSD_MAY_READ));
+ if (status != nfserr_jukebox ||
+ !nfsd_wait_for_delegreturn(rqstp, inode))
+ return status;
return 0;
}
break;
u32 attrmask[3];
unsigned long mask[2];
} u;
- bool file_modified;
unsigned long bit;
- u64 size = 0;
WARN_ON_ONCE(bmval[1] & NFSD_WRITEONLY_ATTRS_WORD1);
WARN_ON_ONCE(!nfsd_attrs_supported(minorversion, bmval));
}
args.size = 0;
if (u.attrmask[0] & (FATTR4_WORD0_CHANGE | FATTR4_WORD0_SIZE)) {
- status = nfsd4_deleg_getattr_conflict(rqstp, d_inode(dentry),
- &file_modified, &size);
+ status = nfsd4_deleg_getattr_conflict(rqstp, d_inode(dentry));
if (status)
goto out;
}
AT_STATX_SYNC_AS_STAT);
if (err)
goto out_nfserr;
- args.size = file_modified ? size : args.stat.size;
+ args.size = args.stat.size;
if (!(args.stat.result_mask & STATX_BTIME))
/* underlying FS does not offer btime so we can't share it */
return freed;
}
-/*
- * Walk an xdr_buf and get a CRC for at most the first RC_CSUMLEN bytes
+/**
+ * nfsd_cache_csum - Checksum incoming NFS Call arguments
+ * @buf: buffer containing a whole RPC Call message
+ * @start: starting byte of the NFS Call header
+ * @remaining: size of the NFS Call header, in bytes
+ *
+ * Compute a weak checksum of the leading bytes of an NFS procedure
+ * call header to help verify that a retransmitted Call matches an
+ * entry in the duplicate reply cache.
+ *
+ * To avoid assumptions about how the RPC message is laid out in
+ * @buf and what else it might contain (eg, a GSS MIC suffix), the
+ * caller passes us the exact location and length of the NFS Call
+ * header.
+ *
+ * Returns a 32-bit checksum value, as defined in RFC 793.
*/
-static __wsum
-nfsd_cache_csum(struct svc_rqst *rqstp)
+static __wsum nfsd_cache_csum(struct xdr_buf *buf, unsigned int start,
+ unsigned int remaining)
{
+ unsigned int base, len;
+ struct xdr_buf subbuf;
+ __wsum csum = 0;
+ void *p;
int idx;
- unsigned int base;
- __wsum csum;
- struct xdr_buf *buf = &rqstp->rq_arg;
- const unsigned char *p = buf->head[0].iov_base;
- size_t csum_len = min_t(size_t, buf->head[0].iov_len + buf->page_len,
- RC_CSUMLEN);
- size_t len = min(buf->head[0].iov_len, csum_len);
+
+ if (remaining > RC_CSUMLEN)
+ remaining = RC_CSUMLEN;
+ if (xdr_buf_subsegment(buf, &subbuf, start, remaining))
+ return csum;
/* rq_arg.head first */
- csum = csum_partial(p, len, 0);
- csum_len -= len;
+ if (subbuf.head[0].iov_len) {
+ len = min_t(unsigned int, subbuf.head[0].iov_len, remaining);
+ csum = csum_partial(subbuf.head[0].iov_base, len, csum);
+ remaining -= len;
+ }
/* Continue into page array */
- idx = buf->page_base / PAGE_SIZE;
- base = buf->page_base & ~PAGE_MASK;
- while (csum_len) {
- p = page_address(buf->pages[idx]) + base;
- len = min_t(size_t, PAGE_SIZE - base, csum_len);
+ idx = subbuf.page_base / PAGE_SIZE;
+ base = subbuf.page_base & ~PAGE_MASK;
+ while (remaining) {
+ p = page_address(subbuf.pages[idx]) + base;
+ len = min_t(unsigned int, PAGE_SIZE - base, remaining);
csum = csum_partial(p, len, csum);
- csum_len -= len;
+ remaining -= len;
base = 0;
++idx;
}
/**
* nfsd_cache_lookup - Find an entry in the duplicate reply cache
* @rqstp: Incoming Call to find
+ * @start: starting byte in @rqstp->rq_arg of the NFS Call header
+ * @len: size of the NFS Call header, in bytes
* @cacherep: OUT: DRC entry for this request
*
* Try to find an entry matching the current call in the cache. When none
* %RC_REPLY: Reply from cache
* %RC_DROPIT: Do not process the request further
*/
-int nfsd_cache_lookup(struct svc_rqst *rqstp, struct nfsd_cacherep **cacherep)
+int nfsd_cache_lookup(struct svc_rqst *rqstp, unsigned int start,
+ unsigned int len, struct nfsd_cacherep **cacherep)
{
struct nfsd_net *nn;
struct nfsd_cacherep *rp, *found;
goto out;
}
- csum = nfsd_cache_csum(rqstp);
+ csum = nfsd_cache_csum(&rqstp->rq_arg, start, len);
/*
* Since the common case is a cache miss followed by an insert,
return;
}
-/*
- * Copy cached reply to current reply buffer. Should always fit.
- * FIXME as reply is in a page, we should just attach the page, and
- * keep a refcount....
- */
static int
nfsd_cache_append(struct svc_rqst *rqstp, struct kvec *data)
{
- struct kvec *vec = &rqstp->rq_res.head[0];
-
- if (vec->iov_len + data->iov_len > PAGE_SIZE) {
- printk(KERN_WARNING "nfsd: cached reply too large (%zd).\n",
- data->iov_len);
- return 0;
- }
- memcpy((char*)vec->iov_base + vec->iov_len, data->iov_base, data->iov_len);
- vec->iov_len += data->iov_len;
- return 1;
+ __be32 *p;
+
+ p = xdr_reserve_space(&rqstp->rq_res_stream, data->iov_len);
+ if (unlikely(!p))
+ return false;
+ memcpy(p, data->iov_base, data->iov_len);
+ xdr_commit_encode(&rqstp->rq_res_stream);
+ return true;
}
/*
err = svc_addsock(nn->nfsd_serv, net, fd, buf, SIMPLE_TRANSACTION_LIMIT, cred);
- if (err >= 0 &&
- !nn->nfsd_serv->sv_nrthreads && !xchg(&nn->keep_active, 1))
+ if (err < 0 && !nn->nfsd_serv->sv_nrthreads && !nn->keep_active)
+ nfsd_last_thread(net);
+ else if (err >= 0 &&
+ !nn->nfsd_serv->sv_nrthreads && !xchg(&nn->keep_active, 1))
svc_get(nn->nfsd_serv);
nfsd_put(net);
svc_xprt_put(xprt);
}
out_err:
+ if (!nn->nfsd_serv->sv_nrthreads && !nn->keep_active)
+ nfsd_last_thread(net);
+
nfsd_put(net);
return err;
}
int ret = -ENODEV;
mutex_lock(&nfsd_mutex);
- if (nn->nfsd_serv) {
- svc_get(nn->nfsd_serv);
+ if (nn->nfsd_serv)
ret = 0;
- }
- mutex_unlock(&nfsd_mutex);
+ else
+ mutex_unlock(&nfsd_mutex);
return ret;
}
*/
int nfsd_nl_rpc_status_get_done(struct netlink_callback *cb)
{
- mutex_lock(&nfsd_mutex);
- nfsd_put(sock_net(cb->skb->sk));
mutex_unlock(&nfsd_mutex);
return 0;
int nfsd_minorversion(struct nfsd_net *nn, u32 minorversion, enum vers_op change);
void nfsd_reset_versions(struct nfsd_net *nn);
int nfsd_create_serv(struct net *net);
+void nfsd_last_thread(struct net *net);
extern int nfsd_max_blksize;
/* Only used under nfsd_mutex, so this atomic may be overkill: */
static atomic_t nfsd_notifier_refcount = ATOMIC_INIT(0);
-static void nfsd_last_thread(struct net *net)
+void nfsd_last_thread(struct net *net)
{
struct nfsd_net *nn = net_generic(net, nfsd_net_id);
struct svc_serv *serv = nn->nfsd_serv;
rqstp->rq_server->sv_maxconn = nn->max_connections;
svc_recv(rqstp);
- validate_process_creds();
}
atomic_dec(&nfsdstats.th_cnt);
const struct svc_procedure *proc = rqstp->rq_procinfo;
__be32 *statp = rqstp->rq_accept_statp;
struct nfsd_cacherep *rp;
+ unsigned int start, len;
+ __be32 *nfs_reply;
/*
* Give the xdr decoder a chance to change this if it wants
*/
rqstp->rq_cachetype = proc->pc_cachetype;
+ /*
+ * ->pc_decode advances the argument stream past the NFS
+ * Call header, so grab the header's starting location and
+ * size now for the call to nfsd_cache_lookup().
+ */
+ start = xdr_stream_pos(&rqstp->rq_arg_stream);
+ len = xdr_stream_remaining(&rqstp->rq_arg_stream);
if (!proc->pc_decode(rqstp, &rqstp->rq_arg_stream))
goto out_decode_err;
smp_store_release(&rqstp->rq_status_counter, rqstp->rq_status_counter | 1);
rp = NULL;
- switch (nfsd_cache_lookup(rqstp, &rp)) {
+ switch (nfsd_cache_lookup(rqstp, start, len, &rp)) {
case RC_DOIT:
break;
case RC_REPLY:
goto out_dropit;
}
+ nfs_reply = xdr_inline_decode(&rqstp->rq_res_stream, 0);
*statp = proc->pc_func(rqstp);
if (test_bit(RQ_DROPME, &rqstp->rq_flags))
goto out_update_drop;
*/
smp_store_release(&rqstp->rq_status_counter, rqstp->rq_status_counter + 1);
- nfsd_cache_update(rqstp, rp, rqstp->rq_cachetype, statp + 1);
+ nfsd_cache_update(rqstp, rp, rqstp->rq_cachetype, nfs_reply);
out_cached_reply:
return 1;
time64_t cpntf_time; /* last time stateid used */
};
-struct nfs4_cb_fattr {
- struct nfsd4_callback ncf_getattr;
- u32 ncf_cb_status;
- u32 ncf_cb_bmap[1];
-
- /* from CB_GETATTR reply */
- u64 ncf_cb_change;
- u64 ncf_cb_fsize;
-
- unsigned long ncf_cb_flags;
- bool ncf_file_modified;
- u64 ncf_initial_cinfo;
- u64 ncf_cur_fsize;
-};
-
-/* bits for ncf_cb_flags */
-#define CB_GETATTR_BUSY 0
-
/*
* Represents a delegation stateid. The nfs4_client holds references to these
* and they are put when it is being destroyed or when the delegation is
int dl_retries;
struct nfsd4_callback dl_recall;
bool dl_recalled;
-
- /* for CB_GETATTR */
- struct nfs4_cb_fattr dl_cb_fattr;
};
#define cb_to_delegation(cb) \
NFSPROC4_CLNT_CB_SEQUENCE,
NFSPROC4_CLNT_CB_NOTIFY_LOCK,
NFSPROC4_CLNT_CB_RECALL_ANY,
- NFSPROC4_CLNT_CB_GETATTR,
};
/* Returns true iff a is later than b: */
}
extern __be32 nfsd4_deleg_getattr_conflict(struct svc_rqst *rqstp,
- struct inode *inode, bool *file_modified, u64 *size);
-extern void nfs4_cb_getattr(struct nfs4_cb_fattr *ncf);
+ struct inode *inode);
#endif /* NFSD4_STATE_H */
int host_err;
bool retried = false;
- validate_process_creds();
/*
* If we get here, then the client has already done an "open",
* and (hopefully) checked permission - so allow OWNER_OVERRIDE
}
err = nfserrno(host_err);
}
- validate_process_creds();
return err;
}
nfsd_open_verified(struct svc_rqst *rqstp, struct svc_fh *fhp, int may_flags,
struct file **filp)
{
- int err;
-
- validate_process_creds();
- err = __nfsd_open(rqstp, fhp, S_IFREG, may_flags, filp);
- validate_process_creds();
- return err;
+ return __nfsd_open(rqstp, fhp, S_IFREG, may_flags, filp);
}
/*
#define NFS4_dec_cb_recall_any_sz (cb_compound_dec_hdr_sz + \
cb_sequence_dec_sz + \
op_dec_sz)
-
-/*
- * 1: CB_GETATTR opcode (32-bit)
- * N: file_handle
- * 1: number of entry in attribute array (32-bit)
- * 1: entry 0 in attribute array (32-bit)
- */
-#define NFS4_enc_cb_getattr_sz (cb_compound_enc_hdr_sz + \
- cb_sequence_enc_sz + \
- 1 + enc_nfs4_fh_sz + 1 + 1)
-/*
- * 4: fattr_bitmap_maxsz
- * 1: attribute array len
- * 2: change attr (64-bit)
- * 2: size (64-bit)
- */
-#define NFS4_dec_cb_getattr_sz (cb_compound_dec_hdr_sz + \
- cb_sequence_dec_sz + 4 + 1 + 2 + 2 + op_dec_sz)
down_write(&NILFS_MDT(sufile)->mi_sem);
ret = nilfs_sufile_get_segment_usage_block(sufile, segnum, 0, &bh);
- if (!ret) {
- mark_buffer_dirty(bh);
- nilfs_mdt_mark_dirty(sufile);
- kaddr = kmap_atomic(bh->b_page);
- su = nilfs_sufile_block_get_segment_usage(sufile, segnum, bh, kaddr);
+ if (ret)
+ goto out_sem;
+
+ kaddr = kmap_atomic(bh->b_page);
+ su = nilfs_sufile_block_get_segment_usage(sufile, segnum, bh, kaddr);
+ if (unlikely(nilfs_segment_usage_error(su))) {
+ struct the_nilfs *nilfs = sufile->i_sb->s_fs_info;
+
+ kunmap_atomic(kaddr);
+ brelse(bh);
+ if (nilfs_segment_is_active(nilfs, segnum)) {
+ nilfs_error(sufile->i_sb,
+ "active segment %llu is erroneous",
+ (unsigned long long)segnum);
+ } else {
+ /*
+ * Segments marked erroneous are never allocated by
+ * nilfs_sufile_alloc(); only active segments, ie,
+ * the segments indexed by ns_segnum or ns_nextnum,
+ * can be erroneous here.
+ */
+ WARN_ON_ONCE(1);
+ }
+ ret = -EIO;
+ } else {
nilfs_segment_usage_set_dirty(su);
kunmap_atomic(kaddr);
+ mark_buffer_dirty(bh);
+ nilfs_mdt_mark_dirty(sufile);
brelse(bh);
}
+out_sem:
up_write(&NILFS_MDT(sufile)->mi_sem);
return ret;
}
kaddr = kmap_atomic(bh->b_page);
su = nilfs_sufile_block_get_segment_usage(sufile, segnum, bh, kaddr);
- WARN_ON(nilfs_segment_usage_error(su));
- if (modtime)
+ if (modtime) {
+ /*
+ * Check segusage error and set su_lastmod only when updating
+ * this entry with a valid timestamp, not for cancellation.
+ */
+ WARN_ON_ONCE(nilfs_segment_usage_error(su));
su->su_lastmod = cpu_to_le64(modtime);
+ }
su->su_nblocks = cpu_to_le32(nblocks);
kunmap_atomic(kaddr);
goto failed_sbh;
}
nilfs_release_super_block(nilfs);
- sb_set_blocksize(sb, blocksize);
+ if (!sb_set_blocksize(sb, blocksize)) {
+ nilfs_err(sb, "bad blocksize %d", blocksize);
+ err = -EINVAL;
+ goto out;
+ }
err = nilfs_load_super_block(nilfs, sb, blocksize, &sbp);
if (err)
int error;
struct file *f;
- validate_creds(cred);
-
/* We must always pass in a valid mount pointer. */
BUG_ON(!path->mnt);
struct file *f;
int error;
- validate_creds(cred);
f = alloc_empty_file(flags, cred);
if (IS_ERR(f))
return f;
path.dentry = temp;
err = ovl_copy_up_data(c, &path);
/*
- * We cannot hold lock_rename() throughout this helper, because or
+ * We cannot hold lock_rename() throughout this helper, because of
* lock ordering with sb_writers, which shouldn't be held when calling
* ovl_copy_up_data(), so lock workdir and destdir and make sure that
* temp wasn't moved before copy up completion or cleanup.
- * If temp was moved, abort without the cleanup.
*/
ovl_start_write(c->dentry);
if (lock_rename(c->workdir, c->destdir) != NULL ||
temp->d_parent != c->workdir) {
+ /* temp or workdir moved underneath us? abort without cleanup */
+ dput(temp);
err = -EIO;
goto unlock;
} else if (err) {
type = ovl_path_real(dentry, &realpath);
old_cred = ovl_override_creds(dentry->d_sb);
- err = vfs_getattr(&realpath, stat, request_mask, flags);
+ err = ovl_do_getattr(&realpath, stat, request_mask, flags);
if (err)
goto out;
(!is_dir ? STATX_NLINK : 0);
ovl_path_lower(dentry, &realpath);
- err = vfs_getattr(&realpath, &lowerstat,
- lowermask, flags);
+ err = ovl_do_getattr(&realpath, &lowerstat, lowermask,
+ flags);
if (err)
goto out;
ovl_path_lowerdata(dentry, &realpath);
if (realpath.dentry) {
- err = vfs_getattr(&realpath, &lowerdatastat,
- lowermask, flags);
+ err = ovl_do_getattr(&realpath, &lowerdatastat,
+ lowermask, flags);
if (err)
goto out;
} else {
return ((OPEN_FMODE(flags) & FMODE_WRITE) || (flags & O_TRUNC));
}
+static inline int ovl_do_getattr(const struct path *path, struct kstat *stat,
+ u32 request_mask, unsigned int flags)
+{
+ if (flags & AT_GETATTR_NOSEC)
+ return vfs_getattr_nosec(path, stat, request_mask, flags);
+ return vfs_getattr(path, stat, request_mask, flags);
+}
+
/* util.c */
int ovl_get_write_access(struct dentry *dentry);
void ovl_put_write_access(struct dentry *dentry);
struct ovl_fs_context *ctx = fc->fs_private;
struct ovl_fs_context_layer *l;
char *dup = NULL, *iter;
- ssize_t nr_lower = 0, nr = 0, nr_data = 0;
+ ssize_t nr_lower, nr;
bool data_layer = false;
/*
iter = dup;
l = ctx->lower;
for (nr = 0; nr < nr_lower; nr++, l++) {
+ ctx->nr++;
memset(l, 0, sizeof(*l));
err = ovl_mount_dir(iter, &l->path);
goto out_put;
if (data_layer)
- nr_data++;
+ ctx->nr_data++;
/* Calling strchr() again would overrun. */
- if ((nr + 1) == nr_lower)
+ if (ctx->nr == nr_lower)
break;
err = -EINVAL;
* This is a regular layer so we require that
* there are no data layers.
*/
- if ((ctx->nr_data + nr_data) > 0) {
+ if (ctx->nr_data > 0) {
pr_err("regular lower layers cannot follow data lower layers");
goto out_put;
}
data_layer = true;
iter++;
}
- ctx->nr = nr_lower;
- ctx->nr_data += nr_data;
kfree(dup);
return 0;
return 0;
}
-/**
+/*
* Caller must hold a reference to inode to prevent it from being freed while
* it is marked inuse.
*/
struct pagemap_scan_private *p = walk->private;
struct vm_area_struct *vma = walk->vma;
unsigned long vma_category = 0;
+ bool wp_allowed = userfaultfd_wp_async(vma) &&
+ userfaultfd_wp_use_markers(vma);
- if (userfaultfd_wp_async(vma) && userfaultfd_wp_use_markers(vma))
- vma_category |= PAGE_IS_WPALLOWED;
- else if (p->arg.flags & PM_SCAN_CHECK_WPASYNC)
- return -EPERM;
+ if (!wp_allowed) {
+ /* User requested explicit failure over wp-async capability */
+ if (p->arg.flags & PM_SCAN_CHECK_WPASYNC)
+ return -EPERM;
+ /*
+ * User requires wr-protect, and allows silently skipping
+ * unsupported vmas.
+ */
+ if (p->arg.flags & PM_SCAN_WP_MATCHING)
+ return 1;
+ /*
+ * Then the request doesn't involve wr-protects at all,
+ * fall through to the rest checks, and allow vma walk.
+ */
+ }
if (vma->vm_flags & VM_PFNMAP)
return 1;
+ if (wp_allowed)
+ vma_category |= PAGE_IS_WPALLOWED;
+
if (!pagemap_scan_is_interesting_vma(vma_category, p))
return 1;
return 0;
}
- if (!p->vec_out) {
+ if ((p->arg.flags & PM_SCAN_WP_MATCHING) && !p->vec_out) {
/* Fast path for performing exclusive WP */
for (addr = start; addr != end; pte++, addr += PAGE_SIZE) {
if (pte_uffd_wp(ptep_get(pte)))
oparms.fid->mid = le64_to_cpu(o_rsp->hdr.MessageId);
#endif /* CIFS_DEBUG2 */
- rc = -EINVAL;
+
if (o_rsp->OplockLevel != SMB2_OPLOCK_LEVEL_LEASE) {
+ spin_unlock(&cfids->cfid_list_lock);
+ rc = -EINVAL;
+ goto oshr_free;
+ }
+
+ rc = smb2_parse_contexts(server, rsp_iov,
+ &oparms.fid->epoch,
+ oparms.fid->lease_key,
+ &oplock, NULL, NULL);
+ if (rc) {
spin_unlock(&cfids->cfid_list_lock);
goto oshr_free;
}
- smb2_parse_contexts(server, o_rsp,
- &oparms.fid->epoch,
- oparms.fid->lease_key, &oplock,
- NULL, NULL);
+ rc = -EINVAL;
if (!(oplock & SMB2_LEASE_READ_CACHING_HE)) {
spin_unlock(&cfids->cfid_list_lock);
goto oshr_free;
#ifdef CONFIG_CIFS_DEBUG2
struct smb_hdr *smb = buf;
- cifs_dbg(VFS, "Cmd: %d Err: 0x%x Flags: 0x%x Flgs2: 0x%x Mid: %d Pid: %d\n",
- smb->Command, smb->Status.CifsError,
- smb->Flags, smb->Flags2, smb->Mid, smb->Pid);
- cifs_dbg(VFS, "smb buf %p len %u\n", smb,
- server->ops->calc_smb_size(smb));
+ cifs_dbg(VFS, "Cmd: %d Err: 0x%x Flags: 0x%x Flgs2: 0x%x Mid: %d Pid: %d Wct: %d\n",
+ smb->Command, smb->Status.CifsError, smb->Flags,
+ smb->Flags2, smb->Mid, smb->Pid, smb->WordCount);
+ if (!server->ops->check_message(buf, server->total_read, server)) {
+ cifs_dbg(VFS, "smb buf %p len %u\n", smb,
+ server->ops->calc_smb_size(smb));
+ }
#endif /* CONFIG_CIFS_DEBUG2 */
}
* strlen(";sec=ntlmsspi") */
#define MAX_MECH_STR_LEN 13
-/* strlen of "host=" */
-#define HOST_KEY_LEN 5
+/* strlen of ";host=" */
+#define HOST_KEY_LEN 6
/* strlen of ";ip4=" or ";ip6=" */
#define IP_KEY_LEN 5
.listxattr = cifs_listxattr,
};
+/*
+ * Advance the EOF marker to after the source range.
+ */
+static int cifs_precopy_set_eof(struct inode *src_inode, struct cifsInodeInfo *src_cifsi,
+ struct cifs_tcon *src_tcon,
+ unsigned int xid, loff_t src_end)
+{
+ struct cifsFileInfo *writeable_srcfile;
+ int rc = -EINVAL;
+
+ writeable_srcfile = find_writable_file(src_cifsi, FIND_WR_FSUID_ONLY);
+ if (writeable_srcfile) {
+ if (src_tcon->ses->server->ops->set_file_size)
+ rc = src_tcon->ses->server->ops->set_file_size(
+ xid, src_tcon, writeable_srcfile,
+ src_inode->i_size, true /* no need to set sparse */);
+ else
+ rc = -ENOSYS;
+ cifsFileInfo_put(writeable_srcfile);
+ cifs_dbg(FYI, "SetFSize for copychunk rc = %d\n", rc);
+ }
+
+ if (rc < 0)
+ goto set_failed;
+
+ netfs_resize_file(&src_cifsi->netfs, src_end);
+ fscache_resize_cookie(cifs_inode_cookie(src_inode), src_end);
+ return 0;
+
+set_failed:
+ return filemap_write_and_wait(src_inode->i_mapping);
+}
+
+/*
+ * Flush out either the folio that overlaps the beginning of a range in which
+ * pos resides or the folio that overlaps the end of a range unless that folio
+ * is entirely within the range we're going to invalidate. We extend the flush
+ * bounds to encompass the folio.
+ */
+static int cifs_flush_folio(struct inode *inode, loff_t pos, loff_t *_fstart, loff_t *_fend,
+ bool first)
+{
+ struct folio *folio;
+ unsigned long long fpos, fend;
+ pgoff_t index = pos / PAGE_SIZE;
+ size_t size;
+ int rc = 0;
+
+ folio = filemap_get_folio(inode->i_mapping, index);
+ if (IS_ERR(folio))
+ return 0;
+
+ size = folio_size(folio);
+ fpos = folio_pos(folio);
+ fend = fpos + size - 1;
+ *_fstart = min_t(unsigned long long, *_fstart, fpos);
+ *_fend = max_t(unsigned long long, *_fend, fend);
+ if ((first && pos == fpos) || (!first && pos == fend))
+ goto out;
+
+ rc = filemap_write_and_wait_range(inode->i_mapping, fpos, fend);
+out:
+ folio_put(folio);
+ return rc;
+}
+
static loff_t cifs_remap_file_range(struct file *src_file, loff_t off,
struct file *dst_file, loff_t destoff, loff_t len,
unsigned int remap_flags)
{
struct inode *src_inode = file_inode(src_file);
struct inode *target_inode = file_inode(dst_file);
+ struct cifsInodeInfo *src_cifsi = CIFS_I(src_inode);
+ struct cifsInodeInfo *target_cifsi = CIFS_I(target_inode);
struct cifsFileInfo *smb_file_src = src_file->private_data;
- struct cifsFileInfo *smb_file_target;
- struct cifs_tcon *target_tcon;
+ struct cifsFileInfo *smb_file_target = dst_file->private_data;
+ struct cifs_tcon *target_tcon, *src_tcon;
+ unsigned long long destend, fstart, fend, new_size;
unsigned int xid;
int rc;
- if (remap_flags & ~(REMAP_FILE_DEDUP | REMAP_FILE_ADVISORY))
+ if (remap_flags & REMAP_FILE_DEDUP)
+ return -EOPNOTSUPP;
+ if (remap_flags & ~REMAP_FILE_ADVISORY)
return -EINVAL;
cifs_dbg(FYI, "clone range\n");
xid = get_xid();
- if (!src_file->private_data || !dst_file->private_data) {
+ if (!smb_file_src || !smb_file_target) {
rc = -EBADF;
cifs_dbg(VFS, "missing cifsFileInfo on copy range src file\n");
goto out;
}
- smb_file_target = dst_file->private_data;
+ src_tcon = tlink_tcon(smb_file_src->tlink);
target_tcon = tlink_tcon(smb_file_target->tlink);
/*
if (len == 0)
len = src_inode->i_size - off;
- cifs_dbg(FYI, "about to flush pages\n");
- /* should we flush first and last page first */
- truncate_inode_pages_range(&target_inode->i_data, destoff,
- PAGE_ALIGN(destoff + len)-1);
+ cifs_dbg(FYI, "clone range\n");
- if (target_tcon->ses->server->ops->duplicate_extents)
+ /* Flush the source buffer */
+ rc = filemap_write_and_wait_range(src_inode->i_mapping, off,
+ off + len - 1);
+ if (rc)
+ goto unlock;
+
+ /* The server-side copy will fail if the source crosses the EOF marker.
+ * Advance the EOF marker after the flush above to the end of the range
+ * if it's short of that.
+ */
+ if (src_cifsi->netfs.remote_i_size < off + len) {
+ rc = cifs_precopy_set_eof(src_inode, src_cifsi, src_tcon, xid, off + len);
+ if (rc < 0)
+ goto unlock;
+ }
+
+ new_size = destoff + len;
+ destend = destoff + len - 1;
+
+ /* Flush the folios at either end of the destination range to prevent
+ * accidental loss of dirty data outside of the range.
+ */
+ fstart = destoff;
+ fend = destend;
+
+ rc = cifs_flush_folio(target_inode, destoff, &fstart, &fend, true);
+ if (rc)
+ goto unlock;
+ rc = cifs_flush_folio(target_inode, destend, &fstart, &fend, false);
+ if (rc)
+ goto unlock;
+
+ /* Discard all the folios that overlap the destination region. */
+ cifs_dbg(FYI, "about to discard pages %llx-%llx\n", fstart, fend);
+ truncate_inode_pages_range(&target_inode->i_data, fstart, fend);
+
+ fscache_invalidate(cifs_inode_cookie(target_inode), NULL,
+ i_size_read(target_inode), 0);
+
+ rc = -EOPNOTSUPP;
+ if (target_tcon->ses->server->ops->duplicate_extents) {
rc = target_tcon->ses->server->ops->duplicate_extents(xid,
smb_file_src, smb_file_target, off, len, destoff);
- else
- rc = -EOPNOTSUPP;
+ if (rc == 0 && new_size > i_size_read(target_inode)) {
+ truncate_setsize(target_inode, new_size);
+ netfs_resize_file(&target_cifsi->netfs, new_size);
+ fscache_resize_cookie(cifs_inode_cookie(target_inode),
+ new_size);
+ }
+ }
/* force revalidate of size and timestamps of target file now
that target is updated on the server */
CIFS_I(target_inode)->time = 0;
+unlock:
/* although unlocking in the reverse order from locking is not
strictly necessary here it is a little cleaner to be consistent */
unlock_two_nondirectories(src_inode, target_inode);
{
struct inode *src_inode = file_inode(src_file);
struct inode *target_inode = file_inode(dst_file);
+ struct cifsInodeInfo *src_cifsi = CIFS_I(src_inode);
struct cifsFileInfo *smb_file_src;
struct cifsFileInfo *smb_file_target;
struct cifs_tcon *src_tcon;
struct cifs_tcon *target_tcon;
+ unsigned long long destend, fstart, fend;
ssize_t rc;
cifs_dbg(FYI, "copychunk range\n");
if (rc)
goto unlock;
- /* should we flush first and last page first */
- truncate_inode_pages(&target_inode->i_data, 0);
+ /* The server-side copy will fail if the source crosses the EOF marker.
+ * Advance the EOF marker after the flush above to the end of the range
+ * if it's short of that.
+ */
+ if (src_cifsi->server_eof < off + len) {
+ rc = cifs_precopy_set_eof(src_inode, src_cifsi, src_tcon, xid, off + len);
+ if (rc < 0)
+ goto unlock;
+ }
+
+ destend = destoff + len - 1;
+
+ /* Flush the folios at either end of the destination range to prevent
+ * accidental loss of dirty data outside of the range.
+ */
+ fstart = destoff;
+ fend = destend;
+
+ rc = cifs_flush_folio(target_inode, destoff, &fstart, &fend, true);
+ if (rc)
+ goto unlock;
+ rc = cifs_flush_folio(target_inode, destend, &fstart, &fend, false);
+ if (rc)
+ goto unlock;
+
+ /* Discard all the folios that overlap the destination region. */
+ truncate_inode_pages_range(&target_inode->i_data, fstart, fend);
rc = file_modified(dst_file);
- if (!rc)
+ if (!rc) {
rc = target_tcon->ses->server->ops->copychunk_range(xid,
smb_file_src, smb_file_target, off, len, destoff);
+ if (rc > 0 && destoff + rc > i_size_read(target_inode))
+ truncate_setsize(target_inode, destoff + rc);
+ }
file_accessed(src_file);
bool reparse_point;
bool symlink;
};
- __u32 reparse_tag;
+ struct {
+ __u32 tag;
+ union {
+ struct reparse_data_buffer *buf;
+ struct reparse_posix_data *posix;
+ };
+ } reparse;
char *symlink_target;
union {
struct smb2_file_all_info fi;
struct cifs_tcon *tcon,
struct cifs_sb_info *cifs_sb,
const char *full_path,
- char **target_path,
- struct kvec *rsp_iov);
+ char **target_path);
/* open a file for non-posix mounts */
int (*open)(const unsigned int xid, struct cifs_open_parms *oparms, __u32 *oplock,
void *buf);
struct mid_q_entry **, char **, int *);
enum securityEnum (*select_sectype)(struct TCP_Server_Info *,
enum securityEnum);
- int (*next_header)(char *);
+ int (*next_header)(struct TCP_Server_Info *server, char *buf,
+ unsigned int *noff);
/* ioctl passthrough for query_info */
int (*ioctl_query_info)(const unsigned int xid,
struct cifs_tcon *tcon,
bool (*is_status_io_timeout)(char *buf);
/* Check for STATUS_NETWORK_NAME_DELETED */
bool (*is_network_name_deleted)(char *buf, struct TCP_Server_Info *srv);
+ int (*parse_reparse_point)(struct cifs_sb_info *cifs_sb,
+ struct kvec *rsp_iov,
+ struct cifs_open_info_data *data);
};
struct smb_version_values {
__u8 OplockLevel;
__u16 Fid;
__le32 CreateAction;
- __le64 CreationTime;
- __le64 LastAccessTime;
- __le64 LastWriteTime;
- __le64 ChangeTime;
- __le32 FileAttributes;
+ struct_group(common_attributes,
+ __le64 CreationTime;
+ __le64 LastAccessTime;
+ __le64 LastWriteTime;
+ __le64 ChangeTime;
+ __le32 FileAttributes;
+ );
__le64 AllocationSize;
__le64 EndOfFile;
__le16 FileType;
__le32 DataDisplacement;
__u8 SetupCount; /* 1 */
__le16 ReturnedDataLen;
- __u16 ByteCount;
+ __le16 ByteCount;
} __attribute__((packed)) TRANSACT_IOCTL_RSP;
#define CIFS_ACL_OWNER 1
__le16 ReparseDataLength;
__u16 Reserved;
__le64 InodeType; /* LNK, FIFO, CHR etc. */
- char PathBuffer[];
+ __u8 DataBuffer[];
} __attribute__((packed));
struct cifs_quota_data {
/* QueryFileInfo/QueryPathinfo (also for SetPath/SetFile) data buffer formats */
/******************************************************************************/
typedef struct { /* data block encoding of response to level 263 QPathInfo */
- __le64 CreationTime;
- __le64 LastAccessTime;
- __le64 LastWriteTime;
- __le64 ChangeTime;
- __le32 Attributes;
+ struct_group(common_attributes,
+ __le64 CreationTime;
+ __le64 LastAccessTime;
+ __le64 LastWriteTime;
+ __le64 ChangeTime;
+ __le32 Attributes;
+ );
__u32 Pad1;
__le64 AllocationSize;
__le64 EndOfFile; /* size ie offset to first free byte in file */
const struct cifs_fid *fid);
bool cifs_reparse_point_to_fattr(struct cifs_sb_info *cifs_sb,
struct cifs_fattr *fattr,
- u32 tag);
+ struct cifs_open_info_data *data);
extern int smb311_posix_get_inode_info(struct inode **pinode, const char *search_path,
struct super_block *sb, unsigned int xid);
extern int cifs_get_inode_info_unix(struct inode **pinode,
struct cifs_tcon *tcon,
const unsigned char *searchName, char **syminfo,
const struct nls_table *nls_codepage, int remap);
+extern int cifs_query_reparse_point(const unsigned int xid,
+ struct cifs_tcon *tcon,
+ struct cifs_sb_info *cifs_sb,
+ const char *full_path,
+ u32 *tag, struct kvec *rsp,
+ int *rsp_buftype);
extern int CIFSSMBQuerySymLink(const unsigned int xid, struct cifs_tcon *tcon,
__u16 fid, char **symlinkinfo,
const struct nls_table *nls_codepage);
int cifs_update_super_prepath(struct cifs_sb_info *cifs_sb, char *prefix);
char *extract_hostname(const char *unc);
char *extract_sharename(const char *unc);
+int parse_reparse_point(struct reparse_data_buffer *buf,
+ u32 plen, struct cifs_sb_info *cifs_sb,
+ bool unicode, struct cifs_open_info_data *data);
+int cifs_sfu_make_node(unsigned int xid, struct inode *inode,
+ struct dentry *dentry, struct cifs_tcon *tcon,
+ const char *full_path, umode_t mode, dev_t dev);
#ifdef CONFIG_CIFS_DFS_UPCALL
static inline int get_dfs_path(const unsigned int xid, struct cifs_ses *ses,
*oplock |= CIFS_CREATE_ACTION;
if (buf) {
- /* copy from CreationTime to Attributes */
- memcpy((char *)buf, (char *)&rsp->CreationTime, 36);
+ /* copy commonly used attributes */
+ memcpy(&buf->common_attributes,
+ &rsp->common_attributes,
+ sizeof(buf->common_attributes));
/* the file_info buf is endian converted by caller */
buf->AllocationSize = rsp->AllocationSize;
buf->EndOfFile = rsp->EndOfFile;
return rc;
}
-/*
- * Recent Windows versions now create symlinks more frequently
- * and they use the "reparse point" mechanism below. We can of course
- * do symlinks nicely to Samba and other servers which support the
- * CIFS Unix Extensions and we can also do SFU symlinks and "client only"
- * "MF" symlinks optionally, but for recent Windows we really need to
- * reenable the code below and fix the cifs_symlink callers to handle this.
- * In the interim this code has been moved to its own config option so
- * it is not compiled in by default until callers fixed up and more tested.
- */
-int
-CIFSSMBQuerySymLink(const unsigned int xid, struct cifs_tcon *tcon,
- __u16 fid, char **symlinkinfo,
- const struct nls_table *nls_codepage)
+int cifs_query_reparse_point(const unsigned int xid,
+ struct cifs_tcon *tcon,
+ struct cifs_sb_info *cifs_sb,
+ const char *full_path,
+ u32 *tag, struct kvec *rsp,
+ int *rsp_buftype)
{
- int rc = 0;
- int bytes_returned;
- struct smb_com_transaction_ioctl_req *pSMB;
- struct smb_com_transaction_ioctl_rsp *pSMBr;
- bool is_unicode;
- unsigned int sub_len;
- char *sub_start;
- struct reparse_symlink_data *reparse_buf;
- struct reparse_posix_data *posix_buf;
+ struct cifs_open_parms oparms;
+ TRANSACT_IOCTL_REQ *io_req = NULL;
+ TRANSACT_IOCTL_RSP *io_rsp = NULL;
+ struct cifs_fid fid;
__u32 data_offset, data_count;
- char *end_of_smb;
+ __u8 *start, *end;
+ int io_rsp_len;
+ int oplock = 0;
+ int rc;
- cifs_dbg(FYI, "In Windows reparse style QueryLink for fid %u\n", fid);
- rc = smb_init(SMB_COM_NT_TRANSACT, 23, tcon, (void **) &pSMB,
- (void **) &pSMBr);
+ cifs_tcon_dbg(FYI, "%s: path=%s\n", __func__, full_path);
+
+ if (cap_unix(tcon->ses))
+ return -EOPNOTSUPP;
+
+ oparms = (struct cifs_open_parms) {
+ .tcon = tcon,
+ .cifs_sb = cifs_sb,
+ .desired_access = FILE_READ_ATTRIBUTES,
+ .create_options = cifs_create_options(cifs_sb,
+ OPEN_REPARSE_POINT),
+ .disposition = FILE_OPEN,
+ .path = full_path,
+ .fid = &fid,
+ };
+
+ rc = CIFS_open(xid, &oparms, &oplock, NULL);
if (rc)
return rc;
- pSMB->TotalParameterCount = 0 ;
- pSMB->TotalDataCount = 0;
- pSMB->MaxParameterCount = cpu_to_le32(2);
- /* BB find exact data count max from sess structure BB */
- pSMB->MaxDataCount = cpu_to_le32(CIFSMaxBufSize & 0xFFFFFF00);
- pSMB->MaxSetupCount = 4;
- pSMB->Reserved = 0;
- pSMB->ParameterOffset = 0;
- pSMB->DataCount = 0;
- pSMB->DataOffset = 0;
- pSMB->SetupCount = 4;
- pSMB->SubCommand = cpu_to_le16(NT_TRANSACT_IOCTL);
- pSMB->ParameterCount = pSMB->TotalParameterCount;
- pSMB->FunctionCode = cpu_to_le32(FSCTL_GET_REPARSE_POINT);
- pSMB->IsFsctl = 1; /* FSCTL */
- pSMB->IsRootFlag = 0;
- pSMB->Fid = fid; /* file handle always le */
- pSMB->ByteCount = 0;
+ rc = smb_init(SMB_COM_NT_TRANSACT, 23, tcon,
+ (void **)&io_req, (void **)&io_rsp);
+ if (rc)
+ goto error;
- rc = SendReceive(xid, tcon->ses, (struct smb_hdr *) pSMB,
- (struct smb_hdr *) pSMBr, &bytes_returned, 0);
- if (rc) {
- cifs_dbg(FYI, "Send error in QueryReparseLinkInfo = %d\n", rc);
- goto qreparse_out;
- }
+ io_req->TotalParameterCount = 0;
+ io_req->TotalDataCount = 0;
+ io_req->MaxParameterCount = cpu_to_le32(2);
+ /* BB find exact data count max from sess structure BB */
+ io_req->MaxDataCount = cpu_to_le32(CIFSMaxBufSize & 0xFFFFFF00);
+ io_req->MaxSetupCount = 4;
+ io_req->Reserved = 0;
+ io_req->ParameterOffset = 0;
+ io_req->DataCount = 0;
+ io_req->DataOffset = 0;
+ io_req->SetupCount = 4;
+ io_req->SubCommand = cpu_to_le16(NT_TRANSACT_IOCTL);
+ io_req->ParameterCount = io_req->TotalParameterCount;
+ io_req->FunctionCode = cpu_to_le32(FSCTL_GET_REPARSE_POINT);
+ io_req->IsFsctl = 1;
+ io_req->IsRootFlag = 0;
+ io_req->Fid = fid.netfid;
+ io_req->ByteCount = 0;
+
+ rc = SendReceive(xid, tcon->ses, (struct smb_hdr *)io_req,
+ (struct smb_hdr *)io_rsp, &io_rsp_len, 0);
+ if (rc)
+ goto error;
- data_offset = le32_to_cpu(pSMBr->DataOffset);
- data_count = le32_to_cpu(pSMBr->DataCount);
- if (get_bcc(&pSMBr->hdr) < 2 || data_offset > 512) {
- /* BB also check enough total bytes returned */
- rc = -EIO; /* bad smb */
- goto qreparse_out;
- }
- if (!data_count || (data_count > 2048)) {
+ data_offset = le32_to_cpu(io_rsp->DataOffset);
+ data_count = le32_to_cpu(io_rsp->DataCount);
+ if (get_bcc(&io_rsp->hdr) < 2 || data_offset > 512 ||
+ !data_count || data_count > 2048) {
rc = -EIO;
- cifs_dbg(FYI, "Invalid return data count on get reparse info ioctl\n");
- goto qreparse_out;
- }
- end_of_smb = 2 + get_bcc(&pSMBr->hdr) + (char *)&pSMBr->ByteCount;
- reparse_buf = (struct reparse_symlink_data *)
- ((char *)&pSMBr->hdr.Protocol + data_offset);
- if ((char *)reparse_buf >= end_of_smb) {
- rc = -EIO;
- goto qreparse_out;
- }
- if (reparse_buf->ReparseTag == cpu_to_le32(IO_REPARSE_TAG_NFS)) {
- cifs_dbg(FYI, "NFS style reparse tag\n");
- posix_buf = (struct reparse_posix_data *)reparse_buf;
-
- if (posix_buf->InodeType != cpu_to_le64(NFS_SPECFILE_LNK)) {
- cifs_dbg(FYI, "unsupported file type 0x%llx\n",
- le64_to_cpu(posix_buf->InodeType));
- rc = -EOPNOTSUPP;
- goto qreparse_out;
- }
- is_unicode = true;
- sub_len = le16_to_cpu(reparse_buf->ReparseDataLength);
- if (posix_buf->PathBuffer + sub_len > end_of_smb) {
- cifs_dbg(FYI, "reparse buf beyond SMB\n");
- rc = -EIO;
- goto qreparse_out;
- }
- *symlinkinfo = cifs_strndup_from_utf16(posix_buf->PathBuffer,
- sub_len, is_unicode, nls_codepage);
- goto qreparse_out;
- } else if (reparse_buf->ReparseTag !=
- cpu_to_le32(IO_REPARSE_TAG_SYMLINK)) {
- rc = -EOPNOTSUPP;
- goto qreparse_out;
+ goto error;
}
- /* Reparse tag is NTFS symlink */
- sub_start = le16_to_cpu(reparse_buf->SubstituteNameOffset) +
- reparse_buf->PathBuffer;
- sub_len = le16_to_cpu(reparse_buf->SubstituteNameLength);
- if (sub_start + sub_len > end_of_smb) {
- cifs_dbg(FYI, "reparse buf beyond SMB\n");
+ end = 2 + get_bcc(&io_rsp->hdr) + (__u8 *)&io_rsp->ByteCount;
+ start = (__u8 *)&io_rsp->hdr.Protocol + data_offset;
+ if (start >= end) {
rc = -EIO;
- goto qreparse_out;
+ goto error;
}
- if (pSMBr->hdr.Flags2 & SMBFLG2_UNICODE)
- is_unicode = true;
- else
- is_unicode = false;
-
- /* BB FIXME investigate remapping reserved chars here */
- *symlinkinfo = cifs_strndup_from_utf16(sub_start, sub_len, is_unicode,
- nls_codepage);
- if (!*symlinkinfo)
- rc = -ENOMEM;
-qreparse_out:
- cifs_buf_release(pSMB);
- /*
- * Note: On -EAGAIN error only caller can retry on handle based calls
- * since file handle passed in no longer valid.
- */
+ *tag = le32_to_cpu(((struct reparse_data_buffer *)start)->ReparseTag);
+ rsp->iov_base = io_rsp;
+ rsp->iov_len = io_rsp_len;
+ *rsp_buftype = CIFS_LARGE_BUFFER;
+ CIFSSMBClose(xid, tcon, fid.netfid);
+ return 0;
+
+error:
+ cifs_buf_release(io_req);
+ CIFSSMBClose(xid, tcon, fid.netfid);
return rc;
}
spin_unlock(&server->srv_lock);
cifs_swn_reset_server_dstaddr(server);
cifs_server_unlock(server);
-
- /* increase ref count which reconnect work will drop */
- spin_lock(&cifs_tcp_ses_lock);
- server->srv_count++;
- spin_unlock(&cifs_tcp_ses_lock);
- if (mod_delayed_work(cifsiod_wq, &server->reconnect, 0))
- cifs_put_tcp_session(server, false);
+ mod_delayed_work(cifsiod_wq, &server->reconnect, 0);
}
} while (server->tcpStatus == CifsNeedReconnect);
spin_unlock(&server->srv_lock);
cifs_swn_reset_server_dstaddr(server);
cifs_server_unlock(server);
-
- /* increase ref count which reconnect work will drop */
- spin_lock(&cifs_tcp_ses_lock);
- server->srv_count++;
- spin_unlock(&cifs_tcp_ses_lock);
- if (mod_delayed_work(cifsiod_wq, &server->reconnect, 0))
- cifs_put_tcp_session(server, false);
+ mod_delayed_work(cifsiod_wq, &server->reconnect, 0);
} while (server->tcpStatus == CifsNeedReconnect);
mutex_lock(&server->refpath_lock);
server->total_read += length;
if (server->ops->next_header) {
- next_offset = server->ops->next_header(buf);
+ if (server->ops->next_header(server, buf, &next_offset)) {
+ cifs_dbg(VFS, "%s: malformed response (next_offset=%u)\n",
+ __func__, next_offset);
+ cifs_reconnect(server, true);
+ continue;
+ }
if (next_offset)
server->pdu_size = next_offset;
}
list_del_init(&server->tcp_ses_list);
spin_unlock(&cifs_tcp_ses_lock);
- /* For secondary channels, we pick up ref-count on the primary server */
- if (SERVER_IS_CHAN(server))
- cifs_put_tcp_session(server->primary_server, from_reconnect);
-
cancel_delayed_work_sync(&server->echo);
- if (from_reconnect) {
+ if (from_reconnect)
/*
* Avoid deadlock here: reconnect work calls
* cifs_put_tcp_session() at its end. Need to be sure
* that reconnect work does nothing with server pointer after
* that step.
*/
- if (cancel_delayed_work(&server->reconnect))
- cifs_put_tcp_session(server, from_reconnect);
- } else {
- if (cancel_delayed_work_sync(&server->reconnect))
- cifs_put_tcp_session(server, from_reconnect);
- }
+ cancel_delayed_work(&server->reconnect);
+ else
+ cancel_delayed_work_sync(&server->reconnect);
+
+ /* For secondary channels, we pick up ref-count on the primary server */
+ if (SERVER_IS_CHAN(server))
+ cifs_put_tcp_session(server->primary_server, from_reconnect);
spin_lock(&server->srv_lock);
server->tcpStatus = CifsExiting;
ses->chans[i].server = NULL;
}
+ /* we now account for primary channel in iface->refcount */
+ if (ses->chans[0].iface) {
+ kref_put(&ses->chans[0].iface->refcount, release_iface);
+ ses->chans[0].server = NULL;
+ }
+
sesInfoFree(ses);
cifs_put_tcp_session(server, 0);
}
/* we do not want atime to be less than mtime, it broke some apps */
atime = inode_set_atime_to_ts(inode, current_time(inode));
mtime = inode_get_mtime(inode);
- if (timespec64_compare(&atime, &mtime))
+ if (timespec64_compare(&atime, &mtime) < 0)
inode_set_atime_to_ts(inode, inode_get_mtime(inode));
if (PAGE_SIZE > rc)
return -EOPNOTSUPP;
rc = server->ops->query_symlink(xid, tcon,
cifs_sb, full_path,
- &fattr->cf_symlink_target,
- NULL);
+ &fattr->cf_symlink_target);
cifs_dbg(FYI, "%s: query_symlink: %d\n", __func__, rc);
}
return rc;
fattr->cf_mode, fattr->cf_uniqueid, fattr->cf_nlink);
}
+static inline dev_t nfs_mkdev(struct reparse_posix_data *buf)
+{
+ u64 v = le64_to_cpu(*(__le64 *)buf->DataBuffer);
+
+ return MKDEV(v >> 32, v & 0xffffffff);
+}
+
bool cifs_reparse_point_to_fattr(struct cifs_sb_info *cifs_sb,
struct cifs_fattr *fattr,
- u32 tag)
+ struct cifs_open_info_data *data)
{
+ struct reparse_posix_data *buf = data->reparse.posix;
+ u32 tag = data->reparse.tag;
+
+ if (tag == IO_REPARSE_TAG_NFS && buf) {
+ switch (le64_to_cpu(buf->InodeType)) {
+ case NFS_SPECFILE_CHR:
+ fattr->cf_mode |= S_IFCHR | cifs_sb->ctx->file_mode;
+ fattr->cf_dtype = DT_CHR;
+ fattr->cf_rdev = nfs_mkdev(buf);
+ break;
+ case NFS_SPECFILE_BLK:
+ fattr->cf_mode |= S_IFBLK | cifs_sb->ctx->file_mode;
+ fattr->cf_dtype = DT_BLK;
+ fattr->cf_rdev = nfs_mkdev(buf);
+ break;
+ case NFS_SPECFILE_FIFO:
+ fattr->cf_mode |= S_IFIFO | cifs_sb->ctx->file_mode;
+ fattr->cf_dtype = DT_FIFO;
+ break;
+ case NFS_SPECFILE_SOCK:
+ fattr->cf_mode |= S_IFSOCK | cifs_sb->ctx->file_mode;
+ fattr->cf_dtype = DT_SOCK;
+ break;
+ case NFS_SPECFILE_LNK:
+ fattr->cf_mode = S_IFLNK | cifs_sb->ctx->file_mode;
+ fattr->cf_dtype = DT_LNK;
+ break;
+ default:
+ WARN_ON_ONCE(1);
+ return false;
+ }
+ return true;
+ }
+
switch (tag) {
case IO_REPARSE_TAG_LX_SYMLINK:
fattr->cf_mode |= S_IFLNK | cifs_sb->ctx->file_mode;
case 0: /* SMB1 symlink */
case IO_REPARSE_TAG_SYMLINK:
case IO_REPARSE_TAG_NFS:
- fattr->cf_mode = S_IFLNK;
+ fattr->cf_mode = S_IFLNK | cifs_sb->ctx->file_mode;
fattr->cf_dtype = DT_LNK;
break;
default:
fattr->cf_nlink = le32_to_cpu(info->NumberOfLinks);
if (cifs_open_data_reparse(data) &&
- cifs_reparse_point_to_fattr(cifs_sb, fattr, data->reparse_tag))
+ cifs_reparse_point_to_fattr(cifs_sb, fattr, data))
goto out_reparse;
if (fattr->cf_cifsattrs & ATTR_DIRECTORY) {
out_reparse:
if (S_ISLNK(fattr->cf_mode)) {
+ if (likely(data->symlink_target))
+ fattr->cf_eof = strnlen(data->symlink_target, PATH_MAX);
fattr->cf_symlink_target = data->symlink_target;
data->symlink_target = NULL;
}
data.adjust_tz = false;
if (data.symlink_target) {
data.symlink = true;
- data.reparse_tag = IO_REPARSE_TAG_SYMLINK;
+ data.reparse.tag = IO_REPARSE_TAG_SYMLINK;
}
cifs_open_info_to_fattr(&fattr, &data, inode->i_sb);
break;
struct cifs_sb_info *cifs_sb = CIFS_SB(sb);
struct kvec rsp_iov, *iov = NULL;
int rsp_buftype = CIFS_NO_BUFFER;
- u32 tag = data->reparse_tag;
+ u32 tag = data->reparse.tag;
int rc = 0;
if (!tag && server->ops->query_reparse_point) {
if (!rc)
iov = &rsp_iov;
}
- switch ((data->reparse_tag = tag)) {
+
+ rc = -EOPNOTSUPP;
+ switch ((data->reparse.tag = tag)) {
case 0: /* SMB1 symlink */
- iov = NULL;
- fallthrough;
- case IO_REPARSE_TAG_NFS:
- case IO_REPARSE_TAG_SYMLINK:
- if (!data->symlink_target && server->ops->query_symlink) {
+ if (server->ops->query_symlink) {
rc = server->ops->query_symlink(xid, tcon,
cifs_sb, full_path,
- &data->symlink_target,
- iov);
+ &data->symlink_target);
}
break;
case IO_REPARSE_TAG_MOUNT_POINT:
cifs_create_junction_fattr(fattr, sb);
+ rc = 0;
goto out;
+ default:
+ if (data->symlink_target) {
+ rc = 0;
+ } else if (server->ops->parse_reparse_point) {
+ rc = server->ops->parse_reparse_point(cifs_sb,
+ iov, data);
+ }
+ break;
}
cifs_open_info_to_fattr(fattr, data, sb);
cifs_dbg(VFS, "Length less than smb header size\n");
}
return -EIO;
+ } else if (total_read < sizeof(*smb) + 2 * smb->WordCount) {
+ cifs_dbg(VFS, "%s: can't read BCC due to invalid WordCount(%u)\n",
+ __func__, smb->WordCount);
+ return -EIO;
}
/* otherwise, there is enough to get to the BCC */
static void
cifs_fill_common_info(struct cifs_fattr *fattr, struct cifs_sb_info *cifs_sb)
{
+ struct cifs_open_info_data data = {
+ .reparse = { .tag = fattr->cf_cifstag, },
+ };
+
fattr->cf_uid = cifs_sb->ctx->linux_uid;
fattr->cf_gid = cifs_sb->ctx->linux_gid;
* reasonably map some of them to directories vs. files vs. symlinks
*/
if ((fattr->cf_cifsattrs & ATTR_REPARSE) &&
- cifs_reparse_point_to_fattr(cifs_sb, fattr, fattr->cf_cifstag))
+ cifs_reparse_point_to_fattr(cifs_sb, fattr, &data))
goto out_reparse;
if (fattr->cf_cifsattrs & ATTR_DIRECTORY) {
iface = ses->chans[i].iface;
server = ses->chans[i].server;
+ /*
+ * remove these references first, since we need to unlock
+ * the chan_lock here, since iface_lock is a higher lock
+ */
+ ses->chans[i].iface = NULL;
+ ses->chans[i].server = NULL;
+ spin_unlock(&ses->chan_lock);
+
if (iface) {
spin_lock(&ses->iface_lock);
- kref_put(&iface->refcount, release_iface);
- ses->chans[i].iface = NULL;
iface->num_channels--;
if (iface->weight_fulfilled)
iface->weight_fulfilled--;
+ kref_put(&iface->refcount, release_iface);
spin_unlock(&ses->iface_lock);
}
- spin_unlock(&ses->chan_lock);
- if (server && !server->terminate) {
- server->terminate = true;
- cifs_signal_cifsd_for_reconnect(server, false);
- }
- spin_lock(&ses->chan_lock);
-
if (server) {
- ses->chans[i].server = NULL;
+ if (!server->terminate) {
+ server->terminate = true;
+ cifs_signal_cifsd_for_reconnect(server, false);
+ }
cifs_put_tcp_session(server, false);
}
+ spin_lock(&ses->chan_lock);
}
done:
cifs_dbg(FYI, "unable to find a suitable iface\n");
}
- if (!chan_index && !iface) {
+ if (!iface) {
cifs_dbg(FYI, "unable to get the interface matching: %pIS\n",
&ss);
spin_unlock(&ses->iface_lock);
}
/* now drop the ref to the current iface */
- if (old_iface && iface) {
+ if (old_iface) {
cifs_dbg(FYI, "replacing iface: %pIS with %pIS\n",
&old_iface->sockaddr,
&iface->sockaddr);
kref_put(&old_iface->refcount, release_iface);
} else if (old_iface) {
- cifs_dbg(FYI, "releasing ref to iface: %pIS\n",
+ /* if a new candidate is not found, keep things as is */
+ cifs_dbg(FYI, "could not replace iface: %pIS\n",
&old_iface->sockaddr);
-
- old_iface->num_channels--;
- if (old_iface->weight_fulfilled)
- old_iface->weight_fulfilled--;
-
- kref_put(&old_iface->refcount, release_iface);
} else if (!chan_index) {
/* special case: update interface for primary channel */
- cifs_dbg(FYI, "referencing primary channel iface: %pIS\n",
- &iface->sockaddr);
- iface->num_channels++;
- iface->weight_fulfilled++;
- } else {
- WARN_ON(!iface);
- cifs_dbg(FYI, "adding new iface: %pIS\n", &iface->sockaddr);
+ if (iface) {
+ cifs_dbg(FYI, "referencing primary channel iface: %pIS\n",
+ &iface->sockaddr);
+ iface->num_channels++;
+ iface->weight_fulfilled++;
+ }
}
spin_unlock(&ses->iface_lock);
- spin_lock(&ses->chan_lock);
- chan_index = cifs_ses_get_chan_index(ses, server);
- if (chan_index == CIFS_INVAL_CHAN_INDEX) {
+ if (iface) {
+ spin_lock(&ses->chan_lock);
+ chan_index = cifs_ses_get_chan_index(ses, server);
+ if (chan_index == CIFS_INVAL_CHAN_INDEX) {
+ spin_unlock(&ses->chan_lock);
+ return 0;
+ }
+
+ ses->chans[chan_index].iface = iface;
spin_unlock(&ses->chan_lock);
- return 0;
}
- ses->chans[chan_index].iface = iface;
-
- /* No iface is found. if secondary chan, drop connection */
- if (!iface && SERVER_IS_CHAN(server))
- ses->chans[chan_index].server = NULL;
-
- spin_unlock(&ses->chan_lock);
-
- if (!iface && SERVER_IS_CHAN(server))
- cifs_put_tcp_session(server, false);
-
return rc;
}
struct cifs_tcon *tcon,
struct cifs_sb_info *cifs_sb,
const char *full_path,
- char **target_path,
- struct kvec *rsp_iov)
+ char **target_path)
{
int rc;
- int oplock = 0;
- bool is_reparse_point = !!rsp_iov;
- struct cifs_fid fid;
- struct cifs_open_parms oparms;
- cifs_dbg(FYI, "%s: path: %s\n", __func__, full_path);
+ cifs_tcon_dbg(FYI, "%s: path=%s\n", __func__, full_path);
- if (is_reparse_point) {
- cifs_dbg(VFS, "reparse points not handled for SMB1 symlinks\n");
+ if (!cap_unix(tcon->ses))
return -EOPNOTSUPP;
- }
-
- /* Check for unix extensions */
- if (cap_unix(tcon->ses)) {
- rc = CIFSSMBUnixQuerySymLink(xid, tcon, full_path, target_path,
- cifs_sb->local_nls,
- cifs_remap(cifs_sb));
- if (rc == -EREMOTE)
- rc = cifs_unix_dfs_readlink(xid, tcon, full_path,
- target_path,
- cifs_sb->local_nls);
-
- goto out;
- }
-
- oparms = (struct cifs_open_parms) {
- .tcon = tcon,
- .cifs_sb = cifs_sb,
- .desired_access = FILE_READ_ATTRIBUTES,
- .create_options = cifs_create_options(cifs_sb,
- OPEN_REPARSE_POINT),
- .disposition = FILE_OPEN,
- .path = full_path,
- .fid = &fid,
- };
-
- rc = CIFS_open(xid, &oparms, &oplock, NULL);
- if (rc)
- goto out;
-
- rc = CIFSSMBQuerySymLink(xid, tcon, fid.netfid, target_path,
- cifs_sb->local_nls);
- if (rc)
- goto out_close;
- convert_delimiter(*target_path, '/');
-out_close:
- CIFSSMBClose(xid, tcon, fid.netfid);
-out:
- if (!rc)
- cifs_dbg(FYI, "%s: target path: %s\n", __func__, *target_path);
+ rc = CIFSSMBUnixQuerySymLink(xid, tcon, full_path, target_path,
+ cifs_sb->local_nls, cifs_remap(cifs_sb));
+ if (rc == -EREMOTE)
+ rc = cifs_unix_dfs_readlink(xid, tcon, full_path,
+ target_path, cifs_sb->local_nls);
return rc;
}
+static int cifs_parse_reparse_point(struct cifs_sb_info *cifs_sb,
+ struct kvec *rsp_iov,
+ struct cifs_open_info_data *data)
+{
+ struct reparse_data_buffer *buf;
+ TRANSACT_IOCTL_RSP *io = rsp_iov->iov_base;
+ bool unicode = !!(io->hdr.Flags2 & SMBFLG2_UNICODE);
+ u32 plen = le16_to_cpu(io->ByteCount);
+
+ buf = (struct reparse_data_buffer *)((__u8 *)&io->hdr.Protocol +
+ le32_to_cpu(io->DataOffset));
+ return parse_reparse_point(buf, plen, cifs_sb, unicode, data);
+}
+
static bool
cifs_is_read_op(__u32 oplock)
{
{
struct cifs_sb_info *cifs_sb = CIFS_SB(inode->i_sb);
struct inode *newinode = NULL;
- int rc = -EPERM;
- struct cifs_open_info_data buf = {};
- struct cifs_io_parms io_parms;
- __u32 oplock = 0;
- struct cifs_fid fid;
- struct cifs_open_parms oparms;
- unsigned int bytes_written;
- struct win_dev *pdev;
- struct kvec iov[2];
+ int rc;
if (tcon->unix_ext) {
/*
d_instantiate(dentry, newinode);
return rc;
}
-
/*
- * SMB1 SFU emulation: should work with all servers, but only
- * support block and char device (no socket & fifo)
+ * Check if mounted with mount parm 'sfu' mount parm.
+ * SFU emulation should work with all servers, but only
+ * supports block and char device (no socket & fifo),
+ * and was used by default in earlier versions of Windows
*/
if (!(cifs_sb->mnt_cifs_flags & CIFS_MOUNT_UNX_EMUL))
- return rc;
-
- if (!S_ISCHR(mode) && !S_ISBLK(mode))
- return rc;
-
- cifs_dbg(FYI, "sfu compat create special file\n");
-
- oparms = (struct cifs_open_parms) {
- .tcon = tcon,
- .cifs_sb = cifs_sb,
- .desired_access = GENERIC_WRITE,
- .create_options = cifs_create_options(cifs_sb, CREATE_NOT_DIR |
- CREATE_OPTION_SPECIAL),
- .disposition = FILE_CREATE,
- .path = full_path,
- .fid = &fid,
- };
-
- if (tcon->ses->server->oplocks)
- oplock = REQ_OPLOCK;
- else
- oplock = 0;
- rc = tcon->ses->server->ops->open(xid, &oparms, &oplock, &buf);
- if (rc)
- return rc;
-
- /*
- * BB Do not bother to decode buf since no local inode yet to put
- * timestamps in, but we can reuse it safely.
- */
-
- pdev = (struct win_dev *)&buf.fi;
- io_parms.pid = current->tgid;
- io_parms.tcon = tcon;
- io_parms.offset = 0;
- io_parms.length = sizeof(struct win_dev);
- iov[1].iov_base = &buf.fi;
- iov[1].iov_len = sizeof(struct win_dev);
- if (S_ISCHR(mode)) {
- memcpy(pdev->type, "IntxCHR", 8);
- pdev->major = cpu_to_le64(MAJOR(dev));
- pdev->minor = cpu_to_le64(MINOR(dev));
- rc = tcon->ses->server->ops->sync_write(xid, &fid, &io_parms,
- &bytes_written, iov, 1);
- } else if (S_ISBLK(mode)) {
- memcpy(pdev->type, "IntxBLK", 8);
- pdev->major = cpu_to_le64(MAJOR(dev));
- pdev->minor = cpu_to_le64(MINOR(dev));
- rc = tcon->ses->server->ops->sync_write(xid, &fid, &io_parms,
- &bytes_written, iov, 1);
- }
- tcon->ses->server->ops->close(xid, tcon, &fid);
- d_drop(dentry);
-
- /* FIXME: add code here to set EAs */
-
- cifs_free_open_info(&buf);
- return rc;
+ return -EPERM;
+ return cifs_sfu_make_node(xid, inode, dentry, tcon,
+ full_path, mode, dev);
}
-
-
struct smb_version_operations smb1_operations = {
.send_cancel = send_nt_cancel,
.compare_fids = cifs_compare_fids,
.is_path_accessible = cifs_is_path_accessible,
.can_echo = cifs_can_echo,
.query_path_info = cifs_query_path_info,
+ .query_reparse_point = cifs_query_reparse_point,
.query_file_info = cifs_query_file_info,
.get_srv_inum = cifs_get_srv_inum,
.set_path_size = CIFSSMBSetEOF,
.rename = CIFSSMBRename,
.create_hardlink = CIFSCreateHardLink,
.query_symlink = cifs_query_symlink,
+ .parse_reparse_point = cifs_parse_reparse_point,
.open = cifs_open_file,
.set_fid = cifs_set_fid,
.close = cifs_close_file,
break;
}
data->reparse_point = reparse_point;
- data->reparse_tag = tag;
+ data->reparse.tag = tag;
return rc;
}
}
mid = le64_to_cpu(shdr->MessageId);
+ if (check_smb2_hdr(shdr, mid))
+ return 1;
+
+ if (shdr->StructureSize != SMB2_HEADER_STRUCTURE_SIZE) {
+ cifs_dbg(VFS, "Invalid structure size %u\n",
+ le16_to_cpu(shdr->StructureSize));
+ return 1;
+ }
+
+ command = le16_to_cpu(shdr->Command);
+ if (command >= NUMBER_OF_SMB2_COMMANDS) {
+ cifs_dbg(VFS, "Invalid SMB2 command %d\n", command);
+ return 1;
+ }
+
if (len < pdu_size) {
if ((len >= hdr_size)
&& (shdr->Status != 0)) {
return 1;
}
- if (check_smb2_hdr(shdr, mid))
- return 1;
-
- if (shdr->StructureSize != SMB2_HEADER_STRUCTURE_SIZE) {
- cifs_dbg(VFS, "Invalid structure size %u\n",
- le16_to_cpu(shdr->StructureSize));
- return 1;
- }
-
- command = le16_to_cpu(shdr->Command);
- if (command >= NUMBER_OF_SMB2_COMMANDS) {
- cifs_dbg(VFS, "Invalid SMB2 command %d\n", command);
- return 1;
- }
-
if (smb2_rsp_struct_sizes[command] != pdu->StructureSize2) {
if (command != SMB2_OPLOCK_BREAK_HE && (shdr->Status == 0 ||
pdu->StructureSize2 != SMB2_ERROR_STRUCTURE_SIZE2_LE)) {
char *
smb2_get_data_area_len(int *off, int *len, struct smb2_hdr *shdr)
{
+ const int max_off = 4096;
+ const int max_len = 128 * 1024;
+
*off = 0;
*len = 0;
* Invalid length or offset probably means data area is invalid, but
* we have little choice but to ignore the data area in this case.
*/
- if (*off > 4096) {
- cifs_dbg(VFS, "offset %d too large, data area ignored\n", *off);
- *len = 0;
+ if (unlikely(*off < 0 || *off > max_off ||
+ *len < 0 || *len > max_len)) {
+ cifs_dbg(VFS, "%s: invalid data area (off=%d len=%d)\n",
+ __func__, *off, *len);
*off = 0;
- } else if (*off < 0) {
- cifs_dbg(VFS, "negative offset %d to data invalid ignore data area\n",
- *off);
- *off = 0;
- *len = 0;
- } else if (*len < 0) {
- cifs_dbg(VFS, "negative data length %d invalid, data area ignored\n",
- *len);
*len = 0;
- } else if (*len > 128 * 1024) {
- cifs_dbg(VFS, "data area larger than 128K: %d\n", *len);
+ } else if (*off == 0) {
*len = 0;
}
/* return pointer to beginning of data area, ie offset from SMB start */
- if ((*off != 0) && (*len != 0))
+ if (*off > 0 && *len > 0)
return (char *)shdr + *off;
- else
- return NULL;
+ return NULL;
}
/*
cifs_server_dbg(VFS, "Cmd: %d Err: 0x%x Flags: 0x%x Mid: %llu Pid: %d\n",
shdr->Command, shdr->Status, shdr->Flags, shdr->MessageId,
shdr->Id.SyncId.ProcessId);
- cifs_server_dbg(VFS, "smb buf %p len %u\n", buf,
- server->ops->calc_smb_size(buf));
+ if (!server->ops->check_message(buf, server->total_read, server)) {
+ cifs_server_dbg(VFS, "smb buf %p len %u\n", buf,
+ server->ops->calc_smb_size(buf));
+ }
#endif
}
usleep_range(512, 2048);
} while (++retry_count < 5);
+ if (!rc && !dfs_rsp)
+ rc = -EIO;
if (rc) {
if (!is_retryable_error(rc) && rc != -ENOENT && rc != -EOPNOTSUPP)
cifs_tcon_dbg(VFS, "%s: ioctl error: rc=%d\n", __func__, rc);
return rc;
}
-static int
-parse_reparse_posix(struct reparse_posix_data *symlink_buf,
- u32 plen, char **target_path,
- struct cifs_sb_info *cifs_sb)
+/* See MS-FSCC 2.1.2.6 for the 'NFS' style reparse tags */
+static int parse_reparse_posix(struct reparse_posix_data *buf,
+ struct cifs_sb_info *cifs_sb,
+ struct cifs_open_info_data *data)
{
unsigned int len;
-
- /* See MS-FSCC 2.1.2.6 for the 'NFS' style reparse tags */
- len = le16_to_cpu(symlink_buf->ReparseDataLength);
-
- if (le64_to_cpu(symlink_buf->InodeType) != NFS_SPECFILE_LNK) {
- cifs_dbg(VFS, "%lld not a supported symlink type\n",
- le64_to_cpu(symlink_buf->InodeType));
+ u64 type;
+
+ switch ((type = le64_to_cpu(buf->InodeType))) {
+ case NFS_SPECFILE_LNK:
+ len = le16_to_cpu(buf->ReparseDataLength);
+ data->symlink_target = cifs_strndup_from_utf16(buf->DataBuffer,
+ len, true,
+ cifs_sb->local_nls);
+ if (!data->symlink_target)
+ return -ENOMEM;
+ convert_delimiter(data->symlink_target, '/');
+ cifs_dbg(FYI, "%s: target path: %s\n",
+ __func__, data->symlink_target);
+ break;
+ case NFS_SPECFILE_CHR:
+ case NFS_SPECFILE_BLK:
+ case NFS_SPECFILE_FIFO:
+ case NFS_SPECFILE_SOCK:
+ break;
+ default:
+ cifs_dbg(VFS, "%s: unhandled inode type: 0x%llx\n",
+ __func__, type);
return -EOPNOTSUPP;
}
-
- *target_path = cifs_strndup_from_utf16(
- symlink_buf->PathBuffer,
- len, true, cifs_sb->local_nls);
- if (!(*target_path))
- return -ENOMEM;
-
- convert_delimiter(*target_path, '/');
- cifs_dbg(FYI, "%s: target path: %s\n", __func__, *target_path);
-
return 0;
}
-static int
-parse_reparse_symlink(struct reparse_symlink_data_buffer *symlink_buf,
- u32 plen, char **target_path,
- struct cifs_sb_info *cifs_sb)
+static int parse_reparse_symlink(struct reparse_symlink_data_buffer *sym,
+ u32 plen, bool unicode,
+ struct cifs_sb_info *cifs_sb,
+ struct cifs_open_info_data *data)
{
- unsigned int sub_len;
- unsigned int sub_offset;
+ unsigned int len;
+ unsigned int offs;
/* We handle Symbolic Link reparse tag here. See: MS-FSCC 2.1.2.4 */
- sub_offset = le16_to_cpu(symlink_buf->SubstituteNameOffset);
- sub_len = le16_to_cpu(symlink_buf->SubstituteNameLength);
- if (sub_offset + 20 > plen ||
- sub_offset + sub_len + 20 > plen) {
+ offs = le16_to_cpu(sym->SubstituteNameOffset);
+ len = le16_to_cpu(sym->SubstituteNameLength);
+ if (offs + 20 > plen || offs + len + 20 > plen) {
cifs_dbg(VFS, "srv returned malformed symlink buffer\n");
return -EIO;
}
- *target_path = cifs_strndup_from_utf16(
- symlink_buf->PathBuffer + sub_offset,
- sub_len, true, cifs_sb->local_nls);
- if (!(*target_path))
+ data->symlink_target = cifs_strndup_from_utf16(sym->PathBuffer + offs,
+ len, unicode,
+ cifs_sb->local_nls);
+ if (!data->symlink_target)
return -ENOMEM;
- convert_delimiter(*target_path, '/');
- cifs_dbg(FYI, "%s: target path: %s\n", __func__, *target_path);
+ convert_delimiter(data->symlink_target, '/');
+ cifs_dbg(FYI, "%s: target path: %s\n", __func__, data->symlink_target);
return 0;
}
-static int
-parse_reparse_point(struct reparse_data_buffer *buf,
- u32 plen, char **target_path,
- struct cifs_sb_info *cifs_sb)
+int parse_reparse_point(struct reparse_data_buffer *buf,
+ u32 plen, struct cifs_sb_info *cifs_sb,
+ bool unicode, struct cifs_open_info_data *data)
{
- if (plen < sizeof(struct reparse_data_buffer)) {
- cifs_dbg(VFS, "reparse buffer is too small. Must be at least 8 bytes but was %d\n",
- plen);
+ if (plen < sizeof(*buf)) {
+ cifs_dbg(VFS, "%s: reparse buffer is too small. Must be at least 8 bytes but was %d\n",
+ __func__, plen);
return -EIO;
}
- if (plen < le16_to_cpu(buf->ReparseDataLength) +
- sizeof(struct reparse_data_buffer)) {
- cifs_dbg(VFS, "srv returned invalid reparse buf length: %d\n",
- plen);
+ if (plen < le16_to_cpu(buf->ReparseDataLength) + sizeof(*buf)) {
+ cifs_dbg(VFS, "%s: invalid reparse buf length: %d\n",
+ __func__, plen);
return -EIO;
}
+ data->reparse.buf = buf;
+
/* See MS-FSCC 2.1.2 */
switch (le32_to_cpu(buf->ReparseTag)) {
case IO_REPARSE_TAG_NFS:
- return parse_reparse_posix(
- (struct reparse_posix_data *)buf,
- plen, target_path, cifs_sb);
+ return parse_reparse_posix((struct reparse_posix_data *)buf,
+ cifs_sb, data);
case IO_REPARSE_TAG_SYMLINK:
return parse_reparse_symlink(
(struct reparse_symlink_data_buffer *)buf,
- plen, target_path, cifs_sb);
+ plen, unicode, cifs_sb, data);
+ case IO_REPARSE_TAG_LX_SYMLINK:
+ case IO_REPARSE_TAG_AF_UNIX:
+ case IO_REPARSE_TAG_LX_FIFO:
+ case IO_REPARSE_TAG_LX_CHR:
+ case IO_REPARSE_TAG_LX_BLK:
+ return 0;
default:
- cifs_dbg(VFS, "srv returned unknown symlink buffer tag:0x%08x\n",
- le32_to_cpu(buf->ReparseTag));
+ cifs_dbg(VFS, "%s: unhandled reparse tag: 0x%08x\n",
+ __func__, le32_to_cpu(buf->ReparseTag));
return -EOPNOTSUPP;
}
}
-static int smb2_query_symlink(const unsigned int xid,
- struct cifs_tcon *tcon,
- struct cifs_sb_info *cifs_sb,
- const char *full_path,
- char **target_path,
- struct kvec *rsp_iov)
+static int smb2_parse_reparse_point(struct cifs_sb_info *cifs_sb,
+ struct kvec *rsp_iov,
+ struct cifs_open_info_data *data)
{
struct reparse_data_buffer *buf;
struct smb2_ioctl_rsp *io = rsp_iov->iov_base;
u32 plen = le32_to_cpu(io->OutputCount);
- cifs_dbg(FYI, "%s: path: %s\n", __func__, full_path);
-
buf = (struct reparse_data_buffer *)((u8 *)io +
le32_to_cpu(io->OutputOffset));
- return parse_reparse_point(buf, plen, target_path, cifs_sb);
+ return parse_reparse_point(buf, plen, cifs_sb, true, data);
}
static int smb2_query_reparse_point(const unsigned int xid,
struct kvec *rsp_iov;
struct smb2_ioctl_rsp *ioctl_rsp;
struct reparse_data_buffer *reparse_buf;
- u32 plen;
+ u32 off, count, len;
cifs_dbg(FYI, "%s: path: %s\n", __func__, full_path);
*/
if (rc == 0) {
/* See MS-FSCC 2.3.23 */
+ off = le32_to_cpu(ioctl_rsp->OutputOffset);
+ count = le32_to_cpu(ioctl_rsp->OutputCount);
+ if (check_add_overflow(off, count, &len) ||
+ len > rsp_iov[1].iov_len) {
+ cifs_tcon_dbg(VFS, "%s: invalid ioctl: off=%d count=%d\n",
+ __func__, off, count);
+ rc = -EIO;
+ goto query_rp_exit;
+ }
- reparse_buf = (struct reparse_data_buffer *)
- ((char *)ioctl_rsp +
- le32_to_cpu(ioctl_rsp->OutputOffset));
- plen = le32_to_cpu(ioctl_rsp->OutputCount);
-
- if (plen + le32_to_cpu(ioctl_rsp->OutputOffset) >
- rsp_iov[1].iov_len) {
- cifs_tcon_dbg(FYI, "srv returned invalid ioctl len: %d\n",
- plen);
+ reparse_buf = (void *)((u8 *)ioctl_rsp + off);
+ len = sizeof(*reparse_buf);
+ if (count < len ||
+ count < le16_to_cpu(reparse_buf->ReparseDataLength) + len) {
+ cifs_tcon_dbg(VFS, "%s: invalid ioctl: off=%d count=%d\n",
+ __func__, off, count);
rc = -EIO;
goto query_rp_exit;
}
struct inode *inode = file_inode(file);
struct cifsInodeInfo *cifsi = CIFS_I(inode);
struct cifsFileInfo *cfile = file->private_data;
+ unsigned long long new_size;
long rc;
unsigned int xid;
__le64 eof;
/*
* do we also need to change the size of the file?
*/
- if (keep_size == false && i_size_read(inode) < offset + len) {
- eof = cpu_to_le64(offset + len);
+ new_size = offset + len;
+ if (keep_size == false && (unsigned long long)i_size_read(inode) < new_size) {
+ eof = cpu_to_le64(new_size);
rc = SMB2_set_eof(xid, tcon, cfile->fid.persistent_fid,
cfile->fid.volatile_fid, cfile->pid, &eof);
+ if (rc >= 0) {
+ truncate_setsize(inode, new_size);
+ fscache_resize_cookie(cifs_inode_cookie(inode), new_size);
+ }
}
zero_range_exit:
if (rc < 0)
goto out_2;
+ truncate_setsize(inode, old_eof + len);
+ fscache_resize_cookie(cifs_inode_cookie(inode), i_size_read(inode));
+
rc = smb2_copychunk_range(xid, cfile, cfile, off, count, off + len);
if (rc < 0)
goto out_2;
struct smb2_hdr *shdr;
unsigned int pdu_length = server->pdu_size;
unsigned int buf_size;
+ unsigned int next_cmd;
struct mid_q_entry *mid_entry;
int next_is_large;
char *next_buffer = NULL;
next_is_large = server->large_buf;
one_more:
shdr = (struct smb2_hdr *)buf;
- if (shdr->NextCommand) {
+ next_cmd = le32_to_cpu(shdr->NextCommand);
+ if (next_cmd) {
+ if (WARN_ON_ONCE(next_cmd > pdu_length))
+ return -1;
if (next_is_large)
next_buffer = (char *)cifs_buf_get();
else
next_buffer = (char *)cifs_small_buf_get();
- memcpy(next_buffer,
- buf + le32_to_cpu(shdr->NextCommand),
- pdu_length - le32_to_cpu(shdr->NextCommand));
+ memcpy(next_buffer, buf + next_cmd, pdu_length - next_cmd);
}
mid_entry = smb2_find_mid(server, buf);
else
ret = cifs_handle_standard(server, mid_entry);
- if (ret == 0 && shdr->NextCommand) {
- pdu_length -= le32_to_cpu(shdr->NextCommand);
+ if (ret == 0 && next_cmd) {
+ pdu_length -= next_cmd;
server->large_buf = next_is_large;
if (next_is_large)
server->bigbuf = buf = next_buffer;
NULL, 0, false);
}
-static int
-smb2_next_header(char *buf)
+static int smb2_next_header(struct TCP_Server_Info *server, char *buf,
+ unsigned int *noff)
{
struct smb2_hdr *hdr = (struct smb2_hdr *)buf;
struct smb2_transform_hdr *t_hdr = (struct smb2_transform_hdr *)buf;
- if (hdr->ProtocolId == SMB2_TRANSFORM_PROTO_NUM)
- return sizeof(struct smb2_transform_hdr) +
- le32_to_cpu(t_hdr->OriginalMessageSize);
-
- return le32_to_cpu(hdr->NextCommand);
+ if (hdr->ProtocolId == SMB2_TRANSFORM_PROTO_NUM) {
+ *noff = le32_to_cpu(t_hdr->OriginalMessageSize);
+ if (unlikely(check_add_overflow(*noff, sizeof(*t_hdr), noff)))
+ return -EINVAL;
+ } else {
+ *noff = le32_to_cpu(hdr->NextCommand);
+ }
+ if (unlikely(*noff && *noff < MID_HEADER_SIZE(server)))
+ return -EINVAL;
+ return 0;
}
-static int
-smb2_make_node(unsigned int xid, struct inode *inode,
- struct dentry *dentry, struct cifs_tcon *tcon,
- const char *full_path, umode_t mode, dev_t dev)
+int cifs_sfu_make_node(unsigned int xid, struct inode *inode,
+ struct dentry *dentry, struct cifs_tcon *tcon,
+ const char *full_path, umode_t mode, dev_t dev)
{
- struct cifs_sb_info *cifs_sb = CIFS_SB(inode->i_sb);
- int rc = -EPERM;
struct cifs_open_info_data buf = {};
- struct cifs_io_parms io_parms = {0};
- __u32 oplock = 0;
- struct cifs_fid fid;
+ struct TCP_Server_Info *server = tcon->ses->server;
struct cifs_open_parms oparms;
+ struct cifs_io_parms io_parms = {};
+ struct cifs_sb_info *cifs_sb = CIFS_SB(inode->i_sb);
+ struct cifs_fid fid;
unsigned int bytes_written;
struct win_dev *pdev;
struct kvec iov[2];
-
- /*
- * Check if mounted with mount parm 'sfu' mount parm.
- * SFU emulation should work with all servers, but only
- * supports block and char device (no socket & fifo),
- * and was used by default in earlier versions of Windows
- */
- if (!(cifs_sb->mnt_cifs_flags & CIFS_MOUNT_UNX_EMUL))
- return rc;
-
- /*
- * TODO: Add ability to create instead via reparse point. Windows (e.g.
- * their current NFS server) uses this approach to expose special files
- * over SMB2/SMB3 and Samba will do this with SMB3.1.1 POSIX Extensions
- */
+ __u32 oplock = server->oplocks ? REQ_OPLOCK : 0;
+ int rc;
if (!S_ISCHR(mode) && !S_ISBLK(mode) && !S_ISFIFO(mode))
- return rc;
-
- cifs_dbg(FYI, "sfu compat create special file\n");
+ return -EPERM;
oparms = (struct cifs_open_parms) {
.tcon = tcon,
.fid = &fid,
};
- if (tcon->ses->server->oplocks)
- oplock = REQ_OPLOCK;
- else
- oplock = 0;
- rc = tcon->ses->server->ops->open(xid, &oparms, &oplock, &buf);
+ rc = server->ops->open(xid, &oparms, &oplock, &buf);
if (rc)
return rc;
* BB Do not bother to decode buf since no local inode yet to put
* timestamps in, but we can reuse it safely.
*/
-
pdev = (struct win_dev *)&buf.fi;
io_parms.pid = current->tgid;
io_parms.tcon = tcon;
- io_parms.offset = 0;
- io_parms.length = sizeof(struct win_dev);
- iov[1].iov_base = &buf.fi;
- iov[1].iov_len = sizeof(struct win_dev);
+ io_parms.length = sizeof(*pdev);
+ iov[1].iov_base = pdev;
+ iov[1].iov_len = sizeof(*pdev);
if (S_ISCHR(mode)) {
memcpy(pdev->type, "IntxCHR", 8);
pdev->major = cpu_to_le64(MAJOR(dev));
pdev->minor = cpu_to_le64(MINOR(dev));
- rc = tcon->ses->server->ops->sync_write(xid, &fid, &io_parms,
- &bytes_written, iov, 1);
} else if (S_ISBLK(mode)) {
memcpy(pdev->type, "IntxBLK", 8);
pdev->major = cpu_to_le64(MAJOR(dev));
pdev->minor = cpu_to_le64(MINOR(dev));
- rc = tcon->ses->server->ops->sync_write(xid, &fid, &io_parms,
- &bytes_written, iov, 1);
} else if (S_ISFIFO(mode)) {
memcpy(pdev->type, "LnxFIFO", 8);
- pdev->major = 0;
- pdev->minor = 0;
- rc = tcon->ses->server->ops->sync_write(xid, &fid, &io_parms,
- &bytes_written, iov, 1);
}
- tcon->ses->server->ops->close(xid, tcon, &fid);
- d_drop(dentry);
+ rc = server->ops->sync_write(xid, &fid, &io_parms,
+ &bytes_written, iov, 1);
+ server->ops->close(xid, tcon, &fid);
+ d_drop(dentry);
/* FIXME: add code here to set EAs */
-
cifs_free_open_info(&buf);
return rc;
}
+static int smb2_make_node(unsigned int xid, struct inode *inode,
+ struct dentry *dentry, struct cifs_tcon *tcon,
+ const char *full_path, umode_t mode, dev_t dev)
+{
+ struct cifs_sb_info *cifs_sb = CIFS_SB(inode->i_sb);
+
+ /*
+ * Check if mounted with mount parm 'sfu' mount parm.
+ * SFU emulation should work with all servers, but only
+ * supports block and char device (no socket & fifo),
+ * and was used by default in earlier versions of Windows
+ */
+ if (!(cifs_sb->mnt_cifs_flags & CIFS_MOUNT_UNX_EMUL))
+ return -EPERM;
+ /*
+ * TODO: Add ability to create instead via reparse point. Windows (e.g.
+ * their current NFS server) uses this approach to expose special files
+ * over SMB2/SMB3 and Samba will do this with SMB3.1.1 POSIX Extensions
+ */
+ return cifs_sfu_make_node(xid, inode, dentry, tcon,
+ full_path, mode, dev);
+}
+
#ifdef CONFIG_CIFS_ALLOW_INSECURE_LEGACY
struct smb_version_operations smb20_operations = {
.compare_fids = smb2_compare_fids,
.unlink = smb2_unlink,
.rename = smb2_rename_path,
.create_hardlink = smb2_create_hardlink,
- .query_symlink = smb2_query_symlink,
+ .parse_reparse_point = smb2_parse_reparse_point,
.query_mf_symlink = smb3_query_mf_symlink,
.create_mf_symlink = smb3_create_mf_symlink,
.open = smb2_open_file,
.unlink = smb2_unlink,
.rename = smb2_rename_path,
.create_hardlink = smb2_create_hardlink,
- .query_symlink = smb2_query_symlink,
+ .parse_reparse_point = smb2_parse_reparse_point,
.query_mf_symlink = smb3_query_mf_symlink,
.create_mf_symlink = smb3_create_mf_symlink,
.open = smb2_open_file,
.unlink = smb2_unlink,
.rename = smb2_rename_path,
.create_hardlink = smb2_create_hardlink,
- .query_symlink = smb2_query_symlink,
+ .parse_reparse_point = smb2_parse_reparse_point,
.query_mf_symlink = smb3_query_mf_symlink,
.create_mf_symlink = smb3_create_mf_symlink,
.open = smb2_open_file,
.unlink = smb2_unlink,
.rename = smb2_rename_path,
.create_hardlink = smb2_create_hardlink,
- .query_symlink = smb2_query_symlink,
+ .parse_reparse_point = smb2_parse_reparse_point,
.query_mf_symlink = smb3_query_mf_symlink,
.create_mf_symlink = smb3_create_mf_symlink,
.open = smb2_open_file,
static int
smb2_reconnect(__le16 smb2_command, struct cifs_tcon *tcon,
- struct TCP_Server_Info *server)
+ struct TCP_Server_Info *server, bool from_reconnect)
{
int rc = 0;
struct nls_table *nls_codepage = NULL;
* as cifs_put_tcp_session takes a higher lock
* i.e. cifs_tcp_ses_lock
*/
- cifs_put_tcp_session(server, 1);
+ cifs_put_tcp_session(server, from_reconnect);
server->terminate = true;
cifs_signal_cifsd_for_reconnect(server, false);
}
if (smb2_command != SMB2_INTERNAL_CMD)
- if (mod_delayed_work(cifsiod_wq, &server->reconnect, 0))
- cifs_put_tcp_session(server, false);
+ mod_delayed_work(cifsiod_wq, &server->reconnect, 0);
atomic_inc(&tconInfoReconnectCount);
out:
void **request_buf, unsigned int *total_len)
{
/* BB eventually switch this to SMB2 specific small buf size */
- if (smb2_command == SMB2_SET_INFO)
+ switch (smb2_command) {
+ case SMB2_SET_INFO:
+ case SMB2_QUERY_INFO:
*request_buf = cifs_buf_get();
- else
+ break;
+ default:
*request_buf = cifs_small_buf_get();
+ break;
+ }
if (*request_buf == NULL) {
/* BB should we add a retry in here if not a writepage? */
return -ENOMEM;
{
int rc;
- rc = smb2_reconnect(smb2_command, tcon, server);
+ rc = smb2_reconnect(smb2_command, tcon, server, false);
if (rc)
return rc;
posix->nlink, posix->mode, posix->reparse_tag);
}
-void
-smb2_parse_contexts(struct TCP_Server_Info *server,
- struct smb2_create_rsp *rsp,
- unsigned int *epoch, char *lease_key, __u8 *oplock,
- struct smb2_file_all_info *buf,
- struct create_posix_rsp *posix)
+int smb2_parse_contexts(struct TCP_Server_Info *server,
+ struct kvec *rsp_iov,
+ unsigned int *epoch,
+ char *lease_key, __u8 *oplock,
+ struct smb2_file_all_info *buf,
+ struct create_posix_rsp *posix)
{
- char *data_offset;
+ struct smb2_create_rsp *rsp = rsp_iov->iov_base;
struct create_context *cc;
- unsigned int next;
- unsigned int remaining;
+ size_t rem, off, len;
+ size_t doff, dlen;
+ size_t noff, nlen;
char *name;
static const char smb3_create_tag_posix[] = {
0x93, 0xAD, 0x25, 0x50, 0x9C,
};
*oplock = 0;
- data_offset = (char *)rsp + le32_to_cpu(rsp->CreateContextsOffset);
- remaining = le32_to_cpu(rsp->CreateContextsLength);
- cc = (struct create_context *)data_offset;
+
+ off = le32_to_cpu(rsp->CreateContextsOffset);
+ rem = le32_to_cpu(rsp->CreateContextsLength);
+ if (check_add_overflow(off, rem, &len) || len > rsp_iov->iov_len)
+ return -EINVAL;
+ cc = (struct create_context *)((u8 *)rsp + off);
/* Initialize inode number to 0 in case no valid data in qfid context */
if (buf)
buf->IndexNumber = 0;
- while (remaining >= sizeof(struct create_context)) {
- name = le16_to_cpu(cc->NameOffset) + (char *)cc;
- if (le16_to_cpu(cc->NameLength) == 4 &&
- strncmp(name, SMB2_CREATE_REQUEST_LEASE, 4) == 0)
- *oplock = server->ops->parse_lease_buf(cc, epoch,
- lease_key);
- else if (buf && (le16_to_cpu(cc->NameLength) == 4) &&
- strncmp(name, SMB2_CREATE_QUERY_ON_DISK_ID, 4) == 0)
- parse_query_id_ctxt(cc, buf);
- else if ((le16_to_cpu(cc->NameLength) == 16)) {
- if (posix &&
- memcmp(name, smb3_create_tag_posix, 16) == 0)
+ while (rem >= sizeof(*cc)) {
+ doff = le16_to_cpu(cc->DataOffset);
+ dlen = le32_to_cpu(cc->DataLength);
+ if (check_add_overflow(doff, dlen, &len) || len > rem)
+ return -EINVAL;
+
+ noff = le16_to_cpu(cc->NameOffset);
+ nlen = le16_to_cpu(cc->NameLength);
+ if (noff + nlen >= doff)
+ return -EINVAL;
+
+ name = (char *)cc + noff;
+ switch (nlen) {
+ case 4:
+ if (!strncmp(name, SMB2_CREATE_REQUEST_LEASE, 4)) {
+ *oplock = server->ops->parse_lease_buf(cc, epoch,
+ lease_key);
+ } else if (buf &&
+ !strncmp(name, SMB2_CREATE_QUERY_ON_DISK_ID, 4)) {
+ parse_query_id_ctxt(cc, buf);
+ }
+ break;
+ case 16:
+ if (posix && !memcmp(name, smb3_create_tag_posix, 16))
parse_posix_ctxt(cc, buf, posix);
+ break;
+ default:
+ cifs_dbg(FYI, "%s: unhandled context (nlen=%zu dlen=%zu)\n",
+ __func__, nlen, dlen);
+ if (IS_ENABLED(CONFIG_CIFS_DEBUG2))
+ cifs_dump_mem("context data: ", cc, dlen);
+ break;
}
- /* else {
- cifs_dbg(FYI, "Context not matched with len %d\n",
- le16_to_cpu(cc->NameLength));
- cifs_dump_mem("Cctxt name: ", name, 4);
- } */
-
- next = le32_to_cpu(cc->Next);
- if (!next)
+
+ off = le32_to_cpu(cc->Next);
+ if (!off)
break;
- remaining -= next;
- cc = (struct create_context *)((char *)cc + next);
+ if (check_sub_overflow(rem, off, &rem))
+ return -EINVAL;
+ cc = (struct create_context *)((u8 *)cc + off);
}
if (rsp->OplockLevel != SMB2_OPLOCK_LEVEL_LEASE)
*oplock = rsp->OplockLevel;
- return;
+ return 0;
}
static int
}
- smb2_parse_contexts(server, rsp, &oparms->fid->epoch,
- oparms->fid->lease_key, oplock, buf, posix);
+ rc = smb2_parse_contexts(server, &rsp_iov, &oparms->fid->epoch,
+ oparms->fid->lease_key, oplock, buf, posix);
creat_exit:
SMB2_open_free(&rqst);
free_rsp_buf(resp_buftype, rsp);
} else {
trace_smb3_close_done(xid, persistent_fid, tcon->tid,
ses->Suid);
- /*
- * Note that have to subtract 4 since struct network_open_info
- * has a final 4 byte pad that close response does not have
- */
if (pbuf)
- memcpy(pbuf, (char *)&rsp->CreationTime, sizeof(*pbuf) - 4);
+ memcpy(&pbuf->network_open_info,
+ &rsp->network_open_info,
+ sizeof(pbuf->network_open_info));
}
atomic_dec(&tcon->num_remote_opens);
struct smb2_query_info_req *req;
struct kvec *iov = rqst->rq_iov;
unsigned int total_len;
+ size_t len;
int rc;
+ if (unlikely(check_add_overflow(input_len, sizeof(*req), &len) ||
+ len > CIFSMaxBufSize))
+ return -EINVAL;
+
rc = smb2_plain_req_init(SMB2_QUERY_INFO, tcon, server,
(void **) &req, &total_len);
if (rc)
iov[0].iov_base = (char *)req;
/* 1 for Buffer */
- iov[0].iov_len = total_len - 1 + input_len;
+ iov[0].iov_len = len;
return 0;
}
SMB2_query_info_free(struct smb_rqst *rqst)
{
if (rqst && rqst->rq_iov)
- cifs_small_buf_release(rqst->rq_iov[0].iov_base); /* request */
+ cifs_buf_release(rqst->rq_iov[0].iov_base); /* request */
}
static int
int rc;
bool resched = false;
+ /* first check if ref count has reached 0, if not inc ref count */
+ spin_lock(&cifs_tcp_ses_lock);
+ if (!server->srv_count) {
+ spin_unlock(&cifs_tcp_ses_lock);
+ return;
+ }
+ server->srv_count++;
+ spin_unlock(&cifs_tcp_ses_lock);
+
/* If server is a channel, select the primary channel */
pserver = SERVER_IS_CHAN(server) ? server->primary_server : server;
}
spin_unlock(&ses->chan_lock);
}
-
spin_unlock(&cifs_tcp_ses_lock);
list_for_each_entry_safe(tcon, tcon2, &tmp_list, rlist) {
- rc = smb2_reconnect(SMB2_INTERNAL_CMD, tcon, server);
+ rc = smb2_reconnect(SMB2_INTERNAL_CMD, tcon, server, true);
if (!rc)
cifs_reopen_persistent_handles(tcon);
else
/* now reconnect sessions for necessary channels */
list_for_each_entry_safe(ses, ses2, &tmp_ses_list, rlist) {
tcon->ses = ses;
- rc = smb2_reconnect(SMB2_INTERNAL_CMD, tcon, server);
+ rc = smb2_reconnect(SMB2_INTERNAL_CMD, tcon, server, true);
if (rc)
resched = true;
list_del_init(&ses->rlist);
done:
cifs_dbg(FYI, "Reconnecting tcons and channels finished\n");
- if (resched) {
+ if (resched)
queue_delayed_work(cifsiod_wq, &server->reconnect, 2 * HZ);
- mutex_unlock(&pserver->reconnect_mutex);
-
- /* no need to put tcp session as we're retrying */
- return;
- }
mutex_unlock(&pserver->reconnect_mutex);
/* now we can safely release srv struct */
server->ops->need_neg(server)) {
spin_unlock(&server->srv_lock);
/* No need to send echo on newly established connections */
- spin_lock(&cifs_tcp_ses_lock);
- server->srv_count++;
- spin_unlock(&cifs_tcp_ses_lock);
- if (mod_delayed_work(cifsiod_wq, &server->reconnect, 0))
- cifs_put_tcp_session(server, false);
-
+ mod_delayed_work(cifsiod_wq, &server->reconnect, 0);
return rc;
}
spin_unlock(&server->srv_lock);
return 0;
}
+static inline void free_qfs_info_req(struct kvec *iov)
+{
+ cifs_buf_release(iov->iov_base);
+}
+
int
SMB311_posix_qfs_info(const unsigned int xid, struct cifs_tcon *tcon,
u64 persistent_fid, u64 volatile_fid, struct kstatfs *fsdata)
rc = cifs_send_recv(xid, ses, server,
&rqst, &resp_buftype, flags, &rsp_iov);
- cifs_small_buf_release(iov.iov_base);
+ free_qfs_info_req(&iov);
if (rc) {
cifs_stats_fail_inc(tcon, SMB2_QUERY_INFO_HE);
goto posix_qfsinf_exit;
rc = cifs_send_recv(xid, ses, server,
&rqst, &resp_buftype, flags, &rsp_iov);
- cifs_small_buf_release(iov.iov_base);
+ free_qfs_info_req(&iov);
if (rc) {
cifs_stats_fail_inc(tcon, SMB2_QUERY_INFO_HE);
goto qfsinf_exit;
rc = cifs_send_recv(xid, ses, server,
&rqst, &resp_buftype, flags, &rsp_iov);
- cifs_small_buf_release(iov.iov_base);
+ free_qfs_info_req(&iov);
if (rc) {
cifs_stats_fail_inc(tcon, SMB2_QUERY_INFO_HE);
goto qfsattr_exit;
} __packed;
struct smb2_file_network_open_info {
- __le64 CreationTime;
- __le64 LastAccessTime;
- __le64 LastWriteTime;
- __le64 ChangeTime;
- __le64 AllocationSize;
- __le64 EndOfFile;
- __le32 Attributes;
+ struct_group(network_open_info,
+ __le64 CreationTime;
+ __le64 LastAccessTime;
+ __le64 LastWriteTime;
+ __le64 ChangeTime;
+ __le64 AllocationSize;
+ __le64 EndOfFile;
+ __le32 Attributes;
+ );
__le32 Reserved;
} __packed; /* level 34 Query also similar returned in close rsp and open rsp */
extern enum securityEnum smb2_select_sectype(struct TCP_Server_Info *,
enum securityEnum);
-extern void smb2_parse_contexts(struct TCP_Server_Info *server,
- struct smb2_create_rsp *rsp,
- unsigned int *epoch, char *lease_key,
- __u8 *oplock, struct smb2_file_all_info *buf,
- struct create_posix_rsp *posix);
+int smb2_parse_contexts(struct TCP_Server_Info *server,
+ struct kvec *rsp_iov,
+ unsigned int *epoch,
+ char *lease_key, __u8 *oplock,
+ struct smb2_file_all_info *buf,
+ struct create_posix_rsp *posix);
+
extern int smb3_encryption_required(const struct cifs_tcon *tcon);
extern int smb2_validate_iov(unsigned int offset, unsigned int buffer_length,
struct kvec *iov, unsigned int min_buf_size);
ptriplet->encryption.context,
ses->smb3encryptionkey,
SMB3_ENC_DEC_KEY_SIZE);
+ if (rc)
+ return rc;
rc = generate_key(ses, ptriplet->decryption.label,
ptriplet->decryption.context,
ses->smb3decryptionkey,
return rc;
}
- if (rc)
- return rc;
-
#ifdef CONFIG_CIFS_DEBUG_DUMP_KEYS
cifs_dbg(VFS, "%s: dumping generated AES session keys\n", __func__);
/*
__le16 StructureSize; /* 60 */
__le16 Flags;
__le32 Reserved;
- __le64 CreationTime;
- __le64 LastAccessTime;
- __le64 LastWriteTime;
- __le64 ChangeTime;
- __le64 AllocationSize; /* Beginning of FILE_STANDARD_INFO equivalent */
- __le64 EndOfFile;
- __le32 Attributes;
+ struct_group(network_open_info,
+ __le64 CreationTime;
+ __le64 LastAccessTime;
+ __le64 LastWriteTime;
+ __le64 ChangeTime;
+ /* Beginning of FILE_STANDARD_INFO equivalent */
+ __le64 AllocationSize;
+ __le64 EndOfFile;
+ __le32 Attributes;
+ );
} __packed;
#define SMB2_CREATE_SD_BUFFER "SecD" /* security descriptor */
#define SMB2_CREATE_DURABLE_HANDLE_REQUEST "DHnQ"
#define SMB2_CREATE_DURABLE_HANDLE_RECONNECT "DHnC"
-#define SMB2_CREATE_ALLOCATION_SIZE "AISi"
+#define SMB2_CREATE_ALLOCATION_SIZE "AlSi"
#define SMB2_CREATE_QUERY_MAXIMAL_ACCESS_REQUEST "MxAc"
#define SMB2_CREATE_TIMEWARP_REQUEST "TWrp"
#define SMB2_CREATE_QUERY_ON_DISK_ID "QFid"
#define SMB2_LEASE_WRITE_CACHING_LE cpu_to_le32(0x04)
#define SMB2_LEASE_FLAG_BREAK_IN_PROGRESS_LE cpu_to_le32(0x02)
+#define SMB2_LEASE_FLAG_PARENT_LEASE_KEY_SET_LE cpu_to_le32(0x04)
#define SMB2_LEASE_KEY_SIZE 16
kfree(work->tr_buf);
kvfree(work->request_buf);
kfree(work->iov);
+ if (!list_empty(&work->interim_entry))
+ list_del(&work->interim_entry);
+
if (work->async_id)
ksmbd_release_id(&work->conn->async_ida, work->async_id);
kmem_cache_free(work_cache, work);
static int __ksmbd_iov_pin_rsp(struct ksmbd_work *work, void *ib, int len,
void *aux_buf, unsigned int aux_size)
{
- struct aux_read *ar;
+ struct aux_read *ar = NULL;
int need_iov_cnt = 1;
if (aux_size) {
new = krealloc(work->iov,
sizeof(struct kvec) * work->iov_alloc_cnt,
GFP_KERNEL | __GFP_ZERO);
- if (!new)
+ if (!new) {
+ kfree(ar);
+ work->iov_alloc_cnt -= 4;
return -ENOMEM;
+ }
work->iov = new;
}
lease->new_state = 0;
lease->flags = lctx->flags;
lease->duration = lctx->duration;
+ lease->is_dir = lctx->is_dir;
memcpy(lease->parent_lease_key, lctx->parent_lease_key, SMB2_LEASE_KEY_SIZE);
lease->version = lctx->version;
- lease->epoch = 0;
+ lease->epoch = le16_to_cpu(lctx->epoch);
INIT_LIST_HEAD(&opinfo->lease_entry);
opinfo->o_lease = lease;
{
struct oplock_info *opinfo;
- if (S_ISDIR(file_inode(fp->filp)->i_mode))
- return;
+ if (fp->reserve_lease_break)
+ smb_lazy_parent_lease_break_close(fp);
opinfo = opinfo_get(fp);
if (!opinfo)
/* upgrading lease */
if ((atomic_read(&ci->op_count) +
atomic_read(&ci->sop_count)) == 1) {
- if (lease->state ==
- (lctx->req_state & lease->state)) {
+ if (lease->state != SMB2_LEASE_NONE_LE &&
+ lease->state == (lctx->req_state & lease->state)) {
lease->state |= lctx->req_state;
if (lctx->req_state &
SMB2_LEASE_WRITE_CACHING_LE)
lease_read_to_write(opinfo);
+
}
} else if ((atomic_read(&ci->op_count) +
atomic_read(&ci->sop_count)) > 1) {
interim_entry);
setup_async_work(in_work, NULL, NULL);
smb2_send_interim_resp(in_work, STATUS_PENDING);
- list_del(&in_work->interim_entry);
+ list_del_init(&in_work->interim_entry);
+ release_async_work(in_work);
}
INIT_WORK(&work->work, __smb2_lease_break_noti);
ksmbd_queue_work(work);
lease->new_state =
SMB2_LEASE_READ_CACHING_LE;
} else {
- if (lease->state & SMB2_LEASE_HANDLE_CACHING_LE)
+ if (lease->state & SMB2_LEASE_HANDLE_CACHING_LE &&
+ !lease->is_dir)
lease->new_state =
SMB2_LEASE_READ_CACHING_LE;
else
SMB2_LEASE_KEY_SIZE);
lease2->duration = lease1->duration;
lease2->flags = lease1->flags;
+ lease2->epoch = lease1->epoch++;
}
static int add_lease_global_list(struct oplock_info *opinfo)
}
}
+void smb_send_parent_lease_break_noti(struct ksmbd_file *fp,
+ struct lease_ctx_info *lctx)
+{
+ struct oplock_info *opinfo;
+ struct ksmbd_inode *p_ci = NULL;
+
+ if (lctx->version != 2)
+ return;
+
+ p_ci = ksmbd_inode_lookup_lock(fp->filp->f_path.dentry->d_parent);
+ if (!p_ci)
+ return;
+
+ read_lock(&p_ci->m_lock);
+ list_for_each_entry(opinfo, &p_ci->m_op_list, op_entry) {
+ if (!opinfo->is_lease)
+ continue;
+
+ if (opinfo->o_lease->state != SMB2_OPLOCK_LEVEL_NONE &&
+ (!(lctx->flags & SMB2_LEASE_FLAG_PARENT_LEASE_KEY_SET_LE) ||
+ !compare_guid_key(opinfo, fp->conn->ClientGUID,
+ lctx->parent_lease_key))) {
+ if (!atomic_inc_not_zero(&opinfo->refcount))
+ continue;
+
+ atomic_inc(&opinfo->conn->r_count);
+ if (ksmbd_conn_releasing(opinfo->conn)) {
+ atomic_dec(&opinfo->conn->r_count);
+ continue;
+ }
+
+ read_unlock(&p_ci->m_lock);
+ oplock_break(opinfo, SMB2_OPLOCK_LEVEL_NONE);
+ opinfo_conn_put(opinfo);
+ read_lock(&p_ci->m_lock);
+ }
+ }
+ read_unlock(&p_ci->m_lock);
+
+ ksmbd_inode_put(p_ci);
+}
+
+void smb_lazy_parent_lease_break_close(struct ksmbd_file *fp)
+{
+ struct oplock_info *opinfo;
+ struct ksmbd_inode *p_ci = NULL;
+
+ rcu_read_lock();
+ opinfo = rcu_dereference(fp->f_opinfo);
+ rcu_read_unlock();
+
+ if (!opinfo->is_lease || opinfo->o_lease->version != 2)
+ return;
+
+ p_ci = ksmbd_inode_lookup_lock(fp->filp->f_path.dentry->d_parent);
+ if (!p_ci)
+ return;
+
+ read_lock(&p_ci->m_lock);
+ list_for_each_entry(opinfo, &p_ci->m_op_list, op_entry) {
+ if (!opinfo->is_lease)
+ continue;
+
+ if (opinfo->o_lease->state != SMB2_OPLOCK_LEVEL_NONE) {
+ if (!atomic_inc_not_zero(&opinfo->refcount))
+ continue;
+
+ atomic_inc(&opinfo->conn->r_count);
+ if (ksmbd_conn_releasing(opinfo->conn)) {
+ atomic_dec(&opinfo->conn->r_count);
+ continue;
+ }
+ read_unlock(&p_ci->m_lock);
+ oplock_break(opinfo, SMB2_OPLOCK_LEVEL_NONE);
+ opinfo_conn_put(opinfo);
+ read_lock(&p_ci->m_lock);
+ }
+ }
+ read_unlock(&p_ci->m_lock);
+
+ ksmbd_inode_put(p_ci);
+}
+
/**
* smb_grant_oplock() - handle oplock/lease request on file open
* @work: smb work
bool prev_op_has_lease;
__le32 prev_op_state = 0;
- /* not support directory lease */
- if (S_ISDIR(file_inode(fp->filp)->i_mode))
- return 0;
-
opinfo = alloc_opinfo(work, pid, tid);
if (!opinfo)
return -ENOMEM;
memcpy(buf->lcontext.LeaseKey, lease->lease_key,
SMB2_LEASE_KEY_SIZE);
buf->lcontext.LeaseFlags = lease->flags;
+ buf->lcontext.Epoch = cpu_to_le16(++lease->epoch);
buf->lcontext.LeaseState = lease->state;
memcpy(buf->lcontext.ParentLeaseKey, lease->parent_lease_key,
SMB2_LEASE_KEY_SIZE);
/**
* parse_lease_state() - parse lease context containted in file open request
* @open_req: buffer containing smb2 file open(create) request
+ * @is_dir: whether leasing file is directory
*
* Return: oplock state, -ENOENT if create lease context not found
*/
-struct lease_ctx_info *parse_lease_state(void *open_req)
+struct lease_ctx_info *parse_lease_state(void *open_req, bool is_dir)
{
struct create_context *cc;
struct smb2_create_req *req = (struct smb2_create_req *)open_req;
struct create_lease_v2 *lc = (struct create_lease_v2 *)cc;
memcpy(lreq->lease_key, lc->lcontext.LeaseKey, SMB2_LEASE_KEY_SIZE);
- lreq->req_state = lc->lcontext.LeaseState;
+ if (is_dir) {
+ lreq->req_state = lc->lcontext.LeaseState &
+ ~SMB2_LEASE_WRITE_CACHING_LE;
+ lreq->is_dir = true;
+ } else
+ lreq->req_state = lc->lcontext.LeaseState;
lreq->flags = lc->lcontext.LeaseFlags;
+ lreq->epoch = lc->lcontext.Epoch;
lreq->duration = lc->lcontext.LeaseDuration;
memcpy(lreq->parent_lease_key, lc->lcontext.ParentLeaseKey,
SMB2_LEASE_KEY_SIZE);
__le32 flags;
__le64 duration;
__u8 parent_lease_key[SMB2_LEASE_KEY_SIZE];
+ __le16 epoch;
int version;
+ bool is_dir;
};
struct lease_table {
__u8 parent_lease_key[SMB2_LEASE_KEY_SIZE];
int version;
unsigned short epoch;
+ bool is_dir;
struct lease_table *l_lb;
};
/* Lease related functions */
void create_lease_buf(u8 *rbuf, struct lease *lease);
-struct lease_ctx_info *parse_lease_state(void *open_req);
+struct lease_ctx_info *parse_lease_state(void *open_req, bool is_dir);
__u8 smb2_map_lease_to_oplock(__le32 lease_state);
int lease_read_to_write(struct oplock_info *opinfo);
int find_same_lease_key(struct ksmbd_session *sess, struct ksmbd_inode *ci,
struct lease_ctx_info *lctx);
void destroy_lease_table(struct ksmbd_conn *conn);
+void smb_send_parent_lease_break_noti(struct ksmbd_file *fp,
+ struct lease_ctx_info *lctx);
+void smb_lazy_parent_lease_break_close(struct ksmbd_file *fp);
#endif /* __KSMBD_OPLOCK_H */
conn->signing_algorithm = SIGNING_ALG_AES_CMAC_LE;
if (server_conf.flags & KSMBD_GLOBAL_FLAG_SMB2_LEASES)
- conn->vals->capabilities |= SMB2_GLOBAL_CAP_LEASING;
+ conn->vals->capabilities |= SMB2_GLOBAL_CAP_LEASING |
+ SMB2_GLOBAL_CAP_DIRECTORY_LEASING;
if (server_conf.flags & KSMBD_GLOBAL_FLAG_SMB2_ENCRYPTION &&
conn->cli_cap & SMB2_GLOBAL_CAP_ENCRYPTION)
conn->signing_algorithm = SIGNING_ALG_AES_CMAC_LE;
if (server_conf.flags & KSMBD_GLOBAL_FLAG_SMB2_LEASES)
- conn->vals->capabilities |= SMB2_GLOBAL_CAP_LEASING;
+ conn->vals->capabilities |= SMB2_GLOBAL_CAP_LEASING |
+ SMB2_GLOBAL_CAP_DIRECTORY_LEASING;
if (server_conf.flags & KSMBD_GLOBAL_FLAG_SMB2_ENCRYPTION ||
(!(server_conf.flags & KSMBD_GLOBAL_FLAG_SMB2_ENCRYPTION_OFF) &&
conn->signing_algorithm = SIGNING_ALG_AES_CMAC_LE;
if (server_conf.flags & KSMBD_GLOBAL_FLAG_SMB2_LEASES)
- conn->vals->capabilities |= SMB2_GLOBAL_CAP_LEASING;
+ conn->vals->capabilities |= SMB2_GLOBAL_CAP_LEASING |
+ SMB2_GLOBAL_CAP_DIRECTORY_LEASING;
if (server_conf.flags & KSMBD_GLOBAL_FLAG_SMB2_ENCRYPTION ||
(!(server_conf.flags & KSMBD_GLOBAL_FLAG_SMB2_ENCRYPTION_OFF) &&
int setup_async_work(struct ksmbd_work *work, void (*fn)(void **), void **arg)
{
- struct smb2_hdr *rsp_hdr;
struct ksmbd_conn *conn = work->conn;
int id;
- rsp_hdr = ksmbd_resp_buf_next(work);
- rsp_hdr->Flags |= SMB2_FLAGS_ASYNC_COMMAND;
-
id = ksmbd_acquire_async_msg_id(&conn->async_ida);
if (id < 0) {
pr_err("Failed to alloc async message id\n");
}
work->asynchronous = true;
work->async_id = id;
- rsp_hdr->Id.AsyncId = cpu_to_le64(id);
ksmbd_debug(SMB,
"Send interim Response to inform async request id : %d\n",
__SMB2_HEADER_STRUCTURE_SIZE);
rsp_hdr = smb2_get_msg(in_work->response_buf);
+ rsp_hdr->Flags |= SMB2_FLAGS_ASYNC_COMMAND;
+ rsp_hdr->Id.AsyncId = cpu_to_le64(work->async_id);
smb2_set_err_rsp(in_work);
rsp_hdr->Status = status;
rc = 0;
} else {
rc = ksmbd_vfs_setxattr(idmap, path, attr_name, value,
- le16_to_cpu(eabuf->EaValueLength), 0);
+ le16_to_cpu(eabuf->EaValueLength),
+ 0, true);
if (rc < 0) {
ksmbd_debug(SMB,
"ksmbd_vfs_setxattr is failed(%d)\n",
return -EBADF;
}
- rc = ksmbd_vfs_setxattr(idmap, path, xattr_stream_name, NULL, 0, 0);
+ rc = ksmbd_vfs_setxattr(idmap, path, xattr_stream_name, NULL, 0, 0, false);
if (rc < 0)
pr_err("Failed to store XATTR stream name :%d\n", rc);
return 0;
da.flags = XATTR_DOSINFO_ATTRIB | XATTR_DOSINFO_CREATE_TIME |
XATTR_DOSINFO_ITIME;
- rc = ksmbd_vfs_set_dos_attrib_xattr(mnt_idmap(path->mnt), path, &da);
+ rc = ksmbd_vfs_set_dos_attrib_xattr(mnt_idmap(path->mnt), path, &da, true);
if (rc)
ksmbd_debug(SMB, "failed to store file attribute into xattr\n");
}
sizeof(struct create_sd_buf_req))
return -EINVAL;
return set_info_sec(work->conn, work->tcon, path, &sd_buf->ntsd,
- le32_to_cpu(sd_buf->ccontext.DataLength), true);
+ le32_to_cpu(sd_buf->ccontext.DataLength), true, false);
}
static void ksmbd_acls_fattr(struct smb_fattr *fattr,
*(char *)req->Buffer == '\\') {
pr_err("not allow directory name included leading slash\n");
rc = -EINVAL;
- goto err_out1;
+ goto err_out2;
}
name = smb2_get_name(req->Buffer,
if (rc != -ENOMEM)
rc = -ENOENT;
name = NULL;
- goto err_out1;
+ goto err_out2;
}
ksmbd_debug(SMB, "converted name = %s\n", name);
if (!test_share_config_flag(work->tcon->share_conf,
KSMBD_SHARE_FLAG_STREAMS)) {
rc = -EBADF;
- goto err_out1;
+ goto err_out2;
}
rc = parse_stream_name(name, &stream_name, &s_type);
if (rc < 0)
- goto err_out1;
+ goto err_out2;
}
rc = ksmbd_validate_filename(name);
if (rc < 0)
- goto err_out1;
+ goto err_out2;
if (ksmbd_share_veto_filename(share, name)) {
rc = -ENOENT;
ksmbd_debug(SMB, "Reject open(), vetoed file: %s\n",
name);
- goto err_out1;
+ goto err_out2;
}
} else {
name = kstrdup("", GFP_KERNEL);
if (!name) {
rc = -ENOMEM;
- goto err_out1;
+ goto err_out2;
}
}
- req_op_level = req->RequestedOplockLevel;
- if (req_op_level == SMB2_OPLOCK_LEVEL_LEASE)
- lc = parse_lease_state(req);
-
if (le32_to_cpu(req->ImpersonationLevel) > le32_to_cpu(IL_DELEGATE)) {
pr_err("Invalid impersonationlevel : 0x%x\n",
le32_to_cpu(req->ImpersonationLevel));
rc = -EIO;
rsp->hdr.Status = STATUS_BAD_IMPERSONATION_LEVEL;
- goto err_out1;
+ goto err_out2;
}
if (req->CreateOptions && !(req->CreateOptions & CREATE_OPTIONS_MASK_LE)) {
pr_err("Invalid create options : 0x%x\n",
le32_to_cpu(req->CreateOptions));
rc = -EINVAL;
- goto err_out1;
+ goto err_out2;
} else {
if (req->CreateOptions & FILE_SEQUENTIAL_ONLY_LE &&
req->CreateOptions & FILE_RANDOM_ACCESS_LE)
(FILE_OPEN_BY_FILE_ID_LE | CREATE_TREE_CONNECTION |
FILE_RESERVE_OPFILTER_LE)) {
rc = -EOPNOTSUPP;
- goto err_out1;
+ goto err_out2;
}
if (req->CreateOptions & FILE_DIRECTORY_FILE_LE) {
if (req->CreateOptions & FILE_NON_DIRECTORY_FILE_LE) {
rc = -EINVAL;
- goto err_out1;
+ goto err_out2;
} else if (req->CreateOptions & FILE_NO_COMPRESSION_LE) {
req->CreateOptions = ~(FILE_NO_COMPRESSION_LE);
}
pr_err("Invalid create disposition : 0x%x\n",
le32_to_cpu(req->CreateDisposition));
rc = -EINVAL;
- goto err_out1;
+ goto err_out2;
}
if (!(req->DesiredAccess & DESIRED_ACCESS_MASK)) {
pr_err("Invalid desired access : 0x%x\n",
le32_to_cpu(req->DesiredAccess));
rc = -EACCES;
- goto err_out1;
+ goto err_out2;
}
if (req->FileAttributes && !(req->FileAttributes & FILE_ATTRIBUTE_MASK_LE)) {
pr_err("Invalid file attribute : 0x%x\n",
le32_to_cpu(req->FileAttributes));
rc = -EINVAL;
- goto err_out1;
+ goto err_out2;
}
if (req->CreateContextsOffset) {
context = smb2_find_context_vals(req, SMB2_CREATE_EA_BUFFER, 4);
if (IS_ERR(context)) {
rc = PTR_ERR(context);
- goto err_out1;
+ goto err_out2;
} else if (context) {
ea_buf = (struct create_ea_buf_req *)context;
if (le16_to_cpu(context->DataOffset) +
le32_to_cpu(context->DataLength) <
sizeof(struct create_ea_buf_req)) {
rc = -EINVAL;
- goto err_out1;
+ goto err_out2;
}
if (req->CreateOptions & FILE_NO_EA_KNOWLEDGE_LE) {
rsp->hdr.Status = STATUS_ACCESS_DENIED;
rc = -EACCES;
- goto err_out1;
+ goto err_out2;
}
}
SMB2_CREATE_QUERY_MAXIMAL_ACCESS_REQUEST, 4);
if (IS_ERR(context)) {
rc = PTR_ERR(context);
- goto err_out1;
+ goto err_out2;
} else if (context) {
ksmbd_debug(SMB,
"get query maximal access context\n");
SMB2_CREATE_TIMEWARP_REQUEST, 4);
if (IS_ERR(context)) {
rc = PTR_ERR(context);
- goto err_out1;
+ goto err_out2;
} else if (context) {
ksmbd_debug(SMB, "get timewarp context\n");
rc = -EBADF;
- goto err_out1;
+ goto err_out2;
}
if (tcon->posix_extensions) {
SMB2_CREATE_TAG_POSIX, 16);
if (IS_ERR(context)) {
rc = PTR_ERR(context);
- goto err_out1;
+ goto err_out2;
} else if (context) {
struct create_posix *posix =
(struct create_posix *)context;
le32_to_cpu(context->DataLength) <
sizeof(struct create_posix) - 4) {
rc = -EINVAL;
- goto err_out1;
+ goto err_out2;
}
ksmbd_debug(SMB, "get posix context\n");
if (ksmbd_override_fsids(work)) {
rc = -ENOMEM;
- goto err_out1;
+ goto err_out2;
}
rc = ksmbd_vfs_kern_path_locked(work, name, LOOKUP_NO_SYMLINKS,
}
}
- rc = ksmbd_query_inode_status(d_inode(path.dentry->d_parent));
+ rc = ksmbd_query_inode_status(path.dentry->d_parent);
if (rc == KSMBD_INODE_STATUS_PENDING_DELETE) {
rc = -EBUSY;
goto err_out;
idmap,
&path,
pntsd,
- pntsd_size);
+ pntsd_size,
+ false);
kfree(pntsd);
if (rc)
pr_err("failed to store ntacl in xattr : %d\n",
fp->attrib_only = !(req->DesiredAccess & ~(FILE_READ_ATTRIBUTES_LE |
FILE_WRITE_ATTRIBUTES_LE | FILE_SYNCHRONIZE_LE));
- if (!S_ISDIR(file_inode(filp)->i_mode) && open_flags & O_TRUNC &&
- !fp->attrib_only && !stream_name) {
- smb_break_all_oplock(work, fp);
- need_truncate = 1;
- }
/* fp should be searchable through ksmbd_inode.m_fp_list
* after daccess, saccess, attrib_only, and stream are
goto err_out;
}
+ if (file_present || created)
+ ksmbd_vfs_kern_path_unlock(&parent_path, &path);
+
+ if (!S_ISDIR(file_inode(filp)->i_mode) && open_flags & O_TRUNC &&
+ !fp->attrib_only && !stream_name) {
+ smb_break_all_oplock(work, fp);
+ need_truncate = 1;
+ }
+
+ req_op_level = req->RequestedOplockLevel;
+ if (req_op_level == SMB2_OPLOCK_LEVEL_LEASE)
+ lc = parse_lease_state(req, S_ISDIR(file_inode(filp)->i_mode));
+
share_ret = ksmbd_smb_check_shared_mode(fp->filp, fp);
if (!test_share_config_flag(work->tcon->share_conf, KSMBD_SHARE_FLAG_OPLOCKS) ||
(req_op_level == SMB2_OPLOCK_LEVEL_LEASE &&
!(conn->vals->capabilities & SMB2_GLOBAL_CAP_LEASING))) {
if (share_ret < 0 && !S_ISDIR(file_inode(fp->filp)->i_mode)) {
rc = share_ret;
- goto err_out;
+ goto err_out1;
}
} else {
if (req_op_level == SMB2_OPLOCK_LEVEL_LEASE) {
+ /*
+ * Compare parent lease using parent key. If there is no
+ * a lease that has same parent key, Send lease break
+ * notification.
+ */
+ smb_send_parent_lease_break_noti(fp, lc);
+
req_op_level = smb2_map_lease_to_oplock(lc->req_state);
ksmbd_debug(SMB,
"lease req for(%s) req oplock state 0x%x, lease state 0x%x\n",
name, req_op_level, lc->req_state);
rc = find_same_lease_key(sess, fp->f_ci, lc);
if (rc)
- goto err_out;
+ goto err_out1;
} else if (open_flags == O_RDONLY &&
(req_op_level == SMB2_OPLOCK_LEVEL_BATCH ||
req_op_level == SMB2_OPLOCK_LEVEL_EXCLUSIVE))
le32_to_cpu(req->hdr.Id.SyncId.TreeId),
lc, share_ret);
if (rc < 0)
- goto err_out;
+ goto err_out1;
}
if (req->CreateOptions & FILE_DELETE_ON_CLOSE_LE)
ksmbd_fd_set_delete_on_close(fp, file_info);
if (need_truncate) {
- rc = smb2_create_truncate(&path);
+ rc = smb2_create_truncate(&fp->filp->f_path);
if (rc)
- goto err_out;
+ goto err_out1;
}
if (req->CreateContextsOffset) {
SMB2_CREATE_ALLOCATION_SIZE, 4);
if (IS_ERR(az_req)) {
rc = PTR_ERR(az_req);
- goto err_out;
+ goto err_out1;
} else if (az_req) {
loff_t alloc_size;
int err;
le32_to_cpu(az_req->ccontext.DataLength) <
sizeof(struct create_alloc_size_req)) {
rc = -EINVAL;
- goto err_out;
+ goto err_out1;
}
alloc_size = le64_to_cpu(az_req->AllocationSize);
ksmbd_debug(SMB,
context = smb2_find_context_vals(req, SMB2_CREATE_QUERY_ON_DISK_ID, 4);
if (IS_ERR(context)) {
rc = PTR_ERR(context);
- goto err_out;
+ goto err_out1;
} else if (context) {
ksmbd_debug(SMB, "get query on disk id context\n");
query_disk_id = 1;
rc = ksmbd_vfs_getattr(&path, &stat);
if (rc)
- goto err_out;
+ goto err_out1;
if (stat.result_mask & STATX_BTIME)
fp->create_time = ksmbd_UnixTimeToNT(stat.btime);
}
err_out:
- if (file_present || created) {
- inode_unlock(d_inode(parent_path.dentry));
- path_put(&path);
- path_put(&parent_path);
- }
- ksmbd_revert_fsids(work);
+ if (rc && (file_present || created))
+ ksmbd_vfs_kern_path_unlock(&parent_path, &path);
+
err_out1:
+ ksmbd_revert_fsids(work);
+
+err_out2:
if (!rc) {
ksmbd_update_fstate(&work->sess->file_table, fp, FP_INITED);
rc = ksmbd_iov_pin_rsp(work, (void *)rsp, iov_len);
rc = ksmbd_vfs_setxattr(file_mnt_idmap(fp->filp),
&fp->filp->f_path,
xattr_stream_name,
- NULL, 0, 0);
+ NULL, 0, 0, true);
if (rc < 0) {
pr_err("failed to store stream name in xattr: %d\n",
rc);
if (rc)
rc = -EINVAL;
out:
- if (file_present) {
- inode_unlock(d_inode(parent_path.dentry));
- path_put(&path);
- path_put(&parent_path);
- }
+ if (file_present)
+ ksmbd_vfs_kern_path_unlock(&parent_path, &path);
+
if (!IS_ERR(link_name))
kfree(link_name);
kfree(pathname);
da.flags = XATTR_DOSINFO_ATTRIB | XATTR_DOSINFO_CREATE_TIME |
XATTR_DOSINFO_ITIME;
- rc = ksmbd_vfs_set_dos_attrib_xattr(idmap, &filp->f_path, &da);
+ rc = ksmbd_vfs_set_dos_attrib_xattr(idmap, &filp->f_path, &da,
+ true);
if (rc)
ksmbd_debug(SMB,
"failed to restore file attribute in EA\n");
fp->saccess |= FILE_SHARE_DELETE_LE;
return set_info_sec(fp->conn, fp->tcon, &fp->filp->f_path, pntsd,
- buf_len, false);
+ buf_len, false, true);
}
/**
smb2_remove_blocked_lock,
argv);
if (rc) {
+ kfree(argv);
err = -ENOMEM;
goto out;
}
da.attr = le32_to_cpu(fp->f_ci->m_fattr);
ret = ksmbd_vfs_set_dos_attrib_xattr(idmap,
- &fp->filp->f_path, &da);
+ &fp->filp->f_path,
+ &da, true);
if (ret)
fp->f_ci->m_fattr = old_fattr;
}
le32_to_cpu(req->LeaseState));
}
+ if (ret < 0) {
+ rsp->hdr.Status = err;
+ goto err_out;
+ }
+
lease_state = lease->state;
opinfo->op_state = OPLOCK_STATE_NONE;
wake_up_interruptible_all(&opinfo->oplock_q);
wake_up_interruptible_all(&opinfo->oplock_brk);
opinfo_put(opinfo);
- if (ret < 0) {
- rsp->hdr.Status = err;
- goto err_out;
- }
-
rsp->StructureSize = cpu_to_le16(36);
rsp->Reserved = 0;
rsp->Flags = 0;
return;
err_out:
- opinfo->op_state = OPLOCK_STATE_NONE;
wake_up_interruptible_all(&opinfo->oplock_q);
atomic_dec(&opinfo->breaking_cnt);
wake_up_interruptible_all(&opinfo->oplock_brk);
pntsd_size += sizeof(struct smb_acl) + nt_size;
}
- ksmbd_vfs_set_sd_xattr(conn, idmap, path, pntsd, pntsd_size);
+ ksmbd_vfs_set_sd_xattr(conn, idmap, path, pntsd, pntsd_size, false);
kfree(pntsd);
}
int set_info_sec(struct ksmbd_conn *conn, struct ksmbd_tree_connect *tcon,
const struct path *path, struct smb_ntsd *pntsd, int ntsd_len,
- bool type_check)
+ bool type_check, bool get_write)
{
int rc;
struct smb_fattr fattr = {{0}};
if (test_share_config_flag(tcon->share_conf, KSMBD_SHARE_FLAG_ACL_XATTR)) {
/* Update WinACL in xattr */
ksmbd_vfs_remove_sd_xattrs(idmap, path);
- ksmbd_vfs_set_sd_xattr(conn, idmap, path, pntsd, ntsd_len);
+ ksmbd_vfs_set_sd_xattr(conn, idmap, path, pntsd, ntsd_len,
+ get_write);
}
out:
__le32 *pdaccess, int uid);
int set_info_sec(struct ksmbd_conn *conn, struct ksmbd_tree_connect *tcon,
const struct path *path, struct smb_ntsd *pntsd, int ntsd_len,
- bool type_check);
+ bool type_check, bool get_write);
void id_to_sid(unsigned int cid, uint sidtype, struct smb_sid *ssid);
void ksmbd_init_domain(u32 *sub_auth);
return -ENOENT;
}
+ err = mnt_want_write(parent_path->mnt);
+ if (err) {
+ path_put(parent_path);
+ putname(filename);
+ return -ENOENT;
+ }
+
inode_lock_nested(parent_path->dentry->d_inode, I_MUTEX_PARENT);
d = lookup_one_qstr_excl(&last, parent_path->dentry, 0);
if (IS_ERR(d))
err_out:
inode_unlock(d_inode(parent_path->dentry));
+ mnt_drop_write(parent_path->mnt);
path_put(parent_path);
putname(filename);
return -ENOENT;
fp->stream.name,
(void *)stream_buf,
size,
- 0);
+ 0,
+ true);
if (err < 0)
goto out;
}
}
+ /* Reserve lease break for parent dir at closing time */
+ fp->reserve_lease_break = true;
+
/* Do we need to break any of a levelII oplock? */
smb_break_all_levII_oplock(work, fp, 1);
goto out_err;
}
- err = mnt_want_write(path->mnt);
- if (err)
- goto out_err;
-
idmap = mnt_idmap(path->mnt);
if (S_ISDIR(d_inode(path->dentry)->i_mode)) {
err = vfs_rmdir(idmap, d_inode(parent), path->dentry);
if (err)
ksmbd_debug(VFS, "unlink failed, err %d\n", err);
}
- mnt_drop_write(path->mnt);
out_err:
ksmbd_revert_fsids(work);
goto out3;
}
- parent_fp = ksmbd_lookup_fd_inode(d_inode(old_child->d_parent));
+ parent_fp = ksmbd_lookup_fd_inode(old_child->d_parent);
if (parent_fp) {
if (parent_fp->daccess & FILE_DELETE_LE) {
pr_err("parent dir is opened with delete access\n");
* @attr_value: xattr value to set
* @attr_size: size of xattr value
* @flags: destination buffer length
+ * @get_write: get write access to a mount
*
* Return: 0 on success, otherwise error
*/
int ksmbd_vfs_setxattr(struct mnt_idmap *idmap,
const struct path *path, const char *attr_name,
- void *attr_value, size_t attr_size, int flags)
+ void *attr_value, size_t attr_size, int flags,
+ bool get_write)
{
int err;
- err = mnt_want_write(path->mnt);
- if (err)
- return err;
+ if (get_write == true) {
+ err = mnt_want_write(path->mnt);
+ if (err)
+ return err;
+ }
err = vfs_setxattr(idmap,
path->dentry,
flags);
if (err)
ksmbd_debug(VFS, "setxattr failed, err %d\n", err);
- mnt_drop_write(path->mnt);
+ if (get_write == true)
+ mnt_drop_write(path->mnt);
return err;
}
}
if (!err) {
+ err = mnt_want_write(parent_path->mnt);
+ if (err) {
+ path_put(path);
+ path_put(parent_path);
+ return err;
+ }
+
err = ksmbd_vfs_lock_parent(parent_path->dentry, path->dentry);
if (err) {
path_put(path);
return err;
}
+void ksmbd_vfs_kern_path_unlock(struct path *parent_path, struct path *path)
+{
+ inode_unlock(d_inode(parent_path->dentry));
+ mnt_drop_write(parent_path->mnt);
+ path_put(path);
+ path_put(parent_path);
+}
+
struct dentry *ksmbd_vfs_kern_path_create(struct ksmbd_work *work,
const char *name,
unsigned int flags,
int ksmbd_vfs_set_sd_xattr(struct ksmbd_conn *conn,
struct mnt_idmap *idmap,
const struct path *path,
- struct smb_ntsd *pntsd, int len)
+ struct smb_ntsd *pntsd, int len,
+ bool get_write)
{
int rc;
struct ndr sd_ndr = {0}, acl_ndr = {0};
rc = ksmbd_vfs_setxattr(idmap, path,
XATTR_NAME_SD, sd_ndr.data,
- sd_ndr.offset, 0);
+ sd_ndr.offset, 0, get_write);
if (rc < 0)
pr_err("Failed to store XATTR ntacl :%d\n", rc);
int ksmbd_vfs_set_dos_attrib_xattr(struct mnt_idmap *idmap,
const struct path *path,
- struct xattr_dos_attrib *da)
+ struct xattr_dos_attrib *da,
+ bool get_write)
{
struct ndr n;
int err;
return err;
err = ksmbd_vfs_setxattr(idmap, path, XATTR_NAME_DOS_ATTRIBUTE,
- (void *)n.data, n.offset, 0);
+ (void *)n.data, n.offset, 0, get_write);
if (err)
ksmbd_debug(SMB, "failed to store dos attribute in xattr\n");
kfree(n.data);
}
posix_state_to_acl(&acl_state, acls->a_entries);
- rc = mnt_want_write(path->mnt);
- if (rc)
- goto out_err;
-
rc = set_posix_acl(idmap, dentry, ACL_TYPE_ACCESS, acls);
if (rc < 0)
ksmbd_debug(SMB, "Set posix acl(ACL_TYPE_ACCESS) failed, rc : %d\n",
ksmbd_debug(SMB, "Set posix acl(ACL_TYPE_DEFAULT) failed, rc : %d\n",
rc);
}
- mnt_drop_write(path->mnt);
-out_err:
free_acl_state(&acl_state);
posix_acl_release(acls);
return rc;
}
}
- rc = mnt_want_write(path->mnt);
- if (rc)
- goto out_err;
-
rc = set_posix_acl(idmap, dentry, ACL_TYPE_ACCESS, acls);
if (rc < 0)
ksmbd_debug(SMB, "Set posix acl(ACL_TYPE_ACCESS) failed, rc : %d\n",
ksmbd_debug(SMB, "Set posix acl(ACL_TYPE_DEFAULT) failed, rc : %d\n",
rc);
}
- mnt_drop_write(path->mnt);
-out_err:
posix_acl_release(acls);
return rc;
}
int attr_name_len);
int ksmbd_vfs_setxattr(struct mnt_idmap *idmap,
const struct path *path, const char *attr_name,
- void *attr_value, size_t attr_size, int flags);
+ void *attr_value, size_t attr_size, int flags,
+ bool get_write);
int ksmbd_vfs_xattr_stream_name(char *stream_name, char **xattr_stream_name,
size_t *xattr_stream_name_size, int s_type);
int ksmbd_vfs_remove_xattr(struct mnt_idmap *idmap,
int ksmbd_vfs_kern_path_locked(struct ksmbd_work *work, char *name,
unsigned int flags, struct path *parent_path,
struct path *path, bool caseless);
+void ksmbd_vfs_kern_path_unlock(struct path *parent_path, struct path *path);
struct dentry *ksmbd_vfs_kern_path_create(struct ksmbd_work *work,
const char *name,
unsigned int flags,
int ksmbd_vfs_set_sd_xattr(struct ksmbd_conn *conn,
struct mnt_idmap *idmap,
const struct path *path,
- struct smb_ntsd *pntsd, int len);
+ struct smb_ntsd *pntsd, int len,
+ bool get_write);
int ksmbd_vfs_get_sd_xattr(struct ksmbd_conn *conn,
struct mnt_idmap *idmap,
struct dentry *dentry,
struct smb_ntsd **pntsd);
int ksmbd_vfs_set_dos_attrib_xattr(struct mnt_idmap *idmap,
const struct path *path,
- struct xattr_dos_attrib *da);
+ struct xattr_dos_attrib *da,
+ bool get_write);
int ksmbd_vfs_get_dos_attrib_xattr(struct mnt_idmap *idmap,
struct dentry *dentry,
struct xattr_dos_attrib *da);
return tmp & inode_hash_mask;
}
-static struct ksmbd_inode *__ksmbd_inode_lookup(struct inode *inode)
+static struct ksmbd_inode *__ksmbd_inode_lookup(struct dentry *de)
{
struct hlist_head *head = inode_hashtable +
- inode_hash(inode->i_sb, inode->i_ino);
+ inode_hash(d_inode(de)->i_sb, (unsigned long)de);
struct ksmbd_inode *ci = NULL, *ret_ci = NULL;
hlist_for_each_entry(ci, head, m_hash) {
- if (ci->m_inode == inode) {
+ if (ci->m_de == de) {
if (atomic_inc_not_zero(&ci->m_count))
ret_ci = ci;
break;
static struct ksmbd_inode *ksmbd_inode_lookup(struct ksmbd_file *fp)
{
- return __ksmbd_inode_lookup(file_inode(fp->filp));
+ return __ksmbd_inode_lookup(fp->filp->f_path.dentry);
}
-static struct ksmbd_inode *ksmbd_inode_lookup_by_vfsinode(struct inode *inode)
+struct ksmbd_inode *ksmbd_inode_lookup_lock(struct dentry *d)
{
struct ksmbd_inode *ci;
read_lock(&inode_hash_lock);
- ci = __ksmbd_inode_lookup(inode);
+ ci = __ksmbd_inode_lookup(d);
read_unlock(&inode_hash_lock);
+
return ci;
}
-int ksmbd_query_inode_status(struct inode *inode)
+int ksmbd_query_inode_status(struct dentry *dentry)
{
struct ksmbd_inode *ci;
int ret = KSMBD_INODE_STATUS_UNKNOWN;
read_lock(&inode_hash_lock);
- ci = __ksmbd_inode_lookup(inode);
+ ci = __ksmbd_inode_lookup(dentry);
if (ci) {
ret = KSMBD_INODE_STATUS_OK;
if (ci->m_flags & (S_DEL_PENDING | S_DEL_ON_CLS))
static void ksmbd_inode_hash(struct ksmbd_inode *ci)
{
struct hlist_head *b = inode_hashtable +
- inode_hash(ci->m_inode->i_sb, ci->m_inode->i_ino);
+ inode_hash(d_inode(ci->m_de)->i_sb, (unsigned long)ci->m_de);
hlist_add_head(&ci->m_hash, b);
}
static int ksmbd_inode_init(struct ksmbd_inode *ci, struct ksmbd_file *fp)
{
- ci->m_inode = file_inode(fp->filp);
atomic_set(&ci->m_count, 1);
atomic_set(&ci->op_count, 0);
atomic_set(&ci->sop_count, 0);
INIT_LIST_HEAD(&ci->m_fp_list);
INIT_LIST_HEAD(&ci->m_op_list);
rwlock_init(&ci->m_lock);
+ ci->m_de = fp->filp->f_path.dentry;
return 0;
}
kfree(ci);
}
-static void ksmbd_inode_put(struct ksmbd_inode *ci)
+void ksmbd_inode_put(struct ksmbd_inode *ci)
{
if (atomic_dec_and_test(&ci->m_count))
ksmbd_inode_free(ci);
return fp;
}
-struct ksmbd_file *ksmbd_lookup_fd_inode(struct inode *inode)
+struct ksmbd_file *ksmbd_lookup_fd_inode(struct dentry *dentry)
{
struct ksmbd_file *lfp;
struct ksmbd_inode *ci;
+ struct inode *inode = d_inode(dentry);
- ci = ksmbd_inode_lookup_by_vfsinode(inode);
+ read_lock(&inode_hash_lock);
+ ci = __ksmbd_inode_lookup(dentry);
+ read_unlock(&inode_hash_lock);
if (!ci)
return NULL;
atomic_t op_count;
/* opinfo count for streams */
atomic_t sop_count;
- struct inode *m_inode;
+ struct dentry *m_de;
unsigned int m_flags;
struct hlist_node m_hash;
struct list_head m_fp_list;
struct ksmbd_readdir_data readdir_data;
int dot_dotdot[2];
unsigned int f_state;
+ bool reserve_lease_break;
};
static inline void set_ctx_actor(struct dir_context *ctx,
struct ksmbd_file *ksmbd_lookup_fd_slow(struct ksmbd_work *work, u64 id,
u64 pid);
void ksmbd_fd_put(struct ksmbd_work *work, struct ksmbd_file *fp);
+struct ksmbd_inode *ksmbd_inode_lookup_lock(struct dentry *d);
+void ksmbd_inode_put(struct ksmbd_inode *ci);
struct ksmbd_file *ksmbd_lookup_durable_fd(unsigned long long id);
struct ksmbd_file *ksmbd_lookup_fd_cguid(char *cguid);
-struct ksmbd_file *ksmbd_lookup_fd_inode(struct inode *inode);
+struct ksmbd_file *ksmbd_lookup_fd_inode(struct dentry *dentry);
unsigned int ksmbd_open_durable_fd(struct ksmbd_file *fp);
struct ksmbd_file *ksmbd_open_fd(struct ksmbd_work *work, struct file *filp);
void ksmbd_close_tree_conn_fds(struct ksmbd_work *work);
KSMBD_INODE_STATUS_PENDING_DELETE,
};
-int ksmbd_query_inode_status(struct inode *inode);
+int ksmbd_query_inode_status(struct dentry *dentry);
bool ksmbd_inode_pending_delete(struct ksmbd_file *fp);
void ksmbd_set_inode_pending_delete(struct ksmbd_file *fp);
void ksmbd_clear_inode_pending_delete(struct ksmbd_file *fp);
TRACE("Block @ 0x%llx, %scompressed size %d\n", index - 2,
compressed ? "" : "un", length);
}
- if (length < 0 || length > output->length ||
+ if (length <= 0 || length > output->length ||
(index + length) > msblk->bytes_used) {
res = -EIO;
goto out;
idmap = mnt_idmap(path->mnt);
if (inode->i_op->getattr)
return inode->i_op->getattr(idmap, path, stat,
- request_mask, query_flags);
+ request_mask,
+ query_flags | AT_GETATTR_NOSEC);
generic_fillattr(idmap, request_mask, inode, stat);
return 0;
{
int retval;
+ if (WARN_ON_ONCE(query_flags & AT_GETATTR_NOSEC))
+ return -EPERM;
+
retval = security_inode_getattr(path);
if (retval)
return retval;
/*
* eventfs_mutex protects the eventfs_inode (ei) dentry. Any access
* to the ei->dentry must be done under this mutex and after checking
- * if ei->is_freed is not set. The ei->dentry is released under the
- * mutex at the same time ei->is_freed is set. If ei->is_freed is set
- * then the ei->dentry is invalid.
+ * if ei->is_freed is not set. When ei->is_freed is set, the dentry
+ * is on its way to being freed after the last dput() is made on it.
*/
static DEFINE_MUTEX(eventfs_mutex);
/*
* The eventfs_inode (ei) itself is protected by SRCU. It is released from
* its parent's list and will have is_freed set (under eventfs_mutex).
- * After the SRCU grace period is over, the ei may be freed.
+ * After the SRCU grace period is over and the last dput() is called
+ * the ei is freed.
*/
DEFINE_STATIC_SRCU(eventfs_srcu);
if (!(dentry->d_inode->i_mode & S_IFDIR)) {
if (!ei->entry_attrs) {
ei->entry_attrs = kzalloc(sizeof(*ei->entry_attrs) * ei->nr_entries,
- GFP_KERNEL);
+ GFP_NOFS);
if (!ei->entry_attrs) {
ret = -ENOMEM;
goto out;
.release = eventfs_release,
};
-static void update_inode_attr(struct inode *inode, struct eventfs_attr *attr, umode_t mode)
+static void update_inode_attr(struct dentry *dentry, struct inode *inode,
+ struct eventfs_attr *attr, umode_t mode)
{
if (!attr) {
inode->i_mode = mode;
if (attr->mode & EVENTFS_SAVE_UID)
inode->i_uid = attr->uid;
+ else
+ inode->i_uid = d_inode(dentry->d_parent)->i_uid;
if (attr->mode & EVENTFS_SAVE_GID)
inode->i_gid = attr->gid;
+ else
+ inode->i_gid = d_inode(dentry->d_parent)->i_gid;
}
/**
return eventfs_failed_creating(dentry);
/* If the user updated the directory's attributes, use them */
- update_inode_attr(inode, attr, mode);
+ update_inode_attr(dentry, inode, attr, mode);
inode->i_op = &eventfs_file_inode_operations;
inode->i_fop = fop;
return eventfs_failed_creating(dentry);
/* If the user updated the directory's attributes, use them */
- update_inode_attr(inode, &ei->attr, S_IFDIR | S_IRWXU | S_IRUGO | S_IXUGO);
+ update_inode_attr(dentry, inode, &ei->attr,
+ S_IFDIR | S_IRWXU | S_IRUGO | S_IXUGO);
inode->i_op = &eventfs_root_dir_inode_operations;
inode->i_fop = &eventfs_file_operations;
struct eventfs_attr *attr = NULL;
struct dentry **e_dentry = &ei->d_children[idx];
struct dentry *dentry;
- bool invalidate = false;
+
+ WARN_ON_ONCE(!inode_is_locked(parent->d_inode));
mutex_lock(&eventfs_mutex);
if (ei->is_freed) {
mutex_unlock(&eventfs_mutex);
- /* The lookup already has the parent->d_inode locked */
- if (!lookup)
- inode_lock(parent->d_inode);
-
dentry = create_file(name, mode, attr, parent, data, fops);
- if (!lookup)
- inode_unlock(parent->d_inode);
-
mutex_lock(&eventfs_mutex);
if (IS_ERR_OR_NULL(dentry)) {
* created the dentry for this e_dentry. In which case
* use that one.
*
- * Note, with the mutex held, the e_dentry cannot have content
- * and the ei->is_freed be true at the same time.
+ * If ei->is_freed is set, the e_dentry is currently on its
+ * way to being freed, don't return it. If e_dentry is NULL
+ * it means it was already freed.
*/
- dentry = *e_dentry;
- if (WARN_ON_ONCE(dentry && ei->is_freed))
+ if (ei->is_freed)
dentry = NULL;
+ else
+ dentry = *e_dentry;
/* The lookup does not need to up the dentry refcount */
if (dentry && !lookup)
dget(dentry);
* Otherwise it means two dentries exist with the same name.
*/
WARN_ON_ONCE(!ei->is_freed);
- invalidate = true;
+ dentry = NULL;
}
mutex_unlock(&eventfs_mutex);
- if (invalidate)
- d_invalidate(dentry);
-
- if (lookup || invalidate)
+ if (lookup)
dput(dentry);
- return invalidate ? NULL : dentry;
+ return dentry;
}
/**
create_dir_dentry(struct eventfs_inode *pei, struct eventfs_inode *ei,
struct dentry *parent, bool lookup)
{
- bool invalidate = false;
struct dentry *dentry = NULL;
+ WARN_ON_ONCE(!inode_is_locked(parent->d_inode));
+
mutex_lock(&eventfs_mutex);
if (pei->is_freed || ei->is_freed) {
mutex_unlock(&eventfs_mutex);
}
mutex_unlock(&eventfs_mutex);
- /* The lookup already has the parent->d_inode locked */
- if (!lookup)
- inode_lock(parent->d_inode);
-
dentry = create_dir(ei, parent);
- if (!lookup)
- inode_unlock(parent->d_inode);
-
mutex_lock(&eventfs_mutex);
if (IS_ERR_OR_NULL(dentry) && !ei->is_freed) {
* created the dentry for this e_dentry. In which case
* use that one.
*
- * Note, with the mutex held, the e_dentry cannot have content
- * and the ei->is_freed be true at the same time.
+ * If ei->is_freed is set, the e_dentry is currently on its
+ * way to being freed.
*/
dentry = ei->dentry;
if (dentry && !lookup)
* Otherwise it means two dentries exist with the same name.
*/
WARN_ON_ONCE(!ei->is_freed);
- invalidate = true;
+ dentry = NULL;
}
mutex_unlock(&eventfs_mutex);
- if (invalidate)
- d_invalidate(dentry);
- if (lookup || invalidate)
+ if (lookup)
dput(dentry);
- return invalidate ? NULL : dentry;
+ return dentry;
}
/**
if (strcmp(ei_child->name, name) != 0)
continue;
ret = simple_lookup(dir, dentry, flags);
+ if (IS_ERR(ret))
+ goto out;
create_dir_dentry(ei, ei_child, ei_dentry, true);
created = true;
break;
if (r <= 0)
continue;
ret = simple_lookup(dir, dentry, flags);
+ if (IS_ERR(ret))
+ goto out;
create_file_dentry(ei, i, ei_dentry, name, mode, cdata,
fops, true);
break;
{
struct dentry **tmp;
- tmp = krealloc(*dentries, sizeof(d) * (cnt + 2), GFP_KERNEL);
+ tmp = krealloc(*dentries, sizeof(d) * (cnt + 2), GFP_NOFS);
if (!tmp)
return -1;
tmp[cnt] = d;
return -ENOMEM;
}
+ inode_lock(parent->d_inode);
list_for_each_entry_srcu(ei_child, &ei->children, list,
srcu_read_lock_held(&eventfs_srcu)) {
d = create_dir_dentry(ei, ei_child, parent, false);
cnt++;
}
}
+ inode_unlock(parent->d_inode);
srcu_read_unlock(&eventfs_srcu, idx);
ret = dcache_dir_open(inode, file);
struct dentry *dentry;
int error;
+ /* Must always have a parent. */
+ if (WARN_ON_ONCE(!parent))
+ return ERR_PTR(-EINVAL);
+
error = simple_pin_fs(&trace_fs_type, &tracefs_mount,
&tracefs_mount_count);
if (error)
return ERR_PTR(error);
- /*
- * If the parent is not specified, we create it in the root.
- * We need the root dentry to do this, which is in the super
- * block. A pointer to that is in the struct vfsmount that we
- * have around.
- */
- if (!parent)
- parent = tracefs_mount->mnt_root;
-
if (unlikely(IS_DEADDIR(parent->d_inode)))
dentry = ERR_PTR(-ENOENT);
else
{
struct inode *inode = mapping->host;
struct folio *folio = filemap_lock_folio(mapping, index);
- if (!folio) {
+ if (IS_ERR(folio)) {
folio = read_mapping_folio(mapping, index, NULL);
if (IS_ERR(folio)) {
bool "XFS online metadata check usage data collection"
default y
depends on XFS_ONLINE_SCRUB
- select XFS_DEBUG
+ select DEBUG_FS
help
If you say Y here, the kernel will gather usage data about
the online metadata check subsystem. This includes the number
ASSERT(mp->m_alloc_maxlevels > 0);
+ /*
+ * For a btree shorter than the maximum height, the worst case is that
+ * every level gets split and a new level is added, then while inserting
+ * another entry to refill the AGFL, every level under the old root gets
+ * split again. This is:
+ *
+ * (full height split reservation) + (AGFL refill split height)
+ * = (current height + 1) + (current height - 1)
+ * = (new height) + (new height - 2)
+ * = 2 * new height - 2
+ *
+ * For a btree of maximum height, the worst case is that every level
+ * under the root gets split, then while inserting another entry to
+ * refill the AGFL, every level under the root gets split again. This is
+ * also:
+ *
+ * 2 * (current height - 1)
+ * = 2 * (new height - 1)
+ * = 2 * new height - 2
+ */
+
/* space needed by-bno freespace btree */
min_free = min_t(unsigned int, levels[XFS_BTNUM_BNOi] + 1,
- mp->m_alloc_maxlevels);
+ mp->m_alloc_maxlevels) * 2 - 2;
/* space needed by-size freespace btree */
min_free += min_t(unsigned int, levels[XFS_BTNUM_CNTi] + 1,
- mp->m_alloc_maxlevels);
+ mp->m_alloc_maxlevels) * 2 - 2;
/* space needed reverse mapping used space btree */
if (xfs_has_rmapbt(mp))
min_free += min_t(unsigned int, levels[XFS_BTNUM_RMAPi] + 1,
- mp->m_rmap_maxlevels);
+ mp->m_rmap_maxlevels) * 2 - 2;
return min_free;
}
return ret;
}
-/* Abort all the intents that were committed. */
STATIC void
-xfs_defer_trans_abort(
- struct xfs_trans *tp,
- struct list_head *dop_pending)
+xfs_defer_pending_abort(
+ struct xfs_mount *mp,
+ struct list_head *dop_list)
{
struct xfs_defer_pending *dfp;
const struct xfs_defer_op_type *ops;
- trace_xfs_defer_trans_abort(tp, _RET_IP_);
-
/* Abort intent items that don't have a done item. */
- list_for_each_entry(dfp, dop_pending, dfp_list) {
+ list_for_each_entry(dfp, dop_list, dfp_list) {
ops = defer_op_types[dfp->dfp_type];
- trace_xfs_defer_pending_abort(tp->t_mountp, dfp);
+ trace_xfs_defer_pending_abort(mp, dfp);
if (dfp->dfp_intent && !dfp->dfp_done) {
ops->abort_intent(dfp->dfp_intent);
dfp->dfp_intent = NULL;
}
}
+/* Abort all the intents that were committed. */
+STATIC void
+xfs_defer_trans_abort(
+ struct xfs_trans *tp,
+ struct list_head *dop_pending)
+{
+ trace_xfs_defer_trans_abort(tp, _RET_IP_);
+ xfs_defer_pending_abort(tp->t_mountp, dop_pending);
+}
+
/*
* Capture resources that the caller said not to release ("held") when the
* transaction commits. Caller is responsible for zero-initializing @dres.
/* Release all resources that we used to capture deferred ops. */
void
-xfs_defer_ops_capture_free(
+xfs_defer_ops_capture_abort(
struct xfs_mount *mp,
struct xfs_defer_capture *dfc)
{
unsigned short i;
+ xfs_defer_pending_abort(mp, &dfc->dfc_dfops);
xfs_defer_cancel_list(mp, &dfc->dfc_dfops);
for (i = 0; i < dfc->dfc_held.dr_bufs; i++)
/* Commit the transaction and add the capture structure to the list. */
error = xfs_trans_commit(tp);
if (error) {
- xfs_defer_ops_capture_free(mp, dfc);
+ xfs_defer_ops_capture_abort(mp, dfc);
return error;
}
struct list_head *capture_list);
void xfs_defer_ops_continue(struct xfs_defer_capture *d, struct xfs_trans *tp,
struct xfs_defer_resources *dres);
-void xfs_defer_ops_capture_free(struct xfs_mount *mp,
+void xfs_defer_ops_capture_abort(struct xfs_mount *mp,
struct xfs_defer_capture *d);
void xfs_defer_resources_rele(struct xfs_defer_resources *dres);
if (mode && nextents + naextents > nblocks)
return __this_address;
+ if (nextents + naextents == 0 && nblocks != 0)
+ return __this_address;
+
if (S_ISDIR(mode) && nextents > mp->m_dir_geo->max_extents)
return __this_address;
struct xfs_dquot *dqp,
struct xfs_buf *bp)
{
- struct xfs_disk_dquot *ddqp = bp->b_addr + dqp->q_bufoffset;
+ struct xfs_dqblk *dqb = xfs_buf_offset(bp, dqp->q_bufoffset);
+ struct xfs_disk_dquot *ddqp = &dqb->dd_diskdq;
/*
* Ensure that we got the type and ID we were looking for.
}
/* Flush the incore dquot to the ondisk buffer. */
- dqblk = bp->b_addr + dqp->q_bufoffset;
+ dqblk = xfs_buf_offset(bp, dqp->q_bufoffset);
xfs_dquot_to_disk(&dqblk->dd_diskdq, dqp);
/*
#include "xfs_log.h"
#include "xfs_log_priv.h"
#include "xfs_log_recover.h"
+#include "xfs_error.h"
STATIC void
xlog_recover_dquot_ra_pass2(
{
struct xfs_mount *mp = log->l_mp;
struct xfs_buf *bp;
+ struct xfs_dqblk *dqb;
struct xfs_disk_dquot *ddq, *recddq;
struct xfs_dq_logformat *dq_f;
xfs_failaddr_t fa;
return error;
ASSERT(bp);
- ddq = xfs_buf_offset(bp, dq_f->qlf_boffset);
+ dqb = xfs_buf_offset(bp, dq_f->qlf_boffset);
+ ddq = &dqb->dd_diskdq;
/*
* If the dquot has an LSN in it, recover the dquot only if it's less
* than the lsn of the transaction we are replaying.
*/
if (xfs_has_crc(mp)) {
- struct xfs_dqblk *dqb = (struct xfs_dqblk *)ddq;
xfs_lsn_t lsn = be64_to_cpu(dqb->dd_lsn);
if (lsn && lsn != -1 && XFS_LSN_CMP(lsn, current_lsn) >= 0) {
memcpy(ddq, recddq, item->ri_buf[1].i_len);
if (xfs_has_crc(mp)) {
- xfs_update_cksum((char *)ddq, sizeof(struct xfs_dqblk),
+ xfs_update_cksum((char *)dqb, sizeof(struct xfs_dqblk),
XFS_DQUOT_CRC_OFF);
}
+ /* Validate the recovered dquot. */
+ fa = xfs_dqblk_verify(log->l_mp, dqb, dq_f->qlf_id);
+ if (fa) {
+ XFS_CORRUPTION_ERROR("Bad dquot after recovery",
+ XFS_ERRLEVEL_LOW, mp, dqb,
+ sizeof(struct xfs_dqblk));
+ xfs_alert(mp,
+ "Metadata corruption detected at %pS, dquot 0x%x",
+ fa, dq_f->qlf_id);
+ error = -EFSCORRUPTED;
+ goto out_release;
+ }
+
ASSERT(dq_f->qlf_size == 2);
ASSERT(bp->b_mount == mp);
bp->b_flags |= _XBF_LOGRECOVERY;
extern void xfs_setup_iops(struct xfs_inode *ip);
extern void xfs_diflags_to_iflags(struct xfs_inode *ip, bool init);
+static inline void xfs_update_stable_writes(struct xfs_inode *ip)
+{
+ if (bdev_stable_writes(xfs_inode_buftarg(ip)->bt_bdev))
+ mapping_set_stable_writes(VFS_I(ip)->i_mapping);
+ else
+ mapping_clear_stable_writes(VFS_I(ip)->i_mapping);
+}
+
/*
* When setting up a newly allocated inode, we need to call
* xfs_finish_inode_setup() once the inode is fully instantiated at
struct xfs_log_dinode *ldip;
uint isize;
int need_free = 0;
+ xfs_failaddr_t fa;
if (item->ri_buf[0].i_len == sizeof(struct xfs_inode_log_format)) {
in_f = item->ri_buf[0].i_addr;
* superblock flag to determine whether we need to look at di_flushiter
* to skip replay when the on disk inode is newer than the log one
*/
- if (!xfs_has_v3inodes(mp) &&
- ldip->di_flushiter < be16_to_cpu(dip->di_flushiter)) {
- /*
- * Deal with the wrap case, DI_MAX_FLUSH is less
- * than smaller numbers
- */
- if (be16_to_cpu(dip->di_flushiter) == DI_MAX_FLUSH &&
- ldip->di_flushiter < (DI_MAX_FLUSH >> 1)) {
- /* do nothing */
- } else {
- trace_xfs_log_recover_inode_skip(log, in_f);
- error = 0;
- goto out_release;
+ if (!xfs_has_v3inodes(mp)) {
+ if (ldip->di_flushiter < be16_to_cpu(dip->di_flushiter)) {
+ /*
+ * Deal with the wrap case, DI_MAX_FLUSH is less
+ * than smaller numbers
+ */
+ if (be16_to_cpu(dip->di_flushiter) == DI_MAX_FLUSH &&
+ ldip->di_flushiter < (DI_MAX_FLUSH >> 1)) {
+ /* do nothing */
+ } else {
+ trace_xfs_log_recover_inode_skip(log, in_f);
+ error = 0;
+ goto out_release;
+ }
}
+
+ /* Take the opportunity to reset the flush iteration count */
+ ldip->di_flushiter = 0;
}
- /* Take the opportunity to reset the flush iteration count */
- ldip->di_flushiter = 0;
if (unlikely(S_ISREG(ldip->di_mode))) {
if ((ldip->di_format != XFS_DINODE_FMT_EXTENTS) &&
(dip->di_mode != 0))
error = xfs_recover_inode_owner_change(mp, dip, in_f,
buffer_list);
- /* re-generate the checksum. */
+ /* re-generate the checksum and validate the recovered inode. */
xfs_dinode_calc_crc(log->l_mp, dip);
+ fa = xfs_dinode_verify(log->l_mp, in_f->ilf_ino, dip);
+ if (fa) {
+ XFS_CORRUPTION_ERROR(
+ "Bad dinode after recovery",
+ XFS_ERRLEVEL_LOW, mp, dip, sizeof(*dip));
+ xfs_alert(mp,
+ "Metadata corruption detected at %pS, inode 0x%llx",
+ fa, in_f->ilf_ino);
+ error = -EFSCORRUPTED;
+ goto out_release;
+ }
ASSERT(bp->b_mount == mp);
bp->b_flags |= _XBF_LOGRECOVERY;
struct fileattr *fa)
{
struct xfs_mount *mp = ip->i_mount;
+ bool rtflag = (fa->fsx_xflags & FS_XFLAG_REALTIME);
uint64_t i_flags2;
- /* Can't change realtime flag if any extents are allocated. */
- if ((ip->i_df.if_nextents || ip->i_delayed_blks) &&
- XFS_IS_REALTIME_INODE(ip) != (fa->fsx_xflags & FS_XFLAG_REALTIME))
- return -EINVAL;
+ if (rtflag != XFS_IS_REALTIME_INODE(ip)) {
+ /* Can't change realtime flag if any extents are allocated. */
+ if (ip->i_df.if_nextents || ip->i_delayed_blks)
+ return -EINVAL;
+ }
- /* If realtime flag is set then must have realtime device */
- if (fa->fsx_xflags & FS_XFLAG_REALTIME) {
+ if (rtflag) {
+ /* If realtime flag is set then must have realtime device */
if (mp->m_sb.sb_rblocks == 0 || mp->m_sb.sb_rextsize == 0 ||
xfs_extlen_to_rtxmod(mp, ip->i_extsize))
return -EINVAL;
- }
- /* Clear reflink if we are actually able to set the rt flag. */
- if ((fa->fsx_xflags & FS_XFLAG_REALTIME) && xfs_is_reflink_inode(ip))
- ip->i_diflags2 &= ~XFS_DIFLAG2_REFLINK;
+ /* Clear reflink if we are actually able to set the rt flag. */
+ if (xfs_is_reflink_inode(ip))
+ ip->i_diflags2 &= ~XFS_DIFLAG2_REFLINK;
+ }
/* diflags2 only valid for v3 inodes. */
i_flags2 = xfs_flags2diflags2(ip, fa->fsx_xflags);
ip->i_diflags2 = i_flags2;
xfs_diflags_to_iflags(ip, false);
+
+ /*
+ * Make the stable writes flag match that of the device the inode
+ * resides on when flipping the RT flag.
+ */
+ if (rtflag != XFS_IS_REALTIME_INODE(ip) && S_ISREG(VFS_I(ip)->i_mode))
+ xfs_update_stable_writes(ip);
+
xfs_trans_ichgtime(tp, ip, XFS_ICHGTIME_CHG);
xfs_trans_log_inode(tp, ip, XFS_ILOG_CORE);
XFS_STATS_INC(mp, xs_ig_attrchg);
gfp_mask = mapping_gfp_mask(inode->i_mapping);
mapping_set_gfp_mask(inode->i_mapping, (gfp_mask & ~(__GFP_FS)));
+ /*
+ * For real-time inodes update the stable write flags to that of the RT
+ * device instead of the data device.
+ */
+ if (S_ISREG(inode->i_mode) && XFS_IS_REALTIME_INODE(ip))
+ xfs_update_stable_writes(ip);
+
/*
* If there is no attribute fork no ACL can exist on this inode,
* and it can't have any file capabilities attached to it either.
* the buffer manually, the code needs to be kept in sync
* with the I/O completion path.
*/
- xlog_state_done_syncing(iclog);
- up(&iclog->ic_sema);
- return;
+ goto sync;
}
/*
* avoid shutdown re-entering this path and erroring out again.
*/
if (log->l_targ != log->l_mp->m_ddev_targp &&
- blkdev_issue_flush(log->l_mp->m_ddev_targp->bt_bdev)) {
- xlog_force_shutdown(log, SHUTDOWN_LOG_IO_ERROR);
- return;
- }
+ blkdev_issue_flush(log->l_mp->m_ddev_targp->bt_bdev))
+ goto shutdown;
}
if (iclog->ic_flags & XLOG_ICL_NEED_FUA)
iclog->ic_bio.bi_opf |= REQ_FUA;
iclog->ic_flags &= ~(XLOG_ICL_NEED_FLUSH | XLOG_ICL_NEED_FUA);
- if (xlog_map_iclog_data(&iclog->ic_bio, iclog->ic_data, count)) {
- xlog_force_shutdown(log, SHUTDOWN_LOG_IO_ERROR);
- return;
- }
+ if (xlog_map_iclog_data(&iclog->ic_bio, iclog->ic_data, count))
+ goto shutdown;
+
if (is_vmalloc_addr(iclog->ic_data))
flush_kernel_vmap_range(iclog->ic_data, count);
}
submit_bio(&iclog->ic_bio);
+ return;
+shutdown:
+ xlog_force_shutdown(log, SHUTDOWN_LOG_IO_ERROR);
+sync:
+ xlog_state_done_syncing(iclog);
+ up(&iclog->ic_sema);
}
/*
list_for_each_entry_safe(dfc, next, capture_list, dfc_list) {
list_del_init(&dfc->dfc_list);
- xfs_defer_ops_capture_free(mp, dfc);
+ xfs_defer_ops_capture_abort(mp, dfc);
}
}
}
}
del = got;
+ xfs_trim_extent(&del, *offset_fsb, end_fsb - *offset_fsb);
/* Grab the corresponding mapping in the data fork. */
nmaps = 1;
int acpi_bus_init_power(struct acpi_device *device);
int acpi_device_fix_up_power(struct acpi_device *device);
void acpi_device_fix_up_power_extended(struct acpi_device *adev);
+void acpi_device_fix_up_power_children(struct acpi_device *adev);
int acpi_bus_update_power(acpi_handle handle, int *state_p);
int acpi_device_update_power(struct acpi_device *device, int *state_p);
bool acpi_bus_power_manageable(acpi_handle handle);
*/
static __always_inline int queued_spin_value_unlocked(struct qspinlock lock)
{
- return !atomic_read(&lock.val);
+ return !lock.val.counter;
}
/**
int drm_atomic_helper_prepare_planes(struct drm_device *dev,
struct drm_atomic_state *state);
+void drm_atomic_helper_unprepare_planes(struct drm_device *dev,
+ struct drm_atomic_state *state);
#define DRM_PLANE_COMMIT_ACTIVE_ONLY BIT(0)
#define DRM_PLANE_COMMIT_NO_DISABLE_AFTER_MODESET BIT(1)
-/* SPDX-License-Identifier: GPL-2.0 OR MIT */
+/* SPDX-License-Identifier: GPL-2.0-only OR MIT */
#ifndef __DRM_GPUVM_H__
#define __DRM_GPUVM_H__
struct drm_device;
struct drm_gem_object;
+struct drm_file;
/* core prime functions */
struct dma_buf *drm_gem_dmabuf_export(struct drm_device *dev,
struct dma_buf_export_info *exp_info);
void drm_gem_dmabuf_release(struct dma_buf *dma_buf);
+int drm_gem_prime_fd_to_handle(struct drm_device *dev,
+ struct drm_file *file_priv, int prime_fd, uint32_t *handle);
+int drm_gem_prime_handle_to_fd(struct drm_device *dev,
+ struct drm_file *file_priv, uint32_t handle, uint32_t flags,
+ int *prime_fd);
+
/* helper functions for exporting */
int drm_gem_map_attach(struct dma_buf *dma_buf,
struct dma_buf_attachment *attach);
#include <linux/mod_devicetable.h>
#include <linux/property.h>
#include <linux/uuid.h>
-#include <linux/fw_table.h>
struct irq_domain;
struct irq_domain_ops;
#endif
#include <acpi/acpi.h>
-#ifdef CONFIG_ACPI_TABLE_LIB
-#define EXPORT_SYMBOL_ACPI_LIB(x) EXPORT_SYMBOL_NS_GPL(x, ACPI)
-#define __init_or_acpilib
-#define __initdata_or_acpilib
-#else
-#define EXPORT_SYMBOL_ACPI_LIB(x)
-#define __init_or_acpilib __init
-#define __initdata_or_acpilib __initdata
-#endif
-
#ifdef CONFIG_ACPI
#include <linux/list.h>
#include <linux/dynamic_debug.h>
#include <linux/module.h>
#include <linux/mutex.h>
+#include <linux/fw_table.h>
#include <acpi/acpi_bus.h>
#include <acpi/acpi_drivers.h>
#include <acpi/acpi_io.h>
#include <asm/acpi.h>
+#ifdef CONFIG_ACPI_TABLE_LIB
+#define EXPORT_SYMBOL_ACPI_LIB(x) EXPORT_SYMBOL_NS_GPL(x, ACPI)
+#define __init_or_acpilib
+#define __initdata_or_acpilib
+#else
+#define EXPORT_SYMBOL_ACPI_LIB(x)
+#define __init_or_acpilib __init
+#define __initdata_or_acpilib __initdata
+#endif
+
static inline acpi_handle acpi_device_handle(struct acpi_device *adev)
{
return adev ? adev->handle : NULL;
u32 nominal_perf;
u32 lowest_nonlinear_perf;
u32 lowest_perf;
+ u32 min_limit_perf;
+ u32 max_limit_perf;
+ u32 min_limit_freq;
+ u32 max_limit_freq;
u32 max_freq;
u32 min_freq;
#define module_ffa_driver(__ffa_driver) \
module_driver(__ffa_driver, ffa_register, ffa_unregister)
+extern struct bus_type ffa_bus_type;
+
/* FFA transport related */
struct ffa_partition_info {
u16 id;
extern void blk_post_runtime_suspend(struct request_queue *q, int err);
extern void blk_pre_runtime_resume(struct request_queue *q);
extern void blk_post_runtime_resume(struct request_queue *q);
-extern void blk_set_runtime_active(struct request_queue *q);
#else
static inline void blk_pm_runtime_init(struct request_queue *q,
struct device *dev) {}
bool bd_write_holder;
bool bd_has_submit_bio;
dev_t bd_dev;
+ struct inode *bd_inode; /* will die */
+
atomic_t bd_openers;
spinlock_t bd_size_lock; /* for bd_inode->i_size updates */
- struct inode * bd_inode; /* will die */
void * bd_claiming;
void * bd_holder;
const struct blk_holder_ops *bd_holder_ops;
#ifdef CONFIG_FAIL_MAKE_REQUEST
bool bd_make_it_fail;
#endif
+ bool bd_ro_warned;
/*
* keep this out-of-line as it's both big and not needed in the fast
* path
extern spinlock_t btf_idr_lock;
extern struct kobject *btf_kobj;
extern struct bpf_mem_alloc bpf_global_ma, bpf_global_percpu_ma;
-extern bool bpf_global_ma_set, bpf_global_percpu_ma_set;
+extern bool bpf_global_ma_set;
typedef u64 (*bpf_callback_t)(u64, u64, u64, u64, u64);
typedef int (*bpf_iter_init_seq_priv_t)(void *private_data,
aux->ctx_field_size = size;
}
+static bool bpf_is_ldimm64(const struct bpf_insn *insn)
+{
+ return insn->code == (BPF_LD | BPF_IMM | BPF_DW);
+}
+
static inline bool bpf_pseudo_func(const struct bpf_insn *insn)
{
- return insn->code == (BPF_LD | BPF_IMM | BPF_DW) &&
- insn->src_reg == BPF_PSEUDO_FUNC;
+ return bpf_is_ldimm64(insn) && insn->src_reg == BPF_PSEUDO_FUNC;
}
struct bpf_prog_ops {
int bpf_arch_text_poke(void *ip, enum bpf_text_poke_type t,
void *addr1, void *addr2);
+void bpf_arch_poke_desc_update(struct bpf_jit_poke_descriptor *poke,
+ struct bpf_prog *new, struct bpf_prog *old);
+
void *bpf_arch_text_copy(void *dst, void *src, size_t len);
int bpf_arch_text_invalidate(void *dst, size_t len);
#ifdef CONFIG_NET
BPF_LINK_TYPE(BPF_LINK_TYPE_NETNS, netns)
BPF_LINK_TYPE(BPF_LINK_TYPE_XDP, xdp)
+BPF_LINK_TYPE(BPF_LINK_TYPE_NETFILTER, netfilter)
+BPF_LINK_TYPE(BPF_LINK_TYPE_TCX, tcx)
+BPF_LINK_TYPE(BPF_LINK_TYPE_NETKIT, netkit)
#endif
#ifdef CONFIG_PERF_EVENTS
BPF_LINK_TYPE(BPF_LINK_TYPE_PERF_EVENT, perf)
#endif
BPF_LINK_TYPE(BPF_LINK_TYPE_KPROBE_MULTI, kprobe_multi)
BPF_LINK_TYPE(BPF_LINK_TYPE_STRUCT_OPS, struct_ops)
+BPF_LINK_TYPE(BPF_LINK_TYPE_UPROBE_MULTI, uprobe_multi)
struct tnum callback_ret_range;
bool in_async_callback_fn;
bool in_exception_callback_fn;
+ /* For callback calling functions that limit number of possible
+ * callback executions (e.g. bpf_loop) keeps track of current
+ * simulated iteration number.
+ * Value in frame N refers to number of times callback with frame
+ * N+1 was simulated, e.g. for the following call:
+ *
+ * bpf_loop(..., fn, ...); | suppose current frame is N
+ * | fn would be simulated in frame N+1
+ * | number of simulations is tracked in frame N
+ */
+ u32 callback_depth;
/* The following fields should be last. See copy_func_state() */
int acquired_refs;
struct bpf_idx_pair *jmp_history;
u32 jmp_history_cnt;
u32 dfs_depth;
+ u32 callback_unroll_depth;
};
#define bpf_get_spilled_reg(slot, frame, mask) \
* this instruction, regardless of any heuristics
*/
bool force_checkpoint;
+ /* true if instruction is a call to a helper function that
+ * accepts callback function as a parameter.
+ */
+ bool calls_callback;
};
#define MAX_USED_MAPS 64 /* max number of maps accessed by one eBPF program */
struct closure;
struct closure_syncer;
-typedef void (closure_fn) (struct closure *);
+typedef void (closure_fn) (struct work_struct *);
extern struct dentry *bcache_debug;
struct closure_waitlist {
INIT_WORK(&cl->work, cl->work.func);
BUG_ON(!queue_work(wq, &cl->work));
} else
- cl->fn(cl);
+ cl->fn(&cl->work);
}
/**
__closure_wake_up(list);
}
+#define CLOSURE_CALLBACK(name) void name(struct work_struct *ws)
+#define closure_type(name, type, member) \
+ struct closure *cl = container_of(ws, struct closure, work); \
+ type *name = container_of(cl, type, member)
+
/**
* continue_at - jump to another function with barrier
*
CPUHP_AP_ARM_CORESIGHT_CTI_STARTING,
CPUHP_AP_ARM64_ISNDEP_STARTING,
CPUHP_AP_SMPCFD_DYING,
+ CPUHP_AP_HRTIMERS_DYING,
CPUHP_AP_X86_TBOOT_DYING,
CPUHP_AP_ARM_CACHE_B15_RAC_DYING,
CPUHP_AP_ONLINE,
* same context as task->real_cred.
*/
struct cred {
- atomic_t usage;
-#ifdef CONFIG_DEBUG_CREDENTIALS
- atomic_t subscribers; /* number of processes subscribed */
- void *put_addr;
- unsigned magic;
-#define CRED_MAGIC 0x43736564
-#define CRED_MAGIC_DEAD 0x44656144
-#endif
+ atomic_long_t usage;
kuid_t uid; /* real UID of the task */
kgid_t gid; /* real GID of the task */
kuid_t suid; /* saved UID of the task */
extern void __init cred_init(void);
extern int set_cred_ucounts(struct cred *);
-/*
- * check for validity of credentials
- */
-#ifdef CONFIG_DEBUG_CREDENTIALS
-extern void __noreturn __invalid_creds(const struct cred *, const char *, unsigned);
-extern void __validate_process_creds(struct task_struct *,
- const char *, unsigned);
-
-extern bool creds_are_invalid(const struct cred *cred);
-
-static inline void __validate_creds(const struct cred *cred,
- const char *file, unsigned line)
-{
- if (unlikely(creds_are_invalid(cred)))
- __invalid_creds(cred, file, line);
-}
-
-#define validate_creds(cred) \
-do { \
- __validate_creds((cred), __FILE__, __LINE__); \
-} while(0)
-
-#define validate_process_creds() \
-do { \
- __validate_process_creds(current, __FILE__, __LINE__); \
-} while(0)
-
-extern void validate_creds_for_do_exit(struct task_struct *);
-#else
-static inline void validate_creds(const struct cred *cred)
-{
-}
-static inline void validate_creds_for_do_exit(struct task_struct *tsk)
-{
-}
-static inline void validate_process_creds(void)
-{
-}
-#endif
-
static inline bool cap_ambient_invariant_ok(const struct cred *cred)
{
return cap_issubset(cred->cap_ambient,
*/
static inline struct cred *get_new_cred_many(struct cred *cred, int nr)
{
- atomic_add(nr, &cred->usage);
+ atomic_long_add(nr, &cred->usage);
return cred;
}
struct cred *nonconst_cred = (struct cred *) cred;
if (!cred)
return cred;
- validate_creds(cred);
nonconst_cred->non_rcu = 0;
return get_new_cred_many(nonconst_cred, nr);
}
struct cred *nonconst_cred = (struct cred *) cred;
if (!cred)
return NULL;
- if (!atomic_inc_not_zero(&nonconst_cred->usage))
+ if (!atomic_long_inc_not_zero(&nonconst_cred->usage))
return NULL;
- validate_creds(cred);
nonconst_cred->non_rcu = 0;
return cred;
}
struct cred *cred = (struct cred *) _cred;
if (cred) {
- validate_creds(cred);
- if (atomic_sub_and_test(nr, &cred->usage))
+ if (atomic_long_sub_and_test(nr, &cred->usage))
__put_cred(cred);
}
}
* update
*/
unsigned long next_ops_update_sis;
+ /* for waiting until the execution of the kdamond_fn is started */
+ struct completion kdamond_started;
/* public: */
struct task_struct *kdamond;
ssize_t debugfs_read_file_str(struct file *file, char __user *user_buf,
size_t count, loff_t *ppos);
+/**
+ * struct debugfs_cancellation - cancellation data
+ * @list: internal, for keeping track
+ * @cancel: callback to call
+ * @cancel_data: extra data for the callback to call
+ */
+struct debugfs_cancellation {
+ struct list_head list;
+ void (*cancel)(struct dentry *, void *);
+ void *cancel_data;
+};
+
+void __acquires(cancellation)
+debugfs_enter_cancellation(struct file *file,
+ struct debugfs_cancellation *cancellation);
+void __releases(cancellation)
+debugfs_leave_cancellation(struct file *file,
+ struct debugfs_cancellation *cancellation);
+
#else
#include <linux/err.h>
mutex_unlock(&dev->mutex);
}
+DEFINE_GUARD(device, struct device *, device_lock(_T), device_unlock(_T))
+
static inline void device_lock_assert(struct device *dev)
{
lockdep_assert_held(&dev->mutex);
return __dma_fence_is_later(f1->seqno, f2->seqno, f1->ops);
}
+/**
+ * dma_fence_is_later_or_same - return true if f1 is later or same as f2
+ * @f1: the first fence from the same context
+ * @f2: the second fence from the same context
+ *
+ * Returns true if f1 is chronologically later than f2 or the same fence. Both
+ * fences must be from the same context, since a seqno is not re-used across
+ * contexts.
+ */
+static inline bool dma_fence_is_later_or_same(struct dma_fence *f1,
+ struct dma_fence *f2)
+{
+ return f1 == f2 || dma_fence_is_later(f1, f2);
+}
+
/**
* dma_fence_later - return the chronologically later fence
* @f1: the first fence from the same context
" .previous" "\n" \
)
-#ifdef CONFIG_IA64
-#define KSYM_FUNC(name) @fptr(name)
-#elif defined(CONFIG_PARISC) && defined(CONFIG_64BIT)
+#if defined(CONFIG_PARISC) && defined(CONFIG_64BIT)
#define KSYM_FUNC(name) P%name
#else
#define KSYM_FUNC(name) name
int count;
};
-#include <linux/acpi.h>
-#include <acpi/acpi.h>
-
union acpi_subtable_headers {
struct acpi_subtable_header common;
struct acpi_hmat_structure hmat;
#define HID_USAGE_SENSOR_ALS 0x200041
#define HID_USAGE_SENSOR_DATA_LIGHT 0x2004d0
#define HID_USAGE_SENSOR_LIGHT_ILLUM 0x2004d1
-#define HID_USAGE_SENSOR_LIGHT_COLOR_TEMPERATURE 0x2004d2
-#define HID_USAGE_SENSOR_LIGHT_CHROMATICITY 0x2004d3
-#define HID_USAGE_SENSOR_LIGHT_CHROMATICITY_X 0x2004d4
-#define HID_USAGE_SENSOR_LIGHT_CHROMATICITY_Y 0x2004d5
/* PROX (200011) */
#define HID_USAGE_SENSOR_PROX 0x200011
struct list_head debug_list;
spinlock_t debug_list_lock;
wait_queue_head_t debug_wait;
+ struct kref ref;
unsigned int id; /* system unique id */
#endif /* CONFIG_BPF */
};
+void hiddev_free(struct kref *ref);
+
#define to_hid_device(pdev) \
container_of(pdev, struct hid_device, dev)
memcpy(to, from, chunk);
kunmap_local(from);
- from += chunk;
+ to += chunk;
offset += chunk;
len -= chunk;
} while (len > 0);
int hrtimers_prepare_cpu(unsigned int cpu);
#ifdef CONFIG_HOTPLUG_CPU
-int hrtimers_dead_cpu(unsigned int cpu);
+int hrtimers_cpu_dying(unsigned int cpu);
#else
-#define hrtimers_dead_cpu NULL
+#define hrtimers_cpu_dying NULL
#endif
#endif
return (vma->vm_flags & VM_MAYSHARE) && vma->vm_private_data;
}
-static inline bool __vma_private_lock(struct vm_area_struct *vma)
-{
- return (!(vma->vm_flags & VM_MAYSHARE)) && vma->vm_private_data;
-}
+bool __vma_private_lock(struct vm_area_struct *vma);
/*
* Safe version of huge_pte_offset() to check the locks. See comments
static inline const struct ieee80211_he_6ghz_oper *
ieee80211_he_6ghz_oper(const struct ieee80211_he_operation *he_oper)
{
- const u8 *ret = (const void *)&he_oper->optional;
+ const u8 *ret;
u32 he_oper_params;
if (!he_oper)
return NULL;
+ ret = (const void *)&he_oper->optional;
+
he_oper_params = le32_to_cpu(he_oper->he_oper_params);
if (!(he_oper_params & IEEE80211_HE_OPERATION_6GHZ_OP_INFO))
action != WLAN_PUB_ACTION_LOC_TRACK_NOTI &&
action != WLAN_PUB_ACTION_FTM_REQUEST &&
action != WLAN_PUB_ACTION_FTM_RESPONSE &&
- action != WLAN_PUB_ACTION_FILS_DISCOVERY;
+ action != WLAN_PUB_ACTION_FILS_DISCOVERY &&
+ action != WLAN_PUB_ACTION_VENDOR_SPECIFIC;
}
/**
struct list_head io_buffers_cache;
+ /* deferred free list, protected by ->uring_lock */
+ struct hlist_head io_buf_list;
+
/* Keep this last, we don't need it for the fast path */
struct wait_queue_head poll_wq;
struct io_restriction restrictions;
/* keep async read/write and isreg together and in order */
REQ_F_SUPPORT_NOWAIT_BIT,
REQ_F_ISREG_BIT,
+ REQ_F_POLL_NO_LAZY_BIT,
/* not a real bit, just to check we're not overflowing the space */
__REQ_F_LAST_BIT,
REQ_F_CLEAR_POLLIN = BIT(REQ_F_CLEAR_POLLIN_BIT),
/* hashed into ->cancel_hash_locked, protected by ->uring_lock */
REQ_F_HASH_LOCKED = BIT(REQ_F_HASH_LOCKED_BIT),
+ /* don't use lazy poll wake for this request */
+ REQ_F_POLL_NO_LAZY = BIT(REQ_F_POLL_NO_LAZY_BIT),
};
typedef void (*io_req_tw_func_t)(struct io_kiocb *req, struct io_tw_state *ts);
dev->iommu->priv = priv;
}
+extern struct mutex iommu_probe_device_lock;
int iommu_probe_device(struct device *dev);
int iommu_dev_enable_feature(struct device *dev, enum iommu_dev_features f);
JBD2_FEATURE_INCOMPAT_FUNCS(csum3, CSUM_V3)
JBD2_FEATURE_INCOMPAT_FUNCS(fast_commit, FAST_COMMIT)
+/* Journal high priority write IO operation flags */
+#define JBD2_JOURNAL_REQ_FLAGS (REQ_META | REQ_SYNC | REQ_IDLE)
+
/*
* Journal flag definitions
*/
unsigned int flags;
#define KEY_TYPE_NET_DOMAIN 0x00000001 /* Keys of this type have a net namespace domain */
+#define KEY_TYPE_INSTANT_REAP 0x00000002 /* Keys of this type don't have a delay after expiring */
/* vet a description */
int (*vet_description)(const char *description);
*
*/
struct kretprobe_holder {
- struct kretprobe *rp;
+ struct kretprobe __rcu *rp;
struct objpool_head pool;
};
#ifdef CONFIG_KRETPROBE_ON_RETHOOK
static nokprobe_inline struct kretprobe *get_kretprobe(struct kretprobe_instance *ri)
{
- RCU_LOCKDEP_WARN(!rcu_read_lock_any_held(),
- "Kretprobe is accessed from instance under preemptive context");
-
- return (struct kretprobe *)READ_ONCE(ri->node.rethook->data);
+ /* rethook::data is non-changed field, so that you can access it freely. */
+ return (struct kretprobe *)ri->node.rethook->data;
}
static nokprobe_inline unsigned long get_kretprobe_retaddr(struct kretprobe_instance *ri)
{
static nokprobe_inline struct kretprobe *get_kretprobe(struct kretprobe_instance *ri)
{
- RCU_LOCKDEP_WARN(!rcu_read_lock_any_held(),
- "Kretprobe is accessed from instance under preemptive context");
-
- return READ_ONCE(ri->rph->rp);
+ return rcu_dereference_check(ri->rph->rp, rcu_read_lock_any_held());
}
static nokprobe_inline unsigned long get_kretprobe_retaddr(struct kretprobe_instance *ri)
struct list_head vm_list;
struct mutex lock;
struct kvm_io_bus __rcu *buses[KVM_NR_BUSES];
-#ifdef CONFIG_HAVE_KVM_EVENTFD
+#ifdef CONFIG_HAVE_KVM_IRQCHIP
struct {
spinlock_t lock;
struct list_head items;
struct list_head resampler_list;
struct mutex resampler_lock;
} irqfds;
- struct list_head ioeventfds;
#endif
+ struct list_head ioeventfds;
struct kvm_vm_stat stat;
struct kvm_arch arch;
refcount_t users_count;
* Update side is protected by irq_lock.
*/
struct kvm_irq_routing_table __rcu *irq_routing;
-#endif
-#ifdef CONFIG_HAVE_KVM_IRQFD
+
struct hlist_head irq_ack_notifier_list;
#endif
}
#endif
-#ifdef CONFIG_HAVE_KVM_IRQFD
+#ifdef CONFIG_HAVE_KVM_IRQCHIP
int kvm_irqfd_init(void);
void kvm_irqfd_exit(void);
#else
int kvm_send_userspace_msi(struct kvm *kvm, struct kvm_msi *msi);
-#ifdef CONFIG_HAVE_KVM_EVENTFD
-
void kvm_eventfd_init(struct kvm *kvm);
int kvm_ioeventfd(struct kvm *kvm, struct kvm_ioeventfd *args);
-#ifdef CONFIG_HAVE_KVM_IRQFD
+#ifdef CONFIG_HAVE_KVM_IRQCHIP
int kvm_irqfd(struct kvm *kvm, struct kvm_irqfd *args);
void kvm_irqfd_release(struct kvm *kvm);
bool kvm_notify_irqfd_resampler(struct kvm *kvm,
{
return false;
}
-#endif
-
-#else
-
-static inline void kvm_eventfd_init(struct kvm *kvm) {}
-
-static inline int kvm_irqfd(struct kvm *kvm, struct kvm_irqfd *args)
-{
- return -EINVAL;
-}
-
-static inline void kvm_irqfd_release(struct kvm *kvm) {}
-
-#ifdef CONFIG_HAVE_KVM_IRQCHIP
-static inline void kvm_irq_routing_update(struct kvm *kvm)
-{
-}
-#endif
-
-static inline int kvm_ioeventfd(struct kvm *kvm, struct kvm_ioeventfd *args)
-{
- return -ENOSYS;
-}
-
-#endif /* CONFIG_HAVE_KVM_EVENTFD */
+#endif /* CONFIG_HAVE_KVM_IRQCHIP */
void kvm_arch_irq_routing_update(struct kvm *kvm);
* A function that translates value of following registers to the linkmode:
* IEEE 802.3-2018 45.2.3.10 "EEE control and capability 1" register (3.20)
* IEEE 802.3-2018 45.2.7.13 "EEE advertisement 1" register (7.60)
- * IEEE 802.3-2018 45.2.7.14 "EEE "link partner ability 1 register (7.61)
+ * IEEE 802.3-2018 45.2.7.14 "EEE link partner ability 1" register (7.61)
*/
static inline void mii_eee_cap1_mod_linkmode_t(unsigned long *adv, u32 val)
{
u8 reserved_at_140[0x8];
u8 bth_dst_qp[0x18];
- u8 reserved_at_160[0x20];
+ u8 inner_esp_spi[0x20];
u8 outer_esp_spi[0x20];
u8 reserved_at_1a0[0x60];
};
MLX5_IPSEC_ASO_INC_SN = 0x2,
};
+enum {
+ MLX5_IPSEC_ASO_REPLAY_WIN_32BIT = 0x0,
+ MLX5_IPSEC_ASO_REPLAY_WIN_64BIT = 0x1,
+ MLX5_IPSEC_ASO_REPLAY_WIN_128BIT = 0x2,
+ MLX5_IPSEC_ASO_REPLAY_WIN_256BIT = 0x3,
+};
+
struct mlx5_ifc_ipsec_aso_bits {
u8 valid[0x1];
u8 reserved_at_201[0x1];
*/
static inline bool vma_is_initial_heap(const struct vm_area_struct *vma)
{
- return vma->vm_start <= vma->vm_mm->brk &&
- vma->vm_end >= vma->vm_mm->start_brk;
+ return vma->vm_start < vma->vm_mm->brk &&
+ vma->vm_end > vma->vm_mm->start_brk;
}
/*
* its "stack". It's not even well-defined for programs written
* languages like Go.
*/
- return vma->vm_start <= vma->vm_mm->start_stack &&
- vma->vm_end >= vma->vm_mm->start_stack;
+ return vma->vm_start <= vma->vm_mm->start_stack &&
+ vma->vm_end >= vma->vm_mm->start_stack;
}
static inline bool vma_is_temporary_stack(struct vm_area_struct *vma)
if (folio_test_unevictable(folio) || !lrugen->enabled)
return false;
/*
- * There are three common cases for this page:
- * 1. If it's hot, e.g., freshly faulted in or previously hot and
- * migrated, add it to the youngest generation.
- * 2. If it's cold but can't be evicted immediately, i.e., an anon page
- * not in swapcache or a dirty page pending writeback, add it to the
- * second oldest generation.
- * 3. Everything else (clean, cold) is added to the oldest generation.
+ * There are four common cases for this page:
+ * 1. If it's hot, i.e., freshly faulted in, add it to the youngest
+ * generation, and it's protected over the rest below.
+ * 2. If it can't be evicted immediately, i.e., a dirty page pending
+ * writeback, add it to the second youngest generation.
+ * 3. If it should be evicted first, e.g., cold and clean from
+ * folio_rotate_reclaimable(), add it to the oldest generation.
+ * 4. Everything else falls between 2 & 3 above and is added to the
+ * second oldest generation if it's considered inactive, or the
+ * oldest generation otherwise. See lru_gen_is_active().
*/
if (folio_test_active(folio))
seq = lrugen->max_seq;
else if ((type == LRU_GEN_ANON && !folio_test_swapcache(folio)) ||
(folio_test_reclaim(folio) &&
(folio_test_dirty(folio) || folio_test_writeback(folio))))
- seq = lrugen->min_seq[type] + 1;
- else
+ seq = lrugen->max_seq - 1;
+ else if (reclaiming || lrugen->min_seq[type] + MIN_NR_GENS >= lrugen->max_seq)
seq = lrugen->min_seq[type];
+ else
+ seq = lrugen->min_seq[type] + 1;
gen = lru_gen_from_seq(seq);
flags = (gen + 1UL) << LRU_GEN_PGOFF;
* the old generation, is incremented when all its bins become empty.
*
* There are four operations:
- * 1. MEMCG_LRU_HEAD, which moves an memcg to the head of a random bin in its
+ * 1. MEMCG_LRU_HEAD, which moves a memcg to the head of a random bin in its
* current generation (old or young) and updates its "seg" to "head";
- * 2. MEMCG_LRU_TAIL, which moves an memcg to the tail of a random bin in its
+ * 2. MEMCG_LRU_TAIL, which moves a memcg to the tail of a random bin in its
* current generation (old or young) and updates its "seg" to "tail";
- * 3. MEMCG_LRU_OLD, which moves an memcg to the head of a random bin in the old
+ * 3. MEMCG_LRU_OLD, which moves a memcg to the head of a random bin in the old
* generation, updates its "gen" to "old" and resets its "seg" to "default";
- * 4. MEMCG_LRU_YOUNG, which moves an memcg to the tail of a random bin in the
+ * 4. MEMCG_LRU_YOUNG, which moves a memcg to the tail of a random bin in the
* young generation, updates its "gen" to "young" and resets its "seg" to
* "default".
*
* The events that trigger the above operations are:
* 1. Exceeding the soft limit, which triggers MEMCG_LRU_HEAD;
- * 2. The first attempt to reclaim an memcg below low, which triggers
+ * 2. The first attempt to reclaim a memcg below low, which triggers
* MEMCG_LRU_TAIL;
- * 3. The first attempt to reclaim an memcg below reclaimable size threshold,
- * which triggers MEMCG_LRU_TAIL;
- * 4. The second attempt to reclaim an memcg below reclaimable size threshold,
- * which triggers MEMCG_LRU_YOUNG;
- * 5. Attempting to reclaim an memcg below min, which triggers MEMCG_LRU_YOUNG;
+ * 3. The first attempt to reclaim a memcg offlined or below reclaimable size
+ * threshold, which triggers MEMCG_LRU_TAIL;
+ * 4. The second attempt to reclaim a memcg offlined or below reclaimable size
+ * threshold, which triggers MEMCG_LRU_YOUNG;
+ * 5. Attempting to reclaim a memcg below min, which triggers MEMCG_LRU_YOUNG;
* 6. Finishing the aging on the eviction path, which triggers MEMCG_LRU_YOUNG;
- * 7. Offlining an memcg, which triggers MEMCG_LRU_OLD.
+ * 7. Offlining a memcg, which triggers MEMCG_LRU_OLD.
*
- * Note that memcg LRU only applies to global reclaim, and the round-robin
- * incrementing of their max_seq counters ensures the eventual fairness to all
- * eligible memcgs. For memcg reclaim, it still relies on mem_cgroup_iter().
+ * Notes:
+ * 1. Memcg LRU only applies to global reclaim, and the round-robin incrementing
+ * of their max_seq counters ensures the eventual fairness to all eligible
+ * memcgs. For memcg reclaim, it still relies on mem_cgroup_iter().
+ * 2. There are only two valid generations: old (seq) and young (seq+1).
+ * MEMCG_NR_GENS is set to three so that when reading the generation counter
+ * locklessly, a stale value (seq-1) does not wraparound to young.
*/
-#define MEMCG_NR_GENS 2
+#define MEMCG_NR_GENS 3
#define MEMCG_NR_BINS 8
struct lru_gen_memcg {
ML_PRIV_CAN,
};
+enum netdev_stat_type {
+ NETDEV_PCPU_STAT_NONE,
+ NETDEV_PCPU_STAT_LSTATS, /* struct pcpu_lstats */
+ NETDEV_PCPU_STAT_TSTATS, /* struct pcpu_sw_netstats */
+ NETDEV_PCPU_STAT_DSTATS, /* struct pcpu_dstats */
+};
+
/**
* struct net_device - The DEVICE structure.
*
*
* @ml_priv: Mid-layer private
* @ml_priv_type: Mid-layer private type
- * @lstats: Loopback statistics
- * @tstats: Tunnel statistics
- * @dstats: Dummy statistics
- * @vstats: Virtual ethernet statistics
+ *
+ * @pcpu_stat_type: Type of device statistics which the core should
+ * allocate/free: none, lstats, tstats, dstats. none
+ * means the driver is handling statistics allocation/
+ * freeing internally.
+ * @lstats: Loopback statistics: packets, bytes
+ * @tstats: Tunnel statistics: RX/TX packets, RX/TX bytes
+ * @dstats: Dummy statistics: RX/TX/drop packets, RX/TX bytes
*
* @garp_port: GARP
* @mrp_port: MRP
void *ml_priv;
enum netdev_ml_priv_type ml_priv_type;
+ enum netdev_stat_type pcpu_stat_type:8;
union {
struct pcpu_lstats __percpu *lstats;
struct pcpu_sw_netstats __percpu *tstats;
struct u64_stats_sync syncp;
} __aligned(4 * sizeof(u64));
+struct pcpu_dstats {
+ u64 rx_packets;
+ u64 rx_bytes;
+ u64 rx_drops;
+ u64 tx_packets;
+ u64 tx_bytes;
+ u64 tx_drops;
+ struct u64_stats_sync syncp;
+} __aligned(8 * sizeof(u64));
+
struct pcpu_lstats {
u64_stats_t packets;
u64_stats_t bytes;
/* writeback related tags are not used */
AS_NO_WRITEBACK_TAGS = 5,
AS_LARGE_FOLIO_SUPPORT = 6,
- AS_RELEASE_ALWAYS = 7, /* Call ->release_folio(), even if no private data */
- AS_UNMOVABLE = 8, /* The mapping cannot be moved, ever */
+ AS_RELEASE_ALWAYS, /* Call ->release_folio(), even if no private data */
+ AS_STABLE_WRITES, /* must wait for writeback before modifying
+ folio contents */
+ AS_UNMOVABLE, /* The mapping cannot be moved, ever */
};
/**
clear_bit(AS_RELEASE_ALWAYS, &mapping->flags);
}
+static inline bool mapping_stable_writes(const struct address_space *mapping)
+{
+ return test_bit(AS_STABLE_WRITES, &mapping->flags);
+}
+
+static inline void mapping_set_stable_writes(struct address_space *mapping)
+{
+ set_bit(AS_STABLE_WRITES, &mapping->flags);
+}
+
+static inline void mapping_clear_stable_writes(struct address_space *mapping)
+{
+ clear_bit(AS_STABLE_WRITES, &mapping->flags);
+}
+
static inline void mapping_set_unmovable(struct address_space *mapping)
{
/*
int pci_disable_link_state(struct pci_dev *pdev, int state);
int pci_disable_link_state_locked(struct pci_dev *pdev, int state);
int pci_enable_link_state(struct pci_dev *pdev, int state);
+int pci_enable_link_state_locked(struct pci_dev *pdev, int state);
void pcie_no_aspm(void);
bool pcie_aspm_support_enabled(void);
bool pcie_aspm_enabled(struct pci_dev *pdev);
{ return 0; }
static inline int pci_enable_link_state(struct pci_dev *pdev, int state)
{ return 0; }
+static inline int pci_enable_link_state_locked(struct pci_dev *pdev, int state)
+{ return 0; }
static inline void pcie_no_aspm(void) { }
static inline bool pcie_aspm_support_enabled(void) { return false; }
static inline bool pcie_aspm_enabled(struct pci_dev *pdev) { return false; }
};
/*
- * ,-----------------------[1:n]----------------------.
- * V V
- * perf_event_context <-[1:n]-> perf_event_pmu_context <--- perf_event
- * ^ ^ | |
- * `--------[1:n]---------' `-[n:1]-> pmu <-[1:n]-'
+ * ,-----------------------[1:n]------------------------.
+ * V V
+ * perf_event_context <-[1:n]-> perf_event_pmu_context <-[1:n]- perf_event
+ * | |
+ * `--[n:1]-> pmu <-[1:n]--'
*
*
* struct perf_event_pmu_context lifetime is refcount based and RCU freed
* ctx->mutex pinning the configuration. Since we hold a reference on
* group_leader (through the filedesc) it can't go away, therefore it's
* associated pmu_ctx must exist and cannot change due to ctx->mutex.
+ *
+ * perf_event holds a refcount on perf_event_context
+ * perf_event holds a refcount on perf_event_pmu_context
*/
struct perf_event_pmu_context {
struct pmu *pmu;
/* Charging mode - 1=Barrel, 2=USB */
#define ASUS_WMI_DEVID_CHARGE_MODE 0x0012006C
+/* MCU powersave mode */
+#define ASUS_WMI_DEVID_MCU_POWERSAVE 0x001200E2
+
/* epu is connected? 1 == true */
#define ASUS_WMI_DEVID_EGPU_CONNECTED 0x00090018
/* egpu on/off */
*/
struct rethook {
void *data;
- rethook_handler_t handler;
+ /*
+ * To avoid sparse warnings, this uses a raw function pointer with
+ * __rcu, instead of rethook_handler_t. But this must be same as
+ * rethook_handler_t.
+ */
+ void (__rcu *handler) (struct rethook_node *, void *, unsigned long, struct pt_regs *);
struct objpool_head pool;
struct rcu_head rcu;
};
struct mutex work_mutex;
struct sk_psock_work_state work_state;
struct delayed_work work;
+ struct sock *sk_pair;
struct rcu_work rwork;
};
#ifdef CONFIG_GCC_PLUGIN_STACKLEAK
#include <asm/stacktrace.h>
+#include <linux/linkage.h>
/*
* The lowest address on tsk's stack which we can plausibly erase.
# endif
}
+asmlinkage void noinstr stackleak_erase(void);
+asmlinkage void noinstr stackleak_erase_on_task_stack(void);
+asmlinkage void noinstr stackleak_erase_off_task_stack(void);
+void __no_caller_saved_registers noinstr stackleak_track_stack(void);
+
#else /* !CONFIG_GCC_PLUGIN_STACKLEAK */
static inline void stackleak_task_init(struct task_struct *t) { }
#endif
bool hs_enable; /* FPE handshake enable */
enum stmmac_fpe_state lp_fpe_state; /* Link Partner FPE state */
enum stmmac_fpe_state lo_fpe_state; /* Local station FPE state */
+ u32 fpe_csr; /* MAC_FPE_CTRL_STS reg cache */
};
struct stmmac_safety_feature_cfg {
#ifdef CONFIG_TCP_AO
u8 ao_keyid;
u8 ao_rcv_next;
- u8 maclen;
+ bool used_tcp_ao;
#endif
};
static inline bool tcp_rsk_used_ao(const struct request_sock *req)
{
- /* The real length of MAC is saved in the request socket,
- * signing anything with zero-length makes no sense, so here is
- * a little hack..
- */
#ifndef CONFIG_TCP_AO
return false;
#else
- return tcp_rsk(req)->maclen != 0;
+ return tcp_rsk(req)->used_tcp_ao;
#endif
}
#ifndef _LINUX_UNITS_H
#define _LINUX_UNITS_H
+#include <linux/bits.h>
#include <linux/math.h>
/* Metric prefixes in accordance with Système international (d'unités) */
*/
int (*set_wakeup)(struct usb_phy *x, bool enabled);
- /* notify phy port status change */
- int (*notify_port_status)(struct usb_phy *x, int port,
- u16 portstatus, u16 portchange);
-
/* notify phy connect status change */
int (*notify_connect)(struct usb_phy *x,
enum usb_device_speed speed);
return 0;
}
-static inline int
-usb_phy_notify_port_status(struct usb_phy *x, int port, u16 portstatus, u16 portchange)
-{
- if (x && x->notify_port_status)
- return x->notify_port_status(x, port, portstatus, portchange);
- else
- return 0;
-}
-
static inline int
usb_phy_notify_connect(struct usb_phy *x, enum usb_device_speed speed)
{
#define VENDOR_ID_NVIDIA 0x0955
#define VENDOR_ID_TPLINK 0x2357
#define VENDOR_ID_DLINK 0x2001
+#define VENDOR_ID_ASUS 0x0b05
#if IS_REACHABLE(CONFIG_USB_RTL8152)
extern u8 rtl8152_get_version(struct usb_interface *intf);
/*
* External user API
*/
-#if IS_ENABLED(CONFIG_VFIO_GROUP)
struct iommu_group *vfio_file_iommu_group(struct file *file);
+
+#if IS_ENABLED(CONFIG_VFIO_GROUP)
bool vfio_file_is_group(struct file *file);
bool vfio_file_has_dev(struct file *file, struct vfio_device *device);
#else
-static inline struct iommu_group *vfio_file_iommu_group(struct file *file)
-{
- return NULL;
-}
-
static inline bool vfio_file_is_group(struct file *file)
{
return false;
#include <linux/pci.h>
#include <linux/virtio_pci.h>
-struct virtio_pci_modern_common_cfg {
- struct virtio_pci_common_cfg cfg;
-
- __le16 queue_notify_data; /* read-write */
- __le16 queue_reset; /* read-write */
-};
-
/**
* struct virtio_pci_modern_device - info for modern PCI virtio
* @pci_dev: Ptr to the PCI device struct
__u8 length;
__u8 prefix_len;
+ union __packed {
+ __u8 flags;
+ struct __packed {
#if defined(__BIG_ENDIAN_BITFIELD)
- __u8 onlink : 1,
+ __u8 onlink : 1,
autoconf : 1,
reserved : 6;
#elif defined(__LITTLE_ENDIAN_BITFIELD)
- __u8 reserved : 6,
+ __u8 reserved : 6,
autoconf : 1,
onlink : 1;
#else
#error "Please fix <asm/byteorder.h>"
#endif
+ };
+ };
__be32 valid;
__be32 prefered;
__be32 reserved2;
struct in6_addr prefix;
};
+/* rfc4861 4.6.2: IPv6 PIO is 32 bytes in size */
+static_assert(sizeof(struct prefix_info) == 32);
+
#include <linux/ipv6.h>
#include <linux/netdevice.h>
#include <net/if_inet6.h>
};
#define unix_sk(ptr) container_of_const(ptr, struct unix_sock, sk)
+#define unix_peer(sk) (unix_sk(sk)->peer)
#define peer_wait peer_wq.wait
struct smp_csrk {
bdaddr_t bdaddr;
u8 bdaddr_type;
+ u8 link_type;
u8 type;
u8 val[16];
};
struct rcu_head rcu;
bdaddr_t bdaddr;
u8 bdaddr_type;
+ u8 link_type;
u8 authenticated;
u8 type;
u8 enc_size;
bdaddr_t rpa;
bdaddr_t bdaddr;
u8 addr_type;
+ u8 link_type;
u8 val[16];
};
struct list_head list;
struct rcu_head rcu;
bdaddr_t bdaddr;
+ u8 bdaddr_type;
+ u8 link_type;
u8 type;
u8 val[HCI_LINK_KEY_SIZE];
u8 pin_len;
continue;
/* Match CIG ID if set */
- if (cig != BT_ISO_QOS_CIG_UNSET && cig != c->iso_qos.ucast.cig)
+ if (cig != c->iso_qos.ucast.cig)
continue;
/* Match CIS ID if set */
- if (id != BT_ISO_QOS_CIS_UNSET && id != c->iso_qos.ucast.cis)
+ if (id != c->iso_qos.ucast.cis)
continue;
/* Match destination address if set */
*/
void cfg80211_links_removed(struct net_device *dev, u16 link_mask);
+#ifdef CONFIG_CFG80211_DEBUGFS
+/**
+ * wiphy_locked_debugfs_read - do a locked read in debugfs
+ * @wiphy: the wiphy to use
+ * @file: the file being read
+ * @buf: the buffer to fill and then read from
+ * @bufsize: size of the buffer
+ * @userbuf: the user buffer to copy to
+ * @count: read count
+ * @ppos: read position
+ * @handler: the read handler to call (under wiphy lock)
+ * @data: additional data to pass to the read handler
+ */
+ssize_t wiphy_locked_debugfs_read(struct wiphy *wiphy, struct file *file,
+ char *buf, size_t bufsize,
+ char __user *userbuf, size_t count,
+ loff_t *ppos,
+ ssize_t (*handler)(struct wiphy *wiphy,
+ struct file *file,
+ char *buf,
+ size_t bufsize,
+ void *data),
+ void *data);
+
+/**
+ * wiphy_locked_debugfs_write - do a locked write in debugfs
+ * @wiphy: the wiphy to use
+ * @file: the file being written to
+ * @buf: the buffer to copy the user data to
+ * @bufsize: size of the buffer
+ * @userbuf: the user buffer to copy from
+ * @count: read count
+ * @handler: the write handler to call (under wiphy lock)
+ * @data: additional data to pass to the write handler
+ */
+ssize_t wiphy_locked_debugfs_write(struct wiphy *wiphy, struct file *file,
+ char *buf, size_t bufsize,
+ const char __user *userbuf, size_t count,
+ ssize_t (*handler)(struct wiphy *wiphy,
+ struct file *file,
+ char *buf,
+ size_t count,
+ void *data),
+ void *data);
+#endif
+
#endif /* __NET_CFG80211_H */
* struct genl_multicast_group - generic netlink multicast group
* @name: name of the multicast group, names are per-family
* @flags: GENL_* flags (%GENL_ADMIN_PERM or %GENL_UNS_ADMIN_PERM)
+ * @cap_sys_admin: whether %CAP_SYS_ADMIN is required for binding
*/
struct genl_multicast_group {
char name[GENL_NAMSIZ];
u8 flags;
+ u8 cap_sys_admin:1;
};
struct genl_split_ops;
#define IF_RS_SENT 0x10
#define IF_READY 0x80000000
-/* prefix flags */
-#define IF_PREFIX_ONLINK 0x01
-#define IF_PREFIX_AUTOCONF 0x02
-
enum {
INET6_IFADDR_STATE_PREDAD,
INET6_IFADDR_STATE_DAD,
refcount_t fib6_ref;
unsigned long expires;
-
- struct hlist_node gc_link;
-
struct dst_metrics *fib6_metrics;
#define fib6_pmtu fib6_metrics->metrics[RTAX_MTU-1]
return rt->fib6_src.plen > 0;
}
+static inline void fib6_clean_expires(struct fib6_info *f6i)
+{
+ f6i->fib6_flags &= ~RTF_EXPIRES;
+ f6i->expires = 0;
+}
+
+static inline void fib6_set_expires(struct fib6_info *f6i,
+ unsigned long expires)
+{
+ f6i->expires = expires;
+ f6i->fib6_flags |= RTF_EXPIRES;
+}
+
static inline bool fib6_check_expired(const struct fib6_info *f6i)
{
if (f6i->fib6_flags & RTF_EXPIRES)
return false;
}
-static inline bool fib6_has_expires(const struct fib6_info *f6i)
-{
- return f6i->fib6_flags & RTF_EXPIRES;
-}
-
/* Function to safely get fn->fn_sernum for passed in rt
* and store result in passed in cookie.
* Return true if we can get cookie safely
struct inet_peer_base tb6_peers;
unsigned int flags;
unsigned int fib_seq;
- struct hlist_head tb6_gc_hlist; /* GC candidates */
#define RT6_TABLE_HAS_DFLT_ROUTER BIT(0)
};
int fib6_init(void);
-/* fib6_info must be locked by the caller, and fib6_info->fib6_table can be
- * NULL.
- */
-static inline void fib6_set_expires_locked(struct fib6_info *f6i,
- unsigned long expires)
-{
- struct fib6_table *tb6;
-
- tb6 = f6i->fib6_table;
- f6i->expires = expires;
- if (tb6 && !fib6_has_expires(f6i))
- hlist_add_head(&f6i->gc_link, &tb6->tb6_gc_hlist);
- f6i->fib6_flags |= RTF_EXPIRES;
-}
-
-/* fib6_info must be locked by the caller, and fib6_info->fib6_table can be
- * NULL. If fib6_table is NULL, the fib6_info will no be inserted into the
- * list of GC candidates until it is inserted into a table.
- */
-static inline void fib6_set_expires(struct fib6_info *f6i,
- unsigned long expires)
-{
- spin_lock_bh(&f6i->fib6_table->tb6_lock);
- fib6_set_expires_locked(f6i, expires);
- spin_unlock_bh(&f6i->fib6_table->tb6_lock);
-}
-
-static inline void fib6_clean_expires_locked(struct fib6_info *f6i)
-{
- if (fib6_has_expires(f6i))
- hlist_del_init(&f6i->gc_link);
- f6i->fib6_flags &= ~RTF_EXPIRES;
- f6i->expires = 0;
-}
-
-static inline void fib6_clean_expires(struct fib6_info *f6i)
-{
- spin_lock_bh(&f6i->fib6_table->tb6_lock);
- fib6_clean_expires_locked(f6i);
- spin_unlock_bh(&f6i->fib6_table->tb6_lock);
-}
-
struct ipv6_route_iter {
struct seq_net_private p;
struct fib6_walker w;
struct rcu_head rcu;
struct net_device *dev;
netdevice_tracker dev_tracker;
- u8 primary_key[0];
+ u8 primary_key[];
} __randomize_layout;
struct neigh_ops {
enum flow_offload_tuple_dir dir,
struct nf_flow_rule *flow_rule);
void (*free)(struct nf_flowtable *ft);
+ void (*get)(struct nf_flowtable *ft);
+ void (*put)(struct nf_flowtable *ft);
nf_hookfn *hook;
struct module *owner;
};
}
list_add_tail(&block_cb->list, &block->cb_list);
+ up_write(&flow_table->flow_block_lock);
+
+ if (flow_table->type->get)
+ flow_table->type->get(flow_table);
+ return 0;
unlock:
up_write(&flow_table->flow_block_lock);
WARN_ON(true);
}
up_write(&flow_table->flow_block_lock);
+
+ if (flow_table->type->put)
+ flow_table->type->put(flow_table);
}
void flow_offload_route_init(struct flow_offload *flow,
return *(__force __be32 *)sreg;
}
-static inline void nft_reg_store64(u32 *dreg, u64 val)
+static inline void nft_reg_store64(u64 *dreg, u64 val)
{
- put_unaligned(val, (u64 *)dreg);
+ put_unaligned(val, dreg);
}
static inline u64 nft_reg_load64(const u32 *sreg)
int netkit_link_attach(const union bpf_attr *attr, struct bpf_prog *prog);
int netkit_prog_detach(const union bpf_attr *attr, struct bpf_prog *prog);
int netkit_prog_query(const union bpf_attr *attr, union bpf_attr __user *uattr);
+INDIRECT_CALLABLE_DECLARE(struct net_device *netkit_peer_dev(struct net_device *dev));
#else
static inline int netkit_prog_attach(const union bpf_attr *attr,
struct bpf_prog *prog)
{
return -EINVAL;
}
+
+static inline struct net_device *netkit_peer_dev(struct net_device *dev)
+{
+ return NULL;
+}
#endif /* CONFIG_NETKIT */
#endif /* __NET_NETKIT_H */
return sk->sk_type == SOCK_STREAM && sk->sk_protocol == IPPROTO_TCP;
}
+static inline bool sk_is_stream_unix(const struct sock *sk)
+{
+ return sk->sk_family == AF_UNIX && sk->sk_type == SOCK_STREAM;
+}
+
/**
* sk_eat_skb - Release a skb if it is no longer needed
* @sk: socket to eat this skb from
return to_ct_params(a)->nf_ft;
}
+static inline struct nf_conntrack_helper *tcf_ct_helper(const struct tc_action *a)
+{
+ return to_ct_params(a)->helper;
+}
+
#else
static inline uint16_t tcf_ct_zone(const struct tc_action *a) { return 0; }
static inline int tcf_ct_action(const struct tc_action *a) { return 0; }
{
return NULL;
}
+static inline struct nf_conntrack_helper *tcf_ct_helper(const struct tc_action *a)
+{
+ return NULL;
+}
#endif /* CONFIG_NF_CONNTRACK */
#if IS_ENABLED(CONFIG_NET_ACT_CT)
return tcp_win_from_space(sk, READ_ONCE(sk->sk_rcvbuf));
}
-static inline void tcp_adjust_rcv_ssthresh(struct sock *sk)
+static inline void __tcp_adjust_rcv_ssthresh(struct sock *sk, u32 new_ssthresh)
{
int unused_mem = sk_unused_reserved_mem(sk);
struct tcp_sock *tp = tcp_sk(sk);
- tp->rcv_ssthresh = min(tp->rcv_ssthresh, 4U * tp->advmss);
+ tp->rcv_ssthresh = min(tp->rcv_ssthresh, new_ssthresh);
if (unused_mem)
tp->rcv_ssthresh = max_t(u32, tp->rcv_ssthresh,
tcp_win_from_space(sk, unused_mem));
}
+static inline void tcp_adjust_rcv_ssthresh(struct sock *sk)
+{
+ __tcp_adjust_rcv_ssthresh(sk, 4U * tcp_sk(sk)->advmss);
+}
+
void tcp_cleanup_rbuf(struct sock *sk, int copied);
void __tcp_cleanup_rbuf(struct sock *sk, int copied);
return key->maclen;
}
+/* Use tcp_ao_len_aligned() for TCP header calculations */
static inline int tcp_ao_len(const struct tcp_ao_key *key)
{
return tcp_ao_maclen(key) + sizeof(struct tcp_ao_hdr);
}
+static inline int tcp_ao_len_aligned(const struct tcp_ao_key *key)
+{
+ return round_up(tcp_ao_len(key), 4);
+}
+
static inline unsigned int tcp_ao_digest_size(struct tcp_ao_key *key)
{
return key->digest_size;
{
__rdma_block_iter_start(biter, umem->sgt_append.sgt.sgl,
umem->sgt_append.sgt.nents, pgsz);
+ biter->__sg_advance = ib_umem_offset(umem) & ~(pgsz - 1);
+ biter->__sg_numblocks = ib_umem_num_dma_blocks(umem, pgsz);
+}
+
+static inline bool __rdma_umem_block_iter_next(struct ib_block_iter *biter)
+{
+ return __rdma_block_iter_next(biter) && biter->__sg_numblocks--;
}
/**
*/
#define rdma_umem_for_each_dma_block(umem, biter, pgsz) \
for (__rdma_umem_block_iter_start(biter, umem, pgsz); \
- __rdma_block_iter_next(biter);)
+ __rdma_umem_block_iter_next(biter);)
#ifdef CONFIG_INFINIBAND_USER_MEM
/* internal states */
struct scatterlist *__sg; /* sg holding the current aligned block */
dma_addr_t __dma_addr; /* unaligned DMA address of this block */
+ size_t __sg_numblocks; /* ib_umem_num_dma_blocks() */
unsigned int __sg_nents; /* number of SG entries */
unsigned int __sg_advance; /* number of bytes to advance in sg in next step */
unsigned int __pg_bit; /* alignment of current block */
* power state for system suspend/resume (suspend to RAM and
* hibernation) operations.
*/
- bool manage_system_start_stop;
+ unsigned manage_system_start_stop:1;
/*
* If true, let the high-level device driver (sd) manage the device
* power state for runtime device suspand and resume operations.
*/
- bool manage_runtime_start_stop;
+ unsigned manage_runtime_start_stop:1;
/*
* If true, let the high-level device driver (sd) manage the device
* power state for system shutdown (power off) operations.
*/
- bool manage_shutdown;
+ unsigned manage_shutdown:1;
+
+ /*
+ * If set and if the device is runtime suspended, ask the high-level
+ * device driver (sd) to force a runtime resume of the device.
+ */
+ unsigned force_runtime_start_on_system_start:1;
unsigned removable:1;
unsigned changed:1; /* Data invalid due to media change */
bool cs35l41_safe_reset(struct regmap *regmap, enum cs35l41_boost_type b_type);
int cs35l41_mdsync_up(struct regmap *regmap);
int cs35l41_global_enable(struct device *dev, struct regmap *regmap, enum cs35l41_boost_type b_type,
- int enable, bool firmware_running);
+ int enable, struct cs_dsp *dsp);
#endif /* __CS35L41_H */
__field( void *, clnt )
__field( __u8, type )
__field( __u16, tag )
- __array( unsigned char, line, P9_PROTO_DUMP_SZ )
+ __dynamic_array(unsigned char, line,
+ min_t(size_t, pdu->capacity, P9_PROTO_DUMP_SZ))
),
TP_fast_assign(
__entry->clnt = clnt;
__entry->type = pdu->id;
__entry->tag = pdu->tag;
- memcpy(__entry->line, pdu->sdata, P9_PROTO_DUMP_SZ);
+ memcpy(__get_dynamic_array(line), pdu->sdata,
+ __get_dynamic_array_len(line));
),
- TP_printk("clnt %lu %s(tag = %d)\n%.3x: %16ph\n%.3x: %16ph\n",
+ TP_printk("clnt %lu %s(tag = %d)\n%*ph\n",
(unsigned long)__entry->clnt, show_9p_op(__entry->type),
- __entry->tag, 0, __entry->line, 16, __entry->line + 16)
+ __entry->tag, __get_dynamic_array_len(line),
+ __get_dynamic_array(line))
);
__entry->valid ? "valid" : "invalid")
);
-#if defined(CONFIG_HAVE_KVM_IRQFD)
+#if defined(CONFIG_HAVE_KVM_IRQCHIP)
TRACE_EVENT(kvm_set_irq,
TP_PROTO(unsigned int gsi, int level, int irq_source_id),
TP_ARGS(gsi, level, irq_source_id),
TP_printk("gsi %u level %d source %d",
__entry->gsi, __entry->level, __entry->irq_source_id)
);
-#endif /* defined(CONFIG_HAVE_KVM_IRQFD) */
+#endif /* defined(CONFIG_HAVE_KVM_IRQCHIP) */
#if defined(__KVM_HAVE_IOAPIC)
#define kvm_deliver_mode \
#endif /* defined(__KVM_HAVE_IOAPIC) */
-#if defined(CONFIG_HAVE_KVM_IRQFD)
+#if defined(CONFIG_HAVE_KVM_IRQCHIP)
#ifdef kvm_irqchips
#define kvm_ack_irq_string "irqchip %s pin %u"
TP_printk(kvm_ack_irq_string, kvm_ack_irq_parm)
);
-#endif /* defined(CONFIG_HAVE_KVM_IRQFD) */
+#endif /* defined(CONFIG_HAVE_KVM_IRQCHIP) */
E_(rxrpc_rtt_tx_ping, "PING")
#define rxrpc_rtt_rx_traces \
- EM(rxrpc_rtt_rx_cancel, "CNCL") \
+ EM(rxrpc_rtt_rx_other_ack, "OACK") \
EM(rxrpc_rtt_rx_obsolete, "OBSL") \
EM(rxrpc_rtt_rx_lost, "LOST") \
EM(rxrpc_rtt_rx_ping_response, "PONG") \
*/
#define BTRFS_METADATA_ITEM_KEY 169
+/*
+ * Special inline ref key which stores the id of the subvolume which originally
+ * created the extent. This subvolume owns the extent permanently from the
+ * perspective of simple quotas. Needed to know which subvolume to free quota
+ * usage from when the extent is deleted.
+ *
+ * Stored as an inline ref rather to avoid wasting space on a separate item on
+ * top of the existing extent item. However, unlike the other inline refs,
+ * there is one one owner ref per extent rather than one per extent.
+ *
+ * Because of this, it goes at the front of the list of inline refs, and thus
+ * must have a lower type value than any other inline ref type (to satisfy the
+ * disk format rule that inline refs have non-decreasing type).
+ */
+#define BTRFS_EXTENT_OWNER_REF_KEY 172
+
#define BTRFS_TREE_BLOCK_REF_KEY 176
#define BTRFS_EXTENT_DATA_REF_KEY 178
#define BTRFS_SHARED_DATA_REF_KEY 184
-/*
- * Special inline ref key which stores the id of the subvolume which originally
- * created the extent. This subvolume owns the extent permanently from the
- * perspective of simple quotas. Needed to know which subvolume to free quota
- * usage from when the extent is deleted.
- */
-#define BTRFS_EXTENT_OWNER_REF_KEY 188
-
/*
* block groups give us hints into the extent allocation trees. Which
* blocks are free etc etc
#define AT_HANDLE_FID AT_REMOVEDIR /* file handle is needed to
compare object identity and may not
be usable to open_by_handle_at(2) */
+#if defined(__KERNEL__)
+#define AT_GETATTR_NOSEC 0x80000000
+#endif
#endif /* _UAPI_LINUX_FCNTL_H */
* - add FUSE_HAS_EXPIRE_ONLY
*
* 7.39
- * - add FUSE_DIRECT_IO_RELAX
+ * - add FUSE_DIRECT_IO_ALLOW_MMAP
* - add FUSE_STATX and related structures
*/
* FUSE_CREATE_SUPP_GROUP: add supplementary group info to create, mkdir,
* symlink and mknod (single group that matches parent)
* FUSE_HAS_EXPIRE_ONLY: kernel supports expiry-only entry invalidation
- * FUSE_DIRECT_IO_RELAX: relax restrictions in FOPEN_DIRECT_IO mode, for now
- * allow shared mmap
+ * FUSE_DIRECT_IO_ALLOW_MMAP: allow shared mmap in FOPEN_DIRECT_IO mode.
*/
#define FUSE_ASYNC_READ (1 << 0)
#define FUSE_POSIX_LOCKS (1 << 1)
#define FUSE_HAS_INODE_DAX (1ULL << 33)
#define FUSE_CREATE_SUPP_GROUP (1ULL << 34)
#define FUSE_HAS_EXPIRE_ONLY (1ULL << 35)
-#define FUSE_DIRECT_IO_RELAX (1ULL << 36)
+#define FUSE_DIRECT_IO_ALLOW_MMAP (1ULL << 36)
+
+/* Obsolete alias for FUSE_DIRECT_IO_ALLOW_MMAP */
+#define FUSE_DIRECT_IO_RELAX FUSE_DIRECT_IO_ALLOW_MMAP
/**
* CUSE INIT request/reply flags
#define KVM_API_VERSION 12
-/* *** Deprecated interfaces *** */
-
-#define KVM_TRC_SHIFT 16
-
-#define KVM_TRC_ENTRYEXIT (1 << KVM_TRC_SHIFT)
-#define KVM_TRC_HANDLER (1 << (KVM_TRC_SHIFT + 1))
-
-#define KVM_TRC_VMENTRY (KVM_TRC_ENTRYEXIT + 0x01)
-#define KVM_TRC_VMEXIT (KVM_TRC_ENTRYEXIT + 0x02)
-#define KVM_TRC_PAGE_FAULT (KVM_TRC_HANDLER + 0x01)
-
-#define KVM_TRC_HEAD_SIZE 12
-#define KVM_TRC_CYCLE_SIZE 8
-#define KVM_TRC_EXTRA_MAX 7
-
-#define KVM_TRC_INJ_VIRQ (KVM_TRC_HANDLER + 0x02)
-#define KVM_TRC_REDELIVER_EVT (KVM_TRC_HANDLER + 0x03)
-#define KVM_TRC_PEND_INTR (KVM_TRC_HANDLER + 0x04)
-#define KVM_TRC_IO_READ (KVM_TRC_HANDLER + 0x05)
-#define KVM_TRC_IO_WRITE (KVM_TRC_HANDLER + 0x06)
-#define KVM_TRC_CR_READ (KVM_TRC_HANDLER + 0x07)
-#define KVM_TRC_CR_WRITE (KVM_TRC_HANDLER + 0x08)
-#define KVM_TRC_DR_READ (KVM_TRC_HANDLER + 0x09)
-#define KVM_TRC_DR_WRITE (KVM_TRC_HANDLER + 0x0A)
-#define KVM_TRC_MSR_READ (KVM_TRC_HANDLER + 0x0B)
-#define KVM_TRC_MSR_WRITE (KVM_TRC_HANDLER + 0x0C)
-#define KVM_TRC_CPUID (KVM_TRC_HANDLER + 0x0D)
-#define KVM_TRC_INTR (KVM_TRC_HANDLER + 0x0E)
-#define KVM_TRC_NMI (KVM_TRC_HANDLER + 0x0F)
-#define KVM_TRC_VMMCALL (KVM_TRC_HANDLER + 0x10)
-#define KVM_TRC_HLT (KVM_TRC_HANDLER + 0x11)
-#define KVM_TRC_CLTS (KVM_TRC_HANDLER + 0x12)
-#define KVM_TRC_LMSW (KVM_TRC_HANDLER + 0x13)
-#define KVM_TRC_APIC_ACCESS (KVM_TRC_HANDLER + 0x14)
-#define KVM_TRC_TDP_FAULT (KVM_TRC_HANDLER + 0x15)
-#define KVM_TRC_GTLB_WRITE (KVM_TRC_HANDLER + 0x16)
-#define KVM_TRC_STLB_WRITE (KVM_TRC_HANDLER + 0x17)
-#define KVM_TRC_STLB_INVAL (KVM_TRC_HANDLER + 0x18)
-#define KVM_TRC_PPC_INSTR (KVM_TRC_HANDLER + 0x19)
-
-struct kvm_user_trace_setup {
- __u32 buf_size;
- __u32 buf_nr;
-};
-
-#define __KVM_DEPRECATED_MAIN_W_0x06 \
- _IOW(KVMIO, 0x06, struct kvm_user_trace_setup)
-#define __KVM_DEPRECATED_MAIN_0x07 _IO(KVMIO, 0x07)
-#define __KVM_DEPRECATED_MAIN_0x08 _IO(KVMIO, 0x08)
-
-#define __KVM_DEPRECATED_VM_R_0x70 _IOR(KVMIO, 0x70, struct kvm_assigned_irq)
-
-struct kvm_breakpoint {
- __u32 enabled;
- __u32 padding;
- __u64 address;
-};
-
-struct kvm_debug_guest {
- __u32 enabled;
- __u32 pad;
- struct kvm_breakpoint breakpoints[4];
- __u32 singlestep;
-};
-
-#define __KVM_DEPRECATED_VCPU_W_0x87 _IOW(KVMIO, 0x87, struct kvm_debug_guest)
-
-/* *** End of deprecated interfaces *** */
-
-
/* for KVM_SET_USER_MEMORY_REGION */
struct kvm_userspace_memory_region {
__u32 slot;
*/
#define KVM_GET_VCPU_MMAP_SIZE _IO(KVMIO, 0x04) /* in bytes */
#define KVM_GET_SUPPORTED_CPUID _IOWR(KVMIO, 0x05, struct kvm_cpuid2)
-#define KVM_TRACE_ENABLE __KVM_DEPRECATED_MAIN_W_0x06
-#define KVM_TRACE_PAUSE __KVM_DEPRECATED_MAIN_0x07
-#define KVM_TRACE_DISABLE __KVM_DEPRECATED_MAIN_0x08
#define KVM_GET_EMULATED_CPUID _IOWR(KVMIO, 0x09, struct kvm_cpuid2)
#define KVM_GET_MSR_FEATURE_INDEX_LIST _IOWR(KVMIO, 0x0a, struct kvm_msr_list)
_IOW(KVMIO, 0x67, struct kvm_coalesced_mmio_zone)
#define KVM_UNREGISTER_COALESCED_MMIO \
_IOW(KVMIO, 0x68, struct kvm_coalesced_mmio_zone)
-#define KVM_ASSIGN_PCI_DEVICE _IOR(KVMIO, 0x69, \
- struct kvm_assigned_pci_dev)
#define KVM_SET_GSI_ROUTING _IOW(KVMIO, 0x6a, struct kvm_irq_routing)
-/* deprecated, replaced by KVM_ASSIGN_DEV_IRQ */
-#define KVM_ASSIGN_IRQ __KVM_DEPRECATED_VM_R_0x70
-#define KVM_ASSIGN_DEV_IRQ _IOW(KVMIO, 0x70, struct kvm_assigned_irq)
#define KVM_REINJECT_CONTROL _IO(KVMIO, 0x71)
-#define KVM_DEASSIGN_PCI_DEVICE _IOW(KVMIO, 0x72, \
- struct kvm_assigned_pci_dev)
-#define KVM_ASSIGN_SET_MSIX_NR _IOW(KVMIO, 0x73, \
- struct kvm_assigned_msix_nr)
-#define KVM_ASSIGN_SET_MSIX_ENTRY _IOW(KVMIO, 0x74, \
- struct kvm_assigned_msix_entry)
-#define KVM_DEASSIGN_DEV_IRQ _IOW(KVMIO, 0x75, struct kvm_assigned_irq)
#define KVM_IRQFD _IOW(KVMIO, 0x76, struct kvm_irqfd)
#define KVM_CREATE_PIT2 _IOW(KVMIO, 0x77, struct kvm_pit_config)
#define KVM_SET_BOOT_CPU_ID _IO(KVMIO, 0x78)
* KVM_CAP_VM_TSC_CONTROL to set defaults for a VM */
#define KVM_SET_TSC_KHZ _IO(KVMIO, 0xa2)
#define KVM_GET_TSC_KHZ _IO(KVMIO, 0xa3)
-/* Available with KVM_CAP_PCI_2_3 */
-#define KVM_ASSIGN_SET_INTX_MASK _IOW(KVMIO, 0xa4, \
- struct kvm_assigned_pci_dev)
/* Available with KVM_CAP_SIGNAL_MSI */
#define KVM_SIGNAL_MSI _IOW(KVMIO, 0xa5, struct kvm_msi)
/* Available with KVM_CAP_PPC_GET_SMMU_INFO */
#define KVM_SET_SREGS _IOW(KVMIO, 0x84, struct kvm_sregs)
#define KVM_TRANSLATE _IOWR(KVMIO, 0x85, struct kvm_translation)
#define KVM_INTERRUPT _IOW(KVMIO, 0x86, struct kvm_interrupt)
-/* KVM_DEBUG_GUEST is no longer supported, use KVM_SET_GUEST_DEBUG instead */
-#define KVM_DEBUG_GUEST __KVM_DEPRECATED_VCPU_W_0x87
#define KVM_GET_MSRS _IOWR(KVMIO, 0x88, struct kvm_msrs)
#define KVM_SET_MSRS _IOW(KVMIO, 0x89, struct kvm_msrs)
#define KVM_SET_CPUID _IOW(KVMIO, 0x8a, struct kvm_cpuid)
union { \
struct { MEMBERS } ATTRS; \
struct TAG { MEMBERS } ATTRS NAME; \
- }
+ } ATTRS
#ifdef __cplusplus
/* sizeof(struct{}) is 1 in C++, not 0, can't use C version of the macro. */
* set (which is the default), the 'stream' fields will be forced to 0 by the
* kernel.
*/
- #define V4L2_SUBDEV_CLIENT_CAP_STREAMS (1U << 0)
+ #define V4L2_SUBDEV_CLIENT_CAP_STREAMS (1ULL << 0)
/**
* struct v4l2_subdev_client_capability - Capabilities of the client accessing
__le32 queue_used_hi; /* read-write */
};
+/*
+ * Warning: do not use sizeof on this: use offsetofend for
+ * specific fields you need.
+ */
+struct virtio_pci_modern_common_cfg {
+ struct virtio_pci_common_cfg cfg;
+
+ __le16 queue_notify_data; /* read-write */
+ __le16 queue_reset; /* read-write */
+};
+
/* Fields in VIRTIO_PCI_CAP_PCI_CFG: */
struct virtio_pci_cfg_cap {
struct virtio_pci_cap cap;
/* Clear an irq's pending state, in preparation for polling on it */
void xen_clear_irq_pending(int irq);
-void xen_set_irq_pending(int irq);
bool xen_test_irq_pending(int irq);
/* Poll waiting for an irq to become pending. In the usual case, the
/* Determine the IRQ which is bound to an event channel */
unsigned int irq_from_evtchn(evtchn_port_t evtchn);
-int irq_from_virq(unsigned int cpu, unsigned int virq);
-evtchn_port_t evtchn_from_irq(unsigned irq);
+int irq_evtchn_from_virq(unsigned int cpu, unsigned int virq,
+ evtchn_port_t *evtchn);
int xen_set_callback_via(uint64_t via);
int xen_evtchn_do_upcall(void);
/* De-allocates the above mentioned physical interrupt. */
int xen_destroy_irq(int irq);
-/* Return irq from pirq */
-int xen_irq_from_pirq(unsigned pirq);
-
/* Return the pirq allocated to the irq. */
int xen_pirq_from_irq(unsigned irq);
};
ktime_t timeout = KTIME_MAX;
struct io_uring_sync_cancel_reg sc;
- struct fd f = { };
+ struct file *file = NULL;
DEFINE_WAIT(wait);
int ret, i;
/* we can grab a normal file descriptor upfront */
if ((cd.flags & IORING_ASYNC_CANCEL_FD) &&
!(cd.flags & IORING_ASYNC_CANCEL_FD_FIXED)) {
- f = fdget(sc.fd);
- if (!f.file)
+ file = fget(sc.fd);
+ if (!file)
return -EBADF;
- cd.file = f.file;
+ cd.file = file;
}
ret = __io_sync_cancel(current->io_uring, &cd, sc.fd);
if (ret == -ENOENT || ret > 0)
ret = 0;
out:
- fdput(f);
+ if (file)
+ fput(file);
return ret;
}
if (has_lock && (ctx->flags & IORING_SETUP_SQPOLL)) {
struct io_sq_data *sq = ctx->sq_data;
- if (mutex_trylock(&sq->lock)) {
- if (sq->thread) {
- sq_pid = task_pid_nr(sq->thread);
- sq_cpu = task_cpu(sq->thread);
- }
- mutex_unlock(&sq->lock);
- }
+ sq_pid = sq->task_pid;
+ sq_cpu = sq->sq_cpu;
}
seq_printf(m, "SqThread:\t%d\n", sq_pid);
newf = u64_to_user_ptr(READ_ONCE(sqe->addr2));
lnk->flags = READ_ONCE(sqe->hardlink_flags);
- lnk->oldpath = getname(oldf);
+ lnk->oldpath = getname_uflags(oldf, lnk->flags);
if (IS_ERR(lnk->oldpath))
return PTR_ERR(lnk->oldpath);
struct io_kiocb *req, *tmp;
struct io_tw_state ts = { .locked = true, };
+ percpu_ref_get(&ctx->refs);
mutex_lock(&ctx->uring_lock);
llist_for_each_entry_safe(req, tmp, node, io_task_work.node)
req->io_task_work.func(req, &ts);
return;
io_submit_flush_completions(ctx);
mutex_unlock(&ctx->uring_lock);
+ percpu_ref_put(&ctx->refs);
}
static int io_alloc_hash_table(struct io_hash_table *table, unsigned bits)
INIT_LIST_HEAD(&ctx->sqd_list);
INIT_LIST_HEAD(&ctx->cq_overflow_list);
INIT_LIST_HEAD(&ctx->io_buffers_cache);
+ INIT_HLIST_HEAD(&ctx->io_buf_list);
io_alloc_cache_init(&ctx->rsrc_node_cache, IO_NODE_ALLOC_CACHE_MAX,
sizeof(struct io_rsrc_node));
io_alloc_cache_init(&ctx->apoll_cache, IO_ALLOC_CACHE_MAX,
return READ_ONCE(rings->cq.head) == READ_ONCE(rings->cq.tail) ? ret : 0;
}
-static void io_mem_free(void *ptr)
+void io_mem_free(void *ptr)
{
if (!ptr)
return;
{
struct page **page_array;
unsigned int nr_pages;
+ void *page_addr;
int ret, i;
*npages = 0;
io_pages_free(&page_array, ret > 0 ? ret : 0);
return ret < 0 ? ERR_PTR(ret) : ERR_PTR(-EFAULT);
}
- /*
- * Should be a single page. If the ring is small enough that we can
- * use a normal page, that is fine. If we need multiple pages, then
- * userspace should use a huge page. That's the only way to guarantee
- * that we get contigious memory, outside of just being lucky or
- * (currently) having low memory fragmentation.
- */
- if (page_array[0] != page_array[ret - 1])
- goto err;
- /*
- * Can't support mapping user allocated ring memory on 32-bit archs
- * where it could potentially reside in highmem. Just fail those with
- * -EINVAL, just like we did on kernels that didn't support this
- * feature.
- */
+ page_addr = page_address(page_array[0]);
for (i = 0; i < nr_pages; i++) {
- if (PageHighMem(page_array[i])) {
- ret = -EINVAL;
+ ret = -EINVAL;
+
+ /*
+ * Can't support mapping user allocated ring memory on 32-bit
+ * archs where it could potentially reside in highmem. Just
+ * fail those with -EINVAL, just like we did on kernels that
+ * didn't support this feature.
+ */
+ if (PageHighMem(page_array[i]))
goto err;
- }
+
+ /*
+ * No support for discontig pages for now, should either be a
+ * single normal page, or a huge page. Later on we can add
+ * support for remapping discontig pages, for now we will
+ * just fail them with EINVAL.
+ */
+ if (page_address(page_array[i]) != page_addr)
+ goto err;
+ page_addr += PAGE_SIZE;
}
*pages = page_array;
}
}
-static void *io_mem_alloc(size_t size)
+void *io_mem_alloc(size_t size)
{
gfp_t gfp = GFP_KERNEL_ACCOUNT | __GFP_ZERO | __GFP_NOWARN | __GFP_COMP;
void *ret;
ctx->mm_account = NULL;
}
io_rings_free(ctx);
+ io_kbuf_mmap_list_free(ctx);
percpu_ref_exit(&ctx->refs);
free_uid(ctx->user);
init_completion(&exit.completion);
init_task_work(&exit.task_work, io_tctx_exit_cb);
exit.ctx = ctx;
- /*
- * Some may use context even when all refs and requests have been put,
- * and they are free to do so while still holding uring_lock or
- * completion_lock, see io_req_task_submit(). Apart from other work,
- * this lock/unlock section also waits them to finish.
- */
+
mutex_lock(&ctx->uring_lock);
while (!list_empty(&ctx->tctx_list)) {
WARN_ON_ONCE(time_after(jiffies, timeout));
struct page *page;
void *ptr;
- /* Don't allow mmap if the ring was setup without it */
- if (ctx->flags & IORING_SETUP_NO_MMAP)
- return ERR_PTR(-EINVAL);
-
switch (offset & IORING_OFF_MMAP_MASK) {
case IORING_OFF_SQ_RING:
case IORING_OFF_CQ_RING:
+ /* Don't allow mmap if the ring was setup without it */
+ if (ctx->flags & IORING_SETUP_NO_MMAP)
+ return ERR_PTR(-EINVAL);
ptr = ctx->rings;
break;
case IORING_OFF_SQES:
+ /* Don't allow mmap if the ring was setup without it */
+ if (ctx->flags & IORING_SETUP_NO_MMAP)
+ return ERR_PTR(-EINVAL);
ptr = ctx->sq_sqes;
break;
case IORING_OFF_PBUF_RING: {
unsigned int bgid;
bgid = (offset & ~IORING_OFF_MMAP_MASK) >> IORING_OFF_PBUF_SHIFT;
- mutex_lock(&ctx->uring_lock);
+ rcu_read_lock();
ptr = io_pbuf_get_address(ctx, bgid);
- mutex_unlock(&ctx->uring_lock);
+ rcu_read_unlock();
if (!ptr)
return ERR_PTR(-EINVAL);
break;
size_t, argsz)
{
struct io_ring_ctx *ctx;
- struct fd f;
+ struct file *file;
long ret;
if (unlikely(flags & ~(IORING_ENTER_GETEVENTS | IORING_ENTER_SQ_WAKEUP |
if (unlikely(!tctx || fd >= IO_RINGFD_REG_MAX))
return -EINVAL;
fd = array_index_nospec(fd, IO_RINGFD_REG_MAX);
- f.file = tctx->registered_rings[fd];
- f.flags = 0;
- if (unlikely(!f.file))
+ file = tctx->registered_rings[fd];
+ if (unlikely(!file))
return -EBADF;
} else {
- f = fdget(fd);
- if (unlikely(!f.file))
+ file = fget(fd);
+ if (unlikely(!file))
return -EBADF;
ret = -EOPNOTSUPP;
- if (unlikely(!io_is_uring_fops(f.file)))
+ if (unlikely(!io_is_uring_fops(file)))
goto out;
}
- ctx = f.file->private_data;
+ ctx = file->private_data;
ret = -EBADFD;
if (unlikely(ctx->flags & IORING_SETUP_R_DISABLED))
goto out;
}
}
out:
- fdput(f);
+ if (!(flags & IORING_ENTER_REGISTERED_RING))
+ fput(file);
return ret;
}
{
struct io_ring_ctx *ctx;
long ret = -EBADF;
- struct fd f;
+ struct file *file;
bool use_registered_ring;
use_registered_ring = !!(opcode & IORING_REGISTER_USE_REGISTERED_RING);
if (unlikely(!tctx || fd >= IO_RINGFD_REG_MAX))
return -EINVAL;
fd = array_index_nospec(fd, IO_RINGFD_REG_MAX);
- f.file = tctx->registered_rings[fd];
- f.flags = 0;
- if (unlikely(!f.file))
+ file = tctx->registered_rings[fd];
+ if (unlikely(!file))
return -EBADF;
} else {
- f = fdget(fd);
- if (unlikely(!f.file))
+ file = fget(fd);
+ if (unlikely(!file))
return -EBADF;
ret = -EOPNOTSUPP;
- if (!io_is_uring_fops(f.file))
+ if (!io_is_uring_fops(file))
goto out_fput;
}
- ctx = f.file->private_data;
+ ctx = file->private_data;
mutex_lock(&ctx->uring_lock);
ret = __io_uring_register(ctx, opcode, arg, nr_args);
mutex_unlock(&ctx->uring_lock);
trace_io_uring_register(ctx, opcode, ctx->nr_user_files, ctx->nr_user_bufs, ret);
out_fput:
- fdput(f);
+ if (!use_registered_ring)
+ fput(file);
return ret;
}
bool io_match_task_safe(struct io_kiocb *head, struct task_struct *task,
bool cancel_all);
+void *io_mem_alloc(size_t size);
+void io_mem_free(void *ptr);
+
#if defined(CONFIG_PROVE_LOCKING)
static inline void io_lockdep_assert_cq_locked(struct io_ring_ctx *ctx)
{
__u16 bid;
};
+struct io_buf_free {
+ struct hlist_node list;
+ void *mem;
+ size_t size;
+ int inuse;
+};
+
+static struct io_buffer_list *__io_buffer_get_list(struct io_ring_ctx *ctx,
+ struct io_buffer_list *bl,
+ unsigned int bgid)
+{
+ if (bl && bgid < BGID_ARRAY)
+ return &bl[bgid];
+
+ return xa_load(&ctx->io_bl_xa, bgid);
+}
+
static inline struct io_buffer_list *io_buffer_get_list(struct io_ring_ctx *ctx,
unsigned int bgid)
{
- if (ctx->io_bl && bgid < BGID_ARRAY)
- return &ctx->io_bl[bgid];
+ lockdep_assert_held(&ctx->uring_lock);
- return xa_load(&ctx->io_bl_xa, bgid);
+ return __io_buffer_get_list(ctx, ctx->io_bl, bgid);
}
static int io_buffer_add_list(struct io_ring_ctx *ctx,
struct io_buffer_list *bl, unsigned int bgid)
{
+ /*
+ * Store buffer group ID and finally mark the list as visible.
+ * The normal lookup doesn't care about the visibility as we're
+ * always under the ->uring_lock, but the RCU lookup from mmap does.
+ */
bl->bgid = bgid;
+ smp_store_release(&bl->is_ready, 1);
+
if (bgid < BGID_ARRAY)
return 0;
static __cold int io_init_bl_list(struct io_ring_ctx *ctx)
{
+ struct io_buffer_list *bl;
int i;
- ctx->io_bl = kcalloc(BGID_ARRAY, sizeof(struct io_buffer_list),
- GFP_KERNEL);
- if (!ctx->io_bl)
+ bl = kcalloc(BGID_ARRAY, sizeof(struct io_buffer_list), GFP_KERNEL);
+ if (!bl)
return -ENOMEM;
for (i = 0; i < BGID_ARRAY; i++) {
- INIT_LIST_HEAD(&ctx->io_bl[i].buf_list);
- ctx->io_bl[i].bgid = i;
+ INIT_LIST_HEAD(&bl[i].buf_list);
+ bl[i].bgid = i;
}
+ smp_store_release(&ctx->io_bl, bl);
return 0;
}
+/*
+ * Mark the given mapped range as free for reuse
+ */
+static void io_kbuf_mark_free(struct io_ring_ctx *ctx, struct io_buffer_list *bl)
+{
+ struct io_buf_free *ibf;
+
+ hlist_for_each_entry(ibf, &ctx->io_buf_list, list) {
+ if (bl->buf_ring == ibf->mem) {
+ ibf->inuse = 0;
+ return;
+ }
+ }
+
+ /* can't happen... */
+ WARN_ON_ONCE(1);
+}
+
static int __io_remove_buffers(struct io_ring_ctx *ctx,
struct io_buffer_list *bl, unsigned nbufs)
{
if (bl->is_mapped) {
i = bl->buf_ring->tail - bl->head;
if (bl->is_mmap) {
- folio_put(virt_to_folio(bl->buf_ring));
+ /*
+ * io_kbuf_list_free() will free the page(s) at
+ * ->release() time.
+ */
+ io_kbuf_mark_free(ctx, bl);
bl->buf_ring = NULL;
bl->is_mmap = 0;
} else if (bl->buf_nr_pages) {
xa_for_each(&ctx->io_bl_xa, index, bl) {
xa_erase(&ctx->io_bl_xa, bl->bgid);
__io_remove_buffers(ctx, bl, -1U);
- kfree(bl);
+ kfree_rcu(bl, rcu);
}
+ /*
+ * Move deferred locked entries to cache before pruning
+ */
+ spin_lock(&ctx->completion_lock);
+ if (!list_empty(&ctx->io_buffers_comp))
+ list_splice_init(&ctx->io_buffers_comp, &ctx->io_buffers_cache);
+ spin_unlock(&ctx->completion_lock);
+
list_for_each_safe(item, tmp, &ctx->io_buffers_cache) {
buf = list_entry(item, struct io_buffer, list);
kmem_cache_free(io_buf_cachep, buf);
INIT_LIST_HEAD(&bl->buf_list);
ret = io_buffer_add_list(ctx, bl, p->bgid);
if (ret) {
- kfree(bl);
+ /*
+ * Doesn't need rcu free as it was never visible, but
+ * let's keep it consistent throughout. Also can't
+ * be a lower indexed array group, as adding one
+ * where lookup failed cannot happen.
+ */
+ if (p->bgid >= BGID_ARRAY)
+ kfree_rcu(bl, rcu);
+ else
+ WARN_ON_ONCE(1);
goto err;
}
}
return -EINVAL;
}
-static int io_alloc_pbuf_ring(struct io_uring_buf_reg *reg,
+/*
+ * See if we have a suitable region that we can reuse, rather than allocate
+ * both a new io_buf_free and mem region again. We leave it on the list as
+ * even a reused entry will need freeing at ring release.
+ */
+static struct io_buf_free *io_lookup_buf_free_entry(struct io_ring_ctx *ctx,
+ size_t ring_size)
+{
+ struct io_buf_free *ibf, *best = NULL;
+ size_t best_dist;
+
+ hlist_for_each_entry(ibf, &ctx->io_buf_list, list) {
+ size_t dist;
+
+ if (ibf->inuse || ibf->size < ring_size)
+ continue;
+ dist = ibf->size - ring_size;
+ if (!best || dist < best_dist) {
+ best = ibf;
+ if (!dist)
+ break;
+ best_dist = dist;
+ }
+ }
+
+ return best;
+}
+
+static int io_alloc_pbuf_ring(struct io_ring_ctx *ctx,
+ struct io_uring_buf_reg *reg,
struct io_buffer_list *bl)
{
- gfp_t gfp = GFP_KERNEL_ACCOUNT | __GFP_ZERO | __GFP_NOWARN | __GFP_COMP;
+ struct io_buf_free *ibf;
size_t ring_size;
void *ptr;
ring_size = reg->ring_entries * sizeof(struct io_uring_buf_ring);
- ptr = (void *) __get_free_pages(gfp, get_order(ring_size));
- if (!ptr)
- return -ENOMEM;
- bl->buf_ring = ptr;
+ /* Reuse existing entry, if we can */
+ ibf = io_lookup_buf_free_entry(ctx, ring_size);
+ if (!ibf) {
+ ptr = io_mem_alloc(ring_size);
+ if (IS_ERR(ptr))
+ return PTR_ERR(ptr);
+
+ /* Allocate and store deferred free entry */
+ ibf = kmalloc(sizeof(*ibf), GFP_KERNEL_ACCOUNT);
+ if (!ibf) {
+ io_mem_free(ptr);
+ return -ENOMEM;
+ }
+ ibf->mem = ptr;
+ ibf->size = ring_size;
+ hlist_add_head(&ibf->list, &ctx->io_buf_list);
+ }
+ ibf->inuse = 1;
+ bl->buf_ring = ibf->mem;
bl->is_mapped = 1;
bl->is_mmap = 1;
return 0;
struct io_buffer_list *bl, *free_bl = NULL;
int ret;
+ lockdep_assert_held(&ctx->uring_lock);
+
if (copy_from_user(®, arg, sizeof(reg)))
return -EFAULT;
if (!(reg.flags & IOU_PBUF_RING_MMAP))
ret = io_pin_pbuf_ring(®, bl);
else
- ret = io_alloc_pbuf_ring(®, bl);
+ ret = io_alloc_pbuf_ring(ctx, ®, bl);
if (!ret) {
bl->nr_entries = reg.ring_entries;
return 0;
}
- kfree(free_bl);
+ kfree_rcu(free_bl, rcu);
return ret;
}
struct io_uring_buf_reg reg;
struct io_buffer_list *bl;
+ lockdep_assert_held(&ctx->uring_lock);
+
if (copy_from_user(®, arg, sizeof(reg)))
return -EFAULT;
if (reg.resv[0] || reg.resv[1] || reg.resv[2])
__io_remove_buffers(ctx, bl, -1U);
if (bl->bgid >= BGID_ARRAY) {
xa_erase(&ctx->io_bl_xa, bl->bgid);
- kfree(bl);
+ kfree_rcu(bl, rcu);
}
return 0;
}
{
struct io_buffer_list *bl;
- bl = io_buffer_get_list(ctx, bgid);
+ bl = __io_buffer_get_list(ctx, smp_load_acquire(&ctx->io_bl), bgid);
+
if (!bl || !bl->is_mmap)
return NULL;
+ /*
+ * Ensure the list is fully setup. Only strictly needed for RCU lookup
+ * via mmap, and in that case only for the array indexed groups. For
+ * the xarray lookups, it's either visible and ready, or not at all.
+ */
+ if (!smp_load_acquire(&bl->is_ready))
+ return NULL;
return bl->buf_ring;
}
+
+/*
+ * Called at or after ->release(), free the mmap'ed buffers that we used
+ * for memory mapped provided buffer rings.
+ */
+void io_kbuf_mmap_list_free(struct io_ring_ctx *ctx)
+{
+ struct io_buf_free *ibf;
+ struct hlist_node *tmp;
+
+ hlist_for_each_entry_safe(ibf, tmp, &ctx->io_buf_list, list) {
+ hlist_del(&ibf->list);
+ io_mem_free(ibf->mem);
+ kfree(ibf);
+ }
+}
struct page **buf_pages;
struct io_uring_buf_ring *buf_ring;
};
+ struct rcu_head rcu;
};
__u16 bgid;
__u8 is_mapped;
/* ring mapped provided buffers, but mmap'ed by application */
__u8 is_mmap;
+ /* bl is visible from an RCU point of view for lookup */
+ __u8 is_ready;
};
struct io_buffer {
int io_register_pbuf_ring(struct io_ring_ctx *ctx, void __user *arg);
int io_unregister_pbuf_ring(struct io_ring_ctx *ctx, void __user *arg);
+void io_kbuf_mmap_list_free(struct io_ring_ctx *ctx);
+
unsigned int __io_put_kbuf(struct io_kiocb *req, unsigned issue_flags);
bool io_kbuf_recycle_legacy(struct io_kiocb *req, unsigned issue_flags);
static void __io_poll_execute(struct io_kiocb *req, int mask)
{
+ unsigned flags = 0;
+
io_req_set_res(req, mask, 0);
req->io_task_work.func = io_poll_task_func;
trace_io_uring_task_add(req, mask);
- __io_req_task_work_add(req, IOU_F_TWQ_LAZY_WAKE);
+
+ if (!(req->flags & REQ_F_POLL_NO_LAZY))
+ flags = IOU_F_TWQ_LAZY_WAKE;
+ __io_req_task_work_add(req, flags);
}
static inline void io_poll_execute(struct io_kiocb *req, int res)
poll->head = head;
poll->wait.private = (void *) wqe_private;
- if (poll->events & EPOLLEXCLUSIVE)
+ if (poll->events & EPOLLEXCLUSIVE) {
+ /*
+ * Exclusive waits may only wake a limited amount of entries
+ * rather than all of them, this may interfere with lazy
+ * wake if someone does wait(events > 1). Ensure we don't do
+ * lazy wake for those, as we need to process each one as they
+ * come in.
+ */
+ req->flags |= REQ_F_POLL_NO_LAZY;
add_wait_queue_exclusive(head, &poll->wait);
- else
+ } else {
add_wait_queue(head, &poll->wait);
+ }
}
static void io_poll_queue_proc(struct file *file, struct wait_queue_head *head,
*/
const struct bio_vec *bvec = imu->bvec;
- if (offset <= bvec->bv_len) {
+ if (offset < bvec->bv_len) {
/*
* Note, huge pages buffers consists of one large
* bvec entry and should always go this way. The other
int __io_scm_file_account(struct io_ring_ctx *ctx, struct file *file);
-#if defined(CONFIG_UNIX)
-static inline bool io_file_need_scm(struct file *filp)
-{
- return !!unix_get_socket(filp);
-}
-#else
static inline bool io_file_need_scm(struct file *filp)
{
return false;
}
-#endif
static inline int io_scm_file_account(struct io_ring_ctx *ctx,
struct file *file)
did_sig = get_signal(&ksig);
cond_resched();
mutex_lock(&sqd->lock);
+ sqd->sq_cpu = raw_smp_processor_id();
}
return did_sig || test_bit(IO_SQ_THREAD_SHOULD_STOP, &sqd->state);
}
snprintf(buf, sizeof(buf), "iou-sqp-%d", sqd->task_pid);
set_task_comm(current, buf);
- if (sqd->sq_cpu != -1)
+ /* reset to our pid after we've set task_comm, for fdinfo */
+ sqd->task_pid = current->pid;
+
+ if (sqd->sq_cpu != -1) {
set_cpus_allowed_ptr(current, cpumask_of(sqd->sq_cpu));
- else
+ } else {
set_cpus_allowed_ptr(current, cpu_online_mask);
+ sqd->sq_cpu = raw_smp_processor_id();
+ }
mutex_lock(&sqd->lock);
while (1) {
mutex_unlock(&sqd->lock);
cond_resched();
mutex_lock(&sqd->lock);
+ sqd->sq_cpu = raw_smp_processor_id();
}
continue;
}
mutex_unlock(&sqd->lock);
schedule();
mutex_lock(&sqd->lock);
+ sqd->sq_cpu = raw_smp_processor_id();
}
list_for_each_entry(ctx, &sqd->ctx_list, sqd_list)
atomic_andnot(IORING_SQ_NEED_WAKEUP,
#include <linux/nospec.h>
#include <uapi/linux/io_uring.h>
-#include <uapi/asm-generic/ioctls.h>
+#include <asm/ioctls.h>
#include "io_uring.h"
#include "rsrc.h"
config CRASH_DUMP
bool "kernel crash dumps"
depends on ARCH_SUPPORTS_CRASH_DUMP
- depends on ARCH_SUPPORTS_KEXEC
select CRASH_CORE
select KEXEC_CORE
- select KEXEC
help
Generate crash dump after being started by kexec.
This should be normally only set in special crash dump kernels
if (tsk != current)
return 0;
- if (WARN_ON_ONCE(!current->mm))
+ if (!current->mm)
return 0;
exe_file = get_mm_exe_file(current->mm);
if (!exe_file)
mutex_unlock(&aux->poke_mutex);
}
+void __weak bpf_arch_poke_desc_update(struct bpf_jit_poke_descriptor *poke,
+ struct bpf_prog *new, struct bpf_prog *old)
+{
+ WARN_ON_ONCE(1);
+}
+
static void prog_array_map_poke_run(struct bpf_map *map, u32 key,
struct bpf_prog *old,
struct bpf_prog *new)
{
- u8 *old_addr, *new_addr, *old_bypass_addr;
struct prog_poke_elem *elem;
struct bpf_array_aux *aux;
list_for_each_entry(elem, &aux->poke_progs, list) {
struct bpf_jit_poke_descriptor *poke;
- int i, ret;
+ int i;
for (i = 0; i < elem->aux->size_poke_tab; i++) {
poke = &elem->aux->poke_tab[i];
* activated, so tail call updates can arrive from here
* while JIT is still finishing its final fixup for
* non-activated poke entries.
- * 3) On program teardown, the program's kallsym entry gets
- * removed out of RCU callback, but we can only untrack
- * from sleepable context, therefore bpf_arch_text_poke()
- * might not see that this is in BPF text section and
- * bails out with -EINVAL. As these are unreachable since
- * RCU grace period already passed, we simply skip them.
- * 4) Also programs reaching refcount of zero while patching
+ * 3) Also programs reaching refcount of zero while patching
* is in progress is okay since we're protected under
* poke_mutex and untrack the programs before the JIT
- * buffer is freed. When we're still in the middle of
- * patching and suddenly kallsyms entry of the program
- * gets evicted, we just skip the rest which is fine due
- * to point 3).
- * 5) Any other error happening below from bpf_arch_text_poke()
- * is a unexpected bug.
+ * buffer is freed.
*/
if (!READ_ONCE(poke->tailcall_target_stable))
continue;
poke->tail_call.key != key)
continue;
- old_bypass_addr = old ? NULL : poke->bypass_addr;
- old_addr = old ? (u8 *)old->bpf_func + poke->adj_off : NULL;
- new_addr = new ? (u8 *)new->bpf_func + poke->adj_off : NULL;
-
- if (new) {
- ret = bpf_arch_text_poke(poke->tailcall_target,
- BPF_MOD_JUMP,
- old_addr, new_addr);
- BUG_ON(ret < 0 && ret != -EINVAL);
- if (!old) {
- ret = bpf_arch_text_poke(poke->tailcall_bypass,
- BPF_MOD_JUMP,
- poke->bypass_addr,
- NULL);
- BUG_ON(ret < 0 && ret != -EINVAL);
- }
- } else {
- ret = bpf_arch_text_poke(poke->tailcall_bypass,
- BPF_MOD_JUMP,
- old_bypass_addr,
- poke->bypass_addr);
- BUG_ON(ret < 0 && ret != -EINVAL);
- /* let other CPUs finish the execution of program
- * so that it will not possible to expose them
- * to invalid nop, stack unwind, nop state
- */
- if (!ret)
- synchronize_rcu();
- ret = bpf_arch_text_poke(poke->tailcall_target,
- BPF_MOD_JUMP,
- old_addr, NULL);
- BUG_ON(ret < 0 && ret != -EINVAL);
- }
+ bpf_arch_poke_desc_update(poke, new, old);
}
}
}
#define OFF insn->off
#define IMM insn->imm
-struct bpf_mem_alloc bpf_global_ma, bpf_global_percpu_ma;
-bool bpf_global_ma_set, bpf_global_percpu_ma_set;
+struct bpf_mem_alloc bpf_global_ma;
+bool bpf_global_ma_set;
/* No hurry in this branch
*
static int bpf_adj_delta_to_off(struct bpf_insn *insn, u32 pos, s32 end_old,
s32 end_new, s32 curr, const bool probe_pass)
{
- const s32 off_min = S16_MIN, off_max = S16_MAX;
+ s64 off_min, off_max, off;
s32 delta = end_new - end_old;
- s32 off;
- if (insn->code == (BPF_JMP32 | BPF_JA))
+ if (insn->code == (BPF_JMP32 | BPF_JA)) {
off = insn->imm;
- else
+ off_min = S32_MIN;
+ off_max = S32_MAX;
+ } else {
off = insn->off;
+ off_min = S16_MIN;
+ off_max = S16_MAX;
+ }
if (curr < pos && curr + off + 1 >= end_old)
off += delta;
ret = bpf_mem_alloc_init(&bpf_global_ma, 0, false);
bpf_global_ma_set = !ret;
- ret = bpf_mem_alloc_init(&bpf_global_percpu_ma, 0, true);
- bpf_global_percpu_ma_set = !ret;
- return !bpf_global_ma_set || !bpf_global_percpu_ma_set;
+ return ret;
}
late_initcall(bpf_global_ma_init);
#endif
memcg = get_memcg(c);
old_memcg = set_active_memcg(memcg);
ret = __alloc(c, NUMA_NO_NODE, GFP_KERNEL | __GFP_NOWARN | __GFP_ACCOUNT);
+ if (ret)
+ *(struct bpf_mem_cache **)ret = c;
set_active_memcg(old_memcg);
mem_cgroup_put(memcg);
}
#include <linux/poison.h>
#include <linux/module.h>
#include <linux/cpumask.h>
+#include <linux/bpf_mem_alloc.h>
#include <net/xdp.h>
#include "disasm.h"
#undef BPF_LINK_TYPE
};
+struct bpf_mem_alloc bpf_global_percpu_ma;
+static bool bpf_global_percpu_ma_set;
+
/* bpf_check() is a static code analyzer that walks eBPF program
* instruction by instruction and updates register/stack state.
* All paths of conditional branches are analyzed until 'bpf_exit' insn.
struct btf *btf_vmlinux;
static DEFINE_MUTEX(bpf_verifier_lock);
+static DEFINE_MUTEX(bpf_percpu_ma_lock);
static const struct bpf_line_info *
find_linfo(const struct bpf_verifier_env *env, u32 insn_off)
return func_id == BPF_FUNC_dynptr_data;
}
-static bool is_callback_calling_kfunc(u32 btf_id);
+static bool is_sync_callback_calling_kfunc(u32 btf_id);
static bool is_bpf_throw_kfunc(struct bpf_insn *insn);
-static bool is_callback_calling_function(enum bpf_func_id func_id)
+static bool is_sync_callback_calling_function(enum bpf_func_id func_id)
{
return func_id == BPF_FUNC_for_each_map_elem ||
- func_id == BPF_FUNC_timer_set_callback ||
func_id == BPF_FUNC_find_vma ||
func_id == BPF_FUNC_loop ||
func_id == BPF_FUNC_user_ringbuf_drain;
return func_id == BPF_FUNC_timer_set_callback;
}
+static bool is_callback_calling_function(enum bpf_func_id func_id)
+{
+ return is_sync_callback_calling_function(func_id) ||
+ is_async_callback_calling_function(func_id);
+}
+
+static bool is_sync_callback_calling_insn(struct bpf_insn *insn)
+{
+ return (bpf_helper_call(insn) && is_sync_callback_calling_function(insn->imm)) ||
+ (bpf_pseudo_kfunc_call(insn) && is_sync_callback_calling_kfunc(insn->imm));
+}
+
static bool is_storage_get_function(enum bpf_func_id func_id)
{
return func_id == BPF_FUNC_sk_storage_get ||
dst_state->first_insn_idx = src->first_insn_idx;
dst_state->last_insn_idx = src->last_insn_idx;
dst_state->dfs_depth = src->dfs_depth;
+ dst_state->callback_unroll_depth = src->callback_unroll_depth;
dst_state->used_as_loop_entry = src->used_as_loop_entry;
for (i = 0; i <= src->curframe; i++) {
dst = dst_state->frame[i];
reg->subreg_def = DEF_NOT_SUBREG;
}
-static int check_reg_arg(struct bpf_verifier_env *env, u32 regno,
- enum reg_arg_type t)
+static int __check_reg_arg(struct bpf_verifier_env *env, struct bpf_reg_state *regs, u32 regno,
+ enum reg_arg_type t)
{
- struct bpf_verifier_state *vstate = env->cur_state;
- struct bpf_func_state *state = vstate->frame[vstate->curframe];
struct bpf_insn *insn = env->prog->insnsi + env->insn_idx;
- struct bpf_reg_state *reg, *regs = state->regs;
+ struct bpf_reg_state *reg;
bool rw64;
if (regno >= MAX_BPF_REG) {
return 0;
}
+static int check_reg_arg(struct bpf_verifier_env *env, u32 regno,
+ enum reg_arg_type t)
+{
+ struct bpf_verifier_state *vstate = env->cur_state;
+ struct bpf_func_state *state = vstate->frame[vstate->curframe];
+
+ return __check_reg_arg(env, state->regs, regno, t);
+}
+
static void mark_jmp_point(struct bpf_verifier_env *env, int idx)
{
env->insn_aux_data[idx].jmp_point = true;
/* Backtrack one insn at a time. If idx is not at the top of recorded
* history then previous instruction came from straight line execution.
+ * Return -ENOENT if we exhausted all instructions within given state.
+ *
+ * It's legal to have a bit of a looping with the same starting and ending
+ * insn index within the same state, e.g.: 3->4->5->3, so just because current
+ * instruction index is the same as state's first_idx doesn't mean we are
+ * done. If there is still some jump history left, we should keep going. We
+ * need to take into account that we might have a jump history between given
+ * state's parent and itself, due to checkpointing. In this case, we'll have
+ * history entry recording a jump from last instruction of parent state and
+ * first instruction of given state.
*/
static int get_prev_insn_idx(struct bpf_verifier_state *st, int i,
u32 *history)
{
u32 cnt = *history;
+ if (i == st->first_insn_idx) {
+ if (cnt == 0)
+ return -ENOENT;
+ if (cnt == 1 && st->jmp_history[0].idx == i)
+ return -ENOENT;
+ }
+
if (cnt && st->jmp_history[cnt - 1].idx == i) {
i = st->jmp_history[cnt - 1].prev_idx;
(*history)--;
}
}
+static bool calls_callback(struct bpf_verifier_env *env, int insn_idx);
+
/* For given verifier state backtrack_insn() is called from the last insn to
* the first insn. Its purpose is to compute a bitmask of registers and
* stack slots that needs precision in the parent verifier state.
return -EFAULT;
return 0;
}
- } else if ((bpf_helper_call(insn) &&
- is_callback_calling_function(insn->imm) &&
- !is_async_callback_calling_function(insn->imm)) ||
- (bpf_pseudo_kfunc_call(insn) && is_callback_calling_kfunc(insn->imm))) {
- /* callback-calling helper or kfunc call, which means
- * we are exiting from subprog, but unlike the subprog
- * call handling above, we shouldn't propagate
- * precision of r1-r5 (if any requested), as they are
- * not actually arguments passed directly to callback
- * subprogs
+ } else if (is_sync_callback_calling_insn(insn) && idx != subseq_idx - 1) {
+ /* exit from callback subprog to callback-calling helper or
+ * kfunc call. Use idx/subseq_idx check to discern it from
+ * straight line code backtracking.
+ * Unlike the subprog call handling above, we shouldn't
+ * propagate precision of r1-r5 (if any requested), as they are
+ * not actually arguments passed directly to callback subprogs
*/
if (bt_reg_mask(bt) & ~BPF_REGMASK_ARGS) {
verbose(env, "BUG regs %x\n", bt_reg_mask(bt));
} else if (opcode == BPF_EXIT) {
bool r0_precise;
+ /* Backtracking to a nested function call, 'idx' is a part of
+ * the inner frame 'subseq_idx' is a part of the outer frame.
+ * In case of a regular function call, instructions giving
+ * precision to registers R1-R5 should have been found already.
+ * In case of a callback, it is ok to have R1-R5 marked for
+ * backtracking, as these registers are set by the function
+ * invoking callback.
+ */
+ if (subseq_idx >= 0 && calls_callback(env, subseq_idx))
+ for (i = BPF_REG_1; i <= BPF_REG_5; i++)
+ bt_clear_reg(bt, i);
if (bt_reg_mask(bt) & BPF_REGMASK_ARGS) {
- /* if backtracing was looking for registers R1-R5
- * they should have been found already.
- */
verbose(env, "BUG regs %x\n", bt_reg_mask(bt));
WARN_ONCE(1, "verifier backtracking bug");
return -EFAULT;
* Nothing to be tracked further in the parent state.
*/
return 0;
- if (i == first_idx)
- break;
subseq_idx = i;
i = get_prev_insn_idx(st, i, &history);
+ if (i == -ENOENT)
+ break;
if (i >= env->prog->len) {
/* This can happen if backtracking reached insn 0
* and there are still reg_mask or stack_mask
/* after the call registers r0 - r5 were scratched */
for (i = 0; i < CALLER_SAVED_REGS; i++) {
mark_reg_not_init(env, regs, caller_saved[i]);
- check_reg_arg(env, caller_saved[i], DST_OP_NO_MARK);
+ __check_reg_arg(env, regs, caller_saved[i], DST_OP_NO_MARK);
}
}
struct bpf_func_state *caller,
struct bpf_func_state *callee, int insn_idx);
-static int __check_func_call(struct bpf_verifier_env *env, struct bpf_insn *insn,
- int *insn_idx, int subprog,
- set_callee_state_fn set_callee_state_cb)
+static int setup_func_entry(struct bpf_verifier_env *env, int subprog, int callsite,
+ set_callee_state_fn set_callee_state_cb,
+ struct bpf_verifier_state *state)
{
- struct bpf_verifier_state *state = env->cur_state;
struct bpf_func_state *caller, *callee;
int err;
return -E2BIG;
}
- caller = state->frame[state->curframe];
if (state->frame[state->curframe + 1]) {
verbose(env, "verifier bug. Frame %d already allocated\n",
state->curframe + 1);
return -EFAULT;
}
+ caller = state->frame[state->curframe];
+ callee = kzalloc(sizeof(*callee), GFP_KERNEL);
+ if (!callee)
+ return -ENOMEM;
+ state->frame[state->curframe + 1] = callee;
+
+ /* callee cannot access r0, r6 - r9 for reading and has to write
+ * into its own stack before reading from it.
+ * callee can read/write into caller's stack
+ */
+ init_func_state(env, callee,
+ /* remember the callsite, it will be used by bpf_exit */
+ callsite,
+ state->curframe + 1 /* frameno within this callchain */,
+ subprog /* subprog number within this prog */);
+ /* Transfer references to the callee */
+ err = copy_reference_state(callee, caller);
+ err = err ?: set_callee_state_cb(env, caller, callee, callsite);
+ if (err)
+ goto err_out;
+
+ /* only increment it after check_reg_arg() finished */
+ state->curframe++;
+
+ return 0;
+
+err_out:
+ free_func_state(callee);
+ state->frame[state->curframe + 1] = NULL;
+ return err;
+}
+
+static int push_callback_call(struct bpf_verifier_env *env, struct bpf_insn *insn,
+ int insn_idx, int subprog,
+ set_callee_state_fn set_callee_state_cb)
+{
+ struct bpf_verifier_state *state = env->cur_state, *callback_state;
+ struct bpf_func_state *caller, *callee;
+ int err;
+
+ caller = state->frame[state->curframe];
err = btf_check_subprog_call(env, subprog, caller->regs);
if (err == -EFAULT)
return err;
- if (subprog_is_global(env, subprog)) {
- if (err) {
- verbose(env, "Caller passes invalid args into func#%d\n",
- subprog);
- return err;
- } else {
- if (env->log.level & BPF_LOG_LEVEL)
- verbose(env,
- "Func#%d is global and valid. Skipping.\n",
- subprog);
- clear_caller_saved_regs(env, caller->regs);
-
- /* All global functions return a 64-bit SCALAR_VALUE */
- mark_reg_unknown(env, caller->regs, BPF_REG_0);
- caller->regs[BPF_REG_0].subreg_def = DEF_NOT_SUBREG;
-
- /* continue with next insn after call */
- return 0;
- }
- }
/* set_callee_state is used for direct subprog calls, but we are
* interested in validating only BPF helpers that can call subprogs as
* callbacks
*/
- if (set_callee_state_cb != set_callee_state) {
- env->subprog_info[subprog].is_cb = true;
- if (bpf_pseudo_kfunc_call(insn) &&
- !is_callback_calling_kfunc(insn->imm)) {
- verbose(env, "verifier bug: kfunc %s#%d not marked as callback-calling\n",
- func_id_name(insn->imm), insn->imm);
- return -EFAULT;
- } else if (!bpf_pseudo_kfunc_call(insn) &&
- !is_callback_calling_function(insn->imm)) { /* helper */
- verbose(env, "verifier bug: helper %s#%d not marked as callback-calling\n",
- func_id_name(insn->imm), insn->imm);
- return -EFAULT;
- }
+ env->subprog_info[subprog].is_cb = true;
+ if (bpf_pseudo_kfunc_call(insn) &&
+ !is_sync_callback_calling_kfunc(insn->imm)) {
+ verbose(env, "verifier bug: kfunc %s#%d not marked as callback-calling\n",
+ func_id_name(insn->imm), insn->imm);
+ return -EFAULT;
+ } else if (!bpf_pseudo_kfunc_call(insn) &&
+ !is_callback_calling_function(insn->imm)) { /* helper */
+ verbose(env, "verifier bug: helper %s#%d not marked as callback-calling\n",
+ func_id_name(insn->imm), insn->imm);
+ return -EFAULT;
}
if (insn->code == (BPF_JMP | BPF_CALL) &&
/* there is no real recursion here. timer callbacks are async */
env->subprog_info[subprog].is_async_cb = true;
async_cb = push_async_cb(env, env->subprog_info[subprog].start,
- *insn_idx, subprog);
+ insn_idx, subprog);
if (!async_cb)
return -EFAULT;
callee = async_cb->frame[0];
callee->async_entry_cnt = caller->async_entry_cnt + 1;
/* Convert bpf_timer_set_callback() args into timer callback args */
- err = set_callee_state_cb(env, caller, callee, *insn_idx);
+ err = set_callee_state_cb(env, caller, callee, insn_idx);
if (err)
return err;
+ return 0;
+ }
+
+ /* for callback functions enqueue entry to callback and
+ * proceed with next instruction within current frame.
+ */
+ callback_state = push_stack(env, env->subprog_info[subprog].start, insn_idx, false);
+ if (!callback_state)
+ return -ENOMEM;
+
+ err = setup_func_entry(env, subprog, insn_idx, set_callee_state_cb,
+ callback_state);
+ if (err)
+ return err;
+
+ callback_state->callback_unroll_depth++;
+ callback_state->frame[callback_state->curframe - 1]->callback_depth++;
+ caller->callback_depth = 0;
+ return 0;
+}
+
+static int check_func_call(struct bpf_verifier_env *env, struct bpf_insn *insn,
+ int *insn_idx)
+{
+ struct bpf_verifier_state *state = env->cur_state;
+ struct bpf_func_state *caller;
+ int err, subprog, target_insn;
+
+ target_insn = *insn_idx + insn->imm + 1;
+ subprog = find_subprog(env, target_insn);
+ if (subprog < 0) {
+ verbose(env, "verifier bug. No program starts at insn %d\n", target_insn);
+ return -EFAULT;
+ }
+
+ caller = state->frame[state->curframe];
+ err = btf_check_subprog_call(env, subprog, caller->regs);
+ if (err == -EFAULT)
+ return err;
+ if (subprog_is_global(env, subprog)) {
+ if (err) {
+ verbose(env, "Caller passes invalid args into func#%d\n", subprog);
+ return err;
+ }
+
+ if (env->log.level & BPF_LOG_LEVEL)
+ verbose(env, "Func#%d is global and valid. Skipping.\n", subprog);
clear_caller_saved_regs(env, caller->regs);
+
+ /* All global functions return a 64-bit SCALAR_VALUE */
mark_reg_unknown(env, caller->regs, BPF_REG_0);
caller->regs[BPF_REG_0].subreg_def = DEF_NOT_SUBREG;
+
/* continue with next insn after call */
return 0;
}
- callee = kzalloc(sizeof(*callee), GFP_KERNEL);
- if (!callee)
- return -ENOMEM;
- state->frame[state->curframe + 1] = callee;
-
- /* callee cannot access r0, r6 - r9 for reading and has to write
- * into its own stack before reading from it.
- * callee can read/write into caller's stack
+ /* for regular function entry setup new frame and continue
+ * from that frame.
*/
- init_func_state(env, callee,
- /* remember the callsite, it will be used by bpf_exit */
- *insn_idx /* callsite */,
- state->curframe + 1 /* frameno within this callchain */,
- subprog /* subprog number within this prog */);
-
- /* Transfer references to the callee */
- err = copy_reference_state(callee, caller);
- if (err)
- goto err_out;
-
- err = set_callee_state_cb(env, caller, callee, *insn_idx);
+ err = setup_func_entry(env, subprog, *insn_idx, set_callee_state, state);
if (err)
- goto err_out;
+ return err;
clear_caller_saved_regs(env, caller->regs);
- /* only increment it after check_reg_arg() finished */
- state->curframe++;
-
/* and go analyze first insn of the callee */
*insn_idx = env->subprog_info[subprog].start - 1;
verbose(env, "caller:\n");
print_verifier_state(env, caller, true);
verbose(env, "callee:\n");
- print_verifier_state(env, callee, true);
+ print_verifier_state(env, state->frame[state->curframe], true);
}
- return 0;
-err_out:
- free_func_state(callee);
- state->frame[state->curframe + 1] = NULL;
- return err;
+ return 0;
}
int map_set_for_each_callback_args(struct bpf_verifier_env *env,
return 0;
}
-static int check_func_call(struct bpf_verifier_env *env, struct bpf_insn *insn,
- int *insn_idx)
-{
- int subprog, target_insn;
-
- target_insn = *insn_idx + insn->imm + 1;
- subprog = find_subprog(env, target_insn);
- if (subprog < 0) {
- verbose(env, "verifier bug. No program starts at insn %d\n",
- target_insn);
- return -EFAULT;
- }
-
- return __check_func_call(env, insn, insn_idx, subprog, set_callee_state);
-}
-
static int set_map_elem_callback_state(struct bpf_verifier_env *env,
struct bpf_func_state *caller,
struct bpf_func_state *callee,
static int prepare_func_exit(struct bpf_verifier_env *env, int *insn_idx)
{
- struct bpf_verifier_state *state = env->cur_state;
+ struct bpf_verifier_state *state = env->cur_state, *prev_st;
struct bpf_func_state *caller, *callee;
struct bpf_reg_state *r0;
+ bool in_callback_fn;
int err;
callee = state->frame[state->curframe];
verbose_invalid_scalar(env, r0, &range, "callback return", "R0");
return -EINVAL;
}
+ if (!calls_callback(env, callee->callsite)) {
+ verbose(env, "BUG: in callback at %d, callsite %d !calls_callback\n",
+ *insn_idx, callee->callsite);
+ return -EFAULT;
+ }
} else {
/* return to the caller whatever r0 had in the callee */
caller->regs[BPF_REG_0] = *r0;
return err;
}
- *insn_idx = callee->callsite + 1;
+ /* for callbacks like bpf_loop or bpf_for_each_map_elem go back to callsite,
+ * there function call logic would reschedule callback visit. If iteration
+ * converges is_state_visited() would prune that visit eventually.
+ */
+ in_callback_fn = callee->in_callback_fn;
+ if (in_callback_fn)
+ *insn_idx = callee->callsite;
+ else
+ *insn_idx = callee->callsite + 1;
+
if (env->log.level & BPF_LOG_LEVEL) {
verbose(env, "returning from callee:\n");
print_verifier_state(env, callee, true);
* bpf_throw, this will be done by copy_verifier_state for extra frames. */
free_func_state(callee);
state->frame[state->curframe--] = NULL;
+
+ /* for callbacks widen imprecise scalars to make programs like below verify:
+ *
+ * struct ctx { int i; }
+ * void cb(int idx, struct ctx *ctx) { ctx->i++; ... }
+ * ...
+ * struct ctx = { .i = 0; }
+ * bpf_loop(100, cb, &ctx, 0);
+ *
+ * This is similar to what is done in process_iter_next_call() for open
+ * coded iterators.
+ */
+ prev_st = in_callback_fn ? find_prev_entry(env, state, *insn_idx) : NULL;
+ if (prev_st) {
+ err = widen_imprecise_scalars(env, prev_st, state);
+ if (err)
+ return err;
+ }
return 0;
}
}
break;
case BPF_FUNC_for_each_map_elem:
- err = __check_func_call(env, insn, insn_idx_p, meta.subprogno,
- set_map_elem_callback_state);
+ err = push_callback_call(env, insn, insn_idx, meta.subprogno,
+ set_map_elem_callback_state);
break;
case BPF_FUNC_timer_set_callback:
- err = __check_func_call(env, insn, insn_idx_p, meta.subprogno,
- set_timer_callback_state);
+ err = push_callback_call(env, insn, insn_idx, meta.subprogno,
+ set_timer_callback_state);
break;
case BPF_FUNC_find_vma:
- err = __check_func_call(env, insn, insn_idx_p, meta.subprogno,
- set_find_vma_callback_state);
+ err = push_callback_call(env, insn, insn_idx, meta.subprogno,
+ set_find_vma_callback_state);
break;
case BPF_FUNC_snprintf:
err = check_bpf_snprintf_call(env, regs);
break;
case BPF_FUNC_loop:
update_loop_inline_state(env, meta.subprogno);
- err = __check_func_call(env, insn, insn_idx_p, meta.subprogno,
- set_loop_callback_state);
+ /* Verifier relies on R1 value to determine if bpf_loop() iteration
+ * is finished, thus mark it precise.
+ */
+ err = mark_chain_precision(env, BPF_REG_1);
+ if (err)
+ return err;
+ if (cur_func(env)->callback_depth < regs[BPF_REG_1].umax_value) {
+ err = push_callback_call(env, insn, insn_idx, meta.subprogno,
+ set_loop_callback_state);
+ } else {
+ cur_func(env)->callback_depth = 0;
+ if (env->log.level & BPF_LOG_LEVEL2)
+ verbose(env, "frame%d bpf_loop iteration limit reached\n",
+ env->cur_state->curframe);
+ }
break;
case BPF_FUNC_dynptr_from_mem:
if (regs[BPF_REG_1].type != PTR_TO_MAP_VALUE) {
break;
}
case BPF_FUNC_user_ringbuf_drain:
- err = __check_func_call(env, insn, insn_idx_p, meta.subprogno,
- set_user_ringbuf_callback_state);
+ err = push_callback_call(env, insn, insn_idx, meta.subprogno,
+ set_user_ringbuf_callback_state);
break;
}
btf_id == special_kfunc_list[KF_bpf_refcount_acquire_impl];
}
-static bool is_callback_calling_kfunc(u32 btf_id)
+static bool is_sync_callback_calling_kfunc(u32 btf_id)
{
return btf_id == special_kfunc_list[KF_bpf_rbtree_add_impl];
}
return -EACCES;
}
+ /* Check the arguments */
+ err = check_kfunc_args(env, &meta, insn_idx);
+ if (err < 0)
+ return err;
+
+ if (meta.func_id == special_kfunc_list[KF_bpf_rbtree_add_impl]) {
+ err = push_callback_call(env, insn, insn_idx, meta.subprogno,
+ set_rbtree_add_callback_state);
+ if (err) {
+ verbose(env, "kfunc %s#%d failed callback verification\n",
+ func_name, meta.func_id);
+ return err;
+ }
+ }
+
rcu_lock = is_kfunc_bpf_rcu_read_lock(&meta);
rcu_unlock = is_kfunc_bpf_rcu_read_unlock(&meta);
return -EINVAL;
}
- /* Check the arguments */
- err = check_kfunc_args(env, &meta, insn_idx);
- if (err < 0)
- return err;
/* In case of release function, we get register number of refcounted
* PTR_TO_BTF_ID in bpf_kfunc_arg_meta, do the release now.
*/
}
}
- if (meta.func_id == special_kfunc_list[KF_bpf_rbtree_add_impl]) {
- err = __check_func_call(env, insn, insn_idx_p, meta.subprogno,
- set_rbtree_add_callback_state);
- if (err) {
- verbose(env, "kfunc %s#%d failed callback verification\n",
- func_name, meta.func_id);
- return err;
- }
- }
-
if (meta.func_id == special_kfunc_list[KF_bpf_throw]) {
if (!bpf_jit_supports_exceptions()) {
verbose(env, "JIT does not support calling kfunc %s#%d\n",
if (meta.func_id == special_kfunc_list[KF_bpf_obj_new_impl] && !bpf_global_ma_set)
return -ENOMEM;
- if (meta.func_id == special_kfunc_list[KF_bpf_percpu_obj_new_impl] && !bpf_global_percpu_ma_set)
- return -ENOMEM;
+ if (meta.func_id == special_kfunc_list[KF_bpf_percpu_obj_new_impl]) {
+ if (!bpf_global_percpu_ma_set) {
+ mutex_lock(&bpf_percpu_ma_lock);
+ if (!bpf_global_percpu_ma_set) {
+ err = bpf_mem_alloc_init(&bpf_global_percpu_ma, 0, true);
+ if (!err)
+ bpf_global_percpu_ma_set = true;
+ }
+ mutex_unlock(&bpf_percpu_ma_lock);
+ if (err)
+ return err;
+ }
+ }
if (((u64)(u32)meta.arg_constant.value) != meta.arg_constant.value) {
verbose(env, "local type ID argument must be in range [0, U32_MAX]\n");
return env->insn_aux_data[insn_idx].force_checkpoint;
}
+static void mark_calls_callback(struct bpf_verifier_env *env, int idx)
+{
+ env->insn_aux_data[idx].calls_callback = true;
+}
+
+static bool calls_callback(struct bpf_verifier_env *env, int insn_idx)
+{
+ return env->insn_aux_data[insn_idx].calls_callback;
+}
enum {
DONE_EXPLORING = 0,
* w - next instruction
* e - edge
*/
-static int push_insn(int t, int w, int e, struct bpf_verifier_env *env,
- bool loop_ok)
+static int push_insn(int t, int w, int e, struct bpf_verifier_env *env)
{
int *insn_stack = env->cfg.insn_stack;
int *insn_state = env->cfg.insn_state;
insn_stack[env->cfg.cur_stack++] = w;
return KEEP_EXPLORING;
} else if ((insn_state[w] & 0xF0) == DISCOVERED) {
- if (loop_ok && env->bpf_capable)
+ if (env->bpf_capable)
return DONE_EXPLORING;
verbose_linfo(env, t, "%d: ", t);
verbose_linfo(env, w, "%d: ", w);
struct bpf_verifier_env *env,
bool visit_callee)
{
- int ret;
+ int ret, insn_sz;
- ret = push_insn(t, t + 1, FALLTHROUGH, env, false);
+ insn_sz = bpf_is_ldimm64(&insns[t]) ? 2 : 1;
+ ret = push_insn(t, t + insn_sz, FALLTHROUGH, env);
if (ret)
return ret;
- mark_prune_point(env, t + 1);
+ mark_prune_point(env, t + insn_sz);
/* when we exit from subprog, we need to record non-linear history */
- mark_jmp_point(env, t + 1);
+ mark_jmp_point(env, t + insn_sz);
if (visit_callee) {
mark_prune_point(env, t);
- ret = push_insn(t, t + insns[t].imm + 1, BRANCH, env,
- /* It's ok to allow recursion from CFG point of
- * view. __check_func_call() will do the actual
- * check.
- */
- bpf_pseudo_func(insns + t));
+ ret = push_insn(t, t + insns[t].imm + 1, BRANCH, env);
}
return ret;
}
static int visit_insn(int t, struct bpf_verifier_env *env)
{
struct bpf_insn *insns = env->prog->insnsi, *insn = &insns[t];
- int ret, off;
+ int ret, off, insn_sz;
if (bpf_pseudo_func(insn))
return visit_func_call_insn(t, insns, env, true);
/* All non-branch instructions have a single fall-through edge. */
if (BPF_CLASS(insn->code) != BPF_JMP &&
- BPF_CLASS(insn->code) != BPF_JMP32)
- return push_insn(t, t + 1, FALLTHROUGH, env, false);
+ BPF_CLASS(insn->code) != BPF_JMP32) {
+ insn_sz = bpf_is_ldimm64(insn) ? 2 : 1;
+ return push_insn(t, t + insn_sz, FALLTHROUGH, env);
+ }
switch (BPF_OP(insn->code)) {
case BPF_EXIT:
* async state will be pushed for further exploration.
*/
mark_prune_point(env, t);
+ /* For functions that invoke callbacks it is not known how many times
+ * callback would be called. Verifier models callback calling functions
+ * by repeatedly visiting callback bodies and returning to origin call
+ * instruction.
+ * In order to stop such iteration verifier needs to identify when a
+ * state identical some state from a previous iteration is reached.
+ * Check below forces creation of checkpoint before callback calling
+ * instruction to allow search for such identical states.
+ */
+ if (is_sync_callback_calling_insn(insn)) {
+ mark_calls_callback(env, t);
+ mark_force_checkpoint(env, t);
+ mark_prune_point(env, t);
+ mark_jmp_point(env, t);
+ }
if (insn->src_reg == BPF_PSEUDO_KFUNC_CALL) {
struct bpf_kfunc_call_arg_meta meta;
off = insn->imm;
/* unconditional jump with single edge */
- ret = push_insn(t, t + off + 1, FALLTHROUGH, env,
- true);
+ ret = push_insn(t, t + off + 1, FALLTHROUGH, env);
if (ret)
return ret;
/* conditional jump with two edges */
mark_prune_point(env, t);
- ret = push_insn(t, t + 1, FALLTHROUGH, env, true);
+ ret = push_insn(t, t + 1, FALLTHROUGH, env);
if (ret)
return ret;
- return push_insn(t, t + insn->off + 1, BRANCH, env, true);
+ return push_insn(t, t + insn->off + 1, BRANCH, env);
}
}
}
for (i = 0; i < insn_cnt; i++) {
+ struct bpf_insn *insn = &env->prog->insnsi[i];
+
if (insn_state[i] != EXPLORED) {
verbose(env, "unreachable insn %d\n", i);
ret = -EINVAL;
goto err_free;
}
+ if (bpf_is_ldimm64(insn)) {
+ if (insn_state[i + 1] != 0) {
+ verbose(env, "jump into the middle of ldimm64 insn %d\n", i);
+ ret = -EINVAL;
+ goto err_free;
+ }
+ i++; /* skip second half of ldimm64 */
+ }
}
ret = 0; /* cfg looks good */
}
goto skip_inf_loop_check;
}
+ if (calls_callback(env, insn_idx)) {
+ if (states_equal(env, &sl->state, cur, true))
+ goto hit;
+ goto skip_inf_loop_check;
+ }
/* attempt to detect infinite loop to avoid unnecessary doomed work */
if (states_maybe_looping(&sl->state, cur) &&
states_equal(env, &sl->state, cur, false) &&
- !iter_active_depths_differ(&sl->state, cur)) {
+ !iter_active_depths_differ(&sl->state, cur) &&
+ sl->state.callback_unroll_depth == cur->callback_unroll_depth) {
verbose_linfo(env, insn_idx, "; ");
verbose(env, "infinite loop detected at insn %d\n", insn_idx);
verbose(env, "cur state:");
return psi_trigger_poll(&ctx->psi.trigger, of->file, pt);
}
-static int cgroup_pressure_open(struct kernfs_open_file *of)
-{
- if (of->file->f_mode & FMODE_WRITE && !capable(CAP_SYS_RESOURCE))
- return -EPERM;
-
- return 0;
-}
-
static void cgroup_pressure_release(struct kernfs_open_file *of)
{
struct cgroup_file_ctx *ctx = of->priv;
{
.name = "io.pressure",
.file_offset = offsetof(struct cgroup, psi_files[PSI_IO]),
- .open = cgroup_pressure_open,
.seq_show = cgroup_io_pressure_show,
.write = cgroup_io_pressure_write,
.poll = cgroup_pressure_poll,
{
.name = "memory.pressure",
.file_offset = offsetof(struct cgroup, psi_files[PSI_MEM]),
- .open = cgroup_pressure_open,
.seq_show = cgroup_memory_pressure_show,
.write = cgroup_memory_pressure_write,
.poll = cgroup_pressure_poll,
{
.name = "cpu.pressure",
.file_offset = offsetof(struct cgroup, psi_files[PSI_CPU]),
- .open = cgroup_pressure_open,
.seq_show = cgroup_cpu_pressure_show,
.write = cgroup_cpu_pressure_write,
.poll = cgroup_pressure_poll,
{
.name = "irq.pressure",
.file_offset = offsetof(struct cgroup, psi_files[PSI_IRQ]),
- .open = cgroup_pressure_open,
.seq_show = cgroup_irq_pressure_show,
.write = cgroup_irq_pressure_write,
.poll = cgroup_pressure_poll,
bool cgroup_freezing(struct task_struct *task)
{
bool ret;
+ unsigned int state;
rcu_read_lock();
- ret = task_freezer(task)->state & CGROUP_FREEZING;
+ /* Check if the cgroup is still FREEZING, but not FROZEN. The extra
+ * !FROZEN check is required, because the FREEZING bit is not cleared
+ * when the state FROZEN is reached.
+ */
+ state = task_freezer(task)->state;
+ ret = (state & CGROUP_FREEZING) && !(state & CGROUP_FROZEN);
rcu_read_unlock();
return ret;
[CPUHP_HRTIMERS_PREPARE] = {
.name = "hrtimers:prepare",
.startup.single = hrtimers_prepare_cpu,
- .teardown.single = hrtimers_dead_cpu,
+ .teardown.single = NULL,
},
[CPUHP_SMPCFD_PREPARE] = {
.name = "smpcfd:prepare",
.startup.single = NULL,
.teardown.single = smpcfd_dying_cpu,
},
+ [CPUHP_AP_HRTIMERS_DYING] = {
+ .name = "hrtimers:dying",
+ .startup.single = NULL,
+ .teardown.single = hrtimers_cpu_dying,
+ },
+
/* Entry state on starting. Interrupts enabled from here on. Transient
* state for synchronsization */
[CPUHP_AP_ONLINE] = {
* It returns 0 on success and -EINVAL on failure.
*/
static int __init parse_crashkernel_suffix(char *cmdline,
- unsigned long long *crash_size,
+ unsigned long long *crash_size,
const char *suffix)
{
char *cur = cmdline;
unsigned long long *crash_base,
const char *suffix)
{
- char *first_colon, *first_space;
- char *ck_cmdline;
- char *name = "crashkernel=";
+ char *first_colon, *first_space;
+ char *ck_cmdline;
+ char *name = "crashkernel=";
BUG_ON(!crash_size || !crash_base);
*crash_size = 0;
return;
}
- if ((crash_base > CRASH_ADDR_LOW_MAX) &&
+ if ((crash_base >= CRASH_ADDR_LOW_MAX) &&
crash_low_size && reserve_crashkernel_low(crash_low_size)) {
memblock_phys_free(crash_base, crash_size);
return;
*/
struct cred init_cred = {
.usage = ATOMIC_INIT(4),
-#ifdef CONFIG_DEBUG_CREDENTIALS
- .subscribers = ATOMIC_INIT(2),
- .magic = CRED_MAGIC,
-#endif
.uid = GLOBAL_ROOT_UID,
.gid = GLOBAL_ROOT_GID,
.suid = GLOBAL_ROOT_UID,
.ucounts = &init_ucounts,
};
-static inline void set_cred_subscribers(struct cred *cred, int n)
-{
-#ifdef CONFIG_DEBUG_CREDENTIALS
- atomic_set(&cred->subscribers, n);
-#endif
-}
-
-static inline int read_cred_subscribers(const struct cred *cred)
-{
-#ifdef CONFIG_DEBUG_CREDENTIALS
- return atomic_read(&cred->subscribers);
-#else
- return 0;
-#endif
-}
-
-static inline void alter_cred_subscribers(const struct cred *_cred, int n)
-{
-#ifdef CONFIG_DEBUG_CREDENTIALS
- struct cred *cred = (struct cred *) _cred;
-
- atomic_add(n, &cred->subscribers);
-#endif
-}
-
/*
* The RCU callback to actually dispose of a set of credentials
*/
kdebug("put_cred_rcu(%p)", cred);
-#ifdef CONFIG_DEBUG_CREDENTIALS
- if (cred->magic != CRED_MAGIC_DEAD ||
- atomic_read(&cred->usage) != 0 ||
- read_cred_subscribers(cred) != 0)
- panic("CRED: put_cred_rcu() sees %p with"
- " mag %x, put %p, usage %d, subscr %d\n",
- cred, cred->magic, cred->put_addr,
- atomic_read(&cred->usage),
- read_cred_subscribers(cred));
-#else
- if (atomic_read(&cred->usage) != 0)
- panic("CRED: put_cred_rcu() sees %p with usage %d\n",
- cred, atomic_read(&cred->usage));
-#endif
+ if (atomic_long_read(&cred->usage) != 0)
+ panic("CRED: put_cred_rcu() sees %p with usage %ld\n",
+ cred, atomic_long_read(&cred->usage));
security_cred_free(cred);
key_put(cred->session_keyring);
*/
void __put_cred(struct cred *cred)
{
- kdebug("__put_cred(%p{%d,%d})", cred,
- atomic_read(&cred->usage),
- read_cred_subscribers(cred));
-
- BUG_ON(atomic_read(&cred->usage) != 0);
-#ifdef CONFIG_DEBUG_CREDENTIALS
- BUG_ON(read_cred_subscribers(cred) != 0);
- cred->magic = CRED_MAGIC_DEAD;
- cred->put_addr = __builtin_return_address(0);
-#endif
+ kdebug("__put_cred(%p{%ld})", cred,
+ atomic_long_read(&cred->usage));
+
+ BUG_ON(atomic_long_read(&cred->usage) != 0);
BUG_ON(cred == current->cred);
BUG_ON(cred == current->real_cred);
{
struct cred *real_cred, *cred;
- kdebug("exit_creds(%u,%p,%p,{%d,%d})", tsk->pid, tsk->real_cred, tsk->cred,
- atomic_read(&tsk->cred->usage),
- read_cred_subscribers(tsk->cred));
+ kdebug("exit_creds(%u,%p,%p,{%ld})", tsk->pid, tsk->real_cred, tsk->cred,
+ atomic_long_read(&tsk->cred->usage));
real_cred = (struct cred *) tsk->real_cred;
tsk->real_cred = NULL;
cred = (struct cred *) tsk->cred;
tsk->cred = NULL;
- validate_creds(cred);
if (real_cred == cred) {
- alter_cred_subscribers(cred, -2);
put_cred_many(cred, 2);
} else {
- validate_creds(real_cred);
- alter_cred_subscribers(real_cred, -1);
put_cred(real_cred);
- alter_cred_subscribers(cred, -1);
put_cred(cred);
}
if (!new)
return NULL;
- atomic_set(&new->usage, 1);
-#ifdef CONFIG_DEBUG_CREDENTIALS
- new->magic = CRED_MAGIC;
-#endif
+ atomic_long_set(&new->usage, 1);
if (security_cred_alloc_blank(new, GFP_KERNEL_ACCOUNT) < 0)
goto error;
const struct cred *old;
struct cred *new;
- validate_process_creds();
-
new = kmem_cache_alloc(cred_jar, GFP_KERNEL);
if (!new)
return NULL;
memcpy(new, old, sizeof(struct cred));
new->non_rcu = 0;
- atomic_set(&new->usage, 1);
- set_cred_subscribers(new, 0);
+ atomic_long_set(&new->usage, 1);
get_group_info(new->group_info);
get_uid(new->user);
get_user_ns(new->user_ns);
if (security_prepare_creds(new, old, GFP_KERNEL_ACCOUNT) < 0)
goto error;
- validate_creds(new);
return new;
error:
clone_flags & CLONE_THREAD
) {
p->real_cred = get_cred_many(p->cred, 2);
- alter_cred_subscribers(p->cred, 2);
- kdebug("share_creds(%p{%d,%d})",
- p->cred, atomic_read(&p->cred->usage),
- read_cred_subscribers(p->cred));
+ kdebug("share_creds(%p{%ld})",
+ p->cred, atomic_long_read(&p->cred->usage));
inc_rlimit_ucounts(task_ucounts(p), UCOUNT_RLIMIT_NPROC, 1);
return 0;
}
p->cred = p->real_cred = get_cred(new);
inc_rlimit_ucounts(task_ucounts(p), UCOUNT_RLIMIT_NPROC, 1);
- alter_cred_subscribers(new, 2);
- validate_creds(new);
return 0;
error_put:
struct task_struct *task = current;
const struct cred *old = task->real_cred;
- kdebug("commit_creds(%p{%d,%d})", new,
- atomic_read(&new->usage),
- read_cred_subscribers(new));
+ kdebug("commit_creds(%p{%ld})", new,
+ atomic_long_read(&new->usage));
BUG_ON(task->cred != old);
-#ifdef CONFIG_DEBUG_CREDENTIALS
- BUG_ON(read_cred_subscribers(old) < 2);
- validate_creds(old);
- validate_creds(new);
-#endif
- BUG_ON(atomic_read(&new->usage) < 1);
+ BUG_ON(atomic_long_read(&new->usage) < 1);
get_cred(new); /* we will require a ref for the subj creds too */
* RLIMIT_NPROC limits on user->processes have already been checked
* in set_user().
*/
- alter_cred_subscribers(new, 2);
if (new->user != old->user || new->user_ns != old->user_ns)
inc_rlimit_ucounts(new->ucounts, UCOUNT_RLIMIT_NPROC, 1);
rcu_assign_pointer(task->real_cred, new);
rcu_assign_pointer(task->cred, new);
if (new->user != old->user || new->user_ns != old->user_ns)
dec_rlimit_ucounts(old->ucounts, UCOUNT_RLIMIT_NPROC, 1);
- alter_cred_subscribers(old, -2);
/* send notifications */
if (!uid_eq(new->uid, old->uid) ||
*/
void abort_creds(struct cred *new)
{
- kdebug("abort_creds(%p{%d,%d})", new,
- atomic_read(&new->usage),
- read_cred_subscribers(new));
+ kdebug("abort_creds(%p{%ld})", new,
+ atomic_long_read(&new->usage));
-#ifdef CONFIG_DEBUG_CREDENTIALS
- BUG_ON(read_cred_subscribers(new) != 0);
-#endif
- BUG_ON(atomic_read(&new->usage) < 1);
+ BUG_ON(atomic_long_read(&new->usage) < 1);
put_cred(new);
}
EXPORT_SYMBOL(abort_creds);
{
const struct cred *old = current->cred;
- kdebug("override_creds(%p{%d,%d})", new,
- atomic_read(&new->usage),
- read_cred_subscribers(new));
-
- validate_creds(old);
- validate_creds(new);
+ kdebug("override_creds(%p{%ld})", new,
+ atomic_long_read(&new->usage));
/*
* NOTE! This uses 'get_new_cred()' rather than 'get_cred()'.
* we are only installing the cred into the thread-synchronous
* '->cred' pointer, not the '->real_cred' pointer that is
* visible to other threads under RCU.
- *
- * Also note that we did validate_creds() manually, not depending
- * on the validation in 'get_cred()'.
*/
get_new_cred((struct cred *)new);
- alter_cred_subscribers(new, 1);
rcu_assign_pointer(current->cred, new);
- alter_cred_subscribers(old, -1);
- kdebug("override_creds() = %p{%d,%d}", old,
- atomic_read(&old->usage),
- read_cred_subscribers(old));
+ kdebug("override_creds() = %p{%ld}", old,
+ atomic_long_read(&old->usage));
return old;
}
EXPORT_SYMBOL(override_creds);
{
const struct cred *override = current->cred;
- kdebug("revert_creds(%p{%d,%d})", old,
- atomic_read(&old->usage),
- read_cred_subscribers(old));
+ kdebug("revert_creds(%p{%ld})", old,
+ atomic_long_read(&old->usage));
- validate_creds(old);
- validate_creds(override);
- alter_cred_subscribers(old, 1);
rcu_assign_pointer(current->cred, old);
- alter_cred_subscribers(override, -1);
put_cred(override);
}
EXPORT_SYMBOL(revert_creds);
kdebug("prepare_kernel_cred() alloc %p", new);
old = get_task_cred(daemon);
- validate_creds(old);
*new = *old;
new->non_rcu = 0;
- atomic_set(&new->usage, 1);
- set_cred_subscribers(new, 0);
+ atomic_long_set(&new->usage, 1);
get_uid(new->user);
get_user_ns(new->user_ns);
get_group_info(new->group_info);
goto error;
put_cred(old);
- validate_creds(new);
return new;
error:
return security_kernel_create_files_as(new, inode);
}
EXPORT_SYMBOL(set_create_files_as);
-
-#ifdef CONFIG_DEBUG_CREDENTIALS
-
-bool creds_are_invalid(const struct cred *cred)
-{
- if (cred->magic != CRED_MAGIC)
- return true;
- return false;
-}
-EXPORT_SYMBOL(creds_are_invalid);
-
-/*
- * dump invalid credentials
- */
-static void dump_invalid_creds(const struct cred *cred, const char *label,
- const struct task_struct *tsk)
-{
- pr_err("%s credentials: %p %s%s%s\n",
- label, cred,
- cred == &init_cred ? "[init]" : "",
- cred == tsk->real_cred ? "[real]" : "",
- cred == tsk->cred ? "[eff]" : "");
- pr_err("->magic=%x, put_addr=%p\n",
- cred->magic, cred->put_addr);
- pr_err("->usage=%d, subscr=%d\n",
- atomic_read(&cred->usage),
- read_cred_subscribers(cred));
- pr_err("->*uid = { %d,%d,%d,%d }\n",
- from_kuid_munged(&init_user_ns, cred->uid),
- from_kuid_munged(&init_user_ns, cred->euid),
- from_kuid_munged(&init_user_ns, cred->suid),
- from_kuid_munged(&init_user_ns, cred->fsuid));
- pr_err("->*gid = { %d,%d,%d,%d }\n",
- from_kgid_munged(&init_user_ns, cred->gid),
- from_kgid_munged(&init_user_ns, cred->egid),
- from_kgid_munged(&init_user_ns, cred->sgid),
- from_kgid_munged(&init_user_ns, cred->fsgid));
-#ifdef CONFIG_SECURITY
- pr_err("->security is %p\n", cred->security);
- if ((unsigned long) cred->security >= PAGE_SIZE &&
- (((unsigned long) cred->security & 0xffffff00) !=
- (POISON_FREE << 24 | POISON_FREE << 16 | POISON_FREE << 8)))
- pr_err("->security {%x, %x}\n",
- ((u32*)cred->security)[0],
- ((u32*)cred->security)[1]);
-#endif
-}
-
-/*
- * report use of invalid credentials
- */
-void __noreturn __invalid_creds(const struct cred *cred, const char *file, unsigned line)
-{
- pr_err("Invalid credentials\n");
- pr_err("At %s:%u\n", file, line);
- dump_invalid_creds(cred, "Specified", current);
- BUG();
-}
-EXPORT_SYMBOL(__invalid_creds);
-
-/*
- * check the credentials on a process
- */
-void __validate_process_creds(struct task_struct *tsk,
- const char *file, unsigned line)
-{
- if (tsk->cred == tsk->real_cred) {
- if (unlikely(read_cred_subscribers(tsk->cred) < 2 ||
- creds_are_invalid(tsk->cred)))
- goto invalid_creds;
- } else {
- if (unlikely(read_cred_subscribers(tsk->real_cred) < 1 ||
- read_cred_subscribers(tsk->cred) < 1 ||
- creds_are_invalid(tsk->real_cred) ||
- creds_are_invalid(tsk->cred)))
- goto invalid_creds;
- }
- return;
-
-invalid_creds:
- pr_err("Invalid process credentials\n");
- pr_err("At %s:%u\n", file, line);
-
- dump_invalid_creds(tsk->real_cred, "Real", tsk);
- if (tsk->cred != tsk->real_cred)
- dump_invalid_creds(tsk->cred, "Effective", tsk);
- else
- pr_err("Effective creds == Real creds\n");
- BUG();
-}
-EXPORT_SYMBOL(__validate_process_creds);
-
-/*
- * check creds for do_exit()
- */
-void validate_creds_for_do_exit(struct task_struct *tsk)
-{
- kdebug("validate_creds_for_do_exit(%p,%p{%d,%d})",
- tsk->real_cred, tsk->cred,
- atomic_read(&tsk->cred->usage),
- read_cred_subscribers(tsk->cred));
-
- __validate_process_creds(tsk, __FILE__, __LINE__);
-}
-
-#endif /* CONFIG_DEBUG_CREDENTIALS */
PERF_EVENT_STATE_INACTIVE;
}
-static void __perf_event_read_size(struct perf_event *event, int nr_siblings)
+static int __perf_event_read_size(u64 read_format, int nr_siblings)
{
int entry = sizeof(u64); /* value */
int size = 0;
int nr = 1;
- if (event->attr.read_format & PERF_FORMAT_TOTAL_TIME_ENABLED)
+ if (read_format & PERF_FORMAT_TOTAL_TIME_ENABLED)
size += sizeof(u64);
- if (event->attr.read_format & PERF_FORMAT_TOTAL_TIME_RUNNING)
+ if (read_format & PERF_FORMAT_TOTAL_TIME_RUNNING)
size += sizeof(u64);
- if (event->attr.read_format & PERF_FORMAT_ID)
+ if (read_format & PERF_FORMAT_ID)
entry += sizeof(u64);
- if (event->attr.read_format & PERF_FORMAT_LOST)
+ if (read_format & PERF_FORMAT_LOST)
entry += sizeof(u64);
- if (event->attr.read_format & PERF_FORMAT_GROUP) {
+ if (read_format & PERF_FORMAT_GROUP) {
nr += nr_siblings;
size += sizeof(u64);
}
- size += entry * nr;
- event->read_size = size;
+ /*
+ * Since perf_event_validate_size() limits this to 16k and inhibits
+ * adding more siblings, this will never overflow.
+ */
+ return size + nr * entry;
}
static void __perf_event_header_size(struct perf_event *event, u64 sample_type)
*/
static void perf_event__header_size(struct perf_event *event)
{
- __perf_event_read_size(event,
- event->group_leader->nr_siblings);
+ event->read_size =
+ __perf_event_read_size(event->attr.read_format,
+ event->group_leader->nr_siblings);
__perf_event_header_size(event, event->attr.sample_type);
}
event->id_header_size = size;
}
+/*
+ * Check that adding an event to the group does not result in anybody
+ * overflowing the 64k event limit imposed by the output buffer.
+ *
+ * Specifically, check that the read_size for the event does not exceed 16k,
+ * read_size being the one term that grows with groups size. Since read_size
+ * depends on per-event read_format, also (re)check the existing events.
+ *
+ * This leaves 48k for the constant size fields and things like callchains,
+ * branch stacks and register sets.
+ */
static bool perf_event_validate_size(struct perf_event *event)
{
- /*
- * The values computed here will be over-written when we actually
- * attach the event.
- */
- __perf_event_read_size(event, event->group_leader->nr_siblings + 1);
- __perf_event_header_size(event, event->attr.sample_type & ~PERF_SAMPLE_READ);
- perf_event__id_header_size(event);
+ struct perf_event *sibling, *group_leader = event->group_leader;
+
+ if (__perf_event_read_size(event->attr.read_format,
+ group_leader->nr_siblings + 1) > 16*1024)
+ return false;
+
+ if (__perf_event_read_size(group_leader->attr.read_format,
+ group_leader->nr_siblings + 1) > 16*1024)
+ return false;
/*
- * Sum the lot; should not exceed the 64k limit we have on records.
- * Conservative limit to allow for callchains and other variable fields.
+ * When creating a new group leader, group_leader->ctx is initialized
+ * after the size has been validated, but we cannot safely use
+ * for_each_sibling_event() until group_leader->ctx is set. A new group
+ * leader cannot have any siblings yet, so we can safely skip checking
+ * the non-existent siblings.
*/
- if (event->read_size + event->header_size +
- event->id_header_size + sizeof(struct perf_event_header) >= 16*1024)
- return false;
+ if (event == group_leader)
+ return true;
+
+ for_each_sibling_event(sibling, group_leader) {
+ if (__perf_event_read_size(sibling->attr.read_format,
+ group_leader->nr_siblings + 1) > 16*1024)
+ return false;
+ }
return true;
}
void *task_ctx_data = NULL;
if (!ctx->task) {
+ /*
+ * perf_pmu_migrate_context() / __perf_pmu_install_event()
+ * relies on the fact that find_get_pmu_context() cannot fail
+ * for CPU contexts.
+ */
struct perf_cpu_pmu_context *cpc;
cpc = per_cpu_ptr(pmu->cpu_pmu_context, event->cpu);
int cpu, struct perf_event *event)
{
struct perf_event_pmu_context *epc;
+ struct perf_event_context *old_ctx = event->ctx;
+
+ get_ctx(ctx); /* normally find_get_context() */
event->cpu = cpu;
epc = find_get_pmu_context(pmu, ctx, event);
if (event->state >= PERF_EVENT_STATE_OFF)
event->state = PERF_EVENT_STATE_INACTIVE;
perf_install_in_context(ctx, event, cpu);
+
+ /*
+ * Now that event->ctx is updated and visible, put the old ctx.
+ */
+ put_ctx(old_ctx);
}
static void __perf_pmu_install(struct perf_event_context *ctx,
struct perf_event_context *src_ctx, *dst_ctx;
LIST_HEAD(events);
+ /*
+ * Since per-cpu context is persistent, no need to grab an extra
+ * reference.
+ */
src_ctx = &per_cpu_ptr(&perf_cpu_context, src_cpu)->ctx;
dst_ctx = &per_cpu_ptr(&perf_cpu_context, dst_cpu)->ctx;
ptrace_event(PTRACE_EVENT_EXIT, code);
user_events_exit(tsk);
- validate_creds_for_do_exit(tsk);
-
io_uring_files_cancel();
exit_signals(tsk); /* sets PF_EXITING */
if (tsk->task_frag.page)
put_page(tsk->task_frag.page);
- validate_creds_for_do_exit(tsk);
exit_task_stack_account(tsk);
check_stack_usage();
if (WARN_ON_ONCE(freezing(p)))
goto unlock;
- if (task_call_func(p, __restore_freezer_state, NULL))
+ if (!frozen(p) || task_call_func(p, __restore_freezer_state, NULL))
goto unlock;
wake_up_state(p, TASK_FROZEN);
owner = uval & FUTEX_TID_MASK;
if (pending_op && !pi && !owner) {
- futex_wake(uaddr, 1, 1, FUTEX_BITSET_MATCH_ANY);
+ futex_wake(uaddr, FLAGS_SIZE_32 | FLAGS_SHARED, 1,
+ FUTEX_BITSET_MATCH_ANY);
return 0;
}
* Wake robust non-PI futexes here. The wakeup of
* PI futexes happens in exit_pi_state():
*/
- if (!pi && (uval & FUTEX_WAITERS))
- futex_wake(uaddr, 1, 1, FUTEX_BITSET_MATCH_ANY);
+ if (!pi && (uval & FUTEX_WAITERS)) {
+ futex_wake(uaddr, FLAGS_SIZE_32 | FLAGS_SHARED, 1,
+ FUTEX_BITSET_MATCH_ANY);
+ }
return 0;
}
rp->rph = NULL;
return -ENOMEM;
}
- rp->rph->rp = rp;
+ rcu_assign_pointer(rp->rph->rp, rp);
rp->nmissed = 0;
/* Establish function entry probe point */
ret = register_kprobe(&rp->kp);
#ifdef CONFIG_KRETPROBE_ON_RETHOOK
rethook_free(rps[i]->rh);
#else
- rps[i]->rph->rp = NULL;
+ rcu_assign_pointer(rps[i]->rph->rp, NULL);
#endif
}
mutex_unlock(&kprobe_mutex);
size = chain_block_size(curr);
if (likely(size >= req)) {
del_chain_block(0, size, chain_block_next(curr));
- add_chain_block(curr + req, size - req);
+ if (size > req)
+ add_chain_block(curr + req, size - req);
return curr;
}
}
write_lock(&resource_lock);
for (addr = gfr_start(base, size, align, flags);
- gfr_continue(base, addr, size, flags);
- addr = gfr_next(addr, size, flags)) {
+ gfr_continue(base, addr, align, flags);
+ addr = gfr_next(addr, align, flags)) {
if (__region_intersects(base, addr, size, 0, IORES_DESC_NONE) !=
REGION_DISJOINT)
continue;
dequeue_load_avg(struct cfs_rq *cfs_rq, struct sched_entity *se) { }
#endif
+static void reweight_eevdf(struct cfs_rq *cfs_rq, struct sched_entity *se,
+ unsigned long weight)
+{
+ unsigned long old_weight = se->load.weight;
+ u64 avruntime = avg_vruntime(cfs_rq);
+ s64 vlag, vslice;
+
+ /*
+ * VRUNTIME
+ * ========
+ *
+ * COROLLARY #1: The virtual runtime of the entity needs to be
+ * adjusted if re-weight at !0-lag point.
+ *
+ * Proof: For contradiction assume this is not true, so we can
+ * re-weight without changing vruntime at !0-lag point.
+ *
+ * Weight VRuntime Avg-VRuntime
+ * before w v V
+ * after w' v' V'
+ *
+ * Since lag needs to be preserved through re-weight:
+ *
+ * lag = (V - v)*w = (V'- v')*w', where v = v'
+ * ==> V' = (V - v)*w/w' + v (1)
+ *
+ * Let W be the total weight of the entities before reweight,
+ * since V' is the new weighted average of entities:
+ *
+ * V' = (WV + w'v - wv) / (W + w' - w) (2)
+ *
+ * by using (1) & (2) we obtain:
+ *
+ * (WV + w'v - wv) / (W + w' - w) = (V - v)*w/w' + v
+ * ==> (WV-Wv+Wv+w'v-wv)/(W+w'-w) = (V - v)*w/w' + v
+ * ==> (WV - Wv)/(W + w' - w) + v = (V - v)*w/w' + v
+ * ==> (V - v)*W/(W + w' - w) = (V - v)*w/w' (3)
+ *
+ * Since we are doing at !0-lag point which means V != v, we
+ * can simplify (3):
+ *
+ * ==> W / (W + w' - w) = w / w'
+ * ==> Ww' = Ww + ww' - ww
+ * ==> W * (w' - w) = w * (w' - w)
+ * ==> W = w (re-weight indicates w' != w)
+ *
+ * So the cfs_rq contains only one entity, hence vruntime of
+ * the entity @v should always equal to the cfs_rq's weighted
+ * average vruntime @V, which means we will always re-weight
+ * at 0-lag point, thus breach assumption. Proof completed.
+ *
+ *
+ * COROLLARY #2: Re-weight does NOT affect weighted average
+ * vruntime of all the entities.
+ *
+ * Proof: According to corollary #1, Eq. (1) should be:
+ *
+ * (V - v)*w = (V' - v')*w'
+ * ==> v' = V' - (V - v)*w/w' (4)
+ *
+ * According to the weighted average formula, we have:
+ *
+ * V' = (WV - wv + w'v') / (W - w + w')
+ * = (WV - wv + w'(V' - (V - v)w/w')) / (W - w + w')
+ * = (WV - wv + w'V' - Vw + wv) / (W - w + w')
+ * = (WV + w'V' - Vw) / (W - w + w')
+ *
+ * ==> V'*(W - w + w') = WV + w'V' - Vw
+ * ==> V' * (W - w) = (W - w) * V (5)
+ *
+ * If the entity is the only one in the cfs_rq, then reweight
+ * always occurs at 0-lag point, so V won't change. Or else
+ * there are other entities, hence W != w, then Eq. (5) turns
+ * into V' = V. So V won't change in either case, proof done.
+ *
+ *
+ * So according to corollary #1 & #2, the effect of re-weight
+ * on vruntime should be:
+ *
+ * v' = V' - (V - v) * w / w' (4)
+ * = V - (V - v) * w / w'
+ * = V - vl * w / w'
+ * = V - vl'
+ */
+ if (avruntime != se->vruntime) {
+ vlag = (s64)(avruntime - se->vruntime);
+ vlag = div_s64(vlag * old_weight, weight);
+ se->vruntime = avruntime - vlag;
+ }
+
+ /*
+ * DEADLINE
+ * ========
+ *
+ * When the weight changes, the virtual time slope changes and
+ * we should adjust the relative virtual deadline accordingly.
+ *
+ * d' = v' + (d - v)*w/w'
+ * = V' - (V - v)*w/w' + (d - v)*w/w'
+ * = V - (V - v)*w/w' + (d - v)*w/w'
+ * = V + (d - V)*w/w'
+ */
+ vslice = (s64)(se->deadline - avruntime);
+ vslice = div_s64(vslice * old_weight, weight);
+ se->deadline = avruntime + vslice;
+}
+
static void reweight_entity(struct cfs_rq *cfs_rq, struct sched_entity *se,
unsigned long weight)
{
- unsigned long old_weight = se->load.weight;
+ bool curr = cfs_rq->curr == se;
if (se->on_rq) {
/* commit outstanding execution time */
- if (cfs_rq->curr == se)
+ if (curr)
update_curr(cfs_rq);
else
- avg_vruntime_sub(cfs_rq, se);
+ __dequeue_entity(cfs_rq, se);
update_load_sub(&cfs_rq->load, se->load.weight);
}
dequeue_load_avg(cfs_rq, se);
- update_load_set(&se->load, weight);
-
if (!se->on_rq) {
/*
* Because we keep se->vlag = V - v_i, while: lag_i = w_i*(V - v_i),
* we need to scale se->vlag when w_i changes.
*/
- se->vlag = div_s64(se->vlag * old_weight, weight);
+ se->vlag = div_s64(se->vlag * se->load.weight, weight);
} else {
- s64 deadline = se->deadline - se->vruntime;
- /*
- * When the weight changes, the virtual time slope changes and
- * we should adjust the relative virtual deadline accordingly.
- */
- deadline = div_s64(deadline * old_weight, weight);
- se->deadline = se->vruntime + deadline;
- if (se != cfs_rq->curr)
- min_deadline_cb_propagate(&se->run_node, NULL);
+ reweight_eevdf(cfs_rq, se, weight);
}
+ update_load_set(&se->load, weight);
+
#ifdef CONFIG_SMP
do {
u32 divider = get_pelt_divider(&se->avg);
enqueue_load_avg(cfs_rq, se);
if (se->on_rq) {
update_load_add(&cfs_rq->load, se->load.weight);
- if (cfs_rq->curr != se)
- avg_vruntime_add(cfs_rq, se);
+ if (!curr) {
+ /*
+ * The entity's vruntime has been adjusted, so let's check
+ * whether the rq-wide min_vruntime needs updated too. Since
+ * the calculations above require stable min_vruntime rather
+ * than up-to-date one, we do the update at the end of the
+ * reweight process.
+ */
+ __enqueue_entity(cfs_rq, se);
+ update_min_vruntime(cfs_rq);
+ }
}
}
#ifndef CONFIG_SMP
shares = READ_ONCE(gcfs_rq->tg->shares);
-
- if (likely(se->load.weight == shares))
- return;
#else
- shares = calc_group_shares(gcfs_rq);
+ shares = calc_group_shares(gcfs_rq);
#endif
-
- reweight_entity(cfs_rq_of(se), se, shares);
+ if (unlikely(se->load.weight != shares))
+ reweight_entity(cfs_rq_of(se), se, shares);
}
#else /* CONFIG_FAIR_GROUP_SCHED */
continue;
}
- /* Are we the first idle CPU? */
+ /*
+ * Are we the first idle core in a non-SMT domain or higher,
+ * or the first idle CPU in a SMT domain?
+ */
return cpu == env->dst_cpu;
}
- if (idle_smt == env->dst_cpu)
- return true;
+ /* Are we the first idle CPU with busy siblings? */
+ if (idle_smt != -1)
+ return idle_smt == env->dst_cpu;
/* Are we the first CPU of this group ? */
return group_balance_cpu(sg) == env->dst_cpu;
if (bits & PR_MDWE_NO_INHERIT && !(bits & PR_MDWE_REFUSE_EXEC_GAIN))
return -EINVAL;
+ /* PARISC cannot allow mdwe as it needs writable stacks */
+ if (IS_ENABLED(CONFIG_PARISC))
+ return -EINVAL;
+
current_bits = get_current_mdwe();
if (current_bits && current_bits != bits)
return -EPERM; /* Cannot unset the flags */
COND_SYSCALL_COMPAT(recvmmsg_time32);
COND_SYSCALL_COMPAT(recvmmsg_time64);
+/* Posix timer syscalls may be configured out */
+COND_SYSCALL(timer_create);
+COND_SYSCALL(timer_gettime);
+COND_SYSCALL(timer_getoverrun);
+COND_SYSCALL(timer_settime);
+COND_SYSCALL(timer_delete);
+COND_SYSCALL(clock_adjtime);
+COND_SYSCALL(getitimer);
+COND_SYSCALL(setitimer);
+COND_SYSCALL(alarm);
+COND_SYSCALL_COMPAT(timer_create);
+COND_SYSCALL_COMPAT(getitimer);
+COND_SYSCALL_COMPAT(setitimer);
+
/*
* Architecture specific syscalls: see further below
*/
}
}
-int hrtimers_dead_cpu(unsigned int scpu)
+int hrtimers_cpu_dying(unsigned int dying_cpu)
{
struct hrtimer_cpu_base *old_base, *new_base;
- int i;
+ int i, ncpu = cpumask_first(cpu_active_mask);
- BUG_ON(cpu_online(scpu));
- tick_cancel_sched_timer(scpu);
+ tick_cancel_sched_timer(dying_cpu);
+
+ old_base = this_cpu_ptr(&hrtimer_bases);
+ new_base = &per_cpu(hrtimer_bases, ncpu);
- /*
- * this BH disable ensures that raise_softirq_irqoff() does
- * not wakeup ksoftirqd (and acquire the pi-lock) while
- * holding the cpu_base lock
- */
- local_bh_disable();
- local_irq_disable();
- old_base = &per_cpu(hrtimer_bases, scpu);
- new_base = this_cpu_ptr(&hrtimer_bases);
/*
* The caller is globally serialized and nobody else
* takes two locks at once, deadlock is not possible.
*/
- raw_spin_lock(&new_base->lock);
- raw_spin_lock_nested(&old_base->lock, SINGLE_DEPTH_NESTING);
+ raw_spin_lock(&old_base->lock);
+ raw_spin_lock_nested(&new_base->lock, SINGLE_DEPTH_NESTING);
for (i = 0; i < HRTIMER_MAX_CLOCK_BASES; i++) {
migrate_hrtimer_list(&old_base->clock_base[i],
* The migration might have changed the first expiring softirq
* timer on this CPU. Update it.
*/
- hrtimer_update_softirq_timer(new_base, false);
+ __hrtimer_get_next_event(new_base, HRTIMER_ACTIVE_SOFT);
+ /* Tell the other CPU to retrigger the next event */
+ smp_call_function_single(ncpu, retrigger_next_event, NULL, 0);
- raw_spin_unlock(&old_base->lock);
raw_spin_unlock(&new_base->lock);
+ raw_spin_unlock(&old_base->lock);
- /* Check, if we got expired work to do */
- __hrtimer_peek_ahead_timers();
- local_irq_enable();
- local_bh_enable();
return 0;
}
#include <linux/time_namespace.h>
#include <linux/compat.h>
-#ifdef CONFIG_ARCH_HAS_SYSCALL_WRAPPER
-/* Architectures may override SYS_NI and COMPAT_SYS_NI */
-#include <asm/syscall_wrapper.h>
-#endif
-
-asmlinkage long sys_ni_posix_timers(void)
-{
- pr_err_once("process %d (%s) attempted a POSIX timer syscall "
- "while CONFIG_POSIX_TIMERS is not set\n",
- current->pid, current->comm);
- return -ENOSYS;
-}
-
-#ifndef SYS_NI
-#define SYS_NI(name) SYSCALL_ALIAS(sys_##name, sys_ni_posix_timers)
-#endif
-
-#ifndef COMPAT_SYS_NI
-#define COMPAT_SYS_NI(name) SYSCALL_ALIAS(compat_sys_##name, sys_ni_posix_timers)
-#endif
-
-SYS_NI(timer_create);
-SYS_NI(timer_gettime);
-SYS_NI(timer_getoverrun);
-SYS_NI(timer_settime);
-SYS_NI(timer_delete);
-SYS_NI(clock_adjtime);
-SYS_NI(getitimer);
-SYS_NI(setitimer);
-SYS_NI(clock_adjtime32);
-#ifdef __ARCH_WANT_SYS_ALARM
-SYS_NI(alarm);
-#endif
-
/*
* We preserve minimal support for CLOCK_REALTIME and CLOCK_MONOTONIC
* as it is easy to remain compatible with little code. CLOCK_BOOTTIME
which_clock);
}
-#ifdef CONFIG_COMPAT
-COMPAT_SYS_NI(timer_create);
-#endif
-
-#if defined(CONFIG_COMPAT) || defined(CONFIG_ALPHA)
-COMPAT_SYS_NI(getitimer);
-COMPAT_SYS_NI(setitimer);
-#endif
-
#ifdef CONFIG_COMPAT_32BIT_TIME
-SYS_NI(timer_settime32);
-SYS_NI(timer_gettime32);
SYSCALL_DEFINE2(clock_settime32, const clockid_t, which_clock,
struct old_timespec32 __user *, tp)
*/
void rethook_stop(struct rethook *rh)
{
- WRITE_ONCE(rh->handler, NULL);
+ rcu_assign_pointer(rh->handler, NULL);
}
/**
*/
void rethook_free(struct rethook *rh)
{
- WRITE_ONCE(rh->handler, NULL);
+ rethook_stop(rh);
call_rcu(&rh->rcu, rethook_free_rcu);
}
return 0;
}
+static inline rethook_handler_t rethook_get_handler(struct rethook *rh)
+{
+ return (rethook_handler_t)rcu_dereference_check(rh->handler,
+ rcu_read_lock_any_held());
+}
+
/**
* rethook_alloc() - Allocate struct rethook.
* @data: a data to pass the @handler when hooking the return.
return ERR_PTR(-ENOMEM);
rh->data = data;
- rh->handler = handler;
+ rcu_assign_pointer(rh->handler, handler);
/* initialize the objpool for rethook nodes */
if (objpool_init(&rh->pool, num, size, GFP_KERNEL, rh,
*/
void rethook_recycle(struct rethook_node *node)
{
- lockdep_assert_preemption_disabled();
+ rethook_handler_t handler;
- if (likely(READ_ONCE(node->rethook->handler)))
+ handler = rethook_get_handler(node->rethook);
+ if (likely(handler))
objpool_push(node, &node->rethook->pool);
else
call_rcu(&node->rcu, free_rethook_node_rcu);
*/
struct rethook_node *rethook_try_get(struct rethook *rh)
{
- rethook_handler_t handler = READ_ONCE(rh->handler);
-
- lockdep_assert_preemption_disabled();
+ rethook_handler_t handler = rethook_get_handler(rh);
/* Check whether @rh is going to be freed. */
if (unlikely(!handler))
rhn = container_of(first, struct rethook_node, llist);
if (WARN_ON_ONCE(rhn->frame != frame))
break;
- handler = READ_ONCE(rhn->rethook->handler);
+ handler = rethook_get_handler(rhn->rethook);
if (handler)
handler(rhn, rhn->rethook->data,
correct_ret_addr, regs);
*cnt = rb_time_cnt(top);
- /* If top and bottom counts don't match, this interrupted a write */
- if (*cnt != rb_time_cnt(bottom))
+ /* If top, msb or bottom counts don't match, this interrupted a write */
+ if (*cnt != rb_time_cnt(msb) || *cnt != rb_time_cnt(bottom))
return false;
/* The shift to msb will lose its cnt bits */
return local_try_cmpxchg(l, &expect, set);
}
-static bool rb_time_cmpxchg(rb_time_t *t, u64 expect, u64 set)
-{
- unsigned long cnt, top, bottom, msb;
- unsigned long cnt2, top2, bottom2, msb2;
- u64 val;
-
- /* The cmpxchg always fails if it interrupted an update */
- if (!__rb_time_read(t, &val, &cnt2))
- return false;
-
- if (val != expect)
- return false;
-
- cnt = local_read(&t->cnt);
- if ((cnt & 3) != cnt2)
- return false;
-
- cnt2 = cnt + 1;
-
- rb_time_split(val, &top, &bottom, &msb);
- top = rb_time_val_cnt(top, cnt);
- bottom = rb_time_val_cnt(bottom, cnt);
-
- rb_time_split(set, &top2, &bottom2, &msb2);
- top2 = rb_time_val_cnt(top2, cnt2);
- bottom2 = rb_time_val_cnt(bottom2, cnt2);
-
- if (!rb_time_read_cmpxchg(&t->cnt, cnt, cnt2))
- return false;
- if (!rb_time_read_cmpxchg(&t->msb, msb, msb2))
- return false;
- if (!rb_time_read_cmpxchg(&t->top, top, top2))
- return false;
- if (!rb_time_read_cmpxchg(&t->bottom, bottom, bottom2))
- return false;
- return true;
-}
-
#else /* 64 bits */
/* local64_t always succeeds */
{
local64_set(&t->time, val);
}
-
-static bool rb_time_cmpxchg(rb_time_t *t, u64 expect, u64 set)
-{
- return local64_try_cmpxchg(&t->time, &expect, set);
-}
#endif
/*
free_buffer_page(bpage);
}
+ free_page((unsigned long)cpu_buffer->free_page);
+
kfree(cpu_buffer);
}
*/
barrier();
- if ((iter->head + length) > commit || length > BUF_MAX_DATA_SIZE)
+ if ((iter->head + length) > commit || length > BUF_PAGE_SIZE)
/* Writer corrupted the read? */
goto reset;
return length;
}
-static u64 rb_time_delta(struct ring_buffer_event *event)
-{
- switch (event->type_len) {
- case RINGBUF_TYPE_PADDING:
- return 0;
-
- case RINGBUF_TYPE_TIME_EXTEND:
- return rb_event_time_stamp(event);
-
- case RINGBUF_TYPE_TIME_STAMP:
- return 0;
-
- case RINGBUF_TYPE_DATA:
- return event->time_delta;
- default:
- return 0;
- }
-}
-
static inline bool
rb_try_to_discard(struct ring_buffer_per_cpu *cpu_buffer,
struct ring_buffer_event *event)
unsigned long new_index, old_index;
struct buffer_page *bpage;
unsigned long addr;
- u64 write_stamp;
- u64 delta;
new_index = rb_event_index(event);
old_index = new_index + rb_event_ts_length(event);
bpage = READ_ONCE(cpu_buffer->tail_page);
- delta = rb_time_delta(event);
-
- if (!rb_time_read(&cpu_buffer->write_stamp, &write_stamp))
- return false;
-
- /* Make sure the write stamp is read before testing the location */
- barrier();
-
+ /*
+ * Make sure the tail_page is still the same and
+ * the next write location is the end of this event
+ */
if (bpage->page == (void *)addr && rb_page_write(bpage) == old_index) {
unsigned long write_mask =
local_read(&bpage->write) & ~RB_WRITE_MASK;
unsigned long event_length = rb_event_length(event);
- /* Something came in, can't discard */
- if (!rb_time_cmpxchg(&cpu_buffer->write_stamp,
- write_stamp, write_stamp - delta))
- return false;
-
/*
- * It's possible that the event time delta is zero
- * (has the same time stamp as the previous event)
- * in which case write_stamp and before_stamp could
- * be the same. In such a case, force before_stamp
- * to be different than write_stamp. It doesn't
- * matter what it is, as long as its different.
+ * For the before_stamp to be different than the write_stamp
+ * to make sure that the next event adds an absolute
+ * value and does not rely on the saved write stamp, which
+ * is now going to be bogus.
+ *
+ * By setting the before_stamp to zero, the next event
+ * is not going to use the write_stamp and will instead
+ * create an absolute timestamp. This means there's no
+ * reason to update the wirte_stamp!
*/
- if (!delta)
- rb_time_set(&cpu_buffer->before_stamp, 0);
+ rb_time_set(&cpu_buffer->before_stamp, 0);
/*
* If an event were to come in now, it would see that the
* write_stamp and the before_stamp are different, and assume
* that this event just added itself before updating
* the write stamp. The interrupting event will fix the
- * write stamp for us, and use the before stamp as its delta.
+ * write stamp for us, and use an absolute timestamp.
*/
/*
return;
/*
- * If this interrupted another event,
+ * If this interrupted another event,
*/
if (atomic_inc_return(this_cpu_ptr(&checking)) != 1)
goto out;
* absolute timestamp.
* Don't bother if this is the start of a new page (w == 0).
*/
- if (unlikely(!a_ok || !b_ok || (info->before != info->after && w))) {
+ if (!w) {
+ /* Use the sub-buffer timestamp */
+ info->delta = 0;
+ } else if (unlikely(!a_ok || !b_ok || info->before != info->after)) {
info->add_timestamp |= RB_ADD_STAMP_FORCE | RB_ADD_STAMP_EXTEND;
info->length += RB_LEN_TIME_EXTEND;
} else {
/* See if we shot pass the end of this buffer page */
if (unlikely(write > BUF_PAGE_SIZE)) {
- /* before and after may now different, fix it up*/
- b_ok = rb_time_read(&cpu_buffer->before_stamp, &info->before);
- a_ok = rb_time_read(&cpu_buffer->write_stamp, &info->after);
- if (a_ok && b_ok && info->before != info->after)
- (void)rb_time_cmpxchg(&cpu_buffer->before_stamp,
- info->before, info->after);
- if (a_ok && b_ok)
- check_buffer(cpu_buffer, info, CHECK_FULL_PAGE);
+ check_buffer(cpu_buffer, info, CHECK_FULL_PAGE);
return rb_move_tail(cpu_buffer, tail, info);
}
if (likely(tail == w)) {
- u64 save_before;
- bool s_ok;
-
/* Nothing interrupted us between A and C */
/*D*/ rb_time_set(&cpu_buffer->write_stamp, info->ts);
- barrier();
- /*E*/ s_ok = rb_time_read(&cpu_buffer->before_stamp, &save_before);
- RB_WARN_ON(cpu_buffer, !s_ok);
+ /*
+ * If something came in between C and D, the write stamp
+ * may now not be in sync. But that's fine as the before_stamp
+ * will be different and then next event will just be forced
+ * to use an absolute timestamp.
+ */
if (likely(!(info->add_timestamp &
(RB_ADD_STAMP_FORCE | RB_ADD_STAMP_ABSOLUTE))))
/* This did not interrupt any time update */
else
/* Just use full timestamp for interrupting event */
info->delta = info->ts;
- barrier();
check_buffer(cpu_buffer, info, tail);
- if (unlikely(info->ts != save_before)) {
- /* SLOW PATH - Interrupted between C and E */
-
- a_ok = rb_time_read(&cpu_buffer->write_stamp, &info->after);
- RB_WARN_ON(cpu_buffer, !a_ok);
-
- /* Write stamp must only go forward */
- if (save_before > info->after) {
- /*
- * We do not care about the result, only that
- * it gets updated atomically.
- */
- (void)rb_time_cmpxchg(&cpu_buffer->write_stamp,
- info->after, save_before);
- }
- }
} else {
u64 ts;
/* SLOW PATH - Interrupted between A and C */
- a_ok = rb_time_read(&cpu_buffer->write_stamp, &info->after);
- /* Was interrupted before here, write_stamp must be valid */
+
+ /* Save the old before_stamp */
+ a_ok = rb_time_read(&cpu_buffer->before_stamp, &info->before);
RB_WARN_ON(cpu_buffer, !a_ok);
+
+ /*
+ * Read a new timestamp and update the before_stamp to make
+ * the next event after this one force using an absolute
+ * timestamp. This is in case an interrupt were to come in
+ * between E and F.
+ */
ts = rb_time_stamp(cpu_buffer->buffer);
+ rb_time_set(&cpu_buffer->before_stamp, ts);
+
barrier();
- /*E*/ if (write == (local_read(&tail_page->write) & RB_WRITE_MASK) &&
- info->after < ts &&
- rb_time_cmpxchg(&cpu_buffer->write_stamp,
- info->after, ts)) {
- /* Nothing came after this event between C and E */
+ /*E*/ a_ok = rb_time_read(&cpu_buffer->write_stamp, &info->after);
+ /* Was interrupted before here, write_stamp must be valid */
+ RB_WARN_ON(cpu_buffer, !a_ok);
+ barrier();
+ /*F*/ if (write == (local_read(&tail_page->write) & RB_WRITE_MASK) &&
+ info->after == info->before && info->after < ts) {
+ /*
+ * Nothing came after this event between C and F, it is
+ * safe to use info->after for the delta as it
+ * matched info->before and is still valid.
+ */
info->delta = ts - info->after;
} else {
/*
- * Interrupted between C and E:
+ * Interrupted between C and F:
* Lost the previous events time stamp. Just set the
* delta to zero, and this will be the same time as
* the event this event interrupted. And the events that
int nr_loops = 0;
int add_ts_default;
+ /* ring buffer does cmpxchg, make sure it is safe in NMI context */
+ if (!IS_ENABLED(CONFIG_ARCH_HAVE_NMI_SAFE_CMPXCHG) &&
+ (unlikely(in_nmi()))) {
+ return NULL;
+ }
+
rb_start_commit(cpu_buffer);
/* The commit page can not change after this */
if (ring_buffer_time_stamp_abs(cpu_buffer->buffer)) {
add_ts_default = RB_ADD_STAMP_ABSOLUTE;
info.length += RB_LEN_TIME_EXTEND;
+ if (info.length > BUF_MAX_DATA_SIZE)
+ goto out_fail;
} else {
add_ts_default = RB_ADD_STAMP_NONE;
}
if (!iter)
return NULL;
- iter->event = kmalloc(BUF_MAX_DATA_SIZE, flags);
+ /* Holds the entire event: data and meta data */
+ iter->event = kmalloc(BUF_PAGE_SIZE, flags);
if (!iter->event) {
kfree(iter);
return NULL;
ret = test_trace_synth_event();
WARN_ON(ret);
+
+ /* Disable when done */
+ trace_array_set_clr_event(gen_synth_test->tr,
+ "synthetic",
+ "gen_synth_test", false);
+ trace_array_set_clr_event(empty_synth_test->tr,
+ "synthetic",
+ "empty_synth_test", false);
+ trace_array_set_clr_event(create_synth_test->tr,
+ "synthetic",
+ "create_synth_test", false);
out:
return ret;
}
return global_trace.stop_count;
}
-/**
- * tracing_start - quick start of the tracer
- *
- * If tracing is enabled but was stopped by tracing_stop,
- * this will start the tracer back up.
- */
-void tracing_start(void)
+static void tracing_start_tr(struct trace_array *tr)
{
struct trace_buffer *buffer;
unsigned long flags;
if (tracing_disabled)
return;
- raw_spin_lock_irqsave(&global_trace.start_lock, flags);
- if (--global_trace.stop_count) {
- if (global_trace.stop_count < 0) {
+ raw_spin_lock_irqsave(&tr->start_lock, flags);
+ if (--tr->stop_count) {
+ if (WARN_ON_ONCE(tr->stop_count < 0)) {
/* Someone screwed up their debugging */
- WARN_ON_ONCE(1);
- global_trace.stop_count = 0;
+ tr->stop_count = 0;
}
goto out;
}
/* Prevent the buffers from switching */
- arch_spin_lock(&global_trace.max_lock);
+ arch_spin_lock(&tr->max_lock);
- buffer = global_trace.array_buffer.buffer;
+ buffer = tr->array_buffer.buffer;
if (buffer)
ring_buffer_record_enable(buffer);
#ifdef CONFIG_TRACER_MAX_TRACE
- buffer = global_trace.max_buffer.buffer;
+ buffer = tr->max_buffer.buffer;
if (buffer)
ring_buffer_record_enable(buffer);
#endif
- arch_spin_unlock(&global_trace.max_lock);
-
- out:
- raw_spin_unlock_irqrestore(&global_trace.start_lock, flags);
-}
-
-static void tracing_start_tr(struct trace_array *tr)
-{
- struct trace_buffer *buffer;
- unsigned long flags;
-
- if (tracing_disabled)
- return;
-
- /* If global, we need to also start the max tracer */
- if (tr->flags & TRACE_ARRAY_FL_GLOBAL)
- return tracing_start();
-
- raw_spin_lock_irqsave(&tr->start_lock, flags);
-
- if (--tr->stop_count) {
- if (tr->stop_count < 0) {
- /* Someone screwed up their debugging */
- WARN_ON_ONCE(1);
- tr->stop_count = 0;
- }
- goto out;
- }
-
- buffer = tr->array_buffer.buffer;
- if (buffer)
- ring_buffer_record_enable(buffer);
+ arch_spin_unlock(&tr->max_lock);
out:
raw_spin_unlock_irqrestore(&tr->start_lock, flags);
}
/**
- * tracing_stop - quick stop of the tracer
+ * tracing_start - quick start of the tracer
*
- * Light weight way to stop tracing. Use in conjunction with
- * tracing_start.
+ * If tracing is enabled but was stopped by tracing_stop,
+ * this will start the tracer back up.
*/
-void tracing_stop(void)
+void tracing_start(void)
+
+{
+ return tracing_start_tr(&global_trace);
+}
+
+static void tracing_stop_tr(struct trace_array *tr)
{
struct trace_buffer *buffer;
unsigned long flags;
- raw_spin_lock_irqsave(&global_trace.start_lock, flags);
- if (global_trace.stop_count++)
+ raw_spin_lock_irqsave(&tr->start_lock, flags);
+ if (tr->stop_count++)
goto out;
/* Prevent the buffers from switching */
- arch_spin_lock(&global_trace.max_lock);
+ arch_spin_lock(&tr->max_lock);
- buffer = global_trace.array_buffer.buffer;
+ buffer = tr->array_buffer.buffer;
if (buffer)
ring_buffer_record_disable(buffer);
#ifdef CONFIG_TRACER_MAX_TRACE
- buffer = global_trace.max_buffer.buffer;
+ buffer = tr->max_buffer.buffer;
if (buffer)
ring_buffer_record_disable(buffer);
#endif
- arch_spin_unlock(&global_trace.max_lock);
+ arch_spin_unlock(&tr->max_lock);
out:
- raw_spin_unlock_irqrestore(&global_trace.start_lock, flags);
+ raw_spin_unlock_irqrestore(&tr->start_lock, flags);
}
-static void tracing_stop_tr(struct trace_array *tr)
+/**
+ * tracing_stop - quick stop of the tracer
+ *
+ * Light weight way to stop tracing. Use in conjunction with
+ * tracing_start.
+ */
+void tracing_stop(void)
{
- struct trace_buffer *buffer;
- unsigned long flags;
-
- /* If global, we need to also stop the max tracer */
- if (tr->flags & TRACE_ARRAY_FL_GLOBAL)
- return tracing_stop();
-
- raw_spin_lock_irqsave(&tr->start_lock, flags);
- if (tr->stop_count++)
- goto out;
-
- buffer = tr->array_buffer.buffer;
- if (buffer)
- ring_buffer_record_disable(buffer);
-
- out:
- raw_spin_unlock_irqrestore(&tr->start_lock, flags);
+ return tracing_stop_tr(&global_trace);
}
static int trace_save_cmdline(struct task_struct *tsk)
for_each_tracing_cpu(cpu) {
page = alloc_pages_node(cpu_to_node(cpu),
GFP_KERNEL | __GFP_NORETRY, 0);
- if (!page)
- goto failed;
+ /* This is just an optimization and can handle failures */
+ if (!page) {
+ pr_err("Failed to allocate event buffer\n");
+ break;
+ }
event = page_address(page);
memset(event, 0, sizeof(*event));
WARN_ON_ONCE(1);
preempt_enable();
}
-
- return;
- failed:
- trace_buffered_event_disable();
}
static void enable_trace_buffered_event(void *data)
if (--trace_buffered_event_ref)
return;
- preempt_disable();
/* For each CPU, set the buffer as used. */
- smp_call_function_many(tracing_buffer_mask,
- disable_trace_buffered_event, NULL, 1);
- preempt_enable();
+ on_each_cpu_mask(tracing_buffer_mask, disable_trace_buffered_event,
+ NULL, true);
/* Wait for all current users to finish */
synchronize_rcu();
free_page((unsigned long)per_cpu(trace_buffered_event, cpu));
per_cpu(trace_buffered_event, cpu) = NULL;
}
+
/*
- * Make sure trace_buffered_event is NULL before clearing
- * trace_buffered_event_cnt.
+ * Wait for all CPUs that potentially started checking if they can use
+ * their event buffer only after the previous synchronize_rcu() call and
+ * they still read a valid pointer from trace_buffered_event. It must be
+ * ensured they don't see cleared trace_buffered_event_cnt else they
+ * could wrongly decide to use the pointed-to buffer which is now freed.
*/
- smp_wmb();
+ synchronize_rcu();
- preempt_disable();
- /* Do the work on each cpu */
- smp_call_function_many(tracing_buffer_mask,
- enable_trace_buffered_event, NULL, 1);
- preempt_enable();
+ /* For each CPU, relinquish the buffer */
+ on_each_cpu_mask(tracing_buffer_mask, enable_trace_buffered_event, NULL,
+ true);
}
static struct trace_buffer *temp_buffer;
iter->leftover = ret;
} else {
- print_trace_line(iter);
+ ret = print_trace_line(iter);
+ if (ret == TRACE_TYPE_PARTIAL_LINE) {
+ iter->seq.full = 0;
+ trace_seq_puts(&iter->seq, "[LINE TOO BIG]\n");
+ }
ret = trace_print_seq(m, &iter->seq);
/*
* If we overflow the seq_file buffer, then it will
return 0;
}
+int tracing_single_release_file_tr(struct inode *inode, struct file *filp)
+{
+ tracing_release_file_tr(inode, filp);
+ return single_release(inode, filp);
+}
+
static int tracing_mark_open(struct inode *inode, struct file *filp)
{
stream_open(inode, filp);
if (!tr->array_buffer.buffer)
return 0;
+ /* Do not allow tracing while resizing ring buffer */
+ tracing_stop_tr(tr);
+
ret = ring_buffer_resize(tr->array_buffer.buffer, size, cpu);
if (ret < 0)
- return ret;
+ goto out_start;
#ifdef CONFIG_TRACER_MAX_TRACE
- if (!(tr->flags & TRACE_ARRAY_FL_GLOBAL) ||
- !tr->current_trace->use_max_tr)
+ if (!tr->allocated_snapshot)
goto out;
ret = ring_buffer_resize(tr->max_buffer.buffer, size, cpu);
WARN_ON(1);
tracing_disabled = 1;
}
- return ret;
+ goto out_start;
}
update_buffer_entries(&tr->max_buffer, cpu);
#endif /* CONFIG_TRACER_MAX_TRACE */
update_buffer_entries(&tr->array_buffer, cpu);
-
+ out_start:
+ tracing_start_tr(tr);
return ret;
}
int tracing_open_generic_tr(struct inode *inode, struct file *filp);
int tracing_open_file_tr(struct inode *inode, struct file *filp);
int tracing_release_file_tr(struct inode *inode, struct file *filp);
+int tracing_single_release_file_tr(struct inode *inode, struct file *filp);
bool tracing_is_disabled(void);
bool tracer_tracing_is_on(struct trace_array *tr);
void tracer_tracing_on(struct trace_array *tr);
{
int ret;
- ret = security_locked_down(LOCKDOWN_TRACEFS);
+ ret = tracing_open_file_tr(inode, file);
if (ret)
return ret;
+ /* Clear private_data to avoid warning in single_open() */
+ file->private_data = NULL;
return single_open(file, hist_show, file);
}
.open = event_hist_open,
.read = seq_read,
.llseek = seq_lseek,
- .release = single_release,
+ .release = tracing_single_release_file_tr,
};
#ifdef CONFIG_HIST_TRIGGERS_DEBUG
{
int ret;
- ret = security_locked_down(LOCKDOWN_TRACEFS);
+ ret = tracing_open_file_tr(inode, file);
if (ret)
return ret;
+ /* Clear private_data to avoid warning in single_open() */
+ file->private_data = NULL;
return single_open(file, hist_debug_show, file);
}
.open = event_hist_debug_open,
.read = seq_read,
.llseek = seq_lseek,
- .release = single_release,
+ .release = tracing_single_release_file_tr,
};
#endif
* @cmd: A pointer to the dynevent_cmd struct representing the new event
* @name: The name of the synthetic event
* @mod: The module creating the event, NULL if not created from a module
- * @args: Variable number of arg (pairs), one pair for each field
+ * @...: Variable number of arg (pairs), one pair for each field
*
* NOTE: Users normally won't want to call this function directly, but
* rather use the synth_event_gen_cmd_start() wrapper, which
* synth_event_trace - Trace a synthetic event
* @file: The trace_event_file representing the synthetic event
* @n_vals: The number of values in vals
- * @args: Variable number of args containing the event values
+ * @...: Variable number of args containing the event values
*
* Trace a synthetic event using the values passed in the variable
* argument list.
{
struct print_entry *field;
struct trace_seq *s = &iter->seq;
+ int max = iter->ent_size - offsetof(struct print_entry, buf);
trace_assign_type(field, iter->ent);
seq_print_ip_sym(s, field->ip, flags);
- trace_seq_printf(s, ": %s", field->buf);
+ trace_seq_printf(s, ": %.*s", max, field->buf);
return trace_handle_return(s);
}
struct trace_event *event)
{
struct print_entry *field;
+ int max = iter->ent_size - offsetof(struct print_entry, buf);
trace_assign_type(field, iter->ent);
- trace_seq_printf(&iter->seq, "# %lx %s", field->ip, field->buf);
+ trace_seq_printf(&iter->seq, "# %lx %.*s", field->ip, max, field->buf);
return trace_handle_return(&iter->seq);
}
pr_warn_once("workqueue: round-robin CPU selection forced, expect performance impact\n");
}
- if (cpumask_empty(wq_unbound_cpumask))
- return cpu;
-
new_cpu = __this_cpu_read(wq_rr_cpu_last);
new_cpu = cpumask_next_and(new_cpu, wq_unbound_cpumask, cpu_online_mask);
if (unlikely(new_cpu >= nr_cpu_ids)) {
#endif /* CONFIG_WQ_WATCHDOG */
+static void __init restrict_unbound_cpumask(const char *name, const struct cpumask *mask)
+{
+ if (!cpumask_intersects(wq_unbound_cpumask, mask)) {
+ pr_warn("workqueue: Restricting unbound_cpumask (%*pb) with %s (%*pb) leaves no CPU, ignoring\n",
+ cpumask_pr_args(wq_unbound_cpumask), name, cpumask_pr_args(mask));
+ return;
+ }
+
+ cpumask_and(wq_unbound_cpumask, wq_unbound_cpumask, mask);
+}
+
/**
* workqueue_init_early - early init for workqueue subsystem
*
BUILD_BUG_ON(__alignof__(struct pool_workqueue) < __alignof__(long long));
BUG_ON(!alloc_cpumask_var(&wq_unbound_cpumask, GFP_KERNEL));
- cpumask_copy(wq_unbound_cpumask, housekeeping_cpumask(HK_TYPE_WQ));
- cpumask_and(wq_unbound_cpumask, wq_unbound_cpumask, housekeeping_cpumask(HK_TYPE_DOMAIN));
-
+ cpumask_copy(wq_unbound_cpumask, cpu_possible_mask);
+ restrict_unbound_cpumask("HK_TYPE_WQ", housekeeping_cpumask(HK_TYPE_WQ));
+ restrict_unbound_cpumask("HK_TYPE_DOMAIN", housekeeping_cpumask(HK_TYPE_DOMAIN));
if (!cpumask_empty(&wq_cmdline_cpumask))
- cpumask_and(wq_unbound_cpumask, wq_unbound_cpumask, &wq_cmdline_cpumask);
+ restrict_unbound_cpumask("workqueue.unbound_cpus", &wq_cmdline_cpumask);
pwq_cache = KMEM_CACHE(pool_workqueue, SLAB_PANIC);
endmenu
-config DEBUG_CREDENTIALS
- bool "Debug credential management"
- depends on DEBUG_KERNEL
- help
- Enable this to turn on some debug checking for credential
- management. The additional code keeps track of the number of
- pointers from task_structs to any given cred struct, and checks to
- see that this number never exceeds the usage count of the cred
- struct.
-
- Furthermore, if SELinux is enabled, this also checks that the
- security pointer in the cred struct is never seen to be invalid.
-
- If unsure, say N.
-
source "kernel/rcu/Kconfig.debug"
config DEBUG_WQ_FORCE_RR_CPU
closure_debug_destroy(cl);
if (destructor)
- destructor(cl);
+ destructor(&cl->work);
if (parent)
closure_put(parent);
int done;
};
-static void closure_sync_fn(struct closure *cl)
+static CLOSURE_CALLBACK(closure_sync_fn)
{
+ struct closure *cl = container_of(ws, struct closure, work);
struct closure_syncer *s = cl->s;
struct task_struct *p;
E(ENOSPC),
E(ENOSR),
E(ENOSTR),
-#ifdef ENOSYM
- E(ENOSYM),
-#endif
E(ENOSYS),
E(ENOTBLK),
E(ENOTCONN),
#endif
E(EREMOTE),
E(EREMOTEIO),
-#ifdef EREMOTERELEASE
- E(EREMOTERELEASE),
-#endif
E(ERESTART),
E(ERFKILL),
E(EROFS),
* Copyright (C) 2023 Intel Corp.
*/
#include <linux/errno.h>
-#include <linux/fw_table.h>
+#include <linux/acpi.h>
#include <linux/init.h>
#include <linux/kernel.h>
#include <linux/string.h>
if (!masks)
goto fail_node_to_cpumask;
- /* Stabilize the cpumasks */
- cpus_read_lock();
build_node_to_cpumask(node_to_cpumask);
+ /*
+ * Make a local cache of 'cpu_present_mask', so the two stages
+ * spread can observe consistent 'cpu_present_mask' without holding
+ * cpu hotplug lock, then we can reduce deadlock risk with cpu
+ * hotplug code.
+ *
+ * Here CPU hotplug may happen when reading `cpu_present_mask`, and
+ * we can live with the case because it only affects that hotplug
+ * CPU is handled in the 1st or 2nd stage, and either way is correct
+ * from API user viewpoint since 2-stage spread is sort of
+ * optimization.
+ */
+ cpumask_copy(npresmsk, data_race(cpu_present_mask));
+
/* grouping present CPUs first */
ret = __group_cpus_evenly(curgrp, numgrps, node_to_cpumask,
- cpu_present_mask, nmsk, masks);
+ npresmsk, nmsk, masks);
if (ret < 0)
goto fail_build_affinity;
nr_present = ret;
curgrp = 0;
else
curgrp = nr_present;
- cpumask_andnot(npresmsk, cpu_possible_mask, cpu_present_mask);
+ cpumask_andnot(npresmsk, cpu_possible_mask, npresmsk);
ret = __group_cpus_evenly(curgrp, numgrps, node_to_cpumask,
npresmsk, nmsk, masks);
if (ret >= 0)
nr_others = ret;
fail_build_affinity:
- cpus_read_unlock();
-
if (ret >= 0)
WARN_ON(nr_present + nr_others < numgrps);
goto delete;
xas_store(&xas, xa_mk_value(v));
} else {
- if (!test_bit(bit, bitmap->bitmap))
+ if (!bitmap || !test_bit(bit, bitmap->bitmap))
goto err;
__clear_bit(bit, bitmap->bitmap);
xas_set_mark(&xas, XA_FREE_MARK);
void *kaddr = kmap_local_page(page);
size_t n = min(bytes, (size_t)PAGE_SIZE - offset);
- n = iterate_and_advance(i, bytes, kaddr,
+ n = iterate_and_advance(i, n, kaddr + offset,
copy_to_user_iter_nofault,
memcpy_to_iter);
kunmap_local(kaddr);
KUNIT_EXPECT_TRUE(test, test->log->append_newlines);
full_log = string_stream_get_string(test->log);
- kunit_add_action(test, (kunit_action_t *)kfree, full_log);
+ kunit_add_action(test, kfree_wrapper, full_log);
KUNIT_EXPECT_NOT_ERR_OR_NULL(test,
strstr(full_log, "put this in log."));
KUNIT_EXPECT_NOT_ERR_OR_NULL(test,
}
EXPORT_SYMBOL_GPL(kunit_init_test);
+/* Only warn when a test takes more than twice the threshold */
+#define KUNIT_SPEED_WARNING_MULTIPLIER 2
+
+/* Slow tests are defined as taking more than 1s */
+#define KUNIT_SPEED_SLOW_THRESHOLD_S 1
+
+#define KUNIT_SPEED_SLOW_WARNING_THRESHOLD_S \
+ (KUNIT_SPEED_WARNING_MULTIPLIER * KUNIT_SPEED_SLOW_THRESHOLD_S)
+
+#define s_to_timespec64(s) ns_to_timespec64((s) * NSEC_PER_SEC)
+
+static void kunit_run_case_check_speed(struct kunit *test,
+ struct kunit_case *test_case,
+ struct timespec64 duration)
+{
+ struct timespec64 slow_thr =
+ s_to_timespec64(KUNIT_SPEED_SLOW_WARNING_THRESHOLD_S);
+ enum kunit_speed speed = test_case->attr.speed;
+
+ if (timespec64_compare(&duration, &slow_thr) < 0)
+ return;
+
+ if (speed == KUNIT_SPEED_VERY_SLOW || speed == KUNIT_SPEED_SLOW)
+ return;
+
+ kunit_warn(test,
+ "Test should be marked slow (runtime: %lld.%09lds)",
+ duration.tv_sec, duration.tv_nsec);
+}
+
/*
* Initializes and runs test case. Does not clean up or do post validations.
*/
struct kunit_suite *suite,
struct kunit_case *test_case)
{
+ struct timespec64 start, end;
+
if (suite->init) {
int ret;
}
}
+ ktime_get_ts64(&start);
+
test_case->run_case(test);
+
+ ktime_get_ts64(&end);
+
+ kunit_run_case_check_speed(test, test_case, timespec64_sub(end, start));
}
static void kunit_case_internal_cleanup(struct kunit *test)
return 0;
}
+ kunit_suite_counter = 1;
+
static_branch_inc(&kunit_running);
for (i = 0; i < num_suites; i++) {
for (i = 0; i < num_suites; i++)
kunit_exit_suite(suites[i]);
-
- kunit_suite_counter = 1;
}
EXPORT_SYMBOL_GPL(__kunit_test_suites_exit);
while (head != READ_ONCE(slot->last)) {
void *obj;
+ /*
+ * data visibility of 'last' and 'head' could be out of
+ * order since memory updating of 'last' and 'head' are
+ * performed in push() and pop() independently
+ *
+ * before any retrieving attempts, pop() must guarantee
+ * 'last' is behind 'head', that is to say, there must
+ * be available objects in slot, which could be ensured
+ * by condition 'last != head && last - head <= nr_objs'
+ * that is equivalent to 'last - head - 1 < nr_objs' as
+ * 'last' and 'head' are both unsigned int32
+ */
+ if (READ_ONCE(slot->last) - head - 1 >= pool->nr_objs) {
+ head = READ_ONCE(slot->head);
+ continue;
+ }
+
/* obj must be retrieved before moving forward head */
obj = READ_ONCE(slot->entries[head & slot->mask]);
IDA_BUG_ON(ida, !ida_is_empty(ida));
}
+/*
+ * Check various situations where we attempt to free an ID we don't own.
+ */
+static void ida_check_bad_free(struct ida *ida)
+{
+ unsigned long i;
+
+ printk("vvv Ignore \"not allocated\" warnings\n");
+ /* IDA is empty; all of these will fail */
+ ida_free(ida, 0);
+ for (i = 0; i < 31; i++)
+ ida_free(ida, 1 << i);
+
+ /* IDA contains a single value entry */
+ IDA_BUG_ON(ida, ida_alloc_min(ida, 3, GFP_KERNEL) != 3);
+ ida_free(ida, 0);
+ for (i = 0; i < 31; i++)
+ ida_free(ida, 1 << i);
+
+ /* IDA contains a single bitmap */
+ IDA_BUG_ON(ida, ida_alloc_min(ida, 1023, GFP_KERNEL) != 1023);
+ ida_free(ida, 0);
+ for (i = 0; i < 31; i++)
+ ida_free(ida, 1 << i);
+
+ /* IDA contains a tree */
+ IDA_BUG_ON(ida, ida_alloc_min(ida, (1 << 20) - 1, GFP_KERNEL) != (1 << 20) - 1);
+ ida_free(ida, 0);
+ for (i = 0; i < 31; i++)
+ ida_free(ida, 1 << i);
+ printk("^^^ \"not allocated\" warnings over\n");
+
+ ida_free(ida, 3);
+ ida_free(ida, 1023);
+ ida_free(ida, (1 << 20) - 1);
+
+ IDA_BUG_ON(ida, !ida_is_empty(ida));
+}
+
static DEFINE_IDA(ida);
static int ida_checks(void)
ida_check_leaf(&ida, 1024 * 64);
ida_check_max(&ida);
ida_check_conv(&ida);
+ ida_check_bad_free(&ida);
printk("IDA: %u of %u tests passed\n", tests_passed, tests_run);
return (tests_run != tests_passed) ? 0 : -EINVAL;
/* Loop starting from the root node to the current node. */
for (depth = fwnode_count_parents(fwnode); depth >= 0; depth--) {
- struct fwnode_handle *__fwnode =
- fwnode_get_nth_parent(fwnode, depth);
+ /*
+ * Only get a reference for other nodes (i.e. parent nodes).
+ * fwnode refcount may be 0 here.
+ */
+ struct fwnode_handle *__fwnode = depth ?
+ fwnode_get_nth_parent(fwnode, depth) : fwnode;
buf = string(buf, end, fwnode_get_name_prefix(__fwnode),
default_str_spec);
buf = string(buf, end, fwnode_get_name(__fwnode),
default_str_spec);
- fwnode_handle_put(__fwnode);
+ if (depth)
+ fwnode_handle_put(__fwnode);
}
return buf;
typedef struct {
short ncount[FSE_MAX_SYMBOL_VALUE + 1];
- FSE_DTable dtable[1]; /* Dynamically sized */
+ FSE_DTable dtable[]; /* Dynamically sized */
} FSE_DecompressWksp;
area from being merged with adjacent virtual memory areas due to the
difference in their name.
-config USERFAULTFD
- bool "Enable userfaultfd() system call"
- depends on MMU
- help
- Enable the userfaultfd() system call that allows to intercept and
- handle page faults in userland.
-
config HAVE_ARCH_USERFAULTFD_WP
bool
help
help
Arch has userfaultfd minor fault support
+menuconfig USERFAULTFD
+ bool "Enable userfaultfd() system call"
+ depends on MMU
+ help
+ Enable the userfaultfd() system call that allows to intercept and
+ handle page faults in userland.
+
+if USERFAULTFD
config PTE_MARKER_UFFD_WP
bool "Userfaultfd write protection support for shmem/hugetlbfs"
default y
Allows to create marker PTEs for userfaultfd write protection
purposes. It is required to enable userfaultfd write protection on
file-backed memory types like shmem and hugetlbfs.
+endif # USERFAULTFD
# multi-gen LRU {
config LRU_GEN
if (!ctx)
return NULL;
+ init_completion(&ctx->kdamond_started);
+
ctx->attrs.sample_interval = 5 * 1000;
ctx->attrs.aggr_interval = 100 * 1000;
ctx->attrs.ops_update_interval = 60 * 1000 * 1000;
mutex_lock(&ctx->kdamond_lock);
if (!ctx->kdamond) {
err = 0;
+ reinit_completion(&ctx->kdamond_started);
ctx->kdamond = kthread_run(kdamond_fn, ctx, "kdamond.%d",
nr_running_ctxs);
if (IS_ERR(ctx->kdamond)) {
err = PTR_ERR(ctx->kdamond);
ctx->kdamond = NULL;
+ } else {
+ wait_for_completion(&ctx->kdamond_started);
}
}
mutex_unlock(&ctx->kdamond_lock);
matched = true;
break;
default:
- break;
+ return false;
}
return matched == filter->matching;
new->age = r->age;
new->last_nr_accesses = r->last_nr_accesses;
new->nr_accesses_bp = r->nr_accesses_bp;
+ new->nr_accesses = r->nr_accesses;
damon_insert_region(new, r, damon_next_region(r), t);
}
pr_debug("kdamond (%d) starts\n", current->pid);
+ complete(&ctx->kdamond_started);
kdamond_init_intervals_sis(ctx);
if (ctx->ops.init)
* damon_sysfs_before_damos_apply() understands the situation by showing the
* 'finished' status and do nothing.
*
+ * If DAMOS is not applied to any region due to any reasons including the
+ * access pattern, the watermarks, the quotas, and the filters,
+ * ->before_damos_apply() will not be called back. Until the situation is
+ * changed, the update will not be finished. To avoid this,
+ * damon_sysfs_after_sampling() set the status as 'finished' if more than two
+ * apply intervals of the scheme is passed while the state is 'idle'.
+ *
* Finally, the tried regions request handling finisher function
* (damon_sysfs_schemes_update_regions_stop()) unregisters the callbacks.
*/
int nr_regions;
unsigned long total_bytes;
enum damos_sysfs_regions_upd_status upd_status;
+ unsigned long upd_timeout_jiffies;
};
static struct damon_sysfs_scheme_regions *
struct damon_sysfs_scheme_regions *regions = kmalloc(sizeof(*regions),
GFP_KERNEL);
+ if (!regions)
+ return NULL;
+
regions->kobj = (struct kobject){};
INIT_LIST_HEAD(®ions->regions_list);
regions->nr_regions = 0;
return 0;
region = damon_sysfs_scheme_region_alloc(r);
+ if (!region)
+ return 0;
list_add_tail(®ion->list, &sysfs_regions->regions_list);
sysfs_regions->nr_regions++;
if (kobject_init_and_add(®ion->kobj,
for (i = 0; i < sysfs_schemes->nr; i++) {
sysfs_regions = sysfs_schemes->schemes_arr[i]->tried_regions;
if (sysfs_regions->upd_status ==
- DAMOS_TRIED_REGIONS_UPD_STARTED)
+ DAMOS_TRIED_REGIONS_UPD_STARTED ||
+ time_after(jiffies,
+ sysfs_regions->upd_timeout_jiffies))
sysfs_regions->upd_status =
DAMOS_TRIED_REGIONS_UPD_FINISHED;
}
return 0;
}
+static struct damos *damos_sysfs_nth_scheme(int n, struct damon_ctx *ctx)
+{
+ struct damos *scheme;
+ int i = 0;
+
+ damon_for_each_scheme(scheme, ctx) {
+ if (i == n)
+ return scheme;
+ i++;
+ }
+ return NULL;
+}
+
static void damos_tried_regions_init_upd_status(
- struct damon_sysfs_schemes *sysfs_schemes)
+ struct damon_sysfs_schemes *sysfs_schemes,
+ struct damon_ctx *ctx)
{
int i;
+ struct damos *scheme;
+ struct damon_sysfs_scheme_regions *sysfs_regions;
- for (i = 0; i < sysfs_schemes->nr; i++)
- sysfs_schemes->schemes_arr[i]->tried_regions->upd_status =
- DAMOS_TRIED_REGIONS_UPD_IDLE;
+ for (i = 0; i < sysfs_schemes->nr; i++) {
+ sysfs_regions = sysfs_schemes->schemes_arr[i]->tried_regions;
+ scheme = damos_sysfs_nth_scheme(i, ctx);
+ if (!scheme) {
+ sysfs_regions->upd_status =
+ DAMOS_TRIED_REGIONS_UPD_FINISHED;
+ continue;
+ }
+ sysfs_regions->upd_status = DAMOS_TRIED_REGIONS_UPD_IDLE;
+ sysfs_regions->upd_timeout_jiffies = jiffies +
+ 2 * usecs_to_jiffies(scheme->apply_interval_us ?
+ scheme->apply_interval_us :
+ ctx->attrs.sample_interval);
+ }
}
/* Called from damon_sysfs_cmd_request_callback under damon_sysfs_lock */
{
damon_sysfs_schemes_clear_regions(sysfs_schemes, ctx);
damon_sysfs_schemes_for_damos_callback = sysfs_schemes;
- damos_tried_regions_init_upd_status(sysfs_schemes);
+ damos_tried_regions_init_upd_status(sysfs_schemes, ctx);
damos_regions_upd_total_bytes_only = total_bytes_only;
ctx->callback.before_damos_apply = damon_sysfs_before_damos_apply;
ctx->callback.after_sampling = damon_sysfs_after_sampling;
struct damon_ctx *ctx,
struct damon_sysfs_target *sys_target)
{
- int err;
+ int err = 0;
if (damon_target_has_pid(ctx)) {
err = damon_sysfs_update_target_pid(target, sys_target->pid);
damon_for_each_target_safe(t, next, ctx) {
if (i < sysfs_targets->nr) {
- damon_sysfs_update_target(t, ctx,
+ err = damon_sysfs_update_target(t, ctx,
sysfs_targets->targets_arr[i]);
+ if (err)
+ return err;
} else {
if (damon_target_has_pid(ctx))
put_pid(t->pid);
}
}
- if (pmd_none(*vmf->pmd))
+ if (pmd_none(*vmf->pmd) && vmf->prealloc_pte)
pmd_install(mm, vmf->pmd, &vmf->prealloc_pte);
return false;
* handled in the specific fault path, and it'll prohibit the
* fault-around logic.
*/
- if (!pte_none(vmf->pte[count]))
+ if (!pte_none(ptep_get(&vmf->pte[count])))
goto skip;
count++;
int nr = folio_nr_pages(folio);
xas_split(&xas, folio, folio_order(folio));
- if (folio_test_swapbacked(folio)) {
- __lruvec_stat_mod_folio(folio, NR_SHMEM_THPS,
- -nr);
- } else {
- __lruvec_stat_mod_folio(folio, NR_FILE_THPS,
- -nr);
- filemap_nr_thps_dec(mapping);
+ if (folio_test_pmd_mappable(folio)) {
+ if (folio_test_swapbacked(folio)) {
+ __lruvec_stat_mod_folio(folio,
+ NR_SHMEM_THPS, -nr);
+ } else {
+ __lruvec_stat_mod_folio(folio,
+ NR_FILE_THPS, -nr);
+ filemap_nr_thps_dec(mapping);
+ }
}
}
return (get_vma_private_data(vma) & flag) != 0;
}
+bool __vma_private_lock(struct vm_area_struct *vma)
+{
+ return !(vma->vm_flags & VM_MAYSHARE) &&
+ get_vma_private_data(vma) & ~HPAGE_RESV_MASK &&
+ is_vma_resv_set(vma, HPAGE_RESV_OWNER);
+}
+
void hugetlb_dup_vma_private(struct vm_area_struct *vma)
{
VM_BUG_ON_VMA(!is_vm_hugetlb_page(vma), vma);
if (!object) {
pr_warn("Cannot allocate a kmemleak_object structure\n");
kmemleak_disable();
+ return NULL;
}
- return object;
-}
-
-static int __link_object(struct kmemleak_object *object, unsigned long ptr,
- size_t size, int min_count, bool is_phys)
-{
-
- struct kmemleak_object *parent;
- struct rb_node **link, *rb_parent;
- unsigned long untagged_ptr;
- unsigned long untagged_objp;
-
INIT_LIST_HEAD(&object->object_list);
INIT_LIST_HEAD(&object->gray_list);
INIT_HLIST_HEAD(&object->area_list);
raw_spin_lock_init(&object->lock);
atomic_set(&object->use_count, 1);
- object->flags = OBJECT_ALLOCATED | (is_phys ? OBJECT_PHYS : 0);
- object->pointer = ptr;
- object->size = kfence_ksize((void *)ptr) ?: size;
object->excess_ref = 0;
- object->min_count = min_count;
object->count = 0; /* white color initially */
- object->jiffies = jiffies;
object->checksum = 0;
object->del_state = 0;
/* kernel backtrace */
object->trace_handle = set_track_prepare();
+ return object;
+}
+
+static int __link_object(struct kmemleak_object *object, unsigned long ptr,
+ size_t size, int min_count, bool is_phys)
+{
+
+ struct kmemleak_object *parent;
+ struct rb_node **link, *rb_parent;
+ unsigned long untagged_ptr;
+ unsigned long untagged_objp;
+
+ object->flags = OBJECT_ALLOCATED | (is_phys ? OBJECT_PHYS : 0);
+ object->pointer = ptr;
+ object->size = kfence_ksize((void *)ptr) ?: size;
+ object->min_count = min_count;
+ object->jiffies = jiffies;
+
untagged_ptr = (unsigned long)kasan_reset_tag((void *)ptr);
/*
* Only update min_addr and max_addr with object
void __ref kmemleak_update_trace(const void *ptr)
{
struct kmemleak_object *object;
+ depot_stack_handle_t trace_handle;
unsigned long flags;
pr_debug("%s(0x%px)\n", __func__, ptr);
return;
}
+ trace_handle = set_track_prepare();
raw_spin_lock_irqsave(&object->lock, flags);
- object->trace_handle = set_track_prepare();
+ object->trace_handle = trace_handle;
raw_spin_unlock_irqrestore(&object->lock, flags);
put_object(object);
page = pfn_swap_entry_to_page(entry);
}
/* return 1 if the page is an normal ksm page or KSM-placed zero page */
- ret = (page && PageKsm(page)) || is_ksm_zero_pte(*pte);
+ ret = (page && PageKsm(page)) || is_ksm_zero_pte(ptent);
pte_unmap_unlock(pte, ptl);
return ret;
}
struct folio *folio = NULL;
LIST_HEAD(folio_list);
bool pageout_anon_only_filter;
+ unsigned int batch_count = 0;
if (fatal_signal_pending(current))
return -EINTR;
regular_folio:
#endif
tlb_change_page_size(tlb, PAGE_SIZE);
+restart:
start_pte = pte = pte_offset_map_lock(vma->vm_mm, pmd, addr, &ptl);
if (!start_pte)
return 0;
for (; addr < end; pte++, addr += PAGE_SIZE) {
ptent = ptep_get(pte);
+ if (++batch_count == SWAP_CLUSTER_MAX) {
+ batch_count = 0;
+ if (need_resched()) {
+ pte_unmap_unlock(start_pte, ptl);
+ cond_resched();
+ goto restart;
+ }
+ }
+
if (pte_none(ptent))
continue;
* Moreover, it should not come from DMA buffer and is not readily
* reclaimable. So those GFP bits should be masked off.
*/
-#define OBJCGS_CLEAR_MASK (__GFP_DMA | __GFP_RECLAIMABLE | __GFP_ACCOUNT)
+#define OBJCGS_CLEAR_MASK (__GFP_DMA | __GFP_RECLAIMABLE | \
+ __GFP_ACCOUNT | __GFP_NOFAIL)
/*
* mod_objcg_mlstate() may be called with irq enabled, so
return NULL;
from_memcg:
+ objcg = NULL;
for (; !mem_cgroup_is_root(memcg); memcg = parent_mem_cgroup(memcg)) {
/*
* Memcg pointer is protected by scope (see set_active_memcg())
objcg = rcu_dereference_check(memcg->objcg, 1);
if (likely(objcg))
break;
- objcg = NULL;
}
return objcg;
continue;
} else {
/* We should have covered all the swap entry types */
+ pr_alert("unrecognized swap entry 0x%lx\n", entry.val);
WARN_ON_ONCE(1);
}
pte_clear_not_present_full(mm, addr, pte, tlb->fullmm);
kasan_remove_zero_shadow(__va(PFN_PHYS(pfn)), PFN_PHYS(nr_pages));
}
+/*
+ * Must be called with mem_hotplug_lock in write mode.
+ */
int __ref online_pages(unsigned long pfn, unsigned long nr_pages,
struct zone *zone, struct memory_group *group)
{
!IS_ALIGNED(pfn + nr_pages, PAGES_PER_SECTION)))
return -EINVAL;
- mem_hotplug_begin();
/* associate pfn range with the zone */
move_pfn_range_to_zone(zone, pfn, nr_pages, NULL, MIGRATE_ISOLATE);
writeback_set_ratelimit();
memory_notify(MEM_ONLINE, &arg);
- mem_hotplug_done();
return 0;
failed_addition:
(((unsigned long long) pfn + nr_pages) << PAGE_SHIFT) - 1);
memory_notify(MEM_CANCEL_ONLINE, &arg);
remove_pfn_range_from_zone(zone, pfn, nr_pages);
- mem_hotplug_done();
return ret;
}
/* create memory block devices after memory was added */
ret = create_memory_block_devices(start, size, params.altmap, group);
if (ret) {
- arch_remove_memory(start, size, NULL);
+ arch_remove_memory(start, size, params.altmap);
goto error_free;
}
return 0;
}
+/*
+ * Must be called with mem_hotplug_lock in write mode.
+ */
int __ref offline_pages(unsigned long start_pfn, unsigned long nr_pages,
struct zone *zone, struct memory_group *group)
{
!IS_ALIGNED(start_pfn + nr_pages, PAGES_PER_SECTION)))
return -EINVAL;
- mem_hotplug_begin();
-
/*
* Don't allow to offline memory blocks that contain holes.
* Consequently, memory blocks with holes can never get onlined
memory_notify(MEM_OFFLINE, &arg);
remove_pfn_range_from_zone(zone, start_pfn, nr_pages);
- mem_hotplug_done();
return 0;
failed_removal_isolated:
(unsigned long long) start_pfn << PAGE_SHIFT,
((unsigned long long) end_pfn << PAGE_SHIFT) - 1,
reason);
- mem_hotplug_done();
return ret;
}
*/
void folio_wait_stable(struct folio *folio)
{
- if (folio_inode(folio)->i_sb->s_iflags & SB_I_STABLE_WRITES)
+ if (mapping_stable_writes(folio_mapping(folio)))
folio_wait_writeback(folio);
}
EXPORT_SYMBOL_GPL(folio_wait_stable);
}
VM_BUG_ON_FOLIO(folio_test_writeback(folio),
folio);
- truncate_inode_folio(mapping, folio);
+
+ if (!folio_test_large(folio)) {
+ truncate_inode_folio(mapping, folio);
+ } else if (truncate_inode_partial_folio(folio, lstart, lend)) {
+ /*
+ * If we split a page, reset the loop so
+ * that we pick up the new sub pages.
+ * Otherwise the THP was entirely
+ * dropped or the target range was
+ * zeroed, so just continue the loop as
+ * is.
+ */
+ if (!folio_test_large(folio)) {
+ folio_unlock(folio);
+ index = start;
+ break;
+ }
+ }
}
folio_unlock(folio);
}
ret = -EEXIST;
/* Refuse to overwrite any PTE, even a PTE marker (e.g. UFFD WP). */
- if (!pte_none(*dst_pte))
+ if (!pte_none(ptep_get(dst_pte)))
goto out_unlock;
set_pte_at(dst_mm, dst_addr, dst_pte, _dst_pte);
static unsigned long mmap_base(unsigned long rnd, struct rlimit *rlim_stack)
{
+#ifdef CONFIG_STACK_GROWSUP
+ /*
+ * For an upwards growing stack the calculation is much simpler.
+ * Memory for the maximum stack size is reserved at the top of the
+ * task. mmap_base starts directly below the stack and grows
+ * downwards.
+ */
+ return PAGE_ALIGN_DOWN(mmap_upper_limit(rlim_stack) - rnd);
+#else
unsigned long gap = rlim_stack->rlim_cur;
unsigned long pad = stack_guard_gap;
gap = MAX_GAP;
return PAGE_ALIGN(STACK_TOP - gap - rnd);
+#endif
}
void arch_pick_mmap_layout(struct mm_struct *mm, struct rlimit *rlim_stack)
else
VM_WARN_ON_ONCE(true);
+ WRITE_ONCE(lruvec->lrugen.seg, seg);
+ WRITE_ONCE(lruvec->lrugen.gen, new);
+
hlist_nulls_del_rcu(&lruvec->lrugen.list);
if (op == MEMCG_LRU_HEAD || op == MEMCG_LRU_OLD)
pgdat->memcg_lru.nr_memcgs[old]--;
pgdat->memcg_lru.nr_memcgs[new]++;
- lruvec->lrugen.gen = new;
- WRITE_ONCE(lruvec->lrugen.seg, seg);
-
if (!pgdat->memcg_lru.nr_memcgs[old] && old == get_memcg_gen(pgdat->memcg_lru.seq))
WRITE_ONCE(pgdat->memcg_lru.seq, pgdat->memcg_lru.seq + 1);
gen = get_memcg_gen(pgdat->memcg_lru.seq);
+ lruvec->lrugen.gen = gen;
+
hlist_nulls_add_tail_rcu(&lruvec->lrugen.list, &pgdat->memcg_lru.fifo[gen][bin]);
pgdat->memcg_lru.nr_memcgs[gen]++;
- lruvec->lrugen.gen = gen;
-
spin_unlock_irq(&pgdat->memcg_lru.lock);
}
}
}
/* protected */
- if (tier > tier_idx) {
+ if (tier > tier_idx || refs == BIT(LRU_REFS_WIDTH)) {
int hist = lru_hist_from_seq(lrugen->min_seq[type]);
gen = folio_inc_gen(lruvec, folio, false);
}
/* try to scrape all its memory if this memcg was deleted */
- *nr_to_scan = mem_cgroup_online(memcg) ? (total >> sc->priority) : total;
+ if (!mem_cgroup_online(memcg)) {
+ *nr_to_scan = total;
+ return false;
+ }
+
+ *nr_to_scan = total >> sc->priority;
/*
* The aging tries to be lazy to reduce the overhead, while the eviction
DEFINE_MAX_SEQ(lruvec);
if (mem_cgroup_below_min(sc->target_mem_cgroup, memcg))
- return 0;
+ return -1;
if (!should_run_aging(lruvec, max_seq, sc, can_swap, &nr_to_scan))
return nr_to_scan;
return try_to_inc_max_seq(lruvec, max_seq, sc, can_swap, false) ? -1 : 0;
}
-static unsigned long get_nr_to_reclaim(struct scan_control *sc)
+static bool should_abort_scan(struct lruvec *lruvec, struct scan_control *sc)
{
+ int i;
+ enum zone_watermarks mark;
+
/* don't abort memcg reclaim to ensure fairness */
if (!root_reclaim(sc))
- return -1;
+ return false;
+
+ if (sc->nr_reclaimed >= max(sc->nr_to_reclaim, compact_gap(sc->order)))
+ return true;
+
+ /* check the order to exclude compaction-induced reclaim */
+ if (!current_is_kswapd() || sc->order)
+ return false;
+
+ mark = sysctl_numa_balancing_mode & NUMA_BALANCING_MEMORY_TIERING ?
+ WMARK_PROMO : WMARK_HIGH;
+
+ for (i = 0; i <= sc->reclaim_idx; i++) {
+ struct zone *zone = lruvec_pgdat(lruvec)->node_zones + i;
+ unsigned long size = wmark_pages(zone, mark) + MIN_LRU_BATCH;
+
+ if (managed_zone(zone) && !zone_watermark_ok(zone, 0, size, sc->reclaim_idx, 0))
+ return false;
+ }
- return max(sc->nr_to_reclaim, compact_gap(sc->order));
+ /* kswapd should abort if all eligible zones are safe */
+ return true;
}
static bool try_to_shrink_lruvec(struct lruvec *lruvec, struct scan_control *sc)
{
long nr_to_scan;
unsigned long scanned = 0;
- unsigned long nr_to_reclaim = get_nr_to_reclaim(sc);
int swappiness = get_swappiness(lruvec, sc);
/* clean file folios are more likely to exist */
if (scanned >= nr_to_scan)
break;
- if (sc->nr_reclaimed >= nr_to_reclaim)
+ if (should_abort_scan(lruvec, sc))
break;
cond_resched();
}
- /* whether try_to_inc_max_seq() was successful */
+ /* whether this lruvec should be rotated */
return nr_to_scan < 0;
}
bool success;
unsigned long scanned = sc->nr_scanned;
unsigned long reclaimed = sc->nr_reclaimed;
- int seg = lru_gen_memcg_seg(lruvec);
struct mem_cgroup *memcg = lruvec_memcg(lruvec);
struct pglist_data *pgdat = lruvec_pgdat(lruvec);
- /* see the comment on MEMCG_NR_GENS */
- if (!lruvec_is_sizable(lruvec, sc))
- return seg != MEMCG_LRU_TAIL ? MEMCG_LRU_TAIL : MEMCG_LRU_YOUNG;
-
mem_cgroup_calculate_protection(NULL, memcg);
if (mem_cgroup_below_min(NULL, memcg))
if (mem_cgroup_below_low(NULL, memcg)) {
/* see the comment on MEMCG_NR_GENS */
- if (seg != MEMCG_LRU_TAIL)
+ if (lru_gen_memcg_seg(lruvec) != MEMCG_LRU_TAIL)
return MEMCG_LRU_TAIL;
memcg_memory_event(memcg, MEMCG_LOW);
flush_reclaim_state(sc);
- return success ? MEMCG_LRU_YOUNG : 0;
+ if (success && mem_cgroup_online(memcg))
+ return MEMCG_LRU_YOUNG;
+
+ if (!success && lruvec_is_sizable(lruvec, sc))
+ return 0;
+
+ /* one retry if offlined or too small */
+ return lru_gen_memcg_seg(lruvec) != MEMCG_LRU_TAIL ?
+ MEMCG_LRU_TAIL : MEMCG_LRU_YOUNG;
}
#ifdef CONFIG_MEMCG
struct lruvec *lruvec;
struct lru_gen_folio *lrugen;
struct mem_cgroup *memcg;
- const struct hlist_nulls_node *pos;
- unsigned long nr_to_reclaim = get_nr_to_reclaim(sc);
+ struct hlist_nulls_node *pos;
+ gen = get_memcg_gen(READ_ONCE(pgdat->memcg_lru.seq));
bin = first_bin = get_random_u32_below(MEMCG_NR_BINS);
restart:
op = 0;
memcg = NULL;
- gen = get_memcg_gen(READ_ONCE(pgdat->memcg_lru.seq));
rcu_read_lock();
}
mem_cgroup_put(memcg);
+ memcg = NULL;
+
+ if (gen != READ_ONCE(lrugen->gen))
+ continue;
lruvec = container_of(lrugen, struct lruvec, lrugen);
memcg = lruvec_memcg(lruvec);
rcu_read_lock();
- if (sc->nr_reclaimed >= nr_to_reclaim)
+ if (should_abort_scan(lruvec, sc))
break;
}
mem_cgroup_put(memcg);
- if (sc->nr_reclaimed >= nr_to_reclaim)
+ if (!is_a_nulls(pos))
return;
/* restart if raced with lru_gen_rotate_memcg() */
if (sc->priority != DEF_PRIORITY || sc->nr_to_reclaim < MIN_LRU_BATCH)
return;
/*
- * Determine the initial priority based on ((total / MEMCG_NR_GENS) >>
- * priority) * reclaimed_to_scanned_ratio = nr_to_reclaim, where the
- * estimated reclaimed_to_scanned_ratio = inactive / total.
+ * Determine the initial priority based on
+ * (total >> priority) * reclaimed_to_scanned_ratio = nr_to_reclaim,
+ * where reclaimed_to_scanned_ratio = inactive / total.
*/
reclaimable = node_page_state(pgdat, NR_INACTIVE_FILE);
if (get_swappiness(lruvec, sc))
reclaimable += node_page_state(pgdat, NR_INACTIVE_ANON);
- reclaimable /= MEMCG_NR_GENS;
-
/* round down reclaimable and round up sc->nr_to_reclaim */
priority = fls_long(reclaimable) - 1 - fls_long(sc->nr_to_reclaim - 1);
* 1. For pages accessed through page tables, hotter pages pushed out
* hot pages which refaulted immediately.
* 2. For pages accessed multiple times through file descriptors,
- * numbers of accesses might have been out of the range.
+ * they would have been protected by sort_folio().
*/
- if (lru_gen_in_fault() || refs == BIT(LRU_REFS_WIDTH)) {
- folio_set_workingset(folio);
+ if (lru_gen_in_fault() || refs >= BIT(LRU_REFS_WIDTH) - 1) {
+ set_mask_bits(&folio->flags, 0, LRU_REFS_MASK | BIT(PG_workingset));
mod_lruvec_state(lruvec, WORKINGSET_RESTORE_BASE + type, delta);
}
unlock:
return 0;
list_for_each_entry(vid_info, &vlan_info->vid_list, list) {
+ if (!vlan_hw_filter_capable(by_dev, vid_info->proto))
+ continue;
err = vlan_vid_add(dev, vid_info->proto, vid_info->vid);
if (err)
goto unwind;
list_for_each_entry_continue_reverse(vid_info,
&vlan_info->vid_list,
list) {
+ if (!vlan_hw_filter_capable(by_dev, vid_info->proto))
+ continue;
vlan_vid_del(dev, vid_info->proto, vid_info->vid);
}
if (!vlan_info)
return;
- list_for_each_entry(vid_info, &vlan_info->vid_list, list)
+ list_for_each_entry(vid_info, &vlan_info->vid_list, list) {
+ if (!vlan_hw_filter_capable(by_dev, vid_info->proto))
+ continue;
vlan_vid_del(dev, vid_info->proto, vid_info->vid);
+ }
}
EXPORT_SYMBOL(vlan_vids_del_by_dev);
uint16_t *nwname = va_arg(ap, uint16_t *);
char ***wnames = va_arg(ap, char ***);
+ *wnames = NULL;
+
errcode = p9pdu_readf(pdu, proto_version,
"w", nwname);
if (!errcode) {
GFP_NOFS);
if (!*wnames)
errcode = -ENOMEM;
+ else
+ (*wnames)[0] = NULL;
}
if (!errcode) {
proto_version,
"s",
&(*wnames)[i]);
- if (errcode)
+ if (errcode) {
+ (*wnames)[i] = NULL;
break;
+ }
}
}
if (*wnames) {
int i;
- for (i = 0; i < *nwname; i++)
+ for (i = 0; i < *nwname; i++) {
+ if (!(*wnames)[i])
+ break;
kfree((*wnames)[i]);
+ }
+ kfree(*wnames);
+ *wnames = NULL;
}
- kfree(*wnames);
- *wnames = NULL;
}
}
break;
break;
}
case TIOCINQ: {
- /*
- * These two are safe on a single CPU system as only
- * user tasks fiddle here
- */
- struct sk_buff *skb = skb_peek(&sk->sk_receive_queue);
+ struct sk_buff *skb;
long amount = 0;
+ spin_lock_irq(&sk->sk_receive_queue.lock);
+ skb = skb_peek(&sk->sk_receive_queue);
if (skb)
amount = skb->len - sizeof(struct ddpehdr);
+ spin_unlock_irq(&sk->sk_receive_queue.lock);
rc = put_user(amount, (int __user *)argp);
break;
}
case SIOCINQ:
{
struct sk_buff *skb;
+ int amount;
if (sock->state != SS_CONNECTED) {
error = -EINVAL;
goto done;
}
+ spin_lock_irq(&sk->sk_receive_queue.lock);
skb = skb_peek(&sk->sk_receive_queue);
- error = put_user(skb ? skb->len : 0,
- (int __user *)argp) ? -EFAULT : 0;
+ amount = skb ? skb->len : 0;
+ spin_unlock_irq(&sk->sk_receive_queue.lock);
+ error = put_user(amount, (int __user *)argp) ? -EFAULT : 0;
goto done;
}
case ATM_SETSC:
if (flags & MSG_OOB)
return -EOPNOTSUPP;
+ lock_sock(sk);
+
skb = skb_recv_datagram(sk, flags, &err);
if (!skb) {
if (sk->sk_shutdown & RCV_SHUTDOWN)
- return 0;
+ err = 0;
+ release_sock(sk);
return err;
}
skb_free_datagram(sk, skb);
+ release_sock(sk);
+
if (flags & MSG_TRUNC)
copied = skblen;
{
struct hci_rp_read_class_of_dev *rp = data;
+ if (WARN_ON(!hdev))
+ return HCI_ERROR_UNSPECIFIED;
+
bt_dev_dbg(hdev, "status 0x%2.2x", rp->status);
if (rp->status)
} else {
conn->enc_key_size = rp->key_size;
status = 0;
+
+ if (conn->enc_key_size < hdev->min_enc_key_size) {
+ /* As slave role, the conn->state has been set to
+ * BT_CONNECTED and l2cap conn req might not be received
+ * yet, at this moment the l2cap layer almost does
+ * nothing with the non-zero status.
+ * So we also clear encrypt related bits, and then the
+ * handler of l2cap conn req will get the right secure
+ * state at a later time.
+ */
+ status = HCI_ERROR_AUTH_FAILURE;
+ clear_bit(HCI_CONN_ENCRYPT, &conn->flags);
+ clear_bit(HCI_CONN_AES_CCM, &conn->flags);
+ }
}
- hci_encrypt_cfm(conn, 0);
+ hci_encrypt_cfm(conn, status);
done:
hci_dev_unlock(hdev);
if (!rp->status)
conn->auth_payload_timeout = get_unaligned_le16(sent + 2);
- hci_encrypt_cfm(conn, 0);
-
unlock:
hci_dev_unlock(hdev);
return;
}
- set_bit(HCI_INQUIRY, &hdev->flags);
+ if (hci_sent_cmd_data(hdev, HCI_OP_INQUIRY))
+ set_bit(HCI_INQUIRY, &hdev->flags);
}
static void hci_cs_create_conn(struct hci_dev *hdev, __u8 status)
cp.handle = cpu_to_le16(conn->handle);
cp.timeout = cpu_to_le16(hdev->auth_payload_timeout);
if (hci_send_cmd(conn->hdev, HCI_OP_WRITE_AUTH_PAYLOAD_TO,
- sizeof(cp), &cp)) {
+ sizeof(cp), &cp))
bt_dev_err(hdev, "write auth payload timeout failed");
- goto notify;
- }
-
- goto unlock;
}
notify:
kfree_skb(skb);
}
+static inline void l2cap_sig_send_rej(struct l2cap_conn *conn, u16 ident)
+{
+ struct l2cap_cmd_rej_unk rej;
+
+ rej.reason = cpu_to_le16(L2CAP_REJ_NOT_UNDERSTOOD);
+ l2cap_send_cmd(conn, ident, L2CAP_COMMAND_REJ, sizeof(rej), &rej);
+}
+
static inline void l2cap_sig_channel(struct l2cap_conn *conn,
struct sk_buff *skb)
{
if (len > skb->len || !cmd->ident) {
BT_DBG("corrupted command");
+ l2cap_sig_send_rej(conn, cmd->ident);
break;
}
err = l2cap_bredr_sig_cmd(conn, cmd, len, skb->data);
if (err) {
- struct l2cap_cmd_rej_unk rej;
-
BT_ERR("Wrong link type (%d)", err);
-
- rej.reason = cpu_to_le16(L2CAP_REJ_NOT_UNDERSTOOD);
- l2cap_send_cmd(conn, cmd->ident, L2CAP_COMMAND_REJ,
- sizeof(rej), &rej);
+ l2cap_sig_send_rej(conn, cmd->ident);
}
skb_pull(skb, len);
}
+ if (skb->len > 0) {
+ BT_DBG("corrupted command");
+ l2cap_sig_send_rej(conn, 0);
+ }
+
drop:
kfree_skb(skb);
}
for (i = 0; i < key_count; i++) {
struct mgmt_link_key_info *key = &cp->keys[i];
- if (key->addr.type != BDADDR_BREDR || key->type > 0x08)
+ /* Considering SMP over BREDR/LE, there is no need to check addr_type */
+ if (key->type > 0x08)
return mgmt_cmd_status(sk, hdev->id,
MGMT_OP_LOAD_LINK_KEYS,
MGMT_STATUS_INVALID_PARAMS);
for (i = 0; i < irk_count; i++) {
struct mgmt_irk_info *irk = &cp->irks[i];
+ u8 addr_type = le_addr_type(irk->addr.type);
if (hci_is_blocked_key(hdev,
HCI_BLOCKED_KEY_TYPE_IRK,
continue;
}
+ /* When using SMP over BR/EDR, the addr type should be set to BREDR */
+ if (irk->addr.type == BDADDR_BREDR)
+ addr_type = BDADDR_BREDR;
+
hci_add_irk(hdev, &irk->addr.bdaddr,
- le_addr_type(irk->addr.type), irk->val,
+ addr_type, irk->val,
BDADDR_ANY);
}
for (i = 0; i < key_count; i++) {
struct mgmt_ltk_info *key = &cp->keys[i];
u8 type, authenticated;
+ u8 addr_type = le_addr_type(key->addr.type);
if (hci_is_blocked_key(hdev,
HCI_BLOCKED_KEY_TYPE_LTK,
continue;
}
+ /* When using SMP over BR/EDR, the addr type should be set to BREDR */
+ if (key->addr.type == BDADDR_BREDR)
+ addr_type = BDADDR_BREDR;
+
hci_add_ltk(hdev, &key->addr.bdaddr,
- le_addr_type(key->addr.type), type, authenticated,
+ addr_type, type, authenticated,
key->val, key->enc_size, key->ediv, key->rand);
}
ev.store_hint = persistent;
bacpy(&ev.key.addr.bdaddr, &key->bdaddr);
- ev.key.addr.type = BDADDR_BREDR;
+ ev.key.addr.type = link_to_bdaddr(key->link_type, key->bdaddr_type);
ev.key.type = key->type;
memcpy(ev.key.val, key->val, HCI_LINK_KEY_SIZE);
ev.key.pin_len = key->pin_len;
ev.store_hint = persistent;
bacpy(&ev.key.addr.bdaddr, &key->bdaddr);
- ev.key.addr.type = link_to_bdaddr(LE_LINK, key->bdaddr_type);
+ ev.key.addr.type = link_to_bdaddr(key->link_type, key->bdaddr_type);
ev.key.type = mgmt_ltk_type(key);
ev.key.enc_size = key->enc_size;
ev.key.ediv = key->ediv;
bacpy(&ev.rpa, &irk->rpa);
bacpy(&ev.irk.addr.bdaddr, &irk->bdaddr);
- ev.irk.addr.type = link_to_bdaddr(LE_LINK, irk->addr_type);
+ ev.irk.addr.type = link_to_bdaddr(irk->link_type, irk->addr_type);
memcpy(ev.irk.val, irk->val, sizeof(irk->val));
mgmt_event(MGMT_EV_NEW_IRK, hdev, &ev, sizeof(ev), NULL);
ev.store_hint = persistent;
bacpy(&ev.key.addr.bdaddr, &csrk->bdaddr);
- ev.key.addr.type = link_to_bdaddr(LE_LINK, csrk->bdaddr_type);
+ ev.key.addr.type = link_to_bdaddr(csrk->link_type, csrk->bdaddr_type);
ev.key.type = csrk->type;
memcpy(ev.key.val, csrk->val, sizeof(csrk->val));
}
if (smp->remote_irk) {
+ smp->remote_irk->link_type = hcon->type;
mgmt_new_irk(hdev, smp->remote_irk, persistent);
/* Now that user space can be considered to know the
}
if (smp->csrk) {
+ smp->csrk->link_type = hcon->type;
smp->csrk->bdaddr_type = hcon->dst_type;
bacpy(&smp->csrk->bdaddr, &hcon->dst);
mgmt_new_csrk(hdev, smp->csrk, persistent);
}
if (smp->responder_csrk) {
+ smp->responder_csrk->link_type = hcon->type;
smp->responder_csrk->bdaddr_type = hcon->dst_type;
bacpy(&smp->responder_csrk->bdaddr, &hcon->dst);
mgmt_new_csrk(hdev, smp->responder_csrk, persistent);
}
if (smp->ltk) {
+ smp->ltk->link_type = hcon->type;
smp->ltk->bdaddr_type = hcon->dst_type;
bacpy(&smp->ltk->bdaddr, &hcon->dst);
mgmt_new_ltk(hdev, smp->ltk, persistent);
}
if (smp->responder_ltk) {
+ smp->responder_ltk->link_type = hcon->type;
smp->responder_ltk->bdaddr_type = hcon->dst_type;
bacpy(&smp->responder_ltk->bdaddr, &hcon->dst);
mgmt_new_ltk(hdev, smp->responder_ltk, persistent);
key = hci_add_link_key(hdev, smp->conn->hcon, &hcon->dst,
smp->link_key, type, 0, &persistent);
if (key) {
+ key->link_type = hcon->type;
+ key->bdaddr_type = hcon->dst_type;
mgmt_new_link_key(hdev, key, persistent);
/* Don't keep debug keys around if the relevant
ktime_t tstamp = skb->tstamp;
struct ip_frag_state state;
struct iphdr *iph;
- int err;
+ int err = 0;
/* for offloaded checksums cleanup checksum before fragmentation */
if (skb->ip_summed == CHECKSUM_PARTIAL &&
if (i == max_netdevices)
return -ENFILE;
- snprintf(res, IFNAMSIZ, name, i);
+ /* 'res' and 'name' could overlap, use 'buf' as an intermediate buffer */
+ strscpy(buf, name, IFNAMSIZ);
+ snprintf(res, IFNAMSIZ, buf, i);
return i;
}
if (gso_segs > READ_ONCE(dev->gso_max_segs))
return features & ~NETIF_F_GSO_MASK;
+ if (unlikely(skb->len >= READ_ONCE(dev->gso_max_size)))
+ return features & ~NETIF_F_GSO_MASK;
+
if (!skb_shinfo(skb)->gso_type) {
skb_warn_bad_offload(skb);
return features & ~NETIF_F_GSO_MASK;
}
EXPORT_SYMBOL(netif_tx_stop_all_queues);
+static int netdev_do_alloc_pcpu_stats(struct net_device *dev)
+{
+ void __percpu *v;
+
+ /* Drivers implementing ndo_get_peer_dev must support tstat
+ * accounting, so that skb_do_redirect() can bump the dev's
+ * RX stats upon network namespace switch.
+ */
+ if (dev->netdev_ops->ndo_get_peer_dev &&
+ dev->pcpu_stat_type != NETDEV_PCPU_STAT_TSTATS)
+ return -EOPNOTSUPP;
+
+ switch (dev->pcpu_stat_type) {
+ case NETDEV_PCPU_STAT_NONE:
+ return 0;
+ case NETDEV_PCPU_STAT_LSTATS:
+ v = dev->lstats = netdev_alloc_pcpu_stats(struct pcpu_lstats);
+ break;
+ case NETDEV_PCPU_STAT_TSTATS:
+ v = dev->tstats = netdev_alloc_pcpu_stats(struct pcpu_sw_netstats);
+ break;
+ case NETDEV_PCPU_STAT_DSTATS:
+ v = dev->dstats = netdev_alloc_pcpu_stats(struct pcpu_dstats);
+ break;
+ default:
+ return -EINVAL;
+ }
+
+ return v ? 0 : -ENOMEM;
+}
+
+static void netdev_do_free_pcpu_stats(struct net_device *dev)
+{
+ switch (dev->pcpu_stat_type) {
+ case NETDEV_PCPU_STAT_NONE:
+ return;
+ case NETDEV_PCPU_STAT_LSTATS:
+ free_percpu(dev->lstats);
+ break;
+ case NETDEV_PCPU_STAT_TSTATS:
+ free_percpu(dev->tstats);
+ break;
+ case NETDEV_PCPU_STAT_DSTATS:
+ free_percpu(dev->dstats);
+ break;
+ }
+}
+
/**
* register_netdevice() - register a network device
* @dev: device to register
goto err_uninit;
}
+ ret = netdev_do_alloc_pcpu_stats(dev);
+ if (ret)
+ goto err_uninit;
+
ret = dev_index_reserve(net, dev->ifindex);
if (ret < 0)
- goto err_uninit;
+ goto err_free_pcpu;
dev->ifindex = ret;
/* Transfer changeable features to wanted_features and enable
call_netdevice_notifiers(NETDEV_PRE_UNINIT, dev);
err_ifindex_release:
dev_index_release(net, dev->ifindex);
+err_free_pcpu:
+ netdev_do_free_pcpu_stats(dev);
err_uninit:
if (dev->netdev_ops->ndo_uninit)
dev->netdev_ops->ndo_uninit(dev);
WARN_ON(rcu_access_pointer(dev->ip_ptr));
WARN_ON(rcu_access_pointer(dev->ip6_ptr));
+ netdev_do_free_pcpu_stats(dev);
if (dev->priv_destructor)
dev->priv_destructor(dev);
if (dev->needs_free_netdev)
}
static const struct genl_multicast_group dropmon_mcgrps[] = {
- { .name = "events", },
+ { .name = "events", .cap_sys_admin = 1 },
};
static void send_dm_alert(struct work_struct *work)
.cmd = NET_DM_CMD_START,
.validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP,
.doit = net_dm_cmd_trace,
+ .flags = GENL_ADMIN_PERM,
},
{
.cmd = NET_DM_CMD_STOP,
.validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP,
.doit = net_dm_cmd_trace,
+ .flags = GENL_ADMIN_PERM,
},
{
.cmd = NET_DM_CMD_CONFIG_GET,
#include <net/xdp.h>
#include <net/mptcp.h>
#include <net/netfilter/nf_conntrack_bpf.h>
+#include <net/netkit.h>
#include <linux/un.h>
#include "dev.h"
DEFINE_PER_CPU(struct bpf_redirect_info, bpf_redirect_info);
EXPORT_PER_CPU_SYMBOL_GPL(bpf_redirect_info);
+static struct net_device *skb_get_peer_dev(struct net_device *dev)
+{
+ const struct net_device_ops *ops = dev->netdev_ops;
+
+ if (likely(ops->ndo_get_peer_dev))
+ return INDIRECT_CALL_1(ops->ndo_get_peer_dev,
+ netkit_peer_dev, dev);
+ return NULL;
+}
+
int skb_do_redirect(struct sk_buff *skb)
{
struct bpf_redirect_info *ri = this_cpu_ptr(&bpf_redirect_info);
if (unlikely(!dev))
goto out_drop;
if (flags & BPF_F_PEER) {
- const struct net_device_ops *ops = dev->netdev_ops;
-
- if (unlikely(!ops->ndo_get_peer_dev ||
- !skb_at_tc_ingress(skb)))
+ if (unlikely(!skb_at_tc_ingress(skb)))
goto out_drop;
- dev = ops->ndo_get_peer_dev(dev);
+ dev = skb_get_peer_dev(dev);
if (unlikely(!dev ||
!(dev->flags & IFF_UP) ||
net_eq(net, dev_net(dev))))
goto out_drop;
skb->dev = dev;
+ dev_sw_netstats_rx_add(dev, skb->len);
return -EAGAIN;
}
return flags & BPF_F_NEIGH ?
return 0;
}
+static void sk_msg_reset_curr(struct sk_msg *msg)
+{
+ u32 i = msg->sg.start;
+ u32 len = 0;
+
+ do {
+ len += sk_msg_elem(msg, i)->length;
+ sk_msg_iter_var_next(i);
+ if (len >= msg->sg.size)
+ break;
+ } while (i != msg->sg.end);
+
+ msg->sg.curr = i;
+ msg->sg.copybreak = 0;
+}
+
static const struct bpf_func_proto bpf_msg_cork_bytes_proto = {
.func = bpf_msg_cork_bytes,
.gpl_only = false,
msg->sg.end - shift + NR_MSG_FRAG_IDS :
msg->sg.end - shift;
out:
+ sk_msg_reset_curr(msg);
msg->data = sg_virt(&msg->sg.data[first_sge]) + start - offset;
msg->data_end = msg->data + bytes;
return 0;
msg->sg.data[new] = rsge;
}
+ sk_msg_reset_curr(msg);
sk_msg_compute_data_pointers(msg);
return 0;
}
sk_mem_uncharge(msg->sk, len - pop);
msg->sg.size -= (len - pop);
+ sk_msg_reset_curr(msg);
sk_msg_compute_data_pointers(msg);
return 0;
}
}
if (tcase->frag_skbs) {
- unsigned int total_size = 0, total_true_size = 0, alloc_size = 0;
+ unsigned int total_size = 0, total_true_size = 0;
struct sk_buff *frag_skb, *prev = NULL;
- page = alloc_page(GFP_KERNEL);
- KUNIT_ASSERT_NOT_NULL(test, page);
- page_ref_add(page, tcase->nr_frag_skbs - 1);
-
for (i = 0; i < tcase->nr_frag_skbs; i++) {
unsigned int frag_size;
+ page = alloc_page(GFP_KERNEL);
+ KUNIT_ASSERT_NOT_NULL(test, page);
+
frag_size = tcase->frag_skbs[i];
- frag_skb = build_skb(page_address(page) + alloc_size,
+ frag_skb = build_skb(page_address(page),
frag_size + shinfo_size);
KUNIT_ASSERT_NOT_NULL(test, frag_skb);
__skb_put(frag_skb, frag_size);
total_size += frag_size;
total_true_size += frag_skb->truesize;
- alloc_size += frag_size + shinfo_size;
}
- KUNIT_ASSERT_LE(test, alloc_size, PAGE_SIZE);
-
skb->len += total_size;
skb->data_len += total_size;
skb->truesize += total_true_size;
{
int max_clean = atomic_read(&tbl->gc_entries) -
READ_ONCE(tbl->gc_thresh2);
+ u64 tmax = ktime_get_ns() + NSEC_PER_MSEC;
unsigned long tref = jiffies - 5 * HZ;
struct neighbour *n, *tmp;
int shrunk = 0;
+ int loop = 0;
NEIGH_CACHE_STAT_INC(tbl, forced_gc_runs);
shrunk++;
if (shrunk >= max_clean)
break;
+ if (++loop == 16) {
+ if (ktime_get_ns() > tmax)
+ goto unlock;
+ loop = 0;
+ }
}
}
WRITE_ONCE(tbl->last_flush, jiffies);
-
+unlock:
write_unlock_bh(&tbl->lock);
return shrunk;
#include <linux/nsproxy.h>
#include <linux/slab.h>
#include <linux/errqueue.h>
+#include <linux/io_uring.h>
#include <linux/uaccess.h>
if (fd < 0 || !(file = fget_raw(fd)))
return -EBADF;
+ /* don't allow io_uring files */
+ if (io_uring_get_socket(file)) {
+ fput(file);
+ return -EINVAL;
+ }
*fpp++ = file;
fpl->count++;
}
/* GSO partial only requires that we trim off any excess that
* doesn't fit into an MSS sized block, so take care of that
* now.
+ * Cap len to not accidentally hit GSO_BY_FRAGS.
*/
- partial_segs = len / mss;
+ partial_segs = min(len, GSO_BY_FRAGS - 1) / mss;
if (partial_segs > 1)
mss *= partial_segs;
else
static void skb_extensions_init(void)
{
BUILD_BUG_ON(SKB_EXT_NUM >= 8);
+#if !IS_ENABLED(CONFIG_KCOV_INSTRUMENT_ALL)
BUILD_BUG_ON(skb_ext_total_length() > 255);
+#endif
skbuff_ext_cache = kmem_cache_create("skbuff_ext_cache",
SKB_EXT_ALIGN_VALUE * skb_ext_total_length(),
if (psock->sk_redir)
sock_put(psock->sk_redir);
+ if (psock->sk_pair)
+ sock_put(psock->sk_pair);
sock_put(psock->sk);
kfree(psock);
}
{
if (sk_is_tcp(sk))
return (1 << sk->sk_state) & (TCPF_ESTABLISHED | TCPF_LISTEN);
+ if (sk_is_stream_unix(sk))
+ return (1 << sk->sk_state) & TCPF_ESTABLISHED;
return true;
}
remove_wait_queue(sk_sleep(sk), &wait);
sk->sk_write_pending--;
} while (!done);
- return 0;
+ return done < 0 ? done : 0;
}
EXPORT_SYMBOL(sk_stream_wait_connect);
static int
dns_resolver_preparse(struct key_preparsed_payload *prep)
{
+ const struct dns_server_list_v1_header *v1;
const struct dns_payload_header *bin;
struct user_key_payload *upayload;
unsigned long derrno;
return -EINVAL;
}
+ v1 = (const struct dns_server_list_v1_header *)bin;
+ if ((v1->status != DNS_LOOKUP_GOOD &&
+ v1->status != DNS_LOOKUP_GOOD_WITH_BAD)) {
+ if (prep->expiry == TIME64_MAX)
+ prep->expiry = ktime_get_real_seconds() + 1;
+ }
+
result_len = datalen;
goto store_result;
}
struct key_type key_type_dns_resolver = {
.name = "dns_resolver",
- .flags = KEY_TYPE_NET_DOMAIN,
+ .flags = KEY_TYPE_NET_DOMAIN | KEY_TYPE_INSTANT_REAP,
.preparse = dns_resolver_preparse,
.free_preparse = dns_resolver_free_preparse,
.instantiate = generic_key_instantiate,
ret = skb->len;
break;
}
+ ret = 0;
}
rtnl_unlock();
if (unlikely(!pskb_may_pull(skb, total_pull)))
return NULL;
+ ifehdr = (struct ifeheadr *)(skb->data + skb->dev->hard_header_len);
skb_set_mac_header(skb, total_pull);
__skb_pull(skb, total_pull);
*metalen = ifehdrln - IFE_METAHDRLEN;
int tv = get_random_u32_below(max_delay);
im->tm_running = 1;
- if (!mod_timer(&im->timer, jiffies+tv+2))
- refcount_inc(&im->refcnt);
+ if (refcount_inc_not_zero(&im->refcnt)) {
+ if (mod_timer(&im->timer, jiffies + tv + 2))
+ ip_ma_put(im);
+ }
}
static void igmp_gq_start_timer(struct in_device *in_dev)
module_init(inet_diag_init);
module_exit(inet_diag_exit);
MODULE_LICENSE("GPL");
+MODULE_DESCRIPTION("INET/INET6: socket monitoring via SOCK_DIAG");
MODULE_ALIAS_NET_PF_PROTO_TYPE(PF_NETLINK, NETLINK_SOCK_DIAG, 2 /* AF_INET */);
MODULE_ALIAS_NET_PF_PROTO_TYPE(PF_NETLINK, NETLINK_SOCK_DIAG, 10 /* AF_INET6 */);
if (err)
goto unlock;
}
+ sock_set_flag(sk, SOCK_RCU_FREE);
if (IS_ENABLED(CONFIG_IPV6) && sk->sk_reuseport &&
sk->sk_family == AF_INET6)
__sk_nulls_add_node_tail_rcu(sk, &ilb2->nulls_head);
else
__sk_nulls_add_node_rcu(sk, &ilb2->nulls_head);
- sock_set_flag(sk, SOCK_RCU_FREE);
sock_prot_inuse_add(sock_net(sk), sk->sk_prot, 1);
unlock:
spin_unlock(&ilb2->lock);
}
if (dev->header_ops) {
+ int pull_len = tunnel->hlen + sizeof(struct iphdr);
+
if (skb_cow_head(skb, 0))
goto free_skb;
tnl_params = (const struct iphdr *)skb->data;
- /* Pull skb since ip_tunnel_xmit() needs skb->data pointing
- * to gre header.
- */
- skb_pull(skb, tunnel->hlen + sizeof(struct iphdr));
+ if (!pskb_network_may_pull(skb, pull_len))
+ goto free_skb;
+
+ /* ip_tunnel_xmit() needs skb->data pointing to gre header. */
+ skb_pull(skb, pull_len);
skb_reset_mac_header(skb);
if (skb->ip_summed == CHECKSUM_PARTIAL &&
module_init(raw_diag_init);
module_exit(raw_diag_exit);
MODULE_LICENSE("GPL");
+MODULE_DESCRIPTION("RAW socket monitoring via SOCK_DIAG");
MODULE_ALIAS_NET_PF_PROTO_TYPE(PF_NETLINK, NETLINK_SOCK_DIAG, 2-255 /* AF_INET - IPPROTO_RAW */);
MODULE_ALIAS_NET_PF_PROTO_TYPE(PF_NETLINK, NETLINK_SOCK_DIAG, 10-255 /* AF_INET6 - IPPROTO_RAW */);
goto reject_redirect;
}
- n = __ipv4_neigh_lookup(rt->dst.dev, new_gw);
+ n = __ipv4_neigh_lookup(rt->dst.dev, (__force u32)new_gw);
if (!n)
n = neigh_create(&arp_tbl, &new_gw, rt->dst.dev);
if (!IS_ERR(n)) {
return -EINVAL;
tp->window_clamp = 0;
} else {
- tp->window_clamp = val < SOCK_MIN_RCVBUF / 2 ?
- SOCK_MIN_RCVBUF / 2 : val;
- tp->rcv_ssthresh = min(tp->rcv_wnd, tp->window_clamp);
+ u32 new_rcv_ssthresh, old_window_clamp = tp->window_clamp;
+ u32 new_window_clamp = val < SOCK_MIN_RCVBUF / 2 ?
+ SOCK_MIN_RCVBUF / 2 : val;
+
+ if (new_window_clamp == old_window_clamp)
+ return 0;
+
+ tp->window_clamp = new_window_clamp;
+ if (new_window_clamp < old_window_clamp) {
+ /* need to apply the reserved mem provisioning only
+ * when shrinking the window clamp
+ */
+ __tcp_adjust_rcv_ssthresh(sk, tp->window_clamp);
+
+ } else {
+ new_rcv_ssthresh = min(tp->rcv_wnd, tp->window_clamp);
+ tp->rcv_ssthresh = max(new_rcv_ssthresh,
+ tp->rcv_ssthresh);
+ }
}
return 0;
}
break;
case TCP_AO_REPAIR:
+ if (!tcp_can_repair_sock(sk)) {
+ err = -EPERM;
+ break;
+ }
err = tcp_ao_set_repair(sk, optval, optlen);
break;
#ifdef CONFIG_TCP_AO
}
#endif
case TCP_AO_REPAIR:
+ if (!tcp_can_repair_sock(sk))
+ return -EPERM;
return tcp_ao_get_repair(sk, optval, optlen);
case TCP_AO_GET_KEYS:
case TCP_AO_INFO: {
const struct tcp_ao_hdr *aoh;
struct tcp_ao_key *key;
- treq->maclen = 0;
+ treq->used_tcp_ao = false;
if (tcp_parse_auth_options(th, NULL, &aoh) || !aoh)
return;
treq->ao_rcv_next = aoh->keyid;
treq->ao_keyid = aoh->rnext_keyid;
- treq->maclen = tcp_ao_maclen(key);
+ treq->used_tcp_ao = true;
}
static enum skb_drop_reason
ao_info->current_key = key;
if (!ao_info->rnext_key)
ao_info->rnext_key = key;
- tp->tcp_header_len += tcp_ao_len(key);
+ tp->tcp_header_len += tcp_ao_len_aligned(key);
ao_info->lisn = htonl(tp->write_seq);
ao_info->snd_sne = 0;
syn_tcp_option_space -= TCPOLEN_MSS_ALIGNED;
syn_tcp_option_space -= TCPOLEN_TSTAMP_ALIGNED;
syn_tcp_option_space -= TCPOLEN_WSCALE_ALIGNED;
- if (tcp_ao_len(key) > syn_tcp_option_space) {
+ if (tcp_ao_len_aligned(key) > syn_tcp_option_space) {
err = -EMSGSIZE;
goto err_kfree;
}
if (!dev || !l3index)
return -EINVAL;
+ if (!bound_dev_if || bound_dev_if != cmd.ifindex) {
+ /* tcp_ao_established_key() doesn't expect having
+ * non peer-matching key on an established TCP-AO
+ * connection.
+ */
+ if (!((1 << sk->sk_state) & (TCPF_LISTEN | TCPF_CLOSE)))
+ return -EINVAL;
+ }
+
/* It's still possible to bind after adding keys or even
* re-bind to a different dev (with CAP_NET_RAW).
* So, no reason to return error here, rather try to be
module_init(tcp_diag_init);
module_exit(tcp_diag_exit);
MODULE_LICENSE("GPL");
+MODULE_DESCRIPTION("TCP socket monitoring via SOCK_DIAG");
MODULE_ALIAS_NET_PF_PROTO_TYPE(PF_NETLINK, NETLINK_SOCK_DIAG, 2-6 /* AF_INET - IPPROTO_TCP */);
* then we can probably ignore it.
*/
if (before(ack, prior_snd_una)) {
+ u32 max_window;
+
+ /* do not accept ACK for bytes we never sent. */
+ max_window = min_t(u64, tp->max_window, tp->bytes_acked);
/* RFC 5961 5.2 [Blind Data Injection Attack].[Mitigation] */
- if (before(ack, prior_snd_una - tp->max_window)) {
+ if (before(ack, prior_snd_una - max_window)) {
if (!(flag & FLAG_NO_CHALLENGE_ACK))
tcp_send_challenge_ack(sk);
return -SKB_DROP_REASON_TCP_TOO_OLD_ACK;
* up to bandwidth of 18Gigabit/sec. 8) ]
*/
+/* Estimates max number of increments of remote peer TSval in
+ * a replay window (based on our current RTO estimation).
+ */
+static u32 tcp_tsval_replay(const struct sock *sk)
+{
+ /* If we use usec TS resolution,
+ * then expect the remote peer to use the same resolution.
+ */
+ if (tcp_sk(sk)->tcp_usec_ts)
+ return inet_csk(sk)->icsk_rto * (USEC_PER_SEC / HZ);
+
+ /* RFC 7323 recommends a TSval clock between 1ms and 1sec.
+ * We know that some OS (including old linux) can use 1200 Hz.
+ */
+ return inet_csk(sk)->icsk_rto * 1200 / HZ;
+}
+
static int tcp_disordered_ack(const struct sock *sk, const struct sk_buff *skb)
{
const struct tcp_sock *tp = tcp_sk(sk);
u32 seq = TCP_SKB_CB(skb)->seq;
u32 ack = TCP_SKB_CB(skb)->ack_seq;
- return (/* 1. Pure ACK with correct sequence number. */
+ return /* 1. Pure ACK with correct sequence number. */
(th->ack && seq == TCP_SKB_CB(skb)->end_seq && seq == tp->rcv_nxt) &&
/* 2. ... and duplicate ACK. */
!tcp_may_update_window(tp, ack, seq, ntohs(th->window) << tp->rx_opt.snd_wscale) &&
/* 4. ... and sits in replay window. */
- (s32)(tp->rx_opt.ts_recent - tp->rx_opt.rcv_tsval) <= (inet_csk(sk)->icsk_rto * 1024) / HZ);
+ (s32)(tp->rx_opt.ts_recent - tp->rx_opt.rcv_tsval) <=
+ tcp_tsval_replay(sk);
}
static inline bool tcp_paws_discard(const struct sock *sk,
if (tcp_parse_auth_options(tcp_hdr(skb), NULL, &aoh))
goto drop_and_release; /* Invalid TCP options */
if (aoh) {
- tcp_rsk(req)->maclen = aoh->length - sizeof(struct tcp_ao_hdr);
+ tcp_rsk(req)->used_tcp_ao = true;
tcp_rsk(req)->ao_rcv_next = aoh->keyid;
tcp_rsk(req)->ao_keyid = aoh->rnext_keyid;
+
} else {
- tcp_rsk(req)->maclen = 0;
+ tcp_rsk(req)->used_tcp_ao = false;
}
#endif
tcp_rsk(req)->snt_isn = isn;
reply_options[0] = htonl((TCPOPT_AO << 24) | (tcp_ao_len(key) << 16) |
(aoh->rnext_keyid << 8) | keyid);
- arg->iov[0].iov_len += round_up(tcp_ao_len(key), 4);
+ arg->iov[0].iov_len += tcp_ao_len_aligned(key);
reply->doff = arg->iov[0].iov_len / 4;
if (tcp_ao_hash_hdr(AF_INET, (char *)&reply_options[1],
(tcp_ao_len(key->ao_key) << 16) |
(key->ao_key->sndid << 8) |
key->rcv_next);
- arg.iov[0].iov_len += round_up(tcp_ao_len(key->ao_key), 4);
+ arg.iov[0].iov_len += tcp_ao_len_aligned(key->ao_key);
rep.th.doff = arg.iov[0].iov_len / 4;
tcp_ao_hash_hdr(AF_INET, (char *)&rep.opt[offset],
ao_key = treq->af_specific->ao_lookup(sk, req,
tcp_rsk(req)->ao_keyid, -1);
if (ao_key)
- newtp->tcp_header_len += tcp_ao_len(ao_key);
+ newtp->tcp_header_len += tcp_ao_len_aligned(ao_key);
#endif
if (skb->len >= TCP_MSS_DEFAULT + newtp->tcp_header_len)
newicsk->icsk_ack.last_seg_size = skb->len - newtp->tcp_header_len;
timestamps = READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_timestamps);
if (tcp_key_is_ao(key)) {
opts->options |= OPTION_AO;
- remaining -= tcp_ao_len(key->ao_key);
+ remaining -= tcp_ao_len_aligned(key->ao_key);
}
}
ireq->tstamp_ok &= !ireq->sack_ok;
} else if (tcp_key_is_ao(key)) {
opts->options |= OPTION_AO;
- remaining -= tcp_ao_len(key->ao_key);
+ remaining -= tcp_ao_len_aligned(key->ao_key);
ireq->tstamp_ok &= !ireq->sack_ok;
}
size += TCPOLEN_MD5SIG_ALIGNED;
} else if (tcp_key_is_ao(key)) {
opts->options |= OPTION_AO;
- size += tcp_ao_len(key->ao_key);
+ size += tcp_ao_len_aligned(key->ao_key);
}
if (likely(tp->rx_opt.tstamp_ok)) {
if (skb_still_in_host_queue(sk, skb))
return -EBUSY;
+start:
if (before(TCP_SKB_CB(skb)->seq, tp->snd_una)) {
+ if (unlikely(TCP_SKB_CB(skb)->tcp_flags & TCPHDR_SYN)) {
+ TCP_SKB_CB(skb)->tcp_flags &= ~TCPHDR_SYN;
+ TCP_SKB_CB(skb)->seq++;
+ goto start;
+ }
if (unlikely(before(TCP_SKB_CB(skb)->end_seq, tp->snd_una))) {
WARN_ON_ONCE(1);
return -EINVAL;
if (tcp_rsk_used_ao(req)) {
#ifdef CONFIG_TCP_AO
struct tcp_ao_key *ao_key = NULL;
- u8 maclen = tcp_rsk(req)->maclen;
u8 keyid = tcp_rsk(req)->ao_keyid;
ao_key = tcp_sk(sk)->af_specific->ao_lookup(sk, req_to_sk(req),
* for another peer-matching key, but the peer has requested
* ao_keyid (RFC5925 RNextKeyID), so let's keep it simple here.
*/
- if (unlikely(!ao_key || tcp_ao_maclen(ao_key) != maclen)) {
- u8 key_maclen = ao_key ? tcp_ao_maclen(ao_key) : 0;
-
+ if (unlikely(!ao_key)) {
rcu_read_unlock();
kfree_skb(skb);
- net_warn_ratelimited("TCP-AO: the keyid %u with maclen %u|%u from SYN packet is not present - not sending SYNACK\n",
- keyid, maclen, key_maclen);
+ net_warn_ratelimited("TCP-AO: the keyid %u from SYN packet is not present - not sending SYNACK\n",
+ keyid);
return NULL;
}
key.ao_key = ao_key;
module_init(udp_diag_init);
module_exit(udp_diag_exit);
MODULE_LICENSE("GPL");
+MODULE_DESCRIPTION("UDP socket monitoring via SOCK_DIAG");
MODULE_ALIAS_NET_PF_PROTO_TYPE(PF_NETLINK, NETLINK_SOCK_DIAG, 2-17 /* AF_INET - IPPROTO_UDP */);
MODULE_ALIAS_NET_PF_PROTO_TYPE(PF_NETLINK, NETLINK_SOCK_DIAG, 2-136 /* AF_INET - IPPROTO_UDPLITE */);
pmsg->prefix_len = pinfo->prefix_len;
pmsg->prefix_type = pinfo->type;
pmsg->prefix_pad3 = 0;
- pmsg->prefix_flags = 0;
- if (pinfo->onlink)
- pmsg->prefix_flags |= IF_PREFIX_ONLINK;
- if (pinfo->autoconf)
- pmsg->prefix_flags |= IF_PREFIX_AUTOCONF;
+ pmsg->prefix_flags = pinfo->flags;
if (nla_put(skb, PREFIX_ADDRESS, sizeof(pinfo->prefix), &pinfo->prefix))
goto nla_put_failure;
INIT_LIST_HEAD(&f6i->fib6_siblings);
refcount_set(&f6i->fib6_ref, 1);
- INIT_HLIST_NODE(&f6i->gc_link);
-
return f6i;
}
net->ipv6.fib6_null_entry);
table->tb6_root.fn_flags = RTN_ROOT | RTN_TL_ROOT | RTN_RTINFO;
inet_peer_base_init(&table->tb6_peers);
- INIT_HLIST_HEAD(&table->tb6_gc_hlist);
}
return table;
lockdep_is_held(&table->tb6_lock));
}
}
-
- fib6_clean_expires_locked(rt);
}
/*
if (!(iter->fib6_flags & RTF_EXPIRES))
return -EEXIST;
if (!(rt->fib6_flags & RTF_EXPIRES))
- fib6_clean_expires_locked(iter);
+ fib6_clean_expires(iter);
else
- fib6_set_expires_locked(iter,
- rt->expires);
+ fib6_set_expires(iter, rt->expires);
if (rt->fib6_pmtu)
fib6_metric_set(iter, RTAX_MTU,
if (rt->nh)
list_add(&rt->nh_list, &rt->nh->f6i_list);
__fib6_update_sernum_upto_root(rt, fib6_new_sernum(info->nl_net));
-
- if (fib6_has_expires(rt))
- hlist_add_head(&rt->gc_link, &table->tb6_gc_hlist);
-
fib6_start_gc(info->nl_net, rt);
}
if (!pn_leaf && !(pn->fn_flags & RTN_RTINFO)) {
pn_leaf = fib6_find_prefix(info->nl_net, table,
pn);
-#if RT6_DEBUG >= 2
- if (!pn_leaf) {
- WARN_ON(!pn_leaf);
+ if (!pn_leaf)
pn_leaf =
info->nl_net->ipv6.fib6_null_entry;
- }
-#endif
fib6_info_hold(pn_leaf);
rcu_assign_pointer(pn->leaf, pn_leaf);
}
* Garbage collection
*/
-static int fib6_age(struct fib6_info *rt, struct fib6_gc_args *gc_args)
+static int fib6_age(struct fib6_info *rt, void *arg)
{
+ struct fib6_gc_args *gc_args = arg;
unsigned long now = jiffies;
/*
* Routes are expired even if they are in use.
*/
- if (fib6_has_expires(rt) && rt->expires) {
+ if (rt->fib6_flags & RTF_EXPIRES && rt->expires) {
if (time_after(now, rt->expires)) {
RT6_TRACE("expiring %p\n", rt);
return -1;
return 0;
}
-static void fib6_gc_table(struct net *net,
- struct fib6_table *tb6,
- struct fib6_gc_args *gc_args)
-{
- struct fib6_info *rt;
- struct hlist_node *n;
- struct nl_info info = {
- .nl_net = net,
- .skip_notify = false,
- };
-
- hlist_for_each_entry_safe(rt, n, &tb6->tb6_gc_hlist, gc_link)
- if (fib6_age(rt, gc_args) == -1)
- fib6_del(rt, &info);
-}
-
-static void fib6_gc_all(struct net *net, struct fib6_gc_args *gc_args)
-{
- struct fib6_table *table;
- struct hlist_head *head;
- unsigned int h;
-
- rcu_read_lock();
- for (h = 0; h < FIB6_TABLE_HASHSZ; h++) {
- head = &net->ipv6.fib_table_hash[h];
- hlist_for_each_entry_rcu(table, head, tb6_hlist) {
- spin_lock_bh(&table->tb6_lock);
- fib6_gc_table(net, table, gc_args);
- spin_unlock_bh(&table->tb6_lock);
- }
- }
- rcu_read_unlock();
-}
-
void fib6_run_gc(unsigned long expires, struct net *net, bool force)
{
struct fib6_gc_args gc_args;
net->ipv6.sysctl.ip6_rt_gc_interval;
gc_args.more = 0;
- fib6_gc_all(net, &gc_args);
+ fib6_clean_all(net, fib6_age, &gc_args);
now = jiffies;
net->ipv6.ip6_rt_last_gc = now;
rt->dst_nocount = true;
if (cfg->fc_flags & RTF_EXPIRES)
- fib6_set_expires_locked(rt, jiffies +
- clock_t_to_jiffies(cfg->fc_expires));
+ fib6_set_expires(rt, jiffies +
+ clock_t_to_jiffies(cfg->fc_expires));
else
- fib6_clean_expires_locked(rt);
+ fib6_clean_expires(rt);
if (cfg->fc_protocol == RTPROT_UNSPEC)
cfg->fc_protocol = RTPROT_BOOT;
if (tcp_key_is_md5(key))
tot_len += TCPOLEN_MD5SIG_ALIGNED;
if (tcp_key_is_ao(key))
- tot_len += tcp_ao_len(key->ao_key);
+ tot_len += tcp_ao_len_aligned(key->ao_key);
#ifdef CONFIG_MPTCP
if (rst && !tcp_key_is_md5(key)) {
config MAC80211_DEBUGFS
bool "Export mac80211 internals in DebugFS"
- depends on MAC80211 && DEBUG_FS
+ depends on MAC80211 && CFG80211_DEBUGFS
help
Select this to see extensive information about
the internal state of mac80211 in debugfs.
lockdep_is_held(&local->hw.wiphy->mtx));
/*
- * If there are no changes, then accept a link that doesn't exist,
+ * If there are no changes, then accept a link that exist,
* unless it's a new link.
*/
- if (params->link_id < 0 && !new_link &&
+ if (params->link_id >= 0 && !new_link &&
!params->link_mac && !params->txpwr_set &&
!params->supported_rates_len &&
!params->ht_capa && !params->vht_capa &&
#include "debugfs_netdev.h"
#include "driver-ops.h"
+struct ieee80211_if_read_sdata_data {
+ ssize_t (*format)(const struct ieee80211_sub_if_data *, char *, int);
+ struct ieee80211_sub_if_data *sdata;
+};
+
+static ssize_t ieee80211_if_read_sdata_handler(struct wiphy *wiphy,
+ struct file *file,
+ char *buf,
+ size_t bufsize,
+ void *data)
+{
+ struct ieee80211_if_read_sdata_data *d = data;
+
+ return d->format(d->sdata, buf, bufsize);
+}
+
static ssize_t ieee80211_if_read_sdata(
- struct ieee80211_sub_if_data *sdata,
+ struct file *file,
char __user *userbuf,
size_t count, loff_t *ppos,
ssize_t (*format)(const struct ieee80211_sub_if_data *sdata, char *, int))
{
+ struct ieee80211_sub_if_data *sdata = file->private_data;
+ struct ieee80211_if_read_sdata_data data = {
+ .format = format,
+ .sdata = sdata,
+ };
char buf[200];
- ssize_t ret = -EINVAL;
- wiphy_lock(sdata->local->hw.wiphy);
- ret = (*format)(sdata, buf, sizeof(buf));
- wiphy_unlock(sdata->local->hw.wiphy);
+ return wiphy_locked_debugfs_read(sdata->local->hw.wiphy,
+ file, buf, sizeof(buf),
+ userbuf, count, ppos,
+ ieee80211_if_read_sdata_handler,
+ &data);
+}
- if (ret >= 0)
- ret = simple_read_from_buffer(userbuf, count, ppos, buf, ret);
+struct ieee80211_if_write_sdata_data {
+ ssize_t (*write)(struct ieee80211_sub_if_data *, const char *, int);
+ struct ieee80211_sub_if_data *sdata;
+};
+
+static ssize_t ieee80211_if_write_sdata_handler(struct wiphy *wiphy,
+ struct file *file,
+ char *buf,
+ size_t count,
+ void *data)
+{
+ struct ieee80211_if_write_sdata_data *d = data;
- return ret;
+ return d->write(d->sdata, buf, count);
}
static ssize_t ieee80211_if_write_sdata(
- struct ieee80211_sub_if_data *sdata,
+ struct file *file,
const char __user *userbuf,
size_t count, loff_t *ppos,
ssize_t (*write)(struct ieee80211_sub_if_data *sdata, const char *, int))
{
+ struct ieee80211_sub_if_data *sdata = file->private_data;
+ struct ieee80211_if_write_sdata_data data = {
+ .write = write,
+ .sdata = sdata,
+ };
char buf[64];
- ssize_t ret;
- if (count >= sizeof(buf))
- return -E2BIG;
+ return wiphy_locked_debugfs_write(sdata->local->hw.wiphy,
+ file, buf, sizeof(buf),
+ userbuf, count,
+ ieee80211_if_write_sdata_handler,
+ &data);
+}
- if (copy_from_user(buf, userbuf, count))
- return -EFAULT;
- buf[count] = '\0';
+struct ieee80211_if_read_link_data {
+ ssize_t (*format)(const struct ieee80211_link_data *, char *, int);
+ struct ieee80211_link_data *link;
+};
- wiphy_lock(sdata->local->hw.wiphy);
- ret = (*write)(sdata, buf, count);
- wiphy_unlock(sdata->local->hw.wiphy);
+static ssize_t ieee80211_if_read_link_handler(struct wiphy *wiphy,
+ struct file *file,
+ char *buf,
+ size_t bufsize,
+ void *data)
+{
+ struct ieee80211_if_read_link_data *d = data;
- return ret;
+ return d->format(d->link, buf, bufsize);
}
static ssize_t ieee80211_if_read_link(
- struct ieee80211_link_data *link,
+ struct file *file,
char __user *userbuf,
size_t count, loff_t *ppos,
ssize_t (*format)(const struct ieee80211_link_data *link, char *, int))
{
+ struct ieee80211_link_data *link = file->private_data;
+ struct ieee80211_if_read_link_data data = {
+ .format = format,
+ .link = link,
+ };
char buf[200];
- ssize_t ret = -EINVAL;
- wiphy_lock(link->sdata->local->hw.wiphy);
- ret = (*format)(link, buf, sizeof(buf));
- wiphy_unlock(link->sdata->local->hw.wiphy);
+ return wiphy_locked_debugfs_read(link->sdata->local->hw.wiphy,
+ file, buf, sizeof(buf),
+ userbuf, count, ppos,
+ ieee80211_if_read_link_handler,
+ &data);
+}
+
+struct ieee80211_if_write_link_data {
+ ssize_t (*write)(struct ieee80211_link_data *, const char *, int);
+ struct ieee80211_link_data *link;
+};
- if (ret >= 0)
- ret = simple_read_from_buffer(userbuf, count, ppos, buf, ret);
+static ssize_t ieee80211_if_write_link_handler(struct wiphy *wiphy,
+ struct file *file,
+ char *buf,
+ size_t count,
+ void *data)
+{
+ struct ieee80211_if_write_sdata_data *d = data;
- return ret;
+ return d->write(d->sdata, buf, count);
}
static ssize_t ieee80211_if_write_link(
- struct ieee80211_link_data *link,
+ struct file *file,
const char __user *userbuf,
size_t count, loff_t *ppos,
ssize_t (*write)(struct ieee80211_link_data *link, const char *, int))
{
+ struct ieee80211_link_data *link = file->private_data;
+ struct ieee80211_if_write_link_data data = {
+ .write = write,
+ .link = link,
+ };
char buf[64];
- ssize_t ret;
-
- if (count >= sizeof(buf))
- return -E2BIG;
-
- if (copy_from_user(buf, userbuf, count))
- return -EFAULT;
- buf[count] = '\0';
-
- wiphy_lock(link->sdata->local->hw.wiphy);
- ret = (*write)(link, buf, count);
- wiphy_unlock(link->sdata->local->hw.wiphy);
- return ret;
+ return wiphy_locked_debugfs_write(link->sdata->local->hw.wiphy,
+ file, buf, sizeof(buf),
+ userbuf, count,
+ ieee80211_if_write_link_handler,
+ &data);
}
#define IEEE80211_IF_FMT(name, type, field, format_string) \
char __user *userbuf, \
size_t count, loff_t *ppos) \
{ \
- return ieee80211_if_read_sdata(file->private_data, \
+ return ieee80211_if_read_sdata(file, \
userbuf, count, ppos, \
ieee80211_if_fmt_##name); \
}
const char __user *userbuf, \
size_t count, loff_t *ppos) \
{ \
- return ieee80211_if_write_sdata(file->private_data, userbuf, \
+ return ieee80211_if_write_sdata(file, userbuf, \
count, ppos, \
ieee80211_if_parse_##name); \
}
char __user *userbuf, \
size_t count, loff_t *ppos) \
{ \
- return ieee80211_if_read_link(file->private_data, \
+ return ieee80211_if_read_link(file, \
userbuf, count, ppos, \
ieee80211_if_fmt_##name); \
}
const char __user *userbuf, \
size_t count, loff_t *ppos) \
{ \
- return ieee80211_if_write_link(file->private_data, userbuf, \
+ return ieee80211_if_write_link(file, userbuf, \
count, ppos, \
ieee80211_if_parse_##name); \
}
STA_OPS_RW(aql);
-static ssize_t sta_agg_status_read(struct file *file, char __user *userbuf,
- size_t count, loff_t *ppos)
+static ssize_t sta_agg_status_do_read(struct wiphy *wiphy, struct file *file,
+ char *buf, size_t bufsz, void *data)
{
- char *buf, *p;
- ssize_t bufsz = 71 + IEEE80211_NUM_TIDS * 40;
+ struct sta_info *sta = data;
+ char *p = buf;
int i;
- struct sta_info *sta = file->private_data;
struct tid_ampdu_rx *tid_rx;
struct tid_ampdu_tx *tid_tx;
- ssize_t ret;
-
- buf = kzalloc(bufsz, GFP_KERNEL);
- if (!buf)
- return -ENOMEM;
- p = buf;
-
- rcu_read_lock();
p += scnprintf(p, bufsz + buf - p, "next dialog_token: %#02x\n",
sta->ampdu_mlme.dialog_token_allocator + 1);
for (i = 0; i < IEEE80211_NUM_TIDS; i++) {
bool tid_rx_valid;
- tid_rx = rcu_dereference(sta->ampdu_mlme.tid_rx[i]);
- tid_tx = rcu_dereference(sta->ampdu_mlme.tid_tx[i]);
+ tid_rx = wiphy_dereference(wiphy, sta->ampdu_mlme.tid_rx[i]);
+ tid_tx = wiphy_dereference(wiphy, sta->ampdu_mlme.tid_tx[i]);
tid_rx_valid = test_bit(i, sta->ampdu_mlme.agg_session_valid);
p += scnprintf(p, bufsz + buf - p, "%02d", i);
tid_tx ? skb_queue_len(&tid_tx->pending) : 0);
p += scnprintf(p, bufsz + buf - p, "\n");
}
- rcu_read_unlock();
- ret = simple_read_from_buffer(userbuf, count, ppos, buf, p - buf);
+ return p - buf;
+}
+
+static ssize_t sta_agg_status_read(struct file *file, char __user *userbuf,
+ size_t count, loff_t *ppos)
+{
+ struct sta_info *sta = file->private_data;
+ struct wiphy *wiphy = sta->local->hw.wiphy;
+ size_t bufsz = 71 + IEEE80211_NUM_TIDS * 40;
+ char *buf = kmalloc(bufsz, GFP_KERNEL);
+ ssize_t ret;
+
+ if (!buf)
+ return -ENOMEM;
+
+ ret = wiphy_locked_debugfs_read(wiphy, file, buf, bufsz,
+ userbuf, count, ppos,
+ sta_agg_status_do_read, sta);
kfree(buf);
+
return ret;
}
-static ssize_t sta_agg_status_write(struct file *file, const char __user *userbuf,
- size_t count, loff_t *ppos)
+static ssize_t sta_agg_status_do_write(struct wiphy *wiphy, struct file *file,
+ char *buf, size_t count, void *data)
{
- char _buf[25] = {}, *buf = _buf;
- struct sta_info *sta = file->private_data;
+ struct sta_info *sta = data;
bool start, tx;
unsigned long tid;
- char *pos;
+ char *pos = buf;
int ret, timeout = 5000;
- if (count > sizeof(_buf))
- return -EINVAL;
-
- if (copy_from_user(buf, userbuf, count))
- return -EFAULT;
-
- buf[sizeof(_buf) - 1] = '\0';
- pos = buf;
buf = strsep(&pos, " ");
if (!buf)
return -EINVAL;
if (ret || tid >= IEEE80211_NUM_TIDS)
return -EINVAL;
- wiphy_lock(sta->local->hw.wiphy);
if (tx) {
if (start)
ret = ieee80211_start_tx_ba_session(&sta->sta, tid,
3, true);
ret = 0;
}
- wiphy_unlock(sta->local->hw.wiphy);
return ret ?: count;
}
+
+static ssize_t sta_agg_status_write(struct file *file,
+ const char __user *userbuf,
+ size_t count, loff_t *ppos)
+{
+ struct sta_info *sta = file->private_data;
+ struct wiphy *wiphy = sta->local->hw.wiphy;
+ char _buf[26];
+
+ return wiphy_locked_debugfs_write(wiphy, file, _buf, sizeof(_buf),
+ userbuf, count,
+ sta_agg_status_do_write, sta);
+}
STA_OPS_RW(agg_status);
/* link sta attributes */
// SPDX-License-Identifier: GPL-2.0-only
/*
* Copyright 2015 Intel Deutschland GmbH
- * Copyright (C) 2022 Intel Corporation
+ * Copyright (C) 2022-2023 Intel Corporation
*/
#include <net/mac80211.h>
#include "ieee80211_i.h"
if (ret)
return ret;
+ /* during reconfig don't add it to debugfs again */
+ if (local->in_reconfig)
+ return 0;
+
for_each_set_bit(link_id, &links_to_add, IEEE80211_MLD_MAX_NUM_LINKS) {
link_sta = rcu_dereference_protected(info->link[link_id],
lockdep_is_held(&local->hw.wiphy->mtx));
static inline struct ieee80211_sub_if_data *
get_bss_sdata(struct ieee80211_sub_if_data *sdata)
{
- if (sdata->vif.type == NL80211_IFTYPE_AP_VLAN)
+ if (sdata && sdata->vif.type == NL80211_IFTYPE_AP_VLAN)
sdata = container_of(sdata->bss, struct ieee80211_sub_if_data,
u.ap);
struct ieee80211_sub_if_data *sdata,
u32 queues, bool drop)
{
- struct ieee80211_vif *vif = sdata ? &sdata->vif : NULL;
+ struct ieee80211_vif *vif;
might_sleep();
lockdep_assert_wiphy(local->hw.wiphy);
+ sdata = get_bss_sdata(sdata);
+ vif = sdata ? &sdata->vif : NULL;
+
if (sdata && !check_sdata_in_driver(sdata))
return;
might_sleep();
lockdep_assert_wiphy(local->hw.wiphy);
+ sdata = get_bss_sdata(sdata);
+
if (sdata && !check_sdata_in_driver(sdata))
return;
case NL80211_CHAN_WIDTH_80:
case NL80211_CHAN_WIDTH_80P80:
case NL80211_CHAN_WIDTH_160:
+ case NL80211_CHAN_WIDTH_320:
bw = ht_cap.cap & IEEE80211_HT_CAP_SUP_WIDTH_20_40 ?
IEEE80211_STA_RX_BW_40 : IEEE80211_STA_RX_BW_20;
break;
case WLAN_SP_MESH_PEERING_OPEN:
if (!matches_local)
event = OPN_RJCT;
- if (!mesh_plink_free_count(sdata) ||
- (sta->mesh->plid && sta->mesh->plid != plid))
+ else if (!mesh_plink_free_count(sdata) ||
+ (sta->mesh->plid && sta->mesh->plid != plid))
event = OPN_IGNR;
else
event = OPN_ACPT;
case WLAN_SP_MESH_PEERING_CONFIRM:
if (!matches_local)
event = CNF_RJCT;
- if (!mesh_plink_free_count(sdata) ||
- sta->mesh->llid != llid ||
- (sta->mesh->plid && sta->mesh->plid != plid))
+ else if (!mesh_plink_free_count(sdata) ||
+ sta->mesh->llid != llid ||
+ (sta->mesh->plid && sta->mesh->plid != plid))
event = CNF_IGNR;
else
event = CNF_ACPT;
return;
}
elems = ieee802_11_parse_elems(baseaddr, len - baselen, true, NULL);
- mesh_process_plink_frame(sdata, mgmt, elems, rx_status);
- kfree(elems);
+ if (elems) {
+ mesh_process_plink_frame(sdata, mgmt, elems, rx_status);
+ kfree(elems);
+ }
}
{
const struct ieee80211_multi_link_elem *ml;
const struct element *sub;
- size_t ml_len;
+ ssize_t ml_len;
unsigned long removed_links = 0;
u16 link_removal_timeout[IEEE80211_MLD_MAX_NUM_LINKS] = {};
u8 link_id;
elems->scratch + elems->scratch_len -
elems->scratch_pos,
WLAN_EID_FRAGMENT);
+ if (ml_len < 0)
+ return;
elems->ml_reconf = (const void *)elems->scratch_pos;
elems->ml_reconf_len = ml_len;
kunit_test_suite(mptcp_crypto_suite);
MODULE_LICENSE("GPL");
+MODULE_DESCRIPTION("KUnit tests for MPTCP Crypto");
module_init(mptcp_diag_init);
module_exit(mptcp_diag_exit);
MODULE_LICENSE("GPL");
+MODULE_DESCRIPTION("MPTCP socket monitoring via SOCK_DIAG");
MODULE_ALIAS_NET_PF_PROTO_TYPE(PF_NETLINK, NETLINK_SOCK_DIAG, 2-262 /* AF_INET - IPPROTO_MPTCP */);
mp_opt->suboptions |= OPTION_MPTCP_DSS;
mp_opt->use_map = 1;
mp_opt->mpc_map = 1;
+ mp_opt->use_ack = 0;
mp_opt->data_len = get_unaligned_be16(ptr);
ptr += 2;
}
struct mptcp_pm_addr_entry *entry;
list_for_each_entry(entry, rm_list, list) {
- remove_anno_list_by_saddr(msk, &entry->addr);
- if (alist.nr < MPTCP_RM_IDS_MAX)
+ if ((remove_anno_list_by_saddr(msk, &entry->addr) ||
+ lookup_subflow_by_saddr(&msk->conn_list, &entry->addr)) &&
+ alist.nr < MPTCP_RM_IDS_MAX)
alist.ids[alist.nr++] = entry->addr.id;
}
mptcp_do_fallback(ssk);
}
+#define MPTCP_MAX_GSO_SIZE (GSO_LEGACY_MAX_SIZE - (MAX_TCP_HEADER + 1))
+
static int mptcp_sendmsg_frag(struct sock *sk, struct sock *ssk,
struct mptcp_data_frag *dfrag,
struct mptcp_sendmsg_info *info)
return -EAGAIN;
/* compute send limit */
+ if (unlikely(ssk->sk_gso_max_size > MPTCP_MAX_GSO_SIZE))
+ ssk->sk_gso_max_size = MPTCP_MAX_GSO_SIZE;
info->mss_now = tcp_send_mss(ssk, &info->size_goal, info->flags);
copy = info->size_goal;
if (__test_and_clear_bit(MPTCP_CLEAN_UNA, &msk->cb_flags))
__mptcp_clean_una_wakeup(sk);
if (unlikely(msk->cb_flags)) {
- /* be sure to set the current sk state before tacking actions
- * depending on sk_state, that is processing MPTCP_ERROR_REPORT
+ /* be sure to sync the msk state before taking actions
+ * depending on sk_state (MPTCP_ERROR_REPORT)
+ * On sk release avoid actions depending on the first subflow
*/
- if (__test_and_clear_bit(MPTCP_CONNECTED, &msk->cb_flags))
- __mptcp_set_connected(sk);
+ if (__test_and_clear_bit(MPTCP_SYNC_STATE, &msk->cb_flags) && msk->first)
+ __mptcp_sync_state(sk, msk->pending_state);
if (__test_and_clear_bit(MPTCP_ERROR_REPORT, &msk->cb_flags))
__mptcp_error_report(sk);
if (__test_and_clear_bit(MPTCP_SYNC_SNDBUF, &msk->cb_flags))
#define MPTCP_ERROR_REPORT 3
#define MPTCP_RETRANSMIT 4
#define MPTCP_FLUSH_JOIN_LIST 5
-#define MPTCP_CONNECTED 6
+#define MPTCP_SYNC_STATE 6
#define MPTCP_SYNC_SNDBUF 7
struct mptcp_skb_cb {
bool use_64bit_ack; /* Set when we received a 64-bit DSN */
bool csum_enabled;
bool allow_infinite_fallback;
+ u8 pending_state; /* A subflow asked to set this sk_state,
+ * protected by the msk data lock
+ */
u8 mpc_endpoint_id;
u8 recvmsg_inq:1,
cork:1,
struct mptcp_options_received *mp_opt);
void mptcp_finish_connect(struct sock *sk);
-void __mptcp_set_connected(struct sock *sk);
+void __mptcp_sync_state(struct sock *sk, int state);
void mptcp_reset_tout_timer(struct mptcp_sock *msk, unsigned long fail_tout);
static inline void mptcp_stop_tout_timer(struct sock *sk)
{
struct mptcp_subflow_context *subflow = mptcp_subflow_ctx(sk);
- return sk->sk_state == TCP_ESTABLISHED &&
+ return (1 << sk->sk_state) & (TCPF_ESTABLISHED | TCPF_FIN_WAIT1) &&
is_active_ssk(subflow) &&
!subflow->conn_finished;
}
val = READ_ONCE(inet_sk(sk)->tos);
mptcp_for_each_subflow(msk, subflow) {
struct sock *ssk = mptcp_subflow_tcp_sock(subflow);
+ bool slow;
+ slow = lock_sock_fast(ssk);
__ip_sock_set_tos(ssk, val);
+ unlock_sock_fast(ssk, slow);
}
release_sock(sk);
return inet_sk(sk)->inet_dport != inet_sk((struct sock *)msk)->inet_dport;
}
-void __mptcp_set_connected(struct sock *sk)
+void __mptcp_sync_state(struct sock *sk, int state)
{
- __mptcp_propagate_sndbuf(sk, mptcp_sk(sk)->first);
+ struct mptcp_sock *msk = mptcp_sk(sk);
+
+ __mptcp_propagate_sndbuf(sk, msk->first);
if (sk->sk_state == TCP_SYN_SENT) {
- inet_sk_state_store(sk, TCP_ESTABLISHED);
+ inet_sk_state_store(sk, state);
sk->sk_state_change(sk);
}
}
-static void mptcp_set_connected(struct sock *sk)
+static void mptcp_propagate_state(struct sock *sk, struct sock *ssk)
{
+ struct mptcp_sock *msk = mptcp_sk(sk);
+
mptcp_data_lock(sk);
- if (!sock_owned_by_user(sk))
- __mptcp_set_connected(sk);
- else
- __set_bit(MPTCP_CONNECTED, &mptcp_sk(sk)->cb_flags);
+ if (!sock_owned_by_user(sk)) {
+ __mptcp_sync_state(sk, ssk->sk_state);
+ } else {
+ msk->pending_state = ssk->sk_state;
+ __set_bit(MPTCP_SYNC_STATE, &msk->cb_flags);
+ }
mptcp_data_unlock(sk);
}
subflow_set_remote_key(msk, subflow, &mp_opt);
MPTCP_INC_STATS(sock_net(sk), MPTCP_MIB_MPCAPABLEACTIVEACK);
mptcp_finish_connect(sk);
- mptcp_set_connected(parent);
+ mptcp_propagate_state(parent, sk);
} else if (subflow->request_join) {
u8 hmac[SHA256_DIGEST_SIZE];
} else if (mptcp_check_fallback(sk)) {
fallback:
mptcp_rcv_space_init(msk, sk);
- mptcp_set_connected(parent);
+ mptcp_propagate_state(parent, sk);
}
return;
mptcp_rcv_space_init(msk, sk);
pr_fallback(msk);
subflow->conn_finished = 1;
- mptcp_set_connected(parent);
+ mptcp_propagate_state(parent, sk);
}
/* as recvmsg() does not acquire the subflow socket for ssk selection
kunit_test_suite(mptcp_token_suite);
MODULE_LICENSE("GPL");
+MODULE_DESCRIPTION("KUnit tests for MPTCP Token");
if ((had_link == has_link) || chained)
return 0;
- if (had_link)
- netif_carrier_off(ndp->ndev.dev);
- else
- netif_carrier_on(ndp->ndev.dev);
-
if (!ndp->multi_package && !nc->package->multi_channel) {
if (had_link) {
ndp->flags |= NCSI_DEV_RESHUFFLE;
ip_set_dereference((inst)->ip_set_list)[id]
#define ip_set_ref_netlink(inst,id) \
rcu_dereference_raw((inst)->ip_set_list)[id]
+#define ip_set_dereference_nfnl(p) \
+ rcu_dereference_check(p, lockdep_nfnl_is_held(NFNL_SUBSYS_IPSET))
/* The set types are implemented in modules and registered set types
* can be found in ip_set_type_list. Adding/deleting types is
static struct ip_set *
ip_set_rcu_get(struct net *net, ip_set_id_t index)
{
- struct ip_set *set;
struct ip_set_net *inst = ip_set_pernet(net);
- rcu_read_lock();
- /* ip_set_list itself needs to be protected */
- set = rcu_dereference(inst->ip_set_list)[index];
- rcu_read_unlock();
-
- return set;
+ /* ip_set_list and the set pointer need to be protected */
+ return ip_set_dereference_nfnl(inst->ip_set_list)[index];
}
static inline void
ip_set(inst, to_id) = from;
write_unlock_bh(&ip_set_ref_lock);
+ /* Make sure all readers of the old set pointers are completed. */
+ synchronize_rcu();
+
return 0;
}
#if IS_ENABLED(CONFIG_NF_DEFRAG_IPV4) || IS_ENABLED(CONFIG_NF_DEFRAG_IPV6)
static const struct nf_defrag_hook *
get_proto_defrag_hook(struct bpf_nf_link *link,
- const struct nf_defrag_hook __rcu *global_hook,
+ const struct nf_defrag_hook __rcu **ptr_global_hook,
const char *mod)
{
const struct nf_defrag_hook *hook;
/* RCU protects us from races against module unloading */
rcu_read_lock();
- hook = rcu_dereference(global_hook);
+ hook = rcu_dereference(*ptr_global_hook);
if (!hook) {
rcu_read_unlock();
err = request_module(mod);
return ERR_PTR(err < 0 ? err : -EINVAL);
rcu_read_lock();
- hook = rcu_dereference(global_hook);
+ hook = rcu_dereference(*ptr_global_hook);
}
if (hook && try_module_get(hook->owner)) {
switch (link->hook_ops.pf) {
#if IS_ENABLED(CONFIG_NF_DEFRAG_IPV4)
case NFPROTO_IPV4:
- hook = get_proto_defrag_hook(link, nf_defrag_v4_hook, "nf_defrag_ipv4");
+ hook = get_proto_defrag_hook(link, &nf_defrag_v4_hook, "nf_defrag_ipv4");
if (IS_ERR(hook))
return PTR_ERR(hook);
#endif
#if IS_ENABLED(CONFIG_NF_DEFRAG_IPV6)
case NFPROTO_IPV6:
- hook = get_proto_defrag_hook(link, nf_defrag_v6_hook, "nf_defrag_ipv6");
+ hook = get_proto_defrag_hook(link, &nf_defrag_v6_hook, "nf_defrag_ipv6");
if (IS_ERR(hook))
return PTR_ERR(hook);
static struct nft_table *nft_table_lookup_byhandle(const struct net *net,
const struct nlattr *nla,
- u8 genmask, u32 nlpid)
+ int family, u8 genmask, u32 nlpid)
{
struct nftables_pernet *nft_net;
struct nft_table *table;
nft_net = nft_pernet(net);
list_for_each_entry(table, &nft_net->tables, list) {
if (be64_to_cpu(nla_get_be64(nla)) == table->handle &&
+ table->family == family &&
nft_active_genmask(table, genmask)) {
if (nft_table_has_owner(table) &&
nlpid && table->nlpid != nlpid)
if (nla[NFTA_TABLE_HANDLE]) {
attr = nla[NFTA_TABLE_HANDLE];
- table = nft_table_lookup_byhandle(net, attr, genmask,
+ table = nft_table_lookup_byhandle(net, attr, family, genmask,
NETLINK_CB(skb).portid);
} else {
attr = nla[NFTA_TABLE_NAME];
if (err < 0) {
NL_SET_BAD_ATTR(extack, attr);
- break;
+ return err;
}
}
- return err;
+
+ return 0;
}
/*
call_rcu(&trans->rcu, nft_trans_gc_trans_free);
}
-static struct nft_trans_gc *nft_trans_gc_catchall(struct nft_trans_gc *gc,
- unsigned int gc_seq,
- bool sync)
+struct nft_trans_gc *nft_trans_gc_catchall_async(struct nft_trans_gc *gc,
+ unsigned int gc_seq)
{
- struct nft_set_elem_catchall *catchall, *next;
+ struct nft_set_elem_catchall *catchall;
const struct nft_set *set = gc->set;
- struct nft_elem_priv *elem_priv;
struct nft_set_ext *ext;
- list_for_each_entry_safe(catchall, next, &set->catchall_list, list) {
+ list_for_each_entry_rcu(catchall, &set->catchall_list, list) {
ext = nft_set_elem_ext(set, catchall->elem);
if (!nft_set_elem_expired(ext))
nft_set_elem_dead(ext);
dead_elem:
- if (sync)
- gc = nft_trans_gc_queue_sync(gc, GFP_ATOMIC);
- else
- gc = nft_trans_gc_queue_async(gc, gc_seq, GFP_ATOMIC);
-
+ gc = nft_trans_gc_queue_async(gc, gc_seq, GFP_ATOMIC);
if (!gc)
return NULL;
- elem_priv = catchall->elem;
- if (sync) {
- nft_setelem_data_deactivate(gc->net, gc->set, elem_priv);
- nft_setelem_catchall_destroy(catchall);
- }
-
- nft_trans_gc_elem_add(gc, elem_priv);
+ nft_trans_gc_elem_add(gc, catchall->elem);
}
return gc;
}
-struct nft_trans_gc *nft_trans_gc_catchall_async(struct nft_trans_gc *gc,
- unsigned int gc_seq)
-{
- return nft_trans_gc_catchall(gc, gc_seq, false);
-}
-
struct nft_trans_gc *nft_trans_gc_catchall_sync(struct nft_trans_gc *gc)
{
- return nft_trans_gc_catchall(gc, 0, true);
+ struct nft_set_elem_catchall *catchall, *next;
+ const struct nft_set *set = gc->set;
+ struct nft_elem_priv *elem_priv;
+ struct nft_set_ext *ext;
+
+ WARN_ON_ONCE(!lockdep_commit_lock_is_held(gc->net));
+
+ list_for_each_entry_safe(catchall, next, &set->catchall_list, list) {
+ ext = nft_set_elem_ext(set, catchall->elem);
+
+ if (!nft_set_elem_expired(ext))
+ continue;
+
+ gc = nft_trans_gc_queue_sync(gc, GFP_KERNEL);
+ if (!gc)
+ return NULL;
+
+ elem_priv = catchall->elem;
+ nft_setelem_data_deactivate(gc->net, gc->set, elem_priv);
+ nft_setelem_catchall_destroy(catchall);
+ nft_trans_gc_elem_add(gc, elem_priv);
+ }
+
+ return gc;
}
static void nf_tables_module_autoload_cleanup(struct net *net)
switch (priv->size) {
case 8: {
+ u64 *dst64 = (void *)dst;
u64 src64;
switch (priv->op) {
case NFT_BYTEORDER_NTOH:
for (i = 0; i < priv->len / 8; i++) {
src64 = nft_reg_load64(&src[i]);
- nft_reg_store64(&dst[i],
+ nft_reg_store64(&dst64[i],
be64_to_cpu((__force __be64)src64));
}
break;
for (i = 0; i < priv->len / 8; i++) {
src64 = (__force __u64)
cpu_to_be64(nft_reg_load64(&src[i]));
- nft_reg_store64(&dst[i], src64);
+ nft_reg_store64(&dst64[i], src64);
}
break;
}
priv->expr_array[i] = dynset_expr;
priv->num_exprs++;
- if (set->num_exprs &&
- dynset_expr->ops != set->exprs[i]->ops) {
- err = -EOPNOTSUPP;
- goto err_expr_free;
+ if (set->num_exprs) {
+ if (i >= set->num_exprs) {
+ err = -EINVAL;
+ goto err_expr_free;
+ }
+ if (dynset_expr->ops != set->exprs[i]->ops) {
+ err = -EOPNOTSUPP;
+ goto err_expr_free;
+ }
}
i++;
}
offset = i + priv->offset;
if (priv->flags & NFT_EXTHDR_F_PRESENT) {
- *dest = 1;
+ nft_reg_store8(dest, 1);
} else {
if (priv->len % NFT_REG32_SIZE)
dest[priv->len / NFT_REG32_SIZE] = 0;
type = bufp[0];
if (type == priv->type) {
- *dest = 1;
+ nft_reg_store8(dest, 1);
return;
}
switch (priv->result) {
case NFT_FIB_RESULT_OIF:
index = dev ? dev->ifindex : 0;
- *dreg = (priv->flags & NFTA_FIB_F_PRESENT) ? !!index : index;
+ if (priv->flags & NFTA_FIB_F_PRESENT)
+ nft_reg_store8(dreg, !!index);
+ else
+ *dreg = index;
+
break;
case NFT_FIB_RESULT_OIFNAME:
if (priv->flags & NFTA_FIB_F_PRESENT)
- *dreg = !!dev;
+ nft_reg_store8(dreg, !!dev);
else
strscpy_pad(reg, dev ? dev->name : "", IFNAMSIZ);
break;
{
switch (key) {
case NFT_META_TIME_NS:
- nft_reg_store64(dest, ktime_get_real_ns());
+ nft_reg_store64((u64 *)dest, ktime_get_real_ns());
break;
case NFT_META_TIME_DAY:
nft_reg_store8(dest, nft_meta_weekday());
e = f->mt[r].e;
+ if (!nft_set_elem_active(&e->ext, iter->genmask))
+ goto cont;
+
iter->err = iter->fn(ctx, set, iter, &e->priv);
if (iter->err < 0)
goto out;
{
struct nft_rbtree *priv = nft_set_priv(set);
struct nft_rbtree_elem *rbe, *rbe_end = NULL;
- struct nftables_pernet *nft_net;
struct rb_node *node, *next;
struct nft_trans_gc *gc;
struct net *net;
set = nft_set_container_of(priv);
net = read_pnet(&set->net);
- nft_net = nft_pernet(net);
gc = nft_trans_gc_alloc(set, 0, GFP_KERNEL);
if (!gc)
*/
return false;
- filp = sk->sk_socket->file;
- if (filp == NULL)
+ read_lock_bh(&sk->sk_callback_lock);
+ filp = sk->sk_socket ? sk->sk_socket->file : NULL;
+ if (filp == NULL) {
+ read_unlock_bh(&sk->sk_callback_lock);
return ((info->match ^ info->invert) &
(XT_OWNER_UID | XT_OWNER_GID)) == 0;
+ }
if (info->match & XT_OWNER_UID) {
kuid_t uid_min = make_kuid(net->user_ns, info->uid_min);
kuid_t uid_max = make_kuid(net->user_ns, info->uid_max);
if ((uid_gte(filp->f_cred->fsuid, uid_min) &&
uid_lte(filp->f_cred->fsuid, uid_max)) ^
- !(info->invert & XT_OWNER_UID))
+ !(info->invert & XT_OWNER_UID)) {
+ read_unlock_bh(&sk->sk_callback_lock);
return false;
+ }
}
if (info->match & XT_OWNER_GID) {
}
}
- if (match ^ !(info->invert & XT_OWNER_GID))
+ if (match ^ !(info->invert & XT_OWNER_GID)) {
+ read_unlock_bh(&sk->sk_callback_lock);
return false;
+ }
}
+ read_unlock_bh(&sk->sk_callback_lock);
return true;
}
if ((grp->flags & GENL_UNS_ADMIN_PERM) &&
!ns_capable(net->user_ns, CAP_NET_ADMIN))
ret = -EPERM;
+ if (grp->cap_sys_admin &&
+ !ns_capable(net->user_ns, CAP_SYS_ADMIN))
+ ret = -EPERM;
break;
}
struct sock *sk = sock->sk;
if (sk)
- atomic_inc(&pkt_sk(sk)->mapped);
+ atomic_long_inc(&pkt_sk(sk)->mapped);
}
static void packet_mm_close(struct vm_area_struct *vma)
struct sock *sk = sock->sk;
if (sk)
- atomic_dec(&pkt_sk(sk)->mapped);
+ atomic_long_dec(&pkt_sk(sk)->mapped);
}
static const struct vm_operations_struct packet_mmap_ops = {
err = -EBUSY;
if (!closing) {
- if (atomic_read(&po->mapped))
+ if (atomic_long_read(&po->mapped))
goto out;
if (packet_read_pending(rb))
goto out;
err = -EBUSY;
mutex_lock(&po->pg_vec_lock);
- if (closing || atomic_read(&po->mapped) == 0) {
+ if (closing || atomic_long_read(&po->mapped) == 0) {
err = 0;
spin_lock_bh(&rb_queue->lock);
swap(rb->pg_vec, pg_vec);
po->prot_hook.func = (po->rx_ring.pg_vec) ?
tpacket_rcv : packet_rcv;
skb_queue_purge(rb_queue);
- if (atomic_read(&po->mapped))
- pr_err("packet_mmap: vma is busy: %d\n",
- atomic_read(&po->mapped));
+ if (atomic_long_read(&po->mapped))
+ pr_err("packet_mmap: vma is busy: %ld\n",
+ atomic_long_read(&po->mapped));
}
mutex_unlock(&po->pg_vec_lock);
}
}
- atomic_inc(&po->mapped);
+ atomic_long_inc(&po->mapped);
vma->vm_ops = &packet_mmap_ops;
err = 0;
module_init(packet_diag_init);
module_exit(packet_diag_exit);
MODULE_LICENSE("GPL");
+MODULE_DESCRIPTION("PACKET socket monitoring via SOCK_DIAG");
MODULE_ALIAS_NET_PF_PROTO_TYPE(PF_NETLINK, NETLINK_SOCK_DIAG, 17 /* AF_PACKET */);
__be16 num;
struct packet_rollover *rollover;
struct packet_mclist *mclist;
- atomic_t mapped;
+ atomic_long_t mapped;
enum tpacket_versions tp_version;
unsigned int tp_hdrlen;
unsigned int tp_reserve;
static const struct genl_multicast_group psample_nl_mcgrps[] = {
[PSAMPLE_NL_MCGRP_CONFIG] = { .name = PSAMPLE_NL_MCGRP_CONFIG_NAME },
- [PSAMPLE_NL_MCGRP_SAMPLE] = { .name = PSAMPLE_NL_MCGRP_SAMPLE_NAME },
+ [PSAMPLE_NL_MCGRP_SAMPLE] = { .name = PSAMPLE_NL_MCGRP_SAMPLE_NAME,
+ .flags = GENL_UNS_ADMIN_PERM },
};
static struct genl_family psample_nl_family __ro_after_init;
return -EINVAL;
}
+ ret = gpiod_direction_output(rfkill->reset_gpio, true);
+ if (ret)
+ return ret;
+
+ ret = gpiod_direction_output(rfkill->shutdown_gpio, true);
+ if (ret)
+ return ret;
+
rfkill->rfkill_dev = rfkill_alloc(rfkill->name, &pdev->dev,
rfkill->type, &rfkill_gpio_ops,
rfkill);
*/
static void rose_kill_by_device(struct net_device *dev)
{
- struct sock *s;
+ struct sock *sk, *array[16];
+ struct rose_sock *rose;
+ bool rescan;
+ int i, cnt;
+start:
+ rescan = false;
+ cnt = 0;
spin_lock_bh(&rose_list_lock);
- sk_for_each(s, &rose_list) {
- struct rose_sock *rose = rose_sk(s);
+ sk_for_each(sk, &rose_list) {
+ rose = rose_sk(sk);
+ if (rose->device == dev) {
+ if (cnt == ARRAY_SIZE(array)) {
+ rescan = true;
+ break;
+ }
+ sock_hold(sk);
+ array[cnt++] = sk;
+ }
+ }
+ spin_unlock_bh(&rose_list_lock);
+ for (i = 0; i < cnt; i++) {
+ sk = array[cnt];
+ rose = rose_sk(sk);
+ lock_sock(sk);
+ spin_lock_bh(&rose_list_lock);
if (rose->device == dev) {
- rose_disconnect(s, ENETUNREACH, ROSE_OUT_OF_ORDER, 0);
+ rose_disconnect(sk, ENETUNREACH, ROSE_OUT_OF_ORDER, 0);
if (rose->neighbour)
rose->neighbour->use--;
netdev_put(rose->device, &rose->dev_tracker);
rose->device = NULL;
}
+ spin_unlock_bh(&rose_list_lock);
+ release_sock(sk);
+ sock_put(sk);
+ cond_resched();
}
- spin_unlock_bh(&rose_list_lock);
+ if (rescan)
+ goto start;
}
/*
break;
}
+ spin_lock_bh(&rose_list_lock);
netdev_put(rose->device, &rose->dev_tracker);
+ rose->device = NULL;
+ spin_unlock_bh(&rose_list_lock);
sock->sk = NULL;
release_sock(sk);
sock_put(sk);
case TIOCINQ: {
struct sk_buff *skb;
long amount = 0L;
- /* These two are safe on a single CPU system as only user tasks fiddle here */
+
+ spin_lock_irq(&sk->sk_receive_queue.lock);
if ((skb = skb_peek(&sk->sk_receive_queue)) != NULL)
amount = skb->len;
+ spin_unlock_irq(&sk->sk_receive_queue.lock);
return put_user(amount, (unsigned int __user *) argp);
}
static struct rxrpc_bundle *rxrpc_alloc_bundle(struct rxrpc_call *call,
gfp_t gfp)
{
+ static atomic_t rxrpc_bundle_id;
struct rxrpc_bundle *bundle;
bundle = kzalloc(sizeof(*bundle), gfp);
bundle->upgrade = test_bit(RXRPC_CALL_UPGRADE, &call->flags);
bundle->service_id = call->dest_srx.srx_service;
bundle->security_level = call->security_level;
+ bundle->debug_id = atomic_inc_return(&rxrpc_bundle_id);
refcount_set(&bundle->ref, 1);
atomic_set(&bundle->active, 1);
INIT_LIST_HEAD(&bundle->waiting_calls);
static void rxrpc_free_bundle(struct rxrpc_bundle *bundle)
{
- trace_rxrpc_bundle(bundle->debug_id, 1, rxrpc_bundle_free);
+ trace_rxrpc_bundle(bundle->debug_id, refcount_read(&bundle->ref),
+ rxrpc_bundle_free);
rxrpc_put_peer(bundle->peer, rxrpc_peer_put_bundle);
key_put(bundle->key);
kfree(bundle);
*/
int rxrpc_look_up_bundle(struct rxrpc_call *call, gfp_t gfp)
{
- static atomic_t rxrpc_bundle_id;
struct rxrpc_bundle *bundle, *candidate;
struct rxrpc_local *local = call->local;
struct rb_node *p, **pp, *parent;
}
_debug("new bundle");
- candidate->debug_id = atomic_inc_return(&rxrpc_bundle_id);
rb_link_node(&candidate->local_node, parent, pp);
rb_insert_color(&candidate->local_node, &local->client_bundles);
call->bundle = rxrpc_get_bundle(candidate, rxrpc_bundle_get_client_call);
clear_bit(i + RXRPC_CALL_RTT_PEND_SHIFT, &call->rtt_avail);
smp_mb(); /* Read data before setting avail bit */
set_bit(i, &call->rtt_avail);
- if (type != rxrpc_rtt_rx_cancel)
- rxrpc_peer_add_rtt(call, type, i, acked_serial, ack_serial,
- sent_at, resp_time);
- else
- trace_rxrpc_rtt_rx(call, rxrpc_rtt_rx_cancel, i,
- orig_serial, acked_serial, 0, 0);
+ rxrpc_peer_add_rtt(call, type, i, acked_serial, ack_serial,
+ sent_at, resp_time);
matched = true;
}
summary.ack_reason, nr_acks);
rxrpc_inc_stat(call->rxnet, stat_rx_acks[ack.reason]);
- switch (ack.reason) {
- case RXRPC_ACK_PING_RESPONSE:
- rxrpc_complete_rtt_probe(call, skb->tstamp, acked_serial, ack_serial,
- rxrpc_rtt_rx_ping_response);
- break;
- case RXRPC_ACK_REQUESTED:
- rxrpc_complete_rtt_probe(call, skb->tstamp, acked_serial, ack_serial,
- rxrpc_rtt_rx_requested_ack);
- break;
- default:
- if (acked_serial != 0)
+ if (acked_serial != 0) {
+ switch (ack.reason) {
+ case RXRPC_ACK_PING_RESPONSE:
rxrpc_complete_rtt_probe(call, skb->tstamp, acked_serial, ack_serial,
- rxrpc_rtt_rx_cancel);
- break;
- }
-
- if (ack.reason == RXRPC_ACK_PING) {
- rxrpc_send_ACK(call, RXRPC_ACK_PING_RESPONSE, ack_serial,
- rxrpc_propose_ack_respond_to_ping);
- } else if (sp->hdr.flags & RXRPC_REQUEST_ACK) {
- rxrpc_send_ACK(call, RXRPC_ACK_REQUESTED, ack_serial,
- rxrpc_propose_ack_respond_to_ack);
+ rxrpc_rtt_rx_ping_response);
+ break;
+ case RXRPC_ACK_REQUESTED:
+ rxrpc_complete_rtt_probe(call, skb->tstamp, acked_serial, ack_serial,
+ rxrpc_rtt_rx_requested_ack);
+ break;
+ default:
+ rxrpc_complete_rtt_probe(call, skb->tstamp, acked_serial, ack_serial,
+ rxrpc_rtt_rx_other_ack);
+ break;
+ }
}
/* If we get an EXCEEDS_WINDOW ACK from the server, it probably
rxrpc_is_client_call(call)) {
rxrpc_set_call_completion(call, RXRPC_CALL_REMOTELY_ABORTED,
0, -ENETRESET);
- return;
+ goto send_response;
}
/* If we get an OUT_OF_SEQUENCE ACK from the server, that can also
rxrpc_is_client_call(call)) {
rxrpc_set_call_completion(call, RXRPC_CALL_REMOTELY_ABORTED,
0, -ENETRESET);
- return;
+ goto send_response;
}
/* Discard any out-of-order or duplicate ACKs (outside lock). */
trace_rxrpc_rx_discard_ack(call->debug_id, ack_serial,
first_soft_ack, call->acks_first_seq,
prev_pkt, call->acks_prev_seq);
- return;
+ goto send_response;
}
info.rxMTU = 0;
case RXRPC_CALL_SERVER_AWAIT_ACK:
break;
default:
- return;
+ goto send_response;
}
if (before(hard_ack, call->acks_hard_ack) ||
if (after(hard_ack, call->acks_hard_ack)) {
if (rxrpc_rotate_tx_window(call, hard_ack, &summary)) {
rxrpc_end_tx_phase(call, false, rxrpc_eproto_unexpected_ack);
- return;
+ goto send_response;
}
}
rxrpc_propose_ack_ping_for_lost_reply);
rxrpc_congestion_management(call, skb, &summary, acked_serial);
+
+send_response:
+ if (ack.reason == RXRPC_ACK_PING)
+ rxrpc_send_ACK(call, RXRPC_ACK_PING_RESPONSE, ack_serial,
+ rxrpc_propose_ack_respond_to_ping);
+ else if (sp->hdr.flags & RXRPC_REQUEST_ACK)
+ rxrpc_send_ACK(call, RXRPC_ACK_REQUESTED, ack_serial,
+ rxrpc_propose_ack_respond_to_ack);
}
/*
!test_bit(NF_FLOW_HW_ESTABLISHED, &flow->flags);
}
+static void tcf_ct_flow_table_get_ref(struct tcf_ct_flow_table *ct_ft);
+
+static void tcf_ct_nf_get(struct nf_flowtable *ft)
+{
+ struct tcf_ct_flow_table *ct_ft =
+ container_of(ft, struct tcf_ct_flow_table, nf_ft);
+
+ tcf_ct_flow_table_get_ref(ct_ft);
+}
+
+static void tcf_ct_flow_table_put(struct tcf_ct_flow_table *ct_ft);
+
+static void tcf_ct_nf_put(struct nf_flowtable *ft)
+{
+ struct tcf_ct_flow_table *ct_ft =
+ container_of(ft, struct tcf_ct_flow_table, nf_ft);
+
+ tcf_ct_flow_table_put(ct_ft);
+}
+
static struct nf_flowtable_type flowtable_ct = {
.gc = tcf_ct_flow_is_outdated,
.action = tcf_ct_flow_table_fill_actions,
+ .get = tcf_ct_nf_get,
+ .put = tcf_ct_nf_put,
.owner = THIS_MODULE,
};
return err;
}
+static void tcf_ct_flow_table_get_ref(struct tcf_ct_flow_table *ct_ft)
+{
+ refcount_inc(&ct_ft->ref);
+}
+
static void tcf_ct_flow_table_cleanup_work(struct work_struct *work)
{
- struct flow_block_cb *block_cb, *tmp_cb;
struct tcf_ct_flow_table *ct_ft;
struct flow_block *block;
rwork);
nf_flow_table_free(&ct_ft->nf_ft);
- /* Remove any remaining callbacks before cleanup */
block = &ct_ft->nf_ft.flow_block;
down_write(&ct_ft->nf_ft.flow_block_lock);
- list_for_each_entry_safe(block_cb, tmp_cb, &block->cb_list, list) {
- list_del(&block_cb->list);
- flow_block_cb_free(block_cb);
- }
+ WARN_ON(!list_empty(&block->cb_list));
up_write(&ct_ft->nf_ft.flow_block_lock);
kfree(ct_ft);
if (bind) {
struct flow_action_entry *entry = entry_data;
+ if (tcf_ct_helper(act))
+ return -EOPNOTSUPP;
+
entry->id = FLOW_ACTION_CT;
entry->ct.action = tcf_ct_action(act);
entry->ct.zone = tcf_ct_zone(act);
module_init(sctp_diag_init);
module_exit(sctp_diag_exit);
MODULE_LICENSE("GPL");
+MODULE_DESCRIPTION("SCTP socket monitoring via SOCK_DIAG");
MODULE_ALIAS_NET_PF_PROTO_TYPE(PF_NETLINK, NETLINK_SOCK_DIAG, 2-132);
struct smc_llc_qentry *qentry;
int rc;
- /* receive CONFIRM LINK request from server over RoCE fabric */
- qentry = smc_llc_wait(link->lgr, NULL, SMC_LLC_WAIT_TIME,
+ /* Receive CONFIRM LINK request from server over RoCE fabric.
+ * Increasing the client's timeout by twice as much as the server's
+ * timeout by default can temporarily avoid decline messages of
+ * both sides crossing or colliding
+ */
+ qentry = smc_llc_wait(link->lgr, NULL, 2 * SMC_LLC_WAIT_TIME,
SMC_LLC_CONFIRM_LINK);
if (!qentry) {
struct smc_clc_msg_decline dclc;
int bufsize = smc_uncompress_bufsize(clc->d0.dmbe_size);
smc->conn.peer_rmbe_idx = clc->d0.dmbe_idx;
- smc->conn.peer_token = clc->d0.token;
+ smc->conn.peer_token = ntohll(clc->d0.token);
/* msg header takes up space in the buffer */
smc->conn.peer_rmbe_size = bufsize - sizeof(struct smcd_cdc_msg);
atomic_set(&smc->conn.peer_rmbe_space, smc->conn.peer_rmbe_size);
if (rc)
return rc;
}
- ini->ism_peer_gid[ini->ism_selected] = aclc->d0.gid;
+ ini->ism_peer_gid[ini->ism_selected] = ntohll(aclc->d0.gid);
/* there is only one lgr role for SMC-D; use server lock */
mutex_lock(&smc_server_lgr_pending);
{
struct smc_connection *conn = &smc->conn;
struct smc_clc_first_contact_ext_v2x fce;
+ struct smcd_dev *smcd = conn->lgr->smcd;
struct smc_clc_msg_accept_confirm *clc;
struct smc_clc_fce_gid_ext gle;
struct smc_clc_msg_trail trl;
memcpy(clc->hdr.eyecatcher, SMCD_EYECATCHER,
sizeof(SMCD_EYECATCHER));
clc->hdr.typev1 = SMC_TYPE_D;
- clc->d0.gid =
- conn->lgr->smcd->ops->get_local_gid(conn->lgr->smcd);
- clc->d0.token = conn->rmb_desc->token;
+ clc->d0.gid = htonll(smcd->ops->get_local_gid(smcd));
+ clc->d0.token = htonll(conn->rmb_desc->token);
clc->d0.dmbe_size = conn->rmbe_size_comp;
clc->d0.dmbe_idx = 0;
memcpy(&clc->d0.linkid, conn->lgr->id, SMC_LGR_ID_SIZE);
if (version == SMC_V1) {
clc->hdr.length = htons(SMCD_CLC_ACCEPT_CONFIRM_LEN);
} else {
- clc_v2->d1.chid =
- htons(smc_ism_get_chid(conn->lgr->smcd));
+ clc_v2->d1.chid = htons(smc_ism_get_chid(smcd));
if (eid && eid[0])
memcpy(clc_v2->d1.eid, eid, SMC_MAX_EID_LEN);
len = SMCD_CLC_ACCEPT_CONFIRM_LEN_V2;
} __packed;
struct smcd_clc_msg_accept_confirm_common { /* SMCD accept/confirm */
- u64 gid; /* Sender GID */
- u64 token; /* DMB token */
+ __be64 gid; /* Sender GID */
+ __be64 token; /* DMB token */
u8 dmbe_idx; /* DMBE index */
#if defined(__BIG_ENDIAN_BITFIELD)
u8 dmbe_size : 4, /* buf size (compressed) */
module_init(smc_diag_init);
module_exit(smc_diag_exit);
MODULE_LICENSE("GPL");
+MODULE_DESCRIPTION("SMC socket monitoring via SOCK_DIAG");
MODULE_ALIAS_NET_PF_PROTO_TYPE(PF_NETLINK, NETLINK_SOCK_DIAG, 43 /* AF_SMC */);
MODULE_ALIAS_GENL_FAMILY(SMCR_GENL_FAMILY_NAME);
static struct cred machine_cred = {
.usage = ATOMIC_INIT(1),
-#ifdef CONFIG_DEBUG_CREDENTIALS
- .magic = CRED_MAGIC,
-#endif
};
/*
}
for (filled = 0; filled < pages; filled = ret) {
- ret = alloc_pages_bulk_array_node(GFP_KERNEL,
- rqstp->rq_pool->sp_id,
- pages, rqstp->rq_pages);
+ ret = alloc_pages_bulk_array(GFP_KERNEL, pages,
+ rqstp->rq_pages);
if (ret > filled)
/* Made progress, don't sleep yet */
continue;
module_exit(tipc_diag_exit);
MODULE_LICENSE("Dual BSD/GPL");
+MODULE_DESCRIPTION("TIPC socket monitoring via SOCK_DIAG");
MODULE_ALIAS_NET_PF_PROTO_TYPE(PF_NETLINK, NETLINK_SOCK_DIAG, AF_TIPC);
return -EMSGSIZE;
skb_put(skb, TLV_SPACE(len));
+ memset(tlv, 0, TLV_SPACE(len));
tlv->tlv_type = htons(type);
tlv->tlv_len = htons(TLV_LENGTH(len));
if (len && data)
}
sk_msg_page_add(msg_pl, page, part, off);
+ msg_pl->sg.copybreak = 0;
+ msg_pl->sg.curr = msg_pl->sg.end;
sk_mem_charge(sk, part);
*copied += part;
try_to_copy -= part;
lock_sock(sk);
retry:
+ /* same checks as in tls_sw_push_pending_record() */
rec = ctx->open_rec;
if (!rec)
goto unlock;
msg_pl = &rec->msg_plaintext;
+ if (msg_pl->sg.size == 0)
+ goto unlock;
/* Check the BPF advisor and perform transmission. */
ret = bpf_exec_tx_verdict(msg_pl, sk, false, TLS_RECORD_TYPE_DATA,
}
#endif /* CONFIG_SECURITY_NETWORK */
-#define unix_peer(sk) (unix_sk(sk)->peer)
-
static inline int unix_our_peer(struct sock *sk, struct sock *osk)
{
return unix_peer(osk) == sk;
if (!(state->flags & MSG_PEEK))
WRITE_ONCE(u->oob_skb, NULL);
-
+ else
+ skb_get(oob_skb);
unix_state_unlock(sk);
chunk = state->recv_actor(oob_skb, 0, chunk, state);
- if (!(state->flags & MSG_PEEK)) {
+ if (!(state->flags & MSG_PEEK))
UNIXCB(oob_skb).consumed += 1;
- kfree_skb(oob_skb);
- }
+
+ consume_skb(oob_skb);
mutex_unlock(&u->iolock);
module_init(unix_diag_init);
module_exit(unix_diag_exit);
MODULE_LICENSE("GPL");
+MODULE_DESCRIPTION("UNIX socket monitoring via SOCK_DIAG");
MODULE_ALIAS_NET_PF_PROTO_TYPE(PF_NETLINK, NETLINK_SOCK_DIAG, 1 /* AF_LOCAL */);
int unix_stream_bpf_update_proto(struct sock *sk, struct sk_psock *psock, bool restore)
{
+ struct sock *sk_pair;
+
if (restore) {
sk->sk_write_space = psock->saved_write_space;
sock_replace_proto(sk, psock->sk_proto);
return 0;
}
+ sk_pair = unix_peer(sk);
+ sock_hold(sk_pair);
+ psock->sk_pair = sk_pair;
unix_stream_bpf_check_needs_rebuild(psock->sk_proto);
sock_replace_proto(sk, &unix_stream_bpf_prot);
return 0;
module_init(vsock_diag_init);
module_exit(vsock_diag_exit);
MODULE_LICENSE("GPL");
+MODULE_DESCRIPTION("VMware Virtual Sockets monitoring via SOCK_DIAG");
MODULE_ALIAS_NET_PF_PROTO_TYPE(PF_NETLINK, NETLINK_SOCK_DIAG,
40 /* AF_VSOCK */);
t_ops = virtio_transport_get_ops(info->vsk);
if (t_ops->can_msgzerocopy) {
- int pages_in_iov = iov_iter_npages(iov_iter, MAX_SKB_FRAGS);
- int pages_to_send = min(pages_in_iov, MAX_SKB_FRAGS);
+ int pages_to_send = iov_iter_npages(iov_iter, MAX_SKB_FRAGS);
/* +1 is for packet header. */
return t_ops->can_msgzerocopy(pages_to_send + 1);
struct virtio_vsock_sock *vvs = vsk->trans;
s64 bytes;
- bytes = vvs->peer_buf_alloc - (vvs->tx_cnt - vvs->peer_fwd_cnt);
+ bytes = (s64)vvs->peer_buf_alloc - (vvs->tx_cnt - vvs->peer_fwd_cnt);
if (bytes < 0)
bytes = 0;
--- /dev/null
+/* Chen-Yu Tsai's regdb certificate */
+0x30, 0x82, 0x02, 0xa7, 0x30, 0x82, 0x01, 0x8f,
+0x02, 0x14, 0x61, 0xc0, 0x38, 0x65, 0x1a, 0xab,
+0xdc, 0xf9, 0x4b, 0xd0, 0xac, 0x7f, 0xf0, 0x6c,
+0x72, 0x48, 0xdb, 0x18, 0xc6, 0x00, 0x30, 0x0d,
+0x06, 0x09, 0x2a, 0x86, 0x48, 0x86, 0xf7, 0x0d,
+0x01, 0x01, 0x0b, 0x05, 0x00, 0x30, 0x0f, 0x31,
+0x0d, 0x30, 0x0b, 0x06, 0x03, 0x55, 0x04, 0x03,
+0x0c, 0x04, 0x77, 0x65, 0x6e, 0x73, 0x30, 0x20,
+0x17, 0x0d, 0x32, 0x33, 0x31, 0x32, 0x30, 0x31,
+0x30, 0x37, 0x34, 0x31, 0x31, 0x34, 0x5a, 0x18,
+0x0f, 0x32, 0x31, 0x32, 0x33, 0x31, 0x31, 0x30,
+0x37, 0x30, 0x37, 0x34, 0x31, 0x31, 0x34, 0x5a,
+0x30, 0x0f, 0x31, 0x0d, 0x30, 0x0b, 0x06, 0x03,
+0x55, 0x04, 0x03, 0x0c, 0x04, 0x77, 0x65, 0x6e,
+0x73, 0x30, 0x82, 0x01, 0x22, 0x30, 0x0d, 0x06,
+0x09, 0x2a, 0x86, 0x48, 0x86, 0xf7, 0x0d, 0x01,
+0x01, 0x01, 0x05, 0x00, 0x03, 0x82, 0x01, 0x0f,
+0x00, 0x30, 0x82, 0x01, 0x0a, 0x02, 0x82, 0x01,
+0x01, 0x00, 0xa9, 0x7a, 0x2c, 0x78, 0x4d, 0xa7,
+0x19, 0x2d, 0x32, 0x52, 0xa0, 0x2e, 0x6c, 0xef,
+0x88, 0x7f, 0x15, 0xc5, 0xb6, 0x69, 0x54, 0x16,
+0x43, 0x14, 0x79, 0x53, 0xb7, 0xae, 0x88, 0xfe,
+0xc0, 0xb7, 0x5d, 0x47, 0x8e, 0x1a, 0xe1, 0xef,
+0xb3, 0x90, 0x86, 0xda, 0xd3, 0x64, 0x81, 0x1f,
+0xce, 0x5d, 0x9e, 0x4b, 0x6e, 0x58, 0x02, 0x3e,
+0xb2, 0x6f, 0x5e, 0x42, 0x47, 0x41, 0xf4, 0x2c,
+0xb8, 0xa8, 0xd4, 0xaa, 0xc0, 0x0e, 0xe6, 0x48,
+0xf0, 0xa8, 0xce, 0xcb, 0x08, 0xae, 0x37, 0xaf,
+0xf6, 0x40, 0x39, 0xcb, 0x55, 0x6f, 0x5b, 0x4f,
+0x85, 0x34, 0xe6, 0x69, 0x10, 0x50, 0x72, 0x5e,
+0x4e, 0x9d, 0x4c, 0xba, 0x38, 0x36, 0x0d, 0xce,
+0x73, 0x38, 0xd7, 0x27, 0x02, 0x2a, 0x79, 0x03,
+0xe1, 0xac, 0xcf, 0xb0, 0x27, 0x85, 0x86, 0x93,
+0x17, 0xab, 0xec, 0x42, 0x77, 0x37, 0x65, 0x8a,
+0x44, 0xcb, 0xd6, 0x42, 0x93, 0x92, 0x13, 0xe3,
+0x39, 0x45, 0xc5, 0x6e, 0x00, 0x4a, 0x7f, 0xcb,
+0x42, 0x17, 0x2b, 0x25, 0x8c, 0xb8, 0x17, 0x3b,
+0x15, 0x36, 0x59, 0xde, 0x42, 0xce, 0x21, 0xe6,
+0xb6, 0xc7, 0x6e, 0x5e, 0x26, 0x1f, 0xf7, 0x8a,
+0x57, 0x9e, 0xa5, 0x96, 0x72, 0xb7, 0x02, 0x32,
+0xeb, 0x07, 0x2b, 0x73, 0xe2, 0x4f, 0x66, 0x58,
+0x9a, 0xeb, 0x0f, 0x07, 0xb6, 0xab, 0x50, 0x8b,
+0xc3, 0x8f, 0x17, 0xfa, 0x0a, 0x99, 0xc2, 0x16,
+0x25, 0xbf, 0x2d, 0x6b, 0x1a, 0xaa, 0xe6, 0x3e,
+0x5f, 0xeb, 0x6d, 0x9b, 0x5d, 0x4d, 0x42, 0x83,
+0x2d, 0x39, 0xb8, 0xc9, 0xac, 0xdb, 0x3a, 0x91,
+0x50, 0xdf, 0xbb, 0xb1, 0x76, 0x6d, 0x15, 0x73,
+0xfd, 0xc6, 0xe6, 0x6b, 0x71, 0x9e, 0x67, 0x36,
+0x22, 0x83, 0x79, 0xb1, 0xd6, 0xb8, 0x84, 0x52,
+0xaf, 0x96, 0x5b, 0xc3, 0x63, 0x02, 0x4e, 0x78,
+0x70, 0x57, 0x02, 0x03, 0x01, 0x00, 0x01, 0x30,
+0x0d, 0x06, 0x09, 0x2a, 0x86, 0x48, 0x86, 0xf7,
+0x0d, 0x01, 0x01, 0x0b, 0x05, 0x00, 0x03, 0x82,
+0x01, 0x01, 0x00, 0x24, 0x28, 0xee, 0x22, 0x74,
+0x7f, 0x7c, 0xfa, 0x6c, 0x1f, 0xb3, 0x18, 0xd1,
+0xc2, 0x3d, 0x7d, 0x29, 0x42, 0x88, 0xad, 0x82,
+0xa5, 0xb1, 0x8a, 0x05, 0xd0, 0xec, 0x5c, 0x91,
+0x20, 0xf6, 0x82, 0xfd, 0xd5, 0x67, 0x60, 0x5f,
+0x31, 0xf5, 0xbd, 0x88, 0x91, 0x70, 0xbd, 0xb8,
+0xb9, 0x8c, 0x88, 0xfe, 0x53, 0xc9, 0x54, 0x9b,
+0x43, 0xc4, 0x7a, 0x43, 0x74, 0x6b, 0xdd, 0xb0,
+0xb1, 0x3b, 0x33, 0x45, 0x46, 0x78, 0xa3, 0x1c,
+0xef, 0x54, 0x68, 0xf7, 0x85, 0x9c, 0xe4, 0x51,
+0x6f, 0x06, 0xaf, 0x81, 0xdb, 0x2a, 0x7b, 0x7b,
+0x6f, 0xa8, 0x9c, 0x67, 0xd8, 0xcb, 0xc9, 0x91,
+0x40, 0x00, 0xae, 0xd9, 0xa1, 0x9f, 0xdd, 0xa6,
+0x43, 0x0e, 0x28, 0x7b, 0xaa, 0x1b, 0xe9, 0x84,
+0xdb, 0x76, 0x64, 0x42, 0x70, 0xc9, 0xc0, 0xeb,
+0xae, 0x84, 0x11, 0x16, 0x68, 0x4e, 0x84, 0x9e,
+0x7e, 0x92, 0x36, 0xee, 0x1c, 0x3b, 0x08, 0x63,
+0xeb, 0x79, 0x84, 0x15, 0x08, 0x9d, 0xaf, 0xc8,
+0x9a, 0xc7, 0x34, 0xd3, 0x94, 0x4b, 0xd1, 0x28,
+0x97, 0xbe, 0xd1, 0x45, 0x75, 0xdc, 0x35, 0x62,
+0xac, 0x1d, 0x1f, 0xb7, 0xb7, 0x15, 0x87, 0xc8,
+0x98, 0xc0, 0x24, 0x31, 0x56, 0x8d, 0xed, 0xdb,
+0x06, 0xc6, 0x46, 0xbf, 0x4b, 0x6d, 0xa6, 0xd5,
+0xab, 0xcc, 0x60, 0xfc, 0xe5, 0x37, 0xb6, 0x53,
+0x7d, 0x58, 0x95, 0xa9, 0x56, 0xc7, 0xf7, 0xee,
+0xc3, 0xa0, 0x76, 0xf7, 0x65, 0x4d, 0x53, 0xfa,
+0xff, 0x5f, 0x76, 0x33, 0x5a, 0x08, 0xfa, 0x86,
+0x92, 0x5a, 0x13, 0xfa, 0x1a, 0xfc, 0xf2, 0x1b,
+0x8c, 0x7f, 0x42, 0x6d, 0xb7, 0x7e, 0xb7, 0xb4,
+0xf0, 0xc7, 0x83, 0xbb, 0xa2, 0x81, 0x03, 0x2d,
+0xd4, 0x2a, 0x63, 0x3f, 0xf7, 0x31, 0x2e, 0x40,
+0x33, 0x5c, 0x46, 0xbc, 0x9b, 0xc1, 0x05, 0xa5,
+0x45, 0x4e, 0xc3,
return err;
}
+ wiphy_lock(&rdev->wiphy);
list_for_each_entry(wdev, &rdev->wiphy.wdev_list, list) {
if (!wdev->netdev)
continue;
nl80211_notify_iface(rdev, wdev, NL80211_CMD_DEL_INTERFACE);
}
- wiphy_lock(&rdev->wiphy);
nl80211_notify_wiphy(rdev, NL80211_CMD_DEL_WIPHY);
wiphy_net_set(&rdev->wiphy, net);
WARN_ON(err);
nl80211_notify_wiphy(rdev, NL80211_CMD_NEW_WIPHY);
- wiphy_unlock(&rdev->wiphy);
list_for_each_entry(wdev, &rdev->wiphy.wdev_list, list) {
if (!wdev->netdev)
continue;
nl80211_notify_iface(rdev, wdev, NL80211_CMD_NEW_INTERFACE);
}
+ wiphy_unlock(&rdev->wiphy);
return 0;
}
{
struct cfg80211_registered_device *rdev = data;
+ wiphy_lock(&rdev->wiphy);
rdev_rfkill_poll(rdev);
+ wiphy_unlock(&rdev->wiphy);
}
void cfg80211_stop_p2p_device(struct cfg80211_registered_device *rdev,
u32 rssi_hyst;
s32 last_rssi_event_value;
enum nl80211_cqm_rssi_threshold_event last_rssi_event_type;
+ bool use_range_api;
int n_rssi_thresholds;
s32 rssi_thresholds[] __counted_by(n_rssi_thresholds);
};
*
* Copyright 2009 Luis R. Rodriguez <lrodriguez@atheros.com>
* Copyright 2007 Johannes Berg <johannes@sipsolutions.net>
+ * Copyright (C) 2023 Intel Corporation
*/
#include <linux/slab.h>
DEBUGFS_ADD(long_retry_limit);
DEBUGFS_ADD(ht40allow_map);
}
+
+struct debugfs_read_work {
+ struct wiphy_work work;
+ ssize_t (*handler)(struct wiphy *wiphy,
+ struct file *file,
+ char *buf,
+ size_t count,
+ void *data);
+ struct wiphy *wiphy;
+ struct file *file;
+ char *buf;
+ size_t bufsize;
+ void *data;
+ ssize_t ret;
+ struct completion completion;
+};
+
+static void wiphy_locked_debugfs_read_work(struct wiphy *wiphy,
+ struct wiphy_work *work)
+{
+ struct debugfs_read_work *w = container_of(work, typeof(*w), work);
+
+ w->ret = w->handler(w->wiphy, w->file, w->buf, w->bufsize, w->data);
+ complete(&w->completion);
+}
+
+static void wiphy_locked_debugfs_read_cancel(struct dentry *dentry,
+ void *data)
+{
+ struct debugfs_read_work *w = data;
+
+ wiphy_work_cancel(w->wiphy, &w->work);
+ complete(&w->completion);
+}
+
+ssize_t wiphy_locked_debugfs_read(struct wiphy *wiphy, struct file *file,
+ char *buf, size_t bufsize,
+ char __user *userbuf, size_t count,
+ loff_t *ppos,
+ ssize_t (*handler)(struct wiphy *wiphy,
+ struct file *file,
+ char *buf,
+ size_t bufsize,
+ void *data),
+ void *data)
+{
+ struct debugfs_read_work work = {
+ .handler = handler,
+ .wiphy = wiphy,
+ .file = file,
+ .buf = buf,
+ .bufsize = bufsize,
+ .data = data,
+ .ret = -ENODEV,
+ .completion = COMPLETION_INITIALIZER_ONSTACK(work.completion),
+ };
+ struct debugfs_cancellation cancellation = {
+ .cancel = wiphy_locked_debugfs_read_cancel,
+ .cancel_data = &work,
+ };
+
+ /* don't leak stack data or whatever */
+ memset(buf, 0, bufsize);
+
+ wiphy_work_init(&work.work, wiphy_locked_debugfs_read_work);
+ wiphy_work_queue(wiphy, &work.work);
+
+ debugfs_enter_cancellation(file, &cancellation);
+ wait_for_completion(&work.completion);
+ debugfs_leave_cancellation(file, &cancellation);
+
+ if (work.ret < 0)
+ return work.ret;
+
+ if (WARN_ON(work.ret > bufsize))
+ return -EINVAL;
+
+ return simple_read_from_buffer(userbuf, count, ppos, buf, work.ret);
+}
+EXPORT_SYMBOL_GPL(wiphy_locked_debugfs_read);
+
+struct debugfs_write_work {
+ struct wiphy_work work;
+ ssize_t (*handler)(struct wiphy *wiphy,
+ struct file *file,
+ char *buf,
+ size_t count,
+ void *data);
+ struct wiphy *wiphy;
+ struct file *file;
+ char *buf;
+ size_t count;
+ void *data;
+ ssize_t ret;
+ struct completion completion;
+};
+
+static void wiphy_locked_debugfs_write_work(struct wiphy *wiphy,
+ struct wiphy_work *work)
+{
+ struct debugfs_write_work *w = container_of(work, typeof(*w), work);
+
+ w->ret = w->handler(w->wiphy, w->file, w->buf, w->count, w->data);
+ complete(&w->completion);
+}
+
+static void wiphy_locked_debugfs_write_cancel(struct dentry *dentry,
+ void *data)
+{
+ struct debugfs_write_work *w = data;
+
+ wiphy_work_cancel(w->wiphy, &w->work);
+ complete(&w->completion);
+}
+
+ssize_t wiphy_locked_debugfs_write(struct wiphy *wiphy,
+ struct file *file, char *buf, size_t bufsize,
+ const char __user *userbuf, size_t count,
+ ssize_t (*handler)(struct wiphy *wiphy,
+ struct file *file,
+ char *buf,
+ size_t count,
+ void *data),
+ void *data)
+{
+ struct debugfs_write_work work = {
+ .handler = handler,
+ .wiphy = wiphy,
+ .file = file,
+ .buf = buf,
+ .count = count,
+ .data = data,
+ .ret = -ENODEV,
+ .completion = COMPLETION_INITIALIZER_ONSTACK(work.completion),
+ };
+ struct debugfs_cancellation cancellation = {
+ .cancel = wiphy_locked_debugfs_write_cancel,
+ .cancel_data = &work,
+ };
+
+ /* mostly used for strings so enforce NUL-termination for safety */
+ if (count >= bufsize)
+ return -EINVAL;
+
+ memset(buf, 0, bufsize);
+
+ if (copy_from_user(buf, userbuf, count))
+ return -EFAULT;
+
+ wiphy_work_init(&work.work, wiphy_locked_debugfs_write_work);
+ wiphy_work_queue(wiphy, &work.work);
+
+ debugfs_enter_cancellation(file, &cancellation);
+ wait_for_completion(&work.completion);
+ debugfs_leave_cancellation(file, &cancellation);
+
+ return work.ret;
+}
+EXPORT_SYMBOL_GPL(wiphy_locked_debugfs_write);
struct net_device *dev = wdev->netdev;
void *hdr;
+ lockdep_assert_wiphy(&rdev->wiphy);
+
WARN_ON(cmd != NL80211_CMD_NEW_INTERFACE &&
cmd != NL80211_CMD_DEL_INTERFACE &&
cmd != NL80211_CMD_SET_INTERFACE);
if_idx = 0;
+ wiphy_lock(&rdev->wiphy);
list_for_each_entry(wdev, &rdev->wiphy.wdev_list, list) {
if (if_idx < if_start) {
if_idx++;
cb->nlh->nlmsg_seq, NLM_F_MULTI,
rdev, wdev,
NL80211_CMD_NEW_INTERFACE) < 0) {
+ wiphy_unlock(&rdev->wiphy);
goto out;
}
if_idx++;
}
+ wiphy_unlock(&rdev->wiphy);
wp_idx++;
}
int i, n, low_index;
int err;
- /* RSSI reporting disabled? */
- if (!cqm_config)
- return rdev_set_cqm_rssi_range_config(rdev, dev, 0, 0);
-
/*
* Obtain current RSSI value if possible, if not and no RSSI threshold
* event has been received yet, we should receive an event after a
wdev->iftype != NL80211_IFTYPE_P2P_CLIENT)
return -EOPNOTSUPP;
- if (n_thresholds <= 1 && rdev->ops->set_cqm_rssi_config) {
- if (n_thresholds == 0 || thresholds[0] == 0) /* Disabling */
- return rdev_set_cqm_rssi_config(rdev, dev, 0, 0);
-
- return rdev_set_cqm_rssi_config(rdev, dev,
- thresholds[0], hysteresis);
- }
-
- if (!wiphy_ext_feature_isset(&rdev->wiphy,
- NL80211_EXT_FEATURE_CQM_RSSI_LIST))
- return -EOPNOTSUPP;
-
if (n_thresholds == 1 && thresholds[0] == 0) /* Disabling */
n_thresholds = 0;
old = wiphy_dereference(wdev->wiphy, wdev->cqm_config);
+ /* if already disabled just succeed */
+ if (!n_thresholds && !old)
+ return 0;
+
+ if (n_thresholds > 1) {
+ if (!wiphy_ext_feature_isset(&rdev->wiphy,
+ NL80211_EXT_FEATURE_CQM_RSSI_LIST) ||
+ !rdev->ops->set_cqm_rssi_range_config)
+ return -EOPNOTSUPP;
+ } else {
+ if (!rdev->ops->set_cqm_rssi_config)
+ return -EOPNOTSUPP;
+ }
+
if (n_thresholds) {
cqm_config = kzalloc(struct_size(cqm_config, rssi_thresholds,
n_thresholds),
memcpy(cqm_config->rssi_thresholds, thresholds,
flex_array_size(cqm_config, rssi_thresholds,
n_thresholds));
+ cqm_config->use_range_api = n_thresholds > 1 ||
+ !rdev->ops->set_cqm_rssi_config;
rcu_assign_pointer(wdev->cqm_config, cqm_config);
+
+ if (cqm_config->use_range_api)
+ err = cfg80211_cqm_rssi_update(rdev, dev, cqm_config);
+ else
+ err = rdev_set_cqm_rssi_config(rdev, dev,
+ thresholds[0],
+ hysteresis);
} else {
RCU_INIT_POINTER(wdev->cqm_config, NULL);
+ /* if enabled as range also disable via range */
+ if (old->use_range_api)
+ err = rdev_set_cqm_rssi_range_config(rdev, dev, 0, 0);
+ else
+ err = rdev_set_cqm_rssi_config(rdev, dev, 0, 0);
}
- err = cfg80211_cqm_rssi_update(rdev, dev, cqm_config);
if (err) {
rcu_assign_pointer(wdev->cqm_config, old);
kfree_rcu(cqm_config, rcu_head);
s32 rssi_level;
cqm_config = wiphy_dereference(wdev->wiphy, wdev->cqm_config);
- if (!wdev->cqm_config)
+ if (!cqm_config)
return;
- cfg80211_cqm_rssi_update(rdev, wdev->netdev, cqm_config);
+ if (cqm_config->use_range_api)
+ cfg80211_cqm_rssi_update(rdev, wdev->netdev, cqm_config);
rssi_level = cqm_config->last_rssi_event_value;
rssi_event = cqm_config->last_rssi_event_type;
rcu_read_lock();
if (xsk_check_common(xs))
- goto skip_tx;
+ goto out;
pool = xs->pool;
xsk_generic_xmit(sk);
}
-skip_tx:
if (xs->rx && !xskq_prod_is_empty(xs->rx))
mask |= EPOLLIN | EPOLLRDNORM;
if (xs->tx && xsk_tx_writeable(xs))
mask |= EPOLLOUT | EPOLLWRNORM;
-
+out:
rcu_read_unlock();
return mask;
}
module_init(xsk_diag_init);
module_exit(xsk_diag_exit);
MODULE_LICENSE("GPL");
+MODULE_DESCRIPTION("XDP socket monitoring via SOCK_DIAG");
MODULE_ALIAS_NET_PF_PROTO_TYPE(PF_NETLINK, NETLINK_SOCK_DIAG, AF_XDP);
UIMAGE_TYPE ?= kernel
UIMAGE_LOADADDR ?= arch_must_set_this
UIMAGE_ENTRYADDR ?= $(UIMAGE_LOADADDR)
-UIMAGE_NAME ?= 'Linux-$(KERNELRELEASE)'
+UIMAGE_NAME ?= Linux-$(KERNELRELEASE)
quiet_cmd_uimage = UIMAGE $@
cmd_uimage = $(BASH) $(MKIMAGE) -A $(UIMAGE_ARCH) -O linux \
-C $(UIMAGE_COMPRESSION) $(UIMAGE_OPTS-y) \
-T $(UIMAGE_TYPE) \
-a $(UIMAGE_LOADADDR) -e $(UIMAGE_ENTRYADDR) \
- -n $(UIMAGE_NAME) -d $< $@
+ -n '$(UIMAGE_NAME)' -d $< $@
# XZ
# ---------------------------------------------------------------------------
# 11160: a7 fb ff 60 aghi %r15,-160
# or
# 100092: e3 f0 ff c8 ff 71 lay %r15,-56(%r15)
- $re = qr/.*(?:lay|ag?hi).*\%r15,-(([0-9]{2}|[3-9])[0-9]{2})
- (?:\(\%r15\))?$/ox;
+ $re = qr/.*(?:lay|ag?hi).*\%r15,-([0-9]+)(?:\(\%r15\))?$/o;
} elsif ($arch eq 'sparc' || $arch eq 'sparc64') {
# f0019d10: 9d e3 bf 90 save %sp, -112, %sp
$re = qr/.*save.*%sp, -(([0-9]{2}|[3-9])[0-9]{2}), %sp/o;
while (my $line = <STDIN>) {
if ($line =~ m/$funcre/) {
$func = $1;
- next if $line !~ m/^($xs*)/;
+ next if $line !~ m/^($x*)/;
if ($total_size > $min_stack) {
push @stack, "$intro$total_size\n";
}
-
- $addr = $1;
- $addr =~ s/ /0/g;
- $addr = "0x$addr";
-
+ $addr = "0x$1";
$intro = "$addr $func [$file]:";
my $padlen = 56 - length($intro);
while ($padlen > 0) {
#!/usr/bin/env python3
# SPDX-License-Identifier: GPL-2.0-only
+import fnmatch
import os
-import glob
import re
import argparse
else:
print(*compatibles, sep='\n')
+def glob_without_symlinks(root, glob):
+ for path, dirs, files in os.walk(root):
+ # Ignore hidden directories
+ for d in dirs:
+ if fnmatch.fnmatch(d, ".*"):
+ dirs.remove(d)
+ for f in files:
+ if fnmatch.fnmatch(f, glob):
+ yield os.path.join(path, f)
+
def files_to_parse(path_args):
for f in path_args:
if os.path.isdir(f):
- for filename in glob.iglob(f + "/**/*.c", recursive=True):
+ for filename in glob_without_symlinks(f, "*.c"):
yield filename
else:
yield f
* if (argc <= 1)
* printf("%s: no command arguments :(\n", *argv);
* else
- * printf("%s: %d command arguments!\n", *argv, args - 1);
+ * printf("%s: %d command arguments!\n", *argv, argc - 1);
* }
*
* after:
* // perturb_local_entropy()
* } else {
* local_entropy ^= 3896280633962944730;
- * printf("%s: %d command arguments!\n", *argv, args - 1);
+ * printf("%s: %d command arguments!\n", *argv, argc - 1);
* }
*
* // latent_entropy_execute() 4.
{
const_tree fieldtype;
const_tree typesize;
- const_tree elemtype;
- const_tree elemsize;
fieldtype = TREE_TYPE(field);
typesize = TYPE_SIZE(fieldtype);
if (TREE_CODE(fieldtype) != ARRAY_TYPE)
return false;
- elemtype = TREE_TYPE(fieldtype);
- elemsize = TYPE_SIZE(elemtype);
-
/* size of type is represented in bits */
if (typesize == NULL_TREE && TYPE_DOMAIN(fieldtype) != NULL_TREE &&
TYPE_MAX_VALUE(TYPE_DOMAIN(fieldtype)) == NULL_TREE)
return true;
- if (typesize != NULL_TREE &&
- (TREE_CONSTANT(typesize) && (!tree_to_uhwi(typesize) ||
- tree_to_uhwi(typesize) == tree_to_uhwi(elemsize))))
- return true;
-
return false;
}
/*
* enforce that we don't randomize the layout of the last
- * element of a struct if it's a 0 or 1-length array
- * or a proper flexible array
+ * element of a struct if it's a proper flexible array
*/
if (is_flexible_array(newtree[num_fields - 1])) {
has_flexarray = true;
for kobj in kset_for_each_object(gdb.parse_and_eval('bus_kset')):
subsys = container_of(kobj, kset_type.get_type().pointer(), 'kobj')
subsys_priv = container_of(subsys, subsys_private_type.get_type().pointer(), 'subsys')
- yield subsys_priv['bus']
+ yield subsys_priv
def for_each_class():
for kobj in kset_for_each_object(gdb.parse_and_eval('class_kset')):
subsys = container_of(kobj, kset_type.get_type().pointer(), 'kobj')
subsys_priv = container_of(subsys, subsys_private_type.get_type().pointer(), 'subsys')
- yield subsys_priv['class']
+ yield subsys_priv
def get_bus_by_name(name):
for item in for_each_bus():
- if item['name'].string() == name:
+ if item['bus']['name'].string() == name:
return item
raise gdb.GdbError("Can't find bus type {!r}".format(name))
def get_class_by_name(name):
for item in for_each_class():
- if item['name'].string() == name:
+ if item['class']['name'].string() == name:
return item
raise gdb.GdbError("Can't find device class {!r}".format(name))
def bus_for_each_device(bus):
- for kn in klist_for_each(bus['p']['klist_devices']):
+ for kn in klist_for_each(bus['klist_devices']):
dp = container_of(kn, device_private_type.get_type().pointer(), 'knode_bus')
yield dp['device']
def class_for_each_device(cls):
- for kn in klist_for_each(cls['p']['klist_devices']):
+ for kn in klist_for_each(cls['klist_devices']):
dp = container_of(kn, device_private_type.get_type().pointer(), 'knode_class')
yield dp['device']
def invoke(self, arg, from_tty):
if not arg:
for bus in for_each_bus():
- gdb.write('bus {}:\t{}\n'.format(bus['name'].string(), bus))
+ gdb.write('bus {}:\t{}\n'.format(bus['bus']['name'].string(), bus))
for dev in bus_for_each_device(bus):
_show_device(dev, level=1)
else:
def invoke(self, arg, from_tty):
if not arg:
for cls in for_each_class():
- gdb.write("class {}:\t{}\n".format(cls['name'].string(), cls))
+ gdb.write("class {}:\t{}\n".format(cls['class']['name'].string(), cls))
for dev in class_for_each_device(cls):
_show_device(dev, level=1)
else:
import gdb
-from linux import utils
+from linux import utils, lists
task_type = utils.CachedType("struct task_struct")
def task_lists():
task_ptr_type = task_type.get_type().pointer()
init_task = gdb.parse_and_eval("init_task").address
- t = g = init_task
+ t = init_task
while True:
- while True:
- yield t
+ thread_head = t['signal']['thread_head']
+ for thread in lists.list_for_each_entry(thread_head, task_ptr_type, 'thread_node'):
+ yield thread
- t = utils.container_of(t['thread_group']['next'],
- task_ptr_type, "thread_group")
- if t == g:
- break
-
- t = g = utils.container_of(g['tasks']['next'],
- task_ptr_type, "tasks")
+ t = utils.container_of(t['tasks']['next'],
+ task_ptr_type, "tasks")
if t == init_task:
return
static void sym_validate_range(struct symbol *sym)
{
struct property *prop;
+ struct symbol *range_sym;
int base;
long long val, val2;
- char str[64];
switch (sym->type) {
case S_INT:
if (!prop)
return;
val = strtoll(sym->curr.val, NULL, base);
- val2 = sym_get_range_val(prop->expr->left.sym, base);
+ range_sym = prop->expr->left.sym;
+ val2 = sym_get_range_val(range_sym, base);
if (val >= val2) {
- val2 = sym_get_range_val(prop->expr->right.sym, base);
+ range_sym = prop->expr->right.sym;
+ val2 = sym_get_range_val(range_sym, base);
if (val <= val2)
return;
}
- if (sym->type == S_INT)
- sprintf(str, "%lld", val2);
- else
- sprintf(str, "0x%llx", val2);
- sym->curr.val = xstrdup(str);
+ sym->curr.val = range_sym->curr.val;
}
static void sym_set_changed(struct symbol *sym)
const Elf_Rela *rela;
for (rela = start; rela < stop; rela++) {
+ Elf_Sym *tsym;
Elf_Addr taddr, r_offset;
unsigned int r_type, r_sym;
r_offset = TO_NATIVE(rela->r_offset);
get_rel_type_and_sym(elf, rela->r_info, &r_type, &r_sym);
- taddr = TO_NATIVE(rela->r_addend);
+ tsym = elf->symtab_start + r_sym;
+ taddr = tsym->st_value + TO_NATIVE(rela->r_addend);
switch (elf->hdr->e_machine) {
case EM_RISCV:
break;
}
- check_section_mismatch(mod, elf, elf->symtab_start + r_sym,
+ check_section_mismatch(mod, elf, tsym,
fsecndx, fromsec, r_offset, taddr);
}
}
CMS_NOSMIMECAP | use_keyid |
use_signed_attrs),
"CMS_add1_signer");
- ERR(CMS_final(cms, bm, NULL, CMS_NOCERTS | CMS_BINARY) < 0,
+ ERR(CMS_final(cms, bm, NULL, CMS_NOCERTS | CMS_BINARY) != 1,
"CMS_final");
#else
b = BIO_new_file(sig_file_name, "wb");
ERR(!b, "%s", sig_file_name);
#ifndef USE_PKCS7
- ERR(i2d_CMS_bio_stream(b, cms, NULL, 0) < 0,
+ ERR(i2d_CMS_bio_stream(b, cms, NULL, 0) != 1,
"%s", sig_file_name);
#else
- ERR(i2d_PKCS7_bio(b, pkcs7) < 0,
+ ERR(i2d_PKCS7_bio(b, pkcs7) != 1,
"%s", sig_file_name);
#endif
BIO_free(b);
if (!raw_sig) {
#ifndef USE_PKCS7
- ERR(i2d_CMS_bio_stream(bd, cms, NULL, 0) < 0, "%s", dest_name);
+ ERR(i2d_CMS_bio_stream(bd, cms, NULL, 0) != 1, "%s", dest_name);
#else
- ERR(i2d_PKCS7_bio(bd, pkcs7) < 0, "%s", dest_name);
+ ERR(i2d_PKCS7_bio(bd, pkcs7) != 1, "%s", dest_name);
#endif
} else {
BIO *b;
ERR(BIO_write(bd, &sig_info, sizeof(sig_info)) < 0, "%s", dest_name);
ERR(BIO_write(bd, magic_number, sizeof(magic_number) - 1) < 0, "%s", dest_name);
- ERR(BIO_free(bd) < 0, "%s", dest_name);
+ ERR(BIO_free(bd) != 1, "%s", dest_name);
/* Finally, if we're signing in place, replace the original. */
if (replace_orig)
}
}
+/*
+ * Set the expiration time on a key.
+ */
+void key_set_expiry(struct key *key, time64_t expiry)
+{
+ key->expiry = expiry;
+ if (expiry != TIME64_MAX) {
+ if (!(key->type->flags & KEY_TYPE_INSTANT_REAP))
+ expiry += key_gc_delay;
+ key_schedule_gc(expiry);
+ }
+}
+
/*
* Schedule a dead links collection run.
*/
static u8 gc_state; /* Internal persistent state */
#define KEY_GC_REAP_AGAIN 0x01 /* - Need another cycle */
#define KEY_GC_REAPING_LINKS 0x02 /* - We need to reap links */
-#define KEY_GC_SET_TIMER 0x04 /* - We need to restart the timer */
#define KEY_GC_REAPING_DEAD_1 0x10 /* - We need to mark dead keys */
#define KEY_GC_REAPING_DEAD_2 0x20 /* - We need to reap dead key links */
#define KEY_GC_REAPING_DEAD_3 0x40 /* - We need to reap dead keys */
struct rb_node *cursor;
struct key *key;
- time64_t new_timer, limit;
+ time64_t new_timer, limit, expiry;
kenter("[%lx,%x]", key_gc_flags, gc_state);
limit = ktime_get_real_seconds();
- if (limit > key_gc_delay)
- limit -= key_gc_delay;
- else
- limit = key_gc_delay;
/* Work out what we're going to be doing in this pass */
gc_state &= KEY_GC_REAPING_DEAD_1 | KEY_GC_REAPING_DEAD_2;
gc_state <<= 1;
if (test_and_clear_bit(KEY_GC_KEY_EXPIRED, &key_gc_flags))
- gc_state |= KEY_GC_REAPING_LINKS | KEY_GC_SET_TIMER;
+ gc_state |= KEY_GC_REAPING_LINKS;
if (test_and_clear_bit(KEY_GC_REAP_KEYTYPE, &key_gc_flags))
gc_state |= KEY_GC_REAPING_DEAD_1;
}
}
- if (gc_state & KEY_GC_SET_TIMER) {
- if (key->expiry > limit && key->expiry < new_timer) {
+ expiry = key->expiry;
+ if (expiry != TIME64_MAX) {
+ if (!(key->type->flags & KEY_TYPE_INSTANT_REAP))
+ expiry += key_gc_delay;
+ if (expiry > limit && expiry < new_timer) {
kdebug("will expire %x in %lld",
key_serial(key), key->expiry - limit);
new_timer = key->expiry;
*/
kdebug("pass complete");
- if (gc_state & KEY_GC_SET_TIMER && new_timer != (time64_t)TIME64_MAX) {
+ if (new_timer != TIME64_MAX) {
new_timer += key_gc_delay;
key_schedule_gc(new_timer);
}
extern void keyring_gc(struct key *keyring, time64_t limit);
extern void keyring_restriction_gc(struct key *keyring,
struct key_type *dead_type);
+void key_set_expiry(struct key *key, time64_t expiry);
extern void key_schedule_gc(time64_t gc_at);
extern void key_schedule_gc_links(void);
extern void key_gc_keytype(struct key_type *ktype);
*/
static inline bool key_is_dead(const struct key *key, time64_t limit)
{
+ time64_t expiry = key->expiry;
+
+ if (expiry != TIME64_MAX) {
+ if (!(key->type->flags & KEY_TYPE_INSTANT_REAP))
+ expiry += key_gc_delay;
+ if (expiry <= limit)
+ return true;
+ }
+
return
key->flags & ((1 << KEY_FLAG_DEAD) |
(1 << KEY_FLAG_INVALIDATED)) ||
- (key->expiry > 0 && key->expiry <= limit) ||
key->domain_tag->removed;
}
key->uid = uid;
key->gid = gid;
key->perm = perm;
+ key->expiry = TIME64_MAX;
key->restrict_link = restrict_link;
key->last_used_at = ktime_get_real_seconds();
if (authkey)
key_invalidate(authkey);
- if (prep->expiry != TIME64_MAX) {
- key->expiry = prep->expiry;
- key_schedule_gc(prep->expiry + key_gc_delay);
- }
+ key_set_expiry(key, prep->expiry);
}
}
atomic_inc(&key->user->nikeys);
mark_key_instantiated(key, -error);
notify_key(key, NOTIFY_KEY_INSTANTIATED, -error);
- key->expiry = ktime_get_real_seconds() + timeout;
- key_schedule_gc(key->expiry + key_gc_delay);
+ key_set_expiry(key, ktime_get_real_seconds() + timeout);
if (test_and_clear_bit(KEY_FLAG_USER_CONSTRUCT, &key->flags))
awaken = 1;
void key_set_timeout(struct key *key, unsigned timeout)
{
- time64_t expiry = 0;
+ time64_t expiry = TIME64_MAX;
/* make the changes with the locks held to prevent races */
down_write(&key->sem);
if (timeout > 0)
expiry = ktime_get_real_seconds() + timeout;
-
- key->expiry = expiry;
- key_schedule_gc(key->expiry + key_gc_delay);
+ key_set_expiry(key, expiry);
up_write(&key->sem);
}
/* come up with a suitable timeout value */
expiry = READ_ONCE(key->expiry);
- if (expiry == 0) {
+ if (expiry == TIME64_MAX) {
memcpy(xbuf, "perm", 5);
} else if (now >= expiry) {
memcpy(xbuf, "expd", 5);
struct inode_security_struct *isec;
u32 sid;
- validate_creds(cred);
-
if (unlikely(IS_PRIVATE(inode)))
return 0;
struct inode_security_struct *isec;
u32 sid;
- validate_creds(cred);
-
ad.type = LSM_AUDIT_DATA_DENTRY;
ad.u.dentry = dentry;
sid = cred_sid(cred);
if (!mask)
return 0;
- validate_creds(cred);
-
if (unlikely(IS_PRIVATE(inode)))
return 0;
STATE(DRAINING),
STATE(PAUSED),
STATE(SUSPENDED),
+ STATE(DISCONNECTED),
};
static const char * const snd_pcm_access_names[] = {
struct pcmtst_buf_iter *v_iter = substream->runtime->private_data;
timer_shutdown_sync(&v_iter->timer_instance);
- v_iter->substream = NULL;
playback_capture_test = !v_iter->is_buf_corrupted;
kfree(v_iter);
return 0;
case SNDRV_PCM_TRIGGER_PAUSE_PUSH:
// We can't call timer_shutdown_sync here, as it is forbidden to sleep here
v_iter->suspend = true;
+ timer_delete(&v_iter->timer_instance);
break;
}
return snd_pcm_lib_ioctl(substream, cmd, arg);
}
+static int snd_pcmtst_sync_stop(struct snd_pcm_substream *substream)
+{
+ struct pcmtst_buf_iter *v_iter = substream->runtime->private_data;
+
+ timer_delete_sync(&v_iter->timer_instance);
+
+ return 0;
+}
+
static const struct snd_pcm_ops snd_pcmtst_playback_ops = {
.open = snd_pcmtst_pcm_open,
.close = snd_pcmtst_pcm_close,
.trigger = snd_pcmtst_pcm_trigger,
.hw_params = snd_pcmtst_pcm_hw_params,
.ioctl = snd_pcmtst_ioctl,
+ .sync_stop = snd_pcmtst_sync_stop,
.hw_free = snd_pcmtst_pcm_hw_free,
.prepare = snd_pcmtst_pcm_prepare,
.pointer = snd_pcmtst_pcm_pointer,
.hw_params = snd_pcmtst_pcm_hw_params,
.hw_free = snd_pcmtst_pcm_hw_free,
.ioctl = snd_pcmtst_ioctl,
+ .sync_stop = snd_pcmtst_sync_stop,
.prepare = snd_pcmtst_pcm_prepare,
.pointer = snd_pcmtst_pcm_pointer,
};
static struct nhlt_specific_cfg *
nhlt_get_specific_cfg(struct device *dev, struct nhlt_fmt *fmt, u8 num_ch,
- u32 rate, u8 vbps, u8 bps)
+ u32 rate, u8 vbps, u8 bps, bool ignore_vbps)
{
struct nhlt_fmt_cfg *cfg = fmt->fmt_config;
struct wav_fmt *wfmt;
dev_dbg(dev, "Endpoint format: ch=%d fmt=%d/%d rate=%d\n",
wfmt->channels, _vbps, _bps, wfmt->samples_per_sec);
+ /*
+ * When looking for exact match of configuration ignore the vbps
+ * from NHLT table when ignore_vbps is true
+ */
if (wfmt->channels == num_ch && wfmt->samples_per_sec == rate &&
- vbps == _vbps && bps == _bps)
+ (ignore_vbps || vbps == _vbps) && bps == _bps)
return &cfg->config;
cfg = (struct nhlt_fmt_cfg *)(cfg->config.caps + cfg->config.size);
{
struct nhlt_specific_cfg *cfg;
struct nhlt_endpoint *epnt;
+ bool ignore_vbps = false;
struct nhlt_fmt *fmt;
int i;
dev_dbg(dev, "Looking for configuration:\n");
dev_dbg(dev, " vbus_id=%d link_type=%d dir=%d, dev_type=%d\n",
bus_id, link_type, dir, dev_type);
- dev_dbg(dev, " ch=%d fmt=%d/%d rate=%d\n", num_ch, vbps, bps, rate);
+ if (link_type == NHLT_LINK_DMIC && bps == 32 && (vbps == 24 || vbps == 32)) {
+ /*
+ * The DMIC hardware supports only one type of 32 bits sample
+ * size, which is 24 bit sampling on the MSB side and bits[1:0]
+ * are used for indicating the channel number.
+ * It has been observed that some NHLT tables have the vbps
+ * specified as 32 while some uses 24.
+ * The format these variations describe are identical, the
+ * hardware is configured and behaves the same way.
+ * Note: when the samples assumed to be vbps=32 then the 'noise'
+ * introduced by the lower two bits (channel number) have no
+ * real life implication on audio quality.
+ */
+ dev_dbg(dev,
+ " ch=%d fmt=%d rate=%d (vbps is ignored for DMIC 32bit format)\n",
+ num_ch, bps, rate);
+ ignore_vbps = true;
+ } else {
+ dev_dbg(dev, " ch=%d fmt=%d/%d rate=%d\n", num_ch, vbps, bps, rate);
+ }
dev_dbg(dev, "Endpoint count=%d\n", nhlt->endpoint_count);
epnt = (struct nhlt_endpoint *)nhlt->desc;
if (nhlt_check_ep_match(dev, epnt, bus_id, link_type, dir, dev_type)) {
fmt = (struct nhlt_fmt *)(epnt->config.caps + epnt->config.size);
- cfg = nhlt_get_specific_cfg(dev, fmt, num_ch, rate, vbps, bps);
+ cfg = nhlt_get_specific_cfg(dev, fmt, num_ch, rate,
+ vbps, bps, ignore_vbps);
if (cfg)
return cfg;
}
cs_dsp_stop(dsp);
cs_dsp_power_down(dsp);
- cs35l41->firmware_running = false;
dev_dbg(cs35l41->dev, "Unloaded Firmware\n");
}
cs35l41->playback_started = true;
- if (cs35l41->firmware_running) {
+ if (cs35l41->cs_dsp.running) {
regmap_multi_reg_write(reg, cs35l41_hda_config_dsp,
ARRAY_SIZE(cs35l41_hda_config_dsp));
regmap_update_bits(reg, CS35L41_PWR_CTRL2,
regmap_multi_reg_write(reg, cs35l41_hda_mute, ARRAY_SIZE(cs35l41_hda_mute));
} else {
dev_dbg(dev, "Unmuting\n");
- if (cs35l41->firmware_running) {
+ if (cs35l41->cs_dsp.running) {
regmap_multi_reg_write(reg, cs35l41_hda_unmute_dsp,
ARRAY_SIZE(cs35l41_hda_unmute_dsp));
} else {
dev_dbg(dev, "Play (Complete)\n");
cs35l41_global_enable(dev, reg, cs35l41->hw_cfg.bst_type, 1,
- cs35l41->firmware_running);
+ &cs35l41->cs_dsp);
cs35l41_mute(dev, false);
}
cs35l41_mute(dev, true);
cs35l41_global_enable(dev, reg, cs35l41->hw_cfg.bst_type, 0,
- cs35l41->firmware_running);
+ &cs35l41->cs_dsp);
}
static void cs35l41_hda_pause_done(struct device *dev)
regmap_update_bits(reg, CS35L41_PWR_CTRL2, CS35L41_AMP_EN_MASK, 0 << CS35L41_AMP_EN_SHIFT);
if (cs35l41->hw_cfg.bst_type == CS35L41_EXT_BOOST)
regmap_write(reg, CS35L41_GPIO1_CTRL1, 0x00000001);
- if (cs35l41->firmware_running) {
+ if (cs35l41->cs_dsp.running) {
cs35l41_set_cspl_mbox_cmd(dev, reg, CSPL_MBOX_CMD_PAUSE);
regmap_update_bits(reg, CS35L41_PWR_CTRL2,
CS35L41_VMON_EN_MASK | CS35L41_IMON_EN_MASK,
break;
case HDA_GEN_PCM_ACT_CLOSE:
mutex_lock(&cs35l41->fw_mutex);
- if (!cs35l41->firmware_running && cs35l41->request_fw_load &&
+ if (!cs35l41->cs_dsp.running && cs35l41->request_fw_load &&
!cs35l41->fw_request_ongoing) {
dev_info(dev, "Requesting Firmware Load after HDA_GEN_PCM_ACT_CLOSE\n");
cs35l41->fw_request_ongoing = true;
static int cs35l41_ready_for_reset(struct cs35l41_hda *cs35l41)
{
mutex_lock(&cs35l41->fw_mutex);
- if (cs35l41->firmware_running) {
+ if (cs35l41->cs_dsp.running) {
cs35l41->cs_dsp.running = false;
cs35l41->cs_dsp.booted = false;
- cs35l41->firmware_running = false;
}
regcache_mark_dirty(cs35l41->regmap);
mutex_unlock(&cs35l41->fw_mutex);
mutex_lock(&cs35l41->fw_mutex);
- if (cs35l41->firmware_running) {
+ if (cs35l41->cs_dsp.running) {
ret = cs35l41_enter_hibernate(cs35l41->dev, cs35l41->regmap,
cs35l41->hw_cfg.bst_type);
if (ret)
regcache_cache_only(cs35l41->regmap, false);
- if (cs35l41->firmware_running) {
+ if (cs35l41->cs_dsp.running) {
ret = cs35l41_exit_hibernate(cs35l41->dev, cs35l41->regmap);
if (ret) {
dev_warn(cs35l41->dev, "Unable to exit Hibernate.");
goto clean_dsp;
}
- cs35l41->firmware_running = true;
-
return 0;
clean_dsp:
static void cs35l41_load_firmware(struct cs35l41_hda *cs35l41, bool load)
{
- if (cs35l41->firmware_running && !load) {
+ if (cs35l41->cs_dsp.running && !load) {
dev_dbg(cs35l41->dev, "Unloading Firmware\n");
cs35l41_shutdown_dsp(cs35l41);
- } else if (!cs35l41->firmware_running && load) {
+ } else if (!cs35l41->cs_dsp.running && load) {
dev_dbg(cs35l41->dev, "Loading Firmware\n");
cs35l41_smart_amp(cs35l41);
} else {
cs35l41->acpi_subsystem_id, cs35l41->hw_cfg.bst_type,
cs35l41->hw_cfg.gpio1.func == CS35l41_VSPK_SWITCH,
cs35l41->hw_cfg.spk_pos ? 'R' : 'L',
- cs35l41->firmware_running, cs35l41->speaker_id);
+ cs35l41->cs_dsp.running, cs35l41->speaker_id);
return ret;
}
if (cs35l41_safe_reset(cs35l41->regmap, cs35l41->hw_cfg.bst_type))
gpiod_set_value_cansleep(cs35l41->reset_gpio, 0);
gpiod_put(cs35l41->reset_gpio);
+ gpiod_put(cs35l41->cs_gpio);
acpi_dev_put(cs35l41->dacpi);
kfree(cs35l41->acpi_subsystem_id);
if (cs35l41_safe_reset(cs35l41->regmap, cs35l41->hw_cfg.bst_type))
gpiod_set_value_cansleep(cs35l41->reset_gpio, 0);
gpiod_put(cs35l41->reset_gpio);
+ gpiod_put(cs35l41->cs_gpio);
kfree(cs35l41->acpi_subsystem_id);
}
EXPORT_SYMBOL_NS_GPL(cs35l41_hda_remove, SND_HDA_SCODEC_CS35L41);
} __packed;
enum cs35l41_hda_spk_pos {
- CS35l41_LEFT,
- CS35l41_RIGHT,
+ CS35L41_LEFT,
+ CS35L41_RIGHT,
};
enum cs35l41_hda_gpio_function {
struct device *dev;
struct regmap *regmap;
struct gpio_desc *reset_gpio;
+ struct gpio_desc *cs_gpio;
struct cs35l41_hw_cfg hw_cfg;
struct hda_codec *codec;
//
// Author: Stefan Binding <sbinding@opensource.cirrus.com>
+#include <linux/acpi.h>
#include <linux/gpio/consumer.h>
#include <linux/string.h>
#include "cs35l41_hda_property.h"
+#include <linux/spi/spi.h>
+
+#define MAX_AMPS 4
+
+struct cs35l41_config {
+ const char *ssid;
+ enum {
+ SPI,
+ I2C
+ } bus;
+ int num_amps;
+ enum {
+ INTERNAL,
+ EXTERNAL
+ } boost_type;
+ u8 channel[MAX_AMPS];
+ int reset_gpio_index; /* -1 if no reset gpio */
+ int spkid_gpio_index; /* -1 if no spkid gpio */
+ int cs_gpio_index; /* -1 if no cs gpio, or cs-gpios already exists, max num amps == 2 */
+ int boost_ind_nanohenry; /* Required if boost_type == Internal */
+ int boost_peak_milliamp; /* Required if boost_type == Internal */
+ int boost_cap_microfarad; /* Required if boost_type == Internal */
+};
+
+static const struct cs35l41_config cs35l41_config_table[] = {
+/*
+ * Device 103C89C6 does have _DSD, however it is setup to use the wrong boost type.
+ * We can override the _DSD to correct the boost type here.
+ * Since this laptop has valid ACPI, we do not need to handle cs-gpios, since that already exists
+ * in the ACPI. The Reset GPIO is also valid, so we can use the Reset defined in _DSD.
+ */
+ { "103C89C6", SPI, 2, INTERNAL, { CS35L41_RIGHT, CS35L41_LEFT, 0, 0 }, -1, -1, -1, 1000, 4500, 24 },
+ { "104312AF", SPI, 2, INTERNAL, { CS35L41_LEFT, CS35L41_RIGHT, 0, 0 }, 1, 2, 0, 1000, 4500, 24 },
+ { "10431433", I2C, 2, INTERNAL, { CS35L41_LEFT, CS35L41_RIGHT, 0, 0 }, 0, 1, -1, 1000, 4500, 24 },
+ { "10431463", I2C, 2, INTERNAL, { CS35L41_LEFT, CS35L41_RIGHT, 0, 0 }, 0, 1, -1, 1000, 4500, 24 },
+ { "10431473", SPI, 2, INTERNAL, { CS35L41_LEFT, CS35L41_RIGHT, 0, 0 }, 1, -1, 0, 1000, 4500, 24 },
+ { "10431483", SPI, 2, INTERNAL, { CS35L41_LEFT, CS35L41_RIGHT, 0, 0 }, 1, -1, 0, 1000, 4500, 24 },
+ { "10431493", SPI, 2, INTERNAL, { CS35L41_LEFT, CS35L41_RIGHT, 0, 0 }, 1, 2, 0, 1000, 4500, 24 },
+ { "104314D3", SPI, 2, INTERNAL, { CS35L41_LEFT, CS35L41_RIGHT, 0, 0 }, 1, 2, 0, 1000, 4500, 24 },
+ { "104314E3", I2C, 2, INTERNAL, { CS35L41_LEFT, CS35L41_RIGHT, 0, 0 }, 0, 1, -1, 1000, 4500, 24 },
+ { "10431503", I2C, 2, INTERNAL, { CS35L41_LEFT, CS35L41_RIGHT, 0, 0 }, 0, 1, -1, 1000, 4500, 24 },
+ { "10431533", I2C, 2, INTERNAL, { CS35L41_LEFT, CS35L41_RIGHT, 0, 0 }, 0, 1, -1, 1000, 4500, 24 },
+ { "10431573", SPI, 2, INTERNAL, { CS35L41_LEFT, CS35L41_RIGHT, 0, 0 }, 1, 2, 0, 1000, 4500, 24 },
+ { "10431663", SPI, 2, INTERNAL, { CS35L41_LEFT, CS35L41_RIGHT, 0, 0 }, 1, -1, 0, 1000, 4500, 24 },
+ { "104316D3", SPI, 2, EXTERNAL, { CS35L41_LEFT, CS35L41_RIGHT, 0, 0 }, 1, 2, 0, 0, 0, 0 },
+ { "104316F3", SPI, 2, EXTERNAL, { CS35L41_LEFT, CS35L41_RIGHT, 0, 0 }, 1, 2, 0, 0, 0, 0 },
+ { "104317F3", I2C, 2, INTERNAL, { CS35L41_LEFT, CS35L41_RIGHT, 0, 0 }, 0, 1, -1, 1000, 4500, 24 },
+ { "10431863", SPI, 2, INTERNAL, { CS35L41_LEFT, CS35L41_RIGHT, 0, 0 }, 1, 2, 0, 1000, 4500, 24 },
+ { "104318D3", I2C, 2, EXTERNAL, { CS35L41_LEFT, CS35L41_RIGHT, 0, 0 }, 0, 1, -1, 0, 0, 0 },
+ { "10431C9F", SPI, 2, INTERNAL, { CS35L41_LEFT, CS35L41_RIGHT, 0, 0 }, 1, 2, 0, 1000, 4500, 24 },
+ { "10431CAF", SPI, 2, INTERNAL, { CS35L41_LEFT, CS35L41_RIGHT, 0, 0 }, 1, 2, 0, 1000, 4500, 24 },
+ { "10431CCF", SPI, 2, INTERNAL, { CS35L41_LEFT, CS35L41_RIGHT, 0, 0 }, 1, 2, 0, 1000, 4500, 24 },
+ { "10431CDF", SPI, 2, INTERNAL, { CS35L41_LEFT, CS35L41_RIGHT, 0, 0 }, 1, 2, 0, 1000, 4500, 24 },
+ { "10431CEF", SPI, 2, INTERNAL, { CS35L41_LEFT, CS35L41_RIGHT, 0, 0 }, 1, 2, 0, 1000, 4500, 24 },
+ { "10431D1F", I2C, 2, INTERNAL, { CS35L41_LEFT, CS35L41_RIGHT, 0, 0 }, 0, 1, -1, 1000, 4500, 24 },
+ { "10431DA2", SPI, 2, EXTERNAL, { CS35L41_LEFT, CS35L41_RIGHT, 0, 0 }, 1, 2, 0, 0, 0, 0 },
+ { "10431E02", SPI, 2, EXTERNAL, { CS35L41_LEFT, CS35L41_RIGHT, 0, 0 }, 1, 2, 0, 0, 0, 0 },
+ { "10431EE2", I2C, 2, EXTERNAL, { CS35L41_LEFT, CS35L41_RIGHT, 0, 0 }, 0, -1, -1, 0, 0, 0 },
+ { "10431F12", I2C, 2, INTERNAL, { CS35L41_LEFT, CS35L41_RIGHT, 0, 0 }, 0, 1, -1, 1000, 4500, 24 },
+ { "10431F1F", SPI, 2, EXTERNAL, { CS35L41_LEFT, CS35L41_RIGHT, 0, 0 }, 1, -1, 0, 0, 0, 0 },
+ { "10431F62", SPI, 2, EXTERNAL, { CS35L41_LEFT, CS35L41_RIGHT, 0, 0 }, 1, 2, 0, 0, 0, 0 },
+ {}
+};
+
+static int cs35l41_add_gpios(struct cs35l41_hda *cs35l41, struct device *physdev, int reset_gpio,
+ int spkid_gpio, int cs_gpio_index, int num_amps)
+{
+ struct acpi_gpio_mapping *gpio_mapping = NULL;
+ struct acpi_gpio_params *reset_gpio_params = NULL;
+ struct acpi_gpio_params *spkid_gpio_params = NULL;
+ struct acpi_gpio_params *cs_gpio_params = NULL;
+ unsigned int num_entries = 0;
+ unsigned int reset_index, spkid_index, csgpio_index;
+ int i;
+
+ /*
+ * GPIO Mapping only needs to be done once, since it would be available for subsequent amps
+ */
+ if (cs35l41->dacpi->driver_gpios)
+ return 0;
+
+ if (reset_gpio >= 0) {
+ reset_index = num_entries;
+ num_entries++;
+ }
+
+ if (spkid_gpio >= 0) {
+ spkid_index = num_entries;
+ num_entries++;
+ }
+
+ if ((cs_gpio_index >= 0) && (num_amps == 2)) {
+ csgpio_index = num_entries;
+ num_entries++;
+ }
+
+ if (!num_entries)
+ return 0;
+
+ /* must include termination entry */
+ num_entries++;
+
+ gpio_mapping = devm_kcalloc(physdev, num_entries, sizeof(struct acpi_gpio_mapping),
+ GFP_KERNEL);
+
+ if (!gpio_mapping)
+ goto err;
+
+ if (reset_gpio >= 0) {
+ gpio_mapping[reset_index].name = "reset-gpios";
+ reset_gpio_params = devm_kcalloc(physdev, num_amps, sizeof(struct acpi_gpio_params),
+ GFP_KERNEL);
+ if (!reset_gpio_params)
+ goto err;
+
+ for (i = 0; i < num_amps; i++)
+ reset_gpio_params[i].crs_entry_index = reset_gpio;
+
+ gpio_mapping[reset_index].data = reset_gpio_params;
+ gpio_mapping[reset_index].size = num_amps;
+ }
+
+ if (spkid_gpio >= 0) {
+ gpio_mapping[spkid_index].name = "spk-id-gpios";
+ spkid_gpio_params = devm_kcalloc(physdev, num_amps, sizeof(struct acpi_gpio_params),
+ GFP_KERNEL);
+ if (!spkid_gpio_params)
+ goto err;
+
+ for (i = 0; i < num_amps; i++)
+ spkid_gpio_params[i].crs_entry_index = spkid_gpio;
+
+ gpio_mapping[spkid_index].data = spkid_gpio_params;
+ gpio_mapping[spkid_index].size = num_amps;
+ }
+
+ if ((cs_gpio_index >= 0) && (num_amps == 2)) {
+ gpio_mapping[csgpio_index].name = "cs-gpios";
+ /* only one GPIO CS is supported without using _DSD, obtained using index 0 */
+ cs_gpio_params = devm_kzalloc(physdev, sizeof(struct acpi_gpio_params), GFP_KERNEL);
+ if (!cs_gpio_params)
+ goto err;
+
+ cs_gpio_params->crs_entry_index = cs_gpio_index;
+
+ gpio_mapping[csgpio_index].data = cs_gpio_params;
+ gpio_mapping[csgpio_index].size = 1;
+ }
+
+ return devm_acpi_dev_add_driver_gpios(physdev, gpio_mapping);
+err:
+ devm_kfree(physdev, gpio_mapping);
+ devm_kfree(physdev, reset_gpio_params);
+ devm_kfree(physdev, spkid_gpio_params);
+ devm_kfree(physdev, cs_gpio_params);
+ return -ENOMEM;
+}
+
+static int generic_dsd_config(struct cs35l41_hda *cs35l41, struct device *physdev, int id,
+ const char *hid)
+{
+ struct cs35l41_hw_cfg *hw_cfg = &cs35l41->hw_cfg;
+ const struct cs35l41_config *cfg;
+ struct gpio_desc *cs_gpiod;
+ struct spi_device *spi;
+ bool dsd_found;
+ int ret;
+
+ for (cfg = cs35l41_config_table; cfg->ssid; cfg++) {
+ if (!strcasecmp(cfg->ssid, cs35l41->acpi_subsystem_id))
+ break;
+ }
+
+ if (!cfg->ssid)
+ return -ENOENT;
+
+ if (!cs35l41->dacpi || cs35l41->dacpi != ACPI_COMPANION(physdev)) {
+ dev_err(cs35l41->dev, "ACPI Device does not match, cannot override _DSD.\n");
+ return -ENODEV;
+ }
+
+ dev_info(cs35l41->dev, "Adding DSD properties for %s\n", cs35l41->acpi_subsystem_id);
+
+ dsd_found = acpi_dev_has_props(cs35l41->dacpi);
+
+ if (!dsd_found) {
+ ret = cs35l41_add_gpios(cs35l41, physdev, cfg->reset_gpio_index,
+ cfg->spkid_gpio_index, cfg->cs_gpio_index,
+ cfg->num_amps);
+ if (ret) {
+ dev_err(cs35l41->dev, "Error adding GPIO mapping: %d\n", ret);
+ return ret;
+ }
+ } else if (cfg->reset_gpio_index >= 0 || cfg->spkid_gpio_index >= 0) {
+ dev_warn(cs35l41->dev, "Cannot add Reset/Speaker ID/SPI CS GPIO Mapping, "
+ "_DSD already exists.\n");
+ }
+
+ if (cfg->bus == SPI) {
+ cs35l41->index = id;
+
+#if IS_ENABLED(CONFIG_SPI)
+ /*
+ * Manually set the Chip Select for the second amp <cs_gpio_index> in the node.
+ * This is only supported for systems with 2 amps, since we cannot expand the
+ * default number of chip selects without using cs-gpios
+ * The CS GPIO must be set high prior to communicating with the first amp (which
+ * uses a native chip select), to ensure the second amp does not clash with the
+ * first.
+ */
+ if (cfg->cs_gpio_index >= 0) {
+ spi = to_spi_device(cs35l41->dev);
+
+ if (cfg->num_amps != 2) {
+ dev_warn(cs35l41->dev,
+ "Cannot update SPI CS, Number of Amps (%d) != 2\n",
+ cfg->num_amps);
+ } else if (dsd_found) {
+ dev_warn(cs35l41->dev,
+ "Cannot update SPI CS, _DSD already exists.\n");
+ } else {
+ /*
+ * This is obtained using driver_gpios, since only one GPIO for CS
+ * exists, this can be obtained using index 0.
+ */
+ cs_gpiod = gpiod_get_index(physdev, "cs", 0, GPIOD_OUT_LOW);
+ if (IS_ERR(cs_gpiod)) {
+ dev_err(cs35l41->dev,
+ "Unable to get Chip Select GPIO descriptor\n");
+ return PTR_ERR(cs_gpiod);
+ }
+ if (id == 1) {
+ spi_set_csgpiod(spi, 0, cs_gpiod);
+ cs35l41->cs_gpio = cs_gpiod;
+ } else {
+ gpiod_set_value_cansleep(cs_gpiod, true);
+ gpiod_put(cs_gpiod);
+ }
+ spi_setup(spi);
+ }
+ }
+#endif
+ } else {
+ if (cfg->num_amps > 2)
+ /*
+ * i2c addresses for 3/4 amps are used in order: 0x40, 0x41, 0x42, 0x43,
+ * subtracting 0x40 would give zero-based index
+ */
+ cs35l41->index = id - 0x40;
+ else
+ /* i2c addr 0x40 for first amp (always), 0x41/0x42 for 2nd amp */
+ cs35l41->index = id == 0x40 ? 0 : 1;
+ }
+
+ if (cfg->num_amps == 3)
+ /* 3 amps means a center channel, so no duplicate channels */
+ cs35l41->channel_index = 0;
+ else
+ /*
+ * if 4 amps, there are duplicate channels, so they need different indexes
+ * if 2 amps, no duplicate channels, channel_index would be 0
+ */
+ cs35l41->channel_index = cs35l41->index / 2;
+
+ cs35l41->reset_gpio = fwnode_gpiod_get_index(acpi_fwnode_handle(cs35l41->dacpi), "reset",
+ cs35l41->index, GPIOD_OUT_LOW,
+ "cs35l41-reset");
+ cs35l41->speaker_id = cs35l41_get_speaker_id(physdev, cs35l41->index, cfg->num_amps, -1);
+
+ hw_cfg->spk_pos = cfg->channel[cs35l41->index];
+
+ if (cfg->boost_type == INTERNAL) {
+ hw_cfg->bst_type = CS35L41_INT_BOOST;
+ hw_cfg->bst_ind = cfg->boost_ind_nanohenry;
+ hw_cfg->bst_ipk = cfg->boost_peak_milliamp;
+ hw_cfg->bst_cap = cfg->boost_cap_microfarad;
+ hw_cfg->gpio1.func = CS35L41_NOT_USED;
+ hw_cfg->gpio1.valid = true;
+ } else {
+ hw_cfg->bst_type = CS35L41_EXT_BOOST;
+ hw_cfg->bst_ind = -1;
+ hw_cfg->bst_ipk = -1;
+ hw_cfg->bst_cap = -1;
+ hw_cfg->gpio1.func = CS35l41_VSPK_SWITCH;
+ hw_cfg->gpio1.valid = true;
+ }
+
+ hw_cfg->gpio2.func = CS35L41_INTERRUPT;
+ hw_cfg->gpio2.valid = true;
+ hw_cfg->valid = true;
+
+ return 0;
+}
/*
* Device CLSA010(0/1) doesn't have _DSD so a gpiod_get by the label reset won't work.
return 0;
}
-/*
- * Device 103C89C6 does have _DSD, however it is setup to use the wrong boost type.
- * We can override the _DSD to correct the boost type here.
- * Since this laptop has valid ACPI, we do not need to handle cs-gpios, since that already exists
- * in the ACPI.
- */
-static int hp_vision_acpi_fix(struct cs35l41_hda *cs35l41, struct device *physdev, int id,
- const char *hid)
-{
- struct cs35l41_hw_cfg *hw_cfg = &cs35l41->hw_cfg;
-
- dev_info(cs35l41->dev, "Adding DSD properties for %s\n", cs35l41->acpi_subsystem_id);
-
- cs35l41->index = id;
- cs35l41->channel_index = 0;
-
- /*
- * This system has _DSD, it just contains an error, so we can still get the reset using
- * the "reset" label.
- */
- cs35l41->reset_gpio = fwnode_gpiod_get_index(acpi_fwnode_handle(cs35l41->dacpi), "reset",
- cs35l41->index, GPIOD_OUT_LOW,
- "cs35l41-reset");
- cs35l41->speaker_id = -ENOENT;
- hw_cfg->spk_pos = cs35l41->index ? 0 : 1; // right:left
- hw_cfg->gpio1.func = CS35L41_NOT_USED;
- hw_cfg->gpio1.valid = true;
- hw_cfg->gpio2.func = CS35L41_INTERRUPT;
- hw_cfg->gpio2.valid = true;
- hw_cfg->bst_type = CS35L41_INT_BOOST;
- hw_cfg->bst_ind = 1000;
- hw_cfg->bst_ipk = 4500;
- hw_cfg->bst_cap = 24;
- hw_cfg->valid = true;
-
- return 0;
-}
-
struct cs35l41_prop_model {
const char *hid;
const char *ssid;
static const struct cs35l41_prop_model cs35l41_prop_model_table[] = {
{ "CLSA0100", NULL, lenovo_legion_no_acpi },
{ "CLSA0101", NULL, lenovo_legion_no_acpi },
- { "CSC3551", "103C89C6", hp_vision_acpi_fix },
+ { "CSC3551", "103C89C6", generic_dsd_config },
+ { "CSC3551", "104312AF", generic_dsd_config },
+ { "CSC3551", "10431433", generic_dsd_config },
+ { "CSC3551", "10431463", generic_dsd_config },
+ { "CSC3551", "10431473", generic_dsd_config },
+ { "CSC3551", "10431483", generic_dsd_config },
+ { "CSC3551", "10431493", generic_dsd_config },
+ { "CSC3551", "104314D3", generic_dsd_config },
+ { "CSC3551", "104314E3", generic_dsd_config },
+ { "CSC3551", "10431503", generic_dsd_config },
+ { "CSC3551", "10431533", generic_dsd_config },
+ { "CSC3551", "10431573", generic_dsd_config },
+ { "CSC3551", "10431663", generic_dsd_config },
+ { "CSC3551", "104316D3", generic_dsd_config },
+ { "CSC3551", "104316F3", generic_dsd_config },
+ { "CSC3551", "104317F3", generic_dsd_config },
+ { "CSC3551", "10431863", generic_dsd_config },
+ { "CSC3551", "104318D3", generic_dsd_config },
+ { "CSC3551", "10431C9F", generic_dsd_config },
+ { "CSC3551", "10431CAF", generic_dsd_config },
+ { "CSC3551", "10431CCF", generic_dsd_config },
+ { "CSC3551", "10431CDF", generic_dsd_config },
+ { "CSC3551", "10431CEF", generic_dsd_config },
+ { "CSC3551", "10431D1F", generic_dsd_config },
+ { "CSC3551", "10431DA2", generic_dsd_config },
+ { "CSC3551", "10431E02", generic_dsd_config },
+ { "CSC3551", "10431EE2", generic_dsd_config },
+ { "CSC3551", "10431F12", generic_dsd_config },
+ { "CSC3551", "10431F1F", generic_dsd_config },
+ { "CSC3551", "10431F62", generic_dsd_config },
{}
};
if (!strcmp(model->hid, hid) &&
(!model->ssid ||
(cs35l41->acpi_subsystem_id &&
- !strcmp(model->ssid, cs35l41->acpi_subsystem_id))))
+ !strcasecmp(model->ssid, cs35l41->acpi_subsystem_id))))
return model->add_prop(cs35l41, physdev, id, hid);
}
return -ENOMEM;
cs35l56->base.dev = &clt->dev;
+
+#ifdef CS35L56_WAKE_HOLD_TIME_US
+ cs35l56->base.can_hibernate = true;
+#endif
cs35l56->base.regmap = devm_regmap_init_i2c(clt, &cs35l56_regmap_i2c);
if (IS_ERR(cs35l56->base.regmap)) {
ret = PTR_ERR(cs35l56->base.regmap);
return -ENOMEM;
cs35l56->base.dev = &spi->dev;
+
+#ifdef CS35L56_WAKE_HOLD_TIME_US
+ cs35l56->base.can_hibernate = true;
+#endif
cs35l56->base.regmap = devm_regmap_init_spi(spi, &cs35l56_regmap_spi);
if (IS_ERR(cs35l56->base.regmap)) {
ret = PTR_ERR(cs35l56->base.regmap);
if (chip->driver_caps & AZX_DCAPS_I915_COMPONENT) {
err = snd_hdac_i915_init(azx_bus(chip));
if (err < 0) {
+ if (err == -EPROBE_DEFER)
+ goto out_free;
+
/* if the controller is bound only with HDMI/DP
* (for HSW and BDW), we need to abort the probe;
* for other chips, still continue probing as other
SND_PCI_QUIRK(0x17aa, 0x36a7, "Lenovo C50 All in one", 0),
/* https://bugs.launchpad.net/bugs/1821663 */
SND_PCI_QUIRK(0x1631, 0xe017, "Packard Bell NEC IMEDIA 5204", 0),
+ /* KONTRON SinglePC may cause a stall at runtime resume */
+ SND_PCI_QUIRK(0x1734, 0x1232, "KONTRON SinglePC", 0),
{}
};
#endif /* CONFIG_PM */
SND_PCI_QUIRK(0x103c, 0x871a, "HP", 1),
SND_PCI_QUIRK(0x103c, 0x8711, "HP", 1),
SND_PCI_QUIRK(0x103c, 0x8715, "HP", 1),
+ SND_PCI_QUIRK(0x1043, 0x86ae, "ASUS", 1), /* Z170 PRO */
+ SND_PCI_QUIRK(0x1043, 0x86c7, "ASUS", 1), /* Z170M PLUS */
SND_PCI_QUIRK(0x1462, 0xec94, "MS-7C94", 1),
+ SND_PCI_QUIRK(0x8086, 0x2060, "Intel NUC5CPYB", 1),
SND_PCI_QUIRK(0x8086, 0x2081, "Intel NUC 10", 1),
{}
};
ALC887_FIXUP_ASUS_AUDIO,
ALC887_FIXUP_ASUS_HMIC,
ALCS1200A_FIXUP_MIC_VREF,
+ ALC888VD_FIXUP_MIC_100VREF,
};
static void alc889_fixup_coef(struct hda_codec *codec,
{}
}
},
+ [ALC888VD_FIXUP_MIC_100VREF] = {
+ .type = HDA_FIXUP_PINCTLS,
+ .v.pins = (const struct hda_pintbl[]) {
+ { 0x18, PIN_VREF100 }, /* headset mic */
+ {}
+ }
+ },
};
static const struct snd_pci_quirk alc882_fixup_tbl[] = {
SND_PCI_QUIRK(0x106b, 0x4a00, "Macbook 5,2", ALC889_FIXUP_MBA11_VREF),
SND_PCI_QUIRK(0x1071, 0x8258, "Evesham Voyaeger", ALC882_FIXUP_EAPD),
+ SND_PCI_QUIRK(0x10ec, 0x12d8, "iBase Elo Touch", ALC888VD_FIXUP_MIC_100VREF),
SND_PCI_QUIRK(0x13fe, 0x1009, "Advantech MIT-W101", ALC886_FIXUP_EAPD),
SND_PCI_QUIRK(0x1458, 0xa002, "Gigabyte EP45-DS3/Z87X-UD3H", ALC889_FIXUP_FRONT_HP_NO_PRESENCE),
SND_PCI_QUIRK(0x1458, 0xa0b8, "Gigabyte AZ370-Gaming", ALC1220_FIXUP_GB_DUAL_CODECS),
case 0x10ec0230:
case 0x10ec0236:
case 0x10ec0256:
+ case 0x10ec0257:
case 0x19e58326:
alc_write_coef_idx(codec, 0x48, 0x0);
alc_update_coef_idx(codec, 0x49, 0x0045, 0x0);
case 0x10ec0230:
case 0x10ec0236:
case 0x10ec0256:
+ case 0x10ec0257:
case 0x19e58326:
alc_write_coef_idx(codec, 0x48, 0xd011);
alc_update_coef_idx(codec, 0x49, 0x007f, 0x0045);
case 0x10ec0236:
case 0x10ec0255:
case 0x10ec0256:
+ case 0x10ec0257:
case 0x19e58326:
alc_update_coef_idx(codec, 0x1b, 0x8000, 1 << 15); /* Reset HP JD */
alc_update_coef_idx(codec, 0x1b, 0x8000, 0 << 15);
ALC290_FIXUP_SUBWOOFER_HSJACK,
ALC269_FIXUP_THINKPAD_ACPI,
ALC269_FIXUP_DMIC_THINKPAD_ACPI,
+ ALC269VB_FIXUP_CHUWI_COREBOOK_XPRO,
ALC255_FIXUP_ACER_MIC_NO_PRESENCE,
ALC255_FIXUP_ASUS_MIC_NO_PRESENCE,
ALC255_FIXUP_DELL1_MIC_NO_PRESENCE,
.type = HDA_FIXUP_FUNC,
.v.func = alc269_fixup_pincfg_U7x7_headset_mic,
},
+ [ALC269VB_FIXUP_CHUWI_COREBOOK_XPRO] = {
+ .type = HDA_FIXUP_PINS,
+ .v.pins = (const struct hda_pintbl[]) {
+ { 0x18, 0x03a19020 }, /* headset mic */
+ { 0x1b, 0x90170150 }, /* speaker */
+ { }
+ },
+ },
[ALC269_FIXUP_AMIC] = {
.type = HDA_FIXUP_PINS,
.v.pins = (const struct hda_pintbl[]) {
SND_PCI_QUIRK(0x1028, 0x0b1a, "Dell Precision 5570", ALC289_FIXUP_DUAL_SPK),
SND_PCI_QUIRK(0x1028, 0x0b37, "Dell Inspiron 16 Plus 7620 2-in-1", ALC295_FIXUP_DELL_INSPIRON_TOP_SPEAKERS),
SND_PCI_QUIRK(0x1028, 0x0b71, "Dell Inspiron 16 Plus 7620", ALC295_FIXUP_DELL_INSPIRON_TOP_SPEAKERS),
+ SND_PCI_QUIRK(0x1028, 0x0beb, "Dell XPS 15 9530 (2023)", ALC289_FIXUP_DELL_CS35L41_SPI_2),
SND_PCI_QUIRK(0x1028, 0x0c03, "Dell Precision 5340", ALC269_FIXUP_DELL4_MIC_NO_PRESENCE),
SND_PCI_QUIRK(0x1028, 0x0c19, "Dell Precision 3340", ALC236_FIXUP_DELL_DUAL_CODECS),
SND_PCI_QUIRK(0x1028, 0x0c1a, "Dell Precision 3340", ALC236_FIXUP_DELL_DUAL_CODECS),
SND_PCI_QUIRK(0x103c, 0x83b9, "HP Spectre x360", ALC269_FIXUP_HP_MUTE_LED_MIC3),
SND_PCI_QUIRK(0x103c, 0x841c, "HP Pavilion 15-CK0xx", ALC269_FIXUP_HP_MUTE_LED_MIC3),
SND_PCI_QUIRK(0x103c, 0x8497, "HP Envy x360", ALC269_FIXUP_HP_MUTE_LED_MIC3),
+ SND_PCI_QUIRK(0x103c, 0x84ae, "HP 15-db0403ng", ALC236_FIXUP_HP_MUTE_LED_COEFBIT2),
SND_PCI_QUIRK(0x103c, 0x84da, "HP OMEN dc0019-ur", ALC295_FIXUP_HP_OMEN),
SND_PCI_QUIRK(0x103c, 0x84e7, "HP Pavilion 15", ALC269_FIXUP_HP_MUTE_LED_MIC3),
SND_PCI_QUIRK(0x103c, 0x8519, "HP Spectre x360 15-df0xxx", ALC285_FIXUP_HP_SPECTRE_X360),
SND_PCI_QUIRK(0x103c, 0x8898, "HP EliteBook 845 G8 Notebook PC", ALC285_FIXUP_HP_LIMIT_INT_MIC_BOOST),
SND_PCI_QUIRK(0x103c, 0x88d0, "HP Pavilion 15-eh1xxx (mainboard 88D0)", ALC287_FIXUP_HP_GPIO_LED),
SND_PCI_QUIRK(0x103c, 0x8902, "HP OMEN 16", ALC285_FIXUP_HP_MUTE_LED),
+ SND_PCI_QUIRK(0x103c, 0x890e, "HP 255 G8 Notebook PC", ALC236_FIXUP_HP_MUTE_LED_COEFBIT2),
SND_PCI_QUIRK(0x103c, 0x8919, "HP Pavilion Aero Laptop 13-be0xxx", ALC287_FIXUP_HP_GPIO_LED),
SND_PCI_QUIRK(0x103c, 0x896d, "HP ZBook Firefly 16 G9", ALC245_FIXUP_CS35L41_SPI_2_HP_GPIO_LED),
SND_PCI_QUIRK(0x103c, 0x896e, "HP EliteBook x360 830 G9", ALC245_FIXUP_CS35L41_SPI_2_HP_GPIO_LED),
SND_PCI_QUIRK(0x103c, 0x8abb, "HP ZBook Firefly 14 G9", ALC245_FIXUP_CS35L41_SPI_2_HP_GPIO_LED),
SND_PCI_QUIRK(0x103c, 0x8ad1, "HP EliteBook 840 14 inch G9 Notebook PC", ALC245_FIXUP_CS35L41_SPI_2_HP_GPIO_LED),
SND_PCI_QUIRK(0x103c, 0x8ad2, "HP EliteBook 860 16 inch G9 Notebook PC", ALC245_FIXUP_CS35L41_SPI_2_HP_GPIO_LED),
+ SND_PCI_QUIRK(0x103c, 0x8b2f, "HP 255 15.6 inch G10 Notebook PC", ALC236_FIXUP_HP_MUTE_LED_COEFBIT2),
SND_PCI_QUIRK(0x103c, 0x8b42, "HP", ALC245_FIXUP_CS35L41_SPI_2_HP_GPIO_LED),
SND_PCI_QUIRK(0x103c, 0x8b43, "HP", ALC245_FIXUP_CS35L41_SPI_2_HP_GPIO_LED),
SND_PCI_QUIRK(0x103c, 0x8b44, "HP", ALC245_FIXUP_CS35L41_SPI_2_HP_GPIO_LED),
SND_PCI_QUIRK(0x103c, 0x8c70, "HP EliteBook 835 G11", ALC287_FIXUP_CS35L41_I2C_2_HP_GPIO_LED),
SND_PCI_QUIRK(0x103c, 0x8c71, "HP EliteBook 845 G11", ALC287_FIXUP_CS35L41_I2C_2_HP_GPIO_LED),
SND_PCI_QUIRK(0x103c, 0x8c72, "HP EliteBook 865 G11", ALC287_FIXUP_CS35L41_I2C_2_HP_GPIO_LED),
+ SND_PCI_QUIRK(0x103c, 0x8ca4, "HP ZBook Fury", ALC245_FIXUP_CS35L41_SPI_2_HP_GPIO_LED),
+ SND_PCI_QUIRK(0x103c, 0x8ca7, "HP ZBook Fury", ALC245_FIXUP_CS35L41_SPI_2_HP_GPIO_LED),
+ SND_PCI_QUIRK(0x103c, 0x8cf5, "HP ZBook Studio 16", ALC245_FIXUP_CS35L41_SPI_4_HP_GPIO_LED),
SND_PCI_QUIRK(0x1043, 0x103e, "ASUS X540SA", ALC256_FIXUP_ASUS_MIC),
SND_PCI_QUIRK(0x1043, 0x103f, "ASUS TX300", ALC282_FIXUP_ASUS_TX300),
SND_PCI_QUIRK(0x1043, 0x106d, "Asus K53BE", ALC269_FIXUP_LIMIT_INT_MIC_BOOST),
SND_PCI_QUIRK(0x1043, 0x10a1, "ASUS UX391UA", ALC294_FIXUP_ASUS_SPK),
SND_PCI_QUIRK(0x1043, 0x10c0, "ASUS X540SA", ALC256_FIXUP_ASUS_MIC),
SND_PCI_QUIRK(0x1043, 0x10d0, "ASUS X540LA/X540LJ", ALC255_FIXUP_ASUS_MIC_NO_PRESENCE),
+ SND_PCI_QUIRK(0x1043, 0x10d3, "ASUS K6500ZC", ALC294_FIXUP_ASUS_SPK),
SND_PCI_QUIRK(0x1043, 0x115d, "Asus 1015E", ALC269_FIXUP_LIMIT_INT_MIC_BOOST),
SND_PCI_QUIRK(0x1043, 0x11c0, "ASUS X556UR", ALC255_FIXUP_ASUS_MIC_NO_PRESENCE),
SND_PCI_QUIRK(0x1043, 0x125e, "ASUS Q524UQK", ALC255_FIXUP_ASUS_MIC_NO_PRESENCE),
SND_PCI_QUIRK(0x1043, 0x1313, "Asus K42JZ", ALC269VB_FIXUP_ASUS_MIC_NO_PRESENCE),
SND_PCI_QUIRK(0x1043, 0x13b0, "ASUS Z550SA", ALC256_FIXUP_ASUS_MIC),
SND_PCI_QUIRK(0x1043, 0x1427, "Asus Zenbook UX31E", ALC269VB_FIXUP_ASUS_ZENBOOK),
- SND_PCI_QUIRK(0x1043, 0x1433, "ASUS GX650P", ALC285_FIXUP_ASUS_I2C_HEADSET_MIC),
- SND_PCI_QUIRK(0x1043, 0x1463, "Asus GA402X", ALC285_FIXUP_ASUS_I2C_HEADSET_MIC),
- SND_PCI_QUIRK(0x1043, 0x1473, "ASUS GU604V", ALC285_FIXUP_ASUS_HEADSET_MIC),
- SND_PCI_QUIRK(0x1043, 0x1483, "ASUS GU603V", ALC285_FIXUP_ASUS_HEADSET_MIC),
- SND_PCI_QUIRK(0x1043, 0x1493, "ASUS GV601V", ALC285_FIXUP_ASUS_HEADSET_MIC),
+ SND_PCI_QUIRK(0x1043, 0x1433, "ASUS GX650PY/PZ/PV/PU/PYV/PZV/PIV/PVV", ALC285_FIXUP_ASUS_I2C_HEADSET_MIC),
+ SND_PCI_QUIRK(0x1043, 0x1463, "Asus GA402X/GA402N", ALC285_FIXUP_ASUS_I2C_HEADSET_MIC),
+ SND_PCI_QUIRK(0x1043, 0x1473, "ASUS GU604VI/VC/VE/VG/VJ/VQ/VU/VV/VY/VZ", ALC285_FIXUP_ASUS_HEADSET_MIC),
+ SND_PCI_QUIRK(0x1043, 0x1483, "ASUS GU603VQ/VU/VV/VJ/VI", ALC285_FIXUP_ASUS_HEADSET_MIC),
+ SND_PCI_QUIRK(0x1043, 0x1493, "ASUS GV601VV/VU/VJ/VQ/VI", ALC285_FIXUP_ASUS_HEADSET_MIC),
+ SND_PCI_QUIRK(0x1043, 0x14d3, "ASUS G614JY/JZ/JG", ALC245_FIXUP_CS35L41_SPI_2),
+ SND_PCI_QUIRK(0x1043, 0x14e3, "ASUS G513PI/PU/PV", ALC287_FIXUP_CS35L41_I2C_2),
+ SND_PCI_QUIRK(0x1043, 0x1503, "ASUS G733PY/PZ/PZV/PYV", ALC287_FIXUP_CS35L41_I2C_2),
SND_PCI_QUIRK(0x1043, 0x1517, "Asus Zenbook UX31A", ALC269VB_FIXUP_ASUS_ZENBOOK_UX31A),
- SND_PCI_QUIRK(0x1043, 0x1573, "ASUS GZ301V", ALC285_FIXUP_ASUS_HEADSET_MIC),
+ SND_PCI_QUIRK(0x1043, 0x1533, "ASUS GV302XA/XJ/XQ/XU/XV/XI", ALC287_FIXUP_CS35L41_I2C_2),
+ SND_PCI_QUIRK(0x1043, 0x1573, "ASUS GZ301VV/VQ/VU/VJ/VA/VC/VE/VVC/VQC/VUC/VJC/VEC/VCC", ALC285_FIXUP_ASUS_HEADSET_MIC),
SND_PCI_QUIRK(0x1043, 0x1662, "ASUS GV301QH", ALC294_FIXUP_ASUS_DUAL_SPK),
- SND_PCI_QUIRK(0x1043, 0x1663, "ASUS GU603ZV", ALC285_FIXUP_ASUS_HEADSET_MIC),
+ SND_PCI_QUIRK(0x1043, 0x1663, "ASUS GU603ZI/ZJ/ZQ/ZU/ZV", ALC285_FIXUP_ASUS_HEADSET_MIC),
SND_PCI_QUIRK(0x1043, 0x1683, "ASUS UM3402YAR", ALC287_FIXUP_CS35L41_I2C_2),
SND_PCI_QUIRK(0x1043, 0x16b2, "ASUS GU603", ALC289_FIXUP_ASUS_GA401),
+ SND_PCI_QUIRK(0x1043, 0x16d3, "ASUS UX5304VA", ALC245_FIXUP_CS35L41_SPI_2),
SND_PCI_QUIRK(0x1043, 0x16e3, "ASUS UX50", ALC269_FIXUP_STEREO_DMIC),
+ SND_PCI_QUIRK(0x1043, 0x16f3, "ASUS UX7602VI/BZ", ALC245_FIXUP_CS35L41_SPI_2),
SND_PCI_QUIRK(0x1043, 0x1740, "ASUS UX430UA", ALC295_FIXUP_ASUS_DACS),
SND_PCI_QUIRK(0x1043, 0x17d1, "ASUS UX431FL", ALC294_FIXUP_ASUS_DUAL_SPK),
- SND_PCI_QUIRK(0x1043, 0x17f3, "ROG Ally RC71L_RC71L", ALC294_FIXUP_ASUS_ALLY),
+ SND_PCI_QUIRK(0x1043, 0x17f3, "ROG Ally NR2301L/X", ALC294_FIXUP_ASUS_ALLY),
+ SND_PCI_QUIRK(0x1043, 0x1863, "ASUS UX6404VI/VV", ALC245_FIXUP_CS35L41_SPI_2),
SND_PCI_QUIRK(0x1043, 0x1881, "ASUS Zephyrus S/M", ALC294_FIXUP_ASUS_GX502_PINS),
SND_PCI_QUIRK(0x1043, 0x18b1, "Asus MJ401TA", ALC256_FIXUP_ASUS_HEADSET_MIC),
+ SND_PCI_QUIRK(0x1043, 0x18d3, "ASUS UM3504DA", ALC294_FIXUP_CS35L41_I2C_2),
SND_PCI_QUIRK(0x1043, 0x18f1, "Asus FX505DT", ALC256_FIXUP_ASUS_HEADSET_MIC),
SND_PCI_QUIRK(0x1043, 0x194e, "ASUS UX563FD", ALC294_FIXUP_ASUS_HPE),
SND_PCI_QUIRK(0x1043, 0x1970, "ASUS UX550VE", ALC289_FIXUP_ASUS_GA401),
SND_PCI_QUIRK(0x1043, 0x19e1, "ASUS UX581LV", ALC295_FIXUP_ASUS_MIC_NO_PRESENCE),
SND_PCI_QUIRK(0x1043, 0x1a13, "Asus G73Jw", ALC269_FIXUP_ASUS_G73JW),
SND_PCI_QUIRK(0x1043, 0x1a30, "ASUS X705UD", ALC256_FIXUP_ASUS_MIC),
+ SND_PCI_QUIRK(0x1043, 0x1a63, "ASUS UX3405MA", ALC245_FIXUP_CS35L41_SPI_2),
SND_PCI_QUIRK(0x1043, 0x1a83, "ASUS UM5302LA", ALC294_FIXUP_CS35L41_I2C_2),
SND_PCI_QUIRK(0x1043, 0x1a8f, "ASUS UX582ZS", ALC245_FIXUP_CS35L41_SPI_2),
SND_PCI_QUIRK(0x1043, 0x1b11, "ASUS UX431DA", ALC294_FIXUP_ASUS_COEF_1B),
SND_PCI_QUIRK(0x1043, 0x1b13, "Asus U41SV", ALC269_FIXUP_INV_DMIC),
SND_PCI_QUIRK(0x1043, 0x1b93, "ASUS G614JVR/JIR", ALC245_FIXUP_CS35L41_SPI_2),
SND_PCI_QUIRK(0x1043, 0x1bbd, "ASUS Z550MA", ALC255_FIXUP_ASUS_MIC_NO_PRESENCE),
+ SND_PCI_QUIRK(0x1043, 0x1c03, "ASUS UM3406HA", ALC287_FIXUP_CS35L41_I2C_2),
SND_PCI_QUIRK(0x1043, 0x1c23, "Asus X55U", ALC269_FIXUP_LIMIT_INT_MIC_BOOST),
+ SND_PCI_QUIRK(0x1043, 0x1c33, "ASUS UX5304MA", ALC245_FIXUP_CS35L41_SPI_2),
+ SND_PCI_QUIRK(0x1043, 0x1c43, "ASUS UX8406MA", ALC245_FIXUP_CS35L41_SPI_2),
SND_PCI_QUIRK(0x1043, 0x1c62, "ASUS GU603", ALC289_FIXUP_ASUS_GA401),
SND_PCI_QUIRK(0x1043, 0x1c92, "ASUS ROG Strix G15", ALC285_FIXUP_ASUS_G533Z_PINS),
- SND_PCI_QUIRK(0x1043, 0x1c9f, "ASUS G614JI", ALC285_FIXUP_ASUS_HEADSET_MIC),
- SND_PCI_QUIRK(0x1043, 0x1caf, "ASUS G634JYR/JZR", ALC285_FIXUP_ASUS_SPI_REAR_SPEAKERS),
+ SND_PCI_QUIRK(0x1043, 0x1c9f, "ASUS G614JU/JV/JI", ALC285_FIXUP_ASUS_HEADSET_MIC),
+ SND_PCI_QUIRK(0x1043, 0x1caf, "ASUS G634JY/JZ/JI/JG", ALC285_FIXUP_ASUS_SPI_REAR_SPEAKERS),
SND_PCI_QUIRK(0x1043, 0x1ccd, "ASUS X555UB", ALC256_FIXUP_ASUS_MIC),
- SND_PCI_QUIRK(0x1043, 0x1d1f, "ASUS ROG Strix G17 2023 (G713PV)", ALC287_FIXUP_CS35L41_I2C_2),
+ SND_PCI_QUIRK(0x1043, 0x1ccf, "ASUS G814JU/JV/JI", ALC245_FIXUP_CS35L41_SPI_2),
+ SND_PCI_QUIRK(0x1043, 0x1cdf, "ASUS G814JY/JZ/JG", ALC245_FIXUP_CS35L41_SPI_2),
+ SND_PCI_QUIRK(0x1043, 0x1cef, "ASUS G834JY/JZ/JI/JG", ALC285_FIXUP_ASUS_HEADSET_MIC),
+ SND_PCI_QUIRK(0x1043, 0x1d1f, "ASUS G713PI/PU/PV/PVN", ALC287_FIXUP_CS35L41_I2C_2),
SND_PCI_QUIRK(0x1043, 0x1d42, "ASUS Zephyrus G14 2022", ALC289_FIXUP_ASUS_GA401),
SND_PCI_QUIRK(0x1043, 0x1d4e, "ASUS TM420", ALC256_FIXUP_ASUS_HPE),
+ SND_PCI_QUIRK(0x1043, 0x1da2, "ASUS UP6502ZA/ZD", ALC245_FIXUP_CS35L41_SPI_2),
SND_PCI_QUIRK(0x1043, 0x1e02, "ASUS UX3402ZA", ALC245_FIXUP_CS35L41_SPI_2),
SND_PCI_QUIRK(0x1043, 0x16a3, "ASUS UX3402VA", ALC245_FIXUP_CS35L41_SPI_2),
SND_PCI_QUIRK(0x1043, 0x1f62, "ASUS UX7602ZM", ALC245_FIXUP_CS35L41_SPI_2),
SND_PCI_QUIRK(0x1043, 0x1e11, "ASUS Zephyrus G15", ALC289_FIXUP_ASUS_GA502),
- SND_PCI_QUIRK(0x1043, 0x1e12, "ASUS UM3402", ALC287_FIXUP_CS35L41_I2C_2),
+ SND_PCI_QUIRK(0x1043, 0x1e12, "ASUS UM6702RA/RC", ALC287_FIXUP_CS35L41_I2C_2),
SND_PCI_QUIRK(0x1043, 0x1e51, "ASUS Zephyrus M15", ALC294_FIXUP_ASUS_GU502_PINS),
SND_PCI_QUIRK(0x1043, 0x1e5e, "ASUS ROG Strix G513", ALC294_FIXUP_ASUS_G513_PINS),
SND_PCI_QUIRK(0x1043, 0x1e8e, "ASUS Zephyrus G15", ALC289_FIXUP_ASUS_GA401),
+ SND_PCI_QUIRK(0x1043, 0x1ee2, "ASUS UM3402", ALC287_FIXUP_CS35L41_I2C_2),
SND_PCI_QUIRK(0x1043, 0x1c52, "ASUS Zephyrus G15 2022", ALC289_FIXUP_ASUS_GA401),
SND_PCI_QUIRK(0x1043, 0x1f11, "ASUS Zephyrus G14", ALC289_FIXUP_ASUS_GA401),
SND_PCI_QUIRK(0x1043, 0x1f12, "ASUS UM5302", ALC287_FIXUP_CS35L41_I2C_2),
+ SND_PCI_QUIRK(0x1043, 0x1f1f, "ASUS H7604JI/JV/J3D", ALC245_FIXUP_CS35L41_SPI_2),
+ SND_PCI_QUIRK(0x1043, 0x1f62, "ASUS UX7602ZM", ALC245_FIXUP_CS35L41_SPI_2),
SND_PCI_QUIRK(0x1043, 0x1f92, "ASUS ROG Flow X16", ALC289_FIXUP_ASUS_GA401),
SND_PCI_QUIRK(0x1043, 0x3030, "ASUS ZN270IE", ALC256_FIXUP_ASUS_AIO_GPIO2),
SND_PCI_QUIRK(0x1043, 0x3a20, "ASUS G614JZR", ALC245_FIXUP_CS35L41_SPI_2),
SND_PCI_QUIRK(0x17aa, 0x387d, "Yoga S780-16 pro Quad AAC", ALC287_FIXUP_TAS2781_I2C),
SND_PCI_QUIRK(0x17aa, 0x387e, "Yoga S780-16 pro Quad YC", ALC287_FIXUP_TAS2781_I2C),
SND_PCI_QUIRK(0x17aa, 0x3881, "YB9 dual power mode2 YC", ALC287_FIXUP_TAS2781_I2C),
+ SND_PCI_QUIRK(0x17aa, 0x3882, "Lenovo Yoga Pro 7 14APH8", ALC287_FIXUP_YOGA9_14IAP7_BASS_SPK_PIN),
SND_PCI_QUIRK(0x17aa, 0x3884, "Y780 YG DUAL", ALC287_FIXUP_TAS2781_I2C),
SND_PCI_QUIRK(0x17aa, 0x3886, "Y780 VECO DUAL", ALC287_FIXUP_TAS2781_I2C),
SND_PCI_QUIRK(0x17aa, 0x38a7, "Y780P AMD YG dual", ALC287_FIXUP_TAS2781_I2C),
SND_PCI_QUIRK(0x1d72, 0x1901, "RedmiBook 14", ALC256_FIXUP_ASUS_HEADSET_MIC),
SND_PCI_QUIRK(0x1d72, 0x1945, "Redmi G", ALC256_FIXUP_ASUS_HEADSET_MIC),
SND_PCI_QUIRK(0x1d72, 0x1947, "RedmiBook Air", ALC255_FIXUP_XIAOMI_HEADSET_MIC),
+ SND_PCI_QUIRK(0x2782, 0x0232, "CHUWI CoreBook XPro", ALC269VB_FIXUP_CHUWI_COREBOOK_XPRO),
SND_PCI_QUIRK(0x8086, 0x2074, "Intel NUC 8", ALC233_FIXUP_INTEL_NUC8_DMIC),
SND_PCI_QUIRK(0x8086, 0x2080, "Intel NUC 8 Rugged", ALC256_FIXUP_INTEL_NUC8_RUGGED),
SND_PCI_QUIRK(0x8086, 0x2081, "Intel NUC 10", ALC256_FIXUP_INTEL_NUC10),
SND_PCI_QUIRK(0x8086, 0x3038, "Intel NUC 13", ALC295_FIXUP_CHROME_BOOK),
SND_PCI_QUIRK(0xf111, 0x0001, "Framework Laptop", ALC295_FIXUP_FRAMEWORK_LAPTOP_MIC_NO_PRESENCE),
+ SND_PCI_QUIRK(0xf111, 0x0005, "Framework Laptop", ALC295_FIXUP_FRAMEWORK_LAPTOP_MIC_NO_PRESENCE),
+ SND_PCI_QUIRK(0xf111, 0x0006, "Framework Laptop", ALC295_FIXUP_FRAMEWORK_LAPTOP_MIC_NO_PRESENCE),
#if 0
/* Below is a quirk table taken from the old code.
{0x12, 0x90a60130},
{0x17, 0x90170110},
{0x21, 0x03211020}),
- SND_HDA_PIN_QUIRK(0x10ec0295, 0x1028, "Dell", ALC269_FIXUP_DELL4_MIC_NO_PRESENCE,
- {0x14, 0x90170110},
- {0x21, 0x04211020}),
- SND_HDA_PIN_QUIRK(0x10ec0295, 0x1028, "Dell", ALC269_FIXUP_DELL4_MIC_NO_PRESENCE,
- {0x14, 0x90170110},
- {0x21, 0x04211030}),
- SND_HDA_PIN_QUIRK(0x10ec0295, 0x1028, "Dell", ALC269_FIXUP_DELL1_MIC_NO_PRESENCE,
- ALC295_STANDARD_PINS,
- {0x17, 0x21014020},
- {0x18, 0x21a19030}),
- SND_HDA_PIN_QUIRK(0x10ec0295, 0x1028, "Dell", ALC269_FIXUP_DELL1_MIC_NO_PRESENCE,
- ALC295_STANDARD_PINS,
- {0x17, 0x21014040},
- {0x18, 0x21a19050}),
- SND_HDA_PIN_QUIRK(0x10ec0295, 0x1028, "Dell", ALC269_FIXUP_DELL1_MIC_NO_PRESENCE,
- ALC295_STANDARD_PINS),
SND_HDA_PIN_QUIRK(0x10ec0298, 0x1028, "Dell", ALC298_FIXUP_DELL1_MIC_NO_PRESENCE,
ALC298_STANDARD_PINS,
{0x17, 0x90170110}),
SND_HDA_PIN_QUIRK(0x10ec0289, 0x1028, "Dell", ALC269_FIXUP_DELL4_MIC_NO_PRESENCE,
{0x19, 0x40000000},
{0x1b, 0x40000000}),
+ SND_HDA_PIN_QUIRK(0x10ec0295, 0x1028, "Dell", ALC269_FIXUP_DELL4_MIC_NO_PRESENCE,
+ {0x19, 0x40000000},
+ {0x1b, 0x40000000}),
SND_HDA_PIN_QUIRK(0x10ec0256, 0x1028, "Dell", ALC255_FIXUP_DELL1_MIC_NO_PRESENCE,
{0x19, 0x40000000},
{0x1a, 0x40000000}),
SND_PCI_QUIRK(0x17aa, 0x32f7, "Lenovo ThinkCentre M90", ALC897_FIXUP_HEADSET_MIC_PIN),
SND_PCI_QUIRK(0x17aa, 0x3321, "Lenovo ThinkCentre M70 Gen4", ALC897_FIXUP_HEADSET_MIC_PIN),
SND_PCI_QUIRK(0x17aa, 0x331b, "Lenovo ThinkCentre M90 Gen4", ALC897_FIXUP_HEADSET_MIC_PIN),
+ SND_PCI_QUIRK(0x17aa, 0x3364, "Lenovo ThinkCentre M90 Gen5", ALC897_FIXUP_HEADSET_MIC_PIN),
SND_PCI_QUIRK(0x17aa, 0x3742, "Lenovo TianYi510Pro-14IOB", ALC897_FIXUP_HEADSET_MIC_PIN2),
SND_PCI_QUIRK(0x17aa, 0x38af, "Lenovo Ideapad Y550P", ALC662_FIXUP_IDEAPAD),
SND_PCI_QUIRK(0x17aa, 0x3a0d, "Lenovo Ideapad Y550", ALC662_FIXUP_IDEAPAD),
status = efi.get_variable(efi_name, &efi_guid, &attr,
&tas_priv->cali_data.total_sz,
tas_priv->cali_data.data);
- if (status != EFI_SUCCESS)
- return -EINVAL;
}
+ if (status != EFI_SUCCESS)
+ return -EINVAL;
tmp_val = (unsigned int *)tas_priv->cali_data.data;
tas_priv->fw_state = TASDEVICE_DSP_FW_ALL_OK;
tasdevice_prmg_load(tas_priv, 0);
+ if (tas_priv->fmw->nr_programs > 0)
+ tas_priv->cur_prog = 0;
+ if (tas_priv->fmw->nr_configurations > 0)
+ tas_priv->cur_conf = 0;
/* If calibrated data occurs error, dsp will still works with default
* calibrated data inside algo.
tas2781_save_calibration(tas_priv);
out:
- if (tas_priv->fw_state == TASDEVICE_DSP_FW_FAIL) {
- /*If DSP FW fail, kcontrol won't be created */
- tasdevice_config_info_remove(tas_priv);
- tasdevice_dsp_remove(tas_priv);
- }
mutex_unlock(&tas_priv->codec_lock);
if (fmw)
release_firmware(fmw);
{
struct tasdevice_priv *tas_priv = dev_get_drvdata(dev);
struct hda_component *comps = master_data;
+ comps = &comps[tas_priv->index];
- if (comps[tas_priv->index].dev == dev)
- memset(&comps[tas_priv->index], 0, sizeof(*comps));
+ if (comps->dev == dev) {
+ comps->dev = NULL;
+ memset(comps->name, 0, sizeof(comps->name));
+ comps->playback_hook = NULL;
+ }
tasdevice_config_info_remove(tas_priv);
tasdevice_dsp_remove(tas_priv);
pm_runtime_put_autosuspend(tas_priv->dev);
+ tas2781_reset(tas_priv);
+
ret = component_add(tas_priv->dev, &tas2781_hda_comp_ops);
if (ret) {
dev_err(tas_priv->dev, "Register component failed: %d\n", ret);
pm_runtime_disable(tas_priv->dev);
- goto err;
}
- tas2781_reset(tas_priv);
err:
if (ret)
tas2781_hda_remove(&clt->dev);
{}
},
},
+ {
+ .flags = FLAG_AMD_LEGACY,
+ .device = ACP_PCI_DEV_ID,
+ .dmi_table = (const struct dmi_system_id []) {
+ {
+ .matches = {
+ DMI_EXACT_MATCH(DMI_BOARD_VENDOR, "HUAWEI"),
+ DMI_EXACT_MATCH(DMI_PRODUCT_NAME, "HVY-WXX9"),
+ DMI_EXACT_MATCH(DMI_PRODUCT_VERSION, "M1010"),
+ },
+ },
+ {}
+ },
+ },
{
.flags = FLAG_AMD_LEGACY,
.device = ACP_PCI_DEV_ID,
DMI_MATCH(DMI_PRODUCT_NAME, "M6500RC"),
}
},
+ {
+ .driver_data = &acp6x_card,
+ .matches = {
+ DMI_MATCH(DMI_BOARD_VENDOR, "ASUSTeK COMPUTER INC."),
+ DMI_MATCH(DMI_PRODUCT_NAME, "E1504FA"),
+ }
+ },
{
.driver_data = &acp6x_card,
.matches = {
DMI_MATCH(DMI_BOARD_NAME, "8A3E"),
}
},
+ {
+ .driver_data = &acp6x_card,
+ .matches = {
+ DMI_MATCH(DMI_BOARD_VENDOR, "HP"),
+ DMI_MATCH(DMI_BOARD_NAME, "8B2F"),
+ }
+ },
{
.driver_data = &acp6x_card,
.matches = {
DMI_MATCH(DMI_PRODUCT_VERSION, "pang12"),
}
},
+ {
+ .driver_data = &acp6x_card,
+ .matches = {
+ DMI_MATCH(DMI_BOARD_VENDOR, "System76"),
+ DMI_MATCH(DMI_PRODUCT_VERSION, "pang13"),
+ }
+ },
{}
};
#include <sound/cs35l41.h>
+#define CS35L41_FIRMWARE_OLD_VERSION 0x001C00 /* v0.28.0 */
+
static const struct reg_default cs35l41_reg[] = {
{ CS35L41_PWR_CTRL1, 0x00000000 },
{ CS35L41_PWR_CTRL2, 0x00000000 },
* the PLL Lock interrupt, in the IRQ handler.
*/
int cs35l41_global_enable(struct device *dev, struct regmap *regmap, enum cs35l41_boost_type b_type,
- int enable, bool firmware_running)
+ int enable, struct cs_dsp *dsp)
{
int ret;
unsigned int gpio1_func, pad_control, pwr_ctrl1, pwr_ctrl3, int_status, pup_pdn_mask;
}
regmap_write(regmap, CS35L41_IRQ1_STATUS1, CS35L41_PUP_DONE_MASK);
- if (firmware_running)
+ if (dsp->running && dsp->fw_id_version > CS35L41_FIRMWARE_OLD_VERSION)
ret = cs35l41_set_cspl_mbox_cmd(dev, regmap,
CSPL_MBOX_CMD_SPK_OUT_ENABLE);
else
ARRAY_SIZE(cs35l41_pup_patch));
ret = cs35l41_global_enable(cs35l41->dev, cs35l41->regmap, cs35l41->hw_cfg.bst_type,
- 1, cs35l41->dsp.cs_dsp.running);
+ 1, &cs35l41->dsp.cs_dsp);
break;
case SND_SOC_DAPM_POST_PMD:
ret = cs35l41_global_enable(cs35l41->dev, cs35l41->regmap, cs35l41->hw_cfg.bst_type,
- 0, cs35l41->dsp.cs_dsp.running);
+ 0, &cs35l41->dsp.cs_dsp);
regmap_multi_reg_write_bypassed(cs35l41->regmap,
cs35l41_pdn_patch,
.driver = {
.name = "cs35l45",
.of_match_table = cs35l45_of_match,
- .pm = &cs35l45_pm_ops,
+ .pm = pm_ptr(&cs35l45_pm_ops),
},
.id_table = cs35l45_id_i2c,
.probe = cs35l45_i2c_probe,
.driver = {
.name = "cs35l45",
.of_match_table = cs35l45_of_match,
- .pm = &cs35l45_pm_ops,
+ .pm = pm_ptr(&cs35l45_pm_ops),
},
.id_table = cs35l45_id_spi,
.probe = cs35l45_spi_probe,
cs35l45_setup_hibernate(cs35l45);
+ regmap_set_bits(cs35l45->regmap, CS35L45_IRQ1_MASK_2, CS35L45_DSP_VIRT2_MBOX_MASK);
+
// Don't wait for ACK since bus activity would wake the device
regmap_write(cs35l45->regmap, CS35L45_DSP_VIRT1_MBOX_1, CSPL_MBOX_CMD_HIBERNATE);
CSPL_MBOX_CMD_OUT_OF_HIBERNATE);
if (!ret) {
dev_dbg(cs35l45->dev, "Wake success at cycle: %d\n", j);
+ regmap_clear_bits(cs35l45->regmap, CS35L45_IRQ1_MASK_2,
+ CS35L45_DSP_VIRT2_MBOX_MASK);
return 0;
}
usleep_range(100, 200);
return -ETIMEDOUT;
}
-static int __maybe_unused cs35l45_runtime_suspend(struct device *dev)
+static int cs35l45_runtime_suspend(struct device *dev)
{
struct cs35l45_private *cs35l45 = dev_get_drvdata(dev);
return 0;
}
-static int __maybe_unused cs35l45_runtime_resume(struct device *dev)
+static int cs35l45_runtime_resume(struct device *dev)
{
struct cs35l45_private *cs35l45 = dev_get_drvdata(dev);
int ret;
return ret;
}
+static int cs35l45_sys_suspend(struct device *dev)
+{
+ struct cs35l45_private *cs35l45 = dev_get_drvdata(dev);
+
+ dev_dbg(cs35l45->dev, "System suspend, disabling IRQ\n");
+ disable_irq(cs35l45->irq);
+
+ return 0;
+}
+
+static int cs35l45_sys_suspend_noirq(struct device *dev)
+{
+ struct cs35l45_private *cs35l45 = dev_get_drvdata(dev);
+
+ dev_dbg(cs35l45->dev, "Late system suspend, reenabling IRQ\n");
+ enable_irq(cs35l45->irq);
+
+ return 0;
+}
+
+static int cs35l45_sys_resume_noirq(struct device *dev)
+{
+ struct cs35l45_private *cs35l45 = dev_get_drvdata(dev);
+
+ dev_dbg(cs35l45->dev, "Early system resume, disabling IRQ\n");
+ disable_irq(cs35l45->irq);
+
+ return 0;
+}
+
+static int cs35l45_sys_resume(struct device *dev)
+{
+ struct cs35l45_private *cs35l45 = dev_get_drvdata(dev);
+
+ dev_dbg(cs35l45->dev, "System resume, reenabling IRQ\n");
+ enable_irq(cs35l45->irq);
+
+ return 0;
+}
+
static int cs35l45_apply_property_config(struct cs35l45_private *cs35l45)
{
struct device_node *node = cs35l45->dev->of_node;
}
EXPORT_SYMBOL_NS_GPL(cs35l45_remove, SND_SOC_CS35L45);
-const struct dev_pm_ops cs35l45_pm_ops = {
- SET_RUNTIME_PM_OPS(cs35l45_runtime_suspend, cs35l45_runtime_resume, NULL)
+EXPORT_GPL_DEV_PM_OPS(cs35l45_pm_ops) = {
+ RUNTIME_PM_OPS(cs35l45_runtime_suspend, cs35l45_runtime_resume, NULL)
+
+ SYSTEM_SLEEP_PM_OPS(cs35l45_sys_suspend, cs35l45_sys_resume)
+ NOIRQ_SYSTEM_SLEEP_PM_OPS(cs35l45_sys_suspend_noirq, cs35l45_sys_resume_noirq)
};
-EXPORT_SYMBOL_NS_GPL(cs35l45_pm_ops, SND_SOC_CS35L45);
MODULE_DESCRIPTION("ASoC CS35L45 driver");
MODULE_AUTHOR("James Schulman, Cirrus Logic Inc, <james.schulman@cirrus.com>");
return ret;
}
-static void cs42l43_start_hs_bias(struct cs42l43_codec *priv, bool force_high)
+static void cs42l43_start_hs_bias(struct cs42l43_codec *priv, bool type_detect)
{
struct cs42l43 *cs42l43 = priv->core;
unsigned int val = 0x3 << CS42L43_HSBIAS_MODE_SHIFT;
regmap_update_bits(cs42l43->regmap, CS42L43_HS2,
CS42L43_HS_CLAMP_DISABLE_MASK, CS42L43_HS_CLAMP_DISABLE_MASK);
- if (!force_high && priv->bias_low)
- val = 0x2 << CS42L43_HSBIAS_MODE_SHIFT;
-
- if (priv->bias_sense_ua) {
- regmap_update_bits(cs42l43->regmap,
- CS42L43_HS_BIAS_SENSE_AND_CLAMP_AUTOCONTROL,
- CS42L43_HSBIAS_SENSE_EN_MASK |
- CS42L43_AUTO_HSBIAS_CLAMP_EN_MASK,
- CS42L43_HSBIAS_SENSE_EN_MASK |
- CS42L43_AUTO_HSBIAS_CLAMP_EN_MASK);
+ if (!type_detect) {
+ if (priv->bias_low)
+ val = 0x2 << CS42L43_HSBIAS_MODE_SHIFT;
+
+ if (priv->bias_sense_ua)
+ regmap_update_bits(cs42l43->regmap,
+ CS42L43_HS_BIAS_SENSE_AND_CLAMP_AUTOCONTROL,
+ CS42L43_HSBIAS_SENSE_EN_MASK |
+ CS42L43_AUTO_HSBIAS_CLAMP_EN_MASK,
+ CS42L43_HSBIAS_SENSE_EN_MASK |
+ CS42L43_AUTO_HSBIAS_CLAMP_EN_MASK);
}
regmap_update_bits(cs42l43->regmap, CS42L43_MIC_DETECT_CONTROL_1,
break;
case SND_SOC_DAIFMT_LEFT_J:
hi_size = bitwidth_sclk;
- frm_delay = 2;
+ frm_delay = 0;
frm_phase = 1;
break;
case SND_SOC_DAIFMT_DSP_A:
return cs43130_show_dc(dev, buf, HP_RIGHT);
}
-static u16 const cs43130_ac_freq[CS43130_AC_FREQ] = {
+static const u16 cs43130_ac_freq[CS43130_AC_FREQ] = {
24,
43,
93,
.use_single_write = true,
};
-static u16 const cs43130_dc_threshold[CS43130_DC_THRESHOLD] = {
+static const u16 cs43130_dc_threshold[CS43130_DC_THRESHOLD] = {
50,
120,
};
aad_pdata->mic_det_thr =
da7219_aad_fw_mic_det_thr(dev, fw_val32);
else
- aad_pdata->mic_det_thr = DA7219_AAD_MIC_DET_THR_500_OHMS;
+ aad_pdata->mic_det_thr = DA7219_AAD_MIC_DET_THR_200_OHMS;
if (fwnode_property_read_u32(aad_np, "dlg,jack-ins-deb", &fw_val32) >= 0)
aad_pdata->jack_ins_deb =
.sig_bits = 24,
},
},
+};
+
+static struct snd_soc_dai_driver hdac_hda_hdmi_dais[] = {
{
.id = HDAC_HDMI_0_DAI_ID,
.name = "intel-hdmi-hifi1",
.endianness = 1,
};
+static const struct snd_soc_component_driver hdac_hda_hdmi_codec = {
+ .probe = hdac_hda_codec_probe,
+ .remove = hdac_hda_codec_remove,
+ .idle_bias_on = false,
+ .endianness = 1,
+};
+
static int hdac_hda_dev_probe(struct hdac_device *hdev)
{
+ struct hdac_hda_priv *hda_pvt = dev_get_drvdata(&hdev->dev);
struct hdac_ext_link *hlink;
int ret;
snd_hdac_ext_bus_link_get(hdev->bus, hlink);
/* ASoC specific initialization */
- ret = devm_snd_soc_register_component(&hdev->dev,
- &hdac_hda_codec, hdac_hda_dais,
- ARRAY_SIZE(hdac_hda_dais));
+ if (hda_pvt->need_display_power)
+ ret = devm_snd_soc_register_component(&hdev->dev,
+ &hdac_hda_hdmi_codec, hdac_hda_hdmi_dais,
+ ARRAY_SIZE(hdac_hda_hdmi_dais));
+ else
+ ret = devm_snd_soc_register_component(&hdev->dev,
+ &hdac_hda_codec, hdac_hda_dais,
+ ARRAY_SIZE(hdac_hda_dais));
+
if (ret < 0) {
dev_err(&hdev->dev, "failed to register HDA codec %d\n", ret);
return ret;
static void hdmi_codec_jack_report(struct hdmi_codec_priv *hcp,
unsigned int jack_status)
{
- if (hcp->jack && jack_status != hcp->jack_status) {
- snd_soc_jack_report(hcp->jack, jack_status, SND_JACK_LINEOUT);
+ if (jack_status != hcp->jack_status) {
+ if (hcp->jack)
+ snd_soc_jack_report(hcp->jack, jack_status, SND_JACK_LINEOUT);
hcp->jack_status = jack_status;
}
}
if (hcp->hcd.ops->hook_plugged_cb) {
hcp->jack = jack;
+
+ /*
+ * Report the initial jack status which may have been provided
+ * by the parent hdmi driver while the hpd hook was registered.
+ */
+ snd_soc_jack_report(jack, hcp->jack_status, SND_JACK_LINEOUT);
+
return 0;
}
tx->dev = dev;
+ /* Set active_decimator default value */
+ tx->active_decimator[TX_MACRO_AIF1_CAP] = -1;
+ tx->active_decimator[TX_MACRO_AIF2_CAP] = -1;
+ tx->active_decimator[TX_MACRO_AIF3_CAP] = -1;
+
/* set MCLK and NPL rates */
clk_set_rate(tx->mclk, MCLK_FREQ);
clk_set_rate(tx->npl, MCLK_FREQ);
struct soc_bytes_ext *params = (void *)kcontrol->private_value;
int i, reg;
u16 reg_val, *val;
+ __be16 tmp;
val = (u16 *)ucontrol->value.bytes.data;
reg = NAU8822_REG_EQ1;
/* conversion of 16-bit integers between native CPU format
* and big endian format
*/
- reg_val = cpu_to_be16(reg_val);
- memcpy(val + i, ®_val, sizeof(reg_val));
+ tmp = cpu_to_be16(reg_val);
+ memcpy(val + i, &tmp, sizeof(tmp));
}
return 0;
void *data;
u16 *val, value;
int i, reg, ret;
+ __be16 *tmp;
data = kmemdup(ucontrol->value.bytes.data,
params->max, GFP_KERNEL | GFP_DMA);
/* conversion of 16-bit integers between native CPU format
* and big endian format
*/
- value = be16_to_cpu(*(val + i));
+ tmp = (__be16 *)(val + i);
+ value = be16_to_cpup(tmp);
ret = snd_soc_component_write(component, reg + i, value);
if (ret) {
dev_err(component->dev,
struct regulator_bulk_data supplies[ARRAY_SIZE(rt5645_supply_names)];
struct rt5645_eq_param_s *eq_param;
struct timer_list btn_check_timer;
+ struct mutex jd_mutex;
int codec_type;
int sysclk;
rt5645_enable_push_button_irq(component, true);
}
} else {
+ if (rt5645->en_button_func)
+ rt5645_enable_push_button_irq(component, false);
snd_soc_dapm_disable_pin(dapm, "Mic Det Power");
snd_soc_dapm_sync(dapm);
rt5645->jack_type = SND_JACK_HEADPHONE;
if (!rt5645->component)
return;
+ mutex_lock(&rt5645->jd_mutex);
+
switch (rt5645->pdata.jd_mode) {
case 0: /* Not using rt5645 JD */
if (rt5645->gpiod_hp_det) {
if (!val && (rt5645->jack_type == 0)) { /* jack in */
report = rt5645_jack_detect(rt5645->component, 1);
- } else if (!val && rt5645->jack_type != 0) {
+ } else if (!val && rt5645->jack_type == SND_JACK_HEADSET) {
/* for push button and jack out */
btn_type = 0;
if (snd_soc_component_read(rt5645->component, RT5645_INT_IRQ_ST) & 0x4) {
rt5645_jack_detect(rt5645->component, 0);
}
+ mutex_unlock(&rt5645->jd_mutex);
+
snd_soc_jack_report(rt5645->hp_jack, report, SND_JACK_HEADPHONE);
snd_soc_jack_report(rt5645->mic_jack, report, SND_JACK_MICROPHONE);
if (rt5645->en_button_func)
}
timer_setup(&rt5645->btn_check_timer, rt5645_btn_check_callback, 0);
+ mutex_init(&rt5645->jd_mutex);
INIT_DELAYED_WORK(&rt5645->jack_detect_work, rt5645_jack_detect_work);
INIT_DELAYED_WORK(&rt5645->rcclock_work, rt5645_rcclock_work);
goto out;
}
- conf = &(tas_fmw->configs[cfg_no]);
for (i = 0, prog_status = 0; i < tas_priv->ndev; i++) {
if (cfg_info[rca_conf_no]->active_dev & (1 << i)) {
- if (tas_priv->tasdevice[i].cur_prog != prm_no
- || tas_priv->force_fwload_status) {
+ if (prm_no >= 0
+ && (tas_priv->tasdevice[i].cur_prog != prm_no
+ || tas_priv->force_fwload_status)) {
tas_priv->tasdevice[i].cur_conf = -1;
tas_priv->tasdevice[i].is_loading = true;
prog_status++;
}
for (i = 0, status = 0; i < tas_priv->ndev; i++) {
- if (tas_priv->tasdevice[i].cur_conf != cfg_no
+ if (cfg_no >= 0
+ && tas_priv->tasdevice[i].cur_conf != cfg_no
&& (cfg_info[rca_conf_no]->active_dev & (1 << i))
&& (tas_priv->tasdevice[i].is_loaderr == false)) {
status++;
}
if (status) {
+ conf = &(tas_fmw->configs[cfg_no]);
status = 0;
tasdevice_load_data(tas_priv, &(conf->dev_data));
for (i = 0; i < tas_priv->ndev; i++) {
}
for (i = 0, prog_status = 0; i < tas_priv->ndev; i++) {
- if (tas_priv->tasdevice[i].cur_prog != prm_no) {
+ if (prm_no >= 0 && tas_priv->tasdevice[i].cur_prog != prm_no) {
tas_priv->tasdevice[i].cur_conf = -1;
tas_priv->tasdevice[i].is_loading = true;
prog_status++;
}
for (i = 0, prog_status = 0; i < tas_priv->ndev; i++) {
- if (tas_priv->tasdevice[i].cur_prog != prm_no) {
+ if (prm_no >= 0 && tas_priv->tasdevice[i].cur_prog != prm_no) {
tas_priv->tasdevice[i].cur_conf = -1;
tas_priv->tasdevice[i].is_loading = true;
prog_status++;
/* Boost mixer */
static const struct snd_kcontrol_new wm8974_boost_mixer[] = {
-SOC_DAPM_SINGLE("Aux Switch", WM8974_INPPGA, 6, 1, 1),
+SOC_DAPM_SINGLE("PGA Switch", WM8974_INPPGA, 6, 1, 1),
};
/* Input PGA */
/* Boost Mixer */
{"ADC", NULL, "Boost Mixer"},
- {"Boost Mixer", "Aux Switch", "Aux Input"},
- {"Boost Mixer", NULL, "Input PGA"},
+ {"Boost Mixer", NULL, "Aux Input"},
+ {"Boost Mixer", "PGA Switch", "Input PGA"},
{"Boost Mixer", NULL, "MICP"},
/* Input PGA */
ret = wm_adsp_buffer_read(buf, caps->region_defs[i].base_offset,
®ion->base_addr);
if (ret < 0)
- return ret;
+ goto err;
ret = wm_adsp_buffer_read(buf, caps->region_defs[i].size_offset,
&offset);
if (ret < 0)
- return ret;
+ goto err;
region->cumulative_size = offset;
}
return 0;
+
+err:
+ kfree(buf->regions);
+ return ret;
}
static void wm_adsp_buffer_clear(struct wm_adsp_compr_buf *buf)
config SND_SOC_IMX_RPMSG
tristate "SoC Audio support for i.MX boards with rpmsg"
depends on RPMSG
+ depends on OF && I2C
select SND_SOC_IMX_PCM_RPMSG
select SND_SOC_IMX_AUDIO_RPMSG
help
FSL_SAI_CR3_TRCE_MASK,
FSL_SAI_CR3_TRCE((dl_cfg[dl_cfg_idx].mask[tx] & trce_mask)));
+ /*
+ * When the TERE and FSD_MSTR enabled before configuring the word width
+ * There will be no frame sync clock issue, because word width impact
+ * the generation of frame sync clock.
+ *
+ * TERE enabled earlier only for i.MX8MP case for the hardware limitation,
+ * We need to disable FSD_MSTR before configuring word width, then enable
+ * FSD_MSTR bit for this specific case.
+ */
+ if (sai->soc_data->mclk_with_tere && sai->mclk_direction_output &&
+ !sai->is_consumer_mode)
+ regmap_update_bits(sai->regmap, FSL_SAI_xCR4(tx, ofs),
+ FSL_SAI_CR4_FSD_MSTR, 0);
+
regmap_update_bits(sai->regmap, FSL_SAI_xCR4(tx, ofs),
FSL_SAI_CR4_SYWD_MASK | FSL_SAI_CR4_FRSZ_MASK |
FSL_SAI_CR4_CHMOD_MASK,
regmap_update_bits(sai->regmap, FSL_SAI_xCR5(tx, ofs),
FSL_SAI_CR5_WNW_MASK | FSL_SAI_CR5_W0W_MASK |
FSL_SAI_CR5_FBT_MASK, val_cr5);
+
+ /* Enable FSD_MSTR after configuring word width */
+ if (sai->soc_data->mclk_with_tere && sai->mclk_direction_output &&
+ !sai->is_consumer_mode)
+ regmap_update_bits(sai->regmap, FSL_SAI_xCR4(tx, ofs),
+ FSL_SAI_CR4_FSD_MSTR, FSL_SAI_CR4_FSD_MSTR);
+
regmap_write(sai->regmap, FSL_SAI_xMR(tx),
~0UL - ((1 << min(channels, slots)) - 1));
bool tx = substream->stream == SNDRV_PCM_STREAM_PLAYBACK;
unsigned int ofs = sai->soc_data->reg_offset;
+ /* Clear xMR to avoid channel swap with mclk_with_tere enabled case */
+ regmap_write(sai->regmap, FSL_SAI_xMR(tx), 0);
+
regmap_update_bits(sai->regmap, FSL_SAI_xCR3(tx, ofs),
FSL_SAI_CR3_TRCE_MASK, 0);
struct device *dev = &xcvr->pdev->dev;
int ret;
- freq = xcvr->soc_data->spdif_only ? freq / 10 : freq;
+ freq = xcvr->soc_data->spdif_only ? freq / 5 : freq;
clk_disable_unprepare(xcvr->phy_clk);
ret = clk_set_rate(xcvr->phy_clk, freq);
if (ret < 0) {
bool tx = substream->stream == SNDRV_PCM_STREAM_PLAYBACK;
u32 m_ctl = 0, v_ctl = 0;
u32 r = substream->runtime->rate, ch = substream->runtime->channels;
- u32 fout = 32 * r * ch * 10 * 2;
+ u32 fout = 32 * r * ch * 10;
int ret = 0;
switch (xcvr->mode) {
case FSL_XCVR_MODE_SPDIF:
+ if (xcvr->soc_data->spdif_only && tx) {
+ ret = regmap_update_bits(xcvr->regmap, FSL_XCVR_TX_DPTH_CTRL_SET,
+ FSL_XCVR_TX_DPTH_CTRL_BYPASS_FEM,
+ FSL_XCVR_TX_DPTH_CTRL_BYPASS_FEM);
+ if (ret < 0) {
+ dev_err(dai->dev, "Failed to set bypass fem: %d\n", ret);
+ return ret;
+ }
+ }
+ fallthrough;
case FSL_XCVR_MODE_ARC:
if (tx) {
ret = fsl_xcvr_en_aud_pll(xcvr, fout);
#define BYT_RT5640_HSMIC2_ON_IN1 BIT(27)
#define BYT_RT5640_JD_HP_ELITEP_1000G2 BIT(28)
#define BYT_RT5640_USE_AMCR0F28 BIT(29)
+#define BYT_RT5640_SWAPPED_SPEAKERS BIT(30)
#define BYTCR_INPUT_DEFAULTS \
(BYT_RT5640_IN3_MAP | \
dev_info(dev, "quirk MONO_SPEAKER enabled\n");
if (byt_rt5640_quirk & BYT_RT5640_NO_SPEAKERS)
dev_info(dev, "quirk NO_SPEAKERS enabled\n");
+ if (byt_rt5640_quirk & BYT_RT5640_SWAPPED_SPEAKERS)
+ dev_info(dev, "quirk SWAPPED_SPEAKERS enabled\n");
if (byt_rt5640_quirk & BYT_RT5640_LINEOUT)
dev_info(dev, "quirk LINEOUT enabled\n");
if (byt_rt5640_quirk & BYT_RT5640_LINEOUT_AS_HP2)
BYT_RT5640_SSP0_AIF1 |
BYT_RT5640_MCLK_EN),
},
+ {
+ /* Medion Lifetab S10346 */
+ .matches = {
+ DMI_MATCH(DMI_BOARD_VENDOR, "AMI Corporation"),
+ DMI_MATCH(DMI_BOARD_NAME, "Aptio CRB"),
+ /* Above strings are much too generic, also match on BIOS date */
+ DMI_MATCH(DMI_BIOS_DATE, "10/22/2015"),
+ },
+ .driver_data = (void *)(BYTCR_INPUT_DEFAULTS |
+ BYT_RT5640_SWAPPED_SPEAKERS |
+ BYT_RT5640_SSP0_AIF1 |
+ BYT_RT5640_MCLK_EN),
+ },
{ /* Mele PCG03 Mini PC */
.matches = {
DMI_EXACT_MATCH(DMI_BOARD_VENDOR, "Mini PC"),
const char *platform_name;
struct acpi_device *adev;
struct device *codec_dev;
+ const char *cfg_spk;
bool sof_parent;
int ret_val = 0;
int dai_index = 0;
- int i, cfg_spk;
- int aif;
+ int i, aif;
is_bytcr = false;
priv = devm_kzalloc(dev, sizeof(*priv), GFP_KERNEL);
}
if (byt_rt5640_quirk & BYT_RT5640_NO_SPEAKERS) {
- cfg_spk = 0;
+ cfg_spk = "0";
spk_type = "none";
} else if (byt_rt5640_quirk & BYT_RT5640_MONO_SPEAKER) {
- cfg_spk = 1;
+ cfg_spk = "1";
spk_type = "mono";
+ } else if (byt_rt5640_quirk & BYT_RT5640_SWAPPED_SPEAKERS) {
+ cfg_spk = "swapped";
+ spk_type = "swapped";
} else {
- cfg_spk = 2;
+ cfg_spk = "2";
spk_type = "stereo";
}
headset2_string = " cfg-hs2:in1";
snprintf(byt_rt5640_components, sizeof(byt_rt5640_components),
- "cfg-spk:%d cfg-mic:%s aif:%d%s%s", cfg_spk,
+ "cfg-spk:%s cfg-mic:%s aif:%d%s%s", cfg_spk,
map_name[BYT_RT5640_MAP(byt_rt5640_quirk)], aif,
lineout_string, headset2_string);
byt_rt5640_card.components = byt_rt5640_components;
card->dapm_widgets = skl_hda_widgets;
card->num_dapm_widgets = ARRAY_SIZE(skl_hda_widgets);
if (!ctx->idisp_codec) {
+ card->dapm_routes = &skl_hda_map[IDISP_ROUTE_COUNT];
+ num_route -= IDISP_ROUTE_COUNT;
for (i = 0; i < IDISP_DAI_COUNT; i++) {
skl_hda_be_dai_links[i].codecs = &snd_soc_dummy_dlc;
skl_hda_be_dai_links[i].num_codecs = 1;
{
struct device *dev = card->dev;
struct snd_soc_acpi_mach *mach = dev_get_platdata(card->dev);
- int sdw_be_num = 0, ssp_num = 0, dmic_num = 0, hdmi_num = 0, bt_num = 0;
+ int sdw_be_num = 0, ssp_num = 0, dmic_num = 0, bt_num = 0;
struct mc_private *ctx = snd_soc_card_get_drvdata(card);
struct snd_soc_acpi_mach_params *mach_params = &mach->mach_params;
const struct snd_soc_acpi_link_adr *adr_link = mach_params->links;
char *codec_name, *codec_dai_name;
int i, j, be_id = 0;
int codec_index;
+ int hdmi_num;
int ret;
ret = get_dailink_info(dev, adr_link, &sdw_be_num, &codec_conf_num);
ssp_num = hweight_long(ssp_mask);
}
- if (mach_params->codec_mask & IDISP_CODEC_MASK) {
+ if (mach_params->codec_mask & IDISP_CODEC_MASK)
ctx->hdmi.idisp_codec = true;
- if (sof_sdw_quirk & SOF_SDW_TGL_HDMI)
- hdmi_num = SOF_TGL_HDMI_COUNT;
- else
- hdmi_num = SOF_PRE_TGL_HDMI_COUNT;
- }
+ if (sof_sdw_quirk & SOF_SDW_TGL_HDMI)
+ hdmi_num = SOF_TGL_HDMI_COUNT;
+ else
+ hdmi_num = SOF_PRE_TGL_HDMI_COUNT;
/* enable dmic01 & dmic16k */
if (sof_sdw_quirk & SOF_SDW_PCH_DMIC || mach_params->dmic_num)
bt_num = 1;
dev_dbg(dev, "sdw %d, ssp %d, dmic %d, hdmi %d, bt: %d\n",
- sdw_be_num, ssp_num, dmic_num, hdmi_num, bt_num);
+ sdw_be_num, ssp_num, dmic_num,
+ ctx->hdmi.idisp_codec ? hdmi_num : 0, bt_num);
/* allocate BE dailinks */
num_links = sdw_be_num + ssp_num + dmic_num + hdmi_num + bt_num;
.adr = 0x00013701FA355601ull,
.num_endpoints = 1,
.endpoints = &spk_r_endpoint,
- .name_prefix = "cs35l56-8"
+ .name_prefix = "AMP8"
},
{
.adr = 0x00013601FA355601ull,
.num_endpoints = 1,
.endpoints = &spk_3_endpoint,
- .name_prefix = "cs35l56-7"
+ .name_prefix = "AMP7"
}
};
.adr = 0x00023301FA355601ull,
.num_endpoints = 1,
.endpoints = &spk_l_endpoint,
- .name_prefix = "cs35l56-1"
+ .name_prefix = "AMP1"
},
{
.adr = 0x00023201FA355601ull,
.num_endpoints = 1,
.endpoints = &spk_2_endpoint,
- .name_prefix = "cs35l56-2"
+ .name_prefix = "AMP2"
}
};
snd_pcm_set_sync(substream);
mconfig = skl_tplg_fe_get_cpr_module(dai, substream->stream);
- if (!mconfig)
+ if (!mconfig) {
+ kfree(dma_params);
return -EINVAL;
+ }
skl_tplg_d0i3_get(skl, mconfig->d0i3_caps);
dais = krealloc(skl->dais, sizeof(skl_fe_dai) +
sizeof(skl_platform_dai), GFP_KERNEL);
if (!dais) {
+ kfree(skl->dais);
ret = -ENOMEM;
goto err;
}
ret = devm_snd_soc_register_component(dev, &skl_component,
skl->dais, num_dais);
- if (ret)
+ if (ret) {
+ kfree(skl->dais);
dev_err(dev, "soc component registration failed %d\n", ret);
+ }
err:
return ret;
}
reply.size = (reply.header >> 32) & IPC_DATA_OFFSET_SZ_MASK;
buf = krealloc(reply.data, reply.size, GFP_KERNEL);
- if (!buf)
+ if (!buf) {
+ kfree(reply.data);
return -ENOMEM;
+ }
*payload = buf;
*bytes = reply.size;
static int sc8280xp_snd_init(struct snd_soc_pcm_runtime *rtd)
{
struct sc8280xp_snd_data *data = snd_soc_card_get_drvdata(rtd->card);
+ struct snd_soc_dai *cpu_dai = snd_soc_rtd_to_cpu(rtd, 0);
+ struct snd_soc_card *card = rtd->card;
+
+ switch (cpu_dai->id) {
+ case WSA_CODEC_DMA_RX_0:
+ case WSA_CODEC_DMA_RX_1:
+ /*
+ * set limit of 0dB on Digital Volume for Speakers,
+ * this can prevent damage of speakers to some extent without
+ * active speaker protection
+ */
+ snd_soc_limit_volume(card, "WSA_RX0 Digital Volume", 84);
+ snd_soc_limit_volume(card, "WSA_RX1 Digital Volume", 84);
+ break;
+ default:
+ break;
+ }
return qcom_snd_wcd_jack_setup(rtd, &data->jack, &data->jack_setup);
}
kctl = snd_soc_card_get_kcontrol(card, name);
if (kctl) {
struct soc_mixer_control *mc = (struct soc_mixer_control *)kctl->private_value;
- if (max <= mc->max) {
+ if (max <= mc->max - mc->min) {
mc->platform_max = max;
ret = 0;
}
if (snd_soc_dai_active(dai) == 0 &&
(dai->rate || dai->channels || dai->sample_bits))
soc_pcm_set_dai_params(dai, NULL);
-
- if (snd_soc_dai_stream_active(dai, substream->stream) == 0) {
- if (dai->driver->ops && !dai->driver->ops->mute_unmute_on_trigger)
- snd_soc_dai_digital_mute(dai, 1, substream->stream);
- }
}
}
if (snd_soc_dai_active(dai) == 1)
soc_pcm_set_dai_params(dai, NULL);
- if (snd_soc_dai_stream_active(dai, substream->stream) == 1)
- snd_soc_dai_digital_mute(dai, 1, substream->stream);
+ if (snd_soc_dai_stream_active(dai, substream->stream) == 1) {
+ if (dai->driver->ops && !dai->driver->ops->mute_unmute_on_trigger)
+ snd_soc_dai_digital_mute(dai, 1, substream->stream);
+ }
}
/* run the stream event */
static int sof_ipc3_widget_setup_comp_pipeline(struct snd_sof_widget *swidget)
{
struct snd_soc_component *scomp = swidget->scomp;
+ struct snd_sof_pipeline *spipe = swidget->spipe;
struct sof_ipc_pipe_new *pipeline;
struct snd_sof_widget *comp_swidget;
int ret;
swidget->dynamic_pipeline_widget);
swidget->core = pipeline->core;
+ spipe->core_mask |= BIT(pipeline->core);
return 0;
struct sof_ipc4_control_data *cdata = scontrol->ipc_control_data;
struct sof_ipc4_gain *gain = swidget->private;
struct sof_ipc4_msg *msg = &cdata->msg;
- struct sof_ipc4_gain_data data;
+ struct sof_ipc4_gain_params params;
bool all_channels_equal = true;
u32 value;
int ret, i;
*/
for (i = 0; i < scontrol->num_channels; i++) {
if (all_channels_equal) {
- data.channels = SOF_IPC4_GAIN_ALL_CHANNELS_MASK;
- data.init_val = cdata->chanv[0].value;
+ params.channels = SOF_IPC4_GAIN_ALL_CHANNELS_MASK;
+ params.init_val = cdata->chanv[0].value;
} else {
- data.channels = cdata->chanv[i].channel;
- data.init_val = cdata->chanv[i].value;
+ params.channels = cdata->chanv[i].channel;
+ params.init_val = cdata->chanv[i].value;
}
/* set curve type and duration from topology */
- data.curve_duration_l = gain->data.curve_duration_l;
- data.curve_duration_h = gain->data.curve_duration_h;
- data.curve_type = gain->data.curve_type;
+ params.curve_duration_l = gain->data.params.curve_duration_l;
+ params.curve_duration_h = gain->data.params.curve_duration_h;
+ params.curve_type = gain->data.params.curve_type;
- msg->data_ptr = &data;
- msg->data_size = sizeof(data);
+ msg->data_ptr = ¶ms;
+ msg->data_size = sizeof(params);
ret = sof_ipc4_set_get_kcontrol_data(scontrol, true, lock);
msg->data_ptr = NULL;
static const struct sof_topology_token gain_tokens[] = {
{SOF_TKN_GAIN_RAMP_TYPE, SND_SOC_TPLG_TUPLE_TYPE_WORD,
- get_token_u32, offsetof(struct sof_ipc4_gain_data, curve_type)},
+ get_token_u32, offsetof(struct sof_ipc4_gain_params, curve_type)},
{SOF_TKN_GAIN_RAMP_DURATION,
SND_SOC_TPLG_TUPLE_TYPE_WORD, get_token_u32,
- offsetof(struct sof_ipc4_gain_data, curve_duration_l)},
+ offsetof(struct sof_ipc4_gain_params, curve_duration_l)},
{SOF_TKN_GAIN_VAL, SND_SOC_TPLG_TUPLE_TYPE_WORD,
- get_token_u32, offsetof(struct sof_ipc4_gain_data, init_val)},
+ get_token_u32, offsetof(struct sof_ipc4_gain_params, init_val)},
};
/* SRC */
static const struct sof_topology_token src_tokens[] = {
{SOF_TKN_SRC_RATE_OUT, SND_SOC_TPLG_TUPLE_TYPE_WORD, get_token_u32,
- offsetof(struct sof_ipc4_src, sink_rate)},
+ offsetof(struct sof_ipc4_src_data, sink_rate)},
};
static const struct sof_token_info ipc4_token_list[SOF_TOKEN_COUNT] = {
{
struct snd_soc_component *scomp = swidget->scomp;
struct sof_ipc4_pipeline *pipeline;
+ struct snd_sof_pipeline *spipe = swidget->spipe;
int ret;
pipeline = kzalloc(sizeof(*pipeline), GFP_KERNEL);
}
swidget->core = pipeline->core_id;
+ spipe->core_mask |= BIT(pipeline->core_id);
if (pipeline->use_chain_dma) {
dev_dbg(scomp->dev, "Set up chain DMA for %s\n", swidget->widget->name);
swidget->private = gain;
- gain->data.channels = SOF_IPC4_GAIN_ALL_CHANNELS_MASK;
- gain->data.init_val = SOF_IPC4_VOL_ZERO_DB;
+ gain->data.params.channels = SOF_IPC4_GAIN_ALL_CHANNELS_MASK;
+ gain->data.params.init_val = SOF_IPC4_VOL_ZERO_DB;
- ret = sof_ipc4_get_audio_fmt(scomp, swidget, &gain->available_fmt, &gain->base_config);
+ ret = sof_ipc4_get_audio_fmt(scomp, swidget, &gain->available_fmt, &gain->data.base_config);
if (ret)
goto err;
- ret = sof_update_ipc_object(scomp, &gain->data, SOF_GAIN_TOKENS, swidget->tuples,
- swidget->num_tuples, sizeof(gain->data), 1);
+ ret = sof_update_ipc_object(scomp, &gain->data.params, SOF_GAIN_TOKENS,
+ swidget->tuples, swidget->num_tuples, sizeof(gain->data), 1);
if (ret) {
dev_err(scomp->dev, "Parsing gain tokens failed\n");
goto err;
dev_dbg(scomp->dev,
"pga widget %s: ramp type: %d, ramp duration %d, initial gain value: %#x\n",
- swidget->widget->name, gain->data.curve_type, gain->data.curve_duration_l,
- gain->data.init_val);
+ swidget->widget->name, gain->data.params.curve_type,
+ gain->data.params.curve_duration_l, gain->data.params.init_val);
ret = sof_ipc4_widget_setup_msg(swidget, &gain->msg);
if (ret)
static int sof_ipc4_widget_setup_comp_src(struct snd_sof_widget *swidget)
{
struct snd_soc_component *scomp = swidget->scomp;
+ struct snd_sof_pipeline *spipe = swidget->spipe;
struct sof_ipc4_src *src;
int ret;
swidget->private = src;
- ret = sof_ipc4_get_audio_fmt(scomp, swidget, &src->available_fmt, &src->base_config);
+ ret = sof_ipc4_get_audio_fmt(scomp, swidget, &src->available_fmt,
+ &src->data.base_config);
if (ret)
goto err;
- ret = sof_update_ipc_object(scomp, src, SOF_SRC_TOKENS, swidget->tuples,
+ ret = sof_update_ipc_object(scomp, &src->data, SOF_SRC_TOKENS, swidget->tuples,
swidget->num_tuples, sizeof(*src), 1);
if (ret) {
dev_err(scomp->dev, "Parsing SRC tokens failed\n");
goto err;
}
- dev_dbg(scomp->dev, "SRC sink rate %d\n", src->sink_rate);
+ spipe->core_mask |= BIT(swidget->core);
+
+ dev_dbg(scomp->dev, "SRC sink rate %d\n", src->data.sink_rate);
ret = sof_ipc4_widget_setup_msg(swidget, &src->msg);
if (ret)
{
struct snd_soc_component *scomp = swidget->scomp;
struct sof_ipc4_fw_module *fw_module;
+ struct snd_sof_pipeline *spipe = swidget->spipe;
struct sof_ipc4_process *process;
void *cfg;
int ret;
sof_ipc4_widget_update_kcontrol_module_id(swidget);
+ /* set pipeline core mask to keep track of the core the module is scheduled to run on */
+ spipe->core_mask |= BIT(swidget->core);
+
return 0;
free_base_cfg_ext:
kfree(process->base_config_ext);
u32 out_ref_rate, out_ref_channels, out_ref_valid_bits;
int ret;
- ret = sof_ipc4_init_input_audio_fmt(sdev, swidget, &gain->base_config,
+ ret = sof_ipc4_init_input_audio_fmt(sdev, swidget, &gain->data.base_config,
pipeline_params, available_fmt);
if (ret < 0)
return ret;
out_ref_channels = SOF_IPC4_AUDIO_FORMAT_CFG_CHANNELS_COUNT(in_fmt->fmt_cfg);
out_ref_valid_bits = SOF_IPC4_AUDIO_FORMAT_CFG_V_BIT_DEPTH(in_fmt->fmt_cfg);
- ret = sof_ipc4_init_output_audio_fmt(sdev, &gain->base_config, available_fmt,
+ ret = sof_ipc4_init_output_audio_fmt(sdev, &gain->data.base_config, available_fmt,
out_ref_rate, out_ref_channels, out_ref_valid_bits);
if (ret < 0) {
dev_err(sdev->dev, "Failed to initialize output format for %s",
}
/* update pipeline memory usage */
- sof_ipc4_update_resource_usage(sdev, swidget, &gain->base_config);
+ sof_ipc4_update_resource_usage(sdev, swidget, &gain->data.base_config);
return 0;
}
u32 out_ref_rate, out_ref_channels, out_ref_valid_bits;
int output_format_index, input_format_index;
- input_format_index = sof_ipc4_init_input_audio_fmt(sdev, swidget, &src->base_config,
+ input_format_index = sof_ipc4_init_input_audio_fmt(sdev, swidget, &src->data.base_config,
pipeline_params, available_fmt);
if (input_format_index < 0)
return input_format_index;
*/
out_ref_rate = params_rate(fe_params);
- output_format_index = sof_ipc4_init_output_audio_fmt(sdev, &src->base_config,
+ output_format_index = sof_ipc4_init_output_audio_fmt(sdev, &src->data.base_config,
available_fmt, out_ref_rate,
out_ref_channels, out_ref_valid_bits);
if (output_format_index < 0) {
}
/* update pipeline memory usage */
- sof_ipc4_update_resource_usage(sdev, swidget, &src->base_config);
+ sof_ipc4_update_resource_usage(sdev, swidget, &src->data.base_config);
out_audio_fmt = &available_fmt->output_pin_fmts[output_format_index].audio_fmt;
- src->sink_rate = out_audio_fmt->sampling_frequency;
+ src->data.sink_rate = out_audio_fmt->sampling_frequency;
/* update pipeline_params for sink widgets */
return sof_ipc4_update_hw_params(sdev, pipeline_params, out_audio_fmt);
{
struct sof_ipc4_gain *gain = swidget->private;
- ipc_size = sizeof(struct sof_ipc4_base_module_cfg) +
- sizeof(struct sof_ipc4_gain_data);
- ipc_data = gain;
+ ipc_size = sizeof(gain->data);
+ ipc_data = &gain->data;
msg = &gain->msg;
break;
{
struct sof_ipc4_src *src = swidget->private;
- ipc_size = sizeof(struct sof_ipc4_base_module_cfg) + sizeof(src->sink_rate);
- ipc_data = src;
+ ipc_size = sizeof(src->data);
+ ipc_data = &src->data;
msg = &src->msg;
break;
} __packed;
/**
- * struct sof_ipc4_gain_data - IPC gain blob
+ * struct sof_ipc4_gain_params - IPC gain parameters
* @channels: Channels
* @init_val: Initial value
* @curve_type: Curve type
* @curve_duration_l: Curve duration low part
* @curve_duration_h: Curve duration high part
*/
-struct sof_ipc4_gain_data {
+struct sof_ipc4_gain_params {
uint32_t channels;
uint32_t init_val;
uint32_t curve_type;
uint32_t reserved;
uint32_t curve_duration_l;
uint32_t curve_duration_h;
-} __aligned(8);
+} __packed __aligned(4);
/**
- * struct sof_ipc4_gain - gain config data
+ * struct sof_ipc4_gain_data - IPC gain init blob
* @base_config: IPC base config data
+ * @params: Initial parameters for the gain module
+ */
+struct sof_ipc4_gain_data {
+ struct sof_ipc4_base_module_cfg base_config;
+ struct sof_ipc4_gain_params params;
+} __packed __aligned(4);
+
+/**
+ * struct sof_ipc4_gain - gain config data
* @data: IPC gain blob
* @available_fmt: Available audio format
* @msg: message structure for gain
*/
struct sof_ipc4_gain {
- struct sof_ipc4_base_module_cfg base_config;
struct sof_ipc4_gain_data data;
struct sof_ipc4_available_audio_format available_fmt;
struct sof_ipc4_msg msg;
struct sof_ipc4_msg msg;
};
-/**
- * struct sof_ipc4_src SRC config data
+/*
+ * struct sof_ipc4_src_data - IPC data for SRC
* @base_config: IPC base config data
* @sink_rate: Output rate for sink module
+ */
+struct sof_ipc4_src_data {
+ struct sof_ipc4_base_module_cfg base_config;
+ uint32_t sink_rate;
+} __packed __aligned(4);
+
+/**
+ * struct sof_ipc4_src - SRC config data
+ * @data: IPC base config data
* @available_fmt: Available audio format
* @msg: IPC4 message struct containing header and data info
*/
struct sof_ipc4_src {
- struct sof_ipc4_base_module_cfg base_config;
- uint32_t sink_rate;
+ struct sof_ipc4_src_data data;
struct sof_ipc4_available_audio_format available_fmt;
struct sof_ipc4_msg msg;
};
struct snd_sof_widget *swidget)
{
const struct sof_ipc_tplg_ops *tplg_ops = sof_ipc_get_ops(sdev, tplg);
+ struct snd_sof_pipeline *spipe = swidget->spipe;
struct snd_sof_widget *pipe_widget;
int err = 0;
int ret;
}
/*
- * disable widget core. continue to route setup status and complete flag
- * even if this fails and return the appropriate error
+ * decrement ref count for cores associated with all modules in the pipeline and clear
+ * the complete flag
*/
- ret = snd_sof_dsp_core_put(sdev, swidget->core);
- if (ret < 0) {
- dev_err(sdev->dev, "error: failed to disable target core: %d for widget %s\n",
- swidget->core, swidget->widget->name);
- if (!err)
- err = ret;
+ if (swidget->id == snd_soc_dapm_scheduler) {
+ int i;
+
+ for_each_set_bit(i, &spipe->core_mask, sdev->num_cores) {
+ ret = snd_sof_dsp_core_put(sdev, i);
+ if (ret < 0) {
+ dev_err(sdev->dev, "failed to disable target core: %d for pipeline %s\n",
+ i, swidget->widget->name);
+ if (!err)
+ err = ret;
+ }
+ }
+ swidget->spipe->complete = 0;
}
/*
err = ret;
}
- /* clear pipeline complete */
- if (swidget->id == snd_soc_dapm_scheduler)
- swidget->spipe->complete = 0;
-
if (!err)
dev_dbg(sdev->dev, "widget %s freed\n", swidget->widget->name);
struct snd_sof_widget *swidget)
{
const struct sof_ipc_tplg_ops *tplg_ops = sof_ipc_get_ops(sdev, tplg);
+ struct snd_sof_pipeline *spipe = swidget->spipe;
bool use_count_decremented = false;
int ret;
+ int i;
/* skip if there is no private data */
if (!swidget->private)
goto use_count_dec;
}
- /* enable widget core */
- ret = snd_sof_dsp_core_get(sdev, swidget->core);
- if (ret < 0) {
- dev_err(sdev->dev, "error: failed to enable target core for widget %s\n",
- swidget->widget->name);
- goto pipe_widget_free;
+ /* update ref count for cores associated with all modules in the pipeline */
+ if (swidget->id == snd_soc_dapm_scheduler) {
+ for_each_set_bit(i, &spipe->core_mask, sdev->num_cores) {
+ ret = snd_sof_dsp_core_get(sdev, i);
+ if (ret < 0) {
+ dev_err(sdev->dev, "failed to enable target core %d for pipeline %s\n",
+ i, swidget->widget->name);
+ goto pipe_widget_free;
+ }
+ }
}
/* setup widget in the DSP */
if (tplg_ops && tplg_ops->widget_setup) {
ret = tplg_ops->widget_setup(sdev, swidget);
if (ret < 0)
- goto core_put;
+ goto pipe_widget_free;
}
/* send config for DAI components */
return 0;
widget_free:
- /* widget use_count and core ref_count will both be decremented by sof_widget_free() */
+ /* widget use_count will be decremented by sof_widget_free() */
sof_widget_free_unlocked(sdev, swidget);
use_count_decremented = true;
-core_put:
- if (!use_count_decremented)
- snd_sof_dsp_core_put(sdev, swidget->core);
pipe_widget_free:
- if (swidget->id != snd_soc_dapm_scheduler)
+ if (swidget->id != snd_soc_dapm_scheduler) {
sof_widget_free_unlocked(sdev, swidget->spipe->pipe_widget);
+ } else {
+ int j;
+
+ /* decrement ref count for all cores that were updated previously */
+ for_each_set_bit(j, &spipe->core_mask, sdev->num_cores) {
+ if (j >= i)
+ break;
+ snd_sof_dsp_core_put(sdev, j);
+ }
+ }
use_count_dec:
if (!use_count_decremented)
swidget->use_count--;
* @paused_count: Count of number of PCM's that have started and have currently paused this
pipeline
* @complete: flag used to indicate that pipeline set up is complete.
+ * @core_mask: Mask containing target cores for all modules in the pipeline
* @list: List item in sdev pipeline_list
*/
struct snd_sof_pipeline {
int started_count;
int paused_count;
int complete;
+ unsigned long core_mask;
struct list_head list;
};
/* perform pcm set op */
if (ipc_pcm_ops && ipc_pcm_ops->pcm_setup) {
ret = ipc_pcm_ops->pcm_setup(sdev, spcm);
- if (ret < 0)
+ if (ret < 0) {
+ kfree(spcm);
return ret;
+ }
}
dai_drv->dobj.private = spcm;
#define SND_DJM_850_IDX 0x2
#define SND_DJM_900NXS2_IDX 0x3
#define SND_DJM_750MK2_IDX 0x4
+#define SND_DJM_450_IDX 0x5
#define SND_DJM_CTL(_name, suffix, _default_value, _windex) { \
};
+// DJM-450
+static const u16 snd_djm_opts_450_cap1[] = {
+ 0x0103, 0x0100, 0x0106, 0x0107, 0x0108, 0x0109, 0x010d, 0x010a };
+
+static const u16 snd_djm_opts_450_cap2[] = {
+ 0x0203, 0x0200, 0x0206, 0x0207, 0x0208, 0x0209, 0x020d, 0x020a };
+
+static const u16 snd_djm_opts_450_cap3[] = {
+ 0x030a, 0x0311, 0x0312, 0x0307, 0x0308, 0x0309, 0x030d };
+
+static const u16 snd_djm_opts_450_pb1[] = { 0x0100, 0x0101, 0x0104 };
+static const u16 snd_djm_opts_450_pb2[] = { 0x0200, 0x0201, 0x0204 };
+static const u16 snd_djm_opts_450_pb3[] = { 0x0300, 0x0301, 0x0304 };
+
+static const struct snd_djm_ctl snd_djm_ctls_450[] = {
+ SND_DJM_CTL("Capture Level", cap_level, 0, SND_DJM_WINDEX_CAPLVL),
+ SND_DJM_CTL("Ch1 Input", 450_cap1, 2, SND_DJM_WINDEX_CAP),
+ SND_DJM_CTL("Ch2 Input", 450_cap2, 2, SND_DJM_WINDEX_CAP),
+ SND_DJM_CTL("Ch3 Input", 450_cap3, 0, SND_DJM_WINDEX_CAP),
+ SND_DJM_CTL("Ch1 Output", 450_pb1, 0, SND_DJM_WINDEX_PB),
+ SND_DJM_CTL("Ch2 Output", 450_pb2, 1, SND_DJM_WINDEX_PB),
+ SND_DJM_CTL("Ch3 Output", 450_pb3, 2, SND_DJM_WINDEX_PB)
+};
+
+
// DJM-750
static const u16 snd_djm_opts_750_cap1[] = {
0x0101, 0x0103, 0x0106, 0x0107, 0x0108, 0x0109, 0x010a, 0x010f };
[SND_DJM_850_IDX] = SND_DJM_DEVICE(850),
[SND_DJM_900NXS2_IDX] = SND_DJM_DEVICE(900nxs2),
[SND_DJM_750MK2_IDX] = SND_DJM_DEVICE(750mk2),
+ [SND_DJM_450_IDX] = SND_DJM_DEVICE(450),
};
case USB_ID(0x2b73, 0x0017): /* Pioneer DJ DJM-250MK2 */
err = snd_djm_controls_create(mixer, SND_DJM_250MK2_IDX);
break;
+ case USB_ID(0x2b73, 0x0013): /* Pioneer DJ DJM-450 */
+ err = snd_djm_controls_create(mixer, SND_DJM_450_IDX);
+ break;
case USB_ID(0x08e4, 0x017f): /* Pioneer DJ DJM-750 */
err = snd_djm_controls_create(mixer, SND_DJM_750_IDX);
break;
static int snd_usb_motu_m_series_boot_quirk(struct usb_device *dev)
{
- msleep(2000);
+ msleep(4000);
return 0;
}
unsigned int id)
{
switch (id) {
- case USB_ID(0x07fd, 0x0008): /* MOTU M Series */
+ case USB_ID(0x07fd, 0x0008): /* MOTU M Series, 1st hardware version */
return snd_usb_motu_m_series_boot_quirk(dev);
}
#define ARM_CPU_PART_CORTEX_A78AE 0xD42
#define ARM_CPU_PART_CORTEX_X1 0xD44
#define ARM_CPU_PART_CORTEX_A510 0xD46
+#define ARM_CPU_PART_CORTEX_A520 0xD80
#define ARM_CPU_PART_CORTEX_A710 0xD47
#define ARM_CPU_PART_CORTEX_A715 0xD4D
#define ARM_CPU_PART_CORTEX_X2 0xD48
#define ARM_CPU_PART_NEOVERSE_N2 0xD49
#define ARM_CPU_PART_CORTEX_A78C 0xD4B
-#define APM_CPU_PART_POTENZA 0x000
+#define APM_CPU_PART_XGENE 0x000
+#define APM_CPU_VAR_POTENZA 0x00
#define CAVIUM_CPU_PART_THUNDERX 0x0A1
#define CAVIUM_CPU_PART_THUNDERX_81XX 0x0A2
#define MIDR_CORTEX_A78AE MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A78AE)
#define MIDR_CORTEX_X1 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_X1)
#define MIDR_CORTEX_A510 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A510)
+#define MIDR_CORTEX_A520 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A520)
#define MIDR_CORTEX_A710 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A710)
#define MIDR_CORTEX_A715 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A715)
#define MIDR_CORTEX_X2 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_X2)
#define KVM_HYPERCALL_EXIT_SMC (1U << 0)
#define KVM_HYPERCALL_EXIT_16BIT (1U << 1)
+/*
+ * Get feature ID registers userspace writable mask.
+ *
+ * From DDI0487J.a, D19.2.66 ("ID_AA64MMFR2_EL1, AArch64 Memory Model
+ * Feature Register 2"):
+ *
+ * "The Feature ID space is defined as the System register space in
+ * AArch64 with op0==3, op1=={0, 1, 3}, CRn==0, CRm=={0-7},
+ * op2=={0-7}."
+ *
+ * This covers all currently known R/O registers that indicate
+ * anything useful feature wise, including the ID registers.
+ *
+ * If we ever need to introduce a new range, it will be described as
+ * such in the range field.
+ */
+#define KVM_ARM_FEATURE_ID_RANGE_IDX(op0, op1, crn, crm, op2) \
+ ({ \
+ __u64 __op1 = (op1) & 3; \
+ __op1 -= (__op1 == 3); \
+ (__op1 << 6 | ((crm) & 7) << 3 | (op2)); \
+ })
+
+#define KVM_ARM_FEATURE_ID_RANGE 0
+#define KVM_ARM_FEATURE_ID_RANGE_SIZE (3 * 8 * 8)
+
+struct reg_mask_range {
+ __u64 addr; /* Pointer to mask array */
+ __u32 range; /* Requested range */
+ __u32 reserved[13];
+};
+
#endif
#endif /* __ARM_KVM_H__ */
PERF_REG_ARM64_LR,
PERF_REG_ARM64_SP,
PERF_REG_ARM64_PC,
+ PERF_REG_ARM64_MAX,
/* Extended/pseudo registers */
- PERF_REG_ARM64_VG = 46, // SVE Vector Granule
-
- PERF_REG_ARM64_MAX = PERF_REG_ARM64_PC + 1,
- PERF_REG_ARM64_EXTENDED_MAX = PERF_REG_ARM64_VG + 1
+ PERF_REG_ARM64_VG = 46, /* SVE Vector Granule */
+ PERF_REG_ARM64_EXTENDED_MAX
};
+
+#define PERF_REG_EXTENDED_MASK (1ULL << PERF_REG_ARM64_VG)
+
#endif /* _ASM_ARM64_PERF_REGS_H */
arm64_tools_dir = $(top_srcdir)/arch/arm64/tools
arm64_sysreg_tbl = $(arm64_tools_dir)/sysreg
arm64_gen_sysreg = $(arm64_tools_dir)/gen-sysreg.awk
-arm64_generated_dir = $(top_srcdir)/tools/arch/arm64/include/generated
+arm64_generated_dir = $(OUTPUT)arch/arm64/include/generated
arm64_sysreg_defs = $(arm64_generated_dir)/asm/sysreg-defs.h
all: $(arm64_sysreg_defs)
/* We now return you to your regularly scheduled HPUX. */
-#define ENOSYM 215 /* symbol does not exist in executable */
#define ENOTSOCK 216 /* Socket operation on non-socket */
#define EDESTADDRREQ 217 /* Destination address required */
#define EMSGSIZE 218 /* Message too long */
#define ETIMEDOUT 238 /* Connection timed out */
#define ECONNREFUSED 239 /* Connection refused */
#define EREFUSED ECONNREFUSED /* for HP's NFS apparently */
-#define EREMOTERELEASE 240 /* Remote peer released connection */
#define EHOSTDOWN 241 /* Host is down */
#define EHOSTUNREACH 242 /* No route to host */
__u8 reserved[1728];
};
+#define KVM_S390_VM_CPU_PROCESSOR_UV_FEAT_GUEST 6
+#define KVM_S390_VM_CPU_MACHINE_UV_FEAT_GUEST 7
+
+#define KVM_S390_VM_CPU_UV_FEAT_NR_BITS 64
+struct kvm_s390_vm_cpu_uv_feat {
+ union {
+ struct {
+ __u64 : 4;
+ __u64 ap : 1; /* bit 4 */
+ __u64 ap_intr : 1; /* bit 5 */
+ __u64 : 58;
+ };
+ __u64 feat;
+ };
+};
+
/* kvm attributes for crypto */
#define KVM_S390_VM_CRYPTO_ENABLE_AES_KW 0
#define KVM_S390_VM_CRYPTO_ENABLE_DEA_KW 1
#define X86_FEATURE_CAT_L3 ( 7*32+ 4) /* Cache Allocation Technology L3 */
#define X86_FEATURE_CAT_L2 ( 7*32+ 5) /* Cache Allocation Technology L2 */
#define X86_FEATURE_CDP_L3 ( 7*32+ 6) /* Code and Data Prioritization L3 */
-#define X86_FEATURE_INVPCID_SINGLE ( 7*32+ 7) /* Effectively INVPCID && CR4.PCIDE=1 */
#define X86_FEATURE_HW_PSTATE ( 7*32+ 8) /* AMD HW-PState */
#define X86_FEATURE_PROC_FEEDBACK ( 7*32+ 9) /* AMD ProcFeedbackInterface */
#define X86_FEATURE_XCOMPACTED ( 7*32+10) /* "" Use compacted XSTATE (XSAVES or XSAVEC) */
#define X86_FEATURE_MSR_TSX_CTRL (11*32+20) /* "" MSR IA32_TSX_CTRL (Intel) implemented */
#define X86_FEATURE_SMBA (11*32+21) /* "" Slow Memory Bandwidth Allocation */
#define X86_FEATURE_BMEC (11*32+22) /* "" Bandwidth Monitoring Event Configuration */
+#define X86_FEATURE_USER_SHSTK (11*32+23) /* Shadow stack support for user mode applications */
+
+#define X86_FEATURE_SRSO (11*32+24) /* "" AMD BTB untrain RETs */
+#define X86_FEATURE_SRSO_ALIAS (11*32+25) /* "" AMD BTB untrain RETs through aliasing */
+#define X86_FEATURE_IBPB_ON_VMEXIT (11*32+26) /* "" Issue an IBPB only on VMEXIT */
/* Intel-defined CPU features, CPUID level 0x00000007:1 (EAX), word 12 */
#define X86_FEATURE_AVX_VNNI (12*32+ 4) /* AVX VNNI instructions */
#define X86_FEATURE_OSPKE (16*32+ 4) /* OS Protection Keys Enable */
#define X86_FEATURE_WAITPKG (16*32+ 5) /* UMONITOR/UMWAIT/TPAUSE Instructions */
#define X86_FEATURE_AVX512_VBMI2 (16*32+ 6) /* Additional AVX512 Vector Bit Manipulation Instructions */
+#define X86_FEATURE_SHSTK (16*32+ 7) /* "" Shadow stack */
#define X86_FEATURE_GFNI (16*32+ 8) /* Galois Field New Instructions */
#define X86_FEATURE_VAES (16*32+ 9) /* Vector AES */
#define X86_FEATURE_VPCLMULQDQ (16*32+10) /* Carry-Less Multiplication Double Quadword */
/* AMD-defined Extended Feature 2 EAX, CPUID level 0x80000021 (EAX), word 20 */
#define X86_FEATURE_NO_NESTED_DATA_BP (20*32+ 0) /* "" No Nested Data Breakpoints */
+#define X86_FEATURE_WRMSR_XX_BASE_NS (20*32+ 1) /* "" WRMSR to {FS,GS,KERNEL_GS}_BASE is non-serializing */
#define X86_FEATURE_LFENCE_RDTSC (20*32+ 2) /* "" LFENCE always serializing / synchronizes RDTSC */
#define X86_FEATURE_NULL_SEL_CLR_BASE (20*32+ 6) /* "" Null Selector Clears Base */
#define X86_FEATURE_AUTOIBRS (20*32+ 8) /* "" Automatic IBRS */
#define X86_FEATURE_NO_SMM_CTL_MSR (20*32+ 9) /* "" SMM_CTL MSR is not present */
+#define X86_FEATURE_SBPB (20*32+27) /* "" Selective Branch Prediction Barrier */
+#define X86_FEATURE_IBPB_BRTYPE (20*32+28) /* "" MSR_PRED_CMD[IBPB] flushes all branch type predictions */
+#define X86_FEATURE_SRSO_NO (20*32+29) /* "" CPU is not affected by SRSO */
+
/*
* BUG word(s)
*/
#define X86_BUG_RETBLEED X86_BUG(27) /* CPU is affected by RETBleed */
#define X86_BUG_EIBRS_PBRSB X86_BUG(28) /* EIBRS is vulnerable to Post Barrier RSB Predictions */
#define X86_BUG_SMT_RSB X86_BUG(29) /* CPU is vulnerable to Cross-Thread Return Address Predictions */
+#define X86_BUG_GDS X86_BUG(30) /* CPU is affected by Gather Data Sampling */
+/* BUG word 2 */
+#define X86_BUG_SRSO X86_BUG(1*32 + 0) /* AMD SRSO bug */
+#define X86_BUG_DIV0 X86_BUG(1*32 + 1) /* AMD DIV0 speculation bug */
#endif /* _ASM_X86_CPUFEATURES_H */
# define DISABLE_TDX_GUEST (1 << (X86_FEATURE_TDX_GUEST & 31))
#endif
+#ifdef CONFIG_X86_USER_SHADOW_STACK
+#define DISABLE_USER_SHSTK 0
+#else
+#define DISABLE_USER_SHSTK (1 << (X86_FEATURE_USER_SHSTK & 31))
+#endif
+
+#ifdef CONFIG_X86_KERNEL_IBT
+#define DISABLE_IBT 0
+#else
+#define DISABLE_IBT (1 << (X86_FEATURE_IBT & 31))
+#endif
+
/*
* Make sure to add features to the correct mask
*/
#define DISABLED_MASK9 (DISABLE_SGX)
#define DISABLED_MASK10 0
#define DISABLED_MASK11 (DISABLE_RETPOLINE|DISABLE_RETHUNK|DISABLE_UNRET| \
- DISABLE_CALL_DEPTH_TRACKING)
+ DISABLE_CALL_DEPTH_TRACKING|DISABLE_USER_SHSTK)
#define DISABLED_MASK12 (DISABLE_LAM)
#define DISABLED_MASK13 0
#define DISABLED_MASK14 0
#define DISABLED_MASK16 (DISABLE_PKU|DISABLE_OSPKE|DISABLE_LA57|DISABLE_UMIP| \
DISABLE_ENQCMD)
#define DISABLED_MASK17 0
-#define DISABLED_MASK18 0
+#define DISABLED_MASK18 (DISABLE_IBT)
#define DISABLED_MASK19 0
#define DISABLED_MASK20 0
#define DISABLED_MASK_CHECK BUILD_BUG_ON_ZERO(NCAPINTS != 21)
#define MSR_INTEGRITY_CAPS_ARRAY_BIST BIT(MSR_INTEGRITY_CAPS_ARRAY_BIST_BIT)
#define MSR_INTEGRITY_CAPS_PERIODIC_BIST_BIT 4
#define MSR_INTEGRITY_CAPS_PERIODIC_BIST BIT(MSR_INTEGRITY_CAPS_PERIODIC_BIST_BIT)
+#define MSR_INTEGRITY_CAPS_SAF_GEN_MASK GENMASK_ULL(10, 9)
#define MSR_LBR_NHM_FROM 0x00000680
#define MSR_LBR_NHM_TO 0x000006c0
#define MSR_AMD64_CPUID_FN_1 0xc0011004
#define MSR_AMD64_LS_CFG 0xc0011020
#define MSR_AMD64_DC_CFG 0xc0011022
+#define MSR_AMD64_TW_CFG 0xc0011023
#define MSR_AMD64_DE_CFG 0xc0011029
#define MSR_AMD64_DE_CFG_LFENCE_SERIALIZE_BIT 1
/* AMD Last Branch Record MSRs */
#define MSR_AMD64_LBR_SELECT 0xc000010e
+/* Zen4 */
+#define MSR_ZEN4_BP_CFG 0xc001102e
+#define MSR_ZEN4_BP_CFG_SHARED_BTB_FIX_BIT 5
+
+/* Fam 19h MSRs */
+#define MSR_F19H_UMC_PERF_CTL 0xc0010800
+#define MSR_F19H_UMC_PERF_CTR 0xc0010801
+
+/* Zen 2 */
+#define MSR_ZEN2_SPECTRAL_CHICKEN 0xc00110e3
+#define MSR_ZEN2_SPECTRAL_CHICKEN_BIT BIT_ULL(1)
+
/* Fam 17h MSRs */
#define MSR_F17H_IRPERF 0xc00000e9
-#define MSR_ZEN2_SPECTRAL_CHICKEN 0xc00110e3
-#define MSR_ZEN2_SPECTRAL_CHICKEN_BIT BIT_ULL(1)
-
/* Fam 16h MSRs */
#define MSR_F16H_L2I_PERF_CTL 0xc0010230
#define MSR_F16H_L2I_PERF_CTR 0xc0010231
#define MSR_IA32_VMX_MISC_INTEL_PT (1ULL << 14)
#define MSR_IA32_VMX_MISC_VMWRITE_SHADOW_RO_FIELDS (1ULL << 29)
#define MSR_IA32_VMX_MISC_PREEMPTION_TIMER_SCALE 0x1F
-/* AMD-V MSRs */
+/* AMD-V MSRs */
#define MSR_VM_CR 0xc0010114
#define MSR_VM_IGNNE 0xc0010115
#define MSR_VM_HSAVE_PA 0xc0010117
+#define SVM_VM_CR_VALID_MASK 0x001fULL
+#define SVM_VM_CR_SVM_LOCK_MASK 0x0008ULL
+#define SVM_VM_CR_SVM_DIS_MASK 0x0010ULL
+
/* Hardware Feedback Interface */
#define MSR_IA32_HW_FEEDBACK_PTR 0x17d0
#define MSR_IA32_HW_FEEDBACK_CONFIG 0x17d1
#define ARCH_MAP_VDSO_32 0x2002
#define ARCH_MAP_VDSO_64 0x2003
+/* Don't use 0x3001-0x3004 because of old glibcs */
+
#define ARCH_GET_UNTAG_MASK 0x4001
#define ARCH_ENABLE_TAGGED_ADDR 0x4002
#define ARCH_GET_MAX_TAG_BITS 0x4003
#define ARCH_FORCE_TAGGED_SVA 0x4004
+#define ARCH_SHSTK_ENABLE 0x5001
+#define ARCH_SHSTK_DISABLE 0x5002
+#define ARCH_SHSTK_LOCK 0x5003
+#define ARCH_SHSTK_UNLOCK 0x5004
+#define ARCH_SHSTK_STATUS 0x5005
+
+/* ARCH_SHSTK_ features bits */
+#define ARCH_SHSTK_SHSTK (1ULL << 0)
+#define ARCH_SHSTK_WRSS (1ULL << 1)
+
#endif /* _ASM_X86_PRCTL_H */
if (error)
goto setval_error;
- if (new_val->addr_family == ADDR_FAMILY_IPV6) {
+ if (new_val->addr_family & ADDR_FAMILY_IPV6) {
error = fprintf(nmfile, "\n[ipv6]\n");
if (error < 0)
goto setval_error;
if (error < 0)
goto setval_error;
- error = fprintf(nmfile, "gateway=%s\n", (char *)new_val->gate_way);
- if (error < 0)
- goto setval_error;
-
- error = fprintf(nmfile, "dns=%s\n", (char *)new_val->dns_addr);
- if (error < 0)
- goto setval_error;
+ /* we do not want ipv4 addresses in ipv6 section and vice versa */
+ if (is_ipv6 != is_ipv4((char *)new_val->gate_way)) {
+ error = fprintf(nmfile, "gateway=%s\n", (char *)new_val->gate_way);
+ if (error < 0)
+ goto setval_error;
+ }
+ if (is_ipv6 != is_ipv4((char *)new_val->dns_addr)) {
+ error = fprintf(nmfile, "dns=%s\n", (char *)new_val->dns_addr);
+ if (error < 0)
+ goto setval_error;
+ }
fclose(nmfile);
fclose(ifcfg_file);
# or "manual" if no boot-time protocol should be used)
#
# address1=ipaddr1/plen
-# address=ipaddr2/plen
+# address2=ipaddr2/plen
#
# gateway=gateway1;gateway2
#
#
# [ipv6]
# address1=ipaddr1/plen
-# address2=ipaddr1/plen
+# address2=ipaddr2/plen
#
# gateway=gateway1;gateway2
#
*/
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wpacked"
+#pragma GCC diagnostic ignored "-Wattributes"
#define __get_unaligned_t(type, ptr) ({ \
const struct { type x; } __packed *__pptr = (typeof(__pptr))(ptr); \
__SYSCALL(__NR_futex_waitv, sys_futex_waitv)
#define __NR_set_mempolicy_home_node 450
__SYSCALL(__NR_set_mempolicy_home_node, sys_set_mempolicy_home_node)
-
#define __NR_cachestat 451
__SYSCALL(__NR_cachestat, sys_cachestat)
-
#define __NR_fchmodat2 452
__SYSCALL(__NR_fchmodat2, sys_fchmodat2)
+#define __NR_map_shadow_stack 453
+__SYSCALL(__NR_map_shadow_stack, sys_map_shadow_stack)
+#define __NR_futex_wake 454
+__SYSCALL(__NR_futex_wake, sys_futex_wake)
+#define __NR_futex_wait 455
+__SYSCALL(__NR_futex_wait, sys_futex_wait)
+#define __NR_futex_requeue 456
+__SYSCALL(__NR_futex_requeue, sys_futex_requeue)
#undef __NR_syscalls
-#define __NR_syscalls 453
+#define __NR_syscalls 457
/*
* 32 bit systems traditionally used different
#define DRM_IOCTL_MODE_PAGE_FLIP DRM_IOWR(0xB0, struct drm_mode_crtc_page_flip)
#define DRM_IOCTL_MODE_DIRTYFB DRM_IOWR(0xB1, struct drm_mode_fb_dirty_cmd)
+/**
+ * DRM_IOCTL_MODE_CREATE_DUMB - Create a new dumb buffer object.
+ *
+ * KMS dumb buffers provide a very primitive way to allocate a buffer object
+ * suitable for scanout and map it for software rendering. KMS dumb buffers are
+ * not suitable for hardware-accelerated rendering nor video decoding. KMS dumb
+ * buffers are not suitable to be displayed on any other device than the KMS
+ * device where they were allocated from. Also see
+ * :ref:`kms_dumb_buffer_objects`.
+ *
+ * The IOCTL argument is a struct drm_mode_create_dumb.
+ *
+ * User-space is expected to create a KMS dumb buffer via this IOCTL, then add
+ * it as a KMS framebuffer via &DRM_IOCTL_MODE_ADDFB and map it via
+ * &DRM_IOCTL_MODE_MAP_DUMB.
+ *
+ * &DRM_CAP_DUMB_BUFFER indicates whether this IOCTL is supported.
+ * &DRM_CAP_DUMB_PREFERRED_DEPTH and &DRM_CAP_DUMB_PREFER_SHADOW indicate
+ * driver preferences for dumb buffers.
+ */
#define DRM_IOCTL_MODE_CREATE_DUMB DRM_IOWR(0xB2, struct drm_mode_create_dumb)
#define DRM_IOCTL_MODE_MAP_DUMB DRM_IOWR(0xB3, struct drm_mode_map_dumb)
#define DRM_IOCTL_MODE_DESTROY_DUMB DRM_IOWR(0xB4, struct drm_mode_destroy_dumb)
*/
/**
- * DOC: uevents generated by i915 on it's device node
+ * DOC: uevents generated by i915 on its device node
*
* I915_L3_PARITY_UEVENT - Generated when the driver receives a parity mismatch
- * event from the gpu l3 cache. Additional information supplied is ROW,
+ * event from the GPU L3 cache. Additional information supplied is ROW,
* BANK, SUBBANK, SLICE of the affected cacheline. Userspace should keep
- * track of these events and if a specific cache-line seems to have a
- * persistent error remap it with the l3 remapping tool supplied in
+ * track of these events, and if a specific cache-line seems to have a
+ * persistent error, remap it with the L3 remapping tool supplied in
* intel-gpu-tools. The value supplied with the event is always 1.
*
* I915_ERROR_UEVENT - Generated upon error detection, currently only via
__u8 contents_encryption_mode;
__u8 filenames_encryption_mode;
__u8 flags;
- __u8 __reserved[4];
+ __u8 log2_data_unit_size;
+ __u8 __reserved[3];
__u8 master_key_identifier[FSCRYPT_KEY_IDENTIFIER_SIZE];
};
#define KVM_EXIT_RISCV_SBI 35
#define KVM_EXIT_RISCV_CSR 36
#define KVM_EXIT_NOTIFY 37
+#define KVM_EXIT_LOONGARCH_IOCSR 38
/* For KVM_EXIT_INTERNAL_ERROR */
/* Emulate instruction failed. */
__u32 len;
__u8 is_write;
} mmio;
+ /* KVM_EXIT_LOONGARCH_IOCSR */
+ struct {
+ __u64 phys_addr;
+ __u8 data[8];
+ __u32 len;
+ __u8 is_write;
+ } iocsr_io;
/* KVM_EXIT_HYPERCALL */
struct {
__u64 nr;
#define KVM_CAP_COUNTER_OFFSET 227
#define KVM_CAP_ARM_EAGER_SPLIT_CHUNK_SIZE 228
#define KVM_CAP_ARM_SUPPORTED_BLOCK_SIZES 229
+#define KVM_CAP_ARM_SUPPORTED_REG_MASK_RANGES 230
#ifdef KVM_CAP_IRQ_ROUTING
#define KVM_REG_ARM64 0x6000000000000000ULL
#define KVM_REG_MIPS 0x7000000000000000ULL
#define KVM_REG_RISCV 0x8000000000000000ULL
+#define KVM_REG_LOONGARCH 0x9000000000000000ULL
#define KVM_REG_SIZE_SHIFT 52
#define KVM_REG_SIZE_MASK 0x00f0000000000000ULL
__u64 addr; /* userspace address of attr data */
};
-#define KVM_DEV_VFIO_GROUP 1
-#define KVM_DEV_VFIO_GROUP_ADD 1
-#define KVM_DEV_VFIO_GROUP_DEL 2
+#define KVM_DEV_VFIO_FILE 1
+
+#define KVM_DEV_VFIO_FILE_ADD 1
+#define KVM_DEV_VFIO_FILE_DEL 2
+
+/* KVM_DEV_VFIO_GROUP aliases are for compile time uapi compatibility */
+#define KVM_DEV_VFIO_GROUP KVM_DEV_VFIO_FILE
+
+#define KVM_DEV_VFIO_GROUP_ADD KVM_DEV_VFIO_FILE_ADD
+#define KVM_DEV_VFIO_GROUP_DEL KVM_DEV_VFIO_FILE_DEL
#define KVM_DEV_VFIO_GROUP_SET_SPAPR_TCE 3
enum kvm_device_type {
#define KVM_ARM_MTE_COPY_TAGS _IOR(KVMIO, 0xb4, struct kvm_arm_copy_mte_tags)
/* Available with KVM_CAP_COUNTER_OFFSET */
#define KVM_ARM_SET_COUNTER_OFFSET _IOW(KVMIO, 0xb5, struct kvm_arm_counter_offset)
+#define KVM_ARM_GET_REG_WRITABLE_MASKS _IOR(KVMIO, 0xb6, struct reg_mask_range)
/* ioctl for vm fd */
#define KVM_CREATE_DEVICE _IOWR(KVMIO, 0xe0, struct kvm_create_device)
FSCONFIG_SET_PATH = 3, /* Set parameter, supplying an object by path */
FSCONFIG_SET_PATH_EMPTY = 4, /* Set parameter, supplying an object by (empty) path */
FSCONFIG_SET_FD = 5, /* Set parameter, supplying an object by fd */
- FSCONFIG_CMD_CREATE = 6, /* Invoke superblock creation */
+ FSCONFIG_CMD_CREATE = 6, /* Create new or reuse existing superblock */
FSCONFIG_CMD_RECONFIGURE = 7, /* Invoke superblock reconfiguration */
+ FSCONFIG_CMD_CREATE_EXCL = 8, /* Create new superblock, fail if reusing existing superblock */
};
/*
*/
#define VHOST_VDPA_RESUME _IO(VHOST_VIRTIO, 0x7E)
+/* Get the group for the descriptor table including driver & device areas
+ * of a virtqueue: read index, write group in num.
+ * The virtqueue index is stored in the index field of vhost_vring_state.
+ * The group ID of the descriptor table for this specific virtqueue
+ * is returned via num field of vhost_vring_state.
+ */
+#define VHOST_VDPA_GET_VRING_DESC_GROUP _IOWR(VHOST_VIRTIO, 0x7F, \
+ struct vhost_vring_state)
#endif
CFLAGS_ethtool:=$(call get_hdr_inc,_LINUX_ETHTOOL_NETLINK_H_,ethtool_netlink.h)
CFLAGS_handshake:=$(call get_hdr_inc,_LINUX_HANDSHAKE_H,handshake.h)
CFLAGS_netdev:=$(call get_hdr_inc,_LINUX_NETDEV_H,netdev.h)
-CFLAGS_nfsd:=$(call get_hdr_inc,_LINUX_NFSD_H,nfsd.h)
+CFLAGS_nfsd:=$(call get_hdr_inc,_LINUX_NFSD_NETLINK_H,nfsd_netlink.h)
/* Enums */
static const char * const devlink_op_strmap[] = {
[3] = "get",
- [7] = "port-get",
+ // skip "port-get", duplicate reply value
[DEVLINK_CMD_PORT_NEW] = "port-new",
[13] = "sb-get",
[17] = "sb-pool-get",
int devlink_port_set(struct ynl_sock *ys, struct devlink_port_set_req *req)
{
+ struct ynl_req_state yrs = { .yarg = { .ys = ys, }, };
struct nlmsghdr *nlh;
int err;
if (req->_present.port_function)
devlink_dl_port_function_put(nlh, DEVLINK_ATTR_PORT_FUNCTION, &req->port_function);
- err = ynl_exec(ys, nlh, NULL);
+ err = ynl_exec(ys, nlh, &yrs);
if (err < 0)
return -1;
int devlink_port_del(struct ynl_sock *ys, struct devlink_port_del_req *req)
{
+ struct ynl_req_state yrs = { .yarg = { .ys = ys, }, };
struct nlmsghdr *nlh;
int err;
if (req->_present.port_index)
mnl_attr_put_u32(nlh, DEVLINK_ATTR_PORT_INDEX, req->port_index);
- err = ynl_exec(ys, nlh, NULL);
+ err = ynl_exec(ys, nlh, &yrs);
if (err < 0)
return -1;
int devlink_port_split(struct ynl_sock *ys, struct devlink_port_split_req *req)
{
+ struct ynl_req_state yrs = { .yarg = { .ys = ys, }, };
struct nlmsghdr *nlh;
int err;
if (req->_present.port_split_count)
mnl_attr_put_u32(nlh, DEVLINK_ATTR_PORT_SPLIT_COUNT, req->port_split_count);
- err = ynl_exec(ys, nlh, NULL);
+ err = ynl_exec(ys, nlh, &yrs);
if (err < 0)
return -1;
int devlink_port_unsplit(struct ynl_sock *ys,
struct devlink_port_unsplit_req *req)
{
+ struct ynl_req_state yrs = { .yarg = { .ys = ys, }, };
struct nlmsghdr *nlh;
int err;
if (req->_present.port_index)
mnl_attr_put_u32(nlh, DEVLINK_ATTR_PORT_INDEX, req->port_index);
- err = ynl_exec(ys, nlh, NULL);
+ err = ynl_exec(ys, nlh, &yrs);
if (err < 0)
return -1;
int devlink_sb_pool_set(struct ynl_sock *ys,
struct devlink_sb_pool_set_req *req)
{
+ struct ynl_req_state yrs = { .yarg = { .ys = ys, }, };
struct nlmsghdr *nlh;
int err;
if (req->_present.sb_pool_size)
mnl_attr_put_u32(nlh, DEVLINK_ATTR_SB_POOL_SIZE, req->sb_pool_size);
- err = ynl_exec(ys, nlh, NULL);
+ err = ynl_exec(ys, nlh, &yrs);
if (err < 0)
return -1;
int devlink_sb_port_pool_set(struct ynl_sock *ys,
struct devlink_sb_port_pool_set_req *req)
{
+ struct ynl_req_state yrs = { .yarg = { .ys = ys, }, };
struct nlmsghdr *nlh;
int err;
if (req->_present.sb_threshold)
mnl_attr_put_u32(nlh, DEVLINK_ATTR_SB_THRESHOLD, req->sb_threshold);
- err = ynl_exec(ys, nlh, NULL);
+ err = ynl_exec(ys, nlh, &yrs);
if (err < 0)
return -1;
int devlink_sb_tc_pool_bind_set(struct ynl_sock *ys,
struct devlink_sb_tc_pool_bind_set_req *req)
{
+ struct ynl_req_state yrs = { .yarg = { .ys = ys, }, };
struct nlmsghdr *nlh;
int err;
if (req->_present.sb_threshold)
mnl_attr_put_u32(nlh, DEVLINK_ATTR_SB_THRESHOLD, req->sb_threshold);
- err = ynl_exec(ys, nlh, NULL);
+ err = ynl_exec(ys, nlh, &yrs);
if (err < 0)
return -1;
int devlink_sb_occ_snapshot(struct ynl_sock *ys,
struct devlink_sb_occ_snapshot_req *req)
{
+ struct ynl_req_state yrs = { .yarg = { .ys = ys, }, };
struct nlmsghdr *nlh;
int err;
if (req->_present.sb_index)
mnl_attr_put_u32(nlh, DEVLINK_ATTR_SB_INDEX, req->sb_index);
- err = ynl_exec(ys, nlh, NULL);
+ err = ynl_exec(ys, nlh, &yrs);
if (err < 0)
return -1;
int devlink_sb_occ_max_clear(struct ynl_sock *ys,
struct devlink_sb_occ_max_clear_req *req)
{
+ struct ynl_req_state yrs = { .yarg = { .ys = ys, }, };
struct nlmsghdr *nlh;
int err;
if (req->_present.sb_index)
mnl_attr_put_u32(nlh, DEVLINK_ATTR_SB_INDEX, req->sb_index);
- err = ynl_exec(ys, nlh, NULL);
+ err = ynl_exec(ys, nlh, &yrs);
if (err < 0)
return -1;
int devlink_eswitch_set(struct ynl_sock *ys,
struct devlink_eswitch_set_req *req)
{
+ struct ynl_req_state yrs = { .yarg = { .ys = ys, }, };
struct nlmsghdr *nlh;
int err;
if (req->_present.eswitch_encap_mode)
mnl_attr_put_u8(nlh, DEVLINK_ATTR_ESWITCH_ENCAP_MODE, req->eswitch_encap_mode);
- err = ynl_exec(ys, nlh, NULL);
+ err = ynl_exec(ys, nlh, &yrs);
if (err < 0)
return -1;
int devlink_dpipe_table_counters_set(struct ynl_sock *ys,
struct devlink_dpipe_table_counters_set_req *req)
{
+ struct ynl_req_state yrs = { .yarg = { .ys = ys, }, };
struct nlmsghdr *nlh;
int err;
if (req->_present.dpipe_table_counters_enabled)
mnl_attr_put_u8(nlh, DEVLINK_ATTR_DPIPE_TABLE_COUNTERS_ENABLED, req->dpipe_table_counters_enabled);
- err = ynl_exec(ys, nlh, NULL);
+ err = ynl_exec(ys, nlh, &yrs);
if (err < 0)
return -1;
int devlink_resource_set(struct ynl_sock *ys,
struct devlink_resource_set_req *req)
{
+ struct ynl_req_state yrs = { .yarg = { .ys = ys, }, };
struct nlmsghdr *nlh;
int err;
if (req->_present.resource_size)
mnl_attr_put_u64(nlh, DEVLINK_ATTR_RESOURCE_SIZE, req->resource_size);
- err = ynl_exec(ys, nlh, NULL);
+ err = ynl_exec(ys, nlh, &yrs);
if (err < 0)
return -1;
int devlink_param_set(struct ynl_sock *ys, struct devlink_param_set_req *req)
{
+ struct ynl_req_state yrs = { .yarg = { .ys = ys, }, };
struct nlmsghdr *nlh;
int err;
if (req->_present.param_value_cmode)
mnl_attr_put_u8(nlh, DEVLINK_ATTR_PARAM_VALUE_CMODE, req->param_value_cmode);
- err = ynl_exec(ys, nlh, NULL);
+ err = ynl_exec(ys, nlh, &yrs);
if (err < 0)
return -1;
int devlink_region_del(struct ynl_sock *ys, struct devlink_region_del_req *req)
{
+ struct ynl_req_state yrs = { .yarg = { .ys = ys, }, };
struct nlmsghdr *nlh;
int err;
if (req->_present.region_snapshot_id)
mnl_attr_put_u32(nlh, DEVLINK_ATTR_REGION_SNAPSHOT_ID, req->region_snapshot_id);
- err = ynl_exec(ys, nlh, NULL);
+ err = ynl_exec(ys, nlh, &yrs);
if (err < 0)
return -1;
int devlink_port_param_set(struct ynl_sock *ys,
struct devlink_port_param_set_req *req)
{
+ struct ynl_req_state yrs = { .yarg = { .ys = ys, }, };
struct nlmsghdr *nlh;
int err;
if (req->_present.port_index)
mnl_attr_put_u32(nlh, DEVLINK_ATTR_PORT_INDEX, req->port_index);
- err = ynl_exec(ys, nlh, NULL);
+ err = ynl_exec(ys, nlh, &yrs);
if (err < 0)
return -1;
int devlink_health_reporter_set(struct ynl_sock *ys,
struct devlink_health_reporter_set_req *req)
{
+ struct ynl_req_state yrs = { .yarg = { .ys = ys, }, };
struct nlmsghdr *nlh;
int err;
if (req->_present.health_reporter_auto_dump)
mnl_attr_put_u8(nlh, DEVLINK_ATTR_HEALTH_REPORTER_AUTO_DUMP, req->health_reporter_auto_dump);
- err = ynl_exec(ys, nlh, NULL);
+ err = ynl_exec(ys, nlh, &yrs);
if (err < 0)
return -1;
int devlink_health_reporter_recover(struct ynl_sock *ys,
struct devlink_health_reporter_recover_req *req)
{
+ struct ynl_req_state yrs = { .yarg = { .ys = ys, }, };
struct nlmsghdr *nlh;
int err;
if (req->_present.health_reporter_name_len)
mnl_attr_put_strz(nlh, DEVLINK_ATTR_HEALTH_REPORTER_NAME, req->health_reporter_name);
- err = ynl_exec(ys, nlh, NULL);
+ err = ynl_exec(ys, nlh, &yrs);
if (err < 0)
return -1;
int devlink_health_reporter_diagnose(struct ynl_sock *ys,
struct devlink_health_reporter_diagnose_req *req)
{
+ struct ynl_req_state yrs = { .yarg = { .ys = ys, }, };
struct nlmsghdr *nlh;
int err;
if (req->_present.health_reporter_name_len)
mnl_attr_put_strz(nlh, DEVLINK_ATTR_HEALTH_REPORTER_NAME, req->health_reporter_name);
- err = ynl_exec(ys, nlh, NULL);
+ err = ynl_exec(ys, nlh, &yrs);
if (err < 0)
return -1;
int devlink_health_reporter_dump_clear(struct ynl_sock *ys,
struct devlink_health_reporter_dump_clear_req *req)
{
+ struct ynl_req_state yrs = { .yarg = { .ys = ys, }, };
struct nlmsghdr *nlh;
int err;
if (req->_present.health_reporter_name_len)
mnl_attr_put_strz(nlh, DEVLINK_ATTR_HEALTH_REPORTER_NAME, req->health_reporter_name);
- err = ynl_exec(ys, nlh, NULL);
+ err = ynl_exec(ys, nlh, &yrs);
if (err < 0)
return -1;
int devlink_flash_update(struct ynl_sock *ys,
struct devlink_flash_update_req *req)
{
+ struct ynl_req_state yrs = { .yarg = { .ys = ys, }, };
struct nlmsghdr *nlh;
int err;
if (req->_present.flash_update_overwrite_mask)
mnl_attr_put(nlh, DEVLINK_ATTR_FLASH_UPDATE_OVERWRITE_MASK, sizeof(struct nla_bitfield32), &req->flash_update_overwrite_mask);
- err = ynl_exec(ys, nlh, NULL);
+ err = ynl_exec(ys, nlh, &yrs);
if (err < 0)
return -1;
int devlink_trap_set(struct ynl_sock *ys, struct devlink_trap_set_req *req)
{
+ struct ynl_req_state yrs = { .yarg = { .ys = ys, }, };
struct nlmsghdr *nlh;
int err;
if (req->_present.trap_action)
mnl_attr_put_u8(nlh, DEVLINK_ATTR_TRAP_ACTION, req->trap_action);
- err = ynl_exec(ys, nlh, NULL);
+ err = ynl_exec(ys, nlh, &yrs);
if (err < 0)
return -1;
int devlink_trap_group_set(struct ynl_sock *ys,
struct devlink_trap_group_set_req *req)
{
+ struct ynl_req_state yrs = { .yarg = { .ys = ys, }, };
struct nlmsghdr *nlh;
int err;
if (req->_present.trap_policer_id)
mnl_attr_put_u32(nlh, DEVLINK_ATTR_TRAP_POLICER_ID, req->trap_policer_id);
- err = ynl_exec(ys, nlh, NULL);
+ err = ynl_exec(ys, nlh, &yrs);
if (err < 0)
return -1;
int devlink_trap_policer_set(struct ynl_sock *ys,
struct devlink_trap_policer_set_req *req)
{
+ struct ynl_req_state yrs = { .yarg = { .ys = ys, }, };
struct nlmsghdr *nlh;
int err;
if (req->_present.trap_policer_burst)
mnl_attr_put_u64(nlh, DEVLINK_ATTR_TRAP_POLICER_BURST, req->trap_policer_burst);
- err = ynl_exec(ys, nlh, NULL);
+ err = ynl_exec(ys, nlh, &yrs);
if (err < 0)
return -1;
int devlink_health_reporter_test(struct ynl_sock *ys,
struct devlink_health_reporter_test_req *req)
{
+ struct ynl_req_state yrs = { .yarg = { .ys = ys, }, };
struct nlmsghdr *nlh;
int err;
if (req->_present.health_reporter_name_len)
mnl_attr_put_strz(nlh, DEVLINK_ATTR_HEALTH_REPORTER_NAME, req->health_reporter_name);
- err = ynl_exec(ys, nlh, NULL);
+ err = ynl_exec(ys, nlh, &yrs);
if (err < 0)
return -1;
int devlink_rate_set(struct ynl_sock *ys, struct devlink_rate_set_req *req)
{
+ struct ynl_req_state yrs = { .yarg = { .ys = ys, }, };
struct nlmsghdr *nlh;
int err;
if (req->_present.rate_parent_node_name_len)
mnl_attr_put_strz(nlh, DEVLINK_ATTR_RATE_PARENT_NODE_NAME, req->rate_parent_node_name);
- err = ynl_exec(ys, nlh, NULL);
+ err = ynl_exec(ys, nlh, &yrs);
if (err < 0)
return -1;
int devlink_rate_new(struct ynl_sock *ys, struct devlink_rate_new_req *req)
{
+ struct ynl_req_state yrs = { .yarg = { .ys = ys, }, };
struct nlmsghdr *nlh;
int err;
if (req->_present.rate_parent_node_name_len)
mnl_attr_put_strz(nlh, DEVLINK_ATTR_RATE_PARENT_NODE_NAME, req->rate_parent_node_name);
- err = ynl_exec(ys, nlh, NULL);
+ err = ynl_exec(ys, nlh, &yrs);
if (err < 0)
return -1;
int devlink_rate_del(struct ynl_sock *ys, struct devlink_rate_del_req *req)
{
+ struct ynl_req_state yrs = { .yarg = { .ys = ys, }, };
struct nlmsghdr *nlh;
int err;
if (req->_present.rate_node_name_len)
mnl_attr_put_strz(nlh, DEVLINK_ATTR_RATE_NODE_NAME, req->rate_node_name);
- err = ynl_exec(ys, nlh, NULL);
+ err = ynl_exec(ys, nlh, &yrs);
if (err < 0)
return -1;
int devlink_linecard_set(struct ynl_sock *ys,
struct devlink_linecard_set_req *req)
{
+ struct ynl_req_state yrs = { .yarg = { .ys = ys, }, };
struct nlmsghdr *nlh;
int err;
if (req->_present.linecard_type_len)
mnl_attr_put_strz(nlh, DEVLINK_ATTR_LINECARD_TYPE, req->linecard_type);
- err = ynl_exec(ys, nlh, NULL);
+ err = ynl_exec(ys, nlh, &yrs);
if (err < 0)
return -1;
int devlink_selftests_run(struct ynl_sock *ys,
struct devlink_selftests_run_req *req)
{
+ struct ynl_req_state yrs = { .yarg = { .ys = ys, }, };
struct nlmsghdr *nlh;
int err;
if (req->_present.selftests)
devlink_dl_selftest_id_put(nlh, DEVLINK_ATTR_SELFTESTS, &req->selftests);
- err = ynl_exec(ys, nlh, NULL);
+ err = ynl_exec(ys, nlh, &yrs);
if (err < 0)
return -1;
int ethtool_linkinfo_set(struct ynl_sock *ys,
struct ethtool_linkinfo_set_req *req)
{
+ struct ynl_req_state yrs = { .yarg = { .ys = ys, }, };
struct nlmsghdr *nlh;
int err;
if (req->_present.transceiver)
mnl_attr_put_u8(nlh, ETHTOOL_A_LINKINFO_TRANSCEIVER, req->transceiver);
- err = ynl_exec(ys, nlh, NULL);
+ err = ynl_exec(ys, nlh, &yrs);
if (err < 0)
return -1;
int ethtool_linkmodes_set(struct ynl_sock *ys,
struct ethtool_linkmodes_set_req *req)
{
+ struct ynl_req_state yrs = { .yarg = { .ys = ys, }, };
struct nlmsghdr *nlh;
int err;
if (req->_present.rate_matching)
mnl_attr_put_u8(nlh, ETHTOOL_A_LINKMODES_RATE_MATCHING, req->rate_matching);
- err = ynl_exec(ys, nlh, NULL);
+ err = ynl_exec(ys, nlh, &yrs);
if (err < 0)
return -1;
int ethtool_debug_set(struct ynl_sock *ys, struct ethtool_debug_set_req *req)
{
+ struct ynl_req_state yrs = { .yarg = { .ys = ys, }, };
struct nlmsghdr *nlh;
int err;
if (req->_present.msgmask)
ethtool_bitset_put(nlh, ETHTOOL_A_DEBUG_MSGMASK, &req->msgmask);
- err = ynl_exec(ys, nlh, NULL);
+ err = ynl_exec(ys, nlh, &yrs);
if (err < 0)
return -1;
int ethtool_wol_set(struct ynl_sock *ys, struct ethtool_wol_set_req *req)
{
+ struct ynl_req_state yrs = { .yarg = { .ys = ys, }, };
struct nlmsghdr *nlh;
int err;
if (req->_present.sopass_len)
mnl_attr_put(nlh, ETHTOOL_A_WOL_SOPASS, req->_present.sopass_len, req->sopass);
- err = ynl_exec(ys, nlh, NULL);
+ err = ynl_exec(ys, nlh, &yrs);
if (err < 0)
return -1;
int ethtool_privflags_set(struct ynl_sock *ys,
struct ethtool_privflags_set_req *req)
{
+ struct ynl_req_state yrs = { .yarg = { .ys = ys, }, };
struct nlmsghdr *nlh;
int err;
if (req->_present.flags)
ethtool_bitset_put(nlh, ETHTOOL_A_PRIVFLAGS_FLAGS, &req->flags);
- err = ynl_exec(ys, nlh, NULL);
+ err = ynl_exec(ys, nlh, &yrs);
if (err < 0)
return -1;
int ethtool_rings_set(struct ynl_sock *ys, struct ethtool_rings_set_req *req)
{
+ struct ynl_req_state yrs = { .yarg = { .ys = ys, }, };
struct nlmsghdr *nlh;
int err;
if (req->_present.tx_push_buf_len_max)
mnl_attr_put_u32(nlh, ETHTOOL_A_RINGS_TX_PUSH_BUF_LEN_MAX, req->tx_push_buf_len_max);
- err = ynl_exec(ys, nlh, NULL);
+ err = ynl_exec(ys, nlh, &yrs);
if (err < 0)
return -1;
int ethtool_channels_set(struct ynl_sock *ys,
struct ethtool_channels_set_req *req)
{
+ struct ynl_req_state yrs = { .yarg = { .ys = ys, }, };
struct nlmsghdr *nlh;
int err;
if (req->_present.combined_count)
mnl_attr_put_u32(nlh, ETHTOOL_A_CHANNELS_COMBINED_COUNT, req->combined_count);
- err = ynl_exec(ys, nlh, NULL);
+ err = ynl_exec(ys, nlh, &yrs);
if (err < 0)
return -1;
int ethtool_coalesce_set(struct ynl_sock *ys,
struct ethtool_coalesce_set_req *req)
{
+ struct ynl_req_state yrs = { .yarg = { .ys = ys, }, };
struct nlmsghdr *nlh;
int err;
if (req->_present.tx_aggr_time_usecs)
mnl_attr_put_u32(nlh, ETHTOOL_A_COALESCE_TX_AGGR_TIME_USECS, req->tx_aggr_time_usecs);
- err = ynl_exec(ys, nlh, NULL);
+ err = ynl_exec(ys, nlh, &yrs);
if (err < 0)
return -1;
int ethtool_pause_set(struct ynl_sock *ys, struct ethtool_pause_set_req *req)
{
+ struct ynl_req_state yrs = { .yarg = { .ys = ys, }, };
struct nlmsghdr *nlh;
int err;
if (req->_present.stats_src)
mnl_attr_put_u32(nlh, ETHTOOL_A_PAUSE_STATS_SRC, req->stats_src);
- err = ynl_exec(ys, nlh, NULL);
+ err = ynl_exec(ys, nlh, &yrs);
if (err < 0)
return -1;
int ethtool_eee_set(struct ynl_sock *ys, struct ethtool_eee_set_req *req)
{
+ struct ynl_req_state yrs = { .yarg = { .ys = ys, }, };
struct nlmsghdr *nlh;
int err;
if (req->_present.tx_lpi_timer)
mnl_attr_put_u32(nlh, ETHTOOL_A_EEE_TX_LPI_TIMER, req->tx_lpi_timer);
- err = ynl_exec(ys, nlh, NULL);
+ err = ynl_exec(ys, nlh, &yrs);
if (err < 0)
return -1;
int ethtool_cable_test_act(struct ynl_sock *ys,
struct ethtool_cable_test_act_req *req)
{
+ struct ynl_req_state yrs = { .yarg = { .ys = ys, }, };
struct nlmsghdr *nlh;
int err;
if (req->_present.header)
ethtool_header_put(nlh, ETHTOOL_A_CABLE_TEST_HEADER, &req->header);
- err = ynl_exec(ys, nlh, NULL);
+ err = ynl_exec(ys, nlh, &yrs);
if (err < 0)
return -1;
int ethtool_cable_test_tdr_act(struct ynl_sock *ys,
struct ethtool_cable_test_tdr_act_req *req)
{
+ struct ynl_req_state yrs = { .yarg = { .ys = ys, }, };
struct nlmsghdr *nlh;
int err;
if (req->_present.header)
ethtool_header_put(nlh, ETHTOOL_A_CABLE_TEST_TDR_HEADER, &req->header);
- err = ynl_exec(ys, nlh, NULL);
+ err = ynl_exec(ys, nlh, &yrs);
if (err < 0)
return -1;
int ethtool_fec_set(struct ynl_sock *ys, struct ethtool_fec_set_req *req)
{
+ struct ynl_req_state yrs = { .yarg = { .ys = ys, }, };
struct nlmsghdr *nlh;
int err;
if (req->_present.stats)
ethtool_fec_stat_put(nlh, ETHTOOL_A_FEC_STATS, &req->stats);
- err = ynl_exec(ys, nlh, NULL);
+ err = ynl_exec(ys, nlh, &yrs);
if (err < 0)
return -1;
int ethtool_module_set(struct ynl_sock *ys, struct ethtool_module_set_req *req)
{
+ struct ynl_req_state yrs = { .yarg = { .ys = ys, }, };
struct nlmsghdr *nlh;
int err;
if (req->_present.power_mode)
mnl_attr_put_u8(nlh, ETHTOOL_A_MODULE_POWER_MODE, req->power_mode);
- err = ynl_exec(ys, nlh, NULL);
+ err = ynl_exec(ys, nlh, &yrs);
if (err < 0)
return -1;
int ethtool_pse_set(struct ynl_sock *ys, struct ethtool_pse_set_req *req)
{
+ struct ynl_req_state yrs = { .yarg = { .ys = ys, }, };
struct nlmsghdr *nlh;
int err;
if (req->_present.pw_d_status)
mnl_attr_put_u32(nlh, ETHTOOL_A_PODL_PSE_PW_D_STATUS, req->pw_d_status);
- err = ynl_exec(ys, nlh, NULL);
+ err = ynl_exec(ys, nlh, &yrs);
if (err < 0)
return -1;
int ethtool_plca_set_cfg(struct ynl_sock *ys,
struct ethtool_plca_set_cfg_req *req)
{
+ struct ynl_req_state yrs = { .yarg = { .ys = ys, }, };
struct nlmsghdr *nlh;
int err;
if (req->_present.burst_tmr)
mnl_attr_put_u32(nlh, ETHTOOL_A_PLCA_BURST_TMR, req->burst_tmr);
- err = ynl_exec(ys, nlh, NULL);
+ err = ynl_exec(ys, nlh, &yrs);
if (err < 0)
return -1;
int ethtool_mm_set(struct ynl_sock *ys, struct ethtool_mm_set_req *req)
{
+ struct ynl_req_state yrs = { .yarg = { .ys = ys, }, };
struct nlmsghdr *nlh;
int err;
if (req->_present.tx_min_frag_size)
mnl_attr_put_u32(nlh, ETHTOOL_A_MM_TX_MIN_FRAG_SIZE, req->tx_min_frag_size);
- err = ynl_exec(ys, nlh, NULL);
+ err = ynl_exec(ys, nlh, &yrs);
if (err < 0)
return -1;
int fou_add(struct ynl_sock *ys, struct fou_add_req *req)
{
+ struct ynl_req_state yrs = { .yarg = { .ys = ys, }, };
struct nlmsghdr *nlh;
int err;
if (req->_present.ifindex)
mnl_attr_put_u32(nlh, FOU_ATTR_IFINDEX, req->ifindex);
- err = ynl_exec(ys, nlh, NULL);
+ err = ynl_exec(ys, nlh, &yrs);
if (err < 0)
return -1;
int fou_del(struct ynl_sock *ys, struct fou_del_req *req)
{
+ struct ynl_req_state yrs = { .yarg = { .ys = ys, }, };
struct nlmsghdr *nlh;
int err;
if (req->_present.peer_v6_len)
mnl_attr_put(nlh, FOU_ATTR_PEER_V6, req->_present.peer_v6_len, req->peer_v6);
- err = ynl_exec(ys, nlh, NULL);
+ err = ynl_exec(ys, nlh, &yrs);
if (err < 0)
return -1;
int handshake_done(struct ynl_sock *ys, struct handshake_done_req *req)
{
+ struct ynl_req_state yrs = { .yarg = { .ys = ys, }, };
struct nlmsghdr *nlh;
int err;
for (unsigned int i = 0; i < req->n_remote_auth; i++)
mnl_attr_put_u32(nlh, HANDSHAKE_A_DONE_REMOTE_AUTH, req->remote_auth[i]);
- err = ynl_exec(ys, nlh, NULL);
+ err = ynl_exec(ys, nlh, &yrs);
if (err < 0)
return -1;
cw.block_start(line=f"static const char * const {map_name}[] =")
for op_name, op in family.msgs.items():
if op.rsp_value:
+ # Make sure we don't add duplicated entries, if multiple commands
+ # produce the same response in legacy families.
+ if family.rsp_by_value[op.rsp_value] != op:
+ cw.p(f'// skip "{op_name}", duplicate reply value')
+ continue
+
if op.req_value == op.rsp_value:
cw.p(f'[{op.enum_name}] = "{op_name}",')
else:
ret_ok = '0'
ret_err = '-1'
direction = "request"
- local_vars = ['struct nlmsghdr *nlh;',
+ local_vars = ['struct ynl_req_state yrs = { .yarg = { .ys = ys, }, };',
+ 'struct nlmsghdr *nlh;',
'int err;']
if 'reply' in ri.op[ri.op_mode]:
ret_ok = 'rsp'
ret_err = 'NULL'
- local_vars += [f'{type_name(ri, rdir(direction))} *rsp;',
- 'struct ynl_req_state yrs = { .yarg = { .ys = ys, }, };']
+ local_vars += [f'{type_name(ri, rdir(direction))} *rsp;']
print_prototype(ri, direction, terminate=False)
ri.cw.block_start()
attr.attr_put(ri, "req")
ri.cw.nl()
- parse_arg = "NULL"
if 'reply' in ri.op[ri.op_mode]:
ri.cw.p('rsp = calloc(1, sizeof(*rsp));')
ri.cw.p('yrs.yarg.data = rsp;')
else:
ri.cw.p(f'yrs.rsp_cmd = {ri.op.rsp_value};')
ri.cw.nl()
- parse_arg = '&yrs'
- ri.cw.p(f"err = ynl_exec(ys, nlh, {parse_arg});")
+ ri.cw.p("err = ynl_exec(ys, nlh, &yrs);")
ri.cw.p('if (err < 0)')
if 'reply' in ri.op[ri.op_mode]:
ri.cw.p('goto err_free;')
*
* Yes, this is unfortunate. A better solution is in the works.
*/
-NORETURN(__invalid_creds)
NORETURN(__kunit_abort)
NORETURN(__module_put_and_kthread_exit)
NORETURN(__reiserfs_panic)
+arch/arm64/tools/gen-sysreg.awk
+arch/arm64/tools/sysreg
tools/perf
tools/arch
tools/scripts
SHELL = $(SHELL_PATH)
+arm64_gen_sysreg_dir := $(srctree)/tools/arch/arm64/tools
+ifneq ($(OUTPUT),)
+ arm64_gen_sysreg_outdir := $(OUTPUT)
+else
+ arm64_gen_sysreg_outdir := $(CURDIR)
+endif
+
+arm64-sysreg-defs: FORCE
+ $(Q)$(MAKE) -C $(arm64_gen_sysreg_dir) O=$(arm64_gen_sysreg_outdir)
+
+arm64-sysreg-defs-clean:
+ $(call QUIET_CLEAN,arm64-sysreg-defs)
+ $(Q)$(MAKE) -C $(arm64_gen_sysreg_dir) O=$(arm64_gen_sysreg_outdir) \
+ clean > /dev/null
+
beauty_linux_dir := $(srctree)/tools/perf/trace/beauty/include/linux/
linux_uapi_dir := $(srctree)/tools/include/uapi/linux
asm_generic_uapi_dir := $(srctree)/tools/include/uapi/asm-generic
# Create output directory if not already present
_dummy := $(shell [ -d '$(beauty_ioctl_outdir)' ] || mkdir -p '$(beauty_ioctl_outdir)')
-arm64_gen_sysreg_dir := $(srctree)/tools/arch/arm64/tools
-
-arm64-sysreg-defs: FORCE
- $(Q)$(MAKE) -C $(arm64_gen_sysreg_dir)
-
-arm64-sysreg-defs-clean:
- $(call QUIET_CLEAN,arm64-sysreg-defs)
- $(Q)$(MAKE) -C $(arm64_gen_sysreg_dir) clean > /dev/null
-
$(drm_ioctl_array): $(drm_hdr_dir)/drm.h $(drm_hdr_dir)/i915_drm.h $(drm_ioctl_tbl)
$(Q)$(SHELL) '$(drm_ioctl_tbl)' $(drm_hdr_dir) > $@
450 common set_mempolicy_home_node sys_set_mempolicy_home_node
451 n64 cachestat sys_cachestat
452 n64 fchmodat2 sys_fchmodat2
+453 n64 map_shadow_stack sys_map_shadow_stack
+454 n64 futex_wake sys_futex_wake
+455 n64 futex_wait sys_futex_wait
+456 n64 futex_requeue sys_futex_requeue
450 nospu set_mempolicy_home_node sys_set_mempolicy_home_node
451 common cachestat sys_cachestat
452 common fchmodat2 sys_fchmodat2
+453 common map_shadow_stack sys_ni_syscall
+454 common futex_wake sys_futex_wake
+455 common futex_wait sys_futex_wait
+456 common futex_requeue sys_futex_requeue
450 common set_mempolicy_home_node sys_set_mempolicy_home_node sys_set_mempolicy_home_node
451 common cachestat sys_cachestat sys_cachestat
452 common fchmodat2 sys_fchmodat2 sys_fchmodat2
+453 common map_shadow_stack sys_map_shadow_stack sys_map_shadow_stack
+454 common futex_wake sys_futex_wake sys_futex_wake
+455 common futex_wait sys_futex_wait sys_futex_wait
+456 common futex_requeue sys_futex_requeue sys_futex_requeue
451 common cachestat sys_cachestat
452 common fchmodat2 sys_fchmodat2
453 64 map_shadow_stack sys_map_shadow_stack
+454 common futex_wake sys_futex_wake
+455 common futex_wait sys_futex_wait
+456 common futex_requeue sys_futex_requeue
#
# Due to a historical design error, certain syscalls are numbered differently
/*
* pid
*/
- ret += printf(" %*ld ", PRINT_PID_WIDTH, work->id);
+ ret += printf(" %*" PRIu64 " ", PRINT_PID_WIDTH, work->id);
/*
* tgid
strbuf_release(&buf);
}
+static bool json_skip_duplicate_pmus(void *ps __maybe_unused)
+{
+ return false;
+}
+
static bool default_skip_duplicate_pmus(void *ps)
{
struct print_state *print_state = ps;
.print_end = json_print_end,
.print_event = json_print_event,
.print_metric = json_print_metric,
+ .skip_duplicate_pmus = json_skip_duplicate_pmus,
};
ps = &json_ps;
} else {
"MetricName": "slots_lost_misspeculation_fraction",
"MetricExpr": "100 * ((OP_SPEC - OP_RETIRED) / (CPU_CYCLES * #slots))",
"BriefDescription": "Fraction of slots lost due to misspeculation",
+ "DefaultMetricgroupName": "TopdownL1",
"MetricGroup": "Default;TopdownL1",
"ScaleUnit": "1percent of slots"
},
"MetricName": "retired_fraction",
"MetricExpr": "100 * (OP_RETIRED / (CPU_CYCLES * #slots))",
"BriefDescription": "Fraction of slots retiring, useful work",
+ "DefaultMetricgroupName": "TopdownL1",
"MetricGroup": "Default;TopdownL1",
"ScaleUnit": "1percent of slots"
},
#define SOL_MPTCP 284
#define SOL_MCTP 285
#define SOL_SMC 286
+#define SOL_VSOCK 287
/* IPX options */
#define IPX_TYPE 1
CFLAGS_libstring.o += -Wno-unused-parameter -DETC_PERFCONFIG="BUILD_STR($(ETC_PERFCONFIG_SQ))"
CFLAGS_hweight.o += -Wno-unused-parameter -DETC_PERFCONFIG="BUILD_STR($(ETC_PERFCONFIG_SQ))"
CFLAGS_header.o += -include $(OUTPUT)PERF-VERSION-FILE
-CFLAGS_arm-spe.o += -I$(srctree)/tools/arch/arm64/include/ -I$(srctree)/tools/arch/arm64/include/generated/
+CFLAGS_arm-spe.o += -I$(srctree)/tools/arch/arm64/include/ -I$(OUTPUT)arch/arm64/include/generated/
$(OUTPUT)util/argv_split.o: ../lib/argv_split.c FORCE
$(call rule_mkdir)
#include <linux/zalloc.h>
#include <linux/string.h>
#include <bpf/bpf.h>
+#include <inttypes.h>
#include "bpf_skel/lock_contention.skel.h"
#include "bpf_skel/lock_data.h"
if (cgrp)
return cgrp->name;
- snprintf(name_buf, sizeof(name_buf), "cgroup:%lu", cgrp_id);
+ snprintf(name_buf, sizeof(name_buf), "cgroup:%" PRIu64 "", cgrp_id);
return name_buf;
}
m->pmu = pm->pmu ?: "cpu";
m->metric_name = pm->metric_name;
- m->default_metricgroup_name = pm->default_metricgroup_name;
+ m->default_metricgroup_name = pm->default_metricgroup_name ?: "";
m->modifier = NULL;
if (modifier) {
m->modifier = strdup(modifier);
elif(re.match('Enabling non-boot CPUs .*', msg)):
# start of first cpu resume
cpu_start = ktime
- elif(re.match('smpboot: CPU (?P<cpu>[0-9]*) is now offline', msg)) \
+ elif(re.match('smpboot: CPU (?P<cpu>[0-9]*) is now offline', msg) \
or re.match('psci: CPU(?P<cpu>[0-9]*) killed.*', msg)):
# end of a cpu suspend, start of the next
m = re.match('smpboot: CPU (?P<cpu>[0-9]*) is now offline', msg)
struct timeval interval_tv = { 5, 0 };
struct timespec interval_ts = { 5, 0 };
-/* Save original CPU model */
-unsigned int model_orig;
-
unsigned int num_iterations;
unsigned int header_iterations;
unsigned int debug;
unsigned int summary_only;
unsigned int list_header_only;
unsigned int dump_only;
-unsigned int do_snb_cstates;
-unsigned int do_knl_cstates;
-unsigned int do_slm_cstates;
-unsigned int use_c1_residency_msr;
unsigned int has_aperf;
unsigned int has_epb;
unsigned int has_turbo;
unsigned int is_hybrid;
-unsigned int do_irtl_snb;
-unsigned int do_irtl_hsw;
unsigned int units = 1000000; /* MHz etc */
unsigned int genuine_intel;
unsigned int authentic_amd;
unsigned int hygon_genuine;
unsigned int max_level, max_extended_level;
unsigned int has_invariant_tsc;
-unsigned int do_nhm_platform_info;
-unsigned int no_MSR_MISC_PWR_MGMT;
unsigned int aperf_mperf_multiplier = 1;
double bclk;
double base_hz;
unsigned int show_pkg_only;
unsigned int show_core_only;
char *output_buffer, *outp;
-unsigned int do_rapl;
unsigned int do_dts;
unsigned int do_ptm;
unsigned int do_ipc;
unsigned int gfx_act_mhz;
unsigned int tj_max;
unsigned int tj_max_override;
-int tcc_offset_bits;
double rapl_power_units, rapl_time_units;
double rapl_dram_energy_units, rapl_energy_units;
double rapl_joule_counter_range;
-unsigned int do_core_perf_limit_reasons;
-unsigned int has_automatic_cstate_conversion;
-unsigned int dis_cstate_prewake;
-unsigned int do_gfx_perf_limit_reasons;
-unsigned int do_ring_perf_limit_reasons;
unsigned int crystal_hz;
unsigned long long tsc_hz;
int base_cpu;
-double discover_bclk(unsigned int family, unsigned int model);
unsigned int has_hwp; /* IA32_PM_ENABLE, IA32_HWP_CAPABILITIES */
/* IA32_HWP_REQUEST, IA32_HWP_STATUS */
unsigned int has_hwp_notify; /* IA32_HWP_INTERRUPT */
unsigned int has_hwp_activity_window; /* IA32_HWP_REQUEST[bits 41:32] */
unsigned int has_hwp_epp; /* IA32_HWP_REQUEST[bits 31:24] */
unsigned int has_hwp_pkg; /* IA32_HWP_REQUEST_PKG */
-unsigned int has_misc_feature_control;
unsigned int first_counter_read = 1;
int ignore_stdin;
-#define RAPL_PKG (1 << 0)
- /* 0x610 MSR_PKG_POWER_LIMIT */
- /* 0x611 MSR_PKG_ENERGY_STATUS */
-#define RAPL_PKG_PERF_STATUS (1 << 1)
- /* 0x613 MSR_PKG_PERF_STATUS */
-#define RAPL_PKG_POWER_INFO (1 << 2)
- /* 0x614 MSR_PKG_POWER_INFO */
-
-#define RAPL_DRAM (1 << 3)
- /* 0x618 MSR_DRAM_POWER_LIMIT */
- /* 0x619 MSR_DRAM_ENERGY_STATUS */
-#define RAPL_DRAM_PERF_STATUS (1 << 4)
- /* 0x61b MSR_DRAM_PERF_STATUS */
-#define RAPL_DRAM_POWER_INFO (1 << 5)
- /* 0x61c MSR_DRAM_POWER_INFO */
-
-#define RAPL_CORES_POWER_LIMIT (1 << 6)
- /* 0x638 MSR_PP0_POWER_LIMIT */
-#define RAPL_CORE_POLICY (1 << 7)
- /* 0x63a MSR_PP0_POLICY */
-
-#define RAPL_GFX (1 << 8)
- /* 0x640 MSR_PP1_POWER_LIMIT */
- /* 0x641 MSR_PP1_ENERGY_STATUS */
- /* 0x642 MSR_PP1_POLICY */
-
-#define RAPL_CORES_ENERGY_STATUS (1 << 9)
- /* 0x639 MSR_PP0_ENERGY_STATUS */
-#define RAPL_PER_CORE_ENERGY (1 << 10)
- /* Indicates cores energy collection is per-core,
- * not per-package. */
-#define RAPL_AMD_F17H (1 << 11)
- /* 0xc0010299 MSR_RAPL_PWR_UNIT */
- /* 0xc001029a MSR_CORE_ENERGY_STAT */
- /* 0xc001029b MSR_PKG_ENERGY_STAT */
-#define RAPL_CORES (RAPL_CORES_ENERGY_STATUS | RAPL_CORES_POWER_LIMIT)
+int get_msr(int cpu, off_t offset, unsigned long long *msr);
+
+/* Model specific support Start */
+
+/* List of features that may diverge among different platforms */
+struct platform_features {
+ bool has_msr_misc_feature_control; /* MSR_MISC_FEATURE_CONTROL */
+ bool has_msr_misc_pwr_mgmt; /* MSR_MISC_PWR_MGMT */
+ bool has_nhm_msrs; /* MSR_PLATFORM_INFO, MSR_IA32_TEMPERATURE_TARGET, MSR_SMI_COUNT, MSR_PKG_CST_CONFIG_CONTROL, MSR_IA32_POWER_CTL, TRL MSRs */
+ bool has_config_tdp; /* MSR_CONFIG_TDP_NOMINAL/LEVEL_1/LEVEL_2/CONTROL, MSR_TURBO_ACTIVATION_RATIO */
+ int bclk_freq; /* CPU base clock */
+ int crystal_freq; /* Crystal clock to use when not available from CPUID.15 */
+ int supported_cstates; /* Core cstates and Package cstates supported */
+ int cst_limit; /* MSR_PKG_CST_CONFIG_CONTROL */
+ bool has_cst_auto_convension; /* AUTOMATIC_CSTATE_CONVERSION bit in MSR_PKG_CST_CONFIG_CONTROL */
+ bool has_irtl_msrs; /* MSR_PKGC3/PKGC6/PKGC7/PKGC8/PKGC9/PKGC10_IRTL */
+ bool has_msr_core_c1_res; /* MSR_CORE_C1_RES */
+ bool has_msr_module_c6_res_ms; /* MSR_MODULE_C6_RES_MS */
+ bool has_msr_c6_demotion_policy_config; /* MSR_CC6_DEMOTION_POLICY_CONFIG/MSR_MC6_DEMOTION_POLICY_CONFIG */
+ bool has_msr_atom_pkg_c6_residency; /* MSR_ATOM_PKG_C6_RESIDENCY */
+ bool has_msr_knl_core_c6_residency; /* MSR_KNL_CORE_C6_RESIDENCY */
+ bool has_ext_cst_msrs; /* MSR_PKG_WEIGHTED_CORE_C0_RES/MSR_PKG_ANY_CORE_C0_RES/MSR_PKG_ANY_GFXE_C0_RES/MSR_PKG_BOTH_CORE_GFXE_C0_RES */
+ bool has_cst_prewake_bit; /* Cstate prewake bit in MSR_IA32_POWER_CTL */
+ int trl_msrs; /* MSR_TURBO_RATIO_LIMIT/LIMIT1/LIMIT2/SECONDARY, Atom TRL MSRs */
+ int plr_msrs; /* MSR_CORE/GFX/RING_PERF_LIMIT_REASONS */
+ int rapl_msrs; /* RAPL PKG/DRAM/CORE/GFX MSRs, AMD RAPL MSRs */
+ bool has_per_core_rapl; /* Indicates cores energy collection is per-core, not per-package. AMD specific for now */
+ bool has_rapl_divisor; /* Divisor for Energy unit raw value from MSR_RAPL_POWER_UNIT */
+ bool has_fixed_rapl_unit; /* Fixed Energy Unit used for DRAM RAPL Domain */
+ int rapl_quirk_tdp; /* Hardcoded TDP value when cannot be retrieved from hardware */
+ int tcc_offset_bits; /* TCC Offset bits in MSR_IA32_TEMPERATURE_TARGET */
+ bool enable_tsc_tweak; /* Use CPU Base freq instead of TSC freq for aperf/mperf counter */
+ bool need_perf_multiplier; /* mperf/aperf multiplier */
+};
+
+struct platform_data {
+ unsigned int model;
+ const struct platform_features *features;
+};
+
+/* For BCLK */
+enum bclk_freq {
+ BCLK_100MHZ = 1,
+ BCLK_133MHZ,
+ BCLK_SLV,
+};
+
+#define SLM_BCLK_FREQS 5
+double slm_freq_table[SLM_BCLK_FREQS] = { 83.3, 100.0, 133.3, 116.7, 80.0 };
+
+double slm_bclk(void)
+{
+ unsigned long long msr = 3;
+ unsigned int i;
+ double freq;
+
+ if (get_msr(base_cpu, MSR_FSB_FREQ, &msr))
+ fprintf(outf, "SLM BCLK: unknown\n");
+
+ i = msr & 0xf;
+ if (i >= SLM_BCLK_FREQS) {
+ fprintf(outf, "SLM BCLK[%d] invalid\n", i);
+ i = 3;
+ }
+ freq = slm_freq_table[i];
+
+ if (!quiet)
+ fprintf(outf, "SLM BCLK: %.1f Mhz\n", freq);
+
+ return freq;
+}
+
+/* For Package cstate limit */
+enum package_cstate_limit {
+ CST_LIMIT_NHM = 1,
+ CST_LIMIT_SNB,
+ CST_LIMIT_HSW,
+ CST_LIMIT_SKX,
+ CST_LIMIT_ICX,
+ CST_LIMIT_SLV,
+ CST_LIMIT_AMT,
+ CST_LIMIT_KNL,
+ CST_LIMIT_GMT,
+};
+
+/* For Turbo Ratio Limit MSRs */
+enum turbo_ratio_limit_msrs {
+ TRL_BASE = BIT(0),
+ TRL_LIMIT1 = BIT(1),
+ TRL_LIMIT2 = BIT(2),
+ TRL_ATOM = BIT(3),
+ TRL_KNL = BIT(4),
+ TRL_CORECOUNT = BIT(5),
+};
+
+/* For Perf Limit Reason MSRs */
+enum perf_limit_reason_msrs {
+ PLR_CORE = BIT(0),
+ PLR_GFX = BIT(1),
+ PLR_RING = BIT(2),
+};
+
+/* For RAPL MSRs */
+enum rapl_msrs {
+ RAPL_PKG_POWER_LIMIT = BIT(0), /* 0x610 MSR_PKG_POWER_LIMIT */
+ RAPL_PKG_ENERGY_STATUS = BIT(1), /* 0x611 MSR_PKG_ENERGY_STATUS */
+ RAPL_PKG_PERF_STATUS = BIT(2), /* 0x613 MSR_PKG_PERF_STATUS */
+ RAPL_PKG_POWER_INFO = BIT(3), /* 0x614 MSR_PKG_POWER_INFO */
+ RAPL_DRAM_POWER_LIMIT = BIT(4), /* 0x618 MSR_DRAM_POWER_LIMIT */
+ RAPL_DRAM_ENERGY_STATUS = BIT(5), /* 0x619 MSR_DRAM_ENERGY_STATUS */
+ RAPL_DRAM_PERF_STATUS = BIT(6), /* 0x61b MSR_DRAM_PERF_STATUS */
+ RAPL_DRAM_POWER_INFO = BIT(7), /* 0x61c MSR_DRAM_POWER_INFO */
+ RAPL_CORE_POWER_LIMIT = BIT(8), /* 0x638 MSR_PP0_POWER_LIMIT */
+ RAPL_CORE_ENERGY_STATUS = BIT(9), /* 0x639 MSR_PP0_ENERGY_STATUS */
+ RAPL_CORE_POLICY = BIT(10), /* 0x63a MSR_PP0_POLICY */
+ RAPL_GFX_POWER_LIMIT = BIT(11), /* 0x640 MSR_PP1_POWER_LIMIT */
+ RAPL_GFX_ENERGY_STATUS = BIT(12), /* 0x641 MSR_PP1_ENERGY_STATUS */
+ RAPL_GFX_POLICY = BIT(13), /* 0x642 MSR_PP1_POLICY */
+ RAPL_AMD_PWR_UNIT = BIT(14), /* 0xc0010299 MSR_AMD_RAPL_POWER_UNIT */
+ RAPL_AMD_CORE_ENERGY_STAT = BIT(15), /* 0xc001029a MSR_AMD_CORE_ENERGY_STATUS */
+ RAPL_AMD_PKG_ENERGY_STAT = BIT(16), /* 0xc001029b MSR_AMD_PKG_ENERGY_STATUS */
+};
+
+#define RAPL_PKG (RAPL_PKG_ENERGY_STATUS | RAPL_PKG_POWER_LIMIT)
+#define RAPL_DRAM (RAPL_DRAM_ENERGY_STATUS | RAPL_DRAM_POWER_LIMIT)
+#define RAPL_CORE (RAPL_CORE_ENERGY_STATUS | RAPL_CORE_POWER_LIMIT)
+#define RAPL_GFX (RAPL_GFX_POWER_LIMIT | RAPL_GFX_ENERGY_STATUS)
+
+#define RAPL_PKG_ALL (RAPL_PKG | RAPL_PKG_PERF_STATUS | RAPL_PKG_POWER_INFO)
+#define RAPL_DRAM_ALL (RAPL_DRAM | RAPL_DRAM_PERF_STATUS | RAPL_DRAM_POWER_INFO)
+#define RAPL_CORE_ALL (RAPL_CORE | RAPL_CORE_POLICY)
+#define RAPL_GFX_ALL (RAPL_GFX | RAPL_GFX_POLIGY)
+
+#define RAPL_AMD_F17H (RAPL_AMD_PWR_UNIT | RAPL_AMD_CORE_ENERGY_STAT | RAPL_AMD_PKG_ENERGY_STAT)
+
+/* For Cstates */
+enum cstates {
+ CC1 = BIT(0),
+ CC3 = BIT(1),
+ CC6 = BIT(2),
+ CC7 = BIT(3),
+ PC2 = BIT(4),
+ PC3 = BIT(5),
+ PC6 = BIT(6),
+ PC7 = BIT(7),
+ PC8 = BIT(8),
+ PC9 = BIT(9),
+ PC10 = BIT(10),
+};
+
+static const struct platform_features nhm_features = {
+ .has_msr_misc_pwr_mgmt = 1,
+ .has_nhm_msrs = 1,
+ .bclk_freq = BCLK_133MHZ,
+ .supported_cstates = CC1 | CC3 | CC6 | PC3 | PC6,
+ .cst_limit = CST_LIMIT_NHM,
+ .trl_msrs = TRL_BASE,
+};
+
+static const struct platform_features nhx_features = {
+ .has_msr_misc_pwr_mgmt = 1,
+ .has_nhm_msrs = 1,
+ .bclk_freq = BCLK_133MHZ,
+ .supported_cstates = CC1 | CC3 | CC6 | PC3 | PC6,
+ .cst_limit = CST_LIMIT_NHM,
+};
+
+static const struct platform_features snb_features = {
+ .has_msr_misc_feature_control = 1,
+ .has_msr_misc_pwr_mgmt = 1,
+ .has_nhm_msrs = 1,
+ .bclk_freq = BCLK_100MHZ,
+ .supported_cstates = CC1 | CC3 | CC6 | CC7 | PC2 | PC3 | PC6 | PC7,
+ .cst_limit = CST_LIMIT_SNB,
+ .has_irtl_msrs = 1,
+ .trl_msrs = TRL_BASE,
+ .rapl_msrs = RAPL_PKG | RAPL_CORE_ALL | RAPL_GFX | RAPL_PKG_POWER_INFO,
+};
+
+static const struct platform_features snx_features = {
+ .has_msr_misc_feature_control = 1,
+ .has_msr_misc_pwr_mgmt = 1,
+ .has_nhm_msrs = 1,
+ .bclk_freq = BCLK_100MHZ,
+ .supported_cstates = CC1 | CC3 | CC6 | CC7 | PC2 | PC3 | PC6 | PC7,
+ .cst_limit = CST_LIMIT_SNB,
+ .has_irtl_msrs = 1,
+ .trl_msrs = TRL_BASE,
+ .rapl_msrs = RAPL_PKG_ALL | RAPL_CORE_ALL | RAPL_DRAM_ALL,
+};
+
+static const struct platform_features ivb_features = {
+ .has_msr_misc_feature_control = 1,
+ .has_msr_misc_pwr_mgmt = 1,
+ .has_nhm_msrs = 1,
+ .has_config_tdp = 1,
+ .bclk_freq = BCLK_100MHZ,
+ .supported_cstates = CC1 | CC3 | CC6 | CC7 | PC2 | PC3 | PC6 | PC7,
+ .cst_limit = CST_LIMIT_SNB,
+ .has_irtl_msrs = 1,
+ .trl_msrs = TRL_BASE,
+ .rapl_msrs = RAPL_PKG | RAPL_CORE_ALL | RAPL_GFX | RAPL_PKG_POWER_INFO,
+};
+
+static const struct platform_features ivx_features = {
+ .has_msr_misc_feature_control = 1,
+ .has_msr_misc_pwr_mgmt = 1,
+ .has_nhm_msrs = 1,
+ .bclk_freq = BCLK_100MHZ,
+ .supported_cstates = CC1 | CC3 | CC6 | CC7 | PC2 | PC3 | PC6 | PC7,
+ .cst_limit = CST_LIMIT_SNB,
+ .has_irtl_msrs = 1,
+ .trl_msrs = TRL_BASE | TRL_LIMIT1,
+ .rapl_msrs = RAPL_PKG_ALL | RAPL_CORE_ALL | RAPL_DRAM_ALL,
+};
+
+static const struct platform_features hsw_features = {
+ .has_msr_misc_feature_control = 1,
+ .has_msr_misc_pwr_mgmt = 1,
+ .has_nhm_msrs = 1,
+ .has_config_tdp = 1,
+ .bclk_freq = BCLK_100MHZ,
+ .supported_cstates = CC1 | CC3 | CC6 | CC7 | PC2 | PC3 | PC6 | PC7,
+ .cst_limit = CST_LIMIT_HSW,
+ .has_irtl_msrs = 1,
+ .trl_msrs = TRL_BASE,
+ .plr_msrs = PLR_CORE | PLR_GFX | PLR_RING,
+ .rapl_msrs = RAPL_PKG | RAPL_CORE_ALL | RAPL_GFX | RAPL_PKG_POWER_INFO,
+};
+
+static const struct platform_features hsx_features = {
+ .has_msr_misc_feature_control = 1,
+ .has_msr_misc_pwr_mgmt = 1,
+ .has_nhm_msrs = 1,
+ .has_config_tdp = 1,
+ .bclk_freq = BCLK_100MHZ,
+ .supported_cstates = CC1 | CC3 | CC6 | CC7 | PC2 | PC3 | PC6 | PC7,
+ .cst_limit = CST_LIMIT_HSW,
+ .has_irtl_msrs = 1,
+ .trl_msrs = TRL_BASE | TRL_LIMIT1 | TRL_LIMIT2,
+ .plr_msrs = PLR_CORE | PLR_RING,
+ .rapl_msrs = RAPL_PKG_ALL | RAPL_DRAM_ALL,
+ .has_fixed_rapl_unit = 1,
+};
+
+static const struct platform_features hswl_features = {
+ .has_msr_misc_feature_control = 1,
+ .has_msr_misc_pwr_mgmt = 1,
+ .has_nhm_msrs = 1,
+ .has_config_tdp = 1,
+ .bclk_freq = BCLK_100MHZ,
+ .supported_cstates = CC1 | CC3 | CC6 | CC7 | PC2 | PC3 | PC6 | PC7 | PC8 | PC9 | PC10,
+ .cst_limit = CST_LIMIT_HSW,
+ .has_irtl_msrs = 1,
+ .trl_msrs = TRL_BASE,
+ .plr_msrs = PLR_CORE | PLR_GFX | PLR_RING,
+ .rapl_msrs = RAPL_PKG | RAPL_CORE_ALL | RAPL_GFX | RAPL_PKG_POWER_INFO,
+};
+
+static const struct platform_features hswg_features = {
+ .has_msr_misc_feature_control = 1,
+ .has_msr_misc_pwr_mgmt = 1,
+ .has_nhm_msrs = 1,
+ .has_config_tdp = 1,
+ .bclk_freq = BCLK_100MHZ,
+ .supported_cstates = CC1 | CC3 | CC6 | CC7 | PC2 | PC3 | PC6 | PC7,
+ .cst_limit = CST_LIMIT_HSW,
+ .has_irtl_msrs = 1,
+ .trl_msrs = TRL_BASE,
+ .plr_msrs = PLR_CORE | PLR_GFX | PLR_RING,
+ .rapl_msrs = RAPL_PKG | RAPL_CORE_ALL | RAPL_GFX | RAPL_PKG_POWER_INFO,
+};
+
+static const struct platform_features bdw_features = {
+ .has_msr_misc_feature_control = 1,
+ .has_msr_misc_pwr_mgmt = 1,
+ .has_nhm_msrs = 1,
+ .has_config_tdp = 1,
+ .bclk_freq = BCLK_100MHZ,
+ .supported_cstates = CC1 | CC3 | CC6 | CC7 | PC2 | PC3 | PC6 | PC7 | PC8 | PC9 | PC10,
+ .cst_limit = CST_LIMIT_HSW,
+ .has_irtl_msrs = 1,
+ .trl_msrs = TRL_BASE,
+ .rapl_msrs = RAPL_PKG | RAPL_CORE_ALL | RAPL_GFX | RAPL_PKG_POWER_INFO,
+};
+
+static const struct platform_features bdwg_features = {
+ .has_msr_misc_feature_control = 1,
+ .has_msr_misc_pwr_mgmt = 1,
+ .has_nhm_msrs = 1,
+ .has_config_tdp = 1,
+ .bclk_freq = BCLK_100MHZ,
+ .supported_cstates = CC1 | CC3 | CC6 | CC7 | PC2 | PC3 | PC6 | PC7,
+ .cst_limit = CST_LIMIT_HSW,
+ .has_irtl_msrs = 1,
+ .trl_msrs = TRL_BASE,
+ .rapl_msrs = RAPL_PKG | RAPL_CORE_ALL | RAPL_GFX | RAPL_PKG_POWER_INFO,
+};
+
+static const struct platform_features bdx_features = {
+ .has_msr_misc_feature_control = 1,
+ .has_msr_misc_pwr_mgmt = 1,
+ .has_nhm_msrs = 1,
+ .has_config_tdp = 1,
+ .bclk_freq = BCLK_100MHZ,
+ .supported_cstates = CC1 | CC3 | CC6 | PC2 | PC3 | PC6,
+ .cst_limit = CST_LIMIT_HSW,
+ .has_irtl_msrs = 1,
+ .has_cst_auto_convension = 1,
+ .trl_msrs = TRL_BASE,
+ .rapl_msrs = RAPL_PKG_ALL | RAPL_DRAM_ALL,
+ .has_fixed_rapl_unit = 1,
+};
+
+static const struct platform_features skl_features = {
+ .has_msr_misc_feature_control = 1,
+ .has_msr_misc_pwr_mgmt = 1,
+ .has_nhm_msrs = 1,
+ .has_config_tdp = 1,
+ .bclk_freq = BCLK_100MHZ,
+ .crystal_freq = 24000000,
+ .supported_cstates = CC1 | CC3 | CC6 | CC7 | PC2 | PC3 | PC6 | PC7 | PC8 | PC9 | PC10,
+ .cst_limit = CST_LIMIT_HSW,
+ .has_irtl_msrs = 1,
+ .has_ext_cst_msrs = 1,
+ .trl_msrs = TRL_BASE,
+ .tcc_offset_bits = 6,
+ .rapl_msrs = RAPL_PKG_ALL | RAPL_CORE_ALL | RAPL_DRAM | RAPL_DRAM_PERF_STATUS | RAPL_GFX,
+ .enable_tsc_tweak = 1,
+};
+
+static const struct platform_features cnl_features = {
+ .has_msr_misc_feature_control = 1,
+ .has_msr_misc_pwr_mgmt = 1,
+ .has_nhm_msrs = 1,
+ .has_config_tdp = 1,
+ .bclk_freq = BCLK_100MHZ,
+ .supported_cstates = CC1 | CC6 | CC7 | PC2 | PC3 | PC6 | PC7 | PC8 | PC9 | PC10,
+ .cst_limit = CST_LIMIT_HSW,
+ .has_irtl_msrs = 1,
+ .has_msr_core_c1_res = 1,
+ .has_ext_cst_msrs = 1,
+ .trl_msrs = TRL_BASE,
+ .tcc_offset_bits = 6,
+ .rapl_msrs = RAPL_PKG_ALL | RAPL_CORE_ALL | RAPL_DRAM | RAPL_DRAM_PERF_STATUS | RAPL_GFX,
+ .enable_tsc_tweak = 1,
+};
+
+static const struct platform_features adl_features = {
+ .has_msr_misc_feature_control = 1,
+ .has_msr_misc_pwr_mgmt = 1,
+ .has_nhm_msrs = 1,
+ .has_config_tdp = 1,
+ .bclk_freq = BCLK_100MHZ,
+ .supported_cstates = CC1 | CC6 | CC7 | PC2 | PC3 | PC6 | PC8 | PC10,
+ .cst_limit = CST_LIMIT_HSW,
+ .has_irtl_msrs = 1,
+ .has_msr_core_c1_res = 1,
+ .has_ext_cst_msrs = 1,
+ .trl_msrs = TRL_BASE,
+ .tcc_offset_bits = 6,
+ .rapl_msrs = RAPL_PKG_ALL | RAPL_CORE_ALL | RAPL_DRAM | RAPL_DRAM_PERF_STATUS | RAPL_GFX,
+ .enable_tsc_tweak = 1,
+};
+
+static const struct platform_features skx_features = {
+ .has_msr_misc_feature_control = 1,
+ .has_msr_misc_pwr_mgmt = 1,
+ .has_nhm_msrs = 1,
+ .has_config_tdp = 1,
+ .bclk_freq = BCLK_100MHZ,
+ .supported_cstates = CC1 | CC6 | PC2 | PC6,
+ .cst_limit = CST_LIMIT_SKX,
+ .has_irtl_msrs = 1,
+ .has_cst_auto_convension = 1,
+ .trl_msrs = TRL_BASE | TRL_CORECOUNT,
+ .rapl_msrs = RAPL_PKG_ALL | RAPL_DRAM_ALL,
+ .has_fixed_rapl_unit = 1,
+};
+
+static const struct platform_features icx_features = {
+ .has_msr_misc_feature_control = 1,
+ .has_msr_misc_pwr_mgmt = 1,
+ .has_nhm_msrs = 1,
+ .has_config_tdp = 1,
+ .bclk_freq = BCLK_100MHZ,
+ .supported_cstates = CC1 | CC6 | PC2 | PC6,
+ .cst_limit = CST_LIMIT_ICX,
+ .has_irtl_msrs = 1,
+ .has_cst_prewake_bit = 1,
+ .trl_msrs = TRL_BASE | TRL_CORECOUNT,
+ .rapl_msrs = RAPL_PKG_ALL | RAPL_DRAM_ALL,
+ .has_fixed_rapl_unit = 1,
+};
+
+static const struct platform_features spr_features = {
+ .has_msr_misc_feature_control = 1,
+ .has_msr_misc_pwr_mgmt = 1,
+ .has_nhm_msrs = 1,
+ .has_config_tdp = 1,
+ .bclk_freq = BCLK_100MHZ,
+ .supported_cstates = CC1 | CC6 | PC2 | PC6,
+ .cst_limit = CST_LIMIT_SKX,
+ .has_msr_core_c1_res = 1,
+ .has_irtl_msrs = 1,
+ .has_cst_prewake_bit = 1,
+ .trl_msrs = TRL_BASE | TRL_CORECOUNT,
+ .rapl_msrs = RAPL_PKG_ALL | RAPL_DRAM_ALL,
+};
+
+static const struct platform_features srf_features = {
+ .has_msr_misc_feature_control = 1,
+ .has_msr_misc_pwr_mgmt = 1,
+ .has_nhm_msrs = 1,
+ .has_config_tdp = 1,
+ .bclk_freq = BCLK_100MHZ,
+ .supported_cstates = CC1 | CC6 | PC2 | PC6,
+ .cst_limit = CST_LIMIT_SKX,
+ .has_msr_core_c1_res = 1,
+ .has_msr_module_c6_res_ms = 1,
+ .has_irtl_msrs = 1,
+ .has_cst_prewake_bit = 1,
+ .trl_msrs = TRL_BASE | TRL_CORECOUNT,
+ .rapl_msrs = RAPL_PKG_ALL | RAPL_DRAM_ALL,
+};
+
+static const struct platform_features grr_features = {
+ .has_msr_misc_feature_control = 1,
+ .has_msr_misc_pwr_mgmt = 1,
+ .has_nhm_msrs = 1,
+ .has_config_tdp = 1,
+ .bclk_freq = BCLK_100MHZ,
+ .supported_cstates = CC1 | CC6,
+ .cst_limit = CST_LIMIT_SKX,
+ .has_msr_core_c1_res = 1,
+ .has_msr_module_c6_res_ms = 1,
+ .has_irtl_msrs = 1,
+ .has_cst_prewake_bit = 1,
+ .trl_msrs = TRL_BASE | TRL_CORECOUNT,
+ .rapl_msrs = RAPL_PKG_ALL | RAPL_DRAM_ALL,
+};
+
+static const struct platform_features slv_features = {
+ .has_nhm_msrs = 1,
+ .bclk_freq = BCLK_SLV,
+ .supported_cstates = CC1 | CC6 | PC6,
+ .cst_limit = CST_LIMIT_SLV,
+ .has_msr_core_c1_res = 1,
+ .has_msr_module_c6_res_ms = 1,
+ .has_msr_c6_demotion_policy_config = 1,
+ .has_msr_atom_pkg_c6_residency = 1,
+ .trl_msrs = TRL_ATOM,
+ .rapl_msrs = RAPL_PKG | RAPL_CORE,
+ .has_rapl_divisor = 1,
+ .rapl_quirk_tdp = 30,
+};
+
+static const struct platform_features slvd_features = {
+ .has_msr_misc_pwr_mgmt = 1,
+ .has_nhm_msrs = 1,
+ .bclk_freq = BCLK_SLV,
+ .supported_cstates = CC1 | CC6 | PC3 | PC6,
+ .cst_limit = CST_LIMIT_SLV,
+ .has_msr_atom_pkg_c6_residency = 1,
+ .trl_msrs = TRL_BASE,
+ .rapl_msrs = RAPL_PKG | RAPL_CORE,
+ .rapl_quirk_tdp = 30,
+};
+
+static const struct platform_features amt_features = {
+ .has_nhm_msrs = 1,
+ .bclk_freq = BCLK_133MHZ,
+ .supported_cstates = CC1 | CC3 | CC6 | PC3 | PC6,
+ .cst_limit = CST_LIMIT_AMT,
+ .trl_msrs = TRL_BASE,
+};
+
+static const struct platform_features gmt_features = {
+ .has_msr_misc_pwr_mgmt = 1,
+ .has_nhm_msrs = 1,
+ .bclk_freq = BCLK_100MHZ,
+ .crystal_freq = 19200000,
+ .supported_cstates = CC1 | CC3 | CC6 | CC7 | PC2 | PC3 | PC6 | PC7 | PC8 | PC9 | PC10,
+ .cst_limit = CST_LIMIT_GMT,
+ .has_irtl_msrs = 1,
+ .trl_msrs = TRL_BASE | TRL_CORECOUNT,
+ .rapl_msrs = RAPL_PKG | RAPL_PKG_POWER_INFO,
+};
+
+static const struct platform_features gmtd_features = {
+ .has_msr_misc_pwr_mgmt = 1,
+ .has_nhm_msrs = 1,
+ .bclk_freq = BCLK_100MHZ,
+ .crystal_freq = 25000000,
+ .supported_cstates = CC1 | CC6 | PC2 | PC6,
+ .cst_limit = CST_LIMIT_GMT,
+ .has_irtl_msrs = 1,
+ .has_msr_core_c1_res = 1,
+ .trl_msrs = TRL_BASE | TRL_CORECOUNT,
+ .rapl_msrs = RAPL_PKG_ALL | RAPL_DRAM_ALL | RAPL_CORE_ENERGY_STATUS,
+};
+
+static const struct platform_features gmtp_features = {
+ .has_msr_misc_pwr_mgmt = 1,
+ .has_nhm_msrs = 1,
+ .bclk_freq = BCLK_100MHZ,
+ .crystal_freq = 19200000,
+ .supported_cstates = CC1 | CC3 | CC6 | CC7 | PC2 | PC3 | PC6 | PC7 | PC8 | PC9 | PC10,
+ .cst_limit = CST_LIMIT_GMT,
+ .has_irtl_msrs = 1,
+ .trl_msrs = TRL_BASE,
+ .rapl_msrs = RAPL_PKG | RAPL_PKG_POWER_INFO,
+};
+
+static const struct platform_features tmt_features = {
+ .has_msr_misc_pwr_mgmt = 1,
+ .has_nhm_msrs = 1,
+ .bclk_freq = BCLK_100MHZ,
+ .supported_cstates = CC1 | CC6 | CC7 | PC2 | PC3 | PC6 | PC7 | PC8 | PC9 | PC10,
+ .cst_limit = CST_LIMIT_GMT,
+ .has_irtl_msrs = 1,
+ .trl_msrs = TRL_BASE,
+ .rapl_msrs = RAPL_PKG_ALL | RAPL_CORE_ALL | RAPL_DRAM | RAPL_DRAM_PERF_STATUS | RAPL_GFX,
+ .enable_tsc_tweak = 1,
+};
+
+static const struct platform_features tmtd_features = {
+ .has_msr_misc_pwr_mgmt = 1,
+ .has_nhm_msrs = 1,
+ .bclk_freq = BCLK_100MHZ,
+ .supported_cstates = CC1 | CC6,
+ .cst_limit = CST_LIMIT_GMT,
+ .has_irtl_msrs = 1,
+ .trl_msrs = TRL_BASE | TRL_CORECOUNT,
+ .rapl_msrs = RAPL_PKG_ALL,
+};
+
+static const struct platform_features knl_features = {
+ .has_msr_misc_pwr_mgmt = 1,
+ .has_nhm_msrs = 1,
+ .has_config_tdp = 1,
+ .bclk_freq = BCLK_100MHZ,
+ .supported_cstates = CC1 | CC6 | PC3 | PC6,
+ .cst_limit = CST_LIMIT_KNL,
+ .has_msr_knl_core_c6_residency = 1,
+ .trl_msrs = TRL_KNL,
+ .rapl_msrs = RAPL_PKG_ALL | RAPL_DRAM_ALL,
+ .has_fixed_rapl_unit = 1,
+ .need_perf_multiplier = 1,
+};
+
+static const struct platform_features default_features = {
+};
+
+static const struct platform_features amd_features_with_rapl = {
+ .rapl_msrs = RAPL_AMD_F17H,
+ .has_per_core_rapl = 1,
+ .rapl_quirk_tdp = 280, /* This is the max stock TDP of HEDT/Server Fam17h+ chips */
+};
+
+static const struct platform_data turbostat_pdata[] = {
+ { INTEL_FAM6_NEHALEM, &nhm_features },
+ { INTEL_FAM6_NEHALEM_G, &nhm_features },
+ { INTEL_FAM6_NEHALEM_EP, &nhm_features },
+ { INTEL_FAM6_NEHALEM_EX, &nhx_features },
+ { INTEL_FAM6_WESTMERE, &nhm_features },
+ { INTEL_FAM6_WESTMERE_EP, &nhm_features },
+ { INTEL_FAM6_WESTMERE_EX, &nhx_features },
+ { INTEL_FAM6_SANDYBRIDGE, &snb_features },
+ { INTEL_FAM6_SANDYBRIDGE_X, &snx_features },
+ { INTEL_FAM6_IVYBRIDGE, &ivb_features },
+ { INTEL_FAM6_IVYBRIDGE_X, &ivx_features },
+ { INTEL_FAM6_HASWELL, &hsw_features },
+ { INTEL_FAM6_HASWELL_X, &hsx_features },
+ { INTEL_FAM6_HASWELL_L, &hswl_features },
+ { INTEL_FAM6_HASWELL_G, &hswg_features },
+ { INTEL_FAM6_BROADWELL, &bdw_features },
+ { INTEL_FAM6_BROADWELL_G, &bdwg_features },
+ { INTEL_FAM6_BROADWELL_X, &bdx_features },
+ { INTEL_FAM6_BROADWELL_D, &bdx_features },
+ { INTEL_FAM6_SKYLAKE_L, &skl_features },
+ { INTEL_FAM6_SKYLAKE, &skl_features },
+ { INTEL_FAM6_SKYLAKE_X, &skx_features },
+ { INTEL_FAM6_KABYLAKE_L, &skl_features },
+ { INTEL_FAM6_KABYLAKE, &skl_features },
+ { INTEL_FAM6_COMETLAKE, &skl_features },
+ { INTEL_FAM6_COMETLAKE_L, &skl_features },
+ { INTEL_FAM6_CANNONLAKE_L, &cnl_features },
+ { INTEL_FAM6_ICELAKE_X, &icx_features },
+ { INTEL_FAM6_ICELAKE_D, &icx_features },
+ { INTEL_FAM6_ICELAKE_L, &cnl_features },
+ { INTEL_FAM6_ICELAKE_NNPI, &cnl_features },
+ { INTEL_FAM6_ROCKETLAKE, &cnl_features },
+ { INTEL_FAM6_TIGERLAKE_L, &cnl_features },
+ { INTEL_FAM6_TIGERLAKE, &cnl_features },
+ { INTEL_FAM6_SAPPHIRERAPIDS_X, &spr_features },
+ { INTEL_FAM6_EMERALDRAPIDS_X, &spr_features },
+ { INTEL_FAM6_GRANITERAPIDS_X, &spr_features },
+ { INTEL_FAM6_LAKEFIELD, &cnl_features },
+ { INTEL_FAM6_ALDERLAKE, &adl_features },
+ { INTEL_FAM6_ALDERLAKE_L, &adl_features },
+ { INTEL_FAM6_RAPTORLAKE, &adl_features },
+ { INTEL_FAM6_RAPTORLAKE_P, &adl_features },
+ { INTEL_FAM6_RAPTORLAKE_S, &adl_features },
+ { INTEL_FAM6_METEORLAKE, &cnl_features },
+ { INTEL_FAM6_METEORLAKE_L, &cnl_features },
+ { INTEL_FAM6_ARROWLAKE, &cnl_features },
+ { INTEL_FAM6_LUNARLAKE_M, &cnl_features },
+ { INTEL_FAM6_ATOM_SILVERMONT, &slv_features },
+ { INTEL_FAM6_ATOM_SILVERMONT_D, &slvd_features },
+ { INTEL_FAM6_ATOM_AIRMONT, &amt_features },
+ { INTEL_FAM6_ATOM_GOLDMONT, &gmt_features },
+ { INTEL_FAM6_ATOM_GOLDMONT_D, &gmtd_features },
+ { INTEL_FAM6_ATOM_GOLDMONT_PLUS, &gmtp_features },
+ { INTEL_FAM6_ATOM_TREMONT_D, &tmtd_features },
+ { INTEL_FAM6_ATOM_TREMONT, &tmt_features },
+ { INTEL_FAM6_ATOM_TREMONT_L, &tmt_features },
+ { INTEL_FAM6_ATOM_GRACEMONT, &adl_features },
+ { INTEL_FAM6_ATOM_CRESTMONT_X, &srf_features },
+ { INTEL_FAM6_ATOM_CRESTMONT, &grr_features },
+ { INTEL_FAM6_XEON_PHI_KNL, &knl_features },
+ { INTEL_FAM6_XEON_PHI_KNM, &knl_features },
+ /*
+ * Missing support for
+ * INTEL_FAM6_ICELAKE
+ * INTEL_FAM6_ATOM_SILVERMONT_MID
+ * INTEL_FAM6_ATOM_AIRMONT_MID
+ * INTEL_FAM6_ATOM_AIRMONT_NP
+ */
+ { 0, NULL },
+};
+
+static const struct platform_features *platform;
+
+void probe_platform_features(unsigned int family, unsigned int model)
+{
+ int i;
+
+ platform = &default_features;
+
+ if (authentic_amd || hygon_genuine) {
+ if (max_extended_level >= 0x80000007) {
+ unsigned int eax, ebx, ecx, edx;
+
+ __cpuid(0x80000007, eax, ebx, ecx, edx);
+ /* RAPL (Fam 17h+) */
+ if ((edx & (1 << 14)) && family >= 0x17)
+ platform = &amd_features_with_rapl;
+ }
+ return;
+ }
+
+ if (!genuine_intel || family != 6)
+ return;
+
+ for (i = 0; turbostat_pdata[i].features; i++) {
+ if (turbostat_pdata[i].model == model) {
+ platform = turbostat_pdata[i].features;
+ return;
+ }
+ }
+}
+
+/* Model specific support End */
+
#define TJMAX_DEFAULT 100
/* MSRs that are not yet in the kernel-provided header. */
char *progname;
#define CPU_SUBSET_MAXCPUS 1024 /* need to use before probe... */
-cpu_set_t *cpu_present_set, *cpu_affinity_set, *cpu_subset;
-size_t cpu_present_setsize, cpu_affinity_setsize, cpu_subset_size;
+cpu_set_t *cpu_present_set, *cpu_effective_set, *cpu_allowed_set, *cpu_affinity_set, *cpu_subset;
+size_t cpu_present_setsize, cpu_effective_setsize, cpu_allowed_setsize, cpu_affinity_setsize, cpu_subset_size;
#define MAX_ADDED_COUNTERS 8
#define MAX_ADDED_THREAD_COUNTERS 24
#define BITMASK_SIZE 32
unsigned int x2apic_id;
unsigned int flags;
bool is_atom;
-#define CPU_IS_FIRST_THREAD_IN_CORE 0x2
-#define CPU_IS_FIRST_CORE_IN_PACKAGE 0x4
unsigned long long counter[MAX_ADDED_THREAD_COUNTERS];
} *thread_even, *thread_odd;
struct core_data {
+ int base_cpu;
unsigned long long c3;
unsigned long long c6;
unsigned long long c7;
} *core_even, *core_odd;
struct pkg_data {
+ int base_cpu;
unsigned long long pc2;
unsigned long long pc3;
unsigned long long pc6;
switch (idx) {
case IDX_PKG_ENERGY:
- if (do_rapl & RAPL_AMD_F17H)
+ if (platform->rapl_msrs & RAPL_AMD_F17H)
offset = MSR_PKG_ENERGY_STAT;
else
offset = MSR_PKG_ENERGY_STATUS;
{
switch (idx) {
case IDX_PKG_ENERGY:
- return do_rapl & (RAPL_PKG | RAPL_AMD_F17H);
+ return platform->rapl_msrs & (RAPL_PKG | RAPL_AMD_F17H);
case IDX_DRAM_ENERGY:
- return do_rapl & RAPL_DRAM;
+ return platform->rapl_msrs & RAPL_DRAM;
case IDX_PP0_ENERGY:
- return do_rapl & RAPL_CORES_ENERGY_STATUS;
+ return platform->rapl_msrs & RAPL_CORE_ENERGY_STATUS;
case IDX_PP1_ENERGY:
- return do_rapl & RAPL_GFX;
+ return platform->rapl_msrs & RAPL_GFX;
case IDX_PKG_PERF:
- return do_rapl & RAPL_PKG_PERF_STATUS;
+ return platform->rapl_msrs & RAPL_PKG_PERF_STATUS;
case IDX_DRAM_PERF:
- return do_rapl & RAPL_DRAM_PERF_STATUS;
+ return platform->rapl_msrs & RAPL_DRAM_PERF_STATUS;
default:
return 0;
}
int num_die;
int num_cpus;
int num_cores;
+ int allowed_packages;
+ int allowed_cpus;
+ int allowed_cores;
int max_cpu_num;
int max_node_num;
int nodes_per_pkg;
int *irq_column_2_cpu; /* /proc/interrupts column numbers */
int *irqs_per_cpu; /* indexed by cpu_num */
-void setup_all_buffers(void);
+void setup_all_buffers(bool startup);
char *sys_lpi_file;
char *sys_lpi_file_sysfs = "/sys/devices/system/cpu/cpuidle/low_power_idle_system_residency_us";
return !CPU_ISSET_S(cpu, cpu_present_setsize, cpu_present_set);
}
+int cpu_is_not_allowed(int cpu)
+{
+ return !CPU_ISSET_S(cpu, cpu_allowed_setsize, cpu_allowed_set);
+}
+
/*
* run func(thread, core, package) in topology order
* skip non-present cpus
struct thread_data *t;
struct core_data *c;
struct pkg_data *p;
-
t = GET_THREAD(thread_base, thread_no, core_no, node_no, pkg_no);
- if (cpu_is_not_present(t->cpu_id))
+ if (cpu_is_not_allowed(t->cpu_id))
continue;
c = GET_CORE(core_base, core_no, node_no, pkg_no);
return 0;
}
+int is_cpu_first_thread_in_core(struct thread_data *t, struct core_data *c, struct pkg_data *p)
+{
+ UNUSED(p);
+
+ return ((int)t->cpu_id == c->base_cpu || c->base_cpu < 0);
+}
+
+int is_cpu_first_core_in_package(struct thread_data *t, struct core_data *c, struct pkg_data *p)
+{
+ UNUSED(c);
+
+ return ((int)t->cpu_id == p->base_cpu || p->base_cpu < 0);
+}
+
+int is_cpu_first_thread_in_package(struct thread_data *t, struct core_data *c, struct pkg_data *p)
+{
+ return is_cpu_first_thread_in_core(t, c, p) && is_cpu_first_core_in_package(t, c, p);
+}
+
int cpu_migrate(int cpu)
{
CPU_ZERO_S(cpu_affinity_setsize, cpu_affinity_set);
if (DO_BIC(BIC_CORE_THROT_CNT))
outp += sprintf(outp, "%sCoreThr", (printed++ ? delim : ""));
- if (do_rapl && !rapl_joules) {
- if (DO_BIC(BIC_CorWatt) && (do_rapl & RAPL_PER_CORE_ENERGY))
+ if (platform->rapl_msrs && !rapl_joules) {
+ if (DO_BIC(BIC_CorWatt) && platform->has_per_core_rapl)
outp += sprintf(outp, "%sCorWatt", (printed++ ? delim : ""));
- } else if (do_rapl && rapl_joules) {
- if (DO_BIC(BIC_Cor_J) && (do_rapl & RAPL_PER_CORE_ENERGY))
+ } else if (platform->rapl_msrs && rapl_joules) {
+ if (DO_BIC(BIC_Cor_J) && platform->has_per_core_rapl)
outp += sprintf(outp, "%sCor_J", (printed++ ? delim : ""));
}
if (DO_BIC(BIC_SYS_LPI))
outp += sprintf(outp, "%sSYS%%LPI", (printed++ ? delim : ""));
- if (do_rapl && !rapl_joules) {
+ if (platform->rapl_msrs && !rapl_joules) {
if (DO_BIC(BIC_PkgWatt))
outp += sprintf(outp, "%sPkgWatt", (printed++ ? delim : ""));
- if (DO_BIC(BIC_CorWatt) && !(do_rapl & RAPL_PER_CORE_ENERGY))
+ if (DO_BIC(BIC_CorWatt) && !platform->has_per_core_rapl)
outp += sprintf(outp, "%sCorWatt", (printed++ ? delim : ""));
if (DO_BIC(BIC_GFXWatt))
outp += sprintf(outp, "%sGFXWatt", (printed++ ? delim : ""));
outp += sprintf(outp, "%sPKG_%%", (printed++ ? delim : ""));
if (DO_BIC(BIC_RAM__))
outp += sprintf(outp, "%sRAM_%%", (printed++ ? delim : ""));
- } else if (do_rapl && rapl_joules) {
+ } else if (platform->rapl_msrs && rapl_joules) {
if (DO_BIC(BIC_Pkg_J))
outp += sprintf(outp, "%sPkg_J", (printed++ ? delim : ""));
- if (DO_BIC(BIC_Cor_J) && !(do_rapl & RAPL_PER_CORE_ENERGY))
+ if (DO_BIC(BIC_Cor_J) && !platform->has_per_core_rapl)
outp += sprintf(outp, "%sCor_J", (printed++ ? delim : ""));
if (DO_BIC(BIC_GFX_J))
outp += sprintf(outp, "%sGFX_J", (printed++ ? delim : ""));
int printed = 0;
/* if showing only 1st thread in core and this isn't one, bail out */
- if (show_core_only && !(t->flags & CPU_IS_FIRST_THREAD_IN_CORE))
+ if (show_core_only && !is_cpu_first_thread_in_core(t, c, p))
return 0;
/* if showing only 1st thread in pkg and this isn't one, bail out */
- if (show_pkg_only && !(t->flags & CPU_IS_FIRST_CORE_IN_PACKAGE))
+ if (show_pkg_only && !is_cpu_first_core_in_package(t, c, p))
return 0;
/*if not summary line and --cpu is used */
outp += sprintf(outp, "%s%.2f", (printed++ ? delim : ""), 100.0 * t->c1 / tsc);
/* print per-core data only for 1st thread in core */
- if (!(t->flags & CPU_IS_FIRST_THREAD_IN_CORE))
+ if (!is_cpu_first_thread_in_core(t, c, p))
goto done;
if (DO_BIC(BIC_CPU_c3))
fmt8 = "%s%.2f";
- if (DO_BIC(BIC_CorWatt) && (do_rapl & RAPL_PER_CORE_ENERGY))
+ if (DO_BIC(BIC_CorWatt) && platform->has_per_core_rapl)
outp +=
sprintf(outp, fmt8, (printed++ ? delim : ""), c->core_energy * rapl_energy_units / interval_float);
- if (DO_BIC(BIC_Cor_J) && (do_rapl & RAPL_PER_CORE_ENERGY))
+ if (DO_BIC(BIC_Cor_J) && platform->has_per_core_rapl)
outp += sprintf(outp, fmt8, (printed++ ? delim : ""), c->core_energy * rapl_energy_units);
/* print per-package data only for 1st core in package */
- if (!(t->flags & CPU_IS_FIRST_CORE_IN_PACKAGE))
+ if (!is_cpu_first_core_in_package(t, c, p))
goto done;
/* PkgTmp */
outp +=
sprintf(outp, fmt8, (printed++ ? delim : ""), p->energy_pkg * rapl_energy_units / interval_float);
- if (DO_BIC(BIC_CorWatt) && !(do_rapl & RAPL_PER_CORE_ENERGY))
+ if (DO_BIC(BIC_CorWatt) && !platform->has_per_core_rapl)
outp +=
sprintf(outp, fmt8, (printed++ ? delim : ""), p->energy_cores * rapl_energy_units / interval_float);
if (DO_BIC(BIC_GFXWatt))
p->energy_dram * rapl_dram_energy_units / interval_float);
if (DO_BIC(BIC_Pkg_J))
outp += sprintf(outp, fmt8, (printed++ ? delim : ""), p->energy_pkg * rapl_energy_units);
- if (DO_BIC(BIC_Cor_J) && !(do_rapl & RAPL_PER_CORE_ENERGY))
+ if (DO_BIC(BIC_Cor_J) && !platform->has_per_core_rapl)
outp += sprintf(outp, fmt8, (printed++ ? delim : ""), p->energy_cores * rapl_energy_units);
if (DO_BIC(BIC_GFX_J))
outp += sprintf(outp, fmt8, (printed++ ? delim : ""), p->energy_gfx * rapl_energy_units);
int soft_c1_residency_display(int bic)
{
- if (!DO_BIC(BIC_CPU_c1) || use_c1_residency_msr)
+ if (!DO_BIC(BIC_CPU_c1) || platform->has_msr_core_c1_res)
return 0;
return DO_BIC_READ(bic);
old->c1 = new->c1 - old->c1;
- if (DO_BIC(BIC_Avg_MHz) || DO_BIC(BIC_Busy) || DO_BIC(BIC_Bzy_MHz) || soft_c1_residency_display(BIC_Avg_MHz)) {
+ if (DO_BIC(BIC_Avg_MHz) || DO_BIC(BIC_Busy) || DO_BIC(BIC_Bzy_MHz) || DO_BIC(BIC_IPC)
+ || soft_c1_residency_display(BIC_Avg_MHz)) {
if ((new->aperf > old->aperf) && (new->mperf > old->mperf)) {
old->aperf = new->aperf - old->aperf;
old->mperf = new->mperf - old->mperf;
}
}
- if (use_c1_residency_msr) {
+ if (platform->has_msr_core_c1_res) {
/*
* Some models have a dedicated C1 residency MSR,
* which should be more accurate than the derivation below.
int retval = 0;
/* calculate core delta only for 1st thread in core */
- if (t->flags & CPU_IS_FIRST_THREAD_IN_CORE)
+ if (is_cpu_first_thread_in_core(t, c, p))
delta_core(c, c2);
/* always calculate thread delta */
return retval;
/* calculate package delta only for 1st core in package */
- if (t->flags & CPU_IS_FIRST_CORE_IN_PACKAGE)
+ if (is_cpu_first_core_in_package(t, c, p))
retval = delta_package(p, p2);
return retval;
t->irq_count = 0;
t->smi_count = 0;
- /* tells format_counters to dump all fields from this set */
- t->flags = CPU_IS_FIRST_THREAD_IN_CORE | CPU_IS_FIRST_CORE_IN_PACKAGE;
-
c->c3 = 0;
c->c6 = 0;
c->c7 = 0;
}
/* sum per-core values only for 1st thread in core */
- if (!(t->flags & CPU_IS_FIRST_THREAD_IN_CORE))
+ if (!is_cpu_first_thread_in_core(t, c, p))
return 0;
average.cores.c3 += c->c3;
}
/* sum per-pkg values only for 1st core in pkg */
- if (!(t->flags & CPU_IS_FIRST_CORE_IN_PACKAGE))
+ if (!is_cpu_first_core_in_package(t, c, p))
return 0;
if (DO_BIC(BIC_Totl_c0))
/* Use the global time delta for the average. */
average.threads.tv_delta = tv_delta;
- average.threads.tsc /= topo.num_cpus;
- average.threads.aperf /= topo.num_cpus;
- average.threads.mperf /= topo.num_cpus;
- average.threads.instr_count /= topo.num_cpus;
- average.threads.c1 /= topo.num_cpus;
+ average.threads.tsc /= topo.allowed_cpus;
+ average.threads.aperf /= topo.allowed_cpus;
+ average.threads.mperf /= topo.allowed_cpus;
+ average.threads.instr_count /= topo.allowed_cpus;
+ average.threads.c1 /= topo.allowed_cpus;
if (average.threads.irq_count > 9999999)
sums_need_wide_columns = 1;
- average.cores.c3 /= topo.num_cores;
- average.cores.c6 /= topo.num_cores;
- average.cores.c7 /= topo.num_cores;
- average.cores.mc6_us /= topo.num_cores;
+ average.cores.c3 /= topo.allowed_cores;
+ average.cores.c6 /= topo.allowed_cores;
+ average.cores.c7 /= topo.allowed_cores;
+ average.cores.mc6_us /= topo.allowed_cores;
if (DO_BIC(BIC_Totl_c0))
- average.packages.pkg_wtd_core_c0 /= topo.num_packages;
+ average.packages.pkg_wtd_core_c0 /= topo.allowed_packages;
if (DO_BIC(BIC_Any_c0))
- average.packages.pkg_any_core_c0 /= topo.num_packages;
+ average.packages.pkg_any_core_c0 /= topo.allowed_packages;
if (DO_BIC(BIC_GFX_c0))
- average.packages.pkg_any_gfxe_c0 /= topo.num_packages;
+ average.packages.pkg_any_gfxe_c0 /= topo.allowed_packages;
if (DO_BIC(BIC_CPUGFX))
- average.packages.pkg_both_core_gfxe_c0 /= topo.num_packages;
+ average.packages.pkg_both_core_gfxe_c0 /= topo.allowed_packages;
- average.packages.pc2 /= topo.num_packages;
+ average.packages.pc2 /= topo.allowed_packages;
if (DO_BIC(BIC_Pkgpc3))
- average.packages.pc3 /= topo.num_packages;
+ average.packages.pc3 /= topo.allowed_packages;
if (DO_BIC(BIC_Pkgpc6))
- average.packages.pc6 /= topo.num_packages;
+ average.packages.pc6 /= topo.allowed_packages;
if (DO_BIC(BIC_Pkgpc7))
- average.packages.pc7 /= topo.num_packages;
+ average.packages.pc7 /= topo.allowed_packages;
- average.packages.pc8 /= topo.num_packages;
- average.packages.pc9 /= topo.num_packages;
- average.packages.pc10 /= topo.num_packages;
+ average.packages.pc8 /= topo.allowed_packages;
+ average.packages.pc9 /= topo.allowed_packages;
+ average.packages.pc10 /= topo.allowed_packages;
for (i = 0, mp = sys.tp; mp; i++, mp = mp->next) {
if (mp->format == FORMAT_RAW)
sums_need_wide_columns = 1;
continue;
}
- average.threads.counter[i] /= topo.num_cpus;
+ average.threads.counter[i] /= topo.allowed_cpus;
}
for (i = 0, mp = sys.cp; mp; i++, mp = mp->next) {
if (mp->format == FORMAT_RAW)
if (average.cores.counter[i] > 9999999)
sums_need_wide_columns = 1;
}
- average.cores.counter[i] /= topo.num_cores;
+ average.cores.counter[i] /= topo.allowed_cores;
}
for (i = 0, mp = sys.pp; mp; i++, mp = mp->next) {
if (mp->format == FORMAT_RAW)
if (average.packages.counter[i] > 9999999)
sums_need_wide_columns = 1;
}
- average.packages.counter[i] /= topo.num_packages;
+ average.packages.counter[i] /= topo.allowed_packages;
}
}
retry:
t->tsc = rdtsc(); /* we are running on local CPU of interest */
- if (DO_BIC(BIC_Avg_MHz) || DO_BIC(BIC_Busy) || DO_BIC(BIC_Bzy_MHz) || soft_c1_residency_display(BIC_Avg_MHz)) {
+ if (DO_BIC(BIC_Avg_MHz) || DO_BIC(BIC_Busy) || DO_BIC(BIC_Bzy_MHz) || DO_BIC(BIC_IPC)
+ || soft_c1_residency_display(BIC_Avg_MHz)) {
unsigned long long tsc_before, tsc_between, tsc_after, aperf_time, mperf_time;
/*
return -5;
t->smi_count = msr & 0xFFFFFFFF;
}
- if (DO_BIC(BIC_CPU_c1) && use_c1_residency_msr) {
+ if (DO_BIC(BIC_CPU_c1) && platform->has_msr_core_c1_res) {
if (get_msr(cpu, MSR_CORE_C1_RES, &t->c1))
return -6;
}
}
/* collect core counters only for 1st thread in core */
- if (!(t->flags & CPU_IS_FIRST_THREAD_IN_CORE))
+ if (!is_cpu_first_thread_in_core(t, c, p))
goto done;
if (DO_BIC(BIC_CPU_c3) || soft_c1_residency_display(BIC_CPU_c3)) {
return -6;
}
- if ((DO_BIC(BIC_CPU_c6) || soft_c1_residency_display(BIC_CPU_c6)) && !do_knl_cstates) {
+ if ((DO_BIC(BIC_CPU_c6) || soft_c1_residency_display(BIC_CPU_c6)) && !platform->has_msr_knl_core_c6_residency) {
if (get_msr(cpu, MSR_CORE_C6_RESIDENCY, &c->c6))
return -7;
- } else if (do_knl_cstates || soft_c1_residency_display(BIC_CPU_c6)) {
+ } else if (platform->has_msr_knl_core_c6_residency && soft_c1_residency_display(BIC_CPU_c6)) {
if (get_msr(cpu, MSR_KNL_CORE_C6_RESIDENCY, &c->c6))
return -7;
}
if (DO_BIC(BIC_CORE_THROT_CNT))
get_core_throt_cnt(cpu, &c->core_throt_cnt);
- if (do_rapl & RAPL_AMD_F17H) {
+ if (platform->rapl_msrs & RAPL_AMD_F17H) {
if (get_msr(cpu, MSR_CORE_ENERGY_STAT, &msr))
return -14;
c->core_energy = msr & 0xFFFFFFFF;
}
/* collect package counters only for 1st core in package */
- if (!(t->flags & CPU_IS_FIRST_CORE_IN_PACKAGE))
+ if (!is_cpu_first_core_in_package(t, c, p))
goto done;
if (DO_BIC(BIC_Totl_c0)) {
if (get_msr(cpu, MSR_PKG_C3_RESIDENCY, &p->pc3))
return -9;
if (DO_BIC(BIC_Pkgpc6)) {
- if (do_slm_cstates) {
+ if (platform->has_msr_atom_pkg_c6_residency) {
if (get_msr(cpu, MSR_ATOM_PKG_C6_RESIDENCY, &p->pc6))
return -10;
} else {
if (DO_BIC(BIC_SYS_LPI))
p->sys_lpi = cpuidle_cur_sys_lpi_us;
- if (do_rapl & RAPL_PKG) {
+ if (platform->rapl_msrs & RAPL_PKG) {
if (get_msr_sum(cpu, MSR_PKG_ENERGY_STATUS, &msr))
return -13;
p->energy_pkg = msr;
}
- if (do_rapl & RAPL_CORES_ENERGY_STATUS) {
+ if (platform->rapl_msrs & RAPL_CORE_ENERGY_STATUS) {
if (get_msr_sum(cpu, MSR_PP0_ENERGY_STATUS, &msr))
return -14;
p->energy_cores = msr;
}
- if (do_rapl & RAPL_DRAM) {
+ if (platform->rapl_msrs & RAPL_DRAM) {
if (get_msr_sum(cpu, MSR_DRAM_ENERGY_STATUS, &msr))
return -15;
p->energy_dram = msr;
}
- if (do_rapl & RAPL_GFX) {
+ if (platform->rapl_msrs & RAPL_GFX) {
if (get_msr_sum(cpu, MSR_PP1_ENERGY_STATUS, &msr))
return -16;
p->energy_gfx = msr;
}
- if (do_rapl & RAPL_PKG_PERF_STATUS) {
+ if (platform->rapl_msrs & RAPL_PKG_PERF_STATUS) {
if (get_msr_sum(cpu, MSR_PKG_PERF_STATUS, &msr))
return -16;
p->rapl_pkg_perf_status = msr;
}
- if (do_rapl & RAPL_DRAM_PERF_STATUS) {
+ if (platform->rapl_msrs & RAPL_DRAM_PERF_STATUS) {
if (get_msr_sum(cpu, MSR_DRAM_PERF_STATUS, &msr))
return -16;
p->rapl_dram_perf_status = msr;
}
- if (do_rapl & RAPL_AMD_F17H) {
+ if (platform->rapl_msrs & RAPL_AMD_F17H) {
if (get_msr_sum(cpu, MSR_PKG_ENERGY_STAT, &msr))
return -13;
p->energy_pkg = msr;
PCLRSV, PCLRSV
};
-static void calculate_tsc_tweak()
+void probe_cst_limit(void)
{
- tsc_tweak = base_hz / tsc_hz;
-}
+ unsigned long long msr;
+ int *pkg_cstate_limits;
+
+ if (!platform->has_nhm_msrs)
+ return;
+
+ switch (platform->cst_limit) {
+ case CST_LIMIT_NHM:
+ pkg_cstate_limits = nhm_pkg_cstate_limits;
+ break;
+ case CST_LIMIT_SNB:
+ pkg_cstate_limits = snb_pkg_cstate_limits;
+ break;
+ case CST_LIMIT_HSW:
+ pkg_cstate_limits = hsw_pkg_cstate_limits;
+ break;
+ case CST_LIMIT_SKX:
+ pkg_cstate_limits = skx_pkg_cstate_limits;
+ break;
+ case CST_LIMIT_ICX:
+ pkg_cstate_limits = icx_pkg_cstate_limits;
+ break;
+ case CST_LIMIT_SLV:
+ pkg_cstate_limits = slv_pkg_cstate_limits;
+ break;
+ case CST_LIMIT_AMT:
+ pkg_cstate_limits = amt_pkg_cstate_limits;
+ break;
+ case CST_LIMIT_KNL:
+ pkg_cstate_limits = phi_pkg_cstate_limits;
+ break;
+ case CST_LIMIT_GMT:
+ pkg_cstate_limits = glm_pkg_cstate_limits;
+ break;
+ default:
+ return;
+ }
-void prewake_cstate_probe(unsigned int family, unsigned int model);
+ get_msr(base_cpu, MSR_PKG_CST_CONFIG_CONTROL, &msr);
+ pkg_cstate_limit = pkg_cstate_limits[msr & 0xF];
+}
-static void dump_nhm_platform_info(void)
+static void dump_platform_info(void)
{
unsigned long long msr;
unsigned int ratio;
+ if (!platform->has_nhm_msrs)
+ return;
+
get_msr(base_cpu, MSR_PLATFORM_INFO, &msr);
fprintf(outf, "cpu%d: MSR_PLATFORM_INFO: 0x%08llx\n", base_cpu, msr);
ratio = (msr >> 8) & 0xFF;
fprintf(outf, "%d * %.1f = %.1f MHz base frequency\n", ratio, bclk, ratio * bclk);
+}
+
+static void dump_power_ctl(void)
+{
+ unsigned long long msr;
+
+ if (!platform->has_nhm_msrs)
+ return;
get_msr(base_cpu, MSR_IA32_POWER_CTL, &msr);
fprintf(outf, "cpu%d: MSR_IA32_POWER_CTL: 0x%08llx (C1E auto-promotion: %sabled)\n",
base_cpu, msr, msr & 0x2 ? "EN" : "DIS");
/* C-state Pre-wake Disable (CSTATE_PREWAKE_DISABLE) */
- if (dis_cstate_prewake)
+ if (platform->has_cst_prewake_bit)
fprintf(outf, "C-state Pre-wake: %sabled\n", msr & 0x40000000 ? "DIS" : "EN");
return;
}
-static void dump_hsw_turbo_ratio_limits(void)
+static void dump_turbo_ratio_limit2(void)
{
unsigned long long msr;
unsigned int ratio;
return;
}
-static void dump_ivt_turbo_ratio_limits(void)
+static void dump_turbo_ratio_limit1(void)
{
unsigned long long msr;
unsigned int ratio;
return;
}
-int has_turbo_ratio_group_limits(int family, int model)
-{
-
- if (!genuine_intel)
- return 0;
-
- if (family != 6)
- return 0;
-
- switch (model) {
- case INTEL_FAM6_ATOM_GOLDMONT:
- case INTEL_FAM6_SKYLAKE_X:
- case INTEL_FAM6_ICELAKE_X:
- case INTEL_FAM6_SAPPHIRERAPIDS_X:
- case INTEL_FAM6_ATOM_GOLDMONT_D:
- case INTEL_FAM6_ATOM_TREMONT_D:
- return 1;
- default:
- return 0;
- }
-}
-
-static void dump_turbo_ratio_limits(int trl_msr_offset, int family, int model)
+static void dump_turbo_ratio_limits(int trl_msr_offset)
{
unsigned long long msr, core_counts;
int shift;
fprintf(outf, "cpu%d: MSR_%sTURBO_RATIO_LIMIT: 0x%08llx\n",
base_cpu, trl_msr_offset == MSR_SECONDARY_TURBO_RATIO_LIMIT ? "SECONDARY_" : "", msr);
- if (has_turbo_ratio_group_limits(family, model)) {
+ if (platform->trl_msrs & TRL_CORECOUNT) {
get_msr(base_cpu, MSR_TURBO_RATIO_LIMIT1, &core_counts);
fprintf(outf, "cpu%d: MSR_TURBO_RATIO_LIMIT1: 0x%08llx\n", base_cpu, core_counts);
} else {
ratio[i], bclk, ratio[i] * bclk, cores[i]);
}
-static void dump_nhm_cst_cfg(void)
+static void dump_cst_cfg(void)
{
unsigned long long msr;
+ if (!platform->has_nhm_msrs)
+ return;
+
get_msr(base_cpu, MSR_PKG_CST_CONFIG_CONTROL, &msr);
fprintf(outf, "cpu%d: MSR_PKG_CST_CONFIG_CONTROL: 0x%08llx", base_cpu, msr);
(msr & (1 << 15)) ? "" : "UN", (unsigned int)msr & 0xF, pkg_cstate_limit_strings[pkg_cstate_limit]);
#define AUTOMATIC_CSTATE_CONVERSION (1UL << 16)
- if (has_automatic_cstate_conversion) {
+ if (platform->has_cst_auto_convension) {
fprintf(outf, ", automatic c-state conversion=%s", (msr & AUTOMATIC_CSTATE_CONVERSION) ? "on" : "off");
}
{
unsigned long long msr;
- get_msr(base_cpu, MSR_PKGC3_IRTL, &msr);
- fprintf(outf, "cpu%d: MSR_PKGC3_IRTL: 0x%08llx (", base_cpu, msr);
- fprintf(outf, "%svalid, %lld ns)\n", msr & (1 << 15) ? "" : "NOT",
- (msr & 0x3FF) * irtl_time_units[(msr >> 10) & 0x3]);
-
- get_msr(base_cpu, MSR_PKGC6_IRTL, &msr);
- fprintf(outf, "cpu%d: MSR_PKGC6_IRTL: 0x%08llx (", base_cpu, msr);
- fprintf(outf, "%svalid, %lld ns)\n", msr & (1 << 15) ? "" : "NOT",
- (msr & 0x3FF) * irtl_time_units[(msr >> 10) & 0x3]);
+ if (!platform->has_irtl_msrs)
+ return;
- get_msr(base_cpu, MSR_PKGC7_IRTL, &msr);
- fprintf(outf, "cpu%d: MSR_PKGC7_IRTL: 0x%08llx (", base_cpu, msr);
- fprintf(outf, "%svalid, %lld ns)\n", msr & (1 << 15) ? "" : "NOT",
- (msr & 0x3FF) * irtl_time_units[(msr >> 10) & 0x3]);
+ if (platform->supported_cstates & PC3) {
+ get_msr(base_cpu, MSR_PKGC3_IRTL, &msr);
+ fprintf(outf, "cpu%d: MSR_PKGC3_IRTL: 0x%08llx (", base_cpu, msr);
+ fprintf(outf, "%svalid, %lld ns)\n", msr & (1 << 15) ? "" : "NOT",
+ (msr & 0x3FF) * irtl_time_units[(msr >> 10) & 0x3]);
+ }
- if (!do_irtl_hsw)
- return;
+ if (platform->supported_cstates & PC6) {
+ get_msr(base_cpu, MSR_PKGC6_IRTL, &msr);
+ fprintf(outf, "cpu%d: MSR_PKGC6_IRTL: 0x%08llx (", base_cpu, msr);
+ fprintf(outf, "%svalid, %lld ns)\n", msr & (1 << 15) ? "" : "NOT",
+ (msr & 0x3FF) * irtl_time_units[(msr >> 10) & 0x3]);
+ }
- get_msr(base_cpu, MSR_PKGC8_IRTL, &msr);
- fprintf(outf, "cpu%d: MSR_PKGC8_IRTL: 0x%08llx (", base_cpu, msr);
- fprintf(outf, "%svalid, %lld ns)\n", msr & (1 << 15) ? "" : "NOT",
- (msr & 0x3FF) * irtl_time_units[(msr >> 10) & 0x3]);
+ if (platform->supported_cstates & PC7) {
+ get_msr(base_cpu, MSR_PKGC7_IRTL, &msr);
+ fprintf(outf, "cpu%d: MSR_PKGC7_IRTL: 0x%08llx (", base_cpu, msr);
+ fprintf(outf, "%svalid, %lld ns)\n", msr & (1 << 15) ? "" : "NOT",
+ (msr & 0x3FF) * irtl_time_units[(msr >> 10) & 0x3]);
+ }
- get_msr(base_cpu, MSR_PKGC9_IRTL, &msr);
- fprintf(outf, "cpu%d: MSR_PKGC9_IRTL: 0x%08llx (", base_cpu, msr);
- fprintf(outf, "%svalid, %lld ns)\n", msr & (1 << 15) ? "" : "NOT",
- (msr & 0x3FF) * irtl_time_units[(msr >> 10) & 0x3]);
+ if (platform->supported_cstates & PC8) {
+ get_msr(base_cpu, MSR_PKGC8_IRTL, &msr);
+ fprintf(outf, "cpu%d: MSR_PKGC8_IRTL: 0x%08llx (", base_cpu, msr);
+ fprintf(outf, "%svalid, %lld ns)\n", msr & (1 << 15) ? "" : "NOT",
+ (msr & 0x3FF) * irtl_time_units[(msr >> 10) & 0x3]);
+ }
- get_msr(base_cpu, MSR_PKGC10_IRTL, &msr);
- fprintf(outf, "cpu%d: MSR_PKGC10_IRTL: 0x%08llx (", base_cpu, msr);
- fprintf(outf, "%svalid, %lld ns)\n", msr & (1 << 15) ? "" : "NOT",
- (msr & 0x3FF) * irtl_time_units[(msr >> 10) & 0x3]);
+ if (platform->supported_cstates & PC9) {
+ get_msr(base_cpu, MSR_PKGC9_IRTL, &msr);
+ fprintf(outf, "cpu%d: MSR_PKGC9_IRTL: 0x%08llx (", base_cpu, msr);
+ fprintf(outf, "%svalid, %lld ns)\n", msr & (1 << 15) ? "" : "NOT",
+ (msr & 0x3FF) * irtl_time_units[(msr >> 10) & 0x3]);
+ }
+ if (platform->supported_cstates & PC10) {
+ get_msr(base_cpu, MSR_PKGC10_IRTL, &msr);
+ fprintf(outf, "cpu%d: MSR_PKGC10_IRTL: 0x%08llx (", base_cpu, msr);
+ fprintf(outf, "%svalid, %lld ns)\n", msr & (1 << 15) ? "" : "NOT",
+ (msr & 0x3FF) * irtl_time_units[(msr >> 10) & 0x3]);
+ }
}
void free_fd_percpu(void)
cpu_present_set = NULL;
cpu_present_setsize = 0;
+ CPU_FREE(cpu_effective_set);
+ cpu_effective_set = NULL;
+ cpu_effective_setsize = 0;
+
+ CPU_FREE(cpu_allowed_set);
+ cpu_allowed_set = NULL;
+ cpu_allowed_setsize = 0;
+
CPU_FREE(cpu_affinity_set);
cpu_affinity_set = NULL;
cpu_affinity_setsize = 0;
return -1;
}
-int get_thread_siblings(struct cpu_topology *thiscpu)
+static int parse_cpu_str(char *cpu_str, cpu_set_t *cpu_set, int cpu_set_size)
{
- char path[80], character;
- FILE *filep;
- unsigned long map;
- int so, shift, sib_core;
- int cpu = thiscpu->logical_cpu_id;
- int offset = topo.max_cpu_num + 1;
- size_t size;
- int thread_id = 0;
+ unsigned int start, end;
+ char *next = cpu_str;
- thiscpu->put_ids = CPU_ALLOC((topo.max_cpu_num + 1));
- if (thiscpu->thread_id < 0)
- thiscpu->thread_id = thread_id++;
- if (!thiscpu->put_ids)
- return -1;
+ while (next && *next) {
- size = CPU_ALLOC_SIZE((topo.max_cpu_num + 1));
- CPU_ZERO_S(size, thiscpu->put_ids);
+ if (*next == '-') /* no negative cpu numbers */
+ return 1;
- sprintf(path, "/sys/devices/system/cpu/cpu%d/topology/thread_siblings", cpu);
- filep = fopen(path, "r");
+ start = strtoul(next, &next, 10);
- if (!filep) {
- warnx("%s: open failed", path);
- return -1;
- }
- do {
- offset -= BITMASK_SIZE;
- if (fscanf(filep, "%lx%c", &map, &character) != 2)
- err(1, "%s: failed to parse file", path);
- for (shift = 0; shift < BITMASK_SIZE; shift++) {
- if ((map >> shift) & 0x1) {
- so = shift + offset;
- sib_core = get_core_id(so);
- if (sib_core == thiscpu->physical_core_id) {
- CPU_SET_S(so, size, thiscpu->put_ids);
- if ((so != cpu) && (cpus[so].thread_id < 0))
- cpus[so].thread_id = thread_id++;
- }
- }
- }
- } while (character == ',');
+ if (start >= CPU_SUBSET_MAXCPUS)
+ return 1;
+ CPU_SET_S(start, cpu_set_size, cpu_set);
+
+ if (*next == '\0' || *next == '\n')
+ break;
+
+ if (*next == ',') {
+ next += 1;
+ continue;
+ }
+
+ if (*next == '-') {
+ next += 1; /* start range */
+ } else if (*next == '.') {
+ next += 1;
+ if (*next == '.')
+ next += 1; /* start range */
+ else
+ return 1;
+ }
+
+ end = strtoul(next, &next, 10);
+ if (end <= start)
+ return 1;
+
+ while (++start <= end) {
+ if (start >= CPU_SUBSET_MAXCPUS)
+ return 1;
+ CPU_SET_S(start, cpu_set_size, cpu_set);
+ }
+
+ if (*next == ',')
+ next += 1;
+ else if (*next != '\0' && *next != '\n')
+ return 1;
+ }
+
+ return 0;
+}
+
+int get_thread_siblings(struct cpu_topology *thiscpu)
+{
+ char path[80], character;
+ FILE *filep;
+ unsigned long map;
+ int so, shift, sib_core;
+ int cpu = thiscpu->logical_cpu_id;
+ int offset = topo.max_cpu_num + 1;
+ size_t size;
+ int thread_id = 0;
+
+ thiscpu->put_ids = CPU_ALLOC((topo.max_cpu_num + 1));
+ if (thiscpu->thread_id < 0)
+ thiscpu->thread_id = thread_id++;
+ if (!thiscpu->put_ids)
+ return -1;
+
+ size = CPU_ALLOC_SIZE((topo.max_cpu_num + 1));
+ CPU_ZERO_S(size, thiscpu->put_ids);
+
+ sprintf(path, "/sys/devices/system/cpu/cpu%d/topology/thread_siblings", cpu);
+ filep = fopen(path, "r");
+
+ if (!filep) {
+ warnx("%s: open failed", path);
+ return -1;
+ }
+ do {
+ offset -= BITMASK_SIZE;
+ if (fscanf(filep, "%lx%c", &map, &character) != 2)
+ err(1, "%s: failed to parse file", path);
+ for (shift = 0; shift < BITMASK_SIZE; shift++) {
+ if ((map >> shift) & 0x1) {
+ so = shift + offset;
+ sib_core = get_core_id(so);
+ if (sib_core == thiscpu->physical_core_id) {
+ CPU_SET_S(so, size, thiscpu->put_ids);
+ if ((so != cpu) && (cpus[so].thread_id < 0))
+ cpus[so].thread_id = thread_id++;
+ }
+ }
+ }
+ } while (character == ',');
fclose(filep);
return CPU_COUNT_S(size, thiscpu->put_ids);
t = GET_THREAD(thread_base, thread_no, core_no, node_no, pkg_no);
- if (cpu_is_not_present(t->cpu_id))
+ if (cpu_is_not_allowed(t->cpu_id))
continue;
t2 = GET_THREAD(thread_base2, thread_no, core_no, node_no, pkg_no);
return 0;
}
+#define PATH_EFFECTIVE_CPUS "/sys/fs/cgroup/cpuset.cpus.effective"
+
+static char cpu_effective_str[1024];
+
+static int update_effective_str(bool startup)
+{
+ FILE *fp;
+ char *pos;
+ char buf[1024];
+ int ret;
+
+ if (cpu_effective_str[0] == '\0' && !startup)
+ return 0;
+
+ fp = fopen(PATH_EFFECTIVE_CPUS, "r");
+ if (!fp)
+ return 0;
+
+ pos = fgets(buf, 1024, fp);
+ if (!pos)
+ err(1, "%s: file read failed\n", PATH_EFFECTIVE_CPUS);
+
+ fclose(fp);
+
+ ret = strncmp(cpu_effective_str, buf, 1024);
+ if (!ret)
+ return 0;
+
+ strncpy(cpu_effective_str, buf, 1024);
+ return 1;
+}
+
+static void update_effective_set(bool startup)
+{
+ update_effective_str(startup);
+
+ if (parse_cpu_str(cpu_effective_str, cpu_effective_set, cpu_effective_setsize))
+ err(1, "%s: cpu str malformat %s\n", PATH_EFFECTIVE_CPUS, cpu_effective_str);
+}
+
void re_initialize(void)
{
free_all_buffers();
- setup_all_buffers();
- fprintf(outf, "turbostat: re-initialized with num_cpus %d\n", topo.num_cpus);
+ setup_all_buffers(false);
+ fprintf(outf, "turbostat: re-initialized with num_cpus %d, allowed_cpus %d\n", topo.num_cpus, topo.allowed_cpus);
}
void set_max_cpu_num(void)
/*
* snapshot_gfx_mhz()
*
- * record snapshot of
- * /sys/class/graphics/fb0/device/drm/card0/gt_cur_freq_mhz
+ * fall back to /sys/class/graphics/fb0/device/drm/card0/gt_cur_freq_mhz
+ * when /sys/class/drm/card0/gt_cur_freq_mhz is not available.
*
* return 1 if config change requires a restart, else return 0
*/
static FILE *fp;
int retval;
- if (fp == NULL)
- fp = fopen_or_die("/sys/class/graphics/fb0/device/drm/card0/gt_cur_freq_mhz", "r");
- else {
+ if (fp == NULL) {
+ fp = fopen("/sys/class/drm/card0/gt_cur_freq_mhz", "r");
+ if (!fp)
+ fp = fopen_or_die("/sys/class/graphics/fb0/device/drm/card0/gt_cur_freq_mhz", "r");
+ } else {
rewind(fp);
fflush(fp);
}
/*
* snapshot_gfx_cur_mhz()
*
- * record snapshot of
- * /sys/class/graphics/fb0/device/drm/card0/gt_act_freq_mhz
+ * fall back to /sys/class/graphics/fb0/device/drm/card0/gt_act_freq_mhz
+ * when /sys/class/drm/card0/gt_act_freq_mhz is not available.
*
* return 1 if config change requires a restart, else return 0
*/
static FILE *fp;
int retval;
- if (fp == NULL)
- fp = fopen_or_die("/sys/class/graphics/fb0/device/drm/card0/gt_act_freq_mhz", "r");
- else {
+ if (fp == NULL) {
+ fp = fopen("/sys/class/drm/card0/gt_act_freq_mhz", "r");
+ if (!fp)
+ fp = fopen_or_die("/sys/class/graphics/fb0/device/drm/card0/gt_act_freq_mhz", "r");
+ } else {
rewind(fp);
fflush(fp);
}
re_initialize();
goto restart;
}
+ if (update_effective_str(false)) {
+ re_initialize();
+ goto restart;
+ }
do_sleep();
if (snapshot_proc_sysfs_files())
goto restart;
exit(-6);
}
-/*
- * NHM adds support for additional MSRs:
- *
- * MSR_SMI_COUNT 0x00000034
- *
- * MSR_PLATFORM_INFO 0x000000ce
- * MSR_PKG_CST_CONFIG_CONTROL 0x000000e2
- *
- * MSR_MISC_PWR_MGMT 0x000001aa
- *
- * MSR_PKG_C3_RESIDENCY 0x000003f8
- * MSR_PKG_C6_RESIDENCY 0x000003f9
- * MSR_CORE_C3_RESIDENCY 0x000003fc
- * MSR_CORE_C6_RESIDENCY 0x000003fd
- *
- * Side effect:
- * sets global pkg_cstate_limit to decode MSR_PKG_CST_CONFIG_CONTROL
- * sets has_misc_feature_control
- */
-int probe_nhm_msrs(unsigned int family, unsigned int model)
+void probe_bclk(void)
{
unsigned long long msr;
unsigned int base_ratio;
- int *pkg_cstate_limits;
- if (!genuine_intel)
- return 0;
-
- if (family != 6)
- return 0;
-
- bclk = discover_bclk(family, model);
+ if (!platform->has_nhm_msrs)
+ return;
- switch (model) {
- case INTEL_FAM6_NEHALEM: /* Core i7 and i5 Processor - Clarksfield, Lynnfield, Jasper Forest */
- case INTEL_FAM6_NEHALEM_EX: /* Nehalem-EX Xeon - Beckton */
- pkg_cstate_limits = nhm_pkg_cstate_limits;
- break;
- case INTEL_FAM6_SANDYBRIDGE: /* SNB */
- case INTEL_FAM6_SANDYBRIDGE_X: /* SNB Xeon */
- case INTEL_FAM6_IVYBRIDGE: /* IVB */
- case INTEL_FAM6_IVYBRIDGE_X: /* IVB Xeon */
- pkg_cstate_limits = snb_pkg_cstate_limits;
- has_misc_feature_control = 1;
- break;
- case INTEL_FAM6_HASWELL: /* HSW */
- case INTEL_FAM6_HASWELL_G: /* HSW */
- case INTEL_FAM6_HASWELL_X: /* HSX */
- case INTEL_FAM6_HASWELL_L: /* HSW */
- case INTEL_FAM6_BROADWELL: /* BDW */
- case INTEL_FAM6_BROADWELL_G: /* BDW */
- case INTEL_FAM6_BROADWELL_X: /* BDX */
- case INTEL_FAM6_SKYLAKE_L: /* SKL */
- case INTEL_FAM6_CANNONLAKE_L: /* CNL */
- pkg_cstate_limits = hsw_pkg_cstate_limits;
- has_misc_feature_control = 1;
- break;
- case INTEL_FAM6_SKYLAKE_X: /* SKX */
- case INTEL_FAM6_SAPPHIRERAPIDS_X: /* SPR */
- pkg_cstate_limits = skx_pkg_cstate_limits;
- has_misc_feature_control = 1;
- break;
- case INTEL_FAM6_ICELAKE_X: /* ICX */
- pkg_cstate_limits = icx_pkg_cstate_limits;
- has_misc_feature_control = 1;
- break;
- case INTEL_FAM6_ATOM_SILVERMONT: /* BYT */
- no_MSR_MISC_PWR_MGMT = 1;
- /* FALLTHRU */
- case INTEL_FAM6_ATOM_SILVERMONT_D: /* AVN */
- pkg_cstate_limits = slv_pkg_cstate_limits;
- break;
- case INTEL_FAM6_ATOM_AIRMONT: /* AMT */
- pkg_cstate_limits = amt_pkg_cstate_limits;
- no_MSR_MISC_PWR_MGMT = 1;
- break;
- case INTEL_FAM6_XEON_PHI_KNL: /* PHI */
- pkg_cstate_limits = phi_pkg_cstate_limits;
- break;
- case INTEL_FAM6_ATOM_GOLDMONT: /* BXT */
- case INTEL_FAM6_ATOM_GOLDMONT_PLUS:
- case INTEL_FAM6_ATOM_GOLDMONT_D: /* DNV */
- case INTEL_FAM6_ATOM_TREMONT: /* EHL */
- case INTEL_FAM6_ATOM_TREMONT_D: /* JVL */
- pkg_cstate_limits = glm_pkg_cstate_limits;
- break;
- default:
- return 0;
- }
- get_msr(base_cpu, MSR_PKG_CST_CONFIG_CONTROL, &msr);
- pkg_cstate_limit = pkg_cstate_limits[msr & 0xF];
+ if (platform->bclk_freq == BCLK_100MHZ)
+ bclk = 100.00;
+ else if (platform->bclk_freq == BCLK_133MHZ)
+ bclk = 133.33;
+ else if (platform->bclk_freq == BCLK_SLV)
+ bclk = slm_bclk();
+ else
+ return;
get_msr(base_cpu, MSR_PLATFORM_INFO, &msr);
base_ratio = (msr >> 8) & 0xFF;
base_hz = base_ratio * bclk * 1000000;
has_base_hz = 1;
- return 1;
-}
-/*
- * SLV client has support for unique MSRs:
- *
- * MSR_CC6_DEMOTION_POLICY_CONFIG
- * MSR_MC6_DEMOTION_POLICY_CONFIG
- */
+ if (platform->enable_tsc_tweak)
+ tsc_tweak = base_hz / tsc_hz;
+}
-int has_slv_msrs(unsigned int family, unsigned int model)
+static void remove_underbar(char *s)
{
- if (!genuine_intel)
- return 0;
-
- if (family != 6)
- return 0;
+ char *to = s;
- switch (model) {
- case INTEL_FAM6_ATOM_SILVERMONT:
- case INTEL_FAM6_ATOM_SILVERMONT_MID:
- case INTEL_FAM6_ATOM_AIRMONT_MID:
- return 1;
+ while (*s) {
+ if (*s != '_')
+ *to++ = *s;
+ s++;
}
- return 0;
+
+ *to = 0;
}
-int is_dnv(unsigned int family, unsigned int model)
+static void dump_turbo_ratio_info(void)
{
+ if (!has_turbo)
+ return;
- if (!genuine_intel)
- return 0;
+ if (!platform->has_nhm_msrs)
+ return;
- if (family != 6)
- return 0;
+ if (platform->trl_msrs & TRL_LIMIT2)
+ dump_turbo_ratio_limit2();
- switch (model) {
- case INTEL_FAM6_ATOM_GOLDMONT_D:
- return 1;
- }
- return 0;
-}
+ if (platform->trl_msrs & TRL_LIMIT1)
+ dump_turbo_ratio_limit1();
-int is_bdx(unsigned int family, unsigned int model)
-{
+ if (platform->trl_msrs & TRL_BASE) {
+ dump_turbo_ratio_limits(MSR_TURBO_RATIO_LIMIT);
- if (!genuine_intel)
- return 0;
+ if (is_hybrid)
+ dump_turbo_ratio_limits(MSR_SECONDARY_TURBO_RATIO_LIMIT);
+ }
- if (family != 6)
- return 0;
+ if (platform->trl_msrs & TRL_ATOM)
+ dump_atom_turbo_ratio_limits();
- switch (model) {
- case INTEL_FAM6_BROADWELL_X:
- return 1;
- }
- return 0;
+ if (platform->trl_msrs & TRL_KNL)
+ dump_knl_turbo_ratio_limits();
+
+ if (platform->has_config_tdp)
+ dump_config_tdp();
}
-int is_skx(unsigned int family, unsigned int model)
+static int read_sysfs_int(char *path)
{
+ FILE *input;
+ int retval = -1;
- if (!genuine_intel)
- return 0;
-
- if (family != 6)
- return 0;
-
- switch (model) {
- case INTEL_FAM6_SKYLAKE_X:
- return 1;
+ input = fopen(path, "r");
+ if (input == NULL) {
+ if (debug)
+ fprintf(outf, "NSFOD %s\n", path);
+ return (-1);
}
- return 0;
+ if (fscanf(input, "%d", &retval) != 1)
+ err(1, "%s: failed to read int from file", path);
+ fclose(input);
+
+ return (retval);
}
-int is_icx(unsigned int family, unsigned int model)
+static void dump_sysfs_file(char *path)
{
+ FILE *input;
+ char cpuidle_buf[64];
- if (!genuine_intel)
- return 0;
-
- if (family != 6)
- return 0;
-
- switch (model) {
- case INTEL_FAM6_ICELAKE_X:
- return 1;
+ input = fopen(path, "r");
+ if (input == NULL) {
+ if (debug)
+ fprintf(outf, "NSFOD %s\n", path);
+ return;
}
- return 0;
+ if (!fgets(cpuidle_buf, sizeof(cpuidle_buf), input))
+ err(1, "%s: failed to read file", path);
+ fclose(input);
+
+ fprintf(outf, "%s: %s", strrchr(path, '/') + 1, cpuidle_buf);
}
-int is_spr(unsigned int family, unsigned int model)
+static void probe_intel_uncore_frequency(void)
{
+ int i, j;
+ char path[128];
if (!genuine_intel)
- return 0;
+ return;
- if (family != 6)
- return 0;
+ if (access("/sys/devices/system/cpu/intel_uncore_frequency/package_00_die_00", R_OK))
+ return;
- switch (model) {
- case INTEL_FAM6_SAPPHIRERAPIDS_X:
- return 1;
- }
- return 0;
-}
+ /* Cluster level sysfs not supported yet. */
+ if (!access("/sys/devices/system/cpu/intel_uncore_frequency/uncore00", R_OK))
+ return;
-int is_ehl(unsigned int family, unsigned int model)
-{
- if (!genuine_intel)
- return 0;
+ if (!access("/sys/devices/system/cpu/intel_uncore_frequency/package_00_die_00/current_freq_khz", R_OK))
+ BIC_PRESENT(BIC_UNCORE_MHZ);
- if (family != 6)
- return 0;
-
- switch (model) {
- case INTEL_FAM6_ATOM_TREMONT:
- return 1;
- }
- return 0;
-}
-
-int is_jvl(unsigned int family, unsigned int model)
-{
- if (!genuine_intel)
- return 0;
-
- if (family != 6)
- return 0;
-
- switch (model) {
- case INTEL_FAM6_ATOM_TREMONT_D:
- return 1;
- }
- return 0;
-}
-
-int has_turbo_ratio_limit(unsigned int family, unsigned int model)
-{
- if (has_slv_msrs(family, model))
- return 0;
-
- if (family != 6)
- return 0;
-
- switch (model) {
- /* Nehalem compatible, but do not include turbo-ratio limit support */
- case INTEL_FAM6_NEHALEM_EX: /* Nehalem-EX Xeon - Beckton */
- case INTEL_FAM6_XEON_PHI_KNL: /* PHI - Knights Landing (different MSR definition) */
- return 0;
- default:
- return 1;
- }
-}
-
-int has_atom_turbo_ratio_limit(unsigned int family, unsigned int model)
-{
- if (has_slv_msrs(family, model))
- return 1;
-
- return 0;
-}
-
-int has_ivt_turbo_ratio_limit(unsigned int family, unsigned int model)
-{
- if (!genuine_intel)
- return 0;
-
- if (family != 6)
- return 0;
-
- switch (model) {
- case INTEL_FAM6_IVYBRIDGE_X: /* IVB Xeon */
- case INTEL_FAM6_HASWELL_X: /* HSW Xeon */
- return 1;
- default:
- return 0;
- }
-}
-
-int has_hsw_turbo_ratio_limit(unsigned int family, unsigned int model)
-{
- if (!genuine_intel)
- return 0;
-
- if (family != 6)
- return 0;
-
- switch (model) {
- case INTEL_FAM6_HASWELL_X: /* HSW Xeon */
- return 1;
- default:
- return 0;
- }
-}
-
-int has_knl_turbo_ratio_limit(unsigned int family, unsigned int model)
-{
- if (!genuine_intel)
- return 0;
-
- if (family != 6)
- return 0;
-
- switch (model) {
- case INTEL_FAM6_XEON_PHI_KNL: /* Knights Landing */
- return 1;
- default:
- return 0;
- }
-}
-
-int has_glm_turbo_ratio_limit(unsigned int family, unsigned int model)
-{
- if (!genuine_intel)
- return 0;
-
- if (family != 6)
- return 0;
-
- switch (model) {
- case INTEL_FAM6_ATOM_GOLDMONT:
- case INTEL_FAM6_SKYLAKE_X:
- case INTEL_FAM6_ICELAKE_X:
- case INTEL_FAM6_SAPPHIRERAPIDS_X:
- return 1;
- default:
- return 0;
- }
-}
-
-int has_config_tdp(unsigned int family, unsigned int model)
-{
- if (!genuine_intel)
- return 0;
-
- if (family != 6)
- return 0;
-
- switch (model) {
- case INTEL_FAM6_IVYBRIDGE: /* IVB */
- case INTEL_FAM6_HASWELL: /* HSW */
- case INTEL_FAM6_HASWELL_X: /* HSX */
- case INTEL_FAM6_HASWELL_L: /* HSW */
- case INTEL_FAM6_HASWELL_G: /* HSW */
- case INTEL_FAM6_BROADWELL: /* BDW */
- case INTEL_FAM6_BROADWELL_G: /* BDW */
- case INTEL_FAM6_BROADWELL_X: /* BDX */
- case INTEL_FAM6_SKYLAKE_L: /* SKL */
- case INTEL_FAM6_CANNONLAKE_L: /* CNL */
- case INTEL_FAM6_SKYLAKE_X: /* SKX */
- case INTEL_FAM6_ICELAKE_X: /* ICX */
- case INTEL_FAM6_SAPPHIRERAPIDS_X: /* SPR */
- case INTEL_FAM6_XEON_PHI_KNL: /* Knights Landing */
- return 1;
- default:
- return 0;
- }
-}
-
-/*
- * tcc_offset_bits:
- * 0: Tcc Offset not supported (Default)
- * 6: Bit 29:24 of MSR_PLATFORM_INFO
- * 4: Bit 27:24 of MSR_PLATFORM_INFO
- */
-void check_tcc_offset(int model)
-{
- unsigned long long msr;
-
- if (!genuine_intel)
- return;
-
- switch (model) {
- case INTEL_FAM6_SKYLAKE_L:
- case INTEL_FAM6_SKYLAKE:
- case INTEL_FAM6_KABYLAKE_L:
- case INTEL_FAM6_KABYLAKE:
- case INTEL_FAM6_ICELAKE_L:
- case INTEL_FAM6_ICELAKE:
- case INTEL_FAM6_TIGERLAKE_L:
- case INTEL_FAM6_TIGERLAKE:
- case INTEL_FAM6_COMETLAKE:
- if (!get_msr(base_cpu, MSR_PLATFORM_INFO, &msr)) {
- msr = (msr >> 30) & 1;
- if (msr)
- tcc_offset_bits = 6;
- }
- return;
- default:
- return;
- }
-}
-
-static void remove_underbar(char *s)
-{
- char *to = s;
-
- while (*s) {
- if (*s != '_')
- *to++ = *s;
- s++;
- }
-
- *to = 0;
-}
-
-static void dump_turbo_ratio_info(unsigned int family, unsigned int model)
-{
- if (!has_turbo)
- return;
-
- if (has_hsw_turbo_ratio_limit(family, model))
- dump_hsw_turbo_ratio_limits();
-
- if (has_ivt_turbo_ratio_limit(family, model))
- dump_ivt_turbo_ratio_limits();
-
- if (has_turbo_ratio_limit(family, model)) {
- dump_turbo_ratio_limits(MSR_TURBO_RATIO_LIMIT, family, model);
-
- if (is_hybrid)
- dump_turbo_ratio_limits(MSR_SECONDARY_TURBO_RATIO_LIMIT, family, model);
- }
-
- if (has_atom_turbo_ratio_limit(family, model))
- dump_atom_turbo_ratio_limits();
-
- if (has_knl_turbo_ratio_limit(family, model))
- dump_knl_turbo_ratio_limits();
-
- if (has_config_tdp(family, model))
- dump_config_tdp();
-}
-
-static void dump_cstate_pstate_config_info(unsigned int family, unsigned int model)
-{
- if (!do_nhm_platform_info)
- return;
-
- dump_nhm_platform_info();
- dump_turbo_ratio_info(family, model);
- dump_nhm_cst_cfg();
-}
-
-static int read_sysfs_int(char *path)
-{
- FILE *input;
- int retval = -1;
-
- input = fopen(path, "r");
- if (input == NULL) {
- if (debug)
- fprintf(outf, "NSFOD %s\n", path);
- return (-1);
- }
- if (fscanf(input, "%d", &retval) != 1)
- err(1, "%s: failed to read int from file", path);
- fclose(input);
-
- return (retval);
-}
-
-static void dump_sysfs_file(char *path)
-{
- FILE *input;
- char cpuidle_buf[64];
-
- input = fopen(path, "r");
- if (input == NULL) {
- if (debug)
- fprintf(outf, "NSFOD %s\n", path);
- return;
- }
- if (!fgets(cpuidle_buf, sizeof(cpuidle_buf), input))
- err(1, "%s: failed to read file", path);
- fclose(input);
-
- fprintf(outf, "%s: %s", strrchr(path, '/') + 1, cpuidle_buf);
-}
-
-static void intel_uncore_frequency_probe(void)
-{
- int i, j;
- char path[128];
-
- if (!genuine_intel)
- return;
-
- if (access("/sys/devices/system/cpu/intel_uncore_frequency/package_00_die_00", R_OK))
- return;
-
- if (!access("/sys/devices/system/cpu/intel_uncore_frequency/package_00_die_00/current_freq_khz", R_OK))
- BIC_PRESENT(BIC_UNCORE_MHZ);
-
- if (quiet)
- return;
+ if (quiet)
+ return;
for (i = 0; i < topo.num_packages; ++i) {
for (j = 0; j < topo.num_die; ++j) {
}
}
+static void probe_graphics(void)
+{
+ if (!access("/sys/class/drm/card0/power/rc6_residency_ms", R_OK))
+ BIC_PRESENT(BIC_GFX_rc6);
+
+ if (!access("/sys/class/drm/card0/gt_cur_freq_mhz", R_OK) ||
+ !access("/sys/class/graphics/fb0/device/drm/card0/gt_cur_freq_mhz", R_OK))
+ BIC_PRESENT(BIC_GFXMHz);
+
+ if (!access("/sys/class/drm/card0/gt_act_freq_mhz", R_OK) ||
+ !access("/sys/class/graphics/fb0/device/drm/card0/gt_act_freq_mhz", R_OK))
+ BIC_PRESENT(BIC_GFXACTMHz);
+}
+
static void dump_sysfs_cstate_config(void)
{
char path[64];
cpu = t->cpu_id;
/* EPB is per-package */
- if (!(t->flags & CPU_IS_FIRST_THREAD_IN_CORE) || !(t->flags & CPU_IS_FIRST_CORE_IN_PACKAGE))
+ if (!is_cpu_first_thread_in_package(t, c, p))
return 0;
if (cpu_migrate(cpu)) {
cpu = t->cpu_id;
/* MSR_HWP_CAPABILITIES is per-package */
- if (!(t->flags & CPU_IS_FIRST_THREAD_IN_CORE) || !(t->flags & CPU_IS_FIRST_CORE_IN_PACKAGE))
+ if (!is_cpu_first_thread_in_package(t, c, p))
return 0;
if (cpu_migrate(cpu)) {
cpu = t->cpu_id;
/* per-package */
- if (!(t->flags & CPU_IS_FIRST_THREAD_IN_CORE) || !(t->flags & CPU_IS_FIRST_CORE_IN_PACKAGE))
+ if (!is_cpu_first_thread_in_package(t, c, p))
return 0;
if (cpu_migrate(cpu)) {
return -1;
}
- if (do_core_perf_limit_reasons) {
+ if (platform->plr_msrs & PLR_CORE) {
get_msr(cpu, MSR_CORE_PERF_LIMIT_REASONS, &msr);
fprintf(outf, "cpu%d: MSR_CORE_PERF_LIMIT_REASONS, 0x%08llx", cpu, msr);
fprintf(outf, " (Active: %s%s%s%s%s%s%s%s%s%s%s%s%s%s)",
(msr & 1 << 17) ? "ThermStatus, " : "", (msr & 1 << 16) ? "PROCHOT, " : "");
}
- if (do_gfx_perf_limit_reasons) {
+ if (platform->plr_msrs & PLR_GFX) {
get_msr(cpu, MSR_GFX_PERF_LIMIT_REASONS, &msr);
fprintf(outf, "cpu%d: MSR_GFX_PERF_LIMIT_REASONS, 0x%08llx", cpu, msr);
fprintf(outf, " (Active: %s%s%s%s%s%s%s%s)",
(msr & 1 << 25) ? "GFXPwr, " : "",
(msr & 1 << 26) ? "PkgPwrL1, " : "", (msr & 1 << 27) ? "PkgPwrL2, " : "");
}
- if (do_ring_perf_limit_reasons) {
+ if (platform->plr_msrs & PLR_RING) {
get_msr(cpu, MSR_RING_PERF_LIMIT_REASONS, &msr);
fprintf(outf, "cpu%d: MSR_RING_PERF_LIMIT_REASONS, 0x%08llx", cpu, msr);
fprintf(outf, " (Active: %s%s%s%s%s%s)",
#define RAPL_POWER_GRANULARITY 0x7FFF /* 15 bit power granularity */
#define RAPL_TIME_GRANULARITY 0x3F /* 6 bit time granularity */
-double get_tdp_intel(unsigned int model)
+double get_quirk_tdp(void)
{
- unsigned long long msr;
-
- if (do_rapl & RAPL_PKG_POWER_INFO)
- if (!get_msr(base_cpu, MSR_PKG_POWER_INFO, &msr))
- return ((msr >> 0) & RAPL_POWER_GRANULARITY) * rapl_power_units;
+ if (platform->rapl_quirk_tdp)
+ return platform->rapl_quirk_tdp;
- switch (model) {
- case INTEL_FAM6_ATOM_SILVERMONT:
- case INTEL_FAM6_ATOM_SILVERMONT_D:
- return 30.0;
- default:
- return 135.0;
- }
+ return 135.0;
}
-double get_tdp_amd(unsigned int family)
+double get_tdp_intel(void)
{
- UNUSED(family);
+ unsigned long long msr;
- /* This is the max stock TDP of HEDT/Server Fam17h+ chips */
- return 280.0;
+ if (platform->rapl_msrs & RAPL_PKG_POWER_INFO)
+ if (!get_msr(base_cpu, MSR_PKG_POWER_INFO, &msr))
+ return ((msr >> 0) & RAPL_POWER_GRANULARITY) * rapl_power_units;
+ return get_quirk_tdp();
}
-/*
- * rapl_dram_energy_units_probe()
- * Energy units are either hard-coded, or come from RAPL Energy Unit MSR.
- */
-static double rapl_dram_energy_units_probe(int model, double rapl_energy_units)
+double get_tdp_amd(void)
{
- /* only called for genuine_intel, family 6 */
-
- switch (model) {
- case INTEL_FAM6_HASWELL_X: /* HSX */
- case INTEL_FAM6_BROADWELL_X: /* BDX */
- case INTEL_FAM6_SKYLAKE_X: /* SKX */
- case INTEL_FAM6_XEON_PHI_KNL: /* KNL */
- case INTEL_FAM6_ICELAKE_X: /* ICX */
- return (rapl_dram_energy_units = 15.3 / 1000000);
- default:
- return (rapl_energy_units);
- }
+ return get_quirk_tdp();
}
-void rapl_probe_intel(unsigned int family, unsigned int model)
+void rapl_probe_intel(void)
{
unsigned long long msr;
unsigned int time_unit;
double tdp;
- if (family != 6)
- return;
-
- switch (model) {
- case INTEL_FAM6_SANDYBRIDGE:
- case INTEL_FAM6_IVYBRIDGE:
- case INTEL_FAM6_HASWELL: /* HSW */
- case INTEL_FAM6_HASWELL_L: /* HSW */
- case INTEL_FAM6_HASWELL_G: /* HSW */
- case INTEL_FAM6_BROADWELL: /* BDW */
- case INTEL_FAM6_BROADWELL_G: /* BDW */
- do_rapl = RAPL_PKG | RAPL_CORES | RAPL_CORE_POLICY | RAPL_GFX | RAPL_PKG_POWER_INFO;
- if (rapl_joules) {
- BIC_PRESENT(BIC_Pkg_J);
- BIC_PRESENT(BIC_Cor_J);
- BIC_PRESENT(BIC_GFX_J);
- } else {
- BIC_PRESENT(BIC_PkgWatt);
- BIC_PRESENT(BIC_CorWatt);
- BIC_PRESENT(BIC_GFXWatt);
- }
- break;
- case INTEL_FAM6_ATOM_GOLDMONT: /* BXT */
- case INTEL_FAM6_ATOM_GOLDMONT_PLUS:
- do_rapl = RAPL_PKG | RAPL_PKG_POWER_INFO;
- if (rapl_joules)
- BIC_PRESENT(BIC_Pkg_J);
- else
- BIC_PRESENT(BIC_PkgWatt);
- break;
- case INTEL_FAM6_ATOM_TREMONT: /* EHL */
- do_rapl =
- RAPL_PKG | RAPL_CORES | RAPL_CORE_POLICY | RAPL_DRAM | RAPL_DRAM_PERF_STATUS | RAPL_PKG_PERF_STATUS
- | RAPL_GFX | RAPL_PKG_POWER_INFO;
- if (rapl_joules) {
- BIC_PRESENT(BIC_Pkg_J);
- BIC_PRESENT(BIC_Cor_J);
- BIC_PRESENT(BIC_RAM_J);
- BIC_PRESENT(BIC_GFX_J);
- } else {
- BIC_PRESENT(BIC_PkgWatt);
- BIC_PRESENT(BIC_CorWatt);
- BIC_PRESENT(BIC_RAMWatt);
- BIC_PRESENT(BIC_GFXWatt);
- }
- break;
- case INTEL_FAM6_ATOM_TREMONT_D: /* JVL */
- do_rapl = RAPL_PKG | RAPL_PKG_PERF_STATUS | RAPL_PKG_POWER_INFO;
- BIC_PRESENT(BIC_PKG__);
- if (rapl_joules)
- BIC_PRESENT(BIC_Pkg_J);
- else
- BIC_PRESENT(BIC_PkgWatt);
- break;
- case INTEL_FAM6_SKYLAKE_L: /* SKL */
- case INTEL_FAM6_CANNONLAKE_L: /* CNL */
- do_rapl =
- RAPL_PKG | RAPL_CORES | RAPL_CORE_POLICY | RAPL_DRAM | RAPL_DRAM_PERF_STATUS | RAPL_PKG_PERF_STATUS
- | RAPL_GFX | RAPL_PKG_POWER_INFO;
- BIC_PRESENT(BIC_PKG__);
- BIC_PRESENT(BIC_RAM__);
- if (rapl_joules) {
+ if (rapl_joules) {
+ if (platform->rapl_msrs & RAPL_PKG_ENERGY_STATUS)
BIC_PRESENT(BIC_Pkg_J);
+ if (platform->rapl_msrs & RAPL_CORE_ENERGY_STATUS)
BIC_PRESENT(BIC_Cor_J);
+ if (platform->rapl_msrs & RAPL_DRAM_ENERGY_STATUS)
BIC_PRESENT(BIC_RAM_J);
+ if (platform->rapl_msrs & RAPL_GFX_ENERGY_STATUS)
BIC_PRESENT(BIC_GFX_J);
- } else {
+ } else {
+ if (platform->rapl_msrs & RAPL_PKG_ENERGY_STATUS)
BIC_PRESENT(BIC_PkgWatt);
+ if (platform->rapl_msrs & RAPL_CORE_ENERGY_STATUS)
BIC_PRESENT(BIC_CorWatt);
+ if (platform->rapl_msrs & RAPL_DRAM_ENERGY_STATUS)
BIC_PRESENT(BIC_RAMWatt);
+ if (platform->rapl_msrs & RAPL_GFX_ENERGY_STATUS)
BIC_PRESENT(BIC_GFXWatt);
- }
- break;
- case INTEL_FAM6_HASWELL_X: /* HSX */
- case INTEL_FAM6_BROADWELL_X: /* BDX */
- case INTEL_FAM6_SKYLAKE_X: /* SKX */
- case INTEL_FAM6_ICELAKE_X: /* ICX */
- case INTEL_FAM6_SAPPHIRERAPIDS_X: /* SPR */
- case INTEL_FAM6_XEON_PHI_KNL: /* KNL */
- do_rapl =
- RAPL_PKG | RAPL_DRAM | RAPL_DRAM_POWER_INFO | RAPL_DRAM_PERF_STATUS | RAPL_PKG_PERF_STATUS |
- RAPL_PKG_POWER_INFO;
+ }
+
+ if (platform->rapl_msrs & RAPL_PKG_PERF_STATUS)
BIC_PRESENT(BIC_PKG__);
+ if (platform->rapl_msrs & RAPL_DRAM_PERF_STATUS)
BIC_PRESENT(BIC_RAM__);
- if (rapl_joules) {
- BIC_PRESENT(BIC_Pkg_J);
- BIC_PRESENT(BIC_RAM_J);
- } else {
- BIC_PRESENT(BIC_PkgWatt);
- BIC_PRESENT(BIC_RAMWatt);
- }
- break;
- case INTEL_FAM6_SANDYBRIDGE_X:
- case INTEL_FAM6_IVYBRIDGE_X:
- do_rapl =
- RAPL_PKG | RAPL_CORES | RAPL_CORE_POLICY | RAPL_DRAM | RAPL_DRAM_POWER_INFO | RAPL_PKG_PERF_STATUS |
- RAPL_DRAM_PERF_STATUS | RAPL_PKG_POWER_INFO;
- BIC_PRESENT(BIC_PKG__);
- BIC_PRESENT(BIC_RAM__);
- if (rapl_joules) {
- BIC_PRESENT(BIC_Pkg_J);
- BIC_PRESENT(BIC_Cor_J);
- BIC_PRESENT(BIC_RAM_J);
- } else {
- BIC_PRESENT(BIC_PkgWatt);
- BIC_PRESENT(BIC_CorWatt);
- BIC_PRESENT(BIC_RAMWatt);
- }
- break;
- case INTEL_FAM6_ATOM_SILVERMONT: /* BYT */
- case INTEL_FAM6_ATOM_SILVERMONT_D: /* AVN */
- do_rapl = RAPL_PKG | RAPL_CORES;
- if (rapl_joules) {
- BIC_PRESENT(BIC_Pkg_J);
- BIC_PRESENT(BIC_Cor_J);
- } else {
- BIC_PRESENT(BIC_PkgWatt);
- BIC_PRESENT(BIC_CorWatt);
- }
- break;
- case INTEL_FAM6_ATOM_GOLDMONT_D: /* DNV */
- do_rapl =
- RAPL_PKG | RAPL_DRAM | RAPL_DRAM_POWER_INFO | RAPL_DRAM_PERF_STATUS | RAPL_PKG_PERF_STATUS |
- RAPL_PKG_POWER_INFO | RAPL_CORES_ENERGY_STATUS;
- BIC_PRESENT(BIC_PKG__);
- BIC_PRESENT(BIC_RAM__);
- if (rapl_joules) {
- BIC_PRESENT(BIC_Pkg_J);
- BIC_PRESENT(BIC_Cor_J);
- BIC_PRESENT(BIC_RAM_J);
- } else {
- BIC_PRESENT(BIC_PkgWatt);
- BIC_PRESENT(BIC_CorWatt);
- BIC_PRESENT(BIC_RAMWatt);
- }
- break;
- default:
- return;
- }
/* units on package 0, verify later other packages match */
if (get_msr(base_cpu, MSR_RAPL_POWER_UNIT, &msr))
return;
rapl_power_units = 1.0 / (1 << (msr & 0xF));
- if (model == INTEL_FAM6_ATOM_SILVERMONT)
+ if (platform->has_rapl_divisor)
rapl_energy_units = 1.0 * (1 << (msr >> 8 & 0x1F)) / 1000000;
else
rapl_energy_units = 1.0 / (1 << (msr >> 8 & 0x1F));
- rapl_dram_energy_units = rapl_dram_energy_units_probe(model, rapl_energy_units);
+ if (platform->has_fixed_rapl_unit)
+ rapl_dram_energy_units = (15.3 / 1000000);
+ else
+ rapl_dram_energy_units = rapl_energy_units;
time_unit = msr >> 16 & 0xF;
if (time_unit == 0)
rapl_time_units = 1.0 / (1 << (time_unit));
- tdp = get_tdp_intel(model);
+ tdp = get_tdp_intel();
rapl_joule_counter_range = 0xFFFFFFFF * rapl_energy_units / tdp;
if (!quiet)
fprintf(outf, "RAPL: %.0f sec. Joule Counter Range, at %.0f Watts\n", rapl_joule_counter_range, tdp);
}
-void rapl_probe_amd(unsigned int family, unsigned int model)
+void rapl_probe_amd(void)
{
unsigned long long msr;
- unsigned int eax, ebx, ecx, edx;
- unsigned int has_rapl = 0;
double tdp;
- UNUSED(model);
-
- if (max_extended_level >= 0x80000007) {
- __cpuid(0x80000007, eax, ebx, ecx, edx);
- /* RAPL (Fam 17h+) */
- has_rapl = edx & (1 << 14);
- }
-
- if (!has_rapl || family < 0x17)
- return;
-
- do_rapl = RAPL_AMD_F17H | RAPL_PER_CORE_ENERGY;
if (rapl_joules) {
BIC_PRESENT(BIC_Pkg_J);
BIC_PRESENT(BIC_Cor_J);
rapl_energy_units = ldexp(1.0, -(msr >> 8 & 0x1f));
rapl_power_units = ldexp(1.0, -(msr & 0xf));
- tdp = get_tdp_amd(family);
+ tdp = get_tdp_amd();
rapl_joule_counter_range = 0xFFFFFFFF * rapl_energy_units / tdp;
if (!quiet)
fprintf(outf, "RAPL: %.0f sec. Joule Counter Range, at %.0f Watts\n", rapl_joule_counter_range, tdp);
}
-/*
- * rapl_probe()
- *
- * sets do_rapl, rapl_power_units, rapl_energy_units, rapl_time_units
- */
-void rapl_probe(unsigned int family, unsigned int model)
-{
- if (genuine_intel)
- rapl_probe_intel(family, model);
- if (authentic_amd || hygon_genuine)
- rapl_probe_amd(family, model);
-}
-
-void perf_limit_reasons_probe(unsigned int family, unsigned int model)
-{
- if (!genuine_intel)
- return;
-
- if (family != 6)
- return;
-
- switch (model) {
- case INTEL_FAM6_HASWELL: /* HSW */
- case INTEL_FAM6_HASWELL_L: /* HSW */
- case INTEL_FAM6_HASWELL_G: /* HSW */
- do_gfx_perf_limit_reasons = 1;
- /* FALLTHRU */
- case INTEL_FAM6_HASWELL_X: /* HSX */
- do_core_perf_limit_reasons = 1;
- do_ring_perf_limit_reasons = 1;
- default:
- return;
- }
-}
-
-void automatic_cstate_conversion_probe(unsigned int family, unsigned int model)
-{
- if (family != 6)
- return;
-
- switch (model) {
- case INTEL_FAM6_BROADWELL_X:
- case INTEL_FAM6_SKYLAKE_X:
- has_automatic_cstate_conversion = 1;
- }
-}
-
-void prewake_cstate_probe(unsigned int family, unsigned int model)
-{
- if (is_icx(family, model) || is_spr(family, model))
- dis_cstate_prewake = 1;
-}
-
-int print_thermal(struct thread_data *t, struct core_data *c, struct pkg_data *p)
-{
- unsigned long long msr;
- unsigned int dts, dts2;
- int cpu;
-
- UNUSED(c);
- UNUSED(p);
-
- if (!(do_dts || do_ptm))
- return 0;
-
- cpu = t->cpu_id;
-
- /* DTS is per-core, no need to print for each thread */
- if (!(t->flags & CPU_IS_FIRST_THREAD_IN_CORE))
- return 0;
-
- if (cpu_migrate(cpu)) {
- fprintf(outf, "print_thermal: Could not migrate to CPU %d\n", cpu);
- return -1;
- }
-
- if (do_ptm && (t->flags & CPU_IS_FIRST_CORE_IN_PACKAGE)) {
- if (get_msr(cpu, MSR_IA32_PACKAGE_THERM_STATUS, &msr))
- return 0;
-
- dts = (msr >> 16) & 0x7F;
- fprintf(outf, "cpu%d: MSR_IA32_PACKAGE_THERM_STATUS: 0x%08llx (%d C)\n", cpu, msr, tj_max - dts);
-
- if (get_msr(cpu, MSR_IA32_PACKAGE_THERM_INTERRUPT, &msr))
- return 0;
-
- dts = (msr >> 16) & 0x7F;
- dts2 = (msr >> 8) & 0x7F;
- fprintf(outf, "cpu%d: MSR_IA32_PACKAGE_THERM_INTERRUPT: 0x%08llx (%d C, %d C)\n",
- cpu, msr, tj_max - dts, tj_max - dts2);
- }
-
- if (do_dts && debug) {
- unsigned int resolution;
-
- if (get_msr(cpu, MSR_IA32_THERM_STATUS, &msr))
- return 0;
-
- dts = (msr >> 16) & 0x7F;
- resolution = (msr >> 27) & 0xF;
- fprintf(outf, "cpu%d: MSR_IA32_THERM_STATUS: 0x%08llx (%d C +/- %d)\n",
- cpu, msr, tj_max - dts, resolution);
-
- if (get_msr(cpu, MSR_IA32_THERM_INTERRUPT, &msr))
- return 0;
-
- dts = (msr >> 16) & 0x7F;
- dts2 = (msr >> 8) & 0x7F;
- fprintf(outf, "cpu%d: MSR_IA32_THERM_INTERRUPT: 0x%08llx (%d C, %d C)\n",
- cpu, msr, tj_max - dts, tj_max - dts2);
- }
-
- return 0;
-}
-
void print_power_limit_msr(int cpu, unsigned long long msr, char *label)
{
fprintf(outf, "cpu%d: %s: %sabled (%0.3f Watts, %f sec, clamp %sabled)\n",
UNUSED(c);
UNUSED(p);
- if (!do_rapl)
+ if (!platform->rapl_msrs)
return 0;
/* RAPL counters are per package, so print only for 1st thread/package */
- if (!(t->flags & CPU_IS_FIRST_THREAD_IN_CORE) || !(t->flags & CPU_IS_FIRST_CORE_IN_PACKAGE))
+ if (!is_cpu_first_thread_in_package(t, c, p))
return 0;
cpu = t->cpu_id;
return -1;
}
- if (do_rapl & RAPL_AMD_F17H) {
+ if (platform->rapl_msrs & RAPL_AMD_F17H) {
msr_name = "MSR_RAPL_PWR_UNIT";
if (get_msr(cpu, MSR_RAPL_PWR_UNIT, &msr))
return -1;
fprintf(outf, "cpu%d: %s: 0x%08llx (%f Watts, %f Joules, %f sec.)\n", cpu, msr_name, msr,
rapl_power_units, rapl_energy_units, rapl_time_units);
- if (do_rapl & RAPL_PKG_POWER_INFO) {
+ if (platform->rapl_msrs & RAPL_PKG_POWER_INFO) {
if (get_msr(cpu, MSR_PKG_POWER_INFO, &msr))
return -5;
((msr >> 48) & RAPL_TIME_GRANULARITY) * rapl_time_units);
}
- if (do_rapl & RAPL_PKG) {
+ if (platform->rapl_msrs & RAPL_PKG) {
if (get_msr(cpu, MSR_PKG_POWER_LIMIT, &msr))
return -9;
cpu, ((msr >> 0) & 0x1FFF) * rapl_power_units, (msr >> 31) & 1 ? "" : "UN");
}
- if (do_rapl & RAPL_DRAM_POWER_INFO) {
+ if (platform->rapl_msrs & RAPL_DRAM_POWER_INFO) {
if (get_msr(cpu, MSR_DRAM_POWER_INFO, &msr))
return -6;
((msr >> 32) & RAPL_POWER_GRANULARITY) * rapl_power_units,
((msr >> 48) & RAPL_TIME_GRANULARITY) * rapl_time_units);
}
- if (do_rapl & RAPL_DRAM) {
+ if (platform->rapl_msrs & RAPL_DRAM) {
if (get_msr(cpu, MSR_DRAM_POWER_LIMIT, &msr))
return -9;
fprintf(outf, "cpu%d: MSR_DRAM_POWER_LIMIT: 0x%08llx (%slocked)\n",
print_power_limit_msr(cpu, msr, "DRAM Limit");
}
- if (do_rapl & RAPL_CORE_POLICY) {
+ if (platform->rapl_msrs & RAPL_CORE_POLICY) {
if (get_msr(cpu, MSR_PP0_POLICY, &msr))
return -7;
fprintf(outf, "cpu%d: MSR_PP0_POLICY: %lld\n", cpu, msr & 0xF);
}
- if (do_rapl & RAPL_CORES_POWER_LIMIT) {
+ if (platform->rapl_msrs & RAPL_CORE_POWER_LIMIT) {
if (get_msr(cpu, MSR_PP0_POWER_LIMIT, &msr))
return -9;
fprintf(outf, "cpu%d: MSR_PP0_POWER_LIMIT: 0x%08llx (%slocked)\n",
cpu, msr, (msr >> 31) & 1 ? "" : "UN");
print_power_limit_msr(cpu, msr, "Cores Limit");
}
- if (do_rapl & RAPL_GFX) {
+ if (platform->rapl_msrs & RAPL_GFX) {
if (get_msr(cpu, MSR_PP1_POLICY, &msr))
return -8;
}
/*
- * SNB adds support for additional MSRs:
- *
- * MSR_PKG_C7_RESIDENCY 0x000003fa
- * MSR_CORE_C7_RESIDENCY 0x000003fe
- * MSR_PKG_C2_RESIDENCY 0x0000060d
- */
-
-int has_snb_msrs(unsigned int family, unsigned int model)
-{
- if (!genuine_intel)
- return 0;
-
- if (family != 6)
- return 0;
-
- switch (model) {
- case INTEL_FAM6_SANDYBRIDGE:
- case INTEL_FAM6_SANDYBRIDGE_X:
- case INTEL_FAM6_IVYBRIDGE: /* IVB */
- case INTEL_FAM6_IVYBRIDGE_X: /* IVB Xeon */
- case INTEL_FAM6_HASWELL: /* HSW */
- case INTEL_FAM6_HASWELL_X: /* HSW */
- case INTEL_FAM6_HASWELL_L: /* HSW */
- case INTEL_FAM6_HASWELL_G: /* HSW */
- case INTEL_FAM6_BROADWELL: /* BDW */
- case INTEL_FAM6_BROADWELL_G: /* BDW */
- case INTEL_FAM6_BROADWELL_X: /* BDX */
- case INTEL_FAM6_SKYLAKE_L: /* SKL */
- case INTEL_FAM6_CANNONLAKE_L: /* CNL */
- case INTEL_FAM6_SKYLAKE_X: /* SKX */
- case INTEL_FAM6_ICELAKE_X: /* ICX */
- case INTEL_FAM6_SAPPHIRERAPIDS_X: /* SPR */
- case INTEL_FAM6_ATOM_GOLDMONT: /* BXT */
- case INTEL_FAM6_ATOM_GOLDMONT_PLUS:
- case INTEL_FAM6_ATOM_GOLDMONT_D: /* DNV */
- case INTEL_FAM6_ATOM_TREMONT: /* EHL */
- case INTEL_FAM6_ATOM_TREMONT_D: /* JVL */
- return 1;
- }
- return 0;
-}
-
-/*
- * HSW ULT added support for C8/C9/C10 MSRs:
- *
- * MSR_PKG_C8_RESIDENCY 0x00000630
- * MSR_PKG_C9_RESIDENCY 0x00000631
- * MSR_PKG_C10_RESIDENCY 0x00000632
- *
- * MSR_PKGC8_IRTL 0x00000633
- * MSR_PKGC9_IRTL 0x00000634
- * MSR_PKGC10_IRTL 0x00000635
- *
- */
-int has_c8910_msrs(unsigned int family, unsigned int model)
-{
- if (!genuine_intel)
- return 0;
-
- if (family != 6)
- return 0;
-
- switch (model) {
- case INTEL_FAM6_HASWELL_L: /* HSW */
- case INTEL_FAM6_BROADWELL: /* BDW */
- case INTEL_FAM6_SKYLAKE_L: /* SKL */
- case INTEL_FAM6_CANNONLAKE_L: /* CNL */
- case INTEL_FAM6_ATOM_GOLDMONT: /* BXT */
- case INTEL_FAM6_ATOM_GOLDMONT_PLUS:
- case INTEL_FAM6_ATOM_TREMONT: /* EHL */
- return 1;
- }
- return 0;
-}
-
-/*
- * SKL adds support for additional MSRS:
+ * probe_rapl()
*
- * MSR_PKG_WEIGHTED_CORE_C0_RES 0x00000658
- * MSR_PKG_ANY_CORE_C0_RES 0x00000659
- * MSR_PKG_ANY_GFXE_C0_RES 0x0000065A
- * MSR_PKG_BOTH_CORE_GFXE_C0_RES 0x0000065B
+ * sets rapl_power_units, rapl_energy_units, rapl_time_units
*/
-int has_skl_msrs(unsigned int family, unsigned int model)
-{
- if (!genuine_intel)
- return 0;
-
- if (family != 6)
- return 0;
-
- switch (model) {
- case INTEL_FAM6_SKYLAKE_L: /* SKL */
- case INTEL_FAM6_CANNONLAKE_L: /* CNL */
- return 1;
- }
- return 0;
-}
-
-int is_slm(unsigned int family, unsigned int model)
-{
- if (!genuine_intel)
- return 0;
-
- if (family != 6)
- return 0;
-
- switch (model) {
- case INTEL_FAM6_ATOM_SILVERMONT: /* BYT */
- case INTEL_FAM6_ATOM_SILVERMONT_D: /* AVN */
- return 1;
- }
- return 0;
-}
-
-int is_knl(unsigned int family, unsigned int model)
-{
- if (!genuine_intel)
- return 0;
-
- if (family != 6)
- return 0;
-
- switch (model) {
- case INTEL_FAM6_XEON_PHI_KNL: /* KNL */
- return 1;
- }
- return 0;
-}
-
-int is_cnl(unsigned int family, unsigned int model)
-{
- if (!genuine_intel)
- return 0;
-
- if (family != 6)
- return 0;
-
- switch (model) {
- case INTEL_FAM6_CANNONLAKE_L: /* CNL */
- return 1;
- }
-
- return 0;
-}
-
-unsigned int get_aperf_mperf_multiplier(unsigned int family, unsigned int model)
-{
- if (is_knl(family, model))
- return 1024;
- return 1;
-}
-
-#define SLM_BCLK_FREQS 5
-double slm_freq_table[SLM_BCLK_FREQS] = { 83.3, 100.0, 133.3, 116.7, 80.0 };
-
-double slm_bclk(void)
-{
- unsigned long long msr = 3;
- unsigned int i;
- double freq;
-
- if (get_msr(base_cpu, MSR_FSB_FREQ, &msr))
- fprintf(outf, "SLM BCLK: unknown\n");
-
- i = msr & 0xf;
- if (i >= SLM_BCLK_FREQS) {
- fprintf(outf, "SLM BCLK[%d] invalid\n", i);
- i = 3;
- }
- freq = slm_freq_table[i];
-
- if (!quiet)
- fprintf(outf, "SLM BCLK: %.1f Mhz\n", freq);
-
- return freq;
-}
-
-double discover_bclk(unsigned int family, unsigned int model)
+void probe_rapl(void)
{
- if (has_snb_msrs(family, model) || is_knl(family, model))
- return 100.00;
- else if (is_slm(family, model))
- return slm_bclk();
- else
- return 133.33;
-}
-
-int get_cpu_type(struct thread_data *t, struct core_data *c, struct pkg_data *p)
-{
- unsigned int eax, ebx, ecx, edx;
-
- UNUSED(c);
- UNUSED(p);
-
- if (!genuine_intel)
- return 0;
+ if (!platform->rapl_msrs)
+ return;
- if (cpu_migrate(t->cpu_id)) {
- fprintf(outf, "Could not migrate to CPU %d\n", t->cpu_id);
- return -1;
- }
+ if (genuine_intel)
+ rapl_probe_intel();
+ if (authentic_amd || hygon_genuine)
+ rapl_probe_amd();
- if (max_level < 0x1a)
- return 0;
+ if (quiet)
+ return;
- __cpuid(0x1a, eax, ebx, ecx, edx);
- eax = (eax >> 24) & 0xFF;
- if (eax == 0x20)
- t->is_atom = true;
- return 0;
+ for_all_cpus(print_rapl, ODD_COUNTERS);
}
/*
return 0;
/* this is a per-package concept */
- if (!(t->flags & CPU_IS_FIRST_THREAD_IN_CORE) || !(t->flags & CPU_IS_FIRST_CORE_IN_PACKAGE))
+ if (!is_cpu_first_thread_in_package(t, c, p))
return 0;
cpu = t->cpu_id;
}
/* Temperature Target MSR is Nehalem and newer only */
- if (!do_nhm_platform_info)
+ if (!platform->has_nhm_msrs)
goto guess;
if (get_msr(base_cpu, MSR_IA32_TEMPERATURE_TARGET, &msr))
tcc_default = (msr >> 16) & 0xFF;
if (!quiet) {
- switch (tcc_offset_bits) {
- case 4:
- tcc_offset = (msr >> 24) & 0xF;
- fprintf(outf, "cpu%d: MSR_IA32_TEMPERATURE_TARGET: 0x%08llx (%d C) (%d default - %d offset)\n",
- cpu, msr, tcc_default - tcc_offset, tcc_default, tcc_offset);
- break;
- case 6:
- tcc_offset = (msr >> 24) & 0x3F;
+ int bits = platform->tcc_offset_bits;
+ unsigned long long enabled = 0;
+
+ if (bits && !get_msr(base_cpu, MSR_PLATFORM_INFO, &enabled))
+ enabled = (enabled >> 30) & 1;
+
+ if (bits && enabled) {
+ tcc_offset = (msr >> 24) & GENMASK(bits - 1, 0);
fprintf(outf, "cpu%d: MSR_IA32_TEMPERATURE_TARGET: 0x%08llx (%d C) (%d default - %d offset)\n",
cpu, msr, tcc_default - tcc_offset, tcc_default, tcc_offset);
- break;
- default:
+ } else {
fprintf(outf, "cpu%d: MSR_IA32_TEMPERATURE_TARGET: 0x%08llx (%d C)\n", cpu, msr, tcc_default);
- break;
}
}
- if (!tcc_default)
- goto guess;
+ if (!tcc_default)
+ goto guess;
+
+ tj_max = tcc_default;
+
+ return 0;
+
+guess:
+ tj_max = TJMAX_DEFAULT;
+ fprintf(outf, "cpu%d: Guessing tjMax %d C, Please use -T to specify\n", cpu, tj_max);
+
+ return 0;
+}
+
+int print_thermal(struct thread_data *t, struct core_data *c, struct pkg_data *p)
+{
+ unsigned long long msr;
+ unsigned int dts, dts2;
+ int cpu;
+
+ UNUSED(c);
+ UNUSED(p);
+
+ if (!(do_dts || do_ptm))
+ return 0;
+
+ cpu = t->cpu_id;
+
+ /* DTS is per-core, no need to print for each thread */
+ if (!is_cpu_first_thread_in_core(t, c, p))
+ return 0;
+
+ if (cpu_migrate(cpu)) {
+ fprintf(outf, "print_thermal: Could not migrate to CPU %d\n", cpu);
+ return -1;
+ }
+
+ if (do_ptm && is_cpu_first_core_in_package(t, c, p)) {
+ if (get_msr(cpu, MSR_IA32_PACKAGE_THERM_STATUS, &msr))
+ return 0;
+
+ dts = (msr >> 16) & 0x7F;
+ fprintf(outf, "cpu%d: MSR_IA32_PACKAGE_THERM_STATUS: 0x%08llx (%d C)\n", cpu, msr, tj_max - dts);
+
+ if (get_msr(cpu, MSR_IA32_PACKAGE_THERM_INTERRUPT, &msr))
+ return 0;
+
+ dts = (msr >> 16) & 0x7F;
+ dts2 = (msr >> 8) & 0x7F;
+ fprintf(outf, "cpu%d: MSR_IA32_PACKAGE_THERM_INTERRUPT: 0x%08llx (%d C, %d C)\n",
+ cpu, msr, tj_max - dts, tj_max - dts2);
+ }
+
+ if (do_dts && debug) {
+ unsigned int resolution;
+
+ if (get_msr(cpu, MSR_IA32_THERM_STATUS, &msr))
+ return 0;
+
+ dts = (msr >> 16) & 0x7F;
+ resolution = (msr >> 27) & 0xF;
+ fprintf(outf, "cpu%d: MSR_IA32_THERM_STATUS: 0x%08llx (%d C +/- %d)\n",
+ cpu, msr, tj_max - dts, resolution);
+
+ if (get_msr(cpu, MSR_IA32_THERM_INTERRUPT, &msr))
+ return 0;
+
+ dts = (msr >> 16) & 0x7F;
+ dts2 = (msr >> 8) & 0x7F;
+ fprintf(outf, "cpu%d: MSR_IA32_THERM_INTERRUPT: 0x%08llx (%d C, %d C)\n",
+ cpu, msr, tj_max - dts, tj_max - dts2);
+ }
+
+ return 0;
+}
+
+void probe_thermal(void)
+{
+ if (!access("/sys/devices/system/cpu/cpu0/thermal_throttle/core_throttle_count", R_OK))
+ BIC_PRESENT(BIC_CORE_THROT_CNT);
+ else
+ BIC_NOT_PRESENT(BIC_CORE_THROT_CNT);
+
+ for_all_cpus(set_temperature_target, ODD_COUNTERS);
+
+ if (quiet)
+ return;
+
+ for_all_cpus(print_thermal, ODD_COUNTERS);
+}
+
+int get_cpu_type(struct thread_data *t, struct core_data *c, struct pkg_data *p)
+{
+ unsigned int eax, ebx, ecx, edx;
+
+ UNUSED(c);
+ UNUSED(p);
- tj_max = tcc_default;
+ if (!genuine_intel)
+ return 0;
- return 0;
+ if (cpu_migrate(t->cpu_id)) {
+ fprintf(outf, "Could not migrate to CPU %d\n", t->cpu_id);
+ return -1;
+ }
-guess:
- tj_max = TJMAX_DEFAULT;
- fprintf(outf, "cpu%d: Guessing tjMax %d C, Please use -T to specify\n", cpu, tj_max);
+ if (max_level < 0x1a)
+ return 0;
+ __cpuid(0x1a, eax, ebx, ecx, edx);
+ eax = (eax >> 24) & 0xFF;
+ if (eax == 0x20)
+ t->is_atom = true;
return 0;
}
{
unsigned long long msr;
- if (!has_misc_feature_control)
+ if (!platform->has_msr_misc_feature_control)
return;
if (!get_msr(base_cpu, MSR_MISC_FEATURE_CONTROL, &msr))
{
unsigned long long msr;
- if (!do_nhm_platform_info)
- return;
-
- if (no_MSR_MISC_PWR_MGMT)
+ if (!platform->has_msr_misc_pwr_mgmt)
return;
if (!get_msr(base_cpu, MSR_MISC_PWR_MGMT, &msr))
{
unsigned long long msr;
+ if (!platform->has_msr_c6_demotion_policy_config)
+ return;
+
if (!get_msr(base_cpu, MSR_CC6_DEMOTION_POLICY_CONFIG, &msr))
fprintf(outf, "cpu%d: MSR_CC6_DEMOTION_POLICY_CONFIG: 0x%08llx (%sable-CC6-Demotion)\n",
base_cpu, msr, msr & (1 << 0) ? "EN" : "DIS");
base_cpu, msr, msr & (1 << 0) ? "EN" : "DIS");
}
-/*
- * When models are the same, for the purpose of turbostat, reuse
- */
-unsigned int intel_model_duplicates(unsigned int model)
-{
-
- switch (model) {
- case INTEL_FAM6_NEHALEM_EP: /* Core i7, Xeon 5500 series - Bloomfield, Gainstown NHM-EP */
- case INTEL_FAM6_NEHALEM: /* Core i7 and i5 Processor - Clarksfield, Lynnfield, Jasper Forest */
- case 0x1F: /* Core i7 and i5 Processor - Nehalem */
- case INTEL_FAM6_WESTMERE: /* Westmere Client - Clarkdale, Arrandale */
- case INTEL_FAM6_WESTMERE_EP: /* Westmere EP - Gulftown */
- return INTEL_FAM6_NEHALEM;
-
- case INTEL_FAM6_NEHALEM_EX: /* Nehalem-EX Xeon - Beckton */
- case INTEL_FAM6_WESTMERE_EX: /* Westmere-EX Xeon - Eagleton */
- return INTEL_FAM6_NEHALEM_EX;
-
- case INTEL_FAM6_XEON_PHI_KNM:
- return INTEL_FAM6_XEON_PHI_KNL;
-
- case INTEL_FAM6_BROADWELL_X:
- case INTEL_FAM6_BROADWELL_D: /* BDX-DE */
- return INTEL_FAM6_BROADWELL_X;
-
- case INTEL_FAM6_SKYLAKE_L:
- case INTEL_FAM6_SKYLAKE:
- case INTEL_FAM6_KABYLAKE_L:
- case INTEL_FAM6_KABYLAKE:
- case INTEL_FAM6_COMETLAKE_L:
- case INTEL_FAM6_COMETLAKE:
- return INTEL_FAM6_SKYLAKE_L;
-
- case INTEL_FAM6_ICELAKE_L:
- case INTEL_FAM6_ICELAKE_NNPI:
- case INTEL_FAM6_TIGERLAKE_L:
- case INTEL_FAM6_TIGERLAKE:
- case INTEL_FAM6_ROCKETLAKE:
- case INTEL_FAM6_LAKEFIELD:
- case INTEL_FAM6_ALDERLAKE:
- case INTEL_FAM6_ALDERLAKE_L:
- case INTEL_FAM6_ATOM_GRACEMONT:
- case INTEL_FAM6_RAPTORLAKE:
- case INTEL_FAM6_RAPTORLAKE_P:
- case INTEL_FAM6_RAPTORLAKE_S:
- case INTEL_FAM6_METEORLAKE:
- case INTEL_FAM6_METEORLAKE_L:
- return INTEL_FAM6_CANNONLAKE_L;
-
- case INTEL_FAM6_ATOM_TREMONT_L:
- return INTEL_FAM6_ATOM_TREMONT;
-
- case INTEL_FAM6_ICELAKE_D:
- return INTEL_FAM6_ICELAKE_X;
-
- case INTEL_FAM6_EMERALDRAPIDS_X:
- return INTEL_FAM6_SAPPHIRERAPIDS_X;
- }
- return model;
-}
-
void print_dev_latency(void)
{
char *path = "/dev/cpu_dma_latency";
BIC_PRESENT(BIC_IPC);
}
+void probe_cstates(void)
+{
+ probe_cst_limit();
+
+ if (platform->supported_cstates & CC1)
+ BIC_PRESENT(BIC_CPU_c1);
+
+ if (platform->supported_cstates & CC3)
+ BIC_PRESENT(BIC_CPU_c3);
+
+ if (platform->supported_cstates & CC6)
+ BIC_PRESENT(BIC_CPU_c6);
+
+ if (platform->supported_cstates & CC7)
+ BIC_PRESENT(BIC_CPU_c7);
+
+ if (platform->supported_cstates & PC2 && (pkg_cstate_limit >= PCL__2))
+ BIC_PRESENT(BIC_Pkgpc2);
+
+ if (platform->supported_cstates & PC3 && (pkg_cstate_limit >= PCL__3))
+ BIC_PRESENT(BIC_Pkgpc3);
+
+ if (platform->supported_cstates & PC6 && (pkg_cstate_limit >= PCL__6))
+ BIC_PRESENT(BIC_Pkgpc6);
+
+ if (platform->supported_cstates & PC7 && (pkg_cstate_limit >= PCL__7))
+ BIC_PRESENT(BIC_Pkgpc7);
+
+ if (platform->supported_cstates & PC8 && (pkg_cstate_limit >= PCL__8))
+ BIC_PRESENT(BIC_Pkgpc8);
+
+ if (platform->supported_cstates & PC9 && (pkg_cstate_limit >= PCL__9))
+ BIC_PRESENT(BIC_Pkgpc9);
+
+ if (platform->supported_cstates & PC10 && (pkg_cstate_limit >= PCL_10))
+ BIC_PRESENT(BIC_Pkgpc10);
+
+ if (platform->has_msr_module_c6_res_ms)
+ BIC_PRESENT(BIC_Mod_c6);
+
+ if (platform->has_ext_cst_msrs) {
+ BIC_PRESENT(BIC_Totl_c0);
+ BIC_PRESENT(BIC_Any_c0);
+ BIC_PRESENT(BIC_GFX_c0);
+ BIC_PRESENT(BIC_CPUGFX);
+ }
+
+ if (quiet)
+ return;
+
+ dump_power_ctl();
+ dump_cst_cfg();
+ decode_c6_demotion_policy_msr();
+ print_dev_latency();
+ dump_sysfs_cstate_config();
+ print_irtl();
+}
+
+void probe_lpi(void)
+{
+ if (!access("/sys/devices/system/cpu/cpuidle/low_power_idle_cpu_residency_us", R_OK))
+ BIC_PRESENT(BIC_CPU_LPI);
+ else
+ BIC_NOT_PRESENT(BIC_CPU_LPI);
+
+ if (!access(sys_lpi_file_sysfs, R_OK)) {
+ sys_lpi_file = sys_lpi_file_sysfs;
+ BIC_PRESENT(BIC_SYS_LPI);
+ } else if (!access(sys_lpi_file_debugfs, R_OK)) {
+ sys_lpi_file = sys_lpi_file_debugfs;
+ BIC_PRESENT(BIC_SYS_LPI);
+ } else {
+ sys_lpi_file_sysfs = NULL;
+ BIC_NOT_PRESENT(BIC_SYS_LPI);
+ }
+
+}
+
+void probe_pstates(void)
+{
+ probe_bclk();
+
+ if (quiet)
+ return;
+
+ dump_platform_info();
+ dump_turbo_ratio_info();
+ dump_sysfs_pstate_config();
+ decode_misc_pwr_mgmt_msr();
+
+ for_all_cpus(print_hwp, ODD_COUNTERS);
+ for_all_cpus(print_epb, ODD_COUNTERS);
+ for_all_cpus(print_perf_limit, ODD_COUNTERS);
+}
+
void process_cpuid()
{
unsigned int eax, ebx, ecx, edx;
edx_flags & (1 << 22) ? "ACPI-TM" : "-",
edx_flags & (1 << 28) ? "HT" : "-", edx_flags & (1 << 29) ? "TM" : "-");
}
- if (genuine_intel) {
- model_orig = model;
- model = intel_model_duplicates(model);
- }
+
+ probe_platform_features(family, model);
if (!(edx_flags & (1 << 5)))
errx(1, "CPUID: no MSR");
__cpuid(0x15, eax_crystal, ebx_tsc, crystal_hz, edx);
if (ebx_tsc != 0) {
-
if (!quiet && (ebx != 0))
fprintf(outf, "CPUID(0x15): eax_crystal: %d ebx_tsc: %d ecx_crystal_hz: %d\n",
eax_crystal, ebx_tsc, crystal_hz);
if (crystal_hz == 0)
- switch (model) {
- case INTEL_FAM6_SKYLAKE_L: /* SKL */
- crystal_hz = 24000000; /* 24.0 MHz */
- break;
- case INTEL_FAM6_ATOM_GOLDMONT_D: /* DNV */
- crystal_hz = 25000000; /* 25.0 MHz */
- break;
- case INTEL_FAM6_ATOM_GOLDMONT: /* BXT */
- case INTEL_FAM6_ATOM_GOLDMONT_PLUS:
- crystal_hz = 19200000; /* 19.2 MHz */
- break;
- default:
- crystal_hz = 0;
- }
+ crystal_hz = platform->crystal_freq;
if (crystal_hz) {
tsc_hz = (unsigned long long)crystal_hz *ebx_tsc / eax_crystal;
}
if (has_aperf)
- aperf_mperf_multiplier = get_aperf_mperf_multiplier(family, model);
+ aperf_mperf_multiplier = platform->need_perf_multiplier ? 1024 : 1;
BIC_PRESENT(BIC_IRQ);
BIC_PRESENT(BIC_TSC_MHz);
+}
- if (probe_nhm_msrs(family, model)) {
- do_nhm_platform_info = 1;
- BIC_PRESENT(BIC_CPU_c1);
- BIC_PRESENT(BIC_CPU_c3);
- BIC_PRESENT(BIC_CPU_c6);
- BIC_PRESENT(BIC_SMI);
- }
- do_snb_cstates = has_snb_msrs(family, model);
-
- if (do_snb_cstates)
- BIC_PRESENT(BIC_CPU_c7);
-
- do_irtl_snb = has_snb_msrs(family, model);
- if (do_snb_cstates && (pkg_cstate_limit >= PCL__2))
- BIC_PRESENT(BIC_Pkgpc2);
- if (pkg_cstate_limit >= PCL__3)
- BIC_PRESENT(BIC_Pkgpc3);
- if (pkg_cstate_limit >= PCL__6)
- BIC_PRESENT(BIC_Pkgpc6);
- if (do_snb_cstates && (pkg_cstate_limit >= PCL__7))
- BIC_PRESENT(BIC_Pkgpc7);
- if (has_slv_msrs(family, model)) {
- BIC_NOT_PRESENT(BIC_Pkgpc2);
- BIC_NOT_PRESENT(BIC_Pkgpc3);
- BIC_PRESENT(BIC_Pkgpc6);
- BIC_NOT_PRESENT(BIC_Pkgpc7);
- BIC_PRESENT(BIC_Mod_c6);
- use_c1_residency_msr = 1;
- }
- if (is_jvl(family, model)) {
- BIC_NOT_PRESENT(BIC_CPU_c3);
- BIC_NOT_PRESENT(BIC_CPU_c7);
- BIC_NOT_PRESENT(BIC_Pkgpc2);
- BIC_NOT_PRESENT(BIC_Pkgpc3);
- BIC_NOT_PRESENT(BIC_Pkgpc6);
- BIC_NOT_PRESENT(BIC_Pkgpc7);
- }
- if (is_dnv(family, model)) {
- BIC_PRESENT(BIC_CPU_c1);
- BIC_NOT_PRESENT(BIC_CPU_c3);
- BIC_NOT_PRESENT(BIC_Pkgpc3);
- BIC_NOT_PRESENT(BIC_CPU_c7);
- BIC_NOT_PRESENT(BIC_Pkgpc7);
- use_c1_residency_msr = 1;
- }
- if (is_skx(family, model) || is_icx(family, model) || is_spr(family, model)) {
- BIC_NOT_PRESENT(BIC_CPU_c3);
- BIC_NOT_PRESENT(BIC_Pkgpc3);
- BIC_NOT_PRESENT(BIC_CPU_c7);
- BIC_NOT_PRESENT(BIC_Pkgpc7);
- }
- if (is_bdx(family, model)) {
- BIC_NOT_PRESENT(BIC_CPU_c7);
- BIC_NOT_PRESENT(BIC_Pkgpc7);
- }
- if (has_c8910_msrs(family, model)) {
- if (pkg_cstate_limit >= PCL__8)
- BIC_PRESENT(BIC_Pkgpc8);
- if (pkg_cstate_limit >= PCL__9)
- BIC_PRESENT(BIC_Pkgpc9);
- if (pkg_cstate_limit >= PCL_10)
- BIC_PRESENT(BIC_Pkgpc10);
- }
- do_irtl_hsw = has_c8910_msrs(family, model);
- if (has_skl_msrs(family, model)) {
- BIC_PRESENT(BIC_Totl_c0);
- BIC_PRESENT(BIC_Any_c0);
- BIC_PRESENT(BIC_GFX_c0);
- BIC_PRESENT(BIC_CPUGFX);
- }
- do_slm_cstates = is_slm(family, model);
- do_knl_cstates = is_knl(family, model);
-
- if (do_slm_cstates || do_knl_cstates || is_cnl(family, model) || is_ehl(family, model))
- BIC_NOT_PRESENT(BIC_CPU_c3);
-
- if (!quiet)
- decode_misc_pwr_mgmt_msr();
-
- if (!quiet && has_slv_msrs(family, model))
- decode_c6_demotion_policy_msr();
-
- rapl_probe(family, model);
- perf_limit_reasons_probe(family, model);
- automatic_cstate_conversion_probe(family, model);
-
- check_tcc_offset(model_orig);
-
- if (!quiet)
- dump_cstate_pstate_config_info(family, model);
- intel_uncore_frequency_probe();
-
- if (!quiet)
- print_dev_latency();
- if (!quiet)
- dump_sysfs_cstate_config();
- if (!quiet)
- dump_sysfs_pstate_config();
+void probe_pm_features(void)
+{
+ probe_pstates();
- if (has_skl_msrs(family, model) || is_ehl(family, model))
- calculate_tsc_tweak();
+ probe_cstates();
- if (!access("/sys/class/drm/card0/power/rc6_residency_ms", R_OK))
- BIC_PRESENT(BIC_GFX_rc6);
+ probe_lpi();
- if (!access("/sys/class/graphics/fb0/device/drm/card0/gt_cur_freq_mhz", R_OK))
- BIC_PRESENT(BIC_GFXMHz);
+ probe_intel_uncore_frequency();
- if (!access("/sys/class/graphics/fb0/device/drm/card0/gt_act_freq_mhz", R_OK))
- BIC_PRESENT(BIC_GFXACTMHz);
+ probe_graphics();
- if (!access("/sys/devices/system/cpu/cpuidle/low_power_idle_cpu_residency_us", R_OK))
- BIC_PRESENT(BIC_CPU_LPI);
- else
- BIC_NOT_PRESENT(BIC_CPU_LPI);
+ probe_rapl();
- if (!access("/sys/devices/system/cpu/cpu0/thermal_throttle/core_throttle_count", R_OK))
- BIC_PRESENT(BIC_CORE_THROT_CNT);
- else
- BIC_NOT_PRESENT(BIC_CORE_THROT_CNT);
+ probe_thermal();
- if (!access(sys_lpi_file_sysfs, R_OK)) {
- sys_lpi_file = sys_lpi_file_sysfs;
- BIC_PRESENT(BIC_SYS_LPI);
- } else if (!access(sys_lpi_file_debugfs, R_OK)) {
- sys_lpi_file = sys_lpi_file_debugfs;
- BIC_PRESENT(BIC_SYS_LPI);
- } else {
- sys_lpi_file_sysfs = NULL;
- BIC_NOT_PRESENT(BIC_SYS_LPI);
- }
+ if (platform->has_nhm_msrs)
+ BIC_PRESENT(BIC_SMI);
if (!quiet)
decode_misc_feature_control();
-
- return;
}
/*
return 0;
}
-void topology_probe()
+void topology_probe(bool startup)
{
int i;
int max_core_id = 0;
for_all_proc_cpus(mark_cpu_present);
/*
- * Validate that all cpus in cpu_subset are also in cpu_present_set
+ * Allocate and initialize cpu_effective_set
+ */
+ cpu_effective_set = CPU_ALLOC((topo.max_cpu_num + 1));
+ if (cpu_effective_set == NULL)
+ err(3, "CPU_ALLOC");
+ cpu_effective_setsize = CPU_ALLOC_SIZE((topo.max_cpu_num + 1));
+ CPU_ZERO_S(cpu_effective_setsize, cpu_effective_set);
+ update_effective_set(startup);
+
+ /*
+ * Allocate and initialize cpu_allowed_set
+ */
+ cpu_allowed_set = CPU_ALLOC((topo.max_cpu_num + 1));
+ if (cpu_allowed_set == NULL)
+ err(3, "CPU_ALLOC");
+ cpu_allowed_setsize = CPU_ALLOC_SIZE((topo.max_cpu_num + 1));
+ CPU_ZERO_S(cpu_allowed_setsize, cpu_allowed_set);
+
+ /*
+ * Validate and update cpu_allowed_set.
+ *
+ * Make sure all cpus in cpu_subset are also in cpu_present_set during startup.
+ * Give a warning when cpus in cpu_subset become unavailable at runtime.
+ * Give a warning when cpus are not effective because of cgroup setting.
+ *
+ * cpu_allowed_set is the intersection of cpu_present_set/cpu_effective_set/cpu_subset.
*/
for (i = 0; i < CPU_SUBSET_MAXCPUS; ++i) {
- if (CPU_ISSET_S(i, cpu_subset_size, cpu_subset))
- if (!CPU_ISSET_S(i, cpu_present_setsize, cpu_present_set))
- err(1, "cpu%d not present", i);
+ if (cpu_subset && !CPU_ISSET_S(i, cpu_subset_size, cpu_subset))
+ continue;
+
+ if (!CPU_ISSET_S(i, cpu_present_setsize, cpu_present_set)) {
+ if (cpu_subset) {
+ /* cpus in cpu_subset must be in cpu_present_set during startup */
+ if (startup)
+ err(1, "cpu%d not present", i);
+ else
+ fprintf(stderr, "cpu%d not present\n", i);
+ }
+ continue;
+ }
+
+ if (CPU_COUNT_S(cpu_effective_setsize, cpu_effective_set)) {
+ if (!CPU_ISSET_S(i, cpu_effective_setsize, cpu_effective_set)) {
+ fprintf(stderr, "cpu%d not effective\n", i);
+ continue;
+ }
+ }
+
+ CPU_SET_S(i, cpu_allowed_setsize, cpu_allowed_set);
}
+ if (!CPU_COUNT_S(cpu_allowed_setsize, cpu_allowed_set))
+ err(-ENODEV, "No valid cpus found");
+ sched_setaffinity(0, cpu_allowed_setsize, cpu_allowed_set);
+
/*
* Allocate and initialize cpu_affinity_set
*/
if (*c == NULL)
goto error;
- for (i = 0; i < num_cores; i++)
+ for (i = 0; i < num_cores; i++) {
(*c)[i].core_id = -1;
+ (*c)[i].base_cpu = -1;
+ }
*p = calloc(topo.num_packages, sizeof(struct pkg_data));
if (*p == NULL)
goto error;
- for (i = 0; i < topo.num_packages; i++)
+ for (i = 0; i < topo.num_packages; i++) {
(*p)[i].package_id = i;
+ (*p)[i].base_cpu = -1;
+ }
return;
error:
p = GET_PKG(pkg_base, pkg_id);
t->cpu_id = cpu_id;
- if (thread_id == 0) {
- t->flags |= CPU_IS_FIRST_THREAD_IN_CORE;
- if (cpu_is_first_core_in_package(cpu_id))
- t->flags |= CPU_IS_FIRST_CORE_IN_PACKAGE;
+ if (!cpu_is_not_allowed(cpu_id)) {
+ if (c->base_cpu < 0)
+ c->base_cpu = t->cpu_id;
+ if (p->base_cpu < 0)
+ p->base_cpu = t->cpu_id;
}
c->core_id = core_id;
err(-1, "calloc %d", topo.max_cpu_num + 1);
}
-void setup_all_buffers(void)
+int update_topo(struct thread_data *t, struct core_data *c, struct pkg_data *p)
+{
+ topo.allowed_cpus++;
+ if ((int)t->cpu_id == c->base_cpu)
+ topo.allowed_cores++;
+ if ((int)t->cpu_id == p->base_cpu)
+ topo.allowed_packages++;
+
+ return 0;
+}
+
+void topology_update(void)
+{
+ topo.allowed_cpus = 0;
+ topo.allowed_cores = 0;
+ topo.allowed_packages = 0;
+ for_all_cpus(update_topo, ODD_COUNTERS);
+}
+void setup_all_buffers(bool startup)
{
- topology_probe();
+ topology_probe(startup);
allocate_irq_buffers();
allocate_fd_percpu();
allocate_counters(&thread_even, &core_even, &package_even);
allocate_counters(&thread_odd, &core_odd, &package_odd);
allocate_output_buffer();
for_all_proc_cpus(initialize_counters);
+ topology_update();
}
void set_base_cpu(void)
{
- base_cpu = sched_getcpu();
- if (base_cpu < 0)
- err(-ENODEV, "No valid cpus found");
+ int i;
- if (debug > 1)
- fprintf(outf, "base_cpu = %d\n", base_cpu);
+ for (i = 0; i < topo.max_cpu_num + 1; ++i) {
+ if (cpu_is_not_allowed(i))
+ continue;
+ base_cpu = i;
+ if (debug > 1)
+ fprintf(outf, "base_cpu = %d\n", base_cpu);
+ return;
+ }
+ err(-ENODEV, "No valid cpus found");
}
void turbostat_init()
{
- setup_all_buffers();
+ setup_all_buffers(true);
set_base_cpu();
check_dev_msr();
check_permissions();
process_cpuid();
+ probe_pm_features();
linux_perf_init();
- if (!quiet)
- for_all_cpus(print_hwp, ODD_COUNTERS);
-
- if (!quiet)
- for_all_cpus(print_epb, ODD_COUNTERS);
-
- if (!quiet)
- for_all_cpus(print_perf_limit, ODD_COUNTERS);
-
- if (!quiet)
- for_all_cpus(print_rapl, ODD_COUNTERS);
-
- for_all_cpus(set_temperature_target, ODD_COUNTERS);
-
for_all_cpus(get_cpu_type, ODD_COUNTERS);
for_all_cpus(get_cpu_type, EVEN_COUNTERS);
- if (!quiet)
- for_all_cpus(print_thermal, ODD_COUNTERS);
-
- if (!quiet && do_irtl_snb)
- print_irtl();
-
if (DO_BIC(BIC_IPC))
(void)get_instr_count_fd(base_cpu);
}
first_counter_read = 0;
if (status)
exit(status);
- /* clear affinity side-effect of get_counters() */
- sched_setaffinity(0, cpu_present_setsize, cpu_present_set);
gettimeofday(&tv_even, (struct timezone *)NULL);
child_pid = fork();
void print_version()
{
- fprintf(outf, "turbostat version 2023.03.17 - Len Brown <lenb@kernel.org>\n");
+ fprintf(outf, "turbostat version 2023.11.07 - Len Brown <lenb@kernel.org>\n");
}
#define COMMAND_LINE_SIZE 2048
*/
void parse_cpu_command(char *optarg)
{
- unsigned int start, end;
- char *next;
-
if (!strcmp(optarg, "core")) {
if (cpu_subset)
goto error;
CPU_ZERO_S(cpu_subset_size, cpu_subset);
- next = optarg;
-
- while (next && *next) {
-
- if (*next == '-') /* no negative cpu numbers */
- goto error;
-
- start = strtoul(next, &next, 10);
-
- if (start >= CPU_SUBSET_MAXCPUS)
- goto error;
- CPU_SET_S(start, cpu_subset_size, cpu_subset);
-
- if (*next == '\0')
- break;
-
- if (*next == ',') {
- next += 1;
- continue;
- }
-
- if (*next == '-') {
- next += 1; /* start range */
- } else if (*next == '.') {
- next += 1;
- if (*next == '.')
- next += 1; /* start range */
- else
- goto error;
- }
-
- end = strtoul(next, &next, 10);
- if (end <= start)
- goto error;
-
- while (++start <= end) {
- if (start >= CPU_SUBSET_MAXCPUS)
- goto error;
- CPU_SET_S(start, cpu_subset_size, cpu_subset);
- }
-
- if (*next == ',')
- next += 1;
- else if (*next != '\0')
- goto error;
- }
+ if (parse_cpu_str(optarg, cpu_subset, cpu_subset_size))
+ goto error;
return;
int main(int argc, char **argv)
{
+ int fd, ret;
+
+ fd = open("/sys/fs/cgroup/cgroup.procs", O_WRONLY);
+ if (fd < 0)
+ goto skip_cgroup_setting;
+
+ ret = write(fd, "0\n", 2);
+ if (ret == -1)
+ perror("Can't update cgroup\n");
+
+ close(fd);
+
+skip_cgroup_setting:
outf = stderr;
cmdline(argc, argv);
cxl_core-$(CONFIG_CXL_REGION) += $(CXL_CORE_SRC)/region.o
cxl_core-y += config_check.o
cxl_core-y += cxl_core_test.o
+cxl_core-y += cxl_core_exports.o
obj-m += test/
--- /dev/null
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright(c) 2022 Intel Corporation. All rights reserved. */
+
+#include "cxl.h"
+
+/* Exporting of cxl_core symbols that are only used by cxl_test */
+EXPORT_SYMBOL_NS_GPL(cxl_num_decoders_committed, CXL);
return 0;
dev_dbg(&port->dev, "%s commit\n", dev_name(&cxld->dev));
- if (port->commit_end + 1 != id) {
+ if (cxl_num_decoders_committed(port) != id) {
dev_dbg(&port->dev,
"%s: out of order commit, expected decoder%d.%d\n",
- dev_name(&cxld->dev), port->id, port->commit_end + 1);
+ dev_name(&cxld->dev), port->id,
+ cxl_num_decoders_committed(port));
return -EBUSY;
}
nfit_test_setup(ndtest_resource_lookup, NULL);
- rc = class_regster(&ndtest_dimm_class);
+ rc = class_register(&ndtest_dimm_class);
if (rc)
goto err_register;
abs_objtree := $(realpath $(abs_objtree))
BUILD := $(abs_objtree)/kselftest
KHDR_INCLUDES := -isystem ${abs_objtree}/usr/include
- KHDR_DIR := ${abs_objtree}/usr/include
else
BUILD := $(CURDIR)
abs_srctree := $(shell cd $(top_srcdir) && pwd)
KHDR_INCLUDES := -isystem ${abs_srctree}/usr/include
- KHDR_DIR := ${abs_srctree}/usr/include
DEFAULT_INSTALL_HDR_PATH := 1
endif
# all isn't the first target in the file.
.DEFAULT_GOAL := all
-all: kernel_header_files
+all:
@ret=1; \
for TARGET in $(TARGETS); do \
BUILD_TARGET=$$BUILD/$$TARGET; \
ret=$$((ret * $$?)); \
done; exit $$ret;
-kernel_header_files:
- @ls $(KHDR_DIR)/linux/*.h >/dev/null 2>/dev/null; \
- if [ $$? -ne 0 ]; then \
- RED='\033[1;31m'; \
- NOCOLOR='\033[0m'; \
- echo; \
- echo -e "$${RED}error$${NOCOLOR}: missing kernel header files."; \
- echo "Please run this and try again:"; \
- echo; \
- echo " cd $(top_srcdir)"; \
- echo " make headers"; \
- echo; \
- exit 1; \
- fi
-
-.PHONY: kernel_header_files
-
run_tests: all
@for TARGET in $(TARGETS); do \
BUILD_TARGET=$$BUILD/$$TARGET; \
err = snd_ctl_elem_info(card_data->handle,
ctl_data->info);
if (err < 0) {
- ksft_print_msg("%s getting info for %d\n",
+ ksft_print_msg("%s getting info for %s\n",
snd_strerror(err),
ctl_data->name);
}
*/
ret = open("/proc/sys/abi/sme_default_vector_length", O_RDONLY, 0);
if (ret >= 0) {
- ksft_test_result(fork_test(), "fork_test");
+ ksft_test_result(fork_test(), "fork_test\n");
} else {
ksft_print_msg("SME not supported\n");
CONFIG_CRYPTO_XXHASH=y
CONFIG_DCB=y
CONFIG_DEBUG_ATOMIC_SLEEP=y
-CONFIG_DEBUG_CREDENTIALS=y
CONFIG_DEBUG_INFO_BTF=y
CONFIG_DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT=y
CONFIG_DEBUG_MEMORY_INIT=y
test_sockmap_pass_prog__destroy(pass);
}
+static void test_sockmap_unconnected_unix(void)
+{
+ int err, map, stream = 0, dgram = 0, zero = 0;
+ struct test_sockmap_pass_prog *skel;
+
+ skel = test_sockmap_pass_prog__open_and_load();
+ if (!ASSERT_OK_PTR(skel, "open_and_load"))
+ return;
+
+ map = bpf_map__fd(skel->maps.sock_map_rx);
+
+ stream = xsocket(AF_UNIX, SOCK_STREAM, 0);
+ if (stream < 0)
+ return;
+
+ dgram = xsocket(AF_UNIX, SOCK_DGRAM, 0);
+ if (dgram < 0) {
+ close(stream);
+ return;
+ }
+
+ err = bpf_map_update_elem(map, &zero, &stream, BPF_ANY);
+ ASSERT_ERR(err, "bpf_map_update_elem(stream)");
+
+ err = bpf_map_update_elem(map, &zero, &dgram, BPF_ANY);
+ ASSERT_OK(err, "bpf_map_update_elem(dgram)");
+
+ close(stream);
+ close(dgram);
+}
+
void test_sockmap_basic(void)
{
if (test__start_subtest("sockmap create_update_free"))
test_sockmap_skb_verdict_fionread(false);
if (test__start_subtest("sockmap skb_verdict msg_f_peek"))
test_sockmap_skb_verdict_peek();
+
+ if (test__start_subtest("sockmap unconnected af_unix"))
+ test_sockmap_unconnected_unix();
}
}
static void pairs_redir_to_connected(int cli0, int peer0, int cli1, int peer1,
- int sock_mapfd, int verd_mapfd, enum redir_mode mode)
+ int sock_mapfd, int nop_mapfd,
+ int verd_mapfd, enum redir_mode mode)
{
const char *log_prefix = redir_mode_str(mode);
unsigned int pass;
if (err)
return;
+ if (nop_mapfd >= 0) {
+ err = add_to_sockmap(nop_mapfd, cli0, cli1);
+ if (err)
+ return;
+ }
+
n = write(cli1, "a", 1);
if (n < 0)
FAIL_ERRNO("%s: write", log_prefix);
goto close0;
c1 = sfd[0], p1 = sfd[1];
- pairs_redir_to_connected(c0, p0, c1, p1, sock_mapfd, verd_mapfd, mode);
+ pairs_redir_to_connected(c0, p0, c1, p1, sock_mapfd, -1, verd_mapfd, mode);
xclose(c1);
xclose(p1);
if (err)
goto close_cli0;
- pairs_redir_to_connected(c0, p0, c1, p1, sock_mapfd, verd_mapfd, mode);
+ pairs_redir_to_connected(c0, p0, c1, p1, sock_mapfd, -1, verd_mapfd, mode);
xclose(c1);
xclose(p1);
if (err)
goto close;
- pairs_redir_to_connected(c0, p0, c1, p1, sock_mapfd, verd_mapfd, mode);
+ pairs_redir_to_connected(c0, p0, c1, p1, sock_mapfd, -1, verd_mapfd, mode);
xclose(c1);
xclose(p1);
xbpf_prog_detach2(verdict, sock_map, BPF_SK_SKB_VERDICT);
}
-static void unix_inet_redir_to_connected(int family, int type, int sock_mapfd,
- int verd_mapfd, enum redir_mode mode)
+static void unix_inet_redir_to_connected(int family, int type,
+ int sock_mapfd, int nop_mapfd,
+ int verd_mapfd,
+ enum redir_mode mode)
{
int c0, c1, p0, p1;
int sfd[2];
goto close_cli0;
c1 = sfd[0], p1 = sfd[1];
- pairs_redir_to_connected(c0, p0, c1, p1, sock_mapfd, verd_mapfd, mode);
+ pairs_redir_to_connected(c0, p0, c1, p1,
+ sock_mapfd, nop_mapfd, verd_mapfd, mode);
xclose(c1);
xclose(p1);
struct bpf_map *inner_map, int family)
{
int verdict = bpf_program__fd(skel->progs.prog_skb_verdict);
+ int nop_map = bpf_map__fd(skel->maps.nop_map);
int verdict_map = bpf_map__fd(skel->maps.verdict_map);
int sock_map = bpf_map__fd(inner_map);
int err;
return;
skel->bss->test_ingress = false;
- unix_inet_redir_to_connected(family, SOCK_DGRAM, sock_map, verdict_map,
+ unix_inet_redir_to_connected(family, SOCK_DGRAM,
+ sock_map, -1, verdict_map,
REDIR_EGRESS);
- unix_inet_redir_to_connected(family, SOCK_STREAM, sock_map, verdict_map,
+ unix_inet_redir_to_connected(family, SOCK_DGRAM,
+ sock_map, -1, verdict_map,
+ REDIR_EGRESS);
+
+ unix_inet_redir_to_connected(family, SOCK_DGRAM,
+ sock_map, nop_map, verdict_map,
+ REDIR_EGRESS);
+ unix_inet_redir_to_connected(family, SOCK_STREAM,
+ sock_map, nop_map, verdict_map,
REDIR_EGRESS);
skel->bss->test_ingress = true;
- unix_inet_redir_to_connected(family, SOCK_DGRAM, sock_map, verdict_map,
+ unix_inet_redir_to_connected(family, SOCK_DGRAM,
+ sock_map, -1, verdict_map,
+ REDIR_INGRESS);
+ unix_inet_redir_to_connected(family, SOCK_STREAM,
+ sock_map, -1, verdict_map,
+ REDIR_INGRESS);
+
+ unix_inet_redir_to_connected(family, SOCK_DGRAM,
+ sock_map, nop_map, verdict_map,
REDIR_INGRESS);
- unix_inet_redir_to_connected(family, SOCK_STREAM, sock_map, verdict_map,
+ unix_inet_redir_to_connected(family, SOCK_STREAM,
+ sock_map, nop_map, verdict_map,
REDIR_INGRESS);
xbpf_prog_detach2(verdict, sock_map, BPF_SK_SKB_VERDICT);
// SPDX-License-Identifier: GPL-2.0
+#include <unistd.h>
#include <test_progs.h>
#include <network_helpers.h>
+#include "tailcall_poke.skel.h"
+
/* test_tailcall_1 checks basic functionality by patching multiple locations
* in a single program for a single tail call slot with nop->jmp, jmp->nop
bpf_object__close(tgt_obj);
}
+#define JMP_TABLE "/sys/fs/bpf/jmp_table"
+
+static int poke_thread_exit;
+
+static void *poke_update(void *arg)
+{
+ __u32 zero = 0, prog1_fd, prog2_fd, map_fd;
+ struct tailcall_poke *call = arg;
+
+ map_fd = bpf_map__fd(call->maps.jmp_table);
+ prog1_fd = bpf_program__fd(call->progs.call1);
+ prog2_fd = bpf_program__fd(call->progs.call2);
+
+ while (!poke_thread_exit) {
+ bpf_map_update_elem(map_fd, &zero, &prog1_fd, BPF_ANY);
+ bpf_map_update_elem(map_fd, &zero, &prog2_fd, BPF_ANY);
+ }
+
+ return NULL;
+}
+
+/*
+ * We are trying to hit prog array update during another program load
+ * that shares the same prog array map.
+ *
+ * For that we share the jmp_table map between two skeleton instances
+ * by pinning the jmp_table to same path. Then first skeleton instance
+ * periodically updates jmp_table in 'poke update' thread while we load
+ * the second skeleton instance in the main thread.
+ */
+static void test_tailcall_poke(void)
+{
+ struct tailcall_poke *call, *test;
+ int err, cnt = 10;
+ pthread_t thread;
+
+ unlink(JMP_TABLE);
+
+ call = tailcall_poke__open_and_load();
+ if (!ASSERT_OK_PTR(call, "tailcall_poke__open"))
+ return;
+
+ err = bpf_map__pin(call->maps.jmp_table, JMP_TABLE);
+ if (!ASSERT_OK(err, "bpf_map__pin"))
+ goto out;
+
+ err = pthread_create(&thread, NULL, poke_update, call);
+ if (!ASSERT_OK(err, "new toggler"))
+ goto out;
+
+ while (cnt--) {
+ test = tailcall_poke__open();
+ if (!ASSERT_OK_PTR(test, "tailcall_poke__open"))
+ break;
+
+ err = bpf_map__set_pin_path(test->maps.jmp_table, JMP_TABLE);
+ if (!ASSERT_OK(err, "bpf_map__pin")) {
+ tailcall_poke__destroy(test);
+ break;
+ }
+
+ bpf_program__set_autoload(test->progs.test, true);
+ bpf_program__set_autoload(test->progs.call1, false);
+ bpf_program__set_autoload(test->progs.call2, false);
+
+ err = tailcall_poke__load(test);
+ tailcall_poke__destroy(test);
+ if (!ASSERT_OK(err, "tailcall_poke__load"))
+ break;
+ }
+
+ poke_thread_exit = 1;
+ ASSERT_OK(pthread_join(thread, NULL), "pthread_join");
+
+out:
+ bpf_map__unpin(call->maps.jmp_table, JMP_TABLE);
+ tailcall_poke__destroy(call);
+}
+
void test_tailcalls(void)
{
if (test__start_subtest("tailcall_1"))
test_tailcall_bpf2bpf_fentry_fexit();
if (test__start_subtest("tailcall_bpf2bpf_fentry_entry"))
test_tailcall_bpf2bpf_fentry_entry();
+ if (test__start_subtest("tailcall_poke"))
+ test_tailcall_poke();
}
#include "test_progs.h"
#include "network_helpers.h"
+#include "netlink_helpers.h"
#include "test_tc_neigh_fib.skel.h"
#include "test_tc_neigh.skel.h"
#include "test_tc_peer.skel.h"
}
}
+enum dev_mode {
+ MODE_VETH,
+ MODE_NETKIT,
+};
+
struct netns_setup_result {
- int ifindex_veth_src;
- int ifindex_veth_src_fwd;
- int ifindex_veth_dst;
- int ifindex_veth_dst_fwd;
+ enum dev_mode dev_mode;
+ int ifindex_src;
+ int ifindex_src_fwd;
+ int ifindex_dst;
+ int ifindex_dst_fwd;
};
static int get_ifaddr(const char *name, char *ifaddr)
return 0;
}
+static int create_netkit(int mode, char *prim, char *peer)
+{
+ struct rtattr *linkinfo, *data, *peer_info;
+ struct rtnl_handle rth = { .fd = -1 };
+ const char *type = "netkit";
+ struct {
+ struct nlmsghdr n;
+ struct ifinfomsg i;
+ char buf[1024];
+ } req = {};
+ int err;
+
+ err = rtnl_open(&rth, 0);
+ if (!ASSERT_OK(err, "open_rtnetlink"))
+ return err;
+
+ memset(&req, 0, sizeof(req));
+ req.n.nlmsg_len = NLMSG_LENGTH(sizeof(struct ifinfomsg));
+ req.n.nlmsg_flags = NLM_F_REQUEST | NLM_F_CREATE | NLM_F_EXCL;
+ req.n.nlmsg_type = RTM_NEWLINK;
+ req.i.ifi_family = AF_UNSPEC;
+
+ addattr_l(&req.n, sizeof(req), IFLA_IFNAME, prim, strlen(prim));
+ linkinfo = addattr_nest(&req.n, sizeof(req), IFLA_LINKINFO);
+ addattr_l(&req.n, sizeof(req), IFLA_INFO_KIND, type, strlen(type));
+ data = addattr_nest(&req.n, sizeof(req), IFLA_INFO_DATA);
+ addattr32(&req.n, sizeof(req), IFLA_NETKIT_MODE, mode);
+ peer_info = addattr_nest(&req.n, sizeof(req), IFLA_NETKIT_PEER_INFO);
+ req.n.nlmsg_len += sizeof(struct ifinfomsg);
+ addattr_l(&req.n, sizeof(req), IFLA_IFNAME, peer, strlen(peer));
+ addattr_nest_end(&req.n, peer_info);
+ addattr_nest_end(&req.n, data);
+ addattr_nest_end(&req.n, linkinfo);
+
+ err = rtnl_talk(&rth, &req.n, NULL);
+ ASSERT_OK(err, "talk_rtnetlink");
+ rtnl_close(&rth);
+ return err;
+}
+
static int netns_setup_links_and_routes(struct netns_setup_result *result)
{
struct nstoken *nstoken = NULL;
- char veth_src_fwd_addr[IFADDR_STR_LEN+1] = {};
-
- SYS(fail, "ip link add veth_src type veth peer name veth_src_fwd");
- SYS(fail, "ip link add veth_dst type veth peer name veth_dst_fwd");
+ char src_fwd_addr[IFADDR_STR_LEN+1] = {};
+ int err;
- SYS(fail, "ip link set veth_dst_fwd address " MAC_DST_FWD);
- SYS(fail, "ip link set veth_dst address " MAC_DST);
+ if (result->dev_mode == MODE_VETH) {
+ SYS(fail, "ip link add src type veth peer name src_fwd");
+ SYS(fail, "ip link add dst type veth peer name dst_fwd");
+
+ SYS(fail, "ip link set dst_fwd address " MAC_DST_FWD);
+ SYS(fail, "ip link set dst address " MAC_DST);
+ } else if (result->dev_mode == MODE_NETKIT) {
+ err = create_netkit(NETKIT_L3, "src", "src_fwd");
+ if (!ASSERT_OK(err, "create_ifindex_src"))
+ goto fail;
+ err = create_netkit(NETKIT_L3, "dst", "dst_fwd");
+ if (!ASSERT_OK(err, "create_ifindex_dst"))
+ goto fail;
+ }
- if (get_ifaddr("veth_src_fwd", veth_src_fwd_addr))
+ if (get_ifaddr("src_fwd", src_fwd_addr))
goto fail;
- result->ifindex_veth_src = if_nametoindex("veth_src");
- if (!ASSERT_GT(result->ifindex_veth_src, 0, "ifindex_veth_src"))
+ result->ifindex_src = if_nametoindex("src");
+ if (!ASSERT_GT(result->ifindex_src, 0, "ifindex_src"))
goto fail;
- result->ifindex_veth_src_fwd = if_nametoindex("veth_src_fwd");
- if (!ASSERT_GT(result->ifindex_veth_src_fwd, 0, "ifindex_veth_src_fwd"))
+ result->ifindex_src_fwd = if_nametoindex("src_fwd");
+ if (!ASSERT_GT(result->ifindex_src_fwd, 0, "ifindex_src_fwd"))
goto fail;
- result->ifindex_veth_dst = if_nametoindex("veth_dst");
- if (!ASSERT_GT(result->ifindex_veth_dst, 0, "ifindex_veth_dst"))
+ result->ifindex_dst = if_nametoindex("dst");
+ if (!ASSERT_GT(result->ifindex_dst, 0, "ifindex_dst"))
goto fail;
- result->ifindex_veth_dst_fwd = if_nametoindex("veth_dst_fwd");
- if (!ASSERT_GT(result->ifindex_veth_dst_fwd, 0, "ifindex_veth_dst_fwd"))
+ result->ifindex_dst_fwd = if_nametoindex("dst_fwd");
+ if (!ASSERT_GT(result->ifindex_dst_fwd, 0, "ifindex_dst_fwd"))
goto fail;
- SYS(fail, "ip link set veth_src netns " NS_SRC);
- SYS(fail, "ip link set veth_src_fwd netns " NS_FWD);
- SYS(fail, "ip link set veth_dst_fwd netns " NS_FWD);
- SYS(fail, "ip link set veth_dst netns " NS_DST);
+ SYS(fail, "ip link set src netns " NS_SRC);
+ SYS(fail, "ip link set src_fwd netns " NS_FWD);
+ SYS(fail, "ip link set dst_fwd netns " NS_FWD);
+ SYS(fail, "ip link set dst netns " NS_DST);
/** setup in 'src' namespace */
nstoken = open_netns(NS_SRC);
if (!ASSERT_OK_PTR(nstoken, "setns src"))
goto fail;
- SYS(fail, "ip addr add " IP4_SRC "/32 dev veth_src");
- SYS(fail, "ip addr add " IP6_SRC "/128 dev veth_src nodad");
- SYS(fail, "ip link set dev veth_src up");
+ SYS(fail, "ip addr add " IP4_SRC "/32 dev src");
+ SYS(fail, "ip addr add " IP6_SRC "/128 dev src nodad");
+ SYS(fail, "ip link set dev src up");
- SYS(fail, "ip route add " IP4_DST "/32 dev veth_src scope global");
- SYS(fail, "ip route add " IP4_NET "/16 dev veth_src scope global");
- SYS(fail, "ip route add " IP6_DST "/128 dev veth_src scope global");
+ SYS(fail, "ip route add " IP4_DST "/32 dev src scope global");
+ SYS(fail, "ip route add " IP4_NET "/16 dev src scope global");
+ SYS(fail, "ip route add " IP6_DST "/128 dev src scope global");
- SYS(fail, "ip neigh add " IP4_DST " dev veth_src lladdr %s",
- veth_src_fwd_addr);
- SYS(fail, "ip neigh add " IP6_DST " dev veth_src lladdr %s",
- veth_src_fwd_addr);
+ if (result->dev_mode == MODE_VETH) {
+ SYS(fail, "ip neigh add " IP4_DST " dev src lladdr %s",
+ src_fwd_addr);
+ SYS(fail, "ip neigh add " IP6_DST " dev src lladdr %s",
+ src_fwd_addr);
+ }
close_netns(nstoken);
* needs v4 one in order to start ARP probing. IP4_NET route is added
* to the endpoints so that the ARP processing will reply.
*/
- SYS(fail, "ip addr add " IP4_SLL "/32 dev veth_src_fwd");
- SYS(fail, "ip addr add " IP4_DLL "/32 dev veth_dst_fwd");
- SYS(fail, "ip link set dev veth_src_fwd up");
- SYS(fail, "ip link set dev veth_dst_fwd up");
+ SYS(fail, "ip addr add " IP4_SLL "/32 dev src_fwd");
+ SYS(fail, "ip addr add " IP4_DLL "/32 dev dst_fwd");
+ SYS(fail, "ip link set dev src_fwd up");
+ SYS(fail, "ip link set dev dst_fwd up");
- SYS(fail, "ip route add " IP4_SRC "/32 dev veth_src_fwd scope global");
- SYS(fail, "ip route add " IP6_SRC "/128 dev veth_src_fwd scope global");
- SYS(fail, "ip route add " IP4_DST "/32 dev veth_dst_fwd scope global");
- SYS(fail, "ip route add " IP6_DST "/128 dev veth_dst_fwd scope global");
+ SYS(fail, "ip route add " IP4_SRC "/32 dev src_fwd scope global");
+ SYS(fail, "ip route add " IP6_SRC "/128 dev src_fwd scope global");
+ SYS(fail, "ip route add " IP4_DST "/32 dev dst_fwd scope global");
+ SYS(fail, "ip route add " IP6_DST "/128 dev dst_fwd scope global");
close_netns(nstoken);
if (!ASSERT_OK_PTR(nstoken, "setns dst"))
goto fail;
- SYS(fail, "ip addr add " IP4_DST "/32 dev veth_dst");
- SYS(fail, "ip addr add " IP6_DST "/128 dev veth_dst nodad");
- SYS(fail, "ip link set dev veth_dst up");
+ SYS(fail, "ip addr add " IP4_DST "/32 dev dst");
+ SYS(fail, "ip addr add " IP6_DST "/128 dev dst nodad");
+ SYS(fail, "ip link set dev dst up");
- SYS(fail, "ip route add " IP4_SRC "/32 dev veth_dst scope global");
- SYS(fail, "ip route add " IP4_NET "/16 dev veth_dst scope global");
- SYS(fail, "ip route add " IP6_SRC "/128 dev veth_dst scope global");
+ SYS(fail, "ip route add " IP4_SRC "/32 dev dst scope global");
+ SYS(fail, "ip route add " IP4_NET "/16 dev dst scope global");
+ SYS(fail, "ip route add " IP6_SRC "/128 dev dst scope global");
- SYS(fail, "ip neigh add " IP4_SRC " dev veth_dst lladdr " MAC_DST_FWD);
- SYS(fail, "ip neigh add " IP6_SRC " dev veth_dst lladdr " MAC_DST_FWD);
+ if (result->dev_mode == MODE_VETH) {
+ SYS(fail, "ip neigh add " IP4_SRC " dev dst lladdr " MAC_DST_FWD);
+ SYS(fail, "ip neigh add " IP6_SRC " dev dst lladdr " MAC_DST_FWD);
+ }
close_netns(nstoken);
const struct bpf_program *chk_prog,
const struct netns_setup_result *setup_result)
{
- LIBBPF_OPTS(bpf_tc_hook, qdisc_veth_src_fwd);
- LIBBPF_OPTS(bpf_tc_hook, qdisc_veth_dst_fwd);
+ LIBBPF_OPTS(bpf_tc_hook, qdisc_src_fwd);
+ LIBBPF_OPTS(bpf_tc_hook, qdisc_dst_fwd);
int err;
- /* tc qdisc add dev veth_src_fwd clsact */
- QDISC_CLSACT_CREATE(&qdisc_veth_src_fwd, setup_result->ifindex_veth_src_fwd);
- /* tc filter add dev veth_src_fwd ingress bpf da src_prog */
- XGRESS_FILTER_ADD(&qdisc_veth_src_fwd, BPF_TC_INGRESS, src_prog, 0);
- /* tc filter add dev veth_src_fwd egress bpf da chk_prog */
- XGRESS_FILTER_ADD(&qdisc_veth_src_fwd, BPF_TC_EGRESS, chk_prog, 0);
+ /* tc qdisc add dev src_fwd clsact */
+ QDISC_CLSACT_CREATE(&qdisc_src_fwd, setup_result->ifindex_src_fwd);
+ /* tc filter add dev src_fwd ingress bpf da src_prog */
+ XGRESS_FILTER_ADD(&qdisc_src_fwd, BPF_TC_INGRESS, src_prog, 0);
+ /* tc filter add dev src_fwd egress bpf da chk_prog */
+ XGRESS_FILTER_ADD(&qdisc_src_fwd, BPF_TC_EGRESS, chk_prog, 0);
- /* tc qdisc add dev veth_dst_fwd clsact */
- QDISC_CLSACT_CREATE(&qdisc_veth_dst_fwd, setup_result->ifindex_veth_dst_fwd);
- /* tc filter add dev veth_dst_fwd ingress bpf da dst_prog */
- XGRESS_FILTER_ADD(&qdisc_veth_dst_fwd, BPF_TC_INGRESS, dst_prog, 0);
- /* tc filter add dev veth_dst_fwd egress bpf da chk_prog */
- XGRESS_FILTER_ADD(&qdisc_veth_dst_fwd, BPF_TC_EGRESS, chk_prog, 0);
+ /* tc qdisc add dev dst_fwd clsact */
+ QDISC_CLSACT_CREATE(&qdisc_dst_fwd, setup_result->ifindex_dst_fwd);
+ /* tc filter add dev dst_fwd ingress bpf da dst_prog */
+ XGRESS_FILTER_ADD(&qdisc_dst_fwd, BPF_TC_INGRESS, dst_prog, 0);
+ /* tc filter add dev dst_fwd egress bpf da chk_prog */
+ XGRESS_FILTER_ADD(&qdisc_dst_fwd, BPF_TC_EGRESS, chk_prog, 0);
return 0;
fail:
static int netns_load_dtime_bpf(struct test_tc_dtime *skel,
const struct netns_setup_result *setup_result)
{
- LIBBPF_OPTS(bpf_tc_hook, qdisc_veth_src_fwd);
- LIBBPF_OPTS(bpf_tc_hook, qdisc_veth_dst_fwd);
- LIBBPF_OPTS(bpf_tc_hook, qdisc_veth_src);
- LIBBPF_OPTS(bpf_tc_hook, qdisc_veth_dst);
+ LIBBPF_OPTS(bpf_tc_hook, qdisc_src_fwd);
+ LIBBPF_OPTS(bpf_tc_hook, qdisc_dst_fwd);
+ LIBBPF_OPTS(bpf_tc_hook, qdisc_src);
+ LIBBPF_OPTS(bpf_tc_hook, qdisc_dst);
struct nstoken *nstoken;
int err;
nstoken = open_netns(NS_SRC);
if (!ASSERT_OK_PTR(nstoken, "setns " NS_SRC))
return -1;
- /* tc qdisc add dev veth_src clsact */
- QDISC_CLSACT_CREATE(&qdisc_veth_src, setup_result->ifindex_veth_src);
- /* tc filter add dev veth_src ingress bpf da ingress_host */
- XGRESS_FILTER_ADD(&qdisc_veth_src, BPF_TC_INGRESS, skel->progs.ingress_host, 0);
- /* tc filter add dev veth_src egress bpf da egress_host */
- XGRESS_FILTER_ADD(&qdisc_veth_src, BPF_TC_EGRESS, skel->progs.egress_host, 0);
+ /* tc qdisc add dev src clsact */
+ QDISC_CLSACT_CREATE(&qdisc_src, setup_result->ifindex_src);
+ /* tc filter add dev src ingress bpf da ingress_host */
+ XGRESS_FILTER_ADD(&qdisc_src, BPF_TC_INGRESS, skel->progs.ingress_host, 0);
+ /* tc filter add dev src egress bpf da egress_host */
+ XGRESS_FILTER_ADD(&qdisc_src, BPF_TC_EGRESS, skel->progs.egress_host, 0);
close_netns(nstoken);
/* setup ns_dst tc progs */
nstoken = open_netns(NS_DST);
if (!ASSERT_OK_PTR(nstoken, "setns " NS_DST))
return -1;
- /* tc qdisc add dev veth_dst clsact */
- QDISC_CLSACT_CREATE(&qdisc_veth_dst, setup_result->ifindex_veth_dst);
- /* tc filter add dev veth_dst ingress bpf da ingress_host */
- XGRESS_FILTER_ADD(&qdisc_veth_dst, BPF_TC_INGRESS, skel->progs.ingress_host, 0);
- /* tc filter add dev veth_dst egress bpf da egress_host */
- XGRESS_FILTER_ADD(&qdisc_veth_dst, BPF_TC_EGRESS, skel->progs.egress_host, 0);
+ /* tc qdisc add dev dst clsact */
+ QDISC_CLSACT_CREATE(&qdisc_dst, setup_result->ifindex_dst);
+ /* tc filter add dev dst ingress bpf da ingress_host */
+ XGRESS_FILTER_ADD(&qdisc_dst, BPF_TC_INGRESS, skel->progs.ingress_host, 0);
+ /* tc filter add dev dst egress bpf da egress_host */
+ XGRESS_FILTER_ADD(&qdisc_dst, BPF_TC_EGRESS, skel->progs.egress_host, 0);
close_netns(nstoken);
/* setup ns_fwd tc progs */
nstoken = open_netns(NS_FWD);
if (!ASSERT_OK_PTR(nstoken, "setns " NS_FWD))
return -1;
- /* tc qdisc add dev veth_dst_fwd clsact */
- QDISC_CLSACT_CREATE(&qdisc_veth_dst_fwd, setup_result->ifindex_veth_dst_fwd);
- /* tc filter add dev veth_dst_fwd ingress prio 100 bpf da ingress_fwdns_prio100 */
- XGRESS_FILTER_ADD(&qdisc_veth_dst_fwd, BPF_TC_INGRESS,
+ /* tc qdisc add dev dst_fwd clsact */
+ QDISC_CLSACT_CREATE(&qdisc_dst_fwd, setup_result->ifindex_dst_fwd);
+ /* tc filter add dev dst_fwd ingress prio 100 bpf da ingress_fwdns_prio100 */
+ XGRESS_FILTER_ADD(&qdisc_dst_fwd, BPF_TC_INGRESS,
skel->progs.ingress_fwdns_prio100, 100);
- /* tc filter add dev veth_dst_fwd ingress prio 101 bpf da ingress_fwdns_prio101 */
- XGRESS_FILTER_ADD(&qdisc_veth_dst_fwd, BPF_TC_INGRESS,
+ /* tc filter add dev dst_fwd ingress prio 101 bpf da ingress_fwdns_prio101 */
+ XGRESS_FILTER_ADD(&qdisc_dst_fwd, BPF_TC_INGRESS,
skel->progs.ingress_fwdns_prio101, 101);
- /* tc filter add dev veth_dst_fwd egress prio 100 bpf da egress_fwdns_prio100 */
- XGRESS_FILTER_ADD(&qdisc_veth_dst_fwd, BPF_TC_EGRESS,
+ /* tc filter add dev dst_fwd egress prio 100 bpf da egress_fwdns_prio100 */
+ XGRESS_FILTER_ADD(&qdisc_dst_fwd, BPF_TC_EGRESS,
skel->progs.egress_fwdns_prio100, 100);
- /* tc filter add dev veth_dst_fwd egress prio 101 bpf da egress_fwdns_prio101 */
- XGRESS_FILTER_ADD(&qdisc_veth_dst_fwd, BPF_TC_EGRESS,
+ /* tc filter add dev dst_fwd egress prio 101 bpf da egress_fwdns_prio101 */
+ XGRESS_FILTER_ADD(&qdisc_dst_fwd, BPF_TC_EGRESS,
skel->progs.egress_fwdns_prio101, 101);
- /* tc qdisc add dev veth_src_fwd clsact */
- QDISC_CLSACT_CREATE(&qdisc_veth_src_fwd, setup_result->ifindex_veth_src_fwd);
- /* tc filter add dev veth_src_fwd ingress prio 100 bpf da ingress_fwdns_prio100 */
- XGRESS_FILTER_ADD(&qdisc_veth_src_fwd, BPF_TC_INGRESS,
+ /* tc qdisc add dev src_fwd clsact */
+ QDISC_CLSACT_CREATE(&qdisc_src_fwd, setup_result->ifindex_src_fwd);
+ /* tc filter add dev src_fwd ingress prio 100 bpf da ingress_fwdns_prio100 */
+ XGRESS_FILTER_ADD(&qdisc_src_fwd, BPF_TC_INGRESS,
skel->progs.ingress_fwdns_prio100, 100);
- /* tc filter add dev veth_src_fwd ingress prio 101 bpf da ingress_fwdns_prio101 */
- XGRESS_FILTER_ADD(&qdisc_veth_src_fwd, BPF_TC_INGRESS,
+ /* tc filter add dev src_fwd ingress prio 101 bpf da ingress_fwdns_prio101 */
+ XGRESS_FILTER_ADD(&qdisc_src_fwd, BPF_TC_INGRESS,
skel->progs.ingress_fwdns_prio101, 101);
- /* tc filter add dev veth_src_fwd egress prio 100 bpf da egress_fwdns_prio100 */
- XGRESS_FILTER_ADD(&qdisc_veth_src_fwd, BPF_TC_EGRESS,
+ /* tc filter add dev src_fwd egress prio 100 bpf da egress_fwdns_prio100 */
+ XGRESS_FILTER_ADD(&qdisc_src_fwd, BPF_TC_EGRESS,
skel->progs.egress_fwdns_prio100, 100);
- /* tc filter add dev veth_src_fwd egress prio 101 bpf da egress_fwdns_prio101 */
- XGRESS_FILTER_ADD(&qdisc_veth_src_fwd, BPF_TC_EGRESS,
+ /* tc filter add dev src_fwd egress prio 101 bpf da egress_fwdns_prio101 */
+ XGRESS_FILTER_ADD(&qdisc_src_fwd, BPF_TC_EGRESS,
skel->progs.egress_fwdns_prio101, 101);
close_netns(nstoken);
return 0;
if (!ASSERT_OK_PTR(skel, "test_tc_dtime__open"))
return;
- skel->rodata->IFINDEX_SRC = setup_result->ifindex_veth_src_fwd;
- skel->rodata->IFINDEX_DST = setup_result->ifindex_veth_dst_fwd;
+ skel->rodata->IFINDEX_SRC = setup_result->ifindex_src_fwd;
+ skel->rodata->IFINDEX_DST = setup_result->ifindex_dst_fwd;
err = test_tc_dtime__load(skel);
if (!ASSERT_OK(err, "test_tc_dtime__load"))
if (!ASSERT_OK_PTR(skel, "test_tc_neigh__open"))
goto done;
- skel->rodata->IFINDEX_SRC = setup_result->ifindex_veth_src_fwd;
- skel->rodata->IFINDEX_DST = setup_result->ifindex_veth_dst_fwd;
+ skel->rodata->IFINDEX_SRC = setup_result->ifindex_src_fwd;
+ skel->rodata->IFINDEX_DST = setup_result->ifindex_dst_fwd;
err = test_tc_neigh__load(skel);
if (!ASSERT_OK(err, "test_tc_neigh__load"))
if (!ASSERT_OK_PTR(skel, "test_tc_peer__open"))
goto done;
- skel->rodata->IFINDEX_SRC = setup_result->ifindex_veth_src_fwd;
- skel->rodata->IFINDEX_DST = setup_result->ifindex_veth_dst_fwd;
+ skel->rodata->IFINDEX_SRC = setup_result->ifindex_src_fwd;
+ skel->rodata->IFINDEX_DST = setup_result->ifindex_dst_fwd;
err = test_tc_peer__load(skel);
if (!ASSERT_OK(err, "test_tc_peer__load"))
static void test_tc_redirect_peer_l3(struct netns_setup_result *setup_result)
{
LIBBPF_OPTS(bpf_tc_hook, qdisc_tun_fwd);
- LIBBPF_OPTS(bpf_tc_hook, qdisc_veth_dst_fwd);
+ LIBBPF_OPTS(bpf_tc_hook, qdisc_dst_fwd);
struct test_tc_peer *skel = NULL;
struct nstoken *nstoken = NULL;
int err;
goto fail;
skel->rodata->IFINDEX_SRC = ifindex;
- skel->rodata->IFINDEX_DST = setup_result->ifindex_veth_dst_fwd;
+ skel->rodata->IFINDEX_DST = setup_result->ifindex_dst_fwd;
err = test_tc_peer__load(skel);
if (!ASSERT_OK(err, "test_tc_peer__load"))
/* Load "tc_src_l3" to the tun_fwd interface to redirect packets
* towards dst, and "tc_dst" to redirect packets
- * and "tc_chk" on veth_dst_fwd to drop non-redirected packets.
+ * and "tc_chk" on dst_fwd to drop non-redirected packets.
*/
/* tc qdisc add dev tun_fwd clsact */
QDISC_CLSACT_CREATE(&qdisc_tun_fwd, ifindex);
/* tc filter add dev tun_fwd ingress bpf da tc_src_l3 */
XGRESS_FILTER_ADD(&qdisc_tun_fwd, BPF_TC_INGRESS, skel->progs.tc_src_l3, 0);
- /* tc qdisc add dev veth_dst_fwd clsact */
- QDISC_CLSACT_CREATE(&qdisc_veth_dst_fwd, setup_result->ifindex_veth_dst_fwd);
- /* tc filter add dev veth_dst_fwd ingress bpf da tc_dst_l3 */
- XGRESS_FILTER_ADD(&qdisc_veth_dst_fwd, BPF_TC_INGRESS, skel->progs.tc_dst_l3, 0);
- /* tc filter add dev veth_dst_fwd egress bpf da tc_chk */
- XGRESS_FILTER_ADD(&qdisc_veth_dst_fwd, BPF_TC_EGRESS, skel->progs.tc_chk, 0);
+ /* tc qdisc add dev dst_fwd clsact */
+ QDISC_CLSACT_CREATE(&qdisc_dst_fwd, setup_result->ifindex_dst_fwd);
+ /* tc filter add dev dst_fwd ingress bpf da tc_dst_l3 */
+ XGRESS_FILTER_ADD(&qdisc_dst_fwd, BPF_TC_INGRESS, skel->progs.tc_dst_l3, 0);
+ /* tc filter add dev dst_fwd egress bpf da tc_chk */
+ XGRESS_FILTER_ADD(&qdisc_dst_fwd, BPF_TC_EGRESS, skel->progs.tc_chk, 0);
/* Setup route and neigh tables */
SYS(fail, "ip -netns " NS_SRC " addr add dev tun_src " IP4_TUN_SRC "/24");
SYS(fail, "ip -netns " NS_SRC " addr add dev tun_src " IP6_TUN_SRC "/64 nodad");
SYS(fail, "ip -netns " NS_FWD " addr add dev tun_fwd " IP6_TUN_FWD "/64 nodad");
- SYS(fail, "ip -netns " NS_SRC " route del " IP4_DST "/32 dev veth_src scope global");
+ SYS(fail, "ip -netns " NS_SRC " route del " IP4_DST "/32 dev src scope global");
SYS(fail, "ip -netns " NS_SRC " route add " IP4_DST "/32 via " IP4_TUN_FWD
" dev tun_src scope global");
- SYS(fail, "ip -netns " NS_DST " route add " IP4_TUN_SRC "/32 dev veth_dst scope global");
- SYS(fail, "ip -netns " NS_SRC " route del " IP6_DST "/128 dev veth_src scope global");
+ SYS(fail, "ip -netns " NS_DST " route add " IP4_TUN_SRC "/32 dev dst scope global");
+ SYS(fail, "ip -netns " NS_SRC " route del " IP6_DST "/128 dev src scope global");
SYS(fail, "ip -netns " NS_SRC " route add " IP6_DST "/128 via " IP6_TUN_FWD
" dev tun_src scope global");
- SYS(fail, "ip -netns " NS_DST " route add " IP6_TUN_SRC "/128 dev veth_dst scope global");
+ SYS(fail, "ip -netns " NS_DST " route add " IP6_TUN_SRC "/128 dev dst scope global");
- SYS(fail, "ip -netns " NS_DST " neigh add " IP4_TUN_SRC " dev veth_dst lladdr " MAC_DST_FWD);
- SYS(fail, "ip -netns " NS_DST " neigh add " IP6_TUN_SRC " dev veth_dst lladdr " MAC_DST_FWD);
+ SYS(fail, "ip -netns " NS_DST " neigh add " IP4_TUN_SRC " dev dst lladdr " MAC_DST_FWD);
+ SYS(fail, "ip -netns " NS_DST " neigh add " IP6_TUN_SRC " dev dst lladdr " MAC_DST_FWD);
if (!ASSERT_OK(set_forwarding(false), "disable forwarding"))
goto fail;
close_netns(nstoken);
}
-#define RUN_TEST(name) \
+#define RUN_TEST(name, mode) \
({ \
- struct netns_setup_result setup_result; \
+ struct netns_setup_result setup_result = { .dev_mode = mode, }; \
if (test__start_subtest(#name)) \
if (ASSERT_OK(netns_setup_namespaces("add"), "setup namespaces")) { \
if (ASSERT_OK(netns_setup_links_and_routes(&setup_result), \
{
netns_setup_namespaces_nofail("delete");
- RUN_TEST(tc_redirect_peer);
- RUN_TEST(tc_redirect_peer_l3);
- RUN_TEST(tc_redirect_neigh);
- RUN_TEST(tc_redirect_neigh_fib);
- RUN_TEST(tc_redirect_dtime);
+ RUN_TEST(tc_redirect_peer, MODE_VETH);
+ RUN_TEST(tc_redirect_peer, MODE_NETKIT);
+ RUN_TEST(tc_redirect_peer_l3, MODE_VETH);
+ RUN_TEST(tc_redirect_peer_l3, MODE_NETKIT);
+ RUN_TEST(tc_redirect_neigh, MODE_VETH);
+ RUN_TEST(tc_redirect_neigh_fib, MODE_VETH);
+ RUN_TEST(tc_redirect_dtime, MODE_VETH);
return NULL;
}
#include "verifier_helper_restricted.skel.h"
#include "verifier_helper_value_access.skel.h"
#include "verifier_int_ptr.skel.h"
+#include "verifier_iterating_callbacks.skel.h"
#include "verifier_jeq_infer_not_null.skel.h"
#include "verifier_ld_ind.skel.h"
#include "verifier_ldsx.skel.h"
void test_verifier_helper_restricted(void) { RUN(verifier_helper_restricted); }
void test_verifier_helper_value_access(void) { RUN(verifier_helper_value_access); }
void test_verifier_int_ptr(void) { RUN(verifier_int_ptr); }
+void test_verifier_iterating_callbacks(void) { RUN(verifier_iterating_callbacks); }
void test_verifier_jeq_infer_not_null(void) { RUN(verifier_jeq_infer_not_null); }
void test_verifier_ld_ind(void) { RUN(verifier_ld_ind); }
void test_verifier_ldsx(void) { RUN(verifier_ldsx); }
return 0;
}
+static int outer_loop(__u32 index, void *data)
+{
+ bpf_loop(nr_loops, empty_callback, NULL, 0);
+ __sync_add_and_fetch(&hits, nr_loops);
+ return 0;
+}
+
SEC("fentry/" SYS_PREFIX "sys_getpgid")
int benchmark(void *ctx)
{
- for (int i = 0; i < 1000; i++) {
- bpf_loop(nr_loops, empty_callback, NULL, 0);
-
- __sync_add_and_fetch(&hits, nr_loops);
- }
+ bpf_loop(1000, outer_loop, NULL, 0);
return 0;
}
if (!p)
return 0;
bpf_for_each_map_elem(&array_map, cb1, &p, 0);
+ bpf_kfunc_call_test_release(p);
return 0;
}
return 0;
bpf_spin_lock(&lock);
bpf_rbtree_add(&rbtree, &f->node, rbless);
+ bpf_spin_unlock(&lock);
return 0;
}
if (!f)
return 0;
bpf_loop(5, subprog_cb_ref, NULL, 0);
+ bpf_obj_drop(f);
return 0;
}
#define STACK_TABLE_EPOCH_SHIFT 20
#define STROBE_MAX_STR_LEN 1
#define STROBE_MAX_CFGS 32
+#define READ_MAP_VAR_PAYLOAD_CAP \
+ ((1 + STROBE_MAX_MAP_ENTRIES * 2) * STROBE_MAX_STR_LEN)
#define STROBE_MAX_PAYLOAD \
(STROBE_MAX_STRS * STROBE_MAX_STR_LEN + \
- STROBE_MAX_MAPS * (1 + STROBE_MAX_MAP_ENTRIES * 2) * STROBE_MAX_STR_LEN)
+ STROBE_MAX_MAPS * READ_MAP_VAR_PAYLOAD_CAP)
struct strobe_value_header {
/*
size_t idx, void *tls_base,
struct strobe_value_generic *value,
struct strobemeta_payload *data,
- void *payload)
+ size_t off)
{
void *location;
uint64_t len;
return 0;
bpf_probe_read_user(value, sizeof(struct strobe_value_generic), location);
- len = bpf_probe_read_user_str(payload, STROBE_MAX_STR_LEN, value->ptr);
+ len = bpf_probe_read_user_str(&data->payload[off], STROBE_MAX_STR_LEN, value->ptr);
/*
* if bpf_probe_read_user_str returns error (<0), due to casting to
* unsinged int, it will become big number, so next check is
return 0;
data->str_lens[idx] = len;
- return len;
+ return off + len;
}
-static __always_inline void *read_map_var(struct strobemeta_cfg *cfg,
- size_t idx, void *tls_base,
- struct strobe_value_generic *value,
- struct strobemeta_payload *data,
- void *payload)
+static __always_inline uint64_t read_map_var(struct strobemeta_cfg *cfg,
+ size_t idx, void *tls_base,
+ struct strobe_value_generic *value,
+ struct strobemeta_payload *data,
+ size_t off)
{
struct strobe_map_descr* descr = &data->map_descrs[idx];
struct strobe_map_raw map;
location = calc_location(&cfg->map_locs[idx], tls_base);
if (!location)
- return payload;
+ return off;
bpf_probe_read_user(value, sizeof(struct strobe_value_generic), location);
if (bpf_probe_read_user(&map, sizeof(struct strobe_map_raw), value->ptr))
- return payload;
+ return off;
descr->id = map.id;
descr->cnt = map.cnt;
data->req_meta_valid = 1;
}
- len = bpf_probe_read_user_str(payload, STROBE_MAX_STR_LEN, map.tag);
+ len = bpf_probe_read_user_str(&data->payload[off], STROBE_MAX_STR_LEN, map.tag);
if (len <= STROBE_MAX_STR_LEN) {
descr->tag_len = len;
- payload += len;
+ off += len;
}
#ifdef NO_UNROLL
break;
descr->key_lens[i] = 0;
- len = bpf_probe_read_user_str(payload, STROBE_MAX_STR_LEN,
+ len = bpf_probe_read_user_str(&data->payload[off], STROBE_MAX_STR_LEN,
map.entries[i].key);
if (len <= STROBE_MAX_STR_LEN) {
descr->key_lens[i] = len;
- payload += len;
+ off += len;
}
descr->val_lens[i] = 0;
- len = bpf_probe_read_user_str(payload, STROBE_MAX_STR_LEN,
+ len = bpf_probe_read_user_str(&data->payload[off], STROBE_MAX_STR_LEN,
map.entries[i].val);
if (len <= STROBE_MAX_STR_LEN) {
descr->val_lens[i] = len;
- payload += len;
+ off += len;
}
}
- return payload;
+ return off;
}
#ifdef USE_BPF_LOOP
struct strobemeta_payload *data;
void *tls_base;
struct strobemeta_cfg *cfg;
- void *payload;
+ size_t payload_off;
/* value gets mutated */
struct strobe_value_generic *value;
enum read_type type;
};
-static int read_var_callback(__u32 index, struct read_var_ctx *ctx)
+static int read_var_callback(__u64 index, struct read_var_ctx *ctx)
{
+ /* lose precision info for ctx->payload_off, verifier won't track
+ * double xor, barrier_var() is needed to force clang keep both xors.
+ */
+ ctx->payload_off ^= index;
+ barrier_var(ctx->payload_off);
+ ctx->payload_off ^= index;
switch (ctx->type) {
case READ_INT_VAR:
if (index >= STROBE_MAX_INTS)
case READ_MAP_VAR:
if (index >= STROBE_MAX_MAPS)
return 1;
- ctx->payload = read_map_var(ctx->cfg, index, ctx->tls_base,
- ctx->value, ctx->data, ctx->payload);
+ if (ctx->payload_off > sizeof(ctx->data->payload) - READ_MAP_VAR_PAYLOAD_CAP)
+ return 1;
+ ctx->payload_off = read_map_var(ctx->cfg, index, ctx->tls_base,
+ ctx->value, ctx->data, ctx->payload_off);
break;
case READ_STR_VAR:
if (index >= STROBE_MAX_STRS)
return 1;
- ctx->payload += read_str_var(ctx->cfg, index, ctx->tls_base,
- ctx->value, ctx->data, ctx->payload);
+ if (ctx->payload_off > sizeof(ctx->data->payload) - STROBE_MAX_STR_LEN)
+ return 1;
+ ctx->payload_off = read_str_var(ctx->cfg, index, ctx->tls_base,
+ ctx->value, ctx->data, ctx->payload_off);
break;
}
return 0;
pid_t pid = bpf_get_current_pid_tgid() >> 32;
struct strobe_value_generic value = {0};
struct strobemeta_cfg *cfg;
- void *tls_base, *payload;
+ size_t payload_off;
+ void *tls_base;
cfg = bpf_map_lookup_elem(&strobemeta_cfgs, &pid);
if (!cfg)
data->int_vals_set_mask = 0;
data->req_meta_valid = 0;
- payload = data->payload;
+ payload_off = 0;
/*
* we don't have struct task_struct definition, it should be:
* tls_base = (void *)task->thread.fsbase;
.tls_base = tls_base,
.value = &value,
.data = data,
- .payload = payload,
+ .payload_off = 0,
};
int err;
err = bpf_loop(STROBE_MAX_MAPS, read_var_callback, &ctx, 0);
if (err != STROBE_MAX_MAPS)
return NULL;
+
+ payload_off = ctx.payload_off;
+ /* this should not really happen, here only to satisfy verifer */
+ if (payload_off > sizeof(data->payload))
+ payload_off = sizeof(data->payload);
#else
#ifdef NO_UNROLL
#pragma clang loop unroll(disable)
#pragma unroll
#endif /* NO_UNROLL */
for (int i = 0; i < STROBE_MAX_STRS; ++i) {
- payload += read_str_var(cfg, i, tls_base, &value, data, payload);
+ payload_off = read_str_var(cfg, i, tls_base, &value, data, payload_off);
}
#ifdef NO_UNROLL
#pragma clang loop unroll(disable)
#pragma unroll
#endif /* NO_UNROLL */
for (int i = 0; i < STROBE_MAX_MAPS; ++i) {
- payload = read_map_var(cfg, i, tls_base, &value, data, payload);
+ payload_off = read_map_var(cfg, i, tls_base, &value, data, payload_off);
}
#endif /* USE_BPF_LOOP */
* return pointer right after end of payload, so it's possible to
* calculate exact amount of useful data that needs to be sent
*/
- return payload;
+ return &data->payload[payload_off];
}
SEC("raw_tracepoint/kfree_skb")
--- /dev/null
+// SPDX-License-Identifier: GPL-2.0
+#include <linux/bpf.h>
+#include <bpf/bpf_helpers.h>
+#include <bpf/bpf_tracing.h>
+
+char _license[] SEC("license") = "GPL";
+
+struct {
+ __uint(type, BPF_MAP_TYPE_PROG_ARRAY);
+ __uint(max_entries, 1);
+ __uint(key_size, sizeof(__u32));
+ __uint(value_size, sizeof(__u32));
+} jmp_table SEC(".maps");
+
+SEC("?fentry/bpf_fentry_test1")
+int BPF_PROG(test, int a)
+{
+ bpf_tail_call_static(ctx, &jmp_table, 0);
+ return 0;
+}
+
+SEC("fentry/bpf_fentry_test1")
+int BPF_PROG(call1, int a)
+{
+ return 0;
+}
+
+SEC("fentry/bpf_fentry_test1")
+int BPF_PROG(call2, int a)
+{
+ return 0;
+}
__type(value, __u64);
} sock_map SEC(".maps");
+struct {
+ __uint(type, BPF_MAP_TYPE_SOCKMAP);
+ __uint(max_entries, 2);
+ __type(key, __u32);
+ __type(value, __u64);
+} nop_map SEC(".maps");
+
struct {
__uint(type, BPF_MAP_TYPE_SOCKHASH);
__uint(max_entries, 2);
" ::: __clobber_all);
}
+SEC("socket")
+__description("conditional loop (2)")
+__success
+__failure_unpriv __msg_unpriv("back-edge from insn 10 to 11")
+__naked void conditional_loop2(void)
+{
+ asm volatile (" \
+ r9 = 2 ll; \
+ r3 = 0x20 ll; \
+ r4 = 0x35 ll; \
+ r8 = r4; \
+ goto l1_%=; \
+l0_%=: r9 -= r3; \
+ r9 -= r4; \
+ r9 -= r8; \
+l1_%=: r8 += r4; \
+ if r8 < 0x64 goto l0_%=; \
+ r0 = r9; \
+ exit; \
+" ::: __clobber_all);
+}
+
+SEC("socket")
+__description("unconditional loop after conditional jump")
+__failure __msg("infinite loop detected")
+__failure_unpriv __msg_unpriv("back-edge from insn 3 to 2")
+__naked void uncond_loop_after_cond_jmp(void)
+{
+ asm volatile (" \
+ r0 = 0; \
+ if r0 > 0 goto l1_%=; \
+l0_%=: r0 = 1; \
+ goto l0_%=; \
+l1_%=: exit; \
+" ::: __clobber_all);
+}
+
+
+__naked __noinline __used
+static unsigned long never_ending_subprog()
+{
+ asm volatile (" \
+ r0 = r1; \
+ goto -1; \
+" ::: __clobber_all);
+}
+
+SEC("socket")
+__description("unconditional loop after conditional jump")
+/* infinite loop is detected *after* check_cfg() */
+__failure __msg("infinite loop detected")
+__naked void uncond_loop_in_subprog_after_cond_jmp(void)
+{
+ asm volatile (" \
+ r0 = 0; \
+ if r0 > 0 goto l1_%=; \
+l0_%=: r0 += 1; \
+ call never_ending_subprog; \
+l1_%=: exit; \
+" ::: __clobber_all);
+}
+
char _license[] SEC("license") = "GPL";
--- /dev/null
+// SPDX-License-Identifier: GPL-2.0
+
+#include <linux/bpf.h>
+#include <bpf/bpf_helpers.h>
+#include "bpf_misc.h"
+
+struct {
+ __uint(type, BPF_MAP_TYPE_ARRAY);
+ __uint(max_entries, 8);
+ __type(key, __u32);
+ __type(value, __u64);
+} map SEC(".maps");
+
+struct {
+ __uint(type, BPF_MAP_TYPE_USER_RINGBUF);
+ __uint(max_entries, 8);
+} ringbuf SEC(".maps");
+
+struct vm_area_struct;
+struct bpf_map;
+
+struct buf_context {
+ char *buf;
+};
+
+struct num_context {
+ __u64 i;
+ __u64 j;
+};
+
+__u8 choice_arr[2] = { 0, 1 };
+
+static int unsafe_on_2nd_iter_cb(__u32 idx, struct buf_context *ctx)
+{
+ if (idx == 0) {
+ ctx->buf = (char *)(0xDEAD);
+ return 0;
+ }
+
+ if (bpf_probe_read_user(ctx->buf, 8, (void *)(0xBADC0FFEE)))
+ return 1;
+
+ return 0;
+}
+
+SEC("?raw_tp")
+__failure __msg("R1 type=scalar expected=fp")
+int unsafe_on_2nd_iter(void *unused)
+{
+ char buf[4];
+ struct buf_context loop_ctx = { .buf = buf };
+
+ bpf_loop(100, unsafe_on_2nd_iter_cb, &loop_ctx, 0);
+ return 0;
+}
+
+static int unsafe_on_zero_iter_cb(__u32 idx, struct num_context *ctx)
+{
+ ctx->i = 0;
+ return 0;
+}
+
+SEC("?raw_tp")
+__failure __msg("invalid access to map value, value_size=2 off=32 size=1")
+int unsafe_on_zero_iter(void *unused)
+{
+ struct num_context loop_ctx = { .i = 32 };
+
+ bpf_loop(100, unsafe_on_zero_iter_cb, &loop_ctx, 0);
+ return choice_arr[loop_ctx.i];
+}
+
+static int widening_cb(__u32 idx, struct num_context *ctx)
+{
+ ++ctx->i;
+ return 0;
+}
+
+SEC("?raw_tp")
+__success
+int widening(void *unused)
+{
+ struct num_context loop_ctx = { .i = 0, .j = 1 };
+
+ bpf_loop(100, widening_cb, &loop_ctx, 0);
+ /* loop_ctx.j is not changed during callback iteration,
+ * verifier should not apply widening to it.
+ */
+ return choice_arr[loop_ctx.j];
+}
+
+static int loop_detection_cb(__u32 idx, struct num_context *ctx)
+{
+ for (;;) {}
+ return 0;
+}
+
+SEC("?raw_tp")
+__failure __msg("infinite loop detected")
+int loop_detection(void *unused)
+{
+ struct num_context loop_ctx = { .i = 0 };
+
+ bpf_loop(100, loop_detection_cb, &loop_ctx, 0);
+ return 0;
+}
+
+static __always_inline __u64 oob_state_machine(struct num_context *ctx)
+{
+ switch (ctx->i) {
+ case 0:
+ ctx->i = 1;
+ break;
+ case 1:
+ ctx->i = 32;
+ break;
+ }
+ return 0;
+}
+
+static __u64 for_each_map_elem_cb(struct bpf_map *map, __u32 *key, __u64 *val, void *data)
+{
+ return oob_state_machine(data);
+}
+
+SEC("?raw_tp")
+__failure __msg("invalid access to map value, value_size=2 off=32 size=1")
+int unsafe_for_each_map_elem(void *unused)
+{
+ struct num_context loop_ctx = { .i = 0 };
+
+ bpf_for_each_map_elem(&map, for_each_map_elem_cb, &loop_ctx, 0);
+ return choice_arr[loop_ctx.i];
+}
+
+static __u64 ringbuf_drain_cb(struct bpf_dynptr *dynptr, void *data)
+{
+ return oob_state_machine(data);
+}
+
+SEC("?raw_tp")
+__failure __msg("invalid access to map value, value_size=2 off=32 size=1")
+int unsafe_ringbuf_drain(void *unused)
+{
+ struct num_context loop_ctx = { .i = 0 };
+
+ bpf_user_ringbuf_drain(&ringbuf, ringbuf_drain_cb, &loop_ctx, 0);
+ return choice_arr[loop_ctx.i];
+}
+
+static __u64 find_vma_cb(struct task_struct *task, struct vm_area_struct *vma, void *data)
+{
+ return oob_state_machine(data);
+}
+
+SEC("?raw_tp")
+__failure __msg("invalid access to map value, value_size=2 off=32 size=1")
+int unsafe_find_vma(void *unused)
+{
+ struct task_struct *task = bpf_get_current_task_btf();
+ struct num_context loop_ctx = { .i = 0 };
+
+ bpf_find_vma(task, 0, find_vma_cb, &loop_ctx, 0);
+ return choice_arr[loop_ctx.i];
+}
+
+static int iter_limit_cb(__u32 idx, struct num_context *ctx)
+{
+ ctx->i++;
+ return 0;
+}
+
+SEC("?raw_tp")
+__success
+int bpf_loop_iter_limit_ok(void *unused)
+{
+ struct num_context ctx = { .i = 0 };
+
+ bpf_loop(1, iter_limit_cb, &ctx, 0);
+ return choice_arr[ctx.i];
+}
+
+SEC("?raw_tp")
+__failure __msg("invalid access to map value, value_size=2 off=2 size=1")
+int bpf_loop_iter_limit_overflow(void *unused)
+{
+ struct num_context ctx = { .i = 0 };
+
+ bpf_loop(2, iter_limit_cb, &ctx, 0);
+ return choice_arr[ctx.i];
+}
+
+static int iter_limit_level2a_cb(__u32 idx, struct num_context *ctx)
+{
+ ctx->i += 100;
+ return 0;
+}
+
+static int iter_limit_level2b_cb(__u32 idx, struct num_context *ctx)
+{
+ ctx->i += 10;
+ return 0;
+}
+
+static int iter_limit_level1_cb(__u32 idx, struct num_context *ctx)
+{
+ ctx->i += 1;
+ bpf_loop(1, iter_limit_level2a_cb, ctx, 0);
+ bpf_loop(1, iter_limit_level2b_cb, ctx, 0);
+ return 0;
+}
+
+/* Check that path visiting every callback function once had been
+ * reached by verifier. Variables 'ctx{1,2}i' below serve as flags,
+ * with each decimal digit corresponding to a callback visit marker.
+ */
+SEC("socket")
+__success __retval(111111)
+int bpf_loop_iter_limit_nested(void *unused)
+{
+ struct num_context ctx1 = { .i = 0 };
+ struct num_context ctx2 = { .i = 0 };
+ __u64 a, b, c;
+
+ bpf_loop(1, iter_limit_level1_cb, &ctx1, 0);
+ bpf_loop(1, iter_limit_level1_cb, &ctx2, 0);
+ a = ctx1.i;
+ b = ctx2.i;
+ /* Force 'ctx1.i' and 'ctx2.i' precise. */
+ c = choice_arr[(a + b) % 2];
+ /* This makes 'c' zero, but neither clang nor verifier know it. */
+ c /= 10;
+ /* Make sure that verifier does not visit 'impossible' states:
+ * enumerate all possible callback visit masks.
+ */
+ if (a != 0 && a != 1 && a != 11 && a != 101 && a != 111 &&
+ b != 0 && b != 1 && b != 11 && b != 101 && b != 111)
+ asm volatile ("r0 /= 0;" ::: "r0");
+ return 1000 * a + b + c;
+}
+
+char _license[] SEC("license") = "GPL";
" ::: __clobber_all);
}
-SEC("tracepoint")
+SEC("socket")
__description("bounded loop, start in the middle")
-__failure __msg("back-edge")
+__success
+__failure_unpriv __msg_unpriv("back-edge")
__naked void loop_start_in_the_middle(void)
{
asm volatile (" \
SEC("tracepoint")
__description("bounded recursion")
-__failure __msg("back-edge")
+__failure
+/* verifier limitation in detecting max stack depth */
+__msg("the call stack of 8 frames is too deep !")
__naked void bounded_recursion(void)
{
asm volatile (" \
}
#endif /* v4 instruction */
+
+SEC("?raw_tp")
+__success __log_level(2)
+/*
+ * Without the bug fix there will be no history between "last_idx 3 first_idx 3"
+ * and "parent state regs=" lines. "R0_w=6" parts are here to help anchor
+ * expected log messages to the one specific mark_chain_precision operation.
+ *
+ * This is quite fragile: if verifier checkpointing heuristic changes, this
+ * might need adjusting.
+ */
+__msg("2: (07) r0 += 1 ; R0_w=6")
+__msg("3: (35) if r0 >= 0xa goto pc+1")
+__msg("mark_precise: frame0: last_idx 3 first_idx 3 subseq_idx -1")
+__msg("mark_precise: frame0: regs=r0 stack= before 2: (07) r0 += 1")
+__msg("mark_precise: frame0: regs=r0 stack= before 1: (07) r0 += 1")
+__msg("mark_precise: frame0: regs=r0 stack= before 4: (05) goto pc-4")
+__msg("mark_precise: frame0: regs=r0 stack= before 3: (35) if r0 >= 0xa goto pc+1")
+__msg("mark_precise: frame0: parent state regs= stack=: R0_rw=P4")
+__msg("3: R0_w=6")
+__naked int state_loop_first_last_equal(void)
+{
+ asm volatile (
+ "r0 = 0;"
+ "l0_%=:"
+ "r0 += 1;"
+ "r0 += 1;"
+ /* every few iterations we'll have a checkpoint here with
+ * first_idx == last_idx, potentially confusing precision
+ * backtracking logic
+ */
+ "if r0 >= 10 goto l1_%=;" /* checkpoint + mark_precise */
+ "goto l0_%=;"
+ "l1_%=:"
+ "exit;"
+ ::: __clobber_common
+ );
+}
+
+char _license[] SEC("license") = "GPL";
SEC("?raw_tp")
__success __log_level(2)
+/* First simulated path does not include callback body,
+ * r1 and r4 are always precise for bpf_loop() calls.
+ */
+__msg("9: (85) call bpf_loop#181")
+__msg("mark_precise: frame0: last_idx 9 first_idx 9 subseq_idx -1")
+__msg("mark_precise: frame0: parent state regs=r4 stack=:")
+__msg("mark_precise: frame0: last_idx 8 first_idx 0 subseq_idx 9")
+__msg("mark_precise: frame0: regs=r4 stack= before 8: (b7) r4 = 0")
+__msg("mark_precise: frame0: last_idx 9 first_idx 9 subseq_idx -1")
+__msg("mark_precise: frame0: parent state regs=r1 stack=:")
+__msg("mark_precise: frame0: last_idx 8 first_idx 0 subseq_idx 9")
+__msg("mark_precise: frame0: regs=r1 stack= before 8: (b7) r4 = 0")
+__msg("mark_precise: frame0: regs=r1 stack= before 7: (b7) r3 = 0")
+__msg("mark_precise: frame0: regs=r1 stack= before 6: (bf) r2 = r8")
+__msg("mark_precise: frame0: regs=r1 stack= before 5: (bf) r1 = r6")
+__msg("mark_precise: frame0: regs=r6 stack= before 4: (b7) r6 = 3")
+/* r6 precision propagation */
__msg("14: (0f) r1 += r6")
-__msg("mark_precise: frame0: last_idx 14 first_idx 10")
+__msg("mark_precise: frame0: last_idx 14 first_idx 9")
__msg("mark_precise: frame0: regs=r6 stack= before 13: (bf) r1 = r7")
__msg("mark_precise: frame0: regs=r6 stack= before 12: (27) r6 *= 4")
__msg("mark_precise: frame0: regs=r6 stack= before 11: (25) if r6 > 0x3 goto pc+4")
__msg("mark_precise: frame0: regs=r6 stack= before 10: (bf) r6 = r0")
-__msg("mark_precise: frame0: parent state regs=r0 stack=:")
-__msg("mark_precise: frame0: last_idx 18 first_idx 0")
-__msg("mark_precise: frame0: regs=r0 stack= before 18: (95) exit")
+__msg("mark_precise: frame0: regs=r0 stack= before 9: (85) call bpf_loop")
+/* State entering callback body popped from states stack */
+__msg("from 9 to 17: frame1:")
+__msg("17: frame1: R1=scalar() R2=0 R10=fp0 cb")
+__msg("17: (b7) r0 = 0")
+__msg("18: (95) exit")
+__msg("returning from callee:")
+__msg("to caller at 9:")
+__msg("frame 0: propagating r1,r4")
+__msg("mark_precise: frame0: last_idx 9 first_idx 9 subseq_idx -1")
+__msg("mark_precise: frame0: regs=r1,r4 stack= before 18: (95) exit")
+__msg("from 18 to 9: safe")
__naked int callback_result_precise(void)
{
asm volatile (
SEC("?raw_tp")
__success __log_level(2)
+/* First simulated path does not include callback body */
__msg("12: (0f) r1 += r6")
-__msg("mark_precise: frame0: last_idx 12 first_idx 10")
+__msg("mark_precise: frame0: last_idx 12 first_idx 9")
__msg("mark_precise: frame0: regs=r6 stack= before 11: (bf) r1 = r7")
__msg("mark_precise: frame0: regs=r6 stack= before 10: (27) r6 *= 4")
+__msg("mark_precise: frame0: regs=r6 stack= before 9: (85) call bpf_loop")
__msg("mark_precise: frame0: parent state regs=r6 stack=:")
-__msg("mark_precise: frame0: last_idx 16 first_idx 0")
-__msg("mark_precise: frame0: regs=r6 stack= before 16: (95) exit")
-__msg("mark_precise: frame1: regs= stack= before 15: (b7) r0 = 0")
-__msg("mark_precise: frame1: regs= stack= before 9: (85) call bpf_loop#181")
+__msg("mark_precise: frame0: last_idx 8 first_idx 0 subseq_idx 9")
__msg("mark_precise: frame0: regs=r6 stack= before 8: (b7) r4 = 0")
__msg("mark_precise: frame0: regs=r6 stack= before 7: (b7) r3 = 0")
__msg("mark_precise: frame0: regs=r6 stack= before 6: (bf) r2 = r8")
__msg("mark_precise: frame0: regs=r6 stack= before 5: (b7) r1 = 1")
__msg("mark_precise: frame0: regs=r6 stack= before 4: (b7) r6 = 3")
+/* State entering callback body popped from states stack */
+__msg("from 9 to 15: frame1:")
+__msg("15: frame1: R1=scalar() R2=0 R10=fp0 cb")
+__msg("15: (b7) r0 = 0")
+__msg("16: (95) exit")
+__msg("returning from callee:")
+__msg("to caller at 9:")
+/* r1, r4 are always precise for bpf_loop(),
+ * r6 was marked before backtracking to callback body.
+ */
+__msg("frame 0: propagating r1,r4,r6")
+__msg("mark_precise: frame0: last_idx 9 first_idx 9 subseq_idx -1")
+__msg("mark_precise: frame0: regs=r1,r4,r6 stack= before 16: (95) exit")
+__msg("mark_precise: frame1: regs= stack= before 15: (b7) r0 = 0")
+__msg("mark_precise: frame1: regs= stack= before 9: (85) call bpf_loop")
+__msg("mark_precise: frame0: parent state regs= stack=:")
+__msg("from 16 to 9: safe")
__naked int parent_callee_saved_reg_precise_with_callback(void)
{
asm volatile (
SEC("?raw_tp")
__success __log_level(2)
+/* First simulated path does not include callback body */
__msg("14: (0f) r1 += r6")
-__msg("mark_precise: frame0: last_idx 14 first_idx 11")
+__msg("mark_precise: frame0: last_idx 14 first_idx 10")
__msg("mark_precise: frame0: regs=r6 stack= before 13: (bf) r1 = r7")
__msg("mark_precise: frame0: regs=r6 stack= before 12: (27) r6 *= 4")
__msg("mark_precise: frame0: regs=r6 stack= before 11: (79) r6 = *(u64 *)(r10 -8)")
+__msg("mark_precise: frame0: regs= stack=-8 before 10: (85) call bpf_loop")
__msg("mark_precise: frame0: parent state regs= stack=-8:")
-__msg("mark_precise: frame0: last_idx 18 first_idx 0")
-__msg("mark_precise: frame0: regs= stack=-8 before 18: (95) exit")
-__msg("mark_precise: frame1: regs= stack= before 17: (b7) r0 = 0")
-__msg("mark_precise: frame1: regs= stack= before 10: (85) call bpf_loop#181")
+__msg("mark_precise: frame0: last_idx 9 first_idx 0 subseq_idx 10")
__msg("mark_precise: frame0: regs= stack=-8 before 9: (b7) r4 = 0")
__msg("mark_precise: frame0: regs= stack=-8 before 8: (b7) r3 = 0")
__msg("mark_precise: frame0: regs= stack=-8 before 7: (bf) r2 = r8")
__msg("mark_precise: frame0: regs= stack=-8 before 6: (bf) r1 = r6")
__msg("mark_precise: frame0: regs= stack=-8 before 5: (7b) *(u64 *)(r10 -8) = r6")
__msg("mark_precise: frame0: regs=r6 stack= before 4: (b7) r6 = 3")
+/* State entering callback body popped from states stack */
+__msg("from 10 to 17: frame1:")
+__msg("17: frame1: R1=scalar() R2=0 R10=fp0 cb")
+__msg("17: (b7) r0 = 0")
+__msg("18: (95) exit")
+__msg("returning from callee:")
+__msg("to caller at 10:")
+/* r1, r4 are always precise for bpf_loop(),
+ * fp-8 was marked before backtracking to callback body.
+ */
+__msg("frame 0: propagating r1,r4,fp-8")
+__msg("mark_precise: frame0: last_idx 10 first_idx 10 subseq_idx -1")
+__msg("mark_precise: frame0: regs=r1,r4 stack=-8 before 18: (95) exit")
+__msg("mark_precise: frame1: regs= stack= before 17: (b7) r0 = 0")
+__msg("mark_precise: frame1: regs= stack= before 10: (85) call bpf_loop#181")
+__msg("mark_precise: frame0: parent state regs= stack=:")
+__msg("from 18 to 10: safe")
__naked int parent_stack_slot_precise_with_callback(void)
{
asm volatile (
#define DEFAULT_TTL 64
#define MAX_ALLOWED_PORTS 8
+#define MAX_PACKET_OFF 0xffff
+
#define swap(a, b) \
do { typeof(a) __tmp = (a); (a) = (b); (b) = __tmp; } while (0)
}
struct tcpopt_context {
- __u8 *ptr;
- __u8 *end;
+ void *data;
void *data_end;
__be32 *tsecr;
__u8 wscale;
bool option_timestamp;
bool option_sack;
+ __u32 off;
};
-static int tscookie_tcpopt_parse(struct tcpopt_context *ctx)
+static __always_inline u8 *next(struct tcpopt_context *ctx, __u32 sz)
{
- __u8 opcode, opsize;
+ __u64 off = ctx->off;
+ __u8 *data;
- if (ctx->ptr >= ctx->end)
- return 1;
- if (ctx->ptr >= ctx->data_end)
- return 1;
+ /* Verifier forbids access to packet when offset exceeds MAX_PACKET_OFF */
+ if (off > MAX_PACKET_OFF - sz)
+ return NULL;
- opcode = ctx->ptr[0];
+ data = ctx->data + off;
+ barrier_var(data);
+ if (data + sz >= ctx->data_end)
+ return NULL;
- if (opcode == TCPOPT_EOL)
- return 1;
- if (opcode == TCPOPT_NOP) {
- ++ctx->ptr;
- return 0;
- }
+ ctx->off += sz;
+ return data;
+}
- if (ctx->ptr + 1 >= ctx->end)
- return 1;
- if (ctx->ptr + 1 >= ctx->data_end)
+static int tscookie_tcpopt_parse(struct tcpopt_context *ctx)
+{
+ __u8 *opcode, *opsize, *wscale, *tsecr;
+ __u32 off = ctx->off;
+
+ opcode = next(ctx, 1);
+ if (!opcode)
return 1;
- opsize = ctx->ptr[1];
- if (opsize < 2)
+
+ if (*opcode == TCPOPT_EOL)
return 1;
+ if (*opcode == TCPOPT_NOP)
+ return 0;
- if (ctx->ptr + opsize > ctx->end)
+ opsize = next(ctx, 1);
+ if (!opsize || *opsize < 2)
return 1;
- switch (opcode) {
+ switch (*opcode) {
case TCPOPT_WINDOW:
- if (opsize == TCPOLEN_WINDOW && ctx->ptr + TCPOLEN_WINDOW <= ctx->data_end)
- ctx->wscale = ctx->ptr[2] < TCP_MAX_WSCALE ? ctx->ptr[2] : TCP_MAX_WSCALE;
+ wscale = next(ctx, 1);
+ if (!wscale)
+ return 1;
+ if (*opsize == TCPOLEN_WINDOW)
+ ctx->wscale = *wscale < TCP_MAX_WSCALE ? *wscale : TCP_MAX_WSCALE;
break;
case TCPOPT_TIMESTAMP:
- if (opsize == TCPOLEN_TIMESTAMP && ctx->ptr + TCPOLEN_TIMESTAMP <= ctx->data_end) {
+ tsecr = next(ctx, 4);
+ if (!tsecr)
+ return 1;
+ if (*opsize == TCPOLEN_TIMESTAMP) {
ctx->option_timestamp = true;
/* Client's tsval becomes our tsecr. */
- *ctx->tsecr = get_unaligned((__be32 *)(ctx->ptr + 2));
+ *ctx->tsecr = get_unaligned((__be32 *)tsecr);
}
break;
case TCPOPT_SACK_PERM:
- if (opsize == TCPOLEN_SACK_PERM)
+ if (*opsize == TCPOLEN_SACK_PERM)
ctx->option_sack = true;
break;
}
- ctx->ptr += opsize;
+ ctx->off = off + *opsize;
return 0;
}
static __always_inline bool tscookie_init(struct tcphdr *tcp_header,
__u16 tcp_len, __be32 *tsval,
- __be32 *tsecr, void *data_end)
+ __be32 *tsecr, void *data, void *data_end)
{
struct tcpopt_context loop_ctx = {
- .ptr = (__u8 *)(tcp_header + 1),
- .end = (__u8 *)tcp_header + tcp_len,
+ .data = data,
.data_end = data_end,
.tsecr = tsecr,
.wscale = TS_OPT_WSCALE_MASK,
.option_timestamp = false,
.option_sack = false,
+ /* Note: currently verifier would track .off as unbound scalar.
+ * In case if verifier would at some point get smarter and
+ * compute bounded value for this var, beware that it might
+ * hinder bpf_loop() convergence validation.
+ */
+ .off = (__u8 *)(tcp_header + 1) - (__u8 *)data,
};
u32 cookie;
cookie = (__u32)value;
if (tscookie_init((void *)hdr->tcp, hdr->tcp_len,
- &tsopt_buf[0], &tsopt_buf[1], data_end))
+ &tsopt_buf[0], &tsopt_buf[1], data, data_end))
tsopt = tsopt_buf;
/* Check that there is enough space for a SYNACK. It also covers
BPF_EXIT_INSN(),
},
.prog_type = BPF_PROG_TYPE_TRACEPOINT,
- .errstr = "back-edge from insn 0 to 0",
+ .errstr = "the call stack of 9 frames is too deep",
.result = REJECT,
},
{
BPF_EXIT_INSN(),
},
.prog_type = BPF_PROG_TYPE_TRACEPOINT,
- .errstr = "back-edge",
+ .errstr = "the call stack of 9 frames is too deep",
.result = REJECT,
},
{
BPF_EXIT_INSN(),
},
.prog_type = BPF_PROG_TYPE_TRACEPOINT,
- .errstr = "back-edge",
+ .errstr = "the call stack of 9 frames is too deep",
.result = REJECT,
},
{
BPF_MOV64_IMM(BPF_REG_0, 2),
BPF_EXIT_INSN(),
},
- .errstr = "invalid BPF_LD_IMM insn",
- .errstr_unpriv = "R1 pointer comparison",
+ .errstr = "jump into the middle of ldimm64 insn 1",
+ .errstr_unpriv = "jump into the middle of ldimm64 insn 1",
.result = REJECT,
},
{
BPF_LD_IMM64(BPF_REG_0, 1),
BPF_EXIT_INSN(),
},
- .errstr = "invalid BPF_LD_IMM insn",
- .errstr_unpriv = "R1 pointer comparison",
+ .errstr = "jump into the middle of ldimm64 insn 1",
+ .errstr_unpriv = "jump into the middle of ldimm64 insn 1",
.result = REJECT,
},
{
struct xdp_info *meta = data - sizeof(struct xdp_info);
if (meta->count != pkt->pkt_nb) {
- ksft_print_msg("[%s] expected meta_count [%d], got meta_count [%d]\n",
- __func__, pkt->pkt_nb, meta->count);
+ ksft_print_msg("[%s] expected meta_count [%d], got meta_count [%llu]\n",
+ __func__, pkt->pkt_nb,
+ (unsigned long long)meta->count);
return false;
}
if (addr >= umem->num_frames * umem->frame_size ||
addr + len > umem->num_frames * umem->frame_size) {
- ksft_print_msg("Frag invalid addr: %llx len: %u\n", addr, len);
+ ksft_print_msg("Frag invalid addr: %llx len: %u\n",
+ (unsigned long long)addr, len);
return false;
}
if (!umem->unaligned_mode && addr % umem->frame_size + len > umem->frame_size) {
- ksft_print_msg("Frag crosses frame boundary addr: %llx len: %u\n", addr, len);
+ ksft_print_msg("Frag crosses frame boundary addr: %llx len: %u\n",
+ (unsigned long long)addr, len);
return false;
}
u64 addr = *xsk_ring_cons__comp_addr(&xsk->umem->cq, idx + rcvd - 1);
ksft_print_msg("[%s] Too many packets completed\n", __func__);
- ksft_print_msg("Last completion address: %llx\n", addr);
+ ksft_print_msg("Last completion address: %llx\n",
+ (unsigned long long)addr);
return TEST_FAILURE;
}
}
if (stats.tx_invalid_descs != ifobject->xsk->pkt_stream->nb_pkts / 2) {
- ksft_print_msg("[%s] tx_invalid_descs incorrect. Got [%u] expected [%u]\n",
- __func__, stats.tx_invalid_descs,
+ ksft_print_msg("[%s] tx_invalid_descs incorrect. Got [%llu] expected [%u]\n",
+ __func__,
+ (unsigned long long)stats.tx_invalid_descs,
ifobject->xsk->pkt_stream->nb_pkts);
return TEST_FAILURE;
}
CONFIG_CRYPTO_XXHASH=y
CONFIG_DCB=y
CONFIG_DEBUG_ATOMIC_SLEEP=y
-CONFIG_DEBUG_CREDENTIALS=y
CONFIG_DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT=y
CONFIG_DEBUG_MEMORY_INIT=y
CONFIG_DEFAULT_FQ_CODEL=y
__u64 bitmap_size, __u32 flags,
struct __test_metadata *_metadata)
{
- unsigned long i, count, nbits = bitmap_size * BITS_PER_BYTE;
+ unsigned long i, nbits = bitmap_size * BITS_PER_BYTE;
unsigned long nr = nbits / 2;
__u64 out_dirty = 0;
/* Mark all even bits as dirty in the mock domain */
- for (count = 0, i = 0; i < nbits; count += !(i % 2), i++)
- if (!(i % 2))
- set_bit(i, (unsigned long *)bitmap);
- ASSERT_EQ(nr, count);
+ for (i = 0; i < nbits; i += 2)
+ set_bit(i, (unsigned long *)bitmap);
test_cmd_mock_domain_set_dirty(fd, hwpt_id, length, iova, page_size,
bitmap, &out_dirty);
memset(bitmap, 0, bitmap_size);
test_cmd_get_dirty_bitmap(fd, hwpt_id, length, iova, page_size, bitmap,
flags);
- for (count = 0, i = 0; i < nbits; count += !(i % 2), i++)
+ /* Beware ASSERT_EQ() is two statements -- braces are not redundant! */
+ for (i = 0; i < nbits; i++) {
ASSERT_EQ(!(i % 2), test_bit(i, (unsigned long *)bitmap));
- ASSERT_EQ(count, out_dirty);
+ }
memset(bitmap, 0, bitmap_size);
test_cmd_get_dirty_bitmap(fd, hwpt_id, length, iova, page_size, bitmap,
ARCH_DIR := $(ARCH)
endif
-ifeq ($(ARCH),arm64)
-arm64_tools_dir := $(top_srcdir)/tools/arch/arm64/tools/
-GEN_HDRS := $(top_srcdir)/tools/arch/arm64/include/generated/
-CFLAGS += -I$(GEN_HDRS)
-
-$(GEN_HDRS): $(wildcard $(arm64_tools_dir)/*)
- $(MAKE) -C $(arm64_tools_dir)
-endif
-
LIBKVM += lib/assert.c
LIBKVM += lib/elf.c
LIBKVM += lib/guest_modes.c
TEST_GEN_PROGS_x86_64 += x86_64/hyperv_tlb_flush
TEST_GEN_PROGS_x86_64 += x86_64/kvm_clock_test
TEST_GEN_PROGS_x86_64 += x86_64/kvm_pv_test
-TEST_GEN_PROGS_x86_64 += x86_64/mmio_warning_test
TEST_GEN_PROGS_x86_64 += x86_64/monitor_mwait_test
TEST_GEN_PROGS_x86_64 += x86_64/nested_exceptions_test
TEST_GEN_PROGS_x86_64 += x86_64/platform_info_test
TEST_GEN_PROGS_riscv += demand_paging_test
TEST_GEN_PROGS_riscv += dirty_log_test
-TEST_GEN_PROGS_riscv += guest_print_test
TEST_GEN_PROGS_riscv += get-reg-list
+TEST_GEN_PROGS_riscv += guest_print_test
+TEST_GEN_PROGS_riscv += kvm_binary_stats_test
TEST_GEN_PROGS_riscv += kvm_create_max_vcpus
TEST_GEN_PROGS_riscv += kvm_page_table_test
TEST_GEN_PROGS_riscv += set_memory_region_test
-TEST_GEN_PROGS_riscv += kvm_binary_stats_test
+TEST_GEN_PROGS_riscv += steal_time
SPLIT_TESTS += get-reg-list
LINUX_TOOL_ARCH_INCLUDE = $(top_srcdir)/tools/arch/$(ARCH)/include
endif
CFLAGS += -Wall -Wstrict-prototypes -Wuninitialized -O2 -g -std=gnu99 \
- -Wno-gnu-variable-sized-type-not-at-end -MD\
+ -Wno-gnu-variable-sized-type-not-at-end -MD -MP \
-fno-builtin-memcmp -fno-builtin-memcpy -fno-builtin-memset \
-fno-builtin-strnlen \
-fno-stack-protector -fno-PIE -I$(LINUX_TOOL_INCLUDE) \
ifeq ($(ARCH),s390)
CFLAGS += -march=z10
endif
+ifeq ($(ARCH),arm64)
+tools_dir := $(top_srcdir)/tools
+arm64_tools_dir := $(tools_dir)/arch/arm64/tools/
+
+ifneq ($(abs_objdir),)
+arm64_hdr_outdir := $(abs_objdir)/tools/
+else
+arm64_hdr_outdir := $(tools_dir)/
+endif
+
+GEN_HDRS := $(arm64_hdr_outdir)arch/arm64/include/generated/
+CFLAGS += -I$(GEN_HDRS)
+
+$(GEN_HDRS): $(wildcard $(arm64_tools_dir)/*)
+ $(MAKE) -C $(arm64_tools_dir) OUTPUT=$(arm64_hdr_outdir)
+endif
no-pie-option := $(call try-run, echo 'int main(void) { return 0; }' | \
$(CC) -Werror $(CFLAGS) -no-pie -x c - -o "$$TMP", -no-pie)
for_each_sublist(c, s) {
if (!strcmp(s->name, "base"))
continue;
- strcat(c->name + len, s->name);
- len += strlen(s->name) + 1;
- c->name[len - 1] = '+';
+ if (len)
+ c->name[len++] = '+';
+ strcpy(c->name + len, s->name);
+ len += strlen(s->name);
}
- c->name[len - 1] = '\0';
+ c->name[len] = '\0';
return c->name;
}
/* Access flag update enable/disable */
#define TCR_EL1_HA (1ULL << 39)
-void aarch64_get_supported_page_sizes(uint32_t ipa,
- bool *ps4k, bool *ps16k, bool *ps64k);
+void aarch64_get_supported_page_sizes(uint32_t ipa, uint32_t *ipa4k,
+ uint32_t *ipa16k, uint32_t *ipa64k);
void vm_init_descriptor_tables(struct kvm_vm *vm);
void vcpu_init_descriptor_tables(struct kvm_vcpu *vcpu);
extern struct guest_mode guest_modes[NUM_VM_MODES];
-#define guest_mode_append(mode, supported, enabled) ({ \
- guest_modes[mode] = (struct guest_mode){ supported, enabled }; \
+#define guest_mode_append(mode, enabled) ({ \
+ guest_modes[mode] = (struct guest_mode){ (enabled), (enabled) }; \
})
void guest_modes_append_default(void);
const char *name;
long capability;
int feature;
+ int feature_type;
bool finalize;
__u64 *regs;
__u64 regs_n;
enum vm_guest_mode {
VM_MODE_P52V48_4K,
+ VM_MODE_P52V48_16K,
VM_MODE_P52V48_64K,
VM_MODE_P48V48_4K,
VM_MODE_P48V48_16K,
#define __KVM_SYSCALL_ERROR(_name, _ret) \
"%s failed, rc: %i errno: %i (%s)", (_name), (_ret), errno, strerror(errno)
+/*
+ * Use the "inner", double-underscore macro when reporting errors from within
+ * other macros so that the name of ioctl() and not its literal numeric value
+ * is printed on error. The "outer" macro is strongly preferred when reporting
+ * errors "directly", i.e. without an additional layer of macros, as it reduces
+ * the probability of passing in the wrong string.
+ */
#define __KVM_IOCTL_ERROR(_name, _ret) __KVM_SYSCALL_ERROR(_name, _ret)
#define KVM_IOCTL_ERROR(_ioctl, _ret) __KVM_IOCTL_ERROR(#_ioctl, _ret)
#define __kvm_ioctl(kvm_fd, cmd, arg) \
kvm_do_ioctl(kvm_fd, cmd, arg)
-
-#define _kvm_ioctl(kvm_fd, cmd, name, arg) \
+#define kvm_ioctl(kvm_fd, cmd, arg) \
({ \
int ret = __kvm_ioctl(kvm_fd, cmd, arg); \
\
- TEST_ASSERT(!ret, __KVM_IOCTL_ERROR(name, ret)); \
+ TEST_ASSERT(!ret, __KVM_IOCTL_ERROR(#cmd, ret)); \
})
-#define kvm_ioctl(kvm_fd, cmd, arg) \
- _kvm_ioctl(kvm_fd, cmd, #cmd, arg)
-
static __always_inline void static_assert_is_vm(struct kvm_vm *vm) { }
#define __vm_ioctl(vm, cmd, arg) \
kvm_do_ioctl((vm)->fd, cmd, arg); \
})
-#define _vm_ioctl(vm, cmd, name, arg) \
+/*
+ * Assert that a VM or vCPU ioctl() succeeded, with extra magic to detect if
+ * the ioctl() failed because KVM killed/bugged the VM. To detect a dead VM,
+ * probe KVM_CAP_USER_MEMORY, which (a) has been supported by KVM since before
+ * selftests existed and (b) should never outright fail, i.e. is supposed to
+ * return 0 or 1. If KVM kills a VM, KVM returns -EIO for all ioctl()s for the
+ * VM and its vCPUs, including KVM_CHECK_EXTENSION.
+ */
+#define __TEST_ASSERT_VM_VCPU_IOCTL(cond, name, ret, vm) \
+do { \
+ int __errno = errno; \
+ \
+ static_assert_is_vm(vm); \
+ \
+ if (cond) \
+ break; \
+ \
+ if (errno == EIO && \
+ __vm_ioctl(vm, KVM_CHECK_EXTENSION, (void *)KVM_CAP_USER_MEMORY) < 0) { \
+ TEST_ASSERT(errno == EIO, "KVM killed the VM, should return -EIO"); \
+ TEST_FAIL("KVM killed/bugged the VM, check the kernel log for clues"); \
+ } \
+ errno = __errno; \
+ TEST_ASSERT(cond, __KVM_IOCTL_ERROR(name, ret)); \
+} while (0)
+
+#define TEST_ASSERT_VM_VCPU_IOCTL(cond, cmd, ret, vm) \
+ __TEST_ASSERT_VM_VCPU_IOCTL(cond, #cmd, ret, vm)
+
+#define vm_ioctl(vm, cmd, arg) \
({ \
int ret = __vm_ioctl(vm, cmd, arg); \
\
- TEST_ASSERT(!ret, __KVM_IOCTL_ERROR(name, ret)); \
+ __TEST_ASSERT_VM_VCPU_IOCTL(!ret, #cmd, ret, vm); \
})
-#define vm_ioctl(vm, cmd, arg) \
- _vm_ioctl(vm, cmd, #cmd, arg)
-
-
static __always_inline void static_assert_is_vcpu(struct kvm_vcpu *vcpu) { }
#define __vcpu_ioctl(vcpu, cmd, arg) \
kvm_do_ioctl((vcpu)->fd, cmd, arg); \
})
-#define _vcpu_ioctl(vcpu, cmd, name, arg) \
+#define vcpu_ioctl(vcpu, cmd, arg) \
({ \
int ret = __vcpu_ioctl(vcpu, cmd, arg); \
\
- TEST_ASSERT(!ret, __KVM_IOCTL_ERROR(name, ret)); \
+ __TEST_ASSERT_VM_VCPU_IOCTL(!ret, #cmd, ret, (vcpu)->vm); \
})
-#define vcpu_ioctl(vcpu, cmd, arg) \
- _vcpu_ioctl(vcpu, cmd, #cmd, arg)
-
/*
* Looks up and returns the value corresponding to the capability
* (KVM_CAP_*) given by cap.
{
int ret = __vm_ioctl(vm, KVM_CHECK_EXTENSION, (void *)cap);
- TEST_ASSERT(ret >= 0, KVM_IOCTL_ERROR(KVM_CHECK_EXTENSION, ret));
+ TEST_ASSERT_VM_VCPU_IOCTL(ret >= 0, KVM_CHECK_EXTENSION, ret, vm);
return ret;
}
{
int fd = __vm_ioctl(vm, KVM_GET_STATS_FD, NULL);
- TEST_ASSERT(fd >= 0, KVM_IOCTL_ERROR(KVM_GET_STATS_FD, fd));
+ TEST_ASSERT_VM_VCPU_IOCTL(fd >= 0, KVM_GET_STATS_FD, fd, vm);
return fd;
}
{
int fd = __vcpu_ioctl(vcpu, KVM_GET_STATS_FD, NULL);
- TEST_ASSERT(fd >= 0, KVM_IOCTL_ERROR(KVM_GET_STATS_FD, fd));
+ TEST_ASSERT_VM_VCPU_IOCTL(fd >= 0, KVM_CHECK_EXTENSION, fd, vcpu->vm);
return fd;
}
#include "kvm_util.h"
#include <linux/stringify.h>
-static inline uint64_t __kvm_reg_id(uint64_t type, uint64_t idx,
- uint64_t size)
+static inline uint64_t __kvm_reg_id(uint64_t type, uint64_t subtype,
+ uint64_t idx, uint64_t size)
{
- return KVM_REG_RISCV | type | idx | size;
+ return KVM_REG_RISCV | type | subtype | idx | size;
}
#if __riscv_xlen == 64
#define KVM_REG_SIZE_ULONG KVM_REG_SIZE_U32
#endif
-#define RISCV_CONFIG_REG(name) __kvm_reg_id(KVM_REG_RISCV_CONFIG, \
- KVM_REG_RISCV_CONFIG_REG(name), \
- KVM_REG_SIZE_ULONG)
+#define RISCV_CONFIG_REG(name) __kvm_reg_id(KVM_REG_RISCV_CONFIG, 0, \
+ KVM_REG_RISCV_CONFIG_REG(name), \
+ KVM_REG_SIZE_ULONG)
-#define RISCV_CORE_REG(name) __kvm_reg_id(KVM_REG_RISCV_CORE, \
- KVM_REG_RISCV_CORE_REG(name), \
- KVM_REG_SIZE_ULONG)
+#define RISCV_CORE_REG(name) __kvm_reg_id(KVM_REG_RISCV_CORE, 0, \
+ KVM_REG_RISCV_CORE_REG(name), \
+ KVM_REG_SIZE_ULONG)
-#define RISCV_CSR_REG(name) __kvm_reg_id(KVM_REG_RISCV_CSR, \
- KVM_REG_RISCV_CSR_REG(name), \
- KVM_REG_SIZE_ULONG)
+#define RISCV_GENERAL_CSR_REG(name) __kvm_reg_id(KVM_REG_RISCV_CSR, \
+ KVM_REG_RISCV_CSR_GENERAL, \
+ KVM_REG_RISCV_CSR_REG(name), \
+ KVM_REG_SIZE_ULONG)
-#define RISCV_TIMER_REG(name) __kvm_reg_id(KVM_REG_RISCV_TIMER, \
- KVM_REG_RISCV_TIMER_REG(name), \
- KVM_REG_SIZE_U64)
+#define RISCV_TIMER_REG(name) __kvm_reg_id(KVM_REG_RISCV_TIMER, 0, \
+ KVM_REG_RISCV_TIMER_REG(name), \
+ KVM_REG_SIZE_U64)
-#define RISCV_ISA_EXT_REG(idx) __kvm_reg_id(KVM_REG_RISCV_ISA_EXT, \
- idx, KVM_REG_SIZE_ULONG)
+#define RISCV_ISA_EXT_REG(idx) __kvm_reg_id(KVM_REG_RISCV_ISA_EXT, \
+ KVM_REG_RISCV_ISA_SINGLE, \
+ idx, KVM_REG_SIZE_ULONG)
+
+#define RISCV_SBI_EXT_REG(idx) __kvm_reg_id(KVM_REG_RISCV_SBI_EXT, \
+ KVM_REG_RISCV_SBI_SINGLE, \
+ idx, KVM_REG_SIZE_ULONG)
/* L3 index Bit[47:39] */
#define PGTBL_L3_INDEX_MASK 0x0000FF8000000000ULL
#define SATP_ASID_SHIFT 44
#define SATP_ASID_MASK _AC(0xFFFF, UL)
+/* SBI return error codes */
+#define SBI_SUCCESS 0
+#define SBI_ERR_FAILURE -1
+#define SBI_ERR_NOT_SUPPORTED -2
+#define SBI_ERR_INVALID_PARAM -3
+#define SBI_ERR_DENIED -4
+#define SBI_ERR_INVALID_ADDRESS -5
+#define SBI_ERR_ALREADY_AVAILABLE -6
+#define SBI_ERR_ALREADY_STARTED -7
+#define SBI_ERR_ALREADY_STOPPED -8
+
#define SBI_EXT_EXPERIMENTAL_START 0x08000000
#define SBI_EXT_EXPERIMENTAL_END 0x08FFFFFF
#define KVM_RISCV_SELFTESTS_SBI_UCALL 0
#define KVM_RISCV_SELFTESTS_SBI_UNEXP 1
+enum sbi_ext_id {
+ SBI_EXT_BASE = 0x10,
+ SBI_EXT_STA = 0x535441,
+};
+
+enum sbi_ext_base_fid {
+ SBI_EXT_BASE_PROBE_EXT = 3,
+};
+
struct sbiret {
long error;
long value;
unsigned long arg3, unsigned long arg4,
unsigned long arg5);
+bool guest_sbi_probe_extension(int extid, long *out_val);
+
#endif /* SELFTEST_KVM_PROCESSOR_H */
}
int guest_vsnprintf(char *buf, int n, const char *fmt, va_list args);
-int guest_snprintf(char *buf, int n, const char *fmt, ...);
+__printf(3, 4) int guest_snprintf(char *buf, int n, const char *fmt, ...);
char *strdup_printf(const char *fmt, ...) __attribute__((format(printf, 1, 2), nonnull(1)));
void *ucall_arch_get_ucall(struct kvm_vcpu *vcpu);
void ucall(uint64_t cmd, int nargs, ...);
-void ucall_fmt(uint64_t cmd, const char *fmt, ...);
-void ucall_assert(uint64_t cmd, const char *exp, const char *file,
- unsigned int line, const char *fmt, ...);
+__printf(2, 3) void ucall_fmt(uint64_t cmd, const char *fmt, ...);
+__printf(5, 6) void ucall_assert(uint64_t cmd, const char *exp,
+ const char *file, unsigned int line,
+ const char *fmt, ...);
uint64_t get_ucall(struct kvm_vcpu *vcpu, struct ucall *uc);
void ucall_init(struct kvm_vm *vm, vm_paddr_t mmio_gpa);
int ucall_nr_pages_required(uint64_t page_size);
#include "kvm_util.h"
#include "processor.h"
#include <linux/bitfield.h>
+#include <linux/sizes.h>
#define DEFAULT_ARM64_GUEST_STACK_VADDR_MIN 0xac0000
return (gva >> vm->page_shift) & mask;
}
+static inline bool use_lpa2_pte_format(struct kvm_vm *vm)
+{
+ return (vm->page_size == SZ_4K || vm->page_size == SZ_16K) &&
+ (vm->pa_bits > 48 || vm->va_bits > 48);
+}
+
static uint64_t addr_pte(struct kvm_vm *vm, uint64_t pa, uint64_t attrs)
{
uint64_t pte;
- pte = pa & GENMASK(47, vm->page_shift);
- if (vm->page_shift == 16)
- pte |= FIELD_GET(GENMASK(51, 48), pa) << 12;
+ if (use_lpa2_pte_format(vm)) {
+ pte = pa & GENMASK(49, vm->page_shift);
+ pte |= FIELD_GET(GENMASK(51, 50), pa) << 8;
+ attrs &= ~GENMASK(9, 8);
+ } else {
+ pte = pa & GENMASK(47, vm->page_shift);
+ if (vm->page_shift == 16)
+ pte |= FIELD_GET(GENMASK(51, 48), pa) << 12;
+ }
pte |= attrs;
return pte;
{
uint64_t pa;
- pa = pte & GENMASK(47, vm->page_shift);
- if (vm->page_shift == 16)
- pa |= FIELD_GET(GENMASK(15, 12), pte) << 48;
+ if (use_lpa2_pte_format(vm)) {
+ pa = pte & GENMASK(49, vm->page_shift);
+ pa |= FIELD_GET(GENMASK(9, 8), pte) << 50;
+ } else {
+ pa = pte & GENMASK(47, vm->page_shift);
+ if (vm->page_shift == 16)
+ pa |= FIELD_GET(GENMASK(15, 12), pte) << 48;
+ }
return pa;
}
/* Configure base granule size */
switch (vm->mode) {
- case VM_MODE_P52V48_4K:
- TEST_FAIL("AArch64 does not support 4K sized pages "
- "with 52-bit physical address ranges");
case VM_MODE_PXXV48_4K:
TEST_FAIL("AArch64 does not support 4K sized pages "
"with ANY-bit physical address ranges");
case VM_MODE_P36V48_64K:
tcr_el1 |= 1ul << 14; /* TG0 = 64KB */
break;
+ case VM_MODE_P52V48_16K:
case VM_MODE_P48V48_16K:
case VM_MODE_P40V48_16K:
case VM_MODE_P36V48_16K:
case VM_MODE_P36V47_16K:
tcr_el1 |= 2ul << 14; /* TG0 = 16KB */
break;
+ case VM_MODE_P52V48_4K:
case VM_MODE_P48V48_4K:
case VM_MODE_P40V48_4K:
case VM_MODE_P36V48_4K:
/* Configure output size */
switch (vm->mode) {
+ case VM_MODE_P52V48_4K:
+ case VM_MODE_P52V48_16K:
case VM_MODE_P52V48_64K:
tcr_el1 |= 6ul << 32; /* IPS = 52 bits */
ttbr0_el1 |= FIELD_GET(GENMASK(51, 48), vm->pgd) << 2;
/* TCR_EL1 |= IRGN0:WBWA | ORGN0:WBWA | SH0:Inner-Shareable */;
tcr_el1 |= (1 << 8) | (1 << 10) | (3 << 12);
tcr_el1 |= (64 - vm->va_bits) /* T0SZ */;
+ if (use_lpa2_pte_format(vm))
+ tcr_el1 |= (1ul << 59) /* DS */;
vcpu_set_reg(vcpu, KVM_ARM64_SYS_REG(SYS_SCTLR_EL1), sctlr_el1);
vcpu_set_reg(vcpu, KVM_ARM64_SYS_REG(SYS_TCR_EL1), tcr_el1);
return read_sysreg(tpidr_el1);
}
-void aarch64_get_supported_page_sizes(uint32_t ipa,
- bool *ps4k, bool *ps16k, bool *ps64k)
+static uint32_t max_ipa_for_page_size(uint32_t vm_ipa, uint32_t gran,
+ uint32_t not_sup_val, uint32_t ipa52_min_val)
+{
+ if (gran == not_sup_val)
+ return 0;
+ else if (gran >= ipa52_min_val && vm_ipa >= 52)
+ return 52;
+ else
+ return min(vm_ipa, 48U);
+}
+
+void aarch64_get_supported_page_sizes(uint32_t ipa, uint32_t *ipa4k,
+ uint32_t *ipa16k, uint32_t *ipa64k)
{
struct kvm_vcpu_init preferred_init;
int kvm_fd, vm_fd, vcpu_fd, err;
uint64_t val;
+ uint32_t gran;
struct kvm_one_reg reg = {
.id = KVM_ARM64_SYS_REG(SYS_ID_AA64MMFR0_EL1),
.addr = (uint64_t)&val,
err = ioctl(vcpu_fd, KVM_GET_ONE_REG, ®);
TEST_ASSERT(err == 0, KVM_IOCTL_ERROR(KVM_GET_ONE_REG, vcpu_fd));
- *ps4k = FIELD_GET(ARM64_FEATURE_MASK(ID_AA64MMFR0_EL1_TGRAN4), val) != 0xf;
- *ps64k = FIELD_GET(ARM64_FEATURE_MASK(ID_AA64MMFR0_EL1_TGRAN64), val) == 0;
- *ps16k = FIELD_GET(ARM64_FEATURE_MASK(ID_AA64MMFR0_EL1_TGRAN16), val) != 0;
+ gran = FIELD_GET(ARM64_FEATURE_MASK(ID_AA64MMFR0_EL1_TGRAN4), val);
+ *ipa4k = max_ipa_for_page_size(ipa, gran, ID_AA64MMFR0_EL1_TGRAN4_NI,
+ ID_AA64MMFR0_EL1_TGRAN4_52_BIT);
+
+ gran = FIELD_GET(ARM64_FEATURE_MASK(ID_AA64MMFR0_EL1_TGRAN64), val);
+ *ipa64k = max_ipa_for_page_size(ipa, gran, ID_AA64MMFR0_EL1_TGRAN64_NI,
+ ID_AA64MMFR0_EL1_TGRAN64_IMP);
+
+ gran = FIELD_GET(ARM64_FEATURE_MASK(ID_AA64MMFR0_EL1_TGRAN16), val);
+ *ipa16k = max_ipa_for_page_size(ipa, gran, ID_AA64MMFR0_EL1_TGRAN16_NI,
+ ID_AA64MMFR0_EL1_TGRAN16_52_BIT);
close(vcpu_fd);
close(vm_fd);
void guest_modes_append_default(void)
{
#ifndef __aarch64__
- guest_mode_append(VM_MODE_DEFAULT, true, true);
+ guest_mode_append(VM_MODE_DEFAULT, true);
#else
{
unsigned int limit = kvm_check_cap(KVM_CAP_ARM_VM_IPA_SIZE);
- bool ps4k, ps16k, ps64k;
+ uint32_t ipa4k, ipa16k, ipa64k;
int i;
- aarch64_get_supported_page_sizes(limit, &ps4k, &ps16k, &ps64k);
+ aarch64_get_supported_page_sizes(limit, &ipa4k, &ipa16k, &ipa64k);
- vm_mode_default = NUM_VM_MODES;
+ guest_mode_append(VM_MODE_P52V48_4K, ipa4k >= 52);
+ guest_mode_append(VM_MODE_P52V48_16K, ipa16k >= 52);
+ guest_mode_append(VM_MODE_P52V48_64K, ipa64k >= 52);
- if (limit >= 52)
- guest_mode_append(VM_MODE_P52V48_64K, ps64k, ps64k);
- if (limit >= 48) {
- guest_mode_append(VM_MODE_P48V48_4K, ps4k, ps4k);
- guest_mode_append(VM_MODE_P48V48_16K, ps16k, ps16k);
- guest_mode_append(VM_MODE_P48V48_64K, ps64k, ps64k);
- }
- if (limit >= 40) {
- guest_mode_append(VM_MODE_P40V48_4K, ps4k, ps4k);
- guest_mode_append(VM_MODE_P40V48_16K, ps16k, ps16k);
- guest_mode_append(VM_MODE_P40V48_64K, ps64k, ps64k);
- if (ps4k)
- vm_mode_default = VM_MODE_P40V48_4K;
- }
- if (limit >= 36) {
- guest_mode_append(VM_MODE_P36V48_4K, ps4k, ps4k);
- guest_mode_append(VM_MODE_P36V48_16K, ps16k, ps16k);
- guest_mode_append(VM_MODE_P36V48_64K, ps64k, ps64k);
- guest_mode_append(VM_MODE_P36V47_16K, ps16k, ps16k);
- }
+ guest_mode_append(VM_MODE_P48V48_4K, ipa4k >= 48);
+ guest_mode_append(VM_MODE_P48V48_16K, ipa16k >= 48);
+ guest_mode_append(VM_MODE_P48V48_64K, ipa64k >= 48);
+
+ guest_mode_append(VM_MODE_P40V48_4K, ipa4k >= 40);
+ guest_mode_append(VM_MODE_P40V48_16K, ipa16k >= 40);
+ guest_mode_append(VM_MODE_P40V48_64K, ipa64k >= 40);
+
+ guest_mode_append(VM_MODE_P36V48_4K, ipa4k >= 36);
+ guest_mode_append(VM_MODE_P36V48_16K, ipa16k >= 36);
+ guest_mode_append(VM_MODE_P36V48_64K, ipa64k >= 36);
+ guest_mode_append(VM_MODE_P36V47_16K, ipa16k >= 36);
+
+ vm_mode_default = ipa4k >= 40 ? VM_MODE_P40V48_4K : NUM_VM_MODES;
/*
* Pick the first supported IPA size if the default
close(kvm_fd);
/* Starting with z13 we have 47bits of physical address */
if (info.ibc >= 0x30)
- guest_mode_append(VM_MODE_P47V64_4K, true, true);
+ guest_mode_append(VM_MODE_P47V64_4K, true);
}
#endif
#ifdef __riscv
unsigned int sz = kvm_check_cap(KVM_CAP_VM_GPA_BITS);
if (sz >= 52)
- guest_mode_append(VM_MODE_P52V48_4K, true, true);
+ guest_mode_append(VM_MODE_P52V48_4K, true);
if (sz >= 48)
- guest_mode_append(VM_MODE_P48V48_4K, true, true);
+ guest_mode_append(VM_MODE_P48V48_4K, true);
}
#endif
}
{
static const char * const strings[] = {
[VM_MODE_P52V48_4K] = "PA-bits:52, VA-bits:48, 4K pages",
+ [VM_MODE_P52V48_16K] = "PA-bits:52, VA-bits:48, 16K pages",
[VM_MODE_P52V48_64K] = "PA-bits:52, VA-bits:48, 64K pages",
[VM_MODE_P48V48_4K] = "PA-bits:48, VA-bits:48, 4K pages",
[VM_MODE_P48V48_16K] = "PA-bits:48, VA-bits:48, 16K pages",
const struct vm_guest_mode_params vm_guest_mode_params[] = {
[VM_MODE_P52V48_4K] = { 52, 48, 0x1000, 12 },
+ [VM_MODE_P52V48_16K] = { 52, 48, 0x4000, 14 },
[VM_MODE_P52V48_64K] = { 52, 48, 0x10000, 16 },
[VM_MODE_P48V48_4K] = { 48, 48, 0x1000, 12 },
[VM_MODE_P48V48_16K] = { 48, 48, 0x4000, 14 },
case VM_MODE_P36V48_64K:
vm->pgtable_levels = 3;
break;
+ case VM_MODE_P52V48_16K:
case VM_MODE_P48V48_16K:
case VM_MODE_P40V48_16K:
case VM_MODE_P36V48_16K:
vcpu->vm = vm;
vcpu->id = vcpu_id;
vcpu->fd = __vm_ioctl(vm, KVM_CREATE_VCPU, (void *)(unsigned long)vcpu_id);
- TEST_ASSERT(vcpu->fd >= 0, KVM_IOCTL_ERROR(KVM_CREATE_VCPU, vcpu->fd));
+ TEST_ASSERT_VM_VCPU_IOCTL(vcpu->fd >= 0, KVM_CREATE_VCPU, vcpu->fd, vm);
TEST_ASSERT(vcpu_mmap_sz() >= sizeof(*vcpu->run), "vcpu mmap size "
"smaller than expected, vcpu_mmap_sz: %i expected_min: %zi",
satp = (vm->pgd >> PGTBL_PAGE_SIZE_SHIFT) & SATP_PPN;
satp |= SATP_MODE_48;
- vcpu_set_reg(vcpu, RISCV_CSR_REG(satp), satp);
+ vcpu_set_reg(vcpu, RISCV_GENERAL_CSR_REG(satp), satp);
}
void vcpu_arch_dump(FILE *stream, struct kvm_vcpu *vcpu, uint8_t indent)
vcpu_set_reg(vcpu, RISCV_CORE_REG(regs.pc), (unsigned long)guest_code);
/* Setup default exception vector of guest */
- vcpu_set_reg(vcpu, RISCV_CSR_REG(stvec), (unsigned long)guest_unexp_trap);
+ vcpu_set_reg(vcpu, RISCV_GENERAL_CSR_REG(stvec), (unsigned long)guest_unexp_trap);
return vcpu;
}
void assert_on_unhandled_exception(struct kvm_vcpu *vcpu)
{
}
+
+struct sbiret sbi_ecall(int ext, int fid, unsigned long arg0,
+ unsigned long arg1, unsigned long arg2,
+ unsigned long arg3, unsigned long arg4,
+ unsigned long arg5)
+{
+ register uintptr_t a0 asm ("a0") = (uintptr_t)(arg0);
+ register uintptr_t a1 asm ("a1") = (uintptr_t)(arg1);
+ register uintptr_t a2 asm ("a2") = (uintptr_t)(arg2);
+ register uintptr_t a3 asm ("a3") = (uintptr_t)(arg3);
+ register uintptr_t a4 asm ("a4") = (uintptr_t)(arg4);
+ register uintptr_t a5 asm ("a5") = (uintptr_t)(arg5);
+ register uintptr_t a6 asm ("a6") = (uintptr_t)(fid);
+ register uintptr_t a7 asm ("a7") = (uintptr_t)(ext);
+ struct sbiret ret;
+
+ asm volatile (
+ "ecall"
+ : "+r" (a0), "+r" (a1)
+ : "r" (a2), "r" (a3), "r" (a4), "r" (a5), "r" (a6), "r" (a7)
+ : "memory");
+ ret.error = a0;
+ ret.value = a1;
+
+ return ret;
+}
+
+bool guest_sbi_probe_extension(int extid, long *out_val)
+{
+ struct sbiret ret;
+
+ ret = sbi_ecall(SBI_EXT_BASE, SBI_EXT_BASE_PROBE_EXT, extid,
+ 0, 0, 0, 0, 0);
+
+ __GUEST_ASSERT(!ret.error || ret.error == SBI_ERR_NOT_SUPPORTED,
+ "ret.error=%ld, ret.value=%ld\n", ret.error, ret.value);
+
+ if (ret.error == SBI_ERR_NOT_SUPPORTED)
+ return false;
+
+ if (out_val)
+ *out_val = ret.value;
+
+ return true;
+}
#include "kvm_util.h"
#include "processor.h"
-struct sbiret sbi_ecall(int ext, int fid, unsigned long arg0,
- unsigned long arg1, unsigned long arg2,
- unsigned long arg3, unsigned long arg4,
- unsigned long arg5)
-{
- register uintptr_t a0 asm ("a0") = (uintptr_t)(arg0);
- register uintptr_t a1 asm ("a1") = (uintptr_t)(arg1);
- register uintptr_t a2 asm ("a2") = (uintptr_t)(arg2);
- register uintptr_t a3 asm ("a3") = (uintptr_t)(arg3);
- register uintptr_t a4 asm ("a4") = (uintptr_t)(arg4);
- register uintptr_t a5 asm ("a5") = (uintptr_t)(arg5);
- register uintptr_t a6 asm ("a6") = (uintptr_t)(fid);
- register uintptr_t a7 asm ("a7") = (uintptr_t)(ext);
- struct sbiret ret;
-
- asm volatile (
- "ecall"
- : "+r" (a0), "+r" (a1)
- : "r" (a2), "r" (a3), "r" (a4), "r" (a5), "r" (a6), "r" (a7)
- : "memory");
- ret.error = a0;
- ret.value = a1;
-
- return ret;
-}
-
void *ucall_arch_get_ucall(struct kvm_vcpu *vcpu)
{
struct kvm_run *run = vcpu->run;
#define REG_MASK (KVM_REG_ARCH_MASK | KVM_REG_SIZE_MASK)
+enum {
+ VCPU_FEATURE_ISA_EXT = 0,
+ VCPU_FEATURE_SBI_EXT,
+};
+
static bool isa_ext_cant_disable[KVM_RISCV_ISA_EXT_MAX];
bool filter_reg(__u64 reg)
*
* Note: The below list is alphabetically sorted.
*/
- case KVM_REG_RISCV_ISA_EXT | KVM_RISCV_ISA_EXT_A:
- case KVM_REG_RISCV_ISA_EXT | KVM_RISCV_ISA_EXT_C:
- case KVM_REG_RISCV_ISA_EXT | KVM_RISCV_ISA_EXT_D:
- case KVM_REG_RISCV_ISA_EXT | KVM_RISCV_ISA_EXT_F:
- case KVM_REG_RISCV_ISA_EXT | KVM_RISCV_ISA_EXT_H:
- case KVM_REG_RISCV_ISA_EXT | KVM_RISCV_ISA_EXT_I:
- case KVM_REG_RISCV_ISA_EXT | KVM_RISCV_ISA_EXT_M:
- case KVM_REG_RISCV_ISA_EXT | KVM_RISCV_ISA_EXT_V:
- case KVM_REG_RISCV_ISA_EXT | KVM_RISCV_ISA_EXT_SMSTATEEN:
- case KVM_REG_RISCV_ISA_EXT | KVM_RISCV_ISA_EXT_SSAIA:
- case KVM_REG_RISCV_ISA_EXT | KVM_RISCV_ISA_EXT_SSTC:
- case KVM_REG_RISCV_ISA_EXT | KVM_RISCV_ISA_EXT_SVINVAL:
- case KVM_REG_RISCV_ISA_EXT | KVM_RISCV_ISA_EXT_SVNAPOT:
- case KVM_REG_RISCV_ISA_EXT | KVM_RISCV_ISA_EXT_SVPBMT:
- case KVM_REG_RISCV_ISA_EXT | KVM_RISCV_ISA_EXT_ZBA:
- case KVM_REG_RISCV_ISA_EXT | KVM_RISCV_ISA_EXT_ZBB:
- case KVM_REG_RISCV_ISA_EXT | KVM_RISCV_ISA_EXT_ZBS:
- case KVM_REG_RISCV_ISA_EXT | KVM_RISCV_ISA_EXT_ZICBOM:
- case KVM_REG_RISCV_ISA_EXT | KVM_RISCV_ISA_EXT_ZICBOZ:
- case KVM_REG_RISCV_ISA_EXT | KVM_RISCV_ISA_EXT_ZICNTR:
- case KVM_REG_RISCV_ISA_EXT | KVM_RISCV_ISA_EXT_ZICOND:
- case KVM_REG_RISCV_ISA_EXT | KVM_RISCV_ISA_EXT_ZICSR:
- case KVM_REG_RISCV_ISA_EXT | KVM_RISCV_ISA_EXT_ZIFENCEI:
- case KVM_REG_RISCV_ISA_EXT | KVM_RISCV_ISA_EXT_ZIHINTPAUSE:
- case KVM_REG_RISCV_ISA_EXT | KVM_RISCV_ISA_EXT_ZIHPM:
+ case KVM_REG_RISCV_ISA_EXT | KVM_REG_RISCV_ISA_SINGLE | KVM_RISCV_ISA_EXT_A:
+ case KVM_REG_RISCV_ISA_EXT | KVM_REG_RISCV_ISA_SINGLE | KVM_RISCV_ISA_EXT_C:
+ case KVM_REG_RISCV_ISA_EXT | KVM_REG_RISCV_ISA_SINGLE | KVM_RISCV_ISA_EXT_D:
+ case KVM_REG_RISCV_ISA_EXT | KVM_REG_RISCV_ISA_SINGLE | KVM_RISCV_ISA_EXT_F:
+ case KVM_REG_RISCV_ISA_EXT | KVM_REG_RISCV_ISA_SINGLE | KVM_RISCV_ISA_EXT_H:
+ case KVM_REG_RISCV_ISA_EXT | KVM_REG_RISCV_ISA_SINGLE | KVM_RISCV_ISA_EXT_I:
+ case KVM_REG_RISCV_ISA_EXT | KVM_REG_RISCV_ISA_SINGLE | KVM_RISCV_ISA_EXT_M:
+ case KVM_REG_RISCV_ISA_EXT | KVM_REG_RISCV_ISA_SINGLE | KVM_RISCV_ISA_EXT_V:
+ case KVM_REG_RISCV_ISA_EXT | KVM_REG_RISCV_ISA_SINGLE | KVM_RISCV_ISA_EXT_SMSTATEEN:
+ case KVM_REG_RISCV_ISA_EXT | KVM_REG_RISCV_ISA_SINGLE | KVM_RISCV_ISA_EXT_SSAIA:
+ case KVM_REG_RISCV_ISA_EXT | KVM_REG_RISCV_ISA_SINGLE | KVM_RISCV_ISA_EXT_SSTC:
+ case KVM_REG_RISCV_ISA_EXT | KVM_REG_RISCV_ISA_SINGLE | KVM_RISCV_ISA_EXT_SVINVAL:
+ case KVM_REG_RISCV_ISA_EXT | KVM_REG_RISCV_ISA_SINGLE | KVM_RISCV_ISA_EXT_SVNAPOT:
+ case KVM_REG_RISCV_ISA_EXT | KVM_REG_RISCV_ISA_SINGLE | KVM_RISCV_ISA_EXT_SVPBMT:
+ case KVM_REG_RISCV_ISA_EXT | KVM_REG_RISCV_ISA_SINGLE | KVM_RISCV_ISA_EXT_ZBA:
+ case KVM_REG_RISCV_ISA_EXT | KVM_REG_RISCV_ISA_SINGLE | KVM_RISCV_ISA_EXT_ZBB:
+ case KVM_REG_RISCV_ISA_EXT | KVM_REG_RISCV_ISA_SINGLE | KVM_RISCV_ISA_EXT_ZBS:
+ case KVM_REG_RISCV_ISA_EXT | KVM_REG_RISCV_ISA_SINGLE | KVM_RISCV_ISA_EXT_ZICBOM:
+ case KVM_REG_RISCV_ISA_EXT | KVM_REG_RISCV_ISA_SINGLE | KVM_RISCV_ISA_EXT_ZICBOZ:
+ case KVM_REG_RISCV_ISA_EXT | KVM_REG_RISCV_ISA_SINGLE | KVM_RISCV_ISA_EXT_ZICNTR:
+ case KVM_REG_RISCV_ISA_EXT | KVM_REG_RISCV_ISA_SINGLE | KVM_RISCV_ISA_EXT_ZICOND:
+ case KVM_REG_RISCV_ISA_EXT | KVM_REG_RISCV_ISA_SINGLE | KVM_RISCV_ISA_EXT_ZICSR:
+ case KVM_REG_RISCV_ISA_EXT | KVM_REG_RISCV_ISA_SINGLE | KVM_RISCV_ISA_EXT_ZIFENCEI:
+ case KVM_REG_RISCV_ISA_EXT | KVM_REG_RISCV_ISA_SINGLE | KVM_RISCV_ISA_EXT_ZIHINTPAUSE:
+ case KVM_REG_RISCV_ISA_EXT | KVM_REG_RISCV_ISA_SINGLE | KVM_RISCV_ISA_EXT_ZIHPM:
+ /*
+ * Like ISA_EXT registers, SBI_EXT registers are only visible when the
+ * host supports them and disabling them does not affect the visibility
+ * of the SBI_EXT register itself.
+ */
+ case KVM_REG_RISCV_SBI_EXT | KVM_REG_RISCV_SBI_SINGLE | KVM_RISCV_SBI_EXT_V01:
+ case KVM_REG_RISCV_SBI_EXT | KVM_REG_RISCV_SBI_SINGLE | KVM_RISCV_SBI_EXT_TIME:
+ case KVM_REG_RISCV_SBI_EXT | KVM_REG_RISCV_SBI_SINGLE | KVM_RISCV_SBI_EXT_IPI:
+ case KVM_REG_RISCV_SBI_EXT | KVM_REG_RISCV_SBI_SINGLE | KVM_RISCV_SBI_EXT_RFENCE:
+ case KVM_REG_RISCV_SBI_EXT | KVM_REG_RISCV_SBI_SINGLE | KVM_RISCV_SBI_EXT_SRST:
+ case KVM_REG_RISCV_SBI_EXT | KVM_REG_RISCV_SBI_SINGLE | KVM_RISCV_SBI_EXT_HSM:
+ case KVM_REG_RISCV_SBI_EXT | KVM_REG_RISCV_SBI_SINGLE | KVM_RISCV_SBI_EXT_PMU:
+ case KVM_REG_RISCV_SBI_EXT | KVM_REG_RISCV_SBI_SINGLE | KVM_RISCV_SBI_EXT_DBCN:
+ case KVM_REG_RISCV_SBI_EXT | KVM_REG_RISCV_SBI_SINGLE | KVM_RISCV_SBI_EXT_STA:
+ case KVM_REG_RISCV_SBI_EXT | KVM_REG_RISCV_SBI_SINGLE | KVM_RISCV_SBI_EXT_EXPERIMENTAL:
+ case KVM_REG_RISCV_SBI_EXT | KVM_REG_RISCV_SBI_SINGLE | KVM_RISCV_SBI_EXT_VENDOR:
return true;
/* AIA registers are always available when Ssaia can't be disabled */
case KVM_REG_RISCV_CSR | KVM_REG_RISCV_CSR_AIA | KVM_REG_RISCV_CSR_AIA_REG(siselect):
return err == EINVAL;
}
-static inline bool vcpu_has_ext(struct kvm_vcpu *vcpu, int ext)
+static bool vcpu_has_ext(struct kvm_vcpu *vcpu, uint64_t ext_id)
{
int ret;
unsigned long value;
- ret = __vcpu_get_reg(vcpu, RISCV_ISA_EXT_REG(ext), &value);
+ ret = __vcpu_get_reg(vcpu, ext_id, &value);
return (ret) ? false : !!value;
}
{
unsigned long isa_ext_state[KVM_RISCV_ISA_EXT_MAX] = { 0 };
struct vcpu_reg_sublist *s;
+ uint64_t feature;
int rc;
for (int i = 0; i < KVM_RISCV_ISA_EXT_MAX; i++)
isa_ext_cant_disable[i] = true;
}
+ for (int i = 0; i < KVM_RISCV_SBI_EXT_MAX; i++) {
+ rc = __vcpu_set_reg(vcpu, RISCV_SBI_EXT_REG(i), 0);
+ TEST_ASSERT(!rc || (rc == -1 && errno == ENOENT), "Unexpected error");
+ }
+
for_each_sublist(c, s) {
if (!s->feature)
continue;
+ switch (s->feature_type) {
+ case VCPU_FEATURE_ISA_EXT:
+ feature = RISCV_ISA_EXT_REG(s->feature);
+ break;
+ case VCPU_FEATURE_SBI_EXT:
+ feature = RISCV_SBI_EXT_REG(s->feature);
+ break;
+ default:
+ TEST_FAIL("Unknown feature type");
+ }
+
/* Try to enable the desired extension */
- __vcpu_set_reg(vcpu, RISCV_ISA_EXT_REG(s->feature), 1);
+ __vcpu_set_reg(vcpu, feature, 1);
/* Double check whether the desired extension was enabled */
- __TEST_REQUIRE(vcpu_has_ext(vcpu, s->feature),
+ __TEST_REQUIRE(vcpu_has_ext(vcpu, feature),
"%s not available, skipping tests\n", s->name);
}
}
}
#define KVM_ISA_EXT_ARR(ext) \
-[KVM_RISCV_ISA_EXT_##ext] = "KVM_RISCV_ISA_EXT_" #ext
+[KVM_RISCV_ISA_EXT_##ext] = "KVM_REG_RISCV_ISA_SINGLE | KVM_RISCV_ISA_EXT_" #ext
-static const char *isa_ext_id_to_str(const char *prefix, __u64 id)
+static const char *isa_ext_single_id_to_str(__u64 reg_off)
{
- /* reg_off is the offset into unsigned long kvm_isa_ext_arr[] */
- __u64 reg_off = id & ~(REG_MASK | KVM_REG_RISCV_ISA_EXT);
-
- assert((id & KVM_REG_RISCV_TYPE_MASK) == KVM_REG_RISCV_ISA_EXT);
-
static const char * const kvm_isa_ext_reg_name[] = {
KVM_ISA_EXT_ARR(A),
KVM_ISA_EXT_ARR(C),
};
if (reg_off >= ARRAY_SIZE(kvm_isa_ext_reg_name))
- return strdup_printf("%lld /* UNKNOWN */", reg_off);
+ return strdup_printf("KVM_REG_RISCV_ISA_SINGLE | %lld /* UNKNOWN */", reg_off);
return kvm_isa_ext_reg_name[reg_off];
}
+static const char *isa_ext_multi_id_to_str(__u64 reg_subtype, __u64 reg_off)
+{
+ const char *unknown = "";
+
+ if (reg_off > KVM_REG_RISCV_ISA_MULTI_REG_LAST)
+ unknown = " /* UNKNOWN */";
+
+ switch (reg_subtype) {
+ case KVM_REG_RISCV_ISA_MULTI_EN:
+ return strdup_printf("KVM_REG_RISCV_ISA_MULTI_EN | %lld%s", reg_off, unknown);
+ case KVM_REG_RISCV_ISA_MULTI_DIS:
+ return strdup_printf("KVM_REG_RISCV_ISA_MULTI_DIS | %lld%s", reg_off, unknown);
+ }
+
+ return strdup_printf("%lld | %lld /* UNKNOWN */", reg_subtype, reg_off);
+}
+
+static const char *isa_ext_id_to_str(const char *prefix, __u64 id)
+{
+ __u64 reg_off = id & ~(REG_MASK | KVM_REG_RISCV_ISA_EXT);
+ __u64 reg_subtype = reg_off & KVM_REG_RISCV_SUBTYPE_MASK;
+
+ assert((id & KVM_REG_RISCV_TYPE_MASK) == KVM_REG_RISCV_ISA_EXT);
+
+ reg_off &= ~KVM_REG_RISCV_SUBTYPE_MASK;
+
+ switch (reg_subtype) {
+ case KVM_REG_RISCV_ISA_SINGLE:
+ return isa_ext_single_id_to_str(reg_off);
+ case KVM_REG_RISCV_ISA_MULTI_EN:
+ case KVM_REG_RISCV_ISA_MULTI_DIS:
+ return isa_ext_multi_id_to_str(reg_subtype, reg_off);
+ }
+
+ return strdup_printf("%lld | %lld /* UNKNOWN */", reg_subtype, reg_off);
+}
+
#define KVM_SBI_EXT_ARR(ext) \
[ext] = "KVM_REG_RISCV_SBI_SINGLE | " #ext
KVM_SBI_EXT_ARR(KVM_RISCV_SBI_EXT_SRST),
KVM_SBI_EXT_ARR(KVM_RISCV_SBI_EXT_HSM),
KVM_SBI_EXT_ARR(KVM_RISCV_SBI_EXT_PMU),
+ KVM_SBI_EXT_ARR(KVM_RISCV_SBI_EXT_STA),
KVM_SBI_EXT_ARR(KVM_RISCV_SBI_EXT_EXPERIMENTAL),
KVM_SBI_EXT_ARR(KVM_RISCV_SBI_EXT_VENDOR),
KVM_SBI_EXT_ARR(KVM_RISCV_SBI_EXT_DBCN),
return strdup_printf("%lld | %lld /* UNKNOWN */", reg_subtype, reg_off);
}
+static const char *sbi_sta_id_to_str(__u64 reg_off)
+{
+ switch (reg_off) {
+ case 0: return "KVM_REG_RISCV_SBI_STA | KVM_REG_RISCV_SBI_STA_REG(shmem_lo)";
+ case 1: return "KVM_REG_RISCV_SBI_STA | KVM_REG_RISCV_SBI_STA_REG(shmem_hi)";
+ }
+ return strdup_printf("KVM_REG_RISCV_SBI_STA | %lld /* UNKNOWN */", reg_off);
+}
+
+static const char *sbi_id_to_str(const char *prefix, __u64 id)
+{
+ __u64 reg_off = id & ~(REG_MASK | KVM_REG_RISCV_SBI_STATE);
+ __u64 reg_subtype = reg_off & KVM_REG_RISCV_SUBTYPE_MASK;
+
+ assert((id & KVM_REG_RISCV_TYPE_MASK) == KVM_REG_RISCV_SBI_STATE);
+
+ reg_off &= ~KVM_REG_RISCV_SUBTYPE_MASK;
+
+ switch (reg_subtype) {
+ case KVM_REG_RISCV_SBI_STA:
+ return sbi_sta_id_to_str(reg_off);
+ }
+
+ return strdup_printf("%lld | %lld /* UNKNOWN */", reg_subtype, reg_off);
+}
+
void print_reg(const char *prefix, __u64 id)
{
const char *reg_size = NULL;
reg_size = "KVM_REG_SIZE_U128";
break;
default:
- printf("\tKVM_REG_RISCV | (%lld << KVM_REG_SIZE_SHIFT) | 0x%llx /* UNKNOWN */,",
- (id & KVM_REG_SIZE_MASK) >> KVM_REG_SIZE_SHIFT, id & REG_MASK);
+ printf("\tKVM_REG_RISCV | (%lld << KVM_REG_SIZE_SHIFT) | 0x%llx /* UNKNOWN */,\n",
+ (id & KVM_REG_SIZE_MASK) >> KVM_REG_SIZE_SHIFT, id & ~REG_MASK);
+ return;
}
switch (id & KVM_REG_RISCV_TYPE_MASK) {
printf("\tKVM_REG_RISCV | %s | KVM_REG_RISCV_SBI_EXT | %s,\n",
reg_size, sbi_ext_id_to_str(prefix, id));
break;
+ case KVM_REG_RISCV_SBI_STATE:
+ printf("\tKVM_REG_RISCV | %s | KVM_REG_RISCV_SBI_STATE | %s,\n",
+ reg_size, sbi_id_to_str(prefix, id));
+ break;
default:
- printf("\tKVM_REG_RISCV | %s | 0x%llx /* UNKNOWN */,",
- reg_size, id & REG_MASK);
+ printf("\tKVM_REG_RISCV | %s | 0x%llx /* UNKNOWN */,\n",
+ reg_size, id & ~REG_MASK);
+ return;
}
}
KVM_REG_RISCV | KVM_REG_SIZE_U64 | KVM_REG_RISCV_TIMER | KVM_REG_RISCV_TIMER_REG(time),
KVM_REG_RISCV | KVM_REG_SIZE_U64 | KVM_REG_RISCV_TIMER | KVM_REG_RISCV_TIMER_REG(compare),
KVM_REG_RISCV | KVM_REG_SIZE_U64 | KVM_REG_RISCV_TIMER | KVM_REG_RISCV_TIMER_REG(state),
- KVM_REG_RISCV | KVM_REG_SIZE_ULONG | KVM_REG_RISCV_SBI_EXT | KVM_REG_RISCV_SBI_SINGLE | KVM_RISCV_SBI_EXT_V01,
- KVM_REG_RISCV | KVM_REG_SIZE_ULONG | KVM_REG_RISCV_SBI_EXT | KVM_REG_RISCV_SBI_SINGLE | KVM_RISCV_SBI_EXT_TIME,
- KVM_REG_RISCV | KVM_REG_SIZE_ULONG | KVM_REG_RISCV_SBI_EXT | KVM_REG_RISCV_SBI_SINGLE | KVM_RISCV_SBI_EXT_IPI,
- KVM_REG_RISCV | KVM_REG_SIZE_ULONG | KVM_REG_RISCV_SBI_EXT | KVM_REG_RISCV_SBI_SINGLE | KVM_RISCV_SBI_EXT_RFENCE,
- KVM_REG_RISCV | KVM_REG_SIZE_ULONG | KVM_REG_RISCV_SBI_EXT | KVM_REG_RISCV_SBI_SINGLE | KVM_RISCV_SBI_EXT_SRST,
- KVM_REG_RISCV | KVM_REG_SIZE_ULONG | KVM_REG_RISCV_SBI_EXT | KVM_REG_RISCV_SBI_SINGLE | KVM_RISCV_SBI_EXT_HSM,
- KVM_REG_RISCV | KVM_REG_SIZE_ULONG | KVM_REG_RISCV_SBI_EXT | KVM_REG_RISCV_SBI_SINGLE | KVM_RISCV_SBI_EXT_PMU,
- KVM_REG_RISCV | KVM_REG_SIZE_ULONG | KVM_REG_RISCV_SBI_EXT | KVM_REG_RISCV_SBI_SINGLE | KVM_RISCV_SBI_EXT_EXPERIMENTAL,
- KVM_REG_RISCV | KVM_REG_SIZE_ULONG | KVM_REG_RISCV_SBI_EXT | KVM_REG_RISCV_SBI_SINGLE | KVM_RISCV_SBI_EXT_VENDOR,
- KVM_REG_RISCV | KVM_REG_SIZE_ULONG | KVM_REG_RISCV_SBI_EXT | KVM_REG_RISCV_SBI_SINGLE | KVM_RISCV_SBI_EXT_DBCN,
- KVM_REG_RISCV | KVM_REG_SIZE_ULONG | KVM_REG_RISCV_SBI_EXT | KVM_REG_RISCV_SBI_MULTI_EN | 0,
- KVM_REG_RISCV | KVM_REG_SIZE_ULONG | KVM_REG_RISCV_SBI_EXT | KVM_REG_RISCV_SBI_MULTI_DIS | 0,
};
/*
KVM_REG_RISCV | KVM_REG_SIZE_U64 | KVM_REG_RISCV_TIMER | KVM_REG_RISCV_TIMER_REG(state),
};
-static __u64 h_regs[] = {
- KVM_REG_RISCV | KVM_REG_SIZE_ULONG | KVM_REG_RISCV_ISA_EXT | KVM_RISCV_ISA_EXT_H,
+static __u64 sbi_base_regs[] = {
+ KVM_REG_RISCV | KVM_REG_SIZE_ULONG | KVM_REG_RISCV_SBI_EXT | KVM_REG_RISCV_SBI_SINGLE | KVM_RISCV_SBI_EXT_V01,
+ KVM_REG_RISCV | KVM_REG_SIZE_ULONG | KVM_REG_RISCV_SBI_EXT | KVM_REG_RISCV_SBI_SINGLE | KVM_RISCV_SBI_EXT_TIME,
+ KVM_REG_RISCV | KVM_REG_SIZE_ULONG | KVM_REG_RISCV_SBI_EXT | KVM_REG_RISCV_SBI_SINGLE | KVM_RISCV_SBI_EXT_IPI,
+ KVM_REG_RISCV | KVM_REG_SIZE_ULONG | KVM_REG_RISCV_SBI_EXT | KVM_REG_RISCV_SBI_SINGLE | KVM_RISCV_SBI_EXT_RFENCE,
+ KVM_REG_RISCV | KVM_REG_SIZE_ULONG | KVM_REG_RISCV_SBI_EXT | KVM_REG_RISCV_SBI_SINGLE | KVM_RISCV_SBI_EXT_SRST,
+ KVM_REG_RISCV | KVM_REG_SIZE_ULONG | KVM_REG_RISCV_SBI_EXT | KVM_REG_RISCV_SBI_SINGLE | KVM_RISCV_SBI_EXT_HSM,
+ KVM_REG_RISCV | KVM_REG_SIZE_ULONG | KVM_REG_RISCV_SBI_EXT | KVM_REG_RISCV_SBI_SINGLE | KVM_RISCV_SBI_EXT_EXPERIMENTAL,
+ KVM_REG_RISCV | KVM_REG_SIZE_ULONG | KVM_REG_RISCV_SBI_EXT | KVM_REG_RISCV_SBI_SINGLE | KVM_RISCV_SBI_EXT_VENDOR,
+};
+
+static __u64 sbi_sta_regs[] = {
+ KVM_REG_RISCV | KVM_REG_SIZE_ULONG | KVM_REG_RISCV_SBI_EXT | KVM_REG_RISCV_SBI_SINGLE | KVM_RISCV_SBI_EXT_STA,
+ KVM_REG_RISCV | KVM_REG_SIZE_ULONG | KVM_REG_RISCV_SBI_STATE | KVM_REG_RISCV_SBI_STA | KVM_REG_RISCV_SBI_STA_REG(shmem_lo),
+ KVM_REG_RISCV | KVM_REG_SIZE_ULONG | KVM_REG_RISCV_SBI_STATE | KVM_REG_RISCV_SBI_STA | KVM_REG_RISCV_SBI_STA_REG(shmem_hi),
};
static __u64 zicbom_regs[] = {
KVM_REG_RISCV | KVM_REG_SIZE_ULONG | KVM_REG_RISCV_CONFIG | KVM_REG_RISCV_CONFIG_REG(zicbom_block_size),
- KVM_REG_RISCV | KVM_REG_SIZE_ULONG | KVM_REG_RISCV_ISA_EXT | KVM_RISCV_ISA_EXT_ZICBOM,
+ KVM_REG_RISCV | KVM_REG_SIZE_ULONG | KVM_REG_RISCV_ISA_EXT | KVM_REG_RISCV_ISA_SINGLE | KVM_RISCV_ISA_EXT_ZICBOM,
};
static __u64 zicboz_regs[] = {
KVM_REG_RISCV | KVM_REG_SIZE_ULONG | KVM_REG_RISCV_CONFIG | KVM_REG_RISCV_CONFIG_REG(zicboz_block_size),
- KVM_REG_RISCV | KVM_REG_SIZE_ULONG | KVM_REG_RISCV_ISA_EXT | KVM_RISCV_ISA_EXT_ZICBOZ,
-};
-
-static __u64 svpbmt_regs[] = {
- KVM_REG_RISCV | KVM_REG_SIZE_ULONG | KVM_REG_RISCV_ISA_EXT | KVM_RISCV_ISA_EXT_SVPBMT,
-};
-
-static __u64 sstc_regs[] = {
- KVM_REG_RISCV | KVM_REG_SIZE_ULONG | KVM_REG_RISCV_ISA_EXT | KVM_RISCV_ISA_EXT_SSTC,
-};
-
-static __u64 svinval_regs[] = {
- KVM_REG_RISCV | KVM_REG_SIZE_ULONG | KVM_REG_RISCV_ISA_EXT | KVM_RISCV_ISA_EXT_SVINVAL,
-};
-
-static __u64 zihintpause_regs[] = {
- KVM_REG_RISCV | KVM_REG_SIZE_ULONG | KVM_REG_RISCV_ISA_EXT | KVM_RISCV_ISA_EXT_ZIHINTPAUSE,
-};
-
-static __u64 zba_regs[] = {
- KVM_REG_RISCV | KVM_REG_SIZE_ULONG | KVM_REG_RISCV_ISA_EXT | KVM_RISCV_ISA_EXT_ZBA,
-};
-
-static __u64 zbb_regs[] = {
- KVM_REG_RISCV | KVM_REG_SIZE_ULONG | KVM_REG_RISCV_ISA_EXT | KVM_RISCV_ISA_EXT_ZBB,
-};
-
-static __u64 zbs_regs[] = {
- KVM_REG_RISCV | KVM_REG_SIZE_ULONG | KVM_REG_RISCV_ISA_EXT | KVM_RISCV_ISA_EXT_ZBS,
-};
-
-static __u64 zicntr_regs[] = {
- KVM_REG_RISCV | KVM_REG_SIZE_ULONG | KVM_REG_RISCV_ISA_EXT | KVM_RISCV_ISA_EXT_ZICNTR,
-};
-
-static __u64 zicond_regs[] = {
- KVM_REG_RISCV | KVM_REG_SIZE_ULONG | KVM_REG_RISCV_ISA_EXT | KVM_RISCV_ISA_EXT_ZICOND,
-};
-
-static __u64 zicsr_regs[] = {
- KVM_REG_RISCV | KVM_REG_SIZE_ULONG | KVM_REG_RISCV_ISA_EXT | KVM_RISCV_ISA_EXT_ZICSR,
-};
-
-static __u64 zifencei_regs[] = {
- KVM_REG_RISCV | KVM_REG_SIZE_ULONG | KVM_REG_RISCV_ISA_EXT | KVM_RISCV_ISA_EXT_ZIFENCEI,
-};
-
-static __u64 zihpm_regs[] = {
- KVM_REG_RISCV | KVM_REG_SIZE_ULONG | KVM_REG_RISCV_ISA_EXT | KVM_RISCV_ISA_EXT_ZIHPM,
+ KVM_REG_RISCV | KVM_REG_SIZE_ULONG | KVM_REG_RISCV_ISA_EXT | KVM_REG_RISCV_ISA_SINGLE | KVM_RISCV_ISA_EXT_ZICBOZ,
};
static __u64 aia_regs[] = {
KVM_REG_RISCV | KVM_REG_SIZE_ULONG | KVM_REG_RISCV_CSR | KVM_REG_RISCV_CSR_AIA | KVM_REG_RISCV_CSR_AIA_REG(siph),
KVM_REG_RISCV | KVM_REG_SIZE_ULONG | KVM_REG_RISCV_CSR | KVM_REG_RISCV_CSR_AIA | KVM_REG_RISCV_CSR_AIA_REG(iprio1h),
KVM_REG_RISCV | KVM_REG_SIZE_ULONG | KVM_REG_RISCV_CSR | KVM_REG_RISCV_CSR_AIA | KVM_REG_RISCV_CSR_AIA_REG(iprio2h),
- KVM_REG_RISCV | KVM_REG_SIZE_ULONG | KVM_REG_RISCV_ISA_EXT | KVM_RISCV_ISA_EXT_SSAIA,
+ KVM_REG_RISCV | KVM_REG_SIZE_ULONG | KVM_REG_RISCV_ISA_EXT | KVM_REG_RISCV_ISA_SINGLE | KVM_RISCV_ISA_EXT_SSAIA,
};
static __u64 smstateen_regs[] = {
KVM_REG_RISCV | KVM_REG_SIZE_ULONG | KVM_REG_RISCV_CSR | KVM_REG_RISCV_CSR_SMSTATEEN | KVM_REG_RISCV_CSR_SMSTATEEN_REG(sstateen0),
- KVM_REG_RISCV | KVM_REG_SIZE_ULONG | KVM_REG_RISCV_ISA_EXT | KVM_RISCV_ISA_EXT_SMSTATEEN,
+ KVM_REG_RISCV | KVM_REG_SIZE_ULONG | KVM_REG_RISCV_ISA_EXT | KVM_REG_RISCV_ISA_SINGLE | KVM_RISCV_ISA_EXT_SMSTATEEN,
};
static __u64 fp_f_regs[] = {
KVM_REG_RISCV | KVM_REG_SIZE_U32 | KVM_REG_RISCV_FP_F | KVM_REG_RISCV_FP_F_REG(f[30]),
KVM_REG_RISCV | KVM_REG_SIZE_U32 | KVM_REG_RISCV_FP_F | KVM_REG_RISCV_FP_F_REG(f[31]),
KVM_REG_RISCV | KVM_REG_SIZE_U32 | KVM_REG_RISCV_FP_F | KVM_REG_RISCV_FP_F_REG(fcsr),
- KVM_REG_RISCV | KVM_REG_SIZE_ULONG | KVM_REG_RISCV_ISA_EXT | KVM_RISCV_ISA_EXT_F,
+ KVM_REG_RISCV | KVM_REG_SIZE_ULONG | KVM_REG_RISCV_ISA_EXT | KVM_REG_RISCV_ISA_SINGLE | KVM_RISCV_ISA_EXT_F,
};
static __u64 fp_d_regs[] = {
KVM_REG_RISCV | KVM_REG_SIZE_U64 | KVM_REG_RISCV_FP_D | KVM_REG_RISCV_FP_D_REG(f[30]),
KVM_REG_RISCV | KVM_REG_SIZE_U64 | KVM_REG_RISCV_FP_D | KVM_REG_RISCV_FP_D_REG(f[31]),
KVM_REG_RISCV | KVM_REG_SIZE_U32 | KVM_REG_RISCV_FP_D | KVM_REG_RISCV_FP_D_REG(fcsr),
- KVM_REG_RISCV | KVM_REG_SIZE_ULONG | KVM_REG_RISCV_ISA_EXT | KVM_RISCV_ISA_EXT_D,
+ KVM_REG_RISCV | KVM_REG_SIZE_ULONG | KVM_REG_RISCV_ISA_EXT | KVM_REG_RISCV_ISA_SINGLE | KVM_RISCV_ISA_EXT_D,
};
-#define BASE_SUBLIST \
+#define SUBLIST_BASE \
{"base", .regs = base_regs, .regs_n = ARRAY_SIZE(base_regs), \
.skips_set = base_skips_set, .skips_set_n = ARRAY_SIZE(base_skips_set),}
-#define H_REGS_SUBLIST \
- {"h", .feature = KVM_RISCV_ISA_EXT_H, .regs = h_regs, .regs_n = ARRAY_SIZE(h_regs),}
-#define ZICBOM_REGS_SUBLIST \
+#define SUBLIST_SBI_BASE \
+ {"sbi-base", .feature_type = VCPU_FEATURE_SBI_EXT, .feature = KVM_RISCV_SBI_EXT_V01, \
+ .regs = sbi_base_regs, .regs_n = ARRAY_SIZE(sbi_base_regs),}
+#define SUBLIST_SBI_STA \
+ {"sbi-sta", .feature_type = VCPU_FEATURE_SBI_EXT, .feature = KVM_RISCV_SBI_EXT_STA, \
+ .regs = sbi_sta_regs, .regs_n = ARRAY_SIZE(sbi_sta_regs),}
+#define SUBLIST_ZICBOM \
{"zicbom", .feature = KVM_RISCV_ISA_EXT_ZICBOM, .regs = zicbom_regs, .regs_n = ARRAY_SIZE(zicbom_regs),}
-#define ZICBOZ_REGS_SUBLIST \
+#define SUBLIST_ZICBOZ \
{"zicboz", .feature = KVM_RISCV_ISA_EXT_ZICBOZ, .regs = zicboz_regs, .regs_n = ARRAY_SIZE(zicboz_regs),}
-#define SVPBMT_REGS_SUBLIST \
- {"svpbmt", .feature = KVM_RISCV_ISA_EXT_SVPBMT, .regs = svpbmt_regs, .regs_n = ARRAY_SIZE(svpbmt_regs),}
-#define SSTC_REGS_SUBLIST \
- {"sstc", .feature = KVM_RISCV_ISA_EXT_SSTC, .regs = sstc_regs, .regs_n = ARRAY_SIZE(sstc_regs),}
-#define SVINVAL_REGS_SUBLIST \
- {"svinval", .feature = KVM_RISCV_ISA_EXT_SVINVAL, .regs = svinval_regs, .regs_n = ARRAY_SIZE(svinval_regs),}
-#define ZIHINTPAUSE_REGS_SUBLIST \
- {"zihintpause", .feature = KVM_RISCV_ISA_EXT_ZIHINTPAUSE, .regs = zihintpause_regs, .regs_n = ARRAY_SIZE(zihintpause_regs),}
-#define ZBA_REGS_SUBLIST \
- {"zba", .feature = KVM_RISCV_ISA_EXT_ZBA, .regs = zba_regs, .regs_n = ARRAY_SIZE(zba_regs),}
-#define ZBB_REGS_SUBLIST \
- {"zbb", .feature = KVM_RISCV_ISA_EXT_ZBB, .regs = zbb_regs, .regs_n = ARRAY_SIZE(zbb_regs),}
-#define ZBS_REGS_SUBLIST \
- {"zbs", .feature = KVM_RISCV_ISA_EXT_ZBS, .regs = zbs_regs, .regs_n = ARRAY_SIZE(zbs_regs),}
-#define ZICNTR_REGS_SUBLIST \
- {"zicntr", .feature = KVM_RISCV_ISA_EXT_ZICNTR, .regs = zicntr_regs, .regs_n = ARRAY_SIZE(zicntr_regs),}
-#define ZICOND_REGS_SUBLIST \
- {"zicond", .feature = KVM_RISCV_ISA_EXT_ZICOND, .regs = zicond_regs, .regs_n = ARRAY_SIZE(zicond_regs),}
-#define ZICSR_REGS_SUBLIST \
- {"zicsr", .feature = KVM_RISCV_ISA_EXT_ZICSR, .regs = zicsr_regs, .regs_n = ARRAY_SIZE(zicsr_regs),}
-#define ZIFENCEI_REGS_SUBLIST \
- {"zifencei", .feature = KVM_RISCV_ISA_EXT_ZIFENCEI, .regs = zifencei_regs, .regs_n = ARRAY_SIZE(zifencei_regs),}
-#define ZIHPM_REGS_SUBLIST \
- {"zihpm", .feature = KVM_RISCV_ISA_EXT_ZIHPM, .regs = zihpm_regs, .regs_n = ARRAY_SIZE(zihpm_regs),}
-#define AIA_REGS_SUBLIST \
+#define SUBLIST_AIA \
{"aia", .feature = KVM_RISCV_ISA_EXT_SSAIA, .regs = aia_regs, .regs_n = ARRAY_SIZE(aia_regs),}
-#define SMSTATEEN_REGS_SUBLIST \
+#define SUBLIST_SMSTATEEN \
{"smstateen", .feature = KVM_RISCV_ISA_EXT_SMSTATEEN, .regs = smstateen_regs, .regs_n = ARRAY_SIZE(smstateen_regs),}
-#define FP_F_REGS_SUBLIST \
+#define SUBLIST_FP_F \
{"fp_f", .feature = KVM_RISCV_ISA_EXT_F, .regs = fp_f_regs, \
.regs_n = ARRAY_SIZE(fp_f_regs),}
-#define FP_D_REGS_SUBLIST \
+#define SUBLIST_FP_D \
{"fp_d", .feature = KVM_RISCV_ISA_EXT_D, .regs = fp_d_regs, \
.regs_n = ARRAY_SIZE(fp_d_regs),}
-static struct vcpu_reg_list h_config = {
- .sublists = {
- BASE_SUBLIST,
- H_REGS_SUBLIST,
- {0},
- },
-};
-
-static struct vcpu_reg_list zicbom_config = {
- .sublists = {
- BASE_SUBLIST,
- ZICBOM_REGS_SUBLIST,
- {0},
- },
-};
-
-static struct vcpu_reg_list zicboz_config = {
- .sublists = {
- BASE_SUBLIST,
- ZICBOZ_REGS_SUBLIST,
- {0},
- },
-};
-
-static struct vcpu_reg_list svpbmt_config = {
- .sublists = {
- BASE_SUBLIST,
- SVPBMT_REGS_SUBLIST,
- {0},
- },
-};
-
-static struct vcpu_reg_list sstc_config = {
- .sublists = {
- BASE_SUBLIST,
- SSTC_REGS_SUBLIST,
- {0},
- },
-};
-
-static struct vcpu_reg_list svinval_config = {
- .sublists = {
- BASE_SUBLIST,
- SVINVAL_REGS_SUBLIST,
- {0},
- },
-};
-
-static struct vcpu_reg_list zihintpause_config = {
- .sublists = {
- BASE_SUBLIST,
- ZIHINTPAUSE_REGS_SUBLIST,
- {0},
- },
-};
-
-static struct vcpu_reg_list zba_config = {
- .sublists = {
- BASE_SUBLIST,
- ZBA_REGS_SUBLIST,
- {0},
- },
-};
-
-static struct vcpu_reg_list zbb_config = {
- .sublists = {
- BASE_SUBLIST,
- ZBB_REGS_SUBLIST,
- {0},
- },
-};
-
-static struct vcpu_reg_list zbs_config = {
- .sublists = {
- BASE_SUBLIST,
- ZBS_REGS_SUBLIST,
- {0},
- },
-};
-
-static struct vcpu_reg_list zicntr_config = {
- .sublists = {
- BASE_SUBLIST,
- ZICNTR_REGS_SUBLIST,
- {0},
- },
-};
-
-static struct vcpu_reg_list zicond_config = {
- .sublists = {
- BASE_SUBLIST,
- ZICOND_REGS_SUBLIST,
- {0},
- },
-};
-
-static struct vcpu_reg_list zicsr_config = {
- .sublists = {
- BASE_SUBLIST,
- ZICSR_REGS_SUBLIST,
- {0},
- },
-};
-
-static struct vcpu_reg_list zifencei_config = {
- .sublists = {
- BASE_SUBLIST,
- ZIFENCEI_REGS_SUBLIST,
- {0},
- },
-};
-
-static struct vcpu_reg_list zihpm_config = {
- .sublists = {
- BASE_SUBLIST,
- ZIHPM_REGS_SUBLIST,
- {0},
- },
-};
-
-static struct vcpu_reg_list aia_config = {
- .sublists = {
- BASE_SUBLIST,
- AIA_REGS_SUBLIST,
- {0},
- },
-};
-
-static struct vcpu_reg_list smstateen_config = {
- .sublists = {
- BASE_SUBLIST,
- SMSTATEEN_REGS_SUBLIST,
- {0},
- },
-};
-
-static struct vcpu_reg_list fp_f_config = {
- .sublists = {
- BASE_SUBLIST,
- FP_F_REGS_SUBLIST,
- {0},
- },
-};
-
-static struct vcpu_reg_list fp_d_config = {
- .sublists = {
- BASE_SUBLIST,
- FP_D_REGS_SUBLIST,
- {0},
- },
-};
+#define KVM_ISA_EXT_SIMPLE_CONFIG(ext, extu) \
+static __u64 regs_##ext[] = { \
+ KVM_REG_RISCV | KVM_REG_SIZE_ULONG | \
+ KVM_REG_RISCV_ISA_EXT | KVM_REG_RISCV_ISA_SINGLE | \
+ KVM_RISCV_ISA_EXT_##extu, \
+}; \
+static struct vcpu_reg_list config_##ext = { \
+ .sublists = { \
+ SUBLIST_BASE, \
+ { \
+ .name = #ext, \
+ .feature = KVM_RISCV_ISA_EXT_##extu, \
+ .regs = regs_##ext, \
+ .regs_n = ARRAY_SIZE(regs_##ext), \
+ }, \
+ {0}, \
+ }, \
+} \
+
+#define KVM_SBI_EXT_SIMPLE_CONFIG(ext, extu) \
+static __u64 regs_sbi_##ext[] = { \
+ KVM_REG_RISCV | KVM_REG_SIZE_ULONG | \
+ KVM_REG_RISCV_SBI_EXT | KVM_REG_RISCV_SBI_SINGLE | \
+ KVM_RISCV_SBI_EXT_##extu, \
+}; \
+static struct vcpu_reg_list config_sbi_##ext = { \
+ .sublists = { \
+ SUBLIST_BASE, \
+ { \
+ .name = "sbi-"#ext, \
+ .feature_type = VCPU_FEATURE_SBI_EXT, \
+ .feature = KVM_RISCV_SBI_EXT_##extu, \
+ .regs = regs_sbi_##ext, \
+ .regs_n = ARRAY_SIZE(regs_sbi_##ext), \
+ }, \
+ {0}, \
+ }, \
+} \
+
+#define KVM_ISA_EXT_SUBLIST_CONFIG(ext, extu) \
+static struct vcpu_reg_list config_##ext = { \
+ .sublists = { \
+ SUBLIST_BASE, \
+ SUBLIST_##extu, \
+ {0}, \
+ }, \
+} \
+
+#define KVM_SBI_EXT_SUBLIST_CONFIG(ext, extu) \
+static struct vcpu_reg_list config_sbi_##ext = { \
+ .sublists = { \
+ SUBLIST_BASE, \
+ SUBLIST_SBI_##extu, \
+ {0}, \
+ }, \
+} \
+
+/* Note: The below list is alphabetically sorted. */
+
+KVM_SBI_EXT_SUBLIST_CONFIG(base, BASE);
+KVM_SBI_EXT_SUBLIST_CONFIG(sta, STA);
+KVM_SBI_EXT_SIMPLE_CONFIG(pmu, PMU);
+KVM_SBI_EXT_SIMPLE_CONFIG(dbcn, DBCN);
+
+KVM_ISA_EXT_SUBLIST_CONFIG(aia, AIA);
+KVM_ISA_EXT_SUBLIST_CONFIG(fp_f, FP_F);
+KVM_ISA_EXT_SUBLIST_CONFIG(fp_d, FP_D);
+KVM_ISA_EXT_SIMPLE_CONFIG(h, H);
+KVM_ISA_EXT_SUBLIST_CONFIG(smstateen, SMSTATEEN);
+KVM_ISA_EXT_SIMPLE_CONFIG(sstc, SSTC);
+KVM_ISA_EXT_SIMPLE_CONFIG(svinval, SVINVAL);
+KVM_ISA_EXT_SIMPLE_CONFIG(svnapot, SVNAPOT);
+KVM_ISA_EXT_SIMPLE_CONFIG(svpbmt, SVPBMT);
+KVM_ISA_EXT_SIMPLE_CONFIG(zba, ZBA);
+KVM_ISA_EXT_SIMPLE_CONFIG(zbb, ZBB);
+KVM_ISA_EXT_SIMPLE_CONFIG(zbs, ZBS);
+KVM_ISA_EXT_SUBLIST_CONFIG(zicbom, ZICBOM);
+KVM_ISA_EXT_SUBLIST_CONFIG(zicboz, ZICBOZ);
+KVM_ISA_EXT_SIMPLE_CONFIG(zicntr, ZICNTR);
+KVM_ISA_EXT_SIMPLE_CONFIG(zicond, ZICOND);
+KVM_ISA_EXT_SIMPLE_CONFIG(zicsr, ZICSR);
+KVM_ISA_EXT_SIMPLE_CONFIG(zifencei, ZIFENCEI);
+KVM_ISA_EXT_SIMPLE_CONFIG(zihintpause, ZIHINTPAUSE);
+KVM_ISA_EXT_SIMPLE_CONFIG(zihpm, ZIHPM);
struct vcpu_reg_list *vcpu_configs[] = {
- &h_config,
- &zicbom_config,
- &zicboz_config,
- &svpbmt_config,
- &sstc_config,
- &svinval_config,
- &zihintpause_config,
- &zba_config,
- &zbb_config,
- &zbs_config,
- &zicntr_config,
- &zicond_config,
- &zicsr_config,
- &zifencei_config,
- &zihpm_config,
- &aia_config,
- &smstateen_config,
- &fp_f_config,
- &fp_d_config,
+ &config_sbi_base,
+ &config_sbi_sta,
+ &config_sbi_pmu,
+ &config_sbi_dbcn,
+ &config_aia,
+ &config_fp_f,
+ &config_fp_d,
+ &config_h,
+ &config_smstateen,
+ &config_sstc,
+ &config_svinval,
+ &config_svnapot,
+ &config_svpbmt,
+ &config_zba,
+ &config_zbb,
+ &config_zbs,
+ &config_zicbom,
+ &config_zicboz,
+ &config_zicntr,
+ &config_zicond,
+ &config_zicsr,
+ &config_zifencei,
+ &config_zihintpause,
+ &config_zihpm,
};
int vcpu_configs_n = ARRAY_SIZE(vcpu_configs);
*/
val = guest_spin_on_val(0);
__GUEST_ASSERT(val == 1 || val == MMIO_VAL,
- "Expected '1' or MMIO ('%llx'), got '%llx'", MMIO_VAL, val);
+ "Expected '1' or MMIO ('%lx'), got '%lx'", MMIO_VAL, val);
/* Spin until the misaligning memory region move completes. */
val = guest_spin_on_val(MMIO_VAL);
__GUEST_ASSERT(val == 1 || val == 0,
- "Expected '0' or '1' (no MMIO), got '%llx'", val);
+ "Expected '0' or '1' (no MMIO), got '%lx'", val);
/* Spin until the memory region starts to get re-aligned. */
val = guest_spin_on_val(0);
__GUEST_ASSERT(val == 1 || val == MMIO_VAL,
- "Expected '1' or MMIO ('%llx'), got '%llx'", MMIO_VAL, val);
+ "Expected '1' or MMIO ('%lx'), got '%lx'", MMIO_VAL, val);
/* Spin until the re-aligning memory region move completes. */
val = guest_spin_on_val(MMIO_VAL);
struct kvm_vm *vm;
int r, i;
-#ifdef __x86_64__
+#if defined __aarch64__ || defined __x86_64__
supported_flags |= KVM_MEM_READONLY;
+#endif
+#ifdef __x86_64__
if (kvm_check_cap(KVM_CAP_VM_TYPES) & BIT(KVM_X86_SW_PROTECTED_VM))
vm = vm_create_barebones_protected_vm();
else
#include <pthread.h>
#include <linux/kernel.h>
#include <asm/kvm.h>
+#ifndef __riscv
#include <asm/kvm_para.h>
+#endif
#include "test_util.h"
#include "kvm_util.h"
pr_info(" st_time: %ld\n", st->st_time);
}
+#elif defined(__riscv)
+
+/* SBI STA shmem must have 64-byte alignment */
+#define STEAL_TIME_SIZE ((sizeof(struct sta_struct) + 63) & ~63)
+
+static vm_paddr_t st_gpa[NR_VCPUS];
+
+struct sta_struct {
+ uint32_t sequence;
+ uint32_t flags;
+ uint64_t steal;
+ uint8_t preempted;
+ uint8_t pad[47];
+} __packed;
+
+static void sta_set_shmem(vm_paddr_t gpa, unsigned long flags)
+{
+ unsigned long lo = (unsigned long)gpa;
+#if __riscv_xlen == 32
+ unsigned long hi = (unsigned long)(gpa >> 32);
+#else
+ unsigned long hi = gpa == -1 ? -1 : 0;
+#endif
+ struct sbiret ret = sbi_ecall(SBI_EXT_STA, 0, lo, hi, flags, 0, 0, 0);
+
+ GUEST_ASSERT(ret.value == 0 && ret.error == 0);
+}
+
+static void check_status(struct sta_struct *st)
+{
+ GUEST_ASSERT(!(READ_ONCE(st->sequence) & 1));
+ GUEST_ASSERT(READ_ONCE(st->flags) == 0);
+ GUEST_ASSERT(READ_ONCE(st->preempted) == 0);
+}
+
+static void guest_code(int cpu)
+{
+ struct sta_struct *st = st_gva[cpu];
+ uint32_t sequence;
+ long out_val = 0;
+ bool probe;
+
+ probe = guest_sbi_probe_extension(SBI_EXT_STA, &out_val);
+ GUEST_ASSERT(probe && out_val == 1);
+
+ sta_set_shmem(st_gpa[cpu], 0);
+ GUEST_SYNC(0);
+
+ check_status(st);
+ WRITE_ONCE(guest_stolen_time[cpu], st->steal);
+ sequence = READ_ONCE(st->sequence);
+ check_status(st);
+ GUEST_SYNC(1);
+
+ check_status(st);
+ GUEST_ASSERT(sequence < READ_ONCE(st->sequence));
+ WRITE_ONCE(guest_stolen_time[cpu], st->steal);
+ check_status(st);
+ GUEST_DONE();
+}
+
+static bool is_steal_time_supported(struct kvm_vcpu *vcpu)
+{
+ uint64_t id = RISCV_SBI_EXT_REG(KVM_RISCV_SBI_EXT_STA);
+ unsigned long enabled;
+
+ vcpu_get_reg(vcpu, id, &enabled);
+ TEST_ASSERT(enabled == 0 || enabled == 1, "Expected boolean result");
+
+ return enabled;
+}
+
+static void steal_time_init(struct kvm_vcpu *vcpu, uint32_t i)
+{
+ /* ST_GPA_BASE is identity mapped */
+ st_gva[i] = (void *)(ST_GPA_BASE + i * STEAL_TIME_SIZE);
+ st_gpa[i] = addr_gva2gpa(vcpu->vm, (vm_vaddr_t)st_gva[i]);
+ sync_global_to_guest(vcpu->vm, st_gva[i]);
+ sync_global_to_guest(vcpu->vm, st_gpa[i]);
+}
+
+static void steal_time_dump(struct kvm_vm *vm, uint32_t vcpu_idx)
+{
+ struct sta_struct *st = addr_gva2hva(vm, (ulong)st_gva[vcpu_idx]);
+ int i;
+
+ pr_info("VCPU%d:\n", vcpu_idx);
+ pr_info(" sequence: %d\n", st->sequence);
+ pr_info(" flags: %d\n", st->flags);
+ pr_info(" steal: %"PRIu64"\n", st->steal);
+ pr_info(" preempted: %d\n", st->preempted);
+ pr_info(" pad: ");
+ for (i = 0; i < 47; ++i)
+ pr_info("%d", st->pad[i]);
+ pr_info("\n");
+}
+
#endif
static void *do_steal_time(void *arg)
if (msr->fault_expected)
__GUEST_ASSERT(vector == GP_VECTOR,
"Expected #GP on %sMSR(0x%x), got vector '0x%x'",
- msr->idx, msr->write ? "WR" : "RD", vector);
+ msr->write ? "WR" : "RD", msr->idx, vector);
else
__GUEST_ASSERT(!vector,
"Expected success on %sMSR(0x%x), got vector '0x%x'",
- msr->idx, msr->write ? "WR" : "RD", vector);
+ msr->write ? "WR" : "RD", msr->idx, vector);
if (vector || is_write_only_msr(msr->idx))
goto done;
if (msr->write)
__GUEST_ASSERT(!vector,
- "WRMSR(0x%x) to '0x%llx', RDMSR read '0x%llx'",
+ "WRMSR(0x%x) to '0x%lx', RDMSR read '0x%lx'",
msr->idx, msr->write_val, msr_val);
/* Invariant TSC bit appears when TSC invariant control MSR is written to */
vector = __hyperv_hypercall(hcall->control, input, output, &res);
if (hcall->ud_expected) {
__GUEST_ASSERT(vector == UD_VECTOR,
- "Expected #UD for control '%u', got vector '0x%x'",
+ "Expected #UD for control '%lu', got vector '0x%x'",
hcall->control, vector);
} else {
__GUEST_ASSERT(!vector,
- "Expected no exception for control '%u', got vector '0x%x'",
+ "Expected no exception for control '%lu', got vector '0x%x'",
hcall->control, vector);
GUEST_ASSERT_EQ(res, hcall->expect);
}
+++ /dev/null
-/*
- * mmio_warning_test
- *
- * Copyright (C) 2019, Google LLC.
- *
- * This work is licensed under the terms of the GNU GPL, version 2.
- *
- * Test that we don't get a kernel warning when we call KVM_RUN after a
- * triple fault occurs. To get the triple fault to occur we call KVM_RUN
- * on a VCPU that hasn't been properly setup.
- *
- */
-
-#define _GNU_SOURCE
-#include <fcntl.h>
-#include <kvm_util.h>
-#include <linux/kvm.h>
-#include <processor.h>
-#include <pthread.h>
-#include <stdio.h>
-#include <stdlib.h>
-#include <string.h>
-#include <sys/ioctl.h>
-#include <sys/mman.h>
-#include <sys/stat.h>
-#include <sys/types.h>
-#include <sys/wait.h>
-#include <test_util.h>
-#include <unistd.h>
-
-#define NTHREAD 4
-#define NPROCESS 5
-
-struct thread_context {
- int kvmcpu;
- struct kvm_run *run;
-};
-
-void *thr(void *arg)
-{
- struct thread_context *tc = (struct thread_context *)arg;
- int res;
- int kvmcpu = tc->kvmcpu;
- struct kvm_run *run = tc->run;
-
- res = ioctl(kvmcpu, KVM_RUN, 0);
- pr_info("ret1=%d exit_reason=%d suberror=%d\n",
- res, run->exit_reason, run->internal.suberror);
-
- return 0;
-}
-
-void test(void)
-{
- int i, kvm, kvmvm, kvmcpu;
- pthread_t th[NTHREAD];
- struct kvm_run *run;
- struct thread_context tc;
-
- kvm = open("/dev/kvm", O_RDWR);
- TEST_ASSERT(kvm != -1, "failed to open /dev/kvm");
- kvmvm = __kvm_ioctl(kvm, KVM_CREATE_VM, NULL);
- TEST_ASSERT(kvmvm > 0, KVM_IOCTL_ERROR(KVM_CREATE_VM, kvmvm));
- kvmcpu = ioctl(kvmvm, KVM_CREATE_VCPU, 0);
- TEST_ASSERT(kvmcpu != -1, KVM_IOCTL_ERROR(KVM_CREATE_VCPU, kvmcpu));
- run = (struct kvm_run *)mmap(0, 4096, PROT_READ|PROT_WRITE, MAP_SHARED,
- kvmcpu, 0);
- tc.kvmcpu = kvmcpu;
- tc.run = run;
- srand(getpid());
- for (i = 0; i < NTHREAD; i++) {
- pthread_create(&th[i], NULL, thr, (void *)(uintptr_t)&tc);
- usleep(rand() % 10000);
- }
- for (i = 0; i < NTHREAD; i++)
- pthread_join(th[i], NULL);
-}
-
-int get_warnings_count(void)
-{
- int warnings;
- FILE *f;
-
- f = popen("dmesg | grep \"WARNING:\" | wc -l", "r");
- if (fscanf(f, "%d", &warnings) < 1)
- warnings = 0;
- pclose(f);
-
- return warnings;
-}
-
-int main(void)
-{
- int warnings_before, warnings_after;
-
- TEST_REQUIRE(host_cpu_is_intel);
-
- TEST_REQUIRE(!vm_is_unrestricted_guest(NULL));
-
- warnings_before = get_warnings_count();
-
- for (int i = 0; i < NPROCESS; ++i) {
- int status;
- int pid = fork();
-
- if (pid < 0)
- exit(1);
- if (pid == 0) {
- test();
- exit(0);
- }
- while (waitpid(pid, &status, __WALL) != pid)
- ;
- }
-
- warnings_after = get_warnings_count();
- TEST_ASSERT(warnings_before == warnings_after,
- "Warnings found in kernel. Run 'dmesg' to inspect them.");
-
- return 0;
-}
\
if (fault_wanted) \
__GUEST_ASSERT((vector) == UD_VECTOR, \
- "Expected #UD on " insn " for testcase '0x%x', got '0x%x'", vector); \
+ "Expected #UD on " insn " for testcase '0x%x', got '0x%x'", \
+ testcase, vector); \
else \
__GUEST_ASSERT(!(vector), \
- "Expected success on " insn " for testcase '0x%x', got '0x%x'", vector); \
+ "Expected success on " insn " for testcase '0x%x', got '0x%x'", \
+ testcase, vector); \
} while (0)
static void guest_monitor_wait(int testcase)
__TEST_REQUIRE(token == MAGIC_TOKEN,
"This test must be run with the magic token %d.\n"
"This is done by nx_huge_pages_test.sh, which\n"
- "also handles environment setup for the test.");
+ "also handles environment setup for the test.", MAGIC_TOKEN);
run_test(reclaim_period_ms, false, reboot_permissions);
run_test(reclaim_period_ms, true, reboot_permissions);
\
for (i = 0; i < size; i++) \
__GUEST_ASSERT(mem[i] == pattern, \
- "Guest expected 0x%x at offset %lu (gpa 0x%llx), got 0x%x", \
+ "Guest expected 0x%x at offset %lu (gpa 0x%lx), got 0x%x", \
pattern, i, gpa + i, mem[i]); \
} while (0)
run_guest(vmcb, svm->vmcb_gpa);
__GUEST_ASSERT(vmcb->control.exit_code == SVM_EXIT_VMMCALL,
- "Expected VMMCAL #VMEXIT, got '0x%x', info1 = '0x%llx, info2 = '0x%llx'",
+ "Expected VMMCAL #VMEXIT, got '0x%x', info1 = '0x%lx, info2 = '0x%lx'",
vmcb->control.exit_code,
vmcb->control.exit_info_1, vmcb->control.exit_info_2);
run_guest(vmcb, svm->vmcb_gpa);
__GUEST_ASSERT(vmcb->control.exit_code == SVM_EXIT_HLT,
- "Expected HLT #VMEXIT, got '0x%x', info1 = '0x%llx, info2 = '0x%llx'",
+ "Expected HLT #VMEXIT, got '0x%x', info1 = '0x%lx, info2 = '0x%lx'",
vmcb->control.exit_code,
vmcb->control.exit_info_1, vmcb->control.exit_info_2);
uint8_t vector = wrmsr_safe(MSR_IA32_PERF_CAPABILITIES, val);
__GUEST_ASSERT(vector == GP_VECTOR,
- "Expected #GP for value '0x%llx', got vector '0x%x'",
+ "Expected #GP for value '0x%lx', got vector '0x%x'",
val, vector);
}
\
__GUEST_ASSERT((__supported & (xfeatures)) != (xfeatures) || \
__supported == ((xfeatures) | (dependencies)), \
- "supported = 0x%llx, xfeatures = 0x%llx, dependencies = 0x%llx", \
+ "supported = 0x%lx, xfeatures = 0x%llx, dependencies = 0x%llx", \
__supported, (xfeatures), (dependencies)); \
} while (0)
uint64_t __supported = (supported_xcr0) & (xfeatures); \
\
__GUEST_ASSERT(!__supported || __supported == (xfeatures), \
- "supported = 0x%llx, xfeatures = 0x%llx", \
+ "supported = 0x%lx, xfeatures = 0x%llx", \
__supported, (xfeatures)); \
} while (0)
vector = xsetbv_safe(0, supported_xcr0);
__GUEST_ASSERT(!vector,
- "Expected success on XSETBV(0x%llx), got vector '0x%x'",
+ "Expected success on XSETBV(0x%lx), got vector '0x%x'",
supported_xcr0, vector);
for (i = 0; i < 64; i++) {
vector = xsetbv_safe(0, supported_xcr0 | BIT_ULL(i));
__GUEST_ASSERT(vector == GP_VECTOR,
- "Expected #GP on XSETBV(0x%llx), supported XCR0 = %llx, got vector '0x%x'",
+ "Expected #GP on XSETBV(0x%llx), supported XCR0 = %lx, got vector '0x%x'",
BIT_ULL(i), supported_xcr0, vector);
}
selfdir = $(realpath $(dir $(filter %/lib.mk,$(MAKEFILE_LIST))))
top_srcdir = $(selfdir)/../../..
-ifeq ("$(origin O)", "command line")
- KBUILD_OUTPUT := $(O)
+ifeq ($(KHDR_INCLUDES),)
+KHDR_INCLUDES := -isystem $(top_srcdir)/usr/include
endif
-ifneq ($(KBUILD_OUTPUT),)
- # Make's built-in functions such as $(abspath ...), $(realpath ...) cannot
- # expand a shell special character '~'. We use a somewhat tedious way here.
- abs_objtree := $(shell cd $(top_srcdir) && mkdir -p $(KBUILD_OUTPUT) && cd $(KBUILD_OUTPUT) && pwd)
- $(if $(abs_objtree),, \
- $(error failed to create output directory "$(KBUILD_OUTPUT)"))
- # $(realpath ...) resolves symlinks
- abs_objtree := $(realpath $(abs_objtree))
- KHDR_DIR := ${abs_objtree}/usr/include
-else
- abs_srctree := $(shell cd $(top_srcdir) && pwd)
- KHDR_DIR := ${abs_srctree}/usr/include
-endif
-
-KHDR_INCLUDES := -isystem $(KHDR_DIR)
-
# The following are built by lib.mk common compile rules.
# TEST_CUSTOM_PROGS should be used by tests that require
# custom build rule and prevent common build rule use.
TEST_GEN_PROGS_EXTENDED := $(patsubst %,$(OUTPUT)/%,$(TEST_GEN_PROGS_EXTENDED))
TEST_GEN_FILES := $(patsubst %,$(OUTPUT)/%,$(TEST_GEN_FILES))
-all: kernel_header_files $(TEST_GEN_PROGS) $(TEST_GEN_PROGS_EXTENDED) \
- $(TEST_GEN_FILES)
-
-kernel_header_files:
- @ls $(KHDR_DIR)/linux/*.h >/dev/null 2>/dev/null; \
- if [ $$? -ne 0 ]; then \
- RED='\033[1;31m'; \
- NOCOLOR='\033[0m'; \
- echo; \
- echo -e "$${RED}error$${NOCOLOR}: missing kernel header files."; \
- echo "Please run this and try again:"; \
- echo; \
- echo " cd $(top_srcdir)"; \
- echo " make headers"; \
- echo; \
- exit 1; \
- fi
-
-.PHONY: kernel_header_files
+all: $(TEST_GEN_PROGS) $(TEST_GEN_PROGS_EXTENDED) $(TEST_GEN_FILES)
define RUN_TESTS
BASE_DIR="$(selfdir)"; \
gup_longterm
mkdirty
va_high_addr_switch
+hugetlb_fault_after_madv
TEST_GEN_FILES += mremap_dontunmap
TEST_GEN_FILES += mremap_test
TEST_GEN_FILES += on-fault-limit
-TEST_GEN_PROGS += pagemap_ioctl
+TEST_GEN_FILES += pagemap_ioctl
TEST_GEN_FILES += thuge-gen
TEST_GEN_FILES += transhuge-stress
TEST_GEN_FILES += uffd-stress
TEST_GEN_FILES += hugetlb_fault_after_madv
ifneq ($(ARCH),arm64)
-TEST_GEN_PROGS += soft-dirty
+TEST_GEN_FILES += soft-dirty
endif
ifeq ($(ARCH),x86_64)
{
int err;
+ ksft_print_header();
+
pagesize = getpagesize();
thpsize = read_pmd_pagesize();
if (thpsize)
ARRAY_SIZE(hugetlbsizes));
detect_huge_zeropage();
- ksft_print_header();
ksft_set_plan(ARRAY_SIZE(anon_test_cases) * tests_per_anon_test_case() +
ARRAY_SIZE(anon_thp_test_cases) * tests_per_anon_thp_test_case() +
ARRAY_SIZE(non_anon_test_cases) * tests_per_non_anon_test_case());
int uffd;
int page_size;
int hpage_size;
+const char *progname;
#define LEN(region) ((region.end - region.start)/page_size)
uffd = syscall(__NR_userfaultfd, O_CLOEXEC | O_NONBLOCK | UFFD_USER_MODE_ONLY);
if (uffd == -1)
- ksft_exit_fail_msg("uffd syscall failed\n");
+ return uffd;
uffdio_api.api = UFFD_API;
uffdio_api.features = UFFD_FEATURE_WP_UNPOPULATED | UFFD_FEATURE_WP_ASYNC |
UFFD_FEATURE_WP_HUGETLBFS_SHMEM;
if (ioctl(uffd, UFFDIO_API, &uffdio_api))
- ksft_exit_fail_msg("UFFDIO_API\n");
+ return -1;
if (!(uffdio_api.api & UFFDIO_REGISTER_MODE_WP) ||
!(uffdio_api.features & UFFD_FEATURE_WP_UNPOPULATED) ||
!(uffdio_api.features & UFFD_FEATURE_WP_ASYNC) ||
!(uffdio_api.features & UFFD_FEATURE_WP_HUGETLBFS_SHMEM))
- ksft_exit_fail_msg("UFFDIO_API error %llu\n", uffdio_api.api);
+ return -1;
return 0;
}
munmap(mem, mem_size);
/* 9. Memory mapped file */
- fd = open(__FILE__, O_RDONLY);
+ fd = open(progname, O_RDONLY);
if (fd < 0)
- ksft_exit_fail_msg("%s Memory mapped file\n");
+ ksft_exit_fail_msg("%s Memory mapped file\n", __func__);
- ret = stat(__FILE__, &sbuf);
+ ret = stat(progname, &sbuf);
if (ret < 0)
ksft_exit_fail_msg("error %d %d %s\n", ret, errno, strerror(errno));
fmem = mmap(NULL, sbuf.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
if (fmem == MAP_FAILED)
- ksft_exit_fail_msg("error nomem %ld %s\n", errno, strerror(errno));
+ ksft_exit_fail_msg("error nomem %d %s\n", errno, strerror(errno));
tmp_buf = malloc(sbuf.st_size);
memcpy(tmp_buf, fmem, sbuf.st_size);
fmem = mmap(NULL, buf_size, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
if (fmem == MAP_FAILED)
- ksft_exit_fail_msg("error nomem %ld %s\n", errno, strerror(errno));
+ ksft_exit_fail_msg("error nomem %d %s\n", errno, strerror(errno));
wp_init(fmem, buf_size);
wp_addr_range(fmem, buf_size);
extra_thread_faults);
}
-int main(void)
+int main(int argc, char *argv[])
{
int mem_size, shmid, buf_size, fd, i, ret;
char *mem, *map, *fmem;
struct stat sbuf;
+ progname = argv[0];
+
ksft_print_header();
+
+ if (init_uffd())
+ return ksft_exit_pass();
+
ksft_set_plan(115);
page_size = getpagesize();
if (pagemap_fd < 0)
return -EINVAL;
- if (init_uffd())
- ksft_exit_fail_msg("uffd init failed\n");
-
/* 1. Sanity testing */
sanity_tests_sd();
fmem = mmap(NULL, sbuf.st_size, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
if (fmem == MAP_FAILED)
- ksft_exit_fail_msg("error nomem %ld %s\n", errno, strerror(errno));
+ ksft_exit_fail_msg("error nomem %d %s\n", errno, strerror(errno));
wp_init(fmem, sbuf.st_size);
wp_addr_range(fmem, sbuf.st_size);
fmem = mmap(NULL, buf_size, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
if (fmem == MAP_FAILED)
- ksft_exit_fail_msg("error nomem %ld %s\n", errno, strerror(errno));
+ ksft_exit_fail_msg("error nomem %d %s\n", errno, strerror(errno));
wp_init(fmem, buf_size);
wp_addr_range(fmem, buf_size);
CATEGORY="hugetlb" run_test ./hugepage-vmemmap
CATEGORY="hugetlb" run_test ./hugetlb-madvise
+nr_hugepages_tmp=$(cat /proc/sys/vm/nr_hugepages)
# For this test, we need one and just one huge page
echo 1 > /proc/sys/vm/nr_hugepages
CATEGORY="hugetlb" run_test ./hugetlb_fault_after_madv
+# Restore the previous number of huge pages, since further tests rely on it
+echo "$nr_hugepages_tmp" > /proc/sys/vm/nr_hugepages
if test_selected "hugetlb"; then
echo "NOTE: These hugetlb tests provide minimal coverage. Use"
TEST_PROGS += test_vxlan_nolocalbypass.sh
TEST_PROGS += test_bridge_backup_port.sh
TEST_PROGS += fdb_flush.sh
+TEST_PROGS += vlan_hw_filter.sh
TEST_FILES := settings
.msg_iov = &iov,
.msg_iovlen = 1
};
- struct unix_diag_req *udr;
struct nlmsghdr *nlh;
int ret;
{
struct addrinfo hints, *ai;
struct iovec iov[1];
+ unsigned char *buf;
struct msghdr msg;
char cbuf[1024];
- char *buf;
int err;
int fd;
int main(int argc, char **argv)
{
- unsigned int nr_process = 1;
+ long nr_process = 1;
int route_sock = -1, ret = KSFT_SKIP;
int test_desc_fd[2];
uint32_t route_seq;
exit_usage(argv);
}
- if (nr_process > MAX_PROCESSES || !nr_process) {
+ if (nr_process > MAX_PROCESSES || nr_process < 1) {
printk("nr_process should be between [1; %u]",
MAX_PROCESSES);
exit_usage(argv);
#include <sys/ioctl.h>
#include <sys/poll.h>
+#include <sys/random.h>
#include <sys/sendfile.h>
#include <sys/stat.h>
#include <sys/socket.h>
static void init_rng(void)
{
- int fd = open("/dev/urandom", O_RDONLY);
unsigned int foo;
- if (fd > 0) {
- int ret = read(fd, &foo, sizeof(foo));
-
- if (ret < 0)
- srand(fd + foo);
- close(fd);
+ if (getrandom(&foo, sizeof(foo), 0) == -1) {
+ perror("getrandom");
+ exit(1);
}
srand(foo);
#include <time.h>
#include <sys/ioctl.h>
+#include <sys/random.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
static void init_rng(void)
{
- int fd = open("/dev/urandom", O_RDONLY);
unsigned int foo;
- if (fd > 0) {
- int ret = read(fd, &foo, sizeof(foo));
-
- if (ret < 0)
- srand(fd + foo);
- close(fd);
+ if (getrandom(&foo, sizeof(foo), 0) == -1) {
+ perror("getrandom");
+ exit(1);
}
srand(foo);
fi
if reset "mpc backup" &&
- continue_if mptcp_lib_kallsyms_doesnt_have "mptcp_subflow_send_ack$"; then
+ continue_if mptcp_lib_kallsyms_doesnt_have "T mptcp_subflow_send_ack$"; then
pm_nl_add_endpoint $ns2 10.0.1.2 flags subflow,backup
speed=slow \
run_tests $ns1 $ns2 10.0.1.1
fi
if reset "mpc backup both sides" &&
- continue_if mptcp_lib_kallsyms_doesnt_have "mptcp_subflow_send_ack$"; then
+ continue_if mptcp_lib_kallsyms_doesnt_have "T mptcp_subflow_send_ack$"; then
pm_nl_add_endpoint $ns1 10.0.1.1 flags subflow,backup
pm_nl_add_endpoint $ns2 10.0.1.2 flags subflow,backup
speed=slow \
fi
if reset "mpc switch to backup" &&
- continue_if mptcp_lib_kallsyms_doesnt_have "mptcp_subflow_send_ack$"; then
+ continue_if mptcp_lib_kallsyms_doesnt_have "T mptcp_subflow_send_ack$"; then
pm_nl_add_endpoint $ns2 10.0.1.2 flags subflow
sflags=backup speed=slow \
run_tests $ns1 $ns2 10.0.1.1
fi
if reset "mpc switch to backup both sides" &&
- continue_if mptcp_lib_kallsyms_doesnt_have "mptcp_subflow_send_ack$"; then
+ continue_if mptcp_lib_kallsyms_doesnt_have "T mptcp_subflow_send_ack$"; then
pm_nl_add_endpoint $ns1 10.0.1.1 flags subflow
pm_nl_add_endpoint $ns2 10.0.1.2 flags subflow
sflags=backup speed=slow \
if reset_check_counter "fastclose server test" "MPTcpExtMPFastcloseRx"; then
test_linkfail=1024 fastclose=server \
run_tests $ns1 $ns2 10.0.1.1
- chk_join_nr 0 0 0
+ chk_join_nr 0 0 0 0 0 0 1
chk_fclose_nr 1 1 invert
chk_rst_nr 1 1
fi
done
sleep 5
- run_cmd_grep "10.23.11." ip addr show dev "$devdummy"
+ run_cmd_grep_fail "10.23.11." ip addr show dev "$devdummy"
if [ $? -eq 0 ]; then
check_err 1
end_test "FAIL: preferred_lft addresses remaining"
run_cmd ip -netns "$testns" addr add dev "$DEV_NS" 10.1.1.100/24
- run_cmd ip -netns "$testns" link set dev $DEV_NS ups
+ run_cmd ip -netns "$testns" link set dev $DEV_NS up
run_cmd ip -netns "$testns" link del "$DEV_NS"
# test external mode
--- /dev/null
+#!/bin/sh
+# SPDX-License-Identifier: GPL-2.0
+
+readonly NETNS="ns-$(mktemp -u XXXXXX)"
+
+ret=0
+
+cleanup() {
+ ip netns del $NETNS
+}
+
+trap cleanup EXIT
+
+fail() {
+ echo "ERROR: ${1:-unexpected return code} (ret: $_)" >&2
+ ret=1
+}
+
+ip netns add ${NETNS}
+ip netns exec ${NETNS} ip link add bond0 type bond mode 0
+ip netns exec ${NETNS} ip link add bond_slave_1 type veth peer veth2
+ip netns exec ${NETNS} ip link set bond_slave_1 master bond0
+ip netns exec ${NETNS} ethtool -K bond0 rx-vlan-filter off
+ip netns exec ${NETNS} ip link add link bond_slave_1 name bond_slave_1.0 type vlan id 0
+ip netns exec ${NETNS} ip link add link bond0 name bond0.0 type vlan id 0
+ip netns exec ${NETNS} ip link set bond_slave_1 nomaster
+ip netns exec ${NETNS} ip link del veth2 || fail "Please check vlan HW filter function"
+
+exit $ret
}
#define SOCK_BUF_SIZE (2 * 1024 * 1024)
-#define MAX_MSG_SIZE (32 * 1024)
+#define MAX_MSG_PAGES 4
static void test_seqpacket_msg_bounds_client(const struct test_opts *opts)
{
unsigned long curr_hash;
+ size_t max_msg_size;
int page_size;
int msg_count;
int fd;
curr_hash = 0;
page_size = getpagesize();
- msg_count = SOCK_BUF_SIZE / MAX_MSG_SIZE;
+ max_msg_size = MAX_MSG_PAGES * page_size;
+ msg_count = SOCK_BUF_SIZE / max_msg_size;
for (int i = 0; i < msg_count; i++) {
size_t buf_size;
/* Use "small" buffers and "big" buffers. */
if (i & 1)
buf_size = page_size +
- (rand() % (MAX_MSG_SIZE - page_size));
+ (rand() % (max_msg_size - page_size));
else
buf_size = 1 + (rand() % page_size);
unsigned long remote_hash;
unsigned long curr_hash;
int fd;
- char buf[MAX_MSG_SIZE];
struct msghdr msg = {0};
struct iovec iov = {0};
control_writeln("SRVREADY");
/* Wait, until peer sends whole data. */
control_expectln("SENDDONE");
- iov.iov_base = buf;
- iov.iov_len = sizeof(buf);
+ iov.iov_len = MAX_MSG_PAGES * getpagesize();
+ iov.iov_base = malloc(iov.iov_len);
+ if (!iov.iov_base) {
+ perror("malloc");
+ exit(EXIT_FAILURE);
+ }
+
msg.msg_iov = &iov;
msg.msg_iovlen = 1;
curr_hash += hash_djb2(msg.msg_iov[0].iov_base, recv_size);
}
+ free(iov.iov_base);
close(fd);
remote_hash = control_readulong();
config HAVE_KVM
bool
-config HAVE_KVM_PFNCACHE
+config KVM_COMMON
bool
+ select EVENTFD
+ select INTERVAL_TREE
+ select PREEMPT_NOTIFIERS
-config HAVE_KVM_IRQCHIP
+config HAVE_KVM_PFNCACHE
bool
-config HAVE_KVM_IRQFD
+config HAVE_KVM_IRQCHIP
bool
config HAVE_KVM_IRQ_ROUTING
bool
depends on HAVE_KVM_DIRTY_RING
-config HAVE_KVM_EVENTFD
- bool
- select EVENTFD
-
config KVM_MMIO
bool
bool
config KVM_GENERIC_MEMORY_ATTRIBUTES
- select KVM_GENERIC_MMU_NOTIFIER
+ depends on KVM_GENERIC_MMU_NOTIFIER
bool
config KVM_PRIVATE_MEM
#include <kvm/iodev.h>
-#ifdef CONFIG_HAVE_KVM_IRQFD
+#ifdef CONFIG_HAVE_KVM_IRQCHIP
static struct workqueue_struct *irqfd_cleanup_wq;
synchronize_srcu(&kvm->irq_srcu);
kvm_arch_post_irq_ack_notifier_list_update(kvm);
}
-#endif
-
-void
-kvm_eventfd_init(struct kvm *kvm)
-{
-#ifdef CONFIG_HAVE_KVM_IRQFD
- spin_lock_init(&kvm->irqfds.lock);
- INIT_LIST_HEAD(&kvm->irqfds.items);
- INIT_LIST_HEAD(&kvm->irqfds.resampler_list);
- mutex_init(&kvm->irqfds.resampler_lock);
-#endif
- INIT_LIST_HEAD(&kvm->ioeventfds);
-}
-#ifdef CONFIG_HAVE_KVM_IRQFD
/*
* shutdown any irqfd's that match fd+gsi
*/
return kvm_assign_ioeventfd(kvm, args);
}
+
+void
+kvm_eventfd_init(struct kvm *kvm)
+{
+#ifdef CONFIG_HAVE_KVM_IRQCHIP
+ spin_lock_init(&kvm->irqfds.lock);
+ INIT_LIST_HEAD(&kvm->irqfds.items);
+ INIT_LIST_HEAD(&kvm->irqfds.resampler_list);
+ mutex_init(&kvm->irqfds.resampler_lock);
+#endif
+ INIT_LIST_HEAD(&kvm->ioeventfds);
+}
static const struct address_space_operations kvm_gmem_aops = {
.dirty_folio = noop_dirty_folio,
-#ifdef CONFIG_MIGRATION
.migrate_folio = kvm_gmem_migrate_folio,
-#endif
.error_remove_page = kvm_gmem_error_page,
};
static const struct file_operations stat_fops_per_vm;
-static struct file_operations kvm_chardev_ops;
-
static long kvm_vcpu_ioctl(struct file *file, unsigned int ioctl,
unsigned long arg);
#ifdef CONFIG_KVM_COMPAT
if (!kvm)
return ERR_PTR(-ENOMEM);
- /* KVM is pinned via open("/dev/kvm"), the fd passed to this ioctl(). */
- __module_get(kvm_chardev_ops.owner);
-
KVM_MMU_LOCK_INIT(kvm);
mmgrab(current->mm);
kvm->mm = current->mm;
if (r)
goto out_err_no_disable;
-#ifdef CONFIG_HAVE_KVM_IRQFD
+#ifdef CONFIG_HAVE_KVM_IRQCHIP
INIT_HLIST_HEAD(&kvm->irq_ack_notifier_list);
#endif
out_err_no_srcu:
kvm_arch_free_vm(kvm);
mmdrop(current->mm);
- module_put(kvm_chardev_ops.owner);
return ERR_PTR(r);
}
preempt_notifier_dec();
hardware_disable_all();
mmdrop(mm);
- module_put(kvm_chardev_ops.owner);
}
void kvm_get_kvm(struct kvm *kvm)
return 0;
}
-static const struct file_operations kvm_vcpu_fops = {
+static struct file_operations kvm_vcpu_fops = {
.release = kvm_vcpu_release,
.unlocked_ioctl = kvm_vcpu_ioctl,
.mmap = kvm_vcpu_mmap,
}
static const struct file_operations kvm_vcpu_stats_fops = {
+ .owner = THIS_MODULE,
.read = kvm_vcpu_stats_read,
.release = kvm_vcpu_stats_release,
.llseek = noop_llseek,
return 0;
}
-static const struct file_operations kvm_device_fops = {
+static struct file_operations kvm_device_fops = {
.unlocked_ioctl = kvm_device_ioctl,
.release = kvm_device_release,
KVM_COMPAT(kvm_device_ioctl),
#ifdef CONFIG_HAVE_KVM_MSI
case KVM_CAP_SIGNAL_MSI:
#endif
-#ifdef CONFIG_HAVE_KVM_IRQFD
+#ifdef CONFIG_HAVE_KVM_IRQCHIP
case KVM_CAP_IRQFD:
#endif
case KVM_CAP_IOEVENTFD_ANY_LENGTH:
}
static const struct file_operations kvm_vm_stats_fops = {
+ .owner = THIS_MODULE,
.read = kvm_vm_stats_read,
.release = kvm_vm_stats_release,
.llseek = noop_llseek,
}
#endif
-static const struct file_operations kvm_vm_fops = {
+static struct file_operations kvm_vm_fops = {
.release = kvm_vm_release,
.unlocked_ioctl = kvm_vm_ioctl,
.llseek = noop_llseek,
r += PAGE_SIZE; /* coalesced mmio ring page */
#endif
break;
- case KVM_TRACE_ENABLE:
- case KVM_TRACE_PAUSE:
- case KVM_TRACE_DISABLE:
- r = -EOPNOTSUPP;
- break;
default:
return kvm_arch_dev_ioctl(filp, ioctl, arg);
}
return r < 0 ? r : 0;
}
-/* Caller must hold slots_lock. */
int kvm_io_bus_register_dev(struct kvm *kvm, enum kvm_bus bus_idx, gpa_t addr,
int len, struct kvm_io_device *dev)
{
struct kvm_io_bus *new_bus, *bus;
struct kvm_io_range range;
+ lockdep_assert_held(&kvm->slots_lock);
+
bus = kvm_get_bus(kvm, bus_idx);
if (!bus)
return -ENOMEM;
goto err_async_pf;
kvm_chardev_ops.owner = module;
+ kvm_vm_fops.owner = module;
+ kvm_vcpu_fops.owner = module;
+ kvm_device_fops.owner = module;
kvm_preempt_ops.sched_in = kvm_sched_in;
kvm_preempt_ops.sched_out = kvm_sched_out;